
Kerala Ayurveda — Product Analyst II Assignment


Dataset: GA4 clickstream
Scope: D2C India website (web funnel analytics)
Expected effort: ~6-8 hours
Deadline: 18th Jan
Submission method: Link-only (no attachments)

Background

You’ll work with GA4 clickstream data captured on our India D2C website. The dataset includes event-level touchpoints from session start through purchase. Your goal is to (1) build a clean, reusable funnel model and (2) produce an MBR-style narrative: what happened, why it happened, and what to do next.

1) Dataset structure (what you’re getting)

Files

A ZIP archive containing a folder of Parquet files with one month of event-level data captured by GA4 on the website. Download

Base event columns (high level)

User/session
user_id (GA4 pseudo user identifier)
session_id (session identifier)
Event
event (event name)
event_ts (timestamp)
date_ist, time_ist
Page / context
page_location (URL)
page_type (categorized page type, if available)
device
Parameters
event_params (GA4-style nested list of key/value pairs)

event_params (nested GA4 params)

event_params is a list/array of key/value objects (not a flat dict). You will typically need to flatten/unnest it into columns to use it.
Common keys you may see (not exhaustive):
Attribution: source, medium, campaign, term, gclid (often on session_start)
Commerce: transaction_id, value, currency, payment_type, coupon, shipping, tax, discount (often on purchase)
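
The flattening step described above can be sketched as follows. This is a minimal illustration, not the expected solution: it assumes each entry looks like `{"key": ..., "value": ...}`, where `value` is either a plain scalar or a dict with typed fields in the style of the GA4 BigQuery export (`string_value`, `int_value`, `float_value`, `double_value`). The actual layout in the Parquet files may differ, so inspect a few rows first. The helper names `unwrap_value` and `flatten_event_params` are hypothetical.

```python
from typing import Any, Dict, Iterable, Optional

# Typed fields used by the GA4 BigQuery export; assumed (not confirmed) to apply to this dataset.
TYPED_FIELDS = ("string_value", "int_value", "float_value", "double_value")


def unwrap_value(value: Any) -> Any:
    """Return a scalar from either a plain value or a GA4-style typed dict."""
    if isinstance(value, dict):
        for field in TYPED_FIELDS:
            if value.get(field) is not None:
                return value[field]
        return None
    return value


def flatten_event_params(params: Optional[Iterable[Dict[str, Any]]]) -> Dict[str, Any]:
    """Turn a list of {"key": k, "value": v} objects into a flat {k: v} dict.

    If a key repeats within one event, the first non-null value wins.
    """
    flat: Dict[str, Any] = {}
    if params is None:
        return flat
    for item in params:
        key = item.get("key")
        if key is None:
            continue
        value = unwrap_value(item.get("value"))
        if flat.get(key) is None:
            flat[key] = value
    return flat


# Illustrative input only; these values are made up.
example = [
    {"key": "source", "value": {"string_value": "google", "int_value": None}},
    {"key": "medium", "value": "cpc"},
    {"key": "value", "value": {"double_value": 799.0}},
]
print(flatten_event_params(example))
```

Applied row by row, the resulting dicts can be expanded into columns such as source, medium, transaction_id, and value.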

Funnel events (relevant set)

Session start
session_start
Product view (choose definition)
view_item
view_product_page_loaded
Add to cart
add_to_cart
add_to_cart_custom_event
Checkout start
begin_checkout
gokwik_checkout_initiated
Checkout progression
add_shipping_info
add_payment_info
Purchase
purchase
Not every journey is linear; events can be missing or out of order. Handling this sensibly (and calling out QA findings) is part of the assignment.
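
One possible way to handle missing or out-of-order events is to compute two views of the funnel per session and compare them: an "any-order" flag (the step happened at some point) and an "ordered" flag (the step happened at or after the previous step). The sketch below is an assumption about how this could be done, not a required method; the step labels and the function names `any_order_funnel` and `ordered_funnel` are hypothetical, and integers stand in for event_ts.

```python
from typing import Dict, List, Tuple

# Step order follows the funnel listed above; the labels are illustrative names.
STEP_ORDER = [
    "product_view", "add_to_cart", "begin_checkout",
    "add_shipping_info", "add_payment_info", "purchase",
]

Event = Tuple[int, str]  # (timestamp, step label) for one session


def any_order_funnel(events: List[Event]) -> Dict[str, int]:
    """Flag each step that occurs anywhere in the session."""
    seen = {name for _, name in events}
    return {step: int(step in seen) for step in STEP_ORDER}


def ordered_funnel(events: List[Event]) -> Dict[str, int]:
    """Flag a step only if it occurs at or after the previous matched step."""
    reached = {step: 0 for step in STEP_ORDER}
    cursor = None  # timestamp of the most recently matched step
    for step in STEP_ORDER:
        candidates = [ts for ts, name in events
                      if name == step and (cursor is None or ts >= cursor)]
        if not candidates:
            break  # the strict funnel stops at the first missing step
        cursor = min(candidates)
        reached[step] = 1
    return reached


# Made-up session: purchase fires, but shipping and payment events are missing.
session = [(1, "product_view"), (2, "add_to_cart"), (3, "begin_checkout"), (9, "purchase")]
loose = any_order_funnel(session)
strict = ordered_funnel(session)
print(loose["purchase"], strict["purchase"])
```

Sessions where the two views disagree are natural candidates for the QA write-up, since the gap usually points to tracking issues or alternative checkout paths rather than user behavior.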

2) Your task

Part A — Build a reusable funnel model (session-level)

Create a session-level table/dataset session_funnel across the full month.

Required fields

Identifiers
session_id, user_id
Dimensions
device
source, medium, campaign (from session_start params; null/unknown allowed)
landing_page (first page_location in the session)
Funnel steps
For each step, include:
flag (0/1) per session
first timestamp (optional but encouraged)
Steps (use OR logic where there are variants):
product view (view_item OR view_product_page_loaded)
add to cart (add_to_cart OR add_to_cart_custom_event)
begin checkout (begin_checkout OR gokwik_checkout_initiated)
add shipping info
add payment info
purchase
Purchase outputs
orders (distinct transaction_id preferred; state your approach)
revenue (sum of purchase value, with deduping logic as needed)
AOV (revenue / orders)
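
As a rough sketch of how the required fields could be assembled, the example below builds one row per session from a list of event dicts. It assumes event_params has already been flattened so that source, medium, campaign, transaction_id, and value are top-level keys, and it makes one explicit deduping choice: a transaction_id is counted once across the whole dataset, and purchase events without a transaction_id do not count toward orders or revenue. Both are assumptions to state (or change) in your own approach. `build_session_funnel` is a hypothetical name, and integers stand in for event_ts.

```python
from typing import Any, Dict, Iterable

# OR logic: any event name in a set counts for that step.
STEP_EVENTS = {
    "product_view": {"view_item", "view_product_page_loaded"},
    "add_to_cart": {"add_to_cart", "add_to_cart_custom_event"},
    "begin_checkout": {"begin_checkout", "gokwik_checkout_initiated"},
    "add_shipping_info": {"add_shipping_info"},
    "add_payment_info": {"add_payment_info"},
    "purchase": {"purchase"},
}
EVENT_TO_STEP = {name: step for step, names in STEP_EVENTS.items() for name in names}
SESSION_DIMS = ("source", "medium", "campaign")


def build_session_funnel(events: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Return {session_id: row} with step flags, first timestamps, orders, revenue."""
    sessions: Dict[str, Dict[str, Any]] = {}
    seen_txn = set()  # transaction ids already counted, across all sessions
    for ev in sorted(events, key=lambda e: e["event_ts"]):
        sid = ev["session_id"]
        if sid not in sessions:
            row = {"session_id": sid, "user_id": ev.get("user_id"),
                   "device": ev.get("device"), "landing_page": None,
                   "source": None, "medium": None, "campaign": None,
                   "orders": 0, "revenue": 0.0}
            for step in STEP_EVENTS:
                row[step] = 0
                row[step + "_first_ts"] = None
            sessions[sid] = row
        row = sessions[sid]
        if row["landing_page"] is None and ev.get("page_location"):
            row["landing_page"] = ev["page_location"]  # first URL seen in the session
        if ev["event"] == "session_start":
            for dim in SESSION_DIMS:
                row[dim] = ev.get(dim)  # stays None when attribution is missing
        step = EVENT_TO_STEP.get(ev["event"])
        if step is None:
            continue
        if row[step] == 0:
            row[step] = 1
            row[step + "_first_ts"] = ev["event_ts"]
        if step == "purchase":
            txn = ev.get("transaction_id")
            if txn is None or txn in seen_txn:
                continue  # assumption: skip purchases with no id or a repeated id
            seen_txn.add(txn)
            row["orders"] += 1
            row["revenue"] += float(ev.get("value") or 0.0)
    return sessions


# Made-up events: s1 fires the same purchase twice, s2 bounces.
s1 = {"session_id": "s1", "user_id": "u1", "device": "mobile"}
events = [
    {**s1, "event": "session_start", "event_ts": 1, "page_location": "/home",
     "source": "google", "medium": "cpc"},
    {**s1, "event": "view_item", "event_ts": 2, "page_location": "/p/a"},
    {**s1, "event": "add_to_cart_custom_event", "event_ts": 3},
    {**s1, "event": "gokwik_checkout_initiated", "event_ts": 4},
    {**s1, "event": "purchase", "event_ts": 5, "transaction_id": "T1", "value": 500.0},
    {**s1, "event": "purchase", "event_ts": 6, "transaction_id": "T1", "value": 500.0},
    {"session_id": "s2", "user_id": "u2", "device": "desktop",
     "event": "session_start", "event_ts": 7, "page_location": "/blog"},
]
funnel = build_session_funnel(events)
total_orders = sum(r["orders"] for r in funnel.values())
total_revenue = sum(r["revenue"] for r in funnel.values())
aov = total_revenue / total_orders if total_orders else None
```

The same logic translates directly to SQL or a dataframe group-by; the point of the sketch is the OR mapping and the explicit dedupe rule, both of which should be documented in the README.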

Data QA (must include)

A short QA section with checks like:
duplicate purchases / duplicate transaction_id
sessions with purchase but no checkout events
null spikes in source/medium
any other anomalies you notice
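
The first three checks could be expressed roughly as below. This is a sketch under assumptions: the input field names (date_ist, source, and the 0/1 step flags) mirror a hypothetical session_funnel layout, the function names are invented for illustration, and the sample rows and date labels are made up.

```python
from collections import Counter
from typing import Any, Dict, List


def duplicate_transaction_ids(purchase_events: List[Dict[str, Any]]) -> Dict[str, int]:
    """Return {transaction_id: count} for ids that appear more than once."""
    counts = Counter(e["transaction_id"] for e in purchase_events
                     if e.get("transaction_id"))
    return {txn: n for txn, n in counts.items() if n > 1}


def purchase_without_checkout(session_rows: List[Dict[str, Any]]) -> List[str]:
    """Return session ids that have a purchase flag but no checkout-stage flag."""
    checkout_flags = ("begin_checkout", "add_shipping_info", "add_payment_info")
    return [r["session_id"] for r in session_rows
            if r["purchase"] == 1 and not any(r[f] for f in checkout_flags)]


def null_source_share_by_date(session_rows: List[Dict[str, Any]]) -> Dict[str, float]:
    """Return the share of sessions with a missing source, per date."""
    total, nulls = Counter(), Counter()
    for r in session_rows:
        total[r["date_ist"]] += 1
        if not r.get("source"):
            nulls[r["date_ist"]] += 1
    return {d: nulls[d] / total[d] for d in sorted(total)}


# Made-up inputs for illustration.
purchases = [{"transaction_id": "T1"}, {"transaction_id": "T1"}, {"transaction_id": "T2"}]
rows = [
    {"session_id": "s1", "date_ist": "D01", "source": "google", "purchase": 1,
     "begin_checkout": 1, "add_shipping_info": 0, "add_payment_info": 0},
    {"session_id": "s2", "date_ist": "D01", "source": None, "purchase": 1,
     "begin_checkout": 0, "add_shipping_info": 0, "add_payment_info": 0},
    {"session_id": "s3", "date_ist": "D02", "source": None, "purchase": 0,
     "begin_checkout": 0, "add_shipping_info": 0, "add_payment_info": 0},
]
dupes = duplicate_transaction_ids(purchases)
orphans = purchase_without_checkout(rows)
null_share = null_source_share_by_date(rows)
```

A daily null share that jumps on specific dates is the kind of "null spike" worth calling out, since it can distort any source/medium cut for that period.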

Part B — MBR-style analysis

Write a concise memo + supporting analysis.

Required analyses

Monthly KPI snapshot
sessions, users
product view rate, add-to-cart rate, checkout-start rate
purchase CVR (session → purchase)
revenue + AOV
Funnel performance
overall step conversion rates
break down by at least two cuts, e.g.:
device
source/medium
landing page group / page_type
Driver diagnosis (surgical)
Compare:
Week 1 vs Week 4 or
First half vs second half or
Best week vs worst week
Explain what moved and why:
traffic mix shifts (source/medium/device/market)
step conversion changes (view→ATC, ATC→checkout, checkout→purchase)
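
One common way to separate the two effects above is a mix/rate decomposition: overall CVR is the session-share-weighted average of segment CVRs, so the change between two periods splits exactly into a mix effect (share shifts valued at the old rates) and a rate effect (rate changes valued at the new shares). The sketch below is one such decomposition among several possible ones, not a prescribed method. `decompose_cvr_change` is a hypothetical name, the numbers are illustrative rather than taken from the dataset, and both periods are assumed to have at least one session.

```python
from typing import Dict, Tuple

Segments = Dict[str, Tuple[int, int]]  # segment -> (sessions, purchases)


def decompose_cvr_change(period_a: Segments, period_b: Segments) -> Dict[str, float]:
    """Split the change in overall CVR (B minus A) into mix and rate effects."""
    total_a = sum(s for s, _ in period_a.values())
    total_b = sum(s for s, _ in period_b.values())
    mix = rate = 0.0
    for seg in set(period_a) | set(period_b):
        s_a, p_a = period_a.get(seg, (0, 0))
        s_b, p_b = period_b.get(seg, (0, 0))
        w_a, w_b = s_a / total_a, s_b / total_b   # session shares
        r_a = p_a / s_a if s_a else 0.0           # segment CVR in period A
        r_b = p_b / s_b if s_b else 0.0           # segment CVR in period B
        mix += (w_b - w_a) * r_a                  # share shift at old rates
        rate += w_b * (r_b - r_a)                 # rate change at new shares
    return {"mix_effect": mix, "rate_effect": rate, "total_change": mix + rate}


# Illustrative numbers only: traffic shifts toward mobile, and mobile CVR dips.
week_1 = {"mobile": (8000, 160), "desktop": (2000, 80)}   # overall CVR 2.40%
week_4 = {"mobile": (9000, 162), "desktop": (1000, 40)}   # overall CVR 2.02%
result = decompose_cvr_change(week_1, week_4)
print(result)
```

In this made-up case the overall drop of 0.38 percentage points comes from a mix effect of about -0.20 points and a rate effect of about -0.18 points, so roughly half of the decline would be explained by traffic mix alone. The same function can be run per funnel step (view→ATC, ATC→checkout, checkout→purchase) by swapping in that step's numerator and denominator.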

3 recommendations
Prioritized actions (product/growth/tracking), with:
expected impact (direction + where it helps)
how you’d validate (experiment, tracking check, next analysis)

Output format

Memo: 2 pages max
Supporting: 3–5 charts/tables (can be in notebook)

3) Submission instructions (Link-only)

Please submit a single link to a folder that contains your materials.
Allowed: Google Drive / OneDrive / Dropbox / Notion / GitHub repo (public or private with access granted)
Not allowed: email attachments

Folder contents (required)

README (how to run + assumptions + QA checklist)
Notebook (.ipynb) or SQL scripts + run instructions
Memo (pdf or md)
Outputs (session_funnel as csv)

Naming convention (important)

Name the folder: KeralaAyurveda_ProductAnalyst_<YourName>

Link permissions

Make sure the link is accessible (viewer access at minimum)
If it’s a private repo/folder, grant access to: nishant@keralaayurveda.biz

What to include in your email response (copy/paste template)

Subject: Product Analyst Assignment Submission —
Body:
Submission link:
Approx time spent:
Any assumptions / known limitations (3–5 bullets)

4) Evaluation criteria

Accuracy & definitions (30%)
Driver identification (25%)
Business usefulness (20%)
Communication (15%)
Pragmatism in chaos (10%)
