PFS

Executive Summary

This PFS document outlines the vision of Meesho’s Search. Through this PFS exercise, we have focused on the need to shift from a query volume-driven approach to a more user-centric (long tail focus) model. We analysed the current search metrics, identified key problems and envisioned search through a new lens.

The key takeaways from this exercise are-

Vision: Make Meesho the go-to app for product searches and become a truly long tail, multi-modal search system that caters to users’ individual preferences.

Best-in-Class Metrics: Improve search aided awareness, search adoption, query market share, CTR and MRR while reducing time to first click and increasing conversion rates. (Current long tail CTR is 2.4% versus 2.9% for head; long tail MRR is 0.11 versus 0.14 for head.)

Problems: Top relevance issues in Search include ineffective understanding of query and attributes (~26% sessions, ~3% NMV opp), inaccurate product type retrieval (~10% sessions, ~1.5% NMV opp), and vernacular queries (~1% sessions, ~0.5% NMV opp).

System architecture: A vision-backwards design is proposed to solve long tail query problems while accommodating multi-modal inputs.

Align on the short term trade-offs: Since we propose changing the roadmap to solve the problems listed for long tail queries, we want to align on a 0.7% NMV/Vi trade-off for this cycle (reducing this cycle’s goal from 1.5% to 0.8%). We plan to recover this trade-off by taking a goal of 2% NMV/Vi by Dec’25.

Objective

The objective of the document is to envision the future of search at Meesho. So far, our approach to solving for Meesho’s search was driven by query volume. Our focus was more on head and torso queries (~75% sessions) and less on tail, which has limited our scope to immediate needs rather than long-term growth opportunities.

Opportunity: Torso conversion is only marginally better than tail conversion, while head conversion is almost twice that of torso and tail. This presents an opportunity of ~5% platform-level conversion through improvements in ~45% of sessions (long tail).

Scope of this doc-

This document focuses on understanding the current problems in search results’ relevance at Meesho Search especially on long tail queries, the metrics we look at and the opportunity that exists.

This document does not detail the implementation plans but focuses on vision, problems, challenges with current system and high-level solutions.

Why envision now?-

Unstructured User Queries: We receive around 17M unique queries every day, of which 11M are tail queries. The number of unique tail queries has increased by 10% in the last 6 months. Although we are a long tail e-commerce platform and know that our users express themselves in an unstructured way, we do not serve tail queries well (current head CTR 2.9% vs long tail CTR 2.4%). Our focus so far has been on improving head and torso queries; it needs to shift to tail queries.

Technological Advancements: Technological advancements around GenAI and new techniques in machine learning algorithms give us the confidence to create systems that can help in understanding the query better and improve relevance of results.

Search Vision

Vision- Make Meesho the go-to app for product searches and become a truly long tail multi-modal search system that caters to users’ individual preferences

By envisioning a forward-thinking search structure, we aim to:

Become the go-to platform for any product search in India: When users think of a product, their first instinct should be to open the Meesho app. {Aim for high Search Aided Awareness and high query market share}

Be a truly long tail search platform: Deliver best-in-class relevance using generative AI, even when a query is completely new to the platform. {Aim for high CTR}

Deliver a truly inclusive and multi-modal search experience: Empower users to interact with Meesho app through voice, image, text, and video-based search, making the journey seamless regardless of input preference. {Aim for high Search adoption}

Understand and serve the user intent: Surface relevant products that are highly personalised to the users’ preferences {Aim for high MRR}

For this PFS, the focus is on relevance of search results. We plan to cover experience and personalisation related points in separate sessions.

Search Excellence Metrics

To understand how current search performs, we went through the metrics and the numbers we trend at today. The key insight is that we significantly lag behind in relevance metrics for long tail compared to head, especially in CTR, MRR and No clicks sessions.

Table 83

Metric type

Metric & Current value (Jan'25)

Definition

Takeaway

Trend

Excellence metrics {Output}

Search Aided Awareness (33% Meesho)

When you think of online shopping, which app do you use first to search for the product? Google, Meesho, Amazon, Flipkart, others please specify

Meesho is the go-to app for females and Tier 2, 3, 4 users. But for Metros and males, it is Flipkart /Amazon. Note: We plan to understand whether the users are clear that the question is around search since there is a probability of confusion here.

⁠

Query market share (23.5%)

{Source- Bobble}

Of all search queries across apps, what volume is captured by Meesho app?

Currently a larger volume of our TG’s search queries is occupied by FK, Google, Instagram.

Search adoption (37%)

Search clicker/DAU

Currently only 37% users landing on the app use search. This can be driven up to drive conversion through understanding user intent.

⁠

⁠

Excellence metrics {Input}

Search relevance-

Feed CTR (2.6%)

Head CTR (2.9%)

Long tail CTR (2.4%)

Feed Relevance Score (LLM eval)*

Feed CTR = Clicks/Views. Feed Relevance Score = relevance score from LLM-based markings.

Note: LLM training for relevance evaluation is in progress

⁠

MRR (Mean reciprocal rank) (MRR=0.12, avg position=8)

Head MRR (0.14)

Long tail MRR (0.11)

Average first click position on search results page

MRR should grow towards 1 to help users reach the desired product quickly.

Other Key Metrics

Search conversion-

Overall (2.0%)

Typed (2.1%)

Fully typed (1.4%)

Suggestions(2.7%)

Voice (1.4%)

Visual (2.5%)

Head: 2.1%

Long tail: 1.1%

Orders/Search session

Suggestions have better conversion due to lower friction. We have taken KRs to improve the adoption of suggestions.

Visual search has better conversion due to higher intent and clarity about the desired product.

Important callout: Order attribution changed after Aug’24, due to which we see a downtrend in conversion across all REs.

Search retention- W10 retention : 40%

Of 100 users that searched on Meesho, what % search again in week 10

Stickiness (14.27%)

Search DAU/Search WAU

⁠

⁠

Search sessions (SS) & Order contribution (OC) by type-

Typed (SS-74.3% & OC-78.6%)

Fully typed (SS-35.0% & OC-24.1%)

Suggestions (SS-39.4% & OC-54.5%)

Voice (SS-20.3% & OC-15.4%)

Visual (SS-5.4% & OC-6.0%)

SS = proportion of search sessions by type. OC = order contribution by type.

Voice search has a higher share of sessions than of orders, which suggests that user demand exists but conversion is low.

⁠

Search relevance and ranking indicators-

No click sessions (48%)

Head: 46%

Long tail: 54%

Feed CVR (1.5%)

Query Reformulation Rate (31%)

Query Bounce Rate (8.6%)

Median scroll depth (26)

No click sessions- % of sessions with no clicks

Feed CVR- Orders/clicks

Query Reformulation Rate- % of sessions where the query was rewritten within 60 secs

Query Bounce Rate- % of sessions with no clicks/scrolls/wishlists/shares

Median scroll depth- Position of the catalog till which the user scrolled

A high share of no-click sessions and a high reformulation rate suggest that users put in a lot of effort before reaching the desired products.

Time to first click

Search results shown to product click (16 secs)

Search bar to product click (28 secs)

Median duration from search bar clicked to first product clicked on search results page

The vision of search is to make discovery easier, hence we should aim to bring this metric down to a few seconds.

⁠

Screenshot 2025-02-04 at 12.15.36 PM.png
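
The relevance metrics in the table above can be computed from session logs roughly as follows. This is a minimal sketch, assuming a hypothetical per-session record; the `Session` type and its field names are illustrative and not the actual logging schema. Sessions with no clicks are counted as contributing 0 to MRR, which is an assumption, since the doc does not state how such sessions are treated. Note that MRR is the mean of per-session reciprocal ranks, not the reciprocal of the average first click position.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Session:
    """Hypothetical search-session record (field names are assumptions)."""
    views: int                      # products viewed in the search feed
    clicks: int                     # product clicks in the search feed
    first_click_pos: Optional[int]  # 1-based position of the first click, None if no click


def feed_ctr(sessions: Sequence[Session]) -> float:
    """Feed CTR = total clicks / total views."""
    views = sum(s.views for s in sessions)
    return sum(s.clicks for s in sessions) / views if views else 0.0


def mean_reciprocal_rank(sessions: Sequence[Session]) -> float:
    """Mean of 1 / first_click_pos; no-click sessions count as 0 (assumption)."""
    if not sessions:
        return 0.0
    total = sum(1.0 / s.first_click_pos for s in sessions if s.first_click_pos)
    return total / len(sessions)


def no_click_rate(sessions: Sequence[Session]) -> float:
    """Share of sessions with no clicks."""
    if not sessions:
        return 0.0
    return sum(1 for s in sessions if s.clicks == 0) / len(sessions)


# Toy data for illustration only; these are not real numbers.
sample = [
    Session(views=40, clicks=2, first_click_pos=4),
    Session(views=20, clicks=0, first_click_pos=None),
    Session(views=40, clicks=1, first_click_pos=10),
    Session(views=100, clicks=2, first_click_pos=1),
]
print(f"Feed CTR: {feed_ctr(sample):.3f}")
print(f"MRR: {mean_reciprocal_rank(sample):.4f}")
print(f"No-click sessions: {no_click_rate(sample):.0%}")
```

Computing the same functions separately over head and long tail sessions would give the split shown in the table.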

⁠

Problems and Challenges with current system

Problems-

To understand the problems in search, we examined the queries with the lowest CTR (relevance indicator) and the worst (highest) first click position. To size the directional opportunity, we compared their CTR, MRR and conversion (orders/session) with head queries, where we do relatively better today.

To understand the search architecture, please refer to this diagram. L0 is the retrieval layer, L1 is the intermediate ranker, and L2 is the final ranker.

Table 84

Problem theme

Problem

Examples

Current metrics

Directional Opportunity

Challenge with the current system

Retrieved catalogs

Final feed

Irrelevant products shown

Attribute not considered

bottle school- water bottles are being shown instead of regular bottles

boroline cream 1 dabbi- other brands being shown instead of Boroline

doerimon pansl box- Doraemon products not shown

CTR- 2.7% Conversion- 1.1% %Sessions- 26%

+3% NMV

Query Understanding- Query attributes not identified.

L0- Catalogs with attribute not given a higher relevance score by retrieval layer.

L1 & L2- Rankers unable to push catalogs with attributes higher.

⁠

Incorrect product type served

mobile- mobile covers are being shown

bajaj induction cooktop- bike covers are being shown

puzzle mat for floor- puzzles are being shown

Exact product match (Proper noun not identified correctly)-

catan- curtains/cotton is shown

half girlfriend

CTR- 2.2% Conversion- 1.1% %Sessions- 10%

+1.5% NMV

L0- Retrieval layer unable to retrieve relevant catalogs.

L1 & L2- Rankers are unable to surface the relevant products at top as the optimising function for them is net O/V.

⁠

Natural language queries

joote dikhao (Mix of jute and shoes in the feed)

poco c61 ka m naam ka cover dikhaiye

CTR- 1.8% Conversion- 1.2% %Sessions- 1%

+0.4% NMV

Query Understanding- User intent not grasped.

⁠

Non-roman script queries (Translation/transliteration problem)

‘ચોલી’ (choli)- potlis and bangles are being shown

‘बाल पोंछने वाला’- razors are being shown

‘സാറ്റിൻ ക്ലോത് നൈസ്’ (nice satin cloth)- undergarments are being shown instead

CTR- 1.5% Conversion- 0.9% %Sessions- 0.1%

+0.1% NMV

Query Understanding- Query not translated/transliterated correctly.

⁠

System’s misinterpretation of queries

‘chitar’ is being changed to ‘churidar’ (the user was looking for a windcheater, which is usually referred to as ‘cheater’)

‘Zanjeer’ is being changed to ‘anjeer’

TBD

Query Understanding- DS spell correction model in QUL changing the meaning of the query.

⁠

Relevant products shown but poorly ranked

Users scroll a lot to find the relevant product in head and torso queries

Queries where relevance is fine but MRR is low-

man chapal

shuj women

Jersey Jersey

+2% NMV

L2- Users not finding a product they like.

Freshness of feed

new/trendy/latest- keywords not being accounted for in the feed

TBD

All systems- Users not finding new/unexplored products in the feed.

Relevant products missed due to other issues

Incorrect cataloging data

Catalog title and image suggest a silk saree, but the catalog data mentions cotton as the fabric

Current systems work on the attributes added by the seller. If the accuracy is low, it creates a potential risk to the solutions that depend on this data.

Unclear user intent

Meaning of user query is unclear or system could not interpret the query

sunil ko

sowp gym

tekfar vilutot (probably, techfire bluetooth)

CTR- 2.1% Conversion- 1.2% %Sessions- 5%

⁠

This doc focuses on the relevance problems listed above.

Alignment points

Relevance Goal for 2025 (2% NMV/Vi with Relevance Only Improvements):

By Dec 2025, we intend to bring overall Search relevance performance closer to head query performance by solving for the top 2 relevance problems (while going long tail query first).

CTR Goal:

Overall: 2.8% against today’s 2.65% (5.6% Increase)

Long tail: 2.7% against today’s 2.4% (12.5% Increase)

MRR Goal:

Overall: 0.135 against today’s 0.13 (4% Increase)

Long tail: 0.125 against today’s 0.11 (13.6% Increase)

LLM eval Goal: TBD

This shall lead to an overall platform NMV increase of 2%.
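
The percentage increases quoted for the CTR and MRR goals are relative changes against today’s values. A minimal sketch of that arithmetic, using only the current and target figures stated above (the `relative_increase` helper is introduced here for illustration):

```python
def relative_increase(current: float, target: float) -> float:
    """Relative increase from current to target, in percent."""
    return (target - current) / current * 100


# (current, target) pairs taken from the goals listed above.
goals = {
    "Overall CTR": (2.65, 2.8),
    "Long tail CTR": (2.4, 2.7),
    "Overall MRR": (0.13, 0.135),
    "Long tail MRR": (0.11, 0.125),
}
for name, (current, target) in goals.items():
    change = relative_increase(current, target)
    print(f"{name}: {current} -> {target} ({change:.1f}% increase)")
```

The results agree with the stated goals up to rounding.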

Short term impact trade off (0.7% NMV/Vi to be pushed out from the current cycle)

Investing in the solves for long tail needs more infra investment and iterations to realise this impact, hence we intend to immediately repurpose the pod’s bandwidth towards long tail. As a result, our iterations on relevance solves (NER for attribute relevance) for head queries will be pushed out, so that we first build for long tail and then extend that solve to head queries.

Hence, 0.7% NMV gets pushed out of the current cycle, which will be realised as part of the overall 2% NMV goal for relevance solve for both long tail and head queries.

Vision backwards proposed design-

Guiding Principles-

Relevance Over Conversion: Prioritise relevant products to help users find what they need faster.

Scalable System Design: Make decisions with long-tail scalability in mind.

Understand Users Better

Search Experience: Analyse user interactions (e.g., voice search experience like long-pressing the voice icon in WA).

Search Relevance & Ranking: Leverage implicit/explicit signals (e.g., some users prefer branded products, which current algorithms overlook).

Transparent Communication: Inform users when a product isn’t available to avoid confusion (e.g., clarifying that laptops aren’t sold).

Proposed design-

We propose the following design, which incorporates:

Query Understanding: Leveraging LLM/SLM for query correction, translation, and category/attribute understanding, moving away from manually created rules.

Retrieval Layer: Implementing multi-vector retrieval, taxonomy attributes, image analysis, and supplier-enriched data to improve catalog understanding, with future enhancements like semantic retrieval and attribute-based boosting.

Relevance Filtering: Using LLM-based evaluators to filter irrelevant catalogs, ensuring precision-focused relevance.

Ranking: Ranking catalogs based on query, attributes, and user affinity for factors like price, rating, and quality, with future plans for enhanced ranking based on gender, region, and attribute preferences.

Post-Ranker Diversification: Introducing a diversity-based ranking to increase variety in the feed, with a planned new component for further improving variety.

Catalog enrichment: Enhancing catalog data through identification of inaccuracies, missing data imputation, supplier-provided enrichments, and AI-driven metadata generation to improve retrieval and ranking effectiveness.

This design aims to enhance catalog retrieval, filtering, and ranking through AI-driven methods, better semantic understanding, and personalized relevance adjustments.
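
The flow of stages in the proposed design can be sketched as a simple pipeline. This is a toy illustration under stated assumptions, not the actual system: every type and function name below is hypothetical, the query-understanding and relevance-filter steps are hard-coded stand-ins for the LLM/SLM components, and post-ranker diversification and catalog enrichment are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Catalog:
    catalog_id: str
    product_type: str
    attributes: Dict[str, str]
    rating: float


@dataclass
class ParsedQuery:
    raw: str
    product_type: str
    attributes: Dict[str, str] = field(default_factory=dict)


def understand_query(raw: str) -> ParsedQuery:
    """Toy stand-in for LLM/SLM query understanding (lookup tables, not a model)."""
    known_types = {"bottle", "saree"}
    known_attrs = {"school": ("usage", "school"), "silk": ("fabric", "silk")}
    tokens = raw.lower().split()
    product_type = next((t for t in tokens if t in known_types), "")
    attributes = dict(known_attrs[t] for t in tokens if t in known_attrs)
    return ParsedQuery(raw=raw, product_type=product_type, attributes=attributes)


def retrieve(query: ParsedQuery, index: List[Catalog]) -> List[Catalog]:
    """Recall-oriented retrieval: match on product type or any query attribute."""
    return [
        c for c in index
        if c.product_type == query.product_type
        or any(c.attributes.get(k) == v for k, v in query.attributes.items())
    ]


def filter_relevant(query: ParsedQuery, candidates: List[Catalog]) -> List[Catalog]:
    """Precision-oriented filter: toy stand-in for an LLM-based relevance evaluator."""
    return [c for c in candidates if c.product_type == query.product_type]


def rank(query: ParsedQuery, candidates: List[Catalog]) -> List[Catalog]:
    """Rank by number of matched query attributes, then by rating."""
    def score(c: Catalog):
        matches = sum(c.attributes.get(k) == v for k, v in query.attributes.items())
        return (matches, c.rating)
    return sorted(candidates, key=score, reverse=True)


def search(raw: str, index: List[Catalog]) -> List[Catalog]:
    query = understand_query(raw)
    return rank(query, filter_relevant(query, retrieve(query, index)))


# Toy index for illustration only.
index = [
    Catalog("c1", "bottle", {"usage": "school"}, 4.1),
    Catalog("c2", "bottle", {"usage": "gym"}, 4.6),
    Catalog("c3", "bag", {"usage": "school"}, 4.8),
]
print([c.catalog_id for c in search("bottle school", index)])
```

In this toy run, the school bag is retrieved because it shares the attribute but is dropped by the relevance filter, and the school bottle ranks above the higher-rated gym bottle because the attribute match takes priority over rating, in line with the "Relevance Over Conversion" principle.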

⁠

image.png

⁠

Current exploit system design (Optional)-

⁠

image.png

⁠

Currently, we operate differently for head-torso vs tail queries. Head and torso results are pre-computed and ranked in real time. Tail results are both computed and ranked in real time. For tail queries, we do not have an L1 ranker and ES yet.

Note 2- Zonal and non-GST CGs have been left out here to reduce complexity of representation.
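
The head-torso vs tail split described above can be pictured as a lookup with a real-time fallback. This is a minimal sketch under assumptions: `serve_query`, the pre-computed store, and the toy retrieve and rank functions are all hypothetical, and the sketch ignores the L0/L1/L2 distinction.

```python
from typing import Callable, Dict, List, Tuple


def serve_query(
    query: str,
    precomputed: Dict[str, List[str]],
    retrieve_realtime: Callable[[str], List[str]],
    rank_realtime: Callable[[str, List[str]], List[str]],
) -> Tuple[str, List[str]]:
    """Use pre-computed candidates when available (head/torso), else retrieve
    in real time (tail). Ranking happens in real time on both paths."""
    candidates = precomputed.get(query)
    path = "precomputed"
    if candidates is None:
        candidates = retrieve_realtime(query)
        path = "realtime"
    return path, rank_realtime(query, candidates)


# Toy stand-ins for illustration only.
precomputed = {"saree": ["s2", "s1", "s3"]}
catalog_titles = {
    "s1": "silk saree",
    "s2": "cotton saree",
    "s3": "saree blouse",
    "m1": "puzzle mat for floor",
}


def toy_retrieve(query: str) -> List[str]:
    """Return catalogs whose title shares at least one term with the query."""
    terms = set(query.split())
    return [cid for cid, title in catalog_titles.items() if terms & set(title.split())]


def toy_rank(query: str, candidates: List[str]) -> List[str]:
    """Placeholder ranker: sorts by catalog id."""
    return sorted(candidates)


print(serve_query("saree", precomputed, toy_retrieve, toy_rank))
print(serve_query("puzzle mat", precomputed, toy_retrieve, toy_rank))
```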

Next steps

Post alignment, we plan the following-

WS: Conduct working sessions on individual solutions, starting with the solution for the ‘attribute not considered’ problem (~3% total NMV opp).

In-depth outside-in: Build upon outside-in solutions and architecture understanding through knowledge repositories available outside and expert calls.

Long term investment: Follow the proposed solutions for the respective streams and get back on the tentative timelines.

Table 85

Stream: Proposed solution

Query understanding: Small language models (SLMs) for long tail queries to understand the intent better

Retrieval layer: Intelligent retrieval layer that fetches only the relevant products

Salience of relevance in ranking: Prioritising highly relevant results on top of the feed

Ranking: Enhanced ranking based on region, gender and attribute preferences

Platform investment: Catalog enrichment- Catalog data inaccuracies identification and improving the fill rate
