
PFS

Executive Summary

This PFS document outlines the vision of Meesho’s Search. Through this PFS exercise, we have focused on the need to shift from a query volume-driven approach to a more user-centric (long tail focus) model. We analysed the current search metrics, identified key problems and envisioned search through a new lens.
The key takeaways from this exercise are-
Vision: Make Meesho the go-to app for product searches and become a truly long tail, multi-modal search system that caters to users’ individual preferences.
Best-in-Class Metrics: Improve search aided awareness, search adoption, query market share, CTR and MRR while reducing time to first click and increasing conversion rates (Current long tail CTR is 2.4% whereas that for head is 2.9%; long tail MRR is 0.11 whereas that for head is 0.14).
Problems: Top relevance issues in Search include ineffective understanding of query and attributes (~26% sessions, ~3% NMV opp), inaccurate product type retrieval (~10% sessions, ~1.5% NMV opp), and vernacular queries (~1% sessions, ~0.5% NMV opp).
System architecture: A vision-backwards design is proposed to solve long tail query problems while accommodating multi-modal inputs.
Align on the short term trade-offs: Since we are proposing to change the roadmap to solve the long tail query problems listed above, we want to align on a 0.7% NMV/Vi trade-off for this cycle (reducing this cycle’s goal from 1.5% to 0.8%). We plan to recover this trade-off by taking a goal of 2% NMV/Vi by Dec’25.

Objective

The objective of the document is to envision the future of search at Meesho. So far, our approach to solving for Meesho’s search was driven by query volume. Our focus was more on head and torso queries (~75% sessions) and less on tail, which has limited our scope to immediate needs rather than long-term growth opportunities.
Opportunity: Our torso conversion is only marginally better than tail conversion, while head conversion is almost twice that of torso and tail. This presents an opportunity of ~5% platform-level conversion through improvement in ~45% of sessions (long tail).

Scope of this doc-

This document focuses on understanding the current problems in search results’ relevance at Meesho Search especially on long tail queries, the metrics we look at and the opportunity that exists.
This document does not detail the implementation plans but focuses on vision, problems, challenges with current system and high-level solutions.

Why envision now?-

Unstructured User Queries: We receive around 17M unique queries every day, of which 11M are tail queries. The number of unique tail queries has increased by 10% in the last 6 months. Although we are a long tail e-commerce platform and know that our users express themselves in an unstructured way, we do not serve tail queries well (current head CTR 2.9% vs long tail CTR 2.4%). Our focus so far has been on improving Head and Torso queries; it needs to shift to Tail queries.
Technological Advancements: Technological advancements around GenAI and new techniques in machine learning algorithms give us the confidence to create systems that can help in understanding the query better and improve relevance of results.

Search Vision

Vision- Make Meesho the go-to app for product searches and become a truly long tail multi-modal search system that caters to users’ individual preferences

By envisioning a forward-thinking search structure, we aim to:
Become the go-to platform for any product search for India: When users think of a product, their first instinct should be to open the Meesho app. {Aim for high Search Aided Awareness and high query market share}
Be a truly long tail search platform: Deliver best-in-class relevance using generative AI even when a query is absolutely new on the platform. {Aim for high CTR}
Deliver a truly inclusive and multi-modal search experience: Empower users to interact with Meesho app through voice, image, text, and video-based search, making the journey seamless regardless of input preference. {Aim for high Search adoption}
Understand and serve the user intent: Surface relevant products that are highly personalised to the users’ preferences {Aim for high MRR}
For this PFS, the focus is on relevance of search results. We plan to cover experience and personalisation related points in separate sessions.

Search Excellence Metrics

To understand how current search performs, we went through the metrics and the numbers we trend at today. The key insight is that we significantly lag behind in relevance metrics for long tail compared to head, especially in CTR, MRR and No clicks sessions.
Table 83
Metric type
Metric & Current value (Jan'25)
Definition
Takeaway
Trend
Excellence metrics {Output}
Search Aided Awareness (33% Meesho)
When you think of online shopping, which app do you use first to search for the product? Google, Meesho, Amazon, Flipkart, others please specify
Meesho is the go-to app for females and Tier 2, 3, 4 users. But for Metros and males, it is Flipkart /Amazon. Note: We plan to understand whether the users are clear that the question is around search since there is a probability of confusion here.
Query market share (23.5%)
{Source- Bobble}
Of all search queries across apps, what volume is captured by Meesho app?
Currently a larger volume of our TG’s search queries is occupied by FK, Google, Instagram.
Search adoption (37%)
Search clicker/DAU
Currently only 37% of users landing on the app use search. Driving this up can improve conversion, since search lets us understand user intent.
Excellence metrics {Input}
Search relevance-
Feed CTR (2.6%)
Head CTR (2.9%)
Long tail CTR (2.4%)
Feed Relevance Score (LLM eval)*
Feed CTR= Clicks/Views
Feed Relevance Score- Relevance score based on markings by LLMs
Note: LLM training for relevance evaluation is in progress
MRR (Mean reciprocal rank) (MRR=0.12, avg position=8)
Head MRR (0.14)
Long tail MRR (0.11)
MRR- Mean of 1/(first click position) across search sessions
Avg position- Average first click position on search results page
MRR should grow towards 1 to help users reach the desired product quickly.
Other Key Metrics
Search conversion-
Overall (2.0%)
Typed (2.1%)
Fully typed (1.4%)
Suggestions(2.7%)
Voice (1.4%)
Visual (2.5%)
Head: 2.1%
Long tail: 1.1%
Orders/Search session
Suggestions have a better conversion due to lesser friction. We have taken KRs to improve the adoption of suggestions.
Visual search has a better conversion due to higher intent and clarity of desired product.
Imp callout: Orders attribution has changed post Aug’24 due to which we see a downtrend in conv in all REs
Search retention- W10 retention : 40%
Of 100 users that searched on Meesho, what % search again in week 10
Stickiness (14.27%)
Search DAU/Search WAU
Search sessions (SS) & Order contribution (OC) by type-
Typed (SS-74.3% & OC-78.6%)
Fully typed (SS-35.0% & OC-24.1%)
Suggestions (SS-39.4% & OC-54.5%)
Voice (SS-20.3% & OC-15.4%)
Visual (SS-5.4% & OC-6.0%)
SS- Proportion of search sessions by type
OC- Order contribution by type
Voice has a higher share of search sessions (20.3%) than of orders (15.4%), which suggests that user demand exists but conversion is low.
Search relevance and ranking indicators-
No click sessions (48%)
Head: 46%
Long tail: 54%
Feed CVR (1.5%)
Query Reformulation Rate (31%)
Query Bounce Rate (8.6%)
Median scroll depth (26)
No click sessions- % of sessions with no clicks
Feed CVR- Orders/Clicks
Query Reformulation Rate- % of sessions where the query is rewritten within 60 secs
Query Bounce Rate- % of sessions with no clicks/scrolls/wishlists/shares
Median scroll depth- Median catalog position till which the user scrolled
The high % of no click sessions and the high reformulation rate suggest that users have to put in a lot of effort before reaching the desired products.
Time to first click
Search results shown to product click (16 secs)
Search bar to product click (28 secs)
Median duration from search bar clicked to first product clicked on search results page
The vision of search is to make discovery easier, hence we should aim to bring this metric down to a few seconds.
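To make the relevance metric definitions above concrete, here is a minimal sketch of how Feed CTR, MRR and the no click session rate could be computed from per-session logs. The `SearchSession` record and its field names are hypothetical and not the actual log schema. The sketch also assumes that MRR is averaged only over sessions with at least one click, since no click sessions are tracked as a separate metric; the source does not state which convention is used.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SearchSession:
    """Hypothetical per-session log record; field names are assumptions."""
    views: int                           # products viewed in the search feed
    clicks: int                          # product clicks in the search feed
    first_click_position: Optional[int]  # 1-based rank of first click, None if no click


def feed_ctr(sessions: List[SearchSession]) -> float:
    """Feed CTR = Clicks / Views, aggregated over all sessions."""
    total_views = sum(s.views for s in sessions)
    total_clicks = sum(s.clicks for s in sessions)
    return total_clicks / total_views if total_views else 0.0


def mean_reciprocal_rank(sessions: List[SearchSession]) -> float:
    """Mean of 1/first_click_position over sessions with a click (assumed convention)."""
    clicked = [s for s in sessions if s.first_click_position]
    if not clicked:
        return 0.0
    return sum(1.0 / s.first_click_position for s in clicked) / len(clicked)


def no_click_rate(sessions: List[SearchSession]) -> float:
    """Share of sessions with zero clicks."""
    if not sessions:
        return 0.0
    return sum(1 for s in sessions if s.clicks == 0) / len(sessions)


# Toy data for illustration only; not real Meesho numbers.
sessions = [
    SearchSession(views=100, clicks=3, first_click_position=2),
    SearchSession(views=50, clicks=0, first_click_position=None),
    SearchSession(views=50, clicks=2, first_click_position=4),
]
print(f"Feed CTR: {feed_ctr(sessions):.3f}")
print(f"MRR: {mean_reciprocal_rank(sessions):.3f}")
print(f"No click sessions: {no_click_rate(sessions):.1%}")
```

Running the same functions separately on head and long tail sessions would give the segment-level splits reported in the table.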

Problems and Challenges with current system

Problems-

To understand the problems in search, we examined the queries with the lowest CTR (relevance indicator) and the worst (highest) first click position. To find the directional opportunity, we compared their CTR, MRR and Conversion (orders/session) with head (queries where we do relatively better today).
To understand the search architecture, please refer to the diagram. L0 is the retrieval layer, L1 is the intermediate ranker and L2 is the final ranker.
Table 84
Problem theme
Problem
Examples
Current metrics
Directional Opportunity
Challenge with the current system
Retrieved catalogs
Final feed
Irrelevant products shown
Attribute not considered
bottle school- water bottles are being shown instead of regular bottles
boroline cream 1 dabbi- other brands being shown instead of boroline
doerimon pansl box- Doraemon not shown
CTR- 2.7% Conversion- 1.1% %Sessions- 26%
+3% NMV
Query Understanding- Query attributes not identified.
L0- Catalogs with attribute not given a higher relevance score by retrieval layer.
L1 & L2- Rankers unable to push catalogs with attributes higher.
Incorrect product type served
mobile- mobile covers are being shown
bajaj induction cooktop- bike covers are being shown
puzzle mat for floor- puzzles are being shown
Exact product match (Proper noun not identified correctly)-
catan- curtains/cotton is shown
half girlfriend
CTR- 2.2% Conversion- 1.1% %Sessions- 10%
+1.5% NMV
L0- Retrieval layer unable to retrieve relevant catalogs.
L1 & L2- Rankers are unable to surface the relevant products at top as the optimising function for them is net O/V.
Natural language queries
joote dikhao (Mix of jute and shoes in the feed)
poco c61 ka m naam ka cover dikhaiye
CTR- 1.8% Conversion- 1.2% %Sessions- 1%
+0.4% NMV
Query Understanding- User intent not grasped.
Non-roman script queries (Translation/transliteration problem)
‘ચોલી’ (choli)- potlis and bangles are being shown
‘बाल पोंछने वाला’- razors are being shown
‘സാറ്റിൻ ക്ലോത് നൈസ്’ (nice satin cloth)- undergarments are being shown instead
CTR- 1.5% Conversion- 0.9% %Sessions- 0.1%
+0.1% NMV
Query Understanding- Query not translated/transliterated correctly.
System’s misinterpretation of queries
‘chitar’ is being changed to ‘churidar’ (the user was looking for a windcheater, which is usually referred to as ‘cheater’)
‘Zanjeer’ is being changed to ‘anjeer’
NA
TBD
Query Understanding- DS spell correction model in QUL changing the meaning of the query.
Relevant products shown but poorly ranked
Users scroll a lot to find the relevant product in head and torso queries
Queries where relevance is fine but MRR is low-
man chapal
shuj women
Jersey Jersey
NA
+2% NMV
L2- Users not finding a product they like.
Freshness of feed
new/trendy/latest- keywords not being accounted in the feed
NA
TBD
All systems- Users not finding new/unexplored products in the feed.
Relevant products missed due to other issues
Incorrect cataloging data
Catalog title and image suggests silk saree but catalog data mentions cotton as the fabric
NA
NA
Current systems work on the attributes added by the seller. If the accuracy is low, it creates a potential risk to the solutions that depend on this data.
Unclear user intent
Meaning of user query is unclear or system could not interpret the query
sunil ko
sowp gym
tekfar vilutot (probably, techfire bluetooth)
CTR- 2.1% Conversion- 1.2% %Sessions- 5%
NA
NA
Within the scope of this doc, we focus on the relevance problems.

Alignment points

Relevance Goal for 2025 (2% NMV/Vi with Relevance Only Improvements):

By Dec 2025, we intend to bring overall Search relevance performance closer to head query performance by solving the top 2 relevance problems, taking a long-tail-first approach.
CTR Goal:
Overall: 2.8% against today’s 2.65% (5.6% Increase)
Long tail: 2.7% against today’s 2.4% (12.5% Increase)
MRR Goal:
Overall: 0.135 against today’s 0.13 (4% Increase)
Long tail: 0.125 against today’s 0.11 (13.6% Increase)
LLM eval Goal: TBD
This should lead to an overall platform NMV increase of 2%.
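The percentage increases quoted in the CTR and MRR goals are relative uplifts over today's values. The short sketch below recomputes them from the numbers stated in this section; the function name is illustrative only.

```python
def relative_increase_pct(current: float, target: float) -> float:
    """Relative uplift of target over current, in percent."""
    return (target - current) / current * 100.0


# (today's value, Dec 2025 goal) as stated in this section
goals = {
    "CTR overall": (2.65, 2.8),
    "CTR long tail": (2.4, 2.7),
    "MRR overall": (0.13, 0.135),
    "MRR long tail": (0.11, 0.125),
}

for name, (current, target) in goals.items():
    print(f"{name}: {relative_increase_pct(current, target):.1f}% increase")
```

The computed uplifts come out at roughly 5.7%, 12.5%, 3.8% and 13.6%, close to the figures quoted above.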

Short term impact trade off (0.7% NMV/Vi to be pushed out from the current cycle)

Long tail solves need more infra investment and iterations before their impact is realised, hence we intend to immediately repurpose the pod’s bandwidth towards long tail. As a result, our iterations on relevance solves for head queries (NER for attribute relevance) will be pushed out: we will first build for long tail and then extend that solve to head queries.
Hence, 0.7% NMV gets pushed out of the current cycle and will be realised as part of the overall 2% NMV goal for relevance solves across both long tail and head queries.

Vision backwards proposed design-

Guiding Principles-

Relevance Over Conversion: Prioritise relevant products to help users find what they need faster.
Scalable System Design: Make decisions with long-tail scalability in mind.
Understand Users Better
Search Experience: Analyse user interactions (e.g., voice search experience like long-pressing the voice icon in WA).
Search Relevance & Ranking: Leverage implicit/explicit signals (e.g., some users prefer branded products, which current algorithms overlook).
Transparent Communication: Inform users when a product isn’t available to avoid confusion (e.g., clarifying that laptops aren’t sold).

Proposed design-

We propose the following design, which incorporates:
Query Understanding: Leveraging LLM/SLM for query correction, translation, and category/attribute understanding, moving away from manually created rules.
Retrieval Layer: Implementing multi-vector retrieval, taxonomy attributes, image analysis, and supplier-enriched data to improve catalog understanding, with future enhancements like semantic retrieval and attribute-based boosting.
Relevance Filtering: Using LLM-based evaluators to filter irrelevant catalogs, ensuring precision-focused relevance.
Ranking: Ranking catalogs based on query, attributes, and user affinity for factors like price, rating, and quality, with future plans for enhanced ranking based on gender, region, and attribute preferences.
Post-Ranker Diversification: Introducing a diversity-based ranking to increase variety in the feed, with a planned new component for further improving variety.
Catalog enrichment: Enhancing catalog data through identification of inaccuracies, missing data imputation, supplier-provided enrichments, and AI-driven metadata generation to improve retrieval and ranking effectiveness.
This design aims to enhance catalog retrieval, filtering, and ranking through AI-driven methods, better semantic understanding, and personalized relevance adjustments.
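The stages listed above can be read as a simple pipeline. The sketch below is a toy illustration of that flow only, not the proposed implementation: every class, function and field name is hypothetical, and the LLM/SLM components are replaced by trivial rule-based stand-ins so the example stays self-contained.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Catalog:
    catalog_id: str
    product_type: str
    attributes: Dict[str, str]
    score: float = 0.0


@dataclass
class ParsedQuery:
    product_type: str
    attributes: Dict[str, str] = field(default_factory=dict)


def understand_query(raw: str) -> ParsedQuery:
    """Stand-in for LLM/SLM query understanding (toy lookup tables, not a model)."""
    known_types = {"bottle", "saree"}
    known_attrs = {"school": ("usage", "school"), "silk": ("fabric", "silk")}
    tokens = raw.lower().split()
    product_type = next((t for t in tokens if t in known_types), "")
    attributes = dict(known_attrs[t] for t in tokens if t in known_attrs)
    return ParsedQuery(product_type, attributes)


def retrieve(query: ParsedQuery, index: List[Catalog]) -> List[Catalog]:
    """Retrieval stand-in: fetch catalogs of the requested product type."""
    return [c for c in index if c.product_type == query.product_type]


def filter_relevant(query: ParsedQuery, candidates: List[Catalog]) -> List[Catalog]:
    """Stand-in for the LLM-based evaluator: keep catalogs matching every query attribute."""
    return [c for c in candidates
            if all(c.attributes.get(k) == v for k, v in query.attributes.items())]


def rank(candidates: List[Catalog], price_affinity: Dict[str, float]) -> List[Catalog]:
    """Toy ranker: score each catalog by the user's affinity for its price band."""
    for c in candidates:
        c.score = price_affinity.get(c.attributes.get("price_band", ""), 0.0)
    return sorted(candidates, key=lambda c: c.score, reverse=True)


def diversify(ranked: List[Catalog], attr: str = "color") -> List[Catalog]:
    """Greedy post-ranker: avoid repeating the same attribute value back to back."""
    remaining, result = list(ranked), []
    while remaining:
        last = result[-1].attributes.get(attr) if result else None
        pick = next((c for c in remaining if c.attributes.get(attr) != last), remaining[0])
        remaining.remove(pick)
        result.append(pick)
    return result


def search(raw: str, index: List[Catalog], price_affinity: Dict[str, float]) -> List[Catalog]:
    query = understand_query(raw)
    relevant = filter_relevant(query, retrieve(query, index))
    return diversify(rank(relevant, price_affinity))
```

In this toy version the relevance filter runs before ranking, so a catalog that misses a query attribute (for example a gym bottle for the query "bottle school") never reaches the ranker, which mirrors the precision-focused intent of the relevance filtering stage.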
image.png

Current exploit system design (Optional)-

image.png
Currently, we operate differently for head-torso vs tail queries. Head and torso results are pre-computed and ranked in real time. Tail results are both computed and ranked in real time. For Tail queries, we do not yet have an L1 ranker or ES.
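The split between head-torso and tail serving described above can be sketched as a simple routing step. This is an illustrative assumption about the control flow only; the function and parameter names are hypothetical and do not reflect the actual services.

```python
from typing import Callable, Dict, List


def serve_query(
    query: str,
    precomputed: Dict[str, List[str]],
    retrieve_realtime: Callable[[str], List[str]],
    rank_realtime: Callable[[str, List[str]], List[str]],
) -> List[str]:
    """Head/torso: reuse pre-computed candidates. Tail: retrieve at request time.
    In both cases the final ranking happens in real time."""
    candidates = precomputed.get(query)
    if candidates is None:
        # Tail path: nothing pre-computed, so candidates are computed on the fly.
        candidates = retrieve_realtime(query)
    return rank_realtime(query, candidates)


# Toy usage with stub components (illustrative catalog IDs only).
precomputed = {"saree": ["c3", "c1", "c2"]}
results = serve_query(
    "bottle school",
    precomputed,
    retrieve_realtime=lambda q: ["c9", "c7"],
    rank_realtime=lambda q, cands: sorted(cands),
)
print(results)
```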
Note 2- Zonal and non-GST CGs have been left out here to reduce complexity of representation.

Next steps

Post alignment, we plan the following-
WS: Conduct working sessions on individual solutions, starting with the solution for the ‘attribute not considered’ problem (~3% total NMV opp).
In-depth outside-in: Build upon outside-in solutions and architecture understanding through knowledge repositories available outside and expert calls.
Long term investment: Follow the proposed solutions for the respective streams and get back on the tentative timelines.
Table 85
Stream: Proposed solution
Query understanding: Small language models (SLMs) for long tail queries to understand the intent better
Retrieval layer: Intelligent retrieval layer that fetches only the relevant products
Salience of relevance in ranking: Prioritising highly relevant results on top of the feed
Ranking: Enhanced ranking based on region, gender and attribute preferences
Platform investment: Catalog enrichment- Catalog data inaccuracies identification and improving the fill rate