Filter Ranking Solutioning
Search RE
GOAL: Given a user search query and the list of result products associated with it, come up with the following:
- Ranking of Labels for Static Filters
- Within a Static Filter Label, ranking of Label Values
- Identifying the important attribute Filter Labels and Values to be shown as HVF
- Ranking of these HVF Filters
- Identifying the important attribute Filter Labels and Values to be shown as IF
- Ranking of these IF Filters
Current Scenario for Static Filters
For a given query, a list of products is returned. The intersection of their respective Label Keys and Values is taken; if enough products are present for a particular Key-Value pair, that Filter is shown on the app.
Every Key and Value is assigned a static position, and the Filter Label and Label Value are displayed at that position.
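The current behaviour described above can be sketched roughly as follows. This is only an illustration: the threshold `MIN_PRODUCTS`, the function names and the position maps are hypothetical, since the source only says that "decent products" must be present and does not name the actual cutoff or data structures.

```python
from collections import Counter

# Hypothetical threshold: the source does not state the actual cutoff.
MIN_PRODUCTS = 2

def eligible_filters(products, min_products=MIN_PRODUCTS):
    """Return the (key, value) pairs carried by at least `min_products` products.

    `products` is a list of dicts mapping label key -> label value.
    """
    counts = Counter()
    for attrs in products:
        for key, value in attrs.items():
            counts[(key, value)] += 1
    return {pair for pair, n in counts.items() if n >= min_products}

def order_by_static_position(pairs, key_position, value_position):
    """Sort filters by the fixed position of the key, then of the value."""
    return sorted(pairs, key=lambda kv: (key_position[kv[0]], value_position[kv]))

# Made-up result set for one query.
products = [
    {"Color": "Red", "Fabric": "Cotton"},
    {"Color": "Red", "Fabric": "Silk"},
    {"Color": "Blue", "Fabric": "Cotton"},
]
shown = eligible_filters(products)

# Hypothetical static positions; these never change with the query.
key_position = {"Fabric": 0, "Color": 1}
value_position = {("Fabric", "Cotton"): 0, ("Color", "Red"): 0}
ordered = order_by_static_position(shown, key_position, value_position)
```

Because the positions are fixed, the same order is produced for every query, which is the limitation the DS intervention is meant to address.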
DS Intervention
Depending on the query, the ranking of Filter label and label values should be dynamically changed.
We need to come up with a relevance score for each Filter Label and Label Value.
The Relevance score will be calculated at (Category and Label) and (Category and Label Value) level.
Signals Used to get Relevance Score
For a given (SSCat_i, Key_i) {can be extended to Value}
Features Available For now
Existing demand signals for existing label values (X%): demand signals that users proactively use to channel their intent.
- Historical filter usage: usage spread of filters across SSCat × label value combinations.
- Historical search usage: attributes forming the Pareto volume across base SSCat search queries.

Alternative demand signals for new label values (Y%):
- Order profile: OC coming in for various category × label value combinations.
  Key Importance in Ordered Products in an SSCatId (in last [1,3,7,14] days) → (No. of ordered products with Key_i in SSCat_i) / (total ordered products in SSCat_i)
- Click profile: clicks coming in for various category × label combinations.
  Key Importance in Clicked Products in an SSCatId (in last [1,3,7,14] days) → (No. of clicked products with Key_i in SSCat_i) / (total clicked products in SSCat_i)
  This importance is calculated for each RE (Wishlist, Search, PDP Reco, FY).
- View profile: views coming in for various category × label value combinations.
  Key Importance in Viewed Products in an SSCatId (in last [1,3,7,14] days) → (No. of viewed products with Key_i in SSCat_i) / (total viewed products in SSCat_i)
  This importance is calculated for each RE (Wishlist, Search, PDP Reco, FY).
- View to click ratio: clicks coming in per view for a given category × label value combination.
  Average Click/View (in last [1,3,7,14] days) → AVG over all products with Key_i in SSCat_i [(No. of times a product was clicked) / (No. of times that product was viewed)]
  This ratio is calculated for each RE. For example, say an SSCatId has 1,000 products; each product has a C/V value associated with it.
- Click to order ratio: orders coming in per product click for a given category × label value combination.
  Average Order/Click (in last [1,3,7,14] days) → AVG over all products with Key_i in SSCat_i [(No. of times a product was ordered) / (No. of times that product was clicked)]

Attribute level supply spread (Z%):
- Supply profile: count of products present on the platform for a given category × label value combination.
  Key Importance in SSCatId → (No. of products with Key_i in SSCat_i) / (total products in SSCat_i)
  Remove keys that have extremely low Key Importance. For example, in the Face Wipes category, 2 out of 90k products have the Sleeve Length attribute key. This is an anomaly, because Sleeve Length can never be a relevant key for Face Wipes. Removing such keys helps eliminate these anomalies.

Attribute level quality spread (AA%): pushes filter labels where ratings are typically higher.
- % of high quality products (> 4.0 rating) with Key_i in SSCat_i
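The ratio-style features above could be computed along these lines. This is a minimal sketch under stated assumptions: the record layout (`attrs`, `views`, `clicks`, `orders`), the function names and the pruning threshold are all hypothetical, and products with a zero denominator are simply skipped when averaging, which the source does not specify. The caller is assumed to have already restricted the data to one SSCat, one RE and one lookback window.

```python
def key_importance(products, key):
    """Share of products that carry `key`.

    `products` is a list of attribute dicts, e.g. the ordered, clicked,
    viewed or listed products of one SSCat.
    """
    if not products:
        return 0.0
    return sum(1 for attrs in products if key in attrs) / len(products)

def avg_ratio(stats, key, numerator, denominator):
    """Average per-product numerator/denominator over products with `key`.

    Products whose denominator is zero are skipped (an assumption).
    """
    ratios = [
        s[numerator] / s[denominator]
        for s in stats
        if key in s["attrs"] and s[denominator] > 0
    ]
    return sum(ratios) / len(ratios) if ratios else 0.0

def prune_rare_keys(importance_by_key, min_importance):
    """Drop keys whose importance is below a (hypothetical) threshold."""
    return {k: v for k, v in importance_by_key.items() if v >= min_importance}

# Made-up per-product stats for one SSCat.
stats = [
    {"attrs": {"Color": "Red", "Sleeve Length": "Full"}, "views": 100, "clicks": 10, "orders": 2},
    {"attrs": {"Color": "Blue"}, "views": 50, "clicks": 5, "orders": 1},
    {"attrs": {"Color": "Green"}, "views": 0, "clicks": 0, "orders": 0},
    {"attrs": {"Fabric": "Cotton"}, "views": 200, "clicks": 40, "orders": 4},
]

color_supply = key_importance([s["attrs"] for s in stats], "Color")
color_click_per_view = avg_ratio(stats, "Color", "clicks", "views")
color_order_per_click = avg_ratio(stats, "Color", "orders", "clicks")

# The Face Wipes example: 2 of 90k products carry Sleeve Length.
kept = prune_rare_keys({"Color": 0.75, "Sleeve Length": 2 / 90000}, min_importance=0.001)
```

The same `key_importance` function covers the order, click, view and supply profiles; only the input product list changes.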
How do we finalise X, Y, Z and AA? In principle, filter usage and historical search usage are the most reflective of a user's intent to buy.
Other interaction metrics could give a directional sense of the attribute classes most consumed, interacted or transacted per category.
With this understanding, we'd want to weight existing demand signals at ~50% and alternative signals at 40%, while keeping a 5% explore budget for supply and quality spread.
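One way to read this weighting is as a weighted sum over the four signal groups. The sketch below is illustrative only: it assumes each group has already been reduced to a single score normalised to [0, 1], and it assumes the 5% explore budget applies to supply and quality separately so that the weights sum to 1.0. The group names, function names and example numbers are hypothetical.

```python
# Assumed split: X = 50%, Y = 40%, Z = 5%, AA = 5% (sums to 1.0).
WEIGHTS = {
    "existing_demand": 0.50,     # X
    "alternative_demand": 0.40,  # Y
    "supply": 0.05,              # Z
    "quality": 0.05,             # AA
}

def relevance_score(signals, weights=WEIGHTS):
    """Weighted sum of group scores; a missing group contributes 0."""
    return sum(w * signals.get(group, 0.0) for group, w in weights.items())

def rank_keys(signals_by_key, weights=WEIGHTS):
    """Return keys sorted by descending relevance score."""
    return sorted(
        signals_by_key,
        key=lambda k: relevance_score(signals_by_key[k], weights),
        reverse=True,
    )

# Made-up, pre-normalised group scores for three keys in one SSCat.
signals_by_key = {
    "Sleeve Length": {"existing_demand": 0.1, "alternative_demand": 0.2, "supply": 0.3, "quality": 0.4},
    "Color": {"existing_demand": 0.8, "alternative_demand": 0.6, "supply": 0.9, "quality": 0.5},
    "Fabric": {"existing_demand": 0.4, "alternative_demand": 0.9, "supply": 0.7, "quality": 0.8},
}
ranking = rank_keys(signals_by_key)
```

Passing a different `weights` dict to `rank_keys` is enough to produce an alternative ranker variant.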
To align on the final ranker, we could run variants with different weights and compare their performance, using the filters' CTR and conversion as the outcome metrics.
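Comparing variants might look like the sketch below. The metric definitions are assumptions, not something the source specifies: CTR is taken as filter clicks per filter impression, and conversion as orders per filter click. The variant names and counts are made up for illustration.

```python
def safe_ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0

def summarise_variants(variant_stats):
    """Compute filter CTR and conversion for each weight variant.

    Assumed definitions: CTR = filter clicks / filter impressions,
    conversion = orders / filter clicks.
    """
    summary = {}
    for name, s in variant_stats.items():
        summary[name] = {
            "ctr": safe_ratio(s["filter_clicks"], s["filter_impressions"]),
            "conversion": safe_ratio(s["orders"], s["filter_clicks"]),
        }
    return summary

# Illustrative counts only, not real experiment results.
variant_stats = {
    "variant_a": {"filter_impressions": 1000, "filter_clicks": 120, "orders": 12},
    "variant_b": {"filter_impressions": 1000, "filter_clicks": 150, "orders": 12},
}
summary = summarise_variants(variant_stats)
```

In this made-up data the two metrics disagree (variant_b has the higher CTR, variant_a the higher conversion), so how the two are traded off would need to be agreed before picking a winner.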
Open point: Should we build distinct models per RE, prioritising view / click / order interactions on that RE, or do this iteratively?
Contribution to Total C/v
Features Not Readily Available
- Query level feature: for a given SSCat_i, (all queries that talk about Key_i) / (total queries for SSCat_i)
- Review level feature: for a given SSCat_i, (all reviews that talk about Key_i) / (total reviews for SSCat_i)
Using these features, come up with a relevance score for each (SSCat_i, Key_i).
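Both the query level and the review level feature reduce to the share of texts that mention a key. A rough sketch follows, assuming a naive case-insensitive substring match against a hand-supplied list of terms per key. The term list and the function name are hypothetical, and a real system would likely need proper attribute extraction rather than substring matching.

```python
def mention_share(texts, key_terms):
    """Fraction of texts containing at least one term associated with a key.

    Matching is a case-insensitive substring check (a simplification).
    """
    if not texts:
        return 0.0
    terms = [t.lower() for t in key_terms]
    hits = sum(1 for text in texts if any(t in text.lower() for t in terms))
    return hits / len(texts)

# Made-up queries for one SSCat.
queries = [
    "red cotton kurti",
    "Full Sleeve kurti",
    "kurti for women",
    "cotton kurti with full sleeves",
]
# Hypothetical terms that indicate the "Sleeve Length" key.
sleeve_share = mention_share(queries, ["sleeve"])
```

The same function can be applied to review texts to obtain the review level feature.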
IMP NOTE:
In an ideal world, where all the labels and label values that sellers upload are reflected in the static and HVF filters, we would use user-filter interaction signals to understand which filters work better, for example the number of products ordered, clicked and viewed after a filter is applied.
This is a very strong signal that directly correlates with filter importance. However, the app does not currently show all the possible filters that sellers upload as taxonomy attributes, so for the time being these features are not used.
Once the Filter Label and Label Value Rationalisation fix is in place in tech and all filter labels and label values are shown on the app, we can start collecting these signals and use them to rank our filters even better.
DS solution overview for Filter ranking service