Hunch RecSys

"Not Interested" Button & "Category" Widget: Data Science Perspective

Introduction

This document describes the data science perspective on adding the following to the Hunch app:

“Not Interested” Button: On selected polls, we start showing a “Not Interested” button. On click:

We register the category of that particular poll and exclude questions of that category from the feed

Why? People know more about what they don’t want to see than about what they want to see.

“Category” Widget: With every poll, we start showing the poll’s category as well. On click:

We curate a list of questions from that category and present it to the user

We register that category as an interesting category for the user

Why? This helps us validate the category, and it helps the user engage with more polls without having to scroll mindlessly through the feed to find the next interesting poll.

The above is taken from the following Coda doc:

OP - Ask interest areas for better recommendation

Not Interested Button

Problem Statement

At present, negative feedback, where the user implicitly or explicitly indicates that they dislike a poll, is not supported as training input for Amazon Personalize. Furthermore, there is currently no way to add a weight/reward to specific interactions by event type or event value. For more details, read this.

The following is a snapshot of the data that we send to AWS Personalize for training:

EVENT_TYPE       EVENT_VALUE
poll_expanded    1.5
poll_view        0.5
cast_vote        4.0

Further, AWS Personalize does not provide a way to exclude from training the interactions that we consider negative feedback.

For example, in our use case, we treat EVENT_TYPE == 'poll_view' as negative feedback to our recommender system, and AWS Personalize does not provide a way to filter poll_view interactions out of its training.

While we could remove poll_view interactions from the data before sending it for training, doing so would complicate the event-tracker-based data stream that AWS Personalize consumes in real time. We also filter user-poll pairs within AWS Personalize based on:

EXCLUDE EVENT_TYPE == 'poll_view'

Thus, interactions with EVENT_TYPE == 'poll_view' are required data instances for AWS Personalize.
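The EXCLUDE rule above is written in shorthand. As a minimal sketch, assuming the filter is defined through Amazon Personalize filter expressions, the helper below builds the equivalent expression string. The function name is hypothetical and not taken from our codebase.

```python
def exclude_event_types_expression(event_types):
    """Build a filter expression that excludes items the user has already
    interacted with through any of the given event types."""
    if not event_types:
        raise ValueError("at least one event type is required")
    # Each event type is wrapped in double quotes inside the IN (...) list.
    quoted = ", ".join(f'"{event_type}"' for event_type in event_types)
    return f"EXCLUDE ItemID WHERE Interactions.EVENT_TYPE IN ({quoted})"


expression = exclude_event_types_expression(["poll_view"])
print(expression)
# EXCLUDE ItemID WHERE Interactions.EVENT_TYPE IN ("poll_view")
```

The resulting string is the kind of value that would be supplied as filterExpression when creating a filter on the dataset group.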

Solution

While AWS Personalize does not provide a way to exclude negative interactions, it does provide a way to include only positive interactions for training. Thus, one solution for our use case is the following:

Include the whole interactions dataset in AWS Personalize, but train on positive interactions only. For more details, read this.

Providing a “Not Interested” button on selected polls allows us to get explicit feedback from users on the polls they are not interested in.

The way this would work with AWS Personalize is described below:

We would use the EVENT_TYPE column to store the categorical values Interested or NotInterested within the interactions dataset, signifying whether or not a poll is of interest for a given user-poll pair. By default, the value for all interactions would be Interested.

If a user clicks the “Not Interested” button, we would update the database record for that user-poll pair, setting the EVENT_TYPE column to NotInterested.

Further, we would create a new column within the interactions dataset called INTERACTION_TYPE, which would contain one of poll_view, poll_expanded, cast_vote, like_comment, or poll_creation for each user-poll pair.

NOTE

The allowed values of INTERACTION_TYPE for EVENT_TYPE == 'Interested' are:

poll_view, poll_expanded, cast_vote, like_comment, poll_creation

The only allowed value of INTERACTION_TYPE for EVENT_TYPE == 'NotInterested' is:

poll_view

The following is a snapshot of what the interactions dataset that we would input to AWS Personalize would look like:

EVENT_TYPE       EVENT_VALUE    INTERACTION_TYPE
Interested       0.5            poll_view
Interested       1.5            poll_expanded
Interested       4.0            cast_vote
Interested       6.0            like_comment
Interested       7.0            poll_creation
NotInterested    0.0            poll_view
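The transformation into this layout can be sketched as follows. This is a minimal illustration, not our production pipeline: the function name and the not_interested flag are hypothetical, and the event values are copied from the snapshot above.

```python
# Event values per interaction type, copied from the snapshot above.
EVENT_VALUES = {
    "poll_view": 0.5,
    "poll_expanded": 1.5,
    "cast_vote": 4.0,
    "like_comment": 6.0,
    "poll_creation": 7.0,
}


def to_interaction_row(interaction_type, not_interested=False):
    """Map one raw interaction onto the proposed three-column layout."""
    if interaction_type not in EVENT_VALUES:
        raise ValueError(f"unknown interaction type: {interaction_type}")
    if not_interested:
        # Per the NOTE, NotInterested rows only ever carry poll_view.
        if interaction_type != "poll_view":
            raise ValueError("NotInterested rows must use poll_view")
        return {
            "EVENT_TYPE": "NotInterested",
            "EVENT_VALUE": 0.0,
            "INTERACTION_TYPE": "poll_view",
        }
    return {
        "EVENT_TYPE": "Interested",
        "EVENT_VALUE": EVENT_VALUES[interaction_type],
        "INTERACTION_TYPE": interaction_type,
    }


print(to_interaction_row("cast_vote"))
print(to_interaction_row("poll_view", not_interested=True))
```

Rejecting any NotInterested row that is not a poll_view keeps the dataset consistent with the allowed value sets listed in the NOTE.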

During training with AWS Personalize, we can then include only the interactions whose EVENT_TYPE column value is Interested.

When providing data transformed as above as input to AWS Personalize, we would use logic along the lines of the pseudocode below to filter negative data instances out of model training:

aws_personalize(event_type_to_train_on='Interested')
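As a hedged sketch of what this pseudocode could map to in practice, the snippet below assembles the parameters for a boto3 create_solution call, assuming that the eventType parameter is the mechanism for training on a single event type. The solution name and dataset group ARN are placeholders, and the recipe shown is an assumption, since this document does not state which recipe we use.

```python
def build_create_solution_params(name, dataset_group_arn, recipe_arn,
                                 event_type="Interested"):
    """Assemble keyword arguments for a create_solution request that
    trains only on interactions of the given EVENT_TYPE."""
    return {
        "name": name,
        "datasetGroupArn": dataset_group_arn,
        "recipeArn": recipe_arn,
        # Only interactions with this EVENT_TYPE are used for training.
        "eventType": event_type,
    }


params = build_create_solution_params(
    name="hunch-polls-interested-only",  # hypothetical solution name
    dataset_group_arn="arn:aws:personalize:REGION:ACCOUNT_ID:dataset-group/EXAMPLE",  # placeholder
    recipe_arn="arn:aws:personalize:::recipe/aws-user-personalization",  # assumed recipe
)

# With AWS credentials configured, the request would be sent like this:
#   import boto3
#   boto3.client("personalize").create_solution(**params)
```

NotInterested rows stay in the dataset, so they remain available for filtering, but they would not contribute to training under this configuration.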

Frequently Asked Questions

Question 1: If a new “Not Interested” data instance comes in for a specific user-poll pair, can that poll be considered a cold item and thus show up in the recommendation feed as an exploration item?

Answer: Yes, it is possible for a “Not Interested” user-poll pair to be recommended as an exploration item for that user. We can, however, avoid this with the filter below:

EXCLUDE EVENT_TYPE == "NotInterested"

Question 2: Consider the following edge case: if a poll has been “Voted” on and also been marked as “Not Interested”, what will happen?

Answer: There are 2 scenarios for this edge case:

Scenario 1: A user clicks the “Not Interested” button and then “Votes” on the poll. In this scenario, once the user “Votes”, we would update our database, setting EVENT_TYPE = 'Interested' and INTERACTION_TYPE = 'cast_vote'. We would also disable/remove the “Not Interested” button once the user “Votes” on a poll.

Scenario 2: A user “Votes” on a poll and then clicks the “Not Interested” button. As mentioned in Scenario 1, this won’t be possible if we disable/remove the “Not Interested” button once the user “Votes” on the poll.
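The two scenarios can be sketched as a small replay function over a user's actions on a single poll. This is an illustration under stated assumptions: the action names "vote" and "not_interested" and the function name are hypothetical, and a pair is assumed to start out as an Interested poll_view.

```python
def resolve_pair_state(actions):
    """Replay a user's actions on one poll, in order, and return the
    resulting row values plus whether the button is still available."""
    event_type = "Interested"       # default for all interactions
    interaction_type = "poll_view"
    button_enabled = True
    for action in actions:
        if action == "vote":
            # Scenario 1: a vote overrides an earlier "Not Interested".
            event_type, interaction_type = "Interested", "cast_vote"
            button_enabled = False
        elif action == "not_interested":
            if not button_enabled:
                # Scenario 2: the button is gone after a vote, so skip.
                continue
            event_type, interaction_type = "NotInterested", "poll_view"
        else:
            raise ValueError(f"unknown action: {action}")
    return {
        "EVENT_TYPE": event_type,
        "INTERACTION_TYPE": interaction_type,
        "button_enabled": button_enabled,
    }


print(resolve_pair_state(["not_interested", "vote"]))
print(resolve_pair_state(["vote", "not_interested"]))
```

Both orderings end with EVENT_TYPE = 'Interested' and INTERACTION_TYPE = 'cast_vote', which matches the behaviour described in the two scenarios.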

Category Widget

Using the Category Widget with every poll, we can collect category_click_counter and polls_interacted_with_after_click for every user-poll pair. These can be used as contextual metadata within the interactions dataset that we then feed to AWS Personalize for training.
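As a rough sketch of how these two signals might be declared, the interactions schema below adds them as extra numeric fields. The upper-case field names CATEGORY_CLICK_COUNTER and POLLS_INTERACTED_WITH_AFTER_CLICK and their int types are assumptions made for illustration. The remaining fields follow the usual Amazon Personalize interactions layout.

```python
import json

# Interactions schema in the Avro-style JSON that Amazon Personalize expects.
interactions_schema = {
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "USER_ID", "type": "string"},
        {"name": "ITEM_ID", "type": "string"},
        {"name": "TIMESTAMP", "type": "long"},
        {"name": "EVENT_TYPE", "type": "string"},
        {"name": "EVENT_VALUE", "type": "float"},
        # Hypothetical contextual metadata from the Category Widget.
        {"name": "CATEGORY_CLICK_COUNTER", "type": "int"},
        {"name": "POLLS_INTERACTED_WITH_AFTER_CLICK", "type": "int"},
    ],
    "version": "1.0",
}

# The schema is submitted as a JSON string when it is registered.
schema_json = json.dumps(interactions_schema, indent=2)
print(schema_json)
```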

Using these signals as contextual metadata would lead to better personalised relevance from a recommendation point of view. For more details on this, read the doc below:

Increasing recommendation relevance with contextual metadata

However, in the near future we will already be using the following contents of each poll as poll metadata within AWS Personalize:

Poll Question

Poll Options

Poll Text Description

Poll Comments

This gives us much richer semantic and contextual information, thus making the explicit user feedback that we gain from the “Category” widget redundant.

NOTE: The above statement is a hypothesis. We would perform experiments to reach a data-driven conclusion on whether including category_click_counter and polls_interacted_with_after_click with every user-poll pair would be relevant and beneficial for our AWS Personalize model.

This is within our Phase 2 goals for the AWS Personalize Model Optimisation Roadmap. For further details, refer to the doc below:

Incorporating Unstructured Textual Data as Polls Metadata
