"Not Interested" Button & "Category" Widget: Data Science Perspective
Introduction
This document gives the data science perspective on adding the following to the Hunch App:
“Not Interested” Button: On selected polls we start showing a “Not Interested” button. On click:
We register the category of that particular poll and exclude questions of that category from the feed
Why? People know more about what they don’t want to see than about what they want to see.
“Category” Widget: With every poll, we start showing the category as well. On click:
We curate a list of questions from that category and present it to the user
We also register that category as an interesting category for the user
Why? This helps us validate the category and helps the user engage with more polls without having to scroll mindlessly through the feed to find the next interesting poll.
At present, negative feedback, where the user implicitly or explicitly indicates that they dislike a poll, is not supported as training input for Amazon Personalize. Furthermore, there is no way to add weight or reward to specific interactions by event type or event value. For more details,
Following is a snapshot of the data that we send to AWS Personalize for training:
EVENT_TYPE EVENT_VALUE
poll_expanded 1.5
poll_view 0.5
cast_vote 4.0
Further, AWS Personalize does not provide a way to exclude from training the interactions that are considered negative feedback.
For example, in our use case we consider EVENT_TYPE == 'poll_view' to be negative feedback for our recommender system, and AWS Personalize does not provide a way to filter poll_view interactions out of its training.
Further, while we could remove poll_view interactions from the data before sending it for training, doing so would complicate the event-tracker-based data stream that AWS Personalize consumes in real time. We also filter user-poll pairs within AWS Personalize based on:
EXCLUDE EVENT_TYPE == 'poll_view'
Thus, interactions with EVENT_TYPE == 'poll_view' are required data instances for AWS Personalize.
Solution
While AWS Personalize does not provide a way to exclude negative interactions, it does provide a way to include only positive interactions for training. Thus, one solution for our use-case is the following:
Include the whole interactions dataset into AWS Personalize, but train on positive interactions only.
For more details,
Providing a “Not Interested” button on selected polls allows us to get explicit feedback from users on the polls they are not interested in.
The way this would work with AWS Personalize is described below:
We would use the EVENT_TYPE column to store the categorical values Interested or NotInterested within the interactions dataset, signifying for each user-poll pair whether the poll is of interest to the user. By default, the value for all interactions would be Interested.
If a user clicks the “Not Interested” button, we would update the database record for this particular user-poll pair, setting the EVENT_TYPE column to NotInterested.
Further, we would create a new column within the interactions dataset called INTERACTION_TYPE, which would contain one of the values poll_view, poll_expanded, cast_vote, like_comment, poll_creation for each user-poll pair.
NOTE
The set of values for INTERACTION_TYPE for EVENT_TYPE == 'Interested' is:
poll_view
poll_expanded
cast_vote
like_comment
poll_creation
The set of values for INTERACTION_TYPE for EVENT_TYPE == 'NotInterested' is:
poll_view
Following is a snapshot of what the interactions dataset that we would input to AWS Personalize would look like:
EVENT_TYPE EVENT_VALUE INTERACTION_TYPE
Interested 0.5 poll_view
Interested 1.5 poll_expanded
Interested 4.0 cast_vote
Interested 6.0 like_comment
Interested 7.0 poll_creation
NotInterested 0.0 poll_view
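The mapping from a raw interaction to a row of the snapshot above could be sketched as follows. This is a minimal illustration, not the production pipeline: the InteractionRow class and the helper names interested_row and not_interested_row are hypothetical, and the event values are copied from the snapshot.

```python
from dataclasses import dataclass

# Event values copied from the snapshot above.
EVENT_VALUES = {
    "poll_view": 0.5,
    "poll_expanded": 1.5,
    "cast_vote": 4.0,
    "like_comment": 6.0,
    "poll_creation": 7.0,
}


@dataclass(frozen=True)
class InteractionRow:
    """One user-poll row of the interactions dataset (hypothetical shape)."""
    user_id: str
    poll_id: str
    event_type: str        # "Interested" or "NotInterested"
    event_value: float
    interaction_type: str  # one of the keys of EVENT_VALUES


def interested_row(user_id: str, poll_id: str, interaction_type: str) -> InteractionRow:
    """Default case: every tracked interaction is stored as Interested."""
    if interaction_type not in EVENT_VALUES:
        raise ValueError(f"unknown interaction type: {interaction_type}")
    return InteractionRow(user_id, poll_id, "Interested",
                          EVENT_VALUES[interaction_type], interaction_type)


def not_interested_row(user_id: str, poll_id: str) -> InteractionRow:
    """The user clicked "Not Interested": value 0.0, interaction type poll_view."""
    return InteractionRow(user_id, poll_id, "NotInterested", 0.0, "poll_view")
```

Keeping the NotInterested case in its own helper makes it hard to accidentally pair NotInterested with any interaction type other than poll_view.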
During training with AWS Personalize, we can then include only the interactions whose EVENT_TYPE column value is Interested.
When providing the data transformed as above as input to AWS Personalize, we would use the below logic to filter out negative data instances from model training:
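One way to restrict training to positive interactions is the eventType parameter of the Personalize CreateSolution API, which selects a single EVENT_TYPE value for training. The sketch below only builds the request arguments; the helper name solution_request and the ARNs in the usage comment are placeholders, and this is not necessarily the exact configuration we would ship.

```python
def solution_request(name: str, dataset_group_arn: str, recipe_arn: str) -> dict:
    """Build keyword arguments for the Personalize CreateSolution call."""
    return {
        "name": name,
        "datasetGroupArn": dataset_group_arn,
        "recipeArn": recipe_arn,
        # Train only on interactions whose EVENT_TYPE is "Interested";
        # NotInterested rows stay in the dataset but are not used for training.
        "eventType": "Interested",
    }


# Usage (not executed here; needs AWS credentials and real ARNs):
#   import boto3
#   personalize = boto3.client("personalize")
#   personalize.create_solution(**solution_request(
#       "hunch-polls", "<dataset-group-arn>", "<recipe-arn>"))
```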
Question 1: If a new “Not Interested” data instance for a specific user-poll pair comes in, can that be considered a Cold Item, and thus show up in the recommendation feed as an exploration item?
Answer: Yes, it is possible for a “Not Interested” user-poll pair to be recommended as an exploration item for that user. We can, however, avoid this with the below filter:
EXCLUDE EVENT_TYPE == "NotInterested"
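The filter above is written in shorthand. In the Personalize filter expression syntax it would look roughly like the string below, registered through the CreateFilter API. The helper name filter_request and the placeholder ARNs are assumptions made for illustration.

```python
# Exclude every poll the user has a NotInterested interaction with.
NOT_INTERESTED_FILTER = (
    'EXCLUDE ItemID WHERE Interactions.EVENT_TYPE IN ("NotInterested")'
)


def filter_request(name: str, dataset_group_arn: str) -> dict:
    """Build keyword arguments for the Personalize CreateFilter call."""
    return {
        "name": name,
        "datasetGroupArn": dataset_group_arn,
        "filterExpression": NOT_INTERESTED_FILTER,
    }


# Usage (not executed here; needs AWS credentials and real ARNs):
#   import boto3
#   arn = boto3.client("personalize").create_filter(
#       **filter_request("exclude-not-interested", "<dataset-group-arn>"))["filterArn"]
#   boto3.client("personalize-runtime").get_recommendations(
#       campaignArn="<campaign-arn>", userId="<user-id>", filterArn=arn)
```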
Question 2: Consider the following edge case. If a poll has been “Voted” on and also been submitted as “Not Interested”, what will happen?
Answer: There are 2 scenarios for this edge case:
Scenario 1: A user clicks the “Not Interested” button and then “Votes” on the poll. In this scenario, once the user “Votes”, we would update our database, setting EVENT_TYPE = 'Interested' and INTERACTION_TYPE = 'cast_vote'.
Further, we would also disable/remove the “Not Interested” button, once the user “Votes” on a poll.
Scenario 2: A user “Votes” on a poll and then clicks the “Not Interested” button. As mentioned in scenario 1, this scenario won’t be possible if we disable/remove the “Not Interested” button, once the user “Votes” on the poll.
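The two scenarios can be sketched as a small state transition for a single user-poll pair. The PollState class and the function names are hypothetical, and the sketch assumes a fresh pair starts as an Interested poll_view with the button visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PollState:
    """State stored for one user-poll pair (hypothetical shape)."""
    event_type: str = "Interested"
    interaction_type: str = "poll_view"
    not_interested_enabled: bool = True  # is the button still shown?


def click_not_interested(state: PollState) -> PollState:
    """Mark the pair NotInterested, unless the button was already removed."""
    if not state.not_interested_enabled:
        return state  # Scenario 2: button is gone after voting, nothing changes
    return PollState("NotInterested", "poll_view", True)


def cast_vote(state: PollState) -> PollState:
    """A vote always wins: the pair becomes Interested and the button is removed."""
    return PollState("Interested", "cast_vote", False)


# Scenario 1: "Not Interested" first, then a vote.
scenario_1 = cast_vote(click_not_interested(PollState()))
# Scenario 2: a vote first; the later click has no effect.
scenario_2 = click_not_interested(cast_vote(PollState()))
```

Both orderings end in the same state, so a vote is never overwritten by a “Not Interested” click.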
Category Widget
Using the Category Widget with every poll, we can collect category_click_counter and polls_interacted_with_after_click for every user-poll pair. This can be used as contextual metadata within the interactions dataset that we then feed to AWS Personalize for training.
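As a rough sketch, the two counters could be added as extra fields in the Avro-style schema of the interactions dataset. The upper-case field names and the int types below are assumptions, and whether Personalize accepts these fields as contextual metadata should be checked against its schema documentation before use.

```python
import json


def interactions_schema() -> dict:
    """Interactions schema with two hypothetical contextual metadata fields."""
    fields = [
        {"name": "USER_ID", "type": "string"},
        {"name": "ITEM_ID", "type": "string"},
        {"name": "TIMESTAMP", "type": "long"},
        {"name": "EVENT_TYPE", "type": "string"},
        {"name": "EVENT_VALUE", "type": "float"},
        # Assumed names and types for the Category Widget counters.
        {"name": "CATEGORY_CLICK_COUNTER", "type": "int"},
        {"name": "POLLS_INTERACTED_WITH_AFTER_CLICK", "type": "int"},
    ]
    return {
        "type": "record",
        "name": "Interactions",
        "namespace": "com.amazonaws.personalize.schema",
        "fields": fields,
        "version": "1.0",
    }


# JSON string in the form expected by a schema definition.
schema_json = json.dumps(interactions_schema(), indent=2)
```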
This would lead to better personalised relevance from a recommendation point of view. For more details on this, read the below doc:
However, in the near future we would already be using the following poll contents as poll metadata within AWS Personalize:
Poll Question
Poll Options
Poll Text Description
Poll Comments
This gives us much richer semantic and contextual information, which would make the explicit user feedback that we gain from the “Category” widget largely irrelevant.
NOTE: The above statement is a hypothesis. We would perform experiments to reach a data-driven conclusion on whether including category_click_counter and polls_interacted_with_after_click with every user-poll pair would be relevant and beneficial for our AWS Personalize model.
This is within our Phase 2 goals for the AWS Personalize Model Optimisation Roadmap. For further details, refer to the below doc: