Recommender Systems part 1: Introduction

eyereece
Jul 21, 2023

Updated: Jul 26, 2023

40% of app installs on Google Play and 60% of watch time on YouTube come from recommendations.

Terminology

Items (documents): the entities a system recommends, for example apps, movies, videos, or books.
Query (context): the information a system uses to make recommendations.

Queries can be a combination of the following:

user information, such as the user's id or the items the user has previously interacted with
additional context, such as the time of day or the user's device
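
As a rough illustration, a query can be pictured as a small record that bundles both kinds of information. The `Query` class and its field names below are hypothetical, chosen only for this sketch:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Query:
    """Hypothetical container for the context a recommender receives."""

    # User information
    user_id: str
    interacted_item_ids: List[str] = field(default_factory=list)

    # Additional context (optional, since it may not always be known)
    hour_of_day: Optional[int] = None
    device: Optional[str] = None


query = Query(
    user_id="u42",
    interacted_item_ids=["video_7", "video_19"],
    hour_of_day=21,
    device="mobile",
)
print(query)
```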

Embedding: a mapping from a discrete set (in this case, the set of queries or the set of items to recommend) to a vector space called the embedding space.

Many recommendation systems rely on learning an appropriate embedding representation of the queries and items.
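
A minimal sketch of the idea: each discrete item id maps to a vector, and a query vector in the same space can be compared against those vectors. The vectors here are made up by hand rather than learned, and using the dot product as the similarity measure is an assumption of this example.

```python
# Toy embedding table: each item id maps to a point in a 3-dimensional
# embedding space. The values are hand-picked for illustration; a real
# system would learn them from data.
item_embeddings = {
    "cooking_video": [0.9, 0.1, 0.0],
    "baking_video": [0.8, 0.2, 0.1],
    "car_review": [0.0, 0.1, 0.9],
}


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# A query embedded in the same space (also made up for this sketch).
query_embedding = [1.0, 0.0, 0.0]

# Items whose vectors point in a similar direction get higher scores.
scores = {
    item: dot(query_embedding, vec) for item, vec in item_embeddings.items()
}
best = max(scores, key=scores.get)
print(best)
```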

Overview

A common architecture for recommender systems consists of the following components:

candidate generation: collaborative and/or content-based filtering
scoring
re-ranking

Candidate generation

During this stage, the system starts with a potentially huge corpus and generates a much smaller subset of candidates. For example, the candidate generator of YouTube reduces billions of videos down to hundreds or thousands.

The model needs to evaluate queries quickly given the enormous size of the corpus. A given model may provide multiple candidate generators, each nominating a different subset of candidates.
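
The sketch below shows one way two candidate generators could each nominate a subset and have their results merged. Both generators (`top_k_by_similarity` and `popular_items`), the data, and the choice of a plain union are assumptions for illustration. The exhaustive scan used here would likely be too slow for a very large corpus, where an approximate nearest-neighbor index is a common substitute.

```python
import heapq


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_k_by_similarity(query_vec, item_vecs, k):
    """Hypothetical generator 1: the k items most similar to the query."""
    return heapq.nlargest(
        k, item_vecs, key=lambda item: dot(query_vec, item_vecs[item])
    )


def popular_items(popularity, k):
    """Hypothetical generator 2: the k most popular items overall."""
    return heapq.nlargest(k, popularity, key=popularity.get)


# Made-up corpus of five items with 2-dimensional embeddings.
item_vecs = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [0.1, 0.9],
    "e": [0.5, 0.5],
}
popularity = {"a": 10, "b": 3, "c": 50, "d": 1, "e": 7}
query_vec = [1.0, 0.0]

# Each generator nominates its own subset; the union is passed on to scoring.
candidates = set(top_k_by_similarity(query_vec, item_vecs, k=2)) | set(
    popular_items(popularity, k=2)
)
print(sorted(candidates))
```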

Scoring

Next, another model scores and ranks the candidates in order to select the set of items (on the order of 10) to display to the user.

Since this model evaluates a relatively small subset of items, the system can use a more precise model relying on additional queries.
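
A minimal sketch of this stage, assuming the candidates arrive as small records with a few features. The `score` function is a hand-written weighted sum standing in for a learned model, and the feature names and weights are invented for the example.

```python
def score(candidate, query):
    """Hypothetical scorer: a weighted sum of hand-picked features.

    A real system would typically use a learned model here; the weights
    below are arbitrary.
    """
    topic_match = 1.0 if candidate["topic"] in query["liked_topics"] else 0.0
    device_match = 1.0 if query["device"] in candidate["devices"] else 0.0
    rating = candidate["avg_rating"] / 5.0  # normalize to the range [0, 1]
    return 0.6 * topic_match + 0.3 * rating + 0.1 * device_match


def rank(candidates, query, n):
    """Score every candidate and keep the n highest-scoring ones."""
    ordered = sorted(candidates, key=lambda c: score(c, query), reverse=True)
    return ordered[:n]


candidates = [
    {"id": "a", "topic": "cooking", "avg_rating": 4.0, "devices": {"mobile", "tv"}},
    {"id": "b", "topic": "cars", "avg_rating": 5.0, "devices": {"tv"}},
    {"id": "c", "topic": "cooking", "avg_rating": 2.5, "devices": {"mobile"}},
]
query = {"liked_topics": {"cooking"}, "device": "mobile"}

top = rank(candidates, query, n=2)
print([c["id"] for c in top])
```

Because only a handful of candidates reach this stage, the scorer can afford to look at more features per item than the candidate generator does.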

Re-ranking

Finally, the system takes into account additional constraints for the final ranking. For example, the system may remove items that the user has explicitly disliked or boost the score of fresher content.

Re-ranking can also help ensure diversity, freshness, and fairness.
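
The two adjustments mentioned above, removing disliked items and boosting fresher content, can be sketched as follows. The `rerank` function, its parameters, and the boost values are assumptions for illustration; diversity and fairness constraints are not covered by this sketch.

```python
def rerank(scored_items, disliked_ids, freshness_boost=0.2, fresh_within_days=7):
    """Hypothetical re-ranker: filter disliked items, then boost fresh ones."""
    adjusted = []
    for item in scored_items:
        # Drop anything the user has explicitly disliked.
        if item["id"] in disliked_ids:
            continue
        new_score = item["score"]
        # Give recently published items a fixed bonus.
        if item["age_days"] <= fresh_within_days:
            new_score += freshness_boost
        adjusted.append({**item, "score": new_score})
    return sorted(adjusted, key=lambda it: it["score"], reverse=True)


scored = [
    {"id": "a", "score": 0.9, "age_days": 300},
    {"id": "b", "score": 0.8, "age_days": 2},
    {"id": "c", "score": 0.5, "age_days": 1},
    {"id": "d", "score": 0.95, "age_days": 30},
]

final = rerank(scored, disliked_ids={"d"})
print([item["id"] for item in final])
```

In this made-up data, item "d" has the highest original score but is removed because the user disliked it, while the fresh item "b" overtakes the older item "a".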

References

Google recommenders course