Recommender Systems part 1: Introduction


40% of app installs on Google Play and 60% of watch time on YouTube come from recommendations.

Terminology

  • Items (documents): the entities a system recommends, for example, apps, movies, videos, books

  • Query (context): the information a system uses to make recommendations


Queries can be a combination of the following:

  • user information, such as the user ID or the items a user has previously interacted with

  • additional context, such as the time of day or the user's device

Embedding: a mapping from a discrete set (in this case, the set of queries or the set of items to recommend) to a vector space called the embedding space.

Many recommendation systems rely on learning an appropriate embedding representation of their queries and items.
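
To make the idea of an embedding concrete, here is a minimal sketch in plain Python. The item IDs, the 3-dimensional vectors, and the choice of dot product as the similarity measure are all assumptions made for illustration; a real system would learn the vectors from data.

```python
# Hypothetical embedding table: each discrete item ID maps to a vector.
# The values are made up; in practice they are learned.
item_embeddings = {
    "app_a": [0.9, 0.1, 0.0],
    "app_b": [0.8, 0.2, 0.1],
    "movie_c": [0.0, 0.1, 0.9],
}


def dot(u, v):
    """Dot product, used here as the similarity measure in the embedding space."""
    return sum(a * b for a, b in zip(u, v))


def most_similar(item_id, embeddings):
    """Return the other item whose embedding has the highest dot product with item_id."""
    query_vec = embeddings[item_id]
    others = (other for other in embeddings if other != item_id)
    return max(others, key=lambda other: dot(query_vec, embeddings[other]))


closest = most_similar("app_a", item_embeddings)
```

Items that sit close together in the embedding space (here `app_a` and `app_b`) are treated as similar, which is what lets the later stages compare a query against items numerically.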


Overview

A common architecture that many recommender systems employ consists of the following components:

  • candidate generation: collaborative and/or content-based filtering

  • scoring

  • re-ranking
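
The three stages can be read as a simple function pipeline. The sketch below is only an illustration of that flow: the `recommend` function, the toy stage functions, and the field names (`genre`, `popularity`, `disliked`) are hypothetical and not taken from any particular system.

```python
def recommend(query, corpus, generate, score, rerank, k=10):
    """Run the three stages in order and return at most k items."""
    candidates = generate(query, corpus)            # stage 1: shrink the corpus
    ranked = sorted(candidates,                     # stage 2: score and rank
                    key=lambda item: score(query, item),
                    reverse=True)
    return rerank(query, ranked)[:k]                # stage 3: apply extra constraints


# Toy stage implementations (assumptions, for illustration only).
def by_genre(query, corpus):
    return [item for item in corpus if item["genre"] == query["genre"]]


def by_popularity(query, item):
    return item["popularity"]


def drop_disliked(query, ranked):
    return [item for item in ranked if item["id"] not in query["disliked"]]


corpus = [
    {"id": "v1", "genre": "comedy", "popularity": 0.5},
    {"id": "v2", "genre": "comedy", "popularity": 0.9},
    {"id": "v3", "genre": "drama", "popularity": 0.8},
    {"id": "v4", "genre": "comedy", "popularity": 0.7},
]
query = {"genre": "comedy", "disliked": {"v2"}}

result = recommend(query, corpus, by_genre, by_popularity, drop_disliked)
result_ids = [item["id"] for item in result]
```

Passing the stages in as functions keeps each one replaceable, which mirrors how the stages are described separately in the sections that follow.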


Candidate generation

During this stage, the system starts with a potentially huge corpus and generates a much smaller subset of candidates. For example, the candidate generator of YouTube reduces billions of videos down to hundreds or thousands.


The model needs to evaluate queries quickly given the enormous size of the corpus. A given model may provide multiple candidate generators, each nominating a different subset of candidates.
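
As a rough sketch of several candidate generators feeding one candidate set, the example below combines an embedding-based generator with a popularity-based one. Both generators, the 2-dimensional vectors, and the popularity counts are assumptions for illustration. The scan over all items is brute force; at real corpus sizes, systems typically rely on approximate nearest-neighbor search instead.

```python
import heapq


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def nearest_items(query_vec, item_vecs, n):
    """Generator 1: the n items whose embeddings score highest against the query."""
    return heapq.nlargest(n, item_vecs,
                          key=lambda item_id: dot(query_vec, item_vecs[item_id]))


def most_popular(popularity, n):
    """Generator 2: the n most popular items, independent of the query."""
    return heapq.nlargest(n, popularity, key=popularity.get)


def merge_candidates(*candidate_lists):
    """Union of all nominations, keeping first-seen order and dropping duplicates."""
    seen, merged = set(), []
    for candidates in candidate_lists:
        for item_id in candidates:
            if item_id not in seen:
                seen.add(item_id)
                merged.append(item_id)
    return merged


# Made-up data for illustration.
item_vecs = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.1, 0.9]}
popularity = {"a": 10, "b": 3, "c": 50, "d": 7}
query_vec = [1.0, 0.0]

candidates = merge_candidates(
    nearest_items(query_vec, item_vecs, n=2),
    most_popular(popularity, n=2),
)
```

Each generator nominates its own subset, and the merged list is what the scoring stage receives.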


Scoring

Next, another model scores and ranks the candidates in order to select the set of items (on the order of 10) to display to the user.

Since this model evaluates a relatively small subset of items, the system can use a more precise model relying on additional queries.
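
A scoring step could look something like the sketch below. It uses a hand-weighted linear score over two hypothetical features (`similarity` and `watch_rate`); the feature names, weights, and values are assumptions. In practice the scoring model is usually learned, and this only illustrates the "score, sort, keep the top few" shape.

```python
def score(candidate, weights):
    """Weighted sum of the candidate's features (a stand-in for a learned model)."""
    return sum(weights[name] * candidate["features"][name] for name in weights)


def top_k(candidates, weights, k=10):
    """Rank candidates by score, highest first, and keep the first k."""
    return sorted(candidates, key=lambda c: score(c, weights), reverse=True)[:k]


# Made-up candidates and weights for illustration.
weights = {"similarity": 0.7, "watch_rate": 0.3}
candidates = [
    {"id": "a", "features": {"similarity": 0.9, "watch_rate": 0.2}},
    {"id": "b", "features": {"similarity": 0.5, "watch_rate": 0.9}},
    {"id": "c", "features": {"similarity": 0.8, "watch_rate": 0.8}},
]

best = top_k(candidates, weights, k=2)
best_ids = [c["id"] for c in best]
```

Because only a small candidate set reaches this stage, the scoring function can afford to look at more features per item than the candidate generators do.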


Re-ranking

Finally, the system takes into account additional constraints for the final ranking. For example, the system may remove items that the user has explicitly disliked or boost the score of fresher content.

Re-ranking can also help ensure diversity, freshness, and fairness.
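
The two re-ranking examples mentioned here, removing disliked items and boosting fresher content, can be sketched as follows. The `rerank` function, the boost size, the seven-day freshness window, and the item fields are all hypothetical choices made for illustration.

```python
def rerank(scored_items, disliked_ids, freshness_boost=0.1, fresh_within_days=7):
    """Drop disliked items, boost recent ones, then re-sort by adjusted score."""
    adjusted = []
    for item in scored_items:
        if item["id"] in disliked_ids:
            continue  # constraint: never show explicitly disliked items
        new_score = item["score"]
        if item["age_days"] <= fresh_within_days:
            new_score += freshness_boost  # favor fresher content
        adjusted.append({**item, "score": new_score})
    return sorted(adjusted, key=lambda item: item["score"], reverse=True)


# Made-up scored items for illustration.
scored_items = [
    {"id": "a", "score": 0.80, "age_days": 30},
    {"id": "b", "score": 0.75, "age_days": 2},
    {"id": "c", "score": 0.90, "age_days": 1},
]

final = rerank(scored_items, disliked_ids={"c"})
final_ids = [item["id"] for item in final]
```

In this toy data the fresh item `b` overtakes the older item `a` after the boost, and `c` is removed despite having the highest original score. Diversity and fairness constraints would be further adjustments applied at this same stage.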

