Recommender Systems — The Models Used Everywhere

Nicolas Pogeant
8 min read · Aug 29, 2022


This article covers widely used recommender systems: the different types, the methods behind them, and their evolution.

Photo by Sajad Nori on Unsplash

It is almost impossible to browse the Internet without encountering a recommender system. Whether it’s on a video platform like YouTube or a marketplace like Amazon, these models are implemented everywhere to boost consumption and keep feeding customers new products or content.

Of course, this is debatable, because in order to be recommended something, you have to be known by the system. Services that implement recommendations therefore collect a massive amount of data to build a model that is accurate and relevant to users.

Let’s look at the different ways a recommender system can be established…

How Recommender Systems Work

The idea behind recommender systems is to have a service that sends a response of any type to a user based on some data provided to the underlying model.

As I said before, the response can be a banner on the website, an email, a push notification and more. The goal is to give the user a reason to keep using the application or website by sparking interest in those recommendations.

The simplest example is the list of videos surrounding the one you are watching on YouTube. However, these videos are not necessarily of the same type as the one you are watching, because recommendation systems can use any data to make their suggestions.

Indeed, there are two methods behind recommender systems:

  • Collaborative Filtering
  • Content Based Filtering

What is Collaborative Filtering?

This method uses user and item data to learn and make predictions. It relies on what is called a user-item matrix to calculate similarity between users or items. This is what a user-item matrix looks like:

The idea is to use this matrix to know how a user would react to a new element, for example a movie, a product or something else.

There are two ways of building a collaborative filtering recommender system: User-Based Filtering and Item-Based Filtering.

  • User-Based Filtering: find users similar to the target user and use their data to recommend items.
  • Item-Based Filtering: find items similar to those the user has interacted with, using the behavior of other users to measure that similarity.

Various techniques can be implemented to build the system from collaborative filtering. For example, using K-Nearest Neighbors and similarity/distance measures (Cosine, Jaccard, Euclidean, Manhattan…) to find similar users or items, and then make predictions about an item the user hasn’t seen.
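As a concrete illustration, here is a minimal user-based sketch in NumPy, with made-up ratings: for a target user and an unseen item, it finds the k most similar users by cosine similarity and takes a similarity-weighted average of their ratings. (The matrix, k, and the weighting scheme are illustrative choices, not a production recipe.)

```python
import numpy as np

# Toy user-item matrix: rows = users, columns = items,
# values are ratings (0 = not rated). Data is made up for illustration.
R = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [1.0, 0.0, 5.0, 4.0],
    [0.0, 1.0, 4.0, 5.0],
])

def cosine_sim(a, b):
    """Cosine similarity between two rating vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def predict_user_based(R, user, item, k=2):
    """Predict a rating as the similarity-weighted average of the
    k most similar users who actually rated the item."""
    sims = np.array([cosine_sim(R[user], R[u]) if u != user else -1.0
                     for u in range(R.shape[0])])
    order = np.argsort(sims, kind="stable")[::-1]      # most similar first
    raters = [u for u in order if R[u, item] > 0][:k]  # keep only raters
    weights = sims[raters]
    return weights @ R[raters, item] / weights.sum()

print(round(predict_user_based(R, user=0, item=2), 2))  # 1.55
```

Here user 0 is most similar to user 1, who disliked item 2, so the predicted rating comes out low even though other users rated it highly.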

However, these approaches, known as Memory-Based Filtering, require a lot of computational resources, since similarities are computed over the entire dataset.

Thus, another way to get predictions within collaborative filtering is Matrix Factorization.

Matrix Factorization is a dimensionality reduction technique that decomposes the user-item matrix into embeddings, such that multiplying the resulting embedding matrices together gives an estimate of the original user-item matrix.

As you can see, it factors the matrix into embeddings (one for the rows and one for the columns) whose latent features produce the estimated outputs (all the elements of the new estimated matrix on the right). The embedding values are initialized randomly and learned through training, much like the weights of a linear regression.

This is what the estimated matrix looks like, with factors and constant (bias) terms:

To find the values inside each embedding, I and P, the process is to minimize a loss function by adjusting those values until convergence. A classic optimization method is Alternating Least Squares (ALS), which fixes one of the two embeddings and optimizes the other, alternating between them.

In the example above, a 0 in the user-item matrix means that the user did not buy the product. The estimated matrix computed with Matrix Factorization therefore tells us whether a user might buy the product, based on other users’ data. Because the model is trained as a regression, the predictions in the estimated matrix are continuous: the higher they are, the more likely the user is to buy the item.
For this kind of model, the training has to leave out missing values, which in this example are the 0s. In another setting, movie ratings, missing values would be null and the estimated matrix would predict each user’s rating for each movie.

Making a prediction with Matrix Factorization is then just a matter of inputting the user and item of our choice.
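The steps above can be sketched in NumPy with a toy matrix and regularized Alternating Least Squares. The matrix values, latent dimension, regularization strength, and iteration count are all arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy user-item matrix; 0 means "missing" (not bought/rated).
R = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 4.0],
    [1.0, 0.0, 4.0, 5.0],
])
mask = R > 0            # train only on observed entries
k, lam = 2, 0.1         # latent dimension and L2 regularization

P = rng.normal(scale=0.1, size=(R.shape[0], k))  # user embeddings
I = rng.normal(scale=0.1, size=(R.shape[1], k))  # item embeddings

for _ in range(20):
    # Fix I and solve a small ridge regression for each user row of P...
    for u in range(R.shape[0]):
        Iu = I[mask[u]]
        P[u] = np.linalg.solve(Iu.T @ Iu + lam * np.eye(k),
                               Iu.T @ R[u, mask[u]])
    # ...then fix P and solve for each item row of I (the "alternating" step).
    for i in range(R.shape[1]):
        Pi = P[mask[:, i]]
        I[i] = np.linalg.solve(Pi.T @ Pi + lam * np.eye(k),
                               Pi.T @ R[mask[:, i], i])

R_hat = P @ I.T  # estimated matrix: a score for every user-item pair
print(round(float(R_hat[0, 2]), 2))  # predicted score for user 0, unseen item 2
```

The prediction for any (user, item) pair is simply the dot product of that user's embedding with that item's embedding, which is exactly what `R_hat` contains.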

Problems encountered by Collaborative Filtering:

  • Cold Start: since these systems need to be trained on the entire dataset, new data (a user or an item) cannot be used as input, because the model doesn’t know it and cannot compare it to other users or items.
  • Echo Chambers: this happens when the system is poorly configured and creates a loop of very similar recommendations. The issue is that the user is never led to new items that might please them.
  • Shilling Attacks: similar to data poisoning, the system is corrupted by data from users who want either to boost an item or to degrade its popularity. For example, leaving many bad reviews of a restaurant on Google so that it is not recommended to anyone.

What is Content-Based Filtering?

Unlike the previous method, this one doesn’t need a database of many users and items. Content-based filtering focuses on similarity between items based on the features that characterize them. It needs no information about other users, only data about the one to whom the recommendation is made.

The idea is that the system will match the features of movies previously watched or products previously purchased by the user against the features of all other items, in order to find the most similar ones.

Once the features are identified and properly encoded, finding the best item is done with a similarity measure that quantifies how closely one element resembles another.

U1 describes the features (F1… Fn) of the user’s profile (for a movie, for example, the features can be genres, so the vector tells the system what the user likes based on previously watched movies).
I1 references the features of the first item (in this case, it does not belong to genre F1, but it does belong to F2).
Thanks to this matrix, it is possible to measure which item best matches what the user is looking for.

I listed some of the main measures earlier, but the one most often used is cosine similarity:
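A small sketch of how cosine similarity ranks items against a user profile, with hypothetical one-hot genre features (the movie names and genre columns are invented for the example):

```python
import numpy as np

# Hypothetical one-hot item features, e.g. genres: [action, comedy, drama, romance].
items = {
    "Movie A": np.array([1, 1, 0, 0]),   # action, comedy
    "Movie B": np.array([1, 0, 1, 0]),   # action, drama
    "Movie C": np.array([0, 0, 1, 1]),   # drama, romance
}
# User profile built from previously liked movies (here: likes action/comedy).
user = np.array([1, 1, 0, 0])

def cosine(a, b):
    """Cosine similarity: dot product of the vectors over the product of their norms."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Rank items by similarity to the user profile, most similar first.
ranking = sorted(items, key=lambda m: cosine(user, items[m]), reverse=True)
print(ranking)  # ['Movie A', 'Movie B', 'Movie C']
```

Movie A matches both of the user's preferred genres, Movie B only one, Movie C none, which is exactly the order the cosine scores produce.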

From the great post by MachineLearningPlus, here.

Content-based filtering also has its limits:

  • Recommendations based on this method can be too predictable. Suggesting a movie similar to another on the basis of shared characteristics can be done without a recommender system, whereas suggesting something that a user similar to you liked can be more original and produce surprises.
  • Every new element, with its characteristics, must be continuously added to the database. Thus, after a while, the computational complexity can become very high.

Deep Learning Extension : Present and Future of Recommender Systems

To address the challenges faced by the previous methods, Deep Learning, a subfield of Machine Learning, came up with an answer.

Indeed, instead of retraining each time new data enters the training dataset, as with the methods above, what is called the Deep Learning Extension counteracts this. I will not describe it in full, but the idea is that the elements of the user-item matrix become inputs to a neural network composed of embedding layers for users and items, followed by a fully connected layer that brings everything together in a final output layer with a linear activation function.

The idea is to be able to train the model in a streaming way, like any other similar neural network, and thus avoid the retraining cost that matrix factorization incurs for each new entry.
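To make the architecture concrete, here is a sketch of the forward pass only, in NumPy, with untrained random weights and illustrative layer sizes (a real model would learn the embeddings and weights by backpropagation on observed interactions):

```python
import numpy as np

rng = np.random.default_rng(42)

n_users, n_items, k = 100, 50, 8   # illustrative sizes

# Embedding tables, one row per user/item (trainable in a real model).
user_emb = rng.normal(scale=0.1, size=(n_users, k))
item_emb = rng.normal(scale=0.1, size=(n_items, k))

# Fully connected layer weights (also trainable in a real model).
W1 = rng.normal(scale=0.1, size=(2 * k, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.1, size=(16, 1));     b2 = np.zeros(1)

def predict(user_id, item_id):
    """Forward pass: look up the two embeddings, concatenate them,
    apply a ReLU hidden layer, then a linear output (the predicted score)."""
    x = np.concatenate([user_emb[user_id], item_emb[item_id]])
    h = np.maximum(0.0, x @ W1 + b1)   # fully connected + ReLU
    return float(h @ W2 + b2)          # linear activation on the output

score = predict(user_id=3, item_id=7)
print(score)
```

Because the model only needs the new user or item ID and a gradient step, it can keep learning as interactions stream in, rather than refactorizing the whole matrix.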

Besides that, Deep Learning hybrid models arrived, combining a Collaborative Filtering network with a Content-Based Filtering one in order to obtain the best of both methods.

If you are interested in the models behind the big recommender systems around you, you can check two of the most famous architectures in the field:

  • Wide & Deep: Google’s revolutionary architecture, introduced in 2016, here.
  • DLRM: Facebook’s innovative recommender system, released in 2019, here.

Conclusion

Recommender systems are everywhere and useful for any service looking to keep its consumers satisfied. Two main machine learning methods exist: Collaborative Filtering and Content-Based Filtering. The first uses information from users to decide how to suggest items to similar users; the second focuses on the items themselves and their attributes.

Because of the issues behind each method, the Deep Learning Extension applies them in a way that is more complex but better suited for production. Mixing the two methods in a single system has become the standard today.

Thank you for reading this article. I hope you learned the basics of recommender systems and got an idea of the state of the art today!
