Under the Hood of Our Algorithmic Engine – How We Serve Content Recommendations
Our main goal is to serve good content recommendations to readers on the Internet. The typical situation is a user reading a content page, and we want to recommend “good” content for further reading.
What is a good recommendation? We believe a good recommendation is one that is interesting to the user and is both timely and relevant. The user should not only want to click the recommendation title or image we put on the recommendations widget, but also like the page that he or she sees after the click, and even want to continue exploring more content on the recommendation’s site. Every recommendation should give the user a good experience, so that he or she becomes familiar with our widget and learns that it gives good recommendations. With this goal in mind, we do not use any “click traps”: we want a long-term relationship with the user, not a single click.
Note the target here is to serve content which is interesting to the user, and not to serve content which is relevant to content the user is already reading. Relevancy becomes just one of the methods to get interesting recommendations, not the target.
In order to serve good recommendations, we run a large set of algorithms in parallel to get a set of candidate recommendations. Then we use machine learning techniques to decide which recommendations to serve to the user. That is, we try to learn what a user or a group of users likes to read and serve it more often. (The simplest group is the readers of a specific site, but groups can be more complex.)
We can divide the algorithmic methods we use into contextual algorithms, behavioral algorithms and personal algorithms:
Contextual algorithms analyze the context of the page the user is reading now and find relevant content. Relevant content can be interesting to the user. We use the Solr search engine, with some enhancements of our own, for the search. We can also classify content into categories and use category matching instead of a search.
Behavioral algorithms learn a set of statistical behaviors of groups of users. The simplest algorithms can bring the most visited documents in a site, the most rated documents, the ones with the most social sharing events and so on. More complex algorithms can apply collaborative filtering methods to get other content which people who “liked” this content also liked.
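As an illustration of the "people who liked this also liked" idea, here is a minimal co-occurrence sketch in Python. It is not our production method: the function names, the shape of the input (a mapping from user to liked documents) and the tie-breaking rule are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from itertools import permutations


def co_liked(likes_by_user):
    """For each document, count how often every other document
    was liked by the same user (hypothetical input format)."""
    counts = defaultdict(Counter)
    for docs in likes_by_user.values():
        # Every ordered pair of distinct documents liked by one user.
        for doc, other in permutations(set(docs), 2):
            counts[doc][other] += 1
    return counts


def also_liked(counts, doc, k=3):
    """Top-k documents co-liked with `doc`; ties broken by document id."""
    ranked = sorted(counts.get(doc, Counter()).items(),
                    key=lambda item: (-item[1], item[0]))
    return [other for other, _ in ranked[:k]]


likes = {
    "u1": ["a", "b", "c"],
    "u2": ["a", "b"],
    "u3": ["b", "c"],
    "u4": ["a", "d"],
}
counts = co_liked(likes)
print(also_liked(counts, "a", k=2))
```

Real collaborative filtering usually normalizes these counts (for example by document popularity) so that very popular documents do not dominate every list.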
From our experience we have seen behavioral algorithms perform differently than contextual algorithms. The best performance comes from giving a few recommendations from each type. On different sites, different algorithms give different results.
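One simple way to give a few recommendations from each type is to interleave the ranked lists round-robin. The sketch below assumes each algorithm type returns an ordered list of document ids; `blend` and its parameters are hypothetical names, not part of our system.

```python
def blend(candidates_by_type, slots):
    """Interleave ranked candidate lists round-robin, skipping duplicates,
    until `slots` recommendations have been picked."""
    ranked_lists = list(candidates_by_type.values())
    longest = max((len(ranked) for ranked in ranked_lists), default=0)
    picked, seen = [], set()
    for depth in range(longest):
        for ranked in ranked_lists:
            if len(picked) == slots:
                return picked
            if depth < len(ranked) and ranked[depth] not in seen:
                seen.add(ranked[depth])
                picked.append(ranked[depth])
    return picked


candidates = {
    "contextual": ["c1", "c2", "c3"],
    "behavioral": ["b1", "c1", "b2"],
}
print(blend(candidates, slots=4))
```

Because different sites favor different algorithms, the order of the lists (or the number of slots given to each type) could itself be tuned per site.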
Scalability is an issue in serving recommendations. We serve recommendations in about 30 milliseconds on average. To achieve this fast serving time, we do most of the processing offline and save the results in Memcached, an in-memory cache. We use key-value databases (like Cassandra) on top of traditional relational databases (MySQL) to get a good response time when fetching precomputed answers to queries (for example, the data needed to calculate recommendations for a document).
Time relevancy is a big issue: how do you decide if a document is still relevant? Some documents are always good, “evergreen” as we call them, but many age very fast. An article on a future sports event will age when the event happens. Stock market status reports become irrelevant very fast. We have some behavioral methods that try to detect when users like these recommendations less over time, so that we can stop serving them. Still, some titles will make people click over and over even when the content is not relevant any more. Identifying relevancy is an interesting challenge.
Everything we do is measurable. We use Hive/Hadoop to create statistics about various aspects of the system. As an example, we know how well each algorithm performed in any environment (e.g. data center) on any source, every hour, so we can always monitor the logical performance of our algorithms and make intelligent decisions. We use the historic data for research. We even have learning algorithms that analyze current algorithmic performance and serve the best-performing recommendations for a page or site more often on that page or site.
Development is done mostly in Java. Our development process is agile and fast. We use continuous deployment and have a staging environment to which we can deploy new algorithms and ideas very quickly. This means we can see how a new algorithm performs on real production data (a small fraction of it) a very short time after it was developed. We can do A/B testing on algorithm parameters and decide which value works best for each parameter.
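The read path for precomputed results can be pictured as a cache lookup with a fallback. The Python sketch below uses plain dictionaries as stand-ins for Memcached and the key-value store; the class name, the key format and the fallback order are assumptions for illustration, not a description of our actual code.

```python
class RecommendationStore:
    """Look up precomputed recommendations: in-memory cache first,
    then the slower key-value store (both are dict stand-ins here)."""

    def __init__(self, cache, kv_store):
        self.cache = cache
        self.kv_store = kv_store

    def get(self, doc_id):
        key = f"recs:{doc_id}"  # hypothetical key format
        recs = self.cache.get(key)
        if recs is None:
            # Cache miss: read the offline result and remember it.
            recs = self.kv_store.get(key, [])
            self.cache[key] = recs
        return recs


kv_store = {"recs:doc-1": ["doc-7", "doc-9"]}
cache = {}
store = RecommendationStore(cache, kv_store)
print(store.get("doc-1"))
```

The point of the pattern is that nothing expensive is computed at request time; the serving path only does lookups.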
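A very rough way to express "users like this recommendation less over time" is to compare a document's recent click-through rate with its earlier one. The sketch below is only an assumption about how such a check could look; the window size and the drop ratio are made-up values, and `is_aging` is a hypothetical name.

```python
def is_aging(hourly_stats, window=3, drop_ratio=0.5):
    """hourly_stats: list of (impressions, clicks) tuples, oldest first.
    Returns True when the click-through rate of the last `window` hours
    fell below `drop_ratio` times that of the first `window` hours."""
    if len(hourly_stats) < 2 * window:
        return False  # not enough history to compare

    def ctr(rows):
        impressions = sum(imps for imps, _ in rows)
        clicks = sum(clicks for _, clicks in rows)
        return clicks / impressions if impressions else 0.0

    early = ctr(hourly_stats[:window])
    recent = ctr(hourly_stats[-window:])
    return early > 0 and recent < drop_ratio * early


evergreen = [(1000, 50)] * 6
fading = [(1000, 50)] * 3 + [(1000, 10)] * 3
print(is_aging(evergreen), is_aging(fading))
```

A check like this cannot catch titles that keep attracting clicks after the content has gone stale, which is one reason clicks alone are not enough to decide relevancy.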
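Serving the best-performing algorithm more often while still trying the others can be sketched with an epsilon-greedy selector. This is a textbook technique swapped in for illustration, not necessarily the learning algorithm we run; the class, its parameters and the use of click-through rate as the score are assumptions.

```python
import random


class AlgorithmSelector:
    """Epsilon-greedy choice between algorithms, scored by click-through rate."""

    def __init__(self, algorithms, epsilon=0.1, rng=None):
        # name -> [impressions, clicks]
        self.stats = {name: [0, 0] for name in algorithms}
        self.epsilon = epsilon
        self.rng = rng or random.Random()

    def ctr(self, name):
        impressions, clicks = self.stats[name]
        return clicks / impressions if impressions else 0.0

    def choose(self):
        # Explore a random algorithm with probability epsilon,
        # otherwise exploit the one with the best observed CTR.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.stats))
        return max(self.stats, key=self.ctr)

    def record(self, name, clicked):
        self.stats[name][0] += 1
        self.stats[name][1] += int(clicked)


selector = AlgorithmSelector(["contextual", "behavioral"], epsilon=0.0)
selector.record("contextual", clicked=True)
selector.record("behavioral", clicked=False)
print(selector.choose())
```

Keeping one selector per page or per site would let each of them converge on the algorithms that work best there.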
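For A/B testing a parameter, each user has to land in the same variant on every request. A common way to do this is to hash the user id together with the experiment name. The Python sketch below is an assumption about how such bucketing could be done; the function name, experiment name and parameter values are invented for the example.

```python
import hashlib


def assign_variant(user_id, experiment, variants):
    """Deterministically map a user to one variant of an experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


# Hypothetical experiment: two candidate values for one algorithm parameter.
values = [0.1, 0.2]
first = assign_variant("user-42", "decay-rate", values)
second = assign_variant("user-42", "decay-rate", values)
print(first == second)
```

Including the experiment name in the hash keeps assignments in one experiment independent of assignments in another.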
Shlomy Boshy is Outbrain’s Algorithms Team Leader