Overview of Recommendation System Products and Algorithms
Author: Gogongyouliu
Reprinted from Arrogant Data and Artificial Intelligence (ID: gh_b8b5b02c348b)
In Part 5, "Recommendation System Paradigm", of the article "Engineering Implementation of Recommendation System", the author described five paradigms commonly used in industrial recommendation systems: the non-personalized paradigm, the fully personalized paradigm, the group personalized paradigm, the object-related paradigm, and the Cartesian product paradigm. This article surveys the commonly used recommendation algorithms according to these five paradigms. It does not explain the implementation principles of each algorithm in depth; it only outlines the implementation ideas. Later articles in this series will analyze the key algorithms in detail.
This article covers four topics: recommendation algorithms and the products they serve, an overview of recall algorithms, an overview of ranking algorithms, and several issues to watch for when putting recommendation algorithms into production. The fully personalized paradigm and the object-related paradigm are the most commonly used. They have many real-world applications in Internet products and are the focus of this article.
After reading this article, you will know which algorithms are commonly used for each paradigm, the ideas behind their implementation, and their common application scenarios. The article can also serve as a reference guide when you bring a recommendation algorithm into a real recommendation scenario.
Recommendation Algorithms and Product Introduction
The recommendation process of an industrial recommendation system is generally divided into two stages: recall and ranking. Recall uses algorithms to extract, from the full item catalog, the items (the subject matter being recommended) that the user may be interested in. Several algorithms are usually used for recall, such as hot (popularity) recall, collaborative filtering recall, and label-based recall. Ranking then sorts the items produced by the recall stage by the predicted probability that the user will click them (so-called CTR prediction). In practice, a layer of control logic is added after ranking: the ranked list is further supplemented and fine-tuned according to business rules and operational strategies to meet specific operational needs.
Figure 1 below shows the business process of the recommendation system of TV Cat (video playback software for OTT terminals, that is, smart TVs and smart boxes). It contains three algorithm and strategy modules: recall, ranking, and business regulation. It can serve as a reference when designing the algorithm modules of a recommendation system. This article only explains the algorithms involved in the recall and ranking stages. Business regulation depends strongly on the specific business and on the company's operational strategy, so this article does not describe it in detail.
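The two stages and the control layer described above can be sketched as follows. This is a minimal illustration and not the implementation of any particular system: the function names, the toy recall channels, the fixed CTR scores, and the "pinned" and "blocked" business rules are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Set

# A recall channel maps a user id to a list of candidate item ids.
RecallFn = Callable[[str], List[str]]


def recall(user_id: str, channels: Dict[str, RecallFn]) -> List[str]:
    """Merge candidates from every recall channel, dropping duplicates."""
    seen: Set[str] = set()
    merged: List[str] = []
    for fn in channels.values():
        for item in fn(user_id):
            if item not in seen:
                seen.add(item)
                merged.append(item)
    return merged


def rank(user_id: str, candidates: List[str],
         predict_ctr: Callable[[str, str], float]) -> List[str]:
    """Order candidates by predicted click probability, highest first."""
    return sorted(candidates,
                  key=lambda item: predict_ctr(user_id, item),
                  reverse=True)


def apply_business_rules(ranked: List[str], pinned: List[str],
                         blocked: Set[str]) -> List[str]:
    """Toy control layer (assumed rules): drop blocked items, pin others first."""
    kept = [i for i in ranked if i not in blocked and i not in pinned]
    return pinned + kept


def recommend(user_id: str, channels: Dict[str, RecallFn],
              predict_ctr: Callable[[str, str], float],
              pinned: List[str], blocked: Set[str], top_n: int = 3) -> List[str]:
    candidates = recall(user_id, channels)
    ranked = rank(user_id, candidates, predict_ctr)
    return apply_business_rules(ranked, pinned, blocked)[:top_n]


# Hypothetical toy data: three recall channels and fixed CTR scores.
channels = {
    "hot": lambda u: ["a", "b", "c"],
    "cf": lambda u: ["c", "d"],
    "label": lambda u: ["e"],
}
scores = {"a": 0.10, "b": 0.30, "c": 0.20, "d": 0.50, "e": 0.05}
result = recommend("u1", channels, lambda u, i: scores[i],
                   pinned=["e"], blocked={"b"})
```

In a real system the CTR function would be a trained model and each recall channel would query its own index. The sketch only shows how the stages compose.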
Figure 1 Business process of the TV Cat recommendation system
A recommendation algorithm is a machine learning algorithm, so the quality of the model depends strongly on the data set used to train it. Here we briefly describe the data a recommendation system can use (see Figure 2 below and the data sources in Figure 1 above). Generally, a recommendation system relies on three types of data: item metadata (descriptive information about the item), user portrait data (user-related attributes such as age, region, gender, and income), and user behavior data (the user's actions on items, such as playing, clicking, purchasing, and collecting). These three types are the main data available for modeling. In addition, manually labeled data and third-party data can be used to supplement and improve them.
Figure 2 The three types of data a recommendation system relies on
You should now have a preliminary understanding of the recommendation process and of the data the algorithms rely on. Next, we look at the recommendation products and the feasible recommendation algorithms for each paradigm, so that you can apply the right algorithm to each kind of recommendation product.
The five recommendation paradigms mentioned above can be understood along three dimensions:
One is the user dimension.
One is the item dimension.
One is the Cartesian product dimension of users and items.
From the user dimension, the system recommends items that the user may be interested in. From the item dimension, the system recommends a group of related items when the user visits (or leaves) an item's details page. The third dimension combines the user dimension with the item dimension: different users see different recommendations when they visit the same item's details page.
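One way to picture the three dimensions is as three lookup keys for a recommendation result: the user, the item, and the (user, item) pair. The sketch below uses hypothetical precomputed tables and toy ids. The fallback from the pair to the item-only list is an assumption for illustration, not something the article prescribes.

```python
from typing import Dict, List, Tuple

# Hypothetical precomputed recommendation tables (toy data).
by_user: Dict[str, List[str]] = {"u1": ["m1", "m2"], "u2": ["m3"]}
by_item: Dict[str, List[str]] = {"m1": ["m4", "m5"]}
by_user_item: Dict[Tuple[str, str], List[str]] = {("u1", "m1"): ["m5", "m6"]}


def for_user(user_id: str) -> List[str]:
    """User dimension: items this user may be interested in."""
    return by_user.get(user_id, [])


def related_to_item(item_id: str) -> List[str]:
    """Item dimension: the same related items for every visitor of the page."""
    return by_item.get(item_id, [])


def for_user_on_item(user_id: str, item_id: str) -> List[str]:
    """Cartesian product dimension: the result depends on both user and item.

    Assumed behavior: fall back to the item-only list when no
    user-specific list exists for this pair.
    """
    return by_user_item.get((user_id, item_id), related_to_item(item_id))
```

Here `for_user_on_item("u1", "m1")` and `for_user_on_item("u2", "m1")` return different lists for the same details page, which is the behavior the third dimension describes.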
1. Recommendation based on user dimension
Recommendations based on the user dimension can be divided, by granularity of personalization, into non-personalized, group personalized, and fully personalized. These three granularities correspond to the non-personalized paradigm, the group personalized paradigm, and the fully personalized paradigm introduced earlier.
Non-personalized recommendation means that every user sees the same recommended content. Editors of traditional portal websites arrange content in this way. The ranking lists found on many websites and apps are also non-personalized. Figure 3 below shows the ranking lists of NetEase Cloud Music, which computes a variety of lists along different dimensions.
Figure 3 NetEase Cloud Music rankings
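A ranking list of this kind can be computed by simple counting. The sketch below is an assumption about how such lists might be built and not a description of NetEase Cloud Music's method: it counts plays per song overall and per genre, and every user sees the same result. The log format and the names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def build_charts(play_log: Iterable[Tuple[str, str]],
                 top_n: int = 2) -> Dict[str, List[str]]:
    """Build one overall chart plus one chart per genre from (genre, song) plays."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for genre, song in play_log:
        counts["overall"][song] += 1
        counts[genre][song] += 1
    # most_common returns (song, count) pairs sorted by count, highest first.
    return {chart: [song for song, _ in counter.most_common(top_n)]
            for chart, counter in counts.items()}


# Hypothetical play log: (genre, song id).
play_log = [
    ("rock", "s1"), ("rock", "s1"), ("rock", "s2"),
    ("pop", "s3"), ("pop", "s3"), ("pop", "s3"), ("pop", "s4"),
]
charts = build_charts(play_log)
```

Because the charts do not depend on who is asking, they can be computed offline once and served to all users.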
Group personalization aggregates users who share certain characteristics into a group and recommends exactly the same content to every user in that group.
Fine-grained operations generally work this way: a user portrait system is used to select a group of users, and a unified campaign is run for that group. For example, in fine-grained membership operations in the video industry, an operator can select the members whose memberships are about to expire and offer them a membership discount to encourage them to renew.
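A minimal sketch of group personalization, under stated assumptions: users are assigned to groups in advance (the group labels here are hypothetical), and each group's list is simply the most popular items among that group's members. The article does not say how the group list is computed, so the popularity rule is only one possible choice.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def group_recommendations(user_group: Dict[str, str],
                          behavior: Iterable[Tuple[str, str]],
                          top_n: int = 2) -> Dict[str, List[str]]:
    """For each group, list the items its members interacted with most."""
    per_group: Dict[str, Counter] = defaultdict(Counter)
    for user_id, item in behavior:
        group = user_group.get(user_id)
        if group is not None:
            per_group[group][item] += 1
    return {group: [item for item, _ in counter.most_common(top_n)]
            for group, counter in per_group.items()}


def recommend_for_group_member(user_id: str, user_group: Dict[str, str],
                               group_recs: Dict[str, List[str]]) -> List[str]:
    """Every user in the same group receives exactly the same list."""
    return group_recs.get(user_group.get(user_id), [])


# Hypothetical toy data: group labels and (user, item) behavior records.
user_group = {"u1": "expiring_member", "u2": "expiring_member", "u3": "free"}
behavior = [("u1", "x"), ("u2", "x"), ("u2", "y"), ("u3", "z")]
group_recs = group_recommendations(user_group, behavior)
```

Users `u1` and `u2` belong to the same group and so receive the same list, while a user with no known group receives an empty list in this sketch.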
Figure 4 is a TV Cat example.