How we built near real-time “X for you” recommender systems at Bol | Blog | bol.com


Figure 5: Product relationships. Most customers who buy P_2 also buy P_4, resulting in a buy-buy relationship. Most customers who view P_2 end up buying P_5, resulting in a view-buy relationship. In this example, P_2 plays three types of roles: view query, buy query and target.

The aim of training an encoder model is to capture these existing item-to-item relationships and then generalize this understanding to new potential connections between items, thereby expanding the graph with plausible new item-to-item relationships.

Step 2 is about using the transformer encoder trained in step 1 and generating embeddings for all items in the catalog.

Step 3 is about indexing the items that need to be matched (e.g. items with promotional labels or items that are new releases). The items that are indexed are then matched against all potential queries (viewed or purchased items). The results of the search are then stored in a lookup table.
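As a rough illustration of step 3, the sketch below builds a lookup table by scoring every query against every indexed item with a dot product and keeping the top results. It is a minimal sketch under stated assumptions: the post does not say which search index is used, so a brute-force scan stands in for it, and the names `build_lookup_table`, `query_embeddings` and `indexed_embeddings`, the toy two-dimensional vectors, and the rule that an item is never returned for itself are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def build_lookup_table(
    query_embeddings: Dict[str, Vector],
    indexed_embeddings: Dict[str, Vector],
    k: int = 10,
) -> Dict[str, List[Tuple[str, float]]]:
    """Map each query item to its k best-scoring indexed items."""
    table: Dict[str, List[Tuple[str, float]]] = {}
    for query_id, query_vec in query_embeddings.items():
        scored = [
            (item_id, dot(query_vec, item_vec))
            for item_id, item_vec in indexed_embeddings.items()
            if item_id != query_id  # assumption: never match an item to itself
        ]
        # Highest score first; ties are broken by item id for determinism.
        scored.sort(key=lambda pair: (-pair[1], pair[0]))
        table[query_id] = scored[:k]
    return table


# Toy data (made up): two query items and three indexed items.
query_embeddings = {"P_2": [1.0, 0.0], "P_3": [0.0, 1.0]}
indexed_embeddings = {"P_4": [0.9, 0.1], "P_5": [0.2, 0.8], "P_6": [0.5, 0.5]}

lookup_table = build_lookup_table(query_embeddings, indexed_embeddings, k=2)
top_for_p2 = [item_id for item_id, _ in lookup_table["P_2"]]
top_for_p3 = [item_id for item_id, _ in lookup_table["P_3"]]
```

A full scan costs one dot product per query-item pair, which is fine for a toy example but would presumably be replaced by an approximate nearest-neighbour index at catalog scale.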

Step 4 is about generating personalized feeds per customer based on customer interactions and the lookup table from step 3. The process for generating a ranked list of items per user includes: 1) selecting queries for each customer (up to 100), 2) retrieving up to 10 potential next items-to-buy for each query, and 3) combining these items and applying ranking, diversity, and business criteria (see Figure 4d). This process is executed daily for all customers and every two minutes for those active in the last two minutes. Recommendations resulting from recent queries are prioritized over those from historical ones. All these steps are orchestrated with Airflow.
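The three sub-steps of step 4 can be sketched as follows. This is a hedged illustration, not the production logic: the post does not specify the ranking, diversity, or business criteria, so the sketch assumes a simple geometric recency weight (`decay`) and drops items the customer has already interacted with. The function name `build_feed`, its parameters, and the toy data are hypothetical.

```python
from typing import Dict, List, Tuple

LookupTable = Dict[str, List[Tuple[str, float]]]


def build_feed(
    interactions: List[Tuple[str, int]],  # (item_id, timestamp)
    lookup_table: LookupTable,
    max_queries: int = 100,
    per_query: int = 10,
    decay: float = 0.9,   # assumed recency weight, not from the post
    feed_size: int = 20,  # assumed feed length, not from the post
) -> List[str]:
    # 1) Select queries: the most recent distinct items, up to max_queries.
    queries: List[str] = []
    seen = set()
    for item_id, _ in sorted(interactions, key=lambda it: it[1], reverse=True):
        if item_id not in seen:
            seen.add(item_id)
            queries.append(item_id)
        if len(queries) == max_queries:
            break

    # 2) Retrieve up to per_query candidates for each query.
    best: Dict[str, float] = {}
    for rank, query_id in enumerate(queries):
        weight = decay ** rank  # more recent queries get a larger weight
        for target_id, score in lookup_table.get(query_id, [])[:per_query]:
            if target_id in seen:
                continue  # assumption: skip items already interacted with
            weighted = weight * score
            if weighted > best.get(target_id, float("-inf")):
                best[target_id] = weighted

    # 3) Combine and rank by weighted score (ties broken by item id).
    ranked = sorted(best.items(), key=lambda pair: (-pair[1], pair[0]))
    return [target_id for target_id, _ in ranked[:feed_size]]


# Toy data (made up): the customer viewed P_3 first, then P_2.
lookup_table = {
    "P_2": [("P_4", 0.9), ("P_6", 0.5)],
    "P_3": [("P_5", 0.8), ("P_6", 0.5)],
}
interactions = [("P_3", 100), ("P_2", 200)]
feed = build_feed(interactions, lookup_table)
```

Because the lookup table is precomputed, refreshing a feed for a recently active customer only needs dictionary lookups and a sort, which is consistent with the two-minute refresh cadence described above.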

Applications of Pfeed

We applied Pfeed to generate various personalized feeds at Bol, viewable on the app or website with titles like Top deals for you, Top picks for you, and New for you. The feeds differ in at least one of two factors: the specific items targeted for personalization, or the queries selected to represent customer interests. There is also another feed called Select Deals for you. In this feed, items with Select Deals are personalized exclusively for Select members, customers who pay annual fees for certain benefits. You can find Select Deals for you on empty baskets.

In general, Pfeed is designed to generate an “X for you” feed by limiting the search index or the search output to only items belonging to category X, for all potential queries.
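To make the "limit the search output" option concrete, here is a minimal sketch that filters an existing lookup table down to items carrying a given label. The function `restrict_to_category`, the `item_labels` mapping, and the label names are hypothetical and only serve the illustration.

```python
from typing import Dict, List, Set, Tuple

LookupTable = Dict[str, List[Tuple[str, float]]]


def restrict_to_category(
    lookup_table: LookupTable,
    item_labels: Dict[str, Set[str]],
    category: str,
) -> LookupTable:
    """Keep only the candidates labelled with the requested category."""
    return {
        query_id: [
            (item_id, score)
            for item_id, score in candidates
            if category in item_labels.get(item_id, set())
        ]
        for query_id, candidates in lookup_table.items()
    }


# Toy data (made up): labels such as "deal" or "new" stand in for category X.
lookup_table = {
    "P_2": [("P_4", 0.9), ("P_6", 0.5), ("P_5", 0.2)],
    "P_3": [("P_5", 0.8), ("P_6", 0.5), ("P_4", 0.1)],
}
item_labels = {"P_4": {"deal"}, "P_5": {"new", "deal"}, "P_6": {"new"}}

new_for_you = restrict_to_category(lookup_table, item_labels, "new")
```

Filtering the output can leave a query with fewer candidates than requested, because items outside the category have already used up slots in the top results. Restricting the index before searching avoids that, at the cost of building one index per category.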

Evaluation

We perform two types of evaluation – offline and online. The offline evaluation is used for quick validation of the efficiency and quality of embeddings. The online evaluation is used to assess the impact of the embeddings in personalizing customers’ homepage experiences.

Offline evaluation

We use about two million matching query-target pairs and about one million random items for training, validation and testing, split in the proportion 80%, 10%, 10%. We randomly select a million products from the catalog to form a distractor set, which is then mixed with the true targets in the test dataset. The objective of the evaluation is to determine, for known matching query-target pairs, the percentage of times the true targets are among the top 10 items retrieved for their respective queries in the embedding space using dot product (Recall@10). The higher the score, the better.

Table 1 shows that two embedding models, called SIMO-128 and SISO-128, achieve comparable Recall@10 scores. The SIMO-128 model generates three 128-dimensional embeddings in one shot, while SISO-128 generates the same three 128-dimensional embeddings but in three separate runs.

The efficiency advantage of SIMO-128 implies that we can generate embeddings for the entire catalog much faster without sacrificing embedding quality.
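The Recall@10 metric described above can be sketched in a few lines. This is an illustrative reading of the metric, not the evaluation code behind Table 1: the candidate pool is assumed to be the distractors plus all true targets in the test pairs, ties are counted in favour of the true target, and the function name `recall_at_k` and the toy embeddings are hypothetical.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def recall_at_k(
    pairs: List[Tuple[str, str]],  # known (query_id, true_target_id) pairs
    query_embeddings: Dict[str, Vector],
    item_embeddings: Dict[str, Vector],
    distractor_ids: Iterable[str],
    k: int = 10,
) -> float:
    """Fraction of pairs whose true target ranks in the top k by dot product."""
    # Assumption: the pool mixes the distractors with every true target.
    pool = set(distractor_ids) | {target_id for _, target_id in pairs}
    hits = 0
    for query_id, target_id in pairs:
        query_vec = query_embeddings[query_id]
        true_score = dot(query_vec, item_embeddings[target_id])
        # The target is in the top k if fewer than k pool items beat it.
        better = sum(
            1
            for item_id in pool
            if item_id != target_id
            and dot(query_vec, item_embeddings[item_id]) > true_score
        )
        if better < k:
            hits += 1
    return hits / len(pairs)


# Toy data (made up): two test pairs and three distractors.
query_embeddings = {"q1": [1.0, 0.0], "q2": [0.0, 1.0]}
item_embeddings = {
    "t1": [0.9, 0.0], "t2": [0.0, 0.2],
    "d1": [0.5, 0.5], "d2": [0.1, 0.9], "d3": [0.8, 0.1],
}
pairs = [("q1", "t1"), ("q2", "t2")]
distractors = ["d1", "d2", "d3"]

recall_at_2 = recall_at_k(pairs, query_embeddings, item_embeddings, distractors, k=2)
recall_at_3 = recall_at_k(pairs, query_embeddings, item_embeddings, distractors, k=3)
```

In the toy data, t1 is the best match for q1, while two distractors outscore t2 for q2, so the second pair only counts as a hit once k reaches 3.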
