De-Biased Modelling of Search Click Behavior with Reinforcement Learning

by Jianghong Zhou, et al.

Users' clicks on web search results are one of the key signals for evaluating and improving web search quality, and have been widely used as part of current state-of-the-art Learning-to-Rank (LTR) models. With a large volume of search logs available for major search engines, effective models of searcher click behavior have emerged to evaluate and train LTR models. However, when modeling users' click behavior, it is imperative to account for the bias in that behavior. In particular, when a search result is not clicked, the user has not necessarily judged it irrelevant; it may simply have been missed, especially if it was ranked lower. Such biases in the click log data can be absorbed into the click models, propagating errors to the resulting LTR ranking models or evaluation metrics. In this paper, we propose the De-biased Reinforcement Learning Click model (DRLC). The DRLC model relaxes previously made assumptions about the users' examination behavior and the resulting latent states. To implement the DRLC model, convolutional neural networks are used as the value networks for reinforcement learning, trained to learn a policy that reduces bias in the click logs. To demonstrate the effectiveness of the DRLC model, we first compare its performance with previous state-of-the-art approaches using established click prediction metrics, including log-likelihood and perplexity. We further show that DRLC also leads to improvements in ranking performance. Our experiments show that the DRLC model learns to reduce bias in click logs, leading to improved modeling performance and indicating the potential of DRLC for improving web search quality.
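For readers unfamiliar with the two click prediction metrics named above, the following is a minimal sketch of how log-likelihood and per-rank perplexity are commonly computed for a click model. It is not the authors' evaluation code: the function names, the nested-list data layout (one list of ranks per session), and the clipping constant `EPS` are assumptions made for illustration, and the abstract does not specify the exact definitions used in the paper.

```python
import math

EPS = 1e-12  # assumed clipping constant: keeps log() finite at p = 0 or 1


def _clip(p):
    """Clip a predicted click probability into (0, 1)."""
    return min(max(p, EPS), 1.0 - EPS)


def mean_log_likelihood(clicks, probs):
    """Mean natural-log likelihood over all (session, rank) pairs.

    clicks[s][r] is 1 if the result at rank r in session s was clicked,
    probs[s][r] is the model's predicted click probability for it.
    """
    total, count = 0.0, 0
    for session_clicks, session_probs in zip(clicks, probs):
        for c, p in zip(session_clicks, session_probs):
            p = _clip(p)
            total += math.log(p) if c else math.log(1.0 - p)
            count += 1
    return total / count


def perplexity_by_rank(clicks, probs):
    """Perplexity at each rank: 2 ** (-mean log2-likelihood at that rank).

    Assumes every session has the same number of ranks.
    1.0 is a perfect prediction; 2.0 matches a coin flip.
    """
    n_sessions = len(clicks)
    n_ranks = len(clicks[0])
    result = []
    for r in range(n_ranks):
        log2_sum = 0.0
        for session_clicks, session_probs in zip(clicks, probs):
            p = _clip(session_probs[r])
            log2_sum += math.log2(p) if session_clicks[r] else math.log2(1.0 - p)
        result.append(2.0 ** (-log2_sum / n_sessions))
    return result


# Toy data (made up): two sessions, two ranks, an uninformative model.
toy_clicks = [[1, 0], [0, 0]]
toy_probs = [[0.5, 0.5], [0.5, 0.5]]
toy_ll = mean_log_likelihood(toy_clicks, toy_probs)
toy_ppl = perplexity_by_rank(toy_clicks, toy_probs)
```

Higher log-likelihood and lower perplexity both indicate better click prediction. Reporting perplexity per rank makes it possible to see whether a model degrades at lower positions, where unexamined results are most likely.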


Handling Position Bias for Unbiased Learning to Rank in Hotels Search

Nowadays, search ranking and recommendation systems rely on a lot of dat...

Constructing an Interaction Behavior Model for Web Image Search

User interaction behavior is a valuable source of implicit relevance fee...

Did We Get It Right? Predicting Query Performance in E-commerce Search

In this paper, we address the problem of evaluating whether results serv...

Evaluation metrics for behaviour modeling

A primary difficulty with unsupervised discovery of structure in large d...

RLIRank: Learning to Rank with Reinforcement Learning for Dynamic Search

To support complex search tasks, where the initial information requireme...

All You Need Is Logs: Improving Code Completion by Learning from Anonymous IDE Usage Logs

Integrated Development Environments (IDE) are designed to make users mor...

FE-TCM: Filter-Enhanced Transformer Click Model for Web Search

Constructing click models and extracting implicit relevance feedback inf...
