Contextual Document Similarity for Content-based Literature Recommender Systems

by Malte Ostendorff, et al.

To cope with the ever-growing information overload, an increasing number of digital libraries employ content-based recommender systems. These systems traditionally recommend related documents with the help of similarity measures. However, current document similarity measures simply distinguish between similar and dissimilar documents. This simplification is especially problematic for long documents, which cover various facets of a topic and are common in digital libraries. Such measures do not indicate which facet the similarity relates to, so the context of the similarity remains ill-defined. In this doctoral thesis, we explore contextual document similarity measures, i.e., methods that determine document similarity as a triple of two documents and the context of their similarity. Here, the context is a further specification of the similarity. For example, in the scientific domain, research papers can be similar with respect to their background, methodology, or findings. Measuring similarity with regard to one or more given contexts will enhance recommender systems: users will be able to explore document collections by formulating queries in terms of documents and their contextual similarities. Thus, our research objective is the development and evaluation of a recommender system based on contextual similarity. The underlying techniques will apply established similarity measures as well as neural approaches while utilizing semantic features obtained from links between documents and their text.
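
To make the idea of similarity as a triple (two documents plus a context) concrete, here is a minimal Python sketch. It is not the method developed in the thesis: it assumes each paper is already split into the three example contexts named above, and it swaps in a plain bag-of-words cosine similarity for the established and neural measures the abstract mentions. The names `cosine`, `contextual_similarity`, `recommend`, and the toy papers are hypothetical and exist only for illustration.

```python
from collections import Counter
from math import sqrt


def cosine(text_a: str, text_b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two texts."""
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = sqrt(sum(c * c for c in vec_a.values()))
    norm_b = sqrt(sum(c * c for c in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def contextual_similarity(doc_a: dict, doc_b: dict, context: str) -> float:
    """Similarity of the triple (doc_a, doc_b, context).

    Assumption: a document is a dict mapping a context name to the text
    that belongs to that context.
    """
    return cosine(doc_a.get(context, ""), doc_b.get(context, ""))


def recommend(seed: dict, candidates: dict, context: str) -> list:
    """Rank candidate ids by similarity to the seed within one context."""
    scored = [
        (doc_id, contextual_similarity(seed, doc, context))
        for doc_id, doc in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data (hypothetical): the two papers share a method but nothing else.
paper_a = {
    "background": "recommender systems for digital libraries",
    "methodology": "neural embeddings of citation links",
    "findings": "accuracy improves on long documents",
}
paper_b = {
    "background": "protein folding in molecular biology",
    "methodology": "neural embeddings of citation links and text",
    "findings": "runtime decreases for small graphs",
}

if __name__ == "__main__":
    for ctx in ("background", "methodology", "findings"):
        print(ctx, round(contextual_similarity(paper_a, paper_b, ctx), 3))
```

In this toy data the two papers are similar only with respect to methodology. A single context-free score would blur that distinction, whereas a query such as `recommend(paper_a, {"b": paper_b}, "methodology")` lets a user ask for papers that resemble a seed document in one chosen context.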




Aspect-based Document Similarity for Research Papers

Traditional document similarity measures provide a coarse-grained distin...

Generative Interest Estimation for Document Recommendations

Learning distributed representations of documents has pushed the state-o...

Classifying document types to enhance search and recommendations in digital libraries

In this paper, we address the problem of classifying documents available...

Pairwise Multi-Class Document Classification for Semantic Relations between Wikipedia Articles

Many digital libraries recommend literature to their users considering t...

From Task Classification Towards Similarity Measures for Recommendation in Crowdsourcing Systems

Task selection in micro-task markets can be supported by recommender sys...

Specialized Document Embeddings for Aspect-based Similarity of Research Papers

Document embeddings and similarity measures underpin content-based recom...

Representation Learning for Recommender Systems with Application to the Scientific Literature

The scientific literature is a large information network linking various...
