
7 - Evaluation in Recommender Systems

Published online by Cambridge University Press:  08 May 2025

Summary

This chapter examines the critical role of evaluation within the framework of recommender systems, highlighting its significance alongside system construction. We identify three key aspects of evaluation: the impact of metrics on optimization quality, the collaborative nature of evaluation efforts across teams, and the alignment of chosen metrics with organizational goals. Our discussion spans a comprehensive range of evaluation techniques, from offline methods to online experiments. We explore offline evaluation methods and metrics, offline simulation through replay, online A/B testing, and fast online evaluation via interleaving. Ultimately, we propose a multilayer evaluation architecture that integrates these diverse methods to enhance the scientific rigor and efficiency of recommender system assessments.

Publisher: Cambridge University Press
Print publication year: 2025

7 Evaluation in Recommender Systems

Evaluation accounts for only a small portion of the overall recommender systems knowledge framework, but it is as important as building the recommender system itself. Evaluation mainly involves the following three points:

(1) The metrics used in evaluation directly determine whether the optimization of the recommender system is objective and reasonable.

(2) Evaluation is a collaborative effort that requires the machine learning team to communicate and cooperate with other teams.

(3) The selected metrics directly determine whether the recommender system meets the company’s business goals and development vision.

These three points are the keys to the success of a recommender system.

This chapter focuses on the evaluation of recommender systems, from offline evaluation to online experiments. It discusses the methods and metrics of recommender system evaluation at multiple levels, including the following:

(1) Offline evaluation methods and metrics.

(2) Offline simulation evaluation – replay.

(3) Online A/B testing and online metrics.

(4) Fast online evaluation method – interleaving.

These evaluation methods are not independent. At the end of this chapter, we will discuss how to combine different levels of evaluation methods to form a scientific and efficient multilayer recommender system evaluation architecture.

7.1 Offline Evaluation Methods and Basic Metrics

In the evaluation process of a recommender system, offline evaluation is the most common and basic evaluation method. As the name suggests, offline evaluation refers to evaluation performed in an offline environment before deploying the model online. Since there is no need to deploy to the production environment, offline evaluation carries no engineering risk of an online deployment and does not consume valuable online traffic resources. It also has many other advantages, such as short test time, the ability to run multiple tests in parallel, access to abundant offline computing resources, and so on.

Therefore, before the model is launched online, conducting a large number of offline evaluations is the most efficient way to verify the model performance. In order to fully grasp the technical points of offline evaluation, it is necessary to master two aspects of knowledge – the methods and the metrics used in offline evaluation.

7.1.1 Methods of Offline Evaluations

The basic principle of offline evaluation is to divide the dataset into a training set and a testing set in an offline environment. The training set is used to train the model, and the testing set is for model evaluation. According to different dataset partition methods, offline evaluation can be divided into the following three types:

7.1.1.1 Holdout Test

The holdout test is a basic offline evaluation method, which randomly divides the original sample set into two parts – the training set and the testing set. For example, for a recommendation model, the samples can be randomly divided into two parts according to the ratio of 70%:30%, where 70% of the samples are used for model training and 30% of the samples are used for model evaluation.

The disadvantage of the holdout test is obvious: the evaluation metric computed on the testing set depends directly on how the training and testing sets are divided. If only a small number of holdout tests are performed, the conclusions obtained will be fairly random. To eliminate this randomness, the idea of cross-validation was proposed.

7.1.1.2 Cross-Validation
• K-fold cross-validation. In this method, the samples are first divided into k equally sized subsets. The training and evaluation process then traverses these k subsets in turn: in each round, the current subset is used as the testing set and all other subsets are used as the training set. Finally, the average of the evaluation metrics over the k runs is used as the final evaluation result. In practical experiments, k is usually set to 10 (a minimal sketch of this procedure follows this list).

• Leave-one-out validation. One sample is left out as the testing set each time, and all other samples are used as the training set. Assuming the total number of samples is n, all n samples are traversed in turn, n evaluations are performed, and the evaluation metrics are averaged to obtain the final result. When the number of samples is large, the time overhead of leave-one-out validation is extremely high. In fact, leave-one-out validation is a special case of leave-p-out validation, which leaves p samples as the testing set each time; there are $C_n^p$ ways to select p elements from n elements, so its time complexity is much higher than that of leave-one-out validation. As a result, it is rarely used in practice.
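To make the procedure concrete, here is a minimal k-fold cross-validation sketch in Python. It assumes `samples` and `labels` are NumPy arrays and that `train_fn` and `metric_fn` are hypothetical callables supplied by the reader; it illustrates the splitting logic rather than prescribing an implementation.

```python
import numpy as np

def k_fold_evaluate(samples, labels, train_fn, metric_fn, k=10, seed=42):
    """Split the data into k folds, train on k-1 folds, evaluate on the held-out fold,
    and return the average metric over the k runs."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(samples))
    folds = np.array_split(indices, k)

    scores = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = train_fn(samples[train_idx], labels[train_idx])   # fit on the k-1 folds
        scores.append(metric_fn(model, samples[test_idx], labels[test_idx]))
    return float(np.mean(scores))
```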

7.1.1.3 Bootstrap

Both the holdout test and the cross-validation are based on the method of dividing whole datasets into training and testing sets. However, when the sample size is relatively small, sample set division will further reduce the training sample amount, which may affect the training effect of the model. Is there an evaluation method that can maintain the sample size of the training set? The bootstrap approach can solve this problem to a certain extent.

Bootstrap is a test method based on the resampling technique. For a sample set of size n, random sampling with replacement is performed n times to obtain a training set of size n. During these n draws, some samples are selected multiple times, while others are never drawn. The bootstrap method uses these undrawn samples as the testing set for model evaluation.
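The following sketch shows this sampling scheme with NumPy only; the returned index arrays would then be used to slice the actual sample set.

```python
import numpy as np

def bootstrap_split(n, seed=0):
    """Draw n indices with replacement as the training set; the indices that were
    never drawn form the out-of-bag testing set (roughly 36.8% of samples for large n)."""
    rng = np.random.default_rng(seed)
    train_idx = rng.integers(0, n, size=n)   # n draws with replacement
    oob_mask = np.ones(n, dtype=bool)
    oob_mask[train_idx] = False              # mark every drawn index
    test_idx = np.flatnonzero(oob_mask)      # undrawn samples become the testing set
    return train_idx, test_idx
```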

7.1.2 Offline Evaluation Metrics

Having chosen an offline evaluation method, the next step is to measure the performance of a recommendation model. This requires evaluating the recommender system from multiple perspectives through different metrics and then drawing comprehensive conclusions. The following metrics are commonly used in offline evaluation.

7.1.2.1 Accuracy

Classification accuracy refers to the ratio of correctly classified samples to the total number of samples, that is,

$$\text{Accuracy} = \frac{n_{\text{correct}}}{n_{\text{total}}} \tag{7.1}$$

where $n_{\text{correct}}$ is the number of correctly classified samples and $n_{\text{total}}$ is the total number of samples.

Accuracy is a relatively intuitive evaluation metric in classification tasks. Although it has strong interpretability, it also has drawbacks. When the proportions of samples in different categories are highly imbalanced, the category with the largest proportion tends to dominate the accuracy. For example, if negative samples account for 99% of the data, a classifier that predicts every sample as negative achieves 99% accuracy.

For a click-through rate prediction problem framed as classification, the recommendation model can be evaluated with accuracy once a threshold for separating positive and negative predictions has been chosen. In actual recommendation scenarios, however, the more common use case is to generate a recommendation list, so the combination of precision and recall is more commonly used to measure recommendation performance.

7.1.2.2 Precision and Recall

Precision is the ratio of the number of correctly classified positive samples to the number of samples predicted as positive, while recall is the ratio of the number of correctly classified positive samples to the total number of actual positive samples.

In a ranking model, there is usually no definite threshold for directly judging a prediction as positive or negative. The precision (Precision@N) and recall (Recall@N) of the Top N ranked results are therefore usually used to evaluate the ranking model’s performance. In this case, the Top N items are treated as the positive samples predicted by the model when calculating precision and recall.

Precision and recall are conflicting metrics. To improve precision, the model should only predict a sample as positive when it has high confidence, but this causes it to miss many true positives about which it is less confident, resulting in a lower recall.

In order to comprehensively reflect the results of precision and recall, the F1-score is often adopted. F1-score is the harmonic mean of precision and recall, which is defined as follows:

$$\text{F1} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{7.2}$$
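The sketch below computes Precision@N, Recall@N, and F1 for a single ranked list. It assumes the ground truth labels have already been sorted by the model’s scores in descending order; the function names are illustrative.

```python
def precision_recall_at_n(ranked_labels, n):
    """ranked_labels: 0/1 ground truth labels sorted by model score (descending).
    The top N items are treated as the model's predicted positives."""
    top_n = ranked_labels[:n]
    num_hits = sum(top_n)                    # correctly predicted positives in the top N
    num_positives = sum(ranked_labels)       # all true positives in the list
    precision = num_hits / n
    recall = num_hits / num_positives if num_positives else 0.0
    return precision, recall

def f1_score(precision, recall):
    """Harmonic mean of precision and recall, as in Equation (7.2)."""
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0
```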
7.1.2.3 Root Mean Square Error

Root mean square error (RMSE) is often used to measure the quality of the regression model. When using the click-through rate prediction model to build a recommender system, the recommender system actually predicts the probability of a positive sample. It can be evaluated by RMSE, which is defined as follows,

$$\text{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}} \tag{7.3}$$

where $y_i$ is the ground truth label of the $i$-th sample, $\hat{y}_i$ is the predicted value of the $i$-th sample, and $n$ is the number of samples.

In general, RMSE can well reflect the degree of deviation between the predicted value of the regression model and the true value. However, in practical applications, if there are individual outliers with a very large degree of deviation, the RMSE can become quite large even if the number of outliers is small. To solve this problem, mean absolute percent error (MAPE) is often adopted to improve the robustness against outliers. The definition of MAPE is as follows,

$$\text{MAPE} = \frac{\sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100}{n} \tag{7.4}$$

Compared with RMSE, MAPE is equivalent to normalizing the error of each sample, which reduces the impact of absolute error brought by individual outliers.
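A short sketch of both metrics, assuming NumPy arrays of ground truth and predicted values (and, for MAPE, no zero-valued labels):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error, as in Equation (7.3)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mape(y_true, y_pred):
    """Mean absolute percent error, as in Equation (7.4); assumes y_true has no zeros."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)
```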

7.1.2.4 Logarithmic Loss Function

Logarithmic loss function (LogLoss) is another metric that is often used in offline evaluation. In a binary classification problem, LogLoss can be defined as follows,

$$\text{LogLoss} = -\frac{1}{N} \sum_{i=1}^{N} \left( y_i \log P_i + (1 - y_i) \log (1 - P_i) \right) \tag{7.5}$$

where $y_i$ is the ground truth label of sample $x_i$, $P_i$ is the predicted probability that sample $x_i$ is positive, and $N$ is the total number of samples.

Readers may notice that LogLoss is exactly the loss function of logistic regression. A large number of deep learning models use logistic regression (that is, Sigmoid) or Softmax as the output layer. Therefore, using LogLoss as an evaluation metric very intuitively reflects changes in the model’s loss function. From the perspective of the model itself, LogLoss is a very suitable metric for assessing the model’s convergence.
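A minimal LogLoss sketch following Equation (7.5); the clipping constant is an implementation detail added here to avoid taking the logarithm of zero, not part of the definition.

```python
import numpy as np

def logloss(y_true, p_pred, eps=1e-15):
    """Binary LogLoss averaged over N samples, as in Equation (7.5)."""
    y = np.asarray(y_true, float)
    p = np.clip(np.asarray(p_pred, float), eps, 1 - eps)   # keep log() finite
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))
```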

7.2 Offline Metrics for Ranking Models

Section 7.1 introduced the main offline evaluation methods and common evaluation metrics of the recommender system, but those metrics are more commonly used to evaluate a prediction model, such as a click-through rate (CTR) model, rather than a ranking model. Usually, we hope the outputs of a prediction model have a physical meaning: in the CTR prediction case, the model output should be close to the empirical click-through rate. However, the final output of the recommender system is usually a ranked list. Taking the matrix factorization method as an example, the similarity between users and items is only a criterion used for sorting and does not have a physical meaning like CTR. Therefore, it is more appropriate to evaluate recommendation models using metrics that directly measure ranking quality. In this section, we introduce several offline metrics that directly measure ranking performance: the Precision-Recall (P–R) curve, the Receiver Operating Characteristic (ROC) curve, and mean average precision (mAP).

7.2.1 Precision–Recall Curve

Section 7.1 introduced two important metrics for evaluating ranked sequences: Precision@N and Recall@N. To comprehensively evaluate the quality of a ranking model, we should not only check the model’s Precision@N and Recall@N under different values of Top N, but also draw the model’s Precision-Recall curve (P–R curve). In this section, we briefly introduce the method for generating a P–R curve.

The horizontal axis of the P–R curve is the recall rate, and the vertical axis is the precision rate. For a ranking model, a point on its P–R curve represents the precision and recall under a given threshold, where the threshold is used to decide whether a prediction is positive or negative. In other words, if the model’s prediction score is greater than the threshold, the sample is predicted as positive; otherwise it is considered negative.

The entire P–R curve is generated by changing the threshold from high to low. As shown in Figure 7.1, the solid line represents the P–R curve of model A, and the dotted line represents the P–R curve of model B. The points near the origin of the horizontal axis represent the precision and recall of the model when the threshold is at its maximum.

Figure 7.1 Examples of P–R curves.

It can be seen from Figure 7.1 that when the recall rate is close to 0, the precision rate of model A is 0.9, and the precision rate of model B is 1. This implies that the samples with the top scores in model B are all true positive samples, while model A has some prediction errors even when the predicted score is quite high. As the recall rate increases, the precision rate decreases overall. In particular, when the recall rate is 1, the precision rate of model A exceeds that of model B. This demonstrates that the performance of the model cannot be fully measured by using the precision and recall of a single point. The overall model performance should be evaluated using a comprehensive metric such as the P–R curve.

After the P–R curve is generated, the area under the curve (AUC) can be used to summarize it in a single number. As the name implies, AUC refers to the area under the P–R curve, so computing it requires integrating the curve along its horizontal axis. The larger the AUC, the better the ranking model’s performance.
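The sketch below generates P–R points by sweeping the threshold over the sorted prediction scores and then approximates the P–R AUC by trapezoidal integration. It assumes binary labels and treats every distinct prediction score as a threshold, which is one common simplification.

```python
import numpy as np

def pr_curve(labels, scores):
    """Sweep the threshold from high to low and collect (recall, precision) points."""
    order = np.argsort(scores)[::-1]          # sort samples by score, descending
    labels = np.asarray(labels)[order]
    tp = np.cumsum(labels)                    # true positives among the top-k predictions
    k = np.arange(1, len(labels) + 1)
    precision = tp / k
    recall = tp / labels.sum()
    return recall, precision

def pr_auc(recall, precision):
    """Area under the P-R curve via trapezoidal integration along the recall axis."""
    return float(np.trapz(precision, recall))
```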

7.2.2 Receiver Operating Characteristic Curve

The ROC curve was first introduced in the military field. Then, it was widely used in the medical field, and that’s also where the ROC concept in the machine learning domain originated.

The horizontal axis of a ROC curve is the False Positive Rate (FPR) and the vertical axis is the True Positive Rate (TPR). FPR and TPR are defined as follows,

$$\text{FPR} = \frac{FP}{N}, \qquad \text{TPR} = \frac{TP}{P} \tag{7.6}$$

where P is the number of positive samples, N is the number of negative samples, TP denotes the number of positive samples that the model correctly predicts as positive, and FP refers to the number of negative samples that the model wrongly predicts as positive.

The definition of the ROC curve may seem complicated, but the process of generating one is not difficult. Next, we show how to draw an ROC curve with an example, so readers can understand how an ROC curve is used to evaluate ranking model performance.

Like the P–R curve, the ROC curve is generated by continuously changing the model’s positive sample prediction threshold. Here is an example to explain the process.

Assume that there are 20 samples in the test set and that the model’s prediction outputs are shown in Table 7.1. The first column in the table is the sample index, the second column is the ground truth label, and the third column is the probability that the model assigns to the sample being positive. Samples are sorted from highest to lowest predicted probability. Before deciding the final positive and negative examples, a threshold needs to be specified: samples with a predicted probability greater than the threshold are considered positive examples, and samples with a predicted probability smaller than the threshold are considered negative examples. If the threshold is 0.9, only the first sample is predicted as a positive example, and all others are negative. The threshold here is also known as the cut-off point.

Table 7.1 An example of ranking model prediction outputs

| Sample # | Ground Truth Label | Model Predictions | Sample # | Ground Truth Label | Model Predictions |
|---|---|---|---|---|---|
| 1 | P | 0.9 | 11 | P | 0.4 |
| 2 | P | 0.8 | 12 | N | 0.39 |
| 3 | N | 0.7 | 13 | P | 0.38 |
| 4 | P | 0.6 | 14 | N | 0.37 |
| 5 | P | 0.55 | 15 | N | 0.36 |
| 6 | P | 0.54 | 16 | N | 0.35 |
| 7 | N | 0.53 | 17 | P | 0.34 |
| 8 | N | 0.52 | 18 | N | 0.33 |
| 9 | P | 0.51 | 19 | P | 0.30 |
| 10 | N | 0.505 | 20 | N | 0.1 |

The threshold is then swept from the highest score (actually from positive infinity, which corresponds to the origin of the ROC curve) down to the lowest score. Each threshold produces a pair of FPR and TPR values, which is plotted as a point on the ROC graph. Finally, the ROC curve is obtained by connecting all the points sequentially.

For this example, when the threshold is positive infinity, the model predicts all samples as negative examples, so both FPR and TPR are 0 and the first point of the curve is (0, 0). When the threshold is changed to 0.9, the model predicts that sample #1 is positive, and that sample is indeed a true positive, so TP = 1. Among the 20 samples, the total number of positive samples is P = 10, so TPR = TP/P = 1/10. No negative sample is wrongly predicted as positive, that is, FP = 0, and the total number of negative samples is N = 10, so FPR = FP/N = 0/10 = 0. This corresponds to the point (0, 0.1) on the ROC diagram. Continue changing the threshold in turn until all the key points are drawn, and then connect the points to get the final ROC curve, as shown in Figure 7.2.

Figure 7.2 ROC curve.

In fact, there is a more intuitive way to draw the ROC curve. First, count the numbers of positive and negative samples according to the sample labels; assume the number of positive samples is P and the number of negative samples is N. Next, set the interval of the horizontal axis to 1/N and the interval of the vertical axis to 1/P. Then, sort the samples by the model’s predicted probability (from high to low). Traverse the samples in turn and draw the ROC curve starting from the point (0, 0): draw a unit-interval segment along the vertical axis every time a positive sample is encountered, and a unit-interval segment along the horizontal axis for each negative sample. Finally, connect the last point with (1, 1), and the entire ROC curve is drawn.

Similar to the P–R curve, the AUC can be calculated after the ROC curve is generated, and it can be used to evaluate the ranking model’s performance in the recommender system.
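The following sketch implements the intuitive drawing method described above and computes the ROC AUC by trapezoidal integration; it assumes binary labels with at least one positive and one negative sample.

```python
import numpy as np

def roc_curve_points(labels, scores):
    """Sort by score (descending), then step up 1/P for every positive sample and
    right 1/N for every negative sample, starting from the point (0, 0)."""
    order = np.argsort(scores)[::-1]
    labels = np.asarray(labels)[order]
    P = labels.sum()
    N = len(labels) - P

    fpr, tpr = [0.0], [0.0]
    for y in labels:
        if y == 1:
            fpr.append(fpr[-1])              # positive sample: move up by 1/P
            tpr.append(tpr[-1] + 1 / P)
        else:
            fpr.append(fpr[-1] + 1 / N)      # negative sample: move right by 1/N
            tpr.append(tpr[-1])
    return fpr, tpr

def roc_auc(fpr, tpr):
    """Area under the ROC curve via trapezoidal integration."""
    return float(np.trapz(tpr, fpr))
```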

7.2.3 Mean Average Precision

Mean average precision (mAP) is another commonly used evaluation metric in recommender systems and information retrieval. This metric is actually an average of Average Precision (AP). Before calculating mAP, readers need to understand what average precision is.

Assume that the ranking results of a user test set by the recommender system are shown in Table 7.2, where 1 represents the positive sample, and 0 represents the negative sample.

Table 7.2 Example of ranking results

| Ranking List | N = 1 | N = 2 | N = 3 | N = 4 | N = 5 | N = 6 |
|---|---|---|---|---|---|---|
| Ground Truth Label | 1 | 0 | 0 | 1 | 1 | 1 |

In the previous section, we introduced how to calculate precision@N. Then what is the precision@N at each position of this ranking list? The results are shown in Table 7.3.

Table 7.3 Examples of precision@N calculation

| Ranking List | N = 1 | N = 2 | N = 3 | N = 4 | N = 5 | N = 6 |
|---|---|---|---|---|---|---|
| Ground Truth Label | 1 | 0 | 0 | 1 | 1 | 1 |
| Precision@N | 1/1 | 1/2 | 1/3 | 2/4 | 3/5 | 4/6 |

The calculation of AP averages only the Precision@N values at the positions where the ground truth label is positive, that is, AP = (1/1 + 2/4 + 3/5 + 4/6)/4 ≈ 0.6917. Then how is mAP calculated?

If the recommender system sorts the samples of each user in the test set, then we can get an AP value for each user. The average AP value of all users is then the mAP value.

It is worth noting that the calculation method of mAP is completely different from the calculation methods of the P–R curve and the ROC curve, because mAP needs to sort the samples for each user, while both P–R curve and ROC curve can be calculated with the sorted full test set. This difference needs special attention in the actual calculations.
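A minimal sketch of AP and mAP under this per-user convention; `per_user_ranked_labels` is an assumed input holding, for each user, the ground truth labels sorted by that user’s predicted scores.

```python
import numpy as np

def average_precision(ranked_labels):
    """AP for one user: average the Precision@N values at the positions where the
    ground truth label is positive, as in Table 7.3."""
    hits, precisions = 0, []
    for i, y in enumerate(ranked_labels, start=1):
        if y == 1:
            hits += 1
            precisions.append(hits / i)
    return float(np.mean(precisions)) if precisions else 0.0

def mean_average_precision(per_user_ranked_labels):
    """mAP: the mean of the per-user AP values."""
    return float(np.mean([average_precision(r) for r in per_user_ranked_labels]))

# Reproducing the example above:
# average_precision([1, 0, 0, 1, 1, 1]) == (1/1 + 2/4 + 3/5 + 4/6) / 4 ≈ 0.6917
```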

7.2.4 Selecting Reasonable Evaluation Metrics

In addition to the three metrics introduced above (the P–R curve, the ROC curve, and mAP), many other metrics are used in recommender system evaluation, such as Normalized Discounted Cumulative Gain (NDCG), coverage, diversity, and so on. In actual offline experiments, although it is necessary to evaluate the model from different angles, there is no need to pursue perfection in finding the “best” metric; choosing too many metrics can simply waste time. The purpose of offline evaluation is to quickly detect issues, eliminate unreliable candidates, and find promising candidates for online evaluation. Therefore, selecting two to four representative offline metrics based on the business scenario and conducting efficient offline experiments is the right path for offline evaluation.

7.3 Replay: An Offline Evaluation Method Aligned with the Online Environment

The first two sections introduced the offline evaluation methods and commonly used evaluation metrics of recommender systems. Traditional offline evaluation methods have been widely used in model experiments in academia. However, during model development in industry, can these methods (such as the holdout test and cross-validation) really measure a model’s impact on the company’s business goals objectively?

7.3.1 Logical Loop for Model Evaluation

To answer this question, it is necessary to revisit the core of model evaluation – how to evaluate a model and determine whether it is a “good” model? Figure 7.3 shows the logical relationship of each component in the model evaluation.

Figure 7.3 Logical relationship of each component in the model evaluation.

The key point of offline evaluation is to make the results of offline evaluation as close as possible to online ones. To achieve this goal, the offline evaluation process should simulate the online environment as much as possible. The online environment includes not only the online data environment, but also production settings such as model update frequency.

7.3.2 Dynamic Offline Evaluation Method

The disadvantage of the traditional offline evaluation method is that the evaluation process is static: the model is not updated during evaluation, which does not reflect actual conditions in production. Suppose a recommendation model is evaluated with one month of test data. If the evaluation process is static, then by the time the model predicts the data near the end of the month, it has effectively gone without updates for nearly 30 days. This is unrealistic for most industrial applications and leads to drift in the evaluation results. To solve this problem, the entire evaluation process needs to be made dynamic so that it is closer to the online environment.

The dynamic offline evaluation method first sorts the test samples in chronological order and then uses the model to predict the test samples in each subsequent time interval. At each model update point, the model incrementally learns from the samples observed before that point and then performs the subsequent evaluation with the updated model. The comparison between the traditional offline evaluation method and the dynamic offline evaluation method is shown in Figure 7.4.

Figure 7.4 Comparison of traditional offline evaluation method and dynamic offline evaluation method.

It is easy to see that the dynamic evaluation process is closer to the real online environment, and the evaluation results are closer to the objective situation. If the frequency of model updates continues to increase, the entire dynamic evaluation process becomes an online simulation process of sample playback one by one. This is the classic simulation offline evaluation method – replay.

In fact, replay is the only viable offline evaluation method for reinforcement learning models [Reference Li1]. Taking the DRN model introduced in Section 3.10 as an example, since the model needs to continuously receive online feedback and update itself online, the replay method must be used offline to simulate the model’s online feedback loop and update process.
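A minimal replay-style loop might look like the following sketch. The `predict` and `incremental_update` methods and the dictionary layout of the samples are assumptions made for illustration; the essential point is that each batch is evaluated before the model is allowed to learn from it, so no future information leaks into the predictions.

```python
import numpy as np

def replay_evaluate(model, samples, update_every, metric_fn):
    """Sort samples chronologically and replay them batch by batch: evaluate each
    batch first, then let the model incrementally learn from it, as it would online."""
    samples = sorted(samples, key=lambda s: s["timestamp"])
    scores = []
    for start in range(0, len(samples), update_every):
        batch = samples[start:start + update_every]
        preds = [model.predict(s["features"]) for s in batch]     # evaluate before updating
        scores.append(metric_fn([s["label"] for s in batch], preds))
        model.incremental_update(batch)                           # then update the model
    return float(np.mean(scores))
```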

7.3.3 Replay Evaluation Method Adopted in Netflix

The replay method performs offline testing by replaying online data streams. The principle of the evaluation method is not difficult to understand, but it will encounter some difficulties in actual engineering. The most critical point is that the samples used in a replay cannot contain any “future information” in the simulation data stream. This is to avoid the phenomenon called “data leakage.”

For example, suppose the replay method uses the sample data from August 1st to August 31st, and the prediction model uses historical CTR as one of its features. This feature can only be generated from historical data: the sample on August 20th can only use data from August 1st to August 19th to generate its historical CTR feature, and cannot use data from August 20th onward. If, for engineering simplicity, all sample data from August 1st to August 31st were used to generate features and the replay method were then applied, the evaluation result would not be meaningful, since the model would be using future knowledge to predict earlier samples.

In engineering implementation, Netflix built a complete set of data pipeline architecture (as shown in Figure 7.5) to support the replay evaluation method and gave it a very beautiful name – Time Machine.

Figure 7.5 Netflix’s offline evaluation data pipeline – Time Machine.

Apache Spark, Spark, the Apache Spark Logo and Apache are either registered trademarks or trademarks of the Apache Software Foundation.

It can be seen from the figure that the Time Machine runs once a day. The main function of its primary task, Snapshot Jobs, is to integrate the various logs, features, and data of that day into sample data for model training and evaluation. The date is used in the directory name, and the sample data is stored in S3, Amazon’s distributed storage service. A unified API is provided to external consumers, so that these data snapshots can be fetched for any requested time frame.

From the input of Snapshot Jobs, the information integrated by the Time Machine includes two parts:

(1) Context: relatively static information stored in Hive, such as user profiles, device information, item information, and so on.

(2) System log stream: the logs generated by the system in real time, including users’ viewing history, recommendation impressions, and user ratings. These logs are generated by different services and are ingested through Netflix’s unified data API, Prana.

Snapshot Jobs retrieve the context information from the context S3 bucket and the logs through Prana, then save the day’s data snapshot to S3 after data processing and feature generation.

After generating the daily data snapshot, it is no longer difficult to use the replay method for offline evaluation. This is because there is no need to perform heavy feature generation during the replay process, and the data snapshot information of the day can be directly used.

On the other hand, this design loses some flexibility. Since different models use different features, Snapshot Jobs cannot generate the required features for all models at once. If a particular model needs special features, the Time Machine has to generate another snapshot for that model based on the common data snapshot.

Based on the framework of the Time Machine, using samples from a certain period of time to perform a replay evaluation is equivalent to traveling back in time to that period. We hope readers can find their ideal model in this “wonderful” time travel.

7.4 A/B Test and Online Evaluation Metrics

No matter how closely the offline evaluation simulates the online environment, it is impossible to completely reproduce all the online conditions. For almost all internet companies, online A/B testing is the main testing method to validate the effectiveness of new components, new features, and new products.

7.4.1 What Is A/B Test?

A/B test, also known as split test or bucket test, is a randomized experiment. It usually divides the test population into a control group (A) and a treatment group (B). By varying a single variable, it compares the performance of the control and treatment groups and then draws conclusions based on the collected performance metrics. For models used in internet applications, users can be randomly divided into control and treatment groups; the new model is applied to the users in the treatment group, and the old model to the users in the control group. With appropriate data collection and analysis, the experimenter can compare the two groups on the selected online metrics.

Compared with offline evaluation, there are three main reasons why online A/B testing cannot be skipped:

  • Offline evaluation cannot completely eliminate the impact of data bias.

  • Offline evaluation cannot fully reproduce the online conditions. Generally speaking, offline evaluation often does not account for data latency, data loss, missing labels, and so on. Therefore, offline evaluation results often deviate somewhat from reality.

  • Some business metrics of the online system cannot be calculated in offline evaluation. Offline evaluation generally assesses the model itself and cannot directly obtain other metrics related to the business targets. Taking a new recommendation model as an example, offline evaluation often focuses on improving the ROC curve and P–R curve, while online evaluation can fully capture changes in user click-through rate, retention, page views (PV), and so on. These metrics can only be obtained through online A/B testing.

7.4.2 Bucketing Mechanism in A/B Testing

In the process of A/B test bucketing, it is necessary to consider the independence of the samples and the unbiasedness of the sampling method. The same user must be allocated to the same bucket throughout the entire test, and the user bucketing should be purely random to ensure that the samples in each bucket are unbiased.

In actual industrial online experiment scenarios, a website or application often needs to run multiple sets of A/B tests at the same time, such as A/B tests on different app UIs at the front end, different middleware efficiencies at the business layer, and different algorithms in the recommender system. To avoid interference between A/B tests at different levels, an effective testing principle must be formulated; otherwise, the evaluation results can be polluted and made misleading by improper division of experiment traffic. Google’s paper [Reference Tang2] describes in detail a mechanism of experiment traffic layering and partitioning that ensures the high availability of valuable online test traffic.

The mechanism of A/B testing layering and partitioning can be briefly described by two principles:

(1) Orthogonal traffic between layers.

(2) Mutually exclusive traffic within the same layer.

Orthogonal traffic means that the traffic from each experiment group in one layer is randomly re-partitioned when it enters the next layer, so that it is evenly distributed across the experiment groups of that layer.

Taking Figure 7.6 as an example, in the Layer X experiment the traffic is randomly and equally divided into two parts, X1 (blue) and X2 (white). In the Layer Y experiment, the traffic of X1 and X2 should be randomly and evenly distributed to the two buckets Y1 and Y2. If the redistribution of X1 and X2 traffic into Y1 and Y2 is imbalanced, the samples in Layer Y will be biased and the experimental results of Layer Y will be affected by Layer X. Therefore, the traffic passing through Layer X should be re-randomized and evenly distributed between Y1 and Y2.

Figure 7.6 Example of orthogonal traffic between layers.
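One common way to realize these two principles is salted hashing of the user ID, as in the sketch below; the salt names and bucket counts are illustrative and not taken from the Google paper.

```python
import hashlib

def assign_bucket(user_id, layer_salt, num_buckets):
    """Hash the user ID together with a per-layer salt: bucketing is deterministic for a
    given user within a layer (mutual exclusion), while different salts re-randomize the
    same traffic independently across layers (orthogonality)."""
    key = f"{layer_salt}:{user_id}".encode("utf-8")
    digest = int(hashlib.md5(key).hexdigest(), 16)
    return digest % num_buckets

# The same user always lands in the same bucket within Layer X, while Layer Y
# re-randomizes that traffic independently:
# assign_bucket("user_42", "layer_X", 2), assign_bucket("user_42", "layer_Y", 2)
```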

The meaning of mutually exclusive traffic in the same layer is as follows:

(1) If multiple sets of A/B tests are performed in the same layer, the traffic between different tests should not overlap.

(2) Within a single A/B test, the traffic of the treatment group and the control group should not overlap.

In user-based A/B testing, “mutual exclusion” means any particular user should only exist in a single treatment group per experiment. Especially for the recommender system, the consistency of user experience is very important, since it usually takes the user some time to adapt to the new recommendation experience. Therefore, it is necessary to ensure that the same user is always assigned to the same group in A/B testing.

The orthogonal and mutually exclusive principles of A/B testing together ensure the objectivity of A/B testing evaluation. Then, how should we choose the evaluation metrics for online A/B testing?

7.4.3 Metrics for Online A/B Testing

Generally speaking, the A/B testing is the last test before the model goes online. The model that passes the A/B test will directly serve the online users to meet the company’s business goals. Therefore, the metrics of A/B testing should be consistent with the business key performance indicator (KPI).

Table 7.4 lists the main evaluation metrics for online A/B testing of e-commerce, news and video recommendation models.

Table 7.4 Main evaluation metrics for online A/B testing in various recommender systems

| Recommender System Category | Online A/B Metrics |
|---|---|
| E-commerce | Click-through rate, conversion rate, and unit customer spending |
| News | Retention rate (number of users still active after x days / total number of users x days earlier), average session duration, average number of clicks |
| Video | Play completion rate (play time / video length), average play time, total play time |

Readers should have noticed that the metrics of online A/B testing are quite different from those of offline evaluation (such as AUC, F1-score, and so on). Offline evaluation does not have the conditions to directly calculate the business KPIs, so the next best thing is to choose model-related metrics just for technical evaluation purposes. However, at the company level, there is more interest in the business KPIs that can drive business growth. Therefore, when an online testing environment is available, it is necessary to use A/B testing to verify the effect of the model on improving business KPI. In this case, the role of online A/B testing can never be replaced by offline evaluation.

7.5 Fast Online Evaluation Method: Interleaving

For web applications backed by many recommendation models, a large number of A/B tests are required to continuously iterate on and optimize the recommender system and verify the effect of new algorithms. However, online A/B testing inevitably takes up valuable online traffic resources and can negatively impact user experience. This creates a conflict between the growing demand for A/B tests and the limited resources available for them.

In response to these problems, a fast online evaluation method – Interleaving – was formally proposed by Microsoft [Reference Radlinski and Craswell3] in 2013, and has been successfully applied in production by some companies such as Netflix [Reference Parks4]. Specifically, the Interleaving method is used as the pre-selection stage of the online A/B test, as shown in Figure 7.7, to quickly narrow down the candidate algorithms, and select a small number of “promising” recommendation algorithms from a large number of initial ideas. Then, the traditional A/B testing is performed on the narrowed set of models to measure their long-term impact on user behavior.

Figure 7.7 Using Interleaving for rapid online testing.

In Figure 7.7, light bulbs represent candidate algorithms, with the optimal winning algorithm shown by a red bulb. The Interleaving method can quickly narrow down the initial set of candidate algorithms, determining the best one faster than traditional A/B testing. We’ll use Netflix’s application scenario as an example to illustrate the principles and characteristics of the Interleaving method.

7.5.1 Statistical Issues with Traditional A/B Testing

In addition to efficiency limitations, traditional A/B testing also encounters certain issues with statistical significance. Let’s illustrate this with a classic A/B testing example.

Imagine designing an A/B test to assess whether there’s a taste preference for Coca-Cola over Pepsi among users. In a traditional setup, participants would be randomly divided into two groups for a blind taste test (where brand labels are hidden). Group A would be given only Coca-Cola, while Group B would only get Pepsi. Consumption over a set period would then indicate a preference for one brand over the other.

While generally effective, this test has potential flaws:

In the test population, consumption habits vary widely, from those who rarely drink soda to heavy daily consumers. Heavy soda drinkers make up a small portion of the sample but may contribute disproportionately to overall consumption. This imbalance could skew results if either group has slightly more heavy consumers, leading to a distorted conclusion.

This issue also arises in online applications like Netflix. A small number of highly active users account for a significant portion of total watch time. So, if more of these active users end up in Group A than in Group B (or vice versa), it can impact the A/B test outcome and obscure the true model performance.

How to address this issue? One solution is to avoid dividing the test population into separate groups. Instead, allow all participants to choose freely between Coca-Cola and Pepsi (while still ensuring the brands remain unlabeled but distinct). At the end of the test, we can calculate each participant’s consumption ratio between Coca-Cola and Pepsi, then average these ratios to get an overall preference.

Advantages of this approach include:

  1. (1) It eliminates imbalances in user characteristics between groups.

  2. (2) By assigning equal weight to each participant, it minimizes the impact of heavy consumers on the results.

This approach, where all test options are presented simultaneously to participants and preferences are used to derive evaluation results, is known as the Interleaving method.

7.5.2 Implementing the Interleaving Method

Figure 7.8 illustrates the differences between traditional A/B testing and the interleaving method.

Figure 7.8 Comparison of traditional A/B testing and interleaving method [Reference Parks4].

In a traditional A/B test, Netflix selects two groups of subscribers: one group is served by algorithm A, and the other by algorithm B.

In contrast, in the Interleaving method, there is only one set of subscribers who receive alternate rankings generated by mixing algorithm A and algorithm B.

This allows users to see the recommendation results of both algorithms A and B at the same time in a single row (users cannot tell whether an item was recommended by algorithm A or B), and the performance of the two models is then measured using online metrics such as watch time.

When using the Interleaving method for testing, position bias needs to be considered: the method should prevent a particular algorithm from always being ranked first. Therefore, it is necessary to let algorithm A and algorithm B take the lead alternately with equal probability. This is similar to a casual sports game in which two captains toss a coin to decide who picks first and then alternately draft players. This alternating draft approach is depicted in Figure 7.9.

Figure 7.9 Interleaving method on a video ranking page [Reference Parks4].
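As an illustration of this alternating draft, here is a sketch of team-draft interleaving, one widely used variant; it is not necessarily Netflix’s exact implementation. The returned attribution map is what allows watch time (or clicks) to be credited back to algorithm A or algorithm B.

```python
import random

def team_draft_interleave(ranking_a, ranking_b, length, seed=0):
    """Mix two rankings into one list: at each round a coin toss decides which algorithm
    drafts first, and each draft takes that algorithm's highest-ranked unused item.
    Returns the interleaved list and a map of item -> contributing algorithm."""
    rng = random.Random(seed)
    interleaved, credit, used = [], {}, set()

    def draft(ranking, team):
        for item in ranking:
            if item not in used:             # take the highest-ranked item not yet placed
                used.add(item)
                interleaved.append(item)
                credit[item] = team
                return

    all_items = set(ranking_a) | set(ranking_b)
    while len(interleaved) < length and len(used) < len(all_items):
        order = [("A", ranking_a), ("B", ranking_b)]
        if rng.random() < 0.5:
            order.reverse()                  # coin toss: who leads this round
        for team, ranking in order:
            if len(interleaved) < length:
                draft(ranking, team)
    return interleaved, credit
```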

After clarifying the specific evaluation process of the Interleaving method, it is necessary to verify whether this method can replace the traditional A/B test and whether it will produce wrong results. Netflix has verified the Interleaving method from two aspects – sensitivity and correctness.

7.5.3 Sensitivity Comparison of Interleaving and Traditional A/B Testing

Netflix’s sensitivity experiments measured the sample population size required with the Interleaving method to achieve the same power as traditional A/B testing. Since the resources of online testing are often limited, experimenters always hope to use fewer online resources (for example, sample size, experiment period, and so on) for fast model evaluation.

Figure 7.10 shows the experimental results of the sensitivity comparison. The horizontal axis is the number of samples involved in the experiment, and the vertical axis is the p-value. It can be seen that the Interleaving method needs about 10^3 samples to determine whether algorithm A is better than B, while the traditional A/B test requires about 10^5 samples to reduce the p-value to below 5%. In other words, the Interleaving method needs only about 1% of the user sample to determine which algorithm is better. This means that roughly 100 Interleaving experiments can be run with the resources needed by a single traditional A/B test, which greatly enhances the capacity of online testing.

Figure 7.10 Sensitivity test for Interleaving and traditional A/B Testing [Reference Parks4].

7.5.4 Correlation of Metrics in Interleaving and A/B Testing

In addition to sensitivity, the consistency of results between the Interleaving method and A/B testing is the other key factor in determining the feasibility of Interleaving.

Figure 7.11 shows the correlation between the metrics from the Interleaving method and the A/B testing. Each data point represents a recommendation model. There is a very strong correlation between the Interleaving metric and the A/B test evaluation metric, which demonstrates that the algorithm that wins in the Interleaving experiment is also very likely to win in the subsequent A/B test.

Figure 7.11 The correlation of Interleaving measurement with the A/B testing metric [Reference Parks4].

It should be noted that although the correlation between the two test metrics is extremely strong, the pages displayed in an Interleaving experiment are not the actual production pages generated by algorithm A or algorithm B alone. Interleaving therefore cannot fully replace A/B testing when measuring the actual impact of a new algorithm; A/B testing remains the most authoritative method for obtaining a comprehensive evaluation of model performance.

7.5.5 Advantages and Disadvantages of the Interleaving Method

The advantages of the Interleaving method are that it requires fewer samples, the test speed is fast, and the results are not significantly different from traditional A/B tests. However, readers should be clear that the Interleaving method also has certain limitations. The limitations are mainly in the following two aspects:

(1) The engineering framework is more complex than that of traditional A/B testing. The experimental logic of the Interleaving method is intertwined with the business logic, so the business logic may be affected. In order to implement the Interleaving method, a large number of auxiliary data identifiers must be added throughout the data flow, which is a real difficulty in engineering implementation.

(2) The Interleaving method only measures users’ relative preference between recommendation results and cannot obtain the true performance of an algorithm. If you want to know how much an algorithm improves the business KPIs, the Interleaving method alone cannot answer this. For this reason, Netflix designed the two-stage Interleaving + A/B testing structure to complete its online testing framework.

7.6 Recommender Systems Evaluation Architecture

This chapter has introduced the main evaluation methods and metrics of recommender systems. These evaluation methods are not independent of one another. A mature evaluation architecture should balance evaluation efficiency and correctness: it needs to use fewer resources to quickly screen out the better-performing models. This section discusses systematically how to build a mature recommender system testing and evaluation architecture from the evaluation methods introduced above.

As Section 7.3 discusses, for a company, the most fair and reasonable evaluation method is to conduct online A/B testing to evaluate whether the model can better achieve the business goals of the company or the team.

In that case, why not validate every model improvement with an online A/B test? The reason was given in Section 7.5 when introducing the Interleaving method: online A/B testing takes up a large amount of online traffic and may negatively impact user experience, so the limited online testing bandwidth is far from enough to satisfy the needs of model iteration and development. In addition, online tests often need to last for days or even weeks, which would greatly slow down the model iteration cycle.

Because of the limitations of online testing, offline testing has become the next best choice for model evaluation. Offline testing can use nearly unlimited offline computing resources to quickly obtain evaluation results, thereby quickly achieving iterative optimization of the model.

Between online A/B testing and traditional offline testing, there are intermediate evaluation methods such as Replay and Interleaving. The Replay method simulates the online test process in the offline environment to the greatest possible extent, and the Interleaving method establishes a fast online testing environment. Together, these multilevel evaluation and testing methods constitute a complete recommender system evaluation architecture, as shown in Figure 7.12, achieving a balance between evaluation efficiency and correctness.

Figure 7.12 Recommender system evaluation architecture.

In the schematic diagram of the evaluation architecture shown in Figure 7.12, the left side shows different evaluation methods, and the right side is the pyramid-shaped model screening process. It can be seen that the lower the level, the more models need to be screened, and the more improvement ideas need to be verified. Due to the huge number of possibilities, evaluation efficiency has become a more critical consideration, and the requirements for evaluation correctness are not so strict. At this time, a more efficient offline evaluation method should be used.

As candidate models are screened out layer by layer, the closer we get to the final launch, the stricter the requirements on evaluation correctness become. Before the model is officially launched, the final evaluation should be done with an A/B test, which is closest to the real production experience. Once the most convincing online measurements have been produced, the final model can be launched and the iterative cycle of model improvement completed.

References

1. Li, Lihong, et al. Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms. Proceedings of the 4th ACM International Conference on Web Search and Data Mining, Hong Kong, China, February 9–12, 2011.
2. Tang, Diane, et al. Overlapping experiment infrastructure: More, better, faster experimentation. Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, July 25–28, 2010.
3. Radlinski, Filip, and Craswell, Nick. Optimized interleaving for online retrieval evaluation. Proceedings of the 6th ACM International Conference on Web Search and Data Mining, Rome, Italy, February 4–8, 2013.
4. Parks, Joshua, et al. Innovating Faster on Personalization Algorithms at Netflix Using Interleaving. Netflix Technology Blog, 2017. https://netflixtechblog.com/interleaving-in-online-experiments-at-netflix-a04ee392ec55