I now turn to how interactive search systems should be evaluated, including the measures that should be employed, the methodologies, and the data and tools needed to perform these evaluations. This is particularly important when systems are compared across multiple experimental sites and tested at scale with millions of searchers. Web search providers need to understand the performance of their systems at scale, across a diverse set of information needs and large populations. As a result, methods that infer searcher preferences solely via retrospective analysis of logged search activity (e.g., which results or interface items receive the most attention given variations in ranking and/or interface presentation) are particularly attractive. More sophisticated experimental apparatuses allow different measures of system performance to be computed and a more complete sense of searcher performance to be attained. There are also a number of alternatives to computing metrics from behavior, such as capturing labels directly from searchers (in situ) at search time, and measuring signals such as cognitive load and affect to enrich signals based only on the explicit actions that searchers perform.
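The retrospective analysis of logged search activity described above can be sketched as follows. This is a minimal illustration, not any provider's actual pipeline; the log format, variant labels, and `position_ctr` helper are all hypothetical, standing in for the much richer instrumentation real systems use.

```python
# Illustrative sketch (hypothetical log format): compare two ranking
# variants by computing per-position click-through rates (CTR) from
# retrospectively logged search impressions.
from collections import defaultdict

# Each hypothetical log entry: (ranking variant, rank position shown, clicked?)
log = [
    ("A", 1, True),  ("A", 2, False), ("A", 3, False),
    ("A", 1, True),  ("A", 2, True),  ("A", 3, False),
    ("B", 1, False), ("B", 2, True),  ("B", 3, False),
    ("B", 1, True),  ("B", 2, True),  ("B", 3, True),
]

def position_ctr(entries):
    """Fraction of impressions at each rank position that were clicked."""
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for _variant, pos, was_clicked in entries:
        shown[pos] += 1
        clicked[pos] += int(was_clicked)
    return {pos: clicked[pos] / shown[pos] for pos in sorted(shown)}

for variant in ("A", "B"):
    entries = [e for e in log if e[0] == variant]
    print(variant, position_ctr(entries))
# → A {1: 1.0, 2: 0.5, 3: 0.0}
# → B {1: 0.5, 2: 1.0, 3: 0.5}
```

Attention concentrated at lower ranks for one variant (as for "B" here) is the kind of behavioral signal such analyses surface; the limitation the passage notes is that clicks alone reveal nothing about why searchers behaved as they did.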