12 - Data, Tools, and Privacy

from Part III - Evaluation

Published online by Cambridge University Press:  05 March 2016

Ryen W. White
Affiliation:
Microsoft Research

Summary

Many of the methods covered in this book depend on the availability of data on how people interact with search systems. It is therefore important to discuss how searcher data are collected and what data are available for research purposes. A central concern in mining, analyzing, and applying these data is searcher privacy, which permeates all aspects of collection and use, from obtaining searchers' consent to collect the data at the outset to de-identification, aggregation, and restrictions on sharing and applying the data (Horvitz and Mulligan, 2015). The collection of such interaction data is standard practice for large commercial entities, such as Web search engines, which use the data to understand how people are interacting with their services and to improve the user experience. Because of privacy concerns, once the data are collected, they usually cannot be shared with external parties. Efforts to release data (e.g., by America Online in 2006) have led to serious privacy breaches associated with a failure to completely anonymize the dataset. Incidents such as this make future broad data releases unlikely. Limited releases under license to researchers and the extreme anonymization of datasets have been used as strategies to address privacy challenges and promote research into behavioral analysis and user modeling.

In this chapter, I discuss the need for the shared resources (e.g., datasets), tools (e.g., logging support), and infrastructure necessary to build and evaluate competitive search systems. These pillars are important when comparing or coordinating the performance of interactive search systems across multiple experimental sites. In the context of the TREC Interactive Track, in which a single search system was used as a baseline at all sites, Lagergren and Over (1998) described an experimental design for cross-site comparisons of experimental results: a matrix design to which participating sites must strictly adhere. The design addresses issues such as two-way interactions and effects specific to how the experiment was conducted at a particular site. This involved significant coordination effort and was still focused on comparing systems. Important alternative goals include advancing our understanding of search behavior, improving the design of systems to support searching, and facilitating comparability between laboratory studies performed at different sites.

Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2016
