Mining, analyzing, and modeling text written on mobile devices

Published online by Cambridge University Press:  10 October 2019

K. Vertanen*
Affiliation:
Michigan Technological University, Houghton, MI, USA
P.O. Kristensson
Affiliation:
University of Cambridge, Cambridge, UK
*Corresponding author. Email: vertanen@mtu.edu

Abstract

We present a method for mining the web for text entered on mobile devices. Using searching, crawling, and parsing techniques, we locate text that can be reliably identified as originating from 300 mobile devices. This includes 341,000 sentences written on iPhones alone. Our data enables a richer understanding of how users type “in the wild” on their mobile devices. We compare text and error characteristics of different device types, such as touchscreen phones, phones with physical keyboards, and tablet computers. Using our mined data, we train language models and evaluate these models on mobile test data. A mixture model trained on our mined data, Twitter, blog, and forum data predicts mobile text better than baseline models. Using phone and smartwatch typing data from 135 users, we demonstrate our models improve the recognition accuracy and word predictions of a state-of-the-art touchscreen virtual keyboard decoder. Finally, we make our language models and mined dataset available to other researchers.

Information

Type
Article
Copyright
© Cambridge University Press 2019
