from Part III - Information geometry
Published online by Cambridge University Press: 27 May 2010
Abstract
High-dimensional structured data such as text and images is often poorly understood and misrepresented in statistical modelling. Typical approaches to modelling such data involve, either explicitly or implicitly, arbitrary geometric assumptions. In this chapter, we consider statistical modelling of non-Euclidean data whose geometry is obtained by embedding the data in a statistical manifold. The resulting models perform better than their Euclidean counterparts on real-world data and draw an interesting connection between Čencov and Campbell's axiomatic characterisation of the Fisher information and the recently proposed diffusion kernels and square root embedding.
Introduction
Geometry is ubiquitous in many aspects of statistical modelling. During the last half century a geometrical theory of statistical inference has been constructed by Rao, Efron, Amari, and others. This theory, commonly referred to as information geometry, describes many aspects of statistical modelling through the use of Riemannian geometric notions such as distance, curvature and connections (Amari and Nagaoka 2000). Information geometry has mostly been concerned with geometric interpretations of asymptotic inference. Focusing on the geometry of parametric statistical families ρ = {ρθ : θ ∈ Θ}, information geometry has had relatively little influence on the geometrical analysis of data. In particular, it has largely ignored the role of the geometry of the data space X in statistical inference and algorithmic data analysis.
On the other hand, the recent growth in computing resources and data availability has led to widespread analysis and modelling of structured data such as text and images. Such data does not naturally lie in ℝⁿ, and the Euclidean distance and its corresponding geometry do not describe it well.
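As a rough illustration of this point, the sketch below compares the Euclidean distance with the geodesic distance induced by the Fisher information metric on the multinomial simplex, computed through the square root embedding mentioned in the abstract. It assumes that documents are represented as normalised term-frequency vectors; the function names and that representation are illustrative choices, not taken from the chapter. Under the map p ↦ 2√p the simplex is sent onto part of a sphere of radius 2, so the geodesic distance is 2 arccos(Σᵢ √(pᵢ qᵢ)).

```python
import math


def normalise(counts):
    """Turn non-negative term counts into a point on the probability simplex."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must have a positive sum")
    return [c / total for c in counts]


def euclidean_distance(p, q):
    """Ordinary Euclidean distance between two probability vectors."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def fisher_geodesic_distance(p, q):
    """Geodesic distance on the simplex under the Fisher information metric.

    Uses the square root embedding p -> 2*sqrt(p), which maps the simplex
    onto part of a sphere of radius 2, so the distance is an arc length.
    """
    affinity = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    # Clamp to [0, 1] to guard against floating-point round-off before acos.
    affinity = min(1.0, max(0.0, affinity))
    return 2.0 * math.acos(affinity)


if __name__ == "__main__":
    # Hypothetical term counts for two short documents over a 3-word vocabulary.
    doc_a = normalise([8, 1, 1])
    doc_b = normalise([1, 1, 8])
    print("Euclidean:", euclidean_distance(doc_a, doc_b))
    print("Fisher geodesic:", fisher_geodesic_distance(doc_a, doc_b))
```

The two distances generally rank pairs of documents differently, because the Fisher geodesic distance stretches the region near the boundary of the simplex, where some term probabilities are close to zero.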