In this chapter we describe a framework of density-ratio estimation by density-ratio fitting under the Bregman divergence (Bregman, 1967). This framework is a natural extension of the least-squares approach described in Chapter 6, and it includes various existing approaches as special cases (Sugiyama et al., 2011a).
In Section 7.1 we first describe the framework of density-ratio fitting under the Bregman divergence. Then, in Section 7.2, we show that various existing approaches can be accommodated in this framework, such as kernel mean matching (see Section 3.3), logistic regression (see Section 4.2), the Kullback–Leibler importance estimation procedure (see Section 5.1), and least-squares importance fitting (see Section 6.1). We then show other views of the density-ratio fitting framework in Section 7.3. Furthermore, in Section 7.4, a robust density-ratio estimator is derived as an instance of the density-ratio fitting approach based on Basu's power divergence (Basu et al., 1998). The chapter is concluded in Section 7.5.
Basic Framework
A basic idea of density-ratio fitting is to fit a density-ratio model r(x) to the true density-ratio function r*(x) under some divergence (Figure 7.1). At first glance, this density-ratio fitting problem may look equivalent to the regression problem, which is aimed at learning a real-valued function [see Section 1.1.1 and Figure 1.1(a)]. However, density-ratio fitting is essentially different from regression because samples of the true density-ratio function are not available. Here we employ the Bregman (BR) divergence (Bregman, 1967) for measuring the discrepancy between the true density-ratio function and the density-ratio model.
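To make this concrete, the following is a brief sketch of the resulting fitting criterion, written under assumed notation (the numerator and denominator densities are denoted by $p_{\mathrm{nu}}$ and $p_{\mathrm{de}}$, with samples $\{x^{\mathrm{nu}}_i\}_{i=1}^{n_{\mathrm{nu}}}$ and $\{x^{\mathrm{de}}_j\}_{j=1}^{n_{\mathrm{de}}}$, and $\partial f$ denotes the derivative of the differentiable convex function $f$ defining the BR divergence); the precise development appears in the following sections. The BR divergence from $t^*$ to $t$ is
\[
\mathrm{BR}_f(t^* \,\|\, t) = f(t^*) - f(t) - \partial f(t)\,(t^* - t),
\]
and averaging it over the denominator density measures the discrepancy between the true ratio $r^*$ and the model $r$:
\[
\mathrm{BR}_f(r^* \,\|\, r) = \int p_{\mathrm{de}}(x)\Bigl[ f\bigl(r^*(x)\bigr) - f\bigl(r(x)\bigr) - \partial f\bigl(r(x)\bigr)\bigl(r^*(x) - r(x)\bigr) \Bigr]\,\mathrm{d}x.
\]
Discarding the term that does not depend on $r$ and using $p_{\mathrm{de}}(x)\,r^*(x) = p_{\mathrm{nu}}(x)$, an empirical approximation based on the two sample sets is
\[
\widehat{\mathrm{BR}}_f(r) = \frac{1}{n_{\mathrm{de}}} \sum_{j=1}^{n_{\mathrm{de}}} \Bigl[ \partial f\bigl(r(x^{\mathrm{de}}_j)\bigr)\, r(x^{\mathrm{de}}_j) - f\bigl(r(x^{\mathrm{de}}_j)\bigr) \Bigr] - \frac{1}{n_{\mathrm{nu}}} \sum_{i=1}^{n_{\mathrm{nu}}} \partial f\bigl(r(x^{\mathrm{nu}}_i)\bigr).
\]
For instance, choosing $f(t) = (t-1)^2/2$ reduces this criterion to a squared-error fit, which is one way to see how the least-squares approach of Chapter 6 arises as a special case.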
Machine learning is aimed at developing systems that learn. The mathematical foundations of machine learning and its real-world applications have been extensively explored over recent decades. Various machine learning tasks, such as regression and classification, can typically be solved by estimating the probability distributions behind data. However, estimating probability distributions is one of the most difficult problems in statistical data analysis, and thus solving machine learning tasks without going through distribution estimation is a key challenge in modern machine learning.
So far, various algorithms have been developed that do not involve distribution estimation but solve the target machine learning tasks directly. The support vector machine is a successful example along this line: rather than estimating the data-generating distributions, it directly obtains the class-decision boundary, which is sufficient for classification. However, developing such an excellent algorithm for each machine learning task can be highly costly and difficult.
To overcome these limitations of current machine learning research, we introduce and develop a novel paradigm called density-ratio estimation, in which the ratio of probability densities is estimated instead of the probability distributions themselves for statistical data processing. The density-ratio approach covers various machine learning tasks, for example, non-stationarity adaptation, multi-task learning, outlier detection, two-sample tests, feature selection, dimensionality reduction, independent component analysis, causal inference, conditional density estimation, and probabilistic classification. Thus, density-ratio estimation is a versatile tool for machine learning. This book is aimed at introducing the mathematical foundations, practical algorithms, and applications of density-ratio estimation.