We consider a finite mixture of Gaussian regression models for high-dimensional
heterogeneous data where the number of covariates may be much larger than the sample size.
We propose to estimate the unknown conditional mixture density by an
ℓ1-penalized maximum likelihood estimator. We provide
an ℓ1-oracle inequality satisfied by this Lasso estimator for
the Kullback–Leibler loss. In particular, we give a condition on the regularization
parameter of the Lasso to obtain such an oracle inequality. Our aim is twofold: to extend
the ℓ1-oracle inequality established by Massart and Meynet
[12] in the homogeneous Gaussian linear
regression case, and to present a result complementary to that of Städler et al.
[18], by studying the Lasso for its
ℓ1-regularization properties rather than considering it as a
variable selection procedure. Our oracle inequality is deduced from a finite mixture
Gaussian regression model selection theorem for ℓ1-penalized
maximum likelihood conditional density estimation, which is inspired by Vapnik’s method
of structural risk minimization [23] and by the
theory of model selection for maximum likelihood estimators developed by Massart in [11].
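For concreteness, here is a minimal sketch of the setting, with notation chosen for illustration only (the precise parametrization and assumptions are stated in the paper). Given observations $(x_i, y_i)_{1 \le i \le n}$ with $x_i \in \mathbb{R}^p$, the conditional density of the response is modelled by a mixture of $K$ Gaussian regressions,
\[
s_{\xi}(y \mid x) \;=\; \sum_{k=1}^{K} \pi_k \, \varphi\bigl(y;\, \beta_k^{\top} x,\, \sigma_k^{2}\bigr),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1,
\]
where $\varphi(\,\cdot\,; \mu, \sigma^{2})$ denotes the Gaussian density with mean $\mu$ and variance $\sigma^{2}$, and $\xi$ collects the proportions $\pi_k$, the regression vectors $\beta_k \in \mathbb{R}^{p}$ and the variances $\sigma_k^{2}$. The Lasso estimator considered is then the ℓ1-penalized maximum likelihood estimator
\[
\hat{s}^{\mathrm{Lasso}}(\lambda) \;\in\; \operatorname*{arg\,min}_{\xi}
\Bigl\{ -\frac{1}{n} \sum_{i=1}^{n} \ln s_{\xi}(y_i \mid x_i)
\;+\; \lambda \sum_{k=1}^{K} \lVert \beta_k \rVert_{1} \Bigr\},
\]
where $\lambda > 0$ is the regularization parameter. The ℓ1-oracle inequality bounds the Kullback–Leibler risk of $\hat{s}^{\mathrm{Lasso}}(\lambda)$ by the best trade-off, over the parameters $\xi$, between the approximation error and an ℓ1-complexity term, provided $\lambda$ is chosen large enough, which is the condition on the regularization parameter mentioned above.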