Genetic analysis of complex traits via Bayesian variable selection: the utility of a mixture of uniform priors

Published online by Cambridge University Press: 18 July 2011

TIMO KNÜRR*
Affiliation:
Department of Mathematics and Statistics, P.O. Box 68, FIN-00014 University of Helsinki, Finland
ESA LÄÄRÄ
Affiliation:
Department of Mathematical Sciences/Statistics, P.O. Box 3000, FIN-90014 University of Oulu, Finland
MIKKO J. SILLANPÄÄ
Affiliation:
Department of Mathematics and Statistics, P.O. Box 68, FIN-00014 University of Helsinki, Finland; Department of Agricultural Sciences, P.O. Box 28, FIN-00014 University of Helsinki, Finland
*Corresponding author: Department of Mathematics and Statistics, P.O. Box 68, FIN-00014 University of Helsinki, Finland. Tel: +358-9-191 51526. Fax: +358-9-191 51400. E-mail: Timo.Knurr@helsinki.fi

Summary

A new estimation-based Bayesian variable selection approach is presented for genetic analysis of complex traits based on linear or logistic regression. By assigning a mixture of uniform priors (MU) to genetic effects, the approach provides an intuitive way of specifying the hyperparameters that control the selection of multiple influential loci. It aims to avoid the difficulty of interpreting the assumptions made when specifying priors. The method is compared on two real datasets with two other approaches: stochastic search variable selection (SSVS) and a re-formulation of Bayes B utilizing indicator variables and adaptive Student's t-distributions (IAt). The Markov chain Monte Carlo (MCMC) sampling performance of the three methods is evaluated using the publicly available software OpenBUGS (model scripts are provided in the Supplementary material). The sensitivity of MU to the specification of hyperparameters is assessed in one of the data examples.

Information

Type
Research Papers
Copyright
Copyright © Cambridge University Press 2011

Fig. 1. Prior density of an effect size βm with probability of marker exclusion p0, border value b and upper limit l.
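The density described in this caption can be sketched as follows. This is a minimal illustration, not the authors' OpenBUGS script: it assumes the mixture places mass p0 uniformly on (−b, b), the region in which a marker counts as excluded, and spreads the remaining mass 1 − p0 uniformly and symmetrically over b ≤ |βm| ≤ l. The function name and the hyperparameter values in the usage note are hypothetical.

```python
def mu_prior_density(beta, p0, b, l):
    """Density at `beta` of an assumed three-component uniform mixture.

    Assumed form (not taken verbatim from the paper):
      mass p0       spread uniformly over |beta| <  b      (marker excluded)
      mass 1 - p0   spread uniformly over b <= |beta| <= l (marker occupied)
    """
    if not (0 < b < l and 0 <= p0 <= 1):
        raise ValueError("require 0 < b < l and 0 <= p0 <= 1")
    a = abs(beta)
    if a < b:
        # Inner component: total width 2b.
        return p0 / (2 * b)
    if a <= l:
        # Two outer components: total width 2(l - b).
        return (1 - p0) / (2 * (l - b))
    # No prior mass beyond the upper limit l.
    return 0.0
```

For example, with the hypothetical values p0 = 0.99, b = 0.05 and l = 2, `mu_prior_density(0.0, 0.99, 0.05, 2.0)` evaluates the tall inner plateau, while `mu_prior_density(1.0, 0.99, 0.05, 2.0)` evaluates the low outer plateau.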

Fig. 2. Comparison of prior distributions in IAt (blue), SSVS (red) and MU (black). The curves indicate the CDF of the prior distributions assigned to gene effects βm in the analysis of the Barley data (left) and the CF data (right).

Fig. 3. Results for the Barley data. Left panel (a–c): Bayes factors (BFs) for marker occupancy on logarithmic scale. The 12 BFs reported in Table 1 are not shown. Dashed lines indicate the borders of the BF categories of strength of evidence (see section 4). Right panel (d–f): Bland–Altman plots for effect sizes of 127 markers.

Table 1. Comparison of Bayes factors (BFs) for marker occupancy and ranks in the three competing models (Barley data). Results are shown for the 12 markers that rank among the 10 markers with the highest BFs in at least one model.

Fig. 4. Results for the Barley data. Left panel (a–c): number of switches per minute during MCMC simulation for 127 marker indicators. Right panel (d–f): Bland–Altman plots for ESS/min for 127 posterior means of effect sizes.

Fig. 5. Marker occupancy probabilities for the eight MCMC chains A–H used to assess the sensitivity of model MU in the analysis of the Barley data. The vertical lines indicate the markers with the highest Bayes factors (BFs) for marker occupancy as reported in Table 1. The horizontal lines indicate the probability levels corresponding to BFs of 10 under the respective values of p0 (0·99 in chains A–D and 0·79 in chains E–H).
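The probability level that corresponds to a given BF can be worked out from the prior odds. The sketch below assumes the usual definition of the BF as posterior odds divided by prior odds of occupancy, with prior occupancy probability 1 − p0; the function name is hypothetical and the code is an illustration rather than the authors' own calculation.

```python
def occupancy_prob_at_bf(bf, p0):
    """Posterior occupancy probability at which the Bayes factor equals `bf`.

    Assumes BF = posterior odds / prior odds of occupancy, with
    prior occupancy probability 1 - p0.
    """
    if not 0 < p0 < 1:
        raise ValueError("p0 must lie strictly between 0 and 1")
    prior_odds = (1 - p0) / p0
    posterior_odds = bf * prior_odds
    # Convert odds back to a probability.
    return posterior_odds / (1 + posterior_odds)


# Values of p0 taken from the caption (chains A-D and chains E-H).
for p0 in (0.99, 0.79):
    print(p0, round(occupancy_prob_at_bf(10, p0), 3))
```

Under these assumptions a BF of 10 corresponds to an occupancy probability of about 0.092 when p0 = 0.99 and about 0.727 when p0 = 0.79, which shows how strongly the threshold depends on the prior.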

Table 2. Prior specifications and posterior estimates for the eight MCMC chains A–H used to evaluate the sensitivity of model MU in the analysis of the Barley data. The summary statistic $N_Q = \sum_{m=1}^{M} S_m$ is the number of occupied markers and has prior mean $E(N_Q) = M(1 - p_0)$. The lower and upper limits of the reported credible intervals (95% CI) are the 2·5% and 97·5% quantiles, respectively. The point estimate used for the residual variance $\sigma^2$ is the MAP estimate.
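The prior mean of the number of occupied markers can be evaluated directly. The sketch below takes M = 127 from the marker count quoted in the Barley figure captions and the two p0 values from the Fig. 5 caption. The binomial probability function additionally assumes that the marker indicators are independent a priori, which is an assumption made here for illustration; both function names are hypothetical.

```python
from math import comb


def prior_mean_occupied(M, p0):
    """Prior mean of the number of occupied markers, E(N_Q) = M(1 - p0)."""
    return M * (1 - p0)


def prior_pmf_occupied(k, M, p0):
    """P(N_Q = k) under the added assumption of independent indicators,
    i.e. N_Q ~ Binomial(M, 1 - p0)."""
    q = 1 - p0
    return comb(M, k) * q**k * p0 ** (M - k)


for p0 in (0.99, 0.79):
    print(p0, round(prior_mean_occupied(127, p0), 2))
```

With M = 127 this gives a prior mean of 1.27 occupied markers for p0 = 0.99 and 26.67 for p0 = 0.79.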

Table 3. Prior and posterior distributions of the summary statistic $N_Q = \sum_{m=1}^{M} S_m$ (number of occupied markers) for the CF data.

Fig. 6. Results for the CF data. (a) Bayes factors (BFs) for marker occupancy on logarithmic scale. Dashed lines indicate the borders of the BF categories of strength of evidence (see section 4). (b) MCMC switches per minute for marker indicators. (c) Posterior means of effect sizes on logistic liability scale. (d) MCMC effective samples per minute (ESS/min) for effect sizes.

Supplementary material: PDF

Knürr Supplementary Material
