TY - RPRT
T1 - Strategies for MCMC computation inquantitative genetics
AU - Waagepetersen, Rasmus
AU - Ibánēz-Escriche, Noelia
AU - Sorensen, Daniel
PY - 2007
Y1 - 2007
N2 - Given observations of a trait and a pedigree for a group of animals, the basic model in quantitative genetics is a linear mixed model with genetic random effects. The correlation matrix of the genetic random effects is determined by the pedigree and is typically very high-dimensional but with a sparse inverse. Maximum likelihood inference and Bayesian inference for the linear mixed model are well-studied topics (Sorensen and Gianola, 2002). Regarding Bayesian inference, with appropriate choice of priors, the full conditional distributions are standard distributions and Gibbs sampling can be implemented relatively straightforwardly. The assumptions of normality, linearity, and variance homogeneity are in many cases not valid. One may then consider generalized linear mixed models where the genetic random effects enter at the level of the linear predictor. San Cristobal-Gaudy et al. (1998) proposed another extension of the linear mixed model introducing genetic random effects influencing the log residual variances of the observations thereby producing a genetically structured variance heterogeneity. Considerable computational problems arise when abandoning the standard linear mixed model. Maximum likelihood inference is complicated since it is not possible to evaluate explicitly the likelihood function and conventional Gibbs sampling is difficult since the full conditional distributions are not anymore of standard forms. The aim of this paper is to discuss strategies to obtain efficient Markov chain Monte Carlo (MCMC) algorithms for non-standard models of the kind mentioned in the previous paragraph. In particular we focus on the problem of constructing efficient updating schemes for the high-dimensional vectors of genetic random effects. We review the methodological background and discuss the various algorithms in the context of the heterogeneous variance model. Apart from being a model of great interest in its own right, this model has proven to be a hard test for MCMC methods. We compare the performances of the different algorithms when applied to three real datasets which differ markedly both in size and regarding the inferences concerning the genetic covariance parameters. Section 2 discusses general strategies for obtaining efficient MCMC algorithms while Section 3 considers these strategies in the specific context of the San Cristobal-Gaudy et al. (1998) model. Section 4 presents results of applying two MCMC schemes to data sets with pig litter sizes, rabbit litter sizes, and snail weights. Some concluding remarks are given in Section 5.
AB - Given observations of a trait and a pedigree for a group of animals, the basic model in quantitative genetics is a linear mixed model with genetic random effects. The correlation matrix of the genetic random effects is determined by the pedigree and is typically very high-dimensional but with a sparse inverse. Maximum likelihood inference and Bayesian inference for the linear mixed model are well-studied topics (Sorensen and Gianola, 2002). Regarding Bayesian inference, with appropriate choice of priors, the full conditional distributions are standard distributions and Gibbs sampling can be implemented relatively straightforwardly. The assumptions of normality, linearity, and variance homogeneity are in many cases not valid. One may then consider generalized linear mixed models where the genetic random effects enter at the level of the linear predictor. San Cristobal-Gaudy et al. (1998) proposed another extension of the linear mixed model introducing genetic random effects influencing the log residual variances of the observations thereby producing a genetically structured variance heterogeneity. Considerable computational problems arise when abandoning the standard linear mixed model. Maximum likelihood inference is complicated since it is not possible to evaluate explicitly the likelihood function and conventional Gibbs sampling is difficult since the full conditional distributions are not anymore of standard forms. The aim of this paper is to discuss strategies to obtain efficient Markov chain Monte Carlo (MCMC) algorithms for non-standard models of the kind mentioned in the previous paragraph. In particular we focus on the problem of constructing efficient updating schemes for the high-dimensional vectors of genetic random effects. We review the methodological background and discuss the various algorithms in the context of the heterogeneous variance model. Apart from being a model of great interest in its own right, this model has proven to be a hard test for MCMC methods. We compare the performances of the different algorithms when applied to three real datasets which differ markedly both in size and regarding the inferences concerning the genetic covariance parameters. Section 2 discusses general strategies for obtaining efficient MCMC algorithms while Section 3 considers these strategies in the specific context of the San Cristobal-Gaudy et al. (1998) model. Section 4 presents results of applying two MCMC schemes to data sets with pig litter sizes, rabbit litter sizes, and snail weights. Some concluding remarks are given in Section 5.
M3 - Report
T3 - Research Report Series
BT - Strategies for MCMC computation inquantitative genetics
PB - Department of Mathematical Sciences, Aalborg University
CY - Aalborg
ER -