mrhevo

MR-Hevo – Inference of causal effects by Mendelian randomization, marginalizing over distribution of pleiotropic effects

Mendelian randomization has been widely used to study causal effects of exposures (broadly defined to include behavioural traits, biomarkers and gene expression levels) on diseases. The biggest methodological challenge is how to infer causality when some of the genetic instruments have direct (pleiotropic) effects on the outcome that are not mediated through the exposure under study. These pleiotropic effects are not directly observed, and their distribution over the instruments is unknown.

Inference of causal effects can be tackled like any other statistical problem, by computing the likelihood (or posterior distribution) of the parameter of interest (the causal effect) while marginalizing over the distribution of nuisance variables (in this case the pleiotropic effects). If we can compute the posterior distribution of the parameter of interest, we can obtain the likelihood, up to a constant of proportionality, by dividing the posterior by the prior on that parameter.
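The relationship between posterior, prior and marginal likelihood can be illustrated with a toy numerical sketch. This is not the MR-Hevo model: it assumes a single observation `y_obs` equal to a parameter of interest `theta` plus a nuisance effect `alpha` plus Gaussian noise, with Gaussian priors on both, and every name and value below is hypothetical. The nuisance effect is integrated out on a grid, and dividing the resulting posterior by the prior on `theta` recovers the marginal likelihood up to a constant.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Density of a normal distribution, evaluated elementwise."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


# Toy model (hypothetical values): y = theta + alpha + noise, noise ~ N(0, 1)
y_obs = 1.2
theta_grid = np.linspace(-4.0, 4.0, 401)   # parameter of interest
alpha_grid = np.linspace(-6.0, 6.0, 601)   # nuisance parameter
d_theta = theta_grid[1] - theta_grid[0]
d_alpha = alpha_grid[1] - alpha_grid[0]

# Prior on the nuisance parameter (assumed Gaussian purely for illustration)
prior_alpha = normal_pdf(alpha_grid, 0.0, 0.5)

# Marginal likelihood: L(theta) = integral of p(y | theta, alpha) p(alpha) d alpha,
# approximated by a Riemann sum over the alpha grid
joint_lik = normal_pdf(y_obs, theta_grid[:, None] + alpha_grid[None, :], 1.0)
marginal_lik = (joint_lik * prior_alpha[None, :]).sum(axis=1) * d_alpha

# Posterior on theta: prior times marginal likelihood, normalized on the grid
prior_theta = normal_pdf(theta_grid, 0.0, 2.0)
posterior = prior_theta * marginal_lik
posterior /= posterior.sum() * d_theta

# Dividing the posterior by the prior recovers the likelihood up to a constant
recovered_lik = posterior / prior_theta
ratio = recovered_lik / marginal_lik
theta_mle = theta_grid[np.argmax(recovered_lik)]

print("ratio is constant:", np.allclose(ratio, ratio[0]))
print("maximum likelihood estimate of theta:", theta_mle)
```

In practice the posterior is available only as MCMC samples, so the posterior density would have to be estimated from those samples before dividing by the prior; the grid here just keeps the arithmetic visible.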

As the form of the distribution of pleiotropic effects over loci is unknown, any realistic statistical model has to specify a prior that encompasses a broad family of symmetric distributions ranging from a spike-and-slab to a Gaussian. An initial implementation of this approach has been described by Berzuini et al. (2020). They specify a horseshoe prior for the pleiotropic effects, and generate the posterior distribution of all model parameters, including the causal effect parameter, by Markov chain Monte Carlo sampling. An implementation that uses only summary statistics has been described by Grant and Burgess (2020).

This method extends the likelihood-based approach

  1. to two-step Mendelian randomization, where step 1 uses only summary statistics for the effects of genetic instruments on exposure, and step 2 uses individual-level data to test the effects of these instruments on the outcome.

  2. to use a regularized horseshoe prior on shrinkage coefficients, as described by Piironen and Vehtari (2017). This prior, known as the “Finnish horseshoe”, has better computational properties than the original horseshoe. On this basis, the method is named MR-Hevo (hevo is Finnish for a horse).

  3. to generate classical maximum likelihood estimates and p-values for the causal effect.
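The regularized horseshoe mentioned in item 2 can be sketched by drawing from the prior as defined by Piironen and Vehtari (2017): each local scale is half-Cauchy, and a slab of finite width `c` caps the prior standard deviation of every effect. The sketch below is not the MR-Hevo implementation. The function name, the default values, and the choice to hold the global scale `tau` fixed (rather than giving it its own prior) are assumptions made for illustration.

```python
import numpy as np


def sample_regularized_horseshoe(n_effects, tau=0.1, slab_scale=1.0,
                                 slab_df=4.0, rng=None):
    """Draw effects from a regularized horseshoe prior (hypothetical helper).

    tau        : global scale, held fixed here as a simplifying assumption
    slab_scale : scale s of the slab
    slab_df    : degrees of freedom nu of the slab
    """
    rng = np.random.default_rng() if rng is None else rng

    # Local scales: lambda_j ~ half-Cauchy(0, 1)
    lam = np.abs(rng.standard_cauchy(n_effects))

    # Slab width: c^2 ~ Inv-Gamma(nu / 2, nu * s^2 / 2),
    # drawn as rate / Gamma(shape = nu / 2, scale = 1)
    c2 = (slab_df * slab_scale ** 2 / 2.0) / rng.gamma(shape=slab_df / 2.0)

    # Regularized local scales:
    # lambda_tilde_j^2 = c^2 lambda_j^2 / (c^2 + tau^2 lambda_j^2)
    lam_tilde2 = c2 * lam ** 2 / (c2 + tau ** 2 * lam ** 2)

    # Effects: alpha_j ~ N(0, tau^2 * lambda_tilde_j^2)
    prior_sd = tau * np.sqrt(lam_tilde2)
    effects = rng.normal(0.0, prior_sd)
    return effects, prior_sd, np.sqrt(c2)


# Example draw with a fixed seed so the run is reproducible
effects, prior_sd, slab_c = sample_regularized_horseshoe(
    10_000, rng=np.random.default_rng(0)
)
print("largest prior sd:", prior_sd.max(), "slab width c:", slab_c)
```

Because the product of `tau` and the regularized local scale can never exceed `c`, even the largest effects are shrunk at least slightly, whereas the original horseshoe leaves them unregularized.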

The motivation for this work was to develop a method to test formally for causality in genome-wide aggregated trans-effects analysis, which aims to detect core genes for a disease or trait by testing for association with predicted trans-effects of SNPs on gene expression, aggregated over multiple QTLs. With this approach, the genetic instruments are clumps of SNPs with trans-effects on the expression of a gene, measured as transcript or circulating protein.

A description of the statistical model is here and a vignette is here.

This is work in progress. We shall upload further examples and eventually an R package.