Statistical Software

Advances in human genetic research are driven by advances in technology and methodology. To this end, the center’s faculty have developed analytical frameworks to address the biological, epidemiological, statistical, and evolutionary questions encountered in human genetics. In particular, they have developed the following statistical software, which is commonly used in genetic epidemiology studies:

BVS is an R package for analyzing case-control association studies involving a group of genetic variants. It models the outcome variable as a function of a multivariate genetic profile using Bayesian model uncertainty and variable selection techniques. Numerous genetic predictors can be modeled jointly as main effects or in combination as expected haplotypes conditional on the SNPs currently selected in a model, and rare variants can be modeled via the Bayesian Risk Index. Most notably, the package allows external biological information to be incorporated via a set of specified prior covariates that inform the marginal inclusion probabilities.

FIZI leverages functional information together with reference linkage-disequilibrium (LD) to impute GWAS summary statistics (Z-scores).

FOCUS is software to fine-map transcriptome-wide association study statistics at genomic risk regions. The software takes as input summary GWAS data along with eQTL weights and outputs a credible set of genes to explain observed genomic risk.

JAM (joint analysis of marginal summary statistics) is a scalable algorithm for re-analyzing published marginal summary statistics under joint multi–single nucleotide polymorphism (SNP) models. Correlation among SNPs is accounted for using estimates from a reference data set, and the models and SNPs that best explain the complete joint pattern of marginal effects are highlighted via an integrated Bayesian penalized regression framework.

LUCID is an integrative model for estimating latent unknown clusters. It aims to distinguish the unique effects of genomic, exposure, and informative biomarker or “-omic” data while jointly estimating subgroups relevant to the outcome of interest.

PriorityPruner can prune a list of SNPs that are in high linkage disequilibrium (LD) with other SNPs in the list, while preferentially keeping/selecting SNPs of higher priority (e.g., the most significant SNPs in a GWAS). A user can input data in PLINK format with corresponding SNP annotation, including p-values and other SNP characteristics used for prioritization.

PriorityPruner iterates over the entire list of input SNPs, in order of descending priority (e.g., lowest to highest p-value), to select LD-independent SNPs according to customizable options and thresholds.
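The general idea of priority-ordered pruning can be illustrated with a minimal Python sketch. This is not PriorityPruner's actual code or interface: the function name `prune_by_priority`, the pairwise r² lookup, the 0.5 threshold, and the toy SNP data are all assumptions made for illustration, and p-value is used as the only priority criterion.

```python
from typing import Dict, FrozenSet, List


def prune_by_priority(
    p_values: Dict[str, float],
    r2: Dict[FrozenSet[str], float],
    r2_threshold: float = 0.5,
) -> List[str]:
    """Greedy sketch: visit SNPs from highest to lowest priority and keep
    a SNP only if it is not in high LD with any SNP already kept."""
    # Highest priority first: here, the smallest p-value.
    # Ties are broken by SNP name so the result is deterministic.
    ordered = sorted(p_values, key=lambda snp: (p_values[snp], snp))

    kept: List[str] = []
    for snp in ordered:
        # Pairs missing from the lookup are treated as uncorrelated (r2 = 0).
        in_high_ld = any(
            r2.get(frozenset((snp, other)), 0.0) >= r2_threshold
            for other in kept
        )
        if not in_high_ld:
            kept.append(snp)
    return kept


# Hypothetical toy input: four SNPs with p-values and pairwise r2 values.
p_values = {"rs1": 1e-8, "rs2": 3e-6, "rs3": 0.02, "rs4": 0.4}
r2 = {
    frozenset(("rs1", "rs2")): 0.9,
    frozenset(("rs2", "rs3")): 0.7,
    frozenset(("rs3", "rs4")): 0.1,
}

selected = prune_by_priority(p_values, r2)
print(selected)  # ['rs1', 'rs3', 'rs4']
```

In this toy example rs2 is dropped because it is in high LD with the higher-priority rs1. rs3 is then kept, since the only SNP it is strongly correlated with (rs2) has already been pruned.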

RHOGE is an R package that estimates the genome-wide genetic correlation between two complex traits (diseases) as a function of predicted gene expression effect on trait (ρge). Given output from two transcriptome-wide association studies, RHOGE estimates the mediating effect of predicted gene expression and estimates the correlation of effect sizes across traits (diseases). This approach is extended to a bidirectional regression that provides putative causal directions between traits with non-zero ρge.

TWAS Simulator is software to simulate a complex trait as a function of latent steady-state expression, fit eQTL weights in independent data, and perform GWAS+TWAS on the simulated complex trait.
