The NIH Figshare Archive
23 files

The multi-PRS weights computed with the 35 lab biomarkers described in 'Genetics of 35 blood and urine biomarkers in the UK Biobank'

dataset
posted on 2020-06-19, 17:45 authored by Yosuke Tanigawa, Nasa Sinnott-Armstrong, Manuel Rivas

The dataset contains the multi-PRS weights computed with the 35 biomarker traits described in the following preprint:
N. Sinnott-Armstrong*, Y. Tanigawa*, et al., Genetics of 38 blood and urine biomarkers in the UK Biobank. bioRxiv, 660506 (2019). doi:10.1101/660506

Note that we are preparing a revised version of the manuscript and this dataset contains 35 (instead of 38) biomarker phenotypes.

The list of disease endpoints included in this dataset is: angina, alcoholic cirrhosis, gallstones, hypertension, cholecystitis, kidney failure, heart failure, myocardial infarction, gout, and type 2 diabetes (T2D).

We provide the weights of the 23 polygenic risk scores characterized by multi-PRS. The models are listed in list_of_multi-PRS-models.tsv. This index file has the following columns:

  • Filename: the filename of polygenic risk score weights in this dataset.
  • Trait: the disease outcome.
  • Covariate_adjustment: a binary variable indicating whether the multi-PRS model is trained with covariate (age, sex, and PC1-10) adjustment.
  • Family_history_adjustment: a binary variable indicating whether the multi-PRS model is trained with family history.
  • Note: Additional information when relevant.

For the PRS models listed with "Covariate_adjustment == TRUE", we fit a multi-PRS regression model adjusted for age, sex, and 10 principal components, whereas for the ones with "Covariate_adjustment == FALSE" we did not use those covariates.
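
As a minimal sketch of how the index file might be used, the following Python snippet selects models by trait and covariate adjustment. It assumes the index is tab-delimited with a header row naming the columns above, and that the binary columns hold the literal strings TRUE and FALSE. The helper name select_models and the sample rows (including the filenames) are hypothetical and only serve to make the sketch runnable.

```python
import csv
import io

def select_models(index_file, trait=None, covariate_adjustment=None):
    """Return index rows matching the requested trait and covariate setting.

    index_file: an open text file (or file-like object) holding the index.
    Assumes a tab-delimited file with a header row, and TRUE/FALSE strings
    in the Covariate_adjustment column (an assumption, not confirmed).
    """
    selected = []
    for row in csv.DictReader(index_file, delimiter="\t"):
        if trait is not None and row["Trait"] != trait:
            continue
        if covariate_adjustment is not None:
            is_adjusted = row["Covariate_adjustment"].strip().upper() == "TRUE"
            if is_adjusted != covariate_adjustment:
                continue
        selected.append(row)
    return selected

# Hypothetical rows for illustration only; real filenames and values differ.
SAMPLE_INDEX = (
    "Filename\tTrait\tCovariate_adjustment\tFamily_history_adjustment\tNote\n"
    "example_a.tsv.gz\tgout\tTRUE\tFALSE\t\n"
    "example_b.tsv.gz\tgout\tFALSE\tFALSE\t\n"
    "example_c.tsv.gz\tangina\tTRUE\tFALSE\t\n"
)

adjusted_gout = select_models(
    io.StringIO(SAMPLE_INDEX), trait="gout", covariate_adjustment=True
)
print([row["Filename"] for row in adjusted_gout])
```

With the real index, one would pass an open handle to list_of_multi-PRS-models.tsv instead of the in-memory sample.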

For T2D, we have two sets of models: (1) models trained on T2D cases in UK Biobank as defined by Eastwood et al., and (2) models trained on the same Eastwood et al. T2D cases vs. filtered controls with HbA1c < 39.

For myocardial infarction, we provide a model with family history adjustment, weights_familyhistory.HC326.tsv.gz. This model is trained with age, sex, 10 principal components, and family history of heart disease as covariates.

Please read our manuscript for more details.

For each PRS model listed in list_of_multi-PRS-models.tsv, we provide a compressed tab-delimited file that contains the multi-PRS weights. Each file has the following columns:

  • CHROM: the chromosome
  • POS: the position
  • ID: the variant identifier
  • REF: the reference allele
  • ALT: the alternate allele
  • A1: the risk allele
  • weights.: the coefficients (weights) of the PRS

Note that we used the GRCh37/hg19 genome reference in the analysis, and the BETA is always reported for the alternate allele.
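
To illustrate how per-variant weights of this kind are typically applied, here is a minimal sketch of an additive score: the sum of weight times alternate-allele dosage. It is not taken from the manuscript. It assumes that weights apply to the alternate allele (as stated above), that genotypes are coded as ALT allele counts of 0, 1, or 2, and that variants have already been matched on GRCh37/hg19 coordinates and alleles. The function name, variant IDs, and values are hypothetical, and any covariate terms of the full model are left out.

```python
def polygenic_score(weights, alt_dosages):
    """Sum weight * ALT-allele dosage over variants present in both inputs.

    weights: dict mapping variant ID -> weight for the alternate allele.
    alt_dosages: dict mapping variant ID -> ALT allele count (0, 1, or 2)
        for one individual. Variants without a dosage are skipped, which is
        a simplification; real pipelines often impute or mean-fill instead.
    """
    score = 0.0
    for variant_id, weight in weights.items():
        dosage = alt_dosages.get(variant_id)
        if dosage is None:
            continue
        score += weight * dosage
    return score

# Hypothetical variants and values for illustration only.
example_weights = {"var1": 0.5, "var2": -0.25, "var3": 1.0}
example_dosages = {"var1": 2, "var2": 1}  # var3 was not genotyped

print(polygenic_score(example_weights, example_dosages))
```

Skipping missing variants keeps the sketch short; for real use, a dedicated scoring tool that handles allele matching and missing genotypes is likely a better choice.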

The multi-PRS weights files are compressed with gzip and can be read with the standard gzip or zcat tools.
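
For readers working in Python rather than on the command line, the files can presumably also be read with the standard-library gzip and csv modules. The sketch below assumes each file has a header row with the columns listed above and treats the last column as the weight column; both are assumptions. The function name read_weights, the demo file, and its contents (including the placeholder name of the last column) are hypothetical.

```python
import csv
import gzip
import os
import tempfile

def read_weights(path):
    """Read a gzip-compressed, tab-delimited weights file.

    Returns (rows, weight_column), where rows is a list of dicts keyed by
    the header and weight_column is the name of the last column, assumed
    here to hold the PRS weights.
    """
    with gzip.open(path, mode="rt", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        rows = list(reader)
        weight_column = reader.fieldnames[-1]
    return rows, weight_column

# Write a tiny hypothetical file so the sketch runs without the real data.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_weights.tsv.gz")
with gzip.open(demo_path, mode="wt", newline="") as out:
    # The name of the last column is a placeholder; the real header may differ.
    out.write("CHROM\tPOS\tID\tREF\tALT\tA1\tweights\n")
    out.write("1\t12345\tvar1\tA\tG\tG\t0.5\n")

rows, weight_column = read_weights(demo_path)
print(rows[0]["ID"], float(rows[0][weight_column]))
```

All values come back as strings, so numeric fields such as POS and the weights need explicit conversion before use.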

Funding

SOFTWARE FOR LARGE-SCALE INFERENCE OF THE GENETICS OF LIFESTYLE MEASURES, BIOMARKERS, AND COMMON AND RARE DISEASES

National Human Genome Research Institute

Select an IC:

  • HG - National Human Genome Research Institute (NHGRI)

Is this associated with a publication?

  • Yes

DOI(s) of associated publication(s):

I confirm there is no human identifiable information in this dataset.

  • Yes
