The NIH Figshare Archive

Decomposed matrices used for the analysis described in 'Components of genetic associations across 2,138 phenotypes in the UK Biobank highlight adipocyte biology'

dataset
Posted on 2019-08-01, 10:00; authored by Yosuke Tanigawa and Manuel Rivas

The dataset deposited here contains decomposed matrices of GWAS summary statistics across 2,138 phenotypes described in the following publication:


Y. Tanigawa*, J. Li*, et al., Components of genetic associations across 2,138 phenotypes in the UK Biobank highlight adipocyte biology. Nature Communications (2019). doi:10.1038/s41467-019-11953-9.


The data are provided as three NumPy (npz) files, one for each of the three datasets used in the computational analysis described in our manuscript.


- "All" dataset: dev_allNonMHC_z_center_p0001_100PCs_20180129.npz

- "Coding only" dataset: dev_codingNonMHC_z_center_p0001_100PCs_20180129.npz

- "PTVs only" dataset: dev_PTVsNonMHC_z_center_p0001_100PCs_20180129.npz


These files can be loaded with the Python NumPy package and were used in our analysis scripts and notebook (https://github.com/rivas-lab/public-resources/tree/master/uk_biobank/DeGAs).
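As a minimal sketch of loading an npz archive with NumPy, the snippet below lists every array in a file together with its shape and dtype. The names of the arrays stored in the deposited files are not given on this page, so the helper `summarize_npz` and the stand-in archive with `example_matrix` and `example_labels` are hypothetical and used only for illustration; in practice the path would point at one of the three downloaded files.

```python
import os
import tempfile

import numpy as np


def summarize_npz(path):
    """Return {array_name: (shape, dtype)} for every array in an .npz archive."""
    # np.load on an .npz file returns a lazy, dict-like NpzFile object;
    # the context manager closes the underlying zip file when done.
    with np.load(path) as archive:
        return {
            name: (archive[name].shape, str(archive[name].dtype))
            for name in archive.files
        }


# Stand-in archive with made-up array names (an assumption, not the
# real contents of the deposited files), so the sketch runs offline.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo.npz")
np.savez(
    demo_path,
    example_matrix=np.zeros((4, 3)),
    example_labels=np.array(["a", "b", "c"]),
)

summary = summarize_npz(demo_path)
for name, (shape, dtype) in summary.items():
    print(name, shape, dtype)
```

If an archive contains arrays stored with an object dtype, `np.load` raises a `ValueError` unless `allow_pickle=True` is passed; that option should only be used for files from a trusted source.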


Please read our publication for more information regarding this dataset.


Abstract


Population-based biobanks with genomic and dense phenotype data provide opportunities for generating effective therapeutic hypotheses and understanding the genomic role in disease predisposition. To characterize latent components of genetic associations, we applied truncated singular value decomposition (DeGAs) to matrices of summary statistics derived from genome-wide association analyses across 2,138 phenotypes measured in 337,199 White British individuals in the UK Biobank study. We systematically identified key components of genetic associations and the contributions of variants, genes, and phenotypes to each component. As an illustration of the utility of the approach to inform downstream experiments, we report putative loss of function variants, rs114285050 (GPR151) and rs150090666 (PDE3B), that substantially contribute to obesity-related traits, and experimentally demonstrate the role of these genes in adipocyte biology. Our approach to dissect components of genetic associations across the human phenome will accelerate biomedical hypothesis generation by providing insights on previously unexplored latent structures.
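To illustrate the truncated singular value decomposition mentioned in the abstract, here is a minimal sketch on a small random matrix. It is not the authors' implementation: the function `truncated_svd`, the toy matrix, and the choice of `k=3` are assumptions made for illustration, and the published analysis scripts linked above remain the reference for the actual procedure.

```python
import numpy as np


def truncated_svd(matrix, k):
    """Keep the k largest singular values and their singular vectors."""
    # full_matrices=False gives the economy-size decomposition;
    # singular values are returned in descending order.
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :]


# Toy data standing in for a matrix of summary statistics (assumption).
rng = np.random.default_rng(0)
toy = rng.standard_normal((6, 10))

u_k, s_k, vt_k = truncated_svd(toy, k=3)

# Rank-k approximation: scale each column of u_k by its singular value.
approx = (u_k * s_k) @ vt_k

# Share of the total squared singular values captured by each component.
full_s = np.linalg.svd(toy, compute_uv=False)
variance_explained = s_k**2 / np.sum(full_s**2)
print(variance_explained)
```

The squared singular values give each component's share of the total variance in the matrix, which is one common way to decide how many components to keep.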

Funding

Software for large-scale inference of the genetics of lifestyle measures, biomarkers, and common and rare diseases

National Human Genome Research Institute

Beyond GWAS of insulin resistance: An integrated approach to translate genetic association to function

National Institute of Diabetes and Digestive and Kidney Diseases

Select an IC:

  • HG - National Human Genome Research Institute (NHGRI)

Is this associated with a publication?

  • Yes

DOI(s) of associated publication(s): 10.1038/s41467-019-11953-9
