Decomposed matrices used for the analysis described in 'Components of genetic associations across 2,138 phenotypes in the UK Biobank highlight adipocyte biology'
Datasets usually provide raw data for analysis. This raw data often comes in spreadsheet form, but can be any collection of data, on which analysis can be performed.
The dataset deposited here contains decomposed matrices of GWAS summary statistics across 2,138 phenotypes described in the following publication:
Y. Tanigawa*, J. Li*, et al., Components of genetic associations across 2,138 phenotypes in the UK Biobank
highlight adipocyte biology. Nature Communications (2019). doi:10.1038/s41467-019-11953-9.
The data are provided as three Python Numpy data (npz) files, each of which corresponds to the three datasets used in computational analysis described in our manuscript.
- "all" dataset: dev_allNonMHC_z_center_p0001_100PCs_20180129.npz
- "Coding only" dataset: dev_codingNonMHC_z_center_p0001_100PCs_20180129.npz
- "PTVs only" dataset: dev_PTVsNonMHC_z_center_p0001_100PCs_20180129.npz
Those files can be loaded with Python numpy package and were used in our analysis scripts and notebook (https://github.com/rivas-lab/public-resources/tree/master/uk_biobank/DeGAs).
Please read our publication for more information regarding this dataset.
Population-based biobanks with genomic and dense phenotype data provide opportunities for generating effective therapeutic hypotheses and understanding the genomic role in disease predisposition. To characterize latent components of genetic associations, we applied truncated singular value decomposition (DeGAs) to matrices of summary statistics derived from genome-wide association analyses across 2,138 phenotypes measured in 337,199 White British individuals in the UK Biobank study. We systematically identified key components of genetic associations and the contributions of variants, genes, and phenotypes to each component. As an illustration of the utility of the approach to inform downstream experiments, we report putative loss of function variants, rs114285050 (GPR151) and rs150090666 (PDE3B), that substantially contribute to obesity-related traits, and experimentally demonstrate the role of these genes in adipocyte biology. Our approach to dissect components of genetic associations across the human phenome will accelerate biomedical hypothesis generation by providing insights on previously unexplored latent structures.
SOFTWARE FOR LARGE-SCALE INFERENCE OF THE GENETICS OF LIFESTYLE MEASURES, BIOMARKERS, AND COMMON AND RARE DISEASES
National Human Genome Research InstituteFind out more...
Beyond GWAS of insulin resistance: An integrated approach to translate genetic association to function
National Institute of Diabetes and Digestive and Kidney DiseasesFind out more...
Select an IC:
- HG - National Human Genome Research Institute (NHGRI)