RH_pools_1.zip (5.59 GB)

Pooled analysis of radiation hybrids to identify genes for cell growth and paclitaxel action

dataset
posted on 18.12.2019 by Arshad H. Khan, Andy Lin, Richard T. Wang, Joshua S. Bloom, Kenneth Lange, Desmond Smith
SUMMARY
Data from bulk segregant analysis of radiation hybrids to identify genes for cell growth and paclitaxel action.

LaTeX DOCUMENTS
The LaTeX documents "master.tex", "main.tex" and "supp.tex" are located in the "Words" directory and provide the key to navigating the scripts and data. They can also be found by searching for their file names. The compiled PDF output of the LaTeX documents is "master.pdf".

SCRIPTS
The scripts necessary to analyze the data are in the "Data_Figs" directory and can also be found by search. The vast majority of the data files needed to run the scripts are in the directory "RH_pools_workspace_1". A few miscellaneous data files are in the "Data_Figs" directory.
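As a minimal sketch of "finding by search", the Python helper below walks a directory tree and returns every file with a given name. The function name `find_by_name` and the root path in the usage note are illustrative assumptions; they are not part of the dataset.

```python
from pathlib import Path


def find_by_name(root, filename):
    """Return a sorted list of files under `root` named exactly `filename`."""
    # rglob descends into every subdirectory of `root`.
    return sorted(p for p in Path(root).rglob(filename) if p.is_file())
```

For instance, `find_by_name("RH_pools_1", "clone.txt")` would list every copy of "clone.txt", assuming the archive was unpacked into a directory called "RH_pools_1".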

FINDING RELEVANT SCRIPTS
For example, the second paragraph of the "Results" section in "master.pdf" commences with the phrase "We created six independent RH pools...", and provides some basic statistics on the RH pools. Searching for this phrase in "main.tex" reveals the names of two R scripts, "clone_sem_1.R" and "graph_Human_retent_2.R", above the paragraph. These scripts generate the corresponding results.
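The lookup just described can be sketched in Python. This is an illustration only, under several assumptions that the description does not confirm: the phrase sits on a single line of the .tex source, the script names appear within a few lines above it, and each name is a plain token ending in ".R" (underscores not escaped). The function name and the `window` parameter are hypothetical.

```python
import re

# Assumed pattern: letters, digits and underscores followed by ".R".
SCRIPT_NAME = re.compile(r"[A-Za-z0-9_]+\.R\b")


def scripts_above_phrase(tex_text, phrase, window=10):
    """Return R script names found in the `window` lines above `phrase`."""
    lines = tex_text.splitlines()
    for i, line in enumerate(lines):
        if phrase in line:
            found = []
            # Scan only the lines preceding the first occurrence.
            for prev in lines[max(0, i - window):i]:
                found.extend(SCRIPT_NAME.findall(prev))
            return found
    return []  # phrase not present
```

Calling it on the contents of "main.tex" with the phrase "We created six independent RH pools" would be expected to return the two script names, provided the assumptions above hold for that file.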

Inspection of the script "clone_sem_1.R" shows that it uses the data file "clone.txt", while the script "graph_Human_retent_2.R" uses the data files "RH_pool_human_total_align.txt", "RH_pool_hamster_total_align.txt", "RH_human_gseq.txt", "RH_hamster_gseq.txt", "gencode_gtf_ensembl_ucsc_v31.txt", "clone.txt" and "cell_label_info.txt". All these data files can be found in the directory "RH_pools_workspace_1".
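Inspecting a script for its input files can be partly automated. The sketch below assumes that data files are referenced in the R source as quoted string literals ending in ".txt"; the actual scripts may build paths differently, so the result should be treated as a starting point and checked by reading the script. The function name is hypothetical.

```python
import re

# Assumed convention: file names appear as "name.txt" or 'name.txt'.
QUOTED_TXT = re.compile(r"""["']([^"']+\.txt)["']""")


def data_files_in_script(r_source):
    """Return the distinct quoted .txt names in `r_source`, in first-seen order."""
    seen = []
    for name in QUOTED_TXT.findall(r_source):
        if name not in seen:
            seen.append(name)
    return seen
```

Passing the text of "clone_sem_1.R" to this function would be expected to yield "clone.txt" if the file is named literally in the script.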

FINDING RELEVANT PARTS OF SCRIPTS
Statistics in the LaTeX documents are quoted to unrealistic levels of precision but are rounded in the PDF output file, "master.pdf". The extra digits are nevertheless useful as search keys. For example, the last paragraph on page S12 of the "Supporting Information" (section "Overlap of interaction loci with growth and paclitaxel loci") states that "There were 15 genes that overlapped between the interaction loci and the 859 unique growth loci (odds ratio = 22.6, P = 8.3 × 10^-15, Fisher’s Exact Test)..."

Reference to "supp.tex" reveals that the pertinent script is "g_d_comb_fish_2.R" and that the exact P value is "8.281682e-15". Searching "g_d_comb_fish_2.R" for "8.281682e-15" takes the reader to the relevant part of the script.
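The same search can be run programmatically. The following sketch reports the line numbers at which an exact string, such as the unrounded P value, occurs in a script's text. The function name is an assumption made for illustration.

```python
def lines_containing(source_text, needle):
    """Return (line_number, stripped_line) pairs for lines containing `needle`."""
    return [
        (number, line.strip())
        for number, line in enumerate(source_text.splitlines(), start=1)
        if needle in line
    ]
```

Applied to the contents of "g_d_comb_fish_2.R" with the needle "8.281682e-15", it should point to the part of the script that produces the quoted statistic.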

Funding

A cost-effective platform for high-precision genome analysis of mammalian cells

National Human Genome Research Institute

Is this associated with a publication?

Yes

I confirm there is no human identifiable information in this dataset.

Yes
