Welcome to the NIEHS SNPs Program
The NIEHS Environmental Genome Project is a multi-disciplinary, collaborative effort focused on examining the relationships between environmental exposures, inter-individual sequence variation in human genes and disease risk in U.S. populations. The NIEHS SNPs Program at the University of Washington (UW) targets the systematic identification and genotyping of single nucleotide polymorphisms (SNPs) in environmental response genes and has been expanded genome-wide, as described below (Phase 2).
Phase 1 - Candidate Gene Resequencing (completed)
Phase 1 of the NIEHS SNPs Program at UW focused on the targeted resequencing (typically all promoter, intronic and exonic regions) of 647 candidate genes involved in apoptosis, cell cycle control, DNA repair, drug metabolism, oxidative stress and many other medically and biologically important pathways (see links under Pathway Image Maps in the navigation menu on the left). NIEHS SNPs are available in the national database resource, dbSNP.
Phase 2 - Exome Sequencing (in progress)
To greatly expand the set of variant data within candidate genes and across the entire gene coding region, second-generation DNA sequencing technology is being used. Phase 2 of the NIEHS SNPs program at UW will comprehensively scan all gene-coding regions (i.e., the exome) on the NIEHS reference panel of 95 individuals from diverse U.S. populations (i.e., EGP - Panel 2). All variation data are uploaded to the NIEHS SNPs Exome Variant Server, from which investigators can access summary data of all coding variation and the uniquely aligned read data with the ability to interrogate any position within an exon.
For Phase 1, automated DNA sequencing was used to identify and genotype SNPs in human candidate genes (see PolyPhred). Candidate genes were sequenced to identify common sequence variation for functional analysis and population-based studies. Candidate genes were formerly sequenced across a panel of 90 individuals representative of the U.S. population (see Sample Population Descriptions Panel 1). For Phase 1, a subset of candidate genes was sequenced across a panel of 95 individuals of known ethnicities (see Sample Population Descriptions Panel 2). All SNPs have been identified using only high-quality sequence data (Q > 25), and each SNP reported from the NIEHS SNPs Program has been confirmed in multiple individuals and/or in multiple reactions.
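The two Phase 1 reporting rules described above (use only calls with Q > 25, and report a SNP only when it is seen in multiple individuals and/or multiple reactions) can be illustrated with a minimal Python sketch. This is not the program's actual pipeline or PolyPhred's logic: the `Observation` record, its field names, and the `confirmed_snps` helper are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple

MIN_QUALITY = 25  # from the text: only sequence data with Q > 25 is used


class Observation(NamedTuple):
    """One variant call from one sequencing reaction (hypothetical record)."""
    site: str        # illustrative site label, e.g. "GENE1:1042"
    individual: str  # sample identifier
    reaction: str    # sequencing reaction identifier
    quality: int     # quality score of the call


def confirmed_snps(observations: Iterable[Observation]) -> List[str]:
    """Return sites supported by high-quality calls in more than one
    individual and/or more than one reaction."""
    individuals = defaultdict(set)
    reactions = defaultdict(set)
    for obs in observations:
        if obs.quality <= MIN_QUALITY:
            continue  # discard calls that do not exceed the threshold
        individuals[obs.site].add(obs.individual)
        reactions[obs.site].add(obs.reaction)
    return sorted(
        site
        for site in reactions
        if len(individuals[site]) > 1 or len(reactions[site]) > 1
    )
```

Under these assumptions, a site called once at high quality is not reported, while a site called in two separate reactions from the same individual is, because either form of replication satisfies the "and/or" rule.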
For Phase 2 exome sequencing, all variants are identified using an analysis pipeline consisting of various quality controls and genotyping software from the Genome Analysis Toolkit (GATK). The GATK Unified Genotyper software provides variant identification, quality control and filtering to arrive at a final exome variant dataset. Only variants within the exome target are genotyped. Variants and coverage are analyzed, and final datasets are stored in the NIEHS Exome Variant Server. Panel 2 is currently being used for the Phase 2 exome sequencing.
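The restriction to variants within the exome target can be pictured as an interval lookup. The sketch below is a generic Python illustration, not GATK code or the program's pipeline. It assumes target regions are given as non-overlapping, 1-based inclusive `(chrom, start, end)` tuples; the function names `build_target_index`, `in_target` and `filter_to_target` are hypothetical.

```python
from bisect import bisect_right
from typing import Dict, Iterable, List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based inclusive (assumed)


def build_target_index(
    targets: Iterable[Tuple[str, int, int]]
) -> Dict[str, List[Interval]]:
    """Group target intervals by chromosome and sort them by start."""
    index: Dict[str, List[Interval]] = {}
    for chrom, start, end in targets:
        index.setdefault(chrom, []).append((start, end))
    for intervals in index.values():
        intervals.sort()
    return index


def in_target(index: Dict[str, List[Interval]], chrom: str, pos: int) -> bool:
    """True if pos lies inside a target interval on chrom.
    Assumes the intervals on each chromosome do not overlap."""
    intervals = index.get(chrom, [])
    # Locate the last interval whose start is <= pos.
    i = bisect_right(intervals, (pos, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos <= intervals[i][1]


def filter_to_target(
    index: Dict[str, List[Interval]],
    variants: Iterable[Tuple[str, int]],
) -> List[Tuple[str, int]]:
    """Keep only (chrom, pos) variants that fall within the target."""
    return [(c, p) for c, p in variants if in_target(index, c, p)]
```

Sorting the intervals once and using binary search keeps each lookup logarithmic in the number of target regions per chromosome, which matters when many positions are checked against an exome-sized target list.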