Webmaster: Colleen Davis (codavis@uw.edu)
Welcome to the NIEHS SNPs Program

Introduction

The NIEHS Environmental Genome Project is a multi-disciplinary, collaborative effort focused on examining the relationships between environmental exposures, inter-individual sequence variation in human genes and disease risk in U.S. populations. The NIEHS SNPs Program at the University of Washington (UW) targets the systematic identification and genotyping of single nucleotide polymorphisms (SNPs) in environmental response genes and has been expanded genome-wide, as described below (Phase 2).

Phase 1 - Candidate Gene Resequencing (completed)

Phase 1 of the NIEHS SNPs program at UW focused on the targeted resequencing (typically all promoter, intronic and exonic regions) of 647 candidate genes involved in apoptosis, cell cycle control, DNA repair, drug metabolism, oxidative stress and many other medically and biologically important pathways (see links under Pathway Image Maps in the navigation menu on the left). NIEHS SNPs are available in the national database resource, dbSNP.

Phase 2 - Exome Sequencing (in progress)

Phase 2 uses second-generation DNA sequencing technology to greatly expand the set of variant data within candidate genes and across the entire gene coding region. Phase 2 of the NIEHS SNPs program at UW will comprehensively scan all gene-coding regions (i.e., the exome) in the NIEHS reference panel of 95 individuals from diverse U.S. populations (i.e., EGP - Panel 2). All variation data are uploaded to the NIEHS SNPs Exome Variant Server, where investigators can access summary data for all coding variation as well as the uniquely aligned read data, and can interrogate any position within an exon.

Polymorphism Analysis

For Phase 1, automated DNA sequencing was used to identify and genotype SNPs in human candidate genes (see PolyPhred). Candidate genes were sequenced to identify common sequence variation for functional analysis and population-based studies. Candidate genes were originally sequenced across a panel of 90 individuals representative of the U.S. population (see Sample Population Descriptions Panel 1). A subset of the Phase 1 candidate genes was sequenced across a panel of 95 individuals of known ethnicities (see Sample Population Descriptions Panel 2). All SNPs have been identified using only high quality sequence data (Q > 25), and each SNP reported from the NIEHS SNPs program has been confirmed in multiple individuals and/or in multiple reactions.
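
As an illustration of the quality and confirmation criteria above, here is a minimal Python sketch. It assumes that Q is a Phred-scaled quality score (Q = -10 log10 of the error probability), so Q > 25 corresponds to an estimated error rate below roughly 0.3%. The function names, the observation tuple layout and the `min_support` parameter are hypothetical and are not part of PolyPhred or the NIEHS SNPs pipeline.

```python
def phred_to_error_prob(q):
    """Convert a Phred-scaled quality score to an error probability."""
    return 10 ** (-q / 10)


def passes_quality(q, threshold=25):
    """Apply the strict Q > threshold cutoff described in the text."""
    return q > threshold


def is_confirmed(observations, threshold=25, min_support=2):
    """Return True if a candidate SNP is seen in multiple individuals
    and/or multiple reactions, counting only high quality observations.

    observations: iterable of (sample_id, reaction_id, quality) tuples.
    This layout is an assumption made for the example; reaction_id is
    assumed to be unique across the whole dataset.
    """
    high_quality = [(s, r) for s, r, q in observations if passes_quality(q, threshold)]
    samples = {s for s, _ in high_quality}
    reactions = {r for _, r in high_quality}
    return len(samples) >= min_support or len(reactions) >= min_support
```

For example, a variant observed once at Q = 40 and once at Q = 20 would not count as confirmed under this sketch, because only one observation clears the quality cutoff.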

For Phase 2 exome sequencing, all variants are identified using an analysis pipeline consisting of various quality controls and genotyping software from the Genome Analysis Toolkit (GATK). The GATK Unified Genotyper provides variant identification, quality control and filtering to arrive at a final exome variant dataset. Only variants within the exome target are genotyped. Variants and coverage are analyzed, and final datasets are stored in the NIEHS Exome Variant Server. Panel 2 is currently being used for the Phase 2 exome sequencing.
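
The restriction to variants within the exome target can be pictured as an interval lookup. The following Python sketch is not the GATK implementation; it only shows the idea under stated assumptions: target regions are given as (chromosome, start, end) tuples with 1-based inclusive coordinates, and regions on the same chromosome do not overlap. The names `build_target_index` and `in_target` are hypothetical.

```python
from bisect import bisect_right


def build_target_index(targets):
    """Group target intervals by chromosome and sort them by start position.

    targets: iterable of (chrom, start, end), 1-based inclusive,
    assumed non-overlapping within each chromosome.
    """
    index = {}
    for chrom, start, end in targets:
        index.setdefault(chrom, []).append((start, end))
    for intervals in index.values():
        intervals.sort()
    return index


def in_target(index, chrom, pos):
    """Return True if the position falls inside any target interval."""
    intervals = index.get(chrom, [])
    # Locate the last interval whose start is <= pos.
    i = bisect_right(intervals, (pos, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos <= intervals[i][1]


def filter_to_target(variants, index):
    """Keep only (chrom, pos) variant sites that lie within the target."""
    return [(c, p) for c, p in variants if in_target(index, c, p)]
```

Sorting the intervals once and using binary search keeps each lookup logarithmic in the number of target regions per chromosome, which matters when there are many exons.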

 