Posts Tagged ‘SNP’

Bioinformatics Tool Review: Genome Variant Analysis Tools, Volume 2 (Volume Two: Latest in Genomics Methodologies for Therapeutics: Gene Editing, NGS and BioInformatics, Simulations and the Genome Ontology), Part 1: Next Generation Sequencing (NGS)

Bioinformatics Tool Review: Genome Variant Analysis Tools

Curator: Stephen J. Williams, Ph.D.

Updated 02/07/2021

Updated 11/15/2018

The following post will be an ongoing curation of reviews of gene variant bioinformatic software.

The Ensembl Variant Effect Predictor.

McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GR, Thormann A, Flicek P, Cunningham F.

Genome Biol. 2016 Jun 6;17(1):122. doi: 10.1186/s13059-016-0974-4.

Author information


European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK. wm2@ebi.ac.uk.


European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.


European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK. fiona@ebi.ac.uk.


The Ensembl Variant Effect Predictor is a powerful toolset for the analysis, annotation, and prioritization of genomic variants in coding and non-coding regions. It provides access to an extensive collection of genomic annotation, with a variety of interfaces to suit different requirements, and simple options for configuring and extending analysis. It is open source, free to use, and supports full reproducibility of results. The Ensembl Variant Effect Predictor can simplify and accelerate variant interpretation in a wide range of study designs.

Rare diseases can be difficult to diagnose because of their low incidence and the incomplete penetrance of implicated alleles; however, variant analysis of whole genome sequencing (WGS) data can identify the underlying genetic events responsible for the disease (Nature, 2015).  Many WGS association studies nonetheless require a large cohort to achieve enough statistical power for interpretation (see post and here).  To this end, major sequencing projects have been initiated worldwide, including:

A more thorough curation of sequencing projects can be seen in the following post:

Icelandic Population Genomic Study Results by deCODE Genetics come to Fruition: Curation of Current genomic studies

Although sequencing costs have fallen dramatically over the years, the cost of determining the functional consequences of the resulting variants remains high: thorough basic research studies must be conducted to validate the interpretation of variant data with respect to the underlying disease, and only a small fraction of the variants from a genome sequencing project will have a functional effect on the encoded protein.  Correctly annotating sequences and variants, and identifying the correct corresponding reference genes or transcripts in GENCODE or RefSeq, remain substantial challenges to identifying sequenced variants as potentially functional variants.

To this effect, the authors developed the Ensembl Variant Effect Predictor (VEP), which is a software suite that performs annotations and analysis of most types of genomic variation in coding and non-coding regions of the genome.

Summary of Features

  • Annotation: VEP can annotate two broad categories of genomic variants
    • Sequence variants with specific and defined changes: indels, base substitutions, SNVs, tandem repeats
    • Larger structural variants > 50 nucleotides
  • Species and assembly/genomic database support: VEP can analyze data from any species with an assembled genome sequence and an annotated gene set. VEP supports genome assemblies such as the latest GRCh38, FASTA sequence files, transcripts from RefSeq, and user-derived sequences
  • Transcript Annotation: VEP includes a wide variety of gene and transcript related information including NCBI Gene ID, Gene Symbol, Transcript ID, NCBI RefSeq ID, exon/intron information, and cross reference to other databases such as UniProt
  • Protein Annotation: Protein-related fields include Protein ID, RefSeq ID, SwissProt, UniParc ID, reference codons and amino acids, SIFT pathogenicity score, protein domains
  • Noncoding Annotation: VEP reports variants in noncoding regions including genomic regulatory regions, intronic regions, and transcription factor binding motifs. Data from ENCODE, BLUEPRINT, and the NIH Roadmap Epigenomics project are used for primary annotation.  Plugins to the Perl code are also available to link other databases that annotate noncoding sequence features.
  • Frequency, phenotype, and citation annotation: VEP searches Ensembl databases containing a large amount of germline variant information and checks variants against the dbSNP single nucleotide polymorphism database. VEP integrates with mutational databases such as COSMIC, the Human Gene Mutation Database, and structural and copy number variants from the Database of Genomic Variants.  Allele frequencies are reported from 1000 Genomes and NHLBI, and VEP integrates with PubMed for literature annotation.  Phenotype information comes from OMIM, Orphanet, and GWAS, and clinical information on variants comes from ClinVar.
  • Flexible Input and Output Formats: VEP supports the input data format called “variant call format” or VCF, a standard in next-gen sequencing. VEP can also process variant identifiers from other database formats.  Output formats are tab-delimited and give the user choices in the presentation of results (HTML or text based)
  • Choice of user interface
    • Online tool (VEP Web): simple point and click; incorporates Instant VEP Functionality and copy and paste features. Results can be stored online in cloud storage on Ensembl.
    • VEP script: VEP is available as a downloadable Perl script (see below for link) and can process large amounts of data rapidly. This interface is powerfully flexible, with the ability to integrate multiple plugins available from Ensembl and GitHub.  The ability to alter the Perl code and add plugins and code functions gives the flexibility to modify any feature of VEP.
    • VEP REST API: provides robust computational access from any programming language and returns basic variant annotation. Can make use of external plugins.
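As a rough illustration of the REST interface described above, the sketch below builds a request URL for a single dbSNP identifier and, optionally, fetches the JSON annotation. It assumes the public Ensembl REST server at https://rest.ensembl.org and its `vep/:species/id/:id` endpoint; the helper names `vep_id_url` and `fetch_vep` are hypothetical, and the endpoint path should be checked against the current Ensembl REST documentation before use.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL of the public Ensembl REST service (verify against the docs).
ENSEMBL_REST = "https://rest.ensembl.org"


def vep_id_url(variant_id: str, species: str = "human") -> str:
    """Build the URL of the VEP 'id' endpoint for one variant identifier."""
    path = f"/vep/{urllib.parse.quote(species)}/id/{urllib.parse.quote(variant_id)}"
    return f"{ENSEMBL_REST}{path}?content-type=application/json"


def fetch_vep(variant_id: str, species: str = "human", timeout: float = 30.0):
    """Query VEP over HTTP and return the decoded JSON (needs network access)."""
    url = vep_id_url(variant_id, species)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Only the URL is built here, so the example runs without a network connection.
print(vep_id_url("rs2476601"))
```

Calling `fetch_vep("rs2476601")` would perform the actual request; the structure of the returned JSON is defined by Ensembl and is not assumed here.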


Watch Video on VEP Instructional Webinar: https://youtu.be/7Fs7MHfXjWk

Watch Video on VEP Web Version training on How to Analyze Your Sequence in VEP

Availability of data and materials

The dataset supporting the conclusions of this article is available from Illumina’s Platinum Genomes [93] and using the Ensembl release 75 gene set. Pre-built data sets are available for all Ensembl and Ensembl Genomes species [94]. They can also be downloaded automatically during set up whilst installing the VEP.


Large-scale discovery of novel genetic causes of developmental disorders.

Deciphering Developmental Disorders Study.

Nature. 2015 Mar 12;519(7542):223-8. doi: 10.1038/nature14135. PMID: 25533962

Updated 11/15/2018

Research Points to Caution in Use of Variant Effect Prediction Bioinformatic Tools

Although we have the ability to use high throughput sequencing to identify allelic variants occurring in rare disease, correlation of these variants with the underlying disease is often difficult due to a few concerns:

  • For rare sporadic diseases, classical gene/variant association studies have proven difficult to perform (Meyts et al. 2016)
  • Whole Exome Sequencing (WES) returns a considerable number of variants, so the normal allelic variation found in the human population must be differentiated from disease-causing pathogenic alleles
  • For rare diseases, pathogenic allele frequencies are generally low

Therefore, for these rare pathogenic alleles, the use of bioinformatics tools in order to predict the resulting changes in gene function may provide insight into disease etiology when validation of these allelic changes might be experimentally difficult.

In a 2017 Genes & Immunity paper, Line Lykke Andersen and Rune Hartmann tested the reliability of various bioinformatic tools in predicting the functional consequences of variants of six different genes involved in interferon induction and of sixteen allelic variants of the IFNLR1 gene.  These variants were found in cohorts of patients presenting with herpes simplex encephalitis (HSE). Most of the adult population is seropositive for Herpes Simplex Virus (HSV); however, only a minor fraction (1 in 250,000 individuals per year) of HSV-infected individuals will develop HSE (Hjalmarsson et al., 2007).  It has been suggested that HSE occurs in individuals with rare primary immunodeficiencies caused by gene defects affecting innate immunity through reduced production of interferons (IFN) (Zhang et al., Lim et al.).


Meyts I, Bosch B, Bolze A, Boisson B, Itan Y, Belkadi A, et al. Exome and genome sequencing for inborn errors of immunity. J Allergy Clin Immunol. 2016;138:957–69.

Hjalmarsson A, Blomqvist P, Skoldenberg B. Herpes simplex encephalitis in Sweden, 1990-2001: incidence, morbidity, and mortality. Clin Infect Dis. 2007;45:875–80.

Zhang SY, Jouanguy E, Ugolini S, Smahi A, Elain G, Romero P, et al. TLR3 deficiency in patients with herpes simplex encephalitis. Science. 2007;317:1522–7.

Lim HK, Seppanen M, Hautala T, Ciancanelli MJ, Itan Y, Lafaille FG, et al. TLR3 deficiency in herpes simplex encephalitis: high allelic heterogeneity and recurrence risk. Neurology. 2014;83:1888–97.

Genes Immun. 2017 Dec 4. doi: 10.1038/s41435-017-0002-z.

Frequently used bioinformatics tools overestimate the damaging effect of allelic variants.

Andersen LL, Terczyńska-Dyla E, Mørk N, Scavenius C, Enghild JJ, Höning K, Hornung V, Christiansen M, Mogensen TH, Hartmann R.


We selected two sets of naturally occurring human missense allelic variants within innate immune genes. The first set represented eleven non-synonymous variants in six different genes involved in interferon (IFN) induction, present in a cohort of patients suffering from herpes simplex encephalitis (HSE) and the second set represented sixteen allelic variants of the IFNLR1 gene. We recreated the variants in vitro and tested their effect on protein function in a HEK293T cell based assay. We then used an array of 14 available bioinformatics tools to predict the effect of these variants upon protein function. To our surprise two of the most commonly used tools, CADD and SIFT, produced a high rate of false positives, whereas SNPs&GO exhibited the lowest rate of false positives in our test. As the problem in our test in general was false positive variants, inclusion of mutation significance cutoff (MSC) did not improve accuracy.


  1. Identification of rare variants
  2. The genomes of nineteen Dutch patients with a history of HSE were sequenced by WES, and novel candidate HSE-causing variants were identified by filtering for single nucleotide polymorphisms (SNPs) that had a frequency below 1% in the NHLBI Exome Sequencing Project Exome Variant Server and the 1000 Genomes Project and were present within 204 genes involved in the immune response to HSV.
  3. Identified variants (204) were manually evaluated for involvement in IFN induction based on IDBase and KEGG pathway database analysis.
  4. In silico predictions: variants were classified by the in silico variant pathogenicity prediction programs SIFT, Mutation Assessor, FATHMM, PROVEAN, SNAP2, PolyPhen2, PhD-SNP, SNPs&GO, FATHMM-MKL, MutationTaster2, PredictSNP, Condel, Meta-SNP, and CADD. Each program returned prediction scores measuring the likelihood of a variant being either ‘deleterious’ or ‘neutral’. Prediction accuracy was measured as

ACC = (true positive+true negative)/(true positive+true negative+false positive+false negative)
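The accuracy formula above can be sketched in a few lines of Python. The function name `accuracy` is only illustrative, and the counts used in the example are the SIFT values from Table 2 of the paper, read on the assumption that its columns follow the order TN, TP, FN, FP given in the table note.

```python
def accuracy(tn: int, tp: int, fn: int, fp: int) -> float:
    """ACC = (TP + TN) / (TP + TN + FP + FN)."""
    total = tn + tp + fn + fp
    if total == 0:
        raise ValueError("no predictions to score")
    return (tp + tn) / total


# SIFT on the nine HSE variants (assumed column order: TN, TP, FN, FP).
sift_hse = accuracy(tn=4, tp=1, fn=0, fp=4)

# SIFT pooled over the HSE and IFNLR1 sets (9 + 16 = 25 variants).
sift_overall = accuracy(tn=4 + 8, tp=1 + 1, fn=0 + 0, fp=4 + 7)

print(round(sift_hse, 2), round(sift_overall, 2))  # 0.56 0.56
```

Pooling the counts before dividing reproduces the overall figure reported for SIFT, which suggests the overall ACC is computed over all 25 variants rather than as an average of the two per-set accuracies.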

  5. Validation of prediction software/tools

To validate the predictive value of the software, HEK293T cells deficient in IRF3, MAVS, and IKKε/TBK1 were cotransfected with the nine variants of the aforementioned genes and a luciferase reporter under control of the IFN-β promoter, and luciferase activity was measured as an indicator of IFN signaling function.  Western blotting was performed to confirm expression of the constructs.


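Table 2 Summary of the bioinformatic predictions

| Cutoff | Tool | HSE TN | HSE TP | HSE FN | HSE FP | HSE total | HSE ACC | IFNLR1 TN | IFNLR1 TP | IFNLR1 FN | IFNLR1 FP | IFNLR1 total | IFNLR1 ACC | Overall ACC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Uniform cutoff | SIFT | 4 | 1 | 0 | 4 | 9 | 0.56 | 8 | 1 | 0 | 7 | 16 | 0.56 | 0.56 |
| Uniform cutoff | Mutation assessor | 6 | 1 | 0 | 2 | 9 | 0.78 | 9 | 1 | 0 | 6 | 16 | 0.63 | 0.68 |
| Uniform cutoff | FATHMM | 7 | 1 | 0 | 1 | 9 | 0.89 | - | - | - | - | - | - | 0.89 |
| Uniform cutoff | PROVEAN | 8 | 1 | 0 | 0 | 9 | 1.00 | 11 | 1 | 0 | 4 | 16 | 0.75 | 0.84 |
| Uniform cutoff | SNAP2 | 5 | 1 | 0 | 3 | 9 | 0.67 | 8 | 0 | 1 | 7 | 16 | 0.50 | 0.56 |
| Uniform cutoff | PolyPhen2 | 6 | 1 | 0 | 2 | 9 | 0.78 | 12 | 1 | 0 | 3 | 16 | 0.81 | 0.80 |
| Uniform cutoff | PhD-SNP | 7 | 1 | 0 | 1 | 9 | 0.89 | 11 | 1 | 0 | 4 | 16 | 0.75 | 0.80 |
| Uniform cutoff | SNPs&GO | 8 | 1 | 0 | 0 | 9 | 1.00 | 14 | 1 | 0 | 1 | 16 | 0.94 | 0.96 |
| Uniform cutoff | FATHMM MKL | 4 | 1 | 0 | 4 | 9 | 0.56 | 13 | 0 | 1 | 2 | 16 | 0.81 | 0.72 |
| Uniform cutoff | MutationTaster2 | 4 | 0 | 1 | 4 | 9 | 0.44 | 14 | 0 | 1 | 1 | 16 | 0.88 | 0.72 |
| Uniform cutoff | PredictSNP | 6 | 1 | 0 | 2 | 9 | 0.78 | 11 | 1 | 0 | 4 | 16 | 0.75 | 0.76 |
| Uniform cutoff | Condel | 6 | 1 | 0 | 2 | 9 | 0.78 | - | - | - | - | - | - | 0.78 |
| Uniform cutoff | Meta-SNP | 8 | 1 | 0 | 0 | 9 | 1.00 | 11 | 1 | 0 | 4 | 16 | 0.75 | 0.84 |
| Uniform cutoff | CADD | 2 | 1 | 0 | 6 | 9 | 0.33 | 8 | 0 | 1 | 7 | 16 | 0.50 | 0.44 |
| MSC 95% cutoff | SIFT | 5 | 1 | 0 | 3 | 9 | 0.67 | 8 | 1 | 0 | 8 | 16 | 0.50 | 0.56 |
| MSC 95% cutoff | PolyPhen2 | 6 | 1 | 0 | 2 | 9 | 0.78 | 13 | 1 | 0 | 3 | 16 | 0.81 | 0.80 |
| MSC 95% cutoff | CADD | 4 | 1 | 0 | 4 | 9 | 0.56 | 7 | 0 | 1 | 9 | 16 | 0.44 | 0.48 |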

Note: TN: true negative, TP: true positive, FN: false negative, FP: false positive, ACC: accuracy

Functional testing (data obtained from reporter construct experiments) was considered the correct outcome.

Three prediction tools (PROVEAN, SNPs&GO, and Meta-SNP) correctly predicted the effect of all nine variants tested.

Updated 02/07/2021

InMeRF: prediction of pathogenicity of missense variants by individual modeling for each amino acid substitution
Jun-Ichi Takeda, Kentaro Nanatsue, Ryosuke Yamagishi, Mikako Ito, Nobuhiko Haga, Hiromi Hirata, Tomoo Ogi, Kinji Ohno in NAR Genomics and Bioinformatics. 2020 May 26;2(2):lqaa038. doi: 10.1093/nargab/lqaa038. eCollection 2020 Jun.


In predicting the pathogenicity of a nonsynonymous single-nucleotide variant (nsSNV), a radical change in amino acid properties is prone to be classified as being pathogenic. However, not all such nsSNVs are associated with human diseases. We generated random forest (RF) models individually for each amino acid substitution to differentiate pathogenic nsSNVs in the Human Gene Mutation Database and common nsSNVs in dbSNP. We named a set of our models ‘Individual Meta RF’ (InMeRF). Ten-fold cross-validation of InMeRF showed that the areas under the curves (AUCs) of receiver operating characteristic (ROC) and precision-recall curves were on average 0.941 and 0.957, respectively. To compare InMeRF with seven other tools, the eight tools were generated using the same training dataset, and were compared using the same three testing datasets. ROC-AUCs of InMeRF were ranked first in the eight tools. We applied InMeRF to 155 pathogenic and 125 common nsSNVs in seven major genes causing congenital myasthenic syndromes, as well as in VANGL1 causing spina bifida, and found that the sensitivity and specificity of InMeRF were 0.942 and 0.848, respectively. We made the InMeRF web service, and also made genome-wide InMeRF scores available online (https://www.med.nagoya-u.ac.jp/neurogenetics/InMeRF/).

Source: https://pubmed.ncbi.nlm.nih.gov/33543123/
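The sensitivity and specificity quoted in the InMeRF abstract can be illustrated with a short calculation. The abstract gives only the totals (155 pathogenic and 125 common nsSNVs) and the two rates; the individual counts below (146 true positives, 106 true negatives) are back-calculated assumptions chosen to be consistent with those rates, not figures taken from the paper.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of pathogenic variants that are called pathogenic."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of common (benign) variants that are called benign."""
    return tn / (tn + fp)


# Assumed counts: 146 + 9 = 155 pathogenic, 106 + 19 = 125 common nsSNVs.
print(round(sensitivity(tp=146, fn=9), 3))   # 0.942
print(round(specificity(tn=106, fp=19), 3))  # 0.848
```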

ADDRESS: A database of disease-associated human variants incorporating protein structure and folding stabilities
Jaie Woodard, Chengxin Zhang, Yang Zhang in J Mol Biol. 2021 Feb 1;166840. doi: 10.1016/j.jmb.2021.166840.


Numerous human diseases are caused by mutations in genomic sequences. Since amino acid changes affect protein function through mechanisms often predictable from protein structure, the integration of structural and sequence data enables us to estimate with greater accuracy whether and how a given mutation will lead to disease. Publicly available annotated databases enable hypothesis assessment and benchmarking of prediction tools. However, the results are often presented as summary statistics or black box predictors, without providing full descriptive information. We developed a new semi-manually curated human variant database presenting information on the protein contact-map, sequence-to-structure mapping, amino acid identity change, and stability prediction for the popular UniProt database. We found that the profiles of pathogenic and benign missense polymorphisms can be effectively deduced using decision trees and comparative analyses based on the presented dataset. The database is made publicly available through https://zhanglab.ccmb.med.umich.edu/ADDRESS.

Source: https://pubmed.ncbi.nlm.nih.gov/33539887/

PopDel identifies medium-size deletions simultaneously in tens of thousands of genomes


Thousands of genomic structural variants (SVs) segregate in the human population and can impact phenotypic traits and diseases. Their identification in whole-genome sequence data of large cohorts is a major computational challenge. Most current approaches identify SVs in single genomes and afterwards merge the identified variants into a joint call set across many genomes. We describe the approach PopDel, which directly identifies deletions of about 500 to at least 10,000 bp in length in data of many genomes jointly, eliminating the need for subsequent variant merging. PopDel scales to tens of thousands of genomes as we demonstrate in evaluations on up to 49,962 genomes. We show that PopDel reliably reports common, rare and de novo deletions. On genomes with available high-confidence reference call sets PopDel shows excellent recall and precision. Genotype inheritance patterns in up to 6794 trios indicate that genotypes predicted by PopDel are more reliable than those of previous SV callers. Furthermore, PopDel’s running time is competitive with the fastest tested previous tools. The demonstrated scalability and accuracy of PopDel enables routine scans for deletions in large-scale sequencing studies.

Source: https://pubmed.ncbi.nlm.nih.gov/33526789/
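The abstract uses genotype inheritance patterns in trios as evidence of genotype reliability. One simple way to look at such patterns is a Mendelian consistency check, sketched below. This is not PopDel's evaluation code: the function names are hypothetical, and genotypes are assumed to be encoded as the number of deletion alleles carried (0, 1 or 2) at an autosomal site.

```python
# Deletion alleles a parent can transmit, keyed by the parent's genotype.
TRANSMISSIBLE = {0: (0,), 1: (0, 1), 2: (1,)}


def is_mendelian_consistent(father: int, mother: int, child: int) -> bool:
    """True if the child's genotype can be formed from one allele per parent."""
    for genotype in (father, mother, child):
        if genotype not in TRANSMISSIBLE:
            raise ValueError("genotypes must be 0, 1 or 2 deletion alleles")
    return any(
        paternal + maternal == child
        for paternal in TRANSMISSIBLE[father]
        for maternal in TRANSMISSIBLE[mother]
    )


def consistency_rate(trios) -> float:
    """Fraction of (father, mother, child) genotype triples that are consistent."""
    trios = list(trios)
    if not trios:
        raise ValueError("no trios given")
    return sum(is_mendelian_consistent(*trio) for trio in trios) / len(trios)


print(consistency_rate([(0, 1, 1), (1, 1, 2), (2, 0, 1), (0, 0, 1)]))  # 0.75
```

An inconsistent trio such as (0, 0, 1) is either a genotyping error or a genuine de novo deletion, so a low inconsistency rate is only indirect evidence of accuracy.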

Other articles related to Genomics and Bioinformatics on this online Open Access Journal Include:

Finding the Genetic Links in Common Disease: Caveats of Whole Genome Sequencing Studies

Large-scale sequencing does not support the idea that lower-frequency variants have a major role in predisposition to type 2 diabetes

US Personalized Cancer Genome Sequencing Market Outlook 2018 –

Icelandic Population Genomic Study Results by deCODE Genetics come to Fruition: Curation of Current genomic studies


Precision Medicine for Future of Genomics Medicine is The New Era

Demet Sag, PhD, CRA, GCP


Are we there yet?  Life is a journey, and so is science.

Governor Brown announced a Precision Medicine initiative for California on April 14, 2015.  UC San Francisco is hosting the two-year initiative, through UC Health, which includes UC’s five medical centers, with $3 million in startup funds from the state. The public-private initiative aims to leverage these funds with contributions from other academic and industry partners.

With so many campuses spread throughout the state and so much scientific, clinical and computational expertise, the UC system has the potential to bring it all together, said Atul Butte, MD, PhD, who is leading the initiative.

At the beginning of 2015, President Obama announced the national Precision Medicine Initiative and assigned people to work on this project.

Previously NCI Director Harold Varmus, MD said that “Precision medicine is really about re-engineering the diagnostic categories for cancer to be consistent with its genomic underpinnings, so we can make better choices about therapy,” and “In that sense, many of the things we’re proposing to do are already under way.”

The proposed initiative has two main components:

  • a near-term focus on cancers and
  • a longer-term aim to generate knowledge applicable to the whole range of health and disease.

Both components are now within our reach because of advances in basic research, including molecular biology, genomics, and bioinformatics. Furthermore, the initiative taps into converging trends of increased connectivity, through social media and mobile devices, and Americans’ growing desire to be active partners in medical research.

Since the human genome was sequenced, it has become clear that there are fewer genes than expected and that many are shared among organisms to accomplish the same or similar core biological functions.  As a result, knowledge of the biological role of such shared proteins in one organism can be transferred to another organism.

I remember screening the X chromosome by deletion/duplication mapping, using P elements and Bar balancers as tools to keep the genome stable, in order to identify trans-regulating elements of the ovo gene, a female germline-specific sex determination gene in Drosophila melanogaster. For my dissertation, I screened the X chromosome using 45 deficiency strains and found that these trans-regulating regions grouped into 12 loci based on overlapping cytology. Five regions were trans-regulating activators and seven were trans-regulating repressors; extrapolating to the entire genome, this result predicted nearly 85 loci. This one gene may express three proteins at different times of development and activate or downregulate various regions to accommodate proper system development, in addition to auto-regulation and gene dose responses. Drosophila has only four chromosomes, yet its cellular interactions and signaling mechanisms are still complicated, although not as complicated as those of humans. I do appreciate the new applications and upcoming changes.

Now the technology is much better, and precision is the key to establishing its use in clinics.  However, we have new issues to overcome, such as computing on such big data, aligning properly, analyzing effectively, and comparing and contrasting the outcomes to identify the variations that may function in one population, or two, etc. At the end of the day, collaboration, standardization, and data sharing are a few of the key factors.

It is necessary to generate a dynamic yet controlled and standardized collection of information from ever-changing and accumulating data, which is why the Gene Ontology Consortium was created. Three independent ontologies can be accessed at http://www.geneontology.org; they are based on:

  1. biological process,
  2. molecular function and
  3. cellular component.


We need a common language of annotation to capture functional conservation. The genesis of this grand biological unification came with the completion of the genomic sequences of not only human but also the main model organisms and more. Some examples include:

  • the budding yeast, Saccharomyces cerevisiae,
  • the nematode worm Caenorhabditis elegans
  • the fruitfly Drosophila melanogaster,
  • the flowering plant Arabidopsis thaliana
  • fission yeast Schizosaccharomyces pombe
  • the mouse, Mus musculus

On the other hand, as we know, allelic variations underlie common diseases, and complete genome sequencing of many individuals with and without disease is required.  There are advantages and disadvantages: we can carry out partial surveys of the genome by genotyping large numbers of common SNPs in genome-wide association studies, but there are problems such as computing the data efficiently and sharing the information without compromising privacy. Therefore, we should be mindful of a few main conditions, including:

  1. models of the allelic architecture of common diseases,
  2. sample size,
  3. map density and
  4. sample-collection biases.

This will lead to cost control and efficiency while identifying genuine disease-susceptibility loci. Genome-wide association studies (GWAS) have progressed from assaying fewer than 100,000 SNPs to more than one million, and sample sizes have increased dramatically as the search for variants that explain more of the disease/trait heritability has intensified.

In addition, we must translate this sequence information from the genomic loci of genes to function, together with the related polymorphisms of these genes, so that possible patterns of gene expression and disease traits can be matched. Then, we may develop precision technologies for:

  1. Diagnostics
  2. Targeted Drugs and Treatments
  3. Biomarkers to modulate cells for correct functions

These depend on knowledge of:

  1. gene expression variations,
  2. insight into the genetic contribution to clinical endpoints of complex disease,
  3. their biological risk factors, and
  4. shared etiologic pathways.

Gaining this knowledge, therefore, requires an understanding of both:

  • the structure and
  • the biology of the genome.

These studies demonstrated hundreds of associations of common genetic variants with over 80 diseases and traits, collected in a controlled online resource.  However, identifying published GWAS can be challenging, as a simple PubMed search using the words “genome wide association studies” may easily return irrelevant results.

To address this, the National Human Genome Research Institute (NHGRI) developed the Catalog of Published Genome-Wide Association Studies (http://www.genome.gov/gwastudies), an online, regularly updated database of SNP-trait associations extracted from published GWAS.

Therefore, sequencing a human genome is quite an undertaking and requires tools that make it possible:

  • to explore the genetic component in complex diseases and
  • to fully understand the genetic pathways contributing to complex disease

Examples of Gene Ontology

The rapid increase in the number of GWAS provides an unprecedented opportunity to examine the potential impact of common genetic variants on complex diseases by systematically cataloging and summarizing key characteristics of the observed associations and the trait/disease associated SNPs (TASs) underlying them.


With this in mind, several aims can be established:

  1. to describe the features of this resource and the methods we have used to produce it,
  2. to provide and examine key descriptive characteristics of reported TASs such as estimated risk allele frequencies and odds ratios,
  3. to examine the underlying functionality of reported risk loci by mapping them to genomic annotation sets and assessing overrepresentation via Monte Carlo simulations and
  4. to investigate the relationship between recent human evolution and human disease phenotypes.
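The third aim above, assessing overrepresentation via Monte Carlo simulation, can be sketched as follows. This is a deliberately simplified null model and not the procedure used for the catalog: it assumes each of `n_snps` SNPs falls in the annotation set independently with probability `background_rate`, whereas a real analysis would typically draw matched control SNPs. All names and numbers are illustrative.

```python
import random


def monte_carlo_enrichment(observed_hits: int, n_snps: int,
                           background_rate: float,
                           n_sim: int = 10_000, seed: int = 0) -> float:
    """Empirical p-value for seeing at least `observed_hits` annotated SNPs.

    Each simulation draws n_snps SNPs under the null and counts how many
    land in the annotation set; the p-value is the share of simulations
    with a count at least as large as the observed one (with a +1 correction).
    """
    rng = random.Random(seed)
    at_least_as_extreme = 0
    for _simulation in range(n_sim):
        hits = sum(1 for _snp in range(n_snps) if rng.random() < background_rate)
        if hits >= observed_hits:
            at_least_as_extreme += 1
    return (at_least_as_extreme + 1) / (n_sim + 1)


# Hypothetical numbers: 30 of 100 associated SNPs are annotated, against a
# background in which 10% of SNPs carry the annotation.
p_value = monte_carlo_enrichment(observed_hits=30, n_snps=100,
                                 background_rate=0.10, n_sim=2_000)
print(p_value)
```

The +1 in numerator and denominator keeps the empirical p-value from ever being reported as exactly zero, which would overstate what a finite number of simulations can show.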


This procedure has no clear path, and a major obstacle is that the actual functional variant is often unknown. It may be:

  1. the trait/disease associated SNP (TAS) itself,
  2. a known common SNP in strong linkage disequilibrium (LD) with the TAS,
  3. an unknown common SNP tagged by a haplotype on which the TAS occurs,
  4. a rare single nucleotide variant tagged by a haplotype on which the TAS occurs, or
  5. a linked copy number variant (CNV).
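Because several of these possibilities hinge on linkage disequilibrium with the TAS, it may help to see how LD strength is commonly quantified. The sketch below computes the standard r² statistic from haplotype and allele frequencies; the function name and the example frequencies are illustrative assumptions, not values from the catalog.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic (0 < frequency < 1)")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Alleles that always travel together give r^2 = 1; independent alleles give 0.
print(round(ld_r_squared(p_ab=0.30, p_a=0.30, p_b=0.30), 6))
print(round(ld_r_squared(p_ab=0.09, p_a=0.30, p_b=0.30), 6))
```

A TAS with high r² to a nearby variant cannot be distinguished from it by association alone, which is why the functional variant often remains unknown.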


There can be other factors such as

  • Evolution,
  • Natural Selection
  • Environment
  • Pedigree
  • Epigenetics


Heredity is another major factor. The concept of heritability, and its definition as an estimable, dimensionless population parameter, was introduced by Sewall Wright and Ronald Fisher almost a century ago.


As a result, heritability has gained interest because it allows us to compare the relative importance of genes and environment to the variation of traits within and across populations. Heritability remains key:


  • to selection in evolutionary biology and agriculture, and
  • to the prediction of disease risk in medicine.
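For reference, the textbook definitions behind this dimensionless parameter can be written as variance ratios. The notation below is the conventional one rather than anything taken from this post, and the simple decomposition assumes no covariance or interaction between genetic and environmental effects.

```latex
\[
V_P = V_G + V_E, \qquad
H^2 = \frac{V_G}{V_P}, \qquad
h^2 = \frac{V_A}{V_P}
\]
```

Here V_P is the phenotypic variance, V_G the total genetic variance, V_E the environmental variance, and V_A the additive genetic variance; H² is the broad-sense and h² the narrow-sense heritability. Both are ratios of variances, so they carry no units and lie between 0 and 1.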

Reported TASs associated with two or more distinct traits

| Chromosomal region | Rs number(s) | Attributed genes | Associated traits reported in catalog |
|---|---|---|---|
| 1p13.2 | rs2476601, rs6679677 | PTPN22 | Crohn’s disease, type 1 diabetes, rheumatoid arthritis |
| 1q23.2 | rs2251746, rs2494250 | FCER1A | Serum IgE levels, select biomarker traits (MCP1) |
| 2p15 | rs1186868, rs1427407 | BCL11A | Fetal hemoglobin, F-cell distribution |
| 2p23.3 | rs780094 | GCKR | CRP, lipids, waist circumference |
| 6p21.33 | rs3131379, rs3117582 | HLA / MHC region | Systemic lupus erythematosus, lung cancer, psoriasis, inflammatory bowel disease, ulcerative colitis, celiac disease, rheumatoid arthritis, juvenile idiopathic arthritis, multiple sclerosis, type 1 diabetes |
| 6p22.3 | rs6908425, rs7756992, rs7754840, rs10946398, rs6931514 | CDKAL1 | Crohn’s disease, type 2 diabetes |
| 6p25.3 | rs1540771, rs12203592, rs872071 | IRF4 | Freckles, hair color, chronic lymphocytic leukemia |
| 6q23.3 | rs5029939, rs10499194 | TNFAIP3 | Systemic lupus erythematosus, rheumatoid arthritis |
| 7p15.1 | rs1635852, rs864745 | JAZF1 | Height, type 2 diabetes* |
| 8q24.21 | rs6983267 | Intergenic | Prostate or colorectal cancer, breast cancer |
| 9p21.3 | rs10811661, rs1333040, rs10811661, rs10757278, rs1333049 | CDKN2A, CDKN2B | Type 2 diabetes, intracranial aneurysm, myocardial infarction |
| 9q34.2 | rs505922, rs507666, rs657152 | ABO | Protein quantitative trait loci (TNF-α), soluble ICAM-1, plasma levels of liver enzymes (alkaline phosphatase) |
| 12q24 | rs1169313, rs7310409, rs1169310, rs2650000 | HNF1A | Plasma levels of liver enzyme (GGT), C-reactive protein, LDL cholesterol |
| 16q12.2 | rs8050136, rs9930506, rs6499640, rs9939609, rs1121980 | FTO | Type 2 diabetes, body mass index or weight |
| 17q12 | rs7216389, rs2872507 | ORMDL3 | Asthma, Crohn’s disease |
| 17q12 | rs4430796 | TCF2 | Prostate cancer, type 2 diabetes |
| 18p11.21 | rs2542151 | PTPN2 | Type 1 diabetes, Crohn’s disease |
| 19q13.32 | rs4420638 | APOE, APOC1, APOC4 | Alzheimer’s disease, lipids |

* The well known association of JAZF1 with prostate cancer was reported with a p value of 2 × 10⁻⁶, which did not meet the threshold of 5 × 10⁻⁸ for this analysis.

PMC full text: Proc Natl Acad Sci U S A. 2009 Jun 9; 106(23): 9362–9367. Published online 2009 May 27. doi: 10.1073/pnas.0903103106


Allele-Frequency Data for Nine Reproducible Associations

| gene | disease (a) | SNP | associated allele (b) | European (d) | African (e) | δ (f) | FST | reference(s) (c) |
|---|---|---|---|---|---|---|---|---|
| CTLA4 | T1DM | Thr17Ala | Ala | .38 (1,670) | .209 (402) | .171 | .06 | Osei-Hyiaman et al. 2001; Lohmueller et al. 2003 |
| DRD3 | Schizophrenia | Ser9Gly | Ser/Ser | .67 (202) | .116 (112) | .554 | .458 | Crocq et al. 1996; Lohmueller et al. 2003 |
| AGT | Hypertension | Thr235Met | Thr | .42 (3,034) | .91 (658) | .49 | .358 | Rotimi et al. 1996; Nakajima et al. 2002 |
| PRNP | CJD | Met129Val | Met | .72 (138) | .556 (72) | .164 | .049 | Hirschhorn et al. 2002; Soldevila et al. 2003 |
| F5 | DVT | Arg506Gln | Gln | .044 (1,236) | .00 (251) | .044 | .03 | Rees et al. 1995; Hirschhorn et al. 2002 |
| HFE | HFE | Cys382Tyr | Tyr | .038 (2,900) | .00 (806) | .038 | .024 | Feder et al. 1996; Merryweather-Clarke et al. 1997 |
| MTHFR | DVT | C677T | T | .3 (188) | .066 (468) | .234 | .205 | Schneider et al. 1998; Ray et al. 2002 |
| PPARG | T2DM | Pro12Ala | Pro | .925 (120) | 1.0 (120) | .075 | .067 | Altshuler et al. 2000; HapMap Project |
| KCNJ11 | T2DM | Asp23Lys | Lys | .36 (96) | .09 (98) | .27 | .182 | Florez et al. 2004 |

(a) CJD = Creutzfeldt-Jacob disease; DVT = deep venous thrombosis; HFE = hemochromatosis; T1DM = type I diabetes; T2DM = type II diabetes.

(b) The associated allele is the SNP associated with disease, regardless of whether it is the derived or the ancestral allele. The frequencies for this allele are given.

(c) The reference that claims this to be a reproducible association, as well as the reference from which the allele frequencies were taken. For allele frequencies obtained from a meta-analysis, only the reference claiming reproducible association is given.

(d) Allele frequency obtained from the literature involving a European population. Either the general population frequency or the frequency in control groups in an association study was used. To reduce bias, when a control frequency was used for Europeans, a control frequency was also used for Africans. The total number of chromosomes surveyed is given in parentheses after each frequency.

(e) Allele frequency obtained from the literature involving a West African population. The total number of chromosomes surveyed is given in parentheses after each frequency.

(f) δ = The difference in the allele frequency between Europeans and Africans.
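The δ column can be reproduced directly from the two frequency columns, as the short sketch below shows for the first three rows of the table. The function name is illustrative. FST is not recomputed here because the estimator used in the paper is not stated in this post.

```python
def allele_frequency_delta(p_european: float, p_african: float) -> float:
    """Absolute difference in allele frequency between the two populations."""
    for frequency in (p_european, p_african):
        if not 0.0 <= frequency <= 1.0:
            raise ValueError("allele frequencies must lie between 0 and 1")
    return abs(p_european - p_african)


# (European, African) frequencies of the associated allele, from the table.
frequencies = {
    "CTLA4": (0.38, 0.209),
    "DRD3": (0.67, 0.116),
    "AGT": (0.42, 0.91),
}

for gene, (european, african) in frequencies.items():
    print(gene, round(allele_frequency_delta(european, african), 3))
# CTLA4 0.171
# DRD3 0.554
# AGT 0.49
```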


PMC full text:

Am J Hum Genet. 2006 Jan; 78(1): 130–136. Published online 2005 Nov 16. doi: 10.1086/499287

Allele-Frequency Data for 39 Reported Associations

| gene | disease/phenotype (a) | SNP | associated allele (b) | European (d) | African (e) | δ (f) | FST | reference (c) |
|---|---|---|---|---|---|---|---|---|
| ADRB1 | MI | Arg389Gly | Arg | .717 (46) | .467 (30) | .251 | .1 | Iwai et al. 2003 |
| ALOX5AP | MI, stroke | rs10507391 | T | .682 (44) | .159 (44) | .523 | .425 | Helgadottir et al. 2004 |
| CAT | Hypertension | −844 (C/T) | T (g) | .714 (42) | .659 (44) | .055 | 0 | Jiang et al. 2001 |
| CCR2 | AIDS susceptibility | Ile64Val | Val | .87 (46) | .813 (48) | .057 | 0 | Smith et al. 1997 |
| CD36 | Malaria | Y to stop | Stop | 0 (46) | .083 (48) | .083 | .062 | Aitman et al. 2000 |
| F13 | MI | Val34Leu | Val | .762 (42) | .795 (44) | .033 | 0 | Kohler et al. 1999 |
| FGA | Pulmonary embolism | Thr312Ala | Ala | .2 (40) | .5 (42) | .3 | .159 | Carter et al. 2000 |
| GP1BA | CAD | Thr145Met | Met | .022 (46) | .167 (48) | .145 | .095 | Gonzalez-Conejero et al. 1998 |
| ICAM1 | MS | Lys469Glu | Lys | .643 (42) | .875 (48) | .232 | .12 | Nejentsev et al. 2003 |
| ICAM1 | Malaria | Lys29Met | Met | 0 (46) | .354 (48) | .354 | .335 | Fernandez-Reyes et al. 1997 |
| IFNGR1 | Hp infection | −56 (C/T) | T | .455 (44) | .604 (48) | .15 | .023 | Thye et al. 2003 |
| IL13 | Asthma | −1055 (C/T) | T | .196 (46) | .25 (44) | .054 | 0 | van der Pouw Kraan et al. 1999 |
| IL13 | Bronchial asthma | Arg110Gln | Gln | .273 (44) | .119 (42) | .154 | .05 | Heinzmann et al. 2003 |
| IL1A | AD | −889 (C/T) | T | .295 (44) | .391 (46) | .096 | 0 | Nicoll et al. 2000 |
| IL1B | Gastric cancer | −31 (C/T) | T | .826 (46) | .375 (48) | .451 | .335 | El-Omar et al. 2000 |
| IL3 | RA | −16 (C/T) | C | .739 (46) | .875 (48) | .136 | .037 | Yamada et al. 2001 |
| IL4 | Asthma | −590 (T/C) | T | .174 (46) | .708 (48) | .534 | .436 | Noguchi et al. 1998 |
| IL4R | Asthma | Gln576Arg | Arg | .295 (44) | .565 (46) | .27 | .118 | Hershey et al. 1997 |
| IL6 | Juvenile arthritis | −174 (C/G) | G | .5 (44) | 1 (46) | .5 | .494 | Fishman et al. 1998 |
| IL8 | RSV bronchiolitis | −251 (T/A) | T (h) | .659 (44) | .229 (48) | .43 | .301 | Hull et al. 2000 |
| ITGA2 | MI | 807 (C/T) | T | .316 (38) | .25 (48) | .066 | 0 | Moshfegh et al. 1999 |
| LTA | MI | Thr26Asn | Asn | .357 (42) | .5 (44) | .143 | .018 | Ozaki et al. 2002 |
| MC1R | Fair skin | Val92Met | Met | .068 (44) | 0 (44) | .068 | .047 | Valverde et al. 1995 |
| NOS3 | MI | Glu298Asp | Asp | .5 (44) | .136 (44) | .364 | .247 | Shimasaki et al. 1998 |
| PLAU | AD | Pro141Leu | Pro | .659 (44) | .979 (48) | .32 | .287 | Finckh et al. 2003 |
| PON1 | CAD | Arg192Gln | Arg | .174 (46) | .727 (44) | .553 | .461 | Serrato and Marian 1995 |
| PON2 | CAD | Cys311Ser | Ser | .826 (46) | .762 (42) | .064 | 0 | Sanghera et al. 1998 |
| PTGS2 | Colon cancer | −765 (G/C) | C | .238 (42) | .292 (48) | .054 | 0 | Koh et al. 2004 |
| PTPN22 (i) | RA | Arg620Trp | Trp | .084 (1,120) | .024 (818) | .059 | .03 | Begovich et al. 2004 |
| SELE | CAD | Ser128Arg | Arg | .091 (44) | .021 (48) | .07 | .025 | Wenzel et al. 1994 |
| SELL | IgA nephropathy | Pro238Ser | Ser | .065 (46) | .333 (48) | .268 | .183 | Takei et al. 2002 |
| SELP | MI | Thr715Pro | Thr | .864 (44) | .977 (44) | .114 | .063 | Herrmann et al. 1998 |
| SFTPB | ARDS | Ile131Thr | Thr | .5 (44) | .348 (46) | .152 | .025 | Lin et al. 2000 |
| SPD | RSV infection | Met11Thr | Met | .568 (44) | .478 (46) | .09 | 0 | Lahti et al. 2002 |
| TF | AD | Pro570Ser | Pro | .957 (46) | .935 (46) | .022 | 0 | Zhang et al. 2003 |
| THBD | MI | Ala455Val | Ala | .87 (46) | .848 (46) | .022 | 0 | Norlund et al. 1997 |
| THBS4 | MI | Ala387Pro | Pro | .341 (44) | .083 (48) | .258 | .166 | Topol et al. 2001 |
| TNFA | Infectious disease | −308 (A/G) | A | .182 (44) | .205 (44) | .023 | 0 | Bayley et al. 2004 |
| VCAM1 | Stroke in SCD | Gly413Ala | Gly | 1 (46) | .938 (48) | .063 | .041 | Taylor et al. 2002 |

(a) AD = Alzheimer disease; AIDS = acquired immunodeficiency syndrome; ARDS = acute respiratory distress syndrome; CAD = coronary artery disease; Hp = Helicobacter pylori; MI = myocardial infarction; MS = multiple sclerosis; RA = rheumatoid arthritis; RSV = respiratory syncytial virus; SCD = sickle cell disease.

(b) The associated allele is the SNP associated with disease, regardless of whether it is the derived or the ancestral allele. The frequencies for this allele are given.

(c) The reference that reported association with the listed disease/phenotype.

(d) Frequency obtained from the Seattle SNPs database for the European sample. The total number of chromosomes surveyed is given in parentheses after each frequency.

(e) Frequency obtained from the Seattle SNPs database for the African American sample. The total number of chromosomes surveyed is given in parentheses after each frequency.

(f) δ = The difference in the allele frequency between African Americans and Europeans.

(g) Associated allele in database is A.

(h) Associated allele in reference is A.

(i) This SNP was not from the Seattle SNPs database; instead, allele frequencies from Begovich et al. (2004) were used.


Lohmueller et al. (2006) reported that “The SNPs associated with common disease that we investigated do not show much higher levels of differentiation than those of random SNPs. Thus, in these cases, ethnicity is a poor predictor of an individual’s genotype, which is also the pattern for random variants in the genome. This lends support to the hypothesis that many population differences in disease risk are environmental, rather than genetic, in origin. However, some exceptional SNPs associated with common disease are highly differentiated in frequency across populations, because of either a history of random drift or natural selection. The exceptional SNPs given are located in AGT, DRD3, ALOX5AP, ICAM1, IL1B, IL4, IL6, IL8, and PON1.

Of note, evidence of selection has been observed for AGT (Nakajima et al. 2004), IL4 (Rockman et al. 2003), IL8 (Hull et al. 2001), and PON1 (Allebrandt et al. 2002). Yet, for the vast majority of the common-disease–associated polymorphisms we examined, ethnicity is likely to be a poor predictor of an individual’s genotype.”
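
The δ and FST columns in the tables above both summarize how far an allele's frequency differs between two populations. As a rough illustration, the Python sketch below computes δ as the absolute frequency difference and FST with a Hudson-style two-population estimator that corrects for the number of chromosomes sampled. The function names are hypothetical, and both the choice of estimator and the clipping of negative estimates to 0 are assumptions made for this sketch; the paper's own calculation may differ in detail.

```python
def delta(p1: float, p2: float) -> float:
    """Absolute difference in allele frequency between two populations."""
    return abs(p1 - p2)


def hudson_fst(p1: float, n1: int, p2: float, n2: int) -> float:
    """Hudson-style FST for one SNP in two populations.

    p1, p2: sample frequencies of the same allele in each population.
    n1, n2: number of chromosomes surveyed in each population.
    Negative estimates are clipped to 0 (an assumption of this sketch).
    """
    denominator = p1 * (1 - p2) + p2 * (1 - p1)
    if denominator == 0:
        return 0.0  # allele fixed or absent in both samples
    # Squared difference minus the within-population sampling variance
    numerator = (
        (p1 - p2) ** 2
        - p1 * (1 - p1) / (n1 - 1)
        - p2 * (1 - p2) / (n2 - 1)
    )
    return max(0.0, numerator / denominator)


# Frequencies and chromosome counts taken from the IL4 row of the table
d = delta(0.174, 0.708)
f = hudson_fst(0.174, 46, 0.708, 48)
print(f"delta = {d:.3f}, FST estimate = {f:.3f}")
```

A small δ combined with small samples can give a negative raw estimate, which is why the sketch clips at 0.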


In 2002 the International HapMap Project was launched to provide a public resource to accelerate medical genetic research.

Two phases of the HapMap Project were completed.

In Phase I the objective was to genotype at least one common SNP every 5 kilobases (kb) across the euchromatic portion of the genome in 270 individuals from four geographically diverse populations.

In Phase II of the HapMap Project, a further 2.1 million SNPs were successfully genotyped on the same individuals.

The re-mapping of SNPs from Phase I of the project identified 21,177 SNPs that had an ambiguous position or some other feature indicative of low reliability. These are not included in the filtered Phase II data release. All genotype data are available from the HapMap Data Coordination Center located at (http://www.hapmap.org) and dbSNP (http://www.ncbi.nlm.nih.gov/SNP).

In the Phase II HapMap, 32,996 recombination hotspots were identified (an increase of over 50% from Phase I), of which 68% localized to a region of ≤5 kb. The median map distance induced by a hotspot is 0.043 cM (or one crossover per 2,300 meioses) and the hottest identified, on chromosome 20, is 1.2 cM (one crossover per 80 meioses). Hotspots account for approximately 60% of recombination in the human genome and about 6% of sequence.
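
The conversion between map distance and "one crossover per N meioses" quoted above can be checked with a few lines of Python. This is a minimal sketch that assumes 1 cM corresponds to a crossover probability of 0.01 per meiosis, an approximation that is reasonable for intervals as short as a hotspot; the function name is hypothetical.

```python
def meioses_per_crossover(map_distance_cm: float) -> float:
    """Expected number of meioses per crossover for a short map interval.

    Assumes 1 cM is a 0.01 crossover probability per meiosis.
    """
    crossover_probability = map_distance_cm / 100.0
    return 1.0 / crossover_probability


for label, cm in [("median hotspot", 0.043), ("hottest hotspot, chromosome 20", 1.2)]:
    n = meioses_per_crossover(cm)
    print(f"{label}: {cm} cM is about one crossover per {n:.0f} meioses")
```

Rounded to two significant figures, the results agree with the 2,300 and 80 meioses given in the text.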

In addition to many regions previously identified in HapMap Phase I, including LARGE, SYT1 and SULT1C2 (previously called SULT1C1), about 200 regions were identified from the Phase II HapMap; these include many established cases of selection, such as the genes HBB and LCT, the HLA region, and an inversion on chromosome 17. Finally, in the future, whole-genome sequencing will provide a natural convergence of technologies to type both SNP and structural variation. Nevertheless, until that point, and even after, the HapMap Project data will provide an invaluable resource for understanding the structure of human genetic variation and its link to phenotype.

HMM libraries, such as PANTHER, Pfam, and SMART, are used primarily to recognize and annotate conserved motifs in protein sequences.

In the genomic era, one of the fundamental goals is to characterize the function of proteins on a large scale.

PANTHER was designed to relate protein sequence relationships to functional relationships in a robust and accurate way. It has two main parts:

  • the PANTHER library (PANTHER/LIB): a collection of “books,” each representing a protein family as a multiple sequence alignment, a Hidden Markov Model (HMM), and a family tree.
  • the PANTHER index (PANTHER/X): an ontology for summarizing and navigating molecular functions and biological processes associated with the families and subfamilies.


PANTHER has been applied to three areas of active research:

  • to report the size and sequence diversity of the families and subfamilies, characterizing the relationship between sequence divergence and functional divergence across a wide range of protein families.
  • to give a high-level representation of gene function across the human and mouse genomes, using the PANTHER/X ontology.
  • to rank missense single nucleotide polymorphisms (SNPs), on a database-wide scale, according to their likelihood of affecting protein function.

PRINTS is “a compendium of protein motif ‘fingerprints’. A fingerprint is defined as a group of motifs excised from conserved regions of a sequence alignment, whose diagnostic power or potency is refined by iterative database scanning (in this case the OWL composite sequence database)”.

The information contained within PRINTS is distinct from, but complementary to, the consensus expressions stored in the widely used PROSITE dictionary of patterns.

Beyond motif recognition, the position-specific amino acid probabilities in an HMM can also be used to annotate individual positions in a protein as being conserved (or conserving a property such as hydrophobicity) and therefore likely to be required for molecular function. For example, a mutation (or variant) at a conserved position is more likely to impact the function of that protein.

In addition, HMMs from different subfamilies of the same family can be compared with each other, to provide hypotheses about which residues may mediate the differences in function or specificity between the subfamilies.

As computational algorithms and databases for comparing protein sequences matured, profile methods were developed (Gribskov et al. 1987; Henikoff and Henikoff 1991; Attwood et al. 1994), particularly:

  1. Hidden Markov Models (HMM; Krogh et al. 1994; Eddy 1996) and
  2. PSI-BLAST (Altschul et al. 1997).

The profile has a different amino acid substitution vector at each position in the profile, based on the pattern of amino acids observed in a multiple alignment of related sequences.

Profile methods combine algorithms with databases:

A group of related sequences is used to build a statistical representation of corresponding positions in the related proteins. The power of these methods therefore increases as new sequences are added to the database of known proteins.

Multiple sequence alignments (Dayhoff et al. 1974) and profiles have allowed a systematic study of related sequences. One of the key observations is that some positions are “conserved,” that is, the amino acid is invariant or restricted to a particular property (such as hydrophobicity), across an entire group of related sequences.
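
The idea of a conserved position can be made concrete with a toy multiple sequence alignment. The Python sketch below summarizes each alignment column by its most common residue, that residue's frequency, and whether every residue in the column is hydrophobic. The sequences, the hydrophobic residue set, and the function name are all made up for illustration and do not come from any of the databases discussed here.

```python
from collections import Counter

# Illustrative grouping only; hydrophobicity scales differ between sources.
HYDROPHOBIC = set("AVILMFWC")


def column_summary(alignment, col):
    """Summarize one column of an aligned set of equal-length sequences."""
    residues = [seq[col] for seq in alignment if seq[col] != "-"]
    if not residues:
        return None  # column contains only gaps
    top_residue, top_count = Counter(residues).most_common(1)[0]
    return {
        "top_residue": top_residue,
        "top_fraction": top_count / len(residues),
        "all_hydrophobic": all(r in HYDROPHOBIC for r in residues),
    }


# Hypothetical aligned sequences
toy_alignment = ["MKVLA", "MKILA", "MRLLG", "MKVLS"]

for col in range(len(toy_alignment[0])):
    print(col, column_summary(toy_alignment, col))
```

In this toy example the first and fourth columns are invariant, while the third column varies in residue identity but keeps the hydrophobic property, which is the second kind of conservation described above.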

The dependence of profile and pattern-matching approaches (Jongeneel et al. 1989) on sequence databases led to the development of databases of profiles (BLOCKS, Henikoff and Henikoff 1991; PRINTS, Attwood et al. 1994) and patterns (PROSITE, Bairoch 1991) that could be searched in much the same way as sequence databases.


Among the most widely used protein family databases are

  1. Pfam (Sonnhammer et al. 1997; Bateman et al. 2002) and
  2. SMART (Schultz et al. 1998; Letunic et al. 2002), which combine expert analysis with the well-developed HMM formalism for statistical modeling of protein families (mostly families of related protein domains).

To predict a protein’s function, it can be enough to know either its family membership or its subfamily within that family (Hannenhalli and Russell 2000).

  • Phylogenetic trees (representing the evolutionary relationships between sequences) and
  • dendrograms (tree structures representing the similarity between sequences) (e.g., Chiu et al. 1985; Rollins et al. 1991).


The PANTHER/LIB HMMs can be viewed as a statistical method for scoring the “functional likelihood” of different amino acid substitutions in a wide variety of proteins. Because it uses evolutionarily related sequences to estimate the probability of a given amino acid at a particular position in a protein, the method can be referred to as generating “position-specific evolutionary conservation” (PSEC) scores.
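
To show how position-specific probabilities might be turned into a substitution score, the Python sketch below estimates amino acid probabilities for one alignment column and scores a substitution as the negative absolute log ratio of the two residues' probabilities. The log-ratio form, the pseudocount, the toy column, and the function names are assumptions made for this illustration; this is not PANTHER's implementation, which derives its probabilities from curated HMMs.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def position_probabilities(column, pseudocount=1.0):
    """Estimate amino acid probabilities at one position from observed residues.

    A pseudocount keeps unobserved residues from getting probability zero.
    """
    counts = Counter(column)
    total = len(column) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}


def substitution_score(probs, wild_type, variant):
    """Negative absolute log ratio of the two residues' probabilities.

    0 means both residues are equally probable at this position; more
    negative values mean the two residues differ more in probability.
    """
    return -abs(math.log(probs[wild_type] / probs[variant]))


# Hypothetical column from an alignment of eight related sequences
toy_column = "LLLLLLIV"
probs = position_probabilities(toy_column)

conservative = substitution_score(probs, "L", "I")  # I is observed here
radical = substitution_score(probs, "L", "D")       # D is never observed here
print(f"L->I: {conservative:.2f}, L->D: {radical:.2f}")
```

Under these assumptions, a substitution to a residue never seen at the position scores lower than a substitution to a residue that related sequences already tolerate there.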


The process for building PANTHER families includes:

  1. Family clustering.
  2. Multiple sequence alignment (MSA), family HMM, and family tree building.
  3. Family/subfamily definition and naming.
  4. Subfamily HMM building.
  5. Molecular function and biological process association.

Of these, steps 1, 2, and 4 are computational, and steps 3 and 5 are human-curated (with the extensive aid of software tools).



The precision medicine effort is the beginning of a new journey toward better health solutions.


Further Reading and References:

Human Phenome Project:

Freimer N., Sabatti C. The human phenome project. Nat. Genet. 2003;34:15–21.

Jones R., Pembrey M., Golding J., Herrick D. The search for genotype/phenotype associations and the phenome scan. Paediatr. Perinat. Epidemiol. 2005;19:264–275.

Stearns F.W. One hundred years of pleiotropy: A retrospective. Genetics. 2010;186:767–773.

Welch J.J., Waxman D. Modularity and the cost of complexity. Evolution. 2003;57:1723–1734.

Albert A.Y., Sawaya S., Vines T.H., Knecht A.K., Miller C.T., Summers B.R., Balabhadra S., Kingsley D.M., Schluter D. The genetics of adaptive shape shift in stickleback: Pleiotropy and effect size. Evolution. 2008;62:76–85.

Brem R.B., Yvert G., Clinton R., Kruglyak L. Genetic dissection of transcriptional regulation in budding yeast. Science. 2002;296:752–755.

Morley M., Molony C.M., Weber T.M., Devlin J.L., Ewens K.G., Spielman R.S., Cheung V.G. Genetic analysis of genome-wide variation in human gene expression. Nature. 2004;430:743–747.

Wagner G.P., Zhang J. The pleiotropic structure of the genotype-phenotype map: The evolvability of complex organisms. Nat. Rev. Genet. 2011;12:204–213.

Cooper Z.N., Nelson R.M., Ross L.F. Informed consent for genetic research involving pleiotropic genes: An empirical study of ApoE research. IRB. 2006;28:1–11.

Model Organisms:

The C. elegans Sequencing Consortium. Genome sequence of the nematode C. elegans: a platform for investigating biology. Science. 1998;282:2012–2018.

Adams MD, et al. The genome sequence of Drosophila melanogaster. Science. 2000;287:2185–2195.

Meinke DW, et al. Arabidopsis thaliana: a model plant for genome analysis. Science. 1998;282:662–682.

Chervitz SA, et al. Using the Saccharomyces Genome Database (SGD) for analysis of protein similarities and structure. Nucleic Acids Res. 1999;27:74–78.

The FlyBase Consortium. The FlyBase database of the Drosophila Genome Projects and community literature. Nucleic Acids Res. 1999;27:85–88.

Blake JA, et al. The Mouse Genome Database (MGD): expanding genetic and genomic resources for the laboratory mouse. Nucleic Acids Res. 2000;28:108–111.

Ball CA, et al. Integrating functional genomic information into the Saccharomyces Genome Database. Nucleic Acids Res. 2000;28:77–80.

Venter, J.C., Adams, M.D., Myers, E.W., Li, P.W., Mural, R.J., Sutton, G.G., Smith, H.O., Yandell, M., Evans, C.A., Holt, R.A., et al. 2001. The sequence of the human genome. Science 291: 1304–1351.

Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., et al. 2001. Initial sequencing and analysis of the human genome. Nature 409: 860–921.

Mi, H., Vandergriff, J., Campbell, M., Narechania, A., Lewis, S., Thomas, P.D., and Ashburner, M. 2003. Assessment of genome-wide protein function classification for Drosophila melanogaster. Genome Res.

Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., Davis, A.P., Dolinski, K., Dwight, S.S., Eppig, J.T., et al. The Gene Ontology Consortium. 2000. Gene ontology: Tool for the unification of biology. Nat. Genet. 25: 25–29.

Computational Biology

Attwood TK, Beck ME, Bleasby AJ, Parry-Smith DJ. PRINTS–a database of protein motif fingerprints. Nucleic Acids Res. 1994 Sep;22(17):3590-6.

Obenauer JC, Yaffe MB. Computational prediction of protein-protein interactions. Methods Mol Biol. 2004;261:445-68. Review.

Aitken A. Protein consensus sequence motifs. Mol Biotechnol. 1999 Oct;12(3):241-53. Review.

Bork P, Koonin EV. Protein sequence motifs. Curr Opin Struct Biol. 1996 Jun;6(3):366-76. Review.

Hodgman TC. The elucidation of protein function by sequence motif analysis.  Comput Appl Biosci. 1989 Feb;5(1):1-13. Review.

Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. 1997. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 25: 3389–3402.

Spencer CC, et al. The influence of recombination on human genetic diversity. PLoS Genet. 2006;2:e148.

Petes TD. Meiotic recombination hot spots and cold spots. Nature Rev. Genet. 2001;2:360–369.

Griffiths RC, Tavaré S. The age of a mutation in a general coalescent tree. Stoch Models. 1998;14:273–295. doi: 10.1080/15326349808807471.

Gauderman WJ. Sample size requirements for matched case-control studies of gene-environment interaction. Stat Med. 2002;21(1):35–50. doi: 10.1002/sim.973.

Attwood, T.K., Beck, M.E., Bleasby, A.J., and Parry-Smith, D.J. 1994. PRINTS—A database of protein motif fingerprints. Nucleic Acids Res. 22: 3590–3596.

Bairoch, A. 1991. PROSITE: A dictionary of sites and patterns in proteins. Nucleic Acids Res. 19 Suppl: 2241–2245.

Bairoch, A. and Apweiler, R. 2000. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res. 28: 45–48.

Bateman, A., Birney, E., Cerruti, L., Durbin, R., Etwiller, L., Eddy, S.R., Griffiths-Jones, S., Howe, K.L., Marshall, M., and Sonnhammer, E.L. 2002. The Pfam protein families database. Nucleic Acids Res. 30: 276–280.

Sonnhammer, E.L., Eddy, S.R., and Durbin, R. 1997. Pfam: A comprehensive database of protein domain families based on seed alignments. Proteins 28:405–420.

Swets, J.A. 1988. Measuring the accuracy of diagnostic systems. Science 240:1285–1293.

Thomas, P.D., Kejariwal, A., Campbell, M.J., Mi, H., Diemer, K., Guo, N., Ladunga, I., Ulitsky-Lazareva, B., Muruganujan, A., Rabkin, S., et al. 2003. PANTHER: A browsable database of gene products organized by biological function, using curated protein family and subfamily classification. Nucleic Acids Res. 31: 334–341.

HUGO Gene Nomenclature Committee (2011). HGNC Database. http://www.genenames.org/.

Population Genomics, GWAS, Inheritance, Heritability, Migration, Selection and Evolution:

Dayhoff, M.O., Barker, W.C., and McLaughlin, P.J. 1974. Inferences from protein and nucleic acid sequences: Early molecular evolution, divergence of kingdoms and rates of change. Orig. Life 5: 311–330.

Joseph Lachance. Disease-associated alleles in genome-wide association studies are enriched for derived low frequency alleles relative to HapMap and neutral expectations. BMC Med Genomics. 2010; 3: 57.

Joseph Lachance, Sarah A. Tishkoff. Biased Gene Conversion Skews Allele Frequencies in Human Populations, Increasing the Disease Burden of Recessive Alleles. Am J Hum Genet. 2014 October 2; 95(4): 408-420.

Hemalatha Kuppusamy, Helga M. Ogmundsdottir, Eva Baigorri, Amanda Warkentin, Hlif Steingrimsdottir, Vilhelmina Haraldsdottir, Michael J. Mant, John Mackey, James B. Johnston, Sophia Adamia, Andrew R. Belch, Linda M. Pilarski. Inherited Polymorphisms in Hyaluronan Synthase 1 Predict Risk of Systemic B-Cell Malignancies but Not of Breast Cancer. PLoS One. 2014; 9(6): e100691.

Joseph Lachance, Sarah A. Tishkoff. Population Genomics of Human Adaptation. Annu Rev Ecol Evol Syst. 2013 November; 44: 123–143.

Joseph Lachance, Sarah A. Tishkoff. SNP ascertainment bias in population genetic analyses: Why it is important, and how to correct it. Bioessays.

Erik Corona, Rong Chen, Martin Sikora, Alexander A. Morgan, Chirag J. Patel, Aditya Ramesh, Carlos D. Bustamante, Atul J. Butte. Analysis of the Genetic Basis of Disease in the Context of Worldwide Human Relationships and Migration. PLoS Genet. 2013 May; 9(5): e1003447.

Olga Y. Gorlova, Jun Ying, Christopher I. Amos, Margaret R. Spitz, Bo Peng, Ivan P. Gorlov. Derived SNP Alleles Are Used More Frequently Than Ancestral Alleles As Risk-Associated Variants In Common Human Diseases. J Bioinform Comput Biol.

Ani Manichaikul, Wei-Min Chen, Kayleen Williams, Quenna Wong, Michèle M. Sale, James S. Pankow, Michael Y. Tsai, Jerome I. Rotter, Stephen S. Rich, Josyf C. Mychaleckyj. Analysis of Family- and Population-Based Samples in Cohort Genome-Wide Association Studies. Hum Genet.

Altshuler D, Daly MJ, Lander ES. Genetic mapping in human disease. Science. 2008; 322(5903):881–888. doi: 10.1126/science.1156409.

Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007;447(7145):661–678. doi: 10.1038/nature05911.

Kotowski IK, Pertsemlidis A, Luke A, Cooper RS, Vega GL, Cohen JC, Hobbs HH. A spectrum of PCSK9 Alleles contributes to plasma levels of low-density lipoprotein cholesterol. American Journal of Human Genetics. 2006;78(3):410–422. doi: 10.1086/500615.

Tomlinson I, Webb E, Carvajal-Carmona L, Broderick P, Kemp Z, Spain S, Penegar S, Chandler I, Gorman M, Wood W. et al. A genome-wide association scan of tag SNPs identifies a susceptibility variant for colorectal cancer at 8q24.21. Nature Genetics. 2007;39(8):984–988. doi: 10.1038/ng2085.

Todd JA, Walker NM, Cooper JD, Smyth DJ, Downes K, Plagnol V, Bailey R, Nejentsev S, Field SF, Payne F. et al. Robust associations of four new chromosome regions from genome-wide analyses of type 1 diabetes. Nature Genetics. 2007;39(7):857–864. doi: 10.1038/ng2068.

Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, McCarthy MI, Ramos EM, Cardon LR, Chakravarti A. et al. Finding the missing heritability of complex diseases. Nature. 2009;461(7265):747–753. doi: 10.1038/nature08494.

Maher B. Personal genomes: The case of the missing heritability. Nature. 2008;456(7218):18–21. doi: 10.1038/456018a.

Clark AG, Boerwinkle E, Hixson J, Sing CF. Determinants of the success of whole-genome association testing. Genome Res. 2005;15(11):1463–1467. doi: 10.1101/gr.4244005.

Clarke AJ, Cooper DN. GWAS: heritability missing in action? European Journal of Human Genetics. 2010;18:859–861. doi: 10.1038/ejhg.2010.35.

Moore JH, Williams SM. Epistasis and its implications for personal genetics. Am J Hum Genet. 2009;85(3):309–320. doi: 10.1016/j.ajhg.2009.08.006.

Goldstein DB. Common genetic variation and human traits. N Engl J Med. 2009;360(17):1696–1698. doi: 10.1056/NEJMp0806284.

Hirschhorn JN. Genomewide association studies–illuminating biologic pathways. N Engl J Med. 2009;360(17):1699–1701. doi: 10.1056/NEJMp0808934.

Iles MM. What can genome-wide association studies tell us about the genetics of common disease? PLoS Genet. 2008;4(2):e33. doi: 10.1371/journal.pgen.0040033.

Myles S, Davison D, Barrett J, Stoneking M, Timpson N. Worldwide population differentiation at disease-associated SNPs. BMC Med Genomics. 2008;1:22. doi: 10.1186/1755-8794-1-22.

Lohmueller KE, Mauney MM, Reich D, Braverman JM. Variants associated with common disease are not unusually differentiated in frequency across populations. Am J Hum Genet. 2006;78(1):130–136. doi: 10.1086/499287.

Hindorff LA, Sethupathy P, Junkins HA, Ramos EM, Mehta JP, Collins FS, Manolio TA. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci USA. 2009;106(23):9362–9367. doi: 10.1073/pnas.0903103106.

Wang WYS, Barratt BJ, Clayton DG, Todd JA. Genome-wide association studies: Theoretical and practical concerns. Nature Reviews Genetics. 2005;6(2):109–118. doi: 10.1038/nrg1522.

Hacia JG, Fan JB, Ryder O, Jin L, Edgemon K, Ghandour G, Mayer RA, Sun B, Hsie L, Robbins C. et al. Determination of ancestral alleles for human single-nucleotide polymorphisms using high-density oligonucleotide arrays. Nat Genet. 1999;22(2):164–167. doi: 10.1038/9674.

Lohmueller KE, Pearce CL, Pike M, Lander ES, Hirschhorn JN. Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease. Nat Genet. 2003;33(2):177–182. doi: 10.1038/ng1071.

Wang WY, Pike N. The allelic spectra of common diseases may resemble the allelic spectrum of the full genome. Med Hypotheses. 2004;63(4):748–751. doi: 10.1016/j.mehy.2003.12.057.

HapMart. http://hapmart.hapmap.org/BioMart/martview/

Frazer KA, Ballinger DG, Cox DR, Hinds DA, Stuve LL, Gibbs RA, Belmont JW, Boudreau A, Hardenbol P, Leal S. et al. A second generation human haplotype map of over 3.1 million SNPs. Nature. 2007;449(7164):851–861. doi: 10.1038/nature06258.

Rotimi CN, Jorde LB. Ancestry and disease in the age of genomic medicine. N Engl J Med. 2010;363(16):1551–1558. doi: 10.1056/NEJMra0911564.

Ganapathy G, Uyenoyama MK. Site frequency spectra from genomic SNP surveys. Theor Popul Biol. 2009;75(4):346–354. doi: 10.1016/j.tpb.2009.04.003.

Nielsen R, Hubisz MJ, Clark AG. Reconstituting the frequency spectrum of ascertained single-nucleotide polymorphism data. Genetics. 2004;168(4):2373–2382. doi: 10.1534/genetics.104.031039.

Watterson GA, Guess HA. Is the most frequent allele the oldest? Theor Popul Biol. 1977;11(2):141–160. doi: 10.1016/0040-5809(77)90023-5.

Sherry ST, Ward MH, Kholodov M, Baker J, Phan L, Smigielski EM, Sirotkin K. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 2001;29(1):308–311. doi: 10.1093/nar/29.1.308.

Spencer CC, Deloukas P, Hunt S, Mullikin J, Myers S, Silverman B, Donnelly P, Bentley D, McVean G. The influence of recombination on human genetic diversity. PLoS Genet. 2006;2(9):e148. doi: 10.1371/journal.pgen.0020148.

Kimura M. The number of heterozygous nucleotide sites maintained in a finite population due to steady flux of mutations. Genetics. 1969;61(4):893–903.

Johnson AD, O’Donnell CJ. An open access database of genome-wide association results. BMC Med Genet. 2009;10:6. doi: 10.1186/1471-2350-10-6.

Kimura M, Ohta T. The age of a neutral mutant persisting in a finite population. Genetics. 1973;75(1):199–212.

McVean GA, et al. The fine-scale structure of recombination rate variation in the human genome. Science. 2004;304:581–584.

Slatkin M, Rannala B. Estimating the age of alleles by use of intraallelic variability. Am J Hum Genet. 1997;60(2):447–458.

Green R, Krause J, Briggs A, Maricic T, Stenzel U, Kircher M, Patterson N, Li H, Zhai W, Fritz M. et al. A Draft Sequence of the Neanderthal Genome. Science. 2010;328:710–722. doi: 10.1126/science.1188021.

Bamshad M, Wooding SP. Signatures of natural selection in the human genome. Nat Rev Genet. 2003;4(2):99–111. doi: 10.1038/nrg999.

Hernandez RD, Williamson SH, Bustamante CD. Context dependence, ancestral misidentification, and spurious signatures of natural selection. Mol Biol Evol. 2007;24(8):1792–1800. doi: 10.1093/molbev/msm108.

Bustamante CD, et al. Natural selection on protein-coding genes in the human genome. Nature. 2005;437:1153–1157.

Personalized Medicine Coalition Recognizes Mark Levin with Award for Leadership 2012pharmaceutical
Research and Markets: Global Personalized Medicine Report 2014 – Scientific … – Rock Hill Herald (press release) 2012pharmaceutical
The Role of Medical Imaging in Personalized Medicine Dror Nir
CardioPredict™ Personalized Medicine Molecular Diagnostic Test 2012pharmaceutical
Life Sciences Circle Event: Next omics – Personalized Medicine beyond Genomics, December 11, 2013 5:30-8:30PM, The Broad Institute, Cambridge 2012pharmaceutical
Issues in Personalized Medicine: Discussions of Intratumor Heterogeneity from the Oncology Pharma forum on LinkedIn sjwilliamspa
Personalized medicine-based diagnostic test for NSCLC ritusaxena
Personalized Medicine and Colon Cancer tildabarliya
Systems Diagnostics – Real Personalized Medicine: David de Graaf, PhD, CEO, Selventa Inc. 2012pharmaceutical
Helping Physicians identify Gene-Drug Interactions for Treatment Decisions: New ‘CLIPMERGE’ program – Personalized Medicine @ The Mount Sinai Medical Center 2012pharmaceutical
Issues in Personalized Medicine in Cancer: Intratumor Heterogeneity and Branched Evolution Revealed by Multiregion Sequencing sjwilliamspa
Ethical Concerns in Personalized Medicine: BRCA1/2 Testing in Minors and Communication of Breast Cancer Risk sjwilliamspa
Personalized Medicine: Clinical Aspiration of Microarrays sjwilliamspa
The Promise of Personalized Medicine larryhbern
Personalized Medicine in NSCLC larryhbern
Attitudes of Patients about Personalized Medicine larryhbern
Understanding the Role of Personalized Medicine larryhbern
Directions for Genomics in Personalized Medicine larryhbern
Personalized Medicine: An Institute Profile – Coriell Institute for Medical Research: Part 3 2012pharmaceutical
Paradigm Shift in Human Genomics – Predictive Biomarkers and Personalized Medicine – Part 1 2012pharmaceutical
Harnessing Personalized Medicine for Cancer Management, Prospects of Prevention and Cure: Opinions of Cancer Scientific Leaders @ http://pharmaceuticalintelligence.com 2012pharmaceutical
Nanotechnology, personalized medicine and DNA sequencing tildabarliya
Personalized medicine gearing up to tackle cancer ritusaxena
Personalized Medicine Company Genection launched ritusaxena
Personalized Medicine: Cancer Cell Biology and Minimally Invasive Surgery (MIS) 2012pharmaceutical
The Way With Personalized Medicine: Reporters’ Voice at the 8th Annual Personalized Medicine Conference,11/28-29, 2012, Harvard Medical School, Boston, MA 2012pharmaceutical
Personalized Medicine Coalition: Upcoming Events 2012pharmaceutical
Highlights from 8th Annual Personalized Medicine Conference, November 28-29, 2012, Harvard Medical School, Boston, MA 2012pharmaceutical
Personalized medicine-based cure for cancer might not be far away ritusaxena
GSK for Personalized Medicine using Cancer Drugs needs Alacris systems biology model to determine the in silico effect of the inhibitor in its “virtual clinical trial” 2012pharmaceutical
Congestive Heart Failure & Personalized Medicine: Two-gene Test predicts response to Beta Blocker Bucindolol 2012pharmaceutical
Personalized Medicine as Key Area for Future Pharmaceutical Growth 2012pharmaceutical
Clinical Genetics, Personalized Medicine, Molecular Diagnostics, Consumer-targeted DNA – Consumer Genetics Conference (CGC) – October 3-5, 2012, Seaport Hotel, Boston, MA 2012pharmaceutical
AGENDA – Personalized Diagnostics, February 16-18, 2015 | Moscone North Convention Center | San Francisco, CA Part of the 22nd Annual Molecular Medicine Tri-Conference 2012pharmaceutical
Arrowhead’s 6th Annual Personalized & Precision Medicine Conference is coming to San Francisco, October 29-30, 2014 2012pharmaceutical
Personalized Cardiovascular Genetic Medicine at Partners HealthCare and Harvard Medical School 2012pharmaceutical
Precision Medicine for Future of Genomics Medicine is The New Era Demet Sag, Ph.D., CRA, GCP
Precision Medicine Initiative: Now is a State Initiative in California 2012pharmaceutical
1:30 pm – 2:20 pm 3/26/2015, LIVE Precision Medicine: Who’s Paying? @ MassBio Annual Meeting 2015, Cambridge, MA, Sonesta Hotel, 3/26 – 3/27, 2015 2012pharmaceutical
We Celebrate >600,000 Views for our 2,830 Scientific Articles in Life Sciences and Medicine 2012pharmaceutical
attn #3: Investors in HealthCare — Platforms in the Ecosystem of Regulatory & Reimbursement – Integrated Informational Platforms in Orthopedic Medical Devices, and Global Peer-Reviewed Scientific Curations: Bone Disease and Orthopedic Medicine – Draft 2012pharmaceutical
Foundation Medicine: Roche has Taken Over at $1.2B and 52.4 percent to 56.3 percent of Foundation Medicine on a fully diluted basis 2012pharmaceutical
Bridging the Gap in Precision Medicine @UCSF 2012pharmaceutical
Germline Genes and Drug Targets: Medicine more Proactive and Disease Prevention more Effective. 2012pharmaceutical
Proteomics – The Pathway to Understanding and Decision-making in Medicine larryhbern
Multi-drug, Multi-arm, Biomarker-driven Clinical Trial for patients with Squamous Cell Carcinoma called the Lung Cancer Master Protocol, or Lung-MAP launched by NCI, Foundation Medicine, and Five Pharma Firms 2012pharmaceutical
Preventive Care: Anticipated Changes caused by Genomics in the Clinic and Personalised Medicine 2012pharmaceutical
Cancer Labs at School of Medicine @ Technion: Janet and David Polak Cancer and Vascular Biology Research Center 2012pharmaceutical
Reprogramming Adult Patient Cells into Stem Cells: the Promise of Personalized Genetic Therapy 2012pharmaceutical
US Personalized Cancer Genome Sequencing Market Outlook 2018 – 2012pharmaceutical
Summary of Translational Medicine – e-Series A: Cardiovascular Diseases, Volume Four – Part 1 larryhbern
Introduction to Translational Medicine (TM) – Part 1: Translational Medicine larryhbern
Cancer Diagnosis at the Crossroads: Precision Medicine Driving Change, 9/14 – 9/17/2014, Sheraton Seattle Hotel, Seattle WA 2012pharmaceutical
Genomic Medicine and the Bioeconomy: Innovation for a Better World May 12–16, 2014 • Boston, MA 2012pharmaceutical
Institute of Medicine (IOM) Report on Genome-based Therapeutics and Companion Diagnostics 2012pharmaceutical
“Medicine Meets Virtual Reality” – NextMed-MMVR21 Conference 2/19 – 2/22/2014, Manhattan Beach Marriott, Manhattan Beach, CA 2012pharmaceutical



Reporter: Aviva Lev-Ari, PhD, RN


Preeclampsia is a disorder that occurs only during pregnancy and the postpartum period and affects both the mother and the unborn baby. Affecting at least 5-8% of all pregnancies, it is a rapidly progressive condition characterized by high blood pressure and the presence of protein in the urine. Swelling, sudden weight gain, headaches and changes in vision are important symptoms; however, some women with rapidly advancing disease report few symptoms.

Typically, preeclampsia occurs after 20 weeks gestation (in the late 2nd or 3rd trimesters or middle to late pregnancy) and up to six weeks postpartum, though in rare cases it can occur earlier than 20 weeks. Proper prenatal care is essential to diagnose and manage preeclampsia. Pregnancy Induced Hypertension (PIH) and toxemia are outdated terms for preeclampsia. HELLP syndrome and eclampsia (seizures) are other variants of preeclampsia.

Globally, preeclampsia and other hypertensive disorders of pregnancy are a leading cause of maternal and infant illness and death. By conservative estimates, these disorders are responsible for 76,000 maternal and 500,000 infant deaths each year.


VIEW VIDEO – SIX Sections, Pauses in between


  • Preeclampsia vs. Pregnancy-Induced Hypertension
  • When Preeclampsia Occurs
  • Preeclampsia – Effects on Fetus Health
  • Preeclampsia – Effects on the Baby

Genetic Aspects of Pre-eclampsia

The genetics of pre-eclampsia and other hypertensive disorders of pregnancy

Human Genetics Research Group, School of Molecular and Medical Sciences, University of Nottingham, A Floor West Block, Queen’s Medical Centre, Nottingham NG7 2UH, UK
*Corresponding author. Tel.: +44 (0) 115 8230758; Fax: +44 (0) 115 8230759. Email: Paula.Williams@nottingham.ac.uk
Epidemiological studies clearly confirm a genetic component to pre-eclampsia. Numerous candidate genes have been studied that fall into groups based on their proposed pathological mechanism, including

  • thrombophilia,
  • endothelial function,
  • vasoactive proteins,
  • oxidative stress,
  • lipid metabolism, and
  • immunogenetics.
It is expected that no one gene will be identified as the sole risk factor for pre-eclampsia, as in the general population pre-eclampsia represents a complex genetic disorder. Interactions between numerous SNP, either alone or in combination with predisposing environmental factors, most likely underpin the genetic component of this disorder. We must be cautious in our approach to genetics and acknowledge that we are still in the infancy of this research. Following on from GWAS, further fine-mapping studies to delineate SNP that are causal from those that are in linkage disequilibrium, followed by functional laboratory studies, will be required. Only when we have a better understanding of how the environment interacts with genes will we be in a better position to target treatment for women, for example knowing that women with a certain genotype will benefit from losing weight, enabling us to yield clinical benefit.
At present no genetic test is available to predict pre-eclampsia. The lack of a predictive test can be overcome by careful monitoring and assessment of women, especially those in high-risk groups, including:

  • Those at either end of the reproductive age spectrum
  • Obesity
  • Black ethnicity
  • Primiparity
  • Previous history of pre-eclampsia
  • Multiple pregnancy
  • Pre-existing medical conditions: renal disease, insulin-dependent diabetes, autoimmune disease, antiphospholipid syndrome

Genetic aspects of pre-eclampsia

Clustering of cases of pre-eclampsia within families has been recognised since the 19th century, suggesting a genetic component to the disorder.2 Deciphering the genetic involvement in pre-eclampsia is challenging, not least because the phenotype is expressed only in parous women. Furthermore, in complex disorders of pregnancy, it is necessary to consider two genotypes, that of the mother and that of the fetus, which includes genes inherited from both mother and father. Maternal and fetal genes may have independent or interactive effects on the risk of pre-eclampsia. Finally, the heterogeneous nature of the disorder, with a sliding scale of severity, has resulted in differences in the definition of pre-eclampsia used within studies (see above), often with overlap of non-proteinuric gestational hypertension.

Twin studies investigating the relative contribution of genetic versus environmental factors to pre-eclampsia risk initially yielded disappointing results. They showed that discordance for pre-eclampsia between monozygotic twin sisters was common, suggesting that heritability caused by maternal genes was low.3 These early studies were small. More recent investigations, however, using the large Swedish Twin, Medical Birth and Multigeneration Registries have estimated the heritability of pre-eclampsia to be about 55%, with contributions from both maternal and fetal genes. A further study in monozygotic twins4 found concordance of pre-eclampsia to be as common as discordance. Evidence from the largest published twin study, which correlated the Swedish Twin Register with the Swedish Medical Register, revealed pre-eclampsia penetrance to be less than 50%, suggesting diversity within models of inheritance.5–7

Pre-eclampsia: a complex genetic disorder

For a small number of families, pre-eclampsia seems to follow Mendelian patterns of disease inheritance,8 consistent with a rare deleterious monogenic variant or mutation with high penetrance. For most of the population, however, pre-eclampsia seems to represent a complex genetic disorder, and occurs as the result of numerous common variants at different loci which, individually, have small effects but collectively contribute to an individual’s susceptibility to disease. Environmental exposures, including age and weight, also determine whether these low penetrant variants result in phenotypic manifestation of the disease. It is likely that no single cause or genetic variant will account for all cases of pre-eclampsia, although it is possible that different variants are associated with various subsets of disease (e.g. pre-eclampsia combined with intrauterine growth restriction). Complex genetic disorders affect a high proportion of the population, representing a large burden to public health. New approaches to susceptibility gene discovery have emerged to address this challenge. Unfortunately, early diagnosis would only permit closer focus on routine antenatal care, as at present no intervention other than delivery has been shown to alter the course of pre-eclampsia.

Determining susceptibility to pre-eclampsia

The need to assess both the maternal and the fetal genotype is clear. The role of the placenta in the primary pathogenesis of the disorder indisputably indicates a fetal contribution to susceptibility to the disorder.9 Reports of severe, very early-onset pre-eclampsia in cases of fetal chromosomal abnormalities such as diandric hydatidiform moles of entirely paternal genetic origin10 are consistent with a role for paternally inherited fetal genes in the determination of clinical phenotype. This is supported by epidemiological studies reporting a higher rate of pre-eclampsia in pregnancies fathered by men who were themselves born of pre-eclamptic pregnancies.11 The occurrence of pre-eclampsia in daughters-in-law of index women9 further supports a genetic contribution from both parents. The genetic conflict hypothesis states that fetal (paternal) genes will be selected to increase the transfer of nutrients to the fetus, whereas maternal genes will be selected to limit transfer in excess of a specific maternal optimum.12 Fetal genes are predicted to raise maternal blood pressure in order to enhance the uteroplacental blood flow, whereas maternal genes act the opposite way. Endothelial dysfunction in mothers with pre-eclampsia could, therefore, be interpreted as a fetal attempt to compensate for an inadequate uteroplacental nutrient supply.

As the phenotype is apparently only expressed during pregnancy, identification of ‘susceptible’ men is impossible. Most genetic studies of pre-eclampsia have focused on maternal genotypes only. The Genetics of Pre-eclampsia consortium highlighted the need to include analysis of all contributing genotypes, and carried out transmission disequilibrium testing in maternal and fetal triads.13 Understanding the contribution of the fetal genotype will require large sample sizes, with the development of algorithms to determine the relative contribution from mother and fetus. Furthermore, the decreased incidence of pre-eclampsia in second and subsequent pregnancies hampers analysis of the contribution of the fetal genotype.
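The transmission disequilibrium test mentioned above can be illustrated with a short sketch. It assumes the standard McNemar-type form of the test, which compares how often heterozygous parents transmit a candidate allele to affected offspring versus how often they do not. The function name and the counts are hypothetical and are not taken from the consortium's study; how "affected" is defined in maternal versus fetal triads is a design choice this sketch does not address.

```python
import math


def tdt_statistic(transmitted: int, untransmitted: int) -> tuple[float, float]:
    """Return the TDT chi-square statistic (1 degree of freedom) and its p-value.

    transmitted:   heterozygous parents who passed the candidate allele
                   to the affected offspring (hypothetical count)
    untransmitted: heterozygous parents who passed the other allele
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("at least one informative transmission is required")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Survival function of a chi-square variable with 1 df:
    # P(Z^2 > x) = erfc(sqrt(x / 2)) for a standard normal Z.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical example: 60 transmissions versus 40 non-transmissions.
chi2, p_value = tdt_statistic(60, 40)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

Under the null hypothesis of no linkage or association, transmissions and non-transmissions are expected to be equally frequent, so a large imbalance yields a large statistic. Because the test uses only within-family transmissions, it is not confounded by population stratification.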

Candidate gene approach

The candidate gene approach has been widely used in pre-eclampsia, and largely focuses on the maternal genotype. In this method, a single gene is chosen as the candidate for investigation based on prior biological knowledge of the pathophysiology of pre-eclampsia. The choice is strengthened if the gene lies within a region identified by linkage studies. A case-control design is usually used, comparing the frequencies of allelic variants in women with pre-eclampsia and normotensive pregnancies. Such studies need careful definition of inclusion criteria for cases and controls, and subtle ethnic stratification of groups must be avoided. Such performance characteristics of the genotyping assays as the rate of mis-genotyping, and the quality assurance methods used, should be clearly stated, but this is rarely done. Over 70 biological candidate genes have been examined, representing pathways involved in various pathophysiological processes, including vasoactive proteins, thrombophilia and hypofibrinolysis, oxidative stress and lipid metabolism, endothelial injury and immunogenetics.14 In common with the experience in other genetically complex disorders, results from candidate gene studies have been inconsistent, and no universally accepted susceptibility gene has been identified. Although this may, in part, be attributed to variation within populations, a more important factor is the small size of most of the candidate studies, which have been underpowered to detect variants with small effects. As there are more than 20,000 genes and 10 million single nucleotide polymorphisms (SNP) available, multiple testing will inevitably result in numerous results that achieve P values of less than 0.05. 
The development of robust statistical techniques for the minimisation of both false positive and false negative results is an important area.15,16 Only in recent years, as susceptibility genes for other complex disorders have been reported, has the small effect size of individual genetic variants become apparent, the majority increasing the risk of disease by less than 50%. A further limitation of the candidate gene approach is its reliance on the generation of an a-priori hypothesis based on our current incomplete knowledge of the pathophysiology of the disorder. The candidate genes studied belong to different groups according to their functional properties and plausible role in the pathophysiology (Table 2).
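The multiple-testing problem described above can be made concrete with a little arithmetic. The sketch below assumes every test is independent and that no true association exists, which is a simplification (SNP in linkage disequilibrium are correlated). It uses the Bonferroni correction as one common, conservative adjustment; the source does not say which technique the cited work used, and the function names are illustrative.

```python
def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of tests reaching P < alpha when no true effect exists."""
    return n_tests * alpha


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Figures quoted in the text: more than 20,000 genes and 10 million SNP.
for label, n_tests in [("genes", 20_000), ("SNP", 10_000_000)]:
    print(
        f"{label}: about {expected_false_positives(n_tests):,.0f} chance hits at P < 0.05; "
        f"Bonferroni threshold {bonferroni_threshold(n_tests):.1e}"
    )
```

Testing all 10 million SNP at P < 0.05 would be expected to produce roughly half a million chance findings, which is why corrected thresholds are needed. Stricter thresholds reduce false positives at the cost of power, which is the trade-off between false positive and false negative results noted above.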


A successful pregnancy requires the development of adequate placental circulation. It is hypothesised that thrombophilias may increase the risk of placental insufficiency because of placental micro-vascular thrombosis, macro-vascular thrombosis, or both, as well as effects on trophoblast growth and differentiation.17 Abnormalities of the clotting cascade are well documented in women with pre-eclampsia.18 The endothelial damage of pre-eclampsia is associated with an altered phenotype from anticoagulant to procoagulant and decreased endothelially mediated vasorelaxation. It is possible that this phenotype is present before pre-eclampsia in pregnancy, or it may develop as a consequence of damage initiated during placentation. Furthermore, a subset of women develop frank thrombocytopaenia, often in association with haemolysis, elevated liver enzymes and low platelet count (HELLP) syndrome. Association of the three most widely studied thrombophilic factors, factor V Leiden (F5), methylenetetrahydrofolate reductase (MTHFR) and prothrombin (F2), with pre-eclampsia has been shown; however, several studies have also shown contradictory results.14 A recent meta-analysis indicated a two-fold increase in risk for pre-eclampsia associated with the 1691G>A mutation in F5, but no associations were found for MTHFR or F2.19 To date, the number of studies showing no association with pre-eclampsia for these three genes is much higher than the number confirming association. Association with the gene for plasminogen activator inhibitor-1, an inhibitor of fibrinolysis, has also been reported; however, replication attempts have failed.20–22

Haemodynamics and endothelial function

The renin-angiotensin system (RAS) is important for regulating the cardiovascular and renal changes that occur in pregnancy. Several studies have implicated the RAS in the pathophysiology of pre-eclampsia.23 As such, genes in the RAS have been considered as plausible candidates for pre-eclampsia. Angiotensin-converting enzyme (ACE), angiotensin II type 1 and type 2 receptor (AGTR1, AGTR2), and angiotensinogen (AGT) have all been studied extensively in pre-eclampsia. Recent meta-analyses have identified the T allele of AGT M235T as increasing the risk of developing pre-eclampsia by 1.62 times and similar increases in disease risk have been found in AGT and the angiotensin-converting enzyme I/D polymorphism.24 A rare functional polymorphism in AGT, which results in replacement of leucine by phenylalanine at the site of renin cleavage, has been reported in association with severe pre-eclampsia.25
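Risk estimates such as the 1.62-fold figure above are typically reported as odds ratios from case-control allele counts. The following sketch shows how an allelic odds ratio and an approximate 95% confidence interval (Woolf's log method) can be computed from a 2×2 table. The counts are hypothetical and chosen only for illustration; they do not come from the cited meta-analyses, and the function name is an assumption.

```python
import math


def allelic_odds_ratio(
    case_risk: int, case_other: int, ctrl_risk: int, ctrl_other: int, z: float = 1.96
) -> tuple[float, float, float]:
    """Return (odds ratio, lower CI bound, upper CI bound) for a 2x2 allele table.

    case_risk / case_other: risk and non-risk allele counts in cases
    ctrl_risk / ctrl_other: risk and non-risk allele counts in controls
    z: normal quantile for the interval (1.96 gives roughly 95%)
    """
    counts = (case_risk, case_other, ctrl_risk, ctrl_other)
    if min(counts) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    # Standard error of log(OR) under Woolf's approximation.
    se_log_or = math.sqrt(sum(1 / c for c in counts))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts: 120 risk / 80 other in cases, 100 / 100 in controls.
odds_ratio, lower, upper = allelic_odds_ratio(120, 80, 100, 100)
print(f"OR = {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that includes 1 indicates that the data are compatible with no association, which is one reason small candidate gene studies often fail to replicate.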

Endothelial nitric oxide synthase 3 (eNOS3), which is involved in vascular remodelling and vasodilation, has been shown to have reduced activity in pre-eclampsia.26 Association studies in different ethnic populations, however, have yielded both positive and negative findings. A meta-analysis investigating the E298D polymorphism, which had initially been associated with pre-eclampsia in Colombian women, failed to find increased risk.24 Vascular endothelial growth factor (VEGF) is important for endothelial cell proliferation, migration, survival and regulation of vascular permeability. The number of studies that have investigated SNP in the genes involved in the VEGF system is small. Two polymorphisms in VEGF, 405G>C and 936C>T, were found to be associated with the severe form of pre-eclampsia in two small studies, but cannot at present be considered as major risk factors.27,28

Oxidative stress and lipid metabolism

Oxidative stress plays a central role in the pathogenesis of pre-eclampsia. Maternal perfusion of the placenta does not occur until towards the end of the first trimester,29 when a rapid increase in local oxygen tension takes place, and the probable occurrence of a period of hypoxia–reperfusion until stability is reached. This is accompanied by increased expression and activity of such antioxidants as glutathione peroxidase, catalase and the various forms of superoxide dismutase.30 If this antioxidant response were reduced, then the cascade of events leading to impaired placentation could be initiated. Evidence for reduced antioxidant activity in pre-eclampsia has recently been reviewed.31 Genes involved in the generation or inactivation of reactive oxygen species, if defective, could increase endothelial dysfunction via lipid peroxidation, which has been a candidate causative agent for the endothelial damage of pre-eclampsia for more than 20 years.32 Despite the strong correlation between oxidative stress and pre-eclampsia, only a small handful of genes have been investigated. Functional polymorphisms in the gene for microsomal epoxide hydrolase (EPHX) that catalyses the hydrolysis of certain oxides and may produce toxic intermediates that could be involved in pre-eclampsia, and glutathione S-transferase (GST), an antioxidant capable of inactivating reactive oxygen species, have shown associations. Conflicting results, however, have also been reported.33–36

Abnormal lipid profiles associated with the lipid peroxidation caused by oxidative stress are also characteristic of pre-eclampsia. Lipoprotein lipase (LPL) and apolipoprotein E (ApoE) are the two major regulators of lipid metabolism, abundantly expressed in placenta, and have therefore been proposed as possible candidate genes.37,38 A recent study using bioinformatic analysis identified altered glycosylation of circulating ApoE isoforms in pre-eclampsia.39 A deglycosylated basic ApoE isoform was increased in pre-eclampsia, and an acidic sialylated ApoE isoform was decreased. Functionally, this might increase the risk of developing placental atherotic changes. The most promising genetic variant in this context is a mis-sense mutation, Asn291Ser, in LPL which correlates with lowered LPL activity and increased dyslipidaemia in two separate studies. Again, others have failed to replicate these findings.38,40,41 The fetal genotype of these two genes has also been reported to contribute to the metabolism of the maternal lipoproteins.37

Immune system

The maternal immune response to pregnancy is crucial in determining pregnancy outcome and success. The increased incidence of pre-eclampsia in primiparous women, especially those at either end of the childbearing age range, indicates a strong association between immune factors and pre-eclampsia.42 However, the protective effect of multiparity is lost with change of partner. Advances in assisted reproductive technology are also posing new challenges to the maternal immune system. The use of donated sperm or eggs increases the risk of pre-eclampsia three-fold.43

Human leucocyte antigen

Trophoblast cells express an unusual repertoire of histocompatibility antigens, comprising human leucocyte C, E and G class antigens (HLA-C, HLA-E, HLA-G), of which only HLA-C displays marked polymorphism. The expression of HLA on the invading cytotrophoblast is important, as these interact with killer immunoglobulin-like receptors (KIR) expressed on maternal uterine natural killer (uNK) cells and cytotoxic T-lymphocytes, down-regulating their cytolytic activity and stimulating the production of cytokines needed for successful placentation. Multiple highly homologous KIR genes map to chromosome 19q, probably arising from ancestral gene duplications, and the two main resulting gene clusters have been classified as haplotypes A and B. The A group codes mainly for KIR, which inhibit natural killer cells, whereas the B group has additional stimulatory genes.44 Pre-eclampsia is more frequent in women who are homozygous for the inhibitory A haplotypes (AA) than in women homozygous for the stimulatory B haplotypes (BB). The effect is strongest if the fetus is homozygous for the HLA-C2 haplotype.45 Alteration in KIR interaction on uNK cells with HLA-C on interstitial trophoblast alters the decidual immune response, resulting in impaired extravillous trophoblast invasion and deficient spiral artery remodelling, associated with pre-eclampsia.

An association of HLA-G, which displays limited polymorphism, with pre-eclampsia has also been reported. A possible association between the presence of the HLA-G allele G*0106 in the placenta and an increased risk of pre-eclampsia has been identified in two small studies.46,47 These were underpowered, however, and further studies using larger cohorts of mothers and babies are needed to replicate these results. HLA-G variants foreign to the mother may lead to histo-incompatibility between mother and child. A maternal rejection response to the semi-allogeneic fetus may represent one of the pathways involved in the development of pre-eclampsia.

A number of pro-inflammatory cytokines have also been investigated for possible associations with pre-eclampsia. Excessive release of tumour necrosis factor alpha (TNFα) has been implicated owing to its contribution to endothelial activation, which in turn could contribute to maternal symptoms.48 Interestingly, in pregnant rats, TNF induces hypertension, a response not seen in non-pregnant rats.49 Furthermore, plasma levels of TNFα are significantly higher in women with pre-eclampsia than matched controls.50 TNFα is also involved in the production of reactive oxygen species and subsequently oxidant mediated endothelial damage. The most frequently studied variant in pre-eclampsia is the –308G>A transition in the promoter region, which is associated with increased levels of TNFα production and an increased risk for pre-eclampsia linked disorders, including type 2 diabetes, coronary artery disease and dyslipidaemia.51,52 However, a meta-analysis from 2008 combined 16 studies investigating this promoter SNP, but failed to detect a significant association to pre-eclampsia.53

Interleukin-10 (IL-10) has also been implicated in the pathogenesis of pre-eclampsia by enhancing the inflammatory response towards trophoblast cells resulting in reduced invasion and remodelling of the spiral arteries.54 Expression of IL-10 is reduced in pre-eclamptic placentae.55 Studies investigating associations of variants of the gene and pre-eclampsia, however, have yielded conflicting results.56–58 Associations have also been detected for two additional inflammatory genes, interleukin-1α (IL-1α) and the interleukin 1 receptor antagonist (IL1Ra) in relatively small studies, but few studies have addressed the role of polymorphisms in these genes so far.59,60

Antioxidant enzymes

A large family of cytosolic glutathione-s-transferases (GST) exists, and the P class is highly expressed in the human placenta. Several relatively small case-control studies of polymorphisms in this family in relation to pre-eclampsia have failed to identify any significant effect of several GST polymorphisms studied individually. However, a cumulative effect of the number of polymorphisms in various biotransformation enzymes, including GST, which would result in decreased antioxidant capacity, has been reported.61 Intriguingly, the use of semi-quantitative polymerase chain reaction on a small data set identified using serial analysis of gene expression profiles, seems to identify a specific molecular signature for HELLP, which includes decreased expression of GST P1.62

Remarkably, few studies of possible functional polymorphisms in antioxidant enzyme systems have been reported. The 242C>T polymorphism in exon 4 of the gene for the p22phox subunit of NADPH/NADH oxidase (CYBA), which is part of the cascade of superoxide generation, has been reported as showing no evidence of an association with either pre-eclampsia or HELLP and pre-eclampsia.63 A small preliminary study of the Ala40Thr polymorphism of the superoxide dismutase 3 gene (SOD3), which has been associated with insulin resistance, reported a significant excess of the mutant allele in women with severe intrauterine growth restriction.64


High blood pressure in pregnancy: What’s your story?

By Mary M. Murry, R.N., C.N.M.

Blood pressure tends to fluctuate during pregnancy.

For example, it’s normal to experience a drop in blood pressure during the second trimester. In fact, your blood pressure might be lower than it’s ever been. During the third trimester, a gradual increase in blood pressure is common.

Sometimes, though, blood pressure changes more dramatically — or sustained high blood pressure becomes a concern.

By definition, there are various types of high blood pressure during pregnancy:

  • Chronic hypertension. If high blood pressure develops before pregnancy or during pregnancy but before 20 weeks, it’s known as chronic hypertension. High blood pressure that lasts more than 12 weeks after delivery is also considered chronic hypertension.
  • Gestational hypertension. If high blood pressure develops after 20 weeks of pregnancy, it’s known as gestational hypertension. Gestational hypertension usually goes away after delivery.
  • Preeclampsia. Sometimes chronic hypertension or gestational hypertension leads to preeclampsia. This is a serious condition characterized by high blood pressure and protein in the urine after 20 weeks of pregnancy.

All of these conditions can be dangerous for you and your baby. If your pregnancy has been normal until now, a diagnosis of high blood pressure can be especially jarring.

Depending on the circumstances, your health care provider might recommend close monitoring or, in some cases, an early delivery.

Count on your health care provider to help you understand what’s happening and what you can do to promote a healthy outcome. Above all, don’t hesitate to ask questions. Being fully informed can help you make the best decisions for you and your baby.


Texas A&M Researcher Uncovers New Data for the Treatment of Preeclampsia

Posted Thursday, June 6, 2013


A Researcher From Texas A&M Has Uncovered New Data for the Treatment of Preeclampsia: Preclinical Research Shows PLX Cells May Be Effective in Treating Preeclampsia.

Preliminary research led by Brett Mitchell, PhD, an Associate Professor of Internal Medicine in the Cardiovascular Research Institute (CVRI) at Texas A&M University College of Medicine, suggests that administering placental stem cells may help reverse symptoms linked with preeclampsia within days of dosing, with no apparent harmful effects on fetus or mother.

Preeclampsia may occur after the 20th week of pregnancy, when the mother-to-be’s blood pressure has increased and there are signs of excessive protein in the urine. This condition affects between 6 and 8 percent of pregnancies in the US and can be serious, as there is a shift away from the immunologically privileged state that protects mother and fetus. This brings about vascular problems in which blood vessels are unable to dilate or relax.

Dr. Mitchell has examined the immune cells responsible for the development of high blood pressure (hypertension) during pregnancy, in the hope of developing new therapies that diminish these cells while maintaining normal immune cell function.

Mitchell and colleagues injected placenta-based cells (stem cells) known as PLX (PLacental eXpanded) cells into the leg muscle of mice with preeclampsia. PLX cells are used as a way of delivering drugs, in particular therapeutic proteins, in response to inflammatory and ischemic events. They tested eight groups across 2 separate animal models of preeclampsia and found that PLX cells were effective in treating the condition.

They observed:

  • a reduction in systolic pressure to normal levels within 3 days;
  • a reduction in urinary proteins within 4 days;
  • an increase in endothelial function, measured by acetylcholine-induced relaxation, within 4 days; and
  • a reduction in spleen weight within 4 days.

Pregnant mice without preeclampsia were subjected to the same protocol, and muscle injection of PLX cells did not affect a normal pregnancy. The number of pups and the rate of fetal demise per litter also did not differ, indicating that PLX cells caused no fetal harm.

Dr. Mitchell presented his findings at the Society for Gynecologic Investigation Summit in Jerusalem on May 30, 2013. He suggests that factors secreted by the PLX cells decreased inflammation, thereby restoring endothelial function.

Currently, there are no treatments available for preeclampsia, so this therapy looks promising.




  1. Pregnancy. National Heart, Lung, and Blood Institute. http://www.nhlbi.nih.gov/hbp/issues/preg/preg.htm. Accessed March 9, 2011.
  2. Conde-Agudelo A, et al. Maternal infection and risk of preeclampsia: Systematic review and metaanalysis. American Journal of Obstetrics and Gynecology. 2008;198:7.
  3. Bodnar LM, et al. Maternal vitamin D deficiency increases the risk of preeclampsia. Journal of Clinical Endocrinology & Metabolism. 2007;92:3517.
  4. High blood pressure and preeclampsia. March of Dimes. http://www.marchofdimes.com/complications_preeclampsia.html. Accessed March 9, 2011.
  5. Norwitz ER, et al. Management of preeclampsia. http://www.uptodate.com/home/index.html. Accessed March 7, 2011.
  6. Leanos-Miranda A, et al. Urinary prolactin as a reliable marker for preeclampsia, its severity, and the occurrence of adverse pregnancy outcomes. Journal of Clinical Endocrinology & Metabolism. 2008;93:2492.
  7. August P, et al. Clinical features, diagnosis, and long-term prognosis of preeclampsia. http://www.uptodate.com/home/index.html. Accessed March 7, 2011.
  8. Sibai BM, et al. Hypertension. In: Gabbe SG, et al. Obstetrics: Normal and Problem Pregnancies. 5th ed. Philadelphia, Pa.: Churchill Livingstone Elsevier; 2007. http://www.mdconsult.com/das/book/body/208746819-4/0/1528/0.html. Accessed March 9, 2011.
  9. Barton JR, et al. Prediction and prevention of recurrent preeclampsia. Obstetrics & Gynecology. 2008;112:359.
  10. Bellamy L, et al. Pre-eclampsia and risk of cardiovascular disease and cancer in later life: Systematic review and meta-analysis. British Medical Journal. 2007;335:974.
  11. Facchinetti F, et al. Migraine is a risk factor for hypertensive disorders in pregnancy: A prospective cohort study. Cephalalgia: An International Journal of Headache. 2009;29:286.
  12. Steegers EA, et al. Pre-eclampsia. The Lancet. 2010;376:631.



1. Brown M.A., Lindheimer M.D., de Swiet M. The classification and diagnosis of the hypertensive disorders of pregnancy: statement from the International Society for the Study of Hypertension in Pregnancy (ISSHP) Hypertens Pregnancy. 2001;20:IX–XIV. [PubMed: 12044323]
2. Chesley L.C., Annitto J.E., Cosgrove R.A. The familial factor in toxemia of pregnancy. Obstet Gynecol. 1968;32:303–311. [PubMed: 5742111]
3. Thornton J.G., Macdonald A.M. Twin mothers, pregnancy hypertension and pre-eclampsia. Br J Obstet Gynaecol. 1999;106:570–575.[PubMed: 10426615]
4. O’Shaughnessy K.M., Ferraro F., Fu B. Identification of monozygotic twins that are concordant for preeclampsia. Am J Obstet Gynecol. 2000;182:1156–1157. [PubMed: 10819852]
5. Chappell S., Morgan L. Searching for genetic clues to the causes of pre-eclampsia. Clin Sci (Lond) 2006;110:443–458. [PubMed: 16526948]
6. Cnattingius S. The epidemiology of smoking during pregnancy: smoking prevalence, maternal characteristics, and pregnancy outcomes. Nicotine Tob Res.2004;6(Suppl. 2):S125–140. [PubMed: 15203816]
7. Salonen Ros H., Lichtenstein P., Lipworth L. Genetic effects on the liability of developing pre-eclampsia and gestational hypertension. Am J Med Genet.2000;91:256–260. [PubMed: 10766979]
8. Redman C.W., Sargent I.L. Latest advances in understanding preeclampsia. Science. 2005;308:1592–1594. [PubMed: 15947178]
9. Cooper D.W., Brennecke S.P., Wilton A.N. Genetics of pre-eclampsia. Hypertens Pregnancy. 1993;12:1–23.
10. Esplin M.S., Fausett M.B., Fraser A. Paternal and maternal components of the predisposition to preeclampsia. N Engl J Med. 2001;344:867–872.[PubMed: 11259719]
11. Skjaerven R., Vatten L.J., Wilcox A.J. Recurrence of pre-eclampsia across generations: exploring fetal and maternal genetic components in a population based cohort. BMJ. 2005;331:877. [PMCID: PMC1255793] [PubMed: 16169871]
12. Haig D. Genetic conflicts in human pregnancy. Q Rev Biol. 1993;68:495–532. [PubMed: 8115596]
13. GOPEC Disentangling fetal and maternal susceptibility for pre-eclampsia: a British multicenter candidate-gene study. Am J Hum Genet. 2005;77:127–131. [PMCID: PMC1226184] [PubMed: 15889386]
14. Mutze S., Rudnik-Schoneborn S., Zerres K. Genes and the preeclampsia syndrome. J Perinat Med. 2008;36:38–58. [PubMed: 18184097]
15. Colhoun H., McKeigue P., Davey Smith G. Problems of reporting genetic associations with complex outcomes. Lancet. 2003;361:865–872. [PubMed: 12642066]
16. Wacholder S., Chanock S., Garcia-Closas M. Assessing the probability that a positive report is false: an approach for molecular epidemiology studies. J Natl Cancer Inst. 2004;96:434–442. [PubMed: 15026468]
17. Isermann B., Sood R., Pawlinski R. The thrombomodulin-protein C system is essential for the maintenance of pregnancy. Nat Med. 2003;9:331–337.[PubMed: 12579195]
18. Brenner B. Thrombophilia and pregnancy loss. Thromb Res. 2002;108:197–202. [PubMed: 12617981]
19. Lin J., August P. Genetic thrombophilias and preeclampsia: a meta-analysis. Obstet Gynecol. 2005;105:182–192. [PubMed: 15625161]
20. Dalmaz C.A., Santos K.G., Botton M.R. Relationship between polymorphisms in thrombophilic genes and preeclampsia in a Brazilian population. Blood Cells Mol Dis. 2006;37:107–110. [PubMed: 16963292]
21. Fabbro D., D’Elia A.V., Spizzo R. Association between plasminogen activator inhibitor 1 gene polymorphisms and preeclampsia. Gynecol Obstet Invest.2003;56:17–22. [PubMed: 12867763]
22. Gerhardt A., Goecke T.W., Beckmann M.W. The G20210A prothrombin-gene mutation and the plasminogen activator inhibitor (PAI-1) 5G/5G genotype are associated with early onset of severe preeclampsia. J Thromb Haemost. 2005;3:686–691. [PubMed: 15842353]
23. Shah N.C., Pringle S., Struthers A. Aldosterone blockade over and above ACE-inhibitors in patients with coronary artery disease but without heart failure.J Renin Angiotensin Aldosterone Syst. 2006;7:20–30. [PubMed: 17083070]
24. Medica I., Kastrin A., Peterlin B. Genetic polymorphisms in vasoactive genes and preeclampsia: a meta-analysis. Eur J Obstet Gynecol Reprod Biol.2007;131:115–126. [PubMed: 17112651]
25. Inoue I., Rohrwasser A., Helin C. A mutation of angiotensinogen in a patient with preeclampsia leads to altered kinetics of the renin-angiotensin system.J Biol Chem. 1995;270:11430–11436. [PubMed: 7744780]
26. Brennecke S.P., Gude N.M., Di Iulio J.L. Reduction of placental nitric oxide synthase activity in pre-eclampsia. Clin Sci (Lond) 1997;93:51–55.[PubMed: 9279203]
27. Banyasz I., Bokodi G., Vannay A. Genetic polymorphisms of vascular endothelial growth factor and angiopoietin 2 in retinopathy of prematurity. Curr Eye Res. 2006;31:685–690. [PubMed: 16877277]
28. Papazoglou D., Galazios G., Koukourakis M.I. Vascular endothelial growth factor gene polymorphisms and pre-eclampsia. Mol Hum Reprod.2004;10:321–324. [PubMed: 14997002]
29. Foidart J., Hustin J., Dubois M. The human placenta becomes haemochorial at the 13th week of pregnancy. Int J Dev Biol. 1992;36:451–453.[PubMed: 1445791]
30. Jauniaux E., Watson A., Hempstock J. Onset of maternal arterial blood flow and placental oxidative stress. A possible factor in human early pregnancy failure. Am J Pathol. 2000;157:2111–2122. [PMCID: PMC1885754] [PubMed: 11106583]
31. Perkins A.V. Endogenous anti-oxidants in pregnancy and preeclampsia. Aust N Z J Obstet Gynaecol. 2006;46:77–83. [PubMed: 16638026]
32. Wickens D., Wilkins M.H., Lunec J. Free radical oxidation (peroxidation) products in plasma in normal and abnormal pregnancy. Ann Clin Biochem. 1981;18:158–162. [PubMed: 7283366]
33. Canto P., Canto-Cetina T., Juarez-Velazquez R. Methylenetetrahydrofolate reductase C677T and glutathione S-transferase P1 A313G are associated with a reduced risk of preeclampsia in Maya-Mestizo women. Hypertens Res. 2008;31:1015–1019. [PubMed: 18712057]
34. Gebhardt G.S., Peters W.H., Hillermann R. Maternal and fetal single nucleotide polymorphisms in the epoxide hydrolase and gluthatione S-transferase P1 genes are not associated with pre-eclampsia in the Coloured population of the Western Cape, South Africa. J Obstet Gynaecol. 2004;24:866–872.[PubMed: 16147638]
35. Laasanen J., Romppanen E.L., Hiltunen M. Two exonic single nucleotide polymorphisms in the microsomal epoxide hydrolase gene are jointly associated with preeclampsia. Eur J Hum Genet. 2002;10:569–573. [PubMed: 12173035]
36. Ohta K., Kobashi G., Hata A. Association between a variant of the glutathione S-transferase P1 gene (GSTP1) and hypertension in pregnancy in Japanese: interaction with parity, age, and genetic factors. Semin Thromb Hemost. 2003;29:653–659. [PubMed: 14719182]
37. Descamps O.S., Bruniaux M., Guilmot P.F. Lipoprotein metabolism of pregnant women is associated with both their genetic polymorphisms and those of their newborn children. J Lipid Res. 2005;46:2405–2414. [PubMed: 16106048]
38. Kim Y.J., Williamson R.A., Chen K. Lipoprotein lipase gene mutations and the genetic susceptibility of preeclampsia. Hypertension. 2001;38:992–996.[PubMed: 11711487]
39. Atkinson K.R., Blumenstein M., Black M.A. An altered pattern of circulating apolipoprotein E3 isoforms is implicated in preeclampsia. J Lipid Res.2009;50:71–80. [PubMed: 18725658]
40. Hubel C.A., Roberts J.M., Ferrell R.E. Association of pre-eclampsia with common coding sequence variations in the lipoprotein lipase gene. Clin Genet.1999;56:289–296. [PubMed: 10636447]
41. Zhang C., Austin M.A., Edwards K.L. Functional variants of the lipoprotein lipase gene and the risk of preeclampsia among non-Hispanic Caucasian women. Clin Genet. 2006;69:33–39. [PubMed: 16451134]
42. Roberts J.M., Pearson G., Cutler J. Summary of the NHLBI Working Group on Research on Hypertension During Pregnancy. Hypertension.2003;41:437–445. [PubMed: 12623940]
43. Wang J.X., Knottnerus A.M., Schuit G. Surgically obtained sperm, and risk of gestational hypertension and pre-eclampsia. Lancet. 2002;359:673–674.[PubMed: 11879865]
44. Hiby S.E., Walker J.J., O’Shaughnessy K.M. Combinations of maternal KIR and fetal HLA-C genes influence the risk of preeclampsia and reproductive success. J Exp Med. 2004;200:957–965. [PMCID: PMC2211839] [PubMed: 15477349]
45. Parham P. MHC class I molecules and KIRs in human history, health and survival. Nat Rev Immunol. 2005;5:201–214. [PubMed: 15719024]
46. Moreau P., Contu L., Alba F. HLA-G gene polymorphism in human placentas: possible association of G*0106 allele with preeclampsia and miscarriage.Biol Reprod. 2008;79:459–467. [PubMed: 18509163]
47. Tan C.Y., Ho J.F., Chong Y.S. Paternal contribution of HLA-G*0106 significantly increases risk for pre-eclampsia in multigravid pregnancies. Mol Hum Reprod. 2008;14:317–324. [PubMed: 18353802]
48. LaMarca B.D., Ryan M.J., Gilbert J.S. Inflammatory cytokines in the pathophysiology of hypertension during preeclampsia. Curr Hypertens Rep.2007;9:480–485. [PubMed: 18367011]
49. Alexander B.T., Cockrell K.L., Massey M.B. Tumor necrosis factor-alpha-induced hypertension in pregnant rats results in decreased renal neuronal nitric oxide synthase expression. Am J Hypertens. 2002;15:170–175. [PubMed: 11863253]
50. Sharma A., Satyam A., Sharma J.B. Leptin, IL-10 and inflammatory markers (TNF-alpha, IL-6 and IL-8) in pre-eclamptic, normotensive pregnant and healthy non-pregnant women. Am J Reprod Immunol. 2007;58:21–30. [PubMed: 17565544]
51. Elahi M.M., Asotra K., Matata B.M. Tumor necrosis factor alpha -308 gene locus promoter polymorphism: an analysis of association with health and disease. Biochim Biophys Acta. 2009;1792:163–172. [PubMed: 19708125]
52. Saarela T., Hiltunen M., Helisalmi S. Tumour necrosis factor-alpha gene haplotype is associated with pre-eclampsia. Mol Hum Reprod. 2005;11:437–440. [PubMed: 15901845]
53. Bombell S., McGuire W. Tumour necrosis factor (-308A) polymorphism in pre-eclampsia: meta-analysis of 16 case-control studies. Aust N Z J Obstet Gynaecol. 2008;48:547–551. [PubMed: 19133041]
54. Renaud S.J., Macdonald-Goodfellow S.K., Graham C.H. Coordinated regulation of human trophoblast invasiveness by macrophages and interleukin 10.Biol Reprod. 2007;76:448–454. [PubMed: 17151353]
55. Makris A., Xu B., Yu B. Placental deficiency of interleukin-10 (IL-10) in preeclampsia and its relationship to an IL10 promoter polymorphism. Placenta.2006;27:445–451. [PubMed: 16026832]
56. Daher S., Sass N., Oliveira L.G. Cytokine genotyping in preeclampsia. Am J Reprod Immunol. 2006;55:130–135. [PubMed: 16433832]
57. Goddard K.A., Tromp G., Romero R. Candidate-gene association study of mothers with pre-eclampsia, and their infants, analyzing 775 SNPs in 190 genes. Hum Hered. 2007;63:1–16. [PubMed: 17179726]
58. Kamali-Sarvestani E., Kiany S., Gharesi-Fard B. Association study of IL-10 and IFN-gamma gene polymorphisms in Iranian women with preeclampsia.J Reprod Immunol. 2006;72:118–126. [PubMed: 16863661]
59. Faisel F., Romppanen E.L., Hiltunen M. Polymorphism in the interleukin 1 receptor antagonist gene in women with preeclampsia. J Reprod Immunol.2003;60:61–70. [PubMed: 14568678]
60. Haggerty C.L., Ferrell R.E., Hubel C.A. Association between allelic variants in cytokine genes and preeclampsia. Am J Obstet Gynecol. 2005;193:209–215.[PubMed: 16021081]
61. Zusterzeel P.L., Peters W.H., Burton G.J. Susceptibility to pre-eclampsia is associated with multiple genetic polymorphisms in maternal biotransformation enzymes. Gynecol Obstet Invest. 2007;63:209–213. [PubMed: 17167268]
62. Buimer M., Keijser R., Jebbink J.M. Seven placental transcripts characterize HELLP-syndrome. Placenta. 2008;29:444–453. [PubMed: 18374411]
63. Raijmakers M.T., Roes E.M., Steegers E.A. The C242T-polymorphism of the NADPH/NADH oxidase gene p22phox subunit is not associated with pre-eclampsia. J Hum Hypertens. 2002;16:423–425. [PubMed: 12037698]
64. Rosta K., Molvarec A., Enzsoly A. Association of extracellular superoxide dismutase (SOD3) Ala40Thr gene polymorphism with pre-eclampsia complicated by severe fetal growth restriction. Eur J Obstet Gynecol Reprod Biol. 2009;142:134–138. [PubMed: 19108943]
65. Arngrimsson R., Sigurdardottir S., Frigge M.L. A genome-wide scan reveals a maternal susceptibility locus for pre-eclampsia on chromosome 2p13. Hum Mol Genet. 1999;8:1799–1805. [PubMed: 10441346]
66. Laivuori H., Lahermo P., Ollikainen V. Susceptibility loci for preeclampsia on chromosomes 2p25 and 9p13 in Finnish families. Am J Hum Genet.2003;72:168–177. [PMCID: PMC378622] [PubMed: 12474145]
67. Moses E.K., Lade J.A., Guo G. A genome scan in families from Australia and New Zealand confirms the presence of a maternal susceptibility locus for pre-eclampsia, on chromosome 2. Am J Hum Genet. 2000;67:1581–1585. [PMCID: PMC1287935] [PubMed: 11035632]
68. Lachmeijer A.M., Arngrimsson R., Bastiaans E.J. A genome-wide scan for preeclampsia in the Netherlands. Eur J Hum Genet. 2001;9:758–764.[PubMed: 11781687]
69. Lander E., Kruglyak L. Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nat Genet. 1995;11:241–247.[PubMed: 7581446]
70. Zintzaras E., Kitsios G., Harrison G.A. Heterogeneity-based genome search meta-analysis for preeclampsia. Hum Genet. 2006;120:360–370.[PubMed: 16868762]
71. Akolekar R., Etchegaray A., Zhou Y. Maternal serum activin a at 11-13 weeks of gestation in hypertensive disorders of pregnancy. Fetal Diagn Ther.2009;25:320–327. [PubMed: 19776595]
72. Roten L.T., Johnson M.P., Forsmo S. Association between the candidate susceptibility gene ACVR2A on chromosome 2q22 and pre-eclampsia in a large Norwegian population-based study (the HUNT study) Eur J Hum Genet. 2009;17:250–257. [PMCID: PMC2696227] [PubMed: 18781190]
73. Fitzpatrick E., Johnson M.P., Dyer T.D. Genetic association of the activin A receptor gene (ACVR2A) and pre-eclampsia. Mol Hum Reprod.2009;15:195–204. [PMCID: PMC2647107] [PubMed: 19126782]
74. Riento K., Ridley A.J. Rocks: multifunctional kinases in cell behaviour. Nat Rev Mol Cell Biol. 2003;4:446–456. [PubMed: 12778124]
75. Kandabashi T., Shimokawa H., Miyata K. Inhibition of myosin phosphatase by upregulated rho-kinase plays a key role for coronary artery spasm in a porcine model with interleukin-1beta. Circulation. 2000;101:1319–1323. [PubMed: 10725293]
76. Ark M., Yilmaz N., Yazici G. Rho-associated protein kinase II (rock II) expression in normal and preeclamptic human placentas. Placenta. 2005;26:81–84. [PubMed: 15664415]
77. Johnson M.P., Roten L.T., Dyer T.D. The ERAP2 gene is associated with preeclampsia in Australian and Norwegian populations. Hum Genet. 2009;126:655–666. [PMCID: PMC2783187] [PubMed: 19578876]
78. WTCCC Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007;447:661–678.[PMCID: PMC2719288] [PubMed: 17554300]
79. Bezerra P.C., Leao M.D., Queiroz J.W. Family history of hypertension as an important risk factor for the development of severe preeclampsia. Acta Obstet Gynecol Scand. 2010;89:612–617. [PubMed: 20423274]
80. Hatada I., Mukai T. Genomic imprinting of p57KIP2, a cyclin-dependent kinase inhibitor, in mouse. Nat Genet. 1995;11:204–206. [PubMed: 7550351]
81. Oudejans C.B., Mulders J., Lachmeijer A.M. The parent-of-origin effect of 10q22 in pre-eclamptic females coincides with two regions clustered for genes with down-regulated expression in androgenetic placentas. Mol Hum Reprod. 2004;10:589–598. [PubMed: 15208369]
82. Rigourd V., Chauvet C., Chelbi S.T. STOX1 overexpression in choriocarcinoma cells mimics transcriptional alterations observed in preeclamptic placentas. PLoS One. 2008;3:e3905. [PMCID: PMC2592700] [PubMed: 19079545]
83. Berends A.L., Bertoli-Avella A.M., de Groot C.J. STOX1 gene in pre-eclampsia and intrauterine growth restriction. BJOG. 2007;114:1163–1167.[PubMed: 17617193]
84. Iglesias-Platas I., Monk D., Jebbink J. STOX1 is not imprinted and is not likely to be involved in preeclampsia. Nat Genet. 2007;39:279–280. author reply 280–271. [PubMed: 17325670]
85. Kivinen K., Peterson H., Hiltunen L. Evaluation of STOX1 as a preeclampsia candidate gene in a population-wide sample. Eur J Hum Genet.2007;15:494–497. [PubMed: 17290274]
86. Yu L., Chen M., Zhao D. The H19 gene imprinting in normal pregnancy and pre-eclampsia. Placenta. 2009;30:443–447. [PubMed: 19342096]
87. Nussbaum R.L., McInnes R.R., Willard H.F. In: Genetics in medicine. 6th ed. Thompson and Thompson, editor. Saunders; Philadelphia: 2004. pp. 289–309.
88. Treloar S.A., Cooper D.W., Brennecke S.P. An Australian twin study of the genetic basis of preeclampsia and eclampsia. Am J Obstet Gynecol.2001;184:374–381. [PubMed: 11228490]
89. Ronningen K.S., Paltiel L., Meltzer H.M. The biobank of the Norwegian mother and child cohort study: a resource for the next 100 years. Eur J Epidemiol. 2006;21:619–625. [PMCID: PMC1820840] [PubMed: 17031521]
90. Kho E.M., McCowan L.M., North R.A. Duration of sexual relationship and its effect on preeclampsia and small for gestational age perinatal outcome. J Reprod Immunol. 2009;82:66–73. [PubMed: 19679359]


Genomics and Evolution

Author: Marcus W. Feldman, PhD


Insofar as the genetic evolution of modern humans is concerned, large-scale SNP studies of worldwide populations have provided a consistent picture of a migration out of Africa that gave rise to the human populations of the other continents. This migration probably began 60–80 kya, was probably not continuous, and could have resulted in a division during the passage through the Levant en route from east Africa. One division may have moved in a more southerly direction towards south and east Asia, possibly to Australia, and eventually, 15–30 kya, into the Americas. The other division may have “turned left” and moved towards Europe.

In this process, which we call the “serial founder” model of human expansion (refs. 1, 2), migration and demography probably had effects that constrained the subsequent action of natural selection on human genes.

  • Variation in skin pigmentation genes today provides some of the strongest signals of natural selection during this human expansion.
  • However, it is also likely that immune response genes, e.g., MHC genes, achieved their high levels of polymorphism in response to new pathogens encountered in the great expansion.

Many of the strongest signals of natural selection indicate the importance of the innovations of farming and pastoralism. The gene sequences involved in lactose tolerance and starch metabolism, for example, are strikingly different in groups that adopted dairying or farming, respectively, from hunter-gatherers, who did not.

From the analysis of SNPs, I take home two messages.

  • The first is that although some parts of the genome show clear signals of selection, most of our DNA perceived via SNPs does not.
  • The second is that population growth and migration have been major forces in determining the patterns of variation. Indeed, recent analyses of exome sequences confirm that the spectrum of rare allele frequencies is compatible only with recent and rapid population growth (ref. 3).
  • Consistent with the first message, recent analyses of the 1000 Genomes data, that is, data from whole genome sequencing of one thousand human genomes representing Africa (Yoruba), Europe (from Utah), and East Asia (China and Japan), identified only 35 non-synonymous SNPs from 33 genes as having been subject to recent adaptive selection (ref. 4).
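
The "spectrum of rare allele frequencies" mentioned above is usually summarized as a site frequency spectrum (SFS): the number of variable sites at which the derived allele appears once, twice, and so on in the sample. The sketch below is only an illustration, not the method of the cited studies. The function names and the toy data are hypothetical; the comparison baseline is the standard neutral, constant-population-size expectation, under which the share of sites with i derived copies is proportional to 1/i. An excess of singletons over that baseline is the kind of pattern generally read as a sign of recent population growth.

```python
from collections import Counter
from typing import List


def site_frequency_spectrum(derived_counts: List[int], n: int) -> List[int]:
    """Count sites by derived-allele count in a sample of n chromosomes.

    Entry i-1 of the result is the number of sites with exactly i derived
    copies, for i = 1 .. n-1. Sites with 0 or n copies are not variable
    in the sample and are ignored.
    """
    counts = Counter(c for c in derived_counts if 0 < c < n)
    return [counts.get(i, 0) for i in range(1, n)]


def neutral_expected_fractions(n: int) -> List[float]:
    """Expected share of sites per frequency class under the standard
    neutral model with constant population size (proportional to 1/i)."""
    weights = [1.0 / i for i in range(1, n)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical toy data: derived-allele counts at 9 sites, n = 5 chromosomes.
    toy_counts = [1, 1, 1, 2, 1, 3, 1, 0, 4]
    sfs = site_frequency_spectrum(toy_counts, n=5)
    observed_singletons = sfs[0] / sum(sfs)
    expected_singletons = neutral_expected_fractions(5)[0]
    print("SFS:", sfs)
    print("observed singleton share:", observed_singletons)
    print("neutral expected singleton share:", expected_singletons)
```

Real analyses work with thousands of chromosomes and fit explicit demographic models; this toy comparison only shows what is being compared.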

The next phase of genomic analysis of humans, complete exome sequencing of large cohorts, or whole genome sequencing of samples from many representative populations, will focus more on two themes.

  • The first will be the role of rare alleles in human phenotypes, especially diseases. The previous phase, GWAS (genome-wide association studies), has been disappointing in revealing genetic “causes” of complex traits.
  • However, my view is that the second theme, the molecular genetics of gene regulation, and interaction of this regulation with the environment, is likely to have bigger payoffs, not only for determination of phenotypes, but also in showing where in the genome the strongest signals of selection lie. As more methylation profiles, small RNA patterns of interference, and other gene-regulatory analyses of whole genomes are completed, both the medical and evolutionary significance of DNA variation will become clearer.

Pemberton, T. J., D. Absher, M. W. Feldman, R. M. Myers, N. A. Rosenberg, and J. Z. Li. 2012. Genomic patterns of homozygosity in worldwide human populations. Am. J. Hum. Genet. 91: 275–292.

Genome-wide patterns of homozygosity runs and their variation across individuals provide a valuable and often untapped resource for studying human genetic diversity and evolutionary history. Using genotype data at 577,489 autosomal SNPs, we employed a likelihood-based approach to identify runs of homozygosity (ROH) in 1,839 individuals representing 64 worldwide populations, classifying them by length into three classes (short, intermediate, and long) with a model-based clustering algorithm. For each class, the number and total length of ROH per individual show considerable variation across individuals and populations. The total lengths of short and intermediate ROH per individual increase with the distance of a population from East Africa, in agreement with similar patterns previously observed for locus-wise homozygosity and linkage disequilibrium. By contrast, total lengths of long ROH show large inter-individual variations that probably reflect recent inbreeding patterns, with higher values occurring more often in populations with known high frequencies of consanguineous unions. Across the genome, distributions of ROH are not uniform, and they have distinctive continental patterns. ROH frequencies across the genome are correlated with local genomic variables such as recombination rate, as well as with signals of recent positive selection. In addition, long ROH are more frequent in genomic regions harboring genes associated with autosomal-dominant diseases than in regions not implicated in Mendelian diseases. These results provide insight into the way in which homozygosity patterns are produced, and they generate baseline homozygosity patterns that can be used to aid homozygosity mapping of genes associated with recessive diseases.
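
To make the idea of a run of homozygosity concrete, here is a deliberately simplified sketch. It is not the likelihood-based approach described in the abstract: it scans one individual's genotypes and reports every stretch of consecutive homozygous calls of at least a chosen length. The genotype encoding (0 or 2 for homozygous, 1 for heterozygous), the function name `find_roh`, and the `min_snps` threshold are all assumptions made for illustration.

```python
from typing import List, Tuple


def find_roh(
    genotypes: List[int], positions: List[int], min_snps: int = 3
) -> List[Tuple[int, int, int]]:
    """Return (start_position, end_position, n_snps) for each run of
    consecutive homozygous genotypes containing at least min_snps SNPs.

    Assumed encoding: 0 or 2 = homozygous, 1 = heterozygous.
    """
    if len(genotypes) != len(positions):
        raise ValueError("genotypes and positions must have the same length")

    runs = []
    start = None  # index where the current homozygous run began
    for i, g in enumerate(genotypes):
        if g != 1:
            if start is None:
                start = i
        else:
            # A heterozygous call ends the current run, if any.
            if start is not None and i - start >= min_snps:
                runs.append((positions[start], positions[i - 1], i - start))
            start = None
    # Close a run that extends to the last SNP.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((positions[start], positions[-1], len(genotypes) - start))
    return runs


if __name__ == "__main__":
    # Hypothetical toy individual: 11 SNPs spaced 100 bp apart.
    toy_genotypes = [0, 2, 2, 1, 0, 0, 1, 2, 2, 2, 2]
    toy_positions = list(range(100, 1200, 100))
    for run in find_roh(toy_genotypes, toy_positions):
        print(run)
```

A hard scan like this treats any single heterozygous call, including a genotyping error, as the end of a run, which is one reason published methods prefer probabilistic scoring.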

Pepperell, C. S., J. M. Granka, D. C. Alexander, M. A. Behr, L. Chui, J. Gordon, J. L. Guthrie, F. B. Jamieson, D. Langlois-Klassen, R. Long, D. Nguyen, W. Wobeser, and M. W. Feldman. 2011. Dispersal of Mycobacterium tuberculosis via the Canadian fur trade. Proc. Natl. Acad. Sci. USA 108: 6526–6531.

Patterns of gene flow can have marked effects on the evolution of populations. To better understand the migration dynamics of Mycobacterium tuberculosis, we studied genetic data from European M. tuberculosis lineages currently circulating in Aboriginal and French Canadian communities. A single M. tuberculosis lineage, characterized by the DS6Quebec genomic deletion, is at highest frequency among Aboriginal populations in Ontario, Saskatchewan, and Alberta; this bacterial lineage is also dominant among tuberculosis (TB) cases in French Canadians resident in Quebec. Substantial contact between these human populations is limited to a specific historical era (1710–1870), during which individuals from these populations met to barter furs. Statistical analyses of extant M. tuberculosis minisatellite data are consistent with Quebec as a source population for M. tuberculosis gene flow into Aboriginal populations during the fur trade era. Historical and genetic analyses suggest that tiny M. tuberculosis populations persisted for ∼100 y among indigenous populations and subsequently expanded in the late 19th century after environmental changes favoring the pathogen. Our study suggests that spread of TB can occur by two asynchronous processes: (i) dispersal of M. tuberculosis by minimal numbers of human migrants, during which small pathogen populations are sustained by ongoing migration and slow disease dynamics, and (ii) expansion of the M. tuberculosis population facilitated by shifts in host ecology. If generalizable, these migration dynamics can help explain the low DNA sequence diversity observed among isolates of M. tuberculosis and the difficulties in global elimination of tuberculosis, as small, widely dispersed pathogen populations are difficult both to detect and to eradicate.

Henn, B. M., C. R. Gignoux, M. Jobin, J. M. Granka, J. M. Macpherson, J. M. Kidd, L. Rodríguez-Botigué, S. Ramachandran, L. Hon, A. Brisbin, A. A. Lin, P. A. Underhill, D. Comas, K. K. Kidd, P. J. Norman, P. Parham, C. D. Bustamante, J. L. Mountain, and M. W. Feldman. 2011. Hunter-gatherer genomic diversity suggests a southern African origin for modern humans. Proc. Natl. Acad. Sci. USA 108: 5154–5162.

Africa is inferred to be the continent of origin for all modern human populations, but the details of human prehistory and evolution in Africa remain largely obscure owing to the complex histories of hundreds of distinct populations. We present data for more than 580,000 SNPs for several hunter-gatherer populations: the Hadza and Sandawe of Tanzania, and the !Khomani Bushmen of South Africa, including speakers of the nearly extinct N|u language. We find that African hunter-gatherer populations today remain highly differentiated, encompassing major components of variation that are not found in other African populations. Hunter-gatherer populations also tend to have the lowest levels of genome-wide linkage disequilibrium among 27 African populations. We analyzed geographic patterns of linkage disequilibrium and population differentiation, as measured by FST, in Africa. The observed patterns are consistent with an origin of modern humans in southern Africa rather than eastern Africa, as is generally assumed. Additionally, genetic variation in African hunter-gatherer populations has been significantly affected by interaction with farmers and herders over the past 5,000 y, through both severe population bottlenecks and sex-biased migration. However, African hunter-gatherer populations continue to maintain the highest levels of genetic diversity in the world.
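
Population differentiation "as measured by FST" can be illustrated with the textbook heterozygosity-based definition for a single biallelic SNP in two populations: FST = (H_T - H_S) / H_T, where H_T is the expected heterozygosity of the pooled population and H_S is the mean expected heterozygosity within populations. The sketch below assumes equal population weights and known allele frequencies; it is not the estimator used in the study, which is not specified here, and the function name is hypothetical.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """Heterozygosity-based FST at one biallelic SNP for two populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    Assumes equal weighting of the two populations (an illustrative choice).
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")

    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)  # pooled expected heterozygosity
    if h_total == 0.0:
        return 0.0  # monomorphic in both populations: no differentiation to measure
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Hypothetical allele frequencies, from identical to fully diverged populations.
    for freqs in [(0.5, 0.5), (0.2, 0.8), (0.0, 1.0)]:
        print(freqs, fst_two_populations(*freqs))
```

Values near 0 indicate that the two populations share similar allele frequencies, and values near 1 indicate nearly fixed differences; genome-wide analyses combine many SNPs and correct for sample size.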

Casto, A. M., and M. W. Feldman. 2011. Genome-wide association study SNPs in the human genome diversity project populations: does selection affect unlinked SNPs with shared trait associations? PLoS Genet. 7(1): e1001266.

Genome-wide association studies (GWAS) have identified more than 2,000 trait-SNP associations, and the number continues to increase. GWAS have focused on traits with potential consequences for human fitness, including many immunological, metabolic, cardiovascular, and behavioral phenotypes. Given the polygenic nature of complex traits, selection may exert its influence on them by altering allele frequencies at many associated loci, a possibility which has yet to be explored empirically. Here we use 38 different measures of allele frequency variation and 8 iHS scores to characterize over 1,300 GWAS SNPs in 53 globally distributed human populations. We apply these same techniques to evaluate SNPs grouped by trait association. We find that groups of SNPs associated with pigmentation, blood pressure, infectious disease, and autoimmune disease traits exhibit unusual allele frequency patterns and elevated iHS scores in certain geographical locations. We also find that GWAS SNPs have generally elevated scores for measures of allele frequency variation and for iHS in Eurasia and East Asia. Overall, we believe that our results provide evidence for selection on several complex traits that has caused changes in allele frequencies and/or elevated iHS scores at a number of associated loci. Since GWAS SNPs collectively exhibit elevated allele frequency measures and iHS scores, selection on complex traits may be quite widespread. Our findings are most consistent with this selection being either positive or negative, although the relative contributions of the two are difficult to discern. Our results also suggest that trait-SNP associations identified in Eurasian samples may not be present in Africa, Oceania, and the Americas, possibly due to differences in linkage disequilibrium patterns. This observation suggests that non-Eurasian and non-East Asian sample populations should be included in future GWAS.

Casto, A. M., J. Z. Li, D. Absher, R. Myers, S. Ramachandran, and M. W. Feldman. 2010. Characterization of X-linked SNP genotypic variation in globally distributed human populations. Genome Biol. 11:R10.

Background: The transmission pattern of the human X chromosome reduces its population size relative to the autosomes, subjects it to disproportionate influence by female demography, and leaves X-linked mutations exposed to selection in males. As a result, the analysis of X-linked genomic variation can provide insights into the influence of demography and selection on the human genome. Here we characterize the genomic variation represented by 16,297 X-linked SNPs genotyped in the CEPH human genome diversity project samples.
Results: We found that X chromosomes tend to be more differentiated between human populations than autosomes, with several notable exceptions. Comparisons between genetically distant populations also showed an excess of X-linked SNPs with large allele frequency differences. Combining information about these SNPs with results from tests designed to detect selective sweeps, we identified two regions that were clear outliers from the rest of the X chromosome for haplotype structure and allele frequency distribution. We were also able to more precisely define the geographical extent of some previously described X-linked selective sweeps.
Conclusions: The relationship between male and female demographic histories is likely to be complex as evidence supporting different conclusions can be found in the same dataset. Although demography may have contributed to the excess of SNPs with large allele frequency differences observed on the X chromosome, we believe that selection is at least partially responsible. Finally, our results reveal the geographical complexities of selective sweeps on the X chromosome and argue for the use of diverse populations in studies of selection.
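The comparison described in the Results above, an excess of X-linked SNPs with large allele frequency differences between two populations, can be sketched roughly as follows. This is a toy illustration under stated assumptions: the function name `fraction_large_delta`, the 0.5 cutoff, and the example frequencies are all hypothetical and are not taken from Casto et al.

```python
from typing import Sequence


def fraction_large_delta(
    freqs_a: Sequence[float],
    freqs_b: Sequence[float],
    threshold: float = 0.5,  # illustrative cutoff, not from the paper
) -> float:
    """Fraction of SNPs whose allele frequency differs by at least
    `threshold` between population A and population B.

    `freqs_a[i]` and `freqs_b[i]` are the frequencies of the same
    allele at SNP i in the two populations.
    """
    if len(freqs_a) != len(freqs_b):
        raise ValueError("both populations must cover the same SNPs")
    if not freqs_a:
        return 0.0
    large = sum(
        1 for a, b in zip(freqs_a, freqs_b) if abs(a - b) >= threshold
    )
    return large / len(freqs_a)


if __name__ == "__main__":
    # Hypothetical frequencies at four X-linked and four autosomal SNPs.
    x_a, x_b = [0.1, 0.9, 0.5, 0.2], [0.8, 0.2, 0.55, 0.25]
    auto_a, auto_b = [0.3, 0.4, 0.5, 0.6], [0.35, 0.45, 0.9, 1.0]
    print("X-linked:", fraction_large_delta(x_a, x_b))
    print("autosomal:", fraction_large_delta(auto_a, auto_b))
```

Running the same function on X-linked and autosomal SNP sets for the same population pair gives two fractions that can be compared directly; whether any difference reflects selection or demography is the question the abstract raises.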


1.  Cavalli-Sforza, L.L., and M.W. Feldman. 2003. The application of molecular genetic approaches to the study of human evolution. Nat. Genet. Supp. 33: 266–275.

2.  Henn, B. M., L. L. Cavalli-Sforza, and M. W. Feldman. 2012. The great human expansion. Proc. Natl. Acad. Sci. USA 109: 17758–17764.

3.  Keinan, A., and A. G. Clark. 2012. Recent explosive human population growth has resulted in an excess of rare genetic variants. Science 336: 740–743.

4.  Grossman, S. R., K. G. Andersen, I. Shlyakhter, S. Tabrizi, S. Winnicki, A. Yen, D. J. Park, D. Griesemer, E. K. Karlsson, S. H. Wong, M. Cabili, R. A. Adegbola, R. N. K. Bamezai, A. V. S. Hill, F. O. Vannberg, J. L. Rinn, 1000 Genomes Project, E. S. Lander, S. F. Schaffner, and P. C. Sabeti. 2013. Identifying recent adaptations in large-scale genomic data. Cell 152: 703–713.
