Posts Tagged ‘Genome Biology’

Recognitions for Contributions in Genomics by Dan David Prize Awards

Reporter: Aviva Lev-Ari, PhD, RN

The Source for this List is a Search for “Genomics” on the Dan David Prize website


This is a compilation of all Dan David Prizes awarded in the Field of Genomics

When Will Genomics Cure Cancer?
A conversation with the biogeneticist ERIC S. LANDER [2012 laureate] about how genetic advances are transforming medical treatment “Eric S. Lander, one of the leaders of the Human Genome Project, a map of the 3 billion letters of DNA that make up a…

J. Craig Venter
Founder, Chairman, and President of the J. Craig Venter Institute, Rockville, MD and La Jolla, CA, USA and CEO of Synthetic Genomics Inc., La Jolla, CA, USA.

David Botstein
Anthony B. Evnin Professor of Genomics; Director, Lewis-Sigler Institute for Integrative Genomics; Director, Certificate Program in Quantitative and Computational Biology, Princeton University, Princeton, NJ, USA.

Laureates Announced 2012
Dan David Prize 2012 Laureates Announced: Robert Conquest, Martin Gilbert – for Biography/History; William Kentridge – for Plastic Arts; David Botstein, Craig Venter, Eric Lander – for Genome Research. Tel Aviv (February 27, 2012): The international Dan…

Cutting Edge Genomic Research in the World’s First Carbon-Neutral Laboratory Facility
J. CRAIG VENTER, 2012 laureate, is Founder, Chairman, and President of the J. Craig Venter Institute, Rockville, MD and La Jolla, CA, USA and CEO of Synthetic Genomics Inc., La Jolla, CA, USA. “One of our quests is to help solve two troubling issues —…

Prof. David Haussler
Prof. David Haussler is a Distinguished Professor of Biomolecular Engineering at the University of California, Santa Cruz, and Scientific Director of the UC Santa Cruz Genomics Institute.

Eric Lander
Founding Director, Broad Institute Harvard and MIT and director of its Genome Biology Program, Cambridge, MA, USA.

Future – Bioinformatics
Bioinformatics is a field in which mathematics, statistics, and computer algorithms are harnessed towards novel biological discoveries. Bioinformatics methodologies have revolutionized biology, by making it more quantitative and less descriptive….

J. CRAIG VENTER – Life at the Speed of Light
The Dawn of an Era In his NEW BOOK ‘Life at the Speed of Light: From the Double Helix to the Dawn of Digital Life’ J. CRAIG VENTER, 2012 laureate, explains the coming era of discovery (see Wired interview below). What is the significance of Venter’s…

From the Press : Hebrew
The Marker, June 14, 2012 – Dan David Prize: The Next Generation; Calcalist, June 14, 2012 – Dan David Prize Awarded: Thoughts of Creating Life, Boycotting Scientists, Protests, Entrepreneurs and Ceremonies; Ma’ariv, June 12, 2012 – Who Attended the Dan…

Gary Ruvkun
Professor of Genetics, Department of Molecular Biology, Massachusetts General Hospital, Harvard University. Gary Ruvkun has made a major contribution to the future of human health with the discovery of conserved hormonal signaling pathways with…

Selected Fields 2012
Past – HISTORY / BIOGRAPHY Biography is an important sub-discipline of history. Every progressive society makes room for achievement and excellence. Since ancient times, this has been done by immortalizing the names of heroes, role models and…

Prof. Michael S. Waterman
Prof. Michael S. Waterman is Professor of Biological Sciences, of Mathematics, of Computer Science, Department of Biological Sciences, University of Southern California.

Other related articles published in this Open Access Online Scientific Journal include the following:

2013 Genomics: The Era Beyond the Sequencing of the Human Genome: Francis Collins, Craig Venter, Eric Lander, et al.

Curator: Aviva Lev-Ari, PhD, RN





Writer and Curator: Larry H. Bernstein, MD, FCAP


Implementation and utilization of genetic testing in personalized medicine

NS Abul-Husn, AO Obeng, SC Sanderson, O Gottesman, S A Scott
Pharmacogenomics and Personalized Medicine 2014:7 227–240

Clinical genetic testing began over 30 years ago with the availability of mutation detection for sickle cell disease diagnosis. Since then, the field has dramatically transformed to include gene sequencing, high-throughput targeted genotyping, prenatal mutation detection, preimplantation genetic diagnosis, population-based carrier screening, and now genome-wide analyses using microarrays and next-generation sequencing. Despite these significant advances in molecular technologies and testing capabilities, clinical genetics laboratories historically have been centered on mutation detection for Mendelian disorders. However, the ongoing identification of deoxyribonucleic acid (DNA) sequence variants associated with common diseases prompted the availability of testing for personal disease risk estimation, and created commercial opportunities for direct-to-consumer genetic testing companies that assay these variants. This germline genetic risk and other clinical, family, and demographic variables are the key components of the personalized medicine paradigm, which aims to apply personal genomic and other relevant data to a patient’s clinical assessment to more precisely guide medical management. However, genetic testing for disease risk estimation is an ongoing topic of debate, largely due to inconsistencies in the results, concerns over clinical validity and utility, and the variable mode of delivery when returning genetic results to patients in the absence of traditional counseling. A related class of genetic testing with analogous issues of clinical utility and acceptance is pharmacogenetic testing, which interrogates sequence variants implicated in interindividual drug response variability.
Although clinical pharmacogenetic testing has not previously been widely adopted, advances in rapid turnaround time genetic testing technology and the recent implementation of preemptive genotyping programs at selected medical centers suggest that personalized medicine through pharmacogenetics is now a reality. This review aims to summarize the current state of implementing genetic testing for personalized medicine, with an emphasis on clinical pharmacogenetic testing.

Pharmacogenomic knowledge gaps and educational resource needs among physicians in selected specialties

Katherine A Johansen Taber, Barry D Dickinson
Pharmacogenomics and Personalized Medicine 2014:7 145–162

Background: The use of pharmacogenomic testing in the clinical setting has the potential to improve the safety and effectiveness of drug therapy, yet studies have revealed that physicians lack knowledge about the topic of pharmacogenomics, and are not prepared to implement it in the clinical setting. This study further explores the pharmacogenomic knowledge deficit and educational resource needs among physicians.
Materials and methods: Surveys of primary care physicians, cardiologists, and psychiatrists were conducted.
Results: Few physicians reported familiarity with the topic of pharmacogenomics, but more reported confidence in their knowledge about the influence of genetics on drug therapy. Only a small minority had undergone formal training in pharmacogenomics, and a majority reported being unsure what type of pharmacogenomic tests were appropriate to order for the clinical situation. Respondents indicated that an ideal pharmacogenomic educational resource should be electronic and include such components as how to interpret pharmacogenomic test results, recommendations for prescribing, population subgroups most likely to be affected, and contact information for laboratories offering pharmacogenomic testing.
Conclusion: Physicians continue to demonstrate pharmacogenomic knowledge gaps, and are unsure about how to use pharmacogenomic testing in clinical practice. Educational resources that are clinically oriented and easily accessible are preferred by physicians, and may best support appropriate clinical implementation of pharmacogenomics.

Developing genomic knowledge bases and databases to support clinical management: current perspectives

Vojtech Huser, Murat Sincan, James J Cimino
Pharmacogenomics and Personalized Medicine 2014:7 275–283

Personalized medicine, the ability to tailor diagnostic and treatment decisions for individual patients, is seen as the evolution of modern medicine. We characterize here the informatics resources available today or envisioned in the near future that can support clinical interpretation of genomic test results. We assume a clinical sequencing scenario (germline whole-exome sequencing) in which a clinical specialist, such as an endocrinologist, needs to tailor patient management decisions within his or her specialty (targeted findings) but relies on a genetic counselor to interpret off-target incidental findings. We characterize the genomic input data and list various types of knowledge bases that provide genomic knowledge for generating clinical decision support. We highlight the need for patient-level databases with detailed lifelong phenotype content in addition to genotype data and provide a list of recommendations for personalized medicine knowledge bases and databases. We conclude that no single knowledge base can currently support all aspects of personalized recommendations and that consolidation of several current resources into larger, more dynamic and collaborative knowledge bases may offer a future path forward.


Tumor Heterogeneity: Mechanisms and Bases for a Reliable Application of Molecular Marker Design

Salvador J. Diaz-Cano
Int. J. Mol. Sci. 2012, 13, 1951-2011; http://dx.doi.org/10.3390/ijms13021951

Tumor heterogeneity is a confusing finding in the assessment of neoplasms, potentially resulting in inaccurate diagnostic, prognostic and predictive tests. This tumor heterogeneity is not always a random and unpredictable phenomenon, and understanding it helps in designing better tests. The biologic reasons for this intratumoral heterogeneity would then be important to understand both the natural history of neoplasms and the selection of test samples for reliable analysis. The main factors contributing to intratumoral heterogeneity inducing gene abnormalities or modifying its expression include: the gradient ischemic level within neoplasms, the action of tumor microenvironment (bidirectional interaction between tumor cells and stroma), mechanisms of intercellular transference of genetic information (exosomes), and differential mechanisms of sequence-independent modifications of genetic material and proteins. The intratumoral heterogeneity is at the origin of tumor progression and it is also the byproduct of the selection process during progression. Any analysis of heterogeneity mechanisms must be integrated within the process of segregation of genetic changes in tumor cells during the clonal expansion and progression of neoplasms. The evaluation of these mechanisms must also consider the redundancy and pleiotropism of molecular pathways, for which appropriate surrogate markers would support the presence or not of heterogeneous genetics and the main mechanisms responsible. This knowledge would constitute a solid scientific background for future therapeutic planning.

Systematic evaluation of connectivity map for disease indications

Jie Cheng, Lun Yang, Vinod Kumar and Pankaj Agarwal
Genome Medicine 2014, 6:95 http://genomemedicine.com/content/6/12/95

Background: Connectivity map data and associated methodologies have become a valuable tool in understanding drug mechanism of action (MOA) and discovering new indications for drugs. One of the key ideas of connectivity map (CMAP) is to measure the connectivity between disease gene expression signatures and compound-induced gene expression profiles. Despite multiple impressive anecdotal validations, only a few systematic evaluations have assessed the accuracy of this aspect of CMAP, and most of these utilize drug-to-drug matching to transfer indications across the two drugs.
Methods: To assess CMAP methodologies in a more direct setting, namely the power of classifying known drug-disease relationships, we evaluated three CMAP-based methods on their prediction performance against a curated dataset of 890 true drug-indication pairs. The disease signatures were generated using Gene Logic BioExpress system and the compound profiles were derived from the Connectivity Map database (CMAP, build 02, http://www.broadinstitute.org/CMAP/).
Results: The similarity scoring algorithm called eXtreme Sum (XSum) performs better than the standard Kolmogorov-Smirnov (KS) statistic in terms of the area under curve and can achieve a four-fold enrichment at the 0.01 false positive rate level, with AUC = 2.2E-4, P value = 0.0035.
Conclusion: Connectivity map can significantly enrich true positive drug-indication pairs given an effective matching algorithm.
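
The "four-fold enrichment at the 0.01 false positive rate level" reported above can be read as the true positive rate reached at that cutoff divided by the false positive rate itself. The sketch below is not the authors' code. It assumes that each candidate drug-indication pair has a similarity score (higher meaning a stronger predicted match) and a known true/false label, and it ignores tied scores. The function names `tpr_at_fpr` and `fold_enrichment` are hypothetical.

```python
def tpr_at_fpr(scores, labels, max_fpr=0.01):
    """True positive rate reached while the false positive rate stays <= max_fpr.

    scores: one similarity score per drug-indication pair (higher = better match).
    labels: True for a known drug-indication pair, False otherwise.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(1 for _, is_true in ranked if is_true)
    n_neg = len(ranked) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one true and one false pair")

    tp = fp = 0
    best_tpr = 0.0
    # Walk down the ranking, stopping once too many false pairs are admitted.
    for _, is_true in ranked:
        if is_true:
            tp += 1
        else:
            fp += 1
        if fp / n_neg > max_fpr:
            break
        best_tpr = tp / n_pos
    return best_tpr


def fold_enrichment(scores, labels, max_fpr=0.01):
    """Ratio of the observed TPR to the TPR expected from random ranking (= max_fpr)."""
    return tpr_at_fpr(scores, labels, max_fpr) / max_fpr
```

Under this reading, a four-fold enrichment at a false positive rate of 0.01 corresponds to recovering about 4% of the true pairs before 1% of the false pairs have been admitted.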

Pharmacogenetics of Statin-Induced Myopathy: A Focused Review of the Clinical Translation of Pharmacokinetic Genetic Variants

Jasmine A Talameh and Joseph P Kitzmiller
J Pharmacogenomics Pharmacoproteomics 2014, 5:2 http://dx.doi.org/10.4172/2153-0645.1000128

Statins are the most commonly prescribed drugs in the United States and are extremely effective in reducing major cardiovascular events in the millions of Americans with hyperlipidemia. However, many patients (up to 25%) cannot tolerate or discontinue statin therapy due to statin-induced myopathy (SIM). Patients will continue to experience SIM at unacceptably high rates or experience unnecessary cardiovascular events (as a result of discontinuing or decreasing their statin therapy) until strategies for predicting or mitigating SIM are identified. A promising strategy for predicting or mitigating SIM is pharmacogenetic testing, particularly of pharmacokinetic genetic variants, as SIM is related to statin exposure. Data is emerging on the association between pharmacokinetic genetic variants and SIM.
A current, critical evaluation of the literature on pharmacokinetic genetic variants and SIM for potential translation to clinical practice is lacking. This review focuses specifically on pharmacokinetic genetic variants and their association with SIM clinical outcomes. We also discuss future directions, specific to the research on pharmacokinetic genetic variants, which could speed the translation into clinical practice. For simvastatin, we did not find sufficient evidence to support the clinical translation of pharmacokinetic genetic variants other than SLCO1B1. However, SLCO1B1 may also be clinically relevant for pravastatin- and pitavastatin-induced myopathy, but additional studies assessing SIM clinical outcome are needed. CYP2D6*4 may be clinically relevant for atorvastatin-induced myopathy, but mechanistic studies are needed. Future research efforts need to incorporate statin-specific analyses, multi-variant analyses, and a standard definition of SIM. As the use of statins is extremely common and SIM continues to occur in a significant number of patients, future research investments in pharmacokinetic genetic variants have the potential to make a profound impact on public health.

Benefits of Pharmacogenetics in the Management of Hypertension

Clara Torrellas, Juan Carlos Carril and Ramón Cacabelos
J Pharmacogenomics Pharmacoproteomics 2014, 5:2 http://dx.doi.org/10.4172/2153-0645.1000126

Introduction: Hypertension, suffered by 35% of the population, stands out as the main risk factor for cardiovascular disorders with the highest death rate worldwide. Only a small number of patients with hypertension achieve efficient control over blood pressure (BP) with appropriate drug therapy. Pharmacogenetics, as a tool to identify antihypertensive therapeutic response-associated polymorphisms, could help to reduce this problem.
Objectives: We present here an epidemiological study of the prevalence of hypertension and its pharmacological treatment to demonstrate the error rate that physicians can commit when the patient’s pharmacogenetic profile is unknown.
Method: The sample consisted of 1115 individuals of which 332 met criteria for hypertension. We recorded each patient’s drug prescription prior to their visit to EuroEspes Biomedical Research Center, and analyzed their pharmacogenetic profile.
Results: About 30% of patients were hypertensive, of whom only 40.4% were receiving an active ingredient for hypertension control. Among them, CYP3A4/5 and CYP2C9 were the major metabolizing enzymes. Antagonists of angiotensin II receptors, followed by calcium-blocking agents and beta-adrenergic antagonists were the most commonly-prescribed drug categories. However, 61% of hypertensive patients were not taking suitable antihypertensive agents for their metabolism according to their genetic idiosyncrasy. Furthermore, the highest error rate was determined for CYP2C9.
Conclusion: The introduction of changes in the management of hypertension in the Spanish population could be useful to promote the prevention and treatment of high blood pressure in a more efficient way. The integration of pharmacogenetic testing into routine clinical procedures could optimize the therapeutic response, guiding the physician in the choice of the correct antihypertensive drug and the correct dose. The control of BP arises as an area of particular interest in assessing the validity and utility of pharmacogenetic testing/intervention.

Pharmacogenomics Study of Clopidogrel by RFLP based Genotyping of CYP2C19 in Cardiovascular Disease Patients in North-East Population of India

Prasanthi SV, Vinayak S Jamdade, Nityanand B Bolshette, Ranadeep Gogoi and Mangala Lahkar
J Pharmacogenomics Pharmacoproteomics 2014, 5:3 http://dx.doi.org/10.4172/2153-0645.1000132

Introduction and Objective: Pharmacogenetics is the study of genetically determined variability in drug responses. The genes and their allelic variants which affect our response to drugs are the main routes in the development of pharmacogenetics. Clopidogrel is an antiplatelet drug, used against athero-thrombotic events in cardiovascular patients. The objective of our study was to identify the CYP2C19 single nucleotide polymorphisms responsible for altering the metabolism of clopidogrel at the gene level, and to document the prevalence of CYP2C19 gene mutations in clopidogrel-treated cardiovascular disease patients from the Assam population at Gauhati Medical College & Hospital, in North-East India.
Patients and Methods: We studied 60 patients who received clopidogrel at Gauhati Medical College and Hospital, Assam. Genomic DNA was extracted using the Hipura blood genomic DNA extracting mini preparation kit, following the manufacturer’s instructions. RFLP analysis was performed by amplifying DNA with a set of primers; the resulting amplicons of CYP2C19*2, CYP2C19*3, and CYP2C19*17 were subjected to restriction digestion with SmaI, BamHI, and Lwe0I, respectively.
Results: We found that CYP2C19*2 had an allelic frequency of ~40% at Gauhati Medical College and Hospital, Assam, North East India. None of the samples carried the CYP2C19*3 or CYP2C19*17 allele. Other CYP2C19 variant alleles with reduced or absent enzymatic activity have been identified.
Conclusion: We found that the loss-of-function allele CYP2C19*2 had a higher carriage frequency, whereas the CYP2C19*3 and *17 alleles were not found in cardiovascular patients who were taking clopidogrel. Personalized therapy targeting patients who carry these genetic variants might help to improve the clinical outcome.
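
An allelic frequency such as the ~40% quoted for CYP2C19*2 is conventionally derived from genotype counts: each patient carries two copies of the gene, so heterozygotes contribute one variant allele and homozygotes two. The sketch below illustrates that arithmetic only. The cohort size of 60 comes from the abstract, but the genotype counts used in the example are hypothetical, chosen solely to reproduce a 40% frequency, and the function names are illustrative.

```python
def allele_frequency(n_het, n_hom, n_total):
    """Variant allele frequency in a diploid sample.

    n_het:   patients carrying one copy of the variant allele
    n_hom:   patients carrying two copies
    n_total: all genotyped patients
    """
    if n_total <= 0 or n_het < 0 or n_hom < 0 or n_het + n_hom > n_total:
        raise ValueError("inconsistent genotype counts")
    return (n_het + 2 * n_hom) / (2 * n_total)


def carrier_frequency(n_het, n_hom, n_total):
    """Fraction of patients carrying at least one copy of the variant allele."""
    if n_total <= 0 or n_het < 0 or n_hom < 0 or n_het + n_hom > n_total:
        raise ValueError("inconsistent genotype counts")
    return (n_het + n_hom) / n_total


# Hypothetical counts for a 60-patient cohort (NOT reported in the abstract).
example_het, example_hom, cohort_size = 28, 10, 60
print(allele_frequency(example_het, example_hom, cohort_size))
print(carrier_frequency(example_het, example_hom, cohort_size))
```

The two numbers differ because carriage frequency counts patients while allelic frequency counts chromosomes, which is why an abstract can report both for the same variant.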

Role of cytochrome P450 genotype in the steps toward personalized drug therapy

Larisa H Cavallari, Hyunyoung Jeong, Adam Bress
Pharmacogenomics and Personalized Medicine 2011:4 123–136

Genetic polymorphism for cytochrome P450 (P450) enzymes leads to interindividual variability in the plasma concentrations of many drugs. In some cases, P450 genotype results in decreased enzyme activity and an increased risk for adverse drug effects. For example, individuals with the CYP2D6 loss-of-function genotype are at increased risk for ventricular arrhythmia if treated with usual doses of thioridazine. In other cases, P450 genotype may influence the dose of a drug required to achieve a desired effect. This is the case with warfarin, with lower doses often necessary in carriers of a variant CYP2C9*2 or *3 allele to avoid supratherapeutic anticoagulation. When a prodrug, such as clopidogrel or codeine, must undergo hepatic biotransformation to its active form, a loss-of-function P450 genotype leads to reduced concentrations of the active drug and decreased drug efficacy. In contrast, patients with multiple CYP2D6 gene copies are at risk for opioid-related toxicity if treated with usual doses of codeine-containing analgesics. At least 25 drugs contain information in their US Food and Drug Administration-approved labeling regarding P450 genotype. The CYP2C9, CYP2C19, and CYP2D6 genes are the P450 genes most often cited. To date, integration of P450 genetic information into clinical decision making is limited. However, some institutions are beginning to embrace routine P450 genotyping to assist in the treatment of their patients. Genotyping for P450 variants may carry less risk for discrimination compared with genotyping for disease-associated variants. As such, P450 genotyping is likely to lead the way in the clinical implementation of pharmacogenomics. This review discusses variability in the CYP2C9, CYP2C19, and CYP2D6 genes and the implications of this for drug efficacy and safety.
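
The step from genotype to expected enzyme activity described above can be sketched as a lookup from star alleles to a metabolizer category. The example below is a simplified illustration for CYP2C19 only and is not taken from the review. It assumes the commonly used convention that *2 and *3 are no-function alleles, *17 is an increased-function allele, and *1 is the normal-function reference. The table `ALLELE_FUNCTION` and the function `cyp2c19_phenotype` are hypothetical names, and the mapping covers just these four alleles.

```python
# Assumed allele functions (simplified; only four star alleles are covered).
ALLELE_FUNCTION = {
    "*1": "normal",
    "*2": "none",
    "*3": "none",
    "*17": "increased",
}


def cyp2c19_phenotype(allele_a, allele_b):
    """Map a two-allele CYP2C19 genotype to a metabolizer category."""
    try:
        functions = [ALLELE_FUNCTION[allele_a], ALLELE_FUNCTION[allele_b]]
    except KeyError as missing:
        raise ValueError(f"allele not covered by this sketch: {missing}") from None

    n_none = functions.count("none")
    n_increased = functions.count("increased")

    if n_none == 2:
        return "poor metabolizer"
    if n_none == 1:
        # One no-function allele dominates, even when paired with *17.
        return "intermediate metabolizer"
    if n_increased == 2:
        return "ultrarapid metabolizer"
    if n_increased == 1:
        return "rapid metabolizer"
    return "normal metabolizer"


for genotype in [("*1", "*1"), ("*1", "*2"), ("*2", "*2"), ("*1", "*17")]:
    print(genotype, "->", cyp2c19_phenotype(*genotype))
```

For a prodrug such as clopidogrel, the categories with reduced activity are the ones the abstract associates with lower concentrations of active drug. Real genotype interpretation involves many more alleles than this sketch handles.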

Asthma pharmacogenetics and the development of genetic profiles for personalized medicine

Victor E Ortega, Deborah A Meyers, Eugene R Bleecker
Pharmacogenomics and Personalized Medicine 2015:8 9–22

Human genetics research will be critical to the development of genetic profiles for personalized or precision medicine in asthma. Genetic profiles will consist of gene variants that predict individual disease susceptibility and risk for progression, predict which pharmacologic therapies will result in a maximal therapeutic benefit, and predict whether a therapy will result in an adverse response and should be avoided in a given individual. Pharmacogenetic studies of the glucocorticoid, leukotriene, and β2-adrenergic receptor pathways have focused on candidate genes within these pathways and, in addition to a small number of genome-wide association studies, have identified genetic loci associated with therapeutic responsiveness. This review summarizes these pharmacogenetic discoveries and the future of genetic profiles for personalized medicine in asthma. The benefit of a personalized, tailored approach to health care delivery is needed in the development of expensive biologic drugs directed at a specific biologic pathway. Prior pharmacogenetic discoveries, in combination with additional variants identified in future studies, will form the basis for future genetic profiles for personalized tailored approaches to maximize therapeutic benefit for an individual asthmatic while minimizing the risk for adverse events.

Clinical application of high throughput molecular screening techniques for pharmacogenomics

Arun P Wiita, Iris Schrijver
Pharmacogenomics and Personalized Medicine 2011:4 109–121

Genetic analysis is one of the fastest-growing areas of clinical diagnostics. Fortunately, as our knowledge of clinically relevant genetic variants rapidly expands, so does our ability to detect these variants in patient samples. Increasing demand for genetic information may necessitate the use of high throughput diagnostic methods as part of clinically validated testing. Here we provide a general overview of our current and near-future abilities to perform large-scale genetic testing in the clinical laboratory. First we review in detail molecular methods used for high throughput mutation detection, including techniques able to monitor thousands of genetic variants for a single patient or to genotype a single genetic variant for thousands of patients simultaneously. These methods are analyzed in the context of pharmacogenomic testing in the clinical laboratories, with a focus on tests that are currently validated as well as those that hold strong promise for widespread clinical application in the near future. We further discuss the unique economic and clinical challenges posed by pharmacogenomic markers. Our ability to detect genetic variants frequently outstrips our ability to accurately interpret them in a clinical context, carrying implications both for test development and introduction into patient management algorithms. These complexities must be taken into account prior to the introduction of any pharmacogenomic biomarker into routine clinical testing.

Clinical implementation of RNA signatures for pharmacogenomic decision-making

Weihua Tang, Zhiyuan Hu, Hind Muallem, Margaret L Gulley
Pharmacogenomics and Personalized Medicine 2011:4 95–107

RNA profiling is increasingly used to predict drug response, dose, or toxicity based on analysis of drug pharmacokinetic or pharmacodynamic pathways. Before implementing multiplexed RNA arrays in clinical practice, validation studies are carried out to demonstrate sufficient evidence of analytic and clinical performance, and to establish an assay protocol with quality assurance measures. Pathologists assure quality by selecting input tissue and by interpreting results in the context of the input tissue as well as the technologies that were used and the clinical setting in which the test was ordered. A strength of RNA profiling is the array-based measurement of tens to thousands of RNAs at once, including redundant tests for critical analytes or pathways to promote confidence in test results. Instrument and reagent manufacturers are crucial for supplying reliable components of the test system. Strategies for quality assurance include careful attention to RNA preservation and quality checks at pertinent steps in the assay protocol, beginning with specimen collection and proceeding through the various phases of transport, processing, storage, analysis, interpretation, and reporting. Specimen quality is checked by probing housekeeping transcripts, while spiked and exogenous controls serve as a check on analytic performance of the test system. Software is required to manipulate abundant array data and present it for interpretation by a laboratory physician who reports results in a manner facilitating therapeutic decision-making. Maintenance of the assay requires periodic documentation of personnel competency and laboratory proficiency. These strategies are shepherding genomic arrays into clinical settings to provide added value to patients and to the larger health care system.

Dysregulation of the homeobox transcription factor gene HOXB13: role in prostate cancer

Brennan Decker, Elaine A Ostrander
Pharmacogenomics and Personalized Medicine 2014:7 193–201

Prostate cancer (PC) is the most common noncutaneous cancer in men, and epidemiological studies suggest that about 40% of PC risk is heritable. Linkage analyses in hereditary PC families have identified multiple putative loci. However, until recently, identification of specific risk alleles has proven elusive. Cooney et al used linkage mapping and segregation analysis to identify a putative risk locus on chromosome 17q21-22. In search of causative variant(s) in genes from the candidate region, a novel, potentially deleterious G84E substitution in homeobox transcription factor gene HOXB13 was observed in multiple hereditary PC families. In follow-up testing, the G84E allele was enriched in cases, especially those with an early diagnosis or positive family history of disease. This finding was replicated by others, confirming HOXB13 as a PC risk gene. The HOXB13 protein plays diverse biological roles in embryonic development and terminally differentiated tissue. In tumor cell lines, HOXB13 participates in a number of biological functions, including coactivation and localization of the androgen receptor and FOXA1. However, no consensus role has emerged and many questions remain. All HOXB13 variants with a proposed role in PC risk are predicted to damage the protein and lie in domains that are highly conserved across species. The G84E variant has the strongest epidemiological support and lies in a highly conserved MEIS protein-binding domain, which binds cofactors required for activation. On the basis of epidemiological and biological data, the G84E variant likely modulates the interaction between the HOXB13 protein and the androgen receptor, as well as affecting FOXA1-mediated transcriptional programming. However, further studies of the mutated protein are required to clarify the mechanisms by which this translates into PC risk.

Patient selection and targeted treatment in the management of platinum-resistant ovarian cancer

Christopher P Leamon, Chandra D Lovejoy, Binh Nguyen
Pharmacogenomics and Personalized Medicine 2013:6 113–125

Ovarian cancer (OC) has the highest mortality rate of any gynecologic cancer, and patients generally have a poor prognosis due to high chemotherapy resistance and late stage disease diagnosis. Platinum-resistant OC can be treated with cytotoxic chemotherapy such as paclitaxel, topotecan, pegylated liposomal doxorubicin, and gemcitabine, but many patients eventually relapse upon treatment. Fortunately, there are currently a number of targeted therapies in development for these patients that have shown promising results in recent clinical trials. These treatments often target the vascular endothelial growth factor pathway (eg, bevacizumab and aflibercept), DNA repair mechanisms (eg, iniparib and olaparib), or they are directed against folate related pathways (eg, pemetrexed, farletuzumab, and vintafolide). As many targeted therapies are only effective in a subset of patients, there is an increasing need for the identification of response predictive biomarkers. Selecting the right patients through biomarker screening will help tailor therapy to patients and decrease superfluous treatment to those who are biomarker negative; this approach should lead to improved clinical results and decreased toxicities. In this review the current targeted therapies used for treating platinum-resistant OC are discussed. Furthermore, use of prognostic and response predictive biomarkers to define OC patient populations that may benefit from specific targeted therapies is also highlighted.

Pharmacogenetics in breast cancer: steps toward personalized medicine in breast cancer management

Sarah Rofaiel, Esther N Muo, Shaker A Mousa
Pharmacogenomics and Personalized Medicine 2010:3 129–143

There is wide individual variability in the pharmacokinetics, pharmacodynamics, and tolerance to anticancer drugs within the same ethnic group and even greater variability among different ethnicities. Pharmacogenomics (PG) has the potential to provide personalized therapy based on individual genetic variability in an effort to maximize efficacy and reduce adverse effects. The benefits of PG include improved therapeutic index, improved dose regimen, and selection of optimal types of drug for an individual or set of individuals. Advanced or metastatic breast cancer is typically treated with single or multiple combinations of chemotherapy regimens including anthracyclines, taxanes, antimetabolites, alkylating agents, platinum drugs, vinca alkaloids, and others. In this review, the PG of breast cancer therapeutics, including tamoxifen, which is the most widely used therapeutic for the treatment of hormone-dependent breast cancer, is reviewed. The pharmacological activity of tamoxifen depends on its conversion by cytochrome P450 2D6 (CYP2D6) to its abundant active metabolite, endoxifen. Patients with reduced CYP2D6 activity, as a result of either their genotype or the coadministration of other drugs that inhibit CYP2D6 function, produce little endoxifen and hence derive limited therapeutic benefit from tamoxifen; the same can be said about the different classes of therapeutics in breast cancer. PG studies of breast cancer therapeutics should provide patients with breast cancer with optimal and personalized therapy.

Novel treatment strategies in triple-negative breast cancer: specific role of poly(adenosine diphosphate-ribose) polymerase inhibition

M William Audeh
Pharmacogenomics and Personalized Medicine 2014:7 307–316

Inhibitors of the poly(adenosine diphosphate-ribose) polymerase (PARP)-1 enzyme induce synthetic lethality in cancers with ineffective DNA repair or homologous repair deficiency, and have shown promising clinical activity in cancers deficient in DNA repair due to germ-line mutation in BRCA1 and BRCA2. The majority of breast cancers arising in carriers of BRCA1 germ-line mutations, as well as half of those in BRCA2 carriers, are classified as triple-negative breast cancer (TNBC). TNBC is a biologically heterogeneous group of breast cancers characterized by the lack of immunohistochemical expression of the ER, PR, or HER2 proteins, and for which the current standard of care in systemic therapy is cytotoxic chemotherapy. Many “sporadic” cases of TNBC appear to have indicators of DNA repair dysfunction similar to those in BRCA-mutation carriers, suggesting the possible utility of PARP inhibitors in a subset of TNBC. Significant genetic heterogeneity has been observed within the TNBC cohort, creating challenges for interpretation of prior clinical trial data, and for the design of future clinical trials. Several PARP inhibitors are currently in clinical development in BRCA-mutated breast cancer. The use of PARP inhibitors in TNBC without BRCA mutation will require biomarkers that identify cancers with homologous repair deficiency in order to select patients likely to respond. Beyond mutations in the BRCA genes, dysfunction in other genes that interact with the homologous repair pathway may offer opportunities to induce synthetic lethality when combined with PARP inhibition.

Clinical potential of novel therapeutic targets in breast cancer: CDK4/6, Src, JAK/STAT, PARP, HDAC, and PI3K/AKT/mTOR pathways

Sarah R Hosford, Todd W Miller
Pharmacogenomics and Personalized Medicine 2014:7 203–215

Breast cancers expressing estrogen receptor α, progesterone receptor, or the human epidermal growth factor receptor 2 (HER2) proto-oncogene account for approximately 90% of cases, and treatment with antiestrogens and HER2-targeted agents has resulted in drastically improved survival in many of these patients. However, de novo or acquired resistance to antiestrogen and HER2-targeted therapies is common, and many tumors will recur or progress despite these treatments. Additionally, the remaining 10% of breast tumors are negative for estrogen receptor α, progesterone receptor, and HER2 (“triple-negative”), and a clinically proven tumor-specific drug target for this group has not yet been identified. Therefore, the identification of new therapeutic targets in breast cancer is of vital clinical importance. Preclinical studies elucidating the mechanisms driving resistance to standard therapies have identified promising targets including cyclin-dependent kinase 4/6, phosphoinositide 3-kinase, poly adenosine diphosphate–ribose polymerase, Src, and histone deacetylase. Herein, we discuss the clinical potential and status of new therapeutic targets in breast cancer.

Overview of diagnostic/targeted treatment combinations in personalized medicine for breast cancer patients

Anna Tessari, Dario Palmieri, Serena Di Cosimo
Pharmacogenomics and Personalized Medicine 2014:7 1–19

Breast cancer includes a body of molecularly distinct subgroups, characterized by different presentation, prognosis, and sensitivity to treatments. Significant advances in our understanding of the complex architecture of this pathology have been achieved in the last few decades, thanks to new biotechnologies that have recently come into the research field and the clinical practice, giving oncologists new instruments that are based on biomarkers and allowing them to set up a personalized approach for each individual patient. Here we review the main treatments available or in preclinical development, the biomolecular diagnostic and prognostic approaches that changed our perspective about breast cancer, giving an overview of targeted therapies that represent the current standard of care for these patients. Finally, we report some examples of how new technologies in clinical practice can set in motion the development of new drugs.

Human ABC transporter ABCG2/BCRP expression in chemoresistance: basic and clinical perspectives for molecular cancer therapeutics

Kohji Noguchi, Kazuhiro Katayama, Yoshikazu Sugimoto
Pharmacogenomics and Personalized Medicine 2014:7 53–64

Adenosine triphosphate (ATP)-binding cassette (ABC) transporter proteins, such as ABCB1/P-glycoprotein (P-gp) and ABCG2/breast cancer resistance protein (BCRP), transport various structurally unrelated compounds out of cells. ABCG2/BCRP is referred to as a “half-type” ABC transporter, functioning as a homodimer, and transports anticancer agents such as irinotecan, 7-ethyl-10-hydroxycamptothecin (SN-38), gefitinib, imatinib, methotrexate, and mitoxantrone from cells. The expression of ABCG2/BCRP can confer a multidrug-resistant phenotype on cancer cells and affect drug absorption, distribution, metabolism, and excretion in normal tissues, thus modulating the in vivo efficacy of chemotherapeutic agents. Clarification of the substrate preferences and structural relationships of ABCG2/BCRP is essential for our understanding of the molecular mechanisms underlying its effects in vivo during chemotherapy. Its single-nucleotide polymorphisms are also involved in determining the efficacy of chemotherapeutics, and those that reduce the functional activity of ABCG2/BCRP might be associated with unexpected adverse effects from normal doses of anticancer drugs that are ABCG2/BCRP substrates. Importantly, many recently developed molecular-targeted cancer drugs, such as the tyrosine kinase inhibitors imatinib mesylate, gefitinib, and others, can also interact with ABCG2/BCRP. Both functional single-nucleotide polymorphisms and inhibitory agents of ABCG2/BCRP modulate the in vivo pharmacokinetics and pharmacodynamics of these molecular cancer treatments, so the pharmacogenetics of ABCG2/BCRP is an important consideration in the application of molecular-targeted chemotherapies.

Bosutinib: a SRC–ABL tyrosine kinase inhibitor for treatment of chronic myeloid leukemia

Fuad El Rassi, Hanna Jean Khoury
Pharmacogenomics and Personalized Medicine 2013:6 57–62

Bosutinib is one of five tyrosine kinase inhibitors commercially available in the United States for the treatment of chronic myeloid leukemia. This review of bosutinib summarizes the mode of action, pharmacokinetics, efficacy and safety data, as well as the patient-focused perspective through quality-of-life data. Bosutinib has shown considerable and sustained efficacy in chronic myeloid leukemia, especially in the chronic phase, with resistance or intolerance to prior tyrosine kinase inhibitors. Bosutinib has distinct but manageable adverse events. In the absence of T315I and V299L mutations, there are no absolute contraindications for the use of bosutinib in this patient population.

Toward precision medicine with next-generation EGFR inhibitors in non-small-cell lung cancer

Timothy A Yap, Sanjay Popat
Pharmacogenomics and Personalized Medicine 2014:7 285–295

The use of genomics to discover novel targets and biomarkers has placed the field of oncology at the forefront of precision medicine. First-generation epidermal growth factor receptor (EGFR) inhibitors have transformed the therapeutic landscape of EGFR mutant non-small-cell lung carcinoma through the genetic stratification of tumors from patients with this disease. Somatic EGFR mutations in lung adenocarcinoma are now well established as predictive biomarkers of response and resistance to small-molecule EGFR inhibitors. Despite early patient benefit, primary resistance and subsequent tumor progression to first-generation EGFR inhibitors are seen in 10%–30% of patients with EGFR mutant non-small-cell lung carcinoma. Acquired drug resistance is also inevitable, with patients developing disease progression after only 10–13 months of antitumor therapy. This review details strategies pursued in circumventing T790M-mediated drug resistance to EGFR inhibitors, which is the most common mechanism of acquired resistance, and focuses on the clinical development of second-generation EGFR inhibitors, exemplified by afatinib (BIBW2992). We discuss the rationale, mechanism of action, clinical efficacy, and toxicity profile of afatinib, including the LUX-Lung studies. We also discuss the emergence of third-generation irreversible mutant-selective inhibitors of EGFR and envision the future management of EGFR mutant lung adenocarcinoma.

ALK-driven tumors and targeted therapy: focus on crizotinib

Carlos Murga-Zamalloa, Megan S Lim
Pharmacogenomics and Personalized Medicine 2014:7 87–94

Receptor tyrosine kinases have emerged as promising therapeutic targets for a diverse set of tumors. Overactivation of the tyrosine kinase anaplastic lymphoma kinase (ALK) has been reported in several types of malignancies such as anaplastic large cell lymphoma, inflammatory myofibroblastic tumor, neuroblastoma, and non-small-cell lung carcinoma. Further characterization of the molecular role of ALK has revealed an oncogenic signaling signature that results in tumor dependence on ALK. ALK-positive tumors display a different behavior than their ALK-negative counterparts; however, the specific role of ALK in some of these tumors remains to be elucidated. Although more studies are required to establish selective targeting of ALK as a definitive therapeutic option, initial trials have shown extraordinary results in the majority of cases.

Non-small-cell lung cancer: molecular targeted therapy and personalized medicine – drug resistance, mechanisms, and strategies

Marybeth Sechler, AD Cizmic, S Avasarala, M Van Scoyk, C Brzezinski, et al.
Pharmacogenomics and Personalized Medicine 2013:6 25–36

Targeted therapies for cancer bring the hope of specific treatment, providing high efficacy and in some cases lower toxicity than conventional treatment. Although targeted therapeutics have helped immensely in the treatment of several cancers, like chronic myelogenous leukemia, colon cancer, and breast cancer, the benefit of these agents in the treatment of lung cancer remains limited, in part due to the development of drug resistance. In this review, we discuss the mechanisms of drug resistance and the current strategies used to treat lung cancer. A better understanding of these drug-resistance mechanisms could potentially benefit the development of a more robust personalized medicine approach for the treatment of lung cancer.

ERCC1 and XRCC1 as biomarkers for lung and head and neck cancer

Alec Vaezi, Chelsea H Feldman, Laura J Niedernhofer
Pharmacogenomics and Personalized Medicine 2011:4 47–63

Advanced stage non-small cell lung cancer and head and neck squamous cell carcinoma are both treated with DNA damaging agents including platinum-based compounds and radiation therapy. However, at least one quarter of all tumors are resistant or refractory to these genotoxic agents. Yet the agents are extremely toxic, leading to undesirable side effects with potentially no benefit. Alternative therapies exist, but currently there are no tools to predict whether the first-line genotoxic agents will work in any given patient. To maximize therapeutic success and limit unnecessary toxicity, emerging clinical trials aim to inform personalized treatments tailored to the biology of individual tumors. Worldwide, significant resources have been invested in identifying biomarkers for guiding the treatment of lung and head and neck cancer. DNA repair proteins of the nucleotide excision repair pathway (ERCC1) and of the base excision repair pathway (XRCC1), which are instrumental in clearing DNA damage caused by platinum drugs and radiation, have been extensively studied as potential biomarkers of clinical outcomes in lung and head and neck cancers. The results are complex and contradictory. Here we summarize the current status of single nucleotide polymorphisms, mRNA, and protein expression of ERCC1 and XRCC1 in relation to cancer risk and patient outcomes.

Optimizing response to gefitinib in the treatment of non-small-cell lung cancer

Pietro Carotenuto, Cristin Roma, Anna Maria Rachiglio, Raffaella Pasquale, et al.
Pharmacogenomics and Personalized Medicine 2011:4 1–9

The epidermal growth factor receptor (EGFR) is expressed in the majority of non-small-cell lung cancer (NSCLC). However, only a restricted subgroup of NSCLC patients respond to treatment with the EGFR tyrosine kinase inhibitor (EGFR TKI) gefitinib. Clinical trials have demonstrated that patients carrying activating mutations of the EGFR significantly benefit from treatment with gefitinib. In particular, mutations of the EGFR TK domain have been shown to increase the sensitivity of the EGFR to exogenous growth factors and, at the same time, to EGFR TKIs such as gefitinib. EGFR mutations are more frequent in patients with particular clinical and pathological features such as female sex, nonsmoker status, adenocarcinoma histology, and East Asian ethnicity. A close correlation was found between EGFR mutations and response to gefitinib in NSCLC patients. More importantly, randomized Phase III studies have shown the superiority of gefitinib compared with chemotherapy in EGFR mutant patients in the first-line setting. In addition, gefitinib showed a good toxicity profile with an incidence of adverse events that was significantly lower compared with chemotherapy. Therefore, gefitinib is a major breakthrough for the management of EGFR mutant NSCLC patients and represents the first step toward personalized treatment of NSCLC.

Pharmacogenomics of drug metabolizing enzymes and transporters: implications for cancer therapy

Jing Li, Martin H Bluth
Pharmacogenomics and Personalized Medicine 2011:4 11–33

The new era of personalized medicine, which integrates the uniqueness of an individual with respect to the pharmacokinetics and pharmacodynamics of a drug, holds promise as a means to provide greater safety and efficacy in drug design and development. Personalized medicine is particularly important in oncology, whereby most clinically used anticancer drugs have a narrow therapeutic window and exhibit a large interindividual pharmacokinetic and pharmacodynamic variability. This variability can be explained, at least in part, by genetic variations in the genes encoding drug metabolizing enzymes, transporters, or drug targets. Understanding of how genetic variations influence drug disposition and action could help in tailoring cancer therapy based on an individual’s genetic makeup. This review focuses on the pharmacogenomics of drug metabolizing enzymes and drug transporters, with a particular highlight of examples whereby genetic variations in the metabolizing enzymes and transporters influence the pharmacokinetics and/or response of chemotherapeutic agents.

Transcriptome-wide signatures of tumor stage in kidney renal clear cell carcinoma: connecting copy number variation, methylation and transcription factor activity
Qi Liu, Pei-Fang Su, Shilin Zhao and Yu Shyr
Genome Medicine 2014, 6:117 http://genomemedicine.com/content/6/12/117

Background: Comparative analysis of expression profiles between early and late stage cancers can help to understand cancer progression and metastasis mechanisms and to predict the clinical aggressiveness of cancer. The observed stage-dependent expression changes can be explained by genetic and epigenetic alterations as well as transcription dysregulation. Unlike genetic and epigenetic alterations, however, activity changes of transcription factors, generally occurring at the post-transcriptional or post-translational level, are hard to detect and quantify.
Methods: Here we developed a statistical framework to infer the activity changes of transcription factors by simultaneously taking into account the contributions of genetic and epigenetic alterations to mRNA expression variations.
Results: Applied to kidney renal clear cell carcinoma (KIRC), the model underscored the role of methylation as a significant contributor to stage-dependent expression alterations and identified key transcription factors as potential drivers of cancer progression.
Conclusions: Integrating copy number, methylation, and transcription factor activity signatures to explain stage-dependent expression alterations presented a precise and comprehensive view on the underlying mechanisms during KIRC progression.

Developments in renal pharmacogenomics and applications in chronic kidney disease

Ariadna Padullés, Inés Rama, Inés Llaudó, Núria Lloberas
Pharmacogenomics and Personalized Medicine 2014:7 251–266

Chronic kidney disease (CKD) has shown an increasing prevalence in the last century. CKD encompasses a poor prognosis related to a remarkable number of comorbidities, and many patients suffer from this disease progression. Once the factors linked with CKD evolution are distinguished, it will be possible to provide and enhance a more intensive treatment to high-risk patients. In this review, we focus on the emerging markers that might be predictive or related to CKD progression physiopathology as well as those related to a different pattern of response to treatment, such as inhibitors of the renin–angiotensin system (including angiotensin-converting enzyme inhibitors and angiotensin II receptor blockers; the vitamin D receptor agonist; salt sensitivity hypertension; and progressive kidney-disease markers with identified genetic polymorphisms). Candidate-gene association studies and genome-wide association studies have analyzed the genetic basis for common renal diseases, including CKD and related factors such as diabetes and hypertension. This review will, in brief, consider genotype-based pharmacotherapy, risk prediction, drug target recognition, and personalized treatments, and will mainly focus on findings in CKD patients. An improved understanding will smooth the progress of switching from classical clinical medicine to gene-based medicine.









Introduction to e-Series A: Cardiovascular Diseases, Volume Four Part 2: Regenerative Medicine


Author and Curator: Larry H Bernstein, MD, FCAP


Curator: Aviva Lev-Ari, PhD, RN

This document is entirely devoted to medical and surgical therapies that have made huge strides in

  • simplification of interventional procedures,
  • reduced complexity, so that procedures that previously required surgery are now done, circumstances permitting, by medical intervention.

This revolution in cardiovascular interventional therapy is regenerative medicine.  It is regenerative because it is largely driven by

  • the introduction into the impaired vasculature of an induced pluripotent cell, called a stem cell, although
  • the level of differentiation may not be that of the most primitive cell line.

There is also a very closely aligned development in cell biology, called synthetic biology, that includes and extends beyond vascular regeneration.  These developments have occurred at an accelerated rate in the last 15 years. The methods of interventional cardiology were already well developed in the mid-1980s.  This was at the peak of cardiothoracic bypass surgery.

Research on the endothelial cell,

  • endothelial cell proliferation,
  • shear flow in small arteries, especially at branch points, and
  • endothelial-platelet interactions

led to insights about plaque formation and vessel thrombosis.

Much was learned in biomechanics about the shear flow stresses on the luminal surface of the vasculature, and there was also

  • the concomitant discovery of nitric oxide,
  • oxidative stress, and
  • the isoenzymes of nitric oxide synthase (eNOS, iNOS, and nNOS).

It became a fundamental tenet of vascular biology that

  • atherogenesis is a maladjustment to oxidative stress not only through genetic, but also
  • non-genetic nutritional factors that could be related to the balance of omega (ω)-3 and omega (ω)-6 fatty acids,
  • a pro-inflammatory state that elicits inflammatory cytokines, such as interleukin-6 (IL6) and C-reactive protein (CRP),
  • insulin resistance with excess carbohydrate associated with type 2 diabetes and beta (β) cell stress,
  • excess trans- and saturated fats, and perhaps
  • the now plausible role of the colonic microbial population of the gastrointestinal tract (GIT).

There is also an association of abdominal adiposity,

  • including the visceral peritoneum, with both T2DM and with arteriosclerotic vessel disease,
  • which is presenting at a young age, and has ties to
  • the effects of an adipokine, adiponectin.

Much important work has already been discussed in the domain of cardiac catheterization and research done to

  • prevent atheroembolization, and beyond that,
  • research done to implant an endothelial growth matrix.

Even then, dramatic work had already been done on

  • the platelet structure and metabolism, and
  • this has transformed our knowledge of platelet biology.

The coagulation process has been discussed in detail in a previous document.  The result was the development of a

  • new class of platelet aggregation inhibitors designed to block the activation of a protein on the platelet surface that
  • is critical in the coagulation cascade.

In addition, the term long used to describe atherosclerosis, atheroma notwithstanding, is “hardening of the arteries”.  This is particularly notable with respect to mid-size arteries and arterioles that feed the heart and kidneys. Whether it is preceded by or develops concurrently with chronic renal insufficiency and lowered glomerular filtration rate is perhaps arguable.  However, there is now a body of evidence that points to

  • a change in the vascular muscularis and vessel stiffness, in addition to the endothelial features already mentioned.

This has provided a basis for

  • targeted pharmaceutical intervention, and
  • reduction in salt intake.

So we have a group of metabolic disorders, which may, alone or in combination,

  • lead to and be associated with the long term effects of cardiovascular disease, including
  • congestive heart failure.

This has been classically broken down into forward and backward failure,

  • depending on decreased outflow through the aorta (ejection fraction), or
  • decreased venous return through the vena cava,

which involves increased pulmonary vascular resistance and decreased return into the left atrium.

This also has ties to several causes, which may be cardiac or vascular. This document, like the previous one, has four parts.  They are broadly:

  1. Stem Cells in Cardiovascular Diseases
  2. Regenerative Cell and Molecular Biology
  3. Therapeutics Levels In Molecular Cardiology
  4. Research Proposals for Endogenous Augmentation of circulating Endothelial Progenitor Cells (cEPCs)

As in the previous section, we start with the biology of the stem cell and the degeneration in cardiovascular diseases, then proceed to regeneration, then therapeutics, and finally – proposals for augmenting therapy with circulating endogenous endothelial progenitor cells (cEPCs).



Figure: stem cells

Figure: Key pathways involving NO

Figure: stem cell lin28

Figure: Translational research with feedback loops, lab to bedside




Gene Sequencing – to the Bedside

Reporter: Larry H Bernstein, MD, FCAP

Gene sequencing leaves the laboratory

Maturing technology speeds medical diagnoses.
Erika Check Hayden  19 February 2013
The steep fall in the cost of sequencing a genome has, for the moment, slowed. Yet researchers attending this year’s Advances in Genome Biology and Technology (AGBT) meeting in Marco Island, Florida, on 20–23 February are not complaining. At a cost as low as US$5,000–10,000 per human genome, sequencing has become cheap and reliable enough that researchers are not waiting for the next sequencing machine to perfect new applications in medicine.
Single-cell genomics is allowing fertility clinics to screen embryos for abnormalities more cheaply.

Human genome to genes (Photo credit: Wikipedia)


Reporter: Ritu Saxena, Ph.D.

On December 4, 2012, molecular diagnostic firm Invivoscribe Technologies launched a personalized medicine company, Genection. Genection offers both routine and esoteric genetic tests, exome and whole-genome sequencing, cancer somatic mutation testing, and pharmacogenomics.


Because the Genection model is not payor-driven, it said, it can provide doctors access to genetic tests that are currently unavailable, overlooked, or inaccessible through their patients’ health plans and healthcare institutions.

The privately held company added that it has agreements in place with several CLIA- and CAP-certified laboratories, including ARUP Laboratories, Foundation Medicine, Cypher Genomics, Invivoscribe’s wholly owned subsidiary the Laboratory for Personalized Molecular Medicine (LPMM), and LPMM’s laboratory in Martinsried, Germany. It also has relationships with Illumina and Ambry Genetics and agreements with “a consortium” of genetic counselors.

“In order to make personalized molecular medicine a clinical reality, new platforms need to be developed for the delivery of healthcare. Genection’s mission seeks to accelerate this adoption process,” Genection Chief Medical Officer Bradley Patay said in a statement. “The combination of CLIA-validated genetic testing, whole-exome or whole-genome sequencing, and broad targeted assays, along with critical bioinformatics, analytic tools, and interpretative guidelines will contribute to timely definitive diagnoses for patients with rare, unexplained diseases or complex diseases; in essence, this integration will speed delivery of genomic test results and improve patient care.”

The company profile states that because the cost of genomic sequencing has declined steeply, doctors can now use deep sequencing of tumors to offer treatments targeted to each patient’s specific type of cancer. This personalized approach may offer treatment options better tailored to each individual than conventional approaches.  For example, The Cancer Genome Atlas Research Network found a potential therapeutic target in most squamous cell lung cancers. Genetic testing can also provide insight into a drug’s effectiveness and help a physician tailor the dosage or select another drug if the patient is found to have a genetic variant that could affect the drug’s efficacy.



Invivoscribe Technologies: http://www.invivoscribe.com/

Genection: http://www.genection.com/


Reporter: Aviva Lev-Ari, PhD, RN

Set of Papers Outline ENCODE Findings as Consortium Looks Ahead to Future Studies

NEW YORK (GenomeWeb News) – An international collaboration involving more than 400 researchers working to characterize gene regulatory networks in the human genome is publishing dozens of new studies this week.

In papers appearing in Nature, Science, Genome Research, Genome Biology, Journal of Biological Chemistry, and elsewhere, members of the Encyclopedia of DNA Elements, or ENCODE, consortium describe approaches used to define some four million regulatory regions in the genome, among other things. All told, the team explained, ENCODE efforts have made it possible to assign biological functions to around 80 percent of genome sequences, filling in large gaps left by studies that focused on protein-coding sequences alone.

“We found that a much bigger part of the genome — a surprising amount, in fact — is involved in controlling when and where proteins are produced, than in simply manufacturing the building blocks,” ENCODE’s lead analysis coordinator Ewan Birney, associate director of the European Molecular Biology Laboratory European Bioinformatics Institute, said in a statement.

“This concept of ‘junk DNA,’ which has been sort of perpetuated for the past 20 years or so is really not accurate,” ENCODE researcher Rick Myers, director of the HudsonAlpha Institute for Biotechnology, said during a telephone briefing with reporters today. “Most of the genome — more than 80 percent of the base pairs in the genome — has some biological activity, some biological function.”

Researchers participating in a complementary effort within the larger ENCODE project, known as GENCODE, more completely characterize the coding portions of the genome. “As part of the ENCODE project, we both tidied up the protein-coding genes and we also found many non-coding RNA genes as well,” Birney said during today’s telebriefing.

Based on the success of ENCODE so far, the project is expected to be extended by another four years or so. The amount of new funding from the National Human Genome Research Institute for that follow-up work is expected to be as high as $123 million.

“Later this month, NHGRI will be announcing a new round of funding that will take the ENCODE project into its next phase,” NHGRI Director Eric Green said during the call.

Studies done in the decade or so since the human genome was deciphered have highlighted how little of the genome is actually comprised of gene sequences. With the realization that only around 2 percent of the genome is dedicated to protein-coding functions came a spate of speculation about the role of the other 98 percent of the genome.

While this portion of the genome was suspected of harboring regulatory sequences, the extent of that regulation and its impact on coding sequences in human tissues over time was not known.

“When the Human Genome Project ended in 2003, we quickly realized that we understood the meaning of only a very small percent of the human genome’s letters,” Green explained. “We did know the genetic code for determining the order of amino acids and proteins, but we understood precious little about the signals that turned genes on or off — or that controlled the amount of proteins produced in different tissues.”

To begin studying such control networks systematically, the international ENCODE consortium kicked off the main phase of its analyses in 2007, following an earlier pilot study.

NHGRI has provided $123 million for the project over the past five years. Another $30 million went to support the development of ENCODE-related technologies since the ENCODE pilot started in 2003, while $40.6 million from NHGRI went towards the pilot itself.

During the study’s main phase, investigators from nearly three-dozen labs around the world took multi-pronged approaches to assess transcription factor binding patterns, histone modification patterns, chromatin structure signatures and other features of the genome that interact with one another to control gene expression over time and across different tissues in the body.

To accomplish the roughly 1,600 experiments done to test some 180 cell types for ENCODE, teams turned to methods such as chromatin immunoprecipitation coupled with sequencing to define the genome-wide binding patterns for more than 100 different transcription factors, for example, while other strategies were used to profile DNA methylation patterns, chromatin features, and so forth.

“It’s really a detailed hierarchy, where proteins bind and epigenetic marks — like DNA methylation and other marks — precisely cooperate and regulate how the genes are going to get turned on [or off] and the amount of this,” Myers said. “These complex networks are one of the big components of the contributions of the 30 papers that are being published today.”

For example, a University of Washington-led team reporting in Science online today defined millions of regulatory regions, including some that are operational during normal development, by taking advantage of an enzyme known as DNase I, which cuts DNA specifically at open chromatin sites in the genome. That group found that more than three-quarters of disease-associated variants identified in genome-wide association studies fall in parts of the genome that overlap with regulatory sites.
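
As a rough illustration of the kind of overlap analysis described above, the following Python sketch computes the fraction of variant positions that fall inside a set of regulatory intervals. The function names, the half-open coordinate convention, and the toy coordinates are assumptions made for illustration only. They are not taken from the ENCODE papers, and real analyses work on genome-scale annotation files with additional statistical controls.

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Merge overlapping or touching half-open [start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def fraction_in_regions(variants, regions):
    """Return the fraction of variants located inside any region.

    variants: list of (chromosome, position) pairs (hypothetical input format).
    regions:  dict mapping chromosome -> list of (start, end) intervals.
    """
    merged = {chrom: merge_intervals(ivs) for chrom, ivs in regions.items()}
    starts = {chrom: [s for s, _ in ivs] for chrom, ivs in merged.items()}
    hits = 0
    for chrom, pos in variants:
        ivs = merged.get(chrom, [])
        # Index of the last interval whose start is <= pos, or -1 if none.
        i = bisect_right(starts.get(chrom, []), pos) - 1
        if i >= 0 and pos < ivs[i][1]:
            hits += 1
    return hits / len(variants) if variants else 0.0


# Toy data, invented for this example.
toy_regions = {"chr1": [(100, 200), (150, 300), (1000, 1100)], "chr2": [(50, 60)]}
toy_variants = [("chr1", 120), ("chr1", 299), ("chr1", 300), ("chr2", 55), ("chr3", 10)]
print(fraction_in_regions(toy_variants, toy_regions))
```

Merging the intervals first keeps each lookup to a single binary search per variant, which is one simple design choice among several; interval trees or dedicated genomics tools are common alternatives.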

“We now know that the majority of these changes that are associated with common diseases and traits that don’t fall within genes actually occur within the gene-controlling switches,” University of Washington genome sciences researcher John Stamatoyannopoulos, senior author on that study, said during today’s telebriefing. “This phenomenon is not confined to a particular type of disease. It seems to be present across the board for a very wide variety of different diseases and traits.”

Results from such analyses also hint that some outwardly unrelated conditions might be traced back to similar regulatory processes. And, researchers say, by bringing together information on active regulatory regions with disease-risk variants, it may be possible to define new functionally important tissues for certain conditions.

“By creating these extensive blueprints of the control circuitry, we’re now exposing previously hidden connections between different kinds of diseases that may explain common clinical features,” Stamatoyannopoulos said.

“This has also allowed us to see that the GWAS studies that have been performed contain far more information than was previously believed,” he added, “because hundreds of additional DNA changes that were not thought to be important also appear to affect these gene-controlling switches.”

The new data are also expected to help in understanding genetic disease and interpreting information from personal genomes, according to Michael Snyder, an ENCODE investigator and director of Stanford University’s Center of Genomics and Personalized Medicine.

“We believe the ENCODE project will have a profound impact on personal genomes and, ultimately on personalized medicine,” Snyder told reporters. “We can now better see what personal variants do, in terms of causing phenotypic differences, drug responses, and disease risk.”

Many of the studies stemming from ENCODE can be viewed through a website conceived by Nature, Genome Research, and Genome Biology that links ENCODE papers sharing related themes, or “threads.”

Along with the newly published papers, the ENCODE team is making data available to other members of the research community through the project’s website. Data from studies can also be accessed through an ENCODE browser housed at the University of California at Santa Cruz or via NCBI or EBI sites.

“For basic researchers, the ENCODE data represents a powerful resource for understanding fundamental questions about how life is encoded in our genome,” NHGRI’s Green said. “For more clinically-oriented researchers, the ENCODE data provide key information about which genome sequences are functionally important.”

    52 | NATURE | VOL 489 | 6 SEPTEMBER 2012

    FORUM: Genomics

    ENCODE explained

    The Encyclopedia of DNA Elements (ENCODE) project dishes up a hearty banquet of data that illuminate the roles of the functional elements of the human genome. Here, five scientists describe the project and discuss how the data are influencing research directions across many fields. See Articles p.57, p.75, p.83, p.91, p.101 & Letter p.109

    Serving up a genome feast


    Starting with a list of simple ingredients and blending them in the precise amounts needed to prepare a gourmet meal is a challenging task. In many respects, this task is analogous to the goal of the ENCODE project1, the recent progress of which is described in this issue2–7. The project aims to fully describe the list of common ingredients (functional elements) that make up the human genome (Fig. 1). When mixed in the right proportions, these ingredients constitute the information needed to build all the types of cells, body organs and, ultimately, an entire person from a single genome.

    The ENCODE pilot project8 focused on just 1% of the genome — a mere appetizer — and its results hinted that the list of human genes was incomplete. Although there was scepticism about the feasibility of scaling up the project to the entire genome and to many hundreds of cell types, recent advances in low-cost, rapid DNA-sequencing technology radically changed that view9. Now the ENCODE consortium presents a menu of 1,640 genome-wide data sets prepared from 147 cell types, providing a six-course serving of papers in Nature, along with many companion publications in other journals.

    One of the more remarkable findings described in the consortium’s ‘entrée’ paper (page 57)2 is that 80% of the genome contains elements linked to biochemical functions, dispatching the widely held view that the human genome is mostly ‘junk DNA’. The authors report that the space between genes is filled with enhancers (regulatory DNA elements), promoters (the sites at which DNA’s transcription into RNA is initiated) and numerous previously overlooked regions that encode RNA transcripts that are not translated into proteins but might have regulatory roles. Of note, these results show that many DNA variants previously correlated with certain diseases lie within or very near non-coding functional DNA elements, providing new leads for linking genetic variation and disease.

    The five companion articles3–7 dish up diverse sets of genome-wide data regarding the mapping of transcribed regions, DNA binding of regulatory proteins (transcription factors) and the structure and modifications of chromatin (the association of DNA and proteins that makes up chromosomes), among other delicacies.

    Djebali and colleagues3 (page 101) describe ultra-deep sequencing of RNAs prepared from many different cell lines and from specific compartments within the cells. They conclude that about 75% of the genome is transcribed at some point in some cells, and that genes are highly interlaced with overlapping transcripts that are synthesized from both DNA strands. These findings force a rethink of the definition of a gene and of the minimum unit of heredity.

    Moving on to the second and third courses, Thurman et al.4 and Neph et al.5 (pages 75 and 83) have prepared two tasty chromatin-related treats. Both studies are based on the DNase I hypersensitivity assay, which detects genomic regions at which enzyme access to, and subsequent cleavage of, DNA is unobstructed by chromatin proteins. The authors identified cell-specific patterns of DNase I hypersensitive sites that show remarkable concordance with experimentally determined and computationally predicted binding sites of transcription factors. Moreover, they have doubled the number of known recognition sequences for DNA-binding proteins in the human genome, and have revealed a 50-base-pair ‘footprint’ that is present in thousands of promoters5.

    The next course, provided by Gerstein and colleagues6 (page 91), examines the principles behind the wiring of transcription-factor networks. In addition to assigning relatively simple functions to genome elements (such as ‘protein X binds to DNA element Y’), this study attempts to clarify the hierarchies of transcription factors and how the intertwined networks arise.

    Beyond the linear organization of genes and transcripts on chromosomes lies a more complex (and still poorly understood) network of chromosome loops and twists through which promoters and more distal elements, such as enhancers, can communicate their regulatory information to each other. In the final course of the ENCODE genome feast, Sanyal and colleagues7 (page 109) map more than 1,000 of these long-range signals in each cell type. Their findings begin to overturn the long-held (and probably oversimplified) prediction that the regulation of a gene is dominated by its proximity to the closest regulatory elements.

    One of the major future challenges for ENCODE (and similarly ambitious projects) will be to capture the dynamic aspects of gene regulation. Most assays provide a single snapshot of cellular regulatory events, whereas a time series capturing how such processes change is preferable. Additionally, the examination of large batches of cells — as required for the current assays — may present too simplified a view of the underlying regulatory complexity, because individual cells in a batch (despite being genetically identical) can sometimes behave in different ways. The development of new technologies aimed at the simultaneous capture of multiple data types, along with their regulatory dynamics in single cells, would help to tackle these issues.

    A further challenge is identifying how the genomic ingredients are combined to assemble the gene networks and biochemical pathways that carry out complex functions, such as cell-to-cell communication, which enable organs and tissues to develop. An even greater challenge will be to use the rapidly growing body of data from genome-sequencing projects to understand the range of human phenotypes (traits), from normal developmental processes, such as ageing, to disorders such as Alzheimer’s disease10.

    © 2012 Macmillan Publishers Limited. All rights reserved

    Achieving these ambitious goals may require a parallel investment of functional studies using simpler organisms — for example, of the type that might be found scampering around the floor, snatching up crumbs in the chefs’ kitchen. All in all, however, the ENCODE project has served up an all-you-can-eat feast of genomic data that we will be digesting for some time. Bon appétit!

    Joseph R. Ecker is at the Howard Hughes Medical Institute and the Salk Institute for Biological Studies, La Jolla, California 92037, USA.

    e-mail: ecker@salk.edu

    Figure 1 | Beyond the sequence. The ENCODE project2–7 provides information on the human genome far beyond that contained within the DNA sequence — it describes the functional genomic elements that orchestrate the development and function of a human. The project contains data about the degree of DNA methylation and chemical modifications to histones that can influence the rate of transcription of DNA into RNA molecules (histones are the proteins around which DNA is wound to form chromatin). ENCODE also examines long-range chromatin interactions, such as looping, that alter the relative proximities of different chromosomal regions in three dimensions and also affect transcription. Furthermore, the project describes the binding activity of transcription-factor proteins and the architecture (location and sequence) of gene-regulatory DNA elements, which include the promoter region upstream of the point at which transcription of an RNA molecule begins, and more distant (long-range) regulatory elements. Another section of the project was devoted to testing the accessibility of the genome to the DNA-cleavage protein DNase I. These accessible regions, called DNase I hypersensitive sites, are thought to indicate specific sequences at which the binding of transcription factors and transcription-machinery proteins has caused nucleosome displacement. In addition, ENCODE catalogues the sequences and quantities of RNA transcripts, from both non-coding and protein-coding regions.

    Expression control


    Once the human genome had been sequenced, it became apparent that an encyclopaedic knowledge of chromatin organization would be needed if we were to understand how gene expression is regulated. The ENCODE project goes a long way to achieving this goal and highlights the pivotal role of transcription factors in sculpting the chromatin landscape.

    Although some of the analyses largely confirm conclusions from previous smaller-scale studies, this treasure trove of genome-wide data provides fresh insight into regulatory pathways and identifies prodigious numbers of regulatory elements. This is particularly so for Thurman and colleagues’ data4 regarding DNase I hypersensitive sites (DHSs) and for Gerstein and colleagues’ results6 concerning DNA binding of transcription factors. DHSs are genomic regions that are accessible to enzymatic cleavage as a result of the displacement of nucleosomes (the basic units of chromatin) by DNA-binding proteins (Fig. 1). They are the hallmark of cell-type-specific enhancers, which are often located far away from promoters.

    The ENCODE papers expose the profusion of DHSs (more than 200,000 per cell type, far outstripping the number of promoters) and their variability between cell types. Through the simultaneous presence in the same cell type of a DHS and a nearby active promoter, the researchers paired half a million enhancers with their probable target genes. But this leaves more than 2 million putative enhancers without known targets, revealing the enormous expanse of the regulatory genome landscape that is yet to be explored. Chromosome-conformation-capture methods that detect long-range physical associations between distant DNA regions are attempting to bridge this gap. Indeed, Sanyal and colleagues7 applied these techniques to survey such associations across 1% of the genome.
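
The pairing idea described above (a distal DHS and a nearby promoter that are active in the same cell types) can be sketched in a few lines of Python. This is a simplified illustration, not the consortium's method: the use of Pearson correlation, the 500-kilobase window, the 0.7 cutoff, the function names and all of the toy data are assumptions made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # A constant vector carries no co-activity signal.
    return cov / sqrt(vx * vy)


def pair_enhancers(distal, promoters, max_distance=500_000, min_r=0.7):
    """Pair each distal site with its best-correlated nearby promoter.

    distal and promoters map a name to (position, activity_vector), where
    the vector holds 1/0 activity calls across the same ordered cell types.
    Sites with no promoter passing both filters are left unpaired.
    """
    pairs = {}
    for d_name, (d_pos, d_vec) in distal.items():
        best = None
        for p_name, (p_pos, p_vec) in promoters.items():
            if abs(d_pos - p_pos) > max_distance:
                continue
            r = pearson(d_vec, p_vec)
            if r >= min_r and (best is None or r > best[1]):
                best = (p_name, r)
        if best is not None:
            pairs[d_name] = best[0]
    return pairs


# Invented toy data: activity across five hypothetical cell types.
toy_promoters = {
    "geneA": (1_000_000, [1, 1, 0, 0, 1]),
    "geneB": (1_200_000, [0, 0, 1, 1, 0]),
}
toy_distal = {
    "dhs1": (1_050_000, [1, 1, 0, 0, 1]),  # tracks geneA
    "dhs2": (1_150_000, [0, 0, 1, 1, 0]),  # tracks geneB
    "dhs3": (5_000_000, [1, 1, 0, 0, 1]),  # too far from either promoter
    "dhs4": (1_100_000, [1, 0, 1, 0, 1]),  # weakly correlated with both
}
pairs = pair_enhancers(toy_distal, toy_promoters)
print(pairs)
```

In this toy, two of the four distal sites stay unpaired, which mirrors in miniature why so many putative enhancers remain without known targets under a co-activity criterion.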

    The ENCODE data start to paint a picture of the logic and architecture of transcriptional networks, in which DNA binding of a few high-affinity transcription factors displaces nucleosomes and creates a DHS, which in turn facilitates the binding of further, lower-affinity factors. The results also support the idea that transcription-factor binding can block DNA methylation (a chemical modification of DNA that affects gene expression), rather than the other way around — which is highly relevant to the interpretation of disease-associated sites of altered DNA methylation11.

    The exquisite cell-type specificity of regulatory elements revealed by the ENCODE studies emphasizes the importance of having appropriate biological material on which to test hypotheses. The researchers have focused their efforts on a set of well-established cell lines, with selected assays extended to some freshly isolated cells. Challenges for the future include following the dynamic changes in the regulatory landscape during specific developmental pathways, and understanding chromatin structure in tissues containing heterogeneous cell populations.

    Wendy A. Bickmore is in the Medical Research Council Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK.

    e-mail: wendy.bickmore@igmm.ed.ac.uk 

    11 Years Ago

    The draft human genome


    Unless the human genome contains a lot of genes that are opaque to our computers, it is clear that we do not gain our undoubted complexity over worms and plants by using many more genes. Understanding what does give us our complexity — our enormous behavioural repertoire, ability to produce conscious action, remarkable physical coordination (shared with other vertebrates), precisely tuned alterations in response to external variations of the environment, learning, memory … need I go on? — remains a challenge for the future.

    David Baltimore

    From Nature 15 February 2001


    With the draft in hand, researchers have a new tool for studying the regulatory regions and networks of genes. Comparisons with other genomes should reveal common regulatory elements, and the environments of genes shared with other species may offer insight into function and regulation beyond the level of individual genes. The draft is also a starting point for studies of the three-dimensional packing of the genome into a cell’s nucleus. Such packing is likely to influence gene regulation … The human genome lies before us, ready for interpretation.

    Peer Bork and Richard Copley

    From Nature 15 February 2001

    Non-coding but functional


    The vast majority of the human genome does not code for proteins and, until now, did not seem to contain defined gene-regulatory elements. Why evolution would maintain large amounts of ‘useless’ DNA had remained a mystery, and seemed wasteful. It turns out, however, that there are good reasons to keep this DNA. Results from the ENCODE project2–8 show that most of these stretches of DNA harbour regions that bind proteins and RNA molecules, bringing these into positions from which they cooperate with each other to regulate the function and level of expression of protein-coding genes. In addition, it seems that widespread transcription from non-coding DNA potentially acts as a reservoir for the creation of new functional molecules, such as regulatory RNAs.

    What are the implications of these results for genetic studies of complex human traits and disease? Genome-wide association studies (GWAS), which link variations in DNA sequence with specific traits and diseases, have in recent years become the workhorse of the field, and have identified thousands of DNA variants associated with hundreds of complex traits (such as height) and diseases (such as diabetes). But association is not causality, and identifying those variants that are causally linked to a given disease or trait, and understanding how they exert such influence, has been difficult. Furthermore, most of these associated variants lie in non-coding regions, so their functional effects have remained undefined.

    The ENCODE project provides a detailed map of additional functional non-coding units in the human genome, including some that have cell-type-specific activity. In fact, the catalogue contains many more functional non-coding regions than genes. These data show that results of GWAS are typically enriched for variants that lie within such non-coding functional units, sometimes in a cell-type-specific manner that is consistent with certain traits, suggesting that many of these regions could be causally linked to disease. Thus, the project demonstrates that non-coding regions must be considered when interpreting GWAS results, and it provides a strong motivation for reinterpreting previous GWAS findings. Furthermore, these results imply that sequencing studies focusing on protein-coding sequences (the ‘exome’) risk missing crucial parts of the genome and the ability to identify true causal variants.

    However, although the ENCODE catalogues represent a remarkable tour de force, they contain only an initial exploration of the depths of our genome, because many more cell types must yet be investigated. Some of the remaining challenges for scientists searching for causal disease variants lie in: accessing data derived from cell types and tissues relevant to the disease under study; understanding how these functional units affect genes that may be distantly located7; and the ability to generalize such results to the entire organism.

    Inês Barroso is at the Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK, and at the University of Cambridge Metabolic Research Laboratories and NIHR Cambridge Biomedical Research Centre, Cambridge, UK.

    e-mail: ib1@sanger.ac.uk

    Evolution and the code


    One of the great challenges in evolutionary biology is to understand how differences in DNA sequence between species determine differences in their phenotypes. Evolutionary change may occur both through changes in protein-coding sequences and through sequence changes that alter gene regulation.

    There is growing recognition of the importance of this regulatory evolution, on the basis of numerous specific examples as well as on theoretical grounds. It has been argued that potentially adaptive changes to protein-coding sequences may often be prevented by natural selection because, even if they are beneficial in one cell type or tissue, they may be detrimental elsewhere in the organism. By contrast, because gene-regulatory sequences are frequently associated with temporally and spatially specific gene-expression patterns, changes in these regions may modify the function of only certain cell types at specific times, making it more likely that they will confer an evolutionary advantage12.

    However, until now there has been little information about which genomic regions have regulatory activity. The ENCODE project has provided a first draft of a ‘parts list’ of these regulatory elements, in a wide range of cell types, and moves us considerably closer to one of the key goals of genomics: understanding the functional roles (if any) of every position in the human genome.

    Nonetheless, it will take a great deal of work to identify the critical sequence changes in the newly identified regulatory elements that drive functional differences between humans and other species. There are some precedents for identifying key regulatory differences (see, for example, ref. 13), but ENCODE’s improved identification of regulatory elements should greatly accelerate progress in this area. The data may also allow researchers to begin to identify sequence alterations occurring simultaneously in multiple genomic regions, which, when added together, drive phenotypic change — a process called polygenic adaptation14.

    However, despite the progress brought by the ENCODE consortium and other research groups, it remains difficult to discern with confidence which variants in putative regulatory regions will drive functional changes, and what these changes will be. We also still have an incomplete understanding of how regulatory sequences are linked to target genes. Furthermore, the ENCODE project focused mainly on the control of transcription, but many aspects of post-transcriptional regulation, which may also drive evolutionary changes, are yet to be fully explored.

    Nonetheless, these are exciting times for studies of the evolution of gene regulation. With such new resources in hand, we can expect to see many more descriptions of adaptive regulatory evolution, and how this has contributed to human evolution.

    Jonathan K. Pritchard and Yoav Gilad are in the Department of Human Genetics, University of Chicago, Chicago, Illinois 60637, USA. J.K.P. is also at the Howard Hughes Medical Institute, University of Chicago.

    e-mails: pritch@uchicago.edu; gilad@uchicago.edu 

    From catalogue to function


    Projects that produce unprecedented amounts of data, such as the human genome project15 or the ENCODE project, present new computational and data-analysis challenges and have been a major force driving the development of computational methods in genomics. The human genome project produced one bit of information per DNA base pair, and led to advances in algorithms for sequence matching and alignment. By contrast, in its 1,640 genome-wide data sets, ENCODE provides a profile of the accessibility, methylation, transcriptional status, chromatin structure and bound molecules for every base pair. Processing the project’s raw data to obtain this functional information has been an immense effort.

    For each of the molecular-profiling methods used, the ENCODE researchers devised novel processing algorithms designed to remove outliers and protocol-specific biases, and to ensure the reliability of the derived functional information. These processing pipelines and quality-control measures have been adapted by the research community as the standard for the analysis of such data. The high quality of the functional information they produce is evident from the exquisite detail and accuracy achieved, such as the ability to observe the crystallographic topography of protein–DNA interfaces in DNase I footprints5, and the observation of more than one-million-fold variation in dynamic range in the concentrations of different RNA transcripts3.

    But beyond these individual methods for data processing, the profound biological insights of ENCODE undoubtedly come from computational approaches that integrated multiple data types. For example, by combining data on DNA methylation, DNA accessibility and transcription-factor expression, Thurman et al.4 provide fascinating insight into the causal role of DNA methylation in gene silencing. They find that transcription-factor binding sites are, on average, less frequently methylated in cell types that express those transcription factors, suggesting that binding-site methylation often results from a passive mechanism that methylates sites not bound by transcription factors.
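
The comparison behind that observation can be pictured with a small Python sketch: split cell types by whether a transcription factor is expressed, then compare the average methylation of its binding sites in the two groups. Everything in it is an assumption for illustration, including the expression cutoff, the function name `methylation_by_expression` and the invented numbers; it is not the analysis Thurman et al. performed.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def methylation_by_expression(records, expression_cutoff=1.0):
    """Compare binding-site methylation between two groups of cell types.

    records is a list of (tf_expression, site_methylation_fraction) pairs,
    one per cell type. Returns (mean methylation where the factor is
    expressed, mean methylation where it is not).
    """
    expressed = [m for e, m in records if e >= expression_cutoff]
    silent = [m for e, m in records if e < expression_cutoff]
    if not expressed or not silent:
        raise ValueError("need at least one cell type in each group")
    return mean(expressed), mean(silent)


# Invented toy data: (expression level, fraction of binding sites methylated).
toy_records = [(12.0, 0.05), (8.5, 0.10), (0.2, 0.80), (0.0, 0.90), (15.0, 0.15)]
expressed_mean, silent_mean = methylation_by_expression(toy_records)
print(f"expressed: {expressed_mean:.2f}, not expressed: {silent_mean:.2f}")
```

A gap like the one in this toy would be consistent with the passive model, but a difference in means alone cannot establish the direction of causation.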

    Despite the extensive functional information provided by ENCODE, we are still far from the ultimate goal of understanding the function of the genome in every cell of every person, and across time within the same person. Even if the throughput rate of the ENCODE profiling methods increases dramatically, it is clear that brute-force measurement of this vast space is not feasible. Rather, we must move on from descriptive and correlative computational analyses, and work towards deriving quantitative models that integrate the relevant protein, RNA and chromatin components. We must then describe how these components interact with each other, how they bind the genome and how these binding events regulate transcription.

    If successful, such models will be able to predict the genome’s function at times and in settings that have not been directly measured. By allowing us to determine which assumptions regarding the physical interactions of the system lead to models that better explain measured patterns, the ENCODE data provide an invaluable opportunity to address this next immense computational challenge. ■

    Eran Segal is in the Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 76100, Israel.

    e-mail: eran.segal@weizmann.ac.il

    1. The ENCODE Project Consortium Science 306, 636–640 (2004).

    2. The ENCODE Project Consortium Nature 489, 57–74 (2012).

    3. Djebali, S. et al. Nature 489, 101–108 (2012).

    4. Thurman, R. E. et al. Nature 489, 75–82 (2012).

    5. Neph, S. et al. Nature 489, 83–90 (2012).

    6. Gerstein, M. B. et al. Nature 489, 91–100 (2012).

    7. Sanyal, A., Lajoie, B., Jain, G. & Dekker, J. Nature 489, 109–113 (2012).

    8. Birney, E. et al. Nature 447, 799–816 (2007).

    9. Mardis, E. R. Nature 470, 198–203 (2011).

    10. Gonzaga-Jauregui, C., Lupski, J. R. & Gibbs, R. A. Annu. Rev. Med. 63, 35–61 (2012).

    11. Sproul, D. et al. Proc. Natl Acad. Sci. USA 108, 4364–4369 (2011).

    12. Carroll, S. B. Cell 134, 25–36 (2008).

    13. Prabhakar, S. et al. Science 321, 1346–1350 (2008).

    14. Pritchard, J. K., Pickrell, J. K. & Coop, G. Curr. Biol. 20, R208–R215 (2010).

    15. Lander, E. S. et al. Nature 409, 860–921 (2001).

    http://www.sciencemag.org SCIENCE VOL 337 7 SEPTEMBER 2012 1159


    When researchers fi rst sequenced the human

    genome, they were astonished by how few

    traditional genes encoding proteins were

    scattered along those 3 billion DNA bases.

    Instead of the expected 100,000 or more

    genes, the initial analyses found about 35,000

    and that number has since been whittled down

    to about 21,000. In between were megabases

    of “junk,” or so it seemed.

    This week, 30 research papers, including

    six in Nature and additional papers published

    by Science, sound the death knell for

    the idea that our DNA is mostly littered with

    useless bases. A decadelong project, the

    Encyclopedia of DNA Elements (ENCODE),

    has found that 80% of the human genome

    serves some purpose, biochemically speaking.

    “I don’t think anyone would have anticipated

    even close to the amount of sequence

    that ENCODE has uncovered that looks like

    it has functional importance,” says John A.

    Stamatoyannopoulos, an ENCODE re searcher

    at the University of Washington, Seattle.

    Beyond defi ning proteins, the DNA bases

    highlighted by ENCODE specify landing

    spots for proteins that infl uence gene activity,

    strands of RNA with myriad roles, or

    simply places where chemical modifi cations

    serve to silence stretches of our chromosomes.

    These results are going “to change

    the way a lot of [genomics] concepts are

    written about and presented in textbooks,”

    Stamatoyannopoulos predicts.

    The insights provided by ENCODE into

    how our DNA works are already clarifying

    genetic risk factors for a variety of diseases

    and offering a better understanding of gene

    regulation and function. “It’s a treasure trove

    of information,” says Manolis Kellis, a computational

    biologist at Massachusetts Institute

    of Technology (MIT) in Cambridge who analyzed

    data from the project.

    The ENCODE effort has revealed that

    a gene’s regulation is far more complex

    than previously thought, being infl uenced

    by multiple stretches of regulatory DNA

    located both near and far from the gene

    itself and by strands of RNA not translated

    into proteins, so-called noncoding RNA.

    “What we found is how beautifully complex

    the biology really is,” says Jason Lieb,

    an ENCODE researcher at the University of

    North Carolina, Chapel Hill.

    Throughout the 1990s, various researchers

    called the idea of junk DNA into question.

    With the human genome in hand, the

    National Human Genome Research Institute

    (NHGRI) in Bethesda, Maryland, decided it

    wanted to fi nd out once and for all how much

    of the genome was a wasteland with no functional

    purpose. In 2003, it funded a pilot

    ENCODE, in which 35 research teams analyzed

    44 regions of the genome—30 million

    bases in all, about 1% of the total genome. In

    2007, the pilot project’s results revealed that

    much of this DNA sequence was active in

    some way. The work called into serious question

    our gene-centric view of the genome,

    fi nding extensive RNA-generating activity

    beyond traditional gene boundaries (Science,

    15 June 2007, p. 1556). But the question

    remained whether the rest of the genome was

    like this 1%. “We want to know what all the

    bases are doing,” says Yale University bioinformatician

    Mark Gerstein.

    Teams at 32 institutions worldwide have

    now carried out scores of tests, generating

    1640 data sets. While the pilot phase tests

    depended on computer chip–like devices

    called microarrays to analyze DNA samples,

    the expanded phase benefi ted from the arrival

    of new sequencing technology, which made it

    cost-effective to directly read the DNA bases.

    Taken together, the tests present “a greater

    idea of what the landscape of the genome

    looks like,” says NHGRI’s Elise Feingold.

    Because the parts of the genome used

    could differ among various kinds of cells,

    ENCODE needed to look at DNA function

    in multiple types of cells and tissues. At

    first the goal was to study intensively three

    types of cells. They included GM12878, the

    immature white blood cell line used in the

    1000 Genomes Project, a large-scale effort to

    catalog genetic variation across humans; a leukemia

    cell line called K562; and an approved

    human embryonic stem cell line, H1-hESC.

    As ENCODE was ramping up, new

    sequencing technology brought the cost of

    sequencing down enough to make it feasible

    to test extensively even more cell types.

    ENCODE added a liver cancer cell line,

    HepG2; the laboratory workhorse cancer cell

    line, HeLa S3; and human umbilical cord tissue

    to the mix. Another 140 cell types were

    studied to a much lesser degree.

    In these cells, ENCODE researchers

    closely examined which DNA bases are transcribed

    into RNA and then whether those

    strands of RNA are subsequently translated

    into proteins, verifying predicted protein-coding

    genes and more precisely locating

    each gene’s beginning, end, and coding

    regions. The latest protein-coding gene count

    is 20,687, with hints of about 50 more, the

    consortium reports in Nature. Those genes

    account for about 3% of the human genome,

    less if one counts only their coding regions.

    Another 11,224 DNA stretches are classified

    as pseudogenes, “dead” genes now known to

    be active in some cell types or individuals.

    ENCODE Project Writes Eulogy for Junk DNA






    [Figure labels: Long-range regulatory elements (enhancers, repressors/silencers, insulators); cis-regulatory elements (promoters, transcription factor binding sites); Gene Transcript; CH3CO (Epigenetic modifications)]

    Zooming in. A diagram of DNA in ever-greater detail shows how ENCODE’s various tests (gray boxes) translate DNA’s features into functional elements along a chromosome.


    Published by AAAS

    Downloaded from http://www.sciencemag.org on September 10, 2012

    http://www.sciencemag.org SCIENCE VOL 337 7 SEPTEMBER 2012 1161


    ENCODE drives home, however, that

    there are many “genes” out there in which

    DNA codes for RNA, not a protein, as the end

    product. The big surprise of the pilot project

    was that 93% of the bases studied were transcribed

    into RNA; in the full genome, 76%

    is transcribed. ENCODE defined 8800 small

    RNA molecules and 9600 long noncoding

    RNA molecules, each of which is at least 200

    bases long. Thomas Gingeras of Cold Spring

    Harbor Laboratory in New York has found

    that various ones home in on different cell

    compartments, as if they have fixed addresses

    where they operate. Some go to the nucleus,

    some to the nucleolus, and some to the cytoplasm,

    for example. “So there’s quite a lot

    of sophistication in how RNA works,” says

    Ewan Birney of the European Bioinformatics

    Institute in Hinxton, U.K., one of the key leaders

    of ENCODE (see p. 1162).

    As a result of ENCODE, Gingeras and

    others argue that the fundamental unit of

    the genome and the basic unit of heredity

    should be the transcript—the piece of

    RNA decoded from DNA—and not the

    gene. “The project has played an important

    role in changing our concept of the gene,”

    Stamatoyannopoulos says.

    Another way to test for functionality of

    DNA is to evaluate whether specific base

    sequences are conserved between species, or

    among individuals in a species. Previous studies

    have shown that 5% of the human genome

    is conserved across mammals, even though

    ENCODE studies implied that much more

    of the genome is functional. So MIT’s Lucas

    Ward and Kellis compared functional regions

    newly identified by ENCODE among multiple

    humans, sampling from the

    1000 Genomes Project. Some

    DNA sequences not conserved

    between humans and other

    mammals were nonetheless

    very much preserved across

    multiple people, indicating

    that an additional 4% of the

    genome is newly under selection

    in the human lineage, they

    report in a paper published

    online by Science (http://scim.ag/WardKellis).

    Two such regions were near

    genes for nerve growth and the development

    of cone cells in the eye, which underlie distinguishing

    traits in humans. On the flip side,

    they also found that some supposedly conserved

    regions of the human genome, as highlighted

    by the comparison with 29 mammals,

    actually varied among humans, suggesting

    these regions were no longer functional.

    Beyond transcription, DNA’s bases function

    in gene regulation through their interactions

    with transcription factors and other

    proteins. ENCODE carried out several tests

    to map where those proteins bind along the

    genome (Science, 25 May 2007, p. 1120). Two,

    DNase-seq and FAIRE-seq, gave an overview

    of the genome, identifying where the protein-

    DNA complex chromatin unwinds and a protein

    can hook up with the DNA, and were

    applied to multiple cell types. ENCODE’s

    DNase-seq found 2.89 million such sites

    in 125 cell types. Stamatoyannopoulos and

    his colleagues describe their more extensive

    DNase-seq studies in Science (p. 1190): His

    team examined 349 types of cells, including

    233 60- to 160-day-old fetal tissue samples.

    Each type of cell had about 200,000 accessible

    locations, and there seemed to be at least

    3.9 million regions where transcription factors

    can bind in the genome. Across all cell

    types, about 42% of the genome can be accessible,

    he and his colleagues report. In many

    cases, the assays were able to pinpoint the specific

    bases involved in binding.

    Last year, Stamatoyannopoulos showed

    that these newly discovered functional regions

    sometimes overlap with specific DNA bases

    linked to higher or lower risks of various diseases,

    suggesting that the regulation of genes

    might be at the heart of these risk variations

    (Science, 27 May 2011, p. 1031). The work

    demonstrated how researchers could use

    ENCODE data to come up with new hypotheses

    about the link between genetics and a

    particular disorder. (The ENCODE analysis

    found that 12% of these bases, or SNPs,

    colocate with transcription factor binding

    sites and 34% are in open chromatin defined

    by the DNase-seq tests.) Now, in their new

    work published in Science,

    Stamatoyannopoulos’s lab has

    linked those regulatory regions

    to their specific target genes,

    homing in on the risk-enhancing

    ones. In addition, the group

    finds it can predict the cell type

    involved in a given disease.

    For example, the analysis fingered

    two types of T cells as

    pathogenic in Crohn’s disease,

    both of which are involved in

    this inflammatory bowel disorder. “We are

    informing disease studies in a way that would

    be very hard to do otherwise,” Birney says.
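The overlap figures quoted above (12% of disease-linked SNPs in transcription factor binding sites, 34% in open chromatin) come down to asking whether a position falls inside any of a set of genomic intervals. The following is only a minimal sketch of that kind of check, not ENCODE's actual pipeline. The coordinates are made-up toy values, the function names `build_index` and `in_region` are hypothetical, and intervals are assumed to be half-open (start included, end excluded) on a single chromosome.

```python
from bisect import bisect_right

def build_index(regions):
    """Sort and merge half-open (start, end) intervals.

    Returns two parallel lists: the starts and the ends of the merged intervals.
    """
    merged = []
    for start, end in sorted(regions):
        if merged and start <= merged[-1][1]:
            # Overlapping or touching the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    starts = [s for s, _ in merged]
    ends = [e for _, e in merged]
    return starts, ends

def in_region(pos, starts, ends):
    """Return True if pos lies inside one of the merged intervals."""
    i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
    return i >= 0 and pos < ends[i]

# Hypothetical toy data, not real ENCODE coordinates.
open_chromatin = [(100, 200), (150, 260), (500, 650)]
snp_positions = [120, 255, 300, 500, 650, 900]

starts, ends = build_index(open_chromatin)
hits = [p for p in snp_positions if in_region(p, starts, ends)]
fraction = len(hits) / len(snp_positions)
print(hits, fraction)
```

Merging the intervals first keeps each lookup to a single binary search, which matters when millions of sites are involved. A real analysis would also need to handle multiple chromosomes and a consistent coordinate convention.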

    Another test, called ChIP-seq, uses an

    antibody to home in on a particular DNA-binding

    protein and helps pinpoint the locations

    along the genome where that protein

    works. To date, ENCODE has examined

    about 100 of the 1500 or so transcription

    factors and about 20 other DNA-binding

    proteins, including those involved in modifying

    the chromatin-associated proteins

    called histones. The binding sites found

    through ChIP-seq coincided with the sites

    mapped through FAIRE-seq and DNase-seq.

    Overall, 8% of the genome falls within

    a transcription factor binding site, a percentage

    that is expected to double once more

    transcription factors have been tested.
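A statement such as "8% of the genome falls within a transcription factor binding site" is a coverage calculation: count each base once, even where binding sites overlap, then divide by the genome length. Here is a rough sketch of that arithmetic under stated assumptions. The interval list and the 1,000-base "genome" are invented toy values, `covered_bases` is a hypothetical helper name, and intervals are assumed to be half-open.

```python
def covered_bases(intervals):
    """Count the bases covered by at least one half-open (start, end) interval."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap before this interval: close out the previous merged run.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlaps or touches the current run: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total

# Hypothetical toy values chosen for illustration only.
genome_length = 1000
binding_sites = [(10, 40), (30, 60), (200, 230)]

coverage = covered_bases(binding_sites) / genome_length
print(f"{coverage:.0%} of the toy genome lies within a binding site")
```

Summing interval lengths without merging would count the overlap between the first two sites twice, which is why the merge step comes first.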

    Yale’s Gerstein used these results to figure

    out all the interactions among the transcription

    factors studied and came up with a network

    view of how these regulatory proteins

    work. These transcription factors formed a

    three-layer hierarchy, with the ones at the top

    having the broadest effects and the ones in

    the middle working together to coregulate a

    common target gene, he and his colleagues

    report in Nature.

    Using a technique called 5C, other

    researchers looked for places where DNA

    from distant regions of a chromosome, or

    even different chromosomes, interacted. They

    found that an average of 3.9 distal stretches

    of DNA linked up with the beginning of each

    gene. “Regulation is a 3D puzzle that has to

    be put together,” Gingeras says. “That’s what

    ENCODE is putting out on the table.”

    To date, NHGRI has put $288 million

    toward ENCODE, including the pilot project,

    technology development, and ENCODE

    efforts for the mouse, nematode, and fruit fly.

    All together, more than 400 papers have been

    published by ENCODE researchers. Another

    110 or more studies have used ENCODE data,

    says NHGRI molecular biologist Michael

    Pazin. Molecular biologist Mathieu Lupien of

    the University of Toronto in Canada authored

    one of those papers, a study looking at epigenetics

    and cancer. “ENCODE data were

    fundamental” to the work, he says. “The cost

    is definitely worth every single dollar.”


    ENCODE By the Numbers

    147 cell types studied

    80% functional portion of human genome

    20,687 protein-coding genes

    18,400 RNA genes

    1640 data sets

    30 papers published this week

    442 researchers

    $288 million funding for pilot, technology, model organism, and current project







