
Posts Tagged ‘junk DNA’

The Human Genome Gets Fully Sequenced: A Simplistic Take on Century Long Effort

 

Curator: Stephen J. Williams, PhD

Ever since Rosalind Franklin’s crystallographic work to deduce the structure of DNA and the concurrent work of Francis Crick and James Watson, who modeled the double-helix structure of DNA, DNA has been considered the basic unit of heredity and life, with the “Central Dogma” (DNA to RNA to protein) at its core. These discoveries, made in the mid-twentieth century, helped drive a transformational shift in biological experimentation: from protein isolation and characterization, to cloning protein-encoding genes, to characterizing how genes are expressed temporally, spatially, and contextually.

Rosalind Franklin, whose crystallographic data led to the determination of the structure of DNA. Shown on a Time cover for 1953 as Time Person of the Year.

Dr Francis Crick and James Watson in front of their model structure of DNA

Up to this point (1970s to mid-1980s), genetic information was felt to be rather static. The goal was still to understand and characterize protein structure and function, while an understanding of the underlying genetic information mattered mainly for efforts such as linkage analysis of genetic defects and as a source of tools for the rapidly developing field of molecular biology. But the development of those molecular biology tools, including DNA cloning, sequencing, and synthesis, gave scientists the idea that a complete record of the human genome might be possible and worth the effort.

How the Human Genome Project Expanded our View of Genes, Genetic Material, and Biological Processes

From the Human Genome Project Information Archive

Source:  https://web.ornl.gov/sci/techresources/Human_Genome/project/hgp.shtml

History of the Human Genome Project

The Human Genome Project (HGP) refers to the international 13-year effort, formally begun in October 1990 and completed in 2003, to discover all the estimated 20,000-25,000 human genes and make them accessible for further biological study. Another project goal was to determine the complete sequence of the 3 billion DNA subunits (bases in the human genome). As part of the HGP, parallel studies were carried out on selected model organisms such as the bacterium E. coli and the mouse to help develop the technology and interpret human gene function. The DOE Human Genome Program and the NIH National Human Genome Research Institute (NHGRI) together sponsored the U.S. Human Genome Project.

 

Please see the following for goals, timelines, and funding for this project

 

History of the Project

Interestingly, several pieces of government legislation are credited with enabling the funding of such a massive project, including:

Project Enabling Legislation

  • The Atomic Energy Act of 1946 (P.L. 79-585) provided the initial charter for a comprehensive program of research and development related to the utilization of fissionable and radioactive materials for medical, biological, and health purposes.
  • The Atomic Energy Act of 1954 (P.L. 83-706) further authorized the AEC “to conduct research on the biologic effects of ionizing radiation.”
  • The Energy Reorganization Act of 1974 (P.L. 93-438) provided that responsibilities of the Energy Research and Development Administration (ERDA) shall include “engaging in and supporting environmental, biomedical, physical, and safety research related to the development of energy resources and utilization technologies.”
  • The Federal Non-nuclear Energy Research and Development Act of 1974 (P.L. 93-577) authorized ERDA to conduct a comprehensive non-nuclear energy research, development, and demonstration program to include the environmental and social consequences of the various technologies.
  • The DOE Organization Act of 1977 (P.L. 95-91) mandated the Department “to assure incorporation of national environmental protection goals in the formulation and implementation of energy programs; and to advance the goal of restoring, protecting, and enhancing environmental quality, and assuring public health and safety,” and to conduct “a comprehensive program of research and development on the environmental effects of energy technology and program.”

It should also be emphasized that the project was funded not just through the NIH but also through the Department of Energy.

Project Sponsors

For a great read on Dr. Craig Venter, including interviews with the scientist, see Dr. Larry Bernstein’s excellent post The Human Genome Project

 

By 2003 we had gained much information about the structure of DNA, genes, exons, and introns, which gave us more insight into the diversity of genetic material and the underlying protein-coding genes, as well as many of the gene-expression regulatory elements. However, much uninvestigated material lay dispersed between genes, the then so-called “junk DNA”, and up to 2003 not much was known about its function. In addition, there were two other problems:

  • The reference DNA came from very few individuals (the competing Celera sequence, an effort led by Craig Venter, drew largely on Venter’s own DNA)
  • Multiple gaps in the DNA sequence existed, and needed to be filled in

Transcriptomic and proteomic studies have revealed a tremendous diversity of proteins. Although only about 20,000 to 25,000 coding genes exist, the human proteome contains about 600,000 proteoforms (due to alternative splicing, post-translational modifications, etc.).

This expansion of proteoforms, via alternative splicing into isoforms and gene duplication into paralogs, has been shown to have major effects on, for example, cellular signaling pathways (1).
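The gap between gene count and proteoform count quoted above can be made concrete with simple arithmetic. The short Python sketch below uses only the approximate figures from the text (20,000 to 25,000 coding genes, about 600,000 proteoforms); the function name is an illustrative choice, and the result is an average that says nothing about how proteoforms are distributed across individual genes.

```python
# Approximate figures quoted in the text above (estimates, not measurements).
CODING_GENES_LOW = 20_000
CODING_GENES_HIGH = 25_000
PROTEOFORMS = 600_000


def proteoforms_per_gene(proteoforms: int, genes: int) -> float:
    """Average number of proteoforms per coding gene."""
    return proteoforms / genes


# More genes means fewer proteoforms per gene, so the high gene count
# gives the low end of the range and vice versa.
low = proteoforms_per_gene(PROTEOFORMS, CODING_GENES_HIGH)
high = proteoforms_per_gene(PROTEOFORMS, CODING_GENES_LOW)

print(f"Roughly {low:.0f} to {high:.0f} proteoforms per coding gene on average")
```

On these numbers each coding gene gives rise, on average, to somewhere between 24 and 30 distinct proteoforms.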

However, it has recently been reported that the full human genome has been sequenced, completed, and verified. This was the focus of a recent issue of the journal Science.

Source: https://www.science.org/doi/10.1126/science.abj6987

Abstract

Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion–base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be protein coding. The completed regions include all centromeric satellite arrays, recent segmental duplications, and the short arms of all five acrocentric chromosomes, unlocking these complex regions of the genome to variational and functional studies.

 

The current human reference genome was released by the Genome Reference Consortium (GRC) in 2013 and most recently patched in 2019 (GRCh38.p13) (1). This reference traces its origin to the publicly funded Human Genome Project (2) and has been continually improved over the past two decades. Unlike the competing Celera effort (3) and most modern sequencing projects based on “shotgun” sequence assembly (4), the GRC assembly was constructed from sequenced bacterial artificial chromosomes (BACs) that were ordered and oriented along the human genome by means of radiation hybrid, genetic linkage, and fingerprint maps. However, limitations of BAC cloning led to an underrepresentation of repetitive sequences, and the opportunistic assembly of BACs derived from multiple individuals resulted in a mosaic of haplotypes. As a result, several GRC assembly gaps are unsolvable because of incompatible structural polymorphisms on their flanks, and many other repetitive and polymorphic regions were left unfinished or incorrectly assembled (5).

 

Fig. 1. Summary of the complete T2T-CHM13 human genome assembly.
(A) Ideogram of T2T-CHM13v1.1 assembly features. For each chromosome (chr), the following information is provided from bottom to top: gaps and issues in GRCh38 fixed by CHM13 overlaid with the density of genes exclusive to CHM13 in red; segmental duplications (SDs) (42) and centromeric satellites (CenSat) (30); and CHM13 ancestry predictions (EUR, European; SAS, South Asian; EAS, East Asian; AMR, ad-mixed American). Bottom scale is measured in Mbp. (B and C) Additional (nonsyntenic) bases in the CHM13 assembly relative to GRCh38 per chromosome, with the acrocentrics highlighted in black (B) and by sequence type (C). (Note that the CenSat and SD annotations overlap.) RepMask, RepeatMasker. (D) Total nongap bases in UCSC reference genome releases dating back to September 2000 (hg4) and ending with T2T-CHM13 in 2021. Mt/Y/Ns, mitochondria, chrY, and gaps.

Note in Figure 1D the growth in sequenced (non-gap) bases across successive reference genome releases.

Also very important is the ability to determine all the paralogs, isoforms, areas of potential epigenetic regulation, gene duplications, and transposable elements that exist within the human genome.

Analyses and resources

A number of companion studies were carried out to characterize the complete sequence of a human genome, including comprehensive analyses of centromeric satellites (30), segmental duplications (42), transcriptional (49) and epigenetic profiles (29), mobile elements (49), and variant calls (25). Up to 99% of the complete CHM13 genome can be confidently mapped with long-read sequencing, opening these regions of the genome to functional and variational analysis (23) (fig. S38 and table S14). We have produced a rich collection of annotations and omics datasets for CHM13—including RNA sequencing (RNA-seq) (30), Iso-seq (21), precision run-on sequencing (PRO-seq) (49), cleavage under targets and release using nuclease (CUT&RUN) (30), and ONT methylation (29) experiments—and have made these datasets available via a centralized University of California, Santa Cruz (UCSC), Assembly Hub genome browser (54).

 

To highlight the utility of these genetic and epigenetic resources mapped to a complete human genome, we provide the example of a segmentally duplicated region of the chromosome 4q subtelomere that is associated with facioscapulohumeral muscular dystrophy (FSHD) (55). This region includes FSHD region gene 1 (FRG1), FSHD region gene 2 (FRG2), and an intervening D4Z4 macrosatellite repeat containing the double homeobox 4 (DUX4) gene that has been implicated in the etiology of FSHD (56). Numerous duplications of this region throughout the genome have complicated past genetic analyses of FSHD.

The T2T-CHM13 assembly reveals 23 paralogs of FRG1 spread across all acrocentric chromosomes as well as chromosomes 9 and 20 (Fig. 5A). This gene appears to have undergone recent amplification in the great apes (57), and approximate locations of FRG1 paralogs were previously identified by FISH (58). However, only nine FRG1 paralogs are found in GRCh38, hampering sequence-based analysis.

Future of the human reference genome

The T2T-CHM13 assembly adds five full chromosome arms and more additional sequence than any genome reference release in the past 20 years (Fig. 1D). This 8% of the genome has not been overlooked because of a lack of importance but rather because of technological limitations. High-accuracy long-read sequencing has finally removed this technological barrier, enabling comprehensive studies of genomic variation across the entire human genome, which we expect to drive future discovery in human genomic health and disease. Such studies will necessarily require a complete and accurate human reference genome.

CHM13 lacks a Y chromosome, and homozygous Y-bearing CHMs are nonviable, so a different sample type will be required to complete this last remaining chromosome. However, given its haploid nature, it should be possible to assemble the Y chromosome from a male sample using the same methods described here and supplement the T2T-CHM13 reference assembly with a Y chromosome as needed.

Extending beyond the human reference genome, large-scale resequencing projects have revealed genomic variation across human populations. Our reanalyses of the 1KGP (25) and SGDP (42) datasets have already shown the advantages of T2T-CHM13, even for short-read analyses. However, these studies give only a glimpse of the extensive structural variation that lies within the most repetitive regions of the genome assembled here. Long-read resequencing studies are now needed to comprehensively survey polymorphic variation and reveal any phenotypic associations within these regions.

Although CHM13 represents a complete human haplotype, it does not capture the full diversity of human genetic variation. To address this bias, the Human Pangenome Reference Consortium (59) has joined with the T2T Consortium to build a collection of high-quality reference haplotypes from a diverse set of samples. Ideally, all genomes could be assembled at the quality achieved here, but automated T2T assembly of diploid genomes presents a difficult challenge that will require continued development. Until this goal is realized, and any human genome can be completely sequenced without error, the T2T-CHM13 assembly represents a more complete, representative, and accurate reference than GRCh38.

 

This paper was the focus of a Time article and the basis for naming its lead authors to the Time 100 list of the most influential people of 2022.

From TIME

The Human Genome Is Finally Fully Sequenced

Source: https://time.com/6163452/human-genome-fully-sequenced/

 

The first human genome was mapped in 2001 as part of the Human Genome Project, but researchers knew it was neither complete nor completely accurate. Now, scientists have produced the most completely sequenced human genome to date, filling in gaps and correcting mistakes in the previous version.

The sequence is the most complete reference genome for any mammal so far. The findings from six new papers describing the genome, which were published in Science, should lead to a deeper understanding of human evolution and potentially reveal new targets for addressing a host of diseases.

A more precise human genome

“The Human Genome Project relied on DNA obtained through blood draws; that was the technology at the time,” says Adam Phillippy, head of genome informatics at the National Institutes of Health’s National Human Genome Research Institute (NHGRI) and senior author of one of the new papers. “The techniques at the time introduced errors and gaps that have persisted all of these years. It’s nice now to fill in those gaps and correct those mistakes.”

“We always knew there were parts missing, but I don’t think any of us appreciated how extensive they were, or how interesting,” says Michael Schatz, professor of computer science and biology at Johns Hopkins University and another senior author of the same paper.

The work is the result of the Telomere to Telomere consortium, which is supported by NHGRI and involves genetic and computational biology experts from dozens of institutes around the world. The group focused on filling in the 8% of the human genome that remained a genetic black hole from the first draft sequence. Since then, geneticists have been trying to add those missing portions bit by bit. The latest group of studies identifies about an entire chromosome’s worth of new sequences, representing 200 million more base pairs (the letters making up the genome) and 1,956 new genes.

 

NOTE: In 2001 many scientists postulated that there were as many as 100,000 human coding genes; we now understand that there are about 20,000 to 25,000. This does not, however, take into account the diversity arising from alternative splicing, gene duplications, SNPs, and chromosomal rearrangements.

Scientists were also able to sequence the long stretches of DNA that contained repeated sequences, which genetic experts originally thought were similar to copying errors and dismissed as so-called “junk DNA”. These repeated sequences, however, may play roles in certain human diseases. “Just because a sequence is repetitive doesn’t mean it’s junk,” says Eichler. He points out that critical genes are embedded in these repeated regions—genes that contribute to machinery that creates proteins, genes that dictate how cells divide and split their DNA evenly into their two daughter cells, and human-specific genes that might distinguish the human species from our closest evolutionary relatives, the primates. In one of the papers, for example, researchers found that primates have different numbers of copies of these repeated regions than humans, and that they appear in different parts of the genome.

“These are some of the most important functions that are essential to live, and for making us human,” says Eichler. “Clearly, if you get rid of these genes, you don’t live. That’s not junk to me.”

Deciphering what these repeated sections mean, if anything, and how the sequences of previously unsequenced regions like the centromeres will translate to new therapies or better understanding of human disease, is just starting, says Deanna Church, a vice president at Inscripta, a genome engineering company who wrote a commentary accompanying the scientific articles. Having the full sequence of a human genome is different from decoding it; she notes that currently, of people with suspected genetic disorders whose genomes are sequenced, about half can be traced to specific changes in their DNA. That means much of what the human genome does still remains a mystery.

The lead investigators of the Telomere-to-Telomere Consortium were named to the 2022 Time 100 list of the most influential people.

Michael Schatz, Karen Miga, Evan Eichler, and Adam Phillippy

Illustration by Brian Lutz for Time (Source Photos: Will Kirk—Johns Hopkins University; Nick Gonzales—UC Santa Cruz; Patrick Kehoe; National Human Genome Research Institute)

BY JENNIFER DOUDNA

MAY 23, 2022 6:08 AM EDT

Ever since the draft of the human genome became available in 2001, there has been a nagging question about the genome’s “dark matter”—the parts of the map that were missed the first time through, and what they contained. Now, thanks to Adam Phillippy, Karen Miga, Evan Eichler, Michael Schatz, and the entire Telomere-to-Telomere Consortium (T2T) of scientists that they led, we can see the full map of the human genomic landscape—and there’s much to explore.

In the scientific community, there wasn’t a consensus that mapping these missing parts was necessary. Some in the field felt there was already plenty to do using the data in hand. In addition, overcoming the technical challenges to getting the missing information wasn’t possible until recently. But the more we learn about the genome, the more we understand that every piece of the puzzle is meaningful.

I admire the T2T group’s willingness to grapple with the technical demands of this project and their persistence in expanding the genome map into uncharted territory. The complete human genome sequence is an invaluable resource that may provide new insights into the origin of diseases and how we can treat them. It also offers the most complete look yet at the genetic script underlying the very nature of who we are as human beings.

Doudna is a biochemist and winner of the 2020 Nobel Prize in Chemistry

Source: https://time.com/collection/100-most-influential-people-2022/6177818/evan-eichler-karen-miga-adam-phillippy-michael-schatz/

Other articles on the Human Genome Project and Junk DNA in this Open Access Scientific Journal include:

International Award for Human Genome Project

Cracking the Genome – Inside the Race to Unlock Human DNA – quotes in newspapers

The Human Genome Project

Junk DNA and Breast Cancer

A Perspective on Personalized Medicine

Additional References

 

  1. P. Scalia, A. Giordano, C. Martini, S. J. Williams, Isoform- and Paralog-Switching in IR-Signaling: When Diabetes Opens the Gates to Cancer. Biomolecules 10, (Nov 30, 2020).

Junk DNA and Breast Cancer

Larry H. Bernstein, MD, FCAP, Curator

LPBI

 

Junk DNA Helps Suppress Breast Cancer

GEN News      http://www.genengnews.com/gen-news-highlights/junk-dna-helps-suppress-breast-cancer/81252315/

 

http://www.genengnews.com/Media/images/GENHighlight/thumb__iStock_000019302482Small1434125230.jpg

The cancer-suppressing effects of junk DNA due to transcriptional interference and the junk DNA’s RNA product have been distinguished by means of an RNA interference technique. The technique may allow for finer dissection of cancer mechanisms. Already, it has been used to disentangle cell-cycle and cell-migration effects associated with a tumor-suppression gene involved in breast and ovarian cancer, as well as MET signaling and changes in cell shape. [iStock/David Marchal]

 

A piece of noncoding DNA helps prevent cells from turning cancerous, and does so via two mechanisms. The noncoding DNA, which is called GNG12-AS1, acts both directly, by regulating the transcription of a cell-replication gene, and indirectly, by producing noncoding RNA that interferes with a cell-migration signaling mechanism.

Direct and indirect effects of this sort are ordinarily hard to disentangle experimentally, but they were recognized as being distinct and independent by researchers at the Universities of Bath and Cambridge. These researchers, led by Adele Murrell, Ph.D., reader in regenerative medicine at the University of Bath, made use of a technique called RNA interference.

In a paper (“Transcriptional silencing of long noncoding RNA GNG12-AS1 uncouples its transcriptional and product-related functions”) that appeared February 2 in Nature Communications, the researchers explained that they used multiple small interfering RNAs (siRNAs) to silence GNG12-AS1, which is a long noncoding RNA (lncRNA). GNG12-AS1 is transcribed in an antisense orientation to a neighboring gene, the tumor suppressor DIRAS3, which is downregulated in 70% of breast and ovarian cancers.

“While most siRNAs silence GNG12-AS1 post-transcriptionally, siRNA complementary to exon 1 of GNG12-AS1 suppresses its transcription by recruiting Argonaute 2 and inhibiting RNA polymerase II binding,” wrote the authors. “Transcriptional, but not post-transcriptional, silencing of GNG12-AS1 causes concomitant upregulation of DIRAS3, indicating a function in transcriptional interference.”

The authors noted that the transcriptional effect, the change in DIRAS3 expression, sufficed to impair cell cycle progression. In addition, the post-transcriptional effect, the reduction in GNG12-AS1 transcripts, altered MET signaling, effectively suppressing a gene network that prepares cells to change their shape and participate in metastasis.

Importantly, the changes in signaling and metastatic behavior associated with the post-transcriptional effect were independent of the changes associated with transcriptional interference. The latter changes affected both migratory behavior and the cell cycle.

“The cells in our body occur in numbers that are balanced by the level at which the new cells replace the old cells that die. Sometimes the switches that control this growth get stuck in the ‘on’ position, which can lead to cancer,” explained Dr. Murrell. “As the tumor grows and the cancer cells get crowded, they start to break away from the tumor, change shape, and are able to burrow through tissues to the bloodstream where they migrate to other parts of the body, which is how the cancer spreads. This process is called metastasis and requires a whole network of genes to regulate the transformation of cell shape and mobilization.”

“In our study, we’ve identified that GNG12-AS1, a strand of noncoding RNA, prevents the growth switch getting stuck and suppresses metastasis,” Dr. Murrell continued. “The specific genomic region where this noncoding RNA is located often gets damaged in breast cancer patients—this control is removed and the cancer cells spread.”

“Research like this is helping us to unpick the precise details about how [noncoding genome] regions work,” added Kat Arney, Ph.D., science communication manager at Cancer Research UK. “[It sheds] light on their potential role in the development of cancer and pointing towards new approaches for tackling the disease.”

“Altogether,” concluded the authors of the Nature Communications paper, “our results demonstrate that an siRNA-based strategy can be employed to successfully separate functions that are due to lncRNA transcription from those of the transcript.”


CRACKING THE CODE OF HUMAN LIFE: Recent Advances in Genomic Analysis and Disease – Part IIC


Author: Larry H. Bernstein, MD, FCAP, Triplex Medical Science

 

Part I: The Initiation and Growth of Molecular Biology and Genomics – Part I From Molecular Biology to Translational Medicine: How Far Have We Come, and Where Does It Lead Us?


Part II: CRACKING THE CODE OF HUMAN LIFE is divided into a three-part series.

Part IIA. “CRACKING THE CODE OF HUMAN LIFE: Milestones along the Way” reviews the Human Genome Project and the decade beyond.

http://pharmaceuticalintelligence.com/2013/02/12/cracking-the-code-of-human-life-milestones-along-the-way/

Part IIB. “CRACKING THE CODE OF HUMAN LIFE: The Birth of BioInformatics & Computational Genomics” lays out the manifold multivariate systems analytical tools that have moved the science forward to a ground that ensures clinical application.

http://pharmaceuticalintelligence.com/2013/02/13/cracking-the-code-of-human-life-the-birth-of-bioinformatics-and-computational-genomics/

Part IIC. “CRACKING THE CODE OF HUMAN LIFE: Recent Advances in Genomic Analysis and Disease” will extend the discussion to advances in the management of patients as well as providing a roadmap for pharmaceutical drug targeting.

http://pharmaceuticalintelligence.com/2013/02/14/cracking-the-code-of-human-life-recent-advances-in-genomic-analysis-and-disease/

To be followed by:
Part III will conclude with Ubiquitin, its role in Signaling and Regulatory Control.

 

Part IIC of series on CODE OF HUMAN LIFE
CRACKING THE CODE OF HUMAN LIFE: Recent Advances in Genomic Analysis and Disease

This final paper of Part II concludes a thorough review of the scientific events leading to the discovery of the human genome, the purification and identification of the components of the chromosome and the DNA structure and role in regulation of embryogenesis, and potential targets for cancer.

The first two articles, Part IIA and Part IIB, go into some depth to elucidate the problems and breakthroughs encountered in the Human Genome Project, and the construction of a 3-D model necessary to explain interactions at a distance.

Part IIC, the final article, is entirely concerned with clinical application of this treasure trove of knowledge to resolving diseases of epigenetic nature in the young and the old, chronic inflammatory diseases, autoimmune diseases, infectious disease, gastrointestinal disorders, neurological and neurodegenerative diseases, and cancer.

 

CRACKING THE CODE OF HUMAN LIFE: Recent Advances in Genomic Analysis and Disease – Part IIC

 

1. Gene Links to Heart Disease

 

Recently, large studies have identified some of the genetic basis for important common diseases such as heart disease and diabetes, but most of the genetic contribution to them remains undiscovered. Now researchers at the University of Massachusetts Amherst led by biostatistician Andrea Foulkes have applied sophisticated statistical tools to existing large databases to reveal substantial new information about genes that cause such conditions as high cholesterol linked to heart disease.

Foulkes says, “This new approach to data analysis provides opportunities for developing new treatments.” It also advances approaches to identifying people at greatest risk for heart disease. Another important point is that our method is straightforward to use with freely available computer software and can be applied broadly to advance genetic knowledge of many diseases.

The new analytical approach she developed with cardiologist Dr. Muredach Reilly at the University of Pennsylvania and others is called “Mixed modeling of Meta-Analysis P-values” or MixMAP. Because it makes use of existing public databases, the powerful new method represents a low-cost tool for investigators. MixMAP draws on a principled statistical modeling framework and the vast array of summary data now available from genetic association studies to formally test for association at a new level, that of the locus.

While the traditional statistical method looks for one unusual “needle in a haystack” as a possible disease signal, Foulkes and colleagues’ new method uses knowledge of DNA regions in the genome that are likely to contain several genetic signals for disease variation clumped together in one region. Thus, it is able to detect groups of unusual variants rather than just single SNPs, offering a way to “call out” gene regions that have a consistent signal above normal variation.
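The locus-level idea described above can be illustrated with a small Python sketch. This is not the MixMAP implementation: the published method fits a mixed model to meta-analysis p-values, whereas the sketch below substitutes a plain per-locus average of inverse-normal-transformed p-values to show why a region with several moderately small p-values can stand out. The locus names and p-values are hypothetical, invented purely for illustration.

```python
from collections import defaultdict
from statistics import NormalDist, mean


def inverse_normal_score(p: float) -> float:
    """Map a p-value in (0, 1) to a z-like score; smaller p gives a larger score."""
    return -NormalDist().inv_cdf(p)


def locus_level_scores(snp_pvalues):
    """Average the transformed SNP scores within each locus."""
    by_locus = defaultdict(list)
    for locus, p in snp_pvalues:
        by_locus[locus].append(inverse_normal_score(p))
    return {locus: mean(scores) for locus, scores in by_locus.items()}


# Hypothetical summary data: (locus name, SNP p-value). Not real results.
snps = [
    ("LOCUS_A", 1e-3), ("LOCUS_A", 5e-3), ("LOCUS_A", 2e-3),  # consistent signal
    ("LOCUS_B", 1e-6), ("LOCUS_B", 0.6), ("LOCUS_B", 0.8),    # one "needle"
    ("LOCUS_C", 0.4), ("LOCUS_C", 0.5), ("LOCUS_C", 0.7),     # no signal
]

scores = locus_level_scores(snps)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)  # LOCUS_A ranks first, then LOCUS_B, then LOCUS_C
```

In this toy data LOCUS_B holds the single smallest p-value, yet LOCUS_A ranks first because all three of its SNPs carry a consistent signal. A single-SNP scan would favor LOCUS_B instead, which is the contrast the passage draws.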

http://Science.com/Science News/Identify Genes Linked to Heart Disease/

2. Apolipoprotein(a) Genetic Sequence Variants

The LPA gene codes for apolipoprotein(a), which, when linked with low-density lipoprotein particles, forms lipoprotein(a) [Lp(a)], a well-studied molecule associated with coronary artery disease (CAD). The Lp(a) molecule has both atherogenic and thrombogenic effects in vitro, but the extent to which these translate to differences in how atherothrombotic disease presents is unknown.

LPA contains many single-nucleotide polymorphisms, and 2 have been identified by previous groups as being strongly associated with levels of Lp(a) and, as a consequence, strongly associated with CAD.

However, because atherosclerosis is thought to be a systemic disease, it is unclear to what extent Lp(a) leads to atherosclerosis in other arterial beds (eg, carotid, abdominal aorta, and lower extremity), as well as to other thrombotic disorders (eg, ischemic/cardioembolic stroke and venous thromboembolism).

Such distinctions are important, because therapies that might lower Lp(a) could potentially reduce forms of atherosclerosis beyond the coronary tree.

To answer this question, Helgadottir and colleagues compiled clinical and genetic data on the LPA gene from thousands of previous participants in genetic research studies from across the world. They did not have access to Lp(a) levels, but by knowing the genotypes for 2 LPA variants, they inferred the levels of Lp(a) on the basis of prior associations between these variants and Lp(a) levels. [1]
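Inferring a level from genotypes, as described above, can be sketched as a simple additive score. The Python example below is only an illustration of the general idea and is not the model used by Helgadottir and colleagues: the variant names, the baseline, and the per-allele effects are all hypothetical placeholders in arbitrary units.

```python
# Hypothetical values for illustration only; not real effect estimates.
BASELINE_LPA = 10.0  # assumed level with no risk alleles (arbitrary units)
PER_ALLELE_EFFECT = {"variant_1": 30.0, "variant_2": 20.0}


def inferred_lpa(genotype: dict) -> float:
    """Estimate an Lp(a) level from risk-allele counts (0, 1 or 2) per variant."""
    level = BASELINE_LPA
    for variant, effect in PER_ALLELE_EFFECT.items():
        count = genotype.get(variant, 0)
        if count not in (0, 1, 2):
            raise ValueError(f"allele count for {variant} must be 0, 1 or 2")
        level += effect * count
    return level


# One risk allele at the first variant, none at the second.
print(inferred_lpa({"variant_1": 1, "variant_2": 0}))
```

The inferred level then stands in for a measured one in downstream association tests, which is why the findings are described as holding for Lp(a) levels "by proxy".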

Their studies included not only individuals of white European descent but also a significant proportion of black persons, in order to widen the generalizability of their results.

Their main findings are that LPA variants (and, by proxy, Lp(a) levels) are associated with

  • CAD,
  • peripheral arterial disease,
  • abdominal aortic aneurysm,
  • number of CAD vessels,
  • age at onset of CAD diagnosis, and
  • large-artery atherosclerosis-type stroke.

They did not find an association with

  • cardioembolic or small-vessel disease-type stroke;
  • intracranial aneurysm;
  • venous thrombosis;
  • carotid intima thickness; or,
  • in a small subset of individuals, myocardial infarction.

Apolipoprotein(a) Genetic Sequence Variants Associated With Systemic Atherosclerosis and Coronary Atherosclerotic Burden but Not With Venous Thromboembolism. Helgadottir A, Gretarsdottir S, Thorleifsson G, et al.    J Am Coll Cardiol. 2012;60:722-729

English: Structure of the LPA protein. Based o...

English: Structure of the LPA protein. Based on PyMOL rendering of PDB 1i71. (Photo credit: Wikipedia)

Micrograph of an artery that supplies the heart with significant atherosclerosis and marked luminal narrowing. Tissue has been stained using Masson’s trichrome. (Photo credit: Wikipedia)

Genomic Blueprint of the Heart

Scientists at the Gladstone Institutes have revealed the precise order and timing of hundreds of genetic “switches” required to construct a fully

  • functional heart from embryonic heart cells — providing new clues into the genetic basis for some forms of congenital heart disease.

In a study being published online today in the journal Cell, researchers in the laboratory of Gladstone Senior Investigator Benoit Bruneau, PhD,

  • employed stem cell technology, next-generation DNA sequencing and computing tools to piece together the instruction manual, or “genomic
  • blueprint” for how a heart becomes a heart. These findings offer renewed hope for combating life-threatening heart defects such as arrhythmias (irregular heart beat) and ventricular septal defects (“holes in the heart”).

ScienceDaily (Sep. 13, 2012)

They approach heart formation with a wide-angle lens by

  • looking at the entirety of the genetic material that gives heart cells their unique identity.

The news comes at a time of emerging importance for the biological process called “epigenetics,” in which a non-genetic factor impacts a cell’s genetic

  • makeup early during development — but sometimes with longer-term consequences. All of the cells in an organism contain the same DNA, but the
  • epigenetic instructions encoded in specific DNA sequences give the cell its identity. Epigenetics is of particular interest in heart formation, as the
  • incorrect on-and-off switching of genes during fetal development can lead to congenital heart disease — some forms of which may not be apparent until adulthood.

The scientists took embryonic stem cells from mice and reprogrammed them into beating heart cells by mimicking embryonic development in a petri dish. Next, they extracted the DNA from developing and mature heart cells, using an advanced gene-sequencing technique called ChIP-seq that lets scientists “see” the epigenetic signatures written in the DNA.

Map of Heart Disease Death Rates in US White Males from 2000-2004 (Photo credit: Wikipedia)

Estimated probability of death or non-fatal myocardial infarction over one year corresponding to selected values of the individual scores. Ordinate: individual score; abscissa: probability of death or non-fatal myocardial infarction in 1 year (in %) (Photo credit: Wikipedia)

“Simply finding these signatures was only half the battle; we next had to decipher which aspects of heart formation they encoded.

To do that, we harnessed the computing power of the Gladstone Bioinformatics Core. This allowed us to take the mountains of data collected from

  • gene sequencing and organize it into a readable, meaningful blueprint for how a heart becomes a heart.”

http://ScienceDaily.org/Scientists Map the Genomic Blueprint of the Heart.  ScienceDaily.

Performance of transcription factor identification tools from differential gene expression data

A three-step process is a clear way to establish belief in the performance of transcription factor identification tools from differential gene expression data:

  1. Identify several types of differential gene expression data sets where the stimulus or trigger is clearly known.
  2. Identify the transcription factors most likely associated with each data set's expression data.
  3. Perform an upstream analysis from the identified transcription factors.

If the transcription factor and upstream analysis tools can trace the signal cascade back to the stimulus, the tools are

  • clearly producing relevant results, and belief in the performance of the analysis tools is established.

At this point, the tools can be directed with confidence to more challenging analyses such as

  • developed resistance or pathway elucidation.

The performance of IPA’s new Transcription Factor and Upstream analysis tools was evaluated on the following datasets (processing details below):

  • TGFb stimulation, 1 hour, A549 lung adenocarcinoma cell line
  • BMP2 stimulation, 1 hour, Mouse Embryonic Stem Cell E14Tg2A.4
  • TNFa stimulation, 1 hour, primary murine hepatocytes

For each of the above datasets, an upstream analysis from the identified transcription factors correctly identified the stimulus. IPA’s tools were very

  • easy to use and the
  • analysis time for the above experiments was less than one minute.

The performance, speed, and ease of use can only be characterized as very good, perhaps leading to breakthroughs when extended and used creatively. Ingenuity’s new transcription factor analysis tool in IPA, coupled with Ingenuity’s established upstream grow tools,  should be strongly considered for every lab analyzing differential expression data.

http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE17896

http://www.ncbi.nlm.nih.gov/projects/geo/query/acc.cgi?acc=GSE2639

http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE19272

Differential expression data was obtained from CEL files using the Matlab functions:

affyrma, genelowvalfilter, genevarfilter, mattest, and mavolcanoplot.
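For readers without Matlab, the filter-then-test part of that workflow can be sketched in plain Python. This is an illustration under stated assumptions, not a port of the Matlab functions named above: the input is taken to be already normalized, log2-scale expression values (the role played by RMA), the thresholds `min_level` and `min_var` are arbitrary, the gene names and values are made up, and only a Welch t statistic and a log2 fold change (the two axes a volcano plot is built from, once p-values are added) are computed.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (no p-value computed)."""
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        # Degenerate case: no within-group spread at all.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def differential_expression(expr, min_level=4.0, min_var=0.01):
    """expr maps gene -> (control_values, treated_values), all on a log2 scale.

    Returns gene -> (log2 fold change, t statistic) for genes passing both filters.
    """
    results = {}
    for gene, (control, treated) in expr.items():
        # Low-value filter: drop genes that are barely expressed in both groups.
        if max(mean(control), mean(treated)) < min_level:
            continue
        # Low-variance filter: drop genes that hardly change across all samples.
        if variance(list(control) + list(treated)) < min_var:
            continue
        log2_fc = mean(treated) - mean(control)
        results[gene] = (log2_fc, welch_t(treated, control))
    return results


# Made-up values for illustration only.
expr = {
    "GENE_UP": ([6.0, 6.2, 5.8], [8.0, 8.3, 7.9]),
    "GENE_FLAT": ([7.0, 7.1, 6.9], [7.0, 7.05, 6.95]),
    "GENE_LOW": ([1.0, 1.2, 0.9], [3.0, 3.1, 2.9]),
}
results = differential_expression(expr)
```

In this toy input only `GENE_UP` survives both filters; a real analysis would also attach p-values and correct them for multiple testing.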

Rick Stanton, Pathway Analysis Consultant Ingenuity.com

3. miR-200a regulates Nrf2 activation by targeting Keap1 mRNA in breast cancer cells.

Eades G, Yang M, Yao Y, Zhang Y, Zhou Q. J Biol Chem. 2011 Nov 25;286(47):40725-33. Epub 2011 Sep 16.
http://JBiolChem.com/miR-200a regulates Nrf2 activation by targeting Keap1 mRNA in breast cancer cells.

NF-E2-related factor 2 (Nrf2) is an important transcription factor that

  • activates the expression of cellular detoxifying enzymes.

Nrf2 expression is largely regulated through the association of Nrf2 with Kelch-like ECH-associated protein 1 (Keap1), which

  • results in cytoplasmic Nrf2 degradation.

Conversely, little is known concerning the regulation of Keap1 expression. Until now, a regulatory role for microRNAs (miRs) in controlling Keap1 gene expression had not been characterized. By using miR array-based screening,

  • we observed miR-200a silencing in breast cancer cells and
  • demonstrated that upon re-expression, miR-200a
  • targets the Keap1 3′-untranslated region (3′-UTR), leading to Keap1 mRNA degradation. Loss of this regulatory mechanism may
  • contribute to the dysregulation of Nrf2 activity in breast cancer. Previously, we have identified epigenetic repression of miR-200a

in breast cancer cells. Here, we find that treatment with epigenetic therapy, the histone deacetylase inhibitor suberoylanilide hydroxamic acid, restored miR-200a expression and reduced Keap1 levels. This reduction in Keap1 levels corresponded with

  • Nrf2 nuclear translocation
  • and activation of Nrf2-dependent NAD(P)H-quinone oxidoreductase 1 (NQO1) gene transcription.

Moreover, we found that Nrf2 activation inhibited the anchorage-independent growth of breast cancer cells. Finally, our in vitro observations were confirmed in a model of carcinogen-induced mammary hyperplasia in vivo. In conclusion, our study demonstrates

  • that miR-200a regulates the Keap1/Nrf2 pathway in mammary epithelium, and we find that epigenetic therapy can restore miR-200a
  • regulation of Keap1 expression,
  • reactivating the Nrf2-dependent antioxidant pathway in breast cancer.

Nuclear factor (erythroid-derived 2)-like 2, also known as NFE2L2 or Nrf2, is a transcription factor that in humans is encoded by the NFE2L2 gene.[1] NFE2L2 induces the expression of various genes, including those that encode several antioxidant enzymes, and it may play a physiological role in the regulation of oxidative stress. Investigational drugs that target NFE2L2 are of interest as potential therapeutic interventions for

  • oxidative-stress related pathologies.

4. Highly active zinc finger nucleases by extended modular assembly

MS Bhakta, IM Henry, DG Ousterout, KT Das, et al.  Corresponding author; email: djsegal@ucdavis.edu
http://CSHNLpress.com/Highly active zinc finger nucleases by extended modular assembly

Zinc finger nucleases (ZFNs) are important tools for genome engineering. Despite intense interest by many academic groups,

  • the lack of robust non-commercial methods has hindered their widespread use. The modular assembly (MA) of ZFNs from
  • publicly-available one-finger archives provides a rapid method to create proteins that can recognize a very broad spectrum of DNA sequences.

However, three- and four-finger arrays often fail to produce active nucleases. Efforts to improve the specificity of the one-finger archives have not increased the success rate above 25%, suggesting that the MA method might

  • be inherently inefficient due to its insensitivity to context-dependent effects.

Here we present the first systematic study on the effect of array length on ZFN activity.  ZFNs composed of six-finger MA arrays produced mutations at 15 of 21 (71%) targeted

  • loci in human and mouse cells. A novel Drop-Out Linker scheme was used to rapidly assess three- to six-finger combinations,
  • demonstrating that shorter arrays could improve activity in some cases. Analysis of 268 array variants revealed that half of

MA ZFNs of any array composition that exceed an ab initio

  • B-score cut-off of 15 were active.
  • MA ZFNs are able to target more DNA sequences with higher success rates than other methods.

This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date http://genome.cshlp.org/site/misc/terms.xhtml
After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at
http://creativecommons.org/licenses/by-nc/3.0/Highly_active_zinc_finger_nucleases_by_extended_ modular_assembly/

PERSONALIZED MEDICINE in the Pipeline

These insightful reviews are based on the strategic data and insights from Thomson Reuters Cortellis™ for Competitive Intelligence.  (A Review of April-June 2012).

http://ThomsonReuters.com/DIFFERENTIATED INNOVATION: PERSONALIZED MEDICINE IN THE PIPELINE/ Cortellis™ for Competitive Intelligence/APRIL-JUNE 2012

The majority of diseases are complex and multi-factorial, involving multiple genes interacting with environmental factors. At the genetic level,

  • information from genome-wide association studies that elucidate common patterns of genetic variation across various human populations,
  • together with profiling technologies, can be utilized in discovery research to provide snapshots of genes and expression profiles that are controlled
  • by the same regulatory mechanism and are altered between healthy and diseased states.

The characterization of genes that are abnormally expressed in disease tissues could further be employed as

  • diagnostic markers,
  • prognostic indicators of efficacy and/or toxicity, or as
  • targets for therapeutic intervention.

The published genome sequence, the defining catalyst for personalized medicine, revealed that much of the genetic variation in humans is concentrated in about 0.1 percent of the over 3 billion base pairs in the haploid DNA, that is, roughly 3 million positions. Most of these variations involve substitution of a single nucleotide for another at a given location in the genetic sequence, known as a single nucleotide polymorphism (SNP).

  • Combinations of linked SNPs aggregate together to form haplotypes and
  • together these serve as markers for locating genetic variations in DNA sequences.

SNPs located within the protein-coding region of a gene or within the control regions of DNA that regulate a gene’s activity could

  • have a substantial effect on the encoded protein and thus influence phenotypic outcomes.

Analyzing SNPs between patient population cohorts could highlight specific genotypic variations which can be correlated with specific phenotypic variations in disease predisposition and drug responses.
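As a rough illustration of what comparing a SNP between cohorts involves, the sketch below computes allele frequencies and an allelic odds ratio from allele counts. The function names and the counts are hypothetical and chosen only for the example; a real association analysis would add a significance test and a correction for testing many SNPs.

```python
def allele_frequency(alt_count, ref_count):
    """Fraction of chromosomes in a cohort carrying the alternate allele."""
    total = alt_count + ref_count
    if total <= 0:
        raise ValueError("cohort has no observed alleles")
    return alt_count / total


def allelic_odds_ratio(case_alt, case_ref, control_alt, control_ref):
    """Odds of carrying the alternate allele in cases relative to controls."""
    if min(case_alt, case_ref, control_alt, control_ref) <= 0:
        raise ValueError("all four allele counts must be positive")
    return (case_alt * control_ref) / (case_ref * control_alt)


# Made-up allele counts (two alleles per person) for illustration only.
case_alt, case_ref = 300, 700
control_alt, control_ref = 200, 800

case_freq = allele_frequency(case_alt, case_ref)
control_freq = allele_frequency(control_alt, control_ref)
odds_ratio = allelic_odds_ratio(case_alt, case_ref, control_alt, control_ref)
```

An odds ratio above 1 means the alternate allele is more common among cases, which is the kind of genotype-phenotype correlation the paragraph above refers to.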

Prior to the genomic revolution, many of the established therapies were directed against fewer than 500 drug targets, with many of the top-selling drugs acting on well-defined protein pathways. However, the sequencing of the human genome has massively expanded the pool of molecular targets that could be exploited in unmet medical needs. Currently, of the approximately 22,300 protein-coding genes in the human genome, it has been estimated that up to 3000 are druggable. Furthermore, genomic technologies such as

  • high-throughput sequencing
  • and transcription profiling,

can be used to identify and validate biologically relevant target molecules, or can be applied to cell-based and mouse disease models or directly to in vivo human tissues,

  • helping to correlate gene targets with phenotypic traits of complex diseases.

This is particularly important, as

  • insufficient validation of target gene/proteins in complex diseases may be a contributing factor in the decline in R&D productivity.

Personalized medicine no doubt is already having a tremendous impact on drug development pipelines. According to a study conducted by the Tufts Center for the Study of Drug Development, more than 90 percent of biopharmaceutical companies now utilize at least some

  • genomics-derived targets in their drug discovery programs.

However, pipeline analysis from Cortellis for Competitive Intelligence suggests that there is still a scientific gap that has made it difficult to bring these novel genomic targets into the clinical R&D portfolios of major pharmaceutical companies, particularly outside the oncology field. Selected examples of personalized medicine product candidates in clinical development are listed in Table 4.

Table 4: Selected Personalized Medicines in Clinical Development
(DATA are Derived from Cortellis for Competitive Intelligence & Thomson Reuters IntegritySM)
http://Thomson Reuters.com/Cortellis for Competitive Intelligence/IntegritySM/Table_4_Selected_Personalized_Medicines_in_Clinical_Development/

PHARMA MATTERS | SPOTLIGHT ON… PERSONALIZED MEDICINE

The paucity of actual targeted therapy examples, especially outside oncology, suggests

  • that integration of the personalized medicine paradigm into biopharmaceutical R&D is still fraught with challenges.

Although the Human Genome Project was completed over ten years ago, the broader application of genomics to drug development

  • still remains unrealized, and is hampered by a number of scientific challenges. One of the major obstacles stems from
  • incomplete association of genomic alterations with complex disease pathways and the phenotypic consequences.

As most complex diseases are multi-factorial, understanding how each genomic driver event plays a role in disease and the

  • interaction/interdependence with other genetic and environmental factors is important for
  • determining the rationale for targeted prevention or treatment of the disease.

Mutations found in Melanomas may shed light on Cancer Growth

Gina Kolata. New York Times.
http://NewYorkTimes.com/mutations_found_in_melanomas_may_shed-light_on_how_cancers_grow/

Mutations in Melanoma are in regions that control genes, not in the genes themselves. The mutations are exactly the type caused by exposure to ultraviolet light.  The findings are reported in two papers in http://Science.com/ScienceExpress/

The findings do not suggest new treatments, but they help explain how melanomas, and possibly other cancers, develop and what drives their growth. The modification lies in the “dark matter”, which according to Dr. Levi A. Garraway is the 99 percent of DNA outside the genes, in a region that regulates genes. A small control region was mutated in 7 out of 10 of the tumors, commonly by one or two tiny changes.
A German team led by Rajiv Kumar (Heidelberg) and Dirk Schadendorf (Essen) looked at a family whose members tended to get melanomas. Their findings indicate that those who inherit the mutations might be born with cells that have taken the first step toward cancer.
The mutations spur cells to make telomerase, which keeps the cells immortal by preventing them from losing the ends of their chromosomes, the telomeres. Abundant telomerase occurs in 90 percent of cancers, according to Immaculata De Vivo at Harvard Medical School.
The importance of the findings is that the mechanism of telomerase involvement in cancer is now within view. But it is not yet clear how to block telomerase production in cancer cells.
 
A slight mutation in the matched nucleotides can lead to chromosomal aberrations and unintentional genetic rearrangement. (Photo credit: Wikipedia)

Comment

This discussion addresses the issues raised about the direction to follow in personalized medicine. Despite the amount of work necessary to bring the clarity that is sought, the experiments and the experimental design are what is most essential. The key observations are:

  • The arrest of ciliogenesis in ovarian cancer cell lines compared to wild type (WT) ovarian epithelial cells, and
  •  The link to suppressing ciliogenesis by AURA protein and CHFR at the base of the cilium, which disappears at mitosis or with proliferation.
  •  There is no accumulation by upregulation of PDGF under starvation by the cancer cells compared to the effect in WT OSE.

Here we have a systematic combination of signaling events tied to changes in putative biomarkers that occur synchronously in Ov cancer cell lines.

These changes are identified with changes in

  • proliferation and
  • loss of ciliary structure.

In this described scenario,

  • WT OSE cells would be arrested, and
  • it appears that they would take the path to apoptosis (under starvation).

Even without more information, this cluster is what one wants to have in a “syndromic classification”. The information used to form the classification entails the identification of strong ‘signaling-related’ biomarkers. The Gli2 peptide has to be part of this.

In principle, a syndromic classification would ideally be expected to have no fewer than 64 classes. If the classification is “weak”, then the class frequencies would be close to what one would expect in the WT OSE. In this case, in reality,

  • several combinatorial classes would have low frequency, and
  • others would be quite high.

This obeys the classification rules established by feature identification, and the information gain described by Solomon Kullback and extended by Akaike.
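The information gain mentioned here can be made concrete with the Kullback-Leibler divergence between observed class frequencies and a reference distribution. The sketch below is illustrative only: the 64 classes are modelled as all combinations of six binary markers (an assumption, one way to arrive at 64), the reference is taken to be uniform, and the observed frequencies are made up to show a classification in which a few combinatorial classes dominate.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits for two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of classes")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi <= 0:
                raise ValueError("q must be positive wherever p is positive")
            total += pi * math.log2(pi / qi)
    return total


N_MARKERS = 6                # assumed number of binary biomarkers
N_CLASSES = 2 ** N_MARKERS   # 64 combinatorial classes

# Reference: every class equally likely (the "weak" classification case).
reference = [1.0 / N_CLASSES] * N_CLASSES

# Made-up observed frequencies: four classes hold 80% of cases, the rest share 20%.
observed = [0.2] * 4 + [0.2 / (N_CLASSES - 4)] * (N_CLASSES - 4)

gain_weak = kl_divergence(reference, reference)
gain_strong = kl_divergence(observed, reference)
```

A divergence near zero corresponds to the "weak" case, while a large value reflects the pattern described above, with several classes at low frequency and others quite high.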

Does this have to be the case for all different cancer types? I don’t think so. The cells are different in ontogenesis.  In this case, even the WT OSE have mesenchymal features and so, are not fully directed to epithelial expression.  This happens to be the case in actual anatomic expression of the ovary.  On the other hand, one would expect shared features of the

  • ovary,
  • testes,
  • thyroid,
  • adrenals, and
  • pituitary.

There is biochemical expression in terms of their synthetic function – TPN organs. I would have to put the liver into that broad class. Other organs – skeletal muscle & heart – transform substrate into energy or work.  (Where you might also put intestinal smooth muscle).

They have to have different biomarker expressions, even though they much less often form neoplasms. (Bone is not just a bioenergetic force. It is maintained by muscle action. It forms sarcomas. But there has to be a balance between bone removal by osteoclasts and refill by osteoblasts.)

Viewpoint: What we have learned

  1. The Watson-Crick model proposed in 1953 is limited in fully explaining genome effects
  2. The Pauling triplex model may have been prescient because of a more full anticipation of molecular bonding variants
  3. A more adequate triple-helix model has been proposed and is consistent with a compact genome in the nucleus

Based on the application of fractal geometry, the structure of the genome is not as we assumed. A growing body of evidence can reveal a more complete view of genome function:

  • transcription
  • cell regulation
  • mutations

Summary

I have just completed a most comprehensive review of the Human Genome Project. There are key research collaborations, problems in deciphering the underlying structure of the genome, and there are also both obstacles and insights to elucidating the complexity of the final model.

This is because of frequent observations of molecular problems in folding and other interactions between nucleotides that challenge the sufficiency of the original DNA model proposed by Watson and Crick. This has come about because of breakthrough innovation in technology and in computational methods.

Radoslav Bozov •

Molecular biology and growth was primarily initiated on biochemical structural paradigms aiming to define functional spatial dynamics of molecules via assignation of various types of bondings – covalent and non-covalent – hydrogen, ionic , dipole-dipole, hydrophobic interactions.

  • Lab techniques based on z/m paradigm allowed separation, isolation and identification of bio substances with a general marker identity finding correlation between physiological/cellular states.
  • The development of electronic/x-ray technologies allowed zooming in nano space without capturing time.
  • NMR technology identified the existence of space topology of initial and final atomic states giving a highly limited light on time – energy axis of atomic interactions.
  • Sequence technology and genomic perturbations shed light on uncertainty of genomic dynamics and regulators of functional ever expanding networks.
  • Transition state theory coupled to structural complexity identification and enzymatic mechanisms ran up parallel to work on various phenomena of strings of nucleotides (oligomers and polymers) – illusion/observation of constructing models on the dynamics of protein-dna-rna interference.
  • The physical energetic constraints of biochemistry were inapplicable in open biological systems. Biologists have accepted observation as a sole driver towards re-evaluating models.
  • The separation of matter and time constraints emerged as deviation of energy and space constraints transforming into the full acceptance of code theory of life. One simple thing was left unnoticed over time –
  • the amount of information of quantum matter within a single codon is larger than that of a single amino acid. This violated all physical laws/principles known to work with a limited degree of certainty.
  • The limited amount of information analyzed by conventional sequence identity led to the notion of applicability of statistical measures of and PCR technology. Mutations were identified over larger scale of data.
  • Quantum chemistry itself is limited by discrete space/energy constraints; thus it transformed into concepts/principles in biology that possess highly limited physical values whatsoever.
  • The central dogma is partially broken as a result of
  1. regulatory constraints
  2. epigenetic phenomena and
  3. iRNA.

Large-scale code computational data run into uncertainty of the processes of evolution and its consequence of signaling transformation. All drugs owed their applicability and/or discovery to luck, with largely unpredictable side effects over time.

Other related articles in this Open Access Online Scientific Journal include the following:

Big Data in Genomic Medicine  lhb

http://pharmaceuticalintelligence.com/2012/12/17/big-data-in-genomic-medicine/

BRCA1 a tumour suppressor in breast and ovarian cancer – functions in transcription, ubiquitination and DNA repair S Saha    http://pharmaceuticalintelligence.com/2012/12/04/brca1-a-tumour-suppressor-in-breast-and-ovarian-cancer-functions-in-transcription-ubiquitination-and-dna-repair/

Computational Genomics Center: New Unification of Computational Technologies at Stanford A Lev-Ari  http://pharmaceuticalintelligence.com/2012/12/03/computational-genomics-center-new-unification-of-computational-technologies-at-stanford/

Personalized medicine gearing up to tackle cancer ritu saxena     http://pharmaceuticalintelligence.com/2013/01/07/personalized-medicine-gearing-up-to-tackle-cancer/

Differentiation Therapy – Epigenetics Tackles Solid Tumors sj Williams     http://pharmaceuticalintelligence.com/2013/01/03/differentiation-therapy-epigenetics-tackles-solid-tumors/

Mechanism involved in Breast Cancer Cell Growth: Function in Early Detection & Treatment A Lev-Ari   http://pharmaceuticalintelligence.com/2013/01/17/mechanism-involved-in-breast-cancer-cell-growth-function-in-early-detection-treatment/

The Molecular pathology of Breast Cancer Progression tilde barliya      http://pharmaceuticalintelligence.com/2013/01/10/the-molecular-pathology-of-breast-cancer-progression/

Gastric Cancer: Whole-genome reconstruction and mutational signatures A Lev-Ari     http://pharmaceuticalintelligence.com/2012/12/24/gastric-cancer-whole-genome-reconstruction-and-mutational-signatures-2/

Paradigm Shift in Human Genomics – Predictive Biomarkers and Personalized Medicine – Part 1 (pharmaceuticalintelligence.com) A Lev-Ari                  http://pharmaceuticalintelligence.com/2013/01/13/paradigm-shift-in-human-genomics-predictive-biomarkers-and-personalized-medicine-part-1/

LEADERS in Genome Sequencing of Genetic Mutations for Therapeutic Drug Selection in Cancer Personalized Treatment: Part 2 A Lev-Ari
http://pharmaceuticalintelligence.com/2013/01/13/leaders-in-genome-sequencing-of-genetic-mutations-for-therapeutic-drug-selection-in-cancer-personalized-treatment-part-2/

Personalized Medicine: An Institute Profile – Coriell Institute for Medical Research: Part 3 A Lev-Ari   http://pharmaceuticalintelligence.com/2013/01/13/personalized-medicine-an-institute-profile-coriell-institute-for-medical-research-part-3/

Harnessing Personalized Medicine for Cancer Management, Prospects of Prevention and Cure: Opinions of Cancer Scientific Leaders @ http://pharmaceuticalintelligence.com ALA    http://pharmaceuticalintelligence.com/2013/01/13/7000/Harnessing Personalized Medicine for Cancer Management, Prospects of Prevention and Cure: Opinions of Cancer Scientific Leaders/

GSK for Personalized Medicine using Cancer Drugs needs Alacris systems biology model to determine the in silico effect of the inhibitor in its “virtual clinical trial” A Lev-Ari     http://pharmaceuticalintelligence.com/2012/11/14/gsk-for-personalized-medicine-using-cancer-drugs-needs-alacris-systems-biology-model-to-determine-the-in-silico-effect-of-the-inhibitor-in-its-virtual-clinical-trial/

Recurrent somatic mutations in chromatin-remodeling and ubiquitin ligase complex genes in serous endometrial tumors S Saha   http://pharmaceuticalintelligence.com/2012/11/19/recurrent-somatic-mutations-in-chromatin-remodeling-and-ubiquitin-ligase-complex-genes-in-serous-endometrial-tumors/

Personalized medicine-based cure for cancer might not be far away ritu saxena   http://pharmaceuticalintelligence.com/2012/11/20/personalized-medicine-based-cure-for-cancer-might-not-be-far-away/

Human Variome Project: encyclopedic catalog of sequence variants indexed to the human genome sequence A Lev-Ari
http://pharmaceuticalintelligence.com/2012/11/24/human-variome-project-encyclopedic-catalog-of-sequence-variants-indexed-to-the-human-genome-sequence/

Prostate Cancer Cells: Histone Deacetylase Inhibitors Induce Epithelial-to-Mesenchymal Transition sjwilliams
http://pharmaceuticalintelligence.com/2012/11/30/histone-deacetylase-inhibitors-induce-epithelial-to-mesenchymal-transition-in-prostate-cancer-cells/

Inspiration From Dr. Maureen Cronin’s Achievements in Applying Genomic Sequencing to Cancer Diagnostics A Lev-Ari
http://pharmaceuticalintelligence.com/2013/01/10/inspiration-from-dr-maureen-cronins-achievements-in-applying-genomic-sequencing-to-cancer-diagnostics/

The “Cancer establishments” examined by James Watson, co-discoverer of DNA w/Crick, 4/1953 A Lev-Ari
http://pharmaceuticalintelligence.com/2013/01/09/the-cancer-establishments-examined-by-james-watson-co-discover-of-dna-wcrick-41953/

Directions for genomics in personalized medicine lhb    http://pharmaceuticalintelligence.com/2013/01/27/directions-for-genomics-in-personalized-medicine/

How mobile elements in “Junk” DNA promote cancer. Part 1: Transposon-mediated tumorigenesis. Sjwilliams
http://pharmaceuticalintelligence.com/2012/10/31/how-mobile-elements-in-junk-dna-prote-cancer-part1-transposon-mediated-tumorigenesis/

Mitochondria: More than just the “powerhouse of the cell” eritu saxena   http://pharmaceuticalintelligence.com/2012/07/09/mitochondria-more-than-just-the-powerhouse-of-the-cell/

Mitochondrial fission and fusion: potential therapeutic targets? Ritu saxena    http://pharmaceuticalintelligence.com/2012/10/31/mitochondrial-fission-and-fusion-potential-therapeutic-target/

Mitochondrial mutation analysis might be “1-step” away ritu saxena     http://pharmaceuticalintelligence.com/2012/08/14/mitochondrial-mutation-analysis-might-be-1-step-away/

mRNA interference with cancer expression lhb    http://pharmaceuticalintelligence.com/2012/10/26/mrna-interference-with-cancer-expression/

 

Author and Curator: Ritu Saxena, Ph.D.

A recent post by Dr. Margaret Baker entitled “Junk DNA codes for valuable miRNAs: non-coding DNA controls Diabetes” talks about how the ENCODE project is revealing new insights into the functions of non-coding regions of the human genome previously labeled as “junk DNA”. MicroRNAs (miRNAs), as stated by Dr. Baker, “are among the non-gene encoding sequences in the genome and have been shown to play a major post-transcriptional role in expression of multiple genes.”

The post has touched upon several aspects of miRNA including origin, function, and mechanism of action. This commentary is an extension of Dr. Baker’s post, expanding upon the mechanism of action of miRNAs along with their role in potential disease therapy.

microRNA: Revisiting the past

MicroRNAs were discovered relatively recently; in fact, it was only in 1998 that non-coding RNAs were shown to be involved in switching certain genes ‘on’ and ‘off’. In 2006, the Nobel Prize in Physiology or Medicine was awarded to Andrew Fire and Craig Mello for their discovery of this new role of RNA molecules (RNA interference).

Breakthrough research published in the September 2010 issue of the journal Nature, under the title “Mammalian microRNAs predominantly act to decrease target mRNA levels”, reported that mammalian microRNAs predominantly act by decreasing the levels of target mRNA. miRNAs were initially thought to repress protein output without changes in the corresponding mRNA levels. Guo et al. challenged this notion of ‘translational repression’ and concluded, on the basis of their experimental results, that ‘mRNA destabilization’ is responsible for the major part of the repression of protein expression by miRNAs.

The authors used ‘ribosome profiling’ to measure the overall effects of miRNAs on protein production and compared these to simultaneously measured effects on mRNA levels. Ribosome profiling maps the exact positions of ribosomes on transcripts after nucleases digest the exposed parts of transcripts that are not covered by ribosomes. miR-1 and miR-155, neither of which is normally expressed in HeLa cells, were introduced into the HeLa cell line. Another miRNA studied was miR-223, which is expressed in significant amounts in neutrophils. These miRNAs were chosen because proteomics research had already shown that they repress protein levels.

The authors found that miRNA-mediated repression was similar regardless of target expression level and stated that “for both ectopic and endogenous miRNA regulatory interactions, lowered mRNA levels account for most (≥84%) of the decreased protein production.” These results show that changes in mRNA levels closely reflect the impact of miRNAs on gene expression and indicate that destabilization of target mRNAs is the predominant reason for reduced protein output.

The authors suggested that this conclusion “will apply broadly to the vast majority of miRNA targeting interactions. If indeed general, this conclusion will be welcome news to biologists wanting to measure the ultimate impact of miRNAs on their direct regulatory targets.”

Both before and since the publication of this paper, many other miRNAs and their roles have been discovered. Information on miRNAs has been consolidated in a database that can be accessed online at http://www.mirbase.org/

microRNA: From bench to bedside

The scientific community began speculating about the role of non-coding RNAs in disease treatment soon after their discovery. One study demonstrating the use of a microRNA in cancer treatment was published in the September 2010 issue of the journal Nature Medicine: miR-380-5p represses p53 to control cellular survival and is associated with poor outcome in MYCN-amplified neuroblastoma

The p53 gene is a tumor suppressor gene, and its inactivation has been associated with some cancers, such as neuroblastoma. The study reported that microRNA-380 (miR-380) was able to repress the expression of the p53 gene in cancer patients, causing uninhibited cell survival and proliferation. The research group was able to decrease tumor size in vivo in a mouse model of neuroblastoma by delivering a miR-380 antagonist. The researchers also observed that inhibition of endogenous miR-380 in embryonic stem cells or neuroblastoma cells resulted in induction of p53 and extensive apoptotic cell death.

Thus, the success of the miR-380 antagonist in decreasing tumor size points to miRNAs as potential therapeutic targets for cancer treatment.

In conclusion, as stated by Dr. Baker in her post, “the miRNA data for tissues and specific cell types involved in disease pathology form a new approach to either detecting or possibly correcting gene (coding or non-coding) dysregulation. miRNA mimics and anti-miRNA agents are being developed as new therapeutic modalities.”

Reference:

Pharmaceutical Intelligence post, Author, Dr. Margaret Baker: Junk DNA codes for valuable miRNAs: non-coding DNA controls Diabetes

http://pharmaceuticalintelligence.com/2012/09/24/junk-dna-codes-for-valuable-mirnas/

 

Research articles: Mammalian microRNAs predominantly act to decrease target mRNA levels

miR-380-5p represses p53 to control cellular survival and is associated with poor outcome in MYCN-amplified neuroblastoma

Expert reviews- miRNA and Cancer treatment

 

News briefs: http://ygoy.com/2010/10/02/new-treatment-for-junk-dna-induced-cancers-discovered/

http://www.evolutionnews.org/2010/10/micrornas–once_dismissed_as_j038861.html

 


ENCODE data reveals important information from Genome Wide Association Studies relevant to understanding complex genetic diseases

Author: Ritu Saxena, Ph.D.

 

Introduction

“The depth, quality, and diversity of the ENCODE data are unprecedented,” said John Stamatoyannopoulos, professor of genome sciences at the University of Washington and one of the many principal investigators of the ENCODE project. ENCODE (Encyclopedia of DNA Elements) was indeed an ambitious project, launched as a pilot in 2003 and expanded in 2007 to analyze the whole genome and identify all the functional elements of the human genome. The findings were striking because they challenged the definition of a “gene” and “the central dogma of genetics” (gene-mRNA-protein). In fact, the non-coding portion of the genome, the so-called “junk DNA”, was found to contain elements crucial for gene regulation, and the project assigned a biochemical function to about 80% of the genome. These elements include, in large part, RNA transcripts that are not translated into proteins but might have a regulatory role. For details, refer to the findings published in Nature: The ENCODE Project Consortium, An integrated encyclopedia of DNA elements in the human genome, Nature 489, 57–74 (2012).

Key features of the data, as explained on the National Human Genome Research Institute website (National Human Genome Research Institute News feature), include comprehensive mapping of:

  • Protein-coding genes: Proteins are molecules made of amino acids linked together in a specific sequence; the amino acid sequence is encoded by the sequence of DNA subunits called nucleotides that make up genes.
  • Non-coding genes: Stretches of DNA that are read by the cell as if they were genes but do not encode proteins. These appear to help regulate the activity of the genome.
  • Chromatin structure features: Complex physical structures made from a combination of DNA and binding proteins that make up the contents of the nucleus and affect genome function.
  • Histone modifications: Histones are the proteins that make up the chromatin structures that help shape and control the genome. In addition, histone proteins can be physically modified by adding chemical groups, such as a methyl group, that further regulate genomic activity.
  • DNA methylation: Just like histones, DNA itself can have methyl groups added to it, in a process called DNA methylation. Chemically attaching methyl groups to DNA physically changes the ability of enzymes to reach the DNA and thus alters the gene expression pattern in cells. Methylation helps cells “remember what they are doing” or alter levels of gene expression, and it is a crucial part of normal development and cellular differentiation in higher organisms.
  • Transcription factor binding sites: Transcription factors are proteins that bind to specific DNA sequences, controlling the flow (or transcription) of genetic information from DNA to mRNA. Mapping the binding sites can help researchers understand how genomic activity is controlled.

How could ENCODE be helpful in the study of complex human diseases?

Complex diseases and Genome wide association studies (GWAS)

Coronary artery disease, type 2 diabetes and many forms of cancer are complex human diseases that have a significant genetic component. Unlike Mendelian disorders, which have defined loci, the genetic component of complex disorders takes the form of genetic variations across the genome that make an individual susceptible to these diseases.

Researchers have performed genome-wide association studies (GWAS) of the human genome, leading to the identification of thousands of DNA variants that could be linked with complex traits and diseases. However, identifying which of these variants, referred to as SNPs (single nucleotide polymorphisms), actually contribute to a disease, and understanding how they influence it, has remained largely a mystery.

How would ENCODE solve the puzzle?

The puzzle lies in interpreting how the SNPs found in the genome affect a person’s susceptibility to a particular trait or disease, and by what mechanism. As GWAS have shown, most variants associated with the phenotype of a trait or disease lie in non-coding regions of the genome. In fact, in more than 400 studies compiled in the GWAS catalog, only a small minority of the trait/disease-associated SNPs occur in protein-coding regions; the large majority (89%) are in non-coding regions. These variants fall in gene deserts that lie far from protein-coding regions, similar to the regions where cis-regulatory modules (CRMs) are found. CRMs such as promoters and enhancers are groups of binding sites for transcription factors, and the presence of transcription factors bound to these sites is a good indicator of potential regulatory regions.

The integrative analysis of ENCODE data has given important insights into the results of GWAS. Investigators have used ENCODE data as an initial guide to discover regulatory regions in which genetic variation affects a complex trait. Additionally, when the ENCODE study examined the GWAS SNPs associated with the phenotype of a trait, it found that they are enriched in DNase-sensitive regions, i.e., they lie in function-associated regions of the genome that can be bound by transcription factors affecting the regulation of gene expression. Thus, the project demonstrates that non-coding regions must be considered when interpreting GWAS results, and it provides strong motivation for reinterpreting previous GWAS findings.
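The idea of SNPs being “enriched” in DNase-sensitive regions can be illustrated with a small calculation. The sketch below is not the ENCODE analysis; it is a simplified illustration that assumes a single chromosome-like region, half-open (start, end) coordinates for the DNase-sensitive intervals, and a naive background in which SNPs would land uniformly at random. All function names, coordinates and positions are made up for the example.

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Merge overlapping half-open (start, end) intervals into a sorted list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def in_any_interval(pos, merged, starts):
    """True if pos falls inside one of the merged intervals (binary search)."""
    i = bisect_right(starts, pos) - 1
    return i >= 0 and merged[i][0] <= pos < merged[i][1]


def fold_enrichment(snp_positions, intervals, region_length):
    """Return (hits, fold enrichment over a uniform background)."""
    merged = merge_intervals(intervals)
    covered = sum(end - start for start, end in merged)
    if not snp_positions or covered == 0:
        raise ValueError("need at least one SNP and one non-empty interval")
    starts = [start for start, _ in merged]
    hits = sum(in_any_interval(p, merged, starts) for p in snp_positions)
    # observed fraction of SNPs in intervals / fraction of the region covered
    return hits, (hits * region_length) / (len(snp_positions) * covered)


# Hypothetical DNase-sensitive intervals and trait-associated SNP positions.
dnase_sites = [(100, 300), (250, 500), (4000, 4600)]
gwas_snps = [150, 450, 4100, 7000, 9000]

hits, fold = fold_enrichment(gwas_snps, dnase_sites, region_length=10_000)
print(f"{hits} of {len(gwas_snps)} SNPs fall in DNase-sensitive regions "
      f"({fold:.1f}-fold over a uniform background)")
```

A real analysis would work genome-wide, per chromosome, and would compare against matched control SNPs rather than a uniform background, but the underlying question is the same: do trait-associated variants fall in open chromatin more often than expected by chance?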

Using ENCODE Data to Interpret GWAS Results

ENCODE and predisposition to cancer:

c-Myc, a proto-oncogene, codes for a transcription factor that, when expressed constitutively, leads to uninhibited cell proliferation resulting in cancer. Common variants within a ~1 Mb region upstream of the c-Myc gene have been associated with cancers of the colon, prostate, and breast. Several SNPs have been reported in this region that, although they affect the phenotype, lie in the distal cis-region of the MYC gene. Aligning the ENCODE data for this region with the significant variants from GWAS also reveals that key variants are found in the transcription factor-occupied DNA segments mapped by the consortium. One variant, rs698327, lies within a DNase hypersensitive site that is bound by several transcription factors and the enhancer-associated protein p300, and that carries histone modifications characteristic of enhancers (high H3K4me1, low H3K4me3). ENCODE data thus indicate that non-coding regions at the human chromosome 8q24 locus are associated with cancer and, as in the case of the c-Myc gene, similar studies of other cancer-related genes could help explain predisposition to cancer.
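The combination of features described above (open chromatin, p300 binding, high H3K4me1 and low H3K4me3) can be thought of as a simple rule for flagging enhancer-like sites. The sketch below only illustrates that rule; the class, the field names and the cut-off values are hypothetical, and real analyses rely on statistical models rather than fixed thresholds.

```python
from dataclasses import dataclass


@dataclass
class SiteAnnotation:
    """Hypothetical summary of signals overlapping a variant."""
    dnase_hypersensitive: bool
    p300_bound: bool
    h3k4me1: float  # arbitrary signal units
    h3k4me3: float  # arbitrary signal units


def looks_like_enhancer(site, high=2.0, low=1.0):
    """Apply the 'high H3K4me1, low H3K4me3' rule with assumed cut-offs."""
    return (
        site.dnase_hypersensitive
        and site.p300_bound
        and site.h3k4me1 >= high
        and site.h3k4me3 <= low
    )


# Made-up example sites, not measurements for any real variant.
enhancer_like = SiteAnnotation(True, True, h3k4me1=5.0, h3k4me3=0.4)
promoter_like = SiteAnnotation(True, False, h3k4me1=0.5, h3k4me3=8.0)

print(looks_like_enhancer(enhancer_like))
print(looks_like_enhancer(promoter_like))
```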

ENCODE and fetal hemoglobin expression:

Another example of the use of ENCODE data is the regulation of fetal hemoglobin. ENCODE data predicted several regions involved in the regulation of fetal hemoglobin, and these predicted regions were found to lie close to SNPs in the BCL11A gene that are associated with persistent expression of fetal hemoglobin.

Future perspective

As the above examples show, the ENCODE data indicate that genetic variants do affect the regulated expression of a target gene. Recently, several research groups in the UK performed a large-scale GWAS to determine the genetic predisposition to fracture risk. The collaborative effort, published in a recent issue of PLoS Genetics (http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1002745), set out to identify genetic variants associated with cortical bone thickness (CBT) and bone mineral density (BMD), using data from more than 10,000 subjects. The study generated a wealth of data, including the finding that SNPs in WNT16 and its adjacent gene, FAM3C, are relevant to CBT and BMD. In this case, ENCODE data could help in extracting more detailed information, such as identifying additional SNPs and the regulatory context of the genes involved. Thus, ENCODE data could be immensely useful in interpreting associations between disease and DNA sequences that vary from person to person.

Sources:

Research articles

An integrated encyclopedia of DNA elements in the human genome

A User’s Guide to the Encyclopedia of DNA Elements (ENCODE)

What does our genome encode?

Genome-wide Epigenetic Data Facilitate Understanding of Disease Susceptibility Association Studies

Genomics: ENCODE explained

ENCODE Project Writes Eulogy For Junk DNA

WNT16 Influences Bone Mineral Density, Cortical Bone Thickness, Bone Strength, and Osteoporotic Fracture Risk

News articles

ENCODE project: In massive genome analysis new data suggests ‘gene’ redefinition

National Human Genome Research Institute News feature

Related posts

Expanding the Genetic Alphabet and linking the genome to the metabolome

Junk DNA codes for valuable miRNAs: non-coding DNA controls Diabetes

ENCODE Findings as Consortium

