Posts Tagged ‘genetic code’

The Search for the Genetic Code

Larry H. Bernstein, MD, FCAP, Curator




Discovery of DNA Structure and Function: Watson and Crick

By: Leslie A. Pray, Ph.D. © 2008 Nature Education

Citation: Pray, L. (2008) Discovery of DNA structure and function: Watson and Crick. Nature Education 1(1):100

The landmark ideas of Watson and Crick relied heavily on the work of other scientists. What did the duo actually discover?

Many people believe that American biologist James Watson and English physicist Francis Crick discovered DNA in the 1950s. In reality, this is not the case. Rather, DNA was first identified in the late 1860s by Swiss chemist Friedrich Miescher. Then, in the decades following Miescher’s discovery, other scientists, notably Phoebus Levene and Erwin Chargaff, carried out a series of research efforts that revealed additional details about the DNA molecule, including its primary chemical components and the ways in which they join with one another. Without the scientific foundation provided by these pioneers, Watson and Crick might never have reached their groundbreaking conclusion of 1953: that the DNA molecule exists in the form of a three-dimensional double helix.

The First Piece of the Puzzle: Miescher Discovers DNA

Although few people realize it, 1869 was a landmark year in genetic research, because it was the year in which Swiss physiological chemist Friedrich Miescher first identified what he called “nuclein” inside the nuclei of human white blood cells. (The term “nuclein” was later changed to “nucleic acid” and eventually to “deoxyribonucleic acid,” or “DNA.”) Miescher’s plan was to isolate and characterize not the nuclein (which nobody at that time realized existed) but instead the protein components of leukocytes (white blood cells). Miescher thus made arrangements for a local surgical clinic to send him used, pus-coated patient bandages; once he received the bandages, he planned to wash them, filter out the leukocytes, and extract and identify the various proteins within the white blood cells. But when he came across a substance from the cell nuclei that had chemical properties unlike any protein, including a much higher phosphorus content and resistance to proteolysis (protein digestion), Miescher realized that he had discovered a new substance (Dahm, 2008). Sensing the importance of his findings, Miescher wrote, “It seems probable to me that a whole family of such slightly varying phosphorus-containing substances will appear, as a group of nucleins, equivalent to proteins” (Wolf, 2003).

More than 50 years passed before the significance of Miescher’s discovery of nucleic acids was widely appreciated by the scientific community. For instance, in a 1971 essay on the history of nucleic acid research, Erwin Chargaff noted that in a 1961 historical account of nineteenth-century science, Charles Darwin was mentioned 31 times, Thomas Huxley 14 times, but Miescher not even once. This omission is all the more remarkable given that, as Chargaff also noted, Miescher’s discovery of nucleic acids was unique among the discoveries of the four major cellular components (i.e., proteins, lipids, polysaccharides, and nucleic acids) in that it could be “dated precisely… [to] one man, one place, one date.”


Laying the Groundwork: Levene Investigates the Structure of DNA

Meanwhile, even as Miescher’s name fell into obscurity by the twentieth century, other scientists continued to investigate the chemical nature of the molecule formerly known as nuclein. One of these other scientists was Russian biochemist Phoebus Levene. A physician turned chemist, Levene was a prolific researcher, publishing more than 700 papers on the chemistry of biological molecules over the course of his career. Levene is credited with many firsts. For instance, he was the first to discover the order of the three major components of a single nucleotide (phosphate-sugar-base); the first to discover the carbohydrate component of RNA (ribose); the first to discover the carbohydrate component of DNA (deoxyribose); and the first to correctly identify the way RNA and DNA molecules are put together.

During the early years of Levene’s career, neither Levene nor any other scientist of the time knew how the individual nucleotide components of DNA were arranged in space; discovery of the sugar-phosphate backbone of the DNA molecule was still years away. The large number of molecular groups made available for binding by each nucleotide component meant that there were numerous alternate ways that the components could combine. Several scientists put forth suggestions for how this might occur, but it was Levene’s “polynucleotide” model that proved to be the correct one. Based upon years of work using hydrolysis to break down and analyze yeast nucleic acids, Levene proposed that nucleic acids were composed of a series of nucleotides, and that each nucleotide was in turn composed of just one of four nitrogen-containing bases, a sugar molecule, and a phosphate group. Levene made his initial proposal in 1919, discrediting other suggestions that had been put forth about the structure of nucleic acids. In Levene’s own words, “New facts and new evidence may cause its alteration, but there is no doubt as to the polynucleotide structure of the yeast nucleic acid” (1919).

Indeed, many new facts and much new evidence soon emerged and caused alterations to Levene’s proposal. One key discovery during this period involved the way in which nucleotides are ordered. Levene proposed what he called a tetranucleotide structure, in which the nucleotides were always linked in the same order (i.e., G-C-T-A-G-C-T-A and so on). However, scientists eventually realized that Levene’s proposed tetranucleotide structure was overly simplistic and that the order of nucleotides along a stretch of DNA (or RNA) is, in fact, highly variable. Despite this realization, Levene’s proposed polynucleotide structure was accurate in many regards. For example, we now know that DNA is in fact composed of a series of nucleotides and that each nucleotide has three components: a phosphate group; either a ribose (in the case of RNA) or a deoxyribose (in the case of DNA) sugar; and a single nitrogen-containing base. We also know that there are two basic categories of nitrogenous bases: the purines (adenine [A] and guanine [G]), each with two fused rings, and the pyrimidines (cytosine [C], thymine [T], and uracil [U]), each with a single ring. Furthermore, it is now widely accepted that RNA contains only A, G, C, and U (no T), whereas DNA contains only A, G, C, and T (no U) (Figure 1).
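The base categories just described can be captured in a few lines of code. The following Python snippet is purely illustrative (it is not from the article); it simply encodes the purine/pyrimidine split and the DNA/RNA alphabets:

```python
# Base categories as described above: purines have two fused rings,
# pyrimidines a single ring. DNA uses T where RNA uses U.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}
DNA_BASES = {"A", "G", "C", "T"}
RNA_BASES = {"A", "G", "C", "U"}

def classify(base):
    """Return (ring category, list of nucleic acids that use the base)."""
    kind = "purine" if base in PURINES else "pyrimidine"
    found_in = [name for name, alphabet in (("DNA", DNA_BASES), ("RNA", RNA_BASES))
                if base in alphabet]
    return kind, found_in

assert classify("T") == ("pyrimidine", ["DNA"])    # thymine: DNA only
assert classify("U") == ("pyrimidine", ["RNA"])    # uracil: RNA only
assert classify("G") == ("purine", ["DNA", "RNA"])
```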

Figure 1: The chemical structure of a nucleotide.

A single nucleotide is made up of three components: a nitrogen-containing base, a five-carbon sugar, and a phosphate group. The nitrogenous base is either a purine or a pyrimidine. The five-carbon sugar is either a ribose (in RNA) or a deoxyribose (in DNA) molecule.

© 2013 Nature Education

Strengthening the Foundation: Chargaff Formulates His “Rules”

Erwin Chargaff was one of a handful of scientists who expanded on Levene’s work by uncovering additional details of the structure of DNA, thus further paving the way for Watson and Crick. Chargaff, an Austrian biochemist, had read the famous 1944 paper by Oswald Avery and his colleagues at Rockefeller University, which demonstrated that hereditary units, or genes, are composed of DNA. This paper had a profound impact on Chargaff, inspiring him to launch a research program that revolved around the chemistry of nucleic acids. Of Avery’s work, Chargaff (1971) wrote the following:

“This discovery, almost abruptly, appeared to foreshadow a chemistry of heredity and, moreover, made probable the nucleic acid character of the gene… Avery gave us the first text of a new language, or rather he showed us where to look for it. I resolved to search for this text.”

As his first step in this search, Chargaff set out to see whether there were any differences in DNA among different species. After developing a new paper chromatography method for separating and identifying small amounts of organic material, Chargaff reached two major conclusions (Chargaff, 1950). First, he noted that the nucleotide composition of DNA varies among species. In other words, the same nucleotides do not repeat in the same order, as proposed by Levene. Second, Chargaff concluded that almost all DNA, no matter what organism or tissue type it comes from, maintains certain properties, even as its composition varies. In particular, the amount of adenine (A) is usually similar to the amount of thymine (T), and the amount of guanine (G) usually approximates the amount of cytosine (C). In other words, the total amount of purines (A + G) and the total amount of pyrimidines (C + T) are usually nearly equal. (This second major conclusion is now known as “Chargaff’s rule.”) Chargaff’s research was vital to the later work of Watson and Crick, but Chargaff himself could not imagine the explanation of these relationships: specifically, that A pairs with T and C pairs with G within the molecular structure of DNA (Figure 2).
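Chargaff’s parity relations are easy to verify computationally. The short Python sketch below, added here for illustration with an invented sequence, builds a double-stranded molecule and checks that A = T, G = C, and purines = pyrimidines:

```python
from collections import Counter

def base_composition(seq):
    """Count each base and return its fraction of the sequence."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    return {b: counts[b] / total for b in "ACGT"}

# Illustrative duplex: build the second strand by complementing the
# first, then pool both strands' bases, as Chargaff's bulk chemical
# measurements effectively did.
strand = "ATGCGGCTAATCCGATACGT"
complement = strand.translate(str.maketrans("ACGT", "TGCA"))
comp = base_composition(strand + complement)

# For double-stranded DNA the parity relations hold exactly.
assert comp["A"] == comp["T"]
assert comp["G"] == comp["C"]
assert comp["A"] + comp["G"] == comp["C"] + comp["T"]  # purines = pyrimidines
```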

Figure 2: What is Chargaff’s rule?

All DNA follows Chargaff’s Rule, which states that the total number of purines in a DNA molecule is equal to the total number of pyrimidines.

© 2013 Nature Education

Putting the Evidence Together: Watson and Crick Propose the Double Helix

Chargaff’s realization that A = T and C = G, combined with some crucially important X-ray crystallography work by researchers Rosalind Franklin and Maurice Wilkins, contributed to Watson and Crick’s derivation of the three-dimensional, double-helical model for the structure of DNA. Watson and Crick’s discovery was also made possible by recent advances in model building, or the assembly of possible three-dimensional structures based upon known molecular distances and bond angles, a technique advanced by American chemist Linus Pauling. In fact, Watson and Crick were worried that they would be “scooped” by Pauling, who proposed a different model for the three-dimensional structure of DNA just months before they did. In the end, however, Pauling’s prediction was incorrect.

Using cardboard cutouts representing the individual chemical components of the four bases and other nucleotide subunits, Watson and Crick shifted molecules around on their desktops, as though putting together a puzzle. They were misled for a while by an erroneous understanding of how the different elements in thymine and guanine (specifically, the carbon, nitrogen, hydrogen, and oxygen rings) were configured. Only upon the suggestion of American scientist Jerry Donohue did Watson decide to make new cardboard cutouts of the two bases, to see if perhaps a different atomic configuration would make a difference. It did. Not only did the complementary bases now fit together perfectly (i.e., A with T and C with G), with each pair held together by hydrogen bonds, but the structure also reflected Chargaff’s rule (Figure 3).

Figure 3: The double-helical structure of DNA.

The 3-dimensional double helix structure of DNA, correctly elucidated by James Watson and Francis Crick. Complementary bases are held together as a pair by hydrogen bonds.

© 2013 Nature Education

Although scientists have made some minor changes to the Watson and Crick model, or have elaborated upon it, since its inception in 1953, the model’s four major features remain the same today. These features are as follows:

  • DNA is a double-stranded helix, with the two strands connected by hydrogen bonds. A bases are always paired with Ts, and Cs are always paired with Gs, which is consistent with and accounts for Chargaff’s rule.
  • Most DNA double helices are right-handed; that is, if you were to hold your right hand out, with your thumb pointed up and your fingers curled around your thumb, your thumb would represent the axis of the helix and your fingers would represent the sugar-phosphate backbone. Only one type of DNA, called Z-DNA, is left-handed.
  • The DNA double helix is anti-parallel, which means that the 5′ end of one strand is paired with the 3′ end of its complementary strand (and vice versa). As shown in Figure 4, nucleotides are linked to each other by their phosphate groups, which bind the 3′ end of one sugar to the 5′ end of the next sugar.
  • Not only are the DNA base pairs connected via hydrogen bonding, but the outer edges of the nitrogen-containing bases are exposed and available for potential hydrogen bonding as well. These hydrogen bonds provide easy access to the DNA for other molecules, including the proteins that play vital roles in the replication and expression of DNA (Figure 4).
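The antiparallel, complementary pairing summarized above is exactly what the standard reverse-complement operation expresses. A minimal Python sketch, added here for illustration:

```python
# Watson-Crick pairing: A-T and G-C. Because the two strands run
# antiparallel, the partner of a strand read 5'->3' is its reverse
# complement, also read 5'->3'.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(strand):
    """Return the antiparallel partner strand, written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(strand))

top = "ATGCCGTA"                   # 5'->3'
bottom = reverse_complement(top)   # also 5'->3'

# Aligning the 5' end of one strand against the 3' end of the other
# puts every base opposite its complement.
for b1, b2 in zip(top, reversed(bottom)):
    assert PAIR[b1] == b2
```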

Figure 4: Base pairing in DNA.

Two hydrogen bonds connect T to A; three hydrogen bonds connect G to C. The sugar-phosphate backbones (grey) run anti-parallel to each other, so that the 3’ and 5’ ends of the two strands are aligned.

© 2013 Nature Education

One of the ways that scientists have elaborated on Watson and Crick’s model is through the identification of three different conformations of the DNA double helix. In other words, the precise geometries and dimensions of the double helix can vary. The most common conformation in most living cells (which is the one depicted in most diagrams of the double helix, and the one proposed by Watson and Crick) is known as B-DNA. There are also two other conformations: A-DNA, a shorter and wider form that has been found in dehydrated samples of DNA and rarely under normal physiological circumstances; and Z-DNA, a left-handed conformation. Z-DNA is a transient form of DNA, only occasionally existing in response to certain types of biological activity (Figure 5). Z-DNA was first discovered in 1979, but its existence was largely ignored until recently. Scientists have since discovered that certain proteins bind very strongly to Z-DNA, suggesting that Z-DNA plays an important biological role in protection against viral disease (Rich & Zhang, 2003).
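For quick reference, the three conformations can be collected in a small lookup table; this is an illustrative summary of the text above, not data from the article:

```python
# The three DNA double-helix conformations described in the text.
CONFORMATIONS = {
    "A-DNA": {"handedness": "right",
              "note": "shorter and wider; seen in dehydrated samples"},
    "B-DNA": {"handedness": "right",
              "note": "most common in living cells; the Watson-Crick model"},
    "Z-DNA": {"handedness": "left",
              "note": "transient; arises with certain biological activity"},
}

# Z-DNA is the only left-handed form.
left_handed = [name for name, props in CONFORMATIONS.items()
               if props["handedness"] == "left"]
assert left_handed == ["Z-DNA"]
```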

Figure 5: Three different conformations of the DNA double helix.

(A) A-DNA is a short, wide, right-handed helix. (B) B-DNA, the structure proposed by Watson and Crick, is the most common conformation in most living cells. (C) Z-DNA, unlike A- and B-DNA, is a left-handed helix.

© 2014 Nature Education. Adapted from Pierce, Benjamin. Genetics: A Conceptual Approach, 2nd ed.


Watson and Crick were not the discoverers of DNA, but rather the first scientists to formulate an accurate description of this molecule’s complex, double-helical structure. Moreover, Watson and Crick’s work was directly dependent on the research of numerous scientists before them, including Friedrich Miescher, Phoebus Levene, and Erwin Chargaff. Thanks to researchers such as these, we now know a great deal about genetic structure, and we continue to make great strides in understanding the human genome and the importance of DNA to life and health.

References and Recommended Reading

Chargaff, E. Chemical specificity of nucleic acids and mechanism of their enzymatic degradation. Experientia 6, 201–209 (1950)

—. Preface to a grammar of biology. Science 171, 637–642 (1971)

Dahm, R. Discovering DNA: Friedrich Miescher and the early years of nucleic acid research. Human Genetics 122, 565–581 (2008)

Levene, P. A. The structure of yeast nucleic acid. IV. Ammonia hydrolysis. Journal of Biological Chemistry 40, 415–424 (1919)

Rich, A., & Zhang, S. Z-DNA: The long road to biological function. Nature Reviews Genetics 4, 566–572 (2003)

Watson, J. D., & Crick, F. H. C. A structure for deoxyribose nucleic acid. Nature 171, 737–738 (1953)

Wolf, G. Friedrich Miescher: The man who discovered DNA. Chemical Heritage 21, 10–11, 37–41 (2003)

see also —


DNA Damage & Repair: Mechanisms for Maintaining DNA Integrity

DNA Replication and Causes of Mutation

Genetic Mutation

Major Molecular Events of DNA Replication

Semi-Conservative DNA Replication: Meselson and Stahl


Barbara McClintock and the Discovery of Jumping Genes (Transposons)

Functions and Utility of Alu Jumping Genes

Transposons, or Jumping Genes: Not Junk DNA?

Transposons: The Jumping Genes


DNA Transcription

RNA Transcription by RNA Polymerase: Prokaryotes vs Eukaryotes

Translation: DNA to mRNA to Protein

What is a Gene? Colinearity and Transcription Units





Regulatory DNA engineered

Larry H. Bernstein, MD, FCAP, Curator



New Type of CRISPR Screen Probes the Regulatory Genome

Aaron Krol

February 8, 2016 | When a geneticist stares down the 3 billion DNA base pairs of the human genome, searching for a clue to what’s gone awry in a single patient, it helps to narrow the field. One of the most popular places to look is the exome, the tiny fraction of our DNA―less than 2%―that actually codes for proteins. For patients with rare genetic diseases, which might be fully explained by one key mutation, many studies sequence the whole exome and leave all the noncoding DNA out. Similarly, personalized cancer tests, which can help bring to light unexpected treatment options, often sequence the tumor exome, or a smaller panel of protein-coding genes.

Unfortunately, we know that’s not the whole picture. “There are a substantial number of noncoding regions that are just as effective at turning off a gene as a mutation in the gene itself,” says Richard Sherwood, a geneticist at Brigham and Women’s Hospital in Boston. “Exome sequencing is not going to be a good proxy for what genes are working.”

Sherwood studies regulatory DNA, the vast segment of the genome that governs which genes are turned on or off in any cell at a given time. It’s a confounding area of genetics; we don’t even know how much of the genome is made up of these regulatory elements. While genes can be recognized by the presence of “start” and “stop” codons―sequences of three DNA letters that tell the cell’s molecular machinery which stretches of DNA to transcribe into RNA, and eventually into protein―there are no definite signs like this for regulatory DNA.
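As a toy illustration of that start/stop logic, the sketch below scans one strand, in-frame from each start codon, for the first stop codon. It is an invented example added here, not code from the study:

```python
# Illustrative ORF scan: an open reading frame begins with the start
# codon ATG and ends at the first in-frame stop codon (TAA, TAG, TGA).
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(seq):
    """Return (start, end) half-open coordinates of simple ORFs on one strand."""
    orfs = []
    for i in range(len(seq) - 2):
        if seq[i:i + 3] == "ATG":
            for j in range(i + 3, len(seq) - 2, 3):
                if seq[j:j + 3] in STOPS:
                    orfs.append((i, j + 3))
                    break
    return orfs

# Toy sequence: ATG ... TAA embedded in flanking bases.
seq = "CCATGAAATTTGGGTAACC"
assert find_orfs(seq) == [(2, 17)]  # one 5-codon frame, ATG through TAA
```

Regulatory DNA offers no such fixed start and stop signals, which is why no comparably simple scan can find it.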

Instead, studies to discover new regulatory elements have been somewhat trial-and-error. If you suspect a gene’s activity might be regulated by a nearby DNA element, you can inhibit that element in a living cell, and see if your gene shuts down with it.

With these painstaking experiments, scientists can slowly work their way through potential regulatory regions―but they can’t sweep across the genome with the kind of high-throughput testing that other areas of genetics thrive on. “Previously, you couldn’t do these sorts of tests in a large form, like 4,000 of them at once,” says David Gifford, a computational biologist at MIT. “You would really need to have a more hypothesis-directed methodology.”

Recently, Gifford and Sherwood collaborated on a paper, published in Nature Biotechnology, which presents a new method for testing thousands of DNA loci for regulatory activity at once. Their assay, called MERA (multiplexed editing regulatory assay), is built on the recent technology boom in CRISPR-Cas9 gene editing, which lets scientists quickly and easily cut specific sequences of DNA out of the genome.

So far, their team, including lead author Nisha Rajagopal from Gifford’s lab, has used MERA to study the regulation of four genes involved in the development of embryonic stem cells. Already, the results have defied the accepted wisdom about regulatory DNA. Many areas of the genome flagged by MERA as important factors in gene expression do not fall into any known categories of regulatory elements, and would likely never have been tested with previous-generation methods.

“Our approach allows you to look away from the lampposts,” says Sherwood. “The more unbiased you can be, the more we’ll actually know.”

A New Kind of CRISPR Screen

In the past three years, CRISPR-Cas9 experiments have taken all areas of molecular biology by storm, and Sherwood and Gifford are far from the first to use the technology to run large numbers of tests in parallel. CRISPR screens are an excellent way to learn which genes are involved in a cellular process, like tumor growth or drug resistance. In these assays, scientists knock out entire genes, one by one, and see what happens to cells without them.

This kind of CRISPR screen, however, operates on too small a scale to study the regulatory genome. For each gene knocked out in a CRISPR screen, you have to engineer a strain of virus to deliver a “guide RNA” into the cellular genome, showing the vicelike Cas9 molecule which DNA region to cut. That works well if you know exactly where a gene lies and only need to cut it once—but in a high-throughput regulatory test, you would want to blanket vast stretches of DNA with cuts, not knowing which areas will turn out to contain regulatory elements. Creating a new virus for each of these cuts is hugely impractical.

The insight behind MERA is that, with the right preparation, most of the genetic engineering can be done in advance. Gifford and Sherwood’s team used a standard viral vector to put a “dummy” guide RNA sequence, one that wouldn’t tell Cas9 to cut anything, into an embryonic stem cell’s genome. Then they grew plenty of cells with this prebuilt CRISPR system inside, and attacked each one with a Cas9 molecule targeted to the dummy sequence, chopping out the fake guide.

Normally, the result would just be a gap in the CRISPR system where the guide once was. But along with Cas9, the researchers also exposed the cells to new, “real” guide RNA sequences. Through a DNA repair mechanism called homologous recombination, the cells dutifully patched over the gaps with new guides, whose sequences were very similar to the missing dummy code. At the end of the process, each cell had a unique guide sequence ready to make cuts at a specific DNA locus—just like in a standard CRISPR screen, but with much less hands-on engineering.

By using a large enough library of guide RNA molecules, a MERA screen can include thousands of cuts that completely tile a broad region of the genome, providing an agnostic look at anywhere regulatory elements might be hiding. “It’s a lot easier [than a typical CRISPR screen],” says Sherwood. “The day the library comes in, you just perform one PCR reaction, and the cells do the rest of the work.”

In the team’s first batch of MERA screens, they created almost 4,000 guide RNAs for each gene they studied, covering roughly 40,000 DNA bases of the “cis-regulatory region,” or the area surrounding the gene where most regulatory elements are thought to lie. It’s unclear just how large any gene’s cis-regulatory region is, but 40,000 bases is a big leap from the highly targeted assays that have come before.
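To make the tiling idea concrete, here is a simplified Python sketch. It is not the authors’ library-design code, and the sequence and helper names are invented; it scans one strand of a toy region for NGG PAM sites, as SpCas9 requires, and records the 20-base spacer immediately 5' of each PAM:

```python
import re

def tile_guides(region, spacer_len=20):
    """Enumerate candidate Cas9 guide spacers across one strand of a region.

    SpCas9 needs an NGG PAM immediately 3' of the protospacer, so we
    find every GG dinucleotide (overlapping matches included) and take
    the spacer_len bases preceding the N of the PAM. Scanning the
    reverse strand as well would roughly double the guide density.
    """
    guides = []
    for m in re.finditer(r"(?=GG)", region):
        pam_n = m.start() - 1            # position of the N in NGG
        start = pam_n - spacer_len
        if start >= 0:
            guides.append((start, region[start:pam_n]))
    return guides

# Toy 60-base "cis-regulatory region"; a real MERA library tiled
# ~40,000 bases with ~4,000 guides per gene.
region = "ACGTTGCAGGTACCGTTAGGCATCGATGGCCTAGGTACGATCGGATCCGTAGGCTAACGG"
for pos, spacer in tile_guides(region):
    print(pos, spacer)
```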

“We’re now starting to do follow-up studies where we increase the number of guide RNAs,” Sherwood adds. “Eventually, what you’d like is to be able to tile an entire chromosome.”

Far From the Lampposts

Sherwood and Gifford tried to focus their assays on regions that would be rich in regulatory elements. To that end, they made sure their guide RNAs covered parts of the genome with well-known signs of regulatory activity, like histone markers and transcription factor binding sites. For many of these areas, Cas9 cuts did, in fact, shut down gene expression in the MERA screens.

But the study also targeted regions around each gene that were empty of any known regulatory features. “We tiled some other regions that we thought might serve as negative controls,” explains Gifford. “But they turned out not to be negative at all.”

The study’s most surprising finding was that several cuts to seemingly random areas of the genome caused genes to become nonfunctional. The authors named these DNA regions “unmarked regulatory elements,” or UREs. They were especially prevalent around the genes Tdgf1 and Zfp42, and in many cases, seemed to be every bit as necessary to gene activity as more predictable hits on the MERA screen.

These results caught the researchers so off guard that it was natural to wonder if MERA screens are prone to false positives. Yet follow-up experiments strongly supported the existence of UREs. Swapping the guide RNAs between a Tdgf1 MERA screen and a Zfp42 screen, for example, produced almost no positive results: the UREs’ regulatory effects were indeed specific to the genes near them.

In a more specific test, the researchers chose a particular URE connected to Tdgf1, and cut it out of a brand new population of cells for a closer look. “We showed that, if we deleted that region from the genome, the cells lost expression of the gene,” says Sherwood. “And then when we put it back in, the gene became expressed again. Which was good proof to us that the URE itself was responsible.”

From these results, it seems likely that follow-up MERA screens will find even more unknown stretches of regulatory DNA. Gifford and Sherwood’s experiments didn’t try to cover as much ground around their target genes as they might have, because the researchers assumed that MERA would mostly confirm what was already known. At best, they hoped MERA would rule out some suspected regulatory regions, and help show which regulatory elements have the biggest effect on gene expression.

“We tended to prioritize regions that had been known before,” Sherwood says. “Unfortunately, in the end, our datasets weren’t ideally suited to discovering these UREs.”

Getting to Basic Principles

MERA could open up huge swaths of the regulatory genome to investigation. Compared to an ordinary CRISPR screen, says Sherwood, “there’s only upside,” as MERA is cheaper, easier, and faster to run.

Still, interpreting the results is not trivial. Like other CRISPR screens, MERA makes cuts at precise points in the genome, but does not tell cells to repair those cuts in any particular way. As a result, a population of cells all carrying the same guide RNA can have a huge variety of different gaps and scars in their genomes, typically deletions in the range of 10 to 100 bases long. Gifford and Sherwood created up to 100 cells for each of their guides, and sometimes found that gene expression was affected in some but not all of them; only sequencing the genomes of their mutated cells could reveal exactly what changes had been made.

By repeating these experiments many times, and learning which mutations affect gene expression, it will eventually be possible to pin down the exact DNA bases that make up each regulatory element. Future studies might even be able to distinguish between regulatory elements with small and large effects on gene expression. In Gifford and Sherwood’s MERA screens, the target genes were altered to produce a green fluorescent protein, so the results were read in terms of whether cells gave off fluorescent light. But a more precise, though expensive, approach would be to perform RNA sequencing, to learn which cuts reduced the cell’s ability to transcribe a gene into RNA, and by how much.

A MERA screen offers a rich volume of data on the behavior of the regulatory genome. Yet, as with so much else in genetics, there are few robust principles to let scientists know where they should be focusing their efforts. Histone markers provide only a very rough sketch of regulatory elements, often proving to be red herrings on closer examination. And the existence of UREs, if confirmed by future experiments, shows that we don’t yet even know which areas of the genome to rule out in the hunt for regulatory regions.

“Every dataset we get comes closer and closer to computational principles that let us predict these regions,” says Sherwood. As more studies are conducted, patterns may emerge in the DNA sequences of regulatory elements that link UREs together, or reveal which histone markers truly point toward regulatory effects. There might also be functional clues hidden in these sequences, hinting at what is happening on a molecular level as regulatory elements turn genes on and off in the course of a cell’s development.

For now, however, the data is still rough and disorganized. For better and for worse, high-throughput tools like MERA are becoming the foundation for most discoveries in genetics—and that means there is a lot more work to do before the regulatory genome begins to come into focus.

CORRECTED 2/9/16: Originally, this story incorrectly stated that only certain cell types could be assayed with MERA for reasons related to homologous recombination. In fact, the authors see no reason MERA could not be applied to any in vitro cell line, and hope to perform screens in a wide range of cell types. The text has been edited to correct the error.




CRACKING THE CODE OF HUMAN LIFE: The Birth of BioInformatics & Computational Genomics

Author and Curator: Larry H Bernstein, MD, FCAP


The previous Part II, Cracking the Code of Human Life: From Molecular Biology to Translational Medicine: How Far Have We Come, and Where Does It Lead Us?, is broken into a three-part series.

Part IIA. “CRACKING THE CODE OF HUMAN LIFE: Milestones along the Way” reviews the Human Genome Project and the decade beyond.

Part IIB. “CRACKING THE CODE OF HUMAN LIFE: The Birth of BioInformatics & Computational Genomics” lays out the manifold multivariate systems-analysis tools that have moved the science forward to a ground that ensures clinical application.

Part IIC. “CRACKING THE CODE OF HUMAN LIFE: Recent Advances in Genomic Analysis and Disease” extends the discussion to advances in the management of patients, as well as providing a roadmap for pharmaceutical drug targeting.

Part III concludes with Ubiquitin and its role in Signaling and Regulatory Control.

This article is a continuation of a previous discussion on the role of genomics in discovery of therapeutic targets titled, Directions for Genomics in Personalized Medicine, which focused on: key drivers of cellular proliferation, stepwise mutational changes coinciding with cancer progression, and potential therapeutic targets for reversal of the process. And it is a direct extension of Cracking the Code of Human Life (Part I): “the initiation phase of molecular biology”.

These articles review a web-like connectivity between inter-connected scientific discoveries, as significant findings have led to novel hypotheses and many expectations over the last 75 years. This largely post-WWII revolution has driven our understanding of biological and medical processes at an exponential pace, owing to successive discoveries of chemical structure:

  • the basic building blocks of DNA and proteins,
  • nucleotide and protein-protein interactions,
  • protein folding and allostery,
  • genomic structure,
  • DNA replication,
  • nuclear polyribosome interaction, and
  • metabolic control.

In addition, the emergence of methods for copying, removal, and insertion of genetic material, together with improvements in structural analysis and developments in applied mathematics, has transformed the research framework.

CRACKING THE CODE OF HUMAN LIFE: The Birth of BioInformatics & Computational Genomics

Computational Genomics I. Three-Dimensional Folding and Functional Organization Principles of the Drosophila Genome

Sexton T, Yaffe E, Kenigsberg E, Bantignies F, …, Cavalli G. Institut de Génétique Humaine, Montpellier GenomiX, and Weizmann Institute, France and Israel. Cell 2012; 148(3): 458–472.

Chromosomes are the physical realization of genetic information and thus form the basis for its readout and propagation. Here we present a high-resolution chromosomal contact map derived from a modified genome-wide chromosome conformation capture approach applied to Drosophila embryonic nuclei. The entire genome is linearly partitioned into well-demarcated physical domains that overlap extensively with active and repressive epigenetic marks. Chromosomal contacts are hierarchically organized between domains. Global modeling of contact density and clustering of domains show that inactive domains are condensed and confined to their chromosomal territories, whereas active domains reach out of the territory to form remote intra- and interchromosomal contacts. Moreover, we systematically identify specific long-range intrachromosomal contacts between Polycomb-repressed domains. Together, these observations allow for quantitative prediction of the Drosophila chromosomal contact map, laying the foundation for detailed studies of chromosome structure and function in a genetically tractable system.

[Figure: profiles validate the genome-wide Hi-C map]

IIC. “Mr. President: the Genome is Fractal!” Eric Lander

(Science Adviser to the President and Director of the Broad Institute) et al.
delivered the message on the cover of Science (Oct. 9, 2009), and the
International HoloGenomics Society took up the theme at a September meeting.

  • First, it may seem to be trivial to rectify the statement in “About cover”
    of Science Magazine by AAAS. The statement “the Hilbert curve is a
    one-dimensional fractal trajectory” needs mathematical clarification.

While the paper itself does not make this statement, the new Editorship
of the AAAS magazine might have been even further ahead had the previous
Editorship not rejected (without review) a manuscript by 20+ founders
of the (formerly) International PostGenetics Society in December 2006.

  • Second, it may not be sufficiently clear for the reader that the
    reasonable requirement for the DNA polymerase to crawl along
    a “knot-free” (or “low knot”) structure does not need fractals. A
    “knot-free” structure could be spooled by an ordinary “knitting globule”
    (such that the DNA polymerase does not bump into a “knot” when
    duplicating the strand; just like someone knitting can go through
    the entire thread without encountering an annoying knot): Just to
    be “knot-free” you don’t need fractals.

Note, however, that the “strand” can be accessed only at its beginning –
it is impossible, for example,

  • to pluck a segment from deep inside the “globule”.

This is where certain fractals provide a major advantage – that could be

  • the “Eureka” moment for many readers.

For instance, the mentioned Hilbert-curve is not only “knot free” – but

  • provides an easy access to “linearly remote” segments of the strand.

If the Hilbert curve starts from the lower right corner and ends at the lower left corner,

  • the path gives very easy access to what would be the mid-point,
  • if the Hilbert-curve is measured by
  • the Euclidean distance along the zig-zagged path.

Likewise, the mid-point is about equally easy to access from the beginning of the Hilbert-curve –

  • easier than reaching, from the origin, a point that is about 2/3 down the path.

The Hilbert-curve provides an easy access between two points

  • within the “spooled thread”;

from a point that is about 1/5 of the overall length

  • to about 3/5 is also in a “close neighborhood”.

This may be the “Eureka-moment” for some readers: to realize that

  • the strand of “the Double Helix” requires quite a finesse to fold into
  • the densest possible globules (the chromosomes) in a clever way
  • such that various segments can be easily accessed.

Moreover, in a way that distances

  • between various segments are minimized.

This marvelous fractal structure

  • is illustrated by the 3D rendering of the Hilbert-curve.

Once you observe such fractal structure, you’ll never again think of

  • a chromosome as a “brillo mess”, would you?

It will dawn on you that the genome is orders of magnitude more

  • finessed than we ever thought.
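The locality property described above is easy to make concrete in code. The following sketch (a standard index-to-coordinate conversion for the Hilbert curve, offered as an illustration rather than as code from any of the cited papers) walks the curve on an 8×8 grid and checks two things: consecutive positions along the “thread” always occupy adjacent cells (the “knot-free” crawl), and the two endpoints of the 63-step path end up only a short straight-line distance apart.

```python
def d2xy(n, d):
    """Map index d along a Hilbert curve to (x, y) on an n x n grid.

    Standard conversion (n must be a power of 2); the curve visits
    every cell exactly once -- the "knot-free spooling" of the text.
    """
    x = y = 0
    s, t = 1, d
    while s < n:
        rx = 1 & (t // 2)          # which quadrant, horizontally
        ry = 1 & (t ^ rx)          # which quadrant, vertically
        if ry == 0:                # rotate/flip the quadrant if needed
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

# Consecutive indices along the curve are always adjacent grid cells,
# so a polymerase "crawling the thread" never has to jump.
steps = [d2xy(8, d) for d in range(64)]
adjacent = all(abs(x1 - x2) + abs(y1 - y2) == 1
               for (x1, y1), (x2, y2) in zip(steps, steps[1:]))
print(adjacent)             # True
print(steps[0], steps[63])  # the two endpoints sit on the same edge
```

Scaling n up (any power of 2) preserves both properties, which is what makes the curve a plausible cartoon for dense but accessible folding.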

[Figure: profiles validate the genome-wide Hi-C map]

Those embarking on a somewhat complex review of some

  • historical aspects of the power of fractals may wish to consult
  • the oeuvre of Mandelbrot (also, to celebrate his 85th birthday).

For the more sophisticated readers, even the fairly simple

Hilbert-curve (a representative of the Peano-class) becomes

  • even more stunningly brilliant than just some “see through density”.

Those who are familiar with the classic “Traveling Salesman Problem”

  • know that finding “the shortest path along which every one of n given locations
  • can be visited once, and only once” requires fairly sophisticated algorithms
  • (and a tremendous amount of computation once n > 10).

Some readers will be amazed, therefore, that for n=9 the underlying Hilbert-curve

Briefly, the significance of the above realization is that the (recursive)

  1. fractal Hilbert curve is intimately connected to the
  2. (recursive) solution of the Traveling Salesman Problem,
  3. a core concept of the Artificial Neural Networks summarized below.

Accomplished physicist John Hopfield (already a member of the National Academy of Sciences) aroused great excitement in 1982

with his (recursive) design of artificial neural networks and learning algorithms

which were able to find reasonable solutions to combinatorial problems

such as the Traveling Salesman Problem.
(Book review by Clark Jeffries, 1991, of J. Anderson, R. Rosenfeld, and
A. Pellionisz (eds.), Neurocomputing 2: Directions for Research, MIT
Press, Cambridge, MA, 1990):

“Perceptrons were modeled chiefly with neural connections in a

  • “forward” direction: A → B → C → D.

The analysis of networks with strong

  • backward coupling proved intractable.

All our interesting results arise as consequences of the strong

  • back-coupling” (Hopfield, 1982).
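Hopfield’s remark about back-coupling can be demonstrated with a toy associative memory (a minimal illustrative sketch, not Hopfield’s actual TSP network): one pattern is stored with the Hebbian rule, and the fully back-coupled, recurrent update dynamics pull a corrupted input back to the stored state.

```python
def hebbian_weights(patterns):
    # symmetric weights with zero diagonal: every unit feeds back on
    # every other unit -- the "strong back-coupling" Hopfield describes
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w

def recall(w, state, steps=5):
    # synchronous recurrent updates; the stored pattern is a fixed point
    s = list(state)
    n = len(s)
    for _ in range(steps):
        s = [1 if sum(w[i][j] * s[j] for j in range(n)) >= 0 else -1
             for i in range(n)]
    return s

stored = [1, -1, 1, -1, 1, -1, 1, -1]
w = hebbian_weights([stored])
noisy = [-1] + stored[1:]          # flip the first unit
print(recall(w, noisy) == stored)  # True
```

With only forward connections nothing would drive the corrupted unit back; it is precisely the feedback from the other units that restores it.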

The Principle of Recursive Genome Function surpassed obsolete

  • axioms that blocked, for half a century,
  • the entry of recursive algorithms into the interpretation
  • of the structure and function of the (Holo)Genome.

This breakthrough, by uniting the two largely separate fields of

  • Neural Networks and Genome Informatics,

is particularly important for those who focused on

  • Biological (actually occurring) Neural Networks
  • (rather than abstract algorithms that may not, or
  • because of their core-axioms, simply could not
  • represent neural networks under the governance of DNA information).

IIIA. The FractoGene Decade from Inception in 2002 to Proofs of Concept and
Impending Clinical Applications by 2012

  1. Junk DNA Revisited (SF Gate, 2002)
  2. The Future of Life, 50th Anniversary of DNA (Monterey, 2003)
  3. Mandelbrot and Pellionisz (Stanford, 2004)
  4. Morphogenesis, Physiology and Biophysics (Simons, Pellionisz 2005)
  5. PostGenetics; Genetics beyond Genes (Budapest, 2006)
  6. ENCODE-conclusion (Collins, 2007)
  7. The Principle of Recursive Genome Function (paper, YouTube, 2008)
  8. You Tube Cold Spring Harbor presentation of FractoGene (Cold Spring Harbor, 2009)
  9. Mr. President, the Genome is Fractal! (2009)
  10. HolGenTech, Inc. Founded (2010)
  11. Pellionisz on the Board of Advisers in the USA and India (2011)
  12. ENCODE – final admission (2012)
  13. Recursive Genome Function is Clogged by Fractal Defects in Hilbert-Curve (2012)
  14. Geometric Unification of Neuroscience and Genomics (2012)
  15. US Patent Office issues FractoGene 8,280,641 to Pellionisz (2012)


When the human genome was first sequenced in June 2000, there were two pretty big surprises.

The first was that humans have only about 30,000-40,000 identifiable genes,

  • not the 100,000 or more many researchers were expecting.

The lower, and more humbling, number

  • means humans have just one-third
  • more genes than a common species of worm.

The second stunner was how much human genetic material — more than 90 percent —

  • is made up of what scientists were calling “junk DNA.”

The term was coined to describe similar but

  • not completely identical repetitive sequences of nucleotides
    (the same building blocks that make up genes),
  • which appeared to have no function or purpose.

The main theory at the time was that these apparently

  • non-working sections of DNA were
  • just evolutionary leftovers, much like our earlobes.

If biophysicist Andras Pellionisz is correct, genetic science

  • may be on the verge of yielding its third — and
  • by far biggest — surprise.

In addition to a doctorate in physics, Pellionisz holds Ph.D.’s

  • in computer sciences and experimental biology from the
    prestigious Budapest Technical University and
    the Hungarian National Academy of Sciences.

A biophysicist by training, the 59-year-old is a former research

  1. associate professor of physiology and biophysics at New York University,
  2. author of numerous papers in respected scientific journals and textbooks,
  3. a past winner of the prestigious Humboldt Prize for scientific research,
  4. a former consultant to NASA and
  5. holder of a patent on the world’s first artificial cerebellum,
    a technology that has already been integrated into research
    on advanced avionics systems.

Because of his background, the Hungarian-born brain researcher might

  • also become one of the first people to successfully launch a new company
  • by using the Internet to gather momentum for a novel scientific idea.

The genes we know about today, Pellionisz says, can be thought of as something

  • similar to machines that make bricks (proteins, in the case of genes), with certain
  • junk-DNA sections providing a blueprint for the
  • different ways those proteins are assembled.

The notion that at least certain parts of junk DNA might have a purpose is gaining acceptance; for example,

  • many researchers now refer to those sections
  • with a far less derogatory term: introns.


In a provisional patent application filed July 31, Pellionisz claims to have

  • unlocked a key to the hidden role junk DNA plays in growth — and in life itself.

His patent application covers all attempts to

  • count,
  • measure and
  • compare

the fractal properties of introns

  • for diagnostic and therapeutic purposes.

IIIB. The Hidden Fractal Language of Intron DNA

To fully understand Pellionisz’ idea,

  • one must first know what a fractal is.

Fractals are a way that nature organizes matter.

Fractal patterns can be found

  • in anything that has a nonsmooth surface (unlike a billiard ball),
  1. such as coastal seashores,
  2. the branches of a tree or
  3. the contours of a neuron (a nerve cell in the brain).

Some, but not all, fractals are self-similar, and fractals found in nature

  • stop repeating their patterns at some stage;

the branches of a tree, for example,

  • can get only so small.

Because they are geometric, meaning they have a shape,

  • fractals can be described in mathematical terms.

It’s similar to the way a circle can be described

  • by using a number to represent its radius
    (the distance from its center to its outer edge).

When that number is known, it’s possible to draw the circle it represents

  • without ever having seen it before.

Although the math is much more complicated,

  • the same is true of fractals.

If one has the formula for a given fractal,

  • it’s possible to use that formula to construct, or reconstruct,
  • an image of whatever structure it represents,
  • no matter how complicated.
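The point about reconstructing a structure from its formula can be pushed one step further in code. The sketch below (a standard Pascal’s-triangle-mod-2 construction, used purely as an illustration and not drawn from Pellionisz’ work) draws the Sierpinski triangle from a one-line rule, and then verifies the self-similarity that makes such a compact description possible.

```python
def sierpinski(rows):
    """Draw the Sierpinski triangle from a one-line rule:
    cell (r, c) is filled iff the binomial coefficient C(r, c) is odd."""
    grid, row = [], [1]
    for _ in range(rows):
        grid.append(''.join('*' if v % 2 else ' ' for v in row))
        # next row of Pascal's triangle
        row = [1] + [a + b for a, b in zip(row, row[1:])] + [1]
    return grid

triangle = sierpinski(16)
# self-similarity: the lower-left sub-triangle repeats the whole top half
print(all(triangle[8 + r][:r + 1] == triangle[r] for r in range(8)))  # True
```

An arbitrarily large rendering is recovered from the same short rule, just as knowing a radius recovers the whole circle.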

The mysteriously repetitive but not identical strands of genetic material

  • are in reality building instructions organized in
  • a special type of pattern known as a fractal.

It’s this pattern of fractal instructions, he says, that tells genes what they

  • must do in order to form living tissue,
  • everything from the wings of a fly to the entire body of a full-grown human.

In a move sure to alienate some scientists,

  • Pellionisz has chosen the unorthodox route of
  • making his initial disclosures online on his own Web site.

He picked that strategy, he says, because

  1. it is the fastest way he can document his claims
  2. and find scientific collaborators and investors.

Most mainstream scientists usually blanch at such approaches,

  • preferring more traditionally credible methods, such as
  • publishing articles in peer-reviewed journals.

Basically, Pellionisz’ idea is that

  • a fractal set of building instructions in the DNA
  • plays a similar role in organizing life itself.

Decode the way that language works, he says, and

  • in theory it could be reverse engineered.

Just as knowing the radius of a circle lets one create that circle,

  • the more complicated fractal-based formula
  • would allow us to understand how nature creates a heart or
  • simpler structures, such as disease-fighting antibodies.

At a minimum, we’d get a far better understanding of

  • how nature gets that job done.

The complicated quality of the idea is helping encourage

  • new collaborations across the boundaries that sometimes
  • separate the increasingly intertwined disciplines of
  • biology, mathematics and computer sciences.

Hal Plotkin, Special to SF Gate. Thursday, November 21, 2002.


[Figure: Fractal defects in the genome – repeat structural variants, with their largest example, Copy Number Variants]

[Figure: Golden ratio | fractal chaos | holographic neural network]

IIIC. Multifractal Analysis

The human genome: a multifractal analysis.
Moreno PA, Vélez PE, Martínez E, et al. BMC Genomics 2011; 12:506.

Background: Several studies have shown that genomes

  • can be studied via a multifractal formalism.

Recently, we used a multifractal approach to study the

  • genetic information content of the Caenorhabditis elegans genome.

Here we investigate the possibility that the human genome shows a

  • similar behavior to that observed in the nematode.

Results: We report here multifractality in the human genome sequence.

This behavior correlates strongly with the presence of

  1. Alu elements and, to a lesser extent, with
  2. CpG islands and (G+C) content.

In contrast, little or no relationship was found for

  • LINE, MIR, MER, and LTR elements and for DNA regions
  • poor in genetic information.

Gene function, clusters of orthologous genes, metabolic pathways, and exons

  1. tended to increase in frequency with increasing multifractality,
  2. and large gene families were located in genomic regions with varied multifractality.

Additionally, a multifractal map and classification for human chromosomes are proposed.

Conclusions: We propose a descriptive non-linear model

for the structure of the human genome.

This model reveals a multifractal regionalization in which

many regions coexist that are far from equilibrium, and

this non-linear organization has significant molecular and medical genetic implications

  • for understanding the role of Alu elements in the stability
  • and structure of the human genome.

Given the role of Alu sequences in

  1. adaptation and
  2. human genetic diversity,
  3. genetic diseases,
  4. gene regulation,
  5. phylogenetic analyses,

these quantifications are especially useful.
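To give a flavor of the formalism (a deliberately simplified, single-scale box-counting sketch of the general method, not the authors’ pipeline), a sequence can be cut into boxes, each box assigned a probability from its G+C content, and generalized dimensions D_q read off the partition function Z(q) = Σ p_i^q. A homogeneous sequence gives D_q ≈ 1 for every q, while an uneven G+C distribution spreads the D_q values apart, which is the signature of multifractality:

```python
import math

def generalized_dimensions(seq, box=16, qs=(-2, 2)):
    # p_i = share of all G/C bases that fall in box i
    total_gc = sum(base in 'GC' for base in seq)
    probs = [sum(base in 'GC' for base in seq[i:i + box]) / total_gc
             for i in range(0, len(seq), box)]
    probs = [p for p in probs if p > 0]   # empty boxes carry no measure
    eps = box / len(seq)                  # relative box size
    # single-scale Renyi (generalized) dimension for each q != 1
    return {q: math.log(sum(p ** q for p in probs))
               / ((q - 1) * math.log(eps))
            for q in qs}

uniform = generalized_dimensions('GCAT' * 64)
skewed = generalized_dimensions('G' * 16 + 'GCAT' * 4 + 'GATT' * 4 + 'ATAT' * 4)
print(uniform)                  # both D_q very close to 1
print(skewed[-2] > skewed[2])   # True: the dimensions spread apart
```

A real analysis would fit across many box sizes and a dense range of q; the single-scale version above only illustrates the idea.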

MiIP: The Monomer Identification and Isolation Program.

Bun C, Ziccardi W, Doering J and Putonti C.
Evolutionary Bioinformatics 2012; 8:293-300.

Repetitive elements within genomic DNA are

  • both functionally and evolutionarily informative.

Discovering these sequences ab initio

  • is computationally challenging,
  • compounded by the fact that sequence identity
  • between repetitive elements can vary significantly.

Here we present a new application,

  • the Monomer Identification and Isolation Program (MiIP),
  • which provides functionality both to
  1. search for a particular repeat
  2. and to discover repetitive elements within a larger genomic sequence.

To compare MiIP’s performance with other repeat detection tools,

  • analysis was conducted for synthetic sequences as well as
  • several a21-II clones and HC21 BAC sequences.

The primary benefit of MiIP is that

  1. it is a single tool capable of searching for known monomeric sequences
  2. as well as discovering the occurrence of repeats ab initio,
  3. per the user’s required sensitivity of the search.
Triplex DNA

A. A Third Strand for DNA

The DNA double helix can under certain conditions

  • accommodate a third strand in its major groove.

Researchers in the UK have now presented a complete set of

  • four variant nucleotides that makes it possible to use this phenomenon
  • in gene regulation and mutagenesis.

Natural DNA only forms a triplex

  • if the targeted strand is rich in purines – guanine (G) and adenine (A) –
  • which in addition to the bonds of the Watson-Crick base pairing
  • can form two further hydrogen bonds, and the ‘third strand’ oligonucleotide
  • has the matching sequence of pyrimidines – cytosine (C) and thymine (T).

Any Cs or Ts in the target strand of the duplex will only bind very weakly,

  • as they contribute just one hydrogen bond.

Moreover, the recognition of G requires

  • the C in the probe strand to be protonated,
  • so triplex formation will only work at low pH.

To overcome all these problems, the groups of Tom Brown and Keith Fox
at the University of Southampton

  • have developed modified building blocks, and have now
  • completed a set of four new nucleotides, each of which will bind to one
  • DNA nucleotide from the major groove of the double helix.1

They tested the binding of a 19-mer of these designer nucleotides

  • to a double helix target sequence in comparison with the corresponding
  • triplex-forming oligonucleotide made from natural DNA bases.

Using fluorescence-monitored thermal melting and DNase I footprinting,

  • the researchers showed that their construct
  • forms stable triplex even at neutral pH.

Tests with mutated versions of the target sequence showed that

  1. three of the novel nucleotides are highly selective for their target base pair,
  2. while the ‘S’ nucleotide, designed to bind to T, also tolerates C.

In principle, triplex formation has already been demonstrated as

  • a way of inducing mutations in cell cultures and animal experiments.2

Michael Gross


1. DA Rusling et al., Nucleic Acids Res. 2005, 33, 3025.

2. KM Vasquez et al., Science 2000, 290, 530.

B. Triplex DNA Structures

Frank-Kamenetskii MD, Mirkin SM. Annu Rev Biochem 1995; 64:69-95.

Since the pioneering work of Felsenfeld, Davies, & Rich (1),

  • double-stranded polynucleotides containing purines in one strand
  • and pyrimidines in the other strand
    [such as poly(A)/poly(U), poly(dA)/poly(dT), or poly(dAG)/poly(dCT)]
  • have been known to be able to undergo a
  • stoichiometric transition forming a triple-stranded structure containing
  • one polypurine and two polypyrimidine strands.

Early on, it was assumed that the third strand was located in the major groove

  • and associated with the duplex via non-Watson-Crick interactions
  • now known as Hoogsteen pairing.

[Figure: triplex DNA]

Triple helices consisting of one pyrimidine and

  • two purine strands were also proposed.

However, notwithstanding the fact that single-base triads

  1. in tRNA structures were well-documented,
  2. triple-helical DNA escaped wide attention before the mid-1980s.

The considerable modern interest in DNA triplexes arose

  • due to two partially independent developments.

First, homopurine-homopyrimidine stretches in supercoiled plasmids

  • were found to adopt an unusual DNA structure, called H-DNA, which
  • includes a triplex as the major structural element.

Secondly, several groups demonstrated that homopyrimidine and

  • some purine-rich oligonucleotides
  • can form stable and sequence-specific complexes
  • with corresponding homopurine-homopyrimidine sites on duplex DNA.

These complexes were shown to be triplex structures rather than D-loops,

  • where the oligonucleotide invades the double helix
  • and displaces one strand.

A characteristic feature of all these triplexes is that the two chemically

  • homologous strands (both pyrimidine or both purine) are antiparallel.

These findings led to explosive growth in triplex studies. One can easily imagine

  • numerous “geometrical” ways to form a triplex;
  • several of these have been studied experimentally.

The canonical intermolecular triplex consists of either

  1. three independent oligonucleotide chains or of
  2. a long DNA duplex carrying homopurine-homopyrimidine insert
  • and the corresponding oligonucleotide.

Triplex formation strongly depends on the oligonucleotide(s) concentration.

A single DNA chain may also fold into a triplex connected by two loops.

To comply with the sequence and polarity requirements for triplex formation,

  • such a DNA strand must have a peculiar sequence:

It contains a mirror repeat
(homopyrimidine for YR*Y triplexes and homopurine for YR*R triplexes)

  • flanked by a sequence complementary to
  • one half of this repeat.

Such DNA sequences fold into

  • triplex configuration much more readily than do
  • the corresponding intermolecular triplexes, because
  • all triplex forming segments are brought together within the same molecule.
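The sequence requirement is easy to state in code. The sketch below (illustrative only, and simplified: it ignores the complementary flanking sequence) scans a strand for homopurine or homopyrimidine mirror repeats, the motif that allows a single strand to fold back into an intramolecular triplex:

```python
def find_mirror_repeats(dna, length=12):
    # a mirror repeat reads the same from both ends toward the center
    # (unlike a palindrome, it is NOT reverse-complemented); triplex
    # formation further requires it to be all-purine or all-pyrimidine
    hits = []
    for i in range(len(dna) - length + 1):
        window = dna[i:i + length]
        homogeneous = (all(b in 'AG' for b in window)
                       or all(b in 'CT' for b in window))
        if homogeneous and window == window[::-1]:
            hits.append((i, window))
    return hits

print(find_mirror_repeats('TTT' + 'AGGAGGGGAGGA' + 'CCC'))
# [(3, 'AGGAGGGGAGGA')]: a homopurine mirror repeat
```

A full H-DNA predictor would additionally check that one half of the repeat is flanked by its Watson-Crick complement, as the text describes.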


It has become clear recently, however, that

  • both sequence requirements and chain polarity rules for triplex formation
  • can be met by DNA target sequences
  • built of clusters of purines and pyrimidines.

The third strand consists of adjacent homopurine and homopyrimidine blocks

  • forming Hoogsteen hydrogen bonds with purines
  • on alternate strands of the target duplex, and this strand switch
  • preserves the proper chain polarity.

These structures, called alternate-strand triplexes,

  • have been experimentally observed as both intra- and intermolecular triplexes.

These results increase the number of

  • potential targets for triplex formation in natural DNAs
  • somewhat by adding sequences composed of purine and pyrimidine clusters,
  • although arbitrary sequences are still not targetable
  • because strand switching is energetically unfavorable.


Lyamichev VI, Mirkin SM, Frank-Kamenetskii MD. J. Biomol. Struct. Dyn. 1986; 3:667-69.

Mirkin SM, Lyamichev VI, Drushlyak KN, Dobrynin VN, Filippov SA, Frank-Kamenetskii MD. Nature 1987; 330:495-97.

Demidov V, Frank-Kamenetskii MD, Egholm M, Buchardt O, Nielsen PE. Nucleic Acids Res. 1993; 21:2103-7.

Mirkin SM, Frank-Kamenetskii MD. Annu. Rev. Biophys. Biomol. Struct. 1994; 23:541-76.

Hoogsteen K. Acta Crystallogr. 1963; 16:907-16.

Malkov VA, Voloshin ON, Veselkov AG, Rostapshov VM, Jansen I, et al. Nucleic Acids Res. 1993; 21:105-11.

Malkov VA, Voloshin ON, Soyfer VN, Frank-Kamenetskii MD. Nucleic Acids Res. 1993; 21:585-91.

Cherny DY, Belotserkovskii BP, Frank-Kamenetskii MD, Egholm M, Buchardt O, et al. Proc. Natl. Acad. Sci. USA 1993; 90:1667-70.

C. Triplex-Forming Oligonucleotides

Triplex forming oligonucleotides: sequence-specific tools for genetic targeting.

Knauert MP, Glazer PM. Human Molec Genetics 2001; 10(20):2243-2251.

Triplex forming oligonucleotides (TFOs) bind in the major groove of duplex DNA

  • with a high specificity and affinity.

Because of these characteristics,

  • TFOs have been proposed as homing devices
  • for genetic manipulation in vivo.

These investigators review work demonstrating the ability of TFOs and

  • related molecules to alter gene expression and
  • mediate gene modification in mammalian cells.

TFOs can mediate targeted gene knock out in mice,

  • providing a foundation for potential application
  • of these molecules in human gene therapy.

D. Novagon DNA

John Allen Berger, founder of Novagon DNA and

  • The Triplex Genetic Code. Over the past 12+ years,

Novagon DNA has amassed a vast array of empirical findings

  • which challenge the “validity” of the “central dogma theory”,
  • especially the current five-nucleotide Watson-Crick DNA and
  • RNA genetic codes (DNA = A1T1G1C1, RNA = A2U1G2C2).

We propose that our new Novagon DNA 6 nucleotide Triplex Genetic Code

  • has more validity than the existing 5 nucleotide (A1T1U1G1C1)
  • Watson-Crick genetic codes.

Our goal is to conduct a “world class” validation study

  • to replicate and extend our findings.

Methods for Examining Genomic and Proteomic Interactions

A. An Integrated Statistical Approach to Compare
Transcriptomics Data Across Experiments:

A Case Study on the Identification of Candidate Target Genes
of the Transcription Factor PPARα

Ullah MO, Müller M and Hooiveld GJEJ.

Bioinformatics and Biology Insights 2012:6 145–154.



An effective strategy to elucidate the signal transduction cascades

  • activated by a transcription factor is to compare the transcriptional profiles
  • of wild type and transcription factor knockout models.

Many statistical tests have been proposed for analyzing gene expression data,

  • but most tests are based on pair-wise comparisons.

Since the analysis of microarrays involves the testing of

  • multiple hypotheses within one study, it is generally accepted that one should
  • control for false positives via the false discovery rate (FDR).

However, it has been reported that

  • this may be an inappropriate metric for
  • comparing data across different experiments.

Here we propose an approach that addresses the above mentioned problem

  • by the simultaneous testing and integration of the three hypotheses (contrasts)
  • using the cell means ANOVA model.

These three contrasts test for the effect of a treatment in

  • wild type,
  • gene knockout, and
  • globally over all experimental groups.

We illustrate our approach on microarray experiments that focused

  • on the identification of candidate target genes and biological processes
  • governed by the fatty acid sensing transcription factor PPARα in liver.

Compared to the often-applied FDR-based across-experiment comparison,

  • our approach identified a conservative
  • but less noisy set of candidate genes
  • with the same sensitivity and specificity.

However, our method had the advantage of properly adjusting for

  • multiple testing while integrating data from two experiments,
  • and was driven by biological inference.

We present a simple, yet efficient strategy to compare

  • differential expression of genes across experiments
  • while controlling for multiple hypothesis testing.
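For readers unfamiliar with FDR control, the Benjamini-Hochberg step-up procedure, the standard way of “controlling for false positives by the false discovery rate” mentioned above, fits in a few lines (a generic sketch, not the authors’ ANOVA-based method):

```python
def benjamini_hochberg(pvals, alpha=0.05):
    # step-up: find the largest rank k with p_(k) <= k * alpha / m,
    # then reject the hypotheses with the k smallest p-values
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank * alpha / m:
            k = rank
    rejected = set(order[:k])
    return [i in rejected for i in range(m)]

print(benjamini_hochberg([0.01, 0.02, 0.03, 0.5, 0.8]))
# [True, True, True, False, False]
print(benjamini_hochberg([0.04, 0.045]))
# [True, True]
```

The second example shows the step-up behavior: 0.04 fails its own per-rank threshold (0.025) but is rescued because a larger rank passes its threshold (0.045 ≤ 0.05).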

B. Managing biological complexity across orthologs with a visual knowledge-base
of documented biomolecular interactions.
Vincent VanBuren & Hailin Chen. Scientific Reports 2, Article number: 1011.
Received 02 October 2012; accepted 04 December 2012.

The complexity of biomolecular interactions and influences

  • is a major obstacle to their comprehension and elucidation.

Visualizing knowledge of biomolecular interactions

  • increases comprehension and
  • facilitates the development of new hypotheses.

The rapidly changing landscape of high-content experimental results

  • also presents a challenge for the maintenance of comprehensive knowledgebases.

Distributing the responsibility for maintenance of a knowledgebase

  • to a community of subject matter experts is an effective strategy
  • for large, complex and rapidly changing knowledgebases.

Cognoscente serves these needs by building visualizations for queries

  • of biomolecular interactions on demand,
  • by managing the complexity of those visualizations, and by
  • crowdsourcing to promote the incorporation of current knowledge
  • from the literature.

Imputing functional associations between

  • biomolecules and imputing directionality of regulation for those predictions
  • each require a corpus of existing knowledge as a framework to build upon.

Comprehension of the complexity of this corpus of knowledge

  • will be facilitated by effective visualizations of
  • the corresponding biomolecular interaction networks.

Cognoscente

  1. was designed and implemented to serve these roles as a knowledgebase
  2. and as an effective visualization tool for systems biology research and education.

Cognoscente currently contains over 413,000 documented interactions,

  • with coverage across multiple species.

Perl, HTML, GraphViz, and a MySQL database were used in the development of Cognoscente.

Cognoscente was motivated by the need to update the knowledgebase

  • of biomolecular interactions at the user level, and
  • flexibly visualize multi-molecule query results for
  • heterogeneous interaction types across different orthologs.

Satisfying these needs provides a strong foundation for

  • developing new hypotheses about regulatory and metabolic pathway topologies.

Several existing tools provide functions similar to those of Cognoscente, so we selected popular alternatives to assess how their feature sets compare with Cognoscente (Table 1). All databases assessed had easily traceable documentation for each interaction, and included protein-protein interactions in the database.

Most databases, with the exception of BIND, provide an open-access database that can be downloaded as a whole.

Most databases, with the exceptions of EcoCyc and HPRD, provide

  • support for multiple organisms.

Most databases support web services for

  • interacting with the database contents programmatically,
  • whereas this is a planned feature for Cognoscente.

MINT, STRING, IntAct, EcoCyc, DIP and Cognoscente provide built-in

  • visualizations of query results, which we consider
  • among the most important features for facilitating comprehension of query results.

BIND supports visualizations via Cytoscape.

Cognoscente is among a few other tools that support

  • multiple organisms in the same query,
  • protein->DNA interactions, and
  • multi-molecule queries.

Cognoscente has planned support for

  • small molecule interactants (i.e. pharmacological agents).

MINT, STRING, and IntAct provide a prediction (i.e. score)

  • of functional associations, whereas
  • Cognoscente does not currently support this.

Cognoscente provides support for multiple edge encodings

  • to visualize different types of interactions in the same display,
  • a crowdsourcing web portal that allows users to submit
  • interactions that are then automatically incorporated in the knowledgebase,
  • and displays orthologs as compound nodes
  • to provide clues about potential orthologous interactions.

The main strengths of Cognoscente are that it provides a combined feature set that is superior to any existing database, it provides a unique visualization feature for orthologous molecules, and relatively unique support for multiple edge encodings, crowdsourcing, and connectivity parameterization. The current weaknesses of Cognoscente relative to these other tools are that it does not fully support web service interactions with the database, it does not fully support small molecule interactants, and it does not score interactions to predict functional associations. Web services and support for small molecule interactants are currently under development.

Related references from Leaders in Pharmaceutical Intelligence:

Big Data in Genomic Medicine larryhbern

BRCA1 a tumour suppressor in breast and ovarian cancer – functions in
transcription, ubiquitination and DNA repair
S Saha

Computational Genomics Center: New Unification of Computational Technologies at Stanford
A Lev-Ari

Personalized medicine gearing up to tackle cancer
ritu saxena

Differentiation Therapy – Epigenetics Tackles Solid Tumors

SJ Williams

Mechanism involved in Breast Cancer Cell Growth: Function in Early Detection & Treatment
A Lev-Ari

The Molecular pathology of Breast Cancer Progression
tilde barliya

Gastric Cancer: Whole-genome reconstruction and mutational signatures
A Lev-Ari

Paradigm Shift in Human Genomics – Predictive Biomarkers and Personalized Medicine – Part 1
A Lev-Ari

LEADERS in Genome Sequencing of Genetic Mutations for Therapeutic Drug Selection in Cancer Personalized Treatment: Part 2
A Lev-Ari

Personalized Medicine: An Institute Profile – Coriell Institute for Medical Research: Part 3
A Lev-Ari

Harnessing Personalized Medicine for Cancer Management, Prospects of Prevention and Cure: Opinions of Cancer Scientific Leaders
A Lev-Ari

GSK for Personalized Medicine using Cancer Drugs needs Alacris systems biology model to determine the in silico effect of the inhibitor in its “virtual clinical trial”
A Lev-Ari

Recurrent somatic mutations in chromatin-remodeling and ubiquitin ligase complex genes in serous endometrial tumors
S Saha

Personalized medicine-based cure for cancer might not be far away ritu saxena

Human Variome Project: encyclopedic catalog of sequence variants indexed to the human genome sequence A Lev-Ari

Prostate Cancer Cells: Histone Deacetylase Inhibitors Induce Epithelial-to-Mesenchymal Transition
SJ Williams

Inspiration From Dr. Maureen Cronin’s Achievements in Applying Genomic Sequencing to Cancer Diagnostics A Lev-Ari

The “Cancer establishments” examined by James Watson, co-discoverer of DNA w/Crick, 4/1953
A Lev-Ari

Directions for genomics in personalized medicine larryhbern

How mobile elements in “Junk” DNA promote cancer. Part 1: Transposon-mediated tumorigenesis.
SJ Williams

Mitochondria: More than just the “powerhouse of the cell”
ritu saxena

Mitochondrial fission and fusion: potential therapeutic targets? Ritu saxena

Mitochondrial mutation analysis might be “1-step” away ritu saxena

mRNA interference with cancer expression larryhbern

Expanding the Genetic Alphabet and linking the genome to the metabolome

Breast Cancer, drug resistance, and biopharmaceutical targets larryhbern

Breast Cancer: Genomic profiling to predict Survival: Combination of Histopathology and Gene Expression Analysis
A Lev-Ari

Gastric Cancer: Whole-genome reconstruction and mutational signatures A Lev-Ari

Ubiquinin-Proteosome pathway, autophagy, the mitochondrion, proteolysis and cell apoptosis larryhbern

Genomic Analysis: FLUIDIGM Technology in the Life Science and Agricultural Biotechnology A Lev-Ari

2013 Genomics: The Era Beyond the Sequencing Human Genome: Francis Collins, Craig Venter, Eric Lander, et al.


Related Articles

Life Stands on the shoulders of Giants (Viruses)

New insights into the human genome by ENCODE project

Unraveling the Human Genome: 6 Molecular Milestones

Melanoma Genes Found In “Junk” DNA

Learning the alphabet of gene control

BEST OF THE WEB: On viral ‘junk’ DNA, a DNA-enhancing Ketogenic diet, and cometary kicks

Sohan Modak

Owner, Open vision Inc.


Larry, in a series of papers, Fertil, Deschavanne and colleagues have done beautiful analyses of fractal diagrams of genome sequences [Deschavanne PJ, Giron A, Vilain J, Fagot G, Fertil B (1999) Mol Biol Evol 16:1391-1399; Fertil B, Massin M, Lespinats S, Devic C, Dumee P, Giron A (2005) GENSTYLE: exploration and analysis of DNA sequences with genomic signature. Nucleic Acids Res 33(Web Server issue):W512-5]. Clearly this gives an extraordinary insight into the specificity of positional sequence clusters. While fractals work well with octanucleotide clusters, the longer the oligonucleotide tracts, the higher the resolution. I feel that high-resolution fractal maps of fentanucleotide sequences will provide something truly different and may be used as a tool to compare normal cellular DNA sequences with those from cancer cell lines, providing an operational window for manipulations.
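The fractal diagrams referred to here are built from the Chaos Game Representation (CGR) that underlies the genomic-signature work of Deschavanne and Fertil. A minimal sketch of the construction, using an invented toy sequence:

```python
# Minimal Chaos Game Representation (CGR) sketch -- the construction behind
# genomic-signature fractals. Each base pulls the current point halfway
# toward its corner of the unit square; k-mer frequencies then fall out of
# a 2^k x 2^k grid of point counts.

CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}

def cgr_points(seq):
    x, y = 0.5, 0.5
    pts = []
    for base in seq:
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        pts.append((x, y))
    return pts

def signature(seq, k=2):
    """Count CGR points per cell of a 2^k x 2^k grid; each cell
    corresponds to one k-mer, so this is the k-mer spectrum."""
    n = 2 ** k
    counts = [[0] * n for _ in range(n)]
    for x, y in cgr_points(seq)[k - 1:]:  # skip the first k-1 warm-up points
        counts[min(int(y * n), n - 1)][min(int(x * n), n - 1)] += 1
    return counts

sig = signature("ACGTACGTGGTTAACC", k=2)
print(sig)
```

Longer words (larger k) subdivide the square further, which is why longer oligonucleotide tracts give higher-resolution signatures, at the cost of needing far more sequence to populate the 2^k × 2^k cells.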

Read Full Post »

Expanding the Genetic Alphabet and Linking the Genome to the Metabolome

The citric acid cycle, also known as the tricarboxylic acid cycle (TCA cycle) or the Krebs cycle. Produced at WikiPathways. (Photo credit: Wikipedia)


Reporter & Curator: Larry Bernstein, MD, FCAP

Unlocking the diversity of genomic expression within tumorigenesis and “tailoring” of therapeutic options

1. Reshaping the DNA landscape between diseases and within diseases by the linking of DNA to treatments

In the New York Times of 9/24/2012, Gina Kolata reports on four types of breast cancer and the reshaping of breast cancer treatment based on the findings of the genetically distinct types, each of which has common “cluster” features that are driving many cancers.  The discoveries were published online in the journal Nature on Sunday (9/23).  The study is considered the first comprehensive genetic analysis of breast cancer and has been called a roadmap to future breast cancer treatments.  I consider that if this is a landmark study in cancer genomics leading to personalized drug management of patients, it is also a fitting of the treatment to measurable “combinatorial feature sets” that tie into population biodiversity with respect to known conditions.  The researchers caution that it will take years to establish transformative treatments, clearly because within the genetic types there are subsets that have a bearing on treatment “tailoring”.  In addition, there is growing evidence that the Watson-Crick model of the gene is itself being modified by an expansion of the alphabet used to construct the DNA library, which will open opportunities to explain some of what has been considered junk DNA, and which may carry essential information with respect to metabolic pathways and pathway regulation.  The breast cancer study is tied to the “Cancer Genome Atlas” Project, already reported.  It is expected that this work will tie into building maps of genetic changes in common cancers, such as breast, colon, and lung.  What is not explicit, I presume, is a closely related concept: that the translational challenge is closely tied to the suppression of key proteomic processes involved in manipulating the metabolome.

Saha S. Impact of evolutionary selection on functional regions: The imprint of evolutionary selection on ENCODE regulatory elements is manifested between species and within human populations. 9/12/2012.

Hawrylycz MJ, Lein ES, Guillozet-Bongaarts AL, Shen EH, Ng L, et al. An anatomically comprehensive atlas of the adult human brain transcriptome. Nature  Sept 14-20, 2012

Sarkar A. Prediction of Nucleosome Positioning and Occupancy Using a Statistical Mechanics Model. 9/12/2012.

Heijden et al. Connecting nucleosome positions with free energy landscapes. (Proc Natl Acad Sci U S A. 2012, Aug 20 [Epub ahead of print]).

2. Fiddling with an expanded genetic alphabet – greater flexibility in design of treatment (pharmaneogenesis?)

Diagram of DNA polymerase extending a DNA strand and proof-reading. (Photo credit: Wikipedia)

A clear indication of this emerging remodeling of the genetic alphabet is a new study led by scientists at The Scripps Research Institute, which appeared in the June 3, 2012 issue of Nature Chemical Biology and indicates that the genetic code as we know it may be expanded to include synthetic and unnatural base pairing (Study Suggests Expanding the Genetic Alphabet May Be Easier than Previously Thought, Genome). They infer that the genetic instruction set for living organisms, composed of four bases (C, G, A and T), is open to unnatural letters. An expanded “DNA alphabet” could carry more information than natural DNA, potentially coding for a much wider range of molecules and enabling a variety of powerful applications. The application of this would further expand the translation of portions of DNA into new transcripts and proteins that are heretofore unknown but have metabolic relevance and therapeutic potential. The existence of such pairing in nature has been studied in eukaryotes for at least a decade, and may have a role in biodiversity. The investigators show how a previously identified pair of artificial DNA bases can go through the DNA replication process almost as efficiently as the four natural bases. This could as well be translated into human diversity, and human diseases.

The Romesberg laboratory, which collaborated on the new study, has been trying to find a way to extend the DNA alphabet since the late 1990s. In 2008, they developed the efficiently replicating bases NaM and 5SICS, which come together as a complementary base pair within the DNA helix, much as, in normal DNA, the base adenine (A) pairs with thymine (T), and cytosine (C) pairs with guanine (G). It had been clear that their chemical structures lack the ability to form the hydrogen bonds that join natural base pairs in DNA. Such bonds had been thought to be an absolute requirement for successful DNA replication, but that is not the case, because other interactions can be in play.

The data strongly suggested that NaM and 5SICS do not even approximate the edge-to-edge geometry of natural base pairs (termed the Watson-Crick geometry, after the co-discoverers of the DNA double helix). Instead, they join in a looser, overlapping, “intercalated” fashion that resembles a ‘mispair.’ In test after test, the NaM-5SICS pair was efficiently replicated even though it appeared that the DNA polymerase didn’t recognize it. The structural data showed that the NaM-5SICS pair maintains an abnormal, intercalated structure within double-helix DNA, yet remarkably adopts the normal, edge-to-edge, “Watson-Crick” positioning when gripped by the polymerase during the crucial moments of DNA replication. NaM and 5SICS, lacking hydrogen bonds, are held together in the DNA double helix by “hydrophobic” forces, which cause certain molecular structures (like those found in oil) to be repelled by water molecules, and thus to cling together in a watery medium.

The finding suggests that NaM-5SICS, and potentially other hydrophobically bound base pairs, could be used to extend the DNA alphabet, and that evolution’s choice of the existing four-letter DNA alphabet on this planet may have been contingent, leaving open the possibility of life based on other genetic systems.
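To make the bookkeeping concrete, here is a toy model of complementation with an expanded alphabet. `X` and `Y` stand in for NaM and 5SICS; the table is illustrative only, not a chemistry model, and strand reversal is omitted for brevity:

```python
# Sketch of how an expanded alphabet changes strand complementation.
# "X"/"Y" stand in for the unnatural pair NaM/5SICS described above;
# the pairing table is illustrative, not a chemical model.

COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C",  # natural Watson-Crick pairs
    "X": "Y", "Y": "X",                      # NaM (X) / 5SICS (Y), hydrophobic pair
}

def replicate(template):
    """Per-base complement of a template strand (reversal omitted)."""
    return "".join(COMPLEMENT[b] for b in template)

strand = "ACGXTY"
copy = replicate(strand)
print(copy)                       # the complementary strand
assert replicate(copy) == strand  # two rounds restore the original
```

The only structural change an unnatural pair demands of this picture is two extra entries in the pairing table; the hard part, as the study shows, is getting a real polymerase to accept them.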

3.  Studies that consider a DNA triplet model that includes one or more NATURAL nucleosides and looks closely allied to the formation of the disulfide bond and oxidation-reduction reactions.

This independent work is being conducted based on a similar concept. John Berger, founder of Triplex DNA, has commented on this. He emphasizes sulfur as the most important element for understanding the evolution of metabolic pathways in the human transcriptome. It is a combination of sulfur-34 and sulfur-32 atomic mass units. S34 is element 16 + fluorine, while S32 is element 16 + phosphorus. The cysteine-cystine bond is the bridge and controller between inorganic chemistry (fluorine) and organic chemistry (phosphorus). He uses a dual spelling, sulfphur, to combine the two when referring to the master catalyst of oxidation-reduction reactions, with various isotopic alleles (please note the duality principle, which is nature’s most important pattern). Sulfphur is methionine, S-adenosylmethionine, cysteine, cystine, taurine, glutathione, acetyl coenzyme A, biotin, lipoic acid, H2S, H2SO4, HSO3-, cytochromes, thioredoxin, ferredoxins, purple sulfphur anaerobic bacteria (prokaryotes), hydrocarbons, green sulfphur bacteria, garlic, penicillin and many antibiotics, and hundreds of CSN drugs for parasite and fungus antagonists. These are but a few names which come to mind. It is at the heart of the Krebs cycle and oxidative phosphorylation, i.e. ATP. It is also a second pathway to purine metabolism and nucleic acids. It is literally the key link between RNA and DNA, i.e., the SH thiol bond oxidized to the SS bond of cystine through thioredoxins, ferredoxins, and nitrogenase. The immune system is founded upon sulfphur compounds and processes. In photosynthesis, Fe4S4 to Fe2S3 absorbs the entire electromagnetic spectrum, which is filtered by the Van Allen belt some 75 miles above earth. Look up Chromatium vinosum or Allochromatium species. There is reasonable evidence that it is the first symbiotic species of sulfphur anaerobic bacteria (Fe4S4), with high-potential millivolts, which drives photosynthesis while making glucose with H2S.
He envisions a sulfphur control map to automate human metabolism with exact timing sequences, at specific three-dimensional coordinates on Bravais crystalline lattices. He proposes adding the inosine-xanthosine family to the current 5-nucleotide genetic code. Finally, he adds, the expanded genetic code is populated with “synthetic nucleosides and nucleotides” with all kinds of customized functional side groups, which often reshape nature’s allosteric and physicochemical properties. The inosine family is nature’s natural evolutionary partner of the adenosine and guanosine families in purine synthesis de novo, salvage, and catabolic degradation. The inosine family has three major enzymes (IMPDH1, 2 & 3 for purine ring closure; HGPRT for purine salvage; and xanthine oxidase and xanthine dehydrogenase for catabolism).

DNA replication or DNA synthesis is the process of copying a double-stranded DNA molecule. This process is paramount to all life as we know it. (Photo credit: Wikipedia)

3. Nutritional regulation of gene expression, an essential role of sulfur, and metabolic control

Finally, the research carried out for decades by Yves Ingenbleek and the late Vernon Young warrants mention. According to their work, sulfur is again tagged as essential for health. Sulfur (S) is the seventh most abundant element measurable in human tissues and its provision is mainly insured by the intake of methionine (Met) found in plant and animal proteins. Met is endowed with unique functional properties as it controls the ribosomal initiation of protein syntheses, governs a myriad of major metabolic and catalytic activities and may be subjected to reversible redox processes contributing to safeguard protein integrity.

Diets with inadequate amounts of methionine (Met) result in overt or subclinical protein malnutrition, which has serious morbid consequences. The result is a reduction in the size of the lean body mass (LBM), best identified by serial measurement of plasma transthyretin (TTR), which is seen with unachieved replenishment (chronic malnutrition, strict veganism) or excessive losses (trauma, burns, inflammatory diseases). This status is accompanied by a rise in homocysteine and a concomitant fall in methionine. The ratio of S to N is quite invariant, but depends on the source: the S:N ratio is typically 1:20 for plant proteins and 1:14.5 for animal proteins. The key enzyme involved in the control of Met in man is cystathionine-β-synthase, whose activity declines with inadequate dietary provision of S, and the loss is not compensated by cobalamin-dependent CH3- transfer.
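A back-of-envelope calculation shows how the quoted S:N ratios translate into sulfur supply. The 16% nitrogen content of protein (the classic 6.25 conversion factor) is a standard approximation; the 80 g daily intake is an arbitrary example:

```python
# Back-of-envelope sketch using the S:N ratios quoted above
# (about 1:20 for plant protein, 1:14.5 for animal protein).
# Protein is ~16% nitrogen by mass (the classic 6.25 conversion factor).

S_TO_N = {"plant": 1 / 20.0, "animal": 1 / 14.5}
N_FRACTION_OF_PROTEIN = 0.16  # g N per g protein (approximation)

def sulfur_from_protein(grams_protein, source):
    """Estimated grams of S supplied by a given protein intake."""
    n = grams_protein * N_FRACTION_OF_PROTEIN
    return n * S_TO_N[source]

for source in ("plant", "animal"):
    print(source, round(sulfur_from_protein(80.0, source), 2))
```

At the same protein intake, the plant source delivers roughly 25–30% less sulfur than the animal source, which is the quantitative heart of the argument that follows.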

As a result of the disordered metabolic state arising from inadequate sulfur intake (the S:N ratio is lower in plants than in animals), the transsulfuration pathway is depressed at the cystathionine-β-synthase (CβS) level, triggering the upstream sequestration of homocysteine (Hcy) in biological fluids and promoting its conversion to Met. Both routes stimulate comparable remethylation reactions from homocysteine (Hcy), indicating that Met homeostasis benefits from high metabolic priority. Maintenance of beneficial Met homeostasis is counterpoised by the drop of cysteine (Cys) and glutathione (GSH) values downstream of CβS, depleting the reducing molecules implicated in the regulation of the three desulfuration pathways.

4. The effect on accretion of LBM of protein malnutrition and/or the inflammatory state: in closer focus

Hepatic synthesis is influenced by nutritional and inflammatory circumstances working concomitantly and liver production of  TTR integrates the dietary and stressful components of any disease spectrum. Thus we have a depletion of visceral transport proteins made by the liver and fat-free weight loss secondary to protein catabolism. This is most accurately reflected by TTR, which is a rapid turnover protein, but it is involved in transport and is essential for thyroid function (thyroxine-binding prealbumin) and tied to retinol-binding protein. Furthermore, protein accretion is dependent on a sulfonation reaction with 2 ATP.  Consequently, Kwashiorkor is associated with thyroid goiter, as the pituitary-thyroid axis is a major sulfonation target. With this in mind, it is not surprising why TTR is the sole plasma protein whose evolutionary patterns closely follow the shape outlined by LBM fluctuations. Serial measurement of TTR therefore provides unequaled information on the alterations affecting overall protein nutritional status. Recent advances in TTR physiopathology emphasize the detecting power and preventive role played by the protein in hyper-homocysteinemic states.

Individuals subjected to N-restricted regimens are able to maintain N homeostasis until very late in the starvation process. But N balance studies provide only an overall estimate of N gains and losses, failing to identify the tissue sites and specific interorgan fluxes involved. Using vastly improved methods, the LBM has been measured in its components. The LBM of the reference man contains 98% of total body potassium (TBK) and the bulk of total body sulfur (TBS). TBK and TBS reach equal intracellular amounts (140 g each) and share distribution patterns (half in skeletal muscle (SM) and half in the rest of the cell mass). The body content of K and S largely exceeds that of magnesium (19 g), iron (4.2 g) and zinc (2.3 g).

TBN and TBK are highly correlated in healthy subjects, and both parameters manifest an age-dependent curvilinear decline with an accelerated decrease after 65 years. Skeletal muscle (SM) undergoes a 15% reduction in size per decade, an involutive process. The trend toward sarcopenia is more marked and rapid in elderly men than in elderly women, decreasing strength and functional capacity. The downward SM slope may be somewhat prevented by physical training or accelerated by supranormal cytokine status, as reported in apparently healthy aged persons suffering low-grade inflammation or in critically ill patients whose muscle mass undergoes proteolysis.

5.  The results of the events described are:

  • Declining generation of hydrogen sulfide (H2S), both from enzymatic sources and from the non-enzymatic reduction of elemental S to H2S.
  • The biogenesis of H2S via non-enzymatic reduction is further inhibited in areas where earth’s crust is depleted in elemental sulfur (S8) and sulfate oxyanions.
  • Elemental S operates as co-factor of several (apo)enzymes critically involved in the control of oxidative processes.

The combination of protein and sulfur dietary deficiencies constitutes a novel clinical entity threatening plant-eating population groups, who have a defective production of Cys, GSH and H2S reductants, explaining the persistence of an oxidative burden.

6. The clinical entity increases the risk of developing:

  • cardiovascular diseases (CVD) and
  • stroke

in plant-eating populations regardless of Framingham criteria and vitamin-B status.
Met molecules supplied by dietary proteins are subjected to transmethylation processes resulting in the release of Hcy, which:

  • either undergoes remethylation (RM) back to Met, or
  • is committed to transsulfuration decay.

Impairment of CβS activity, as described in protein malnutrition, entails supranormal accumulation of Hcy in body fluids, stimulation of RM activity, and maintenance of Met homeostasis. The data show that combined protein and S deficiencies work in concert to deplete Cys, GSH and H2S from their body reserves, hence impeding these reducing molecules from properly facing the oxidative stress imposed by hyperhomocysteinemia.
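The argument can be caricatured as a one-pool steady state: Hcy is produced by transmethylation and removed by remethylation plus CβS-mediated transsulfuration, so impairing CβS necessarily raises the steady-state Hcy level. The rate constants below are illustrative, not physiological:

```python
# Toy steady-state sketch of homocysteine (Hcy) disposal, illustrating the
# argument above: Hcy leaves the pool either by remethylation back to Met
# or by CbS-mediated transsulfuration. Rate constants are illustrative.

def steady_state_hcy(production, k_remethylation, k_cbs):
    """Hcy pool size when production balances first-order removal."""
    return production / (k_remethylation + k_cbs)

normal = steady_state_hcy(production=10.0, k_remethylation=1.0, k_cbs=1.0)
deficient = steady_state_hcy(production=10.0, k_remethylation=1.0, k_cbs=0.2)  # impaired CbS
print(normal, deficient)
assert deficient > normal  # impaired transsulfuration raises Hcy, as described
```

However crude, the model captures why hyperhomocysteinemia serves as a biomarker of sulfur deficiency: any drop in the transsulfuration rate constant shows up directly as a higher Hcy pool.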

Although unrecognized up to now, the nutritional disorder is one of the commonest worldwide, reaching top prevalence in populated regions of Southeastern Asia. Increased risk of hyperhomocysteinemia and oxidative stress may also affect individuals suffering from intestinal malabsorption or westernized communities having adopted vegan dietary lifestyles.

Ingenbleek Y. Hyperhomocysteinemia is a biomarker of sulfur-deficiency in human morbidities. Open Clin. Chem. J. 2009 ; 2 : 49-60.

7. The dysfunctional metabolism in transitional cell transformation

A third development is also important, and possibly related. The transition a cell goes through in becoming cancerous tends to be driven by changes to the cell’s DNA. But that is not the whole story. The large-scale study of metabolic processes going on in cancer cells is being carried out at Oxford, UK, in collaboration with Japanese workers. This thread will extend our insight into the metabolome. Otto Warburg, the pioneer in respiration studies, pointed out in the early twentieth century that most cancer cells get the energy they need predominantly through high utilization of glucose with lower respiration (the metabolic process that breaks down glucose to release energy). This helps the cancer cells deal with the low oxygen levels that tend to be present in a tumor; the tissue reverts to a metabolic profile of anaerobiosis. Studies of the genetic basis of cancer and of dysfunctional metabolism in cancer cells are complementary. Tomoyoshi Soga’s large lab in Japan has been at the forefront of developing the technology for metabolomics research over the past couple of decades (metabolomics being the ugly-sounding term for research that studies all metabolic processes at once, much as genomics is the study of the entire genome).

Their results have led to the idea that some metabolic compounds, or metabolites, when they accumulate in cells, can cause changes to metabolic processes and set cells off on a path towards cancer. The collaborators have published a perspective article in the journal Frontiers in Molecular and Cellular Oncology that proposes fumarate as such an ‘oncometabolite’. Fumarate is a standard compound involved in cellular metabolism. The researchers summarize evidence showing how accumulation of fumarate, when an enzyme goes wrong, affects various biological pathways in the cell. It shifts the balance of metabolic processes and disrupts the cell in ways that could favor the development of cancer. This is of particular interest because fumarate is the intermediate in the TCA cycle that is converted to malate.

Animation of the structure of a section of DNA. The bases lie horizontally between the two spiraling strands. (Photo credit: Wikipedia)

The Keio group is able to label glucose or glutamine, basic biological sources of fuel for cells, and track the pathways cells use to burn up the fuel.  As these studies proceed, they could profile the metabolites in a cohort of tumor samples and matched normal tissue. This would produce a dataset of the concentrations of hundreds of different metabolites in each group. Statistical approaches could suggest which metabolic pathways were abnormal. These would then be the subject of experiments targeting the pathways to confirm the relationship between changed metabolism and uncontrolled growth of the cancer cells.
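Such a triage step might look like the following sketch, in which invented tumor/normal concentration lists are ranked by a simple effect size so the most-shifted metabolites surface first:

```python
# Sketch of the statistical triage described above: given metabolite
# concentrations in tumor vs. matched normal samples, rank metabolites by a
# simple effect size to flag pathways worth follow-up. Data are invented.

from statistics import mean, stdev

profiles = {
    "fumarate": {"tumor": [8.1, 7.9, 9.2, 8.6], "normal": [2.1, 2.4, 1.9, 2.2]},
    "glucose":  {"tumor": [3.0, 2.7, 3.3, 2.9], "normal": [5.1, 4.8, 5.4, 5.0]},
    "alanine":  {"tumor": [4.0, 4.2, 3.9, 4.1], "normal": [4.1, 4.0, 4.2, 3.9]},
}

def effect_size(a, b):
    """Cohen's-d-style score with an averaged standard deviation."""
    pooled = (stdev(a) + stdev(b)) / 2.0
    return (mean(a) - mean(b)) / pooled

ranked = sorted(profiles, key=lambda m: -abs(effect_size(profiles[m]["tumor"],
                                                         profiles[m]["normal"])))
print(ranked)  # metabolites with the largest tumor/normal shift come first
```

A real analysis would use proper hypothesis tests with multiple-testing correction across hundreds of metabolites, but the workflow is the same: rank, flag, then design targeted experiments on the flagged pathways.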


Read Full Post »

Curator: Ritu Saxena, Ph.D.

Introduction and Research Relevance:

Pancreatic ductal adenocarcinoma (PDA) is the fourth leading cause of cancer death in the United States with a median survival of <6 mo and a dismal 5-yr survival rate of 3%–5%. The cancer’s lethal nature stems from its propensity to rapidly disseminate to the lymphatic system and distant organs. This aggressive biology and resistance to conventional and targeted therapeutic agents leads to a typical clinical presentation of incurable disease at the time of diagnosis.

Also, it has been well documented that, despite much progress in its molecular characterization, PDA remains a lethal malignancy.

A recent article published in the journal Nature describes the discovery of a link between a gene and the prognosis of PDA; the discovery might have therapeutic relevance.

Although previous work had attributed a pro-survival role to USP9X in human neoplasia, the researchers found instead that loss of Usp9x protects pancreatic cancer cells from death. Thus, the study proposed USP9X to be a major tumour suppressor gene with prognostic and therapeutic relevance in PDA.

News brief:

29 April 2012

Gene against pancreatic cancer discovered

Study points to potential new treatment for deadly pancreatic cancer

In a study published in Nature (Sunday 29 April), researchers have identified a potential new therapeutic target for pancreatic cancer.

The team found that when a gene involved in protein degradation is switched off through chemical tags on the DNA’s surface, pancreatic cancer cells are protected from the body’s natural cell death processes, become more aggressive, and can rapidly spread.

Pancreatic cancer kills around 8,000 people every year in the UK and, although survival rates are gradually improving, fewer than 1 in 5 patients survive for a year or more following their diagnosis.

Co-lead author Professor David Tuveson, from Cancer Research UK’s Cambridge Research Institute, said: “The genetics of pancreatic cancer has already been studied in some detail, so we were surprised to find that this gene hadn’t been picked up before. We suspected that the fault wasn’t in the genetic code at all, but in the chemical tags on the surface of the DNA that switch genes on and off, and by running more lab tests we were able to confirm this.”

The team expects this gene, USP9X, could be faulty in up to 15 per cent of pancreatic cancers, raising the prospect that existing drugs, which strip away these chemical tags, could be an effective way of treating some pancreatic cancers.

“This study strengthens our emerging understanding that we must also look into the biology of cells to identify all the genes that play a role in cancer.” – Dr David Adams

“Drugs which strip away these tags are already showing promise in lung cancer and this study suggests they could also be effective in treating up to 15 per cent of pancreatic cancers,” continues Professor Tuveson.

The researchers used a mouse model of pancreatic cancer to screen for genes that speed up pancreatic cancer growth, using a technique called ‘Sleeping Beauty transposon mutagenesis’. This system uses mobile genetic elements that hop around the cell’s DNA from one location to the next. Cells that acquire mutations in genes that contribute to cancer development will grow out, and ‘driver’ cancer genes may thereby be identified.

By introducing the Sleeping Beauty transposon into mice pre-disposed to develop pancreatic cancer, the researchers were able to screen for a class of genes called a tumour suppressor that, under normal circumstances, would protect against cancer. These genes are a bit like the cell’s ‘brakes’, so when they become faulty there is little to stop the cell from multiplying out of control.
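The analysis behind such a screen reduces, at its core, to counting how many independent tumors carry an insertion in each gene; recurrently hit genes become candidates. A minimal sketch with invented data (gene names other than Usp9x are placeholders):

```python
# Sketch of the tallying step in a Sleeping Beauty screen: transposon
# insertion sites from many tumors are counted per gene, and genes hit in
# many independent tumors are nominated as candidate drivers (or, when the
# insertions disrupt the gene, tumour suppressors). Data are invented.

from collections import Counter

# (tumor_id, gene hit by a transposon insertion)
insertions = [
    (1, "Usp9x"), (1, "GeneA"), (2, "Usp9x"), (3, "Usp9x"),
    (3, "GeneB"), (4, "Usp9x"), (4, "GeneA"), (5, "Usp9x"),
]

def common_insertion_genes(insertions, min_tumors=3):
    """Genes hit in at least `min_tumors` independent tumors."""
    tumors_per_gene = Counter()
    for tumor, gene in set(insertions):  # count each tumor/gene pair once
        tumors_per_gene[gene] += 1
    return [g for g, n in tumors_per_gene.items() if n >= min_tumors]

print(common_insertion_genes(insertions))  # the recurrently hit gene stands out
```

Real screens add a statistical model of the transposon's insertion-site preferences so that large or insertion-prone genes are not flagged spuriously, but the recurrence count is the starting signal.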

This approach uncovered many genes already linked to pancreatic cancer. But unexpectedly, USP9X, was identified.

Co-lead author Dr David Adams, from the Wellcome Trust Sanger Institute, said: “The human genome sequence has delivered many promising new leads and transformed our understanding of cancer. Without it, we would have only a small, shattered glimpse into the causes of this disease. This study strengthens our emerging understanding that we must also look into the biology of cells to identify all the genes that play a role in cancer.”

Read Full Post »