DNA Structure and Oligonucleotides

Curator: Larry H Bernstein, MD, FCAP

Word Cloud by Zach Day

Triplex Medical Science
Expert, Author, Writer, Leaders in Pharmaceutical Business Intelligence
http:/pharmaceuticalintelligence.com/DNA_structure_ and_ Oligonucleotides

A section of DNA; the sequence of the plate-like units (nucleotides) in the center carries information. (Photo credit: Wikipedia)


DNA (Photo credit: Allen Gathman)

Triplex DNA

1. A Third Strand for DNA

The DNA double helix can under certain conditions accommodate a third strand in its major groove. Researchers in the UK have now presented a complete set of

  • four variant nucleotides that make it possible to use this phenomenon in gene regulation and mutagenesis.

Natural DNA only forms a triplex if the targeted strand is rich in purines – guanine (G) and adenine (A) – which in addition to the bonds of the Watson-Crick base pairing can form two further hydrogen bonds, and

  • the ‘third strand’ oligonucleotide has the matching sequence of pyrimidines – cytosine (C) and thymine (T).

Any Cs or Ts in the target strand of the duplex will only bind very weakly, as they contribute just one hydrogen bond. Moreover, the recognition of G requires the C in the probe strand to be protonated, so triplex formation will only work at low pH.
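
The pairing rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `design_pyrimidine_tfo`, the example target sequence, and the use of `N` to mark positions that natural bases read poorly are all hypothetical, and the third strand is simply written position by position against the purine-rich target strand.

```python
# Third-strand base chosen for each base of the purine-rich target strand
# (natural pyrimidine third strand: T reads A, protonated C reads G).
THIRD_STRAND_BASE = {"A": "T", "G": "C"}


def design_pyrimidine_tfo(target):
    """Return (tfo, weak_positions, protonated_c_count) for a target strand.

    `target` is the purine-rich strand of the duplex. Positions holding
    C or T bind only weakly to natural third-strand bases, so they are
    listed in `weak_positions` and marked 'N' (a hypothetical placeholder).
    """
    target = target.upper()
    tfo = []
    weak_positions = []
    for index, base in enumerate(target):
        if base in THIRD_STRAND_BASE:
            tfo.append(THIRD_STRAND_BASE[base])
        elif base in "CT":
            tfo.append("N")
            weak_positions.append(index)
        else:
            raise ValueError(f"unexpected base {base!r} at position {index}")
    tfo_sequence = "".join(tfo)
    # Every C in the third strand must be protonated to recognise G,
    # which is why triplexes of natural bases favour low pH.
    return tfo_sequence, weak_positions, tfo_sequence.count("C")


# Made-up 13-base target with a single pyrimidine interruption.
tfo, weak, protonated = design_pyrimidine_tfo("AAGGAGAAGCAGA")
print(tfo, weak, protonated)
```

The more Cs the third strand needs, the more strongly binding depends on pH, and each entry in `weak` is a position where a modified nucleotide would be required.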

To overcome all these problems, the groups of Tom Brown and Keith Fox at the University of Southampton have developed modified building blocks, and have now completed a set of

  • four new nucleotides, each of which will bind to one DNA nucleotide from the major groove of the double helix.1

They tested the binding of a 19-mer of these designer nucleotides to a double helix target sequence in comparison with the corresponding triplex-forming oligonucleotide made from natural DNA bases. Using fluorescence-monitored thermal melting and DNase I footprinting, the researchers showed that their construct forms a stable triplex even at neutral pH. Tests with mutated versions of the target sequence showed that

  • three of the novel nucleotides are highly selective for their target base pair,
  • while the ‘S’ nucleotide, designed to bind to T, also tolerates C.

In principle, triplex formation has already been demonstrated as a way of inducing mutations in cell cultures and animal experiments.2

Michael Gross


1 DA Rusling et al, Nucleic Acids Res. 2005, 33, 3025     http://NucleicAcidsRes.com/Rusling_DA
2 KM Vasquez et al, Science 2000, 290, 530   http://Science.org/Vazquez_KM

2. Triplex DNA Structures

Triplex DNA Structures. Frank-Kamenetskii MD, Mirkin SM. Annu Rev Biochem 1995; 64:65-95. www.annualreviews.org/aronline

Since the pioneering work of Felsenfeld, Davies, & Rich (1), double-stranded polynucleotides containing

  • purines in one strand
  • and pyrimidines in the other strand [such as poly(A)/poly(U), poly(dA)/poly(dT), or poly(dAG)/poly(dCT)]

have been known to be able to undergo a stoichiometric transition forming a triple-stranded structure containing one polypurine and two polypyrimidine strands. Early on, it was assumed that the third strand was located in the major groove and associated with the duplex via non-Watson-Crick interactions now known as Hoogsteen pairing.

H-DNA triplex

Triple helices consisting of one pyrimidine and two purine strands were also proposed. However, notwithstanding the fact that single-base triads in tRNA structures were well-documented, triple-helical DNA escaped wide attention before the mid-1980s.

The considerable modern interest in DNA triplexes arose due to two partially independent developments.

First, homopurine-homopyrimidine stretches in supercoiled plasmids were found to adopt an unusual DNA structure, called H-DNA, which includes a triplex as the major structural element.

Secondly, several groups demonstrated that homopyrimidine and some purine-rich oligonucleotides can form stable and sequence-specific complexes with corresponding homopurine-homopyrimidine sites on duplex DNA. These complexes were shown to be triplex structures rather than D-loops, where the oligonucleotide invades the double helix and displaces one strand.

A characteristic feature of all these triplexes is that the two chemically homologous strands (both pyrimidine or both purine) are antiparallel. These findings led to explosive growth in triplex studies.

One can easily imagine numerous “geometrical” ways to form a triplex, several of which have been studied experimentally. The canonical intermolecular triplex consists of either

  • three independent oligonucleotide chains, or
  • a long DNA duplex carrying a homopurine-homopyrimidine insert, together with the corresponding oligonucleotide.

Triplex formation strongly depends on the oligonucleotide(s) concentration. A single DNA chain may also fold into a triplex connected by two loops. To comply with the sequence and polarity requirements for triplex formation, such a DNA strand must have a peculiar sequence: it contains a mirror repeat (homopyrimidine for YR*Y triplexes and homopurine for YR*R triplexes) flanked by a sequence complementary to one half of this repeat.

Such DNA sequences fold into triplex configuration much more readily than do the corresponding intermolecular triplexes, because all triplex forming segments are brought together within the same molecule.
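
The mirror-repeat requirement can be checked mechanically. The sketch below is a minimal illustration, not a published algorithm: the helper names `strand_class`, `is_mirror_repeat` and `triplex_motif` are hypothetical, and it tests only the mirror repeat itself, ignoring the loops and the flanking complementary segment.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")


def strand_class(seq):
    """Return 'homopurine', 'homopyrimidine' or 'mixed' for a DNA strand."""
    bases = set(seq.upper())
    if bases <= PURINES:
        return "homopurine"
    if bases <= PYRIMIDINES:
        return "homopyrimidine"
    return "mixed"


def is_mirror_repeat(seq):
    """True if the strand reads the same in both directions.

    Unlike a palindrome in the molecular-biology sense, no complementing
    is involved: the sequence is compared with its plain reversal.
    """
    seq = seq.upper()
    return len(seq) > 1 and seq == seq[::-1]


def triplex_motif(seq):
    """Name the triplex type a mirror repeat could support, or None."""
    if not is_mirror_repeat(seq):
        return None
    kind = strand_class(seq)
    if kind == "homopyrimidine":
        return "YR*Y"
    if kind == "homopurine":
        return "YR*R"
    return None


# Made-up example sequences.
print(triplex_motif("TCCTCTTCTCCT"))  # homopyrimidine mirror repeat
print(triplex_motif("AGGAGGA"))       # homopurine mirror repeat
print(triplex_motif("GAATTC"))        # inverted repeat, not a mirror repeat
```

Note that an inverted repeat such as GAATTC, which equals its own reverse complement, fails the test: a mirror repeat is symmetric within one strand.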

formation of triplex DNA

It has become clear recently, however, that both sequence requirements and chain polarity rules for triplex formation can be met by DNA target sequences built of clusters of purines and pyrimidines. The third strand consists of adjacent homopurine and homopyrimidine blocks forming Hoogsteen hydrogen bonds with purines on alternate strands of the target duplex, and this strand switch preserves the proper chain polarity. These structures, called alternate-strand triplexes, have been experimentally observed as both intra- and intermolecular triplexes.

These results increase the number of potential targets for triplex formation in natural DNAs somewhat by adding sequences composed of purine and pyrimidine clusters, although arbitrary sequences are still not targetable because strand switching is energetically unfavorable.


Lyamichev VI, Mirkin SM, Frank-Kamenetskii MD. J. Biomol. Struct. Dyn. 1986; 3:667-69. http://JbiomolStractDyn.com/Lyamichev_VI/
Mirkin SM, Lyamichev VI, Drushlyak KN, Dobrynin VN, Filippov SA, Frank-Kamenetskii MD. Nature 1987; 330:495-97. http://Nature.com/
Demidov V, Frank-Kamenetskii MD, Egholm M, Buchardt O, Nielsen PE. Nucleic Acids Res. 1993; 21:2103-7. http://NucleicAcidsResearch.com/
Mirkin SM, Frank-Kamenetskii MD. Annu. Rev. Biophys. Biomol. Struct. 1994; 23:541-76. http://AnnRevBiophysBiomolecStructure.com/
Hoogsteen K. Acta Crystallogr. 1963; 16:907-16. http://ActaCrystallogr.com/
Malkov VA, Voloshin ON, Veselkov AG, Rostapshov VM, Jansen I, et al. Nucleic Acids Res. 1993; 21:105-11. http://NucleicAcidsResearch.com/
Malkov VA, Voloshin ON, Soyfer VN, Frank-Kamenetskii MD. Nucleic Acids Res. 1993; 21:585-91.
Cherny DY, Belotserkovskii BP, Frank-Kamenetskii MD, Egholm M, Buchardt O, et al. Proc. Natl. Acad. Sci. USA 1993; 90:1667-70. http://PNAS.org/

3. Triplex forming Oligonucleotides

Triplex forming oligonucleotides: sequence-specific tools for genetic targeting. Knauert MP, Glazer PM. Human Molec Genetics 2001; 10(20):2243-2251.

http://HumanMolecGenetics.com/Triplex_forming_oligonucleotides: sequence-specific_tools_for _genetic_targeting.

Triplex forming oligonucleotides (TFOs) bind in the major groove of duplex DNA with a high specificity and affinity. Because of these characteristics, TFOs have been proposed as homing devices for genetic manipulation in vivo.

These investigators review work demonstrating the ability of TFOs and related molecules to

  • alter gene expression and
  • mediate gene modification in mammalian cells.

TFOs can also mediate targeted gene knockout in mice, providing a foundation for potential application of these molecules in human gene therapy.

formation of a triplex DNA structure

4. Novagon DNA

John Allen Berger, founder of Novagon DNA and The Triplex Genetic Code
Over the past 12+ years, Novagon DNA has amassed a vast array of empirical findings which challenge the “validity” of the “central dogma theory”, especially the current five nucleotide Watson-Crick DNA and RNA genetic codes. DNA = A1T1G1C1, RNA =A2U1G2C2.
We propose that our new Novagon DNA 6 nucleotide Triplex Genetic Code has more validity than the existing 5 nucleotide (A1T1U1G1C1) Watson-Crick genetic codes. Our goal is to conduct a “world class” validation study to replicate and extend our findings.

Triplex DNA Structures

Maxim D. Frank-Kamenetskii, Sergei M. Mirkin

A DNA triplex is formed when pyrimidine or purine bases occupy the major groove of the DNA double helix, forming Hoogsteen pairs with purines of the Watson-Crick base pairs. Intermolecular triplexes are formed between triplex forming oligonucleotides (TFO) and target sequences on duplex DNA. Intramolecular triplexes are the major elements of H-DNAs, unusual DNA structures, which are formed in homopurine-homopyrimidine regions of supercoiled DNAs. TFOs are promising gene-drugs, which can be used in an anti-gene strategy that attempts to modulate gene activity in vivo. Numerous chemical modifications of TFO are known. In peptide nucleic acid (PNA), the sugar-phosphate backbone is replaced with a protein-like backbone. PNAs form P-loops while interacting with duplex DNA, forming a triplex with one of the DNA strands and leaving the other strand displaced. Very unusual recombination or parallel triplexes, or R-DNA, have been assumed to form under RecA protein in the course of homologous recombination.

Perspectives and Summary

Since the pioneering work of Felsenfeld, Davies, & Rich (1), double-stranded polynucleotides containing purines in one strand and pyrimidines in the other strand [such as poly(A)/poly(U), poly(dA)/poly(dT), or poly(dAG)/poly(dCT)] have been known to be able to undergo a stoichiometric transition forming a triple-stranded structure containing one polypurine and two polypyrimidine strands (2-4). Early on, it was assumed that the third strand was located in the major groove and associated with the duplex via non-Watson-Crick interactions now known as Hoogsteen pairing. Triple helices consisting of one pyrimidine and two purine strands were also proposed (5, 6). However, notwithstanding the fact that single-base triads in tRNA structures were well-documented (reviewed in 7), triple-helical DNA escaped wide attention before the mid-1980s.
The considerable modern interest in DNA triplexes arose due to two partially independent developments. First, homopurine-homopyrimidine stretches in supercoiled plasmids were found to adopt an unusual DNA structure, called H-DNA, which includes a triplex as the major structural element (8, 9). Secondly, several groups demonstrated that homopyrimidine and some purine-rich oligonucleotides can form stable and sequence-specific complexes with corresponding homopurine-homopyrimidine sites on duplex DNA (10-12). These complexes were shown to be triplex structures rather than D-loops, where the oligonucleotide invades the double helix and displaces one strand. A characteristic feature of all these triplexes is that the two chemically homologous strands (both pyrimidine or both purine) are antiparallel. These findings led to explosive growth in triplex studies.
During the study of intermolecular triplexes, it became clear that triplex-forming oligonucleotides (TFOs) might be universal drugs that exhibit sequence-specific recognition of duplex DNA. This is an exciting possibility because, in contrast to other DNA-binding drugs, the recognition principle of TFOs is very simple: Hoogsteen pairing rules between a purine strand of the DNA duplex and the TFO bases. However, this mode of recognition is limited in that homopurine-homopyrimidine sites are preferentially recognized. Though significant efforts have been directed toward overcoming this limitation, the problem is still unsolved in general. Nevertheless, the high specificity of TFO-DNA recognition has led to the development of an “antigene” strategy, the goal of which is to modulate gene activity in vivo using TFOs (reviewed in 13).
Although numerous obstacles must be overcome to reach the goal, none are likely to be fatal for the strategy. Even if DNA TFOs proved to be unsuitable as gene-drugs, there are already many synthetic analogs that also exhibit triplex-type recognition. Among them are oligonucleotides with non-natural bases capable of binding the duplex more strongly than can natural TFOs.
Another promising modification replaces the sugar-phosphate backbone of ordinary TFO with an uncharged peptide-like backbone, called a peptide nucleic acid (PNA) (reviewed in 14). Homopyrimidine PNAs form remarkably strong and sequence-specific complexes with the DNA duplex via an unusual strand displacement reaction: Two PNA molecules form a triplex with one of the DNA strands, leaving the other DNA strand displaced (a “P-loop”) (15, 16). The ease and sequence specificity with which duplex DNA and TFOs formed triplexes seemed to support the idea (17) that the homology search preceding homologous recombination might occur via a triplex between a single DNA strand and the DNA duplex without recourse to strand separation in the duplex.
However, these proposed recombination triplexes are dramatically different from the orthodox triplexes observed experimentally. First, the recombination triplexes must be formed for arbitrary sequences and, second, the two identical strands in this triplex are parallel rather than antiparallel. Some data supported the existence of a special class of recombination triplexes, at least within the complex among duplex DNA, RecA protein, and single-stranded DNA (reviewed in Ref. 18), called R-DNA. A stereochemical model of R-DNA was published (19). However, the structure of the recombination intermediate is far from being understood, and some recent data strongly favor the traditional model of homology search via local strand separation of the duplex and D-loop formation mediated by RecA protein. Intramolecular triplexes (H-DNA) are formed in vitro under superhelical stress in homopurine-homopyrimidine mirror repeats. The average negative supercoiling in the cell is not sufficient to induce H-DNA formation in most cases.
Annu. Rev. Biochem. 1995; 64:65-95

Doubling down: four-stranded, ‘quadruple helix’ DNA discovered

Published January 21, 2013
Quadruplex DNA strands are seen at left, while fluorescent stains at right reveal their presence in human cell nuclei and chromosomes. (Jean-Paul Rodriguez and Giulia Biffi)
60 years after scientists first described the “double helix” shape of human DNA, the chemical code of life, scientists have discovered the first quadruple helix — and it may help them prevent the runaway cell proliferation at the root of cancer.
“It’s been sixty years since its structure was solved but work like this shows us that the story of DNA continues to twist and turn,” said Julie Sharp, senior science information manager at Cancer Research UK.
The research, published Monday in the science journal Nature Chemistry, shows clearly a four-stranded DNA structure that the scientists dubbed a “G-quadruplex.” The name comes from the building block guanine, one of the chemical bases that form DNA, along with adenine, cytosine, and thymine (usually abbreviated to their first letter).
By targeting these DNA oddities with synthetic molecules that trap and contain them — preventing cells from replicating their DNA and consequently blocking cell division — it may be possible to halt the spread of cancer, the researchers said.
“We are seeing links between trapping the quadruplexes with molecules and the ability to stop cells dividing, which is hugely exciting,” said professor Shankar Balasubramanian from the University of Cambridge’s Department of Chemistry and Cambridge Research Institute, whose group produced the research.
“We’ve come a long way in 10 years, from simple ideas to really seeing some substance in the existence and tractability of targeting these funny structures,” he told the BBC.
“I’m hoping now that the pharmaceutical companies will bring this on to their radar and we can perhaps take a more serious look at whether quadruplexes are indeed therapeutically viable targets.”

quadruple helix dna

Electrochemical Determination of Triple Helices: Electrocatalytic Oxidation of Guanine in an Intramolecular Triplex

Rebecca C. Holmberg and H. Holden Thorp

Probing the Solvent Accessibility and Electron Density of Adenine:  Oxidation of 7-Deazaadenine in Bent DNA and Purine Doublets
Jennifer D. Tibodeau and H. Holden Thorp

Triplex DNA: fundamentals, advances, and potential applications for gene therapy

Phillip P. Chan, P. M. Glazer

 The ability to target specific sequences of DNA through oligonucleotide-based triple-helix formation provides a powerful tool for genetic manipulation. Under experimental conditions, triplex DNA can inhibit DNA transcription and replication, generate site-specific mutations, cleave DNA, and induce homologous recombination. This review describes the binding requirements for triplex formation, surveys recent advancements in the chemistry and biology of triple helices, and considers several potential applications of triplex DNA for use in genetic therapy.

A Gold Nanoparticle Based Approach for Screening Triplex DNA Binders

Min Su Han, Abigail K. R. Lytton-Jean, and Chad A. Mirkin*

The publisher’s final edited version of this article is available at J Am Chem Soc
Nanoparticle assemblies interconnected with DNA triple helixes can be used to colorimetrically screen for triplex DNA binding molecules and simultaneously determine their relative binding affinities based on melting temperatures. Nanoparticles assemble only when DNA triple helixes form between DNA from two different particles and a third strand of free DNA. In addition, the triple helix structure is unstable at room temperature and only forms in the presence of triplex DNA binding molecules, which stabilize the triple helix. The resulting melting transition of the nanoparticle assembly is much sharper and at a significantly higher Tm than the analogous triplex structure without nanoparticles. Upon nanoparticle assembly, a concomitant red-to-blue color change occurs. The assembly process and color change do not occur in the presence of duplex DNA binders, and the method therefore provides a significantly better screening process for triplex DNA binding molecules compared to standard methods.
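
To make the idea of ranking binders by melting temperature concrete, here is a minimal sketch. It is not the authors' analysis: it assumes each candidate yields a melting curve as paired temperature and signal readings, that the signal rises as the assembly melts, and that Tm can be approximated as the midpoint of the steepest rise. The function names, compound names and all numbers are invented for illustration.

```python
def melting_temperature(temps, signal):
    """Estimate Tm as the midpoint of the steepest rise in a melting curve."""
    if len(temps) != len(signal) or len(temps) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    best_slope = float("-inf")
    best_midpoint = None
    for i in range(len(temps) - 1):
        slope = (signal[i + 1] - signal[i]) / (temps[i + 1] - temps[i])
        if slope > best_slope:
            best_slope = slope
            best_midpoint = (temps[i] + temps[i + 1]) / 2
    return best_midpoint


def rank_binders(curves):
    """Sort candidate names by estimated Tm, highest (most stabilising) first."""
    tms = {name: melting_temperature(t, s) for name, (t, s) in curves.items()}
    return sorted(tms.items(), key=lambda item: item[1], reverse=True)


# Invented readings: temperature in degrees C, signal in arbitrary units.
temperatures = [20, 30, 40, 50, 60]
curves = {
    "compound_x": (temperatures, [0.10, 0.12, 0.50, 0.90, 0.92]),
    "compound_y": (temperatures, [0.10, 0.60, 0.90, 0.92, 0.93]),
}
ranking = rank_binders(curves)
print(ranking)
```

A sharper transition, as reported for the nanoparticle assemblies, makes the steepest interval easier to locate and so makes this kind of comparison more reliable.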
Regulating gene expression by controlling nucleic acid transcription is a potential strategy for the treatment of genetic-based diseases. A promising approach involves the use of triplex forming oligonucleotides (TFOs).1 Triple helix nucleic acids, or triplex structures, are formed through sequence specific Hoogsteen, or reverse Hoogsteen, hydrogen bond formation between a single-stranded TFO and purine bases in the major groove of a target duplex.2 Because TFOs can achieve sequence-specific recognition of genomic DNA, they can, in principle, be used to modulate gene expression by interfering with transcription factors that bind to DNA. However, at present, only purine-rich sequences can be targeted and the resultant triplex structure is less stable than the analogous duplex. This inherent instability has prompted research efforts to develop molecules that selectively bind to such triplex structures to stabilize the TFO-duplex complex. Potentially, triplex specific binding molecules could be used in conjunction with TFOs to achieve control of gene expression.3 Molecules identified as triplex binders include benzoindoloquinoline, benzopyridoquinoxaline, naphthyquinoline, acridine, and anthraquinone derivatives.4 In the past, typical screening processes for identifying triplex binders have included competitive dialysis, mass spectroscopy, electrophoresis and UV/Vis melting experiments, most of which are not applicable to high-throughput screening processes.5 However, with the development of combinatorial libraries which can produce large numbers of potential drug candidates, high-throughput screening strategies have become a necessary part of drug development.6

Systems Integrated Biomedical Research

 Cutting a SWATH through Personalized Medicine

The Institute for Systems Biology (ISB) signed a multi-year agreement with AB Sciex to collaborate on the development of methods and technology in proteomics mass spectrometry, with the goal of redefining biomarker research and complementing genomics through quantitative proteomics analysis. The aim is to help advance the development of a new approach to medical care.
Led by ISB president and co-founder Leroy Hood, M.D., Ph.D., ISB’s research is being accelerated by SWATH™ Acquisition, a data-independent acquisition (DIA) mass spectrometry workflow that reportedly can quantify virtually all detectable peptides and proteins in a sample from a single analysis. ISB will be using the AB Sciex TripleTOF® 5600+ System and an Eksigent ekspert™ nano-LC 400 System as the instrument platforms on which to conduct the protein identification and quantitation. The TripleTOF 5600+ System can reportedly provide the high speed necessary for SWATH Acquisition. ISB also plans to use SelexION™ technology, a recent advancement in differential ion mobility, in the future to advance its research.
“SWATH is a game-changing technique that essentially acts as a protein microarray and is the most reproducible way to generate comprehensive quantitation of the entire proteome,” says Dr. Hood. “It generates a digital record of the entire proteome that can be mined retrospectively for years to come.”
ISB shall support the development of SWATH libraries similar to its SRMAtlas project for the human proteome, pioneered by Rob Moritz, Ph.D., and his collaborators, and the proteomes of other clinically relevant organisms. “With complete proteome-wide libraries, ISB provides the basis to support comprehensive SWATH analysis,” said Dr. Moritz, who is ISB’s proteomics research director.
ISB aims to make the SWATH libraries available to the global scientific community to accelerate the use of SWATH for other biological research. ISB will develop new SWATH technologies and tools to enable the community to adopt comprehensive quantitative proteome analysis.
“Having the proteomics data standardized across laboratories and across samples really enables us to quantitate entire proteomes at a level that hasn’t been done before,” said Dr. Moritz. “We aim to define markers that can predict whether a patient will respond to a certain treatment or not, and applying SWATH will play a big part in taking our advancements to another level. Not only can we now complement the breadth of genomics, but we will have the much-needed libraries and software development going forward to make data-sharing quite easier and standardized.”
AB Sciex forged this alliance with ISB through the AB Sciex Academic Partnership Program to help broaden the availability of new technologies to researchers delving into OMICS research around the world.
“What ISB does with SWATH will set a new benchmark in proteomics research,” said Rainer Blair, president of AB Sciex. “Our collaboration with ISB will help drive SWATH into the mainstream of analytical science and make comprehensive, reproducible and simplified omics data more accessible to biologists around the world.”
SWATH Acquisition was first made available to the worldwide scientific community back in April through a collaboration between AB Sciex and ETH Zurich.

 Genetics and Biophysics for Large Volumes of Data

Researchers used an interdisciplinary approach combining genetics and biophysics. “It is the first analysis to combine all known protein structures and genomes with folding rates as a physical parameter,” says Dr. Gräter.
The analysis of 92,000 proteins and 989 genomes can only be tackled with computational methods. The group of Gustavo Caetano-Anolles, head of the Evolutionary Bioinformatics Laboratory at Urbana-Champaign, had originally classified most structurally known proteins from the Protein Data Bank (PDB) according to age. For this study, Minglei Wang in his laboratory identified protein sequences in the genomes that had the same folding structure as the known proteins. He then applied an algorithm to compare them to each other on a time scale. In this way, it is possible to determine which proteins became part of which organism and when. After that, Cedric Debes, a member of Dr. Gräter’s group, applied a mathematical model to predict the folding rate of proteins. The individual folding steps differ in speed and can take from nanoseconds to minutes. No microscope or laser would be able to capture these different time scales for so many proteins. A computer simulation calculating all folding structures in all proteins would take centuries to run on a mainframe computer. This is why the researchers worked with a less data-intensive method. They calculated the folding speed of the single proteins using structures that have been previously determined in experiments: A protein always folds at the same points. If these points are far apart from each other, it takes longer to fold than if they lie close to each other. With the so-called Size-Modified Contact Order (SMCO), it is possible to predict how fast these points will meet and thus how fast the protein will fold, regardless of its length.
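
The text does not give the SMCO formula, so the sketch below does not attempt it. It computes the plain relative contact order, a simpler related quantity: the mean sequence separation of residues in contact, divided by chain length. The one-point-per-residue coordinates, the distance cutoff, and the function name are assumptions made for illustration only.

```python
import math


def relative_contact_order(coords, cutoff=8.0, min_separation=1):
    """Relative contact order from a list of (x, y, z) residue coordinates.

    Two residues i < j count as a contact when their distance is below
    `cutoff` (an assumed threshold). The result is the mean sequence
    separation j - i over all contacts, divided by the chain length.
    """
    length = len(coords)
    total_separation = 0
    contacts = 0
    for i in range(length):
        for j in range(i + min_separation, length):
            if math.dist(coords[i], coords[j]) < cutoff:
                total_separation += j - i
                contacts += 1
    if contacts == 0:
        return 0.0
    return total_separation / (contacts * length)


# Toy chains of four residues spaced 3.8 units apart (invented geometry).
extended = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
folded = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]

print(relative_contact_order(extended, cutoff=4.0))
print(relative_contact_order(folded, cutoff=4.0))
```

The folded toy chain brings its two ends together, so a contact far apart in sequence raises the value. This mirrors the statement that contact points lying far apart along the chain take longer to meet.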
 “Our results show that in the beginning there were proteins which could not fold very well,” Dr. Gräter summarizes. “Over time, nature improved protein folding so that eventually, more complex structures such as the many specialized proteins of humans were able to develop.”

Researchers develop Compilation of Protein Interaction Data

Posted on January 2, 2013 by grathbone

 Researchers have created a platform detailing all atomic data on protein structures and protein interactions for eight organisms. Applying a singular homology-based modelling procedure they have brought together the information previously stored in diverse databases.  Interactome3D has been compiled by scientists Roberto Mosca, Arnaud Ceol and Patrick Aloy as an open-access, free web platform as part of their work at the Institute for Research in Biomedicine.
 For the first time the platform offers anonymous access to molecular details of protein interaction and 3D models. It means that researchers can easily find the atomic level detail that is fundamental to new discoveries in biology and pharmaceuticals.
 Information on more than 12,000 protein interactions for eight model organisms – the plant Arabidopsis thaliana, the worm Caenorhabditis elegans, the fly Drosophila melanogaster, the bacteria Escherichia coli and Helicobacter pylori, the brewer’s yeast Saccharomyces cerevisiae, the mouse Mus musculus, and Homo sapiens – is included. These eight models are the most relevant for biomedical and genetic research.
 Patrick Aloy, ICREA researcher at IRB Barcelona, said: “We have designed Interactome3D for molecular and cellular biologists. It is a well organised non-technical interface that presents the results in a simple manner.
 With only a few clicks of the mouse, you can get the information you are looking for and you don’t have to be a bioinformatician to navigate around the platform, to look things up or to interpret the results.”
 The platform is the result of more than four years of lab experience and collaboration, and the information it contains will be updated every six months, with up to 16,000 protein interaction details expected to be available soon.

Dual coding in alternative reading frames correlates with intrinsic protein disorder

Erika Kovacs, Peter Tompa, Karoly Liliom, and Lajos Kalmar

Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, Karolina ut 29, H-1113 Budapest, Hungary
Edited* by Ada Yonath, Weizmann Institute, Rehovot, Israel, and approved January 29, 2010 (received for review July 14, 2009)
Numerous human genes display dual coding within alternatively spliced regions, which give rise to distinct protein products that include segments translated in more than one reading frame. To resolve the ensuing protein structural puzzle, we identified 67 human genes with alternative splice variants comprising a dual-coding region at least 75 nucleotides in length and analyzed the structural status of the protein segments they encode. The inspection of their amino acid composition and predictions by the IUPred and PONDR® VSL2 algorithms suggest a high propensity for structural disorder in dual-coding regions. In the case of +1 frameshifts, the average level of disorder in the two frames is similarly high (47.2% in the ancestral frame, 58.2% in the derived frame, with the average level of disorder in human proteins being approximately 30%), whereas in the case of −1 frameshifts, there is a significant tendency to become more disordered upon shifting the frame (16.7% in the ancestral frame, 56.3% in the derived frame).
The regions encoded by the derived frame are mostly disordered (disorder percentage >50%) in 39 out of 62 cases, which strongly suggests that structural disorder enables these protein products to exist and function without the need of a highly evolved 3D fold.
The potential advantages are also demonstrated by the appearance of novel functions and the high incidence of transcripts escaping nonsense-mediated decay. By discussing several examples, we demonstrate that dual coding may be an effective mechanism for the evolutionary appearance of novel intrinsically disordered regions with new functions.
Alternative splicing ∣ nonsense-mediated decay ∣ unstructured protein
The process of alternative splicing (AS), in which different combinations of exons are joined together in mRNA maturation, enables several protein isoforms to be encoded by a single gene (1, 2). It is estimated that more than 75% of mammalian genes are alternatively spliced (1, 3) and in about 50% of all AS events the reading frame is altered (4), i.e., a certain stretch of DNA has the potential to be translated in different reading frames. The use of such alternative reading frames (ARFs), however, is often suppressed by a premature termination codon (PTC) that results in nonsense-mediated decay (NMD) of the mRNA product (5, 6). In mammals, a stop codon followed by an exon–exon junction more than 50–55 nucleotides downstream is recognized as a PTC (7) that regulates gene expression and/or acts as a surveillance mechanism against potentially harmful protein products.
A major concern with dual-coding in ARFs is that it gives rise to two intertwined polypeptide sequences which are highly unlikely to both result in two properly folded functional proteins. Thus, dual-coding has long been thought to be prevalent only in viruses and prokaryotes that are under pressure to maintain a compact genome (8, 9). Only relatively recently, results on functional pairs of proteins derived from ARFs (10–16) and bioinformatic studies of conserved overlapping open reading frames (ORFs) (16–19) have pointed to the likely importance of the use of ARFs in eukaryotes.
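
How one stretch of DNA yields different polypeptides in different reading frames can be shown with a short sketch. It uses the standard genetic code; the function names and the 12-nucleotide example sequence are invented for illustration, and the frame offsets 0, 1 and 2 are plain start positions, not a claim about which frame is ancestral or derived.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=0):
    """Translate `seq` starting at offset `frame` (0, 1 or 2); '*' marks a stop."""
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


def all_frames(seq):
    """Return the three forward-frame translations keyed by frame offset."""
    return {frame: translate(seq, frame) for frame in range(3)}


# Invented sequence: frame 0 hits a stop codon, the shifted frames do not.
frames = all_frames("ATGGCCTGAAAG")
for offset, peptide in frames.items():
    print(offset, peptide)
```

In this toy case the stop codon seen at offset 0 disappears when the frame shifts, so the same nucleotides encode entirely different residues in each frame.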
An enigmatic issue largely overlooked thus far is the protein structural impact of this phenomenon. Because folding of a polypeptide chain to a unique 3D state is a highly evolved feature ...
www.pnas.org/cgi/doi/10.1073/pnas.0907841107
This article contains supporting information online at www.pnas.org/cgi/content/full/0907841107/DCSupplemental


Functionalized Nucleoside 5′-triphosphates for In Vitro Selection of New Catalytic Ribonucleic Acids

J Matulic-Adamic, AT Daniher, A Karpeisky, P Haeberli, D Sweedler and L Beigelman*
http://BioorgMedChemLet.com/Functionalized Nucleoside 5′-triphosphates for In Vitro Selection of New Catalytic Ribonucleic Acids/
Bioorganic & Medicinal Chemistry Letters 10 (2000) 1299–1302

A series of novel 2′-modified nucleoside 5′-triphosphates was synthesized. The amino, imidazole, and carboxylate functionalities were attached to the 5-position of the pyrimidine base of these molecules through alkynyl and alkyl spacers, respectively. Two different phosphorylation methods were used to optimize the yields of these highly modified triphosphates.
Recently, much attention has been focused on the development of functionalized nucleotides suitable for in vitro selection with the hope of increasing the potential of nucleic acids for binding and catalysis. For RNA in vitro selections modifications should be at the nucleotide level so that they can be incorporated simply and efficiently using RNA polymerase without problematic side reactions associated with synthetic posttranscriptional modification.

English: The structure of DNA, with detail showing the structure of the four bases, adenine, cytosine, guanine and thymine, and the location of the major and minor groove. (Photo credit: Wikipedia)

English: A model of a DNA tetrahedron. Each edge of the tetrahedron is a 20bp DNA duplex, and each vertex is a three-arm junction. In this model each basepair is represented by five pseudo-atoms, representing the two sugars, the two phosphates, and the major groove. The scale bar is 1 nm. (Photo credit: Wikipedia)

From left to right, the structures of A-, B- and Z-DNA. The structure of a DNA molecule depends on its environment. In aqueous environments, including the majority of DNA in a cell, B-DNA is the most common structure. The A-DNA structure dominates in dehydrated samples and is similar to that of double-stranded RNA and DNA/RNA hybrids. Z-DNA is a rarer structure found in DNA bound to certain proteins. (Photo credit: Wikipedia)
