Archive for the ‘Personalized and Precision Medicine & Genomic Research’ Category

 

Reporter: Aviva Lev-Ari, PhD, RN

J Cardiovasc Transl Res. 2012 Sep 7. [Epub ahead of print]

Next Generation Diagnostics in Inherited Arrhythmia Syndromes: A Comparison of Two Approaches.

Ware JS, John S, Roberts AM, Buchan R, Gong S, Peters NS, Robinson DO, Lucassen A, Behr ER, Cook SA.

Source

MRC Clinical Sciences Centre, Imperial College London, London, UK, j.ware@imperial.ac.uk.

Abstract

Next-generation sequencing (NGS) provides an unprecedented opportunity to assess genetic variation underlying human disease. Here, we compared two NGS approaches for diagnostic sequencing in inherited arrhythmia syndromes. We compared PCR-based target enrichment and long-read sequencing (PCR-LR) with in-solution hybridization-based enrichment and short-read sequencing (Hyb-SR). The PCR-LR assay comprehensively assessed five long-QT genes routinely sequenced in diagnostic laboratories and “hot spots” in RYR2. The Hyb-SR assay targeted 49 genes, including those in the PCR-LR assay. The sensitivity for detection of control variants did not differ between approaches. In both assays, the major limitation was upstream target capture, particularly in regions of extreme GC content. These initial experiences with NGS cardiovascular diagnostics achieved up to 89 % sensitivity at a fraction of current costs. In the next iteration of these assays we anticipate sensitivity above 97 % for all LQT genes. NGS assays will soon replace conventional sequencing for LQT diagnostics and molecular pathology.

PMID: 22956155 [PubMed]
Source: 
http://www.ncbi.nlm.nih.gov/pubmed/22956155

Researchers in the UK have compared a PCR-based and a capture hybridization-based assay for sequencing panels of inherited cardiovascular disease genes and have found both to be suitable for diagnostics in principle, though their sensitivity needs to be optimized.

According to James Ware, a clinical lecturer at Imperial College London, the purpose of the study, published online this month in the Journal of Cardiovascular Translational Research, was to evaluate different approaches for sequencing cardiovascular disease genes, both for molecular diagnosis and for large-scale resequencing research studies.

His group, in the National Institute for Health Research Royal Brompton Cardiovascular Biomedical Research Unit, is interested in a range of inherited heart disease types, including cardiomyopathies and inherited arrhythmia syndromes such as long QT syndrome.

For their study, they compared two next-gen sequencing assays: a PCR-based approach that uses Fluidigm’s Access Array to amplify 96 amplicons in five LQT genes and one other gene, followed by sequencing on the 454 GS Junior; and an in-solution hybridization approach that uses Agilent’s SureSelect to target 49 inherited arrhythmia genes and sequences them on Life Technologies’ SOLiD 4.

The study focused on the sensitivity of the assays, or how well they were able to capture their intended targets, rather than their specificity, or their ability to avoid false positives.

Ware said that at the time of the study, PCR and in-solution capture were the two main target selection methods available. The researchers are still using both approaches but are now employing “a wide range of sequencers” from various providers for both types of assays, including Illumina instruments and Life Tech’s Ion Torrent.

For their comparison, they analyzed 48 samples, of which they sequenced 33 with both approaches and 15 using either one or the other.

The samples included 19 known variants in three disease genes, of which the hybridization-SOLiD method detected 17 and the PCR-454 method 14. Undetected variants were generally in areas that were not well covered, either because of a failure in enrichment or sequencing, or because the alignment was not unique. One variant that was missed by both approaches fell in a very GC-rich region.
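The detection rates reported above translate directly into per-assay sensitivities; a minimal calculation using the counts from the study (19 known variants; 17 and 14 detected):

```python
# Sensitivity = detected known variants / total known variants.
# Counts are taken from the comparison described above.
KNOWN_VARIANTS = 19
detected = {
    "Hyb-SR (hybridization/SOLiD)": 17,
    "PCR-LR (PCR/454)": 14,
}

sensitivity = {assay: n / KNOWN_VARIANTS for assay, n in detected.items()}
for assay, s in sensitivity.items():
    print(f"{assay}: {s:.1%}")  # 89.5% and 73.7%
```

The 17/19 figure is consistent with the “up to 89 % sensitivity” quoted in the abstract.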

Consumables costs for both assays were considerably lower than with Sanger sequencing: While sequencing five genes by Sanger costs more than $700 in consumables, the five-gene PCR/454 assay cost about $55 and the 49-gene hybridization/SOLiD assay cost about $200, according to the study.
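On a per-gene basis the gap is even larger; a quick back-of-the-envelope calculation from the figures above (the Sanger figure is a lower bound, since the study quotes “more than $700”):

```python
# (total consumables cost in USD, number of genes covered)
assays = {
    "Sanger": (700, 5),  # lower bound: study says "more than $700"
    "PCR/454": (55, 5),
    "Hybridization/SOLiD": (200, 49),
}

per_gene = {name: cost / genes for name, (cost, genes) in assays.items()}
for name, c in per_gene.items():
    print(f"{name}: ~${c:.2f} per gene")
```

The 49-gene hybridization panel works out to roughly $4 per gene, versus at least $140 per gene for Sanger.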

Turnaround time is the shortest for Sanger sequencing, which, according to the study, can be done in one day for five genes and 17 samples, not including sample prep. The PCR/454 assay takes about two days for target enrichment and sequencing 48 samples, and the hybridization/SOLiD assay takes about two weeks for sequencing alone, they wrote.

Overall, Ware said, both sequencing approaches performed “reasonably well” and are significantly cheaper than Sanger sequencing. He said that in the UK, molecular diagnosis for inherited cardiovascular disease has traditionally been performed by Sanger, at a cost of approximately £500 to £1,000 ($800 to $1,600) for several genes involved in a clinical condition. However, for cost reasons, not all relevant genes are usually sequenced.

Target selection was the performance-limiting step for both approaches, a result the researchers expected. “It sounds obvious, but not all genes are equally easy to target,” Ware said. For example, in the hybridization assay, the overall target coverage was about 98 percent, but for some genes, it was only 80 percent or 90 percent. The two most important genes in long QT syndrome, KCNQ1 and KCNH2, “proved to be the hardest to sequence.”

Thus, for diagnostic use of NGS gene panels, “it’s important to know not just how the system performs overall but really how it’s performing for the specific genes you’re interested in,” he said.

To use either approach in diagnostics, the target selection step would need to be optimized. Ware’s team has already improved both assays and is now trying them in a number of fully Sanger-sequenced samples to study both sensitivity and specificity.

Longer term, the sensitivity of next-gen sequencing could approach that of Sanger sequencing, he said. And even if it does not reach 100 percent, because NGS approaches can target so many more genes, “maybe you can afford a very slight tradeoff in the per-gene sensitivity if the overall diagnostic sensitivity of the panel goes up,” he said. “At the moment, because we don’t have that much experience in sequencing the less-common genes, we don’t exactly know where that tradeoff lies.” In addition, any gaps could be filled by Sanger sequencing, while the test would probably still be cost effective.
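Ware's tradeoff can be made concrete with a toy model: the overall diagnostic yield of a panel is the sum, over the genes it covers, of the fraction of cases attributable to each gene times the per-gene analytical sensitivity. The numbers below are purely illustrative, not from the study:

```python
def diagnostic_yield(panel):
    """Expected fraction of cases solved by a gene panel.

    panel: list of (fraction_of_cases_attributable_to_gene,
                    per_gene_analytical_sensitivity) pairs.
    """
    return sum(frac * sens for frac, sens in panel)

# Hypothetical: a small panel covers genes causing 60% of cases at
# perfect sensitivity; a broad panel adds genes covering another 25%
# of cases, but at 95% per-gene sensitivity across the board.
small_panel = [(0.60, 1.00)]
broad_panel = [(0.60, 0.95), (0.25, 0.95)]

print(diagnostic_yield(small_panel))
print(diagnostic_yield(broad_panel))  # higher yield despite lower per-gene sensitivity
```

In this sketch the broader panel solves about 81% of cases versus 60%, even though every individual gene is sequenced slightly less sensitively.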

Each approach also has some features that make it more suitable for certain applications. The PCR-based method has a fast turnaround and an “extremely user-friendly workflow,” Ware said, but it can only accommodate a small number of genes at the moment. His team also found it to be easier to optimize and improve. Thus, in the short term, PCR and sequencing “is probably closer to providing a diagnostic solution,” he said, especially for conditions where only a few genes are causative.

The hybridization-based approach, on the other hand, has much greater capacity, and there are advantages in “having a single assay that covers everything,” he said. It might also be possible to detect copy number variants using this approach, but not with the more limited PCR method, he added.

Ware and his colleagues are currently using the hybridization approach to study a large panel of genes in 2,000 well-phenotyped volunteers, both healthy individuals and heart disease patients.

They have also started to use the hybridization method to sequence the TTN gene, truncating mutations in which were recently found to be a common cause of dilated cardiomyopathy. They are running the TTN test routinely for patients consented for research diagnostic testing that is not available anywhere else. Because this gene is so large, it is “completely impractical to be sequenced by conventional Sanger,” Ware said.

Julia Karow tracks trends in next-generation sequencing for research and clinical applications for GenomeWeb’s In Sequence and Clinical Sequencing News. E-mail her here or follow her GenomeWeb Twitter accounts at @InSequence and @ClinSeqNews.

 

Read Full Post »

Author: Tilda Barliya PhD

Category owner: Nanotechnology in drug delivery

Nanotechnology is simply defined as the technology to manipulate matter on the atomic and/or molecular scale. It is generalized to materials, devices, and structures with dimensions at the nanoscale, from 1 to 1000 nanometers (nm) (1,2).

Nanotechnology can be applied to many fields, including sensors, biomaterials for tissue engineering, and nanostructures or 3D materials for molecular imaging and drug delivery, among others. In medicine, nanotechnology is essentially a multidisciplinary field spanning physics, organic and polymer chemistry, molecular biology, pharmacology, and engineering. These fields team up to design the best-suited treatment option for a disease using “the right drug, the right vehicle and the right route of administration”. In the pharmaceutical industry, a new molecular entity (NME) that demonstrates potent biological activity but poor water solubility, or a very short circulating half-life, will likely face significant development challenges or be deemed undevelopable. There is always a degree of compromise, and such tradeoffs may inevitably result in the production of less-than-ideal drugs. However, with emerging trends and recent advances in nanotechnology, it has become increasingly possible to address some of the shortcomings associated with potential NMEs. By using nanoscale delivery vehicles, the pharmacological properties (e.g., solubility and circulating half-life) of such NMEs can be drastically improved, essentially leading to the discovery of optimally safe and effective drug candidates (3,4).

This is just one example of the degree to which nanotechnology may revolutionize the rules and possibilities of drug discovery and change the landscape of the pharmaceutical industry (5).

Nanomedicine faces many challenges in overcoming biological barriers and in reaching and accumulating at the target site. Advances in nanoparticle engineering, as well as in understanding the importance of nanoparticle characteristics such as size, shape, and surface properties for biological interactions, are therefore necessary to create new opportunities for the development of nanoparticles for therapeutic applications (6).

Compared to conventional drug delivery, the first-generation nanosystems provide a number of advantages. In particular, they can enhance therapeutic activity by prolonging drug half-life, improving the solubility of hydrophobic drugs, reducing potential immunogenicity, and/or releasing drugs in a sustained or stimuli-triggered fashion. Thus, the toxic side effects of drugs can be reduced, as well as the administration frequency. In addition, nanoscale particles can passively accumulate in specific tissues (e.g., tumors) through the enhanced permeability and retention (EPR) effect. Beyond these clinically efficacious nanosystems, nanotechnology has been utilized to enable new therapies and to develop next-generation nanosystems for “smart” drug delivery (such as gene therapy).
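The half-life benefit is easy to quantify with a one-compartment, first-order elimination sketch; the half-lives below are hypothetical, chosen only to illustrate the effect:

```python
import math

def remaining_fraction(t_hours, half_life_hours):
    """Fraction of an IV dose still circulating after t hours,
    assuming simple first-order (one-compartment) elimination."""
    return math.exp(-math.log(2) * t_hours / half_life_hours)

# Hypothetical: a free drug cleared with a 1 h half-life vs the same
# drug in a long-circulating nanocarrier with a 10 h half-life.
print(remaining_fraction(6, 1))   # ~0.016 (almost fully cleared)
print(remaining_fraction(6, 10))  # ~0.66  (mostly still circulating)
```

Six hours after dosing, the carrier-borne drug in this sketch retains roughly forty times more circulating drug, which is what allows lower doses and less frequent administration.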

In summary, there are several factors that need to be considered in a rational nanocarrier design:

– Protect the drug from premature degradation

– Protect the drug from premature interaction with the biological environment

– Enhance the absorption of the drug into the selected tissue site

– Improve intracellular drug penetration

– Improve and control the drug’s pharmacokinetics and distribution profile

Moreover, there are several other factors that need to be taken into consideration to effectively influence the clinical translation of the drug delivery system (DDS), i.e., materials that are biodegradable and biocompatible, easily functionalized, and exhibiting high differential uptake efficiency, etc. (7-9).

In the next few chapters, we will try to address some of these factors, along with examples that succeeded in the clinical setting and examples that failed.

References:

  1. Ochekpe NA, Olorunfemi PO, Ngwuluka NC. Nanotechnology and drug delivery part 1: background and applications. Tropical Journal of Pharmaceutical Research, June 2009; 8(3): 265-274. http://www.tjpr.org/vol8_no3/2009_8_3_11_Ochekpe.pdf
  2. Davis ME, Chen Z, Shin DM. Nanoparticle therapeutics: an emerging treatment modality for cancer. Nature Rev. Drug Discov. 7, 771–782 (2008). http://www.nature.com/nrd/journal/v7/n9/abs/nrd2614.html
  3. Shi J, Votruba AR, Farokhzad OC, Langer R. Nanotechnology in drug delivery and tissue engineering: from discovery to applications. Nano Lett. 2010, 10, 3223–3230. http://engineering.unl.edu/academicunits/chemical-engineering/research/focuslab/kidambi_lab/CHME_896_496_files/Impact%20of%20Nanotechnology%20on%20Drug%20Delivery-Langer_ACSNano’09.pdf
  4. Sengupta S, et al. Temporal targeting of tumour cells and neovasculature with a nanoscale delivery system. Nature 436, 568–572 (2005). http://www.ncbi.nlm.nih.gov/pubmed/16049491
  5. Torchilin VP. Recent advances with liposomes as pharmaceutical carriers. Nature Rev. Drug Discov. 4, 145–160 (2005). http://www.chem.umass.edu/~thompson/Courses/chem697a/papers/TorchilinReviewLiposomeCarriers.pdf
  6. Decuzzi P, et al. Size and shape effects in the biodistribution of intravascularly injected particles. J. Control. Release 141, 320–327 (2010). http://www.ncbi.nlm.nih.gov/pubmed?term=Decuzzi%2C%20P.%20et%20al.%20Size%20and%20shape%20effects%20in%20the%20biodistribution%20of%20intravascularly%20injected%20particles.%20J.%20Control.%20Release%20141%2C%20320%E2%80%93327%20(2010)
  7. Peer D, Karp JM, Hong S, Farokhzad OC, Margalit R, Langer R. Nanocarriers as an emerging platform for cancer therapy. Nature Nanotechnology 2007; 2: 751–760. http://www.nature.com/nnano/journal/v2/n12/abs/nnano.2007.387.html
  8. Alonso MJ. Nanomedicines for overcoming biological barriers. Biomed. Pharmacother. 58, 168–172 (2004). http://www.ncbi.nlm.nih.gov/pubmed/15082339
  9. Torchilin VP. Recent advances with liposomes as pharmaceutical carriers. Nat. Rev. Drug Discov. 4, 145–160 (2005). http://www.chem.umass.edu/~thompson/Courses/chem697a/papers/TorchilinReviewLiposomeCarriers.pdf

Read Full Post »

ENCODE data reveals important information from Genome Wide Association Studies relevant to understanding complex genetic diseases

Author: Ritu Saxena, Ph.D.

 

Introduction

“The depth, quality, and diversity of the ENCODE data are unprecedented,” stated John Stamatoyannopoulos, professor of genomic sciences at the University of Washington and one of the many principal investigators of the ENCODE project. ENCODE (Encyclopedia of DNA Elements) was indeed an ambitious project, launched as a pilot in 2003 and then expanded in 2007 to whole-genome analysis and identification of all the functional elements of the human genome. The findings were striking, as they challenged the definition of “gene” and the central dogma of genetics (gene-mRNA-protein). In fact, the non-coding part that constitutes about 80% of the genome, the so-called “junk DNA”, was found to contain elements crucial for gene regulation. These elements, in large part, include RNA transcripts that are not translated into proteins but might have a regulatory role. For detailed reading, refer to the findings published in Nature: The ENCODE Project Consortium, Nature 489, 57–74 (2012), An integrated encyclopedia of DNA elements in the human genome

Key features of the data, as explained in the National Human Genome Research Institute website (National Human Genome Research Institute News feature), include comprehensive mapping of:

  • Protein-coding genes — Proteins are molecules made of amino acids linked together in a specific sequence; the amino acid sequence is encoded by the sequence of DNA subunits called nucleotides that make up genes.
  • Non-coding genes — Stretches of DNA that are read by the cell as if they were genes but do not encode proteins. These appear to help regulate the activity of the genome.
  • Chromatin structure features — Complex physical structures made from a combination of DNA and binding proteins that make up the contents of the nucleus and affect genome function.
  • Histone modifications — Histones are the proteins that make up the chromatin structures that help shape and control the genome. In addition, histone proteins can be physically modified by adding chemical groups, such as a methyl molecule, that further regulates genomic activity.
  • DNA methylation — Just like histones, methyl groups can be added to DNA itself in a process called DNA methylation. Chemically attaching methyl groups to DNA physically changes the ability of enzymes to reach the DNA and thus alters the gene expression pattern in cells. Methylation helps cells “remember what they are doing” or alter levels of gene expression, and it is a crucial part of normal development and cellular differentiation in higher organisms.
  • Transcription factor binding sites — Transcription factors are proteins that bind to specific DNA sequences, controlling the flow (or transcription) of genetic information from DNA to mRNA. Mapping the binding sites can help researchers understand how genomic activity is controlled.

How could ENCODE be helpful in the study of complex human diseases?

Complex diseases and Genome wide association studies (GWAS)

Coronary artery disease, type 2 diabetes, and many forms of cancer are complex human diseases with a significant genetic component. Unlike Mendelian disorders, which have defined loci, the genetic component of complex disorders lies in genetic variations across the genome that make an individual susceptible to these diseases.

Researchers have performed genome-wide association studies (GWAS) of the human genome, leading to the identification of thousands of DNA variants that could be linked with complex traits and diseases. However, identifying the variants, referred to as SNPs (single nucleotide polymorphisms), that actually contribute to disease, and understanding how they exert influence on a disease, has remained more of a mystery.

How would ENCODE solve the puzzle?

The puzzle lies in interpreting how the SNPs found in the genome affect a person’s susceptibility to a particular trait or disease, and what the mechanism behind it is. As identified in GWAS, most variants associated with a trait or disease phenotype lie in the non-coding region of the genome. In fact, in more than 400 studies compiled in the GWAS catalog, only a small minority of the trait/disease-associated SNPs occur in protein-coding regions; the large majority (89%) are in non-coding regions. These variants fall in gene deserts that lie far from protein-coding regions, similar to those where cis-regulatory modules (CRMs) are found. CRMs, such as promoters and enhancers, are groups of binding sites for transcription factors, and the presence of transcription factors bound to these sites is a good indicator of potential regulatory regions.

The integrative analysis of ENCODE data has given important insights into the results of GWAS studies. Investigators have employed ENCODE data as an initial guide to discover regulatory regions in which genetic variation affects a complex trait. Additionally, when the ENCODE study examined trait-associated SNPs from GWAS, it found that they are enriched in DNase-sensitive regions, i.e., they lie in function-associated DNA that could be bound by transcription factors affecting the regulation of gene expression. Thus, the project demonstrates that non-coding regions must be considered when interpreting GWAS results, and it provides a strong motivation for reinterpreting previous GWAS findings.
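The overlap test at the heart of this kind of analysis (does a trait-associated SNP fall inside a mapped DNase-hypersensitive site?) is, computationally, a simple interval lookup. A minimal sketch; the coordinates and SNP names below are invented for illustration:

```python
import bisect

def in_any_site(pos, starts, ends):
    """True if position pos lies inside one of the intervals
    [starts[i], ends[i]); intervals must be sorted and non-overlapping."""
    i = bisect.bisect_right(starts, pos) - 1
    return i >= 0 and pos < ends[i]

# Hypothetical DNase-hypersensitive sites on one chromosome (sorted).
starts = [128_400_000, 128_410_000, 128_450_000]
ends   = [128_402_000, 128_412_500, 128_451_000]

# Hypothetical trait-associated SNP positions from a GWAS hit list.
for name, pos in {"rs_example1": 128_411_000, "rs_example2": 128_430_000}.items():
    print(name, "in DNase site:", in_any_site(pos, starts, ends))
```

Real analyses do the same lookup genome-wide against ENCODE's mapped elements, then ask whether GWAS SNPs overlap these sites more often than expected by chance.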

Using ENCODE Data to Interpret GWAS Results

ENCODE and predisposition to CANCER:

C-Myc, a proto-oncogene, codes for a transcription factor that, when expressed constitutively, leads to uninhibited cell proliferation, resulting in cancer. Common variants within a ~1 Mb region upstream of the c-Myc gene have been associated with cancers of the colon, prostate, and breast. Several SNPs have been reported in this region that, although they affect the phenotype, lie in the distal cis-regulatory region of the MYC gene. Alignment of the ENCODE data in this region with the significant variants from GWAS reveals that key variants fall in the transcription-factor-occupied DNA segments mapped by the consortium. One variant, rs698327, lies within a DNase hypersensitive site that is bound by several transcription factors and by the enhancer-associated protein p300, and carries histone modifications characteristic of enhancers (high H3K4me1, low H3K4me3). ENCODE data thus indicate how non-coding regions in the human chromosome 8q24 locus are associated with cancer, and, as observed in the case of the c-Myc gene, similar studies on cancer-related genes could help explain predisposition to cancer.

ENCODE and fetal hemoglobin expression:

Another example of the use of ENCODE data is the gene regulation of fetal hemoglobin. Several regions predicted via ENCODE to be involved in the regulation of fetal hemoglobin were found to be close to SNPs in the BCL11A gene, which is associated with persistent expression of fetal hemoglobin.

Future perspective

As evident from the above examples, the ENCODE data show that genetic variants do affect regulated expression of a target gene. Recently, several research groups in the UK performed a large-scale GWAS to determine the genetic predisposition to fracture risk. The collaborative effort, published in a recent issue of PLoS Genetics, aimed to identify genetic variants associated with cortical bone thickness (CBT) and bone mineral density (BMD), with data from more than 10,000 subjects. http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1002745 Among the wealth of data the study generated was the identification of SNPs in WNT16 and its adjacent gene, FAM3C, as relevant to CBT and BMD. ENCODE data, in this case, could help in extracting more detailed information, including additional SNPs and the regulatory context of the genes involved. Thus, it could be concluded that ENCODE data could be immensely useful in interpreting associations between disease and DNA sequences that vary from person to person.

Sources:

Research articles

An integrated encyclopedia of DNA elements in the human genome

A User’s Guide to the Encyclopedia of DNA Elements (ENCODE)

What does our genome encode?

Genome-wide Epigenetic Data Facilitate Understanding of Disease Susceptibility Association Studies

Genomics: ENCODE explained

ENCODE Project Writes Eulogy For Junk DNA

WNT16 Influences Bone Mineral Density, Cortical Bone Thickness, Bone Strength, and Osteoporotic Fracture Risk

News articles

ENCODE project: In massive genome analysis new data suggests ‘gene’ redefinition

National Human Genome Research Institute News feature

Related posts

Expanding the Genetic Alphabet and linking the genome to the metabolome

Junk DNA codes for valuable miRNAs: non-coding DNA controls Diabetes

ENCODE Findings as Consortium

Read Full Post »

Mitochondria: Origin from an oxygen-free environment, role in aerobic glycolysis, metabolic adaptation

 

(Figure: a diagram of cellular respiration, including glycolysis, the Krebs (citric acid) cycle, and the electron transport chain. Photo credit: Wikipedia)

Reporter and Curator: Larry H Bernstein, MD, FACP

Introduction

Mitochondria are essential for life and are critical for the generation of ATP. Otto Warburg won the Nobel Prize in 1931 for his studies of respiration, and he described a situation of impaired respiration in cancer cells that causes them to produce lactic acid, like bacteria; this has been termed facultative anaerobic glycolysis. The metabolic explanation for mitochondrial respiration had to await the Nobel-winning discoveries of the Krebs cycle and of high-energy ~P in acetyl CoA by Fritz Lipmann. Respiration through the Krebs cycle generates many times more ATP per glucose than the 2 ATPs yielded by glycolysis. The discovery of the genetic code with the “Watson-Crick” model and the identification of DNA polymerase opened a window for continuing discovery, leading to the Human Genome Project at the end of the 20th century, which has now been followed by ENCODE in the 21st. This review opens a rediscovery of the metabolic function of mitochondria and their adaptive functions with respect to cancer and other diseases.

Function in aerobic and anaerobic metabolism

The TCA cycle and the pentose phosphate pathway, together with gluconeogenesis and the glyoxylate cycle, are essential for the provision of anabolic precursors, including two-carbon compounds. The environmental diversity of yeasts leads to a vast metabolic complexity, driven by the carbon and energy available in their habitats. This prompted much early research on yeast metabolism, focused on glucose catabolism in Saccharomyces cerevisiae under both aerobic and anaerobic conditions. Yeasts may be physiologically classified by the type of energy-generating process involved in sugar metabolism: non-fermentative, facultative-fermentative, or obligate-fermentative. Non-fermentative yeasts have an exclusively respiratory metabolism and are not capable of alcoholic fermentation from glucose, while obligate-fermentative yeasts (“natural respiratory mutants”) can metabolize glucose only through alcoholic fermentation. Most of the yeasts identified are facultative-fermentative and, depending on the growth conditions, the type and concentration of sugars, and/or oxygen availability, may display a fully respiratory metabolism, a fermentative metabolism, or both in a mixed respiratory-fermentative metabolism (e.g., S. cerevisiae).

The sugar composition of the medium and oxygen availability are the two main environmental conditions with a strong impact on yeast metabolic physiology, and three frequently observed effects associated with the type of energy-generating process and/or oxygen availability are the Pasteur, Crabtree, and Custer effects. In modern terms, the Pasteur effect refers to an activation of anaerobic glycolysis to meet cellular ATP demands, owing to the lower efficiency of ATP production by fermentation compared with respiration. In 1861 Pasteur observed that S. cerevisiae consumes much more glucose in the absence of oxygen than in its presence. S. cerevisiae shows a Pasteur effect only at low growth rates and in resting-cell conditions, where a high contribution of respiration to sugar catabolism occurs owing to the loss of fermentative capacity. The Crabtree effect is the occurrence of alcoholic fermentation under aerobic conditions, explained by a theory involving “limited respiratory capacities” at the branching point of pyruvate metabolism. The Custer effect is the inhibition of alcoholic fermentation by the absence of oxygen, thought to be caused by reductive stress.

Glycolysis

Once inside the cell, glucose is phosphorylated by kinases to glucose 6-phosphate and then isomerized to fructose 6-phosphate by phosphoglucose isomerase. The next enzyme, phosphofructokinase, which is subject to regulation by several metabolites, further phosphorylates fructose 6-phosphate to fructose 1,6-bisphosphate. These steps of glycolysis require energy in the form of ATP. Glycolysis leads to pyruvate formation, associated with a net production of energy and reducing equivalents. Approximately 50% of glucose 6-phosphate is metabolized via glycolysis and 30% via the pentose phosphate pathway in Crabtree-negative yeasts; however, about 90% of the carbon going through the pentose phosphate pathway re-entered glycolysis at the level of fructose 6-phosphate or glyceraldehyde 3-phosphate. The pentose phosphate pathway in Crabtree-positive yeasts (S. cerevisiae) is used predominantly for NADPH production, not for biomass production or catabolic reactions.
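The energy investment and payoff described above can be tallied explicitly. A sketch of the standard per-glucose bookkeeping (two triose phosphates proceed through the payoff phase per glucose):

```python
# ATP bookkeeping for glycolysis, per molecule of glucose.
atp_invested = {
    "hexokinase (glucose -> glucose 6-phosphate)": 1,
    "phosphofructokinase (fructose 6-P -> fructose 1,6-bisP)": 1,
}
atp_produced = {
    "phosphoglycerate kinase (x2 trioses)": 2,
    "pyruvate kinase (x2 trioses)": 2,
}

net_atp = sum(atp_produced.values()) - sum(atp_invested.values())
print("net ATP per glucose:", net_atp)  # 2
```

This net yield of 2 ATP is the fermentative figure that respiration so dramatically exceeds.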
Pyruvate branch point. At pyruvate, the end product of glycolysis, the carbon flux can follow three different metabolic fates depending on the yeast species and the environmental conditions, and may be distributed between the respiratory and fermentative pathways. Pyruvate may be converted directly to acetyl-coenzyme A (acetyl-CoA) by the mitochondrial multienzyme complex pyruvate dehydrogenase (PDH) after its transport into the mitochondria by the mitochondrial pyruvate carrier. Alternatively, pyruvate can be converted to acetyl-CoA in the cytosol, via acetaldehyde and acetate, by the so-called PDH-bypass pathway. Compared with cytosolic pyruvate decarboxylase, the mitochondrial PDH complex has a higher affinity for pyruvate, and therefore most of the pyruvate will flow through the PDH complex at low glycolytic rates. However, at increasing glucose concentrations the glycolytic rate rises and more pyruvate is formed, saturating the mitochondrial route and shifting the carbon flux toward ethanol production. In the yeast S. cerevisiae, the external glucose level controls the switch between respiration and fermentation.
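The branch-point behaviour described above (a high-affinity mitochondrial PDH route of limited capacity, with excess pyruvate overflowing into fermentation as the glycolytic rate rises) can be caricatured with a simple overflow model. All parameters are illustrative, not measured values:

```python
def split_pyruvate(pyruvate_flux, pdh_vmax=1.0, pdh_km=0.1):
    """Toy overflow model of the pyruvate branch point.

    The PDH route (high affinity, limited capacity) takes what it can,
    following Michaelis-Menten-like saturation; the remainder overflows
    into the fermentative (ethanol) branch.
    """
    to_pdh = min(pyruvate_flux,
                 pdh_vmax * pyruvate_flux / (pdh_km + pyruvate_flux))
    to_ethanol = pyruvate_flux - to_pdh
    return to_pdh, to_ethanol

# Low glycolytic rate: everything is respired. High rate: the PDH
# route saturates near its capacity and carbon spills into ethanol,
# mimicking the glucose-controlled switch in S. cerevisiae.
for flux in (0.05, 0.5, 5.0):
    pdh, etoh = split_pyruvate(flux)
    print(f"pyruvate flux {flux}: PDH {pdh:.2f}, ethanol {etoh:.2f}")
```

At low flux the fermentative branch carries nothing; at high flux most carbon goes to ethanol even though oxygen may be present, which is the Crabtree-like behaviour discussed earlier.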

Rodrigues F, Ludovico P, Leão C. Sugar metabolism in yeasts: an overview of aerobic and anaerobic glucose catabolism. In: Molecular and Structural Biology, Chapter 6, p. 117.
Eriksson P, Andre L, Ansell R, Blomberg A, Adler L (1995) Cloning and characterization of GPD2, a second gene encoding sn-glycerol 3-phosphate dehydrogenase (NAD+) in Saccharomyces cerevisiae, and its comparison with GPD1. Mol Microbiol 17:95–107.
Flikweert MT, van der Zanden L, Janssen WM, Steensma HY, van Dijken JP, Pronk JT (1996) Pyruvate decarboxylase: an indispensable enzyme for growth of Saccharomyces cerevisiae on glucose. Yeast 12:247–257.

Biogenesis of mitochondrial structures from aerobically grown S. cerevisiae

Under aerobic conditions, S. cerevisiae forms mitochondria that are classical in their properties, although the number, morphology, and enzyme activity of these mitochondria are also affected by catabolite repression. Under anaerobic conditions the yeast cannot respire and lacks cytochromes, yet mitochondria-like structures can be isolated from anaerobically grown cells; these contain malate and succinate dehydrogenases, ATPase, and DNA characteristic of yeast mitochondria. The lipid-complete structures consist predominantly of double-membrane vesicles enclosing a dense matrix, which contains a folded inner-membrane system bordering electron-transparent regions similar to the cristae of mitochondria.

  • The morphology of the structures is critically dependent on their lipid composition
  • Their unsaturated fatty acid content is similar to that of mitochondria from aerobically grown cells
  • The structures from cells grown without lipid supplements have simpler morphology – a dense granular matrix surrounded by a double membrane but have no obvious folded inner membrane system within the matrix
  • The lipid-depleted structures are only isolated in intact form from protoplasts
  • The synthesis of ergosterol and unsaturated fatty acids is oxygen-dependent and anaerobically grown cells may be depleted of these lipid components
  • The cytology of anaerobically grown yeast cells is profoundly affected by both lipid-depletion and catabolite repression
  • In lipid-depleted anaerobic cells, membranous mitochondrial profiles were not demonstrable
  • The structures from the aerobically and anaerobically grown cells are markedly different in morphology and fatty acid composition, but both contain mitochondrial DNA and a number of mitochondrial enzymes

Various strains of Saccharomyces cerevisiae (wild type, petite cytoplasmic respiratory-deficient yeasts, and derived mitochondrial mutants) were grown under conditions designed to induce variations in the complement of mitochondrial membranes, fractionated into subcellular fractions, and analyzed for cytochrome oxidase (in wild type) and phospholipid composition. Ninety percent or more of the phospholipid cardiolipin was found in the mitochondrial membranes of wild-type and petite yeast. Cardiolipin content differed markedly under various growth conditions.

  • Stationary yeast grown in glucose had better developed mitochondria and more cardiolipin than repressed log phase yeast.
  • Aerobic yeast contained more cardiolipin than anaerobic yeast.
  • Respiration-deficient cytoplasmic mitochondrial mutants, both suppressive and neutral, contained less cardiolipin than corresponding wild types.
  • A chromosomal mutant lacking respiratory function had normal cardiolipin content.
  • Log phase cells grown in galactose and lactate, which do not readily repress the development of mitochondrial membranes, contained as much cardiolipin as stationary phase cells grown in glucose.
  • Cytoplasmic mitochondrial mutants respond to changes in the glucose concentration of the growth medium by variations in their cardiolipin content in the same way as wild type yeast does under similar growth conditions.
  • It is of interest that the chromosomal petite, which as far as can be ascertained has qualitatively normal mitochondrial DNA, has a normal cardiolipin content when grown under maximally derepressed conditions.

Thus, the genetic defect in this case probably does not diminish the mass of inner mitochondrial membrane under appropriate conditions. This suggests that the cardiolipin content of yeast is a good indicator of the state of development of the mitochondrial membrane.
Jakovcic S, Getz GS, Rabinowitz M, Jakob H, Swift H. Cardiolipin content of wild type and mutant yeasts in relation to mitochondrial function and development. J Cell Biol 1971. jcb.rupress.org
Jakovcic S, Haddock J, Getz GS, Rabinowitz M, Swift H. Biochem J 1971; 121:341.
Ephrussi B. 1953. Nucleocytoplasmic Relations in Microorganisms. Clarendon Press, Oxford.

Mitochondria, hydrogenosomes and mitosomes

Before and after the publication of a largely unnoticed article in 1905 by Mereschkowsky there were many publications dealing with plant “chimeras” and cytoplasmic inheritance in plants, which should have favoured the interpretation of plastids as “semi-autonomous” symbiotic entities in the cytoplasm of the eukaryotic plant cell. Twenty years after Mereschkowsky’s plea for an endosymbiotic origin of plastids, Wallin (1925, 1927) postulated the “bacterial nature of mitochondria”. It is therefore one of the mysteries of the 20th century that an endosymbiotic origin of plastids had not been generally accepted before the 1970s, primarily because one cannot observe the consequences of mutations in the mitochondrial genome with the naked eye.

  • Mitochondrial DNA is usually present in multiple copies in one and the same mitochondrion and those in the hundreds to thousands of mitochondria in a single cell are not necessarily identical.
  • The random partitioning of the mitochondria in mitosis (and meiosis) frequently results in a more or less biased distribution of the different mitochondria in the daughter cells, eventually causing different phenotypes in different tissues and obscuring the maternal inheritance
  • It was not until the 1990s that certain diseases—which had been interpreted as being X-chromosomal with incomplete penetrance—eventually turned out to be maternally inherited through mutations in mitochondrial DNA

Lastly, the vast majority of mitochondrial proteins are encoded in the nucleus and, consequently, mutations in the corresponding genes exhibit a Mendelian, and not a cytoplasmic, maternal inheritance. In the 1970s and 1980s mitochondrial DNA was unequivocally demonstrated, and mitochondrial mutations at the DNA level provided the final proof for the role of such mutations in a wealth of hereditary diseases in man.

  • The genomics era provided the tools to prove the endosymbiont-hypothesis for the origin of the eukaryotic cell

Since DNA does not arise de novo, the genomes of organisms and organelles provide a historical record for the evolution of the eukaryotic cell and its organelles. The DNA sequences of the two to three genomes of the eukaryotic cell turned out to be a record of the evolution of eukaryotic life on earth. The analysis of organelle genomes unequivocally revealed a cyanobacterial origin for plastids and an α-proteobacterial origin for mitochondria. Both plastids and mitochondria appear to be monophyletic, i.e. plastids derived from one and the same cyanobacterial ancestor, and mitochondria from one and the same α-proteobacterial ancestor.
The evolution of the eukaryotic cell appears to have involved one (in the case of animals) or two (in the case of plants) endosymbiotic events that took place 1.5 to 2 billion years ago. However, it appears that symbioses involving one or the other eubacterium arose repeatedly during the billions of years available. For example, photosynthetic algae have repeatedly been taken up by phagotrophic eukaryotes, arguing against a single founding event and for stringent selection shaping the diversity of present-day life. Recent hypotheses for the origin of the nucleus have postulated that introns, which could be acquired by the uptake of the α-proteobacterial endosymbiont, forced the nucleus-cytosol compartmentalization. Lateral gene transfer among eukaryotes is more frequent than was assumed earlier, and “mitochondrial genes” in the nuclear genomes of amitochondrial organisms are not necessarily the consequence of a transient presence of a DNA-containing mitochondrion-like organelle.
To cope with the obvious ubiquity of “mitochondrial” genes and the chimerism of the DNA of present day eukaryotes, the hydrogen hypothesis postulates that an archaeal host took up a eubacterial symbiont that became the ancestor of mitochondria and hydrogenosomes. The hydrogen hypothesis has the potential to explain both the monophyly of the mitochondria, and the existence of “anaerobic” and “aerobic” variants of one and the same original organelle. Based on these observations we have only the terms “mitochondrion”, “hydrogenosome” and “mitosome” to classify the various variants of the mitochondrial family.
Hackstein JHP, Tjaden J, Huynen M. Mitochondria, hydrogenosomes and mitosomes: products of evolutionary tinkering! Curr Genet 2006; 50:225–245. DOI 10.1007/s00294-006-0088-8.

Lineages

A look at the phylogenetic distribution of characterized anaerobic mitochondria among animal lineages shows that these are not clustered but spread across metazoan phylogeny. The biochemistry and the enzyme equipment used in the facultatively anaerobic mitochondria of metazoans is nearly identical across lineages, strongly indicating a common origin from an archaic metazoan ancestor. The organelles look like hydrogenosomes – anaerobic forms of mitochondria that generate H2 and adenosine triphosphate (ATP) from pyruvate oxidation and which were previously found only in unicellular eukaryotes. The animals harbor structures resembling prokaryotic endosymbionts, reminiscent of the methanogenic endosymbionts found in some hydrogenosome-bearing protists; fluorescence of F420, a typical methanogen cofactor, or lack thereof, will bring more insights as to what these structures are. If we follow the anaerobic lifestyle further back into evolutionary history, beyond the origin of the metazoans, the phylogenetic distribution of eukaryotes with facultatively anaerobic mitochondria, eukaryotes with hydrogenosomes, and eukaryotes that possess mitosomes (reduced forms of mitochondria with no direct role in ATP synthesis) presents a picture similar to that seen for animals. In all six of the major lineages (or supergroups) of eukaryotes that are currently recognized, forms with anaerobic mitochondria have been found. The newest additions to the growing collection of anaerobic mitochondrial metabolisms are the denitrifying foraminiferans. A handful of about a dozen enzymes make the difference between a ‘normal’ O2-respiring mitochondrion found in mammals and the energy metabolism of eukaryotes with anaerobic mitochondria, hydrogenosomes or mitosomes.
Notably, the full complement of those enzymes, once thought to be specific to eukaryotic anaerobes, surprisingly turned up in the green alga Chlamydomonas reinhardtii, which produces O2 in the light and has typical O2-respiring mitochondria but, within about 30 min of exposure to heterotrophic, anoxic and dark conditions, expresses its anaerobic biochemistry to make H2 in the same way as trichomonads, the group in which hydrogenosomes were discovered. Chlamydomonas provides evidence that the ability to inhabit oxygen-harbouring as well as anoxic environments is an ancestral feature of eukaryotes and their mitochondria. The prokaryote inhabitants of such environments have existed for well over a billion years, and have reached them by dispersal, not by adaptive evolution de novo and in situ. Indeed, geochemical evidence has shown that methanogenesis and sulphate reduction, and the niches in which they occur, are truly ancient.
Mentel and Martin. Anaerobic mitochondria: more common all the time. BMC Biology 2010; 8:32. BioMed Central Ltd. http://www.biomedcentral.com/1741-7007/8/32.

Anaerobic mitochondrial enzymes

Mitochondria from the muscle of the parasitic nematode Ascaris lumbricoides var. suum function anaerobically in electron transport-associated phosphorylations under physiological conditions. These helminth organelles have been fractionated into inner and outer membrane, matrix, and inter-membrane space fractions. The distributions of enzyme systems were determined and compared with corresponding distributions reported in mammalian mitochondria. Succinate and pyruvate dehydrogenases as well as NADH oxidase, Mg++-dependent ATPase, adenylate kinase, citrate synthase, and cytochrome c reductases were determined to be distributed as in mammalian mitochondria. In contrast with the mammalian systems, fumarase and NAD-linked “malic” enzyme were isolated primarily from the intermembrane space fraction of the worm mitochondria. These enzymes are required for the anaerobic energy-generating system in Ascaris and would be expected to give rise to NADH in the intermembrane space.
Pyruvate kinase activity is barely detectable in Ascaris muscle. Therefore, rather than giving rise to cytoplasmic pyruvate, CO2 is fixed into phosphoenolpyruvate, resulting in the formation of oxalacetate which, in turn, is reduced by NADH to form malate, regenerating glycolytic NAD+. Ascaris muscle mitochondria utilize malate anaerobically as their major substrate by means of a dismutation reaction. The “malic” enzyme in the mitochondrion catalyzes the oxidation of malate to form pyruvate, CO2, and NADH. This reaction serves to generate intramitochondrial reducing power in the form of NADH. Concomitantly, fumarase catalyzes the dehydration of an equivalent amount of malate to form fumarate which, in turn, is reduced by an NADH-linked fumarate reductase to succinate. The flavin-linked fumarate reductase reaction results in a site I electron transport-associated phosphorylation of ADP, giving rise to ATP. This identifies a proton-translocating system for anaerobic energy generation.
Rew RS, Saz HJ. Enzyme Localization in the Anaerobic Mitochondria Of Ascaris Lumbricoides. The Journal Of Cell Biology 1974; 63: 125-135. jcb.rupress.org
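The dismutation bookkeeping described above can be sketched in a few lines of Python. This is an illustrative stoichiometric tally only, not a kinetic model; the function name is an assumption made here for illustration, and the 1:1 split follows the text's statement that an "equivalent amount" of malate feeds each branch.

```python
# Stoichiometric sketch of the Ascaris malate dismutation:
# half of the malate is oxidized by "malic" enzyme to pyruvate + CO2
# (generating NADH), and an equivalent amount is dehydrated to fumarate
# and reduced to succinate by NADH-linked fumarate reductase, with a
# site I electron transport-associated phosphorylation yielding ATP.

def malate_dismutation(malate: float) -> dict:
    """Split malate 1:1 between the oxidative and reductive branches
    and tally the products. NADH made by the 'malic' enzyme is consumed
    by fumarate reductase, so net intramitochondrial NADH is zero."""
    half = malate / 2
    nadh_made = half   # malic enzyme: malate -> pyruvate + CO2 + NADH
    nadh_used = half   # fumarate reductase: fumarate + NADH -> succinate
    return {
        "pyruvate": half,
        "co2": half,
        "succinate": half,
        "net_nadh": nadh_made - nadh_used,
        "atp": half,   # one ATP per fumarate reduced (site I phosphorylation)
    }

print(malate_dismutation(2.0))
```

Running the sketch with 2 units of malate yields 1 unit each of pyruvate, CO2, succinate, and ATP, with no net NADH, which is the redox-balanced outcome the pathway description implies.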

Mitochondrial redox status

Tumor cells are characterized by accelerated growth, usually accompanied by up-regulated pathways that ultimately increase the rate of ATP production. These cells can undergo metabolic reprogramming, resulting in distinct bioenergetic phenotypes, generally enhancing glycolysis channeled to lactate production. These investigators demonstrated metabolic reprogramming by means of the histone deacetylase inhibitors (HDACis) sodium butyrate and trichostatin. This treatment was able to shift energy metabolism by activating mitochondrial systems such as the respiratory chain and oxidative phosphorylation that were largely repressed in the untreated controls.
Amoêdo ND, Rodrigues MF, Pezzuto P, Galina A, et al. Energy Metabolism in H460 Lung Cancer Cells: Effects of Histone Deacetylase Inhibitors. PLoS ONE 2011; 6(7): e22264. doi:10.1371/ journal.pone.0022264
Antioxidant pathways that rely on NADPH are needed for the reduction of glutathione and maintenance of proper redox status. The mitochondrial matrix protein isocitrate dehydrogenase 2 (IDH2) is a major source of NADPH. The NAD+-dependent deacetylase SIRT3 is essential for the prevention of age-related hearing loss in calorie-restricted mice, and oxidative stress resistance by SIRT3 is mediated through IDH2. Insertion of Nε-acetyl-lysine at position 413 of IDH2 reduced its activity by as much as 44-fold, and deacetylation by SIRT3 fully restored maximum IDH2 activity. The ability of SIRT3 to protect cells from oxidative stress was dependent on IDH2, and the deacetylation-mimicking IDH2 K413R variant was able to protect Sirt3-/- MEFs from oxidative stress through increased reduced-glutathione levels. Increased SIRT3 expression thus protects cells from oxidative stress through IDH2 activation. Together these results uncover a previously unknown mechanism by which SIRT3 regulates IDH2 under dietary restriction. Recent findings demonstrate that IDH2 activities are a major factor in cancer, and as such, these results implicate SIRT3 as a potential regulator of IDH2-dependent functions in cancer cell metabolism.
Yu W, Dittenhafer-Reed KE, Denu JM. SIRT3 deacetylates isocitrate dehydrogenase 2 (IDH2) and regulates mitochondrial redox status. JBC Papers in Press, March 13, 2012; Manuscript M112.355206. http://www.jbc.org

Computationally designed drug small molecules targeted for metabolic processes: a bridge from the genome to repair of dysmetabolism

New druglike small molecules with possible anticancer applications were computationally designed. The molecules formed stable complexes with antiapoptotic BCL-2, BCL-W, and BFL-1 proteins. These findings are novel because, to the best of the author’s knowledge, molecules that bind all three of these proteins are not known. A drug based on them should be more economical and better tolerated by patients than a combination of drugs, each targeting a single protein. The calculated drug-related properties of the molecules were similar to those found in most commercial drugs. The molecules were designed and evaluated following a simple, yet effective procedure. The procedure can be used efficiently in the early phases of drug discovery to evaluate promising lead compounds in time- and cost-effective ways.
Keywords: small molecule mimetics, antiapoptotic proteins, computational drug design.

Tardigrades

Tardigrades have unique stress-adaptations that allow them to survive extremes of cold, heat, radiation and vacuum. To study this, encoded protein clusters and pathways from an ongoing transcriptome study on the tardigrade Milnesium tardigradum were analyzed using bioinformatics tools and compared to expressed sequence tags (ESTs) from Hypsibius dujardini, revealing major pathways involved in resistance against extreme environmental conditions. ESTs are available on the Tardigrade Workbench along with software and databank updates. Our analysis reveals that RNA stability motifs for M. tardigradum differ from typical motifs known from higher animals. M. tardigradum and H. dujardini protein clusters and conserved domains imply metabolic storage pathways for glycogen, glycolipids and specific secondary metabolism as well as stress response pathways (including heat shock proteins, bmh2, and specific repair pathways). Redox-, DNA-, stress- and protein-protection pathways complement specific repair capabilities to achieve the strong robustness of M. tardigradum. These pathways are partly conserved in other animals, and their manipulation could boost stress adaptation even in human cells. It is, however, the unique combination of resistance and repair pathways that makes tardigrades, and M. tardigradum in particular, so highly stress resistant.
Keywords: RNA, expressed sequence tag, cluster, protein family, adaptation, tardigrada, transcriptome

Epicrisis

This discussion has disparate pieces that are tied together by dysfunctional changes that are

  • adaptations of metabolic processes in the channeling of energy, dependent on mitochondrial enzymes interacting with three- to six-carbon carbohydrates, high-energy phosphate, oxygen, and membrane lipid structures;
  • proteins rich or poor in sulfur, linked with genome-specific targets, semisynthetic modifications, and oxidative stress;
  • leading to a new approach to pharmaceutical targeted drug design.

Read Full Post »

Author: Margaret Baker, PhD, Registered Patent Agent

The Encyclopedia of DNA Elements (ENCODE) Project was launched in September 2003. In 2007 the ENCODE project was expanded to study the entire human genome, complementing genome-wide association studies (GWAS). This month it published a Nature paper entitled “An integrated encyclopedia of DNA elements in the human genome,” and all data are available at http://genome.ucsc.edu/ENCODE/.  Novel functional roles have been discovered for both transcribed and non-transcribed portions of DNA.  See several articles and commentary in Science 7 September 2012: Vol. 337 no. 6099, including Maurano et al. pp. 1190-1195, DOI: 10.1126/science.1222794b

For the first time, the 3-dimensional connections that cross the genome have been mapped as long-range looping interactions between functional elements and the genes controlled. These regions of the genome, formerly referred to as “junk DNA”, have the potential to be involved in disease initiation, pathophysiology, and complications. Further, epigenetic factors may be seen to play a more direct role in the expression or silencing of protein coding genes as DNase I hot spots, nucleosomal anchor points, and DNA methylation sites are added to the map.

Non-coding transcribed DNA includes a large percentage of sequences coding for RNA. In fact, RNA-encoding genes are nearly as numerous as protein-encoding genes (18,400 vs. 20,687), and previously unknown non-coding RNAs (ncRNA) have also been characterized.

Some of the known elements that were cataloged include:

  • cis elements – promoters, transcription factor binding sites;
  • gene contiguous non-coding stretches such as introns, polyA, and UTR, splice variants;
  • pseudogenes (11,224);
  • long range gene associated elements – enhancers, insulators, suppressors, and predicted promoter flanking regions;
  • ribosomal RNA genes; and
  • sequences for 7,052 small RNAs, of which 85% are small nuclear (sn)RNA, small nucleolar (sno)RNA, transfer (t)RNA, and micro (mi)RNA.

What has been found is that distinct non-coding regions, including ncRNA, can be associated with distinct disease traits. miRNA are among the non-gene-encoding sequences in the genome which have already been shown to play a major post-transcriptional role in the expression of multiple genes.

Most miRNA genes are intergenic or oriented antisense to neighboring genes and are therefore assumed to be controlled by independent promoter units. However, in some cases a microRNA gene is transcribed together with its target gene, implying coupled regulation of the miRNA and the protein-coding gene. About one third of miRNA genes reside in polycistronic clusters. miRNA genes can occupy the introns of protein-coding genes, of non-protein-coding genes, or of non-protein-coding transcripts. The promoters have been shown to have some similarities in their motifs to promoters of other genes transcribed by RNA polymerase II, such as protein-coding genes. The ENCODE project also noted that miRNA promoters were in chromatin regions of high promiscuity. There may be up to 1000 miRNA genes in the human genome. In addition, human miRNAs show RNA editing of sequences to yield products different from those encoded by their DNA. miRNA are implicated in cellular roles as diverse as developmental timing in worms, cell death and fat metabolism in flies, haematopoiesis in mammals, and leaf development and floral patterning in plants.

The final miRNA gene product is a ∼22 nt functional RNA molecule. The mature miRNA (designated miR-#) is processed from a characteristic stem–loop sequence (called a pre-mir), which in turn may be excised from a longer primary transcript (or pri-mir). It is processed by the same enzyme (DICER) that processes short hairpin RNA, forming interfering RNA, which provides an additional level of control.

miRNA controls gene expression by binding to complementary regions of messenger transcripts in the 3’ untranslated region to repress their translation or regulate their degradation. What makes the mechanism more powerful (or complicated) is that the imperfect but specific binding motif associates with a large number of mRNAs carrying the complementary motif in their 3’ untranslated regions. Conversely, each mRNA can potentially associate with a number of miRNAs. Mature processed cytosolic miRNA can act in a manner akin to small interfering (si)RNA and form the RNA-induced silencing complex (RISC) to block translation. Computational methods have been used to identify potential gene targets based on complementarity between miRNA and mRNA sequences.
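The seed-matching idea behind these computational methods can be illustrated with a minimal Python sketch. The miRNA and UTR sequences below are entirely hypothetical, and real target-prediction tools add conservation and context scoring on top of this; the sketch only shows the core complementarity scan, taking the seed as miRNA nucleotides 2-8 and searching the 3' UTR for its reverse complement.

```python
# Minimal sketch of miRNA seed matching (illustrative only; sequences
# are made up and written in the DNA alphabet for simplicity).

def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[b] for b in reversed(seq))

def seed_sites(mirna: str, utr3: str) -> list:
    """Return 0-based positions in a 3' UTR that match the miRNA seed.

    The seed is taken as miRNA nucleotides 2-8 (a 7-mer); a site is a
    stretch of the UTR equal to the seed's reverse complement, i.e.
    capable of Watson-Crick pairing with it.
    """
    seed = mirna[1:8]               # positions 2..8, 1-based
    site = revcomp(seed)            # what the mRNA must contain
    return [i for i in range(len(utr3) - len(site) + 1)
            if utr3[i:i + len(site)] == site]

# Hypothetical miRNA (22 nt) and 3' UTR fragment with two seed sites.
mirna = "TAGCTTATCAGACTGATGTTGA"
utr = "CCAAGATAAGCTAGGGATAAGCTATT"
print(seed_sites(mirna, utr))   # -> [5, 16]
```

Because the pairing is short and imperfect in vivo, a single seed can match thousands of UTRs, which is exactly the one-miRNA-many-targets (and one-target-many-miRNAs) relationship described above.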

Gerstein et al. explored the “Architecture of the human regulatory network derived from ENCODE data” Nature 489:91-100 (06 Sep 2012) focusing on the regulation of transcription factors (TF) and association between TF and miRNAs, miRNA and miRNA, protein-protein interactions, and protein phosphorylation. Not surprisingly, not all TF are the upstream factor in each network.

These new and remarkably detailed examinations of the different elements within, and transcribed from, the human genome perhaps do more to aid our knowledge of why we have stumbled in attempts to eradicate diseases by initially focusing on a single gene or constellation of coding regions. The miRNA “wikipedia” is also being re-written on a daily basis, and new disease associations made*.  As an example of a pathological state that may be linked to miRNA-controlled elements, in vitro studies as well as small population studies have examined miRNA species in diabetogenic conditions and in patients with diabetes (Type I and Type II).

Diabetes and miRNA

In adult β-cell islets, miR-375 is low when glucose is freely available and low miR-375 induces insulin secretion. Interestingly, miR-375 is found only in brain and β-cells which share a secretion pathway.

Diabetic Complications

Organ specific miRNA have been identified in liver, skeletal muscle, kidney, vascular, and adipose tissue which are responsive to transient or sustained hyperglycemia.

miR-17-5p and miR-132 were reported to show significant differences between obese and non-obese omental fat and were also abnormal in the blood of obese subjects.  Altered expression of miR-17-5p and miR-132 was found to correlate significantly with BMI, fasting blood glucose and glycosylated hemoglobin (Kloting et al. PLoS ONE 4(3), e4699, 2009).

Clinical practice related to miRNA in diabetes may be possible as one group has identified eight miRNAs (miR-144, miR-146a, miR-150, miR-182, miR-192, miR-29a, miR-30d and miR-320) as potential ‘signature miRNAs’ that could distinguish prediabetic patients from those with overt T2D (Karolina DS, Armugam A, Tavintharan S et al. MicroRNA 144 impairs insulin signaling by inhibiting the expression of insulin receptor substrate 1 in Type 2 diabetes mellitus. PLoS ONE 6(8), e22839, 2011).
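To make the idea of a multi-miRNA signature concrete, here is a minimal Python sketch that computes a naive composite score (mean log2 fold-change) over the eight reported miRNAs. The expression values, the scoring formula, and the function names are illustrative assumptions made here; the cited study used proper statistical classification, not this calculation.

```python
# Illustrative only: a naive composite "signature score" over the eight
# miRNAs reported by Karolina et al., using hypothetical normalized
# expression values.
import math

SIGNATURE = ["miR-144", "miR-146a", "miR-150", "miR-182",
             "miR-192", "miR-29a", "miR-30d", "miR-320"]

def signature_score(sample: dict, reference: dict) -> float:
    """Mean log2 fold-change of the signature miRNAs vs. a reference."""
    return sum(math.log2(sample[m] / reference[m])
               for m in SIGNATURE) / len(SIGNATURE)

# Hypothetical normalized expression values.
healthy = {m: 1.0 for m in SIGNATURE}
patient = {m: 2.0 for m in SIGNATURE}      # uniformly 2-fold up-regulated
print(signature_score(patient, healthy))   # -> 1.0
```

A score of 0 means no average change relative to the reference; in practice one would threshold such a score (or, better, fit a classifier) against labeled cohorts rather than eyeball a single number.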

Due to the autoimmune component of T1D, the constellation of miRNA would be expected to be different: upregulation of miR-510 and underexpression of miR-191 and miR-342 were observed in the Tregs (regulatory T-cells) of T1D patients (Hezova R, Slaby O, Faltejskova P et al. microRNA-342, microRNA-191 and microRNA-510 are differentially expressed in T regulatory cells of Type 1 diabetic patients. Cell. Immunol. 260(2), 70–74, 2010).

Taken together with the “physical” mapping of miRNA genes in the context of the 3-dimensional genome provided by the ENCODE studies and new understanding of potential concerted regulatory mechanisms, the miRNA data for tissues and specific cell types involved in disease pathology form a new approach to either detecting or possibly correcting gene (coding or non-coding) dysregulation.  miRNA mimics and anti-miRNA agents are being developed as new therapeutic modalities.

References

Bartel DP. MicroRNAs: genomics, biogenesis, mechanism, and function. Cell 2004; 116:281-297.

Fernandez-Valverde, SL et al. MicroRNAs in beta-cell Biology, insulin resistance, diabetes and its complications. Diabetes July 2011 60 (7):1825-31.

Kantharidis, et al.  Diabetes Complications: The MicroRNA Perspective http://diabetes.diabetesjournals.org/content/60/7/1832.short

MEDSCAPE Review article: “miRNAs and Diabetes Mellitus: miRNAs in Diabetic Complications”  http://www.medscape.org/viewarticle/763729_6

*Based on initial studies in the worm C. elegans showing the temporal appearance of 21- and 22-nt RNAs during development, a family of highly conserved microRNA sequences (miRNA) existing in invertebrates and vertebrates was cataloged by Tuschl et al. at the Max-Planck-Institute and others (see Eddy SR. Non-coding RNA genes and the modern RNA world. Nature Reviews Genetics 2001; 2:920-929). The sequence-specific post-transcriptional regulatory mechanisms mediated by these miRNAs have been associated with certain disease states such as cancer (miR-21) and, more specifically, lung cancer (miR-124) or breast cancer (miR-7, miR-21), and new species and functions continue to be found (see http://www.mirbase.org/).

Read Full Post »

Expanding the Genetic Alphabet and Linking the Genome to the Metabolome

English: The citric acid cycle, also known as the tricarboxylic acid cycle (TCA cycle) or the Krebs cycle. Produced at WikiPathways. (Photo credit: Wikipedia)

 

Reporter & Curator: Larry Bernstein, MD, FCAP

Unlocking the diversity of genomic expression within tumorigenesis and “tailoring” of therapeutic options

1. Reshaping the DNA landscape between diseases and within diseases by the linking of DNA to treatments

In The New York Times of September 24, 2012, Gina Kolata reports on four types of breast cancer and the reshaping of breast cancer treatment based on the findings of the genetically distinct types, each of which has common “cluster” features that are driving many cancers.  The discoveries were published online in the journal Nature on Sunday (9/23).  The study is considered the first comprehensive genetic analysis of breast cancer and is called a roadmap to future breast cancer treatments.  I consider that if this is a landmark study in cancer genomics leading to personalized drug management of patients, it is also a fitting of treatment to measurable “combinatorial feature sets” that tie into population biodiversity with respect to known conditions.   The researchers caution that it will take years to establish transformative treatments, clearly because within the genetic types there are subsets that have a bearing on treatment “tailoring”.   In addition, there is growing evidence that the Watson-Crick model of the gene is itself being modified by an expansion of the alphabet used to construct the DNA library, which will open opportunities to explain some of what has been considered junk DNA and which may carry essential information with respect to metabolic pathways and pathway regulation.  The breast cancer study is tied to The Cancer Genome Atlas Project, already reported.  It is expected that this work will tie into building maps of genetic changes in common cancers, such as breast, colon, and lung.  What is not explicit, I presume, is a closely related concept: that the translational challenge is closely tied to the suppression of key proteomic processes through manipulation of the metabolome.

Saha S. Impact of evolutionary selection on functional regions: The imprint of evolutionary selection on ENCODE regulatory elements is manifested between species and within human populations. 9/12/2012. PharmaceuticalIntelligence.Wordpress.com

Hawrylycz MJ, Lein ES, Guillozet-Bongaarts AL, Shen EH, Ng L, et al. An anatomically comprehensive atlas of the adult human brain transcriptome. Nature  Sept 14-20, 2012

Sarkar A. Prediction of Nucleosome Positioning and Occupancy Using a Statistical Mechanics Model. 9/12/2012. PharmaceuticalIntelligence.WordPress.com

Heijden et al.   Connecting nucleosome positions with free energy landscapes. (Proc Natl Acad Sci U S A. 2012, Aug 20 [Epub ahead of print]).  http://www.ncbi.nlm.nih.gov/pubmed/22908247

2. Fiddling with an expanded genetic alphabet – greater flexibility in design of treatment (pharmaneogenesis?)

Diagram of DNA polymerase extending a DNA strand and proof-reading. (Photo credit: Wikipedia)

A clear indication of this emerging remodeling of the genetic alphabet is a new study led by scientists at The Scripps Research Institute, which appeared in the June 3, 2012 issue of Nature Chemical Biology and indicates that the genetic code as we know it may be expanded to include synthetic and unnatural sequence pairing (Study Suggests Expanding the Genetic Alphabet May Be Easier than Previously Thought, Genome). They infer that the genetic instruction set for living organisms, composed of four bases (C, G, A and T), is open to unnatural letters. An expanded “DNA alphabet” could carry more information than natural DNA, potentially coding for a much wider range of molecules and enabling a variety of powerful applications. This would further expand the translation of portions of DNA into new transcriptional proteins that are heretofore unknown but have metabolic relevance and therapeutic potential. The existence of such pairing in nature has been studied in eukaryotes for at least a decade and may have a role in biodiversity. The investigators show how a previously identified pair of artificial DNA bases can go through the DNA replication process almost as efficiently as the four natural bases. This could as well be translated into human diversity and human diseases.

The Romesberg laboratory collaborated on the new study and has been trying to find a way to extend the DNA alphabet since the late 1990s. In 2008, it developed the efficiently replicating bases NaM and 5SICS, which come together as a complementary base pair within the DNA helix, much as, in normal DNA, the base adenine (A) pairs with thymine (T) and cytosine (C) pairs with guanine (G). It had been clear that their chemical structures lack the ability to form the hydrogen bonds that join natural base pairs in DNA. Such bonds had been thought to be an absolute requirement for successful DNA replication, but that is not the case, because other bonds can be in play.

The data strongly suggested that NaM and 5SICS do not even approximate the edge-to-edge geometry of natural base pairs—termed the Watson-Crick geometry, after the co-discoverers of the DNA double helix. Instead, they join in a looser, overlapping, “intercalated” fashion that resembles a ‘mispair.’ In test after test, the NaM-5SICS pair was efficiently replicated even though it appeared that the DNA polymerase didn’t recognize it. The structural data showed that the NaM-5SICS pair maintains an abnormal, intercalated structure within double-helix DNA—but remarkably adopts the normal, edge-to-edge, “Watson-Crick” positioning when gripped by the polymerase during the crucial moments of DNA replication. NaM and 5SICS, lacking hydrogen bonds, are held together in the DNA double helix by “hydrophobic” forces, which cause certain molecular structures (like those found in oil) to be repelled by water molecules and thus to cling together in a watery medium.
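The pairing rules described here can be pictured as a six-letter complement table. In the sketch below, the single-letter symbols X (for NaM) and Y (for 5SICS) are placeholders chosen for illustration, not standard nomenclature, and the function is a toy model of template-directed strand copying, not the enzymology.

```python
# Sketch of a six-letter complement table including the unnatural
# NaM-5SICS pair. "X" (NaM) and "Y" (5SICS) are ad hoc symbols used
# here only for illustration.

COMPLEMENT = {
    "A": "T", "T": "A",
    "G": "C", "C": "G",
    "X": "Y", "Y": "X",   # NaM pairs with 5SICS (hydrophobic, not H-bonded)
}

def replicate_strand(template: str) -> str:
    """Return the complementary strand (read 5'->3') of a 5'->3'
    template, allowing the expanded six-letter alphabet."""
    return "".join(COMPLEMENT[b] for b in reversed(template))

print(replicate_strand("ATGXCY"))   # -> XGYCAT
```

Applying the function twice recovers the original template, which is the information-preserving property an expanded alphabet has to keep for replication to work.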

The finding suggests that NaM-5SICS, and potentially other hydrophobically bound base pairs, could be used to extend the DNA alphabet, and that evolution’s choice of the existing four-letter DNA alphabet on this planet was not the only one possible, leaving room for life based on other genetic systems.

3.  Studies that consider a DNA triplet model that includes one or more NATURAL nucleosides and appears closely allied to disulfide-bond formation and oxidation-reduction reactions.

This independent work is being conducted along similar conceptual lines. John Berger, founder of Triplex DNA, has commented on this. He emphasizes sulfur as the most important element for understanding the evolution of metabolic pathways in the human transcriptome, considered as a combination of the isotopes sulfur-34 and sulfur-32 (atomic mass units); he relates S-34 to fluorine chemistry and S-32 to phosphorus chemistry. In his view, the cysteine-cystine bond is the bridge and controller between inorganic chemistry (fluorine) and organic chemistry (phosphorus). He uses a dual spelling, “sulfphur,” to combine the two when referring to this master catalyst of oxidation-reduction reactions and its various isotopic alleles (note the duality principle, which he regards as nature’s most important pattern). Sulfphur chemistry embraces methionine, S-adenosylmethionine, cysteine, cystine, taurine, glutathione, acetyl-coenzyme A, biotin, lipoic acid, H2S, H2SO4, HSO3-, the cytochromes, thioredoxin, the ferredoxins, the purple and green sulfphur anaerobic bacteria (prokaryotes), hydrocarbons, garlic, penicillin and many other antibiotics, and hundreds of drugs against parasites and fungi; these are but a few names which come to mind. It is at the heart of the Krebs cycle and of oxidative phosphorylation, i.e., ATP production. It also provides a second pathway to purine metabolism and the nucleic acids: the key step between RNA and DNA is the thiol (SH) bond oxidized to the disulfide (SS) of cystine through thioredoxins, ferredoxins, and nitrogenase. The immune system is founded upon sulfphur compounds and processes. In photosynthesis, Fe4S4-to-Fe2S3 centers absorb across the electromagnetic spectrum as filtered by the Van Allen belt, some 75 miles above earth. Look up Chromatium vinosum or the Allochromatium species; there is reasonable evidence that these anaerobic sulfphur bacteria (Fe4S4), with high-potential (millivolt) centers, were among the first symbiotic species driving photosynthesis while making glucose from H2S.
He envisions a sulfphur control map to automate human metabolism with exact timing sequences at specific three-dimensional coordinates on Bravais crystalline lattices. He proposes adding the inosine-xanthosine family to the current five-nucleotide genetic code. Finally, he adds, the expanded genetic code would be populated with “synthetic nucleosides and nucleotides” carrying customized functional side groups, which often reshape nature’s allosteric and physicochemical properties. The inosine family is nature’s natural evolutionary partner of the adenosine and guanosine families in purine synthesis de novo, salvage, and catabolic degradation. Inosine metabolism involves three major sets of enzymes (IMPDH1, 2 and 3 for purine ring closure, HGPRT for purine salvage, and xanthine oxidase and xanthine dehydrogenase for degradation).

English: DNA replication or DNA synthesis is the process of copying a double-stranded DNA molecule. This process is paramount to all life as we know it. (Photo credit: Wikipedia)

3. Nutritional regulation of gene expression,  an essential role of sulfur, and metabolic control 

Finally, the research carried out for decades by Yves Ingenbleek and the late Vernon Young warrants mention. According to their work, sulfur is again tagged as essential for health. Sulfur (S) is the seventh most abundant element measurable in human tissues and its provision is mainly insured by the intake of methionine (Met) found in plant and animal proteins. Met is endowed with unique functional properties as it controls the ribosomal initiation of protein syntheses, governs a myriad of major metabolic and catalytic activities and may be subjected to reversible redox processes contributing to safeguard protein integrity.

Diets with inadequate amounts of methionine (Met) lead to overt or subclinical protein malnutrition, with serious morbid consequences. The result is a reduction in lean body mass (LBM), best identified by serial measurement of plasma transthyretin (TTR), and seen with unachieved replenishment (chronic malnutrition, strict veganism) or excessive losses (trauma, burns, inflammatory diseases). This status is accompanied by a rise in homocysteine and a concomitant fall in methionine. The ratio of S to N is quite invariant but depends on the source: typically 1:20 for plant proteins and 1:14.5 for animal proteins. The key enzyme controlling Met in man, cystathionine-β-synthase, declines with inadequate dietary provision of S, and the loss is not compensated by cobalamin-dependent CH3- transfer.
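As a back-of-envelope illustration of these ratios, dietary nitrogen can be estimated from protein intake using the standard Kjeldahl factor of 6.25 g protein per g N, and sulfur from the S:N ratios quoted above. The 80 g daily protein intake below is a hypothetical figure chosen only for illustration:

```python
# Hypothetical worked example of the S:N ratios quoted in the text.
# The Kjeldahl factor (6.25 g protein per g nitrogen) is standard;
# the 1:20 (plant) and 1:14.5 (animal) S:N ratios are from the text.

KJELDAHL_FACTOR = 6.25  # g protein per g nitrogen

def sulfur_from_protein(protein_g, s_to_n_ratio):
    """Estimate dietary sulfur (g) from protein intake (g)."""
    nitrogen_g = protein_g / KJELDAHL_FACTOR
    return nitrogen_g * s_to_n_ratio

plant_s = sulfur_from_protein(80.0, 1 / 20.0)    # all-plant protein
animal_s = sulfur_from_protein(80.0, 1 / 14.5)   # all-animal protein
print(f"80 g plant protein:  {plant_s:.2f} g S")
print(f"80 g animal protein: {animal_s:.2f} g S")
```

The same protein intake thus supplies about 38% more sulfur when it comes from animal sources, which is the quantitative core of the argument above.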

As a result of the disordered metabolic state arising from inadequate sulfur intake (the S:N ratio is lower in plants than in animals), the transsulfuration pathway is depressed at the cystathionine-β-synthase (CβS) level, triggering the upstream sequestration of homocysteine (Hcy) in biological fluids and promoting its reconversion to Met. Both processes stimulate comparable remethylation reactions from Hcy, indicating that Met homeostasis enjoys high metabolic priority. Maintenance of Met homeostasis is counterpoised by the drop of cysteine (Cys) and glutathione (GSH) values downstream of CβS, depleting the reducing molecules implicated in the regulation of the three desulfuration pathways.

4. The effect on accretion of LBM of protein malnutrition and/or the inflammatory state: in closer focus

Hepatic synthesis is influenced by nutritional and inflammatory circumstances working concomitantly, and liver production of TTR integrates the dietary and stressful components of any disease spectrum. Thus we see depletion of visceral transport proteins made by the liver together with fat-free weight loss secondary to protein catabolism. This is most accurately reflected by TTR, a rapid-turnover protein involved in transport that is essential for thyroid function (as thyroxine-binding prealbumin) and tied to retinol-binding protein. Furthermore, protein accretion depends on a sulfonation reaction consuming 2 ATP; consequently, kwashiorkor is associated with thyroid goiter, as the pituitary-thyroid axis is a major sulfonation target. With this in mind, it is not surprising that TTR is the sole plasma protein whose time course closely follows the shape outlined by LBM fluctuations. Serial measurement of TTR therefore provides unequaled information on the alterations affecting overall protein nutritional status. Recent advances in TTR physiopathology emphasize the detecting power and preventive role played by the protein in hyperhomocysteinemic states.

Individuals submitted to N-restricted regimens are basically able to maintain N homeostasis until very late in the starvation process. The N balance study, however, provides only an overall estimate of N gains and losses, failing to identify the tissue sites and specific interorgan fluxes involved. Using vastly improved methods, the LBM has been measured in its components. The LBM of the reference man contains 98% of total body potassium (TBK) and the bulk of total body sulfur (TBS). TBK and TBS reach equal intracellular amounts (140 g each) and share distribution patterns: half in skeletal muscle (SM) and half in the rest of the cell mass. The body content of K and S largely exceeds that of magnesium (19 g), iron (4.2 g) and zinc (2.3 g).

TBN and TBK are highly correlated in healthy subjects, and both parameters show an age-dependent curvilinear decline that accelerates after 65 years. Skeletal muscle (SM) undergoes a 15% reduction in size per decade, an involutive process. The trend toward sarcopenia is more marked and rapid in elderly men than in elderly women, decreasing strength and functional capacity. The downward SM slope may be somewhat slowed by physical training, or accelerated by supranormal cytokine status as reported in apparently healthy aged persons suffering low-grade inflammation and in critically ill patients whose muscle mass undergoes proteolysis.

5.  The results of the events described are:

  • Declining generation of hydrogen sulfide (H2S) from enzymatic sources and in the non-enzymatic reduction of elemental S to H2S.
  • The biogenesis of H2S via non-enzymatic reduction is further inhibited in areas where earth’s crust is depleted in elemental sulfur (S8) and sulfate oxyanions.
  • Elemental S operates as co-factor of several (apo)enzymes critically involved in the control of oxidative processes.

The combination of protein and sulfur dietary deficiencies constitutes a novel clinical entity threatening plant-eating population groups, whose defective production of the Cys, GSH and H2S reductants explains the persistence of an oxidative burden.

6. The clinical entity increases the risk of developing:

  • cardiovascular diseases (CVD) and
  • stroke

in plant-eating populations regardless of Framingham criteria and vitamin-B status.
Met molecules supplied by dietary proteins are submitted to transmethylation processes resulting in the release of Hcy which:

  • either undergoes Hcy — Met RM pathways or
  • is committed to transsulfuration decay.

Impairment of CβS activity, as described in protein malnutrition, entails supranormal accumulation of Hcy in body fluids, stimulation of remethylation activity, and maintenance of Met homeostasis. The data show that combined protein and S deficiencies work in concert to deplete the body reserves of Cys, GSH and H2S, impeding these reducing molecules from properly facing the oxidative stress imposed by hyperhomocysteinemia.

Although unrecognized up to now, the nutritional disorder is one of the commonest worldwide, reaching top prevalence in populated regions of Southeastern Asia. Increased risk of hyperhomocysteinemia and oxidative stress may also affect individuals suffering from intestinal malabsorption or westernized communities having adopted vegan dietary lifestyles.

Ingenbleek Y. Hyperhomocysteinemia is a biomarker of sulfur-deficiency in human morbidities. Open Clin. Chem. J. 2009 ; 2 : 49-60.

7. The dysfunctional metabolism in transitional cell transformation

A third development is also important and possibly related. The transition a cell goes through in becoming cancerous tends to be driven by changes to the cell’s DNA. But that is not the whole story. Large-scale techniques for studying the metabolic processes of cancer cells are being applied at Oxford, UK, in collaboration with Japanese workers. This thread will extend our insight into the metabolome. Otto Warburg, the pioneer of respiration studies, pointed out in the early 1900s that most cancer cells get the energy they need predominantly through high utilization of glucose with reduced respiration (the metabolic process that fully oxidizes glucose to release energy). This helps the cancer cells deal with the low oxygen levels that tend to be present in a tumor; the tissue reverts to a metabolic profile of anaerobiosis. Studies of the genetic basis of cancer and of dysfunctional metabolism in cancer cells are complementary. Tomoyoshi Soga’s large lab in Japan has been at the forefront of developing the technology for metabolomics research over the past couple of decades (metabolomics being the ugly-sounding term for research that studies all metabolic processes at once, just as genomics is the study of the entire genome).

Their results have led to the idea that some metabolic compounds, or metabolites, when they accumulate in cells, can alter metabolic processes and set cells on a path toward cancer. The collaborators have published a perspective article in the journal Frontiers in Molecular and Cellular Oncology proposing fumarate as such an ‘oncometabolite.’ Fumarate is a standard compound of cellular metabolism. The researchers summarize evidence showing that when the enzyme that normally processes fumarate is defective, the accumulating fumarate affects various biological pathways in the cell: it shifts the balance of metabolic processes and disrupts the cell in ways that could favor the development of cancer. This is of particular interest because fumarate is the intermediate in the TCA cycle that is converted to malate.

Animation of the structure of a section of DNA. The bases lie horizontally between the two spiraling strands. (Photo credit: Wikipedia)

The Keio group is able to label glucose or glutamine, basic biological sources of fuel for cells, and track the pathways cells use to burn up the fuel.  As these studies proceed, they could profile the metabolites in a cohort of tumor samples and matched normal tissue. This would produce a dataset of the concentrations of hundreds of different metabolites in each group. Statistical approaches could suggest which metabolic pathways were abnormal. These would then be the subject of experiments targeting the pathways to confirm the relationship between changed metabolism and uncontrolled growth of the cancer cells.

Read Full Post »

typical changes in CK-MB and cardiac troponin in Acute Myocardial Infarction (Photo credit: Wikipedia)

Reporter and curator:

Larry H Bernstein, MD, FCAP

This posting is a follow-up to two previous posts covering the design and handling of health information technology (HIT) to improve healthcare outcomes and to lower costs through better workflow and diagnostics, a process that is self-correcting over time.

The first example is a non-technology method designed by Lee Goldman (the Goldman algorithm), later implemented at Cook County Hospital in Chicago with great success. It has long been known that patients are over-triaged to intensive care beds, adding to the costs of medical care. If the differentiation between acute myocardial infarction (AMI) and other causes of chest pain could be made more accurate, the quantity of scarce resources used on unnecessary admissions could be reduced. The Goldman algorithm was introduced in 1982 during a training phase at Yale-New Haven Hospital based on 482 patients, and later validated at the Brigham and Women’s Hospital in Boston on 468 patients. They demonstrated improvement in sensitivity as well as specificity (67% to 77%) and positive predictive value (34% to 42%). The computer-derived algorithm was modified in 1988 to achieve better triage to the ICU of patients with chest pain, based on a study group of 1,379 patients, and the process was then tested prospectively on 4,770 patients at two university and four community hospitals. Specificity in recognizing the absence of AMI improved with the algorithm versus physician judgment (74% vs 71%), while sensitivity for admission did not differ (88%). Decisions based solely on the protocol would have decreased admissions of patients without AMI by 11.5% without adverse effects. The study was repeated by Qamar et al. with equal success.
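The triage statistics above are all derived from a 2x2 confusion matrix. The sketch below shows the definitions; the patient counts are hypothetical, chosen only to illustrate the arithmetic, and are not the Goldman study’s data:

```python
# Sensitivity, specificity, and positive predictive value from a 2x2
# confusion matrix. Counts are hypothetical, for illustration only.

def triage_metrics(tp, fn, fp, tn):
    """tp/fn: AMI patients flagged / missed; fp/tn: non-AMI flagged / excluded."""
    sensitivity = tp / (tp + fn)   # fraction of AMI patients correctly flagged
    specificity = tn / (tn + fp)   # fraction of non-AMI patients correctly excluded
    ppv = tp / (tp + fp)           # fraction of flagged patients who truly have AMI
    return sensitivity, specificity, ppv

sens, spec, ppv = triage_metrics(tp=88, fn=12, fp=110, tn=290)
print(f"sensitivity {sens:.1%}, specificity {spec:.1%}, PPV {ppv:.1%}")
```

Note how a gain in specificity translates directly into fewer unnecessary ICU admissions (a smaller fp count) at fixed sensitivity, which is precisely the improvement the algorithm demonstrated.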

Pain in acute myocardial infarction (front) (Photo credit: Wikipedia)

An ECG showing Pardee waves indicating acute myocardial infarction in the inferior leads II, III and aVF with reciprocal changes in the anterolateral leads. (Photo credit: Wikipedia)

Acute myocardial infarction with coagulative necrosis (4) (Photo credit: Wikipedia)

Goldman L, Cook EF, Brand DA, Lee TH, Rouan GW, Weisberg MC, et al. A computer protocol to predict myocardial infarction in emergency department patients with chest pain. N Engl J Med. 1988;318:797-803.

A Qamar, C McPherson, J Babb, L Bernstein, M Werdmann, D Yasick, S Zarich. The Goldman algorithm revisited: prospective evaluation of a computer-derived algorithm versus unaided physician judgment in suspected acute myocardial infarction. Am Heart J 1999; 138(4 Pt 1):705-709. ICID: 825629

The usual accepted method for determining the decision value of a predictive variable is the Receiver Operating Characteristic (ROC) curve, which requires mapping each value of the variable against the percentage with disease on the Y-axis. This requires a review of every case entered into the study. An ROC curve was used to validate a study classifying data on leukemia markers for research purposes, as shown by Jay Magidson in his demonstration of Correlated Component Regression (2012) (Statistical Innovations, Inc.). The contribution of each predictor is measured by the Akaike Information Criterion and the Bayesian Information Criterion, which have proved to be essential tests over the last 20 years.
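For reference, the two criteria have simple closed forms, and lower is better for both. The log-likelihoods and sample size in the comparison below are hypothetical, chosen only to show how the two criteria can disagree about an added predictor:

```python
import math

# Standard forms of the Akaike and Bayesian information criteria:
# lower values favor a model; BIC penalizes extra parameters more
# heavily as the sample size n grows.

def aic(log_likelihood, k):
    """Akaike Information Criterion: 2k - 2 ln L, for k parameters."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: k ln n - 2 ln L, for n observations."""
    return k * math.log(n) - 2 * log_likelihood

# Does adding a predictor (k: 3 -> 4) justify a log-likelihood gain
# of 2 units on n = 200 cases?
print("AIC:", aic(-120, 3), "->", aic(-118, 4))            # improves (246 -> 244)
print("BIC:", bic(-120, 3, 200), "->", bic(-118, 4, 200))  # worsens
```

With these hypothetical numbers AIC accepts the extra predictor while BIC rejects it, which is one reason the two criteria are usually reported together.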

I go back 20 years and revisit the application of these principles in clinical diagnostics, but the ROC was introduced to medicine in radiology earlier.   A full rendering of this matter can be found in the following:
R A Rudolph, L H Bernstein, J Babb. Information induction for predicting acute myocardial infarction.Clin Chem 1988; 34(10):2031-2038. ICID: 825568.

Rypka EW. Methods to evaluate and develop the decision process in the selection of tests. Clin Lab Med 1992; 12:355

Rypka EW. Syndromic Classification: A process for amplifying information using S-Clustering. Nutrition 1996;12(11/12):827-9.

Christianson R. Foundations of inductive reasoning. 1964.  Entropy Publications. Lincoln, MA.

Inability to classify information is a major problem in deriving and validating hypotheses from the PRIMARY data sets necessary to establish a measure of outcome effectiveness. When using quantitative data, decision limits must be determined that best distinguish the populations investigated. We are concerned with accurate assignment into uniquely verifiable groups using the information in test relationships. Uncertainty in assigning to a supervisory classification can only be relieved by providing sufficient data.

A method for examining the endogenous information in the data is used to determine decision points. The reference or null set is defined as a class having no information. When information is present in the data, the entropy (uncertainty in the data set) is reduced by the amount of information provided. This reduction is measurable and may be referred to as the Kullback-Leibler distance, which Akaike extended into statistical theory. An approach using EW Rypka’s S-Clustering has been devised to find optimal decision values that separate the groups being classified. Further, it is possible to obtain PRIMARY data on-line, continually creating primary classifications (learning matrices). From the primary classifications, test-minimized sets of features are determined with optimal, useful and sufficient information for accurately distinguishing elements (patients). More recent and complex work classifying hematology data, using a 30,000-patient data set and 16 variables to identify the anemias, moderate SIRS, sepsis, and lymphocytic and platelet disorders, has been published and recently presented. Another classification, for malnutrition and stress hypermetabolism, is now validated and in press in the journal Nutrition (2012), Elsevier.
G David, LH Bernstein, RR Coifman. Generating Evidence Based Interpretation of Hematology Screens via Anomaly Characterization. Open Clinical Chemistry Journal 2011; 4 (1):10-16. 1874-2416/11 2011 Bentham Open.  ICID: 939928

G David; LH Bernstein; RR Coifman. The Automated Malnutrition Assessment. Accepted 29 April 2012.
http://www.nutritionjrnl.com. Nutrition (2012), doi:10.1016/j.nut.2012.04.017.

Keywords: Network Algorithm; unsupervised classification; malnutrition screening; protein energy malnutrition (PEM); malnutrition risk; characteristic metric; characteristic profile; data characterization; non-linear differential diagnosis

Summary: We propose an automated nutritional assessment (ANA) algorithm that provides a method for malnutrition risk prediction with high accuracy and reliability. The problem of rapidly identifying risk and severity of malnutrition is crucial for minimizing medical and surgical complications. We characterized for each patient a unique profile and mapped similar patients into a classification. We also found that the laboratory parameters were sufficient for the automated risk prediction.
We here propose a simple, workable algorithm that assists in interpreting any set of data from a blood-analysis screen with high accuracy, reliability, and interoperability with an electronic medical record. This has become possible only recently, as a result of advances in mathematics, low computational costs, and rapid transmission of the necessary data for computation. In this example, acute myocardial infarction (AMI) is classified using isoenzyme CK-MB activity, total LD, and isoenzyme LD-1; repeated studies have shown the high power of laboratory features for diagnosis of AMI, especially with NSTEMI. A later study includes the scale values for chest pain and for ECG changes to create the model.

LH Bernstein, A Qamar, C McPherson, S Zarich.  Evaluating a new graphical ordinal logit method (GOLDminer) in the diagnosis of myocardial infarction utilizing clinical features and laboratory data. Yale J Biol Med 1999; 72(4):259-268. ICID: 825617

Shannon entropy, the quantitative measure of information, treats data as message transmission. We are interested in classifying data with near-errorless discrimination. The method assigns upper limits of normal to tests computed from Rudolph’s maximum-entropy definition of group-based normal reference. Using the Bernoulli trial to determine the maximum-entropy reference, we set the binary decision level for each test so that the probability of a positive result is the same for each test and conditionally independent of the other results. The entropy of the discrete distribution is calculated from the probabilities of the distribution. When there is information in the data, the probability distribution of the binary patterns is not flat and the entropy decreases; the decrease in entropy is the Kullback-Leibler distance.

The basic principle of separatory clustering is extracting features from endogenous data that amplify or maximize structural information into disjoint, separable classes. This differs from other methods because it finds in a database a theoretic (or larger) number of variables with the required VARIETY that map closest to an ideal, theoretic, or structural information standard. Scaling allows using variables with different numbers of message choices (number bases) in the same matrix: binary, ternary, etc. (representing yes-no; small, modest, large, largest). The ideal number of classes is defined by x^n. In viewing a variable’s value we think of it as low, normal, high, very high, and so on. A system works with related parts in harmony, and this frame of reference improves the applicability of S-clustering. By definition, a unit of information is log_r r = 1.

The method of creating a syndromic classification to control variety in the system also performs a semantic function by attributing a term to a Port Royal class: if any of the attributes is removed, the class loses its meaning. Any significant overlap between the groups can be reduced by adding requisite variety. S-clustering is an objective and highly desirable way to find the shortest route to diagnosis, and an objective way of determining practice parameters.

Multiple-test binary decision patterns at decision values CK-MB = 18 U/L, LD-1 = 36 U/L, %LD-1 = 32%.

No.               Pattern       Freq                   P1                       Self information                Weighted information

0                    000             26                   0.1831                    2.4493                                     0.4485
1                    001                3                    0.0211                   5.5648                                     0.1176
2                    010               4                    0.0282                   5.1497                                     0.1451
3                    011                2                    0.0141                   6.1497                                     0.0866
4                    100               6                    0.0423                   4.5648                                     0.1929
6                    110                8                    0.0563                  4.1497                                     0.2338
7                    111               93                   0.6549                  0.6106                                     0.3999

Entropy: sum of weighted information (average)           1.6243 bits

The effective information values are the least-error decision points. Non-AMI patients exhibit patterns 0, 1, 2, 3, and 4; AMI patients exhibit patterns 6 and 7. There is one false positive (in pattern 4) and one false negative (in pattern 6); the error rate is 1.4%.
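The table’s arithmetic can be checked directly: each pattern’s probability gives its self-information, and the frequency-weighted sum reproduces the quoted entropy. A short sketch using the frequencies above:

```python
import math

# Recompute the binary-pattern table: probabilities, self-information
# (-log2 p), and the weighted sum, which should reproduce the quoted
# entropy of about 1.6243 bits.

freqs = {0b000: 26, 0b001: 3, 0b010: 4, 0b011: 2,
         0b100: 6, 0b110: 8, 0b111: 93}   # pattern 101 was not observed
total = sum(freqs.values())               # 142 patients

entropy = 0.0
for pattern, f in freqs.items():
    p = f / total
    entropy += p * -math.log2(p)          # frequency-weighted self-information

# Three binary tests have a flat (maximum-entropy) reference of 3 bits;
# the reduction below that is the Kullback-Leibler distance, i.e. the
# information carried by the tests.
kl_distance = 3.0 - entropy
print(f"entropy {entropy:.4f} bits, information {kl_distance:.4f} bits")
```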

Summary:

A major problem in using quantitative data is lack of a justifiable definition of reference (normal).  Our information model consists of a population group, a set of attributes derived from observations, and basic definitions using Shannon’s information measure entropy. In this model, the population set and its values for its variables are considered to be the only information available.  The finding of a flat distribution with the Bernoulli test defines the reference population that has no information.  The complementary syndromic group, treated in the same way, produces a distribution that is not flat and has a less than maximum information uncertainty.

The vector of probabilities (1/2), (1/2), … (1/2) can be related to the path calculated from the Rypka-Fletcher equation

Ct = (1 - 2^-k) / (1 - 2^-n)

which determines the theoretical maximum comprehension from the test of n attributes. We constructed a ROC curve from the original iris data of R. A. Fisher, using the four measurements of sepal and petal, and obtained the result with information-based induction principles, determining the discriminant points without the prior classification that the discriminant analysis required. The principle of maximum entropy, as formulated by Jaynes and Tribus, proposes that for problems of statistical inference (which, as defined, are problems of induction) the probabilities should be assigned so that the entropy function is maximized. Good proposed that maximum entropy be used to define the null hypothesis, and Rudolph proposed that medical reference be defined as being at maximum entropy.
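As a quick numerical illustration, the equation can be evaluated directly. Reading k as the number of binary attributes resolved so far out of n (an interpretive assumption here, not spelled out in the text), Ct climbs toward 1 as k approaches n:

```python
# The Rypka-Fletcher expression quoted above, Ct = (1 - 2^-k)/(1 - 2^-n).
# k is read here as the number of binary attributes used so far out of n;
# that reading is an assumption made for illustration.

def comprehension(k, n):
    return (1 - 2.0 ** -k) / (1 - 2.0 ** -n)

for k in range(1, 5):
    print(f"k={k} of n=4: Ct={comprehension(k, 4):.4f}")
```

With k = n the ratio is exactly 1, the theoretical maximum, and the first few attributes contribute most of the comprehension, which is consistent with the test-minimized feature sets described earlier.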

Rudolph RA. A general purpose information processing automation: generating Port Royal Classes with probabilistic information. Intl Proc Soc Gen systems Res 1985;2:624-30.

Jaynes ET. Information theory and statistical mechanics. Phys Rev 1957;106:620-30.

Tribus M. Where do we stand after 30 years of maximum entropy? In: Levine RD, Tribus M, eds. The maximum entropy formalism. Cambridge, Ma: MIT Press, 1978.

Good IJ. Maximum entropy for hypothesis formulation, especially for multidimensional contingency tables. Ann Math Stat 1963;34:911-34.

The most important reason for using as many tests as is practicable derives from the prominent role of redundancy in transmitting information (the noisy-channel coding theorem). The proof of this theorem does not tell us how to accomplish nearly errorless discrimination, but it shows that redundancy is essential.

In conclusion, we have used the effective information (derived from the Kullback-Leibler distance) provided by more than one test to determine normal reference and to locate decision values. The syndromes and patterns extracted are empirically verifiable.

Read Full Post »

Reporter: Aviva Lev-Ari, PhD, RN

From MedscapeTopol on The Creative Destruction of Medicine

Topol: Consumer-Driven Healthcare Is an Uncomfortable Concept

Eric J. Topol, MD

Posted: 09/17/2012

Hi. I’m Dr. Eric Topol, Director of the Scripps Translational Science Institute and Editor-in-Chief of Medscape Genomic Medicine and theheart.org.

In this series, The Creative Destruction of Medicine, emanating from the book I wrote, I am trying to zoom in on critical aspects of how the digital world will create better healthcare. The segment that we are getting into today is on consumer-driven healthcare.

This is a concept that a lot of physicians are very uncomfortable with. Go back to the Gutenberg printing press: it was only then, in the late Middle Ages, that the Bible and other printed information could be read by people besides the high priests. In fact, that is an analogy for what is going to happen in medicine, because until now there has been this tremendous information asymmetry.

Essentially, all the data, information, and knowledge were in the domain of doctors and healthcare professionals, and the consumer, patient, and individual was out there without that information, not even their own data. But that’s changing very quickly.

Patients will have the capability of accessing notes from an office visit and hospital records, as well as laboratory data and DNA sequencing — and on one’s smartphone, for example, blood pressure and glucose and all the key physiologic metrics.

When each individual has access to all this critical data, there will be a real shakeup to the old way that medicine was practiced. In the past, the Internet was supposed to be empowering for consumers, but that really didn’t matter because what the consumer could get through the Internet was data about a population. Now, one can get data about oneself, and, of course, a center hub for that data-sharing will be the smartphone.

Even critical information based on one’s genomic sequencing, such as drug interactions, will have a whole different look. We’ve already learned so much about the direct-to-consumer movement from the pharmaceutical industry in which patients were directed to go to their doctors and ask them for a prescription drug. That had a very powerful impact.

But in the future, with each person potentially armed with so much data and information, the role of the doctor is a very different one: It is to provide guidance, wisdom, knowledge, and judgment and, of course, the critical aspects of compassion, empathy, and communication. That is a whole different look for the consumer-driven healthcare world of the future.

Thanks so much for your attention to this segment. We will be back with more on The Creative Destruction of Medicine.

 VIEW VIDEO

 

Read Full Post »

 

Reporter: Aviva Lev-Ari, PhD, RN

NATIONAL CENTERS FOR BIOMEDICAL COMPUTING

An overarching approach to several disciplines:

  • Other Genomics related subdisciplines:
  • The Biomedical Computing Space

An illustration of the systems approach to biology

http://en.wikipedia.org/wiki/Systems_biology

 

The National Centers for Biomedical Computing (NCBCs) are part of the U.S. NIH plan to develop and implement the core of a universal computing infrastructure that is urgently needed to speed progress in biomedical research. Their mission is to create innovative software programs and other tools that will enable the biomedical community to integrate, analyze, model, simulate, and share data on human health and disease.

Biomedical Information Science and Technology Initiative (BISTI): Recognizing the potential benefits to human health that can be realized from applying and advancing the field of biomedical computing, the Biomedical Information Science and Technology Initiative (BISTI) was launched at the NIH in April 2000. This initiative is aimed at making optimal use of computer science and technology to address problems in biology and medicine. The full text of the original BISTI Report (June 1999) is available.

Current Centers

  • National Center for Simulation of Biological Structures (SimBioS) at Stanford University
  • National Center for the Multiscale Analysis of Genomic and Cellular Networks (MAGNet) at Columbia University
  • National Alliance for Medical Image Computing (NA-MIC) at Brigham and Women’s Hospital, Boston, MA
  • Integrating Biology and the Bedside (I2B2) at Brigham and Women’s Hospital, Boston, MA
  • National Center for Biomedical Ontology (NCBO) at Stanford University
  • Integrating Data for Analysis, Anonymization, and Sharing (iDASH) at the University of California, San Diego

Biositemap is a way for a biomedical research institution or organisation to show how biological information is distributed throughout its information technology systems and networks. This information may be shared with other organisations and researchers.

The Biositemap enables web browsers, crawlers, and robots to easily access and process the information for use in other systems, media, and computational formats. Biositemaps protocols provide clues for Biositemap web harvesters, allowing them to find resources and content across the whole interlinked Biositemap system. This means that human or machine users can access relevant information on any topic across all participating organisations and bring it into their own systems for assimilation or analysis.
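A harvester’s core task is to fetch a biositemap and extract the resource entries it describes. The following is a minimal sketch using a simplified, invented XML listing; the real Biositemaps Information Model is RDF-based and carries far richer metadata, so this illustrates only the harvesting idea, not the actual format.

```python
import xml.etree.ElementTree as ET

# A simplified, hypothetical biositemap fragment. Real biositemaps use an
# RDF-based information model; resource names and URLs below are invented.
BIOSITEMAP_XML = """
<biositemap>
  <resource>
    <name>iTools</name>
    <type>software</type>
    <url>http://example.org/itools</url>
  </resource>
  <resource>
    <name>Sample Genomics Database</name>
    <type>data</type>
    <url>http://example.org/genomics-db</url>
  </resource>
</biositemap>
"""

def harvest(xml_text):
    """Extract (name, type, url) triples from a biositemap-like document."""
    root = ET.fromstring(xml_text)
    return [
        (r.findtext("name"), r.findtext("type"), r.findtext("url"))
        for r in root.findall("resource")
    ]

resources = harvest(BIOSITEMAP_XML)
for name, rtype, url in resources:
    print(f"{rtype}: {name} -> {url}")
```

In practice a harvester would fetch each institution’s biositemap over HTTP and merge the extracted resource records into a searchable index.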

http://en.wikipedia.org/wiki/Biositemaps

http://www.ncbcs.org/

For

Genome and Genetics: Resources @Stanford, @MIT, @NIH’s NCBCS

go to

http://pharmaceuticalintelligence.com/2012/09/18/genome-and-genetics-resources/

 

Biomedical Computation Review (BCR) is a quarterly, open-access magazine funded by the National Institutes of Health and published by Simbios, one of the National Centers for Biomedical Computing, located at Stanford University. First published in 2005, BCR covers such topics as molecular dynamics, genomics, proteomics, physics-based simulation, systems biology, and other research involving computational biology. BCR’s articles are targeted to those with a general science or biology background, in order to build a community among biomedical computational researchers who come from a variety of disciplines.

http://en.wikipedia.org/wiki/Biomedical_Computation_Review

 

http://en.wikipedia.org/wiki/Genomics

 

Read Full Post »

Reporter and Curator: Dr. Sudipta Saha, Ph.D.

 

Negative selection was examined using two measures that highlight different periods of selection in the human genome. The first measure, inter-species pan-mammalian constraint (GERP-based scores across 24 mammals), addresses selection during mammalian evolution. The second, intra-species constraint, is estimated from the number of variants discovered in human populations using data from the 1000 Genomes Project, and covers selection over recent human evolution.

For DNaseI elements and bound motifs, most sets of elements show enrichment in pan-mammalian constraint and decreased human population diversity, though for some cell types the DNaseI sites do not appear, overall, to be subject to pan-mammalian constraint. Bound TF motifs have a natural control: the set of TF motifs with equal sequence potential for binding but without binding evidence from ChIP-seq experiments. In all cases, the bound motifs showed both more mammalian constraint and greater suppression of human diversity.

Consistent with previous findings, genome-wide evidence was not observed for pan-mammalian selection of novel RNA sequences. There are also a large number of elements without mammalian constraint, between 17% and 90% for TF-binding regions as well as DHSs and FAIRE regions. Previous studies could not determine whether such sequences are biochemically active but with little overall impact on the organism, or are under lineage-specific selection. By isolating sequences preferentially inserted into the primate lineage, which is only feasible given the genome-wide scale of these data, this issue was specifically examined. The majority of primate-specific sequence is due to retrotransposon activity, but an appreciable proportion is non-repetitive primate-specific sequence. Of 104,343,413 primate-specific bases (excluding repetitive elements), 67,769,372 (65%) are found within ENCODE-identified elements. Examination of 227,688 variants segregating in these primate-specific regions revealed that all classes of elements (RNA and regulatory) show depressed derived allele frequencies, consistent with recent negative selection occurring in at least some of these regions. This suggests that an appreciable proportion of the unconstrained elements are lineage-specific elements required for organismal function, consistent with long-standing views of recent evolution, and that the remainder are likely to be “neutral” elements which are not currently under selection, but may still affect cellular or larger-scale phenotypes without an effect on fitness.
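The depressed derived allele frequencies (DAF) described above are the signature of recent negative selection: deleterious derived alleles are kept rare. A minimal sketch of the comparison, with invented coordinates and allele counts, is:

```python
# Toy illustration of the derived-allele-frequency (DAF) comparison:
# variants under recent negative selection should show lower DAF.
# Element intervals, positions, and counts below are invented.

elements = [(100, 200), (500, 650)]  # annotated element intervals (start, end)

variants = [
    # (position, derived allele count, total chromosomes sampled)
    (120, 2, 100),   # inside an element, rare derived allele
    (150, 1, 100),
    (550, 3, 100),
    (300, 40, 100),  # outside elements, common derived allele
    (400, 55, 100),
]

def in_element(pos):
    return any(start <= pos < end for start, end in elements)

def mean_daf(vs):
    dafs = [count / total for _, count, total in vs]
    return sum(dafs) / len(dafs)

inside = [v for v in variants if in_element(v[0])]
outside = [v for v in variants if not in_element(v[0])]

print(f"mean DAF inside elements:  {mean_daf(inside):.3f}")
print(f"mean DAF outside elements: {mean_daf(outside):.3f}")
```

With real data the comparison is made across the full site-frequency spectrum rather than a simple mean, but the direction of the effect is the same: lower derived allele frequencies inside functional elements.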

The binding patterns of TFs are not uniform, and both inter- and intra-species measures of negative selection can be correlated with the overall information content of motif positions. The selection on some motif positions is as high as on protein-coding exons. These aggregate measures across motifs show that the binding preferences found in the population of sites are also relevant to per-site behaviour. By developing a per-site metric of the population effect on bound motifs, it was found that instances that are highly constrained across mammals are able to buffer the impact of individual variation.

It was proposed to express the deleterious effect of TFBS mutations in terms of mutational load, a known population-genetics metric that combines the frequency of a mutation with the predicted phenotypic consequences it causes. This metric was adapted to use the reduction in PWM score associated with a mutation as a crude but computable measure of such phenotypic consequences. It was not assumed that TFBS load at a given site reduces an individual’s biological fitness; rather, it was argued that binding sites that tolerate a higher load are less functionally constrained. This approach, although undoubtedly a crude one, makes it possible to estimate TFBS constraints consistently for different TFs and even different organisms, and to ask why TFBS mutations are tolerated differently in different contexts.
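The exact formulation used in the study is not reproduced here, so the sketch below is an illustrative adaptation of the classic genetic-load form L = f × (w_ref − w_mut) / w_ref, with the PWM match score standing in for fitness, as the text describes. All numbers are invented.

```python
# Illustrative sketch of a TFBS mutational load, adapting the classic
# genetic-load form L = f * (w_ref - w_mut) / w_ref, with the PWM match
# score as a crude proxy for the phenotypic consequence of a mutation.
# The study's exact formulation may differ; all values are invented.

def motif_load(allele_freq, score_ref, score_mut):
    """Load contributed by one segregating TFBS mutation."""
    if score_ref <= 0:
        raise ValueError("reference PWM score must be positive")
    drop = max(0.0, score_ref - score_mut)  # only score-reducing changes count
    return allele_freq * drop / score_ref

# A rare mutation that strongly weakens the motif...
print(motif_load(0.01, 12.0, 4.0))
# ...versus a common mutation with a mild effect on the match score.
print(motif_load(0.30, 12.0, 10.0))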

It was first asked whether motif load could detect the expected link between evolutionary and individual variation. A published metric, the Branch Length Score (BLS), was used to characterise the evolutionary conservation of a motif instance; this metric utilises a PWM-based model of base conservation and allows for motif movement. Reassuringly, mutational load correlated with BLS in both species, with evolutionarily non-conserved motifs (BLS = 0) showing by far the highest degree of variation in the population. At the same time, ∼40% of human and fly TFBSs with an appreciable load (L > 5e-3) still mapped to reasonably conserved sites (BLS > 0.2, roughly the 50th percentile in both organisms), demonstrating that score-reducing mutations at evolutionarily preserved sequences can be tolerated in these populations.

Using this metric, the original findings were confirmed, suggesting that TFBSs with higher PWM scores are generally more functionally constrained compared to ‘weaker’ sites. The fraction of detected sites mapping to bound regions remained similar across the whole analysed score range, suggesting that this relationship is unlikely to be an artefact of higher false-positive rates at ‘weaker’ sites. This global observation, however, does not rule out the possibility that a weaker match at some sites is specifically preserved to ensure dose-specific TF binding. This may be the case, for example, for Drosophila Bric-à-brac motifs, which exhibited no correlation between motif load and PWM score, consistent with the known dosage-dependent function of Bric-à-brac in embryo patterning.

Motif load was used to address whether TFBSs proximal to transcription start sites (TSS) are more constrained than more distant regulatory regions. This was found to be the case in humans, but not in Drosophila. CTCF binding sites in both species were a notable exception, tolerating the lowest mutational load at locations 500 bp to 1 kb from the TSS rather than closer to it, suggesting that the putative role of CTCF in establishing chromatin domains is particularly important in the proximity of gene promoters.
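Analyses like the one above amount to binning binding sites by distance to the nearest TSS and averaging the tolerated load per bin. A toy sketch, with invented distances, loads, and bin edges:

```python
# Toy sketch: bin binding sites by distance to the nearest TSS and average
# the mutational load per bin. With real data, a CTCF-like profile would
# show its lowest tolerated load in the 500 bp - 1 kb bin.
# Distances (bp) and load values below are invented.

sites = [
    (150, 0.012), (300, 0.010), (700, 0.001),
    (800, 0.002), (5000, 0.015), (20000, 0.018),
]

bins = [(0, 500), (500, 1000), (1000, 10000), (10000, 100000)]

def mean_load_by_bin(sites, bins):
    out = {}
    for lo, hi in bins:
        loads = [load for dist, load in sites if lo <= dist < hi]
        out[(lo, hi)] = sum(loads) / len(loads) if loads else None
    return out

for (lo, hi), mean in mean_load_by_bin(sites, bins).items():
    print(f"{lo:>6}-{hi:<6} bp from TSS: mean load = {mean}")
```

Real analyses would also control for differences in PWM score and site density across distance bins before interpreting the profile.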

To gain further insight into the functional effects of TFBS mutations, a dataset was used that mapped human CTCF binding sites across four individuals. TFBS mutations detected in this dataset often did not result in a significant loss of binding, with ∼75% of mutated sites retaining at least two-thirds of the binding signal. This was particularly prominent at conserved sites (BLS > 0.5), 90% of which showed this ‘buffering’ effect. To address whether buffering could be explained solely by the flexibility of CTCF sequence preferences, between-allele differences in PWM score at polymorphic binding sites were analysed. As expected, CTCF binding signal globally correlated with the PWM score of the underlying motifs. Consistent with this, alleles with minor differences in PWM match generally had little effect on the binding signal compared to sites with larger PWM score changes, suggesting that the PWM model adequately describes the functional constraints of CTCF binding sites. At the same time, it was found that CTCF binding signals could be maintained even where mutations resulted in significant changes of PWM score, particularly at evolutionarily conserved sites. A linear interaction model confirmed that the effect of motif mutations on CTCF binding was significantly reduced with increasing conservation. These effects were not due to the presence of additional CTCF motifs (96% of bound regions contained only a single motif), while differences between more and less conserved sites could not be explained away by differences in the PWM scores of their major alleles. A CTCF dataset from three additional individuals, generated by a different laboratory, yielded consistent conclusions, suggesting that these observations were not due to over-fitting.
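The interaction effect can be illustrated by comparing the PWM-score/binding-signal correlation within conservation strata: at buffered, highly conserved sites the correlation weakens. The sketch below uses a hand-rolled Pearson correlation and invented data constructed to mimic that pattern; the study itself fit a linear interaction model rather than stratified correlations.

```python
# Sketch of the buffering observation: among highly conserved CTCF sites,
# binding signal tracks PWM score less tightly than at non-conserved sites.
# Data are invented to mimic that pattern.

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# (PWM score, binding signal) pairs
non_conserved = [(5, 10), (8, 22), (10, 31), (12, 38), (15, 52)]
conserved     = [(5, 40), (8, 44), (10, 38), (12, 47), (15, 45)]

r_nc = pearson(*zip(*non_conserved))
r_c = pearson(*zip(*conserved))
print(f"r (non-conserved sites): {r_nc:.2f}")
print(f"r (conserved sites):     {r_c:.2f}")
```

An interaction model (binding ~ score + conservation + score × conservation) captures the same effect as a significantly negative interaction term.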

Taken together, CTCF binding data from multiple individuals show that mutations can be buffered to maintain the level of binding signal, particularly at highly conserved sites, and that this effect cannot be explained solely by the flexibility of CTCF’s sequence consensus. It was asked whether the mechanisms potentially responsible for such buffering would also affect the relationship between sequence and binding in the absence of mutations. Training a linear interaction model across the whole set of mapped CTCF binding sites revealed that conservation consistently weakens the relationship between PWM score and binding intensity. Thus, CTCF binding to evolutionarily conserved sites may generally have a reduced dependence on sequence.

Source References:

http://www.nature.com/encode/threads/impact-of-evolutionary-selection-on-functional-regions

Read Full Post »
