
Archive for the ‘BioIT: BioInformatics’ Category


Reporter and Curator: Dr. Sudipta Saha, Ph.D.

 

Low sperm count and motility are markers for male infertility, a condition that the World Health Organization considers a neglected health issue worldwide. Researchers at Harvard Medical School have developed a very low-cost device that attaches to a cell phone and provides a quick and easy semen analysis. The device is still under development, but a study of its capabilities concludes that it is just as accurate as elaborate computer-assisted semen analysis machines costing tens of thousands of dollars in measuring sperm concentration, sperm motility, total sperm count and total motile cells.

 

The Harvard team isn’t the first to develop an at-home fertility test for men, but they are the first to be able to determine sperm concentration as well as motility. The scientists compared the smart phone sperm tracker to current lab equipment by analyzing the same semen samples side by side. They analyzed over 350 semen samples from both infertile and fertile men. The smart phone system was able to identify abnormal sperm samples with 98 percent accuracy. The results of the study were published in the journal Science Translational Medicine.

 

The device uses an optical attachment for magnification and a disposable microchip for handling the semen sample. With two lenses that require no manual focusing and an inexpensive battery, it slides onto the smart phone’s camera. Total cost for manufacturing the equipment: $4.45, including $3.59 for the optical attachment and 86 cents for the disposable micro-fluidic chip that contains the semen sample.

 

The software of the app is designed with a simple interface that guides the user through the test with onscreen prompts. After the sample is inserted, the app can photograph it, create a video and report the results in less than five seconds. The test results are stored on the phone so that semen quality can be monitored over time. The device is under consideration for approval from the Food and Drug Administration within the next two years.

 

With this device at home, a man can avoid the embarrassment and stress of providing a sample in a doctor’s clinic. The device could also be useful for men who get vasectomies, who are supposed to return to the urologist for semen analysis twice in the six months after the procedure. Compliance is typically poor, but with this device, a man could perform his own semen analysis at home and email the result to the urologist. This would make sperm analysis available in the privacy of the home and as easy as a home pregnancy test or blood sugar test.

 

The device costs about $5 to make in the lab and can be made available in the market at lower than $50 initially. This low cost could help provide much-needed infertility care in developing or underdeveloped nations, which often lack the resources for currently available diagnostics.

 

References:

 

https://www.nytimes.com/2017/03/22/well/live/sperm-counts-via-your-cellphone.html?em_pos=small&emc=edit_hh_20170324&nl=well&nl_art=7&nlid=65713389&ref=headline&te=1&_r=1

 

http://www.npr.org/sections/health-shots/2017/03/22/520837557/a-smartphone-can-accurately-test-sperm-count

 

https://www.ncbi.nlm.nih.gov/pubmed/28330865

 

http://www.sciencealert.com/new-smartphone-microscope-lets-men-check-the-health-of-their-own-sperm

 

https://www.newscientist.com/article/2097618-are-your-sperm-up-to-scratch-phone-microscope-lets-you-check/

 

https://www.dezeen.com/2017/01/19/yo-fertility-kit-men-test-sperm-count-smartphone-design-technology-apps/

 




2017 Agenda – BioInformatics: Track 6: BioIT World Conference & Expo ’17, May 23-25, 2017, Seaport World Trade Center, Boston, MA

Reporter: Aviva Lev-Ari, PhD, RN


http://www.bio-itworldexpo.com/Bio-It_Expo_Content.aspx?id=140955

  #BioIT17

TUESDAY, MAY 23

7:00 am Workshop Registration and Morning Coffee

8:00–11:30 Recommended Morning Pre-Conference Workshops*

(W4) Data Visualization to Accelerate Biological Discovery

12:30–4:00 pm Recommended Afternoon Pre-Conference Workshops*

(W13) Proteogenomics: Integration of Genomics and Proteomics Data

* Separate registration required.

2:00–6:00 Main Conference Registration Open

4:00 PLENARY KEYNOTE SESSION


5:00–7:00 Welcome Reception in the Exhibit Hall with Poster Viewing

WEDNESDAY, MAY 24

7:00 am Registration Open and Morning Coffee

8:00 PLENARY KEYNOTE SESSION


9:50 Coffee Break in the Exhibit Hall with Poster Viewing

APPLICATIONS & SOLUTIONS FOR DATA SHARING AND DECISION MAKING

10:50 Chairperson’s Remarks

Kevin Merlo, BioSafety Development Engineer, Dassault Systemes

11:00 Innovative Data Integration Applicable for Therapeutic Protein Development 2.0

Wolfgang Paul, Group Leader and Senior Scientist, Large Molecule Research, Roche

Therapeutic proteins are registered together with their sequence, structural, and functional data. Millions of data points captured during the development of Roche’s innovative therapeutic proteins are stored in a data warehouse used by DAMAS (data acquisition, management and analyses system). Fast access to and visualization of relevant process and analytical data drive scientific discussion and decision making. Analyzing the stored big data is key to process development of therapeutic proteins 2.0.

11:30 Informatics – A Silver Bullet for Pharmaceutical Sciences?

William Loging, Ph.D., Associate Professor of Genomics & Head, Production Bioinformatics, Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai

The Pharmaceutical Sciences field is in constant search for the next big innovative push that will increase the success rate of drug programs. The fields of computational chemistry and structural bioinformatics, to name just two, have changed the way drug researchers look for and identify novel drug candidates. Utilizing more than 15 years of Pharmaceutical experience, and using real world examples of high-profile drug projects, this talk will provide practical steps for the merger of informatics and the strategic approaches needed for drug discovery success.

12:00 pm Big Data-Driven Bioinformatics

Frank Lee, Ph.D., Healthcare Life Sciences Industry Leader, Software Defined Infrastructure, IBM Systems, IBM

IBM will discuss the IBM Reference Architecture for Genomics, its new features, and case studies: hybrid cloud with integrated workload and data management for high performance genomics analytics; container technologies for migrating and sharing application and data; and application portal and metadata engine for global access to and searching of distributed resources. A demo of a hybrid cloud-based bioinformatics solution will follow.

12:30 Session Break

12:40 Luncheon Presentation I to be Announced

1:10 Luncheon Presentation II to be Announced

1:40 Session Break

STANDARDS FOR CHEMICAL STRUCTURES

1:50 Chairperson’s Remarks

1:55 PANEL DISCUSSION: Linking and Finding Information Using the IUPAC InChI Standard for Chemical Structures

Steve Heller, Ph.D., Project Director, InChI Trust; Scientific Information Consultant (Moderator)

Evan Bolton, Ph.D., Lead Scientist, National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), and National Institutes of Health (NIH)

Keith T. Taylor, BSc, Ph.D., MRSC, Principal, Ladera Consultancy

Tyler Peryea, Informatics Scientist, National Center for Advancing Translational Sciences (NCATS)

Lawrence Callahan, Ph.D., Chemist, Substance Registration System, Office of Critical Path Programs, Food and Drug Administration (FDA)

This session will highlight on-going efforts to strengthen and expand the non-proprietary IUPAC International Chemical Identifier (InChI) standard for chemical structures and its hashed form, the InChIKey. Information standards are critical to enable effective communication of scientific content. Funding to maintain InChI comes from most major publishers and database providers as well as governmental agencies (NIH, FDA and NIST). The InChI is an open-source, widely adopted standard found in most databases that contain chemical information, including those from Chemical Abstracts, Reaxys, ChEMBL, OpenPHACTS, PubChem, DrugBank, PDB, Sigma-Aldrich, and many others, such as internal Pharma corporate databases. InChI is an addition to a database, not a replacement. With the implementation of the ISO identification of medicinal products (IDMP) and the related ISO 11238 standards, adding and having an InChI will allow for an easier, more effective, and more complete search for information on a particular drug.

2:55 Sponsored Presentation (Opportunity Available)

3:10 Integrated Informatics for Biologics Discovery

Robert Brown, Ph.D., Vice President, Product Marketing, Dotmatics

3:25 Refreshment Break in the Exhibit Hall with Poster Viewing

MACHINE LEARNING TECHNIQUES AND APPLICATIONS TO PERFORM BIG DATA ANALYTICS ON –OMICS DATA

4:00 Building Disease Networks Using Text Mining and Machine Learning Techniques

Kamal Rawal, Ph.D., Assistant Professor, Biotech and Bioinformatics, Jaypee Institute of Information Technology

Obesity is a global epidemic affecting over 1.5 billion people and is one of the risk factors for several diseases such as type 2 diabetes mellitus and hypertension. We have constructed a comprehensive map of the molecules reported to be implicated in obesity. Using text mining and deep curation strategies combined with omics data, we have explained the therapeutic and side effects of several drugs (e.g., orlistat) at the network level.

4:20 Big Data and Systems Biology: From Genome to Phenome (and Everything in Between)

Dan Jacobson, Ph.D., Computational Biologist, Oak Ridge National Laboratory

4:40 Novel Feature Selection Strategies for Enhanced Predictive Modeling and Deep Learning in the Biosciences

Tom Chittenden, Ph.D., D.Phil., Lecturer and Senior Biostatistics and Mathematical Biology Consultant, Harvard Medical School

We have built a robust AI approach that precisely assesses pathogenicity for all genomic missense variants. Coupled with our advanced deepCODE mathematical statistics feature selection strategy for constructing deep learning models, we are able to quantitatively integrate a priori pathway-based biological knowledge with multiple types of high-throughput omics data.

5:00 Network Analysis for Drug Discovery: Benchmarking Results and Best Practices Reported by CBDD Consortium

Marina Bessarabova, Ph.D., Senior Director, Discovery and Translational Science, Life Sciences Professional Services, Clarivate Analytics (Formerly the IP & Science Business of Thomson Reuters)

A large number of advanced approaches to network analysis of -omics data were developed by academic groups in the past 15 years. Adoption of these approaches in drug development requires thorough review of the published approaches, implementation of methods identified as potentially applicable to drug development and benchmarking of the methods with an aim to establish best practices for application of the methods to disease and mechanism of action understanding, target identification, drug repositioning, patient stratification, biomarker discovery, and drug combination effect prediction. CBDD (Computational Biology Methods for Drug Discovery) is a precompetitive consortium between Novartis, Pfizer, Sanofi, Janssen, Regeneron, UCB, Roche, Takeda, Biogen, Boehringer Ingelheim, Bristol-Myers Squibb, Merck and Clarivate Analytics (formerly Thomson Reuters) focused on adoption of network analysis approaches in drug development: literature review, method implementation and benchmarking. Benchmarking results and best practices for application of network analysis in drug development established by members of the program will be shared during the presentation.

5:30 15th Anniversary Celebration in the Exhibit Hall with Poster Viewing and Best of Show Awards

THURSDAY, MAY 25

7:00 am Registration Open and Morning Coffee

8:00 PLENARY KEYNOTE SESSION & AWARDS PROGRAM

8:05 Benjamin Franklin Awards and Laureate Presentation

8:35 Best Practices Awards Program

8:50 Plenary Keynote


9:45 Coffee Break in the Exhibit Hall and Poster Competition Winners Announced

DATA COMPUTING AND BIOINFORMATICS IN AGRO CHEMICALS AND BIOTECHNOLOGY: CHALLENGES AND OPPORTUNITIES

10:30 Chairperson’s Remarks

Bino John, Ph.D., Computational Biology Group Leader, Dow AgroSciences LLC

10:40 How Biotech and Big Data Are Changing Agro Industry

Bino John, Ph.D., Computational Biology Group Leader, Dow AgroSciences LLC

More than 70% of the increase in food production in the next 50 years is expected to come from technological advances. Indeed, recent advances in genomics and phenomics are beginning to transform the Agro-industry, thereby creating new opportunities for informatics disciplines. While informatics needs in managing, analyzing, and visualizing big data share commonalities between Agro and the biomedical communities, Agro companies face unprecedented challenges in big biological data, generally larger than their peers in the biomedical community.

11:00 Offering Outcomes: How Digital Farming Data Is Enabling New Business Models

Tobias Menne, Global Head of Digital Farming, Bayer

11:20 Building the Next-Generation R&D IT Infrastructure for Small Molecule Discovery

Paimun Amini, Chemistry IT Lead, R&D IT, Monsanto Company

Barrett Foat, Ph.D., Data Science Team Lead, Agricultural Productivity Innovations, Monsanto

The Pharma boom in the 90s & 2000s led to the emergence of a rich ecosystem of software companies focused on delivering the IT needs for small molecule discovery. Today, cloud data storage, IoT, and the growth of predictive analytics present new opportunities for the evolution of the R&D pipeline. New technologies allow for integrated software and hardware solutions that optimize productivity while removing the risk of technical debt.

11:40 Sponsored Presentation (Opportunity Available)

12:10 pm Session Break

12:20 Luncheon Presentation (Sponsorship Opportunity Available) or Enjoy Lunch on Your Own

1:20 Dessert Refreshment Break in the Exhibit Hall with Poster Viewing

LOOKING BEYOND THE GENOME OF THE PATIENT: DATA, ANALYSIS AND TOOLS TO IMPROVE BETTER DISEASE UNDERSTANDING FOR CURRENT TREATMENTS AND DRUG DEVELOPMENT

1:55 Chairperson’s Remarks

Michael N. Liebman, Ph.D., Managing Director, IPQ Analytics, LLC and Strategic Medicine, Inc.

2:00 Distinguishing between Precision Medicine and Accurate Medicine: Application to Heart Failure Patients and Clinical Practice

Michael N. Liebman, Ph.D., IPQ Analytics, LLC and Strategic Medicine, Inc.

Increasingly, patient stratification based on genomic analysis is being considered in disease management. Critically, the need to understand real world medical practice and real world patient complexities extends far beyond the genome of the patient. We have shown examples of this complexity in heart disease and how this impacts development of clinical guidelines, trial design, and development of new patient management approaches.

2:30 CARPEDIEM – Comorbidity and Risk Profiles Evaluation in Diabetes and Heart Morbidities

Sabrina Molinaro, Psy.D., Ph.D., Head, Department of Epidemiology and Health Services, Institute of Clinical Physiology, National Research Council of Italy

Our project uniquely develops a patient record that includes clinical and individual factors (EHR-driven phenotyping) that will be validated through the comparison of existing standards for building new risk algorithms. The talk will address the current limitations and biases of risk profiling in heart disease and diabetes, and how an extended, integrated database and automatic rule-based classification system can be used to improve patient management.

3:00 PANEL DISCUSSION: Precision Medicine vs. Accurate Medicine: The Need to Understand Real World Medicine and Real World Patients

Michael N. Liebman, Ph.D., IPQ Analytics, LLC and Strategic Medicine, Inc. (Moderator)

Charles Barr, M.D., MPH, Group Medical Director and Head, Evidence Science and Innovation, Genentech

Hal Wolf, Director, National Leader of Information and Digital Health Strategy, The Chartis Group

4:00 Conference Adjourns

SOURCE

http://www.bio-itworldexpo.com/bioinformatics/



Dr. Doudna: RNA synthesis capabilities of Synthego’s team represent a significant leap forward for Synthetic Biology

Reporter: Aviva Lev-Ari, PhD, RN

 

Synthego Raises $41 Million From Investors, Including a Top Biochemist

Synthego also drew in Dr. Doudna, who had crossed paths with the company’s head of synthetic biology at various industry conferences. According to Mr. Dabrowski, the money from her trust represents the single-biggest check from a non-institutional investor that the start-up has raised.

Synthego’s new funds will help the company take its products to a more global customer base, as well as broaden its offerings. The longer-term goal, Mr. Dabrowski said, is to help fully automate biotech research and take care of much of the laboratory work that scientists currently handle themselves.

The model is cloud technology, where companies rent out powerful remote server farms to handle their computing needs rather than rely on their own hardware.

“We’ll be able to do their full research workflow,” he said. “If you look at how cloud computing developed, it used to be that every company handled their server farm. Now it’s all handled in the cloud.”

SOURCE

Other related articles published in this Open Access Online Scientific Journal include the following:

UPDATED – Status “Interference — Initial memorandum” – CRISPR/Cas9 – The Biotech Patent Fight of the Century: UC, Berkeley and Broad Institute @MIT

Reporter: Aviva Lev-Ari, PhD, RN

https://pharmaceuticalintelligence.com/2016/01/06/status-interference-initial-memorandum-crisprcas9-the-biotech-patent-fight-of-the-century/

 



Translation of whole human genome sequencing to clinical practice: The Joint Initiative for Metrology in Biology (JIMB) is a collaboration between the National Institute of Standards & Technology (NIST) and Stanford University.

Reporter: Aviva Lev-Ari, PhD, RN

 

JIMB’s mission is to advance the science of measuring biology (biometrology). JIMB is pursuing fundamental research, standards development, and the translation of products that support confidence in biological measurements and reliable reuse of materials and results. JIMB is particularly focused on measurements and technologies that impact, are related to, or enabled by ongoing advances in and associated with the reading and writing of DNA.

Stanford innovators and industry entrepreneurs have joined forces with the measurement experts from NIST to create a new engine powering the bioeconomy. It’s called JIMB — “Jim Bee” — the Joint Initiative for Metrology in Biology. JIMB unites people, platforms, and projects to underpin standards-based research and innovation in biometrology.

Genome in a Bottle
Authoritative Characterization of
Benchmark Human Genomes


The Genome in a Bottle Consortium is a public-private-academic consortium hosted by NIST to develop the technical infrastructure (reference standards, reference methods, and reference data) to enable translation of whole human genome sequencing to clinical practice. The priority of GIAB is authoritative characterization of human genomes for use in analytical validation and technology development, optimization, and demonstration. In 2015, NIST released the pilot genome Reference Material 8398, which is genomic DNA (NA12878) derived from a large batch of the Coriell cell line GM12878, characterized for high-confidence SNPs, indels, and homozygous reference regions (Zook, et al., Nature Biotechnology 2014).

There are four new GIAB reference materials available. With the addition of these new reference materials (RMs) to a growing collection of “measuring sticks” for gene sequencing, we can now provide laboratories with even more capability to accurately “map” DNA for genetic testing, medical diagnoses and future customized drug therapies. The new tools feature sequenced genes from individuals in two genetically diverse groups, Asians and Ashkenazic Jews; a father-mother-child trio set from Ashkenazic Jews; and four microbes commonly used in research.

Data and analyses are publicly available (GIAB GitHub). A description of data generated by GIAB is published here. To standardize best practices for using GIAB genomes for benchmarking, we are working with the Global Alliance for Genomics and Health Benchmarking Team (benchmarking tools).

High-confidence small variant and homozygous reference calls are available for NA12878, the Ashkenazim trio, and the Chinese son with respect to GRCh37.  Preliminary high-confidence calls with respect to GRCh38 are also available for NA12878.   The latest version of these calls is under the latest directory for each genome on the GIAB FTP.

The consortium was initiated in a set of meetings in 2011 and 2012, and the consortium holds open, public workshops in January at Stanford University in Palo Alto, CA and in August/September at NIST in Gaithersburg, MD. Slides from workshops and conferences are available online. The consortium is open and welcomes new participants.

SOURCE


JIMB World Metrology Day Symposium

JIMB’s mission is to motivate standards-based measurement innovation to facilitate translation of basic science and technology development breakthroughs in genomics and synthetic biology.

By advancing biometrology, JIMB will push the boundaries of discovery science, accelerate technology development and dissemination, and generate reusable resources.

 SOURCE

VIEW VIDEO

https://vimeo.com/184956195

Other related articles published in this Open Access Online Scientific Journal include the following:

“Genome in a Bottle”: NIST’s new metrics for Clinical Human Genome Sequencing

Reporter: Aviva Lev-Ari, PhD, RN

https://pharmaceuticalintelligence.com/2012/09/06/genome-in-a-bottle-nists-new-metrics-for-clinical-human-genome-sequencing/





Series A: e-Books on Cardiovascular Diseases
 

Series A Content Consultant: Justin D Pearlman, MD, PhD, FACC

VOLUME THREE

Etiologies of Cardiovascular Diseases:

Epigenetics, Genetics and Genomics

http://www.amazon.com/dp/B018PNHJ84

 

by  

Larry H Bernstein, MD, FCAP, Senior Editor, Author and Curator

and

Aviva Lev-Ari, PhD, RN, Editor and Curator

Introduction to Volume Three 

PART 1
Genomics and Medicine

1.1  Genomics and Medicine: The Physician’s View

1.2  Ribozymes and RNA Machines – Work of Jennifer A. Doudna

1.3  Genomics and Medicine: Contributions of Genetics and Genomics to Cardiovascular Disease Diagnoses

1.4 Genomics Orientations for Individualized Medicine, Volume One

1.4.1 CVD Epidemiology, Ethnic subtypes Classification, and Medication Response Variability: Cardiology, Genomics and Individualized Heart Care: Framingham Heart Study (65 y-o study) & Jackson Heart Study (15 y-o study)

1.4.2 What comes after finishing the Euchromatic Sequence of the Human Genome?

1.5  Genomics in Medicine – Establishing a Patient-Centric View of Genomic Data

 

PART 2
Epigenetics – Modifiable Factors Causing Cardiovascular Diseases

2.1 Diseases Etiology

2.1.1 Environmental Contributors Implicated as Causing Cardiovascular Diseases

2.1.2 Diet: Solids, Fluid Intake and Nutraceuticals

2.1.3 Physical Activity and Prevention of Cardiovascular Diseases

2.1.4 Psychological Stress and Mental Health: Risk for Cardiovascular Diseases

2.1.5 Correlation between Cancer and Cardiovascular Diseases

2.1.6 Medical Etiologies for Cardiovascular Diseases: Evidence-based Medicine – Leading DIAGNOSES of Cardiovascular Diseases, Risk Biomarkers and Therapies

2.1.7 Signaling Pathways

2.1.8 Proteomics and Metabolomics

2.1.9 Sleep and Cardiovascular Diseases

2.2 Assessing Cardiovascular Disease with Biomarkers

2.2.1 Issues in Genomics of Cardiovascular Diseases

2.2.2 Endothelium, Angiogenesis, and Disordered Coagulation

2.2.3 Hypertension BioMarkers

2.2.4 Inflammatory, Atherosclerotic and Heart Failure Markers

2.2.5 Myocardial Markers

2.3  Therapeutic Implications: Focus on Ca(2+) signaling, platelets, endothelium

2.3.1 The Centrality of Ca(2+) Signaling and Cytoskeleton Involving Calmodulin Kinases and Ryanodine Receptors in Cardiac Failure, Arterial Smooth Muscle, Post-ischemic Arrhythmia, Similarities and Differences, and Pharmaceutical Targets

2.3.2 EMRE in the Mitochondrial Calcium Uniporter Complex

2.3.3 Platelets in Translational Research – 2: Discovery of Potential Anti-platelet Targets

2.3.4 The Final Considerations of the Role of Platelets and Platelet Endothelial Reactions in Atherosclerosis and Novel Treatments

2.3.5 Nitric Oxide Synthase Inhibitors (NOS-I)

2.3.6 Resistance to Receptor of Tyrosine Kinase

2.3.7 Oxidized Calcium Calmodulin Kinase and Atrial Fibrillation

2.3.8 Advanced Topics in Sepsis and the Cardiovascular System at its End Stage

2.4 Comorbidity of Diabetes and Aging

2.4.1 Heart and Aging Research in Genomic Epidemiology: 1700 MIs and 2300 coronary heart disease events among about 29 000 eligible patients

2.4.2 Pathophysiological Effects of Diabetes on Ischemic-Cardiovascular Disease and on Chronic Obstructive Pulmonary Disease (COPD)

2.4.3 Risks of Hypoglycemia in Diabetics with Chronic Kidney Disease (CKD)

2.4.4  Mitochondrial Mechanisms of Disease in Diabetes Mellitus

2.4.5 Mitochondria: More than just the “powerhouse of the cell”

2.4.6  Pathophysiology of GLP-1 in Type 2 Diabetes

2.4.7 Developments in the Genomics and Proteomics of Type 2 Diabetes Mellitus and Treatment Targets

2.4.8 CaMKII Inhibition in Obese, Diabetic Mice leads to Lower Blood Glucose Levels

2.4.9 Protein Target for Controlling Diabetes, Fractalkine: Mediator cell-to-cell Adhesion through CX3CR1 Receptor, Released from cells Stimulate Insulin Secretion

2.4.10 Peroxisome proliferator-activated receptor (PPAR-gamma) Receptors Activation: PPARγ transrepression for Angiogenesis in Cardiovascular Disease and PPARγ transactivation for Treatment of Diabetes

2.4.11 CABG or PCI: Patients with Diabetes – CABG Reigns Supreme

2.4.12 Reversal of Cardiac Mitochondrial Dysfunction

2.4.13  BARI 2D Trial Outcomes

2.4.14 Overview of new strategy for treatment of T2DM: SGLT2 inhibiting oral antidiabetic agents

2.5 Drug Toxicity and Cardiovascular Diseases

2.5.1 Predicting Drug Toxicity for Acute Cardiac Events

2.5.2 Cardiotoxicity and Cardiomyopathy Related to Drugs Adverse Effects

2.5.3 Decoding myocardial Ca2+ signals across multiple spatial scales: A role for sensitivity analysis

2.5.4 Leveraging Mathematical Models to Understand Population Variability in Response to Cardiac Drugs: Eric Sobie, PhD

2.5.5 Exploiting mathematical models to illuminate electrophysiological variability between individuals.

2.5.6 Clinical Effects and Cardiac Complications of Recreational Drug Use: Blood pressure changes, Myocardial ischemia and infarction, Aortic dissection, Valvular damage, and Endocarditis, Cardiomyopathy, Pulmonary edema and Pulmonary hypertension, Arrhythmias, Pneumothorax and Pneumopericardium

 

2.6 Male and Female Hormonal Replacement Therapy: The Benefits and the Deleterious Effects on Cardiovascular Diseases

2.6.1  Testosterone Therapy for Idiopathic Hypogonadotrophic Hypogonadism has Beneficial and Deleterious Effects on Cardiovascular Risk Factors

2.6.2 Heart Risks and Hormones (HRT) in Menopause: Contradiction or Clarification?

2.6.3 Calcium Dependent NOS Induction by Sex Hormones: Estrogen

2.6.4 Role of Progesterone in Breast Cancer Progression

PART 3
Determinants of Cardiovascular Diseases Genetics, Heredity and Genomics Discoveries

Introduction

3.1 Why cancer cells contain abnormal numbers of chromosomes (Aneuploidy)

3.1.1 Aneuploidy and Carcinogenesis

3.2 Functional Characterization of Cardiovascular Genomics: Disease Case Studies @ 2013 ASHG

3.3 Leading DIAGNOSES of Cardiovascular Diseases covered in Circulation: Cardiovascular Genetics, 3/2010 – 3/2013

3.3.1: Heredity of Cardiovascular Disorders

3.3.2: Myocardial Damage

3.3.3: Hypertension and Atherosclerosis

3.3.4: Ethnic Variation in Cardiac Structure and Systolic Function

3.3.5: Aging: Heart and Genetics

3.3.6: Genetics of Heart Rhythm

3.3.7: Hyperlipidemia, Hyper Cholesterolemia, Metabolic Syndrome

3.3.8: Stroke and Ischemic Stroke

3.3.9: Genetics and Vascular Pathologies and Platelet Aggregation, Cardiac Troponin T in Serum

3.3.10: Genomics and Valvular Disease

3.4  Commentary on Biomarkers for Genetics and Genomics of Cardiovascular Disease

PART 4
Individualized Medicine Guided by Genetics and Genomics Discoveries

4.1 Preventive Medicine: Cardiovascular Diseases

4.1.1 Personal Genomics for Preventive Cardiology Randomized Trial Design and Challenges

4.2 Gene-Therapy for Cardiovascular Diseases

4.2.1 Genetic Basis of Cardiomyopathy

4.3 Congenital Heart Disease/Defects

4.4 Cardiac Repair: Regenerative Medicine

4.4.1 A Powerful Tool For Repairing Damaged Hearts

4.4.2 Modified RNA Induces Vascular Regeneration After a Heart

4.5 Pharmacogenomics for Cardiovascular Diseases

4.5.1 Blood Pressure Response to Antihypertensives: Hypertension Susceptibility Loci Study

4.5.2 Statin-Induced Low-Density Lipoprotein Cholesterol Reduction: Genetic Determinants in the Response to Rosuvastatin

4.5.3 SNPs in apoE are found to influence statin response significantly. Less frequent variants in PCSK9 and smaller effect sizes in SNPs in HMGCR

4.5.4 Voltage-Gated Calcium Channel and Pharmacogenetic Association with Adverse Cardiovascular Outcomes: Hypertension Treatment with Verapamil SR (CCB) vs Atenolol (BB) or Trandolapril (ACE)

4.5.5 Response to Rosuvastatin in Patients With Acute Myocardial Infarction: Hepatic Metabolism and Transporter Gene Variants Effect

4.5.6 Helping Physicians identify Gene-Drug Interactions for Treatment Decisions: New ‘CLIPMERGE’ program – Personalized Medicine @ The Mount Sinai Medical Center

4.5.7 Is Pharmacogenetic-based Dosing of Warfarin Superior for Anticoagulation Control?

Summary & Epilogue to Volume Three

 

 



Previously undiscerned value of hs-troponin

Curators: Larry H. Bernstein, MD, FCAP and Aviva Lev-Ari, PhD, RN

LPBI

 

Troponin Rise Predicts CHD, HF, Mortality in Healthy People: ARIC Analysis

Veronica Hackethal, MD

Increases in levels of cardiac troponin T by high-sensitivity assay (hs-cTnT) over time are associated with later risk of death, coronary heart disease (CHD), and especially heart failure in apparently healthy middle-aged people, according to a report published June 8, 2016 in JAMA Cardiology[1].

The novel findings, based on a cohort of >8000 participants from the Atherosclerosis Risk in Communities (ARIC) study followed up to 16 years, are the first to show “an association between temporal hs-cTnT change and incident CHD events in asymptomatic middle-aged adults,” write the authors, led by Dr John W McEvoy (Johns Hopkins University School of Medicine, Baltimore, MD).

Individuals with the greatest troponin increases over time had the highest risk for poor cardiac outcomes. The strongest association was for risk of heart failure, which reached almost 800% for those with the sharpest hs-cTnT rises.

Intriguingly, those in whom troponin levels fell at least 50% had a reduced mortality risk and may have had a slightly decreased risk of later HF or CHD.

“Serial testing over time with high-sensitivity cardiac troponins provided additional prognostic information over and above the usual clinical risk factors, [natriuretic peptide] levels, and a single troponin measurement. Two measurements appear better than one when it comes to informing risk for future coronary heart disease, heart failure, and death,” McEvoy told heartwire from Medscape.

He cautioned, though, that the conclusion is based on observational data and would need to be confirmed in clinical trials. Moreover, high-sensitivity cardiac troponin assays are widely used in Europe but are not approved in the US.

An important next step after this study, according to an accompanying editorial from Dr James Januzzi (Massachusetts General Hospital, Boston, MA), would be to evaluate whether the combination of hs-troponin and natriuretic peptides improves predictive value in this population[2].

“To the extent prevention is ultimately the holy grail for defeating the global pandemic of CHD, stroke, and HF, the main reason to do a biomarker study such as this would be to set the stage for a biomarker-guided strategy to improve the medical care for those patients at highest risk, as has been recently done with [natriuretic peptides],” he wrote.

The ARIC prospective cohort study entered and followed 8838 participants (mean age 56, 59% female, 21.4% black) in North Carolina, Mississippi, Minneapolis, and Maryland from January 1990 to December 2011. At baseline, participants had no clinical signs of CHD or heart failure.

Levels of hs-cTnT, obtained 6 years apart, were categorized as undetectable (<0.005 ng/mL), detectable (≥0.005 ng/mL to <0.014 ng/mL), and elevated (≥0.014 ng/mL).

Troponin increases from <0.005 ng/mL to 0.005 ng/mL or higher independently predicted development of CHD (HR 1.41; 95% CI 1.16–1.63), HF (HR 1.96; 95% CI 1.62–2.37), and death (HR 1.50; 95% CI 1.31–1.72), compared with undetectable levels at both measurements.

Hazard ratios were adjusted for age, sex, race, body-mass index, C-reactive protein, smoking status, alcohol-intake history, systolic blood pressure, current antihypertensive therapy, diabetes, serum lipid and cholesterol levels, lipid-modifying therapy, estimated glomerular filtration rate, and left ventricular hypertrophy.

Subjects with >50% increase in hs-cTnT had a significantly increased risk of CHD (HR 1.28; 95% CI 1.09–1.52), HF (HR 1.60; 95% CI 1.35–1.91), and death (HR 1.39; 95% CI 1.22–1.59).

Risks for those end points fell somewhat for those with a >50% decrease in hs-cTnT (CHD: HR 0.47; 95% CI 0.22–1.03; HF: HR 0.49; 95% CI 0.23–1.01; death: HR 0.57; 95% CI 0.33–0.99), although only the interval for death excludes 1.
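The categories and the 50% change groups reported above can be written as a short sketch. The function names are hypothetical, counting a value of exactly 0.014 ng/mL as elevated is an assumption, and the article does not say how relative change was computed when the first measurement was undetectable, so the sketch simply requires a positive first value.

```python
def categorize_hs_ctnt(level_ng_ml):
    """Map an hs-cTnT level (ng/mL) to the three categories quoted in the article."""
    if level_ng_ml < 0.005:
        return "undetectable"
    if level_ng_ml < 0.014:
        return "detectable"
    return "elevated"  # assumption: exactly 0.014 ng/mL is treated as elevated


def relative_change(first, second):
    """Fractional change between two measurements (second relative to first)."""
    if first <= 0:
        raise ValueError("first measurement must be positive")
    return (second - first) / first


def change_group(first, second):
    """Assign the >50% increase / >50% decrease groups used in the report."""
    change = relative_change(first, second)
    if change > 0.5:
        return ">50% increase"
    if change < -0.5:
        return ">50% decrease"
    return "within 50%"


# Hypothetical pair of measurements taken 6 years apart
print(categorize_hs_ctnt(0.006), categorize_hs_ctnt(0.015), change_group(0.006, 0.015))
```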

Among participants with an adjudicated HF hospitalization, the group writes, associations of hs-cTnT changes with outcomes were of similar magnitude for those with HF with preserved ejection fraction (HFpEF) and HF with reduced ejection fraction (HFrEF).

Few biomarkers have been linked to increased risk for HFpEF, and few effective therapies exist for it. That may be due to problems identifying and enrolling patients with HFpEF in clinical trials, Dr McEvoy pointed out.

“We think the increased troponin over time reflects progressive myocardial injury or progressive myocardial damage,” Dr McEvoy said. “This is a window into future risk, particularly with respect to heart failure but other outcomes as well. It may suggest high-sensitivity troponins as a marker of myocardial health and help guide interventions targeting the myocardium.”

Moreover, he said, “We think that high-sensitivity troponin may also be a useful biomarker along with [natriuretic peptides] for emerging trials of HFpEF therapy.”

But whether hs-troponin has the potential for use as a screening tool is a question for future studies, according to McEvoy.

In his editorial, Januzzi pointed out several implications of the study, including the possibility for lowering cardiac risk in those with measurable hs-troponin, and that HF may be the most obvious outcome to target. Also, optimizing treatment and using cardioprotective therapies may reduce risk linked to increases in hs-troponin. Finally, long-term, large clinical trials on this issue will require a multidisciplinary team effort from various sectors.

“What is needed now are efforts toward developing strategies to upwardly bend the survival curves of those with a biomarker signature of risk, leveraging the knowledge gained from studies such as the report by McEvoy et al to improve public health,” he concluded.

 



mRNA data survival analysis

Curators: Larry H. Bernstein, MD, FCAP and Aviva Lev-Ari, PhD, RN

LPBI

 

SURVIV for survival analysis of mRNA isoform variation

Shihao Shen, Yuanyuan Wang, Chengyang Wang, Ying Nian Wu & Yi Xing
Nature Communications 7, Article number: 11548
Feb 2016, doi:10.1038/ncomms11548

The rapid accumulation of clinical RNA-seq data sets has provided the opportunity to associate mRNA isoform variations to clinical outcomes. Here we report a statistical method SURVIV (Survival analysis of mRNA Isoform Variation), designed for identifying mRNA isoform variation associated with patient survival time. A unique feature and major strength of SURVIV is that it models the measurement uncertainty of mRNA isoform ratio in RNA-seq data. Simulation studies suggest that SURVIV outperforms the conventional Cox regression survival analysis, especially for data sets with modest sequencing depth. We applied SURVIV to TCGA RNA-seq data of invasive ductal carcinoma as well as five additional cancer types. Alternative splicing-based survival predictors consistently outperform gene expression-based survival predictors, and the integration of clinical, gene expression and alternative splicing profiles leads to the best survival prediction. We anticipate that SURVIV will have broad utilities for analysing diverse types of mRNA isoform variation in large-scale clinical RNA-seq projects.

Eukaryotic cells generate remarkable regulatory and functional complexity from a finite set of genes. Production of mRNA isoforms through alternative processing and modification of RNA is essential for generating this complexity. A prevalent mechanism for producing mRNA isoforms is the alternative splicing of precursor mRNA1. Over 95% of the multi-exon human genes undergo alternative splicing2, 3, resulting in an enormous level of plasticity in the regulation of gene function and protein diversity. In the last decade, extensive genomic and functional studies have firmly established the critical role of alternative splicing in cancer4, 5, 6. Alternative splicing is involved in a full spectrum of oncogenic processes including cell proliferation, apoptosis, hypoxia, angiogenesis, immune escape and metastasis7, 8. These cancer-associated alternative splicing patterns are not merely the consequences of disrupted gene regulation in cancer but in numerous instances actively contribute to cancer development and progression. For example, alternative splicing of genes encoding the Bcl-2 family of apoptosis regulators generates both anti-apoptotic and pro-apoptotic protein isoforms9. Alternative splicing of the pyruvate kinase M (PKM) gene has a significant impact on cancer cell metabolism and tumour growth10. A transcriptome-wide switch of the alternative splicing programme during the epithelial–mesenchymal transition plays an important role in cancer cell invasion and metastasis11, 12.

RNA sequencing (RNA-seq) has become a popular and cost-effective technology to study transcriptome regulation and mRNA isoform variation13, 14. As the cost of RNA-seq continues to decline, it has been widely adopted in large-scale clinical transcriptome projects, especially for profiling transcriptome changes in cancer. For example, as of April 2015 The Cancer Genome Atlas (TCGA) consortium had generated RNA-seq data on over 11,000 cancer patient specimens from 34 different cancer types. Within the TCGA data, breast invasive carcinoma (BRCA) has the largest sample size of RNA-seq data covering over 1,000 patients, and clinical information such as survival times, tumour stages and histological subtypes is available for the majority of the BRCA patients15. Moreover, the median follow-up time of BRCA patients is ~400 days, and 25% of the patients have more than 1,200 days of follow-up. Collectively, the large sample size and long follow-up time of the TCGA BRCA data set allow us to correlate genomic and transcriptomic profiles to clinical outcomes and patient survival times.

To date, systematic analyses have been performed to reveal the association between copy number variation, DNA methylation, gene expression and microRNA expression profiles with cancer patient survival16, 17. By contrast, despite the importance of mRNA isoform variation and alternative splicing, there have been limited efforts in transcriptome-wide survival analysis of alternative splicing in cancer patients. Most RNA-seq studies of alternative splicing in cancer transcriptomes focus on identifying ‘cancer-specific’ alternative splicing events by comparing cancer tissues with normal controls (see refs 18, 19, 20, 21, 22, 23 for examples). A recent analysis of TCGA RNA-seq data identified 163 recurrent differential alternative splicing events between cancer and normal tissues of three cancer types, among which five were found to have suggestive survival signals for breast cancer at a nominal P-value cutoff of 0.05 (ref. 21). Some other studies reported a significant survival difference between cancer patient subgroups after stratifying patients with overall mRNA isoform expression profiles24, 25. However, systematic cancer survival analyses of alternative splicing at the individual exon resolution have been lacking. Two main challenges exist for survival analyses of mRNA isoform variation and alternative splicing using RNA-seq data. The first challenge is to account for the estimation uncertainty of mRNA isoform ratios inferred from RNA-seq read counts. The statistical confidence of mRNA isoform ratio estimation depends on the RNA-seq read coverage for the events of interest, with larger read coverage leading to a more reliable estimation14. Modelling the estimation uncertainty of mRNA isoform ratio is an essential component of RNA-seq analyses of alternative splicing, as shown by various statistical algorithms developed for detecting differential alternative splicing from multi-group RNA-seq data14, 26, 27, 28,29. 
The second challenge, which is a general issue in survival analysis, is to properly model the association of mRNA isoform ratio with survival time, while accounting for missing data in survival time because of censoring, that is, patients still alive at the end of the survival study, whose precise survival time would be uncertain. To date, no algorithm has been developed for survival analyses of mRNA isoform variation that accounts for these sources of uncertainty simultaneously.

Here we introduce SURVIV (Survival analysis of mRNA Isoform Variation), a statistical model for identifying mRNA isoform ratios associated with patient survival times in large-scale cancer RNA-seq data sets. SURVIV models the estimation uncertainty of mRNA isoform ratios in RNA-seq data and tests the survival effects of isoform variation in both censored and uncensored survival data. In simulation studies, SURVIV consistently outperforms the conventional Cox regression survival analysis that ignores the measurement uncertainty of mRNA isoform ratio. We used SURVIV to identify alternatively spliced exons whose exon-inclusion levels significantly correlated with the survival times of invasive ductal carcinoma (IDC) patients from the TCGA breast cancer cohort. Survival-associated alternative splicing events are identified in gene pathways associated with apoptosis, oxidative stress and DNA damage repair. Importantly, we show that alternative splicing-based survival predictors outperform gene expression-based survival predictors in the TCGA IDC RNA-seq data set, as well as in TCGA data of five additional cancer types. Moreover, the integration of clinical information, gene expression and alternative splicing profiles leads to the best prediction of survival time.

SURVIV statistical model

The statistical model of SURVIV assesses the association between mRNA isoform ratio and patient survival time. While the model is generic for many types of alternative isoform variation, here we use the exon-skipping type of alternative splicing to illustrate the model (Fig. 1a). For each alternative exon involved in exon-skipping, we can use the RNA-seq reads mapping to its exon-inclusion or -skipping isoform to estimate its exon-inclusion level (denoted as ψ, or PSI that is Per cent Spliced In14). A key feature of SURVIV is that it models the RNA-seq estimation uncertainty of exon-inclusion level as influenced by the sequencing coverage for the alternative splicing event of interest. This is a critical issue in accurate quantitative analyses of mRNA isoform ratio in large-scale RNA-seq data sets14, 26, 27, 28, 29. Therefore, SURVIV contains two major components: the first to model the association of mRNA isoform ratio with patient survival time across all patients, and the second to model the estimation uncertainty of mRNA isoform ratio in each individual patient (Fig. 1a).

Figure 1: The statistical framework of the SURVIV model.

(a) For each patient k, the patient’s hazard rate λk(t) is associated with the baseline hazard rate λ0(t) and this patient’s exon-inclusion level ψk. The association of exon-inclusion level with patient survival is estimated by the survival coefficient β. The exon-inclusion level ψk is estimated from the read counts for the exon-inclusion isoform ICk and the exon-skipping isoform SCk. The proportion of the inclusion and skipping reads is adjusted by a normalization function f that considers the lengths of the exon-inclusion and -skipping isoforms (see details in Results and Supplementary Methods). (b) A hypothetical example to illustrate the association of exon-inclusion level with patient survival probability over time Sk(t), with the survival coefficient β=−1 and a constant baseline hazard rate λ0(t)=1. In this example, patients with higher exon-inclusion levels have lower hazard rates and higher survival probabilities. (c) The schematic diagram of an exon-skipping event. The exon-inclusion reads ICk are the reads from the upstream splice junction, the alternative exon itself and the downstream splice junction. The exon-skipping reads SCk are the reads from the skipping splice junction that directly connects the upstream exon to the downstream exon.

Briefly, for any individual exon-skipping event, the first component of SURVIV uses a proportional hazards model to establish the relationship between patient k’s exon-inclusion level ψk and hazard rate λk(t):

$$\lambda_k(t) = \lambda_0(t)\,\exp(\beta\,\psi_k)$$

For each exon, the association between the exon-inclusion level and patient survival time is reflected by the survival coefficient β. A positive β means increased exon inclusion is associated with higher hazard rate and poorer survival, while a negative β means increased exon inclusion is associated with lower hazard rate and better survival. λ0(t) is the baseline hazard rate estimated from the survival data of all patients (see Supplementary Methods for the detailed estimation procedure). A particular patient’s survival probability over time Sk(t) can be calculated from the patient-specific hazard rate λk(t) as $S_k(t) = \exp\left(-\int_0^t \lambda_k(u)\,du\right)$. Figure 1b illustrates a simple example with a negative β=−1 and a constant baseline hazard rate λ0(t)=1, where higher exon-inclusion levels are associated with lower hazard rates and higher survival probabilities.
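The Figure 1b example can be reproduced numerically with a minimal sketch. It assumes the standard proportional hazards form, λk(t) = λ0(t)·exp(β·ψk), and a constant baseline hazard, so the cumulative hazard is just the hazard multiplied by t. The function names are hypothetical and are not part of the SURVIV software.

```python
import math


def hazard_rate(psi, beta, baseline_hazard=1.0):
    """Patient-specific hazard under an assumed proportional hazards form."""
    return baseline_hazard * math.exp(beta * psi)


def survival_probability(t, psi, beta, baseline_hazard=1.0):
    """S(t) = exp(-cumulative hazard); the hazard is constant here, so it is hazard * t."""
    return math.exp(-hazard_rate(psi, beta, baseline_hazard) * t)


# beta = -1 and baseline hazard = 1, as in the Figure 1b description
for psi in (0.1, 0.5, 0.9):
    print(psi, round(hazard_rate(psi, -1.0), 3), round(survival_probability(1.0, psi, -1.0), 3))
```

With a negative β, the printed hazard falls and the survival probability at t=1 rises as ψ increases, which is the pattern the figure legend describes.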

The second component of SURVIV models the exon-inclusion level and its estimation uncertainty in individual patient samples. As illustrated in Fig. 1c, the exon-inclusion level ψk of a given exon in a particular sample can be estimated by the RNA-seq read count specific to the exon inclusion isoform (ICk) and the exon-skipping isoform (SCk). Other types of alternative splicing and mRNA isoform variation can be similarly modelled by this framework29. Given the effective lengths (that is, the number of unique isoform-specific read positions) of the exon-inclusion isoform (lI) and the exon-skipping isoform (lS), the exon-inclusion level ψk can be estimated as $\hat{\psi}_k = \frac{IC_k/l_I}{IC_k/l_I + SC_k/l_S}$. Assuming that the exon-inclusion read count ICk follows a binomial distribution with the total read count nk=ICk+SCk, we have:

$$IC_k \sim \mathrm{Binomial}\left(n_k = IC_k + SC_k,\; p_k = f(\psi_k) = \frac{l_I\,\psi_k}{l_I\,\psi_k + l_S\,(1-\psi_k)}\right)$$

The binomial distribution models the estimation uncertainty of ψk as influenced by the total read count nk, in which the parameter pk represents the proportion of reads from the exon-inclusion isoform, given the exon-inclusion level ψk adjusted by a length normalization function f(ψk) based on the effective lengths of the isoforms. The definitions of effective lengths for all basic types of alternative splicing patterns are described in ref. 29.
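A small sketch may help make the length normalization concrete. It assumes the usual length-normalized estimate of ψ (read counts divided by effective lengths) and a normalization function f(ψ) = lI·ψ / (lI·ψ + lS·(1−ψ)). The function names and the read counts and effective lengths in the example are hypothetical.

```python
def estimate_psi(inclusion_reads, skipping_reads, len_inclusion, len_skipping):
    """Length-normalized point estimate of the exon-inclusion level."""
    inclusion_density = inclusion_reads / len_inclusion
    skipping_density = skipping_reads / len_skipping
    total = inclusion_density + skipping_density
    if total == 0:
        raise ValueError("no reads cover this splicing event")
    return inclusion_density / total


def inclusion_read_proportion(psi, len_inclusion, len_skipping):
    """Assumed form of f(psi): expected fraction of reads from the inclusion isoform."""
    weighted_inclusion = len_inclusion * psi
    return weighted_inclusion / (weighted_inclusion + len_skipping * (1.0 - psi))


def binomial_std_error(p, n):
    """Standard error of a binomial proportion; it shrinks as the read count n grows."""
    return (p * (1.0 - p) / n) ** 0.5


# Hypothetical event: 30 inclusion reads, 10 skipping reads, effective lengths 3 and 1
psi_hat = estimate_psi(30, 10, 3, 1)
p_hat = inclusion_read_proportion(psi_hat, 3, 1)
print(psi_hat, p_hat)  # p_hat equals the raw read fraction 30 / 40
print(binomial_std_error(p_hat, 40), binomial_std_error(p_hat, 400))
```

The last line illustrates why coverage matters: the same read proportion observed with ten times as many reads carries a much smaller standard error.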

Distinct from conventional survival analyses in which predictors do not have estimation uncertainty, the predictors in SURVIV are exon-inclusion levels ψk estimated from RNA-seq count data, and the confidence of ψk estimate for a given exon in a particular sample depends on the RNA-seq read coverage. We use the statistical framework of survival measurement error model30 to incorporate the estimation uncertainty of isoform ratio in the proportional hazards model. Using a likelihood ratio test, we test whether the exon-inclusion levels have a significant association with patient survival over the null hypothesis H0:β=0. The false discovery rate (FDR) is estimated using the Benjamini and Hochberg approach31. Details of the parameter estimation and likelihood ratio test in SURVIV are described in Supplementary Methods.
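The two testing steps named above can be sketched as follows. This is not the SURVIV implementation: the likelihood ratio P-value assumes the usual asymptotic chi-square reference distribution with one degree of freedom (one parameter, β, is constrained under H0), and the log-likelihood values would come from a fitted model that is not shown here. The function names are hypothetical.

```python
import math


def lrt_p_value(loglik_null, loglik_alt):
    """P-value of a likelihood ratio test, assuming a chi-square(1) reference."""
    statistic = max(2.0 * (loglik_alt - loglik_null), 0.0)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(statistic / 2.0))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical per-exon P-values
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```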

 

Figure 2: Simulation studies to assess the performance of SURVIV and the importance of modelling the estimation uncertainty of mRNA isoform ratio.

We compared our SURVIV model with Cox regression using point estimates of exon-inclusion levels, which does not consider the estimation uncertainty of the mRNA isoform ratio. (a) To study the effect of RNA-seq depth, we simulated the mean total splice junction read counts equal to 5, 10, 20, 50, 80 and 100 reads. We generated two sets of simulations with and without data-censoring. For each simulation, the true-positive rate (TPR) at 5% false-positive rate is plotted. The inset figure shows the empirical distribution of the mean total splice junction read counts in the TCGA IDC RNA-seq data (x axis in the log10 scale). (b) To faithfully represent the read count distribution in a real data set, we performed another simulation with read counts directly sampled from the TCGA IDC data. Sampled read counts were then multiplied by different factors ranging from 10 to 300% to simulate data sets with different RNA-seq read depth. Continuous and dashed lines represent the performance of SURVIV and Cox regression, respectively. Red lines represent the area under curve (AUC) of the ROC curve (TPR versus false-positive rate plot). Black lines represent the TPR at 5% false-positive rate.

 

Using these simulated data, we compared SURVIV with Cox regression in two settings, without or with censoring of the survival time. In the setting without censoring, the death and survival time of each individual is known. In the setting with censoring, certain individuals are still alive at the end of the survival study. Consequently, these patients have unknown death and survival time. Here, in the simulation with censoring, we assumed that 85% of the patients were still alive at the end of the study, similar to the censoring rate of the TCGA IDC data set. In both settings and with different depths of RNA-seq coverage, SURVIV consistently outperformed Cox regression in the true-positive rate at the same false-positive rate of 5% (Fig. 2a). As expected, we observed a more significant improvement in SURVIV over Cox regression when the RNA-seq read coverage was low (Fig. 2a).
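To show how censored patients still contribute information, here is a minimal Kaplan–Meier estimator. It illustrates censoring in general and is not part of the SURVIV method; the function name and the toy data are hypothetical. A censored patient counts in the at-risk set until the censoring time and then drops out without registering a death.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct observed death time.

    events[i] is True if the death of patient i was observed and False if
    the patient was censored (still alive at last follow-up).
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time
        while i < len(data) and data[i][0] == current_time:
            if data[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((current_time, survival))
        at_risk -= leaving
    return curve


# Toy cohort: follow-up times in arbitrary units, False marks a censored patient
print(kaplan_meier([2, 3, 3, 5, 8], [True, True, False, True, False]))
```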

To more faithfully recapitulate the read count distribution in a real cancer RNA-seq data set, we performed another simulation study with read counts directly sampled from the TCGA IDC data. To assess the influence of RNA-seq read depth on the performance of SURVIV and Cox regression, sampled read counts were then multiplied by different factors ranging from 10 to 300% to simulate data sets with different RNA-seq read depths (Fig. 2b). The TCGA IDC data set has an average RNA-seq depth of ~60 million paired-end reads per patient. Thus, the read depth of these simulated RNA-seq data sets ranged from ~6 million reads to 180 million reads per patient, representing low-coverage RNA-seq studies designed primarily for gene expression analysis32 up to high-coverage RNA-seq studies designed primarily for alternative isoform analysis29. At all levels of RNA-seq depth, SURVIV consistently outperformed Cox regression, as reflected by the area under curve of the receiver operating characteristic (ROC) curve as well as the true-positive rate at 5% false-positive rate (Fig. 2b). The improvement of SURVIV over Cox regression was particularly prominent when the read depth was low. For example, at 10% read depth, SURVIV had 7% improvement in area under curve (68% versus 61%) and 8% improvement in the true-positive rate at 5% false-positive rate (46% versus 38%). Collectively, these simulation results suggest that SURVIV achieves a higher accuracy by accounting for the estimation uncertainty of mRNA isoform ratio in RNA-seq data.
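The two performance measures used in these simulations, the true-positive rate at a fixed 5% false-positive rate and the area under the ROC curve, can be computed from per-exon scores as sketched below. The scores stand for any statistic where larger means stronger evidence (for example −log10 P); the function names and the toy scores are hypothetical.

```python
def tpr_at_fpr(null_scores, alt_scores, fpr=0.05):
    """True-positive rate at a threshold that lets at most `fpr` of the nulls through."""
    ranked = sorted(null_scores, reverse=True)
    allowed = int(fpr * len(ranked) + 1e-9)  # nulls permitted above the threshold
    if allowed >= len(ranked):
        return 1.0
    threshold = ranked[allowed]  # scores strictly above this are called positive
    hits = sum(1 for score in alt_scores if score > threshold)
    return hits / len(alt_scores)


def auc(null_scores, alt_scores):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    wins = 0.0
    for alt in alt_scores:
        for null in null_scores:
            if alt > null:
                wins += 1.0
            elif alt == null:
                wins += 0.5
    return wins / (len(alt_scores) * len(null_scores))


# Toy scores: 100 null exons and 4 exons with a true survival effect
nulls = list(range(100))
alts = [50, 96, 97, 99.5]
print(tpr_at_fpr(nulls, alts), auc(nulls, alts))
```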

SURVIV analysis of TCGA IDC breast cancer data

To illustrate the practical utility of SURVIV, we used it to analyse the overall survival time of 682 IDC patients from the TCGA breast cancer (BRCA) RNA-seq data set (see Methods for details of the data source and processing pipeline). We chose to analyse IDC because it is the most frequent type of breast cancer33, comprising ~70% of patients in the TCGA breast cancer data set. To control for the effects of significant clinical parameters such as tumour stage and subtype and identify alternative splicing events associated with patient outcomes across multiple molecular and clinical subtypes, we followed the procedure of Croce and colleagues in analysing mRNA and microRNA prognostic signature of IDC33 and stratified the patients according to their clinical parameters. We then conducted SURVIV analysis in 26 clinical subgroups with at least 50 patients in each subgroup. We identified 229 exon-skipping events associated with patient survival in multiple clinical subgroups that met the criteria of SURVIV P-value≤0.01 in at least two subgroups of the same clinical parameter (cancer subtype, stage, lymph node, metastasis, tumour size, oestrogen receptor status, progesterone receptor status, HER2 status and age as shown in Fig. 3). DAVID (Database for Annotation, Visualization and Integrated Discovery) Gene Ontology analyses34 of the 229 alternative splicing events suggest an enrichment of genes in cancer-related functional categories such as intracellular signalling, apoptosis, oxidative stress and response to DNA damage (Supplementary Fig. 1). Table 1 shows a few selected examples of survival-associated alternative splicing events in cancer-related genes. Using two-means clustering of each individual exon’s inclusion levels, the 682 IDC patients can be segregated into two subgroups with significantly different survival times as illustrated by the Kaplan–Meier survival plot (Fig. 4). 
We also carried out hierarchical clustering of IDC patients using 176 survival-associated alternative exons (P≤0.01; SURVIV analysis of all IDC patients). Using the exon-inclusion levels of these 176 exons, we clustered IDC patients into three major subgroups, with 95, 194 and 389 patients, respectively. As illustrated by the Kaplan–Meier survival plots, the three subgroups had significantly different survival times (Supplementary Fig. 2).
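Two-means clustering of a single exon's inclusion levels amounts to one-dimensional k-means with k=2. The sketch below is one plain way to do it, with no claim that it matches the authors' implementation; the function name and the example inclusion levels are hypothetical.

```python
def two_means_1d(values, max_iterations=100):
    """Split one-dimensional values into a low and a high group (k-means, k=2).

    Returns (labels, (low_centre, high_centre)); label 0 is the low group.
    """
    low, high = min(values), max(values)
    if low == high:
        raise ValueError("all values are identical; cannot form two groups")
    for _ in range(max_iterations):
        low_group = [v for v in values if abs(v - low) <= abs(v - high)]
        high_group = [v for v in values if abs(v - low) > abs(v - high)]
        new_low = sum(low_group) / len(low_group)
        new_high = sum(high_group) / len(high_group)
        if new_low == low and new_high == high:
            break  # centres stopped moving
        low, high = new_low, new_high
    labels = [0 if abs(v - low) <= abs(v - high) else 1 for v in values]
    return labels, (low, high)


# Hypothetical exon-inclusion levels for six patients
labels, centres = two_means_1d([0.95, 0.97, 0.93, 0.84, 0.86, 0.85])
print(labels, centres)
```

The two label groups can then be compared with a Kaplan–Meier plot, as in Figure 4.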

Figure 3: SURVIV analysis of exon-skipping events in the TCGA IDC RNA-seq data set.

IDC patients are stratified into multiple clinical subgroups based on clinical parameters including cancer subtype, stage, lymph node status, metastasis, tumour size, oestrogen receptor status, progesterone receptor status, HER2 status and age. Only clinical subgroups with at least 50 patients are included in further analyses. Numbers of patients in the subgroups are indicated next to the names of the subgroups. Shown in the heatmap are the log10 SURVIV P-values of the 229 exons associated with patient survival (P≤0.01) in at least two subgroups of the same class of clinical parameters. Turquoise colour indicates positive correlation that higher exon-inclusion levels are associated with higher survival probabilities. Magenta colour indicates negative correlation that lower exon-inclusion levels are associated with higher survival probabilities.

TABLE 1 (not shown)

Figure 4: Kaplan–Meier survival plots of IDC patients stratified by two-means clustering of the exon-inclusion levels of four survival-associated alternative splicing events.

Clustering was generated for each of the four exons separately. Black lines represent patients with high exon-inclusion levels. Red lines represent patients with low exon-inclusion levels. The P-values are from SURVIV analysis of the TCGA IDC RNA-seq data. (a) ATRIP. (b) BCL2L11. (c) CD74. (d) PCBP4.

 

Figure 5: Alternative splicing of STAT5A exon 5 is significantly associated with IDC patient survival.

(a) The gene structure of the STAT5A full-length isoform compared to the ΔEx5 isoform skipping the 5th exon. (b) Kaplan–Meier survival plot of IDC patients stratified by two-means clustering using exon-inclusion levels of STAT5A exon 5. The 420 patients in Group 1 (average exon 5 inclusion level=95%) have significantly higher survival probabilities than the 262 patients in Group 2 (average exon 5 inclusion level=85%) (SURVIV P=6.8e−4). (c) Exon 5 inclusion levels of IDC patients stratified by two-means clustering using exon 5 inclusion levels. Group 1 has 420 patients with average exon-inclusion level at 95%. Group 2 has 262 patients with average exon-inclusion level at 85%. (d) STAT5A exon 5 inclusion levels in normal breast tissues versus breast cancer tumour samples. Exon-inclusion levels are extracted from 86 TCGA breast cancer patients with matched normal and tumour samples. Normal breast tissues have average exon 5 inclusion level at 95%, compared to 91% average exon-inclusion level in tumour samples. Error bars represent 95% confidence interval of the mean.

Network of survival-associated alternative splicing events

…see http://www.nature.com/ncomms/2016/160609/ncomms11548/full/ncomms11548.html

Figure 6: Splicing factor regulatory network of survival-associated alternative splicing events in IDC.

(a–c) Kaplan–Meier survival plots of IDC patients stratified by the gene expression levels of three splicing factors: TRA2B (a, Cox regression P=1.8e−4), HNRNPH1 (b, P=3.4e−4) and SFRS3 (c, P=2.8e−3). Black lines represent patients with high gene expression levels. Red lines represent patients with low gene expression levels. (d) The exon-inclusion levels of a DHX30 alternative exon are negatively correlated with TRA2B gene expression levels (robust correlation coefficient r=−0.26, correlation P=1.2e−17). (e) The exon-inclusion levels of a MAP3K4 alternative exon are positively correlated with HNRNPH1 gene expression levels (robust correlation coefficient r=0.16, correlation P=2.6e−06). (f) A splicing co-expression network of the three splicing factors and their correlated survival-associated alternative exons. In total, 84 survival-associated alternative exons are significantly correlated with the three splicing factors. The positive/negative correlation between splicing factors and alternative exons is represented by blue/red lines, respectively. Exons whose inclusion levels are positively/negatively correlated with survival times are represented by blue/red dots, respectively. The size of the splicing factor circles is proportional to the number of correlated exons within the network.

…..

Alternative splicing predictors of cancer patient survival

see http://www.nature.com/ncomms/2016/160609/ncomms11548/full/ncomms11548.html

Figure 7: Cross-validation of different classes of IDC survival predictors measured by the C-index

A C-index of 1 indicates perfect prediction accuracy and a C-index of 0.5 indicates random guess. The plots indicate the distribution of C-indexes from 100 rounds of cross-validation. The centre value of the box plot is the median C-index from 100 rounds of cross-validation. The notch represents the 95% confidence interval of the median. The box represents the 25 and 75% quantiles. The whiskers extended out from the box represent the 5 and 95% quantiles. Two-sided Wilcoxon test was used to compare different survival predictors. The different classes of predictors are: (a) clinical information (median C-index 0.67). (b) Gene expression (median C-index 0.68). (c) Alternative splicing (median C-index 0.71). (d) Clinical information+gene expression (median C-index 0.69). (e) Clinical information+alternative splicing (median C-index 0.73). (f) Clinical information+gene expression+alternative splicing (median C-index 0.74). Note that ‘Gene’ refers to ‘Gene-level expression’ in these plots.
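For readers unfamiliar with the C-index, the sketch below computes one common definition (Harrell's concordance index) for right-censored data. Whether the authors used exactly this variant is an assumption; the function name and the toy data are hypothetical. A pair of patients is comparable when the one with the shorter follow-up had an observed death, and it is concordant when that patient also had the higher predicted risk.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-style C-index: fraction of comparable pairs ranked correctly.

    events[i] is True for an observed death and False for a censored patient.
    Higher risk_scores are expected to go with shorter survival.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count one half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort of four patients, the third one censored
times = [2, 4, 6, 8]
events = [True, True, False, True]
print(concordance_index(times, events, [0.9, 0.7, 0.5, 0.1]))  # risks ordered correctly
print(concordance_index(times, events, [0.3, 0.3, 0.3, 0.3]))  # uninformative predictor
```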

Next, we carried out the SURVIV analysis in five additional cancer types in TCGA, including GBM (glioblastoma multiforme), KIRC (kidney renal clear cell carcinoma), LGG (lower grade glioma), LUSC (lung squamous cell carcinoma) and OV (ovarian serous cystadenocarcinoma). As expected, the number of significant events at different FDR or P-value significance cutoffs varied across cancer types, with LGG having the strongest survival-associated alternative splicing signals with 660 significant exon-skipping events at FDR≤5% (Supplementary Data 3 and 4). Strikingly, regardless of the number of significant events, alternative splicing-based survival predictors outperformed gene expression-based survival predictors across all cancer types (Supplementary Fig. 3), consistent with our initial observation on the IDC data set.

 

Alternative processing and modification of mRNA, such as alternative splicing, allow cells to generate a large number of mRNA and protein isoforms with diverse regulatory and functional properties. The plasticity of alternative splicing is often exploited by cancer cells to produce isoform switches that promote cancer cell survival, proliferation and metastasis7, 8. The widespread use of RNA-seq in cancer transcriptome studies15, 47, 48 has provided the opportunity to comprehensively elucidate the landscape of alternative splicing in cancer tissues. While existing studies of alternative splicing in large-scale cancer transcriptome data largely focused on the comparison of splicing patterns between cancer and normal tissues or between different subtypes of cancer18, 21, 49, additional computational tools are needed to characterize the clinical relevance of alternative splicing using massive RNA-seq data sets, including the association of alternative splicing with phenotypes and patient outcomes.

We have developed SURVIV, a novel statistical model for survival analysis of alternative isoform variation using cancer RNA-seq data. SURVIV uses a survival measurement error model to simultaneously model the estimation uncertainty of mRNA isoform ratio in individual patients and the association of mRNA isoform ratio with survival time across patients. Compared with the conventional Cox regression model that uses each patient’s mRNA isoform ratio as a point estimate, SURVIV achieves a higher accuracy as indicated by simulation studies under a variety of settings. Of note, we observed a particularly marked improvement of SURVIV over Cox regression for low- and moderate-depth RNA-seq data (Fig. 2b). This has important practical value because many clinical RNA-seq data sets have large sample size but relatively modest sequencing depth.

Using the TCGA IDC breast cancer RNA-seq data of 682 patients, SURVIV identified 229 alternative splicing events associated with patient survival time, which met the criteria of SURVIV P-values ≤ 0.01 in multiple clinical subgroups. While the statistical threshold seemed loose, several lines of evidence suggest the functional and clinical relevance of these survival-associated alternative splicing events. These alternative splicing events were frequently identified and enriched in the gene functional groups important for cancer development and progression, including apoptosis, DNA damage response and oxidative stress. While some of these events may simply reflect correlation but not causal effect on cancer patient survival, other events may play an active role in regulating cancer cell phenotypes. For example, a survival-associated alternative splicing event involving exon 5 of STAT5A is known to regulate the activity of this transcription factor with important roles in epithelial cell growth and apoptosis37. Using a co-expression network analysis of splicing factor to exon correlation across all patients, we identified three splicing factors (TRA2B, HNRNPH1 and SFRS3) as potential hubs of the survival-associated alternative splicing network of IDC. The expression levels of all three splicing factors were negatively associated with patient survival times (Fig. 6a–c), and both TRA2B and HNRNPH1 were previously reported to have an impact on cancer-related molecular pathways40, 41, 42, 43, 44, 45. Finally, despite the limited power in detecting individual events, we show that the survival-associated alternative splicing events can be used to construct a predictor for patient survival, with an accuracy higher than predictors based on clinical parameters or gene expression profiles (Fig. 7). This further demonstrates the potential biological relevance and clinical utility of the identified alternative splicing events.

We performed cross-validation analyses to evaluate and compare the prognostic value of alternative splicing, gene expression and clinical information for predicting patient survival, either independently or in combination. As expected, the combined use of all three types of information led to the best prediction accuracy. Because we used penalized regression to build the prediction model, combining information from multiple layers of data did not necessarily increase the number of predictors in the model. The perhaps more surprising and intriguing result is that alternative splicing-based predictors appear to outperform gene expression-based predictors when used alone and when either type of data was combined with clinical information (Fig. 7). We observed the same trend in five additional cancer types (Supplementary Fig. 3). We note that this finding was consistent with a previous report that cancer subtype classification based on splicing isoform expression performed better than gene expression-based classification25. While this trend seems counterintuitive because accurate estimation of gene expression requires much lower RNA-seq depth than accurate estimation of alternative splicing29, one possible explanation may be the inherent characteristic of isoform ratio data. By definition, mRNA isoform ratio is estimated as the ratio of multiple mRNA isoforms from a single gene. Therefore, mRNA isoform ratio data have a ‘built-in’ internal control that could be more robust against certain artefacts and confounding issues that influence gene expression estimates across large clinical RNA-seq data sets, such as poor sample quality and RNA degradation12. Regardless of the reasons, our data call for further studies to fully explore the utility of mRNA isoform ratio data for various clinical research applications.
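The 'built-in' internal control argument can be made concrete with a toy calculation. The sketch below assumes a hypothetical artefact (for example, partial RNA degradation) that removes the same fraction of reads from both isoforms of a gene; the function names and numbers are illustrative and do not come from the study.

```python
def gene_level_count(inclusion_reads, skipping_reads):
    """Total reads for the gene across both isoforms."""
    return inclusion_reads + skipping_reads


def isoform_ratio(inclusion_reads, skipping_reads):
    """Fraction of reads that support the inclusion isoform."""
    return inclusion_reads / (inclusion_reads + skipping_reads)


def degrade(reads, recovery):
    """Hypothetical artefact: only a fraction `recovery` of reads survives."""
    return reads * recovery


# Same biology, two sample qualities (made-up counts).
good_sample = (80.0, 20.0)
poor_sample = (degrade(80.0, 0.5), degrade(20.0, 0.5))

print(gene_level_count(*good_sample), gene_level_count(*poor_sample))
print(isoform_ratio(*good_sample), isoform_ratio(*poor_sample))
```

Under this assumption the gene-level count halves in the poor-quality sample while the isoform ratio stays at 0.8. An artefact that affects the two isoforms unequally would not cancel in this way, so the robustness holds only for certain artefacts, as the text states.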

The SURVIV source code is available for download at https://github.com/Xinglab/SURVIV. SURVIV is a general statistical model for survival analysis of mRNA isoform ratio using RNA-seq data. The current statistical framework of SURVIV is applicable to RNA-seq based count data for all basic types of alternative splicing patterns involving two isoform choices from an alternatively spliced region, such as exon-skipping, alternative 5′ splice sites, alternative 3′ splice sites, mutually exclusive exons and retained introns, as well as other forms of alternative isoform variation such as RNA editing. With the rapid accumulation of clinical RNA-seq data sets, SURVIV will be a useful tool for elucidating the clinical relevance and potential functional significance of alternative isoform variation in cancer and other diseases.

 
