Healthcare analytics, AI solutions for biological big data, providing an AI platform for the biotech, life sciences, medical and pharmaceutical industries, as well as for related technological approaches, i.e., curation and text analysis with machine learning and other activities related to AI applications to these industries.
Ido Sagi – PhD Student @HUJI, 2017 Kaye Innovation Award winner for leading research that yielded the first successful isolation and maintenance of haploid embryonic stem cells in humans.
Reporter: Aviva Lev-Ari, PhD, RN
Ido Sagi – PhD Student, Silberman Institute of Life Sciences, HUJI, Israel
Ido Sagi’s research focuses on studying genetic and epigenetic phenomena in human pluripotent stem cells, and his work has been published in leading scientific journals, including Nature, Nature Genetics and Cell Stem Cell.
Ido Sagi received a BSc summa cum laude in Life Sciences from the Hebrew University and is currently pursuing a PhD in the laboratory of Prof. Nissim Benvenisty at the university’s Department of Genetics in the Alexander Silberman Institute of Life Sciences.
2017 Kaye Innovation Award winner for leading research that yielded the first successful isolation and maintenance of haploid embryonic stem cells in humans.
The Kaye Innovation Awards at the Hebrew University of Jerusalem have been awarded annually since 1994. Isaac Kaye of England, a prominent industrialist in the pharmaceutical industry, established the awards to encourage faculty, staff and students of the Hebrew University to develop innovative methods and inventions with good commercial potential, which will benefit the university and society.
Ido Sagi, a PhD student at the Hebrew University of Jerusalem’s Azrieli Center for Stem Cells and Genetic Research, led research that yielded the first successful isolation and maintenance of haploid embryonic stem cells in humans.
Together with Prof. Nissim Benvenisty, Director of the Azrieli Center, Sagi showed that this new human stem cell type will play an important role in human genetic and medical research. It will aid our understanding of human development – for example, why we reproduce sexually instead of from a single parent. It will make genetic screening easier and more precise, by allowing the examination of single sets of chromosomes. And it is already enabling the study of resistance to chemotherapy drugs, with implications for cancer therapy.
Read more at https://www.breakingisraelnews.com/90561/hebrew-u-isolates-haploid-human-stem-cells-changing-future-of-medicine/#impRGtg0syOSFGtZ.99
He is a fellow of the Adams Fellowship of the Israel Academy of Sciences and Humanities, and has recently received the Rappaport Prize for Excellence in Biomedical Research.
The Azrieli Center for Stem Cells and Genetic Research, Department of Genetics, Silberman Institute of Life Sciences, The Hebrew University, Jerusalem 91904, Israel.
Nat Protoc 2016 Nov 20;11(11):2274-2286. Epub 2016 Oct 20.
Nature 2016 Apr 16;532(7597):107-11. Epub 2016 Mar 16.
Nat Rev Mol Cell Biol 2016 Mar 28;17(3):170-82. Epub 2016 Jan 28.
The Azrieli Center for Stem Cells and Genetic Research, Institute of Life Sciences, Hebrew University of Jerusalem, Givat-Ram, Jerusalem 91904, Israel.
Cell Stem Cell 2014 Nov 6;15(5):634-42. Epub 2014 Nov 6.
The New York Stem Cell Foundation Research Institute, New York, NY 10032, USA; Naomi Berrie Diabetes Center & Department of Pediatrics, College of Physicians and Surgeons, Columbia University, New York, NY 10032, USA.
Nature 2016 Dec 30;540(7632):211-212. Epub 2016 Nov 30.
Other related articles on Genetic and Epigenetic phenomena in human pluripotent stem cells published by LPBI Group can be found in the following e-Books on Amazon.com
During pregnancy, the baby is mostly protected from harmful microorganisms by the amniotic sac, but recent research suggests the baby could be exposed to small quantities of microbes from the placenta, amniotic fluid, umbilical cord blood and fetal membranes. One theory is that any possible prenatal exposure could ‘pre-seed’ the infant microbiome. In other words, it could set the right conditions for the ‘main seeding event’ that founds the infant microbiome.
When a mother gives birth vaginally and breastfeeds, she passes on colonies of essential microbes to her baby. This continues a chain of maternal heritage that stretches through female ancestry for thousands of generations, provided all were vaginally born and breastfed. This means a child’s microbiome, that is, the trillions of microorganisms that live on and in him or her, will resemble the microbiome of the mother, the grandmother, the great-grandmother and so on.
As soon as the mother’s waters break, the baby is exposed to a wave of the mother’s vaginal microbes that wash over the baby in the birth canal. They coat the baby’s skin and enter the baby’s eyes, ears and nose, and some are swallowed and sent down into the gut. More microbes, from the mother’s gut, join the colonization through contact with the mother’s faecal matter. Many more microbes come from every breath and from every touch, including skin-to-skin contact with the mother and, of course, from breastfeeding.
With formula feeding, the baby won’t receive the 700 species of microbes found in breast milk. Breast milk contains special sugars called human milk oligosaccharides (HMOs) that are indigestible by the baby. These sugars are designed to feed the mother’s microbes newly arrived in the baby’s gut. By multiplying quickly, the ‘good’ bacteria crowd out any potentially harmful pathogens. These ‘good’ bacteria help train the baby’s naive immune system, teaching it to identify what is to be tolerated and what is a pathogen to be attacked. This leads to the optimal training of the infant immune system, resulting in a child’s best possible lifelong health.
With C-section birth and formula feeding, the baby is not likely to acquire the full complement of the mother’s vaginal, gut and breast milk microbes. Therefore, the baby’s microbiome is not likely to closely resemble the mother’s microbiome. A baby born by C-section is likely to have a different microbiome from its mother, its grandmother, its great-grandmother and so on. C-section breaks the chain of maternal heritage, and once broken the chain can never be restored.
The long-term effect of an altered microbiome on a child’s lifelong health is still to be proven, but many studies link C-section with a significantly increased risk of developing asthma, Type 1 diabetes, celiac disease and obesity. Scientists might not yet have all the answers, but the picture that is forming is that C-section and formula feeding could be significantly impacting the health of the next generation. Through the transgenerational aspect of birth, it could even be impacting the health of future generations.
Researchers have classified a brand-new organ inside the human body. Known as the mesentery, the new organ is found in our digestive systems, and was long thought to be made up of fragmented, separate structures. But recent research has shown that it’s actually one continuous organ. The evidence for the organ’s reclassification is now published in The Lancet Gastroenterology & Hepatology. Although we now know about the structure of this new organ, its function is still poorly understood, and studying it could be the key to better understanding and treatment of abdominal and digestive disease.
J Calvin Coffey, a researcher at University Hospital Limerick in Ireland, first discovered that the mesentery is an organ. In 2012, Coffey and his colleagues showed through detailed microscopic examinations that the mesentery is actually a continuous structure. Over the past four years, they’ve gathered further evidence that the mesentery should be classified as its own distinct organ, and the latest paper makes it official. The mesentery is a double fold of peritoneum – the lining of the abdominal cavity – that holds our intestine to the wall of our abdomen. It was described by the Italian polymath Leonardo da Vinci in 1508, but it has been ignored throughout the centuries, until now. Although only five organs are commonly thought of as vital, the human body is now counted as having 79 organs, including the mesentery. The heart, brain, liver, lungs and kidneys are the vital organs, but there are another 74 that play a role in keeping us healthy. Distinctive anatomical and functional features of the mesentery have been revealed that justify its designation as an organ. Accordingly, the mesentery should be subjected to the same investigatory focus that is applied to other organs and systems. This provides a platform from which to direct future scientific investigation of the human mesentery in health and disease.
MicroRNAs (miRNAs) are a group of small non-coding RNA molecules that play a major role in posttranscriptional regulation of gene expression and are expressed in an organ-specific manner. One miRNA can potentially regulate the expression of several genes, depending on cell type and differentiation stage. They control every cellular process and their altered regulation is involved in human diseases. miRNAs are differentially expressed in the male and female gonads and have an organ-specific reproductive function. Exerting their effect through germ cells and gonadal somatic cells, miRNAs regulate key proteins necessary for gonad development. The role of miRNAs in the testes is only starting to emerge, though they have been shown to be required for adequate spermatogenesis. In the ovary, miRNAs play a fundamental role in follicles’ assembly, growth, differentiation, and ovulation.
Deciphering the underlying causes of idiopathic male infertility is one of the main challenges in reproductive medicine. This is especially relevant in infertile patients displaying normal seminal parameters and no urogenital or genetic abnormalities. In these cases, the search for additional sperm biomarkers is of high interest. This study aimed to determine the implications of sperm miRNA expression profiles for the reproductive capacity of normozoospermic infertile individuals. The expression levels of 736 miRNAs were evaluated in spermatozoa from normozoospermic infertile males and normozoospermic fertile males analyzed under the same conditions. 57 miRNAs were differentially expressed between populations; 20 of them were regulated by a host gene promoter that in three cases comprised genes involved in fertility. The predicted targets of the differentially expressed miRNAs unveiled a significant enrichment of biological processes related to embryonic morphogenesis and chromatin modification. Normozoospermic infertile individuals exhibit a specific sperm miRNA expression profile clearly differentiated from that of normozoospermic fertile individuals. This miRNA cargo has potential implications for the individuals’ reproductive competence.
Circulating or “extracellular” miRNAs detected in biological fluids could be used as potential diagnostic and prognostic biomarkers of several diseases, such as cancer and gynecological and pregnancy disorders. However, their contributions in female infertility and in vitro fertilization (IVF) remain unknown. Polycystic ovary syndrome (PCOS) is a frequent endocrine disorder in women. PCOS is associated with altered features of androgen metabolism, increased insulin resistance and impaired fertility. Furthermore, PCOS, being a syndrome diagnosis, is heterogeneous and characterized by polycystic ovaries, chronic anovulation and evidence of hyperandrogenism, as well as being associated with chronic low-grade inflammation and an increased lifetime risk of type 2 diabetes. Altered miRNA levels have been associated with diabetes, insulin resistance, inflammation and various cancers. Studies have shown that circulating miRNAs are present in whole blood, serum, plasma and the follicular fluid of PCOS patients and that these might serve as potential biomarkers and a new approach for the diagnosis of PCOS. miRNAs in mammalian follicular fluid have been shown to be enclosed within microvesicles and exosomes, or to be associated with protein complexes. The presence of microvesicles and exosomes carrying microRNAs in follicular fluid could represent an alternative mechanism of autocrine and paracrine communication inside the ovarian follicle. An investigation of the expression profiles of five circulating miRNAs (let-7b, miR-29a, miR-30a, miR-140 and miR-320a) in human follicular fluid from women with normal ovarian reserve and from women with PCOS, and of their ability to predict IVF outcomes, showed that these miRNAs could provide helpful new biomarkers to facilitate personalized medical care for oocyte quality in ART (Assisted Reproductive Treatment) and during IVF.
Mitochondria are present in almost all human cells, and vary in number from a few tens to many thousands. They generate the majority of a cell’s energy supply which powers every part of our body. Mitochondria have their own separate DNA, which carries just a few genes. All of these genes are involved in energy production but determine no other characteristics. And so, any faults in these genes lead only to problems in energy production. Around 1 in 6500 children is thought to be born with a serious mitochondrial disorder due to faults in mitochondrial DNA.
Unlike nuclear genes, mitochondrial DNA is inherited only from our mothers. Mothers can carry abnormal mitochondria and be at risk of passing on serious disease to their children, even if they themselves show only mild or no symptoms. It is for such women, who by chance have a high proportion of faulty mitochondrial DNA in their eggs, that the methods of mitochondrial replacement or “donation” have been developed. This technique is also referred to as the three-parent technique, and it involves a couple and a donor.
Mitochondrial Donation
The most developed techniques, maternal spindle transfer (MST) and pro-nuclear transfer (PNT), are based on an IVF cycle but have additional steps. Other techniques are being developed.
In both MST and PNT, nuclear DNA is moved from a patient’s egg or embryo containing unhealthy mitochondria to a donor’s egg or embryo containing healthy mitochondria, from which the donor’s nuclear DNA has been removed.
Maternal spindle transfer. Source: Bredenoord, A and P. Braude (2010) “Ethics of mitochondrial gene replacement: from bench to bedside” BMJ 341.
Pronuclear transfer. Source: Bredenoord, A and P. Braude (2010) “Ethics of mitochondrial gene replacement: from bench to bedside” BMJ 341.
Research Carried Out and Safety Issues
There have been many experiments conducted using MST and PNT in animals. PNT has been carried out since the mid-1980s in mice. MST has been carried out in a wide range of animals. More recently mice, monkeys and human embryos have been created with the specific aim of developing MST and PNT for avoiding mitochondrial disease.
There is no evidence to show that mitochondrial donation is unsafe.
Research is progressing well and the recommended further experiments are expected to confirm this view.
The main area of research needed is to observe cells derived from embryos created by MST and PNT, to see how mitochondria behave.
Concerns about Mitochondrial Donation
The scientific evidence raises some potential concerns about mitochondrial donation. Just as we all have different blood groups, we also have different types of mitochondria, called haplotypes. Some scientists have suggested that if the patient and the mitochondria donor have different mitochondrial haplotypes, there is a theoretical risk that the donor’s mitochondria won’t be able to ‘talk’ properly to the patient’s nuclear DNA, which could cause problems in the embryo and resulting child. Matching mitochondrial haplotypes when selecting donors may therefore help avoid such problems.
Another potential concern is that a small amount of unhealthy mitochondrial DNA may be transferred into the donor’s egg along with the mother’s nuclear DNA. Studies carried out on MST and PNT show that some so-called mitochondrial ‘carry-over’ occurs. However, the carry-over is lower than 2% of the mitochondria in the resulting embryo, an amount which is very unlikely to be problematic for the children born.
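The carry-over figure above is a simple ratio, and a minimal sketch can make it concrete. The function name, the copy numbers and the use of 2% as a comparison level are illustrative assumptions, not values or methods taken from the studies mentioned.

```python
def carryover_fraction(patient_mtdna_copies, donor_mtdna_copies):
    """Fraction of mtDNA in the reconstructed egg that came from the patient."""
    total = patient_mtdna_copies + donor_mtdna_copies
    if total <= 0:
        raise ValueError("total mtDNA copy number must be positive")
    return patient_mtdna_copies / total


# Hypothetical copy numbers, chosen only for illustration.
fraction = carryover_fraction(patient_mtdna_copies=2_000, donor_mtdna_copies=198_000)
print(f"carry-over: {fraction:.1%}")
print("below the 2% level:", fraction < 0.02)
```

With these made-up numbers the patient contributes 2,000 of 200,000 copies, so the carry-over sits below the 2% level quoted in the text.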
LIVE 9/21 8AM to 10:55 AM Exploring the Versatility of CRISPR/Cas9 at CHI’s 14th Discovery On Target, 9/19 – 9/22/2016, Westin Boston Waterfront, Boston
8:10 Functional Genomics Using CRISPR-Cas9: Technology and Applications
Neville Sanjana, Ph.D., Core Faculty Member, New York Genome Center and Assistant Professor, Department of Biology & Center for Genomics and Systems Biology, New York University
CRISPR-Cas9 is easier to target to multiple genomic loci because RNA specifies the DNA target; with zinc finger nucleases or TALENs, the protein specifies the DNA target
This feature of CRISPR allows you to make a quick, big and cheap array: a Genome-scale CRISPR Knock-Out (GeCKO) screening library
How do you scale up the sgRNA design for the whole genome? For all genes in RefSeq, identify constitutive exons using RNA-sequencing data from 16 primary human tissues (a lot of genes end with ‘gg’); changing the bases on the 3’ side negates the CRISPR system, but with changes on the 5’ side CRISPR works fine
Rank sequences to be specific for the target
Cloned the array into a lentiviral vector and added selectable markers
GeCKO displays high consistency between reagents for the same gene versus siRNA; GeCKO has high screening sensitivity
98% of genome is noncoding so what about making a library for intronic regions (miRNA, promoter regions?)
So you design the sgRNA library by taking 100kb of gene-adjacent regions
They looked at CUL3; (data will soon be published in Science)
Do a ChIP for the transcription factor of interest to verify its lack of binding
Can also target histone marks on promoter and enhancer elements
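The note about sites ending in ‘gg’ most likely refers to the NGG PAM that the commonly used SpCas9 enzyme requires immediately 3’ of the target. A minimal sketch of the first step of guide design, listing candidate 20-nt protospacers followed by NGG, might look like the following. The function name and the example sequence are hypothetical; the sketch scans only the forward strand and does not do the specificity ranking described in the talk.

```python
import re


def find_sgrna_candidates(seq, guide_len=20):
    """Return (start, protospacer, pam) for forward-strand sites followed by NGG."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Made-up sequence: a 20-nt stretch, then a TGG PAM, then filler bases.
example = "ACGT" * 5 + "TGG" + "AAA"
for start, protospacer, pam in find_sgrna_candidates(example):
    print(start, protospacer, pam)
```

A real library design would also scan the reverse strand and score each candidate against the rest of the genome before ranking.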
TJ Cradick , Ph.D., Head of Genome Editing, CRISPR Therapeutics
NHEJ is a quick-and-dirty repair of a single break by nonhomologous end joining, but when there are two breaks NHEJ repair can introduce inversions or deletions
High-throughput screens are fine but can limit your view of genomic context; genome searches pick unique sites, so use bioinformatic programs to design specific guide RNAs
Compared COSMID and CCTop: 320 off-target sites from COSMID, 333 from CCTop
Joung lab GUIDE-seq genome-wide assay is useful for designing guides
Shortening the guide may improve specificity; lengthening the guide sometimes gives better sensitivity
Manufacturing of an autologous gene-corrected product by ex vivo gene correction (Vertex and Bayer are partners in this)
They need to use clones from multiple microarrays before using GUIDE-seq, but GUIDE-seq is better for REMOVING the off-targets than for actually producing the sgRNA library you want (it seems the methods for library development are not yet advanced enough to do this)
The scores from sgRNA design programs do not always give the best result because some sgRNAs are genome-context dependent
9:10 Towards Combinatorial Drug Discovery: Mining Heterogeneous Phenotypes from Large Scale RNAi/Drug Perturbations
Arvind Rao, Ph.D., Assistant Professor, Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center
Bioinformatics in CRISPR screens: they looked at image analysis of light microscopy of breast cancer cells and looked for phenotypic changes
They first modeled this in a small pilot and then used the algorithm on 20,000 images (making morphometric measurements)
Statistical learning algorithms can be trained to build a decision tree for classifying data points
Although their algorithms worked well, there was also human input from scientists
Programs for aggregate ranking of hits are available on the web, such as LINKS
@MDAndersonNews
10:25 CRISPR in Stem Cell Models of Eye Disease
Alexander Bassuk, M.D., Ph.D., Associate Professor of Pediatrics, Department of Molecular and Cellular Biology, University of Iowa
Michael Stone, a blind biathlete who has had eye disease since he was a teenager, helped fund and start the clinical trial for Stargardt disease; he had one bad copy of ABCA4, heterozygous (heritable in the Ashkenazi Jewish population) – a recessive heritable mutation causing juvenile macular degeneration
Also had another male in family with disease but he had another mutation in the RPGR gene
December 2015 paper Precision Medicine: Genetic Repair of retinitis pigmentosa in patient derived stem cells
They were able to correct the RPGR gene in patient-derived iPSCs; however, remaining issues are the low efficiency of repair, achieving scarless repair (repair leaves changes in the DNA), the need for clinical-grade iPSCs, and the need for a humanized model of RPGR
@uiowa
10:55 CRISPR in Mouse Models of Eye Disease
Vinit Mahajan, M.D., Ph.D., Assistant Professor of Ophthalmology and Visual Sciences, University of Iowa College of Medicine
with degeneration of the retina you will see brown spots; the macula will often be preserved but retinal cells are damaged; with RPGR there are problems with peripheral vision, and with retinitis pigmentosa patients get tunnel vision with no peripheral vision (a mouse model of PDE6 knockout recapitulates this phenotype)
the PDE6 is linked to the rhodopsin GTP pathway
rd1 -/- mouse has something that looks like retinitis pigmentosa; it has mutant PDE6, which is actually a nonsense mutation in rd1, so they tried CRISPR to fix it in mice
with the CRISPR fix of the rd1 nonsense mutation the optic nerve looked comparable to normal and the retina structure was restored
photoreceptor layers: some recovery but not complete
sequence results show the DNA is a mosaic, so correction is not 100% but only 35%, yet this still leads to a phenotypic recovery; NHEJ was about 12% to 25% with large deletions
histology is restored in CRISPR-repaired mice
CRISPR off-target effects: did WGS and analyzed for variants (SNVs/indels), and also looked at on-target and off-target regions; there were no off-target SNVs or indels, and even among variants that did not pass quality control screening there was not a single SNV
Rhodopsin mutation accounts for a large % of patients (RhoD190N)
injection of gene therapy vectors: AAV vector carrying CRISPR and Cas9 repair templates
CAPN mouse models
a family in Iowa has a dominant mutation in CAPN5; the retina degenerates
used CRISPR to generate a mouse model with a mutation in CAPN5 similar to the family mutation
compared to other transgenic methods, CRISPR is faster for producing a mouse model
Chapter 1. Evolution of the Foundation for Diagnostics and Pharmaceuticals Industries
1.1 Outline of Medical Discoveries between 1880 and 1980
1.2 The History of Infectious Diseases and Epidemiology in the late 19th and 20th Century
1.3 The Classification of Microbiota
1.4 Selected Contributions to Chemistry from 1880 to 1980
1.5 The Evolution of Clinical Chemistry in the 20th Century
1.6 Milestones in the Evolution of Diagnostics in the US HealthCare System: 1920s to Pre-Genomics
Chapter 2. The search for the evolution of function of proteins, enzymes and metal catalysts in life processes
2.1 The life and work of Allan Wilson
2.2 The evolution of myoglobin and hemoglobin
2.3 More complexity in proteins evolution
2.4 Life on earth is traced to oxygen binding
2.5 The colors of life function
2.6 The colors of respiration and electron transport
2.7 Highlights of a green evolution
Chapter 3. Evolution of New Relationships in Neuroendocrine States
3.1 Pituitary endocrine axis
3.2 Thyroid function
3.3 Sex hormones
3.4 Adrenal Cortex
3.5 Pancreatic Islets
3.6 Parathyroids
3.7 Gastrointestinal hormones
3.8 Endocrine action on midbrain
3.9 Neural activity regulating endocrine response
3.10 Genomic Promise for Neurodegenerative Diseases, Dementias, Autism Spectrum, Schizophrenia, and Serious Depression
Chapter 4. Problems of the Circulation, Altitude, and Immunity
4.1 Innervation of Heart and Heart Rate
4.2 Action of hormones on the circulation
4.3 Allogeneic Transfusion Reactions
4.4 Graft-versus-Host reaction
4.5 Unique problems of perinatal period
4.6 High altitude sickness
4.7 Deep water adaptation
4.8 Heart, Lung and Kidney
4.9 Acute Lung Injury
4.10 Reconstruction of Life Processes requires both Genomics and Metabolomics to explain Phenotypes and Phylogenetics
Chapter 5. Problems of Diets and Lifestyle Changes
5.1 Anorexia nervosa
5.2 Voluntary and Involuntary S-insufficiency
5.3 Diarrheas – bacterial and nonbacterial
5.4 Gluten-free diets
5.5 Diet and cholesterol
5.6 Diet and Type 2 diabetes mellitus
5.7 Diet and exercise
5.8 Anxiety and quality of Life
5.9 Nutritional Supplements
Chapter 6. Advances in Genomics, Therapeutics and Pharmacogenomics
6.1 Natural Products Chemistry
6.2 The Challenge of Antimicrobial Resistance
6.3 Viruses, Vaccines and immunotherapy
6.4 Genomics and Metabolomics Advances in Cancer
6.5 Proteomics – Protein Interaction
6.6 Pharmacogenomics
6.7 Biomarker Guided Therapy
6.8 The Emergence of a Pharmaceutical Industry in the 20th Century: Diagnostics Industry and Drug Development in the Genomics Era: Mid 80s to Present
6.9 The Union of Biomarkers and Drug Development
6.10 Proteomics and Biomarker Discovery
6.11 Epigenomics and Companion Diagnostics
Chapter 7. Integration of Physiology, Genomics and Pharmacotherapy
7.1 Richard Lifton, MD, PhD of Yale University and Howard Hughes Medical Institute: Recipient of 2014 Breakthrough Prizes Awarded in Life Sciences for the Discovery of Genes and Biochemical Mechanisms that cause Hypertension
7.2 Calcium Cycling (ATPase Pump) in Cardiac Gene Therapy: Inhalable Gene Therapy for Pulmonary Arterial Hypertension and Percutaneous Intra-coronary Artery Infusion for Heart Failure: Contributions by Roger J. Hajjar, MD
7.3 Diagnostics and Biomarkers: Novel Genomics Industry Trends vs Present Market Conditions and Historical Scientific Leaders Memoirs
7.4 Synthetic Biology: On Advanced Genome Interpretation for Gene Variants and Pathways: What is the Genetic Base of Atherosclerosis and Loss of Arterial Elasticity with Aging
Background
Chest pain patients are often evaluated for acute myocardial infarction through troponin testing, which may prompt downstream services (cascades) of uncertain value.
Objective
Determine the association of high-sensitivity cardiac troponin (hs-cTn) assay implementation with cascade events.
Methods
Using electronic health record and billing data, we examined patient-visits to five emergency departments, April 1, 2017 – April 1, 2019. Difference-in-differences analysis compared patient-visits for chest pain (n=7,564) to patient-visits for other symptoms (n=100,415) (irrespective of troponin testing) before and after hs-cTn assay implementation. Outcomes included presence of any cascade event potentially associated with an initial hs-cTn test (primary), individual cascade events, length of stay, and spending on cardiac services.
Results
Following hs-cTn implementation, patients with chest pain had a 2.8% (95%CI 0.72, 4.9) net increase in experiencing any cascade event. They were more likely to have multiple troponin tests (10.5%, 95%CI 9.0, 12.0) and electrocardiograms (7.1 per 100 patient-visits, 95%CI 1.8, 12.4). However, they received net fewer computed tomography scans (-1.5 per 100 patient-visits, 95%CI -1.8, -1.1), stress tests (-5.9 per 100 patient-visits, 95%CI -6.5, -5.3), and cardiac catheterizations (-0.65 per 100 patient-visits, 95%CI -1.01, -0.30) and were less likely to receive cardiac medications, undergo cardiology evaluation (-3.5%, 95%CI -4.5, -2.6), or be hospitalized (-5.8%, 95%CI -7.7, -3.8). Chest pain patients had lower net mean length of stay (-0.24 days, 95%CI -0.32, -0.16) but no net change in spending.
Conclusions
Hs-cTn assay implementation was associated with more net upfront tests yet fewer net stress tests, catheterizations, cardiology evaluations, and hospital admissions in chest pain patients relative to patients with other symptoms.
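The “net” changes reported in this abstract come from a difference-in-differences comparison: the change in the chest pain group minus the change in the comparison group over the same period. A minimal sketch of that arithmetic is below. The function name and all four rates are hypothetical and are not data from the study, which would also involve regression adjustment and confidence intervals that this sketch omits.

```python
def difference_in_differences(treated_pre, treated_post, control_pre, control_post):
    """Change in the group of interest beyond the change in the comparison group."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hypothetical event rates, in percent of patient-visits; not data from the study.
chest_pain_pre, chest_pain_post = 40.0, 46.0  # group of interest
other_pre, other_post = 20.0, 23.0            # comparison group

net_change = difference_in_differences(
    chest_pain_pre, chest_pain_post, other_pre, other_post
)
print(f"net change: {net_change:+.1f} percentage points")
```

Subtracting the comparison group's change is what removes trends that affected all emergency department visits during the period, leaving the change associated with the new assay.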
Background: We assessed whether plasma troponin I measured by a high-sensitivity assay (hs-TnI) is associated with incident cardiovascular disease (CVD) and mortality in a community-based sample without prior CVD.
Methods: ARIC study (Atherosclerosis Risk in Communities) participants aged 54 to 74 years without baseline CVD were included in this study (n=8121). Cox proportional hazards models were constructed to determine associations between hs-TnI and incident coronary heart disease (CHD; myocardial infarction and fatal CHD), ischemic stroke, atherosclerotic CVD (CHD and stroke), heart failure hospitalization, global CVD (atherosclerotic CVD and heart failure), and all-cause mortality. The comparative association of hs-TnI and high-sensitivity troponin T with incident CVD events was also evaluated. Risk prediction models were constructed to assess prediction improvement when hs-TnI was added to traditional risk factors used in the Pooled Cohort Equation.
Results: The median follow-up period was ≈15 years. Detectable hs-TnI levels were observed in 85% of the study population. In adjusted models, in comparison to low hs-TnI (lowest quintile, hs-TnI ≤1.3 ng/L), elevated hs-TnI (highest quintile, hs-TnI ≥3.8 ng/L) was associated with greater incident CHD (hazard ratio [HR], 2.20; 95% CI, 1.64-2.95), ischemic stroke (HR, 2.99; 95% CI, 2.01-4.46), atherosclerotic CVD (HR, 2.36; 95% CI, 1.86-3.00), heart failure hospitalization (HR, 4.20; 95% CI, 3.28-5.37), global CVD (HR, 3.01; 95% CI, 2.50-3.63), and all-cause mortality (HR, 1.83; 95% CI, 1.56-2.14). hs-TnI was observed to have a stronger association with incident global CVD events in white than in black individuals and a stronger association with incident CHD in women than in men. hs-TnI and high-sensitivity troponin T were only modestly correlated (r=0.47) and were complementary in prediction of incident CVD events, with elevation of both troponins conferring the highest risk in comparison with elevation in either one alone. The addition of hs-TnI to the Pooled Cohort Equation model improved risk prediction for atherosclerotic CVD, heart failure, and global CVD.
Conclusions: Elevated hs-TnI is strongly associated with increased global CVD incidence in the general population independent of traditional risk factors. hs-TnI and high-sensitivity troponin T provide complementary rather than redundant information.
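Hazard ratios from Cox models are usually reported with confidence intervals that are symmetric on the log scale, so the point estimate should sit near the geometric mean of the interval bounds. The sketch below, which assumes such a Wald-type interval (the abstract does not state how the intervals were computed), recovers the implied hazard ratio and the standard error of log(HR) from the CHD interval quoted above. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def log_scale_summary(lower, upper, level=0.95):
    """Implied hazard ratio and SE of log(HR), assuming a log-symmetric interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    se_log_hr = (math.log(upper) - math.log(lower)) / (2 * z)
    implied_hr = math.sqrt(lower * upper)      # geometric mean of the bounds
    return implied_hr, se_log_hr


# Interval reported for incident CHD: HR 2.20 (95% CI, 1.64-2.95).
implied_hr, se = log_scale_summary(1.64, 2.95)
print(f"implied HR: {implied_hr:.2f}, SE of log(HR): {se:.3f}")
```

The implied value agrees with the reported hazard ratio of 2.20 to two decimal places, which is a quick consistency check a reader can apply to any of the intervals listed.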
Siemens Launches High-sensitivity Troponin Test for Faster Diagnosis of Heart Attacks
The new troponin I assays can detect lower levels of troponin compared to conventional testing
July 25, 2018 – The U.S. Food and Drug Administration (FDA) cleared Siemens Healthineers’ high-sensitivity troponin I assays (TnIH) for the Atellica IM and ADVIA Centaur XP/XPT in vitro diagnostic analyzers to aid in the early diagnosis of myocardial infarctions.
The new tests can shorten the time doctors need to diagnose life-threatening heart attacks. The time to first results is 10 minutes. When a patient experiencing chest pain enters the emergency department, a physician orders a blood test to determine whether troponin is present. As blood flow to the heart is blocked, the heart muscle begins to die in as few as 30 to 60 minutes and releases troponin into the bloodstream.
The company said the high-sensitivity performance of the two new Siemens TnIH assays offers the ability to detect lower levels of troponin with significantly improved precision at the 99th percentile, and to detect smaller changes in a patient’s troponin level as repeat testing occurs. This design affords clinicians greater confidence in the results, with precision that makes it possible to measure slight, yet critical, changes and begin treatment.[1,2]
Chest pain is the cause of more than 8 million visits annually nationwide to emergency departments, but only 5.5 percent of those visits lead to serious diagnoses such as heart attacks.[3] Armed with data to properly triage patients sooner or to exclude myocardial infarctions, the Siemens Healthineers TnIH assays can help support testing initiatives tied to improving patient experience.
“Our emergency department is overcrowded with patients. If we can do a more efficient job at triaging patients to receive the proper level of care and to discharge the patients who do not need to stay in the emergency department, this will have a tremendous economic advantage for our healthcare system,” said Alan Wu, M.D., chief of clinical chemistry and toxicology at Zuckerberg San Francisco General Hospital and Trauma Center.
Siemens is launching the product at the 70th AACC Annual Scientific Meeting and Clinical Lab Expo taking place July 31 to Aug. 2 in Chicago.
Increases in levels of cardiac troponin T by high-sensitivity assay (hs-cTnT) over time are associated with later risk of death, coronary heart disease (CHD), and especially heart failure in apparently healthy middle-aged people, according to a report published June 8, 2016 in JAMA Cardiology[1].
The novel findings, based on a cohort of >8000 participants from the Atherosclerosis Risk in Communities (ARIC) study followed up to 16 years, are the first to show “an association between temporal hs-cTnT change and incident CHD events” in asymptomatic middle-aged adults, write the authors, led by Dr John W McEvoy (Johns Hopkins University School of Medicine, Baltimore, MD).
Individuals with the greatest troponin increases over time had the highest risk for poor cardiac outcomes. The strongest association was for risk of heart failure, which reached almost 800% for those with the sharpest hs-cTnT rises.
Intriguingly, those in whom troponin levels fell at least 50% had a reduced mortality risk and may have had a slightly decreased risk of later HF or CHD.
“Serial testing over time with high-sensitivity cardiac troponins provided additional prognostic information over and above the usual clinical risk factors, [natriuretic peptide] levels, and a single troponin measurement. Two measurements appear better than one when it comes to informing risk for future coronary heart disease, heart failure, and death,” McEvoy told heartwire from Medscape.
He cautioned, though, that the conclusion is based on observational data and would need to be confirmed in clinical trials. Moreover, high-sensitivity cardiac troponin assays are widely used in Europe but are not approved in the US.
An important next step after this study, according to an accompanying editorial from Dr James Januzzi (Massachusetts General Hospital, Boston, MA), would be to evaluate whether the combination of hs-troponin and natriuretic peptides improves predictive value in this population[2].
“To the extent prevention is ultimately the holy grail for defeating the global pandemic of CHD, stroke, and HF, the main reason to do a biomarker study such as this would be to set the stage for a biomarker-guided strategy to improve the medical care for those patients at highest risk, as has been recently done with [natriuretic peptides],” he wrote.
The ARIC prospective cohort study entered and followed 8838 participants (mean age 56, 59% female, 21.4% black) in North Carolina, Mississippi, Minneapolis, and Maryland from January 1990 to December 2011. At baseline, participants had no clinical signs of CHD or heart failure.
Levels of hs-cTnT, obtained 6 years apart, were categorized as undetectable (<0.005 ng/mL), detectable (≥0.005 ng/mL to <0.014 ng/mL), and elevated (≥0.014 ng/mL).
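As a small illustration of the three categories just described, the thresholds can be written as a lookup function. This is only a sketch: the function name is hypothetical, and treating a value of exactly 0.014 ng/mL as elevated is an assumption about the boundary.

```python
def categorize_hs_ctnt(level_ng_per_ml: float) -> str:
    """Map one hs-cTnT measurement (ng/mL) to the category used in the report.

    Assumption: a value of exactly 0.014 ng/mL counts as "elevated".
    """
    if level_ng_per_ml < 0.005:
        return "undetectable"
    if level_ng_per_ml < 0.014:
        return "detectable"
    return "elevated"


# Two measurements taken 6 years apart for one hypothetical participant.
first_visit, second_visit = 0.004, 0.009
print(categorize_hs_ctnt(first_visit), "->", categorize_hs_ctnt(second_visit))
```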
Troponin increases from <0.005 ng/mL to 0.005 ng/mL or higher independently predicted development of CHD (HR 1.41; 95% CI 1.16–1.63), HF (HR 1.96; 95% CI 1.62–2.37), and death (HR 1.50; 95% CI 1.31–1.72), compared with undetectable levels at both measurements.
Hazard ratios were adjusted for age, sex, race, body-mass index, C-reactive protein, smoking status, alcohol-intake history, systolic blood pressure, current antihypertensive therapy, diabetes, serum lipid and cholesterol levels, lipid-modifying therapy, estimated glomerular filtration rate, and left ventricular hypertrophy.
Subjects with >50% increase in hs-cTnT had a significantly increased risk of CHD (HR 1.28; 95% CI 1.09–1.52), HF (HR 1.60; 95% CI 1.35–1.91), and death (HR 1.39; 95% CI 1.22–1.59).
Risks for those end points fell somewhat for those with a >50% decrease in hs-cTnT (CHD: HR 0.47; 95% CI 0.22–1.03; HF: HR 0.49; 95% CI 0.23–1.01; death: HR 0.57; 95% CI 0.33–0.99).
Among participants with an adjudicated HF hospitalization, the group writes, associations of hs-cTnT changes with outcomes were of similar magnitude for those with HF with preserved ejection fraction (HFpEF) and HF with reduced ejection fraction (HFrEF).
Few biomarkers have been linked to increased risk for HFpEF, and few effective therapies exist for it. That may be due to problems identifying and enrolling patients with HFpEF in clinical trials, Dr McEvoy pointed out.
“We think the increased troponin over time reflects progressive myocardial injury or progressive myocardial damage,” Dr McEvoy said. “This is a window into future risk, particularly with respect to heart failure but other outcomes as well. It may suggest high-sensitivity troponins as a marker of myocardial health and help guide interventions targeting the myocardium.”
Moreover, he said, “We think that high-sensitivity troponin may also be a useful biomarker along with [natriuretic peptides] for emerging trials of HFpEF therapy.”
But whether hs-troponin has the potential for use as a screening tool is a question for future studies, according to McEvoy.
In his editorial, Januzzi pointed out several implications of the study, including the possibility for lowering cardiac risk in those with measurable hs-troponin, and that HF may be the most obvious outcome to target. Also, optimizing treatment and using cardioprotective therapies may reduce risk linked to increases in hs-troponin. Finally, long-term, large clinical trials on this issue will require a multidisciplinary team effort from various sectors.
“What is needed now are efforts toward developing strategies to upwardly bend the survival curves of those with a biomarker signature of risk, leveraging the knowledge gained from studies such as the report by McEvoy et al to improve public health,” he concluded.
mRNA Data Survival Analysis, Volume 2 (Volume Two: Latest in Genomics Methodologies for Therapeutics: Gene Editing, NGS and BioInformatics, Simulations and the Genome Ontology), Part 1: Next Generation Sequencing (NGS)
mRNA Data Survival Analysis
Curators: Larry H. Bernstein, MD, FCAP and Aviva Lev-Ari, PhD, RN
SURVIV for survival analysis of mRNA isoform variation
The rapid accumulation of clinical RNA-seq data sets has provided the opportunity to associate mRNA isoform variations to clinical outcomes. Here we report a statistical method SURVIV (Survival analysis of mRNA Isoform Variation), designed for identifying mRNA isoform variation associated with patient survival time. A unique feature and major strength of SURVIV is that it models the measurement uncertainty of mRNA isoform ratio in RNA-seq data. Simulation studies suggest that SURVIV outperforms the conventional Cox regression survival analysis, especially for data sets with modest sequencing depth. We applied SURVIV to TCGA RNA-seq data of invasive ductal carcinoma as well as five additional cancer types. Alternative splicing-based survival predictors consistently outperform gene expression-based survival predictors, and the integration of clinical, gene expression and alternative splicing profiles leads to the best survival prediction. We anticipate that SURVIV will have broad utilities for analysing diverse types of mRNA isoform variation in large-scale clinical RNA-seq projects.
Eukaryotic cells generate remarkable regulatory and functional complexity from a finite set of genes. Production of mRNA isoforms through alternative processing and modification of RNA is essential for generating this complexity. A prevalent mechanism for producing mRNA isoforms is the alternative splicing of precursor mRNA1. Over 95% of the multi-exon human genes undergo alternative splicing2, 3, resulting in an enormous level of plasticity in the regulation of gene function and protein diversity. In the last decade, extensive genomic and functional studies have firmly established the critical role of alternative splicing in cancer4, 5, 6. Alternative splicing is involved in a full spectrum of oncogenic processes including cell proliferation, apoptosis, hypoxia, angiogenesis, immune escape and metastasis7, 8. These cancer-associated alternative splicing patterns are not merely the consequences of disrupted gene regulation in cancer but in numerous instances actively contribute to cancer development and progression. For example, alternative splicing of genes encoding the Bcl-2 family of apoptosis regulators generates both anti-apoptotic and pro-apoptotic protein isoforms9. Alternative splicing of the pyruvate kinase M (PKM) gene has a significant impact on cancer cell metabolism and tumour growth10. A transcriptome-wide switch of the alternative splicing programme during the epithelial–mesenchymal transition plays an important role in cancer cell invasion and metastasis11, 12.
RNA sequencing (RNA-seq) has become a popular and cost-effective technology to study transcriptome regulation and mRNA isoform variation13, 14. As the cost of RNA-seq continues to decline, it has been widely adopted in large-scale clinical transcriptome projects, especially for profiling transcriptome changes in cancer. For example, as of April 2015 The Cancer Genome Atlas (TCGA) consortium had generated RNA-seq data on over 11,000 cancer patient specimens from 34 different cancer types. Within the TCGA data, breast invasive carcinoma (BRCA) has the largest sample size of RNA-seq data covering over 1,000 patients, and clinical information such as survival times, tumour stages and histological subtypes is available for the majority of the BRCA patients15. Moreover, the median follow-up time of BRCA patients is ~400 days, and 25% of the patients have more than 1,200 days of follow-up. Collectively, the large sample size and long follow-up time of the TCGA BRCA data set allow us to correlate genomic and transcriptomic profiles to clinical outcomes and patient survival times.
To date, systematic analyses have been performed to reveal the association between copy number variation, DNA methylation, gene expression and microRNA expression profiles with cancer patient survival16, 17. By contrast, despite the importance of mRNA isoform variation and alternative splicing, there have been limited efforts in transcriptome-wide survival analysis of alternative splicing in cancer patients. Most RNA-seq studies of alternative splicing in cancer transcriptomes focus on identifying ‘cancer-specific’ alternative splicing events by comparing cancer tissues with normal controls (see refs 18, 19, 20, 21, 22, 23 for examples). A recent analysis of TCGA RNA-seq data identified 163 recurrent differential alternative splicing events between cancer and normal tissues of three cancer types, among which five were found to have suggestive survival signals for breast cancer at a nominal P-value cutoff of 0.05 (ref. 21). Some other studies reported a significant survival difference between cancer patient subgroups after stratifying patients with overall mRNA isoform expression profiles24, 25. However, systematic cancer survival analyses of alternative splicing at the individual exon resolution have been lacking. Two main challenges exist for survival analyses of mRNA isoform variation and alternative splicing using RNA-seq data. The first challenge is to account for the estimation uncertainty of mRNA isoform ratios inferred from RNA-seq read counts. The statistical confidence of mRNA isoform ratio estimation depends on the RNA-seq read coverage for the events of interest, with larger read coverage leading to a more reliable estimation14. Modelling the estimation uncertainty of mRNA isoform ratio is an essential component of RNA-seq analyses of alternative splicing, as shown by various statistical algorithms developed for detecting differential alternative splicing from multi-group RNA-seq data14, 26, 27, 28,29. 
The second challenge, which is a general issue in survival analysis, is to properly model the association of mRNA isoform ratio with survival time, while accounting for missing data in survival time because of censoring, that is, patients still alive at the end of the survival study, whose precise survival time would be uncertain. To date, no algorithm has been developed for survival analyses of mRNA isoform variation that accounts for these sources of uncertainty simultaneously.
Here we introduce SURVIV (Survival analysis of mRNA Isoform Variation), a statistical model for identifying mRNA isoform ratios associated with patient survival times in large-scale cancer RNA-seq data sets. SURVIV models the estimation uncertainty of mRNA isoform ratios in RNA-seq data and tests the survival effects of isoform variation in both censored and uncensored survival data. In simulation studies, SURVIV consistently outperforms the conventional Cox regression survival analysis that ignores the measurement uncertainty of mRNA isoform ratio. We used SURVIV to identify alternatively spliced exons whose exon-inclusion levels significantly correlated with the survival times of invasive ductal carcinoma (IDC) patients from the TCGA breast cancer cohort. Survival-associated alternative splicing events are identified in gene pathways associated with apoptosis, oxidative stress and DNA damage repair. Importantly, we show that alternative splicing-based survival predictors outperform gene expression-based survival predictors in the TCGA IDC RNA-seq data set, as well as in TCGA data of five additional cancer types. Moreover, the integration of clinical information, gene expression and alternative splicing profiles leads to the best prediction of survival time.
SURVIV statistical model
The statistical model of SURVIV assesses the association between mRNA isoform ratio and patient survival time. While the model is generic for many types of alternative isoform variation, here we use the exon-skipping type of alternative splicing to illustrate the model (Fig. 1a). For each alternative exon involved in exon-skipping, we can use the RNA-seq reads mapping to its exon-inclusion or -skipping isoform to estimate its exon-inclusion level (denoted as ψ, or PSI that is Per cent Spliced In14). A key feature of SURVIV is that it models the RNA-seq estimation uncertainty of exon-inclusion level as influenced by the sequencing coverage for the alternative splicing event of interest. This is a critical issue in accurate quantitative analyses of mRNA isoform ratio in large-scale RNA-seq data sets14, 26, 27, 28, 29. Therefore, SURVIV contains two major components: the first to model the association of mRNA isoform ratio with patient survival time across all patients, and the second to model the estimation uncertainty of mRNA isoform ratio in each individual patient (Fig. 1a).
Figure 1: The statistical framework of the SURVIV model.
(a) For each patient k, the patient’s hazard rate λk(t) is associated with the baseline hazard rate λ0(t) and this patient’s exon-inclusion level ψk. The association of exon-inclusion level with patient survival is estimated by the survival coefficient β. The exon-inclusion level ψk is estimated from the read counts for the exon-inclusion isoform ICk and the exon-skipping isoform SCk. The proportion of the inclusion and skipping reads is adjusted by a normalization function f that considers the lengths of the exon-inclusion and -skipping isoforms (see details in Results and Supplementary Methods). (b) A hypothetical example to illustrate the association of exon-inclusion level with patient survival probability over time Sk(t), with the survival coefficient β=−1 and a constant baseline hazard rate λ0(t)=1. In this example, patients with higher exon-inclusion levels have lower hazard rates and higher survival probabilities. (c) The schematic diagram of an exon-skipping event. The exon-inclusion reads ICk are the reads from the upstream splice junction, the alternative exon itself and the downstream splice junction. The exon-skipping reads SCk are the reads from the skipping splice junction that directly connects the upstream exon to the downstream exon.
Briefly, for any individual exon-skipping event, the first component of SURVIV uses a proportional hazards model to establish the relationship between patient k’s exon-inclusion level ψk and hazard rate λk(t).
For each exon, the association between the exon-inclusion level and patient survival time is reflected by the survival coefficient β. A positive β means increased exon inclusion is associated with higher hazard rate and poorer survival, while a negative β means increased exon inclusion is associated with lower hazard rate and better survival. λ0(t) is the baseline hazard rate estimated from the survival data of all patients (see Supplementary Methods for the detailed estimation procedure). A particular patient’s survival probability over time Sk(t) can be calculated from the patient-specific hazard rate λk(t) as $S_k(t)=\exp\left(-\int_0^t \lambda_k(u)\,du\right)$. Figure 1b illustrates a simple example with a negative β=−1 and a constant baseline hazard rate λ0(t)=1, where higher exon-inclusion levels are associated with lower hazard rates and higher survival probabilities.
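The hypothetical example of Fig. 1b can be reproduced numerically. The sketch below assumes the standard proportional hazards form λk(t)=λ0(t)·exp(β·ψk) with a baseline hazard that is constant in time, so that the survival probability reduces to exp(−λk·t). The function names are illustrative and are not taken from the SURVIV software.

```python
import math


def hazard_rate(psi: float, beta: float, baseline: float = 1.0) -> float:
    """Patient-specific hazard under an assumed proportional hazards form.

    Assumption: lambda_k = lambda_0 * exp(beta * psi_k), with a constant baseline.
    """
    return baseline * math.exp(beta * psi)


def survival_probability(t: float, psi: float, beta: float, baseline: float = 1.0) -> float:
    """S_k(t) = exp(-lambda_k * t), valid because the hazard is constant in time."""
    return math.exp(-hazard_rate(psi, beta, baseline) * t)


# beta = -1 and lambda_0 = 1, as in the hypothetical example of Fig. 1b.
for psi in (0.1, 0.5, 0.9):
    print(f"psi={psi:.1f}  S(t=1)={survival_probability(1.0, psi, beta=-1.0):.3f}")
```

With a negative β, the printed survival probabilities rise as ψ rises, which is the qualitative pattern the figure describes.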
The second component of SURVIV models the exon-inclusion level and its estimation uncertainty in individual patient samples. As illustrated in Fig. 1c, the exon-inclusion level ψk of a given exon in a particular sample can be estimated by the RNA-seq read count specific to the exon inclusion isoform (ICk) and the exon-skipping isoform (SCk). Other types of alternative splicing and mRNA isoform variation can be similarly modelled by this framework29. Given the effective lengths (that is, the number of unique isoform-specific read positions) of the exon-inclusion isoform (lI) and the exon-skipping isoform (lS), the exon-inclusion level ψk can be estimated as $\hat{\psi}_k=\frac{IC_k/l_I}{IC_k/l_I+SC_k/l_S}$. Assuming that the exon-inclusion read count ICk follows a binomial distribution with the total read count nk=ICk+SCk, we have: $IC_k\mid\psi_k\sim\mathrm{Binomial}\left(n_k,\,p_k\right)$, with $p_k=f(\psi_k)=\frac{l_I\psi_k}{l_I\psi_k+l_S(1-\psi_k)}$.
The binomial distribution models the estimation uncertainty of ψk as influenced by the total read count nk, in which the parameter pk represents the proportion of reads from the exon-inclusion isoform, given the exon-inclusion level ψk adjusted by a length normalization function f(ψk) based on the effective lengths of the isoforms. The definitions of effective lengths for all basic types of alternative splicing patterns are described in ref. 29.
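To make the role of read depth concrete, the sketch below evaluates a binomial log-likelihood for the same inclusion ratio at two depths. The length-normalized point estimate and the form of f(ψ) are the ones assumed here from the definitions of effective length. The function names and the effective lengths (3 and 1) are illustrative only.

```python
import math


def estimate_psi(ic: int, sc: int, l_i: float, l_s: float) -> float:
    """Length-normalized point estimate of the exon-inclusion level."""
    inclusion_density = ic / l_i
    skipping_density = sc / l_s
    return inclusion_density / (inclusion_density + skipping_density)


def inclusion_read_fraction(psi: float, l_i: float, l_s: float) -> float:
    """Assumed f(psi): expected share of reads that come from the inclusion isoform."""
    return l_i * psi / (l_i * psi + l_s * (1.0 - psi))


def binomial_log_likelihood(ic: int, sc: int, psi: float, l_i: float, l_s: float) -> float:
    """log P(IC = ic | n = ic + sc, p = f(psi)); requires 0 < psi < 1."""
    p = inclusion_read_fraction(psi, l_i, l_s)
    log_choose = math.lgamma(ic + sc + 1) - math.lgamma(ic + 1) - math.lgamma(sc + 1)
    return log_choose + ic * math.log(p) + sc * math.log(1.0 - p)


def log_likelihood_drop(ic: int, sc: int, l_i: float, l_s: float, shift: float = 0.1) -> float:
    """How far the log-likelihood falls when psi moves `shift` below its estimate."""
    psi_hat = estimate_psi(ic, sc, l_i, l_s)
    return (binomial_log_likelihood(ic, sc, psi_hat, l_i, l_s)
            - binomial_log_likelihood(ic, sc, psi_hat - shift, l_i, l_s))


# Same inclusion ratio at two depths: the deeper sample pins psi down more tightly.
print(log_likelihood_drop(30, 10, l_i=3, l_s=1))
print(log_likelihood_drop(300, 100, l_i=3, l_s=1))
```

A larger drop means the data rule out neighbouring values of ψ more firmly, which is the sense in which higher coverage gives a more reliable estimate.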
Distinct from conventional survival analyses in which predictors do not have estimation uncertainty, the predictors in SURVIV are exon-inclusion levels ψk estimated from RNA-seq count data, and the confidence of ψk estimate for a given exon in a particular sample depends on the RNA-seq read coverage. We use the statistical framework of survival measurement error model30 to incorporate the estimation uncertainty of isoform ratio in the proportional hazards model. Using a likelihood ratio test, we test whether the exon-inclusion levels have a significant association with patient survival over the null hypothesis H0:β=0. The false discovery rate (FDR) is estimated using the Benjamini and Hochberg approach31. Details of the parameter estimation and likelihood ratio test in SURVIV are described in Supplementary Methods.
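The two generic steps named above, a likelihood ratio test followed by Benjamini and Hochberg adjustment, can be sketched as follows. This is not the SURVIV implementation: the log-likelihood values are placeholders, and the use of a chi-square reference distribution with one degree of freedom (a single coefficient β under test) is an assumption.

```python
import math


def lrt_p_value(loglik_alternative: float, loglik_null: float) -> float:
    """P-value of a likelihood ratio test with 1 degree of freedom (assumed).

    For a chi-square variable with 1 df, P(X > x) = erfc(sqrt(x / 2)).
    """
    statistic = max(2.0 * (loglik_alternative - loglik_null), 0.0)
    return math.erfc(math.sqrt(statistic / 2.0))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so each adjusted value stays monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Placeholder log-likelihoods for three exons: (model with beta free, model with beta = 0).
fits = [(-100.0, -106.0), (-250.0, -250.4), (-80.0, -83.1)]
p_values = [lrt_p_value(alt, null) for alt, null in fits]
print(benjamini_hochberg(p_values))
```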
Figure 2: Simulation studies to assess the performance of SURVIV and the importance of modelling the estimation uncertainty of mRNA isoform ratio.
We compared our SURVIV model with Cox regression using point estimates of exon-inclusion levels, which does not consider the estimation uncertainty of the mRNA isoform ratio. (a) To study the effect of RNA-seq depth, we simulated the mean total splice junction read counts equal to 5, 10, 20, 50, 80 and 100 reads. We generated two sets of simulations with and without data-censoring. For each simulation, the true-positive rate (TPR) at 5% false-positive rate is plotted. The inset figure shows the empirical distribution of the mean total splice junction read counts in the TCGA IDC RNA-seq data (x axis in the log10 scale). (b) To faithfully represent the read count distribution in a real data set, we performed another simulation with read counts directly sampled from the TCGA IDC data. Sampled read counts were then multiplied by different factors ranging from 10 to 300% to simulate data sets with different RNA-seq read depth. Continuous and dashed lines represent the performance of SURVIV and Cox regression, respectively. Red lines represent the area under curve (AUC) of the ROC curve (TPR versus false-positive rate plot). Black lines represent the TPR at 5% false-positive rate.
Using these simulated data, we compared SURVIV with Cox regression in two settings, without or with censoring of the survival time. In the setting without censoring, the death and survival time of each individual is known. In the setting with censoring, certain individuals are still alive at the end of the survival study. Consequently, these patients have unknown death and survival time. Here, in the simulation with censoring, we assumed that 85% of the patients were still alive at the end of the study, similar to the censoring rate of the TCGA IDC data set. In both settings and with different depths of RNA-seq coverage, SURVIV consistently outperformed Cox regression in the true-positive rate at the same false-positive rate of 5% (Fig. 2a). As expected, we observed a more significant improvement in SURVIV over Cox regression when the RNA-seq read coverage was low (Fig. 2a).
To more faithfully recapitulate the read count distribution in a real cancer RNA-seq data set, we performed another simulation study with read counts directly sampled from the TCGA IDC data. To assess the influence of RNA-seq read depth on the performance of SURVIV and Cox regression, sampled read counts were then multiplied by different factors ranging from 10 to 300% to simulate data sets with different RNA-seq read depths (Fig. 2b). The TCGA IDC data set has an average RNA-seq depth of ~60 million paired-end reads per patient. Thus, the read depth of these simulated RNA-seq data sets ranged from ~6 million reads to 180 million reads per patient, representing low-coverage RNA-seq studies designed primarily for gene expression analysis32 up to high-coverage RNA-seq studies designed primarily for alternative isoform analysis29. At all levels of RNA-seq depth, SURVIV consistently outperformed Cox regression, as reflected by the area under curve of the receiver operating characteristic (ROC) curve as well as the true-positive rate at 5% false-positive rate (Fig. 2b). The improvement of SURVIV over Cox regression was particularly prominent when the read depth was low. For example, at 10% read depth, SURVIV had 7% improvement in area under curve (68% versus 61%) and 8% improvement in the true-positive rate at 5% false-positive rate (46% versus 38%). Collectively, these simulation results suggest that SURVIV achieves a higher accuracy by accounting for the estimation uncertainty of mRNA isoform ratio in RNA-seq data.
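The two summary metrics used in these comparisons, the true-positive rate at a fixed 5% false-positive rate and the area under the ROC curve, can be computed from per-exon scores as sketched below. The scores are placeholders (for example, −log10 P-values from either method), and the function names are hypothetical.

```python
import math


def tpr_at_fpr(positive_scores, negative_scores, fpr=0.05):
    """True-positive rate at a threshold that admits at most `fpr` of the negatives.

    Requires 0 <= fpr < 1. A score counts as a call only if it exceeds the threshold.
    """
    ranked_negatives = sorted(negative_scores, reverse=True)
    allowed_false_positives = math.floor(fpr * len(ranked_negatives) + 1e-9)
    threshold = ranked_negatives[allowed_false_positives]
    hits = sum(score > threshold for score in positive_scores)
    return hits / len(positive_scores)


def area_under_roc(positive_scores, negative_scores):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Placeholder scores: simulated true survival-associated exons versus null exons.
true_exons = [25, 18.5, 10, 30]
null_exons = list(range(20))
print(tpr_at_fpr(true_exons, null_exons), area_under_roc(true_exons, null_exons))
```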
SURVIV analysis of TCGA IDC breast cancer data
To illustrate the practical utility of SURVIV, we used it to analyse the overall survival time of 682 IDC patients from the TCGA breast cancer (BRCA) RNA-seq data set (see Methods for details of the data source and processing pipeline). We chose to analyse IDC because it is the most frequent type of breast cancer33, comprising ~70% of patients in the TCGA breast cancer data set. To control for the effects of significant clinical parameters such as tumour stage and subtype and identify alternative splicing events associated with patient outcomes across multiple molecular and clinical subtypes, we followed the procedure of Croce and colleagues in analysing mRNA and microRNA prognostic signature of IDC33 and stratified the patients according to their clinical parameters. We then conducted SURVIV analysis in 26 clinical subgroups with at least 50 patients in each subgroup. We identified 229 exon-skipping events associated with patient survival in multiple clinical subgroups that met the criteria of SURVIV P-value≤0.01 in at least two subgroups of the same clinical parameter (cancer subtype, stage, lymph node, metastasis, tumour size, oestrogen receptor status, progesterone receptor status, HER2 status and age as shown in Fig. 3). DAVID (Database for Annotation, Visualization and Integrated Discovery) Gene Ontology analyses34 of the 229 alternative splicing events suggest an enrichment of genes in cancer-related functional categories such as intracellular signalling, apoptosis, oxidative stress and response to DNA damage (Supplementary Fig. 1). Table 1 shows a few selected examples of survival-associated alternative splicing events in cancer-related genes. Using two-means clustering of each individual exon’s inclusion levels, the 682 IDC patients can be segregated into two subgroups with significantly different survival times as illustrated by the Kaplan–Meier survival plot (Fig. 4). 
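Two-means clustering of a single exon’s inclusion levels amounts to one-dimensional k-means with k=2. The sketch below is a minimal version under simple assumptions (centres initialised at the minimum and maximum, plain Lloyd iterations). The inclusion levels shown are made-up values, not TCGA data.

```python
def two_means_1d(values, max_iterations=100):
    """Split one-dimensional values into a low and a high cluster.

    Returns (low_centre, high_centre, labels), where label 1 marks the high cluster.
    """
    low_centre, high_centre = min(values), max(values)
    for _ in range(max_iterations):
        low = [v for v in values if abs(v - low_centre) <= abs(v - high_centre)]
        high = [v for v in values if abs(v - low_centre) > abs(v - high_centre)]
        if not high:  # all values identical: nothing to split
            break
        new_low, new_high = sum(low) / len(low), sum(high) / len(high)
        if (new_low, new_high) == (low_centre, high_centre):
            break
        low_centre, high_centre = new_low, new_high
    labels = [int(abs(v - low_centre) > abs(v - high_centre)) for v in values]
    return low_centre, high_centre, labels


# Hypothetical exon-inclusion levels for six patients.
inclusion_levels = [0.95, 0.97, 0.93, 0.85, 0.83, 0.87]
print(two_means_1d(inclusion_levels))
```

The resulting labels define the two patient groups whose survival curves are then compared.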
We also carried out hierarchical clustering of IDC patients using 176 survival-associated alternative exons (P≤0.01; SURVIV analysis of all IDC patients). Using the exon-inclusion levels of these 176 exons, we clustered IDC patients into three major subgroups, with 95, 194 and 389 patients, respectively. As illustrated by the Kaplan–Meier survival plots, the three subgroups had significantly different survival times (Supplementary Fig. 2).
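For readers unfamiliar with how a Kaplan–Meier curve is built from censored follow-up data, a minimal product-limit estimator is sketched below. The follow-up times and event flags are invented for illustration. A flag of 1 marks an observed death and 0 marks a patient censored at that time.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with at least one death."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at this time point.
        while i < len(data) and data[i][0] == current_time:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((current_time, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve


# Hypothetical follow-up (days) for five patients; two are censored.
print(kaplan_meier([100, 200, 200, 300, 400], [1, 1, 0, 1, 0]))
```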
Figure 3: SURVIV analysis of exon-skipping events in the TCGA IDC RNA-seq data set.
IDC patients are stratified into multiple clinical subgroups based on clinical parameters including cancer subtype, stage, lymph node status, metastasis, tumour size, oestrogen receptor status, progesterone receptor status, HER2 status and age. Only clinical subgroups with at least 50 patients are included in further analyses. Numbers of patients in the subgroups are indicated next to the names of the subgroups. Shown in the heatmap are the log10 SURVIV P-values of the 229 exons associated with patient survival (P≤0.01) in at least two subgroups of the same class of clinical parameters. Turquoise colour indicates positive correlation that higher exon-inclusion levels are associated with higher survival probabilities. Magenta colour indicates negative correlation that lower exon-inclusion levels are associated with higher survival probabilities.
Figure 4: Kaplan–Meier survival plots of IDC patients stratified by two-means clustering of the exon-inclusion levels of four survival-associated alternative splicing events.
Clustering was generated for each of the four exons separately. Black lines represent patients with high exon-inclusion levels. Red lines represent patients with low exon-inclusion levels. The P-values are from SURVIV analysis of the TCGA IDC RNA-seq data. (a) ATRIP. (b) BCL2L11. (c) CD74. (d) PCBP4.
Figure 5: Alternative splicing of STAT5A exon 5 is significantly associated with IDC patient survival.
(a) The gene structure of the STAT5A full-length isoform compared to the ΔEx5 isoform skipping the 5th exon. (b) Kaplan–Meier survival plot of IDC patients stratified by two-means clustering using exon-inclusion levels of STAT5A exon 5. The 420 patients in Group 1 (average exon 5 inclusion level=95%) have significantly higher survival probabilities than the 262 patients in Group 2 (average exon 5 inclusion level=85%) (SURVIV P=6.8e−4). (c) Exon 5 inclusion levels of IDC patients stratified by two-means clustering using exon 5 inclusion levels. Group 1 has 420 patients with average exon-inclusion level at 95%. Group 2 has 262 patients with average exon-inclusion level at 85%. (d) STAT5A exon 5 inclusion levels in normal breast tissues versus breast cancer tumour samples. Exon-inclusion levels are extracted from 86 TCGA breast cancer patients with matched normal and tumour samples. Normal breast tissues have average exon 5 inclusion level at 95%, compared to 91% average exon-inclusion level in tumour samples. Error bars represent 95% confidence interval of the mean.
Figure 6: Splicing factor regulatory network of survival-associated alternative splicing events in IDC.
(a–c) Kaplan–Meier survival plots of IDC patients stratified by the gene expression levels of three splicing factors: TRA2B (a, Cox regression P=1.8e−4), HNRNPH1 (b, P=3.4e−4) and SFRS3 (c, P=2.8e−3). Black lines represent patients with high gene expression levels. Red lines represent patients with low gene expression levels. (d) The exon-inclusion levels of a DHX30 alternative exon are negatively correlated with TRA2B gene expression levels (robust correlation coefficient r=−0.26, correlation P=1.2e−17). (e) The exon-inclusion levels of a MAP3K4 alternative exon are positively correlated with HNRNPH1 gene expression levels (robust correlation coefficient r=0.16, correlation P=2.6e−06). (f) A splicing co-expression network of the three splicing factors and their correlated survival-associated alternative exons. In total, 84 survival-associated alternative exons are significantly correlated with the three splicing factors. The positive/negative correlation between splicing factors and alternative exons is represented by blue/red lines, respectively. Exons whose inclusion levels are positively/negatively correlated with survival times are represented by blue/red dots, respectively. The size of the splicing factor circles is proportional to the number of correlated exons within the network.
Figure 7: Cross-validation of different classes of IDC survival predictors measured by the C-index
A C-index of 1 indicates perfect prediction accuracy and a C-index of 0.5 indicates random guess. The plots indicate the distribution of C-indexes from 100 rounds of cross-validation. The centre value of the box plot is the median C-index from 100 rounds of cross-validation. The notch represents the 95% confidence interval of the median. The box represents the 25 and 75% quantiles. The whiskers extended out from the box represent the 5 and 95% quantiles. Two-sided Wilcoxon test was used to compare different survival predictors. The different classes of predictors are: (a) clinical information (median C-index 0.67). (b) Gene expression (median C-index 0.68). (c) Alternative splicing (median C-index 0.71). (d) Clinical information+gene expression (median C-index 0.69). (e) Clinical information+alternative splicing (median C-index 0.73). (f) Clinical information+gene expression+alternative splicing (median C-index 0.74). Note that ‘Gene’ refers to ‘Gene-level expression’ in these plots.
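The C-index reported in these plots can be illustrated with a small pairwise implementation of Harrell’s concordance index. This sketch is not the code used in the study: the risk scores are placeholders for whatever a fitted predictor outputs, and only pairs in which the earlier time is an observed death are treated as comparable.

```python
def concordance_index(times, events, risk_scores):
    """Fraction of comparable patient pairs ranked correctly by the risk score.

    A pair (i, j) is comparable when patient i died (events[i] == 1) before
    patient j's last follow-up. Higher risk should go with shorter survival.
    Tied risk scores count as half a concordant pair.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical data: four patients, one censored, with predicted risk scores.
print(concordance_index([100, 250, 300, 400], [1, 0, 1, 1], [0.9, 0.4, 0.6, 0.1]))
```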
Next, we carried out the SURVIV analysis in five additional cancer types in TCGA, including GBM (glioblastoma multiforme), KIRC (kidney renal clear cell carcinoma), LGG (lower grade glioma), LUSC (lung squamous cell carcinoma) and OV (ovarian serous cystadenocarcinoma). As expected, the number of significant events at different FDR or P-value significance cutoffs varied across cancer types, with LGG having the strongest survival-associated alternative splicing signals with 660 significant exon-skipping events at FDR≤5% (Supplementary Data 3 and 4). Strikingly, regardless of the number of significant events, alternative splicing-based survival predictors outperformed gene expression-based survival predictors across all cancer types (Supplementary Fig. 3), consistent with our initial observation on the IDC data set.
Alternative processing and modification of mRNA, such as alternative splicing, allow cells to generate a large number of mRNA and protein isoforms with diverse regulatory and functional properties. The plasticity of alternative splicing is often exploited by cancer cells to produce isoform switches that promote cancer cell survival, proliferation and metastasis7, 8. The widespread use of RNA-seq in cancer transcriptome studies15, 47, 48 has provided the opportunity to comprehensively elucidate the landscape of alternative splicing in cancer tissues. While existing studies of alternative splicing in large-scale cancer transcriptome data largely focused on the comparison of splicing patterns between cancer and normal tissues or between different subtypes of cancer18, 21, 49, additional computational tools are needed to characterize the clinical relevance of alternative splicing using massive RNA-seq data sets, including the association of alternative splicing with phenotypes and patient outcomes.
We have developed SURVIV, a novel statistical model for survival analysis of alternative isoform variation using cancer RNA-seq data. SURVIV uses a survival measurement error model to simultaneously model the estimation uncertainty of mRNA isoform ratio in individual patients and the association of mRNA isoform ratio with survival time across patients. Compared with the conventional Cox regression model that uses each patient’s mRNA isoform ratio as a point estimate, SURVIV achieves a higher accuracy as indicated by simulation studies under a variety of settings. Of note, we observed a particularly marked improvement of SURVIV over Cox regression for low- and moderate-depth RNA-seq data (Fig. 2b). This has important practical value because many clinical RNA-seq data sets have large sample size but relatively modest sequencing depth.
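The reason a point estimate loses information at modest depth can be illustrated with a small sketch. This is not the SURVIV measurement error model; it assumes a simple binomial sampling view of read counts, ignores isoform effective-length normalization, and uses a hypothetical function name and toy counts. It shows only that the same estimated isoform ratio carries very different uncertainty depending on how many reads support it.

```python
import math


def isoform_ratio_with_se(inclusion_reads, skipping_reads):
    """Return (ratio, standard_error) for an exon-inclusion ratio.

    Assumption: reads are treated as binomial draws, so the standard
    error is sqrt(p * (1 - p) / n). Length normalization is ignored.
    """
    total = inclusion_reads + skipping_reads
    if total == 0:
        raise ValueError("no reads cover this splicing event")
    ratio = inclusion_reads / total
    standard_error = math.sqrt(ratio * (1.0 - ratio) / total)
    return ratio, standard_error


# Two hypothetical patients with the same point estimate (0.3)
# but a 100-fold difference in read coverage.
low_depth = isoform_ratio_with_se(3, 7)
high_depth = isoform_ratio_with_se(300, 700)
```

Under these assumptions the two patients share the same ratio, yet the low-depth estimate has a standard error ten times larger. A model that treats both values as equally reliable point estimates discards that difference, which is the gap a measurement error approach is designed to address.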
Using the TCGA IDC breast cancer RNA-seq data of 682 patients, SURVIV identified 229 alternative splicing events associated with patient survival time, which met the criteria of SURVIV P-values≤0.01 in multiple clinical subgroups. While the statistical threshold seemed loose, several lines of evidence suggest the functional and clinical relevance of these survival-associated alternative splicing events. These alternative splicing events were frequently identified and enriched in the gene functional groups important for cancer development and progression, including apoptosis, DNA damage response and oxidative stress. While some of these events may simply reflect correlation but not causal effect on cancer patient survival, other events may play an active role in regulating cancer cell phenotypes. For example, a survival-associated alternative splicing event involving exon 5 of STAT5A is known to regulate the activity of this transcription factor with important roles in epithelial cell growth and apoptosis37. Using a co-expression network analysis of splicing factor to exon correlation across all patients, we identified three splicing factors (TRA2B, HNRNPH1 and SFRS3) as potential hubs of the survival-associated alternative splicing network of IDC. The expression levels of all three splicing factors were negatively associated with patient survival times (Fig. 6a–c), and both TRA2B and HNRNPH1 were previously reported to have an impact on cancer-related molecular pathways40, 41, 42, 43, 44, 45. Finally, despite the limited power in detecting individual events, we show that the survival-associated alternative splicing events can be used to construct a predictor for patient survival, with an accuracy higher than predictors based on clinical parameters or gene expression profiles (Fig. 7). This further demonstrates the potential biological relevance and clinical utility of the identified alternative splicing events.
We performed cross-validation analyses to evaluate and compare the prognostic value of alternative splicing, gene expression and clinical information for predicting patient survival, either independently or in combination. As expected, the combined use of all three types of information led to the best prediction accuracy. Because we used penalized regression to build the prediction model, combining information from multiple layers of data did not necessarily increase the number of predictors in the model. The perhaps more surprising and intriguing result is that alternative splicing-based predictors appear to outperform gene expression-based predictors when used alone and when either type of data was combined with clinical information (Fig. 7). We observed the same trend in five additional cancer types (Supplementary Fig. 3). We note that this finding was consistent with a previous report that cancer subtype classification based on splicing isoform expression performed better than gene expression-based classification25. While this trend seems counterintuitive because accurate estimation of gene expression requires much lower RNA-seq depth than accurate estimation of alternative splicing29, one possible explanation may be the inherent characteristic of isoform ratio data. By definition, mRNA isoform ratio is estimated as the ratio of multiple mRNA isoforms from a single gene. Therefore, mRNA isoform ratio data have a ‘built-in’ internal control that could be more robust against certain artefacts and confounding issues that influence gene expression estimates across large clinical RNA-seq data sets, such as poor sample quality and RNA degradation12. Regardless of the reasons, our data call for further studies to fully explore the utility of mRNA isoform ratio data for various clinical research applications.
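The ‘built-in’ internal control argument can be sketched numerically. The example below is a simplified illustration, not an analysis from this study: it assumes a sample-wide artefact that scales the measured abundance of both isoforms of a gene by the same factor, and the function name and abundance values are hypothetical. Real RNA degradation may affect isoforms unequally, in which case the ratio would not be fully protected.

```python
def gene_level_and_ratio(inclusion_isoform, skipping_isoform, sample_factor=1.0):
    """Return (gene_level_expression, isoform_ratio) for one gene.

    sample_factor models a hypothetical artefact (for example, lower
    RNA yield) that scales both isoforms of the gene equally.
    """
    measured_inclusion = inclusion_isoform * sample_factor
    measured_skipping = skipping_isoform * sample_factor
    gene_level = measured_inclusion + measured_skipping
    ratio = measured_inclusion / gene_level
    return gene_level, ratio


# Same underlying biology, measured in a good sample and a poor one.
good_sample = gene_level_and_ratio(30.0, 70.0, sample_factor=1.0)
poor_sample = gene_level_and_ratio(30.0, 70.0, sample_factor=0.5)
```

Under this assumption the gene-level estimate halves in the poor-quality sample while the isoform ratio is unchanged, because the shared factor cancels in the numerator and denominator.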
The SURVIV source code is available for download at https://github.com/Xinglab/SURVIV. SURVIV is a general statistical model for survival analysis of mRNA isoform ratio using RNA-seq data. The current statistical framework of SURVIV is applicable to RNA-seq based count data for all basic types of alternative splicing patterns involving two isoform choices from an alternatively spliced region, such as exon-skipping, alternative 5′ splice sites, alternative 3′ splice sites, mutually exclusive exons and retained introns, as well as other forms of alternative isoform variation such as RNA editing. With the rapid accumulation of clinical RNA-seq data sets, SURVIV will be a useful tool for elucidating the clinical relevance and potential functional significance of alternative isoform variation in cancer and other diseases.