Archive for the ‘Clinical Genomics’ Category

Systems Biology analysis of Transcription Networks, Artificial Intelligence, and High-End Computing Coming to Fruition in Personalized Oncology

Curator: Stephen J. Williams, Ph.D.

In the June 2020 issue of the journal Science, writer Roxanne Khamsi has an interesting article, “Computing Cancer’s Weak Spots; An algorithm to unmask tumors’ molecular linchpins is tested in patients”[1], describing some early successes in combining cancer genome sequencing with artificial intelligence algorithms to guide personalized clinical treatment decisions for various tumor types.  In 2016, oncologist Amy Tiersten collaborated with systems biologist Andrea Califano and cell biologist Jose Silva at Mount Sinai Hospital to develop a systems biology approach, which determined that the drug ruxolitinib, a JAK1/2 inhibitor that blocks signaling through STAT3, would be effective against the aggressively recurring, Herceptin-resistant breast tumor of one of her patients.  Rather than defining networks of driver mutations, Dr. Califano focused on identifying a few transcription factors that act as ‘linchpins’, or master controllers, of transcriptional networks within tumor cells, hoping in essence to ‘bottleneck’ the transcriptional machinery of potential oncogenic products. As Dr. Califano states:

“targeting those master regulators and you will stop cancer in its tracks, no matter what mutation initially caused it.”

It is important to note that this approach also relies on the ability to profile tumors by RNA-seq to determine the underlying mutations that dictate which master regulators are pertinent in any one tumor.  Given the wide heterogeneity within tumor samples, this sequencing effort may have to involve multiple biopsies (as discussed in earlier posts on tumor heterogeneity in renal cancer).

As stated in the article, Califano co-founded a company called DarwinHealth in 2015 to guide doctors by identifying the key transcription factors in a patient’s tumor and suggesting personalized therapeutics against those identified molecular targets (OncoTarget™).  He has collaborated with the Jackson Laboratory and, most recently, Columbia University to conduct a $15 million, 3,000-patient clinical trial.  This is a bit of a stretch from his initial training as a physicist; in 1986, IBM hired him for some artificial intelligence projects.  He then landed at Columbia in 2003 and has been working on identifying the transcriptional nodes that govern cancer survival and tumorigenicity.  Dr. Califano had figured that the number of genetic mutations that could potentially be drivers was too vast to develop therapeutics against:

A 2018 study which analyzed more than 9000 tumor samples reported over 1.5 million mutations[2]

He reasoned that you would instead have to identify the common connections between these pathways, or transcriptional nodes, and termed them master regulators.

A Pan-Cancer Analysis of Enhancer Expression in Nearly 9000 Patient Samples

Chen H, Li C, Peng X, et al. Cell. 2018;173(2):386-399.e12.


The role of enhancers, a key class of non-coding regulatory DNA elements, in cancer development has increasingly been appreciated. Here, we present the detection and characterization of a large number of expressed enhancers in a genome-wide analysis of 8928 tumor samples across 33 cancer types using TCGA RNA-seq data. Compared with matched normal tissues, global enhancer activation was observed in most cancers. Across cancer types, global enhancer activity was positively associated with aneuploidy, but not mutation load, suggesting a hypothesis centered on “chromatin-state” to explain their interplay. Integrating eQTL, mRNA co-expression, and Hi-C data analysis, we developed a computational method to infer causal enhancer-gene interactions, revealing enhancers of clinically actionable genes. Having identified an enhancer ∼140 kb downstream of PD-L1, a major immunotherapy target, we validated it experimentally. This study provides a systematic view of enhancer activity in diverse tumor contexts and suggests the clinical implications of enhancers.


A diagram of how concentrating on these transcriptional linchpins or nodes may be more therapeutically advantageous as only one pharmacologic agent is needed versus multiple agents to inhibit the various upstream pathways:



From: Khamsi R: Computing cancer’s weak spots. Science 2020, 368(6496):1174-1177.


VIPER Algorithm (Virtual Inference of Protein activity by Enriched Regulon Analysis)

The algorithm that Califano and DarwinHealth developed is a systems biology approach that uses a tumor’s RNA-seq data to determine the controlling nodes of transcription.  They have recently used the VIPER algorithm to analyze RNA-seq data from more than 10,000 tumor samples from TCGA and identified 407 transcription factor genes that acted as these linchpins across all tumor types.  Only 20 to 25 of them were implicated in just one tumor type, so these potential nodes are common to many forms of cancer.

Other institutions, such as Cold Spring Harbor Laboratory, have been using VIPER in their patient tumor analysis, and linchpins for other tumor types have been found.  For instance, VIPER identified the transcription factors IKZF1 and IKZF3 as linchpins in multiple myeloma.  However, approved therapeutics are hard to come by for targets that are transcription factors, as most of pharma has concentrated on inhibiting easier targets such as kinases and their associated activity.  In general, developing transcription factor inhibitors is a more difficult undertaking for multiple reasons.

Network-based inference of protein activity helps functionalize the genetic landscape of cancer. Alvarez MJ, Shen Y, Giorgi FM, Lachmann A, Ding BB, Ye BH, Califano A. Nature Genetics 2016, 48(8):838-847 [3]


Identifying the multiple dysregulated oncoproteins that contribute to tumorigenesis in a given patient is crucial for developing personalized treatment plans. However, accurate inference of aberrant protein activity in biological samples is still challenging as genetic alterations are only partially predictive and direct measurements of protein activity are generally not feasible. To address this problem we introduce and experimentally validate a new algorithm, VIPER (Virtual Inference of Protein-activity by Enriched Regulon analysis), for the accurate assessment of protein activity from gene expression data. We use VIPER to evaluate the functional relevance of genetic alterations in regulatory proteins across all TCGA samples. In addition to accurately inferring aberrant protein activity induced by established mutations, we also identify a significant fraction of tumors with aberrant activity of druggable oncoproteins—despite a lack of mutations, and vice-versa. In vitro assays confirmed that VIPER-inferred protein activity outperforms mutational analysis in predicting sensitivity to targeted inhibitors.





Figure 1 

Schematic overview of the VIPER algorithm From: Alvarez MJ, Shen Y, Giorgi FM, Lachmann A, Ding BB, Ye BH, Califano A: Functional characterization of somatic mutations in cancer using network-based inference of protein activity. Nature genetics 2016, 48(8):838-847.

(a) Molecular layers profiled by different technologies. Transcriptomics measures steady-state mRNA levels; Proteomics quantifies protein levels, including some defined post-translational isoforms; VIPER infers protein activity based on the protein’s regulon, reflecting the abundance of the active protein isoform, including post-translational modifications, proper subcellular localization and interaction with co-factors. (b) Representation of VIPER workflow. A regulatory model is generated from ARACNe-inferred context-specific interactome and Mode of Regulation computed from the correlation between regulator and target genes. Single-sample gene expression signatures are computed from genome-wide expression data, and transformed into regulatory protein activity profiles by the aREA algorithm. (c) Three possible scenarios for the aREA analysis, including increased, decreased or no change in protein activity. The gene expression signature and its absolute value (|GES|) are indicated by color scale bars, induced and repressed target genes according to the regulatory model are indicated by blue and red vertical lines. (d) Pleiotropy Correction is performed by evaluating whether the enrichment of a given regulon (R4) is driven by genes co-regulated by a second regulator (R4∩R1). (e) Benchmark results for VIPER analysis based on multiple-samples gene expression signatures (msVIPER) and single-sample gene expression signatures (VIPER). Boxplots show the accuracy (relative rank for the silenced protein), and the specificity (fraction of proteins inferred as differentially active at p < 0.05) for the 6 benchmark experiments (see Table 2). Different colors indicate different implementations of the aREA algorithm, including 2-tail (2T) and 3-tail (3T), Interaction Confidence (IC) and Pleiotropy Correction (PC).
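The core idea in panels (b) and (c), scoring a regulator by how its targets are distributed in a sample's gene expression signature, can be illustrated with a toy calculation. The sketch below is not the aREA algorithm and omits ARACNe network inference, the statistical normalization and the pleiotropy correction; it only assumes that each target carries a mode of regulation (+1 induced, -1 repressed) and a confidence weight. The function names, gene names and numbers are hypothetical.

```python
def rank_scale(signature):
    """Map expression values to evenly spaced scores in (-1, 1) by rank."""
    order = sorted(range(len(signature)), key=lambda i: signature[i])
    scaled = [0.0] * len(signature)
    for rank, idx in enumerate(order):  # rank 0 = most repressed gene
        scaled[idx] = (rank + 0.5) / len(signature) * 2.0 - 1.0
    return scaled


def regulon_activity(signature, genes, regulon):
    """Confidence-weighted mean of rank-scaled expression over a regulon.

    regulon maps target gene -> (mode_of_regulation, confidence), where
    mode_of_regulation is +1 for induced and -1 for repressed targets.
    Positive scores suggest increased regulator activity, negative scores
    decreased activity. This is a simplification, not aREA itself.
    """
    scaled = dict(zip(genes, rank_scale(signature)))
    weighted_sum = 0.0
    total_weight = 0.0
    for target, (mode, confidence) in regulon.items():
        if target in scaled:  # ignore targets missing from the signature
            weighted_sum += mode * confidence * scaled[target]
            total_weight += confidence
    return weighted_sum / total_weight if total_weight else 0.0


# Hypothetical single-sample signature (e.g. z-scores versus a reference)
genes = ["A", "B", "C", "D", "E", "F"]
signature = [2.1, 1.5, -0.2, 0.3, -1.8, -2.4]

# Hypothetical regulon: A and B are induced targets, E and F repressed
regulon = {"A": (1, 1.0), "B": (1, 1.0), "E": (-1, 1.0), "F": (-1, 1.0)}

activity = regulon_activity(signature, genes, regulon)
flipped = regulon_activity([-v for v in signature], genes, regulon)
```

In this toy case the induced targets sit at the top of the signature and the repressed targets at the bottom, so `activity` is positive; inverting the signature flips the sign, which corresponds to the "increased" and "decreased" scenarios in panel (c).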

Other articles from Andrea Califano on the VIPER algorithm in cancer include:

Resistance to neoadjuvant chemotherapy in triple-negative breast cancer mediated by a reversible drug-tolerant state.

Echeverria GV, Ge Z, Seth S, Zhang X, Jeter-Jones S, Zhou X, Cai S, Tu Y, McCoy A, Peoples M, Sun Y, Qiu H, Chang Q, Bristow C, Carugo A, Shao J, Ma X, Harris A, Mundi P, Lau R, Ramamoorthy V, Wu Y, Alvarez MJ, Califano A, Moulder SL, Symmans WF, Marszalek JR, Heffernan TP, Chang JT, Piwnica-Worms H. Sci Transl Med. 2019 Apr 17;11(488):eaav0936. doi: 10.1126/scitranslmed.aav0936. PMID: 30996079

An Integrated Systems Biology Approach Identifies TRIM25 as a Key Determinant of Breast Cancer Metastasis.

Walsh LA, Alvarez MJ, Sabio EY, Reyngold M, Makarov V, Mukherjee S, Lee KW, Desrichard A, Turcan Ş, Dalin MG, Rajasekhar VK, Chen S, Vahdat LT, Califano A, Chan TA. Cell Rep. 2017 Aug 15;20(7):1623-1640. doi: 10.1016/j.celrep.2017.07.052. PMID: 28813674

Inhibition of the autocrine IL-6-JAK2-STAT3-calprotectin axis as targeted therapy for HR-/HER2+ breast cancers.

Rodriguez-Barrueco R, Yu J, Saucedo-Cuevas LP, Olivan M, Llobet-Navas D, Putcha P, Castro V, Murga-Penas EM, Collazo-Lorduy A, Castillo-Martin M, Alvarez M, Cordon-Cardo C, Kalinsky K, Maurer M, Califano A, Silva JM. Genes Dev. 2015 Aug 1;29(15):1631-48. doi: 10.1101/gad.262642.115. Epub 2015 Jul 30. PMID: 26227964

Master regulators used as breast cancer metastasis classifier.

Lim WK, Lyashenko E, Califano A. Pac Symp Biocomput. 2009:504-15. PMID: 19209726


Additional References


  1. Khamsi R: Computing cancer’s weak spots. Science 2020, 368(6496):1174-1177.
  2. Chen H, Li C, Peng X, Zhou Z, Weinstein JN, Liang H: A Pan-Cancer Analysis of Enhancer Expression in Nearly 9000 Patient Samples. Cell 2018, 173(2):386-399.e12.
  3. Alvarez MJ, Shen Y, Giorgi FM, Lachmann A, Ding BB, Ye BH, Califano A: Functional characterization of somatic mutations in cancer using network-based inference of protein activity. Nature genetics 2016, 48(8):838-847.


Other articles of Note on this Open Access Online Journal Include:

Issues in Personalized Medicine in Cancer: Intratumor Heterogeneity and Branched Evolution Revealed by Multiregion Sequencing



Updated listing of COVID-19 vaccine and therapeutic trials from NIH ClinicalTrials.gov

Curator: Stephen J. Williams, PhD


The following file contains an updated list (search on 4/15/2020) of COVID-19 related clinical trials from https://clinicaltrials.gov/


The Excel file can be downloaded here: Current Covid-19 Trials


The workbook has separate sheets for current COVID-19 vaccine trials, current COVID-19 trials with the IL6R (interleukin 6 receptor) antagonist tocilizumab, and all COVID-related trials.  The Excel spreadsheet also contains links to more information about the trials.


As of April 15, 2020, the numbers of listed trials are as follows:


clinicaltrials.gov search terms | Number of results | Number of completed trials | Number of trials currently recruiting
COVID-19 or SARS-CoV-2 | 410 | 5 completed, 5 withdrawn |
1st row terms + vaccine | 28 | 0 | 15
1st row terms + tocilizumab | 16 | 0 | 10
1st row terms + hydroxychloroquine | 61 | 1 | 22


A few highlights of the COVID related trials on clinicaltrials.gov


Withdrawn trials


Recombinant Human Angiotensin-converting Enzyme 2 (rhACE2) as a Treatment for Patients With COVID-19 (NCT04287686)

Study Description


Brief Summary:

This is an open label, randomized, controlled, pilot clinical study in patients with COVID-19, to obtain preliminary biologic, physiologic, and clinical data in patients with COVID-19 treated with rhACE2 or control patients, to help determine whether a subsequent Phase 2B trial is warranted.


Condition or disease | Intervention/treatment | Phase
COVID-19 | Drug: Recombinant human angiotensin-converting enzyme 2 (rhACE2) | Not Applicable


Detailed Description:

This is a small pilot study investigating whether there is any efficacy signal that warrants a larger Phase 2B trial, or any harm that suggests that such a trial should not be done. It is not expected to produce statistically significant results in the major endpoints. The investigators will examine all of the biologic, physiological, and clinical data to determine whether a Phase 2B trial is warranted.

Primary efficacy analysis will be carried out only on patients receiving at least 4 doses of active drug. Safety analysis will be carried out on all patients receiving at least one dose of active drug.

It is planned to enroll at least 24 subjects with COVID-19. It is expected to have at least 12 evaluable patients in each group.

Experimental group: 0.4 mg/kg rhACE2 IV BID and standard of care
Control group: standard of care

Intervention duration: up to 7 days of therapy

No planned interim analysis.
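The dosing and analysis-population rules quoted above (0.4 mg/kg twice daily for up to 7 days, efficacy analysis on patients with at least 4 doses, safety analysis on patients with at least one dose) can be written out as a small worked example. This is only an illustrative sketch: the record layout, function names and patient data are hypothetical and are not taken from the trial protocol.

```python
from dataclasses import dataclass

DOSE_MG_PER_KG = 0.4       # from the trial description
DOSES_PER_DAY = 2          # BID
MAX_TREATMENT_DAYS = 7     # "up to 7 days of therapy"
MIN_DOSES_EFFICACY = 4     # efficacy population threshold
MIN_DOSES_SAFETY = 1       # safety population threshold


@dataclass
class TreatedPatient:
    patient_id: str        # hypothetical identifier
    weight_kg: float
    doses_received: int


def dose_mg(weight_kg):
    """Single rhACE2 dose in mg for a given body weight."""
    return DOSE_MG_PER_KG * weight_kg


def analysis_sets(patients):
    """Return (efficacy_ids, safety_ids) under the quoted dose-count rules."""
    max_doses = DOSES_PER_DAY * MAX_TREATMENT_DAYS
    for p in patients:
        if not 0 <= p.doses_received <= max_doses:
            raise ValueError(f"{p.patient_id}: dose count outside 0..{max_doses}")
    efficacy = [p.patient_id for p in patients
                if p.doses_received >= MIN_DOSES_EFFICACY]
    safety = [p.patient_id for p in patients
              if p.doses_received >= MIN_DOSES_SAFETY]
    return efficacy, safety


# Hypothetical patients in the experimental arm
patients = [
    TreatedPatient("P1", 70.0, 14),  # completed the full course
    TreatedPatient("P2", 60.0, 3),   # stopped early
    TreatedPatient("P3", 80.0, 0),   # never dosed
]
efficacy_ids, safety_ids = analysis_sets(patients)
```

Here only P1 qualifies for the efficacy analysis, while P1 and P2 both count toward safety. The upper bound of 14 doses follows from twice-daily dosing over at most 7 days.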

Study was withdrawn before participants were enrolled.

Washed Microbiota Transplantation for Patients With 2019-nCoV Infection (NCT04251767)

Study Description


Brief Summary:

Gut dysbiosis co-exists in patients with coronavirus pneumonia. Some of these patients develop secondary bacterial infections and antibiotic-associated diarrhea (AAD). A recent study of washed microbiota transplantation (WMT) as rescue therapy in critically ill patients with AAD demonstrated important clinical benefits and the safety of WMT. This clinical trial aims to evaluate the outcome of WMT combined with standard therapy for patients with 2019 novel coronavirus pneumonia, especially those with dysbiosis-related conditions.


Detailed Description:

An ongoing outbreak of 2019 novel coronavirus was reported in Wuhan, China. 2019-nCoV has caused a cluster of pneumonia cases and poses a continuing epidemic threat to China and to global health. Unfortunately, there is currently no specific effective treatment for the viral infection and its serious complications, and one is urgently needed. According to the Declaration of Helsinki and the International Ethical Guidelines for Health-related Research Involving Humans, desperately ill patients with 2019-nCoV infection during disease outbreaks have a moral right to try unvalidated medical interventions (UMIs), and it is therefore unethical to restrict access to UMIs to the clinical trial context.

There is a vital link between the intestinal tract and the respiratory tract, exemplified by intestinal complications during respiratory disease and vice versa. Some of these patients can develop secondary bacterial infections and antibiotic-associated diarrhea (AAD). A recent study of washed microbiota transplantation (WMT) as rescue therapy in critically ill patients with AAD demonstrated important clinical benefits and the safety of WMT. Additionally, a recent animal study provided direct evidence that antibiotics can deplete the gut microbiota and the lung stromal interferon signature and facilitate early influenza virus replication in lung epithelia. Importantly, these antibiotic-induced negative effects can be reversed by fecal microbiota transplantation (FMT), which suggests that FMT might induce a significant improvement in respiratory virus infection. Further evidence is that the microbiota can confer protection against certain viral infections, such as influenza virus and respiratory syncytial virus, by priming the immune response to viral evasion. These results suggest that FMT might be a new therapeutic option for the treatment of virus-related pneumonia. The FMT methodology that depends on automated facilities and a washing process in a laboratory was recently termed WMT. Patients who underwent WMT had a decreased rate of adverse events and unchanged clinical efficacy in ulcerative colitis and Crohn’s disease. This clinical trial aims to evaluate the outcome of WMT combined with standard therapy for patients with novel coronavirus pneumonia, especially those with dysbiosis-related conditions.


Responsible Party: Faming Zhang, Director of Medical Center for Digestive Diseases, The Second Hospital of Nanjing Medical University
ClinicalTrials.gov Identifier: NCT04251767

Study was withdrawn before participants were enrolled.


Therapy for Pneumonia Patients Infected by 2019 Novel Coronavirus (NCT04293692)

Study Description


Brief Summary:

The 2019 novel coronavirus pneumonia broke out in Wuhan, China, spread quickly to 26 countries worldwide and presented a serious threat to public health. It is mainly characterized by fever, dry cough, shortness of breath and breathing difficulties. Some patients may develop rapid and deadly respiratory system injury with overwhelming inflammation in the lung. Currently, there is no effective treatment in clinical practice. The present clinical trial is to explore the safety and efficacy of Human Umbilical Cord Mesenchymal Stem Cell (UC-MSC) therapy for novel coronavirus pneumonia patients.

Detailed Description:

In late December 2019, human pneumonia cases caused by a novel coronavirus (2019-nCoV) were first identified in Wuhan, China. As the virus is contagious and has caused a large epidemic, more and more cases have been found in other areas of China and abroad. Up to February 24, a total of 77,779 confirmed cases were reported in China. At present, there is no effective treatment for patients with novel coronavirus pneumonia. Therefore, it is urgent to explore more active therapeutic methods to cure these patients.

Recently, clinical studies of the 2019 novel coronavirus pneumonia published in The Lancet and The New England Journal of Medicine suggested that massive inflammatory cell infiltration and inflammatory cytokine secretion were found in patients’ lungs, and that alveolar epithelial cells and capillary endothelial cells were damaged, causing acute lung injury. It seems that the key to curing the pneumonia is to inhibit the inflammatory response, thereby reducing the damage to alveolar epithelial cells and endothelial cells and restoring the function of the lung.

Mesenchymal stem cells (MSCs) are widely used in basic research and clinical application. They have been shown to migrate to damaged tissues, exert anti-inflammatory and immunoregulatory functions, promote the regeneration of damaged tissues and inhibit tissue fibrosis. Studies have shown that MSCs can significantly reduce acute lung injury in mice caused by H9N2 and H5N1 viruses by reducing the levels of proinflammatory cytokines and the recruitment of inflammatory cells into the lungs. Compared with MSCs from other sources, human umbilical cord-derived MSCs (UC-MSCs) have been widely applied to various diseases due to their convenient collection, lack of ethical controversy, low immunogenicity, and rapid proliferation rate. In our recent research, we confirmed that UC-MSCs can significantly reduce inflammatory cell infiltration and inflammatory factor expression in lung tissue, and significantly protect lung tissue from endotoxin (LPS)-induced acute lung injury in mice.

The purpose of this clinical study is to investigate the safety and efficacy of UC-MSCs in treating pneumonia patients infected by 2019-nCoV. The investigators planned to recruit 48 patients aged 18 to 75 years with no severe underlying diseases. In the cell treatment group, 24 patients were to receive 0.5 × 10^6 UC-MSCs/kg body weight intravenously 4 times, every other day, in addition to conventional treatment. In the control group, the other 24 patients were to receive conventional treatment plus 4 intravenous doses of placebo. Lung CT, blood biochemical examination, lymphocyte subsets, inflammatory factors, 28-day mortality, etc. were to be evaluated within 24 h and at 1, 2, 4 and 8 weeks after UC-MSC treatment.


Sponsors and Collaborators: Puren Hospital Affiliated to Wuhan University of Science and Technology; Wuhan Hamilton Bio-technology Co., Ltd

Study was withdrawn before participants were enrolled.


Prognostic Factors in COVID-19 Patients Complicated With Hypertension (NCT04272710)

Study Description

Brief Summary:

There are currently no clinical studies reporting differences in clinical characteristics between hypertensive patients with and without ACEI treatment who contracted novel coronavirus infection in China.

Detailed Description:

At present, the outbreak of the new coronavirus (2019-nCoV) infection in Wuhan, Hubei province, has attracted great attention from the medical community across the country. Both 2019-nCoV and the SARS virus are coronaviruses, and they share substantial homology.

Published laboratory studies have suggested that SARS virus infection and the lung injury it causes are related to angiotensin-converting enzyme 2 (ACE2) in lung tissue. ACE and ACE2 in the renin-angiotensin system (RAS) are vital central links in maintaining hemodynamic stability and normal heart and kidney function in vivo.

A large body of evidence shows that ACE inhibitors are basic therapeutic drugs for controlling hypertension, reducing the risk of cardiovascular, cerebrovascular, and renal adverse events, improving quality of life, and prolonging life in patients with hypertension. Recent experimental studies suggest that treatment with ACE inhibitors can significantly reduce pulmonary inflammation and cytokine release caused by coronavirus infection.


Groups:

ACEI treatment: hypertensive patients receiving ACEI treatment when infected with the novel coronavirus in China

Without ACEI treatment: hypertensive patients not receiving ACEI treatment when infected with the novel coronavirus in China



The First Affiliated Hospital of Chongqing Medical University Chongqing, China

Sponsors and Collaborators: Chongqing Medical University


Responsible PI: Dongying Zhang, Associate Professor, Chongqing Medical University

Withdrawn (Similar projects have been registered, and it needs to be withdrawn.)


scPopCorn: A New Computational Method for Subpopulation Detection and their Comparative Analysis Across Single-Cell Experiments

Reporter and Curator: Dr. Sudipta Saha, Ph.D.


Present day technological advances have facilitated unprecedented opportunities for studying biological systems at single-cell level resolution. For example, single-cell RNA sequencing (scRNA-seq) enables the measurement of transcriptomic information of thousands of individual cells in one experiment. Analyses of such data provide information that was not accessible using bulk sequencing, which can only assess average properties of cell populations. Single-cell measurements, however, can capture the heterogeneity of a population of cells. In particular, single-cell studies allow for the identification of novel cell types, states, and dynamics.


One of the most prominent uses of the scRNA-seq technology is the identification of subpopulations of cells present in a sample and comparing such subpopulations across samples. Such information is crucial for understanding the heterogeneity of cells in a sample and for comparative analysis of samples from different conditions, tissues, and species. A frequently used approach is to cluster every dataset separately, inspect marker genes for each cluster, and compare these clusters in an attempt to determine which cell types were shared between samples. This approach, however, relies on the existence of predefined or clearly identifiable marker genes and their consistent measurement across subpopulations.


An alternative is to merge the datasets by global alignment and then cluster the aligned data to reveal subpopulations and their correspondence. However, solving the subpopulation-mapping problem by performing global alignment first and clustering second overlooks the original information about the subpopulations existing in each experiment. An approach that addresses this problem directly might be a more suitable solution. With this in mind, the researchers developed a computational method, single-cell subpopulations comparison (scPopCorn), that allows for comparative analysis of two or more single-cell populations.


The performance of scPopCorn was tested in three distinct settings. First, its potential was demonstrated in identifying and aligning subpopulations in human and mouse pancreatic single-cell data. Next, scPopCorn was applied to the task of aligning biological replicates of mouse kidney single-cell data, where it outperformed previously published tools. Finally, it was applied to compare populations of cells from cancer and healthy brain tissues, revealing the relation of neoplastic cells to neural cells and astrocytes. As a result of this integrative approach, scPopCorn provides a powerful tool for comparative analysis of single-cell populations.


scPopCorn is a computational method for identifying the subpopulations of cells present within individual single-cell experiments and mapping these subpopulations across experiments. Unlike other approaches, scPopCorn performs population identification and mapping simultaneously by optimizing a function that combines both objectives. When applied to complex biological data, scPopCorn outperforms previous methods. However, it should be kept in mind that scPopCorn assumes the input single-cell data consist of separable subpopulations; it is not designed for comparative analysis of datasets, such as single-cell trajectories, that do not fulfill this constraint.
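To make the idea of "a function that combines both objectives" concrete, here is a toy Python sketch. It is not scPopCorn's actual formulation, which this post does not describe; it simply assumes a k-means-style within-experiment cost plus a penalty on the distance between centroids of subpopulations that share a label across two experiments. All function names, the weight `lam` and the data are hypothetical.

```python
def centroid(points):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroids_by_label(cells, labels):
    """Centroid of each subpopulation, keyed by its label."""
    groups = {}
    for cell, label in zip(cells, labels):
        groups.setdefault(label, []).append(cell)
    return {label: centroid(members) for label, members in groups.items()}


def within_cost(cells, labels):
    """Identification term: how tight each subpopulation is."""
    cents = centroids_by_label(cells, labels)
    return sum(sq_dist(cell, cents[label]) for cell, label in zip(cells, labels))


def mapping_cost(cells_a, labels_a, cells_b, labels_b):
    """Mapping term: how far apart same-label subpopulations are."""
    ca = centroids_by_label(cells_a, labels_a)
    cb = centroids_by_label(cells_b, labels_b)
    return sum(sq_dist(ca[label], cb[label]) for label in ca.keys() & cb.keys())


def joint_objective(cells_a, labels_a, cells_b, labels_b, lam=1.0):
    """Single score combining identification and mapping (lower is better)."""
    return (within_cost(cells_a, labels_a)
            + within_cost(cells_b, labels_b)
            + lam * mapping_cost(cells_a, labels_a, cells_b, labels_b))


# Two hypothetical experiments, each with two well-separated subpopulations
exp_a = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
exp_b = [[0.5, 0.0], [0.5, 1.0], [5.5, 5.0], [5.5, 6.0]]
labels_a = [0, 0, 1, 1]

matched = joint_objective(exp_a, labels_a, exp_b, [0, 0, 1, 1])
swapped = joint_objective(exp_a, labels_a, exp_b, [1, 1, 0, 0])
```

Both labelings of the second experiment cluster its cells equally well, so a cluster-each-dataset-separately approach cannot tell them apart. Only the mapping term distinguishes them, which is why `matched` comes out far lower than `swapped`. Optimizing such a combined score over the labels, rather than just evaluating it, is the part this sketch leaves out.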


Several innovations developed in this work contributed to the performance of scPopCorn. First, unifying the above-mentioned tasks into a single problem statement allowed for integrating the signal from different experiments while identifying subpopulations within each experiment. Such integration helps reduce biological and experimental noise. The researchers believe that the ideas introduced in scPopCorn not only enabled the design of a highly accurate approach for subpopulation identification and mapping, but can also provide a stepping stone for other tools that interrogate the relationships between single-cell experiments.
















Reporter and Curator: Dr. Sudipta Saha, Ph.D.


RNA plays various roles in determining how the information in our genes drives cell behavior. One of its roles is to carry information encoded by our genes from the cell nucleus to the rest of the cell, where it can be acted on by other cell components. Researchers have now defined how RNA also participates in transmitting information outside cells; this RNA is known as extracellular RNA, or exRNA. This new role of RNA in cell-to-cell communication has led to new discoveries of potential disease biomarkers and therapeutic targets. The idea that cells use RNA to talk to each other is a significant shift in general thinking about RNA biology.


Researchers explored basic exRNA biology, including how exRNA molecules and their transport packages (or carriers) were made, how they were expelled by producer cells and taken up by target cells, and what the exRNA molecules did when they got to their destination. They encountered surprising complexity both in the types of carriers that transport exRNA molecules between cells and in the different types of exRNA molecules associated with the carriers. The researchers had to be exceptionally creative in developing molecular and data-centric tools to begin making sense of the complexity, and found that the type of carrier affected how exRNA messages were sent and received.


As couriers of information between cells, exRNA molecules and their carriers give researchers an opportunity to intercept exRNA messages to see if they are associated with disease. If scientists could change or engineer designer exRNA messages, it may be a new way to treat disease. The researchers identified potential exRNA biomarkers for nearly 30 diseases including cardiovascular disease, diseases of the brain and central nervous system, pregnancy complications, glaucoma, diabetes, autoimmune diseases and multiple types of cancer.


For example, some researchers found that exRNA in urine showed promise as a biomarker of muscular dystrophy, where current studies rely on markers obtained through painful muscle biopsies. Other researchers laid the groundwork for exRNA as a therapeutic, with preliminary studies demonstrating how exRNA molecules might be loaded into suitable carriers and the carriers targeted to intended recipient cells, and determining whether engineered carriers could have adverse side effects. Scientists engineered carriers with designer RNA messages to target lab-grown breast cancer cells displaying a certain protein on their surface. In an animal model of breast cancer with the cell surface protein, the researchers showed a reduction in tumor growth after engineered carriers deposited their RNA cargo.


In addition to the above work, the scientists created a catalog of exRNA molecules found in human biofluids such as plasma, saliva and urine. They analyzed over 50,000 samples from over 2000 donors, generating exRNA profiles for 13 biofluids. This included over 1000 exRNA profiles from healthy volunteers. The researchers found that exRNA profiles varied greatly among healthy individuals depending on characteristics like age and environmental factors like exercise. This means that exRNA profiles can give important and detailed information about health and disease, but careful comparisons need to be made with exRNA data generated from people with similar characteristics.


Next, the researchers will develop tools to efficiently and reproducibly isolate, identify and analyze different carrier types and their exRNA cargos, and to allow analysis of one carrier and its cargo at a time. These tools will be shared with the research community to fill gaps in the knowledge generated so far and to continue to move this field forward.



















Reporter: Stephen J. Williams, Ph.D.


The three-day symposium aims to bring oncologists and statisticians together to share new research, discuss novel ideas, ask questions and provide solutions for cancer clinical trials. In the era of big data, precision medicine, and genomics and immune-based oncology, it is crucial to provide a platform for interdisciplinary dialogue among clinical and quantitative scientists. The Stat4Onc Annual Symposium serves as a venue for oncologists and statisticians to communicate their views on trial design and conduct, drug development, and translation to patient care. Topics to be discussed include big data and genomics for oncology clinical trials, novel dose-finding designs, drug combinations, immune oncology clinical trials, and umbrella/basket oncology trials. An important aspect of Stat4Onc is the participation of researchers across academia, industry, and regulatory agencies.

The meeting agenda and program speakers will be announced soon.

Registration for the symposium is via NESS Society PayPal.

Read Full Post »

Hypertriglyceridemia: Evaluation and Treatment Guideline

Reporter and Curator: Dr. Sudipta Saha, Ph.D.


Severe and very severe hypertriglyceridemia increase the risk for pancreatitis, whereas mild or moderate hypertriglyceridemia may be a risk factor for cardiovascular disease. Individuals found to have any elevation of fasting triglycerides should be evaluated for secondary causes of hyperlipidemia including endocrine conditions and medications. Patients with primary hypertriglyceridemia must be assessed for other cardiovascular risk factors, such as central obesity, hypertension, abnormalities of glucose metabolism, and liver dysfunction. The aim of this study was to develop clinical practice guidelines on hypertriglyceridemia.

The diagnosis of hypertriglyceridemia should be based on fasting levels. Mild and moderate hypertriglyceridemia (triglycerides of 150–999 mg/dl) should be diagnosed to aid in the evaluation of cardiovascular risk, and severe and very severe hypertriglyceridemia (triglycerides of >1000 mg/dl) should be considered a risk for pancreatitis. Patients with hypertriglyceridemia must be evaluated for secondary causes of hyperlipidemia, and those with primary hypertriglyceridemia should be evaluated for family history of dyslipidemia and cardiovascular disease.

The treatment goal in patients with moderate hypertriglyceridemia should be a non-high-density lipoprotein cholesterol level in agreement with National Cholesterol Education Program Adult Treatment Panel guidelines. The initial treatment should be lifestyle therapy; a combination of diet modification, physical activity and drug therapy may also be considered. In patients with severe or very severe hypertriglyceridemia, a fibrate can be used as a first-line agent for reduction of triglycerides in patients at risk for triglyceride-induced pancreatitis.

Three drug classes (fibrates, niacin, n-3 fatty acids), alone or in combination with statins, may be considered as treatment options in patients with moderate to severe triglyceride levels. Statins should not be used as monotherapy for severe or very severe hypertriglyceridemia. However, statins may be useful for the treatment of moderate hypertriglyceridemia when indicated to modify cardiovascular risk.











Read Full Post »

Reporter and Curator: Dr. Sudipta Saha, Ph.D.


Once herpes simplex infects a person, the virus goes into hiding inside nerve cells, hibernating there for life and periodically waking up to reignite infection, causing cold sores or genital lesions to recur. Research from Harvard Medical School showed that the virus uses a host protein called CTCF, or cellular CCCTC-binding factor, to achieve this behavior. Experiments in mice revealed that CTCF helps herpes simplex regulate its own sleep-wake cycle, enabling the virus to establish latent infections in the body’s sensory neurons, where it remains dormant until reactivated. Preventing that latency-regulating protein from binding to the virus’s DNA weakened the virus’s ability to come out of hiding.


Herpes simplex virus’s ability to go in and out of hiding is a key survival strategy that ensures its propagation from one host to the next. Such symptom-free latency allows the virus to remain out of the reach of the immune system most of the time, while its periodic reactivation ensures that it can continue to spread from one person to the next. On one hand, so-called latency-associated transcript genes, or LAT genes, turn off the transcription of viral RNA, inducing the virus to go into hibernation, or latency. On the other hand, a protein made by a gene called ICP0 promotes the activity of genes that stimulate viral replication and causes active infection.


Building on these earlier findings, the new study revealed that this balancing act is enabled by the CTCF protein when it binds to the viral DNA. CTCF is present during latent, or dormant, infections and is lost during active, symptomatic infections. The researchers created an altered version of the virus that lacked two of the CTCF binding sites. The absence of the binding sites made no difference in early-stage or acute infections. Similar results were found in infected cultured human nerve cells (trigeminal ganglia) and in infected mouse models. The researchers concluded that the mutant virus had significantly weakened reactivation capacity.


Taken together, the experiments showed that deleting the CTCF binding sites weakened the virus’s ability to wake up from its dormant state, providing evidence that the CTCF protein is a key regulator of the sleep-wake cycle in herpes simplex infections.














Read Full Post »

Bioinformatics Tool Review: Genome Variant Analysis Tools

Curator: Stephen J. Williams, Ph.D.

Updated 11/15/2018

The following post will be an ongoing curation of reviews of gene variant bioinformatic software.


The Ensembl Variant Effect Predictor.

McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GR, Thormann A, Flicek P, Cunningham F.

Genome Biol. 2016 Jun 6;17(1):122. doi: 10.1186/s13059-016-0974-4.

Author information: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK. wm2@ebi.ac.uk; fiona@ebi.ac.uk.


The Ensembl Variant Effect Predictor is a powerful toolset for the analysis, annotation, and prioritization of genomic variants in coding and non-coding regions. It provides access to an extensive collection of genomic annotation, with a variety of interfaces to suit different requirements, and simple options for configuring and extending analysis. It is open source, free to use, and supports full reproducibility of results. The Ensembl Variant Effect Predictor can simplify and accelerate variant interpretation in a wide range of study designs.


Rare diseases can be difficult to diagnose due to the low incidence and incomplete penetrance of the implicated alleles; however, variant analysis of whole genome sequencing (WGS) data can identify the underlying genetic events responsible for the disease (Nature, 2015).  A large cohort is nevertheless required for many WGS association studies in order to produce enough statistical power for interpretation.  To this end, major sequencing projects have been initiated worldwide.

A more thorough curation of sequencing projects can be seen in the following post:

Icelandic Population Genomic Study Results by deCODE Genetics come to Fruition: Curation of Current genomic studies


Although sequencing costs have fallen dramatically over the years, the cost of determining the functional consequences of the resulting variants remains high: thorough basic research studies must be conducted to validate the interpretation of variant data with respect to the underlying disease, as only a small fraction of variants from a genome sequencing project will encode for a functional protein.  Correct annotation of sequences and variants, and identification of the correct corresponding reference genes or transcripts in GENCODE or RefSeq, pose considerable challenges to the proper identification of sequenced variants as potential functional variants.

To this effect, the authors developed the Ensembl Variant Effect Predictor (VEP), which is a software suite that performs annotations and analysis of most types of genomic variation in coding and non-coding regions of the genome.

Summary of Features

  • Annotation: VEP can annotate two broad categories of genomic variants
    • Sequence variants with specific and defined changes: indels, base substitutions, SNVs, tandem repeats
    • Larger structural variants > 50 nucleotides
  • Species and assembly/genomic database support: VEP can analyze data from any species with an assembled genome sequence and an annotated gene set. VEP supports genome assemblies such as the latest GRCh38, sequences in FASTA format, and transcripts from RefSeq as well as user-derived sequences
  • Transcript Annotation: VEP includes a wide variety of gene and transcript related information including NCBI Gene ID, Gene Symbol, Transcript ID, NCBI RefSeq ID, exon/intron information, and cross reference to other databases such as UniProt
  • Protein Annotation: Protein-related fields include Protein ID, RefSeq ID, SwissProt, UniParc ID, reference codons and amino acids, SIFT pathogenicity score, protein domains
  • Noncoding Annotation: VEP reports variants in noncoding regions including genomic regulatory regions, intronic regions, and transcription factor binding motifs. Data from ENCODE, BLUEPRINT, and the NIH Roadmap Epigenomics project are used for primary annotation.  Plugins to the Perl code are also available to link other databases which annotate noncoding sequence features.
  • Frequency, phenotype, and citation annotation: VEP searches Ensembl databases containing a large amount of germline variant information and checks variants against the dbSNP single nucleotide polymorphism database. VEP integrates with mutational databases such as COSMIC and the Human Gene Mutation Database, and with structural and copy number variants from the Database of Genomic Variants.  Allele frequencies are reported from 1000 Genomes and NHLBI, and VEP integrates with PubMed for literature annotation.  Phenotype information is from OMIM, Orphanet, and GWAS, and clinical information on variants is from ClinVar.
  • Flexible Input and Output Formats: VEP supports the input data format called “variant call format” or VCF, a standard in next-gen sequencing. VEP can also process variant identifiers from other database formats.  Output formats are tab-delimited and give the user choices in the presentation of results (HTML or text based)
  • Choice of user interface
    • Online tool (VEP Web): simple point and click; incorporates Instant VEP Functionality and copy and paste features. Results can be stored online in cloud storage on Ensembl.
    • VEP script: VEP is available as a downloadable Perl script (see below for link) and can process large amounts of data rapidly. This interface is highly flexible, with the ability to integrate multiple plugins available from Ensembl and GitHub.  Because users can alter the Perl code and add plugins and code functions, any feature of VEP can be modified.
    • VEP REST API: provides robust programmatic access from any programming language and returns basic variant annotation. Can make use of external plugins.
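
As a rough illustration of the REST interface listed above, the following Python sketch builds a request for the public Ensembl REST server's `vep/:species/hgvs/:hgvs_notation` endpoint and summarizes one returned record. The helper names (`build_vep_hgvs_url`, `fetch_vep`, `summarize`) are hypothetical, and the JSON field names used in `summarize` are assumptions that should be checked against the current Ensembl REST documentation before relying on them.

```python
import json
import urllib.parse
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"  # public Ensembl REST server


def build_vep_hgvs_url(hgvs, species="human"):
    """Build the URL for GET vep/:species/hgvs/:hgvs_notation."""
    # HGVS strings contain characters such as '>' that must be percent-encoded.
    encoded = urllib.parse.quote(hgvs, safe=":")
    return f"{ENSEMBL_REST}/vep/{species}/hgvs/{encoded}"


def fetch_vep(hgvs, species="human", timeout=30):
    """Query VEP over HTTP and return the decoded JSON (a list of records)."""
    request = urllib.request.Request(
        build_vep_hgvs_url(hgvs, species),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def summarize(record):
    """Pull a few fields out of one VEP record (field names are assumptions)."""
    consequences = record.get("transcript_consequences", [])
    genes = sorted({c["gene_symbol"] for c in consequences if c.get("gene_symbol")})
    return {
        "input": record.get("input"),
        "most_severe_consequence": record.get("most_severe_consequence"),
        "genes": genes,
    }
```

A call such as `[summarize(r) for r in fetch_vep("ENST00000366667:c.803C>T")]` would need network access and is subject to the server's rate limits; for large data sets the downloadable script is likely the better fit.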



Watch Video on VEP Instructional Webinar: https://youtu.be/7Fs7MHfXjWk

Watch Video on VEP Web Version training on How to Analyze Your Sequence in VEP



Availability of data and materials

The dataset supporting the conclusions of this article is available from Illumina’s Platinum Genomes [93] and using the Ensembl release 75 gene set. Pre-built data sets are available for all Ensembl and Ensembl Genomes species [94]. They can also be downloaded automatically during set up whilst installing the VEP.



Large-scale discovery of novel genetic causes of developmental disorders.

Deciphering Developmental Disorders Study.

Nature. 2015 Mar 12;519(7542):223-8. doi: 10.1038/nature14135. PMID: 25533962

Updated 11/15/2018


Research Points to Caution in Use of Variant Effect Prediction Bioinformatic Tools

Although we have the ability to use high throughput sequencing to identify allelic variants occurring in rare disease, correlation of these variants with the underlying disease is often difficult due to a few concerns:

  • For rare sporadic diseases, classical gene/variant association studies have proven difficult to perform (Meyts et al. 2016)
  • Whole Exome Sequencing (WES) returns a considerable number of variants, making it difficult to differentiate the normal allelic variation found in the human population from disease-causing pathogenic alleles
  • For rare diseases, pathogenic allele frequencies are generally low

Therefore, for these rare pathogenic alleles, the use of bioinformatics tools in order to predict the resulting changes in gene function may provide insight into disease etiology when validation of these allelic changes might be experimentally difficult.

In a 2017 Genes & Immunity paper, Line Lykke Andersen, Rune Hartmann, and colleagues tested the reliability of various bioinformatic software tools in predicting the functional consequences of variants of six different genes involved in interferon induction and of sixteen allelic variants of the IFNLR1 gene.  These variants were found in cohorts of patients presenting with herpes simplex encephalitis (HSE). Most of the adult population is seropositive for Herpes Simplex Virus (HSV); however, only a minor fraction of HSV-infected individuals (1 in 250,000 individuals per year) will develop HSE (Hjalmarsson et al., 2007).  It has been suggested that HSE occurs in individuals with rare primary immunodeficiencies caused by gene defects affecting innate immunity through reduced production of interferons (IFN) (Zhang et al., Lim et al.).



Meyts I, Bosch B, Bolze A, Boisson B, Itan Y, Belkadi A, et al. Exome and genome sequencing for inborn errors of immunity. J Allergy Clin Immunol. 2016;138:957–69.

Hjalmarsson A, Blomqvist P, Skoldenberg B. Herpes simplex encephalitis in Sweden, 1990-2001: incidence, morbidity, and mortality. Clin Infect Dis. 2007;45:875–80.

Zhang SY, Jouanguy E, Ugolini S, Smahi A, Elain G, Romero P, et al. TLR3 deficiency in patients with herpes simplex encephalitis. Science. 2007;317:1522–7.

Lim HK, Seppanen M, Hautala T, Ciancanelli MJ, Itan Y, Lafaille FG, et al. TLR3 deficiency in herpes simplex encephalitis: high allelic heterogeneity and recurrence risk. Neurology. 2014;83:1888–97.


Genes Immun. 2017 Dec 4. doi: 10.1038/s41435-017-0002-z.

Frequently used bioinformatics tools overestimate the damaging effect of allelic variants.

Andersen LL, Terczyńska-Dyla E, Mørk N, Scavenius C, Enghild JJ, Höning K, Hornung V, Christiansen M, Mogensen TH, Hartmann R.



We selected two sets of naturally occurring human missense allelic variants within innate immune genes. The first set represented eleven non-synonymous variants in six different genes involved in interferon (IFN) induction, present in a cohort of patients suffering from herpes simplex encephalitis (HSE) and the second set represented sixteen allelic variants of the IFNLR1 gene. We recreated the variants in vitro and tested their effect on protein function in a HEK293T cell based assay. We then used an array of 14 available bioinformatics tools to predict the effect of these variants upon protein function. To our surprise two of the most commonly used tools, CADD and SIFT, produced a high rate of false positives, whereas SNPs&GO exhibited the lowest rate of false positives in our test. As the problem in our test in general was false positive variants, inclusion of mutation significance cutoff (MSC) did not improve accuracy.


  1. Identification of rare variants
  2. Genomes of nineteen Dutch patients with a history of HSE were sequenced by WES, and novel HSE-causing variants were identified by filtering for single nucleotide polymorphisms (SNPs) that had a frequency below 1% in the NHLBI Exome Sequencing Project Exome Variant Server and the 1000 Genomes Project and were present within 204 genes involved in the immune response to HSV.
  3. Identified variants (204) were manually evaluated for involvement in IFN induction based on IDBase and KEGG pathway database analysis.
  4. In-silico predictions: Variants were classified by the in silico variant pathogenicity prediction programs SIFT, Mutation Assessor, FATHMM, PROVEAN, SNAP2, PolyPhen2, PhD-SNP, SNPs&GO, FATHMM-MKL, MutationTaster2, PredictSNP, Condel, Meta-SNP, and CADD. Each program returned prediction scores measuring the likelihood of a variant being either ‘deleterious’ or ‘neutral’. Prediction accuracy was measured as

ACC = (true positive+true negative)/(true positive+true negative+false positive+false negative)
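
The accuracy formula above can be sketched in a few lines of Python. The function name `accuracy` is illustrative only, and the worked example assumes the SIFT row for the HSE variants in Table 2 reads TN=4, TP=1, FN=0, FP=4, which is an interpretation of the column order based on the table note.

```python
def accuracy(tp, tn, fp, fn):
    """ACC = (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one classified variant is required")
    return (tp + tn) / total


# Worked example (assumed reading of the SIFT / HSE row of Table 2):
# 1 true positive, 4 true negatives, 4 false positives, 0 false negatives.
sift_hse_acc = accuracy(tp=1, tn=4, fp=4, fn=0)
print(f"SIFT accuracy on the HSE variants: {sift_hse_acc:.2f}")
```

Because accuracy treats both error types equally, it does not by itself show whether a tool's mistakes are mostly false positives or false negatives; the individual counts are still needed for that.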


  1. Validation of prediction software/tools

In order to validate the predictive value of the software, HEK293T cells, deficient in IRF3, MAVS, and IKKε/TBK1, were cotransfected with the nine variants of the aforementioned genes and a luciferase reporter under control of the IFN-β promoter, and luciferase activity was measured as an indicator of IFN signaling function.  Western blot was performed to confirm the expression of the constructs.



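Table 2. Summary of the bioinformatic predictions

Tool | HSE TN | HSE TP | HSE FN | HSE FP | HSE total | HSE ACC | IFNLR1 TN | IFNLR1 TP | IFNLR1 FN | IFNLR1 FP | IFNLR1 total | IFNLR1 ACC | Overall ACC
Uniform cutoff
SIFT | 4 | 1 | 0 | 4 | 9 | 0.56 | 8 | 1 | 0 | 7 | 16 | 0.56 | 0.56
Mutation Assessor | 6 | 1 | 0 | 2 | 9 | 0.78 | 9 | 1 | 0 | 6 | 16 | 0.63 | 0.68
FATHMM | 7 | 1 | 0 | 1 | 9 | 0.89 | | | | | | | 0.89
PROVEAN | 8 | 1 | 0 | 0 | 9 | 1.00 | 11 | 1 | 0 | 4 | 16 | 0.75 | 0.84
SNAP2 | 5 | 1 | 0 | 3 | 9 | 0.67 | 8 | 0 | 1 | 7 | 16 | 0.50 | 0.56
PolyPhen2 | 6 | 1 | 0 | 2 | 9 | 0.78 | 12 | 1 | 0 | 3 | 16 | 0.81 | 0.80
PhD-SNP | 7 | 1 | 0 | 1 | 9 | 0.89 | 11 | 1 | 0 | 4 | 16 | 0.75 | 0.80
SNPs&GO | 8 | 1 | 0 | 0 | 9 | 1.00 | 14 | 1 | 0 | 1 | 16 | 0.94 | 0.96
FATHMM-MKL | 4 | 1 | 0 | 4 | 9 | 0.56 | 13 | 0 | 1 | 2 | 16 | 0.81 | 0.72
MutationTaster2 | 4 | 0 | 1 | 4 | 9 | 0.44 | 14 | 0 | 1 | 1 | 16 | 0.88 | 0.72
PredictSNP | 6 | 1 | 0 | 2 | 9 | 0.78 | 11 | 1 | 0 | 4 | 16 | 0.75 | 0.76
Condel | 6 | 1 | 0 | 2 | 9 | 0.78 | | | | | | | 0.78
Meta-SNP | 8 | 1 | 0 | 0 | 9 | 1.00 | 11 | 1 | 0 | 4 | 16 | 0.75 | 0.84
CADD | 2 | 1 | 0 | 6 | 9 | 0.33 | 8 | 0 | 1 | 7 | 16 | 0.50 | 0.44
MSC 95% cutoff
SIFT | 5 | 1 | 0 | 3 | 9 | 0.67 | 8 | 1 | 0 | 8 | 16 | 0.50 | 0.56
PolyPhen2 | 6 | 1 | 0 | 2 | 9 | 0.78 | 13 | 1 | 0 | 3 | 16 | 0.81 | 0.80
CADD | 4 | 1 | 0 | 4 | 9 | 0.56 | 7 | 0 | 1 | 9 | 16 | 0.44 | 0.48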


Note: TN: true negative, TP: true positive, FN: false negative, FP: false positive, ACC: accuracy

Functional testing (data obtained from reporter construct experiments) was considered the correct outcome.

Three prediction tools (PROVEAN, SNPs&GO, and Meta-SNP) correctly predicted the effect of all nine variants tested.
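
To make the comparison between tools more concrete, the sketch below pools the HSE and IFNLR1 counts for three tools from the uniform-cutoff part of Table 2 and derives the overall accuracy and the false positive rate. The counts assume the table columns read TN, TP, FN, FP, and the false positive rate, FP / (FP + TN), is the standard definition rather than a metric reported in the paper; the `Counts` class is purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    tn: int
    tp: int
    fn: int
    fp: int

    def __add__(self, other):
        # Pool two variant sets by summing each cell of the confusion matrix.
        return Counts(self.tn + other.tn, self.tp + other.tp,
                      self.fn + other.fn, self.fp + other.fp)

    @property
    def accuracy(self):
        return (self.tp + self.tn) / (self.tn + self.tp + self.fn + self.fp)

    @property
    def false_positive_rate(self):
        # Share of functionally neutral variants that were called deleterious.
        return self.fp / (self.fp + self.tn)


# (HSE counts, IFNLR1 counts), uniform cutoff, assumed column order TN, TP, FN, FP.
TABLE2 = {
    "SIFT": (Counts(4, 1, 0, 4), Counts(8, 1, 0, 7)),
    "SNPs&GO": (Counts(8, 1, 0, 0), Counts(14, 1, 0, 1)),
    "CADD": (Counts(2, 1, 0, 6), Counts(8, 0, 1, 7)),
}

pooled = {tool: hse + ifnlr1 for tool, (hse, ifnlr1) in TABLE2.items()}
ranking = sorted(pooled, key=lambda tool: pooled[tool].false_positive_rate)

for tool in ranking:
    c = pooled[tool]
    print(f"{tool}: ACC={c.accuracy:.2f}, FPR={c.false_positive_rate:.2f}")
```

Pooling the counts in this way reproduces the overall ACC values listed for these three tools, and sorting by false positive rate places SNPs&GO ahead of SIFT and CADD, in line with the authors' observation about false positives.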


Other articles related to Genomics and Bioinformatics on this online Open Access Journal Include:

Finding the Genetic Links in Common Disease: Caveats of Whole Genome Sequencing Studies


Large-scale sequencing does not support the idea that lower-frequency variants have a major role in predisposition to type 2 diabetes


US Personalized Cancer Genome Sequencing Market Outlook 2018 –


Icelandic Population Genomic Study Results by deCODE Genetics come to Fruition: Curation of Current genomic studies



Read Full Post »

Reporter and Curator: Dr. Sudipta Saha, Ph.D.


Researchers have embraced CRISPR gene-editing as a method for altering genomes, but some have reported that unwanted DNA changes may slip by undetected. The tool can cause large DNA deletions and rearrangements near its target site on the genome. Such alterations can confuse the interpretation of experimental results and could complicate efforts to design therapies based on CRISPR. The finding is in line with previous results from not only CRISPR but also other gene-editing systems.


CRISPR-Cas9 gene editing relies on the Cas9 enzyme to cut DNA at a particular target site. The cell then attempts to reseal this break using its DNA repair mechanisms. These mechanisms do not always work perfectly, and sometimes segments of DNA will be deleted or rearranged, or unrelated bits of DNA will become incorporated into the chromosome.


Researchers often use CRISPR to generate small deletions in the hope of knocking out a gene’s function. But when examining CRISPR edits, researchers found large deletions (often several thousand nucleotides) and complicated rearrangements of DNA sequences in which previously distant DNA sequences were stitched together. Many researchers use a method for amplifying short snippets of DNA to test whether their edits have been made properly. But this approach might miss larger deletions and rearrangements.


These deletions and rearrangements occur only with gene-editing techniques that rely on DNA cutting, and not with other types of CRISPR modifications that avoid cutting DNA. Examples include a modified CRISPR system that switches one nucleotide for another without cutting DNA, and systems that use inactivated Cas9 fused to other enzymes to turn genes on or off or to target RNA. Overall, these unwanted edits are a problem that deserves more attention, but this should not stop anyone from using CRISPR. Those who do use it simply need to analyze the outcome more thoroughly.




















Read Full Post »

Reporter and Curator: Dr. Sudipta Saha, Ph.D.


Long interspersed nuclear element 1 (LINE1) is repeated half a million times in the human genome, making up nearly a fifth of the DNA in every cell. Yet it was long neglected, which may be why it came to be called junk DNA. LINE1, like other transposons (or “jumping genes”), has the unusual ability to copy and insert itself in random places in the genome. Several research groups have uncovered possible roles for it in early mouse embryos and in brain cells, but no one had firmly established the functions of LINE1.


Geneticists paid attention to LINE1 when it was found to cause cancer or genetic disorders like hemophilia. But researchers at the University of California, San Francisco suspected there was more to LINE1: if it could be largely harmless in most cases, it could also be seriously harmful in others.


Many reports showed that LINE1 is especially active inside developing embryos, which suggests that the segment actually plays a key role in coordinating the development of cells in an embryo. Researchers at University of California at San Francisco figured out how to turn LINE1 off in mouse embryos by blocking LINE1 RNA. As a result the embryos got stuck in the two-cell stage, right after a fertilized egg has first split. Without LINE1, embryos essentially stopped developing.


The researchers thought that LINE1 RNA particles act as molecular “glue,” bringing together a suite of molecules that switch off the two-cell stage and kick it into the next phase of development. In particular, it turns off a gene called Dux, which is active in the two-cell stage.


LINE1’s ability to copy itself, however, seems to have nothing to do with its role in embryonic development. When LINE1 was blocked from inserting itself into the genome, the embryonic stem cells remained unaffected. It’s possible that cells in embryos have a way of making LINE1 RNA while also preventing its potentially harmful “jumping” around in the genome. But it’s unlikely that every one of the thousands of copies of LINE1 is actually being used to regulate embryonic development.


LINE1 is abundant in the genomes of almost all mammals. Other transposons, also once considered junk DNA, have turned out to have critical roles in development in human cells too. Because there are differences between mice and humans, the next obvious step is to study LINE1 in human cells, where it makes up 17 percent of the genome.














Read Full Post »
