Archive for the ‘Computational Biology/Systems and Bioinformatics’ Category

Live Conference Coverage AACR 2020 in Real Time: Monday June 22, 2020 Mid Day Sessions

Reporter: Stephen J. Williams, PhD

This post will be UPDATED during the next two days with notes from recordings from other talks

Register for FREE at https://www.aacr.org/




June 22-24: Free Registration for AACR Members, the Cancer Community, and the Public
This virtual meeting will feature more than 120 sessions and 4,000 e-posters, including sessions on cancer health disparities and the impact of COVID-19 on clinical trials


This Virtual Meeting is Part II of the AACR Annual Meeting.  Part I was held online in April and was centered only on clinical findings.  This Part II of the virtual meeting will contain all the Sessions and Abstracts pertaining to basic and translational cancer research as well as clinical trial findings.




Pezcoller Foundation-AACR International Award for Extraordinary Achievement in Cancer Research

The prestigious Pezcoller Foundation-AACR International Award for Extraordinary Achievement in Cancer Research was established in 1997 to annually recognize a scientist of international renown who has made a major scientific discovery in basic cancer research OR who has made significant contributions to translational cancer research; who continues to be active in cancer research and has a record of recent, noteworthy publications; and whose ongoing work holds promise for continued substantive contributions to progress in the field of cancer. For more information regarding the 2020 award recipient go to aacr.org/awards.

John E. Dick, Enzo Galligioni, David A Tuveson


Awardee: John E. Dick
Princess Margaret Cancer Centre, Toronto, Ontario
For determining how stem cells contribute to normal and leukemic hematopoiesis
  • not every cancer cell is equal in its cancer hallmarks
  • how do we monitor and measure clonal dynamics?
  • Barnie Clarkson did pivotal work on this
  • most cancer cells are post-mitotic, but minor populations of cells are dormant and survive chemotherapy
  • only about one cell in a million can regenerate and be transplanted in mice; flow cytometry experiments resolved the question of potency, showing that only a small percentage of cells repopulate and sustain long-term clonal growth
  • so instead of going to cell lines and screening thousands of shRNAs, they looked at clinical data and deconvoluted the genetic information (RNA-Seq data) to determine progenitor and mature populations (how much is stem and how much is mature)
  • in leukemic patients they have seen massive expansion of a single stem cell population, so only one cell is needed in AML if the stem cells acquire the mutational hits early in their development
  • finding the “seeds of relapse”: finding the small subpopulation of stem cells that will drive relapse
  • they looked in B-ALL; there are cells resistant to L-asparaginase, dexamethasone, and vincristine
  • a lot of OXPHOS-related genes (in DRIs) may be the genes involved in this resistance
  • in a wonderful note of acknowledgement, he dedicated this award to all of his past and present trainees who, as he said, made this field what it is and took it in directions none of them could foresee

Monday, June 22

1:30 PM – 3:30 PM EDT

Virtual Educational Session

Experimental and Molecular Therapeutics, Drug Development, Cancer Chemistry

Chemistry to the Clinic: Part 1: Lead Optimization Case Studies in Cancer Drug Discovery

How can one continue to deliver innovative medicines to patients when biological targets are becoming ever scarcer and less amenable to therapeutic intervention? Are there sound strategies in place that can clear the path to targets previously considered “undruggable”? Recent advances in lead finding methods and novel technologies such as covalent screening and targeted protein degradation have enriched the toolbox at the disposal of drug discovery scientists to expand the druggable ta

Stefan N Gradl, Elena S Koltun, Scott D Edmondson, Matthew A. Marx, Joachim Rudolph


Monday, June 22

1:30 PM – 3:30 PM EDT

Virtual Educational Session

Bioinformatics and Systems Biology, Molecular and Cellular Biology/Genetics

Informatics Technologies for Cancer Research

Cancer researchers are faced with a deluge of high-throughput data. Using these data to advance understanding of cancer biology and improve clinical outcomes increasingly requires effective use of computational and informatics tools. This session will introduce informatics resources that support the data management, analysis, visualization, and interpretation. The primary focus will be on high-throughput genomic data and imaging data. Participants will be introduced to fundamental concepts

Rachel Karchin, Daniel Marcus, Andriy Fedorov, Obi Lee Griffith


  • Variant analysis is the big bottleneck, especially interpretation of variants
  • the CIViC resource is a network for curation and interpretation of genetic variants
  • CIViC curations go through multiple rounds of editor review
  • gene summaries, variant summaries
  • curation follows ACSME guidelines
  • evidence is accumulated and categorized by various ontologies, and is the heart of the reports
  • as this is a network of curators, the knowledgebase expands
  • CIViC is linked to multiple external informatic, clinical, and genetic databases
  • they have curated 7017 clinical interpretations and 2527 variants, using 2578 papers, with over 1000 curators
  • they are currently integrating with COSMIC, ClinVar, and UniProt
  • they are partnering with ClinGen to expand the network of curators and their curation effort
  • CIViC offers a Python interface, available on the website


The Precision Medicine Revolution

Precision medicine refers to the use of prevention and treatment strategies that are tailored to the unique features of each individual and their disease. In the context of cancer this might involve the identification of specific mutations shown to predict response to a targeted therapy. The biomedical literature describing these associations is large and growing rapidly. Currently these interpretations exist largely in private or encumbered databases resulting in extensive repetition of effort.

CIViC’s Role in Precision Medicine

Realizing precision medicine will require this information to be centralized, debated and interpreted for application in the clinic. CIViC is an open access, open source, community-driven web resource for Clinical Interpretation of Variants in Cancer. Our goal is to enable precision medicine by providing an educational forum for dissemination of knowledge and active discussion of the clinical significance of cancer genome alterations. For more details refer to the 2017 CIViC publication in Nature Genetics.

U24 funding announced: We are excited to announce that the Informatics Technology for Cancer Research (ITCR) program of the National Cancer Institute (NCI) has awarded funding to the CIViC team! Starting this year, a five-year, $3.7 million U24 award (CA237719) will support CIViC to develop Standardized and Genome-Wide Clinical Interpretation of Complex Genotypes for Cancer Precision Medicine.

Informatics tools for high-throughput analysis of cancer mutations

Rachel Karchin
  • CRAVAT is a platform to determine, categorize, and curate cancer mutations and cancer related variants
  • adding new tools used to be hard but having an open architecture allows for modular growth and easy integration of other tools
  • so they are actively making an open network using social media

Towards FAIR data in cancer imaging research

Andriy Fedorov, PhD

Towards the FAIR principles

While LOD has had some uptake across the web, the number of databases using this protocol compared to the other technologies is still modest. But whether or not we use LOD, we do need to ensure that databases are designed specifically for the web and for reuse by humans and machines. To provide guidance for creating such databases independent of the technology used, the FAIR principles were issued through FORCE11: the Future of Research Communications and e-Scholarship. The FAIR principles put forth characteristics that contemporary data resources, tools, vocabularies and infrastructures should exhibit to assist discovery and reuse by third parties through the web (Wilkinson et al., 2016). FAIR stands for: Findable, Accessible, Interoperable and Reusable. The definition of FAIR is provided in Table 1:

Number Principle
F Findable
F1 (meta)data are assigned a globally unique and persistent identifier
F2 data are described with rich metadata
F3 metadata clearly and explicitly include the identifier of the data it describes
F4 (meta)data are registered or indexed in a searchable resource
A Accessible
A1 (meta)data are retrievable by their identifier using a standardized communications protocol
A1.1 the protocol is open, free, and universally implementable
A1.2 the protocol allows for an authentication and authorization procedure, where necessary
A2 metadata are accessible, even when the data are no longer available
I Interoperable
I1 (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
I2 (meta)data use vocabularies that follow FAIR principles
I3 (meta)data include qualified references to other (meta)data
R Reusable
R1 (meta)data are richly described with a plurality of accurate and relevant attributes
R1.1 (meta)data are released with a clear and accessible data usage license
R1.2 (meta)data are associated with detailed provenance
R1.3 (meta)data meet domain-relevant community standards

A detailed explanation of each of these is included in the Wilkinson et al., 2016 article, and the Dutch Techcentre for Life Sciences has a set of excellent tutorials, so we won’t go into too much detail here.
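As a rough illustration of how a few of the principles in Table 1 might be checked against a dataset's metadata record, here is a minimal Python sketch. The field names (`identifier`, `description`, `keywords`, `access_url`, `license`, `provenance`) and the example record are hypothetical; FAIR itself does not prescribe a schema, and only a subset of the principles lends itself to this kind of automated check.

```python
# Minimal sketch: map a handful of FAIR principles to simple checks on a
# metadata dictionary. All field names are assumptions made for illustration.
FAIR_CHECKS = {
    "F1": lambda m: bool(m.get("identifier")),                    # persistent identifier present
    "F2": lambda m: bool(m.get("description")) and bool(m.get("keywords")),  # rich metadata
    "A1": lambda m: str(m.get("access_url", "")).startswith(("http://", "https://")),  # standard protocol
    "R1.1": lambda m: bool(m.get("license")),                     # usage license stated
    "R1.2": lambda m: bool(m.get("provenance")),                  # provenance recorded
}


def fair_report(metadata):
    """Return {principle code: True/False} for the checks defined above."""
    return {code: check(metadata) for code, check in FAIR_CHECKS.items()}


def missing_principles(metadata):
    """Return the sorted list of principle codes whose check failed."""
    return sorted(code for code, ok in fair_report(metadata).items() if not ok)


# Hypothetical record for an imaging dataset (no license or provenance given).
record = {
    "identifier": "doi:10.0000/example",
    "description": "CT series from an example imaging collection",
    "keywords": ["CT", "lung"],
    "access_url": "https://example.org/data/123",
    "license": "",
}

if __name__ == "__main__":
    print(missing_principles(record))
```

A real assessment would also need human judgement, for example on whether the vocabularies used are themselves FAIR (I2) or whether the metadata meet community standards (R1.3).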

  • for outside vendors to access their data, vendors would need a signed Material Transfer Agreement, but NCI has formulated a framework to facilitate sharing of data using the DICOM standard for imaging data

Monday, June 22

1:30 PM – 3:01 PM EDT

Virtual Educational Session

Experimental and Molecular Therapeutics, Cancer Chemistry, Drug Development, Immunology

Engineering and Physical Sciences Approaches in Cancer Research, Diagnosis, and Therapy

The engineering and physical science disciplines have been increasingly involved in the development of new approaches to investigate, diagnose, and treat cancer. This session will address many of these efforts, including therapeutic methods such as improvements in drug delivery/targeting, new drugs and devices to effect immunomodulation and to synergize with immunotherapies, and intraoperative probes to improve surgical interventions. Imaging technologies and probes, sensors, and bioma

Claudia Fischbach, Ronit Satchi-Fainaro, Daniel A Heller


Monday, June 22

1:30 PM – 3:30 PM EDT

Virtual Educational Session


Exceptional Responders and Long-Term Survivors

How should we think about exceptional and super responders to cancer therapy? What biologic insights might ensue from considering these cases? What are ways in which considering super responders may lead to misleading conclusions? What are the pros and cons of the quest to locate exceptional and super responders?

Alice P Chen, Vinay K Prasad, Celeste Leigh Pearce


Monday, June 22

1:30 PM – 3:30 PM EDT

Virtual Educational Session

Tumor Biology, Immunology

Exploiting Metabolic Vulnerabilities in Cancer

The reprogramming of cellular metabolism is a hallmark feature observed across cancers. Contemporary research in this area has led to the discovery of tumor-specific metabolic mechanisms and illustrated ways that these can serve as selective, exploitable vulnerabilities. In this session, four international experts in tumor metabolism will discuss new findings concerning the rewiring of metabolic programs in cancer that support metabolic fitness, biosynthesis, redox balance, and the reg

Costas Andreas Lyssiotis, Gina M DeNicola, Ayelet Erez, Oliver Maddocks


Monday, June 22

1:30 PM – 3:30 PM EDT

Virtual Educational Session

Other Articles on this Open Access Online Journal on Cancer Conferences and Conference Coverage in Real Time Include

Press Coverage

Live Notes, Real Time Conference Coverage 2020 AACR Virtual Meeting April 28, 2020 Symposium: New Drugs on the Horizon Part 3 12:30-1:25 PM

Live Notes, Real Time Conference Coverage 2020 AACR Virtual Meeting April 28, 2020 Session on NCI Activities: COVID-19 and Cancer Research 5:20 PM

Live Notes, Real Time Conference Coverage 2020 AACR Virtual Meeting April 28, 2020 Session on Evaluating Cancer Genomics from Normal Tissues Through Metastatic Disease 3:50 PM

Live Notes, Real Time Conference Coverage 2020 AACR Virtual Meeting April 28, 2020 Session on Novel Targets and Therapies 2:35 PM


Live Conference Coverage AACR 2020 in Real Time: Monday June 22, 2020 8AM-Noon Sessions

Reporter: Stephen J. Williams, PhD


Monday, June 22

8:30 AM – 10:10 AM EDT

Virtual Special Session

Opening Ceremony

The Opening Ceremony will include the following presentations:
Welcome from AACR CEO Margaret Foti, PhD, MD (hc)
American Association for Cancer Research
Philadelphia, Pennsylvania

  • Dr. Foti mentions that AACR is making progress toward greater ethnic and gender equality in cancer research, and she feels that the disparities seen in health care, and in cancer care, are related to the disparities seen in the cancer research profession
  • AACR is very focused now on blood cancers and creating innovation summits on this matter
  • In 2019 AACR awarded over 60 grants, and it expects to be able to fund more research in 2020
  • Government funding is insufficient at current levels

Remarks from AACR Immediate Past President Elaine R. Mardis, PhD, FAACR

  • involved in planning and success of the first virtual meeting (it was really well done)
  • the number of registrants was unprecedented
  • the scope for this meeting will be wider than the first meeting
  • they have included special sessions, including on COVID-19 and health disparities
  • 70 educational and methodology workshops on over 70 channels

AACR Award for Lifetime Achievement in Cancer Research

  • Dr. Phillip Sharp is the awardee of the Lifetime Achievement Award
  • Dr. Sharp is known for his work in RNA splicing and development of multiple cancer models, including a mouse CRISPR model
  • worked under Jim Watson at Cold Spring Harbor

Presentation of New Fellows of the AACR Academy

  • Dr. Ratcliffe for hypoxic factors
  • CAR-T therapies
  • Dr. Semenza for HIF-1 discovery
  • Dr. Swanton for stratification of patients and tumor heterogeneity
  • these are just some of the new fellows

AACR-Biedler Prizes for Cancer Journalism

  • Writer of the article “War of Nerves” awarded; reported on nerve innervation of tumors
  • writer Budman on reporting and curation of hedgehog inhibitors in cancers
  • patient advocacy book was awarded for journalism
  • cancer survivor Kasie Newsome produced multiple segments on personalized cancer therapy from a cancer survivor perspective

Remarks from Speaker of the United States House of Representatives Nancy Pelosi

  • helped secure a doubling of funding for NCI and NIH in the 90s
  • securing COVID funding to offset some of the productivity issues related to the shutdown due to COVID
  • advocating for more work to alleviate health disparities


Remarks from United States Senator Roy Blunt

  • tireless champion in the Senate for cancer research funding; he was a cancer survivor himself
  • we need to keep focus on advances in science

Margaret Foti


Monday, June 22

10:10 AM – 12:30 PM EDT

Virtual Plenary Session

Bioinformatics and Systems Biology, Epidemiology, Immunology, Molecular and Cellular Biology/Genetics

Opening Plenary Session: Turning Science into Lifesaving Care

Alexander Marson, Antoni Ribas, Ashani T Weeraratna, Olivier Elemento, Howard Y Chang, Daniel D. De Carvalho


Monday, June 22

12:45 PM – 1:30 PM EDT

Awards and Lectures



Monday, June 22

1:30 PM – 3:30 PM EDT

Virtual Educational Session

Tumor Biology, Immunology

Experimental and Molecular Therapeutics, Immunology


Live Notes, Real Time Conference Coverage 2020 AACR Virtual Meeting April 28, 2020 Session on Early Detection and ctDNA 1:35 – 3:55 PM

Reporter: Stephen J. Williams, PhD

Alberto Bardelli

  • circulating tumor DNA has been around but with NGS now we can have more specificity in analyzing ctDNA
  • interest lately in using liquid biopsy to gain insight on tumor heterogeneity versus single needle biopsy of the solid tumor
  • these talks will however be on ctDNA as a diagnostic and therapeutic monitoring modality

Prediction of cancer and tissue of origin in individuals with suspicion of cancer using a cell-free DNA multi-cancer early detection test
David Thiel 


  • the test has a specificity over 90% and is intended to be used along with guidelines
  • The Circulating Cell-free Genome Atlas (CCGA) study (clinical trial NCT02889978) is divided into three substudies: highest performing assay, refining assay, validation of assays
  • methylation-based assays (using bisulfite sequencing) worked better than the sequencing-based assays
  • used a machine learning algorithm to help refine the assay
  • prediction was >90%; subgroup for high clinical suspicion (HCS) of cancer
  • HCS sensitivity was 100% and specificity very high, but sensitivity on the training set was 40% and results may have been confounded by including kidney cancer
  • tissue of origin (TOO) was predicted in greater than 99% in both training and validation sets
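The sensitivity and specificity figures quoted in these notes are ratios computed from a table of test calls versus true cancer status. A minimal Python sketch of that arithmetic follows; the counts are invented for illustration and are not data from the CCGA study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts of test calls versus true disease status."""
    tp: int  # cancer present, test positive
    fn: int  # cancer present, test negative
    tn: int  # cancer absent, test negative
    fp: int  # cancer absent, test positive

    def sensitivity(self) -> float:
        # Fraction of true cancers that the test detects.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Fraction of cancer-free individuals that the test correctly clears.
        return self.tn / (self.tn + self.fp)


# Hypothetical counts chosen only to illustrate the calculation.
example = ConfusionCounts(tp=40, fn=60, tn=495, fp=5)

if __name__ == "__main__":
    print(f"sensitivity = {example.sensitivity():.2f}")
    print(f"specificity = {example.specificity():.2f}")
```

With counts like these a test can look excellent on specificity while still missing most cancers, which is why both numbers are reported for each subgroup.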

A first-of-its-kind prospective study of a multi-cancer blood test to screen and manage 10,000 women with no history of cancer

  • DETECT-A study: prospective interventional study; can a multi-cancer blood test be used prospectively, and can it lead to personalized care? Can the screen be used to complement current therapy?
  • 10,000 women aged 65-75; these women could not have had a previous cancer; the study was conducted through the Geisinger Health Network; the multi-cancer test detects DNA and protein and was used with standard-of-care screening
  • the study focused on safety, so a committee was consulted on each case, and a diagnostic PET-CT was used
  • the blood test alone did not perform well, but combined with protein and CT scans, detection was much higher (5-fold increase) for breast cancer

Nickolas Papadopoulos


David Huntsman

  • there are multiple opportunities, yet there are still challenges in using these cell-free tests for therapeutic monitoring, diagnosis, and screening; sensitivities for some cancers are still too low for large-scale screening, although the tests can supplement current screening guidelines
  • we have to ask about false positive rate and need to concentrate on prospective studies
  • we must consider how tests will be used, population health studies will need to show improved survival
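The concern about false positives in population screening can be made concrete with Bayes' rule: at low prevalence, even a highly specific test yields many false positives relative to true positives. The sketch below is illustrative only; the sensitivity, specificity, and prevalence values are assumptions, not figures from the talks.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a positive test result is a true cancer (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Assumed test characteristics, evaluated at several assumed prevalences.
    for prevalence in (0.001, 0.01, 0.10):
        ppv = positive_predictive_value(0.40, 0.99, prevalence)
        print(f"prevalence {prevalence:.3f}: PPV {ppv:.3f}")
```

The same test performs very differently in a general screening population than in a group with high clinical suspicion of cancer, which is one reason prospective studies in the intended population matter.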


Phylogenetic tracking and minimal residual disease detection using ctDNA in early-stage NSCLC: A lung TRACERx study
Chris Abbosh @ucl

  • TRACERx study in collaboration with Charles Swanton.
  • multiplex PCR to track 200 SNVs: correlate tumor tissue biopsy with ctDNA
  • a spike-in assay shows very good sensitivity and specificity for the SNVs tracked; over 400 TRACERx libraries were run
  • sensitivity increases when tracking more variants, but specificity does go down a bit
  • tracking variants can show evidence of subclonal dynamics and evolution and copy number deletion events; tumors also show neoantigen editing, or changing of their neoantigens
  • this assay can detect low-frequency variants in a reproducible manner
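The trade-off noted above (more tracked variants raises sensitivity but lowers specificity) can be sketched with a simple probability model. This is a toy model, not the TRACERx calling algorithm: it assumes each variant is detected independently and that a sample is called positive if at least one tracked variant is detected. The per-variant probabilities are invented for illustration.

```python
def prob_at_least_one(per_variant_prob, n_variants):
    """Probability that at least one of n independent variants is called."""
    return 1.0 - (1.0 - per_variant_prob) ** n_variants


def sample_sensitivity(per_variant_detection, n_variants):
    # Chance of a positive call when ctDNA is truly present.
    return prob_at_least_one(per_variant_detection, n_variants)


def sample_specificity(per_variant_false_positive, n_variants):
    # Chance of no call at all when ctDNA is truly absent.
    return 1.0 - prob_at_least_one(per_variant_false_positive, n_variants)


if __name__ == "__main__":
    # Assumed per-variant rates: 5% detection, 0.1% false-positive call.
    for n in (1, 10, 50, 200):
        sens = sample_sensitivity(0.05, n)
        spec = sample_specificity(0.001, n)
        print(f"{n:>3} variants: sensitivity {sens:.3f}, specificity {spec:.3f}")
```

Requiring two or more variant calls before declaring a sample positive would shift the balance back toward specificity; that threshold is a design choice this sketch does not model.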

The TRACERx (TRAcking Cancer Evolution through therapy (Rx)) lung study is a multi-million pound research project taking place over nine years, which will transform our understanding of non-small cell lung cancer (NSCLC) and take a practical step towards an era of precision medicine. The study will uncover mechanisms of cancer evolution by analysing the intratumour heterogeneity in lung tumours from approximately 850 patients and tracking its evolutionary trajectory from diagnosis through to relapse. At £14 million, it’s the biggest single investment in lung cancer research by Cancer Research UK, and the start of a strategic UK-wide focus on the disease, aimed at making real progress for patients.

Led by Professor Charles Swanton at UCL, the study will bring together a network of experts from different disciplines to help integrate clinical and genomic data and identify patients who could benefit from trials of new, targeted treatments. In addition, it will use a whole suite of cutting edge analytical techniques on these patients’ tumour samples, giving unprecedented insight into the genomic landscape of primary and metastatic tumours and the impact of treatment upon this landscape.

In future, TRACERx will enable us to define how intratumour heterogeneity impacts upon cancer immunity throughout tumour evolution and therapy. Such studies will help define how the clinical evaluation of intratumour heterogeneity can inform patient stratification and the development of combinatorial therapies incorporating conventional, targeted and immune based therapeutics.

Intratumour heterogeneity is increasingly recognised as a major hurdle to achieve improvements in therapeutic outcome and biomarker validation. Intratumour genetic diversity provides a substrate for tumour adaptation and evolution. However, the evolutionary genomic landscape of non-small cell lung cancer (NSCLC) and how it changes through the disease course has not been studied in detail. TRACERx is a prospective observational study with the following objectives:

Primary Objectives

  • Define the relationship between intratumour heterogeneity and clinical outcome following surgery and adjuvant therapy (including relationships between intratumour heterogeneity and clinical disease stage and histological subtypes of NSCLC).
  • Establish the impact of adjuvant platinum-containing regimens upon intratumour heterogeneity in relapsed disease compared to primary resected tumour.

Key Secondary Objectives

  • Develop and validate an intratumour heterogeneity (ITH) ratio index as a prognostic and predictive biomarker in relation to disease-free survival and overall survival.
  • Infer a complete picture of NSCLC evolutionary dynamics – define drivers of genomic instability, metastatic progression and drug resistance by identifying and tracking the dynamics of somatic mutational heterogeneity, and chromosomal structural and numerical instability present in the primary tumour and at metastatic sites. Individual tumour phylogenetic tree analysis will:
    • Establish the order of somatic events in relation to genomic instability onset and metastatic progression
    • Decipher genetic “bottlenecking” events following metastasis and drug therapy
    • Establish dynamics of tumour evolution during the disease course from early to late stage NSCLC.
  • Initiate a longitudinal biobank of circulating tumour cells (CTCs) and circulating-free tumour DNA (cfDNA) to develop analytical methods for the early detection and monitoring of tumour evolution over time.
  • Develop a longitudinal tissue resource to serve as a platform to assess the relationship between genetic intratumour heterogeneity and the host immune response.
  • Define relationships between intratumour heterogeneity and targeted/cytotoxic therapeutic outcome.
  • Use a lung cancer specific gene panel in a certified Good Clinical Practice (GCP) laboratory environment to define clonally dominant disease drivers to address the role of clonal driver dominance in targeted therapeutic response and to guide stratification of lung cancer treatment and future clinical study inclusion (paired primary-metastatic site comparisons in at least 270 patients with relapsed disease).



Utility of longitudinal circulating tumor DNA (ctDNA) modeling to predict RECIST-defined progression in first-line patients with epidermal growth factor receptor mutation-positive (EGFRm) advanced non-small cell lung cancer (NSCLC)
Martin Johnson


Impact of the EML4-ALK fusion variant on the efficacy of lorlatinib in patients (pts) with ALK-positive advanced non-small cell lung cancer (NSCLC)
Todd Bauer


From an interview with Dr. Bauer at https://www.lungcancernews.org/2019/08/14/making-headway-with-lorlatinib/

Lorlatinib, a small-molecule inhibitor of ALK and ROS1, was granted accelerated U.S. Food and Drug Administration approval in November 2018 for patients with ALK-positive metastatic NSCLC whose disease has progressed on crizotinib and at least one other ALK inhibitor or whose disease has progressed on alectinib or ceritinib as the first ALK inhibitor therapy for metastatic disease. Todd M. Bauer, MD, a medical oncologist and senior investigator at Sarah Cannon Research Institute/Tennessee Oncology, PLLC, in Nashville, has been very involved with the development of lorlatinib since the beginning. In the following interview, Dr. Bauer discusses some of lorlatinib’s unique toxicities, as well as his first-hand experiences with the drug.

For further reading: Solomon B, Besse B, Bauer T, et al. Lorlatinib in patients with ALK-positive non-small-cell lung cancer: results from a global phase 2 study. Lancet Oncol. 2018;19(12):1654-1667.


BACKGROUND: Lorlatinib is a potent, brain-penetrant, third-generation inhibitor of ALK and ROS1 tyrosine kinases with broad coverage of ALK mutations. In a phase 1 study, activity was seen in patients with ALK-positive non-small-cell lung cancer, most of whom had CNS metastases and progression after ALK-directed therapy. We aimed to analyse the overall and intracranial antitumour activity of lorlatinib in patients with ALK-positive, advanced non-small-cell lung cancer.

METHODS: In this phase 2 study, patients with histologically or cytologically ALK-positive or ROS1-positive, advanced, non-small-cell lung cancer, with or without CNS metastases, with an Eastern Cooperative Oncology Group performance status of 0, 1, or 2, and adequate end-organ function were eligible. Patients were enrolled into six different expansion cohorts (EXP1-6) on the basis of ALK and ROS1 status and previous therapy, and were given lorlatinib 100 mg orally once daily continuously in 21-day cycles. The primary endpoint was overall and intracranial tumour response by independent central review, assessed in pooled subgroups of ALK-positive patients. Analyses of activity and safety were based on the safety analysis set (ie, all patients who received at least one dose of lorlatinib) as assessed by independent central review. Patients with measurable CNS metastases at baseline by independent central review were included in the intracranial activity analyses. In this report, we present lorlatinib activity data for the ALK-positive patients (EXP1-5 only), and safety data for all treated patients (EXP1-6). This study is ongoing and is registered with ClinicalTrials.gov, number NCT01970865.

FINDINGS: Between Sept 15, 2015, and Oct 3, 2016, 276 patients were enrolled: 30 who were ALK positive and treatment naive (EXP1); 59 who were ALK positive and received previous crizotinib without (n=27; EXP2) or with (n=32; EXP3A) previous chemotherapy; 28 who were ALK positive and received one previous non-crizotinib ALK tyrosine kinase inhibitor, with or without chemotherapy (EXP3B); 112 who were ALK positive with two (n=66; EXP4) or three (n=46; EXP5) previous ALK tyrosine kinase inhibitors with or without chemotherapy; and 47 who were ROS1 positive with any previous treatment (EXP6). One patient in EXP4 died before receiving lorlatinib and was excluded from the safety analysis set. In treatment-naive patients (EXP1), an objective response was achieved in 27 (90·0%; 95% CI 73·5-97·9) of 30 patients. Three patients in EXP1 had measurable baseline CNS lesions per independent central review, and objective intracranial responses were observed in two (66·7%; 95% CI 9·4-99·2). In ALK-positive patients with at least one previous ALK tyrosine kinase inhibitor (EXP2-5), objective responses were achieved in 93 (47·0%; 39·9-54·2) of 198 patients and objective intracranial response in those with measurable baseline CNS lesions in 51 (63·0%; 51·5-73·4) of 81 patients. Objective response was achieved in 41 (69·5%; 95% CI 56·1-80·8) of 59 patients who had only received previous crizotinib (EXP2-3A), nine (32·1%; 15·9-52·4) of 28 patients with one previous non-crizotinib ALK tyrosine kinase inhibitor (EXP3B), and 43 (38·7%; 29·6-48·5) of 111 patients with two or more previous ALK tyrosine kinase inhibitors (EXP4-5). Objective intracranial response was achieved in 20 (87·0%; 95% CI 66·4-97·2) of 23 patients with measurable baseline CNS lesions in EXP2-3A, five (55·6%; 21·2-86·3) of nine patients in EXP3B, and 26 (53·1%; 38·3-67·5) of 49 patients in EXP4-5. 
The most common treatment-related adverse events across all patients were hypercholesterolaemia (224 [81%] of 275 patients overall and 43 [16%] grade 3-4) and hypertriglyceridaemia (166 [60%] overall and 43 [16%] grade 3-4). Serious treatment-related adverse events occurred in 19 (7%) of 275 patients and seven patients (3%) permanently discontinued treatment because of treatment-related adverse events. No treatment-related deaths were reported.

INTERPRETATION: Consistent with its broad ALK mutational coverage and CNS penetration, lorlatinib showed substantial overall and intracranial activity both in treatment-naive patients with ALK-positive non-small-cell lung cancer, and in those who had progressed on crizotinib, second-generation ALK tyrosine kinase inhibitors, or after up to three previous ALK tyrosine kinase inhibitors. Thus, lorlatinib could represent an effective treatment option for patients with ALK-positive non-small-cell lung cancer in first-line or subsequent therapy.
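The response rates in the abstract above are reported with 95% confidence intervals. The abstract does not state how the intervals were computed; the sketch below assumes an exact binomial (Clopper-Pearson) interval, solved by bisection using only the Python standard library, and applies it to two of the reported counts (27 of 30 and 2 of 3 responders).

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(f, target, iterations=100):
    """Find p in [0, 1] with f(p) == target, for f increasing in p (bisection)."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for x successes in n trials."""
    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper bound: the p at which P(X <= x) equals alpha / 2,
    # i.e. P(X > x) equals 1 - alpha / 2.
    upper = 1.0 if x == n else _solve_increasing(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


if __name__ == "__main__":
    for responders, total in ((27, 30), (2, 3)):
        low, high = clopper_pearson(responders, total)
        print(f"{responders}/{total}: {100 * responders / total:.1f}% "
              f"(95% CI {100 * low:.1f}-{100 * high:.1f})")
```

The wide interval for 2 of 3 patients shows how little a response rate from a very small subgroup constrains the true rate.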

  • lorlatinib could be used for crizotinib-resistant tumors based on EML4-ALK variants present in ctDNA

1. Updated efficacy and safety data from the global phase III ALEX study of alectinib (ALC) vs crizotinib (CZ) in untreated advanced ALK+ NSCLC. J Clin Oncol 36, 2018 (suppl; abstr 9043).


Corey Langer











Group of Researchers @ University of California, Riverside, the University of Chicago, the U.S. Department of Energy’s Argonne National Laboratory, and Northwestern University solve COVID-19 Structure and Map Potential Therapeutics

Reporters: Stephen J Williams, PhD and Aviva Lev-Ari, PhD, RN


This illustration, created at the Centers for Disease Control and Prevention (CDC), reveals ultrastructural morphology exhibited by coronaviruses. Note the spikes that adorn the outer surface of the virus, which impart the look of a corona surrounding the virion when viewed by electron microscopy. A novel coronavirus was identified as the cause of an outbreak of respiratory illness first detected in Wuhan, China in 2019.

Image and Caption Credit: Alissa Eckert, MS; Dan Higgins, MAM available at https://phil.cdc.gov/Details.aspx?pid=23311


New coronavirus protein reveals drug target

Image of newly mapped coronavirus protein, called Nsp15, which helps the virus replicate.

Image Credit: Northwestern University



The 3-D structure of a potential drug target, a newly mapped protein of SARS-CoV-2, the coronavirus that causes COVID-19, has been solved by a team of researchers from the University of California, Riverside, the University of Chicago, the U.S. Department of Energy’s Argonne National Laboratory, and Northwestern University.

The scientists said their findings suggest drugs previously developed to treat the earlier SARS outbreak could now be developed as effective drugs against COVID-19.

The initial genome analysis and design of constructs for protein synthesis were performed by the bioinformatic group of Adam Godzik, a professor of biomedical sciences at the UC Riverside School of Medicine.

The protein Nsp15 from Severe Acute Respiratory Syndrome Coronavirus 2, or SARS-CoV-2, is 89% identical to the protein from the earlier outbreak of SARS-CoV. SARS-CoV-2 is responsible for the current outbreak of COVID-19. Studies published in 2010 on SARS-CoV revealed inhibition of Nsp15 can slow viral replication. This suggests drugs designed to target Nsp15 could be developed as effective drugs against COVID-19.
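
A figure such as "89% identical" is a percent identity computed over an alignment of the two protein sequences. Below is a minimal sketch of that calculation, assuming the sequences are already aligned and use `-` for gaps. The two short sequences are made-up placeholders, not Nsp15, and excluding gap columns from the denominator is only one of several conventions in use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped alignment columns where both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only.
seq_a = "MKT-AYIAKQR"
seq_b = "MKTLAYLAKQR"
print(f"{percent_identity(seq_a, seq_b):.1f}% identical")
```

In practice the alignment itself would come from a dedicated alignment tool; this sketch covers only the final counting step.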

Adam Godzik
Adam Godzik, UC Riverside professor of biomedical sciences
Credit: Sanford Burnham Prebys Medical Discovery Institute

“While the SARS-CoV-2 virus is very similar to the SARS virus that caused epidemics in 2003, new structures shed light on the small, but potentially important differences between the two viruses that contribute to the different patterns in the spread and severity of the diseases they cause,” Godzik said.

The structure of Nsp15, which will be released to the scientific community on March 4, was solved by the group of Andrzej Joachimiak, a distinguished fellow at the Argonne National Laboratory, University of Chicago Professor, and Director of the Structural Biology Center at Argonne’s Advanced Photon Source, a Department of Energy Office of Science user facility.

“Nsp15 is conserved among coronaviruses and is essential in their lifecycle and virulence,” Joachimiak said. “Initially, Nsp15 was thought to directly participate in viral replication, but more recently, it was proposed to help the virus replicate possibly by interfering with the host’s immune response.”

Mapping a 3D protein structure of the virus, also called solving the structure, allows scientists to figure out how to interfere in the pathogen’s replication in human cells.

“The Nsp15 protein has been investigated in SARS as a novel target for new drug development, but that never went very far because the SARS epidemic went away, and all new drug development ended,” said Karla Satchell, a professor of microbiology-immunology at Northwestern, who leads the international team of scientists investigating the structure of the SARS-CoV-2 virus to understand how to stop it from replicating. “Some inhibitors were identified but never developed into drugs. The inhibitors that were developed for SARS now could be tested against this protein.”

The rapid upsurge and proliferation of SARS-CoV-2 has raised questions about how this virus became so much more transmissible than the SARS and MERS coronaviruses. The scientists are mapping the proteins to address this issue.

Over the past two months, COVID-19 infected more than 80,000 people and caused at least 2,700 deaths. Although currently mainly concentrated in China, the virus is spreading worldwide and has been found in 46 countries. Millions of people are being quarantined, and the epidemic has impacted the world economy. There is no existing drug for this disease, but various treatment options, such as utilizing medicines effective in other viral ailments, are being attempted.

Godzik, Satchell, and Joachimiak — along with the entire center team — will map the structure of some of the 28 proteins in the virus in order to see where drugs can throw a chemical monkey wrench into its machinery. The proteins are folded globular structures with precisely defined functions and their “active sites” can be targeted with chemical compounds.
The first step is to clone and express the genes of the virus proteins and grow them as protein crystals in miniature ice cube-like trays. The consortium includes nine labs across eight institutions that will participate in this effort.

Above is a modified version of the Northwestern University news release written by Marla Paul.


Bioinformatic Tools for RNASeq: A Curation

Curator: Stephen J. Williams, Ph.D. 


Note:  This will be an ongoing curation as new information and tools become available.

RNASeq is a powerful tool for analyzing the transcriptome and has been used to determine the transcriptional changes that follow stimuli such as drug treatment, or to detect transcript differences between biological sample cohorts such as tumor versus normal tissue.  Unlike its genomic companions, whole genome and whole exome sequencing, which analyze the primary sequence of genomic DNA, RNASeq analyzes the mRNA transcripts and thereby more closely reflects the ultimately translated proteome. In addition, RNASeq and transcriptome profiling can detect splicing variants as well as noncoding species such as miRNA and lncRNA, all of which have shown pertinence in the etiology of many diseases, including cancer.

However, RNASeq, like other omic technologies, generates enormous data sets, which require multiple types of bioinformatic tools to correctly analyze the sequence reads and to visualize and interpret the output data.  This post presents a curation by the RNA-Seq Blog of tools useful for RNASeq studies, and lists and reviews published literature that uses these curated tools.
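
To make the tumor-versus-normal comparison concrete, here is a minimal sketch of two steps most RNASeq workflows share in some form: scaling raw read counts by library size (counts per million) and computing a log2 fold change between two samples. The gene names and counts are invented placeholders, and the pseudocount of 1 is an arbitrary choice. Dedicated differential-expression packages add replicate-aware statistics that this sketch omits.

```python
from math import log2


def counts_per_million(counts: dict[str, int]) -> dict[str, float]:
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {gene: 1e6 * c / total for gene, c in counts.items()}


def log2_fold_change(
    tumor: dict[str, float], normal: dict[str, float], pseudocount: float = 1.0
) -> dict[str, float]:
    """log2(tumor / normal) per shared gene; the pseudocount avoids log(0)."""
    return {
        gene: log2((tumor[gene] + pseudocount) / (normal[gene] + pseudocount))
        for gene in tumor
        if gene in normal
    }


# Hypothetical raw counts for three placeholder genes.
tumor_counts = {"GENE_A": 900, "GENE_B": 50, "GENE_C": 50}
normal_counts = {"GENE_A": 300, "GENE_B": 300, "GENE_C": 400}

tumor_cpm = counts_per_million(tumor_counts)
normal_cpm = counts_per_million(normal_counts)
lfc = log2_fold_change(tumor_cpm, normal_cpm)

for gene, value in sorted(lfc.items()):
    print(f"{gene}: log2FC = {value:+.2f}")
```

A positive value means the gene is more abundant in the tumor sample after normalization, and a negative value means it is less abundant.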


From the RNA-Seq Blog

List of RNA-Seq bioinformatics tools

Posted by: RNA-Seq Blog in Data Analysis, Web Tools, September 16, 2015

from: https://en.wiki2.org/wiki/List_of_RNA-Seq_bioinformatics_tools

Some of the literature using the curated tools listed above is reviewed below:


A.   Tools Useful for Single Cell RNA-Seq Analysis


B.  Tools for RNA-Seq Analysis of the Spliceosome


C.  Tools Useful for RNA-Seq read assembly visualization


Other articles on RNA and Transcriptomics in this Open Access Journal Include:

NIH to Award Up to $12M to Fund DNA, RNA Sequencing Research: single-cell genomics, sample preparation, transcriptomics and epigenomics, and genome-wide functional analysis.

Single-cell Genomics: Directions in Computational and Systems Biology – Contributions of Prof. Aviv Regev @Broad Institute of MIT and Harvard, Cochair, the Human Cell Atlas Organizing Committee with Sarah Teichmann of the Wellcome Trust Sanger Institute

Complex rearrangements and oncogene amplification revealed by long-read DNA and RNA sequencing of a breast cancer cell line

Single-cell RNA-seq helps in finding intra-tumoral heterogeneity in pancreatic cancer

First challenge to make use of the new NCI Cloud Pilots – Somatic Mutation Challenge – RNA: Best algorithms for detecting all of the abnormal RNA molecules in a cancer cell

Evolution of the Human Cell Genome Biology Field of Gene Expression, Gene Regulation, Gene Regulatory Networks and Application of Machine Learning Algorithms in Large-Scale Biological Data Analysis



Medicine in 2045 – Perspectives by World Thought Leaders in the Life Sciences & Medicine

Reporter: Aviva Lev-Ari, PhD, RN


This report is based on an article in Nature Medicine | VOL 25 | December 2019 | 1800–1809 | http://www.nature.com/naturemedicine

Looking forward 25 years: the future of medicine.

Nat Med 25, 1804–1807 (2019) doi:10.1038/s41591-019-0693-y


Aviv Regev, PhD

Core member and chair of the faculty, Broad Institute of MIT and Harvard; director, Klarman Cell Observatory, Broad Institute of MIT and Harvard; professor of biology, MIT; investigator, Howard Hughes Medical Institute; founding co-chair, Human Cell Atlas.

  • millions of genome variants, tens of thousands of disease-associated genes, thousands of cell types and an almost unimaginable number of ways they can combine, we had to approximate a best starting point—choose one target, guess the cell, simplify the experiment.
  • In 2020, advances in polygenic risk scores, in understanding the cell and modules of action of genes through genome-wide association studies (GWAS), and in predicting the impact of combinations of interventions.
  • we need algorithms to make better computational predictions of experiments we have never performed in the lab or in clinical trials.
  • Human Cell Atlas and the International Common Disease Alliance—and in new experimental platforms: data platforms and algorithms. But we also need a broader ecosystem of partnerships in medicine that engages interaction between clinical experts and mathematicians, computer scientists and engineers

Feng Zhang, PhD

Investigator, Howard Hughes Medical Institute; core member, Broad Institute of MIT and Harvard; James and Patricia Poitras Professor of Neuroscience, McGovern Institute for Brain Research, MIT.

  • fundamental shift in medicine away from treating symptoms of disease and toward treating disease at its genetic roots.
  • Gene therapy with clinical feasibility, improved delivery methods and the development of robust molecular technologies for gene editing in human cells, affordable genome sequencing has accelerated our ability to identify the genetic causes of disease.
  • 1,000 clinical trials testing gene therapies are ongoing, and the pace of clinical development is likely to accelerate.
  • refine molecular technologies for gene editing, to push our understanding of gene function in health and disease forward, and to engage with all members of society

Elizabeth Jaffee, PhD

Dana and Albert “Cubby” Broccoli Professor of Oncology, Johns Hopkins School of Medicine; deputy director, Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins.

  • a single blood test could inform individuals of the diseases they are at risk of (diabetes, cancer, heart disease, etc.) and that safe interventions will be available.
  • developing cancer vaccines. Vaccines targeting the causative agents of cervical and hepatocellular cancers have already proven to be effective. With these technologies and the wealth of data that will become available as precision medicine becomes more routine, new discoveries identifying the earliest genetic and inflammatory changes occurring within a cell as it transitions into a pre-cancer can be expected. With these discoveries, the opportunities to develop vaccine approaches that prevent cancer development will grow.

Jeremy Farrar, OBE FRCP FRS FMedSci

Director, Wellcome Trust.

  • shape how the culture of research will develop over the next 25 years, a culture that cares more about what is achieved than how it is achieved.
  • building a creative, inclusive and open research culture will unleash greater discoveries with greater impact.

John Nkengasong, PhD

Director, Africa Centres for Disease Control and Prevention.

  • To meet its health challenges by 2050, the continent will have to be innovative in order to leapfrog toward solutions in public health.
  • Precision medicine will need to take center stage in a new public health order— whereby a more precise and targeted approach to screening, diagnosis, treatment and, potentially, cure is based on each patient’s unique genetic and biologic make-up.

Eric Topol, MD

Executive vice-president, Scripps Research Institute; founder and director, Scripps Research Translational Institute.

  • In 2045, a planetary health infrastructure based on deep, longitudinal, multimodal human data, ideally collected from and accessible to as many as possible of the 9+ billion people projected to then inhabit the Earth.
  • enhanced capabilities to perform functions that are not feasible now.
  • AI machines’ ability to ingest and process biomedical text at scale—such as the corpus of the up-to-date medical literature—will be used routinely by physicians and patients.
  • the concept of a learning health system will be redefined by AI.

Linda Partridge, PhD

Professor, Max Planck Institute for Biology of Ageing.

  • Geroprotective drugs, which target the underlying molecular mechanisms of ageing, are coming over the scientific and clinical horizons, and may help to prevent the most intractable age-related disease, dementia.

Trevor Mundel, MD

President of Global Health, Bill & Melinda Gates Foundation.

  • finding new ways to share clinical data that are as open as possible and as closed as necessary.
  • moving beyond drug donations toward a new era of corporate social responsibility that encourages biotechnology and pharmaceutical companies to offer their best minds and their most promising platforms.
  • working with governments and multilateral organizations much earlier in the product life cycle to finance the introduction of new interventions and to ensure the sustainable development of the health systems that will deliver them.
  • deliver on the promise of global health equity.

Josep Tabernero, MD, PhD

Vall d’Hebron Institute of Oncology (VHIO); president, European Society for Medical Oncology (2018–2019).

  • genomic-driven analysis will continue to broaden the impact of personalized medicine in healthcare globally.
  • Precision medicine will continue to deliver its new paradigm in cancer care and reach more patients.
  • Immunotherapy will deliver on its promise to dismantle cancer’s armory across tumor types.
  • AI will help guide the development of individually matched
  • genetic patient screenings
  • the promise of liquid biopsy policing of disease?

Pardis Sabeti, PhD

Professor, Harvard University & Harvard T.H. Chan School of Public Health and Broad Institute of MIT and Harvard; investigator, Howard Hughes Medical Institute.

  • the development and integration of tools into an early-warning system embedded into healthcare systems around the world could revolutionize infectious disease detection and response.
  • But this will only happen with a commitment from the global community.

Els Torreele, PhD

Executive director, Médecins Sans Frontières Access Campaign

  • we need a paradigm shift such that medicines are no longer lucrative market commodities but are global public health goods—available to all those who need them.
  • This will require members of the scientific community to go beyond their role as researchers and actively engage in R&D policy reform mandating health research in the public interest and ensuring that the results of their work benefit many more people.
  • The global research community can lead the way toward public-interest driven health innovation, by undertaking collaborative open science and piloting not-for-profit R&D strategies that positively impact people’s lives globally.


Evolution of the Human Cell Genome Biology Field of Gene Expression, Gene Regulation, Gene Regulatory Networks and Application of Machine Learning Algorithms in Large-Scale Biological Data Analysis

Curator & Reporter: Aviva Lev-Ari, PhD, RN



The Scientific Frontier is presented in Deciphering eukaryotic gene-regulatory logic with 100 million random promoters

Boer, C.G., Vaishnav, E.D., Sadeh, R. et al. Deciphering eukaryotic gene-regulatory logic with 100 million random promoters. Nat Biotechnol (2019) doi:10.1038/s41587-019-0315-8


How transcription factors (TFs) interpret cis-regulatory DNA sequence to control gene expression remains unclear, largely because past studies using native and engineered sequences had insufficient scale. Here, we measure the expression output of >100 million synthetic yeast promoter sequences that are fully random. These sequences yield diverse, reproducible expression levels that can be explained by their chance inclusion of functional TF binding sites. We use machine learning to build interpretable models of transcriptional regulation that predict ~94% of the expression driven from independent test promoters and ~89% of the expression driven from native yeast promoter fragments. These models allow us to characterize each TF’s specificity, activity and interactions with chromatin. TF activity depends on binding-site strand, position, DNA helical face and chromatin context. Notably, expression level is influenced by weak regulatory interactions, which confound designed-sequence studies. Our analyses show that massive-throughput assays of fully random DNA can provide the big data necessary to develop complex, predictive models of gene regulation.
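
The abstract attributes the expression of random promoters to their chance inclusion of TF binding sites. A back-of-the-envelope sketch shows why that is plausible at this scale: assuming uniform, independent bases, an exact k-mer is expected about (L - k + 1) / 4^k times per strand in a random sequence of length L. The 80 bp length, the example 7-mer and the simulation size below are illustrative assumptions, not values taken from the abstract, and real TF sites are degenerate rather than exact k-mers.

```python
import random


def expected_hits(seq_len: int, motif_len: int) -> float:
    """Expected exact-match count of one k-mer on one strand of random DNA."""
    positions = seq_len - motif_len + 1
    if positions <= 0:
        return 0.0
    return positions / 4**motif_len


def count_hits(seq: str, motif: str) -> int:
    """Count exact occurrences of motif in seq, allowing overlaps."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i : i + k] == motif)


motif = "TGACTCA"  # arbitrary example 7-mer
seq_len = 80       # assumed promoter length, for illustration
n_seqs = 20_000    # small simulated library

rng = random.Random(0)
observed = sum(
    count_hits("".join(rng.choices("ACGT", k=seq_len)), motif)
    for _ in range(n_seqs)
)

per_seq = expected_hits(seq_len, len(motif))
print(f"expected hits per sequence: {per_seq:.5f}")
print(f"simulated hits per sequence: {observed / n_seqs:.5f}")
print(f"expected hits in 100 million sequences: {per_seq * 1e8:,.0f}")
```

Even a site that is rare in any single sequence is therefore expected many thousands of times across a library of this size, in many different positions and sequence contexts.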

The Evolution of the Human Cell Genome Biology Field of Gene Expression, Gene Regulation, Gene Regulatory Networks and Application of Machine Learning Algorithms in Large-Scale Biological Data Analysis is presented in the following Table


50 Liu, X., Li, Y. I. & Pritchard, J. K. Trans effects on gene expression can drive omnigenic inheritance. Cell 177, 1022–1034 e1026 (2019).
5 Muerdter, F. et al. Resolving systematic errors in widely used enhancer activity assays in human cells. Nat. Methods 15, 141–149 (2018).
6 Wang, X. et al. High-resolution genome-wide functional dissection of transcriptional regulatory regions and nucleotides in human. Nat. Commun. 9, 5380 (2018).
15 Yona, A. H., Alm, E. J. & Gore, J. Random sequences rapidly evolve into de novo promoters. Nat. Commun. 9, 1530 (2018).
4 van Arensbergen, J. et al. Genome-wide mapping of autonomous promoter activity in human cells. Nat. Biotechnol. 35, 145–153 (2017).
14 Cuperus, J. T. et al. Deep learning of the regulatory grammar of yeast 5’ untranslated regions from 500,000 random sequences. Genome Res. 27, 2015–2024 (2017).
31 Levo, M. et al. Systematic investigation of transcription factor activity in the context of chromatin using massively parallel binding and expression assays. Mol. Cell 65, 604–617 e606 (2017).
49 Boyle, E. A., Li, Y. I. & Pritchard, J. K. An expanded view of complex traits: from polygenic to omnigenic. Cell 169, 1177–1186 (2017).
54 de Boer, C. High-efficiency S. cerevisiae lithium acetate transformation. protocols.io https://doi.org/10.17504/protocols.io.j4tcqwn (2017).
59 Abadi, M. et al. TensorFlow: large-scale machine learning on heterogeneous systems. arXiv 1603.04467 (2016).
20 Shalem, O. et al. Systematic dissection of the sequence determinants of gene 3’ end mediated expression control. PLoS Genet. 11, e1005147 (2015).
55 Deng, C., Daley, T. & Smith, A. D. Applications of species accumulation curves in large-scale biological data analysis. Quant. Biol. 3, 135–144 (2015).
9 Hughes, T. R. & de Boer, C. G. Mapping yeast transcriptional networks. Genetics 195, 9–36 (2013).
10 Jolma, A. et al. DNA-binding specificities of human transcription factors. Cell 152, 327–339 (2013).
19 Kosuri, S. et al. Composability of regulatory sequences controlling transcription and translation in Escherichia coli. Proc. Natl Acad. Sci. USA 110, 14024–14029 (2013).
7 Sharon, E. et al. Inferring gene regulatory logic from high-throughput measurements of thousands of systematically designed promoters. Nat. Biotechnol. 30, 521–530 (2012).
18 de Boer, C. G. & Hughes, T. R. YeTFaSCo: a database of evaluated yeast transcription factor sequence specificities. Nucleic Acids Res. 40, D169–D179 (2012).
56 Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
61 Cherry, J. M. et al. Saccharomyces Genome Database: the genomics resource of budding yeast. Nucleic Acids Res. 40, D700–D705 (2012).
11 Nutiu, R. et al. Direct measurement of DNA affinity landscapes on a high-throughput sequencing instrument. Nat. Biotechnol. 29, 659–664 (2011).
26 Zhang, Z. et al. A packing mechanism for nucleosome organization reconstituted across a eukaryotic genome. Science 332, 977–980 (2011).
30 Ganapathi, M. et al. Extensive role of the general regulatory factors, Abf1 and Rap1, in determining genome-wide chromatin structure in budding yeast. Nucleic Acids Res. 39, 2032–2044 (2011).
52 Erb, I. & van Nimwegen, E. Transcription factor binding site positioning in yeast: proximal promoter motifs characterize TATA-less promoters. PLoS One 6, e24279 (2011).
3 Kinney, J. B., Murugan, A., Callan, C. G. Jr. & Cox, E. C. Using deep sequencing to characterize the biophysical mechanism of a transcriptional regulatory sequence. Proc. Natl Acad. Sci. USA 107, 9158–9163 (2010).
8 Gertz, J., Siggia, E. D. & Cohen, B. A. Analysis of combinatorial cis-regulation in synthetic and genomic promoters. Nature 457, 215–218 (2009).
16 Wunderlich, Z. & Mirny, L. A. Different gene regulation strategies revealed by analysis of binding motifs. Trends Genet. 25, 434–440 (2009).
27 Hesselberth, J. R. et al. Global mapping of protein–DNA interactions in vivo by digital genomic footprinting. Nat. Methods 6, 283–289 (2009).
29 Hartley, P. D. & Madhani, H. D. Mechanisms that specify promoter nucleosome location and identity. Cell 137, 445–458 (2009).
51 Gibson, D. G. et al. Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat. Methods 6, 343–345 (2009).
58 Segal, E. & Widom, J. From DNA sequence to transcriptional behaviour: a quantitative approach. Nat. Rev. Genet. 10, 443–456 (2009).
2 Yuan, Y., Guo, L., Shen, L. & Liu, J. S. Predicting gene expression from sequence: a reexamination. PLoS Comput. Biol. 3, e243 (2007).
46 Hibbs, M. A. et al. Exploring the functional landscape of gene expression: directed search of large microarray compendia. Bioinformatics 23, 2692–2699 (2007).
25 Liu, X., Lee, C. K., Granek, J. A., Clarke, N. D. & Lieb, J. D. Whole-genome comparison of Leu3 binding in vitro and in vivo reveals the importance of nucleosome occupancy in target site selection. Genome Res. 16, 1517–1528 (2006).
34 Roberts, G. G. & Hudson, A. P. Transcriptome profiling of Saccharomyces cerevisiae during a transition from fermentative to glycerol-based respiratory growth reveals extensive metabolic and structural remodeling. Mol. Genet. Genomics 276, 170–186 (2006).
48 Tanay, A. Extensive low-affinity transcriptional interactions in the yeast genome. Genome Res. 16, 962–972 (2006).
53 Tong, A. H. & Boone, C. Synthetic genetic array analysis in Saccharomyces cerevisiae. Methods Mol. Biol. 313, 171–192 (2006).
57 Li, W. & Godzik, A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22, 1658–1659 (2006).
62 Chua, G. et al. Identifying transcription factor functions and targets by phenotypic activation. Proc. Natl Acad. Sci. USA 103, 12045–12050 (2006).
17 Arnosti, D. N. & Kulkarni, M. M. Transcriptional enhancers: intelligent enhanceosomes or flexible billboards? J. Cell. Biochem. 94, 890–898 (2005).
21 Granek, J. A. & Clarke, N. D. Explicit equilibrium modeling of transcription-factor binding and gene regulation. Genome Biol. 6, R87 (2005).
1 Beer, M. A. & Tavazoie, S. Predicting gene expression from sequence. Cell 117, 185–198 (2004).
28 Bernstein, B. E., Liu, C. L., Humphrey, E. L., Perlstein, E. O. & Schreiber, S. L. Global nucleosome occupancy in yeast. Genome Biol. 5, R62 (2004).
44 Kim, T. S., Kim, H. Y., Yoon, J. H. & Kang, H. S. Recruitment of the Swi/Snf complex by Ste12-Tec1 promotes Flo8-Mss11-mediated activation of STA1 expression. Mol. Cell. Biol. 24, 9542–9556 (2004).
45 Harbison, C. T. et al. Transcriptional regulatory code of a eukaryotic genome. Nature 431, 99–104 (2004).
60 Kent, N. A., Eibert, S. M. & Mellor, J. Cbf1p is required for chromatin remodeling at promoter-proximal CACGTG motifs in yeast. J. Biol. Chem. 279, 27116–27123 (2004).
22 Kulkarni, M. M. & Arnosti, D. N. Information display by transcriptional enhancers. Development 130, 6569–6575 (2003).
24 Conlon, E. M., Liu, X. S., Lieb, J. D. & Liu, J. S. Integrating regulatory motif discovery and genome-wide expression analysis. Proc. Natl Acad. Sci. USA 100, 3339–3344 (2003).
43 Neely, K. E., Hassan, A. H., Brown, C. E., Howe, L. & Workman, J. L. Transcription activator interactions with multiple SWI/SNF subunits. Mol. Cell. Biol. 22, 1615–1625 (2002).
23 Bussemaker, H. J., Li, H. & Siggia, E. D. Regulatory element detection using correlation with expression. Nat. Genet. 27, 167–171 (2001).
37 Haurie, V. et al. The transcriptional activator Cat8p provides a major contribution to the reprogramming of carbon metabolism during the diauxic shift in Saccharomyces cerevisiae. J. Biol. Chem. 276, 76–85 (2001).
39 Grauslund, M. & Ronnow, B. Carbon source-dependent transcriptional regulation of the mitochondrial glycerol-3-phosphate dehydrogenase gene, GUT2, from Saccharomyces cerevisiae. Can. J. Microbiol. 46, 1096–1100 (2000).
42 Cullen, P. J. & Sprague, G. F. Jr. Glucose depletion causes haploid invasive growth in yeast. Proc. Natl Acad. Sci. USA 97, 13619–13624 (2000).
38 Sato, T. et al. The E-box DNA binding protein Sgc1p suppresses the gcr2 mutation, which is involved in transcriptional activation of glycolytic genes in Saccharomyces cerevisiae. FEBS Lett. 463, 307–311 (1999).
40 Madhani, H. D. & Fink, G. R. Combinatorial control required for the specificity of yeast MAPK signaling. Science 275, 1314–1317 (1997).
41 Gavrias, V., Andrianopoulos, A., Gimeno, C. J. & Timberlake, W. E. Saccharomyces cerevisiae TEC1 is required for pseudohyphal growth. Mol. Microbiol. 19, 1255–1263 (1996).
36 Hedges, D., Proft, M. & Entian, K. D. CAT8, a new zinc cluster-encoding gene necessary for derepression of gluconeogenic enzymes in the yeast Saccharomyces cerevisiae. Mol. Cell. Biol. 15, 1915–1922 (1995).
47 Bednar, J. et al. Determination of DNA persistence length by cryo-electron microscopy. Separation of the static and dynamic contributions to the apparent persistence length of DNA. J. Mol. Biol. 254, 579–594 (1995).
32 Axelrod, J. D., Reagan, M. S. & Majors, J. GAL4 disrupts a repressing nucleosome during activation of GAL1 transcription in vivo. Genes Dev. 7, 857–869 (1993).
33 Morse, R. H. Nucleosome disruption by transcription factor binding in yeast. Science 262, 1563–1566 (1993).
12 Oliphant, A. R., Brandl, C. J. & Struhl, K. Defining the sequence specificity of DNA-binding proteins by selecting binding sites from random-sequence oligonucleotides: analysis of yeast GCN4 protein. Mol. Cell. Biol. 9, 2944–2949 (1989).
35 Forsburg, S. L. & Guarente, L. Identification and characterization of HAP4: a third component of the CCAAT-bound HAP2/HAP3 heteromer. Genes Dev. 3, 1166–1178 (1989).
13 Horwitz, M. S. & Loeb, L. A. Promoters selected from random DNA sequences. Proc. Natl Acad. Sci. USA 83, 7405–7409 (1986).


To access each reference as a live link, go to the number in the first column in the Table and look it up in the List of References in the Link, below


Author information

C.G.D. and A.R. drafted the manuscript, with all authors contributing. C.G.D. analyzed the data. C.G.D., E.D.V., E.L.A. and R.S. performed the experiments. A.R. and N.F. supervised the research.

Correspondence to Carl G. de Boer or Aviv Regev.

Ethics declarations

Competing interests

A.R. is an SAB member of Thermo Fisher Scientific, Neogene Therapeutics, Asimov, and Syros Pharmaceuticals, an equity holder of Immunitas, and a founder of and equity holder in Celsius Therapeutics. All other authors declare no competing interests.


Cite this article

Boer, C.G., Vaishnav, E.D., Sadeh, R. et al. Deciphering eukaryotic gene-regulatory logic with 100 million random promoters. Nat Biotechnol (2019) doi:10.1038/s41587-019-0315-8


Deep Learning extracts Histopathological Patterns and accurately discriminates 28 Cancer and 14 Normal Tissue Types: Pan-cancer Computational Histopathology Analysis

Reporter: Aviva Lev-Ari, PhD, RN

Pan-cancer computational histopathology reveals mutations, tumor composition and prognosis

Yu Fu [1], Alexander W Jung [1], Ramon Viñas Torne [1], Santiago Gonzalez [1,2], Harald Vöhringer [1], Mercedes Jimenez-Linan [3], Luiza Moore [3,4], and Moritz Gerstung [1,5] (to whom correspondence should be addressed)

1) European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK.
2) Current affiliation: Institute for Research in Biomedicine (IRB Barcelona), Parc Científic de Barcelona, Barcelona, Spain.
3) Department of Pathology, Addenbrooke’s Hospital, Cambridge, UK.
4) Wellcome Sanger Institute, Hinxton, UK.
5) European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany.


Dr Moritz Gerstung European Molecular Biology Laboratory European Bioinformatics Institute (EMBL-EBI) Hinxton, CB10 1SA UK. Tel: +44 (0) 1223 494636 E-mail: moritz.gerstung@ebi.ac.uk



Here we use deep transfer learning to quantify histopathological patterns across 17,396 H&E stained histopathology image slides from 28 cancer types and correlate these with underlying genomic and transcriptomic data. Pan-cancer computational histopathology (PC-CHiP) classifies the tissue origin across organ sites and provides highly accurate, spatially resolved tumor and normal distinction within a given slide. The learned computational histopathological features correlate with a large range of recurrent genetic aberrations, including whole genome duplications (WGDs), arm-level copy number gains and losses, focal amplifications and deletions as well as driver gene mutations within a range of cancer types. WGDs can be predicted in 25/27 cancer types (mean AUC=0.79) including those that were not part of model training. Similarly, we observe associations with 25% of mRNA transcript levels, which enables us to learn and localise histopathological patterns of molecularly defined cell types on each slide. Lastly, we find that computational histopathology provides prognostic information augmenting histopathological subtyping and grading in the majority of cancers assessed, which pinpoints prognostically relevant areas such as necrosis or infiltrating lymphocytes on each tumour section. Taken together, these findings highlight the large potential of PC-CHiP to discover new molecular and prognostic associations, which can augment diagnostic workflows and lay out a rationale for integrating molecular and histopathological data.
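
The abstract summarises predictive performance as an area under the ROC curve (AUC). As a reminder of what that number measures, the sketch below computes AUC as the probability that a randomly chosen positive case (for example, a tumour with WGD) receives a higher score than a randomly chosen negative one, with ties counted as one half. The labels and scores are invented toy values, and this is not the authors' evaluation code.

```python
def roc_auc(labels: list[int], scores: list[float]) -> float:
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical slide-level labels (1 = WGD) and model scores.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.4]
print(f"AUC = {roc_auc(labels, scores):.2f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of cases, which is fine for a sketch but would normally be replaced by a rank-based computation on large data sets.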



Key points

● Pan-cancer computational histopathology analysis with deep learning extracts histopathological patterns and accurately discriminates 28 cancer and 14 normal tissue types

● Computational histopathology predicts whole genome duplications, focal amplifications and deletions, as well as driver gene mutations

● Wide-spread correlations with gene expression indicative of immune infiltration and proliferation

● Prognostic information augments conventional grading and histopathology subtyping in the majority of cancers



Here we presented PC-CHiP, a pan-cancer transfer learning approach to extract computational histopathological features across 42 cancer and normal tissue types and their genomic, molecular and prognostic associations. Histopathological features, originally derived to classify different tissues, contained rich histologic and morphological signals predictive of a range of genomic and transcriptomic changes as well as survival. This shows that computer vision not only has the capacity to highly accurately reproduce predefined tissue labels, but also that this quantifies diverse histological patterns, which are predictive of a broad range of genomic and molecular traits, which were not part of the original training task. As the predictions are exclusively based on standard H&E-stained tissue sections, our analysis highlights the high potential of computational histopathology to digitally augment existing histopathological workflows. The strongest genomic associations were found for whole genome duplications, which can in part be explained by nuclear enlargement and increased nuclear intensities, but seemingly also stems from tumour grade and other histomorphological patterns contained in the high-dimensional computational histopathological features. Further, we observed associations with a range of chromosomal gains and losses, focal deletions and amplifications as well as driver gene mutations across a number of cancer types. These data demonstrate that genomic alterations change the morphology of cancer cells, as in the case of WGD, but possibly also that certain aberrations preferentially occur in distinct cell types, reflected by the tumor histology. Whatever is the cause or consequence in this equation, these associations lay out a route towards genomically defined histopathology subtypes, which will enhance and refine conventional assessment. 
Further, a broad range of transcriptomic correlations was observed reflecting both immune cell infiltration and cell proliferation that leads to higher tumor densities. These examples illustrated the remarkable property that machine learning does not only establish novel molecular associations from pre-computed histopathological feature sets but also allows the localisation of these traits within a larger image. While this exemplifies the power of a large scale data analysis to detect and localise recurrent patterns, it is probably not superior to spatially annotated training data. Yet such data can, by definition, only be generated for associations which are known beforehand. This appears straightforward, albeit laborious, for existing histopathology classifications, but more challenging for molecular readouts. Yet novel spatial transcriptomic [44,45] and sequencing technologies [46] bring within reach spatially matched molecular and histopathological data, which would serve as a gold standard in combining imaging and molecular patterns. Across cancer types, computational histopathological features showed a good level of prognostic relevance, substantially improving prognostic accuracy over conventional grading and histopathological subtyping in the majority of cancers. It is thus very remarkable that such predictive signals can be learned in a fully automated fashion. Still, at least at the current resolution, the improvement over a full molecular and clinical workup was relatively small.
This might be a consequence of the far-ranging relations between histopathology and molecular phenotypes described here, implying that histopathology is a reflection of the underlying molecular alterations rather than an independent trait. Yet it probably also highlights the challenges of unambiguously quantifying histopathological signals in – and combining signals from – individual areas, which requires very large training datasets for each tumour entity. From a methodological point of view, the prediction of molecular traits can clearly be improved. In this analysis, we adopted – for the reason of simplicity and to avoid overfitting – a transfer learning approach in which an existing deep convolutional neural network, developed for classification of everyday objects, was fine tuned to predict cancer and normal tissue types. The implicit imaging feature representation was then used to predict molecular traits and outcomes. Instead of employing this two-step procedure, which risks missing patterns irrelevant for the initial classification task, one might directly employ either training on the molecular trait of interest, or ideally multi-objective learning. Further improvement may also be related to the choice of the CNN architecture. Everyday images have no defined scale due to a variable z-dimension; therefore, the algorithms need to be able to detect the same object at different sizes. This clearly is not the case for histopathology slides, in which one pixel corresponds to a defined physical size at a given magnification. Therefore, possibly less complex CNN architectures may be sufficient for quantitative histopathology analyses, and also show better generalisation. Here, in our proof-of-concept analysis, we observed a considerable dependence of the feature representation on known and possibly unknown properties of our training data, including the image compression algorithm and its parameters. 
Some of these issues could be overcome by amending and retraining the network to isolate the effect of confounding factors and additional data augmentation. Still, given the flexibility of deep learning algorithms and the associated risk of overfitting, one should generally be cautious about the generalisation properties and critically assess whether a new image is appropriately represented. Looking forward, our analyses revealed the enormous potential of using computer vision alongside molecular profiling. While the eye of a trained human may still constitute the gold standard for recognising clinically relevant histopathological patterns, computers have the capacity to augment this process by sifting through millions of images to retrieve similar patterns and establish associations with known and novel traits. As our analysis showed, this helps to detect histopathology patterns associated with a range of genomic alterations, transcriptional signatures and prognosis, and to highlight areas indicative of these traits on each given slide. It is therefore not too difficult to foresee how this may be utilised in a computationally augmented histopathology workflow enabling more precise and faster diagnosis and prognosis. Further, the ability to quantify a rich set of histopathology patterns lays out a path to define integrated histopathology and molecular cancer subtypes, as recently demonstrated for colorectal cancers [47].

Lastly, our analyses provide proof-of-concept for these principles and we expect them to be greatly refined in the future based on larger training corpora and further algorithmic refinements.

Source: bioRxiv preprint (not peer-reviewed), first posted online Oct. 25, 2019; doi: http://dx.doi.org/10.1101/813543. Made available under a CC-BY-NC 4.0 International license.
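The two-step procedure described in the discussion above (a fine-tuned CNN yields image features, and a separate model then predicts a molecular trait from those features) can be sketched in miniature. In the sketch below, step one is replaced by hard-coded, hypothetical feature vectors, and step two is a plain logistic regression trained by gradient descent. All names and numbers are assumptions for illustration; this is not the PC-CHiP implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, lr=0.5, epochs=500):
    """Step 2: fit a logistic-regression head on fixed (frozen) features."""
    dim = len(features[0])
    n = len(features)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        # Batch gradient-descent update on the mean log-loss.
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 if the predicted trait probability exceeds 0.5, else 0."""
    return int(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) > 0.5)


# Step 1 stand-in: hypothetical 2-dimensional "image features" per tile.
# In the approach described above these would come from the fine-tuned CNN.
features = [[0.9, 0.1], [0.8, 0.3], [0.7, 0.2],
            [0.1, 0.9], [0.3, 0.8], [0.2, 0.7]]
labels = [1, 1, 1, 0, 0, 0]  # hypothetical trait status (e.g. WGD yes/no)

w, b = train_logistic(features, labels)
preds = [predict(w, b, x) for x in features]
print(preds)
```

Because the features are frozen, the second step can only use patterns that the first step happened to encode, which is the limitation the authors raise when they suggest training directly on the molecular trait or using multi-objective learning.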




Other related articles published in this Open Access Online Scientific Journal include the following: 


CancerBase.org – The Global HUB for Diagnoses, Genomes, Pathology Images: A Real-time Diagnosis and Therapy Mapping Service for Cancer Patients – Anonymized Medical Records accessible to anyone on Earth

Reporter: Aviva Lev-Ari, PhD, RN



631 articles had the keyword “Pathology” in their title




Single-cell RNA-seq helps in finding intra-tumoral heterogeneity in pancreatic cancer

Reporter and Curator: Dr. Sudipta Saha, Ph.D.


Pancreatic cancer is a significant cause of cancer mortality; therefore, the development of early diagnostic strategies and effective treatment is essential. Improvements in imaging technology, as well as the use of biomarkers, are changing the way that pancreatic cancer is diagnosed and staged. Although progress in treatment for pancreatic cancer has been incremental, development of combination therapies involving both chemotherapeutic and biologic agents is ongoing.


Cancer is an evolutionary disease, containing the hallmarks of an asexually reproducing unicellular organism subject to evolutionary paradigms. Pancreatic ductal adenocarcinoma (PDAC) is a particularly robust example of this phenomenon. Genomic features indicate that pancreatic cancer cells are selected for fitness advantages when encountering the geographic and resource-depleted constraints of the microenvironment. Phenotypic adaptations to these pressures help disseminated cells to survive in secondary sites, a major clinical problem for patients with this disease.


The immune system varies in cell types, states, and locations. The complex networks, interactions, and responses of immune cells produce diverse cellular ecosystems composed of multiple cell types, accompanied by genetic diversity in antigen receptors. Within this ecosystem, innate and adaptive immune cells maintain and protect tissue function, integrity, and homeostasis upon changes in functional demands and diverse insults. Characterizing this inherent complexity requires studies at single-cell resolution. Recent advances such as massively parallel single-cell RNA sequencing and sophisticated computational methods are catalyzing a revolution in our understanding of immunology.


PDAC is the most common type of pancreatic cancer and is characterized by high intra-tumoral heterogeneity and poor prognosis. In the study reported here, single-cell RNA-seq (scRNA-seq) was employed to comprehensively delineate PDAC intra-tumoral heterogeneity and the mechanisms underlying PDAC progression, yielding a transcriptomic atlas of 57,530 individual pancreatic cells from primary PDAC tumors and control pancreases. Diverse malignant and stromal cell types were identified in PDAC, including two ductal subtypes with abnormal and malignant gene expression profiles, respectively.


The researchers found that the heterogeneous malignant subtype was composed of several subpopulations with differential proliferative and migratory potentials. Cell trajectory analysis revealed that components of multiple tumor-related pathways and transcription factors (TFs) were differentially expressed along PDAC progression. Furthermore, a subset of ductal cells with unique proliferative features was found to be associated with an inactivation state in tumor-infiltrating T cells, providing novel markers for the prediction of antitumor immune response. Together, the findings provide a valuable resource for deciphering the intra-tumoral heterogeneity in PDAC and uncover a connection between tumor-intrinsic transcriptional state and T cell activation, suggesting potential biomarkers for anticancer treatments such as targeted therapy and immunotherapy.

















scPopCorn: A New Computational Method for Subpopulation Detection and their Comparative Analysis Across Single-Cell Experiments

Reporter and Curator: Dr. Sudipta Saha, Ph.D.


Present-day technological advances have created unprecedented opportunities for studying biological systems at single-cell resolution. For example, single-cell RNA sequencing (scRNA-seq) enables the measurement of transcriptomic information of thousands of individual cells in one experiment. Analyses of such data provide information that was not accessible using bulk sequencing, which can only assess average properties of cell populations. Single-cell measurements, in contrast, can capture the heterogeneity of a population of cells. In particular, single-cell studies allow for the identification of novel cell types, states, and dynamics.


One of the most prominent uses of the scRNA-seq technology is the identification of subpopulations of cells present in a sample and comparing such subpopulations across samples. Such information is crucial for understanding the heterogeneity of cells in a sample and for comparative analysis of samples from different conditions, tissues, and species. A frequently used approach is to cluster every dataset separately, inspect marker genes for each cluster, and compare these clusters in an attempt to determine which cell types were shared between samples. This approach, however, relies on the existence of predefined or clearly identifiable marker genes and their consistent measurement across subpopulations.
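The cluster-separately-then-compare workflow described above can be illustrated with a small sketch that matches clusters from two experiments by the overlap of their marker gene sets. The cluster names, the marker lists, the Jaccard score and the `min_overlap` threshold are all assumptions chosen for illustration; real analyses derive markers from differential expression and typically use more elaborate matching.

```python
def jaccard(a, b):
    """Jaccard overlap between two gene sets (0 = disjoint, 1 = identical)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_clusters(markers_a, markers_b, min_overlap=0.3):
    """For each cluster in experiment A, pick the best-overlapping cluster in B.

    Clusters whose best overlap falls below min_overlap stay unmatched (None).
    """
    matches = {}
    for name_a, genes_a in markers_a.items():
        best_name, best_score = None, 0.0
        for name_b, genes_b in markers_b.items():
            score = jaccard(genes_a, genes_b)
            if score > best_score:
                best_name, best_score = name_b, score
        if best_score < min_overlap:
            best_name = None
        matches[name_a] = (best_name, best_score)
    return matches


# Illustrative marker sets (hypothetical; gene symbols assumed to be already
# mapped to a shared naming scheme across the two experiments).
experiment_a = {
    "alpha": {"GCG", "TTR", "ARX"},
    "beta": {"INS", "IAPP", "MAFA"},
    "ductal": {"KRT19", "SOX9"},
}
experiment_b = {
    "c1": {"INS", "IAPP", "NKX6-1"},
    "c2": {"GCG", "TTR", "IRX2"},
    "c3": {"PTPRC"},
}

matches = match_clusters(experiment_a, experiment_b)
print(matches)
```

In this toy case the "ductal" cluster finds no partner because none of its markers were detected in the second experiment, which mirrors the limitation noted above: the approach depends on marker genes being clearly identifiable and consistently measured.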


An alternative is to first align the datasets globally and then cluster the aligned data to reveal subpopulations and their correspondence. However, solving the subpopulation-mapping problem by performing global alignment first and clustering second overlooks the original information about subpopulations existing in each experiment. An approach that addresses the problem directly might therefore be a more suitable solution. With this in mind, the researchers developed a computational method, single-cell subpopulations comparison (scPopCorn), that allows for comparative analysis of two or more single-cell populations.


The performance of scPopCorn was tested in three distinct settings. First, its potential was demonstrated in identifying and aligning subpopulations in human and mouse pancreatic single-cell data. Next, scPopCorn was applied to the task of aligning biological replicates of mouse kidney single-cell data, where it achieved the best performance among previously published tools. Finally, it was applied to compare populations of cells from cancer and healthy brain tissues, revealing the relation of neoplastic cells to neural cells and astrocytes. As a result of this integrative approach, scPopCorn provides a powerful tool for comparative analysis of single-cell populations.


scPopCorn is a computational method for identifying subpopulations of cells within individual single-cell experiments and mapping these subpopulations across experiments. Unlike other approaches, scPopCorn performs population identification and mapping simultaneously by optimizing a function that combines both objectives. When applied to complex biological data, scPopCorn outperforms previous methods. However, scPopCorn assumes that the input single-cell data consist of separable subpopulations; it is not designed for comparative analysis of single-cell trajectory datasets, which do not fulfill this constraint.
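To make the idea of "a function that combines both objectives" concrete, the sketch below scores a candidate solution with a single cost: how compact the subpopulations are within each experiment, plus how close the mapped subpopulations are across experiments. This is a toy construction for illustration only. The cost terms, the `weight` parameter and the data are assumptions, and the code does not reproduce scPopCorn's actual objective or optimizer.

```python
def centroid(points):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def within_cost(cells, assignment):
    """Sum of squared distances from each cell to its subpopulation centroid."""
    groups = {}
    for cell, label in zip(cells, assignment):
        groups.setdefault(label, []).append(cell)
    cents = {label: centroid(pts) for label, pts in groups.items()}
    cost = sum(sq_dist(c, cents[l]) for c, l in zip(cells, assignment))
    return cost, cents


def joint_cost(cells_a, assign_a, cells_b, assign_b, mapping, weight=1.0):
    """Toy joint objective: identification cost plus weighted mapping cost.

    mapping: dict from subpopulation label in A to its partner label in B.
    """
    cost_a, cents_a = within_cost(cells_a, assign_a)
    cost_b, cents_b = within_cost(cells_b, assign_b)
    map_cost = sum(sq_dist(cents_a[a], cents_b[b]) for a, b in mapping.items())
    return cost_a + cost_b + weight * map_cost


# Hypothetical 2-D "expression" coordinates for two experiments.
cells_a = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
cells_b = [[0.5, 0.0], [0.5, 1.0], [5.5, 5.0], [5.5, 6.0]]
assign = [0, 0, 1, 1]

good = joint_cost(cells_a, assign, cells_b, assign, {0: 0, 1: 1})
swapped = joint_cost(cells_a, assign, cells_b, assign, {0: 1, 1: 0})
print(good, swapped)
```

Because partitions and mapping enter one cost, an optimizer searching over both at once can let evidence from one experiment inform the subpopulation boundaries in the other, rather than fixing either step first.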


Several innovations developed in this work contributed to the performance of scPopCorn. First, unifying the above-mentioned tasks into a single problem statement allowed for integrating the signal from different experiments while identifying subpopulations within each experiment. This integration helps reduce biological and experimental noise. The researchers believe that the ideas introduced in scPopCorn not only enabled the design of a highly accurate approach to subpopulation identification and mapping, but can also provide a stepping stone for other tools that interrogate the relationships between single-cell experiments.














