Funding, Deals & Partnerships: BIOLOGICS & MEDICAL DEVICES; BioMed e-Series; Medicine and Life Sciences Scientific Journal – http://PharmaceuticalIntelligence.com
Reporter: Frason Francis Kalapurakal, Research Assistant II
Researchers from MIT and Technion have made a significant contribution to the field of machine learning by developing an adaptive algorithm that addresses the challenge of determining when a machine should follow a teacher’s instructions or explore on its own. The algorithm autonomously decides whether to use imitation learning, which involves mimicking the behavior of a skilled teacher, or reinforcement learning, which relies on trial and error to learn from the environment.
The researchers’ key innovation lies in the algorithm’s adaptability and ability to determine the most effective learning method throughout the training process. To achieve this, they trained two “students” with different learning approaches: one using a combination of reinforcement and imitation learning, and the other relying solely on reinforcement learning. The algorithm continuously compared the performance of these two students, adjusting the emphasis on imitation or reinforcement learning based on which student achieved better results.
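Below is a minimal, hypothetical sketch of the comparison idea described above, not the authors' actual TGRL implementation: two toy "students" are trained in parallel, and a balance coefficient is nudged toward whichever learning signal currently yields the better return. The environment, reward model, and update rules here are invented placeholders.

```python
import random

def evaluate(policy, noise=0.1):
    """Placeholder: return a noisy estimate of a policy's average return."""
    return policy["skill"] + random.gauss(0, noise)

def train_step(policy, use_teacher, teacher_weight):
    """Placeholder update: imitation pulls the student toward the teacher,
    while reinforcement learning improves it slowly by trial and error."""
    if use_teacher:
        policy["skill"] += 0.05 * teacher_weight   # imitation gain
    policy["skill"] += 0.01                        # RL gain from exploration

teacher_weight = 0.5                   # balance between imitation and RL
student_combined = {"skill": 0.0}      # trained with imitation + RL
student_rl_only = {"skill": 0.0}       # trained with RL only

for step in range(200):
    train_step(student_combined, use_teacher=True, teacher_weight=teacher_weight)
    train_step(student_rl_only, use_teacher=False, teacher_weight=0.0)

    # Compare the two students and shift emphasis toward whichever
    # learning signal is currently producing better results.
    if evaluate(student_combined) >= evaluate(student_rl_only):
        teacher_weight = min(1.0, teacher_weight + 0.01)
    else:
        teacher_weight = max(0.0, teacher_weight - 0.01)

print(f"final teacher weight: {teacher_weight:.2f}")
```

In this toy loop the weight drifts toward imitation whenever the teacher-guided student is ahead, which is the adaptive behavior the article describes.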
The algorithm’s efficacy was tested through simulated training scenarios, such as navigating mazes or reorienting objects with touch sensors. In all cases, the algorithm demonstrated superior performance compared to non-adaptive methods, achieving nearly perfect success rates and significantly outperforming other methods in terms of both accuracy and speed. This adaptability could enhance the training of machines in real-world situations where uncertainty is prevalent, such as robots navigating unfamiliar buildings or performing complex tasks involving object manipulation and locomotion.
Furthermore, the algorithm's potential applications extend beyond robotics to various domains where imitation or reinforcement learning is employed. For example, large language models like GPT-4 could be used as teachers to train smaller models to excel in specific tasks. The researchers also suggest that analyzing the similarities and differences between machines and humans learning from their respective teachers could provide valuable insights for improving the learning experience.

The MIT and Technion researchers' algorithm stands out for its principled approach, efficiency, and versatility across different domains. Unlike existing methods that require brute-force trial and error or manual tuning of parameters, their algorithm dynamically adjusts the balance between imitation and trial-and-error learning based on performance comparisons. This robustness, adaptability, and set of promising results make it a noteworthy advancement in the field of machine learning.
References:
“TGRL: TEACHER GUIDED REINFORCEMENT LEARNING ALGORITHM FOR POMDPS” Reincarnating Reinforcement Learning Workshop at ICLR 2023 https://openreview.net/pdf?id=kTqjkIvjj7
Concrete Problems in AI Safety by Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, Dan Mané https://arxiv.org/abs/1606.06565
Other related articles published in this Open Access Online Scientific Journal include the following:
92 articles in the Category:
‘Artificial Intelligence – Breakthroughs in Theories and Technologies’
The Vibrant Philly Biotech Scene: Proteovant Therapeutics Using Artificial Intelligence and Machine Learning to Develop PROTACs
Reporter: Stephen J. Williams, Ph.D.
It has been a while since I have added to this series, but there has been a plethora of exciting biotech startups in the Philadelphia area, including many new startups combining technology, biotech, and machine learning. One such exciting biotech is Proteovant Therapeutics, which is combining the new PROTAC (Proteolysis-Targeting Chimera) technology with its in-house ability to utilize machine learning and artificial intelligence to design these types of compounds against multiple intracellular targets.
PROTAC is actually a trademark of Arvinas Operations; compounds of this class are also referred to as protein degraders. PROTACs take advantage of the cell's protein homeostatic mechanism of ubiquitin-mediated protein degradation, a highly specific, targeted process that regulates the levels of various transcription factors, proto-oncogenes, and receptors. In essence, this regulated proteolytic process is needed for normal cellular function, and alterations in this process may lead to oncogenesis, or to a proteotoxic crisis leading to mitophagy, autophagy, and cell death. The key to this technology is using chemical linkers to associate an E3 ligase with a protein target of interest. E3 ligases catalyze the rate-limiting step in marking proteins bound for degradation by the proteasome with ubiquitin chains.
A review of this process as well as PROTACs can be found elsewhere in articles (and future articles) on this Open Access Journal.
Proteovant has formed two important collaborations:
Oncopia Therapeutics: came out of the University of Michigan Innovation Hub and the lab of Shaomeng Wang, who developed a library of BET- and MDM2-based protein degraders. Oncopia was acquired by Roivant Sciences in 2020.
Roivant Sciences: uses computer-aided design of protein degraders.
Proteovant Company Description:
Proteovant is a newly launched development-stage biotech company focusing on discovery and development of disease-modifying therapies by harnessing natural protein homeostasis processes. We have recently acquired numerous assets at discovery and development stages from Oncopia, a protein degradation company. Our lead program is on track to enter IND in 2021. Proteovant is building a strong drug discovery engine by combining deep drugging expertise with innovative platforms including Roivant’s AI capabilities to accelerate discovery and development of protein degraders to address unmet needs across all therapeutic areas. The company has recently secured $200M funding from SK Holdings in addition to investment from Roivant Sciences. Our current therapeutic focus includes but is not limited to oncology, immunology and neurology. We remain agnostic to therapeutic area and will expand therapeutic focus based on opportunity. Proteovant is expanding its discovery and development teams and has multiple positions in biology, chemistry, biochemistry, DMPK, bioinformatics and CMC at many levels. Our R&D organization is located close to major pharmaceutical companies in Eastern Pennsylvania with a second site close to biotech companies in Boston area.
The ubiquitin proteasome system (UPS) is responsible for maintaining protein homeostasis. Targeted protein degradation by the UPS is a cellular process that involves marking proteins and guiding them to the proteasome for destruction. We leverage this physiological cellular machinery to target and destroy disease-causing proteins.
Unlike traditional small molecule inhibitors, our approach is not limited by the classic “active site” requirements. For example, we can target transcription factors and scaffold proteins that lack a catalytic pocket. These classes of proteins, historically, have been very difficult to drug. Further, we selectively degrade target proteins, rather than isozymes or paralogous proteins with high homology. Because of the catalytic nature of the interactions, it is possible to achieve efficacy at lower doses with prolonged duration while decreasing dose-limiting toxicities.
Biological targets once deemed “undruggable” are now within reach.
Roivant develops transformative medicines faster by building technologies and developing talent in creative ways, leveraging the Roivant platform to launch “Vants” – nimble and focused biopharmaceutical and health technology companies. These Vants include Proteovant as well as Dermavant, Immunovant, and others.
Roivant’s drug discovery capabilities include the leading computational physics-based platform for in silico drug design and optimization as well as machine learning-based models for protein degradation.
The integration of our computational and experimental engines enables the rapid design of molecules with high precision and fidelity to address challenging targets for diseases with high unmet need.
Our current modalities include small molecules, heterobifunctionals and molecular glues.
Roivant Unveils Targeted Protein Degradation Platform
– First therapeutic candidate on track to enter clinical studies in 2021
– Computationally-designed degraders for six targets currently in preclinical development
– Acquisition of Oncopia Therapeutics and research collaboration with lab of Dr. Shaomeng Wang at the University of Michigan to add diverse pipeline of current and future compounds
– Clinical-stage degraders will provide foundation for multiple new Vants in distinct disease areas
– Platform supported by $200 million strategic investment from SK Holdings
Other articles in this Vibrant Philly Biotech Scene on this Online Open Access Journal include:
The Future of Speech-Based Human-Computer Interaction
Reporter: Ethan Coomber, Research Assistant III
2021 LPBI Summer Internship in Data Science and Podcast Library Development
This article reports on research conducted by the Tokyo Institute of Technology, published on 9 June 2021.
As technology continues to advance, the human-computer relationship develops alongside it. As researchers and developers find new ways to improve a computer's ability to recognize the distinct pitches that compose a human voice, the potential of technology begins to push past what people previously thought was possible. This constant improvement has allowed us to identify new potential challenges in voice-based technological interaction.
When humans interact with one another, we do not convey our message with only our voices. There are a multitude of complexities to our emotional states and personality that cannot be obtained simply through the sound coming out of our mouths. Aspects of our communication such as rhythm, tone, and pitch are essential in our understanding of one another. This presents a challenge to artificial intelligence as technology is not able to pick up on these cues.
In the modern day, our interactions with voice-based devices and services continue to increase. In this light, researchers at Tokyo Institute of Technology and RIKEN, Japan, have performed a meta-synthesis to understand how we perceive and interact with the voice (and the body) of various machines. Their findings have generated insights into human preferences, and can be used by engineers and designers to develop future vocal technologies.
– Kate Seaborn
While it will always be difficult for technology to perfectly replicate a human interaction, the inclusion of filler terms such as “I mean…”, “um” and “like…” has been shown to improve people's comfort when communicating with technology. Humans prefer communicating with agents that match their personality and overall communication style. The illusion of making the artificial intelligence appear human has a dramatic effect on the overall comfort of the person interacting with the technology. Communication also improves when the artificial intelligence comes across as happy or empathetic, for example through a higher-pitched voice.
Using machine learning, computers are able to recognize patterns within human speech rather than requiring programming for specific patterns. This allows the technology to adapt to human tendencies as it continues to encounter them. Over time, humans develop nuances in the way they speak and communicate, which frequently results in a tendency to shorten certain words. One of the more common examples is the expression “I don't know”, which is frequently reduced to “dunno”. Using machine learning, computers can recognize this pattern and infer the speaker's intention.
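As an illustration only (this is not the Tokyo Tech/RIKEN work), the pattern-recognition idea can be sketched as a tiny text classifier that learns to map reduced forms such as “dunno” to a canonical intent. The training phrases and intent labels below are invented.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented example utterances mapped to canonical intents.
utterances = ["dunno", "i don't know", "no idea", "gonna leave now",
              "going to leave", "i am leaving", "wanna stop", "want to stop"]
intents = ["unknown", "unknown", "unknown", "leaving",
           "leaving", "leaving", "stop", "stop"]

# Character n-grams let the model recognize reduced forms like "dunno"
# without hand-written rules for each contraction.
model = make_pipeline(
    CountVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    MultinomialNB(),
)
model.fit(utterances, intents)

print(model.predict(["i dunno really"]))   # expected: ['unknown']
```

A production system would of course learn from far larger speech corpora, but the principle of learning patterns rather than coding them by hand is the same.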
With advances in technology and the growing presence of voice assistants in our lives, we are expanding our interactions to include computer interfaces and environments. While many advances are still needed to reach the desired level of communication, developers have identified the necessary steps toward that human-computer interaction.
Sources:
Tokyo Institute of Technology. “The role of computer voice in the future of speech-based human-computer interaction.” ScienceDaily. ScienceDaily, 9 June 2021.
Developing Machine Learning Models for Prediction of Onset of Type-2 Diabetes
Reporter: Amandeep Kaur, B.Sc., M.Sc.
A recent study reports the development of an advanced AI algorithm that predicts the onset of type 2 diabetes up to five years in advance using routinely collected medical data. The researchers described their AI model as notable and distinctive because of its specific design, which performs assessments at the population level.
The first author Mathieu Ravaut, M.Sc. of the University of Toronto and other team members stated that “The main purpose of our model was to inform population health planning and management for the prevention of diabetes that incorporates health equity. It was not our goal for this model to be applied in the context of individual patient care.”
The research group collected data from 2006 to 2016 on approximately 2.1 million patients treated within the same healthcare system in Ontario, Canada. Even though the patients belonged to the same region, the authors highlighted that Ontario encompasses a large and diverse population.
The newly developed algorithm was trained on data from approximately 1.6 million patients, validated on data from about 243,000 patients, and tested on data from more than 236,000 patients. The data used to train the algorithm included each patient's medical history from the previous two years: prescriptions, medications, lab tests, and demographic information.
When predicting the onset of type 2 diabetes within five years, the model reached a test area under the receiver operating characteristic (ROC) curve of 80.26%.
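For readers unfamiliar with this kind of evaluation, here is a schematic sketch of the train/validation/test design and the ROC AUC metric reported above. It is not the authors' code: the synthetic data stands in for real administrative records, and the model choice is a placeholder.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for per-patient features derived from two years of
# administrative history (prescriptions, labs, demographics); the label marks
# onset of type 2 diabetes within five years.
X, y = make_classification(n_samples=20000, n_features=40, n_informative=10,
                           weights=[0.9, 0.1], random_state=0)

# Roughly mirror the reported design: separate training, validation, and test sets.
X_train, X_hold, y_train, y_hold = train_test_split(X, y, test_size=0.25, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_hold, y_hold, test_size=0.5, random_state=0)

model = GradientBoostingClassifier().fit(X_train, y_train)

val_auc = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])
test_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"validation AUC: {val_auc:.4f}, test AUC: {test_auc:.4f}")
```

An AUC of 0.80 (80%) means the model ranks a randomly chosen future case above a randomly chosen non-case about 80% of the time.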
The authors reported that “Our model showed consistent calibration across sex, immigration status, racial/ethnic and material deprivation, and a low to moderate number of events in the health care history of the patient. The cohort was representative of the whole population of Ontario, which is itself among the most diverse in the world. The model was well calibrated, and its discrimination, although with a slightly different end goal, was competitive with results reported in the literature for other machine learning–based studies that used more granular clinical data from electronic medical records without any modifications to the original test set distribution.”
This model could potentially improve healthcare systems in countries equipped with thorough administrative databases and help target specific cohorts that may be at risk of poor outcomes.
Research group stated that “Because our machine learning model included social determinants of health that are known to contribute to diabetes risk, our population-wide approach to risk assessment may represent a tool for addressing health disparities.”
Ravaut M, Harish V, Sadeghi H, et al. Development and Validation of a Machine Learning Model Using Administrative Health Data to Predict Onset of Type 2 Diabetes. JAMA Netw Open. 2021;4(5):e2111315. doi:10.1001/jamanetworkopen.2021.11315 https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2780137
Other related articles were published in this Open Access Online Scientific Journal, including the following:
AI in Drug Discovery: Data Science and Core Biology @Merck & Co., Inc., @GNS Healthcare, @QuartzBio, @Benevolent AI and Nuritas
Reporters: Aviva Lev-Ari, PhD, RN and Irina Robu, PhD
Cancer has been characterized as a heterogeneous disease consisting of many different subtypes. The early diagnosis and prognosis of a cancer type have become a necessity in cancer research, as they can facilitate the subsequent clinical management of patients. The importance of classifying cancer patients into high- or low-risk groups has led many research teams, from the biomedical and bioinformatics fields, to study the application of machine learning (ML) and artificial intelligence (AI) methods. These techniques have therefore been utilized to model the progression and treatment of cancerous conditions with new predictive algorithms.
In the majority of human cancers, heritable loss of gene function through cell division may be mediated as often by epigenetic as by genetic abnormalities. Epigenetic modification occurs through a process of interrelated changes in CpG island methylation and histone modifications. Candidate gene approaches of cell cycle, growth regulatory and apoptotic genes have shown epigenetic modification associated with loss of cognate proteins in sporadic pituitary tumors.
On 11 November 2020, researchers from the University of California, Irvine, advanced the understanding of epigenetic mechanisms in tumorigenesis and revealed a previously undetected repertoire of cancer driver genes. The study was published in Science Advances.
Researchers were able to identify novel tumor suppressor genes (TSGs) and oncogenes (OGs), particularly those with rare mutations, by using a new prediction algorithm called DORGE (Discovery of Oncogenes and tumor suppressor genes using Genetic and Epigenetic features), which integrates the most comprehensive collection of genetic and epigenetic data to date.
The senior author Wei Li, Ph.D., the Grace B. Bell chair and professor of bioinformatics in the Department of Biological Chemistry at the UCI School of Medicine said
Existing bioinformatics algorithms do not sufficiently leverage epigenetic features to predict cancer driver genes, even though epigenetic alterations are known to be associated with cancer driver genes.
The Study
This study demonstrated how cancer driver genes, predicted by DORGE, included both known cancer driver genes and novel driver genes not reported in current literature. In addition, researchers found that the novel dual-functional genes, which DORGE predicted as both TSGs and OGs, are highly enriched at hubs in protein-protein interaction (PPI) and drug/compound-gene networks.
Prof. Li explained that the DORGE algorithm successfully leveraged public data to discover the genetic and epigenetic alterations that play significant roles in cancer driver gene dysregulation, and that it could be instrumental in improving cancer prevention, diagnosis and treatment efforts in the future.
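As a conceptual sketch only, the integration idea behind this kind of driver-gene prediction can be illustrated as merging per-gene genetic and epigenetic feature tables and training a classifier against known driver labels. This is not the published DORGE implementation; all features, labels, and data here are synthetic.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
genes = [f"GENE{i}" for i in range(2000)]

# Synthetic stand-ins for per-gene genetic features (e.g., mutation rates)
# and epigenetic features (e.g., promoter methylation, histone-mark breadth).
genetic = pd.DataFrame(rng.normal(size=(2000, 5)), index=genes,
                       columns=[f"gen_{i}" for i in range(5)])
epigenetic = pd.DataFrame(rng.normal(size=(2000, 5)), index=genes,
                          columns=[f"epi_{i}" for i in range(5)])

# Toy labels: "drivers" are weakly linked to one genetic and one epigenetic feature.
is_driver = ((genetic["gen_0"] + epigenetic["epi_0"] + rng.normal(size=2000)) > 2).astype(int)

# Integrate the two feature spaces on the shared gene index and classify.
features = genetic.join(epigenetic)
clf = LogisticRegression(max_iter=1000, class_weight="balanced")
auc = cross_val_score(clf, features, is_driver, cv=5, scoring="roc_auc").mean()
print(f"cross-validated AUROC on toy data: {auc:.3f}")
```

The point of the sketch is simply that epigenetic columns sit alongside genetic ones in the same feature matrix, which is what allows genes with few mutations to still be flagged.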
Another new algorithmic approach to the identification of cancer genes by machine learning has been developed by a team of researchers at the Max Planck Institute for Molecular Genetics (MPIMG) in Berlin and the Institute of Computational Biology of Helmholtz Zentrum München, who combined a wide variety of data, analyzed it with artificial intelligence, and identified numerous cancer genes. They termed the algorithm EMOGI (Explainable Multi-Omics Graph Integration). EMOGI can predict which genes cause cancer, even if their DNA sequence is not changed. This opens up new perspectives for targeted cancer therapy in personalized medicine and for the development of biomarkers. The research was published in Nature Machine Intelligence on 12 April 2021.
In cancer, cells get out of control. They proliferate and push their way into tissues, destroying organs and thereby impairing essential vital functions. This unrestricted growth is usually induced by an accumulation of DNA changes in cancer genes—i.e. mutations in these genes that govern the development of the cell. But some cancers have only very few mutated genes, which means that other causes lead to the disease in these cases.
The aims of the study are represented by four main headings:
Additional targets for personalized medicine
Better results by combination
In search of hints for further studies
Suitable for other types of diseases as well
The team was headed by Annalisa Marsico. The team used the algorithm to identify 165 previously unknown cancer genes. The sequences of these genes are not necessarily altered; apparently, dysregulation of these genes alone can lead to cancer. All of the newly identified genes interact closely with well-known cancer genes and proved to be essential for the survival of tumor cells in cell culture experiments. EMOGI can also explain the relationships in the cell's machinery that make a gene a cancer gene. The software integrates tens of thousands of data sets generated from patient samples. These contain information about DNA methylation, the activity of individual genes, and the interactions of proteins within cellular pathways, in addition to sequence data with mutations. In these data, a deep-learning algorithm detects the patterns and molecular principles that lead to the development of cancer.
Marsico says
Ideally, we obtain a complete picture of all cancer genes at some point, which can have a different impact on cancer progression for different patients
Unlike traditional cancer treatments such as chemotherapy, personalized treatments are tailored to the exact type of tumor. “The goal is to choose the best treatment for each patient, the most effective treatment with the fewest side effects. In addition, molecular properties can be used to identify cancers that are already in the early stages.”
Roman Schulte-Sasse, a doctoral student on Marsico’s team and the first author of the publication says
To date, most studies have focused on pathogenic changes in sequence, or cell blueprints. At the same time, it has recently become clear that epigenetic perturbations or dysregulated gene activity can also lead to cancer.
For this reason, the researchers merged sequence data that reflects blueprint failures with information that represents events in cells. Initially, the scientists confirmed that mutations, or amplification of genomic segments, were the leading cause of cancer. Then, in a second step, they identified gene candidates that are less directly related to the genes that cause cancer.
Clues for future directions
The researcher’s new program adds a considerable number of new entries to the list of suspected cancer genes, which has grown to between 700 and 1,000 in recent years. It was only through a combination of bioinformatics analysis and the newest Artificial Intelligence (AI) methods that the researchers were able to track down the hidden genes.
Schulte-Sasse says, “The interactions of proteins and genes can be mapped as a mathematical network, known as a graph.” He explained this with the example of a railroad network: each station corresponds to a protein or gene, and each interaction between them is a train connection. With the help of deep learning – the very algorithms that have helped artificial intelligence make a breakthrough in recent years – the researchers were able to discover even those train connections that had previously gone unnoticed. Schulte-Sasse had the computer analyze tens of thousands of different network maps from 16 different cancer types, each containing between 12,000 and 19,000 data points.
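To make the “railroad network” picture concrete, here is a minimal numpy sketch of one graph-convolution step of the kind used in graph deep learning. It is a simplification for illustration, not the EMOGI architecture; the toy graph and features are invented.

```python
import numpy as np

# Toy protein-protein interaction graph: 4 genes, edges as an adjacency matrix.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Per-gene multi-omics features (e.g., mutation rate, methylation, expression).
X = np.random.rand(4, 3)

# One graph-convolution step: add self-loops, normalize the adjacency, and mix
# each node's features with those of its network neighbors.
A_hat = A + np.eye(4)
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt

W = np.random.rand(3, 2)              # learnable weights (random here)
H = np.maximum(A_norm @ X @ W, 0)     # ReLU of the propagated features
print(H.shape)                        # (4, 2): new per-gene representations
```

Stacking several such steps lets information travel along “train connections” that are two or three stations away, which is how a network-based model can flag a gene whose own sequence looks unremarkable.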
Many more interesting details are hidden in the data. Patterns that depend on the particular cancer and tissue were seen. The researchers interpreted this as evidence that tumors are triggered by different molecular mechanisms in different organs.
Marsico explains
The EMOGI program is not limited to cancer, the researchers emphasize. In theory, it can be used to integrate diverse sets of biological data and find patterns there. It could be useful to apply our algorithm for similarly complex diseases for which multifaceted data are collected and where genes play an important role. An example might be complex metabolic diseases such as diabetes.
Main Source
New prediction algorithm identifies previously undetected cancer driver genes
Deep Learning extracts Histopathological Patterns and accurately discriminates 28 Cancer and 14 Normal Tissue Types: Pan-cancer Computational Histopathology Analysis
Evolution of the Human Cell Genome Biology Field of Gene Expression, Gene Regulation, Gene Regulatory Networks and Application of Machine Learning Algorithms in Large-Scale Biological Data Analysis
Application of Natural Language Processing (NLP) on ~1MM cases of semi-structured echocardiogram reports: Identification of aortic stenosis (AS) cases – Accuracy comparison to administrative diagnosis codes (ICD-9/10 codes)
Reporter and Curator: Aviva Lev-Ari, PhD, RN
Large-Scale Identification of Aortic Stenosis and its Severity Using Natural Language Processing on Electronic Health Records
Background
Systematic case identification is critical to improving population health, but widely used diagnosis code–based approaches for conditions like valvular heart disease are inaccurate and lack specificity.
Objective
To develop and validate natural language processing (NLP) algorithms to identify aortic stenosis (AS) cases and associated parameters from semi-structured echocardiogram reports and compare their accuracy to administrative diagnosis codes.
Methods
Using 1003 physician-adjudicated echocardiogram reports from Kaiser Permanente Northern California, a large, integrated healthcare system (>4.5 million members), NLP algorithms were developed and validated to achieve positive and negative predictive values > 95% for identifying AS and associated echocardiographic parameters. Final NLP algorithms were applied to all adult echocardiography reports performed between 2008 and 2018 and compared to ICD-9/10 diagnosis code–based definitions for AS found from 14 days before to 6 months after the procedure date.
Results
A total of 927,884 eligible echocardiograms were identified during the study period among 519,967 patients. Application of the final NLP algorithm classified 104,090 (11.2%) echocardiograms with any AS (mean age 75.2 years, 52% women), with only 67,297 (64.6%) having a diagnosis code for AS between 14 days before and up to 6 months after the associated echocardiogram. Among those without associated diagnosis codes, 19% of patients had hemodynamically significant AS (ie, greater than mild disease).
Conclusion
A validated NLP algorithm applied to a systemwide echocardiography database was substantially more accurate than diagnosis codes for identifying AS. Leveraging machine learning–based approaches on unstructured electronic health record data can facilitate more effective individual and population management than using administrative data alone.
Keywords
Aortic stenosis; Echocardiography; Machine learning; Population health; Quality and outcomes; Valvular heart disease
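For illustration only, here is a heavily simplified, hypothetical version of the rule-based NLP idea: scan semi-structured echocardiogram report text for aortic stenosis mentions while excluding negated statements. The validated algorithm in the study is far more extensive; the patterns and example reports below are invented.

```python
import re

AS_PATTERN = re.compile(r"aortic (valve )?stenosis", re.IGNORECASE)
NEGATION = re.compile(r"\b(no|without|negative for)\b[^.]*aortic (valve )?stenosis",
                      re.IGNORECASE)
SEVERITY = re.compile(r"\b(mild|moderate|severe|critical)\b[^.]*aortic (valve )?stenosis",
                      re.IGNORECASE)

def classify_report(text):
    """Return (has_AS, severity) for one echocardiogram report (toy rules only)."""
    if NEGATION.search(text):
        return False, None
    if not AS_PATTERN.search(text):
        return False, None
    sev = SEVERITY.search(text)
    return True, (sev.group(1).lower() if sev else "unspecified")

print(classify_report("Findings: moderate aortic stenosis, AVA 1.2 cm2."))
print(classify_report("There is no aortic stenosis. Mild MR present."))
```

Even this toy example shows why NLP on report text can outperform billing codes: the severity qualifier and the negation live in the narrative, not in the administrative record.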
This session will provide information regarding methodologic and computational aspects of proteogenomic analysis of tumor samples, particularly in the context of clinical trials. Availability of comprehensive proteomic and matching genomic data for tumor samples characterized by the National Cancer Institute’s Clinical Proteomic Tumor Analysis Consortium (CPTAC) and The Cancer Genome Atlas (TCGA) program will be described, including data access procedures and informatic tools under development. Recent advances on mass spectrometry-based targeted assays for inclusion in clinical trials will also be discussed.
Amanda G Paulovich, Shankha Satpathy, Meenakshi Anurag, Bing Zhang, Steven A Carr
Methods and tools for comprehensive proteogenomic characterization of bulk tumor to needle core biopsies
Shankha Satpathy
TCGA has characterized 11,000 cancers with >20,000 somatic alterations, but only 128 proteins, as proteomics was still a young field
CPTAC is the NCI's proteomics effort
The chemical labeling approach is now the method of choice for quantitative proteomics
Looked at ovarian and breast cancers: to measure PTMs such as phosphorylation, sample preparation is critical
Data access and informatics tools for proteogenomics analysis
Bing Zhang
Raw and processed data (raw MS data) with linked clinical data can be extracted from CPTAC
Python scripts are available for bioinformatic programming
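As a generic illustration of the kind of analysis these resources enable (this is not a specific CPTAC API; the tables below are toy stand-ins for downloaded exports), proteomic and clinical data can be linked and queried with a few lines of pandas:

```python
import pandas as pd

# Toy stand-ins for a protein abundance matrix (samples x proteins) and a
# clinical table keyed by the same sample identifiers.
proteomics = pd.DataFrame(
    {"TP53": [1.2, 0.8, 1.5, 0.4], "EGFR": [2.1, 1.9, 0.7, 1.1]},
    index=["S1", "S2", "S3", "S4"],
)
clinical = pd.DataFrame(
    {"tumor_stage": ["II", "III", "II", "I"], "age": [54, 61, 47, 70]},
    index=["S1", "S2", "S3", "S4"],
)

# Link proteomic measurements to clinical annotations on the shared sample IDs.
merged = proteomics.join(clinical, how="inner")

# Example query: mean TP53 abundance per tumor stage.
print(merged.groupby("tumor_stage")["TP53"].mean())
```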
Pathways to clinical translation of mass spectrometry-based assays
Meenakshi Anurag
· Using kinase inhibitor pulldown (KIP) assay to identify unique kinome profiles
· Found single strand break repair defects in endometrial luminal cases, especially with immune checkpoint prognostic tumors
· Paper: JNCI 2019 analyzed 20,000 genes correlated with endocrine therapy (ET) resistance in luminal B cases (selected for a list of 30 genes)
· Validated in METABRIC dataset
· KIP assay uses magnetic beads to pull out kinases to determine druggable kinases
· Looked in xenografts and was able to pull out differential kinomes
· Matched with PDX data so good clinical correlation
· Were able to detect ESR1 fusion correlated with ER+ tumors
The adoption of omic technologies in the cancer clinic is giving rise to an increasing number of large-scale high-dimensional datasets recording multiple aspects of the disease. This creates the need for frameworks for translatable discovery and learning from such data. Like artificial intelligence (AI) and machine learning (ML) for the cancer lab, methods for the clinic need to (i) compare and integrate different data types; (ii) scale with data sizes; (iii) prove interpretable in terms of the known biology and batch effects underlying the data; and (iv) predict previously unknown experimentally verifiable mechanisms. Methods for the clinic, beyond the lab, also need to (v) produce accurate actionable recommendations; (vi) prove relevant to patient populations based upon small cohorts; and (vii) be validated in clinical trials. In this educational session we will present recent studies that demonstrate AI and ML translated to the cancer clinic, from prognosis and diagnosis to therapy.
NOTE: Dr. Fish’s talk is not eligible for CME credit to permit the free flow of information of the commercial interest employee participating.
Ron C. Anafi, Rick L. Stevens, Orly Alter, Guy Fish
Overview of AI approaches in cancer research and patient care
Rick L. Stevens
Deep learning is less likely to saturate as data increases
Deep learning attempts to learn multiple layers of information
The ultimate goal is prediction but this will be the greatest challenge for ML
ML models can integrate data validation and cross database validation
What limits the performance of cross validation is the internal noise of data (reproducibility)
Learning curves: it is not more data but more reproducible data that is important
Neural networks can outperform classical methods
Important to measure validation accuracy during training. Class weighting can assist in constructing the training set, especially for unbalanced data sets
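A short sketch of the class-weighting point for unbalanced training sets (illustrative only; the synthetic data stands in for a real cohort):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.utils.class_weight import compute_class_weight

# Synthetic, heavily unbalanced data set (roughly 5% positives).
X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Weight each class inversely to its frequency in the training set.
weights = compute_class_weight(class_weight="balanced", classes=np.unique(y_tr), y=y_tr)
print("class weights:", dict(zip(np.unique(y_tr), weights)))

for cw in (None, "balanced"):
    clf = LogisticRegression(class_weight=cw, max_iter=1000).fit(X_tr, y_tr)
    score = balanced_accuracy_score(y_te, clf.predict(X_te))
    print(f"class_weight={cw}: balanced accuracy = {score:.3f}")
```

The weighted model typically recovers far more of the rare class, which is the practical reason class weighting matters when positives (e.g., responders or rare outcomes) are scarce.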
Discovering genome-scale predictors of survival and response to treatment with multi-tensor decompositions
Orly Alter
Finding patterns using SVD component analysis. Gene and SVD patterns match 1:1
Comparative spectral decompositions can be used for global datasets
Validation of CNV data using this strategy
Found Ras, Shh and Notch pathways with altered CNV in glioblastoma which correlated with prognosis
These predictors were significantly better than independent prognostic indicators such as age at diagnosis
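A minimal numpy illustration of the SVD idea in these notes (not the speaker's multi-tensor method; the matrix is synthetic): decompose a genes x samples matrix and inspect the leading pattern.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy genes x samples expression (or copy-number) matrix.
data = rng.normal(size=(1000, 40))
# Plant a pattern: the first 50 genes are elevated in the first 20 samples.
data[:50, :20] += 3.0

# Singular value decomposition: U holds gene-side patterns ("eigengenes"),
# Vt holds sample-side patterns, and s their relative weights.
U, s, Vt = np.linalg.svd(data, full_matrices=False)

print("fraction of variation in the top component:",
      round(float(s[0]**2 / (s**2).sum()), 3))
print("top sample pattern, group means:",
      round(float(Vt[0, :20].mean()), 2), "vs", round(float(Vt[0, 20:].mean()), 2))
```

The planted gene/sample block shows up as the leading component, which is the sense in which gene-side and sample-side patterns match one to one.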
Identifying targets for cancer chronotherapy with unsupervised machine learning
Ron C. Anafi
Many clinicians have noticed that some patients do better when chemo is given at certain times of the day and felt there may be a circadian rhythm or chronotherapeutic effect with respect to side effects or with outcomes
ML was used to determine whether there is indeed a chronotherapy effect, and whether unstructured data can be used to determine molecular rhythms
Found circadian transcription patterns in human lung
Most cancer datasets come from a single clinical trial, so more trials might need to be conducted to take circadian rhythms into consideration
Stratifying patients by live-cell biomarkers with random-forest decision trees
Guy Fish CEO Cellanyx Diagnostics
Some clinicians feel we may be overdiagnosing and overtreating certain cancers, especially indolent disease
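A generic sketch of random-forest risk stratification from biomarker features (synthetic data; this is not Cellanyx's assay or model):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for per-patient live-cell biomarker measurements.
X, y = make_classification(n_samples=600, n_features=12, n_informative=5, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)

forest = RandomForestClassifier(n_estimators=300, random_state=1).fit(X_tr, y_tr)

# Stratify patients into low / intermediate / high risk from predicted probability.
risk = forest.predict_proba(X_te)[:, 1]
strata = np.digitize(risk, bins=[0.33, 0.66])   # 0 = low, 1 = intermediate, 2 = high
print("patients per risk stratum:", np.bincount(strata, minlength=3))
```

In a clinical setting the strata would feed decisions such as active surveillance versus intervention, which is exactly where overtreatment of indolent disease could be reduced.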
This educational session focuses on the chronic wound healing, fibrosis, and cancer “triad.” It emphasizes the similarities and differences seen in these conditions and attempts to clarify why sustained fibrosis commonly supports tumorigenesis. Importance will be placed on cancer-associated fibroblasts (CAFs), vascularity, extracellular matrix (ECM), and chronic conditions like aging. Dr. Dvorak will provide an historical insight into the triad field focusing on the importance of vascular permeability. Dr. Stewart will explain how chronic inflammatory conditions, such as the aging tumor microenvironment (TME), drive cancer progression. The session will close with a review by Dr. Cukierman of the roles that CAFs and self-produced ECMs play in enabling the signaling reciprocity observed between fibrosis and cancer in solid epithelial cancers, such as pancreatic ductal adenocarcinoma.
Harold F Dvorak, Sheila A Stewart, Edna Cukierman
The importance of vascular permeability in tumor stroma generation and wound healing
Harold F Dvorak
Aging in the driver’s seat: Tumor progression and beyond
Sheila A Stewart
Why won’t CAFs stay normal?
Edna Cukierman
Tuesday, June 23
3:00 PM – 5:00 PM EDT
Other Articles on this Open Access Online Journal on Cancer Conferences and Conference Coverage in Real Time Include
AI Acquisitions by Big Tech Firms Are Happening at a Blistering Pace: 2019 Recent Data by CBI Insights
Reporter: Stephen J. Williams, Ph.D.
3.4.16 AI Acquisitions by Big Tech Firms Are Happening at a Blistering Pace: 2019 Recent Data by CBI Insights, Volume 2 (Volume Two: Latest in Genomics Methodologies for Therapeutics: Gene Editing, NGS and BioInformatics, Simulations and the Genome Ontology), Part 3: AI in Medicine
Recent report from CBI Insights shows the rapid pace at which the biggest tech firms (Google, Apple, Microsoft, Facebook, and Amazon) are acquiring artificial intelligence (AI) startups, potentially compounding the AI talent shortage that exists.
The usual suspects are leading the race for AI: tech giants like Facebook, Amazon, Microsoft, Google, & Apple (FAMGA) have all been aggressively acquiring AI startups in the last decade.
Among the FAMGA companies, Apple leads the way, making 20 total AI acquisitions since 2010. It is followed by Google (the frontrunner from 2012 to 2016) with 14 acquisitions and Microsoft with 10.
Apple’s AI acquisition spree, which has helped it overtake Google in recent years, was essential to the development of new iPhone features. For example, FaceID, the technology that allows users to unlock their iPhone X just by looking at it, stems from Apple’s M&A moves in chips and computer vision, including the acquisition of AI company RealFace.
In fact, many of FAMGA’s prominent products and services came out of acquisitions of AI companies — such as Apple’s Siri, or Google’s contributions to healthcare through DeepMind.
That said, tech giants are far from the only companies snatching up AI startups.
Since 2010, there have been 635 AI acquisitions, as companies aim to build out their AI capabilities and capture sought-after talent (as of 8/31/2019).
The pace of these acquisitions has also been increasing. AI acquisitions saw a more than 6x uptick from 2013 to 2018, including last year’s record of 166 AI acquisitions — up 38% year-over-year.
In 2019, there have already been 140+ acquisitions (as of August), putting the year on track to beat the 2018 record at the current run rate.
Part of this increase in the pace of AI acquisitions can be attributed to a growing diversity in acquirers. Where once AI was the exclusive territory of major tech companies, today, smaller AI startups are becoming acquisition targets for traditional insurance, retail, and healthcare incumbents.
For example, in February 2018, Roche Holding acquired New York-based cancer startup Flatiron Health for $1.9B — one of the largest M&A deals in artificial intelligence. This year, Nike acquired AI-powered inventory management startup Celect, Uber acquired computer vision company Mighty AI, and McDonald’s acquired personalization platform Dynamic Yield.
Despite the increased number of acquirers, however, tech giants are still leading the charge. Acquisitive tech giants have emerged as powerful global corporations with a competitive advantage in artificial intelligence, and startups have played a pivotal role in helping these companies scale their AI initiatives.
Apple, Google, Microsoft, Facebook, Intel, and Amazon are the most active acquirers of AI startups, each acquiring 7+ companies.
To read more on recent Acquisitions in the AI space please see the following articles on this Open Access Online Journal
3.3.21 Multiple Barriers Identified Which May Hamper Use of Artificial Intelligence in the Clinical Setting, Volume 2 (Volume Two: Latest in Genomics Methodologies for Therapeutics: Gene Editing, NGS and BioInformatics, Simulations and the Genome Ontology), Part 2: CRISPR for Gene Editing and DNA Repair
In a commentary article from Jennifer Couzin-Frankel entitled “Medicine contends with how to use artificial intelligence,” the barriers to the efficient and reliable adoption of artificial intelligence and machine learning in the hospital setting are discussed. In summary, these barriers result from a lack of reproducibility across hospitals. For instance, a major concern among radiologists is that AI software developed to read images and magnify small changes, such as with cardiac images, is built within one hospital and may not reflect the equipment or standard practices used in other hospital systems. To address this issue, just recently, US scientists and government regulators issued guidance describing how to convert research-based AI into improved medical images and published this guidance in the Journal of the American College of Radiology. The group suggested greater collaboration among relevant parties in the development of AI practices, including software engineers, scientists, clinicians, and radiologists.
As thousands of images are fed into AI algorithms, according to neurosurgeon Eric Oermann at Mount Sinai Hospital, the signals they recognize can have less to do with disease than with other patient characteristics, the brand of MRI machine, or even how a scanner is angled. For example, Oermann and Mount Sinai developed an AI algorithm to detect spots on a lung scan indicative of pneumonia, and when tested in a group of new patients the algorithm could detect pneumonia with 93% accuracy.
However, when the group from Sinai tested their algorithm on tens of thousands of scans from other hospitals, including the NIH, the success rate fell to 73-80%, indicative of bias within the training set: in other words, there was something unique about the way Mt. Sinai performs its scans relative to other hospitals. Indeed, many of the patients Mt. Sinai sees are too sick to get out of bed, so radiologists use portable scanners, which generate different images than standalone scanners.
The results were published in Plos Medicine as seen below:
PLoS Med. 2018 Nov 6;15(11):e1002683. doi: 10.1371/journal.pmed.1002683. eCollection 2018 Nov.
Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: A cross-sectional study.
There is interest in using convolutional neural networks (CNNs) to analyze medical imaging to provide computer-aided diagnosis (CAD). Recent work has suggested that image classification CNNs may not generalize to new data as well as previously believed. We assessed how well CNNs generalized across three hospital systems for a simulated pneumonia screening task.
METHODS AND FINDINGS:
A cross-sectional design with multiple model training cohorts was used to evaluate model generalizability to external sites using split-sample validation. A total of 158,323 chest radiographs were drawn from three institutions: National Institutes of Health Clinical Center (NIH; 112,120 from 30,805 patients), Mount Sinai Hospital (MSH; 42,396 from 12,904 patients), and Indiana University Network for Patient Care (IU; 3,807 from 3,683 patients). These patient populations had an age mean (SD) of 46.9 years (16.6), 63.2 years (16.5), and 49.6 years (17) with a female percentage of 43.5%, 44.8%, and 57.3%, respectively. We assessed individual models using the area under the receiver operating characteristic curve (AUC) for radiographic findings consistent with pneumonia and compared performance on different test sets with DeLong’s test. The prevalence of pneumonia was high enough at MSH (34.2%) relative to NIH and IU (1.2% and 1.0%) that merely sorting by hospital system achieved an AUC of 0.861 (95% CI 0.855-0.866) on the joint MSH-NIH dataset. Models trained on data from either NIH or MSH had equivalent performance on IU (P values 0.580 and 0.273, respectively) and inferior performance on data from each other relative to an internal test set (i.e., new data from within the hospital system used for training data; P values both <0.001). The highest internal performance was achieved by combining training and test data from MSH and NIH (AUC 0.931, 95% CI 0.927-0.936), but this model demonstrated significantly lower external performance at IU (AUC 0.815, 95% CI 0.745-0.885, P = 0.001). To test the effect of pooling data from sites with disparate pneumonia prevalence, we used stratified subsampling to generate MSH-NIH cohorts that only differed in disease prevalence between training data sites. When both training data sites had the same pneumonia prevalence, the model performed consistently on external IU data (P = 0.88). When a 10-fold difference in pneumonia rate was introduced between sites, internal test performance improved compared to the balanced model (10× MSH risk P < 0.001; 10× NIH P = 0.002), but this outperformance failed to generalize to IU (MSH 10× P < 0.001; NIH 10× P = 0.027). CNNs were able to directly detect hospital system of a radiograph for 99.95% NIH (22,050/22,062) and 99.98% MSH (8,386/8,388) radiographs. The primary limitation of our approach and the available public data is that we cannot fully assess what other factors might be contributing to hospital system-specific biases.
CONCLUSION:
Pneumonia-screening CNNs achieved better internal than external performance in 3 out of 5 natural comparisons. When models were trained on pooled data from sites with different pneumonia prevalence, they performed better on new pooled data from these sites but not on external data. CNNs robustly identified hospital system and department within a hospital, which can have large differences in disease burden and may confound predictions.
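One practical takeaway from this abstract is to check whether a model can "read off" the hospital system rather than the pathology. A simple way to probe for such confounding is to train a classifier to predict the acquisition site from the same inputs; the sketch below is illustrative only, with synthetic features standing in for image-derived ones.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in for image-derived feature vectors from three hospital
# systems; each site's scanners shift the features slightly.
n_per_site, n_features = 300, 20
features, site_labels = [], []
for site, offset in enumerate([0.0, 0.5, 1.0]):
    features.append(rng.normal(loc=offset, size=(n_per_site, n_features)))
    site_labels += [site] * n_per_site
X, y = np.vstack(features), np.array(site_labels)

# If the acquisition site is predictable well above chance from the same inputs,
# a disease model trained on these features may be learning site signatures
# rather than pathology.
probe = RandomForestClassifier(n_estimators=200, random_state=0)
site_acc = cross_val_score(probe, X, y, cv=5).mean()
print(f"site prediction accuracy: {site_acc:.3f} (chance ~ {1/3:.3f})")
```

A near-perfect site probe, like the 99.9% hospital-system detection reported in the paper, is a warning sign that external performance will drop.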
Surprisingly, not many researchers have begun to use data obtained from different hospitals. The FDA has issued some guidance on the matter but considers “locked,” or unchanging, AI software to be a medical device. However, the agency has just announced development of a framework for regulating more cutting-edge software that continues to learn over time.
Still, the key point is that collaboration across multiple health systems in various countries may be necessary for the development of AI software that can be used in multiple clinical settings. Otherwise, each hospital will need to develop its own software, used only on its own system, which would create a regulatory headache for the FDA.
Other articles on Artificial Intelligence in Clinical Medicine on this Open Access Journal include:
This article is excerpted from the Harvard Business Review, May 28, 2019
By Moni Miyashita, Michael Brady
3.4.13 The Health Care Benefits of Combining Wearables and AI, Volume 2 (Volume Two: Latest in Genomics Methodologies for Therapeutics: Gene Editing, NGS and BioInformatics, Simulations and the Genome Ontology), Part 3: AI in Medicine
In southeast England, patients discharged from a group of hospitals serving 500,000 people are being fitted with a Wi-Fi-enabled armband that remotely monitors vital signs such as respiratory rate, oxygen levels, pulse, blood pressure, and body temperature.
Under a National Health Service pilot program that now incorporates artificial intelligence to analyze all that patient data in real time, hospital readmission rates are down, and emergency room visits have been reduced. What’s more, the need for costly home visits has dropped by 22%. Longer term, adherence to treatment plans has increased to 96%, compared to the industry average of 50%.
The AI pilot is targeting what Harvard Business School Professor and Innosight co-founder Clay Christensen calls “non-consumption.” These are opportunity areas where consumers have a job to be done that isn’t currently addressed by an affordable or convenient solution.
Before the U.K. pilot at the Dartford and Gravesham hospitals, for instance, home monitoring had involved dispatching hospital staffers to drive up to 90 minutes round-trip to check in with patients in their homes about once per week. But with algorithms now constantly searching for warning signs in the data and alerting both patients and professionals instantly, a new capability is born: providing healthcare before you even know you need it.
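A toy sketch of the kind of continuous monitoring rule described above (purely illustrative; the thresholds and data stream are invented): flag a patient when a vital sign drifts well outside its recent baseline.

```python
from collections import deque

def rolling_alert(stream, window=60, z_threshold=3.0):
    """Yield (index, value) whenever a reading deviates strongly from the
    rolling mean of the previous `window` readings (toy warning-sign rule)."""
    history = deque(maxlen=window)
    for i, value in enumerate(stream):
        if len(history) == window:
            mean = sum(history) / window
            var = sum((x - mean) ** 2 for x in history) / window
            std = var ** 0.5 or 1e-9
            if abs(value - mean) / std > z_threshold:
                yield i, value
        history.append(value)

# Invented respiratory-rate stream with a late deterioration.
stream = [16 + (i % 3) * 0.5 for i in range(200)] + [28, 29, 30]
for i, value in rolling_alert(stream):
    print(f"alert at reading {i}: respiratory rate {value}")
```

Real deployments layer learned models and clinical escalation rules on top of this idea, but the core is the same: continuous data plus instant alerting turns monitoring into early intervention.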
The biggest promise of artificial intelligence — accurate predictions at near-zero marginal cost — has rightly generated substantial interest in applying AI to nearly every area of healthcare. But not every application of AI in healthcare is equally well-suited to benefit. Moreover, very few applications serve as an appropriate strategic response to the largest problems facing nearly every health system: decentralization and margin pressure.
Take for example, medical imaging AI tools — an area in which hospitals are projected to spend $2 billion annually within four years. Accurately diagnosing diseases from cancers to cataracts is a complex task, with difficult-to-quantify but typically major consequences. However, the task is currently typically part of larger workflows performed by extensively trained, highly specialized physicians who are among some of the world’s best minds. These doctors might need help at the margins, but this is a job already being done. Such factors make disease diagnosis an extraordinarily difficult area for AI to create transformative change. And so the application of AI in such settings — even if beneficial to patient outcomes — is unlikely to fundamentally improve the way healthcare is delivered or to substantially lower costs in the near-term.
However, leading organizations seeking to decentralize care can deploy AI to do things that have never been done before. For example: There’s a wide array of non-acute health decisions that consumers make daily. These decisions do not warrant the attention of a skilled clinician but ultimately play a large role in determining patient’s health — and ultimately the cost of healthcare.
According to the World Health Organization, 60% of the factors related to individual health and quality of life are correlated with lifestyle choices, including taking prescriptions such as blood-pressure medications correctly, getting exercise, and reducing stress. Aided by AI-driven models, it is now possible to provide patients with interventions and reminders throughout this day-to-day process based on changes in the patient’s vital signs.
Home health monitoring itself isn’t new. Active programs and pilot studies are underway at leading institutions including Partners Healthcare, United Healthcare, and the Johns Hopkins School of Medicine, with positive results. But those efforts have yet to harness AI to make better judgments and recommendations in real time. Because of the massive volumes of data involved, machine learning algorithms are particularly well suited to scaling that task for large populations. After all, large sets of data are what power AI by making those algorithms smarter.
By deploying AI, for instance, the NHS program is not only able to scale up in the U.K. but also internationally. Current Health, the venture-capital backed maker of the patient monitoring devices used in the program, recently received FDA clearance to pilot the system in the U.S. and is now testing it with New York’s Mount Sinai Hospital. It’s part of an effort to reduce patient readmissions, which costs U.S. hospitals about $40 billion annually.
The early success of such efforts drives home three lessons in using AI to address non-consumption in the new world of patient-centric healthcare:
1) Focus on impacting critical metrics – for example, reducing costly hospital readmission rates.
Start small to home in on the goal of making an impact on a key metric tied to both patient outcomes and financial sustainability. As in the U.K. pilot, this can be done through a program with select hospitals or provider locations. In another case, Grady Hospital, the largest public hospital in Atlanta, points to $4M in savings from reducing readmission rates by 31% over two years thanks to the adoption of an AI tool that identifies at-risk patients. The system alerts clinical teams to initiate special patient touch points and interventions.
2) Reduce risk by relying on new kinds of partners.
Don’t try to do everything alone. Instead, form alliances with partners that are aiming to tackle similar problems. Consider the Synaptic Healthcare Alliance, a collaborative pilot program between Aetna, Ascension, Humana, Optum, and others. The alliance is using Blockchain to create a giant dataset across various health care providers, with AI trials on the data getting underway. The aim is to streamline health care provider data management with the goal of reducing the cost of processing claims while also improving access to care. Going it alone can be risky due to data incompatibility issues alone. For instance, the M.D. Anderson Cancer Center had to write off millions in costs for a failed AI project due in part to incompatibility with its electronic health records system. By joining forces, Synaptic’s dataset will be in a standard format that makes records and results transportable.
3) Use AI to collaborate, not compete, with highly-trained professionals.
Clinicians are often looking to augment their knowledge and reasoning, and AI can help. Many medical AI applications do actually compete with doctors. In radiology, for instance, some algorithms have performed image-based diagnosis as well as or better than human experts. Yet it’s unclear if patients and medical institutions will trust AI to automate that job entirely. A University of California at San Diego pilot in which AI successfully diagnosed childhood diseases more accurately than junior-level pediatricians still required senior doctors to personally review and sign off on the diagnosis. The real aim is always going to be to use AI to collaborate with clinicians seeking higher precision — not to try to replace them.
MIT and MGH have developed a deep learning model which identifies patients likely to develop breast cancer in the future. Learning from data on 60,000 prior patients, the AI system allows physicians to personalize their approach to breast cancer screening, essentially creating a detailed risk profile for each patient.
Taken together, these three lessons paired with solutions targeted at non-consumption have the potential to provide a clear path to effectively harnessing a technology that has been subject to rampant over-promising. Longer term, we believe one of the transformative benefits of AI will be deepening relationships between health providers and patients. The U.K. pilot, for instance, is resulting in more frequent proactive check-ins that never would have happened before. That’s good both for improving health and for customer loyalty in the emerging consumer-centric healthcare marketplace.