BioInformatic Resources at the Environmental Protection Agency: Tools and Webinars on Toxicity Prediction

Curator: Stephen J. Williams, Ph.D.

New GenRA Module in EPA’s CompTox Dashboard Will Help Predict Potential Chemical Toxicity

Published September 25, 2018

As part of its ongoing computational toxicology research, EPA is developing faster and improved approaches to evaluate chemicals for potential health effects.  One commonly applied approach is known as chemical read-across. Read-across uses information about how a chemical with known data behaves to make a prediction about the behavior of another chemical that is “similar” but does not have as much data. Current read-across, while cost-effective, relies on a subjective assessment, which leads to varying predictions and justifications depending on who undertakes and evaluates the assessment.

To reduce uncertainties and develop a more objective approach, EPA researchers have developed an automated read-across tool called Generalized Read-Across (GenRA), and added it to the newest version of the EPA Computational Toxicology Dashboard. The goal of GenRA is to encode as many expert considerations used within current read-across approaches as possible and combine these with data-driven approaches to transition read-across towards a more systematic and data-based method of making predictions.

EPA chemist Dr. Grace Patlewicz says it was this uncertainty that motivated the development of GenRA. “You don’t actually know if you’ve been successful at using read-across to help predict chemical toxicity because it’s a judgement call based on one person versus the next. That subjectivity is something we were trying to move away from,” she says.

Since toxicologists and risk assessors are already familiar with read-across, EPA researchers saw value in creating a tool that was aligned with the current read-across workflow but that addressed uncertainty using data analysis methods, in what they call a “harmonized-hybrid workflow.”

In its current form, GenRA lets users find analogues, or chemicals that are similar to their target chemical, based on chemical structural similarity. The user can then select which analogues they want to carry forward into the GenRA prediction by exploring the consistency and concordance of the underlying experimental data for those analogues. Next, the tool predicts toxicity effects of specific repeated dose studies. Then, a plot with these outcomes is generated based on a similarity-weighted activity of the analogue chemicals the user selected. Finally, the user is presented with a data matrix view showing whether a chemical is predicted to be toxic (yes or no) for a chosen set of toxicity endpoints, with a quantitative measure of uncertainty.
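
The similarity-weighted prediction step at the heart of this workflow can be pictured with a small sketch, assuming precomputed binary structural fingerprints and using Tanimoto similarity as the similarity metric. The fingerprints, analogue chemicals, and toxicity calls below are invented; this is a conceptual illustration, not the GenRA implementation.

```python
# Minimal sketch of similarity-weighted read-across (the idea behind GenRA).
# Fingerprints, analogue chemicals, and toxicity calls are hypothetical examples.

def tanimoto(fp_a: set, fp_b: set) -> float:
    """Jaccard/Tanimoto similarity between two binary fingerprints stored as bit sets."""
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)

# Analogue chemicals with known calls (1 = active, 0 = inactive) for a
# repeated-dose endpoint, plus their structural fingerprints.
analogues = {
    "chem_A": {"fp": {1, 4, 7, 9},  "tox": 1},
    "chem_B": {"fp": {1, 4, 8},     "tox": 0},
    "chem_C": {"fp": {1, 7, 9, 12}, "tox": 1},
}

target_fp = {1, 4, 7, 12}   # data-poor target chemical

# Similarity-weighted average of analogue activities.
weights = {name: tanimoto(target_fp, a["fp"]) for name, a in analogues.items()}
score = sum(w * analogues[n]["tox"] for n, w in weights.items()) / sum(weights.values())

print(f"weighted activity score = {score:.2f}")
print("predicted toxic" if score >= 0.5 else "predicted non-toxic")
```

In practice, confidence in the resulting yes/no call also depends on how many analogues were carried forward and how consistent their underlying data are, which is what the analogue-selection step described above is meant to expose.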

The team is also comparing chemicals based on other similarity contexts, such as physicochemical characteristics or metabolic similarity, as well as extending the approach to make quantitative predictions of toxicity.

Patlewicz thinks incorporating other contexts and similarity measures will refine GenRA to make better toxicity predictions, fulfilling the goal of creating a read-across method capable of assessing thousands of chemicals that currently lack toxicity data.

“That’s the direction that we’re going in,” Patlewicz says. “Recognizing where we are and trying to move towards something a little bit more objective, showing how aspects of the current read-across workflow could be refined.”

Learn more at: https://comptox.epa.gov

 

A listing of EPA Tools for Air Quality Assessment

Tools

  • Atmospheric Model Evaluation Tool (AMET)
    AMET helps in the evaluation of meteorological and air quality simulations.
  • Benchmark Dose Software (BMDS)
    EPA developed the Benchmark Dose Software (BMDS) as a tool to help estimate the dose or exposure of a chemical or chemical mixture associated with a given response level. The methodology is used by EPA risk assessors and is fast becoming the world’s standard for dose-response analysis in risk assessments, including air pollution risk assessments. (A small numerical illustration of the benchmark-dose idea appears after this tool list.)
  • BenMAP
    BenMAP is a Windows-based computer program that uses a Geographic Information System (GIS)-based approach to estimate the health impacts and economic benefits occurring when populations experience changes in air quality.
  • Community-Focused Exposure and Risk Screening Tool (C-FERST)
    C-FERST is an online tool developed by EPA in collaboration with stakeholders to provide access to resources that can be used with communities to help identify and learn more about their environmental health issues and explore exposure and risk reduction options.
  • Community Health Vulnerability Index
    EPA scientists developed a Community Health Vulnerability Index that can be used to help identify communities at higher health risk from wildfire smoke. Breathing smoke from a nearby wildfire is a health threat, especially for people with lung or heart disease, diabetes and high blood pressure as well as older adults, and those living in communities with poverty, unemployment and other indicators of social stress. Health officials can use the tool, in combination with air quality models, to focus public health strategies on vulnerable populations living in areas where air quality is impaired, either by wildfire smoke or other sources of pollution. The work was published in Environmental Science & Technology.
  • Critical Loads Mapper Tool
    The Critical Loads Mapper Tool can be used to help protect terrestrial and aquatic ecosystems from atmospheric deposition of nitrogen and sulfur, two pollutants emitted from fossil fuel burning and agricultural emissions. The interactive tool provides easy access to information on deposition levels through time; critical loads, which identify thresholds when pollutants have reached harmful levels; and exceedances of these thresholds.
  • EnviroAtlas
    EnviroAtlas provides interactive tools and resources for exploring the benefits people receive from nature or “ecosystem goods and services”. Ecosystem goods and services are critically important to human health and well-being, but they are often overlooked due to lack of information. Using EnviroAtlas, many types of users can access, view, and analyze diverse information to better understand the potential impacts of various decisions.
  • EPA Air Sensor Toolbox for Citizen Scientists
    EPA’s Air Sensor Toolbox for Citizen Scientists provides information and guidance on new low-cost compact technologies for measuring air quality. Citizens are interested in learning more about local air quality where they live, work and play. EPA’s Toolbox includes information about: Sampling methodologies; Calibration and validation approaches; Measurement methods options; Data interpretation guidelines; Education and outreach; and Low cost sensor performance information.
  • ExpoFIRST
    The Exposure Factors Interactive Resource for Scenarios Tool (ExpoFIRST) brings data from EPA’s Exposure Factors Handbook: 2011 Edition (EFH) to an interactive tool that maximizes flexibility and transparency for exposure assessors. ExpoFIRST represents a significant advance for regional, state, and local scientists in performing and documenting calculations for community and site-specific exposure assessments, including air pollution exposure assessments.
  • EXPOsure toolbox (ExpoBox)
    This is a toolbox created to assist individuals from within government, industry, academia, and the general public with assessing exposure, including exposure to air contaminants, fate and transport processes of air pollutants and their potential exposure concentrations. It is a compendium of exposure assessment tools that links to guidance documents, databases, models, reference materials, and other related resources.
  • Federal Reference & Federal Equivalency Methods
    EPA scientists develop and evaluate Federal Reference Methods and Federal Equivalency Methods for accurately and reliably measuring six primary air pollutants in outdoor air. These methods are used by states and other organizations to assess implementation actions needed to attain National Ambient Air Quality Standards.
  • Fertilizer Emission Scenario Tool for CMAQ (FEST-C)
    FEST-C facilitates the definition and simulation of new cropland farm management system scenarios or editing of existing scenarios to drive Environmental Policy Integrated Climate model (EPIC) simulations.  For the standard 12km continental Community Multi-Scale Air Quality model (CMAQ) domain, this amounts to about 250,000 simulations for the U.S. alone. It also produces gridded daily EPIC weather input files from existing hourly Meteorology-Chemistry Interface Processor (MCIP) files, transforms EPIC output files to CMAQ-ready input files and links directly to Visual Environment for Rich Data Interpretation (VERDI) for spatial visualization of input and output files. The December 2012 release will perform all these functions for any CMAQ grid scale or domain.
  • Instruction Guide and Macro Analysis Tool for Community-led Air Monitoring 
    EPA has developed two tools for evaluating the performance of low-cost sensors and interpreting the data they collect to help citizen scientists, communities, and professionals learn about local air quality.
  • Integrated Climate and Land use Scenarios (ICLUS)
    Climate change and land-use change are global drivers of environmental change. Impact assessments frequently show that interactions between climate and land-use changes can create serious challenges for aquatic ecosystems, water quality, and air quality. Population projections to 2100 were used to model the distribution of new housing across the landscape. In addition, housing density was used to estimate changes in impervious surface cover.  A final report, datasets, the ICLUS+ Web Viewer and ArcGIS tools are available.
  • Indoor Semi-Volatile Organic Compound (i-SVOC)
    i-SVOC Version 1.0 is a general-purpose software application for dynamic modeling of the emission, transport, sorption, and distribution of semi-volatile organic compounds (SVOCs) in indoor environments. i-SVOC supports a variety of uses, including exposure assessment and the evaluation of mitigation options. SVOCs are a diverse group of organic chemicals that can be found in:

    • Pesticides;
    • Ingredients in cleaning agents and personal care products;
    • Additives to vinyl flooring, furniture, clothing, cookware, food packaging, and electronics.

    Many are also present in indoor air, where they tend to bind to interior surfaces and particulate matter (dust).
  • Municipal Solid Waste Decision Support Tool (MSW DST)
    This tool is designed to aid solid waste planners in evaluating the cost and environmental aspects of integrated municipal solid waste management strategies. The tool is the result of collaboration between EPA and RTI International and its partners.
  • Optical Noise-Reduction Averaging (ONA) Program Improves Black Carbon Particle Measurements Using Aethalometers
    ONA is a program that reduces noise in real-time black carbon data obtained using Aethalometers. Aethalometers optically measure the concentration of light-absorbing or “black” particles that accumulate on a filter as air flows through it. These particles are produced by incomplete combustion of fossil fuels, biofuels, and biomass. Under polluted conditions, they appear as smoke or haze.
  • RETIGO tool
    Real Time Geospatial Data Viewer (RETIGO) is a free, web-based tool that shows air quality data that are collected while in motion (walking, biking or in a vehicle). The tool helps users overcome technical barriers to exploring air quality data. After collecting measurements, citizen scientists and other users can import their own data and explore the data on a map.
  • Remote Sensing Information Gateway (RSIG)
    RSIG offers a new way for users to get the multi-terabyte, environmental datasets they want via an interactive, Web browser-based application. A file download and parsing process that now takes months will be reduced via RSIG to minutes.
  • Simulation Tool Kit for Indoor Air Quality and Inhalation Exposure (IAQX)
    IAQX version 1.1 is an indoor air quality (IAQ) simulation software package that complements and supplements existing IAQ simulation programs. IAQX is for advanced users who have experience with exposure estimation, pollution control, risk assessment, and risk management. There are many sources of indoor air pollution, such as building materials, furnishings, and chemical cleaners. Since most people spend a large portion of their time indoors, it is important to be able to estimate exposure to these pollutants. IAQX helps users analyze the impact of pollutant sources and sinks, ventilation, and air cleaners. It performs conventional IAQ simulations to calculate the pollutant concentration and/or personal exposure as a function of time. It can also estimate adequate ventilation rates based on user-provided air quality criteria. This is a unique feature useful for product stewardship and risk management.
  • Spatial Allocator
    The Spatial Allocator provides tools that could be used by the air quality modeling community to perform commonly needed spatial tasks without requiring the use of a commercial Geographic Information System (GIS).
  • Traceability Protocol for Assay and Certification of Gaseous Calibration Standards
    This is used to certify calibration gases for ambient and continuous emission monitors. It specifies methods for assaying gases and establishing traceability to National Institute of Standards and Technology (NIST) reference standards. Traceability is required under EPA ambient and continuous emission monitoring regulations.
  • Watershed Deposition Mapping Tool (WDT)
    WDT provides an easy to use tool for mapping the deposition estimates from CMAQ to watersheds to provide the linkage of air and water needed for TMDL (Total Maximum Daily Load) and related nonpoint-source watershed analyses.
  • Visual Environment for Rich Data Interpretation (VERDI)
    VERDI is a flexible, modular, Java-based program for visualizing multivariate gridded meteorology, emissions, and air quality modeling data created by environmental modeling systems such as CMAQ and the Weather Research and Forecasting (WRF) model.
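
As noted in the BMDS entry above, the benchmark-dose idea can be shown with a small numerical sketch: assume a dose-response model, fix a benchmark response (here 10% added risk over background), and solve for the dose that produces it. The Hill-model parameters below are invented, and this is not the BMDS software, which fits several candidate models to real study data and also reports confidence limits such as the BMDL.

```python
# Toy illustration of the benchmark-dose concept behind BMDS (not the BMDS software).
# The Hill-model parameters are invented; a real analysis estimates them from study data.

def hill_response(dose, background=0.05, vmax=0.60, k=10.0, n=2.0):
    """Probability of an adverse response at a given dose under a simple Hill model."""
    return background + vmax * dose**n / (k**n + dose**n)

def benchmark_dose(bmr=0.10, lo=0.0, hi=100.0, tol=1e-6):
    """Dose giving `bmr` added risk over background, found by bisection."""
    target = hill_response(0.0) + bmr        # background risk plus the benchmark response
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if hill_response(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(f"BMD at 10% added risk: {benchmark_dose(0.10):.2f} dose units")
```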

 

Databases

  • Air Quality Data for the CDC National Environmental Public Health Tracking Network 
    EPA’s Exposure Research scientists are collaborating with the Centers for Disease Control and Prevention (CDC) on a CDC initiative to build a National Environmental Public Health Tracking (EPHT) network. Working with state, local and federal air pollution and health agencies, the EPHT program is facilitating the collection, integration, analysis, interpretation, and dissemination of data from environmental hazard monitoring, and from human exposure and health effects surveillance. These data provide scientific information to develop surveillance indicators, and to investigate possible relationships between environmental exposures, chronic disease, and other diseases, that can lead to interventions to reduce the burden of these illnesses. An important part of the initiative is air quality modeling estimates and air quality monitoring data, combined through Bayesian modeling, that can be linked with health outcome data. (A toy sketch of this kind of model-monitor data fusion appears after this list.)
  • EPAUS9R – An Energy Systems Database for use with the Market Allocation (MARKAL) Model
    The EPAUS9r is a regional database representation of the United States energy system. The database uses the MARKAL model. MARKAL is an energy system optimization model used by local and federal governments, national and international communities and academia. EPAUS9r represents energy supply, technology, and demand throughout the major sectors of the U.S. energy system.
  • Fused Air Quality Surfaces Using Downscaling
    This database provides access to the most recent O3 and PM2.5 surface datasets generated using downscaling.
  • Health & Environmental Research Online (HERO)
    HERO provides access to scientific literature used to support EPA’s integrated science assessments, including the Integrated Science Assessments (ISA) that feed into the National Ambient Air Quality Standards (NAAQS) reviews.
  • SPECIATE 4.5 Database
    SPECIATE is a repository of volatile organic gas and particulate matter (PM) speciation profiles of air pollution sources.
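
The Bayesian combination of model estimates and monitor data mentioned in the EPHT entry above can be illustrated with a toy normal-normal update: treat the gridded model value as a prior and nearby monitor readings as observations, then take the precision-weighted combination. The numbers are invented, and the operational EPA/CDC fusion method (a downscaling model) is considerably more sophisticated.

```python
# Toy normal-normal Bayesian update combining a model estimate with monitor data.
# Numbers are invented; the operational EPA/CDC fusion method is more elaborate.

def posterior_normal(prior_mean, prior_var, observations, obs_var):
    """Posterior mean and variance for a normal mean with known variances."""
    n = len(observations)
    post_var = 1.0 / (1.0 / prior_var + n / obs_var)
    post_mean = post_var * (prior_mean / prior_var + sum(observations) / obs_var)
    return post_mean, post_var

# Prior: gridded CMAQ-style model estimate of PM2.5 (ug/m3) for one grid cell.
model_mean, model_var = 12.0, 9.0
# Data: nearby monitor readings for the same period.
monitors = [9.8, 10.4, 10.1]
monitor_var = 4.0

mean, var = posterior_normal(model_mean, model_var, monitors, monitor_var)
print(f"fused PM2.5 estimate: {mean:.2f} ug/m3 (sd {var ** 0.5:.2f})")
```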

A listing of EPA Tools and Databases for Water Contaminant Exposure Assessment

Exposure and Toxicity

  • EPA ExpoBox (A Toolbox for Exposure Assessors)
    This toolbox assists individuals from within government, industry, academia, and the general public with assessing exposure from multiple media, including water and sediment. It is a compendium of exposure assessment tools that links to guidance documents, databases, models, reference materials, and other related resources.

  • Chemical and Product Categories (CPCat) Database
    CPCat is a database containing information mapping more than 43,000 chemicals to a set of terms categorizing their usage or function. The comprehensive list of chemicals with associated categories of chemical and product use was compiled from publicly available sources. Unique use-category taxonomies from each source are mapped onto a single common set of approximately 800 terms. Users can search for chemicals by chemical name, Chemical Abstracts Service (CAS) Registry Number, or by CPCat terms associated with chemicals.
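
Conceptually, the chemical-to-use-category mapping CPCat provides behaves like an indexed lookup that can be queried by identifier or by category term. The records, identifiers, and category terms in the sketch below are invented placeholders; this is not the CPCat schema or a real API.

```python
# Toy chemical-to-use-category index in the spirit of CPCat (not its actual API).
# Records, identifiers, and category terms below are invented for illustration.
from collections import defaultdict

records = [
    {"name": "chemical_x", "cas": "0000-00-1", "terms": {"personal_care", "fragrance"}},
    {"name": "chemical_y", "cas": "0000-00-2", "terms": {"pesticide"}},
    {"name": "chemical_z", "cas": "0000-00-3", "terms": {"personal_care", "cleaning"}},
]

# Build a reverse index: use-category term -> chemicals carrying that term.
by_term = defaultdict(list)
for rec in records:
    for term in rec["terms"]:
        by_term[term].append(rec["name"])

# Lookup by CAS-style identifier or by use-category term.
by_cas = {rec["cas"]: rec for rec in records}
print(by_cas["0000-00-2"]["terms"])   # categories assigned to one chemical
print(by_term["personal_care"])       # chemicals sharing a category
```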

A listing of EPA Tools and Databases for Chemical Toxicity Prediction & Assessment

  • Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS)
    SeqAPASS is a fast, online screening tool that allows researchers and regulators to extrapolate toxicity information across species. For some species, such as humans, mice, rats, and zebrafish, the EPA has a large amount of data regarding their toxicological susceptibility to various chemicals. However, the toxicity data for numerous other plants and animals is very limited. SeqAPASS extrapolates from these data-rich model organisms to thousands of other non-target species to evaluate their specific potential chemical susceptibility.
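
The extrapolation idea rests on comparing a data-rich species' target protein with the corresponding protein in other species. The sketch below scores invented, equal-length sequence fragments by simple percent identity as a stand-in; the real SeqAPASS workflow uses BLAST-style alignments of full sequences plus domain- and residue-level comparisons, so treat this only as an illustration of the concept.

```python
# Crude illustration of the cross-species comparison idea behind SeqAPASS.
# The toy sequences below are invented and simply compared position by position;
# SeqAPASS itself relies on BLAST-based alignments against curated protein data.

def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of matching positions between two equal-length sequences."""
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / max(len(seq_a), len(seq_b))

template = "MKTLLVAGGF"   # hypothetical human target-protein fragment
orthologs = {
    "mouse":     "MKTLLVAGGY",
    "zebrafish": "MKSLLVTGGF",
    "daphnia":   "MRSILVTAGF",
}

for species, seq in orthologs.items():
    pid = percent_identity(template, seq)
    print(f"{species:10s} {pid:5.1f}% identity -> "
          f"{'likely similar susceptibility' if pid >= 80 else 'lower confidence'}")
```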

 

A listing of EPA Webinar and Literature on Bioinformatic Tools and Projects

Comparative Bioinformatics Applications for Developmental Toxicology

Discusses how the US EPA/NCCT is trying to solve the problem of too many chemicals, too high a cost, and too much biological uncertainty, and the solution the ToxCast Program is proposing: a data-rich system to screen, classify, and rank chemicals for further evaluation.

https://cfpub.epa.gov/si/si_public_record_report.cfm?Lab=NCCT&dirEntryId=186844

CHEMOINFORMATIC AND BIOINFORMATIC CHALLENGES AT THE US ENVIRONMENTAL PROTECTION AGENCY.

This presentation will provide an overview of both the scientific program and the regulatory activities related to computational toxicology.

https://cfpub.epa.gov/si/si_public_record_report.cfm?Lab=NCCT&dirEntryId=154013

How Can We Use Bioinformatics to Predict Which Agents Will Cause Birth Defects?

The availability of genomic sequences from a growing number of human and model organisms has provided an explosion of data, information, and knowledge regarding biological systems and disease processes. High-throughput technologies such as DNA and protein microarray biochips are now standard tools for probing the cellular state and determining important cellular behaviors at the genomic/proteomic levels. While these newer technologies are beginning to provide important information on cellular reactions to toxicant exposure (toxicogenomics), a major challenge that remains is the formulation of a strategy to integrate transcript, protein, metabolite, and toxicity data. This integration will require new concepts and tools in bioinformatics. The U.S. National Library of Medicine’s PubMed site includes 19 million citations and abstracts and continues to grow. The BDSM team is now working on assembling the literature’s unstructured data into a structured database and linking it to BDSM within a system that can then be used for testing and generating new hypotheses. This effort will generate databases of entities (such as genes, proteins, metabolites, and gene ontology processes) linked to PubMed identifiers/abstracts, providing information on the relationships between them. The end result will be an online/standalone tool that will help researchers focus on the papers most relevant to their query and uncover hidden connections and obvious information gaps.

https://cfpub.epa.gov/si/si_public_record_report.cfm?Lab=NCCT&dirEntryId=227345
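
The entity-to-PubMed linking described in the abstract above amounts to building an index from extracted entity mentions to article identifiers, along with co-occurrence counts between entities. A toy version, with invented PMIDs and mention sets (this is not the BDSM system itself), might look like this:

```python
# Toy entity-to-PubMed index and co-occurrence counts (invented PMIDs and mentions).
from collections import defaultdict
from itertools import combinations

# Hypothetical (PMID, extracted entity mentions) pairs produced by text mining.
mentions = [
    ("PMID:1", {"SHH", "retinoic acid", "neural tube defect"}),
    ("PMID:2", {"SHH", "GLI3", "polydactyly"}),
    ("PMID:3", {"retinoic acid", "CYP26A1", "neural tube defect"}),
]

entity_to_pmids = defaultdict(set)
cooccurrence = defaultdict(int)
for pmid, entities in mentions:
    for ent in entities:
        entity_to_pmids[ent].add(pmid)
    for a, b in combinations(sorted(entities), 2):
        cooccurrence[(a, b)] += 1

print(entity_to_pmids["neural tube defect"])    # papers mentioning the outcome
print(cooccurrence[("SHH", "retinoic acid")])   # how often two entities co-occur
```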

ADVANCED PROTEOMICS AND BIOINFORMATICS TOOLS IN TOXICOLOGY RESEARCH: OVERCOMING CHALLENGES TO PROVIDE SIGNIFICANT RESULTS

This presentation specifically addresses the advantages and limitations of state of the art gel, protein arrays and peptide-based labeling proteomic approaches to assess the effects of a suite of model T4 inhibitors on the thyroid axis of Xenopus laevis.

https://cfpub.epa.gov/si/si_public_record_report.cfm?Lab=NHEERL&dirEntryId=152823

Bioinformatic Integration of in vivo Data and Literature-based Gene Associations for Prioritization of Adverse Outcome Pathway Development

Adverse outcome pathways (AOPs) describe a sequence of events, beginning with a molecular initiating event (MIE), proceeding via key events (KEs), and culminating in an adverse outcome (AO). A challenge for use of AOPs in a safety evaluation context has been identification of MIEs and KEs relevant for AOs observed in regulatory toxicity studies. In this work, we implemented a bioinformatic approach that leverages mechanistic information in the literature and the AOs measured in regulatory toxicity studies to prioritize putative MIEs and/or early KEs for AOP development relevant to chemical safety evaluation. The US Environmental Protection Agency Toxicity Reference Database (ToxRefDB, v2.0) contains effect information for >1000 chemicals curated from >5000 studies or summaries from sources including data evaluation records from the US EPA Office of Pesticide Programs, the National Toxicology Program (NTP), peer-reviewed literature, and pharmaceutical preclinical studies. To increase ToxRefDB interoperability, endpoint and effect information were cross-referenced with codes from the Unified Medical Language System (UMLS), which enabled mapping of in vivo pathological effects from ToxRefDB to PubMed (via Medical Subject Headings or MeSH). This enabled linkage to any resource that is also connected to PubMed or indexed with MeSH. A publicly available bioinformatic tool, the Entity-MeSH Co-occurrence Network (EMCON), uses multiple data sources and a measure of mutual information to identify genes most related to a MeSH term. Using EMCON, gene sets were generated for endpoints of toxicological relevance in ToxRefDB linking putative KEs and/or MIEs. The Comparative Toxicogenomics Database was used to further filter important associations. As a proof of concept, thyroid-related effects and their highly associated genes were examined, and demonstrated relevant MIEs and early KEs for AOPs to describe thyroid-related AOs. The ToxRefDB-to-gene mapping for thyroid resulted in >50 unique gene-to-chemical relationships. Integrated use of EMCON and ToxRefDB data provides a basis for rapid and robust putative AOP development, as well as a novel means to generate mechanistic hypotheses for specific chemicals. This abstract does not necessarily reflect U.S. EPA policy. Abstract and poster for the 2019 Society of Toxicology annual meeting, March 2019.

https://cfpub.epa.gov/si/si_public_record_report.cfm?Lab=NCCT&dirEntryId=344452
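
The "measure of mutual information" EMCON uses to rank genes against a MeSH term can be illustrated with a pointwise mutual information (PMI) calculation over abstract co-occurrence counts. The counts below are invented, and PMI is only a stand-in for EMCON's actual scoring:

```python
# Toy pointwise mutual information (PMI) between gene and MeSH-term occurrence,
# in the spirit of the EMCON approach described above. All counts are invented.
import math

n_docs = 10_000          # total abstracts considered
n_term = 250             # abstracts indexed with the MeSH term (e.g., a thyroid effect)
gene_counts = {          # gene: (abstracts mentioning gene, abstracts with gene AND term)
    "TPO":  (400, 60),
    "TSHR": (300, 30),
    "ACTB": (3_000, 80),
}

def pmi(n_gene, n_joint):
    p_joint = n_joint / n_docs
    p_gene, p_term = n_gene / n_docs, n_term / n_docs
    return math.log2(p_joint / (p_gene * p_term))

for gene, (n_gene, n_joint) in gene_counts.items():
    print(f"{gene:5s} PMI = {pmi(n_gene, n_joint):.2f}")
# A higher PMI flags genes (here TPO) that are specifically associated with the
# term, rather than broadly mentioned genes such as ACTB.
```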

A Web-Hosted R Workflow to Simplify and Automate the Analysis of 16S NGS Data

Next-Generation Sequencing (NGS) produces large data sets that include tens of thousands of sequence reads per sample. For analysis of bacterial diversity, 16S NGS sequences are typically analyzed in a workflow containing best-of-breed bioinformatics packages that may leverage multiple programming languages (e.g., Python, R, Java, etc.). The process to transform raw NGS data to usable operational taxonomic units (OTUs) can be tedious due to the number of quality control (QC) steps used in QIIME and other software packages for sample processing. Therefore, the purpose of this work was to simplify the analysis of 16S NGS data from a large number of samples by integrating QC, demultiplexing, and QIIME (Quantitative Insights Into Microbial Ecology) analysis in an accessible R project. User command line operations for each of the pipeline steps were automated into a workflow. In addition, the R server allows multi-user access to the automated pipeline via separate user accounts while providing access to the same large set of underlying data. We demonstrate the applicability of this pipeline automation using 16S NGS data from approximately 100 stormwater runoff samples collected in a mixed-land-use watershed in northeast Georgia. OTU tables were generated for each sample and the relative taxonomic abundances were compared for different periods over storm hydrographs to determine how the microbial ecology of a stream changes with the rise and fall of stream stage. Our approach simplifies the pipeline analysis of multiple 16S NGS samples by automating multiple preprocessing, QC, analysis, and post-processing command line steps that are called by a sequence of R scripts. Presented at ASM 2015: Rapid NGS Bioinformatic Pipelines for Enhanced Molecular Epidemiologic Investigation of Pathogens.

https://cfpub.epa.gov/si/si_public_record_report.cfm?Lab=NERL&dirEntryId=309890
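
The underlying pattern in this work is scripting a chain of command-line steps so that many samples can be pushed through QC, demultiplexing, and OTU picking without manual intervention. The original workflow does this with R scripts wrapping QIIME; the sketch below shows the same wrapping idea in Python with harmless placeholder commands (the sample names and commands are invented, not real QIIME invocations):

```python
# Sketch of scripting a multi-step NGS pipeline, analogous to the R workflow
# described above (which drives QIIME). Sample names are invented and each
# command is a harmless placeholder, not a real QIIME invocation.
import subprocess
import sys

SAMPLES = ["site01", "site02", "site03"]   # hypothetical storm-water samples

def run(step_name: str, cmd: list) -> None:
    """Run one pipeline step, stopping the whole workflow if the step fails."""
    print(f"[{step_name}] {' '.join(cmd)}")
    subprocess.run(cmd, check=True)

for sample in SAMPLES:
    # Stand-ins for quality control, demultiplexing, and OTU-picking commands.
    run("quality_control", [sys.executable, "-c", f"print('qc {sample}')"])
    run("demultiplex",     [sys.executable, "-c", f"print('demux {sample}')"])
    run("pick_otus",       [sys.executable, "-c", f"print('otus {sample}')"])
```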

DEVELOPING COMPUTATIONAL TOOLS NECESSARY FOR APPLYING TOXICOGENOMICS TO RISK ASSESSMENT AND REGULATORY DECISION MAKING.

Genomics, proteomics, and metabolomics can provide useful weight-of-evidence data along the source-to-outcome continuum when appropriate bioinformatic and computational methods are applied toward integrating molecular, chemical, and toxicological information.

https://cfpub.epa.gov/si/si_public_record_report.cfm?Lab=NCCT&dirEntryId=156264

The Human Toxome Project

The Human Toxome project, funded as an NIH Transformative Research grant 2011–2016, is focused on developing the concepts and the means for deducing, validating, and sharing molecular Pathways of Toxicity (PoT). Using the test case of estrogenic endocrine disruption, the responses of MCF-7 human breast cancer cells are being phenotyped by transcriptomics and mass-spectrometry-based metabolomics. The bioinformatics tools for PoT deduction represent a core deliverable. A number of challenges for quality and standardization of cell systems, omics technologies, and bioinformatics are being addressed. In parallel, concepts for annotation, validation, and sharing of PoT information, as well as their link to adverse outcomes, are being developed. A reasonably comprehensive public database of PoT, the Human Toxome Knowledgebase, could become a point of reference for toxicological research and regulatory test strategies.

https://cfpub.epa.gov/si/si_public_record_report.cfm?Lab=NCCT&dirEntryId=309453

High-Resolution Metabolomics for Environmental Chemical Surveillance and Bioeffect Monitoring

High-Resolution Metabolomics for Environmental Chemical Surveillance and Bioeffect Monitoring (Presented by: Dean Jones, PhD, Department of Medicine, Emory University) (2/28/2013)

https://www.epa.gov/chemical-research/high-resolution-metabolomics-environmental-chemical-surveillance-and-bioeffect

Identification of Absorption, Distribution, Metabolism, and Excretion (ADME) Genes Relevant to Steatosis Using a Gene Expression Approach

Absorption, distribution, metabolism, and excretion (ADME) impact chemical concentration and activation of molecular initiating events of Adverse Outcome Pathways (AOPs) in cellular, tissue, and organ-level targets. In order to better describe ADME parameters and how they modulate potential hazards posed by chemical exposure, our goal is to investigate the relationship between AOPs and ADME-related genes and functional information. Given the scope of this task, we began using hepatic steatosis as a case study. To identify ADME genes related to steatosis, we used the publicly available toxicogenomics database Open TG-GATEs™. This database contains standardized rodent chemical exposure data from 170 chemicals (mostly drugs), along with differential gene expression data and corresponding associated pathological changes. We examined the chemical exposure microarray data set gathered from 9 chemical exposure treatments resulting in pathologically confirmed (minimal, moderate, and severe) incidences of hepatic steatosis. From this data set, we used differential expression analyses to identify gene changes resulting from the chemical exposures leading to hepatic steatosis. We then selected differentially expressed genes (DEGs) related to ADME by filtering all genes based on their ADME functional identities. These DEGs include enzymes such as the cytochrome P450, UDP-glucuronosyltransferase, and flavin-containing monooxygenase families, and transporter genes such as the solute carrier and ATP-binding cassette transporter families. The up- and downregulated genes were identified across these treatments: a total of 61 genes were upregulated and 68 genes were downregulated in all treatments, while 25 genes were upregulated in some treatments and downregulated in others. This work highlights the application of bioinformatics in linking AOPs with gene modulation, specifically in relationship to ADME and exposures to chemicals. This abstract does not necessarily reflect U.S. EPA policy. This work highlights the application of bioinformatics tools to identify genes that are modulated by adverse outcomes. Specifically, we delineate a method to identify genes that are related to ADME and can impact target tissue dose in response to chemical exposures. The computational method outlined in this work is applicable to any adverse outcome pathway and provides a linkage between chemical exposure, target tissue dose, and adverse outcomes. Application of this method will allow for the rapid screening of chemicals for their impact on ADME-related genes using available gene databases in the literature.

https://cfpub.epa.gov/si/si_public_record_report.cfm?Lab=NHEERL&dirEntryId=341273
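
The filtering and tallying steps described in this abstract reduce to set operations over per-treatment DEG lists: restrict to ADME gene families, then ask which genes move consistently and which change direction between treatments. The treatments and fold-change directions below are invented placeholders, not Open TG-GATEs data:

```python
# Sketch of the ADME-gene filtering and up/down tallying described above.
# Treatments and fold-change directions are invented examples.

ADME_PREFIXES = ("CYP", "UGT", "FMO", "SLC", "ABC")   # crude family filter

# treatment -> {gene: "up" or "down"} for differentially expressed genes
degs = {
    "drug_1": {"CYP2B6": "up",   "SLC10A1": "down", "ALB": "down"},
    "drug_2": {"CYP2B6": "up",   "SLC10A1": "down", "UGT1A1": "up"},
    "drug_3": {"CYP2B6": "down", "SLC10A1": "down", "ABCB1": "up"},
}

def adme_only(gene_dirs):
    """Keep only genes whose symbols start with an ADME family prefix."""
    return {g: d for g, d in gene_dirs.items() if g.startswith(ADME_PREFIXES)}

filtered = {t: adme_only(g) for t, g in degs.items()}

# Genes seen in every treatment, split into consistently-down and mixed-direction sets.
common = set.intersection(*(set(g) for g in filtered.values()))
always_down = {g for g in common if all(filtered[t][g] == "down" for t in filtered)}
mixed = {g for g in common if len({filtered[t][g] for t in filtered}) > 1}

print("ADME DEGs per treatment:", {t: sorted(g) for t, g in filtered.items()})
print("down in all treatments:", always_down)   # SLC10A1 in this toy example
print("mixed direction:", mixed)                # CYP2B6 in this toy example
```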

Development of Environmental Fate and Metabolic Simulators

Presented at Bioinformatics Open Source Conference (BOSC), Detroit, MI, June 23-24, 2005.

https://cfpub.epa.gov/si/si_public_record_report.cfm?Lab=NERL&dirEntryId=257172

 

Useful Webinars on EPA Computational Tools and Informatics

 

Computational Toxicology Communities of Practice

Computational Toxicology Research

EPA’s Computational Toxicology Communities of Practice is composed of hundreds of stakeholders from over 50 public and private sector organizations (ranging from EPA, other federal agencies, industry, academic institutions, professional societies, nongovernmental organizations, environmental non-profit groups, state environmental agencies and more) who have an interest in using advances in computational toxicology and exposure science to evaluate the safety of chemicals.

The Communities of Practice is open to the public. Monthly webinars are held at EPA’s RTP campus, on the fourth Thursday of the month (occasionally rescheduled in November and December to accommodate holiday schedules), from 11am-Noon EST/EDT. Remote participation is available. For more information or to be added to the meeting email list, contact: Monica Linnenbrink (linnenbrink.monica@epa.gov).

Related Links

Past Webinar Presentations

Presentation File | Presented By | Date
OPEn structure-activity Relationship App (OPERA) PowerPoint (Video) | Dr. Kamel Mansouri, Lead Computational Chemist contractor for Integrated Laboratory Systems in the National Institute of Environmental Health Sciences | 2019/4/25
CompTox Chemicals Dashboard and InVitroDB V3 (Video) | Dr. Antony Williams, Chemist in EPA’s National Center for Computational Toxicology, and Dr. Katie Paul-Friedman, Toxicologist in EPA’s National Center for Computational Toxicology | 2019/3/28
The Systematic Empirical Evaluation of Models (SEEM) framework (Video) | Dr. John Wambaugh, Physical Scientist in EPA’s National Center for Computational Toxicology | 2019/2/28
ToxValDB: A comprehensive database of quantitative in vivo study results from over 25,000 chemicals (Video) | Dr. Richard Judson, Research Chemist in EPA’s National Center for Computational Toxicology | 2018/12/20
Sequence Alignment to Predict Across Species Susceptibility (seqAPASS) (Video) | Dr. Carlie LaLone, Bioinformaticist, EPA’s National Health and Environmental Effects Research Laboratory | 2018/11/29
Chemicals and Products Database (Video) | Dr. Kathie Dionisio, Environmental Health Scientist, EPA’s National Exposure Research Laboratory | 2018/10/25
CompTox Chemicals Dashboard V3 (Video) | Dr. Antony Williams, Chemist, EPA National Center for Computational Toxicology (NCCT) | 2018/09/27
Generalised Read-Across (GenRA) (Video) | Dr. Grace Patlewicz, Chemist, EPA National Center for Computational Toxicology (NCCT) | 2018/08/23
EPA’s ToxCast Owner’s Manual (Video) | Monica Linnenbrink, Strategic Outreach and Communication lead, EPA National Center for Computational Toxicology (NCCT) | 2018/07/26
EPA’s Non-Targeted Analysis Collaborative Trial (ENTACT) (Video) | Elin Ulrich, Research Chemist in the Public Health Chemistry Branch, EPA National Exposure Research Laboratory (NERL) | 2018/06/28
ECOTOX Knowledgebase: New Tools and Data Visualizations (Video) | Colleen Elonen, Translational Toxicology Branch, and Dr. Jennifer Olker, Systems Toxicology Branch, in the Mid-Continent Ecology Division of EPA’s National Health & Environmental Effects Research Laboratory (NHEERL) | 2018/05/24
Investigating Chemical-Microbiota Interactions in Zebrafish (Video) | Tamara Tal, Biologist in the Systems Biology Branch, Integrated Systems Toxicology Division, EPA’s National Health & Environmental Effects Research Laboratory (NHEERL) | 2018/04/26
The CompTox Chemistry Dashboard v2.6: Delivering Improved Access to Data and Real Time Predictions (Video) | Tony Williams, Computational Chemist, EPA’s National Center for Computational Toxicology (NCCT) | 2018/03/29
mRNA Transfection Retrofits Cell-Based Assays with Xenobiotic Metabolism (Video; audio starts at 10:17) | Steve Simmons, Research Toxicologist, EPA’s National Center for Computational Toxicology (NCCT) | 2018/02/22
Development and Distribution of ToxCast and Tox21 High-Throughput Chemical Screening Assay Method Description (Video) | Stacie Flood, National Student Services Contractor, EPA’s National Center for Computational Toxicology (NCCT) | 2018/01/25
High-throughput H295R steroidogenesis assay: utility as an alternative and a statistical approach to characterize effects on steroidogenesis (Video) | Derik Haggard, ORISE Postdoctoral Fellow, EPA’s National Center for Computational Toxicology (NCCT) | 2017/12/14
Systematic Review for Chemical Assessments: Core Elements and Considerations for Rapid Response (Video) | Kris Thayer, Director, Integrated Risk Information System (IRIS) Division of EPA’s National Center for Environmental Assessment (NCEA) | 2017/11/16
High Throughput Transcriptomics (HTTr) Concentration-Response Screening in MCF7 Cells (Video) | Joshua Harrill, Toxicologist, EPA’s National Center for Computational Toxicology (NCCT) | 2017/10/26
Learning Boolean Networks from ToxCast High-Content Imaging Data | Todor Antonijevic, ORISE Postdoc, EPA’s National Center for Computational Toxicology (NCCT) | 2017/09/28
Suspect Screening of Chemicals in Consumer Products | Katherine Phillips, Research Chemist, Human Exposure and Dose Modeling Branch, Computational Exposure Division, EPA’s National Exposure Research Laboratory (NERL) | 2017/08/31
The EPA CompTox Chemistry Dashboard: A Centralized Hub for Integrating Data for the Environmental Sciences (Video) | Antony Williams, Chemist, EPA’s National Center for Computational Toxicology (NCCT) | 2017/07/27
Navigating Through the Minefield of Read-Across Tools and Frameworks: An Update on Generalized Read-Across (GenRA) (Video)

 


Read Full Post »


Role of Informatics in Precision Medicine: Notes from Boston Healthcare Webinar: Can It Drive the Next Cost Efficiencies in Oncology Care?

Reporter: Stephen J. Williams, Ph.D.

 

Boston Healthcare recently sponsored a webinar entitled “Role of Informatics in Precision Medicine: Implications for Innovators.”  The webinar focused on the different informatics needs along the oncology care value chain, from drug discovery through clinicians, C-suite executives, and payers. The presentation, by Joseph Ferrara and Mark Girardi, discussed the specific informatics needs and deficiencies experienced by all players in oncology care and how innovators in this space could create value. The final part of the webinar discussed artificial intelligence and its role in cancer informatics.

 

Below is the mp4 video and audio for this webinar.  Notes on each of the slides with a few representative slides are also given below:

Please click below for the mp4 of the webinar:



  • worldwide oncology-related care is projected to increase by 40% by 2020
  • big movement to participatory care: decision making is moving to the patient, creating a need for information
  • cost components are focused on clinical action
  • using informatics before the clinical stage might add value to the cost chain


Key unmet needs, from the perspectives of different players in oncology care, where informatics may help in decision making:


  1. Needs of Clinicians

– informatics needs for clinical enrollment

– informatics needs for obtaining drug access/newer therapies

  2. Needs of C-suite/health system executives

– informatics needs to help focus on quality of care

– informatics needs to determine health outcomes/metrics

  3. Needs of Payers

– informatics needs to determine quality metrics and manage costs

– informatics needs to form guidelines

– informatics needs to determine whether biomarkers are used consistently and properly

– population-level data analytics (a minimal sketch of this kind of analysis follows below)
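
The payer-side items above, particularly population-level analytics and checking whether biomarkers are used consistently, lend themselves to simple claims-style analyses. Below is a minimal, hypothetical sketch (invented column names and toy data, not any payer's actual schema) of how one might compute biomarker testing and positivity rates by site:

```python
import pandas as pd

# Hypothetical claims/EHR extract: one row per patient treated with a targeted therapy.
# Column names and values are invented for illustration.
claims = pd.DataFrame({
    "site":               ["A", "A", "B", "B", "B", "C"],
    "therapy":            ["drug_X"] * 6,
    "biomarker_tested":   [True, True, False, True, False, True],
    "biomarker_positive": [1.0, 0.0, None, 1.0, None, 1.0],  # NaN when not tested
})

# Share of treated patients who received the companion biomarker test, and the
# positivity rate among those tested -- a simple consistency metric by site.
summary = claims.groupby("site").agg(
    treated=("therapy", "size"),
    tested_rate=("biomarker_tested", "mean"),
    positivity=("biomarker_positive", "mean"),
)
print(summary)
```

In a real setting these rates would be computed from claims or EHR extracts and tied back to guideline adherence and cost metrics.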


What kinds of value innovations do tech entrepreneurs need to create in this space? Two areas/problems need to be solved.

  • innovations in data depth and breadth
  • need to aggregate information to inform intervention

Different players in value chains have different data needs


Data Depth: Cumulative Understanding of disease

Data Breadth: Cumulative number of oncology transactions

  • technology innovators rely on LEGACY businesses (those that already have technology), and these legacy businesses have either data breadth or data depth, BUT NOT BOTH (is this where the greatest value can be innovated?)
  • NEED to provide ACTIONABLE as well as PHENOTYPIC/GENOTYPIC DATA
  • data depth is more important in the clinical setting, as it drives solutions and cost-effective interventions; for example, Foundation Medicine, which supplies genotypic/phenotypic data on patient samples, provides high data depth
  • technologies are moving to data support
  • evidence will need to be tied to umbrella value propositions
  • informatics solutions will have to prove outcome benefit


How will Machine Learning be involved in the healthcare value chain?

  • increased emphasis on real-time datasets – CONSTANT UPDATES NEED TO OCCUR; this is not yet happening but is valued by many players in this space
  • interoperability of DATABASES is important! Many players in this space don’t understand the complexities of integrating these datasets

Other Articles on this topic of healthcare informatics, value based oncology, and healthcare IT on this OPEN ACCESS JOURNAL include:

Centers for Medicare & Medicaid Services announced that the federal healthcare program will cover the costs of cancer gene tests that have been approved by the Food and Drug Administration

Broad Institute launches Merkin Institute for Transformative Technologies in Healthcare

HealthCare focused AI Startups from the 100 Companies Leading the Way in A.I. Globally

Paradoxical Findings in HealthCare Delivery and Outcomes: Economics in MEDICINE – Original Research by Anupam “Bapu” Jena, the Ruth L. Newhouse Associate Professor of Health Care Policy at HMS

Google & Digital Healthcare Technology

Can Blockchain Technology and Artificial Intelligence Cure What Ails Biomedical Research and Healthcare

The Future of Precision Cancer Medicine, Inaugural Symposium, MIT Center for Precision Cancer Medicine, December 13, 2018, 8AM-6PM, 50 Memorial Drive, Cambridge, MA

Live Conference Coverage @Medcity Converge 2018 Philadelphia: Oncology Value Based Care and Patient Management

2016 BioIT World: Track 5 – April 5 – 7, 2016 Bioinformatics Computational Resources and Tools to Turn Big Data into Smart Data

The Need for an Informatics Solution in Translational Medicine


Read Full Post »


Live Conference Coverage @Medcitynews Converge 2018 @Philadelphia: Promising Drugs and Breaking Down Silos

Reporter: Stephen J. Williams, PhD

Promising Drugs, Pricing and Access

The drug pricing debate rages on. What are the solutions to continuing to foster research and innovation, while ensuring access and affordability for patients? Will biosimilars and generics be able to expand market access in the U.S.?

Moderator: Bunny Ellerin, Director, Healthcare and Pharmaceutical Management Program, Columbia Business School
Speakers:
Patrick Davish, AVP, Global & US Pricing/Market Access, Merck
Robert Dubois M.D., Chief Science Officer and Executive Vice President, National Pharmaceutical Council
Gary Kurzman, M.D., Senior Vice President and Managing Director, Healthcare, Safeguard Scientifics
Steven Lucio, Associate Vice President, Pharmacy Services, Vizient

What is working and what needs to change in pricing models?

Robert:  He sees many players in the oncology space discovering new drugs while other drugs are going generic (that is what is working).  However, are we spending too much on cancer care relative to other diseases? (See their initiative, Going Beyond the Surface.)

Steven:  the advent of biosimilars is good for the industry

Patrick:  There is a large effort in oncology, maybe too much (750 trials on Keytruda), and he says pharma is spending on R&D (however, clinical trials take a large chunk of this money)

Robert: cancer has gotten a free ride, but cost per year relative to benefit looks different than for other diseases.  Are we overinvesting in cancer, or is that a societal decision?

Gary:  as we become more specific with precision medicines, high prices may be a result of our success in specifically targeting a mutation.  We need to understand the targeted drugs and their outcomes.

Patrick: “Cancer is the last big frontier,” but he says prices will come down in most cases.  He gives the example of Hep C treatment: previously the only therapeutic option was a very toxic yearlong treatment, but the newer drugs may be more cost-effective and safer

Steven: Our blockbuster drugs could diffuse the expense but now with precision we can’t diffuse the expense over a large number of patients

President’s Cancer Panel Recommendation

Six recommendations

  1. promoting value-based pricing
  2. enabling communication of costs
  3. addressing financial toxicity
  4. stimulating competition from biosimilars
  5. promoting value-based care
  6. investing in biomedical research

Patrick: the government pricing regime is hurting.  There are a lot of practical barriers, but Merck has over 200 studies on a cost basis

Robert:  many of the concerns/impetus on pricing started in Europe, which uses a set-price model (the EU won’t pay more than X for a drug). The US is moving more to outcomes-based pricing. For every one health outcome study, three studies did not show a benefit.  With cancer it is tricky to establish specific health outcomes.  Also, Medicare gets best-price status, so there needs to be a safe harbor for payers, and the biggest constraint is regulatory issues.

Steven: They all want value-based pricing, but we don’t have that yet, and there is a challenge in understanding the nuances of new therapies.  It is hard to align all the stakeholders until some legislation starts to change the reimbursement–clinic–patient–pharma obstacles.  Possibly the big data efforts discussed here may help align each stakeholder’s goals.

Gary: What data are necessary to understand what is happening to patients? Until we have that information, it will still be complicated to determine where investors in health care stand in this discussion

Robert (who sits on an ICER methods advisory board): 1) there is great concern about costs – how do we determine the fair value of a drug; 2) ICER is the only game in town, as other organizations only give recommendations; 3) ICER evaluates long-term value (cost per quality-adjusted life year) and budget impact (will people go bankrupt); 4) ICER is getting traction in the public eye and with advocates; 5) the problem is that ICER is not ready for prime time, as evidence keeps changing, it is unclear whether they keep societal factors in mind, and they don’t have total transparency in their methodology

Steven: We need more transparency into all the costs associated with the drug and therapy and value-based outcome.  Right now price is more of a black box.

Moderator: pointed to a recent study which showed that outpatient costs are going down while hospital-based care costs are rising rapidly (the cost of the site of care), so we need to figure out how to get people into lower-cost settings

Breaking Down Silos in Research

“Silo” is healthcare’s four-letter word. How are researchers, life science companies and others sharing information that can benefit patients more quickly? Hear from experts at institutions that are striving to tear down the walls that prevent data from flowing.

Moderator: Vini Jolly, Executive Director, Woodside Capital Partners
Speakers:
Ardy Arianpour, CEO & Co-Founder, Seqster @seqster
Lauren Becnel, Ph.D., Real World Data Lead for Oncology, Pfizer
Rakesh Mathew, Innovation, Research, & Development Lead, HealthShareExchange
David Nace M.D., Chief Medical Officer, Innovaccer

Seqster: Seqster is a secure platform that helps you and your family manage medical records, DNA, fitness, and nutrition data—all in one place. The founder has a genomic sequencing background but realized sequence information needs to be linked with medical records.

HealthShareExchange.org :

HealthShare Exchange envisions a trusted community of healthcare stakeholders collaborating to deliver better care to consumers in the greater Philadelphia region. HealthShare Exchange will provide secure access to health information to enable preventive and cost-effective care; improve quality of patient care; and facilitate care transitions. They have partnered with multiple players in healthcare field and have data on over 7 million patients.

Innovaccer

Data can be overwhelming, but it doesn’t have to be this way. To drive healthcare efficiency, we designed a modular suite of products for a smooth transition into a data-driven world within 4 weeks. Why does it take so much money to move data around and so slowly?

What is interoperability?

Ardy: We knew in genomics field how to build algorithms to analyze big data but how do we expand this from a consumer standpoint and see and share your data.

Lauren: how can we use the data between patients, doctors, and researchers?  On the research side, genomics represents only 2% of the data.  Silos are one issue, but the standards for data (collection, curation, analysis) are not yet set. We still need to improve semantic interoperability. For example, Flatiron had good annotated data on male metastatic breast cancer.

David: There are three layers: technical interoperability (platform), semantic interoperability (meaning or word usage), and format (syntactic) interoperability (data structure).  There is technical interoperability between health systems, and some semantic interoperability, but formats are all different (pharmacies use different systems and write different prescriptions using different suppliers).  In any value-based contract this problem is a big issue (if we are going to pay you based on the quality of your performance, then there is a big need to coordinate across platforms).  We can solve it by bringing data together in real time in one place, using mapping to integrate the formats (with quality control), and then making the data democratized among players.

Rakesh:  Patients’ data should follow the patient. Across Philadelphia’s 12 health systems, we had a challenge making data interoperable, so they told providers not to use portals and made sure hospitals were sending standardized data. Health care data is complex.

David: 80% of clinical data is noise; for example, most electronic medical records are free text. Another problem is defining a patient identifier, which the US has not adopted.
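
David’s three layers of interoperability can be made concrete with a toy example. In the sketch below, the field names, drug codes, and mapping table are all invented for illustration: two hypothetical pharmacy feeds with different layouts and vocabularies are normalized into one schema, which is the kind of mapping-plus-quality-control step he describes.

```python
from datetime import datetime

# Two hypothetical pharmacy feeds with different field names (syntactic differences)
# and different drug vocabularies (semantic differences). All values are invented.
FEED_A = {"pt_id": "123", "rx": "atorvastatin 20mg", "fill_dt": "2018-06-01"}
FEED_B = {"patientId": "123", "drugCode": "ATORVA-20", "dispensed": "06/01/2018"}

# Semantic layer: map each local drug label to one common term.
DRUG_MAP = {"atorvastatin 20mg": "atorvastatin_20mg", "ATORVA-20": "atorvastatin_20mg"}

def normalize_a(rec):
    """Map feed A's layout onto a common schema."""
    return {
        "patient_id": rec["pt_id"],
        "drug": DRUG_MAP[rec["rx"]],
        "fill_date": datetime.strptime(rec["fill_dt"], "%Y-%m-%d").date(),
    }

def normalize_b(rec):
    """Map feed B's layout onto the same common schema."""
    return {
        "patient_id": rec["patientId"],
        "drug": DRUG_MAP[rec["drugCode"]],
        "fill_date": datetime.strptime(rec["dispensed"], "%m/%d/%Y").date(),
    }

# After normalization the two records are directly comparable.
print(normalize_a(FEED_A) == normalize_b(FEED_B))  # True
```

Once both feeds share one schema, downstream value-based-contract reporting can treat them as a single dataset.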


Please follow on Twitter using the following #hashtags and @pharma_BI

#MCConverge

#cancertreatment

#healthIT

#innovation

#precisionmedicine

#healthcaremodels

#personalizedmedicine

#healthcaredata

And at the following handles:

@pharma_BI

@medcitynews

Read Full Post »


Live Conference Coverage Medcity Converge 2018 Philadelphia: Clinical Trials and Mega Health Mergers

Reporter: Stephen J. Williams, PhD

1:30 – 2:15 PM Clinical Trials 2.0

The randomized, controlled clinical trial is the gold standard, but it may be time for a new model. How can patient networks and new technology be leveraged to boost clinical trial recruitment and manage clinical trials more efficiently?

Moderator: John Reites, Chief Product Officer, Thread @johnreites
Speakers:
Andrew Chapman M.D., Chief of Cancer Services , Sidney Kimmel Cancer Center, Thomas Jefferson University Hospital
Michelle Longmire, M.D., Founder, Medable @LongmireMD
Sameek Roychowdhury MD, PhD, Medical Oncologist and Researcher, Ohio State University Comprehensive Cancer Center @OSUCCC_James

 

Michelle: Medable is creating a digital surrogate biomarker as a short-term endpoint for cardiology clinical trials, as well as a virtual-site clinical trial design (independent of geography)

Sameek:  OSU is developing RNASeq tests for oncogenic fusions that are actionable

John: ability to use various technologies to conduct telehealth and tele-trials.  So why are we talking about Clinical Trials 2.0?

Andrew: We are not meeting many patients’ needs.  Providers also have workloads that prevent the efficient running of a clinical trial.

Michelle:  Personalized medicine: what is the framework for how we conduct clinical trials in this new paradigm?

Sameek: How do we find those rare patients outside of a health network?  A fragmented health system is hurting patient recruitment efforts.

Wout: The Christmas Tree paradigm: collecting data points based on previous studies may lead to unnecessary criteria for patient recruitment

Sameek:  OSU has a cancer network (Orion) with a 95% recruitment success rate.  Over the Orion network, sequencing is performed at $10,000 per patient, with the cost reimbursed through the network.  The network helps pharma companies find patients and patients find drugs

Wout: reaching out to different stakeholders

John: what he sees in 2.0 is the use of technology.  They took on the business of 12 clinics, integrated these sites, and were able to improve the patient experience, which helped recruitment into trials.  Now, after a patient is recruited, how does the 2.0 model work?

Sameek:  since we work with pharma companies, what if we bring in patients from all over the US?  How do we continue to take care of them?

Andrew: utilizing a technology is critically important for tele-health to work and for tele-clinical trials to work

Michelle:  the utilization of tele-health by patients is rather low.

Wout:  We are looking for insights into the data.  So we are concentrated on collecting the data and not decision trees.

John: What is a barrier to driving Clinical Trial 2.0?

Andrew: The complexity is a barrier to the patient.  Need to show the simplicity of this.  Need to match trials within a system.

Sameek: Data sharing incentives might not be there, or the value is not recognized by all players.  And it is hard to figure out how to share the data in the most efficient way.

Wout: A key issue is to think locally and act globally, but healthcare is the inverse of this; there are so many stakeholders, and adoption by all stakeholders takes time

Michelle: accessibility of healthcare data by patients is revolutionary.  Medical training in the US does not train doctors in communicating the value of a trial

John: we are in a value-driven economy.  You have to give a lot to get something in this economy. Final comments?

Sameek: we need fundamental research on the validity of Clinical Trials 2.0.

Wout:  Use tools to mine manually but don’t do everything manually, not underlying tasks

Andrew: Show value to patient

2:20-3:00 PM CONVERGEnce on Steroids: Why Comcast and Independence Blue Cross?

This year has seen a great deal of convergence in health care.  One of the most innovative collaborations announced was that of Cable and Media giant Comcast Corporation and health plan Independence Blue Cross.  This fireside chat will explore what the joint venture is all about, the backstory of how this unlikely partnership came to be, and what it might mean for our industry.

sponsored by Independence Blue Cross @IBX 

Moderator: Tom Olenzak, Managing Director Strategic Innovation Portfolio, Independence Blue Cross @IBX
Speakers:
Marc Siry, VP, Strategic Development, Comcast
Michael Vennera, SVP, Chief Information Officer, Independence Blue Cross

Comcast and Independence Blue Cross are teaming up to form an independent health firm to bring the various players in healthcare onto a platform that gives people a clear path to managing their healthcare.  It’s not just about a payer and an information system but an ecosystem within Philadelphia and across the nation.

Michael:  Around 2015, at a health innovation conference, they came together to produce a demo of how they envision the future of healthcare.

Marc: When we think of a customer we think of the household. So we thought about aggregating services to people in health.  How do people interact with their healthcare system?

What are the risks for bringing this vision to reality?

Michael: Key to experience is how to connect consumer to caregiver.

How do we aggregate the data, and present it in a way to consumer where it is actionable?

How do we help the patient to know where to go next?

Marc: The concept is ubiquity: not just the app, nor asking the provider to ask the patient to download and use the app, but using our platform to extend it over all forms of media. They did a study with an insurer on metabolic syndrome and people’s viewing habits.  So when you can combine the expertise of IBX and the scale of the Comcast platform, you can provide a great amount of usable data.

Michael: Analytics will be a prime importance of the venture.

Tom:  We look at lots of companies that try to pitch technologies, but they don’t understand that healthcare is a human problem, not a tech problem.  What have you learned?

Marc: The adoption rate of new tech by doctors is very low, as they are very busy.  Understanding the clinician’s workflow is important, and learning how not to disrupt that workflow was humbling for us.

Michael:  The speed at which big tech companies can integrate and innovate new technologies is very rapid, something we did not understand.  We want to get this off the ground locally but want to take this solution nationally and globally.

Marc:  We are not in competition with local startups; we are looking to work with them to build scale and operability, so startups need to show how they can scale up.  This joint venture is designed to look at these ideas.  However, it will take a while before we open up the ecosystem, until we can see how they would add value. There are also challenges with small companies working with large organizations.

 

Please follow on Twitter using the following #hashtags and @pharma_BI

#MCConverge

#cancertreatment

#healthIT

#innovation

#precisionmedicine

#healthcaremodels

#personalizedmedicine

#healthcaredata

And at the following handles:

@pharma_BI

@medcitynews

 

Please see related articles on Live Coverage of Previous Meetings on this Open Access Journal

LIVE – Real Time – 16th Annual Cancer Research Symposium, Koch Institute, Friday, June 16, 9AM – 5PM, Kresge Auditorium, MIT

Real Time Coverage and eProceedings of Presentations on 11/16 – 11/17, 2016, The 12th Annual Personalized Medicine Conference, HARVARD MEDICAL SCHOOL, Joseph B. Martin Conference Center, 77 Avenue Louis Pasteur, Boston

Tweets Impression Analytics, Re-Tweets, Tweets and Likes by @AVIVA1950 and @pharma_BI for 2018 BioIT, Boston, 5/15 – 5/17, 2018

BIO 2018! June 4-7, 2018 at Boston Convention & Exhibition Center

https://pharmaceuticalintelligence.com/press-coverage/

 

Read Full Post »


Reporter: Stephen J. Williams, PhD

10:00-10:45 AM The Davids vs. the Cancer Goliath Part 1

Startups from diagnostics, biopharma, medtech, digital health and emerging tech will have 8 minutes to articulate their visions on how they aim to tame the beast.

Start Time – End Time – Company
10:00 – 10:08 – Belong.Life
10:09 – 10:17 – Care+Wear
10:18 – 10:26 – OncoPower
10:27 – 10:35 – PolyAurum LLC
10:36 – 10:44 – Seeker Health

Speakers:
Karthik Koduru, MD, Co-Founder and Chief Oncologist, OncoPower
Eliran Malki, Co-Founder and CEO, Belong.Life
Chaitenya Razdan, Co-founder and CEO, Care+Wear @_crazdan
Debra Shipley Travers, President & CEO, PolyAurum LLC @polyaurum
Sandra Shpilberg, Founder and CEO, Seeker Health @sandrashpilberg

Belong Life

  • 10,000 cancer patients a month are helped to navigate cancer care with the Belong app
  • the Belong ecosystem includes all their practitioners and uses trigger-based content delivery (posts, articles, etc.)
  • most important: taking unstructured health data (images, social activity, patient compliance) and converting it to structured data (a rough sketch of this kind of conversion follows below)
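
As a rough illustration of that last bullet, the sketch below (invented note text and extraction patterns, not Belong.Life's actual pipeline) pulls a few structured fields out of a free-text patient note:

```python
import re

# Toy free-text note; the text and the extraction patterns are invented for illustration.
note = "Pt reports taking tamoxifen 20 mg daily; missed 2 doses last week. ECOG 1."

structured = {
    "drug": re.search(r"taking (\w+)", note).group(1),
    "dose_mg": int(re.search(r"(\d+)\s*mg", note).group(1)),
    "missed_doses": int(re.search(r"missed (\d+) dose", note).group(1)),
    "ecog": int(re.search(r"ECOG (\d)", note).group(1)),
}
print(structured)
# {'drug': 'tamoxifen', 'dose_mg': 20, 'missed_doses': 2, 'ecog': 1}
```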

Care+Wear

personally designed PICC line covers for oncology patients

partners include the NBA, Major League Baseball, and Oscar de la Renta

designs easy-access PICC line gowns and shirts

OncoPower :Digital Health in a Blockchain Ecosystem

identified problems associated with patient adherence and developed a product to address them

  1. OncoPower Blockchain: HIPAA compliant, using the OncoPower security token to incentivize patients and oncologists to consult with each other, or oncologists with tumor boards; this is not an initial coin offering

PolyAurum

  • spinout from UPenn; developing a nanoparticle-based radiation therapy; a glioblastoma mouse model showed a great response with gold-based nanoparticles and radiation
  • they see enhanced tumor penetration and retention of the gold nanoparticles
  • however, most nanoparticles need to be larger than 5 nm to see an effect, so they used a polymer-based particle; they see good uptake but excretion past a week, so re-dosing with Au nanoparticles is needed
  • they are looking for capital and expect to start trials in 2020

Seeker Health

  • trying to improve the efficiency of clinical trial enrollment
  • using social networks to find the patients to enroll in clinical trials
  • steps they use: 1) find patients on Facebook, Google, and Twitter; 2) engage patients with a screener; 3) screening at clinical sites (a minimal sketch of this funnel follows below)
  • Seeker Portal is a patient management system: patients referred to a clinical site can now be tracked
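
A minimal sketch of that three-step funnel is below; the referral records and status labels are invented for illustration and do not reflect Seeker's actual portal schema:

```python
from collections import Counter

# Invented referral records; the statuses mirror the three steps described above.
referrals = [
    {"source": "facebook", "status": "engaged"},
    {"source": "google",   "status": "referred_to_site"},
    {"source": "facebook", "status": "screened_at_site"},
    {"source": "twitter",  "status": "engaged"},
    {"source": "google",   "status": "screened_at_site"},
]

# Simple funnel report: how many patients reached each stage, and where they came from.
print(Counter(r["status"] for r in referrals))
# Counter({'engaged': 2, 'screened_at_site': 2, 'referred_to_site': 1})
print(Counter(r["source"] for r in referrals))
# Counter({'facebook': 2, 'google': 2, 'twitter': 1})
```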

11:00- 11:45 AM Breakout: How to Scale Precision Medicine

The potential for precision medicine is real, but is limited by access to patient datasets. How are government entities, hospitals and startups bringing the promise of precision medicine to the masses of oncology patients?

Moderator: Sandeep Burugupalli, Senior Manager, Real World Data Innovation, Pfizer @sandeepburug
Speakers:
Ingo ​Chakravarty, President and CEO, Navican @IngoChakravarty
Eugean Jiwanmall, Senior Research Analyst for Medical Policy & Technology Evaluation , Independence Blue Cross @IBX
Andrew Norden, M.D., Chief Medical Officer, Cota @ANordenMD
Ankur Parikh M.D, Medical Director of Precision Medicine, Cancer Treatment Centers of America @CancerCenter

Ingo: data is not ordered, only half of patients are tracked in some database, reimbursement a challenge

Eugean: more mutations are being identified as patients get more comprehensive genomic coverage, and clinical trials are expanding more rapidly, as seen at ASCO 2018

Ingo: on general principles related to health outcomes, policy, or reimbursement: human studies are paramount, but payers may not allow for general principles (i.e., an ALK mutation in lung cancer with crizotinib treatment may be covered, but maybe not for glioblastoma or another cancer containing a similar ALK mutation; payers still depend on clinical trial results)

Andrew: they use gene panels and NGS but only want to look for actionable targets; they establish an expert panel that reviews these NGS sequencing results to determine actionable mutations

Ankur:  they have molecular tumor boards, but still, if one wants to prescribe off-label and can’t find a clinical trial, there is no reimbursement

Andrew: going beyond actionable mutations – although many are doing WES (whole exome sequencing), can we use machine learning to see if there are actionable data in a WES?

Ingo: what we forget with datasets is that patients have needs today, and we need those payment systems and structures today

Eugean: the problem starts from cost (where the cost starts and whether it was truly medically necessary)

Norden: there is not enough data sharing to make a decision; it takes an enormous amount of effort to get past business and technical limitations in data sharing; possibly policies need to be put in place to assimilate datasets and promote collaborations

Ingo: we need to take out the middlemen between sequencing of a patient’s tumor and the treatment decision; the middlemen are taking value out of the ‘supply chain’

Andrew: PATIENTS DON’T OWN their DATA but MOST clinicians agree THEY SHOULD

Ankur: patients are willing to share data but the HIPAA compliance is a barrier
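
Andrew's earlier point about restricting gene panels or WES output to actionable targets can be illustrated with a small filter. In the sketch below, the curated gene/alteration list and the variant calls are invented examples, not an actual tumor-board knowledge base:

```python
# Hypothetical curated list of actionable gene/alteration pairs (illustrative only,
# not an actual knowledge base or a tumor board's internal list).
ACTIONABLE = {
    ("EGFR", "L858R"),
    ("ALK", "fusion"),
    ("BRAF", "V600E"),
}

# Variant calls as they might come out of an NGS/WES pipeline (invented examples).
variants = [
    {"gene": "EGFR", "alteration": "L858R",  "vaf": 0.21},
    {"gene": "TP53", "alteration": "R273H",  "vaf": 0.35},
    {"gene": "ALK",  "alteration": "fusion", "vaf": 0.12},
]

# Keep only the variants a molecular tumor board could act on today.
actionable_hits = [v for v in variants if (v["gene"], v["alteration"]) in ACTIONABLE]
for v in actionable_hits:
    print(f"{v['gene']} {v['alteration']} (VAF {v['vaf']:.0%})")
# EGFR L858R (VAF 21%)
# ALK fusion (VAF 12%)
```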

 

11:50 AM – 12:30 PM Fireside Chat with Michael Pellini, M.D.

Building a Precision Medicine Business from the Ground Up: An Operating and Venture Perspective

Dr. Pellini has spent more than 20 years working on the operating side of four companies, each of which has pushed the boundaries of the standard of care. He will describe his most recent experience at Foundation Medicine, at the forefront of precision medicine, and how that experience can be leveraged on the venture side, where he now evaluates new healthcare technologies.

Speaker:
Michael Pellini, M.D., Managing Partner, Section 32 and Chairman, Foundation Medicine @MichaelPellini

Roche just bought Foundation Medicine for $2.5 billion.  They negotiated over 7 months, and despite the critics they felt it was a great deal because it gives them, as a diagnostic venture, international reach and biotech expertise.  Foundation Medicine offered Roche expertise in the diagnostic space, including the ability to navigate payers and the regulatory aspects of the diagnostic business.  He feels it benefits all aspects of patient care and the work they do with other companies.

Moderator: Roche is doing multiple deals to ‘own’ a disease state.

Dr. Pellini:  Roche is closing a deal with Flatiron, just as Merck closed deals with genomics companies.  He feels it is best to build the best company on a standalone basis and provide for patients; then good things will happen.  However, the problem with achieving scale for precision medicine is reimbursement by payers.  They still have to keep collecting data and evolving services to suit pharma.  They didn’t know if their model would work, but when he met with the FDA in 2011 they worked with precision medicine and said: collect the data and we will keep working with you.

However, the payers aren’t contributing to the effort.  They need to assist some of the young companies that can’t raise the billion dollars needed for all the evidence that payers require.  Precision medicine companies still have problems, even though they have collected tremendous amounts of data and raised significant money.  From the private payer perspective there is no clear roadmap for success.

They recognized that the payers would be difficult, but they had a plan; he won’t invest in companies that don’t have a plan for getting reimbursement from payers.

Moderator: What is section 32?

Pellini:  Their investment arm invests across the spectrum of precision healthcare companies, including tech companies.  They started with a digital pathology imaging system that went from looking through a scope to looking at a monitor, with software integrated with medical records. Section 32 has $130 million under management and may go to $400 million, but they want to stay small.

Pellini: we get 4-5 AI pitches a week.

Moderator: Are you interested in companion diagnostics?

Pellini:  There may be 24 drug approvals expected in 2018, and 35% of them have a companion diagnostic (CDx) with them.  However, going out ten years, 70% may have a CDx associated with them.  Payers need to work with companies to figure out how to pay for these CDx tests.


Read Full Post »


Turing Institute Engaging the Science of Big Data

Larry H. Bernstein, MD, FCAP, Curator

LPBI

 

Alan Turing Institute Will Lead Research in Data Science

12/08/2015 –   Duncan Roweth, Cray

Cray is partnering with the Alan Turing Institute, the new U.K. data science research organization in London, to help the U.K. as it increases research in data science to benefit research and industry.

Earlier this month Fiona Burgess, U.K. senior account manager, and I attended the launch of the institute. At the event, U.K. Minister for Science and Universities Jo Johnson paid tribute to Turing and his work. Institute director Professor Andrew Blake told the audience that the Turing Institute is about much more than just big data — it is about data science, analyzing that data and gaining a new understanding that leads to decisions and actions.

Alan Turing was a pioneering British computer scientist. He has become a household name in the U.K. following publicity surrounding his role in breaking the Enigma machine ciphers during the Second World War. This was a closely guarded secret until a few years ago, but has recently become the subject of numerous books and several films. Turing was highly influential in the development of computer science, providing a formalization of the concepts of algorithm and computation with the Turing machine. After the war, he worked at the National Physical Laboratory, where he designed ACE, one of the first stored-program computers.

The Alan Turing Institute is a joint venture between the universities of Cambridge, Edinburgh, Oxford, Warwick, University College London, and the U.K. Engineering and Physical Science Research Council (EPSRC). The Institute received initial funding in excess of £75 million ($110 million) from the U.K. government, the university partners and other business organizations, including the Lloyd’s Register Foundation.

The Turing Institute will, among other topics, research how knowledge and predictions can be extracted from large-scale and diverse digital data. It will bring together people, organizations and technologies in data science for the development of theory, methodologies and algorithms. The U.K. government is looking to this new Institute to enable the science community, commerce and industry to realize the value of big data for the U.K. economy.

Cray will be working with the Turing Institute and EPSRC to provide data analytics capability to the U.K.’s data sciences community.  EPSRC’s ARCHER supercomputer, a Cray XC30 system based at the University of Edinburgh, has been chosen for this work. Much as we worked with NERSC to port Docker to Cray systems, we will be working with ATI to port analytics software to ARCHER and then XC systems generally.

ARCHER is currently the largest supercomputer for scientific research in the U.K. — with its recent upgrade ARCHER’s 118,080 cores can access in excess of 300 TB of memory. What sort of problem might need that amount of processing power?  Genomics England is collecting around 200 GB of DNA sequence data from each of 100,000 people. Finding patterns in all this information will be a mammoth task!
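
The arithmetic behind that "mammoth task" is easy to sketch: 200 GB per participant across 100,000 people is roughly 20 petabytes of raw sequence data, dwarfing ARCHER's roughly 300 TB of aggregate memory. A quick back-of-the-envelope check:

```python
# Back-of-the-envelope check of the Genomics England figures quoted above.
gb_per_person = 200            # ~200 GB of sequence data per participant
participants = 100_000

total_gb = gb_per_person * participants
total_pb = total_gb / 1_000_000        # 1 PB = 1,000,000 GB (decimal units)
archer_memory_gb = 300 * 1000          # ARCHER's ~300 TB of aggregate memory

print(f"Total raw data: {total_pb:.0f} PB")                            # 20 PB
print(f"Ratio to ARCHER memory: {total_gb / archer_memory_gb:.0f}x")   # ~67x
```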

ATI have put together a wide ranging programme of workshops and data science summits, details of which can be found on their Web site.

Duncan Roweth is a principal engineer in the Cray CTO Office in Bristol, U.K.  


Read Full Post »


Best Big Data?

Larry H. Bernstein, MD, FCAP, Curator

LPBI


What’s The Big Data?

Google’s RankBrain Outranks the Best Brains in the Industry

Bloomberg recently broke the news that Google is “turning its lucrative Web search over to AI machines.” Google revealed to the reporter that for the past few months, a very large fraction of the millions of search queries Google responds to every second have been “interpreted by an artificial intelligence system, nicknamed RankBrain.”

The company that has tried hard to automate its mission to organize the world’s information was happy to report that its machines have again triumphed over humans. When Google search engineers “were asked to eyeball some pages and guess which they thought Google’s search engine technology would rank on top,” RankBrain had an 80% success rate compared to “the humans [who] guessed correctly 70 percent of the time.”

There you have it. Google’s AI machine RankBrain, after only a few months on the job, already outranks the best brains in the industry, the elite engineers that Google typically hires.

Or maybe not. Is RankBrain really “smarter than your average engineer” and already “living up to its AI hype,” as the Bloomberg article informs us, or is this all just, well, hype?

Desperate to find out how far our future machine overlords are already ahead of the best and the brightest (certainly not “average”), I asked Google to shed more light on the test, e.g., how do they determine the “success rate”?

“That test was fairly informal, but it was some of our top search engineers looking at search queries and potential search results and guessing which would be favored by users. (We don’t have more detail to share on how that’s determined; our evaluations are a pretty complex process).”

I guess both RankBrain and Google search engineers were given possible search results to a given query and RankBrain outperformed humans in guessing which are the “better” results, according to some undisclosed criteria.

I don’t know about you, but my TinyBrain is still confused. Wouldn’t Google search engine, with or without RankBrain, outperform any human being, including the smartest people on earth, in terms of “guessing” which search results “would be favored by users”? Haven’t they been mining the entire corpus of human knowledge for more than fifteen years and, by definition, have produced a search engine that “understands” relevance more than any individual human being?

The key to the competition, I guess, is that the “search queries” used in it were not just any search queries but complex queries containing words that have different meaning in different context. It’s the kind of queries that will stump most human beings and it’s quite surprising that Google engineers scored 70% on search queries that presumably require deep domain knowledge in all human endeavors, in addition to search expertise.

The only example of a complex query given in the Bloomberg article is “What’s the title of the consumer at the highest level of a food chain?” The word “consumer” in this context is a scientific term for something that consumes food, and the label (the “title”) at the highest level of the food chain is “predator.”

This explanation comes from search guru Danny Sullivan who has come to the rescue of perplexed humans like me, providing a detailed RankBrain FAQ, up to the limits imposed by Google’s legitimate reluctance to fully share its secrets. Sullivan: “From emailing with Google, I gather RankBrain is mainly used as a way to interpret the searches that people submit to find pages that might not have the exact words that were searched for.”

Sullivan points out that a lot of work done by humans is behind Google’s outstanding search results (e.g., creating a synonym list or a database with connections between “entities”—places, people, ideas, objects, etc.). But Google needs now to respond to some 450 million new queries per day, queries that have never been entered before into its search engine.

RankBrain “can see patterns between seemingly unconnected complex searches to understand how they’re actually similar to each other,” writes Sullivan. In addition, “RankBrain might be able to better summarize what a page is about than Google’s existing systems have done.”

Finding out the “unknown unknowns,” discovering previously unknown (to humans) links between words and concepts is the marriage of search technology with the hottest trend in big data analysis—deep learning. The real news about RankBrain is that it is the first time Google applied deep learning, the latest incarnation of “neural networks” and a specific type of machine learning, to its most prized asset—its search engine.
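
A rough sketch of the underlying idea: if queries are mapped to dense vectors ("embeddings"), queries that share few or no words can still score as similar by cosine similarity. The tiny hand-made vectors below are purely illustrative and bear no relation to Google's actual RankBrain model:

```python
import math

# Toy 4-dimensional "embeddings". Real systems learn vectors with hundreds of
# dimensions from query and click data; these numbers are invented for illustration.
embeddings = {
    "top predator in a food chain":             [0.90, 0.10, 0.00, 0.20],
    "highest level consumer of the food chain": [0.85, 0.15, 0.05, 0.25],
    "cheap flights to boston":                  [0.00, 0.05, 0.95, 0.10],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

query = embeddings["highest level consumer of the food chain"]
for text, vec in embeddings.items():
    print(f"{cosine(query, vec):.2f}  {text}")
# The two food-chain queries score close to 1.0 despite sharing few words;
# the unrelated query scores far lower.
```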

Google has been doing machine learning since its inception. The first published paper listed in the AI and machine learning section of its research page is from 2001, and, to use just one example, Gmail is so good at detecting spam because of machine learning. But Google hadn’t applied machine learning to search. That there has been internal opposition to doing so we learn from a summary of a 2008 conversation between Anand Rajaraman and Peter Norvig, co-author of the most popular AI textbook and leader of Google search R&D since 2001. Here’s the most relevant excerpt:

The big surprise is that Google still uses the manually-crafted formula for its search results. They haven’t cut over to the machine learned model yet. Peter suggests two reasons for this. The first is hubris: the human experts who created the algorithm believe they can do better than a machine-learned model. The second reason is more interesting. Google’s search team worries that machine-learned models may be susceptible to catastrophic errors on searches that look very different from the training data. They believe the manually crafted model is less susceptible to such catastrophic errors on unforeseen query types.

This was written three years after Microsoft had applied machine learning to its search technology. But now, Google has gotten over its hubris. 450 million unforeseen query types per day are probably too much for “manually crafted models,” and Google has decided that a “deep learning” system such as RankBrain provides good enough protection against “catastrophic errors.”

Read Full Post »

Older Posts »