Live Notes, Real-Time Conference Coverage: 2020 AACR Virtual Meeting, April 27, 2020, Minisymposium on AACR Project GENIE & Bioinformatics, 4:00 PM – 6:00 PM
SESSION VMS.MD01.01 – Advancing Cancer Research through an International Cancer Registry: AACR Project GENIE Use Cases
Reporter: Stephen J. Williams, PhD
April 27, 2020, 4:00 PM – 6:00 PM
Virtual Meeting: All Session Times Are U.S. EDT
Session Type: Virtual Minisymposium
Track(s): Bioinformatics and Systems Biology
17 Presentations
4:00 PM – 6:00 PM
– Chairperson Gregory J. Riely. Memorial Sloan Kettering Cancer Center, New York, NY
4:00 PM – 4:01 PM
– Introduction Gregory J. Riely. Memorial Sloan Kettering Cancer Center, New York, NY
Precision medicine requires an end-to-end learning healthcare system, wherein the treatment decisions for patients are informed by the prior experiences of similar patients. Oncology is currently leading the way in precision medicine because the genomic and other molecular characteristics of patients and their tumors are routinely collected at scale. A major challenge to realizing the promise of precision medicine is that no single institution is able to sequence and treat sufficient numbers of patients to improve clinical decision-making independently. To overcome this challenge, the AACR launched Project GENIE (Genomics Evidence Neoplasia Information Exchange).
AACR Project GENIE is a publicly accessible international cancer registry of real-world data assembled through data sharing between 19 of the leading cancer centers in the world. Through the efforts of strategic partners Sage Bionetworks (https://sagebionetworks.org) and cBioPortal (www.cbioportal.org), the registry aggregates, harmonizes, and links clinical-grade, next-generation cancer genomic sequencing data with clinical outcomes obtained during routine medical practice from cancer patients treated at these institutions. The consortium and its activities are driven by openness, transparency, and inclusion, ensuring that the project output remains accessible to the global cancer research community for the benefit of all patients.
AACR Project GENIE fulfills an unmet need in oncology by providing the statistical power necessary to improve clinical decision-making, particularly in the case of rare cancers and rare variants in common cancers. Additionally, the registry can power novel clinical and translational research.
Because we collect data from nearly every patient sequenced at participating institutions and have committed to sharing only clinical-grade data, the GENIE registry contains enough high-quality data to power decision making on rare cancers or rare variants in common cancers. We see the GENIE data providing another knowledge turn in the virtuous cycle of research, accelerating the pace of drug discovery, improving the clinical trial design, and ultimately benefiting cancer patients globally.
The first set of cancer genomic data aggregated through AACR Project Genomics Evidence Neoplasia Information Exchange (GENIE) was available to the global community in January 2017. The seventh data set, GENIE 7.0-public, was released in January 2020 adding more than 9,000 records to the database. The combined data set now includes nearly 80,000 de-identified genomic records collected from patients who were treated at each of the consortium’s participating institutions, making it among the largest fully public cancer genomic data sets released to date. These data will be released to the public every six months. The public release of the eighth data set, GENIE 8.0-public, will take place in July 2020.
The combined data set now includes data for over 80 major cancer types, including data from greater than 12,500 patients with lung cancer, nearly 11,000 patients with breast cancer, and nearly 8,000 patients with colorectal cancer.
For more details about the data, analyses, and summaries of the data attributes from this release, GENIE 7.0-public, consult the data guide.
Users can access the data directly via cBioPortal, or download the data directly from Sage Bionetworks. Users will need to create an account for either site and agree to the terms of access.
For frequently asked questions, visit our FAQ page.
- In fall of 2019, AACR announced the Biopharma Collaborative, which collects pan-cancer data in collaboration with, and with support from, a host of big pharma and biotech companies
- they have a goal to expand to more than 6 cancer types and more than 50,000 records, including smoking habits, lifestyle data, etc.
- They have started with NSCLC and have done mutational analysis on these cases
- tumor mutational burden is included, and the genomic data can be explored further using cBioPortal
- treatment data is included as well
- need to collect highly CURATED data with the PRISSMM backbone to get more than outcome data, such as progression data
- they might look to incorporate digital pathology but they are not there yet; will need good artificial intelligence systems
4:01 PM – 4:15 PM
– Invited Speaker Gregory J. Riely. Memorial Sloan Kettering Cancer Center, New York, NY
4:15 PM – 4:20 PM
– Discussion
4:20 PM – 4:30 PM
1092 – A systematic analysis of BRAF mutations and their sensitivity to different BRAF inhibitors: Zohar Barbash, Dikla Haham, Liat Hafzadi, Ron Zipor, Shaul Barth, Arie Aizenman, Lior Zimmerman, Gabi Tarcic. Novellusdx, Jerusalem, Israel
Abstract: The MAPK-ERK signaling cascade is among the most frequently mutated pathways in human cancer, with the BRAF V600 mutation being the most common alteration. FDA-approved BRAF inhibitors as well as combination therapies of BRAF and MEK inhibitors are available and provide survival benefits to patients with a BRAF V600 mutation in several indications. Yet non-V600 BRAF mutations are found in many cancers and are even more prevalent than V600 mutations in certain tumor types. As the use of NGS profiling in precision oncology is becoming more common, novel alterations in BRAF are being uncovered. This has led to the classification of BRAF mutations, which is dependent on their biochemical properties and affects their sensitivity to inhibitors. Therefore, annotation of these novel variants is crucial for assigning correct treatment. Using a high throughput method for functional annotation of MAPK activity, we profiled 151 different BRAF mutations identified in the AACR Project GENIE dataset, and their response to 4 different BRAF inhibitors: vemurafenib and 3 different exploratory 2nd generation inhibitors. The system is based on rapid synthesis of the mutations and expression of the mutated protein together with fluorescently labeled reporters in a cell-based assay. Our results show that from the 151 different BRAF mutations, ~25% were found to activate the MAPK pathway. All of the class 1 and 2 mutations tested were found to be active, providing positive validation for the method. Additionally, many novel activating mutations were identified, some outside of the known domains. When testing the response of the active mutations to different classes of BRAF inhibitors, we show that while vemurafenib efficiently inhibited V600 mutations, other types of mutations and specifically BRAF fusions were not inhibited by this drug. Alternatively, the second-generation experimental inhibitors were effective against both V600 as well as non-V600 mutations.
Using this large-scale approach to characterize BRAF mutations, we were able to functionally annotate the largest number of BRAF mutations to date. Our results show that the number of activating variants is large and that they possess differential sensitivity to different types of direct inhibitors. This data can serve as a basis for rational drug design as well as more accurate treatment options for patients.
- Molecular profiling is becoming imperative for successful targeted therapies
- there are 500 unique mutations in BRAF, so a bioinformatic pipeline is needed; start with NGS panels, then cluster according to different subtypes or class-specific patterns
- certain mutations, such as V600E, show distinct clustering by tumor type
- 25% of mutations co-occur with other mutations, and a given mutation may not be functional; they used a high-throughput system to analyze BRAF mutations other than V600 to determine whether they are functional
- active yet uncharacterized BRAF mutations are seen in a major proportion of human tumors
- using genomic and drug-response data, they found that many inhibitors, such as vemurafenib, are specific to a particular mutation, while other inhibitors that are not specific to a cleft can inhibit other BRAF mutants
- 40% of 135 mutants were functionally active
- USE of Functional Profiling instead of just genomic profiling
- Q&A: they have already used this platform and analysis successfully for RTKs and other genes as well
- Q: how do you deal with co-occurring mutations? A: the platform is able to handle RTKs plus signaling proteins
4:30 PM – 4:35 PM
– Discussion
4:35 PM – 4:45 PM
1093 – Calibration Tool for Genomic Aggregates (CTGA): A deep learning framework for calibrating somatic mutation profiling data from conventional gene panel data. Jordan Anaya, Craig Cummings, Jocelyn Lee, Alexander Baras. Johns Hopkins Sidney Kimmel Comprehensive Cancer Center, MD, Genentech, Inc., CA, AACR, Philadelphia, PA
Abstract: It has been suggested that aggregate genomic measures such as mutational burden can be associated with response to immunotherapy. Arguably, the gold standard for deriving such aggregate genomic measures (AGMs) would be from exome level sequencing. While many clinical trials run exome level sequencing, the vast majority of routine genomic testing performed today, as seen in AACR Project GENIE, is targeted / gene-panel based sequencing.
Despite the smaller size of these gene panels focused on clinically targetable alterations, it has been shown they can estimate, to some degree, exomic mutational burden; usually by normalizing mutation count by the relevant size of the panels. These smaller gene panels exhibit significant variability both in terms of accuracy relative to exomic measures and in comparison to other gene panels. While many genes are common to the panels in AACR Project GENIE, hundreds are not. These differences in extent of coverage and genomic loci examined can result in biases that may negatively impact panel to panel comparability.
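The conventional normalization mentioned above (mutation count divided by panel size) can be sketched as follows. This is only an illustration of the arithmetic, not the authors' code: the panel names, panel sizes, and mutation counts are made-up values.

```python
def panel_tmb(mutation_count: int, panel_size_bp: int) -> float:
    """Estimate tumor mutational burden as mutations per megabase of panel coverage."""
    if panel_size_bp <= 0:
        raise ValueError("panel_size_bp must be positive")
    return mutation_count * 1_000_000 / panel_size_bp


# Hypothetical samples: panel names, sizes, and counts are illustrative assumptions.
samples = [
    {"panel": "PANEL_A", "panel_size_bp": 1_200_000, "mutations": 12},
    {"panel": "PANEL_B", "panel_size_bp": 800_000, "mutations": 12},
]

for sample in samples:
    sample["tmb"] = panel_tmb(sample["mutations"], sample["panel_size_bp"])
    print(f'{sample["panel"]}: {sample["tmb"]:.1f} mutations/Mb')
```

The same raw count gives different estimates on panels of different size, and the estimate says nothing about which loci each panel actually covers, which is the comparability problem the abstract goes on to address.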
To address these issues we developed a deep learning framework to model exomic AGMs, such as mutational burden, from gene panel data as seen in AACR Project GENIE. This framework can leverage any available sample and variant level information, in which variants are featurized to effectively re-weight their importance when estimating a given AGM, such as mutational burden, through the use of multiple instance learning techniques in this form of weakly supervised data.
Using TCGA data in conjunction with AACR Project GENIE gene panel definitions, as a proof of concept, we first applied this framework to learn expected variant features such as codons and genomic position from mutational data (greater than 99.9% accuracy observed). Having established the validity of the approach, we then applied this framework to somatic mutation profiling data in which we show that data from gene panels can be calibrated to exomic TMB and thereby improve panel to panel compatibility. We observed approximately 25% improvements in mean squared error and R-squared metrics when using our framework over conventional approaches to estimate TMB from gene panel data across the 9 tumor types examined (spanning melanoma, lung cancer, colon cancer, and others). This work highlights the application of sophisticated machine learning approaches towards the development of needed calibration techniques across seemingly disparate gene panel assays used clinically today.
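The abstract does not describe the network architecture, so the following is only a generic sketch of one common multiple instance learning technique, attention-based pooling, in which each sample is a "bag" of featurized variants and a learned score re-weights each variant before the bag is summarized. All weights here are random placeholders rather than trained values, and the feature dimensions are arbitrary assumptions.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_mil_predict(bag, V, w, u, bias):
    """Forward pass of attention-based MIL pooling.

    bag: list of per-variant feature vectors (one sample).
    V:   list of hidden-unit weight vectors used to score each variant.
    w:   weights turning the hidden representation into one score per variant.
    u, bias: linear read-out applied to the pooled bag representation.
    Returns (prediction, attention_weights).
    """
    scores = []
    for variant in bag:
        hidden = [math.tanh(dot(variant, row)) for row in V]
        scores.append(dot(hidden, w))
    attention = softmax(scores)  # one weight per variant, summing to 1
    n_features = len(bag[0])
    pooled = [
        sum(a * variant[j] for a, variant in zip(attention, bag))
        for j in range(n_features)
    ]
    return dot(pooled, u) + bias, attention


# Random placeholder parameters and a random bag of 5 variants (assumptions).
rng = random.Random(0)
n_features, n_hidden = 4, 3
V = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_hidden)]
w = [rng.gauss(0, 1) for _ in range(n_hidden)]
u = [rng.gauss(0, 1) for _ in range(n_features)]
bag = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(5)]

prediction, attention = attention_mil_predict(bag, V, w, u, bias=0.0)
print("attention weights:", [round(a, 3) for a in attention])
```

Because softmax attention sums to one, this sketch produces a weighted average over variants; a count-like quantity such as mutational burden would more plausibly need an unnormalized weighted sum or an explicit bag-size term.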
4:45 PM – 4:50 PM
– Discussion
4:50 PM – 5:00 PM
1094 – Genetic determinants of EGFR-driven lung cancer growth and therapeutic response in vivo. Giorgia Foggetti, Chuan Li, Hongchen Cai, Wen-Yang Lin, Deborah Ayeni, Katherine Hastings, Laura Andrejka, Dylan Maghini, Robert Homer, Dmitri A. Petrov, Monte M. Winslow, Katerina Politi. Yale School of Medicine, New Haven, CT, Stanford University School of Medicine, Stanford, CA, Stanford University School of Medicine, Stanford, CA, Yale School of Medicine, New Haven, CT, Stanford University School of Medicine, Stanford, CA, Yale School of Medicine, New Haven, CT
5:00 PM – 5:05 PM
– Discussion
5:05 PM – 5:15 PM
1095 – Comprehensive pan-cancer analyses of RAS genomic diversity. Robert Scharpf, Gregory Riely, Mark Awad, Michele Lenoue-Newton, Biagio Ricciuti, Julia Rudolph, Leon Raskin, Andrew Park, Jocelyn Lee, Christine Lovly, Valsamo Anagnostou. Johns Hopkins Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD, Memorial Sloan Kettering Cancer Center, New York, NY, Dana-Farber Cancer Institute, Boston, MA, Vanderbilt-Ingram Cancer Center, Nashville, TN, Amgen, Inc., Thousand Oaks, CA, AACR, Philadelphia, PA
5:15 PM – 5:20 PM
– Discussion
5:20 PM – 5:30 PM
1096 – Harmonization standards from the Variant Interpretation for Cancer Consortium. Alex H. Wagner, Reece K. Hart, Larry Babb, Robert R. Freimuth, Adam Coffman, Yonghao Liang, Beth Pitel, Angshumoy Roy, Matthew Brush, Jennifer Lee, Anna Lu, Thomas Coard, Shruti Rao, Deborah Ritter, Brian Walsh, Susan Mockus, Peter Horak, Ian King, Dmitriy Sonkin, Subha Madhavan, Gordana Raca, Debyani Chakravarty, Malachi Griffith, Obi L. Griffith. Washington University School of Medicine, Saint Louis, MO, Reece Hart Consulting, CA, Broad Institute, Boston, MA, Mayo Clinic, Rochester, MN, Washington University School of Medicine, Saint Louis, MO, Washington University School of Medicine, Saint Louis, MO, Baylor College of Medicine, Houston, TX, Oregon Health and Science University, Portland, OR, National Cancer Institute, Bethesda, MD, Georgetown University, Washington, DC, The Jackson Laboratory for Genomic Medicine, Farmington, CT, National Center for Tumor Diseases, Heidelberg, Germany, University of Toronto, Toronto, ON, Canada, University of Southern California, Los Angeles, CA, Memorial Sloan Kettering Cancer Center, New York, NY
Abstract: The use of clinical gene sequencing is now commonplace, and genome analysts and molecular pathologists are often tasked with the labor-intensive process of interpreting the clinical significance of large numbers of tumor variants. Numerous independent knowledgebases have been constructed to alleviate this manual burden; however, these knowledgebases are non-interoperable. As a result, the analyst is left with a difficult tradeoff: for each knowledgebase used the analyst must understand the nuances particular to that resource and integrate its evidence accordingly when generating the clinical report, but for each knowledgebase omitted there is increased potential for missed findings of clinical significance.
The Variant Interpretation for Cancer Consortium (VICC; cancervariants.org) was formed as a driver project of the Global Alliance for Genomics and Health (GA4GH; ga4gh.org) to address this concern. VICC members include representatives from several major somatic interpretation knowledgebases including CIViC, OncoKB, Jax-CKB, the Weill Cornell PMKB, the IRB-Barcelona Cancer Biomarkers Database, and others. Previously, the VICC built and reported on a harmonized meta-knowledgebase of 19,551 biomarker associations of harmonized variants, diseases, drugs, and evidence across the constituent resources.
In that study, we analyzed the frequency with which the tumor samples from the AACR Project GENIE cohort would match to harmonized associations. Variant matches increased dramatically from 57% to 86% when broader matching to regions describing categorical variants was allowed. Unlike precise sequence variants with specified alternate alleles, categorical variants describe a collection of potential variants with a common feature, such as “V600” (non-valine alleles at the 600 residue), “Exon 20 mutations” (all non-silent mutations in exon 20), or “Gain-of-function” (hypermorphic alterations that activate or amplify gene activity).
However, matching observed sequence variants to categorical variants is challenging, as the latter are typically only described as unstructured text. Here we describe the expressive and computational GA4GH Variation Representation specification (vr-spec.readthedocs.io), which we co-developed as members of the GA4GH Genomic Knowledge Standards work stream. This specification provides a schema for common, precise forms of variation (e.g. SNVs and Indels) and the method for computing identifiers from these objects. We highlight key aspects of the specification and our work to apply it to the characterization of categorical variation, showcasing the variant terminology and classification tools developed by the VICC to support this effort. These standards and tools are free, open-source, and extensible, overcoming barriers to standardized variant knowledge sharing and search.
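To make the idea of categorical variants concrete, here is a minimal sketch of matching observed variants against two of the categories quoted in the abstract ("V600" and "Exon 20 mutations"). It is not the VICC tooling: the `ObservedVariant` class, its fields, and the matching functions are hypothetical names invented for this illustration, and the example variants are supplied by hand.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ObservedVariant:
    """Hypothetical record for one observed variant (fields are assumptions)."""
    gene: str
    protein_change: str  # simple substitution notation, e.g. "V600E"
    exon: int
    silent: bool = False


def matches_residue_category(variant, gene, ref_aa, position):
    """Categorical variant like "V600": any non-reference allele at that residue."""
    if variant.gene != gene:
        return False
    parsed = re.fullmatch(r"([A-Z])(\d+)([A-Z*])", variant.protein_change)
    if parsed is None:
        return False  # not a simple substitution; this sketch does not handle it
    ref, pos, alt = parsed.group(1), int(parsed.group(2)), parsed.group(3)
    return ref == ref_aa and pos == position and alt != ref_aa


def matches_exon_category(variant, gene, exon):
    """Categorical variant like "Exon 20 mutations": any non-silent change in the exon."""
    return variant.gene == gene and variant.exon == exon and not variant.silent


# Illustrative inputs only.
observed = [
    ObservedVariant("BRAF", "V600E", exon=15),
    ObservedVariant("BRAF", "V600K", exon=15),
    ObservedVariant("BRAF", "G469A", exon=11),
    ObservedVariant("EGFR", "T790M", exon=20),
]

v600_hits = [v.protein_change for v in observed
             if matches_residue_category(v, "BRAF", "V", 600)]
exon20_hits = [v.protein_change for v in observed
               if matches_exon_category(v, "EGFR", 20)]
print("BRAF V600 category:", v600_hits)
print("EGFR exon 20 category:", exon20_hits)
```

A knowledgebase entry keyed only on the exact string "V600E" would miss the V600K sample, whereas the category-level rule catches both, which is the kind of broader matching the abstract credits for the increase in variant matches.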
- store information from different databases by curating them and classifying them then harmonizing them into values
- harmonize each variant across their knowledgebase; at any level of evidence
- 29% of patients' variants matched when compared across many knowledgebases, versus only 13% when using individual knowledgebases
- they are also trying to curate the database so that a variant will have one code instead of various RefSeq codes or protein codes
- VICC is an open consortium
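The "one code per variant" idea can be illustrated with a content-derived identifier. As I understand it, the GA4GH VRS specification bases its computed identifiers on a truncated SHA-512 digest (called sha512t24u there: the first 24 bytes of the digest, base64url-encoded). The sketch below implements that digest, but the JSON serialization, the dictionary fields, and the "example:" prefix are simplified stand-ins of my own, not the normative VRS procedure.

```python
import base64
import hashlib
import json


def sha512t24u(blob: bytes) -> str:
    """Truncated digest: first 24 bytes of SHA-512, base64url-encoded (32 chars)."""
    digest = hashlib.sha512(blob).digest()
    return base64.urlsafe_b64encode(digest[:24]).decode("ascii")


def variant_identifier(variant: dict) -> str:
    """Derive an identifier from a variant's content.

    Assumption: sorted-key compact JSON stands in for the specification's
    canonical serialization, and "example:" is a placeholder prefix.
    """
    canonical = json.dumps(variant, sort_keys=True, separators=(",", ":"))
    return "example:" + sha512t24u(canonical.encode("utf-8"))


# Two descriptions of the same variant with fields given in a different order
# (field names are hypothetical).
record_a = {"gene": "BRAF", "position": 600, "ref": "V", "alt": "E"}
record_b = {"alt": "E", "ref": "V", "position": 600, "gene": "BRAF"}

print(variant_identifier(record_a))
print(variant_identifier(record_a) == variant_identifier(record_b))
```

Because the identifier is computed from the variant's content rather than assigned by a database, two resources that normalize a variant to the same representation arrive at the same code without coordinating with each other.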
5:30 PM – 5:35 PM
– Discussion
5:35 PM – 5:45 PM
1097 – FGFR2 in-frame indels: A novel targetable alteration in intrahepatic cholangiocarcinoma. Yvonne Y. Li, James M. Cleary, Srivatsan Raghavan, Liam F. Spurr, Qibiao Wu, Lei Shi, Lauren K. Brais, Maureen Loftus, Lipika Goyal, Anuj K. Patel, Atul B. Shinagare, Thomas E. Clancy, Geoffrey Shapiro, Ethan Cerami, William R. Sellers, William C. Hahn, Matthew Meyerson, Nabeel Bardeesy, Andrew D. Cherniack, Brian M. Wolpin. Dana-Farber Cancer Institute, Boston, MA, Dana-Farber Cancer Institute, Boston, MA, Massachusetts General Hospital, Boston, MA, Brigham and Women’s Hospital, Boston, MA, Dana-Farber Cancer Institute, Boston, MA, Dana-Farber Cancer Institute, Boston, MA, Broad Institute of MIT and Harvard, Cambridge, MA, Massachusetts General Hospital, Boston, MA
5:45 PM – 5:50 PM
– Discussion
5:50 PM – 6:00 PM
– Closing Remarks Gregory J. Riely. Memorial Sloan Kettering Cancer Center, New York, NY
Follow on Twitter at:
#AACR2020
#curecancernow
#pharmanews