
Posts Tagged ‘mutational spectrum’

Live Notes, Real Time Conference Coverage 2020 AACR Virtual Meeting April 28, 2020 Session on Evaluating Cancer Genomics from Normal Tissues Through Metastatic Disease 3:50 PM

Reporter: Stephen J. Williams, PhD

 Minisymposium: Evaluating Cancer Genomics from Normal Tissues through Evolution to Metastatic Disease

Oncologic therapy shapes the fitness landscape of clonal hematopoiesis

April 28, 2020, 4:10 PM – 4:20 PM

Presenter/Authors
Kelly L. Bolton, Ryan N. Ptashkin, Teng Gao, Lior Braunstein, Sean M. Devlin, Minal Patel, Antonin Berthon, Aijazuddin Syed, Mariko Yabe, Catherine Coombs, Nicole M. Caltabellotta, Mike Walsh, Ken Offit, Zsofia Stadler, Choonsik Lee, Paul Pharoah, Konrad H. Stopsack, Barbara Spitzer, Simon Mantha, James Fagin, Laura Boucai, Christopher J. Gibson, Benjamin Ebert, Andrew L. Young, Todd Druley, Koichi Takahashi, Nancy Gillis, Markus Ball, Eric Padron, David Hyman, Jose Baselga, Larry Norton, Stuart Gardos, Virginia Klimek, Howard Scher, Dean Bajorin, Eder Paraiso, Ryma Benayed, Maria Arcilla, Marc Ladanyi, David Solit, Michael Berger, Martin Tallman, Montserrat Garcia-Closas, Nilanjan Chatterjee, Luis Diaz, Ross Levine, Lindsay Morton, Ahmet Zehir, Elli Papaemmanuil. Memorial Sloan Kettering Cancer Center, New York, NY, University of North Carolina at Chapel Hill, Chapel Hill, NC, University of Cambridge, Cambridge, United Kingdom, Dana-Farber Cancer Institute, Boston, MA, Washington University, St Louis, MO, The University of Texas MD Anderson Cancer Center, Houston, TX, Moffitt Cancer Center, Tampa, FL, National Cancer Institute, Bethesda, MD

Abstract
Recent studies among healthy individuals show evidence of somatic mutations in leukemia-associated genes, referred to as clonal hematopoiesis (CH). To determine the relationship between CH and oncologic therapy we collected sequential blood samples from 525 cancer patients (median sampling interval time = 23 months, range: 6-53 months) of whom 61% received cytotoxic therapy or external beam radiation therapy and 39% received either targeted/immunotherapy or were untreated. Samples were sequenced using deep targeted capture-based platforms. To determine whether CH mutational features were associated with therapy-related myeloid neoplasm (tMN) risk, we performed Cox proportional hazards regression on 9,549 cancer patients exposed to oncologic therapy of whom 75 cases developed tMN (median time to transformation = 26 months). To further compare the genetic and clonal relationships between tMN and the preceding CH, we analyzed 35 cases for which paired samples were available. We compared the growth rate of the variant allele fraction (VAF) of CH clones across treatment modalities and in untreated patients. A significant increase in the growth rate of CH mutations was seen in DNA damage response (DDR) genes among those receiving cytotoxic (p=0.03) or radiation therapy (p=0.02) during the follow-up period compared to patients who did not receive therapy. Similar growth rates among treated and untreated patients were seen for non-DDR CH genes such as DNMT3A. Increasing cumulative exposure to cytotoxic therapy (p=0.01) and external beam radiation therapy (p=2×10^-8) resulted in higher growth rates for DDR CH mutations. Among 34 subjects with at least two CH mutations in which one mutation was in a DDR gene and one in a non-DDR gene, we studied competing clonal dynamics for multiple gene mutations within the same patient. The risk of tMN was positively associated with CH in a known myeloid neoplasm driver mutation (HR=6.9, p<10^-6), and increased with the total number of mutations and clone size.
The strongest associations were observed for mutations in TP53 and for CH with mutations in spliceosome genes (SRSF2, U2AF1 and SF3B1). Lower hemoglobin, lower platelet counts, lower neutrophil counts, higher red cell distribution width and higher mean corpuscular volume were all positively associated with increased tMN risk. Among 35 cases for which paired samples were available, in 19 patients (59%), we found evidence of at least one of these mutations at the time of pre-tMN sequencing and in 13 (41%), we identified two or more in the pre-tMN sample. In all cases the dominant clone at tMN transformation was defined by a mutation seen at CH. Our serial sampling data provide clear evidence that oncologic therapy strongly selects for clones with mutations in the DDR genes and that these clones have limited competitive fitness in the absence of cytotoxic or radiation therapy. We further validate the relevance of CH as a predictor and precursor of tMN in cancer patients. We show that CH mutations detected prior to tMN diagnosis were consistently part of the dominant clone at tMN diagnosis and demonstrate that oncologic therapy directly promotes clones with mutations in genes associated with chemo-resistant disease such as TP53.
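The abstract compares the growth rate of the VAF across treatment groups but does not give the formula used. As a rough sketch only, one common way to express clonal growth between two serial samples is an exponential rate per year. The function name, the exponential model, and the sample values below are assumptions for illustration and are not taken from the study.

```python
import math

def vaf_growth_rate(vaf_start, vaf_end, months):
    """Exponential growth rate per year of a clone's VAF between two samples.

    Assumes VAF(t) = VAF(0) * exp(rate * t); this model is an illustration,
    not the method reported in the abstract.
    """
    if vaf_start <= 0 or vaf_end <= 0 or months <= 0:
        raise ValueError("VAFs and sampling interval must be positive")
    years = months / 12.0
    return math.log(vaf_end / vaf_start) / years

# Hypothetical clones (made-up values): (gene, VAF at first sample, VAF at second sample)
clones = [("DDR-gene clone", 0.02, 0.08), ("non-DDR clone", 0.05, 0.06)]
interval_months = 23  # the median sampling interval quoted in the abstract

for name, first, second in clones:
    rate = vaf_growth_rate(first, second, interval_months)
    print(f"{name}: growth rate {rate:.2f} per year")
```

A positive rate means the clone expanded between samples, a negative rate means it shrank, and rates from treated and untreated patients could then be compared with whatever statistical test is appropriate.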

  • therapy also resulted in clonal evolution, with changes seen in splice variants and the spliceosome
  • therapy promotes current DDR mutations
  • clonal hematopoiesis arises from selective pressures
  • the number of mutations and variants are all predictive of myeloid disease
  • deferring adjuvant therapy for breast cancer patients in the highest MDS risk group (based on biomarkers) greatly reduced their risk for MDS

5704 – Pan-cancer genomic characterization of patient-matched primary, extracranial, and brain metastases

Presenter/Authors
Olivia W. Lee, Akash Mitra, Won-Chul Lee, Kazutaka Fukumura, Hannah Beird, Miles Andrews, Grant Fischer, John N. Weinstein, Michael A. Davies, Jason Huse, P. Andrew Futreal. The University of Texas MD Anderson Cancer Center, TX, Olivia Newton-John Cancer Research Institute and School of Cancer Medicine, La Trobe University, Australia

Disclosures
O.W. Lee: None. A. Mitra: None. W. Lee: None. K. Fukumura: None. H. Beird: None. M. Andrews: Merck Sharp and Dohme. G. Fischer: None. J.N. Weinstein: None. M.A. Davies: Bristol-Myers Squibb; Novartis; Array BioPharma; Roche and Genentech; GlaxoSmithKline; Sanofi-Aventis; AstraZeneca; Myriad Genetics; Oncothyreon. J. Huse: None. P. Futreal: None.

Abstract: Brain metastases (BM) occur in 10-30% of patients with cancer. Approximately 200,000 new cases of brain metastases are diagnosed in the United States annually, with median survival after diagnosis ranging from 3 to 27 months. Recently, studies have identified significant genetic differences between BM and their corresponding primary tumors. It has been shown that BM harbor clinically actionable mutations that are distinct from those in the primary tumor samples. Additional genomic profiling of BM will provide deeper understanding of the pathogenesis of BM and suggest new therapeutic approaches.
We performed whole-exome sequencing of BM and matched tumors from 41 patients with renal cell carcinoma (RCC), breast cancer, lung cancer, or melanoma, cancer types that are known to be more likely to develop BM. We profiled a total of 126 fresh-frozen tumor samples and performed subsequent analyses of BM in comparison to paired primary tumor and extracranial metastases (ECM). We found that lung cancer shared the largest number of mutations between BM and matched tumors (83%), followed by melanoma (74%), RCC (51%), and breast cancer (26%), indicating that cancer types with high tumor mutational burden share more mutations with BM. Mutational signatures displayed limited differences, suggesting a lack of mutagenic processes specific to BM. However, point-mutation heterogeneity revealed that BM evolve separately into different subclones from their paired tumors regardless of cancer type, and some cancer driver genes were found in BM-specific subclones. These models and findings suggest that these driver genes may drive prometastatic subclones that lead to BM. 32 curated cancer gene mutations were detected and 71% of them were shared between BM and primary tumors or ECM. 29% of mutations were specific to BM, implying that BM often accumulate additional cancer gene mutations that are not present in primary tumors or ECM. Co-mutation analysis revealed a high frequency of TP53 nonsense mutation in BM, mostly in the DNA binding domain, suggesting TP53 nonsense mutation as a possible prerequisite for the development of BM. Copy number alteration analysis showed statistically significant differences between BM and their paired tumor samples in each cancer type (Wilcoxon test, p < 0.0385 for all). Both copy number gains and losses were consistently higher in BM for breast cancer (Wilcoxon test, p = 1.307e-5) and lung cancer (Wilcoxon test, p = 1.942e-5), implying greater genomic instability during the evolution of BM.
Our findings highlight that there are more unique mutations in BM, with significantly higher copy number alterations and tumor mutational burden. These genomic analyses could provide an opportunity for more reliable diagnostic decision-making, and these findings will be further tested with additional transcriptomic and epigenetic profiling for better characterization of BM-specific tumor microenvironments.
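The shared-mutation percentages quoted in the abstract (for example, 83% for lung cancer) can be thought of as a set overlap between the mutations called in a brain metastasis and those called in its matched tumor. The abstract does not say which denominator was used, so the sketch below assumes the fraction of BM mutations that also appear in the matched sample. The function name and the mutation identifiers are hypothetical.

```python
def shared_fraction(bm_mutations, matched_mutations):
    """Fraction of brain-metastasis mutations also found in the matched tumor.

    The choice of denominator (BM mutations only) is an assumption; the
    abstract does not specify how 'shared' was normalised.
    """
    bm = set(bm_mutations)
    if not bm:
        return 0.0
    return len(bm & set(matched_mutations)) / len(bm)

# Hypothetical mutation calls, written as (gene, change) placeholders
bm_calls = {("GENE_A", "var1"), ("GENE_B", "var2"), ("GENE_C", "var3"), ("GENE_D", "var4")}
primary_calls = {("GENE_A", "var1"), ("GENE_B", "var2"), ("GENE_E", "var5")}

print(f"shared with primary: {shared_fraction(bm_calls, primary_calls):.0%}")
print(f"BM-specific: {1 - shared_fraction(bm_calls, primary_calls):.0%}")
```

Using the union of both call sets as the denominator instead would give a Jaccard-style index and lower percentages, so the two conventions should not be mixed when comparing cancer types.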

  • are there genomic signatures different in brain mets versus non metastatic or normal?
  • 32 genes from curated databases were different between brain mets and primary tumor
  • frequent nonsense mutations in TP53
  • divergent clonal evolution of drivers in BMets from primary
  • they were able to match BM with other mutational signatures like smokers and lung cancer signatures

5707 – A standard operating procedure for the interpretation of oncogenicity/pathogenicity of somatic mutations

Presenter/Authors
Peter Horak, Malachi Griffith, Arpad Danos, Beth A. Pitel, Subha Madhavan, Xuelu Liu, Jennifer Lee, Gordana Raca, Shirley Li, Alex H. Wagner, Shashikant Kulkarni, Obi L. Griffith, Debyani Chakravarty, Dmitriy Sonkin. National Center for Tumor Diseases, Heidelberg, Germany, Washington University School of Medicine, St. Louis, MO, Mayo Clinic, Rochester, MN, Georgetown University Medical Center, Washington, DC, Dana-Farber Cancer Institute, Boston, MA, Frederick National Laboratory for Cancer Research, Rockville, MD, University of Southern California, Los Angeles, CA, Sunquest, Boston, MA, Baylor College of Medicine, Houston, TX, Memorial Sloan Kettering Cancer Center, New York, NY, National Cancer Institute, Rockville, MD

Disclosures
P. Horak: None. M. Griffith: None. A. Danos: None. B.A. Pitel: None. S. Madhavan: Perthera Inc. X. Liu: None. J. Lee: None. G. Raca: None. S. Li: Sunquest Information Systems, Inc. A.H. Wagner: None. S. Kulkarni: Baylor Genetics. O.L. Griffith: None. D. Chakravarty: None. D. Sonkin: None.

Abstract
Somatic variants in cancer-relevant genes are interpreted from multiple partially overlapping perspectives. When considered in discovery and translational research endeavors, it is important to determine if a particular variant observed in a gene of interest is oncogenic/pathogenic or not, as such knowledge provides the foundation on which targeted cancer treatment research is based. In contrast, clinical applications are dominated by diagnostic, prognostic, or therapeutic interpretations which in part also depend on underlying variant oncogenicity/pathogenicity. The Association for Molecular Pathology, the American Society of Clinical Oncology, and the College of American Pathologists (AMP/ASCO/CAP) have published structured somatic variant clinical interpretation guidelines which specifically address diagnostic, prognostic, and therapeutic implications. These guidelines have been well-received by the oncology community.
Many variant knowledgebases and clinical laboratories/centers have adopted or are in the process of adopting these guidelines. The AMP/ASCO/CAP guidelines also describe different data types which are used to determine oncogenicity/pathogenicity of a variant, such as: population frequency, functional data, computational predictions, segregation, and somatic frequency. A second collaborative effort created the European Society for Medical Oncology (ESMO) Scale for Clinical Actionability of molecular Targets to provide a harmonized vocabulary that provides an evidence-based ranking system of molecular targets that supports their value as clinical targets. However, neither of these clinical guideline systems provides systematic and comprehensive procedures for aggregating population frequency, functional data, computational predictions, segregation, and somatic frequency to consistently interpret variant oncogenicity/pathogenicity, as has been published in the ACMG/AMP guidelines for interpretation of pathogenicity of germline variants. In order to address this unmet need for somatic variant oncogenicity/pathogenicity interpretation procedures, the Variant Interpretation for Cancer Consortium (VICC, a GA4GH driver project) Knowledge Curation and Interpretation Standards (KCIS) working group (WG) has developed a Standard Operating Procedure (SOP) with contributions from members of the ClinGen Somatic Clinical Domain WG and the ClinGen Somatic/Germline variant curation WG, using an approach similar to the ACMG/AMP germline pathogenicity guidelines to categorize evidence of oncogenicity/pathogenicity as very strong, strong, moderate or supporting. This SOP enables consistent and comprehensive assessment of oncogenicity/pathogenicity of somatic variants, and the latest version of the SOP can be found at https://cancervariants.org/wg/kcis/.
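To make the idea of aggregating evidence by strength more concrete, the sketch below tallies evidence items labelled very strong, strong, moderate, or supporting into a single net score. The numeric weights, the evidence labels, and the function name are placeholders chosen for illustration; the abstract does not state how the SOP combines evidence, so the SOP itself should be consulted for the actual scheme.

```python
# Placeholder weights per evidence strength; illustrative only, NOT quoted from the abstract.
WEIGHTS = {"very strong": 8, "strong": 4, "moderate": 2, "supporting": 1}

def net_evidence_score(evidence):
    """Sum weighted evidence for a variant.

    `evidence` is a list of (strength, direction) pairs, where direction is
    +1 for evidence toward oncogenicity and -1 for evidence toward benignity.
    """
    total = 0
    for strength, direction in evidence:
        if direction not in (1, -1):
            raise ValueError("direction must be +1 or -1")
        total += direction * WEIGHTS[strength]
    return total

# Hypothetical evidence collected for one variant
variant_evidence = [
    ("strong", +1),      # e.g. functional data supporting oncogenicity
    ("moderate", +1),    # e.g. recurrence at a somatic hotspot
    ("supporting", -1),  # e.g. a computational prediction of no impact
]
print(net_evidence_score(variant_evidence))
```

Keeping the direction separate from the strength means the same four strength categories can be reused for evidence on both sides, which mirrors the germline-style approach the abstract describes.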

  • best to use this SOP for somatic mutations and not rearrangements
  • evidence for variant oncogenicity is graded from strong to weak
  • useful variant knowledge on pathogenicity is curated from known databases
  • the recommendations would provide some guidance on curating unknown somatic variants versus known variants of hereditary diseases
  • they have not curated RB1 mutations or variants (or those of other RB family members such as RB2/p130?)

 

Follow on Twitter at:

@pharma_BI

@AACR

@CureCancerNow

@pharmanews

@BiotechWorld

#AACR20

 


Complex rearrangements and oncogene amplification revealed by long-read DNA and RNA sequencing of a breast cancer cell line, Volume 2 (Volume Two: Latest in Genomics Methodologies for Therapeutics: Gene Editing, NGS and BioInformatics, Simulations and the Genome Ontology), Part 1: Next Generation Sequencing (NGS)


Reporter: Stephen J. Williams, PhD

In a Genome Research report by Marie Nattestad et al. [1], the SK-BR-3 breast cancer cell line was sequenced using a long-read single-molecule sequencing protocol in order to develop one of the most detailed maps of structural variations in a cancer genome to date. The authors detected over 20,000 variants with this new sequencing modality, most of which would have been missed by short-read sequencing. In addition, a complex sequence of nested duplications and translocations was found surrounding the ERBB2 (HER2) locus, while full-length transcriptomic analysis revealed novel gene fusions within the nested genomic variants. The authors suggest that combining this long-read genome and transcriptome sequencing results in a more comprehensive coverage of tumor gene variants and “sheds new light on the complex mechanisms involved in cancer genome evolution.”

Genomic instability is a hallmark of cancer [2], which leads to numerous genetic variations such as:

  • Copy number variations
  • Chromosomal alterations
  • Gene fusions
  • Deletions
  • Gene duplications
  • Insertions
  • Translocations

Efforts such as The Cancer Genome Atlas [3] and the International Cancer Genome Consortium (2010) use short-read sequencing technology to detect and analyze thousands of commonly occurring mutations. However, short-read technology has high false-positive and false-negative rates for detecting less common structural variations (as high as 50% [4]). In addition, short reads cannot detect variations in close proximity to each other or on the same molecule, and therefore underestimate the number of variations.

Methods:  The authors used a long-read sequencing technology from Pacific Biosciences (SMRT) to analyze the mutational and structural variation in the SK-BR-3 breast cancer cell line.  A split read and within-read mapping approach was used to detect variants of different types and sizes.  In general, long-reads have better alignment qualities than short reads, resulting in higher quality mapping. Transcriptomic analysis was performed using Iso-Seq.

Results: Using the SMRT long-read sequencing technology from Pacific Biosciences, the authors were able to obtain 71.9× sequencing coverage with an average read length of 9.8 kb for the SK-BR-3 genome.

A few notes:

  1. The most amplified regions were around the locus spanning the ERBB2 oncogene (33.6 copies), the MYC locus (38 copies), the EGFR locus (7 copies) and BCAS1 (16.8 copies)
  2. The locus 8q24.12, which contains the SNTB1 gene, had the highest amplification at 69.2 copies
  3. Long-read sequencing showed more insertions than deletions, which suggests that the lengths of low-complexity regions are underestimated in the human reference genome
  4. Found 1,493 long-range variants, 603 of which were between different chromosomes
  5. Using Iso-Seq in conjunction with the long-read platform, they detected 1,692,379 isoforms (93%) mapping to the reference genome and 53 putative gene fusions (for 39 of which they found genomic evidence)

A table modified from the paper on the gene fusions is given below:

Table 1. Gene fusions with RNA evidence from Iso-Seq and DNA evidence from SMRT DNA sequencing where the genomic path is found using SplitThreader from Sniffles variant calls. Note link in table is  GeneCard for each gene.

The columns Distance (bp), Number of variants, and Chromosomes in path describe the SplitThreader path.

#   Genes                  Distance (bp)  Number of variants  Chromosomes in path  Previously observed in references
1   KLHDC2 / SNTB1         9837           3                   14|17|8              Asmann et al. (2011) as only a 2-hop fusion
2   CYTH1 / EIF3H          8654           2                   17|8                 Edgren et al. (2011); Kim and Salzberg (2011); RNA only, not observed as 2-hop
3   CPNE1 / PREX1          1777           2                   20                   Found and validated as 2-hop by Chen et al. 2013
4   GSDMB / TATDN1         0              1                   17|8                 Edgren et al. (2011); Kim and Salzberg (2011); Chen et al. (2013); validated by Edgren et al. (2011)
5   LINC00536 / PVT1       0              1                   8                    No
6   MTBP / SAMD12          0              1                   8                    Validated by Edgren et al. (2011)
7   LRRFIP2 / SUMF1        0              1                   3                    Edgren et al. (2011); Kim and Salzberg (2011); Chen et al. (2013); validated by Edgren et al. (2011)
8   FBXL7 / TRIO           0              1                   5                    No
9   ATAD5 / TLK2           0              1                   17                   No
10  DHX35 / ITCH           0              1                   20                   Validated by Edgren et al. (2011)
11  LMCD1-AS1 / MECOM      0              1                   3                    No
12  PHF20 / RP4-723E3.1    0              1                   20                   No
13  RAD51B / SEMA6D        0              1                   14|15                No
14  STAU1 / TOX2           0              1                   20                   No
15  TBC1D31 / ZNF704       0              1                   8                    Edgren et al. (2011); Kim and Salzberg (2011); Chen et al. (2013); validated by Edgren et al. (2011); Chen et al. (2013)

 

SplitThreader found two different paths for the RAD51B-SEMA6D gene fusion and for the LINC00536-PVT1 gene fusion. Number of Iso-Seq reads refers to full-length HQ-filtered reads. Alignments of SMRT DNA sequence reads supporting each of these gene fusions are shown in Supplemental Note S2.

 

 References

 

  1. Nattestad M, Goodwin S, Ng K, Baslan T, Sedlazeck FJ, Rescheneder P, Garvin T, Fang H, Gurtowski J, Hutton E et al: Complex rearrangements and oncogene amplifications revealed by long-read DNA and RNA sequencing of a breast cancer cell line. Genome research 2018, 28(8):1126-1135.
  2. Hanahan D, Weinberg RA: The hallmarks of cancer. Cell 2000, 100(1):57-70.
  3. Kandoth C, McLellan MD, Vandin F, Ye K, Niu B, Lu C, Xie M, Zhang Q, McMichael JF, Wyczalkowski MA et al: Mutational landscape and significance across 12 major cancer types. Nature 2013, 502(7471):333-339.
  4. Sudmant PH, Rausch T, Gardner EJ, Handsaker RE, Abyzov A, Huddleston J, Zhang Y, Ye K, Jun G, Fritz MH et al: An integrated map of structural variation in 2,504 human genomes. Nature 2015, 526(7571):75-81.

 

Other articles on Cancer Genome Sequencing in this Open Access Journal Include:

 

International Cancer Genome Consortium Website has 71 Committed Cancer Genome Projects Ongoing

Loss of Gene Islands May Promote a Cancer Genome’s Evolution: A new Hypothesis on Oncogenesis

Identifying Aggressive Breast Cancers by Interpreting the Mathematical Patterns in the Cancer Genome

CancerBase.org – The Global HUB for Diagnoses, Genomes, Pathology Images: A Real-time Diagnosis and Therapy Mapping Service for Cancer Patients – Anonymized Medical Records accessible to

 


Bioinformatic Tools for Cancer Mutational Analysis: COSMIC and Beyond, Volume 2 (Volume Two: Latest in Genomics Methodologies for Therapeutics: Gene Editing, NGS and BioInformatics, Simulations and the Genome Ontology), Part 1: Next Generation Sequencing (NGS)


Curator: Stephen J. Williams, Ph.D.

Updated 7/26/2019

Updated 04/27/2019

Signatures of Mutational Processes in Human Cancer (from COSMIC)

From The COSMIC Database


The genomic landscape of cancer. The COSMIC database is a fully curated and annotated database of recurrent genetic mutations found in various cancers (data taken from cancer sequencing projects). For an interactive map please go to the COSMIC database here: http://cancer.sanger.ac.uk/cosmic

 

 

Somatic mutations are present in all cells of the human body and occur throughout life. They are the consequence of multiple mutational processes, including the intrinsic slight infidelity of the DNA replication machinery, exogenous or endogenous mutagen exposures, enzymatic modification of DNA and defective DNA repair. Different mutational processes generate unique combinations of mutation types, termed “Mutational Signatures”.

In the past few years, large-scale analyses have revealed many mutational signatures across the spectrum of human cancer types [Nik-Zainal S. et al., Cell (2012);Alexandrov L.B. et al., Cell Reports (2013);Alexandrov L.B. et al., Nature (2013);Helleday T. et al., Nat Rev Genet (2014);Alexandrov L.B. and Stratton M.R., Curr Opin Genet Dev (2014)]. However, as the number of mutational signatures grows the need for a curated census of signatures has become apparent. Here, we deliver such a resource by providing the profiles of, and additional information about, known mutational signatures.

The current set of mutational signatures is based on an analysis of 10,952 exomes and 1,048 whole-genomes across 40 distinct types of human cancer. These analyses are based on curated data that were generated by The Cancer Genome Atlas (TCGA), the International Cancer Genome Consortium (ICGC), and a large set of freely available somatic mutations published in peer-reviewed journals. Complete details about the data sources will be provided in future releases of COSMIC.

The profile of each signature is displayed using the six substitution subtypes: C>A, C>G, C>T, T>A, T>C, and T>G (all substitutions are referred to by the pyrimidine of the mutated Watson–Crick base pair). Further, each of the substitutions is examined by incorporating information on the bases immediately 5’ and 3’ to each mutated base generating 96 possible mutation types (6 types of substitution ∗ 4 types of 5’ base ∗ 4 types of 3’ base). Mutational signatures are displayed and reported based on the observed trinucleotide frequency of the human genome, i.e., representing the relative proportions of mutations generated by each signature based on the actual trinucleotide frequencies of the reference human genome version GRCh37. Note that only validated mutational signatures have been included in the curated census of mutational signatures.
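The 96 mutation types described above (6 substitutions × 4 possible 5’ bases × 4 possible 3’ bases) can be enumerated directly. The sketch below also shows how a substitution observed at a purine is reported by the pyrimidine of the base pair, by taking the reverse complement of its trinucleotide context. The bracket notation such as `A[C>T]G` and the function names are conventions chosen for this illustration, not something defined in the COSMIC text.

```python
from itertools import product

BASES = "ACGT"
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def mutation_types():
    """Return all 96 types, e.g. 'A[C>T]G' = C>T with 5' base A and 3' base G."""
    return [f"{five}[{sub}]{three}"
            for sub, five, three in product(SUBSTITUTIONS, BASES, BASES)]

def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def classify(trinucleotide, alt):
    """Map a reference trinucleotide (mutated base in the middle) and the
    alternate base to one of the 96 types, using the pyrimidine strand."""
    ref = trinucleotide[1]
    if ref in "AG":  # purine reference: describe the change on the opposite strand
        trinucleotide = reverse_complement(trinucleotide)
        alt = reverse_complement(alt)
        ref = trinucleotide[1]
    return f"{trinucleotide[0]}[{ref}>{alt}]{trinucleotide[2]}"

types = mutation_types()
print(len(types))          # number of mutation types
print(classify("ACG", "T"))  # C>T at a CpG site, already on the pyrimidine strand
print(classify("TGA", "A"))  # G>A, reported as C>T on the opposite strand
```

Counting how many mutations in a sample fall into each of these 96 types gives the profile that is then compared against the catalogued signatures.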

Additional information is provided for each signature, including the cancer types in which the signature has been found, proposed aetiology for the mutational processes underlying the signature, other mutational features that are associated with each signature and information that may be relevant for better understanding of a particular mutational signature.

The set of signatures will be updated in the future. This will include incorporating additional mutation types (e.g., indels, structural rearrangements, and localized hypermutation such as kataegis) and cancer samples. With more cancer genome sequences and the additional statistical power this will bring, new signatures may be found, the profiles of current signatures may be further refined, signatures may split into component signatures and signatures may be found in cancer types in which they are currently not detected.

See their COSMIC tutorial page here for instructional videos

Updated News: COSMIC v75 – 24th November 2015

COSMIC v75 includes curations across GRIN2A, fusion pair TCF3-PBX1, and genomic data from 17 systematic screen publications. We are also beginning a reannotation of TCGA exome datasets using Sanger’s Cancer Genome Project analysis pipeline to ensure consistency; four studies are included in this release, to be expanded across the next few releases. The Cancer Gene Census now has a dedicated curator, Dr. Zbyslaw Sondka, who will be focused on expanding the Census, enhancing the evidence underpinning it, and developing improved expert-curated detail describing each gene’s impact in cancer. Finally, as we begin to streamline our ever-growing website, we have combined all information for each gene onto one page and simplified the layout and design to improve navigation.

Mutational signatures across human cancer

Patterns of mutational signatures

COSMIC database identifies 30 mutational signatures in human cancer

Please go to the COSMIC site to see a larger .png of the mutational signatures

Signature 1

Cancer types:

Signature 1 has been found in all cancer types and in most cancer samples.

Proposed aetiology:

Signature 1 is the result of an endogenous mutational process initiated by spontaneous deamination of 5-methylcytosine.

Additional mutational features:

Signature 1 is associated with small numbers of small insertions and deletions in most tissue types.

Comments:

The number of Signature 1 mutations correlates with age of cancer diagnosis.

Signature 2

Cancer types:

Signature 2 has been found in 22 cancer types, but most commonly in cervical and bladder cancers. In most of these 22 cancer types, Signature 2 is present in at least 10% of samples.

Proposed aetiology:

Signature 2 has been attributed to activity of the AID/APOBEC family of cytidine deaminases. On the basis of similarities in the sequence context of cytosine mutations caused by APOBEC enzymes in experimental systems, a role for APOBEC1, APOBEC3A and/or APOBEC3B in human cancer appears more likely than for other members of the family.

Additional mutational features:

Transcriptional strand bias of mutations has been observed in exons, but is not present or is weaker in introns.

Comments:

Signature 2 is usually found in the same samples as Signature 13. It has been proposed that activation of AID/APOBEC cytidine deaminases is due to viral infection, retrotransposon jumping or to tissue inflammation. Currently, there is limited evidence to support these hypotheses. A germline deletion polymorphism involving APOBEC3A and APOBEC3B is associated with the presence of large numbers of Signature 2 and 13 mutations and with predisposition to breast cancer. Mutations of similar patterns to Signatures 2 and 13 are commonly found in the phenomenon of local hypermutation present in some cancers, known as kataegis, potentially implicating AID/APOBEC enzymes in this process as well.

Signature 3

Cancer types:

Signature 3 has been found in breast, ovarian, and pancreatic cancers.

Proposed aetiology:

Signature 3 is associated with failure of DNA double-strand break-repair by homologous recombination.

Additional mutational features:

Signature 3 associates strongly with elevated numbers of large (longer than 3bp) insertions and deletions with overlapping microhomology at breakpoint junctions.

Comments:

Signature 3 is strongly associated with germline and somatic BRCA1 and BRCA2 mutations in breast, pancreatic, and ovarian cancers. In pancreatic cancer, responders to platinum therapy usually exhibit Signature 3 mutations.

Signature 4

Cancer types:

Signature 4 has been found in head and neck cancer, liver cancer, lung adenocarcinoma, lung squamous carcinoma, small cell lung carcinoma, and oesophageal cancer.

Proposed aetiology:

Signature 4 is associated with smoking and its profile is similar to the mutational pattern observed in experimental systems exposed to tobacco carcinogens (e.g., benzo[a]pyrene). Signature 4 is likely due to tobacco mutagens.

Additional mutational features:

Signature 4 exhibits transcriptional strand bias for C>A mutations, compatible with the notion that damage to guanine is repaired by transcription-coupled nucleotide excision repair. Signature 4 is also associated with CC>AA dinucleotide substitutions.

Comments:

Signature 29 is found in cancers associated with tobacco chewing and appears different from Signature 4.

Signature 5

Cancer types:

Signature 5 has been found in all cancer types and most cancer samples.

Proposed aetiology:

The aetiology of Signature 5 is unknown.

Additional mutational features:

Signature 5 exhibits transcriptional strand bias for T>C substitutions at ApTpN context.

Comments:

Signature 6

Cancer types:

Signature 6 has been found in 17 cancer types and is most common in colorectal and uterine cancers. In most other cancer types, Signature 6 is found in less than 3% of examined samples.

Proposed aetiology:

Signature 6 is associated with defective DNA mismatch repair and is found in microsatellite unstable tumours.

Additional mutational features:

Signature 6 is associated with high numbers of small (shorter than 3bp) insertions and deletions at mono/polynucleotide repeats.

Comments:

Signature 6 is one of four mutational signatures associated with defective DNA mismatch repair and is often found in the same samples as Signatures 15, 20, and 26.

Signature 7

Cancer types:

Signature 7 has been found predominantly in skin cancers and in cancers of the lip categorized as head and neck or oral squamous cancers.

Proposed aetiology:

Based on its prevalence in ultraviolet exposed areas and the similarity of the mutational pattern to that observed in experimental systems exposed to ultraviolet light, Signature 7 is likely due to ultraviolet light exposure.

Additional mutational features:

Signature 7 is associated with large numbers of CC>TT dinucleotide mutations at dipyrimidines. Additionally, Signature 7 exhibits a strong transcriptional strand-bias indicating that mutations occur at pyrimidines (viz., by formation of pyrimidine-pyrimidine photodimers) and these mutations are being repaired by transcription-coupled nucleotide excision repair.

Comments:

Signature 8

Cancer types:

Signature 8 has been found in breast cancer and medulloblastoma.

Proposed aetiology:

The aetiology of Signature 8 remains unknown.

Additional mutational features:

Signature 8 exhibits weak strand bias for C>A substitutions and is associated with double nucleotide substitutions, notably CC>AA.

Comments:

Signature 9

Cancer types:

Signature 9 has been found in chronic lymphocytic leukaemias and malignant B-cell lymphomas.

Proposed aetiology:

Signature 9 is characterized by a pattern of mutations that has been attributed to polymerase η, which is implicated with the activity of AID during somatic hypermutation.

Additional mutational features:

Comments:

Chronic lymphocytic leukaemias that possess immunoglobulin gene hypermutation (IGHV-mutated) have elevated numbers of mutations attributed to Signature 9 compared to those that do not have immunoglobulin gene hypermutation.

Signature 10

Cancer types:

Signature 10 has been found in six cancer types, notably colorectal and uterine cancer, usually generating huge numbers of mutations in small subsets of samples.

Proposed aetiology:

It has been proposed that the mutational process underlying this signature is altered activity of the error-prone polymerase POLE. The presence of large numbers of Signature 10 mutations is associated with recurrent POLE somatic mutations, viz., Pro286Arg and Val411Leu.

Additional mutational features:

Signature 10 exhibits strand bias for C>A mutations at TpCpT context and T>G mutations at TpTpT context.

Comments:

Signature 10 is associated with some of the most mutated cancer samples. Samples exhibiting this mutational signature have been termed ultra-hypermutators.
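The context notation used in these entries (for example, C>A mutations at TpCpT) reports each substitution from the pyrimidine of the mutated base pair, flanked by its 5' and 3' neighbours, which gives 6 substitution types × 16 contexts = 96 classes. As a minimal sketch, assuming a variant is supplied as a reference base, an alternate base and the two flanking bases on the same strand, the normalisation could look like this in Python (the function names and the label format are illustrative and not part of COSMIC):

```python
# Complement table for the four DNA bases.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def classify_substitution(five_prime, ref, alt, three_prime):
    """Return a label such as 'T[C>A]T', always written from the pyrimidine."""
    ref, alt = ref.upper(), alt.upper()
    context = (five_prime + ref + three_prime).upper()
    if ref in ("A", "G"):
        # Purine reference: describe the change on the opposite strand instead.
        context = reverse_complement(context)
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{context[0]}[{ref}>{alt}]{context[2]}"


# A G>T change in an ApGpA context is the same event as C>A at TpCpT.
label = classify_substitution("A", "G", "T", "A")
```

Counting these labels per sample yields the 96-channel profile against which signatures such as Signature 10 are described.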

Signature 11

Cancer types:

Signature 11 has been found in melanoma and glioblastoma.

Proposed aetiology:

Signature 11 exhibits a mutational pattern resembling that of alkylating agents. Patient histories have revealed an association between treatments with the alkylating agent temozolomide and Signature 11 mutations.

Additional mutational features:

Signature 11 exhibits a strong transcriptional strand-bias for C>T substitutions indicating that mutations occur on guanine and that these mutations are effectively repaired by transcription-coupled nucleotide excision repair.

Comments:

Signature 12

Cancer types:

Signature 12 has been found in liver cancer.

Proposed aetiology:

The aetiology of Signature 12 remains unknown.

Additional mutational features:

Signature 12 exhibits a strong transcriptional strand-bias for T>C substitutions.

Comments:

Signature 12 usually contributes a small percentage (<20%) of the mutations observed in a liver cancer sample.

Signature 13

Cancer types:

Signature 13 has been found in 22 cancer types and seems to be commonest in cervical and bladder cancers. In most of these 22 cancer types, Signature 13 is present in at least 10% of samples.

Proposed aetiology:

Signature 13 has been attributed to activity of the AID/APOBEC family of cytidine deaminases converting cytosine to uracil. On the basis of similarities in the sequence context of cytosine mutations caused by APOBEC enzymes in experimental systems, a role for APOBEC1, APOBEC3A and/or APOBEC3B in human cancer appears more likely than for other members of the family. Signature 13 causes predominantly C>G mutations. This may be due to generation of abasic sites after removal of uracil by base excision repair and replication over these abasic sites by REV1.

Additional mutational features:

Transcriptional strand bias of mutations has been observed in exons, but is not present or is weaker in introns.

Comments:

Signature 2 is usually found in the same samples as Signature 13. It has been proposed that activation of AID/APOBEC cytidine deaminases is due to viral infection, retrotransposon jumping or to tissue inflammation. Currently, there is limited evidence to support these hypotheses. A germline deletion polymorphism involving APOBEC3A and APOBEC3B is associated with the presence of large numbers of Signature 2 and 13 mutations and with predisposition to breast cancer. Mutations of similar patterns to Signatures 2 and 13 are commonly found in the phenomenon of local hypermutation present in some cancers, known as kataegis, potentially implicating AID/APOBEC enzymes in this process as well.

Signature 14

Cancer types:

Signature 14 has been observed in four uterine cancers and a single adult low-grade glioma sample.

Proposed aetiology:

The aetiology of Signature 14 remains unknown.

Additional mutational features:

Comments:

Signature 14 generates very high numbers of somatic mutations (>200 mutations per MB) in all samples in which it has been observed.
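A burden threshold such as the >200 mutations per Mb noted for Signature 14 is a simple ratio of mutation count to the number of megabases surveyed. A minimal sketch follows; the mutation count and the size of the callable territory are hypothetical values chosen for illustration:

```python
def mutations_per_mb(mutation_count, callable_bases):
    """Somatic mutations per megabase of sequence that could be assessed."""
    if callable_bases <= 0:
        raise ValueError("callable_bases must be positive")
    return mutation_count / (callable_bases / 1_000_000)


# Hypothetical example: 7,500 mutations over 30 Mb of callable sequence.
burden = mutations_per_mb(7_500, 30_000_000)
exceeds_threshold = burden > 200
```

The denominator matters: the same count over a whole genome rather than an exome-sized target would give a far lower burden.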

Signature 15

Cancer types:

Signature 15 has been found in several stomach cancers and a single small cell lung carcinoma.

Proposed aetiology:

Signature 15 is associated with defective DNA mismatch repair.

Additional mutational features:

Signature 15 is associated with high numbers of small (shorter than 3bp) insertions and deletions at mono/polynucleotide repeats.

Comments:

Signature 15 is one of four mutational signatures associated with defective DNA mismatch repair and is often found in the same samples as Signatures 6, 20, and 26.

Signature 16

Cancer types:

Signature 16 has been found in liver cancer.

Proposed aetiology:

The aetiology of Signature 16 remains unknown.

Additional mutational features:

Signature 16 exhibits an extremely strong transcriptional strand bias for T>C mutations at ApTpN context, with T>C mutations occurring almost exclusively on the transcribed strand.

Comments:

Signature 17

Cancer types:

Signature 17 has been found in oesophagus cancer, breast cancer, liver cancer, lung adenocarcinoma, B-cell lymphoma, stomach cancer and melanoma.

Proposed aetiology:

The aetiology of Signature 17 remains unknown.

Additional mutational features:

Comments:

Signature 18

Cancer types:

Signature 18 has been found commonly in neuroblastoma. Additionally, Signature 18 has also been observed in breast and stomach carcinomas.

Proposed aetiology:

The aetiology of Signature 18 remains unknown.

Additional mutational features:

Comments:

Signature 19

Cancer types:

Signature 19 has been found only in pilocytic astrocytoma.

Proposed aetiology:

The aetiology of Signature 19 remains unknown.

Additional mutational features:

Comments:

Signature 20

Cancer types:

Signature 20 has been found in stomach and breast cancers.

Proposed aetiology:

Signature 20 is believed to be associated with defective DNA mismatch repair.

Additional mutational features:

Signature 20 is associated with high numbers of small (shorter than 3bp) insertions and deletions at mono/polynucleotide repeats.

Comments:

Signature 20 is one of four mutational signatures associated with defective DNA mismatch repair and is often found in the same samples as Signatures 6, 15, and 26.

Signature 21

Cancer types:

Signature 21 has been found only in stomach cancer.

Proposed aetiology:

The aetiology of Signature 21 remains unknown.

Additional mutational features:

Comments:

Signature 21 is found only in four samples, all generated by the same sequencing centre. The mutational pattern of Signature 21 is somewhat similar to that of Signature 26. Additionally, Signature 21 is found only in samples that also have Signatures 15 and 20. As such, Signature 21 is probably also related to microsatellite unstable tumours.

Signature 22

Cancer types:

Signature 22 has been found in urothelial (renal pelvis) carcinoma and liver cancers.

Proposed aetiology:

Signature 22 has been found in cancer samples with known exposures to aristolochic acid. Additionally, the pattern of mutations exhibited by the signature is consistent with that previously observed in experimental systems exposed to aristolochic acid.

Additional mutational features:

Signature 22 exhibits a very strong transcriptional strand bias for T>A mutations indicating adenine damage that is being repaired by transcription-coupled nucleotide excision repair.

Comments:

Signature 22 has a very high mutational burden in urothelial carcinoma; however, its mutational burden is much lower in liver cancers.
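Transcriptional strand bias, cited for Signature 22 and several other signatures above, can be quantified by comparing how many mutations of a given type fall on the transcribed versus the untranscribed strand. The sketch below is one simple way to do this, assuming the two counts are already available and using an exact two-sided binomial test against an even split. It is not the method used by COSMIC, and the function name is hypothetical:

```python
from math import comb


def strand_bias(transcribed, untranscribed):
    """Return (fraction on transcribed strand, two-sided exact binomial p-value)."""
    total = transcribed + untranscribed
    if total == 0:
        raise ValueError("at least one mutation is required")
    fraction = transcribed / total
    # Under the null of no bias, each mutation lands on either strand with p = 0.5.
    # The distribution is symmetric, so double the upper tail of the larger count.
    larger = max(transcribed, untranscribed)
    upper_tail = sum(comb(total, k) for k in range(larger, total + 1)) / 2 ** total
    p_value = min(1.0, 2 * upper_tail)
    return fraction, p_value


# Hypothetical counts: all ten mutations on one strand.
fraction, p_value = strand_bias(10, 0)
```

A fraction far from 0.5 with a small p-value would be consistent with the strand asymmetry that transcription-coupled repair is expected to produce.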

Signature 23

Cancer types:

Signature 23 has been found only in a single liver cancer sample.

Proposed aetiology:

The aetiology of Signature 23 remains unknown.

Additional mutational features:

Signature 23 exhibits very strong transcriptional strand bias for C>T mutations.

Comments:

Signature 24

Cancer types:

Signature 24 has been observed in a subset of liver cancers.

Proposed aetiology:

Signature 24 has been found in cancer samples with known exposures to aflatoxin. Additionally, the pattern of mutations exhibited by the signature is consistent with that previously observed in experimental systems exposed to aflatoxin.

Additional mutational features:

Signature 24 exhibits a very strong transcriptional strand bias for C>A mutations indicating guanine damage that is being repaired by transcription-coupled nucleotide excision repair.

Comments:

Signature 25

Cancer types:

Signature 25 has been observed in Hodgkin lymphomas.

Proposed aetiology:

The aetiology of Signature 25 remains unknown.

Additional mutational features:

Signature 25 exhibits transcriptional strand bias for T>A mutations.

Comments:

This signature has only been identified in Hodgkin’s cell lines. Data is not available from primary Hodgkin lymphomas.

Signature 26

Cancer types:

Signature 26 has been found in breast cancer, cervical cancer, stomach cancer and uterine carcinoma.

Proposed aetiology:

Signature 26 is believed to be associated with defective DNA mismatch repair.

Additional mutational features:

Signature 26 is associated with high numbers of small (shorter than 3bp) insertions and deletions at mono/polynucleotide repeats.

Comments:

Signature 26 is one of four mutational signatures associated with defective DNA mismatch repair and is often found in the same samples as Signatures 6, 15 and 20.

Signature 27

Cancer types:

Signature 27 has been observed in a subset of kidney clear cell carcinomas.

Proposed aetiology:

The aetiology of Signature 27 remains unknown.

Additional mutational features:

Signature 27 exhibits very strong transcriptional strand bias for T>A mutations. Signature 27 is associated with high numbers of small (shorter than 3bp) insertions and deletions at mono/polynucleotide repeats.

Comments:

Signature 28

Cancer types:

Signature 28 has been observed in a subset of stomach cancers.

Proposed aetiology:

The aetiology of Signature 28 remains unknown.

Additional mutational features:

Comments:

Signature 29

Cancer types:

Signature 29 has been observed only in gingivo-buccal oral squamous cell carcinoma.

Proposed aetiology:

Signature 29 has been found in cancer samples from individuals with a tobacco chewing habit.

Additional mutational features:

Signature 29 exhibits transcriptional strand bias for C>A mutations indicating guanine damage that is most likely repaired by transcription-coupled nucleotide excision repair. Signature 29 is also associated with CC>AA dinucleotide substitutions.

Comments:

The Signature 29 pattern of C>A mutations due to tobacco chewing appears different from the pattern of mutations due to tobacco smoking reflected by Signature 4.

Signature 30

Cancer types:

Signature 30 has been observed in a small subset of breast cancers.

Proposed aetiology:

The aetiology of Signature 30 remains unknown.

Examples in the literature of deposits into or analysis from the COSMIC database

The Genomic Landscapes of Human Breast and Colorectal Cancers, from Wood et al., Science 2007; 318 (5853): 1108-1113

“analysis of exons representing 20,857 transcripts from 18,191 genes, we conclude that the genomic landscapes of breast and colorectal cancers are composed of a handful of commonly mutated gene “mountains” and a much larger number of gene “hills” that are mutated at low frequency.”

  • found cellular pathways that were preferentially mutated
  • analyzed a highly curated database (Metacore, GeneGo, Inc.) that includes human protein-protein interactions, signal transduction and metabolic pathways
  • There were 108 pathways that were found to be preferentially mutated in breast tumors. Many of the pathways involved phosphatidylinositol 3-kinase (PI3K) signaling
  • the cancer genome landscape consists of relief features (mutated genes) with heterogeneous heights (determined by CaMP scores). There are a few “mountains” representing individual CAN-genes mutated at high frequency. However, the landscapes contain a much larger number of “hills” representing the CAN-genes that are mutated at relatively low frequency. It is notable that this general genomic landscape (few gene mountains and many gene hills) is a common feature of both breast and colorectal tumors.
  • developed software to analyze multiple mutations and mutation frequencies, available from Harvard Bioinformatics at http://bcb.dfci.harvard.edu/~gp/software/CancerMutationAnalysis/cma.htm

R Software for Cancer Mutation Analysis (download here)

 

CancerMutationAnalysis Version 1.0:

R package to reproduce the statistical analyses of the Sjoblom et al article and the associated Technical Comment. This package is built for reproducibility of the original results and not for flexibility. Future versions will be more general and define classes for the data types used. Further details are available in Working Paper 126.

CancerMutationAnalysis Version 2.0:

R package to reproduce the statistical analyses of the Wood et al article. Like its predecessor, this package is still built for reproducibility of the original results and not for flexibility. Further details are available in Working Paper 126.

Update 04/27/2019

Review 2018. The COSMIC Cancer Gene Census: describing genetic dysfunction across all human cancers. Z. Sondka et al. Nature Reviews Cancer. 2018.

The Catalogue of Somatic Mutations in Cancer (COSMIC) Cancer Gene Census (CGC) periodically reevaluates the cancer genome landscape and curates the findings into a database of genetic changes occurring in various tumor types. The 2018 CGC describes in detail the effect of 719 cancer-driving genes. The recent expansion includes functional and mechanistic descriptions of how each gene contributes to disease etiology, framed in terms of the cancer hallmarks described by Hanahan and Weinberg. These functional characteristics show the complexity of the cancer mutational landscape and genome and suggest “multiple cancer-related functions for many genes, which are often highly tissue-dependent or tumour stage-dependent.” The 2018 CGC expands a second tier of genes, broadening the list of cancer-related genes.

Criteria for curation of genes into CGC (curation process)

  • candidate genes are selected from published literature, conference abstracts, large cancer genome screens deposited in databases, and analysis of the current COSMIC database
  • COSMIC data are analyzed to determine presence of patterns of somatic mutations and frequency of such mutations in cancer
  • literature review to determine the role of the gene in cancer
  • Minimum evidence

– at least two publications from different groups showing increased mutation frequency in at least one type of cancer (PubMed)

–  at least two publications from different groups showing experimental evidence of functional involvement in at least one hallmark of cancer in order to classify the mutant gene as oncogene, tumor suppressor, or fusion partner (like BCR-Abl)

  • independent assessment by at least two postdoctoral fellows
  • gene must be classified as either a Tier 1 or Tier 2 CGC gene
  • inclusion in database
  • continued curation efforts

definitions:

Tier 1 gene: genes which have strong evidence from both mutational and functional analysis as being involved in cancer

Tier 2 gene: genes with mutational patterns typical of cancer drivers but not functionally characterized as well as genes with published mechanistic description of involvement in cancer but without proof of somatic mutations in cancer
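The tier definitions above can be read as a small decision rule over two kinds of evidence. The sketch below is a deliberate simplification: it treats the "at least two publications from different groups" minimum as the threshold for each evidence type, whereas the real process is manual curation with independent assessment. The class, field names and gene symbols are hypothetical:

```python
from dataclasses import dataclass

# Minimum number of independent publications, per the curation criteria above.
MIN_PUBLICATIONS = 2


@dataclass
class GeneEvidence:
    symbol: str
    mutation_publications: int    # groups reporting increased mutation frequency
    functional_publications: int  # groups reporting functional involvement


def assign_tier(evidence):
    """Return 'Tier 1', 'Tier 2', or None when the evidence is insufficient."""
    mutational = evidence.mutation_publications >= MIN_PUBLICATIONS
    functional = evidence.functional_publications >= MIN_PUBLICATIONS
    if mutational and functional:
        return "Tier 1"  # both mutational and functional evidence
    if mutational or functional:
        return "Tier 2"  # only one line of evidence so far
    return None


tiers = [
    assign_tier(GeneEvidence("GENE_A", 3, 2)),
    assign_tier(GeneEvidence("GENE_B", 2, 0)),
    assign_tier(GeneEvidence("GENE_C", 1, 1)),
]
```

Returning None for insufficient evidence keeps candidate genes that are still under review separate from those that have entered the census.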

Current Status of Tier 1 and Tier 2 genes in CGC

Tier 1 genes (574 genes): include 79 oncogenes, 140 tumor suppressor genes, 93 fusion partners

Tier 2 genes (719 genes): include 103 oncogenes, 181 tumor suppressors, 134 fusion partners and 31 with unknown function

Updated 7/26/2019

The COSMIC database is undergoing an extensive update and reannotation, in order to ensure standardisation and modernisation across COSMIC data. This will substantially improve the identification of unique variants that may have been described at the genome, transcript and/or protein level. The introduction of a Genomic Identifier, along with complete annotation across multiple, high quality Ensembl transcripts and improved compliance with current HGVS syntax, will enable variant matching both within COSMIC and across other bioinformatic datasets.

As a result of these updates there will be significant changes in the upcoming releases as we work through this process. The first stage of this work was the introduction of improved HGVS syntax compliance in our May release. The majority of the changes will be reflected in COSMIC v90, which will be released in late August or early September, and the remaining changes will be introduced over the next few releases.

The significant changes in v90 include:

  • Updated genes, transcripts and proteins from Ensembl release 93 on both the GRCh37 and GRCh38 assemblies.
  • Full reannotation of COSMIC variants with known genomic coordinates using Ensembl’s Variant Effect Predictor (VEP). This provides accurate and standardised annotation uniformly across all relevant transcripts and genes that include the genomic location of the variant.
  • New stable genomic identifiers (COSV) that indicate the definitive position of the variant on the genome. These unique identifiers allow variants to be mapped between GRCh37 and GRCh38 assemblies and displayed on a selection of transcripts.
  • Updated cross-reference links between COSMIC genes and other widely-used databases such as HGNC, RefSeq, Uniprot and CCDS.
  • Complete standardised representation of COSMIC variants, following the most recent HGVS recommendations, where possible.
  • Remapping of gene fusions on the updated transcripts on both the GRCh37 and GRCh38 assemblies, along with the genomic coordinates for the breakpoint positions.
  • Reduced redundancy of mutations. Duplicate variants have been merged into one representative variant.

Key points for you

COSMIC variants have been annotated on all relevant Ensembl transcripts across both the GRCh37 and GRCh38 assemblies from Ensembl release 93. New genomic identifiers (e.g. COSV56056643) are used, which refer to the variant change at the genomic level rather than the gene, transcript or protein level and can thus be used universally. Existing COSM IDs will continue to be supported and will now be referred to as legacy identifiers, e.g. COSM476. The legacy identifiers (COSM) are still searchable. In the case of mutations without genomic coordinates, hence without a COSV identifier, COSM identifiers will continue to be used.
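Pipelines that store COSMIC identifiers may need to tell the new genomic identifiers apart from the legacy ones. A minimal sketch, assuming only the two prefixes shown in the examples above (COSV and COSM) followed by digits; the function name and the returned labels are illustrative:

```python
import re

# Prefix followed by digits, as in the examples COSV56056643 and COSM476.
_ID_PATTERN = re.compile(r"^(COSV|COSM)(\d+)$")


def classify_cosmic_id(identifier):
    """Return ('genomic' | 'legacy', numeric part), or None if unrecognised."""
    match = _ID_PATTERN.match(identifier.strip())
    if match is None:
        return None
    kind = "genomic" if match.group(1) == "COSV" else "legacy"
    return kind, int(match.group(2))


genomic = classify_cosmic_id("COSV56056643")
legacy = classify_cosmic_id("COSM476")
```

Anything that does not match either pattern returns None, so unexpected identifier formats surface early rather than being silently misfiled.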

All relevant Ensembl transcripts in COSMIC (which have been selected based on Ensembl canonical classification and on the quality of the dataset to include only GENCODE basic transcripts) will now have both accession and version numbers, so that the exact transcript is known, ensuring reproducibility. This also provides transparency and clarity as the data are updated.
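Because transcripts now carry both an accession and a version number, downstream code may want to keep the two parts separate when matching records across releases. A small sketch, assuming the usual "accession.version" form; the transcript identifier used here is a made-up placeholder, not a real COSMIC record:

```python
def split_versioned_accession(transcript_id):
    """Split 'ACCESSION.VERSION' into (accession, integer version)."""
    accession, dot, version = transcript_id.partition(".")
    if not dot or not accession or not version.isdigit():
        raise ValueError(f"expected ACCESSION.VERSION, got {transcript_id!r}")
    return accession, int(version)


# Placeholder identifier for illustration only.
accession, version = split_versioned_accession("ENST00000123456.3")
```

Comparing on the accession alone finds the same transcript across releases, while comparing on both parts confirms that the exact sequence version is unchanged.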

How these changes will be reflected in the download files

As we are now mapping all variants on all relevant Ensembl transcripts, the number of rows in the majority of variant download files has increased significantly. In the download files, additional columns are provided including the legacy identifier (COSM) and the new genomic identifier (COSV). An internal mutation identifier is also provided to uniquely represent each mutation, on a specific transcript, on a given assembly build. The accession and version number for each transcript are included. File descriptions for each of the download files will be available from the downloads page for clarity. We have included an example of the new columns below.

For example: COSMIC Complete Mutation Data (Targeted screens)

    1. [17:Q] Mutation Id – An internal mutation identifier to uniquely represent each mutation on a specific transcript on a given assembly build.
    2. [18:R] Genomic Mutation Id – Genomic mutation identifier (COSV) to indicate the definitive position of the variant on the genome. This identifier is trackable and stable between different versions of the release.
    3. [19:S] Legacy Mutation Id – Legacy mutation identifier (COSM) that will represent existing COSM mutation identifiers.
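The three identifier columns listed above can be pulled out of a download file by position. The sketch below assumes a tab-delimited file with a single header row and relies only on the column positions given in the announcement (17, 18 and 19). The dictionary keys and the sample row are invented for illustration, and the real file description should be checked before relying on the positions:

```python
import csv
import io

# 1-based column positions as given in the announcement ([17:Q], [18:R], [19:S]).
COLUMNS = {"mutation_id": 17, "genomic_mutation_id": 18, "legacy_mutation_id": 19}


def extract_identifiers(handle, has_header=True):
    """Yield one dict of identifier columns per data row of a tab-delimited file."""
    reader = csv.reader(handle, delimiter="\t")
    if has_header:
        next(reader, None)  # skip the header row
    for row in reader:
        if len(row) < max(COLUMNS.values()):
            continue  # skip short or malformed rows
        yield {name: row[pos - 1] for name, pos in COLUMNS.items()}


# Synthetic two-line file: a header plus one row with made-up filler values.
header = "\t".join(f"col{i}" for i in range(1, 20))
row = "\t".join(["x"] * 16 + ["12345", "COSV56056643", "COSM476"])
records = list(extract_identifiers(io.StringIO(header + "\n" + row + "\n")))
```

Reading by position is brittle if columns are added in a later release, so matching on header names may be the safer choice once the published file descriptions are available.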

We will shortly have some sample data that can be downloaded in the new table structure, to give you real data to manipulate and integrate. This will be available on the variant updates page.

How this affects you

We are aware that many of the changes we are making will affect integration into your pipelines and analytical platforms. By giving you advance notice of the changes, we hope much of this can be mitigated, and the end result of having clean, standardised data will be well worth any disruption. The variant updates page on the COSMIC website will provide a central point for this information and further technical details of the changes that we are making to COSMIC.

Kind Regards,
The COSMIC Team
Wellcome Sanger Institute
Wellcome Genome Campus,
Hinxton CB10 1SA