Funding, Deals & Partnerships: BIOLOGICS & MEDICAL DEVICES; BioMed e-Series; Medicine and Life Sciences Scientific Journal – http://PharmaceuticalIntelligence.com
Complex rearrangements and oncogene amplification revealed by long-read DNA and RNA sequencing of a breast cancer cell line, Volume 2 (Volume Two: Latest in Genomics Methodologies for Therapeutics: Gene Editing, NGS and BioInformatics, Simulations and the Genome Ontology), Part 1: Next Generation Sequencing (NGS)
Complex rearrangements and oncogene amplification revealed by long-read DNA and RNA sequencing of a breast cancer cell line
Reporter: Stephen J. Williams, PhD
In a Genome Research report by Marie Nattestad et al. [1], the SK-BR-3 breast cancer cell line was sequenced using a long-read, single-molecule sequencing protocol in order to develop one of the most detailed maps of structural variations in a cancer genome to date. The authors detected over 20,000 variants with this sequencing modality, most of which would have been missed by short-read sequencing. In addition, a complex sequence of nested duplications and translocations occurred surrounding the ERBB2 (HER2) locus, while full-length transcriptomic analysis revealed novel gene fusions within the nested genomic variants. The authors suggest that combining long-read genome and transcriptome sequencing results in more comprehensive coverage of tumor gene variants and “sheds new light on the complex mechanisms involved in cancer genome evolution.”
Genomic instability is a hallmark of cancer [2], which leads to numerous genetic variations such as:
Copy number variations
Chromosomal alterations
Gene fusions
Deletions
Gene duplications
Insertions
Translocations
Efforts such as The Cancer Genome Atlas [3] and the International Cancer Genome Consortium (2010) use short-read sequencing technology to detect and analyze thousands of commonly occurring mutations. However, short-read technology has high false-positive and false-negative rates for detecting less common structural variations (as high as 50% [4]). In addition, short reads cannot detect variations in close proximity to each other or on the same molecule, and therefore underestimate the number of variations.
Methods: The authors used a long-read sequencing technology from Pacific Biosciences (SMRT) to analyze the mutational and structural variation in the SK-BR-3 breast cancer cell line. A split-read and within-read mapping approach was used to detect variants of different types and sizes. In general, long reads have better alignment qualities than short reads, resulting in higher-quality mapping. Transcriptomic analysis was performed using Iso-Seq.
Results: Using the SMRT long-read sequencing technology from Pacific Biosciences, the authors obtained 71.9-fold (71.9×) sequencing coverage of the SK-BR-3 genome, with an average read length of 9.8 kb.
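As a rough back-of-the-envelope check on what this sequencing depth implies, coverage can be approximated as (number of reads × mean read length) / genome size. The sketch below simply inverts that relation, treating the reported depth as 71.9-fold coverage. The 3.1 Gb genome size is an assumed round figure for the human genome, not a value from the paper, and the function name is hypothetical.

```python
def reads_needed(coverage: float, genome_size_bp: float, mean_read_len_bp: float) -> float:
    """Invert coverage = reads * mean_read_length / genome_size."""
    return coverage * genome_size_bp / mean_read_len_bp


# Assumption: approximate human genome size; SK-BR-3 itself is highly aneuploid.
ASSUMED_GENOME_SIZE_BP = 3.1e9

n_reads = reads_needed(coverage=71.9,
                       genome_size_bp=ASSUMED_GENOME_SIZE_BP,
                       mean_read_len_bp=9_800)
print(f"approximate reads required: {n_reads / 1e6:.1f} million")
```

Under these assumptions the estimate lands in the low tens of millions of reads, which gives a sense of scale rather than a figure reported by the authors.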
A few notes:
The most amplified regions included the locus spanning the ERBB2 oncogene (33.6 copies), the MYC locus (38 copies), the EGFR locus (7 copies) and BCAS1 (16.8 copies)
The locus 8q24.12, which contains the SNTB1 gene, had the highest amplification, at 69.2 copies
Long-read sequencing showed more insertions than deletions and suggests an underestimate of the lengths of low complexity regions in the human reference genome
Found 1,493 long read variants, 603 of which were between different chromosomes
Using Iso-Seq in conjunction with the long-read platform, they detected 1,692,379 isoforms (93%) mapping to the reference genome and 53 putative gene fusions (for 39 of which they found genomic evidence)
A table modified from the paper on the gene fusions is given below:
Table 1. Gene fusions with RNA evidence from Iso-Seq and DNA evidence from SMRT DNA sequencing where the genomic path is found using SplitThreader from Sniffles variant calls. Note link in table is GeneCard for each gene.
SplitThreader found two different paths for the RAD51B-SEMA6D gene fusion and for the LINC00536-PVT1 gene fusion. Number of Iso-Seq reads refers to full-length HQ-filtered reads. Alignments of SMRT DNA sequence reads supporting each of these gene fusions are shown in Supplemental Note S2.
References
Nattestad M, Goodwin S, Ng K, Baslan T, Sedlazeck FJ, Rescheneder P, Garvin T, Fang H, Gurtowski J, Hutton E et al: Complex rearrangements and oncogene amplifications revealed by long-read DNA and RNA sequencing of a breast cancer cell line. Genome Research 2018, 28(8):1126-1135.
Kandoth C, McLellan MD, Vandin F, Ye K, Niu B, Lu C, Xie M, Zhang Q, McMichael JF, Wyczalkowski MA et al: Mutational landscape and significance across 12 major cancer types. Nature 2013, 502(7471):333-339.
Sudmant PH, Rausch T, Gardner EJ, Handsaker RE, Abyzov A, Huddleston J, Zhang Y, Ye K, Jun G, Fritz MH et al: An integrated map of structural variation in 2,504 human genomes. Nature 2015, 526(7571):75-81.
Other articles on Cancer Genome Sequencing in this Open Access Journal Include:
Rewriting the Mathematics of Tumor Growth[1]; Teams Use Math Models to Sort Drivers from Passengers[2]: Two JNCI Reviews by Mike Martin Regarding Genomics, Cancer, and Mutation
Curator: Stephen J. Williams, Ph.D.
Word Cloud By Danielle Smolyar
Recently, there has been extensive interest in the cancer research and oncology community in distinguishing the mutations responsible for the initiation and propagation of a neoplastic cell (driver mutations) from those mutations that are acquired randomly (or through selective pressures) because of the genetic instability of the transformed cell (passenger mutations). The impact of either type of mutation has been a topic of debate, with a recent article showing that some passenger mutations may actually be responsible for tumor survival. In addition, many articles highlighted on this site (and referenced below) in recent years have described the importance of classifying driver and passenger mutations for the purposes of more effective personalized medicine strategies directed against tumors. Two review articles by Mike Martin in the Journal of the National Cancer Institute (JNCI) shed light on the current efforts and successes in discriminating between passenger and driver mutations and in determining the impact of each type of mutation on tumor growth. However, as described in the associated article, the picture is not as clear-cut as previously thought, and the reviews highlight some revolutionary findings. In Rewriting the Mathematics of Tumor Growth, researchers discovered that driver mutations may confer such a small growth advantage that multiple mutations, including the so-called passenger mutations, are necessary to sustain tumor growth. In fact, much experimental evidence has suggested that at least six defined genetic events may be necessary for the in vitro transformation of human cells. The following table shows some of the genetic events required for in vitro transformation in cell culture systems.
Species | Cell type | Number of genetic events | Genetic events | Reference | Priming events
 |  |  |  |  | 3 for anchorage independence (cyclin D1, dnp53, EGFR), cyclin D1 + dnp53 for immortalization
 | HOSE | 6 | CDK4, cyclin D, hTERT plus a combination of either P53DD, myrAkt and H-ras or P53DD, H-ras, c-myc, Bcl2 | (f) Sasaki (Kiyono) | 5
 | HOSE | 3 | hTERT, SV40 early, H-ras or K-ras | (g) Liu (Bast) | 2 (hTERT + SV40 early)
 | HOSE | 3 | Large T, hTERT, H-ras or c-erbB-2 | (h) Kusakari (Fujii) | 2 (hTERT + large T)
Rat | Fibroblasts | 2 | Large T, H-ras | (i) Hirakawa | Did not analyze
 | Fibroblasts | 2 | Large T, H-ras | (d) Rangarajan (Weinberg) | Large T
Mouse | MOSE, in p53-/- background | 3 | c-myc, K-ras, Akt | (j) Orsulic |
Pig | Fibroblasts | 6 | p53DD, hTERT, CDK4, H-ras, c-myc, cyclin D1 | (k) Adam (Counter) | 5 (need all but p53DD)
Note: priming means events required to immortalize but not fully transform. * Cells were assessed both for the ability to form colonies in soft agarose and, subsequently, for tumor formation in immunocompromised mice.
a. Hahn, W. C., Counter, C. M., Lundberg, A. S., Beijersbergen, R. L., Brooks, M. W., and Weinberg, R. A. (1999) Creation of human tumour cells with defined genetic elements, Nature 400, 464-468.
b. Kendall, S. D., Linardic, C. M., Adam, S. J., and Counter, C. M. (2005) A network of genetic events sufficient to convert normal human cells to a tumorigenic state, Cancer Res 65, 9824-9828.
c. Sun, B., Chen, M., Hawks, C. L., Pereira-Smith, O. M., and Hornsby, P. J. (2005) The minimal set of genetic alterations required for conversion of primary human fibroblasts to cancer cells in the subrenal capsule assay, Neoplasia 7, 585-593.
d. Rangarajan, A., Hong, S. J., Gifford, A., and Weinberg, R. A. (2004) Species- and cell type-specific requirements for cellular transformation, Cancer Cell 6, 171-183.
e. Goessel, G., Quante, M., Hahn, W. C., Harada, H., Heeg, S., Suliman, Y., Doebele, M., von Werder, A., Fulda, C., Nakagawa, H., Rustgi, A. K., Blum, H. E., and Opitz, O. G. (2005) Creating oral squamous cancer cells: a cellular model of oral-esophageal carcinogenesis, Proc Natl Acad Sci U S A 102, 15599-15604.
f. Sasaki, R., Narisawa-Saito, M., Yugawa, T., Fujita, M., Tashiro, H., Katabuchi, H., and Kiyono, T. (2009) Oncogenic transformation of human ovarian surface epithelial cells with defined cellular oncogenes, Carcinogenesis 30, 423-431.
g. Liu, J., Yang, G., Thompson-Lanza, J. A., Glassman, A., Hayes, K., Patterson, A., Marquez, R. T., Auersperg, N., Yu, Y., Hahn, W. C., Mills, G. B., and Bast, R. C., Jr. (2004) A genetically defined model for human ovarian cancer, Cancer Res 64, 1655-1663.
h. Kusakari, T., Kariya, M., Mandai, M., Tsuruta, Y., Hamid, A. A., Fukuhara, K., Nanbu, K., Takakura, K., and Fujii, S. (2003) C-erbB-2 or mutant Ha-ras induced malignant transformation of immortalized human ovarian surface epithelial cells in vitro, Br J Cancer 89, 2293-2298.
i. Hirakawa, T., and Ruley, H. E. (1988) Rescue of cells from ras oncogene-induced growth arrest by a second, complementing, oncogene, Proc Natl Acad Sci U S A 85, 1519-1523.
j. Orsulic, S., Li, Y., Soslow, R. A., Vitale-Cross, L. A., Gutkind, J. S., and Varmus, H. E. (2002) Induction of ovarian cancer by defined multiple genetic changes in a mouse model system, Cancer Cell 1, 53-62.
k. Adam, S. J., Rund, L. A., Kuzmuk, K. N., Zachary, J. F., Schook, L. B., and Counter, C. M. (2007) Genetic induction of tumorigenesis in swine, Oncogene 26, 1038-1045.
However, it may be argued that the aforementioned experimental examples were produced in cell lines with more stable genomes than those seen in most tumors, and that they used traditional assays of transformation, such as growth in soft agarose and tumorigenicity in immunocompromised mice, as endpoints, which may not be representative of the tumor growth seen in the clinical setting.
Therefore, Bert Vogelstein, M.D., along with collaborators around the world, developed a model they termed the “sequential driver mutation theory”, which holds that driver mutations multiply over time, with each mutation “slightly increasing the tumor growth rate through a process that depends on three factors”:
Driver mutation rate
The 0.4% selective growth advantage
Cell division time
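To get a feel for how these three factors interact, the sketch below computes how long a clone would take to double under a deliberately simplified reading of the model. It is not the published model: it assumes that each driver adds the same 0.4% to the net growth rate per division (an additive assumption), that growth is purely exponential, and that cells divide every 3 days (a hypothetical value chosen only for illustration). The function name is likewise hypothetical.

```python
import math


def doubling_time_days(n_drivers: int,
                       s: float = 0.004,
                       division_time_days: float = 3.0) -> float:
    """Days for a clone to double if its net growth rate per division is n_drivers * s.

    Assumptions (not from the source): additive driver advantage, purely
    exponential growth, and a fixed, hypothetical cell division time.
    """
    if n_drivers < 1:
        raise ValueError("need at least one driver mutation for net growth")
    net_rate_per_division = n_drivers * s
    divisions_to_double = math.log(2) / net_rate_per_division
    return divisions_to_double * division_time_days


for k in (1, 2, 4):
    print(f"{k} driver(s): about {doubling_time_days(k):.0f} days to double")
```

With a single driver the clone needs roughly 173 divisions to double, so under these assumptions growth is very slow until further drivers accumulate, which is the intuition behind the sequential view.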
This model was based on a combination of experimental data and computer simulations of glioblastoma multiforme and pancreatic adenocarcinoma. Most tumor models follow Gompertz kinetics, in which tumor growth is initially exponential but eventually levels off over time.
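For readers unfamiliar with Gompertz kinetics, the following sketch contrasts a Gompertz curve with plain exponential growth that starts at the same initial rate. The standard Gompertz form N(t) = K · exp(ln(N0/K) · exp(−a·t)) is used; all parameter values (initial size, plateau size, rate constant) are hypothetical and chosen only to make the levelling-off visible.

```python
import math


def gompertz(t: float, n0: float, plateau: float, a: float) -> float:
    """Gompertz growth: starts at n0 and saturates at `plateau`."""
    return plateau * math.exp(math.log(n0 / plateau) * math.exp(-a * t))


def exponential(t: float, n0: float, r: float) -> float:
    """Unbounded exponential growth at constant rate r."""
    return n0 * math.exp(r * t)


# Hypothetical parameters, for illustration only.
N0, PLATEAU, A = 1e3, 1e9, 0.01
# Match the initial specific growth rate of the Gompertz curve: a * ln(K / N0).
R0 = A * math.log(PLATEAU / N0)

for day in (0, 50, 100, 200, 400, 800):
    print(f"day {day:4d}: gompertz={gompertz(day, N0, PLATEAU, A):.3e} "
          f"exponential={exponential(day, N0, R0):.3e}")
```

The two curves agree at early times, after which the Gompertz curve flattens toward its plateau while the exponential curve keeps climbing.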
This new theory shows, though, that a tumor cell with only one driver mutation can grow only so much before a second driver mutation is required. Using data from the COSMIC database (Catalog of Somatic Mutations in Cancer) together with the analysis software CHASM (Cancer-specific High-throughput Annotation of Somatic Mutations), the researchers analyzed 713 mutations sequenced from 14 glioma patients and 562 mutations in nine pancreatic adenocarcinomas, revealing at least 100 tumor suppressor genes and 100 oncogenes altered. The authors therefore suggested that these may be driver mutations, or at least mutations required for the sustained growth of these tumors. Applied to data from Dr. Giardiello’s publications on familial adenomatous polyposis (FAP) in the New England Journal of Medicine in 1993 and 2000, the sequential driver mutation model predicted the age distribution of FAP patients, the number and size of polyps, and the polyp growth rate better than previous models. This surprising number of required driver mutations for full transformation was also verified in a study led by University of Texas Southwestern Medical Center biologist Jerry Shay, Ph.D., who noted that, to his team’s surprise, nearly 45% of all colorectal candidate oncogenes (65 mutations) drove malignant proliferation [3].
However, some investigators do not believe the model is complex enough to account for other factors involved in oncogenesis, such as epigenetic factors like methylation and acetylation. In addition, the review discusses host and tissue factors that may complicate the models, such as the location where a tumor develops. Nevertheless, most of the investigators interviewed for this review agreed that focusing on the long-term progression of the disease may give us clues to other potential druggable targets.
Teams Use Math Models to Sort Drivers From Passengers
A related review from Mike Martin in JNCI [2] describes a statistical method, published in 2009 in Cancer Informatics [4], which distinguishes chromosomal abnormalities that can drive oncogenesis from passenger abnormalities. Chromosomal abnormalities, such as deletions, additions, and translocations, are common in cancer. For instance, the well-known Philadelphia chromosome, a translocation between chromosomes 9 and 22 that results in the BCR-ABL tyrosine kinase fusion protein, is the molecular basis of chronic myelogenous leukemia.
In the report, Eytan Domany, Ph.D., from the Weizmann Institute and several colleagues from the University of Lausanne, the University of Haifa and the Broad Institute analyzed chromosomal aberrations in a subset of medulloblastomas, which had more chromosomal gains and losses than had previously been attributed to the disease. Using a statistical method they termed a “volumetric sieve”, the investigators were able to distinguish driver from passenger aberrations based on three filters:
Fraction of patients with the abnormality
Length of DNA involved in the aberrant chromosome
Abnormality’s copy number
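The three criteria above can be pictured as successive filters applied to each aberration. The sketch below is only a toy illustration of that idea and does not reproduce the published statistics: the class, field names, threshold values, and example records are all hypothetical, and the direction of each cut (frequent, focal, high-amplitude events kept as candidate drivers) is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Aberration:
    name: str
    patient_fraction: float    # fraction of patients carrying the abnormality (0-1)
    length_mb: float           # length of DNA involved, in megabases
    copy_number_change: float  # absolute deviation from the normal two copies


def passes_filters(ab: Aberration,
                   min_fraction: float = 0.2,
                   max_length_mb: float = 10.0,
                   min_copy_change: float = 1.0) -> bool:
    """Keep aberrations that are frequent, focal and high-amplitude (assumed cut directions)."""
    return (ab.patient_fraction >= min_fraction
            and ab.length_mb <= max_length_mb
            and ab.copy_number_change >= min_copy_change)


# Made-up records, for illustration only.
candidates = [
    Aberration("focal_gain_A", patient_fraction=0.35, length_mb=2.5, copy_number_change=3.0),
    Aberration("arm_loss_B", patient_fraction=0.40, length_mb=60.0, copy_number_change=1.0),
    Aberration("rare_gain_C", patient_fraction=0.05, length_mb=1.0, copy_number_change=4.0),
]

kept = [ab.name for ab in candidates if passes_filters(ab)]
print(kept)
```

In this toy data set only the frequent, focal, high-amplitude event survives all three filters; the broad arm-level loss and the rare gain are each removed by one criterion.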
Another method to sort the most “important” chromosomal aberrations from less relevant alterations is termed GISTIC [5], which the Broad Institute website (http://www.broadinstitute.org/software/cprg/?q=node/31) describes as a tool to identify genes targeted by somatic copy-number alterations (SCNAs) that drive cancer growth. The method allows comparison across multiple tumors, so that noise is eliminated and the consistency of analysis is improved. It has been used successfully to determine driver aberrations in mesotheliomas and leukemias, and to identify new oncogenes in adenocarcinomas of the lung and squamous cell carcinoma of the esophagus.
Main references for the two Mike Martin articles are as follows:
3. Eskiocak U, Kim SB, Ly P, Roig AI, Biglione S, Komurov K, Cornelius C, Wright WE, White MA, Shay JW: Functional parsing of driver mutations in the colorectal cancer genome reveals numerous suppressors of anchorage-independent growth. Cancer research 2011, 71(13):4359-4365.
4. Shay T, Lambiv WL, Reiner-Benaim A, Hegi ME, Domany E: Combining chromosomal arm status and significantly aberrant genomic locations reveals new cancer subtypes. Cancer informatics 2009, 7:91-104.
Breast cancer is the second most common cancer worldwide after lung cancer, the fifth most common cause of cancer death, and the leading cause of cancer death in women. The global burden of breast cancer exceeds that of all other cancers, and the incidence rates of breast cancer are increasing (1,2).
The heterogeneity of breast cancers makes them both a fascinating and challenging solid tumor to diagnose and treat. Here is a great review of the molecular pathology of breast cancer progression (3).
“The molecular pathology of breast cancer progression” by Alessandro Bombonati and Dennis C Sgroi.
Breast cancer is the most frequent carcinoma in females and the second most common cause of cancer-related mortality in women, with approximately 54 000 and 207 000 new cases of in situ and invasive breast carcinoma, respectively. Overall, breast cancer incidence rates have levelled off since 1990, with a decrease of 3.5%/year from 2001 to 2004. Most notably, during this same time period, breast cancer mortality rates have declined 24%, with the largest impact among young women and women with estrogen receptor (ER)-positive disease.
The decline in breast cancer mortality has been attributed to the combination of early detection with screening programmes and the advent of more efficacious adjuvant progression have aided in the discovery of novel pathway-specific targeted therapeutics, and the emergence of such effective therapeutics is currently driving the need for molecular-based, ‘patient-tailored’ treatment planning.
Proposed models of human breast cancer progression
Epidemiological and morphological observations led to the formulation of several linear models of breast cancer initiation, transformation and progression. Figure 1
The ductal and lobular subtypes constitute the majority of all breast cancers worldwide, with the ductal subtype accounting for 40–75% of all diagnosed cases.
The classic model of breast cancer progression of the ductal type proposes that neoplastic evolution initiates in normal epithelium (normal), progresses to flat epithelial atypia (FEA), advances to atypical ductal hyperplasia (ADH), evolves to ductal carcinoma in situ (DCIS) and culminates as invasive ductal carcinoma (IDC).
The model of lobular neoplasia proposes a multi-step progression from normal epithelium to atypical lobular hyperplasia, lobular carcinoma in situ (LCIS) and invasive lobular carcinoma (ILC).
The cell of origin of breast cancer: the clonal and stem cell hypotheses
The two leading models accounting for breast carcinogenesis are the sporadic clonal evolution model and the cancer stem cell (cSC) model. According to the sporadic clonal evolution hypothesis, any breast epithelial cell can be the target of random mutations. The cells with advantageous genetic and epigenetic alterations are selected over time to contribute to tumour progression. The alternative cSC model postulates that only stem and progenitor cells (representing a small fraction of the tumor cells within the cancer) can initiate and maintain tumor progression. Figure 2.
Normal breast stem cells (nBSCs) are long-lived, tissue-resident cells capable of self-renewal activity and multi-lineage differentiation that can recapitulate the breast tubulolobular architecture that is composed of luminal and myoepithelial cells.
As normal breast stem cells are long-term tissue residents, it has been proposed that such cells are candidates for accumulating genetic and epigenetic modifications. It has been further proposed that such molecular alterations result in deregulation of normal self-renewal, leading to the development of a cancer stem cell (cSC).
It is believed that the cSC undergoes asymmetrical division, maintaining the stem cell population while at the same time differentiating into committed progenitor cells that give rise to the different breast cancer subtypes.
A second scenario, as it relates to breast cancer development, is one in which the cancer-initiating cells are derived from committed progenitor cells that spawn different breast cancer subtypes. Both scenarios are highly supported.
Molecular analysis of the different stages of breast cancer progression
Genomic and transcriptomic data in combination with morphological and immunohistochemical data stratify the majority of breast cancers into a “low-grade-like” molecular pathway and a “high-grade-like” molecular pathway. Figure 3. The low-grade-like pathway (left hand side) is characterized by recurrent chromosomal loss of 16q, gains of 1q, a low-grade-like gene expression signature, and the expression of estrogen and progesterone receptors (ER+ and PR+). The progression (vertical arrows) along this pathway (green rectangles) culminates with the formation of low and intermediate grade invasive ductal carcinomas (LG IDC and IG IDC) and invasive lobular carcinomas, including both the classic (ILC) and the pleomorphic variant (pILC). The tumors arising from the low-grade pathway are classified as luminal, consisting of a continuum of gene expression frequently associated with the absence (luminal A) or presence (luminal B) of HER2 expression. The vast majority of ILCs and pILCs and their precursors cluster together within the luminal subtype. The high-grade-like molecular pathway (right hand side) is characterized by recurrent gain of 11q13 (+11q13), loss of 13q (13q−), expression of a high-grade-like gene expression signature, amplification of 17q12 (17q12AMP), and lack of estrogen and progesterone receptor expression (ER− and PR−). The progression along this pathway (red rectangles) includes intermediate and high grade ductal carcinomas that are stratified as HER2 or basal-like, depending on the expression/amplification of HER2. The molecular apocrine subtype, characterized by the lack of ER expression and presence of AR expression, arises from the high-grade pathway. The model also depicts intra-pathway tumor grade progression (horizontal arrows).
Although the genomic and transcriptomic data presented in this review support the divergent model of breast cancer progression, the clinical experience indicates that tumors within each pathway are still fairly heterogeneous with respect to clinical outcome suggesting that even this advanced molecular progression scheme is oversimplified.
The future application of massively parallel sequencing technologies to the preinvasive stages of breast cancer will assist in assessing intratumoral heterogeneity during the transition from preinvasive to invasive breast cancer, and may assist in identifying early tumor initiating genetic events.
Summary:
Over the past decade the integration of numerous genomic and transcriptomic analyses of the various stages of breast cancer has generated multiple novel insights into the complex process of breast cancer progression.
First, human breast cancer appears to progress along two distinct molecular genetic pathways that strongly associate with tumor grade.
Second, in the epithelial and non-epithelial components of the tumor microenvironment, the greatest molecular alterations (at the gene expression level) occur prior to local invasion.
Third, in the epithelial compartment, no major additional gene expression changes occur between the preinvasive and invasive stages of breast cancer.
Fourth, the non-epithelial compartment of the tumor micromilieu undergoes dramatic epigenetic and gene expression alterations during the transition from preinvasive to invasive disease. Despite these significant advances, we have only begun to scratch the surface of this multifaceted biological process. With the advent of additional novel high-throughput genetic, epigenetic and proteomic technologies, it is anticipated that the next decade of breast cancer research will gain an equally unparalleled appreciation for the complexity of breast cancer progression. It is with great hope that knowledge gained from such studies will provide for more effective strategies to not only treat, but also prevent breast cancer.