Posts Tagged ‘activation of transcription factors’

Systems Biology analysis of Transcription Networks, Artificial Intelligence, and High-End Computing Coming to Fruition in Personalized Oncology

Curator: Stephen J. Williams, Ph.D.

In the June 2020 issue of the journal Science, writer Roxanne Khamsi has an interesting article, “Computing Cancer’s Weak Spots; An algorithm to unmask tumors’ molecular linchpins is tested in patients”[1], describing some early successes in combining cancer genome sequencing with artificial intelligence algorithms to guide personalized clinical treatment decisions for various tumor types.  In 2016, oncologist Amy Tiersten collaborated with systems biologist Andrea Califano and cell biologist Jose Silva at Mount Sinai Hospital to develop a systems biology approach, which determined that the drug ruxolitinib, a JAK1/2 inhibitor that blocks STAT3 signaling, would be effective against one of her patients’ aggressively recurring, Herceptin-resistant breast tumor.  Dr. Califano, instead of defining networks of driver mutations, focused on identifying a few transcription factors that act as ‘linchpins’ or master controllers of transcriptional networks within tumor cells, hoping in doing so to, in essence, ‘bottleneck’ the transcriptional machinery of potential oncogenic products. As Dr. Califano states:

“targeting those master regulators and you will stop cancer in its tracks, no matter what mutation initially caused it.”

It is important to note that this approach also relies on the ability to profile tumors by RNA-seq in order to determine which master regulators are pertinent in any one tumor, since the underlying mutations alter which regulators are active.  And given the wide heterogeneity within tumor samples, this sequencing effort may have to involve multiple biopsies (as discussed in earlier posts on tumor heterogeneity in renal cancer).

As stated in the article, Califano co-founded a company called DarwinHealth in 2015 to guide doctors by identifying the key transcription factors in a patient’s tumor and suggesting personalized therapeutics against those identified molecular targets (OncoTarget™).  He had collaborated with the Jackson Laboratory and most recently Columbia University to conduct a $15 million, 3,000-patient clinical trial.  This was a bit of a stretch from his initial training as a physicist; in 1986, IBM hired him for some artificial intelligence projects.  He then landed at Columbia in 2003 and has been working on identifying these transcriptional nodes that govern cancer survival and tumorigenicity.  Dr. Califano had figured that the number of genetic mutations that could potentially be drivers was too vast:

A 2018 study which analyzed more than 9000 tumor samples reported over 1.5 million mutations[2]

and impossible to develop therapeutics against.  He reasoned that one would instead have to identify the common connections between these pathways, or transcriptional nodes, which he termed master regulators.

A Pan-Cancer Analysis of Enhancer Expression in Nearly 9000 Patient Samples

Chen H, Li C, Peng X, et al. Cell. 2018;173(2):386-399.e12.


The role of enhancers, a key class of non-coding regulatory DNA elements, in cancer development has increasingly been appreciated. Here, we present the detection and characterization of a large number of expressed enhancers in a genome-wide analysis of 8928 tumor samples across 33 cancer types using TCGA RNA-seq data. Compared with matched normal tissues, global enhancer activation was observed in most cancers. Across cancer types, global enhancer activity was positively associated with aneuploidy, but not mutation load, suggesting a hypothesis centered on “chromatin-state” to explain their interplay. Integrating eQTL, mRNA co-expression, and Hi-C data analysis, we developed a computational method to infer causal enhancer-gene interactions, revealing enhancers of clinically actionable genes. Having identified an enhancer ∼140 kb downstream of PD-L1, a major immunotherapy target, we validated it experimentally. This study provides a systematic view of enhancer activity in diverse tumor contexts and suggests the clinical implications of enhancers.


A diagram of how concentrating on these transcriptional linchpins or nodes may be more therapeutically advantageous as only one pharmacologic agent is needed versus multiple agents to inhibit the various upstream pathways:



From: Khamsi R: Computing cancer’s weak spots. Science 2020, 368(6496):1174-1177.


VIPER Algorithm (Virtual Inference of Protein activity by Enriched Regulon Analysis)

The algorithm that Califano and DarwinHealth developed is a systems biology approach that uses a tumor’s RNA-Seq data to determine controlling nodes of transcription.  They have recently used the VIPER algorithm to look at RNA-Seq data from more than 10,000 tumor samples from TCGA and identified 407 transcription factor genes that acted as these linchpins across all tumor types.  Only 20 to 25 of them were implicated in just one tumor type, so these potential nodes are common to many forms of cancer.

Other institutions like the Cold Spring Harbor Laboratory have been using VIPER in their patient tumor analysis.  Linchpins for other tumor types have been found.  For instance, VIPER identified the transcription factors IKZF1 and IKZF3 as linchpins in multiple myeloma.  But currently approved therapeutics are hard to come by for targets that are transcription factors, as most of pharma has concentrated on inhibiting easier targets like kinases and their associated activity.  In general, developing transcription factor inhibitors is a more difficult undertaking, for multiple reasons.

Network-based inference of protein activity helps functionalize the genetic landscape of cancer. Alvarez MJ, Shen Y, Giorgi FM, Lachmann A, Ding BB, Ye BH, Califano A. Nature genetics 2016, 48(8):838-847 [3]


Identifying the multiple dysregulated oncoproteins that contribute to tumorigenesis in a given patient is crucial for developing personalized treatment plans. However, accurate inference of aberrant protein activity in biological samples is still challenging as genetic alterations are only partially predictive and direct measurements of protein activity are generally not feasible. To address this problem we introduce and experimentally validate a new algorithm, VIPER (Virtual Inference of Protein-activity by Enriched Regulon analysis), for the accurate assessment of protein activity from gene expression data. We use VIPER to evaluate the functional relevance of genetic alterations in regulatory proteins across all TCGA samples. In addition to accurately inferring aberrant protein activity induced by established mutations, we also identify a significant fraction of tumors with aberrant activity of druggable oncoproteins—despite a lack of mutations, and vice-versa. In vitro assays confirmed that VIPER-inferred protein activity outperforms mutational analysis in predicting sensitivity to targeted inhibitors.





Figure 1 

Schematic overview of the VIPER algorithm. From: Alvarez MJ, Shen Y, Giorgi FM, Lachmann A, Ding BB, Ye BH, Califano A: Functional characterization of somatic mutations in cancer using network-based inference of protein activity. Nature genetics 2016, 48(8):838-847.

(a) Molecular layers profiled by different technologies. Transcriptomics measures steady-state mRNA levels; Proteomics quantifies protein levels, including some defined post-translational isoforms; VIPER infers protein activity based on the protein’s regulon, reflecting the abundance of the active protein isoform, including post-translational modifications, proper subcellular localization and interaction with co-factors. (b) Representation of VIPER workflow. A regulatory model is generated from ARACNe-inferred context-specific interactome and Mode of Regulation computed from the correlation between regulator and target genes. Single-sample gene expression signatures are computed from genome-wide expression data, and transformed into regulatory protein activity profiles by the aREA algorithm. (c) Three possible scenarios for the aREA analysis, including increased, decreased or no change in protein activity. The gene expression signature and its absolute value (|GES|) are indicated by color scale bars, induced and repressed target genes according to the regulatory model are indicated by blue and red vertical lines. (d) Pleiotropy Correction is performed by evaluating whether the enrichment of a given regulon (R4) is driven by genes co-regulated by a second regulator (R4∩R1). (e) Benchmark results for VIPER analysis based on multiple-samples gene expression signatures (msVIPER) and single-sample gene expression signatures (VIPER). Boxplots show the accuracy (relative rank for the silenced protein), and the specificity (fraction of proteins inferred as differentially active at p < 0.05) for the 6 benchmark experiments (see Table 2). Different colors indicate different implementations of the aREA algorithm, including 2-tail (2T) and 3-tail (3T), Interaction Confidence (IC) and Pleiotropy Correction (PC).
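To make the workflow in panel (b) more concrete, the following is a minimal Python sketch of the general idea of scoring a regulator by the behavior of its targets in a single-sample gene expression signature. It is not the published aREA or VIPER implementation: it simply rank-transforms the signature and averages the target scores, signed by an assumed mode of regulation (+1 for activated targets, -1 for repressed targets). The gene names, signature values, and the functions `rank_scores` and `regulon_activity` are hypothetical and exist only for illustration; confidence weighting and pleiotropy correction are omitted.

```python
def rank_scores(signature):
    """Map each gene to a rank-based score in (-1, 1).

    The most strongly repressed gene gets a score near -1 and the most
    strongly induced gene a score near +1.
    """
    genes = sorted(signature, key=signature.get)
    n = len(genes)
    return {g: 2 * (i + 1) / (n + 1) - 1 for i, g in enumerate(genes)}


def regulon_activity(signature, regulon):
    """Average rank score of a regulator's targets, signed by mode of regulation.

    `regulon` maps target gene -> +1 (activated) or -1 (repressed).
    Targets missing from the signature are ignored.
    """
    scores = rank_scores(signature)
    targets = [(g, mor) for g, mor in regulon.items() if g in scores]
    if not targets:
        return 0.0
    return sum(mor * scores[g] for g, mor in targets) / len(targets)


# Hypothetical single-sample signature (e.g. z-scores versus a reference)
signature = {"G1": 2.0, "G2": 1.5, "G3": -1.0, "G4": -2.0, "G5": 0.1}
# Hypothetical regulon: G1 and G2 are activated, G4 is repressed
regulon = {"G1": +1, "G2": +1, "G4": -1}

activity = regulon_activity(signature, regulon)
print(f"inferred activity: {activity:.3f}")
```

A positive score here means the activated targets sit high in the signature and the repressed targets sit low, which is the pattern expected when the regulator itself is more active, whatever its own mRNA level or mutation status.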

Other articles from Andrea Califano on the VIPER algorithm in cancer include:

Resistance to neoadjuvant chemotherapy in triple-negative breast cancer mediated by a reversible drug-tolerant state.

Echeverria GV, Ge Z, Seth S, Zhang X, Jeter-Jones S, Zhou X, Cai S, Tu Y, McCoy A, Peoples M, Sun Y, Qiu H, Chang Q, Bristow C, Carugo A, Shao J, Ma X, Harris A, Mundi P, Lau R, Ramamoorthy V, Wu Y, Alvarez MJ, Califano A, Moulder SL, Symmans WF, Marszalek JR, Heffernan TP, Chang JT, Piwnica-Worms H. Sci Transl Med. 2019 Apr 17;11(488):eaav0936. doi: 10.1126/scitranslmed.aav0936. PMID: 30996079

An Integrated Systems Biology Approach Identifies TRIM25 as a Key Determinant of Breast Cancer Metastasis.

Walsh LA, Alvarez MJ, Sabio EY, Reyngold M, Makarov V, Mukherjee S, Lee KW, Desrichard A, Turcan Ş, Dalin MG, Rajasekhar VK, Chen S, Vahdat LT, Califano A, Chan TA. Cell Rep. 2017 Aug 15;20(7):1623-1640. doi: 10.1016/j.celrep.2017.07.052. PMID: 28813674

Inhibition of the autocrine IL-6-JAK2-STAT3-calprotectin axis as targeted therapy for HR-/HER2+ breast cancers.

Rodriguez-Barrueco R, Yu J, Saucedo-Cuevas LP, Olivan M, Llobet-Navas D, Putcha P, Castro V, Murga-Penas EM, Collazo-Lorduy A, Castillo-Martin M, Alvarez M, Cordon-Cardo C, Kalinsky K, Maurer M, Califano A, Silva JM. Genes Dev. 2015 Aug 1;29(15):1631-48. doi: 10.1101/gad.262642.115. Epub 2015 Jul 30. PMID: 26227964

Master regulators used as breast cancer metastasis classifier.

Lim WK, Lyashenko E, Califano A. Pac Symp Biocomput. 2009:504-15. PMID: 19209726


Additional References


  1. Khamsi R: Computing cancer’s weak spots. Science 2020, 368(6496):1174-1177.
  2. Chen H, Li C, Peng X, Zhou Z, Weinstein JN, Liang H: A Pan-Cancer Analysis of Enhancer Expression in Nearly 9000 Patient Samples. Cell 2018, 173(2):386-399.e12.
  3. Alvarez MJ, Shen Y, Giorgi FM, Lachmann A, Ding BB, Ye BH, Califano A: Functional characterization of somatic mutations in cancer using network-based inference of protein activity. Nature genetics 2016, 48(8):838-847.


Other articles of Note on this Open Access Online Journal Include:

Issues in Personalized Medicine in Cancer: Intratumor Heterogeneity and Branched Evolution Revealed by Multiregion Sequencing



Cells Rhythmically Regulate Their Genes

Larry H. Bernstein, MD, FCAP, Curator





Study led by researchers at Caltech shows that pulsing can allow two proteins to interact with each other in a rhythmic fashion that allows them to control genes.


Even in a calm, unchanging environment, cells are not static. Among other actions, cells activate and then deactivate some types of transcription factors—proteins that control the expression of genes—in a series of unpredictable and intermittent pulses. Since discovering this pulsing phenomenon, scientists have wondered what functions it could provide for cells.

Now, a new study from Caltech researchers shows that pulsing can allow two proteins to interact with each other in a rhythmic fashion that allows them to control genes. Specifically, when the expression of the transcription factors goes in and out of sync, gene expression also goes up and down. These rhythms of activation, the researchers say, may also underlie core processes in the cells of organisms from across the kingdoms of life.

“The way transcription factor pulses sync up with one another in time could play an important role in allowing cells to process information, communicate with other cells, and respond to stress,” says paper coauthor Michael Elowitz, a professor of biology and biological engineering and an investigator with the Howard Hughes Medical Institute.





The research was led by Caltech postdoctoral scholar Yihan Lin. Other Caltech authors of the paper are Assistant Professor of Chemistry Long Cai; Chang Ho Sohn, a staff scientist in the Cai lab; and Elowitz’s former graduate student Chiraj K. Dalal (PhD ’10), now at UC San Francisco.

Cai, Dalal, and Elowitz reported a functional role for transcription factor pulsing in 2008. Since then, researchers worldwide have been steadily uncovering similar surges of protein activity across diverse cell types and genetic systems.

Realizing that many different factors are pulsing in the same cell even in unchanging conditions, the Caltech scientists began to wonder if cells might adjust the relative timing of these pulses to enable a novel sort of time-based regulation. To find out, they set up time-lapse movies to follow two pulsing proteins and a target gene in real time in individual yeast cells.

The team tagged two central transcription factors named Msn2 and Mig1 with green and red fluorescent proteins, respectively. When the transcription factors are activated, they move into the nucleus, where they influence gene expression. This movement—as well as the activation of the factors—can be visualized because the fluorescent markers concentrate within the small volume of the nucleus, causing it to glow brightly, either green, red, or both. The color choice for the fluorescent tags was symbolic: Msn2 serves as an activator, and Mig1 as a repressor. “Msn2, the green factor, steps on the gas and turns up gene expression, while Mig1, the red factor, hits the brakes,” says Elowitz.

When the scientists stressed the yeast cells by adding heat, for example, or restricting food, the pulses of Msn2 and Mig1 changed their timing with respect to one another, with more or less frequent periods of overlap between their pulses, depending upon the stressing stimulus.

Generally, when the two transcription factors pulsed in synchrony, the repressor blocked the ability of the activator to turn on genes. “It’s like someone simultaneously pumping the gas and brake pedals in a car over and over again,” says Elowitz.

But when they were off-beat, with the activator pulsing without the repressor, gene expression increased. “When the cell alternates between the brake and the gas—the Msn2 transcription factor in this case—the car can move,” says Elowitz. As a result of these stress-altered rhythms, the cells successfully produced more (or fewer) copies of certain proteins that helped the yeast cope with the unpleasant situation.

Previously, researchers have thought that the relative concentrations of multiple transcription factors in the nucleus determine how they regulate a common gene target—a phenomenon known as combinatorial regulation. But the new study suggests that the relative timing of the pulses of transcription factors may be just as important as their concentration.
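The distinction between concentration and relative timing can be illustrated with a toy calculation. The Python sketch below is only an illustration under strong assumptions, not the model used in the study: pulses are treated as regular on/off square waves (the real pulses are stochastic), and the target gene is assumed to be expressed only while the activator is on and the repressor is off. The functions `pulse_train` and `active_fraction` and all parameter values are hypothetical.

```python
def pulse_train(n_steps, period, width, offset=0):
    """Binary pulse train: on for `width` steps of every `period`, shifted by `offset`."""
    return [1 if (t - offset) % period < width else 0 for t in range(n_steps)]


def active_fraction(activator, repressor):
    """Fraction of time steps where the activator is on and the repressor is off."""
    on = sum(1 for a, r in zip(activator, repressor) if a and not r)
    return on / len(activator)


n = 120
msn2 = pulse_train(n, period=12, width=4)                  # activator
mig1_sync = pulse_train(n, period=12, width=4, offset=0)   # repressor, in sync
mig1_shift = pulse_train(n, period=12, width=4, offset=6)  # repressor, off-beat

# Both repressor trains spend the same total time in the nucleus,
# so their time-averaged "concentration" is identical ...
print(sum(mig1_sync) / n, sum(mig1_shift) / n)

# ... yet the window in which the target gene can be expressed differs.
print("in sync: ", active_fraction(msn2, mig1_sync))
print("off-beat:", active_fraction(msn2, mig1_shift))
```

In this toy setting the time-averaged nuclear occupancy of the repressor is the same in both cases, so a model based on relative concentrations alone would predict the same output; only the relative timing of the pulses changes how often the activator acts unopposed.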

“Most genes in the cell are regulated by several transcription factors in a combinatorial fashion, as parts of a complex network,” says Cai. “What we’re now seeing is a new mode of regulation that controls the pulse timing of transcription factors, and this could be critical to understanding the combinatorial regulation in genetic networks.”

“There appears to be a layer of time-based regulation in the cell that, because it can only be observed with movies of individual cells, is still largely unexplored,” says Lin. “We look forward to learning more about this intriguing and underappreciated form of gene regulation.”

In future research, the scientists will try to understand how prevalent this newfound mode of time-based regulation is in a variety of cell types and will examine its involvement in gene regulation systems. In the context of synthetic biology—the harnessing and modification of biological systems for human technological applications—the researchers also hope to develop methods to control such pulsing to program new cellular behaviors.



Combinatorial gene regulation by modulation of relative pulse timing.
Nature. Nov 5, 2015; 527(7576):54-8. doi: 10.1038/nature15710. Epub 2015 Oct 14.
Studies of individual living cells have revealed that many transcription factors activate in dynamic, and often stochastic, pulses within the same cell. However, it has remained unclear whether cells might exploit the dynamic interaction of these pulses to control gene expression. Here, using quantitative single-cell time-lapse imaging of Saccharomyces cerevisiae, we show that the pulsatile transcription factors Msn2 and Mig1 combinatorially regulate their target genes through modulation of their relative pulse timing. The activator Msn2 and repressor Mig1 showed pulsed activation in either a temporally overlapping or non-overlapping manner during their transient response to different inputs, with only the non-overlapping dynamics efficiently activating target gene expression. Similarly, under constant environmental conditions, where Msn2 and Mig1 exhibit sporadic pulsing, glucose concentration modulated the temporal overlap between pulses of the two factors. Together, these results reveal a time-based mode of combinatorial gene regulation. Regulation through relative signal timing is common in engineering and neurobiology, and these results suggest that it could also function broadly within the signalling and regulatory systems of the cell.

Pulsatile Dynamics in the Yeast Proteome
Chiraj K. Dalal, Long Cai, Yihan Lin, Kasra Rahbar, and Michael B. Elowitz
Howard Hughes Medical Institute, Division of Biology and Bioengineering, California Institute of Technology, Pasadena, CA 91125, USA



  • Pulsing is prevalent in the yeast proteome
  • Pulsing is specific to transcription factors
  • Pulsing regulates a large fraction of the genome

The activation of transcription factors in response to environmental conditions is fundamental to cellular regulation. Recent work has revealed that some transcription factors are activated in stochastic pulses of nuclear localization, rather than at a constant level, even in a constant environment [1–12]. In such cases, signals control the mean activity of the transcription factor by modulating the frequency, duration, or amplitude of these pulses. Although specific pulsatile transcription factors have been identified in diverse cell types, it has remained unclear how prevalent pulsing is within the cell, how variable pulsing behaviors are between genes, and whether pulsing is specific to transcriptional regulators or is employed more broadly. To address these issues, we performed a proteome-wide movie-based screen to systematically identify localization-based pulsing behaviors in Saccharomyces cerevisiae. The screen examined all genes in a previously developed fluorescent protein fusion library of 4,159 strains [13] in multiple media conditions. This approach revealed stochastic pulsing in ten proteins, all transcription factors. In each case, pulse dynamics were heterogeneous and unsynchronized among cells in clonal populations. Pulsing is the only dynamic localization behavior that we observed, and it tends to occur in pairs of paralogous and redundant proteins. Taken together, these results suggest that pulsatile dynamics play a pervasive role in yeast and may be similarly prevalent in other eukaryotic species.
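The statement that signals set the mean activity by modulating pulse frequency, duration, or amplitude can be sketched with simple arithmetic. The Python snippet below assumes idealized rectangular, non-overlapping pulses, so that the time-averaged activity is the product of the three quantities; this simplification, the function name `mean_activity`, and the example numbers are assumptions for illustration and do not come from the paper.

```python
def mean_activity(frequency, duration, amplitude):
    """Time-averaged activity of idealized rectangular pulses.

    frequency: pulses per unit time
    duration:  length of each pulse (same time unit)
    amplitude: activity level while a pulse is on
    """
    duty = frequency * duration  # fraction of time spent "on"
    if not 0 <= duty <= 1:
        raise ValueError("pulses would overlap: frequency * duration must be in [0, 1]")
    return duty * amplitude


baseline = mean_activity(frequency=0.1, duration=2.0, amplitude=1.0)
more_frequent = mean_activity(frequency=0.2, duration=2.0, amplitude=1.0)
longer = mean_activity(frequency=0.1, duration=4.0, amplitude=1.0)
taller = mean_activity(frequency=0.1, duration=2.0, amplitude=2.0)

print(baseline, more_frequent, longer, taller)
```

Under these assumptions, doubling any one of the three parameters doubles the mean activity, which is why a cell could in principle reach the same average output through quite different pulse dynamics.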

Since most pulsing proteins are members of a pair of paralogous or functionally redundant transcription factors, one explanation for the evolution of pulsing is one in which pulsing is ancient and existed prior to the whole-genome duplication (estimated to be ~80 million years ago [20]). Since then, pulsing appears to have been lost only in some proteins (Mig3 and Rtg3), and the paralogs that have retained the ability to pulse have changed in their dynamics (Figure 3). Alternatively, paralogs that both pulse could have acquired pulsatile regulation through shared regulatory inputs that later became pulsatile. Further work analyzing whether proteins orthologous to the pulsing transcription factors described here also pulse, specifically in species that diverged prior to the whole-genome duplication, will distinguish between these hypotheses.

Recent work shows that pulsatile regulation occurs in diverse mammalian systems including NF-AT [9], p53 [10], Erk signaling [11], TGF-β signaling [12], and NF-κB [22–24]. Moreover, many bacterial systems, such as persistence in Mycobacterium smegmatis [25] and bacterial competence [26], sporulation [27], and stress response in Bacillus subtilis [28], employ pulsing. The presence of pulsing in so many systems across a wide range of species suggests that pulsing may be a common solution to many biological problems. For example, pulsing has already been shown to proportionally regulate entire regulons of target genes [2, 7], implement transient differentiation [26, 29], enable a multi-cell-cycle timer [27], and promote bet-hedging [25]. Pulsing may provide a time-based mode of regulation that facilitates these and other functions [1].

Figure 3. Pulsing Is Variable. Single-cell traces show that pulses vary from cell to cell (different colors on the same trace), from paralog to paralog (across columns) and from protein to protein (A–L). All traces are from the same movie that generated corresponding filmstrips in Figure 2. All traces have been smoothed. See also Figure S2 and Movie S1.

Taken together, these observations reveal that pulsatility is surprisingly pervasive in cells. It will now be critical to determine its mechanisms and functions and understand how these dynamics are integrated into the core functions of living cells. Although recent work has provided new insights into Msn2 pulsing [3, 4, 7, 8, 30, 31] and other work has provided a mechanism for pulsatile activation of a sigma factor in bacteria [28], we still lack a full understanding of the mechanisms of pulse generation and modulation for any yeast transcription factor. Do different pulsing systems use a common type of mechanism for pulsing, or are there many distinct mechanisms that can generate similar pulse dynamics? Pulsatility appears to be a core regulatory mechanism in yeast and most likely in other cell types as well [9]. The pulsatile proteins identified here should provide a starting point for understanding the roles that this dynamic regulatory mechanism plays in diverse cell types.


The title points the scientific facts in the correct direction. However, for fast regulatory responses (those that keep cells alive), no change in gene expression is required.
