Transcription Modulation

Author and Curator: Larry H. Bernstein, MD, FCAP 

 

This portion of the transcription series deals with transcription factors and the effects of their binding on metabolism. This also has implications for pharmaceutical target identification.

The Functional Consequences of Variation in Transcription Factor Binding
DA. Cusanovich, B Pavlovic, JK. Pritchard*, Y Gilad*
1 Department of Human Genetics, 2 Howard Hughes Medical Institute, University of Chicago, Chicago, IL 3 Departments of Genetics and Biology and Howard Hughes Medical Institute, Stanford University, Stanford, CA.
PLoS Genet 2014;10(3):e1004226.  http://dx.doi.org:/10.1371/journal.pgen.1004226

One goal of human genetics is to understand how the information for precise and dynamic gene expression programs is encoded in the genome. The interactions of transcription factors (TFs) with DNA regulatory elements clearly

  • play an important role in determining gene expression outputs, yet
  • the regulatory logic underlying functional transcription factor binding is poorly understood.

An important question in genomics is to understand how a class of proteins called ‘‘transcription factors’’ controls the expression level of other genes in the genome in a cell type-specific manner – a process that is essential to human development. One major approach to this problem is to study where these transcription factors bind in the genome, but this does not tell us about the effect of that binding on gene expression levels and

  • it is generally accepted that much of the binding does not strongly influence gene expression.

To address this issue, we artificially reduced the concentration of 59 different transcription factors in the cell and then

  • examined which genes were impacted by the reduced transcription factor level.

Our results implicate some attributes

  • that might influence what binding is functional, but they also suggest that
  • a simple model of functional vs. non-functional binding may not suffice.

Many studies have focused on characterizing the genomic locations of TF binding, but

  • it is unclear whether TF binding at any specific locus has
  • functional consequences with respect to gene expression output.

We knocked down 59 TFs and chromatin modifiers in one HapMap lymphoblastoid cell line

  • to evaluate the context of functional TF binding.

We then identified genes whose expression was affected by the knockdowns

  • by intersecting the gene expression data with transcription factor binding data
    (based on ChIP-seq and DNase-seq)
  • within 10 kb of the transcription start sites of expressed genes.

This combination of data allowed us to infer functional TF binding.
Only a small subset of genes bound by a factor were

  • differentially expressed following the knockdown of that factor,
  • suggesting that most interactions between TF and chromatin
  • do not result in measurable changes in gene expression levels
  • of putative target genes.

We found that functional TF binding is enriched

  • in regulatory elements that harbor a large number of TF binding sites,
  • at sites with predicted higher binding affinity, and
  • at sites that are enriched in genomic regions annotated as ‘‘active enhancers.’’

We aim to be able to predict the expression pattern of a gene based on its regulatory
sequence alone. However, the regulatory code of the human genome is much more complicated than

  • the triplet code of protein coding sequences, and is highly context-specific,
  • depending on cell-type and other factors.

Moreover, regulatory regions are not necessarily organized into

  • discrete, easily identifiable regions of the genome and
  • may exert their influence on genes over large genomic distances

Genomic studies addressing questions of the regulatory logic of the human genome have largely taken one of two approaches.

  1. collecting transcription factor binding maps using techniques such as ChIP-seq
    and DNase-seq, or
  2. mapping various quantitative trait loci (QTL), such as gene expression levels
    (eQTLs) [7], DNA methylation (meQTLs) [8] and chromatin accessibility (dsQTLs).

Cumulatively, binding map studies and QTL map studies have

  • led to many insights into the principles and mechanisms of gene regulation.

However, there are questions that neither mapping approach on its own is well equipped to address. One outstanding issue is

  • the fraction of factor binding in the genome that is ‘‘functional’’,
    which we define here to mean that
  • disturbing the protein-DNA interaction leads to a measurable
  • downstream effect on gene regulation.

Transcription factor knockdown could be used to address this problem, whereby

  • the RNA interference pathway is employed to greatly reduce
  • the expression level of a specific target gene by using small interfering RNAs (siRNAs).

The response to the knockdown can then be measured by collecting RNA after the knockdown and

  • measuring global changes in gene expression patterns
  • after specifically attenuating the expression level of a given factor.

Combining a TF knockdown approach with TF binding data can help us to

  • distinguish functional binding from non-functional binding

This approach has previously been applied to the study of human TFs, although for the most part studies have only focused on

  • the regulatory relationship of a single factor with its downstream targets.

The FANTOM consortium knocked down 52 different transcription factors in

  • the THP-1 cell line, an acute monocytic leukemia-derived cell line, and
  • used a subset of these to validate certain regulatory predictions based on binding motif enrichments.

We and others previously studied the regulatory architecture of gene expression in

  • the model system of HapMap lymphoblastoid cell lines (LCLs) using both
  • binding map strategies and QTL mapping strategies.

We now sought to use knockdown experiments targeting transcription factors in a HapMap LCL

  • to refine our understanding of the gene regulatory circuitry of the human genome.

Therefore, we integrated the results of the knockdown experiments with previous data on TF binding to

  • better characterize the regulatory targets of 59 different factors and
  • to learn when a disruption in transcription factor binding
  • is most likely to be associated with variation in the expression level of a nearby gene.

Gene expression levels following the knockdown were compared to

  • expression data collected from six samples that were transfected with negative control siRNA.

The expression data from all samples were normalized together using

  • quantile normalization followed by batch correction using the RUV-2 method.
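
As a rough illustration of the quantile normalization step only (the RUV-2 batch correction is not shown), a minimal Python sketch might look like the following. The function name `quantile_normalize` and the toy values are hypothetical, ties are broken by position rather than averaged, and this is not the authors' actual pipeline.

```python
def quantile_normalize(samples):
    """Force every array to share the same distribution of values.

    samples: list of equal-length lists, one list of expression values per array.
    Returns a new list of lists in the same gene order.
    Assumption: ties are broken by position, not averaged.
    """
    if not samples:
        return []
    n_genes = len(samples[0])
    if any(len(s) != n_genes for s in samples):
        raise ValueError("all arrays must measure the same number of genes")

    # Mean of the values found at each rank across all arrays.
    sorted_cols = [sorted(s) for s in samples]
    rank_means = [
        sum(col[rank] for col in sorted_cols) / len(samples)
        for rank in range(n_genes)
    ]

    normalized = []
    for s in samples:
        # Gene indices ordered from lowest to highest expression in this array.
        order = sorted(range(n_genes), key=lambda i: s[i])
        out = [0.0] * n_genes
        for rank, gene_index in enumerate(order):
            out[gene_index] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy example: two arrays, three genes each (made-up numbers).
toy_arrays = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(toy_arrays)
print(normalized)
```

After this step every array contains the same set of values, so remaining differences between arrays reflect only the ranking of genes within each array.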

We then performed several quality control analyses to confirm

  1. that the quality of the data was high,
  2. that there were no outlier samples, and
  3. that the normalization methods reduced the influence of confounders

In order to identify genes that were expressed at a significantly different level

  • in the knockdown samples compared to the negative controls,
  • we used likelihood-ratio tests within the framework of a fixed effect linear model.

Following normalization and quality control of the arrays,

  • we identified genes that were differentially expressed between
  • the three knockdown replicates of each factor and the six controls.

Depending on the factor targeted, the knockdowns resulted in

  • between 39 and 3,892 differentially expressed genes at an FDR of 5%
    (Figure 1B; see Table S3 for a summary of the results).

The knockdown efficiency for the 59 factors ranged

  • from 50% to 90% (based on qPCR; Table S1).

The qPCR measurements of the knockdown level were significantly

  • correlated with estimates of the TF expression levels
  • based on the microarray data (P = 0.001; Figure 1C).

Reassuringly, we did not observe a significant correlation between

  • the knockdown efficiency of a given factor and
  • the number of genes classified as differentially expressed following the knockdown
    (this was true whether we estimated the knockdown effect based on qPCR (P = 0.10; Figure 1D) or microarray (P = 0.99; not shown) data).

Nor did we observe a correlation between

  • variance in qPCR-estimated knockdown efficiency (between replicates) and
  • the number of genes differentially expressed (P = 0.94; Figure 1E).

Because we knocked down 59 different factors in this experiment,

  • we were able to assess general patterns associated with the perturbation of transcription factors
  • beyond merely the number of affected target genes.

Globally, despite the range in the number of genes we identified as

  • differentially expressed in each knockdown,
  • the effect sizes of the differences in expression were relatively modest and
  • consistent in magnitude across all knockdowns.

The median effect size following the knockdown experiment for genes classified as

  • differentially expressed at an FDR of 5% in any knockdown was
  • a 9.2% difference in expression level between the controls and the knockdown (Figure 2),
  • while the median effect size for any individual knockdown experiment ranged between 8.1% and 11.0%.

We noticed that the large variation in the number of differentially expressed genes

  • extended even to knockdowns of factors from the same gene family.

Figure 1. Differential expression analysis.
(a) Examples of differential expression analysis results for the genes HCST and IRF4. The top two panels are ‘MA plots’ of the mean Log2(expression level) between the knockdown arrays and the controls for each gene (x-axis) to the Log2(Fold-Change) between the knockdowns and controls (y-axis). Differentially expressed genes at an FDR of 5% are plotted in yellow (points 50% larger). The gene targeted by the siRNA is highlighted in red. The bottom two panels are ‘volcano plots’ of the Log2(Fold-Change) between the knockdowns and controls (x-axis) to the P-value for differential expression (y-axis). The dashed line marks the 5% FDR threshold. Differentially expressed genes at an FDR of 5% are plotted in yellow (points 50% larger). The red dot marks the gene targeted by the siRNA.
(b) Barplot of number of differentially expressed genes in each knockdown experiment.
(c) Comparison of the knockdown level measured by qPCR (RNA sample collected 48 hours posttransfection) and the knockdown level measured by microarray.
(d) Comparison of the level of knockdown of the transcription factor at 48 hrs (evaluated by qPCR; x-axis) and the number of genes differentially expressed in the knockdown experiment (y-axis).
(e) Comparison of the variance in knockdown efficiency between replicates for each transcription factor (evaluated by qPCR; x-axis) and the number of differentially expressed genes in the knockdown experiment (y-axis).

http://dx.doi.org:/10.1371/journal.pgen.1004226.g001

Figure 2. Effect sizes for differentially expressed genes.
Boxplots of absolute Log2(fold-change) between knockdown arrays and control arrays for all genes identified as differentially expressed in each experiment. Outliers are not plotted. The gray bar indicates the interquartile range across all genes differentially expressed in all knockdowns. Boxplots are ordered by the number of genes differentially expressed in each experiment.

http://dx.doi.org:/10.1371/journal.pgen.1004226.g002

Knocking down SREBF2 (1,286 genes differentially expressed), a key regulator of cholesterol homeostasis,

  • results in changes in the expression of genes that are
  • significantly enriched for cholesterol and sterol biosynthesis annotations.

While not all factors exhibited striking enrichments for relevant functional categories and pathways,

  • the overall picture is that perturbations of many of the factors
  • primarily affected pathways consistent with their known biology.

In order to assess functional TF binding, we next incorporated

  • binding maps together with the knockdown expression data.

We combined binding data based on DNase-seq footprints in 70 HapMap LCLs, reported by Degner et al. (Table S5)

  • and from ChIP-seq experiments in LCL GM12878, published by ENCODE.

We were thus able to obtain genome wide binding maps for a total of 131 factors that were either

  • directly targeted by an siRNA in our experiment (29 factors) or were
  • differentially expressed in one of the knockdown experiments.

We classified a gene as a bound target of a particular factor when

  • binding of that factor was inferred within 10kb of the transcription start site (TSS) of the target gene.

Using this approach, we found that the 131 TFs were bound

  • in proximity to a median of 1,922 genes per factor (range 11 to 7,053 target genes).

We considered binding of a factor to be functional if the target gene

  • was differentially expressed after perturbing the expression level of the bound transcription factor.
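
The two definitions above (a bound target has binding within 10 kb of its TSS; binding is functional if the bound gene is also differentially expressed) can be sketched in Python as follows. This is only an illustrative sketch: the function names, the data layout, and the toy genes and coordinates are all hypothetical, and treating the 10 kb boundary as inclusive is an assumption.

```python
WINDOW_BP = 10_000  # window around the TSS, as stated in the text


def bound_targets(binding_sites, tss_by_gene, window_bp=WINDOW_BP):
    """Return the genes with at least one binding site within window_bp of the TSS.

    binding_sites: list of (chromosome, position) tuples for one factor.
    tss_by_gene:   dict mapping gene name -> (chromosome, TSS position).
    """
    bound = set()
    for gene, (chrom, tss) in tss_by_gene.items():
        if any(
            site_chrom == chrom and abs(site_pos - tss) <= window_bp
            for site_chrom, site_pos in binding_sites
        ):
            bound.add(gene)
    return bound


def functional_targets(bound, de_genes):
    """Bound genes that are also differentially expressed after the knockdown."""
    return set(bound) & set(de_genes)


# Hypothetical toy data: three genes, three binding sites, two DE genes.
toy_tss = {
    "geneA": ("chr1", 50_000),
    "geneB": ("chr1", 200_000),
    "geneC": ("chr2", 50_000),
}
toy_sites = [("chr1", 45_500), ("chr1", 215_000), ("chr2", 58_000)]
toy_de = {"geneA", "geneB"}

toy_bound = bound_targets(toy_sites, toy_tss)
toy_functional = functional_targets(toy_bound, toy_de)
print(sorted(toy_bound), sorted(toy_functional))
```

In this toy case geneB is differentially expressed but its nearest site lies 15 kb away, so it is not counted as bound, while geneC is bound but not differentially expressed, so its binding is not counted as functional.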

We then asked about the concordance between

  • the transcription factor binding data and the knockdown expression data, namely
  • the extent to which differences in gene expression levels following the knockdowns
  • might be predicted by binding of the transcription factors
  • within the putative regulatory regions of the responsive genes, and also
  • what proportion of putative target (bound) genes of a given TF were
  • differentially expressed following the knockdown of the factor.

Focusing only on the binding sites classified using the DNase-seq data
(which were assigned to a specific instance of the binding motif, unlike the ChIP data),

  • we examined sequence features that might distinguish functional binding.

In particular, we asked whether binding at conserved sites was more likely to be functional and

  • whether binding sites that better matched the known PWM for the factor were more likely to be functional.

We did not observe a significant shift in the conservation of functional binding sites (Wilcoxon rank sum P = 0.34),

  • but we did observe that binding around differentially expressed genes occurred at sites
  • that were significantly better matches to the canonical binding motif.

Figure 3. Intersecting binding data and expression data for each knockdown.
(a) Example Venn diagrams showing the overlap of binding and differential expression for the knockdowns of HCST and IRF4 (the same genes as in Figure 1).
(b) Boxplot summarizing the distribution of the fraction of all expressed genes that are bound by the targeted gene or downstream factors.
(c) Boxplot summarizing the distribution of the fraction of bound genes that are classified as differentially expressed, using an FDR of either 5% or 20%.

http://dx.doi.org:/10.1371/journal.pgen.1004226.g003

Considering bound targets determined from either the ChIP-seq or DNase-seq data, we observed that

  • differentially expressed genes were associated with both
  • a higher number of binding events for the relevant factors within 10 kb of the TSS (P < 10⁻¹⁶; Figure 4A)
  • as well as with a larger number of different binding factors
    (considering the siRNA-targeted factor and any TFs that were DE in the knockdown; P < 10⁻¹⁶; Figure 4B).

Figure 4. Degree of binding correlated with function. Boxplots comparing
(a) the number of sites bound, and
(b) the number of differentially expressed transcription factors with binding events near functionally or non-functionally bound genes. We considered binding for the siRNA-targeted factor and any factor differentially expressed in the knockdown.
(c) Focusing only on genes differentially expressed in common between each pairwise set of knockdowns we tested for enrichments of functional binding (y-axis). Pairwise comparisons between knockdown experiments were binned by the fraction of differentially expressed transcription factors in common between the two experiments. For these boxplots, outliers were not plotted.

http://dx.doi.org:/10.1371/journal.pgen.1004226.g004

We examined the distribution of binding about the TSS. Most factor binding was concentrated

  • near the TSS whether or not the genes were classified as differentially expressed (Figure 5A), but
  • the distance from the TSS to the binding sites was significantly longer for differentially expressed genes (P < 10⁻¹⁶; Fig. 5B).

Figure 5. Distribution of functional binding about the TSS.
(a) A density plot of the distribution of bound sites within 10 kb of the TSS for both functional and non-functional genes. Inset is a zoom-in of the region ±1 kb from the TSS.
(b) Boxplots comparing the distances from the TSS to the binding sites for functionally bound genes and non-functionally bound genes. For the boxplots, 0.001 was added before log10 transforming the distances and outliers were not plotted.

http://dx.doi.org:/10.1371/journal.pgen.1004226.g005

We investigated the distribution of factor binding across various chromatin states, as defined by Ernst et al. This dataset lists

  • regions of the genome that have been assigned to different activity states
  • based on ChIP-seq data for various histone modifications and CTCF binding.

For each knockdown, we separated binding events

  • by the genomic state in which they occurred and then
  • tested whether binding in that state was enriched around differentially expressed genes.

After correcting for multiple testing, we found that around genes that were differentially expressed

  • 19 knockdowns showed significant enrichment for binding in ‘‘strong enhancers’’,
  • four knockdowns had significant enrichments for ‘‘weak enhancers’’,
  • eight knockdowns showed significant depletion of binding in ‘‘active promoters’’, and
  • six knockdowns had significant depletions for ‘‘transcription elongation’’.

Did the factors tend to have a consistent effect (either up- or down-regulation)

  • on the expression levels of genes they purportedly regulated?

All factors we tested are associated with both up- and down-regulation of downstream targets (Figure 6).

A slight majority of downstream target genes were expressed at higher levels

  • following the knockdown for 15 of the 29 factors for which we had binding information (Figure 6B).

The factor that is associated with the largest fraction (68.8%) of up-regulated target genes following the knockdown is EZH2,

  • the enzymatic component of the Polycomb group complex.

On the other end of the spectrum was JUND, a member of the AP-1 complex, for which

  • 66.7% of differentially expressed targets were down-regulated following the knockdown.

Figure 6. Magnitude and direction of differential expression after knockdown.
(a) Density plot of all Log2(fold-changes) between the knockdown arrays and controls for genes that are differentially expressed at 5% FDR in one of the knockdown experiments as well as bound by the targeted transcription factor.
(b) Plot of the fraction of differentially expressed putative direct targets that were up-regulated in each of the knockdown experiments.

http://dx.doi.org:/10.1371/journal.pgen.1004226.g006

We found no correlation between the number of paralogs and the fraction of bound targets that were differentially expressed. We also did not observe a significant correlation when we considered whether

  • the percent identity of the closest paralog might be predictive of
  • the fraction of bound genes that were differentially expressed following the knockdown (Figure S8).

While there is compelling evidence for our inferences, the current chromatin functional annotations

  • do not fully explain the regulatory effects of the knockdown experiments.

For example, the enrichments for binding in ‘‘strong enhancer’’ regions of the genome range from 7.2% to 50.1% (median = 19.2%),

  • well beyond what is expected by chance alone, but far from accounting for all functional binding.

In addition to considering

  • the distinguishing characteristics of functional binding, we also examined
  • the direction of effect that perturbing a transcription factor had on the expression level of its direct targets.

We specifically addressed whether

  • knocking down a particular factor tended to drive expression of its putatively direct (namely, bound) targets up or down,
  • which can be used to infer that the factor represses or activates the target, respectively.
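
The inference described above can be sketched in a few lines of Python. This is a simplified illustration, not the authors' code: the function names, the 0.5 cut-off for labelling a factor, and the toy fold-changes are hypothetical.

```python
def fraction_up_regulated(log2_fold_changes):
    """Fraction of bound, differentially expressed targets whose expression
    went up after the knockdown (positive log2 fold-change vs. controls)."""
    if not log2_fold_changes:
        raise ValueError("need at least one differentially expressed target")
    n_up = sum(1 for fc in log2_fold_changes if fc > 0)
    return n_up / len(log2_fold_changes)


def inferred_role(fraction_up):
    """Targets that rise when the factor is depleted suggest repression;
    targets that fall suggest activation. The 0.5 cut-off is an assumption."""
    if fraction_up > 0.5:
        return "mostly repressor"
    if fraction_up < 0.5:
        return "mostly activator"
    return "mixed"


# Hypothetical log2 fold-changes for the direct targets of one factor.
toy_targets = [0.15, 0.12, -0.10, 0.20]
toy_fraction = fraction_up_regulated(toy_targets)
print(toy_fraction, inferred_role(toy_fraction))
```

Because the label depends only on the sign of each fold-change, it says nothing about effect size, which the text reports as modest across all knockdowns.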

Transcription factors have traditionally been thought of primarily as activators, and previous work from our group is consistent with that notion. Surprisingly, the most straightforward inference from the present study is that

  • many of the factors function as repressors at least as often as they function as activators.
  1. EZH2 had a negative regulatory relationship with the largest fraction of direct targets (68.8%),
    consistent with the known role of EZH2 as the active member of the Polycomb group complex PC2,
  2. while JUND seemed to have a positive regulatory relationship with the largest fraction of direct targets (66.7%),
    consistent with the biochemical characterization of the AP-1 complex (of which JUND is a component) as a transactivator.

More generally, however, our results, combined with the previous work from our group and others, make for a complicated view

  • of the role of transcription factors in gene regulation as
  • it seems difficult to reconcile the inference from previous work that
  • many transcription factors should primarily act as activators with the results presented here.

One somewhat complicated hypothesis, which nevertheless can resolve the apparent discrepancy, is that

  • the ‘‘repressive’’ effects we observe for known activators may be
  • at sites in which the activator is acting as a weak enhancer of transcription and
  • that reducing the cellular concentration of the factor
  • releases the regulatory region to binding by an alternative, stronger activator.

To more explicitly address the effect that our proximity-based definition of target genes might have on our analyses, we reanalyzed

  • the overlap between factor binding and differential expression following the knockdowns
  • using an independent, empirically determined set of target genes.

Thurman et al. used correlations in DNase hypersensitivity between

  • intergenic hypersensitive sites and promoter hypersensitive sites across diverse tissues
  • to assign intergenic regulatory regions to specific genes,
  • independently of proximity to a particular promoter.

We performed this alternative analysis in which we

  • assigned binding events to genes based on the classification of Thurman et al.

We then considered the overlap between binding and differential expression in this new data set. The results were largely

  • consistent with our proximity-based observations.

A median of 9.5% of genes that were bound by a factor were

  • also differentially expressed following the knockdown of that factor
    (compared to 11.1% when the assignment of binding sites to genes is based on proximity).

From the opposite perspective, a median of 28.0% of differentially expressed genes were bound by that factor
(compared to 32.3% for the proximity based definition). The results of this analysis are summarized in Table S7.
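
The two perspectives reported above (the fraction of bound genes that are differentially expressed, and the fraction of differentially expressed genes that are bound) can be computed per knockdown and then summarized by the median. The sketch below assumes gene sets are already available for each factor; the names `overlap_fractions` and `toy_knockdowns`, and the toy gene sets, are hypothetical.

```python
from statistics import median


def overlap_fractions(bound_genes, de_genes):
    """Return (fraction of bound genes that are DE,
               fraction of DE genes that are bound) for one knockdown."""
    both = bound_genes & de_genes
    frac_bound_de = len(both) / len(bound_genes) if bound_genes else 0.0
    frac_de_bound = len(both) / len(de_genes) if de_genes else 0.0
    return frac_bound_de, frac_de_bound


# Hypothetical knockdowns: factor -> (bound genes, differentially expressed genes).
toy_knockdowns = {
    "TF_A": ({"g1", "g2", "g3", "g4"}, {"g1", "g9"}),
    "TF_B": ({"g1", "g2"}, {"g2", "g5", "g6", "g7"}),
    "TF_C": ({"g1", "g2", "g3", "g4", "g5"}, {"g5"}),
}

per_factor = {
    factor: overlap_fractions(bound, de)
    for factor, (bound, de) in toy_knockdowns.items()
}
median_bound_de = median(f[0] for f in per_factor.values())
median_de_bound = median(f[1] for f in per_factor.values())
print(median_bound_de, median_de_bound)
```

The two fractions share a numerator but not a denominator, which is why a factor can bind many genes that do not respond while still accounting for a sizeable share of the genes that do.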

Our results should not be considered a comprehensive census of regulatory events in the human genome. Instead, we adopted a gene-centric approach,

  • focusing only on binding events near the genes for which we could measure expression
  • to learn some of the principles of functional transcription factor binding.

In light of our observations, a reassessment of our estimates of binding may be warranted. In particular, because functional binding is skewed away from promoters (our system is apparently not well-suited to observe functional promoter binding, perhaps because of protection by large protein complexes),

  • a more conservative estimate of the fraction of binding that is indeed functional would not consider data within the promoter.

Importantly, excluding the putative promoter region from our analysis (i.e. only considering a window >1 kb from the TSS and <10 kb from the TSS)

  • does not change our conclusions.

Considering this smaller window,

  • a median of 67.0% of expressed genes are still classified as bound by
  1. either the knocked down transcription factor or
  2. a downstream factor that is differentially expressed in each experiment,

yet a median of only 8.1% of the bound genes are

  • also differentially expressed after the knockdowns.

Much of what distinguishes functional binding (as we define it) has yet to be explained. We are unable to explain much of the differential expression observed in our experiments by the presence of at least one relevant binding event. This may not be altogether surprising, as

  • we are only considering binding in a limited window around the transcription start site.

To address these issues, more factors should be perturbed to further evaluate the robustness of our results and to add insight. Together, such studies will help us develop a more sophisticated understanding of functional transcription factor binding in particular, and of gene regulatory logic more generally.

Assessing quality and completeness of human transcriptional regulatory pathways on a genome-wide scale

E Shmelkov, Z Tang, I Aifantis, A Statnikov*
Biology Direct 2011; 6(15).  http://www.biology-direct.com/content/6/1/15

Recently, biological pathways have become a common and probably the most popular form of representing biochemical information for hypothesis generation and validation. These maps store extensive knowledge of complex molecular interactions and regulation occurring in the living organism in a simple and accessible way, often using intuitive graphical notation. Two major types of biological pathways can be distinguished.

  1. Metabolic pathways incorporate complex networks of protein-based interactions and modifications, while
  2. signal transduction and transcriptional regulatory pathways are usually considered to provide information on mechanisms of transcription.

While there are a lot of data collected on human metabolic processes,

  • the content of signal transduction and transcriptional regulatory pathways varies greatly in quality and completeness.

An indicative comparison of MYC transcriptional targets reported in ten different pathway databases reveals that these databases differ greatly from each other (Figure 1). Given that MYC is involved

  • in the transcriptional regulation of approximately 15% of all genes,

one cannot argue that the majority of pathway databases that contain

  • fewer than thirty putative transcriptional targets of MYC are even close to complete.

More importantly, to date there have been no genome-wide evaluation studies (that is, studies based on genome-wide binding and gene expression assays) assessing pathway databases.

Background: While pathway databases are becoming increasingly important in most types of biological and translational research, little is known about the quality and completeness of pathways stored in these databases. The present study conducts a comprehensive assessment of transcriptional regulatory pathways in humans for seven well-studied transcription factors:

  1. MYC,
  2. NOTCH1,
  3. BCL6,
  4. TP53,
  5. AR,
  6. STAT1,
  7. RELA.

The employed benchmarking methodology first involves integrating

  • genome-wide binding with functional gene expression data
  • to derive direct targets of transcription factors.

Then the lists of experimentally obtained direct targets

  • are compared with relevant lists of transcriptional targets from 10 commonly used pathway databases.

Results: The results of this study show that for the majority of pathway databases,

  • the overlap between experimentally obtained target genes and
  • targets reported in transcriptional regulatory pathway databases is
  • surprisingly small and often is not statistically significant.

The only exception is the MetaCore pathway database, which

  • yields a statistically significant intersection with experimental results in 84% of cases.

The lists of experimentally derived direct targets obtained in this study can be used

  • to reveal new biological insight into transcriptional regulation and
  • to suggest novel putative therapeutic targets in cancer.

Conclusions: Our study opens a debate on the validity of using many popular pathway databases to obtain transcriptional regulatory targets. We conclude that the choice of pathway databases should be informed by

  • solid scientific evidence and rigorous empirical evaluation.

In the current study we perform

(1) an evaluation of ten commonly used pathway databases,

  • assessing the transcriptional regulatory pathways, considered in the current study as
  • the interactions of the type ‘transcription factor-transcriptional targets’.

This involves integration of human genome wide functional microarray or RNA-seq gene expression data with

  • protein-DNA binding data from ChIP-chip, ChIP-seq, or ChIP-PET platforms
  • to find direct transcriptional targets of the seven well known transcription factors:
  • MYC, NOTCH1, BCL6, TP53, AR, STAT1, and RELA.

The choice of transcription factors is based on their important role in oncogenesis and availability of binding and expression data in the public domain.

(2) the lists of experimentally derived direct targets are used to assess the quality and completeness of 84 transcriptional regulatory pathways from four publicly available (BioCarta, KEGG, WikiPathways and Cell Signaling Technology) and six commercial (MetaCore, Ingenuity Pathway Analysis, BKL TRANSPATH, BKL TRANSFAC, Pathway Studio and GeneSpring Pathways) pathway databases.

(3) We measure the overlap between pathways and experimentally obtained target genes and assess statistical significance of this overlap, and we demonstrate that experimentally derived lists of direct transcriptional targets

  • can be used to reveal new biological insight on transcriptional regulation.

We show this by analyzing common direct transcriptional targets of

  • MYC, NOTCH1 and RELA
  • that act in interconnected molecular pathways.

Detection of such genes is important as it could reveal novel targets of cancer therapy.

Figure 1 Number of genes in common between MYC transcriptional targets derived from ten different pathway databases. Cells are colored according to their values from white (low values) to red (high values). (not shown)

Figure 2 Illustration of statistical methodology for comparison between a gold-standard and a pathway database

Since we are seeking to compare gene sets from different studies and databases, it is essential to map genes to standard identifiers. We therefore transformed all gene sets to HUGO Gene Nomenclature Committee approved gene symbols and names. To assess the statistical significance of the overlap between the resulting gene sets, we used the hypergeometric test at the 5% α-level with false discovery rate correction for multiple comparisons by the method of Benjamini and Yekutieli. The alternative hypothesis of this test is that two sets of genes (set A from a pathway database and set B from experiments) have a greater number of genes in common than two randomly selected gene sets with the same numbers of genes as sets A and B. For example, suppose that for some transcription factor there are 300 direct targets in pathway database #1 and 700 in the experimentally derived list (gold-standard), and their intersection is 16 genes (Figure 2a). If we select at random, from a total of 20,000 genes, two sets with 300 and 700 genes, their overlap would be greater than or equal to 16 genes 6.34% of the time. Thus, this overlap is not statistically significant at the 5% α-level (p = 0.0634). On the other hand, suppose that for pathway database #2 there are 30 direct targets of that transcription factor, and their intersection with the 700-gene gold-standard is only 6 genes. Even though this intersection is rather small, it is unlikely that 30 genes selected at random (out of 20,000) would overlap a 700-gene gold-standard by 6 or more genes (p = 0.0005, see Figure 2a). This overlap is statistically significant at the 5% α-level.
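The two worked examples in the paragraph above can be checked with a few lines of Python. This is only a minimal sketch of a one-sided hypergeometric tail probability, not the authors' code: the function name `hypergeom_tail` and its parameter names are invented here for illustration, and the multiple-comparison correction is left out.

```python
from math import comb


def hypergeom_tail(overlap, size_a, size_b, universe):
    """P(X >= overlap) when size_a genes are drawn at random from `universe`
    genes, of which size_b belong to the gold-standard (hypothetical helper)."""
    upper = min(size_a, size_b)
    # Count the draws whose overlap with the gold-standard is k, for k >= overlap
    favourable = sum(
        comb(size_b, k) * comb(universe - size_b, size_a - k)
        for k in range(overlap, upper + 1)
    )
    # Exact integer counts divided once at the end
    return favourable / comb(universe, size_a)


# Pathway database #1: 300 targets, 16 shared with the 700-gene gold-standard
p_db1 = hypergeom_tail(16, 300, 700, 20_000)
# Pathway database #2: 30 targets, 6 shared with the 700-gene gold-standard
p_db2 = hypergeom_tail(6, 30, 700, 20_000)

print(f"database #1: p = {p_db1:.4f}")
print(f"database #2: p = {p_db2:.4f}")
```

The printed values should land close to the p = 0.0634 and p = 0.0005 quoted in the text, which is the point of the example: a small overlap from a small database can be more surprising than a larger overlap from a large one.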

We also calculate an enrichment fold change ratio (EFC) for every intersection between a gold-standard and a pathway database. For a given pair of a gold-standard and a pathway database, EFC is equal to the observed number of genes in their intersection, divided by the expected size of the intersection under the null hypothesis (plus machine epsilon, to avoid division by zero). Notice, however, that larger values of EFC may correspond to databases that are highly incomplete and contain only a few relations. For example, suppose that for some transcription factor there are 300 direct targets in pathway database #1 and 50 in the experimentally derived list (gold-standard), and their intersection is 30 genes (Figure 2b). If we select at random, from a total of 20,000 genes, two sets with 300 and 50 genes, their expected overlap under the null hypothesis will be equal to 0.75. Thus, the EFC ratio will be equal to 40 (= 30/0.75). On the other hand, suppose that for pathway database #2 there are 2 direct targets of that transcription factor, and their intersection with the 50-gene gold-standard is only 1 gene. Even though the expected overlap under the null hypothesis will be equal to 0.005 and EFC equal to 200 (5 times bigger than for database #1), the size of this intersection with the gold-standard is 30 times smaller than for database #1 (Figure 2b).
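The EFC arithmetic can be reproduced directly. The sketch below assumes the expected overlap of two random gene sets is the product of their sizes divided by the size of the gene universe; the function and argument names are illustrative, not taken from the paper.

```python
import sys


def enrichment_fold_change(observed, size_db, size_gold, universe):
    """Observed overlap divided by the overlap expected by chance
    (hypothetical helper; names are not from the paper)."""
    expected = size_db * size_gold / universe
    # Machine epsilon guards against division by zero, as in the text
    return observed / (expected + sys.float_info.epsilon)


# Database #1: 300 targets, 30 shared with a 50-gene gold-standard
efc_db1 = enrichment_fold_change(30, 300, 50, 20_000)
# Database #2: 2 targets, 1 shared with the same gold-standard
efc_db2 = enrichment_fold_change(1, 2, 50, 20_000)

print(round(efc_db1, 6), round(efc_db2, 6))
```

Database #2 scores five times higher even though it recovers a single gene, which is why EFC is best read alongside the raw overlap and the significance test rather than on its own.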

Figure 3 Comparison between different pathway databases and experimentally derived gold-standards for all considered transcription factors. The value in a given cell is the number of overlapping genes between a gold-standard and a pathway-derived gene set. Cells are colored according to their values from white (low values) to red (high values). Underlined values in red represent statistically significant intersections. (not shown)

Figure 4 Summary of the pathway databases assessment. Green cells represent statistically significant intersections between experimentally derived gold-standards and transcriptional regulatory pathways. White cells denote results that are not statistically significant. Numbers are the enrichment fold change ratios (EFC) calculated for each intersection. (not shown)

At the core of this study was the creation of gold-standards of transcriptional regulation in humans that can be compared with target genes reported in transcriptional regulatory pathways. We focused on seven well-known transcription factors and obtained gold-standards

  • by integrating genome-wide transcription factor-DNA binding data (from ChIP-chip, ChIP-seq, or ChIP-PET platforms)
  • with functional gene expression microarray and RNA-seq data.

The latter data allow changes in the transcriptome to be surveyed on a genome-wide scale

  • after the inhibition or over-expression of the transcription factor in question.

However, a change in the expression of a particular gene could be caused either by the direct effect of removing or introducing a given transcription factor or by an indirect effect, through a change in the expression level of some other gene(s). It is essential

  • to integrate data from these two sources to
  • obtain an accurate list of gene targets that are directly regulated by a transcription factor.
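At its simplest, the integration described above amounts to a set intersection: a gene counts as a direct target only if the factor binds near it and its expression changes when the factor is perturbed. The sketch below uses made-up placeholder gene names and a hypothetical helper, `direct_targets`; the real pipeline involves peak calling, differential expression statistics and other steps not shown here.

```python
def direct_targets(bound_genes, responsive_genes):
    """Genes supported by both binding and expression evidence
    (hypothetical helper for illustration)."""
    return set(bound_genes) & set(responsive_genes)


# Placeholder identifiers, not real data
bound = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}   # e.g. from ChIP-seq
responsive = {"GENE_B", "GENE_D", "GENE_E"}        # e.g. from knockdown RNA-seq

gold_standard = direct_targets(bound, responsive)
# Responsive but unbound genes are candidates for indirect regulation
indirect_candidates = responsive - bound

print(sorted(gold_standard))
print(sorted(indirect_candidates))
```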

It is worth noting that the tested pathway databases typically do not distinguish between cell lines, experimental conditions, and other details relevant to the experimental systems in which the data were obtained. These databases in a sense propose a ‘universal’ list of transcriptional targets. However, it is known that

  • transcriptional regulation in a cell is dynamic and works differently for different systems and stimuli.

This accentuates the major limitation of pathway databases and emphasizes

  • the importance of deriving a specific list of transcriptional targets for the current experimental system.

In this study we followed the latter approach by developing gold-standards for specific, well-characterized biological systems and experimental conditions.

The approach used here for building gold-standards of direct mechanistic knowledge has several limitations (see article). Nevertheless, our results suggest that multiple transcription factors can co-operate and control both physiological differentiation and malignant transformation, as demonstrated utilizing combinatorial gene-profiling for

  • NOTCH1, MYC and RELA targets.

These studies might lead us to multi-pathway gene expression “signatures”

  • essential for the prediction of genes that could be targeted in cancer treatments.

In agreement with this hypothesis, several of the genes identified in our analysis have been suggested to be putative therapeutic targets in leukemia, with either preclinical or clinical trials underway (CDK4, CDK6, GSK3b, MYC, LCK, NFkB2, BCL2L1, NOTCH1).

Single-molecule tracking in live cells reveals distinct target-search strategies of transcription factors in the nucleus

I Izeddin†, V Récamier†‡, L Bosanac, II Cissé, L Boudarene, et al.
1Functional Imaging of Transcription, Institut de Biologie de l’Ecole Normale Supérieure (IBENS), Inserm, and CNRS UMR; 2Laboratoire Kastler Brossel, CNRS UMR, Departement de Physique et Institut de Biologie de l’Ecole Normale Supérieure (IBENS), Paris, Fr; 3Transcription Imaging Consortium, Janelia Farm Research Campus, Howard Hughes Medical Institute, Ashburn, US; + more.
Biophysics and structural biology | Cell biology eLife 2014;3:e02230. http://dx.doi.org:/10.7554/eLife.02230

Transcription factors are

  • proteins that control the expression of genes in the nucleus, and
  • they do this by binding to other proteins or DNA.

First, however, these regulatory proteins need to overcome the challenge of

  • finding their targets in the nucleus, which is crowded with other proteins and DNA.

Much research to date has focused on measuring how fast proteins can diffuse and spread out throughout the nucleus. However, these measurements only make sense if these proteins have access to the same space within the nucleus.

Now, Izeddin, Récamier et al. have developed a new technique to track

  • single protein molecules in the nucleus of mammalian cells.

A transcription factor called c-Myc and another protein called P-TEFb

  • were tracked and while they diffused at similar rates,
  • they ‘explored’ the space inside the nucleus in very different ways.

Izeddin, Récamier et al. found that c-Myc explores the nucleus in a so-called ‘non-compact’ manner: this means that it

  • can move almost everywhere inside the nucleus, and has an equal chance
  • of reaching any target regardless of its position in this space.

P-TEFb, on the other hand, searches

  • the nucleus in a ‘compact’ way.

This means that it is constrained to follow a specific path

  • through the nucleus and is therefore guided to its potential targets.

Izeddin, Récamier et al. explain that

  • the different ‘search strategies’ used by these two proteins
  • influence how long it takes them to find their targets and
  • how far they can travel in a given time.

These findings, together with information about

  • where and when different proteins interact in the nucleus,

will be essential to understand how the organization of the genome within the nucleus

  • can control the expression of genes.

The next challenge will now be to

  • uncover what determines a
  • protein’s search strategy in the nucleus, as well as
  • the potential ways that this strategy might be regulated.

Mueller et al., 2010; Normanno et al., 2012). These transient interactions are essential to ensure a fine regulation of binding site occupancy—by competition or by altering the TF concentration—but must also be persistent enough to enable the assembly of multicomponent complexes (Dundr, 2002; Darzacq and Singer, 2008; Gorski et al., 2008; Cisse et al., 2013).
In parallel to the experimental evidence of the fast diffusive motion of nuclear factors, our understanding of the intranuclear space has evolved from a homogeneous environment to an organelle where the spatial arrangement among genes and regulatory sequences plays an important role in transcriptional control (Heard and Bickmore, 2007). The nucleus of eukaryotes displays a hierarchy of organized structures (Gibcus and Dekker, 2013) and is often referred to as a crowded environment.
How crowding influences transport properties of macromolecules and organelles in the cell is a fundamental question in quantitative molecular biology. While a restriction of the available space for diffusion can slow down transport processes, it can also channel molecules towards their targets increasing their chance to meet interacting partners. A widespread observation in quantitative cell biology is that the diffusion of molecules is anomalous, often attributed to crowding in the nucleoplasm, cytoplasm, or in the membranes of the cell (Höfling and Franosch, 2013). An open debate remains on how to determine whether diffusion is anomalous or normal (Malchus and Weiss, 2009; Saxton, 2012), and the mechanisms behind anomalous diffusion (Saxton, 2007). The answer to these questions bears important consequences for the understanding of the biochemical reactions of the cell.
The problem of diffusing molecules in non-homogeneous media has been investigated in different fields. Following the seminal work of de Gennes (1982a, 1982b) in polymer physics, the study of diffusivity of particles and their reactivity has been generalized to random or disordered media (Kopelman, 1986; Lindenberg et al., 1991). These works have set a framework to interpret the mobility of macromolecular complexes in the cell, and recently in terms of kinetics of biochemical reactions (Condamin et al., 2007). Experimental evidence has also been found, showing the influence of the glass-like properties of the bacterial cytoplasm in the molecular dynamics of intracellular processes (Parry et al., 2014). These studies demonstrate that the geometry of the medium in which diffusion takes place has important repercussions for the search kinetics of molecules. The notion of compact and non-compact exploration was introduced by de Gennes (1982a) in the context of dense polymers and describes two fundamental types of diffusive behavior. While a non-compact explorer leaves a significant number of available sites unvisited, a compact explorer performs a redundant exploration of the space. In chemistry, the influence of compactness is well established to describe dimensional effects on reaction rates (Kopelman, 1986).
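The distinction between compact and non-compact exploration can be illustrated with a toy lattice random walk. This is a textbook-style sketch under simplifying assumptions (unbiased nearest-neighbor steps on an infinite lattice, fixed random seed), not the analysis used in the paper: a one-dimensional walker keeps revisiting the same sites, whereas a three-dimensional walker reaches a new site on most steps. The function name `distinct_fraction` is invented here.

```python
import random


def distinct_fraction(dimensions, steps, rng):
    """Fraction of steps that land on a lattice site not visited before
    (toy model; names are illustrative)."""
    position = [0] * dimensions
    visited = {tuple(position)}
    for _ in range(steps):
        axis = rng.randrange(dimensions)       # pick a coordinate axis
        position[axis] += rng.choice((-1, 1))  # step one site along it
        visited.add(tuple(position))
    return len(visited) / steps


rng = random.Random(0)  # fixed seed so the run is repeatable
compact = distinct_fraction(1, 20_000, rng)      # 1D: redundant, compact
non_compact = distinct_fraction(3, 20_000, rng)  # 3D: sparse, non-compact

print(f"1D walk: {compact:.3f} new sites per step")
print(f"3D walk: {non_compact:.3f} new sites per step")
```

The low ratio in one dimension reflects redundant, exhaustive sampling of a neighborhood, while the high ratio in three dimensions reflects a walker that roams widely but leaves many nearby sites unvisited.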
In this study, we aim to elucidate the existence of different types of mobility of TFs in the eukaryotic nucleus, as well as the principles governing nuclear exploration of factors relevant to transcriptional control. To this end, we used single-molecule (SM) imaging to address the relationship between the nuclear geometry and the search dynamics of two nuclear factors having distinct functional roles: the proto-oncogene c-Myc and the positive transcription elongation factor (P-TEFb). c-Myc is a basic helix-loop-helix DNA-binding transcription factor that binds to E-Boxes; 18,000 E-boxes are found in the genome, and c-Myc affects the transcription of numerous genes (Gallant and Steiger, 2009).
Recently, c-Myc has been demonstrated to be a general transcriptional activator upregulating transcription of nearly all genes (Lin et al., 2012; Nie et al., 2012). P-TEFb is an essential actor in the transcription regulation driven by RNA Polymerase II. P-TEFb is a cyclin-dependent kinase, comprising a CDK9 and a Cyclin T subunit. It phosphorylates the elongation control factors SPT5 and NELF to allow productive elongation of class II gene transcription (Wada et al., 1998). The carboxy-terminal domain (CTD) of the catalytic subunit RPB1 of polymerase II is also a major target of P-TEFb (Zhou et al., 2012). c-Myc and P-TEFb are therefore two good examples of transcriptional regulators binding to numerous sites in the nucleus; the latter binds to the transcription machinery itself and the former directly to DNA.

Single particle tracking (SPT) constitutes a powerful method to probe the mobility of molecules in living cells (Lord et al., 2010). In the nucleus, SPT was first employed to investigate the dynamics of mRNAs (Fusco et al., 2003; Shav-Tal et al., 2004) or for rheological measurements of the nucleoplasm using inert probes (Bancaud et al., 2009). Recently, the tracking of single nuclear factors has been facilitated by the advent of efficient in situ tagging methods such as Halo tags (Mazza et al., 2012). An alternative approach takes advantage of photoconvertible tags (Lippincott-Schwartz and Patterson, 2009) and photoactivated localization microscopy (PALM) (Betzig et al., 2006; Hess et al., 2006). Single particle tracking PALM (sptPALM) was first used to achieve high-density diffusion maps of membrane proteins (Manley et al., 2008). However, sptPALM experiments have typically been limited to proteins with slow mobility (Manley et al., 2008) or those that undergo restricted motions (Frost et al., 2010; English et al., 2011).

Recently, by inclusion of light-sheet illumination, it has been used to determine the binding characteristics of TFs to DNA (Gebhardt et al., 2013). In this study, we developed a new sptPALM procedure adapted for the recording of individual proteins rapidly diffusing in the nucleus of mammalian cells. We used the photoconvertible fluorophore Dendra2 (Gurskaya et al., 2006) and took advantage of tilted illumination (Tokunaga et al., 2008). A careful control of the photoconversion rate minimized the background signal due to out-of-focus activated molecules, and we could thus follow the motion of individual proteins freely diffusing within the nuclear volume. With this sptPALM technique, we recorded large data sets (on the order of 10^4 single translocations in a single imaging session), which were essential for a proper statistical analysis of the search dynamics.
We applied our technique to several nuclear proteins and found that diffusing factors do not sense a unique nucleoplasmic architecture: c-Myc and P-TEFb adopt different nuclear space-exploration strategies, which drastically change the way they reach their specific targets. The differences observed between the two factors were not due to their diffusive kinetic parameters but to the geometry of their exploration path. c-Myc and our control protein, ‘free’ Dendra2, showed free diffusion in a three-dimensional nuclear space. In contrast, P-TEFb explored the nuclear volume by sampling a space of reduced dimensionality, displaying characteristics of exploration constrained in fractal structures.
The role of the space-sampling mode in the search strategy has long been discussed from a theoretical point of view (de Gennes, 1982a; Kopelman, 1986; Lindenberg et al., 1991). Our experimental results support the notion that it could indeed be a key parameter for diffusion-limited chemical reactions in the closed environment of the nucleus (Bénichou et al., 2010). We discuss the implications of our observations in terms of gene expression control, and its relation to the spatial organization of genes within the nucleus.


Introduction to Protein Synthesis and Degradation

Curator: Larry H. Bernstein, MD, FCAP

Updated 8/31/2019

 

I placed this chapter after the one on signaling rather than before it, having written much of the content before reorganizing the material. The previous chapters on carbohydrate and lipid metabolism have already provided much material on proteins and protein function, which made a persuasive case for introducing signaling first. Signaling entails a substantial introduction to the conformational changes in proteins that direct the trafficking of metabolic pathways, and it more subtly uncovers an important role for microRNAs, which are not divorced from transcription but act in a non-transcriptional role. This is where the classic model of molecular biology lacked any integration with emerging metabolic concepts concerning regulation. Consequently, the science lacked an understanding of the ties between the convergence of multiple transcripts, the selective inhibition of transcription, the relative balance of aerobic and anaerobic metabolism, the weight of the pentose phosphate shunt, and the utilization of available energy sources for synthetic and catabolic adaptive responses.

The first subchapter serves to introduce the importance of transcription in translational science. The several subtitles that follow are intended to lay out the scope of transcriptional activity, and also to direct attention toward the huge role of proteomics in the cell construct. As we have already seen, proteins engage with carbohydrates and with lipids in important structural and signaling processes. They are integral to the composition of the cytoskeleton, and also to the extracellular matrix. Many proteins are actually enzymes, carrying out the transformation of some substrate, a derivative of the food we ingest. They have a catalytic site, and many function with a cofactor, either a multivalent metal or a nucleotide.

The amino acids that go into protein synthesis include “indispensable” (essential) amino acids, which the body cannot synthesize and which must be obtained from the diet, chiefly from animal protein, although the need is partially satisfied by plant sources. There are 20 amino acids commonly found in proteins. They are classified into the following well-established groups based on the chemical and/or structural properties of their side chains:

  1. Aliphatic Amino Acids
  2. Cyclic Amino Acid
  3. AAs with Hydroxyl or Sulfur-containing side chains
  4. Aromatic Amino Acids
  5. Basic Amino Acids
  6. Acidic Amino Acids and their Amides

Examples include:

Alanine                  aliphatic hydrophobic neutral
Arginine                 polar hydrophilic charged (+)
Cysteine                polar hydrophobic neutral
Glutamine             polar hydrophilic neutral
Histidine                aromatic polar hydrophilic charged (+)
Lysine                   polar hydrophilic charged (+)
Methionine            hydrophobic neutral
Serine                   polar hydrophilic neutral
Tyrosine                aromatic polar hydrophobic

Transcribe and Translate a Gene

  1. Each RNA base pairs with a complementary DNA base
  2. Cells use the two-step process of transcription and translation to read each gene and produce the string of amino acids that makes up a protein
  3. mRNA is produced in the nucleus and is transferred to the ribosome
  4. mRNA uses uracil instead of thymine
  5. The ribosome reads the mRNA sequence and makes protein
  6. Each amino acid is specified by a three-letter RNA code (codon)
  7. The ribosome starts at AUG (start) and reads the sequence one codon, three letters, at a time
  8. Stop codons are UAA, UAG and UGA
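The steps listed above can be traced in a short Python sketch. It assumes the input is the coding (sense) DNA strand, so transcription reduces to replacing thymine with uracil, and it uses the standard genetic code with one-letter amino acid symbols. The function names and the example sequence are made up for illustration.

```python
BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...; "*" marks stop
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(coding_strand):
    """mRNA for a coding-strand DNA sequence: same letters, U in place of T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read codons from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    protein = []
    for pos in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[pos:pos + 3]]
        if residue == "*":  # UAA, UAG or UGA
            break
        protein.append(residue)
    return "".join(protein)


mrna = transcribe("CCATGGCCAAATGACC")  # invented example sequence
protein = translate(mrna)
print(mrna)     # CCAUGGCCAAAUGACC
print(protein)  # MAK
```

Here the ribosome skips the leading CC, starts at AUG (methionine), reads GCC (alanine) and AAA (lysine), and stops at UGA.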

 

protein synthesis

http://learn.genetics.utah.edu/content/molecules/transcribe/images/TandT.png

mcell-transcription-translation

http://www.vcbio.science.ru.nl/images/cellcycle/mcell-transcription-translation_eng_zoom.gif

transcription_translation

 

http://www.biologycorner.com/resources/transcription_translation.JPG

 

What about the purine inosine?

Inosine triphosphate pyrophosphatase – Pyrophosphatase that hydrolyzes the non-canonical purine nucleotides inosine triphosphate (ITP), deoxyinosine triphosphate (dITP) as well as 2′-deoxy-N-6-hydroxylaminopurine triphosphate (dHAPTP) and xanthosine 5′-triphosphate (XTP) to their respective monophosphate derivatives. The enzyme does not distinguish between the deoxy- and ribose forms. Probably excludes non-canonical purines from RNA and DNA precursor pools, thus preventing their incorporation into RNA and DNA and avoiding chromosomal lesions.

Gastroenterology. 2011 Apr;140(4):1314-21.  http://dx.doi.org:/10.1053/j.gastro.2010.12.038. Epub 2011 Jan 1.

Inosine triphosphate protects against ribavirin-induced adenosine triphosphate loss by adenylosuccinate synthase function.

Hitomi Y, Cirulli ET, Fellay J, McHutchison JG, Thompson AJ, Gumbs CE, Shianna KV, Urban TJ, Goldstein DB.

Genetic variation of inosine triphosphatase (ITPA) causing an accumulation of inosine triphosphate (ITP) has been shown to protect patients against ribavirin (RBV)-induced anemia during treatment for chronic hepatitis C infection by genome-wide association study (GWAS). However, the biologic mechanism by which this occurs is unknown.

Although ITP is not used directly by human erythrocyte ATPase, it can be used for ATP biosynthesis via ADSS in place of guanosine triphosphate (GTP). With RBV challenge, erythrocyte ATP reduction was more severe in the wild-type ITPA genotype than in the hemolysis protective ITPA genotype. This difference also remains after inhibiting adenosine uptake using nitrobenzylmercaptopurine riboside (NBMPR).

ITP confers protection against RBV-induced ATP reduction by substituting for erythrocyte GTP, which is depleted by RBV, in the biosynthesis of ATP. Because patients with excess ITP appear largely protected against anemia, these results confirm that RBV-induced anemia is due primarily to the effect of the drug on GTP and consequently ATP levels in erythrocytes.

Ther Drug Monit. 2012 Aug;34(4):477-80.  http://dx.doi.org:/10.1097/FTD.0b013e31825c2703.

Determination of inosine triphosphate pyrophosphatase phenotype in human red blood cells using HPLC.

Citterio-Quentin A, Salvi JP, Boulieu R.

Thiopurine drugs, widely used in cancer chemotherapy, inflammatory bowel disease, and autoimmune hepatitis, are responsible for common adverse events. Only some of these may be explained by genetic polymorphism of thiopurine S-methyltransferase. Recent articles have reported that inosine triphosphate pyrophosphatase (ITPase) deficiency was associated with adverse drug reactions toward thiopurine drug therapy. Here, we report a weak anion exchange high-performance liquid chromatography method to determine ITPase activity in red blood cells and to investigate the relationship with the occurrence of adverse events during azathioprine therapy.

The chromatographic method reported allows the analysis of IMP, inosine diphosphate, and ITP in a single run in <12.5 minutes. The method was linear in the range 5-1500 μmole/L of IMP. Intraassay and interassay precisions were <5% for red blood cell lysates supplemented with 50, 500, and 1000 μmole/L IMP. Km and Vmax evaluated by Lineweaver-Burk plot were 677.4 μmole/L and 19.6 μmole·L⁻¹·min⁻¹, respectively. The frequency distribution of ITPase from 73 patients was investigated.
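For readers unfamiliar with the Lineweaver-Burk plot mentioned above, it is a straight-line fit of 1/v against 1/[S]: the intercept equals 1/Vmax and the slope equals Km/Vmax. The sketch below generates noise-free synthetic rates from the Michaelis-Menten equation using the reported constants and recovers them by ordinary least squares. The substrate concentrations are arbitrary illustrative values, not data from the study, and the function name is invented.

```python
def lineweaver_burk(substrate, velocity):
    """Fit 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax by least squares; return (Km, Vmax)."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in velocity]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept  # intercept is 1/Vmax
    km = slope * vmax       # slope is Km/Vmax
    return km, vmax


# Synthetic, noise-free rates built from the constants quoted in the abstract
TRUE_KM, TRUE_VMAX = 677.4, 19.6
substrate = [50.0, 100.0, 250.0, 500.0, 1000.0, 1500.0]  # illustrative values
velocity = [TRUE_VMAX * s / (TRUE_KM + s) for s in substrate]

km, vmax = lineweaver_burk(substrate, velocity)
print(f"Km = {km:.1f}, Vmax = {vmax:.1f}")
```

With real, noisy measurements the double-reciprocal transform inflates the error at low substrate concentrations, so a direct nonlinear fit is often preferred; the linear form is shown here only because it is the method named in the abstract.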

The method described is useful to determine the ITPase phenotype from patients on thiopurine therapy and to investigate the potential relation between ITPase deficiency and the occurrence of adverse events.

 

System wide analyses have underestimated protein abundances and the importance of transcription in mammals

Jingyi Jessica Li, Peter J Bickel and Mark D Biggin

PeerJ 2:e270; http://dx.doi.org:/10.7717/peerj.270

Using individual measurements for 61 housekeeping proteins to rescale whole proteome data from Schwanhausser et al. (2011), we find that the median protein detected is expressed at 170,000 molecules per cell and that our corrected protein abundance estimates show a higher correlation with mRNA abundances than do the uncorrected protein data. In addition, we estimated the impact of further errors in mRNA and protein abundances using direct experimental measurements of these errors. The resulting analysis suggests that mRNA levels explain at least 56% of the differences in protein abundance for the 4,212 genes detected by Schwanhausser et al. (2011), though because one major source of error could not be estimated the true percent contribution should be higher. We also employed a second, independent strategy to determine the contribution of mRNA levels to protein expression. We show that the variance in translation rates directly measured by ribosome profiling is only 12% of that inferred by Schwanhausser et al. (2011), and that the measured and inferred translation rates correlate poorly (R² = 0.13). Based on this, our second strategy suggests that mRNA levels explain 81% of the variance in protein levels. We also determined the percent contributions of transcription, RNA degradation, translation and protein degradation to the variance in protein abundances using both of our strategies. While the magnitudes of the two estimates vary, they both suggest that transcription plays a more important role than the earlier studies implied and translation a much smaller role. Finally, the above estimates only apply to those genes whose mRNA and protein expression was detected. Based on a detailed analysis by Hebenstreit et al. (2012), we estimate that approximately 40% of genes in a given cell within a population express no mRNA. Since there can be no translation in the absence of mRNA, we argue that differences in translation rates can play no role in determining the expression levels for the 40% of genes that are non-expressed.

 

Related studies that reveal issues that are not part of this chapter:

  1. Ubiquitylation in relationship to tissue remodeling
  2. Post-translational modification of proteins
    1. Glycosylation
    2. Phosphorylation
    3. Methylation
    4. Nitrosylation
    5. Sulfation – sulfotransferases
      cell-matrix communication
    6. Acetylation and histone deacetylation (HDAC)
      Connecting Protein Phosphatase to 1α (PP1α)
      Acetylation complexes (such as CBP/p300 and PCAF)
      Sirtuins
      Rel/NF-kB Signal Transduction
      Homologous Recombination Pathway of Double-Strand DNA Repair
    7. Glycination
    8. cyclin dependent kinases (CDKs)
    9. lyase
    10. transferase

 

This year, the Lasker award for basic medical research went to Kazutoshi Mori (Kyoto University) and Peter Walter (University of California, San Francisco) for their “discoveries concerning the unfolded protein response (UPR) – an intracellular quality control system that detects harmful misfolded proteins in the endoplasmic reticulum and signals the nucleus to carry out corrective measures.”

About UPR: Approximately a third of cellular proteins pass through the Endoplasmic Reticulum (ER) which performs stringent quality control of these proteins. All proteins need to assume the proper 3-dimensional shape in order to function properly in the harsh cellular environment. Related to this is the fact that cells are under constant stress and have to make rapid, real time decisions about survival or death.

A major indicator of stress is the accumulation of unfolded proteins within the Endoplasmic Reticulum (ER), which triggers a transcriptional cascade in order to increase the folding capacity of the ER. If the metabolic burden is too great and homeostasis cannot be achieved, the response shifts from damage control to the induction of pro-apoptotic pathways that would ultimately cause cell death.

This response to unfolded proteins, the UPR, is conserved among all eukaryotes, and dysfunction in this pathway underlies many human diseases, including Alzheimer’s, Parkinson’s, diabetes and cancer.

 

The discovery of a new class of human proteins with previously unidentified activities

In a landmark study conducted by scientists at the Scripps Research Institute, The Hong Kong University of Science and Technology, aTyr Pharma and their collaborators, a new class of human proteins has been discovered. These proteins [nearly 250], called Physiocrines, belong to the aminoacyl tRNA synthetase gene family and carry out novel, diverse and distinct biological functions.

The aminoacyl tRNA synthetase gene family codes for a group of 20 ubiquitous enzymes, almost all of which are part of the protein synthesis machinery. The team made this discovery using recombinant protein purification, deep sequencing, mass spectrometry and cell-based assays. The finding is also significant because it highlights the alternate use of a gene family, whose protein products normally perform catalytic activities, for non-catalytic regulation of basic and complex physiological processes spanning metabolism, vascularization, stem cell biology and immunology.

 

Muscle maintenance and regeneration – key player identified

Muscle tissue suffers from atrophy with age and its regenerative capacity also declines over time. Most molecules discovered thus far to boost tissue regeneration are also implicated in cancers.  During a quest to find safer alternatives that can regenerate tissue, scientists reported that the hormone Oxytocin is required for proper muscle tissue regeneration and homeostasis and that its levels decline with age.

Oxytocin could be an alternative to hormone replacement therapy as a way to combat aging and other organ related degeneration.

Oxytocin is an age-specific circulating hormone that is necessary for muscle maintenance and regeneration (June 2014)

 

Proc Natl Acad Sci U S A. 2014 Sep 30;111(39):14289-94.   http://dx.doi.org:/10.1073/pnas.1407640111. Epub 2014 Sep 15.

Role of forkhead box protein A3 in age-associated metabolic decline.

Ma X, Xu L, Gavrilova O, Mueller E.

Aging is associated with increased adiposity and diminished thermogenesis, but the critical transcription factors influencing these metabolic changes late in life are poorly understood. We recently demonstrated that the winged helix factor forkhead box protein A3 (Foxa3) regulates the expansion of visceral adipose tissue in high-fat diet regimens; however, whether Foxa3 also contributes to the increase in adiposity and the decrease in brown fat activity observed during the normal aging process is currently unknown. Here we report that during aging, levels of Foxa3 are significantly and selectively up-regulated in brown and inguinal white fat depots, and that midage Foxa3-null mice have increased white fat browning and thermogenic capacity, decreased adipose tissue expansion, improved insulin sensitivity, and increased longevity. Foxa3 gain-of-function and loss-of-function studies in inguinal adipose depots demonstrated a cell-autonomous function for Foxa3 in white fat tissue browning. Furthermore, our analysis revealed that the mechanisms of Foxa3 modulation of brown fat gene programs involve the suppression of peroxisome proliferator activated receptor γ coactivator 1 α (PGC1α) levels through interference with cAMP responsive element binding protein 1-mediated transcriptional regulation of the PGC1α promoter.

 

Asymmetric mRNA localization contributes to fidelity and sensitivity of spatially localized systems

RJ Weatheritt, TJ Gibson & MM Babu
Nature Structural & Molecular Biology 24 Aug, 2014; 21: 833–839 http://dx.doi.org:/10.1038/nsmb.2876

Although many proteins are localized after translation, asymmetric protein distribution is also achieved by translation after mRNA localization. Why are certain mRNA transported to a distal location and translated on-site? Here we undertake a systematic, genome-scale study of asymmetrically distributed protein and mRNA in mammalian cells. Our findings suggest that asymmetric protein distribution by mRNA localization enhances interaction fidelity and signaling sensitivity. Proteins synthesized at distal locations frequently contain intrinsically disordered segments. These regions are generally rich in assembly-promoting modules and are often regulated by post-translational modifications. Such proteins are tightly regulated but display distinct temporal dynamics upon stimulation with growth factors. Thus, proteins synthesized on-site may rapidly alter proteome composition and act as dynamically regulated scaffolds to promote the formation of reversible cellular assemblies. Our observations are consistent across multiple mammalian species, cell types and developmental stages, suggesting that localized translation is a recurring feature of cell signaling and regulation.

 

An overview of the potential advantages conferred by distal-site protein synthesis, inferred from our analysis.

 

 

Turquoise and red filled circles represent off-target and correct interaction partners, respectively. Wavy lines represent a disordered region within a distal-site synthesis protein. Grey and red lines in graphs represent profiles of t…

http://www.nature.com/nsmb/journal/v21/n9/carousel/nsmb.2876-F5.jpg

 

Tweaking transcriptional programming for high quality recombinant protein production

Since overexpression of recombinant proteins in E. coli often leads to the formation of inclusion bodies, producing properly folded, soluble proteins is undoubtedly the most important end goal in a protein expression campaign. Various approaches have been devised to bypass the insolubility issues during E. coli expression and in a recent report a group of researchers discuss reprogramming the E. coli proteostasis [protein homeostasis] network to achieve high yields of soluble, functional protein. The premise of their studies is that the basal E. coli proteostasis network is insufficient, and often unable, to fold overexpressed proteins, thus clogging the folding machinery.

By overexpressing a mutant, negative-feedback-deficient heat shock transcription factor [σ32 I54N] before and during overexpression of the protein of interest, reprogramming can be achieved, resulting in high yields of soluble and functional recombinant target protein. The authors explain that this method is better than simply co-expressing or overexpressing chaperones, co-chaperones, foldases or other components of the proteostasis network, because reprogramming readies the folding machinery and upregulates the essential folding components beforehand, thus maintaining the capacity of the folding machinery.

The Heat-Shock Response Transcriptional Program Enables High-Yield and High-Quality Recombinant Protein Production in Escherichia coli (July 2014)

 

Unfolded proteins collapse when exposed to heat and crowded environments

Proteins are important molecules in our body and they fulfil a broad range of functions. For instance as enzymes they help to release energy from food and as muscle proteins they assist with motion. As antibodies they are involved in immune defence and as hormone receptors in signal transduction in cells. Until only recently it was assumed that all proteins take on a clearly defined three-dimensional structure – i.e. they fold in order to be able to assume these functions. Surprisingly, it has been shown that many important proteins occur as unfolded coils. Researchers seek to establish how these disordered proteins are capable at all of assuming highly complex functions.

Ben Schuler’s research group from the Institute of Biochemistry of the University of Zurich has now established that an increase in temperature leads to unfolded proteins collapsing and becoming smaller. Other environmental factors can trigger the same effect.

Measurements using the “molecular ruler”

“The fact that unfolded proteins shrink at higher temperatures is an indication that cell water does indeed play an important role as to the spatial organisation eventually adopted by the molecules”, comments Schuler with regard to the impact of temperature on protein structure. For their studies the biophysicists use what is known as single-molecule spectroscopy. Small colour probes in the protein enable the observation of changes with an accuracy of more than one millionth of a millimetre. With this “molecular yardstick” it is possible to measure how molecular forces impact protein structure.

With computer simulations the researchers have mimicked the behaviour of disordered proteins.
(Courtesy of Jose EDS Roselino, PhD.)

 

MLKL compromises plasma membrane integrity

Necroptosis is implicated in many diseases and understanding this process is essential in the search for new therapies. While mixed lineage kinase domain-like (MLKL) protein has been known to be a critical component of necroptosis induction, how MLKL transduces the death signal was not clear. In a recent finding, scientists demonstrated that the full four-helical bundle domain (4HBD) in the N-terminal region of MLKL is required and sufficient to induce its oligomerization and trigger cell death.

They also found a patch of positively charged amino acids on the surface of the 4HBD that binds phosphatidylinositol phosphates (PIPs) and allows MLKL to be recruited to the plasma membrane. There, MLKL proteins form pores, through which cells absorb excess water until they burst. Detailed knowledge of how MLKL proteins create pores offers possibilities for developing new therapeutic interventions for tolerating or preventing cell death.

MLKL compromises plasma membrane integrity by binding to phosphatidylinositol phosphates (May 2014)

 

Mitochondrial and ER proteins implicated in dementia

Mitochondria and the endoplasmic reticulum (ER) form tight structural associations that facilitate a number of cellular functions. However, the molecular mechanisms of these interactions aren’t properly understood.

A group of researchers showed that the ER protein VAPB interacts with the mitochondrial protein PTPIP51 to regulate ER-mitochondria associations, and that TDP-43, a protein implicated in dementia, perturbs this interaction and, with it, cellular Ca2+ homeostasis. These studies point to a new pathogenic mechanism for TDP-43 and may also provide a potential target for the development of new treatments for devastating neurological conditions like dementia.

ER-mitochondria associations are regulated by the VAPB-PTPIP51 interaction and are disrupted by ALS/FTD-associated TDP-43. Nature (June 2014)

 

A novel strategy to improve membrane protein expression in yeast

Membrane proteins play indispensable roles in the physiology of an organism. However, recombinant production of membrane proteins is one of the biggest hurdles facing protein biochemists today. A group of scientists in Belgium showed that enhanced expression of recombinant membrane proteins in yeast can be achieved by increasing intracellular membrane production through interference with a key enzymatic step of lipid synthesis.

Specifically, they engineered the oleaginous yeast Yarrowia lipolytica by deleting the phosphatidic acid phosphatase gene, PAH1, which led to massive proliferation of endoplasmic reticulum (ER) membranes. For all 8 tested representatives of different integral membrane protein families, they obtained enhanced protein accumulation.

 

An unconventional method to boost recombinant protein levels

MazF is an mRNA interferase in E. coli that functions as a toxin and degrades cellular mRNA in a targeted fashion, at the “ACA” sequence. This degradation of cellular mRNA causes a precipitous drop in cellular protein synthesis. A group of scientists at the Robert Wood Johnson Medical School in New Jersey exploited the degeneracy of the genetic code to modify all “ACA” triplets within their gene of interest in such a way that the corresponding amino acid (threonine) remained unchanged. Consequently, induction of the MazF toxin caused degradation of E. coli cellular mRNA, but transcription of the recombinant gene and synthesis of its protein continued, causing significant accumulation of high-quality target protein. This expression system enables unparalleled signal-to-noise ratios that could dramatically simplify structural and functional studies of difficult-to-purify, biologically important proteins.
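The recoding idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' protocol: the function names (`translate`, `remove_aca`) and the example sequence are invented here, the standard genetic code is assumed, and the sketch also removes ACA triplets that straddle two codons (where the affected amino acid need not be threonine) by swapping in a synonymous codon, accepting a swap only if it creates no new ACA site.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def aca_sites(seq):
    """Start positions of every ACA triplet, in any frame (overlaps included)."""
    return {i for i in range(len(seq) - 2) if seq[i:i + 3] == "ACA"}


def remove_aca(seq):
    """Remove ACA triplets by synonymous codon swaps (hypothetical helper).

    Returns the recoded sequence and a list of positions that could not be
    removed without changing the protein or creating a new ACA site.
    """
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    unresolved = []
    pos = seq.find("ACA")
    while pos != -1:
        current = aca_sites(seq)
        replacement = None
        # The triplet at pos overlaps one codon (in frame) or two (out of frame).
        for start in sorted({(pos // 3) * 3, ((pos + 2) // 3) * 3}):
            codon = seq[start:start + 3]
            for alt, aa in CODON_TABLE.items():
                if aa != CODON_TABLE[codon] or alt == codon:
                    continue
                candidate = seq[:start] + alt + seq[start + 3:]
                sites = aca_sites(candidate)
                # Accept only if this site is gone and no new site appeared.
                if pos not in sites and sites <= current:
                    replacement = candidate
                    break
            if replacement is not None:
                break
        if replacement is None:
            unresolved.append(pos)
        else:
            seq = replacement
        pos = seq.find("ACA", pos + 1)
    return seq, unresolved


# Invented example: one in-frame ACA (Thr) and two that straddle codons.
original = "ATGACAGGACATTACAAATAA"
recoded, unresolved = remove_aca(original)
print(recoded, unresolved)
print(translate(original) == translate(recoded))
```

A real design would also weigh codon usage in the host and mRNA structure; the sketch checks only that the protein sequence is preserved and that no ACA remains.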

 

Tandem fusions and bacterial strain evolution for enhanced functional membrane protein production

Producing membrane proteins remains a significant obstacle to their characterization and structure determination. Although a variety of host cell types are available, E. coli remains the popular choice for producing recombinant membrane proteins. A group of scientists in the Netherlands devised a robust strategy to increase the probability of functional membrane protein overexpression in E. coli.

By fusing Green Fluorescent Protein (GFP) and the Erythromycin Resistance protein (ErmC) to the C-terminus of a target membrane protein, they were able to track the folding state of their target protein while using erythromycin to select for increased expression. By increasing the erythromycin concentration in the growth media and testing different membrane targets, they were able to identify four evolved E. coli strains, all of which carried a mutation in the hns gene, whose product is implicated in genome organization and transcriptional silencing. Through their experiments the group showed that partial removal of the transcriptional silencing mechanism was related to production of proteins that were essential for functional overexpression of membrane proteins.

 

The role of an anti-apoptotic factor in recombinant protein production

In a recent study, scientists at the Johns Hopkins University and Frederick National Laboratory for Cancer Research examined an alternative method of utilizing the benefits of anti-apoptotic gene expression to enhance the transient expression of biotherapeutics, specifically, through the co-transfection of Bcl-xL along with the product-coding target gene.

Chinese Hamster Ovary (CHO) cells were co-transfected with the product-coding gene and a vector containing Bcl-xL, using polyethylenimine (PEI) reagent. They found that the cells co-transfected with Bcl-xL demonstrated reduced apoptosis, increased specific productivity, and an overall increase in product yield.

B-cell lymphoma-extra-large (Bcl-xL) is a mitochondrial transmembrane protein and a member of the Bcl-2 family of proteins, which are known to act as either pro- or anti-apoptotic proteins. Bcl-xL itself acts as an anti-apoptotic molecule by preventing the release of mitochondrial contents such as cytochrome c, which would lead to caspase activation. Higher levels of Bcl-xL push a cell toward survival mode by making the membrane pores less permeable and less leaky.

Introduction to Protein Synthesis and Degradation Updated 8/31/2019

N-Terminal Degradation of Proteins: The N-End Rule and N-degrons

In both prokaryotes and eukaryotic mitochondria and chloroplasts, the ribosomal synthesis of proteins is initiated with an N-formyl methionine residue. In eukaryotic cytosolic ribosomes, however, the N terminus was assumed to be devoid of the N-formyl group. The unformylated N-terminal methionine residue of eukaryotic proteins is often N-acetylated (Ac), which creates specific degradation signals recognized by the Ac/N-end rule pathway. The N-end rule pathways are proteolytic systems that recognize these N-degrons, resulting in proteasomal degradation or autophagy. In yeast, cytosolic N-formylation is stimulated by stationary phase or starvation for specific amino acids, and the degradation of the resulting formylated proteins depends on the Psh1 E3 ligase.

Two papers in the journal Science describe this N-degron in more detail.

Structured Abstract
INTRODUCTION

In both bacteria and eukaryotic mitochondria and chloroplasts, the ribosomal synthesis of proteins is initiated with the N-terminal (Nt) formyl-methionine (fMet) residue. Nt-fMet is produced pretranslationally by formyltransferases, which use 10-formyltetrahydrofolate as a cosubstrate. By contrast, proteins synthesized by cytosolic ribosomes of eukaryotes were always presumed to bear unformylated N-terminal Met (Nt-Met). The unformylated Nt-Met residue of eukaryotic proteins is often cotranslationally Nt-acetylated, a modification that creates specific degradation signals, Ac/N-degrons, which are targeted by the Ac/N-end rule pathway. The N-end rule pathways are a set of proteolytic systems whose unifying feature is their ability to recognize proteins containing N-degrons, thereby causing the degradation of these proteins by the proteasome or autophagy in eukaryotes and by the proteasome-like ClpAP protease in bacteria. The main determinant of an N‑degron is a destabilizing Nt-residue of a protein. Studies over the past three decades have shown that all 20 amino acids of the genetic code can act, in cognate sequence contexts, as destabilizing Nt‑residues. The previously known eukaryotic N-end rule pathways are the Arg/N-end rule pathway, the Ac/N-end rule pathway, and the Pro/N-end rule pathway. Regulated degradation of proteins and their natural fragments by the N-end rule pathways has been shown to mediate a broad range of biological processes.

RATIONALE

The chemical similarity of the formyl and acetyl groups and their identical locations in, respectively, Nt‑formylated and Nt-acetylated proteins led us to suggest, and later to show, that the Nt-fMet residues of nascent bacterial proteins can act as bacterial N-degrons, termed fMet/N-degrons. Here we wished to determine whether Nt-formylated proteins might also form in the cytosol of a eukaryote such as the yeast Saccharomyces cerevisiae and to determine the metabolic fates of Nt-formylated proteins if they could be produced outside mitochondria. Our approaches included molecular genetic techniques, mass spectrometric analyses of proteins’ N termini, and affinity-purified antibodies that selectively recognized Nt-formylated reporter proteins.

RESULTS

We discovered that the yeast formyltransferase Fmt1, which is imported from the cytosol into the mitochondria inner matrix, can generate Nt-formylated proteins in the cytosol, because the translocation of Fmt1 into mitochondria is not as efficacious, even under unstressful conditions, as had previously been assumed. We also found that Nt‑formylated proteins are greatly up-regulated in stationary phase or upon starvation for specific amino acids. The massive increase of Nt-formylated proteins strictly requires the Gcn2 kinase, which phosphorylates Fmt1 and mediates its retention in the cytosol. Notably, the ability of Gcn2 to retain a large fraction of Fmt1 in the cytosol of nutritionally stressed cells is confined to Fmt1, inasmuch as the Gcn2 kinase does not have such an effect, under the same conditions, on other examined nuclear DNA–encoded mitochondrial matrix proteins. The Gcn2-Fmt1 protein localization circuit is a previously unknown signal transduction pathway. A down-regulation of cytosolic Nt‑formylation was found to increase the sensitivity of cells to undernutrition stresses, to a prolonged cold stress, and to a toxic compound. We also discovered that the Nt-fMet residues of Nt‑formylated cytosolic proteins act as eukaryotic fMet/N-degrons and identified the Psh1 E3 ubiquitin ligase as the recognition component (fMet/N-recognin) of the previously unknown eukaryotic fMet/N-end rule pathway, which destroys Nt‑formylated proteins.

CONCLUSION

The Nt-formylation of proteins, a long-known pretranslational protein modification, is mediated by formyltransferases. Nt-formylation was thought to be confined to bacteria and bacteria-descended eukaryotic organelles but was found here to also occur at the start of translation by the cytosolic ribosomes of a eukaryote. The levels of Nt‑formylated eukaryotic proteins are greatly increased upon specific stresses, including undernutrition, and appear to be important for adaptation to these stresses. We also discovered that Nt-formylated cytosolic proteins are selectively destroyed by the eukaryotic fMet/N-end rule pathway, mediated by the Psh1 E3 ubiquitin ligase. This previously unknown proteolytic system is likely to be universal among eukaryotes, given strongly conserved mechanisms that mediate Nt‑formylation and degron recognition.

The eukaryotic fMet/N-end rule pathway.

(Top) Under undernutrition conditions, the Gcn2 kinase augments the cytosolic localization of the Fmt1 formyltransferase, and possibly also its enzymatic activity. Consequently, Fmt1 up-regulates the cytosolic fMet–tRNAi (initiator transfer RNA), and thereby increases the levels of cytosolic Nt-formylated proteins, which are required for the adaptation of cells to specific stressors. (Bottom) The Psh1 E3 ubiquitin ligase targets the N-terminal fMet-residues of eukaryotic cytosolic proteins, such as Cse4, Pgd1, and Rps22a, for the polyubiquitylation-mediated, proteasome-dependent degradation.

 

A glycine-specific N-degron pathway mediates the quality control of protein N-myristoylation. Richard T. Timms, Zhiqian Zhang, David Y. Rhee, J. Wade Harper, Itay Koren, Stephen J. Elledge

Science  05 Jul 2019: Vol. 365, Issue 6448

The second paper describes a glycine-specific N-degron pathway in humans. Specifically, the authors set up a screen to identify N-terminal degron motifs in human proteins. Findings included an expanded repertoire for the UBR family E3 ligases, which now includes substrates that begin with arginine or lysine following an intact initiator methionine, and the discovery that glycine at the extreme N terminus acts as a potent degron.

Glycine N-degron regulation revealed

For more than 30 years, N-terminal sequences have been known to influence protein stability, but additional features of these N-end rule, or N-degron, pathways continue to be uncovered. Timms et al. used a global protein stability (GPS) technology to take a broader look at these pathways in human cells. Unexpectedly, glycine exposed at the N terminus could act as a potent degron; proteins bearing N-terminal glycine were targeted for proteasomal degradation by two Cullin-RING E3 ubiquitin ligases through the substrate adaptors ZYG11B and ZER1. This pathway may be important, for example, to degrade proteins that fail to localize properly to cellular membranes and to destroy protein fragments generated during cell death.

Science, this issue p. eaaw4912

Structured Abstract

INTRODUCTION

The ubiquitin-proteasome system is the major route through which the cell achieves selective protein degradation. The E3 ubiquitin ligases are the major determinants of specificity in this system, which is thought to be achieved through their selective recognition of specific degron motifs in substrate proteins. However, our ability to identify these degrons and match them to their cognate E3 ligase remains a major challenge.

RATIONALE

It has long been known that the stability of proteins is influenced by their N-terminal residue, and a large body of work over the past three decades has characterized a collection of N-end rule pathways that target proteins for degradation through N-terminal degron motifs. Recently, we developed Global Protein Stability (GPS)–peptidome technology and used it to delineate a suite of degrons that lie at the extreme C terminus of proteins. We adapted this approach to examine the stability of the human N terminome, allowing us to reevaluate our understanding of N-degron pathways in an unbiased manner.

RESULTS

Stability profiling of the human N terminome identified two major findings: an expanded repertoire for UBR family E3 ligases to include substrates that begin with arginine and lysine following an intact initiator methionine and, more notably, that glycine positioned at the extreme N terminus can act as a potent degron. We established human embryonic kidney 293T reporter cell lines in which unstable peptides that bear N-terminal glycine degrons were fused to green fluorescent protein, and we performed CRISPR screens to identify the degradative machinery involved. These screens identified two Cul2 Cullin-RING E3 ligase complexes, defined by the related substrate adaptors ZYG11B and ZER1, that act redundantly to target substrates bearing N-terminal glycine degrons for proteasomal degradation. Moreover, through the saturation mutagenesis of example substrates, we defined the composition of preferred N-terminal glycine degrons specifically recognized by ZYG11B and ZER1.

We found that preferred glycine degrons are depleted from the native N termini of metazoan proteomes, suggesting that proteins have evolved to avoid degradation through this pathway, but are strongly enriched at annotated caspase cleavage sites. Stability profiling of N-terminal peptides lying downstream of all known caspase cleavage sites confirmed that Cul2ZYG11B and Cul2ZER1 could make a substantial contribution to the removal of proteolytic cleavage products during apoptosis. Last, we identified a role for ZYG11B and ZER1 in the quality control of N-myristoylated proteins. N-myristoylation is an important posttranslational modification that occurs exclusively on N-terminal glycine. By profiling the stability of the human N-terminome in the absence of the N-myristoyltransferases NMT1 and NMT2, we found that a failure to undergo N-myristoylation exposes N-terminal glycine degrons that are otherwise obscured. Thus, conditional exposure of glycine degrons to ZYG11B and ZER1 permits the selective proteasomal degradation of aberrant proteins that have escaped N-terminal myristoylation.

CONCLUSION

These data demonstrate that an additional N-degron pathway centered on N-terminal glycine regulates the stability of metazoan proteomes. Cul2ZYG11B– and Cul2ZER1-mediated protein degradation through N-terminal glycine degrons may be particularly important in the clearance of proteolytic fragments generated by caspase cleavage during apoptosis and in the quality control of protein N-myristoylation.

The glycine N-degron pathway.

Stability profiling of the human N-terminome revealed that N-terminal glycine acts as a potent degron. CRISPR screening revealed two Cul2 complexes, defined by the related substrate adaptors ZYG11B and ZER1, that recognize N-terminal glycine degrons. This pathway may be particularly important for the degradation of caspase cleavage products during apoptosis and the removal of proteins that fail to undergo N-myristoylation.

 


Metabolomics is about Metabolic Systems Integration

Author and Curator: Larry H Bernstein, MD, FCAP 

 

This is an exploration of biological thoughts in the series on metabolomics, putting enzymatic reactions, proteins and protein conformation, and subanatomic structure into a more complete perspective in order to understand normal and dysfunctional states of eukaryotic cells and organ systems and prokaryotic organisms. There are structures and functions that have evolved with concordance, even if we find variations on themes. Moreover, these have to be understood from a systems-oriented view to have any clarity, which is currently an ongoing proposal.
It is perhaps relevant to quote Radoslav Bosov on his observation:

“After finishing her portion of the work on DNA, Franklin led pioneering work on the tobacco mosaic virus and the polio virus. She died in 1958 at the age of 37 of ovarian cancer.”  My job is to illuminate what is cancer, but serving structural identity issues.

DNA is not DNA, as RNA is not RNA as proteins are not Proteins, there is only time – interference of particles/strings/waves within ever emerging discrete relative spaces where energy transforms from one absolute form into another!

He adds the following: “A 2005 study showed methionine restriction without energy restriction extends mouse lifespan.” BUT balancing energy is not the same as balancing matter, due to quantum electrodynamics interference and transformability – http://en.wikipedia.org/wiki/Methionine

I have made the following calculations!

1 – methyl groups = i Ln(1 – Lactate)/Ln(Oxygen) – K(O) =

i Ln(1/Sqrt(1 – Acetate^2))/Ln(Oxygen) – K(O) = i Ln(Glyoxylate)/Ln(Oxygen) – K(O)

where K(O) – mechanical electro magnetics pressure, with increase of T, increase of S (entropy), and 1-S = negative entropy

But don’t try to realize the path of derivation, it would get you in dark matter issues – water!

The problem seems to be:

  1. Methionine is necessary to provide S for acetyl CoA
  2. Insufficiency of this amino acid has consequences, which leads to increased homocysteine
  3. This imbalance is also associated with a decrease in lean body mass
  4. Of course, geographic location, proximity to volcanic ash, temperate zone, and food source are all relevant variables

JEDS Roselino has referred to the important conclusion in Erwin Schroedinger’s “What is Life?”, and to Schroedinger’s cat: it is impossible to come up with a predictive equation to explain life.
It had to come from a founder of “Quantum Mechanics” because, unlike economics, physics is a science based on experimental validation. As physicists such as Max Delbrück entered biology to make it more rigorous, joining a line of work that includes the Coris, Beadle and Tatum, Hershey, Luria, Dulbecco, Kornberg and Ochoa, Lipmann, and Watson and Crick, a discipline called “Molecular Biology and Biochemistry” emerged that would open the secrets of life. Beadle and Tatum gave us “one gene – one enzyme”, a formulation that led medical teaching from William Osler’s edict to “Inherited Metabolic Disorders”: gene-related disruptions of the chemical reactions taking place in the body to convert or use energy. Physiological chemistry taught:

  1. Breaking down the carbohydrates, proteins, and fats in food to release energy.
  2. Transforming excess nitrogen into waste products excreted in urine.
  3. Breaking down or converting chemicals into other substances and transporting them inside cells.

Metabolism is an organized but chaotic chemical assembly line. Raw materials, half-finished products, and waste materials are constantly being used, produced, transported, and excreted. The “workers” on the assembly line are enzymes and other proteins that make chemical reactions happen. – http://www.webmd.com/a-to-z-guides/inherited-metabolic-disorder-types-and-treatments

The original cause of most genetic metabolic disorders is a gene mutation that occurred many, many generations ago. Each inherited metabolic disorder is quite rare in the general population; taken together, they affect about 1 in 1,000 to 2,500 newborns. But recent developments have refocused emphasis on HOW a gene mutation occurs and is passed on through generations. This had to be derived initially from methods developed in prokaryotes in order to reduce the complexity. However, complexity came from evolutionary events over a long time span.

Part I. Transcription regulation

The timing is right

R Magnus N Friis  & Michael C Schultz

Nature Structural & Molecular Biology 07 Oct 2014; 21: 846–847
http://dx.doi.org:/10.1038/nsmb.2898

Yeast cells display synchronized oscillation between

  • phases of high and low oxygen consumption
  • accompanied by a program of cyclical gene expression.

A study monitoring

  • mRNA levels,
  • histone modifications and
  • chromatin occupancy of histone modifiers

during the yeast metabolic cycle (YMC) at high temporal resolution reveals both

  • ‘just-in-time’ supply of YMC gene products and
  • new patterns of chromatin reconfiguration

associated with transcriptional regulation.

Figure 1: The yeast metabolic cycle.

The YMC is divided into metabolic phases that correspond to periods of high and low oxygen concentration in the culture medium. The program of gene (mRNA) expression during the YMC is composed of successive reductive-charging (RC),…
http://www.nature.com/nsmb/journal/v21/n10/carousel/nsmb.2898-F1.jpg

Figure 2: Modes of transcriptional regulation during the YMC.

(a) Previous work on cycling cells in batch culture revealed that H3K4me3 is typically limited to the promoter region of active genes (MET16 shown here 9, 10). (b) During the YMC, however, the OX gene RMT2 is marked by H3K4me3 regardles…

http://www.nature.com/nsmb/journal/v21/n10/carousel/nsmb.2898-F2.jpg

High-temporal-resolution view of transcription and chromatin states across distinct metabolic states in budding yeast

Z Kuang, L Cai, X Zhang, H Ji, BP Tu  & JD Boeke

Nature Structural & Molecular Biology 31 Aug, 2014; 21: 854–863
http://dx.doi.org:/10.1038/nsmb.2881

Under continuous, glucose-limited conditions, budding yeast exhibit

  1. robust metabolic cycles
  2. associated with major oscillations of gene expression.

We examine the correlated

  1. genome-wide transcription and chromatin states
  2. across the yeast metabolic cycle
  3. at unprecedented temporal resolution,
  4. revealing a ‘just-in-time supply chain’

by which components from specific cellular processes such as ribosome biogenesis become available in a highly coordinated manner. We identify

  1. distinct chromatin and splicing patterns
  2. associated with different gene categories and
  3. determine the relative timing of chromatin modifications
  4. relative to maximal transcription.

There is unexpected variation in the chromatin modification and expression relationship, with

  1. histone acetylation peaks occurring with
  2. varying timing and ‘sharpness’ relative to RNA expression
  3. both within and between cycle phases.

Chromatin-modifier occupancy reveals subtly distinct spatial and temporal patterns compared to those of the modifications themselves.
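The notion of timing a chromatin modification relative to maximal transcription can be made concrete with a small sketch. It is an illustration only, not the authors' analysis: the function names, the 300-minute cycle length and the sample values are invented, and the peak is taken crudely as the largest sampled value. Because sampling over the cycle may be uneven, the lag is computed from sampling times rather than from sample indices, and it is wrapped around the cycle so that a mark peaking just before the cycle restarts is still reported as slightly early.

```python
def peak_time(times, values):
    """Return the sampling time at which `values` is largest."""
    if len(times) != len(values) or not times:
        raise ValueError("times and values must be non-empty and equally long")
    best_index = max(range(len(values)), key=lambda i: values[i])
    return times[best_index]


def peak_lag(times, rna, mark, period):
    """Signed lag of the mark peak relative to the RNA peak.

    The result is wrapped into [-period/2, period/2): negative means the
    mark peaks before the RNA, positive means it peaks after.
    """
    raw = peak_time(times, mark) - peak_time(times, rna)
    half = period / 2
    return (raw + half) % period - half


# Invented data: six unevenly spaced samples over a hypothetical 300-minute cycle.
times = [0, 30, 50, 120, 200, 270]
rna = [1.0, 4.0, 9.0, 3.0, 1.5, 1.0]          # RNA peaks at t = 50
acetylation = [2.0, 8.0, 5.0, 2.0, 1.0, 1.5]  # mark peaks at t = 30
print(peak_lag(times, rna, acetylation, period=300))  # -20.0
```

Comparing such lags across genes and phases is one simple way to picture the varying timing described above; the "sharpness" of a peak would need a separate measure, such as its width at half maximum.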

Figure 1: High-temporal-resolution analysis of gene expression reveals meticulous temporal compartmentalization in yeast.

(a) Oscillation of oxygen (dO2) in the YMC. The 16 time points of one cycle for RNA-seq are labeled. Metabolic phases are color coded throughout figures: magenta, OX phase; green, RB phase; blue, RC phase. (b–d) Subtly distinct tempor…

http://www.nature.com/nsmb/journal/v21/n10/carousel/nsmb.2881-F1.jpg

Figure 2: RNA-seq analysis at introns reveals transient accumulation of pre-mRNAs during OX phase.

Relative RNA signals at intron-containing genes. Each track represents relative RNA levels at one of 16 time points, ordered sequentially from top to bottom. Signals are displayed as a percentage of the maximum value of the 16 time…
http://www.nature.com/nsmb/journal/v21/n10/carousel/nsmb.2881-F2.jpg

Figure 3: Dynamic chromatin states across the YMC.

(a) Oscillation of oxygen in one YMC. Cycling cells were collected at 16 intentionally uneven time points over one cycle for ChIP-seq. (b,c) Temporal relationship between RNA level and histone modifications at the RMT2 locus. (b) RNA…

http://www.nature.com/nsmb/journal/v21/n10/carousel/nsmb.2881-F3.jpg

Part 2. Structure of metabolic channeling

Enzyme clustering accelerates processing of intermediates through metabolic channeling

Michele Castellana, Maxwell Z Wilson, Yifan Xu, Preeti Joshi, Ileana M Cristea, Joshua D Rabinowitz, Zemer Gitai & Ned S Wingreen

Nature Biotechnology (2014)32, 1011–1018
http://dx.doi.org:/10.1038/nbt.3018

We present a quantitative model to demonstrate that

  • coclustering multiple enzymes into compact agglomerates
  • accelerates the processing of intermediates,
  • yielding the same efficiency benefits as direct channeling,

a well-known mechanism in which intermediates are funneled between enzyme active sites through a physical tunnel. The model predicts

  • the separation and size of coclusters that maximize metabolic efficiency,
  • and this prediction is in agreement with previously reported spacings between coclusters in mammalian cells.

For direct validation, we study a metabolic branch point in Escherichia coli and experimentally confirm the model prediction that enzyme agglomerates can

  • accelerate the processing of a shared intermediate by one branch, and thus
  • regulate steady-state flux division.

Our studies establish a quantitative framework to understand coclustering-mediated metabolic channeling.

Figure 1: Different types of intermediate channeling in a two-step metabolic pathway, where a substrate is processed by enzyme E1 and turned into intermediate, which is then processed by enzyme E2 and turned into product.

(a) Direct channeling. The intermediate is funneled from enzyme E1 to enzyme E2 by means of a protein tunnel that connects the active sites of E1 and E2, thus preventing the intermediate from diffusing away. (b) Proximity channeling.
http://www.nature.com/nbt/journal/v32/n10/carousel/nbt.3018-F1.jpg

Figure 2: Two-step metabolic pathway with an unstable intermediate.

(a) The two-step metabolic pathway. Substrate S0 is processed by enzyme E1 and turned into intermediate S1, which is then processed by enzyme E2 and turned into product P. (b) Enzyme configurations in the two-step metabolic pathway. Le…
http://www.nature.com/nbt/journal/v32/n10/carousel/nbt.3018-F2.jpg
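
The intuition behind the two-step pathway with an unstable intermediate can be illustrated with a very small sketch. This is not the reaction-diffusion model of the paper; it is a well-mixed toy calculation in which each molecule of intermediate S1 is either processed by E2 or lost, and clustering is assumed simply to raise the effective processing rate. The function name and all rate values are hypothetical and chosen only for illustration.

```python
def channeling_efficiency(k_process: float, k_loss: float) -> float:
    """Fraction of intermediate S1 that is converted to product P.

    Toy assumption: each S1 molecule is either processed by E2
    (rate k_process) or lost to decay or escape (rate k_loss),
    so the converted fraction is k_process / (k_process + k_loss).
    """
    if k_process < 0 or k_loss < 0 or (k_process + k_loss) == 0:
        raise ValueError("rates must be non-negative and not both zero")
    return k_process / (k_process + k_loss)


# Hypothetical rates (per second), not taken from the paper.
K_LOSS = 1.0
dispersed = channeling_efficiency(k_process=0.1, k_loss=K_LOSS)
clustered = channeling_efficiency(k_process=10.0, k_loss=K_LOSS)
print(f"dispersed enzymes: {dispersed:.2f}, clustered enzymes: {clustered:.2f}")
```

In this sketch the only effect of coclustering is a larger `k_process`; predicting the cluster size and spacing that maximize efficiency requires the spatial model described in the paper.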

Part 3. Antibiotics directed at specific DNA sequences

Sequence-specific antimicrobials using efficiently delivered RNA-guided nucleases

Robert J Citorik, Mark Mimee & Timothy K Lu

Nature Biotechnology 21 Sep 2014;
http://dx.doi.org:/10.1038/nbt.3011

Current antibiotics tend to be broad spectrum, leading to

  • indiscriminate killing of commensal bacteria and
  • accelerated evolution of drug resistance.

Here, we use CRISPR-Cas technology to create antimicrobials

  • whose spectrum of activity is chosen by design.

RNA-guided nucleases (RGNs) targeting specific DNA sequences are delivered efficiently to microbial populations using bacteriophage or bacteria carrying plasmids transmissible by conjugation. The DNA targets of RGNs can be

  • undesirable genes or polymorphisms,
  • including antibiotic resistance and virulence determinants in
  1. carbapenem-resistant Enterobacteriaceae and
  2. enterohemorrhagic Escherichia coli.

Delivery of RGNs significantly improves survival in a Galleria mellonella infection model. We also show that

  • RGNs enable modulation of complex bacterial populations
  • by selective knockdown of targeted strains
  • based on genetic signatures.

RGNs constitute a class of highly discriminatory, customizable antimicrobials that enact

  • selective pressure at the DNA level to
  1. reduce the prevalence of undesired genes,
  2. minimize off-target effects and
  3. enable programmable remodeling of microbiota.
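
The sequence specificity described above can be sketched in a few lines. The example assumes a Cas9-type nuclease that requires an exact match to a 20-nucleotide guide (spacer) followed by an NGG motif; that requirement, the function name, and every sequence below are illustrative assumptions rather than details taken from the paper. Only the forward strand is scanned, and real nucleases can tolerate some mismatches, so this is a simplification.

```python
import re


def find_rgn_targets(sequence: str, spacer: str) -> list[int]:
    """Return 0-based start positions where `spacer` is followed by NGG.

    Assumption: exact spacer match plus an NGG motif immediately
    downstream, forward strand only.
    """
    sequence = sequence.upper()
    spacer = spacer.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(f"(?={re.escape(spacer)}[ACGT]GG)")
    return [match.start() for match in pattern.finditer(sequence)]


# Hypothetical 20-nt guide and toy "genomes".
SPACER = "GATTACAGATTACAGATTAC"
target_strain = "TTT" + SPACER + "AGG" + "CCC"                   # carries the target
commensal_strain = "TTT" + "GATTACAGATTACAGATTAG" + "AGG" + "CCC"  # one-base difference
no_motif_strain = "TTT" + SPACER + "ACT" + "CCC"                 # target without NGG

for name, genome in [("target", target_strain),
                     ("commensal", commensal_strain),
                     ("no motif", no_motif_strain)]:
    print(name, find_rgn_targets(genome, SPACER))
```

Only the strain carrying the exact target next to the motif yields a hit, which mirrors the idea that the spectrum of activity is set by the chosen sequence rather than by the species.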

Figure 1: RGN constructs delivered by bacteriophage particles (ΦRGN) exhibit efficient and specific antimicrobial effects against strains harboring plasmid or chromosomal target sequences

(a) Bacteriophage-delivered RGN constructs differentially affect host cell physiology in a sequence-dependent manner. If the target sequence is: (i) absent, the RGN exerts no effect; (ii) chromosomal, RGN activity is cytotoxic; (iii) e…

http://www.nature.com/nbt/journal/vaop/ncurrent/carousel/nbt.3011-F1.jpg

Figure 3: ΦRGN particles elicit sequence-specific toxicity against enterohemorrhagic E. coli in vitro and in vivo.

(a) E. coli EMG2 wild-type (WT) cells or ATCC 43888 F′ (EHEC) cells were treated with SM buffer, ΦRGNndm-1 or ΦRGNeae at a multiplicity of infection (MOI) ~100 and plated onto LB agar to enumerate total cell number or LB+kanamycin (Km)…
http://www.nature.com/nbt/journal/vaop/ncurrent/carousel/nbt.3011-F3.jpg

Part 4. Structure and Isoform functions

Structures of human constitutive nitric oxide synthases.

H Li, J Jamal, C Plaza, SH Pineda, G Chreifi, Q Jing, MA Cinelli, RB Silverman, TL Poulos
Acta Crystallographica Section D Biological Crystallography (Impact Factor: 12.67). 10/2014; 70(Pt 10):2667-74.
http://dx.doi.org:/10.1107/S1399004714017064

Mammals produce three isoforms of nitric oxide synthase (NOS):

  1. neuronal NOS (nNOS),
  2. inducible NOS (iNOS) and
  3. endothelial NOS (eNOS).

The overproduction of NO by nNOS is associated with a number of neurodegenerative disorders; therefore, a desirable therapeutic goal is

  • the design of drugs that target nNOS
  • but not the other isoforms.

Crystallography, coupled with computational approaches and medicinal chemistry, has played a critical role in developing highly

  • selective nNOS inhibitors that
  • exhibit exceptional neuroprotective properties.

For historic reasons, crystallography has focused on rat nNOS and bovine eNOS because these were available in high quality; thus, their structures have been used in

  • structure-activity-relationship studies.

Although these constitutive NOSs share more than 90% sequence identity across mammalian species for each NOS isoform,

  • inhibitor-binding studies revealed that subtle differences near the heme active site
  • in the same NOS isoform across species still impact enzyme-inhibitor interactions.

Therefore, structures of the human constitutive NOSs are indispensable. Here, the first structure of human neuronal NOS at 2.03 Å resolution is reported and a different crystal form of human endothelial NOS is reported at 1.73 Å resolution.

“We are learning more about less and less” – PJ Russell. 1973.

Part 5.  Global Metabolomics

Global Metabolomics Market (Technique, Application, Indication and Geography) – Size, Application Analysis, Regional Outlook, Competitive Strategies and Forecasts, 2014 – 2020

Metabolomics is

  • the study of chemical processes which involve metabolites.

Metabolites are small molecules present in the blood, tissues and urine. Metabolomics pertains to the study of the

  • unique chemical fingerprints left behind by cellular processes.

These metabolite fingerprints could be used to learn about the health of an organism. Metabolomics is an emerging technology in the field of analytical biochemistry, and it has become an experimental technique that can be applied in medicine, biology and environmental science. The incorporation of computers has enabled the creation of computational metabolomics, which has applications in the life sciences.

Metabolomics finds application in other areas as well; for instance, it is used to assess the quality, taste and nutritional value of food in the food science field.

The metabolomics market is segmented based on its application in different fields such as

  • biomarkers discovery,
  • drug discovery,
  • toxicology testing,
  • nutrigenomics,
  • clinical studies etc.

The drug discovery segment holds the dominant share in the metabolomics market due to its crucial role in

  • drug target identification & validation and
  • optimization & prioritization of diagnostic approaches for oncology research.

The metabolomics market is expected to grow at a rapid rate due to the rise in the number of

  • preclinical & clinical trials,
  • advancements in toxicological studies and
  • growing awareness about nutritional products.

The stellar growth of data analysis software & solutions in metabolomics and its use in the biomarker screening of diseases would fuel the growth of the metabolomics market. The metabolomics market is also segmented based on techniques into

  • gas chromatography,
  • high performance liquid chromatography (HPLC),
  • ultra performance liquid chromatography, and
  • capillary electrophoresis.

HPLC holds the dominant share in the metabolomics market.

KEY BENEFITS

In-depth analysis of various regions would enable a clear understanding of current and future trends so that companies can make region specific plans

Comprehensive analysis of the factors that drive and restrict the growth of the metabolomics market

Key regulatory guidelines in various regions which impact the metabolomics market

Quantitative analysis of the current market

Deep dive analysis of various regions

Value chain analysis enables a clear understanding of the roles of the stakeholders involved in the supply chain of the metabolomics market

Market Segmentation

The metabolomics market is segmented based on techniques, applications, indication and geography

Techniques

Separation Method

  • Gas Chromatography
  • Capillary Electrophoresis
  • High Performance Liquid Chromatography
  • Ultra Performance Liquid Chromatography

Detection Methods

  • Nuclear Magnetic Resonance
  • Mass Spectrometry
  • Surface Base Mass Analysis

Application

  • Biomarkers Discovery
  • Drug Discovery
  • Toxicology Testing
  • Nutrigenomics
  • Clinical & Preclinical Studies

Indications

  • Oncology
  • Neurology
  • Cardiology
  • Others


Somatic, germ-cell, and whole sequence DNA in cell lineage and disease profiling

Curator: Larry H Bernstein, MD, FCAP

In humans, mitochondrial DNA spans about 16,500 DNA building blocks (base pairs), representing a small fraction of the total DNA in cells. Mitochondrial DNA contains 37 genes, all essential for normal mitochondrial function. Thirteen of them provide instructions for making enzymes involved in oxidative phosphorylation, which takes place at the inner mitochondrial membrane. The remaining 24 genes are transcribed into transfer RNA (tRNA) and ribosomal RNA (rRNA), which are needed to assemble amino acids into proteins.

Somatic mutations occur in the DNA of certain cells during a person’s lifetime and typically are not passed to future generations. They differ from germ-line mutations in mitochondrial DNA, which descend through the maternal line, in that they arise later in life. Mutations in sperm mitochondrial DNA are not carried on to future generations, as the sperm mitochondria are destroyed after the egg is fertilized.

There is limited evidence linking somatic mutations in mitochondrial DNA with certain cancers, including breast, colon, stomach, liver, and kidney tumors. These mutations might also be associated with cancer of blood-forming tissue (leukemia) and cancer of immune system cells (lymphoma). There are many heritable diseases that are related to germ-line mutations, and germ-line mutations have a role in many common diseases. Mitochondrial DNA is particularly vulnerable to the effects of reactive oxygen species (ROS), and because the mitochondrion has a limited ability to repair itself, ROS easily damage mitochondrial DNA. The repair mechanism is tied to the ubiquitination system. A list of disorders associated with mitochondrial genes is available on Wikipedia.

Inherited changes in mitochondrial DNA may be associated with pathologies in growth and development, and multiorgan system disorders, as mutations disrupt the mitochondria’s ability to generate the cell’s energy. The effects of these conditions are most pronounced in organs and tissues with high energy requirements (such as the heart, brain, and muscles). Although the health consequences of inherited mitochondrial DNA mutations vary widely, some frequently observed features include muscle weakness and wasting, problems with movement, diabetes, kidney failure, heart disease, loss of intellectual functions (dementia), hearing loss, and abnormalities involving the eyes and vision.

A buildup of somatic mutations in mitochondrial DNA has been associated with an increased risk of certain age-related disorders such as heart disease, Alzheimer disease, and Parkinson disease, and the severity of many mitochondrial disorders is thought to be associated with the percentage of mitochondria affected by a particular genetic change. Consequently, the progressive accumulation of these mutations over a person’s lifetime may play a role in aging.

Mitochondrial DNA is typically diagrammed as a circular structure with genes and regulatory regions labeled.

http://ghr.nlm.nih.gov/html/images/chromosomeIdeograms/mitochondria/wholeMitochondria.jpg

Additional Resources:

  • Additional NIH Resources – National Institutes of Health

NHGRI Talking Glossary: Mitochondrial DNA

mtDNA : The Eve Gene –  by Stephen Oppenheimer

Mutations are a cumulative dossier of our own maternal prehistory. The main task of DNA is to copy itself to each new generation. We can use these mutations to reconstruct a genetic tree of mtDNA, because each new mtDNA mutation in a prospective mother’s ovum will be transferred in perpetuity to all her descendants down the female line. Each new female line is thus defined by the old mutations as well as the new ones.

By looking at the DNA code in a sample of people alive today, and piecing together the changes in the code that have arisen down the generations, biologists can trace the line of descent back in time to a distant shared ancestor. Because we inherit mtDNA only from our mother, this line of descent is a picture of the female genealogy of the human species.

The diagram above shows the drawing of gene trees using single mutations

http://www.bradshawfoundation.com/journey/images/gene-diagram3.gif

Not only can we retrace the tree, but by taking into account where the sampled people came from, we can see where certain mutations occurred – for example, whether in Europe, or Asia, or Africa. What’s more, because the changes happen at a statistically consistent (though random) rate, we can approximate the time when they happened. This has made it possible, during the late 1990s and in the new century, for us to do something that anthropologists of the past could only have dreamt of: we can now trace the migrations of modern humans around our planet.

It turns out that the oldest changes in our mtDNA took place in Africa 150,000 – 190,000 years ago. Then new mutations start to appear in Asia, about 60,000 – 80,000 years ago. This tells us that modern humans evolved in Africa, and that some of us migrated out of Africa into Asia after 80,000 years ago.  A method established in 1996, which dates each branch of the gene tree by averaging the number of new mutations in daughter types of that branch, has stood the test of time.
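
The dating step described above, averaging the number of new mutations in the daughter types of a branch, can be sketched as follows. The mutation counts and the years-per-mutation calibration are hypothetical values chosen for illustration; the article does not give them, and a real analysis would also need an error estimate.

```python
def branch_age(mutation_counts, years_per_mutation):
    """Estimate the age of a branch from its daughter types.

    mutation_counts: number of new mutations separating each sampled
        daughter type from the founder type of the branch.
    years_per_mutation: assumed calibration (average years for one
        new mutation to arise along a lineage).
    Returns (average number of mutations, estimated age in years).
    """
    if not mutation_counts:
        raise ValueError("at least one daughter type is required")
    average = sum(mutation_counts) / len(mutation_counts)
    return average, average * years_per_mutation


# Hypothetical data: four daughter types and an assumed calibration.
counts = [2, 3, 4, 3]
average, age = branch_age(counts, years_per_mutation=20_000)
print(f"average new mutations: {average}, estimated age: {age:.0f} years")
```

Because mutations accumulate at a roughly constant average rate, a branch whose daughter types carry more new mutations on average is inferred to be older.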

A final point on the methods of genetic tracking of migrations: it is important to distinguish this new approach to tracing the history of molecules on a DNA tree, known as phylogeography (literally ‘tree-geography’), from the mathematical study of the history of whole human populations, which has been used for decades and is known as classical population genetics.

The two disciplines are based on the same Mendelian biological principles, but have quite different aims and assumptions, and the difference is the source of much misunderstanding and controversy. The simplest way of explaining it is that phylogeography studies the prehistory of individual DNA molecules, while population genetics studies the prehistory of populations. Put another way, each human population contains multiple versions of any particular DNA molecule, each with its own history and different origin.

The diagram above shows the tracing of gene spread geographically.
Green disks represent migrant new growth on the tree
http://www.bradshawfoundation.com/journey/images/gene-diagram4.gif

http://www.bradshawfoundation.com/journey/eve.html

David Moskowitz, MD, PhD
Founder and President, GenoMed

 

Germline genes make the best drug targets

  • They operate earliest in the disease pathway
  • Unlike tissue-expressed genes, which operate years after the disease began
  • But which everybody else is using as drug targets

Variation in germline DNA is where all disease starts

  • Cancer patients overexpress oncogenes and underexpress tumor suppressors

beginning in their germline DNA

  • Mutations in tumor DNA are “private”
  • Each tumor is a “snowflake”

Tumor-expressed genes can be compensatory, not causative

  • “Passengers, not drivers”
  • We have the drivers

Tumorigenesis SNPs

Using a SNPnet™ covering only 1/3 of the genome, we found about 2,500 genes associated with each of 6 different cancers in whites

  • Nobody else has found any yet
  • This will change in 2-3 years

We estimate 10,000 genes per cancer

What cellular program takes up 1/3-1/2 of the genome?

What program takes up >1/3 of the genome?

  • Differentiation…

Does sporadic cancer arise when a tissue stem cell fails to differentiate?

  • In the embryo, the surrounding tissue expresses “fields”

Lent C. Johnson published a “field” based hypothesis of bone tumors that coincides with differentiation at the

  1. METAPHYSIS
  2. HYPOPHYSIS

and the type CELL – chondroblast, osteoblast, giant cell (osteoclast), fibroblast

Orthopedic surgeons use magnetic fields for healing

  • of powerful transcription factors.
  • Not so in adult life: a proliferating tissue stem cell is literally on its own.

Germlines hold the key to effective “differentiation therapy”

  • Ideal for patients with stage 3-4 cancer
  • Examples of differentiation therapy:
  1. 1,25-vitamin D and
  2. retinoic acid

Non-toxic but more effective treatment for late stage disease,

GenoMed’s 2,500 cancer-causing genes:

  • ½ are oncogenes,
  • ½ are tumor suppressors

Design inhibitors to oncogenes

  • Screen 1st for toxicity;
  • genomic epidemiology guarantees clinical efficacy

 

Jewish Heritage Written in DNA

By Kate Yandell | Sept 9, 2014

Fully sequenced genomes of more than 100 Ashkenazi people clarify the group’s history and provide a reference for researchers and physicians trying to pinpoint disease-associated genes.

Whole-genome sequences from 128 healthy Ashkenazi Jewish people are intended to help identify disease-associated variants in the Jewish population of Ashkenazi ancestry, according to a study published Sept 9 in Nature Communications. The library of sequences confirms earlier conclusions about Ashkenazi history hinted at by more limited DNA sequencing studies. The sequences point to an approximately 350-person bottleneck in the Ashkenazi population as recently as 700 years ago (around 1300 A.D.), and suggest that the population has a mixture of European and Middle Eastern ancestry.

The study “provides a very nice reference panel for the very unique population of Ashkenazi Jews,” said Alon Keinan, who studies human population genomics at Cornell University in New York. Keinan is acknowledged in the study but was not involved in the research.

“One might have thought that, after many years of genetic studies relating to Ashkenazi Jews . . . there would be little room for additional insights,” Karl Skorecki of the Rambam Healthcare Campus in Israel, who also was not involved in the study, wrote in an e-mail to The Scientist. The study, he added, provides “a powerful further validation and further resolution of the demographic history of the Ashkenazi Jews in relation to non-Jewish Europeans that is reassuringly consistent with inferences drawn from two decades of studies using uniparental regions . . . and from array-based data.”

Itsik Pe’er, coauthor of the new study and an associate professor of computer science at Columbia University in New York City, recalled that several years ago, he and his colleagues kept running into the same problem as they tried to understand the genetics of disease in Ashkenazi populations. They were comparing their Ashkenazi samples to the only control genomes that were available, which were of largely non-Jewish European origin. The Ashkenazi genomes had variation that was absent in these general European genomes, making it hard to distinguish rare variants in Ashkenazi people.

“Technology is there to tell us everything in that [Ashkenazi] patient’s genome, but the genome was not there to distinguish the variants that are there and to tell us whether they are normal or whether we should get worried,” said Pe’er. Pe’er’s group teamed up with researchers from additional universities and hospitals in the U.S., Belgium, and Israel to sequence a collection of healthy Ashkenazi people’s genomes. The panel of reference sequences performs better than a group of European genomes at filtering out harmless variants from Ashkenazi Jewish genomes, thereby making it easier to identify potentially harmful ones. According to Pe’er, researchers will also be able to use the panel to infer more complete sequences from partially sequenced genomes by looking for familiar sequences from the reference genomes.

The team also used its data to better understand the history of the Ashkenazi Jewish people by analyzing the level of similarity both within Ashkenazi genomes and between Ashkenazi and non-Jewish European genomes. By analyzing the length of identical DNA sequences that Ashkenazi individuals share, the researchers were able to estimate that 25 to 32 generations ago, the Ashkenazi Jewish population shrank to just several hundred people, before expanding rapidly to eventually include the millions of Ashkenazi Jews alive today. Further, the researchers concluded that modern Ashkenazi Jews likely have an approximately even mixture of European and Middle Eastern ancestry. This suggests that after the Jewish people migrated from the Middle East to Europe, they recruited people from local European populations.
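
The bottleneck estimate is given in generations, while the summary above quotes it in years. A short sketch shows how the two relate; the generation time of 25 years is an assumption made here for illustration and is not stated in the article.

```python
def generations_to_years(generations, years_per_generation=25.0):
    """Convert a number of generations to years.

    years_per_generation is an assumed average generation time.
    """
    return generations * years_per_generation


PUBLICATION_YEAR = 2014  # year the study appeared

low = generations_to_years(25)   # most recent end of the estimate
high = generations_to_years(32)  # oldest end of the estimate
print(f"bottleneck about {low:.0f} to {high:.0f} years ago")
print(f"roughly {PUBLICATION_YEAR - high:.0f} to {PUBLICATION_YEAR - low:.0f} A.D.")
```

Under this assumption the range of 25 to 32 generations brackets the figure of about 700 years quoted above; a longer assumed generation time would push the estimate further back.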

These results are compatible with those of prior work on mitochondrial DNA (mtDNA), which is passed on maternally. This prior work suggested that Ashkenazi men from the Middle East intermarried with local European women. The Ashkenazi population “hasn’t been likely as isolated as at least some researchers considered,” said Keinan.

Finally, the newly sequenced genomes shed light on the deeper history of Europe, showing that the European and Middle Eastern portions of Ashkenazi ancestry diverged just around 20,000 years ago.

“This is, I think, the first evidence from whole human genomes that the most important wave of settlement from the Near East was most likely shortly after the Last Glacial Maximum . . . and, notably, before the Neolithic transition, which is what researchers working on mitochondrial DNA have been arguing for some years,” Martin Richards, an archeogeneticist at the University of Huddersfield in the U.K., told The Scientist in an e-mail.

Skorecki noted that the new study “demonstrates the utility of sequencing whole genomes in a diverse population… with sufficient numbers of samples, parent population information, and computational analytic power, we can expect important and surprising utilities for personal genomics and insights in terms of human demographic history from whole genomes.”

  1. Carmi et al., “Sequencing an Ashkenazi reference panel supports population-targeted personal genomics and illuminates Jewish and European origins,” Nature Communications, http://dx.doi.org:/10.1038/ncomms5835, 2014.

Added Layers of Proteome Complexity

By Anna Azvolinsky | July 17, 2014

Scientists discover a broad spectrum of alternatively spliced human protein variants within a well-studied family of genes.

There may be more to the human proteome than previously thought. Some genes are known to have several different alternatively spliced protein variants, but the Scripps Research Institute’s Paul Schimmel and his colleagues have now uncovered almost 250 protein splice variants of an essential, evolutionarily conserved family of human genes. The results were published today (July 17) in Science.

Focusing on the 20-gene family of aminoacyl tRNA synthetases (AARSs), the team captured AARS transcripts from human tissues (some fetal, some adult) and showed that many of these messenger RNAs (mRNAs) were translated into proteins. Previous studies have identified several splice variants of these enzymes that have novel functions, but uncovering so many more variants was unexpected, Schimmel said. Most of these new protein products lack the catalytic domain but retain other AARS non-catalytic functional domains. “The main point is that a vast new area of biology, previously missed, has been uncovered,” said Schimmel.

“This is an incredible study that fundamentally changes how we look at the protein-synthesis machinery,” Michael Ibba, a protein translation researcher at Ohio State University who was not involved in the work, told The Scientist in an e-mail. “The unexpected and potentially vast expanded functional networks that emerge from this study have the potential to influence virtually any aspect of cell growth.”

The team—including researchers at the Hong Kong University of Science and Technology, Stanford University, and aTyr Pharma, a San Diego-based biotech company that Schimmel co-founded—comprehensively captured and sequenced the AARS mRNAs from six human tissue types using high-throughput deep sequencing. While many of the transcripts were expressed in each of the tissues, there was also some tissue specificity.

Next, the team showed that a proportion of these transcripts, including those missing the catalytic domain, indeed resulted in stable protein products: 48 of these splice variants associated with polysomes. In vitro translation assays and the expression of more than 100 of these variants in cells confirmed that many of these variants could be made into stable protein products.

The AARS enzymes, of which there’s one for each of the 20 amino acids, bring together an amino acid with its appropriate transfer RNA (tRNA) molecule. This reaction allows a ribosome to add the amino acid to a growing peptide chain during protein translation. AARS enzymes can be found in all living organisms and are thought to be among the first proteins to have originated on Earth.

To understand whether these non-catalytic proteins had unique biological activities, the researchers expressed and purified recombinant AARS fragments, testing them in cell-based assays for proliferation, cell differentiation, and transcriptional regulation, among other phenotypes. “We screened through dozens of biological assays and found that these variants operate in many signaling pathways,” said Schimmel.

“This is an interesting finding and fits into the existing paradigm that, in many cases, a single gene is processed in various ways [in the cell] to have alternative functions,” said Steven Brenner, a computational genomics researcher at the University of California, Berkeley.

The team is now investigating the potentially unique roles of these protein splice variants in greater detail—in both human tissue as well as in model organisms. For example, it is not yet clear whether any of these variants directly bind tRNAs.

“I do think [these proteins] will play some biological roles,” said Tao Pan, who studies the functional roles of tRNAs at the University of Chicago. “I am very optimistic that interesting biological functions will come out of future studies on these variants.”

Brenner agreed. “There could be very different biological roles [for some of these proteins]. Biology is very creative that way, [it’s] able to generate highly diverse new functions using combinations of existing protein domains.” However, the low abundance of these variants is likely to constrain their potential cellular functions, he noted.

Because AARSs are among the oldest proteins, these ancient enzymes were likely subject to plenty of change over time, said Karin Musier-Forsyth, who studies protein translation at the Ohio State University. According to Musier-Forsyth, synthetases are already known to have non-translational functions and differential localizations. “Like the addition of post-translational modifications, splicing variation has evolved as another way to repurpose protein function,” she said.

One of the protein variants was able to stimulate skeletal muscle fiber formation ex vivo and upregulate genes involved in muscle cell differentiation and metabolism in primary human skeletal myoblasts. “This was really striking,” said Musier-Forsyth. “This suggests that, perhaps, peptides derived from these splice variants could be used as protein-based therapeutics for a variety of diseases.”

W.S. Lo et al., “Human tRNA synthetase catalytic nulls with diverse functions,” Science, http://dx.doi.org:/10.1126/science.1252943, 2014.

It’s Not Only in DNA’s Hands

By Ilene Schneider  LabRoots   Aug 22, 2014

Blood stem cells have the potential to turn into any type of blood cell, whether it is the oxygen-carrying red blood cells or the immune system’s many types of white blood cells that help fight infection. How exactly is the fate of these stem cells regulated? Preliminary findings from research conducted by scientists from the Weizmann Institute of Science and the Hebrew University are starting to reshape the conventional understanding of the way blood stem cell fate decisions are controlled, thanks to a new technique for epigenetic analysis developed at these institutions. Understanding epigenetic mechanisms (environmental influences other than genetics) of cell fate could lead to the deciphering of the molecular mechanisms of many diseases, including immunological disorders, anemia, leukemia, and many more. The study of epigenetics also lends strong support to findings that environmental factors and lifestyle play a more prominent role in shaping our destiny than previously realized.

 

The process of differentiation – in which a stem cell becomes a specialized mature cell – is controlled by a cascade of events in which specific genes are turned “on” and “off” in a highly regulated and accurate order. The instructions for this process are contained within the DNA itself in short regulatory sequences.

  • These regulatory regions are normally in a “closed” state, masked by special proteins called histones to ensure against unwarranted activation.
  • Therefore, to access and “activate” the instructions, this DNA mask needs to be “opened” by epigenetic modifications of the histones so it can be read by the necessary machinery.

In a paper published in Science, Dr. Ido Amit and David Lara-Astiaso of the Weizmann Institute’s Department of Immunology, along with Prof. Nir Friedman and Assaf Weiner of the Hebrew University of Jerusalem, charted – for the first time – histone dynamics during blood development. Thanks to the new technique for epigenetic profiling they developed, in which just a handful of cells – as few as 500 – can be sampled and analyzed accurately, they have identified the exact DNA sequences, as well as the various regulatory proteins, that are involved in regulating the process of blood stem cell fate.

This research has also yielded unexpected results: As many as

  • 50% of these regulatory sequences are established and opened during intermediate stages of cell development.

The meaning of the research is that epigenetics can be active at stages in which it had been thought that cell destiny was already set. “This changes our whole understanding of the process of blood stem cell fate decisions,” says Lara-Astiaso, “suggesting that the process is more dynamic and flexible than previously thought.”

Although this research was conducted on mouse blood stem cells, the scientists believe that the mechanism may hold true for other types of cells. “This research creates a lot of excitement in the field, as it sets the groundwork to study these regulatory elements in humans,” says Weiner.

Largest Cancer Genetic Analysis Reveals New Way of Classifying Cancer

http://www.biosciencetechnology.com/news/2014/08/largest-cancer-genetic-analysis-reveals-new-way-classifying-cancer

Thu, 08/07/2014 – 2:24pm

Researchers with The Cancer Genome Atlas (TCGA) Research Network have completed the largest, most diverse tumor genetic analysis ever conducted, revealing a new approach to classifying cancers. The work, led by researchers at the UNC Lineberger Comprehensive Cancer Center at the University of North Carolina at Chapel Hill and other TCGA sites, not only

  • revamps traditional ideas of how cancers are diagnosed and treated, but could also have
  • a profound impact on the future landscape of drug development.

“We found that one in 10 cancers analyzed in this study would be classified differently using this new approach,” said Chuck Perou, PhD, professor of genetics and pathology, UNC Lineberger member and senior author of the paper, which appears online Aug. 7 in Cell.
“That means that

  • 10 percent of the patients might be better off getting a different therapy—that’s huge.”

Since 2006, much of the research has identified cancer as not a single disease, but many types and subtypes, and has defined these disease types based on the tissue (breast, lung, colon, etc.) in which the cancer originated. In this scenario, treatments were tailored to the tissue affected, but questions have always existed because some treatments work for some patients and fail for others, even when a single tissue type is tested.

In their work, TCGA researchers analyzed more than 3,500 tumors across 12 different tissue types to see how they compared to one another (the largest data set of tumor genomics ever assembled), explained Katherine Hoadley, PhD, research assistant professor in genetics and lead author. They found that

  • cancers are more likely to be genetically similar based on the type of cell in which the cancer originated, compared to the type of tissue in which it originated. 

This is a fundamental premise of pathology! (Larry Bernstein) It goes back to Rudolf Virchow.

“In some cases, the cells in the tissue from which the tumor originates are the same,” said Hoadley. “But in other cases, the tissue in which the cancer originates is made up of multiple types of cells that can each give rise to tumors. Understanding the cell in which the cancer originates appears to be very important in determining the subtype of a tumor and, in turn, how that tumor behaves and how it should be treated.”

Perou and Hoadley explain that the new approach may also shift how cancer drugs are developed, focusing more on the development of drugs targeting larger groups of cancers with genomic similarities, as opposed to a single tumor type as they are currently developed.

One striking example of the genetic differences within a single tissue type is breast cancer. The breast, a highly complex organ with multiple types of cells, gives rise to multiple types of breast cancer: luminal A, luminal B, HER2-enriched and basal-like, as was previously known. In this analysis, the basal-like breast cancers looked more like ovarian cancer and cancers of squamous-cell origin (a type of cell that composes the lower layer of a tissue) than like other cancers that arise in the breast.

“This latest research further solidifies that basal-like breast cancer is an entirely unique disease and is completely distinct from other types of breast cancer,” said Perou. In addition, bladder cancers were also quite diverse and might represent at least three different disease types that also showed differences in patient survival.

As part of the Alliance for Clinical Trials in Oncology, a national network of researchers conducting clinical trials, UNC researchers are already testing the effectiveness of carboplatin, a common treatment for ovarian cancer, on top of standard-of-care chemotherapy for triple-negative breast cancer (TNBC) patients, of whom 80 percent have the basal-like subtype. The results of this study (called CALGB40603) were just published on Aug. 6 in the Journal of Clinical Oncology and showed a benefit of carboplatin in TNBC patients. This new clinical trial result suggests that there may be great value in comparing clinical results across tumor types that this study highlights as having genomic similarities.

As participants in TCGA, UNC Lineberger scientists have been involved in multiple individual tissue type studies including, most recently, an analysis of a comprehensive genomic profile of lung adenocarcinoma. Perou’s seminal work in 2000 led to the first discovery of breast cancer as not one, but in fact, four distinct subtypes of disease. These most recent findings should continue to lay the groundwork for what could be the next generation of cancer diagnostics.

Source: University of North Carolina at Chapel Hill School of Medicine

New Gene Tied to Breast Cancer Risk

Wed, 08/06/2014

Marilynn Marchione – AP Chief Medical Writer – Associated Press

It’s long been known that faulty BRCA genes greatly raise the risk for breast cancer. Now, scientists say a more recently identified, less common gene can do the same.

Mutations in the gene can make breast cancer up to nine times more likely to develop, an international team of researchers reports in this week’s New England Journal of Medicine.

About 5 to 10 percent of breast cancers are thought to be due to bad BRCA1 or BRCA2 genes. Beyond those, many other genes are thought to play a role but how much each one raises risk has not been known, said Dr. Jeffrey Weitzel, a genetics expert at City of Hope Cancer Center in Duarte, Calif.

The new study on the gene, called PALB2, shows “this one is serious,” and probably is the most dangerous in terms of breast cancer after the BRCA genes, said Weitzel, one of the leaders of the study.

It involved 362 members of 154 families with PALB2 mutations – the largest study of its kind. The faulty gene seems to give a woman a 14 percent chance of breast cancer by age 50 and 35 percent by age 70 and an even greater risk if she has two or more close relatives with the disease.

That’s nearly as high as the risk from a faulty BRCA2 gene, Dr. Michele Evans of the National Institute on Aging and Dr. Dan Longo of the medical journal staff write in a commentary in the journal.

The PALB2 gene works with BRCA2 as a tumor suppressor, so when it is mutated, cancer can flourish.

How common the mutations are isn’t well known, but it’s “probably more than we thought because people just weren’t testing for it,” Weitzel said. He found three cases among his own breast cancer patients in the last month alone.

Among breast cancer patients, BRCA mutations are carried by 5 percent of whites and 12 percent of Eastern European (Ashkenazi) Jews. PALB2 mutations have been seen in up to 4 percent of families with a history of breast cancer.

 Men with a faulty PALB2 gene also have a risk for breast cancer that is eight times greater than men in the general population.

Testing for PALB2 often is included in more comprehensive genetic testing, and the new study should give people with the mutation better information on their risk, Weitzel said. Doctors say that people with faulty cancer genes should be offered genetic counseling and may want to consider more frequent screening and prevention options, which can range from hormone-blocking pills to breast removal.

The actress Angelina Jolie had her healthy breasts removed last year after learning she had a defective BRCA1 gene.

The study was funded by many government and cancer groups around the world and was led by Dr. Marc Tischkowitz of the University of Cambridge in England. The authors include Mary-Claire King, the University of Washington scientist who discovered the first breast cancer predisposition gene, BRCA1.

Study: http://www.nejm.org/doi/full/10.1056/NEJMoa1400382

Gene info: http://ghr.nlm.nih.gov/gene/PALB2

Structure of the DDB1–CRBN E3 ubiquitin ligase in complex with thalidomide

Eric S. Fischer, Kerstin Böhm, John R. Lydeard, Haidi Yang, …, J. Wade Harper, Jeremy L. Jenkins & Nicolas H. Thomä

Nature (07 Aug 2014); 512, 49–53  http://dx.doi.org:/10.1038/nature13527

Published online 16 July 2014

In the 1950s, the drug thalidomide, administered as a sedative to pregnant women, led to the birth of thousands of children with multiple defects. Despite the teratogenicity of thalidomide and its derivatives lenalidomide and pomalidomide,

  • these immunomodulatory drugs (IMiDs) recently emerged as effective treatments for multiple myeloma and 5q-deletion-associated dysplasia.
  • IMiDs target the E3 ubiquitin ligase CUL4–RBX1–DDB1–CRBN (known as CRL4CRBN) and
  • promote the ubiquitination of the IKAROS family transcription factors IKZF1 and IKZF3 by CRL4CRBN.

Here we present crystal structures of the DDB1–CRBN complex bound to thalidomide, lenalidomide and pomalidomide. The structure establishes that

  • CRBN is a substrate receptor within CRL4CRBN and enantioselectively binds IMiDs.

Using an unbiased screen, we identified the

  • homeobox transcription factor MEIS2 as an endogenous substrate of CRL4CRBN.

Our studies suggest that IMiDs block endogenous substrates (MEIS2) from binding to CRL4CRBN while the ligase complex is recruiting IKZF1 or IKZF3 for degradation.

This dual activity implies that

  • small molecules can modulate an E3 ubiquitin ligase and thereby upregulate or downregulate the ubiquitination of proteins.

Curator’s Viewpoint:

The short pieces may not appear to be closely connected, except for the last subject, the pharmaceutical targeting of protein ubiquitination by an E3 ubiquitin ligase. Even in that case, we have to keep in mind that protein synthesis by translation, remodeling, and the recapture of amino acids are held in balance through ubiquitylation. So I put it there. Population genetics ties some mutations to diseases that are specific to particular populations, not only the Ashkenazi Jewish population, but populations in Asia as well.

The next matter for consideration is methodological. The BRCA mutations in the Ashkenazi population are among a number of mutations we can identify, extending to Tay-Sachs disease, for instance. How this might have occurred in the history of the Jewish people is not so obvious, except perhaps through the segregation of the Jewish population for centuries. The mutation would be confined within the population, with limited marriage outside of the Jewish community. It has been known for some time that there is a Cohen gene that traces back to the priests (Kohanim) of the Holy Temple, the descendants of Aaron (Aharon), the brother of Moses. The priests would stand at the Ark and bless the congregation in the most holy convocation of Yom Kippur, according to tradition. Marriages were arranged between the bride and the groom. Of course, arranged marriages were also the case in other ethnic communities, and among the privileged.

That was dramatically the case during the reign of Queen Victoria of England, with royal marriages arranged across Europe. That would be a factor in the transmission of haemophilia, and of mental disorders, in the royal families. Haemophilia figured prominently in the history of European royalty in the 19th and 20th centuries. Britain’s Queen Victoria, through two of her five daughters (Princess Alice and Princess Beatrice), passed the mutation to various royal houses across the continent, including the royal families of Spain, Germany and Russia. Victoria’s son Prince Leopold, Duke of Albany KG KT GCSI GCMG GCStJ (Leopold George Duncan Albert; 7 April 1853 – 28 March 1884), also suffered from the disease. He was the eighth child and fourth son of Queen Victoria and Prince Albert of Saxe-Coburg and Gotha, and was later created Duke of Albany, Earl of Clarence, and Baron Arklow. His haemophilia led to his death at the age of 30.

Haemophilia is a sex-linked disorder: the gene lies on the X chromosome and may be inherited from either mother or father, but the disorder manifests almost entirely in males. This is because the trait is recessive and males inherit only one X chromosome, from their mothers, so a single mutant copy is enough to cause disease. Of course, this is classical Mendelian genetics. Victoria’s mutation appears to have been spontaneous (de novo), and she is usually considered the source of the disease in modern cases of haemophilia among royalty. The mutation would probably be assumed today to have occurred at the conception of Princess Alice, as she was the only known carrier among Victoria and Albert’s first seven children. Victoria’s son Leopold was a sufferer of haemophilia, and her daughters Alice and Beatrice were confirmed carriers of the gene.
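The X-linked recessive pattern described above can be made concrete with a small enumeration. The following is a minimal sketch in Python, assuming a carrier mother and an unaffected father; the allele labels `XH` (normal) and `Xh` (haemophilia) and the function name `offspring_outcomes` are illustrative choices, not taken from any source.

```python
from collections import Counter
from itertools import product

def offspring_outcomes(mother, father):
    """Enumerate equally likely offspring of an X-linked recessive cross.

    Each parent is a tuple of two sex chromosomes, e.g. ("XH", "Xh") for a
    carrier mother and ("XH", "Y") for an unaffected father.
    "XH" = normal allele, "Xh" = haemophilia allele (hypothetical labels).
    """
    outcomes = Counter()
    for from_mother, from_father in product(mother, father):
        if from_father == "Y":
            # Sons carry a single X chromosome, always from the mother.
            label = "affected son" if from_mother == "Xh" else "unaffected son"
        else:
            # Daughters need two mutant copies to be affected (recessive trait).
            n_mutant = [from_mother, from_father].count("Xh")
            label = {0: "non-carrier daughter",
                     1: "carrier daughter",
                     2: "affected daughter"}[n_mutant]
        outcomes[label] += 1
    total = sum(outcomes.values())
    return {label: count / total for label, count in outcomes.items()}

carrier_mother = ("XH", "Xh")
unaffected_father = ("XH", "Y")
for label, prob in sorted(offspring_outcomes(carrier_mother, unaffected_father).items()):
    print(f"{label}: {prob:.2f}")
```

Under these assumptions each of the four outcomes is equally likely, so half of the sons are affected and half of the daughters are carriers. An affected father with a non-carrier mother, by contrast, yields only carrier daughters and unaffected sons.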

Cousin marriage is marriage between people with a common grandparent or other more distant ancestor. Marriages between first and second cousins account for over 10% of marriages worldwide, and they are common in the Middle East, where in some nations they account for over half of all marriages. Proportions of first-cousin marriage in the United States, Europe and other Western countries like Brazil have declined since the 19th century, though even during that period they were not more than 3.63 percent of all unions in Europe. Cousin marriage has been allowed throughout the Middle East for all recorded history, and is practiced mostly in Syria. It has often been chosen to keep cultural values intact through many generations and to preserve familial wealth. In Iraq the right of the cousin has also traditionally been followed, and a girl breaking the rule without the consent of the ibn ‘amm could have ended up murdered by him. The Syrian city of Aleppo during the 19th century featured a rate of cousin marriage among the elite of 24% according to one estimate, a figure that masked widespread variation: some leading families had none or only one cousin marriage, while others had rates approaching 70%. Cousin marriage rates were highest among women, merchant families, and older well-established families. The percentage of Iranian cousin marriages increased from 34 to 44% between the 1940s and 1970s. Cousin marriage among native Middle Eastern Jews is generally far higher than among the European Ashkenazim, who assimilated European marital practices after the diaspora.

The essential elements of the marriage contract were now an offer by the man, an acceptance by the woman, and the performance of such conditions as the payment of dowry. According to anthropologist Ladislav Holý, cousin marriage is not an independent phenomenon but rather one expression of a wider Middle Eastern preference for agnatic solidarity, or solidarity with one’s father’s lineage.

A 2009 study found that many Arab countries display some of the highest rates of consanguineous marriages in the world, and that first-cousin marriages may reach 25-30% of all marriages. Research among Arabs and worldwide has indicated that consanguinity could have an effect on some reproductive health parameters such as postnatal mortality and rates of congenital malformations.

In the terraced streets of Bradford, Yorkshire, a child’s death is anything but rare. At the inquest into the death of one boy, Hamza Rehman, coroner Mark Hinchliffe said that he had died because his Pakistan-born parents (shopkeeper Abdul and housewife Rozina) are first cousins. Muslims have practiced marriages between first cousins in non-prohibited countries since the time of the Quran.

Four years before, Hamza’s older sister, three-month-old Khadeja, had died of the same brain disorder which causes fits, sickness and chest infections. The couple had another baby born with equally devastating neurological problems.

A heartbroken Mr Rehman told the inquest that he and his wife were unsure whether to have any more children. The coroner expressed deep sympathy before saying that Hamza’s death should serve as a warning to others.

I have diverged somewhat onto the genetic risks of consanguineous marriages, which George Darwin, son of Charles Darwin, argued had only a small effect in the English society of his day. But most importantly, we see the larger factor here of social and familial inheritance, and also the concept of cultural identity.

Insofar as the somatic and mitochondrial mutations are concerned, I call attention to the finding in the GWAS discussed above that the results were supportive of the conclusions from mtDNA. This gives some reason to consider whether sufficient information is obtained from the mtDNA, without the more robust GWAS. One cannot fully consider this without some knowledge of the methodology of specimen preparation.

It is not difficult to prepare mitochondria from cells and obtain a very good preparation before further analysis, whether of the membrane structures, the enzymatic activity, or of the DNA and RNA polynucleotides. The separation is easily achieved with differential centrifugation. On the other hand, the finding that the basal layer of epithelium has a different signature than the superficial layer, established by the genomic studies but known histologically for non-neoplastic tissue, is a matter for cell separation methods that are not easy. It is from the lower layer of cells that we derive carcinoma in situ. These cells were identified in breast, are expected to be found in uterus, and were like the cells in ovarian cancer, which suggested the use of a common treatment regimen as adjunct in triple-negative breast cancer and ovarian cancer. The importance of a sufficiently prepared cellular specimen, as opposed to a tissue specimen, can’t be taken for granted.

 

 



The role and importance of transcription factors

Larry H. Bernstein, MD, FCAP, Writer and Curator

https://pharmaceuticalintelligence.com/2014/8/05/The-role-and-importance-of-transcripton-factors

The following is the second article in the second series, which focuses on the impact of genomics and transcriptomics on the evolution of 21st-century medicine. Medicine will have to become more efficient and more effective by the end of this decade if, as predicted, the funding of Medicare runs out. Even so, Social Security was devised by none other than Otto von Bismarck, who unified Germany; the United Kingdom has had a charity hospital care system, begun to protect widows from the ravages of war; and nursing was developed by Florence Nightingale as a result of the experience of war. It can only be concluded that the care for the elderly, the infirm, and those who have few resources to live on has a long history in western civilization, and it will not cease to exist as a public social obligation anytime soon. The 20th century saw an explosive development of physics; of organic, inorganic, biological, and medicinal chemistry; and the elucidation of the genetic code and its mechanism of translation in plants, microorganisms, and eukaryotes. All of this occurred despite the most horrendous wars, which have reshaped the world map.

The following is the second portion of a puzzle under construction that is intended to move into the deeper complexities introduced by proteomics, cell metabolism, metabolomics, and signaling. This is the only manner by which I can begin to appreciate what a wonder it is to view and live in this world with all its imperfections.

We have already visited the transcription process, by which a DNA sequence is read into RNA. This is essential for protein synthesis, because the RNA message determines the ordering of the amino acids in the primary structure. However, there are microRNAs and noncoding RNAs, and there are transcription factors. The transcription factors bind to chromatin, and the RNAs also have some role in regulating the transcription process. We shall examine this further.

  1. RNA and the transcription of the genetic code
    Larry H. Bernstein, MD, FCAP, Writer and Curator
    https://pharmaceuticalintelligence.com/2014/08/02/rna-and-the-transcription-of-the-genetic-code/
  2. The role and importance of transcription factors
    Larry H. Bernstein, MD, FCAP, Writer and Curator
    https://pharmaceuticalintelligence.com/2014/8/05/The-role-and-importance-of-transcripton-factors
  3. What is the meaning of so many RNAs?
    Larry H. Bernstein, MD, FCAP, Writer and Curator
    https://pharmaceuticalintelligence.com/2014/8/05/What-is-the-meaning-of-so-many-RNAs
  4. Pathology Emergence in the 21st Century
    Larry Bernstein, MD, FCAP, Author and Curator
    https://pharmaceuticalintelligence.com/2014/08/03/pathology-emergence-in-the-21st-century/
  5. The Arnold Relman Challenge: US HealthCare Costs vs US HealthCare Outcomes
    Larry H. Bernstein, MD, FCAP, Reviewer and Curator; and
    Aviva Lev-Ari, PhD, RN, Curator
    https://pharmaceuticalintelligence.com/2014/08/05/the-relman-challenge/

 

 

 

Quantifying transcription factor kinetics: At work or at play?

Posted online on September 11, 2013. (doi:10.3109/10409238.2013.833891)

Florian Mueller1,2, Timothy J. Stasevich3, Davide Mazza4, and James G. McNally5
1Institut Pasteur, Computational Imaging and Modeling Unit, CNRS, Paris, Fr
2Functional Imaging of Transcription, Institut de Biologie de l’Ecole Normale Supérieure, Paris, Fr
3Graduate School of Frontier Biosciences, Osaka University, Osaka, Jp
4Istituto Scientifico Ospedale San Raffaele, Centro di Imaging Sperimentale e Università Vita-Salute
San Raffaele, Milano, It, and
5Fluorescence Imaging Group, National Cancer Institute, NIH, Bethesda, MD, USA

Read More: http://informahealthcare.com/doi/abs/10.3109/10409238.2013.833891?goback=%2Egde_3795224_member_273907669#%2EUjYZ8jMt8mo%2Elinkedin

Abstract

Transcription factors (TFs) interact dynamically in vivo with chromatin binding sites. Here we summarize and compare the four different techniques that are currently used to measure these kinetics in live cells, namely fluorescence recovery after photobleaching (FRAP), fluorescence correlation spectroscopy (FCS), single molecule tracking (SMT) and competition ChIP (CC). We highlight the principles underlying each of these approaches as well as their advantages and disadvantages. A comparison of data from each of these techniques raises an important question: do measured transcription kinetics reflect biologically functional interactions at specific sites (i.e. working TFs) or do they reflect non-specific interactions (i.e. playing TFs)? To help resolve this dilemma we discuss five key unresolved biological questions related to the functionality of transient and prolonged binding events at both specific promoter response elements as well as non-specific sites. In support of functionality, we review data suggesting that TF residence times are tightly regulated, and that this regulation modulates transcriptional output at single genes. We argue that in addition to this site-specific regulatory role, TF residence times also determine the fraction of promoter targets occupied within a cell thereby impacting the functional status of cellular gene networks. Thus, TF residence times are key parameters that could influence transcription in multiple ways.
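The link the abstract draws between residence time and the fraction of occupied targets can be illustrated with the simplest two-state binding model. This is a sketch under stated assumptions, not the authors' kinetic model: a single site binds TF with an effective on-rate `k_on_eff` (assumed to already include the free TF concentration) and an off-rate `k_off`. The function names and the numerical values are hypothetical.

```python
def residence_time(k_off):
    """Mean bound lifetime (s) of a TF at a site with dissociation rate k_off (1/s)."""
    return 1.0 / k_off

def occupancy(k_on_eff, k_off):
    """Steady-state fraction of time a site is occupied in a two-state model.

    k_on_eff: effective association rate (1/s), assumed to already include
              the free TF concentration.
    k_off:    dissociation rate (1/s).
    """
    return k_on_eff / (k_on_eff + k_off)

k_on_eff = 0.1  # hypothetical value: one binding event per 10 s on average
for k_off in (1.0, 0.1, 0.01):
    print(f"residence time {residence_time(k_off):6.1f} s -> "
          f"occupancy {occupancy(k_on_eff, k_off):.2f}")
```

In this toy model, lengthening the residence time at a fixed on-rate raises the fraction of sites occupied, which is one way residence time could influence the status of a gene network as a whole.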

Keywords: Competition-ChIP, kinetic modeling, live-cell imaging, non-specific binding, specific binding, transcription, transcription factor dynamics

The Transcription Factor Titration Effect Dictates Level of Gene Expression

Robert C. Brewster, Franz M. Weinert, Hernan G. Garcia, Dan Song, Mattias Rydenfelt, and Rob Phillips
California Institute of Technology
Cell Mar 13, 2014; 156:1312–1323.

Models of transcription are often built around a picture of RNA polymerase and transcription factors (TFs) acting on a single copy of a promoter. However, most TFs are shared between multiple genes with varying binding affinities. Beyond that, genes often exist at high copy number—in multiple identical copies on the chromosome or on plasmids or viral vectors with copy numbers in the hundreds. Using a thermodynamic model, we characterize the interplay between TF copy number and the demand for that TF. We demonstrate the parameter-free predictive power of this model as a function of the copy number of the TF and the number and affinities of the available specific binding sites; such predictive control is important for the understanding of transcription and the desire to quantitatively design the output of genetic circuits. Finally, we use these experiments to dynamically measure plasmid copy number through the cell cycle.
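The titration idea, a limited pool of TF copies shared among many binding sites, can be sketched with a simplified statistical-mechanics calculation. What follows is a textbook-style illustration for a repressor in the weak-promoter limit, not the authors' exact model: `R` repressors are distributed between `N` identical promoter copies and `n_ns` non-specific sites, and the fold-change is taken as the mean fraction of promoters left unoccupied. The function name, the value of `n_ns` (roughly the length of the E. coli genome in base pairs), and the binding energy are all illustrative assumptions.

```python
import math

def fold_change(R, N, delta_eps_kT, n_ns=4.6e6):
    """Fold-change in expression for R repressors shared by N identical promoters.

    R:            number of repressor copies per cell
    N:            number of promoter copies (N >= 1)
    delta_eps_kT: specific minus non-specific binding energy, in units of kT
    n_ns:         number of non-specific sites (assumed value)
    """
    x = math.exp(-delta_eps_kT) / n_ns
    # Statistical weight of having m promoters occupied by repressor:
    # choose which promoters are bound, which repressors bind them, and
    # apply the Boltzmann factor for each specific binding event.
    weights = [math.comb(N, m) * math.perm(R, m) * x**m
               for m in range(min(R, N) + 1)]
    z = sum(weights)
    mean_bound = sum(m * w for m, w in enumerate(weights)) / z
    return 1.0 - mean_bound / N

for n_copies in (1, 10, 100):
    print(n_copies, round(fold_change(R=10, N=n_copies, delta_eps_kT=-14.0), 3))
```

For a single promoter copy the expression reduces to the familiar form 1 / (1 + (R / n_ns) exp(-Δε/kT)). As the number of promoter copies grows past the number of repressors, the repressors are titrated away and the fold-change rises toward 1.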

 

 

Optimal reference genes for normalization of qRT-PCR data from archival formalin-fixed, paraffin-embedded breast tumors controlling for tumor cell content and decay of mRNA.

Tramm T, Sørensen BS, Overgaard J, Alsner J.

Diagn Mol Pathol. 2013 Sep;22(3):181-7. http://dx.doi.org:/10.1097/PDM.0b013e318285651e

Gene-expression analysis is increasingly performed on degraded mRNA from formalin-fixed, paraffin-embedded tissue (FFPE), giving the option of examining retrospective cohorts. The aim of this study was to select robust reference genes showing stable expression over time in FFPE, controlling for various content of tumor tissue and decay of mRNA because of variable length of storage of the tissue.

Sixteen reference genes were quantified by qRT-PCR in 40 FFPE breast tumor samples, stored for 1 to 29 years. Samples included 2 benign lesions and 38 carcinomas with varying tumor content. Stability of the reference genes was determined by the geNorm algorithm. mRNA was successfully extracted from all samples, and the 16 genes were quantified in the majority of samples.
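The core of the geNorm approach is a stability measure M for each candidate gene: the average, over all other candidates, of the standard deviation of the log2 expression ratio across samples. A lower M means more stable expression. The sketch below covers only that M calculation, not the full procedure with stepwise exclusion of the least stable gene, and the gene names and expression values are invented for illustration.

```python
import math
from statistics import mean, stdev

def genorm_m(expression):
    """Return the geNorm-style stability measure M for each gene.

    expression: dict mapping gene name -> list of positive relative
                quantities, one per sample (same sample order for all genes).
    """
    genes = list(expression)
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            # Log2 ratio of gene j to gene k in every sample.
            log_ratios = [math.log2(a / b)
                          for a, b in zip(expression[j], expression[k])]
            pairwise_sd.append(stdev(log_ratios))
        m_values[j] = mean(pairwise_sd)
    return m_values

# Hypothetical data: gene_B tracks gene_A exactly, gene_C fluctuates.
data = {
    "gene_A": [1.0, 2.0, 4.0, 8.0],
    "gene_B": [2.0, 4.0, 8.0, 16.0],
    "gene_C": [1.0, 8.0, 2.0, 16.0],
}
for gene, m in sorted(genorm_m(data).items(), key=lambda item: item[1]):
    print(f"{gene}: M = {m:.3f}")
```

Because the measure is built from ratios, two genes that rise and fall together across samples (here through differing input amounts) still score as stable relative to each other, which is the property wanted in a reference gene.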

Results showed 14% loss of amplifiable mRNA per year, corresponding to a half-life of 4.6 years. The 4 most stably expressed genes were CALM2, RPL37A, ACTB, and RPLP0. Several of the other examined genes showed considerable instability over time (GAPDH, PSMC4, OAZ1, IPO8).
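The stated half-life follows from the annual loss if decay is assumed to be exponential, with a constant fraction lost each year. That assumption is ours, made to check the arithmetic, and the function name is illustrative.

```python
import math

def half_life_years(annual_loss_fraction):
    """Half-life implied by a constant fractional loss per year.

    With a fraction (1 - loss) remaining after each year, the amount left
    after t years is (1 - loss) ** t; solve (1 - loss) ** t = 0.5 for t.
    """
    return math.log(0.5) / math.log(1.0 - annual_loss_fraction)

print(round(half_life_years(0.14), 1))  # 4.6
```

A 14% annual loss leaves 86% after one year, and 0.86 raised to the power 4.6 is very close to one half, consistent with the reported figure.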

In conclusion, we identified 4 genes robustly expressed over time and independent of neoplastic tissue content in the FFPE block.   PMID:23846446

 

Structures of Cas9 Endonucleases Reveal RNA-Mediated Conformational Activation

Martin Jinek1,*, Fuguo Jiang2,*, David W. Taylor3,4,*, Samuel H. Sternberg5,*, Emine Kaya2, et al.

 

1Department of Biochemistry, University of Zurich, CH-8057 Zurich, Switzerland. 2Department of Molecular and Cell Biology, 3Howard Hughes Medical Institute, 4California Institute for Quantitative Biosciences, 5Department of Chemistry, 6Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720. 7The Laboratory for Molecular Infection Medicine Sweden, Umeå University, Umeå S-90187, Sweden. 8Helmholtz Centre for Infection Research, Department of Regulation in Infection Biology, D-38124 Braunschweig, Germany. 9Hannover Medical School, D-30625 Hannover, Germany. 10Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720.

‡ Present address: Friedrich Miescher Institute for Biomedical Research, Maulbeerstrasse 66 CH-4058 Basel, Switzerland.

§ Present address: Department of Agricultural and Biological Engineering, University of Florida, Gainesville, FL 32611, USA.

 

Science  http://dx.doi.org:/10.1126/science.1247997

 

Type II CRISPR-Cas systems use an RNA-guided DNA endonuclease, Cas9,

  • to generate double-strand breaks in invasive DNA during an adaptive bacterial immune response.

Cas9 has been harnessed as a powerful tool for genome editing and gene regulation in many eukaryotic organisms.

Here, we report 2.6 and 2.2 Å resolution crystal structures of two major Cas9 enzyme subtypes,

  • revealing the structural core shared by all Cas9 family members.

The architectures of Cas9 enzymes define nucleic acid binding clefts, and single-particle electron microscopy reconstructions show that the two structural lobes harboring these clefts undergo

  • guide RNA-induced reorientation to form a central channel where DNA substrates are bound.

The observation that extensive structural rearrangements occur before target DNA duplex binding

  • implicates guide RNA loading as a key step in Cas9 activation.

MicroRNA function in endothelial cells
Dr. Virginie Mattot
Angiogenesis, endothelium activation
Solving the mystery of an unknown target gene using microRNA Target Site Blockers

Dr. Virginie Mattot works in the team “Angiogenesis, endothelium activation and Cancer” directed by Dr. Fabrice Soncin at the Institut de Biologie de Lille in France, where she studies the roles played by microRNAs in endothelial cells during physiological and pathological processes such as angiogenesis or endothelium activation. She has been using Target Site Blockers to investigate the role of microRNAs on putative targets whose functions are as yet unknown.

What is the main focus of the research conducted in your lab?

We are studying endothelial cell functions with a particular interest in angiogenesis and endothelium activation during physiological and tumoral vascular development.

How did your research lead to the study of microRNAs?

A few years ago, we identified

  • an endothelial cell-specific gene which
  • harbors a microRNA in its intronic sequence.

We have since been working on understanding the functions of

  • both this new gene and its intronic microRNA in endothelial cells.

What is the aim of your current project?

While we were searching for the functions of the intronic microRNA,

  • we identified an unknown gene as a putative target.

The aim of my project was to investigate whether this unknown gene was actually a genuine target and whether regulation of this gene by the microRNA was involved in endothelial cell function. We had already characterized the endothelial cell phenotype associated with the inhibition of our intronic microRNA. We then used miRCURY LNA™ Target Site Blockers to demonstrate that

  • the expression of this unknown gene is actually controlled by this microRNA.
  • the microRNA regulates specific endothelial cell properties through regulation of this unknown gene.

How did you perform the experiments and analyze the results?

LNA™ enhanced target site blockers (TSB) for our microRNA were designed by Exiqon. We

  • transfected the TSBs into endothelial cells using our standard procedure and
  • analysed the induced phenotype.

As a control for these experiments, a mutated version of the TSB was designed by Exiqon and transfected into endothelial cells. We first verified that the TSB was functional by analyzing

  • the expression of the miRNA target against which the TSB was directed.

We then showed that the TSB induced phenotypes similar to those seen when we inhibited the microRNA in the same cells.

What do you find to be the main benefits/advantage of the LNA™ microRNA target site blockers from Exiqon?

Target Site Blockers are efficient tools to demonstrate the specific involvement of

  • putative microRNA targets in the function played by this microRNA.

What would be your advice to colleagues about getting started with microRNA functional analysis?

  • it is essential to perform both gain- and loss-of-function experiments.

 Changing the core of transcription

Different members of the TAF family of proteins work in differentiated cells, such as motor neurons or brown fat cells, to control the expression of genes that are specific to each cell type.

Katherine A Jones
Jones. eLife 2014;3:e03575. http://dx.doi.org:/10.7554/eLife.03575

 

Related research articles: Herrera FJ, Yamaguchi T, Roelink H, Tjian R. 2014. Core promoter factor TAF9B regulates neuronal gene expression. eLife 3:e02559. http://dx.doi.org:/10.7554/eLife.02559

Zhou H, Wan B, Grubisic I, Kaplan T, Tjian R. 2014. TAF7L modulates brown adipose tissue formation. eLife 3:e02811. http://dx.doi.org:/10.7554/eLife.02811

 

Motor neurons (green) being grown in vitro

 

In a developing organism, different genes are expressed at different times

 

  • the pattern of gene expression can often change abruptly.

 

Expressing a gene involves multiple steps:

 

  • the DNA must be transcribed into a molecule of messenger RNA,
  • which is then translated into a protein.

 

The mechanisms that start the transcription of protein-coding genes in rapidly growing cells are reasonably well understood: two types of proteins,

  • DNA-binding activators and general transcription factors,

cooperate to recruit an enzyme called RNA polymerase, which then transcribes the gene (Kadonaga, 2012).

 

These proteins bind to a region of the gene called the promoter, which is

 

  • upstream from the protein-coding region of the gene.

 

TATA-binding protein is a general transcription factor that

  • binds to certain sequences of DNA bases found within promoters.

There are 14 TATA-binding protein associated factors (TAFs), which are found in two different protein complexes called TFIID and SAGA (Müller et al., 2010). In budding yeast, these complexes can recruit TATA-binding protein to gene promoters (Basehoar et al., 2004). However, not all genes require all of the general transcription factors, and some genes require both the TFIID and SAGA complexes.

Although the steps that are required to switch on genes when cells are rapidly dividing are fairly well known,

  • the same is not true for cells that are differentiating into specialised cell types.

In these cells, many transcription factors are downregulated and

  • the entire pattern of gene expression changes dramatically.

Moreover, certain TAFs are strongly up-regulated during differentiation. The core transcriptional machinery is essentially rebuilt at the genes that are expressed in differentiated cells.

Over the years Robert Tjian of the University of California, Berkeley and co-workers have illuminated how individual TAFs can affect how a cell differentiates in different contexts (Figure 1). Now, in eLife, Francisco Herrera of UC Berkeley and co-workers (including Teppei Yamaguchi, Henk Roelink and Tjian) have identified a critical role for a TAF called TAF9B in the expression of genes in motor neurons (Herrera et al., 2014).

Herrera et al. found that TAF9B predominantly associates with the SAGA complex, rather than the TFIID complex, in the motor neuron cells. Mice in which the gene for TAF9B had been deleted had less neuronal tissue in the developing spinal cord. Moreover, the genes that are involved in forming the branches of neurons were not properly regulated in these mice.

Recently, in another eLife paper, Tjian and co-workers at Berkeley, Fudan University and the Hebrew University of Jerusalem—including Haiying Zhou as first author, Bo Wan, Ivan Grubisic and Tommy Kaplan—reported that another TAF protein, called TAF7L, works as part of the TFIID complex to up-regulate genes that direct cells to become brown adipose tissue (Zhou et al., 2014).

 

TATA-binding protein associated factors

Figure 1. TATA-binding protein associated factors (TAFs) regulate transcription in specific cell types. TAF3, for example, works with another transcription factor to regulate the expression of genes that are critical for the differentiation of the endoderm in the early embryo (Liu et al., 2011). TAF3 also forms a complex with the TATA-related factor, TRF3, to regulate Myogenin and other muscle-specific genes to form myotubes (Deato et al., 2008). TAF7L interacts with another transcription factor to activate genes involved in the formation of adipocytes (‘fat cells’) and adipose tissue (Zhou et al., 2013; Zhou et al., 2014). Finally, TAF9B is a key regulator of transcription in motor neurons (Herrera et al., 2014). The names of some of the genes regulated by the TAFs are shown in brackets.

TAF9B

Deleting the gene for TAF9B in mouse embryonic stem cells revealed that this TAF

  • is not needed for the growth of stem cells, or
  • required for the expression of genes that prevent differentiation:

both of these processes are known to be highly dependent upon the TFIID complex (Pijnappel et al., 2013). However,

  • genes that would normally be expressed specifically in neurons were not up-regulated when cells without the TAF9B gene started to specialise.

Herrera et al. identified numerous genes that can only be switched on when the TAF9B protein is present, which means that it joins a growing list of TAF proteins that are dedicated to controlling the expression of genes in specialised cell types.

TAF9B activates neuron-specific genes by binding to sites that

  • reside outside of these genes’ core promoters.

Further, many of these sites were also bound by a master regulator of motor neuron-specific genes.

TAF7L

 

Whilst most of the fat tissue in humans is white adipose tissue, which contains cells that store fatty molecules, some is brown adipose tissue, or ‘brown fat’, that instead generates heat. When TAF7L promotes the differentiation of brown fat, it up-regulates genes that are targeted by a transcription factor called PPAR-γ; last year it was shown that this transcription factor also promotes the differentiation of white adipose tissue (Zhou et al., 2013).

Mice without the TAF7L gene had 40% less brown fat than wild-type mice, and also grew too much skeletal muscle tissue. TAF7L was specifically required to activate genes that control how brown fat develops and functions. Thus TAF7L expression appears to shift the fate of a stem cell towards brown adipose tissue, potentially at the expense of skeletal muscle, as both cell types develop from the same group of stem cells.

When stem cells with less TAF7L than normal are differentiated in vitro, they yield more muscle than fat cells. Conversely, cells with an excess of TAF7L express brown fat-specific genes and switch off muscle-specific genes.

The work of Herrera et al. and Zhou et al. reinforces the idea that different TAFs

  • provide the flexibility needed to control gene expression in a tissue-specific manner, and
  • enable differentiating cells to change which genes they express rapidly.

However, many interesting questions remain:

Which signals lead to the destruction of core transcription factors?
Are core promoter elements at tissue-specific genes designed to recognise variant TAFs?
What determines whether variant TAFs are incorporated within TFIID, SAGA, or other complexes?

Shortly after RNA polymerase II starts to transcribe a gene, it briefly pauses. Interestingly, a DNA sequence associated with this pausing, called the pause button, closely matches the sequences that bind to two subunits of TFIID (TAF6 and TAF9; Kadonaga, 2012). Consequently, TAF6 and TAF9 might be involved in pausing transcription, and if so, the variant TAF9B could play a similar role at motor neuron genes.

Molecular basis of transcription pausing

Jeffrey W. Roberts
Science 344, 1226 (2014);  http://dx.doi.org:/10.1126/science.1255712
http://www.sciencemag.org/content/344/6189/1226.full.html

During RNA synthesis, RNA polymerase moves erratically along DNA, frequently resting as it produces an RNA copy of the DNA sequence. Such pausing helps coordinate the appearance of a transcript with its utilization by cellular processes; to this end,

  • the movement of RNA polymerase is modulated by mechanisms that determine its rate.
  • For example, pausing is critical to regulatory activities of the enzyme, such as the termination of transcription.
  • It is also essential during early modifications of eukaryotic RNA polymerase II that activate the enzyme for elongation.

 

Two reports analyzing transcription pausing on a global scale in Escherichia coli, by Larson et al. (1) and by Vvedenskaya et al. (2) on page 1285 of this issue, suggest

 

  • new functions of pausing and important aspects of its molecular basis.

 

The studies of Larson et al. and Vvedenskaya et al. follow decades of analysis of bacterial transcription that has illuminated the molecular basis of polymerase pausing events that serve critical regulatory functions.

 

A transcription pause specified by the DNA sequence synchronizes the translation of RNA into protein

 

  • with the transcription of leader regions of operons (groups of genes transcribed together) for amino acid biosynthesis;

 

  • this coordination controls amino acid synthesis in response to amino acid availability (3).

A protein-induced pause occurs when the E. coli initiation factor σ70 restrains RNA polymerase by binding a second occurrence of the “–10” promoter element.

This paused polymerase provides a structure for engaging a transcription antiterminator (the bacteriophage λ Q protein) (4) that, in turn, inhibits transcription pauses, including those essential for transcription termination.

 

Biochemical and structural analyses have identified an endpoint of the pausing process called the “elemental pause” in which the catalytic structure in the active site is distorted,

  • preventing further nucleotide addition (7).

The elemental paused state also involves distinct

  • conformational changes in the polymerase that may favor transcription termination
  • and allow the his and related pauses to be stabilized by RNA hairpins (8).

A consensus sequence for ubiquitous pauses was identified, with two important elements:

  • a preference for pyrimidine [mostly cytosine (C)] at the newly formed RNA end, followed by G to be incorporated next, just as found for the his pause; and
  • a preference for G at position –10 of the RNA (10 nucleotides before the 3′ end).
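To make the consensus concrete, the sketch below scans a sequence for positions that match the two elements just described. It is only an illustration: the function name and the indexing convention are assumptions rather than anything taken from the two reports. Here the RNA 3′ end is treated as position –1, so position –10 lies nine nucleotides upstream of it; shifting the offset by one gives the other possible reading. A match is not evidence of a real pause.

```python
def find_pause_like_sites(seq):
    """Return 0-based indices i where seq[i] could be the RNA 3' end (-1)
    of a consensus-like pause: G at -10, pyrimidine at -1, G at +1."""
    seq = seq.upper().replace("U", "T")  # accept RNA or nontemplate-strand DNA
    hits = []
    for i in range(9, len(seq) - 1):
        g_at_minus_10 = seq[i - 9] == "G"   # assumed indexing: -10 is 9 nt upstream of -1
        pyrimidine_at_end = seq[i] in "CT"  # C is preferred; T/U is also a pyrimidine
        g_next = seq[i + 1] == "G"          # nucleotide to be incorporated next (+1)
        if g_at_minus_10 and pyrimidine_at_end and g_next:
            hits.append(i)
    return hits


# Hypothetical toy sequence: G, eight A's, then C (3' end) and G (next nucleotide).
example = "G" + "A" * 8 + "CG"
print(find_pause_like_sites(example))  # prints [9]
```

A real analysis would weight each position by its observed preference rather than require exact matches; the all-or-nothing test is used here only to keep the sketch short.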

 

 

Polymerase, paused. During transcription, RNA exists in two states as RNA polymerase progresses: pretranslocated, just after the addition of the last nucleotide [here, cytosine (C)]; and posttranslocated, after all nucleic acids have shifted in register by one nucleotide relative to the enzyme, exposing the active site for binding of the next substrate molecule [here, guanine (G)]. The pretranslocated state is dominant in the pause. The critical G-C base (RNA-DNA) pair at position –10 in the pretranslocated state and the nontemplate DNA strand G bound in the polymerase in the posttranslocated state are marked with an asterisk.

Binding of G at position +1 to the core recognition element (CRE) only occurs in the posttranslocated state, which would thus be favored over the pretranslocated state. Hence, if G binding inhibits pausing, then the rate-limiting paused structure must be in the pretranslocated state (a conclusion also made by Larson et al. from biochemical experiments).

This is an important insight into the sequence of protein–nucleic acid interactions that occur in pausing. Vvedenskaya et al. suggest that the actual role of the G binding site is to promote translocation and thus inhibit pausing, to smooth out adventitious pauses in genomic DNA.

The studies by Larson et al. and Vvedenskaya et al. provide a refined and detailed analysis of DNA sequence–induced transcription pausing.

Processive Antitermination

Robert A. Weisberg1* and Max E. Gottesman2

1 Section on Microbial Genetics, Laboratory of Molecular Genetics, National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland 20892-2785, and 2 Institute of Cancer Research, Columbia University, New York, New York 10032

Journal of Bacteriology, Jan. 1999; 181(2): 359–367.

After initiating synthesis of RNA at a promoter, RNA polymerase (RNAP) normally continues to elongate the transcript until it reaches a termination site. Important elements of termination sites are transcribed before polymerase translocation stops, and the resulting RNA is an active element of the termination pathway. Nascent transcripts of intrinsic sites can halt transcription without the assistance of additional factors, and those of Rho-dependent sites recruit the Rho termination protein to the elongation complex. In both cases, RNAP, the transcript, and the template dissociate (reviewed in references 76 and 80).

 

Termination is rarely, if ever, completely efficient, and the expression of downstream genes can be controlled by altering the efficiency of terminator readthrough. Two distinct mechanisms of elongation control have been reported for bacterial RNA polymerases. In one, exemplified by attenuation of the his and trp operons of Salmonella typhimurium and Escherichia coli, respectively,

  • a single terminator is inactivated by interaction with an upstream sequence in the transcript, with a terminator-specific protein, or with a translating ribosome that follows closely behind RNAP (reviewed in references 35 and 104).

In a second, whose prototype is antitermination of phage λ early transcription,

  • polymerase is stably modified to a terminator-resistant form after it leaves the promoter.

In this case, the modified enzyme not only transcribes through sequential downstream terminators,

  • but is also less sensitive to the pause sites that normally delay transcript elongation.

Both pathways are widespread in nature, but in this minireview we consider only the second,

  • known as processive antitermination
    (for previous reviews, see references 22, 23, 27, and 32).

The recent explosive growth in our understanding of transcription elongation (reviewed in references 57, 96, and 99) makes this an especially appropriate time to survey regulatory elements that target the transcription elongation complex.

Antitermination in λ is induced by two quite distinct mechanisms:

  • one is the result of interaction between λ N protein and its targets in the early phage transcripts,
  • the other is the result of an interaction between the λ Q protein and its target in the late phage promoter.

We describe the N mechanism first. Lambda N, a small basic protein of the arginine-rich motif (ARM) (Fig. 1) family of RNA binding proteins, binds to a 15-nucleotide (nt) stem-loop called BOXB (17) (Fig. 2).

 

FIG. 1. [not shown] (A) Alignment of phage N proteins and the HK022 Nun protein. The color groupings reflect the frequency of amino acid substitutions in evolutionarily related protein domains: an amino acid is more likely to be replaced by one in the same color group than by one in a different color group in related proteins (34). The amino-proximal ARM regions were aligned by eye and according to the structures of the P22 and λ ARMs complexed to their cognate nut sites (see text and Fig. 2), and the remainder of the proteins was aligned by ClustalW (38). The dots indicate gaps introduced to improve the alignment. Aside from the ARM regions, the proteins fall into three very distantly related (or unrelated) families: (i) λ and phage 21; (ii) P22, phage L, and HK97; and (iii) HK022 Nun.

FIG. 2. [not shown] BOXA and BOXB RNAs and their interaction with the ARM of their cognate N proteins. The amino acid-nucleotide interactions are shown to the left except for BOXB of phage 21, for which the structure of the complex is unknown. The sequences of BOXA and BOXA-BOXB spacer are shown to the right. The dots to the left and right of the spacer sequences are for alignment. (A) λ N-ARM-BOXB complex (adapted from reference 48 with permission of the publisher). Open circles, pentagons, and rectangles represent phosphates, riboses, and bases, respectively. Watson-Crick base pairs are indicated. The zigzag line denotes a sheared G·A base pair. Open circles, open rectangles, and arrowheads depict ionic, hydrophobic, and hydrogen-bonding interactions, respectively. Guanine-11, indicated by a bold rectangle, is extruded from the BOXB loop (see text). (B) P22 N-ARM-BOXB complex (adapted from reference 15 with permission of the publisher). Open circles, pentagons, rectangles, and ovals represent phosphates, riboses, bases, and amino acids, respectively. The solid pentagons indicate riboses with a C2′-endo pucker. Base stacking, intermolecular hydrogen bonding or electrostatic interactions, intermolecular hydrophobic or van der Waals interactions, intramolecular hydrogen bonds, and Watson-Crick base pairs are indicated. Cytosine-11 is extruded from the loop (see text). Note that the amino-terminal amino acid residue in the complex corresponds to Asn-14 in the complete protein (Fig. 1), and the displayed amino acids are numbered accordingly. (C) NUTL site of phage 21. The arrows indicate the inverted sequence repeats of BOXB.

FIG. 3. [not shown] HK022 put sites and folded PUT RNAs. (A) Alignment of putL and putR (43). The numbers give distances from the start sites of the PL and PR promoters, respectively, and the pairs of arrows indicate inverted sequence repeats. (B) Folded PUTL and PUTR RNAs. The structures, which were generated by energy minimization as described (43), have been partially confirmed by genetic and biochemical studies (7, 43).

The active bacterial elongation complex consists of

  • core RNAP,
  • template, and
  • RNA product.

The 3′ end of the RNA

  • is engaged in the active site of the enzyme,
  • the following ~8 nt are hybridized to the template strand of the DNA, and
  • the next ~9 nt remain closely associated with RNAP (64).
  • About 17 nt of the nontemplate DNA strand are separated from the template strand in the transcription bubble.

Elongation complexes can also contain NusA and/or NusG. These proteins, which

  • increase the stability of the N-mediated antitermination complex (see above),
  • have different effects on elongation.
  • NusA decreases and NusG increases the elongation rate, and
  • both proteins alter termination efficiency in a terminator-specific manner (13, 14, 86; see reference 76).

An elongation complex, unless located at a terminator, is extraordinarily stable,

  • even when translocation is prevented by removal of substrates.

Recent observations suggest that this stability depends mainly on

  • interactions between RNAP and the RNA-DNA hybrid as well as
  • between polymerase and the downstream duplex DNA template (63, 87).

Nascent RNA emerging from the hybrid region and upstream duplex DNA

  • do not appear to be required.

The strength of the RNA-DNA hybrid is believed to

  • assure the lateral stability of the complex.

 

Reducing the strength of the RNA-DNA bonds, for example

  • by incorporation of nucleotide analogs,
  • favors backsliding of RNAP on the template, with consequent
  • disengagement of the 3′ RNA end from the active site, and
  • concerted retreat of the RNA-DNA hybrid region from the 3′ end (65).

Such a disengaged complex retains its resistance to dissociation and

  • is capable of resuming elongation if the original or a newly created 3′ end reengages with the active site (10, 44, 45, 65, 71, 95).

Intrinsic terminators consist of a guanine- and cytosine-rich RNA hairpin stem

  • immediately followed by a short uracil-rich segment
  • within which termination can occur.
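The two sequence features of an intrinsic terminator can be sketched as a naive scan: a GC-rich stretch, its exact reverse complement a short loop away (the hairpin stem), and a T-rich tail immediately downstream (uracil-rich in the RNA). This is a toy illustration only. The function names and every threshold (stem length, loop size, GC fraction, tail length) are arbitrary assumptions, not values from the review, and real terminator prediction relies on folding energies rather than exact matches.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_terminator_like_sites(seq, stem_len=6, min_loop=3, max_loop=8,
                               min_gc=0.6, tail_len=8, min_t=6):
    """Return (stem_start, loop_length) pairs for hairpin-plus-T-tract candidates.

    All thresholds are illustrative assumptions."""
    seq = seq.upper().replace("U", "T")
    hits = []
    for start in range(len(seq) - stem_len + 1):
        left = seq[start:start + stem_len]
        if any(base not in COMPLEMENT for base in left):
            continue  # skip windows containing ambiguous bases
        if sum(base in "GC" for base in left) / stem_len < min_gc:
            continue  # the stem must be GC-rich
        for loop in range(min_loop, max_loop + 1):
            right_start = start + stem_len + loop
            tail_start = right_start + stem_len
            right = seq[right_start:tail_start]
            tail = seq[tail_start:tail_start + tail_len]
            if len(tail) < tail_len:
                break  # longer loops only push the tail further past the end
            if right == reverse_complement(left) and tail.count("T") >= min_t:
                hits.append((start, loop))
    return hits


# Hypothetical toy sequence: stem GCCGCC, loop TTAA, stem GGCGGC, then a T tract.
hairpin = "AA" + "GCCGCC" + "TTAA" + "GGCGGC" + "TTTTTTTT" + "AA"
print(find_terminator_like_sites(hairpin))
```

Overlapping windows can report the same hairpin more than once, which is acceptable for a sketch but would need merging in practice.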

 

If termination does not occur at this point,

  • polymerase continues to elongate the transcript with normal processivity
  • until it reaches the next terminator.

Neither the stem nor the uracil-rich segment

  • is sufficient for termination, although
  • either can transiently slow elongation.

The weakness of base pairing between rU and dA

  • destabilizes the RNA-DNA hybrid in the uracil-rich segment, and
  • this probably contributes to termination.

Formation of the hairpin stem as nascent terminator RNA emerges from polymerase

  • destabilizes the RNA-DNA hybrid and
  • interrupts contacts between the emerging nascent RNA and RNAP (62a).

It might also interfere with the stabilizing interactions between

  • RNAP and the hybrid or those between RNAP and
  • the downstream region of the template.

Cross-linking of nucleic acid to RNAP suggests that

  • both the downstream DNA and the nascent RNA
  • that emerges from the hybrid region, and
  • within which the terminator hairpin might form,
  • are located close to the same regions of the enzyme (64).

Conversely, modifications that render RNAP termination resistant

  • could prevent the terminator stem from destabilizing one or more of these targets,
  • at least while the 3′ end of the RNA is within the uracil-rich segment of the terminator.

The λ N and Q proteins and HK022 PUT RNA

  • also suppress Rho-dependent terminators (43a, 79, 103) which,
  • in contrast to intrinsic terminators, lack a precisely determined termination point.

Rho is an RNA-dependent ATPase that binds to cytosine-rich, unstructured regions in nascent RNA and acts preferentially

  • to terminate elongation complexes that are paused at nearby downstream sites
    (19, 29, 46, 47, 59, 60).

Rho possesses RNA-DNA helicase activity, and this activity is directional,

  • unwinding DNA paired to the 3′ end of the RNA molecule (11, 90).
  • This corresponds to the location of the hybrid and of RNAP
    in an active ternary elongation complex.

The ability of antiterminators to suppress Rho-dependent and -independent terminators

  • suggests that they prevent a step that is common to both classes.

Given the helicase activity of Rho, a likely candidate for this step is disruption of the RNA-DNA hybrid. However, other candidates, such as destabilization of RNAP-template or RNAP-hybrid interactions, are also plausible.

Alternatively, the ability of N, Q, and PUT to suppress RNAP pausing (31, 43, 54, 74)

  • suggests that they prevent Rho-dependent termination
  • by accelerating polymerase away from Rho bound at upstream RNA sites.

This explanation raises the problem of why NusG,

  • which also accelerates polymerase,
  • enhances rather than suppresses Rho-dependent termination (see above).

Clearly, the molecular details of processive antitermination remain poorly understood despite the 30 years that have elapsed since its discovery.

 

 

System wide analyses have underestimated protein abundances and the importance of transcription in mammals

OPEN ACCESS

Jingyi Jessica Li1, 2, Peter J Bickel1 and Mark D Biggin3

1Department of Statistics, University of California, Berkeley, CA, USA

2Departments of Statistics and Human Genetics, University of California, Los Angeles, CA, USA

3Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA

Academic editor – Barbara Engelhardt   http://dx.doi.org:/10.7717/peerj.270

Distributed under Creative-Commons CC-0

ABSTRACT

Large scale surveys in mammalian tissue culture cells suggest that the protein expressed at the median abundance is present at 8,000–16,000 molecules per cell and that differences in mRNA expression between genes explain only 10–40% of the differences in protein levels. We find, however, that these surveys have significantly underestimated protein abundances and the relative importance of transcription.

Using individual measurements for 61 housekeeping proteins to rescale whole proteome data from Schwanhäusser et al. (2011), we find that the median protein detected is expressed at 170,000 molecules per cell and that our corrected protein abundance estimates show a higher correlation with mRNA abundances than do the uncorrected protein data. In addition, we estimated the impact of further errors in mRNA and protein abundances using direct experimental measurements of these errors.

The resulting analysis suggests that mRNA levels explain at least

  • 56% of the differences in protein abundance for the 4,212 genes

detected by Schwanhäusser et al. (2011), though because one major source of error could not be estimated the true percent contribution should be higher.

We also employed a second, independent strategy to

  • determine the contribution of mRNA levels to protein expression.

The variance in translation rates directly measured by ribosome profiling is only 12% of that inferred by Schwanhäusser et al. (2011), and

  • the measured and inferred translation rates correlate poorly (R² = 0.13).

Based on this, our second strategy suggests that

  • mRNA levels explain ~81% of the variance in protein levels.

We also determined the percent contributions of

  • transcription,
  • RNA degradation,
  • translation
  • and protein degradation

to the variance in protein abundances using both of our strategies.

While the magnitudes of the two estimates vary, they both suggest that

  • transcription plays a more important role than the earlier studies implied and
  • translation a much smaller role.
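The central statistical point, that measurement error lowers the apparent share of protein variance explained by mRNA, can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' method: the noise levels, sample size, and variable names are invented for illustration, and all values are treated as log-scale abundances.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation: the fraction of variance in y explained by x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical log-scale abundances for 2,000 simulated genes.
log_mrna = [rng.gauss(0.0, 1.0) for _ in range(2000)]
# "True" protein levels: mRNA plus genuine post-transcriptional variation (assumed SD 0.5).
true_log_protein = [m + rng.gauss(0.0, 0.5) for m in log_mrna]
# Measured protein levels: true levels plus measurement error (assumed SD 1.0).
measured_log_protein = [p + rng.gauss(0.0, 1.0) for p in true_log_protein]

r2_true = r_squared(log_mrna, true_log_protein)
r2_measured = r_squared(log_mrna, measured_log_protein)
print(f"R^2 with error-free protein data: {r2_true:.2f}")
print(f"R^2 with noisy protein data:      {r2_measured:.2f}")
```

Under these assumed noise levels the error-free value is expected to be near 0.8 and the noisy one near 0.44, which is why correcting for measurement error raises the estimated contribution of mRNA levels.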

Finally, the above estimates only apply to those genes whose mRNA and protein expression was detected. Based on a detailed analysis by Hebenstreit et al. (2012), we estimate that approximately

  • 40% of genes in a given cell within a population express no mRNA.

Since there can be no translation in the absence of mRNA, we argue that

  • differences in translation rates can play no role in determining the expression levels for the ~40% of genes that are non-expressed.

Subjects Bioinformatics, Computational Biology

Keywords Transcription, Translation, Mass spectrometry, Gene expression, Protein abundance

How to cite this article Li et al. (2014), System wide analyses have underestimated protein abundances and the importance of transcription in mammals. PeerJ 2:e270; 

http://dx.doi.org:/10.7717/peerj.270

 

 

Assessing quality and completeness of human transcriptional regulatory pathways on a genome-wide scale

Evgeny Shmelkov1,2, Zuojian Tang2, Iannis Aifantis3, Alexander Statnikov2,4

Shmelkov et al. Biology Direct 2011, 6:15  http://www.biology-direct.com/content/6/1/15

 

Background: Pathway databases are becoming increasingly important and almost omnipresent in most types of biological and translational research. However, little is known about the quality and completeness of pathways stored in these databases. The present study conducts a comprehensive assessment of transcriptional regulatory pathways in humans for seven well-studied transcription factors: MYC, NOTCH1, BCL6, TP53, AR, STAT1, and RELA.

The employed benchmarking methodology first

  • involves integrating genome-wide binding with functional gene expression data to derive direct targets of transcription factors.
  • Then the lists of experimentally obtained direct targets are compared with relevant lists of transcriptional targets from 10 commonly used pathway databases.
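The two benchmarking steps above can be sketched with plain set operations plus a significance test for the overlap. The sketch assumes a one-sided hypergeometric test, which is a common choice for gene-list overlaps; the excerpt does not say which test the authors used. The function names, gene labels, and universe size are all hypothetical.

```python
from math import comb


def direct_targets(bound_genes, responsive_genes):
    """Step 1: genes that are both bound by the factor and change expression
    when the factor is perturbed (an experimentally derived gold standard)."""
    return set(bound_genes) & set(responsive_genes)


def overlap_p_value(universe_size, gold_standard, database_targets):
    """Step 2: probability of an overlap at least this large if the database
    list were a random draw from the gene universe (hypergeometric tail)."""
    gold = set(gold_standard)
    listed = set(database_targets)
    k = len(gold & listed)
    K, n, N = len(gold), len(listed), universe_size
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


# Hypothetical toy data: a universe of 10 genes labelled g1..g10.
bound = {"g1", "g2", "g3", "g7"}
responsive = {"g1", "g2", "g3", "g9"}
gold = direct_targets(bound, responsive)          # {"g1", "g2", "g3"}
database = {"g1", "g2", "g4"}                     # targets listed in a pathway database
print(sorted(gold), overlap_p_value(10, gold, database))
```

In this toy case two of the three database targets are in the gold standard, yet the p-value is about 0.18, showing how a visible overlap can still fail to reach significance when the lists are small.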

Results: The results of this study show that for the majority of pathway databases,

  • the overlap between experimentally obtained target genes and targets reported in transcriptional regulatory pathway databases is surprisingly small and often is not statistically significant.

The only exception is the MetaCore pathway database, which yields a statistically significant intersection with experimental results in 84% of cases. Additionally, we suggest that

  • the lists of experimentally derived direct targets obtained in this study can be used to reveal new biological insight in transcriptional regulation and
  • suggest novel putative therapeutic targets in cancer.

Conclusions: Our study opens a debate on validity of using many popular pathway databases to obtain transcriptional regulatory targets. We conclude that the choice of pathway databases should be informed by solid scientific evidence and rigorous empirical evaluation.

 

Figure 2. Illustration of statistical methodology for comparison between a gold-standard and a pathway database

 

Additional material

Additional file 1: Supplementary Information. Table S1: Functional gene expression data. Table 2: Transcription factor-DNA binding data. Table S3: Most confident direct transcriptional targets of each of the four transcription factors. These targets were obtained by overlapping several gold-standards obtained with different datasets for the same transcription factor. Table S4: Genes directly regulated by two or more of the three transcription factors: MYC, NOTCH1, and RELA. Figure S1: Comparison of gene sets of transcriptional targets derived from ten different pathway databases by Jaccard index. In cases where the Jaccard index of an overlap could not be determined because two empty gene lists were being compared, we assigned a value of 0. Cells are colored according to the Jaccard index, from white (Jaccard index equal to 0) to dark-orange (Jaccard index equal to 1). Each sub-figure gives results for a different transcription factor: (a) AR, (b) BCL6, (c) MYC, (d) NOTCH1, (e) RELA, (f) STAT1, (g) TP53.
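The Jaccard index used in Figure S1 is the size of the intersection of two gene sets divided by the size of their union. A minimal sketch follows, including the convention stated above of returning 0 when both lists are empty; the function name and the example gene lists are illustrative only.

```python
def jaccard_index(genes_a, genes_b):
    """Jaccard index of two gene lists: |A ∩ B| / |A ∪ B|.

    Returns 0.0 when both lists are empty, following the convention
    described in the supplementary figure legend."""
    a, b = set(genes_a), set(genes_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical target lists from two pathway databases.
print(jaccard_index(["MYC", "TP53"], ["TP53", "AR"]))  # one shared gene out of three
print(jaccard_index([], []))                           # prints 0.0
```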

 

http://dx.doi.org:/10.1186/1745-6150-6-15

 

Cite this article as: Shmelkov et al.: Assessing quality and completeness of human transcriptional regulatory pathways on a genome-wide scale. Biology Direct 2011 6:15

 

 

The Functional Consequences of Variation in Transcription Factor Binding
Darren A. Cusanovich1, Bryan Pavlovic1,2, Jonathan K. Pritchard1,2,3*, Yoav Gilad1*

1 Department of Human Genetics, University of Chicago, 2 Howard Hughes Medical Institute, University of Chicago, Chicago, Illinois, 3 Departments of Genetics and Biology and Howard Hughes Medical Institute, Stanford University, Stanford, California

 

One goal of human genetics is to understand how the information for precise and dynamic gene expression programs is encoded in the genome. The interactions of transcription factors (TFs) with DNA regulatory elements clearly play an important role in determining gene expression outputs, yet the regulatory logic underlying functional transcription factor binding is poorly understood. Many studies have focused on characterizing the genomic locations of TF binding, yet it is unclear to what extent TF binding at any specific locus has functional consequences with respect to gene expression output.

To evaluate the context of functional TF binding we knocked down

  • 59 TFs and chromatin modifiers in one HapMap lymphoblastoid cell line.
  • We identified genes whose expression was affected by the knockdowns.
  • We intersected the gene expression data with transcription factor binding data
    (based on ChIP-seq and DNase-seq) within 10 kb of the transcription start sites

This combination of data allowed us to infer functional TF binding.

  • we found that only a small subset of genes bound by a factor were differentially expressed following the knockdown of that factor, suggesting that
  • most interactions between TF and chromatin do not result in measurable changes in gene expression levels of putative target genes.
  • functional TF binding is enriched in regulatory elements that harbor
    • a large number of TF binding sites,
    • at sites with predicted higher binding affinity, and
    • at sites that are enriched in genomic regions annotated as ‘‘active enhancers.’’
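The intersection described above can be sketched as two steps: call a gene "bound" if any binding site lies within 10 kb of its transcription start site, then ask what fraction of bound genes changed expression after the knockdown. This is a simplified illustration under assumed data structures; the function names, coordinates, and gene labels are hypothetical, and the study's actual pipeline (ChIP-seq and DNase-seq processing, differential expression calls) is not reproduced here.

```python
def genes_bound_near_tss(tss_by_gene, binding_sites, window=10_000):
    """Genes with at least one binding site within `window` bp of the TSS.

    tss_by_gene: {gene: (chromosome, tss_position)}
    binding_sites: iterable of (chromosome, position)"""
    bound = set()
    for gene, (chrom, tss) in tss_by_gene.items():
        for site_chrom, site_pos in binding_sites:
            if site_chrom == chrom and abs(site_pos - tss) <= window:
                bound.add(gene)
                break  # one nearby site is enough to call the gene bound
    return bound


def functional_binding_fraction(bound_genes, differentially_expressed):
    """Fraction of bound genes that are also differentially expressed."""
    if not bound_genes:
        return 0.0
    return len(set(bound_genes) & set(differentially_expressed)) / len(bound_genes)


# Hypothetical toy data.
tss = {"g1": ("chr1", 50_000), "g2": ("chr1", 200_000), "g3": ("chr2", 50_000)}
sites = [("chr1", 55_000), ("chr1", 195_000), ("chr2", 100_000)]
bound = genes_bound_near_tss(tss, sites)               # g1 and g2 are bound
print(functional_binding_fraction(bound, {"g1", "g3"}))  # prints 0.5
```

A low value of this fraction across many knockdowns is what the authors summarize as most binding having no measurable effect on the expression of putative targets.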

Author Summary

An important question in genomics is to understand how a class of proteins called “transcription factors” controls the expression level of other genes in the genome in a cell type-specific manner – a process that is essential to human development. One major approach to this problem is to study where these transcription factors bind in the genome, but this does not tell us about the effect of that binding on gene expression levels and it is generally accepted that much of the binding does not strongly influence gene expression. To address this issue, we artificially reduced the concentration of 59 different transcription factors in the cell and then examined which genes were impacted by the reduced transcription factor level. Our results implicate some attributes that might influence what binding is functional, but they also suggest that a simple model of functional vs. non-functional binding may not suffice.

Citation: Cusanovich DA, Pavlovic B, Pritchard JK, Gilad Y (2014) The Functional Consequences of Variation in Transcription Factor Binding. PLoS Genet 10(3):e1004226. http://dx.doi.org:/10.1371/journal.pgen.1004226

Editor: Yitzhak Pilpel, Weizmann Institute of Science, Israel

 

 

Figure 2. Effect sizes for differentially expressed genes. Boxplots of absolute Log2(fold-change) between knockdown arrays and control arrays for all genes identified as differentially expressed in each experiment. The gray bar indicates the interquartile range across all genes differentially expressed in all knockdowns. Boxplots are ordered by the number of genes differentially expressed in each experiment. Outliers were not plotted.

http://dx.doi.org:/10.1371/journal.pgen.1004226.g002

 

 

Figure 3. Intersecting binding data and expression data for each knockdown. (a) Example Venn diagrams showing the overlap of binding and differential expression for the knockdowns of HCST and IRF4 (the same genes as in Figure 1). (b) Boxplot summarizing the distribution of the fraction of all expressed genes that are bound by the targeted gene or downstream factors. (c) Boxplot summarizing the distribution of the fraction of bound genes that are classified as differentially expressed, using an FDR of either 5% or 20%.

http://dx.doi.org:/10.1371/journal.pgen.1004226.g003

 

 

Figure 4. Degree of binding correlated with function. Boxplots comparing (a) the number of sites bound, and (b) the number of differentially expressed transcription factors binding events near functionally or non-functionally bound genes. We considered binding for siRNA-targeted factor and any factor differentially expressed in the knockdown. (c) Focusing only on genes differentially expressed in common between each pairwise set of knockdowns we tested for enrichments of functional binding (y-axis). Pairwise comparisons between knock-down experiments were binned by the fraction of differentially expressed transcription factors in common between the two experiments. For these boxplots, outliers were not plotted.

http://dx.doi.org:/10.1371/journal.pgen.1004226.g004

 

Figure 5. Distribution of functional binding about the TSS. (a) A density plot of the distribution of bound sites within 10 kb of the TSS for both functional and non-functional genes. Inset is a zoom-in of the region ±1 kb from the TSS. (b) Boxplots comparing the distances from the TSS to the binding sites for functionally bound genes and non-functionally bound genes. For the boxplots, 0.001 was added before log10 transforming the distances and outliers were not plotted.

http://dx.doi.org:/10.1371/journal.pgen.1004226.g005

 

 

 

Figure 6. Magnitude and direction of differential expression after knockdown. (a) Density plot of all Log2(fold-changes) between the knockdown arrays and controls for genes that are differentially expressed at 5% FDR in one of the knockdown experiments as well as bound by the targeted transcription factor. (b) Plot of the fraction of differentially expressed putative direct targets that were up-regulated in each of the knockdown experiments.

http://dx.doi.org:/10.1371/journal.pgen.1004226.g006

 

To test whether the number of paralogs or the degree of similarity with the closest paralog for each transcription factor knocked down might influence the number of genes differentially expressed in our experiments, we obtained definitions of paralogy and the calculations of percent identity for 29 different factors from Ensembl’s BioMart (http://useast.ensembl.org/biomart/martview/) [31]. We used genome build GRCh37.p13.

For each gene, we counted the number of paralogs classified as a ‘‘within_species_paralog’’. After selecting only genes considered a ‘‘within_species_paralog’’, we also assigned the maximum percent identity as the closest paralog.
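The paralog summary described above can be sketched in a few lines. This is an illustration only: the row layout (gene, homology type, percent identity) and the gene labels are assumptions, not the actual BioMart export format used by the authors.

```python
from collections import defaultdict


def summarize_paralogs(rows):
    """rows: iterable of (gene, homology_type, percent_identity).

    Returns two dicts keyed by gene: the number of within-species
    paralogs, and the percent identity of the closest paralog."""
    counts = defaultdict(int)
    closest = {}
    for gene, homology_type, identity in rows:
        # Keep only rows classified as within-species paralogs.
        if homology_type != "within_species_paralog":
            continue
        counts[gene] += 1
        closest[gene] = max(closest.get(gene, 0.0), identity)
    return dict(counts), closest


# Hypothetical toy rows, not real Ensembl data.
rows = [
    ("TF_X", "within_species_paralog", 41.5),
    ("TF_X", "within_species_paralog", 67.0),
    ("TF_X", "other_paralog", 90.0),
    ("TF_Y", "within_species_paralog", 23.0),
]
paralog_counts, closest_identity = summarize_paralogs(rows)
print(paralog_counts, closest_identity)
```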

To evaluate the effect that an independent assignment of target genes to regulatory regions might have on our analyses, we used the definition of target genes defined by Thurman et al. (ftp://ftp.ebi.ac.uk/pub/databases/…)

which use correlations in DNase hypersensitivity between distal and proximal regulatory regions across different cell types to link distal elements to putative target genes [38].

We intersected the midpoints of our called binding events (defined above) with these regulatory elements in order to assign our binding events to specific target genes and then re-analyzed the overlap between

binding and differential expression in our experiments.
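Assigning binding events to target genes by midpoint is an interval lookup. The sketch below assumes half-open coordinates and a simple tuple layout for both inputs; these conventions and all names are hypothetical and are not taken from the paper or from the Thurman et al. files.

```python
def assign_targets(binding_events, elements):
    """binding_events: list of (chrom, start, end).
    elements: list of (chrom, start, end, target_gene).

    A binding event is assigned to every gene whose linked regulatory
    element contains the event midpoint (half-open intervals assumed)."""
    assignments = []
    for chrom, start, end in binding_events:
        midpoint = (start + end) // 2
        genes = sorted(
            {gene for c, s, e, gene in elements
             if c == chrom and s <= midpoint < e}
        )
        assignments.append(((chrom, start, end), genes))
    return assignments


# Hypothetical toy coordinates.
events = [("chr1", 100, 200), ("chr1", 900, 1000), ("chr2", 100, 200)]
elements = [("chr1", 120, 180, "GENE_A"), ("chr1", 140, 400, "GENE_B")]
assigned = assign_targets(events, elements)
print(assigned)
```

A linear scan is adequate for a toy example; for genome-scale data an interval tree or a sorted sweep would be the usual choice.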

PLOS Genetics 6 Mar 2014; 10 (3), e1004226

 

 

 

The essential biology of the endoplasmic reticulum stress response for structural and computational biologists

Sadao Wakabayashi, Hiderou Yoshida*

Department of Molecular Biochemistry, Graduate School of Life Science, University of Hyogo, Hyogo 678-1297, Japan

CSBJ Mar 2013; 6(7), e201303010, http://dx.doi.org/10.5936/csbj.201303010

 

Abstract: The endoplasmic reticulum (ER) stress response is a cytoprotective mechanism that maintains homeostasis of the ER by

  • upregulating the capacity of the ER in accordance with cellular demands.

If the ER stress response cannot function correctly, because of reasons such as aging, genetic mutation or environmental stress,

  • unfolded proteins accumulate in the ER and cause ER stress-induced apoptosis,
  • resulting in the onset of folding diseases,
    • including Alzheimer’s disease and diabetes mellitus.

Although the mechanism of the ER stress response has been analyzed extensively by biochemists, cell biologists and molecular biologists, many aspects remain to be elucidated. For example,

  • it is unclear how sensor molecules detect ER stress, or
  • how cells choose the two opposite cell fates
    (survival or apoptosis) during the ER stress response.

To resolve these critical issues, structural and computational approaches will be indispensable, although the mechanism of the ER stress response is complicated and difficult to understand holistically at a glance. Here, we provide a concise introduction to the mammalian ER stress response for structural and computational biologists.

The basic mechanism of the mammalian ER stress response

The mammalian ER stress response consists of three pathways: the ATF6, IRE1 and PERK pathways, of which the main functions are

  • augmentation of folding capacity,
  • augmentation of ERAD capacity, and
  • translational attenuation, respectively.

Although these response pathways cross-talk with each other and have several branched subpathways, we focus on the main pathways in this section.

  • The ATF6 pathway regulates the transcriptional induction of ER chaperone genes
  • pATF6(P) is a sensor molecule comprising a type II transmembrane protein residing on the ER membrane (Figure 2).

When pATF6(P) detects ER stress,

  • the protein is transported to the Golgi apparatus through vesicular transport in a COP-II vesicle
  • and is sequentially cleaved by two proteases residing in the Golgi,
    • namely site 1 protease (S1P) and site 2 protease (S2P)

The cytoplasmic portion of pATF6(P) (pATF6(N))

  1. is released from the Golgi membrane,
  2. translocates into the nucleus,
  3. binds to an enhancer element called the ER stress response element (ERSE),
  4. and activates the transcription of ER chaperone genes,
    • including BiP, GRP94, calreticulin and protein disulfide isomerase (PDI)

The consensus nucleotide sequence of ERSE is CCAAT(N9)CCACG, and pATF6(N) recognizes both the CCACG portion and another transcription factor NF-Y,

  • which binds to the CCAAT portion
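The ERSE consensus CCAAT(N9)CCACG translates directly into a pattern search. The sketch below scans only the given strand of a DNA string and reports 0-based start positions; the function name and the test sequence are hypothetical.

```python
import re

# CCAAT, then any nine bases, then CCACG. The lookahead lets
# overlapping occurrences be reported as well.
ERSE_PATTERN = re.compile(r"(?=CCAAT[ACGT]{9}CCACG)")


def find_erse(sequence):
    """Return 0-based start positions of ERSE consensus matches."""
    return [m.start() for m in ERSE_PATTERN.finditer(sequence.upper())]


# Hypothetical sequence with one consensus site starting at index 2.
toy = "GG" + "CCAAT" + "ACGTACGTA" + "CCACG" + "TT"
erse_hits = find_erse(toy)
print(erse_hits)
```

A fuller scan would also search the reverse complement, since an enhancer element can lie on either strand.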

NF-Y is a general transcription factor required for

  • the transcription of various human genes

 

Figure 2. The ATF6 pathway. The sensor molecule pATF6(P) located on the ER membrane is transported to the Golgi apparatus by transport vesicles in response to ER stress. In the Golgi apparatus, pATF6(P) is sequentially cleaved by two proteases, S1P and S2P, resulting in release of the cytoplasmic portion pATF6(N) from the ER membrane. pATF6(N) translocates into the nucleus and activates transcription of ER chaperone genes through binding to the cis-acting enhancer ERSE.

 

Figure 3. The IRE1 pathway. In normal growth conditions, the sensor molecule IRE1 is an inactive monomer, whereas IRE1 forms an active oligomer in response to ER stress. Activated IRE1 converts unspliced XBP1 mRNA to mature mRNA by the cytoplasmic mRNA splicing. From mature XBP1 mRNA, an active transcription factor pXBP1(S) is translated and activates the transcription of ERAD genes through binding to the enhancer UPRE.

 

Figure 4. The PERK pathway. When PERK detects unfolded proteins in the ER, PERK phosphorylates eIF2α, resulting in translational attenuation and translational induction of ATF4. ATF4 activates the transcription of target genes encoding translation factors, anti-oxidation factors and a transcription factor CHOP. Other kinases such as PKR, GCN2 and HRI also phosphorylate eIF2α, and phosphorylated eIF2α is dephosphorylated by CReP, PP1C-GADD34 and p58IPK

 

Figure 7. Three functions of pXBP1(U). pXBP1(U) translated from XBP1(U) mRNA binds to pXBP1(S) and enhances its degradation. The CTR region of pXBP1(U) interacts with the ribosome tunnel and slows translation, while the HR2 region anchors XBP1(U) mRNA to the ER membrane, in order to enhance splicing of XBP1(U) mRNA by IRE1.

 

Figure 8. Major pathways of ER stress-induced apoptosis. ER stress induces apoptosis through various pathways, including transcriptional induction of CHOP by the PERK and ATF6 pathways, the IRE1-TRAF2 pathway and the caspase-12 pathway.

If cells are damaged by strong and sustained ER stress that they cannot resolve, and the persisting ER stress hampers the survival of the organism, the ER stress response activates the apoptotic pathways and disposes of the damaged cells from the body.

Computational simulation of response pathways to analyze the decision mechanism that determines cell fate (survival or apoptosis) provides a valuable analysis tool, although there have been few such studies to date.



RNA and the Transcription of the Genetic Code

Curator: Larry H. Bernstein, MD, FCAP

 

 

This portion of the series is a follow-up to the series on the replication of the genetic code (DNA).  It may be considered alone, or as a tenth article.  DNA has become far more than was envisioned 60 years ago, and the same is true of RNA, which was opened to further investigation by Roger Kornberg, Nobel Laureate and son of the Nobel Laureate Arthur Kornberg.  Arthur Kornberg studied DNA polymerase and, with his Nobel associate, attracted the finest minds in biochemistry and built the best academic department of biochemistry at Stanford University.  RNA is associated with RNA polymerase as DNA is associated with DNA polymerase.  We have already highlighted the several critical reactions involved in the step-by-step replication of DNA.  The classic model has dictated DNA-RNA-protein.  We shall here look at the amazing view that RNA is heterogeneous, and is involved in living processes in health and disease.

 

 

Transcription (Wikipedia)

Transcription is the first step of gene expression, in which a particular segment of DNA is copied into RNA

Both RNA and DNA are nucleic acids, which use base pairs of nucleotides as a complementary language

  • that can be converted back and forth from DNA to RNA by the action of the correct enzymes.

During transcription, a DNA sequence is read by an RNA polymerase, which produces a complementary, antiparallel RNA strand.

As opposed to DNA replication, transcription results in

  • an RNA complement that includes the nucleotide uracil (U) in all instances where thymine (T) would have occurred in a DNA complement.

Also unlike DNA replication where DNA is synthesized, transcription does not involve an RNA primer to initiate RNA synthesis.

Eukaryotic transcription is the elaborate process that eukaryotic cells use to copy genetic information stored in DNA into units of RNA replica. Gene transcription occurs in both eukaryotic and prokaryotic cells.
A eukaryotic cell has a nucleus that separates the processes of transcription and translation. Eukaryotic transcription occurs within the nucleus.

The complexity of the eukaryotic genome necessitates a great variety and complexity of gene expression control.

Transcription can be reduced to the following steps, each moving like a wave along the DNA.

  1. One or more sigma factors initiate transcription of a gene by enabling binding of RNA polymerase to promoter DNA.
  2. RNA polymerase moves a transcription bubble, like the slider of a zipper, which splits the double helix DNA molecule into two strands of unpaired DNA nucleotides, by breaking the hydrogen bonds between complementary DNA nucleotides.
  3. RNA polymerase adds matching RNA nucleotides that are paired with complementary DNA nucleotides of one DNA strand.
  4. RNA sugar-phosphate backbone forms with assistance from RNA polymerase to form an RNA strand.
  5. Hydrogen bonds of the untwisted RNA + DNA helix break, freeing the newly synthesized RNA strand.
  6. If the cell has a nucleus, the RNA may be further processed (with the addition of a 3′ poly-A tail and a 5′ cap) and exits to the cytoplasm through the nuclear pore complex.

The stretch of DNA transcribed into an RNA molecule is called a transcription unit and encodes at least one gene. If the gene transcribed encodes a protein, the result of transcription is messenger RNA (mRNA), which will then be used to create that protein via the process of translation. Alternatively, the transcribed gene may encode a non-coding RNA (such as microRNA or lincRNA), a ribosomal RNA (rRNA) or transfer RNA (tRNA), which are components of the protein-assembly process, or other ribozymes.[1]

Making RNA replication of gene in eukaryotic cells

Transcription is the process of copying genetic information stored in a DNA strand into a transportable complementary strand of RNA.[1] Eukaryotic transcription takes place in the nucleus of the cell and proceeds in three sequential stages: initiation, elongation, and termination.[1] The transcriptional machinery that catalyzes this complex reaction has at its core three multi-subunit RNA polymerases.[1]

Protein coding genes are transcribed into messenger RNAs (mRNAs) that carry the information from DNA to the site of protein synthesis.[1] Although mRNAs possess great diversity, they are not the most abundant RNA species made in the cell. The so-called non-coding RNAs account for the large majority of the transcriptional output of a cell.[2] These non-coding RNAs perform a variety of important cellular functions.[2]

RNA Polymerase

Eukaryotes have three nuclear RNA polymerases, each with distinct roles and properties

Name | Location | Product
RNA Polymerase I (Pol I, Pol A) | nucleolus | larger ribosomal RNAs (rRNA) (28S, 18S, 5.8S)
RNA Polymerase II (Pol II, Pol B) | nucleus | messenger RNA (mRNA), most small nuclear RNAs (snRNAs), small interfering RNAs (siRNAs) and microRNAs (miRNAs)
RNA Polymerase III (Pol III, Pol C) | nucleus (and possibly the nucleolus-nucleoplasm interface) | transfer RNA (tRNA) and other small RNAs, including the small 5S ribosomal RNA (5S rRNA), snRNA U6, signal recognition particle RNA (SRP RNA) and other stable short RNAs

RNA polymerase I (Pol I)

  • catalyzes the transcription of all rRNA genes except 5S.[3][4]

These rRNA genes are organized into a single transcriptional unit and are transcribed into a continuous transcript. This precursor is then processed into

  • three rRNAs: 18S, 5.8S, and 28S.

The transcription of rRNA genes

  1. takes place in a specialized structure of the nucleus called the nucleolus,[5] where
  2. the transcribed rRNAs are combined with proteins to form ribosomes.[6]

RNA polymerase II (Pol II)

  • is responsible for the transcription of all mRNAs, some snRNAs, siRNAs, and all miRNAs.[3][4]

Many Pol II transcripts exist transiently as single strand precursor RNAs (pre-RNAs) that

  • are further processed to generate mature RNAs.[1]

For example, precursor mRNAs (pre-mRNAs) are extensively processed before exiting into the cytoplasm through the nuclear pore for protein translation.

RNA polymerase III (Pol III) transcribes small non-coding RNAs, including tRNAs, 5S rRNA, U6 snRNA, SRP RNA, and other stable short RNAs such as ribonuclease P RNA.[7]

Structure of eukaryotic RNA polymerase II (light blue) in complex with α-amanitin (red), a strong poison found in death cap mushrooms that targets this vital enzyme

RNA Polymerases I, II, and III contain 14, 12, and 17 subunits, respectively.[8] All three eukaryotic polymerases have five core subunits that exhibit

  • homology with the β, β’, αI, αII, and ω subunits of E. coli RNA polymerase.

An identical ω-like subunit (RPB6) is used by all three eukaryotic polymerases,

  • while the same α-like subunits are used by Pol I and III.

The three eukaryotic polymerases share four other common subunits among themselves. The remaining subunits are unique to each RNA polymerase.

The additional subunits found in Pol I and Pol III relative to Pol II, are

  • homologous to Pol II transcription factors.[8]

Crystal structures of RNA polymerases I[9] and II [10] provide an opportunity to understand the interactions among the subunits and the molecular mechanism of eukaryotic transcription in atomic detail.

The carboxyl terminal domain (CTD) of RPB1, the largest subunit of RNA polymerase II,

  • plays an important role in bringing together the machinery necessary for the synthesis and processing of Pol II transcripts.[11]

Long and structurally disordered, the CTD

  • contains multiple repeats of heptapeptide sequence YSPTSPS
  1. that are subject to phosphorylation and
  2. other posttranslational modifications during the transcription cycle.

These modifications and their regulation constitute

  • the operational code for the CTD to control
  1. transcription initiation,
  2. elongation and
  3. termination and
  • to couple transcription and RNA processing.[11]

A DNA transcription unit encoding for a protein contains

  • not only the sequence that will eventually be directly translated into the protein (the coding sequence)
  • but also regulatory sequences that direct and regulate the synthesis of that protein.

The regulatory sequence before (i.e., upstream from) the coding sequence is called the five prime untranslated region (5′UTR);

the sequence following (downstream from) the coding sequence is called the three prime untranslated region (3′UTR).

Initiation

The initiation of gene transcription in eukaryotes occurs in specific steps.[1]

First, an RNA polymerase along with general transcription factors binds to the promoter region of the gene to form a closed complex called the preinitiation complex.

The subsequent transition of the complex from the closed state to the open state results in

  • the melting or separation of the two DNA strands and
  • the positioning of the template strand to the active site of the RNA polymerase.

Without the need of a primer,

  • RNA polymerase can initiate the synthesis of a new RNA chain, using the template DNA strand to guide ribonucleotide selection and polymerization chemistry.[1]

However, many of the initiated syntheses are aborted

  • before the transcripts reach a significant length (~10 nucleotides).

During these abortive cycles, the polymerase keeps making and releasing short transcripts

  • until it is able to produce a transcript that surpasses ten nucleotides in length.

Once this threshold is attained, RNA polymerase escapes the promoter and

  • transcription proceeds to the elongation phase.[1]

Eukaryotic promoters and general transcription factors

Pol II-transcribed genes contain a region

  • in the immediate vicinity of the transcription start site (TSS) that binds and positions the preinitiation complex.

This region is called the core promoter because of its essential role in transcription initiation.[12][13] Different classes of sequence elements are found in the promoters. For example,

  • the TATA box is the highly conserved DNA recognition sequence for the TATA box binding protein, TBP,
  • whose binding initiates transcription complex assembly at many genes.

Eukaryotic genes

  • contain regulatory sequences beyond the core promoter.

These cis-acting control elements

  • bind transcriptional activators or repressors to increase or decrease transcription from the core promoter.

Well-characterized regulatory elements include enhancers, silencers, and insulators.

These regulatory sequences

  • can be spread over a large genomic distance, sometimes located
  • hundreds of kilobases from the core promoters.[1]

General transcription factors are

  • a group of proteins involved in transcription initiation and regulation.[1]

These factors typically have DNA-binding domains that

  1. bind specific sequence elements of the core promoter and
  2. help recruit RNA polymerase to the transcriptional start site.

General transcription factors for RNA polymerase II include TFIID, TFIIA, TFIIB, TFIIF, TFIIE, and TFIIH.[1][14][15]

Transcription has some proofreading mechanisms

  • but they are fewer and less effective than the controls for copying DNA; therefore, transcription has a lower copying fidelity than DNA replication.[2]

As in DNA replication, DNA is read from 3′ end → 5′ end during transcription. Meanwhile,

  • the complementary RNA is created from the 5′ end → 3′ end direction.

This means its 5′ end is created first in base pairing. Although DNA is arranged as two antiparallel strands in a double helix, only

one of the two DNA strands, called the template strand, is used for transcription.

This is because RNA is only single-stranded, as opposed to double-stranded DNA. The other DNA strand (the non-template strand) is called the coding strand,

  • because its sequence is the same as the newly created RNA transcript (except for the substitution of uracil for thymine).

The use of only the 3′ end → 5′ end strand eliminates the need for the Okazaki fragments seen in DNA replication.[1]
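The relationship between the template strand, the coding strand and the transcript can be made concrete with a small sketch. The function names and the six-base example are hypothetical; the input is assumed to be the template strand written in the 3′ → 5′ direction, so the output reads 5′ → 3′.

```python
# Base pairing used when RNA is built against a DNA template.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Base pairing between the two DNA strands.
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Return the RNA (5'->3') made from a template strand given 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def coding_strand(template_3_to_5):
    """Return the coding (non-template) strand, written 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in template_3_to_5.upper())


template = "TACGGT"            # 3'-TACGGT-5'
rna = transcribe(template)     # 5'-AUGCCA-3'
coding = coding_strand(template)  # 5'-ATGCCA-3'
print(rna, coding)
```

The transcript equals the coding strand with uracil in place of thymine, which is why the non-template strand is called the coding strand.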

In virology, the term transcription may also be used when referring to mRNA synthesis from an RNA molecule (i.e. RNA replication). For instance,

  • the genome of a negative-sense single-stranded RNA (ssRNA -) virus may serve as a template to transcribe a positive-sense single-stranded RNA (ssRNA +) molecule,
  • since the positive-sense strand contains the information needed to translate the viral proteins for viral replication afterwards.

This process is catalysed by a viral RNA replicase.

Transcription is divided into pre-initiation, initiation, promoter clearance, elongation and termination.

Pre-initiation

In eukaryotes, RNA polymerase, and therefore the initiation of transcription, requires

  • the presence of a core promoter sequence in the DNA.

Promoters are regions of DNA that promote transcription and, in eukaryotes, are found at -30, -75, and -90 base pairs

  • upstream from the transcription start site (abbreviated to TSS).

Core promoters are sequences within the promoter that are essential for transcription initiation. RNA polymerase is able to bind to core promoters in the presence of various specific transcription factors.

The most characterized type of core promoter in eukaryotes is

  • a short DNA sequence known as a TATA box, found 25-30 base pairs upstream from the TSS.
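The positional statement above (a TATA box 25-30 base pairs upstream of the TSS) can be checked on a sequence with a simple window scan. This is a sketch under stated assumptions: the motif is reduced to the four bases TATA, the TSS is given as a 0-based index into the coding-strand sequence, and the function name and toy sequence are hypothetical.

```python
def tata_near_tss(sequence, tss_index, motif="TATA", window=(25, 30)):
    """Return positions relative to the TSS (negative = upstream) at
    which the motif starts, restricted to the given upstream window."""
    sequence = sequence.upper()
    hits = []
    for offset in range(window[0], window[1] + 1):
        start = tss_index - offset
        if start >= 0 and sequence[start:start + len(motif)] == motif:
            hits.append(-offset)
    return hits


# Hypothetical promoter: TATAAA starts 28 bp upstream of the TSS.
toy_promoter = "G" * 10 + "TATAAA" + "C" * 22 + "A" + "G" * 5
tss = 38
tata_hits = tata_near_tss(toy_promoter, tss)
print(tata_hits)
```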

The TATA box, as a core promoter, is the binding site for

  1. a transcription factor known as TATA-binding protein (TBP), which
  2. is itself a subunit of another transcription factor, called Transcription Factor II D (TFIID).

After TFIID binds to the TATA box via the TBP,

  • five more transcription factors and RNA polymerase combine around the TATA box
  • in a series of stages to form a preinitiation complex.

One transcription factor, Transcription factor II H, has two components

  • with helicase activity and so
  • is involved in the separating of opposing strands of double-stranded DNA
  • to form the initial transcription bubble.

However, only a low, or basal, rate of transcription is driven by the preinitiation complex alone. Other proteins known as

  1. activators and repressors,
  2. along with any associated coactivators or corepressors,
  3. are responsible for modulating transcription rate.

Thus, the preinitiation complex contains:

  1. Core Promoter Sequence
  2. Transcription Factors
  3. RNA Polymerase
  4. Activators and Repressors.

The transcription preinitiation in archaea is, in essence, homologous to that of eukaryotes, but is much less complex.[3]

The archaeal preinitiation complex assembles at a TATA-box binding site; however,

  • in archaea, this complex is composed of only an RNA polymerase (resembling eukaryotic RNA polymerase II), TBP, and TFB (the archaeal homologue of eukaryotic transcription factor II B (TFIIB)).[4][5]

Initiation

Simple diagram of transcription initiation. RNAP = RNA polymerase

In bacteria, transcription begins with the binding of RNA polymerase to the promoter in DNA. The RNA polymerase core enzyme consists of five subunits: 2 α subunits, 1 β subunit, 1 β’ subunit, and 1 ω subunit. At the start of initiation,

  • the core enzyme is associated with a sigma factor that
  • aids in finding the appropriate -35 and -10 promoter sequences, located about 35 and 10 base pairs upstream of the transcription start site.[6]

When the sigma factor and RNA polymerase combine, they form a holoenzyme.

Transcription initiation is more complex in eukaryotes. Eukaryotic RNA polymerase

  • does not directly recognize the core promoter sequences. Instead,
  • a collection of proteins called transcription factors mediate
  • the binding of RNA polymerase and the initiation of transcription.

Only after certain transcription factors are attached to the promoter does the RNA polymerase bind to it. The completed assembly of

  • transcription factors and RNA polymerase binds to the promoter,
  • forming a transcription initiation complex.

Transcription in the archaea domain is similar to transcription in eukaryotes.[7]

Promoter clearance

After the first bond is synthesized, the RNA polymerase must clear the promoter. During this time

  • there is a tendency to release the RNA transcript and produce truncated transcripts. This is called
  • abortive initiation and is common for both eukaryotes and prokaryotes.[8]

In prokaryotes, abortive initiation continues to occur

  • until an RNA product of a threshold length of approximately 10 nucleotides is synthesized,
  • at which point promoter escape occurs and a transcription elongation complex is formed.

The σ factor is released according to a stochastic model.[9] Mechanistically, promoter escape occurs through a scrunching mechanism, where

  • the energy built up by DNA scrunching provides the energy needed to break interactions between RNA polymerase holoenzyme and the promoter.[10]

In eukaryotes, after several rounds of 10nt abortive initiation,

  • promoter clearance coincides with the TFIIH’s phosphorylation of serine 5 on the carboxy terminal domain of RNAP II,
  • leading to the recruitment of capping enzyme (CE).[11][12]

The exact mechanism of how CE induces promoter clearance in eukaryotes is not yet known.

Elongation

Simple diagram of transcription elongation

One strand of the DNA, the template strand (or noncoding strand), is used as a template for RNA synthesis. As transcription proceeds,

  • RNA polymerase traverses the template strand and uses base pairing complementarity with the DNA template to create an RNA copy.

Although RNA polymerase traverses the template strand from 3′ → 5′, the coding (non-template) strand and newly formed RNA can also be used as reference points,

  • so transcription can be described as occurring 5′ → 3′.

This produces an RNA molecule from 5′ → 3′, an exact copy of the coding strand (except that thymines are replaced with uracils, and the nucleotides are composed of a ribose (5-carbon) sugar where DNA has deoxyribose (one fewer oxygen atom) in its sugar-phosphate backbone).

mRNA transcription can involve multiple RNA polymerases on a single DNA template and multiple rounds of transcription (amplification of particular mRNA),

  • so many mRNA molecules can be rapidly produced from a single copy of a gene.

Elongation also involves a proofreading mechanism

  • that can replace incorrectly incorporated bases.

In eukaryotes,

  • short pauses during transcription allow appropriate RNA editing factors to bind.

These pauses may be intrinsic to the RNA polymerase or due to chromatin structure.

Termination

Main article: Terminator (genetics)

Bacteria use two different strategies for transcription termination –

  1. Rho-independent termination and
  2. Rho-dependent termination.

In Rho-independent transcription termination, also called intrinsic termination, RNA transcription stops when the newly synthesized RNA molecule forms a G-C-rich hairpin loop followed by a run of Us.

  1. When the hairpin forms, the mechanical stress breaks the weak rU-dA bonds that now fill the DNA-RNA hybrid.
  2. This pulls the poly-U transcript out of the active site of the RNA polymerase,
  3. in effect, terminating transcription.
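The signature of an intrinsic terminator, a G-C-rich hairpin followed by a run of Us, can be searched for in an RNA string. The sketch below is a deliberately simple heuristic, not a validated terminator predictor: the stem length, loop range, G-C threshold and U-run length are all assumed values, and only perfectly paired stems are considered.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA string containing only A, C, G, U."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def find_intrinsic_terminators(rna, stem_len=5, loop_range=(3, 8),
                               min_u=4, min_gc=0.6):
    """Return (stem_start, loop_length) pairs where a G-C-rich stem,
    a short loop, the stem's reverse complement and a run of Us occur
    in that order."""
    rna = rna.upper()
    hits = []
    for i in range(len(rna) - stem_len + 1):
        stem = rna[i:i + stem_len]
        if sum(base in "GC" for base in stem) / stem_len < min_gc:
            continue
        partner = reverse_complement(stem)
        for loop in range(loop_range[0], loop_range[1] + 1):
            j = i + stem_len + loop  # start of the 3' half of the stem
            if rna[j:j + stem_len] != partner:
                continue
            tail_start = j + stem_len
            if rna[tail_start:tail_start + min_u] == "U" * min_u:
                hits.append((i, loop))
    return hits


# Hypothetical transcript end: stem GCCGC, loop UUCG, stem GCGGC, U-run.
toy_rna = "AA" + "GCCGC" + "UUCG" + "GCGGC" + "UUUUUU"
terminator_hits = find_intrinsic_terminators(toy_rna)
print(terminator_hits)
```

Because a longer hairpin contains several shorter ones, the same terminator can be reported more than once with different stem starts and loop lengths.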

In the “Rho-dependent” type of termination, a protein factor called “Rho”

  • destabilizes the interaction between the template and the mRNA, thus
  • releasing the newly synthesized mRNA from the elongation complex.[13]

Transcription termination in eukaryotes is less understood but involves cleavage of the new transcript followed by template-independent addition of As at its new 3′ end, in a process called polyadenylation.[14]

Inhibitors

Transcription inhibitors can be used as antibiotics against, for example, pathogenic bacteria (antibacterials) and fungi (antifungals). An example of such an antibacterial is rifampicin, which inhibits bacterial DNA-dependent RNA polymerase.

8-Hydroxyquinoline is an antifungal transcription inhibitor.[15] The effects of histone methylation may also work to inhibit the action of transcription.

Transcription factories

Active transcription units are clustered in the nucleus, in discrete sites called transcription factories or euchromatin. Such sites can be visualized by allowing engaged polymerases to extend their transcripts in tagged precursors (Br-UTP or Br-U) and immuno-labeling the tagged nascent RNA. Transcription factories can also be localized using fluorescence in situ hybridization or marked by antibodies directed against polymerases. There are ~10,000 factories in the nucleoplasm of a HeLa cell, among which are ~8,000 polymerase II factories and ~2,000 polymerase III factories. Each polymerase II factory contains ~8 polymerases. As most active transcription units are associated with only one polymerase, each factory usually contains ~8 different transcription units. These units might be associated through promoters and/or enhancers, with loops forming a ‘cloud’ around the factory.[16]
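The approximate figures quoted above imply a rough count of engaged polymerase II molecules per HeLa nucleus. The short calculation below only multiplies the stated estimates; the variable names are illustrative and the result is an order-of-magnitude figure, not a measurement.

```python
# Approximate values quoted in the text for a HeLa cell nucleoplasm.
total_factories = 10_000
pol2_factories = 8_000
pol3_factories = 2_000
polymerases_per_pol2_factory = 8

# Engaged Pol II molecules implied by the estimates.
engaged_pol2 = pol2_factories * polymerases_per_pol2_factory
# Share of all factories that are Pol II factories.
pol2_share = pol2_factories / total_factories

print(engaged_pol2, pol2_share)
```

Since most active transcription units carry a single polymerase, the same product also approximates the number of Pol II transcription units active at one time.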

History

A molecule that allows the genetic material to be realized as a protein was first hypothesized by François Jacob and Jacques Monod. Severo Ochoa won a Nobel Prize in Physiology or Medicine in 1959 for developing a process for synthesizing RNA in vitro with polynucleotide phosphorylase, which was useful for cracking the genetic code. RNA synthesis by RNA polymerase was established in vitro by several laboratories by 1965; however, the RNA synthesized by these enzymes had properties that suggested the existence of an additional factor needed to terminate transcription correctly.

In 1972, Walter Fiers became the first person to actually prove the existence of the terminating enzyme.

Roger D. Kornberg won the 2006 Nobel Prize in Chemistry “for his studies of the molecular basis of eukaryotic transcription”.

Reverse transcription

Some viruses (such as HIV, the cause of AIDS) have the ability to transcribe RNA into DNA. HIV has an RNA genome that is reverse transcribed into DNA. The resulting DNA can be merged with the DNA genome of the host cell. The main enzyme responsible for synthesis of DNA from an RNA template is called reverse transcriptase.

Some eukaryotic cells contain an enzyme with reverse transcription activity called telomerase. Telomerase is a reverse transcriptase that lengthens the ends of linear chromosomes. Telomerase carries an RNA template from which it synthesizes a repeating sequence of DNA, or “junk” DNA. This repeated sequence of DNA is called a telomere and can be thought of as a “cap” for a chromosome. It is important because every time a linear chromosome is duplicated, it is shortened. With this “junk” DNA or “cap” at the ends of chromosomes, the shortening eliminates some of the non-essential, repeated sequence rather than the protein-encoding DNA sequence, that is farther away from the chromosome end.

Telomerase is often activated in cancer cells to enable cancer cells to duplicate their genomes indefinitely without losing important protein-coding DNA sequence. Activation of telomerase could be part of the process that allows cancer cells to become immortal. The immortalizing factor of cancer via telomere lengthening due to telomerase has been proven to occur in 90% of all carcinogenic tumors in vivo with the remaining 10% using an alternative telomere maintenance route called ALT or Alternative Lengthening of Telomeres.[20]

RNA-Seq Dissects the Transcriptome

Transcript Targeting  Kathy Liszewski
GEN    Jul 1, 2014 (Vol. 34, No. 13)

With the rapid rise of next-generation sequencing (NGS), one of its technologies, RNA sequencing (RNA-Seq), has taken center stage for analyzing whole transcriptomes.

Although RNA-Seq is still the new kid on the block,

  • this technology has the potential to revolutionize transcriptomics,
  • revealing the architecture of gene expression in unprecedented detail.

RNA-Seq applications are proliferating and include

  • the elucidation of disease processes,
  • targeted drug development, and
  • personalized medicine.

To orient researchers who are unfamiliar with the differences between  RNA-Seq platforms, Kelli Bramlett, R&D scientist, Life Technologies, poses two key questions:

1. Are you interested in pure discovery, in a nonguided fashion, of every RNA species present in your test samples?

2. Are you mainly focused on measuring expression levels of well-annotated coding RNA transcripts?

You might have a set of genes crucial to

  • identifying a disease state, or
  • profiling the stage of a specific type of cancer, or
  • monitoring development in your experimental system.

You then would want to employ a system that

  • “allows you to quickly and efficiently focus on just your genes of interest and screen through many different samples in a short amount of time.”

RNA-Seq allows for true discovery but

  • “requires sequencing depth and
  • requires significant additional time for analysis.
  • If a focused panel targeting specific RNAs will better meet your needs, this can be accomplished on systems with
  • much faster turnaround time and less sequencing depth,” according to Dr. Bramlett.

Enhancing Sensitivity

RNA-Seq has advanced our ability to characterize transcriptomes at high resolution, and the laboratory and data analysis techniques used for this NGS application continue to mature, notes John Tan, Ph.D., senior scientist, Roche NimbleGen. “High sequencing costs combined with the omnipresence of pervasive, abundant transcripts decrease our power to study rare transcripts, decrease throughput, and limit the routine use of this technology.”

For example, notes Dr. Tan, a small number of

  • highly expressed housekeeping genes can be responsible for a large fraction of total sequence reads in an experiment, thus
  • increasing the amount of sequencing required to characterize less abundant transcripts of interest.

To improve the cost-effectiveness, throughput, and sensitivity of RNA-Seq, Dr. Tan and colleagues are developing methods to perform targeted RNA-Seq.
“Targeted enrichment of transcripts of interest

  • circumvents the need to perform separate rRNA depletion or polyA enrichment steps on input RNA,” explains Dr. Tan.

“By targeting their sequencing, researchers can avoid wasting resources on

  • housekeeping transcripts and focus instead on genes or genomic regions of interest.”

Targeted RNA-Seq can allow deeper sequence coverage, increased sensitivity for low-abundance transcripts, less total sequencing per sample, and more samples processed per sequencing instrument run. “Significantly, we observe that the enrichment step also preserves quantitative information very well,” adds Dr. Tan. “These advances will facilitate a more routine use of RNA-Seq technology.”

Sample Integrity Issues

“Formalin-fixed, paraffin-embedded (FFPE) patient tissue archives and the clinical data associated with them may provide only limited amounts of sample that may also be degraded,” comments Gary Schroth, Ph.D., distinguished scientist, Illumina. Dr. Schroth says that most labs currently gauge RNA integrity via the RIN (RNA integrity number), “but the RIN from FFPE samples is not a sensitive measure of RNA quality or a good predictor for library preparation. A better predictor is RNA fragment size. We developed the DV200 metric, the percentage of RNA fragments greater than 200 nucleotides, a size needed for accurate construction of libraries.”

Illumina offers its TruSeq® RNA Access Library Preparation Kit especially for FFPE samples. This kit, when used with the DV200 metric, provides cleaner and more accurate library preparation. This new approach allows researchers to start with five- to tenfold less material when making libraries from FFPE samples.
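The DV200 definition quoted above (percentage of RNA fragments longer than 200 nucleotides) can be illustrated with a minimal sketch. It assumes a plain list of fragment lengths is available; in practice the metric is usually read off an instrument trace, and the function name `dv200` and the sample lengths below are hypothetical.

```python
def dv200(fragment_lengths, threshold=200):
    """Percentage of fragments strictly longer than `threshold` nucleotides.

    Assumption: fragments are counted individually; instrument software
    may weight by signal instead.
    """
    if not fragment_lengths:
        raise ValueError("need at least one fragment length")
    above = sum(1 for length in fragment_lengths if length > threshold)
    return 100.0 * above / len(fragment_lengths)


# Hypothetical fragment lengths (nt) from a partly degraded FFPE sample
lengths = [80, 120, 150, 210, 260, 400, 90, 500, 180, 230]
print(f"DV200 = {dv200(lengths):.1f}%")  # 5 of 10 fragments exceed 200 nt
```

A sample with a higher DV200 would be expected to be a better candidate for library construction, which is the use the quoted passage describes.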

Strand Specificity

Most NGS requires initial construction of libraries that may not provide the specificity desired even when prepared from mRNA. “Traditional RNA-Seq library preparation loses the strandedness of transcripts—information that is critical in understanding cellular transcription,” says Jungsoo Park, senior marketing and sales manager, Lexogen.

According to Park, Lexogen tackled this problem

  • by developing a method to generate libraries with greater than 99.9% strand specificity with a simplified process that takes 4.5 hours to complete.

Lexogen’s SENSE mRNA-Seq library kit initially isolates mRNA via

  • the poly A tail and utilizes random hybridization of the transcripts that
  • are bound to the magnetic beads without transcript fragmentation.

“This is a revolutionary method, which keeps high strandedness of the transcripts,” asserts Park.

One of the novel aspects of this approach is the use of starter/stopper heterodimers containing platform-specific linkers that hybridize to the mRNA.
“The starters serve as primers for reverse transcription, which then

  • terminates once the stopper from the next heterodimer is reached.

At this point, the newly synthesized cDNA and the stopper are ligated while still bound to the RNA template.” According to Park,

  • there is no need for a time-consuming fragmentation step, and library size is determined simply by the protocol itself.

For researchers interested only in expression levels, sequencing the entire mRNA transcript requires subsequent bioinformatic normalization such as RPKM (reads per kilobase of transcript per million mapped reads), a measure of relative molar RNA concentration.
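For readers unfamiliar with RPKM, the following is a minimal sketch of the standard calculation: read count divided by transcript length in kilobases and by total mapped reads in millions. The function name and the example numbers are illustrative and do not come from the article.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical transcript: 500 reads, 2,000 bp long, 10 million mapped reads
value = rpkm(read_count=500, transcript_length_bp=2000,
             total_mapped_reads=10_000_000)
print(f"RPKM = {value:.1f}")  # 25.0
```

Because the length term cancels the advantage that long transcripts have in collecting reads, two transcripts with equal RPKM are taken to be present at roughly equal molar concentration.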

RNA-Seq Libraries

NuGEN Technologies offers its Ovation Human Blood RNA-Seq Library System as an end-to-end solution for strand-specific RNA-Seq library construction. NuGEN’s Insert Dependent Adaptor Cleavage (InDA-C) technology can provide targeted depletion of unwanted high-abundance transcripts.
  • Cells possess many thousands of transcripts.
  • Some are uninformative transcript species that can compromise data quality and the cost-effectiveness of sequencing.
  • NuGEN Technologies has developed a method, Insert Dependent Adaptor Cleavage (InDA-C), for targeted depletion of unwanted transcripts following construction of RNA-Seq libraries.

InDA-C employs customized primers that target specific transcripts, such as ribosomal and globin RNAs, for exclusion from final RNA-Seq libraries (hemoglobin RNA derived from blood accounts for at least 60% of transcripts). “By depleting these two transcript classes, InDA-C quadruples informative reads, and it avoids off-target mRNA cross-hybridization events that can potentially introduce bias. The species and transcript specificity of the workflow relies on the design of InDA-C primers, which can be constructed

  • to target virtually any class of unwanted transcripts for targeted depletion,” according to Dr. Kain.
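The "quadruples informative reads" claim can be put in rough arithmetic terms. The sketch below assumes that depletion removes all unwanted transcripts and that total sequencing output stays fixed; under those assumptions the gain is 1/(1 - u), where u is the unwanted fraction. The 75% figure is chosen only to reproduce a fourfold gain and is not a number from the article.

```python
def informative_fold_gain(unwanted_fraction):
    """Fold increase in informative reads after complete depletion.

    Assumptions: every unwanted read is removed and the freed sequencing
    capacity is spent entirely on the remaining transcripts.
    """
    if not 0 <= unwanted_fraction < 1:
        raise ValueError("unwanted_fraction must be in [0, 1)")
    return 1.0 / (1.0 - unwanted_fraction)


# Globin alone at 60% of transcripts (the lower bound quoted above)
print(round(informative_fold_gain(0.60), 2))  # 2.5
# Hypothetical combined rRNA + globin share of 75%
print(round(informative_fold_gain(0.75), 2))  # 4.0
```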

NuGEN has developed Single Primer Enrichment Technology, which can be used to prepare targeted NGS libraries from either gDNA or cDNA and can be

  • used to identify gene fusion products and alternative splicing patterns from enriched cDNA libraries.

Platforms automate the RNA sequencing sample preparation process. [Beckman Coulter]

Preparation of libraries for RNA-Seq entails an intensive workflow. According to Alisa Jackson, senior marketing manager, Genomic Solutions, Beckman Coulter, automation provides four key advantages:

  • Creation of high-quality mRNA libraries. Initial steps in this process include depleting samples of ribosomal RNA. Although it has the greatest abundance, rRNA gives the least amount of information. “We’ve automated this process on our Biomek instruments using popular sample preparation kits from Illumina and New England Biolabs,” notes Jackson. “Accurate pipetting and thorough mixing are critical for this process. The Biomek liquid handler’s 96-channel pipetting head is used in combination with an on-deck orbital shaker to vigorously mix samples. Results show this ‘mix and shake’ approach works well.”
  • Limited exposure to RNases from human contact. Every scientist’s nemesis when working with RNA is the universal presence of RNA-degrading RNases. To help overcome this problem, says Jackson, “Biomek consumables such as pipette tips are DNase and RNase-free.”
  • Reduced exposure to toxic chemicals. “An instrument dispenses all reagents involved in the various steps of process.”
  • Enhanced reproducibility. “This is still a very expensive process,” asserts Jackson. “Obtaining accurate results the first time prevents costly repetitions. For this reason, we provide Biomek methods for many NGS library preparation kits. By fully testing these methods with real-life samples, we ensure reliable and repeatable creation of sequence-ready RNA libraries, whether stranded or nonstranded, mRNA or total RNA.”

What’s Next?

RNA-seq data analysis

RNA-seq data analysis for target identification. [Boehringer Ingelheim]

“With RNA-Seq, we are closing in on personalized medicine,” suggests Qichao Zhu, Ph.D., principal scientist, Boehringer Ingelheim. “This technology allows more exact identification of patient subgroups. Instead of ‘one drug fits all,’ we can now begin to more appropriately define which drugs will work in which patients. Diseases such as cancer and cystic fibrosis as well as neurodegenerative illnesses have many patient subcategories. Future pharmaceutical drug discovery will be better able to develop targeted therapeutics with the help of RNA-Seq.”

There are still many challenges in the field, however. “A critical aspect is accuracy. Given the large scale set of RNA-Seq, even 99.99% accuracy is not good enough for diagnostics,” insists Dr. Zhu. “Further, as we move forward, we will need to improve many aspects of the technology including

  • disease tissue sample isolation,
  • library construction methodologies, as well as
  • analysis of massive datasets.

“In the future, a patient will go into the doctor’s office and have a whole transcriptome profile test performed. When PCR technology was discovered, no one knew just how powerful it would become or how many applications it would generate. Now, it is used everywhere. NGS technology and RNA-Seq have a similar potential.”

 

Gene Paces microRNAs to Set Developmental Rhythms

Kevin Mayer   Jul 18, 2014   GEN News Highlights

http://www.genengnews.com/gen-news-highlights/gene-paces-micrornas-to-set-developmental-rhythms/81250124/

Using C. elegans as a model, researchers identified LIN-42, a gene that is found in animals across the evolutionary tree, as a potent regulator of numerous developmental processes. [C. Hammell, Cold Spring Harbor Laboratory]

  • Although the how of a gene’s function is important, the when, too, is crucial. The ebb and flow of gene expression can influence a cell’s fate during development, the maturation of entire organisms, and even the evolution of species, helping to explain how species with very similar gene content can differ so dramatically.

Nature’s developmental clockwork

  • depends on the activation or repression of a specific and unique complement of genes. And these genes, in turn,
  • are regulated by microRNA molecules. And, finally,
  • the microRNAs are also subject to regulation.
  • Consequently, one must study the regulators of the regulators of the regulators.

Little is known of the ultimate regulators—the elements that determine the activities of microRNAs. These elements, however, are presumably as subtle as they are powerful—

  1. subtle because microRNAs define temporal gene expression and cell lineage patterns in a dosage-dependent manner;
  2. powerful because a single microRNA gene can control hundreds of other genes at once.
  3. As always, timing is everything: if a microRNA turns off genes too early or too late, the organism that depends on them will likely suffer severe developmental defects.

To undertake a search for genes that control developmental timing through microRNAs, a team of researchers at Cold Spring Harbor Laboratory relied on a tried-and-true model of animal development, Caenorhabditis elegans. These worms have a fixed number of cells, and each cell division is precisely timed.  “It enables us to understand

  • exactly how a mutation affects development,
  • whether maturation is precocious or delayed,
  • by directly observing defects in the timing of gene expression.” (said team leader Christopher Hammell, Ph.D.)

The researchers described their work in an article entitled, “LIN-42, the Caenorhabditis elegans PERIOD  homolog, Negatively Regulates MicroRNA Transcription,” which appeared July 17 in PLoS Genetics.

With the goal of unveiling factors that regulate the expression of microRNAs that control developmental timing,

  • they identified LIN-42, the C. elegans homolog of the human and Drosophila period gene implicated in circadian gene regulation, as a negative regulator of microRNA expression.

“By analyzing the transcriptional expression patterns of representative microRNAs, we found that the transcription of many microRNAs is normally highly dynamic and coupled to aspects of post-embryonic growth and behavior.”

“LIN-42 shares a significant amount of similarity to the genes that control circadian rhythms in organisms such as mice and humans,” explained Roberto Perales, Ph.D., one of the lead authors of the study. “These are genes that control the timing of cellular processes on a daily basis for you and me. In the worm, these same genes and mechanisms control development, growth, and behavior. This system will provide us with leverage to understand how all of these things are coordinated.”

  1. LIN-42 controls the repression of numerous genes in addition to microRNAs.
  2. Levels of the protein encoded by LIN-42 tend to oscillate over the course of development and form a part of a developmental clock.

“LIN-42 provides the organism with a kind of cadence or temporal memory, so that

  • it can remember that it has completed one developmental step before it moves on to the next,” emphasized Dr. Hammell. “This way, LIN-42 coordinates optimal levels of the genes required throughout development.”

 

Intracellular RNA-Seq

This literature review highlights a study led by George Church describing FISSEQ, or fluorescent in situ RNA sequencing.

Anton Simeonov, Ph.D.   Jul 25, 2014

http://www.genengnews.com/insight-and-intelligence/intracellular-rna-seq/77900207/

 


  • Methods such as fluorescence in situ hybridization (FISH) allow gene expression to be observed at the tissue and cellular level; however, only a limited number of genes can be monitored in this manner, making transcriptome-wide studies impractical. George Church’s group* is presenting the further development of their original approach called
  • fluorescent in situ sequencing (FISSEQ) to incorporate a spatially structured sequencing library and an imaging method capable of resolving the amplicons (see Figure 1).

In fixed cells, RNA was reverse transcribed with tagged random hexamers to produce cDNA amplicons.

  1. Aminoallyl deoxyuridine 5′-triphosphate (dUTP) was incorporated during reverse transcription,
  2. the cDNA fragments were circularized before rolling circle amplification (RCA), and
  3. an amine-reactive linker was then used to cross-link the RCA amplicons containing aminoallyl dUTP.

The team generated RNA sequencing libraries in different cell types, tissue sections, and whole-mount embryos for three-dimensional (3D) visualization that spanned multiple resolution scales (see Figure 1).

  • Figure 1. Construction of 3D RNA-seq libraries in situ. After RT using random hexamers with an adapter sequence in fixed cells, the cDNA is amplified and cross-linked in situ. (A) A fluorescent probe is hybridized to the adapter sequence and imaged by confocal microscopy in human iPS cells (hiPSCs; scale bar: 10 μm) and fibroblasts (scale bar: 25 μm). (B) FISSEQ can localize the total RNA transcriptome in mouse embryo and adult brain sections (scale bar: 1 mm) and whole-mount Drosophila embryos (scale bar: 5 μm), although we have not sequenced these samples. (C) 3D rendering of gene-specific or adapter-specific probes hybridized to cDNA amplicons. 3D, three-dimensional; RT, reverse transcription; FISSEQ, fluorescent in situ sequencing; FISH, fluorescence in situ hybridization.
  • In a proof-of-concept experiment (see Figure 2) the authors sequenced primary fibroblasts in situ after simulating a response to injury, which yielded 156,762 reads, mapped to 8,102 annotated genes. When the 100 highest ranked genes were clustered, cells kept in fetal bovine serum medium were enriched for fibroblast-associated gene hits, while the rapidly dividing cells in epidermal growth factor medium were less fibroblast-like, reaffirming that the FISSEQ platform output reflects the change in transcription status as a function of the cellular environment and stress factors.

 

  • Figure 2. Overcoming resolution limitations and enhancing the signal-to-noise ratio. Ligation of fluorescent oligonucleotides occurs when the sequencing primer ends are perfectly complementary to the template. Extending sequencing primers by one or more bases, one can randomly sample amplicons at 1/4th, 1/16th, and 1/256th of the original density in fibroblasts (scale bar: 5 μm). N, nucleus; C, cytoplasm.
  • The authors further noted that FISSEQ appears to be sensitive to genes associated with cell type and function, and this in turn could be used for cell typing. It was also speculated that FISSEQ might allow for a combined transcriptome profiling and mutation detection in situ.
  • *Abstract from Science 2014, Vol. 343:1360–1363

Understanding the spatial organization of gene expression with single-nucleotide resolution requires

  • localizing the sequences of expressed RNA transcripts within a cell in situ.

Here, we describe fluorescent in situ RNA sequencing (FISSEQ), in which stably cross-linked complementary DNA (cDNA) amplicons are sequenced within a biological sample.

  1. Using 30-base reads from 8102 genes in situ, we examined RNA expression and localization in human primary fibroblasts with a simulated wound-healing assay.
  2. FISSEQ is compatible with tissue sections and whole-mount embryos and
  3. reduces the limitations of optical resolution and noisy signals on single-molecule detection.

Our platform enables massively parallel detection of genetic elements, including

  • gene transcripts and molecular barcodes, and can be used
  • to investigate cellular phenotype, gene regulation, and environment in situ.
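The Figure 2 caption above mentions that extending the sequencing primer by one or more bases samples amplicons at 1/4th, 1/16th, and 1/256th of the original density. A minimal sketch of that relationship follows, assuming each added base matches a given template with probability 1/4 (uniform base composition, which is a simplification). The function name is hypothetical.

```python
from fractions import Fraction


def sampled_fraction(extra_bases):
    """Expected fraction of amplicons still detected after extending the
    sequencing primer by `extra_bases` nucleotides.

    Assumption: each extra base is a perfect match for 1 in 4 templates.
    """
    if extra_bases < 0:
        raise ValueError("extra_bases must be non-negative")
    return Fraction(1, 4 ** extra_bases)


for n in (1, 2, 4):
    print(n, sampled_fraction(n))  # 1/4, 1/16, 1/256
```

Under this reading, the three densities in the caption correspond to primer extensions of one, two, and four bases.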

Anton Simeonov, Ph.D., works at the NIH.

ASSAY & Drug Development Technologies is published by Mary Ann Liebert, Inc.
GEN presents here one article that was analyzed in the “Literature Search and Review” column, a paper published in Science titled “Highly multiplexed subcellular RNA sequencing in situ.” Authors of the paper are Lee JH, Daugharthy ER, Scheiman J, Kalhor R, Yang JL, Ferrante TC, Terry R, … and Church GM.

 

Completely ablate microRNA genes on the genomic level

  • miR-KOs are transcription activator-like effector (TALE) nucleases that
  • precisely edit specific miRNAs in mammalian cells.
  • SBI designed miR-TALE-nucleases to cleave within the miRNA seed region.

In the absence of HR donor vectors, the cellular machinery repairs such breaks via

  • non-homologous end joining (NHEJ).

This is an error-prone system that typically generates small deletions or insertions (indels) at or near the site of cleavage. Since the seed region (defined as bases 2-8 of the microRNA) directs miRNA binding to its target mRNA, indels within the seed region completely abolish miRNA function.
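To make the seed-region definition concrete, here is a small sketch that extracts bases 2-8 of a mature miRNA and checks whether a deletion overlaps them. The helper names are hypothetical, positions are 1-based relative to the mature miRNA, and the miR-21-5p sequence is included for illustration only and should be checked against miRBase before any real use.

```python
SEED_START, SEED_END = 2, 8  # 1-based, inclusive, as defined in the text


def seed_region(mirna):
    """Return bases 2-8 of a mature miRNA sequence."""
    return mirna[SEED_START - 1:SEED_END]


def deletion_overlaps_seed(del_start, del_length):
    """True if a deletion covering del_start..del_start+del_length-1
    (1-based positions on the mature miRNA) touches the seed region."""
    del_end = del_start + del_length - 1
    return del_start <= SEED_END and del_end >= SEED_START


# Illustrative mature hsa-miR-21-5p sequence (assumption: verify in miRBase)
mir21 = "UAGCUUAUCAGACUGAUGUUGA"
print(seed_region(mir21))             # AGCUUAU
print(deletion_overlaps_seed(5, 3))   # True: removes positions 5-7
print(deletion_overlaps_seed(12, 4))  # False: lies outside the seed
```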

 

Design of miR-KO TALE Nucleases

The miR-KOs are designed to disrupt the miRNA seed region. Pairing miR-KOs with an HR donor

  • replaces the entire miRNA hairpin structure with an insulated selectable marker cassette.

Sample data for miR-KO 21 Knockout

Selection for HR events by puromycin or by FACS-based sorting for RFP can enrich for properly knocked-out alleles. The enriched cell populations are then

  • genotyped to determine whether the knockout is at a single allele or bi-allelic (as in the case of hsa-miR-21).

Genotyping for HR events is performed via junction PCR of genomic DNA-insert junctions at 5′ and/or 3′ ends of an HR site. PCR primer pairs are designed with one of the primer sequences corresponding to the targeted genomic DNA region and the other corresponding to the HR vector.

Primer design strategy for HR-directed genotyping

Genomic DNA PCR was used to detect HR integration in one or both alleles of hsa-miR-21. Individual cellular clones that display one HR event typically display mutated seed regions in the other allele. miR-KOs, when combined with HR donor vectors, have been shown to be highly efficient in generating double miRNA knockouts. For example, a miR-KO strategy against human miR-21 in HEK293T cells resulted in 30 puromycin-resistant lines out of 96 single cell-derived clones. Subsequent PCR-based genotyping of 23 successful PCR amplifications revealed that ~96% (22/23) were mono-allelic (i.e. one allele with HR and the other with NHEJ or WT) and ~4% (1/23) were bi-allelic (i.e. both alleles had undergone HR) for HR-induced miR-21 deletion. Furthermore, sequencing of PCR products spanning the targeted seed region of miR-21 revealed that 91% (10/11) were NHEJ-modified.

Taken together, these results show an 87% bi-allelic modification rate (20 out of 23 clones)

  • when the miR-KOs are combined with an HR donor vector.
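The source does not spell out how the figure of 20 clones is obtained. One reading that reproduces it is to apply the sequenced NHEJ rate (10/11) to the 22 mono-allelic HR clones. The sketch below follows that assumption; the variable names are illustrative.

```python
from fractions import Fraction

genotyped_clones = 23             # successful PCR amplifications
mono_allelic_hr = 22              # HR on one allele only
nhej_fraction = Fraction(10, 11)  # sequenced second alleles with NHEJ indels

# Assumption: the sequenced subset is representative of all 22 clones
hr_plus_nhej = mono_allelic_hr * nhej_fraction
bi_allelic_rate = hr_plus_nhej / genotyped_clones

print(hr_plus_nhej)                          # 20
print(f"{float(bi_allelic_rate):.0%}")       # 87%
```

Counting the single bi-allelic HR clone as well would give 21 of 23; the text reports 20 of 23, so the sketch keeps to that figure.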

Validation and phenotypic analysis of miR-KO of hsa-miR-21

To confirm complete loss of miRNA-21 expression, we quantified miR-21 expression in three independent miR-21 double knockouts by qPCR.

  1. Clone #1 and #7 carry one deletion of the miR-21 hairpin structure (via HR) and
  2. one indel within the seed region (via NHEJ);
  3. clone #5 carries bi-allelic deletions of the hairpin structure (bi-allelic HR).

We found complete abolishment of miR-21 expression in all three cell lines.

Growth phenotype uncovered in miR-21 KO cell lines

MicroRNA-21 has been characterized as a growth-promoting OncomiR. Ablation of genomic hsa-miR-21 in human cells resulted in reduced proliferation in all three miR-21 knockout lines tested. Growth curves were plotted for the parental HEK293 cells as well as the three independent knockout lines.

Increase the ease and efficiency of obtaining KOs with matched HR vectors

While the use of miR-KOs alone can successfully abolish miRNA function,

  • screening for bi-allelic indels can be laborious.

Due to the small changes seen with indels, many clonal lines have to be established through limiting dilution or single-cell sorting techniques, and

  • subsequently genomic DNA is PCR-amplified,
  • cloned into vectors and
  • subjected to genotyping by Sanger sequencing.

Since many cells will only have either zero or one alleles modified, tremendous work is often required to obtain bi-allelic indels.

To facilitate the screening process,

  • one may combine miRNA-specific TALE-nucleases with HR donor vectors, which enables positive selection and convenient screening of targeted cells.

Because NHEJ occurs more frequently than HR donor integration,

  • the majority of cells that undergo HR integration on one allele carry an indel in the miRNA seed region of the second allele.

This strategy has been shown to be highly efficient in generating bi-allelic miRNA knockouts. A positive selection strategy reveals puromycin-resistant and RFP-positive single-cell derived colonies, the majority of which are double knockouts (i.e. HR event on one allele and indel in seed region of second allele).

Shown above is an overview of miR-KO strategies with miR-KOs alone and in combination with an HR donor vector. The HR donor vector enables positive selection, which allows for simple and efficient generation of cells harboring double knockouts.

Gene Described as Critical to Stem Cell Development

GEN News Highlights  Jul 18, 2014
http://www.genengnews.com/gen-news-highlights/gene-described-as-critical-to-stem-cell-development/81250121/

  • Scientists at Michigan State University say they have found that a gene known as ASF1A could be critical to the development of stem cells. ASF1A is at least one of the genes responsible for the mechanism of cellular reprogramming, a phenomenon that can turn one cell type into another, which is key to the making of stem cells, according to the researchers.

In a paper (“Histone chaperone ASF1A is required for maintenance of pluripotency and cellular reprogramming”) published in Science, the MSU team describes

  • how they analyzed more than 5,000 genes from a human oocyte before determining that
  • ASF1A, along with another gene known as OCT4 and a helper soluble molecule, was responsible for the reprogramming.

In 2006, an MSU team identified the thousands of genes that reside in the oocyte. In 2007, a team of Japanese researchers found that

  • by introducing four other genes into cells, induced pluripotent stem cells (iPSCs) could be created without the use of a human egg.

The researchers say that the genes ASF1A and OCT4 work in tandem with a ligand,

  • a hormone-like substance that also is produced in the oocyte called GDF9, to facilitate the reprogramming process.
  • Overexpression of just ASF1A and OCT4 in human adult dermal fibroblasts (hADFs) exposed to the oocyte-specific paracrine growth factor GDF9 can reprogram hADFs into pluripotent cells.

The report underscores the importance of studying the unfertilized MII [metaphase II human] oocyte as a means

  • to understand the molecular pathways governing somatic cell reprogramming.

“We believe that ASF1A and GDF9 are two players among many others that remain to be discovered, which are part of the cellular-reprogramming process,” noted Dr. Cibelli. “We hope that in the near future, with what we have learned here, we will be able to test new hypotheses that will reveal more secrets the oocyte is hiding from us. In turn, we will be able to develop new and safer cell therapy strategies.”


 

Identification and Insilico Analysis of Retinoblastoma Serum microRNA Profile and Gene Targets Towards Prediction of Novel Serum Biomarkers

M Beta, A Venkatesan, M Vasudevan, U Vetrivel, et al. Identification and Insilico Analysis of Retinoblastoma Serum microRNA Profile and Gene Targets Towards Prediction of Novel Serum Biomarkers.

Bioinformatics and Biology Insights 2013:7 21–34.   http://dx.doi.org:/10.4137/BBI.S10501

This study was undertaken

  • to identify the differentially expressed miRNAs in the serum of children with RB in comparison with the normal age matched serum,
  • to analyze its concurrence with the existing RB tumor miRNA profile,
  • to identify its novel gene targets specific to RB, and
  • to study the expression of a few of the identified oncogenic miRNAs in the advanced stage primary RB patient’s serum sample.

MiRNA profiling was performed on 14 pooled serum samples from children with advanced RB and 14 normal age-matched serum samples:

  • 21 miRNAs were found to be upregulated (fold change > 2.0, P < 0.05) and
  • 24 downregulated (fold change > 2.0, P < 0.05).

Intersection of 59 significantly deregulated miRNAs identified from RB tumor profiles with that of miRNAs detected in serum profile revealed that

  • 33 miRNAs had followed a similar deregulation pattern in RB serum.
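The selection and intersection steps described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: fold change is treated as an RB-to-normal ratio, downregulation is taken as a ratio below 1/2, and the miRNA names and values are placeholders.

```python
def classify_mirnas(profile, min_fold=2.0, max_p=0.05):
    """Split miRNAs into up- and downregulated sets.

    profile maps miRNA name -> (ratio, p_value), where ratio is the
    RB/normal expression ratio (assumption).
    """
    up, down = set(), set()
    for name, (ratio, p_value) in profile.items():
        if p_value >= max_p:
            continue  # not significant
        if ratio > min_fold:
            up.add(name)
        elif ratio < 1.0 / min_fold:
            down.add(name)
    return up, down


# Placeholder serum profile and tumor-derived sets (hypothetical values)
serum = {
    "miR-A": (3.1, 0.01),  # up, significant
    "miR-B": (0.3, 0.02),  # down, significant
    "miR-C": (2.5, 0.20),  # up but not significant
    "miR-D": (1.4, 0.01),  # significant but below the fold threshold
}
tumor_up, tumor_down = {"miR-A", "miR-C"}, {"miR-B"}

serum_up, serum_down = classify_mirnas(serum)
# miRNAs deregulated in the same direction in serum and tumor
concordant = (serum_up & tumor_up) | (serum_down & tumor_down)
print(sorted(concordant))  # ['miR-A', 'miR-B']
```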

Later we validated a few of the miRNAs (miRNA 17-92) identified by microarray in the RB patient serum samples (n = 20) by using qRT-PCR.

Expression of the oncogenic miRNAs, miR-17, miR-18a, and miR-20a by qRT-PCR was significant in the serum samples

  • exploring the potential of serum miRNA identification for noninvasive diagnosis.

Moreover, from miRNA gene target prediction, key regulatory genes of

  • cell proliferation,
  • apoptosis, and
  • positive and negative regulatory networks

involved in RB progression were identified in the gene expression profile of RB tumors.
Therefore, these identified miRNAs and their corresponding target genes could give insights on

  • potential biomarkers and key events involved in the RB pathway.

 

Prediction of Breast Cancer Metastasis by Gene Expression Profiles: A Comparison of Metagenes and Single Genes

(M Burton, M Thomassen, Q Tan, and TA Kruse.) Cancer Informatics 2012:11 193–217

http://dx.doi.org:/10.4137/CIN.S10375

The popularity of a large number of microarray applications in cancer research has led to the development of predictive or prognostic gene expression profiles. However, the diversity of microarray platforms has made the full validation of such profiles and their related gene lists across studies difficult and, at the level of classification accuracies, rarely validated in multiple independent datasets. Frequently, while the individual genes between such lists may not match, genes with the same function are included across such gene lists. Development of such lists does not take into account the fact that

  • genes can be grouped together as metagenes (MGs) based on common characteristics such as pathways, regulation, or genomic location.
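As an illustration of the metagene idea, the sketch below collapses each group of genes into one feature by averaging expression per sample. Averaging is an assumption made here for simplicity; the study may derive metagene values differently. Gene and group names are placeholders.

```python
from statistics import mean


def metagene_values(expression, groups):
    """Collapse gene-level expression into metagene-level features.

    expression: gene -> list of per-sample expression values
    groups:     metagene name -> list of member genes
    Assumption: a metagene's value is the mean of its member genes.
    """
    result = {}
    for name, genes in groups.items():
        present = [expression[g] for g in genes if g in expression]
        if not present:
            continue  # no member gene measured on this platform
        result[name] = [mean(per_sample) for per_sample in zip(*present)]
    return result


# Placeholder data: three genes measured in two samples
expression = {"GENE1": [2.0, 4.0], "GENE2": [4.0, 8.0], "GENE3": [1.0, 1.0]}
groups = {"pathway_X": ["GENE1", "GENE2"], "locus_Y": ["GENE3", "GENE9"]}

print(metagene_values(expression, groups))
```

Skipping genes that are missing from a dataset is one way a metagene could remain usable across microarray platforms with different probe content.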

In this study we compared the performance of either metagene- or single gene-based feature sets and classifiers using random forest and two support vector machines for classifier building. The performance

  • within the same dataset,
  • feature set validation performance, and
  • validation performance of entire classifiers in strictly independent datasets

were assessed by

  • 10 times repeated 10-fold cross validation,
  • leave-one-out cross validation, and
  • one-fold validation, respectively.
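As an illustration of the first two validation schemes, the following sketch generates the index splits for repeated k-fold cross validation using only the standard library. The function name `repeated_kfold`, the shuffling scheme, and the seed are assumptions; the study's actual software is not described in the abstract.

```python
import random


def repeated_kfold(n_samples, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, train_indices, test_indices) for repeated k-fold CV.

    Each repeat reshuffles the samples and partitions them into k folds;
    every fold serves once as the held-out test set.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    rng = random.Random(seed)
    for repeat in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        # Striding gives k folds whose sizes differ by at most one.
        folds = [indices[i::k] for i in range(k)]
        for fold, test in enumerate(folds):
            train = [i for j, other in enumerate(folds) if j != fold for i in other]
            yield repeat, fold, train, test


if __name__ == "__main__":
    splits = list(repeated_kfold(25))
    print(len(splits), "train/test splits")
```

Leave-one-out cross validation is the special case `k = n_samples` with a single repeat, so each sample is held out exactly once.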

To test the significance of the performance difference between MG- and SG-features/classifiers, we used a repeated down-sampled binomial test approach.
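The core of such a comparison can be sketched as an exact two-sided binomial test. This is only one plausible reading: the abstract does not specify the test statistic or the down-sampling procedure, so the setup below (counting, among samples where the two classifiers disagree, how often one of them is correct) and the function name `binomial_two_sided_p` are assumptions, and the repeated down-sampling step is not reproduced.

```python
from math import comb


def binomial_two_sided_p(successes, trials, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than the
    observed one under the null success probability p.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    probs = [comb(trials, k) * p**k * (1 - p) ** (trials - k) for k in range(trials + 1)]
    observed = probs[successes]
    # Small relative tolerance so exactly tied outcomes are counted.
    total = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical counts: of 40 samples where the classifiers disagree,
    # the SG-classifier is correct on 28.
    print(binomial_two_sided_p(28, 40))
```

Under the null hypothesis of equal performance, each disagreement is equally likely to favor either classifier, which is why `p = 0.5` is the default.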

MG- and SG-feature sets are transferable and perform well for training and testing prediction of metastasis outcome

  • in strictly independent data sets, both
  • between different and
  • within similar microarray platforms, while
  • classifiers had a poorer performance when validated in strictly independent datasets.

The study showed that MG- and SG-feature sets perform equally well in classifying independent data. Furthermore, SG-classifiers significantly outperformed MG-classifiers

  • when validation is conducted between datasets using similar platforms, while
  • no significant performance difference was found when validation was performed between different platforms.

Prediction of metastasis outcome in lymph node–negative patients by MG- and SG-classifiers showed that SG-classifiers performed significantly better than MG-classifiers when validated in independent data based on the same microarray platform as used for developing the classifier. However, the MG- and SG-classifiers had similar performance when conducting classifier validation in independent data based on a different microarray platform. The latter was also true when only validating sets of MG- and SG-features in independent datasets, both between and within similar and different platforms.

 

Molecular basis of transcription pausing

Jeffrey W. Roberts

Science 13 June 2014;  344(6189), pp. 1226-1227   http://dx.doi.org:/10.1126/science.1255712


  1. Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA.
  2. E-mail: jwr7@cornell.edu

During RNA synthesis, RNA polymerase moves erratically along DNA,

  1. frequently resting as it produces an RNA copy of the DNA sequence.

Such pausing helps coordinate the appearance of a transcript with its utilization by cellular processes; to this end,

  • the movement of RNA polymerase is modulated by mechanisms that determine its rate. For example,
  1. pausing is critical to regulatory activities of the enzyme such as the termination of transcription. It is also essential
  2. during early modifications of eukaryotic RNA polymerase II that activate the enzyme for elongation.

Two reports analyzing transcription pausing on a global scale in Escherichia coli, by Larson et al. (1) and by Vvedenskaya et al. (2) on page 1285 of this issue, suggest new functions of pausing and reveal important aspects of its molecular basis.

The studies of Larson et al. and Vvedenskaya et al. follow decades of analysis of bacterial transcription that has illuminated

  • the molecular basis of polymerase pausing events that serve critical regulatory functions.

A transcription pause specified by the DNA sequence

  • synchronizes the translation of RNA into protein with
  • the transcription of leader regions of operons (groups of genes transcribed together) for amino acid biosynthesis;
  • this coordination controls amino acid synthesis in response to amino acid availability (3).

A protein-induced pause occurs when the E. coli initiation factor σ70 restrains RNA polymerase

  • by binding a second occurrence of the “−10” promoter element.

This paused polymerase provides a structure for

  1. engaging a transcription antiterminator (the bacteriophage λ Q protein) (4) that
  2. inhibits transcription pauses, including those essential for transcription termination.

Knowledge about the interactions between nucleic acids and RNA polymerase that induce pausing

  • comes partly from studies on the E. coli histidine biosynthesis operon.

RNA polymerase pauses at the leader region of this cluster of genes (the “his pause”),

  • allowing an essential RNA hairpin structure to form just upstream of the RNA-DNA hybrid
  • where RNA synthesis is templated in the polymerase’s catalytic cleft.

Importantly, however, other sequence elements are required to induce and stabilize the his pause—particularly

  • the nucleotide at the newly formed, growing end of the RNA (pausing is favored by pyrimidines rather than purines) (5), and
  • at the incoming nucleotide position [pausing is favored particularly by guanine (G)] (6), as well as surrounding elements.

Biochemical and structural analyses have identified an endpoint of the pausing process called the “elemental pause” in which

  • the catalytic structure in the active site is distorted, preventing further nucleotide addition (7).

The elemental paused state also involves distinct conformational changes in the polymerase

  1. that may favor transcription termination and
  2. allow the his and related pauses to be stabilized by RNA hairpins (8).

ILLUSTRATION: V. ALTOUNIAN/SCIENCE

Single-molecule analysis of transcribing RNA polymerase, at nearly single-nucleotide resolution, identified many specific pause sites in the E. coli genome (9). Pausing occurs on essentially any DNA, and very frequently—every 100 nucleotides or so. These “ubiquitous” pauses are only partly efficient (i.e., not always recognized as the enzyme transits), and mostly have not been associated with specific functions. However, their existence is consistent with biochemical experiments showing that the progress of RNA polymerase is generally erratic. A consensus sequence for ubiquitous pauses was identified, with two important elements:

  • a preference for pyrimidine [mostly cytosine (C)] at the newly formed RNA end,
  • followed by G to be incorporated next—just as found for the his pause; and
  • a preference for G at position −10 of the RNA (10 nucleotides before the 3′ end), which is
  • at the upstream boundary of the RNA-DNA templating hybrid.

Remarkably, the tendency of a G in this position to induce pausing was recognized earlier, when DNA could be sequenced only through its transcript (10); it was thought that inhibited unwinding of the RNA-DNA hybrid underlies the pause.

 

Polymerase, paused.

During transcription, RNA exists in two states as RNA polymerase progresses:

  1. pretranslocated, just after the addition of the last nucleotide [here, cytosine (C)]; and
  2. posttranslocated, after all nucleic acids have shifted in register by one nucleotide relative to the enzyme,
  • exposing the active site for binding of the next substrate molecule [here, guanine (G)].

The pretranslocated state is dominant in the pause. The critical G-C base (RNA-DNA) pair at position −10 in pretranslocated state and

  • the nontemplate DNA strand G bound in the polymerase in the posttranslocated state are marked with an asterisk.


This ubiquitous pausing consensus sequence now has been refined and mapped exhaustively in the E. coli genome by Larson et al. and Vvedenskaya et al. (see the figure). In an analysis called native elongating transcript sequencing (NET-Seq) (11), transcripts associated with the whole cellular population of RNA polymerase are isolated from abruptly frozen cells and their growing ends are sequenced, giving a snapshot at nucleotide resolution of global transcription activity; DNA sites that are highly populated by RNA polymerase represent pauses. Larson et al. identified ∼20,000 transcription pause sites in the E. coli genome, including those expected from previous analysis of known sites like the his pause. Their analysis raises interesting questions about the role of such abundant pausing sequences.
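The idea that heavily populated positions mark pauses can be sketched in a few lines. The sketch below is not the pipeline used by Larson et al. or Vvedenskaya et al.: the input format (a list of genomic coordinates of transcript 3′ ends), the function name `call_pause_sites`, and the window, fold-enrichment, and read-count thresholds are all hypothetical choices made for illustration.

```python
from collections import Counter
from statistics import median


def call_pause_sites(three_prime_ends, window=5, fold=4.0, min_reads=5):
    """Flag positions where nascent-transcript 3' ends pile up.

    three_prime_ends: iterable of integer coordinates, one per sequenced
                      transcript end (all on one strand, for simplicity).
    A position is called a pause if it has at least min_reads ends and at
    least `fold` times the median count of its neighbours within `window`.
    """
    counts = Counter(three_prime_ends)
    pauses = []
    for pos, n in sorted(counts.items()):
        if n < min_reads:
            continue
        neighbours = [
            counts.get(p, 0)
            for p in range(pos - window, pos + window + 1)
            if p != pos
        ]
        # Floor the background at 1 read to avoid dividing signal by zero coverage.
        background = max(median(neighbours), 1)
        if n >= fold * background:
            pauses.append(pos)
    return pauses


if __name__ == "__main__":
    # Toy data: one read at each of positions 95..105 plus 20 extra at position 100.
    toy_ends = list(range(95, 106)) + [100] * 20
    print(call_pause_sites(toy_ends))
```

Comparing each position with its local neighbourhood, instead of using a genome-wide cutoff, keeps highly expressed genes from being called as one long pause.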

Primarily, Larson et al. note that pauses frequently occur

  • exactly at the site of translation initiation, suggesting an important role in gene expression.

This coincidence of events is understandable when you examine the sequences. The consensus sequence in RNA for RNA polymerase pausing is G₋₁₀Y₋₁G₊₁ [G at position −10 and at the site after the pause; Y denotes either C or uracil (U) at the RNA end] according to Larson et al. and Vvedenskaya et al. The Shine-Dalgarno consensus sequence in RNA that the small-subunit ribosome recognizes is AGGAGG [A denotes adenine], providing the G at the −10 position;

  • the downstream initiation codon for RNA translation is AUG, providing (for E. coli) the U at the pause end at position −1, with a following G at position +1.
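The consensus described above can be turned into a simple sequence scan. This is a minimal sketch under stated assumptions: position −1 is taken to be the RNA 3′ end, position +1 the next nucleotide to be added, and position −10 the nucleotide nine places upstream of position −1. The function name `find_pause_sites` and the toy sequence (a Shine-Dalgarno motif, a seven-nucleotide spacer, and an AUG codon) are hypothetical, and a real pause also depends on surrounding elements that this scan ignores.

```python
def find_pause_sites(rna):
    """Return 0-based indices of candidate pause positions in an RNA sequence.

    An index e is reported when the transcript ending at e (position -1)
    matches the G(-10) Y(-1) G(+1) consensus:
      rna[e - 9] == 'G'   position -10
      rna[e] in 'CU'      position -1, a pyrimidine at the RNA 3' end
      rna[e + 1] == 'G'   position +1, the incoming nucleotide
    DNA-style input is accepted and T is read as U.
    """
    rna = rna.upper().replace("T", "U")
    sites = []
    for end in range(9, len(rna) - 1):
        if rna[end - 9] == "G" and rna[end] in "CU" and rna[end + 1] == "G":
            sites.append(end)
    return sites


if __name__ == "__main__":
    # Shine-Dalgarno motif, assumed 7-nt spacer, start codon, then filler.
    toy = "AGGAGG" + "A" * 7 + "AUG" + "AAA"
    for site in find_pause_sites(toy):
        print("candidate pause with RNA 3' end at index", site, "nucleotide", toy[site])
```

With this spacing, the last G of the Shine-Dalgarno motif falls at position −10 when the U of the AUG codon is at the RNA end, which is the coincidence the text describes.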

A slightly modified pausing consensus sequence in the bacterium Bacillus subtilis accommodates the difference in spacing between the Shine-Dalgarno sequence and the initiation codon. What might be the role of a pause exactly at the translation initiation site? Because the ribosome binding site is physically concealed by RNA at the pause,

  • pausing may enable some process that prepares the RNA for translation once RNA polymerase transits the pause site.

Larson et al. suggest that the pause allows upstream RNA secondary structure to resolve in order to present the initiation region properly to the ribosome.

A particularly informative application of NET-Seq that provides new mechanistic information about pausing is based on the discovery of a specific binding site in RNA polymerase [the core recognition element (CRE)] for G in the non-template DNA strand (the strand not transcribed), at position +1 in the “posttranslocated” structure (12).

  • It could be that specific binding of a nucleotide to the enzyme in this position enhances pausing by slowing translocation;

surprisingly, however, Vvedenskaya et al. find the opposite. Cells altered to destroy the G binding site have up to twice as many sites of pausing as in wild-type cells, with

  • a greater preference for G as the incoming nucleotide.

However, this result is understandable in terms of the translocation cycle of RNA polymerase and the ubiquitous pausing sequence that has G at position +1. Binding of G at position +1 to CRE only occurs in the posttranslocated state, which would thus be favored over the pretranslocated state. Hence,

  • if G binding inhibits pausing, then the rate-limiting paused structure must be in the pretranslocated state (a conclusion also made by Larson et al. from biochemical experiments).

This is an important insight into the sequence of protein–nucleic acid interactions that occur in pausing. Vvedenskaya et al. suggest that the actual role of the G binding site is to promote translocation and thus inhibit pausing, to smooth out adventitious pauses in genomic DNA.

The studies by Larson et al. and Vvedenskaya et al. provide a refined and detailed analysis of DNA sequence–induced transcription pausing. As a core process in gene expression, this understanding is relevant not only for the basic biology of transcription, but also has applications in synthetic biology and the design of genetic circuits.

References

  1. H. Larson et al., Science 344, 1042 (2014).
  2. O. Vvedenskaya et al., Science 344, 1285 (2014).
  3. Landick, Turnbough, C. Yanofsky, in Escherichia coli and Salmonella, F. Neidhardt, Ed. (American Society for Microbiology, Washington, DC, 1996), vol. 1, pp. 1263–1286.
  4. A. Perdue, W. Roberts, J. Mol. Biol. 412, 782 (2011).
  5. L. Chan, Landick, J. Mol. Biol. 233, 25 (1993).
  6. N. Lee, Phung, Stewart, Landick, J. Biol. Chem. 265, 15145 (1990).
  7. Toulokhonov, Zhang, Palangat, Landick, Mol. Cell 27, 406 (2007).
  8. Weixlbaumer, Leon, Landick, A. Darst, Cell 152, 431 (2013).
  9. M. Herbert et al., Cell 125, 1083 (2006).
  10. Gilbert, in RNA Polymerase, R. Losick and M. J. Chamberlin, Eds. (Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1976), pp. 193–205.
  11. Churchman, S. Weissman, Nature 469, 368 (2011).
  12. Zhan et al., Science 338, 1076 (2012).

 

The editors suggest the following Related Resources on Science sites

In Science Magazine

REPORT Interactions between RNA polymerase and the “core recognition element” counteract pausing

Irina O. Vvedenskaya,  Hanif Vahedian-Movahed, Jeremy G. Bird, Jared G. Knoblauch, Seth R. Goldman,

Yu Zhang, Richard H. Ebright, and Bryce E. Nickels

Science 13 June 2014: 1285-1289.

 

“miR”roring Lupus Control

Angela Colmone

Sci. Signal., 29 July 2014; 7(336), p. ec202   http://dx.doi.org:/10.1126/scisignal.2005732

Decreased expression of the B cell signaling inhibitor PTEN may contribute to lupus pathology. Wu et al. found that microRNA (miR)–mediated regulation of PTEN is altered in patients with the autoimmune disease systemic lupus erythematosus (SLE). Patients with SLE have hyperactivated B cells, which results in the production of autoantibodies. The authors found that decreased expression of PTEN in B cells from SLE patients contributes to this B cell hyperactivation. What’s more, they found that PTEN expression in these cells was regulated by miRs and that blocking miR-7 could restore PTEN expression and function to that of healthy controls. These data support exploring miR-7 and PTEN as therapeutic targets for SLE.

X-n. Wu, Y-x. Ye, J-w. Niu, Y. Li, X. Li, X. You, H. Chen, L-d. Zhao, X-f. Zeng, F-c. Zhang, F-l. Tang, W. He, X-t. Cao, X. Zhang, P. E. Lipsky, Defective PTEN regulation contributes to B cell hyperresponsiveness in systemic lupus erythematosus. Sci. Transl. Med. 6, 246ra99 (2014). [Full Text]

Citation:

A. Colmone, “miR”roring Lupus Control. Sci. Signal. 7, ec202 (2014).

 

Long Noncoding RNA Regulating Apoptosis Discovered

Source: © Dmitry Sunagatov – Fotolia.com

  • Scientists from the University of São Paulo (USP) have identified an RNA molecule known as INXS that, although containing no instructions for the production of a protein, modulates the action of an important gene that impacts apoptosis.

According to Sergio Verjovski-Almeida, Ph.D., professor at the USP Chemistry Institute, INXS expression is generally diminished in cancer cells, and methods that are capable of stimulating the production of this noncoding RNA can be used to treat tumors. In experiments on mice, the USP scientists were able to effect a 10-fold reduction in the volume of subcutaneous malignant tumors by administering local injections of a plasmid containing INXS.

The team’s findings (“Long noncoding RNA INXS is a critical mediator of BCL-XS induced apoptosis”) were published in Nucleic Acids Research.

The group headed by Dr. Verjovski-Almeida at USP has been investigating the regulatory role of so-called intronic nonprotein-coding genes, those found in the same region of the genome as a coding gene but on the opposite DNA strand. INXS, for example, is an RNA expressed on the opposite strand of a gene coding for the BCL-X protein.

“We were studying several protein-coding genes involved in cell death in search of evidence that one of them was regulated by intronic noncoding RNA. That was when we found the gene for BCL-X, which is located on chromosome 20,” he explained.

BCL-X is present in cells in two different forms: one that inhibits apoptosis (BCL-XL) and one that induces the process of cell death (BCL-XS). The two isoforms act on the mitochondria but in opposite ways. The BCL-XS isoform is considered a tumor suppressor because it activates caspases, which are required for the activation of other genes that cause cell death.

“In a healthy cell, there is a balance between the two BCL-X isoforms. Normally, there is already a smaller number of the pro-apoptotic form (BCL-XS). However, in comparing tumor cells to nontumor cells, we observed that tumor cells contain even fewer of the pro-apoptotic form, as well as reduced levels of INXS. We suspect that one thing affects the other,” continued Dr. Verjovski-Almeida.

To confirm the hypothesis, the group silenced INXS expression in a normal cell lineage and the result, as expected, was an increase in the BCL-XL (anti-apoptotic) isoform. “The rate between the two—which was 0.25—decreased to 0.15; in other words, the pro-apoptotic form that previously represented one fourth of the total began to represent only one sixth,” noted Dr. Verjovski-Almeida.

The opposite occurred when the researchers used plasmid expression to artificially increase the amount of INXS in a kidney cancer cell line in which the noncoding RNA is reduced. “The pro-apoptotic form increased, and the anti-apoptotic form decreased,” he added.

“In a mouse xenograft model, intra-tumor injections of an INXS-expressing plasmid caused a marked reduction in tumor weight, and an increase in BCL-XS isoform, as determined in the excised tumors,” wrote the investigators. “We revealed an endogenous lncRNA that induces apoptosis, suggesting that INXS is a possible target to be explored in cancer therapies.”

 

Scientists map one of the most important proteins in life—and cancer

Mon, 07/21/2014

Scientists have revealed the structure of one of the most important and complicated proteins in cell division—a fundamental process in life and the development of cancer—in research published in Nature.

Images of the gigantic protein in unprecedented detail will transform scientists’ understanding of exactly how cells copy their chromosomes and divide, and could reveal binding sites for future cancer drugs.

A team from The Institute of Cancer Research, London, and the Medical Research Council Laboratory of Molecular Biology in Cambridge produced the first detailed images of the anaphase-promoting complex (APC/C).

The APC/C performs a wide range of vital tasks associated with mitosis,

  1. the process during which a cell copies its chromosomes and
  2. pulls them apart into two separate cells.
  3. Mitosis is used in cell division by all animals and plants.

Discovering its structure could ultimately lead to new treatments for cancer, which

  • hijacks the normal process of cell division to make thousands of copies of harmful cancer cells.

In the study, which was funded by Cancer Research UK,

the researchers reconstituted human APC/C and used a combination of electron microscopy and imaging software to visualize it at a resolution of less than a billionth of a meter.

The resolution was so fine that it allowed the researchers to see the secondary structure—

  • the set of basic building blocks which combine to form every protein.

Alpha-helix rods and folded beta-sheet constructions were clearly visible within the 20 subunits of the APC/C, defining the overall architecture of the complex.

Previous studies led by the same research team had shown

  • a globular structure for APC/C in much lower resolution, but
  • the secondary structure had not previously been mapped.

The new study could identify binding sites for potential cancer drugs.

Each of the APC/C’s subunits bonds and meshes with other units at different points in the cell cycle,

  1. allowing it to control a range of mitotic processes including the initiation of DNA replication,
  2. the segregation of chromosomes along protein ‘rails’ called spindles, and
  3. the ultimate splitting of one cell into two, called cytokinesis.

Disrupting each of these processes could

  • selectively kill cancer cells or prevent them from dividing.

Dr David Barford, who led the study as Professor of Molecular Biology at The Institute of Cancer Research, London, before taking up a new position at the Medical Research Council Laboratory of Molecular Biology in Cambridge, said:

“It’s very rewarding to finally tie down the detailed structure of this important protein, which is both

  • one of the most important and most complicated found in all of nature.

We hope our discovery will open up whole new avenues of research that increase our understanding of the process of mitosis, and ultimately lead to the discovery of new cancer drugs.”

Professor Paul Workman, Interim Chief Executive of The Institute of Cancer Research, London, said: “The fantastic insights into molecular structure

  • provided by this study are a vivid illustration of the critical role played by fundamental cell biology in cancer research.

“The new study is a major step forward in our understanding of cell division. When this process goes awry

  • it is a critical difference that separates cancer cells from their healthy counterparts.

Understanding exactly how cancer cells divide inappropriately is crucial to

  • the discovery of innovative cancer treatments to improve outcomes for cancer patients.”

Dr Kat Arney, Science Information Manager at Cancer Research UK, said: “Figuring out how the fundamental molecular ‘nuts and bolts’ of cells work is vital

  • if we’re to make progress understanding what goes wrong in cancer cells and how to tackle them more effectively.

Revealing the intricate details of biological shapes is a hugely important step towards identifying targets for future cancer drugs.”

Source: The Institute of Cancer Research, London

 

A cell death avenue evolved from a life-saving path

Harm H. Kampinga


  1. Department of Cell Biology, University Medical Center Groningen, University of Groningen, Groningen, Netherlands.
  2. E-mail: h.kampinga@umcg.nl

Related Resources

In Science Magazine

Science 20 June 2014: 1389-1392. Published online 22 May 2014

In Science Signaling

Sci. Signal. 24 June 2014: ec175.

Yeast metacaspases are the ancestral enzymes of caspases that execute cellular suicide (“programmed cell death”) in multicellular organisms. Studies on metacaspase 1 (Mca1)

  • have suggested that single-cell eukaryotes can also commit programmed cell death (1, 2). However,

on page 1389 of this issue, Malmgren Hill et al. (3) show that

  • Mca1 has positive rather than negative effects on the life span of the budding yeast Saccharomyces cerevisiae,
  • especially when protein homeostasis is impaired.

Mca1 helps to degrade misfolded proteins that accumulate during aging or that are generated by acute stress, and

  • thereby ensures the continuous and healthy generation of daughter cells
  • that are free of insoluble aggregates that otherwise would limit life span.


 

ILLUSTRATION: V. ALTOUNIAN/SCIENCE

Loss of Mca1 activity has been associated with a reduced appearance of programmed cell death markers (1, 4),

  • implying that its overexpression should decrease the replicative life span of yeast (the number of daughter cells a mother cell can produce throughout its life). Cells lacking Mca1
  • have increased amounts of protein aggregates and oxidized proteins (4, 5).

Malmgren Hill et al. not only show that this is related to decreased survival,

  • but also provide mechanistic insights into the mode of action of Mca1.

Its pro-life action depends on the chaperone heat shock protein 104 (Hsp104), a protein that

  1. can disentangle protein aggregates and
  2. is crucial for the asymmetric segregation of protein aggregates in dividing cells.

Mca1 deficiency does not affect life span of wild-type strains, but

  1. further decreases life span in strains already compromised in protein quality control. In particular,
  2. replicative aging is accelerated in strains lacking the Hsp70 co-chaperone Ydj1.

Mca1 does not improve protein folding but supports

  • degradation of terminally misfolded proteins.

Malmgren Hill et al. show that Mca1 requires proteasomes (protein structures that break down proteins) for all its effects.

The study by Malmgren Hill et al. challenges the idea that

  1. caspases are activated as an altruistic suicide mechanism in single-cell eukaryotes
  2. as a means to provide nutrients for younger and fitter cells in the population (2). Rather,
  3. the data suggest that from an evolutionary perspective, caspase activation is an integrated part of a protective response
  4. to help cells survive toxic stress caused by the accumulation of misfolded proteins.

When, however, activated incorrectly (e.g., in the absence of proteotoxic stress) or too strongly (e.g., in the case of excessive damage to the cell),

  1. the caspase activity may become nonselective and thus
  2. lead to the typical Mca1-dependent hallmarks of programmed cell death (1, 2, 4). Also,
  3. caspase activation in metazoa may function primarily in cell-autonomous protection and cellular remodeling or
  4. pruning. Its role in programmed cell death may also simply reflect overactivation upon severe cellular damage or
  5. hijacking of the caspases in the absence of stress to serve in non–cell-autonomous regulated tissue homeostasis.


Defense against protein damage.

Stress-damaged proteins that form aggregates in cells can be reactivated with the Hsp104-Ssa-Ydj1 chaperone machinery. Mca1 may act

  • in parallel by binding to misfolded proteins during early stages of aggregation for proteasomal degradation (this is independent of Mca1’s enzymatic activity). Alternatively,
  • Mca1 may associate with misfolded proteins formed at late stages of aggregation (together with Hsp104 and Ssa), helping to disentangle
  • the aggregates by its protease cleavage activity before shunting them to the proteasome for degradation.


The results of Malmgren Hill et al. also highlight the importance of protein quality control for cellular aging. A collapse of protein homeostasis

  • has been implicated mostly in chronological aging of differentiated cells and, for example,
  • as a cause of neurodegenerative diseases (6).

The authors show that it also plays a prominent role in replicative aging.

  • This supports early findings in yeast (7) and may also be relevant to metazoa,
  • in which stem cells have extremely efficient protein degradation mechanisms (8) and
  • also use asymmetric segregation of protein damage for rejuvenation (9).

The data of Malmgren Hill et al. also suggest the existence of an additional layer of control of protein homeostasis. Beyond the

  • activation and induction of chaperones that assist in protein sorting, refolding, and protein degradation via proteasomes and
  • autophagosomes (membrane structures that deliver proteins to lysosomes for enzymatic destruction) (10),
  • Malmgren Hill et al. show that activation of caspases also belongs to the cell’s repertoire of defense mechanisms against protein damage.
  • Mca1 might act in parallel to the Ssa-Ydj1 machinery. Although
  • Ssa-Ydj1 collaborates with Hsp104 to refold proteins after their aggregation (11),
  • Mca1 primarily supports protein degradation, as its actions require not only Hsp104 but also proteasomal activity (3).

Precisely how Mca1 exerts its effect is yet unclear. It can associate with aggregates independent of other chaperones (3, 5) and

  • independent of its catalytic activity (5), suggesting that
  • it binds directly to misfolded proteins [likely through its amino-terminal “pro-domain”
  • that is rich in glutamine and asparagine repeats].

This interaction may exert chaperone-like activity by keeping unfolded proteins

  • in a proteasome-competent form, which explains why part of Mca1’s protective actions in wild-type strains is independent of its protease activity.

However, the caspase activity of Mca1 is required for protein homeostasis and control of life span in Ydj1-deficient strains. It could be that

  • for more terminally misfolded proteins that accumulate in the absence of Ydj1,
  • protease cleavage may help to dismantle such aggregates in concert with Ssa and Hsp104 (see the figure).

This would also explain why the strongest phenotypes of Mca1 are seen under conditions in which Ydj1 is absent. More biochemical data with purified proteins will be needed to test these ideas.

The study of Malmgren Hill et al. suggests that altruism may not exist among cells. However, life and death seem to be close neighbors, and the things that are life saving may also become lethal. It will therefore be a challenge

  • to make use of these insights into caspase function in order to treat diseases by selectively tipping the balance toward life (e.g., in neurodegenerative diseases) or death (e.g., in cancer).

References

  1. Madeo et al., Mol. Cell 9, 911 (2002).
  2. Herker et al., J. Cell Biol. 164, 501 (2004).
  3. Malmgren Hill et al., Science 344, 1389 (2014).
  4. A. Khan, Chock, R. Stadtman, Proc. Natl. Acad. Sci. U.S.A. 102, 17326 (2005).
  5. E. Lee, Brunette, G. Puente, A. Megeney, Proc. Natl. Acad. Sci. U.S.A. 107, 13348 (2010).
  6. E. Balch, I. Morimoto, Dillin, W. Kelly, Science 319, 916 (2008).
  7. Aguilaniu, Gustafsson, Rigoulet, Nyström, Science 299, 1751 (2003).
  8. Vilchez et al., Nature 489, 304 (2012).
  9. A. Rujano et al., PLOS Biol. 4, e417 (2006).
  10. H. Kampinga, A. Craig, Nat. Rev. Mol. Cell Biol. 11, 579 (2010).
  11. R. Glover, Lindquist, Cell 94, 73 (1998).

Related Report

Life-span extension by a metacaspase in the yeast Saccharomyces cerevisiae

Sandra Malmgren Hill, Xinxin Hao, Beidong Liu, and Thomas Nyström

Science 20 June 2014: 1389-1392.

 

Synthetic biology: the many facets of T7 RNA polymerase

David L Shis, Matthew R Bennett
Molecular Systems Biology (2014) 10:745   30.07.2014
http://dx.doi.org:/10.15252/msb.20145492

 

Added 8-2-2014

Split T7 RNA polymerase provides new avenues for creating synthetic gene circuits that are decoupled from host regulatory processes, but how many times can this enzyme be split, yet retain function? New research by Voigt and colleagues (Segall‐Shapiro et al, 2014) indicates that it may be more than you think.

See also: TH Segall‐Shapiro et al (July 2014)

Synthetic gene circuits have become an invaluable tool for studying the design principles of native gene networks and facilitating new biotechnologies (Way et al, 2014). Synthetic biologists often strive to build circuits within a framework that enables their consistent and robust operation across a range of hosts and conditions. Currently, however, each circuit must be fastidiously tuned and retuned in order to properly function within a particular host, leading to costly design cycles and esoteric conclusions. As a result, researchers have invested a great deal in developing strategies that

  • decouple synthetic gene circuits from host metabolism and regulation.

In their recent work, Segall‐Shapiro et al (2014) address this problem by

  • expanding the capabilities of orthogonal transcriptional systems in Escherichia coli using fragmented mutants of bacteriophage‐T7 RNA polymerase (T7 RNAP).

T7 RNAP has had a long relationship with biotechnology and

  • is renowned for its compactness and transcriptional activity.

This single subunit polymerase strongly

  • drives transcription from a minuscule 17‐bp promoter
  • that is orthogonally regulated in E. coli.

In this context, orthogonal means that

  • T7 RNAP will not transcribe genes driven by native E. coli promoters, and
  • native polymerases in E. coli will not recognize T7 RNAP’s special promoter—that is
  • the two transcriptional systems leave each other alone.

Interestingly, T7 RNAP drives transcription so strongly that,

  • if left unregulated, it can quickly exhaust cellular resources and lead to cell death.

Because of this, T7 RNAP

  • has been leveraged in many situations calling for protein over‐expression (Studier & Moffatt, 1986).

Additionally, studies examining the binding of T7 RNAP to its promoter have identified

  • a specificity loop within the enzyme that makes direct contact with the promoter
  • between base pairs −11 and −8.

This has led to a number of efforts that have generated T7 RNAP mutants

  • with modified specificities to promoters orthogonal to the original (Chelliserrykattil et al, 2001).

Given the growing interest in the development of synthetic gene circuits, researchers have taken a renewed interest in T7 RNAP. The orthogonality, transcriptional activity and promoter malleability of T7 RNAP make the enzyme uniquely suited for use in synthetic gene circuits. Importantly, any modifications made to the enzyme increase the possible functionality of circuits. For instance, we recently utilized a split version of T7 RNAP in conjunction with promoter specificity mutants to create a library of transcriptional AND gates (Shis & Bennett, 2013).

The split version of T7 RNAP was originally discovered during purification and shown to be active in vitro (Ikeda & Richardson, 1987). While the catalytic core and DNA‐binding domain

  • are both located on the C‐terminal fragment of split T7 RNAP,
  • the N‐terminal fragment is needed for transcript elongation.

Therefore, if the two halves of split T7 RNAP are placed behind two different inducible promoters,

  1. both inputs must be active in order to form a functional enzyme and
  2. activate a downstream gene.

When the split mutant is combined with promoter specificity mutants,

  • a library of transcriptional AND gates is created.
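As a rough illustration of the logic described above, the two-input gate can be idealized as a Boolean function. The sketch below is only a toy model: the function and variable names are hypothetical, and real gates show graded, leaky responses rather than clean on/off behavior.

```python
from itertools import product


def split_t7_and_gate(n_term_induced: bool, c_term_induced: bool) -> bool:
    """Idealized 2-input AND gate (hypothetical model, not from the paper).

    A functional polymerase is assumed to form only when both the
    N-terminal and C-terminal fragments are expressed.
    """
    return n_term_induced and c_term_induced


# Enumerate every combination of the two inducer inputs.
truth_table = {
    inputs: split_t7_and_gate(*inputs)
    for inputs in product([False, True], repeat=2)
}

for (n_in, c_in), output in truth_table.items():
    print(f"N-term induced={n_in!s:5} C-term induced={c_in!s:5} -> output on={output}")
```

Only the row in which both inducers are present switches the downstream gene on. Pairing such a gate with different promoter specificity mutants would, in this idealization, give one such truth table per promoter.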

Segall‐Shapiro et al take the idea of splitting T7 RNAP for novel regulatory architectures one step further. Instead of settling for the one split site already discovered,

  • the authors first streamlined a transposon mutagenesis strategy (Segall‐Shapiro et al, 2011) to identify four novel cut sites within T7 RNAP.

By expressing T7 RNAP split at two different sites,

  • they create a tripartite T7 RNAP—a polymerase
  • that requires all three subunits for activity.

The authors suggestively designate the fragments of the tripartite enzyme as ‘core’, ‘alpha’, and ‘sigma’ (Fig 1) and they go on to show that

  • tripartite T7 RNAP can not only be used to create 3‐input AND gates, but
  • it also works as a ‘resource allocator’.

In other words, the transcriptional activity of the split polymerase can be regulated

  • by limiting the availability of core and/or alpha fragment, or
  • by expressing additional sigma fragments.
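One way to picture the 'resource allocator' idea is as a fixed budget of core fragment shared among competing sigma fragments. The sketch below is a toy model under stated assumptions, not the authors' model: it assumes the alpha fragment is in excess, that each active complex uses one core and one sigma fragment, and that sigma fragments share the core in proportion to their abundance. All names and numbers are hypothetical.

```python
from typing import Dict


def allocate_core(core: float, sigmas: Dict[str, float]) -> Dict[str, float]:
    """Split a limited pool of core fragment among sigma fragments.

    Assumptions (illustrative only): alpha fragment is in excess, binding is
    one core per sigma, and sharing is proportional to sigma abundance.
    """
    total_sigma = sum(sigmas.values())
    if total_sigma == 0:
        return {name: 0.0 for name in sigmas}
    # Active complexes cannot exceed either the core pool or the sigma pool.
    budget = min(core, total_sigma)
    return {name: budget * amount / total_sigma for name, amount in sigmas.items()}


# Limiting core caps total activity; adding a second sigma redistributes it.
one_sigma = allocate_core(core=10.0, sigmas={"sigma_A": 30.0})
two_sigmas = allocate_core(core=10.0, sigmas={"sigma_A": 30.0, "sigma_B": 10.0})
print(one_sigma)
print(two_sigmas)
```

In this toy model the total output stays capped at the core level, so expressing an additional sigma fragment shifts activity between promoters rather than increasing the overall load on the host.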

The authors demonstrate strategies to account for common pitfalls in synthetic gene networks

  • such as host toxicity and plasmid copy number variability.

 

Figure 1. Segall‐Shapiro et al extend previous efforts to engineer split T7 RNAP by fragmenting the enzyme at two novel locations to create a tripartite transcription complex.

Co‐expressing different sigma fragments with the alpha and core fragments enables a network of multi‐input transcriptional AND gates.

The tripartite T7 RNAP presented by Segall‐Shapiro et al

  • expands the utility of T7 RNAP in orthogonal gene circuits.

Until now, while T7 RNAP has been attractive for use in synthetic gene circuits,

  • the inability to regulate its activity has often prevented its use.

Splitting the protein into fragments and regulating the transcription complex by fragment availability

  • brings the regulation of T7 RNAP closer to the regulation of multi‐subunit prokaryotic RNA polymerases.

Sigma fragments direct the activity of the transcription complex much like σ‐factors, and the alpha fragment helps activate transcription

  • in the same way as α‐fragments of prokaryotic polymerases.

For additional regulation, the authors note that the tripartite T7 RNAP can be further split at the previously discovered split site to create a four‐fragment enzyme.

More nuanced regulation using split T7 RNAP may be possible

  • with the addition of heterodimerization domains
  • that can drive the specific association of fragments.

This strategy has been successfully applied to engineer specificity and signal diversity

  • in two‐component signaling pathways (Whitaker et al, 2012).

The activity of T7 RNAP might also be directed to various promoters

  • by using multiple sigma fragments simultaneously,
  • just as σ‐factors do in E. coli.

Finally, synthetic gene circuits driven primarily by T7 RNAP create the possibility of easily transplantable gene circuits. A synthetic gene circuit driven entirely by fragmented T7 RNAP

  • would depend more on fragment availability than unknown interactions with host metabolism.

This would enable rapid prototyping of synthetic gene circuits in laboratory‐friendly strains or cell‐free systems (Shin & Noireaux, 2012) before transplantation into the desired host.

References

  1. Chelliserrykattil J, Cai G, Ellington AD (2001) A combined in vitro/in vivo selection for polymerases with novel promoter specificities. BMC Biotechnol 1: 13
  2. Ikeda RA, Richardson CC (1987) Interactions of a proteolytically nicked RNA polymerase of bacteriophage T7 with its promoter. J Biol Chem 262: 3800–3808
  3. Segall‐Shapiro TH, Meyer AJ, Ellington AD, Sontag ED, Voigt CA (2014) A “resource allocator” for transcription based on a highly fragmented T7 RNA polymerase. Mol Syst Biol 10: 742
  4. Segall‐Shapiro TH, Nguyen PQ, Dos Santos ED, Subedi S, Judd J, Suh J, Silberg JJ (2011) Mesophilic and hyperthermophilic adenylate kinases differ in their tolerance to random fragmentation. J Mol Biol 406: 135–148
  5. Shin J, Noireaux V (2012) An E. coli cell‐free expression toolbox: application to synthetic gene circuits and artificial cells. ACS Synth Biol 1: 29–41
  6. Shis DL, Bennett MR (2013) Library of synthetic transcriptional AND gates built with split T7 RNA polymerase mutants. Proc Natl Acad Sci USA 110: 5028–5033
  7. Studier FW, Moffatt BA (1986) Use of bacteriophage T7 RNA polymerase to direct selective high‐level expression of cloned genes. J Mol Biol 189: 113–130
  8. Way JC, Collins JJ, Keasling JD, Silver PA (2014) Integrating biological redesign: where synthetic biology came from and where it needs to go. Cell 157: 151–161
  9. Whitaker WR, Davis SA, Arkin AP, Dueber JE (2012) Engineering robust control of two‐component system phosphotransfer using modular scaffolds. Proc Natl Acad Sci USA 109: 18090–18095

© 2014 The Authors. Published under the terms of the CC BY 4.0 license

 

 

MicroRNA References

Plasma microRNAs serve as biomarkers of therapeutic efficacy and disease progression in hypertension-induced heart failure. Dickinson BA, Semus HM, Montgomery RL, Stack C, Latimer PA, et al. Eur J Heart Fail. 2013 Jun; 15(6):650-9.  http://dx.doi.org:/10.1093/eurjhf/hft018

Circulating microRNAs – Biomarkers or mediators of cardiovascular disease?  S Fichtlscherer, AM Zeiher, S Dimmeler. Arteriosclerosis, Thrombosis, and Vascular Biology. 2011; 31:2383-2390.
http://dx.doi.org:/10.1161/ATVBAHA.111.226696

Circulating microRNAs as diagnostic biomarkers for cardiovascular diseases. AJ Tijsen, YM Pinto, and EE Creemers. Am J Physiol Heart Circ Physiol 303: H1085–H1095, 2012.  http://dx.doi.org:/10.1152/ajpheart.00191.2012.

MicroRNAs in Patients on Chronic Hemodialysis (MINOS Study). Emilian C, Goretti E, Prospert F, Pouthier D, Duhoux P, et al. Clin J Am Soc Nephrol (CJASN) 2012; 7: 619-623. http://dx.doi.org:/10.2215/CJN.10471011


Circulating MicroRNAs: Novel Biomarkers and Extracellular Communicators in Cardiovascular Disease?  Esther E. Creemers, Anke J. Tijsen, Yigal M. Pinto.  Circulation Research. 2012; 110: 483-495.  http://dx.doi.org:/10.1161/CIRCRESAHA.111.247452

Novel techniques and targets in cardiovascular microRNA research.  Dangwal S, Bang C, Thum T. Cardiovasc Res. 2012 Mar 15; 93(4):545-54.  http://dx.doi.org:/10.1093/cvr/cvr297

Microparticles: major transport vehicles for distinct microRNAs in circulation. Diehl P, Fricke A, Sander L, Stamm J, Bassler N, Htun N, et al.  Cardiovasc Res. 2012 Mar 15; 93(4):633-44. http://dx.doi.org:/10.1093/cvr/cvs007.

Profiling of circulating microRNAs: from single biomarkers to re-wired networks. A Zampetaki, P Willeit, I Drozdov, S Kiechl and M Mayr. Cardiovasc Res 2012; 93 (4): 555-562.  http://dx.doi.org:/10.1093/cvr/cvr266

Small but smart–microRNAs in the centre of inflammatory processes during cardiovascular diseases, the metabolic syndrome, and ageing. Schroen B, Heymans S.
Cardiovasc Res. 2012; 93(4):605-613. http://dx.doi.org:/10.1093/cvr/cvr268

Therapeutic Inhibition of miR-208a Improves Cardiac Function and Survival During Heart Failure. RL Montgomery, TG Hullinger, HM Semus, BA Dickinson, AG Seto, et al.
http://dx.doi.org:/10.1161/CIRCULATIONAHA.111.030932

Circulating microRNAs to identify human heart failure.  Seto AG, van Rooij E.
Eur J Heart Fail. 2012;14(2):118-119. http://dx.doi.org:/10.1093/eurjhf/hfr179.

Use of Circulating MicroRNAs to Diagnose Acute Myocardial Infarction. Y Devaux, M Vausort, E Goretti, PV Nazarov, F Azuaje. Clin Chem. 2012; 58:559-567. http://dx.doi.org:/10.1373/clinchem.2011.173823

Next Steps in Cardiovascular Disease Genomic Research–Sequencing, Epigenetics, and Transcriptomics  RB Schnabel, A Baccarelli, H Lin, PT Ellinor, and EJ Benjamin.
Clin Chem. 2012 Jan; 58(1): 113–126.  http://dx.doi.org:/10.1373/clinchem.2011.170423

MicroRNA-133 Modulates the β1-Adrenergic Receptor Transduction Cascade.  A Castaldi, T Zaglia, V Di Mauro, P Carullo, G Viggiani, et al.  Circ Res. 2014; 115:273-283.
http://dx.doi.org:/10.1161/CIRCRESAHA.115.303252

Development of microRNA therapeutics is coming of age.  E van Rooij, S Kauppinen.  EMBO Mol Med. 2014; 6:851-864.  http://dx.doi.org:/10.15252/emmm.201100899

Pitx2-microRNA pathway that delimits sinoatrial node development and inhibits predisposition to atrial fibrillation.   J Wang, Y Bai, N Li, W Ye, M Zhang, et al. PNAS 2014; 111: 9181-9186.

MicroRNA-126 modulates endothelial SDF-1 expression and mobilization of Sca-1+/Lin- progenitor cells in ischaemia.  Cardiovasc Res. 2011; 92:449-455.

The use of genomics for treatment is another matter and depends on several factors, e.g., age, residual function after AMI, and comorbidities.



Larry H Bernstein, MD, FCAP, Reporter

Long noncoding RNA (lncRNA) lightens up the dark secrets

CASE WESTERN RESERVE INVESTIGATORS DISCOVER NOVEL CELLULAR GENES BY UNCOVERING UNCHARACTERIZED RNAS THAT ENCODE PROTEINS

News Release: June 23, 2014


Case Western Reserve School of Medicine scientists have made an extraordinary double discovery. First, they have identified thousands of novel long non-coding ribonucleic acid (lncRNA) transcripts. Second, they have learned that some of them defy conventional wisdom regarding lncRNA transcripts, because they actually do direct the synthesis of proteins in cells.

Both of the breakthroughs are detailed in the June 12 issue of Cell Reports.

Kristian E. Baker, PhD, assistant professor in the Center for RNA Molecular Biology, led the team that applied high throughput gene expression analysis to yield these impressive findings, which ultimately could lead to treatments for cancer and some genetic disorders.

“Our work establishes that lncRNAs in yeast can encode proteins, and we provide evidence that this is probably true also in mammals, including humans,” Baker said. “Our investigation has expanded our knowledge of the genetic coding potential of already well-characterized genomes.”

Collaborating with researchers including Case Western Reserve University graduate and undergraduate students, Baker analyzed yeast and mouse cells, which serve as model organisms because of their functional resemblance to human cells.

Previously, lncRNAs were thought to lack the information and capacity to encode proteins, distinguishing them from the messenger RNAs that are expressed from known genes and act primarily as templates for the synthesis of proteins. Yet this team demonstrated that a subset of these lncRNAs is engaged by the translation machinery and can function to produce protein products.

In the future, Baker and fellow investigators will continue to look for novel RNA transcripts and also search for a function for these lncRNAs and their protein products in cells.

“Discovery of more transcripts equates to the discovery of new and novel genes,” Baker said. “The significance of this work is that we have discovered evidence for the expression of previously undiscovered genes. Knowing that genes are expressed is the very first step in figuring out what they do in normal cellular function or in dysfunction and disease.”

This investigation was funded by the National Institutes of Health’s National Institute of General Medical Sciences (GM080465 and GM095621) and the National Science Foundation (NSF1253788).

 

Reference:

Lecture Contents delivered at Koch Institute for Integrative Cancer Research, Summer Symposium 2014: RNA Biology, Cancer and Therapeutic Implications, June 13, 2014 @MIT

Curator of Lecture Contents: Aviva Lev-Ari, PhD, RN
3:15 – 3:45, 6/13/2014, Laurie Boyer “Long non-coding RNAs: molecular regulators of cell fate”    

 https://pharmaceuticalintelligence.com/2014/06/13/315-345-2014-laurie-boyer-long-non-coding-rnas-molecular-regulators-of-cell-fate/

