Posts Tagged ‘protein structure’

@MIT Artificial intelligence system rapidly predicts how two proteins will attach: The model, called EquiDock, focuses on rigid-body docking, which occurs when two proteins attach by rotating or translating in 3D space, but their shapes don’t squeeze or bend

Reporter: Aviva Lev-Ari, PhD, RN

This paper introduces a novel SE(3) equivariant graph matching network, along with a keypoint discovery and alignment approach, for the problem of protein-protein docking, with a novel loss based on optimal transport. The overall consensus is that this is an impactful solution to an important problem, whereby competitive results are achieved without the need for templates, refinement, and are achieved with substantially faster run times.
28 Sept 2021 (modified: 18 Nov 2021), ICLR 2022 Spotlight
Keywords: protein complexes, protein structure, rigid body docking, SE(3) equivariance, graph neural networks
Abstract: Protein complex formation is a central problem in biology, being involved in most of the cell’s processes, and essential for applications such as drug design or protein engineering. We tackle rigid body protein-protein docking, i.e., computationally predicting the 3D structure of a protein-protein complex from the individual unbound structures, assuming no three-dimensional flexibility during binding. We design a novel pairwise-independent SE(3)-equivariant graph matching network to predict the rotation and translation to place one of the proteins at the right location and the right orientation relative to the second protein. We mathematically guarantee that the predicted complex is always identical regardless of the initial placements of the two structures, avoiding expensive data augmentation. Our model approximates the binding pocket and predicts the docking pose using keypoint matching and alignment through optimal transport and a differentiable Kabsch algorithm. Empirically, we achieve significant running time improvements over existing protein docking software and predict qualitatively plausible protein complex structures despite not using heavy sampling, structure refinement, or templates.
One-sentence Summary: We perform rigid protein docking using a novel independent SE(3)-equivariant message passing mechanism that guarantees the same resulting protein complex independent of the initial placement of the two 3D structures.
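The alignment step named in the abstract can be illustrated with a minimal sketch of the classical Kabsch algorithm. This is not the authors' code: it uses plain NumPy rather than a differentiable implementation, the function names `kabsch` and `random_rotation` are hypothetical, and the keypoints are synthetic points rather than network predictions. It assumes the ligand keypoints move rigidly with the ligand, which is the behaviour an equivariant network is meant to provide.

```python
import numpy as np

def kabsch(P, Q):
    """Return rotation R and translation t minimizing sum ||R p_i + t - q_i||^2.

    P, Q: (N, 3) arrays of matched points (rows are points).
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    p_mean = P.mean(axis=0)
    q_mean = Q.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets
    H = (P - p_mean).T @ (Q - q_mean)
    U, _, Vt = np.linalg.svd(H)
    # Correct for a possible reflection so that det(R) = +1
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_mean - R @ p_mean
    return R, t

def random_rotation(rng):
    """Random proper rotation from the QR factorization of a Gaussian matrix."""
    M, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(M) < 0:
        M[:, 0] = -M[:, 0]
    return M

rng = np.random.default_rng(0)

# Synthetic "ligand" keypoints and the matching "receptor" keypoints,
# related by a known rigid motion (illustrative data only).
ligand_keypoints = rng.normal(size=(8, 3))
R_true = random_rotation(rng)
t_true = np.array([1.0, -2.0, 0.5])
receptor_keypoints = ligand_keypoints @ R_true.T + t_true

R_est, t_est = kabsch(ligand_keypoints, receptor_keypoints)
docked = ligand_keypoints @ R_est.T + t_est

# Start the ligand from a different arbitrary placement and dock again.
R0, t0 = random_rotation(rng), np.array([10.0, 3.0, -7.0])
moved = ligand_keypoints @ R0.T + t0
R2, t2 = kabsch(moved, receptor_keypoints)
docked_again = moved @ R2.T + t2

print(np.allclose(docked, docked_again))
```

In this toy setting the docked coordinates coincide for both starting placements, which mirrors the placement-independence property the abstract describes, although here it follows simply from the keypoints being rigidly attached to the ligand.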

MIT researchers created a machine-learning model that can directly predict the complex that will form when two proteins bind together. Their technique is between 80 and 500 times faster than state-of-the-art software methods, and often predicts protein structures that are closer to actual structures that have been observed experimentally.

This technique could help scientists better understand some biological processes that involve protein interactions, like DNA replication and repair; it could also speed up the process of developing new medicines.

“Deep learning is very good at capturing interactions between different proteins that are otherwise difficult for chemists or biologists to write experimentally. Some of these interactions are very complicated, and people haven’t found good ways to express them. This deep-learning model can learn these types of interactions from data,” says Octavian-Eugen Ganea, a postdoc in the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) and co-lead author of the paper.

Ganea’s co-lead author is Xinyuan Huang, a graduate student at ETH Zurich. MIT co-authors include Regina Barzilay, the School of Engineering Distinguished Professor for AI and Health in CSAIL, and Tommi Jaakkola, the Thomas Siebel Professor of Electrical Engineering in CSAIL and a member of the Institute for Data, Systems, and Society. The research will be presented at the International Conference on Learning Representations.

Significance of the Scientific Development by the @MIT Team

EquiDock wide applicability:

  • Our method can be integrated end-to-end to boost the quality of other models (see above discussion on runtime importance). Examples are predicting functions of protein complexes [3] or their binding affinity [5], de novo generation of proteins binding to specific targets (e.g., antibodies [6]), modeling backbone and side-chain flexibility [4], or devising methods for non-binary multimers. See the updated discussion in the “Conclusion” section of our paper.


Advantages over previous methods:

  • Our method does not rely on templates or heavy candidate sampling [7], aiming at the ambitious goal of predicting the complex pose directly. This should be interpreted in terms of generalization (to unseen structures) and scalability capabilities of docking models, as well as their applicability to various other tasks (discussed above).


  • Our method obtains a competitive quality without explicitly using previous geometric (e.g., 3D Zernike descriptors [8]) or chemical (e.g., hydrophilic information) features [3]. Future EquiDock extensions would find creative ways to leverage these different signals and, thus, obtain more improvements.


Novelty of theory:

  • Our work is the first to formalize the notion of pairwise independent SE(3)-equivariance. Previous work (e.g., [9,10]) has incorporated only single object Euclidean-equivariances into deep learning models. For tasks such as docking and binding of biological objects, it is crucial that models understand the concept of multi-independent Euclidean equivariances.

  • All propositions in Section 3 are our novel theoretical contributions.

  • We have rewritten the Contribution and Related Work sections to clarify this aspect.


Footnote [a]: We have fixed an important bug in the cross-attention code. We have done a more extensive hyperparameter search and understood that layer normalization is crucial in layers used in Eqs. 5 and 9, but not on the h embeddings as it was originally shown in Eq. 10. We have seen benefits from training our models with a longer patience in the early stopping criteria (30 epochs for DIPS and 150 epochs for DB5). Increasing the learning rate to 2e-4 is important to speed up training. Using an intersection loss weight of 10 leads to improved results compared to the default of 1.



[1] Protein-ligand blind docking using QuickVina-W with inter-process spatio-temporal integration, Hassan et al., 2017

[2] GNINA 1.0: molecular docking with deep learning, McNutt et al., 2021

[3] Protein-protein and domain-domain interactions, Kangueane and Nilofer, 2018

[4] Side-chain Packing Using SE(3)-Transformer, Jindal et al., 2022

[5] Contacts-based prediction of binding affinity in protein–protein complexes, Vangone et al., 2015

[6] Iterative refinement graph neural network for antibody sequence-structure co-design, Jin et al., 2021

[7] Hierarchical, rotation-equivariant neural networks to select structural models of protein complexes, Eismann et al., 2020

[8] Protein-protein docking using region-based 3D Zernike descriptors, Venkatraman et al., 2009

[9] SE(3)-transformers: 3D roto-translation equivariant attention networks, Fuchs et al., 2020

[10] E(n) equivariant graph neural networks, Satorras et al., 2021

[11] Fast end-to-end learning on protein surfaces, Sverrisson et al., 2020




Brain Biobank and studies of disease structure correlates

Larry H. Bernstein, MD, FCAP, Curator



Unveiling Psychiatric Diseases

Researchers create neuropsychiatric cellular biobank

Image: iStock/mstroz
Researchers from Harvard Medical School and Massachusetts General Hospital have completed the first stage of an important collaboration aimed at understanding the intricate variables of neuropsychiatric disease—something that currently eludes clinicians and scientists.

The research team, led by Isaac Kohane at HMS and Roy Perlis at Mass General, has created a neuropsychiatric cellular biobank—one of the largest in the world.

It contains induced pluripotent stem cells, or iPSCs, derived from skin cells taken from 100 people with neuropsychiatric diseases such as schizophrenia, bipolar disorder and major depression, and from 50 people without neuropsychiatric illness.

In addition, a detailed profile of each patient, obtained from hours of in-person assessment as well as from electronic medical records, is matched to each cell sample.

As a result, the scientific community can now for the first time access cells representing a broad swath of neuropsychiatric illness. This enables researchers to correlate molecular data with clinical information in areas such as variability of drug reactions between patients. The ultimate goal is to help treat, with greater precision, conditions that often elude effective management.

The cell collection and generation was led by investigators at Mass General, who in collaboration with Kohane and his team are working to characterize the cell lines at a molecular level. The cell repository, funded by the National Institutes of Health, is housed at Rutgers University.

“This biobank, in its current form, is only the beginning,” said Perlis, director of the MGH Psychiatry Center for Experimental Drugs and Diagnostics and HMS associate professor of psychiatry. “By next year we’ll have cells from a total of four hundred patients, with additional clinical detail and additional cell types that we will share with investigators.”

A current major limitation to understanding brain diseases is the inability to access brain biopsies on living patients. As a result, researchers typically study blood cells from patients or examine post-mortem tissue. This is in stark contrast with diseases such as cancer, for which there are many existing repositories of highly characterized cells from patients.

The new biobank offers a way to push beyond this limitation.


A Big Step Forward

While the biobank is already a boon to the scientific community, researchers at MGH and the HMS Department of Biomedical Informatics will be adding additional layers of molecular data to all of the cell samples. This information will include whole genome sequencing and transcriptomic and epigenetic profiling of brain cells made from the stem cell lines.

Collaborators in the HMS Department of Neurobiology, led by Michael Greenberg, department chair and Nathan Marsh Pusey Professor of Neurobiology, will also work to examine characteristics of other types of neurons derived from these stem cells.

“This can potentially alter the entire way we look at and diagnose many neuropsychiatric conditions,” said Perlis.

One example may be to understand how the cellular responses to medication correspond to the patient’s documented responses, comparing in vitro with in vivo. “This would be a big step forward in bringing precision medicine to psychiatry,” Perlis said.

“It’s important to recall that in the field of genomics, we didn’t find interesting connections to disease until we had large enough samples to really investigate these complex conditions,” said Kohane, chair of the HMS Department of Biomedical Informatics.

“Our hypothesis is that here we will require far fewer patients,” he said. “By measuring the molecular functioning of the cells of each patient rather than only their genetic risk, and combining that with all that’s known of these people in terms of treatment response and cognitive function, we will discover a great deal of valuable information about these conditions.”

Added Perlis, “In the early days of genetics, there were frequent false positives because we were studying so few people. We’re hoping to avoid the same problem in making cellular models, by ensuring that we have a sufficient number of cell lines to be confident in reporting differences between patient groups.”

The generation of stem cell lines and characterization of patients and brain cell lines is funded jointly by the National Institute of Mental Health, the National Human Genome Research Institute and a grant from the Centers of Excellence in Genomic Science program.


On C.T.E. and Athletes, Science Remains in Its Infancy

Se Hoon Choi, Young Hye Kim, Matthias Hebisch, et al.


Alzheimer’s disease is the most common form of dementia, characterized by two pathological hallmarks: amyloid-β plaques and neurofibrillary tangles [1]. The amyloid hypothesis of Alzheimer’s disease posits that the excessive accumulation of amyloid-β peptide leads to neurofibrillary tangles composed of aggregated hyperphosphorylated tau [2, 3]. However, to date, no single disease model has serially linked these two pathological events using human neuronal cells. Mouse models with familial Alzheimer’s disease (FAD) mutations exhibit amyloid-β-induced synaptic and memory deficits but they do not fully recapitulate other key pathological events of Alzheimer’s disease, including distinct neurofibrillary tangle pathology [4, 5]. Human neurons derived from Alzheimer’s disease patients have shown elevated levels of toxic amyloid-β species and phosphorylated tau but did not demonstrate amyloid-β plaques or neurofibrillary tangles [6–11]. Here we report that FAD mutations in β-amyloid precursor protein and presenilin 1 are able to induce robust extracellular deposition of amyloid-β, including amyloid-β plaques, in a human neural stem-cell-derived three-dimensional (3D) culture system. More importantly, the 3D-differentiated neuronal cells expressing FAD mutations exhibited high levels of detergent-resistant, silver-positive aggregates of phosphorylated tau in the soma and neurites, as well as filamentous tau, as detected by immunoelectron microscopy. Inhibition of amyloid-β generation with β- or γ-secretase inhibitors not only decreased amyloid-β pathology, but also attenuated tauopathy. We also found that glycogen synthase kinase 3 (GSK3) regulated amyloid-β-mediated tau phosphorylation. We have successfully recapitulated amyloid-β and tau pathology in a single 3D human neural cell culture system.
Our unique strategy for recapitulating Alzheimer’s disease pathology in a 3D neural cell culture model should also serve to facilitate the development of more precise human neural cell models of other neurodegenerative disorders.



Figure 2: Robust increases of extracellular amyloid-β deposits in 3D-differentiated hNPCs with FAD mutations.

a, Thin-layer 3D culture protocol. HC, histochemistry; IF, immunofluorescence; IHC, immunohistochemistry. b, Amyloid-β deposits in 6-week differentiated control and FAD ReN cells in 3D Matrigel (green, GFP; blue, 3D6; scale bar, …


Stem Cell-Based Spinal Cord Repair Enables Robust Corticospinal Regeneration


Novel use of EPR spectroscopy to study in vivo protein structure



α-synuclein is a protein found abundantly throughout the brain. It is present mainly at the neuron ends where it is thought to play a role in ensuring the supply of synaptic vesicles in presynaptic terminals, which are required for the release of neurotransmitters to relay signals between neurons. It is critical for normal brain function.

However, α-synuclein is also the primary protein component of the cerebral amyloid deposits characteristic of Parkinson’s disease and its precursor is found in the amyloid plaques of Alzheimer’s disease. Although α-synuclein is present in all areas of the brain, these disease-state amyloid plaques only arise in distinct areas.

Alpha-synuclein protein. May play role in Parkinson’s and Alzheimer’s disease.  © molekuul.be / Shutterstock.com

Imaging of isolated samples of α-synuclein in vitro indicates that it does not have the precise 3D folded structure usually associated with proteins. It is therefore classed as an intrinsically disordered protein. However, it was not known whether the protein also lacked a precise structure in vivo.

There have been reports that it can form helical tetramers. Since the 3D structure of a biological protein is usually precisely matched to the specific function it performs, knowing the structure of α-synuclein within a living cell will help elucidate its role and may also improve understanding of the disease states with which it is associated.

If α-synuclein remains disordered in vivo, it may be possible for the protein to achieve different structures, and have different properties, depending on its surroundings.

Techniques for determining protein structure

It has long been known that elucidating the structure of a protein at an atomic level is fundamental for understanding its normal function and behavior. Furthermore, such knowledge can also facilitate the development of targeted drug treatments. Unfortunately, observing the atomic structure of a protein in vivo is not straightforward.

X-ray diffraction is the technique usually adopted for visualizing structures at atomic resolution, but this requires crystals of the molecule to be produced and this cannot be done without separating the molecules of interest from their natural environment. Such processes can modify the protein from its usual state and, particularly with complex structures, such effects are difficult to predict.

The development of nuclear magnetic resonance (NMR) spectroscopy improved the situation by making it possible for molecules to be analyzed under in vivo-like conditions, i.e., at the same pH, temperature and ionic concentration.

More recently, increases in the sensitivity of NMR and the use of isotope labelling have enabled the atomic-level structure and dynamics of proteins to be determined within living cells [1]. NMR has been used to determine the structure of a bacterial protein within living cells [2], but it is difficult to achieve sufficient quantities of the required protein within mammalian cells and to keep the cells alive long enough for NMR measurements to be conducted.

Electron paramagnetic resonance (EPR) spectroscopy for determining protein structure

Recently, researchers have managed to overcome these obstacles by using in-cell NMR and electron paramagnetic resonance (EPR) spectroscopy. EPR spectroscopy is a technique that is similar to NMR spectroscopy in that it is based on the measurement and interpretation of the energy differences between excited and relaxed molecular states.

In EPR spectroscopy it is electrons that are excited, whereas in NMR signals are created through the spinning of atomic nuclei. EPR was developed to measure radicals and metal complexes, but has also been utilized to study the dynamic organization of lipids in biological membranes [3].

EPR has now been used for the first time in protein structure investigations and has provided atomic-resolution information on the structure of α-synuclein in living mammalian cells [4,5].

Bacterial forms of the α-synuclein protein labelled with 15N isotopes were introduced into five types of mammalian cell using electroporation. Concentrations of α-synuclein close to those found in vivo were achieved, and the 15N isotopes allowed the protein to be clearly distinguished from other cellular components by NMR. The conformation of the protein was then determined using electron paramagnetic resonance (EPR).

The results showed that within living mammalian cells α-synuclein remains as a disordered and highly dynamic monomer. Different intracellular environments did not induce major conformational changes.


The novel use of EPR spectroscopy has resolved the mystery surrounding the in vivo conformation of α-synuclein. It showed that α-synuclein maintains its disordered monomeric form under physiological cell conditions. It has been demonstrated for the first time that even in crowded intracellular environments α-synuclein does not form oligomers, showing that intrinsic structural disorder can be sustained within mammalian cells.


  1. Freedberg DI and Selenko P. Live cell NMR. Annu. Rev. Biophys. 2014;43:171–192.
  2. Sakakibara D, et al. Protein structure determination in living cells by in-cell NMR spectroscopy. Nature 2009;458:102–105.
  3. Yashroy RC. Magnetic resonance studies of dynamic organisation of lipids in chloroplast membranes. Journal of Biosciences 1990;15(4):281.
  4. Alderson TA and Bax AD. Parkinson’s Disease. Disorder in the court. Nature 2016; doi:10.1038/nature16871.
  5. Theillet FX, et al. Structural disorder of monomeric α-synuclein persists in mammalian cells. Nature 2016; doi:10.1038/nature16531.



Effect of Heat Shock on Protein Folding

Larry H. Bernstein, MD, FCAP, Curator



Getting Back in Shape

Contrary to years of research suggesting otherwise, most aggregated proteins regain their shape and functionality following heat shock.

By Karen Zusi | Dec 1, 2015    http://www.the-scientist.com//?articles.view/articleNo/44506/title/Getting-Back-in-Shape/


SHOCKED OUT OF SHAPE: Exogenous proteins in a cell denature and aggregate in misfolded clumps when heat-shocked. During cell recovery, specialized molecular chaperone proteins degrade and dispose of the aggregates. A small portion of the exogenous proteins may refold, escaping degradation. © EVAN OTO/SCIENCE SOURCE


Mature endogenous proteins aggregate in an organized fashion when heat-shocked, remodeling the cell’s protein synthesis machinery to facilitate survival. During recovery, molecular chaperones free the mature proteins to resume their normal activity. © EVAN OTO/SCIENCE SOURCE


The paper
E.W.J. Wallace et al., “Reversible, specific, active aggregates of endogenous proteins assemble upon heat stress,” Cell, 162:1286-98, 2015.

Like cooking an egg, heating up a purified protein enough will denature it, destroying the 3-D structure key to its functionality. The protein unfolds in a one-way trip to a fried state. Previous studies of this phenomenon in cells often used exogenous proteins, which clumped together in response to heat and were largely degraded by the cell’s internal cleanup machinery—a set of molecular chaperones known as heat-shock proteins—if the cell survived.

“That, and other examples, had convinced people that what they were seeing inside cells, the clumps of proteins, represented a disaster—these giant piles of damaged proteins shoved together inside the cell so they can ultimately be cleaned up,” reflects Allan Drummond, a molecular biologist at the University of Chicago. “Nobody had really looked systematically at what happens to the proteins that are native to the cell.”

To do so, Drummond and colleagues tagged proteins in yeast cells with a set of stable isotope labels. They subjected the cells to temperatures that would stress, but not kill, them and then flash-froze them within minutes to capture a snapshot of protein aggregation at different intervals. The researchers then analyzed the clumped proteins using mass spectrometry.

Drummond’s group identified 982 proteins in the yeast cells, 177 of which aggregated in the cytosol and nucleus after heat shock. However, the researchers’ isotope labels revealed something unexpected: aggregated proteins were unclumping and returning to their previous states during the cell’s recovery period. “We had no cases that we were able to detect where proteins go into aggregates and then are degraded,” says Drummond.

“It’s really challenging this long-held assumption that heat stress causes terminal aggregation. They’ve shown that we get these aggregates or assemblies that are reversible and that may actually be pro-survival,” says Kevin Morano, a microbiologist at the University of Texas Health Science Center at Houston who studies cellular responses to stress. “It is paradigm-shifting in a sense. We thought they were destroyed.” But in fact, the researchers found, the yeast cells began growing again without making substantial numbers of new proteins, suggesting that the proteins coming out of the aggregates were still functional.

Drummond’s group figures that heat stress, and the ensuing aggregation, may trigger the cell to produce more of its molecular chaperones to aid in recovery. The team observed a three-protein complex, needed to synthesize chaperones, that was active even while clumped with other proteins. “There are many different things that aggregation seems to be doing,” says Drummond. “It’s stopping the synthesis of most proteins, but promoting synthesis of a small set of proteins that are called in response to heat shock.”

Postdoc Edward Wallace, the lead author on the study, says halting the production of most other proteins may protect newly synthesized ones. Before they have folded into their mature shapes, new proteins are more susceptible to heat shock—and may still be degraded following stress. “We speculate that [the aggregated complexes] are remodeling protein synthesis, stopping the majority of new proteins from being made, and thus preventing the aggregation of these newly synthesized proteins,” says Wallace.

Editor’s Note (December 2): The sub-headline for this article was changed to emphasize the finding that most proteins regain functionality.

Reversible, Specific, Active Aggregates of Endogenous Proteins Assemble upon Heat Stress

Edward W.J. Wallace, Jamie L. Kear-Scott, Evgeny V. Pilipenko, Michael H. Schwartz,…, Edoardo M. Airoldi, Tao Pan, Bogdan A. Budnik, D. Allan Drummond
Cell 10 Sept 2015; 162(6):1286–1298   DOI: http://dx.doi.org/10.1016/j.cell.2015.08.041
  • Mass spectrometry quantifies aggregation of endogenous proteins during heat stress
  • Aggregates form rapidly in specific subcellular compartments
  • Endogenous protein aggregates are disassembled without degradation during recovery
  • In vitro, a heat-aggregated enzyme complex retains activity and fidelity



Heat causes protein misfolding and aggregation and, in eukaryotic cells, triggers aggregation of proteins and RNA into stress granules. We have carried out extensive proteomic studies to quantify heat-triggered aggregation and subsequent disaggregation in budding yeast, identifying >170 endogenous proteins aggregating within minutes of heat shock in multiple subcellular compartments. We demonstrate that these aggregated proteins are not misfolded and destined for degradation. Stable-isotope labeling reveals that even severely aggregated endogenous proteins are disaggregated without degradation during recovery from shock, contrasting with the rapid degradation observed for many exogenous thermolabile proteins. Although aggregation likely inactivates many cellular proteins, in the case of a heterotrimeric aminoacyl-tRNA synthetase complex, the aggregated proteins remain active with unaltered fidelity. We propose that most heat-induced aggregation of mature proteins reflects the operation of an adaptive, autoregulatory process of functionally significant aggregate assembly and disassembly that aids cellular adaptation to thermal stress.


Investigating Functional Compensation by Human Paralogous Proteins

Larry H. Bernstein, MD, FCAP, Curator




Using Disease-Associated Coding Sequence Variation to Investigate Functional Compensation by Human Paralogous Proteins

Evolutionary Bioinformatics 2015:11 245-251    http://dx.doi.org/10.4137/EBO.S30594


In this article, we examined the functional compensation among paralogs as a general phenomenon through an analysis of disease-associated genetic variation in humans [23–26]. In contrast to expectations under the functional compensation hypothesis, we found that multigene families have a greater tendency to harbor dSNVs than singleton proteins. We proposed that differences in functional constraints (evolutionary constraint hypothesis) explain the observed pattern to a large degree.


Gene duplication enables the functional diversification in species. It is thought that duplicated genes may be able to compensate if the function of one of the gene copies is disrupted. This possibility is extensively debated with some studies reporting proteome-wide compensation, whereas others suggest functional compensation among only recent gene duplicates or no compensation at all. We report results from a systematic molecular evolutionary analysis to test the predictions of the functional compensation hypothesis. We contrasted the density of Mendelian disease-associated single nucleotide variants (dSNVs) in proteins with no discernable paralogs (singletons) with the dSNV density in proteins found in multigene families. Under the functional compensation hypothesis, we expected to find greater numbers of dSNVs in singletons due to the lack of any compensating partners. Our analyses produced an opposite pattern; paralogs have over 35% higher dSNV density than singletons. We found that these patterns are concordant with similar differences in the rates of amino acid evolution (ie, functional constraints), as the proteins with paralogs have evolved 33% slower than singletons. Our evolutionary constraint explanation is robust to differences in family sizes, ages (young vs. old duplicates), and degrees of amino acid sequence similarities among paralogs. Therefore, disease-associated human variation does not exhibit significant signals of functional compensation among paralogous proteins, but rather an evolutionary constraint hypothesis provides a better explanation for the observed patterns of disease-associated and neutral polymorphisms in the human genome.



Gene duplication is an important mechanism for the origin of novelty in evolution [1–3]. When a gene is duplicated, one of the duplicate copies usually decays within a few million years due to an accumulation of deleterious mutations [4]. However, duplicates may be retained if they become functionally important to the organism [5–7]. It has been suggested that duplicate genes may be able to carry out the original gene function, which means that paralogs may compensate for each other [8,9]. Gene knockout/knockdown experiments have been conducted in multiple species to examine the degree of functional redundancy in gene families. The results suggest that the loss of function in genes with paralogs is associated with higher organismal survival than the loss of function in genes without any known paralogs (singletons), supporting the functional compensation hypothesis [10–16]. However, Liao and Zhang [17] reported that duplicates rarely compensate for each other in mice, which has been debated [18–22]. Overall, experimental data have not yet provided definitive evidence about whether paralogous genes do compensate for each other in most instances.

The predictions of functional compensation can be tested computationally by analyzing the disease-associated genetic variation in humans. These variants are currently experiencing negative selection in the human populations, which means that they constitute data of functional impact in nature. If functional compensation among gene family members is substantial, it is expected that fewer significant statistical associations between variants and disease phenotypes will be detected for proteins in multigene families than for singletons. Using this idea, Dickerson and Robertson [23] tested the predictions of functional compensation and found no difference between the proportion of singletons and paralogs implicated in diseases (2% difference), supporting the conclusions of Liao and Zhang [17]. However, they and others have suggested that recently diverged paralogs are less likely to be disease-associated than singletons and proteins with distantly related paralogs [23–26]. These results suggest functional redundancy among young gene duplicates.

However, the abovementioned computational studies have not accounted for many potentially confounding factors. First, disease-associated single nucleotide variants (dSNVs) are found preferentially at slowly evolving amino acid positions [27]; thus, we expect to observe a higher frequency of dSNVs in more conserved proteins. This could distort comparisons between singletons and multigene family proteins if the distributions of amino acid evolutionary rates are not the same for these two classes. Second, the numbers of dSNVs found in different proteins are not expected to be the same because the numbers of amino acids in proteins vary by an order of magnitude. This means that commonly used metrics, such as the relative fractions of disease and nondisease proteins in different protein classes, are too coarse. Metrics that take into account the number of amino acids in proteins (sequence length) are necessary for more robust hypothesis testing.
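The contrast between a coarse fraction-based metric and a length-aware one can be sketched with a toy calculation. The protein records below are invented for illustration and are not data from the study; the names `Protein`, `disease_fraction` and `dsnv_density_per_kaa` are hypothetical, and pooling counts over total sequence length is only one plausible way to define a density, which may differ from the authors' exact definition.

```python
from dataclasses import dataclass

@dataclass
class Protein:
    name: str
    length: int       # number of amino acids
    dsnv_count: int   # disease-associated single nucleotide variants observed

def disease_fraction(proteins):
    """Coarse metric: share of proteins carrying at least one dSNV."""
    return sum(p.dsnv_count > 0 for p in proteins) / len(proteins)

def dsnv_density_per_kaa(proteins):
    """Length-aware metric: dSNVs per 1000 amino acids, pooled over the class."""
    total_dsnv = sum(p.dsnv_count for p in proteins)
    total_len = sum(p.length for p in proteins)
    return 1000.0 * total_dsnv / total_len

# Invented example records (not real proteins or real counts).
singletons = [Protein("S1", 200, 0), Protein("S2", 2000, 4)]
paralogs = [Protein("P1", 300, 0), Protein("P2", 500, 4)]

for label, group in (("singletons", singletons), ("paralogs", paralogs)):
    print(label, disease_fraction(group), round(dsnv_density_per_kaa(group), 2))
```

In this invented example both classes have the same fraction of disease proteins, yet the density differs because the same number of variants falls on far fewer residues in one class, which is the kind of distortion a length-normalized metric is meant to expose.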

In the following section, we tested the hypothesis of functional compensation by considering the abovementioned factors to better understand the genome-wide pattern of functional evolution in gene families, which is vital for understanding genome evolution and predicting disruptive effects of the mutations of proteins that have paralogs.

We obtained a set of 15,485 human proteins and their homologs from 46 diverse species from the UCSC genome browser (see Material and Methods). For each protein, we also obtained a list of paralogs from the HOVERGEN database.28 Our set of proteins is representative of the whole human gene set because about half (52%) of these proteins have at least one paralog, a fraction that is similar to the overall fraction of proteins with paralogs in the human genome (49% in HOVERGEN database28). For each human protein, we computed the average rate of amino acid substitution (number of substitutions per site per billion years) using the interspecific amino acid sequence alignments (see Material and Methods). Figure 1 shows the distributions of evolutionary rates in singleton and multigene family proteins. Overall, singletons are less conserved than multigene family proteins, with a ∼20% mean and ∼30% median difference (P < 0.01 by two-sample Kolmogorov–Smirnov test; Fig. 1A). Similar patterns are observed when considering paralogs belonging to small (2–5) and large (>5) multigene families (P < 0.01; Fig. 1B).
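The two-sample Kolmogorov–Smirnov comparison used above rests on the largest gap between the two empirical cumulative distributions. The following is a minimal sketch of that statistic only, not the authors' pipeline; the function name `ks_statistic` and the example rates are hypothetical, and no P value is computed here.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Return the two-sample KS statistic: the largest absolute gap
    between the empirical CDFs of the two samples."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    max_gap = 0.0
    # The gap can only change at observed values, so checking those suffices.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


# Hypothetical evolutionary rates (substitutions per site per billion years).
singleton_rates = [0.5, 0.9, 1.1, 1.6]
multigene_rates = [0.3, 0.6, 0.7, 1.0]
d = ks_statistic(singleton_rates, multigene_rates)
```

A larger statistic indicates distributions that differ more; turning it into a P value requires the sample sizes and the KS distribution, which a statistics library would normally supply.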


Figure 1. Distributions of evolutionary rates of singleton (broken line) and multigene family proteins (solid or dotted line). (A) Evolutionary rates are in the units of the number of amino acid substitutions per amino acid site per billion years. The mean and median of these distributions are 1.05 and 0.89, respectively, for singletons, and 0.80 and 0.61, respectively, for proteins in multigene families. These distributions are significantly different (two-sample Kolmogorov–Smirnov test; P < 0.01). (B) Multigene family proteins were separated into those with two to five paralogs (small family; solid line) and greater than five paralogs (large family; dotted line). The mean and median of these distributions are 0.75 and 0.60, respectively, for the proteins from the small multigene families (two to five paralogs) and 0.87 and 0.63, respectively, for the proteins from the large multigene families (greater than five paralogs). These distributions are significantly different from the distribution for singletons (P < 0.01).


dSNVs in singletons and multigene families. We analyzed all available SNVs associated with Mendelian diseases in singleton and multigene family proteins. There were a total of 47,382 dSNVs in 2,589 proteins. In these data, the proportion of proteins with at least one dSNV was slightly lower (by 2.2%) for singletons than for proteins with paralogs, which is consistent with recent reports.23,29 However, the number of dSNVs in proteins varied extensively and was found to be positively correlated with protein length (P < 0.05 for multigene family and singleton proteins; Fig. 2). This is reasonable because longer proteins should have a greater chance of accumulating random mutations and are, therefore, more likely to be classified as disease genes. Thus, we normalized the number of dSNVs by protein length to avoid any bias due to length differences in subsequent analyses.


Figure 2. Distributions of the number of dSNVs. (A) A frequency diagram showing the number of proteins with at least one dSNV. (B) The average number of dSNVs per protein for proteins at different length thresholds at 100 amino acid intervals. The average number of dSNVs per protein is positively correlated with the average protein length for both multigene family (correlation = 0.005; P < 0.01) and singleton proteins (correlation = 0.002; P = 0.04).


We compared the number of dSNVs per 100 amino acid positions (dSNV density) between multigene family and singleton proteins. Multigene family proteins have a 1.6 times higher density of dSNVs than singleton proteins (0.66 and 0.42, respectively). We can statistically reject the null hypothesis of equal dSNV densities in singletons and multigene family proteins (P < 0.01). However, the direction of the effect is opposite to the predictions of functional compensation from paralogous genes in multigene families, as the multigene family proteins contained significantly more dSNVs than singletons. We tested the influence of outliers on this result by excluding all proteins with >0.5 dSNVs per amino acid. This reduced the number of proteins slightly (131 proteins were excluded), but the ratio of multigene family and singleton protein dSNV densities remained unchanged (1.6; P < 0.01). We, nevertheless, excluded all proteins in which the number of dSNVs per position was >0.5 in all subsequent analyses to remove the influence of proteins with unusually high dSNV density when comparing the patterns between different classes of proteins. We also tested whether the observed patterns reflect the mutations of specific amino acids (eg, arginine) that comprise a major fraction of the dataset of dSNVs (16%). Arginine codons contain a CpG dinucleotide in the first two positions and are, thus, more prone to transitional mutations, leading to amino acid variation.30 We computed the dSNV densities using only the arginine positions in proteins and found the dSNV density in multigene family proteins to be 1.5 times greater than that observed in singletons (0.09 and 0.06, respectively; P < 0.01). A similar pattern was observed for glycine (replacement of glycine residues occurs for 12% of dSNVs in this dataset). The dSNV density in multigene family proteins was twice that observed in singletons (0.08 and 0.04, respectively; P < 0.01).
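The length-normalized density and the outlier filter described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the text does not say whether densities were pooled across proteins or averaged per protein, so pooling is assumed here, and the function name `dsnv_density` and the example counts are hypothetical.

```python
def dsnv_density(proteins, max_per_residue=0.5):
    """Pooled number of dSNVs per 100 amino acid positions.

    `proteins` is an iterable of (n_dsnvs, length) pairs. Proteins with more
    than `max_per_residue` dSNVs per amino acid are skipped as outliers.
    Pooling across proteins is an assumption made for this sketch.
    """
    total_variants = 0
    total_length = 0
    for n_dsnvs, length in proteins:
        if length <= 0:
            raise ValueError("protein length must be positive")
        if n_dsnvs / length > max_per_residue:
            continue  # unusually dense protein, excluded
        total_variants += n_dsnvs
        total_length += length
    if total_length == 0:
        raise ValueError("no proteins left after filtering")
    return 100.0 * total_variants / total_length


# Hypothetical counts: (number of dSNVs, protein length in amino acids).
multigene = [(4, 500), (2, 400), (300, 400)]  # last entry exceeds 0.5 per residue
singletons = [(1, 300), (2, 700)]

multigene_density = dsnv_density(multigene)
singleton_density = dsnv_density(singletons)
density_ratio = multigene_density / singleton_density
```

Restricting the counts and lengths to one residue type (for example, arginine positions only) would give the residue-specific densities in the same way.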

Finally, we looked for the signatures of functional compensation using dSNVs that are expected to be the most severe, with the rationale that functional compensation may be easier to detect, as ameliorating severe phenotypic effects will have greater relative effect on individual fitness. We designated a dSNV to be severe if the predicted functional impact score for the variant was in the top 5% of all dSNVs (see Material and Methods). For these data, the multigene family proteins have a dSNV density 2.3 times higher than that observed for singletons (0.034 and 0.015, respectively; P < 0.01), which does not support the functional compensation hypothesis. Therefore, the patterns of greater abundance of dSNVs in multigene families are robust to the predicted effect sizes of dSNVs analyzed and the amino acid composition bias of the variation dataset.

Relationship of evolutionary conservation and dSNVs.

We examined whether the difference in protein conservation between singletons and multigene family proteins can explain the abovementioned pattern because it is now well established that highly conserved proteins are significantly more likely to contain dSNVs.27,31 Because the protein evolutionary rate distributions are neither normal nor symmetrical (Fig. 1), we compared medians (0.61 and 0.89, respectively) and found a ratio of 0.69 between the multigene family and singleton proteins. The inverse of this ratio (1.5) is only slightly different from the ratio of dSNV densities (1.6). This similarity suggests that the higher rate of dSNVs in multigene family proteins is mostly explained by the degree of functional constraint on proteins in multigene families versus singleton proteins. Based on this observation, we propose the evolutionary constraint hypothesis, which posits that the differences in dSNV densities among different classes of proteins (eg, singleton vs. multigene) are primarily a result of the differences in the degree of natural selection acting upon them. If true, this would be consistent with the neutral theory of molecular evolution.32 The evolutionary constraint hypothesis does not preclude the existence of functional compensation (among other factors) in some proteins or positions, but it does claim that differences in the intensity of purifying selection will be the primary cause of observed differences in the preponderance of SNVs in different groups of proteins.
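The comparison of ratios above is simple arithmetic on the reported medians and densities, and it can be checked directly. The variable names below are illustrative; the numbers are the ones quoted in the text.

```python
# Median evolutionary rates reported in the text
# (substitutions per site per billion years).
median_rate_multigene = 0.61
median_rate_singleton = 0.89

# dSNV densities reported in the text (dSNVs per 100 amino acids).
density_multigene = 0.66
density_singleton = 0.42

rate_ratio = median_rate_multigene / median_rate_singleton  # about 0.69
inverse_rate_ratio = 1.0 / rate_ratio                       # about 1.5
density_ratio = density_multigene / density_singleton       # about 1.6
```

The inverse is taken from the unrounded ratio; inverting the rounded value 0.69 would give roughly 1.45 instead.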

We tested the prediction of the evolutionary constraint hypothesis in an analysis of 12,952 common neutral SNVs (nSNVs) obtained from the 1000 Genomes Project.33 These common nSNVs are complementary in nature to dSNVs, as common nSNVs persist in the human population and have risen to moderate frequencies (>5%) because their impact on fitness is effectively neutral (opposite of dSNVs that cause disease). Therefore, if functional constraints and, thus, the conservation level of human protein sequence explain the observed differences in dSNV density, we should also observe fewer nSNVs in multigene family proteins, as these proteins evolve more slowly and are expected to be subject to more severe purifying selection.34 Indeed, the nSNV density (number of nSNVs per 100 amino acids) in multigene family proteins was lower than that of singletons (ratio = 0.82; 0.13 and 0.16, respectively; P < 0.01). This ratio (0.82) is again similar to the ratio of the evolutionary rates (0.69) for these two classes of proteins. These results suggest that the occurrence of dSNVs and nSNVs in proteins is largely concordant with the degree of functional constraint on proteins, which is captured in their evolutionary rates.

Disease SNV prevalence in proteins with young and old paralogs.

Next, we tested the hypothesis that functional compensation is more common in proteins with younger paralogs.23,24 If functional compensation generally occurs only for a brief period after the gene duplication event, then the most recently diverged paralogs will provide the most powerful signal to detect functional compensation. We first identified the closest paralog for each protein within a given gene family by selecting the paralog with the smallest nucleotide divergence in their codons (third positions only). To estimate the relative antiquity of the duplicate event, we used the protein-specific human–mouse third positions in codons to normalize each closest paralog divergence across gene families (see Materials and Methods). This normalized value yields an approximate gene duplication time when it is scaled using the human–mouse divergence time (92.3 million years ago35). This approximation is reasonable, as third positions in codons evolve relatively neutrally and because we use divergence times primarily for identifying and sorting young paralogs for hypothesis testing.
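The closest-paralog selection and age scaling described above can be sketched as follows. The exact normalization is given only in the paper's Materials and Methods, so the simple ratio used here (paralog divergence divided by human–mouse divergence, multiplied by 92.3 million years) is an assumption, and the function names and example values are hypothetical.

```python
HUMAN_MOUSE_DIVERGENCE_MYA = 92.3  # divergence time quoted in the text


def closest_paralog(third_position_divergences):
    """Return (paralog_id, divergence) for the paralog with the smallest
    third-codon-position divergence from the protein of interest."""
    if not third_position_divergences:
        raise ValueError("protein has no paralogs")
    return min(third_position_divergences.items(), key=lambda item: item[1])


def duplication_age_mya(paralog_divergence, human_mouse_divergence):
    """Approximate duplication age, assuming third positions evolve at a
    roughly constant, protein-specific neutral rate (an assumption)."""
    if human_mouse_divergence <= 0:
        raise ValueError("human-mouse divergence must be positive")
    normalized = paralog_divergence / human_mouse_divergence
    return normalized * HUMAN_MOUSE_DIVERGENCE_MYA


# Hypothetical third-position divergences (substitutions per site).
paralogs = {"PARALOG_A": 0.9, "PARALOG_B": 0.3}
name, divergence = closest_paralog(paralogs)
age = duplication_age_mya(divergence, human_mouse_divergence=0.6)
```

Because the ages are used mainly to rank and bin duplicates, the relative ordering matters more than the absolute values this approximation produces.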

Density of dSNV for duplicates that have diverged from their paralogs in the last 200 million years shows a tendency to increase with estimated duplicate age (Fig. 3A). The same pattern is observed for the positions of arginine and glycine and those with predicted severe functional impacts (Fig. 3B–D). Also, the dSNV densities for the youngest duplicates are lower than those for singletons (triangle in Fig. 3). We found that the evolutionary rate of proteins is negatively correlated with time since duplication, and the youngest duplicates have higher evolutionary rates than singletons (Fig. 4A). These patterns do not support the functional compensation hypothesis23 and are consistent with our evolutionary constraint hypothesis. These trends are confirmed in the analysis of nSNV densities that showed expected complementary patterns (Fig. 4B).


Figure 3. The dSNV density in duplicates over time. Each point shows the dSNV density of all proteins with duplication age less than or equal to a threshold time (x-axis; 10 million year intervals). The dSNV density of singletons is shown with a triangle. Panels show patterns obtained for all dSNVs (A), arginine dSNVs (B), and glycine dSNVs (C). Panel D shows patterns for dSNVs with severe impact predicted by EvoD.46


Figure 4. The average evolutionary rates (A) and nSNV densities (B) of all proteins with duplication age less than or equal to a threshold time (x-axis; 10 million year intervals). The decreasing trend for evolutionary rate (A) is opposite to that observed for dSNVs, but it is similar to that observed for nSNVs (B). In each panel, the triangle shows the value for singletons.
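The cumulative points plotted in Figures 3 and 4 (all proteins with duplication age at or below each threshold) can be sketched as below for the dSNV density case. This is an illustration, not the authors' code; pooled density is assumed, and the function name and example records are hypothetical.

```python
def cumulative_density_by_age(proteins, max_age=200, step=10):
    """Return [(threshold_mya, density)] where density is the pooled number
    of dSNVs per 100 amino acids over all proteins whose duplication age is
    less than or equal to the threshold. Density is None if no protein
    qualifies at that threshold.

    `proteins` is a list of (age_mya, n_dsnvs, length) tuples.
    """
    points = []
    for threshold in range(step, max_age + 1, step):
        subset = [(n, length) for age, n, length in proteins if age <= threshold]
        total_length = sum(length for _, length in subset)
        if total_length == 0:
            points.append((threshold, None))
            continue
        total_variants = sum(n for n, _ in subset)
        points.append((threshold, 100.0 * total_variants / total_length))
    return points


# Hypothetical records: (duplication age in My, dSNV count, protein length).
records = [(5, 1, 500), (25, 3, 500), (150, 10, 1000)]
curve = cumulative_density_by_age(records, max_age=30, step=10)
```

Each threshold includes every younger duplicate as well, so successive points are not independent of one another.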


Disease SNV prevalence in proteins with very similar paralogs.

We also tested the functional compensation hypothesis in proteins that show high amino acid sequence similarities with their paralogs, as studied by Hsiao and Vitkup.24 We found that paralogs with the highest amino acid sequence similarities (>95%) actually have higher dSNV densities than other paralogs (0.98 vs. 0.57; P < 0.01). This is inconsistent with the functional compensation hypothesis but agrees with our evolutionary constraint hypothesis because the evolutionary rates were lower in paralogs with >95% similarity (0.59 and 0.78 substitutions/site/billion years; P < 0.01). Therefore, differences in the degree of functional constraint (measured using evolutionary rates) account for the observed patterns of dSNV densities.

Next, we compared nSNV densities in paralogs with >95% sequence similarity to those with ≤95% similarity. For this comparison, we needed to be cognizant of the fact that variant calls are difficult when the paralogs have very similar DNA sequences.36–39 This is the case for paralogs with >95% amino acid sequence similarity because most of these proteins also showed small divergences at the third positions in codons between paralogs (≤0.2 substitutions per site). To accommodate the variant call errors, we used proteins with ≤0.2 distance (third positions) for comparison between paralogs for two groups of proteins (225 and 69 proteins). The nSNV density was 0.30 and 0.52 for proteins that have paralogs with >95% and ≤95% sequence similarity, respectively (P < 0.01). The former proteins are more conserved (rate = 0.89) than the latter (rate = 1.97; P < 0.01), and so the result is consistent with the evolutionary constraint hypothesis.


In this article, we examined the functional compensation among paralogs as a general phenomenon through an analysis of disease-associated genetic variation in humans.23–26 In contrast to expectations under the functional compensation hypothesis, we found that multigene families have a greater tendency to harbor dSNVs than singleton proteins. We proposed that differences in functional constraints (evolutionary constraint hypothesis) explain the observed pattern to a large degree. We confirmed that singleton proteins show lower functional constraint than proteins with identifiable duplicates in the genome, which explains the increased detection of disease-associated variation observed in multigene families.

Some recent theoretical and empirical studies suggest that functional compensation can lead to enhanced purifying selection and, therefore, may actually be associated with slower evolutionary rates.14,40 Other studies indicate that the youngest duplicates are evolving under relaxed selection pressures, which would cause an increase in evolutionary rates for a few million years.4 Such short-term and localized rate changes (faster or slower) will not have significant impact on the estimates of very long-term evolutionary rates that we have used to quantify the functional constraint. We have calculated the evolutionary rates using sequence differences in proteins that have accumulated changes for hundreds of millions of years across major groups of vertebrates. There is no evidence that pervasive functional compensation exists across the phylogenetic breadth and genomic scale reflected in our analyses. We expect our major conclusions to hold true in general, while acknowledging that functional compensation may occur in some multigene families and some amino acid positions. In summary, we suggest that there is a need to fully consider differences in the evolutionary conservation of proteins when studying the patterns of sequence variation and variant–phenotype associations.






Dynamic Protein Profiling

Larry H. Bernstein, MD, FCAP, Curator


Dynamic profiling of the protein life cycle in response to pathogens

The protein lifecycle is regulated by mRNA expression, translation,
and degradation. Image courtesy of Broad Communications.

Cellular protein levels are dictated by the net balance of mRNA expression (the type of RNA that provides genetic information for proteins), protein synthesis, and protein degradation. While changes in protein levels are commonly inferred from measuring changes in mRNA levels (due to the difficulties involved in measuring protein levels), it’s not often clear whether determining RNA levels is actually a good proxy for measuring protein levels.

In their recent article in the journal Science, Broad Institute researchers working in core member Aviv Regev’s and institute member Nir Hacohen’s laboratories, along with the Broad’s Proteomics Platform led by Steve Carr, describe a quantitative genomic model that lets them explain the abundance of proteins in cells based on mRNA expression, translation, and degradation. They performed their study in mouse dendritic cells stimulated with LPS, a component of bacteria.

While previous studies had looked at global levels of regulation in rapidly-dividing, unstimulated cells, this work focuses on understanding how much of the change in protein levels is due to a change in mRNA expression, translation, and degradation in specific genes and classes of genes in response to a stimulus – in this case, LPS. For example, would the changes in levels of one class of proteins be mostly driven by changes in the levels of the mRNAs that encode them? On the other hand, would changes in the levels of other groups of proteins occur without changes in mRNA, but rather due to faster translation or slower degradation of the protein? These were the type of questions the scientists were interested in.

Explains co-first author Marko Jovanovic, “Can we, in a dynamic system, integrate RNA and protein life cycle data? People rarely do this, and never systematically. Can we really make a global model of gene expression where we know, in the end, how much each type of regulatory layer is contributing to each gene? You can get a global answer too, but straight percentages of global contribution of RNA levels and the protein life cycle to final protein levels was not my goal. My question was really, do we see that certain classes of genes are controlled one way and certain other classes another way and therefore gain new regulatory insight?”

Since changes in protein levels are not as dramatic and fast as changes in RNA levels, one of the greatest challenges they faced in their study was distinguishing actual signal from noise. Co-first author Michael Rooney explains how they tackled this problem: “While the quantitative accuracy of mass spectrometry has grown tremendously, we realized that statistical strategies for handling stochastic and systematic errors in the data would still be critical to getting correct results. As a first step, we developed a generative statistical model for the data. This allowed us to leverage the entire time course in a manner that was robust to missing values and stochastic variation. Second, we saw that the contribution of translation might be over-estimated if we allowed translation rates and protein levels to be calculated from the same experimental system, because in such a case they would both be confounded by the same systematic errors, making them appear more similar than they actually are. This led us to the novel strategy of creating biological replicates prepared by distinct peptide library protocols.”

In this way, the team was able to robustly build a dynamic model in which the mRNA synthesis rate, the translation rate, and the protein degradation rate change over time. Based on this model, it was possible to estimate how much each of the three types of regulation contributes to the change in the level of each protein and, from that, to measure the relative contributions of each type of regulation globally, per gene class, and per gene.
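To make the three regulatory layers concrete, here is a minimal sketch of a textbook kinetic relation, dP/dt = k_tl(t) · mRNA(t) − k_deg(t) · P(t), integrated with a simple Euler step. This is an assumed illustrative form, not the generative statistical model the Broad team describes; the function name, rate values, and units are hypothetical.

```python
def simulate_protein(mrna, translation_rate, degradation_rate, p0, dt):
    """Euler integration of dP/dt = k_tl(t) * mRNA(t) - k_deg(t) * P(t).

    The three input sequences give mRNA level, translation rate, and
    degradation rate at each time step and must have equal length.
    Returns the protein level at every step, starting with p0.
    """
    if not (len(mrna) == len(translation_rate) == len(degradation_rate)):
        raise ValueError("input time series must have equal length")
    protein = [p0]
    for r, k_tl, k_deg in zip(mrna, translation_rate, degradation_rate):
        current = protein[-1]
        protein.append(current + dt * (k_tl * r - k_deg * current))
    return protein


steps = 50
# Hypothetical steady state: synthesis (2.0 * 10) balances loss (0.1 * 200).
baseline = simulate_protein([10.0] * steps, [2.0] * steps, [0.1] * steps,
                            p0=200.0, dt=0.1)
# Same mRNA level, but translation doubled: protein rises without any
# change in mRNA, the kind of effect mRNA measurements alone would miss.
boosted = simulate_protein([10.0] * steps, [4.0] * steps, [0.1] * steps,
                           p0=200.0, dt=0.1)
```

Comparing runs that change one rate at a time gives a rough sense of how each layer contributes to a protein-level change, which is the question the study addresses with a far more careful statistical treatment.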

Analyzing the LPS-stimulated dendritic cells, the researchers found that, overall, mRNA expression dominates the regulation strategies, accounting for up to 90% of the fold changes in protein levels. This is a significant increase from their pre-stimulation measurements, which showed regulation of mRNA expression contributing 60-70%, translation 15-25%, and degradation 10-20%.

What appeared to be regulated more substantially by the protein lifecycle (translation, degradation) were highly expressed genes. And, looking at changes in the number of protein molecules rather than just the relative fold changes in pre- versus post-stimulated cells, what emerges is that post-stimulation, regulation at the level of the protein lifecycle begins to dominate.

The findings lead to a model for the LPS-stimulated system in which protein expression associated with dendritic cell-specific functions is handled by regulation at the level of RNA expression. However, the readjustment of the pre-existing proteome when the cells enter a new state (for example, in response to pathogen stimulation) is controlled via regulation of the protein life cycle (translation, degradation) rather than RNA expression.

“We termed this the ‘cupcake model’,” says Jovanovic. “You have to forgive me, this is my European view on how I see people buy cupcakes. They go into the store and choose the cupcake based on the icing, so the icing is kind of the identity of the cupcake. So from one cupcake to another you are basically changing the icing. In our model, the identity of cell states is adjusted by mRNA regulation so mRNA regulation is basically contributing to the icing. However, there’s also the cake part. The cake part is often specifically adjusted to the icing on top of the cupcake. The cake part, analogous to ‘housekeeping genes’, also needs to change and this is mainly through the protein life cycle. I’m very biased because I don’t like the icing on cupcakes, just the cake part, and so in the same vein, I wanted to know more about how the protein lifecycle contributes to gene expression. I think people have focused too much on the icing.”

So, mRNA changes drive new cell state identity. Protein lifecycle regulation drives readjustment of preexisting “housekeeping genes” such as those encoding ribosomes and factors involved in metabolism to adjust the cell to its new state.

This approach is extensible to test the regulation of gene expression in other perturbed systems as well, and allows researchers for the first time to assess the relative contributions of each of the three levels of protein level regulation – mRNA expression, translation, and degradation – in any perturbed system.

Paper cited: Jovanovic, M et al. Dynamic profiling of the protein life cycle in response to pathogens. Science. Feb. 12, 2015. http://dx.doi.org:/10.1126/science.1259038

More Dynamic Protein Profiling

To Capture Fleeting Expressions, Go High-Throughput

  • One of the unexpected findings of the Human Genome Project was that human chromosomes contain only 20,000–25,000 protein-encoding genes, fewer than had been anticipated, …

Transitioning from Traditional Assay Formats to HTRF Technology 
Sensitivity of Fluorescence Coupled to Low Background of Time Resolution

Researchers are working on novel adaptations of HTRF-based assays, as well as their combination with other types of assays, to characterize complex disease pathways that may present multiple drug targets for disease therapy. [iStock/ponsulak]

  • At the 6th Cisbio HTRF symposium, “Charting the Course of Drug Discovery” held recently in Brewster, MA, investigators described how homogeneous time-resolved fluorescence (HTRF®) continues to expand and improve upon the repertoire of available bioassay formats for basic research and drug discovery. Participants described applications of these assays as integral components in studies ranging from identification of allosteric modulators as potential drugs to determination of critical components in protein-modifying biochemical pathways as new drug targets.

    A form of time-resolved fluorescence energy transfer (TR-FRET) technology, HTRF brings together the sensitivity of fluorescence with the homogeneous nature of FRET and the low background of time resolution. As in other FRET systems, HTRF uses two fluorophores, a donor and an acceptor, that transfer energy when in close proximity to each other. Excitation of the donor molecule by an energy source such as a laser causes the emission of light at donor-specific wavelengths.

    If the donor and acceptor are not within proximity to each other, the donor is excited but no energy transfer occurs and no acceptor emission occurs. Dual-wavelength detection reduces buffer and media interference, and the final signal is proportional to the extent of product formation.

    The HTRF assay can be miniaturized into 384- and 1536-well plate formats, which, proponents say, can save reagent costs and minimize quantities of limited target and compound material used in the assay. This assay technology has been applied to many antibody-based assays, including GPCR signaling (cAMP and IP-One), kinase, cytokine, biomarker, and bioprocess (antibody and protein production), as well as assays for protein-protein, protein-peptide, and protein-DNA/RNA interactions.

    Unlike traditional TR-FRET systems that employ fluorophores such as fluorescein and rhodamine that are characterized by immediate and transient emissions, HTRF-specific donors such as europium and terbium cryptate emit relatively long-lived fluorescence upon excitation. Conversely, acceptor molecules rapidly emit fluorescence.

    Thus, the nonspecific short-lived background fluorescence that occurs in FRET assays can be reduced by introducing a time delay ranging from 50-150 microseconds between the initial donor excitation and measurement. In HTRF, therefore, if the donor and acceptor molecules are not within proximity, only donor emissions are detected following a time delay.
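The effect of the measurement delay can be illustrated with simple exponential decay, I(t) = I0 · exp(−t/τ). The sketch below is not a vendor protocol: the lifetimes are assumed round numbers chosen only to contrast a nanosecond-scale background with a millisecond-scale donor, and the function names and count values are hypothetical.

```python
import math


def remaining_fraction(delay_us, lifetime_us):
    """Fraction of fluorescence left after `delay_us` microseconds,
    assuming single-exponential decay with lifetime `lifetime_us`."""
    if lifetime_us <= 0 or delay_us < 0:
        raise ValueError("lifetime must be positive and delay non-negative")
    return math.exp(-delay_us / lifetime_us)


def emission_ratio(acceptor_counts, donor_counts):
    """Dual-wavelength readout: acceptor signal normalized by donor signal,
    which cancels well-to-well differences in overall intensity."""
    if donor_counts <= 0:
        raise ValueError("donor counts must be positive")
    return acceptor_counts / donor_counts


# Assumed illustrative lifetimes (not taken from the article).
BACKGROUND_LIFETIME_US = 0.01   # short-lived background, about 10 ns
DONOR_LIFETIME_US = 1000.0      # long-lived cryptate donor, about 1 ms

delay_us = 50.0  # lower end of the 50-150 microsecond window in the text
background_left = remaining_fraction(delay_us, BACKGROUND_LIFETIME_US)
donor_left = remaining_fraction(delay_us, DONOR_LIFETIME_US)

# Hypothetical gated counts from one well.
ratio = emission_ratio(acceptor_counts=12000.0, donor_counts=48000.0)
```

Under these assumptions the background has decayed to essentially nothing by the time the detector opens, while most of the donor-driven signal is still present.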

    Participants at the symposium focused on novel adaptations of HTRF-based assays, as well as their combination with other types of assays, to characterize convoluted disease pathways that may present multiple drug targets for disease therapy, especially neurodegenerative disorders. In particular, several presenters noted its use in addressing what the conference keynote speaker, Terrance Kenakin, Ph.D., of the University of North Carolina, characterized as “The Perfect Storm” of pharmacology, receptor allostery, and biased signaling. Strictly defined, allosteric molecules regulate proteins by binding to the molecule at a site other than the protein’s active site.

    With regard to the seven transmembrane receptors (7TMRs) also known as G protein-coupled receptors, Dr. Kenakin noted that GPCRs comprise the largest class of receptors in the human genome and are common targets for therapeutics. Originally identified as mediators of 7TMR desensitization, β-arrestins (arrestin 2 and arrestin 3), for example, are now recognized as true adaptor proteins that transduce signals to multiple effector pathways. The introduction of molecular dynamics coupled with new assays, including HTRF, he said, opened new vistas for 7TMRs as therapeutic entities. Specifically, probe-dependent allosteric vectors oriented toward the cell cytosol provided fertile ground for new 7TMR drugs in the form of ligand-producing biased signaling.

    Discovering and Characterizing Allosteric Modulators  

    Positive and negative allosteric modulators (PAMs and NAMs) of GPCRs have emerged as a novel and highly desirable class of compounds, particularly in potential treatment for mental disorders, and for metabolic, neurodegenerative, and neuromuscular diseases. Advocates say they offer some distinct advantages over conventional competitive compounds, including the potential for fine-tuning of GPCR signaling and the promise to address formerly intractable targets.

    Introduced to the market in 2010 for the treatment of secondary hyperparathyroidism in adult patients with chronic kidney disease on dialysis, Cinacalcet, a PAM, activates the calcium-sensing receptor that functions as the principal regulator of parathyroid hormone secretion. Cinacalcet is the first clinically administered allosteric modulator acting on a GPCR, and provided a proof-of-concept for future development of allosteric modulators on other GPCR drug targets.

    Hayley Jones and Jeff Jerman, both of Medical Research Council Technology (MRCT) in the U.K., talked about the characterization of novel PAMs for the dopamine 1 receptor. Although preclinical and clinical data have validated this receptor as a target for drugs to improve cognitive impairment in schizophrenia, Jones noted that, to date, attempts to clinically develop agonists have failed.

    She and her colleagues have approached this problem by targeting D1R via PAMs, saying that, in contrast to “direct” orthosteric D1R agonists, PAMs potentially offer advantages, including physiological spatiotemporal control of dopamine function by enhancing the effect of its endogenous ligand and avoiding overstimulation through self-limiting effects.

    The investigators said they had configured a cell-based HTRF assay to screen a subset of an MRCT compound library using CHO cells that transiently express the human receptor. Inclusion of a submaximal concentration of dopamine in the assays facilitated simultaneous detection of both PAMs and agonists, allowing them to identify novel D1R activators.

    Michelle Arkin, Ph.D., of the University of California, San Francisco, focuses her research on developing small molecule modulators of allosterically regulated enzymes and protein complexes as potential drug leads. Neurodegenerative diseases such as Alzheimer’s and other “tauopathies” are characterized by formation of intracellular tangles comprised of aggregated tau proteins. Previous studies have shown that the protein acetyltransferase p300 acetylates tau at several sites, competing with ubiquitination and thereby inhibiting tau degradation.

    Dr. Arkin and colleagues developed a high-throughput screen using HTRF to identify p300 inhibitors, designing a suite of counter screens and secondary assays to validate hits. Based on previous findings that the protease caspase-6 clips tau at specific sites and that truncated tau forms are associated with disease progression, the investigators developed selective caspase-6 inhibitors.

    HTRF assays demonstrated, she said, that small molecule compounds inhibit caspase-6-mediated cleavage of tau in cell lysates. She concluded that the combination of HTRF enzymatic and biophysical assay formats allows characterization of inhibitors of proteins that may be involved in tauopathy progression.

  • Lack of Suitable Assays

    Martha Kimos, biochemist at the Lieber Institute, noted that the discovery of novel catechol-O-methyltransferase (COMT) inhibitors for use in the treatment of Parkinson’s disease has been limited due to lack of suitable assays for high-throughput screening. COMT inhibitors like entacapone and tolcapone prolong the action of levodopa by preventing its methylation by COMT.

    Kimos and her colleagues developed an HTRF assay involving an enzymatic step that uses membrane-bound human COMT as the enzyme source and an assay step that measures S-adenosyl-L-homocysteine (SAH) as an enzymatic reaction product. To directly measure SAH release, an anti-SAH antibody labeled with terbium cryptate and a SAH-d2 tracer were used. The SAH released by the enzymatic reaction competes with the labeled SAH-d2 tracer, leading to a decrease of the HTRF signal. The assay, the researchers said, showed good potency for tolcapone, with a high degree of translation between data in fluorescence ratio and data in terms of SAH produced, and was suitable for kinetic studies, including Km determination.

    At Pfizer USA, Richard Frisbee, a scientist in the hit discovery and lead profiling (HDLP) department, and colleagues have focused on the development of HTS whole blood assays using HTRF, particularly to monitor anti-inflammatory drug potency. They noted that traditional whole-blood formats such as ELISAs for detecting cytokines require multiple assay plate manipulations, including wash steps and incubation steps, have limited throughput, and are relatively time consuming.

    They reported that they had developed a sandwich immunoassay protocol that measures cytokine production in human whole blood in a 384-well format, describing key elements of the assay, including nanoliter spotting of test compounds, miniaturized blood/reagent transfer, and optimized assay incubations. Development of a relatively convenient assay to monitor compound potency in whole blood can facilitate, they said, the prediction of compound doses required for therapeutic efficacy.

    Inhibiting the enzyme γ-secretase, which converts amyloid precursor protein to β-amyloid, thus preventing its accumulation in the brain, has been a goal of drug developers.

    Most recently, Bristol-Myers Squibb elected not to advance its inhibitor candidate avagacestat into Phase III trials after disappointing Phase II results. BMS remains in the hunt for drugs to treat Alzheimer’s disease. Despite clinical failures of gamma secretase inhibitors from BMS and other companies, researchers continue to search for next-generation compounds they believe may succeed.

    At BMS, Dave Harden, Ph.D., principal scientist and team leader, biochemical screening in the leads discovery and optimization group, has developed novel assays to identify molecules that inhibit secretase by measuring multiple amyloid beta species in cell supernatant. He and his team have capitalized on the properties of terbium cryptate as a donor fluorophore in HTRF, which differ photophysically from those of the donor fluor europium. These properties afford the opportunity to measure more than one interaction within a well due to the multiple emission spectra observed upon excitation. It can therefore serve as a donor fluorophore to green-emitting fluors because it has multiple emission peaks including one at 490 nm as well as the typically used 665 nm (red) emission.

    Dr. Harden and colleagues, in order to “enhance” their screening practices by expanding well information content, enabled two color multiplexed HTRF in multiple settings in large (>1 MM well) screening campaigns. This approach, they reported, successfully identified mechanistically distinct gamma secretase inhibitors by measuring multiple amyloid beta peptide species in cell supernatants. This, and several other examples, the presenters said, demonstrated the power of multiplexed HTRF in maximizing screening outcomes.

    Across the board, meeting presenters demonstrated the flexibility of HTRF assays and their adaptability to multiple research settings. The scientists pointed out that the assays yielded values consistent with results obtained using less versatile and less convenient assay formats.


Proteins, Imaging and Therapeutics

Larry H Bernstein, MD, FCAP, Curator



Dissecting the Structure of Membrane Proteins

Kathy Liszewski

  • EM for Structural Analysis

Electron microscopy (EM) not only provides a straightforward approach to scrutinize the ultrastructure of cells and tissues, but it is also gaining momentum as a means to derive structural information on membrane proteins.

According to Bridget Carragher, Ph.D., co-director, Simons Electron Microscopy Center, New York Structural Biology Center, “EM is a widely applied technique to study the structure of proteins and membranes, but it is still less common than X-ray diffraction of prepared crystals. However, crystallization of membrane proteins has been particularly challenging. Since EM does not require obtaining crystals, it is becoming an increasingly used tool for performing structural analysis of membrane proteins and their complexes.”

As an example, Dr. Carragher described the use of single particle EM to directly visualize the conformational spectra of two homologous ATP-binding cassette (ABC) exporters. Single particle EM determines structure from multiple images of individual particles and uses methods like multivariate statistical analysis to separate heterogeneous particles into homogeneous classes.

“ABC transporters constitute a large family of membrane proteins that use the energy of ATP hydrolysis to translocate (either export or import) substances such as nutrients, lipids, and ions across the lipid bilayers,” said Dr. Carragher. “They are medically important because they also transport drugs and contribute to antibiotic or antifungal resistance.

“In a collaborative study, we utilized an unbiased approach employing newly developed amphiphiles in complex with lipids to create a membrane-mimicking environment for stabilizing membrane proteins. Visualization of the complexes using single particle EM analysis revealed striking conformational differences between the two transporters with respect to the effect of binding nucleotides and substrates. Overall, these studies provided a comprehensive view of the conformational flexibility of these two ABC exporters.”

As improvements continue to be made in the technology, resolution is nearing the 3 to 5 angstrom range, at least for some proteins and protein complexes.

“EM is becoming competitive with X-ray diffraction for solving some protein structures. It is not likely to replace other techniques, but rather will be complementary to them,” she added.

  • Bacterial Membrane Dynamics


Structural model of a re-engineered nanopore
[Lukas Tamm, Ph.D., University of Virginia]

 The outer membranes of Gram-negative bacteria, such as Pseudomonas and E. coli, consist of multiple proteins and densely packed lipopolysaccharides (LPS or endotoxin). This structure provides a formidable barrier to antibiotics, most of which are targeted to intracellular processes.

  • “Understanding outer membrane structure and how molecules are recognized and transported across the bacterial membrane are critical to creating more effective antibiotics,” noted Lukas K. Tamm, Ph.D., professor of molecular physiology and biological physics, University of Virginia.
  • The Tamm laboratory studies the dynamics of membrane proteins especially via solution NMR spectroscopy. His laboratory provided the first structure of the outer membrane ion channel of E. coli, OmpA. The group also studies OmpG, an outer membrane protein whose single polypeptide chain forms a membrane nanopore.
  • “Engineered protein nanopores have attracted interest to detect rare metal ions and neurotransmitters in solution, to sequence DNA and RNA, and to measure folding and unfolding kinetics of single proteins,” he explained. “We developed a new approach to loop immobilization that revealed cross-talk patterns between different loops of the OmpG nanopore. This will be useful to engineer new functions into OmpG and for analyzing other membrane nanopores.”
  • Dr. Tamm also studies the outer membrane protein H (OprH) from Pseudomonas aeruginosa, a multidrug resistant pathogen that is the most common cause of pneumonia and mortality in cystic fibrosis patients. It is the major cause of hospital-acquired infections.
  • “The impermeability of this pathogen’s outer membrane contributes substantially to its notorious antibiotic resistance. We utilized in vivo and in vitro assays that demonstrated the importance of the interaction of OprH with LPS in the outer membrane. Additionally, beyond determining the structure of OprH, our studies revealed that solution NMR can be a powerful tool for investigating interaction of integral membrane proteins with specific lipids. This cannot be easily done by crystallography.”
  • Dr. Tamm explained that there are many challenges remaining before antibiotic resistance can be overcome.
  • “The substrate is unknown for many of the outer membrane proteins. To develop better targeted antibiotics, it will be important to define specific substrates. Also, determining the structure of outer membrane proteins will likely also provide new insights for understanding how protein-lipid interactions contribute to antibiotic resistance. We aren’t there yet, but we are close to getting better answers.”
  • Membrane proteins, such as receptors, ion channels, and transporters, comprise nearly 30% of all proteins in eukaryotic cells. They also constitute more than 50% of all drug targets.

Yet, membrane proteins continue to present considerable challenges to the field of structural biology. Their surface is relatively hydrophobic, usually requiring potentially harmful detergent solubilization. Conformational flexibility and instability also may create roadblocks for the expression and purification required for structural analysis.

The recent Argonne National Laboratory Conference on Membrane Protein Structures highlighted advances in the field such as use of smaller and more intense beams for X-ray micro-crystallography, novel protein engineering of fusion proteins for structure determination, nanodiscs that mimic native cell environments, visualization strategies employing single particle electron microscopy, and bacterial nanopore studies that may help surmount antibiotic resistance.

  • X-Ray Micro-Crystallography


Schematic view of the planned upgrade of the GM/CA beamline 23-ID-D at the Advanced Photon Source (APS) at Argonne National Laboratory. Top panel: cartoon of the X-ray optics to focus the beam. Bottom panel: elevation view of the endstation focusing optics, sample goniometry, and detector. The beam line upgrade will reduce the minimum beam size from 5 µm to 1 µm in the near future. The proposed APS-MBA upgrade will allow the beam to be focused to <500 nm with a 100-fold increase in intensity. The small, high intensity X-ray beam will enable structure determination for some of the most challenging problems in structural biology.


  • Many physiological processes are controlled and regulated by conformational changes in GPCRs and other integral membrane proteins. “We are studying at the atomic level how allosteric changes in such proteins regulate cell signaling,” explained Daniel M. Rosenbaum, Ph.D., assistant professor, biophysics, biochemistry, University of Texas Southwestern Medical Center.
  • X-ray crystallography has been a workhorse technology for structural biologists for many years. Scientists generate a minute crystal by carefully optimizing conditions, shoot a high-powered X-ray beam at it, measure the angle and intensity of the diffracted beams, and derive a complete or partial structure by analyzing the results with sophisticated analytical programs.
  • “Membrane proteins are notoriously difficult to crystallize, and often yield very small, weakly diffracting, radiation-sensitive crystals that are intractable to large-beam crystallography. However, high-resolution structures can be obtained by using a micro-beam,” noted Robert F. Fischetti, Ph.D., associate division director and group leader, X-ray Science Division, Argonne National Laboratory.
  • Dr. Fischetti said the Advanced Photon Source (APS), a DOE user facility at Argonne, leads the field in deriving membrane protein structures.
  • “G-Protein Coupled Receptors (GPCRs) are one very important class of membrane proteins. There are more than 800 GPCRs, and over 40% of all drugs target them. Of the 30 known GPCR structures, 21 were solved at the APS.”
  • According to Dr. Fischetti, a number of key improvements and innovative approaches are needed.
  • “Stability of the beam intensity and the relative alignment of the beam and crystal are paramount in micro-crystallography. One problem is that X-ray beams cause both primary and secondary (diffusional) structural damage to the crystal. To overcome that issue, smaller, hotter beams and more rapid detectors are being used in the race against radiation damage.”
  • Dr. Fischetti said the field is also seeing the emergence of breakthrough techniques, including novel sample delivery systems such as the acoustic drop and microfluidic technologies. Further, throughput is advancing.
  • “We’re approaching the ability to perform data collection on many thousands of microcrystals complexed to a variety of compounds. This is enabling pharmaceutical applications.”
  • One of the most exciting changes at APS and throughout the scientific community is the development of a new storage ring magnet lattice design, the multibend achromat (MBA). The technology promises a revolutionary increase in brightness that could reach two to three orders of magnitude beyond the current capability.
  • According to Dr. Fischetti, “This fourth generation storage ring will be nearly diffraction-limited and provide key improvements such as focusing X-rays down to the nanometer level with much higher intensity than is currently available. We expect the proposed MBA to be available in the 2020s. With this and other advances, it is clear that we are entering a new frontier in X-ray science.”
  • Disease-Related Receptors

In particular, Dr. Rosenbaum and his laboratory use protein engineering, X-ray crystallography, and NMR spectroscopy to study the structure and dynamics of molecules involved in hormone signaling and lipid homeostasis.

“GPCRs and other membrane proteins are not easily amenable to structural studies,” he said. “This limitation can often be overcome by protein engineering methods such as creating fusion proteins or thermostable mutants and using lipid-mediated crystallization methods.”

For example, Dr. Rosenbaum and colleagues studied the human β2 adrenergic receptor (β2AR) that binds epinephrine and is involved in the fight or flight response. Using the inactive structure of β2AR as a guide, the team designed a β2AR agonist that could be covalently attached to a specific site on the receptor. “With this approach, we were able to crystallize a covalent agonist-bound β2AR fusion protein in lipid bilayers and determine its structure at 3.5 angstrom resolution.”

Another example of using fusion proteins to overcome membrane protein crystallization limitations is that of the human orexin receptor, OX2R. The orexin system modulates behaviors in mammals such as sleep, arousal, and feeding. Dysfunctions can lead to narcolepsy and cataplexy. The FDA recently approved the first-in-class drug, suvorexant, which became available in early 2015.

Dr. Rosenbaum and colleagues used lipid-mediated crystallization and protein engineering with a novel fusion chimera to solve the structure of OX2R bound to suvorexant at 2.5 angstrom resolution.

“Elucidation of the molecular architecture of the human OX2R enhances our knowledge of how it recognizes ligands. Such studies provide powerful tools for designing improved therapeutics that can activate or inactivate orexin signaling.”

These studies have an overarching significance as well. “Looking at the bigger picture, these methods may lead to the design of new classes of small molecules that modulate key signaling pathways by controlling protein conformational changes within cellular membranes,” Dr. Rosenbaum concluded.

  • Nanodisc Technology

Although membrane proteins can be purified following cell lysis and detergent solubilization or after expression in heterologous systems, their true structure and function can be significantly compromised or lost entirely in the process. Ideally one would like membrane proteins to remain in a solubilized state for easier purification, functional assays, and structure determination. However, the native membrane environment is often necessary for full functionality. Detergents pose many technical obstacles including being hazardous for protein stability and interfering in many assay techniques.

Enter Nanodisc technology, a new approach for providing accessibility to the world of membrane proteins.

“We’ve always had a dream of engineering a process that would not only incorporate any membrane protein into a soluble bilayer structure, but also one that would employ a self-assembly process that would be applicable to all individual membrane proteins regardless of their structure and topology,” explained Stephen G. Sligar, Ph.D., director of the School of Molecular and Cellular Biology, University of Illinois, Urbana Champaign.

“Recently, that dream became realized by the creation of Nanodisc technology. Nanodiscs are self-assembling nanoscale phospholipid bilayers that are stabilized using engineered membrane scaffold proteins. The Nanodiscs allow membrane proteins to remain soluble and thus closely mimic the native environment.”

There are many uses for the new technology according to Dr. Sligar. “Technological applications can take advantage of Nanodisc properties such as its small size, reduced light scattering, faster diffusion, enhanced stability, access to both sides of the bilayer and for surface attachment (e.g., surface plasmon resonance studies).”

Dr. Sligar and colleagues even demonstrated how to utilize the new technology for high throughput screening (HTS) assays.

“We wanted to identify antagonists that would interfere with the binding of membrane proteins to Alzheimer’s-associated amyloid β oligomers (AβOs), which are the neurotoxic ligands that instigate Alzheimer’s dementia. In collaboration with Professor William Klein and co-workers at Northwestern University, we created a solubilized membrane protein library (SMPL). This consisted of a complete set of membrane proteomes derived from biological tissue containing a heterogeneous mixture of individual proteins.

“Screening an extensive library of drug-like compounds and natural products yielded several ‘hits’, thus providing proof of concept for using SMPLs in HTS applications. An initial publication appeared recently in PLOS ONE.”

The results need to be confirmed in animal studies, Dr. Sligar noted. Overall, he is enthusiastic about the Nanodisc platform for uses that range from determination of structure/function to HTS applications.

“The unique properties of Nanodiscs make them ideal candidates to address important functional and structural questions involving membrane proteins in a more native environment.”


Twists and Turns in Protein Expression

In Early Drug Discovery it’s Often Unclear Which Recombinant Proteins Will Be Affected by Changing the Host Cell


  • When drug developers use different cell lines for manufacturing and preclinical research, they risk generating inconsistent results, that is, proteins with varying structures and functions. Then, confounded by variability, drug developers may lavish attention on irrelevant candidates and overlook promising candidates.

To avoid misleading themselves, drug developers must find ways to avoid or account for protein variants, which include post-translational modifications, particularly alternative glycosylations. An extensive body of literature documents that such variants occur all too frequently among different host cell lines.

“Variability is most evident when comparisons are made between mammalian and nonmammalian cells,” says James Brady, Ph.D., vice president of technical applications and customer support at MaxCyte. “But depending on the protein that is being produced, even different mammalian cell lines, such as HEK and CHO, will exhibit substantial differences in post-translational modifications.” Differences can lead to altered protein stability, activity, or in vivo half-life.

It is often unclear during the early drug discovery process which recombinant proteins will be affected by changing the host cell. However, misleading early-stage data are associated with significant costs and extended timelines. It therefore makes sense to adopt a single host cell for all stages of the development pipeline. That is the rationale behind MaxCyte’s flow electroporation transfection platform.

  • Large-Scale Electroporation

Chemical transfection methods based on lipids or polymers are the most common alternatives to electroporation for large-scale transient transfection. However, reagent costs, lot-to-lot reagent variability, scale-up difficulties, and low transfection efficiency with certain cell types often are significant challenges of chemical transfection, particularly in biomanufacturing-relevant cells such as CHO.

Viral transfection vectors are another possibility. “While viral vectors may be more effective than chemical methods for introducing genes into certain difficult-to-transfect cell types, producing viral vectors often requires the development of packaging or producer cell lines,” Dr. Brady explains. “There are also biosafety concerns associated with some viral vectors.”

Unlike stable transfection, transient gene expression does not involve integration of the transgene into the host chromosome. Therefore, influences of the integration site on protein expression levels or other protein attributes are not evident. Rather the host cell’s genetic background, media/feed formulation, and culture conditions are the most significant factors influencing product quality, regardless of whether the protein is produced by stable or transient expression.

While high-end titers for stably transfected cells are now advancing into the low double-digit grams per liter, average titers are still in the lower single digits. Thus, the titers of 2–3 g/L that have recently been reported for transient expression via flow electroporation in nonengineered CHO cells are beginning to rival those of stable cell lines.

“So far, upper limits to titer by stable or transient expression have not been reached,” Dr. Brady tells GEN. “It is likely that innovations in vector design, advances in cell-line engineering, and improvements to cell-culture processes will lead to continued advances in both stable and transient titers.”

  • Monitoring Expression
  • Analytical methods are crucial for quantifying not only protein expression but also quality. A group at Fujifilm Diosynth Biotechnologies led by Greg Adams, Ph.D., the company’s director of analytical development, is promoting analytical techniques applicable throughout a molecule’s life cycle.

A scientist at Fujifilm Diosynth Biotechnologies operating an ambr250 mini-bioreactor system from Sartorius Stedim Biotech business unit TAP Biosystems.


  • Depending on the expression system, the Fujifilm Diosynth team focuses mostly on aggregation, glycosylation, and heterogeneity. The team employs a mix of rapid and conventional analyses, for example, mass spectrometry, ultra-performance liquid chromatography (UPLC), glycan analysis with rapid 2-aminobenzamide (2-AB) labeling and normal-phase UPLC, and capillary electrophoresis (CE) techniques such as imaged CE (iCE) and the CE-sodium dodecyl sulfate (CE-SDS) method. “Our objective,” declares Dr. Adams, “is same-day quality attribute analysis for understanding what’s happening in a bioreactor while designing the upstream process.”
  • Note that all the aforementioned techniques are standard analysis methods. The novelty is the context in which Fujifilm Diosynth uses them. Another distinction is the company’s high-throughput approach. The company uses liquid-handling workstations with pre-loaded tips for culture purification over protein A. The 30–60-minute preparation provides purified, active, concentrated antibody that may be analyzed in a number of ways. “We are able to analyze multiple ambr™ minireactor or 2 L bioreactor samples in hours versus days,” asserts Dr. Adams.
  • When it is applied to cell-line development, the rapid analysis philosophy holds that the same methods should be used from early development through GMP manufacturing. In practice, this is easier with antibodies because molecules of this class lend themselves to affinity purification and rapid method optimization through design of experiment (DOE), potentially beginning with transfectant pool material.
  • “Hopefully, we can have a method that we don’t have to change for the lifetime of the program,” Dr. Adams says. “It certainly helps to be able to trace data back through clinical phases and not have to worry about chromatographic profile and column changes. This has been very successful in several programs using the newer techniques, where the development phase is assisted by the speed by which you can run each method.”
  • The next challenge is to transfer this methodology to products expressed in microbial fermentation, which Dr. Adams refers to as the “next generation” of this approach to analytics.
  • Improving Solubility

Escherichia coli became the workhorse of recombinant protein expression because of its simple genetics, ease of culturing, scalability, rapid expression, and prodigious productivity. Negatives include a lack of eukaryotic post-translational machinery, codon usage bias, and difficulty with high-molecular-weight proteins.

Pros and cons must be weighed in terms of the target protein’s intended use. Quality and purity requirements for research-only proteins vary significantly, and may be worlds apart from therapeutic proteins. “The end application dictates to a large degree the choice of expression host, purity requirements, how you design the construct, and which tags to use,” says Keshav Vasanthavada, marketing specialist at GenScript.

A disadvantage in E. coli on par with low expression is insoluble expression, which results in aggregates (inclusion bodies). Researchers can deal with this phenomenon at the process level or molecular level. But before they embark on an improvement project, they should, Vasanthavada advises, check the literature to see if other researchers have produced the target protein in adequate yield and at acceptable quality. If so, it would be worthwhile to look at the other researchers’ methods and see if they can be reproduced.

Process-level strategies, which do not require target reengineering, include changing expression conditions, in vitro protein refolding, switching E. coli strains, adjusting media and buffers, or incorporating chaperone co-expression. Molecular-level approaches involve eliminating undesirable elements through truncations or mutations.

“The easiest approach is adoption of a fusion partner-based strategy,” Vasanthavada tells GEN. “It involves the use of a solubilizing partner upstream of the target protein to enhance target protein solubility.”

While this approach is generally beneficial, it has its drawbacks. For example, while a fusion partner will solubilize the target protein, there is no guarantee that the target protein will remain in solution once the tag is cleaved off. “Sometimes, you cannot ‘cleave off’ the fusion partner. The proteolytic enzyme won’t reach the cleavage site because of interference from itself,” Vasanthavada explains. “On other occasions, your fusion partner will start sticking to your target protein post-cleavage.”


Riboswitch Flip Kills Bacteria

Scientists discover a novel antibacterial molecule that targets a vital RNA regulatory element.

By Ruth Williams | September 30, 2015



Part of a riboswitch


Researchers at the pharmaceutical company Merck have identified a new bacteria-killing compound with an unusual target—an RNA regulatory structure called a riboswitch. The team used its drug, described in Nature today (September 30), to successfully reduce an experimental bacterial infection in mice, suggesting that the molecule could lead to the creation of a new antibiotic. Moreover, the results indicate that riboswitches—and other RNA elements—might be hitherto unappreciated targets for antibiotics and other drugs.

“Finding an antibiotic with a new target . . . has always been one of the holy grails of antibiotics discovery,” said RNA biochemist Thomas Hermann of the University of California, San Diego, who was not involved in the work. “It looks like that’s what the Merck group has now accomplished.”

The team’s research began with the idea of finding a compound that blocks the bacterial riboflavin synthesis pathway. Riboflavin is an essential nutrient for humans and bacteria alike, but while humans must consume it as part of their diet, bacteria can either scavenge riboflavin from the environment or, if supplies are lacking, make their own. “We targeted the riboflavin pathway because it is specific to bacteria so you have a built in safety margin,” said John Howe of the Merck research laboratories in Kenilworth, New Jersey, who led the research.

The team devised a simple but “very smart phenotypic screen,” said Hermann. The researchers tested roughly 57,000 antibacterial synthetic small molecules on cultures of E. coli bacteria looking for ones whose killing ability was neutralized by the presence of riboflavin. “If the effect of that antibacterial was suppressed by riboflavin,” said Howe, “then we had a good chance that the small molecule . . . was targeting the riboflavin pathway.”
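The selection logic of such a riboflavin-suppression screen can be sketched as a simple filter. This is an illustrative sketch only, not Merck's actual analysis: the growth values, the two thresholds, and the compound names are hypothetical.

```python
def riboflavin_suppressed_hits(readings, kill_below=0.2, rescue_above=0.8):
    """Return compounds that kill without riboflavin but not with it.

    readings: dict mapping compound name to a pair
    (growth_without_riboflavin, growth_with_riboflavin), each normalized
    to an untreated control (1.0 = full growth).
    The thresholds are hypothetical cutoffs chosen for illustration.
    """
    hits = []
    for name, (no_rf, with_rf) in readings.items():
        # A hit inhibits growth on its own, and riboflavin reverses that effect.
        if no_rf <= kill_below and with_rf >= rescue_above:
            hits.append(name)
    return sorted(hits)


# Hypothetical readings for three made-up compounds.
EXAMPLE = {
    "cmpd_A": (0.05, 0.95),  # killing reversed by riboflavin: candidate hit
    "cmpd_B": (0.04, 0.06),  # kills either way: likely a different target
    "cmpd_C": (0.97, 0.99),  # not antibacterial under these conditions
}

if __name__ == "__main__":
    print(riboflavin_suppressed_hits(EXAMPLE))
```

Only the compound whose antibacterial effect disappears when riboflavin is supplied passes the filter, which is the pattern Howe describes as pointing to the riboflavin pathway.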

The team found one molecule that fit the criteria and called it ribocil. To investigate the molecule’s mechanism of action, they applied it to cultures of E. coli cells until colonies emerged that were resistant to its effect. The researchers then sequenced the whole genomes of each of the resistant bacterial strains to find which genes were mutated.

The majority of drugs target proteins, explained Howe, “so we assumed that the mutations would be in one of the enzymes in the riboflavin synthesis pathway.” But as it turned out, while all of the 19 resistant strains did have mutations in a gene called RibB (which produces one of the riboflavin synthesis enzymes), the mutations did not affect the protein itself. They altered a non-coding part of the messenger RNA transcript: the riboswitch.

Riboswitches are regulatory elements at the beginning of messenger RNA transcripts. They bind molecules—normally metabolites—that typically suppress the transcript’s expression. “So instead of regulating the enzyme itself, [ribocil] is regulating the production of the enzyme,” Howe said.

Indeed, through reporter assays and crystallization experiments, the team confirmed that ribocil directly interacted with the RibB riboswitch, preventing expression of the protein.

“Ninety-nine-point-nine percent of drug targets are proteins,” said Hermann, “but they were prepared for the 0.1 percent outcome, and I think that’s what I really liked about this work.”

The team went on to tweak ribocil’s chemical structure, improving its killing efficiency and prolonging its effectiveness inside the body. The researchers then showed that this enhanced version of ribocil could effectively reduce bacterial burden in mice infected with a weakened E. coli strain whose cells are unable to efficiently expel drugs.

Weakened E. coli were used because wild-type E. coli are adept at ejecting ribocil and other compounds before they can take effect. Finding a way to keep ribocil in the bacteria and making other improvements will be necessary before it can be used as an actual antibiotic, explained Howe.

“I’ve [got] no idea if ribocil will end up being a drug candidate,” biochemist Gerry Wright of McMaster University in Ontario, Canada, wrote in an email to The Scientist, “but the work is a proof of principle, which is very important, and it makes us look to new areas of biology as targets for antibiotics.”

J.A. Howe et al., “Selective small-molecule inhibition of an RNA structural element,” Nature, doi: 10.1038/nature15542, 2015.




Assay Drug Dev Technol. 2015 Sep;13(7):402-14. doi: 10.1089/adt.2015.655.

High-Content Assays for Characterizing the Viability and Morphology of 3D Cancer Spheroid Cultures.

Sirenko O, Mitlo T, Hesley J, Luke S, Owens W, Cromwell EF.



There is an increasing interest in using three-dimensional (3D) spheroids for modeling cancer and tissue biology to accelerate translational research. Development of higher throughput assays to quantify phenotypic changes in spheroids is an active area of investigation. The goal of this study was to develop higher throughput high-content imaging and analysis methods to characterize phenotypic changes in human cancer spheroids in response to compound treatment. We optimized spheroid cell culture protocols using low adhesion U-bottom 96- and 384-well plates for three common cancer cell lines and improved the workflow with a one-step staining procedure that reduces assay time and minimizes variability. We streamlined imaging acquisition by using a maximum projection algorithm that combines cellular information from multiple slices through a 3D object into a single image, enabling efficient comparison of different spheroid phenotypes. A custom image analysis method was implemented to provide multiparametric characterization of single-cell and spheroid phenotypes. We report a number of readouts, including quantification of marker-specific cell numbers, measurement of cell viability and apoptosis, and characterization of spheroid size and shape. Assay performance was assessed using established anticancer cytostatic and cytotoxic drugs. We demonstrated concentration-response effects for different readouts and measured IC50 values, comparing 3D spheroid results to two-dimensional cell cultures. Finally, a library of 119 approved anticancer drugs was screened across a wide range of concentrations using HCT116 colon cancer spheroids. The proposed methods can increase performance and throughput of high-content assays for compound screening and evaluation of anticancer drugs with 3D cell models.
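
The maximum projection step described in the abstract can be illustrated with a minimal sketch. It assumes a z-stack stored as nested Python lists (slices, then rows, then columns) with made-up intensity values; the function name `max_projection` is hypothetical, and this is not the authors' imaging pipeline.

```python
def max_projection(stack):
    """Collapse a z-stack (a list of 2D slices) into a single 2D image.

    Each output pixel holds the brightest value found at that
    (row, column) position across all slices.
    """
    if not stack:
        raise ValueError("stack must contain at least one slice")
    rows, cols = len(stack[0]), len(stack[0][0])
    return [
        [max(slice_[r][c] for slice_ in stack) for c in range(cols)]
        for r in range(rows)
    ]


# Toy stack: 3 slices of 2x3 pixels (intensity values are invented).
z_stack = [
    [[0, 5, 1], [2, 0, 0]],
    [[3, 1, 1], [0, 9, 0]],
    [[0, 0, 7], [1, 1, 4]],
]
projection = max_projection(z_stack)
print(projection)  # [[3, 5, 7], [2, 9, 4]]
```

The projection trades depth information for a single image per spheroid, which is what makes a per-well comparison of phenotypes inexpensive.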


Molecules Hold the Mirror Up to Cancer

Imaging Technologies are Critical Tools for Basic Research and Translational and Clinical Applications


The Center for Biomedical Imaging in Oncology (CBIO) at the Dana-Farber Cancer Institute in Boston is a centralized cancer imaging research enterprise that was established to enable translational cancer research and drug development through the integration of preclinical and clinical imaging, access to preclinical/clinical multidisciplinary and multimodality imaging expertise, as well as drug/imaging probe development.

Cancer imaging: Dana-Farber CBIO organizational chart



  • The molecular processes behind cancer were once seen as through a glass, darkly. But now they are being reflected more clearly, thanks to advances in probe synthesis, preclinical cancer modeling, and multimodal imaging. These advances have positioned imaging as a key tool for basic research, as well as for translational and clinical applications.

To bring cancer visualization trends to light, CHI recently held a conference, Translational Imaging in Cancer Drug Development, as part of the World Preclinical Congress in Boston. This conference attracted leading imaging experts from industry and academia, including scientists and clinicians who use their expertise to accelerate cancer research. Many of the experts described how, with a little creativity, imaging modalities can be used to translate scientific discoveries into clinical applications.

Several examples of creative imaging from the conference are discussed in this article. To start, this article will highlight one investigator’s new take on a familiar technique, positron emission tomography (PET).

“Along with the scientific challenge posed by President Obama’s Precision Medicine Initiative, molecular imaging probes have substantially improved and expanded to include the noninvasive characterization of tumors and tumor microenvironments,” said Quang-Dé Nguyen, Ph.D., director of the Lurie Family Imaging Center (LFIC) at the Dana Farber Cancer Institute. “PET is becoming a method of choice for studying tumor biology in real time.”

LFIC is fully equipped to meet the creative demands of translational molecular imaging. It is an integral part of the Center for Biomedical Imaging in Oncology (CBIO), which also includes a clinical imaging research group. In addition to LFIC and CBIO, the Dana Farber Cancer Institute includes medicinal chemistry capabilities and expertise, and has recently established the Molecular Cancer Imaging Facility housing the only PET cyclotron in the state dedicated entirely to the development of novel radiotracers for cancer research.

“A unique attribute of our Cancer Center is the fully developed Mouse Hospital, mirroring every aspect of human cancer diagnostics and care,” noted Dr. Nguyen. The center uses genetically engineered mouse models that can be matched to the specific genotype of a given individual patient. Alternatively, the Center can rapidly generate xenograft mice and orthotopic murine tumor models using human tumor cells obtained from biopsies. In either case, the resulting mouse model is a faithful genetic mirror of the patient’s tumor.

Dr. Nguyen’s team deploys PET imaging to inform patient treatment in co-clinical trials. Once a patient’s genotype is identified, an appropriate mouse model is selected, sometimes in combination with additional mutations. The mouse is treated with a desired therapy, and functional and molecular outcomes can be rapidly detected by PET imaging. Mouse-derived data can then inform the design of the clinical trial and be fully integrated with clinical data.

In a seminal study, lung tumors carrying several combinations of cancer mutations were simultaneously tested in genetically engineered mouse models and in patients with lung cancer enrolled in a clinical trial to assess response to a combination therapy with a novel drug compared to standard of care. The radiolabeled glucose analog was used to visualize the lung tumors by PET in both mice and patients.

Remarkably, within 24 hours after therapy initiation, preclinical PET imaging demonstrated treatment response to the combined regimen for some but not all the mutations. This information helped identify the resistant mutation in patients being considered for enrollment in the clinical trial and allowed enrichment of the patient population by selecting patients carrying those mutations that had shown metabolic response in the preclinical setting.



Introduction to Protein Synthesis and Degradation

Curator: Larry H. Bernstein, MD, FCAP

Updated 8/31/2019



I placed this chapter after the one on signaling rather than before it, having written much of the content before reorganizing the material. The previous chapters on carbohydrate and lipid metabolism have already provided much material on proteins and protein function, which made a persuasive case for introducing signaling first. Signaling entails a substantial introduction to the conformational changes in proteins that direct the trafficking of metabolic pathways, and, more subtly, it uncovers an important role for microRNAs that is not divorced from transcription but is itself non-transcriptional. This is where the classic model of molecular biology lacked any integration with emerging metabolic concepts of regulation. Consequently, the science lacked an understanding of the ties among the convergence of multiple transcripts, the selective inhibition of transcription, the relative balance of aerobic and anaerobic metabolism, the weight of the pentose phosphate shunt, and the use of available energy sources for synthetic and catabolic adaptive responses.

The first subchapter serves to introduce the importance of transcription in translational science. The several subtitles that follow are intended to lay out the scope of the transcriptional activity, and also to direct attention toward the huge role of proteomics in the cell construct. As we have already seen, proteins engage with carbohydrates and with lipids in important structural and signaling processes. They are integral to the composition of the cytoskeleton, and also to the extracellular matrix. Many proteins are actually enzymes, carrying out the transformation of some substrate, a derivative of the food we ingest. They have a catalytic site, and they function with a cofactor – either a multivalent metal or a nucleotide.

The amino acids that go into protein synthesis include “indispensable” (essential) nutrients that the body cannot make and that must be obtained from the diet, chiefly from animal protein, although the need is partially satisfied by plant sources. There are 20 amino acids commonly found in proteins. They are classified into the following well-established groups based on the chemical and/or structural properties of their side chains:

  1. Aliphatic Amino Acids
  2. Cyclic Amino Acid
  3. AAs with Hydroxyl or Sulfur-containing side chains
  4. Aromatic Amino Acids
  5. Basic Amino Acids
  6. Acidic Amino Acids and their Amides

Examples include:

Alanine       aliphatic hydrophobic neutral
Arginine      polar hydrophilic charged (+)
Cysteine      polar hydrophobic neutral
Glutamine     polar hydrophilic neutral
Histidine     aromatic polar hydrophilic charged (+)
Lysine        polar hydrophilic charged (+)
Methionine    hydrophobic neutral
Serine        polar hydrophilic neutral
Tyrosine      aromatic polar hydrophobic

Transcribe and Translate a Gene

  1. For each RNA base there is a corresponding DNA base.
  2. Cells use the two-step process of transcription and translation to read each gene and produce the string of amino acids that makes up a protein.
  3. mRNA is produced in the nucleus and is transferred to the ribosome.
  4. mRNA uses uracil instead of thymine.
  5. The ribosome reads the RNA sequence and makes protein.
  6. Each amino acid is specified by one or more three-letter RNA codes (codons).
  7. The ribosome starts at AUG (start) and reads the sequence one codon, three letters, at a time.
  8. Stop codons are UAA, UAG and UGA.
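
The steps listed above can be sketched in a few lines of Python. This is a minimal illustration, assuming the input is the coding (sense) DNA strand, so that transcription reduces to swapping T for U, and using the standard genetic code; the function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order at each position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Read codons from the first AUG until a stop codon (UAA, UAG, UGA)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # "*" marks a stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("GGATGGCTTGTAAATAGCC")  # invented example sequence
print(mrna)             # GGAUGGCUUGUAAAUAGCC
print(translate(mrna))  # MACK
```

Here the ribosome analogue skips the leading GG, starts at AUG, reads GCU, UGU and AAA, and stops at UAG, giving Met-Ala-Cys-Lys in one-letter code.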


protein synthesis


What about the purine inosine?

Inosine triphosphate pyrophosphatase – Pyrophosphatase that hydrolyzes the non-canonical purine nucleotides inosine triphosphate (ITP), deoxyinosine triphosphate (dITP) as well as 2′-deoxy-N-6-hydroxylaminopurine triphosphate (dHAPTP) and xanthosine 5′-triphosphate (XTP) to their respective monophosphate derivatives. The enzyme does not distinguish between the deoxy- and ribose forms. Probably excludes non-canonical purines from RNA and DNA precursor pools, thus preventing their incorporation into RNA and DNA and avoiding chromosomal lesions.

Gastroenterology. 2011 Apr;140(4):1314-21. http://dx.doi.org/10.1053/j.gastro.2010.12.038. Epub 2011 Jan 1.

Inosine triphosphate protects against ribavirin-induced adenosine triphosphate loss by adenylosuccinate synthase function.

Hitomi Y, Cirulli ET, Fellay J, McHutchison JG, Thompson AJ, Gumbs CE, Shianna KV, Urban TJ, Goldstein DB.

Genetic variation of inosine triphosphatase (ITPA) causing an accumulation of inosine triphosphate (ITP) has been shown to protect patients against ribavirin (RBV)-induced anemia during treatment for chronic hepatitis C infection by genome-wide association study (GWAS). However, the biologic mechanism by which this occurs is unknown.

Although ITP is not used directly by human erythrocyte ATPase, it can be used for ATP biosynthesis via ADSS in place of guanosine triphosphate (GTP). With RBV challenge, erythrocyte ATP reduction was more severe in the wild-type ITPA genotype than in the hemolysis protective ITPA genotype. This difference also remains after inhibiting adenosine uptake using nitrobenzylmercaptopurine riboside (NBMPR).

ITP confers protection against RBV-induced ATP reduction by substituting for erythrocyte GTP, which is depleted by RBV, in the biosynthesis of ATP. Because patients with excess ITP appear largely protected against anemia, these results confirm that RBV-induced anemia is due primarily to the effect of the drug on GTP and consequently ATP levels in erythrocytes.

Ther Drug Monit. 2012 Aug;34(4):477-80. http://dx.doi.org/10.1097/FTD.0b013e31825c2703.

Determination of inosine triphosphate pyrophosphatase phenotype in human red blood cells using HPLC.

Citterio-Quentin A, Salvi JP, Boulieu R.

Thiopurine drugs, widely used in cancer chemotherapy, inflammatory bowel disease, and autoimmune hepatitis, are responsible for common adverse events. Only some of these may be explained by genetic polymorphism of thiopurine S-methyltransferase. Recent articles have reported that inosine triphosphate pyrophosphatase (ITPase) deficiency was associated with adverse drug reactions toward thiopurine drug therapy. Here, we report a weak anion exchange high-performance liquid chromatography method to determine ITPase activity in red blood cells and to investigate the relationship with the occurrence of adverse events during azathioprine therapy.

The chromatographic method reported allows the analysis of IMP, inosine diphosphate, and ITP in a single run in <12.5 minutes. The method was linear in the range 5-1500 μmole/L of IMP. Intraassay and interassay precisions were <5% for red blood cell lysates supplemented with 50, 500, and 1000 μmole/L IMP. Km and Vmax evaluated by Lineweaver-Burk plot were 677.4 μmole/L and 19.6 μmole·L⁻¹·min⁻¹, respectively. The frequency distribution of ITPase from 73 patients was investigated.

The method described is useful to determine the ITPase phenotype from patients on thiopurine therapy and to investigate the potential relation between ITPase deficiency and the occurrence of adverse events.
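
For readers unfamiliar with the Lineweaver-Burk evaluation mentioned in the abstract, the sketch below shows the idea: the Michaelis-Menten equation is linearized as 1/v = (Km/Vmax)(1/[S]) + 1/Vmax, and a straight-line fit of 1/v against 1/[S] yields both parameters. The substrate concentrations are hypothetical, and the rates are noise-free synthetic values generated from the reported Km and Vmax, not the authors' measurements.

```python
def michaelis_menten(s, km, vmax):
    """Reaction rate at substrate concentration s."""
    return vmax * s / (km + s)


def lineweaver_burk_fit(substrate, rates):
    """Estimate (Km, Vmax) from a least-squares line through (1/[S], 1/v)."""
    x = [1.0 / s for s in substrate]
    y = [1.0 / v for v in rates]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (
        sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        / sum((xi - mean_x) ** 2 for xi in x)
    )
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept  # y-intercept is 1/Vmax
    km = slope * vmax       # slope is Km/Vmax
    return km, vmax


# Hypothetical substrate concentrations (micromole/L); rates are synthetic.
substrate = [50.0, 100.0, 250.0, 500.0, 1000.0, 1500.0]
rates = [michaelis_menten(s, km=677.4, vmax=19.6) for s in substrate]

km, vmax = lineweaver_burk_fit(substrate, rates)
print(f"Km ~ {km:.1f}, Vmax ~ {vmax:.1f}")
```

With real, noisy data the double-reciprocal transform weights low-concentration points heavily, which is why nonlinear fitting is often preferred today; the linear form is shown here only because it is the method the abstract names.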


System wide analyses have underestimated protein abundances and the importance of transcription in mammals

Jingyi Jessica Li, Peter J Bickel and Mark D Biggin

PeerJ 2:e270; http://dx.doi.org/10.7717/peerj.270

Using individual measurements for 61 housekeeping proteins to rescale whole proteome data from Schwanhausser et al. (2011), we find that the median protein detected is expressed at 170,000 molecules per cell and that our corrected protein abundance estimates show a higher correlation with mRNA abundances than do the uncorrected protein data. In addition, we estimated the impact of further errors in mRNA and protein abundances using direct experimental measurements of these errors. The resulting analysis suggests that mRNA levels explain at least 56% of the differences in protein abundance for the 4,212 genes detected by Schwanhausser et al. (2011), though because one major source of error could not be estimated the true percent contribution should be higher. We also employed a second, independent strategy to determine the contribution of mRNA levels to protein expression. We show that the variance in translation rates directly measured by ribosome profiling is only 12% of that inferred by Schwanhausser et al. (2011), and that the measured and inferred translation rates correlate poorly (R² = 0.13). Based on this, our second strategy suggests that mRNA levels explain 81% of the variance in protein levels. We also determined the percent contributions of transcription, RNA degradation, translation and protein degradation to the variance in protein abundances using both of our strategies. While the magnitudes of the two estimates vary, they both suggest that transcription plays a more important role than the earlier studies implied and translation a much smaller role. Finally, the above estimates only apply to those genes whose mRNA and protein expression was detected. Based on a detailed analysis by Hebenstreit et al. (2012), we estimate that approximately 40% of genes in a given cell within a population express no mRNA. Since there can be no translation in the absence of mRNA, we argue that differences in translation rates can play no role in determining the expression levels for the 40% of genes that are non-expressed.
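
A statement such as "mRNA levels explain 56% of the differences in protein abundance" refers to a coefficient of determination (R²). The sketch below shows how such a number is obtained from paired, log-transformed abundances. The copy numbers are invented for illustration and are not taken from the study, and the error corrections the authors apply are not reproduced here.

```python
import math


def r_squared(x, y):
    """Squared Pearson correlation: the share of variance in y explained by x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Invented per-gene copy numbers per cell (not data from the paper).
mrna_copies = [2, 10, 40, 150, 600]
protein_copies = [9e3, 2e5, 1.2e5, 3e6, 8e6]

# Abundances span orders of magnitude, so the comparison is made on a log scale.
log_mrna = [math.log10(v) for v in mrna_copies]
log_protein = [math.log10(v) for v in protein_copies]

explained = r_squared(log_mrna, log_protein)
print(f"{explained:.0%} of the variance in log protein abundance is explained")
```

Measurement error in either variable lowers the observed R², which is the reason the authors argue that uncorrected data underestimate the contribution of mRNA levels.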


Related studies that reveal issues that are not part of this chapter:

  1. Ubiquitylation in relationship to tissue remodeling
  2. Post-translational modification of proteins
    1. Glycosylation
    2. Phosphorylation
    3. Methylation
    4. Nitrosylation
    5. Sulfation – sulfotransferases
      cell-matrix communication
    6. Acetylation and histone deacetylation (HDAC)
      Connecting Protein Phosphatase to 1α (PP1α)
      Acetylation complexes (such as CBP/p300 and PCAF)
      Rel/NF-kB Signal Transduction
      Homologous Recombination Pathway of Double-Strand DNA Repair
    7. Glycination
    8. cyclin dependent kinases (CDKs)
    9. lyase
    10. transferase


This year, the Lasker award for basic medical research went to Kazutoshi Mori (Kyoto University) and Peter Walter (University of California, San Francisco) for their “discoveries concerning the unfolded protein response (UPR) — an intracellular quality control system that detects harmful misfolded proteins in the endoplasmic reticulum and signals the nucleus to carry out corrective measures.”

About UPR: Approximately a third of cellular proteins pass through the Endoplasmic Reticulum (ER) which performs stringent quality control of these proteins. All proteins need to assume the proper 3-dimensional shape in order to function properly in the harsh cellular environment. Related to this is the fact that cells are under constant stress and have to make rapid, real time decisions about survival or death.

A major indicator of stress is the accumulation of unfolded proteins within the Endoplasmic Reticulum (ER), which triggers a transcriptional cascade in order to increase the folding capacity of the ER. If the metabolic burden is too great and homeostasis cannot be achieved, the response shifts from damage control to the induction of pro-apoptotic pathways that would ultimately cause cell death.

This response to unfolded proteins or the UPR is conserved among all eukaryotes, and dysfunction in this pathway underlies many human diseases, including Alzheimer’s, Parkinson’s, Diabetes and Cancer.


The discovery of a new class of human proteins with previously unidentified activities

In a landmark study conducted by scientists at the Scripps Research Institute, The Hong Kong University of Science and Technology, aTyr Pharma and their collaborators, a new class of human proteins has been discovered. These proteins [nearly 250], called Physiocrines, belong to the aminoacyl tRNA synthetase gene family and carry out novel, diverse and distinct biological functions.

The aminoacyl tRNA synthetase gene family codes for a group of 20 ubiquitous enzymes, almost all of which are part of the protein synthesis machinery. The team made this discovery using recombinant protein purification, deep sequencing, mass spectroscopy and cell-based assays. The finding is also significant because it highlights the alternate use of a gene family, whose protein products normally perform catalytic activities, for non-catalytic regulation of basic and complex physiological processes spanning metabolism, vascularization, stem cell biology and immunology.


Muscle maintenance and regeneration – key player identified

Muscle tissue suffers from atrophy with age and its regenerative capacity also declines over time. Most molecules discovered thus far to boost tissue regeneration are also implicated in cancers.  During a quest to find safer alternatives that can regenerate tissue, scientists reported that the hormone Oxytocin is required for proper muscle tissue regeneration and homeostasis and that its levels decline with age.

Oxytocin could be an alternative to hormone replacement therapy as a way to combat aging and other organ related degeneration.

Oxytocin is an age-specific circulating hormone that is necessary for muscle maintenance and regeneration (June 2014)


Proc Natl Acad Sci U S A. 2014 Sep 30;111(39):14289-94. http://dx.doi.org/10.1073/pnas.1407640111. Epub 2014 Sep 15.

Role of forkhead box protein A3 in age-associated metabolic decline.

Ma X, Xu L, Gavrilova O, Mueller E.

Aging is associated with increased adiposity and diminished thermogenesis, but the critical transcription factors influencing these metabolic changes late in life are poorly understood. We recently demonstrated that the winged helix factor forkhead box protein A3 (Foxa3) regulates the expansion of visceral adipose tissue in high-fat diet regimens; however, whether Foxa3 also contributes to the increase in adiposity and the decrease in brown fat activity observed during the normal aging process is currently unknown. Here we report that during aging, levels of Foxa3 are significantly and selectively up-regulated in brown and inguinal white fat depots, and that midage Foxa3-null mice have increased white fat browning and thermogenic capacity, decreased adipose tissue expansion, improved insulin sensitivity, and increased longevity. Foxa3 gain-of-function and loss-of-function studies in inguinal adipose depots demonstrated a cell-autonomous function for Foxa3 in white fat tissue browning. Furthermore, our analysis revealed that the mechanisms of Foxa3 modulation of brown fat gene programs involve the suppression of peroxisome proliferator activated receptor γ coactivator 1 α (PGC1α) levels through interference with cAMP responsive element binding protein 1-mediated transcriptional regulation of the PGC1α promoter.


Asymmetric mRNA localization contributes to fidelity and sensitivity of spatially localized systems

RJ Weatheritt, TJ Gibson & MM Babu
Nature Structural & Molecular Biology 24 Aug, 2014; 21: 833–839 http://dx.doi.org/10.1038/nsmb.2876

Although many proteins are localized after translation, asymmetric protein distribution is also achieved by translation after mRNA localization. Why are certain mRNA transported to a distal location and translated on-site? Here we undertake a systematic, genome-scale study of asymmetrically distributed protein and mRNA in mammalian cells. Our findings suggest that asymmetric protein distribution by mRNA localization enhances interaction fidelity and signaling sensitivity. Proteins synthesized at distal locations frequently contain intrinsically disordered segments. These regions are generally rich in assembly-promoting modules and are often regulated by post-translational modifications. Such proteins are tightly regulated but display distinct temporal dynamics upon stimulation with growth factors. Thus, proteins synthesized on-site may rapidly alter proteome composition and act as dynamically regulated scaffolds to promote the formation of reversible cellular assemblies. Our observations are consistent across multiple mammalian species, cell types and developmental stages, suggesting that localized translation is a recurring feature of cell signaling and regulation.


An overview of the potential advantages conferred by distal-site protein synthesis, inferred from our analysis.



Turquoise and red filled circle represents off-target and correct interaction partners, respectively. Wavy lines represent a disordered region within a distal site synthesis protein. Grey and red line in graphs represents profiles of t…



Tweaking transcriptional programming for high quality recombinant protein production

Since overexpression of recombinant proteins in E. coli often leads to the formation of inclusion bodies, producing properly folded, soluble proteins is undoubtedly the most important end goal in a protein expression campaign. Various approaches have been devised to bypass the insolubility issues during E. coli expression and in a recent report a group of researchers discuss reprogramming the E. coli proteostasis [protein homeostasis] network to achieve high yields of soluble, functional protein. The premise of their studies is that the basal E. coli proteostasis network is insufficient, and often unable, to fold overexpressed proteins, thus clogging the folding machinery.

By overexpressing a mutant, negative-feedback deficient heat shock transcription factor [σ32 I54N] before and during overexpression of the protein of interest, reprogramming can be achieved, resulting in high yields of soluble and functional recombinant target protein. The authors explain that this method is better than simply co-expressing/over-expressing chaperones, co-chaperones, foldases or other components of the proteostasis network because reprogramming readies the folding machinery and upregulates the essential folding components beforehand, thus maintaining the capacity of the folding machinery.

The Heat-Shock Response Transcriptional Program Enables High-Yield and High-Quality Recombinant Protein Production in Escherichia coli (July 2014)


Unfolded proteins collapse when exposed to heat and crowded environments

Proteins are important molecules in our body and they fulfil a broad range of functions. For instance as enzymes they help to release energy from food and as muscle proteins they assist with motion. As antibodies they are involved in immune defence and as hormone receptors in signal transduction in cells. Until only recently it was assumed that all proteins take on a clearly defined three-dimensional structure – i.e. they fold in order to be able to assume these functions. Surprisingly, it has been shown that many important proteins occur as unfolded coils. Researchers seek to establish how these disordered proteins are capable at all of assuming highly complex functions.

Ben Schuler’s research group from the Institute of Biochemistry of the University of Zurich has now established that an increase in temperature leads to unfolded proteins collapsing and becoming smaller. Other environmental factors can trigger the same effect.

Measurements using the “molecular ruler”

“The fact that unfolded proteins shrink at higher temperatures is an indication that cell water does indeed play an important role as to the spatial organisation eventually adopted by the molecules”, comments Schuler with regard to the impact of temperature on protein structure. For their studies the biophysicists use what is known as single-molecule spectroscopy. Small colour probes in the protein enable the observation of changes with an accuracy of better than one millionth of a millimetre. With this “molecular yardstick” it is possible to measure how molecular forces impact protein structure.

With computer simulations the researchers have mimicked the behaviour of disordered proteins.
(Courtesy of Jose EDS Roselino, PhD.)


MLKL compromises plasma membrane integrity

Necroptosis is implicated in many diseases and understanding this process is essential in the search for new therapies. While mixed lineage kinase domain-like (MLKL) protein has been known to be a critical component of necroptosis induction, how MLKL transduces the death signal was not clear. In a recent finding, scientists demonstrated that the full four-helical bundle domain (4HBD) in the N-terminal region of MLKL is required and sufficient to induce its oligomerization and trigger cell death.

They also found a patch of positively charged amino acids on the surface of the 4HBD that binds phosphatidylinositol phosphates (PIPs) and allows MLKL to be recruited to the plasma membrane. There, MLKL proteins form pores, through which cells absorb excess water and eventually burst. Detailed knowledge of how MLKL proteins create pores offers possibilities for developing new therapeutic interventions for tolerating or preventing cell death.

MLKL compromises plasma membrane integrity by binding to phosphatidylinositol phosphates (May 2014)


Mitochondrial and ER proteins implicated in dementia

Mitochondria and the endoplasmic reticulum (ER) form tight structural associations that facilitate a number of cellular functions. However, the molecular mechanisms of these interactions aren’t properly understood.

A group of researchers showed that the ER protein VAPB interacted with mitochondrial protein PTPIP51 to regulate ER-mitochondria associations and that TDP-43, a protein implicated in dementia, disturbs this interaction to regulate cellular Ca2+ homeostasis. These studies point to a new pathogenic mechanism for TDP-43 and may also provide a potential new target for the development of new treatments for devastating neurological conditions like dementia.

ER-mitochondria associations are regulated by the VAPB-PTPIP51 interaction and are disrupted by ALS/FTD-associated TDP-43. Nature (June 2014)


A novel strategy to improve membrane protein expression in Yeast

Membrane proteins play indispensable roles in the physiology of an organism. However, recombinant production of membrane proteins is one of the biggest hurdles facing protein biochemists today. A group of scientists in Belgium showed that enhanced expression of recombinant membrane proteins in yeast can be achieved by increasing intracellular membrane production through interference with a key enzymatic step of lipid synthesis.

Specifically, they engineered the oleotrophic yeast Yarrowia lipolytica by deleting the phosphatidic acid phosphatase gene, PAH1, which led to massive proliferation of endoplasmic reticulum (ER) membranes. For all 8 tested representatives of different integral membrane protein families, they obtained enhanced protein accumulation.


An unconventional method to boost recombinant protein levels

MazF is an mRNA interferase enzyme in E. coli that functions as a toxin and degrades cellular mRNA in a targeted fashion, at the “ACA” sequence. This degradation of cellular mRNA causes a precipitous drop in cellular protein synthesis. A group of scientists at the Robert Wood Johnson Medical School in New Jersey exploited the degeneracy of the genetic code to modify all “ACA” triplets within their gene of interest in a way that the corresponding amino acid (Threonine) remained unchanged. Consequently, induction of MazF toxin caused degradation of E. coli cellular mRNA but the recombinant gene transcription and protein synthesis continued, causing significant accumulation of high quality target protein. This expression system enables unparalleled signal to noise ratios that could dramatically simplify structural and functional studies of difficult-to-purify, biologically important proteins.
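
The recoding idea can be sketched as follows. This is an illustrative greedy search, not the published procedure: it assumes that an ACA site should be removed in any reading frame (an in-frame ACA is a threonine codon, while out-of-frame sites span two neighboring codons), and it swaps one overlapping codon at a time for a synonymous one. The function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order at each position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def protein_of(seq):
    """Translate an in-frame sequence codon by codon ('*' marks a stop)."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def aca_sites(seq):
    """Count ACA occurrences in every frame, including overlapping ones."""
    return sum(1 for i in range(len(seq) - 2) if seq[i:i + 3] == "ACA")


def synonyms(codon):
    """All other codons that encode the same amino acid."""
    aa = CODON_TABLE[codon]
    return [c for c, a in CODON_TABLE.items() if a == aa and c != codon]


def remove_aca(seq):
    """Greedily swap synonymous codons until no ACA remains in any frame."""
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    while (pos := "".join(codons).find("ACA")) != -1:
        current = aca_sites("".join(codons))
        fixed = False
        # Try every codon that overlaps the three bases of this site.
        for idx in range(pos // 3, (pos + 2) // 3 + 1):
            for alt in synonyms(codons[idx]):
                trial = codons[:idx] + [alt] + codons[idx + 1:]
                if aca_sites("".join(trial)) < current:
                    codons, fixed = trial, True
                    break
            if fixed:
                break
        if not fixed:
            raise ValueError(f"no synonymous swap removes the ACA at {pos}")
    return "".join(codons)


# Invented example with ACA in-frame and spanning two codon boundaries.
gene = "AUGACAAACAUGGCACAUUAA"
recoded = remove_aca(gene)
print(recoded, aca_sites(recoded), protein_of(recoded) == protein_of(gene))
```

Because each accepted swap strictly lowers the number of ACA sites, the loop always terminates; a production design would also weigh codon usage in the host, which this sketch ignores.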


Tandem fusions and bacterial strain evolution for enhanced functional membrane protein production

Membrane protein production remains a significant challenge for characterization and structure determination. Although a variety of host cell types are available, E. coli remains the popular choice for producing recombinant membrane proteins. A group of scientists in the Netherlands devised a robust strategy to increase the probability of functional membrane protein overexpression in E. coli.

By fusing Green Fluorescent Protein (GFP) and the Erythromycin Resistance protein (ErmC) to the C-terminus of a target membrane protein they were able to track the folding state of their target protein while using Erythromycin to select for increased expression. By increasing erythromycin concentration in the growth media and testing different membrane targets, they were able to identify four evolved E. coli strains, all of which carried a mutation in the hns gene, whose product is implicated in genome organization and transcriptional silencing. Through their experiments the group showed that partial removal of the transcriptional silencing mechanism was related to production of proteins that were essential for functional overexpression of membrane proteins.


The role of an anti-apoptotic factor in recombinant protein production

In a recent study, scientists at the Johns Hopkins University and Frederick National Laboratory for Cancer Research examined an alternative method of utilizing the benefits of anti-apoptotic gene expression to enhance the transient expression of biotherapeutics, specifically, through the co-transfection of Bcl-xL along with the product-coding target gene.

Chinese Hamster Ovary (CHO) cells were co-transfected with the product-coding gene and a vector containing Bcl-xL, using polyethylenimine (PEI) reagent. The cells co-transfected with Bcl-xL demonstrated reduced apoptosis, increased specific productivity, and an overall increase in product yield.

B-cell lymphoma-extra-large (Bcl-xL) is a mitochondrial transmembrane protein and a member of the Bcl-2 family of proteins, which are known to act as either pro- or anti-apoptotic proteins. Bcl-xL itself acts as an anti-apoptotic molecule by preventing the release of mitochondrial contents such as cytochrome c, which would lead to caspase activation. Higher levels of Bcl-xL push a cell toward survival mode by making the membrane pores less permeable and less leaky.

Introduction to Protein Synthesis and Degradation Updated 8/31/2019

N-Terminal Degradation of Proteins: The N-End Rule and N-degrons

In prokaryotes, and in the mitochondria and chloroplasts of eukaryotes, the ribosomal synthesis of proteins is initiated with an N-formyl methionine residue. In eukaryotic cytosolic ribosomes, however, the N-terminal methionine was assumed to be devoid of the N-formyl group. The unformylated N-terminal methionine residues of eukaryotic proteins are often N-acetylated (Ac), which creates specific degradation signals targeted by the Ac/N-end rule pathway. The N-end rule pathways are proteolytic systems that recognize these N-degrons, resulting in proteasomal degradation or autophagy. The first paper shows that N-formylated proteins also form in the eukaryotic cytosol, that their levels rise upon starvation for certain amino acids, and that their degradation depends on the Psh1 E3 ligase.

Two papers in the journal Science describe this N-degron in more detail.

Structured Abstract

In both bacteria and eukaryotic mitochondria and chloroplasts, the ribosomal synthesis of proteins is initiated with the N-terminal (Nt) formyl-methionine (fMet) residue. Nt-fMet is produced pretranslationally by formyltransferases, which use 10-formyltetrahydrofolate as a cosubstrate. By contrast, proteins synthesized by cytosolic ribosomes of eukaryotes were always presumed to bear unformylated N-terminal Met (Nt-Met). The unformylated Nt-Met residue of eukaryotic proteins is often cotranslationally Nt-acetylated, a modification that creates specific degradation signals, Ac/N-degrons, which are targeted by the Ac/N-end rule pathway. The N-end rule pathways are a set of proteolytic systems whose unifying feature is their ability to recognize proteins containing N-degrons, thereby causing the degradation of these proteins by the proteasome or autophagy in eukaryotes and by the proteasome-like ClpAP protease in bacteria. The main determinant of an N‑degron is a destabilizing Nt-residue of a protein. Studies over the past three decades have shown that all 20 amino acids of the genetic code can act, in cognate sequence contexts, as destabilizing Nt‑residues. The previously known eukaryotic N-end rule pathways are the Arg/N-end rule pathway, the Ac/N-end rule pathway, and the Pro/N-end rule pathway. Regulated degradation of proteins and their natural fragments by the N-end rule pathways has been shown to mediate a broad range of biological processes.


The chemical similarity of the formyl and acetyl groups and their identical locations in, respectively, Nt‑formylated and Nt-acetylated proteins led us to suggest, and later to show, that the Nt-fMet residues of nascent bacterial proteins can act as bacterial N-degrons, termed fMet/N-degrons. Here we wished to determine whether Nt-formylated proteins might also form in the cytosol of a eukaryote such as the yeast Saccharomyces cerevisiae and to determine the metabolic fates of Nt-formylated proteins if they could be produced outside mitochondria. Our approaches included molecular genetic techniques, mass spectrometric analyses of proteins’ N termini, and affinity-purified antibodies that selectively recognized Nt-formylated reporter proteins.


We discovered that the yeast formyltransferase Fmt1, which is imported from the cytosol into the mitochondria inner matrix, can generate Nt-formylated proteins in the cytosol, because the translocation of Fmt1 into mitochondria is not as efficacious, even under unstressful conditions, as had previously been assumed. We also found that Nt‑formylated proteins are greatly up-regulated in stationary phase or upon starvation for specific amino acids. The massive increase of Nt-formylated proteins strictly requires the Gcn2 kinase, which phosphorylates Fmt1 and mediates its retention in the cytosol. Notably, the ability of Gcn2 to retain a large fraction of Fmt1 in the cytosol of nutritionally stressed cells is confined to Fmt1, inasmuch as the Gcn2 kinase does not have such an effect, under the same conditions, on other examined nuclear DNA–encoded mitochondrial matrix proteins. The Gcn2-Fmt1 protein localization circuit is a previously unknown signal transduction pathway. A down-regulation of cytosolic Nt‑formylation was found to increase the sensitivity of cells to undernutrition stresses, to a prolonged cold stress, and to a toxic compound. We also discovered that the Nt-fMet residues of Nt‑formylated cytosolic proteins act as eukaryotic fMet/N-degrons and identified the Psh1 E3 ubiquitin ligase as the recognition component (fMet/N-recognin) of the previously unknown eukaryotic fMet/N-end rule pathway, which destroys Nt‑formylated proteins.


The Nt-formylation of proteins, a long-known pretranslational protein modification, is mediated by formyltransferases. Nt-formylation was thought to be confined to bacteria and bacteria-descended eukaryotic organelles but was found here to also occur at the start of translation by the cytosolic ribosomes of a eukaryote. The levels of Nt‑formylated eukaryotic proteins are greatly increased upon specific stresses, including undernutrition, and appear to be important for adaptation to these stresses. We also discovered that Nt-formylated cytosolic proteins are selectively destroyed by the eukaryotic fMet/N-end rule pathway, mediated by the Psh1 E3 ubiquitin ligase. This previously unknown proteolytic system is likely to be universal among eukaryotes, given strongly conserved mechanisms that mediate Nt‑formylation and degron recognition.

The eukaryotic fMet/N-end rule pathway.

(Top) Under undernutrition conditions, the Gcn2 kinase augments the cytosolic localization of the Fmt1 formyltransferase, and possibly also its enzymatic activity. Consequently, Fmt1 up-regulates the cytosolic fMet–tRNAi (initiator transfer RNA), and thereby increases the levels of cytosolic Nt-formylated proteins, which are required for the adaptation of cells to specific stressors. (Bottom) The Psh1 E3 ubiquitin ligase targets the N-terminal fMet-residues of eukaryotic cytosolic proteins, such as Cse4, Pgd1, and Rps22a, for the polyubiquitylation-mediated, proteasome-dependent degradation.



A glycine-specific N-degron pathway mediates the quality control of protein N-myristoylation. Richard T. Timms, Zhiqian Zhang, David Y. Rhee, J. Wade Harper, Itay Koren, Stephen J. Elledge

Science  05 Jul 2019: Vol. 365, Issue 6448

The second paper describes a glycine-specific N-degron pathway in humans. Specifically, the authors set up a screen to identify N-terminal degron motifs in human proteins. The findings included an expanded repertoire for the UBR E3 ligases, which now covers substrates with arginine or lysine following an intact initiator methionine, and the observation that a glycine at the extreme N-terminus is a potent degron.

Glycine N-degron regulation revealed

For more than 30 years, N-terminal sequences have been known to influence protein stability, but additional features of these N-end rule, or N-degron, pathways continue to be uncovered. Timms et al. used a global protein stability (GPS) technology to take a broader look at these pathways in human cells. Unexpectedly, glycine exposed at the N terminus could act as a potent degron; proteins bearing N-terminal glycine were targeted for proteasomal degradation by two Cullin-RING E3 ubiquitin ligases through the substrate adaptors ZYG11B and ZER1. This pathway may be important, for example, to degrade proteins that fail to localize properly to cellular membranes and to destroy protein fragments generated during cell death.

Science, this issue p. eaaw4912

Structured Abstract


The ubiquitin-proteasome system is the major route through which the cell achieves selective protein degradation. The E3 ubiquitin ligases are the major determinants of specificity in this system, which is thought to be achieved through their selective recognition of specific degron motifs in substrate proteins. However, our ability to identify these degrons and match them to their cognate E3 ligase remains a major challenge.


It has long been known that the stability of proteins is influenced by their N-terminal residue, and a large body of work over the past three decades has characterized a collection of N-end rule pathways that target proteins for degradation through N-terminal degron motifs. Recently, we developed Global Protein Stability (GPS)–peptidome technology and used it to delineate a suite of degrons that lie at the extreme C terminus of proteins. We adapted this approach to examine the stability of the human N terminome, allowing us to reevaluate our understanding of N-degron pathways in an unbiased manner.


Stability profiling of the human N terminome identified two major findings: an expanded repertoire for UBR family E3 ligases to include substrates that begin with arginine and lysine following an intact initiator methionine and, more notably, that glycine positioned at the extreme N terminus can act as a potent degron. We established human embryonic kidney 293T reporter cell lines in which unstable peptides that bear N-terminal glycine degrons were fused to green fluorescent protein, and we performed CRISPR screens to identify the degradative machinery involved. These screens identified two Cul2 Cullin-RING E3 ligase complexes, defined by the related substrate adaptors ZYG11B and ZER1, that act redundantly to target substrates bearing N-terminal glycine degrons for proteasomal degradation. Moreover, through the saturation mutagenesis of example substrates, we defined the composition of preferred N-terminal glycine degrons specifically recognized by ZYG11B and ZER1.

We found that preferred glycine degrons are depleted from the native N termini of metazoan proteomes, suggesting that proteins have evolved to avoid degradation through this pathway, but are strongly enriched at annotated caspase cleavage sites. Stability profiling of N-terminal peptides lying downstream of all known caspase cleavage sites confirmed that Cul2ZYG11B and Cul2ZER1 could make a substantial contribution to the removal of proteolytic cleavage products during apoptosis. Last, we identified a role for ZYG11B and ZER1 in the quality control of N-myristoylated proteins. N-myristoylation is an important posttranslational modification that occurs exclusively on N-terminal glycine. By profiling the stability of the human N-terminome in the absence of the N-myristoyltransferases NMT1 and NMT2, we found that a failure to undergo N-myristoylation exposes N-terminal glycine degrons that are otherwise obscured. Thus, conditional exposure of glycine degrons to ZYG11B and ZER1 permits the selective proteasomal degradation of aberrant proteins that have escaped N-terminal myristoylation.


These data demonstrate that an additional N-degron pathway centered on N-terminal glycine regulates the stability of metazoan proteomes. Cul2ZYG11B– and Cul2ZER1-mediated protein degradation through N-terminal glycine degrons may be particularly important in the clearance of proteolytic fragments generated by caspase cleavage during apoptosis and in the quality control of protein N-myristoylation.

The glycine N-degron pathway.

Stability profiling of the human N-terminome revealed that N-terminal glycine acts as a potent degron. CRISPR screening revealed two Cul2 complexes, defined by the related substrate adaptors ZYG11B and ZER1, that recognize N-terminal glycine degrons. This pathway may be particularly important for the degradation of caspase cleavage products during apoptosis and the removal of proteins that fail to undergo N-myristoylation.


