Archive for the ‘Statistical Methods for Research Evaluation’ Category

AKT Signaling Variable Effects

 

Reporter: Larry H Bernstein, MD, FCAP

 

Heterogeneous kinetics of AKT signaling in individual cells are accounted for by variable protein concentration

Meyer R, D’Alessandro LA, Kar S, Kramer B, She B, Kaschek D, et al.
Front. Physiol. 2012; 3:451. http://dx.doi.org/10.3389/fphys.2012.00451

In most solid cancers, cells harboring oncogenic mutations represent only a sub-fraction of the entire population. Within this sub-fraction the expression level of mutated proteins can vary significantly due to

  • cellular variability limiting the efficiency of targeted therapy.

To address the causes of the heterogeneity, we performed a systematic analysis of one of the most frequently mutated pathways in cancer cells, the phosphatidylinositol 3 kinase (PI3K) signaling pathway. Among others, PI3K signaling is activated by the hepatocyte growth factor (HGF), which regulates

  • proliferation of hepatocytes during liver regeneration but
  • also fosters tumor cell proliferation.

HGF-mediated responses of PI3K signaling were monitored both at the single cell and cell population level in primary mouse hepatocytes and in the hepatoma cell line Hepa1_6. Interestingly, we observed that the HGF-mediated AKT responses at the level of individual cells are rather heterogeneous. However, the overall average behavior of the single cells strongly resembled the dynamics of AKT activation

  • determined at the cell population level.

To gain insights into the molecular cause for the observed heterogeneous behavior of individual cells, we employed

  • dynamic mathematical modeling in a stochastic framework.

Our analysis demonstrated that intrinsic noise was not sufficient to explain the observed kinetic behavior, but rather

  • the importance of extrinsic noise has to be considered.

Thus, in contrast to gene expression, fluctuations of the reaction rates in the examined signaling pathway have only a minor impact, whereas

  • variability in the concentration of the various signaling components even in a clonal cell population is a key determinant for the kinetic behavior.
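
To make the distinction between the two kinds of noise concrete, here is a minimal sketch of extrinsic noise alone. It is not the authors' model: the one-equation activation scheme, the rate constants, the decaying stimulus and the log-normal spread of protein levels are all assumptions chosen for illustration. Each cell follows the same deterministic kinetics (so there is no intrinsic noise), and cells differ only in their kinase and phosphatase levels.

```python
import numpy as np

def simulate_cells(n_cells=200, cv=0.4, t_end=60.0, n_steps=1201, seed=0):
    """Toy model (not the authors'): dp/dt = k_on*K*S(t)*(1 - p) - k_off*P*p.

    p is the phosphorylated AKT fraction in one cell; K and P are that cell's
    relative kinase and phosphatase levels, the only source of variability.
    """
    rng = np.random.default_rng(seed)
    sigma = np.sqrt(np.log(1.0 + cv**2))   # log-normal shape for a given CV
    mu = -0.5 * sigma**2                   # makes the mean protein level equal to 1
    kinase = rng.lognormal(mu, sigma, n_cells)
    phosphatase = rng.lognormal(mu, sigma, n_cells)
    k_on, k_off, tau = 0.5, 0.1, 15.0      # hypothetical constants (1/min, 1/min, min)
    times = np.linspace(0.0, t_end, n_steps)
    dt = times[1] - times[0]
    p = np.zeros(n_cells)
    traces = np.empty((n_steps, n_cells))
    for i, t in enumerate(times):
        traces[i] = p
        stimulus = np.exp(-t / tau)        # decaying HGF-like input (assumed shape)
        # Deterministic Euler step: no reaction-level (intrinsic) noise is added.
        p = p + dt * (k_on * kinase * stimulus * (1.0 - p) - k_off * phosphatase * p)
    return times, traces
```

Calling `times, traces = simulate_cells()` gives one column per cell. In this toy setting the time of peak activation, `times[traces.argmax(axis=0)]`, differs from cell to cell, while `traces.mean(axis=1)` gives a single smooth curve playing the role of the population-level measurement.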

English: Structure of the HGF protein. Based on PyMOL rendering of PDB 1bht. (Photo credit: Wikipedia)

Virtual Biopsy – is it possible?

Author and Curator: Dror Nir, PhD

In a remark on my last post, New envelopment in measuring mechanical properties of tissue, Dr. Aviva Lev-Ari, PhD, RN, Director and Founder of our Open Access Online Scientific Journal, Leaders of Pharmaceutical Business Intelligence, asked whether OCT can be used to perform a biopsy. My answer to her question was “YES”. I thought it would be worthwhile to explain why I am so “optimistic” about this:

A conventional biopsy is a process in which a tissue sample is cut out of the body and, after being subjected to all kinds of chemical processes, a thin film of tissue is trimmed and read under the microscope by a trained pathologist. Can imaging provide histological assessment of a “thin film” of tissue without cutting it out of the body? The answer would be positive if the imaging yields a high-resolution reconstruction of a tissue sample identical in quality to a “live sample” put under the microscope.

I was happy to find support for my optimism regarding the feasibility of constructing such a device in the following article: Virtual skin biopsy by optical coherence tomography: the first quantitative imaging biomarker for scleroderma, published on February 20th 2013 in Ann Rheum Dis, doi:10.1136/annrheumdis-2012-202682

This article reports an original, first study to perform histological comparison and explore optical coherence tomography (“OCT”) as a potential imaging technique for the clinical assessment of patients presenting with systemic sclerosis (“SSc”). In their study the investigators used a device emitting a low-intensity infrared laser beam, capable of producing high-contrast images of skin up to 2 mm deep with resolutions of 4–10 μm.

[START ORIGINAL PAPER]

ABSTRACT

Background

Skin involvement is of major prognostic value in systemic sclerosis (SSc) and often the primary outcome in clinical trials. Nevertheless, an objective, validated biomarker of skin fibrosis is lacking. Optical coherence tomography (OCT) is an imaging technology providing high-contrast images with 4 μm resolution, comparable with microscopy (‘virtual biopsy’). The present study evaluated OCT to detect and quantify skin fibrosis in SSc.

Methods

We performed 458 OCT scans of hands and forearms on 21 SSc patients and 22 healthy controls. We compared the findings with histology from three skin biopsies and by correlation with clinical assessment of the skin. We calculated the optical density (OD) of the OCT images employing Matlab software and performed statistical analysis of the results, including intraobserver/ interobserver reliability, employing SPSS software.

Results

Comparison of OCT images with skin histology indicated a progressive loss of visualisation of the dermal–epidermal junction associated with dermal fibrosis. Furthermore, SSc affected skin showed a consistent decrease of OD in the papillary dermis, progressively worse in patients with worse modified Rodnan skin score (p<0.0001). Additionally, clinically unaffected skin was also distinguishable from healthy skin for its specific pattern of OD decrease in the reticular dermis (p<0.001). The technique showed an excellent intraobserver and interobserver reliability (intraclass correlation coefficient >0.8).

Conclusions

OCT of the skin could offer a feasible and reliable quantitative outcome measure in SSc. Studies determining OCT sensitivity to change over time and its role in defining skin vasculopathy may pave the way to defining OCT as a valuable imaging biomarker in SSc.
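
The reliability figure quoted in the abstract (intraclass correlation coefficient >0.8) can be illustrated with a short sketch. The abstract does not say which ICC form was computed in SPSS, so the choice below of ICC(2,1) in the Shrout and Fleiss notation (two-way random effects, absolute agreement, single rater) is an assumption, and the function name is made up for this example.

```python
import numpy as np

def icc_2_1(ratings):
    """ICC(2,1) from a table of shape (n_subjects, k_raters).

    Assumed form: two-way random effects, absolute agreement, single rater.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Two-way ANOVA sums of squares: subjects (rows), raters (columns), residual.
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
```

Each row would hold one scanned site and each column one observer (or one repeated reading by the same observer). A value near 1 means the readings differ far more between sites than between observers.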

Virtual skin biopsy by OCT

The OCT image acquisition allowed the reconstruction of a virtual skin biopsy measuring 4×0.4×2 mm. The main structure of the healthy skin was easily recognisable by OCT (figure 1).

Virtual biopsy of forearm skin by optical coherence tomography. Representative 3D reconstruction from the tomography of healthy and systemic sclerosis (SSc) (site modified Rodnan skin score=3) skin scans. The keratin of the skin appears as a white line on the surface (k). The epidermis (ED) is quite visible in the healthy skin by the contrast with the increased optical density of the papillary dermis (PD). The dermal–epidermal junction (DEJ) is quite visible in the healthy skin between the ED and PD. On the contrary, neither a clear distinction between ED and PD nor the DEJ is appreciable in the SSc skin. The vessels (*) are numerous and very well recognisable in healthy skin, whereas they appear less numerous and less distinct in the OCT image of SSc skin. Total depth of 3D reconstruction=1.2 mm. Scale bars are calculated by ImageJ.

Some quantitative results  – in images:

Validation of optical coherence tomography (OCT) images by histology. (A and B) H&E staining (A) and corresponding OCT scan (B) from a healthy control (HC). The green line is the mean A-scan of the entire OCT image (100 scans) overlaid by matching the scale bars of OCT and histology. The green arrow indicates the nadir of the valley in the mean A-scan, which corresponds to the dermal–epidermal junction clearly visible on both images. The green arrowhead indicates the second peak of the mean OCT A-Scan which corresponds by the overlay to the most superficial region of the papillary dermis. (C and D) H&E staining (C) and corresponding OCT scan (D) from a systemic sclerosis (SSc) patient (site modified Rodnan skin score =3). The red line is the mean A-scan of the OCT image, overlaid by matching the scale bars in the two panels. The red arrow indicates the nadir in the valley of the mean A-scan, which in this case does not correspond to the dermal–epidermal junction. The red arrowhead corresponds to the second peak in mean A-Scan. (E) Overlay of HC and SSc. Scale bar=240 μm.
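
The mean A-scan and its landmarks described in this caption (surface peak, nadir of the first valley, second peak) can be sketched in a few lines. This is only an illustration, not the investigators' Matlab analysis: it assumes the OCT image is a 2D array with depth along the rows and lateral position along the columns, and the smoothing window and function names are hypothetical.

```python
import numpy as np

def mean_a_scan(b_scan):
    """Average all A-scans (columns) of an image; rows are assumed to be depth."""
    return np.asarray(b_scan, dtype=float).mean(axis=1)

def smooth(profile, window=5):
    """Moving average with edge padding so the output keeps the input length."""
    half = window // 2
    padded = np.pad(profile, half, mode="edge")
    return np.convolve(padded, np.ones(window) / window, mode="valid")

def landmarks(profile, window=5):
    """Return depth indices of (surface peak, first valley, second peak)."""
    s = smooth(np.asarray(profile, dtype=float), window)
    slope = np.sign(np.diff(s))
    surface = int(np.argmax(s))  # brightest point, taken to be the skin surface
    # First local minimum below the surface: slope turns from falling to rising.
    valley = next(i for i in range(surface + 1, s.size - 1)
                  if slope[i - 1] < 0 and slope[i] > 0)
    # First local maximum below that valley: slope turns from rising to falling.
    second_peak = next(i for i in range(valley + 1, s.size - 1)
                       if slope[i - 1] > 0 and slope[i] < 0)
    return surface, valley, second_peak
```

With `profile = mean_a_scan(image)`, the call `landmarks(profile)` returns row indices that convert to depth through the scanner's axial pixel size. On a profile without a clear valley or second peak, `next` raises `StopIteration`, which is itself informative given that the caption describes these landmarks as displaced or blunted in SSc skin.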

Optical coherence tomography (OCT) of affected and not affected skin in plaque morphea. (A) OCT of not affected skin. Vertical scale represents depth in micrometres from the surface. The dermal–epidermal junction (DEJ) level is indicated by the white dotted line. Mean A-scan curve is overlaid and displayed in green. (B) OCT of affected skin in morphea patient. Mean A-scan curve is overlaid and displayed in red. Note the poorly visible DEJ and the valley of the curve below the DEJ (arrowhead). (C) Overlay of mean A-scan curves from the analysis of affected and unaffected skin in a morphea patient. Note that in the overlay graph the difference in depth of the first valley is clearly appreciable (arrowheads). Similarly, the second mean A-scan peak (arrow) is subtle in the affected skin, similar to scleroderma affected skin.

DISCUSSION

The current gold standard for semiquantitative assessment of skin fibrosis, the mRSS, suffers from several shortcomings, ranging from the subjectivity of skin palpation assessments to the high level of skill required from the clinical investigator. Even more importantly, a meta-analysis of three independent studies determined an overall within-patient interobserver SD of five units independently of the mean skin score,[6 21] which represents an SE ranging from 20% to 26%. A primary outcome measure with an SE of 25% entails the recruitment of a large number of patients to attain statistical validity in minimally significant changes, a task often difficult to accomplish given the comparatively low incidence of SSc.
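
The link between measurement error and trial size can be made concrete with the standard two-sample formula n = 2(z₁₋α/₂ + z_power)² σ² / δ² per arm. The sketch below is an illustration under strong assumptions, not a calculation from the paper: it treats the five-unit interobserver SD as the only source of variability, assumes a two-arm parallel design with a two-sided α of 0.05 and 80% power, and the minimal change δ passed in is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def patients_per_arm(sd, min_change, alpha=0.05, power=0.80):
    """Normal-approximation sample size per arm for comparing two means."""
    z = NormalDist().inv_cdf  # standard normal quantile function
    n = 2.0 * (z(1.0 - alpha / 2.0) + z(power)) ** 2 * sd**2 / min_change**2
    return ceil(n)
```

Because n scales with (σ/δ)², halving the measurement SD cuts the required number of patients roughly fourfold, which is why a more reproducible readout is attractive in a disease with few available patients.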

A robust imaging biomarker for the assessment of skin fibrosis in SSc has not previously been reported. Herein we report the first study aimed to validate OCT for the quantitative assessment of skin involvement in SSc.

To date, the limited data on surrogate outcome measures for skin involvement are largely composed of histopathological or molecular changes in affected skin.[22 23] Although conceptually very valuable, these studies, involving skin biopsies, are invasive and limited because of a site bias, referring to only one precise body area. Moreover, they are difficult to repeat in a longitudinal manner and showed no sensitivity to change over time.[24] In this study, we evaluated OCT skin scanning as a reliable and quantitative tool that could be used as a surrogate marker of skin fibrosis. The technique requires minimal operator training, less than 10 s per site examined, and offers the great advantage of saving image files for further or centralised operator-independent analysis. The latter is a particularly useful tool, limiting the ‘hands on’ time in the clinic office and allowing a centralised, blinded assessment of results in clinical trials.

We observed an excellent correlation of OCT mean A-Scan curves and mRSS score at the site of analysis. More importantly, the corroboration of our OCT findings with pathological changes at the DEJ provides a robust construct validity for the technique. Of interest, we found that the changes of the OD of the dermis in SSc are similar to the ones observed in a case of plaque morphea, corroborating even further the potential value of OCT in measuring skin fibrosis.

Additional Comment

HFUS (high-frequency ultrasound) has recently been suggested by several studies to offer a quantitative assessment of skin thickness in SSc.[8–10] In contrast with ultrasound, OCT does not require any gel, gives higher-resolution images, and its analysis algorithm is automatic, not involving any operator interpretation. Nevertheless, since the penetration of OCT is limited to the first millimetre of skin, OCT and HFUS may be explored as complementary imaging biomarkers in SSc.

REFERENCES

1. Jimenez SA, Derk CT. Following the molecular pathways toward an understanding of the pathogenesis of systemic sclerosis. Ann Intern Med 2004;140:37–50.

2. Varga J, Abraham D. Systemic sclerosis: a prototypic multisystem fibrotic disorder. J Clin Invest 2007;117:557–67.

3. Gabrielli A, Avvedimento EV, Krieg T. Scleroderma. N Engl J Med 2009;360:1989–2003.

4. Clements PJ, Hurwitz EL, Wong WK, et al. Skin thickness score as a predictor and correlate of outcome in systemic sclerosis: high-dose versus low-dose penicillamine trial. Arthritis Rheum 2000;43:2445–54.

5. Steen VD, Medsger TA Jr. Improvement in skin thickening in systemic sclerosis associated with improved survival. Arthritis Rheum 2001;44:2828–35.

6. Pope JE, Baron M, Bellamy N, et al. Variability of skin scores and clinical measurements in scleroderma. J Rheumatol 1995;22:1271–6.

7. Clements PJ, Lachenbruch PA, Seibold JR, et al. Skin thickness score in systemic sclerosis: an assessment of interobserver variability in 3 independent studies. J Rheumatol 1993;20:1892–6.

8. Akesson A, Hesselstrand R, Scheja A, et al. Longitudinal development of skin involvement and reliability of high frequency ultrasound in systemic sclerosis. Ann Rheum Dis 2004;63:791–6.

9. Moore TL, Lunt M, McManus B, et al. Seventeen-point dermal ultrasound scoring system—a reliable measure of skin thickness in patients with systemic sclerosis. Rheumatology (Oxford) 2003;42:1559–63.

10. Kaloudi O, Bandinelli F, Filippucci E, et al. High frequency ultrasound measurement of digital dermal thickness in systemic sclerosis. Ann Rheum Dis 2010;69:1140–3.

11. Aden N, Shiwen X, Aden D, et al. Proteomic analysis of scleroderma lesional skin reveals activated wound healing phenotype of epidermal cell layer. Rheumatology (Oxford) 2008;47:1754–60.

12. Aden N, Nuttall A, Shiwen X, et al. Epithelial Cells Promote Fibroblast Activation via IL-1alpha in Systemic Sclerosis. J Invest Dermatol 2010;130:2191–200.

13. Gambichler T, Jaedicke V, Terras S. Optical coherence tomography in dermatology: technical and clinical aspects. Arch Dermatol Res 2011;303:457–73.

14. Marschall S, Sander B, Mogensen M, et al. Optical coherence tomography-current technology and applications in clinical and biomedical research. Anal Bioanal Chem 2011;400:2699–720.

15. Coleman AJ, Richardson TJ, Orchard G, et al. Histological correlates of optical coherence tomography in non-melanoma skin cancer. Skin Res Technol 2013;19:e10–9.

16. Preliminary criteria for the classification of systemic sclerosis (scleroderma). Subcommittee for scleroderma criteria of the American Rheumatism Association Diagnostic and Therapeutic Criteria Committee. Arthritis Rheum 1980;23:581–90.

17. Collins TJ. ImageJ for microscopy. Biotechniques 2007;43:25–30.

18. Clendenon JL, Phillips CL, Sandoval RM, et al. Voxx: a PC-based, near real-time volume rendering system for biological microscopy. Am J Physiol Cell Physiol 2002;282:C213–18.

19. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986;1:307–10.

20. LeRoy EC, Black C, Fleischmajer R, et al. Scleroderma (systemic sclerosis): classification, subsets and pathogenesis. J Rheumatol 1988;15:202–5.

21. Merkel PA, Silliman NP, Clements PJ, et al. Patterns and predictors of change in outcome measures in clinical trials in scleroderma: an individual patient meta-analysis of 629 subjects with diffuse cutaneous systemic sclerosis. Arthritis Rheum 2012;64:3420–9.

22. Farina G, Lafyatis D, Lemaire R, et al. A four-gene biomarker predicts skin disease in patients with diffuse cutaneous systemic sclerosis. Arthritis Rheum 2010;62:580–8.

23. Milano A, Pendergrass SA, Sargent JL, et al. Molecular subsets in the gene expression signatures of scleroderma skin. PLoS One 2008;3:e2696.

24. Pendergrass SA, Lemaire R, Francis IP, et al. Intrinsic gene expression subsets of diffuse cutaneou

Modeling Targeted Therapy

Reporter: Larry H. Bernstein, MD, FCAP
pharmaceuticalintelligence.com/2013/03/02/modeling-targeted-therapy/

Some Perspectives on Network Modeling in Therapeutic Target Prediction
R Albert, B DasGupta and N Mobasheri
Biomedical Engineering and Computational Biology Insights 2013; 5: 17–24    http://dx.doi.org/BECBI/Albert_DasGupta_ Mobasheri
Key steps of a typical therapeutic target identification problem include synthesizing or inferring the complex network of interactions relevant to the disease, connecting this network to the disease-specific behavior, and predicting which components are key mediators of the behavior.
http://www.la-press.com/Some_Perspectives_on_Network_Modeling_in_Therapeutic_Target_Prediction/
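
The three steps listed above (build the network, connect it to the disease behavior, predict the key mediators) can be sketched with a toy Boolean network. This is one common formalism, used here only as an assumption about how such a prediction might look, not as the method of the cited paper. The wiring below is invented for illustration and is not a validated pathway model; the node names are borrowed from signaling vocabulary purely for readability.

```python
# Step 1: a hypothetical interaction network, written as Boolean update rules.
RULES = {
    "GF": lambda s: True,                          # growth-factor input held on
    "PI3K": lambda s: s["GF"],
    "PTEN": lambda s: False,                       # assumed lost in the disease state
    "AKT": lambda s: s["PI3K"] and not s["PTEN"],
    "STAT": lambda s: s["GF"],                     # side branch with no effect on output
    "Proliferation": lambda s: s["AKT"],           # Step 2: the disease phenotype
}

def steady_state(rules, knocked_out=(), max_steps=50):
    """Synchronously update all nodes until the state stops changing."""
    state = {node: False for node in rules}
    for _ in range(max_steps):
        new_state = {
            node: False if node in knocked_out else bool(rule(state))
            for node, rule in rules.items()
        }
        if new_state == state:
            return state
        state = new_state
    raise RuntimeError("no fixed point reached (the network may oscillate)")

def key_mediators(rules, phenotype):
    """Step 3: nodes whose knockout switches the phenotype off."""
    if not steady_state(rules)[phenotype]:
        return []
    return [
        node for node in rules
        if node != phenotype and not steady_state(rules, {node})[phenotype]
    ]
```

In this toy wiring, `key_mediators(RULES, "Proliferation")` returns the nodes on the single path from input to phenotype and leaves out the side branch, which is the kind of ranking a target-prediction model is meant to provide on a realistic network.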

Journal of Computational Biology (Photo credit: Wikipedia)

Reporter: Aviva Lev-Ari, PhD, RN

Genetic Basis of Complex Human Diseases: Dan Koboldt’s Advice to Next-Generation Sequencing Neophytes

Word Cloud by Daniel Menzin

UPDATED 3/27/2013

The Exome is Not Enough

March 27, 2013

Dan Koboldt at MassGenomics explains why exome sequencing often fails to identify causal variants, even in Mendelian disorders — “the very plausible possibility that a noncoding functional variant is responsible.”

Koboldt, the analysis manager in the human genetics group at the Genome Institute at Washington University, says that researchers shouldn’t overlook the importance of noncoding functional variants, which require a suite of technologies to detect, including RNA-seq, ChIP-seq, DNase sequencing and footprinting, bisulfite sequencing, and chromosome conformation capture.

“These types of experiments generate a wealth of data about regulatory activity in genomes,” he says. “While studying each of these independently is certainly informative, integrative analysis will be required to elucidate how all of these different regulatory mechanisms work together.”

While this effort will require “robust statistical models, substantial computing resources, and productive collaboration among research groups,” the end result “will be a far more complete understanding of how the genome works,” he says.

 
SOURCE:

Dan Koboldt works as a staff scientist in the Human Genetics group of the Genome Institute at Washington University in St. Louis. There, he works with scientists, physicians, programmers, and data analysts to understand the genetic basis of complex human diseases such as cancer, vision disorders, and metabolic syndromes through next-gen sequencing analysis. He received bachelor’s degrees in Computer Science and French from the University of Missouri-Columbia, and a master’s degree in Biology from Washington University.

Dan has worked in the field of human genetics since 2003, when he joined the lab of Raymond E. Miller, which played a role in the International HapMap Project and later the genetic map of C. briggsae, a model organism related to C. elegans.

Disclaimer: The views expressed on this site, including blog posts and static pages, do not necessarily reflect the opinions of the Genome Institute at Washington University, the Washington University School of Medicine, or Washington University in St. Louis.

Before diving in with both feet, next-generation sequencing neophytes might want to take a gander at a post by Dan Koboldt at MassGenomics where he describes his 10 commandments for good next-gen sequencing.

In his post, Koboldt breaks up his instructions into four categories: analysis, publications, data sharing and submissions, and research ethics and cost.

His list includes some oft-repeated warnings. For example, he cautions against reinventing the wheel when it comes to developing analysis software, and, for pity’s sake, don’t invent any more words that end in “ome” or “omics.”

Some other no-no’s, according to Koboldt, include publishing results before they’ve been vetted properly, testing new methods on simulated data only, and taking “unfair advantage of submitted data.”

He also admonishes newcomers to think a little bit about the cost of analysis without which “your sequencing data, your $1,000 genome, is about as useful as a chocolate teapot,” and to have a care for the privacy of their study participants’ samples and data.

Ten Commandments for Next-Gen Sequencing

Just as the reach of next-generation sequencing has continued to grow, in both research and clinical realms, so too has the community of NGS users. Some have been around since the early days. The days of 454 and Solexa sequencing. Since then, the field has matured at an astonishing pace. Many standards were established to help everyone make sense of this flood of data. The recent democratization of sequencing has made next-gen sequencing available to just about anyone.

And yet, there have been growing pains. With great power comes great responsibility. To help some of the newcomers into the field, I’ve drafted these ten commandments for next-gen sequencing.

NGS Analysis

1. Thou shalt not reinvent the wheel. In spite of rapid technological advances, NGS is not a new field. Most of the current “workhorse” technologies have been on the market for a couple of years or more. As such, we have a plethora of short read aligners, de novo assemblers, variant callers, and other tools already. Even so, there is a great temptation for bioinformaticians to write their own “custom scripts” to perform these tasks. There’s a new “Applications Note” every day with some tool that claims to do something new or better.

Can you really write an aligner that’s better than BWA? More importantly, do we need one? Unless you have some compelling reason to develop something new (as we did when we developed SomaticSniper and VarScan), take advantage of what’s already out there.

2. Thou shalt not coin any new term ending with “ome” or “omics”. We have enough of these already, to the point where it’s getting ridiculous. Genome, transcriptome, and proteome are obvious applications of this nomenclature. Epigenome, sure. But the metabolome, interactome, and various other “ome” words are starting to detract from the naming system. The ones we need have already been coined. Don’t give in to the temptation.

3. Thou shalt follow thy field’s conventions for jargon. Technical terms, acronyms, and abbreviations are inherent to research. We need them both for precision and brevity. Where we get into trouble is when people feel the need to create their own acronyms when a suitable one already exists. Is there a significant difference between next-generation sequencing (NGS), high-throughput sequencing (HTS), and massively parallel sequencing (MPS)?

Widely accepted terms provide something of a standard, and they should be used whenever possible. Insertion/deletion variants are indels, not InDels, INDELs, or DIPs. Structural variants are SVs, not SVars or GVs. We don’t need any more acronyms!

NGS Publications

These commandments address behaviors that get on my nerves, both as a blogger and a peer reviewer.

4. Thou shalt not publish by press release. This is a disturbing trend that seems to happen more and more frequently in our field: the announcement of “discoveries” before they have been accepted for publication. Peer review is the required vetting process for scientific research. Yes, it takes time and yes, your competitors are probably on the verge of the same discovery. That doesn’t mean you get to skip ahead and claim credit by putting out a press release.

There are already examples of how this can come back to bite you. When the reviewers trash your manuscript, or (gasp) you learn that a mistake was made, it looks bad. It reflects poorly on the researchers and the institution, both in the field and in the eyes of the public.

5. Thou shalt not rely only on simulated data. Often when I read a paper on a new method or algorithm, they showcase it using simulated data. This often serves a noble purpose, such as knowing the “correct” answer and demonstrating that your approach can find it. Even so, you’d better apply it to some real data too. Simulations simply can’t replicate the true randomness of nature and the crap-that-can-go-wrong reality of next-gen sequencing. There’s plenty of freely available data out there; go get some of it.

6. Thou shalt obtain enough samples. One consequence of the rapid growth of our field (and accompanying drop in sequencing costs) is that small sample numbers no longer impress anyone. They don’t impress me, and they certainly don’t impress the statisticians upstairs. The novelty of exome or even whole-genome sequencing has long worn off. Now, high-profile studies must back their findings with statistically significant results, and that usually means finding a cohort of hundreds (or thousands) of patients with which to extend your findings.

This new reality may not be entirely bad news, because it surely will foster collaboration between groups that might otherwise not be able to publish individually.

Data Sharing and Submissions

7. Thou shalt withhold no data. With some exceptions, sequencing datasets are meant to be shared. Certain institutions, such as large-scale sequencing centers in the U.S., are mandated by their funding agencies to deposit data generated using public funds on a timely basis following its generation. Since the usual deposition site is dbGaP, this means that IRB approvals and dbGaP certification letters must be in hand before sequencing can begin.

Any researchers who plan to publish their findings based on sequencing datasets will have to submit them to public databases before publication. This is not optional. It is not “something we should do when we get around to it after the paper goes out.” It is required to reproduce the work, so it should really be done before a manuscript is submitted. Consider this excerpt from Nature’s publication guidelines:

Data sets must be made freely available to readers from the date of publication, and must be provided to editors and peer-reviewers at submission, for the purposes of evaluating the manuscript.

For the following types of data set, submission to a community-endorsed, public repository is mandatory. Accession numbers must be provided in the paper.

The policies go on to list various types of sequencing data:

  • DNA and RNA sequences
  • DNA sequencing data (traces for capillary electrophoresis and short reads for next-generation sequencing)
  • Deep sequencing data
  • Epitopes, functional domains, genetic markers, or haplotypes.

Every journal should have a similar policy; most top-tier journals already do. Editors and referees need to enforce this submission requirement by rejecting any manuscripts that do not include the submission accession numbers.

8. Thou shalt not take unfair advantage of submitted data. Many investigators are concerned about data sharing (especially when mandated upon generation, not publication) from fear of being scooped. This is a valid concern. When you submit your data to a public repository, others can find it and (if they meet the requirements) use it. Personally, I think most of these fears are not justified — I mean, have you ever tried to get data out of dbGaP? The time it takes for someone to find, request, obtain, and use submitted data should allow the producers of the data to write it up.

Large-scale efforts to which substantial resources have been devoted — such as the Cancer Genome Atlas — have additional safeguards in place. Their data use policy states that, for a given cancer type, submitted data can’t be used until the “marker paper” has been published. This is a good rule of thumb for the NGS community, and something that journal editors (and referees) haven’t always enforced.

Just because you can scoop someone doesn’t mean that you should. It’s not only bad karma, but bad for your reputation. Scientists have long memories. They will likely review your manuscript or grant proposal sometime in the future. When that happens, you want to be the person who took the high road.

Research Ethics and Cost

9. Thou shalt not discount the cost of analysis. It’s true that since the advent of NGS technology, the cost of sequencing has plummeted. The cost of analysis, however, has not. And making sense of genomic data — alignment, quality control, variant calling, annotation, interpretation — is a daunting task indeed. It takes computational resources as well as expertise. This infrastructure is not free; in fact, it can be more expensive than the sequencing itself. 

Without analysis, your sequencing data, your $1,000 genome, is about as useful as a chocolate teapot.

10. Thou shalt honor thy patients and their samples. Earlier this month, I wrote about how supposedly anonymous individuals from the CEPH collection were identified using a combination of genetic markers and online databases. It is a simple fact that we can no longer guarantee a sequenced sample’s anonymity. That simple fact, combined with our growing ability to interpret the possible consequences of an individual genome, means a great deal of risk for study volunteers.

We must safeguard the privacy of study participants — and find ways to protect them from privacy violations and/or discrimination — if we want their continued cooperation.

This means obtaining good consent documents and ensuring that they’re all correct before sequencing begins. It also means adhering to the data use policies those consents specify. As I’ve written before, samples are the new commodity in our field. Anyone can rent time on a sequencer. If you don’t make an effort to treat your samples right, someone else will.

Other Writing by Dan Koboldt

Dan Koboldt is also the author of Get Your Baby to Sleep, a resource to help new parents whose baby won’t sleep with advice on establishing healthy baby sleep habits and handling baby sleep problems. He contributes to The Best of Twins and In Search of Whitetails blogs as well.

Other related articles on this Open Access Online Scientific Journal:

“Genome in a Bottle”: NIST’s new metrics for Clinical Human Genome Sequencing

http://pharmaceuticalintelligence.com/2012/09/06/genome-in-a-bottle-nists-new-metrics-for-clinical-human-genome-sequencing/

DNA – The Next-Generation Storage Media for Digital Information

http://pharmaceuticalintelligence.com/2012/08/27/dna-the-next-generation-storage-media-for-digital-information/

How Genome Sequencing is Revolutionizing Clinical Diagnostics

http://pharmaceuticalintelligence.com/2012/08/20/how-genome-sequencing-is-revolutionizing-clinical-diagnostics/

NGS Market: Trends and Development for Genotype-Phenotype Associations Research

http://pharmaceuticalintelligence.com/2013/02/19/ngs-market-trends-and-development-for-genotype-phenotype-associations/

What is the Future for Genomics in Clinical Medicine?

http://pharmaceuticalintelligence.com/2013/02/17/what-is-the-future-for-genomics-in-clinical-medicine/

Genomically Guided Treatment after CLIA Approval: to be offered by Weill Cornell Precision Medicine Institute

http://pharmaceuticalintelligence.com/2013/02/06/genomically-guided-treatment-after-clia-approval-to-be-offered-by-weill-cornell-precision-medicine-institute/

Inaugural Genomics in Medicine – The Conference Program, 2/11-12/2013, San Francisco, CA

http://pharmaceuticalintelligence.com/2013/02/04/inaugural-genomics-in-medicine-the-conference-program-211-122013-san-francisco-ca/

GSK for Personalized Medicine using Cancer Drugs needs Alacris systems biology model to determine the in silico effect of the inhibitor in its “virtual clinical trial”

http://pharmaceuticalintelligence.com/2012/11/14/gsk-for-personalized-medicine-using-cancer-drugs-needs-alacris-systems-biology-model-to-determine-the-in-silico-effect-of-the-inhibitor-in-its-virtual-clinical-trial/

arrayMap: Genomic Feature Mining of Cancer Entities of Copy Number Abnormalities (CNAs) Data

http://pharmaceuticalintelligence.com/2012/11/01/arraymap-genomic-feature-mining-of-cancer-entities-of-copy-number-abnormalities-cnas-data/

NGS Cardiovascular Diagnostics: Long-QT Genes Sequenced – A Potential Replacement for Molecular Pathology

http://pharmaceuticalintelligence.com/2012/10/01/ngs-cardiovascular-diagnostics-long-qt-genes-sequenced-a-potential-replacement-for-molecular-pathology/

Speeding Up Genome Analysis: MIT Algorithms for Direct Computation on Compressed Genomic Datasets

http://pharmaceuticalintelligence.com/2012/09/18/speeding-up-genome-analysis-mit-algorithms-for-direct-computation-on-compressed-genomic-datasets/

Clinical Genetics, Personalized Medicine, Molecular Diagnostics, Consumer-targeted DNA – Consumer Genetics Conference (CGC) – October 3-5, 2012, Seaport Hotel, Boston, MA

http://pharmaceuticalintelligence.com/2012/09/06/clinical-genetics-personalized-medicine-molecular-diagnostics-consumer-targeted-dna-consumer-genetics-conference-cgc-october-3-5-2012-seaport-hotel-boston-ma/

“CRACKING THE CODE OF HUMAN LIFE: The Birth of BioInformatics & Computational Genomics” lays out the manifold multivariate systems analytical tools that have moved the science forward to a ground that ensures clinical application.

http://pharmaceuticalintelligence.com/2013/02/13/cracking-the-code-of-human-life-the-birth-of-bioinformatics-and-computational-genomics/

Curator: Aviva Lev-Ari, PhD, RN

WordCloud Image Produced by Adam Tubman

Dr. M. Michael Barmada, Associate Professor at the Center for Computational Genetics, University of Pittsburgh, describes how genetics, the hot topic of the moment, has challenged computational resources across the University:

Associate Professor at Center for Computational Genetics at University of Pittsburgh, Dr. M. Michael Barmada

CLC bio Annual Survey Results

http://www.clcbio.com/wp-content/uploads/2013/02/annual_survey_clcbio.pdf?utm_source=Survey2012&utm_medium=CLC

CLC bio has published the results of a survey of researchers in the next-generation sequencing market to find out which sequencers and software are used the most.

The company says it received responses from 708 individuals in 73 countries.

Not surprisingly, the survey found the following for sequencers:

  • Illumina’s HiSeq and MiSeq are the most used instruments, with about 34.6 percent and 21.3 percent of respondents, respectively, stating that they use the systems.
  • Roche’s 454 sequencers got 21.2 percent of the votes.
  • Life Technologies’ Ion Torrent Personal Genome Machine got 11.5 percent of the responses.

In terms of bioinformatics tools:

  • UCSC Genome Browser has the most use, according to the survey, with 28.9 percent of respondents reporting that they use the program.
  • Ensembl tools are next in line, with 26.9 percent.
  • Bowtie follows with 23.4 percent of the votes.

Also worth noting is that NGS is being used primarily for

  • whole-genome sequencing (40.8 percent of the votes), followed by
  • RNA-seq (40.1 percent) and
  • de novo sequencing (39.8 percent).

Of the 708 respondents, about 24.6 percent work in the US, according to CLC. Also,

  • 73 percent of respondents work in academic research while
  • 9 percent work in industry, another
  • 9 percent in government, and
  • 6 percent work in not-for-profit organizations, according to the survey.
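
To put the respondent percentages in perspective, here is a minimal Python sketch that converts them into approximate head counts. The percentages and the total of 708 come from the survey summary above; the rounded counts are my own estimates and do not appear in the survey, and the helper name `approx_count` is just for illustration.

```python
# Illustrative only: converts the reported percentages into approximate
# respondent counts. The rounded counts are estimates, not survey figures.
TOTAL_RESPONDENTS = 708  # reported by CLC bio

# Percentages as quoted above; the sector shares are rounded in the source.
reported_shares = {
    "works in the US": 24.6,
    "academic research": 73.0,
    "industry": 9.0,
    "government": 9.0,
    "not-for-profit": 6.0,
}


def approx_count(percent: float, total: int = TOTAL_RESPONDENTS) -> int:
    """Return the nearest whole number of respondents for a percentage."""
    return round(total * percent / 100)


approx_counts = {label: approx_count(p) for label, p in reported_shares.items()}

# The four workplace sectors are meant to partition the respondents.
sectors = ("academic research", "industry", "government", "not-for-profit")
sector_total = sum(reported_shares[s] for s in sectors)

for label, count in approx_counts.items():
    print(f"{label}: about {count} of {TOTAL_RESPONDENTS}")
print(f"sector shares sum to {sector_total:.0f} percent")
```

Note that the four sector shares add up to 97 percent, so roughly 3 percent of respondents are not broken down in the summary above.
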

“We believe MedQL has the potential to be an effective time saver for researchers working with variant prioritization, making it a promising new plugin for CLC Genomics Workbench. We’re excited to add BioQL’s technology for evidence-based downstream analysis of Next Generation Sequencing data to our products,” said Mikael Flensborg, Director of Global Partner Relations at CLC bio.

Using CLC Genomics Workbench, a common workflow to detect causative mutations in medical genomics involves read mapping and variant detection. The result is a list of candidate gene variants that differ from the reference genome. The MedQL plugin uses an evidence-based approach to prioritize these genes for functional studies, thereby allowing researchers to focus their efforts on the most promising candidates.

CLC BIO AND BIOQL RELEASE MEDICAL GENOMICS PLUGIN FOR GENOTYPE–PHENOTYPE ASSOCIATIONS

Aarhus, Denmark — November 7, 2012 — Today, CLC bio and the independent software vendor, BioQL, announced the release of the MedQL Variant Prioritizer plugin for CLC Genomics Workbench. The plugin connects with MedQL’s online database to prioritize a list of variants in gene regions based on their degree of association with a given phenotype.

The MedQL database contains more than 20 million articles from Medline, indexed using a dictionary of nearly 300,000 terms from authoritative ontologies such as the HUGO Gene Nomenclature Committee (HGNC), the Human Disease Ontology, and the Online Mendelian Inheritance in Man (OMIM).
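
The press release does not say how MedQL scores the association between a gene and a phenotype. As a rough illustration of the general idea of literature-based prioritization, the Python sketch below ranks candidate genes by how often they are mentioned together with a phenotype term in a toy set of indexed abstracts. Everything here is an assumption for illustration: the corpus is invented, the function name `rank_genes_by_cooccurrence` is hypothetical, and simple co-occurrence counting is a stand-in, not MedQL’s actual method or API.

```python
from collections import Counter

# Hypothetical toy corpus standing in for indexed abstracts.
# Each entry is the set of dictionary terms found in one abstract.
indexed_abstracts = [
    {"BRCA1", "breast cancer"},
    {"BRCA1", "breast cancer", "TP53"},
    {"TP53", "lung cancer"},
    {"ERBB2", "breast cancer"},
    {"BRCA1", "ovarian cancer"},
]


def rank_genes_by_cooccurrence(candidate_genes, phenotype, abstracts):
    """Rank genes by the number of abstracts mentioning both gene and phenotype."""
    counts = Counter()
    for terms in abstracts:
        if phenotype in terms:
            for gene in candidate_genes:
                if gene in terms:
                    counts[gene] += 1
    # Genes never seen together with the phenotype get a score of 0.
    scored = [(gene, counts[gene]) for gene in candidate_genes]
    # sorted() is stable, so tied genes keep their input order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_genes_by_cooccurrence(
    ["TP53", "ERBB2", "BRCA1", "KRAS"], "breast cancer", indexed_abstracts
)
print(ranking)
```

A real system would presumably weight evidence far more carefully (for example by term specificity or publication quality); the point of the sketch is only that a variant list from read mapping and variant detection can be reordered using an external evidence score.
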

CLC BIO

We’re the world’s leading bioinformatics software developers and the only ones providing an analysis platform where both desktop and server software are seamlessly integrated and optimized for best performance.

Our wide range of analyses are available both through a user-friendly graphical user-interface as well as through command-line, allowing scientists to choose their preferred interface.

By developing our own proprietary algorithms, based on published methods, we have successfully accelerated the data calculations to achieve remarkable improvements in speed over comparable solutions.

Our enterprise platform serves as the backbone of sequence analysis pipelines for a large number of the world’s most prominent research institutions. With around 2000 different organizations as our customers around the globe, including the ten biggest pharmaceutical companies in the world, we have established ourselves as the market-leader in sequence analysis software.

One of our key strategies is to be ‘cross-platform’, which means we support all the major next generation sequencing platforms as well as traditional Sanger-based sequencing, effectively giving our customers a one-stop-shop for their analysis needs across all sequencing platforms.

http://www.clcbio.com/corporate/about-clc-bio/

 Desktop software for Sequence Analysis based on an overall level of subjects.

FEATURES

Next Generation Sequencing analysis
Genomics
Transcriptomics (Gene expression features also available in CLC Main Workbench)
Epigenomics
RNA secondary structure
BLAST searches
Protein analyses
Primer design
Assembly of Sanger sequencing data
Molecular cloning
Pattern discovery and motif search
Nucleotide analyses
GenBank Entrez searches
Sequence alignment
Phylogenetic trees
Detailed history log
Batch processing
Customization of your workbenches

CLC Genomics Machine

Our turnkey solution for small research labs. It includes CLC Genomics Server and CLC Genomics Workbench. Everything is preinstalled on a powerful desktop computer or server blade, ready to plug in and run from the day it is delivered.

CLC Genomics Factory

Our turnkey solution for medium and large research labs that need a complete IT infrastructure for their NGS data analysis.

USER-FRIENDLY BIOINFORMATICS

Our software is made for biologists by biologists, so it’s easy to analyze, visualize, and compare DNA, RNA, and Protein data, as well as run advanced workflows with large and complicated datasets.

J. CRAIG VENTER INSTITUTE EXTENDS CLC BIO SITE LICENSE THROUGH 2017

Aarhus, Denmark — January 8, 2013 — Today CLC bio, the global leader in commercial sequence analysis software, announced that the J. Craig Venter Institute (JCVI) has extended their site license agreement with CLC bio through 2017.

JCVI has been utilizing CLC bio’s enterprise platform since 2009 and currently uses it on more than 30 research grants, including their work as part of the Human Microbiome Project (HMP). The HMP is a National Institutes of Health-funded project to catalogue and characterize the microbes living in and on the human body. Recently, the HMP Consortium published a series of papers with results from this work in Nature and PLOS ONE. CLC bio’s software was used in the analysis of this work.

“The complexity and diversity of our research projects necessitates unique tools to analyze these increasingly large data sets. In our pursuit of excellence we always test and employ the best available tools for our research projects. As such we’re happy to announce the extension of our site license with CLC bio through 2017,” said Karen Nelson, Ph.D., President, JCVI.

“For us, it’s always very exciting to see the results of all the intriguing research that our customers are doing, and no less so, when JCVI published their papers on the HMP project this summer. JCVI was one of our first site license deals with a premier institution in the genomics research field, and we’re proud to announce it has been extended for another five years,” said Thomas Knudsen, CEO, CLC bio.

The original 4-year site license agreement between JCVI and CLC bio was signed in the summer of 2009, and has now been extended by another 5 years, through 2017. JCVI deploys CLC bio’s platform in an integrated environment across multiple geographical locations and together with international collaborators.

The potential contribution of informatics to healthcare is more than currently estimated

Reporter: Larry H Bernstein, MD, FCAP

 

I call attention to an interesting article that just came out. The estimated improvement in cost savings and diagnostic accuracy in healthcare is substantial. I have written about the unused potential that we have not yet seen. In short, substantial investment of resources in this area is justified, as has been proposed as a critical goal. Does this mean a reduction in staffing? I wouldn’t look at it that way. The two huge benefits that would accrue are:

  1. workflow efficiency, reducing stress and facilitating decision-making.
  2. scientifically, primary knowledge-based decision support by well-developed algorithms that have been at the heart of computational genomics.

Can computers save health care? IU research shows lower costs, better outcomes

Cost per unit of outcome was $189, versus $497 for treatment as usual

 Last modified: Monday, February 11, 2013

 

BLOOMINGTON, Ind. — New research from Indiana University has found that machine learning — the same computer science discipline that helped create voice recognition systems, self-driving cars and credit card fraud detection systems — can drastically improve both the cost and quality of health care in the United States.

 

 

 Physicians using an artificial intelligence framework that predicts future outcomes would have better patient outcomes while significantly lowering health care costs.

 

 

Using an artificial intelligence framework combining Markov Decision Processes and Dynamic Decision Networks, IU School of Informatics and Computing researchers Casey Bennett and Kris Hauser show how simulation modeling that understands and predicts the outcomes of treatment could

 

  • reduce health care costs by over 50 percent while also
  • improving patient outcomes by nearly 50 percent.

 

The work by Hauser, an assistant professor of computer science, and Ph.D. student Bennett improves upon their earlier work that

 

  • showed how machine learning could determine the best treatment at a single point in time for an individual patient.

 

By using a new framework that employs sequential decision-making, the previous single-decision research

 

  • can be expanded into models that simulate numerous alternative treatment paths out into the future;
  • maintain beliefs about patient health status over time even when measurements are unavailable or uncertain; and
  • continually plan/re-plan as new information becomes available.

In other words, it can “think like a doctor.”  (Perhaps better because of the limitation in the amount of information a bright, competent physician can handle without error!)
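
To make “maintain beliefs about patient health status over time even when measurements are unavailable” a little more concrete, here is a minimal Python sketch of a two-state belief filter. It is not the authors’ model: the state names, the observation names, and every probability below are invented for illustration.

```python
# Toy two-state belief filter. All probabilities are made up for illustration.
STATES = ("improving", "worsening")

# P(next state | current state) between visits.
TRANSITION = {
    "improving": {"improving": 0.8, "worsening": 0.2},
    "worsening": {"improving": 0.3, "worsening": 0.7},
}

# P(observed symptom score | true state).
OBSERVATION = {
    "improving": {"low_score": 0.7, "high_score": 0.3},
    "worsening": {"low_score": 0.2, "high_score": 0.8},
}


def predict(belief):
    """Advance the belief one time step using only the transition model."""
    return {
        nxt: sum(belief[cur] * TRANSITION[cur][nxt] for cur in STATES)
        for nxt in STATES
    }


def update(belief, observation=None):
    """Predict, then apply Bayes' rule if a measurement is available."""
    predicted = predict(belief)
    if observation is None:  # measurement missing: keep the predicted belief
        return predicted
    weighted = {s: predicted[s] * OBSERVATION[s][observation] for s in STATES}
    total = sum(weighted.values())
    return {s: w / total for s, w in weighted.items()}


belief = {"improving": 0.5, "worsening": 0.5}
for obs in ["high_score", None, "low_score"]:
    belief = update(belief, obs)
    print(obs, {s: round(p, 3) for s, p in belief.items()})
```

When a visit has no measurement (`None`), the belief still moves forward through the transition model, which is the sense in which the system can keep an opinion about the patient between observations.
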

 

“The Markov Decision Processes and Dynamic Decision Networks enable the system to deliberate about the future, considering all the different possible sequences of actions and effects in advance, even in cases where we are unsure of the effects,” Bennett said.  Moreover, the approach is non-disease-specific — it could work for any diagnosis or disorder, simply by plugging in the relevant information.  (This actually raises the question of what the information input is, and the cost of inputting.)
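
As a rough illustration of what “deliberating about the future” means in a Markov Decision Process, the Python sketch below solves a tiny, fully invented treatment problem by value iteration. The states, actions, transition probabilities, and rewards are assumptions made up for this example; the published framework is considerably richer and also handles uncertainty about the patient’s state.

```python
# Toy Markov Decision Process solved by value iteration.
# States, actions, probabilities and rewards are invented for illustration.
STATES = ("severe", "moderate", "remission")
ACTIONS = ("treat", "wait")
GAMMA = 0.9  # discount factor applied to future rewards

# TRANSITIONS[state][action] -> list of (probability, next_state, reward).
# Rewards combine a hypothetical treatment cost with a benefit for remission.
TRANSITIONS = {
    "severe": {
        "treat": [(0.6, "moderate", -2.0), (0.4, "severe", -2.0)],
        "wait": [(0.1, "moderate", -1.0), (0.9, "severe", -1.0)],
    },
    "moderate": {
        "treat": [(0.5, "remission", -2.0), (0.5, "moderate", -2.0)],
        "wait": [
            (0.2, "remission", -0.5),
            (0.6, "moderate", -0.5),
            (0.2, "severe", -0.5),
        ],
    },
    "remission": {
        "treat": [(1.0, "remission", -2.0)],
        "wait": [(0.9, "remission", 1.0), (0.1, "moderate", 1.0)],
    },
}


def action_value(state, action, values):
    """Expected discounted return of taking `action` in `state`."""
    return sum(
        p * (reward + GAMMA * values[nxt])
        for p, nxt, reward in TRANSITIONS[state][action]
    )


def value_iteration(tolerance=1e-6, max_sweeps=10_000):
    """Repeat Bellman updates until the values stop changing."""
    values = {s: 0.0 for s in STATES}
    for _ in range(max_sweeps):
        new_values = {
            s: max(action_value(s, a, values) for a in ACTIONS) for s in STATES
        }
        change = max(abs(new_values[s] - values[s]) for s in STATES)
        values = new_values
        if change < tolerance:
            break
    # The greedy policy picks the best action under the converged values.
    policy = {
        s: max(ACTIONS, key=lambda a: action_value(s, a, values)) for s in STATES
    }
    return values, policy


values, policy = value_iteration()
print(policy)
```

The resulting policy maps each state to the action with the best long-run payoff, rather than the best immediate payoff, which is the difference between sequential and single-decision planning.
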

 

The new work addresses three vexing issues related to health care in the U.S.:

 

  1. rising costs, expected to reach 30 percent of the gross domestic product by 2050;
  2. a quality of care where patients receive correct diagnosis and treatment less than half the time on a first visit; and
  3. a lag time of 13 to 17 years between research and practice in clinical care.

  Framework for Simulating Clinical Decision-Making

 

“We’re using modern computational approaches to learn from clinical data and develop complex plans through the simulation of numerous, alternative sequential decision paths,” Bennett said. “The framework here easily out-performs the current treatment-as-usual, case-rate/fee-for-service models of health care.”  (see the above)

 

Bennett is also a data architect and research fellow with Centerstone Research Institute, the research arm of Centerstone, the nation’s largest not-for-profit provider of community-based behavioral health care. The two researchers had access to clinical data, demographics and other information on over 6,700 patients who had major clinical depression diagnoses, of which about 65 to 70 percent had co-occurring chronic physical disorders like diabetes, hypertension and cardiovascular disease.  Using 500 randomly selected patients from that group for simulations, the two

 

  • compared actual doctor performance and patient outcomes against
  • sequential decision-making models

using real patient data.

They found a great disparity in the cost per unit of outcome change:

  1. the artificial intelligence model’s cost of $189 compared with the treatment-as-usual cost of $497;
  2. the AI approach obtained a 30 to 35 percent increase in patient outcomes.

Bennett said that “tweaking certain model parameters could enhance the outcome advantage to about 50 percent more improvement at about half the cost.”
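
The two dollar figures can be turned into a relative saving with a line of arithmetic. The small Python sketch below does this; the two costs are from the article, while the function name and the derived percentages are mine.

```python
# Cost per unit of outcome change, in dollars, as quoted in the article.
AI_COST_PER_UNIT = 189.0
USUAL_COST_PER_UNIT = 497.0  # treatment as usual


def relative_reduction(baseline: float, new: float) -> float:
    """Fraction by which `new` is lower than `baseline`."""
    return (baseline - new) / baseline


reduction = relative_reduction(USUAL_COST_PER_UNIT, AI_COST_PER_UNIT)
ratio = AI_COST_PER_UNIT / USUAL_COST_PER_UNIT

print(f"AI cost is {ratio:.0%} of treatment as usual")
print(f"relative reduction: {reduction:.0%}")
```

On these figures the AI model’s cost per unit of outcome is roughly 62 percent lower than treatment as usual, which is consistent with the headline claim of a reduction of over 50 percent.
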

 

While most medical decisions are based on case-by-case, experience-based approaches, there is a growing body of evidence that complex treatment decisions might be effectively improved by AI modeling. Hauser said, “Modeling lets us see more possibilities out to a further point,” because doctors “just don’t have all of that information available to them.” (Even then, the other issue is the processing of the information presented.)

 

 

Using the growing availability of electronic health records, health information exchanges, large public biomedical databases and machine learning algorithms, the researchers believe the approach could serve as the basis for personalized treatment through integration of diverse, large-scale data passed along to clinicians at the time of decision-making for each patient. Centerstone alone, Bennett noted, has access to health information on over 1 million patients each year. “Even with the development of new AI techniques that can approximate or even surpass human decision-making performance, we believe that the most effective long-term path could be combining artificial intelligence with human clinicians,” Bennett said. “Let humans do what they do well, and let machines do what they do well. In the end, we may maximize the potential of both.”

 

 

“Artificial Intelligence Framework for Simulating Clinical Decision-Making: A Markov Decision Process Approach” was published recently in Artificial Intelligence in Medicine. The research was funded by the Ayers Foundation, the Joe C. Davis Foundation and Indiana University.

 

IBM Watson Finally Graduates Medical School

 

It’s been more than a year since IBM’s Watson computer appeared on Jeopardy and defeated several of the game show’s top champions. Since then the supercomputer has been furiously “studying” the healthcare literature in the hope that it can beat a far more hideous enemy: the 400-plus biomolecular puzzles we collectively refer to as cancer.

 

 

 

Anomaly Based Interpretation of Clinical and Laboratory Syndromic Classes

Larry H Bernstein, MD, Gil David, PhD, Ronald R Coifman, PhD.  Program in Applied Mathematics, Yale University, Triplex Medical Science.

 

 Statement of Inferential  Second Opinion

 Realtime Clinical Expert Support and Validation System

Gil David and Larry Bernstein have developed, in consultation with Prof. Ronald Coifman, in the Yale University Applied Mathematics Program, a software system that is the equivalent of an intelligent Electronic Health Records Dashboard that provides
  • empirical medical reference and suggests quantitative diagnostics options.

Background

The current design of the Electronic Medical Record (EMR) is a linear presentation of portions of the record by
  • services, by
  • diagnostic method, and by
  • date, to cite examples.

This allows perusal through a graphical user interface (GUI) that partitions the information or necessary reports in a workstation entered by keying to icons. This requires that the medical practitioner find

  • the history,
  • medications,
  • laboratory reports,
  • cardiac imaging and EKGs, and
  • radiology
in different workspaces.  The introduction of a DASHBOARD has allowed a presentation of
  • drug reactions,
  • allergies,
  • primary and secondary diagnoses, and
  • critical information about any patient to the caregiver needing access to the record.
 The advantage of this innovation is obvious.  The startup problem is what information is presented and how it is displayed, which is a source of variability and a key to its success.

Proposal

We are proposing an innovation that supersedes the main design elements of a DASHBOARD and
  • utilizes the conjoined syndromic features of the disparate data elements.
So the important determinant of the success of this endeavor is that it facilitates both
  1. the workflow and
  2. the decision-making process
  • with a reduction of medical error.
 This has become extremely important and urgent in the 10 years since the publication of “To Err is Human”, and the newly published finding that reduction of error is as elusive as reduction in cost.  Whether they are counterproductive when approached in the wrong way may be subject to debate.
We initially confine our approach to laboratory data because it is collected on all patients, ambulatory and acutely ill, because the data is objective and quality controlled, and because
  • laboratory combinatorial patterns emerge with the development and course of disease.  Continuing work is in progress in extending the capabilities with model data-sets, and sufficient data.
It is true that the extraction of data from disparate sources will, in the long run, further improve this process.  An example is the finding of ST depression on EKG coincident with an increase of a cardiac biomarker (troponin) above a level determined by receiver operating characteristic (ROC) curve analysis, particularly in the absence of substantially reduced renal function.
The conversion of hematology-based data into useful clinical information requires the establishment of problem-solving constructs based on the measured data.  Traditionally this has been accomplished by an intuitive interpretation of the data by the individual clinician.  Through the application of geometric clustering analysis the data may be interpreted in a more sophisticated fashion in order to create a more reliable and valid knowledge-based opinion.
The most commonly ordered test used for managing patients worldwide is the hemogram that often incorporates the review of a peripheral smear.  While the hemogram has undergone progressive modification of the measured features over time the subsequent expansion of the panel of tests has provided a window into the cellular changes in the production, release or suppression of the formed elements from the blood-forming organ to the circulation.  In the hemogram one can view data reflecting the characteristics of a broad spectrum of medical conditions.
Progressive modification of the measured features of the hemogram has delineated characteristics expressed as measurements of
  • size,
  • density, and
  • concentration,
resulting in more than a dozen composite variables, including the
  1. mean corpuscular volume (MCV),
  2. mean corpuscular hemoglobin concentration (MCHC),
  3. mean corpuscular hemoglobin (MCH),
  4. total white cell count (WBC),
  5. total lymphocyte count,
  6. neutrophil count (mature granulocyte count and bands),
  7. monocytes,
  8. eosinophils,
  9. basophils,
  10. platelet count,
  11. mean platelet volume (MPV),
  12. blasts,
  13. reticulocytes,
  14. platelet clumps, and
  15. perhaps the percent immature neutrophils (not bands),
as well as other features of classification.
The use of such variables combined with additional clinical information including serum chemistry analysis (such as the Comprehensive Metabolic Profile (CMP)) in conjunction with the clinical history and examination complete the traditional problem-solving construct. The intuitive approach applied by the individual clinician is limited, however,
  1. by experience,
  2. memory and
  3. cognition.
The application of rules-based, automated problem solving may provide a more reliable and valid approach to the classification and interpretation of the data used to determine a knowledge-based clinical opinion.
The classification of the available hematologic data in order to formulate a predictive model may be accomplished through mathematical models that offer a more reliable and valid approach than the intuitive knowledge-based opinion of the individual clinician.  The exponential growth of knowledge since the mapping of the human genome has been enabled by parallel advances in applied mathematics that have not been a part of traditional clinical problem solving.  In a univariate universe the individual has significant control in visualizing data because unlike data may be identified by methods that rely on distributional assumptions.  As the complexity of statistical models has increased, involving the use of several predictors for different clinical classifications, the dependencies have become less clear to the individual.  The powerful statistical tools now available are not dependent on distributional assumptions, and allow classification and prediction in a way that cannot be achieved by the individual clinician intuitively. Contemporary statistical modeling has a primary goal of finding an underlying structure in studied data sets.
In the diagnosis of anemia the variables MCV, MCHC and MCH classify the disease process into microcytic, normocytic and macrocytic categories.  Further consideration of
  • proliferation of marrow precursors,
  • the domination of a cell line, and
  • features of suppression of hematopoiesis

provides a two-dimensional model.  Several other possible dimensions are created by consideration of

  • the maturity of the circulating cells.
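The first step of this classification, sorting a result by MCV, can be sketched in a few lines. This is only an illustration: the cutoffs of 80 and 100 fL are commonly quoted adult reference limits and are an assumption here (the article gives no numbers, and laboratories differ), and the function name is hypothetical.

```python
def classify_by_mcv(mcv_fl, low=80.0, high=100.0):
    """Return a size category for a mean corpuscular volume in femtoliters.

    The default cutoffs are assumed reference limits, not values from the article.
    """
    if mcv_fl < low:
        return "microcytic"
    if mcv_fl > high:
        return "macrocytic"
    return "normocytic"


if __name__ == "__main__":
    for value in (72.0, 90.0, 112.0):
        print(value, classify_by_mcv(value))
```

A fuller model would add the further dimensions listed above (marrow proliferation, cell-line dominance, suppression, cell maturity) as extra inputs rather than relying on a single threshold.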
The development of an evidence-based inference engine that can substantially interpret the data at hand and convert it in real time to a “knowledge-based opinion” may improve clinical problem solving by incorporating multiple complex clinical features as well as duration of onset into the model.
An example of a difficult area for clinical problem solving is found in the diagnosis of SIRS and associated sepsis.  SIRS (and associated sepsis) is a costly diagnosis in hospitalized patients.   Failure to diagnose sepsis in a timely manner creates a potential financial and safety hazard.  The early diagnosis of SIRS/sepsis is made by the application of defined criteria (temperature, heart rate, respiratory rate and WBC count) by the clinician.   The application of those clinical criteria, however, defines the condition after it has developed and has not provided a reliable method for the early diagnosis of SIRS.  The early diagnosis of SIRS may possibly be enhanced by the measurement of proteomic biomarkers, including transthyretin, C-reactive protein and procalcitonin.  Immature granulocyte (IG) measurement has been proposed as a more readily available indicator of the presence of
  • granulocyte precursors (left shift).
The use of such markers, obtained by automated systems in conjunction with innovative statistical modeling, may provide a mechanism to enhance workflow and decision making.
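As a rough sketch of what applying the defined criteria looks like in code, the example below flags which of the four named criteria a patient meets. The numeric cutoffs are not stated in this article; the values used are the commonly cited consensus thresholds and are an assumption, as are the names `Vitals`, `sirs_flags` and `meets_sirs`. Alternative forms of the criteria (PaCO2, band count) are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Vitals:
    temp_c: float        # body temperature, degrees Celsius
    heart_rate: float    # beats per minute
    resp_rate: float     # breaths per minute
    wbc_k_per_ul: float  # white cell count, thousands per microliter


def sirs_flags(v):
    """List the criteria that are met (assumed consensus cutoffs)."""
    checks = {
        "temperature": v.temp_c > 38.0 or v.temp_c < 36.0,
        "heart_rate": v.heart_rate > 90,
        "respiratory_rate": v.resp_rate > 20,
        "wbc": v.wbc_k_per_ul > 12.0 or v.wbc_k_per_ul < 4.0,
    }
    return [name for name, hit in checks.items() if hit]


def meets_sirs(v):
    """Two or more criteria are conventionally required."""
    return len(sirs_flags(v)) >= 2


if __name__ == "__main__":
    patient = Vitals(temp_c=38.6, heart_rate=104, resp_rate=18, wbc_k_per_ul=9.0)
    print(sirs_flags(patient), meets_sirs(patient))
```

A rule of this kind only confirms the condition once it is present, which is the limitation noted above; biomarkers or immature granulocyte counts would enter as additional inputs to a statistical model rather than as fixed thresholds.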
An accurate classification based on the multiplicity of available data can be provided by an innovative system that utilizes  the conjoined syndromic features of disparate data elements.  Such a system has the potential to facilitate both the workflow and the decision-making process with an anticipated reduction of medical error.

This study is only an extension of our approach to repairing a longstanding problem in the construction of the many-sided electronic medical record (EMR).  On the one hand, past history combined with the development of Diagnosis Related Groups (DRGs) in the 1980s have driven the technology development in the direction of “billing capture”, which has been a focus of epidemiological studies in health services research using data mining.

In a classic study carried out at Bell Laboratories, Didner found that information technologies reflect the view of the creators, not the users, and Front-to-Back Design (R Didner) is needed.  He expresses the view:

“Pre-printed forms are much more amenable to computer-based storage and processing, and would improve the efficiency with which the insurance carriers process this information.  However, pre-printed forms can have a rather severe downside. By providing pre-printed forms that a physician completes to record the diagnostic questions asked, as well as tests and results, the sequence of tests and questions might be altered from that which a physician would ordinarily follow.  This sequence change could improve outcomes in rare cases, but it is more likely to worsen outcomes.”

Decision Making in the Clinical Setting.   Robert S. Didner

 A well-documented problem in the medical profession is the level of effort dedicated to administration and paperwork necessitated by health insurers, HMOs and other parties (ref).  This effort is currently estimated at 50% of a typical physician’s practice activity.  Obviously this contributes to the high cost of medical care.  A key element in the cost/effort composition is the retranscription of clinical data after the point at which it is collected.  Costs would be reduced, and accuracy improved, if the clinical data could be captured directly at the point it is generated, in a form suitable for transmission to insurers, or machine transformable into other formats.  Such data capture could also be used to improve the form and structure of how this information is viewed by physicians, and form the basis of a more comprehensive database linking clinical protocols to outcomes, which could improve knowledge of this relationship and hence clinical outcomes.
 How we frame our expectations is so important that
  • it determines the data we collect to examine the process.
In the absence of data to support an assumed benefit, there is no proof of validity at whatever cost.   This has meaning for
  • hospital operations, for
  • nonhospital laboratory operations, for
  • companies in the diagnostic business, and
  • for planning of health systems.
In 1983, a vision for creating the EMR was introduced by Lawrence Weed and others.  This is expressed by McGowan and Winstead-Fry.
J J McGowan and P Winstead-Fry. Problem Knowledge Couplers: reengineering evidence-based medicine through interdisciplinary development, decision support, and research.
Bull Med Libr Assoc. 1999 October; 87(4): 462–470.   PMCID: PMC226622

 

Example of Markov Decision Process (MDP) transition automaton (Photo credit: Wikipedia)

Control loop of a Markov Decision Process (Photo credit: Wikipedia)

 

English: IBM’s Watson computer, Yorktown Heights, NY (Photo credit: Wikipedia)

English: Increasing decision stakes and systems uncertainties entail new problem solving strategies. Image based on a diagram by Funtowicz, S. and Ravetz, J. (1993) “Science for the post-normal age” Futures 25:735–55 (http://dx.doi.org/10.1016/0016-3287(93)90022-L). (Photo credit: Wikipedia)

 

 


CRACKING THE CODE OF HUMAN LIFE: The Birth of BioInformatics & Computational Genomics – Part IIB

Curator: Larry H Bernstein, MD, FCAP

Part I: The Initiation and Growth of Molecular Biology and Genomics – Part I From Molecular Biology to Translational Medicine: How Far Have We Come, and Where Does It Lead Us?


Part II: CRACKING THE CODE OF HUMAN LIFE is divided into a three part series.

Part IIA. “CRACKING THE CODE OF HUMAN LIFE: Milestones along the Way” reviews the Human Genome Project and the decade beyond.

http://pharmaceuticalintelligence.com/2013/02/12/cracking-the-code-of-human-life-milestones-along-the-way/

Part IIB. “CRACKING THE CODE OF HUMAN LIFE: The Birth of BioInformatics & Computational Genomics” lays out the manifold multivariate systems analytical tools that have moved the science forward to a ground that ensures clinical application.

http://pharmaceuticalintelligence.com/2013/02/13/cracking-the-code-of-human-life-the-birth-of-bioinformatics-and-computational-genomics/

Part IIC. “CRACKING THE CODE OF HUMAN LIFE: Recent Advances in Genomic Analysis and Disease “ will extend the discussion to advances in the management of patients as well as providing a roadmap for pharmaceutical drug targeting.

http://pharmaceuticalintelligence.com/2013/02/14/cracking-the-code-of-human-life-recent-advances-in-genomic-analysis-and-disease/

To be followed by:
Part III will conclude with Ubiquitin, its role in Signaling and Regulatory Control.

Part IIB. “CRACKING THE CODE OF HUMAN LIFE: The Birth of BioInformatics & Computational Genomics” is a continuation of a previous discussion on the role of genomics in discovery of therapeutic targets titled Directions for Genomics in Personalized Medicine, which focused on:

  • key drivers of cellular proliferation,
  • stepwise mutational changes coinciding with cancer progression, and
  • potential therapeutic targets for reversal of the process.

It is a direct extension of The Initiation and Growth of Molecular Biology and Genomics – Part I 

These articles review a web-like connectivity between inter-connected scientific discoveries, as significant findings have led to novel hypotheses and many expectations over the last 75 years. This largely post WWII revolution has driven our understanding of biological and medical processes at an exponential pace owing to successive discoveries of
  • chemical structure,
  • the basic building blocks of DNA  and proteins, of
  • nucleotide and protein-protein interactions,
  • protein folding,
  • allostericity,
  • genomic structure,
  • DNA replication,
  • nuclear polyribosome interaction, and
  • metabolic control.


In addition, the emergence of methods for

  • copying,
  • removal, and
  • insertion, together with
  • improvements in structural analysis and
  • developments in applied mathematics, has transformed the research framework.

This last point,

  • developments in applied mathematics have transformed the research framework, is developed in this very article

CRACKING THE CODE OF HUMAN LIFE: The Birth of BioInformatics & Computational Genomics – Part IIB

Computational Genomics

1. Three-Dimensional Folding and Functional Organization Principles of The Drosophila Genome

Sexton T, Yaffe E, Kenigsberg E, Bantignies F,…Cavalli G. Institut de Génétique Humaine, Montpellier GenomiX, and Weizmann Institute, France and Israel. Cell 2012; 148(3): 458-472.
http://dx.doi.org/10.1016/j.cell.2012.01.010/
http://www.cell.com/retrieve/pii/S0092867412000165
http://www.ncbi.nlm.nih.gov/pubmed/22265598

Chromosomes are the physical realization of genetic information and thus form the basis for its readout and propagation.

[Figures: DNA diagram showing base pairing; circular genome map]

Here we present a high-resolution chromosomal contact map derived from a modified genome-wide chromosome conformation capture approach applied to Drosophila embryonic nuclei.

  • The entire genome is linearly partitioned into well-demarcated physical domains that overlap extensively with active and repressive epigenetic marks.
  • Chromosomal contacts are hierarchically organized between domains.
  • Global modeling of contact density and clustering of domains show that inactive domains are condensed and confined to their chromosomal territories, whereas active domains reach out of the territory to form remote intra- and interchromosomal contacts.

Moreover, we systematically identify

  • specific long-range intrachromosomal contacts between Polycomb-repressed domains.

Together, these observations

  • allow for quantitative prediction of the Drosophila chromosomal contact map,
  • laying the foundation for detailed studies of chromosome structure and function in a genetically tractable system.


2A. Architecture Reveals Genome’s Secrets

Three-dimensional genome maps – Human chromosome

Genome sequencing projects have provided rich troves of information about

  • stretches of DNA that regulate gene expression, as well as
  • how different genetic sequences contribute to health and disease.

But these studies miss a key element of the genome—its spatial organization—which has long been recognized as an important regulator of gene expression.

  • Regulatory elements often lie thousands of base pairs away from their target genes.
  • Recent technological advances are allowing scientists to begin examining how distant chromosome locations interact inside a nucleus.
  • The creation and function of 3-D genome organization, some say, is the next frontier of genetics.

Mapping and sequencing may be completely separate processes. For example, it’s possible to determine the location of a gene—to “map” the gene—without sequencing it. Thus, a map may tell you nothing about the sequence of the genome, and a sequence may tell you nothing about the map.  But the landmarks on a map are DNA sequences, and mapping is the cousin of sequencing. A map of a sequence might look like this:
On this map, GCC is one landmark; CCCC is another. Here, a sequence itself serves as a landmark on the map. In general, particularly for humans and other species with large genomes,

  • creating a reasonably comprehensive genome map is quicker and cheaper than sequencing the entire genome.
  • mapping involves less information to collect and organize than sequencing does.
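The idea that short sequences such as GCC and CCCC act as landmarks can be sketched as a search for their positions along a longer sequence. The sequence below is made up for illustration (the original figure is not available), and `find_landmarks` is a hypothetical helper, not part of any mapping software.

```python
def find_landmarks(sequence, landmarks):
    """Return {landmark: [0-based start positions]} for each landmark string."""
    positions = {}
    for mark in landmarks:
        hits = []
        start = sequence.find(mark)
        while start != -1:
            hits.append(start)
            # Continue one base further so overlapping matches are also found.
            start = sequence.find(mark, start + 1)
        positions[mark] = hits
    return positions


if __name__ == "__main__":
    example = "ATGCCATTACCCCGTA"  # hypothetical sequence
    print(find_landmarks(example, ["GCC", "CCCC"]))
```

The output records only where the landmarks sit, not the bases between them, which is why a map involves less information than the full sequence.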

Completed in 2003, the Human Genome Project (HGP) was a 13-year project. The goals were:

  • identify all the approximately 20,000-25,000 genes in human DNA,
  • determine the sequences of the 3 billion chemical base pairs that make up human DNA,
  • store this information in databases,
  • improve tools for data analysis,
  • transfer related technologies to the private sector, and
  • address the ethical, legal, and social issues (ELSI) that may arise from the project.

Though the HGP is finished, analyses of the data will continue for many years. By licensing technologies to private companies and awarding grants for innovative research, the project catalyzed the multibillion-dollar U.S. biotechnology industry and fostered the development of new medical applications. When genes are expressed, their sequences are first converted into messenger RNA transcripts, which can be isolated in the form of complementary DNAs (cDNAs). A small portion of each cDNA sequence is all that is needed to develop unique gene markers, known as sequence tagged sites or STSs, which can be detected using the polymerase chain reaction (PCR). To construct a transcript map, cDNA sequences from a master catalog of human genes were distributed to mapping laboratories in North America, Europe, and Japan. These cDNAs were converted to STSs and their physical locations on chromosomes determined on one of two radiation hybrid (RH) panels or a yeast artificial chromosome (YAC) library containing human genomic DNA. This mapping data was integrated relative to the human genetic map and then cross-referenced to cytogenetic band maps of the chromosomes. (Further details are available in the accompanying article in the 25 October issue of SCIENCE).

Tremendous progress has been made in the mapping of human genes, a major milestone in the Human Genome Project. Apart from its utility in advancing our understanding of the genetic basis of disease, it  provides a framework and focus for accelerated sequencing efforts by highlighting key landmarks (gene-rich regions) of the chromosomes. The construction of this map has been possible through the cooperative efforts of an international consortium of scientists who provide equal, full and unrestricted access to the data for the advancement of biology and human health.

There are two types of maps: the genetic linkage map and the physical map. The genetic linkage map shows the arrangement of genes and genetic markers along the chromosomes as calculated by the frequency with which they are inherited together. The physical map is a representation of the chromosomes, providing the physical distance between landmarks on the chromosome, ideally measured in nucleotide bases. Physical maps can be divided into three general types: chromosomal or cytogenetic maps, radiation hybrid (RH) maps, and sequence maps.
[Figures: radiation hybrid maps; subchromosomal mapping]

2B. Genome-nuclear lamina interactions and gene regulation.

Kind J, van Steensel B. Division of Gene Regulation, Netherlands Cancer Institute, Amsterdam, The Netherlands.
The nuclear lamina, a filamentous protein network that coats the inner nuclear membrane, has long been thought to interact with specific genomic loci and regulate their expression. Molecular mapping studies have now identified
  • large genomic domains that are in contact with the lamina.
Genes in these domains are typically repressed, and artificial tethering experiments indicate that
  • the lamina can actively contribute to this repression.
Furthermore, the lamina indirectly controls gene expression in the nuclear interior by sequestration of certain transcription factors.
Mol Cell. 2010; 38(4):603-13.          http://dx.doi.org/10.1016/j.molcel.2010.03.016
Peric-Hupkes D, Meuleman W, Pagie L, Bruggeman SW, Solovei I,  …., van Steensel B.  Division of Gene Regulation, Netherlands Cancer Institute, Amsterdam, The Netherlands.
To visualize three-dimensional organization of chromosomes within the nucleus, we generated high-resolution maps of genome-nuclear lamina interactions during subsequent differentiation of mouse embryonic stem cells via lineage-committed neural precursor cells into terminally differentiated astrocytes.  A basal chromosome architecture present in embryonic stem cells is cumulatively altered at hundreds of sites during lineage commitment and subsequent terminal differentiation. This remodeling involves both
  • individual transcription units and multigene regions, and
  • affects many genes that determine cellular identity.
  • Genes that move away from the lamina are concomitantly activated;
  • others remain inactive yet become unlocked for activation in a next differentiation step.

Lamina-genome interactions are widely involved in the control of gene expression programs during lineage commitment and terminal differentiation.

Highlights
  • Various cell types share a core architecture of genome-nuclear lamina interactions
  • During differentiation, hundreds of genes change their lamina interactions
  • Changes in lamina interactions reflect cell identity
  • Release from the lamina may unlock some genes for activation

Fractal “globule”

About 10 years ago, just as the Human Genome Project was completing its first draft sequence, Dekker pioneered a new technique, called chromosome conformation capture (3C), that allowed researchers to get a glimpse of how chromosomes are arranged relative to each other in the nucleus. The technique relies on the physical cross-linking of chromosomal regions that lie in close proximity to one another. The regions are then sequenced to identify which regions have been cross-linked. In 2009, using a high-throughput version of this basic method, called Hi-C, Dekker and his collaborators discovered that the human genome appears to adopt a “fractal globule” conformation:

  • a manner of crumpling without knotting.
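Identifying which regions were cross-linked amounts, computationally, to counting contacts between pairs of genomic positions. The sketch below bins positions and tallies a symmetric contact matrix; it is a simplified illustration under assumed inputs (a single chromosome, pairs already mapped to coordinates), not the published Hi-C pipeline, which also involves read mapping, filtering and normalization. The function name and the toy data are hypothetical.

```python
def contact_matrix(pairs, genome_length, bin_size):
    """Count cross-linked position pairs per pair of bins (symmetric matrix)."""
    n_bins = (genome_length + bin_size - 1) // bin_size  # ceiling division
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos_a, pos_b in pairs:
        i, j = pos_a // bin_size, pos_b // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # keep the matrix symmetric
    return matrix


if __name__ == "__main__":
    # Toy data: positions on a 100-base "chromosome", 25-base bins.
    toy_pairs = [(5, 95), (12, 88), (40, 45)]
    for row in contact_matrix(toy_pairs, genome_length=100, bin_size=25):
        print(row)
```

In the toy data the first and last bins are far apart along the sequence yet share two contacts, the kind of signal that indicates folding has brought distant regions together.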


In the last 3 years, Job Dekker and others have advanced the technology even further, allowing them to paint a more refined picture of how the genome folds and how this influences gene expression and disease states.  Dekker’s 2009 findings were a breakthrough in modeling genome folding, but the resolution, about 1 million base pairs, was too crude to allow scientists to really understand how genes interacted with specific regulatory elements. The researchers report two striking findings.

First, the human genome is organized into two separate compartments, keeping

  • active genes separate and accessible
  • while sequestering unused DNA in a denser storage compartment.
  • Chromosomes snake in and out of the two compartments repeatedly
  • as their DNA alternates between active, gene-rich and inactive, gene-poor stretches.

Second, at a finer scale, the genome adopts an unusual organization known in mathematics as a “fractal.” The specific architecture the scientists found, called

  • a “fractal globule,” enables the cell to pack DNA incredibly tightly —

the information density in the nucleus is trillions of times higher than on a computer chip — while avoiding the knots and tangles that might interfere with the cell’s ability to read its own genome. Moreover, the DNA can easily Unfold and Refold during

  • gene activation,
  • gene repression, and
  • cell replication.

Dekker and his colleagues discovered, for example, that chromosomes can be divided into folding domains—megabase-long segments within which

  • genes and regulatory elements associate more often with one another than with other chromosome sections.

The DNA forms loops within the domains that bring a gene into close proximity with a specific regulatory element at a distant location along the chromosome. Another group, that of molecular biologist Bing Ren at the University of California, San Diego, published a similar finding in the same issue of Nature.  Dekker thinks the discovery of [folding] domains will be one of the most fundamental [genetics] discoveries of the last 10 years. The big questions now are

  • how these domains are formed, and
  • what determines which elements are looped into proximity.

“By breaking the genome into millions of pieces, we created a spatial map showing how close different parts are to one another,” says co-first author Nynke van Berkum, a postdoctoral researcher at UMass Medical School in Dekker’s laboratory. “We made a fantastic three-dimensional jigsaw puzzle and then, with a computer, solved the puzzle.”

Lieberman-Aiden, van Berkum, Lander, and Dekker’s co-authors are Bryan R. Lajoie of UMMS; Louise Williams, Ido Amit, and Andreas Gnirke of the Broad Institute; Maxim Imakaev and Leonid A. Mirny of MIT; Tobias Ragoczy, Agnes Telling, and Mark Groudine of the Fred Hutchinson Cancer Research Center and the University of Washington; Peter J. Sabo, Michael O. Dorschner, Richard Sandstrom, M.A. Bender, and John Stamatoyannopoulos of the University of Washington; and Bradley Bernstein of the Broad Institute and Harvard Medical School.

2C. Three-Dimensional Structure of the Human Genome

Lieberman-Aiden et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science, 2009; DOI: 10.1126/science.1181369.
Harvard University (2009, October 11). 3-D Structure Of Human Genome: Fractal Globule Architecture Packs Two Meters Of DNA Into Each Cell. ScienceDaily.   Retrieved February 2, 2013, from        http://www.sciencedaily.com/releases/2009/10/091008142957

The researchers used a new technology called Hi-C to answer the thorny question of how each of our cells stows some three billion base pairs of DNA while maintaining access to functionally crucial segments. The paper comes from a team led by scientists at Harvard University, the Broad Institute of Harvard and MIT, University of Massachusetts Medical School, and the Massachusetts Institute of Technology. “We’ve long known that on a small scale, DNA is a double helix,” says co-first author Erez Lieberman-Aiden, a graduate student in the Harvard-MIT Division of Health Sciences and Technology and a researcher at Harvard’s School of Engineering and Applied Sciences and in the laboratory of Eric Lander at the Broad Institute. “But if the double helix didn’t fold further, the genome in each cell would be two meters long. Scientists have not really understood how the double helix folds to fit into the nucleus of a human cell, which is only about a hundredth of a millimeter in diameter. This new approach enabled us to probe exactly that question.”

The mapping technique that Aiden and his colleagues have come up with bridges a crucial gap in knowledge—between what goes on at the smallest levels of genetics (the double helix of DNA and the base pairs) and the largest levels (the way DNA is gathered up into the 23 chromosomes that contain much of the human genome). The intermediate level, on the order of thousands or millions of base pairs, has remained murky.  As the genome is so closely wound, base pairs in one end can be close to others at another end in ways that are not obvious merely by knowing the sequence of base pairs. Borrowing from work that was started in the 1990s, Aiden and others have been able to figure out which base pairs have wound up next  to one another. From there, they can begin to reconstruct the genome—in three dimensions.

4C profiles validate the Hi-C Genome wide map

Even as the multi-dimensional mapping techniques remain in their early stages, their importance in basic biological research is becoming ever more apparent. “The three-dimensional genome is a powerful thing to know,” Aiden says. “A central mystery of biology is the question of how different cells perform different functions—despite the fact that they share the same genome.” How does a liver cell, for example, “know” to perform its liver duties when it contains the same genome as a cell in the eye? As Aiden and others reconstruct the trail of letters into a three-dimensional entity, they have begun to see that “the way the genome is folded determines which genes were

2D. “Mr. President; The Genome is Fractal !”

Eric Lander (Science Adviser to the President and Director of Broad Institute) et al. delivered the message on Science Magazine cover (Oct. 9, 2009) and generated interest in this by the International HoloGenomics Society at a Sept meeting.

First, it may seem to be trivial to rectify the statement in “About cover” of Science Magazine by AAAS.

  • The statement “the Hilbert curve is a one-dimensional fractal trajectory” needs mathematical clarification.

The mathematical concept of a Hilbert space, named after David Hilbert, generalizes the notion of Euclidean space. It extends the methods of vector algebra and calculus from the two-dimensional Euclidean plane and three-dimensional space to spaces with any finite or infinite number of dimensions. A Hilbert space is an abstract vector space possessing the structure of an inner product that allows length and angle to be measured. Furthermore, Hilbert spaces must be complete, a property that stipulates the existence of enough limits in the space to allow the techniques of calculus to be used.

A Hilbert curve (also known as a Hilbert space-filling curve) is a continuous fractal space-filling curve first described by the German mathematician David Hilbert in 1891, as a variant of the space-filling curves discovered by Giuseppe Peano in 1890. For multidimensional databases, Hilbert order has been proposed to be used instead of Z order because it has better locality-preserving behavior.

Representation as Lindenmayer system
The Hilbert Curve can be expressed by a rewrite system (L-system).

Alphabet : A, B

Constants : F + –

Axiom : A

Production rules:

A → – B F + A F A + F B –

B → + A F – B F B – F A +

Here, F means “draw forward”, – means “turn left 90°”, and + means “turn right 90°” (see turtle graphics).
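The rewrite system above can be sketched in a few lines of Python. This is a minimal illustration, not code from the original post: the function names `expand`, `trace` and `hilbert_points` are made up for this example, the en dash used in the rules is written as an ASCII `-`, and the turtle is assumed to start at the origin facing in the +x direction.

```python
# Production rules of the Hilbert-curve L-system quoted above.
RULES = {"A": "-BF+AFA+FB-", "B": "+AF-BFB-FA+"}


def expand(axiom, rules, depth):
    """Apply the production rules `depth` times to the axiom."""
    s = axiom
    for _ in range(depth):
        # Symbols without a rule (F, +, -) are constants and are copied as-is.
        s = "".join(rules.get(ch, ch) for ch in s)
    return s


def trace(commands):
    """Interpret the command string as turtle graphics on an integer grid."""
    x, y = 0, 0
    dx, dy = 1, 0                     # assumed initial heading: +x
    points = [(x, y)]
    for ch in commands:
        if ch == "F":                 # draw forward one unit
            x, y = x + dx, y + dy
            points.append((x, y))
        elif ch == "-":               # turn left 90 degrees
            dx, dy = -dy, dx
        elif ch == "+":               # turn right 90 degrees
            dx, dy = dy, -dx
        # A and B only drive the rewriting; they draw nothing.
    return points


def hilbert_points(order):
    """Grid points visited by the Hilbert curve of the given order."""
    return trace(expand("A", RULES, order))


if __name__ == "__main__":
    print(hilbert_points(1))
```

Each additional rewriting step quadruples the number of visited grid points while the path stays free of self-crossings, which is the "knot-free" property discussed in the text that follows.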


While the paper itself does not make this statement, the new Editorship of the AAAS Magazine might be even more advanced if the previous Editorship did not reject (without review) a Manuscript by 20+ Founders of (formerly) International PostGenetics Society in December, 2006.

Second, it may not be sufficiently clear for the reader that the reasonable requirement for the DNA polymerase to crawl along a “knot-free” (or “low knot”) structure does not need fractals. A “knot-free” structure could be spooled by an ordinary “knitting globule” (such that the DNA polymerase does not bump into a “knot” when duplicating the strand; just like someone knitting can go through the entire thread without encountering an annoying knot): Just to be “knot-free” you don’t need fractals. Note, however, that

  • the “strand” can be accessed only at its beginning; it is impossible, for example, to pluck a segment from deep inside the “globulus”.

This is where certain fractals provide a major advantage – that could be the “Eureka” moment for many readers. For instance,

  • the mentioned Hilbert-curve is not only “knot free” –
  • but provides an easy access to “linearly remote” segments of the strand.

If the Hilbert curve starts from the lower right corner and ends at the lower left corner, for instance

  • the path shows the very easy access of what would be the mid-point
  • if the Hilbert-curve is measured by the Euclidean distance along the zig-zagged path.

Likewise, even the path from the beginning of the Hilbert-curve is about equally easy to access – easier than to reach from the origin a point that is about 2/3 down the path. The Hilbert-curve provides an easy access between two points within the “spooled thread”; from a point that is about 1/5 of the overall length to about 3/5 is also in a “close neighborhood”.

This may be the “Eureka-moment” for some readers, to realize that

  • the strand of “the Double Helix” requires quite some finesse to fold into the densest possible globuli (the chromosomes) in such a clever way
  • that various segments can be easily accessed and, moreover, that distances between various segments are minimized.

This marvellous fractal structure is illustrated by the 3D rendering of the Hilbert-curve. Once you observe such a fractal structure, you will never again think of a chromosome as a “brillo mess”, will you? It will dawn on you that the genome is orders of magnitude more finessed than we ever thought.

Those embarking on a somewhat complex review of some historical aspects of the power of fractals may wish to consult the oeuvre of Mandelbrot (also, to celebrate his 85th birthday). For the more sophisticated readers, even the fairly simple Hilbert-curve (a representative of the Peano-class) becomes even more stunningly brilliant than just some “see through density”. Those who are familiar with the classic “Traveling Salesman Problem” know that finding “the shortest path along which every given n locations can be visited once, and only once” requires fairly sophisticated algorithms (and a tremendous amount of computation if n>10 or much more). Some readers will be amazed, therefore, that for n=9 the underlying Hilbert-curve helps to provide an empirical solution.
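The post does not spell out how the Hilbert curve helps with the Traveling Salesman Problem. One well-known approach, shown here only as an assumed illustration, is the space-filling-curve heuristic: visit the locations in the order in which the Hilbert curve passes through them. The result is a reasonable tour, not a guaranteed shortest one. The function names and the nine sample locations are hypothetical.

```python
import math


def hilbert_index(n, x, y):
    """Position of cell (x, y) along the Hilbert curve on an n-by-n grid.

    n must be a power of two and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def hilbert_tour(points, n):
    """Order the points by their position along the Hilbert curve."""
    return sorted(points, key=lambda p: hilbert_index(n, p[0], p[1]))


def tour_length(tour):
    """Euclidean length of the open path through the tour."""
    return sum(math.dist(a, b) for a, b in zip(tour, tour[1:]))


if __name__ == "__main__":
    # Nine made-up locations on an 8-by-8 grid.
    cities = [(0, 0), (7, 7), (3, 4), (1, 6), (6, 1), (4, 4), (2, 2), (5, 6), (0, 7)]
    tour = hilbert_tour(cities, 8)
    print(tour, round(tour_length(tour), 2))
```

Because cells that are consecutive along the curve are also neighbors on the grid, sorting by curve position tends to keep successive stops close together without any search over permutations.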

refer to pellionisz@junkdna.com

Briefly, the significance of the above realization, that the (recursive) Fractal Hilbert Curve is intimately connected to the (recursive) solution of the Traveling Salesman Problem, a core concept of Artificial Neural Networks, can be summarized as below.

Accomplished physicist John Hopfield (already a member of the National Academy of Sciences) aroused great excitement in 1982 with his (recursive) design of artificial neural networks and learning algorithms which were able to find reasonable solutions to combinatorial problems such as the Traveling Salesman Problem. (Book review by Clark Jeffries, 1991; see also J. Anderson, R. Rosenfeld, and A. Pellionisz (eds.), Neurocomputing 2: Directions for Research, MIT Press, Cambridge, MA, 1990):

“Perceptrons were modeled chiefly with neural connections in a “forward” direction: A -> B -> C -> D. The analysis of networks with strong backward coupling proved intractable. All our interesting results arise as consequences of the strong back-coupling” (Hopfield, 1982).
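To make the idea of “strong back-coupling” concrete, here is a minimal sketch of a Hopfield-style recurrent network used as an associative memory. It is an illustration under simplifying assumptions (Hebbian outer-product weights, states of +1/-1, asynchronous updates), not the Traveling Salesman formulation; the function names and the stored pattern are made up.

```python
def train(patterns):
    """Hebbian weights: every unit is coupled to every other unit (no self-coupling)."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j]
    return weights


def recall(weights, state, sweeps=5):
    """Repeatedly feed the network's own output back in until it settles."""
    s = list(state)
    n = len(s)
    for _ in range(sweeps):
        for i in range(n):            # asynchronous update, one unit at a time
            field = sum(weights[i][j] * s[j] for j in range(n))
            s[i] = 1 if field >= 0 else -1
    return s


if __name__ == "__main__":
    stored = [1, -1, 1, -1, 1, -1, 1, -1]
    w = train([stored])
    noisy = list(stored)
    noisy[0] = -noisy[0]              # corrupt one element
    print(recall(w, noisy) == stored)
```

The symmetric weights mean that each unit both drives and is driven by all the others, which is the backward coupling the quotation contrasts with purely feed-forward perceptrons.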

The Principle of Recursive Genome Function surpassed obsolete axioms that blocked, for half a century, entry of recursive algorithms into interpretation of the structure and function of the (Holo)Genome. This breakthrough, by uniting the two largely separate fields of Neural Networks and Genome Informatics, is particularly important for those who focused on Biological (actually occurring) Neural Networks, rather than on abstract algorithms that may not, or because of their core axioms simply could not, represent neural networks under the governance of DNA information.


3A. The FractoGene Decade

from Inception in 2002 to Proofs of Concept and Impending Clinical Applications by 2012

  1. Junk DNA Revisited (SF Gate, 2002)
  2. The Future of Life, 50th Anniversary of DNA (Monterey, 2003)
  3. Mandelbrot and Pellionisz (Stanford, 2004)
  4. Morphogenesis, Physiology and Biophysics (Simons, Pellionisz 2005)
  5. PostGenetics; Genetics beyond Genes (Budapest, 2006)
  6. ENCODE-conclusion (Collins, 2007)

The Principle of Recursive Genome Function (paper, YouTube, 2008)

  1. Cold Spring Harbor presentation of FractoGene (Cold Spring Harbor, 2009)
  2. Mr. President, the Genome is Fractal! (2009)
  3. HolGenTech, Inc. Founded (2010)
  4. Pellionisz on the Board of Advisers in the USA and India (2011)
  5. ENCODE – final admission (2012)
  6. Recursive Genome Function is Clogged by Fractal Defects in Hilbert-Curve (2012)
  7. Geometric Unification of Neuroscience and Genomics (2012)
  8. US Patent Office issues FractoGene 8,280,641 to Pellionisz (2012)

http://www.junkdna.com/the_fractogene_decade.pdf
http://www.scribd.com/doc/116159052/The-Decade-of-FractoGene-From-Discovery-to-Utility-Proofs-of-Concept-Open-Genome-Based-Clinical-Applications
http://fractogene.com/full_genome/morphogenesis.html

When the human genome was first sequenced in June 2000, there were two pretty big surprises. The first was that humans have only about 30,000-40,000 identifiable genes, not the 100,000 or more many researchers were expecting. The lower, and more humbling, number

  • means humans have just one-third more genes than a common species of worm.

The second stunner was

  • how much human genetic material — more than 90 percent — is made up of what scientists were calling “junk DNA.”

The term was coined to describe similar but not completely identical repetitive sequences of nucleotides (the same substances that make genes), which appeared to have no function or purpose. The main theory at the time was that these apparently non-working sections of DNA were just evolutionary leftovers, much like our earlobes.

If biophysicist Andras Pellionisz is correct, genetic science may be on the verge of yielding its third — and by far biggest — surprise.

With a doctorate in physics, Pellionisz is the holder of Ph.D.’s in computer sciences and experimental biology from the prestigious Budapest Technical University and the Hungarian National Academy of Sciences. A biophysicist by training, the 59-year-old is a former research associate professor of physiology and biophysics at New York University, author of numerous papers in respected scientific journals and textbooks, a past winner of the prestigious Humboldt Prize for scientific research, a former consultant to NASA and holder of a patent on the world’s first artificial cerebellum, a technology that has already been integrated into research on advanced avionics systems. Because of his background, the Hungarian-born brain researcher might also become one of the first people to successfully launch a new company by using the Internet to gather momentum for a novel scientific idea.

The genes we know about today, Pellionisz says, can be thought of as something similar to machines that make bricks (proteins, in the case of genes), with certain junk-DNA sections providing a blueprint for the different ways those proteins are assembled. The notion that at least certain parts of junk DNA might have a purpose is already reflected in terminology: many researchers, for example, now refer to those parts with a far less derogatory term, introns.

In a provisional patent application filed July 31, Pellionisz claims to have unlocked a key to the hidden role junk DNA plays in growth — and in life itself. His patent application covers all attempts to count, measure and compare the fractal properties of introns for diagnostic and therapeutic purposes.

3B. The Hidden Fractal Language of Intron DNA

To fully understand Pellionisz’ idea, one must first know what a fractal is.

Fractals are a way that nature organizes matter. Fractal patterns can be found in anything that has a nonsmooth surface (unlike a billiard ball), such as coastal seashores, the branches of a tree or the contours of a neuron (a nerve cell in the brain). Some, but not all, fractals are self-similar and stop repeating their patterns at some stage; the branches of a tree, for example, can get only so small. Because they are geometric, meaning they have a shape, fractals can be described in mathematical terms. It’s similar to the way a circle can be described by using a number to represent its radius (the distance from its center to its outer edge). When that number is known, it’s possible to draw the circle it represents without ever having seen it before.

Although the math is much more complicated, the same is true of fractals. If one has the formula for a given fractal, it’s possible to use that formula to construct, or reconstruct, an image of whatever structure it represents, no matter how complicated.

According to Pellionisz, the mysteriously repetitive but not identical strands of genetic material are in reality building instructions organized in a special type of pattern known as a fractal. It’s this pattern of fractal instructions, he says, that tells genes what they must do in order to form living tissue, everything from the wings of a fly to the entire body of a full-grown human.

In a move sure to alienate some scientists, Pellionisz has chosen the unorthodox route of making his initial disclosures online on his own Web site. He picked that strategy, he says, because it is the fastest way he can document his claims and find scientific collaborators and investors. Most mainstream scientists usually blanch at such approaches, preferring more traditionally credible methods, such as publishing articles in peer-reviewed journals.

Basically, Pellionisz’ idea is that a fractal set of building instructions in the DNA plays a similar role in organizing life itself. Decode the way that language works, he says, and in theory it could be reverse engineered. Just as knowing the radius of a circle lets one create that circle, the more complicated fractal-based formula would allow us to understand how nature creates a heart or simpler structures, such as disease-fighting antibodies. At a minimum, we’d get a far better understanding of how nature gets that job done.

The complicated quality of the idea is helping encourage new collaborations across the boundaries that sometimes separate the increasingly intertwined disciplines of biology, mathematics and computer sciences.

Hal Plotkin, Special to SF Gate. Thursday, November 21, 2002. http://www.junkdna.com/Special to SF Gate/plotkin.htm


3C. Multifractal Analysis

The human genome: a multifractal analysis. Moreno PA, Vélez PE, Martínez E, et al.

BMC Genomics 2011, 12:506. http://www.biomedcentral.com/1471-2164/12/506

Background: Several studies have shown that genomes can be studied via a multifractal formalism. Recently, we used a multifractal approach to study the genetic information content of the Caenorhabditis elegans genome. Here we investigate the possibility that the human genome shows a similar behavior to that observed in the nematode.
Results: We report here multifractality in the human genome sequence. This behavior correlates strongly with the presence of Alu elements and, to a lesser extent, with CpG islands and (G+C) content.

In contrast, no or low relationship was found for LINE, MIR, MER, LTRs elements and DNA regions poor in genetic information.

Gene function, clusters of orthologous genes, metabolic pathways, and exons tended to increase their frequencies with ranges of multifractality, and large gene families were located in genomic regions with varied multifractality.

Additionally, a multifractal map and classification for human chromosomes are proposed.

Conclusions: We propose a descriptive non-linear model for the structure of the human genome. This model reveals a multifractal regionalization where many regions coexist that are far from equilibrium, and this non-linear organization has significant molecular and medical genetic implications for understanding the role of Alu elements in genome stability and structure of the human genome. Given the role of Alu sequences in

  • gene regulation,
  • genetic diseases,
  • human genetic diversity,
  • adaptation
  • and phylogenetic analyses,

these quantifications are especially useful.
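The abstract does not describe the computation behind its multifractal formalism, so the following is only a toy sketch of the general idea, not the authors' pipeline. It builds a synthetic measure (a binomial multiplicative cascade, chosen because its exact answer is known) and estimates the generalized dimensions D_q from the partition sum over equal-sized boxes. A measure is multifractal when D_q changes with q. All names and the parameter p = 0.3 are assumptions made for the example.

```python
import math


def binomial_cascade(p, levels):
    """Synthetic measure: each interval splits its mass into fractions p and 1 - p."""
    measure = [1.0]
    for _ in range(levels):
        measure = [m * w for m in measure for w in (p, 1.0 - p)]
    return measure


def generalized_dimension(measure, q):
    """Estimate D_q from box masses on a uniform partition of the unit interval."""
    eps = 1.0 / len(measure)                     # box size
    masses = [m for m in measure if m > 0]
    if abs(q - 1.0) < 1e-12:                     # information dimension (limit q -> 1)
        return sum(m * math.log(m) for m in masses) / math.log(eps)
    partition_sum = sum(m ** q for m in masses)
    return math.log(partition_sum) / ((q - 1.0) * math.log(eps))


if __name__ == "__main__":
    mu = binomial_cascade(0.3, 10)
    for q in (0, 1, 2, 5):
        print(q, round(generalized_dimension(mu, q), 4))
```

For a uniform measure (p = 0.5) every D_q equals 1; the spread of D_q values for p = 0.3 is what a multifractal analysis quantifies.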

MiIP: The Monomer Identification and Isolation Program

Bun C, Ziccardi W, Doering J and Putonti C. Evolutionary Bioinformatics 2012:8 293-300.    http://dx.doi.org/10.4137/EBO.S9248

Repetitive elements within genomic DNA are both functionally and evolutionarily informative. Discovering these sequences ab initio is computationally challenging, compounded by the fact that sequence identity between repetitive elements can vary significantly.

Here we present a new application, the Monomer Identification and Isolation Program (MiIP), which provides functionality to both

  • search for a particular repeat as well as
  • discover repetitive elements within a larger genomic sequence.

To compare MiIP’s performance with other repeat detection tools, analysis was conducted for

  • synthetic sequences as well as
  • several a21-II clones and
  • HC21 BAC sequences.

The primary benefit of MiIP is the fact that it is a single tool capable of searching for both

  • known monomeric sequences as well as
  • discovering the occurrence of repeats ab initio, per the user’s required sensitivity of the search.
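The abstract does not describe MiIP's algorithm, so the sketch below is not MiIP. It only illustrates the first of the two tasks named above, searching for a known monomer when copies are similar but not identical, using a simple sliding window with a user-chosen identity threshold. The function names, the threshold values and the test sequence are hypothetical.

```python
def identity(window, monomer):
    """Fraction of positions at which the window matches the monomer."""
    matches = sum(1 for a, b in zip(window, monomer) if a == b)
    return matches / len(monomer)


def find_monomer(sequence, monomer, min_identity=0.8):
    """Return (start, identity) for every window meeting the identity threshold."""
    k = len(monomer)
    hits = []
    for start in range(len(sequence) - k + 1):
        score = identity(sequence[start:start + k], monomer)
        if score >= min_identity:
            hits.append((start, score))
    return hits


if __name__ == "__main__":
    seq = "TTACGTACGAACGTTT"
    print(find_monomer(seq, "ACGT"))                      # exact copies only
    print(find_monomer(seq, "ACGT", min_identity=0.75))   # also a diverged copy
```

Lowering `min_identity` plays the role of the user's "required sensitivity of the search": it admits more diverged copies at the cost of more spurious hits. This sketch ignores insertions and deletions, which a real repeat finder would have to handle.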

Methods for Examining Genomic and Proteomic Interactions

1. An Integrated Statistical Approach to Compare Transcriptomics Data Across Experiments: A Case Study on the Identification of Candidate Target Genes of the Transcription Factor PPARα

Ullah MO, Müller M and Hooiveld GJEJ. Bioinformatics and Biology Insights 2012:6 145–154.       http://dx.doi.org/10.4137/BBI.S9529

http://www.la- press.com/
http://bionformaticsandBiologyInsights.com/An_Integrated_Statistical_Approach_to_Compare_ transcriptomic_Data_Across_Experiments-A-Case_Study_on_the_Identification_ of_Candidate_Target_Genes_of_the Transcription_Factor_PPARα/
Corresponding author email: guido.hooiveld@wur.nl

An effective strategy to elucidate the signal transduction cascades activated by a transcription factor is to compare the transcriptional profiles of wild type and transcription factor knockout models. Many statistical tests have been proposed for analyzing gene expression data, but most tests are based on pair-wise comparisons. Since the analysis of microarrays involves the testing of multiple hypotheses within one study, it is generally accepted that one should control for false positives by the false discovery rate (FDR). However, it has been reported that this may be an inappropriate metric for comparing data across different experiments.
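For readers unfamiliar with FDR control, the following sketch shows the Benjamini-Hochberg step-up procedure, the most common way of controlling the false discovery rate. The abstract does not say which FDR procedure the authors had in mind, so treat this as background under that assumption; the function name and the p-values are invented.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected at FDR alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])   # indices by ascending p-value
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank k with p_(k) <= (k / m) * alpha.
        if pvalues[idx] <= rank / m * alpha:
            cutoff = rank
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
    print(benjamini_hochberg(pvals))
```

The threshold for each p-value depends on the number of tests m within one experiment. This is one reason why gene lists thresholded separately in two experiments are hard to compare directly.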

Here we propose an approach that addresses the above mentioned problem by the simultaneous testing and integration of the three hypotheses (contrasts) using the cell means ANOVA model.

These three contrasts test for the effect of

  • a treatment in wild type,
  • gene knockout, and
  • globally over all experimental groups.
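A cell means model fits one mean per experimental group and expresses each hypothesis as a contrast of those means. The sketch below is a simplified single-gene illustration, not the authors' implementation. It assumes four groups (wild type control, wild type treated, knockout control, knockout treated), reads the third, "global" hypothesis as an omnibus F-test across all groups, and uses made-up expression values and function names.

```python
import math


def cell_means_fit(groups):
    """Per-group means, pooled residual variance (MSE) and residual degrees of freedom."""
    means = [sum(g) / len(g) for g in groups]
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df = sum(len(g) for g in groups) - len(groups)
    return means, sse / df, df


def contrast_t(groups, weights):
    """Estimate and t-statistic for a linear contrast of the cell means."""
    means, mse, df = cell_means_fit(groups)
    estimate = sum(w * m for w, m in zip(weights, means))
    se = math.sqrt(mse * sum(w * w / len(g) for w, g in zip(weights, groups)))
    return estimate, estimate / se, df


def overall_f(groups):
    """Omnibus F-statistic for equality of all cell means (assumed 'global' test)."""
    means, mse, _ = cell_means_fit(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    return (ssb / (len(groups) - 1)) / mse


if __name__ == "__main__":
    # Hypothetical log-expression values for one gene, three replicates per group.
    groups = [
        [1.0, 1.2, 0.8],   # wild type, control
        [3.0, 3.2, 2.8],   # wild type, treated
        [1.1, 0.9, 1.0],   # knockout, control
        [1.0, 1.1, 0.9],   # knockout, treated
    ]
    print(contrast_t(groups, [-1, 1, 0, 0]))   # treatment effect in wild type
    print(contrast_t(groups, [0, 0, -1, 1]))   # treatment effect in knockout
    print(overall_f(groups))
```

In this made-up example the gene responds to treatment in wild type but not in the knockout, which is the pattern one would expect of a candidate target of the deleted transcription factor. Because all contrasts share one pooled variance estimate, they are tested within a single model rather than as separate pair-wise comparisons.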

We illustrate our approach on microarray experiments that focused on the identification of candidate target genes and biological processes governed by the fatty acid sensing transcription factor PPARα in liver. Compared to the often applied FDR-based across-experiment comparison, our approach identified a conservative but less noisy set of candidate genes with the same sensitivity and specificity. However, our method had the advantage of

  • properly adjusting for multiple testing while
  • integrating data from two experiments, and
  • was driven by biological inference.

We present a simple, yet efficient strategy to compare

  • differential expression of genes across experiments
  • while controlling for multiple hypothesis testing.

2. Managing biological complexity across orthologs with a visual knowledgebase of documented biomolecular interactions

Vincent VanBuren & Hailin Chen.   Scientific Reports 2, Article number: 1011  Received 02 October 2012 Accepted 04 December 2012 Published 20 December 2012
http://dx.doi.org/10.1038/srep01011

The complexity of biomolecular interactions and influences is a major obstacle to their comprehension and elucidation. Visualizing knowledge of biomolecular interactions increases comprehension and facilitates the development of new hypotheses. The rapidly changing landscape of high-content experimental results also presents a challenge for the maintenance of comprehensive knowledgebases. Distributing the responsibility for maintenance of a knowledgebase to a community of subject matter experts is an effective strategy for large, complex and rapidly changing knowledgebases.
Cognoscente serves these needs by

  • building visualizations for queries of biomolecular interactions on demand,
  • by managing the complexity of those visualizations, and
  • by crowdsourcing to promote the incorporation of current knowledge from the literature.

Imputing functional associations between biomolecules and imputing directionality of regulation for those predictions each require a corpus of existing knowledge as a framework to build upon. Comprehension of the complexity of this corpus of knowledge will be facilitated by effective visualizations of the corresponding biomolecular interaction networks.

Cognoscente

http://vanburenlab.medicine.tamhsc.edu/cognoscente.html
was designed and implemented to serve these roles as

  • a knowledgebase and
  • as an effective visualization tool for systems biology research and education.

Cognoscente currently contains over 413,000 documented interactions, with coverage across multiple species. Perl, HTML, GraphViz, and a MySQL database were used in the development of Cognoscente. Cognoscente was motivated by the need to

  • update the knowledgebase of biomolecular interactions at the user level, and
  • flexibly visualize multi-molecule query results for heterogeneous interaction types across different orthologs.

Satisfying these needs provides a strong foundation for developing new hypotheses about regulatory and metabolic pathway topologies. Several existing tools provide functions that are similar to Cognoscente, so we selected several popular alternatives to assess how their feature sets compare with Cognoscente (Table 1). All databases assessed had easily traceable documentation for each interaction, and included protein-protein interactions in the database.

Most databases, with the exception of BIND,

  • provide an open-access database that can be downloaded as a whole.

Most databases, with the exceptions of EcoCyc and HPRD, provide

  • support for multiple organisms.

Most databases support web services for interacting with the database contents programmatically, whereas this is a planned feature for Cognoscente. MINT, STRING, IntAct, EcoCyc, DIP and Cognoscente provide built-in visualizations of query results, which we consider among the most important features for facilitating comprehension of query results. BIND supports visualizations via Cytoscape. Cognoscente is among a few other tools that support

  • multiple organisms in the same query,
  • protein->DNA interactions, and
  • multi-molecule queries.

Cognoscente has planned support for small molecule interactants (i.e. pharmacological agents). MINT, STRING, and IntAct provide a prediction (i.e. score) of functional associations, whereas Cognoscente does not currently support this. Cognoscente provides

  • support for multiple edge encodings to visualize different types of interactions in the same display,
  • a crowdsourcing web portal that allows users to submit interactions that are then automatically incorporated in the knowledgebase, and
  • a display of orthologs as compound nodes to provide clues about potential orthologous interactions.
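As a rough illustration of two of the display ideas above (different edge encodings for different interaction types, and orthologs grouped into compound nodes), the sketch below writes a small graph in GraphViz's DOT language. It is not Cognoscente's code or schema. The function name, the style table and the example interactions are placeholders, and DOT "cluster" subgraphs stand in for compound nodes.

```python
# Hypothetical mapping from interaction type to a GraphViz edge encoding.
EDGE_STYLES = {
    "protein-protein": "style=solid, dir=none",
    "protein->DNA": "style=dashed, arrowhead=normal",
}


def to_dot(interactions, ortholog_groups):
    """Render interactions as DOT text, grouping orthologs into cluster subgraphs."""
    lines = ["digraph interactions {", "  compound=true;"]
    for i, (label, members) in enumerate(sorted(ortholog_groups.items())):
        lines.append(f"  subgraph cluster_{i} {{")
        lines.append(f'    label="{label}";')
        for member in members:
            lines.append(f'    "{member}";')
        lines.append("  }")
    for source, target, kind in interactions:
        lines.append(f'  "{source}" -> "{target}" [{EDGE_STYLES[kind]}];')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative entries only; not taken from the Cognoscente knowledgebase.
    orthologs = {"p53": ["TP53 (human)", "Trp53 (mouse)"]}
    edges = [
        ("TP53 (human)", "MDM2 (human)", "protein-protein"),
        ("TP53 (human)", "CDKN1A (human)", "protein->DNA"),
    ]
    print(to_dot(edges, orthologs))
```

The resulting text can be saved to a file and rendered with the GraphViz `dot` command.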

The main strengths of Cognoscente are that

  1. it provides a combined feature set that is superior to any existing database,
  2. it provides a unique visualization feature for orthologous molecules, and
  3. it provides relatively unique support for multiple edge encodings, crowdsourcing, and connectivity parameterization.

The current weaknesses of Cognoscente relative to these other tools are

  • that it does not fully support web service interactions with the database,
  • it does not fully support small molecule interactants, and
  • it does not score interactions to predict functional associations.

Web services and support for small molecule interactants are currently under development.

Other related articles on this Open Access Online Scientific Journal include the following:

Big Data in Genomic Medicine                    lhb                          http://pharmaceuticalintelligence.com/2012/12/17/big-data-in-genomic-medicine/

BRCA1 a tumour suppressor in breast and ovarian cancer – functions in transcription, ubiquitination and DNA repair S Saha                                                                                   http://pharmaceuticalintelligence.com/2012/12/04/brca1-a-tumour-suppressor-in-breast-and-ovarian-cancer-functions-in-transcription-ubiquitination-and-dna-repair/

Computational Genomics Center: New Unification of Computational Technologies at Stanford A Lev-Ari    http://pharmaceuticalintelligence.com/2012/12/03/computational-genomics-center-new-unification-of-computational-technologies-at-stanford/

Paradigm Shift in Human Genomics – Predictive Biomarkers and Personalized Medicine – Part 1 (pharmaceuticalintelligence.com) A Lev-Ari http://pharmaceuticalintelligence.com/2013/01/13/paradigm-shift-in-human-genomics-predictive-biomarkers-and-personalized-medicine-part-1/

LEADERS in Genome Sequencing of Genetic Mutations for Therapeutic Drug Selection in Cancer Personalized Treatment: Part 2 A Lev-Ari
http://pharmaceuticalintelligence.com/2013/01/13/leaders-in-genome-sequencing-of-genetic-mutations-for-therapeutic-drug-selection-in-cancer-personalized-treatment-part-2/

Personalized Medicine: An Institute Profile – Coriell Institute for Medical Research: Part 3 A Lev-Ari http://pharmaceuticalintelligence.com/2013/01/13/personalized-medicine-an-institute-profile-coriell-institute-for-medical-research-part-3/

GSK for Personalized Medicine using Cancer Drugs needs Alacris systems biology model to determine the in silico effect of the inhibitor in its “virtual clinical trial” A Lev-Ari    http://pharmaceuticalintelligence.com/2012/11/14/gsk-for-personalized-medicine-using-cancer-drugs-needs-alacris-systems-biology-model-to-determine-the-in-silico-effect-of-the-inhibitor-in-its-virtual-clinical-trial/

Recurrent somatic mutations in chromatin-remodeling and ubiquitin ligase complex genes in serous endometrial tumors S Saha
http://pharmaceuticalintelligence.com/2012/11/19/recurrent-somatic-mutations-in-chromatin-remodeling-and-ubiquitin-ligase-complex-genes-in-serous-endometrial-tumors/

Human Variome Project: encyclopedic catalog of sequence variants indexed to the human genome sequence A Lev-Ari

http://pharmaceuticalintelligence.com/2012/11/24/human-variome-project-encyclopedic-catalog-of-sequence-variants-indexed-to-the-human-genome-sequence/

Prostate Cancer Cells: Histone Deacetylase Inhibitors Induce Epithelial-to-Mesenchymal Transition sjwilliams
http://pharmaceuticalintelligence.com/2012/11/30/histone-deacetylase-inhibitors-induce-epithelial-to-mesenchymal-transition-in-prostate-cancer-cells/

http://pharmaceuticalintelligence.com/2013/01/09/the-cancer-establishments-examined-by-james-watson-co-discover-of-dna-wcrick-41953/

Directions for genomics in personalized medicine lhb http://pharmaceuticalintelligence.com/2013/01/27/directions-for-genomics-in-personalized-medicine/

How mobile elements in “Junk” DNA promote cancer. Part 1: Transposon-mediated tumorigenesis. Sjwilliams
http://pharmaceuticalintelligence.com/2012/10/31/how-mobile-elements-in-junk-dna-prote-cancer-part1-transposon-mediated-tumorigenesis/

Mitochondrial fission and fusion: potential therapeutic targets? Ritu saxena    http://pharmaceuticalintelligence.com/2012/10/31/mitochondrial-fission-and-fusion-potential-therapeutic-target/

Mitochondrial mutation analysis might be “1-step” away ritu saxena  http://pharmaceuticalintelligence.com/2012/08/14/mitochondrial-mutation-analysis-might-be-1-step-away/

mRNA interference with cancer expression lhb http://pharmaceuticalintelligence.com/2012/10/26/mrna-interference-with-cancer-expression/

Expanding the Genetic Alphabet and linking the genome to the metabolome http://pharmaceuticalintelligence.com/2012/09/24/expanding-the-genetic-alphabet-and-linking-the-genome-to-the-metabolome/

Breast Cancer: Genomic profiling to predict Survival: Combination of Histopathology and Gene Expression Analysis A Lev-Ari

http://pharmaceuticalintelligence.com/2012/12/24/breast-cancer-genomic-profiling-to-predict-survival-combination-of-histopathology-and-gene-expression-analysis/

Ubiquinin-Proteosome pathway, autophagy, the mitochondrion, proteolysis and cell apoptosis lhb http://pharmaceuticalintelligence.com/2012/10/30/ubiquinin-proteosome-pathway-autophagy-the-mitochondrion-proteolysis-and-cell-apoptosis/

Genomic Analysis: FLUIDIGM Technology in the Life Science and Agricultural Biotechnology A Lev-Ari http://pharmaceuticalintelligence.com/2012/08/22/genomic-analysis-fluidigm-technology-in-the-life-science-and-agricultural-biotechnology/

2013 Genomics: The Era Beyond the Sequencing Human Genome: Francis Collins, Craig Venter, Eric Lander, et al.  http://pharmaceuticalintelligence.com/2013_Genomics

DNA replication or DNA synthesis is the process of copying a double-stranded DNA molecule. This process is paramount to all life as we know it. (Photo credit: Wikipedia)

Chromosomal deletion (Photo credit: Wikipedia)

A slight mutation in the matched nucleotides can lead to chromosomal aberrations and unintentional genetic rearrangement. (Photo credit: Wikipedia)

Read Full Post »

Rewriting the Mathematics of Tumor Growth[1]; Teams Use Math Models to Sort Drivers from Passengers[2]:  Two JNCI Reviews by Mike Martin Regarding Genomics, Cancer, and Mutation

Curator: Stephen J. Williams, Ph.D.

WordCloud Image Produced by Adam Tubman

Word Cloud By Danielle Smolyar

Recently, there has been extensive interest in the cancer research and oncology community in detecting those mutations responsible for the initiation and propagation of a neoplastic cell (driver mutations) versus those mutations that are acquired randomly (or through selective pressures) due to the genetic instability of the transformed cell (passenger mutations). The impact of either type of mutation has been a topic for debate, with a recent article showing that some passenger mutations may actually be responsible for tumor survival. In addition, many articles highlighted on this site (and referenced below) in recent years have described the importance of classifying driver and passenger mutations for the purposes of more effective personalized medicine strategies directed against tumors.

Two review articles by Mike Martin in the Journal of the National Cancer Institute (JNCI) shed light on the current efforts and successes in discriminating between these passenger and driver mutations and in determining the impact of each type of mutation on tumor growth. However, as described in the associated article, the picture is not as clear-cut as previously thought, and the reviews highlight some revolutionary findings. In Rewriting the Mathematics of Tumor Growth, researchers discovered that driver mutations may confer such a small growth advantage that multiple mutations, including the so-called passenger mutations, are necessary in order to sustain tumor growth. In fact, much experimental evidence has suggested that at least six defined genetic events may be necessary for the in-vitro transformation of human cells. The following table shows some of the genetic events required for in-vitro transformation in cell culture systems.

Genetic events required for transformation

Species | Cell type | # of genes required for tumor formation* | Genes used | Reference | Events required for priming
Human | Fibroblasts, Embryonic kidney | 3 | hTERT, H-ras, Large T | (a) Hahn (Weinberg) | 2 (LT + hTERT)
 | Mammary epithelial, Myoblasts, Embryonic kidney | 6 | hTERT, H-ras, P53DD, c-myc, cyclin D1, CDK4 | (b) Kendall (Counter) | H-ras required for tumorigenesis, so probably 5 events needed
 | Fibroblasts | 4 | Large T, Small T, H-ras, hTERT | (c) Sun (Hornsby) | 2 (Large T + H-ras)
 | Fibroblasts | 4 | Large T, Small T, hTERT, Ras | (d) Rangarajan (Weinberg) | 3 (hTERT, Ras and either small or large T)
 | Keratinocytes | 4 | Cyclin D1, dnp53, EGFR, c-myc | (e) Goessel (Opitz) | 3 for anchorage independence (cyclin D1, dnp53, EGFR); Cyclin D1 + dnp53 for immortalization
 | HOSE | 6 | CDK4, cyclin D, hTERT plus combination of either P53DD, myrAkt, and H-ras or P53DD, H-ras, c-myc Bcl2 | (f) Sasaki (Kiyono) | 5
 | HOSE | 3 | hTERT, SV40 early, H-ras or K-ras | (g) Liu (Bast) | 2 (hTERT + SV40 early)
 | HOSE | 3 | Large T, hTERT, H-ras or c-erbB-2 | (h) Kusakari (Fujii) | 2 (hTERT + large T)
Rat | Fibroblasts | 2 | Large T, H-ras | (i) Hirakawa | Did not analyze
 | Fibroblasts | 2 | Large T, H-ras | (d) Rangarajan (Weinberg) | Large T
Mouse | MOSE, in p53-/- background | 3 | c-myc, K-ras, Akt | (j) Orsulic |
Pig | Fibroblasts | 6 | p53DD, hTERT, CDK4, H-ras, c-myc, cyclin D1 | (k) Adam (Counter) | 5 (need all but p53DD)

Note: priming means events required to immortalize but not fully transform.  * Cells were assessed both for the ability to form colonies in soft agarose and, subsequently, for tumor formation in immunocompromised mice.

a.         Hahn, W. C., Counter, C. M., Lundberg, A. S., Beijersbergen, R. L., Brooks, M. W., and Weinberg, R. A. (1999) Creation of human tumour cells with defined genetic elements, Nature 400, 464-468.

b.         Kendall, S. D., Linardic, C. M., Adam, S. J., and Counter, C. M. (2005) A network of genetic events sufficient to convert normal human cells to a tumorigenic state, Cancer Res 65, 9824-9828.

c.         Sun, B., Chen, M., Hawks, C. L., Pereira-Smith, O. M., and Hornsby, P. J. (2005) The minimal set of genetic alterations required for conversion of primary human fibroblasts to cancer cells in the subrenal capsule assay, Neoplasia 7, 585-593.

d.         Rangarajan, A., Hong, S. J., Gifford, A., and Weinberg, R. A. (2004) Species- and cell type-specific requirements for cellular transformation, Cancer Cell 6, 171-183.

e.         Goessel, G., Quante, M., Hahn, W. C., Harada, H., Heeg, S., Suliman, Y., Doebele, M., von Werder, A., Fulda, C., Nakagawa, H., Rustgi, A. K., Blum, H. E., and Opitz, O. G. (2005) Creating oral squamous cancer cells: a cellular model of oral-esophageal carcinogenesis, Proc Natl Acad Sci U S A 102, 15599-15604.

f.          Sasaki, R., Narisawa-Saito, M., Yugawa, T., Fujita, M., Tashiro, H., Katabuchi, H., and Kiyono, T. (2009) Oncogenic transformation of human ovarian surface epithelial cells with defined cellular oncogenes, Carcinogenesis 30, 423-431.

g.         Liu, J., Yang, G., Thompson-Lanza, J. A., Glassman, A., Hayes, K., Patterson, A., Marquez, R. T., Auersperg, N., Yu, Y., Hahn, W. C., Mills, G. B., and Bast, R. C., Jr. (2004) A genetically defined model for human ovarian cancer, Cancer Res 64, 1655-1663.

h.         Kusakari, T., Kariya, M., Mandai, M., Tsuruta, Y., Hamid, A. A., Fukuhara, K., Nanbu, K., Takakura, K., and Fujii, S. (2003) C-erbB-2 or mutant Ha-ras induced malignant transformation of immortalized human ovarian surface epithelial cells in vitro, Br J Cancer 89, 2293-2298.

i.          Hirakawa, T., and Ruley, H. E. (1988) Rescue of cells from ras oncogene-induced growth arrest by a second, complementing, oncogene, Proc Natl Acad Sci U S A 85, 1519-1523.

j.          Orsulic, S., Li, Y., Soslow, R. A., Vitale-Cross, L. A., Gutkind, J. S., and Varmus, H. E. (2002) Induction of ovarian cancer by defined multiple genetic changes in a mouse model system, Cancer Cell 1, 53-62.

k.         Adam, S. J., Rund, L. A., Kuzmuk, K. N., Zachary, J. F., Schook, L. B., and Counter, C. M. (2007) Genetic induction of tumorigenesis in swine, Oncogene 26, 1038-1045.

However, it may be argued that the aforementioned experimental examples were produced in cell lines with a more stable genome than that seen in most tumors, and that they used traditional assays of transformation, such as growth in soft agarose and tumorigenicity in immunocompromised mice, as endpoints. Such endpoints are not representative of the tumor growth seen in the clinical setting.

Therefore Bert Vogelstein, M.D., along with collaborators around the world, developed a model they termed the “sequential driver mutation theory”, in which driver mutations accumulate over time, with each mutation “slightly increasing the tumor growth rate through a process that depends on three factors”:

  1. Driver mutation rate
  2. The 0.4% selective growth advantage
  3. Cell division time

This model was based on a combination of experimental data and computer simulations of glioblastoma multiforme and pancreatic adenocarcinoma.  Most tumor models follow Gompertz kinetics, in which tumor growth is initially exponential but eventually levels off over time.
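The Gompertz kinetics mentioned above can be illustrated with a minimal sketch. The parameter values here (initial cell number, plateau size, and rate constant) are hypothetical and chosen only for illustration; they are not taken from the review or from the sequential driver mutation model.

```python
import math

def gompertz(t, n0=1.0, k=1e9, a=0.1):
    """Gompertz tumor size at time t.

    n0: initial cell number, k: plateau size, a: rate constant.
    All three defaults are illustrative assumptions, not fitted values.
    """
    return k * math.exp(math.log(n0 / k) * math.exp(-a * t))

if __name__ == "__main__":
    # Growth is fast at first and then flattens as N approaches k.
    for day in (0, 10, 50, 100, 200):
        print(f"t={day:>3}  N={gompertz(day):.3e}")
```

The curve starts at n0, rises steeply, and then approaches the plateau k, which is the "levels off over time" behavior described in the text.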

This new theory shows, though, that a tumor cell with only one driver mutation can grow only so much before a second driver mutation is required.  Using data from the COSMIC database (Catalogue of Somatic Mutations in Cancer) together with the analysis software CHASM (Cancer-specific High-throughput Annotation of Somatic Mutations), the researchers analyzed 713 mutations sequenced from 14 glioma patients and 562 mutations in nine pancreatic adenocarcinomas, revealing at least 100 tumor suppressor genes and 100 oncogenes altered.  The authors therefore suggested that these may be driver mutations, or at least mutations required for the sustained growth of these tumors.  When applied to data from Dr. Giardiello’s publications on familial adenomatous polyposis (FAP) in the New England Journal of Medicine in 1993 and 2000, the sequential driver mutation model predicted the age distribution of FAP patients, the number and size of polyps, and the polyp growth rate better than previous models.  This surprising number of required driver mutations for full transformation was also verified in a study led by University of Texas Southwestern Medical Center biologist Jerry Shay, Ph.D., who noted “[to] this team’s surprise nearly 45% of all colorectal candidate oncogenes (65 mutations) drove malignant proliferation”[3].

However, some investigators do not believe the model is complex enough to account for other factors involved in oncogenesis, such as epigenetic factors like methylation and acetylation.  In addition, the review discusses host and tissue factors that may complicate the models, such as the location where a tumor develops.  Nevertheless, most of the investigators interviewed for this review agreed that focusing on the long-term progression of the disease may give us clues to other potential druggable targets.

Teams Use Math Models to Sort Drivers From Passengers

A related review from Mike Martin in JNCI [2] describes a statistical method, published in 2009 in Cancer Informatics [4], that distinguishes chromosomal abnormalities that can drive oncogenesis from passenger abnormalities.  Chromosomal abnormalities, such as deletions, additions, and translocations, are common in cancer.  For instance, the well-known Philadelphia chromosome, a translocation between chromosomes 9 and 22 that results in the BCR-ABL tyrosine kinase fusion protein, is the molecular basis of chronic myelogenous leukemia.

In the report, Eytan Domany, Ph.D., from the Weizmann Institute and several colleagues from the University of Lausanne, the University of Haifa and the Broad Institute analyzed chromosomal aberrations in a subset of medulloblastoma that had more gains and losses in chromosomes than had previously been attributed to the disease.  Using a statistical method they termed a “volumetric sieve”, the investigators were able to distinguish driver from passenger aberrations based on three filters:

  • Fraction of patients with the abnormality
  • Length of DNA involved in the aberrant chromosome
  • Abnormality’s copy number

Another method to sort the most “important” chromosomal aberrations from less relevant alterations is termed GISTIC [5], which its website (at the Broad Institute, http://www.broadinstitute.org/software/cprg/?q=node/31) describes as a tool to identify genes targeted by somatic copy-number alterations (SCNAs) that drive cancer growth.  The method allows comparison across multiple tumors, which reduces noise and improves the consistency of the analysis.  It has been used successfully to determine driver aberrations in mesotheliomas and leukemias, and to identify new oncogenes in adenocarcinomas of the lung and squamous cell carcinoma of the esophagus.

Main references for the two Mike Martin articles are as follows:

1.         Martin M: Rewriting the mathematics of tumor growth. Journal of the National Cancer Institute 2011, 103(21):1564-1565.

2.         Martin M: Aberrant chromosomes: teams use math models to sort drivers from passengers. Journal of the National Cancer Institute 2010, 102(6):369-371.

3.         Eskiocak U, Kim SB, Ly P, Roig AI, Biglione S, Komurov K, Cornelius C, Wright WE, White MA, Shay JW: Functional parsing of driver mutations in the colorectal cancer genome reveals numerous suppressors of anchorage-independent growth. Cancer research 2011, 71(13):4359-4365.

4.         Shay T, Lambiv WL, Reiner-Benaim A, Hegi ME, Domany E: Combining chromosomal arm status and significantly aberrant genomic locations reveals new cancer subtypes. Cancer informatics 2009, 7:91-104.

5.         Beroukhim R, Getz G, Nghiemphu L, Barretina J, Hsueh T, Linhart D, Vivanco I, Lee JC, Huang JH, Alexander S et al: Assessing the significance of chromosomal aberrations in cancer: methodology and application to glioma. Proceedings of the National Academy of Sciences of the United States of America 2007, 104(50):20007-20012.

Further posts on CANCER and GENOMICS and Sequencing published on the site include:

The Initiation and Growth of Molecular Biology and Genomics

Inaugural Genomics in Medicine – The Conference Program, 2/11-12/2013, San Francisco, CA

LEADERS in Genome Sequencing of Genetic Mutations for Therapeutic Drug Selection in Cancer Personalized Treatment: Part 2

Paradigm Shift in Human Genomics – Predictive Biomarkers and Personalized Medicine – Part 1

Breast Cancer: Genomic profiling to predict Survival: Combination of Histopathology and Gene Expression Analysis

Computational Genomics Center: New Unification of Computational Technologies at Stanford

GSK for Personalized Medicine using Cancer Drugs needs Alacris systems biology model to determine the in silico effect of the inhibitor in its “virtual clinical trial”

arrayMap: Genomic Feature Mining of Cancer Entities of Copy Number Abnormalities (CNAs) Data

Comprehensive Genomic Characterization of Squamous Cell Lung Cancers

‘Mosaicism’ is Associated with Aging and Chronic Diseases like Cancer: detection of genetic mosaicism could be an early marker for detecting cancer.

http://onlinelibrary.wiley.com/doi/10.1111/j.1755-148X.2011.00905.x/full

http://pharmaceuticalintelligence.com/2013/02/05/winning-over-cancer-progression-new-oncology-drugs-to-suppress-driver-mutations-vs-passengers-mutations/

Additional references:

[1] Michor F, Iwasa Y, and Nowak MA (2004) Dynamics of cancer progression. Nature Reviews Cancer 4, 197-205.

[2] Crespi B and Summers K (2005) Evolutionary biology of cancer. Trends in Ecology and Evolution 20, 545-552.

[3] Merlo LMF, et al. (2006) Cancer as an evolutionary and ecological process. Nature Reviews Cancer 6, 924-935.

[4] McFarland C, et al. “Accumulation of deleterious passenger mutations in cancer,” in preparation.

[5] Birkbak NJ, et al. (2011) Paradoxical relationship between chromosomal instability and survival outcome in cancer. Cancer Research 71, 3447-3452.


From Molecular Biology to Translational Medicine: How Far Have We Come, and Where Does It Lead Us?

The Initiation and Growth of Molecular Biology and Genomics, Part I

Curator: Larry H Bernstein, MD, FCAP

 

Introduction and purpose

This material will cover the initiation phase of molecular biology, Part I; it will be followed by the Human Genome Project, Part II; and it concludes with Ubiquitin, its Role in Signaling and Regulatory Control, Part III.
This article is first a continuation of a previous discussion on the role of genomics in the discovery of therapeutic targets, titled Directions for genomics in personalized medicine http://pharmaceuticalintelligence.com/2013/01/27/directions-for-genomics-in-personalized-medicine/

The previous article focused on key drivers of cellular proliferation, stepwise mutational changes coinciding with cancer progression, and potential therapeutic targets for reversal of the process. It also covered the race to delineate the Human Genome, discovery methods, and fundamental genomic patterns that are ancient in both animal and plant speciation.

This article reviews the web-like connections between early and later discoveries, as significant findings have led to novel hypotheses and many more findings over the last 75 years. This largely post-WWII revolution has driven our understanding of biological and medical processes at an exponential pace, owing to successive discoveries of chemical structure, the basic building blocks of DNA and proteins, nucleotide and protein-protein interactions, protein folding, allostericity, genomic structure, DNA replication, nuclear polyribosome interaction, and metabolic control. In addition, the emergence of methods for copying, removal and insertion, improvements in structural analysis, and developments in applied mathematics have transformed the research framework.

In the Beginning

During the Second World War we had the discoveries of physics and the emergence, out of the Manhattan Project, of radioactive nuclear probes from E. O. Lawrence’s laboratory at the University of California, Berkeley. The use of radioactive isotopes led to the development of biochemistry and the isolation of nucleotides, nucleosides, and enzymes, and to the filling in of details of the pathways for photosynthesis, biosynthesis, and catabolism.
Perhaps a good start of the journey is a student of Niels Bohr named Max Delbrück (September 4, 1906 – March 9, 1981), who won the Nobel Prize for discovering that bacteria become resistant to viruses (phages) as a result of genetic mutations. He founded a new discipline called molecular biology, lifting the experimental work in physiology to systematic experimentation in biology with the rigor of physics, using radiation and virus probes on selected cells. In 1937 he turned to research on the genetics of Drosophila melanogaster at Caltech, and two years later he coauthored a paper, “The growth of bacteriophage”, reporting that the viruses replicate in one step, not exponentially. In 1942, he and Salvador Luria of Indiana University demonstrated that bacterial resistance to virus infection is mediated by random mutation. This research, known as the Luria-Delbrück experiment, notably applied mathematics to make quantitative predictions, and earned them the 1969 Nobel Prize in Physiology or Medicine, shared with Alfred Hershey. His inferences on genes’ susceptibility to mutation were relied on by physicist Erwin Schrödinger in his 1944 book, What Is Life?, which conjectured that genes were an “aperiodic crystal” storing code-script and which influenced Francis Crick and James D. Watson in their 1953 identification of cellular DNA’s molecular structure as a double helix.
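The quantitative prediction behind the Luria-Delbrück experiment can be sketched with a small simulation. If resistance arises by random mutation during growth, a mutation that happens early leaves a large clone of resistant cells, so the number of resistant cells varies far more between parallel cultures than a Poisson process would allow (variance much greater than the mean). The growth scheme, mutation rate, culture count, and function names below are illustrative assumptions, not the parameters of the original experiment.

```python
import random
from statistics import mean, pvariance

def mutants_in_culture(generations, mu, rng):
    """Grow one culture from a single sensitive cell.

    At each division every daughter of a sensitive cell becomes resistant
    with probability mu; resistant cells always breed true.
    """
    sensitive, resistant = 1, 0
    for _ in range(generations):
        daughters = 2 * sensitive
        new_mutants = sum(rng.random() < mu for _ in range(daughters))
        sensitive = daughters - new_mutants
        resistant = 2 * resistant + new_mutants
    return resistant

def fano_factor(counts):
    """Variance-to-mean ratio; about 1 for Poisson-distributed counts."""
    m = mean(counts)
    return pvariance(counts) / m if m > 0 else float("nan")

def simulate(cultures=100, generations=12, mu=1e-3, seed=0):
    """Return the resistant-cell count of each parallel culture."""
    rng = random.Random(seed)
    return [mutants_in_culture(generations, mu, rng) for _ in range(cultures)]

if __name__ == "__main__":
    counts = simulate()
    print("mean resistant cells per culture:", mean(counts))
    print("variance / mean:", fano_factor(counts))
```

Under the alternative hypothesis, in which resistance is induced only on exposure to the virus, each cell would have the same small chance of becoming resistant and the ratio printed on the last line would stay close to 1.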

Watson-Crick Double Helix Model

A new understanding of heredity and hereditary disease was possible once it was determined that DNA consists of two chains of alternating phosphate and sugar groups twisted around each other in a double helix, and that the two chains are held together by hydrogen bonds between pairs of organic bases: adenine (A) with thymine (T), and guanine (G) with cytosine (C). Modern biotechnology also has its basis in the structural knowledge of DNA, in this case the scientist’s ability to modify the DNA of host cells that will then produce a desired product, for example, insulin.
The background for the work of the four scientists was formed by several scientific breakthroughs:

  1. the progress made by X-ray crystallographers in studying organic macromolecules;
  2. the growing evidence supplied by geneticists that it was DNA, not protein, in chromosomes that was responsible for heredity;
  3. Erwin Chargaff’s experimental finding that there are equal numbers of A and T bases and of G and C bases in DNA;
  4. and Linus Pauling’s discovery that the molecules of some proteins have helical shapes.
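Chargaff’s finding in item 3 follows directly from the A-T and G-C pairing of the double helix, as a minimal sketch can show. The example sequence and the function names are arbitrary and hypothetical, chosen only for illustration.

```python
from collections import Counter

# Watson-Crick pairing: A with T, G with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement_strand(strand):
    """Return the base-paired partner strand, read in the opposite direction."""
    return "".join(PAIR[base] for base in reversed(strand))

def duplex_base_counts(strand):
    """Count bases over both strands of the double helix."""
    return Counter(strand + complement_strand(strand))

if __name__ == "__main__":
    counts = duplex_base_counts("ATGCGGATTA")  # arbitrary example sequence
    # Every A on one strand faces a T on the other, and every G faces a C,
    # so the duplex totals must satisfy A == T and G == C.
    print(counts["A"] == counts["T"], counts["G"] == counts["C"])
```

A single strand on its own need not show equal amounts of A and T; the equality appears only when both paired strands are counted together.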

In 1962 James Watson (b. 1928), Francis Crick (1916–2004), and Maurice Wilkins (1916–2004) jointly received the Nobel Prize in physiology or medicine for their 1953 determination of the structure of deoxyribonucleic acid (DNA), performed with a knowledge of Chargaff’s ratios of the bases in DNA and having  access to the X-ray crystallography of Maurice Wilkins and Rosalind Franklin at King’s College London. Because the Nobel Prize can be awarded only to the living, Wilkins’s colleague Rosalind Franklin (1920–1958), who died of cancer at the age of 37, could not be honored.
Of the four DNA researchers, only Rosalind Franklin had any degrees in chemistry. Franklin completed her degree in 1941 in the middle of World War II and undertook graduate work at Cambridge with Ronald Norrish, a future Nobel Prize winner. Returning to Cambridge after a year of war service, she presented her work and received a PhD in physical chemistry. Franklin then learned X-ray crystallography in Paris and rapidly became a respected authority in this field. When she returned to England, to King’s College London, in 1951, her charge was to upgrade the X-ray crystallographic laboratory there for work with DNA.

Rosalind Franklin, crystallographer

Cold Spring Harbor Laboratory

I digress to the beginnings of the Cold Spring Harbor Laboratory. A significant part of the Laboratory’s life revolved around education with its three-week-long Phage Course, taught first in 1945 by Max Delbrück, the German-born theoretical-physicist-turned-biologist. James D. Watson first came to Cold Spring Harbor Laboratory with his thesis advisor, Salvador Luria, in the summer of 1948. Over its more than 25-year history, the Phage Course was the training ground for many notable scientists. The Laboratory’s annual scientific Symposium has provided a unique, highly interactive education about the exciting field of “molecular” biology. The 1953 symposium featured Watson coming from England to give the first public presentation of the DNA double helix. When he became the Laboratory’s director in 1968 he was determined to make the Laboratory an important center for advancing molecular biology, and he focused his energy on bringing large donations to the enterprise (CSHL). It became a magnet for future discovery, and Watson later became its Chancellor. This contribution is as important as his Nobel Prize discovery.

Biochemistry and Molecular Probes Come into View

Moreover, at the same time, the experience of Nathan Kaplan and Martin Kamen at Berkeley working with radioactive probes was the beginning of the establishment of the Lawrence Livermore Laboratory’s role in metabolic studies, as reported in the previous paper. A collaboration between Sidney Colowick, N. O. Kaplan and Elizabeth Neufeld at the McCollum Pratt Institute led to the transferase reaction between the two main pyridine nucleotides.  Neufeld received a PhD a few years later from the University of California, Berkeley, under William Zev Hassid for research on nucleotides and complex carbohydrates, and did postdoctoral studies on non-protein sulfhydryl compounds in mitosis. Her later work at the NIAMDG on the mucopolysaccharidoses, a group of lysosomal storage diseases, opened a new chapter on human genetic diseases when she found that the defects in Hurler and Hunter syndromes were due to decreased degradation of the mucopolysaccharides. When an assay became available for α-L-iduronidase in 1972, Neufeld was able to show that the corrective factor for Hurler syndrome that accelerates degradation of stored sulfated mucopolysaccharides was α-L-iduronidase.

______________________________________________________

The Hurler Corrective Factor. Purification and Some Properties (Barton, R. W., and Neufeld, E. F. (1971) J. Biol. Chem. 246, 7773–7779)
The Sanfilippo A Corrective Factor. Purification and Mode of Action (Kresse, H., and Neufeld, E. F. (1972) J. Biol. Chem. 247, 2164–2170)
_______________________________________________________

I mention this for two reasons:
[1] We see a huge impetus for nucleic acids and nucleotides research growing in the 1950’s with a post WWII emergence of work on biological structure.
[2] At the same time, the importance of enzymes in cellular metabolic processes runs parallel to that of the genetic code.

In 1959 Arthur Kornberg received the Nobel Prize in Physiology or Medicine, together with Dr. Severo Ochoa of New York University, based on his discovery of “the mechanisms in the biological synthesis of deoxyribonucleic acid” (DNA polymerase). In the next 20 years the Stanford University Department of Biochemistry became a top-rated graduate program in biochemistry. Today, the Pfeffer Lab is distinguished for research into how human cells put receptors in the right place through Rab GTPases that regulate all aspects of receptor trafficking. Steve Elledge (1984-1989), now at Harvard University, is one of its graduates from the 1980s.

Transcription – RNA and the ribosome

In 2006, Roger Kornberg was awarded the Nobel Prize in Chemistry for identifying the role of RNA polymerase II and other proteins in transcribing DNA. He says that the process is something akin to a machine: “It has moving parts which function in synchrony, in appropriate sequence and in synchrony with one another”. The Kornbergs were the tenth family with closely related Nobel laureates.  The 2009 Nobel Prize in Chemistry was awarded to Venki Ramakrishnan, Tom Steitz, and Ada Yonath for crystallographic studies of the ribosome. The atomic resolution structures of the ribosomal subunits provide an extraordinary context for understanding one of the most fundamental aspects of cellular function: protein synthesis. Research on protein synthesis began with studies of microsomes, and three papers on the atomic resolution structures of the 50S and 30S ribosomal subunits were published in 2000. Perhaps the most remarkable and inexplicable feature of ribosome structure is that two-thirds of the mass is composed of large RNA molecules, the 5S, 16S, and 23S ribosomal RNAs, and the remaining third is distributed among ~50 relatively small and innocuous proteins. The first step on the road to solving the ribosome structure was determining the primary structure of the 16S and 23S RNAs in Harry Noller’s laboratory. The sequences were rapidly followed by secondary structure models for the folding of the two ribosomal RNAs, in collaboration with Carl Woese, bringing the ribosome structure into two dimensions. The RNA secondary structures are characterized by an elaborate series of helices and loops of unknown structure, but other than the insights offered by the structure of transfer RNA (tRNA), there was no way to think about folding these structures into three dimensions. The first three-dimensional images of the ribosome emerged from Jim Lake’s reconstructions from electron microscopy (EM) (Lake, 1976).

Ada Yonath reported the first crystals of the 50S ribosomal subunit in 1980, a crucial step that would require almost 20 years to bring to fruition (Yonath et al., 1980). Yonath’s group introduced the innovative use of ribosomes from extremophilic organisms. At the same time, Peter Moore and Don Engelman applied neutron scattering techniques to determine the relative positions of ribosomal proteins in the 30S ribosomal subunit, an effort to which Venki Ramakrishnan, in Peter Moore’s laboratory, contributed with deuterated ribosome reconstitutions. Elegant chemical footprinting studies from the Noller laboratory provided a basis for intertwining the RNA among the ribosomal proteins, but there was still insufficient information to produce a high-resolution structure. Then the Yale group was ramping up its work on the H. marismortui crystals of the 50S subunit. Peter Moore had recruited long-time colleague Tom Steitz to work on this problem, and Steitz was about to complete the final event in the pentathlon of Crick’s dogma, having solved critical structures of DNA polymerases, the glutaminyl tRNA-tRNA synthetase complex, HIV reverse transcriptase, and T7 RNA polymerase. In 1999 Steitz, Ramakrishnan, and Yonath all presented electron density maps of subunits at approximately 5 Å resolution, and the Noller group presented 10 Å electron density maps of the Thermus 70S ribosome. Peter Moore aptly paraphrased Churchill, telling attendees that this was not the end, but the end of the beginning. Almost every nucleotide in the RNA is involved in multiple stabilizing interactions that form the monolithic tertiary structure at the heart of the ribosome.
Williamson J. The ribosome at atomic resolution. Cell 2009; 139:1041-1043.    http://dx.doi.org/10.1016/j.cell.2009.11.028/      http://www.sciencedirect.com/science/article/pii/S0092867409014536

This opened the door to new therapies.  For example, in 2010 it was reported that numerous human genes display dual coding within alternatively spliced regions, which give rise to distinct protein products that include segments translated in more than one reading frame. To resolve the ensuing protein structural puzzle, the authors identified human genes with alternative splice variants comprising a dual coding region at least 75 nucleotides in length and analyzed the structural status of the protein segments they encode. Inspection of the amino acid composition of these segments and predictions by the IUPred and PONDR® VSL2 algorithms suggest a high propensity for structural disorder in dual-coding regions.
Kovacs E, Tompa P, Liliom K, and Kalmar L. Dual coding in alternative reading frames correlates with intrinsic protein disorder. PNAS 2010.   http://www.jstor.org/stable/25664997   http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2851785
http://www.pnas.org/content/107/12/5429.full.pdf

 

In 2012, it was shown that drug-bound ribosomes can synthesize a distinct subset of cellular polypeptides. The structure of a protein defines its ability to thread through the antibiotic-obstructed tunnel. Synthesis of certain polypeptides that initially bypass translational arrest can be stopped at later stages of elongation while translation of some proteins goes to completion. (Kannan K, Vasquez-Laslop N, and Mankin AS. Selective Protein Synthesis by Ribosomes with a Drug-Obstructed Exit Tunnel. Cell 2012; 151; 508-520.) http://dx.doi.org/10.1016/j.cell.2012.09.018     http://www.sciencedirect.com/science/article/pii/S0092867412011257

Mobility of genetic elements

Barbara McClintock received the Nobel Prize for Medicine for the discovery of the mobility of genetic elements, work that had been done in that period. When transposons were demonstrated in bacteria, yeast and other organisms, Barbara rose to a stratospheric level in the general esteem of the scientific world, but she was uncomfortable about the honors. It was sufficient to have her work understood and acknowledged. Prof. Howard Green said of her, “There are scientists whose discoveries greatly transcend their personalities and their humanity. But those in the future who will know of Barbara only her discoveries will know only her shadow”.
“In Memoriam – Barbara McClintock”. Nobelprize.org. 5 Feb 2013   http://www.nobelprize.org/nobel_prizes/medicine/laureates/1983/mcclintock-article.html/

She introduced her Nobel Lecture in 1983 with the following observation: “An experiment conducted in the mid-nineteen forties prepared me to expect unusual responses of a genome to challenges for which the genome is unprepared to meet in an orderly, programmed manner. In most known instances of this kind, the types of response were not predictable in advance of initial observations of them. It was necessary to subject the genome repeatedly to the same challenge in order to observe and appreciate the nature of the changes it induces…a highly programmed sequence of events within the cell that serves to cushion the effects of the shock. Some sensing mechanism must be present in these instances to alert the cell to imminent danger, and to set in motion the orderly sequence of events that will mitigate this danger”. She goes on to consider “early studies that revealed programmed responses to threats that are initiated within the genome itself, as well as others similarly initiated, that lead to new and irreversible genomic modifications. These latter responses, now known to occur in many organisms, are significant for appreciating how a genome may reorganize itself when faced with a difficulty for which it is unprepared”.

An experiment with Zea conducted in the summer of 1944 alerted her to the mobility of specific components of genomes. It involved the entrance of a newly ruptured end of a chromosome into a telophase nucleus. This experiment commenced with the growing of approximately 450 plants in the summer of 1944, each of which had started its development with a zygote that had received from each parent a chromosome with a newly ruptured end of one of its arms. The design of the experiment required that each plant be self-pollinated in order to isolate from the self-pollinated progeny new mutants that were expected to appear, and to confine them to locations within the ruptured arm of a chromosome. Each mutant was expected to reveal the phenotype produced by a minute homozygous deficiency. Their modes of origin could be projected from the known behavior of broken ends of chromosomes in successive mitoses. Forty kernels from each self-pollinated ear were sown in a seedling bench in the greenhouse during the winter of 1944-45.

Some seedling mutants of the type expected were overshadowed by segregants exhibiting bizarre phenotypes. These were variegated for type and degree of expression of a gene. Those variegated expressions given by genes associated with chlorophyll development were startlingly conspicuous. Within any one progeny, chlorophyll intensities, and their pattern of distribution in the seedling leaves, were alike. Between progenies, however, both the type and the pattern differed widely.

The effect of X-rays on chromosomes

Initial studies of broken ends of chromosomes began in the summer of 1931. By then, a means of studying the “beads on a string” hypothesis had been provided by newly developed methods of examining the ten chromosomes of the maize complement in microsporocytes in meiosis. The ten bivalent chromosomes are elongated in comparison to their metaphase lengths. Each chromosome

  • is identifiable by its relative length,
  • by the location of its centromere, which is readily observed at the pachytene stage, and
  • by the individuality of the chromomeres strung along the length of each chromosome.

At that time maize provided the best material for locating known genes along a chromosome arm, and also for precisely determining the break points in chromosomes that had undergone various types of rearrangement, such as translocations, inversions, etc.
The recessive phenotypes in the examined plants arose from loss of a segment of a chromosome that carried the wild-type allele, and X-rays were responsible for inducing these deficiencies. A conclusion of basic significance could be drawn from these observations:

  1. broken ends of chromosomes will fuse, 2-by-2, and
  2. any broken end will fuse with any other broken end.

This principle has been amply proved in a series of experiments conducted over the years. In all such instances the break must sever both strands of the DNA double helix. This is a “double-strand break” in modern terminology. That two such broken ends entering a telophase nucleus will find each other and fuse, regardless of the initial distance that separates them, soon became apparent.

During the summer of 1931 she had seen plants in the maize field that showed variegation patterns resembling the one described for Nicotiana.  Dr. McClintock was interested in selecting the variegated plants to determine the presence of a ring chromosome in each, and in the summer of 1932, with Dr. Stadler’s generous cooperation from Missouri, she had the opportunity to examine such plants. Each plant had a ring chromosome, but it was the behavior of this ring that proved to be significant. It revealed several basic phenomena. The following was noted:

In the majority of mitoses

  • replication of the ring chromosome produced two chromatids completely free from each other
  • that could separate without difficulty in the following anaphase.
  • Sister strand exchanges do occur between replicated or replicating chromatids.
  • The frequency of such events increases with increase in the size of the ring.
  • These exchanges produce a double-size ring with two centromeres.
  • Mechanical rupture occurs in each of the two chromatid bridges formed at anaphase by passage of the two centromeres on the double-size ring to opposite poles of the mitotic spindle.
  • The location of a break can be at any one position along any one bridge.
  • The broken ends entering a telophase nucleus then fuse.
  • The size and content of each newly constructed ring depend on the position of the rupture that had occurred in each bridge.

The conclusion was that

  1. cells sense the presence in their nuclei of ruptured ends of chromosomes,
  2. then activate a mechanism that will bring together and then unite these ends, and
  3. this will occur regardless of the initial distance in a telophase nucleus that separated the ruptured ends.

The ability of a cell to

  • sense these broken ends,
  • to direct them toward each other, and
  • then to unite them so that the union of the two DNA strands is correctly oriented,
  • is a particularly revealing example of the sensitivity of cells to all that is going on within them.

The evidence gave unequivocal support for the conclusion that broken ends will find each other and fuse. The challenge is met by a programmed response. This may be necessary, as

  1. both accidental breaks and
  2. programmed breaks may be frequent.
  3. If not repaired, such breaks could lead to genomic deficiencies having serious consequences.

A cell capable of repairing a ruptured end of a chromosome must sense the presence of this end in its nucleus. This sensing

  • activates a mechanism that is required for replacing the ruptured end with a functional telomere.
  • that such a mechanism must exist was revealed by a mutant that arose in the stocks.
  • this mutant would not allow the repair mechanism to operate in the cells of the plant.

Entrance of a newly ruptured end of a chromosome into the zygote is followed by the chromatid type of breakage-fusion-bridge cycle throughout mitoses in the developing plant.
This suggested that the repair mechanism in the maize strains is repressed in cells producing

  • the male and female gametophytes and
  • also in the endosperm,
  • but is activated in the embryo.

The extent of trauma perceived by cells

  • whose nuclei receive a single newly ruptured end of a chromosome that the cell cannot repair,
  • and the speed with which this trauma is registered, were not appreciated until the winter of 1944-45.

By 1947 it was learned that the bizarre variegated phenotypes that segregated in many of the self-pollinated progenies grown on the seedling bench in the fall and winter of 1944-45, were due to the action of transposable elements. It seemed clear that

  • these elements must have been present in the genome,
  • and in a silent state previous to an event that activated one or another of them.

She concluded that some traumatic event was responsible for these activations. The unique event in the history of these plants relates to their origin. Both parents of the plants grown in 1944 had contributed a chromosome with a newly ruptured end to the zygote that gave rise to each of these plants.
Detection of silent elements is now made possible with the aid of DNA cloning methods. Silent Ac (Activator) elements, as well as modified derivatives of them, have already been detected in several strains of maize. When other transposable elements are cloned it will be possible to compare their structural and numerical differences among various strains of maize. In any one strain of maize the number of silent but potentially transposable elements, as well as other repetitious DNAs, may be observed to change, most probably in response to challenges not yet recognized.
Telomeres are especially adapted to replicate free ends of chromosomes. When no telomere is present, attempts to replicate this uncapped end may be responsible for the apparent “fusions” of the replicated chromatids at the position of the previous break as well as for perpetuating the chromatid type of breakage-fusion-bridge cycle in successive mitoses.
In conclusion, a genome may react to conditions for which it is unprepared, but to which it responds in a totally unexpected manner. Among these is

  • the extraordinary response of the maize genome to entrance of a single ruptured end of a chromosome into a telophase nucleus.
  • It was this event that was responsible for activations of potentially transposable elements that are carried in a silent state in the maize genome.
  • The mobility of these activated elements allows them to enter different gene loci and to take over control of action of the gene wherever one may enter.

Because the broken end of a chromosome entering a telophase nucleus can initiate activations of a number of different potentially transposable elements,

  • the modifications these elements induce in the genome may be explored readily.

In addition to modifying gene action, these elements can

  • restructure the genome at various levels,
  • from small changes involving a few nucleotides,
  • to gross modifications involving large segments of chromosomes, such as
  1. duplications,
  2. deficiencies,
  3. inversions,
  4. and other reorganizations.

In the future attention undoubtedly will be centered on the genome, and with greater appreciation of its significance as a highly sensitive organ of the cell,

  • monitoring genomic activities and correcting common errors,
  • sensing the unusual and unexpected events,
  • and responding to them,
  • often by restructuring the genome.

We know about the elements available for such restructuring. We know nothing, however, about

  • how the cell senses danger and instigates responses to it that often are truly remarkable.

Source: 1983 Nobel Lecture. Barbara McClintock. THE SIGNIFICANCE OF RESPONSES OF THE GENOME TO CHALLENGE.

In 2009 the Nobel Prize in Physiology or Medicine was awarded to Elizabeth Blackburn, Carol Greider and Jack Szostak for the discovery of telomerase. This recognition came less than a decade after the completion of the Human Genome Project previously discussed. Prof. Blackburn acknowledges a strong influence coming from the work of Barbara McClintock. The discovery is tied to the pond organism Tetrahymena thermophila, and to studies of yeast cells. As a child, Blackburn was drawn to science after reading the biography of Marie Curie written by Curie’s daughter Eve. She recalls that her Master’s mentor, with whom she studied the metabolism of glutamine in the rat liver, thought that every experiment should have the beauty and simplicity of a Mozart sonata. She did her PhD at the distinguished Laboratory of Molecular Biology at Cambridge, the epicenter of molecular biology, sequencing regions of bacteriophage phiX174, a single-stranded DNA bacteriophage. Using Fred Sanger’s methods to piece together RNA sequences, she showed the first sequence of a 48-nucleotide fragment to her mathematically gifted Cambridge cousin, who pointed out repeats of DNA sequence patterns! She worked on the sequencing of the DNA at the terminal regions of the short “minichromosomes” of the ciliated protozoan Tetrahymena thermophila at Yale in 1975. She continued the research begun at Yale at UCSF, funded by the NIH on the basis of an intriguing autoradiogram showing telomeric DNA in Tetrahymena.
I describe the work as follows:

  • Prof. Blackburn incorporated 32P-labeled deoxynucleoside residues into the rDNA molecules by DNA repair enzymatic reactions and found that
  • the end regions were selectively labeled by combinations of 32P-radiolabeled nucleoside triphosphates, and by mid-year she had an autoradiogram of the depurination products.
  • The autoradiogram showed sequences of 4 cytosine residues flanked by either an adenosine or a guanosine residue.
  • By 1976 she had deduced a sequence consisting of a tandem array of CCCCAA repeats, and subsequently separated the products by denaturing gel electrophoresis, where they appeared as tiger stripes extending up the gel.
  • The size of each band was 6 bases more than the band below it.
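The link between a tandem repeat and the evenly spaced "tiger stripe" bands can be illustrated with a short Python sketch. It assumes the six-base unit is CCCCAA (four cytosines followed by two adenines, which matches the six-base band spacing described above); the function names and the `offset` parameter are hypothetical and are not taken from Blackburn's work.

```python
REPEAT_UNIT = "CCCCAA"  # assumed six-base telomeric unit (C-rich strand)


def tandem_array(n_repeats, unit=REPEAT_UNIT):
    """Return a sequence made of n_repeats head-to-tail copies of unit."""
    return unit * n_repeats


def ladder_sizes(max_repeats, unit=REPEAT_UNIT, offset=0):
    """Fragment lengths expected for 1..max_repeats copies of the unit.

    offset stands for any non-repeat bases attached to every fragment
    (a hypothetical parameter; 0 means pure repeat).
    """
    return [offset + len(unit) * k for k in range(1, max_repeats + 1)]


def find_period(seq, max_period=12):
    """Smallest shift p at which the sequence repeats itself, or None."""
    for p in range(1, max_period + 1):
        if all(seq[i] == seq[i - p] for i in range(p, len(seq))):
            return p
    return None


if __name__ == "__main__":
    print(tandem_array(3))                 # CCCCAACCCCAACCCCAA
    print(ladder_sizes(4))                 # [6, 12, 18, 24]
    print(find_period(tandem_array(5)))    # 6
```

Each successive band differs from the one below it by `len(unit)` bases, so a six-base spacing on the gel is what a six-base repeat unit predicts.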

Telomere must have a telomerase!

The discovery of the telomerase enzyme activity was made by the Prize co-awardee, Carol Greider. They were trying to decipher the structure right at the termini of telomeres of both ciliated protozoans and yeast plasmids. The view that in mammalian telomeres there is a long protruding G-rich strand does not take into account the clear evidence for the short C strand repeat oligonucleotides that she discovered. This was found for both the Tetrahymena rDNA minichromosome molecules and linear plasmids purified from yeast.
In contrast to nucleosomal regions of chromosomes, special regions of DNA, for example

  • promoters that must bind transcription initiation factors that control transcription, have proteins other than the histones on them.
  • The telomeric repeat tract turned out to be such a non-nucleosomal region.

They found that when chromatin was clipped with an enzyme that cuts the linker between neighboring nucleosomes,

  • the bulk of the DNA was cut into nucleosome-sized pieces,
  • but the telomeric DNA tract was left as a single protected chunk.

The resulting complex of the telomeric DNA tract plus its bound cargo of protective proteins behaved very differently from nucleosomal chromatin, and they concluded that it had no histones or nucleosomes.

Any evidence for a protein on the bulk of the rDNA molecule ends, such as their behavior in gel electrophoresis and the appearance of the rDNA molecules under the electron microscope, was conspicuously lacking. This was reassuring evidence that there was no covalently attached protein at the very ends of this minichromosome. Despite considerable work, she was unable to determine what protein(s) would co-purify with the telomeric repeat tract DNA of Tetrahymena. It was yeast genetics and approaches taken by others that turned out to provide the next great leaps forward in understanding telomeric proteins. Carol Greider, her colleague, noticed the need to scale up the telomerase activity preparations, and they used a very large glass column for preparative gel filtration chromatography.

Jack W. Szostak at the Howard Hughes Medical Institute at Harvard shared in the 2009 Nobel Prize. He became interested in molecular biology while taking a course on the frontiers of molecular biology and reading about the Meselson-Stahl experiments of barely a decade earlier, and learned how the genetic code had been unraveled. The fact that one could deduce, from measurements of the radioactivity in fractions from a centrifuge tube, the molecular details of DNA replication, transcription and translation was astonishing. A highlight of his time at McGill was the open-book, open-discussion final exam in this class, in which the questions required the intense collaboration of groups of students.

At Cornell, Ithaca, he collaborated with John Stiles, and they came up with a specific idea: to chemically synthesize a DNA oligonucleotide of sufficient length that it would hybridize to a single sequence within the yeast genome, and then to use it as an mRNA- and gene-specific probe. At the time, there was only one short segment of the yeast genome for which the DNA sequence was known,

  • the region coding for the N-terminus of the iso-1 cytochrome c protein,

intensively studied by Fred Sherman.

The Sherman lab, in a tour de force of genetics and protein chemistry, had isolated

  • double-frameshift mutants in which the N-terminal region of the protein was translated from out-of-frame codons.
  • Protein sequencing of the wild type and frame-shifted mutants allowed them to deduce 44 nucleotides of DNA sequence.

If they could prepare a synthetic oligonucleotide that was complementary to the coding sequence, they could use it to detect the cytochrome c mRNA and gene. At the time, essentially all experiments on mRNA were done on total cellular mRNA. Ray Wu was already well known for determining the sequence of the sticky ends of phage lambda, the first DNA ever to be sequenced, and his lab was deeply involved in the study of enzymes that could be used to manipulate and sequence DNA more effectively, but he would not take on a project from another laboratory. So John went to nearby Rochester to do postdoctoral work with Sherman, and Szostak was able to transfer to Ray Wu’s laboratory. To carry out this work, Ray Wu sent him to Saran Narang’s lab in Ottawa, where he received training under Keiichi Itakura, who synthesized the insulin gene. A few months later, he received several milligrams of the long-sought 15-mer. In collaboration with John Stiles and Fred Sherman, who sent RNA and DNA samples from appropriate yeast strains, they were able to use the labeled 15-mer as a probe to detect the cyc1 mRNA, and later the gene itself. He notes that one of the delights of the world of science is that it is filled with people of good will who are more than happy to assist a student or colleague by teaching a technique or discussing a problem. He remained in Ray’s lab after completion of the PhD. The arrival of Rodney Rothstein from Sherman’s lab in Rochester introduced him to yeast genetics and prepared him for the next decade of work on yeast,

  • first in recombination studies, and
  • later in telomere studies and other aspects of yeast biology.

His studies of recombination in yeast were enabled by the discovery, in Gerry Fink’s lab at Cornell, of a way to introduce foreign DNA into yeast. These pioneering studies of yeast transformation showed that circular plasmid DNA molecules could on occasion become integrated into yeast chromosomal DNA by homologous recombination.

  • His studies of unequal sister chromatid exchange in rDNA locus resulted in his first publication in the field of recombination.

The idea that you could increase transformation frequency by cutting the input DNA was pleasingly counterintuitive and led them to continue their exploration of this phenomenon. He gained an appointment to the Sidney Farber Cancer Institute due to the interest of Prof. Ruth Sager, who gathered together a great group of young investigators. In work spearheaded by his first graduate student, Terry Orr-Weaver, on

  • double-strand breaks in DNA
  • and their repair by recombination (and continuing interaction with Rod Rothstein),
  • they were attracted to what kinds of reactions occur at the DNA ends.

It was at a Gordon Conference that he was excited hearing a talk by Elizabeth Blackburn on her work on telomeres in Tetrahymena.

  • This led to a collaboration testing the ability of Tetrahymena telomeres to function in yeast.
  • He performed the experiments himself, and experienced the thrill of being the first to know that their wild idea had worked.
  • It was clear from that point on that a door had been opened and that they were going to be able to learn a lot about telomere function from studies in yeast.
  • Within a short time he was able to clone bona fide yeast telomeres, and (in a continuation of the collaboration with Liz Blackburn’s lab)
  • they obtained the critical sequence information that led (them) to propose the existence of the key enzyme, telomerase.

A fanciful depiction evoking both telomere dynamics and telomere researchers, done by the artist Julie Newdoll in 2008, elicits the idea of a telomere as an ancient Sumerian temple-like hive, tended by a swarm of ancient Sumerian bee-goddesses against a background of clay tablets inscribed with DNA sequencing gel-like bands.
Dr. Blackburn recalls owing much to Barbara McClintock, not only for her scientific findings but also for advice McClintock gave her in a conversation in 1977, at a time when

  • she had unexpected findings with the rDNA end sequences.
  • Dr. McClintock urged her to trust in intuition about the scientific research results.

This advice was surprising then because intuitive thinking was not something that she accepted to be a valid aspect of being a biology researcher.
MLA style: “Elizabeth H. Blackburn – Biographical”. Nobelprize.org. 5 Feb 2013. http://www.nobelprize.org/nobel_prizes/medicine/laureates/2009/blackburn.html

Summary:

In this Part I of a series of 3, I have described the

  • emergence of Molecular Biology and
  • closely allied work on the mechanism of Cell Replication and
  • the dependence of metabolic processes on proteins and enzymatic conversions through a surge of
  • post-WWII research that gave birth to centers for basic science research in biology and medicine in both the US and England, which was preceded by work in prewar Germany. This is to be followed by further developments related to the Human Genome Project.

 

Transcription initiation (Photo credit: Wikipedia)

Schematic relationship between biochemistry, genetics, and molecular biology (Photo credit: Wikipedia)

Central dogma of molecular biology (Photo credit: Wikipedia)

 

 

 



Curator: Aviva Lev-Ari, PhD, RN

Chaperone Protein Mechanism inspired MIT Team to Model the Role of Genetic Mutations on Cancer Progression, proposing the next generation of Oncology drugs to aim at Suppression of Passenger Mutations. Current drugs in clinical trials use the Chaperone Protein Mechanism to suppress Driver Mutations.

Deleterious Mutations in Cancer Progression

Kirill S. Korolev1, Christopher McFarland2, and Leonid A. Mirny3

1Department of Physics, MIT, Cambridge, MA.

E-mail: papers.korolev@gmail.com

2Graduate Program in Biophysics, Harvard University, Cambridge, MA.

3Health Sciences and Technology, MIT, Cambridge, MA

The research was funded by the National Institutes of Health/National Cancer Institute Physical Sciences Oncology Center at MIT.

SOURCE:

http://cnls.lanl.gov/q-bio/wiki/images/4/40/Abstract.pdf

Deleterious passenger mutations significantly affect evolutionary dynamics of cancer. Including passenger mutations in evolutionary models is necessary to understand the role of genetic diversity in cancer progression and to create new treatments based on the accumulation of deleterious passenger mutations.

Evolutionary models of cancer almost exclusively focus on the acquisition of driver mutations, which are beneficial to cancer cells. The driver mutations, however, are only a small fraction of the mutations found in tumors. The other mutations, called passenger mutations, are typically neglected because their effect on fitness is assumed to be very small. Recently, it has been suggested that some passenger mutations are slightly deleterious. We find that deleterious passengers significantly affect cancer progression. In particular, they lead to a critical tumor size, below which tumors shrink on average, and to an optimal mutation rate for cancer evolution.

Cancer is an outcome of somatic evolution [1-3]. To outcompete their benign sisters, cancer cells need to acquire many heritable changes (driver mutations) that enable proliferation. In addition to the rare beneficial drivers, cancer cells must also acquire neutral or slightly deleterious passenger mutations [4]. Indeed, the number of possible passengers exceeds the number of possible drivers by orders of magnitude. Surprisingly, the effect of passenger mutations on cancer progression has not been explored. To address this problem, we developed an evolutionary model of cancer progression, which includes both drivers and passengers. This model was analyzed both numerically and analytically to understand how mutation rate, population size, and fitness effects of mutations affect cancer progression.

RESULTS

Upon including passengers in our model, we found that cancer is no longer a straightforward progression to malignancy. In particular, there is a critical population size such that smaller populations accumulate passengers and decline, while larger populations accumulate drivers and grow. The transition to cancer for small initial populations is, therefore, stochastic in nature and is similar to diffusion over an energy barrier in chemical kinetics. We also found that there is an optimal mutation rate for cancer development, and passengers with intermediate fitness costs are most detrimental to cancer. The existence of an optimal mutation rate could explain recent clinical data [5] and is in stark contrast to the predictions of the models neglecting passengers. We also show that our theory is consistent with recent sequencing data.
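The idea of a critical population size can be made concrete with a back-of-the-envelope sketch in Python. This is not the authors' model; it is a mean-field caricature under explicitly assumed simplifications: drivers arise at a rate proportional to population size and fix with probability of order their benefit, while passengers are treated as nearly neutral and so accumulate at roughly the mutation rate regardless of population size. All names and parameter values below (`mu`, `t_d`, `t_p`, `s_d`, `s_p`) are hypothetical and chosen only for illustration.

```python
def fitness_drift(n, mu, t_d, t_p, s_d, s_p):
    """Net expected fitness change per generation (heuristic).

    n    : population size
    mu   : mutation rate per site per division (assumed value)
    t_d  : number of possible driver sites (assumed)
    t_p  : number of possible passenger sites (assumed)
    s_d  : fitness benefit of one driver
    s_p  : fitness cost of one passenger
    """
    # Drivers: arrival rate mu * t_d * n, fixation probability ~ s_d,
    # each fixation adding a benefit s_d.
    driver_gain = mu * t_d * n * s_d * s_d
    # Passengers: treated as nearly neutral, so they fix at about the
    # mutation rate mu * t_p, each costing s_p.
    passenger_loss = mu * t_p * s_p
    return driver_gain - passenger_loss


def critical_size(t_d, t_p, s_d, s_p):
    """Population size at which driver gain balances passenger loss."""
    return t_p * s_p / (t_d * s_d ** 2)


if __name__ == "__main__":
    params = dict(t_d=700, t_p=5e6, s_d=0.1, s_p=0.001)  # illustrative only
    n_star = critical_size(**params)
    for n in (n_star / 2, 2 * n_star):
        drift = fitness_drift(n, mu=1e-8, **params)
        trend = "grows" if drift > 0 else "declines"
        print(f"N = {n:8.1f}: population {trend}")
```

In this caricature, populations below the balance point lose fitness on average and populations above it gain, which mirrors the qualitative statement in the abstract; the stochastic barrier-crossing behavior the authors describe is not captured by a deterministic sketch like this one.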

SOURCE:

http://cnls.lanl.gov/q-bio/wiki/images/4/40/Abstract.pdf

Just as some mutations in the genome of cancer cells actively spur tumor growth, it would appear there are also some that do the reverse, and act to slow it down or even stop it, according to a new US study led by MIT.

Senior author, Leonid Mirny, an associate professor of physics and health sciences and technology at MIT, and colleagues, write about this surprise finding in a paper to be published online this week in the Proceedings of the National Academy of Sciences.

In a statement released on Monday, Mirny tells the press:

“Cancer may not be a sequence of inevitable accumulation of driver events, but may be actually a delicate balance between drivers and passengers.”

“Spontaneous remissions or remissions triggered by drugs may actually be mediated by the load of deleterious passenger mutations,” he suggests.

Cancer Cell’s Genome Has “Drivers” and “Passengers”

Your average cancer cell has a genome littered with thousands of mutations and hundreds of mutated genes. But only a handful of these mutated genes are drivers that are responsible for the uncontrolled growth that leads to tumors.

Up until this study, cancer researchers have mostly not paid much attention to the “passenger” mutations, believing that because they were not “drivers”, they had little effect on cancer progression. 

Now Mirny and colleagues have discovered, to their surprise, that the “passengers” aren’t there just for the ride. In sufficient numbers, they can slow down, and even stop, the cancer cells from growing and replicating as tumors. 

New Drugs Could Target the Passenger Mutations in Protein Chaperoning

Although there are already several drugs in development that target the effect of chaperone proteins in cancer, they are aiming to suppress driver mutations.

Recently, biochemists at the University of Massachusetts Amherst “trapped” a chaperone in action, providing a dynamic snapshot of its mechanism as a way to help development of new drugs that target drivers.

But Mirny and colleagues say there is now another option: developing drugs that target the same chaperoning process, but their aim would be to encourage the suppressive effect of the passenger mutations.

They are now comparing cells with identical driver mutations but different passenger mutations, to see which have the strongest effect on growth.

They are also inserting the cells into mice to see which are the most likely to lead to secondary tumors (metastasize).

Written by Catharine Paddock PhD
Copyright: Medical News Today

SOURCE:

http://www.medicalnewstoday.com/articles/255920.php

After proteins are synthesized, they need to be folded into the correct shape, and chaperones help with that process. In cancerous cells, chaperones help proteins fold into the correct shape even when they are mutated, helping to suppress the effects of deleterious mutations.
Several potential drugs that inhibit chaperone proteins are now in clinical trials to treat cancer, although researchers had believed that they acted by suppressing the effects of driver mutations, not by enhancing the effects of passengers.

In current studies, the researchers are comparing cancer cell lines that have identical driver mutations but a different load of passenger mutations, to see which grow faster. They are also injecting the cancer cell lines into mice to see which are likeliest to metastasize.

Drugs that tip the balance in favor of the passenger mutations could offer a new way to treat cancer, the researchers say, beating it with its own weapon — mutations. Although the influence of a single passenger mutation is minuscule, “collectively they can have a profound effect,” Mirny says. “If a drug can make them a little bit more deleterious, it’s still a tiny effect for each passenger, but collectively this can build up.”
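The arithmetic behind “collectively this can build up” can be sketched in a few lines. The multiplicative fitness form below, the function name and every number in it are assumptions chosen for illustration; the article does not give the researchers' actual equations or parameter values.

```python
def relative_fitness(n_drivers, n_passengers, s_driver=0.1, s_passenger=0.001):
    """Growth advantage relative to a cell with no mutations (1.0 = neutral).

    Assumed multiplicative form: each driver multiplies fitness by
    (1 + s_driver); each passenger divides it by (1 + s_passenger).
    All parameter values are illustrative, not measured.
    """
    return (1 + s_driver) ** n_drivers / (1 + s_passenger) ** n_passengers


# One passenger barely matters; a thousand of them outweigh three drivers.
print(relative_fitness(3, 0))
print(relative_fitness(3, 1))
print(relative_fitness(3, 1000))
# A hypothetical drug that doubles each passenger's tiny cost.
print(relative_fitness(3, 1000, s_passenger=0.002))
```

Under these assumed values, a single passenger changes fitness by about a tenth of a percent, while a large load pushes the cell below the neutral value of 1.0, and making each passenger slightly more costly lowers it further.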

In natural populations, selection weeds out deleterious mutations. However, Mirny and his colleagues suspected that the evolutionary process in cancer can proceed differently, allowing mutations with only a slightly harmful effect to accumulate.

If enough deleterious passengers are present, their cumulative effects can slow tumor growth, the simulations found. Tumors may become dormant, or even regress, but growth can start up again if new driver mutations are acquired. This matches the cancer growth patterns often seen in human patients.
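A toy version of this behavior can be written as a short simulation. This is a single-lineage caricature under stated assumptions, not the researchers' model: the function name, the per-step mutation probabilities and the selection coefficients are all hypothetical.

```python
import random


def simulate_lineage(steps, p_driver, p_passenger,
                     s_driver=0.1, s_passenger=0.005, seed=0):
    """Track the size of one tumor lineage over discrete time steps.

    Assumptions (illustrative only): the lineage starts with one driver;
    at each step it may gain one passenger and/or one driver with the
    given probabilities; its size is multiplied by a fitness factor in
    which drivers help and passengers hurt.
    """
    rng = random.Random(seed)
    size = 1000.0
    drivers, passengers = 1, 0
    history = []
    for _ in range(steps):
        if rng.random() < p_passenger:
            passengers += 1
        if rng.random() < p_driver:
            drivers += 1
        fitness = (1 + s_driver) ** drivers / (1 + s_passenger) ** passengers
        size *= fitness  # fitness > 1 grows the tumor, < 1 shrinks it
        history.append(size)
    return history


# Passengers accumulate every step and no new driver ever arrives.
no_new_drivers = simulate_lineage(60, p_driver=0.0, p_passenger=1.0)
peak_step = no_new_drivers.index(max(no_new_drivers))
print("peak at step", peak_step)
print("final size / peak size:", no_new_drivers[-1] / max(no_new_drivers))

# Same passenger load, but a new driver is occasionally acquired.
with_drivers = simulate_lineage(60, p_driver=0.05, p_passenger=1.0)
print("final size with occasional drivers:", with_drivers[-1])
```

In the first run the lineage grows, stalls and then regresses as the passenger load builds up. In the second run the outcome depends on when drivers happen to arrive, so it varies with the seed.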

“Spontaneous remissions or remissions triggered by drugs may actually be mediated by the load of deleterious passenger mutations.”

When they analyzed passenger mutations found in genomic data taken from cancer patients, the researchers found the same pattern predicted by their model — accumulation of large quantities of slightly deleterious mutations.

REFERENCE

Massachusetts Institute of Technology (2013, February 4). Some cancer mutations slow tumor growth. ScienceDaily. Retrieved February 4, 2013, from http://www.sciencedaily.com/releases/2013/02/130204154011.htm

Biochemists Trap A Chaperone Machine In Action

Main Category: Biology / Biochemistry
Article Date: 11 Dec 2012 – 0:00 PST

Molecular chaperones have emerged as exciting new potential drug targets, because scientists want to learn how to stop cancer cells, for example, from using chaperones to enable their uncontrolled growth. Now a team of biochemists at the University of Massachusetts Amherst led by Lila Gierasch have deciphered key steps in the mechanism of the Hsp70 molecular machine by “trapping” this chaperone in action, providing a dynamic snapshot of its mechanism.

She and colleagues describe this work in the current issue of Cell. Gierasch’s research on Hsp70 chaperones is supported by a long-running grant to her lab from NIH’s National Institute for General Medical Sciences.

Molecular chaperones like the Hsp70s facilitate the origami-like folding of proteins, made in the cell’s nanofactories or ribosomes, from where they emerge unstructured like noodles. Proteins only function when folded into their proper structures, but the process is so difficult under cellular conditions that molecular chaperone helpers are needed. 

The newly discovered information about chaperone action is important because all rapidly dividing cells use a lot of Hsp70, Gierasch points out. “The saying is that cancer cells are addicted to Hsp70 because they rely on this chaperone for explosive new cell growth. Cancer shifts our body’s production of Hsp70 into high gear. If we can figure out a way to take that away from cancer cells, maybe we can stop the out-of-control tumor growth. To find a molecular way to inhibit Hsp70, you’ve got to know how it works and what it needs to function, so you can identify its vulnerabilities.”

Chaperone proteins in cells, from bacteria to humans, act like midwives or bodyguards, protecting newborn proteins from misfolding and existing proteins against loss of structure caused by stress such as heat or a fever. In fact, the heat shock protein (Hsp) group includes a variety of chaperones active in both these situations.

As Gierasch explains, “New proteins emerge into a challenging environment. It’s very crowded in the cell and it would be easy for them to get their sticky amino acid chains tangled and clumped together. Chaperones bind to them and help to avoid this aggregation, which is implicated in many pathologies such as neurodegenerative diseases. This role of chaperones has also heightened interest in using them therapeutically.”

However, chaperones must not bind too tightly or a protein can’t move on to do its job. To avoid this, chaperones rapidly cycle between tight and loose binding states, determined by whether ATP or ADP is bound. In the loose state, a protein client is free to fold or to be picked up by another chaperone that will help it fold to do its cellular work. In effect, Gierasch says, Hsp70s create a “holding pattern” to keep the protein substrate viable and ready for use, but also protected.

She and colleagues knew the Hsp70’s structure in both tight and loose binding affinity states, but not what happened between, which is essential to understanding the mechanism of chaperone action. Using the analogy of a high jump, they had a snapshot of the takeoff and landing, but not the top of the jump. “Knowing the end points doesn’t tell us how it works. There is a shape change in there that we wanted to see,” Gierasch says.

To address this, she and her colleagues, postdoctoral fellows Anastasia Zhuravleva and Eugenia Clerico, obtained “fingerprints” of the structure of Hsp70 in different states by using state-of-the-art nuclear magnetic resonance (NMR) methods that allowed them to map how the chemical environments of individual amino acids of the protein change under different sample conditions. Working with an Hsp70 known as DnaK from E. coli bacteria, Zhuravleva and Clerico assigned its NMR spectra. In other words, they determined which peaks came from which amino acids in this large molecule.

The UMass Amherst team then mutated the Hsp70 so that cycling between tight and loose binding states stopped. As Gierasch explains, “Anastasia and Eugenia were able to stop the cycle part-way through the high jump, so to speak, and obtain the molecular fingerprint of a transient intermediate.” She calls this accomplishment “brilliant.”

Now that the researchers have a picture of this critical allosteric state, that is, one in which events at one site control events in another, Gierasch says many insights emerge. For example, it appears nature uses this energetically tense state to “tune” alternate versions of Hsp70 to perform different cellular functions. “Tuning means there may be evolutionary changes that let the chaperone work with its partners optimally,” she notes.

“And if you want to make a drug that controls the amount of Hsp70 available to a cell, our work points the way toward figuring out how to tickle the molecule so you can control its shape and its ability to bind to its client. We’re not done, but we made a big leap,” Gierasch adds. “We now have an idea of what the Hsp70 structure is when it is doing its job, which is extraordinarily important.”

Article adapted by Medical News Today from original press release.

REFERENCES

[1] Michor F, Iwasa Y, and Nowak MA (2004) Dynamics of cancer progression. Nature Reviews Cancer 4, 197-205.

[2] Crespi B and Summers K (2005) Evolutionary biology of cancer. Trends in Ecology and Evolution 20, 545-552.

[3] Merlo LMF, et al. (2006) Cancer as an evolutionary and ecological process. Nature Reviews Cancer 6, 924-935.

[4] McFarland C, et al. “Accumulation of deleterious passenger mutations in cancer,” in preparation.

[5] Birkbak NJ, et al. (2011) Paradoxical relationship between chromosomal instability and survival outcome in cancer. Cancer Research 71, 3447-3452.

Other related articles on this Open Access Online Scientific Journal include the following:

Hold on. Mutations in Cancer do good.

http://pharmaceuticalintelligence.com/2013/02/04/hold-on-mutations-in-cancer-do-good/

Rational Design of Allosteric Inhibitors and Activators Using the Population-Shift Model: In Vitro Validation and Application to an Artificial Biosensor

http://pharmaceuticalintelligence.com/2012/10/26/rational-design-of-allosteric-inhibitors-and-activators-using-the-population-shift-model-in-vitro-validation-and-application-to-an-artificial-biosensor/

LEADERS in Genome Sequencing of Genetic Mutations for Therapeutic Drug Selection in Cancer Personalized Treatment: Part 2

http://pharmaceuticalintelligence.com/2013/01/13/leaders-in-genome-sequencing-of-genetic-mutations-for-therapeutic-drug-selection-in-cancer-personalized-treatment-part-2/

Exome sequencing of serous endometrial tumors shows recurrent somatic mutations in chromatin-remodeling and ubiquitin ligase complex genes

http://pharmaceuticalintelligence.com/2012/12/18/exome-sequencing-of-serous-endometrial-tumors-shows-recurrent-somatic-mutations-in-chromatin-remodeling-and-ubiquitin-ligase-complex-genes/

Genome-Wide Detection of Single-Nucleotide and Copy-Number Variation of a Single Human Cell(1)

http://pharmaceuticalintelligence.com/2013/02/03/genome-wide-detection-of-single-nucleotide-and-copy-number-variation-of-a-single-human-cell/

Gastric Cancer: Whole-genome reconstruction and mutational signatures

http://pharmaceuticalintelligence.com/2012/12/24/gastric-cancer-whole-genome-reconstruction-and-mutational-signatures-2/

Pregnancy with a Leptin-Receptor Mutation

http://pharmaceuticalintelligence.com/2012/10/31/pregnancy-with-a-leptin-receptor-mutation/

Mitochondrial mutation analysis might be “1-step” away

http://pharmaceuticalintelligence.com/2012/08/14/mitochondrial-mutation-analysis-might-be-1-step-away/

Genome-wide Single-Cell Analysis of Recombination Activity and De Novo Mutation Rates in Human Sperm

http://pharmaceuticalintelligence.com/2012/08/07/genome-wide-single-cell-analysis-of-recombination-activity-and-de-novo-mutation-rates-in-human-sperm/

A Prion Like-Protein, Protein Kinase Mzeta and Memory Maintenance

http://pharmaceuticalintelligence.com/2012/10/19/a-prion-like-protein-protein-kinase-mzeta-and-memory-maintenance/

Hope for Male Contraception: A small molecule that inhibits a protein important for chromatin organization can cause reversible sterility in male mice

http://pharmaceuticalintelligence.com/2012/09/03/hope-for-male-contraception-a-small-molecule-that-inhibits-a-protein-important-for-chromatin-organization-can-cause-reversible-sterility-in-male-mice/

Protein Folding may lead to better FLU Vaccine

http://pharmaceuticalintelligence.com/2012/07/25/protein-folding-may-lead-to-better-flu-vaccine/

SNAP: Predict Effect of Non-synonymous Polymorphisms: How well Genome Interpretation Tools could Translate to the Clinic

http://pharmaceuticalintelligence.com/2013/02/03/snap-predict-effect-of-non-synonymous-polymorphisms-how-well-genome-interpretation-tools-could-translate-to-the-clinic/

Drugging the Epigenome

http://pharmaceuticalintelligence.com/2013/02/01/drugging-the-epigenome/
