Posts Tagged ‘ENCODE’

Gene Expression: Algorithms for Protein Dynamics

Reporter:  Aviva Lev-Ari, PhD, RN

Stanford-developed algorithm reveals complex protein dynamics behind gene expression


Michael Snyder

In yet another coup for a research concept known as “big data,” researchers at the Stanford University School of Medicine have developed a computerized algorithm to understand the complex and rapid choreography of hundreds of proteins that interact in mind-boggling combinations to govern how genes are flipped on and off within a cell.

To do so, they coupled findings from 238 DNA-protein-binding experiments performed by the ENCODE project — a massive, multiyear international effort to identify the functional elements of the human genome — with a laboratory-based technique to identify binding patterns among the proteins themselves.

The analysis is sensitive enough to have identified many previously unsuspected, multipartner trysts. It can also be performed quickly and repeatedly to track how a cell responds to environmental changes or crucial developmental signals.

“At a very basic level, we are learning who likes to work with whom to regulate around 20,000 human genes,” said Michael Snyder, PhD, professor and chair of genetics at Stanford. “If you had to look through all possible interactions pair-wise, it would be ridiculously impossible. Here we can look at thousands of combinations in an unbiased manner and pull out important and powerful information. It gives us an unprecedented level of understanding.”

Snyder is the senior author of a paper describing the research published Oct. 24 in Cell. The lead authors are postdoctoral scholars Dan Xie, PhD, Alan Boyle, PhD, and Linfeng Wu, PhD.

Proteins control gene expression by either binding to specific regions of DNA, or by interacting with other DNA-bound proteins to modulate their function. Previously, researchers could only analyze two to three proteins and DNA sequences at a time, and were unable to see the true complexities of the interactions among proteins and DNA that occur in living cells.

The challenge resembled trying to figure out interactions in a crowded mosh pit by studying a few waltzing couples in an otherwise empty ballroom, and it has severely limited what could be learned about the dynamics of gene expression.

The ENCODE (Encyclopedia of DNA Elements) project was a five-year collaboration of more than 440 scientists in 32 labs around the world to reveal the complex interplay among regulatory regions, proteins and RNA molecules that governs when and how genes are expressed. The project has been generating a treasure trove of data for researchers to analyze for the last eight years.

In this study, the researchers combined data from genomics (a field devoted to the study of genes) and proteomics (which focuses on proteins and their interactions). They studied 128 proteins, called trans-acting factors, which are known to regulate gene expression by binding to regulatory regions within the genome. Some of the regions control the expression of nearby genes; others affect the expression of genes great distances away.

The researchers used 238 data sets generated by the ENCODE project to study the specific DNA sequences bound by each of the 128 trans-acting factors. But these factors aren’t monogamous; they bind many different sequences in a variety of protein-DNA combinations. Xie, Boyle and Snyder designed a machine-learning algorithm to analyze all the data and identify which trans-acting factors tend to be seen together and which DNA sequences they prefer.
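As a rough illustration of the co-binding idea described above, one can score how often two factors bind the same genomic regions. The sketch below is not the authors' algorithm, whose details are not given in this article; it simply computes a Jaccard overlap for every pair of factors. The region identifiers, the toy binding data and the function names are all assumptions made up for the example.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap between two sets of bound regions: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cobinding_scores(binding):
    """Score every pair of factors by the share of bound regions they have in common."""
    scores = {}
    for f1, f2 in combinations(sorted(binding), 2):
        scores[(f1, f2)] = jaccard(binding[f1], binding[f2])
    return scores


# Hypothetical toy data: factor name -> set of region IDs where it binds.
toy_binding = {
    "FOS": {"r1", "r2", "r3", "r4"},
    "JUN": {"r1", "r2", "r3"},
    "CTCF": {"r4", "r5"},
}

scores = cobinding_scores(toy_binding)
# Print the pairs from most to least overlapping.
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

A pairwise score like this only captures two-way overlap; the study's point is that combinations of many factors have to be examined at once, which is why a higher-dimensional method is needed.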

Wu then performed immunoprecipitation experiments, which use antibodies to identify protein interactions in the cell nucleus. In this way, they were able to tell which proteins interacted directly with one another, and which were seen together because their preferred DNA binding sites were adjoining.

“Before our work, only the combination of two or three regulatory proteins were studied, which oversimplified how gene regulators collaborate to find their targets,” Xie said. “With our method we are able to study the combination of more than 100 regulators and see a much more complex structure of collaboration. For example, it had been believed that a key regulator of cell proliferation called FOS typically only works with JUN protein family members. We show, in addition to JUN, FOS has different partners under different circumstances. In fact, we found almost all the canonical combinations of two or three trans-acting factors have many more partners than we previously thought.”

To broaden their analysis, the researchers included data from other sources that explored protein-binding patterns in five cell types. They found that patterns of co-localization among proteins, in which several proteins are found clustered closely on the DNA to govern gene expression, vary according to cell type and the conditions under which the cells are grown. They also found that many of these clusters can be explained through interactions among proteins, and that not every protein bound to DNA directly.

“We’d like to understand how these interactions work together to make different cell types and how they gain their unique identities in development,” Snyder said. “Furthermore, diseased cells will have a very different type of wiring diagram. We hope to understand how these cells go astray.”

Other Stanford co-authors include life science research assistant Jie Zhai and life science research associate Trupti Kawli, PhD.

The research was supported by the National Human Genome Research Institute (grants U54HG004558 and U54HG006996).

Stanford’s Department of Genetics also supported the work.


Stanford Medicine integrates research, medical education and patient care at its three institutions: Stanford University School of Medicine, Stanford Hospital & Clinics and Lucile Packard Children’s Hospital.


Dynamic trans-Acting Factor Colocalization in Human Cells

Cell, Volume 155, Issue 3, 713-724, 24 October 2013
Copyright © 2013 Elsevier Inc. All rights reserved.


    Highlights

    • Colocalization patterns of 128 TFs in human cells
    • An application of SOMs to study high-dimensional TF colocalization patterns
    • Colocalization patterns are dynamic through stimulation and across cell types
    • Many TF colocalizations can be explained by protein-protein interaction
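For readers unfamiliar with self-organizing maps (SOMs), the following is a minimal, generic sketch of how a SOM groups similar binding patterns onto nearby grid nodes. It is not the paper's implementation: the grid size, learning-rate schedule, function names and toy data (rows are hypothetical genomic regions, columns mark whether a factor is bound) are all assumptions chosen for illustration.

```python
import math
import random


def nearest_node(weights, x):
    """Return the index of the node whose weight vector is closest to x."""
    dists = [sum((w - v) ** 2 for w, v in zip(node, x)) for node in weights]
    return dists.index(min(dists))


def train_step(weights, coords, x, lr, sigma):
    """Pull the best-matching node, and to a lesser degree its grid neighbours, toward x."""
    bmu = nearest_node(weights, x)
    for i, node in enumerate(weights):
        grid_d2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[bmu]))
        h = math.exp(-grid_d2 / (2 * sigma ** 2))  # neighbourhood weight, 1.0 at the BMU
        weights[i] = [w + lr * h * (v - w) for w, v in zip(node, x)]


def train_som(data, rows=2, cols=2, epochs=50, seed=0):
    """Train a tiny rows x cols map on the data; returns (weights, coords)."""
    rng = random.Random(seed)
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in data[0]] for _ in coords]
    for epoch in range(epochs):
        frac = 1 - epoch / epochs      # decays from 1 toward 0
        lr = 0.5 * frac + 0.01         # assumed learning-rate schedule
        sigma = 1.0 * frac + 0.3       # assumed neighbourhood-width schedule
        for x in data:
            train_step(weights, coords, x, lr, sigma)
    return weights, coords


# Hypothetical toy data: one row per genomic region, one column per factor.
factors = ["FOS", "JUN", "CTCF", "RAD21"]
regions = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

weights, coords = train_som(regions)
for x in regions:
    print(x, "->", coords[nearest_node(weights, x)])
```

Each region is assigned to the grid node it matches best, so regions bound by the same combination of factors land on the same node. With real data the map would be far larger and each node would summarize one recurring colocalization pattern.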


    Different trans-acting factors (TFs) collaborate and act in concert at distinct loci to perform accurate regulation of their target genes. To date, the cobinding of TF pairs has been investigated in a limited context both in terms of the number of factors within a cell type and across cell types and the extent of combinatorial colocalizations. Here, we use an approach to analyze TF colocalization within a cell type and across multiple cell lines at an unprecedented level. We extend this approach with large-scale mass spectrometry analysis of immunoprecipitations of 50 TFs. Our combined approach reveals large numbers of interesting TF-TF associations. We observe extensive change in TF colocalizations both within a cell type exposed to different conditions and across multiple cell types. We show distinct functional annotations and properties of different TF cobinding patterns and provide insights into the complex regulatory landscape of the cell.

    Personalized medicine aims to assess medical risks, monitor, diagnose and treat patients according to their specific genetic composition and molecular phenotype. The advent of genome sequencing and the analysis of physiological states have proven to be powerful (Cancer Genome Atlas Research Network, 2011). However, its implementation for the analysis of otherwise healthy individuals for estimation of disease risk and medical interpretation is less clear. Much of the genome is difficult to interpret and many complex diseases, such as diabetes, neurological disorders and cancer, likely involve a large number of different genes and biological pathways (Ashley et al., 2010, Grayson et al., 2011, Li et al., 2011), as well as environmental contributors that can be difficult to assess. As such, the combination of genomic information along with a detailed molecular analysis of samples will be important for predicting, diagnosing and treating diseases as well as for understanding the onset, progression, and prevalence of disease states (Snyder et al., 2009).

    Presently, healthy and diseased states are typically followed using a limited number of assays that analyze a small number of markers of distinct types. With the advancement of many new technologies, it is now possible to analyze upward of 10^5 molecular constituents. For example, DNA microarrays have allowed the subcategorization of lymphomas and gliomas (Mischel et al., 2003), and RNA sequencing (RNA-Seq) has identified breast cancer transcript isoforms (Li et al., 2011, van der Werf et al., 2007, Wu et al., 2010, Lapuk et al., 2010). Although transcriptome and RNA splicing profiling are powerful and convenient, they provide a partial portrait of an organism’s physiological state. Transcriptomic data, when combined with genomic, proteomic, and metabolomic data are expected to provide a much deeper understanding of normal and diseased states (Snyder et al., 2010). To date, comprehensive integrative omics profiles have been limited and have not been applied to the analysis of generally healthy individuals.

    To obtain a better understanding of: (1) how to generate an integrative personal omics profile (iPOP) and examine as many biological components as possible, (2) how these components change during healthy and diseased states, and (3) how this information can be combined with genomic information to estimate disease risk and gain new insights into diseased states, we performed extensive omics profiling of blood components from a generally healthy individual over a 14-month period (24 months total when including time points with other molecular analyses). We determined the whole-genome sequence (WGS) of the subject, and together with transcriptomic, proteomic, metabolomic, and autoantibody profiles, used this information to generate an iPOP. We analyzed the iPOP of the individual over the course of healthy states and two viral infections (Figure 1A). Our results indicate that disease risk can be estimated by a whole-genome sequence and that, by regularly monitoring health states with iPOP, disease onset may also be observed. The wealth of information provided by detailed longitudinal iPOP revealed unexpected molecular complexity, which exhibited dynamic changes during healthy and diseased states, and provided insight into multiple biological processes. Detailed omics profiling coupled with genome sequencing can provide molecular and physiological information of medical significance. This approach can be generalized for personalized health monitoring and medicine.




    ENCODE (Encyclopedia of DNA Elements) program: ‘Tragic’ Sequestration Impact on NHGRI Programs

    Reporter: Aviva Lev-Ari, PhD, RN

    NHGRI’s Green Sees ‘Tragic’ Sequestration Impact on NHGRI Programs

    September 13, 2013

    NEW YORK (GenomeWeb News) – The funding squeeze from the sequestration of the US federal budget, now more than half-a-year old, has already had a sizable impact at the National Human Genome Research Institute, leading to cuts to ongoing programs, scaling back of new ones, and the deferring of efforts that have not yet launched.

    The five percent cut in funding this year at NHGRI has led not only to trimmed-down renewal grants and fewer, smaller awards broadly, but also has chopped the budget for some of the institute’s important programs, according to NHGRI Director Eric Green.

    The programs that have had their funding reduced, and in one case delayed, include the ENCODE (Encyclopedia of DNA Elements) program, projects focused on using genome sequencing in newborns and in clinical medicine, and other initiatives, Green said in his Director’s Report to the National Advisory Council on Human Genomics Research this week.

    In addition, many renewal grants have been trimmed, and there are “numerous examples of detrimental cuts” to the institute’s intramural research program, said Green. These cuts to large and small NHGRI programs come at a pivotal time for genomics, he noted, as the products of such research are beginning to translate into clinical possibilities.

    “It is tragic. [That] is the word I would use,” Green told GenomeWeb Daily News this week.

    “[The field of genomics] is just so exciting. There are so many opportunities,” he said. “This is precisely the time that we should be pushing the accelerator hard, and we just cannot do it because we don’t have enough fuel in our fuel tank.

    “It’s frustrating. I think the opportunities now are just spectacular,” said Green. “It’s tragic because it is just so obvious that we could do some remarkable things in genomics and we are not being able to do it.”

    ENCODE, a decade-old flagship project at NIH that aims to identify all of the functional elements in the human genome, had its budget reduced by 16 percent.

    The Genomic Sequencing and Newborn Screening Disorders program was cut by half, which left the program to fund fewer research projects than planned and its research consortium to go forward without the benefit of a data coordinating center. This new initiative, an effort to support pioneering studies on how sequencing might be used in the care of newborns and in neonatal care that was created jointly with the Eunice Kennedy Shriver National Institute of Child Health and Human Development, had its budget cut from $10 million to $5 million.

    The Genomic Medicine Pilot Demonstration Projects program had its budget cut by 20 percent, and NHGRI’s Bioinformatics Resources and Analysis Research Portfolio had $5 million sliced out of its budget. The new Genomics of Gene Regulation (GGR) request for applications was bumped out of this funding year entirely, and has been delayed until 2014, according to Green.

    Because the sequestration plan was concocted and agreed to well in advance of its arrival earlier this year, Green told GWDN that the institute did have some time to try to react to the sequestration and mitigate the pain from the cuts, spreading them around fairly and evenly while maintaining priorities. He said leadership at the institute tried to prepare for the possibility of sequestration by being conservative in its planning.

    Programs that were already ongoing, like ENCODE, were likely to take priority over those that were not yet launched, like GGR, in part because the infrastructure is already in place for ongoing projects and because it is easier to plan for how they operate and generate outputs, like data.

    “With ENCODE you know for every million dollars you invest you get so much back,” said Green. “With a program like newborn sequencing … we don’t totally know what it’s going to look like or play out like. We won’t know what we are missing because we won’t be able to launch it to the scale that we wanted to launch it originally.”

    Green said some of the projects being cut or delayed were created under NHGRI’s strategic plan, a program it laid out in 2011 that involves restructuring of the institute’s divisions and some shifting in its research portfolio to include more efforts in applying genomics to medicine and healthcare.

    “Some of these RFAs that we delayed really represent key elements that we started to anticipate two years ago,” said Green. “We knew we wanted to do more in sequencing, we knew we wanted to do some pilot projects in genomic medicine. We knew we wanted to continue to accelerate efforts in understanding how the genome works … ENCODE, GGR, and so forth. It just had to be slowed down,” he said.

    Anastasia Wise, program director for the Genomic Sequencing and Newborn Screening Disorders program, told GWDN that the program was supposed to be much larger than the $5 million in awards unveiled last week, which funded a consortium of four research projects.

    Wise said NHGRI and NICHD were each initially planning to provide double the amount of funding they were actually awarded, which is now expected to be a total of $25 million over five years, although that total could be subject to the availability of funding.

    “There were definitely more scientifically meritorious applications than we were able to fund,” she said. “Even the four awards that we made ended up being cut an additional five percent because of the sequestration.”

    She said the program “wanted to be able to make more awards, and we wanted to be able to fund a coordinating center to be able to bring the network together and help provide some harmonization of data and coordination of logistics between the different members of the consortium,” but it was unable to fund that part of the effort.

    Although the fractured fiscal culture in Washington engenders caution at NHGRI as the agency looks forward, Green sees many scientific opportunities right now, as genomics begins to hit the clinic.

    “Some people are saying we are not even going fast enough,” he said. “Lots of people have been discussing what the world is going to look like when somebody gets their genome sequenced in the newborn period, and [they] think about what the implications of that are for the patient for the rest of their lives. We want to start studying this,” he said.

    “And we are starting to … but we’re not starting as aggressively as we wanted to,” Green said. “I mean, we took a big hit this year.”

    Matt Jones is a staff reporter for GenomeWeb Daily News. He covers public policy, legislation, and funding issues that affect researchers in the genomics field, as well as the operations of research institutes.


    Genome Jigsaws


    Sequencing has become a household name. In the 2000s, it was thought to be the key to the Pandora’s box of cures. Then the completion of the Human Genome Project showed that there are fewer genes than expected. This outcome gave rise to yet another set of sequencing programs and collaborations around the world, such as the Human Protein Project, the Human Microorganisms Projects, ENCODE, transcriptome sequencing consortia, and others.

    It is human nature to believe in magic and illusion. The extent of biological diversity and the complex mechanisms of expression challenge the design of a simple but informative specific assay. Thus, a new field is developing that combines the rules of biology with mathematical formulas: bioinformatics, also called computational biology. Its tasks include predicting, from the human genome sequence, transcription start and termination sites, exon boundaries, and possible binding sites of transcription regulators for chromatin modification activities, such as histone acetylases and enhancer- and insulator-associated factors. Underlying this work is the assumption that the sequence contains signatures for chromatin modifications essential for gene regulation and development.

    There are three primary colors (red, yellow and blue), yet an artist can create many shades from them. Similarly, scientists have recently been combining and organizing more data to make sense of our blueprint of life, which transfers information from generation to generation, in the hope of curing human diseases.

    Analyzing the genome and the transcriptome opened the door. These studies suggested that all eukaryotic cells have a rich portfolio of RNAs. Among these, long non-coding RNAs affect protein-coding gene expression and regulate multiple processes, including the epigenetic control of gene expression.

    Epigenetics, stemness and non-coding RNAs play a major role in manipulating and correcting gene expression, not only in the proper cell type but also at the proper location and time within the genome, without disturbing the host.

    The main concern is the differentiation of embryonic stem cells under these epigenetic influences. The best-known post-translational modifications, which include methylation, acetylation, ubiquitination, and SUMOylation of lysine residues, methylation of arginine residues, and phosphorylation of serines, occur on histone tails. “Epi” means “on top of” or “above”, so this mechanism gives a new direction to genetic pathways for as long as the organism lives and may eventually lead to evolutionary change. It is useful to illustrate the complexity of the mechanisms, and the context dependence of a gene’s role, with a single example for each.

    For example, DNA methylation occurs mostly on cytosine residues in CpG islands, which are usually located in promoter regions associated with tissue-specific gene expression. However, there are many other forms of DNA methylation, such as monoallelic methylation in gene imprinting and inactivation of the X chromosome, and methylation in repetitive elements such as transposons. There are two main mechanisms, but they are not our main topic here. Note, however, that Myc and hypoxia-inducible factor-1α work differently from certain methyl-CpG-binding proteins, such as MBD1, MBD2, MBD4, MeCP2, and Kaiso.

    Stemness is an important factor in any intervention to correct a pathological condition. In terms of epigenetics, regulation and non-coding RNA, vascular endothelial growth factor A (VEGF-A) is an interesting example of the differentiation of endothelial cells and the morphogenesis of the vascular system during development, for several reasons: epigenetics, gene interactions, time and space. Everything has to be just right, because neither too little nor too much allows a cell or an organism to reach its complete adult form. For example, in mice, both having only one VEGF-A allele and having a two-fold excess of VEGF-A result in death during early embryogenesis, because a proper vascular network cannot develop. However, explaining the diverse mechanisms and functions of VEGF-A requires more specific detail. VEGF-A plays roles in many pathological conditions, such as cancer, inflammation, retinopathies, and arthritis, because it also functions in the epigenetic reprogramming of the promoter regions of the Rex1 and Oct4 genes, which are critical for stem cells. The preferred state is anti-angiogenic, but tumor cells favor hypermethylation to induce a pro-angiogenic state; thus VEGF-A stimulates PIGF in tumor cells, among many other factors.

    Now, let’s turn to the development of a cell and the Polycomb repressive complexes (PRCs), which are important chromatin regulators of embryonic stem (ES) cell function. RYBP was originally shown to function as a transcriptional repressor in reporter assays, both in tissue culture cells and in the fruit fly (Drosophila melanogaster), and as a direct interactor with Ring1A during embryogenesis through methylation. In addition, RYBP appears to take part in epigenetic resetting during preimplantation development, through repression of germ line genes and PcG targets before the formation of pluripotent epiblast cells. However, I believe that its most important role is the efficient repression of endogenous retroviruses (the murine endogenous retrovirus class, MuERV), of preimplantation genes (including those of the zygotic genome activation stage), and of germ line-specific genes. This selective repressor activity of RYBP is found in the ES cell state. When RYBP−/− ES cells were analyzed by measuring gene expression during differentiation, as embryoid bodies formed from mutant and wild-type cells, expression of the pluripotency genes Oct4 and Nanog was usually downregulated. However, RYBP is able to bind genomic regions independently of H3K27me3, and altered RYBP binding in Dnmt1-mutant cells is not related to DNA methylation status. In sum, RYBP is important in undifferentiated ES cells and may affect or even reset the epigenetic landscape during early developmental stages. These are the gaps filled by long non-coding RNAs.

    We learn more by comparing and contrasting what is normal with what is abnormal. As a result, pathology is a key learning canvas for basic mechanisms in molecular genetics. Seasoning it with functional genomics completes the story and makes the outcome digestible. We generally refer to this as translational research. For example, recent findings based on a review of the Oncomine resource suggest that H19 contributes to cancer, including hepatocellular carcinoma (HCC). According to these observations, in most HCC cases H19 expression is lower than in the liver. Thus, in vitro and in vivo studies were undertaken, using classical loss- and gain-of-function genetic analyses of H19, to characterize two H19-dependent outcomes: the effects on gene expression and on HCC metastasis. First, H19 expression varied: it was lower in tumor cells than in peripheral tumor cells. Second, H19 affected metastasis by altering the miR-200 pathway, which contributes to the mesenchymal-to-epithelial transition. Therefore, H19 and miR-200 are targets that could be used in developing molecular diagnostics and establishing targeted cancer therapies.

    Long story short, there is a circle of life in which everything is connected, even though things look different. As a result, when we see a sunflower or a baby we remember to smile, because life still puzzles humankind.

    References and Further Readings:


    “Non-coding RNAs as regulators of gene expression and epigenetics” Cardiovasc Res 1 June 2011: 430-440.

    “Epigenetic regulation of key vascular genes and growth factors” Cardiovasc Res 1 June 2011: 441-446.

    “Epigenetic Regulation by Long Noncoding RNAs” Science 14 December 2012: 1435-1439.

    “Epigenetic control of embryonic stem cell fate” JEM 25 October 2010: 2287-2295.

    “Transcribed dark matter: meaning or myth?” Hum Mol Genet 15 October 2010: R162-R168.

    “Epigenetic activation of the MiR-200 family contributes to H19-mediated metastasis suppression in hepatocellular carcinoma” Carcinogenesis 1 March 2013: 577-586.

    “Vernalization-Mediated Epigenetic Silencing by a Long Intronic Noncoding RNA” Science 7 January 2011: 76-79.

    “Predicting the probability of H3K4me3 occupation at a base pair from the genome sequence context” Bioinformatics 1 May 2013: 1199-1205.


    Further Readings specific to Embryonic Stem Cell Differentiation and Development :

    “BMP Induces Cochlin Expression to Facilitate Self-renewal and Suppress Neural Differentiation of Mouse Embryonic Stem Cells” J. Biol. Chem. 2013 288:8053-8060


    “Regulation of DNA Methylation in Rheumatoid Arthritis Synoviocytes”  J. Immunol. 2013 190:1297-1303


    “DNA methylome signature in rheumatoid arthritis” Ann Rheum Dis 2013 72:110-117


    “The histone demethylase Kdm3a is essential to progression through differentiation” Nucleic Acids Res 2012 40:7219-7232


    “Targeted silencing of the oncogenic transcription factor SOX2 in breast cancer” Nucleic Acids Res 2012 40:6725-6740


    “Yin Yang 1 extends the Myc-related transcription factors network in embryonic stem cells” Nucleic Acids Res 2012 40:3403-3418


    “RYBP Represses Endogenous Retroviruses and Preimplantation- and Germ Line-Specific Genes in Mouse Embryonic Stem Cells” Mol. Cell. Biol. 2012 32:1139-1149


    “Polycomb Repressor Complex-2 Is a Novel Target for Mesothelioma Therapy” Clin. Cancer Res. 2012 18:77-90


    “OCT4 establishes and maintains nucleosome-depleted regions that provide additional layers of epigenetic regulation of its target genes” Proc. Natl. Acad. Sci. USA 2011 108:14497-14502


    “Genome-wide promoter DNA methylation dynamics of human hematopoietic progenitor cells during differentiation and aging” Blood 2011 117:e182-e189


    “The CHD3 Chromatin Remodeler PICKLE and Polycomb Group Proteins Antagonistically Regulate Meristem Activity in the Arabidopsis Root” Plant Cell 2011 23:1047-1060


    “Chromatin structure of pluripotent stem cells and induced pluripotent stem cells” Briefings in Functional Genomics 2011 10:37-49


    Abbreviations used:

    DNMT       DNA methyl transferase

    ES             embryonic stem

    JmjC         Jumonji C

    lincRNA     long ncRNA

    ncRNA       noncoding RNA

    PcG          Polycomb group

    PRC          Polycomb repressive complex

    PRE          Polycomb repressive element


    CRACKING THE CODE OF HUMAN LIFE: Milestones along the Way – Part IIA

    Curator: Larry H Bernstein, MD, FCAP

    Introduction and purpose

    This material goes beyond the Initiation Phase of Molecular Biology, Part I.
    Part II reviews the Human Genome Project and the decade beyond.

    In a three part series:
    Part IIA.  CRACKING THE CODE OF HUMAN LIFE: Milestones along the Way
    Part IIB.  CRACKING THE CODE OF HUMAN LIFE: The Birth of BioInformatics & Computational Genomics
    Part IIC.  CRACKING THE CODE OF HUMAN LIFE: Recent Advances in Genomic Analysis and Disease

    Part III will conclude with Ubiquitin, its Role in Signaling and Regulatory Control.
    Part I reviewed the huge expansion of the biological research enterprise after the Second World War. It concentrated on the

    • discovery of cellular structures,
    • metabolic function, and
    • creation of a new science of Molecular Biology.

    Part II follows the race to delineate the Human Genome, the discovery methods, and fundamental genomic patterns that are ancient in both animal and plant speciation. It also explores both the complexity and the systems view of the architecture that underlies our understanding of the genome.

    These articles review a web-like connectivity between inter-connected scientific discoveries, as significant findings have led to novel hypotheses and many expectations over the last 75 years. This largely post WWII revolution has driven our understanding of biological and medical processes at an exponential pace owing to successive discoveries of

    • chemical structure,
    • the basic building blocks of DNA  and proteins,
    • nucleotide and protein-protein interactions,
    • protein folding, allostericity,
    • genomic structure,
    • DNA replication,
    • nuclear polyribosome interaction, and
    • metabolic control.

    In addition, the research framework has been transformed by the emergence of methods for

    • copying,
    • removal, and
    • insertion, together with
    • improvements in structural analysis and
    • developments in applied mathematics.

    Part IIA:


    Milestones along the Way

    A NOVA interview by Robert Krulwich (RK) with Francis Collins (NHGRI) (FC), J. Craig Venter (CELERA) (JCV), and Eric Lander (EL).
    RK: For the past ten years, scientists all over the world have been painstakingly trying to read the tiny instructions buried inside our DNA. And now, finally, the “Human Genome” has been decoded.
    EL: The genome is a storybook that’s been edited for a couple billion years.
    The following will address the odd similarity of genes between man and yeast.

    EL: In the nucleus of your cell resides the DNA molecule, about 10 angstroms wide and curled up. The amount of curling is limited by the negative charges that repel one another, but there are folds upon folds. If the DNA were stretched out, its length would be thousands of feet.
    EL: We have known for 2000 years that your kids look a lot like you. Well, it’s because you must pass them instructions that give them the eyes, the hair color, and the nose shape they have.
    RK: Cracking the code of those minuscule differences in DNA that influence health and illness is what the Human Genome Project is all about. Since 1990, scientists all over the world have been involved in the effort to read all three billion As, Ts, Gs, and Cs of human DNA. It took 10 years to find the one genetic mistake that causes cystic fibrosis. Another 10 years to find the gene for Huntington’s disease. Fifteen years to find one of the genes that increase the risk for breast cancer. One letter at a time, painfully slowly… And then came the revolution. In the last ten years the entire process has been computerized. The computers can do a thousand letters every second, and that has made all the difference.
    EL: This is basically a parts list with a lot of parts. If you take an airplane, a Boeing 777, I think it has like 100,000 parts. If I gave you a parts list for the Boeing 777, in one sense you’d know 100,000 components, screws and wires and rudders and things like that. But you wouldn’t know how to put it together, or why it flies. We now have a parts list, and that’s not enough to understand why it flies.
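    As a rough check on the sequencing rate quoted above (a thousand letters per second against three billion letters of human DNA), the arithmetic can be sketched as follows. This is only a back-of-the-envelope illustration: it assumes a single machine reading each letter exactly once, with no redundancy, assembly, or error correction, none of which is stated in the interview. The names `GENOME_LETTERS`, `LETTERS_PER_SECOND`, and `days_to_read` are hypothetical.

```python
# Back-of-the-envelope estimate; all constants come from the interview text,
# the single-machine, single-pass assumption is ours.
GENOME_LETTERS = 3_000_000_000   # "three billion As, Ts, Gs, and Cs"
LETTERS_PER_SECOND = 1_000       # "a thousand every second"
SECONDS_PER_DAY = 24 * 60 * 60


def days_to_read(letters: int, letters_per_second: float) -> float:
    """Days needed to read `letters` once at a constant rate."""
    return letters / letters_per_second / SECONDS_PER_DAY


single_pass_days = days_to_read(GENOME_LETTERS, LETTERS_PER_SECOND)
print(f"{single_pass_days:.1f} days")  # 34.7 days
```

    Under these assumptions a single pass takes about five weeks, compared with the decade-long single-gene hunts described in the interview.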

    A Quest For Clarity

    Tracy Vence is a senior editor of Genome Technology
    Projects supported by the US National Institutes of Health will have produced 68,000 total human genomes — around 18,000 of those whole human genomes — through the end of this year, National Human Genome Research Institute estimates indicate. And in his book, The Creative Destruction of Medicine, the Scripps Research Institute’s Eric Topol projects that 1 million human genomes will have been sequenced by 2013 and 5 million by 2014.
    Daniel MacArthur, a group leader in Massachusetts General Hospital’s Analytic and Translational Genetics Unit, estimates that “from a capacity perspective … millions of genomes are not that far off. If you look at the rate that we’re scaling, we can certainly achieve that.” The prospect of so many genomes has brought clinical interpretation into focus. But there is an important distinction to be made between the interpretation of an apparently healthy person’s genome and that of an individual who is already affected by a disease.
    In an April Science Translational Medicine paper, Johns Hopkins University School of Medicine’s Nicholas Roberts and his colleagues reported that personal genome sequences for healthy monozygotic twin pairs are not predictive of significant risk for 24 different diseases in those individuals. The researchers concluded that whole-genome sequencing was not likely to be clinically useful. Ambiguities have clouded even the most targeted interpretation efforts.

    • Technological challenges,
    • meager sample sizes,
    • a need for increased, fail-safe automation and, most important,
    • a lack of community-wide standards for the task

    have hampered researchers’ attempts to reliably interpret the clinical significance of genomic variation.

    How signals from the cell surface affect transcription of genes in the nucleus.

    James Darnell, Jr., MD, Astor Professor, Rockefeller
    After graduation from Washington University School of Medicine he worked with Francois Jacob at the Pasteur Institute in Paris, and he served as Vice President for Academic Affairs at Rockefeller in 1990-91. He is the coauthor with S.E. Luria of General Virology and the founding author with Harvey Lodish and David Baltimore of Molecular Cell Biology, now in its sixth edition. His book RNA, Life’s Indispensable Molecule was published in July 2011 by Cold Spring Harbor Laboratory Press. He has been a member of the National Academy of Sciences since 1973 and is the recipient of numerous awards, including the 2003 National Medal of Science and the 2002 Albert Lasker Award.
    Using interferon as a model cytokine, the Darnell group discovered that cell transcription was quickly changed by binding of cytokines to the cell surface. The bound interferon led to the tyrosine phosphorylation of latent cytoplasmic proteins now called STATs (signal transducers and activators of transcription) that

    • dimerize by reciprocal phosphotyrosine-SH2 interchange,
    • accumulate in the nucleus, and
    • bind DNA and drive transcription.

    This pathway has proved to be of wide importance with seven STATs now known in mammals that take part in a wide variety of developmental and homeostatic events in all multicellular animals. Crystallographic analysis defined functional domains in the STATs, and current attention is focused on two areas:

    • how the STATs complete their cycle of activation and inactivation, which requires regulated tyrosine dephosphorylation; and
    • how persistent activation of STAT3, which occurs in a high proportion of many human cancers, contributes to blocking apoptosis in cancer cells.

    Current efforts are devoted to inhibiting STAT3 with modified peptides that can enter cells.

    Cell cycle regulation and the cellular response to genotoxic stress

    Stephen J Elledge, PhD, Gregor Mendel Professor of Genetics and Medicine, Investigator, Howard Hughes Medical Institute, Harvard Medical School
    As a postdoctoral fellow at Stanford working on eukaryotic homologous recombination, he serendipitously found a family of genes known as ribonucleotide reductases. He subsequently showed that these genes

    • are activated by DNA damage and
    • could serve as tools to help scientists dissect the signaling pathways through which cells sense and respond to DNA damage and replication stress.

    At Baylor College of Medicine he made a second major breakthrough with the discovery of the cyclin-dependent kinase 2 gene (Cdk2), which

    • controls the G1-to-S cell cycle transition,
    • an entry checkpoint for the cell proliferation cycle and
    • a critical regulatory step in tumorigenesis.

    From there, using a novel “two-hybrid” cloning method he developed, Elledge and Wade Harper, PhD, proceeded to

    • isolate several members of the Cdk2-inhibitory family.

    Their discoveries included the p21 and p57 genes; mutations in the latter are responsible for Beckwith-Wiedemann syndrome, which is characterized by somatic overgrowth and increased cancer risk. Elledge is also recognized for his work in understanding proteome remodeling through ubiquitin-mediated proteolysis. He and Harper identified F-box proteins, which regulate protein degradation in the cell by

    1. binding to specific target protein sequences and then
    2. marking them with ubiquitin for destruction by the cell’s proteasome machinery.

    This breakthrough resulted in

    • the elucidation of the cullin ubiquitin ligase family,
    • which controls regulated protein stability in eukaryotes.

    Elledge’s recent research has focused on the cellular mechanisms underlying DNA damage detection and cancer using genetic technologies. In collaboration with Cold Spring Harbor Laboratory researcher Gregory Hannon, PhD, Elledge has generated complete human and mouse short hairpin RNA (shRNA) libraries for genome-wide loss-of-function studies. Their efforts have led to

    • the identification of a number of tumor suppressor proteins and
    • genes upon which cancer cells uniquely depend for survival.

    This work led to the development of the “non-oncogene addiction” concept.

    Playing the dual roles of inventor and investigator, Elledge developed original techniques to define

    • what drives the cell cycle and
    • how cells respond to DNA damage.

    By using these tools, he and his colleagues have identified multiple genes involved in cell-cycle regulation.

    Elledge’s work has earned him many awards, including a 2001 Paul Marks Prize for Cancer Research and a 2003 election to the National Academy of Sciences. In his Inaugural Article (1), published in PNAS, Elledge and his colleagues describe the function of Fbw7, a protein involved in controlling cell proliferation (see below). For his PhD thesis at MIT, Elledge studied the error-prone DNA repair mechanism in Escherichia coli (E. coli) called SOS mutagenesis. His work identified and described

    • the regulation of a group of enzymes now known as error-prone polymerases,
    • the first members of which were the umuCD genes in E. coli.

    It was then that he developed a new cloning tool, a technique that allowed him to approach future cloning problems of this type with great rapidity. With the new technique, “you could make large libraries in lambda that behave like plasmids. We called them ‘phasmid’ vectors, like plasmid and phage together.” The phasmid cloning method was an early cornerstone for molecular biology research.

    Elledge began working on homologous recombination, an important niche in the field of eukaryotic genetics, during a postdoctoral fellowship at Stanford University. Working with the yeast genome, Elledge searched for recA, a gene that allows DNA to recombine homologously. Although he never located recA, he discovered a family of genes known as ribonucleotide reductases (RNRs), which are involved in DNA production. RecA and RNRs share the same last 4 amino acids, which caused an antibody crossreaction in one of Elledge’s experiments. Initially disappointed with the false positives in his hunt for recA, Elledge was later delighted with his luck. He found that

    • RNRs are turned on by DNA damage, and
    • these genes are regulated by the cell cycle.

    Prior to leaving Stanford, Elledge attended a talk at the University of California, San Francisco, by Paul Nurse, a leader in cell-cycle research who would later win the 2001 Nobel Prize in medicine. Nurse described his success in isolating the human homolog of a key cell-cycle kinase gene, Cdc2, by using a mutant strain of yeast (8). Although Nurse’s methods were primitive, Elledge was struck by the message he carried: that

    • cell-cycle regulation was functionally conserved, and
    • many human genes could be isolated by looking for complementing genes in yeast.

    Elledge then took advantage of his past successes in building phasmid vectors to build a versatile human cDNA library that could be expressed in yeast. After setting up a laboratory at Baylor, he introduced this library into yeast, screening for complementing cell-cycle genes. He quickly identified the same Cdc2 gene isolated by Nurse. However, Elledge also discovered a related gene known as Cdk2. Elledge subsequently found that Cdk2 controlled the G1-to-S cell-cycle transition, a step that often goes awry in cancer. These results were published in the EMBO Journal in 1991.

    He then continued to use RNRs to perform genetic screens to identify genes involved in sensing and responding to DNA damage. He subsequently worked out the signal transduction pathways in both yeast and humans that recognize damaged DNA and replication problems. These “checkpoint” pathways are central to the prevention of genomic instability and a key to understanding tumorigenesis.

    This contribution is part of the special series of Inaugural Articles by members of the National Academy of Sciences elected on April 29, 2003.

    Defective cardiovascular development and elevated cyclin E and Notch proteins in mice lacking the Fbw7 F-box protein.

    Tetzlaff MT, Yu W, Li M, Zhang P, Finegold M, Mahon K, Harper JW, Schwartz RJ, and Elledge SJ. PNAS 2004; 101(10): 3338-3345. doi:10.1073/pnas.0307875101

    The mammalian F-box protein Fbw7 and its Caenorhabditis elegans counterpart Sel-10 have been implicated in

    • the ubiquitin-mediated turnover of cyclin E
    • as well as the Notch/Lin-12 family of transcriptional activators.

    Both unregulated Notch and cyclin E promote tumorigenesis, and inactivating mutations in human Fbw7 suggest that it may be a tumor suppressor. To generate an in vivo system to assess the consequences of such unregulated signaling, we generated mice deficient for Fbw7. Fbw7-null mice die around 10.5 days post coitus because of a combination of deficiencies in hematopoietic and vascular development and heart chamber maturation. The absence of Fbw7 results in elevated levels of cyclin E, concurrent with inappropriate DNA replication in placental giant trophoblast cells. Moreover, the levels of both Notch 1 and Notch 4 intracellular domains were elevated, leading to stimulation of downstream transcriptional pathways involving Hes1, Herp1, and Herp2. These data suggest essential functions for Fbw7 in controlling cyclin E and Notch signaling pathways in the mouse.

    Science as an Adventure


    Prof. Avram Hershko – Science as an Adventure
    Prof. Avram Hershko shared the 2004 Nobel Prize in Chemistry with Aaron Ciechanover and Irwin Rose “for the discovery of ubiquitin-mediated protein degradation.”

    Gene Switches

    Nipam Patel is a professor in the Departments of Molecular and Cell Biology and Integrative Biology at UC Berkeley and runs a research laboratory that studies the role, during embryonic development, of homeotic genes (the genetic switches described in this feature). “Ghost in Your Genes” focuses on epigenetic “switches” that turn genes “on” or “off.” But not all switches are epigenetic; some are genetic. That is, other genes within the chromosome turn genes on or off. In an animal’s embryonic stage, these gene switches play a predominant role in laying out the animal’s basic body plan and perform other early functions;

    • the epigenome begins to take over during the later stages of embryogenesis.

    Beginning as a single fertilized egg, an organism develops many different kinds of cells. Altogether, multicellular organisms like humans have thousands of differentiated cells. Each is optimized for use in the brain, the liver, the skin, and so on. Remarkably, the DNA inside all these cells is exactly the same. What makes the cells differ from one another is that different genes in that DNA are either turned on or off in each type of cell.

    Take a typical cell, such as a red blood cell. Each gene within that cell has a coding region that encodes the information used to make a particular protein, such as hemoglobin. (Hemoglobin shuttles oxygen to the tissues and carbon dioxide back out to the lungs, or gills, if you’re a fish.) But another region of the gene, called “regulatory DNA,” determines whether and when the gene will be expressed, or turned on, in a particular kind of cell. This precise transcribing of genes is handled by proteins known as transcription factors, which bind to the regulatory DNA, thereby generating instructions for the coding region.

    One important class of transcription factors is encoded by the so-called homeotic, or Hox, genes. Found in all animals, Hox genes act to “regionalize” the body along the embryo’s anterior-to-posterior (head-to-tail) axis. In a fruit fly, for example, Hox genes lay out the various main body segments: the head, thorax, and abdomen. Amazingly, all animals, from fruit flies to mice to people, rely on the same basic Hox-gene complex. Using different-colored antibody stains, we can see exactly where and to what degree Hox genes are expressed. Each Hox gene is expressed in a specific region along the anterior-to-posterior axis of the embryo.

    A fly’s body has three main divisions: head, thorax, and abdomen. We’ll focus on the thorax, which itself has three main segments. In a normal adult fly, the second thoracic segment features a pair of wings, while the third thoracic segment has a pair of small, balloon-shaped structures called halteres. A modified second wing, the haltere serves as a flight stabilizer. In order for the pair of wings and the pair of halteres (as well as all other parts of the fly) to develop properly, the fly’s suite of

    • Hox genes must be expressed in a precise way and at precise times.

    During development, the fly’s two wings grow from a structure in the larva known as the wing imaginal disk. (An imago is an insect in its final, adult state.) The haltere grows from the larval haltere imaginal disk. Using staining again, we can detect the gene product of the Hox gene Ultrabithorax (Ubx). This reveals that

    • the Ubx gene is naturally “off” in the wing disk
    • and is “on” in the haltere disk.

    What happens when the Ubx gene, just one of a large number of Hox genes, is turned off in the haltere disk? Suppose a genetic mutation caused the Ubx gene to be turned off, during the larval stage, in the third thoracic segment, the segment that normally produces the haltere. Instead of a pair of halteres, the fly has a second set of wings. With the switch of that single Hox gene, Ubx, from on to off, the third thoracic segment becomes an additional second thoracic segment and the pair of halteres becomes a second pair of wings. This illustrates the remarkable ability of transcription factors like Ubx to control patterning as well as cell type during development.
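    The Ubx switch described above can be caricatured in a few lines of code. The sketch below is a deliberately minimal toy, not a model of real developmental biology: it assumes that the appendage of a thoracic segment depends on a single on/off value for Ubx, and the names `Segment`, `appendage`, and `fly_appendages` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    ubx_on: bool  # is Ubx expressed in this segment's imaginal disk?


def appendage(segment: Segment) -> str:
    """Toy rule: Ubx 'on' gives a haltere, Ubx 'off' gives a wing."""
    return "haltere" if segment.ubx_on else "wing"


def fly_appendages(ubx_in_t3: bool) -> dict:
    """Appendages of the second (T2) and third (T3) thoracic segments."""
    t2 = Segment("T2", ubx_on=False)      # Ubx is naturally off in the wing disk
    t3 = Segment("T3", ubx_on=ubx_in_t3)  # normally on in the haltere disk
    return {seg.name: appendage(seg) for seg in (t2, t3)}


wild_type = fly_appendages(ubx_in_t3=True)
mutant = fly_appendages(ubx_in_t3=False)
print(wild_type)  # {'T2': 'wing', 'T3': 'haltere'}
print(mutant)     # {'T2': 'wing', 'T3': 'wing'}
```

    Flipping the single boolean for T3 turns the third segment into a copy of the second, which is the four-winged outcome described in the text.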


    A. Data Suggests “Gene” Redefinition

    As part of a huge collaborative effort called ENCODE (Encyclopedia of DNA Elements), a research team led by Cold Spring Harbor Laboratory (CSHL) Professor Thomas Gingeras, PhD, publishes a genome-wide analysis of RNA messages, called transcripts, produced within human cells.
    Their analysis—one component of a massive release of research results by ENCODE teams from 32 institutes in 5 countries, with 30 papers appearing in 3 different high-level scientific journals—shows that three-quarters of the genome is capable of being transcribed.  This indicates that nearly all of our genome is dynamic and active.  It stands in marked contrast to consensus views prior to ENCODE’s comprehensive research efforts, which suggested that

    • only the small protein-encoding fraction of the genome was transcribed.

    The vast amount of data generated with advanced technologies by Gingeras’ group and others in the ENCODE project changes the prevailing understanding of what defines a gene. The current outstanding question concerns the nature and range of the functions of these transcripts. It is thought that these “non-coding” RNA transcripts act something like components of a giant, complex switchboard, controlling a network of many events in the cell by regulating the processes of

    • replication,
    • transcription and
    • translation;

    that is, the copying of DNA and the making of proteins based on information carried by messenger RNAs. With the understanding that so much of our DNA can be transcribed into RNA comes the realization that there is much less space between what we previously thought of as genes, Gingeras points out.

    The full ENCODE Consortium data sets can be freely accessed through

    • the ENCODE project portal as well as at the University of California at Santa Cruz genome browser,
    • the National Center for Biotechnology Information, and
    • the European Bioinformatics Institute.

    Topic threads that run through several different papers can be explored via the ENCODE microsite page.

    Date: September 5, 2012. Source: Cold Spring Harbor Laboratory

    1000 Genomes Project Team Reports on Variation Patterns

    (from Phase I Data) October 31, 2012 GenomeWeb

    In a study appearing online today in Nature, members of the 1000 Genomes Project Consortium presented an integrated haplotype map representing the genomic variation present in more than 1,000 individuals from 14 human populations.  Using data on 1,092 individuals tested by

    • low-coverage whole-genome sequencing,
    • deep exome sequencing, and/or
    • dense genotyping,

    the team looked at the nature and extent of the rare and common variation present in the genomes of individuals within these populations. In addition to population-specific differences in common variant profiles, for example, the researchers found distinct rare variant patterns within populations from different parts of the world — information that is expected to be important in interpreting future disease studies. They also encountered a surprising number of the variants that are expected to impact gene function, such as

    • non-synonymous changes,
    • loss-of-function variants, and, in some cases,
    • potentially damaging mutations.
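    The distinction between rare and common variation drawn above rests on allele frequency. The sketch below shows one way such a classification could be computed from per-individual genotypes. The cutoffs (0.5% and 5%) are illustrative assumptions rather than values taken from this article, and the function names `minor_allele_frequency` and `classify_variant` are hypothetical.

```python
def minor_allele_frequency(alt_counts):
    """Minor allele frequency from per-individual alternate-allele counts.

    Each entry is 0, 1, or 2 (copies of the alternate allele in a diploid genome).
    """
    if not alt_counts:
        raise ValueError("need at least one genotype")
    alt_freq = sum(alt_counts) / (2 * len(alt_counts))
    return min(alt_freq, 1 - alt_freq)


def classify_variant(maf, rare_below=0.005, common_from=0.05):
    """Bin a variant by frequency; the thresholds are assumed, not from the source."""
    if maf == 0:
        return "monomorphic"
    if maf < rare_below:
        return "rare"
    if maf < common_from:
        return "low-frequency"
    return "common"


# 1,092 individuals, as in the Phase I data set described above.
n = 1092
rare_site = [1] * 3 + [0] * (n - 3)                     # three heterozygous carriers
common_site = [2] * 50 + [1] * 300 + [0] * (n - 350)    # many carriers

print(classify_variant(minor_allele_frequency(rare_site)))    # rare
print(classify_variant(minor_allele_frequency(common_site)))  # common
```

    In a panel of this size, a variant seen in only three carriers has a frequency of about 0.14%, which is why a sample of more than a thousand genomes is needed to observe such variants at all.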

    ENCODE was designed to pick up where the Human Genome Project left off.
    Although that massive effort revealed the blueprint of human biology, it quickly became clear that the instruction manual for reading the blueprint was sketchy at best. Researchers could identify in its 3 billion letters many of the regions that code for proteins, but these make up little more than 1% of the genome, contained in around 20,000 genes. ENCODE, which started in 2003, is a massive data-collection effort designed to

    • catalogue the ‘functional’ DNA sequences,
    • learn when and in which cells they are active and
    • trace their effects on how the genome is
    1. packaged,
    2. regulated and
    3. read.

    After an initial pilot phase, ENCODE scientists started applying their methods to the entire genome in 2007. That phase came to a close with the publication of 30 papers, in Nature, Genome Research and Genome Biology. The consortium has assigned some sort of function to roughly 80% of the genome, including

    • more than 70,000 ‘promoter’ regions (the sites, just upstream of genes, where proteins bind to control gene expression) and
    • nearly 400,000 ‘enhancer’ regions that regulate expression of distant genes.

    But the job is far from done.

    Junk DNA? What Junk DNA?

    Far from containing vast amounts of junk DNA between its protein-coding genes, at least 80% of the human genome encodes elements that have some sort of biological function, according to newly released data from the Encyclopedia of DNA Elements (Encode) project, a five-year initiative that aims to delineate all functional elements within human DNA. The massive international project, data from which are published in 30 different papers in Nature, Genome Research, Genome Biology, the Journal of Biological Chemistry, Science, and Cell, has identified four million gene switches, effectively

    • regulatory regions in the genome where
    • proteins interact with the DNA to control gene expression.

    Overall, the Encode data define regulatory switches that are scattered all over the three billion nucleotides of the genome. In fact, the data suggest that the regions that lie between gene-coding sequences contain a wealth of previously unrecognized functional elements, including

    • nonprotein-coding RNA transcribed sequences,
    • transcription factor binding sites,
    • chromatin structural elements, and
    • DNA methylation sites.

    The combined results suggest that 95% of the genome lies within 8 kb of a DNA-protein interaction, and 99% lies within 1.7 kb of at least one of the biochemical events, the researchers say. Importantly, given the complex three-dimensional nature of DNA, it’s also apparent that

    • a regulatory element for one gene may be located quite some ‘linear’ distance from the gene itself.
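    Statements of the form "95% of the genome lies within 8 kb of a DNA-protein interaction" are coverage calculations: each annotated site is widened by the chosen distance, overlapping windows are merged, and the covered length is divided by the genome length. The sketch below illustrates the idea on a toy one-dimensional genome. It is not the ENCODE consortium's method, and the function name `fraction_within` and its inputs are hypothetical.

```python
def fraction_within(genome_length, sites, max_dist):
    """Fraction of positions 0..genome_length-1 within max_dist of any site.

    `sites` are integer positions assumed to lie inside the genome.
    """
    # Widen each site into a closed window, clipped to the genome ends.
    windows = sorted(
        (max(0, s - max_dist), min(genome_length - 1, s + max_dist))
        for s in sites
    )
    covered = 0
    cur_start = cur_end = None
    for start, end in windows:
        if cur_end is None or start > cur_end + 1:
            # Gap before this window: close the previous merged window.
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent window: extend the current one.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return covered / genome_length


print(fraction_within(100, [10, 50], 5))  # 0.22
print(fraction_within(100, [10, 12], 5))  # 0.13
```

    Merging the windows before summing matters: two nearby sites share most of their neighborhood, so counting each window separately would overstate the covered fraction.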

    “The information processing and the intelligence of the genome reside in the regulatory elements,” explains Jim Kent, director of the University of California, Santa Cruz Genome Browser project and head of the Encode Data Coordination Center. “With this project, we probably went from understanding less than 5% to now around 75% of them.”
    The ENCODE results also identified SNPs within regulatory regions that are associated with a range of diseases, providing new insights into the roles that

    • noncoding DNA plays in disease development.

    “As much as nine out of 10 times, disease-linked genetic variants are not in protein-coding regions,” comments Mike Pazin, Encode program director at the National Human Genome Research Institute.  “Far from being junk DNA, this regulatory DNA clearly makes important contributions to human disease.”

    Read Full Post »

    How Mobile Elements in “Junk” DNA Promote Cancer – Part 1: Transposon-mediated Tumorigenesis

    Author, Writer and Curator: Stephen J. Williams, Ph.D.



    Landscape of Somatic Retrotransposition in Human Cancers. Science (2012); Vol. 337:967-971. (1)

    Sequencing of the human genome via massive programs such as the Cancer Genome Atlas Program (CGAP) and the Encyclopedia of DNA Elements (ENCODE) consortium in conjunction with considerable bioinformatics efforts led by the National Center for Biotechnology Information (NCBI) have unlocked a myriad of yet unclassified genes (for good review see (2).  The project encompasses 32 institutions worldwide which, so far, have generated 1640 data sets, initially depending on microarray platforms but now moving to the more cost effective new sequencing technology.  Initially the ENCODE project focused on three types of cells: an immature white blood cell line GM12878, leukemic line K562, and an approved human embryonic cell line H1-hESC.  The analysis was rapidly expanded to another 140 cell types.  DNA sequencing had revealed 20,687 known coding regions with hints of 50 more coding regions.  Another 11,224 DNA stretches were classified as pseudogenes.  The ENCODE project reveals that many genes encode for an RNA, not protein product, so called regulatory RNAs.

    However, some of the most recent and interesting results focus on the noncoding regions of the human genome, previously discarded as uninteresting or “junk” DNA.  Only 2% of the human genome contains coding regions, while the remaining 98%, the noncoding part of the genome, is actually found to be highly active “with about 4 million constantly communicating switches” (3).  Some of these “switches” in the noncoding portion contain small, repetitive elements which are mobile throughout the genome, and can control gene expression and/or predispose to diseases such as cancer.  These mobile elements, found in almost all organisms, are classified as transposable elements (TE), inserting themselves into far-reaching regions of the genome.  Retro-transposons are capable of generating new insertions through RNA intermediates.  These transposable elements are normally kept immobile by epigenetic mechanisms (4-6); however, some TEs can escape epigenetic repression and insert in other areas of the genome, a process described as insertional mutagenesis because it can lead to the gene alterations seen in disease (7).  In addition, this insertional mutagenesis can lead to the transformation of cells and, as described in Post 2, act as a model system to determine drivers of oncogenesis.  Insertional mutagenesis is a mechanism distinct from other genetic alterations and rearrangements seen in cancer, such as recombination and fusion of gene fragments as seen with the Philadelphia chromosome and the BCR/ABL fusion protein (8).  The mechanism of transposition and putative effects leading to mutagenesis are described in the following figure:


    Figure.  Insertional mutagenesis based on a transposon-mediated mechanism.  A) The basic structure of a transposon contains a gene/sequence flanked by two inverted repeats (IR) and/or direct repeats (DR).  An enzyme, the transposase (red hexagon), binds and cuts at the IR/DR, and the transposon is pasted at another site in the DNA that contains an insertion site.  B) Multiple transpositions may result in oncogenic events by inserting in promoters, leading to altered expression of genes driving oncogenesis, or by inserting within coding regions and inactivating tumor suppressors or activating oncogenes.  Deep sequencing of the resultant tumor genomes (based on nested PCR from the IR/DRs) may reveal common insertion sites (CIS), from which oncogenic mutations could be identified.
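The idea of a common insertion site in panel B can be sketched in a few lines of Python. This is only an illustration, not the method used in any of the cited studies: the function name `common_insertion_sites`, the fixed 10 kb window, the threshold of two tumors, and all sample data below are hypothetical assumptions.

```python
from collections import defaultdict

def common_insertion_sites(insertions, window=10_000, min_tumors=2):
    """Bin insertions into fixed-size genomic windows and keep the windows
    that were hit in at least `min_tumors` distinct tumors.

    `insertions` is an iterable of (tumor_id, chromosome, position) tuples.
    Fixed windows are a simplifying assumption; real CIS callers use
    statistical models rather than a hard cutoff.
    """
    tumors_by_window = defaultdict(set)
    for tumor_id, chrom, pos in insertions:
        # A set is used so repeated hits from one tumor are counted once.
        tumors_by_window[(chrom, pos // window)].add(tumor_id)

    return {
        (chrom, idx * window, (idx + 1) * window): sorted(tumors)
        for (chrom, idx), tumors in tumors_by_window.items()
        if len(tumors) >= min_tumors
    }

# Hypothetical insertion calls, invented purely for illustration.
example_insertions = [
    ("tumor_A", "chr1", 105_200),
    ("tumor_B", "chr1", 108_900),
    ("tumor_C", "chr1", 101_050),
    ("tumor_A", "chr2", 50_000),
    ("tumor_A", "chr2", 52_000),   # same tumor twice: one independent event
    ("tumor_B", "chr7", 900_000),
]

cis = common_insertion_sites(example_insertions)
for (chrom, start, end), tumors in cis.items():
    print(f"{chrom}:{start}-{end} hit in {len(tumors)} tumors: {tumors}")
```

Counting distinct tumors rather than raw insertions is the design choice that matters here: a region hit repeatedly within one tumor is weaker evidence of selection than a region hit independently in several tumors.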

    In a bioinformatics study, Eunjung Lee et al. (1), in collaboration with the Cancer Genome Atlas Research Network, analyzed 43 high-coverage whole-genome sequencing datasets from five cancer types to determine transposable element insertion sites.  Using a novel computational method, the authors identified 194 high-confidence somatic TE insertion sites present in cancers of epithelial origin such as colorectal, prostate and ovarian, but not in brain or blood cancers.  Sixty-four of the 194 detected somatic TE insertions were located within 62 annotated genes.  Genes with TE insertions in colon cancers commonly have high mutation rates, and the enriched genes were associated with cell adhesion functions (CDH12, ROBO2, NRXN3, FPR2, COL1A1, NEGR1, NTM and CTNNA2) or tumor suppressor functions (NELL1, ROBO2, DBC1, and PARK2).  None of the somatic events were located within coding regions; the TE sequences were detected in untranslated regions (UTR) or intronic regions.  Previous studies had shown that insertion in these regions (UTR or intronic) can disrupt gene expression (9).  Interestingly, most of the genes with insertion sites were down-regulated, and recent papers have shown that local changes in the methylation status of transposable elements can drive retro-transposition (10,11).  Indeed, the authors found that somatic insertions are biased toward the hypomethylated regions in cancer cell DNA.  The authors also confirmed that the insertion sites were unique to cancer and were somatic rather than germline in origin (germline: inherited, and therefore also present in normal tissue) by analyzing 44 normal genomes (41 normal blood samples from cancer patients and three healthy individuals).
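The two filtering steps described above (discarding insertions also seen in normal genomes, then asking which remaining insertions fall inside annotated genes) can be sketched as follows. This is a minimal sketch under loud assumptions, not the authors' computational method: the function names, the exact-coordinate matching between tumor and normal calls, and the gene names and coordinates are all hypothetical.

```python
def somatic_insertions(tumor_sites, normal_sites):
    """Keep insertion sites found in the tumor but absent from normal genomes.

    Sites are (chromosome, position) tuples. Matching on exact coordinates
    is a simplification; real pipelines tolerate small positional offsets.
    """
    return sorted(set(tumor_sites) - set(normal_sites))

def genes_containing(site, genes):
    """Return the names of annotated genes whose interval contains `site`.

    `genes` is a list of (name, chromosome, start, end) tuples with
    inclusive coordinates.
    """
    chrom, pos = site
    return [
        name
        for name, gene_chrom, start, end in genes
        if gene_chrom == chrom and start <= pos <= end
    ]

# Hypothetical data: placeholder gene names and made-up coordinates.
tumor_sites = [("chr5", 1200), ("chr3", 500), ("chr9", 42)]
normal_sites = [("chr9", 42)]            # also in normal tissue: germline
genes = [
    ("GENE_X", "chr5", 1000, 2000),
    ("GENE_Y", "chr3", 900, 1500),
]

somatic = somatic_insertions(tumor_sites, normal_sites)
hits = {site: genes_containing(site, genes) for site in somatic}
for site, names in hits.items():
    print(site, "->", names or "intergenic")
```

In this toy input the chr9 site is dropped because it also appears in the normal sample, and only the chr5 site lands inside an annotated gene.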

    The authors conclude:

    “that some TE insertions provide a selective advantage during tumorigenesis,

    rather than being merely passenger events that precede clonal expansion(1).”

    The authors also suggest that more bioinformatics studies, which utilize the expansive genomic and epigenetic databases, could determine the functional consequences of such transposable elements in cancer.  The following Post will describe how the use of transposon-mediated insertional mutagenesis is leading to discoveries of the drivers (main genetic events) leading to oncogenesis.

    1.            Lee, E., Iskow, R., Yang, L., Gokcumen, O., Haseley, P., Luquette, L. J., 3rd, Lohr, J. G., Harris, C. C., Ding, L., Wilson, R. K., Wheeler, D. A., Gibbs, R. A., Kucherlapati, R., Lee, C., Kharchenko, P. V., and Park, P. J. (2012) Science 337, 967-971

    2.            Pennisi, E. (2012) Science 337, 1159, 1161

    3.            Park, A. (2012) Don’t Trash These Genes. “Junk” DNA may lead to valuable cures. in Time, Time, Inc., New York, N.Y.

    4.            Maksakova, I. A., Mager, D. L., and Reiss, D. (2008) Cellular and molecular life sciences : CMLS 65, 3329-3347

    5.            Slotkin, R. K., and Martienssen, R. (2007) Nature reviews. Genetics 8, 272-285

    6.            Yang, N., and Kazazian, H. H., Jr. (2006) Nature structural & molecular biology 13, 763-771

    7.            Hancks, D. C., and Kazazian, H. H., Jr. (2012) Current opinion in genetics & development 22, 191-203

    8.            Sattler, M., and Griffin, J. D. (2001) International journal of hematology 73, 278-291

    9.            Han, J. S., Szak, S. T., and Boeke, J. D. (2004) Nature 429, 268-274

    10.          Reichmann, J., Crichton, J. H., Madej, M. J., Taggart, M., Gautier, P., Garcia-Perez, J. L., Meehan, R. R., and Adams, I. R. (2012) PLoS computational biology 8, e1002486

    11.          Byun, H. M., Heo, K., Mitchell, K. J., and Yang, A. S. (2012) Journal of biomedical science 19, 13

    Other research papers on ENCODE and cancer were published on this scientific web site as follows:

    Expanding the Genetic Alphabet and linking the genome to the metabolome

    Junk DNA codes for valuable miRNAs: non-coding DNA controls Diabetes

    ENCODE Findings as Consortium

    Reveals from ENCODE project will invite high synergistic collaborations to discover specific targets

    ENCODE: the key to unlocking the secrets of complex genetic diseases

    Impact of evolutionary selection on functional regions: The imprint of evolutionary selection on ENCODE regulatory elements is manifested between species and within human populations

    Metabolite Identification Combining Genetic and Metabolic Information: Genetic association links unknown metabolites to functionally related genes

    Advances in Separations Technology for the “OMICs” and Clarification of Therapeutic Targets

    Commentary on Dr. Baker’s post “Junk DNA codes for valuable miRNAs: non-coding DNA controls Diabetes”

    Cancer Genomics – Leading the Way by Cancer Genomics Program at UC Santa Cruz

    Read Full Post »


    Author and Curator: Ritu Saxena, Ph.D.

    A recent post by Dr. Margaret Baker entitled “Junk DNA codes for valuable miRNAs: non-coding DNA controls Diabetes” talks about how the ENCODE project is revealing new insights into the functions of the non-coding regions of the human genome previously labeled as “junk DNA”. MicroRNAs (miRNAs), as stated by Dr. Baker, “are among the non-gene encoding sequences in the genome and have been shown to play a major post-transcriptional role in expression of multiple genes.”

    The post has touched upon several aspects of miRNA including origin, function, and mechanism of action. This commentary is an extension of Dr. Baker’s post, expanding upon the mechanism of action of miRNAs along with their role in potential disease therapy.

    microRNA: Revisiting the past

    MicroRNAs were discovered relatively recently; in fact, it was in 1998 that non-coding RNAs that could be involved in switching certain genes ‘on’ and ‘off’ were reported. In the last decade, the 2006 Nobel Prize in Physiology or Medicine was awarded to scientists Andrew Fire and Craig Mello for their discovery of this new role of RNA molecules.

    A breakthrough study, “Mammalian microRNAs predominantly act to decrease target mRNA levels,” was published in the September 2010 issue of the journal Nature. miRNAs were initially thought to repress protein output without changes in the corresponding mRNA levels. Guo et al. challenged this notion of ‘translational repression’ and concluded on the basis of their experimental results that ‘mRNA destabilization’ is, for the most part, responsible for the miRNA-mediated repression of protein expression. The authors used ‘ribosome profiling’ to measure the overall effects of miRNAs on protein production and then compared these to simultaneously measured effects on mRNA levels. Ribosome profiling maps the exact positions of ribosomes on transcripts after nucleases digest the exposed parts of transcripts that are not covered by ribosomes. miR-1 and miR-155, neither of which is normally expressed in HeLa cells, were introduced into the HeLa cell line. Another miRNA studied was miR-223, which is expressed in significant amounts in neutrophils. These miRNAs were chosen because proteomics research had already shown that they repress protein levels. The authors found that miRNA-mediated repression was similar regardless of target expression level and further stated that “for both ectopic and endogenous miRNA regulatory interactions, lowered mRNA levels account for most (≥84%) of the decreased protein production.” These results show that changes in mRNA levels closely reflect the impact of miRNAs on gene expression and indicate that destabilization of target mRNAs is the predominant reason for reduced protein output.
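One way to read a statement such as “lowered mRNA levels account for most of the decreased protein production” is as a ratio of log2 fold changes: the change in mRNA abundance divided by the change in ribosome-protected fragments (used here as a proxy for protein production). The sketch below works through that arithmetic. It assumes this ratio is a reasonable way to express the comparison, which the post itself does not confirm, and the function names and all numbers are hypothetical rather than data from the paper.

```python
import math

def log2_fold_change(with_mirna, control):
    """log2 of the abundance ratio (miRNA present / control)."""
    return math.log2(with_mirna / control)

def fraction_explained_by_mrna(mrna_lfc, ribosome_lfc):
    """Share of the drop in ribosome-protected fragments that is matched
    by the drop in mRNA abundance, both given as log2 fold changes."""
    if ribosome_lfc == 0:
        raise ValueError("no change in ribosome occupancy to explain")
    return mrna_lfc / ribosome_lfc

# Hypothetical measurements for one target gene (arbitrary units).
mrna_lfc = log2_fold_change(with_mirna=80.0, control=100.0)      # mRNA-seq
ribosome_lfc = log2_fold_change(with_mirna=75.0, control=100.0)  # ribosome footprints

fraction = fraction_explained_by_mrna(mrna_lfc, ribosome_lfc)
print(f"mRNA change:      {mrna_lfc:.3f} log2 units")
print(f"footprint change: {ribosome_lfc:.3f} log2 units")
print(f"fraction of repression attributable to mRNA loss: {fraction:.1%}")
```

Working in log space keeps the two effects additive: whatever part of the footprint change is not matched by the mRNA change is the part left over for reduced translation per transcript.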

    The authors concluded that the discovery “will apply broadly to the vast majority of miRNA targeting interactions. If indeed general, this conclusion will be welcome news to biologists wanting to measure the ultimate impact of miRNAs on their direct regulatory targets.”

    Since then and even before the paper was published, several other miRNAs and their roles have been discovered. Information on miRNAs has been consolidated in a database that can be accessed online at

    microRNA: From bench to bedside

    The scientific community had speculated about the role of non-coding RNAs in disease treatment right after their discovery. One study demonstrating the use of microRNA for cancer treatment, “miR-380-5p represses p53 to control cellular survival and is associated with poor outcome in MYCN-amplified neuroblastoma,” was published in the September 2010 issue of the journal Nature Medicine.

    The p53 gene is known as a tumor suppressor gene, and its inactivation has been associated with some cancers such as neuroblastoma. The study reported that microRNA-380 (miR-380) was able to repress the expression of the p53 gene in cancer patients, causing uninhibited cell survival and proliferation. The research group was able to decrease tumor size in vivo in a mouse model of neuroblastoma by delivering a miR-380 antagonist. The researchers also observed that the inhibition of endogenous miR-380 in embryonic stem or neuroblastoma cells resulted in induction of p53 and extensive apoptotic cell death.

    Thus, the success of a miR antagonist in decreasing tumor size speaks to the potential of miRs as therapeutic targets for cancer treatment.

    In conclusion, as stated by Dr. Baker in her post, “the miRNA data for tissues and specific cell types involved in disease pathology form a new approach to either detecting or possibly correcting gene (coding or non-coding) dysregulation. miRNA mimics and anti-miRNA agents are being developed as new therapeutic modalities.”


    Pharmaceutical Intelligence post, Author, Dr. Margaret Baker: Junk DNA codes for valuable miRNAs: non-coding DNA controls Diabetes


    Research articles: Mammalian microRNAs predominantly act to decrease target mRNA levels

    miR-380-5p represses p53 to control cellular survival and is associated with poor outcome in MYCN-amplified neuroblastoma

    Expert reviews- miRNA and Cancer treatment


    News briefs:–once_dismissed_as_j038861.html


    Read Full Post »

    Author and Reporter: Anamika Sarkar, Ph.D

    Early in September, Nature published 30 research papers on the results of the ambitious, and at one time seemingly risky, project named ENCODE (Encyclopedia of DNA Elements). The results of ENCODE revealed that 80% of the human genome is not “junk”, as previously thought, but rather acts as regulatory domains for further signaling events.

    When the human genome was first sequenced, more than a decade ago, scientists were surprised by the low ratio of gene-coding regions to the number of bases in human DNA. Out of 3 billion bases in human DNA, scientists found only 21,000 genes. This unexpected finding led to a few basic questions:

    • Why do humans have so many base pairs?
    • How can the highly regulated, complex behaviors of biochemical, cellular and physiological processes be translated to regulation at the genetic level?

    The ENCODE project results reveal how limited our knowledge of the human genome has been until now. The results open up new ways of thinking about human DNA and its functional domains. They also bring huge challenges for both experimental development and data-driven computational approaches aimed at better understanding and applying these new findings.

    To gain insight from large-scale data and identify key players in a large pool of data, bioinformatics approaches will probably be the only way forward. This also means it is important to develop new algorithms capable of linking regulatory functions with gene regulation. Presently, most algorithms are targeted toward identifying genes and their connections in a linear fashion. However, regulatory domains and their functional activities might be nonlinear, something that will be revealed by many more experimental results in the coming years.

    The functional characteristics of the human genome will also lead to a better understanding of the genetic differences between normal and disease states. Moreover, with proper identification of the functional characteristics of a particular gene's regulation, drugs can be targeted with much more precision in the future. However, succeeding at such a complicated problem will require visionary design and execution, with experimental and computational biology teams working together.

    It is already well recognized that bioinformatics approaches can help enormously in identifying key players in the regulation of genes. However, it is often not easy to translate information at the genetic level directly to the cellular or physiological level. Some of the main reasons are a) the complex cross talk between proteins that leads to intracellular signaling events and b) the highly nonlinear information sharing among receptors and ligands in extracellular signaling processes.  To understand the functional characteristics of non-coding regions of DNA in the context of gene regulation, an effort should be made to map the functional network of gene regulation onto the signaling pathways of protein networks. This will require the development of experimental as well as computational approaches that capture genetic and proteomic analyses together. Furthermore, for a better understanding of cellular and physiological decisions, the mapping between gene regulation and intracellular signaling pathways should be extended to dynamic analysis over time.

    The extraordinary findings from the ENCODE project pose many challenges, leaving many unknowns to be answered over the next decade or so, but they also answer some basic questions that have haunted the scientific world for almost a decade.


    News and Views- ENCODE explained:

    News and Analysis – ENCODE Project writes Eulogy for Junk DNA :

    ENCODE Project (Nature Article):


    Read Full Post »
