Posts Tagged ‘Curation’

How Will FDA’s new precisionFDA Science 2.0 Collaboration Platform Protect Data?

Reporter: Stephen J. Williams, Ph.D.

As reported in

FDA launches precisionFDA to harness the power of scientific collaboration

FDA Voice
By: Taha A. Kass-Hout, M.D., M.S., and Elaine Johanson

Imagine a world where doctors have at their fingertips the information that allows them to individualize a diagnosis, treatment or even a cure for a person based on their genes. That’s what President Obama envisioned when he announced his Precision Medicine Initiative earlier this year. Today, with the launch of FDA’s precisionFDA web platform, we’re a step closer to achieving that vision.

PrecisionFDA is an online, cloud-based portal that will allow scientists from industry, academia, government and other partners to come together to foster innovation and develop the science behind a method of “reading” DNA known as next-generation sequencing (or NGS). Next-generation sequencing allows scientists to compile a vast amount of data on a person’s exact order, or sequence, of DNA. Recognizing that each person’s DNA is slightly different, scientists can look for meaningful differences in DNA that can be used to assess a person’s risk of disease, possible response to treatment, and current state of health. Ultimately, what we learn about these differences could be used to design a treatment tailored to a specific individual.

The precisionFDA platform is a part of this larger effort and through its use we want to help scientists work toward the most accurate and meaningful discoveries. precisionFDA users will have access to a number of important tools to help them do this. These tools include reference genomes, such as “Genome in a Bottle,” a reference sample of DNA for validating human genome sequences developed by the National Institute of Standards and Technology. Users will also be able to compare their results to previously validated reference results as well as share their results with other users, track changes and obtain feedback.
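The kind of comparison described above can be sketched in a few lines. The following is a simplified, hypothetical illustration (not precisionFDA's actual implementation) of scoring a user's variant calls against a validated truth set such as Genome in a Bottle; variants are reduced to (chromosome, position, ref, alt) tuples, while real comparisons must also handle normalization, genotypes, and confident-region filtering:

```python
# Hypothetical sketch: scoring variant calls against a validated reference set.
# Variant representation is deliberately simplified to 4-tuples.

def compare_calls(test_calls, truth_calls):
    """Return precision, recall, and F1 for a set of variant calls."""
    test, truth = set(test_calls), set(truth_calls)
    tp = len(test & truth)   # calls confirmed by the reference
    fp = len(test - truth)   # calls absent from the reference
    fn = len(truth - test)   # reference variants that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Toy truth set and toy test calls (illustrative values only)
truth = {("chr1", 12345, "A", "G"), ("chr1", 67890, "T", "C"), ("chr2", 555, "G", "A")}
test = {("chr1", 12345, "A", "G"), ("chr2", 555, "G", "A"), ("chr3", 42, "C", "T")}
precision, recall, f1 = compare_calls(test, truth)
```

Sharing such scores against a common reference is what lets users benchmark pipelines against one another on the platform.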

Over the coming months we will engage users in improving the usability, openness and transparency of precisionFDA. One way we’ll achieve that is by placing the code for the precisionFDA portal on the world’s largest open source software repository, GitHub, so the community can further enhance precisionFDA’s features. Through such collaboration we hope to improve the quality and accuracy of genomic tests – work that will ultimately benefit patients.

precisionFDA leverages our experience establishing openFDA, an online community that provides easy access to our public datasets. Since its launch in 2014, openFDA has already resulted in many novel ways to use, integrate and analyze FDA safety information. We’re confident that employing such a collaborative approach to DNA data will yield important advances in our understanding of this fast-growing scientific field, information that will ultimately be used to develop new diagnostics, treatments and even cures for patients.

Taha A. Kass-Hout, M.D., M.S., is FDA’s Chief Health Informatics Officer and Director of FDA’s Office of Health Informatics. Elaine Johanson is the precisionFDA Project Manager.


The opinions expressed in this blog post are the authors’ only and do not necessarily reflect those of FDA or its employees.

So What Are the Other Successes With Such Open Science 2.0 Collaborative Networks?

In the following post there are highlighted examples of these Open Scientific Networks. As long as

  • transparency
  • equal contributions (lack of hierarchy)

exist, these networks can flourish and add interesting discourse.  Scientists are already relying on these networks to collaborate and share; however, resistance by certain members of an “elite” can still exist.  Social media platforms are now democratizing this new Science 2.0 effort.  In addition, the efforts of multiple biocurators (who mainly work for love of science) have organized the plethora of data (genomic, proteomic, and literature) to provide ease of access and analysis.

Science and Curation: The New Practice of Web 2.0

Curation: an Essential Practice to Manage “Open Science”

The web 2.0 gave birth to new practices motivated by the will to have broader and faster cooperation in a more free and transparent environment. We have entered the era of an “open” movement: “open data”, “open software”, etc. In science, expressions like “open access” (to scientific publications and research results) and “open science” are used more and more often.

Curation and Scientific and Technical Culture: Creating Hybrid Networks

Another area, where there are most likely fewer barriers, is scientific and technical culture. This broad term involves different actors such as associations, companies, universities’ communication departments, CCSTI (French centers for scientific, technical and industrial culture), journalists, etc. A number of these actors do not limit their work to popularizing the scientific data; they also consider they have an authentic mission of “culturing” science. The curation practice thus offers a better organization and visibility to the information. The sought-after benefits will be different from one actor to the next.

Scientific Curation Fostering Expert Networks and Open Innovation: Lessons from Clive Thompson and others

  • Using Curation and Science 2.0 to build Trusted, Expert Networks of Scientists and Clinicians

Given the aforementioned problems of:

        I.            the complex and rapid deluge of scientific information, which has created a need for more context-driven scientific search and discourse

      II.            the need for a collaborative, open environment to produce transformative innovation

    III.            the need for alternative ways to disseminate scientific findings

curation offers a path forward:

        I.            Curation exists beyond the review: curation decreases time for assessment of current trends, adding multiple insights and analyses WITH an underlying METHODOLOGY (discussed below) while NOT acting as mere reiteration or regurgitation

      II.            Curation provides insights from the WHOLE scientific community on multiple WEB 2.0 platforms

    III.            Curation makes use of new computational and Web-based tools to provide interoperability of data and reporting of findings (shown in Examples below)

Therefore a discussion is given on methodologies, definitions of best practices, and tools developed to assist the content curation community in this endeavor.

However, another issue would be individual bias: if these networks are closed, protocols need to be devised to reduce bias from individual investigators and clinicians.  This is where CONSENSUS built from OPEN ACCESS DISCOURSE would be beneficial, as discussed in the following post:

Risk of Bias in Translational Science

As per the article

Risk of bias in translational medicine may take one of three forms:

  1. a systematic error of methodology as it pertains to measurement or sampling (e.g., selection bias),
  2. a systematic defect of design that leads to estimates of experimental and control groups, and of effect sizes that substantially deviate from true values (e.g., information bias), and
  3. a systematic distortion of the analytical process, which results in a misrepresentation of the data with consequential errors of inference (e.g., inferential bias).

This post highlights many important points related to bias, but in summary there can be methodologies and protocols devised to eliminate such bias.  Risk of bias can seriously adulterate the internal and the external validity of a clinical study, and, unless it is identified and systematically evaluated, can seriously hamper the process of comparative effectiveness and efficacy research and analysis for practice. The Cochrane Group and the Agency for Healthcare Research and Quality have independently developed instruments for assessing the meta-construct of risk of bias. The present article begins to discuss this dialectic.

  • Information dissemination to all stakeholders is key to increasing their health literacy in order to ensure their full participation
  • threats to internal and external validity represent specific aspects of systematic errors (i.e., bias) in design, methodology and analysis

So what about the safety and privacy of Data?

A while back I did a post and some interviews on how doctors in developing countries are using social networks to communicate with patients, either over established networks like Facebook or more private in-house networks.  In addition, these doctor-patient relationships in developing countries are remote, using the smartphone to communicate with rural patients who don’t have ready access to their physicians.

Located in the post Can Mobile Health Apps Improve Oral-Chemotherapy Adherence? The Benefit of Gamification.

I discuss some of these problems in the following paragraph and associated posts below:

Mobile Health Applications on Rise in Developing World: Worldwide Opportunity

According to International Telecommunication Union (ITU) statistics, world-wide mobile phone use has expanded tremendously in the past 5 years, reaching almost 6 billion subscriptions. By the end of this year it is estimated that over 95% of the world’s population will have access to mobile phones/devices, including smartphones.

This presents a tremendous and cost-effective opportunity in developing countries, and especially rural areas, for physicians to reach patients using mHealth platforms.

How Social Media, Mobile Are Playing a Bigger Part in Healthcare

E-Medical Records Get A Mobile, Open-Sourced Overhaul By White House Health Design Challenge Winners

In summary, although there are restrictions here in the US governing what information can be disseminated over social media networks, developing countries appear to have less well-defined regulations, as they are more dependent on these types of social networks given the difficulties in patient-physician access.

Therefore the question will be Who Will Protect The Data?

For some interesting discourse please see the following post

Atul Butte Talks on Big Data, Open Data and Clinical Trials



Read Full Post »

Cancer Biology and Genomics for Disease Diagnosis (Vol. I) Now Available for Amazon Kindle

Cancer Biology and Genomics for Disease Diagnosis (Vol. I) Now Available for Amazon Kindle

Reporter: Stephen J Williams, PhD

Leaders in Pharmaceutical Business Intelligence would like to announce the First volume of their BioMedical E-Book Series C: e-Books on Cancer & Oncology

Volume One: Cancer Biology and Genomics for Disease Diagnosis

which is now available on Amazon Kindle.

This e-Book is a comprehensive review of recent Original Research on Cancer & Genomics, including related opportunities for Targeted Therapy, written by Experts, Authors, Writers. This e-Book highlights some of the recent trends and discoveries in cancer research and cancer treatment, with particular attention to how new technological and informatics advancements have ushered in paradigm shifts in how we think about, diagnose, and treat cancer. The results of Original Research gain added value for the e-Reader through the Methodology of Curation. The e-Book’s articles have been published in the Open Access Online Scientific Journal since April 2012.  All new articles on this subject will continue to be incorporated as published, with periodic updates.

We invite e-Readers to write an Article Review on Amazon for this e-Book. All forthcoming BioMed e-Book Titles can be viewed at:

Leaders in Pharmaceutical Business Intelligence, an Open Access Online Scientific Journal launched in April 2012, is a scientific, medical and business multi-expert authoring environment in several domains of the life sciences, pharmaceutical, healthcare & medicine industries. The venture operates as an online scientific intellectual exchange at their website, curating and reporting on frontiers in biomedical and biological sciences, healthcare economics, pharmacology, pharmaceuticals & medicine. In addition, the venture publishes a Medical E-book Series available on Amazon’s Kindle platform.

Analyzing and sharing the vast and rapidly expanding volume of scientific knowledge has never been so crucial to innovation in the medical field. We are addressing the need to overcome this scientific information overload by:

  • delivering curation and summary interpretations of latest findings and innovations
  • on an open-access, Web 2.0 platform, with the future goal of providing primarily concept-driven search
  • providing a social platform for scientists and clinicians to enter into discussion using social media
  • compiling recent discoveries and issues in yearly-updated Medical E-book Series on Amazon’s mobile Kindle platform

This curation offers better organization and visibility to the critical information useful for the next innovations in academic, clinical, and industrial research by providing these hybrid networks.

Table of Contents for Cancer Biology and Genomics for Disease Diagnosis


Introduction  The evolution of cancer therapy and cancer research: How did we get here?

Part I. Historical Perspective of Cancer Demographics, Etiology, and Progress in Research

Chapter 1:  The Occurrence of Cancer in World Populations

Chapter 2:  Rapid Scientific Advances Change Our View on How Cancer Forms

Chapter 3:  A Genetic Basis and Genetic Complexity of Cancer Emerge

Chapter 4: How Epigenetic and Metabolic Factors Affect Tumor Growth

Chapter 5: Advances in Breast and Gastrointestinal Cancer Research Support Hope for Cure

Part II. Advent of Translational Medicine, “omics”, and Personalized Medicine Ushers in New Paradigms in Cancer Treatment and Advances in Drug Development

Chapter 6:  Treatment Strategies

Chapter 7:  Personalized Medicine and Targeted Therapy

Part III. Translational Medicine, Genomics, and New Technologies Converge to Improve Early Detection

Chapter 8:  Diagnosis                                     

Chapter 9:  Detection

Chapter 10:  Biomarkers

Chapter 11:  Imaging In Cancer

Chapter 12: Nanotechnology Imparts New Advances in Cancer Treatment, Detection, &  Imaging                                 

Epilogue by Larry H. Bernstein, MD, FACP: Envisioning New Insights in Cancer Translational Biology


Read Full Post »

Track 9 Pharmaceutical R&D Informatics: Collaboration, Data Science and Biologics @ BioIT World, April 29 – May 1, 2014 Seaport World Trade Center, Boston, MA

Aviva Lev-Ari, PhD, RN


April 30, 2014


Big Data and Data Science in R&D and Translational Research

10:50 Chairperson’s Remarks

Ralph Haffner, Local Area Head, Research Informatics, F. Hoffmann-La Roche AG

11:00 Can Data Science Save Pharmaceutical R&D?

Jason M. Johnson, Ph.D., Associate Vice President,

Scientific Informatics & Early Development and Discovery Sciences IT, Merck

Although both premises – that the viability of pharmaceutical R&D is mortally threatened and that modern “data science” is a relevant superhero – are suspect, it is clear that R&D productivity is progressively declining and many areas of R&D suboptimally use data in decision-making. We will discuss some barriers to our overdue information revolution, and our strategy for overcoming them.

11:30 Enabling Data Science in Externalized Pharmaceutical R&D

Sándor Szalma, Ph.D., Head, External Innovation, R&D IT,

Janssen Research & Development, LLC

Pharmaceutical companies have historically been involved in many external partnerships. With recent proliferation of hosted solutions and the availability of cost-effective, massive high-performance computing resources there is an opportunity and a requirement now to enable collaborative data science. We discuss our experience in implementing robust solutions and pre-competitive approaches to further these goals.

12:00 pm Co-Presentation: Sponsored by

Collaborative Waveform Analytics: How New Approaches in Machine Learning and Enterprise Analytics will Extend Expert Knowledge and Improve Safety Assessment

  • Tim Carruthers, CEO, Neural ID
  • Scott Weiss, Director, Product Strategy, IDBS

Neural ID’s Intelligent Waveform Service (IWS) delivers the only enterprise biosignal analysis solution combining machine learning with human expertise. A collaborative platform supporting all phases of research and development, IWS addresses a significant unmet need, delivering scalable analytics and a single interoperable data format to transform productivity in life sciences. By enabling analysis from BioBook (IDBS) to original biosignals, IWS enables users of BioBook to evaluate cardio safety assessment across the R&D lifecycle.

12:15 Building a Life Sciences Data Lake: A Useful Approach to Big Data

Sponsored by

Ben Szekely, Director & Founding Engineer,

Cambridge Semantics

The promise of Big Data is in its ability to give us technology that can cope with overwhelming volume and variety of information that pervades R&D informatics. But the challenges are in practical use of disconnected and poorly described data. We will discuss: Linking Big Data from diverse sources for easy understanding and reuse; Building R&D informatics applications on top of a Life Sciences Data Lake; and Applications of a Data Lake in Pharma.

12:40 Luncheon Presentation I:

Sponsored by

Chemical Data Visualization in Spotfire

Matthew Stahl, Ph.D., Senior Vice President,

OpenEye Scientific Software

Spotfire deftly facilitates the analysis and interrogation of data sets. Domain specific data, such as chemistry, presents a set of challenges that general data analysis tools have difficulty addressing directly. Fortunately, Spotfire is an extensible platform that can be augmented with domain specific abilities. Spotfire has been augmented to naturally handle cheminformatics and chemical data visualization through the integration of OpenEye toolkits. The OpenEye chemistry extensions for Spotfire will be presented.

1:10 Luncheon Presentation II 

1:50 Chairperson’s Remarks

Yuriy Gankin, Ph.D., Co-Founder and CSO, GGA Software Services

1:55 Enable Translational Science by Integrating Data across the R&D Organization

Christian Gossens, Ph.D., Global Head, pRED Development Informatics Team,

pRED Informatics, F. Hoffmann-La Roche Ltd.

Multi-national pharmaceutical companies face an amazingly complex information management environment. The presentation will show that a systematic system landscaping approach is an effective tool to build a sustainable integrated data environment. Data integration is not mainly about technology, but the use and implementation of it.

2:25 The Role of Collaboration in Enabling Great Science in the Digital Age: The BARD Data Science Case Study

Andrea DeSouza, Director, Informatics & Data Analysis,

Broad Institute

BARD (BioAssay Research Database) is a new, public web portal that uses a standard representation and common language for organizing chemical biology data. In this talk, I describe how data professionals and scientists collaborated to develop BARD, organize the NIH Molecular Libraries Program data, and create a new standard for bioassay data exchange.

May 1, 2014


10:30 Chairperson’s Opening Remarks

John Koch, Director, Scientific Information Architecture & Search, Merck

10:35 The Role of a Data Scientist in Drug Discovery and Development

Anastasia (Khoury) Christianson, Ph.D., Head, Translational R&D IT, Bristol-Myers Squibb

A major challenge in drug discovery and development is finding all the relevant data, information, and knowledge to ensure informed, evidence-based decisions in drug projects, including meaningful correlations between preclinical observations and clinical outcomes. This presentation will describe where and how data scientists can support pharma R&D.

11:05 Designing and Building a Data Sciences Capability to Support R&D and Corporate Big Data Needs

Shoibal Datta, Ph.D., Director, Data Sciences, Biogen Idec

To achieve Biogen Idec’s strategic goals, we have built a cross-disciplinary team to focus on key areas of interest and the required capabilities. To provide a reusable set of IT services we have broken down our platform to focus on the Ingestion, Digestion, Extraction and Analysis of data. In this presentation, we will outline how we brought focus and prioritization to our data sciences needs, our data sciences architecture, lessons learned and our future direction.

11:35 Data Experts: Improving Translational Drug-Development Efficiency

Sponsored by

Jamie MacPherson, Ph.D., Consultant, Tessella

We report on a novel approach to translational informatics support: embedding ‘Data Experts’ within drug-project teams. Data experts combine first-line informatics support and business analysis. They help teams exploit data sources that are diverse in type, scale and quality; analyse user requirements; and prototype potential software solutions. We then explore scaling this approach from a specific drug-development team to all.


Read Full Post »

Larry H Bernstein, MD, FCAP, Author and Curator

Chief, Scientific Communication

Leaders in Pharmaceutical Intelligence

with contributions from JEDS Rosalis, Brazil
and Radislov Rosov, Univ of Virginia, VA, USA

A Brief Curation of Proteomics, Metabolomics, and Metabolism

This article is a continuation of a series of elaborations on the recent and accelerated scientific discoveries that are enlarging the scope and integration of biological and medical knowledge, leading to new drug discoveries.  The work that has led us to this point has roots that go back 150 years, to studies in the mid-nineteenth century, with the emergence of microbiology, physiology, pathology, botany, chemistry and physics, and the laying down of a mechanistic approach divergent from descriptive observation in the twentieth century. Medicine took on the obligation to renew the method of training physicians after the Flexner Report (the Flexner Report of 1910 transformed the nature and process of medical education in America, with a resulting elimination of proprietary schools), funded by the Carnegie Foundation.  Johns Hopkins University Medical School became the first to adopt the model, as did Harvard, Yale, the University of Chicago, and others.

The advances in biochemistry, genetics and genomics were large, as was structural organic chemistry in the remainder of the century.  The advances in applied mathematics and in instrumental analysis opened a new gateway into the 21st century with the Human Genome Project, the Proteome Library, Signaling Pathways, and the Metabolomes – human, microbial, and plant.

I shall elaborate on how the key processes of life are being elucidated as these interrelated disciplines converge.  I shall not be covering in great detail the contribution of the genetic code and transcription because they have been covered at great length in this series.

Part I.  The foundation for the emergence of a revitalized molecular biology and biochemistry.

In a series of discussions with Jose des Salles Roselino (Brazil) over a period of months, we have come to an important line of reasoning. The DNA-to-protein link goes from triplet sequence to amino acid sequence: the realm of genetics. Further, protein conformation, activity and function require that environmental and microenvironmental factors be considered (biochemistry).  This has been opened in several articles preceding this one.

In the cAMP-coupled hormonal response the transfer of conformation from protein to protein is paramount. For instance, if your scheme goes beyond cAMP, it will show an effect over a self-assembly (inhibitor protein and protein kinase). Therefore, sequence alone does not explain the conformation, activity and function of regulatory proteins. Recall that sequence is primary structure, determined by the translation of the code, while secondary structure (helices and sheets) is determined by hydrogen bonding along the backbone. There is another level of structure, tertiary structure, that is molded by steric influences of near neighbors, by noncovalent attractions and repulsions, and by disulfide bonds.

A few comments (contributed by Assoc. Prof. JEDS Roselino) are in order to stress the importance of self-assembly (Prigogine, R. A. Marcus, conformation energy) in a subject that is the best for this connection. We have to stress again that in the cAMP-coupled hormonal response the transfer of conformation from protein to protein is paramount. For instance, in case the reaction sequence follows beyond the production of the second messenger, as in the case of cAMP, this second messenger will remove a self-assembly of inhibitor protein with the enzyme protein kinase. Therefore, sequence alone does not explain the conformation, activity and function of regulatory proteins. In this case, if this important mechanism had not been ignored, the work of Stanley Prusiner would most certainly have been recognized earlier, and “rogue” proteins would not have been seen as so rogue as some assumed. For the general idea of the importance of self-assembly versus change in covalent modification of proteins, see R. A. Kahn and A. G. Gilman (1984) J. Biol. Chem. 259(10), pp. 6235–6240. In this case, trimeric or dimeric G does not matter.

Signal transduction tutorial
G proteins in the G-protein-coupled receptor scheme are presented following a unidirectional series of arrows. This is adequate to convey the idea of information being transferred from outside the cell towards the cell’s interior (therefore, against the dogma that says all information moves from DNA to RNA to protein). It is important to consider the following: the entire process is driven by a very delicate equilibrium between possible conformational states of the proteins. Empty receptors have very low affinity for G proteins. On the other hand, hormone-bound receptors have a change in conformation that increases their affinity for the G-trimer. When hormone receptors bind to G-trimers two things happen:

  1. Receptors transfer conformational information to
    the G-trimer, and
  2. the G-trimer transfers information back to the
    hormone-receptor complex.

In the first case, the dissociated G protein exchanges GDP for GTP and has its affinity for the cyclase increased, while by the same interaction the receptor releases the hormone, which constitutes the first required step of the signal. After this first interaction step, the second and final step of the transduction system is represented by an opposite arrow. When the G-protein + GTP complex interacts with the cyclase, two things happen:

  1. It changes the cyclase to an active conformation,
    starting the production of cAMP as the single
    forward arrow of the scheme. However, the
    interaction also causes a backward effect.
  2. It activates the GTPase activity of this subunit,
    and the breakdown of GTP to GDP moves this
    subunit back to the initial inactive trimeric
    G complex.

This was very well studied when the actions of cholera toxin required better understanding. Cholera toxin changes the GTPase subunit by ADP-ribosylation (a covalent and far more stable change in proteins), producing a permanent conformation of the GTP-bound G subunit. This keeps the cyclase in a permanently active conformation, because ADP-ribosylation inhibits the GTPase activity required to put an end to the hormonal signal.

The study made while G-proteins were considered a dimer still holds, despite its limited vision of the real complexity of the transduction system. It was also possible to get this very same “freezing” in the active state using stable GTP analogues. This transduction system is one of the best examples of the delicate mechanisms of conformational interaction of proteins. Furthermore, this system also shows, on the opposite side of our reasoning scheme, how covalent changes are adequate for more stable changes than those mediated by van der Waals forces between proteins. Yet these delicate forces are the same ones involved when the Sc-prion transfers its rogue conformation to c-prion proteins, and in other similar events.

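The GDP/GTP cycle described above can be caricatured as a tiny state machine. The class and flag names below are illustrative simplifications (the real system is a continuum of conformational equilibria, not discrete states), and the `gtpase_active=False` case stands in for the ADP-ribosylated subunit produced by cholera toxin:

```python
# Minimal state-machine sketch of the G-protein / cyclase cycle.
# Names and boolean states are illustrative, not a biochemical model.

class GProtein:
    def __init__(self):
        self.bound_nucleotide = "GDP"   # inactive trimer carries GDP

    def activate(self):
        # Hormone-bound receptor catalyzes GDP -> GTP exchange,
        # raising the subunit's affinity for the cyclase.
        self.bound_nucleotide = "GTP"

    def hydrolyze(self, gtpase_active=True):
        # Intrinsic GTPase activity returns the subunit to the inactive
        # trimer. ADP-ribosylation by cholera toxin corresponds to
        # gtpase_active=False: GTP is never hydrolyzed.
        if gtpase_active:
            self.bound_nucleotide = "GDP"

def cyclase_active(g):
    # The cyclase is held in its active conformation only by GTP-bound G.
    return g.bound_nucleotide == "GTP"

g = GProtein()
g.activate()                         # hormone-receptor interaction
assert cyclase_active(g)             # cAMP is being produced
g.hydrolyze()                        # normal signal termination
assert not cyclase_active(g)

g.activate()
g.hydrolyze(gtpase_active=False)     # ADP-ribosylated subunit: signal locked on
assert cyclase_active(g)
```

The last three lines capture why the toxin's covalent change is so much more stable than the van der Waals interactions that normally terminate the signal.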
The Jacob-Monod Model

A combination of genetic and biochemical experiments in bacteria led to the initial recognition of

  1. protein-binding regulatory sequences associated with genes, and
  2. proteins whose binding to a gene’s regulatory sequences
    either activates or represses its transcription.

These key components underlie the ability of both prokaryotic and eukaryotic cells to turn genes on and off. These experimental findings led to a general model of bacterial transcription control.

Gene control serves to allow a single cell to adjust to changes in its nutritional environment so that its growth and division can be optimized. Thus, the prime focus of research has been on genes that encode inducible proteins whose production varies depending on the nutritional status of the cells. In eukaryotes, the most characteristic and biologically far-reaching purpose of gene control, distinct from that in single-celled organisms, is the regulation of the genetic program that underlies embryological development and tissue differentiation.

The principles of transcription have already been described in this
series under the translation of the genetic code into amino acids
that are the building blocks for proteins.

E. coli can use either glucose or other sugars, such as the disaccharide lactose, as the sole source of carbon and energy. When E. coli cells are grown in a glucose-containing medium, the activity of the enzymes needed to metabolize lactose is very low. When these cells are switched to a medium containing lactose but no glucose, the activities of the lactose-metabolizing enzymes increase. Early studies showed that the increase in the activity of these enzymes resulted from the synthesis of new enzyme molecules, a phenomenon termed induction. The enzymes induced in the presence of lactose are encoded by the lac operon, which includes two genes, Z and Y, that are required for metabolism of lactose, and a third gene. The lac Y gene encodes lactose permease, which spans the E. coli cell membrane and uses the energy available from the electrochemical gradient across the membrane to pump lactose into the cell. The lac Z gene encodes β-galactosidase, which splits the disaccharide lactose into the monosaccharides glucose and galactose, which are further metabolized through the action of enzymes encoded in other operons. The third gene encodes thiogalactoside transacetylase.

Synthesis of all three enzymes encoded in the lac operon is rapidly
induced when E. coli cells are placed in a medium containing lactose
as the only carbon source and repressed when the cells are switched
to a medium without lactose. Thus all three genes of the lac operon
are coordinately regulated. The lac operon in E. coli provides one
of the earliest and still best-understood examples of gene control.
Much of the pioneering research on the lac operon was conducted by
Francois Jacob, Jacques Monod, and their colleagues in the 1960s.

Some molecules similar in structure to lactose can induce expression of the lac operon genes even though they cannot be hydrolyzed by β-galactosidase. Such small molecules (i.e., smaller than proteins) are called inducers. One of these, isopropyl-β-D-thiogalactoside, abbreviated IPTG, is particularly useful in genetic studies of the lac operon because it can diffuse into cells and is not metabolized. Insight into the mechanisms controlling synthesis of β-galactosidase and lactose permease came from the study of mutants in which control of β-galactosidase expression was abnormal, using a colorimetric assay for β-galactosidase.

When the cells are exposed to chemical mutagens before plating on X-gal/glucose plates, rare blue colonies appear; when cells from these blue colonies are recovered and grown in media containing glucose, they overexpress all the genes of the lac operon. These cells are called constitutive mutants because they fail to repress the lac operon in media lacking lactose and instead continuously express the enzymes; the mutated genes were mapped to a region on the E. coli chromosome. This led to the conclusion that these cells had a defect in a protein, the lac repressor, that normally represses expression of the lac operon in the absence of lactose, and that this repressor blocks transcription by binding to a site on the E. coli genome where transcription of the lac operon is initiated. In lactose medium, the inducer binds to the lac repressor and decreases its affinity for the repressor-binding site on the DNA, causing the repressor to unbind the DNA. Thereby transcription of the lac operon is initiated, leading to synthesis of β-galactosidase, lactose permease, and thiogalactoside transacetylase.

Jacob and Monod model of transcriptional regulation of the lac operon

Next, Jacob and Monod isolated mutants that expressed the lac operon
constitutively even when two copies of the wild-type lacI gene
encoding the lac repressor were present in the same cell; these
constitutive mutations mapped to one end of the lac operon, as the
model predicted. Further, rare cells carry mutations, located in a
region called the promoter, that block initiation of transcription by
RNA polymerase.

lac I+ gene is trans-acting, & encodes a protein, which binds to a lac operator

They further demonstrated that the lacI mutations are trans-acting:
the lacI+ gene encodes a diffusible protein, the lac repressor, that
binds to the lac operator. In contrast, the Oc mutations are
cis-acting, since they prevent binding of the lac repressor to the
operator, and mutations in the lac promoter are likewise cis-acting,
since they alter the binding site for RNA polymerase. In general,
trans-acting genes that regulate expression of genes on other DNA
molecules encode diffusible products. In most cases these are
proteins, but in some cases RNA molecules can act in trans to
regulate gene expression.

According to the Jacob and Monod model of transcriptional control,
transcription of the lac operon, which encodes three inducible
proteins, is repressed by binding of lac repressor protein to the
operator sequence.
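
The Jacob-Monod control logic described above can be sketched as a toy Boolean model (a Python sketch; the function and argument names are my own illustrative labels, not standard genetic nomenclature):

```python
def lac_operon_transcribed(inducer_present: bool,
                           repressor_functional: bool = True,
                           operator_functional: bool = True) -> bool:
    """Toy Boolean model: is the lac operon transcribed?

    - Wild type, no inducer: repressor binds the operator, so OFF.
    - Inducer (lactose or IPTG) present: repressor released, so ON.
    - lacI mutant (repressor_functional=False): constitutive ON.
    - Oc mutant (operator_functional=False): repressor cannot bind, so ON.
    """
    repressor_bound = (repressor_functional
                       and operator_functional
                       and not inducer_present)
    return not repressor_bound

# Wild type is inducible:
assert lac_operon_transcribed(inducer_present=True)
assert not lac_operon_transcribed(inducer_present=False)
# Constitutive mutants express the operon even without inducer:
assert lac_operon_transcribed(False, repressor_functional=False)
assert lac_operon_transcribed(False, operator_functional=False)
```

The model captures only the negative control by the repressor discussed here, not the operon's kinetics.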

(Section 10.1, Bacterial Gene Control: The Jacob-Monod Model.)
The book is accessible via its search feature.

Comment: This seminal work was done a half century ago. It was a
decade after the Watson-Crick model for DNA. The model is
elaborated for the Eukaryote in the examples that follow.

(The next two articles were called to my attention by R. Bosov at
the University of Virginia.)

An acetate switch regulates stress erythropoiesis

M Xu,  JS Nagati, Ji Xie, J Li, H Walters, Young-Ah Moon, et al.
Nature Medicine 10 Aug 2014; 20: 1018–1026.


The hormone erythropoietin (EPO), synthesized in the kidney or liver
of adult mammals, controls erythrocyte production and is regulated by
the stress-responsive transcription factor hypoxia-inducible factor-2
(HIF-2). HIF-2α acetylation and efficient HIF-2–dependent EPO
induction during hypoxia require the lysine acetyltransferase
CREB-binding protein (CBP). These processes also require
acetate-dependent acetyl-CoA synthetase 2 (ACSS2), as follows: acetate
levels rise during hypoxia, and ACSS2 is required for HIF-2α
acetylation, CBP–HIF-2α complex formation, CBP–HIF-2α recruitment to
the EPO enhancer and induction of EPO gene expression in human Hep3B
hepatoma cells and in EPO-generating organs of hypoxic or acutely
anemic mice. In acutely anemic mice, acetate supplementation augments
stress erythropoiesis in an ACSS2-dependent manner. Moreover, in mouse
models of acquired and inherited chronic anemia, acetate
supplementation increases EPO expression and the resting hematocrit.
Thus, a mammalian stress-responsive acetate switch controls HIF-2
signaling and EPO induction during pathophysiological states marked by
tissue hypoxia.
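
The acetate switch summarized above can likewise be caricatured as a small Boolean chain (a Python sketch of the dependency structure only; names such as epo_induced and acss2_active are my own illustrative labels, not the authors'):

```python
def epo_induced(hypoxia: bool, acss2_active: bool = True) -> bool:
    """Toy model of ACSS2-dependent EPO induction during hypoxia."""
    acetate_high = hypoxia                              # acetate levels rise under hypoxia
    hif2a_acetylated = acetate_high and acss2_active    # ACSS2-dependent acetylation step
    cbp_hif2_complex = hif2a_acetylated                 # CBP binds acetylated HIF-2α
    # the CBP–HIF-2α complex is recruited to the EPO enhancer;
    # in this sketch that simply requires hypoxic conditions as well
    return cbp_hif2_complex and hypoxia

assert epo_induced(hypoxia=True)
assert not epo_induced(hypoxia=True, acss2_active=False)  # ACSS2 loss blocks induction
assert not epo_induced(hypoxia=False)
```

This captures only the reported dependencies (hypoxia, ACSS2, CBP), not kinetics or the dose effects of acetate supplementation.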

Figure 1: Acss2 controls HIF-2 signaling in hypoxic cells.
Time course of endogenous HIF-2α acetylation during hypoxia following
immunoprecipitation (IP) of HIF-2α from whole-cell extracts and detection
of acetylated lysines by immunoblotting (IB).

Figure 2: Acss2 regulates hypoxia-induced renal Epo expression in mice.

Figure 3: Acute anemia induces Acss2-dependent HIF-2 signaling in mice.

Figure 4: An acetate switch regulates Cbp–HIF-2 interactions in cells.
(a) HIF-2α acetylation following immunoprecipitation of endogenous
HIF-2α and detection by immunoblotting with antibodies to acetylated
lysine or HIF-2α.

Figure 5: Acss2 signaling in cells requires intact HIF-2 acetylation.

Figure 6: Acetate facilitates recovery from anemia.


(a) Serial hematocrits of CD1 wild-type female mice after PHZ treatment, followed
by once daily per os (p.o.) supplementation with water vehicle (Veh; n = 7 mice),
GTA (n = 6 mice), GTB (n = 8 mice) or GTP (n = 7 mice) (single measurem…

See also:

1. Bunn, H.F. & Poyton, R.O. Oxygen sensing and molecular adaptation to
   hypoxia. Physiol. Rev. 76, 839–885 (1996).
2. Richalet, J.P. Oxygen sensors in the organism: examples of regulation
   under altitude hypoxia in mammals. Comp. Biochem. Physiol. A Physiol.
   118, 9–14 (1997).
3. Koury, M.J. Erythropoietin: the story of hypoxia and a finely regulated
   hematopoietic hormone. Exp. Hematol. 33, 1263–1270 (2005).
4. Wang, G.L., Jiang, B.H., Rue, E.A. & Semenza, G.L. Hypoxia-inducible
   factor 1 is a basic-helix-loop-helix-PAS heterodimer regulated by
   cellular O2 tension. Proc. Natl. Acad. Sci. USA 92, 5510–5514 (1995).
5. Chen, R. et al. The acetylase/deacetylase couple CREB-binding
   protein/sirtuin 1 controls hypoxia-inducible factor 2 signaling.
   J. Biol. Chem. 287, 30800–30811 (2012).
6. Papandreou, I., Cairns, R.A., Fontana, L., Lim, A.L. & Denko, N.C.
   HIF-1 mediates adaptation to hypoxia by actively down-regulating
   mitochondrial oxygen consumption. Cell Metab. 3, 187–197 (2006).

14. Kim, J.W., Tchernyshyov, I., Semenza, G.L. & Dang, C.V. HIF-1-mediated
    expression of pyruvate dehydrogenase kinase: a metabolic switch required
    for cellular adaptation to hypoxia. Cell Metab. 3, 177–185 (2006).

16. Fujino, T., Kondo, J., Ishikawa, M., Morikawa, K. & Yamamoto, T.T.
    Acetyl-CoA synthetase 2, a mitochondrial matrix enzyme involved in the
    oxidation of acetate. J. Biol. Chem. 276, 11420–11426 (2001).

17. Luong, A., Hannah, V.C., Brown, M.S. & Goldstein, J.L. Molecular
    characterization of human acetyl-CoA synthetase, an enzyme regulated by
    sterol regulatory element-binding proteins. J. Biol. Chem. 275,
    26458–26466 (2000).

20. Wellen, K.E. et al. ATP-citrate lyase links cellular metabolism to
    histone acetylation. Science 324, 1076–1080 (2009).

24. McBrian, M.A. et al. Histone acetylation regulates intracellular pH.
    Mol. Cell 49, 310–321 (2013).

Asymmetric mRNA localization contributes to fidelity and sensitivity
of spatially localized systems

Robert J Weatheritt, Toby J Gibson & M Madan Babu
Nature Structural & Molecular Biology 21, 833–839 (2014) 

Although many proteins are localized after translation, asymmetric
protein distribution is also achieved by translation after mRNA localization.
Why are certain mRNAs transported to a distal location and translated
on-site? Here we undertake a systematic, genome-scale study of
asymmetrically distributed protein and mRNA in mammalian cells.
Our findings suggest that asymmetric protein distribution by mRNA
localization enhances interaction fidelity and signaling sensitivity.
Proteins synthesized at distal locations frequently contain intrinsically
disordered segments. These regions are generally rich in assembly-
promoting modules and are often regulated by post-translational
modifications. Such proteins are tightly regulated but display distinct
temporal dynamics upon stimulation with growth factors. Thus, proteins
synthesized on-site may rapidly alter proteome composition and
act as dynamically regulated scaffolds to promote the formation
of reversible cellular assemblies. 
Our observations are consistent
across multiple mammalian species, cell types and developmental stages,
suggesting that localized translation is a recurring feature of cell
signaling and regulation.

Figure 1: Classification and characterization of TAS and DSS proteins.

(a) The two major mechanisms for localizing proteins to distal sites in the cell.
(b) Data sets used to identify groups of DSS and TAS transcripts, as well as
DSS and TAS proteins in mouse neuroblastoma cells

Figure 2: Structural analysis of DSS proteins reveals an enrichment
in disordered regions.

(a,b) Distributions of the various structural properties of the DSS and TAS
proteins of the mouse neuroblastoma data sets (a), the mouse pseudopodia,
the rat embryonic sensory neuron data set and the adult sensory neuron data set (b).…

Figure 3: Analysis of DSS proteins reveals an enrichment for linear motifs, phase-
transition (i.e., higher-order assembly) promoting segments and PTM sites that act
as molecular switches.

(a,b) Distributions of the various regulatory and structural properties of the DSS
and TAS proteins of the mouse neuroblastoma data sets

Figure 4: Dynamic regulation of DSS transcripts and proteins.

Genome-wide quantitative measurements of gene expression of DSS (n = 289)
and TAS (n = 1,292) proteins in mouse fibroblast cells. DSS transcripts and
proteins have a lower abundance and shorter half-lives

Figure 5: An overview of the potential advantages conferred by distal-site protein
synthesis, inferred from our analysis.

Turquoise and red filled circles represent off-target and correct
interaction partners, respectively. Wavy lines indicate a disordered
region within a distal-site synthesis protein.

The identification of asymmetrically localized proteins and transcripts.

An illustrative explanation of the resolution of the study and the concept of asymmetric
localization of proteins and mRNA. In this example, on the left a neuron is divided into
its cell body and axon terminal, and transcriptome/proteo…

Graphs and boxplots of functional and structural properties for distal site synthesis
(DSS) proteins (red) and transport after synthesis (TAS) proteins (gray).
See Online Methods for details and legend of Figure 2 for a description of boxplots
and statistical tests.

See also:

1. Martin, K.C. & Ephrussi, A. mRNA localization: gene expression in the
   spatial dimension. Cell 136, 719–730 (2009).
2. Scott, J.D. & Pawson, T. Cell signaling in space and time: where
   proteins come together and when they're apart. Science 326, 1220–1224
   (2009).
3. Holt, C.E. & Bullock, S.L. Subcellular mRNA localization in animal
   cells and why it matters. Science 326, 1212–1216 (2009).
4. Jung, H., Gkogkas, C.G., Sonenberg, N. & Holt, C.E. Remote control of
   gene function by local translation. Cell 157, 26–40 (2014).

Regulation of metabolism by hypoxia-inducible factor 1.
Semenza GL.
Cold Spring Harb Symp Quant Biol. 2011; 76: 347–53.

The maintenance of oxygen homeostasis is critical for survival, and the
master regulator of this process in metazoan species is hypoxia-inducible
factor 1 (HIF-1), which

  • controls both O(2) delivery and utilization.

Under conditions of reduced O(2) availability,

  • HIF-1 activates the transcription of genes whose protein products
  • mediate a switch from oxidative to glycolytic metabolism.

HIF-1 is activated in cancer cells as a result of intratumoral hypoxia
and/or genetic alterations.

In cancer cells, metabolism is reprogrammed to

  • favor glycolysis even under aerobic conditions.

Pyruvate kinase M2 (PKM2) has been implicated in cancer growth and
metabolism, although the mechanism by which it exerts these effects is
unclear. Recent studies indicate that

PKM2 interacts with HIF-1α physically and functionally to

  1. stimulate the binding of HIF-1 at target genes,
  2. the recruitment of coactivators,
  3. histone acetylation, and
  4. gene transcription.

Interaction with HIF-1α is facilitated by

  • hydroxylation of PKM2 at proline-403 and -408 by PHD3.

Knockdown of PHD3

  • decreases glucose transporter 1, lactate dehydrogenase A, and
    pyruvate dehydrogenase kinase 1 expression;
  • decreases glucose uptake and lactate production; and
  • increases O(2) consumption.

The effect of PKM2/PHD3 is not limited to genes encoding metabolic
enzymes because VEGF is similarly regulated.

These results provide a mechanism by which PKM2

  • promotes metabolic reprogramming and

suggest that it plays a broader role in cancer progression than has
previously been appreciated.   PMID: 21785006   


Cadherins are thought to be the primary mediators of adhesion
between the cells
 of vertebrate animals, and also function in cell
adhesion in many invertebrates. The expression of numerous cadherins
during development is highly regulated, and the precise pattern of
cadherin expression plays a pivotal role in the morphogenesis of tissues
and organs. The cadherins are also important in the continued maintenance
of tissue structure and integrity. The loss of cadherin expression appears
to be highly correlated with the invasiveness of some types of tumors. Cadherin adhesion is also dependent on the presence of calcium ions
in the extracellular milieu.

The cadherin protein superfamily, defined as proteins containing a
cadherin-like domain, can be divided into several sub-groups. These include

  • the classical (type I) cadherins, which mediate adhesion at adherens junctions;
  • the highly-related type II cadherins;
  • the desmosomal cadherins found in desmosome junctions;
  • protocadherins, expressed only in the nervous system; and
  • atypical cadherin-like domain containing proteins.

Members of all but the atypical group have been shown to play a role
in intercellular adhesion.

Part II.  PKM2 and regulation of glycolysis

PKM2 regulates the Warburg effect and promotes ​HMGB1
release in sepsis

L Yang, M Xie, M Yang, Y Yu, S Zhu, W Hou, R Kang, …, & D Tang
Nature Communications 14 July 2014; 5: 4436.

Increasing evidence suggests the important role of metabolic reprogramming

  • in the regulation of the innate inflammatory response.

We provide evidence to support a novel role for the

  • pyruvate kinase M2 (PKM2)-mediated Warburg effect,

namely aerobic glycolysis,

  • in the regulation of high-mobility group box 1 (HMGB1) release.
  1. PKM2 interacts with hypoxia-inducible factor 1α (HIF1α) and
  2. activates the HIF-1α-dependent transcription of enzymes necessary
    for aerobic glycolysis in macrophages.

Knockdown of PKM2, HIF1α and glycolysis-related genes

  • uniformly decreases lactate production and HMGB1 release.

Similarly, a potential PKM2 inhibitor, shikonin,

  1. reduces serum lactate and HMGB1 levels, and
  2. protects mice from lethal endotoxemia and sepsis.

Collectively, these findings shed light on a novel mechanism for

  • metabolic control of inflammation by
  • regulating HMGB1 release and

highlight the importance of targeting aerobic glycolysis in the treatment
of sepsis and other inflammatory diseases.

  1. Figure 1: Glycolytic inhibitor 2-DG attenuates HMGB1 release by
    activated macrophages.
  2. Figure 2: Upregulated PKM2 promotes aerobic glycolysis and HMGB1
    release in activated macrophages.
  3. Figure 3: PKM2-mediated HIF1α activation is required for HMGB1
    release in activated macrophages.


ERK1/2-dependent phosphorylation and nuclear translocation of
PKM2 promotes the Warburg effect  

W Yang, Y Zheng, Y Xia, Ha Ji, X Chen, F Guo, CA Lyssiotis, & Zhimin Lu
Nature Cell Biology 2012; 14: 1295–1304
Corrigendum (January, 2013)

Pyruvate kinase M2 (PKM2) is upregulated in multiple cancer types and
contributes to the Warburg effect. We demonstrate that

  • EGFR-activated ERK2 binds directly to PKM2 Ile 429/Leu 431
  • through the ERK2 docking groove
  • and phosphorylates PKM2 at Ser 37, but
  • does not phosphorylate PKM1.

Phosphorylated PKM2 Ser 37

  1. recruits PIN1 for cis–trans isomerization of PKM2, which
  2. promotes PKM2 binding to importin α5
  3. and translocation of PKM2 to the nucleus.

Nuclear PKM2 acts as

  • a coactivator of β-catenin to
  • induce c-Myc expression.

This is followed by

  1. the upregulation of GLUT1 and LDHA and,
  2. in a positive feedback loop,
  • PTB-dependent PKM2 expression.

Replacement of wild-type PKM2 with

  • a nuclear translocation-deficient mutant (S37A)
  • blocks the EGFR-promoted Warburg effect
    and brain tumour development in mice.

In addition, levels of PKM2 Ser 37 phosphorylation

  • correlate with EGFR and ERK1/2 activity
    in human glioblastoma specimens.

Our findings highlight the importance of

  • nuclear functions of PKM2 in the Warburg effect
    and tumorigenesis.
  1. Figure 1: ERK is required for PKM2 nuclear translocation.
  2. Figure 2: ERK2 phosphorylates PKM2 Ser 37.
  3. Figure 3: PKM2 Ser 37 phosphorylation recruits PIN1.
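
As a reading aid, the ordered steps of this cascade can be written out as plain data (a Python sketch; the step wording condenses the bullets above and is my own, not the authors'):

```python
# EGFR -> ERK2 -> nuclear PKM2 cascade, in the order reported above
CASCADE = (
    "EGFR activation",
    "ERK2 docks on PKM2 (Ile 429 / Leu 431)",
    "ERK2 phosphorylates PKM2 at Ser 37",
    "PIN1-mediated cis-trans isomerization of PKM2",
    "PKM2 binds importin alpha-5 and enters the nucleus",
    "beta-catenin coactivation induces c-Myc",
    "upregulation of GLUT1, LDHA and PTB-dependent PKM2",
)

def downstream_of(step: str) -> tuple:
    """Return every step that follows `step` in the cascade."""
    return CASCADE[CASCADE.index(step) + 1:]

assert downstream_of("ERK2 phosphorylates PKM2 at Ser 37")[0] == \
    "PIN1-mediated cis-trans isomerization of PKM2"
```

The tuple encodes ordering only; it makes no claim about timing or branch points beyond what the abstract states.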

 Pyruvate kinase M2 activators promote tetramer formation
and suppress tumorigenesis

D Anastasiou, Y Yu, WJ Israelsen, Jian-Kang Jiang, MB Boxer, B Hong, et al.
Nature Chemical Biology  11 Oct 2012; 8: 839–847

Cancer cells engage in a metabolic program to

  • enhance biosynthesis and support cell proliferation.

The regulatory properties of pyruvate kinase M2 (PKM2)

  • influence altered glucose metabolism in cancer.

The interaction of PKM2 with phosphotyrosine-containing proteins

  • inhibits PKM2 enzyme activity and
  • increases the availability of glycolytic metabolites
  • supporting cell proliferation.

This suggests that high pyruvate kinase activity may suppress
tumor growth. Consistent with this,

  1. expression of PKM1, the pyruvate kinase isoform with high
    constitutive activity, or
  2. exposure to published small-molecule PKM2 activators
  • inhibits the growth of xenograft tumors.

Structural studies reveal that

  • small-molecule activators bind PKM2
  • at the subunit interaction interface,
  • a site that is distinct from that of the
    • endogenous activator fructose-1,6-bisphosphate (FBP).

However, unlike FBP,

  • binding of activators to PKM2 promotes
  • a constitutively active enzyme state that is resistant to inhibition
  • by tyrosine-phosphorylated proteins.

These data support the notion that small-molecule activation of PKM2
can interfere with anabolic metabolism.

  1. PKM1 expression in cancer cells impairs xenograft tumor growth.
  2. TEPP-46 and DASA-58 isoform specificity in vitro and in cells.

    (a) Structures of the PKM2 activators TEPP-46 and DASA-58. (b) Pyruvate kinase (PK) activity in purified recombinant human
    PKM1 or PKM2 expressed in bacteria in the presence of increasing
    concentrations of TEPP-46 or DASA-58. M1, PKM1;…

  3. Activators promote PKM2 tetramer formation and prevent
    inhibition by phosphotyrosine signaling.

Sucrose gradient ultracentrifugation profiles of purified recombinant
PKM2 (rPKM2) and the effects of FBP and TEPP-46 on PKM2 subunit stoichiometry.

Figure 5: Metabolic effects of cell treatment with PKM2 activators.
(a) Effects of TEPP-46, DASA-58 (both used at 30 μM) or PKM1
expression on the doubling time of H1299 cells under normoxia
(21% O2) or hypoxia (1% O2). (b) Effects of DASA-58 on lactate
production from glucose. The P value shown was ca…

EGFR has a tumour-promoting role in liver macrophages during
hepatocellular carcinoma formation

H Lanaya, A Natarajan, K Komposch, L Li, N Amberg, …, & Maria Sibilia
Nature Cell Biology 31 Aug 2014

Tumorigenesis has been linked with macrophage-mediated chronic
inflammation and diverse signaling pathways, including the epidermal
growth factor receptor (EGFR) pathway. EGFR is expressed in liver
macrophages in both human HCC and in a mouse HCC model. Mice
lacking EGFR in macrophages show impaired hepatocarcinogenesis,
whereas mice lacking EGFR in hepatocytes develop HCC owing to increased
hepatocyte damage and compensatory proliferation. EGFR is required
in liver macrophages to transcriptionally induce interleukin-6 following
interleukin-1 stimulation, which triggers hepatocyte proliferation and HCC.
Importantly, the presence of EGFR-positive liver macrophages in HCC
patients is associated with poor survival. This study demonstrates a

  • tumour-promoting mechanism for EGFR in non-tumour cells,
  • which could lead to more effective precision medicine strategies.
  1. HCC formation in mice lacking EGFR in hepatocytes or all liver cells.

2. EGFR expression in Kupffer cells/liver macrophages promotes HCC development.


Hypoxia-inducible factor 1 activation by aerobic glycolysis implicates
the Warburg effect in carcinogenesis

Lu H, Forbes RA, Verma A.
J Biol Chem. 2002 Jun 28; 277(26): 23111–5. Epub 2002 Apr 9.

Cancer cells display high rates of aerobic glycolysis, a phenomenon
known historically as the Warburg effect. Lactate and pyruvate, the end
products of glycolysis, are produced at high rates by cancer cells even
in the presence of oxygen.

Hypoxia-induced gene expression in cancer cells

  • has been linked to malignant transformation.

Here we provide evidence that lactate and pyruvate

  • regulate hypoxia-inducible gene expression
  • independently of hypoxia
  • by stimulating the accumulation of hypoxia-inducible factor 1α (HIF-1α).

In human gliomas and other cancer cell lines,

  • the accumulation of HIF-1α protein under aerobic conditions
  • requires the metabolism of glucose to pyruvate, which
  1. prevents the aerobic degradation of HIF-1α protein,
  2. activates HIF-1 DNA binding activity, and
  3. enhances the expression of several HIF-1-activated genes:
    • erythropoietin,
    • vascular endothelial growth factor,
    • glucose transporter 3, and
    • aldolase A.

Our findings support a novel role for pyruvate in metabolic signaling
and suggest a mechanism by which

  • high rates of aerobic glycolysis
  • can promote the malignant transformation and
  • survival of cancer cells. PMID: 11943784

Part IV. Transcription control and innate immunity

 c-Myc-induced transcription factor AP4 is required for
host protection mediated by CD8+ T cells

C Chou, AK Pinto, JD Curtis, SP Persaud, M Cella, Chih-Chung Lin, … & T Egawa
Nature Immunology 17 Jun 2014.

The transcription factor c-Myc is essential for

  • the establishment of a metabolically active and proliferative state
  • in T cells after priming.

We identified AP4 as the transcription factor

  • that was induced by c-Myc and
  • that sustained the activation of antigen-specific CD8+ T cells.

Despite normal priming,

  • AP4-deficient CD8+ T cells
  • failed to continue transcription of a broad range of
    c-Myc-dependent targets.

Mice lacking AP4 specifically in CD8+ T cells showed

  • enhanced susceptibility to infection with West Nile virus.

Genome-wide analysis suggested that

  • many activation-induced genes encoding molecules
  • involved in metabolism were shared targets of
  • c-Myc and AP4.

Thus, AP4 maintains c-Myc-initiated cellular activation programs

  • in CD8+ T cells to control microbial infection.
  1. AP4 is regulated post-transcriptionally in CD8+ T cells.

Microarray analysis of transcription factor–encoding genes with a difference
in expression of >1.8-fold in activated CD8+ T cells treated for 12 h with
IL-2 (100 U/ml; + IL-2) relative to their expression in activated CD8+ T cells…

2. AP4 is required for the population expansion of antigen-specific
CD8+ T cells following infection with LCMV-Arm.

Expression of CD4, CD8α and KLRG1 (a) and binding of an
H-2Db–gp(33–41) tetramer and expression of CD8α, KLRG1 and
CD62L (b) in splenocytes from wild-type (WT) and Tfap4−/− mice,
assessed by flow cytometry 8 d after infection

3. AP4 is required for the sustained clonal expansion of CD8+ T cells
but  not for their initial proliferation.

  1. AP4 is essential for host protection against infection with WNV, in
    a CD8+ T cell–intrinsic manner.

  •  Survival of Tfap4F/FCre− control mice (Cre−; n = 16) and
  • Tfap4F/FCD8-Cre+ mice (CD8-Cre+; n = 22) following infection with WNV.
    (b,c) Viral titers in the brain (b) and spleen (c) of Tfap4F/F Cre− and Tfap4F/F
    CD8-Cre+ mice  on day 9…

AP4 is essential for the sustained expression of genes that are targets of c-Myc.

Normalized signal intensity (NSI) of endogenous transcripts in
Tfap4+/+ and Tfap4−/− OT-I donor T cells adoptively transferred into
host mice and assessed on day 4 after infection of the host with LM-OVA
(top), and that of ERCC controls

Sustained c-Myc expression ‘rescues’ defects of Tfap4−/− CD8+ T cells.

AP4 and c-Myc have distinct biological functions.

Mucosal memory CD8+ T cells are selected in the periphery
by an MHC class I molecule

Y Huang, Y Park, Y Wang-Zhu, …A Larange, R Arens, & H Cheroutre

Nature Immunology 2 Oct 2011; 12: 1086–1095

The presence of immune memory at pathogen-entry sites is a prerequisite
for protection. We show that the non-classical major histocompatibility
complex (MHC) class I molecule

  • thymus leukemia antigen (TL),
  • induced on dendritic cells interacting with CD8αα on activated CD8αβ+ T cells,
  • mediated affinity-based selection of memory precursor cells.

Furthermore, constitutive expression of TL on epithelial cells

  • led to continued selection of mature CD8αβ+ memory T cells.

The memory process driven by TL and CD8αα

  • was essential for the generation of CD8αβ+ memory T cells in the intestine and
  • the accumulation of highly antigen-sensitive CD8αβ+ memory T cells
  • that form the first line of defense at the largest entry port for pathogens.

The metabolic checkpoint kinase mTOR is essential for IL-15 signaling during the development and activation of NK cells.

Marçais A, Cherfils-Vicini J, Viant C, Degouve S, Viel S, Fenis A, Rabilloud J,
Mayol K, Tavares A, Bienvenu J, Gangloff YG, Gilson E, Vivier E, Walzer T.
Nat Immunol. 2014 Aug; 15(8):749-757. Epub 2014 Jun 29  .    PMID: 24973821

Interleukin 15 (IL-15) controls

  • both the homeostasis and the peripheral activation of natural killer (NK) cells.

We found that the metabolic checkpoint kinase

  • mTOR was activated and boosted bioenergetic metabolism
  • after exposure of NK cells to high concentrations of IL-15,

whereas low doses of IL-15 triggered

  • only phosphorylation of the transcription factor STAT5.


mTOR activation, in turn,

  • stimulated the growth and nutrient uptake of NK cells and
  • positively fed back on the receptor for IL-15.

This process was essential for

  • sustaining NK cell proliferation during development and
  • the acquisition of cytolytic potential during inflammation
    or viral infection.

The mTORC1 inhibitor rapamycin 

  • inhibited NK cell cytotoxicity both in mice and humans;
    • this probably contributes to the immunosuppressive
      activity of this drug in different clinical settings.

The Critical Role of the IL-15–PI3K–mTOR Pathway in Natural Killer Cell
Effector Functions.
Nandagopal N, Ali AK, Komal AK, Lee SH.
Front Immunol. 2014 Apr 23; 5:187. eCollection 2014.

Natural killer (NK) cells were so named for their uniqueness in killing
certain tumor and virus-infected cells without prior sensitization.
Their functions are modulated in vivo by several soluble immune mediators;

  • interleukin-15 (IL-15) being the most potent among them in
    enabling NK cell homeostasis, maturation, and activation.

During microbial infections,

  • NK cells stimulated with IL-15 display enhanced cytokine responses.

This priming effect has previously been shown with respect to increased
IFN-γ production in NK cells

  • upon IL-12 and IL-15/IL-2 co-stimulation.
  • We explored whether this effect of IL-15 priming
  • can be extended to various other cytokines, and
  • observed enhanced NK cell responses to stimulation
    • with IL-4, IL-21, IFN-α, and IL-2 in addition to IL-12.
  • We also observed elevated IFN-γ production in primed NK cells.

Currently, the fundamental processes required for priming, and whether
these signaling pathways work collaboratively or independently for NK
cell functions, are poorly understood.

To identify the key signaling events for NK cell priming, we examined
IL-15 effects on NK cells in which

  • the pathways emanating from IL-15 receptor activation
    • were blocked with specific inhibitors.

Our results demonstrate that

the PI3K-AKT-mTOR pathway is critical for cytokine responses
in IL-15 primed NK cells. 

This pathway is also implicated in a broad range of

  • IL-15-induced NK cell effector functions such as
    • proliferation and cytotoxicity.

Likewise, NK cells from mice

  • treated with rapamycin to block the mTOR pathway
  • displayed defects in proliferation and in IFN-γ and granzyme B production,
  • resulting in elevated viral burdens upon murine cytomegalovirus infection.

Taken together, our data demonstrate

  • the requirement of PI3K-mTOR pathway
    • for enhanced NK cell functions by IL-15, thereby
  • coupling the metabolic sensor mTOR to NK cell anti-viral responses.

KEYWORDS: IL-15; JAK–STAT pathway; mTOR pathway; natural killer cells; signal transduction

Part V. Predicting Therapeutic Targets 

New discovery approach accelerates identification of potential cancer treatments
 Laura Williams, Univ. of Michigan   09/30/2014

Researchers at the Univ. of Michigan have described a new approach to
discovering potential cancer treatments that

  • requires a fraction of the time needed for more traditional methods.

They used the platform to identify

  • a novel antibody that is undergoing further investigation as a potential
    treatment for breast, ovarian and other cancers.

In research published online in the Proceedings of the National Academy
of Sciences, researchers in the laboratory of Stephen Weiss at the U-M
Life Sciences Institute detail an approach

  • that replicates the native environment of cancer cells and
  • increases the likelihood that drugs effective against the growth of
    tumor cells in test tube models
  • will also stop cancer from growing in humans.

The researchers have used their method

  • to identify an antibody that stops breast cancer tumor growth in animal models, and
  • they are investigating the antibody as a potential treatment in humans.

“Discovering new targets for cancer therapeutics is a long and tedious undertaking, and

  • identifying and developing a potential drug to specifically hit that
    target without harming healthy cells is a daunting task,” Weiss said.
  • “Our approach allows us to identify potential therapeutics
    • in a fraction of the time that traditional methods require.”

The researchers began by

  • creating a 3-D “matrix” of collagen, a connective tissue molecule very similar to that found
    • surrounding breast cancer cells in human patients.
  • They then embedded breast cancer cells into the collagen matrix,
    • where the cells grew as they would in human tissue.

The investigators then injected the cancer-collagen tissue composites into mice that then

  • recognized the human cancer cells as foreign tissue.
    • Much in the way that our immune system generates antibodies
      to fight infection,
  • the mice began to generate thousands of antibodies directed against
    the human cancer cells.
  • These antibodies were then tested for the ability to stop the growth
    of the human tumor cells.

“We create an environment in which cells cultured in the laboratory ‘think’
they are growing in the body and then

  • rapidly screen large numbers of antibodies to see if any exert
    anti-cancer effects,” Weiss said.
  • “This allows us to select promising antibodies very quickly.”

They discovered a particular antibody, 4C3, which was able to

  • almost completely stop the proliferation of the breast cancer cells.

They then identified the molecule on the cancer cells that the antibody targets.

The antibody can be further engineered to generate

  • humanized monoclonal antibodies for use in patients.

“We still need to do a lot more work to determine how effective 4C3 might be as a
treatment for breast and other cancers, on its own or in conjunction with other
therapies,” Weiss said. “But we have enough data to warrant further pursuit,
and are expanding our efforts to use this discovery platform to find similarly promising antibodies.”

Source: Univ. of Michigan

  1. Jose Eduardo de Salles Roselino

    I think you have made a great effort to connect basic ideas of metabolic regulation with the “modern” mechanisms of gene expression control.
    Yet I do not think that, at this stage, it will be clear to all readers, or at least to the great majority of them. The most important factor, in my opinion, is that modern readers consider metabolic regulation to deal with the so-called “housekeeping activities” of the cell, something of secondary or tertiary relevance, or even less.
    My idea, which you mention at the beginning of the text with the word biochemistry, derives from reading What is Life? together with Prof. Leloir. For me, and also for him, biochemistry comprises both a set of techniques and a framework for reasoning about scientific results. As a set of techniques, Schrödinger considered that it would lead to a better understanding of genetics and of physiology, the two legs supporting the future progress of his time (the mid-forties). For Leloir, the key was the understanding of chemical reactivity, and I agree with him. However, as I was able to talk and discuss it with him in detail, we should also take into account the stability of macromolecules and, above all, the regulation of activities and function. This is where the Pasteur effect, which I was studying in Leloir’s lab at that time (1970–72), enters the general picture.
    Regulation in complex living beings, of which the cancer cell is a great research problem, can be understood by comparing two quite different results when the lack of regulation is taken into account or experimentally elicited. The clearest line of experiments follows the Pasteur effect as the intracellular result, best seen when aerobiosis is compared with anaerobiosis as conditions under which the maintenance of ATP levels and the required metabolic regulation (energy charge, D. E. Atkinson, etc.) are studied. Another line of experiments takes into account the extracellular result, for instance the homeostatic regulation of blood glucose levels. Blood glucose is the most conspicuous regulatory event related to the Pasteur effect that can be studied in the liver, considering both final results: maintenance of ATP levels (intracellular) and maintenance of blood glucose (extracellular).
    My key idea is that the same factors that elicit fast regulatory responses also elicit the slow, energetically expensive regulatory responses. The biological logic behind this common root is the economy of ATP. If the regulatory stimulus fades out quickly, the fast regulatory responses are good enough to maintain life, and the time-consuming, energetically costly responses are soon stopped, cutting short the ATP expenditure. If the stimulus lasts for long periods, the fast responses are replaced by adaptive responses that in general follow the line of cell-differentiation mechanisms, with changes in gene expression and so on.
    The change from fast response mechanisms to long-lasting, developmentally linked ones is not sharp. Therefore, somehow, cancer cells become trapped in a metastable regulatory state that prevents cell differentiation and reinforces the mechanisms linked to their internal regulatory goals. This metastable state takes advantage of the fact that other cells, tissues and organs will take good care of the homeostatic mechanisms that provide for their nutritional needs. In my hepatology work you will see a pyruvate kinase that does not respond to homeostatic signals.

Read Full Post »

Scientific Curation Fostering Expert Networks and Open Innovation: Lessons from Clive Thompson

Life-cycle of Science 2

Curators and Writer: Stephen J. Williams, Ph.D. with input from Curators Larry H. Bernstein, MD, FCAP, Dr. Justin D. Pearlman, MD, PhD, FACC and Dr. Aviva Lev-Ari, PhD, RN

(this discussion is part of a three-part series including:

Using Scientific Content Curation as a Method for Validation and Biocuration

Using Scientific Content Curation as a Method for Open Innovation)


Every month I get my Wired magazine (yes, in hard print; I still like to turn pages manually, and I don’t mind getting grease or wing sauce on my magazine rather than on my e-reader), and I always love reading articles written by Clive Thompson. He has a certain flair for understanding the techno world we live in and the human/technology interaction, writing about the interesting ways we almost inadvertently integrate new technologies into our day-to-day living, generating new entrepreneurship and new value.

An October 2013 Wired article by Clive Thompson, entitled “How Successful Networks Nurture Good Ideas: Thinking Out Loud”, describes how the voluminous writings, postings, tweets, and sharing on social media are fostering connections between people and ideas which previously had not existed. The article was drawn from Clive Thompson’s book Smarter Than You Think: How Technology Is Changing Our Minds for the Better. Tom Peters also commented on the article in his blog (see here).

Clive gives a wonderful example of Ory Okolloh, a young Kenyan-born law student who, after becoming frustrated with the lack of coverage of problems back home, started a blog about Kenyan politics. Her blog not only drew interest from movie producers who were documenting female bloggers but also gained the interest of fellow Kenyans who, during the upheaval after the 2007 Kenyan elections, helped Ory develop a Google map for reporting violence, which eventually became a global organization using open-source technology for crisis management. There are a multitude of examples of how networks, and the conversations within these circles, foster new ideas. As Clive states in the article:



They are influenced by the conversations around us.

However, the article got me thinking about how Science 2.0 and the internet are changing how scientists contribute, share, and make connections to produce new and transformative ideas.

But HOW MUCH Knowledge is OUT THERE?


Clive’s article listed some amazing facts about the mountains of posts, tweets, and words out on the internet EVERY DAY, all of which exemplify the problem:

  • 154.6 billion EMAILS per DAY
  • 400 million TWEETS per DAY
  • 1 million BLOG POSTS (including this one) per DAY
  • 2 million COMMENTS on WordPress per DAY
  • 16 million WORDS on Facebook per DAY

As he estimates, this amounts to 520 million books per DAY (assuming an average book of 100,000 words).
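Clive’s book-equivalent figure is simple arithmetic: total words produced per day divided by words per book. A minimal back-of-the-envelope sketch using the counts above; note that the words-per-item averages below are my own illustrative assumptions, not figures from the article:

```python
# Convert daily posting volumes into "books per day".
# Item counts come from the article; words-per-item averages are assumed.
WORDS_PER_BOOK = 100_000  # average book length used in the article

daily_items = {
    "emails": 154.6e9,
    "tweets": 400e6,
    "blog_posts": 1e6,
    "wordpress_comments": 2e6,
}

# Assumed average words per item (illustrative guesses, not measured values).
words_per_item = {
    "emails": 300,
    "tweets": 15,
    "blog_posts": 500,
    "wordpress_comments": 50,
}

total_words = sum(daily_items[k] * words_per_item[k] for k in daily_items)
books_per_day = total_words / WORDS_PER_BOOK

print(f"{books_per_day:,.0f} book-equivalents per day")
# → 463,866,000 book-equivalents per day
```

Under these assumed averages the total lands in the same ballpark as the 520 million figure; unsurprisingly, email volume dominates everything else by several orders of magnitude.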

A LOT of INFO. But as he suggests, it is not the volume but how we create and share this information that is critical. As the science fiction writer Theodore Sturgeon noted, “Ninety percent of everything is crap” (AKA Sturgeon’s Law).


Internet Live Stats shows how congested the internet is each day. Needless to say, Clive’s numbers are a bit off. As of the writing of this article:


  • 2.9 billion internet users
  • 981 million websites (only 25,000 hacked today)
  • 128 billion emails
  • 385 million Tweets
  • > 2.7 million BLOG posts today (including this one)


The Good, The Bad, and the Ugly of the Scientific Internet (The Wild West?)


So how many science blogs are out there? Back in 2008, “grrlscientist” asked this question and turned up a total of 19,881 blogs; however, most were “pseudoscience” blogs, not written by Ph.D.- or M.D.-level scientists. A deeper search on Technorati using the search term “scientist PhD” turned up about 2,000 written by trained scientists.

So granted, there is a lot of the good, the bad, and the ugly when it comes to scientific information on the internet!






I had recently re-posted, on this site, a great example of how bad science and medicine can get propagated throughout the internet:


and in a Nature report: Stem cells: Taking a stand against pseudoscience

Drs. Elena Cattaneo and Gilberto Corbellini document their long, hard fight against false and unvalidated medical claims made by some “clinicians” about the utility and medical benefits of certain stem-cell therapies, sacrificing their time to debunk medical pseudoscience.


Using Curation and Science 2.0 to build Trusted, Expert Networks of Scientists and Clinicians


Establishing networks of trusted colleagues has been a cornerstone of the scientific discourse for centuries. For example, in the mid-1640s, the Royal Society began as:


“a meeting of natural philosophers to discuss promoting knowledge of the natural world through observation and experiment”, i.e. science. The Society met weekly to witness experiments and discuss what we would now call scientific topics. The first Curator of Experiments was Robert Hooke.


from The History of the Royal Society


Royal Society Coat of Arms







The Royal Society of London for Improving Natural Knowledge.

(photo credit: Royal Society)

(Although one wonders why they met “incognito”)

Indeed, as discussed in “Science 2.0/Brainstorming” by the originators of OpenWetWare, an open-source science-notebook platform designed to foster open innovation, new search and aggregation tools are making it easier to find, contribute, and share information with interested individuals. This paradigm is the basis for the shift from Science 1.0 to Science 2.0. Science 2.0 attempts to remedy current drawbacks that hinder rapid and open scientific collaboration and discourse, including:

  • Slow time frame of current publishing methods: reviews can take years to fashion, leading to outdated material
  • Information dissemination is currently one-dimensional: peer-reviewed, highly polished work and conferences
  • Current publishing does not encourage open feedback and review
  • Articles edited for print do not take advantage of new web-based features, including tagging, search-engine features, interactive multimedia, and hyperlinks
  • Published data and methodology are often incomplete
  • Published data are not available in formats readily accessible across platforms: gene lists are now mandated to be supplied as files, but other data do not have to be supplied in file format
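The interoperability point in the last bullet is concrete: a gene list shipped as a structured, machine-readable file can be consumed by any platform, whereas a table typeset for print cannot. A minimal sketch (the gene symbols, values, and field names here are illustrative, not from any real dataset):

```python
import csv
import io
import json

# A supplementary gene list as structured records rather than print-formatted text.
genes = [
    {"symbol": "TP53", "fold_change": 2.4, "p_value": 0.001},
    {"symbol": "BRCA1", "fold_change": -1.8, "p_value": 0.02},
]

# CSV: readable by spreadsheets, R, Python, and most analysis tools.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["symbol", "fold_change", "p_value"])
writer.writeheader()
writer.writerows(genes)
csv_text = buf.getvalue()

# JSON: readable by web tools and APIs.
json_text = json.dumps(genes, indent=2)

# Round-trip check: the structured format recovers the same records.
assert json.loads(json_text)[0]["symbol"] == "TP53"
print(csv_text.splitlines()[0])  # → symbol,fold_change,p_value
```

Either format preserves the field structure, so downstream tools can filter, merge, and re-analyze the list without manual re-entry.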

In short, the sheer volume of scientific output, combined with the publishing limitations listed above, creates a need for active filtering and contextualization, which is precisely what curation provides.


Curation in the Sciences: View from Scientific Content Curators Larry H. Bernstein, MD, FCAP, Dr. Justin D. Pearlman, MD, PhD, FACC and Dr. Aviva Lev-Ari, PhD, RN

Curation is an active filtering of the web’s, and the peer-reviewed literature’s, immense amount of relevant and irrelevant content. As a result, content may be disruptive. However, in doing good curation, one does more than simply assign value by presenting creative work in a category. Great curators comment and share experience across content, authors and themes. Great curators may see patterns others don’t, or may challenge or debate complex and apparently conflicting points of view. Answers to specifically focused questions come from the hard work of many in laboratory settings creatively establishing answers to definitive questions, each a part of the larger knowledge base of reference. There are those rare “Einsteins” who imagine a whole universe, unlike the three blind men of the Sufi tale: one held the tail, another the trunk, another the ear, and they all said, “this is an elephant!”
In my reading, I learn that the optimal ratio of curation to creation may be as high as 90% curation to 10% creation. Creating content is expensive; curation, by comparison, is much less so.

– Larry H. Bernstein, MD, FCAP

Curation is Uniquely Distinguished by the Historical Exploratory Ties that Bind –Larry H. Bernstein, MD, FCAP

The explosion of information by numerous media, hardcopy and electronic, written and video, has created difficulties tracking topics and tying together relevant but separated discoveries, ideas, and potential applications. Some methods to help assimilate diverse sources of knowledge include a content expert preparing a textbook summary, a panel of experts leading a discussion or think tank, and conventions moderating presentations by researchers. Each of those methods has value and an audience, but they also have limitations, particularly with respect to timeliness and pushing the edge. In the electronic data age, there is a need for further innovation, to make synthesis, stimulating associations, synergy and contrasts available to audiences in a more timely and less formal manner. Hence the birth of curation. Key components of curation include expert identification of data, ideas and innovations of interest, expert interpretation of the original research results, integration with context, digesting, highlighting, correlating and presenting in novel light.

Justin D Pearlman, MD, PhD, FACC from The Voice of Content Consultant on The  Methodology of Curation in Cardiovascular Original Research: Cases in Methodology Design for Content Co-Curation The Art of Scientific & Medical Curation


In Power of Analogy: Curation in Music, Music Critique as a Curation and Curation of Medical Research Findings – A Comparison, Drs. Larry Bernstein and Aviva Lev-Ari liken the medical and scientific curation process to the curation of musical works into a thematic program:


Work of Original Music Curation and Performance:


Music Review and Critique as a Curation

Work of Original Expression: what is the methodology of curation in the context of medical research findings? An exposition of synthesis and interpretation of the significance of the results to clinical care

… leading to new, curated, and collaborative works by networks of experts to generate (in this case) ebooks on most significant trends and interpretations of scientific knowledge as relates to medical practice.


In Summary: How Scientific Content Curation Can Help


Given the aforementioned problems of:

        I.            the complex and rapid deluge of scientific information

      II.            the need for a collaborative, open environment to produce transformative innovation

    III.            need for alternative ways to disseminate scientific findings

scientific content curation can help in the following ways:

        I.            Curation exists beyond the review: curation decreases the time needed to assess current trends, adding multiple insights and analyses WITH an underlying METHODOLOGY (discussed below) while NOT acting as mere reiteration or regurgitation


      II.            Curation provides insights from the WHOLE scientific community on multiple WEB 2.0 platforms


    III.            Curation makes use of new computational and Web-based tools to provide interoperability of data, reporting of findings (shown in Examples below)


Therefore a discussion is given on methodologies, definitions of best practices, and tools developed to assist the content curation community in this endeavor.

Methodology in Scientific Content Curation as Envisioned by Aviva Lev-Ari, PhD, RN


At Leaders in Pharmaceutical Business Intelligence, site owner and chief editor Aviva Lev-Ari, PhD, RN has been developing a strategy “for the facilitation of Global access to Biomedical knowledge rather than the access to sheer search results on Scientific subject matters in the Life Sciences and Medicine”. According to Aviva, “for the methodology to attain this complex goal it is to be dealing with popularization of ORIGINAL Scientific Research via Content Curation of Scientific Research Results by Experts, Authors, Writers using the critical thinking process of expert interpretation of the original research results.” The following post:

Cardiovascular Original Research: Cases in Methodology Design for Content Curation and Co-Curation

demonstrates two examples of how content co-curation attempts to achieve this aim: developing networks of scientist and clinician curators to aid in the active discussion of scientific and medical findings, and using scientific content curation as a means of critique, offering a “new architecture for knowledge”. Indeed, popular search engines such as Google and Yahoo, and even scientific search engines such as NCBI’s PubMed and the OVID search engine, rely on keywords and Boolean algorithms, which has created a need for more context-driven scientific search and discourse.
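The keyword-and-Boolean retrieval model referred to above can be sketched in a few lines; the point is that it matches terms, not meaning, which is why human curators add value. The documents and query below are invented for illustration:

```python
# Minimal Boolean keyword retrieval: a document matches if it satisfies
# an AND/OR expression over terms. No ranking, no context, no synonyms --
# a query for "myocardial infarction" will not match "heart attack".

def tokenize(text):
    """Lowercase and split a document into a set of words."""
    return set(text.lower().split())

def matches(doc, all_of=(), any_of=()):
    """AND over `all_of`; OR over `any_of` (if any are given)."""
    words = tokenize(doc)
    return all(t in words for t in all_of) and (
        not any_of or any(t in words for t in any_of)
    )

docs = [
    "biomarkers for myocardial infarction in elderly patients",
    "heart attack risk and cholesterol levels",   # same topic, no keyword overlap
    "infarction imaging with cardiac MRI biomarkers",
]

hits = [
    d for d in docs
    if matches(d, all_of=("biomarkers",), any_of=("infarction", "myocardial"))
]
print(hits)  # the "heart attack" document is missed despite being on-topic
```

The second document is on-topic but returns no hit, illustrating the gap that context-aware curation by domain experts is meant to close.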

In Science and Curation: the New Practice of Web 2.0, Célya Gruson-Daniel (@HackYourPhd) states:

To address this need, human intermediaries, empowered by the participatory wave of web 2.0, naturally started narrowing down the information and providing an angle of analysis and some context. They are bloggers, regular Internet users or community managers – a new type of profession dedicated to the web 2.0. A new use of the web has emerged, through which the information, once produced, is collectively spread and filtered by Internet users who create hierarchies of information.

… where Célya considers curation an essential practice to manage open science and this new style of research.

As mentioned above in her article, Dr. Lev-Ari presents two examples of how content curation expanded thought and discussion and eventually led to new ideas.

  1. Curator edifies content through an analytic process = NEW form of writing and organization leading to new interconnections of ideas = NEW INSIGHTS

i)        Evidence: curation methodology leading to new insights into biomarkers


  2. Same as #1 but with multiple players (experts), each bringing unique insights, perspectives, and skills, yielding new research = NEW LINE of CRITICAL THINKING

ii)      Evidence: co-curation methodology among cardiovascular experts leading to the cardiovascular series of ebooks

Life-cycle of Science 2

The Life Cycle of Science 2.0. Due to Web 2.0, new paradigms of scientific collaboration are rapidly emerging. Originally, scientific discoveries were made by individual laboratories or “scientific silos” whose main methods of communication were peer-reviewed publication, meeting presentations, and ultimately news outlets and multimedia. In the digital era, data were organized for literature search and biocurated databases. Now, in the era of social media and Web 2.0, groups of scientifically and medically trained “curators” organize the piles of digitally generated data into an organizational structure that can be shared, communicated, and analyzed in a holistic approach, launching new ideas due to changes in the organizational structure of data and in data analytics.


The result, in this case, is a collaborative written work beyond the scope of a review. Currently, review articles are written by experts in the field and summarize the state of a research area. Using collaborative, trusted networks of experts, however, the result is a real-time synopsis and analysis of the field.


For detailed description of methodology please see Cardiovascular Original Research: Cases in Methodology Design for Content Co-Curation The Art of Scientific & Medical Curation


In her paper, Curating e-Science Data, Maureen Pennock of The British Library emphasized the importance of a diligent, validated, reproducible, and cost-effective methodology for curation by e-science communities over the ‘Grid’:

“The digital data deluge will have profound repercussions for the infrastructure of research and beyond. Data from a wide variety of new and existing sources will need to be annotated with metadata, then archived and curated so that both the data and the programmes used to transform the data can be reproduced for use in the future. The data represent a new foundation for new research, science, knowledge and discovery”

— JISC Senior Management Briefing Paper, The Data Deluge (2004)


As she states, proper data and content curation is important for:

  • Post-analysis
  • Data and research result reuse for new research
  • Validation
  • Preservation of data in newer formats to prolong life-cycle of research results

However she laments the lack of

  • Funding for such efforts
  • Training
  • Organizational support
  • Monitoring
  • Established procedures


Tatiana Aders wrote a nice article based on an interview with Microsoft’s Robert Scoble, in which he emphasized the need for curation in a world where “Twitter is the replacement of the Associated Press Wire Machine” and new technology platforms are knocking out old platforms at a rapid pace. He also notes that curation is a social art form whose primary concerns are understanding an audience and a niche.

Indeed, part of the reason the need for curation is unmet, as Mark Carrigan writes, is the lack of appreciation by academics of the utility of tools such as Pinterest, Storify, and Pearltrees for effectively communicating and building collaborative networks.

And teacher Nancy White, in her article Understanding Content Curation on her blog Innovations in Education, shows how curation is an educational tool for students and teachers, demonstrating that students need to CONTEXTUALIZE what they collect to add enhanced value, using higher mental processes such as:

  • Knowledge
  • Comprehension
  • Application
  • Analysis
  • Synthesis
  • Evaluation

A GREAT table about the differences between Collecting and Curating, by Nancy White.

University of Massachusetts Medical School has aggregated some useful curation tools.

Although many of the tools relate to biocuration and database building, the common idea is curating data with indexing, analyses, and contextual value so that an audience can generate NETWORKS OF NEW IDEAS.

See here for a curation by Erika Harrison on Scoop.it of how networks foster knowledge.



“Nowadays, any organization should employ network scientists/analysts who are able to map and analyze complex systems that are of importance to the organization (e.g. the organization itself, its activities, a country’s economic activities, transportation networks, research networks).”

Andrea Carafa’s insight from World Economic Forum New Champions 2012, “Power of Networks”


Creating Content Curation Communities: Breaking Down the Silos!


An article by Dr. Dana Rotman, “Facilitating Scientific Collaborations Through Content Curation Communities”, highlights how scientific information resources, traditionally created and maintained by paid professionals, are being crowdsourced to what she termed “content curation communities”: groups of professional and nonprofessional volunteers who create, curate, and maintain the various scientific database tools we use, such as Encyclopedia of Life, ChemSpider (for the Slideshare see here), biowikipedia, etc. Although very useful and openly available, these projects create their own challenges, such as:

  • information integration (various types of data and formats)
  • social integration (marginalized by scientific communities, no funding, no recognition)

The authors set forth some ways to overcome these challenges of the content curation community including:

  1. standardization in practices
  2. visualization to document contributions
  3. emphasizing role of information professionals in content curation communities
  4. maintaining quality control to increase respectability
  5. recognizing participation to professional communities
  6. proposing funding/national meeting – Data Intensive Collaboration in Science and Engineering Workshop

A few great presentations and papers from the 2012 DICOSE meeting are found below

Judith M. Brown, Robert Biddle, Stevenson Gossage, Jeff Wilson & Steven Greenspan. Collaboratively Analyzing Large Data Sets using Multitouch Surfaces. (PDF) NotesForBrown


Bill Howe, Cecilia Aragon, David Beck, Jeffrey P. Gardner, Ed Lazowska, Tanya McEwen. Supporting Data-Intensive Collaboration via Campus eScience Centers. (PDF) NotesForHowe


Kerk F. Kee & Larry D. Browning. Challenges of Scientist-Developers and Adopters of Existing Cyberinfrastructure Tools for Data-Intensive Collaboration, Computational Simulation, and Interdisciplinary Projects in Early e-Science in the U.S.. (PDF) NotesForKee


Ben Li. The mirages of big data. (PDF) NotesForLiReflectionsByBen


Betsy Rolland & Charlotte P. Lee. Post-Doctoral Researchers’ Use of Preexisting Data in Cancer Epidemiology Research. (PDF) NoteForRolland


Dana Rotman, Jennifer Preece, Derek Hansen & Kezia Procita. Facilitating scientific collaboration through content curation communities. (PDF) NotesForRotman


Nicholas M. Weber & Karen S. Baker. System Slack in Cyberinfrastructure Development: Mind the Gaps. (PDF) NotesForWeber

Indeed, the movement from Science 1.0 to Science 2.0 originated because these “silos” had frustrated many scientists, resulting in changes in publishing (Open Access) but also in the communication of protocols (online protocol sites and notebooks like OpenWetWare and BioProtocols Online) and in data and material registries (CGAP and tumor banks). Some examples are given below.

Open Science Case Studies in Curation

1. Open Science Project from Digital Curation Center

This project looked at what motivates researchers to work in an open manner with regard to their data, results and protocols, and whether advantages are delivered by working in this way.

The case studies consider the benefits and barriers to using ‘open science’ methods, and were carried out between November 2009 and April 2010 and published in the report Open to All? Case studies of openness in research. The Appendices to the main report (pdf) include a literature review, a framework for characterizing openness, a list of examples, and the interview schedule and topics. Some of the case study participants kindly agreed to us publishing the transcripts. This zip archive contains transcripts of interviews with researchers in astronomy, bioinformatics, chemistry, and language technology.


see: Pennock, M. (2006). “Curating e-Science Data”. DCC Briefing Papers: Introduction to Curation. Edinburgh: Digital Curation Centre. Handle: 1842/3330.


2. cBIO – cBio’s biological data curation group developed and operates using a methodology called CIMS, the Curation Information Management System. CIMS is a comprehensive curation and quality-control process that efficiently extracts information from publications.


3. NIH Topic Maps – This website provides a database and web-based interface for searching and discovering the types of research awarded by the NIH. The database uses automated, computer generated categories from a statistical analysis known as topic modeling.


4. SciKnowMine (USC)- We propose to create a framework to support biocuration called SciKnowMine (after ‘Scientific Knowledge Mine’), cyberinfrastructure that supports biocuration through the automated mining of text, images, and other amenable media at the scale of the entire literature.


5. OpenWetWare – OpenWetWare is an effort to promote the sharing of information, know-how, and wisdom among researchers and groups who are working in biology & biological engineering. If you would like edit access, would be interested in helping out, or want your lab website hosted on OpenWetWare, please join us. OpenWetWare is managed by the BioBricks Foundation. They also have a wiki about Science 2.0.

6. LabTrove: a lightweight, web based, laboratory “blog” as a route towards a marked up record of work in a bioscience research laboratory. Authors in PLOS One article, from University of Southampton, report the development of an open, scientific lab notebook using a blogging strategy to share information.

7. OpenScience Project – The OpenScience project is dedicated to writing and releasing free and Open Source scientific software. We are a group of scientists, mathematicians and engineers who want to encourage a collaborative environment in which science can be pursued by anyone who is inspired to discover something new about the natural world.

8. Open Science Grid is a multi-disciplinary partnership to federate local, regional, community and national cyberinfrastructures to meet the needs of research and academic communities at all scales.


9. Some ongoing biomedical knowledge (curation) projects at ISI

This project is concerned with developing a curation and documentation system for information integration in collaboration with the II Group at ISI as part of the BIRN.

Its primary purpose is to provide software for experimental biomedical scientists that permits a single scientific worker (at the level of a graduate student or postdoctoral worker) to design, construct and manage a shared knowledge repository for a research group, derived from a local store of PDF files. This project is funded by NIGMS from 2008-2012 (R01-GM083871).

10. Tools useful for scientific content curation


Research Analytic and Curation Tools from University of Queensland


Thomson Reuters information curation services for pharma industry


Microblogs as a way to communicate information about HPV infection among clinicians and patients; use of Chinese microblog SinaWeibo as a communication tool


VIVO for scientific communities – In order to connect information about research activities across institutions and make it available to others, taking into account smaller players in the research landscape and addressing their need for specific information (for example, by providing non-conventional research objects), the open-source software VIVO, which provides research information as linked open data (LOD), is used in many countries. So-called VIVO harvesters collect research information that is freely available on the web and convert the collected data in conformity with LOD standards. The VIVO ontology builds on prevalent LOD namespaces and, depending on the needs of the specialist community concerned, can be expanded.
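Linked open data, as used by VIVO, boils down to subject–predicate–object triples that share common identifiers, so harvesters can merge records from different institutions into one graph. A toy sketch with invented URIs (these are not actual VIVO ontology terms):

```python
# Research information as RDF-style triples; shared URIs let two harvesters'
# outputs merge into a single graph. All URIs below are made up for illustration.
triples_inst_a = {
    ("ex:person/jdoe", "rdf:type", "foaf:Person"),
    ("ex:person/jdoe", "foaf:name", "J. Doe"),
    ("ex:person/jdoe", "ex:authorOf", "ex:paper/42"),
}
triples_inst_b = {
    ("ex:paper/42", "dc:title", "Curation at Scale"),
    ("ex:paper/42", "ex:fundedBy", "ex:grant/NIH-123"),
}

# Merging two sources is just set union, because both use the same URIs.
graph = triples_inst_a | triples_inst_b

# Query: titles of papers authored by jdoe, joining across the two sources.
papers = {o for s, p, o in graph if s == "ex:person/jdoe" and p == "ex:authorOf"}
titles = [o for s, p, o in graph if s in papers and p == "dc:title"]
print(titles)  # → ['Curation at Scale']
```

The join works only because both institutions refer to the same paper by the same URI; that shared-vocabulary discipline is what the LOD standards and the VIVO ontology provide.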



11. Examples of scientific curation in different areas of Science/Pharma/Biotech/Education


From Science 2.0 to Pharma 3.0 Q&A with Hervé Basset

Hervé Basset, a specialist librarian in the pharmaceutical industry and owner of the blog “Science Intelligence“, talks about the inspiration behind his recent book entitled “From Science 2.0 to Pharma 3.0″, published by Chandos Publishing and available on Amazon, and about how health care companies need a social media strategy to communicate with and convince the health-care consumer, not just the practitioner.


Thomson Reuters and NuMedii Launch Ground-Breaking Initiative to Identify Drugs for Repurposing. Companies leverage content, Big Data analytics and expertise to improve success of drug discovery


Content Curation as a Context for Teaching and Learning in Science


#OZeLIVE Feb2014

Creative Commons license


DigCCur: A graduate level program initiated by University of North Carolina to instruct the future digital curators in science and other subjects


Syracuse University offering a program in eScience and digital curation


Curation Tips from TED talks and tech experts

Steven Rosenbaum from Curation Nation


Pawan Deshpande from Curata on how content curation communities evolve and what makes good content curation:


How the Internet of Things is Promoting the Curation Effort

Update by Stephen J. Williams, PhD 3/01/19

Up till now, curation efforts like wikis (Wikipedia, Wikimedicine, WormBase, GenBank, etc.) have been supported by a largely voluntary army of citizens, scientists, and data enthusiasts.  I am sure all have seen the requests for donations to help keep Wikipedia and its related projects up and running.  One of the lesser-known sister projects of Wikipedia, Wikidata, aims to curate and represent all information in a form that machines, computers, and humans alike can converse in.  An army of about 4 million volunteers creates Wiki entries and maintains these databases.

Enter the Age of the Personal Digital Assistants (Hellooo Alexa!)

In a March 2019 WIRED article, “Encyclopedia Automata: Where Alexa Gets Its Information,” senior WIRED writer Tom Simonite reports on the need for new types of data structures, and on how curated databases are essential both to the new fields of AI and to enabling personal digital assistants like Alexa or Google Assistant to decipher the meaning of a user’s request.

As Mr. Simonite notes, many of our libraries of knowledge are encoded in an “ancient technology largely opaque to machines: prose.”  Search engines like Google have no problem with a question asked in prose, as they only have to find relevant links to pages. But this is a problem for Google Assistant, for instance, because machines cannot quickly extract meaning from the internet’s mess of “predicates, complements, sentences, and paragraphs. It requires a guide.”

Enter Wikidata.  According to founder Denny Vrandecic,

Language depends on knowing a lot of common sense, which computers don’t have access to

A Wikidata entry (of which there are about 60 million) codes every concept and item with a numeric identifier, the QID. These codes are combined with tags (like the handles or hashtags you use on Twitter, or the tags used in WordPress for search engine optimization) so that computers can recognize patterns among these codes.
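As a rough illustration of why such numeric identifiers help machines, here is a minimal sketch, in Python, of Wikidata-style statements. Q42, Q5, and P31 are real Wikidata identifiers (Douglas Adams, human, and “instance of”), but the data structures below are purely illustrative and are not Wikidata’s actual API:

```python
# Human-readable labels attached to numeric Wikidata-style IDs.
# Items get Q-numbers, properties get P-numbers.
labels = {
    "Q42": "Douglas Adams",
    "Q5": "human",
    "P31": "instance of",
}

# Statements are (subject, property, value) triples keyed by ID,
# so a machine can match patterns without parsing any prose.
statements = [
    ("Q42", "P31", "Q5"),
]

def describe(triple):
    """Render an ID-coded statement back into words for a human reader."""
    s, p, o = triple
    return f"{labels[s]} -> {labels[p]} -> {labels[o]}"

print(describe(statements[0]))  # Douglas Adams -> instance of -> human
```

The point of the sketch is that the machine only ever compares opaque codes; the human-language labels are a separate, swappable layer, which is also how Wikidata serves many languages from one set of facts.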

Human entry into these databases remains critical, as we add new facts, and in particular meaning, to each of these items.  Otherwise machines have trouble deciphering our intent, as with Apple’s Siri, whose users have complained of dumb algorithms misinterpreting requests.

The knowledge of future machines could be shaped by you and me, not just tech companies and PhDs.

But this effort needs money

Wikimedia’s executive director, Katherine Maher, has prodded and cajoled these megacorporations for tapping the free resources of the Wikis.  In response, Amazon and Facebook have donated millions to the Wikimedia projects, and Google recently gave $3.1 million in donations.


Future postings on the relevance and application of scientific curation will include:

Using Scientific Content Curation as a Method for Validation and Biocuration


Using Scientific Content Curation as a Method for Open Innovation


Other posts on this site related to Content Curation and Methodology include:

The growing importance of content curation

Data Curation is for Big Data what Data Integration is for Small Data

6 Steps to More Effective Content Curation

Stem Cells and Cardiac Repair: Content Curation & Scientific Reporting

Cancer Research: Curations and Reporting

Cardiovascular Diseases and Pharmacological Therapy: Curations

Cardiovascular Original Research: Cases in Methodology Design for Content Co-Curation The Art of Scientific & Medical Curation

Exploring the Impact of Content Curation on Business Goals in 2013

Power of Analogy: Curation in Music, Music Critique as a Curation and Curation of Medical Research Findings – A Comparison

conceived: NEW Definition for Co-Curation in Medical Research

The Young Surgeon and The Retired Pathologist: On Science, Medicine and HealthCare Policy – The Best Writers Among the WRITERS

Reconstructed Science Communication for Open Access Online Scientific Curation



Read Full Post »

PLENARY KEYNOTE PRESENTATIONS: THURSDAY, MAY 1 | 8:00 – 10:00 AM @ BioIT World, April 29 – May 1, 2014 Seaport World Trade Center, Boston, MA


Reporter: Aviva Lev-Ari, PhD, RN


Keynote Introduction: Sponsored by Fred Lee, M.D., MPH, Director, Healthcare Strategy and Business Development, Oracle Health Sciences

Heather Dewey-Hagborg

Artist, Ph.D. Student, Rensselaer Polytechnic Institute

Heather Dewey-Hagborg is an interdisciplinary artist, programmer and educator who explores art as research and public inquiry. She recreates identity from strands of human hair in an entirely different way. Collecting hairs she finds in random public places – bathrooms, libraries, and subway seats – she uses a battery of newly developing technologies to create physical, life-sized portraits of the owners of these hairs. Her fixation with a single hair leads her to controversial art projects and the study of genetics. Traversing media ranging from algorithms to DNA, her work seeks to question fundamental assumptions underpinning perceptions of human nature, technology and the environment. Examining culture through the lens of information, Heather creates situations and objects embodying concepts, probes for reflection and discussion. Her work has been featured in print, television, radio, and online. Heather has a BA in Information Arts from Bennington College and a Masters degree from the Interactive Telecommunications Program at Tisch School of the Arts, New York University. She is currently a Ph.D. student in Electronic Arts at Rensselaer Polytechnic Institute.


Yaniv Erlich, Ph.D.

Principal Investigator and Whitehead Fellow, Whitehead Institute for Biomedical Research


Dr. Yaniv Erlich is Andria and Paul Heafy Family Fellow and Principal Investigator at the Whitehead Institute for Biomedical Research. He received a bachelor’s degree from Tel-Aviv University, Israel and a PhD from the Watson School of Biological Sciences at Cold Spring Harbor Laboratory in 2010. Dr. Erlich’s research interests are computational human genetics. Dr. Erlich is the recipient of the Burroughs Wellcome Career Award (2013), Harold M. Weintraub award (2010), the IEEE/ACM-CS HPC award (2008), and he was selected as one of 2010 Tomorrow’s PIs team of Genome Technology.


Isaac Samuel Kohane, M.D., Ph.D.

Henderson Professor of Health Sciences and Technology, Children’s Hospital and Harvard Medical School;

Director, Countway Library of Medicine; Director, i2b2 National Center for Biomedical Computing;

Co-Director, HMS Center for Biomedical Informatics


Isaac Kohane, MD, PhD, co-directs the Center for Biomedical Informatics at Harvard Medical School. He applies computational techniques, whole genome analysis, and functional genomics to study human diseases through the developmental lens, and particularly through the use of animal model systems. Kohane has led the use of whole healthcare systems, notably in the i2b2 project, as “living laboratories” to drive discovery research in disease genomics (with a focus on autism) and pharmacovigilance (including providing evidence for the cardiovascular risk of hypoglycemic agents, which ultimately contributed to a “black box” warning by the FDA), as well as comparative effectiveness, with software and methods adopted in over 84 academic health centers internationally. Dr. Kohane has published over 200 papers in the medical literature and authored a widely used book, Microarrays for an Integrative Genomics. He has been elected to multiple honor societies, including the American Society for Clinical Investigation, the American College of Medical Informatics, and the Institute of Medicine. He leads a doctoral program in genomics and bioinformatics within the Division of Medical Science at Harvard University. He is also an occasionally practicing pediatric endocrinologist.


#SachsBioinvestchat, #bioinvestchat


Read Full Post »

Track 5 Next-Gen Sequencing Informatics: Advances in Analysis and Interpretation of NGS Data @ BioIT World, April 29 – May 1, 2014 Seaport World Trade Center, Boston, MA

Reporter: Aviva Lev-Ari, PhD, RN


NGS Bioinformatics Marketplace: Emerging Trends and Predictions

10:50 Chairperson’s Remarks

Narges Baniasadi, Ph.D., Founder & CEO, Bina Technologies, Inc.

11:00 Global Next-Generation Sequencing Informatics Markets: Inflated Expectations in an Emerging Market

Greg Caressi, Senior Vice President, Healthcare and Life Sciences, Frost & Sullivan

This presentation evaluates the global next-generation sequencing (NGS) informatics markets from 2012 to 2018. Learn key market drivers and restraints, key highlights for many of the leading NGS informatics services providers and vendors, revenue forecasts, and the important trends and predictions that affect market growth.

Organizational Approaches to NGS Informatics

11:30 High-Performance Databases to Manage and Analyze NGS Data

Joseph Szustakowski, Ph.D., Head, Bioinformatics, Biomarker Development, Novartis Institutes for Biomedical Research

The size, scale, and complexity of NGS data sets call for new data management and analysis strategies. High-performance database systems combine the advantages of both established and cutting-edge technologies. We are using high-performance database systems to manage and analyze NGS, clinical, pathway, and phenotypic data with great success. We will describe our approach and concrete success stories that demonstrate its efficiency and effectiveness.

12:00 pm Taming Big Science Data Growth with Converged Infrastructure

Aaron D. Gardner, Senior Scientific Consultant, BioTeam, Inc.

Many of the largest NGS sites have identified I/O bottlenecks as their number one concern in growing their infrastructure to support current and projected data growth rates. In this talk, Aaron D. Gardner, Senior Scientific Consultant, BioTeam, Inc., will share real-world strategies and implementation details for building converged storage infrastructure to support the performance, scalability and collaborative requirements of today’s NGS workflows.

12:15 Next Generation Sequencing:  Workflow Overview from a High-Performance Computing Point of View

Carlos P. Sosa, Ph.D., Applications Engineer, HPC Lead, Cray, Inc.

Next Generation Sequencing (NGS) allows for the analysis of genetic material with unprecedented speed and efficiency. NGS increasingly shifts the burden from chemistry done in a laboratory to a string-manipulation problem, well suited to High-Performance Computing. We explore the impact of the NGS workflow on the design of IT infrastructures. We also present Cray’s most recent solutions for the NGS workflow.


Bioinformatics and BIG DATA – NGS @ Cray in 2014

I/O movement and data storage – a UNIFIED solution by Cray

  • Data access
  • Fast access
  • Storage
  • Managing high-performance computing for the NGS workflow: multiple human genomes – 61, then 240 sequentially, completed with high performance in 51 hours; 140 genomes processed simultaneously

Architecture @ Cray for Genomics

  • Sequencers
  • Galaxy
  • Servers for analysis
  • Workstation: Illumina, Galaxy; Cray does the integration of third-party software using a workflow LEVERAGING the network, the fastest in the world, using MPI for scaling and I/O
  • Compute blades, with reserves for I/O nodes; the fastest interconnect in the industry
  • Scale of capacity and capability; interconnect linked in the file system: Lustre
  • Optimization of the bottleneck: capability, capacity, and file structure for super-fast I/O

12:40 Luncheon Presentation I

Erasing the Data Analysis Bottleneck with BaseSpace

Jordan Stockton, Ph.D., Marketing Director, Enterprise Informatics, Illumina, Inc.

Since the inception of next generation sequencing, great attention has been paid to challenges such as storage, alignment, and variant calling. We believe that this narrow focus has distracted many biologists from higher-level scientific goals, and that simplifying this process will expedite the discovery process in the field of applied genomics. In this talk we will show that applications in BaseSpace can empower a new class of researcher to go from sample to answer quickly, and can allow software developers to make their tools accessible to a vast and receptive audience.

1:10 Luncheon Presentation II: Sponsored by

The Empowered Genome Community: First Insights from Shareable Joint Interpretation of Personal Genomes for Research

Nathan Pearson, Ph.D. Principal Genome Scientist,


Genome sequencing is becoming prevalent; however, understanding each genome requires comparing many genomes. We launched the Empowered Genome Community, consisting of people from programs such as the Personal Genome Project (PGP) and Illumina’s Understand Your Genome. Using Ingenuity Variant Analysis, members have identified proof-of-principle insights on a common complex disease (here, myopia) derived by open, collaborative analysis of PGP genomes.

Pearson in REAL TIME

One genome vs. a population of genomes

If one genome:

  1. ancestry
  2. family health
  3. less about drugs and mirrors
  4. health is complex


1. mine my genome

2. what all genomes will do for humanity, not what my genome can do for me

3. cohort analysis, rich in variants

4. Ingenuity Variant Analysis – a secure environment

5. comparison of genomes: a sequence, reference matching

6. phylogenomic, statistical analysis, as population geneticists do

Open, collaborative myopia analysis: rare GENES leading to myopia – 111 genomes

– first-pass findings highlight 12 plausibly myopia-relevant genes: variants in cases vs. controls

– refined findings and analysis: statistical association, common variants
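The case-versus-control comparison sketched in these notes can be illustrated with a toy calculation. All counts below are hypothetical, not from the talk; the sketch simply shows, in plain Python, the kind of 2x2 odds-ratio comparison used when contrasting variant carriers in cases vs. controls:

```python
# Toy variant case/control comparison (all counts hypothetical).

def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Odds ratio for a 2x2 table of variant carriers vs. non-carriers."""
    a = case_carriers                      # cases carrying the variant
    b = case_total - case_carriers         # cases without it
    c = control_carriers                   # controls carrying the variant
    d = control_total - control_carriers   # controls without it
    if b == 0 or c == 0:
        raise ValueError("degenerate 2x2 table")
    return (a * d) / (b * c)

# 111 genomes split into a hypothetical 60 cases and 51 controls
print(round(odds_ratio(18, 60, 5, 51), 2))  # -> 3.94
```

An odds ratio well above 1 flags a candidate gene for the kind of follow-up statistical-association work the notes mention; a real analysis would also compute a p-value (e.g. Fisher’s exact test) rather than stop at the raw ratio.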

Read Full Post »

Older Posts »