“There is a whole economy behind writing fraudulent reviews, and people paying these review writers,” explained Gupta, an assistant professor of computer science.
The researchers pointed out a website called fiverr.com, where everything sells for five dollars, including fraudulent reviews.
One post says, “I will buy your Amazon product and write your review for five dollars.”
Another states, “I will do post two nice and attractive Amazon reviews.”
For their study, Gupta and Hoyle bought 55 reviews for different products they found on Amazon.
“We took all their fraudulent reviews, and then we studied their characteristics,” explained Gupta.
They found that the people writing reviews for money were from all over the world: from the U.S. to Ireland, India and Bangladesh.
“In general, it is getting harder to distinguish good from the bad on the web,” said Gupta.
They say there are clues to help consumers understand what’s real and what’s fake.
Gupta and Hoyle recommend looking on Amazon for a label on reviews: “Amazon Verified Product.” That means the reviewer actually bought the product, and the review is more likely real.
They recommend looking at what else the reviewer has written. Hoyle found one reviewer who had copied and pasted the same review onto multiple CDs.
Another potential warning sign: a reviewer who gives all five-star reviews within a short period of time may be getting paid to post positive reviews.
“You have to use a lot more judgment, and increasingly the notion of reputation will become more and more important,” said Gupta. “We are starting to see this.”
In their study, Gupta and Hoyle estimate that 257,000 reviews on Amazon (about 1 percent) are fraudulent. Their long-term goal is to develop a program that eliminates fake reviews entirely.
Rob Slaven reviews books — not for the money, but for the love of reading. He gets to keep every book he reviews.
“The books I’ve reviewed, I’ve tried to be devastatingly honest,” said Slaven.
If you look at all of his reviews, you’ll see a line he adds, disclosing he got the book for doing the review. He also gets feedback — positive and negative — for each review he posts.
He knows not everyone is as honest as he is, but says he’ll keep up his side of the bargain.
“I’m not going to say something that’s not true,” explained Slaven. “It would be me misleading people. Me, misleading people like me.”
Review sites say they’re fighting fraud. Yelp representatives say they’ve always had a review filter to keep fake content out. They also recently started posting a giant red “consumer alert” sign on businesses that tried to mislead people. Amazon also has a flagging system.
But, researchers say not all fake reviews are caught. It’s a disappointing, but not surprising, revelation for honest reviewers.
“Customers go on Amazon in order to get trustworthy reviews, and to get candid opinions,” said Slaven.
Federal Trade Commission spokeswoman Betsy Lordan tells 24-Hour News 8 that paying for online reviews is legal, as long as the reviewer explains they’ve been compensated.
Of course, with the reviews 24-Hour News 8 discovered, there’s been no disclosure.
Larry H Bernstein, MD
Leaders in Pharmaceutical Intelligence
I call attention to an interesting article that just came out. The estimated improvement in healthcare cost savings and diagnostic accuracy is substantial. I have written about the potential that remains unused. In short, there is justification for substantial investment of resources in this, as has been proposed as a critical goal. Does this mean a reduction in staffing? I wouldn't look at it that way. The two huge benefits that would accrue are:
workflow efficiency, reducing stress and facilitating decision-making.
scientifically, primary knowledge-based decision support by well-developed algorithms that have been at the heart of computational genomics.
Can computers save health care? IU research shows lower costs, better outcomes
Cost per unit of outcome was $189, versus $497 for treatment as usual
Last modified: Monday, February 11, 2013
BLOOMINGTON, Ind. — New research from Indiana University has found that machine learning — the same computer science discipline that helped create voice recognition systems, self-driving cars and credit card fraud detection systems — can drastically improve both the cost and quality of health care in the United States.
Physicians using an artificial intelligence framework that predicts future outcomes would have better patient outcomes while significantly lowering health care costs.
Using an artificial intelligence framework combining Markov Decision Processes and Dynamic Decision Networks, IU School of Informatics and Computing researchers Casey Bennett and Kris Hauser show how simulation modeling that understands and predicts the outcomes of treatment could
reduce health care costs by over 50 percent while also
improving patient outcomes by nearly 50 percent.
The work by Hauser, an assistant professor of computer science, and Ph.D. student Bennett improves upon their earlier work that
showed how machine learning could determine the best treatment at a single point in time for an individual patient.
By using a new framework that employs sequential decision-making, the previous single-decision research
can be expanded into models that simulate numerous alternative treatment paths out into the future;
maintain beliefs about patient health status over time even when measurements are unavailable or uncertain; and
continually plan/re-plan as new information becomes available.
In other words, it can "think like a doctor" (perhaps better, given the limit on the amount of information even a bright, competent physician can handle without error).
“The Markov Decision Processes and Dynamic Decision Networks enable the system to deliberate about the future, considering all the different possible sequences of actions and effects in advance, even in cases where we are unsure of the effects,” Bennett said. Moreover, the approach is non-disease-specific — it could work for any diagnosis or disorder, simply by plugging in the relevant information. (This actually raises the question of what the information input is, and the cost of inputting.)
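As a purely illustrative aside, the snippet below runs textbook value iteration over a toy treatment-planning MDP. The states ("moderate", "mild", "remission"), actions, transition probabilities, and rewards are invented for this sketch; they are not the parameters of the IU framework, which additionally layers Dynamic Decision Networks on top of the MDP.

```python
# Minimal value-iteration sketch for a toy treatment-planning MDP.
# All states, actions, probabilities, and rewards below are invented
# for illustration; they are not the IU model's parameters.

# state -> action -> list of (probability, next_state, reward)
TRANSITIONS = {
    "moderate": {
        "medication":    [(0.6, "mild", 10), (0.4, "moderate", -2)],
        "psychotherapy": [(0.5, "mild", 8), (0.5, "moderate", -1)],
    },
    "mild": {
        "medication":    [(0.7, "remission", 20), (0.3, "mild", 0)],
        "watchful_wait": [(0.4, "remission", 15), (0.6, "moderate", -5)],
    },
    "remission": {},  # terminal state: no further actions
}

def value_iteration(transitions, gamma=0.95, tol=1e-6):
    """Return optimal state values and the greedy policy for a small MDP."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for state, actions in transitions.items():
            if not actions:                 # terminal states keep value 0
                continue
            best = max(
                sum(p * (r + gamma * values[nxt]) for p, nxt, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            break
    policy = {
        state: max(
            actions,
            key=lambda a: sum(p * (r + gamma * values[nxt]) for p, nxt, r in actions[a]),
        )
        for state, actions in transitions.items() if actions
    }
    return values, policy

if __name__ == "__main__":
    values, policy = value_iteration(TRANSITIONS)
    print("State values:", values)
    print("Greedy policy:", policy)
```

The point of the sketch is only the mechanics the press release describes: the planner weighs every action's probabilistic consequences several steps ahead and keeps the sequence with the best expected value.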
The new work addresses three vexing issues related to health care in the U.S.:
rising costs expected to reach 30 percent of the gross domestic product by 2050;
a quality of care where patients receive correct diagnosis and treatment less than half the time on a first visit;
and a lag time of 13 to 17 years between research and practice in clinical care.
“We’re using modern computational approaches to learn from clinical data and develop complex plans through the simulation of numerous, alternative sequential decision paths,” Bennett said. “The framework here easily out-performs the current treatment-as-usual, case-rate/fee-for-service models of health care.” (see the above)
Bennett is also a data architect and research fellow with Centerstone Research Institute, the research arm of Centerstone, the nation's largest not-for-profit provider of community-based behavioral health care. The two researchers had access to clinical data, demographics and other information on over 6,700 patients with major clinical depression diagnoses, of whom about 65 to 70 percent had co-occurring chronic physical disorders like diabetes, hypertension and cardiovascular disease. Using 500 randomly selected patients from that group for simulations, the two
compared actual doctor performance and patient outcomes against
sequential decision-making models
using real patient data.
They found great disparity in the cost per unit of outcome change:
the artificial intelligence model's cost of $189 versus the treatment-as-usual cost of $497; and
a 30 to 35 percent increase in patient outcomes with the AI approach.
Bennett said that “tweaking certain model parameters could enhance the outcome advantage to about 50 percent more improvement at about half the cost.”
While most medical decisions are based on case-by-case, experience-based approaches, there is a growing body of evidence that complex treatment decisions might be effectively improved by AI modeling. Hauser said “Modeling lets us see more possibilities out to a further point – because they just don’t have all of that information available to them.” (Even then, the other issue is the processing of the information presented.)
Using the growing availability of electronic health records, health information exchanges, large public biomedical databases and machine learning algorithms, the researchers believe the approach could serve as the basis for personalized treatment through integration of diverse, large-scale data passed along to clinicians at the time of decision-making for each patient. Centerstone alone, Bennett noted, has access to health information on over 1 million patients each year. “Even with the development of new AI techniques that can approximate or even surpass human decision-making performance, we believe that the most effective long-term path could be combining artificial intelligence with human clinicians,” Bennett said. “Let humans do what they do well, and let machines do what they do well. In the end, we may maximize the potential of both.”
“Artificial Intelligence Framework for Simulating Clinical Decision-Making: A Markov Decision Process Approach” was published recently in Artificial Intelligence in Medicine. The research was funded by the Ayers Foundation, the Joe C. Davis Foundation and Indiana University.
IBM Watson Finally Graduates Medical School
It’s been more than a year since IBM’s Watson computer appeared on Jeopardy and defeated several of the game show’s top champions. Since then the supercomputer has been furiously “studying” the healthcare literature in the hope that it can beat a far more hideous enemy: the 400-plus biomolecular puzzles we collectively refer to as cancer.
Anomaly Based Interpretation of Clinical and Laboratory Syndromic Classes
Larry H Bernstein, MD, Gil David, PhD, Ronald R Coifman, PhD. Program in Applied Mathematics, Yale University, Triplex Medical Science.
Statement of Inferential Second Opinion
Realtime Clinical Expert Support and Validation System
Gil David and Larry Bernstein have developed, in consultation with Prof. Ronald Coifman, in the Yale University Applied Mathematics Program, a software system that is the equivalent of an intelligent Electronic Health Records Dashboard that provides
empirical medical reference and suggests quantitative diagnostic options.
Background
The current design of the Electronic Medical Record (EMR) is a linear presentation of portions of the record by
services, by
diagnostic method, and by
date, to cite examples.
This allows perusal through a graphical user interface (GUI) that partitions the information or necessary reports on a workstation, accessed by keying to icons. This requires that the medical practitioner find
the history,
medications,
laboratory reports,
cardiac imaging and EKGs, and
radiology
in different workspaces. The introduction of a DASHBOARD has allowed a presentation of
drug reactions,
allergies,
primary and secondary diagnoses, and
critical information about any patient whose record the caregiver needs to access.
The advantage of this innovation is obvious. The startup problem is what information is presented and how it is displayed, which is a source of variability and a key to its success.
Proposal
We are proposing an innovation that supersedes the main design elements of a DASHBOARD and
utilizes the conjoined syndromic features of the disparate data elements.
So the important determinant of the success of this endeavor is that it facilitates both
the workflow and
the decision-making process
with a reduction of medical error.
This has become extremely important and urgent in the 10 years since the publication “To Err is Human”, and the newly published finding that reduction of error is as elusive as reduction in cost. Whether they are counterproductive when approached in the wrong way may be subject to debate.
We initially confine our approach to laboratory data because it is collected on all patients, ambulatory and acutely ill, because the data is objective and quality controlled, and because
laboratory combinatorial patterns emerge with the development and course of disease. Work is in progress to extend these capabilities with model data sets and sufficient data.
It is true that the extraction of data from disparate sources will, in the long run, further improve this process. For instance, consider the finding of ST depression on EKG coincident with an increase of a cardiac biomarker (troponin) above a level determined by receiver operating characteristic (ROC) analysis, particularly in the absence of substantially reduced renal function.
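As a rough illustration of how such a decision level can be derived, the sketch below scans candidate troponin cutoffs and keeps the one maximizing Youden's J (sensitivity + specificity - 1), which is one common way to read a cutoff off an ROC analysis. The values, labels, and function name are synthetic assumptions, not clinical data or a validated threshold.

```python
# Hedged sketch: deriving a biomarker cutoff from an ROC-style analysis.
# The troponin values and outcome labels below are synthetic.

def youden_threshold(values, labels):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1."""
    best_cut, best_j = None, -1.0
    for cut in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if v >= cut and y == 1)
        fn = sum(1 for v, y in zip(values, labels) if v < cut and y == 1)
        tn = sum(1 for v, y in zip(values, labels) if v < cut and y == 0)
        fp = sum(1 for v, y in zip(values, labels) if v >= cut and y == 0)
        sens = tp / (tp + fn) if (tp + fn) else 0.0
        spec = tn / (tn + fp) if (tn + fp) else 0.0
        j = sens + spec - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j

# Synthetic troponin results (ng/mL); 1 = myocardial injury, 0 = no injury.
troponin = [0.01, 0.02, 0.03, 0.05, 0.04, 0.30, 0.45, 0.08, 0.60, 1.20]
injury   = [0,    0,    0,    0,    0,    1,    1,    0,    1,    1]

cutoff, j = youden_threshold(troponin, injury)
print(f"ROC-derived cutoff: {cutoff} ng/mL (Youden J = {j:.2f})")
```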
The conversion of hematology-based data into useful clinical information requires the establishment of problem-solving constructs based on the measured data. Traditionally this has been accomplished by an intuitive interpretation of the data by the individual clinician. Through the application of geometric clustering analysis, the data may be interpreted in a more sophisticated fashion in order to create a more reliable and valid knowledge-based opinion.
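For a loose sense of what clustering multivariate hemogram data looks like in practice, the sketch below standardizes a few synthetic rows and groups them with ordinary k-means. This is a generic stand-in, not the geometric (diffusion-based) clustering developed with the Yale group; the patient values, feature set, and cluster count are all assumptions.

```python
# Hedged sketch: unsupervised clustering of synthetic hemogram rows.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Columns: WBC (10^3/uL), hemoglobin (g/dL), MCV (fL), platelets (10^3/uL)
hemograms = np.array([
    [6.5, 14.2, 90, 250],    # roughly normal pattern
    [7.1, 13.8, 88, 270],
    [5.9, 10.1, 72, 310],    # microcytic-anemia-like pattern
    [6.2,  9.8, 70, 330],
    [18.5, 12.9, 89, 480],   # leukocytosis / reactive pattern
    [21.0, 12.5, 91, 520],
])

# Standardize so no single analyte dominates the distance metric.
X = StandardScaler().fit_transform(hemograms)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
for row, lab in zip(hemograms, labels):
    print(f"cluster {lab}: WBC={row[0]}, Hgb={row[1]}, MCV={row[2]}, Plt={row[3]}")
```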
The most commonly ordered test used for managing patients worldwide is the hemogram, which often incorporates the review of a peripheral smear. While the hemogram has undergone progressive modification of the measured features over time, the subsequent expansion of the panel of tests has provided a window into the cellular changes in the production, release or suppression of the formed elements from the blood-forming organ to the circulation. In the hemogram one can view data reflecting the characteristics of a broad spectrum of medical conditions.
Progressive modification of the measured features of the hemogram has delineated characteristics expressed as measurements of
size,
density, and
concentration,
resulting in more than a dozen composite variables, including the
mean corpuscular volume (MCV),
mean corpuscular hemoglobin concentration (MCHC),
mean corpuscular hemoglobin (MCH),
total white cell count (WBC),
total lymphocyte count,
neutrophil count (mature granulocyte count and bands),
monocytes,
eosinophils,
basophils,
platelet count, and
mean platelet volume (MPV),
blasts,
reticulocytes and
platelet clumps,
perhaps the percent immature neutrophils (not bands)
as well as other features of classification.
The use of such variables, combined with additional clinical information including serum chemistry analysis (such as the Comprehensive Metabolic Profile (CMP)) in conjunction with the clinical history and examination, completes the traditional problem-solving construct. The intuitive approach applied by the individual clinician is limited, however,
by experience,
memory and
cognition.
The application of rules-based, automated problem solving may provide a more reliable and valid approach to the classification and interpretation of the data used to determine a knowledge-based clinical opinion.
The classification of the available hematologic data in order to formulate a predictive model may be accomplished through mathematical models that offer a more reliable and valid approach than the intuitive knowledge-based opinion of the individual clinician. The exponential growth of knowledge since the mapping of the human genome has been enabled by parallel advances in applied mathematics that have not been a part of traditional clinical problem solving. In a univariate universe the individual has significant control in visualizing data, because dissimilar data may be identified by methods that rely on distributional assumptions. As the complexity of statistical models has increased, involving the use of several predictors for different clinical classifications, the dependencies have become less clear to the individual. The powerful statistical tools now available are not dependent on distributional assumptions, and allow classification and prediction in a way that cannot be achieved by the individual clinician intuitively. Contemporary statistical modeling has a primary goal of finding an underlying structure in studied data sets.
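To make the idea of distribution-free, multivariate classification concrete, here is a minimal sketch that fits a shallow decision tree to a handful of synthetic hemogram-style vectors. The features, labels, and any cutpoints the tree learns are invented for illustration; none of this comes from the system described in this article.

```python
# Hedged sketch: a non-parametric classifier on synthetic hemogram features.
from sklearn.tree import DecisionTreeClassifier, export_text

# Features per row: [MCV (fL), hemoglobin (g/dL), WBC (10^3/uL)]
X = [
    [90, 14.5, 6.0], [88, 13.9, 7.2],     # "normal"
    [70,  9.5, 6.5], [72, 10.2, 5.8],     # "microcytic anemia"
    [89, 12.8, 19.0], [92, 13.1, 22.5],   # "leukocytosis"
]
y = ["normal", "normal",
     "microcytic anemia", "microcytic anemia",
     "leukocytosis", "leukocytosis"]

clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(clf, feature_names=["MCV", "Hgb", "WBC"]))
print(clf.predict([[71, 9.9, 6.1]]))  # expected: microcytic anemia
```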
In the diagnosis of anemia the variables MCV, MCHC and MCH classify the disease process into microcytic, normocytic and macrocytic categories (a minimal rules-based sketch follows this list). Further consideration of
proliferation of marrow precursors,
the domination of a cell line, and
features of suppression of hematopoiesis
provide a two-dimensional model. Several other possible dimensions are created by consideration of
the maturity of the circulating cells.
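The rules-based sketch promised above: a first-pass morphological label from hemoglobin, MCV and MCHC alone, assuming common textbook cutoffs. It is an illustration of the construct, not a validated decision rule from this system.

```python
# Hedged sketch: coarse rules-based anemia morphology from a hemogram.
# Cutoffs are common textbook values, used here only for illustration.

def classify_anemia(hgb_g_dl, mcv_fl, mchc_g_dl, sex="F"):
    """Return a coarse morphological label for a hemogram."""
    anemia_cutoff = 13.0 if sex == "M" else 12.0
    if hgb_g_dl >= anemia_cutoff:
        return "no anemia by hemoglobin criterion"
    if mcv_fl < 80:
        label = "microcytic anemia"
    elif mcv_fl > 100:
        label = "macrocytic anemia"
    else:
        label = "normocytic anemia"
    if mchc_g_dl < 32:
        label += ", hypochromic"
    return label

print(classify_anemia(9.8, 70, 29))    # microcytic, hypochromic pattern
print(classify_anemia(10.5, 112, 33))  # macrocytic pattern
```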
The development of an evidence-based inference engine that can substantially interpret the data at hand and convert it in real time to a “knowledge-based opinion” may improve clinical problem solving by incorporating multiple complex clinical features as well as duration of onset into the model.
An example of a difficult area for clinical problem solving is found in the diagnosis of SIRS and associated sepsis. SIRS (and associated sepsis) is a costly diagnosis in hospitalized patients. Failure to diagnose sepsis in a timely manner creates a potential financial and safety hazard. The early diagnosis of SIRS/sepsis is made by the application of defined criteria (temperature, heart rate, respiratory rate and WBC count) by the clinician. The application of those clinical criteria, however, defines the condition after it has developed and has not provided a reliable method for the early diagnosis of SIRS. The early diagnosis of SIRS may possibly be enhanced by the measurement of proteomic biomarkers, including transthyretin, C-reactive protein and procalcitonin. Immature granulocyte (IG) measurement has been proposed as a more readily available indicator of the presence of
granulocyte precursors (left shift).
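For concreteness, the snippet below encodes the classic two-or-more-of-four SIRS screen referred to above; the cutoffs follow the widely cited consensus criteria, but the function itself is only an illustration, not the proposed decision-support system.

```python
# Hedged sketch: the classic >= 2-of-4 SIRS screen on vital signs and WBC.

def sirs_screen(temp_c, heart_rate, resp_rate, wbc_k_per_ul, bands_pct=0.0):
    """Return (criteria_met, sirs_positive) for one set of observations."""
    criteria = [
        temp_c > 38.0 or temp_c < 36.0,                                  # temperature
        heart_rate > 90,                                                 # tachycardia
        resp_rate > 20,                                                  # tachypnea
        wbc_k_per_ul > 12.0 or wbc_k_per_ul < 4.0 or bands_pct > 10.0,   # WBC / bands
    ]
    count = sum(criteria)
    return count, count >= 2

print(sirs_screen(38.6, 104, 24, 13.2))  # meets SIRS criteria
print(sirs_screen(37.0, 82, 16, 7.5))    # does not
```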
The use of such markers, obtained by automated systems in conjunction with innovative statistical modeling, may provide a mechanism to enhance workflow and decision making.
An accurate classification based on the multiplicity of available data can be provided by an innovative system that utilizes the conjoined syndromic features of disparate data elements. Such a system has the potential to facilitate both the workflow and the decision-making process with an anticipated reduction of medical error.
This study is only an extension of our approach to repairing a longstanding problem in the construction of the many-sided electronic medical record (EMR). On the one hand, past history combined with the development of Diagnosis Related Groups (DRGs) in the 1980s has driven technology development in the direction of "billing capture", which has been a focus of epidemiological studies in health services research using data mining.
In a classic study carried out at Bell Laboratories, Didner found that information technologies reflect the views of their creators, not the users, and that Front-to-Back Design (R. Didner) is needed. He expresses the view:
“Pre-printed forms are much more amenable to computer-based storage and processing, and would improve the efficiency with which the insurance carriers process this information. However, pre-printed forms can have a rather severe downside. By providing pre-printed forms that a physician completes
to record the diagnostic questions asked,
as well as tests and results,
the sequence of tests and questions
might be altered from that which a physician would ordinarily follow. This sequence change could improve outcomes in rare cases, but it is more likely to worsen outcomes. “
Decision Making in the Clinical Setting. Robert S. Didner
A well-documented problem in the medical profession is the level of effort dedicated to administration and paperwork necessitated by health insurers, HMOs and other parties (ref). This effort is currently estimated at 50% of a typical physician's practice activity, and obviously contributes to the high cost of medical care. A key element in the cost/effort composition is the retranscription of clinical data after the point at which it is collected. Costs would be reduced, and accuracy improved, if the clinical data could be captured directly at the point it is generated, in a form suitable for transmission to insurers or machine-transformable into other formats. Such data capture could also be used to improve the form and structure of how this information is viewed by physicians, and to form the basis of a more comprehensive database linking clinical protocols to outcomes, which could improve knowledge of this relationship and hence clinical outcomes.
How we frame our expectations is so important that
it determines the data we collect to examine the process.
In the absence of data to support an assumed benefit, there is no proof of validity at whatever cost. This has meaning for
hospital operations, for
nonhospital laboratory operations, for
companies in the diagnostic business, and
for planning of health systems.
In 1983, a vision for creating the EMR was introduced by Lawrence Weed and others. This is expressed by McGowan and Winstead-Fry.
J J McGowan and P Winstead-Fry. Problem Knowledge Couplers: reengineering evidence-based medicine through interdisciplinary development, decision support, and research.
Example of a Markov Decision Process (MDP) transition automaton (Photo credit: Wikipedia)
Control loop of a Markov Decision Process (Photo credit: Wikipedia)
IBM's Watson computer, Yorktown Heights, NY (Photo credit: Wikipedia)
Increasing decision stakes and systems uncertainties entail new problem-solving strategies. Image based on a diagram by Funtowicz, S. and Ravetz, J. (1993) "Science for the post-normal age" Futures 25:735–55 (http://dx.doi.org/10.1016/0016-3287(93)90022-L). (Photo credit: Wikipedia)
Article 1.1 Advances in the Understanding of the Human Genome: The Initiation and Growth of Molecular Biology and Genomics – Part I
Introduction and purpose
This material will cover the initiation phase of molecular biology, Part I; to be followed by the Human Genome Project, Part II; and concludes with Ubiquitin, its Role in Signaling and Regulatory Control, Part III. This article is a continuation of a previous discussion on the role of genomics in discovery of therapeutic targets, titled Directions for genomics in personalized medicine http://pharmaceuticalintelligence.com/2013/01/27/directions-for-genomics-in-personalized-medicine/
The previous article focused on key drivers of cellular proliferation, stepwise mutational changes coinciding with cancer progression, and potential therapeutic targets for reversal of the process. It also covers the race to delineation of the Human Genome, discovery methods and fundamental genomic patterns that are ancient in both animal and plant speciation.
This article reviews the web-like connections between early and later discoveries, as significant findings have led to novel hypotheses and many more findings over the last 75 years. This largely post-WWII revolution has driven our understanding of biological and medical processes at an exponential pace, owing to successive discoveries of chemical structure, the basic building blocks of DNA and proteins, nucleotide and protein-protein interactions, protein folding, allostericity, genomic structure, DNA replication, nuclear polyribosome interaction, and metabolic control. In addition, the emergence of methods for copying, removal and insertion, and improvements in structural analysis, as well as developments in applied mathematics, have transformed the research framework.
In the Beginning
During the Second World War we had the discoveries of physics and the emergence, out of the Manhattan Project, of radioactive nuclear probes from E. O. Lawrence's laboratory at the University of California, Berkeley. The use of radioactive isotopes led to the development of biochemistry and the isolation of nucleotides, nucleosides and enzymes, and to the filling in of details of pathways for photosynthesis, biosynthesis, and catabolism. Perhaps a good start of the journey is a student of Niels Bohr named Max Delbruck (September 4, 1906 – March 9, 1981), who won the Nobel Prize for discovering that bacteria become resistant to viruses (phages) as a result of genetic mutations, and who founded a new discipline called molecular biology, lifting experimental work in physiology to systematic experimentation in biology with the rigor of physics, using radiation and virus probes on selected cells. In 1937 he turned to research on the genetics of Drosophila melanogaster at Caltech, and two years later he coauthored a paper, "The growth of bacteriophage", reporting that the viruses replicate in one step, not exponentially. In 1942, he and Salvador Luria of Indiana University demonstrated that bacterial resistance to virus infection is mediated by random mutation. This research, known as the Luria-Delbrück experiment, notably applied mathematics to make quantitative predictions, and earned them the 1969 Nobel Prize in Physiology or Medicine, shared with Alfred Hershey. His inferences on genes' susceptibility to mutation were relied on by physicist Erwin Schrödinger in his 1944 book, What Is Life?, which conjectured that genes were an "aperiodic crystal" storing code-script and influenced Francis Crick and James D. Watson in their 1953 identification of cellular DNA's molecular structure as a double helix.
Watson-Crick Double Helix Model
A new understanding of heredity and hereditary disease was possible once it was determined that DNA consists of two chains of alternating phosphate and sugar groups twisted around each other in a double helix, and that the two chains are held together by hydrogen bonds between pairs of organic bases—adenine (A) with thymine (T), and guanine (G) with cytosine (C). Modern biotechnology also has its basis in the structural knowledge of DNA—in this case the scientist's ability to modify the DNA of host cells that will then produce a desired product, for example, insulin. The background for the work of the four scientists was formed by several scientific breakthroughs:
the progress made by X-ray crystallographers in studying organic macromolecules;
the growing evidence supplied by geneticists that it was DNA, not protein, in chromosomes that was responsible for heredity;
Erwin Chargaff’s experimental finding that there are equal numbers of A and T bases and of G and C bases in DNA;
and Linus Pauling’s discovery that the molecules of some proteins have helical shapes.
In 1962 James Watson (b. 1928), Francis Crick (1916–2004), and Maurice Wilkins (1916–2004) jointly received the Nobel Prize in physiology or medicine for their 1953 determination of the structure of deoxyribonucleic acid (DNA), performed with a knowledge of Chargaff's ratios of the bases in DNA and with access to the X-ray crystallography of Maurice Wilkins and Rosalind Franklin at King's College London. Because the Nobel Prize can be awarded only to the living, Wilkins's colleague Rosalind Franklin (1920–1958), who died of cancer at the age of 37, could not be honored. Of the four DNA researchers, only Rosalind Franklin had any degrees in chemistry. Franklin completed her degree in 1941 in the middle of World War II and undertook graduate work at Cambridge with Ronald Norrish, a future Nobel Prize winner. After a year of war service she returned to Cambridge, presented her work, and received the PhD in physical chemistry. Franklin then learned X-ray crystallography in Paris and rapidly became a respected authority in this field. Returning to England, to King's College London, in 1951, her charge was to upgrade the X-ray crystallographic laboratory there for work with DNA.
Cold Spring Harbor Laboratory
I digress to the beginnings of the Cold Spring Harbor Laboratory. A significant part of the Laboratory's life revolved around education with its three-week-long Phage Course, taught first in 1945 by Max Delbruck, the German-born, theoretical-physicist-turned-biologist. James D Watson first came to Cold Spring Harbor Laboratory with his thesis advisor, Salvador Luria, in the summer of 1948. Over its more than 25-year history, the Phage Course was the training ground for many notable scientists. The Laboratory's annual scientific Symposium has provided a unique, highly interactive education about the exciting field of "molecular" biology. The 1953 symposium featured Watson coming from England to give the first public presentation of the DNA double helix. When he became the Laboratory's director in 1968 he was determined to make the Laboratory an important center for advancing molecular biology, and he focused his energy on bringing large donations to the enterprise. CSHL became a magnet for future discovery, and Watson later became its Chancellor. This contribution has as great an importance as his Nobel Prize discovery.
Biochemistry and Molecular Probes Come into View
Moreover, at the same time, the experience of Nathan Kaplan and Martin Kamen at Berkeley working with radioactive probes was the beginning of the establishment of the Lawrence-Livermore Laboratories' role in metabolic studies, as reported in the previous paper. A collaboration between Sid Collowick, NO Kaplan and Elizabeth Neufeld at the McCollum Pratt Institute led to the transferase reaction between the two main pyridine nucleotides. Neufeld received a PhD a few years later from the University of California, Berkeley, under William Zev Hassid for research on nucleotides and complex carbohydrates, and did postdoctoral studies on non-protein sulfhydryl compounds in mitosis. Her later work at the NIAMDG was on the mucopolysaccharidoses. The lysosomal storage diseases opened a new chapter on human genetic diseases when she found that the defects in Hurler and Hunter syndromes were due to decreased degradation of the mucopolysaccharides. When an assay became available for α-L-iduronidase in 1972, Neufeld was able to show that the corrective factor for Hurler syndrome that accelerates degradation of stored sulfated mucopolysaccharides was α-L-iduronidase.
The Hurler Corrective Factor. Purification and Some Properties (Barton, R. W., and Neufeld, E. F. (1971) J. Biol. Chem. 246, 7773–7779); The Sanfilippo A Corrective Factor. Purification and Mode of Action (Kresse, H., and Neufeld, E. F. (1972) J. Biol. Chem. 247, 2164–2170)
I mention this for two reasons: [1] We see a huge impetus for nucleic acid and nucleotide research growing in the 1950s with the post-WWII emergence of work on biological structure. [2] At the same time, the importance of enzymes in cellular metabolic processes runs parallel to that of the genetic code.
In 1959 Arthur Kornberg was a recipient of the Nobel Prize in Physiology or Medicine, together with Dr. Severo Ochoa of New York University, for his discovery of "the mechanisms in the biological synthesis of deoxyribonucleic acid" (DNA polymerase). In the next 20 years the Stanford University Department of Biochemistry became a top-rated graduate program in biochemistry. Today, the Pfeffer Lab is distinguished for research into how human cells put receptors in the right place through Rab GTPases that regulate all aspects of receptor trafficking. Steve Elledge (1984-1989), now at Harvard University, is one of its graduates from the 1980s.
Transcription –RNA and the ribosome
In 2006, Roger Kornberg was awarded the Nobel Prize in Chemistry for identifying the role of RNA polymerase II and other proteins in transcribing DNA. He says that the process is something akin to a machine: "It has moving parts which function in appropriate sequence and in synchrony with one another". The Kornbergs were the tenth family with closely related Nobel laureates. The 2009 Nobel Prize in Chemistry was awarded to Venki Ramakrishnan, Tom Steitz, and Ada Yonath for crystallographic studies of the ribosome. The atomic resolution structures of the ribosomal subunits provide an extraordinary context for understanding one of the most fundamental aspects of cellular function: protein synthesis. Research on protein synthesis began with studies of microsomes, and three papers on the atomic resolution structures of the 50S and 30S ribosomal subunits were published in 2000. Perhaps the most remarkable and inexplicable feature of ribosome structure is that two-thirds of the mass is composed of large RNA molecules, the 5S, 16S, and 23S ribosomal RNAs, and the remaining third is distributed among ~50 relatively small and innocuous proteins. The first step on the road to solving the ribosome structure was determining the primary structure of the 16S and 23S RNAs in Harry Noller's laboratory. The sequences were rapidly followed by secondary structure models for the folding of the two ribosomal RNAs, in collaboration with Carl Woese, bringing the ribosome structure into two dimensions. The RNA secondary structures are characterized by an elaborate series of helices and loops of unknown structure, but other than the insights offered by the structure of transfer RNA (tRNA), there was no way to think about folding these structures into three dimensions. The first three-dimensional images of the ribosome emerged from Jim Lake's reconstructions from electron microscopy (EM) (Lake, 1976).
Ada Yonath reported the first crystals of the 50S ribosomal subunit in 1980, a crucial step that would require almost 20 years to bring to fruition (Yonath et al., 1980). Yonath's group introduced the innovative use of ribosomes from extremophilic organisms. Peter Moore and Don Engelman applied neutron scattering techniques to determine the relative positions of ribosomal proteins in the 30S ribosomal subunit at the same time. Elegant chemical footprinting studies from the Noller laboratory provided a basis for intertwining the RNA among the ribosomal proteins, but there was still insufficient information to produce a high resolution structure; Venki Ramakrishnan, in Peter Moore's laboratory, did it with deuterated ribosome reconstitutions. Then the Yale group was ramping up its work on the H. marismortui crystals of the 50S subunit. Peter Moore had recruited long-time colleague Tom Steitz to work on this problem, and Steitz was about to complete the final event in the pentathlon of Crick's dogma, having solved critical structures of DNA polymerases, the glutaminyl-tRNA synthetase-tRNA complex, HIV reverse transcriptase, and T7 RNA polymerase. In 1999 Steitz, Ramakrishnan, and Yonath all presented electron density maps of subunits at approximately 5 Å resolution, and the Noller group presented 10 Å electron density maps of the Thermus 70S ribosome. Peter Moore aptly paraphrased Churchill, telling attendees that this was not the end, but the end of the beginning. Almost every nucleotide in the RNA is involved in multiple stabilizing interactions that form the monolithic tertiary structure at the heart of the ribosome. Williamson J. The ribosome at atomic resolution. Cell 2009; 139:1041-1043. http://dx.doi.org/10.1016/j.cell.2009.11.028 http://www.sciencedirect.com/science/article/pii/S0092867409014536
This opened the door to new therapies. For example, in 2010 it was reported that numerous human genes display dual coding within alternatively spliced regions, which give rise to distinct protein products that include segments translated in more than one reading frame. To resolve the ensuing protein structural puzzle, the authors identified human genes with alternative splice variants comprising a dual coding region at least 75 nucleotides in length and analyzed the structural status of the protein segments they encode. Inspection of their amino acid composition and predictions by the IUPred and PONDR® VSL2 algorithms suggest a high propensity for structural disorder in dual-coding regions. Kovacs E, Tompa P, Liliom K, and Kalmar L. Dual coding in alternative reading frames correlates with intrinsic protein disorder. PNAS 2010. http://www.jstor.org/stable/25664997 http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2851785 http://www.pnas.org/content/107/12/5429.full.pdf
In 2012, it was shown that drug-bound ribosomes can synthesize a distinct subset of cellular polypeptides. The structure of a protein defines its ability to thread through the antibiotic-obstructed tunnel. Synthesis of certain polypeptides that initially bypass translational arrest can be stopped at later stages of elongation while translation of some proteins goes to completion. (Kannan K, Vasquez-Laslop N, and Mankin AS. Selective Protein Synthesis by Ribosomes with a Drug-Obstructed Exit Tunnel. Cell 2012; 151; 508-520.) http://dx.doi.org/10.1016/j.cell.2012.09.018 http://www.sciencedirect.com/science/article/pii/S0092867412011257
Mobility of genetic elements
Barbara McClintock received the Nobel Prize for Medicine for the discovery of the mobility of genetic elements, work that had been done in that period. When transposons were demonstrated in bacteria, yeast and other organisms, Barbara rose to a stratospheric level in the general esteem of the scientific world, but she was uncomfortable about the honors. It was sufficient to have her work understood and acknowledged. Prof. Howard Green said of her, "There are scientists whose discoveries greatly transcend their personalities and their humanity. But those in the future who will know of Barbara only her discoveries will know only her shadow". "In Memoriam – Barbara McClintock". Nobelprize.org. 5 Feb 2013 http://www.nobelprize.org/nobel_prizes/medicine/laureates/1983/mcclintock-article.html/
She introduced her Nobel Lecture in 1983 with the following observation: “An experiment conducted in the mid-nineteen forties prepared me to expect unusual responses of a genome to challenges for which the genome is unprepared to meet in an orderly, programmed manner. In most known instances of this kind, the types of response were not predictable in advance of initial observations of them. It was necessary to subject the genome repeatedly to the same challenge in order to observe and appreciate the nature of the changes it induces…a highly programmed sequence of events within the cell that serves to cushion the effects of the shock. Some sensing mechanism must be present in these instances to alert the cell to imminent danger, and to set in motion the orderly sequence of events that will mitigate this danger”. She goes on to consider “early studies that revealed programmed responses to threats that are initiated within the genome itself, as well as others similarly initiated, that lead to new and irreversible genomic modifications. These latter responses, now known to occur in many organisms, are significant for appreciating how a genome may reorganize itself when faced with a difficulty for which it is unprepared”.
An experiment with Zea mays conducted in the summer of 1944 alerted her to the mobility of specific components of genomes; it involved the entrance of a newly ruptured end of a chromosome into a telophase nucleus. This experiment commenced with the growing of approximately 450 plants in the summer of 1944, each of which had started its development with a zygote that had received from each parent a chromosome with a newly ruptured end of one of its arms. The design of the experiment required that each plant be self-pollinated in order to isolate, from the self-pollinated progeny, the new mutants that were expected to appear and to confine them to locations within the ruptured arm of a chromosome. Each mutant was expected to reveal the phenotype produced by a minute homozygous deficiency. Their modes of origin could be projected from the known behavior of broken ends of chromosomes in successive mitoses. Forty kernels from each self-pollinated ear were sown in a seedling bench in the greenhouse during the winter of 1944-45.
Some seedling mutants of the type expected did appear, but they were overshadowed by segregants exhibiting bizarre phenotypes. These were variegated for type and degree of expression of a gene. Those variegated expressions given by genes associated with chlorophyll development were startlingly conspicuous. Within any one progeny the chlorophyll intensities, and their pattern of distribution in the seedling leaves, were alike. Between progenies, however, both the type and the pattern differed widely.
The effect of X-rays on chromosomes
Initial studies of broken ends of chromosomes began in the summer of 1931, when a means of studying the beads-on-a-string hypothesis was provided by newly developed methods of examining the ten chromosomes of the maize complement in microsporocytes at meiosis. The ten bivalent chromosomes are elongated in comparison to their metaphase lengths. Each chromosome
is identifiable by its relative length,
by the location of its centromere, which is readily observed at the pachytene stage, and
by the individuality of the chromomeres strung along the length of each chromosome.
At that time maize provided the best material for locating known genes along a chromosome arm, and also for precisely determining the break points in chromosomes that had undergone various types of rearrangement, such as translocations, inversions, etc. The recessive phenotypes in the examined plants arose from loss of a segment of a chromosome that carried the wild-type allele, and X-rays were responsible for inducing these deficiencies. A conclusion of basic significance could be drawn from these observations:
broken ends of chromosomes will fuse, 2-by-2, and
any broken end with any other broken end.
This principle has been amply proved in a series of experiments conducted over the years. In all such instances the break must sever both strands of the DNA double helix. This is a “double-strand break” in modern terminology. That two such broken ends entering a telophase nucleus will find each other and fuse, regardless of the initial distance that separates them, soon became apparent.
During the summer of 1931 she had seen plants in the maize field that showed variegation patterns resembling the one described for Nicotiana. Dr. McClintock was interested in selecting the variegated plants to determine the presence of a ring chromosome in each, and in the summer of 1932, with Dr. Stadler's generous cooperation from Missouri, she had the opportunity to examine such plants. Each plant had a ring chromosome, but it was the behavior of this ring that proved to be significant. It revealed several basic phenomena. The following was noted:
In the majority of mitoses
replication of the ring chromosome produced two chromatids that were completely free from each other and
could separate without difficulty in the following anaphase.
sister strand exchanges do occur between replicated or replicating chromatids
the frequency of such events increases with increase in the size of the ring.
these exchanges produce a double-size ring with two centromeres.
Mechanical rupture occurs in each of the two chromatid bridges formed at anaphase by passage of the two centromeres on the double-size ring to opposite poles of the mitotic spindle.
The location of a break can be at any one position along any one bridge.
The broken ends entering a telophase nucleus then fuse.
The size and content of each newly constructed ring depend on the position of the rupture that had occurred in each bridge.
The conclusion was that cells sense the presence in their nuclei of ruptured ends of chromosomes
then activate a mechanism that will bring together and then unite these ends
this will occur regardless of the initial distance in a telophase nucleus that separated the ruptured ends.
The ability of a cell to
sense these broken ends,
to direct them toward each other, and
then to unite them so that the union of the two DNA strands is correctly oriented,
is a particularly revealing example of the sensitivity of cells to all that is going on within them.
Evidence from these experiments gave unequivocal support for the conclusion that broken ends will find each other and fuse. The challenge is met by a programmed response. This may be necessary, as
both accidental breaks and
programmed breaks may be frequent.
If not repaired, such breaks could lead to genomic deficiencies having serious consequences.
A cell capable of repairing a ruptured end of a chromosome must sense the presence of this end in its nucleus. This sensing
activates a mechanism that is required for replacing the ruptured end with a functional telomere.
that such a mechanism must exist was revealed by a mutant that arose in the stocks.
this mutant would not allow the repair mechanism to operate in the cells of the plant.
Entrance of a newly ruptured end of a chromosome into the zygote is followed by the chromatid type of breakage-fusion-bridge cycle throughout mitoses in the developing plant. This suggested that the repair mechanism in the maize strains is repressed in cells producing
the male and female gametophytes and
also in the endosperm,
but is activated in the embryo.
The extent of trauma perceived by cells
whose nuclei receive a single newly ruptured end of a chromosome that the cell cannot repair,
and the speed with which this trauma is registered, was not appreciated until the winter of 1944-45.
By 1947 it was learned that the bizarre variegated phenotypes that segregated in many of the self-pollinated progenies grown on the seedling bench in the fall and winter of 1944-45, were due to the action of transposable elements. It seemed clear that
these elements must have been present in the genome,
and in a silent state previous to an event that activated one or another of them.
She concluded that some traumatic event was responsible for these activations. The unique event in the history of these plants relates to their origin. Both parents of the plants grown in 1944 had contributed a chromosome with a newly ruptured end to the zygote that gave rise to each of these plants. Detection of silent elements is now made possible with the aid of DNA cloning method. Silent AC (Activator) elements, as well as modified derivatives of them, have already been detected in several strains of maize. When other transposable elements are cloned it will be possible to compare their structural and numerical differences among various strains of maize. In any one strain of maize the number of silent but potentially transposable elements, as well as other repetitious DNAs, may be observed to change, and most probably in response to challenges not yet recognized. Telomeres are especially adapted to replicate free ends of chromosomes. When no telomere is present, attempts to replicate this uncapped end may be responsible for the apparent “fusions” of the replicated chromatids at the position of the previous break as well as for perpetuating the chromatid type of breakage-fusion-bridge cycle in successive mitoses. In conclusion, a genome may react to conditions for which it is unprepared, but to which it responds in a totally unexpected manner. Among these is
the extraordinary response of the maize genome to entrance of a single ruptured end of a chromosome into a telophase nucleus.
It was this event that was responsible for activations of potentially transposable elements that are carried in a silent state in the maize genome.
The mobility of these activated elements allows them to enter different gene loci and to take over control of action of the gene wherever one may enter.
Because the broken end of a chromosome entering a telophase nucleus can initiate activations of a number of different potentially transposable elements,
the modifications these elements induce in the genome may be explored readily.
In addition to
modifying gene action, these elements can
restructure the genome at various levels,
from small changes involving a few nucleotides,
to gross modifications involving large segments of chromosomes, such as
duplications,
deficiencies,
inversions,
and other reorganizations.
In the future attention undoubtedly will be centered on the genome, and with greater appreciation of its significance as a highly sensitive organ of the cell,
monitoring genomic activities and correcting common errors,
sensing the unusual and unexpected events,
and responding to them,
often by restructuring the genome.
We know about the elements available for such restructuring. We know nothing, however, about
how the cell senses danger and instigates responses to it that often are truly remarkable.
Source: 1983 Nobel Lecture. Barbara McClintock. THE SIGNIFICANCE OF RESPONSES OF THE GENOME TO CHALLENGE.
In 2009 the Nobel Prize in Physiology or Medicine was awarded to Elizabeth Blackburn, Carol Greider and Jack Szostak for the discovery of telomerase. This recognition came less than a decade after the completion of the Human Genome Project previously discussed. Prof. Blackburn acknowledges a strong influence coming from the work of Barbara McClintock. The discovery is tied to the pond organism Tetrahymena thermophila, and to studies of yeast cells. Blackburn was drawn to science after reading, as a child, the biography of Marie Curie written by her daughter Ève. She recalls that her Master's mentor, while she studied the metabolism of glutamine in the rat liver, thought that every experiment should have the beauty and simplicity of a Mozart sonata. She did her PhD at the distinguished Laboratory of Molecular Biology at Cambridge, the epicenter of molecular biology, sequencing regions of bacteriophage phiX 174, a single-stranded DNA bacteriophage. Using Fred Sanger's methods to piece together RNA sequences, she showed the first sequence of a 48-nucleotide fragment to her mathematically gifted Cambridge cousin, who pointed out repeats of DNA sequence patterns! She worked on the sequencing of the DNA at the terminal regions of the short "minichromosomes" of the ciliated protozoan Tetrahymena thermophila at Yale in 1975. She continued the research begun at Yale at UCSF, funded by the NIH, based on an intriguing autoradiogram showing telomeric DNA in Tetrahymena. I describe the work as follows:
Prof. Blackburn incorporated 32P isotope labelled deoxynucleoside residues into the rDNA molecules for DNA repair enzymatic reactions and found that
the end regions were selectively labeled by combinations of 32P-radiolabeled nucleoside triphosphates, and by mid-year she had an autoradiogram of the depurination products.
The autoradiogram showed sequences of 4 cytosine residues flanked by either an adenosine or a guanosine residue.
In 1976 she had deduced a sequence consisting of a tandem array of CCCCAA repeats, and subsequently separated the products on denaturing gel electrophoresis, where they appeared as tiger stripes extending up the gel.
The size of each band was 6 bases more than the band below it (a toy repeat-counting sketch follows this list).
Telomere must have a telomerase!
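The toy sketch promised above simply counts tandem C4A2 (CCCCAA) repeats at the end of an invented sequence, echoing the 6-base ladder on the gel; the sequence and function are assumptions for illustration only.

```python
# Hedged sketch: counting tandem CCCCAA telomeric repeats on a made-up sequence.

def count_terminal_repeats(seq, unit="CCCCAA"):
    """Count how many copies of `unit` tile the 3' end of `seq`."""
    count = 0
    while seq.endswith(unit):
        seq = seq[: -len(unit)]
        count += 1
    return count

chromosome_end = "GATTACAGGT" + "CCCCAA" * 5
n = count_terminal_repeats(chromosome_end)
print(f"{n} tandem CCCCAA repeats ({n * 6} bases)")  # each repeat adds 6 bases
```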
The discovery of the telomerase enzyme activity was made by the Prize co-awardee, Carol Greider. They were trying to decipher the structure right at the termini of telomeres of both ciliated protozoans and yeast plasmids. The view that in mammalian telomeres there is a long protruding G-rich strand does not take into account the clear evidence for the short C-strand repeat oligonucleotides that she discovered. This was found for both the Tetrahymena rDNA minichromosome molecules and linear plasmids purified from yeast. In contrast to nucleosomal regions of chromosomes, special regions of DNA, for example
promoters that must bind transcription initiation factors that control transcription, have proteins other than the histones on them.
The telomeric repeat tract turned out to be such a non-nucleosomal region.
They found that by clipping up chromatin using an enzyme that cuts the linker between neighboring nucleosomes,
it cut up the bulk of the DNA into nucleosome-sized pieces
but left the telomeric DNA tract as a single protected chunk.
The resulting complex of the telomeric DNA tract plus its bound cargo of protective proteins behaved very differently from nucleosomal chromatin, and they concluded that it had no histones or nucleosomes.
Any evidence for a protein on the bulk of the rDNA molecule ends, such as their behavior in gel electrophoresis and the appearance of the rDNA molecules under the electron microscope, was conspicuously lacking. This was reassuring that there was no covalently attached protein at the very ends of this minichromosome. Despite considerable work, she was unable to determine what protein(s) would co-purify with the telomeric repeat tract DNA of Tetrahymena. It was yeast genetics and approaches taken by others that turned out to provide the next great leaps forward in understanding telomeric proteins. Carol Greider, her colleague, noticed the need to scale up the telomerase activity preparations, and they used a very large glass column for preparative gel filtration chromatography.
Jack W Szostak at the Howard Hughes Medical Institute at Harvard shared in the 2009 Nobel Prize. He became interested in molecular biology after taking a course on the frontiers of molecular biology and reading about the Meselson-Stahl experiments, done barely a decade earlier, and he learned how the genetic code had been unraveled. The fact that one could deduce, from measurements of the radioactivity in fractions from a centrifuge tube, the molecular details of DNA replication, transcription and translation was astonishing. A highlight of his time at McGill was the open-book, open-discussion final exam in this class, in which the questions required the intense collaboration of groups of students.
At Cornell, Ithaca, he collaborated with John Stiles and they came up with a specific idea to chemically synthesize a DNA oligonucleotide of sufficient length that it would hybridize to a single sequence within the yeast genome, and then to use it as an mRNA and gene specific probe. At the time, there was only one short segment of the yeast genome for which the DNA sequence was known,
the region coding for the N-terminus of the iso-1 cytochrome c protein,
intensively studied by Fred Sherman. The Sherman lab, in a tour de force of genetics and protein chemistry, had isolated
double-frameshift mutants in which the N-terminal region of the protein was translated from out-of-frame codons.
Protein sequencing of the wild type and frame-shifted mutants allowed them to deduce 44 nucleotides of DNA sequence.
If they could prepare a synthetic oligonucleotide that was complementary to the coding sequence, they could use it to detect the cytochrome-c mRNA and gene. At the time, essentially all experiments on mRNA were done on total cellular mRNA. Ray Wu was already well known for determining the sequence of the sticky ends of phage lambda, the first DNA ever to be sequenced, and his lab was deeply involved in the study of enzymes that could be used to manipulate and sequence DNA more effectively, but would not take on a project from another laboratory. So John went to nearby Rochester to do postdoctoral work with Sherman, and Szostak was able to transfer to Ray Wu's laboratory. In order to carry out his work, Ray Wu sent him to Saran Narang's lab in Ottawa, where he received training under Keichi Itakura, who synthesized the insulin gene. A few months later, he received several milligrams of the long-sought 15-mer. In collaboration with John Stiles and Fred Sherman, who sent them RNA and DNA samples from appropriate yeast strains, they were able to use the labeled 15-mer as a probe to detect the cyc1 mRNA, and later the gene itself. He notes that one of the delights of the world of science is that it is filled with people of good will who are more than happy to assist a student or colleague by teaching a technique or discussing a problem. He remained in Ray's lab after completion of the PhD, upon the arrival of Rodney Rothstein from Sherman's lab in Rochester, who introduced him to yeast genetics, and he was prepared for the next decade of work on yeast,
first in recombination studies, and
later in telomere studies and other aspects of yeast biology.
His studies of recombination in yeast were enabled by the discovery, in Gerry Fink’s lab at Cornell, of a way to introduce foreign DNA into yeast. These pioneering studies of yeast transformation showed that circular plasmid DNA molecules could on occasion become integrated into yeast chromosomal DNA by homologous recombination.
His studies of unequal sister chromatid exchange in rDNA locus resulted in his first publication in the field of recombination.
The idea that you could increase transformation frequency by cutting the input DNA was pleasingly counterintuitive and led them to continue exploring this phenomenon. He gained an appointment to the Sidney Farber Cancer Institute due to the interest of Prof. Ruth Sager, who gathered together a great group of young investigators. In work spearheaded by his first graduate student, Terry Orr-Weaver, on
double-strand breaks in DNA
and their repair by recombination (and continuing interaction with Rod Rothstein),
they were attracted to what kinds of reactions occur at the DNA ends.
It was at a Gordon Conference that he was excited hearing a talk by Elizabeth Blackburn on her work on telomeres in Tetrahymena.
This led to a collaboration testing the ability of Tetrahymena telomers to function in yeast.
He performed the experiments himself, and experienced the thrill of being the first to know that their wild idea had worked.
It was clear from that point on that a door had been opened and that they were going to be able to learn a lot about telomere function from studies in yeast.
Within a short time he was able to clone bona fide yeast telomeres, and (in a continuation of the collaboration with Liz Blackburn’s lab)
they obtained the critical sequence information that led them to propose the existence of the key enzyme, telomerase.
A fanciful depiction evoking both telomere dynamics and telomere researchers, done by the artist Julie Newdoll in 2008, elicits the idea of a telomere as an ancient Sumerian temple-like hive, tended by a swarm of ancient Sumerian bee-goddesses against a background of clay tablets inscribed with DNA sequencing gel-like bands. Dr. Blackburn recalls owing much to Barbara McClintock not only for her scientific findings but also for advice McClintock gave her in a conversation in 1977, when
she had unexpected findings with the rDNA end sequences:
Dr. McClintock urged her to trust her intuition about the scientific research results.
In this Part I of a series of 3, I have described the
emergence of Molecular Biology and
closely allied work on the mechanism of Cell Replication and
the dependence of metabolic processes on proteins and enzymatic conversions through a surge of
post-WWII research that gave birth to centers for basic science research in biology and medicine in both the US and England, preceded by work in prewar Germany. This is to be followed by further developments related to the Human Genome Project.