Archive for the ‘Chemical Genetics’ Category

Tumor Progression

Larry H. Bernstein, MD, FCAP, Curator

LPBI

 

GEN News Highlights Nov 10, 2015   Darwinian Selection Does Not Influence Tumor Progression
http://www.genengnews.com/gen-news-highlights/darwinian-selection-does-not-influence-tumor-progression/81251958/

http://www.genengnews.com/Media/images/GENHighlight/85273_large1068523712.jpg

 

New answers may have just emerged in a long-standing debate in the field of oncology and molecular evolution. The neutral theory of molecular evolution states that changes occurring at the molecular level are not caused by natural selection, but rather by the random genetic drift of mutant alleles. In contrast, Darwinian selection adheres to the idea that a molecular mutation holds some selective advantage over the wild-type, allowing it to thrive.

When viewing these two theories through the lens of carcinogenesis, it is not difficult to envision the applicability of either theory. Now, however, new evidence from scientists at the University of Chicago and the Beijing Institute of Genomics may tip the scales in favor of neutral theory. This collaborative effort assembled data from one of the most rigorous genetic sequencing analyses ever carried out on a single tumor, revealing a much greater level of genetic diversity than expected.

The investigators excised a hepatocellular carcinoma, a liver tumor roughly 3.5 centimeters in diameter (slightly smaller than a ping-pong ball). The research team estimated that the tumor contained more than 100 million distinct mutations within genetic coding regions, which is thousands of times more than they anticipated. The implication is that even microscopic tumors are likely to contain extremely high genetic diversity; with so much variation, many of their cells are likely able to resist standard post-surgical cancer treatments such as chemotherapy and radiation.

“With 100 million mutations, each capable of altering a protein in some way, there is a high probability that a significant minority of tumor cells will survive, even after aggressive treatment,” explained study director Chung-I Wu, Ph.D., professor of ecology and evolution at the University of Chicago. “In a setting with so much diversity, those cells could multiply to form new tumors, which would be resistant to standard treatments.”

The findings from this study were published recently in PNAS through an article entitled “Extremely high genetic diversity in a single tumor points to prevalence of non-Darwinian cell evolution.”

 

Extremely high genetic diversity in a single tumor points to prevalence of non-Darwinian cell evolution

Shaoping Ling, Zheng Hu, Zuyu Yang, Fang Yang, Yawei Li, Pei Lin, Ke Chen, Lili Dong, Lihua Cao, Yong Tao, et al.
PNAS Nov 11, 2015,              http://dx.doi.org:/10.1073/pnas.1519556112

Significance

A tumor comprising many cells can be compared to a natural population with many individuals. The amount of genetic diversity reflects how it has evolved and can influence its future evolution. We evaluated a single tumor by sequencing or genotyping nearly 300 regions from the tumor. When the data were analyzed by modern population genetic theory, we estimated more than 100 million coding region mutations in this unexceptional tumor. The extreme genetic diversity implies evolution under the non-Darwinian mode. In contrast, under the prevailing view of Darwinian selection, the genetic diversity would be orders of magnitude lower. Because genetic diversity accrues rapidly, a high probability of drug resistance should be heeded, even in the treatment of microscopic tumors.

 

The prevailing view that the evolution of cells in a tumor is driven by Darwinian selection has never been rigorously tested. Because selection greatly affects the level of intratumor genetic diversity, it is important to assess whether intratumor evolution follows the Darwinian or the non-Darwinian mode of evolution. To provide the statistical power, many regions in a single tumor need to be sampled and analyzed much more extensively than has been attempted in previous intratumor studies. Here, from a hepatocellular carcinoma (HCC) tumor, we evaluated multiregional samples from the tumor, using either whole-exome sequencing (WES) (n = 23 samples) or genotyping (n = 286) under both the infinite-site and infinite-allele models of population genetics. In addition to the many single-nucleotide variations (SNVs) present in all samples, there were 35 “polymorphic” SNVs among samples. High genetic diversity was evident as the 23 WES samples defined 20 unique cell clones. With all 286 samples genotyped, clonal diversity agreed well with the non-Darwinian model with no evidence of positive Darwinian selection. Under the non-Darwinian model, M_ALL (the number of coding region mutations in the entire tumor) was estimated to be greater than 100 million in this tumor. DNA sequences reveal local diversities in small patches of cells and validate the estimation. In contrast, the genetic diversity under a Darwinian model would generally be orders of magnitude smaller. Because the level of genetic diversity will have implications on therapeutic resistance, non-Darwinian evolution should be heeded in cancer treatments even for microscopic tumors.

Keywords: intratumor heterogeneity; genetic diversity; neutral evolution; cancer evolution; natural selection

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1519556112/-/DCSupplemental.

 

Scientists at the Beijing Institute of Genomics sampled nearly 300 regions from one slice of the hepatocellular tumor and sequenced or genotyped each one, searching for genetic changes. Once they analyzed their data and applied modern population genetic theory, their results led them to the estimate of 100 million coding-region mutations for the whole tumor.

This extensive level of heterogeneity within a single tumor, which is far beyond what a Darwinian process would permit, makes the selectionism vs. neutralism debate of the 1980s “suddenly medically relevant,” Dr. Wu remarked. Because no one had genetically dissected a tumor as thoroughly before the current study, the commonly held view was that tumors carried from a few hundred up to 20,000 genetic alterations that were not present in the patient’s healthy cells.
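As a rough check on the scale involved, the figures quoted in this post can be compared directly. The short Python sketch below only divides the study's estimate (100 million coding-region mutations) by the earlier assumed range (a few hundred to 20,000 alterations). The value 300 used for "a few hundred" is an illustrative assumption, not a number from the study, and the function name is hypothetical.

```python
# Figures quoted in the article; LOW_PRIOR is an assumed stand-in for "a few hundred".
ESTIMATED_MUTATIONS = 100_000_000  # study's lower-bound estimate for the whole tumor
HIGH_PRIOR = 20_000                # upper end of the previously assumed range
LOW_PRIOR = 300                    # assumption for illustration only


def fold_increase(estimate: int, prior: int) -> float:
    """Return how many times larger the estimate is than the prior figure."""
    return estimate / prior


fold_vs_high = fold_increase(ESTIMATED_MUTATIONS, HIGH_PRIOR)
fold_vs_low = fold_increase(ESTIMATED_MUTATIONS, LOW_PRIOR)

print(f"vs. 20,000 alterations: {fold_vs_high:,.0f}-fold")
print(f"vs. ~300 alterations:   {fold_vs_low:,.0f}-fold")
```

Even against the upper end of the earlier range, the estimate is about 5,000 times larger, which is in line with the article's statement that the count was thousands of times more than anticipated.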

“Our study is the non-Darwinian process writ small, down to the cellular level,” Dr. Wu noted. “In the Darwinian struggle, there are—from the tumor’s point of view—few beneficial mutations, meaning changes that give tumor cells a growth advantage. When there are no such limits on genetic variation, however, mutations can emerge and apparently thrive.”

“This could potentially change how we think about tumor growth and spread, but the direct clinical implications of this study may not be obvious on the surface,” added co-author Daniel Catenacci, M.D., assistant professor and medical oncologist at the University of Chicago.

While the bulk of the mutations were present at very low frequencies, drug intervention could give cells carrying some of them a path to expand.

“The presence of so many random mutations could present a problem to specifically targeted therapies,” Dr. Catenacci stated. “It almost guarantees that some cells will be resistant. But it also suggests that aggressive treatment could push tumor cells into a more Darwinian mode.”

Since the current study focused on only a single tumor type, it remains to be seen how comparable these data will be for other types of cancerous tumors. However, regardless of its narrow focus, the results of this analysis raise important questions about tumor evolution and heterogeneity.

 

 


Obesity Issues

Larry H. Bernstein, MD, FCAP, Curator

LPBI

 

The Changing Face of Obesity

Science tells us obesity is a chronic disease. Why does the outmoded and injurious notion that it is a problem of willpower persist?

By Joseph Proietto | November 1, 2015   http://www.the-scientist.com//?articles.view/articleNo/44288/title/The-Changing-Face-of-Obesity/

In Dante Alighieri’s Divine Comedy the narrator meets a man named Ciacco who had been sent to Hell for the “Damning sin of Gluttony.” According to Catholic theology, in order to end up in Hell one must willfully commit a serious sin. So Dante believed that fat people chose to be fat. This antiquated view of the cause of obesity is still widespread, even among medical professionals. The consequences of this misconception are significant, because it forms the basis for the discrimination suffered by the obese; for the wasting of scarce resources in attempts to change lifestyle habits by public education; and for the limited availability of subsidized obesity treatments.

http://www.the-scientist.com/November2015/critic1.jpg

While obesity is often labeled a lifestyle disease, poor lifestyle choices alone account for only a 6 to 8 kg weight gain. The body has a powerful negative feedback system to prevent excessive weight gain. The strongest inhibitor of hunger, the hormone leptin, is made by fat cells. A period of increased energy intake will result in fat deposition, which will increase leptin production. Leptin suppresses hunger and increases energy expenditure. This slows down weight gain. To become obese, it may be necessary to harbor a genetic difference that makes the individual resistant to the action of leptin.

Evidence from twin and adoption studies suggests that obesity has a genetic basis, and over the past two decades a number of genes associated with obesity have been described. The most common genetic defect leading to severe obesity in European populations involves mutations in the gene coding for the melanocortin 4 receptor (MC4R). Still, this defect can explain severe obesity in only approximately 6 percent to 7 percent of cases (J Clin Invest, 106:271-79, 2000). Other genes have been discovered that can cause milder increases in weight; for example, variants of just one gene (FTO) can explain up to 3 kg of weight variation between individuals (Science, 316:889-94, 2007).

Genes do not directly cause weight gain. Rather, genes influence the desire for food and the feeling of satiety. In an environment with either poor access to food or access to only low-calorie food, obesity may not develop even in persons with a genetic predisposition. When there is an abundance of food and a sedentary lifestyle, however, an obesity-prone person will experience greater hunger and reduced satiety, increasing caloric intake and weight gain.

Since the 1980s, there has been a rapid rise in the prevalence of obesity worldwide, a trend that likely results from a variety of complex causes. There is increasing evidence, for example, that the development of obesity on individual or familial levels may be influenced by environmental experiences that occur in early life. If a mother is malnourished during early pregnancy, for instance, the result is epigenetic changes to genes involved in the set points for hunger and satiety in the developing child. These changes may then become fixed, resulting in a tendency towards obesity in the offspring.

The biological basis of obesity is further highlighted by the vigorous defense of weight following weight loss. There are at least 10 circulating hormones that modulate hunger. Of these, only one has been confirmed as a hunger-inducing hormone (ghrelin), and it is made and released by the stomach. In contrast, nine hormones suppress hunger, including CCK, PYY, GLP-1, oxyntomodulin, and uroguanylin from the small bowel; leptin from fat cells; and insulin, amylin, and pancreatic polypeptide from the pancreas.

 

After weight loss, regardless of the diet employed, there are changes in circulating hormones involved in the regulation of body weight. Ghrelin levels tend to increase and levels of multiple appetite-suppressing hormones decrease. There is also a subjective increase in appetite. Researchers have shown that even after three years, these hormonal changes persist (NEJM, 365:1597-604, 2011; Lancet Diabetes and Endocrinology, 2:954-62, 2014). This explains why there is a high rate of weight regain after diet-induced weight loss.

Given that the physiological responses to weight loss predispose people to regain that weight, obesity must be considered a chronic disease. Data show that those who successfully maintain their weight after weight loss do so by remaining vigilant and constantly applying techniques to oppose weight regain. These techniques may involve strict diet and exercise practices and/or pharmacotherapy.

It is imperative for society to move away from a view that obesity is simply a lifestyle issue and to accept that it is a chronic disease. Such a change would not only relieve the stigma of obesity but would also empower politicians, scientists and clinicians to tackle the problem more effectively.

Joseph Proietto was the inaugural Sir Edward Dunlop Medical Research Foundation Professor of Medicine in the Department of Medicine, Austin Health at the University of Melbourne in Australia. He is a researcher and clinician investigating and treating obesity and type 2 diabetes.

 

 

A Weighty Anomaly

Why do some obese people actually experience health benefits?

By Jyoti Madhusoodanan | November 1, 2015     http://www.the-scientist.com//?articles.view/articleNo/44304/title/A-Weighty-Anomaly/

http://www.the-scientist.com/November2015/notebook4.jpg

THE ENDOCRINE THEORY: Some researchers have posited that fat cells may secrete molecules that affect glucose homeostasis in muscle or liver tissue. COURTESY OF MITCHELL LAZAR

In the early 19th century, Belgian mathematician Adolphe Quetelet was obsessed with a shape: the bell curve. While helping with a population census, Quetelet proposed that the spread of human traits such as height and weight followed this trend, also known as a Gaussian or normal distribution. On a quest to define a “normal man,” he showed that human height and weight data fell along his beloved bell curves, and in 1823 devised the “Quetelet Index”—more familiar to us today as the BMI, or body mass index, a ratio of weight to height.

Nearly two centuries later, clinicians, researchers, and fitness instructors continue to rely on this metric to pigeonhole people into categories: underweight, healthy, overweight, or obese. But Quetelet never intended the metric to serve as a way to define obesity. And now, a growing body of evidence suggests these categories fail to accurately reflect the health risks—or benefits—of being overweight.

Although there is considerable debate surrounding the prevalence of metabolically healthy obesity, when obesity is defined in terms of BMI (a BMI of 30 or higher), estimates suggest that about 10 percent of adults in the U.S. are obese yet metabolically healthy, while as many as 80 percent of those with a normal BMI may be metabolically unhealthy, with signs of insulin resistance and poor circulating lipid levels, even if they suffer no obvious ill effects. “If all we know about a person is that they have a certain body weight at a certain height, that’s not enough information to know their health risks from obesity,” says health-science researcher Paul McAuley of Winston-Salem State University. “We need better indicators of metabolic health.”
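For readers who want the arithmetic behind the metric, BMI is conventionally computed as weight in kilograms divided by the square of height in meters, so it is not a simple weight-to-height ratio. The Python sketch below is a minimal illustration: the obesity cutoff of 30 comes from the article, the 25.0 to 29.9 overweight band matches the range given in the "Obesity paradoxes" abstract quoted later in this post, and the 18.5 underweight cutoff is the standard WHO value, which is an assumption relative to the article. The function names are hypothetical.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Map a BMI value to one of the four categories named in the article."""
    if value < 18.5:   # assumed WHO cutoff, not stated in the article
        return "underweight"
    if value < 25.0:
        return "healthy"
    if value < 30.0:   # 25.0-29.9 band, as in the abstract quoted later
        return "overweight"
    return "obese"     # BMI of 30 or higher, as stated in the article


example = bmi(95.0, 1.75)
print(round(example, 1), bmi_category(example))
```

As the article stresses, a category assigned this way says nothing about insulin resistance, lipid levels, or fitness, which is why two people with the same BMI can have very different metabolic health.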

The dangers of being overweight, such as a higher risk of heart disease, type 2 diabetes, and other complications, are well known. But some obese individuals—dubbed the “fat fit”—appear to fare better on many measures of health when they’re heavier. Studies have found lower mortality rates, better response to hemodialysis in chronic kidney disease, and lower incidence of dementia in such people. Mortality, it’s been found, correlates with obesity in a U-shaped curve (J Sports Sci, 29:773-82, 2011). So does extra heft help or hurt?

To answer that question, researchers are trying to elucidate the metabolic reasons for this obesity paradox.

In a recent study, Harvard University epidemiologist Goodarz Danaei and his colleagues analyzed data from nine studies involving a total of more than 58,000 participants to tease apart how obesity and other well-known metabolic risk factors influence the risk of coronary heart disease. Controlling these other risk factors, such as hypertension or high cholesterol, with medication is simpler than curbing obesity itself, Danaei explains. “If you control a person’s obesity you get rid of some health risks, but if you control hypertension or diabetes, that also reduces health risks, and you can do the latter much more easily right now.”

Danaei’s team assessed BMI and metabolic markers such as systolic blood pressure, total serum cholesterol, and fasting blood glucose. The three metabolic markers only explained half of the increased risk of heart disease across all study participants. In obese individuals, the other half appeared to be mediated by fat itself, perhaps via inflammatory markers or other indirect mechanisms (Epidemiology, 26:153-62, 2015). While Danaei’s study was aimed at understanding how obesity hurts health, the results also uncovered unknown mechanisms by which excess adipose tissue might exert its effects. This particular study revealed obesity’s negative effects, but might these unknown mechanisms hold clues that explain the obesity paradox?

Other researchers have suggested additional possibilities—for example, that inflammatory markers such as TNF-α help combat conditions such as chronic kidney disease, or that obesity makes a body more capable of making changes to, and tolerating changes in, blood flow depending on systemic needs (Am J Clin Nutr, 81:543-54, 2005).

According to endocrinologist Mitchell Lazar at the University of Pennsylvania, the key to explaining the obesity paradox may be two nonexclusive ways fat tissue is hypothesized to function. One mechanism, termed the endocrine theory, suggests that fat cells secrete, or don’t secrete enough of, certain molecules that influence glucose homeostasis in other tissues, such as muscle or liver. The first such hormone to be discovered was leptin; later studies reported several other adipocyte-secreted factors, including adiponectin, resistin, and various cytokines.

The other hypothesis, dubbed the spillover theory, suggests that storing lipids in fat cells has some pluses. Adipose tissue might sequester fat-soluble endotoxins, and produce lipoproteins that can bind to and clear harmful lipids from circulation. When fat cells fill up, however, these endotoxins are stashed in the liver, pancreas, or other organs—and that’s when trouble begins. In “fat fit” people, problems typically linked to obesity such as high cholesterol or diabetes may be avoided simply because their adipocytes mop up more endotoxins.

“In this model, one could imagine that if you could store even more fat in fat cells, you could be even more obese, but you might be protected from problems [associated with] obesity because you’re protecting the other tissues from filling up with lipids that cause problems,” says Lazar. “This may be the most popular current model to explain the fat fit.”

Although obesity greatly increases the risk of type 2 diabetes—up to 93-fold in postmenopausal women, for example—not all obese people suffer from the condition. Similarly, a certain subtype of individuals with “normal” BMIs are at greater risk of developing insulin resistance and type 2 diabetes than others with BMIs in the same range. Precisely what distinguishes these two cohorts is still unclear. “Just as important as explaining why some obese people don’t get diabetes is to explain why other subgroups—normal-weight people or those with lipodystrophy—sometimes get it,” Lazar says. “If there are multiple subtypes of obesity and diabetes, can we figure out genetic aspects or biomarkers that cause one of these phenotypes and not the other?”

To Lazar, McAuley, and other researchers, it’s increasingly evident that BMI may not be the metric needed to answer such questions. Finding better ways to assess a healthy weight, however, has proven challenging. Researchers have tested measures, such as a body shape index (ABSI) or the waist-hip ratio, which attempt to gauge visceral fat, considered to be more metabolically harmful than fat in other body locations. However, these metrics have yet to be implemented widely in clinics, and few are as simple to understand as the BMI (Science, 341:856-58, 2013).

Independent of metrics, however, the health message regarding weight is still unanimous: exercise and healthy dietary choices benefit everyone. “At a certain point, despite all the so-called fit-fat people, the demographics say that there’s a huge risk of diabetes and heart disease at very high BMI,” notes Lazar. “We can’t assume we’ll be one of the lucky ones who will have a BMI in the obese category but will still be protected from heart disease.”

Correction (November 2): The original version of this article misattributed the pull quote above. The attribution for this quote has been corrected, and The Scientist regrets the error.

 

 

THE HEALTH RISK OF OBESITY—BETTER METRICS IMPERATIVE

Science 23 Aug 2013; 341(6148): 856-858     DOI: http://dx.doi.org:/10.1126/science.1241244
Obesity paradoxes.
In this review, we examine the original obesity paradox phenomenon (i.e. in cardiovascular disease populations, obese patients survive better), as well as three other related paradoxes (pre-obesity, “fat but fit” theory, and “healthy” obesity). An obesity paradox has been reported in a range of cardiovascular and non-cardiovascular conditions. Pre-obesity (defined as a body mass index of 25.0-29.9 kg · m⁻²) presents another paradox. Whereas “overweight” implies increased risk, it is in fact associated with decreased mortality risk compared with normal weight. Another paradox concerns the observation that when fitness is taken into account, the mortality risk associated with obesity is offset. The final paradox under consideration is the presence of a sizeable subset of obese individuals who are otherwise healthy. Consequently, a large segment of the overweight and obese population is not at increased risk for premature death. It appears therefore that low cardiorespiratory fitness and inactivity are a greater health threat than obesity, suggesting that more emphasis should be placed on increasing leisure time physical activity and cardiorespiratory fitness as the main strategy for reducing mortality risk in the broad population of overweight and obese adults.
Obesity, insulin resistance, and cardiovascular disease.
Recent Prog Horm Res. 2004;59:207-23.
The ability of insulin to stimulate glucose disposal varies more than six-fold in apparently healthy individuals. The one third of the population that is most insulin resistant is at greatly increased risk to develop cardiovascular disease (CVD), type 2 diabetes, hypertension, stroke, nonalcoholic fatty liver disease, polycystic ovary disease, and certain forms of cancer. Between 25% and 35% of the variability in insulin action is related to being overweight. The importance of the adverse effects of excess adiposity is apparent in light of the evidence that more than half of the adult population in the United States is classified as being overweight/obese, as defined by a body mass index greater than 25.0 kg/m². The current epidemic of overweight/obesity is most likely related to a combination of increased caloric intake and decreased energy expenditure. In either instance, the fact that CVD risk is increased as individuals gain weight emphasizes the gravity of the health care dilemma posed by the explosive increase in the prevalence of overweight/obesity in the population at large. Given the enormity of the problem, it is necessary to differentiate between the CVD risk related to obesity per se, as distinct from the fact that the prevalence of insulin resistance and compensatory hyperinsulinemia are increased in overweight/obese individuals. Although the majority of individuals in the general population that can be considered insulin resistant are also overweight/obese, not all overweight/obese persons are insulin resistant. Furthermore, the cluster of abnormalities associated with insulin resistance (namely, glucose intolerance, hyperinsulinemia, dyslipidemia, and elevated plasma C-reactive protein concentrations) is limited to the subset of overweight/obese individuals that are also insulin resistant.
Of greater clinical relevance is the fact that significant improvement in these metabolic abnormalities following weight loss is seen only in the subset of overweight/obese individuals that are also insulin resistant. In view of the large number of overweight/obese subjects at potential risk to be insulin resistant/hyperinsulinemic (and at increased CVD risk), and the difficulty in achieving weight loss, it seems essential to identify those overweight/obese individuals who are also insulin resistant and will benefit the most from weight loss, then target this population for the most intensive efforts to bring about weight loss.
Long-Term Persistence of Hormonal Adaptations to Weight Loss

Priya Sumithran, Luke A. Prendergast, Elizabeth Delbridge, Katrina Purcell, Arthur Shulkes, Adamandia Kriketos, and Joseph Proietto

N Engl J Med 2011; 365:1597-1604   October 27, 2011   http://dx.doi.org:/10.1056/NEJMoa1105816

After weight loss, changes in the circulating levels of several peripheral hormones involved in the homeostatic regulation of body weight occur. Whether these changes are transient or persist over time may be important for an understanding of the reasons behind the high rate of weight regain after diet-induced weight loss.

Weight loss (mean [±SE], 13.5±0.5 kg) led to significant reductions in levels of leptin, peptide YY, cholecystokinin, insulin (P<0.001 for all comparisons), and amylin (P=0.002) and to increases in levels of ghrelin (P<0.001), gastric inhibitory polypeptide (P=0.004), and pancreatic polypeptide (P=0.008). There was also a significant increase in subjective appetite (P<0.001). One year after the initial weight loss, there were still significant differences from baseline in the mean levels of leptin (P<0.001), peptide YY (P<0.001), cholecystokinin (P=0.04), insulin (P=0.01), ghrelin (P<0.001), gastric inhibitory polypeptide (P<0.001), and pancreatic polypeptide (P=0.002), as well as hunger (P<0.001).

What’s new in endocrinology and diabetes mellitus

Large genome-wide association studies have demonstrated that variants in the FTO gene have the strongest association with obesity risk in the general population, but the mechanism of the association has been unclear. However, a noncoding causal variant in FTO has now been identified that changes the function of adipocytes from energy utilization (beige fat) to energy storage (white fat) with a fivefold decrease in mitochondrial thermogenesis [17]. When the effect of the variant was blocked in genetically engineered mice, thermogenesis increased and weight gain did not occur, despite eating a high-fat diet. Blocking the gene’s effect in human adipocytes also increased energy utilization. This observation has important implications for potential new anti-obesity drugs. (See “Pathogenesis of obesity”, section on ‘FTO variants’.)

Liraglutide for the treatment of obesity (July 2015)

Along with diet, exercise, and behavior modification, drug therapy may be a helpful component of treatment for select patients who are overweight or obese. Liraglutide is a glucagon-like peptide-1 (GLP-1) receptor agonist, used for the treatment of type 2 diabetes, and can promote weight loss in patients with diabetes, as well as those without diabetes.

In a randomized trial in nondiabetic patients who had a body mass index (BMI) of ≥30 kg/m2 or ≥27 kg/m2 with dyslipidemia and/or hypertension, liraglutide 3 mg once daily, compared with placebo, resulted in greater mean weight loss (-8.0 versus -2.6 kg with placebo) [18]. In addition, cardiometabolic risk factors, glycated hemoglobin (A1C), and quality of life improved modestly. Gastrointestinal side effects transiently affected at least 40 percent of the liraglutide group and were the most common reason for withdrawal (6.4 percent). Liraglutide is an option for select overweight or obese patients, although gastrointestinal side effects (nausea, vomiting) and the need for a daily injection may limit the use of this drug. (See “Obesity in adults: Drug therapy”, section on ‘Liraglutide’.)

In a trial designed specifically to evaluate the effect of liraglutide on weight loss in overweight or obese patients with type 2 diabetes (mean weight 106 kg), liraglutide, compared with placebo, resulted in greater mean weight loss (-6.4 kg and -5.0 kg for liraglutide 3 mg and 1.8 mg, respectively, versus -2.2 kg for placebo) [19]. Treatment with liraglutide was associated with better glycemic control, a reduction in the use of oral hypoglycemic agents, and a reduction in systolic blood pressure. Although liraglutide is not considered as initial therapy for the majority of patients with type 2 diabetes, it is an option for select overweight or obese patients with type 2 diabetes who fail initial therapy with lifestyle intervention and metformin.  (See “Glucagon-like peptide-1 receptor agonists for the treatment of type 2 diabetes mellitus”, section on ‘Weight loss’.)
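To make the two trial summaries easier to compare, the placebo-subtracted weight change can be worked out from the means quoted above. The Python sketch below uses only those quoted means; the arm labels and the helper name are hypothetical, and a difference of means says nothing about statistical uncertainty or individual response.

```python
# Mean weight changes in kg as quoted above: (treatment arm, placebo arm).
# Negative values are weight loss. Arm labels are illustrative, not trial names.
trials = {
    "nondiabetic, liraglutide 3 mg": (-8.0, -2.6),
    "type 2 diabetes, liraglutide 3 mg": (-6.4, -2.2),
    "type 2 diabetes, liraglutide 1.8 mg": (-5.0, -2.2),
}


def placebo_subtracted(drug_kg: float, placebo_kg: float) -> float:
    """Difference in mean weight change between drug and placebo, to 0.1 kg."""
    return round(drug_kg - placebo_kg, 1)


differences = {arm: placebo_subtracted(d, p) for arm, (d, p) in trials.items()}

for arm, diff in differences.items():
    print(f"{arm}: {diff:+.1f} kg relative to placebo")
```

On these figures the additional mean loss attributable to the drug is 5.4 kg in the nondiabetic trial, and 4.2 kg and 2.8 kg for the 3 mg and 1.8 mg doses in the diabetes trial.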

The Skinny on Fat Cells

Bruce Spiegelman has spent his career at the forefront of adipocyte differentiation and metabolism.

By Anna Azvolinsky | November 1, 2015

http://www.the-scientist.com//?articles.view/articleNo/44312/title/The-Skinny-on-Fat-Cells/

Bruce Spiegelman
Stanley J. Korsmeyer Professor of Cell Biology and Medicine, Harvard Medical School
Director, Center for Energy Metabolism and Chronic Disease, Dana-Farber Cancer Institute, Boston

“It’s hard to know whether you have the right stuff to be a scientist, but I had a passion for the research,” says Bruce Spiegelman, professor of cell biology at Harvard Medical School and the Dana-Farber Cancer Institute. After receiving his PhD in biochemistry from Princeton University in 1978, Spiegelman sent an application to do postdoctoral research to just one lab. “I wasn’t thinking I should apply to five different labs. I just marched forward more or less in a straight line,” he says. Spiegelman did know that he had no financial backup and depended on research fellowships throughout the early phase of his science career. “I thought it was fantastic, and still think so, that a PhD in science is supported by the government. I certainly appreciated that, because many of my friends in the humanities had to support themselves by cobbling together fellowships and teaching every semester, whereas we didn’t face similar challenges in the sciences.”

Since his graduate student days, Spiegelman has realized his potential, pioneering the study of adipose tissue biology and metabolism. He was introduced to the field in Howard Green’s laboratory, then at MIT, where Spiegelman began his one and only postdoc in 1978. Green had recently developed a system for culturing adipose cells and asked Spiegelman if he wanted to study fat cell differentiation. “I knew nothing about adipose tissue, but I was really interested in any model of how one cell switches to another. Whether skin or fat didn’t matter too much to me, because I was not coming at this from the perspective of physiology but from the perspective of how do these switches work at a molecular level?”

Spiegelman has stuck with studying the biology and differentiation of fat cells for more than 30 years. While looking for the master transcriptional regulator of fat development—which his laboratory found in 1994—Spiegelman’s group also discovered one of the first examples of a nuclear oncogene that functions as a transcription factor, and, more recently, the team found that brown fat and white fat come from completely different origins and that brown and beige fat are distinct cell types. Spiegelman was also the first to provide evidence for the connection between inflammation, insulin resistance, and fat tissue.

Here, Spiegelman talks about his strong affinity for the East Coast, his laboratory’s search for molecules that can crank up brown fat production and activity, and the culture of his laboratory’s weekly meeting.

Spiegelman Sets Out

First publication. Spiegelman grew up in Massapequa, New York, a town on Long Island. “Birds, insects, fish, and animals were fascinating to me. As a kid, I imagined I would be a wildlife ranger,” he says. Spiegelman and his brother were the first in their family to attend college; Spiegelman entered the College of William and Mary in 1970 thinking he would major in psychology. But before taking his first psychology course, he had to take a biology course, really loved it, and switched his major. For his senior thesis, he chose one of the few labs that did biochemistry-related research. He studied cultures of the filamentous fungus Aspergillus ornatus in which he induced the upregulation of a metabolic enzyme. Spiegelman applied a calculus transformation that related the age of the culture to the age of individual cells, something that had not been previously done. The work earned him his first first-author publication in 1975. “It was not a great breakthrough, but I think it showed that I was maybe applying myself more than the typical undergraduate.”

Full steam ahead. “My interest in laboratory research was intense. Even though it was not particularly inspired work, the first-author publication in a college where not many of the professors published a lot gave me a lot of confidence. It was probably out of proportion to the quality of the actual work.” That confidence and Spiegelman’s interest in the chemistry of living things led him to pursue a PhD in biochemistry at Princeton University. “Very early on, I felt that I couldn’t understand biology if it didn’t go to the molecular level. To me, just describing how an animal lived without understanding how it worked was very unsatisfying. I think it was one of the best decisions that I made in my life, to do a PhD in biochemistry,” he says, “because if you really want to understand living systems, you are very limited in how you can understand them without having a strong background in biochemistry because these are, essentially, chemical systems.”

Embracing molecular biology. Spiegelman initially joined Arthur Pardee’s laboratory, but switched when Pardee left Princeton for Harvard University in 1975. Because he was already collaborating with Marc Kirschner, a cell biologist and biochemist who studies the regulation of the cell cycle and how the cytoskeleton works, it was an easy transition to transfer to the new laboratory. In Kirschner’s group, Spiegelman became the cell biologist among many protein biochemists working on microtubule assembly in vitro. Rather than understanding how the proteins fit together to form the filamentous structures, Spiegelman wanted to understand what controlled their assembly inside cells. Working in mammalian cells, Spiegelman published three consecutive Cell papers on how microtubule assembly occurs in vivo. The first paper, from 1977, demonstrated that a nucleotide functions to stabilize the tubulin molecule rather than to regulate tubulin assembly in vivo.

Spiegelman Simmers

A new tool. For his next move, Spiegelman wanted to marry his background in biochemistry and molecular biology with a good cellular model system. He became interested in differentiation at the end of his PhD, while studying how the cytoskeleton is reorganized during neural differentiation, and settled on Green’s MIT laboratory for his postdoc. Green had developed a way to study both skin and fat cell differentiation. Again, Spiegelman was the odd man out, working on the molecular biology of fat cell differentiation while most of the graduate students and postdocs focused on the cellular biology of skin cell differentiation. While there, Spiegelman learned how to clone cDNA—a new method that some researchers thought was just another new fad, he says. “I thought it was pretty obvious that this was a tool that would be a game changer. I could see how I could clone some of the cDNAs and genes that were regulated in the fat cell lineage and then try to understand the regulation of these genes.”

Setting the stage. Spiegelman demonstrated that cAMP regulates the synthesis of certain enzymes in fat cells during differentiation. But while this was the most influential paper from his postdoc, says Spiegelman, it was his demonstration of cloning mRNAs from adipocytes, published in 1983, that set the stage for cloning fat-selective genes. The work, mostly done when Spiegelman was already a new faculty member at the Dana-Farber Cancer Institute, stemmed from his learning molecular cloning in Phillip Sharp’s lab at MIT and Bryan Roberts’s lab at Harvard. “This was the raw material from which we eventually cloned PPARγ and showed it to be the master regulator of fat [cell] development.”

Roots. Spiegelman became an assistant professor at the Harvard Medical School in 1982, when he was not yet 30. Although he had entertained the idea of moving to the West Coast with his wife, whom he had met at Princeton where she obtained a PhD in French literature, Spiegelman says he is really an East Coaster at heart. “My wife and I came to love Boston and were very comfortable there. Our families were both in New York, which was close, but not too close, and we really enjoyed the culture and pace of Boston; it was more ‘us.’ We really liked to visit California but didn’t particularly want to move there. We’re both real Northeastern people.”

Relating to Sisyphus. The transition from doing a postdoc to setting up his own laboratory was “very exciting and terribly stressful,” says Spiegelman. “When I think back, I always tried to be professional with my laboratory, but I was so stressed at suddenly being on my own with no management training.” The people resources he had encountered in his graduate and postdoctoral training labs were also not there yet, and he says his first publication as a principal investigator was like pushing a rock up a hill. But eventually, Spiegelman’s lab built a reputation and reached a critical mass of talented people who advanced the science. Again in 1983, Spiegelman produced a publication showing that morphological manipulation can affect gene expression and adipose differentiation.

End goal. Spiegelman’s goal was to find a master molecule that orchestrates the conversion of adipocyte precursor cells into bona fide fat cells. Piece by piece, his lab identified the enhancers, promoters, and other regulatory elements involved in adipocyte differentiation. In 1994, graduate student Peter Tontonoz finally found that the PPARγ gene, inserted via a retroviral vector into fibroblasts, could induce the cells to become adipose cells. “It took 10 years,” Spiegelman says. Along the way, the laboratory found that c-fos, the product of a famous nuclear oncogene, bound to the promoters of fat-specific genes and worked as a transcription factor. “It was not really known how nuclear oncogenes worked. This was one of the first papers showing that these oncogenes bound to gene promoters and were transcription factors.”

A wider scope. In 1993, graduate student Gökhan Hotamisligil found that tumor necrosis factor-alpha (TNF-α) is induced in the fat tissue of rodent models of obesity and diabetes. The paper sparked the formation of the field of immunometabolism and resulted in the expansion of Spiegelman’s lab into the physiology arena, partly thanks to the guidance of C. Ronald Kahn and Jeff Flier, who both study metabolism and diabetes. But the work initially encountered pushback, says Spiegelman, partly because it was the merging of two fields.

Spiegelman Scales Up

Fat color palette. Brown fat tissue, abundant in infants but scarce in adults, is a metabolically active form of fat that is chock full of mitochondria and is found in pockets in the body distinct from white fat tissue. Pere Puigserver, then a postdoc in Spiegelman’s lab, found that the coactivator PGC-1, binding to PPARγ and other nuclear receptors, could stimulate mitochondrial biogenesis. The PGC-1 gene is turned on by stimuli such as exercise or a cold environment. Later, postdoc Patrick Seale, Spiegelman, and their colleagues showed brown fat cells derive from the same lineage that gives rise to skeletal muscle. “This was a big surprise, maybe the biggest surprise we ever uncovered in the lab,” says Spiegelman.

A paler shade of brown. More recently, in 2012, Spiegelman’s laboratory showed that within adult white adipose tissue, there are pockets of yet another type of fat tissue that he called beige fat. “I think the evidence is very good from rodents that if you activate brown and beige fat, you get metabolic benefit both in obesity and diabetes. So the question now is: Can that be done in humans in a way that’s beneficial and not toxic?” The lab is now looking to identify molecules that can either ramp up the activity of brown and beige fat or increase the production of both cell types as possible therapeutics for metabolic disorders or even cancer-associated cachexia. “Anyone who says that either approach will work better is being foolish. We just don’t know enough to go after just one or the other.”

On the irisin controversy. After reporting in 2012 that a muscle-related hormone called irisin could switch white fat to metabolically active brown fat, Spiegelman became embroiled in a media-covered debate about whether the molecule really exists; he was also the victim of a potential fraud plot. Most recently, Spiegelman provided thorough evidence that irisin does in fact exist. On the controversy, he says it’s a fine line between defending his scientific integrity and not adding more fuel to the fire or engaging with his harassers. “We have a long track record of doing credible and reproducible science and it was not that complicated to address the paper that claimed irisin was ‘a myth.’ That study used very outmoded scientific approaches.”

Raw talent. Many of Spiegelman’s trainees have gone on to become very successful scientists, including Tontonoz, Hotamisligil, Evan Rosen, and Randy Johnson. “It’s a quantum change in the experience of doing science when you get people who have their own visions. I would have thought that interacting with smart people would mainly help me get my scientific vision accomplished. And that was partly true, but also it changed my vision. When you have people challenging you on a day-to-day basis, you learn from them through the questions they ask and the way they challenge you in a constructive way. They made me a much better scientist.”

Rigorous mentorship.  “I feel very passionately that a major part of my job is to prepare the next generation of scientists. Everyone who comes through my lab will tell you that I take that very seriously. We make sure my students give a lot of talks and get critical assessments of their presentations to our lab group. I am very hands-on both scientifically and in developing the way students project their vision. I had a very good mentor, Marc Kirschner, and I’d like to think that I learned how to be a mentor from him. I want to make sure that when people walk out of my lab they are prepared to run independent research programs.”

Greatest Hits

  • Identified the master regulator of adipogenesis, the nuclear receptor PPARγ
  • Was the first to show that a nuclear oncogene, c-fos, codes for a transcription factor that binds to the promoters of genes
  • Demonstrated that adipose tissue synthesizes tumor necrosis factor-alpha (TNF-α), providing the first direct link between obesity, inflammation, insulin resistance, and fat tissue
  • Showed that brown fat cells are not developmentally related to white fat
  • Identified beige fat as a distinct cell type, different from either white or brown fat

Fanning the Flames

Obesity triggers a fatty acid synthesis pathway, which in turn helps drive T cell differentiation and inflammation.

By Kate Yandell | November 1, 2015

http://www.the-scientist.com//?articles.view/articleNo/44306/title/Fanning-the-Flames/

EDITOR’S CHOICE IN IMMUNOLOGY

The paper
Y. Endo et al., “Obesity drives Th17 cell differentiation by inducing the lipid metabolic kinase, ACC1,” Cell Reports, 12:1042-55, 2015.

Cell Rep. 2015 Aug 11;12(6):1042-55. http://dx.doi.org/10.1016/j.celrep.2015.07.014. Epub 2015 Jul 30.
Obesity Drives Th17 Cell Differentiation by Inducing the Lipid Metabolic Kinase, ACC1.
  • A high-fat diet augments Th17 cell development and the expression of Acaca
  • ACC1 controls Th17 cell development in vitro and Th17 cell pathogenicity in vivo
  • ACC1 modulates RORγt function in developing Th17 cells
  • Obesity in humans induces ACACA and IL-17A expression in CD4 T cells

Chronic inflammation due to obesity contributes to the development of metabolic diseases, autoimmune diseases, and cancer. Reciprocal interactions between metabolic systems and immune cells have pivotal roles in the pathogenesis of obesity-associated diseases, although the mechanisms regulating obesity-associated inflammatory diseases are still unclear. In the present study, we performed transcriptional profiling of memory phenotype CD4 T cells in high-fat-fed mice and identified acetyl-CoA carboxylase 1 (ACC1, the gene product of Acaca) as an essential regulator of Th17 cell differentiation in vitro and of the pathogenicity of Th17 cells in vivo. ACC1 modulates the DNA binding of RORγt to target genes in differentiating Th17 cells. In addition, we found a strong correlation between IL-17A-producing CD45RO(+)CD4 T cells and the expression of ACACA in obese subjects. Thus, ACC1 confers the appropriate function of RORγt through fatty acid synthesis and regulates the obesity-related pathology of Th17 cells.

http://www.cell.com/cms/attachment/2035221719/2050630604/fx1.jpg

http://www.the-scientist.com/November2015/NovMediLit_310px.jpg

FEEDING INFLAMMATION: When mice eat a diet high in fat, their CD4 T cells show increased expression of the fatty acid biosynthesis gene Acaca, which encodes the enzyme ACC1 (1). Products of the ACC1 fatty acid synthesis pathway encourage the transcription factor RORγt to bind near the gene encoding the cytokine IL-17A (2). There, RORγt recruits an enzyme called p300 to modify the genome epigenetically and turn on IL-17A. The memory T cells then differentiate into inflammatory T helper 17 cells.
© STEVE GRAEPEL

Obesity often comes with a side of chronic inflammation, causing inflammatory chemicals and immune cells to flood adipose tissue, the hypothalamus, the liver, and other areas of the body. Inflammation is a big part of what makes obesity such an unhealthy condition, contributing to Type 2 diabetes, heart disease, cancers, autoimmune disorders, and possibly even neurodegenerative diseases.

To better understand the relationship between obesity and inflammation, Toshinori Nakayama, Yusuke Endo, and their colleagues at Chiba University in Japan started with what often leads to obesity: a high-fat diet. They fed mice rich meals for a couple of months and looked at how gene expression in the animals’ T cells compared to gene expression in the T cells of mice fed a normal diet. Most notably, they found increased expression of Acaca, a gene that codes for a fatty acid synthesis enzyme called acetyl-CoA carboxylase 1 (ACC1). They went on to show that the resulting increase in fatty acid levels pushed CD4 T cells to differentiate into inflammatory T helper 17 (Th17) cells.

Th17 cells help fight off invading fungi and some bacteria. But these immune cells can also spin out of control in autoimmune diseases such as multiple sclerosis. Nakayama’s team showed that either blocking ACC1 activity with a drug called TOFA or deleting a key portion of Acaca in mouse CD4 T cells reduced the generation of pathologic Th17 cells. Overexpressing Acaca increased Th17-cell generation.

The researchers also demonstrated that mice fed a high-fat diet had elevated susceptibility to a multiple sclerosis–like disease, and that TOFA reduced the symptoms.

“This is a very intriguing finding, suggesting not only that obesity can directly induce Th17 differentiation but also indicating that pharmacologic targeting of fatty acid synthesis may help to interfere with obesity-associated inflammation,” Tim Sparwasser of the Twincore Center for Experimental and Clinical Infection Research in Hannover, Germany, says in an email. Sparwasser and his colleagues had previously shown that ACC1 is required for the differentiation of Th17 cells in mice and humans.

Nakayama explains that CD4 T cells must undergo profound metabolic changes as they mature and differentiate. “The intracellular metabolites, including fatty acids, are essential for cell proliferation and cell growth,” he says in an email. When fatty acid levels in T cells increase, the cells are activated and begin to proliferate.

“It’s a nice illustration of how, really, immune response is so highly connected to the metabolic state of the cell,” says Gökhan S. Hotamisligil of Harvard University’s T.H. Chan School of Public Health, who was not involved in the study. “The immune system launches its responses commensurate with the sources of nutrients and energy from the environment,” he adds in an email.

There are still missing pieces in the path from high-fat diet to increased Acaca expression to ACC1’s influence on T-cell differentiation. It also remains to be seen how this plays out in obese humans, although Nakayama and colleagues did show that inhibiting ACC1 reduced pathologic Th17 generation in human immune cell cultures, and that the T cells of obese humans contain elevated levels of ACC1 and show signs of increased differentiation into Th17 cells.

The prevalence of obesity has been increasing worldwide, and obesity is now a major public health problem in most developed countries (Gregor and Hotamisligil, 2011, Ng et al., 2014). Obesity-induced inflammation contributes to the development of various chronic diseases, such as autoimmune diseases, metabolic diseases, and cancer (Kanneganti and Dixit, 2012, Kim et al., 2014, Osborn and Olefsky, 2012, Winer et al., 2009a). A number of studies have pointed out the importance of reciprocal interactions between metabolic systems and immune cells in the pathogenesis of obesity-associated diseases (Kaminski and Randall, 2010, Kanneganti and Dixit, 2012, Kim et al., 2014, Mauer et al., 2014, Stienstra et al., 2012, Winer et al., 2011).

Elucidating the molecular mechanisms by which naive CD4 T cells differentiate into effector T cells is crucial for understanding helper T (Th) cell-mediated immune pathogenicity. After antigen stimulation, naive CD4 T cells differentiate into at least four distinct Th cell subsets: Th1, Th2, Th17, and inducible regulatory T (iTreg) cells (O’Shea and Paul, 2010, Reiner, 2007). Several specific master transcription factors that regulate Th1/Th2/Th17/iTreg cell differentiation have been identified, including T-bet for Th1 (Szabo et al., 2000), GATA3 (Yamashita et al., 2004, Zheng and Flavell, 1997) for Th2, retinoic-acid-receptor-related orphan receptor γt (RORγt) for Th17 (Ivanov et al., 2006), and forkhead box protein 3 (Foxp3) for iTreg (Sakaguchi et al., 2008). The appropriate expression and function of these transcription factors is essential for proper immune regulation by each Th cell subset.

Among these Th cell subsets, Th17 cells contribute to the host defense against fungi and extracellular bacteria (Milner et al., 2008). However, the pathogenicity of IL-17-producing T cells has been recognized in various autoimmune diseases, including multiple sclerosis, psoriasis, inflammatory bowel diseases, and steroid-resistant asthma (Bettelli et al., 2006, Coccia et al., 2012, Ivanov et al., 2006, Leonardi et al., 2012, McGeachy and Cua, 2008, Nylander and Hafler, 2012, Stockinger et al., 2007, Sundrud et al., 2009).

An HFD Promotes Th17 Cell Differentiation and Affects the Expression of Fatty Acid Enzymes in Memory CD4 T Cells In Vivo

Inhibition of ACC1 Function Results in Decreased Th17 Cell Differentiation and Ameliorates the Development of Autoimmune Disease

ACC1 Controls the Differentiation of Th17 Cells Both In Vitro and In Vivo

ACC1 Controls the Function, but Not Expression, of RORγt in Differentiating Th17 Cells

Extrinsic Fatty Acid Supplementation Restored Acaca−/− Th17 Cell Differentiation through the Functional Improvement of RORγt

Obese Subjects Show Upregulation of ACACA and Increased Th17 Cells in CD45RO+ Memory CD4 T Cells

We herein identified a critical role that ACC1 plays in Th17 cell differentiation and the pathogenicity of Th17 cells through the control of the RORγt function under obese circumstances. High-fat-induced obesity augments Th17 cell differentiation and the expression of enzymes involved in fatty acid metabolism, including ACC1. Pharmacological inhibition or genetic deletion of ACC1 resulted in impaired Th17 cell differentiation in both mice and humans. In contrast, overexpression of Acaca induced Th17 cells in vivo, leaving the expression of Ifng and Il4 largely unchanged. ACC1 modulated the binding of RORγt to the Il17a gene and the subsequent p300 recruitment in differentiating Th17 cells. Memory CD4 T cells from peripheral blood mononuclear cells (PBMCs) of obese subjects showed increased IL-17A production and ACACA expression. Furthermore, a strong correlation was detected between the proportion of IL-17A-producing cells and the expression level of ACACA in memory CD4 T cells in obese subjects. Thus, our findings provide evidence of a mechanism wherein obesity can exacerbate IL-17-mediated pathology via the induction of ACC1.

Irreconcilable Dissonance in Physical Space and Cellular Metabolic Conception

Curator: Larry H. Bernstein, MD, FCAP

Pasteur Effect – Warburg Effect – What its history can teach us today. 

José Eduardo de Salles Roselino

The Warburg effect, in reality the “Pasteur effect,” was the first example of metabolic regulation to be described: a decrease in the carbon flux from the sugar molecule toward the end products of the catabolic pathway, ethanol and carbon dioxide, observed when yeast cells were transferred from an anaerobic environment to an aerobic one. In Pasteur’s studies, sugar metabolism was measured mainly by the decrease in sugar concentration in the yeast growth medium after a measured period of time. The sugar concentration in the medium falls rapidly when yeast is grown in anaerobiosis (oxygen deficiency), and the rate of decrease was greatly reduced when the yeast culture was transferred to an aerobic condition. This finding was very important for the French wine industry of Pasteur’s time, since most of the undesirable outcomes in the industrial use of yeast were seen when yeast cells took a very long time to create a rather selective anaerobic condition. This selective culture medium was characterized by the higher carbon dioxide levels produced by fast-growing yeast cells and by a higher alcohol content.

However, in biochemical terms, this finding was needed to understand Lavoisier’s results, which indicated that the chemical and the biological oxidation of sugars produce the same calorimetric (heat generation) results. This observation requires a control mechanism (metabolic regulation) that keeps living cells from being burned by the rapid release of heat from the biological oxidation of sugar (metabolism). In addition, Lavoisier’s results were the first indication that both processes occur within similar thermodynamic limits. In brief, these observations are the major reasons that led Warburg to test for a failure of control mechanisms in cancer cells compared with those observed in normal cells.

[It might be added that the availability of O2 and CO2 and climatic conditions over 750 million years, which included volcanic activity, tectonic movements of the earth’s crust, and glaciation, and more recently the use of carbon fuels and the extensive deforestation of our land masses, have had a large role in determining biological speciation over time, in the sea and on land. O2 is generated by plants using energy from the sun to convert CO2. Remove the plants and we tip the balance. A large source of CO2 is from beneath the earth’s surface.]

Placing biology within classical thermodynamics poses some challenges for scientists. For instance, all classical thermodynamic quantities must be measured under reversible thermodynamic conditions. In an isolated system, an increase in P (pressure) leads to an increase in V (volume), all of this occurring under conditions in which an infinitesimal change in one affects the other in the same way, a continuous response. Not even a quantum of energy falls outside those parameters.

In a reversible system, a decrease in V, under the same conditions, will lead to an increase in P. In biochemistry, reversible usually indicates a reaction that goes easily either from A to B or from B to A. For instance, when it was necessary to search for an anti-ischemic effect of chlorpromazine in a liver with extrahepatic obstruction, an adequate system for increasing biliary pressure in a reversible manner had to be used, in order to exclude a direct effect of the drug on the biological inducer of that pressure (bile secretion) (Braz. J. Med. Biol. Res. 1989; 22: 889-893). Frequently, these details are skipped over by those who read biology in ATGC letters.

Very important observations can be made in this regard when neutral mutations are taken into consideration, since, after several mutations that do not affect previous activity and function, a final mutation may provide a new RNA transcript for a protein and elicit a new function. For example, consider a Prion C from lamb becoming similar to bovine Prion C while preserving its normal role in the lamb, and then consider its ability to change human Prion C (Stanley Prusiner).

This observation is good enough to confirm one of the most important contributions of Erwin Schrödinger in his What Is Life?:

“This little book arose from a course of public lectures, delivered by a theoretical physicist to an audience of about four hundred which did not substantially dwindle, though warned at the outset that the subject matter was a difficult one and that the lectures could not be termed popular, even though the physicist’s most dreaded weapon, mathematical deduction, would hardly be utilized. The reason for this was not that the subject was simple enough to be explained without mathematics, but rather that it was much too involved to be fully accessible to mathematics.”

After Hans Krebs described the cyclic nature of citrate metabolism, and after his followers described its requirement for aerobic catabolism, two major lines of research began the search for the mechanism of energy transfer that explains how ADP is converted into ATP. One followed the organic chemistry line of reasoning and therefore searched for a mechanism that could explain how the energy of carbon-carbon bond breakdown could be transferred to ATP synthesis. One of the major leaders of this line of research was Britton Chance. He took into account that relatively early in the series of Krebs cycle reactions, two carbon atoms of acetyl are released as carbon dioxide (in fact, not the actual acetyl carbons but those on the opposite side of the citrate molecule). In stoichiometric terms, it was not important whether or not the released carbons were exactly those that originated from glucose. His research aimed to find a proteinaceous intermediate that could act as an energy reservoir. The intermediate could store the energy of carbon-carbon bond breakdown in a phosphorylated amino acid. This activated amino acid could then transfer its phosphate group to ADP, producing ATP. A key intermediate involved in the transfer was identified by Kaplan and Lipmann at Johns Hopkins as acetyl coenzyme A, for which Fritz Lipmann received a Nobel Prize.

Alternatively, possibly under the influence of the excellent results of Hodgkin and Huxley, a second line of research appeared. The work of Hodgkin and Huxley pointed to the storage of electrical potential energy in transmembrane ionic asymmetries and explained the change from resting to action potential in excitable cells. This second line of research, under the leadership of Peter Mitchell, postulated that the transfer of the oxidation-reduction power of organic molecule oxidation through electron transfer is the key to the energy transfer mechanism required for ATP synthesis.
This diverted attention from the high-energy (~P) phosphate bond to the transfer of electrons. During most of the harsh period in which the two points of view confronted each other, Paul Boyer and his followers attempted to act as a conciliatory third party, without good results, according to personal accounts (in L. A. or Latin America) heard from the few of our scientists who were able to follow the major scientific events held in the USA and who could later present them to us. Paul Boyer was able to show how the energy is transduced by a molecular machine that changes conformation in a series of 3 steps while rotating in one direction to produce ATP and in the opposite direction to produce ADP plus Pi from ATP (reversibility).

However, earlier, Peter Mitchell had emerged victorious in the conceptual dispute over Britton Chance’s point of view, after he used E. coli mutants to show H+ gradients across the cell membrane and their use as an energy source, for which he received a Nobel Prize. This outcome was such a blow to Chance’s previous work that it seems to have cast a shadow over very important findings from his earlier career that should not be affected by one or another form of energy transfer mechanism. For instance, Britton Chance developed the simple and rapid polarographic assay of oxidative phosphorylation and the idea of control of energy metabolism, which brings us back to Pasteur.

This alternative metabolic result seems to have been neglected in the recent years of the obesity epidemic, which led to a search for a single molecular mechanism to explain the accumulation of chemical reserves (adipose tissue) in our body. This does not mean that the role of the central nervous system is neglected here. In short, in respiring mitochondria the rate of electron transport, linked to the rate of ATP production, is determined primarily by the relative concentrations of ADP, ATP and phosphate in the external medium (cytosol) and not by the concentration of a respiratory substrate such as pyruvate. Therefore, when the yield of ATP is high, as it is in aerobiosis, and the cellular use of ATP is unchanged, the oxidation of pyruvate, and therefore glycolysis, is quickly throttled down to the resting state (without any change in gene expression). The dependence of respiratory rate on ADP concentration is also seen in intact cells. A muscle at rest and using no ATP has a very low respiratory rate. [When skeletal muscle is stressed by high exertion, the lactic acid produced is released into the circulation and is metabolized aerobically by the heart at the end of the activity.]
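
The dependence of respiratory rate on ADP described above can be illustrated with a toy calculation. This is only a sketch under stated assumptions: the hyperbolic (saturation) form and every number in it (the maximal rate, the half-saturating ADP concentration, the resting rate) are hypothetical values chosen for illustration, not measurements and not taken from the work of Chance or anyone else cited here.

```python
def respiration_rate(adp_um, vmax=100.0, km_adp_um=25.0, resting=5.0):
    """Toy oxygen-consumption rate (arbitrary units) as a function of ADP.

    Assumption: the ADP-stimulated part of respiration saturates
    hyperbolically with ADP concentration (micromolar). All default
    parameter values are hypothetical.
    """
    if adp_um < 0:
        raise ValueError("ADP concentration cannot be negative")
    # ADP-dependent component: zero without ADP, approaches (vmax - resting).
    stimulated = (vmax - resting) * adp_um / (km_adp_um + adp_um)
    return resting + stimulated


def respiratory_control_ratio(active_adp_um, **params):
    """Ratio of the ADP-stimulated rate to the resting (no ADP) rate."""
    return respiration_rate(active_adp_um, **params) / respiration_rate(0.0, **params)


if __name__ == "__main__":
    # Substrate (e.g. pyruvate) does not appear in the model at all:
    # only ADP availability changes the rate.
    for adp in (0.0, 25.0, 250.0):
        print(f"ADP = {adp:6.1f} uM -> rate = {respiration_rate(adp):6.2f}")
```

In this sketch a resting cell (no free ADP) respires at the low baseline, and the rate rises only when ATP use regenerates ADP, which is the sense in which respiration is "throttled down" without any change in gene expression.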

This respiratory control of metabolism leads to the preservation of the body’s carbon reserves and, in the case of a high caloric intake, to an increase in fat reserves, which was essential for the survival of our biological ancestors (and today contributes to our obesity epidemic). No matter how important this observation is, it is only one focal point of metabolic control. We cannot reduce the problem of obesity to the existence of metabolic control; there are numerous other factors. On the other hand, we cannot neglect or remove this vital process in order to correct obesity, and we cannot explain obesity while ignoring this metabolic control. This topic is so neglected in modern times that we can no longer follow major research lines of the past that were interrupted by the emerging molecular biology techniques and by the vain belief that a dogmatic vision of biology could replace all previous knowledge with new knowledge based upon ATGC readings. To show the bad consequences of ignoring these old scientific facts, we can consider, for instance, how ion movements across membranes affect membrane protein conformation and therefore contradict the wrong central dogma of molecular biology. This change in protein conformation (with unchanged amino acid sequence), or the lack of it, is linked to the factors that affect vital processes such as the heartbeat. This modern ignorance could also explain some major pitfalls seen in clinical trials of new drugs and, on a smaller scale, in bad medical practice.

The work of Britton Chance and that of Peter Mitchell have deep and sound scientific roots; both were carried out with excellent techniques, supported by excellent reasoning, and produced a long series of very important intermediate results. Their sole difference was that they aimed at very different scientific explanations (each had a different teleology in mind, shaped by previous experience). When, with the use of mutants obtained in microorganisms, P. Mitchell’s goal was found to survive and B. Chance’s to succumb to the experimental evidence, all the excellent findings of B. Chance and his followers were consigned to the dustbin of scientific history, an example of a lack of scientific consideration. [On the one hand, the Mitchell model used a unicellular organism; on the other, Chance’s work was with eukaryotic cells, quite relevant to the discussion.]

We can summarize the challenge faced by these two great scientists as follows. The first conceptual unification in bioenergetics, achieved in the 1940s, is inextricably bound up with the name of Fritz Lipmann. Its central feature was the recognition that adenosine triphosphate, ATP, serves as a universal energy “currency”, much as money serves as economic currency. In a nutshell, the purpose of metabolism is to support the synthesis of ATP. In microorganisms, this is perfect! In humans, or in mammals and vertebrates generally, it oversimplifies the metabolic requirement with a huge error, for the same reason that we cannot consider gene expression equivalent to protein function (an acceptable error in the case of microorganisms). However, if our concern is ATP chemistry only, metabolism produces ATP and the hydrolysis of ATP pays for the performance of almost all kinds of work. It is reasonable to presume that finding out how the flow of metabolism (carbon flow) leads to ATP production was a major focal point of research for the two contenders. Consequently, what could have been a minor fall for one of the contenders, considering all that was found during his entire life of research, namely the failure of B. Chance’s final goal, was amplified far beyond what reason can justify!

Another aspect must be taken into account: both contenders have very sound roots in the scientific past. Metabolism may produce two forms of energy currency. (I personally don’t like this expression*; I use it here because both groups used it to express their findings. Together with simplistic thermodynamics, this expression conveys wrong ideas.) The second kind of energy currency is the current of ions passing from one side of a membrane to the other. P. Mitchell’s scientific roots undoubtedly have the work of Hodgkin & Huxley, Huxley & Huxley, and Huxley & Simmons (1940s to 1972) as a solid support. B. Chance had the enzymologists involved in clarifying how ATP could be produced directly from NADH + H+ oxidation-reduction reactions or from the hydrolysis of an enolpyruvate intermediate. Both competitors had their work supported by different but sound scientific roots, and both produced very important scientific results while trying to establish their hypotheses.

*ATP is produced according to the cell’s needs and not according to its yield. When glucose yields only 2 ATP per molecule, it is oxidized at very high speed (anaerobiosis), as required to match cellular needs. On the other hand, when it may yield (in thermodynamic terms) 38 ATP, the same molecule is oxidized at low speed. It would be like an investor choosing the lowest-yielding form for an investment.
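The arithmetic behind this footnote can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two yields (2 and 38 ATP per glucose) come from the text, while the ATP demand figure and the function name `glucose_rate` are hypothetical values chosen to make the comparison easy to read.

```python
# Yields per glucose molecule, as quoted in the footnote.
ATP_PER_GLUCOSE_ANAEROBIC = 2
ATP_PER_GLUCOSE_AEROBIC = 38


def glucose_rate(atp_demand, atp_per_glucose):
    """Glucose consumed per unit time to meet a fixed ATP demand."""
    return atp_demand / atp_per_glucose


# Hypothetical, constant ATP demand (arbitrary units per second).
atp_demand = 380.0

anaerobic = glucose_rate(atp_demand, ATP_PER_GLUCOSE_ANAEROBIC)
aerobic = glucose_rate(atp_demand, ATP_PER_GLUCOSE_AEROBIC)
fold = anaerobic / aerobic

print(f"anaerobic: {anaerobic} glucose/s")
print(f"aerobic:   {aerobic} glucose/s")
print(f"fold difference: {fold}")
```

For the same demand, the low-yield pathway must run 19 times faster, which is the sense in which the rate follows the cell's needs rather than the yield.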

Before the winning results of P. Mitchell were displayed, one line of defense used by B. Chance’s followers was to create a conflict between what would be expected from the restrictive role of proteins, through the specificity of their ionic interactions, and the general ability of ionic asymmetries to be associated with mitochondrial ATP production. Protein-catalyzed chemical activities do not have perfect specificity, but an outstanding degree of selective interaction was presented by the lock-and-key model of enzyme interaction. A large group of outstanding “mitochondriologists” were able to show ATP synthesis associated with Na+, K+, Ca2+… asymmetries across mitochondrial membranes, and every time they did this, P. Mitchell had to display the existence of antiporters that exchange X for hydrogen as the final common source of the chemiosmotic energy used by mitochondria for ATP synthesis.

This conceptual battle generated an enormous body of knowledge that was laid to rest, somehow discontinued as a line of scientific research, when the final E. coli mutant studies presented the convincing final evidence in favor of P. Mitchell’s point of view.

Not surprisingly, a “wise anonymous” later pointed out: “No matter what you are doing, you will always be better off if you have a mutant.”

(Principles of Medical Genetics T D Gelehrter & F.S. Collins chapter 7, 1990).

However, let’s take the example of a mechanical wristwatch. When the watch is working in an acceptable way, it is clear that its normal functioning is not the result of any one of its isolated components, nor something that can be shown by a reductionist molecular view. Usually it is considered to be working acceptably if its accuracy falls inside a normal functional range, for instance within one or two standard deviations below or above the mean value for normal function, depending upon the rigor wisely adopted. Only when it has a faulty component (a genetic inborn error) can we point to a single isolated piece as the cause of its failure (a reductionist molecular view).

In medicine we need to teach first the major reasons why the watch works fine (without saying it is “automatic”). Function may cross the regulatory limit from reversible to irreversible change faster than we can imagine. Later, when these ideas about normal function are held very clearly in the mindset of medical doctors (not medical technicians), we may address the inborn errors and what we may have learned from them. A modern medical technician may cause admiration when he uses an “innocent” virus to correct a faulty gene (a rather impressive technological advance). However, if the virus later shows signs that it was not so innocent, a real medical doctor will be called upon to put things right again.

Among the missing parts of the normal evolution of biochemistry, a lot about ion fluxes can be found. Even the oscillatory changes in Ca2+ that were shown to affect gene expression (C. De Duve) were laid to rest, since they clearly indicate a source of biological information that, although it does not change the order of nucleotides in DNA, shows a flux of biological information opposing the dogma (DNA to RNA to proteins). Another line of work has shown a hierarchy in the use of mitochondrial membrane potential: first the potential is used for Ca2+ uptake, and only afterwards is it used for the conversion of ADP into ATP (A. L. Lehninger). In fact, A. L. Lehninger’s real idea was far more complex, since according to him mitochondria work like a buffer for intracellular calcium, releasing it in case of a deep decrease in cytosolic levels or capturing it from the cytosol when facing a transient increase in Ca2+ load. As some of the Krebs cycle dehydrogenases are activated by Ca2+, this finding was used to propose a new control factor in addition to that of ADP (B. Chance). All this was discontinued with the wrong use of calculus in biochemistry (today we could point to bioinformatics in a similar role), which assigned less importance to the mitochondrial role on the basis of comparative kinetics that are today seen as faulty.

It is important to combat dogmatic reasoning and restore sound scientific foundations in basic medical courses, which must urgently reverse the faulty trend of imposing a view that goes from the detail towards generalization instead of the correct one, which goes from the well-understood general finding towards its molecular details. The view that brought curious subjects such as bioinformatics into medical courses, as training in sequence-finding activities, can only be explained by its commercial value. The usual form of scientific thinking respects the limits of our ability to grasp new knowledge and relies on the reproducibility of scientific results as a way to overcome the lack of a mathematical equation that defines the relationship between variables and determines their functional domains. It also uses old scientific roots as its sound support, and never replaces existing knowledge with dogmatic and/or wishful thinking. When DNA sequencing emerged as a way to find the amino acid sequence of proteins, it was just a technical advance. By no means could this technical advance be considered a scientific result indicating that DNA sequences alone have replaced the need to study protein chemistry and its responses to microenvironmental changes in order to understand a protein’s multiple conformations and its changes in activity and function. As E. Schrödinger correctly describes, the chemical structure responsible for storing genetic information in coded form must have minimal interaction with its microenvironment in order to endure for hundreds of years, as seen in the Habsburg lip. Only magical reasoning assumes that it is possible to find in non-reactive chemical structures the properties of the reactive ones.

For instance, knowledge of the reactions of the Krebs cycle clearly indicates a role for the solvent, which can no longer be considered an inert bath for the catalytic activity of the enzymes when the transfer of energy includes a role for hydrogen transport. The great increase in understanding of this change in chemical reaction came from conformational energy.

Again, even a rather simplistic view of this atomic property (conformational energy) is enough to confirm once more one of the most important contributions of E. Schrödinger in his What is Life?:

“This little book arose from a course of public lectures, delivered by a theoretical physicist to an audience of about four hundred which did not substantially dwindle, though warned at the outset that the subject matter was a difficult one and that the lectures could not be termed popular, even though the physicist’s most dreaded weapon, mathematical deduction, would hardly be utilized. The reason for this was not that the subject was simple enough to be explained without mathematics, but rather that it was much too involved to be fully accessible to mathematics.”

In a very simplistic view, while energy manifests itself by the ability to perform work, conformational energy, as a property derived from atomic structure, can be neutral, positive or negative (no effect on, increased or decreased chemical reactivity, measured as work).

Also:

“I mean the fact that we, whose total being is entirely based on a marvelous interplay of this very kind, yet possess the power of acquiring considerable knowledge about it. I think it possible that this knowledge may advance to little short of a complete understanding of the first marvel. The second may well be beyond human understanding.”

In fact, scientific knowledge allows us to understand how biological evolution may or may not have occurred, and yet does not present a proof of how it actually occurred. It will always be an indication of the possible against the highly unlikely, and never a scientifically proven fact about the real form of its occurrence.

As was the case with B. Chance’s findings in bioenergetics, we may obtain very important findings that point in wrong directions, whether toward the future, as in his case, or toward our past.

The Skeleton of Physical Time – Quantum Energies in Relative Space of S-labs

By Radoslav S. Bozov  Independent Researcher

WSEAS, Biology and BioSystems of Biomedicine

Space does not equate to distance, the displacement of an object by classically defined forces – electromagnetic, gravity or inertia. In perceiving quantum open systems, a quantum, a package of energy, displaces properties of wave interference and statistical outcomes of sums of paths of particles detected by a design of S-labs.

The notion of S-labs, space labs, deals with inherent problems of an operational module, R(i+1), where an imaginary number ‘struggles’ to work under roots of a negative sign, a reflection of an observable set of sums reaching beyond the limits of a human organ, an eye or other foundational signal processing system.

While heavenly bodies, planets, star systems, and other exotic forms of light reflecting and/or emitting objects, observable via naked eye have been deduced to operate under numerical systems that calculate a periodic displacement of one relative to another, atomic clocks of nanospace open our eyes to ever expanding energy spaces, where matrices of interactive variables point to the problem of infinity of variations in scalar spaces, however, defining properties of minute universes as a mirror image of an astronomical system. The first and furthermost problem is essentially the same as those mathematical methodologies deduced by Isaac Newton and Albert Einstein for processing a surface. I will introduce you to a surface interference method by describing undetermined objective space in terms of determined subjective time.

Therefore, the moment will be an outcome of statistical sums of a numerical system extending from near zero to near one. Three strings hold down a dual system entangled via interference of two waves, where a single wave is a product of the momentum of three particles (today named according to either weak or strong interactions).

The above-described system emerges from duality into trinity, the objective space value of physical realities. The triangle of physical observables – charge, gravity and electromagnetism – is an outcome of interference of particles, strings and waves, where particles are not particles, nor are strings strings, nor are waves waves of an infinite character in an open system which we attempt to define to predict outcomes of tomorrow’s parameters, either dependent or independent as well as both subjective to time simulations.

We now know that aging of a biological organism cannot be defined within singularity. Thereafter, clocks are subjective to apparatuses measuring oscillation of defined parameters which enable us to calculate both amplitude and a period, which we know to be dependent on phase transitions.

The problem of phase was solved by the applicability of carbon relative systems. A piece of diamond does not get wet, yet it holds water’s light entangled property. Water is the dark force of light. To formulate such a statement, we have been searching for truth by examining cooling objects where Maxwell’s demon is translated into information, a data complex system.

Modern perspectives in computing quantum based matrices, 0+1 =1 and/or 0+0=1, and/or 1+1 =0, will be reduced by applying a conceptual frame of Aladdin’s flying anti-gravity carpet, unwrapping both past and future by sending a photon to both, placing present always near zero. Thus, each parallel quantum computation of a natural system approaching the limit of a vibration of a string defining 0 does not equal 0, and 1 does not equal 1. In any case, if our method 1+1 = 1, yet, 1 is not 1 at time i+1. This will set the fundamentals of an operational module, called labris operator or in simplicity S-labs. Note, that 1 as a result is an event predictable to future, while interacting parameters of addition 1+1 may be both, 1 as an observable past, and 1 as an imaginary system, or 1+1 displaced interactive parameters of past observable events. This is the foundation of Future Quantum Relative Systems Interference (QRSI), taking analytical technologies of future as a result of data matrices compressing principle relative to carbon as a reference matter rational to water based properties.

Gödel’s concept of loops exists therefore only upon discrete relative space uniting to parallel absolute continuity of time ‘lags’. (Gödel, Escher, Bach: An Eternal Golden Braid. A Metaphorical Fugue on Minds and Machines in the Spirit of Lewis Carroll. D. Hofstadter. Chapter XX: Strange Loops, Or Tangled Hierarchies. A grand windup of many of the ideas about hierarchical systems and self-reference. It is concerned with the snarls which arise when systems turn back on themselves, for example, science probing science, government investigating governmental wrongdoing, art violating the rules of art, and finally, humans thinking about their own brains and minds. Does Gödel’s Theorem have anything to say about this last “snarl”? Are free will and the sensation of consciousness connected to Gödel’s Theorem? The Chapter ends by tying Gödel, Escher, and Bach together once again.) The struggle in-between time creates dark spaces within which strings manage to obey light properties – entangled bosons of information carrying future outcomes of a system’s processing consciousness. Therefore, Albert Einstein was correct in his quantum time realities by rejecting a dissolving cube of sugar within a cup of tea. (Henri Bergson, 19th century philosopher: Bergson’s concept of multiplicity attempts to unify in a consistent way two contradictory features: heterogeneity and continuity. Many philosophers today think that this concept of multiplicity, despite its difficulty, is revolutionary.) However, the unity of time and space could not be achieved by deducing time to charge, gravity and electromagnetic properties of energy and mass.

Charge is further deduced to interference of particles/strings/waves, contrary to the Hawking idea of irreducibility of chemical energy carrying ‘units’, and gravity is accounted for by intrinsic properties of   anti-gravity carbon systems processing light, an electromagnetic force, that I have deduced towards ever expanding discrete energy space-energies rational to compressing mass/time. The role of loops seems to operate to control formalities where boundaries of space fluctuate as a result of what we called above – dark time-spaces.

Indeed, the concept of horizon is a constant due to ever expanding observables. Thus, it fails to acquire a rational approach towards space-time issues.

Richard Feynman has touched on issues of touching of space, sums of paths of particle traveling through time. In a way he has resolved an important paradigm, storing information and possibly studying it by opening a black box. Schroedinger’s cat is alive again, but incapable of climbing a tree when chased by a dog. Every time a cat climbs a garden tree, a fruit falls on hedgehogs carried away parallel to living wormholes whose purpose of generating information lies upon carbon units resolving light.

In order to deal with such a paradigm, we will introduce i+1 under a square root in relativity, therefore taking negative one (-1 = sqrt(i+1)), an operational module R dealing with Wheeler’s foam squeezed by light, releasing water – dark spaces. Thousand words down!

What is a number? Is that a name or some kind of language or both? Is the issue of number theory possibly accountable to the value of the concept of entropic timing? Light penetrating a pyramid holding bean seeds on a piece of paper and a piece of slice of bread, a triple set, where a church mouse has taken a drop of tear, but a blood drop. What an amazing physics! The magic of biology lies above egoism, above pride, and below Saints.

We will set up the twelve parameters seen through 3+1 in classic realities:

– discrete absolute energies/forces – no contradiction for now between Newtonian and Albert Einstein mechanics

– mass absolute continuity – conservational law of physics in accordance to weak and strong forces

– quantum relative spaces – issuing a paradox of Albert Einstein’s space-time resolved by the uncertainty principle

– parallel continuity of multiple time/universes – resolving uncertainty of united space and energy through evolving statistical concepts of scalar relative space expansion and vector quantum energies by compressing relative continuity of matter in it, ever compressing flat surfaces – finding the inverse link between deterministic mechanics of displacement and imaginary space, where spheres fit within surface of triangles as time unwraps past by pulling strings from future.

To us, common human beings, with an extra curiosity overloaded by real dreams, value happens to play in the intricate foundation of life – the garden of love, its carbon management in mind, collecting pieces of squeezed cooling time.

The infinite interference of each operational module with another, composing ever-emerging time constraints unified by the Solar system, objective to humanity, perhaps answers that a drop of blood and a drop of tear are united by a droplet of a substance separating negative entropy to time courses of physical realities as defined by an open algorithm where chasing power subdued to space becomes an issue of time.

Jose Eduardo de Salles Roselino

Some small errors: for instance, an increase in P leads to a decrease in V (not an increase in V).

 

Radoslav S. Bozov  Independent Researcher

If we are to use preventative measures in medical science, its instruments must predict future outcomes based on observable parameters of history. Several key issues arise: 1. Despite pinning down a difference on a genomic scale, say pieces of information, we do not know how to change it in a precise manner, that is, how to shift the methylome occupying genome surfaces. 2. Living systems’ operational quo DOES NOT work by the vector gravity physics of ‘building blocks’. That is projecting the delusional concept of a masonry trick by someone who has not worked with cornerstones and ever-shifting momenta… Assuming genomic assembly worked, that is, dealing with inferences through data mining and annotation, we are not in a position to read the future in real time, and we never will be, because rtPCR technology restricts itself to data-time processing. We know of existing post-translational modalities… 3. We don’t know what we don’t know, and that is foundational to future medicine, that is, dealing with biological clocks, behavior, and various daily life inputs ranging from radiation to water systems, food quality, drugs…


Sequence the Human Genome, Volume 2 (Volume Two: Latest in Genomics Methodologies for Therapeutics: Gene Editing, NGS and BioInformatics, Simulations and the Genome Ontology), Part 1: Next Generation Sequencing (NGS)

Sequence the Human Genome

Curator: Larry H Bernstein, MD, FCAP

 

 

Geneticist Craig Venter helped sequence the human genome. Now he wants yours.

By CARL ZIMMER   NOVEMBER 5, 2015   http://www.statnews.com/2015/11/05/geneticist-craig-venter-helped-sequence-the-human-genome-now-he-wants-yours/

If you enter Health Nucleus, a new facility in San Diego cofounded by J. Craig Venter, one of the world’s best-known living scientists, you will get a telling glimpse into the state of medical science in 2015.

Your entire genome will be sequenced with extraordinary resolution and accuracy. Your body will be scanned in fine, three-dimensional detail. Thousands of compounds in your blood will be measured. Even the microbes that live inside you will be surveyed. You will get a custom-made iPad app to navigate data about yourself. Also, your wallet will be at least $25,000 lighter.

Venter, who came to the world’s attention in the 1990s when he led a campaign to produce the first draft of a human genome, launched Health Nucleus last month as part of his new company, Human Longevity. He has made clear that his aim is just as lofty as it was when he and his team sequenced the human genome or built a flu vaccine from a genetic sequence delivered to them over the Internet.

“We’re trying to show the value of actual scientific data that can change people’s lives,” Venter told STAT in some of his most extensive remarks yet about the project. “Our goal is to interpret everything in the genome that we can.”

Still, the initiative is drawing deep suspicion among some doctors who question whether Venter’s existing tests can tell patients anything meaningful at all. In interviews, they said they see Health Nucleus as the latest venture that could lead consumers to believe that more testing means improved health. That notion, they say, could drive customers to get procedures they don’t need, which might even be harmful.

“I think there is absolutely no evidence that any of those tests have any benefit for healthy people,” Dr. Rita Redberg, a cardiologist at the University of California at San Francisco and the editor-in-chief of JAMA Internal Medicine, said when asked about Venter’s new project.

Venter has a black belt in media savvy — he can make the details of molecular biology alluring for viewers of 60 Minutes and TED talks alike — but off screen he has earned a reputation even from his critics for serious scientific achievements. His non-profit J. Craig Venter Institute, which he founded in 1992, now has a staff of 300. Scientists at the institute have explored everything from the ocean’s biodiversity to the Ebola virus.

Last year, at age 67, Venter cofounded Human Longevity, a company based in San Diego with branches in Mountain View, Calif., and Singapore that is building the largest human genome-sequencing operation on Earth, equipped with massive computing resources to analyze the data being generated. The firm’s database now contains highly accurate genome sequences from 20,000 people; another 3,000 genomes are being added each month.

Franz Och, the former head of Google Translate and an expert on machine learning, is leading a team that’s teaching computers to recognize patterns in the company’s databases that scientists themselves may not be able to see. To demonstrate the power of this approach, Human Longevity researchers are using machine learning to discover how genetic variations shape the human face.

“We can determine a good resemblance of your photograph straight from your genetic code,” said Venter.

Venter and his colleagues will be publishing the results of that study soon — most likely generating another round of headlines. But headlines don’t pay the bills, and at a company that’s got $70 million in funding from private investors, bills matter. The company is now exploring a number of avenues for generating income from its database. It has partnered with Discovery, an insurance company in England and South Africa, to read the DNA of their clients. For $250 apiece, it will sequence the protein-coding regions of the genome, known as exomes, and offer an interpretation of the data.

Health Nucleus could become yet another source of income for Human Longevity. The San Diego facility can handle eight to 12 people a day. There are plans to open more sites both in the United States and abroad. “You can do the math,” Venter said.
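Venter's invitation to "do the math" can be taken literally. The short Python sketch below uses only the figures quoted in the article (a price of at least $25,000 and eight to 12 people a day); the number of operating days per year is an assumption made here for illustration and does not come from the article, so the annual figures are rough lower bounds at best.

```python
# Figures quoted in the article.
PRICE_PER_CLIENT_USD = 25_000   # "at least $25,000"
CLIENTS_PER_DAY = (8, 12)       # "eight to 12 people a day"

# Assumption for illustration only: not stated in the article.
OPERATING_DAYS_PER_YEAR = 250


def revenue_range(price, clients_range, days=1):
    """Return (low, high) revenue in USD for the given number of days."""
    low, high = clients_range
    return (price * low * days, price * high * days)


daily = revenue_range(PRICE_PER_CLIENT_USD, CLIENTS_PER_DAY)
annual = revenue_range(PRICE_PER_CLIENT_USD, CLIENTS_PER_DAY,
                       OPERATING_DAYS_PER_YEAR)

print(f"per day:  ${daily[0]:,} to ${daily[1]:,}")
print(f"per year: ${annual[0]:,} to ${annual[1]:,}")
```

Under these assumptions a single site would take in $200,000 to $300,000 a day, before any costs.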


QIAGEN in Molecular Diagnostics

Larry H. Bernstein, MD, FCAP, Curator

LPBI

 

QIAGEN Releases GeneReader for Clinical Sequencing in Cancer

By Aaron Krol

http://www.bio-itworld.com/2015/11/9/qiagen-releases-genereader-clinical-sequencing-cancer.html

 

November 9, 2015 | QIAGEN’s GeneReader DNA sequencing system was finally unveiled last week in Austin, Tex., at the annual meeting of the Association for Molecular Pathology. The company had first planned to launch the GeneReader in 2014, but ran into delays during early access testing.

QIAGEN, an all-around molecular diagnostics company with a large customer base in both clinical and research, has been planning an entry into next-generation sequencing (NGS) since at least 2012, when it acquired Intelligent Biosystems, a small genomics player from Waltham, Mass. QIAGEN has also picked up CLC bio and Ingenuity, two popular bioinformatics vendors, to build a software suite alongside its sequencing system.

 

QIAGEN is making a late entry into NGS, at a time when even better-established vendors, like Thermo Fisher and Pacific Biosciences, are fighting to hold onto a meaningful share of a market dominated by Illumina of San Diego. But QIAGEN is not the only company that believes a huge, untapped base of hospital labs will soon be using sequencers as part of regular patient care, providing a chance for new technologies to get a foothold. In principle, QIAGEN’s existing relationships with these labs as a supplier of tests, reagents, and equipment could help the GeneReader get traction, although if labs prefer to buy sequencing equipment from their reagent vendors, that hasn’t been obvious to date. (Just ask Thermo Fisher or Roche.)

QIAGEN’s big pitch for the GeneReader is that users will not have to homebrew solutions for working with DNA samples or making sense of genetic data. The GeneReader will only be sold as a package with two other instruments, the QIAcube for extracting DNA from blood and tissue samples, and the QIAcube NGS, which prepares those DNA libraries for sequencing. The system also comes with QIAGEN Clinical Insight (QCI), a platform that combines tools from CLC bio and Ingenuity, to analyze the raw data from the GeneReader and report on the clinical meaning of any genetic variants found.

“Labs struggle with the adoption of NGS, and we feel that QIAGEN is uniquely positioned to help them with those barriers,” Jonathan Arnold, QIAGEN’s Senior Director of Marketing for NGS, tells Bio-IT World. “We’re launching a truly complete NGS solution, and that’s very different than what any other vendor has done.”

Well, sort of. Complete Genomics, a subsidiary of BGI, took a similar tack with its Revolocity sequencing system this summer, although Revolocity’s built-in software only goes as far as calling genetic variants, not interpreting them for physicians. But unlike the ultra-high-throughput, whole-genome-processing Revolocity, the GeneReader is clearly a diagnostic instrument, best suited to targeted DNA testing for clear clinical results. It launches alongside a 12-gene cancer test called the Actionable Insights Tumor Panel, which scans almost 800 mutation hotspots in genes like BRAF and EGFR, looking for variants that can be used to help choose therapies for cancer patients.

As a benchtop instrument, the GeneReader should fit neatly into the workflows of small to midsize labs that might otherwise pick up an Illumina NextSeq, or an Ion PGM or Ion S5 from Thermo Fisher, for targeted sequencing panels. QIAGEN is also offering a flexible pricing structure to win over labs that might want to run NGS, but don’t test at a high enough volume to justify buying a sequencer outright.

“We’re in these labs today,” says Arnold. “I’ve seen estimates that 75 to 80% of NGS labs are using a QIAGEN solution. And we have product lines, whether it’s qPCR or multiplex assays, that give us channels into these labs outside NGS.”

The Specs

Genomics researchers and bioinformaticians want to know the specs of a new sequencer: how much data it produces per run, its error profile, its read lengths. QIAGEN hates talking about the specs. The company line is that most of these metrics are irrelevant to a system that’s only meant to run panel tests, with the analysis and interpretation baked in.

“A lab does not need a bioinformatician to process this,” says Arnold.

Regarding volume, Arnold says the GeneReader system can run up to 120 panels per week. Other metrics are in service of those panels. For instance, the sequencer’s read lengths are around 100 base pairs, but not necessarily because of any technical limits. “We’re focused on somatic cancer, specifically from FFPE samples, so a read length of 100 base pairs is what that type of application needs,” Arnold says. “That’s what we designed around.”

Similarly, QIAGEN prefers to talk about accuracy in terms of test results, not base calls. At the Association for Molecular Pathology meeting, early access users from the Broad Institute of MIT and Harvard showed that the GeneReader, running the Actionable Insights Tumor Panel, picked up all the same mutations as an Illumina sequencer, and gave equivalent results to QIAGEN therascreen PCR tests.

The complete GeneReader system, including QIAcube, QIAcube NGS, and a computer running the QIAGEN Clinical Insight platform. Image credit: QIAGEN

http://www.bio-itworld.com/uploadedImages/Bio-IT_World/Top_Headlines/2015/11-Nov/S_4867_NGS_0100_s.jpg

 

Under the hood, the GeneReader runs on very familiar technology. The sequencing-by-synthesis method QIAGEN inherited from Intelligent Biosystems works the same way as Illumina’s machines, flooding the sample DNA with fluorescently labeled nucleotides and imaging the results. (At one time Illumina and Intelligent Biosystems were involved in a series of lawsuits over this technology, although everyone’s intellectual property was left intact.)

The most distinctive feature of the GeneReader is that it can stagger samples. The sequencer reads up to four flow cells at a time, each with up to ten samples ― but if you start sequencing with the machine only partly full, you can add more flow cells mid-run. The GeneReader pulls this off by processing flow cells in “turntable” fashion, physically separating each sequencing step: adding new nucleotides, imaging, cleaving the fluorescent markers. New flow cells just slot into place between steps.

The feature is aimed at clinicians, who may have to start new tests in a hurry and can’t always predict their volume.

By Arnold’s count, the GeneReader can process 5,000 panels a year, enough for even fairly high-volume labs. So the 20-flow-cell sequencer that Intelligent Biosystems was at one point designing is probably not forthcoming from QIAGEN. “The sequencer itself will grow with the lab as their NGS business and volume grow,” says Arnold.

A Testing Machine

QIAGEN would like customers to see the GeneReader less as a device for unraveling the DNA code, and more as a high-throughput testing machine. (In that regard QIAGEN is a lot like Direct Genomics, the Shenzhen-based company whose GenoCare sequencer is in early test runs with three Chinese hospitals.)

Of course, the GeneReader is still a sequencer, and in theory users can do whatever they want with it, from running third-party panels to sequencing bacterial genomes. The embedded software will even help with interpretation, to some extent, for pretty well any use case in humans. The former Ingenuity platform ― now QCI Interpret ― finds and reports disease-causing variants across the human genome, although QIAGEN is careful not to make any claims for the clinical validity of findings outside its own panels.

“We’re very focused on our Actionable Insights Tumor Panel,” says Arnold. “We verify that panel’s performance all the way from the GeneReader’s FFPE kit to the backend bioinformatics and QCI Interpret.”

That panel covers much the same ground as QIAGEN’s existing line of therascreen PCR tests for cancer, but also ropes in some extra gene regions with links to drug labels, the scientific literature, and testing guidelines from major clinical organizations. It also comes with a nifty extra feature in QCI Interpret: information on any ongoing clinical trials connected to a patient’s cancer mutations, organized by zip code.

For the time being, the GeneReader’s “on-label” applications will stay firmly in somatic cancer. QIAGEN has made sure the instrument can work with the degraded DNA in FFPE (formalin-fixed, paraffin-embedded) samples, and is planning to launch a solution for liquid biopsy as well. The many other potential uses for NGS ― like infectious disease, prenatal testing, and rare disease ― are on the back burner.

QIAGEN is being coy about the cost of the GeneReader, which it expects to sell mostly on a “price-per-insight” model. “I can tell you we will be extremely price competitive with what’s out there today,” says Arnold. “We would sit down with a lab, talk about the number of samples they’re going to be running, and come up with a price based on these different parameters.”

It’s a smart strategy to expand the number of customers who could think about adopting NGS. Clinical labs that run sequencing panels today already have their technicians trained to prepare DNA libraries, and more importantly, have bioinformatics pipelines in place to deal with the data, either homebrewed or from a vendor. Most likely, they employ experts in genetic interpretation, who can design new tests and know how to deal with ambiguous results. It won’t be easy to win these labs over to a new sequencing system when they’ve already invested heavily in getting this expertise and equipment in-house.

A price-per-insight model lets QIAGEN widen the field, offering more clinics access to the kind of broad cancer testing they might now be farming out to companies like Foundation Medicine, and promising more applications to come. The GeneReader probably won’t steal any customers from Illumina, but it might make labs eyeing their first MiSeqDx think twice about their choice of vendors.

New Regulatory Frontiers

QIAGEN’s plan is to submit both the sequencer and the Actionable Insights Tumor Panel to the FDA for clearance, but until then, it’s selling both of them for research use only.

That means the company has to be a bit circumspect with how QCI Interpret reports findings to doctors. “We make no claims about [the Actionable Insights Tumor Panel] as a diagnostic tool,” Arnold says. “We’re making no diagnostic claims in the interpretive reports. We’re not guiding therapy selection. We’re providing the relevant variants, but nothing more than that.”

That could change if the GeneReader and its cancer panel eventually win FDA clearance. QIAGEN is testing the waters for a broad form of genetic testing, very different from the tightly-focused NGS assays, like Illumina’s tests for cystic fibrosis, that the FDA has cleared in the past. The Actionable Insights Tumor Panel is more like the kind of sweeping genetic testing that more advanced clinical labs have undertaken on their own initiative, under FDA exemptions for laboratory developed tests.

These types of panels already have wide buy-in from professional organizations like the Association for Molecular Pathology. And QIAGEN isn’t going out on a limb with its genetic targets, mainly testing genes that the FDA has already acknowledged are linked to treatment options. It would be a good sign for the future if QIAGEN, which knows the FDA better than any of its competitors in NGS, could begin to bring these more wide-ranging uses of sequencing under the normal regulatory umbrella.

Right now, the Actionable Insights Tumor Panel is in the odd position of being narrower than many tests already in use, but more expansive than anything the FDA has so far approved. The launch of the GeneReader will be yet another nudge to regulators to clarify where genetic testing in the U.S. stands, joining a new class of sequencers that are, more than ever, taking NGS to the bedside.

 

 

Size Matters

Larry H. Bernstein, MD, FCAP, Curator

LPBI

 

MinION Sequencing Untangles RNA Transcripts in a Difficult Gene

By Aaron Krol

http://www.bio-itworld.com/2015/11/3/minion-sequencing-untangles-rna-transcripts-difficult-gene.html

 

RNA isoforms are distinct versions of the same gene. Through a process called alternative splicing, the different subunits, or “exons,” that make up a gene can be reshuffled in new combinations. Many genes have two or more mutually exclusive exons, and which ones are actually expressed as RNA and protein can have big effects on cellular behavior ― in effect, expanding the protein arsenal of the genome.

 

November 3, 2015 | Brenton Graveley received his first MinION shipment in April 2014, at his lab at the University of Connecticut’s Institute of Systems Genomics. His lab was among the first to unwrap one of the candy bar-sized DNA sequencers made by Oxford Nanopore Technologies, and although its accuracy was shaky and its throughput low, right away Graveley and his colleagues could see it was producing real DNA data.

“I’m still amazed to this day that it works at all,” Graveley says. “It’s like Star Trek.”

A lot of buzz around the MinION has focused on its tiny size: early adopters have plotted to take MinIONs into outbreak zones and species-hunting tromps through the rainforest, working with bare-bones labs and laptop computers. But for Graveley, the size of the DNA strands the MinION reads is just as exciting as the size of the sequencer itself. That’s because most other sequencers rely on picking up chemical reactions that become more error-prone over time, meaning DNA can only be read in short fragments. The MinION, which reads genetic material by observing single molecules of DNA as they pass through extremely narrow “nanopores,” keeps producing data for as long as DNA is moving through the pore.

“You get the read length of whatever fragment you put into the MinION,” he says. “We’ve gotten reads that are over 100 kilobases,” hundreds or even thousands of times longer than researchers can expect with most other technologies.

Now, in a paper published in Genome Biology, Graveley and two of his lab members, post-doc Mohan Bolisetty and PhD student Gopinath Rajadinakaran, have shown how these read lengths can help explain the cellular behavior of Dscam1, one of the most difficult-to-study genes known to science. Related to a gene in humans that has been linked to Down syndrome ― the name stands for “Down Syndrome Cell Adhesion Molecule” ― Dscam1 plays a fundamental role in forming the architecture of insect brains. This single gene can produce thousands of subtly different proteins, an ability that makes it both a fascinating subject of research and almost impossible to understand using standard sequencing technology.

 

Determining exon connectivity in complex mRNAs by nanopore sequencing

Mohan T. Bolisetty, Gopinath Rajadinakaran and Brenton R. Graveley
Genome Biology 2015, 16:204   http://dx.doi.org/10.1186/s13059-015-0777-z   http://genomebiology.com/2015/16/1/204

Short-read high-throughput RNA sequencing, though powerful, is limited in its ability to directly measure exon connectivity in mRNAs that contain multiple alternative exons located farther apart than the maximum read length. Here, we use the Oxford Nanopore MinION sequencer to identify 7,899 ‘full-length’ isoforms expressed from four Drosophila genes, Dscam1, MRP, Mhc, and Rdl. These results demonstrate that nanopore sequencing can be used to deconvolute individual isoforms and that it has the potential to be a powerful method for comprehensive transcriptome characterization.

High throughput RNA sequencing has revolutionized genomics and our understanding of the transcriptomes of many organisms. Most eukaryotic genes encode pre-mRNAs that are alternatively spliced [1]. In many genes, alternative splicing occurs at multiple places in the transcribed pre-mRNAs that are often located farther apart than the read lengths of most current high throughput sequencing platforms. As a result, several transcript assembly and quantitation software tools have been developed to address this [2], [3]. While these computational approaches do well with many transcripts, they generally have difficulty assembling transcripts of genes that express many isoforms. In fact, we have been unable to successfully assemble transcripts of complex alternatively spliced genes such as Dscam1 or Mhc using any transcript assembly software (data not shown). These software tools also have difficulty quantitating transcripts that have many isoforms, and for genes with distantly located alternatively spliced regions, they can only infer, and not directly measure, which isoforms may have been present in the original RNA sample [4]. For example, consider a gene containing two alternatively spliced exons located 2 kbp away from one another in the mRNA. If each exon is observed to be included at a frequency of 50 % from short read sequence data, it is impossible to determine whether there are two equally abundant isoforms that each contain or lack both exons, or four equally abundant isoforms that contain both, neither, or only one or the other exon.
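The two-exon example above can be made concrete with a minimal sketch. The function name and the toy mixtures are illustrative assumptions, not part of the paper; the point is only that per-exon inclusion frequencies, which are all that short reads can measure for distant exons, come out identical for both scenarios.

```python
from itertools import product

def exon_inclusion(mixture):
    """Return per-exon inclusion frequencies for a mixture of isoforms.

    `mixture` maps (has_exon_1, has_exon_2) tuples to relative abundance.
    """
    total = sum(mixture.values())
    freq_1 = sum(w for (e1, _), w in mixture.items() if e1) / total
    freq_2 = sum(w for (_, e2), w in mixture.items() if e2) / total
    return freq_1, freq_2

# Scenario A: two equally abundant isoforms (both exons, or neither).
two_isoforms = {(True, True): 0.5, (False, False): 0.5}

# Scenario B: four equally abundant isoforms (every combination).
four_isoforms = {combo: 0.25 for combo in product([True, False], repeat=2)}

# Both scenarios give 50 % inclusion for each exon, so the marginal
# frequencies cannot tell them apart.
print(exon_inclusion(two_isoforms))
print(exon_inclusion(four_isoforms))
```

Only a read that spans both exons in the same molecule can separate the two cases.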

Pacific Biosciences sequencing can generate read lengths sufficient to sequence full length cDNA isoforms and several groups have recently reported the use of this approach to characterize the transcriptome [5]. However, the large capital expense of this platform can be a prohibitive barrier for some users. Thus, it remains difficult to accurately and directly determine the connectivity of exons within the same transcript. The MinION nanopore sequencer from Oxford Nanopore requires a small initial financial investment, can generate extremely long reads, and has the potential to revolutionize transcriptome characterization, as well as other areas of genomics.

Several eukaryotic genes can encode hundreds to thousands of isoforms. For example, in Drosophila, 47 genes encode over 1,000 isoforms each [6]. Of these, Dscam1 is the most extensively alternatively spliced gene known and contains 115 exons, 95 of which are alternatively spliced and organized into four clusters [7]. The exon 4, 6, 9, and 17 clusters contain 12, 48, 33, and 2 exons, respectively. The exons within each cluster are spliced in a mutually exclusive manner and Dscam1 therefore has the potential to generate 38,016 different mRNA and protein isoforms. The variable exon clusters are also located far from one another in the mRNA and the exons within each cluster are up to 80 % identical to one another at the nucleotide level. Together, these characteristics present numerous challenges to characterize exon connectivity within full-length Dscam1 transcripts for any sequencing platform. Furthermore, though no other gene is as complex as Dscam1, many other genes have similar issues that confound the determination of exon connectivity.

We are interested in developing methods to perform simple and robust long-read sequencing of individual isoforms of Dscam1 and other complex alternatively spliced genes. Here, we use the Oxford Nanopore MinION to sequence ‘full-length’ cDNAs from four Drosophila genes – Rdl, MRP, Mhc, and Dscam1 – and identify a total of 7,899 distinct isoforms expressed by these four genes.

 

Similarity between alternative exons

We were interested in determining the feasibility of using the MinION nanopore sequencer to characterize the connectivity of distantly located exons in the mRNAs expressed from genes with complex splicing patterns. For the purposes of these experiments, we have focused on four Drosophila genes with increasingly complex patterns of alternative splicing (Fig. 1). Resistant to dieldrin (Rdl) contains two clusters, each containing two mutually exclusive exons and therefore has the potential to generate four different isoforms (Fig. 1a). Multidrug-Resistance like Protein 1 (MRP) contains two mutually exclusive exons in cluster 1 and eight mutually exclusive exons in cluster 2, and can generate 16 possible isoforms (Fig. 1b). Myosin heavy chain (Mhc) can potentially generate 180 isoforms due to five clusters of mutually exclusive exons – clusters 1 and 5 contain two exons, clusters 2 and 3 each contain three exons, and cluster 4 contains five exons. Finally, Dscam1 contains 12 exon 4 variants, 48 exon 6 variants, 33 exon 9 variants (Fig. 1d), and two exon 17 variants (not shown) and can potentially express 38,016 isoforms. For this study, however, we have focused only on the exon 3 through exon 10 region of Dscam1, which encompasses the 93 exon 4, 6, and 9 variants, and 19,008 potential isoforms (Fig. 1d).
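Because exactly one exon is chosen from each mutually exclusive cluster, the number of potential isoforms is the product of the cluster sizes. The short sketch below reproduces the totals quoted in the text from the stated cluster sizes; the dictionary layout is just an illustrative choice.

```python
from math import prod

# Number of mutually exclusive exons in each cluster, as stated in the text.
clusters = {
    "Rdl": [2, 2],
    "MRP": [2, 8],
    "Mhc": [2, 3, 3, 5, 2],
    "Dscam1 (exons 4, 6, 9, 17)": [12, 48, 33, 2],
    "Dscam1 (exons 4, 6, 9 only)": [12, 48, 33],
}

# One exon is chosen per cluster, so the counts multiply.
potential_isoforms = {gene: prod(sizes) for gene, sizes in clusters.items()}

for gene, count in potential_isoforms.items():
    print(f"{gene}: {count:,}")
```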

Fig. 1. Schematic of the exon-intron structures of the genes examined in this study. a The Rdl gene contains two clusters (cluster one and two) which each contain two mutually exclusive exons. b The MRP gene contains two and eight mutually exclusive exons in clusters 1 and 2, respectively. c Mhc contains two mutually exclusive exons in clusters 1 and 5, three mutually exclusive exons in clusters 2 and 3, and five mutually exclusive exons in cluster 4. d The Dscam1 gene contains 12, 48, and 33 mutually exclusive exons in the exon 4, 6, and 9 clusters, respectively. For each gene, the constitutive exons are colored blue, while the variable exons are colored yellow, red, orange, green, or light blue

Because our nanopore sequence analysis pipeline uses LAST to perform alignments [8], we aligned all of the Rdl, MRP, Mhc, and Dscam1 exons within each cluster to one another using LAST to determine the extent of discrimination needed to accurately assign nanopore reads to a specific exon variant. For Rdl, each variable exon was only aligned to itself, and not to the other exon in the same cluster (data not shown). For MRP, the two exons within cluster 1 only align to themselves, and though the eight variable exons in cluster 2 do align to other exons, there is sufficient specificity to accurately assign nanopore reads to individual exons (Fig. 2a). For Mhc, the variable exons in cluster 1 and cluster 5 do not align to other exons, and the variable exons in cluster 2, cluster 3, and cluster 4 again align with sufficient discrimination to identify the precise exon present in the nanopore reads (Fig. 2b). Finally, for Dscam1, the difference in the LAST alignment scores between the best alignment (each exon to itself) and the second, third, and fourth best alignments are sufficient to identify the Dscam1 exon variant (Fig. 2c). This analysis indicates that for each gene in this study, LAST alignment scores are sufficiently distinct to identify the variable exons present in each nanopore read.
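The assignment logic described above, keeping the best-scoring variant only when it is clearly separated from the runner-up, can be sketched as follows. This is not the authors' pipeline and does not call LAST; the function name, the `min_margin` parameter, and the example scores are all hypothetical, and the scores are assumed to have been parsed from alignment output already.

```python
def assign_variant(scores, min_margin=0):
    """Pick the best-scoring exon variant for one cluster of one read.

    `scores` maps variant name -> alignment score (higher is better).
    Returns the winning variant, or None when there is no alignment or
    the winner does not beat the runner-up by more than `min_margin`.
    """
    if not scores:
        return None
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_name, best_score = ranked[0]
    if len(ranked) > 1 and best_score - ranked[1][1] <= min_margin:
        return None  # ambiguous: best and second-best are too close
    return best_name

# Hypothetical scores for one read against three exon 9 variants.
read_scores = {"exon 9.30": 118, "exon 9.31": 74, "exon 9.9": 61}
print(assign_variant(read_scores, min_margin=10))
```

A larger margin trades sensitivity for specificity, which matters most for clusters whose exons are highly similar to one another.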

Fig. 2. Similarity distance between the variable alternative exons of MRP, Mhc, and Dscam1. a Violin plots of the LAST alignment scores of each variable exon within MRP cluster 1 and MRP cluster 2 to themselves and the second (2nd) best alignments. b Violin plots of the LAST alignment scores of each variable exon within each Mhc cluster to themselves and the second (2nd) best alignments. c Violin plots of the LAST alignment scores of each variable exon within each Dscam1 cluster to themselves (1st), and to the exons with the second (2nd), third (3rd) and fourth (4th) best alignments

Optimizing template switching in Dscam1 cDNA libraries

Template switching can occur frequently when libraries are prepared by PCR and can confound the interpretation of results [9], [10]. For example, CAM-Seq [11] and Triple-Read sequencing [12], a similar method we independently developed to characterize Dscam1 isoforms, were found to have excessive template switching due to amplification during the library prep protocols. To assess template switching in our current study, we generated a spike-in mixture of in vitro transcribed RNAs representing six unique Dscam1 isoforms – Dscam1 4.2,6.32,9.31, Dscam1 4.1,6.46,9.30, Dscam1 4.3,6.33,9.9, Dscam1 4.12,6.44,9.32, Dscam1 4.7,6.8,9.15, and Dscam1 4.5,6.4,9.4. We used 10 pg of this control spike-in mixture and prepared libraries for MinION sequencing by amplifying the exon 3 through exon 10 region for 20, 25, or 30 cycles of RT-PCR. We then end-repaired and dA-tailed the fragments, ligated adapters, and sequenced the samples on a MinION (7.3) for 12 h each. We obtained 33,736, 8,961, and 7,511 base-called reads from the 20, 25, and 30 cycle libraries, respectively. Consistent with the size of the exon 3 to 10 cDNA fragment being 1,806–1,860 bp in length, depending on the precise combination of exons it contains, most reads we observed were in this size range (Fig. 3a). We used Poretools [13] to convert the raw output files into fasta format and then used LAST to align the reads to a LAST database containing each variable exon. From these alignments, we identified reads that mapped to all three exon clusters, as well as the exon with the best alignment score within each cluster. When examining the alignments to each cluster independently, we found that for these spike-in libraries, all reads mapped uniquely to the exons present in the input isoforms. Therefore, any observed isoforms that were not present in the input pool were a result of template switching during the RT-PCR and library prep protocol and not due to false alignments or sequencing errors.

Fig. 3. Optimized RT-PCR minimizes template-switching for MinION sequencing. a Histogram of read lengths from MinION sequencing of Dscam1 spike-ins from the library generated using 25 cycles of PCR. b Bar plot indicating the extent of template switching in Dscam1 spike-ins at different PCR cycles (left). The blue portions indicate the fraction of reads corresponding to input isoforms while the red portions correspond to the fraction of reads corresponding to template-switched isoforms. On the right, plots of the rank order versus number of reads (log10) for the 20, 25, and 30 cycle libraries. The blue dots indicate input isoforms while the red portions correspond to template-switched isoforms

When comparing the combinations of exons within each read to the input isoforms, we observed that 32 % of the reads from the 30 cycle library corresponded to isoforms generated by template switching (Fig. 3b). The template-switched isoforms observed by the greatest number of reads in the 30 cycle library were due to template switching between the two most frequently sequenced input isoforms. In most cases, template switching occurred somewhere within exon 7 or 8 and resulted in a change in exon 9. However, the extent of template switching was reduced to only 1 % in the libraries prepared using 25 cycles, and to 0.2 % in the libraries prepared using 20 cycles of PCR (Fig. 3b). Again, for these two libraries the most frequently sequenced template-switched isoforms involved the input isoforms that were also the most frequently sequenced. These experiments demonstrate that the MinION nanopore sequencer can be used to sequence ‘full length’ Dscam1 cDNAs with sufficient accuracy to identify isoforms and that the cDNA libraries can be prepared in a manner that results in a very small amount of template switching.
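The template-switching rate reported above is the fraction of reads whose exon 4, 6, 9 combination is not one of the six spike-in isoforms. A minimal sketch of that bookkeeping follows. The six input combinations are taken from the text; the function name and the toy read list are hypothetical and do not reproduce the paper's data.

```python
from collections import Counter

# The six spike-in isoforms named in the text, as (exon 4, exon 6, exon 9).
INPUT_ISOFORMS = {
    (2, 32, 31), (1, 46, 30), (3, 33, 9),
    (12, 44, 32), (7, 8, 15), (5, 4, 4),
}

def template_switch_fraction(read_isoforms):
    """Fraction of reads whose exon combination is absent from the input pool."""
    counts = Counter(read_isoforms)
    total = sum(counts.values())
    switched = sum(n for iso, n in counts.items() if iso not in INPUT_ISOFORMS)
    return switched / total if total else 0.0

# Hypothetical toy library: nine reads match input isoforms; one read carries
# exons 4 and 6 from one input isoform and exon 9 from another.
toy_reads = [(2, 32, 31)] * 6 + [(1, 46, 30)] * 3 + [(2, 32, 30)]
print(template_switch_fraction(toy_reads))  # 0.1
```

This only works because the spike-in pool is fully known. In a biological sample a novel combination cannot be distinguished from a switched one this way.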

Dscam1 isoforms observed in adult heads

To explore the diversity of Dscam1 isoforms expressed in a biological sample, we prepared a Dscam1 library from RNA isolated from D. melanogaster heads prepared from mixed male and female adults using 25 cycles of PCR. We sequenced it for 12 h on the MinION nanopore sequencer, obtaining a total of 159,948 reads, of which 78,097 were template reads, 48,474 were complement reads, and 33,377 were 2D reads (Fig. 4a). We aligned the reads individually to the exon 4, 6, and 9 variants using LAST. A total of 28,971 reads could be uniquely or preferentially aligned to a single variant in all three clusters. For further analysis, we used all 16,419 2D read alignments and 31 1D reads for which both template and complement aligned to the same variant exons (not all reads with both a template and complement yield a 2D read). The remaining 12,521 aligned reads were 1D reads for which there was either only a template or complement read, or for which the template and complement reads disagreed with one another; these were not used further. We observed 92 of the 93 potential exon 4, 6, or 9 variants – only exon 6.11 was not observed in any read (Fig. 4f). To assess the accuracy of the results we performed RT-PCR using primers in the flanking constitutive exons that contained Illumina sequencing primers to separately amplify the Dscam1 exon 4, 6, and 9 clusters from the same RNA used to prepare the MinION libraries, and sequenced the amplicons on an Illumina MiSeq. The frequency of variable exon use in each cluster was extremely consistent between the two methods (R² = 0.95, Fig. 5a).
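The read-selection rule in this paragraph (keep every assignable 2D read, keep a 1D pair only when template and complement agree, drop the rest) can be sketched as below. The function and argument names are hypothetical; the arithmetic at the end simply checks that the read counts quoted in the text are mutually consistent.

```python
def usable_call(two_d=None, template=None, complement=None):
    """Return the isoform call to keep for one molecule, or None.

    Each argument is an (exon 4, exon 6, exon 9) tuple, or None if that
    read type is missing or could not be assigned in all three clusters.
    """
    if two_d is not None:
        return two_d          # 2D read alignments are used as-is
    if template is not None and template == complement:
        return template       # both 1D strands agree with each other
    return None               # single strand only, or the strands disagree

# Read accounting from the text.
full_length_reads = 16_419 + 31           # 2D reads plus concordant 1D pairs
unused_reads = 28_971 - full_length_reads
print(full_length_reads, unused_reads)
```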

Fig. 4. MinION sequencing of Dscam1 identified 7,874 isoforms. a Histogram of read length distribution for Drosophila head samples. b The total number of Dscam1 isoforms identified from MinION sequencing. c Cumulative distribution of Dscam1 isoforms with respect to expression. d Violin plot of the number of isoforms identified using 100 random pools of the indicated number of reads. e Plot of the estimated number of total isoforms present in the library using the capture-recapture method with two random pools of the indicated number of reads. The shaded blue area indicates the 95 % confidence interval. f Deconvoluted expression of Dscam1 exon cluster variants (top) and the isoform connectivity of two highly expressed Dscam1 isoforms (bottom)

Fig. 5. Accuracy of Dscam1 sequencing results. a Comparison of the frequency of variable exon inclusion for the Dscam1 exon 4 (yellow), 6 (red), and 9 (orange) clusters as determined by nanopore sequencing or by amplicon sequencing using an Illumina MiSeq. b Percent identities (left) or LAST alignment scores (right) of full-length template, complement, and two directions (sequencing both template and complements) nanopore read alignments

Over their entire lengths, the 2D reads that map specifically to one variant in each of the exon 4, 6, and 9 clusters map with an average 90.37 % identity and an average LAST score of approximately 1,200 (Fig. 5b). The 16,450 full length reads correspond to 7,874 unique isoforms, or 42 % of the 18,612 possible isoforms given the exon 4, 6, and 9 variants observed. We note, however, that while 4,385 isoforms were represented by more than one read, 3,516 isoforms were represented by only one read, indicating that the depth of sequencing has not reached saturation (Fig. 4b and c). This was further confirmed by performing a bootstrapped subsampling analysis (Fig. 4d) and by using the capture-recapture method to attempt to assess the complexity of isoforms present in the library (Fig. 4e), which suggests that over 11,000 isoforms are likely to be present, though even this analysis has not yet reached saturation. The most frequently observed isoforms were Dscam1 4.1,6.12,9.30 and Dscam1 4.1,6.1,9.30, which were observed with 30 and 25 reads, respectively (Fig. 4e). In conclusion, these results demonstrate the practical application of using the MinION nanopore sequencer to identify thousands of distinct Dscam1 isoforms in a single biological sample.
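The text does not say which capture-recapture estimator was used. As a rough illustration only, the classical Lincoln-Petersen estimator treats the isoforms seen in one random pool of reads as "marked" and asks how many of them reappear in a second pool. The function name and toy pools below are hypothetical; the last lines reproduce the 42 % figure from the numbers in the text.

```python
def lincoln_petersen(pool_a, pool_b):
    """Estimate total distinct isoforms from two random pools of reads.

    Each pool is an iterable of isoform identifiers, one per read.
    Estimate = (distinct in A) * (distinct in B) / (distinct in both).
    """
    seen_a, seen_b = set(pool_a), set(pool_b)
    shared = len(seen_a & seen_b)
    if shared == 0:
        raise ValueError("no isoforms shared between pools; estimate undefined")
    return len(seen_a) * len(seen_b) / shared

# Toy pools: 4 distinct isoforms in each pool, 2 of them shared.
print(lincoln_petersen([1, 2, 3, 4], [3, 4, 5, 6]))  # 8.0

# Coverage figure from the text: exon 6.11 was never seen, so 47 exon 6
# variants remain and 12 * 47 * 33 combinations are possible.
possible = 12 * 47 * 33
print(possible, round(7_874 / possible, 2))  # 18612 0.42
```

The estimator assumes every isoform is equally likely to be sampled, which isoform abundances clearly violate, so the result is best read as a lower-bound style indication rather than a precise count.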

Nanopore sequencing of ‘full-length’ Rdl, MRP, and Mhc isoforms

To extend this approach to other genes with complex splicing patterns, we focused on Rdl, MRP, and Mhc which have the potential to generate four, 16, and 180 isoforms, respectively. We prepared libraries for each of these genes by RT-PCR using primers in the constitutive exons flanking the most distal alternative exons using 25 cycles of PCR, pooled the three libraries and sequenced them together on the MinION nanopore sequencer for 12 h obtaining a total of 22,962 reads. The input libraries for Rdl, MRP, and Mhc were 567 bp, 1,769-1,772 bp, and 3,824 bp, respectively. The raw reads were aligned independently to LAST indexes of each cluster of variable exons. The alignment results were then used to assign reads to their respective libraries, identify reads that mapped to all variable exon clusters for each gene, and the exon with the best alignment score within each cluster. In total, we obtained 301, 337, and 112 full length reads for Rdl (Fig. 6), MRP (Fig. 7), and Mhc (Fig. 8), respectively. For Rdl, both variable exons in each cluster were observed, and accordingly all four possible isoforms were observed, though in each case the first exon was observed at a much higher frequency than the second exon (Fig. 6d). Interestingly, the ratio of isoforms containing the first versus second exon in the second cluster is similar for isoforms containing either the first exon or the second exon in the first cluster, indicating that the splicing of these two clusters may be independent. For MRP, both exons in the first cluster were observed and all but one of the exons in the second cluster (exon B) were observed, though the frequency at which the exons in both clusters were used varied dramatically (Fig. 7d). For example, within the first cluster, exon B was observed 333 times while exon A was observed only four times.
Similarly, in the second cluster, exon A was observed 157 times whereas exons B, E, F, and G were observed 0 times, thrice, once, and twice, respectively, and exons D, E, and H were observed between 40 and 76 times. As a result, we observed only nine MRP isoforms. For Mhc, we again observed strong biases in the exons observed in each of the five clusters (Fig. 8d). In the first cluster, exon B was observed more frequently than exon A. In the second cluster, 109 of the reads corresponded to exon A, while exons B and C were observed by only two and one read, respectively. In the third cluster, exon A was not observed at all while exons B and C were observed in roughly 80 % and 20 % of reads, respectively. In the fourth cluster, exon A was observed only once, exons B and C were not observed at all, exon E was observed 13 times while exon D was present in all of the remaining reads. Finally, in the fifth cluster, only exon B was observed. As with MRP, these strong biases and near or complete absences of exons in some of the clusters severely reduce the number of possible isoforms that can be observed. In fact, of the 180 potential isoforms encoded by Mhc, we observed only 12 isoforms. Various Mhc isoforms are known to be expressed in striking spatially and temporally restricted patterns [14] and thus it is likely that other Mhc isoforms that we did not observe could be observed by sequencing other tissue samples.
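To see how much the missing exons shrink the space of isoforms, one can multiply the number of exons actually observed in each cluster. The per-cluster counts below are read off the description above (seven of eight MRP cluster 2 exons; Mhc clusters with 2, 3, 2, 3, and 1 observed exons); the resulting ceilings of 14 and 36 are derived here for illustration and are not stated in the paper.

```python
from math import prod

# Exons observed at least once in each cluster, counted from the text.
observed_per_cluster = {
    "MRP": [2, 7],            # cluster 2: every exon except B
    "Mhc": [2, 3, 2, 3, 1],   # cluster 3 lacks A; cluster 4 lacks B and C;
                              # cluster 5 shows only B
}
theoretical = {"MRP": 16, "Mhc": 180}
observed_isoforms = {"MRP": 9, "Mhc": 12}

# Upper bound on isoforms that could appear given the exons actually seen.
reachable = {gene: prod(n) for gene, n in observed_per_cluster.items()}

for gene in reachable:
    print(gene, theoretical[gene], reachable[gene], observed_isoforms[gene])
```

Even these reduced ceilings are not reached, which is consistent with the strong usage biases and the modest number of full-length reads per gene.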

Fig. 6. MinION sequencing of Rdl identified four isoforms. a Histogram of read lengths. b The number of reads per isoform. c Cumulative distribution of isoforms with respect to expression. d The number of reads per alternative exon (top) and per isoform (below)

Fig. 7. MinION sequencing of MRP identified nine isoforms. a Histogram of read lengths. b The number of reads per isoform. c Cumulative distribution of isoforms with respect to expression. d The number of reads per alternative exon (top) and per isoform (below)

Fig. 8. MinION sequencing of Mhc identified 12 isoforms. a Histogram of read lengths. b The number of reads per isoform. c Cumulative distribution of isoforms with respect to expression. d The number of reads per alternative exon (top) and per isoform (below)

Conclusions

Here we have demonstrated that nanopore sequencing with the Oxford Nanopore MinION can be used to easily determine the connectivity of exons in a single transcript, including Dscam1, the most complicated alternatively spliced gene known in nature. This is an important advance for several reasons. First, because short-read sequence data cannot be used to conclusively determine which exons are present in the same RNA molecule, especially for complex alternatively spliced genes, long-read sequence data are necessary to fully characterize the transcript structure and exon connectivity of eukaryotic transcriptomes. Second, although the Pacific Biosciences platform can perform long-read sequencing, there are several differences between it and the Oxford Nanopore MinION that could cause users to choose one platform over the other. In general, the quality of the sequence generated by the Pacific Biosciences platform is higher than that currently generated by the Oxford Nanopore MinION. This is largely due to the fact that each molecule is sequenced multiple times on the Pacific Biosciences platform, yielding a high quality consensus sequence, whereas on the Oxford Nanopore MinION each molecule is sequenced at most twice (in the template and complement). We have previously used the Pacific Biosciences platform to characterize Dscam1 isoforms and found that it works well, though due to the large amount of cDNA needed to generate the libraries, many cycles of PCR are necessary and we observed an extensive amount of template switching, making it impractical to use for these experiments (BRG, unpublished data). However, over the past year that we have been involved in the MinION Access Programme (MAP), the quality of MinION sequence has steadily increased. As this trend is likely to continue, the difference in sequence quality between these two platforms is almost certain to shrink.
Nonetheless, as we demonstrate, the current quality of the data is more than sufficient to allow us to accurately distinguish between highly similar alternatively spliced isoforms of the most complex gene in nature. Third, the ability to accurately characterize alternatively spliced transcripts with the Oxford Nanopore MinION makes this technology accessible to a much broader range of researchers than was previously possible. This is in part due to the fact that, in contrast to all other sequencing platforms, very little capital expense is needed to acquire the sequencer. Moreover, the MinION is truly a portable sequencer that could literally be used in the field (provided one has access to an Internet connection), and due to its size, almost no laboratory space is required for its use.

Although nanopore sequencing has many exciting and potentially disruptive advantages, there are several areas in which improvement is needed. First, although we were able to accurately identify over 7,000 Dscam1 isoforms with an average identity of full-length alignments >90 %, there are several situations in which this level of accuracy will be insufficient to determine transcript structure. For instance, there are many micro-exons in the human genome [15], and these exons would be difficult to identify if they overlapped a portion of a read that contained errors. Additionally, small unannotated exons could be difficult to identify for similar reasons. Second, the current number of usable reads is lower than that which will be required to perform whole transcriptome analysis. One issue that plagues transcriptome studies is that the majority of the sequence generated comes from the most abundant transcripts. Thus, with the current throughput, numerous runs would be needed to generate a sufficient number of reads necessary to sample transcripts expressed at a low level. In fact, this is one reason that we chose in this study, to begin by targeting specific genes rather than attempting to sequence the entire transcriptome. We do note, however, that over the past year of our participation in the MAP, the throughput of the Oxford Nanopore MinION has increased, and it is reasonable to expect additional improvements in throughput that should make it possible to generate a sufficient number of long reads to deeply interrogate even the most complex transcriptome.
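The throughput concern above can be made concrete with a rough calculation. The sketch below assumes, for illustration only, that every read is an independent draw from the cDNA pool and that a transcript makes up a fixed fraction of that pool. The function name `reads_needed` and the example abundances are hypothetical and do not come from the study.

```python
import math


def reads_needed(fraction: float, detect_prob: float = 0.95) -> int:
    """Smallest number of reads N such that a transcript making up
    `fraction` of the pool is seen at least once with probability
    `detect_prob`, i.e. 1 - (1 - fraction)**N >= detect_prob.

    Assumes independent, unbiased sampling of reads (a simplification).
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")
    if not 0.0 < detect_prob < 1.0:
        raise ValueError("detect_prob must be strictly between 0 and 1")
    # log1p(-x) computes log(1 - x) accurately for very small x.
    return math.ceil(math.log(1.0 - detect_prob) / math.log1p(-fraction))


# Hypothetical abundances: an abundant transcript versus a rare one.
abundant = reads_needed(0.5)
rare = reads_needed(1e-6)
print(abundant, rare)
```

Under these assumptions, a transcript present at one part per million needs on the order of three million reads just to be observed once, which is why targeted sequencing is the more practical starting point at low throughput.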

In conclusion, we anticipate that nanopore sequencing of whole transcriptomes, rather than targeted genes as we have performed here, will be a rapid and powerful approach for characterizing isoforms, especially with improvements in the throughput and accuracy of the technology, and the simplification and/or elimination of the time-consuming library preparations.

 

The Tangled Transcriptome

Graveley’s lab studies the transcriptome, the mass of RNA molecules in living cells whose job is to translate DNA into proteins. The transcriptome is a sort of snapshot of which parts of the genome are active at a given time and place. Which genes are transcribed into RNA, and in what quantities, changes from organ to organ and even cell to cell, and can vary over an organism’s lifetime or in response to environmental changes.

Of particular interest to Graveley are those RNA molecules that can take different shapes, or “isoforms,” depending on random chance or what the cell needs at a particular time. RNA isoforms are distinct versions of the same gene. Through a process called alternative splicing, the different subunits, or “exons,” that make up a gene can be reshuffled in new combinations. Many genes have two or more mutually exclusive exons, and which ones are actually expressed as RNA and protein can have big effects on cellular behavior ― in effect, expanding the protein arsenal of the genome.

“For the entire field of transcriptomics and gene function, knowing what isoforms are expressed is critical,” says Graveley. “Most genes are complicated, especially in humans, and have alternative splicing that occurs at multiple places.”

That brings us to the challenge of Dscam1, the world record holder for alternative splicing. In fruit flies, a particularly well-studied model organism, Dscam1 is made up of 115 exons, only 20 of which are always transcribed into RNA. The other 95 exist in four “clusters” of mutually exclusive exons, and as a result, over 38,000 possible isoforms of Dscam1 have been predicted.
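The figure of over 38,000 isoforms follows from simple multiplication: one exon is chosen from each mutually exclusive cluster, so the counts multiply. The text gives only the total of 95 alternative exons in four clusters. The per-cluster sizes used below (12, 48, 33 and 2, for the exon 4, 6, 9 and 17 clusters) are an assumption taken from the commonly cited Dscam1 gene structure, not from this article.

```python
from math import prod

# Assumed cluster sizes (not stated in the article); they sum to the
# 95 alternative exons mentioned in the text.
CLUSTER_SIZES = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}


def count_isoforms(cluster_sizes: dict) -> int:
    """One exon is picked per mutually exclusive cluster, so the number
    of possible combinations is the product of the cluster sizes."""
    return prod(cluster_sizes.values())


total_alternative = sum(CLUSTER_SIZES.values())
n_isoforms = count_isoforms(CLUSTER_SIZES)
print(total_alternative, n_isoforms)  # 95 38016
```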

“This is by far, an order of magnitude, more than any other gene,” Graveley explains. This flexibility makes sense in light of Dscam1’s function. The protein it makes helps to “identify” single neurons in the insect brain, making them distinct enough from their neighbors for these cells to assemble a neural circuit on principles of like avoiding like. In experiments where Dscam1 has been altered to make fewer RNA isoforms, the neural wiring breaks down during development, sometimes severely enough to kill the flies.

Dscam1 also plays a role in the insect immune system, another reason for it to produce a huge variety of isoforms. Each of these molecules might be more or less effective at fighting certain pathogens.

It’s frustratingly hard, however, to figure out exactly which isoforms are in a specific sample. Graveley has been working on Dscam1 in fruit flies for more than a decade, but very basic questions remain unanswered: are some isoforms more common, or more important, than others? Are all the theoretical isoforms expressed? Do the isoforms have different behaviors, or are they just arbitrary ways of tagging neurons?

Size Matters

The trouble is the current state of the art in sequencing technology, which reads just a couple of hundred DNA bases at a time. That works great for identifying which exons are present in the transcriptome, but it’s no good for saying which mix of exons any specific strand of RNA is carrying. Different exons can lie thousands of bases apart on the RNA molecule, and there’s no way to bridge the gap between reads.

Graveley has tried a lot of solutions. He’s used the outdated Sanger sequencing method, which is much slower and more labor-intensive than modern sequencers, but does span longer reads. His lab also worked out a roundabout way of reconstructing RNA transcripts with contemporary Illumina sequencers, through a combination of chemistry and computational approaches.

“It worked,” he says, “but it was complicated by a lot of library preparation artifacts, and you basically had to jury-rig a genome analyzer to do something it was not supposed to do.”

Graveley’s preferred method is to use a sequencer produced by Pacific Biosciences, which, like the MinION, is built on long-read, single-molecule technology. PacBio sequencing is much better established than nanopores, and its results are known to be reliable; it also has the high throughput typical of modern instruments. For researchers working on alternative splicing, it’s clearly the technology to beat.

Unfortunately, it’s also very expensive. So Graveley’s team set out to learn whether the MinION, a low-throughput but extremely cheap alternative, could be an adequate substitute.

For the Genome Biology paper, the team focused on a 1.8-kilobase region of Dscam1 RNA that covers 93 of the gene’s 95 alternatively spliced exons. To get their samples, they crushed fruit fly heads, isolated Dscam1 RNA from the sample using a polymerase, and reverse-transcribed it into cDNA for sequencing. They also sequenced transcripts of three other alternatively spliced genes, Rdl, MRP, and Mhc.


The biggest concern for new applications of the MinION is its shaky accuracy. While most sequencers can achieve comfortably over 99% consensus with reference sequences, Graveley’s group has seen only about 90% identity with the MinION. That’s actually a little better than most MinION users have managed, although the device’s accuracy has been steadily improving. Users have had to pick their projects carefully to account for this: the device is pretty reliable in resequencing studies that map DNA reads to known references, but it’s still a dubious choice for sequencing unknown genetic material from scratch (although it’s been tried).

To accurately pin down the exact isoforms in the transcriptome, the MinION didn’t have to read every RNA molecule perfectly, but it did have to come close enough to decisively tell one exon from another ― and in Dscam1, those exons could be as much as 80% identical.

In fact, Graveley and his co-authors found that the MinION was very capable of this. Out of around 33,000 high-quality Dscam1 reads pulled off the sequencer, almost 29,000 were a strong match for one and only one combination of exons. To further check their accuracy, the team also sequenced the same sample on Illumina technology. While the Illumina sequencer could not give whole isoforms, it did show the same proportions of different exons, suggesting that the MinION gave a complete and unbiased picture of the sample.
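To illustrate why roughly 90% read identity can still separate exons that are 80% identical, here is a toy sketch of assigning a read segment to one exon variant. It is not the method used in the paper. It assumes equal-length sequences compared position by position (real nanopore reads contain insertions and deletions and need a proper aligner), and the names `identity`, `assign_variant`, the thresholds and the 20-base sequences are all hypothetical.

```python
from typing import Dict, Optional


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length in this toy model")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def assign_variant(segment: str, variants: Dict[str, str],
                   min_identity: float = 0.85,
                   min_margin: float = 0.05) -> Optional[str]:
    """Return the name of the single best-matching variant, or None when
    the match is too weak or not clearly better than the runner-up."""
    scores = sorted(((identity(segment, seq), name)
                     for name, seq in variants.items()), reverse=True)
    best_score, best_name = scores[0]
    runner_up = scores[1][0] if len(scores) > 1 else 0.0
    if best_score >= min_identity and best_score - runner_up >= min_margin:
        return best_name
    return None


# Two made-up exon variants that are 80% identical (4 differences in 20).
VARIANTS = {
    "variant_A": "ACGTACGTACGTACGTACGT",
    "variant_B": "TCGTAGGTACCTACGAACGT",
}
# A read of variant_A with two sequencing errors (90% identity).
noisy_read = "ACATACGTACGTACGTACAT"
# A read that sits halfway between the two variants.
ambiguous_read = "TCGTAGGTACGTACGTACGT"

print(assign_variant(noisy_read, VARIANTS))      # variant_A
print(assign_variant(ambiguous_read, VARIANTS))  # None
```

The margin test mirrors the idea in the text of a read being a strong match for one and only one combination of exons: a read is discarded when two candidates score about equally well.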

“Alternative splicing, it turns out, is probably one of the ideal applications for this platform,” Graveley says. “Even with a gene as complicated as this one, we’re able to accurately distinguish the isoforms from one another. Unless you have very, very small exons, or two exons that are almost identical to each other, the accuracy is good enough.”

Make Way for PromethION

The results are good news for researchers studying the transcriptome, but the MinION probably won’t push out other methods for dealing with alternative splicing just yet. Its low throughput means that at best it can cover a very small portion of the transcriptome with each run ― and that means isolating targeted RNA transcripts, a process that can introduce new biases into the data.

“You need a lot of reads to get the whole transcriptome, and what happens is you end up sequencing boring genes like actin and tubulin, the really abundantly expressed things,” Graveley explains. Still, his data from this experiment was good enough to replicate a few earlier findings: for instance, that Dscam1 does appear to make every predicted isoform. In this experiment, his lab observed almost half the possible isoforms, containing 92 of 93 possible exons.

Meanwhile, Oxford Nanopore Technologies is working on a new instrument, the PromethION, which will contain 48 MinION-style flow cells in a battery. Graveley has already signed on to be one of the first recipients, in an access program that is likely to start in the winter.

Judging by studies like this one, the PromethION stands a good chance of becoming the instrument of choice for large-scale RNA sequencing. With Dscam1, Graveley hopes to reach high enough throughput to do functional studies, seeking to learn whether different combinations of isoforms give rise to physical or behavioral differences. He also wants to look at human genes with high levels of alternative splicing, and to test whether the MinION can accurately count total numbers of RNA isoforms.

“The fact that you can use this technology to characterize whole isoforms is very exciting,” Graveley says. “It’s going to help us start characterizing the transcriptome in ways that have been very difficult.”

 

 

 


Cancer Drug-Resistance Mechanism

Curator: Larry H. Bernstein, MD, FCAP

 

Drug-Resistance Mechanism in Tumor Cells Unravelled

Targeting the RNA-binding protein that promotes resistance could lead to better cancer therapies.

About half of all tumors are missing a gene called p53, which helps healthy cells prevent genetic mutations. Many of these tumors develop resistance to chemotherapy drugs that kill cells by damaging their DNA.

MIT cancer biologists have now discovered how this happens: A backup system that takes over when p53 is disabled encourages cancer cells to continue dividing even when they have suffered extensive DNA damage. The researchers also discovered that an RNA-binding protein called hnRNPA0 is a key player in this pathway.

“I would argue that this particular RNA-binding protein is really what makes tumor cells resistant to being killed by chemotherapy when p53 is not around,” says Michael Yaffe, the David H. Koch Professor in Science, a member of the Koch Institute for Integrative Cancer Research, and the senior author of the study.

The findings suggest that shutting off this backup system could make p53-deficient tumors much more susceptible to chemotherapy. It may also be possible to predict which patients are most likely to benefit from chemotherapy and which will not, by measuring how active this system is in patients’ tumors.

Rewired for resistance

In healthy cells, p53 oversees the cell division process, halting division if necessary to repair damaged DNA. If the damage is too great, p53 induces the cell to undergo programmed cell death.

In many cancer cells, if p53 is lost, cells undergo a rewiring process in which a backup system, known as the MK2 pathway, takes over part of p53’s function. The MK2 pathway allows cells to repair DNA damage and continue dividing, but does not force cells to undergo cell suicide if the damage is too great. This allows cancer cells to continue growing unchecked after chemotherapy treatment.

“It only rescues the bad parts of p53’s function, but it doesn’t rescue the part of p53’s function that you would want, which is killing the tumor cells,” says Yaffe, who first discovered this backup system in 2013.

In the new study, the researchers delved further into the pathway and found that the MK2 protein exerts control by activating the hnRNPA0 RNA-binding protein.

RNA-binding proteins are proteins that bind to RNA and help control many aspects of gene expression. For example, some RNA-binding proteins bind to messenger RNA (mRNA), which carries genetic information copied from DNA. This binding stabilizes the mRNA and helps it stick around longer so the protein it codes for will be produced in larger quantities.

“RNA-binding proteins, as a class, are becoming more appreciated as something that’s important for response to cancer therapy. But the mechanistic details of how those function at the molecular level are not known at all, apart from this one,” says Ian Cannell, a research scientist at the Koch Institute and the lead author of the Cancer Cell paper.

In this paper, Cannell found that hnRNPA0 takes charge at two different checkpoints in the cell division process. In healthy cells, these checkpoints allow the cell to pause to repair genetic abnormalities that may have been introduced during the copying of chromosomes.

One of these checkpoints, known as G2/M, is controlled by a protein called Gadd45, which is normally activated by p53. In lung cancer cells without p53, hnRNPA0 stabilizes mRNA coding for Gadd45. At another checkpoint called G1/S, p53 normally turns on a protein called p21. When p53 is missing, hnRNPA0 stabilizes mRNA for a protein called p27, a backup to p21. Together, Gadd45 and p27 help cancer cells to pause the cell cycle and repair DNA so they can continue dividing.

Personalized medicine

The researchers also found that measuring the levels of mRNA for Gadd45 and p27 could help predict patients’ response to chemotherapy. In a clinical trial of patients with stage 2 lung tumors, they found that patients who responded best had low levels of both of those mRNAs. Those with high levels did not benefit from chemotherapy.

“You could measure the RNAs that this pathway controls, in patient samples, and use that as a surrogate for the presence or absence of this pathway,” Yaffe says. “In this trial, it was very good at predicting which patients responded to chemotherapy and which patients didn’t.”

“The most exciting thing about this study is that it not only fills in gaps in our understanding of how p53-deficient lung cancer cells become resistant to chemotherapy, it also identifies actionable events to target and could help us to identify which patients will respond best to cisplatin, which is a very toxic and harsh drug,” says Daniel Durocher, a senior investigator at the Samuel Lunenfeld Research Institute of Mount Sinai Hospital in Toronto, who was not part of the research team.

The MK2 pathway could also be a good target for new drugs that could make tumors more susceptible to DNA-damaging chemotherapy drugs. Yaffe’s lab is now testing potential drugs in mice, including nanoparticle-based sponges that would soak up all of the RNA binding protein so it could no longer promote cell survival.


DNA Replication

Larry H. Bernstein, MD, FCAP, Curator

LPBI

 

 

Decades Old DNA Replication Models Called into Question

http://www.genengnews.com/gen-news-highlights/decades-old-dna-replication-models-called-into-question/81251929/

 


http://www.genengnews.com/media/images/GENHighlight/102252_web8123122217.jpg

A series of electron micrographs show the barrel-shaped helicase, which is the enzyme that separates the two DNA strands, along with other components of the replisome, including polymerase-epsilon (green). [Brookhaven National Laboratory]

It may be time to update biology texts to reflect newly published data from a collaborative team of scientists at Rockefeller University, Stony Brook University, and the U.S. Department of Energy’s Brookhaven National Laboratory. Using cutting-edge electron microscopy (EM) techniques, the investigators gathered the first ever images of the fully assembled replisome, providing new insight into the molecular mechanisms of replication.

    “Our finding goes against decades of textbook drawings of what people thought the replisome should look like,” remarked co-senior author Michael O’Donnell, Ph.D., professor and head of Rockefeller’s Laboratory of DNA Replication. “However, it’s a recurring theme in science that nature does not always turn out to work the way you thought it did.”


http://www.genengnews.com/Media/images/GENHighlight/102254_web2322915422.jpg

Previously (left), the replisome’s two polymerases (green) were assumed to be below the helicase (tan), the enzyme that splits the DNA strands. The new images reveal one polymerase is located at the front of the helicase, causing one strand to loop backward as it is copied (right). [Brookhaven National Laboratory]

The researchers’ findings focused on the replisome found in eukaryotic organisms, a category that includes a broad swath of living things, including humans and other multicellular organisms. Over the past several decades, there has been an array of data describing the individual components that make up the replisome. Yet, until now no pictures existed to show just how everything fit together.

“This work is a continuation of our long-standing research using electron microscopy to understand the mechanism of DNA replication, an essential function for every living cell,” explained co-senior author Huilin Li, Ph.D., biologist with joint appointments at Brookhaven Lab and Stony Brook University. “These new images show the fully assembled and fully activated ‘helicase’ protein complex—which encircles and separates the two strands of the DNA double helix as it passes through a central pore in the structure—and how the helicase coordinates with the two ‘polymerase’ enzymes that duplicate each strand to copy the genome.”

The image and implications from this study were described in a paper entitled “The architecture of a eukaryotic replisome,” published recently through Nature Structural & Molecular Biology.

Traditional models of DNA replication show the helicase enzyme moving along the DNA, separating the two strands of the double helix, with two polymerases located at the back where the DNA strand is split. In this configuration, the polymerases would add nucleotides to the side-by-side split ends as they move out of the helicase to form two new complete double helix DNA strands. However, the images that the researchers collected of intact replisomes revealed that only one of the polymerases is located at the back of the helicase. The other is on the front side of the helicase, where the helicase first encounters the double-stranded helix. This means that while one of the two split DNA strands is acted on by the polymerase at the back end, the other has to thread itself back through or around the helicase to reach the front-side polymerase before having its new complementary strand assembled.

“DNA replication is one of the most fundamental processes of life, so it is every biochemist’s dream to see what a replisome looks like,” stated lead author Jingchuan Sun, EM biologist in Dr. Li’s laboratory. “Our lab has expertise and a decade of experience using electron microscopy to study DNA replication, which has prepared us well to tackle the highly mobile therefore very challenging replisome structure. Working together with the O’Donnell lab, which has done beautiful, functional studies on the yeast replisome, our two groups brought perfectly complementary expertise to this project.”

The positioning of one polymerase at the front of the helicase suggests that it may have an unforeseen function—the possibilities of which the collaborative group of scientists is continuing to study. Whatever the function the offset polymerase ends up having, Drs. Li and O’Donnell hope that it will not only provide them better insight into the replication machinery but that they may uncover useful information that can be exploited for disease intervention.

“Clearly, further studies will be required to understand the functional implications of the unexpected replisome architecture reported here,” the scientists concluded.

 

RELATED CONTENT

 

Fifth Histone Found to Recruit Proteins for DNA Repair   

http://www.genengnews.com/gen-news-highlights/fifth-histone-found-to-recruit-proteins-for-dna-repair/81251895/

Scientists at the University of Copenhagen say they have located a previously unknown function for histones, which allows for an improved understanding of how cells protect and repair DNA damages. This new discovery may be of great importance to the treatment of diseases caused by cellular changes such as cancer and immune deficiency syndrome.

The study (“Histone H1 couples initiation and amplification of ubiquitin signaling after DNA damage”) is published in Nature.

“I believe that there’s a lot of work ahead. It’s like opening a door onto a previously undiscovered territory filled with lots of exciting knowledge. The histones are incredibly important to many of the cells’ processes as well as their overall wellbeing,” said Niels Mailand, Ph.D., from the Novo Nordisk Foundation Center for Protein Research at the Faculty of Health and Medical Science.

Histones enable the tight packaging of DNA strands within cells. The strands are two meters in length and the cells usually about 100,000 times smaller. Generally speaking, there are five types of histones. Four of them are core histones and they are placed like beads on the DNA strands, which are curled up like a ball of wool within the cells. The role of the histones is already well described in research, and in addition to enabling the packaging of the DNA strands they also play a central part in practically every process related to the DNA-code, including repairing possibly damaged DNA.

The four core histones have tails and, among other things, they signal damage to the DNA and thus attract the proteins that help repair the damage. Between the histone “yarn balls” we find the fifth histone, Histone H1, but up until now its function has not been thoroughly examined.

Using a mass spectrometer, Dr. Mailand and his team have discovered that, surprisingly, the H1 histone also helps summon repair proteins.

“In international research, the primary focus has been on the core histones and their functionality, whereas little attention has been paid to the H1 histone, simply because we weren’t aware that it too influenced the repair process. Having discovered this function in the H1 constitutes an important piece of the puzzle of how cells protect their DNA, and it opens a door onto hitherto unknown and highly interesting territory,” noted Dr. Mailand.

He expects the discovery to lead to increased research into Histone H1 worldwide, which will lead to increased knowledge of cells’ abilities to repair possible damage to their DNA and thus increase our knowledge of the basis for diseases caused by cellular changes. It will also generate more knowledge about the treatment of these diseases.

“By mapping the function of the H1 histone, we will also learn more about the repair of DNA damages on a molecular level. In order to provide the most efficient treatment, we need to know how the cells prevent and repair these damages,” pointed out Dr. Mailand.

 

Cover All the Bases for Oligonucleotide Analysis

Stephen Luke

Synthetic oligonucleotides have emerged as promising therapeutic agents for the treatment of a variety of diseases, including viral infections and cancer. Researchers are looking at several classes of nucleic acids, such as antisense oligonucleotides, small interfering RNAs (siRNAs), and aptamers, for therapeutic applications.

However, various impurities – product-related, in the starting materials, and arising from incomplete capping of coupling reactions – must be identified and removed and postsynthesis processing must be monitored. Thus, a key challenge in the development and manufacture of oligonucleotide therapeutics is to establish analytical methods that are capable of separating and identifying impurities.

Exploring Better Options for Oligonucleotide LC Separations

 

 

Table 1. Options for oligonucleotide LC separations

Ion-pair, reversed-phase separation of the trityl-on oligos is relatively simple to perform. This method separates the full-length target oligo, which still has the DMT group attached, from the deprotected failure sequences. The analytical information obtained is limited, so this is generally considered a purification method.

An alternate method, ion-exchange separation of the trityl-off, deprotected oligos, uses the negative charge on the backbone of the oligo to facilitate the separation. Resolution is good for the shorter oligos but decreases with increasing chain length. Aqueous eluents are used, but oligos are highly charged and high concentrations of salt are needed to achieve elution from the column, making the technique unsuitable for use with LC/MS.

Finally, ion-pair, reversed-phase separation of the trityl-off, deprotected oligos makes use of organic solvents and mobile phase additives such as TEAA (triethylammonium acetate) or TEA-HFIP (triethylamine and hexafluoroisopropanol) to ion-pair with the negatively charged phosphodiester backbone of the oligonucleotide. High-performance columns deliver excellent resolution. What’s more, methods with volatile mobile phase constituents such as TEA-HFIP are suitable for use with LC/MS, providing useful information to help characterize oligonucleotide structures and sequences.

In Table 1 we summarize some of the options for oligonucleotide analysis by liquid chromatography.

Designed for ion-pair, reversed-phase separation of the trityl-off, deprotected oligos using either TEAA or TEA-HFIP mobile phases, Agilent AdvanceBio Oligonucleotide columns meet these challenges.

more….   http://www.genengnews.com/gen-articles/cover-all-the-bases-for-oligonucleotide-analysis/5594/


Human Genetics and Childhood Diseases, Volume 2 (Volume Two: Latest in Genomics Methodologies for Therapeutics: Gene Editing, NGS and BioInformatics, Simulations and the Genome Ontology), Part 1: Next Generation Sequencing (NGS)

Human Genetics and Childhood Diseases

Curator: Larry H. Bernstein, MD, FCAP

 

 

 

Publication Roundup: HGMD

HGMD®, the Human Gene Mutation Database is used by scientists around the world to find information on reported genetic mutations. The papers below use the database to advance our understanding of disease, DNA dynamics, and more.

https://www.qiagenbioinformatics.com/blog/translational/publication-roundup-hgmd

Local DNA dynamics shape mutational patterns of mononucleotide repeats in human genomes
First author: Albino Bacolla

Scientists in the US and UK published results in Nucleic Acids Research of a detailed analysis of single-base substitutions and indels in the human genome. Their findings show that certain base positions are more susceptible to mutagenesis than others. They used HGMD Professional to find mutations in specific genomic regions for analysis; the paper includes charts showing mutation patterns, germline SNPs, and more from HGMD data.

High prevalence of CDH23 mutations in patients with congenital high-frequency sporadic or recessively inherited hearing loss
First author: Kunio Mizutari

This Orphanet Journal of Rare Diseases paper from scientists in Japan sequenced 72 patients with unexplained hearing loss, finding several CDH23 mutations, some of which were novel. Mutations in the gene have been linked to Usher syndrome and other forms of hereditary hearing loss. The scientists used HGMD to find all known CDH23 mutations within nearly 70 coding regions.

Mutation analyses and prenatal diagnosis in families of X-linked severe combined immunodeficiency caused by IL2Rγ gene novel mutation
First author: Q.L. Bai

In Genetics and Molecular Research, scientists report the utility of mutation analysis of the interleukin-2 receptor gamma gene to assess carrier status and perform prenatal diagnosis for X-linked severe combined immunodeficiency. They studied two high-risk families, along with 100 controls, to evaluate the approach. Sequence variation was determined using HGMD Professional and an X-SCID database, and a new mutation was discovered in the project.

Impact of glucocerebrosidase mutations on motor and nonmotor complications in Parkinson’s disease
First author: Tomoko Oeda

Researchers from three hospitals in Japan published this Neurobiology of Aging report that may help stratify Parkinson’s disease patients by prognosis. They sequenced mutations in the GBA gene in 215 patients, finding that those who had mutations associated with Gaucher disease suffered dementia and psychosis much earlier than those who didn’t. The team found previously reported GBA mutations using HGMD Professional.

Comprehensive Genetic Characterization of a Spanish Brugada Syndrome Cohort
First author: Elisabet Selga

In this PLoS One publication, scientists from a number of institutions in Spain examined genetic variation among patients with Brugada syndrome, a rare genetic cardiac arrhythmia. They sequenced 14 genes in 55 patients, identifying 61 variants and finding the subset that appear pathogenic. Variants were filtered against a number of databases, including HGMD.

 

 

Local DNA dynamics shape mutational patterns of mononucleotide repeats in human genomes

Albino Bacolla, Xiao Zhu, Hanning Chen, Katy Howells, David N. Cooper and Karen M. Vasquez

Nucl. Acids Res. (26 May 2015) 43(10): 5065-5080.   http://dx.doi.org/10.1093/nar/gkv364

Single base substitutions (SBSs) and insertions/deletions are critical for generating population diversity and can lead both to inherited disease and cancer. Whereas on a genome-wide scale SBSs are influenced by cellular factors, on a fine scale SBSs are influenced by the local DNA sequence-context, although the role of flanking sequence is often unclear. Herein, we used bioinformatics, molecular dynamics and hybrid quantum mechanics/molecular mechanics to analyze sequence context-dependent mutagenesis at mononucleotide repeats (A-tracts and G-tracts) in human population variation and in cancer genomes. SBSs and insertions/deletions occur predominantly at the first and last base-pairs of A-tracts, whereas they are concentrated at the second and third base-pairs in G-tracts. These positions correspond to the most flexible sites along A-tracts, and to sites where a ‘hole’, generated by the loss of an electron through oxidation, is most likely to be localized in G-tracts. For A-tracts, most SBSs occur in the direction of the base-pair flanking the tracts. We conclude that intrinsic features of local DNA structure, i.e. base-pair flexibility and charge transfer, render specific nucleotides along mononucleotide runs susceptible to base modification, which then yields mutations. Thus, local DNA dynamics contributes to phenotypic variation and disease in the human population.
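The positional pattern described in the abstract can be sketched in a few lines: locate A-tracts and G-tracts, then report the first and last positions of each A-tract and the second and third positions of each G-tract. This is only an illustration. The minimum tract length of 4, the choice to count G-tract positions from the 5′ end of the given strand, and the function names are assumptions, not details from the paper. Only the given strand is scanned, so T-tracts and C-tracts (the same repeats read on the complementary strand) are ignored.

```python
import re
from typing import Dict, List, Tuple


def find_tracts(seq: str, base: str, min_len: int = 4) -> List[Tuple[int, int]]:
    """Return 0-based, half-open (start, end) spans of runs of `base`
    that are at least `min_len` long."""
    pattern = re.compile(f"{re.escape(base)}{{{min_len},}}")
    return [(m.start(), m.end()) for m in pattern.finditer(seq.upper())]


def hotspot_positions(seq: str, min_len: int = 4) -> Dict[str, List[int]]:
    """Positions the abstract describes as mutation-prone:
    first and last bases of A-tracts, second and third bases of G-tracts."""
    hotspots: Dict[str, List[int]] = {"A": [], "G": []}
    for start, end in find_tracts(seq, "A", min_len):
        hotspots["A"].extend([start, end - 1])
    for start, _end in find_tracts(seq, "G", min_len):
        hotspots["G"].extend([start + 1, start + 2])
    return hotspots


# Made-up sequence with one A-tract (positions 2-6) and one G-tract (9-13).
example = "CCAAAAATCGGGGGTC"
print(find_tracts(example, "A"))   # [(2, 7)]
print(hotspot_positions(example))  # {'A': [2, 6], 'G': [10, 11]}
```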

INTRODUCTION

Changes in human genomic DNA in the form of base substitutions and insertions/deletions (indels) are essential to ensure population diversity, adaptation to the environment, defense from pathogens and self-recognition; they are also a critical source of human inherited disease and cancer. On a genome-wide scale, base substitutions result from the combined action of several factors, including replication fidelity, lagging versus leading strand DNA synthesis, repair, recombination, replication timing, transcription, nucleosome occupancy, etc., both in the germline and in cancer (1–4). On a much finer scale [over a few base pairs (bp)], rates of base substitutions may be strongly influenced by interrelationships between base–protein and base–base interactions. For example, the mutator role of activation-induced deaminase (AID) in B-cells during class-switch recombination and somatic hypermutation (5) preferentially targets cytosines within WRC (W: A|T; R: A|G) sequences (6), whereas apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like (APOBEC) overexpression displays a preference for base substitutions at cytosines in TCW contexts (7). Other examples, such as the induction of C→T transitions at CG:CG dinucleotides by cytosine-5-methylation and the role of UV light in promoting base substitutions at pyrimidine dimers, have been well documented (reviewed in (4,8)). More recently, complex patterns of base substitution at guanosines in cancer genomes have been found to correlate with changes in guanosine ionization potentials as a result of electronic interactions with flanking bases (9), suggesting a role for electron transfer and oxidation reactions in sequence-dependent mutagenesis. However, despite these advances, the increasing number of sequence-dependent patterns of mutation noted in genome-wide sequencing studies has met with a lack of understanding of most of the underlying mechanisms (10). Thus, a picture is emerging in which mutations are often heavily dependent on sequence-context, but for which our comprehension is limited.

Mononucleotide repeats comprise blocks of identical base pairs (A|T or C|G; hereafter referred to as A-tracts and G-tracts) and display distinct features: they are abundant in vertebrate genomes; mutations within the tracts occur more frequently than the genome-wide average; mutations generally increase with increasing tract length; length instability is a hallmark of mismatch repair-deficiency in cancers; and sequence polymorphism within the general population has been linked to phenotypic diversity (11–15). Thus, mononucleotide repeats appear ideal for addressing the question of sequence-dependent mutagenesis since base pairs within the tracts are flanked by identical neighbors. Both historic and recent investigations concur with the conclusion that a major source of mononucleotide repeat polymorphism is the occurrence of slippage (i.e. repeat misalignment) during semiconservative DNA replication, which gives rise to the addition or deletion of repeat units (11,12). An additional and equally important source of mutation has recently been suggested to arise from errors in DNA replication by translesion synthesis DNA polymerases, such as pol η and pol κ (13), also on slipped intermediates, leading to single base substitutions.

A key question that remains unanswered in these studies and which is relevant to the issue of sequence context-dependent mutagenesis is whether all base pairs within mononucleotide repeats display identical susceptibility to single base changes and whether indels (which are consequent to DNA breakage) occur randomly within the tracts.

Herein, we combine bioinformatics analyses on mononucleotide repeat variants from the 1000 Genomes Project and cancer genomes with molecular dynamics simulations and hybrid quantum mechanics/molecular mechanics calculations to address the question of sequence-dependent mutagenesis within these tracts. We show that mutations along both A-tracts and G-tracts are highly non-uniform. Specifically, both base substitutions and indels occur preferentially at the first and last bp of A-tracts, whereas they are concentrated between the second and third G:C base pairs in G-tracts. These positions coincide with the most flexible base pairs for A-tracts and with the preferential localization of a ‘hole’ that results when one electron is lost due to an oxidation reaction anywhere along G-tracts. Thus, despite the uniformity of sequence composition, mutations occur in a sequence-dependent context at homopolymeric runs according to a hierarchy that is imposed by both local DNA structural features and long-range base–base interactions. We also show that the repair processes leading to base substitution must differ between A- and G-tracts, since in the former, but not in the latter, base substitutions occur predominantly in the direction of the base immediately flanking the tracts. Additional sequence-dependent patterns of mutation are likely to arise from studies of more heterogeneous sequence combinations, possibly involving other aspects intrinsic to the structure of DNA.

 

RESULTS

Mononucleotide repeat variation is defined by tract length and flanking base composition

We define mononucleotide repeats in the GRCh37/hg19 (hg19) human genome assembly as uninterrupted runs of A:T and G:C base pairs (hereafter referred to as A-tracts and G-tracts, respectively) from 4 to 13 base pairs in length (Figure 1A). We retrieved a total of 48,767,945 A-tracts and 13,633,781 G-tracts, both of which displayed a biphasic distribution with an inflection point between tract lengths of 8 and 9 (bp) and with the number of runs declining with length more dramatically for G-tracts than for A-tracts (Figure 1B), as noted previously (29). Both the number of short tracts and the extent of decline varied with flanking base composition, TA[n]T runs being two- to three-fold more abundant than CA[n]Cs (Supplementary Figure S1A) and AG[n]As declining the most rapidly (Supplementary Figure S1B). Thus, mononucleotide runs exist as a collection of separate pools of sequences in extant human genomes, each maintained at distinctive rates of sequence stability, as determined by factors such as bp composition (A:T versus G:C), tract length and flanking sequence composition.
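The tract definition above (uninterrupted runs of 4 to 13 identical base pairs, recorded together with their nearest neighbors) can be illustrated with a minimal Python sketch. This is not the authors' search algorithm; the function name `find_tracts` and the tuple layout are hypothetical, and the neighbors are reported on the strand as given rather than re-oriented to the purine-rich strand as in Figure 1A.

```python
import re

# Illustrative only: hypothetical helper, not the search algorithm used in the paper.
# A run of As or Ts on the given strand is treated as an A-tract (A:T base pairs);
# a run of Gs or Cs is treated as a G-tract (G:C base pairs).
TRACT_PATTERNS = {
    "A-tract": re.compile(r"A+|T+"),
    "G-tract": re.compile(r"G+|C+"),
}


def find_tracts(seq, min_len=4, max_len=13):
    """Return (kind, start, length, 5' neighbor, 3' neighbor) for each maximal run.

    Coordinates are 0-based on the strand supplied; neighbors are None at the
    sequence ends. Runs shorter than min_len or longer than max_len are skipped.
    """
    seq = seq.upper()
    hits = []
    for kind, pattern in TRACT_PATTERNS.items():
        for match in pattern.finditer(seq):
            length = match.end() - match.start()
            if not (min_len <= length <= max_len):
                continue
            left = seq[match.start() - 1] if match.start() > 0 else None
            right = seq[match.end()] if match.end() < len(seq) else None
            hits.append((kind, match.start(), length, left, right))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    for hit in find_tracts("CGTAAAAATCGGGGCA"):
        print(hit)
```

Grouping the returned tuples by kind, length and flanking bases would give the separate pools of sequences (for example TA[n]T versus CA[n]C) that the paragraph compares.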

Figure 1.

Mononucleotide repeat variation, evolutionary conservation and association with transcription. (A) The search algorithm was designed to retrieve runs of As or Ts (A-tracts) and Gs or Cs (G-tracts) of length n (n = 4 to 13), along with their 5′ (n = 0) and 3′ (n = n + 1) nearest neighbors from hg19. Tract bases were numbered 5′ to 3′ with respect to the purine-rich sequence. The panel exemplifies the nomenclature for A- and G-tracts of length 4. (B) Logarithmic plot of the number of A-tracts (closed circles) and G-tracts (open circles) in hg19 as a function of length. (C) Normalized fractions of polymorphic tracts (F SNV) (number of SNVs divided by both hg19 number of tracts and n) from the 1KGP for A-tracts (closed circles) and G-tracts (open circles). (D) Radial plot of SNVs in the 1KGP at the 5′ and 3′ nearest neighbors of A-tracts. Periphery, tract length; horizontal axis, scale for the fraction of SNVs (F SNV). (E) Radial plot of SNVs in the 1KGP at the 5′ and 3′ nearest neighbors of G-tracts. (F) Percent difference in the numbers of A-tracts (closed circles) and G-tracts (open circles) between syntenic regions of hg19 and HN genomes. (G) The exponents of Benjamini-corrected P-values for A-tract-containing genes enriched in transcription-factor binding sites plotted as a function of A-tract length (triangles); each value represents the median of the top 11 UCSC_TFBS terms. The percent A-tracts (closed circles) and G-tracts (open circles) intersecting genomic regions pulled-down by chromatin immunoprecipitation using antibodies against transcription factors are plotted as a function of tract length. (H) List of gene enrichment terms with a Benjamini-corrected P-value of <0.05 in common between genes containing A- and G-tracts of lengths 4–13, excluding the UCSC_TFBS terms.

 We examined the extent of sequence variation in the human population by mapping 38,878,546 single nucleotide variants (SNVs) from 1092 haplotype-resolved genomes (the 1000 Genomes Project, 1KGP) (30) to the hg19 A- and G-tracts. The normalized fractions of polymorphic tracts (F SNV) were greater for G-tracts than A-tracts and both displayed Gaussian-type distributions, with maxima of 0.067 for G-tracts of length 8 and 0.017 for A-tracts of length 9 (Figure 1C). CA[n]C and AG[n]A runs displayed the highest F SNV values for A- and G-tracts, respectively (Supplementary Figure S1C and D), with F SNV values for AG[n]As attaining ∼0.10 at length 8. We conclude that flanking base composition influences the rates of SNV within mononucleotide runs and, as a consequence, their representation in the reference human genome.
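The normalization behind F SNV, as given in the Figure 1C legend (number of SNVs divided by both the number of hg19 tracts and the tract length n), amounts to a per-base-pair fraction. A minimal sketch follows; the function name `f_snv` is hypothetical and the counts in the usage example are invented for illustration, not taken from the paper.

```python
def f_snv(n_snvs, n_tracts, tract_length):
    """Normalized fraction of polymorphic positions for tracts of one length.

    Follows the Figure 1C legend: SNV count divided by both the number of
    tracts in the reference genome and the tract length n.
    """
    if n_tracts <= 0 or tract_length <= 0:
        raise ValueError("n_tracts and tract_length must be positive")
    return n_snvs / (n_tracts * tract_length)


if __name__ == "__main__":
    # Made-up counts: 670 SNVs observed across 1000 tracts of length 10.
    print(round(f_snv(670, 1000, 10), 3))
```

Dividing by n as well as by the number of tracts is what allows tracts of different lengths to be compared on the same scale.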

F SNV values at the flanking 5′ and 3′ bp were similar between A- and G-tracts, except for minor differences for the least represented (i.e. longest) tracts and did not exceed 0.02 (Supplementary Figure S1E). These fractions are expected to be greater than at more distant positions from the tracts, based on previous data (29). SNVs at G-tracts, but not at A-tracts, were more frequent than at flanking base pairs. F SNVs for base pairs flanking short (≤8 bp) tracts were at least twice as high as those flanking long tracts; F SNVs also displayed distinct sequence preference with most (∼0.1) variants occurring at Ts 3′ of G-tracts (Figure 1D and E). In summary, SNVs at mononucleotide runs do not increase monotonically with length but peak at 8–9 bp. This behavior mirrors the genomic distributions, both with respect to the total number of tracts (Figure 1B) and the subsets flanked by specific-sequence combinations (Supplementary Figure S1A–D). Variation at flanking base pairs also displayed a biphasic pattern centered at a length of 8–9 bp, with a greater chance of variation adjacent to G- than A-tracts and with characteristic sequence preferences.

Long tracts are evolutionarily conserved and associated with high transcription

To assess whether more variable monosatellite runs (Figure 1C) might have undergone a greater reduction in number in extant humans relative to extinct hominids, we compared the number of A- and G-tracts between syntenic regions of five individuals comprising hg19 and three Neanderthal (HN) specimens (31). The difference between hg19 and HN was very small (<±2%) for the short tracts, but it displayed more negative values in hg19 with increasing tract length, which reached a maximum of −11.8 and −32.7% for A- and G-tracts, respectively, of length 9. Beyond this threshold, the numbers of tracts converged for A-tracts, whereas they were more abundant in hg19 for G-tracts >11 bp (Figure 1F). In summary, the largest difference in the number of mononucleotide runs between hg19 and HN sequences was centered at 9 bp for both A- and G-tracts, suggesting that the length distributions (Figure 1A and Supplementary Figure S1A and B) reflect distinct rates of evolutionary gains and losses due to differential sequence mutability (Figure 1C) as a function of length and flanking sequence composition (12).

The fact that long (>9 bp) mononucleotide runs display low variability in the human population (Figure 1C) and sequence conservation during evolutionary divergence (Figure 1F) raises the possibility that they might serve functional roles. Through gene enrichment analyses, we found that genes containing A- and G-tracts were enriched for genes associated with the term ‘UCSC_TFBS’, which pertains to transcripts harboring frequent transcription factor binding sites (32,33). For A-tract-containing genes, the median P-values for the top 11 UCSC_TFBS terms decreased from 2.95E-26 for tracts of length 4 to 5.22E-241 for tracts of length 13 (Figure 1G). The percent of A-tracts intersecting genomic fragments amplified from chromatin immunoprecipitation using transcription-factor binding antibodies (32,33) also increased from 8.7 to 9.9 from length 6 to 13, whereas it was constant (mean ± SD, 22.4 ± 1.1) for G-tracts (Figure 1G). For gene classes excluding ‘UCSC_TFBS’, a search for categories enriched at P < 0.05 and common to all A- and G-tract-containing genes returned a set of 25 terms, 22 of which were associated with high levels of tissue-specific gene expression (Figure 1H). In summary, these analyses extend prior work (14) supporting a role for mononucleotide tracts in enhancing gene expression, a function that for A-tracts appears to increase with increasing tract length.

Repeat variability is highly skewed

Next we addressed whether bp along A- and G-tracts display equal probability and type of variation. In the 1KGP dataset, the number of SNVs at each position along both A- and G-tracts of length 4 was within a two-fold difference (144,000–240,000); for both types of sequence, transitions (i.e. A→G and G→A) were the predominant (51–78%) type of base substitution (Supplementary Figure S2A and B). However, with increasing length, the number of SNVs decreased up to 30-fold more drastically for G-tracts than for A-tracts, with increasing numbers of transversions (A→T and G→C|T) being predominant. Normalizing the data for the number of tracts genome-wide revealed that the extent of SNV varied by up to 10-fold, depending upon tract length and bp position. Specifically, the highest degree of variation was observed at the first and last A within the A-tracts (i.e. A1 and An), which underwent up to 61% A→T and 43% A→C transversions, respectively, at length 9 (Figure 2A). Likewise, for G-tracts, the most polymorphic sites were G3, followed by G2, for mid-size tracts of 8–10 bp, with 44% G→C transversions at G3 for tracts of length 8 (Figure 2B). Thus, the extent of SNV at mononucleotide runs is grossly skewed in human genomes, both along the sequence itself and across tract length, which must account for the bell-shaped behavior in F SNV for the tracts as a whole (Figure 1C).
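The distinction drawn above between transitions (purine to purine or pyrimidine to pyrimidine, e.g. A→G, G→A) and transversions (purine to pyrimidine or the reverse, e.g. A→T, A→C, G→C, G→T) can be made explicit with a short sketch. The function names are hypothetical and the example substitutions are illustrative, not data from the study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_class(ref, alt):
    """Classify a single base substitution as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a single base substitution: {ref}>{alt}")
    same_ring_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_ring_class else "transversion"


def transversion_fraction(substitutions):
    """Fraction of (ref, alt) pairs that are transversions."""
    classes = [substitution_class(ref, alt) for ref, alt in substitutions]
    return classes.count("transversion") / len(classes)


if __name__ == "__main__":
    # Illustrative substitutions only.
    example = [("A", "G"), ("A", "T"), ("A", "C"), ("G", "A")]
    print(transversion_fraction(example))
```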

Figure 2.

Population variation spectra. (A) Variation spectra of A-tracts. Percent (number of SNVs at each position divided by the number of tracts in hg19 × 100) of A→T (black), A→C (red) and A→G (green) SNVs in the 1KGP dataset (left). Percent SNVs at A1 as a function of tract length (right). (B) Variation spectra of G-tracts. As in panel A with G→T (black), G→C (red) and G→A (cyan) (left). Percent SNVs at G3 as a function of tract length (right). (C) Percent A→T, A→C and A→G transitions at each position along A-tracts (stars) preceded and followed by a T (TA[n]T, left), C (CA[n]C, center) and G (GA[n]G, right) as a function of tract length. (D) Percent G→T, G→C and G→A transitions at each position along G-tracts (stars) preceded and followed by a T (TG[n]T, left), C (CG[n]C, center) and A (AG[n]A, right) as a function of tract length. (E) Percent transitions at base pairs (stars) preceding or following A-tracts (left) and G-tracts (right) as a function of tract length (n). *, mutated position.

We assessed whether SNV hypervariability was associated with specific combinations of nearest neighbors. For A-tracts flanked 5′ by a T, C or G, the highest percentage of SNVs was observed at A1 when preceded by a T, which reached 7.9% for TA[n] tracts of length 9 (Supplementary Figure S2C). By contrast, for 3′ T, C or G, the greatest effect was elicited by a C, with the highest percentage (7.1%) of SNVs at An for A[n]C tracts of length 9 (Supplementary Figure S2D). Therefore, flanking base pairs play a critical role both in the spectra and frequencies of SNVs at A-tracts. More detailed plots along A-tracts either preceded (Supplementary Figure S2E), followed (Supplementary Figure S2F) or preceded and followed (Figure 2C) by a T, C or G revealed the dramatic and long-range (up to 9–10 bp for the longest tracts, higher than the value of 4 bp predicted by mathematical models of slippage (11)) influence of flanking base pairs on variation spectra, in which up to 95% of the changes were in the direction of the base flanking the tract. Because the number of A-tracts preceded or followed by a specific base varies by up to three-fold (Supplementary Figure S2G), we conclude that for A-tracts, the overall mutation fractions and spectra are the result of at least three variables; length, position along the tract, and base composition of the 5′ and 3′ nearest-neighbors.

For G-tracts flanked 5′ by a T, C or A, high percentages (10–12%) of SNVs were observed at G1 for tracts preceded by a C, an effect that decreased with increasing tract length (Supplementary Figure S3A). This result, together with an exceedingly low number of G→A transitions at G1 for tracts not preceded by a C (Supplementary Figure S3C) relative to all tracts (Supplementary Figure S2B), is consistent with the known high mutability of CG:CG dinucleotides as a result of cytosine-5 methylation (9). The hypermutability at G2 was observed preferentially for tracts preceded by an A, and to a lesser extent T, whereas that at G3 was insensitive to flanking sequence composition. Likewise, G-tracts flanked 3′ by a T, C or A did not display marked sequence-dependent effects (Supplementary Figure S3B). Detailed plots of the SNV spectra along G-tracts either preceded (Supplementary Figure S3D), followed (Supplementary Figure S3E), or preceded and followed (Figure 2D) by a T, C or A revealed a noticeable effect only for 5′ T in association with G→T substitutions at G1 for tracts of length ≥8. Thus, despite a consistent over-representation of G-tracts flanked 5′ by a T (Supplementary Figures S3F and S1B), which must account for the high absolute number of SNVs at G1 for TG[n] relative to AG[n] and CG[n] (Supplementary Figure S3G), nearest-neighbor base composition seems to play a lesser role in SNV spectra at G-tracts than at A-tracts.

With respect to SNVs at the flanking 5′ and 3′ nearest positions, no B→A or H→G substitutions (Figure 1A) were found above a length threshold of 9 for A-tracts and 8 for G-tracts (Figure 2E, gray shading) out of 5969 SNVs, implying that tract expansion by recruiting flanking base pairs is disfavored at these lengths. In summary, base substitution along mononucleotide repeats is strongly skewed towards the edges of A-tracts and within the 5′ half of G-tracts, with frequencies that peak at midsize lengths (8–9 bp). For A-tracts ≥7 bp, base substitution occurred almost exclusively in the direction of the flanking nearest-neighbors. Finally, base substitution at flanking bases did not contribute to tract expansion for mononucleotide runs longer than 8–9 bp.

Insertions and deletions display length and positional preference

In addition to SNVs, mononucleotide runs are polymorphic in length as a result of indels. Herein, we consider separately two types of indels: one in which tract length changes by ±1 and flanking bp composition is not altered (slippage); the other comprising all other cases involving the addition or removal of 1–200 bp (indels). Slippage is a widely accepted mutational mechanism (11,12,34), whereby DNA replication errors at reiterated DNA motifs cause changes in the number of motifs (most often +/−1). The normalized fractions of slippage in the 1KGP dataset peaked at lengths of 8 bp for A-tracts and 9 bp for G-tracts (Figure 3A), generating bell-shaped curves similar to those observed for SNVs (Figure 1C) and with no differences in the highest fraction of ‘slipped’ tracts, which peaked at ∼0.02. By contrast, +1 slippage occurred more frequently than −1 slippage at A-tracts (Figure 3B). These results support recent studies on microsatellite repeats (12) and contrast with previous conclusions that slippage increases monotonically with tract length, and that the extent of slippage differs between A- and G-tracts (35,36).

Figure 3.

Population insertions and deletions. (A) Normalized fractions of A-tracts (closed circles) and G-tracts (open circles) displaying +/−1 bp slippage in the 1KGP dataset as a function of tract length. Data were obtained by dividing the number of events by both the number of hg19 tracts and tract length (n). (B) Ratio of the number of +1 to −1 slippage for A-tracts (closed circles) and G-tracts (open circles). (C) Indels at A-tracts. For positions along the tracts (‘Tract’), ‘F Indel’ is the ratio between the number of indels and the number of tracts in hg19 multiplied by tract length. For the positions immediately flanking the tracts genomic coordinates (‘Before tract’ and ‘After tract’), ‘F Indel’ is the ratio between the number of indels and the number of tracts in hg19. (D) Indels at G-tracts, calculated as described in panel C. (E) Heatmap representation of insertions along A-tracts. The percent insertions (i.e. the number of insertions at each position divided by the number of tracts in hg19) (y-axis) plotted as a function of location (x-axis) from position 0 (insertion between the bp 5′ to the tract and the first bp of the tract) to position n + 1 (insertion between the bp 3′ to the last bp of the tract and the following bp) (see Figure 1A) and as a function of tract length (z-axis). (F) Heatmap representation of insertions along G-tracts.

With respect to indels, the normalized fractions were low (<1 × 10⁻³) along short (4–6 bp) A- and G-tracts, but rose to a plateau for longer tracts as reported earlier (11); this plateau was 10-fold higher for G-tracts (∼0.03) than for A-tracts (∼0.003) (Figure 3C and D). Indels also occurred more frequently (up to six-fold for A-tracts of length 11) at nearest-neighboring base pairs (‘Before tract’ and ‘After tract’ in Figure 3C and D) than along the tracts. Thus, contrary to SNVs and slippage, indels increased to a plateau with mononucleotide tract length.

We analyzed in detail the locations of insertions along the tracts and the flanking positions with respect to the 5′ to 3′ orientation of the tracts (Figure 1A). The normalized fractions demonstrated that insertions peaked at the 3′, and to a lesser extent 5′, ends of the longest A-tracts (Figure 3E), but remained low. For G-tracts, insertions occurred most efficiently at two locations (G2–3 and G5) (Figure 3F), they increased with tract length (up to ∼0.04), and attained ∼10-fold higher values than for A-tracts. In conclusion, insertion sites at A- and G-tracts followed the patterns observed for SNVs (Figure 2A and B), suggesting that factors associated with local DNA dynamics sensitize specific bases along the tracts to genetic alteration, inducing both SBS and indels.

Base pair flexibility and charge localization map to sites of sequence changes

To elucidate elements of intrinsic DNA dynamics that may be responsible for the biases in SNV and insertion sites, we performed molecular dynamics (MD) and hybrid quantum mechanics/molecular mechanics (QM/MM) simulations on model A[6], A[9], G[6] and G[9] duplex DNA fragments. We focused on water bridge coordination (Figure 4A), bp step flexibility, and for the G[6] and G[9], charge localization, as these properties are known to impact the susceptibility of DNA to base damage, repair and mutation. The fractions of one water coordination increased along the A[9] and A[6] structures in a 5′ to 3′ direction, irrespective of flanking sequence composition, in concert with a decrease in minor groove width (Figure 4B and Supplementary Figure S4A) as predicted (37). Vstep, a measure of bp structural fluctuation, displayed a prominent peak of ∼40 Å³deg³ at the 5′-TA-3′ step for both structures (Figure 4C and Supplementary Figure S4B), which together with low water occupancy points to 5′-TA-3′ being a preferred location for base modification and mutation. In the G[9] and G[6] structures water coordination involved mostly two-water bridges due to wide (∼14 Å) minor grooves (Figure 4D and Supplementary Figure S4C), whereas flexibility was modest (∼20–22 Å³deg³, Figure 4E and Supplementary Figure S4D). Thus, bp dynamics are likely to impact mutations at A-tracts to a greater extent than at G-tracts. Guanine has the lowest ionization potential (IP) of all four bases and IP further decreases at guanine runs, rendering them targets for electron loss, charge localization, oxidation and eventually mutation (4,38). Because after electron loss the ensuing charge (hole) can migrate along the DNA double-helix and relocalize at specific guanines, we addressed whether the preferred sites of mutation along G-tracts, i.e. G2–3 and G5, would also be preferred sites for charge localization. The QM/MM determinations indicated that whereas for the short G[6] fragment the difference in the density-derived atomic partial charges (DDAPC) (i.e. the hole) localized most often (∼50%) to the first position (Figure 4F), for the long G[9] fragment charge localization shifted downstream (mostly to the second, but also to positions 6–7, Figure 4G). Importantly, the charge was found exclusively around the guanine rings (Figure 4H). Thus, the two main sites of sequence change along G-tracts, i.e. G2–3 and G5, coincide with positions where charge localization and hence one-electron oxidation reactions are predicted to occur most frequently. In summary, bp flexibility at A-tracts and charge transfer at G-tracts likely represent intrinsic DNA features underlying the bias in SNV and insertions at mononucleotide runs in human genomes.

Figure 4.

MD and QM/MM simulations. (A) Molecular modeling of one (left) and two (right) minor groove water bridge coordination. (B) Fraction of one-water bridge occupancy (left axis) at A[9] DNA sequences flanked 5′ and 3′ by a T (black circles), C (red circles) or G (green circles). Minor groove widths (right axis), as determined from intrastrand phosphate-to-phosphate distances. (C) Vstep for A[9] DNA sequences, determined as the product of the square root of the eigenvalues (λi) described by the six bp step parameters shift, slide, rise, tilt, roll and twist; i.e. $V_{\mathrm{step}} = \prod_{i=1}^{6} \sqrt{\lambda_i}$. (D) Fraction of one- (black circles) and two-water (red circles) bridge occupancy (left axis) at G[9] DNA sequences. Minor groove widths (right axis), as assessed from intrastrand phosphate-to-phosphate distances. (E) Vstep for G[9] DNA sequences. (F) Average charge redistribution (open circles and right axis) for G[6] DNA structures upon vertical ionization, examined by calculating the difference on the density-derived atomic partial charges (DDAPC) for the neutral and negatively charged states. Histogram of the number of instances (left axis) in which the largest charge redistribution occurred at a specific position along the G[6] structures. (G) DDAPC for G[9] DNA structures (open circles and right axis) and histogram of the number of instances (left axis) in which the largest charge redistribution occurred at a specific position. (H) VMD rendering of a G[9] DNA structure displaying hole localization at G2. Capped base pairs were removed for clarity.
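The Vstep definition in panel C (the product of the square roots of six eigenvalues) can be sketched in a few lines. The legend does not say which matrix the eigenvalues come from; the sketch below assumes they are the eigenvalues of the covariance matrix of the six step parameters sampled over simulation frames, which is an assumption of this example and not a statement of the authors' procedure. Under that assumption, because the product of a matrix's eigenvalues equals its determinant, Vstep equals the square root of the determinant of the covariance matrix. With shift, slide and rise in Å and tilt, roll and twist in degrees, this would carry units of Å³deg³. All function names are hypothetical.

```python
import math


def covariance(frames):
    """Sample covariance matrix of per-frame parameter vectors (rows = frames)."""
    n, d = len(frames), len(frames[0])
    means = [sum(frame[j] for frame in frames) / n for j in range(d)]
    return [
        [
            sum((f[i] - means[i]) * (f[j] - means[j]) for f in frames) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in matrix]
    size = len(a)
    det = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # singular to working precision
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return det


def v_step(frames):
    """prod_i sqrt(lambda_i) over covariance eigenvalues, computed as sqrt(det)."""
    return math.sqrt(max(determinant(covariance(frames)), 0.0))


if __name__ == "__main__":
    # Toy two-parameter example; real input would have six columns per frame.
    print(v_step([[0, 0], [2, 0], [0, 2], [2, 2]]))
```

Using the determinant avoids an explicit eigenvalue solver, so the sketch needs only the standard library; a numerical library would be the more usual choice for real trajectories.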

Position and orientation along nucleosome core particles modulate sequence variation

DNA wrapped around histones in nucleosomes is subject to local deformation (39), which may impact mutation. Thus, we analyzed the 1KGP SNVs at A- and G-tracts predicted to overlap with well-positioned nucleosome core particles (NCPs) (16). In hg19, the percentage of tracts that overlap with NCPs decreased moderately from ∼90% at length of 4 to 81% and 71% for A- and G-tracts of length 13, respectively (Figure 5A), suggesting that mononucleotide runs are not depleted in NCPs in human genomes as previously proposed (40). A-tracts of lengths 4–8 base pairs displayed distinctive peaks along the NCP surface in phase with the helical repeat of DNA (10.5 bp) and with minor grooves facing toward the inner protein core (lengths 4–5) (16) (Figure 5B and Supplementary Figure S5A). A-tracts of length of 9–13 bp exhibited only half (six) the peaks evident for the shorter tracts. For the G-tracts, only small peaks with no clear minor groove-inward-facing regions were detected (Supplementary Figure S5B).

Figure 5.

Positioning along nucleosome core particles. (A) Percent of A-tract (open circles) and G-tract (closed circles) base pairs in hg19 overlapping with well-positioned NCP genomic coordinates as a function of tract length. (B) Counts of base pairs in hg19 A-tracts of length 5 overlapping with NCPs genomic regions as a function of distance from the histone octamer dyad axis. Minor groove-inward-facing regions (gray) were derived from the X-ray crystal structure of NCP147 (41). (C) Percent SNVs in the 1KGP dataset (left axis) at every bp along A-tracts of length 5 for tracts centered at maxima (black) and minima (gray) along NCPs (Figure 5B). Percent increase (right axis) of SNVs at minima relative to maxima (green). P-values for paired t-tests: 0.013 (*), 0.002 (**) and 4.7 × 10⁻⁶ (***). (D) Whisker plots of %SNVs (left axis) at A1 for A-tracts of length 5 centered at maxima and minima (black) along NCPs (Figure 5B). Percent difference (right axis) in the number of A-tracts of length 5 in hg19 preceded by C, T or G (red) between those centered at minima and those centered at maxima (Figure 5B). (E) C-containing/G-containing ratios (see text) for G-tracts of length 5 in hg19 as a function of distance from the NCP dyad axis (black) and location of core histones (maroon and green). Peaks correspond to negative iSAT (i.e. tilt parameters multiplied by the corresponding sin θ) values (gray) (39). Ratios of %SNV at G1 (upshifted by 0.5 for clarity) between C-containing (5′-CCCCCG-3′ sequences on the hg19 forward strand) and G-containing (5′-CGGGGG-3′ sequences on the hg19 forward strand) (Figure 1A) CG[5] tracts mapping NCP ChIP-seq genomic intervals (red) fitted by a non-parametric local regression (loess; sampling proportion, 0.100; polynomial degree, 3). (F) VMD rendering (top) of TATTT residues 34–38 (yellow) and the complementary AAATA residues 672–753 (pink) from the 1EQZ pdb nucleosomal crystal structure, corresponding to peak area from −40 to −36 in Figure 5E. The switch in G-tract (lengths of 5 and 7) orientation along NCPs (bottom) serves to position the C-containing strand on the outside (yellow) and, correspondingly, the G-containing strand on the inside (pink).

To assess if tract-positioning along NCPs influences SNVs, we selected A-tracts of lengths 5, 7 and 9 bp and G-tracts of lengths 5 and 7 bp whose central positions coincided with either the maxima or minima (41) (Figure 5B and Supplementary Figure S5A and B) and conducted pairwise t-tests (330 total) between permutations of ‘categories’, including ‘tracts centered at maxima versus minima’, ‘position along the tracts’, ‘flanking sequence composition’, ‘specific NCP locations’ and ‘tract orientation’. For A-tracts, 79/207 (38%) significant pairs were found, 68 (86%) of which were related to differences between tracts centered at maxima versus minima, with a preponderance (63%) of tests displaying increased %SNVs at minima (Supplementary Figure S5C and E). For example, %SNVs at length 5 bp were greater at minima than at maxima at each position along the A-tracts (Figure 5C). A→C substitutions at A1 were more abundant at maxima than at minima (mean ± SD, 18.7 ± 0.7% at max and 17.6 ± 0.8% at min; P-value 0.001), whereas A→T substitutions at the same position displayed the opposite trend (mean ± SD, 18.4 ± 0.5% at max and 19.8 ± 1.1% at min; P-value 0.0005) (Figure 5D). A-tracts of length 7 also exhibited a similar pattern at A7 (Supplementary Figure S5H). The percentages of CA[5] and A[7]C tracts in hg19 centered at maxima were greater than at minima and the reverse was observed for the TA[5] and A[7]T tracts (Figure 5D and Supplementary Figure S5H). Thus, we conclude that positioning along the NCP surface of both the double-helical grooves and junctions with flanking base pairs influences SNVs along A-tracts. However, this influence is complex and, for the most part, difficult to predict.

For G-tracts, most pairwise comparisons (18/34, 53%) indicated SNV variation according to sequence orientation (Supplementary Figure S5F and G). In hg19, the ratio of the numbers of G-tracts of lengths 5 and 7 for which the C-containing strand coincided with the forward sequence (downstream example sequence in Figure 1A) to the numbers of G-tracts for which the G-containing strand coincided with the forward sequence (upstream example sequence in Figure 1A) (C-containing/G-containing ratios) displayed a prominent 10.5-bp oscillation in phase with iSAT (Figure 5E), a measure of ‘inside’ and ‘outside’ bases, according to the bp step tilt parameter (39). Analysis of the helical path of a 146-bp DNA fragment wrapped around histones showed that the oscillation in the C-containing/G-containing ratios corresponds to a preference for guanine bases to face the protein core (Figure 5F). We analyzed the subset of G-tracts preceded by a 5′ C (i.e. CG[5]) to assess whether SNVs at G1, the position known to be mutable due to CpG methylation, also oscillated with the C-containing/G-containing ratios. Oscillation in SNV-C-containing/SNV-G-containing values was evident, with peaks aligning to the hg19 troughs (Figure 5E), implying that the cytosines facing the protein surface harbor more variants than those facing away. We conclude that A- and G-tracts display preferential positioning (the former) and orientation (the latter) along NCPs, which in turn modulate the rate of sequence variation.

Mutations associated with human disease

Knowing that the first and last As of long A-tracts and G2–3 in G-tracts are the major sites of SNV in the human population, we addressed whether these features are also discernible in mutated mononucleotide tracts associated with human genetic disease. We collected 9,450,456 unique SBSs (both SBSs and SNVs refer to single base changes) from sequenced cancer genomes and normalized the percent mutations along A- and G-tracts to enable a direct comparison with the 1KGP dataset. For A-tracts (Figure 6A and Supplementary Figure S6A), SBSs displayed the same trend as the 1KGP data (Figure 2A) with respect to the bell-shape increase in mutations at A1 and An and the mutation spectra, although the susceptibility to mutation as a function of tract length attained greater values (6.36% for length 11 in cancer versus 4.15% for length 9 in the 1KGP datasets at A1). The first and last 3 bp also harbored more SBSs than in the 1KGP dataset for tracts >7 bp, a feature that we found to be due exclusively to a large cancer dataset (42) containing high-level microsatellite instability (MSI) samples (Supplementary Figure S6B and C), which are known to result from mismatch-repair deficiency (15). Thus, A-tracts display similar patterns of base substitution between the germline and somatic cancer tissues. For G-tracts, mutation spectra were characterized by G→T transversions at tract lengths >7, particularly at G1, the most frequently mutated position for tract lengths up to 11 bp (Figure 6B and Supplementary Figure S6D). This trend persisted even when the high rates of methylation-mediated deamination mutations at the CG dinucleotide were removed (Supplementary Figure S6E). Thus, mutation patterns in cancer genomes contrast with those observed in the germline, both with respect to the most mutable position (G1 versus G2–3) and the types of base substitution (G→T in cancer genomes versus G→T and G→C in the germline).

Figure 6.

Mutation patterns in cancer genomes. (A) Mutation spectra for SBSs at A-tracts. Percent values were obtained by dividing the total number of SBSs at each position by the number of tracts in hg19 and then multiplying by 3.2516 to equalize the percentage of A-tracts of length 4 between the cancer genomes and the 1KGP datasets. (B) Mutation spectra for SBSs at G-tracts in cancer genomes. Percent values were obtained as in (A) using a multiplication factor of 3.7419. (C) Normalized fractions of A-tracts (closed circles) and G-tracts (open circles) displaying +/−1 bp slippage, obtained by dividing the number of events by both the number of tracts in hg19 and tract length. (D) Indels at A-tracts, calculated as described in Figure 3C. (E) Indels at G-tracts, calculated as described in Figure 3C. (F) Heatmap representation of insertions along G-tracts, as described in Figure 3E.
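The normalizations stated in the Figure 6 caption can be sketched as two small functions. The scaling factors 3.2516 and 3.7419 come from the caption; the function names, the explicit factor of 100 used to express a percentage, and the counts in the example are assumptions for illustration only.

```python
A_TRACT_SCALE = 3.2516  # factor quoted in the caption for A-tracts
G_TRACT_SCALE = 3.7419  # factor quoted in the caption for G-tracts


def normalized_percent(sbs_count, n_tracts, scale=1.0):
    """Percent of tracts with an SBS at one position, rescaled by a dataset factor."""
    if n_tracts <= 0:
        raise ValueError("n_tracts must be positive")
    return 100.0 * sbs_count / n_tracts * scale


def slippage_fraction(n_events, n_tracts, tract_length):
    """Slippage events divided by both the number of tracts and the tract length."""
    if n_tracts <= 0 or tract_length <= 0:
        raise ValueError("n_tracts and tract_length must be positive")
    return n_events / (n_tracts * tract_length)


# Hypothetical counts (not from the paper): 50 SBSs at A1 among 10,000 A-tracts.
pct_a1 = normalized_percent(50, 10_000, A_TRACT_SCALE)
# Hypothetical counts: 30 slippage events among 1,000 tracts of length 10.
frac = slippage_fraction(30, 1_000, 10)
print(pct_a1, frac)
```

Dividing slippage counts by tract length as well as by tract number makes fractions comparable across lengths, since longer tracts offer more positions at which a +/−1 bp event can occur.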

 With respect to slippage, the fractions for A-tracts elicited an excess at lengths 9 and 10 bp relative to the 1KGP dataset, which was also due to the MSI-containing dataset. For G-tracts, the fractions peaked at length 8, as for the 1KGP dataset (Figures 3A and 6C), implying that the propensity to undergo slippage is indistinguishable between the germline and soma. Indels were also more abundant at flanking base pairs than along the tracts (Figure 6D and E), particularly for G-tracts of length >7, similar to the 1KGP dataset (Figure 3C and D). Detailed analyses of insertions revealed that both G1 and the preceding position were the most significant sites of mutation (F-values up to 0.08 at G1 for tracts of length 8) (Figure 6F). Thus, the 5′ end of long G-tracts is the most susceptible site for both SBSs and insertions in cancer genomes, in contrast to the germline where these occur within the runs, typically at G2–3.

We also extracted the mutated A- and G-tracts from the Human Gene Mutation Database (HGMD), a collection of >150,000 germline gene mutations associated with human inherited disease. A total of 1519 genes were mutated at A- or G-tracts out of a total of 3972 (38%); 3480 SBSs and 2866 slippage events were noted within these tracts, 85 and 46% of which were predicted to be disease-causing, respectively (Figure 7A and Supplementary Table S1). Ranking genes by the number of literature reports indicated that among the top 10 entries three were associated with cancer (BRCA1, BRCA2 and APC), two with hemophilia (F8 and F9), four with debilitating lesions of the skin (COL7A1), muscle (DMD), lung (CFTR) and kidney (PKD1), and one with hypercholesterolemia (LDLR) (Figure 7B). Thus, mutations within A- and G-tracts carry a high social burden by contributing to some of the most common human pathological conditions.

Figure 7.

Mutation patterns in HGMD and model for sequence context-dependent changes. (A) Number of germline SBSs and slippage events (Slip.) at A- and G-tracts in HGMD. Gene alterations were classified as disease-causing mutation (DM), likely disease-causing mutation (DM?), disease-associated and putatively functional polymorphism (DFP), disease-associated polymorphism with additional supporting functional evidence (DP) and in vitro/laboratory or in vivo functional polymorphism (FP). Codon changes (SIFT predictor) were classified as damaging (d), null (n), tolerated (t) and low-confidence prediction (l). (B) The 10 most commonly reported genes in HGMD with mutations at A- and G-tracts. Various mutated tracts were generally reported for the same gene in different reports. (C) Mutation spectra for SBSs at A- (left) and G-tracts (right) in HGMD. Percent values were obtained by dividing the total number of SBSs at each position by the number of tracts in hg19 exons. A|G→T (black), A|G→C (red), A→G (green), G→A (cyan). (D) Normalized fractions of A-tracts (closed circles) and G-tracts (open circles) displaying +/−1 bp slippage, obtained by dividing the total number of events by the number of tracts in hg19 exons and by tract length. (E) Model for sequence context-dependent changes at A-tracts (left) and G-tracts (right). *, site of base modification.

 For both A- and G-tracts, SBSs occurred mostly at tract lengths of 4–7, with patterns more similar to those in the 1KGP than in the cancer datasets, both with respect to the location of the most mutable positions (first and last As and first/second Gs) and the types of base substitution (A→T and G→H) (Figure 7C and Supplementary Figure S6F). Likewise, slippage events peaked at tract lengths of 7–9 as observed in the 1KGP dataset (Figure 7D). In summary, the patterns of both SBSs and slippage in the HGMD dataset followed the trend observed in the 1KGP dataset, suggesting that germline variants at mononucleotide repeats leading to either population variation or human inherited disease may have arisen through similar mechanisms.
DISCUSSION

Why are specific A:T and G:C base pairs within A- and G-tracts more susceptible to sequence changes than their identical neighbors? For A-tracts, bp flexibility may play a role. Chemical damage to DNA, such as by hydroxyl radicals has been shown to be proportional to the geometrical solvent-accessible surface of the atomic groups, which increases with DNA flexibility (43). Along A-tracts flexibility is restricted, but it is high at both the 5′ and 3′ junctions. Thus, the fact that the highest rates of mutation coincide with the highest degree of flexibility at the 5′-TA-3′ bp step is consistent with the view that this position may be susceptible to DNA damage as a result of flexibility. Other sources of DNA dynamics are also likely to be relevant, such as sugar flexibility at the junctions, which increases with tract length (44). Chemical modification at these junctions may then lead to base substitution and indels, the latter as a result of strand breaks.

With respect to SNV mutation spectra, these were found mostly in the direction of flanking base composition above a length of 7–8 bp. We interpret this behavior in terms of DNA slippage along A-tracts when attempts are made during translesion synthesis (TLS) to bypass a damaged site (Figure 7Ei). Two scenarios may be considered to account for A→T transversions at A1. In the first, the last tract-template base would loop out into the polymerase active site, permitting base-pairing and strand elongation (Figure 7Eii) using the tract-flanking base as a template (34,45,46). In the second (Figure 7Eiii), slippage would occur behind the polymerase, prompting extension past the newly created A*:T mispair generated by primer/template misalignment. Either pathway would yield a common intermediate (Figure 7Eiv) that contains the base complementary to the junction across from the damaged site upon slippage resolution (34). Following DNA synthesis (S) and/or repair (R) (Figure 7Ev and vi), this mispair will generate a base change that is always identical to the tract-flanking base.

For G-tracts, the high rates of G→T transversions at G1 in cancer genomes are also consistent with preferred chemical attack at this site due to high flexibility (Figure 7F top). Direct chemical attack at a guanine is known to result in stable products, such as 8-oxo-G and Fapy-G, both of which are known to yield G→T transversions (47–50). Thus, G1 may be the most susceptible site for such reactions for G-tracts of lengths ≥7 (Figure 7F right), which in cancer genomes would become a mutation hotspot. In the germline, SNVs peaked inside G-tract base pairs, while mutational spectra were insensitive to flanking base composition; these events are inconsistent with a role for template misalignment and slippage as noted for A-tracts. Rather, the correspondence between hotspot mutations at G2–3 and G5 and the QM/MM simulations suggests a role for charge transfer. A large body of work during the past 20 years, using computational, theoretical chemistry and biophysical techniques on short oligonucleotides, has shown that guanine is the most easily oxidizable base in DNA and that a guanine radical cation can indeed be generated through long-range hole transfer from an oxidant via one-electron oxidation mechanisms (51–55). GGG triplets were found to act as the most effective traps in hole transfer by both experimental and theoretical work (56–59), demonstrating that the resulting guanine radical cation (or its neutral deprotonated form) became rather delocalized but was preferentially centered at the first and second G. These well-established patterns of chemical reactivity are consistent with our experimental observation of high mutation frequencies at G1 for short G-tracts and the results from QM/MM simulations on G6. For longer tracts, the downstream shift in mutation hotspots, i.e., G2–3 and G5, also correlates well with the charge localization predicted from QM/MM simulations, which explicitly included solvent effects and structural fluctuations. Thus, in conjunction with constrained density functional theory (60), both the neutral and oxidized forms of a guanine nucleobase can be reliably constructed to enable accurate determination of mutational patterns of mononucleotide repeats in human genomic DNA.

The compact organization of the sperm genome (61), and presumably low levels of oxidative stress in the germline, may enable guanine oxidation through one-electron oxidation reactions rather than by direct chemical attack, thereby favoring the formation of radical cations. A charge injected at G1 by electron loss would then migrate to neighboring guanines and localize at sites of low IP, such as G2 (Figure 7F left). Guanine radical cations are known to readily undergo further chemical modification leading to products such as 8-oxo-G, oxazolone, imidazolone, guanidinohydantoin, and spiroiminodihydantoin (62) (M in Figure 7F), to yield G→T, G→C and G→A substitutions (4,63). Our model is in line with recent observations in which mutations at guanines within short G-runs (1–4 bp) correlate with sequence-dependent IPs at the target guanine in cancer genomes (9). Interestingly, these correlations were not observed in the germline (9). We interpret these composite observations as follows. The IP values for G-runs have been shown to decrease asymptotically with tract length, although the absolute values vary according to the methods and assumptions used (we obtained a value of 5.43 eV for both G[6] and G[9]) (64,65). We suggest that short G-runs with high IPs undergo one-electron oxidation reactions in the oxidative environment of cancer cells but would be refractory to such a mechanism in the germline (Figure 7F right yellow and left white sectors). As length increases and IP values fall, G-runs would be attacked directly by oxidants abundant in tumor cells (Figure 7F orange sector), whereas oxidation will be limited to electron loss in the germline environment (Figure 7F left yellow sector).

These models (template misalignment for A-tracts and charge transfer for G-tracts) suggest a more complex scenario for mechanisms underlying mononucleotide repeat polymorphism in the human population than recently proposed (13), in which nucleotide misincorporation by error-prone polymerases is proposed as a primary source of mutations at both A- and G-tracts. As already stated, the directionality of SNVs toward tract-flanking bases in A-tracts and the hotspot mutations at G2–3 support multiple and distinct mechanisms of base substitution at mononucleotide repeats.

Our analyses highlight additional information, including the lack of mutations in the direction of tract-base composition for base pairs flanking long tracts, the association with gene expression and the preference of guanines for the inner NCP surface, and extend prior observations (12) such as the bell-shape character of base substitution and slippage, whose mechanisms remain to be fully clarified. Finally, we document the contribution of mononucleotide mutagenesis to key aspects of human pathology beyond the well-established MSI instability in cancer (15), including hemophilia and tissue degeneration. Our collective work supports the conclusion that as the human genome undergoes evolutionary diversification and along the way suffers disease-associated mutations, oxidation reactions including charge transfer may play a prominent role.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

 

 

Mutation analyses and prenatal diagnosis in families of X-linked severe combined immunodeficiency caused by IL2Rγ gene novel mutation


Genet. Mol. Res. 14 (2): 6164 – 6172   DOI: 10.4238/2015.June.9.2
Severe combined immunodeficiency diseases (SCIDs) are a group of primary immunodeficiency diseases characterized by a severe lack of T cells (or T cell dysfunction) caused by various gene abnormalities and accompanied by B cell dysfunction (WHO, 1992; Buckley et al., 1997). The incidence in infants is 1/75,000-1/100,000 (WHO, 1992), but no morbidity statistics are available for China. SCID follows 2 modes of inheritance, X-linked recessive and autosomal recessive. X-linked severe combined immunodeficiency (X-SCID) is the most common form, accounting for 50-60% of SCID cases (Noguchi et al., 1993). Immune system abnormalities in patients with X-SCID follow the T-B+NK- pattern, in which T cells (CD3+) and natural killer (NK) cells (CD16+/CD56+) are absent or significantly reduced, and the number of B cells (CD19+) is normal or increased, causing reduced immunoglobulin production and class switching disorder (Buckley, 2004; Fischer et al., 2005). Mutation of the IL2Rg gene has been confirmed to be a major cause of X-SCID (Noguchi et al., 1993). In recent years, great progress has been made in understanding the pathogenesis of primary immunodeficiency disease and in applying this knowledge to clinical treatment, particularly through the development of critical care medicine and immune reconstitution technology. With timely control of infection and early bone marrow or stem cell transplantation, X-SCID patients can be treated, prolonging survival time. Therefore, early diagnosis of X-SCID is very important for patient treatment, and gene diagnosis has become a preferred method for early and differential diagnosis. In addition, familial X-SCID places a great psychological burden on the relatives of patients. Ordinary chromosome analysis and immunological evaluation cannot be used for female carrier identification and fetal diagnosis, and gene diagnosis is the most effective method of carrier detection and prenatal diagnosis.
In this study, we detected mutations in 2 families with X-SCID and identified 2 novel mutations, confirming the X-SCID pedigrees. Prenatal diagnosis was performed for the pregnant fetus in the mother of one of the probands based on gene diagnosis. Female individuals in this family were subjected to carrier detection.
IL2Rg gene mutation test
Direct sequencing of exons 1-8 and the flanking regions of the IL2Rg gene by PCR in family 1 showed that the 3rd exon of the proband contained the c.361-363delGAG deletion mutation, which led to deletion of the 121st amino acid, glutamate (p.E121del), in its coding product. There were no sequence variations in other coding regions or in the splice regions. The proband’s mother carried the same mutation in the heterozygous state, while his father did not carry it (Figure 2a, b, c). This mutation was not observed in any member of the control group, and this family was identified as an X-SCID family. The c.510-511insGAACT heterozygous insertion mutation was present in the 4th exon of the proband’s mother in family 2. This mutation was a 5-base repeat of GAACT, resulting in a change of amino acid 173 from tryptophan to a stop codon (p.W173X). There were no sequence variations in other coding regions or in the splice regions, and the patient’s father did not carry the mutation (see Figure 2d, e). We did not find this mutation in the healthy control group. We presumed that the 4th exon of the deceased child in family 2 contained the c.510-511insGAACT insertion mutation, leading to X-SCID symptoms, and thus we speculated that this family was an X-SCID pedigree.
Prenatal diagnosis
We verified the chorionic villus status of the fetus in family 1 using the PowerPlex 16 HS System kit. The results showed that the fetal tissue contained no maternal contamination and that the fetus was female. The c.361-363delGAG (p.E121del) mutation was not detected in the female fetus of family 1.
Figure 2. Sequencing graph of the IL2Rg gene in 2 pedigrees with X-linked severe combined immunodeficiency. a.-c. Family 1. a. Normal control (rectangle indicates the 3 bases deleted in this patient). b. Proband carrying the c.361-363delGAG (p.E121del) mutation (arrow indicates the junction created by the deletion). c. The proband’s mother carried the c.361-363delGAG (p.E121del) heterozygous mutation (arrow). d.-e. Family 2. d. The proband’s mother carried the c.510-511insGAACT (p.W173X) heterozygous mutation (arrow indicates that the reverse sequencing graph was positive). e. Normal control (rectangular box indicates 2 normal copies of GAACT; the mutated fragment had 3 copies).
Carrier detection results
For the c.361-363delGAG (p.E121del) site, gene analysis of the female individuals in family 1 showed that I2 (the proband’s grandmother) was a heterozygous carrier and that II3 (the proband’s aunt) was a non-carrier.
IL-2 binds to the IL-2 receptor (IL-2R) on the immune cell membrane. IL-2R is composed of 3 subunits: the IL-2Ra chain (CD25), the IL-2Rb chain (CD122), and the IL-2Rg chain (CD132). The IL-2Rg chain is a functional unit shared with the receptors for IL-4, IL-7, IL-9, IL-15, IL-21, and other cytokines, and it is therefore referred to as the common chain (Li et al., 2000). The IL-2Rg chain maintains the integrity of the IL-2R complex and is required for the internalization of the IL-2/IL-2R complex; it also links the cytokine-binding region on the cell surface to downstream signal transduction molecules. Therefore, the integrity of the IL-2Rg chain is vital for the immune function of an organism (Malka et al., 2008; Shi et al., 2009).
Mutations in the IL2Rg gene, which encodes IL-2Rg, were identified as a major cause of X-SCID in 1993 (Noguchi et al., 1993). The IL2Rg gene is located on chromosome X q21.3-22, is 37.5 kb in length, and contains 8 exons, which encode the 369 amino acids of IL-2Rg. The IL-2Rg chain comprises several structural regions: the signal peptide [amino acids (AA) 1-22], extracellular domain (AA 23-262), transmembrane region (AA 263-283), and intracellular region (AA 284-369). The WSXWS motif is located in the extracellular region (AA 237-241), while Box 1 is located in the intracellular region (AA 286-294).
By the end of 2013, the Human Gene Mutation Database contained a total of 200 mutations in the IL2Rg gene (HGMD Professional 2013.4). The most common mutation types in the IL2Rg gene were missense or nonsense mutations, which result from single base changes; a total of 100 have been identified. These were followed by insertion or deletion mutations (50 in total) and then by splice mutations (approximately 30). All eight exons contained mutations, and the 3rd and 4th exons carried the most, together accounting for 43% (86/200) of all mutations. According to the X-SCID gene database (IL2RGbase) (http://research.nhgri.nih.gov/scid/), mutations in IL2Rg mainly occur in the extracellular region of the IL-2Rg chain (Fugmann et al., 1998). Zhang et al. (2013) reported that the IL2Rg gene mutations in 10 patients with X-SCID in China were located in the extracellular region. The two mutations reported in our study were also located in the extracellular region. The mutation in family 1 was a 3-base deletion that removed one codon in the 3rd exon. The c.361-363delGAG (p.E121del) mutation was located in the extracellular region of the IL-2Rg subunit, and we inferred that the deletion of glutamate 121 would alter the structure of the peptide chain, affecting signal transmission and resulting in serious symptoms. The mutation in family 2 was a GAACT repeat in the IL2Rg gene; this 5-base repeat changed codon 173 from tryptophan to a stop codon. The resulting peptide chain lacked 196 amino acids compared with the normal chain, including the intracellular, transmembrane, and some extracellular regions, directly affecting the structure and function of the receptor and causing disease. Neither of these 2 mutations has been reported previously. Combining the mutation characteristics with the clinical manifestations, we diagnosed family 1 as an X-SCID pedigree. Although the patients in family 2 were deceased, it can be speculated that the 2 deceased patients had X-SCID caused by c.510-511insGAACT (p.W173X).
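The positional reasoning above (which codon a coding-sequence position falls in, and which protein region a residue belongs to) can be checked with a short sketch. The domain boundaries and the protein length of 369 amino acids are those quoted in the text; the function names and the table layout are hypothetical and chosen only for illustration.

```python
IL2RG_PROTEIN_LENGTH = 369  # amino acids, as stated in the text

# Region boundaries (1-based, inclusive) as quoted in the text.
DOMAINS = [
    ("signal peptide", 1, 22),
    ("extracellular domain", 23, 262),
    ("transmembrane region", 263, 283),
    ("intracellular region", 284, 369),
]


def codon_of(cds_position):
    """Return the 1-based codon number containing a 1-based coding position."""
    if cds_position < 1:
        raise ValueError("coding positions are 1-based")
    return (cds_position - 1) // 3 + 1


def domain_of(residue):
    """Return the name of the region containing a 1-based residue number."""
    for name, start, end in DOMAINS:
        if start <= residue <= end:
            return name
    raise ValueError("residue outside the protein")


# c.361-363delGAG: all three deleted bases fall in a single codon.
deleted_codons = {codon_of(p) for p in (361, 362, 363)}
# c.510-511insGAACT: the insertion sits between two adjacent codons.
insertion_flank = (codon_of(510), codon_of(511))
# Residues downstream of the reported stop position, computed as in the text.
residues_lost = IL2RG_PROTEIN_LENGTH - 173

print(deleted_codons, insertion_flank, domain_of(121), domain_of(173), residues_lost)
```

Under these assumptions the deletion maps to codon 121 and the insertion lies at the boundary of codons 170 and 171, a few codons upstream of the stop reported at position 173; both residues 121 and 173 fall within the extracellular domain, consistent with the statement that both mutations affect that region.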
Prenatal diagnosis can accurately determine the status of the fetus and help avoid birth defects, which can also ease the anxiety of the pregnant mother. Gene diagnosis for pedigrees of patients based on DNA samples has advanced recently, particularly with the application of high-throughput sequencing technology (Alsina et al., 2013). We can now perform gene analysis for varied clinical infectious diseases for differential diagnosis. However, the effectiveness of prenatal diagnosis for pedigrees in which the proband is dead remains unclear. Because the gene mutations in the proband are unknown in these cases, the patient’s status can only be inferred from his mother’s genotype. Nevertheless, we considered that if the mother of a deceased proband can be shown to be a carrier of a pathogenic gene, then even if the proband did not have X-SCID, she is still at risk of having children with X-SCID, and the pedigree may show X-linked recessive inheritance. Prenatal diagnosis may offer these families a way to prevent the birth of affected children, on the premise of informed consent.
Gene diagnosis of IL2Rg can also be used for carrier detection of suspected females in the family.
In the present study, we performed carrier detection of the patient’s grandmother and aunt in family 1 and determined that the patient’s pathogenic mutation originated from his grandmother. His aunt did not inherit the pathogenic gene, and thus she was a non-carrier and her fertility will not be affected. In this study, we used direct sequencing of PCR products and identified IL2Rg gene mutations in 2 pedigrees with X-SCID. We found 2 unreported mutations in the IL2Rg gene, and prenatal diagnosis and carrier detection were conducted in 1 X-SCID family. Because the incidence rate of X-SCID is extremely low, it is difficult to promote the widespread use of genetic diagnosis. However, this study may have implications for the diagnosis of infants with immunodeficiency, and gene diagnosis techniques such as conventional or high-throughput sequencing should be used as early as possible during pregnancy to guide treatment. This method can also provide reliable prenatal diagnosis and carrier detection services for these families.
MEF2A gene mutations and susceptibility to coronary artery disease in the Chinese population
J. Li1 , H.-X. Chen2 , J.-G. Yang3 , W. Li3 , R. Du3 and L. Tian3       DOI http://dx.doi.org/10.4238/2014.October.20.15
Coronary artery disease (CAD) has high morbidity and mortality rates worldwide. Thus, the pathogenesis of CAD has long been the focus of medical studies. Myocyte enhancer factor 2A (MEF2A) was first discovered as a CAD-related gene by Wang (2005) and Wang et al. (2003, 2005). Three mutation sites in exon 7 of MEF2A were subsequently identified by Bhagavatula et al. (2004); however, Altshuler and Hirschhorn (2005) and Weng et al. (2005) predicted that the MEF2A gene lacked mutations. Zhou et al. (2006a,b) analyzed the mutations and polymorphisms in exons 7 and 11 of the MEF2A gene in the Han population in Beijing, and various rare mutations were found in exon 11 rather than in exon 7. The clinical significance of specific 21-bp deletions in MEF2A was also explored, and previous studies have shown mixed results. In this study, polymerase chain reaction-single-strand conformation polymorphism (PCR-SSCP) and DNA sequencing were used to examine exon 11 of the MEF2A gene in samples collected from 210 CAD patients and 190 healthy controls and to investigate the role of the MEF2A gene in CAD pathogenesis.
CAD, a common disease in China, is induced by multiple factors, such as genetics, the environment, and lifestyle. Thus, a multi-faceted approach is necessary in the study of CAD pathogenesis, particularly in molecular biology research, which is important for developing comprehensive treatment of CAD based on gene therapy. The MEF2A gene was first identified as a CAD-related gene through linkage analysis of a large family with CAD (9 of 13 patients developed MI) in 2003.
In this study, we found the following mutations: 1) codon 451G/T (147191) heterozygous or homozygous mutation; 2) loss of 1 (Q), 2 (QQ), 3 (QQP), 6 (425QQQQQQ430), and 7 (424QQQQQQQ430) amino acids (147108-147131); and 3) codon 435G/A (147143) heterozygous mutation. Among these mutations, the synonymous mutation at locus 147191 was confirmed by reference to the National Center for Biotechnology Information (NCBI) database to be a single nucleotide polymorphism, which was also demonstrated in our study by the extensive presence of this polymorphism in healthy controls. However, the heterozygous mutation at locus 147143 was only found in the genomes of CAD patients, and was therefore identified as a mutation.
Although MEF2A has been proposed as a CAD-related gene, the results of studies from different countries are conflicting. Weng et al. (2005) screened gene mutations in exon 11 of the MEF2A gene from 300 CAD patients and 1500 healthy controls. They hypothesized that the changes in 5-12 CAG repeats are genetic polymorphisms and that the 21-base deletion in exon 11 of the MEF2A gene did not induce autosomal dominant genetic CAD. Gonzalez et al. (2006) suggested that the CAG repeat polymorphism was independent of MI susceptibility in Spanish patients. Kajimoto et al. (2005) reported that the CAG repeat sequence was not correlated with MI susceptibility in Japanese patients. Horan et al. (2006) also found that the CAG repeat sequence was not associated with the susceptibility to early-onset familial CAD in an Irish population. Hsu et al. (2010) identified no correlation between the CAG repeat sequence and CAD susceptibility in the Taiwanese population. Dai et al. (2010) found that the structural change in exon 11 was not related to CAD in the Chinese Han population. Lieb et al. (2008) and Guella et al. (2009) hypothesized that MEF2A was independent of CAD. However, Yuan et al. (2006) and Han et al. (2007) suggested that the CAG repeat sequence was correlated with CAD because 9 CAG repeats was an independent predictor of CAD. Elhawari et al. (2010) and Maiolino et al. (2011) suggested that MEF2A is a susceptibility gene for CAD. Dai et al. (2013) showed that mutations in exon 12 are associated with the early onset of CAD in the Chinese population. Liu et al. (2012) failed to demonstrate a correlation between the CAG repeat sequence and CAD through case-control analysis, systematic review, and meta-analysis, but found that the 21-base deletion in exon 11 was strongly associated with CAD, and that genetic variation in MEF2A may be a relatively rare, but specific, pathogenic factor for CAD/MI. Kajimoto et al. (2005) reported 4-15 CAG repeats.
However, only 4-11 CAG repeats were observed in our study, possibly because of genetic differences in patients in this study. Eleven CAG repeats were observed in most samples from the control group, and the proportion of 10, 9, and 8 repeats exceeded 1%. The heterozygous mutation at 147143, as well as the 4 and 5 CAG repeats, was only observed in CAD patients. Thus, we speculated that the CAG repeat sequence is correlated with CAD susceptibility, and the presence of 4 or 5 repeats may be a risk factor for CAD, which was inconsistent with the results obtained by Han et al. (2007). The inconsistency in these results may be explained by the differences in subjects and sample sizes among studies.
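To make the notion of a CAG repeat count concrete, here is a minimal sketch that reports the longest uninterrupted run of CAG triplets in a DNA string. The function name and the example sequence are hypothetical, and the sketch ignores reading frame and interrupted repeats; it is not the genotyping procedure used in the study, which relied on PCR-SSCP and sequencing.

```python
import re


def longest_cag_run(sequence):
    """Return the number of repeats in the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)


# Hypothetical fragment with 9 consecutive CAG triplets (illustrative only).
example = "GCT" + "CAG" * 9 + "CCG"
repeat_count = longest_cag_run(example)
print(repeat_count)
```

A count obtained this way could then be compared against the ranges discussed in the text, such as the 4-11 repeats observed in this study.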
Impact of glucocerebrosidase mutations on motor and nonmotor complications in Parkinson’s disease

Homozygous and compound heterozygous mutations in GBA, which encodes glucocerebrosidase, lead to Gaucher disease (GD). A link between heterozygous GBA mutations and Parkinson’s disease (PD) has been suggested (Bembi et al., 2003, Goker-Alpan et al., 2004, Halperin et al., 2006, Machaczka et al., 1999, Neudorfer et al., 1996, Tayebi et al., 2001 and Tayebi et al., 2003). In 2009, a 16-center worldwide analysis of GBA revealed that heterozygous GBA mutation carriers have a strongly increased risk of PD (Sidransky et al., 2009).

In addition, heterozygous GBA mutations not only carry a risk for PD development but may also add to the burden of disease progression over the clinical course of PD. In cross-sectional analyses of GBA mutations in PD patients, earlier disease onset, increased cognitive impairment, a greater family history of PD, and more frequent pain were reported in patients with mutations, compared with those without mutations (Chahine et al., 2013, Clark et al., 2007, Gan-Or et al., 2008, Kresojevic et al., 2015, Lwin et al., 2004, Malec-Litwinowicz et al., 2014, Mitsui et al., 2009, Neumann et al., 2009, Nichols et al., 2009, Seto-Salvia et al., 2012, Sidransky et al., 2009, Swan and Saunders-Pullman, 2013 and Wang et al., 2012). Recently, a few prospective studies have investigated clinical features of PD with GBA mutations and showed a more rapid progression of motor impairment and cognitive decline in GBA mutation cases than in PD controls (Beavan et al., 2015, Brockmann et al., 2015 and Winder-Rhodes et al., 2013). However, no longitudinal studies have examined motor complications such as wearing-off and dyskinesia in PD with GBA mutations.

Here, we conducted a multicenter retrospective cohort analysis, and the data were investigated by survival time analysis to show the impact of GBA mutations on PD clinical course. We also investigated regional cerebral blood flow (rCBF) and cardiac sympathetic nerve degeneration of subjects with GBA mutations, compared with matched PD controls.

3.1. Subjects

Among the 224 eligible PD patients (the subjects were not related to each other), 9 subjects were excluded from the analysis (4 due to multiple system atrophy findings on subsequent brain MRI and 5 because of insufficient clinical information). Therefore, 215 PD patients [female, 52.1%; age, 66.7 ± 10.8 (mean ± standard deviation)] were analyzed. For non-PD healthy controls, 126 patients’ spouses (female, 58.7%; age, 67.3 ± 10.3) without a family history of PD or GD were enrolled.

3.2. GBA mutations and risk ratios for PD

In the PD subjects, we identified 10 nonsynonymous and 2 synonymous GBA variants. Of the nonsynonymous variants, 7 were previously reported in GD as GD-associated mutations [R120W, L444P-A456P-V460 (RecNciI), L444P, D409H, A384D, D380N, and 444L (1447-1466 del 20, insTG)]. Three nonsynonymous mutations have never been reported in GD patients [I(-20)V, I489V, and one novel mutation (Y11H)].

GD-associated GBA mutations were found in 19 of the 215 (8.8%) PD patients but in none of the healthy controls. The risk of PD development relative to these GD-associated mutations was estimated as an OR of 25.1 [95% confidence interval (CI), 1.50–420; p = 0.0001] with 0-cell correction. The nonsynonymous mutations that were not reported in GD patients had no association with PD development (p = 0.506; OR, 1.3; 95% CI, 0.7–2.6) (Table 1). Four subjects had double mutations. For subsequent analyses, the 2 subjects with double mutations of I(-20)V and K466K were assigned to the group of mutations unreported in GD, and the 2 subjects with double mutations of R120W and I(-20)V, and of R120W and L336L, were assigned to the group of GD-associated mutations.

Table 1. Frequency of glucocerebrosidase gene allele in Parkinson’s disease patients and controls

Allele name | PD (n = 215) | Controls (n = 126) | p | Odds ratio (95% CI)
GD-associated mutations
 R120W | 7a | 0 | 0.050 | 9.1 (0.5–160.8)
 RecNciI (L444P-A456P-V460) | 4 | 0
 L444P | 4 | 0
 D409H | 1 | 0
 A384D | 1 | 0
 D380N | 1 | 0
 444L (1447-1466 del 20, insTG) | 1 | 0
 Subtotal, n (%) | 19 (8.8%) | 0 (0%) | <0.001 | 25.1 (1.5–419.8)b
Nonsynonymous mutations not reported in GD
 I(-20)V | 27a | 13 | 0.603 | 1.3 (0.6–2.5)
 I489V | 3 | 0
 Y11Hc | 0 | 1
 Subtotal, n (%) | 30 (14.0%) | 14 (11.1%) | 0.506 | 1.3 (0.7–2.6)
Synonymous, n
 K466K | 2a | 1
 L336L | 1a | 0
Allele names refer to the processed protein (excluding the 39-residue signal peptide).

Key: CI, confidence interval; GD, Gaucher disease; PD, Parkinson’s disease.

a Four subjects had double mutations; 2 of I(-20)V and K466K, 1 of I(-20)V and R120W, and 1 of R120W and L336L.
b Odds ratio was calculated by adding 0.5 to each value.
c Novel mutation.
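
The zero-cell correction in footnote b can be illustrated with a short calculation. The sketch below adds 0.5 to every cell of the 2×2 table (often called the Haldane-Anscombe correction) and builds a Wald interval on the log odds ratio. The interval method is an assumption, since the source does not state how the 95% CI was obtained, and the function name is hypothetical.

```python
import math

def odds_ratio_with_correction(case_carriers, case_total,
                               control_carriers, control_total, z=1.96):
    """Odds ratio with 0.5 added to each cell, plus a Wald CI on the log scale.

    Assumption: the CI method (log-scale Wald interval) is not stated in the
    source; it is used here only as a plausible illustration.
    """
    a = case_carriers + 0.5                        # carriers among cases
    b = (case_total - case_carriers) + 0.5         # non-carriers among cases
    c = control_carriers + 0.5                     # carriers among controls
    d = (control_total - control_carriers) + 0.5   # non-carriers among controls
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Counts taken from Table 1: 19 of 215 PD patients vs. 0 of 126 controls.
gd_or, gd_lower, gd_upper = odds_ratio_with_correction(19, 215, 0, 126)
# R120W alone: 7 of 215 PD patients vs. 0 of 126 controls.
r120w_or, r120w_lower, r120w_upper = odds_ratio_with_correction(7, 215, 0, 126)
print(f"GD-associated: OR {gd_or:.1f} ({gd_lower:.1f}-{gd_upper:.1f})")
print(f"R120W: OR {r120w_or:.1f} ({r120w_lower:.1f}-{r120w_upper:.1f})")
```

Under these assumptions the point estimates round to the tabulated 25.1 and 9.1, and the interval bounds land close to the published ones. The very wide intervals follow directly from the empty control cell.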
3.3. Clinical features of PD patients by GBA mutation groups

The clinical features of PD patients with GD-associated mutations, those with mutations unreported in GD, and those without mutations are shown in Table 2. In the GD-associated mutation group, females, those with a family history, and those with dementia (DSM IV) were significantly more frequent than in the no-mutation group (p = 0.047, 0.012, and 0.020, respectively). The age of PD onset was lower in patients with GD-associated mutations (55.2 ± 9.9 years, mean ± standard deviation), compared with those without mutations (59.3 ± 11.5), although the difference was not statistically significant. There were no differences in clinical manifestations between subjects with mutations unreported in GD and those without mutations, except for dopamine agonist dosage (p = 0.026) (Table 2).

Table 2. Epidemiological and clinical features of PD patients with Gaucher disease–associated GBA mutations, those with mutations previously unreported in GD and those without mutations

Total n = 215: mutation (-), n = 167; GD-associated mutations, n = 19a; mutations unreported in GD, n = 29c

Variable | Measure | Mutation (-) | GD-associated mutations | pb | Mutations unreported in GD | pd
Sex | Female, n (%) | 83 (49.7) | 14 (73.7) | 0.047 | 15 (51.7) | ns
Age | Mean (SD) | 67.0 (10.8) | 62.2 (10.7) | 0.063e | 67.5 (11.2) | nsf
Disease duration (y) | Mean (SD) | 7.7 (5.5) | 6.9 (4.6) | nsf | 7.2 (4.9) | nsf
Onset age | Mean (SD) | 59.3 (11.5) | 55.2 (9.9) | ns | 60.3 (11.8) | ns
Family history | Yes, n (%) | 17 (11.0)g | 6 (31.6) | 0.012 | 0 (0.0) | ns
Dementia (DSM-IV) | Yes, n (%) | 29 (17.4) | 9 (47.4) | 0.020 | 5 (17.2) | ns
MMSE | Mean (SD) | 25.8 (5.4)h | 23.3 (7.7) | nsf | 27.0 (3.4)i | nsf
Onset symptom (tremor vs. others) | Tremor, n (%) | 78 (46.8) | 9 (47.4) | ns | 15 (51.7) | ns
Modified H-Y on (<3 vs. ≥3) | ≥3, n (%) | 82 (49.1) | 14 (73.7) | 0.042 | 16 (55.2) | ns
UPDRS part 3 | Mean (SD) | 23.6 (12.2)j | 28.5 (13.8) | nsf | 21.9 (8.7) | nsf
Wearing off | Yes, n (%) | 70 (41.9) | 9 (47.4) | ns | 13 (44.8) | ns
Dyskinesia | Yes, n (%) | 49 (29.3) | 8 (42.1) | ns | 8 (27.6) | ns
Mood disorder | Yes, n (%) | 43 (25.7) | 8 (42.1) | ns | 7 (24.1) | ns
Orthostatic hypotension symptom | Yes, n (%) | 21 (12.6) | 5 (26.3) | ns | 7 (24.1) | ns
Psychosis history | Yes, n (%) | 59 (35.3) | 10 (52.6) | ns | 7 (24.1) | ns
ICD history | Yes, n (%) | 8 (4.8) | 1 (5.3) | ns | 1 (3.4) | ns
Stereotactic brain surgery for PD | Yes, n (%) | 4 (2.4) | 0 (0.0) | ns | 0 (0.0) | ns
Agonist LED mg/d | Mean (SD) | 92.8 (114.2) | 72.1 (137.7) | nse | 163.7 (155.6) | 0.026e
Levodopa LED mg/d | Mean (SD) | 400.7 (184.2) | 456.7 (206.9) | nsf | 369.2 (230.3) | nse
Total LED mg/d | Mean (SD) | 496.4 (233.7) | 537.9 (258.9) | nsf | 525.7 (287.4) | nsf
Categorical data were examined by Fisher’s exact test.

Key: DSM-IV, Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition; GBA, glucocerebrosidase gene; GD, Gaucher disease; H-Y, Hoehn and Yahr; ICD, impulse control disorder; LED, levodopa equivalent dose; ns, not significant; MMSE, Mini-Mental State Examination; PD, Parkinson’s disease; SD, standard deviation; UPDRS, Unified Parkinson’s Disease Rating Scale.

a Including a double-mutation subject (with a mutation unreported in GD).
b GD-associated mutations versus mutation (-).
c Two subjects with double mutation, including GD-associated mutations, were assigned to GD-associated mutation group.
d Other mutations versus mutation (-).
e Examined by Student t test after Levene’s test for equality of variances.
f Examined by Mann-Whitney U-test because of non-Gaussian distribution.
g n = 155 due to 10 missing data.
h n = 164 due to 3 missing data.
i n = 28 due to 1 missing datum.
j n = 165 due to 2 missing data.

3.4. Survival time analyses to develop dementia, psychosis, dyskinesia, and wearing-off

Time to develop clinical outcomes (dementia, psychosis, dyskinesia, and wearing-off) was compared in 19 subjects with GD-associated mutations, 29 with mutations unreported in GD, and 167 without mutation. The median observation time was 6.0 years. The subjects with GD-associated mutations showed a significantly earlier development of dementia and psychosis, compared with subjects without mutation (p < 0.001 and p = 0.017) (Supplementary Table e-1, Fig. 1A and B). We re-reviewed the clinical record of the subject who showed early dementia (defined by DSM IV) (Fig. 1A) and confirmed that the case did not satisfy the criteria for dementia with Lewy bodies (DLB) (McKeith et al., 2005).

Fig. 1.

Kaplan–Meier curves of dementia and psychosis in Parkinson’s disease (PD) patients with Gaucher disease (GD)-associated glucocerebrosidase gene (GBA) mutations and those without mutations. PD patients with GD-associated GBA mutations and those without GBA mutations were compared to investigate the time taken to develop dementia (A) and psychosis (B). Because of insufficient information in several patients, the numbers in each analysis were different. The patients with and without mutations were 17 and 165 (A), 18 and 165 (B) against a total of 19 and 167. DSM IV, Diagnostic and Statistical Manual of Mental Disorders, revised fourth edition. p-Values were calculated by log-rank tests.

The associations between GBA mutations and these symptoms were estimated as HRs, adjusting for sex and age at PD onset. HRs were 8.3 for dementia (95% CI, 3.3–20.9; p < 0.001) and 3.1 for psychosis (95% CI, 1.5–6.4; p = 0.002). The time until development of wearing-off and dyskinesia did not differ significantly, with HRs of 1.5 (95% CI, 0.8–3.1; p = 0.219) and 1.9 (95% CI, 0.9–4.1; p = 0.086), respectively (Table 3).

Table 3. Hazard ratios of GBA pathogenic mutations for clinical symptoms

Model | Clinical feature | Hazard ratio | 95% CI | p
1 | Dementia (DSM-IV) | 8.3 | 3.3–20.9 | <0.001
2 | Psychosis | 3.1 | 1.5–6.4 | 0.002
3 | Wearing-off | 1.5 | 0.8–3.1 | 0.219
4 | Dyskinesia | 1.9 | 0.9–4.1 | 0.086
Each model was adjusted for sex and age at onset.

Key: CI, confidence interval; DSM-IV, Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition; GBA, glucocerebrosidase.
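
As a rough consistency check on Table 3, a p-value can be approximated from a reported hazard ratio and its 95% CI. The sketch below recovers the standard error of log(HR) from the interval width and then applies a two-sided Wald test. This is not the authors' procedure; it assumes a symmetric interval on the log scale, and because the published figures are rounded, the results only approximate the tabulated p-values. The function name is hypothetical.

```python
import math

def wald_p_from_ratio_ci(ratio, ci_lower, ci_upper, z_crit=1.959964):
    """Approximate two-sided p-value from a ratio estimate and its 95% CI.

    Assumes the CI is symmetric on the log scale (Wald interval).
    """
    se_log = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z_crit)
    z = math.log(ratio) / se_log
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Values taken from Table 3 (rounded as published).
table3 = {
    "Dementia (DSM-IV)": (8.3, 3.3, 20.9),
    "Psychosis": (3.1, 1.5, 6.4),
    "Wearing-off": (1.5, 0.8, 3.1),
    "Dyskinesia": (1.9, 0.9, 4.1),
}
approx_p = {name: wald_p_from_ratio_ci(*vals)[1] for name, vals in table3.items()}
for name, p in approx_p.items():
    print(f"{name}: approximate p = {p:.3f}")
```

The same check is a quick way to spot transcription slips: a hazard ratio that lies outside its own confidence interval cannot be correct.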

Subjects with mutations unreported in GD did not show significant differences in time to develop any of the 4 outcomes, compared with subjects without mutations. Therefore, subjects with GD-unreported mutations were regarded as subjects without GBA mutations in further analyses.

3.5. rCBF on SPECT in patients with GD-associated GBA mutations

We conducted pixel-by-pixel comparisons of rCBF on SPECT between PD subjects with mutations (cases) and sex-, age-, and disease duration-matched PD subjects without any mutations in GBA (controls). Four controls were matched to each case (except for a 34-year-old female case who was matched to a single control), and in total 12 cases [female 50%; age at SPECT, mean ± standard error (SE), 58.9 ± 3.3 years; disease duration at SPECT, 7.3 ± 1.5 years] and 45 controls (female 64.4%; age at SPECT, mean ± SE, 61.0 ± 1.3 years; disease duration at SPECT, 7.1 ± 0.7 years) were analyzed. A significantly lower rCBF was seen in the cases compared with the controls in the bilateral parietal cortex, including the precuneus (Fig. 2).

Fig. 2.

Regional cerebral blood flow in the group with GD-associated mutations compared with the matched Parkinson’s disease group without mutations. Regions with lower regional cerebral blood flow in the group with GD-associated mutations displayed on an anatomic reference map. Abbreviation: GD, Gaucher disease.

3.6. H/M ratios on MIBG scintigraphy in patients with GD-associated GBA mutations

Cardiac MIBG scintigraphy visualizes catecholaminergic terminals in vivo, which are reduced in PD patients, as are brain dopaminergic neurons. We also compared MIBG scintigraphy between 16 cases (female 68.8%; age at examination, mean ± SE, 60.2 ± 2.6 years; disease duration at examination, 6.2 ± 1.2 years) and 61 sex-, age-, and disease duration-matched controls (female 63.8%; age, 62.0 ± 1.1 years; disease duration, 5.5 ± 0.6 years), matched 1:4 except for one 34-year-old female case who was matched to a single control. Both early and late H/M ratios were reduced in both groups and did not differ significantly between them (p = 0.309 and 0.244) (Supplementary Table e-2).

4. Discussion

4.1. Contributions of GD-associated GBA mutations to the development of PD

In the analysis of 215 PD patients and 126 non-PD controls, we identified 10 nonsynonymous heterozygous GBA mutations, including 1 novel mutation. Among these mutations, 7 were GD-associated, and the patients carrying these mutations represented 8.8% of the PD cohort. No significant association was found between the GD-unreported mutations and PD development, which suggests that only the GD-associated mutations are a genetic risk for PD. According to a worldwide multicenter analysis of 1883 fully sequenced PD patients, GD-associated mutations are found in 7% of non-Ashkenazi Jewish PD patients (Sidransky et al., 2009). Although the mutation frequency in the present study was similar to previous results, the OR of GD-associated heterozygous mutations (25.1) was considerably greater than the OR (5.43) of other ethnic cohorts (Sidransky et al., 2009) and was consistent with an OR of 28.0 from a previous Japanese report (Mitsui et al., 2009). These results, taken together, suggest the possibility that GBA mutations confer a distinct risk for PD in the Japanese population. However, a larger Japanese cohort study is required to confirm this.

4.2. Cross-sectional clinical figures of PD with GBA mutations

Before the survival time analyses, we compared clinical features at enrollment between mutation groups. The lower onset age, more frequent family history and dementia, and worse disease severity of PD in patients with GBA mutations, compared with those without mutations, were consistent with previous cross-sectional case-control reports (Anheim et al., 2012, Brockmann et al., 2011, Chahine et al., 2013, Lesage et al., 2011, Li et al., 2013, Mitsui et al., 2009, Neumann et al., 2009, Seto-Salvia et al., 2012 and Sidransky et al., 2009). In contrast, the female predominance (73.7%, p = 0.047) in patients with mutations observed in the present study is inconsistent with previous reports (Neumann et al., 2009 and Seto-Salvia et al., 2012).

4.3. Impact of GBA mutations on the clinical course of PD

To investigate the impact of GBA mutations on the clinical course of PD, a prospective study over a long period is preferred. Although there have been few longitudinally designed studies to date, follow-up clinical data for a median of 6 years of 121 PD cases from a community-based incident cohort were recently reanalyzed; the results demonstrate that progression to dementia defined by DSM IV (HR 5.7) and to Hoehn and Yahr stage 3 (HR 3.2) is significantly earlier in 4 GBA mutation-carrier patients compared with 117 patients with wild-type GBA (Winder-Rhodes et al., 2013). A 2-year follow-up clinical report of 28 heterozygous GBA carriers who were recruited from relatives of GD patients shows slight but significant deterioration of cognition and olfaction, compared with healthy controls (Beavan et al., 2015). Brockmann et al. (2015) assessed motor and nonmotor symptoms, including cognitive and mood disturbances, for 3 years in 20 PD patients with GBA mutations and showed a more rapid progression of motor impairment and cognitive decline in GBA mutation cases compared with sporadic PD controls. The current long-term retrospective cohort study, with follow-up of up to 12 years, reinforced these results. It revealed that dementia and psychosis developed significantly earlier in subjects with GD-associated mutations compared with those without mutation, and the HRs of GBA mutations were estimated at 8.3 for dementia and 3.1 for psychosis, with adjustments for sex and PD onset age. In contrast, the results showed no significant difference in developing wearing-off and dyskinesia.

In this study, we also investigated whether GD-unreported mutations affected the clinical course of PD. In both cross-sectional and survival time analyses, the mutations unreported in GD carried no increased burden on clinical symptoms such as dementia, psychosis, wearing-off, and dyskinesia.

4.4. Reduced rCBF in PD with GBA mutations compared with matched PD controls

We found a significantly decreased rCBF, reflecting decreased synaptic activity, in the bilateral parietal cortex including the precuneus, in subjects with GD-associated mutations compared with matched subjects without mutations. The pattern of reduced rCBF was very similar to the pattern on H₂¹⁵O positron-emission tomography that Goker-Alpan et al. (2012) reported, showing decreased resting rCBF in the lateral parietal association cortex and the precuneus bilaterally in GD subjects with parkinsonism (7 subjects with homozygous or compound heterozygous GBA mutations), compared with 11 PD patients without GBA mutations. These results suggest that PD patients with heterozygous GBA mutations and GD patients presenting with parkinsonism share a common pattern of reduced rCBF. Interestingly, in their study, rCBF in the precuneus, but not in the lateral parietal cortex, correlated with IQ, suggesting that the involvement of the precuneus is critical for defining GBA-associated patterns.

4.5. Reduced cardiac MIBG H/M ratios as well as matched PD controls

We also showed that cardiac MIBG H/M ratios in subjects with GD-associated mutations were lower than the cutoff point for PD discrimination (Sawada et al., 2009), suggesting that postganglionic sympathetic nerve terminals to the epicardium were denervated, as in PD without mutations.

4.6. Mechanisms of impact on PD clinical course by GD-associated GBA mutations

Experimental studies suggesting a bidirectional pathogenic loop between α-synuclein and glucocerebrosidase have accumulated (Fishbein et al., 2014, Gegg et al., 2012, Mazzulli et al., 2011, Noelker et al., 2015, Schondorf et al., 2014 and Uemura et al., 2015). Loss of glucocerebrosidase function compromises α-synuclein degradation in lysosomes, whereas aggregated α-synuclein inhibits the normal lysosomal function of glucocerebrosidase. The pathogenic loop may facilitate neurodegeneration in the GD-associated PD brain, resulting in early development of dementia or psychosis as shown in the present study. Several recent studies propose that a mechanism similar to that in PD with GBA mutations exists even in the idiopathic PD brain (Alcalay et al., 2015, Chiasserini et al., 2015, Gegg et al., 2012 and Murphy et al., 2014). On the other hand, the impact of GD-associated GBA mutations on the development of motor complications such as wearing-off and dyskinesia was not statistically significant, suggesting that other pathophysiological mechanisms in the striatal circuit emerge after long-term therapy, especially with l-dopa.

4.7. Limitations

Our study has several limitations. In the design of the study, we assumed a sample size of 215 PD patients for survival time analyses and investigated 224 PD patients. We assumed that the mutation prevalence would be 9.4%, and in fact, we found 19 patients with mutations (8.5%) among the 224 patients. Based on these figures, we estimated the risk ratios of heterozygous GBA mutations for the risk of PD development and PD clinical symptoms as ORs in the cross-sectional multivariate analyses, although the 95% CIs were broad. A larger number of subjects will be needed to determine robust risk ratios.

Comprehensive Genetic Characterization of a Spanish Brugada Syndrome Cohort

PLOS   Published: July 14, 2015   DOI: http://dx.doi.org:/10.1371/journal.pone.0132888

Brugada syndrome (BrS) was identified as a new clinical entity in 1992 [1]. Six years later, the first genetic basis for the disease was identified, with the discovery of genetic variations in SCN5A [2]. Nowadays, more than 300 pathogenic variations in this first gene are known to be associated with BrS [3]. SCN5A encodes the α subunit of the cardiac voltage-dependent sodium channel (Nav1.5), which is responsible for inward sodium current (INa), and thus plays an essential role in phase 0 of the cardiac action potential (AP). Genetic variations in this gene can explain around 20–25% of BrS cases [3].

Since BrS was classified as a genetic disease, several other genes have been described to confer BrS-susceptibility [4–7]. Pathogenic variations have been mainly described in: 1) genes encoding proteins that modulate Nav1.5 function, and 2) other calcium and potassium channels and their regulatory subunits. All these proteins participate, either directly or indirectly, in the development of the cardiac AP. Although the incidence of pathogenic variations in these BrS-associated genes is low [6], it is considered that, taken together, they could provide a genetic diagnosis for up to an extra 5–10% of BrS cases. Hence, altogether, a genetic diagnosis can be achieved in approximately 35% of clinically diagnosed BrS patients.
Other types of genetic abnormalities have been suggested to explain the remaining percentage of undiagnosed patients. Indeed, multiplex ligation-dependent probe amplification (MLPA) has allowed the detection of large-scale gene rearrangements involving one or several exons of SCN5A in BrS cases. However, the low proportion of BrS patients carrying large genetic imbalances identified to date suggests that this type of rearrangement will provide a genetic diagnosis for only a modest percentage of BrS cases [8–10].
BrS has been associated with an increased risk of sudden cardiac death (SCD), despite the reported variability in disease penetrance and expressivity [11]. The prevalence of BrS is estimated at about 1.34 cases per 100 000 individuals per year, with a higher incidence in Asia than in the United States and Europe [12]. However, the dynamic nature of the typical electrocardiogram (ECG) and the fact that it is often concealed hinder the diagnosis of BrS. Therefore, exhaustive genetic testing and subsequent family screening may prove to be crucial in identifying silent carriers. A large percentage of these pathogenic variation carriers are clinically asymptomatic, and may be at risk of SCD, which is sometimes the first manifestation of the disease [13].
In the present work, we aimed to determine the spectrum and prevalence of genetic variations in BrS-susceptibility genes in a Spanish cohort diagnosed with BrS, and to identify variation carriers among relatives, which would enable the adoption of preventive measures to avoid SCD in their families.

Results  
Study population 

Table 1. Demographics of the 55 Spanish BrS patients included in the study.

The table shows the demographic characteristics of all the patients included in the study. Numbers in parentheses represent the relative percentages for each condition. T1 ECG refers to Type 1 BrS diagnostic electrocardiogram (ECG), obtained either spontaneously, or after drug challenge. The information regarding both the electrophysiological studies (EPS) and the treatment was not available for all the patients. Two of the patients that didn’t receive any treatment died, and were not taken into account for the calculations of percentages (+2 dead). ICD, intracardiac cardioverter defibrillator.

http://dx.doi.org:/10.1371/journal.pone.0132888.t001

Table 2. Characteristics of the Spanish BrS patients carrying rare genetic variations.

The table shows the clinical characteristics of the probands who carried rare genetic variations in SCN5A, SCN2B, or RANGRF. All of them are potentially pathogenic except that found in RANGRF, which is of unknown significance (see discussion). All the potentially pathogenic variations (PPVs) that had been previously reported, except p.P1725L and p.R1898C, had been identified in BrS patients. p.P1725L had been associated with Long QT Syndrome and p.R1898C was found in Exome Variant Server with a MAF of 0.0079%. No rare variations were identified in the control population. Patient’s age is expressed in years. Bold identifies the patients carrying variations that had not been described previously. M, male; F, female; S, syncope; ICD, intracardiac cardioverter defibrillator; UK, unknown; EPS, electrophysiological studies (+, positive response;-, negative response; N/P, not performed). The two patients who carried two PPVs each are identified by a and b, respectively.

http://dx.doi.org:/10.1371/journal.pone.0132888.t002

Sequencing of genes associated with BrS

We performed a genetic screening of 14 genes (SCN5A, CACNA1C, CACNB2, GPD1L, SCN1B, SCN2B, SCN3B, SCN4B, KCNE3, RANGRF, HCN4, KCNJ8, KCND3, and KCNE1L), which allowed the identification of 61 genetic variations in our cohort. Of these, 20 were classified as potentially pathogenic variations (PPVs), one as a variation of unknown significance, and 40 as common or synonymous variants considered benign.

The 20 PPVs were found in 18 of the 55 patients (32.7% of the patients, 83.3% males; Table 2). Sixteen patients (88.9%) carried one PPV, and two patients (11.1%) carried two different PPVs each. Nineteen out of the 20 PPVs identified were localized in SCN5A and one in SCN2B.

The majority of the PPVs identified were missense (70%). We also detected 2 nonsense variations (10%), 3 insertions or deletions causing frameshifts (15%), and one splicing variation (5%). The three frameshifts (p.R569Pfs*151, p.E625Rfs*95 and p.R1623Efs*7) were identified in SCN5A. These were not found in any of the databases consulted (see Methods), and were thus considered potentially pathogenic (see below). The other 16 rare variations identified in SCN5A had been previously described, and hence were also considered potentially pathogenic. Fourteen of them had been identified in BrS patients. Of these, 6 had also been identified in individuals diagnosed with other cardiac electric diseases (i.e. Sick Sinus Syndrome, Long QT Syndrome, Sudden Unexplained Nocturnal Death Syndrome or Idiopathic Ventricular Fibrillation [2,15,16,20,21,25]). The other 2, p.P1725L and p.R1898C, had only been associated with Long QT Syndrome or found in Exome Variant Server with a MAF of 0.0079%, respectively. Furthermore, we identified a variation in SCN2B (c.632A>G in exon 4 of the gene, resulting in p.D211G) which was considered pathogenic. This patient was included within our cohort, but the functional characterization of channels expressing SCN2B p.D211G was the object of a previous study from our group [7]. We also identified a nonsense variation in RANGRF which has been formerly reported as a rare genetic variation of unknown significance [29].
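
The counts and percentages quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid: the missense count of 14 is inferred from the stated 70% of 20 PPVs rather than given explicitly in the text, and all variable names are hypothetical.

```python
# PPV counts by type; 14 missense is inferred from "70%" of 20 PPVs (assumption).
ppv_counts = {"missense": 14, "nonsense": 2, "frameshift": 3, "splicing": 1}
total_ppvs = sum(ppv_counts.values())

def percent(part, whole):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * part / whole, 1)

type_shares = {kind: percent(n, total_ppvs) for kind, n in ppv_counts.items()}

carrier_yield = percent(18, 55)      # patients carrying at least one PPV
single_ppv_share = percent(16, 18)   # carriers with exactly one PPV
double_ppv_share = percent(2, 18)    # carriers with two different PPVs
ppvs_from_carriers = 16 * 1 + 2 * 2  # should equal the total number of PPVs

print(total_ppvs, type_shares)
print(carrier_yield, single_ppv_share, double_ppv_share, ppvs_from_carriers)
```

Under the stated assumption, the totals agree with the text: 20 PPVs in 18 of 55 patients, with 16 single carriers and 2 double carriers.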

Additionally, we screened the relatives of those probands carrying a PPV. We analysed a total of 129 relatives, 69 of whom (53.5%) were variation carriers. Genotype-phenotype correlations showed that 8 of the families displayed complete penetrance (S3 Table). No relatives were available for one of the probands carrying a PPV, which prevented genotype-phenotype correlation assessment. The other 12 families showed incomplete penetrance.

 

MLPA analysis

The 37 patients with negative results after the genetic screening of the 14 BrS-associated genes underwent MLPA analyses of SCN5A. This technique did not reveal any large exon deletion or duplication in this gene for any of the patients.

 

SCN5A p.R569Pfs*151 (c.1705dupC), a novel PPV

A 41-year-old asymptomatic male presented a type 3 BrS ECG, which was suggestive of BrS. Flecainide challenge unmasked a type 1 BrS ECG (Fig 1A, left), which was also occasionally observed spontaneously during medical follow-up. Sequencing of SCN5A revealed a duplication of a cytosine at position 1705 (c.1705dupC; Fig 1A, right), which caused a frameshift that led to a truncated Nav1.5 channel (p.R569Pfs*151). The proband’s sister also carried this duplication, but had never presented signs of arrhythmogenesis. The proband’s twin daughters were also variation carriers, displayed normal ECGs and, to date, are asymptomatic (Fig 1A, middle). Thus, p.R569Pfs*151 represents a novel genetic alteration in the Nav1.5 channel that could potentially lead to BrS, but with incomplete penetrance.

Fig 1. Characteristics of the probands carrying non-reported potentially pathogenic variations (PPVs) in SCN5A and their families.

Left: Electrocardiograms of the probands: (A) patient carrying the p.R569Pfs*151 variation, showing the ST elevation characteristic of BrS in V1 at the time of the flecainide test; (B) patient carrying the p.E625Rfs*95 variation, showing the spontaneous ST elevation characteristic of BrS in V1 and V2; and (C) patient carrying the p.R1623Efs*7 variation, showing the spontaneous ST elevation characteristic of BrS in V1 and V2. Middle: Family pedigrees. Open symbols designate clinically normal subjects, filled symbols mark clinically affected individuals and question marks identify subjects without an available clinical diagnosis. Plus signs indicate the carriers of the PPVs and minus signs, non-carriers. The crosses mark deceased individuals and arrows identify the proband. Right: Detail of the electropherograms obtained after SCN5A sequence analysis of a control subject (left panels) and of the probands (right panels).

http://dx.doi.org:/10.1371/journal.pone.0132888.g001

SCN5A p.E625Rfs*95 (c.1872dupA), a novel PPV

A 51-year-old asymptomatic male was diagnosed with BrS because he presented a spontaneous ST segment elevation in leads V1 and V2 characteristic of a type 1 BrS ECG (Fig 1B, left). The sequencing of SCN5A revealed an adenine duplication at position 1872 (c.1872dupA, Fig 1B, right). This genetic variation results in a truncated Nav1.5 channel (p.E625Rfs*95). The genetic analysis of the proband’s relatives showed that only his mother carried the variation (Fig 1B, middle). She was asymptomatic, but a BrS ECG was unmasked upon ajmaline challenge. The proband’s sister was found dead in her crib at 6 months of age, which suggests that her death might be compatible with BrS. Therefore, the p.E625Rfs*95 variation in the Nav1.5 channel represents a novel genetic alteration potentially causing BrS.

SCN5A p.R1623Efs*7 (c.4867delC), a novel PPV

The proband, a 31-year-old male, was admitted to hospital after suffering syncope. His baseline 12-lead ECG showed an ST segment elevation in leads V1 and V2 that strongly suggested BrS type 1 (Fig 1C, left). A deletion of the cytosine at position 4867 (c.4867delC) was observed upon SCN5A sequencing (Fig 1C, right). This base deletion leads to a frameshift that produces a truncated Nav1.5 channel (p.R1623Efs*7). Genetic screening of his parents and sisters showed that none of them carried this novel variation (Fig 1C, middle). None of them had presented any signs of arrhythmogenicity, nor had a BrS ECG. Nevertheless, in utero genetic analysis of one of his daughters showed that she had inherited the variation. She died at 1 year of age of non-arrhythmogenic causes. Hence, the p.R1623Efs*7 variation in the Nav1.5 channel is a novel genetic alteration that originated de novo in the proband and could potentially lead to BrS.

Synonymous and common genetic variations portrayal

In our cohort, we identified 40 single nucleotide variations which were common genetic variants and/or synonymous variants (S2 Table). Twenty-nine had a minor allele frequency (MAF) over 1%, and were thus considered common genetic variants.

We also identified 11 variants with MAF less than 1%. Of them, 9 were synonymous variants, which led us to assume that they were not disease-causing. Four of these synonymous variants were not found in any of the databases consulted, and thus their MAF was considered to be less than 1%. Each of these synonymous variations was identified in 1 patient of the cohort. A similar proportion of individuals carrying these novel variations was detected upon sequencing of 300 healthy Spanish individuals (600 alleles). The remaining 2 variants were missense, and although they had either a MAF of less than 1% or an unknown MAF according to the Exome Variant Server and dbSNP websites, they were common in our cohort (29.2% and 50%, respectively; S2 Table), and a similar MAF was detected in a Spanish cohort of healthy individuals (26.7% and 48.8%, respectively).
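As a rough illustration of the 1% threshold used above, the following Python sketch computes an allele frequency from carrier counts and labels a variant as common or rare. The function names and the example counts are hypothetical (a single heterozygous carrier among 300 diploid controls is assumed); they are not taken from S2 Table.

```python
def allele_frequency(variant_alleles: int, individuals: int) -> float:
    """Frequency of the variant allele among diploid individuals (2 alleles each)."""
    if individuals <= 0:
        raise ValueError("individuals must be positive")
    return variant_alleles / (2 * individuals)


def classify_by_maf(maf: float, threshold: float = 0.01) -> str:
    """Label a variant 'common' when its MAF exceeds the threshold, else 'rare'."""
    return "common" if maf > threshold else "rare"


# Hypothetical example: one heterozygous carrier among 300 controls (600 alleles).
control_maf = allele_frequency(variant_alleles=1, individuals=300)
print(f"{control_maf:.4f}", classify_by_maf(control_maf))

# A cohort-level frequency such as the 29.2% reported above is well past the cutoff.
print(classify_by_maf(0.292))
```

The sketch treats the variant allele as the minor allele, which holds for every frequency quoted in this section.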

Influence of phenotype and age on PPV discovery

To assess whether a connection existed between the probands’ phenotype and the PPV detection yield, we classified the patients in our cohort according to their ECG (spontaneous or induced type 1), the presence of BrS cases within their families, and the presence/absence of symptoms. Even though the overall PPV detection yield was 32.7%, it was even higher for symptomatic patients (Fig 2). Indeed, in this group of patients, having a family history of BrS was identified as a factor for increased PPV discovery yield. In the absence of BrS in the family, the variation discovery yield was almost double for patients with a spontaneous type 1 BrS ECG compared with patients with a drug-induced type 1 ECG (45.5% vs 25%, respectively). In addition, we identified a PPV in 44.4% of the asymptomatic patients who presented a family history of BrS and a spontaneous type 1 BrS ECG. When the patient presented a drug-induced type 1 ECG, or in the absence of a family history of BrS, the PPV discovery yield was around 15%.


Fig 2. Influence of the phenotype on PPV discovery yield.

Bar graph comparing the PPV detection yield in 8 different clinical categories (stated below the graph). Each bar shows the total number of patients for each clinical category divided in those with a PPV (black) and those without an identified PPV (white). The number of patients (in brackets) and percentages are given. Pos, positive; Neg, negative; Spont, spontaneous type 1 BrS ECG; Drug, drug-induced type 1 BrS ECG; n, number of patients.

http://dx.doi.org/10.1371/journal.pone.0132888.g002

We also investigated the role of age on the PPV occurrence. No significant age differences were observed between variation carriers and non-carriers (38.6±10.3 and 43.5±14.4, respectively, p = 0.16). However, the PPV discovery yield was higher for patients with ages between 30 and 50 years: out of the total of patients carrying a PPV, 83.3% of the patients were in this age range, while 11.1% were younger and 5.6% were older patients (Fig 3A, upper panel). The PPV discovery yield was significantly higher for symptomatic than for asymptomatic patients (42.3% vs 24.1%, respectively; Fig 3A, lower panels).


Fig 3. Influence of the age on PPVs discovery yield.

(A) Pie charts showing the distribution of patients in the overall population as well as in the categories of symptomatic and asymptomatic patients regarding PPV discovery. The percentage and the number of patients (in brackets) are given for each group. The small pie charts correspond to the age distribution of patients with an identified PPV. (B) Bar graphs of the PPV detection yields obtained for each of the age groups (< 30 years, 30–50 years and > 50 years). Numbers inside each bar correspond to the number of patients carrying a PPV for each category and the percentages represent the variation detection yield.

http://dx.doi.org/10.1371/journal.pone.0132888.g003

Notably, in the 30–50 age range, 52.9% (9/17) of the symptomatic patients and 35.3% (6/17) of asymptomatic patients carried one PPV (Fig 3B, middle). Additionally, 40% (2/5) of the symptomatic young patients (< 30 years) were variation carriers, while no PPVs were identified in asymptomatic patients within this age range.
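The subgroup yields quoted above are simple proportions. The Python sketch below recomputes them from the carrier counts given in the text; the helper name `detection_yield` is illustrative, and the overall 18/55 count is inferred from the 55-patient cohort, the 32.7% overall yield, and the 7/18 figure reported in the Discussion.

```python
def detection_yield(carriers: int, patients: int) -> float:
    """Percentage of patients in a group who carry a PPV."""
    if patients <= 0:
        raise ValueError("patients must be positive")
    return 100.0 * carriers / patients


# (carriers, patients) pairs as reported or inferred from the text.
groups = {
    "overall cohort": (18, 55),
    "symptomatic, 30-50 years": (9, 17),
    "asymptomatic, 30-50 years": (6, 17),
    "symptomatic, < 30 years": (2, 5),
}

for name, (carriers, patients) in groups.items():
    print(f"{name}: {detection_yield(carriers, patients):.1f}%")
```

Because the subgroups hold as few as 5 patients, a single additional carrier shifts a yield by many percentage points, which is the concern raised under Study Limitations.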

Overall, 55 unrelated Spanish patients clinically diagnosed with BrS were included in our study. Table 1 shows the demographics of this cohort, and Table 2 and S1 Table show the clinical and genetic characteristics of all the patients included in the study. The mean age at clinical diagnosis was 41.9±13.3 years. Although the majority of patients were males (74.5%), their age at diagnosis was not different from that of females (41.8±12.1 years and 42.3±16.3 years, respectively; p = 0.92). A type 1 BrS ECG was present spontaneously in 37 patients (67.3%), and drug challenge revealed a type 1 BrS ECG for the remaining 18 patients (32.7%). Almost half of the patients had experienced symptoms, including 2 SCD and 4 aborted SCD. Patients who had not previously experienced any signs of arrhythmogenicity despite having a BrS ECG were considered asymptomatic. Comparison of symptomatic vs asymptomatic patients showed a similar percentage of males (73.1% and 75.9%, respectively). However, the mean age at diagnosis was different between the two groups of patients (37.7±14.3 and 45.7±11.4, respectively; p<0.05).

Discussion

To the best of our knowledge, this is the first comprehensive genetic evaluation of 14 BrS-susceptibility genes and MLPA of SCN5A in a Spanish cohort. Well delimited BrS cohorts from Japan, China, Greece and even Spain have been genetically studied [24,30–32]. Additionally, an international compendium of BrS genetic variations identified in more than 2100 unrelated patients from different countries was published in 2010 [3]. However, all these studies screened SCN5A exclusively. In 2012, Crotti et al. reported the spectrum and prevalence of genetic variations in 12 BrS-susceptibility genes in a BrS cohort [5]. However, this study included patients of different ethnicity. Here, we report the analysis of 14 genes which has been conducted on a well-defined BrS cohort of the same ethnicity.

Our results confirm that SCN5A is still the most prevalent gene associated with BrS. Indeed, the proportion of SCN5A-mediated BrS in our cohort (30.9%) is higher than that described in other European reports [3,23], where a potentially causative variation is identified in only 20–25% of BrS patients. The reason for this discrepancy is unclear but could point towards a higher prevalence of SCN5A PPVs in the Spanish population or to selection bias. Additionally, we identified a genetic variation in SCN2B (c.632A>G, which results in p.D211G). We have formerly published the comprehensive electrophysiological characterization of this variation, and showed that indeed this variation could be responsible for the phenotype of the patient, thus linking SCN2B with BrS for the first time [7]. Also, we identified a variation in RANGRF. This variation (c.181G>T leading to p.E61X) had been previously reported in a Danish atrial fibrillation cohort [33]. Surprisingly, the authors reported an incidence of 0.4% for this variation in the healthy Danish population, which brought into question its pathogenicity. Our finding of this variation in an asymptomatic patient displaying a type 2 BrS ECG also points toward considering it as a rare genetic variation with a potential modifier effect on the phenotype but not clearly responsible for the disease [29].

No PPVs were identified in the other genes tested. Certainly, it is well accepted that the contribution of these genes to the disease is minor, and thus should only be considered under special circumstances [13,34]. In addition, recent studies have questioned the causality of variations identified in some of these minority genes [35].

We also used the MLPA technique for the detection of large exon duplications and/or deletions in SCN5A in patients without PPVs, and no large rearrangements were identified. This is in accordance with previous reports, which revealed that such imbalances are uncommon [8–10].

Kapplinger et al. [3] reported a predominance of PPVs in transmembrane regions of Nav1.5. Indeed, it has been proposed that most rare genetic variations in interdomain linkers may be considered as non-pathogenic [36]. In contrast, PPVs identified in this study are mainly located in extracellular loops and cytosolic linker regions of Nav1.5 (Fig 4). Additionally, 2 of our non-previously reported frameshifts are located in the DI-DII linker. These 2 genetic variations lead to truncated proteins, which would lack around 75% of the protein sequence, and thus are presupposed to be pathogenic.


Fig 4. Nav1.5 channel scheme showing the relative position of the SCN5A PPVs identified in our cohort.

Open symbols indicate already described variations and closed symbols locate novel variations reported in this study. DI to DIV designate the 4 domains of the protein, and numbers 1–6 identify the different segments within each domain. Crosses mark the voltage sensor.

http://dx.doi.org/10.1371/journal.pone.0132888.g004

In our cohort, we have identified 40 synonymous or common genetic variations, 4 of which have not been previously reported. These variations are gradually becoming more and more important in the explanation of certain phenotypes of genetic diseases. Only a few common variations identified here are already published as phenotypic modifiers [37,38]. The effect of these and other common variants identified in our cohort on BrS phenotype should be further studied.

Unexpectedly, almost 40% (7/18) of the PPV carriers did not present signs of arrhythmogenicity. We also performed genotype-phenotype correlations of the PPVs identified in the families (S3 Table). These studies uncovered relatives, most of whom were young individuals, who carried a familial variation but had never exhibited any clinical manifestations of the disease. This is in agreement with Crotti et al. and Priori et al. [5,23], who postulated that a positive genetic testing result is not always associated with the presence of symptoms. Indeed, the existence of asymptomatic patients carrying genetic variations described to cause a severe Nav1.5 channel dysfunction has been reported [39]. The identification of silent carriers is of paramount importance since it allows the adoption of preventive measures before any lethal episode takes place. Unknown environmental factors, medication and modifier genes have been suggested to influence and/or predispose to arrhythmogenesis [11]. Hence, this group of patients has to be cautiously followed in order to avoid fatal events.

Our studies on the connection between patients’ phenotype and the PPV detection yield highlighted the presence of symptoms as a factor for an increased variation discovery yield. Within the group of symptomatic individuals, a PPV was identified in a higher proportion of patients displaying a spontaneous type 1 BrS ECG than for patients showing a drug-induced ECG. Likewise, within the asymptomatic patients with family history of BrS, those who presented spontaneous type 1 BrS ECG carried a PPV more often than those with a drug-induced ECG (Fig 2). Referring to age, the vast majority (17/20, 85%) of the PPVs were identified in patients around their fourth decade of age (30–50 years). This is in accordance with the accepted mean age of disease manifestation. Moreover, in this age range, more than 50% of the patients who presented symptoms carried a variation that could be pathogenic (Fig 3). Importantly, 35.3% of asymptomatic patients of around 40 years of age also carried one of such variations. These data highlight the importance of performing a genetic test even in the absence of clinical manifestations of the disease, and particularly when in the 30–50 years range, which is in accordance with consensus recommendations [13,34].

In conclusion, we have analysed for the first time 14 BrS-susceptibility genes and performed MLPA of SCN5A in a Spanish BrS cohort. Our cohort showed male prevalence with a mean age of disease manifestation around 40 years. BrS in this cohort was almost exclusively SCN5A-mediated. The mean PPV discovery yield in our Spanish BrS patients is higher than that described for other BrS cohorts (32.7% vs 20–25%, respectively), and is even higher for patients in the 30–50 years age range (up to 53% for symptomatic patients). All this evidence supports genetic testing, at least of SCN5A, in all clinically well diagnosed BrS patients.

 

Study Limitations

First of all, drug challenge tests were not performed for all the relatives who were asymptomatic variation carriers. This hampered their clinical diagnosis and represents an impediment to definitively assessing the link between PPVs and BrS. These patients are currently under follow-up.

New PPVs have been identified in our cohort. The clinical information available for the families suggests that these new variations could be pathogenic. Still, in vitro studies of these variations are required in order to evaluate their functional effects and verify their pathogenic role. Additionally, genotyping in an independent cohort would help reduce the likelihood of type I (false positive) error in genetic variant discovery.

We have to acknowledge that the study set is relatively small. Consequently, the classification of patients according to the different clinical categories rendered rather small sub-groups, which may lead to over-interpretation of the results. Future studies will be directed to the genetic screening of additional Spanish BrS patients, which will probably reinforce the significance of the tendencies observed here.


Diagnostic Revelations

Larry H. Bernstein, MD, FCAP, Curator

LPBI

New Liquid Biopsy Test Uses Platelet RNA as Cancer Diagnostic

    Using platelet RNA, scientists have been able to detect the presence of cancer and pinpoint its primary location. [Best et al., 2015, Cancer Cell 28, 1–11]

    The age of fast, accurate, and noninvasive cancer screening is rapidly becoming reality. The power of next-generation sequencing has allowed molecular diagnostic techniques to sample small amounts of blood for the genetic hallmarks of tumorigenesis. These liquid biopsy procedures, as they have been dubbed, typically search for circulating tumor DNA (ctDNA) that has made its way into the systemic circulation from tumor cells that have died or enrich for circulating tumor cells (CTCs) that have broken off from the primary cancer site.

    Now, a team of researchers led by scientists at Massachusetts General Hospital (MGH) has developed a new diagnostic test that analyzes the tumor RNA picked up in circulating platelets. The investigators believe this new method could become even more useful than other molecular technologies for diagnosing cancer since it can also determine the primary location of the tumor and provide insight into potential therapeutic approaches.

    “By combining next-generation-sequencing gene expression profiles of platelet RNA with computational algorithms we developed, we were able to detect the presence of cancer with 96 percent accuracy,” explains co-senior author Bakhos Tannous, Ph.D., associate professor at Harvard Medical School and associate neuroscientist at MGH. “Platelet RNA signatures also provide valuable information on the type of tumor present in the body and can guide the selection of the most optimal treatment for individual patients.”

    The findings from this study were published recently in Cancer Cell through an article entitled “RNA-Seq of Tumor-Educated Platelets Enables Blood-Based Pan-Cancer, Multiclass, and Molecular Pathway Cancer Diagnostics.”

    In the current study the research team describes finding that the RNA profiles of tumor-educated platelets (TEPs)—those that have taken up molecules shed by tumors—can distinguish among blood samples of healthy individuals and those of patients with six types of cancer, determine the location of the primary tumor, and identify tumors carrying mutations that can guide therapeutic decision-making.

    Over the past several years, the scientific literature has shown that in addition to their role in promoting blood clotting, platelets take up protein and RNA molecules from tumors, possibly playing a role in tumor growth and metastasis. Dr. Tannous and his colleagues set out to determine whether tumor RNA carried in platelets could be used to diagnose and classify common types of cancer.

    The investigators isolated platelets from blood samples taken from 55 healthy donors, 39 individuals with early-stage cancer, and 189 patients with advanced, metastatic cancer. The patients with cancer had been diagnosed with non-small-cell lung cancer, colorectal cancer, glioblastoma, pancreatic cancer, hepatobiliary cancer, or breast cancer.

    The comparison of RNA profiles from the healthy donors to those of the cancer patients identified increased levels of approximately 1,500 RNA molecules (many involved in cancer-associated processes) and reduced levels of almost 800 in samples from cancer patients. Applying their novel algorithm to close to 1,000 RNAs from almost 300 individuals, the MGH group was able to detect the presence of cancer with 96% accuracy.

    Additionally, the platelet mRNA profiles were able to identify the particular type of cancer within each patient participant, including distinguishing among three types of gastrointestinal adenocarcinoma: colorectal cancer, pancreatic cancer, and hepatobiliary cancer. Platelets from patients with tumors driven by mutations in KRAS or EGFR proteins—biomarkers that can guide the use of drugs targeting those mutations—proved to have unique RNA profiles as well.

    The researchers were excited by their findings and emphasize the uniqueness of their approach as currently utilized liquid biopsy approaches have been unable to diagnose cancer while simultaneously pinpointing the location of the primary tumor.

    “We observed that the mRNA profiles of tumor-educated platelets have the sensitivity and specificity to detect cancer, even in early, non-metastasized tumors,” noted Dr. Tannous. “We are further assessing the potential of TEP-based screening for therapeutic decision making and also investigating how non-cancerous diseases may further influence the RNA repertoire of TEPs.”

  • RNA-Seq of Tumor-Educated Platelets Enables Blood-Based Pan-Cancer, Multiclass, and Molecular Pathway Cancer Diagnostics

Myron G. Best, Nik Sol, Jihane Tannous, Bart A. Westerman, François Rustenburg, Pepijn Schellen, Heleen Verschueren, Edward Post, Jan Koster, Bauke Ylstra, Irsan Kooi, et al.
Highlights

Tumors “educate” platelets (TEPs) by altering the platelet RNA profile

TEPs provide an RNA biosource for pan-cancer, multiclass, and companion diagnostics

TEP-based liquid biopsies may guide clinical diagnostics and therapy selection

A total of 100–500 pg of total platelet RNA is sufficient for TEP-based diagnostics

mRNA Profiles of Tumor-Educated Platelets Are Distinct from Platelets of Healthy Individuals

Summary

Tumor-educated blood platelets (TEPs) are implicated as central players in the systemic and local responses to tumor growth, thereby altering their RNA profile. We determined the diagnostic potential of TEPs by mRNA sequencing of 283 platelet samples. We distinguished 228 patients with localized and metastasized tumors from 55 healthy individuals with 96% accuracy. Across six different tumor types, the location of the primary tumor was correctly identified with 71% accuracy. Also, MET or HER2-positive, and mutant KRAS, EGFR, or PIK3CA tumors were accurately distinguished using surrogate TEP mRNA profiles. Our results indicate that blood platelets provide a valuable platform for pan-cancer, multiclass cancer, and companion diagnostics, possibly enabling clinical advances in blood-based “liquid biopsies”.


http://www.cell.com/cms/attachment/2039645414/2053235278/fx1.jpg

Significance

Blood-based “liquid biopsies” provide a means for minimally invasive molecular diagnostics, overcoming limitations of tissue acquisition. Early detection of cancer, clinical cancer diagnostics, and companion diagnostics are regarded as important applications of liquid biopsies. Here, we report that mRNA profiles of tumor-educated blood platelets (TEPs) enable pan-cancer, multiclass cancer, and companion diagnostics in both localized and metastasized cancer patients. The ability of TEPs to pinpoint the location of the primary tumor advances the use of liquid biopsies for cancer diagnostics. The results of this proof-of-principle study indicate that blood platelets are a potential all-in-one platform for blood-based cancer diagnostics, using the equivalent of one drop of blood.

Introduction

Cancer is primarily diagnosed by clinical presentation, radiology, biochemical tests, and pathological analysis of tumor tissue, increasingly supported by molecular diagnostic tests. Molecular profiling of tumor tissue samples has emerged as a potential cancer classifying method (Akbani et al., 2014, Golub et al., 1999, Han et al., 2014, Hoadley et al., 2014, Kandoth et al., 2013, Ramaswamy et al., 2001, Su et al., 2001). In order to overcome limitations of tissue acquisition, the use of blood-based liquid biopsies has been suggested (Alix-Panabières et al., 2012, Crowley et al., 2013, Haber and Velculescu, 2014). Several blood-based biosources are currently being evaluated as liquid biopsies, including plasma DNA (Bettegowda et al., 2014, Chan et al., 2013, Diehl et al., 2008, Murtaza et al., 2013, Newman et al., 2014, Thierry et al., 2014) and circulating tumor cells (Bidard et al., 2014, Dawson et al., 2013, Maheswaran et al., 2008, Rack et al., 2014). So far, implementation of liquid biopsies for early detection of cancer has been hampered by non-specificity of these biosources to pinpoint the nature of the primary tumor (Alix-Panabières and Pantel, 2014, Bettegowda et al., 2014).

It has been reported that tumor-educated platelets (TEPs) may enable blood-based cancer diagnostics (Calverley et al., 2010, McAllister and Weinberg, 2014, Nilsson et al., 2011). Blood platelets, the second most-abundant cell type in peripheral blood, are circulating anucleated cell fragments that originate from megakaryocytes in bone marrow and are traditionally known for their role in hemostasis and initiation of wound healing (George, 2000, Leslie, 2010). More recently, platelets have emerged as central players in the systemic and local responses to tumor growth. Confrontation of platelets with tumor cells via transfer of tumor-associated biomolecules (“education”) is an emerging concept and results in the sequestration of such biomolecules (Klement et al., 2009, Kuznetsov et al., 2012, McAllister and Weinberg, 2014, Nilsson et al., 2011, Quail and Joyce, 2013). Moreover, external stimuli, such as activation of platelet surface receptors and lipopolysaccharide-mediated platelet activation (Denis et al., 2005, Rondina et al., 2011), induce specific splicing of pre-mRNAs in circulating platelets (Power et al., 2009, Rowley et al., 2011, Schubert et al., 2014). Platelets may also undergo cue-specific splice events in response to signals released by cancer cells and the tumor microenvironment, such as stromal and immune cells. The combination of specific splice events in response to external signals and the capacity of platelets to directly ingest (spliced) circulating mRNA can provide TEPs with a highly dynamic mRNA repertoire, with potential applicability to cancer diagnostics (Calverley et al., 2010, Nilsson et al., 2011) (Figure 1A). In this study, we characterize the platelet mRNA profiles of various cancer patients and healthy donors and investigate their potential for TEP-based pan-cancer, multiclass cancer, and companion diagnostics.

  
Results

We prospectively collected and isolated blood platelets from healthy donors (n = 55) and both treated and untreated patients with early, localized (n = 39) or advanced, metastatic cancer (n = 189) diagnosed by clinical presentation and pathological analysis of tumor tissue supported by molecular diagnostics tests. The patient cohort included six tumor types, i.e., non-small cell lung carcinoma (NSCLC, n = 60), colorectal cancer (CRC, n = 41), glioblastoma (GBM, n = 39), pancreatic cancer (PAAD, n = 35), hepatobiliary cancer (HBC, n = 14), and breast cancer (BrCa, n = 39) (Figure 1B; Table 1; Table S1). The cohort of healthy donors covered a wide range of ages (21–64 years old, Table 1).

Table 1. Summary of Patient Characteristics

Patient group | Total (n), training | Total (n), validation | Gender M (%)a, training | Gender M (%)a, validation | Age (SD)b, training | Age (SD)b, validation | Metastasis (%), training | Metastasis (%), validation | Mutation | Presence (%), training | Presence (%), validation
HD | 39 | 16 | 21 (54) | 6 (38) | 41 (13) | 38 (16) | | | | |
GBM | 23 | 16 | 18 (78) | 10 (63) | 59 (16) | 62 (14) | 0 (0) | 0 (0) | | |
NSCLC | 36 | 24 | 14 (39) | 14 (58) | 60 (11) | 59 (12) | 33 (92) | 23 (96) | KRAS | 15 (42) | 11 (46)
 | | | | | | | | | EGFR | 14 (39) | 7 (29)
 | | | | | | | | | MET-overexpression | 5 (14) | 3 (13)
CRC | 25 | 16 | 13 (52) | 9 (56) | 59 (13) | 63 (16) | 20 (80) | 15 (94) | KRAS | 7 (28) | 8 (50)
PAAD | 21 | 14 | 12 (57) | 7 (50) | 66 (9) | 66 (10) | 15 (71) | 9 (64) | KRAS | 13 (62) | 9 (64)
BrCa | 23 | 16 | 0 (0) | 0 (0) | 59 (11) | 59 (11) | 16 (70) | 9 (56) | HER2+ | 7 (30) | 5 (31)
 | | | | | | | | | PIK3CA | 6 (26) | 2 (13)
 | | | | | | | | | triple negative | 5 (22) | 3 (19)
HBC | 8 | 6 | 6 (75) | 2 (33) | 68 (13) | 62 (16) | 6 (75) | 4 (67) | KRAS | 3 (38) | 1 (17)

HD, healthy donors; GBM, glioblastoma; NSCLC, non-small cell lung cancer; CRC, colorectal cancer; PAAD, pancreatic cancer; BrCa, breast cancer; HBC, hepatobiliary cancer. See also Table S1.

aIndicated are number of male individuals.
bIndicated is mean age in years.

Platelet purity was confirmed by morphological analysis of randomly selected and freshly isolated platelet samples (contamination is 1 to 5 nucleated cells per 10 million platelets, see Supplemental Experimental Procedures), and platelet RNA was isolated and evaluated for quality and quantity (Figure S1A). A total of 100–500 pg of platelet total RNA (the equivalent of purified platelets in less than one drop of blood) was used for SMARTer mRNA amplification and sequencing (Ramsköld et al., 2012) (Figures 1C and S1A). Platelet RNA sequencing yielded a mean read count of ∼22 million reads per sample. After selection of intron-spanning (spliced) RNA reads and exclusion of genes with low coverage (see Supplemental Experimental Procedures), we detected in platelets of healthy donors (n = 55) and localized and metastasized cancer patients (n = 228) 5,003 different protein coding and non-coding RNAs that were used for subsequent analyses. The obtained platelet RNA profiles correlated with previously reported mRNA profiles of platelets (Bray et al., 2013, Kissopoulou et al., 2013, Rowley et al., 2011, Simon et al., 2014) and megakaryocytes (Chen et al., 2014) and not with various non-related blood cell mRNA profiles (Hrdlickova et al., 2014) (Figure S1B). Furthermore, DAVID Gene Ontology (GO) analysis revealed that the detected RNAs are strongly enriched for transcripts associated with blood platelets (false discovery rate [FDR] < 10^−126).

Among the 5,003 RNAs, we identified known platelet markers, such as B2M, PPBP, TMSB4X, PF4, and several long non-coding RNAs (e.g., MALAT1). A total of 1,453 out of 5,003 mRNAs were increased and 793 out of 5,003 mRNAs were decreased in TEPs as compared to platelet samples of healthy donors (FDR < 0.001), while presenting a strong correlation between these platelet mRNA profiles (r = 0.90, Pearson correlation) (Figure 1D). Unsupervised hierarchical clustering based on the differentially detected platelet mRNAs distinguished two sample groups with minor overlap (Figure 1E; Table S2). DAVID GO analysis revealed that the increased TEP mRNAs were enriched for biological processes such as vesicle-mediated transport and the cytoskeletal protein binding while decreased mRNAs were strongly involved in RNA processing and splicing (Table S3). A correlative analysis of gene set enrichment (CAGE) GO methodology, in which 3,875 curated gene sets of the GSEA database were correlated to TEP profiles (see Experimental Procedures), demonstrated significant correlation of TEP mRNA profiles with cancer tissue signatures, histone deacetylases regulation, and platelets (Table 2). The levels of 20 non-protein coding RNAs were altered in TEPs as compared to platelets from healthy individuals and these show a tumor type-associated RNA profile (Figure S1C).
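The FDR thresholds quoted above refer to p-values adjusted for testing thousands of RNAs at once. As a minimal sketch, assuming the Benjamini-Hochberg adjustment (the source does not state which FDR procedure was applied), the Python function below converts raw p-values into adjusted values that can be compared against a cutoff such as 0.001. The function name and the example p-values are made up for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the original order."""
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p-values for four RNAs.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adjusted) if q < 0.025]
print(adjusted, significant)
```

An RNA would be called increased or decreased only when its adjusted value falls below the chosen FDR cutoff, which is stricter than applying the same cutoff to raw p-values.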

Figure 1

http://www.cell.com/cms/attachment/2039645414/2053235279/gr1.jpg

Tumor-Educated Platelet mRNA Profiling for Pan-Cancer Diagnostics

(A) Schematic overview of tumor-educated platelets (TEPs) as biosource for liquid biopsies.

(B) Number of platelet samples of healthy donors and patients with different types of cancer.

(C) TEP mRNA sequencing (mRNA-seq) workflow, starting from 6 ml EDTA-coated tubes and proceeding to platelet isolation, mRNA amplification, and sequencing.

(D) Correlation plot of mRNAs detected in healthy donor (HD) platelets and cancer patients’ TEPs, including highlighted increased (red) and decreased (blue) TEP mRNAs.

(E) Heatmap of unsupervised clustering of platelet mRNA profiles of healthy donors (red) and patients with cancer (gray).

(F) Cross-table of pan-cancer SVM/LOOCV diagnostics of healthy donor subjects and patients with cancer in training cohort (n = 175). Indicated are sample numbers and detection rates in percentages.

(G) Performance of pan-cancer SVM algorithm in validation cohort (n = 108). Indicated are sample numbers and detection rates in percentages.

(H) ROC-curve of SVM diagnostics of training (red), validation (blue) cohort, and random classifiers, indicating the classification accuracies obtained by chance of the training and validation cohort (gray).

(I) Total accuracy ratios of SVM classification in five subgroups, including corresponding predictive strengths. Genes, number of mRNAs included in training of the SVM algorithm.

See also Figure S1 and Tables S1, S2, S3, and S4.

Table 2. Pan-Cancer CAGE Gene Ontology (top 25 GO correlations)

Direction | Annotation | # | Correlation (lowesta / highesta)
DOWN | Translation | 10 | −0.865 / −0.890
DOWN | Immune, T cell | 5 | −0.853 / −0.883
DOWN | Cancer-associated | 2 | −0.875 / −0.887
DOWN | Viral replication | 2 | −0.875 / −0.878
DOWN | IL-signaling | 2 | −0.869 / −0.874
DOWN | RNA processing | 1 | −0.886
DOWN | Ago2-Dicer-silencing | 1 | −0.882
DOWN | Protein metabolism | 1 | −0.879
DOWN | Receptor processing | 1 | −0.869
UP | Cancer-associated | 6 | −0.783 / −0.906
UP | Infection | 3 | −0.798 / −0.853
UP | HDAC | 3 | −0.795 / −0.852
UP | Platelet | 3 | −0.837 / −0.906
UP | Cytoskeleton | 2 | −0.801 / −0.886
UP | Hypoxia | 2 | −0.763 / −0.937
UP | Protease | 1 | −0.854
UP | Immunodeficiency | 1 | −0.812
UP | Differentiation | 1 | −0.810
UP | Immune differentiation | 1 | −0.801
UP | Methylation | 1 | −0.778
UP | Metabolism | 1 | −0.768

Top-ranking correlations of platelet-mRNA profiles with 3,875 Broad Institute curated gene sets. CAGE, Correlative Analysis of Gene Set Enrichment; GO, gene ontology; #, number of hits per annotation; IL, interleukin; HDAC, histone deacetylase.

aIndicated are lowest and highest correlations per annotation.

Next, we determined the diagnostic accuracy of TEP-based pan-cancer classification in the training cohort (n = 175), employing a leave-one-out cross-validation support vector machine algorithm (SVM/LOOCV, see Experimental Procedures; Figures S1D and S1E), previously used to classify primary and metastatic tumor tissues (Ramaswamy et al., 2001, Su et al., 2001, Vapnik, 1998, Yeang et al., 2001). Briefly, the SVM algorithm (blindly) classifies each individual sample as cancer or healthy by comparison to all other samples (175 − 1) and was performed 175 times to classify and cross-validate all individual samples. The algorithms we developed use a limited number of different spliced RNAs for sample classification. To determine the specific input gene lists for the classifying algorithms we performed ANOVA testing for differences (as implemented in the R-package edgeR), yielding classifier-specific gene lists (Table S4). For the specific algorithm of the pan-cancer TEP-based classifier test we selected 1,072 RNAs (Table S4) for the n = 175 training cohort, yielding a sensitivity of 96%, a specificity of 92%, and an accuracy of 95% (Figure 1F). Subsequent validation using a separate validation cohort (n = 108), not involved in input gene list selection and training of the algorithm, yielded a sensitivity of 97%, a specificity of 94%, and an accuracy of 96% (Figure 1G), with an area under the curve (AUC) of 0.986 to detect cancer (Figure 1H) and high predictive strength (Figure 1I). In contrast, random classifiers, as determined by multiple rounds of randomly shuffling class labels (permutation) during the SVM training process (see Experimental Procedures), had no predictive power (mean overall accuracy: 78%, SD ± 0.3%, p < 0.01), thereby demonstrating the specificity of our procedure despite the unbalanced representation of the two groups in the study cohort.
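To make the leave-one-out procedure concrete, here is a minimal Python sketch. It is not the authors' implementation: a nearest-centroid rule stands in for the SVM, and the samples are made-up one-dimensional scores rather than platelet RNA profiles. Each sample is held out in turn, the classifier is fitted on the remaining n − 1 samples, and the held-out predictions are tallied into sensitivity, specificity, and accuracy.

```python
from statistics import mean

# Toy data (hypothetical): one score per sample, label 1 = cancer, 0 = healthy.
samples = [
    (0.9, 1), (1.1, 1), (1.3, 1), (0.8, 1), (1.0, 1), (0.35, 1),
    (0.1, 0), (0.2, 0), (0.3, 0), (0.75, 0),
]


def fit_centroids(train):
    """Return the mean score of each class in the training set."""
    return {
        label: mean(x for x, y in train if y == label)
        for label in {y for _, y in train}
    }


def predict(centroids, x):
    """Assign x to the class whose centroid is closest (stand-in for the SVM)."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def leave_one_out(data):
    """Hold out each sample once, train on the rest, collect (truth, prediction)."""
    results = []
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]
        results.append((y, predict(fit_centroids(train), x)))
    return results


def summarize(results):
    """Compute sensitivity, specificity, and accuracy from (truth, prediction) pairs."""
    tp = sum(1 for y, p in results if y == 1 and p == 1)
    tn = sum(1 for y, p in results if y == 0 and p == 0)
    fp = sum(1 for y, p in results if y == 0 and p == 1)
    fn = sum(1 for y, p in results if y == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(results),
    }


print(summarize(leave_one_out(samples)))
```

On class imbalance: summing the training columns of Table 1 gives 136 cancer and 39 healthy samples, so a rule that always answers "cancer" would score 136/175, or about 78%. That is close to the accuracy reported for the random classifiers, which is why sensitivity and specificity are reported alongside accuracy.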
Random class-proportional subsampling of the entire dataset into a training and a validation set (ratio 60:40), repeated 100 times, yielded similar accuracy rates (mean overall accuracy: 96%, SD: ± 2%), confirming reproducible classification accuracy in this dataset. Of note, all 39 patients with localized tumors and 33 of the 39 patients with primary tumors in the CNS were correctly classified as cancer patients (Figure 1I). Visualization of 22 genes previously identified at differential RNA levels in platelets of patients with various non-cancerous diseases (Gnatenko et al., 2010, Healy et al., 2006, Lood et al., 2010, Raghavachari et al., 2007) revealed mixed levels in our TEP dataset (Figure S1F), suggesting that the platelet RNA repertoire in patients with non-cancerous disease is distinct from that of patients with cancer.
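
The leave-one-out procedure and the reported sensitivity, specificity, and accuracy can be illustrated with a minimal Python sketch. This is not the authors' pipeline: a nearest-centroid rule stands in for the SVM, and the two-feature toy samples, function names, and labels are hypothetical and chosen only to show the mechanics.

```python
from statistics import mean


def nearest_centroid_predict(train_x, train_y, sample):
    """Stand-in classifier (NOT an SVM): pick the class with the closest centroid."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return min(centroids, key=lambda label: sq_dist(centroids[label], sample))


def leave_one_out(xs, ys, predict):
    """Classify each sample with a model trained on all other samples (n - 1)."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        preds.append(predict(train_x, train_y, xs[i]))
    return preds


def diagnostic_metrics(truth, preds, positive="cancer"):
    """Sensitivity, specificity, and accuracy from true and predicted labels."""
    pairs = list(zip(truth, preds))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical toy data: the last "cancer" sample resembles the healthy group.
xs = [[0.1, 0.2], [0.0, -0.1], [0.3, 0.1],
      [2.1, 1.9], [1.8, 2.2], [2.0, 2.0], [0.2, 0.0]]
ys = ["healthy", "healthy", "healthy", "cancer", "cancer", "cancer", "cancer"]

predictions = leave_one_out(xs, ys, nearest_centroid_predict)
metrics = diagnostic_metrics(ys, predictions)
print(metrics)
```

In this toy run one cancer sample is missed, so sensitivity falls below specificity; the same bookkeeping applies to any classifier plugged into `leave_one_out`.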

Tumor-Specific Educational Program of Blood Platelets Allows for Multiclass Cancer Diagnostics

In addition to the pan-cancer diagnosis, the TEP mRNA profiles also distinguished healthy donors and patients with specific types of cancer, as demonstrated by the unsupervised hierarchical clustering of differential platelet mRNA levels of healthy donors and all six individual tumor types, i.e., NSCLC, CRC, GBM, PAAD, BrCa, and HBC (Figures 2A, all p < 0.0001, Fisher’s exact test, and S2A; Table S5), and this resulted in tumor-specific gene lists that were used as input for training and validation of the tumor-specific algorithms (Table S4). For the unsupervised clustering of the all-female group of BrCa patients, male healthy donors were excluded to avoid sample bias due to gender-specific platelet mRNA profiles (Figure S2B). SVM-based classification of all individual tumor classes with healthy donors resulted in clear distinction of both groups in both the training and validation cohort, with high sensitivity and specificity, and 38/39 (97%) cancer patients with localized disease were classified correctly (Figures 2B and S2C). CAGE GO analysis showed that biological processes differed between TEPs of individual tumor types, suggestive of tumor-specific “educational” programs (Table S6). We did not detect sufficient differences in mRNA levels to discriminate patients with non-metastasized from patients with metastasized tumors, suggesting that the altered platelet profile is predominantly influenced by the molecular tumor type and, to a lesser extent, by tumor progression and metastases.
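
For readers unfamiliar with Fisher's exact test, which the text uses to assess whether the unsupervised clusters separate the groups, the following Python sketch computes a two-sided p value for a 2 × 2 table. The function name and the counts are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, col1, col2 = a + b, a + c, b + d
    n = a + b + c + d
    denom = comb(n, row1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(col1, x) * comb(col2, row1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, row1 - col2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= observed * (1 + 1e-9))


# Hypothetical counts: rows = cluster 1 / cluster 2, columns = cancer / healthy.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(f"p = {p_value:.4f}")
```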

We next determined whether we could discriminate three different types of adenocarcinomas in the gastro-intestinal tract by analysis of the TEP-profiles, i.e., CRC, PAAD, and HBC. We developed a CRC/PAAD/HBC algorithm that correctly classified the mixed TEP samples (n = 90) with an overall accuracy of 76% (mean overall accuracy random classifiers: 42%, SD: ± 5%, p < 0.01, Figure 2C). In order to determine whether the TEP mRNA profiles allowed for multiclass cancer diagnosis across all tumor types and healthy donors, we extended the SVM/LOOCV classification test using a combination of algorithms that classified each individual sample of the training cohort (n = 175) as healthy donor or one of six tumor types (Figures S2D and S2E). The multiclass cancer diagnostics test resulted in an average accuracy of 71% (mean overall accuracy random classifiers: 19%, SD: ± 2%, p < 0.01, Figure 2D), demonstrating significant multiclass cancer discriminative power in the platelet mRNA profiles. The classification capacity of the multiclass SVM-based classifier was confirmed in the validation cohort of 108 samples, with an overall accuracy of 71% (Figure 2E). An overall accuracy of 71% might not be sufficient for introduction into cancer diagnostics. However, of the samples initially misclassified by the SVM algorithm's first-ranked choice (the class with the strongest classification strength), the second-ranked classification was correct in 60% of the cases. This yields an overall accuracy of 89% using the combined first- and second-ranked classifications. The low validation score of HBC samples can be attributed to the relatively low number of samples and possibly to the heterogeneous nature of this group of cancers (hepatocellular cancers and cholangiocarcinomas).
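
The combined first- and second-ranked accuracy follows from simple arithmetic, sketched below in Python. The function name is hypothetical, and the inputs are the rounded percentages quoted in the text rather than the underlying sample counts, so the result is only approximate.

```python
def combined_top2_accuracy(first_rank_accuracy, second_rank_rescue_rate):
    """Accuracy when a sample counts as correct if either the first-ranked
    or the second-ranked class matches the true class."""
    misclassified = 1.0 - first_rank_accuracy
    return first_rank_accuracy + misclassified * second_rank_rescue_rate


# 71% first-ranked accuracy; 60% of the remaining samples rescued by rank two.
estimate = combined_top2_accuracy(0.71, 0.60)
print(f"{estimate:.1%}")
```

From the rounded inputs the estimate is about 88%; the small gap to the reported 89% is likely due to rounding of the published percentages.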

Tumor-Educated Platelet mRNA Profiles for Multiclass Cancer Diagnostics

(A) Heatmaps of unsupervised clustering of platelet mRNA profiles of healthy donors (HD; n = 55) (red) and patients with non-small cell lung cancer (NSCLC; n = 60), colorectal cancer (CRC; n = 41), glioblastoma (GBM; n = 39), pancreatic cancer (PAAD, n = 35), breast cancer (BrCa; n = 39; female HD; n = 29), and hepatobiliary cancer (HBC; n = 14).

(B) ROC-curve of SVM diagnostics of healthy donors and individual tumor classes in both training (left) and validation (right) cohort. Random classifiers, indicating the classification accuracies obtained by chance, are shown in gray.

(C) Confusion matrix of multiclass SVM/LOOCV diagnostics of patients with CRC, PAAD, and HBC. Indicated are detection rates as compared to the actual classes in percentages.

(D) Confusion matrix of multiclass SVM/LOOCV diagnostics of the training cohort consisting of healthy donors (healthy) and patients with GBM, NSCLC, PAAD, CRC, BrCa, and HBC. Indicated are detection rates as compared to the actual classes in percentages.

(E) Confusion matrix of multiclass SVM algorithm in a validation cohort (n = 108). Indicated are sample numbers and detection rates in percentages. Genes, number of mRNAs included in training of the SVM algorithm.

See also Figure S2 and Tables S4, S5, and S6.
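
The cohort sizes listed in panel (A) show how unbalanced the classes are, which matters when reading the accuracies of the random classifiers. The Python sketch below tallies the counts and derives two simple theoretical chance baselines. These baselines are an assumption of this illustration: they are not the authors' permutation procedure, and they are computed on the pooled dataset rather than on the training cohort.

```python
# Cohort sizes as given in the figure legend, panel (A).
cohort = {"HD": 55, "NSCLC": 60, "CRC": 41, "GBM": 39,
          "PAAD": 35, "BrCa": 39, "HBC": 14}

total = sum(cohort.values())
cancer = total - cohort["HD"]


def majority_baseline(counts):
    """Accuracy of always predicting the largest class."""
    return max(counts) / sum(counts)


def proportional_guess_baseline(counts):
    """Expected accuracy when guessing labels at random in proportion
    to the class frequencies (sum of squared class fractions)."""
    n = sum(counts)
    return sum((c / n) ** 2 for c in counts)


binary = [cancer, cohort["HD"]]          # cancer versus healthy
multiclass = list(cohort.values())       # healthy plus six tumor types

print(total, cancer)
print(f"binary: majority {majority_baseline(binary):.2f}, "
      f"proportional {proportional_guess_baseline(binary):.2f}")
print(f"multiclass: majority {majority_baseline(multiclass):.2f}, "
      f"proportional {proportional_guess_baseline(multiclass):.2f}")
```

The total of 283 samples matches the training (n = 175) and validation (n = 108) cohorts combined. Because cancer samples far outnumber healthy donors, a chance-level binary classifier can score well above 50%, whereas chance-level multiclass accuracy is much lower.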

Companion Diagnostics Tumor Tissue Biomarkers Are Reflected by Surrogate TEP mRNA Onco-signatures

Blood provides a promising biosource for the detection of companion diagnostics biomarkers for therapy selection (Bettegowda et al., 2014, Crowley et al., 2013, Papadopoulos et al., 2006). We selected platelet samples of patients with distinct therapy-guiding markers confirmed in matching tumor tissue. Although the platelet mRNA profiles contained undetectable or low levels of these mutant biomarkers, the TEP mRNA profiles did allow us to distinguish patients with KRAS mutant tumors from KRAS wild-type tumors in PAAD, CRC, NSCLC, and HBC patients, and EGFR mutant tumors in NSCLC patients, using algorithms specifically trained on biomarker-specific input gene lists (all p < 0.01 versus random classifiers, Figures 3A–3E; Table S4). Even though the number of samples analyzed is relatively low and the risk of algorithm overfitting needs to be taken into account, the TEP profiles distinguished patients with HER2-amplified, PIK3CA mutant or triple-negative BrCa, and NSCLC patients with MET overexpression (all p < 0.01 versus random classifiers, Figures 3F–3I).

We subsequently compared the diagnostic accuracy of the TEP mRNA classification method with a targeted KRAS (exon 12 and 13) and EGFR (exon 20 and 21) amplicon deep sequencing strategy (∼5,000× coverage) on the Illumina MiSeq platform using prospectively collected blood samples of patients with localized or metastasized cancer. This method did allow for the detection of individual mutant KRAS and EGFR sequences in both plasma DNA and platelet RNA (Table S7), indicating sequestration and potential education capacity of mutant, tumor-derived RNA biomarkers in TEPs. Mutant KRAS was detected in 62% of plasma DNA (n = 103, kappa statistics = 0.370, p < 0.05) and 39% of platelet RNA (n = 144, kappa statistics = 0.213, p < 0.05) of patients with a KRAS mutation in primary tumor tissue. The sensitivity of the plasma DNA tests was relatively poor as reported by others (Bettegowda et al., 2014, Thierry et al., 2014), which may partly be attributed to the loss of plasma DNA quality due to relatively long blood sample storage (EDTA blood samples were stored up to 48 hr at room temperature before plasma isolation). To discriminate KRAS mutant from wild-type tumors in blood, the TEP mRNA profiles provided superior concordance with tissue molecular status (kappa statistics = 0.795–0.895, p < 0.05) compared to KRAS amplicon sequencing analysis of both plasma DNA and platelet RNA (Table S7). Thus, TEP mRNA profiles can harness potential blood-based surrogate onco-signatures for tumor tissue biomarkers that enable cancer patient stratification and therapy selection.
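
The concordance values above are reported as kappa statistics. Assuming these refer to Cohen's kappa (the text does not name the variant), the Python sketch below shows how agreement between a blood-based call and the tissue status is corrected for chance. The function name and the 2 × 2 counts are hypothetical.

```python
def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a 2 x 2 table of blood test result versus tissue status.

    tp/tn: both methods agree (mutant/mutant, wild-type/wild-type);
    fp/fn: the two methods disagree.
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals of both methods.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical counts, for illustration only.
kappa = cohen_kappa(tp=20, fp=5, fn=10, tn=15)
print(f"kappa = {kappa:.2f}")
```

A kappa of 0 means agreement no better than chance and 1 means perfect agreement, which is why values of 0.795–0.895 indicate much closer concordance with tissue status than 0.213 or 0.370.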

Tumor-Educated Platelet mRNA Profiles for Molecular Pathway Diagnostics

Cross tables of SVM/LOOCV diagnostics with the molecular markers KRAS in (A) CRC, (B) PAAD, and (C) NSCLC patients, (D) KRAS in the combined cohort of patients with either CRC, PAAD, NSCLC, or HBC, (E) EGFR and (F) MET in NSCLC patients, (G) PIK3CA mutations, (H) HER2-amplification, and (I) triple negative status in BrCa patients. Genes, number of mRNAs included in training of the SVM algorithm. See also Tables S4 and S7.

TEP-Profiles Provide an All-in-One Biosource for Blood-Based Liquid Biopsies in Patients with Cancer

Unequivocal discrimination of primary versus metastatic nature of a tumor may be difficult and hamper adequate therapy selection. Since the TEP profiles closely resemble the different tumor types as determined by their organ of origin—regardless of systemic dissemination—this potentially allows for organ-specific cancer diagnostics. Hence we selected all healthy donors and all patients with primary or metastatic tumor burden in the lung (n = 154), brain (n = 114), or liver (n = 127). We performed “organ exams” and instructed the SVM/LOOCV algorithm to determine for lung, brain, and liver the presence or absence of cancer (96%, 91%, and 96% accuracy, respectively), with cancer subclassified as primary or metastatic tumor (84%, 93%, and 90% accuracy, respectively) and in case of metastases to identify the potential organ of origin (64%, 70%, and 64% accuracy, respectively). The platelet mRNA profiles enabled assignment of the cancer to the different organs with high accuracy (Figure 4). In addition, using the same TEP mRNA profiles we were able to again indicate the biomarker status of the tumor tissues (90%, 82%, and 93% accuracy, respectively) (Figure 4).


Organ-Focused TEP-Based Cancer Diagnostics

SVM/LOOCV diagnostics of healthy donors (n = 55) and patients with primary or metastatic tumor burden in the lung (n = 99; totaling 154 tests), brain (n = 62; totaling 114 tests), or liver (n = 72; totaling 127 tests), to determine the presence or absence of cancer, with cancer subclassified as primary or metastatic tumor, in case of metastases the identified organ of origin, and the correctly identified molecular markers. Of note, at the exam level of mutational subtypes some samples were included in multiple classifiers (i.e., KRAS, EGFR, PIK3CA, HER2-amplification, MET-overexpression, or triple negative status), explaining the higher number in mutational tests than the total number of included samples. TP, true positive; FP, false positive; FN, false negative; TN, true negative. Indicated are sample numbers and detection rates in percentages.

Discussion

The use of blood-based liquid biopsies to detect, diagnose, and monitor cancer may enable earlier diagnosis of cancer, lower costs by tailoring molecular targeted treatments, improve convenience for cancer patients, and ultimately supplement clinical oncological decision-making. Current blood-based biosources under evaluation demonstrate suboptimal sensitivity for cancer diagnostics, in particular in patients with localized disease. So far, none of the current blood-based biosources, including plasma DNA, exosomes, and CTCs, have been employed for multiclass cancer diagnostics (Alix-Panabières and Pantel, 2014, Bettegowda et al., 2014, Skog et al., 2008), hampering their implementation for early cancer detection. Here, we report that molecular interrogation of blood platelet mRNA can offer valuable diagnostics information for all cancer patients analyzed, spanning six different tumor types. Our results suggest that platelets may be employable as an all-in-one biosource to broadly scan for molecular traces of cancer in general and provide a strong indication on tumor type and molecular subclass. This includes patients with localized disease, possibly allowing for targeted diagnostic confirmation using routine clinical diagnostics for each particular tumor type.

Since the discovery of circulating tumor material in blood of patients with cancer (Leon et al., 1977) and the recognition of the clinical utility of blood-based liquid biopsies, a wealth of studies has assessed the use of blood for cancer diagnostics, prognostication and treatment monitoring (Alix-Panabières et al., 2012, Bidard et al., 2014, Crowley et al., 2013, Haber and Velculescu, 2014). By development of highly sensitive targeted detection methods, such as targeted deep sequencing (Newman et al., 2014), droplet digital PCR (Bettegowda et al., 2014), and allele-specific PCR (Maheswaran et al., 2008, Thierry et al., 2014), the utility and applicability of liquid biopsies for clinical implementation has accelerated. These advances previously allowed for a pan-cancer comparison of various biosources and revealed that in >75% of cancers, including advanced stage pancreas, colorectal, breast, and ovarian cancer, cell-free DNA is detectable although detection rates are dependent on the grade of the tumor and depth of analysis (Bettegowda et al., 2014). Here, we show that the platelet RNA profiles are affected in nearly all cancer patients, regardless of the type of tumor, although the abundance of tumor-associated RNAs seems variable among cancer patients. In addition, surrogate RNA onco-signatures of tissue biomarkers, also in 88% of localized KRAS mutant cancer patients as measured by the tumor-specific and pan-cancer SVM/LOOCV procedures, are readily available from a minute amount (100–500 pg) of platelet RNA. As whole blood can be stored up to 48 hr at room temperature prior to isolation of the platelet pellet, while maintaining high-quality RNA and the dominant cancer RNA signatures, TEPs can be more readily implemented in daily clinical laboratory practice and could potentially be shipped prior to further blood sample processing.

Blood platelets are widely involved in tumor growth and cancer progression (Gay and Felding-Habermann, 2011). Platelets sequester solubilized tumor-associated proteins (Klement et al., 2009) and spliced and unspliced mRNAs (Calverley et al., 2010, Nilsson et al., 2011), whereas platelets do also directly interact with tumor cells (Labelle et al., 2011), neutrophils (Sreeramkumar et al., 2014), circulating NK-cells (Palumbo et al., 2005, Placke et al., 2012), and circulating tumor cells (Ting et al., 2014, Yu et al., 2013). Interestingly, in vivo experiments have revealed breast cancer-mediated systemic instigation by supplying circulating platelets with pro-inflammatory and pro-angiogenic proteins, supporting outgrowth of dormant metastatic foci (Kuznetsov et al., 2012). Using a gene ontology methodology, CAGE, we correlated TEP-cancer signatures with publicly available curated datasets. Indeed, we identified widespread correlations with cancer tissues, hypoxia, platelet-signatures, and cytoskeleton, possibly reflecting the “alert” and pro-tumorigenic state of TEPs. We observed strong negative correlations with RNAs implicated in RNA translation, T cell immunity, and interleukin-signaling, implying diminished needs of TEPs for RNAs involved in these biological processes or orchestrated translation of these RNAs to proteins (Denis et al., 2005). We observed that the tumor-specific educational programs in TEPs are predominantly influenced by tumor type and, to a lesser extent, by tumor progression and metastases. Although we were not able to measure significant differences between non-metastasized and metastasized tumors, we do not exclude that the use of larger sample sets could allow for the generation of SVM algorithms that do have the power to discriminate between certain stages of cancer, including those with in situ carcinomas and even pre-malignant lesions. 
In addition, different molecular tumor subtypes (e.g., HER2-amplified versus wild-type BrCa) result in different effects on the platelet profiles, possibly caused by different “educational” stimuli generated by the different molecular tumor subtypes (Koboldt et al., 2012). Altogether, the RNA content of platelets in patients with cancer is dependent on the transcriptional state of the bone-marrow megakaryocyte (Calverley et al., 2010, McAllister and Weinberg, 2014), complemented by sequestration of spliced RNA (Nilsson et al., 2011), release of RNA (Clancy and Freedman, 2014, Kirschbaum et al., 2015, Rak and Guha, 2012, Risitano et al., 2012), and possibly cue-specific pre-mRNA splicing during platelet circulation. Partial or complete normalization of the platelet profiles following successful treatment of the tumor would enable TEP-based disease recurrence monitoring, requiring the analysis of follow-up platelet samples. Future studies will be required to address the tumor-specific “educated” profiles at both the (small non-coding) RNA (Laffont et al., 2013, Landry et al., 2009, Leidinger et al., 2014, Lu et al., 2005) and protein (Burkhart et al., 2014, Geiger et al., 2013, Klement et al., 2009) level and determine the ability of gene ontology, blood-based cancer classification.

In conclusion, we provide robust evidence for the clinical relevance of blood platelets for liquid biopsy-based molecular diagnostics in patients with several types of cancer. Further validation is warranted to determine the potential of surrogate TEP profiles for blood-based companion diagnostics, therapy selection, longitudinal monitoring, and disease recurrence monitoring. In addition, we expect the self-learning algorithms to further improve by including significantly more samples. For this approach, isolation of the platelet fraction from whole blood should be performed within 48 hr after blood withdrawal; the platelet fraction can subsequently be frozen for cancer diagnosis. Also, future studies should address causes and anticipated risks of outlier samples identified in this study, such as healthy donors classified as cancer patients. Systemic factors such as chronic or transient inflammatory diseases, or cardiovascular events and other non-cancerous diseases may also influence the platelet mRNA profile and require evaluation in follow-up studies, possibly also including individuals predisposed for cancer.

References   

Akbani, R., Ng, P.K.S., Werner, H.M.J., Shahmoradgoli, M., Zhang, F., Ju, Z., Liu, W., Yang, J.-Y., Yoshihara, K., Li, J. et al. A pan-cancer proteomic perspective on The Cancer Genome Atlas. Nat. Commun. 2014; 5: 3887

Alix-Panabières, C. and Pantel, K. Challenges in circulating tumour cell research. Nat. Rev. Cancer. 2014; 14: 623–631

Alix-Panabières, C., Schwarzenbach, H., and Pantel, K. Circulating tumor cells and circulating tumor DNA. Annu. Rev. Med. 2012; 63: 199–215

Bettegowda, C., Sausen, M., Leary, R.J., Kinde, I., Wang, Y., Agrawal, N., Bartlett, B.R., Wang, H., Luber, B., Alani, R.M. et al. Detection of circulating tumor DNA in early- and late-stage human malignancies. Sci. Transl. Med. 2014; 6: 224ra24

Bidard, F.-C., Peeters, D.J., Fehm, T., Nolé, F., Gisbert-Criado, R., Mavroudis, D., Grisanti, S., Generali, D., Garcia-Saenz, J.A., Stebbing, J. et al. Clinical validity of circulating tumour cells in patients with metastatic breast cancer: a pooled analysis of individual patient data. Lancet Oncol. 2014; 15: 406–414

Bray, P.F., McKenzie, S.E., Edelstein, L.C., Nagalla, S., Delgrosso, K., Ertel, A., Kupper, J., Jing, Y., Londin, E., Loher, P. et al. The complex transcriptional landscape of the anucleate human platelet. BMC Genomics. 2013; 14: 1

Burkhart, J.M., Gambaryan, S., Watson, S.P., Jurk, K., Walter, U., Sickmann, A., Heemskerk, J.W.M., and Zahedi, R.P. What can proteomics tell us about platelets? Circ. Res. 2014; 114: 1204–1219

Calverley, D.C., Phang, T.L., Choudhury, Q.G., Gao, B., Oton, A.B., Weyant, M.J., and Geraci, M.W. Significant downregulation of platelet gene expression in metastatic lung cancer. Clin. Transl. Sci. 2010; 3: 227–232

Chan, K.C.A., Jiang, P., Chan, C.W.M., Sun, K., Wong, J., Hui, E.P., Chan, S.L., Chan, W.C., Hui, D.S.C., Ng, S.S.M. et al. Noninvasive detection of cancer-associated genome-wide hypomethylation and copy number aberrations by plasma DNA bisulfite sequencing. Proc. Natl. Acad. Sci. USA. 2013; 110: 18761–18768
