Archive for the ‘Exosomes’ Category

Controversial Case of Sarepta’s eteplirsen for DMD gained a Support Letter to FDA from Site Investigators and Advisers working on the drug

Reporter: Aviva Lev-Ari, PhD, RN


On 1/23/2016 I published

Gene Editing for Exon 51: Why CRISPR Snipping might be better than Exon Skipping for DMD

Reporter: Aviva Lev-Ari, PhD, RN


Duchenne docs–aka Sarepta investigators, advisers and advocates–back controversial case for eteplirsen

Monday, March 21, 2016 | By John Carroll

UCLA’s Perry Shieh

Sarepta ($SRPT) enjoyed a rare spike in its share price today after several dozen experts in Duchenne muscular dystrophy, including a host of site investigators and advisers working on the drug, circulated a detailed letter explaining why the FDA should support the company’s application for eteplirsen–despite a withering internal assessment from agency insiders.

On a point-by-point basis these physicians took exception to the FDA review, noting that despite the extremely small number of patients in the key study–with data on only 12 boys–their experience observing patients in their practices suggests that the drug was clearly effective. As news of the letter spread, Sarepta’s share price surged 20%.

“The collective signatories note that the group of 12 eteplirsen treated boys, even accounting for daily deflazacort usage or twice-weekly prednisone, is clearly performing better than our collective clinical experience and the published literature would predict,” the lineup of physicians asserts. “Collectively, a portion of us represent a group of physicians who have observed over 5,000 DMD patients in our practices over an average of more than 15 years. Published external natural history data and our clinical experience strongly support that the 12 boys treated for over 4 years show a milder clinical progression, likely due to a positive treatment effect of eteplirsen.”

A quick check of Sarepta’s websites and online records also revealed that many of the as-advertised experts have close ties to Sarepta, often listed as the very investigators who have been helping Sarepta gather the controversial data together and analyze it for regulators. Among those who have worked on drug trials related to eteplirsen and signed the letter are UCLA’s Perry Shieh (a principal investigator for one study), Stanford’s John Day (who lists Confirmatory Study of Eteplirsen in DMD Patients on his resume) and Kathy Mathews at the University of Iowa (a hub site investigator and principal). Harvard Med School’s Louis Kunkel and the University of Washington’s Jeff Chamberlain, who signed on to the company’s advisory board, are also signers on the lobbying letter, along with many trial site leaders, including Anne Connolly, Susan Apkon, Nancy Kuntz and Basil Darras, all listed on Sarepta’s website.

Dr. M. Carrie Miceli and Dr. Stanley Nelson, co-director of the Center for Muscular Dystrophy at UCLA, took the lead on the letter, which was dated February 24 and addressed to Dr. Billy Dunn, director of the division of neurology products at the FDA. The wife/husband team launched a public campaign advocating for new drugs to be approved for Duchenne, which afflicts their teenage son. Miceli, Nelson and Chamberlain also sit on the advisory board of CureDuchenne, an advocacy group which has offered its full-throated support of eteplirsen’s approval, along with Sarepta CEO Ed Kaye.

In their view, which you can see here, the best approach would be to go ahead and approve eteplirsen and then go ahead and let upcoming data provide confirmatory results.

“The FDA Briefing Document also implies that the ongoing non-placebo controlled confirmatory eteplirsen trial (NCT02255552) and additional eteplirsen safety studies (NCT02420379 and NCT02286947) initiated in response to FDA guidance may not be considered sufficiently robust to allow for approval,” the letter reads. “Given the relative paucity of patients with amenable mutations, the flexibility afforded by FDASIA, and the fact that many of the boys between the ages of 4 and 21 years with relevant mutations are already receiving eteplirsen in the context of these trials, it would be difficult to conduct a large placebo controlled study in the near future. Thus, it would be dubiously ethical to veer from the currently recommended study path at this point. In keeping with the criteria imposed by FDASIA for accelerated approval for rare disease with unmet need, we conclude that the aggregate data, described in the briefing documents, are providing substantial evidence of efficacy and use in the greater population of boys amenable to exon 51 skipping is appropriate. We suggest that the most scientifically robust way forward and the most ethical choice for the Duchenne community is in the context of an accelerated approval followed by a confirmatory trial.”

Just how persuasive this group can be, with such close ties to Sarepta, won’t be clear until April 25, when the FDA’s advisory committee will finally meet for a review. The FDA’s internal assessment virtually dismissed Sarepta’s case. But the biotech has enjoyed intense support from parents and patients–as well as the professional community, which has played a big role in testing the drug.

Related Articles:

Sarepta has a new date with an FDA AdComm for Duchenne drug eteplirsen

Sarepta shrinks as execs wait for FDA’s decision on Duchenne drug

Sarepta faces another FDA delay with its much-scrutinized DMD drug

Sarepta shares crash on a harsh FDA review of Duchenne’s drug


From: FierceBiotech <>

Reply-To: <>

Date: Tuesday, March 22, 2016 at 1:31 PM

To: Aviva Lev-Ari <>

Subject: | 03.22.16 | Eli Lilly raises big questions in quest to salvage Alzheimer’s drug; Sarepta investigators join lobbying effort for eteplirsen



Correspondence on Leadership in Genomics and other Gene Curations: Dr. Williams with Dr. Lev-Ari

Authors: Stephen J Williams, PhD and Aviva Lev-Ari, PhD, RN




Reporter: Aviva Lev-Ari, PhD, RN

Author: Aviva Lev-Ari, PhD, RN




From: Aviva Lev-Ari <>

Date: Thursday, February 18, 2016 at 12:39 AM

To: “Dr. Katie Katie Siafaca” <>

Subject: Re: In light of — >>>>>> Leadership in Genomics: VarElect ­ Variants in Disease and UCSC Genome Technology Center | Leaders in Pharmaceutical Business Intelligence

Leadership in Genomics: VarElect – Variants in Disease and UCSC Genome Technology Center

  • There are important resources in the link above. 
  • Gene therapy is the new trend.
  • In Immuno-Oncology – T Cell Receptor Like (TCRL) is the new trend.
  • 5th generation is CAR-T

No one said it is not a huge task. A very small piece is needed – which one ???


From: “Dr. Katie Katie Siafaca” <>

Reply-To: “Dr. Katie Katie Siafaca” <>

Date: Wednesday, February 17, 2016 at 11:11 PM

To: Aviva Lev-Ari <>

Subject: re: In light of — >>>>>> Leadership in Genomics: VarElect ­ Variants in Disease and UCSC Genome Technology Center | Leaders in Pharmaceutical Business Intelligence

Hi Aviva,

I am not sure what is being proposed here.  In the cancer area, there are at least 1,200 genes implicated somehow in this disease and new ones are reported every day.  This is a colossal task!



From: Aviva Lev-Ari <>

Date: Wednesday, February 17, 2016 at 10:34 PM

To: “Stephen Williams, PhD” <>

Cc: “Dr. Larry Bernstein” <>, Gerard Loiseau <>, “Dr. Katie Katie Siafaca” <>

Subject: In light of — >>>>>> Leadership in Genomics: VarElect ­ Variants in Disease and UCSC Genome Technology Center | Leaders in Pharmaceutical Business Intelligence

Dear Dr. Williams,

HERE I am thinking LOUD

Is it possible to go to the dashboard, all posts and click on your Name, you will get the Universe of ~200 articles that you published.

HOW one could search or one needs to visually glance at the title of each — so as to pull a subset of posts that are dedicated to a GENE.

Create an Excel File, place each gene inside and go to Weizmann Institute’s and pullout from them respective data on that gene

By so doing we will have LPBI’s Gene Inventory which we could reference in the Drug Discovery process, we do more and more, as we are aggregating all Biologics under the Joint Venture with SBH Sciences, Inc.

In light of :

Leadership in Genomics: VarElect – Variants in Disease and UCSC Genome Technology Center

My Questions are:

1. HOW could we take this “to be created” Excel File and publish it as a PAGE, Password Protected, as your Curation? It needs to have a Parent or a Hierarchy of Nesting in the Website architecture

And subject that to your our search into New Medicine, Inc. NM/OK DB for data complementarity compilation?

2. What Foundation Medicine, Now Roche, does have vs. Weizmann Institute’s

I read and I visited

Most interesting is

3. Will Weizmann Institute’s be interested in New Medicine, Inc., NM/OK DB?

4. I have explored with Foundation Medicine, Now Roche regarding New Medicine, Inc., NM/OK DB and their reply was that they focus ONLY on Genomics data in Cancer, thus, no interest in New Medicine, Inc. NM/OK DB, there

5. What is in Weizmann Institute’s that is NOT in UC Santa Cruz DBs ?

6. If you would take EACH ENTRY in this “to be created” Excel File and supplement it with

6.1 Weizmann Institute’s

6.2 UC Santa Cruz Dbs

6.3 New Medicine, Inc., NM/OK DB – given this is a GENE in the cancer implication

6.4 A RECORD of the outputs from 6.1, 6.2, 6.3

7. THEN we could target 6.4 for CRISPR and go to

DNA interrogation by the CRISPR RNA-guided endonuclease Cas9


8. Doudna started her professorship at Yale University in 1994. While the group was able to grow high-quality crystals, they struggled with the phase problem due to unspecific binding of the metal ions. One of her early graduate students and later her husband, Jamie Cate, decided to soak the crystals in osmium hexamine to imitate magnesium. Using this strategy, they were able to solve the structure, the second solved folded RNA structure since tRNA.[9][10] The magnesium ions would cluster at the center of the ribozyme and would serve as a core for RNA folding similar to that of a hydrophobic core of a protein.[5]

9. In 2015, Doudna gave a TED Talk about the bioethics of using CRISPR[13]

“Jennifer Doudna TED Talk”.


10. Caribou BioSciences

Precision medicines have the ability to transform healthcare and treat a myriad of unmet medical needs. The Caribou technology platform has the ability to generate transformative medicines in multiple different market segments.

Our current therapeutic areas of exploration include anti-microbials, animal health, and therapeutic bioproduction.

Human therapeutics

In 2014, Caribou co-founded Intellia Therapeutics to develop curative medicines utilizing the Caribou CRISPR-Cas9 platform. Rachel Haurwitz, President and Chief Executive Officer of Caribou, is a member of Intellia’s Board of Directors.

Intellia is developing human gene and cell therapies for both ex vivo and in vivo applications using CRISPR-Cas9 gene editing technology. Near-term ex vivo applications include the treatment of blood disorders and cancer. In January 2015, Intellia announced a five-year research and development collaboration with Novartis to accelerate the ex vivo development of new CRISPR-Cas9-based therapies using chimeric antigen receptor T cells (CARTs) and hematopoietic stem cells (HSCs).

Any thoughts for me?

Aviva Lev-Ari, PhD, RN


From: “Stephen Williams, PhD” <>

Date: Wednesday, February 17, 2016 at 6:42 PM

To: Aviva Lev-Ari <>

Subject: Re: Leadership in Genomics: VarElect – Variants in Disease and UCSC Genome Technology Center | Leaders in Pharmaceutical Business Intelligence

Every post I do that contains a gene is curated with a link to the GeneCards database, so later it not only can be searched but also forms a knowledge-analysis base tied to a fully integrated Omics database, as genecards.org also contains protein, structure and functional databases.

This is where I always felt the power of LPBI was in the genomic space, integration of a deep analysis curated database 





Sent: 2016-02-17 18:01:03 GMT

Subject: Leadership in Genomics: VarElect – Variants in Disease and UCSC Genome Technology Center | Leaders in Pharmaceutical Business Intelligence

Which of them did you use already?

Aviva Lev-Ari, PhD, RN


From: Aviva Lev-Ari <>

Date: Wednesday, February 17, 2016 at 5:59 PM

To: “Stephen Williams, PhD” <>

Cc: “Dr. Katie Katie Siafaca” <>

Subject: Fwd: Leadership in Genomics: VarElect – Variants in Disease and UCSC Genome Technology Center | Leaders in Pharmaceutical Business Intelligence

We will use these two platforms

Aviva Lev-Ari, PhD, RN


From: Aviva Lev-Ari <>

Date: Wednesday, February 17, 2016 at 3:42 PM

To: “Stephen Williams, PhD” <>

Subject: Re: The Science Coming in 2016 – OpenMind

I read and I visited gene

Most interesting is

Aviva Lev-Ari, PhD, RN


From: “Stephen Williams, PhD” <>

Date: Wednesday, February 17, 2016 at 1:46 PM

To: Aviva Lev-Ari <>

Subject: Re: The Science Coming in 2016 – OpenMind

I want you to go to then pick a gene and scroll down.  You will see a database there for CRISPR products available from different distributors including Qiagen, Promega, Fisher Scientific, Santa Cruz as well as others.  This seems to be already underway.  It is possible to copy what these companies are already doing but I don’t see the business advantage in that.  Please remember that 3D printing involves layering of a first and second dimension into a third-dimension product.  So for instance the cell would be the “first dimension” even though it is three dimensional, but the effect of layering MULTIPLE layers of cells is what gives their 3D effect.  The biomaterial you put in each tube is, in essence, your first dimension you are going to layer into a multilayered “3D” structure.

DNA can be made by synthesizers, there is no need to bioprint it, especially short fragments and in fact you wouldn’t.  They can handle even longer material.  Possibly if you want to replace a whole nucleosome but the chemistry is not there.  That is fine working with Jennifer Doudna making a library of small guide RNAs to be used in CRISPR, however it seems to be in process as I said before.  This would need to be done with her system and optimized for her system. You would also need a huge operation to do validation as well.  In addition the number of mutations, SNPs, variants are extremely large and many are not disease specific.

Again each would have to be validated.  In addition, unless you are doing embryo manipulation, you will need to partner with a company that has a good gene delivery system.  This will cost $, probably around 500 million. 


From: “Aviva Lev-Ari” <>


Cc: “Gerard Loiseau” <>, “Dr. Larry Bernstein” <>

Sent: Tuesday, February 16, 2016 4:48:54 AM

Subject: The Science Coming in 2016 – OpenMind

This gene fragment in red color — I am suggesting to build with 3D BioPrinting,

at the Oligonucleotide level.

Create a library of fragments for the most common mismatch in transcriptions, as well as on demand for rare deletions.

Per University of California, Santa Cruz, Database of Variations, prepare an INVENTORY of GENE REPAIR PARTS, manage the inventory by Analytics, where each part was implanted and monthly interval monitoring of segment incorporation and new function of protein folding achieved.

Trace the genetic therapy achieved by Gene editing.

Any comments??



Regulatory DNA engineered

Larry H. Bernstein, MD, FCAP, Curator



New Type of CRISPR Screen Probes the Regulatory Genome

Aaron Krol

February 8, 2016 | When a geneticist stares down the 3 billion DNA base pairs of the human genome, searching for a clue to what’s gone awry in a single patient, it helps to narrow the field. One of the most popular places to look is the exome, the tiny fraction of our DNA―less than 2%―that actually codes for proteins. For patients with rare genetic diseases, which might be fully explained by one key mutation, many studies sequence the whole exome and leave all the noncoding DNA out. Similarly, personalized cancer tests, which can help bring to light unexpected treatment options, often sequence the tumor exome, or a smaller panel of protein-coding genes.

Unfortunately, we know that’s not the whole picture. “There are a substantial number of noncoding regions that are just as effective at turning off a gene as a mutation in the gene itself,” says Richard Sherwood, a geneticist at Brigham and Women’s Hospital in Boston. “Exome sequencing is not going to be a good proxy for what genes are working.”

Sherwood studies regulatory DNA, the vast segment of the genome that governs which genes are turned on or off in any cell at a given time. It’s a confounding area of genetics; we don’t even know how much of the genome is made up of these regulatory elements. While genes can be recognized by the presence of “start” and “stop” codons―sequences of three DNA letters that tell the cell’s molecular machinery which stretches of DNA to transcribe into RNA, and eventually into protein―there are no definite signs like this for regulatory DNA.

Instead, studies to discover new regulatory elements have been somewhat trial-and-error. If you suspect a gene’s activity might be regulated by a nearby DNA element, you can inhibit that element in a living cell, and see if your gene shuts down with it.

With these painstaking experiments, scientists can slowly work their way through potential regulatory regions―but they can’t sweep across the genome with the kind of high-throughput testing that other areas of genetics thrive on. “Previously, you couldn’t do these sorts of tests in a large form, like 4,000 of them at once,” says David Gifford, a computational biologist at MIT. “You would really need to have a more hypothesis-directed methodology.”

Recently, Gifford and Sherwood collaborated on a paper, published in Nature Biotechnology, which presents a new method for testing thousands of DNA loci for regulatory activity at once. Their assay, called MERA (multiplexed editing regulatory assay), is built on the recent technology boom in CRISPR-Cas9 gene editing, which lets scientists quickly and easily cut specific sequences of DNA out of the genome.

So far, their team, including lead author Nisha Rajagopal from Gifford’s lab, has used MERA to study the regulation of four genes involved in the development of embryonic stem cells. Already, the results have defied the accepted wisdom about regulatory DNA. Many areas of the genome flagged by MERA as important factors in gene expression do not fall into any known categories of regulatory elements, and would likely never have been tested with previous-generation methods.

“Our approach allows you to look away from the lampposts,” says Sherwood. “The more unbiased you can be, the more we’ll actually know.”

A New Kind of CRISPR Screen

In the past three years, CRISPR-Cas9 experiments have taken all areas of molecular biology by storm, and Sherwood and Gifford are far from the first to use the technology to run large numbers of tests in parallel. CRISPR screens are an excellent way to learn which genes are involved in a cellular process, like tumor growth or drug resistance. In these assays, scientists knock out entire genes, one by one, and see what happens to cells without them.

This kind of CRISPR screen, however, operates on too small a scale to study the regulatory genome. For each gene knocked out in a CRISPR screen, you have to engineer a strain of virus to deliver a “guide RNA” into the cellular genome, showing the vicelike Cas9 molecule which DNA region to cut. That works well if you know exactly where a gene lies and only need to cut it once—but in a high-throughput regulatory test, you would want to blanket vast stretches of DNA with cuts, not knowing which areas will turn out to contain regulatory elements. Creating a new virus for each of these cuts is hugely impractical.

The insight behind MERA is that, with the right preparation, most of the genetic engineering can be done in advance. Gifford and Sherwood’s team used a standard viral vector to put a “dummy” guide RNA sequence, one that wouldn’t tell Cas9 to cut anything, into an embryonic stem cell’s genome. Then they grew plenty of cells with this prebuilt CRISPR system inside, and attacked each one with a Cas9 molecule targeted to the dummy sequence, chopping out the fake guide.

Normally, the result would just be a gap in the CRISPR system where the guide once was. But along with Cas9, the researchers also exposed the cells to new, “real” guide RNA sequences. Through a DNA repair mechanism called homologous recombination, the cells dutifully patched over the gaps with new guides, whose sequences were very similar to the missing dummy code. At the end of the process, each cell had a unique guide sequence ready to make cuts at a specific DNA locus—just like in a standard CRISPR screen, but with much less hands-on engineering.

By using a large enough library of guide RNA molecules, a MERA screen can include thousands of cuts that completely tile a broad region of the genome, providing an agnostic look at anywhere regulatory elements might be hiding. “It’s a lot easier [than a typical CRISPR screen],” says Sherwood. “The day the library comes in, you just perform one PCR reaction, and the cells do the rest of the work.”

In the team’s first batch of MERA screens, they created almost 4,000 guide RNAs for each gene they studied, covering roughly 40,000 DNA bases of the “cis-regulatory region,” or the area surrounding the gene where most regulatory elements are thought to lie. It’s unclear just how large any gene’s cis-regulatory region is, but 40,000 bases is a big leap from the highly targeted assays that have come before.
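To make the idea of tiling a region with guides concrete, here is a minimal Python sketch that lists every candidate cut site in a DNA string. It assumes the common SpCas9 convention of a 20-base protospacer next to an NGG PAM. The article does not say how the team chose its guides, so the function names `find_guides` and `mean_spacing`, the PAM rule and the toy sequence are illustrative assumptions, not the authors' design pipeline.

```python
import re

# Complement table for the four DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_guides(region, guide_len=20):
    """Return sorted (start, protospacer, strand) tuples for NGG PAM sites.

    start is the 0-based forward-strand index of the leftmost base covered
    by the protospacer. Assumes SpCas9-style targeting (hypothetical choice).
    """
    region = region.upper()
    hits = []
    # Forward strand: protospacer immediately followed by N-G-G.
    # A lookahead is used so overlapping sites are all reported.
    for m in re.finditer(r"(?=([ACGT]{%d})[ACGT]GG)" % guide_len, region):
        hits.append((m.start(), m.group(1), "+"))
    # Reverse strand: seen on the forward strand as C-C-N then the footprint;
    # the guide itself is the reverse complement of that footprint.
    for m in re.finditer(r"(?=CC[ACGT]([ACGT]{%d}))" % guide_len, region):
        protospacer = m.group(1).translate(_COMPLEMENT)[::-1]
        hits.append((m.start() + 3, protospacer, "-"))
    return sorted(hits)


def mean_spacing(region_length, n_guides):
    """Average number of bases per guide when guides tile a region."""
    return region_length / n_guides


if __name__ == "__main__":
    toy_region = "A" * 20 + "TGG" + "CCA" + "ACGT" * 5  # made-up sequence
    for start, protospacer, strand in find_guides(toy_region):
        print(start, strand, protospacer)
    # Figures quoted in the article: ~4,000 guides over ~40,000 bases.
    print(mean_spacing(40_000, 4_000))
```

With the article's round numbers, the tiling works out to roughly one guide every ten bases, which is why a library of this size can blanket a cis-regulatory region rather than probe a handful of hand-picked sites.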

“We’re now starting to do follow-up studies where we increase the number of guide RNAs,” Sherwood adds. “Eventually, what you’d like is to be able to tile an entire chromosome.”

Far From the Lampposts

Sherwood and Gifford tried to focus their assays on regions that would be rich in regulatory elements. To that end, they made sure their guide RNAs covered parts of the genome with well-known signs of regulatory activity, like histone markers and transcription factor binding sites. For many of these areas, Cas9 cuts did, in fact, shut down gene expression in the MERA screens.

But the study also targeted regions around each gene that were empty of any known regulatory features. “We tiled some other regions that we thought might serve as negative controls,” explains Gifford. “But they turned out not to be negative at all.”

The study’s most surprising finding was that several cuts to seemingly random areas of the genome caused genes to become nonfunctional. The authors named these DNA regions “unmarked regulatory elements,” or UREs. They were especially prevalent around the genes Tdgf1 and Zfp42, and in many cases, seemed to be every bit as necessary to gene activity as more predictable hits on the MERA screen.

These results caught the researchers so off guard that it was natural to wonder if MERA screens are prone to false positives. Yet follow-up experiments strongly supported the existence of UREs. Switching the guide RNAs from a Tdgf1 MERA screen and a Zfp42 screen, for example, produced almost no positive results: the UREs’ regulatory effects were indeed specific to the genes near them.

In a more specific test, the researchers chose a particular URE connected to Tdgf1, and cut it out of a brand new population of cells for a closer look. “We showed that, if we deleted that region from the genome, the cells lost expression of the gene,” says Sherwood. “And then when we put it back in, the gene became expressed again. Which was good proof to us that the URE itself was responsible.”

From these results, it seems likely that follow-up MERA screens will find even more unknown stretches of regulatory DNA. Gifford and Sherwood’s experiments didn’t try to cover as much ground around their target genes as they might have, because the researchers assumed that MERA would mostly confirm what was already known. At best, they hoped MERA would rule out some suspected regulatory regions, and help show which regulatory elements have the biggest effect on gene expression.

“We tended to prioritize regions that had been known before,” Sherwood says. “Unfortunately, in the end, our datasets weren’t ideally suited to discovering these UREs.”

Getting to Basic Principles

MERA could open up huge swaths of the regulatory genome to investigation. Compared to an ordinary CRISPR screen, says Sherwood, “there’s only upside,” as MERA is cheaper, easier, and faster to run.

Still, interpreting the results is not trivial. Like other CRISPR screens, MERA makes cuts at precise points in the genome, but does not tell cells to repair those cuts in any particular way. As a result, a population of cells all carrying the same guide RNA can have a huge variety of different gaps and scars in their genomes, typically deletions in the range of 10 to 100 bases long. Gifford and Sherwood created up to 100 cells for each of their guides, and sometimes found that gene expression was affected in some but not all of them; only sequencing the genomes of their mutated cells could reveal exactly what changes had been made.

By repeating these experiments many times, and learning which mutations affect gene expression, it will eventually be possible to pin down the exact DNA bases that make up each regulatory element. Future studies might even be able to distinguish between regulatory elements with small and large effects on gene expression. In Gifford and Sherwood’s MERA screens, the target genes were altered to produce a green fluorescent protein, so the results were read in terms of whether cells gave off fluorescent light. But a more precise, though expensive, approach would be to perform RNA sequencing, to learn which cuts reduced the cell’s ability to transcribe a gene into RNA, and by how much.
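The fluorescence readout described above can be summarized per guide. The following Python sketch, under stated assumptions, takes one record per cell (which guide it carried and whether it still glowed) and reports the fraction of cells that lost GFP for each guide. The record layout, the function names and the 0.5 cutoff are hypothetical choices for illustration; the article does not describe the authors' actual scoring.

```python
from collections import defaultdict


def gfp_loss_by_guide(cells):
    """Map each guide ID to the fraction of its cells that are GFP-negative.

    cells is an iterable of (guide_id, gfp_positive) pairs, one per cell.
    """
    total = defaultdict(int)
    lost = defaultdict(int)
    for guide_id, gfp_positive in cells:
        total[guide_id] += 1
        if not gfp_positive:
            lost[guide_id] += 1
    return {guide_id: lost[guide_id] / total[guide_id] for guide_id in total}


def flag_guides(loss, threshold=0.5):
    """Return guide IDs whose loss fraction meets the (assumed) threshold."""
    return sorted(g for g, fraction in loss.items() if fraction >= threshold)


if __name__ == "__main__":
    # Made-up example data: three guides, a few cells each.
    example_cells = [
        ("g1", True), ("g1", False),
        ("g2", False), ("g2", False),
        ("g3", True),
    ]
    loss = gfp_loss_by_guide(example_cells)
    print(loss)
    print(flag_guides(loss))
```

A fraction between 0 and 1, rather than a yes/no call, reflects the point made in the text: cells carrying the same guide end up with different deletions, so a guide can silence the gene in some cells and not in others.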

A MERA screen offers a rich volume of data on the behavior of the regulatory genome. Yet, as with so much else in genetics, there are few robust principles to let scientists know where they should be focusing their efforts. Histone markers provide only a very rough sketch of regulatory elements, often proving to be red herrings on closer examination. And the existence of UREs, if confirmed by future experiments, shows that we don’t yet even know which areas of the genome to rule out in the hunt for regulatory regions.

“Every dataset we get comes closer and closer to computational principles that let us predict these regions,” says Sherwood. As more studies are conducted, patterns may emerge in the DNA sequences of regulatory elements that link UREs together, or reveal which histone markers truly point toward regulatory effects. There might also be functional clues hidden in these sequences, hinting at what is happening on a molecular level as regulatory elements turn genes on and off in the course of a cell’s development.

For now, however, the data is still rough and disorganized. For better and for worse, high-throughput tools like MERA are becoming the foundation for most discoveries in genetics—and that means there is a lot more work to do before the regulatory genome begins to come into focus.

CORRECTED 2/9/16: Originally, this story incorrectly stated that only certain cell types could be assayed with MERA for reasons related to homologous recombination. In fact, the authors see no reason MERA could not be applied to any in vitro cell line, and hope to perform screens in a wide range of cell types. The text has been edited to correct the error.




BioMEMS The Market aspects of Oligonucleotide-Chips, Products and Applications, Competition, January 21, 2016

Curator: Gérard LOISEAU, ESQ



The Market aspects of Oligonucleotide-Chips, Products, Applications, Competition 

January 21, 2016


The oligonucleotide synthesis market is expected to reach USD 1,918.6 million by 2020 from USD 1,078.1 million in 2015, at a CAGR of 10.1%.
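As a quick arithmetic check on the forecast, the sketch below computes the compound annual growth rate implied by the two endpoint values and, conversely, the 2020 value implied by the quoted rate. It reads the figures as USD 1,078.1 million (2015) and USD 1,918.6 million (2020) and assumes five compounding years; both readings are assumptions, and the function names are illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Value reached after compounding start_value at rate for years."""
    return start_value * (1 + rate) ** years


if __name__ == "__main__":
    start_2015 = 1078.1  # USD million, as read from the text above
    end_2020 = 1918.6    # USD million, as read from the text above
    implied_rate = cagr(start_2015, end_2020, 5)
    projected_2020 = project(start_2015, 0.101, 5)
    print(f"Implied CAGR 2015-2020: {implied_rate:.1%}")
    print(f"2020 value at 10.1% per year: {projected_2020:,.1f} USD million")
```

Under these assumptions the endpoints imply a rate of about 12%, while 10.1% per year from the 2015 base would give roughly USD 1,744 million in 2020. The quoted rate may therefore refer to a different base year or period than the two values shown, so the three numbers are best checked against the original market report.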





  • Agilent Technologies Inc.
  • BioAutomation Corp.
  • Biosearch Technologies
  • Gen9 Inc.
  • GenScript Inc.
  • Illumina Inc.
  • Integrated DNA Technologies
  • New England Biolabs Inc.
  • Nitto Denko Avecia Inc.
  • OriGene Technologies Inc.
  • Sigma-Aldrich Corporation
  • Thermo Fisher Scientific Inc.
  • TriLink Biotechnologies


Agilent Technologies

  • Agilent was created as a spin off from Hewlett-Packard Company in 1999.
  • Agilent Technologies Inc. is engaged in the life sciences, diagnostics and applied chemical markets. The Company provides application focused solutions that include instruments, software, services and consumables for the entire laboratory workflow. The Company has three business segments:

the life sciences and applied markets business,

the diagnostics and genomics business, and

the Agilent Cross Lab business

  • The Company’s life sciences and applied markets business segment brings together the Company’s analytical laboratory instrumentation and informatics.
  • The Company’s diagnostics and genomics business segment consists of three businesses: the Dako business, the genomics business and the nucleic acid solutions business.
  • The Company’s Agilent Cross Lab business segment combines its analytical laboratory services and consumables business



  • October 09, 2015 03:21 PM Eastern Daylight Time
  • CARPINTERIA, Calif.–(BUSINESS WIRE)–Dako, an Agilent Technologies company and a worldwide provider of cancer diagnostics, today announced the U.S. Food and Drug Administration has approved a new test that can identify PD-L1 expression levels on the surface of non-small cell lung cancer tumor cells and provide information on the survival benefit with OPDIVO® (nivolumab) for patients with non-squamous NSCLC.




BioAutomation Corp.



  • DNA and RNA synthesis reagents for the MerMades


Note: The MerMade 192E Oligonucleotide synthesizer is designed to synthesize DNA, RNA & LNA oligonucleotides in a column format


  • HONGENE BIOTECH : BIOAUTOMATION is the exclusive distributor for the Americas






Biosearch Technologies


  • qPCR & SNP Genotyping
  • Custom Oligonucleotides
  • – highly sophisticated oligonucleotides
  • – simple PCR primers
  • Oligos in Plates
  • Synthesis Reagents
  • Immunochemicals
  • Primers
  • Probes
  • Large-Scale Synthesis Oligos
  • Intermediate-Scale Synthesis Oligos


  • GMP & Commercial Services
  • OEM & Kit Manufacturing
  • qPCR Design Collaborations


Argentina | Australia | Austria | Brazil | Canada | Chile | China | Colombia | Czech Republic | Denmark | Ecuador | Finland | Germany | Hong Kong | Israel | Italy | Japan | Korea | Malaysia | Mexico | New Zealand | Norway | Paraguay | Peru | Philippines | Poland | Romania | Singapore | South Africa | Spain | Sweden | Switzerland | Taiwan ROC | Thailand | Turkey | United Kingdom | Uruguay | Vietnam



Gen9 Inc.


Gen9 is building on advances in synthetic biology to power a scalable fabrication capability that will significantly increase the world’s capacity to produce DNA content. The privately held company’s next-generation gene synthesis technology allows for the high-throughput, automated production of DNA constructs at lower cost and higher accuracy than previous methods on the market. Founded by world leaders in synthetic biology, Gen9 aims to ensure the constructive application of synthetic biology in industries ranging from enzyme and chemical production to pharmaceuticals and biofuels.


  • Synthetic Biology
  • Gene Synthesis Services
  • Variant Libraries
  • Gene Sequence Design Services


  • Agilent Technologies : Private Equity
  • CAMBRIDGE, Mass. and SANTA CLARA, Calif. — April 24, 2013 —Gen9 Receives $21 Million Strategic Investment from Agilent Technologies



GenScript Inc.

  • GenScript is the largest gene synthesis provider in the USA
  • GenScript Corporation, a biology contract research organization, provides biological research and drug discovery services to pharmaceutical companies, biotech firms, and research institutions in the United States, Europe, and Japan. It offers bio-reagent, custom molecular biology, custom peptide, protein production, custom antibody production, drug candidates testing, assay development and screening, lead optimization, antibody drug development, gene synthesis, and assay-ready cell line production services.
  • The company also offers molecular biology, peptide, protein, immunoassay, chemicals, and cell biology products. It offers its products through distributors in Tokyo, Japan; and Seoul, Korea. GenScript Corporation has a strategic partnership with Immunologix, Inc. The company was founded in 2002 and is based in Piscataway, New Jersey. It has subsidiaries in France, Japan, and China.


Note: As of October 24, 2011, Immunologix, Inc. was acquired by Intrexon Corporation. Immunologix, Inc. develops and produces antibody-based therapeutics for various biological targets. It produces human monoclonal antibodies against viral, bacterial, and tumor antigens, as well as human auto antigens.

Intrexon Corporation, founded in 1998, is a leader in synthetic biology focused on collaborating with companies in Health, Food, Energy, Environment and Consumer sectors to create biologically based products that improve quality of life and the health of the planet.




  • Gene synthesis
  • Antibody services
  • Protein Services
  • Peptide services



Note: The Balloch Group (‘TBG’) was established in 2001 by Howard Balloch (Canada‘s ambassador to China from 1996 to 2001). TBG has since grown from a market-entry consultancy working with North American clients in China to a leading advisory and merchant banking firm serving both domestic Chinese companies and multinational corporations. TBG was ranked as the number one boutique investment bank in China by ChinaVenture in 2008.

Kleiner, Perkins, Caufield and Byers


Inc. CA


Monica Heger : SAN FRANCISCO (GenomeWeb) – Illumina today announced two new next-generation sequencing platforms, a targeted sequencing system called MiniSeq and a semiconductor sequencer that is still under development.

Illumina disclosed the initiatives during a presentation at the JP Morgan Healthcare conference held here today. During the presentation, Illumina CEO Jay Flatley also announced a new genotyping array called Infinium XT; a partnership with Bio-Rad to develop a single-cell sequencing workflow; preliminary estimates of its fourth-quarter 2015 revenues; and an update on existing products. The presentation followed the company’s announcement on Sunday that it has launched a new company called Grail to develop a next-generation sequencing test for early cancer detection from patient blood samples.

The MiniSeq system, which is based on Illumina’s current sequencing technology, will begin shipping early this quarter and has a list price of $49,500. It can perform a variety of targeted DNA and RNA applications, from single-gene to pathway sequencing, and promises “all-in” prices, including library prep and sequencing, of $200 to $300 per sample, Flatley said during the JP Morgan presentation.




  • Mid to large scale manufacturing assets
  • Analytical Labs
  • Pre-clinical
  • Clinical
  • Launched products


              COMPETITORS Tue, Feb 2, 2016, 2:16pm EST – US Markets

Market Cap: 22.75B N/A 1.13B 835.66M 134.14M
Employees: 3,700 10,000 1,200 745 45.00
Qtrly Rev Growth (yoy): 0.14 N/A -0.01 0.07 0.18
Revenue (ttm): 2.14B 3.80B1 357.74M 235.37M 8.47M
Gross Margin (ttm): 0.73 N/A 0.63 0.71 0.58
EBITDA (ttm): 770.84M N/A 46.64M 52.99M -12.31M
Operating Margin (ttm): 0.30 N/A 0.08 0.17 -1.62
Net Income (ttm): 510.36M 430.90M1 11.22M 39.29M N/A
EPS (ttm): 3.42 N/A 0.13 0.93 -0.34
P/E (ttm): 45.43 N/A 104.40 20.91 25.33
PEG (5 yr expected): 2.68 N/A 4.66 0.55 N/A
P/S (ttm): 10.87 N/A 3.13 3.45 13.65


Pvt1 = Life Technologies Corporation (privately held)

AFFX = Affymetrix Inc.

LMNX = Luminex Corporation



Integrated DNA Technologies (IDT)


Integrated DNA Technologies, Inc. (IDT), the global leader in nucleic acid synthesis, serving all areas of life sciences research and development, offers products for a broad range of genomics applications. IDT’s primary business is the production of custom, synthetic nucleic acids for molecular biology applications, including qPCR, sequencing, synthetic biology, and functional genomics. The company manufactures and ships an average of 44,000 custom nucleic acids per day to more than 82,000 customers worldwide. For more information, visit




  • DNA & RNA Synthesis
  • Custom DNA Oligos
  • 96- & 384-Well Plates
  • Ultramer Oligos
  • Custom RNA Oligos
  • SameDay Oligos
  • HotPlates
  • ReadyMade Primers
  • Oligo Modifications
  • Freedom Dyes
  • GMP for Molecular Diagnostics
  • Large Scale Oligo Synthesis


Note : Skokie, IL – December 1, 2015. Integrated DNA Technologies Inc. (“IDT”), the global leader in custom nucleic acid synthesis, has entered into a definitive agreement to acquire the oligonucleotide synthesis business of AITbiotech Pte. Ltd. in Singapore (“AITbiotech”). With this acquisition, IDT expands its customer base across Southeast Asia making it possible for these additional customers to now have access to its broad range of products for genomic applications. AITbiotech will continue operations in its other core business areas.


New England Biolabs Inc.



  • Restriction Endonucleases
  • PCR, Polymerases & Amplification Technologies
  • DNA Modifying Enzymes
  • Library Preparation for Next Generation Sequencing
  • Nucleic Acid Purification
  • Markers & Ladders
  • RNA Reagents
  • Gene Expression
  • Cellular Analysis



Nitto Denko Avecia Inc.


With over 20 years of experience in oligonucleotide development and production, and over 1000 sequences manufactured, Avecia has played an integral role in the advancing oligo therapeutic market. Our mission is to continue to build value for our customers, as they progress through drug development into commercialization. And as a member of the Nitto Denko Corporation, Avecia is committed to the future of the oligonucleotide market. We are driven by innovative ideas and flexible solutions, designed to provide our customers with the best in service, quality, and technology.




Note : 1918 Nitto Electric Industrial Co., Ltd. forms in Ohsaki, Tokyo, to produce electrical insulating materials in Japan.

2011 Acquires Avecia Biotechnology Inc. in the U.S.A.



OriGene Technologies Inc.


OriGene Technologies, Inc. develops, manufactures, and sells genome wide research and diagnostic products for pharmaceutical, biotechnology, and academic research applications. The company offers cDNA clones, including TrueORF cDNA, viral ORF, destination vectors, TrueClones (human), TrueClones (mouse), organelle marker plasmids, MicroRNA tools, mutant and variant clones, plasmid purification kits, transfection reagents, and gene synthesis service; and HuSH shRNA, siRNA, miRNA, qPCR reagents, plasmid purification products, transfection reagents, PolyA+ and total RNA products, first-strand cDNA synthesis, and CRISPR/Cas9 genome products. It also provides proteins and lysates, such as purified human proteins, over-expression cell lysates, mass spectrometry standard proteins, and protein purification reagents; UltraMAB IHC antibodies, TrueMAB primary antibodies, anti-tag and fluorescent proteins, ELISA antibodies, luminex antibodies, secondary antibodies, and controls and others; and anatomic pathology products, including IHC antibodies, detection systems, and IHC accessories

The company offers luminex and ELISA antibody pairs, autoantibody profiling arrays, ELISA kits, cell assay kits, assay reagents, custom development, and fluorogenic cell assays; TissueFocus search tools; tissue sections; tissue microarrays, cancer protein lysate arrays, TissueScan cDNA arrays, tissue blocks, and quality control products, as well as tissue RNA, DNA, and protein lysates; and lab essentials. Its research areas include cancer biomarker research, RNAi, pathology IHC, stem cell research, ion channels, and protein kinase products. The company provides gene synthesis and molecular biology services, genome editing, custom cloning, custom shRNA, purified protein, monoclonal antibody development, and assay development. It sells its products through distributors worldwide, as well as online. OriGene Technologies, Inc. was incorporated in 1995 and is based in Rockville, Maryland.



  • cDNA Clones
Human, mouse, rat
Expression validated
  • RNAi
shRNA, siRNA
microRNA & 3’UTR clones
  • Gene Synthesis
Codon optimization
Variant libraries
  • Real-time PCR
Primer pairs, panels
SYBR green reagents
  • Lab Essentials
DNA/RNA purification kits
Transfection reagents
  • Anatomic Pathology
UltraMAB antibodies
Specificity validated
  • Recombinant Proteins
10,000 human proteins
from mammalian system
  • Antibodies
TrueMAB primary antibodies
Anti-tag antibodies
  • Assays and Kits
ELISA & Luminex antibodies
Autoantibody Profiling Array
  • Cancer & Normal Tissues
Pathologist verified
gDNA, RNA, sections, arrays



Sigma-Aldrich Corporation 

St. Louis, MO – November 18, 2015 Merck KGaA, Darmstadt, Germany, Completes Sigma-Aldrich Acquisition

Merck KGaA today announced the completion of its $17 billion acquisition of Sigma-Aldrich, creating one of the leaders in the $130 billion global industry to help solve the toughest problems in life science.

Press Release: 18-Nov-2015

Letter to our Life Science Customers from Dr. Udit Batra

The life science business of Merck KGaA, Darmstadt, Germany brings together the world-class products and services, innovative capabilities and exceptional talent of EMD Millipore and Sigma-Aldrich to create a global leader in the life science industry.

Everything we do starts with our shared purpose – to solve the toughest problems in life science by collaborating with the global scientific community. 

This combination is built on complementary strengths, which will enable us to serve you even better as one organization than either company could alone.

This means providing a broader portfolio with a catalog of more than 300,000 products, including many of the most respected brands in the industry, greater geographic reach, and an unmatched combination of industry-leading capabilities.





Thermo Fisher Scientific Inc.

Thermo Fisher Scientific Inc. is a provider of analytical instruments, equipment, reagents and consumables, software and services for research, manufacturing, analysis, discovery and diagnostics. The company operates through four segments: Life Sciences Solutions, provides reagents, instruments and consumables used in biological and medical research, discovery and production of new drugs and vaccines as well as diagnosis of disease; Analytical Instruments, provides instruments, consumables, software and services that are used in the laboratory; Specialty Diagnostics, offers diagnostic test kits, reagents, culture media, instruments and associated products, and Laboratory Products and Services, offers self-manufactured and sourced products for the laboratory.




  • Oligos Value – Standard – Plate
  • Primers
  • Probes
  • Nucleotides



  1. THERMO SCIENTIFIC
  2. APPLIED BIOSYSTEMS
  3. INVITROGEN
  4. FISHER SCIENTIFIC
  5. UNITY LAB SERVICES




WALTHAM, Mass. & SANTA CLARA, Calif.–(BUSINESS WIRE)–Jan. 8, 2016– Thermo Fisher Scientific Inc. (NYSE:TMO), the world leader in serving science, and Affymetrix Inc. (NASDAQ:AFFX), a leading provider of cellular and genetic analysis products, today announced that their boards of directors have unanimously approved Thermo Fisher’s acquisition of Affymetrix for $14.00 per share in cash. The transaction represents a purchase price of approximately $1.3 billion.



TriLink Biotechnologies




  • DNA Oligos
  • RNA Oligos
  • Modified Oligos
  • Specialty Oligos


  • NTPs (Nucleoside Triphosphates)
  • Biphosphates
  • Monophosphates



  • Custom Chemistry
  • Reagents
  • Aptamers






Other related articles published in this Open Access Online Scientific Journal include the following:

Gene Editing: The Role of Oligonucleotide Chips

Gene Editing for Exon 51: Why CRISPR Snipping might be better than Exon Skipping for DMD


























UPDATED on 3/28/2016

SAN FRANCISCO — What briefly appeared to be a potential bidding war for Affymetrix, a genetics analysis technology maker, fizzled out on Monday after the company chose to stick with a takeover bid from Thermo Fisher Scientific over a higher bid from a Chinese-backed suitor.

In a statement, Affymetrix reiterated its support for the $14-a-share offer from Thermo Fisher that it accepted in January. 

UPDATED on 3/23/2016

Affymetrix Postpones Stockholder Meeting as Origin Ups Acquisition Offer; Board Backs Thermo Bid

UPDATED on 3/21/2016

Former Affymetrix Execs Offer to Buy Company in Alternative to Thermo Fisher Deal

NEW YORK (GenomeWeb) – Origin Technologies Corporation, founded by former Affymetrix executives for the purpose of purchasing the company, proposed today to acquire Affy for $16.10 per share in an all-cash transaction valued at approximately $1.5 billion.

The proposal comes about a week before Affy shareholders are scheduled to vote on a different deal, Thermo Fisher Scientific’s proposed acquisition of Affy for approximately $1.3 billion, which the boards of directors of both firms unanimously approved in January.

According to a letter sent by Origin to Affymetrix today, its proposal represents a 75 percent premium to Affymetrix’s unaffected closing share price of $9.21 on the last trading day prior to the announcement of Thermo Fisher’s proposed acquisition.

Fully financed by SummitView Capital, Origin said its all-cash offer represents a 15 percent premium for Affy stockholders relative to the proposed transaction with Thermo, under which stockholders would receive $14.00 per share in cash.
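The two premium figures quoted above can be checked with simple ratio arithmetic. The following is a minimal sketch that uses only the per-share prices stated in the article; the function name `premium_pct` and the constant names are illustrative, not taken from any filing.

```python
def premium_pct(offer: float, reference: float) -> float:
    """Percentage by which `offer` exceeds `reference`."""
    return (offer / reference - 1.0) * 100.0


# Per-share prices as quoted in the article (USD).
ORIGIN_OFFER = 16.10      # Origin's all-cash proposal
THERMO_OFFER = 14.00      # Thermo Fisher's agreed price
UNAFFECTED_CLOSE = 9.21   # Affymetrix close before the Thermo announcement

vs_unaffected = premium_pct(ORIGIN_OFFER, UNAFFECTED_CLOSE)
vs_thermo = premium_pct(ORIGIN_OFFER, THERMO_OFFER)

print(f"Premium over unaffected close: {vs_unaffected:.1f}%")
print(f"Premium over Thermo offer:     {vs_thermo:.1f}%")
```

Rounded to whole percentages, both results agree with the 75 percent and 15 percent figures cited in Origin's letter.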

As part of the offer, Origin also pledged to fund payment of the $55 million termination fee that would be due to Thermo under the terms of Thermo and Affy’s January agreement.

Wei Zhou, president of the newly formed Origin, wrote in the letter to Affy today that Origin strongly believes that its offer is superior to Thermo’s based on several criteria.

First, it offers substantially higher value to Affy’s stockholders, he said. Additionally, Origin believes it is in a better position to help Affy achieve its potential as a standalone, global company focused on genomics and proteomics. The deal would also offer an opportunity to acquire new technologies in the complete human genome sequencing space, Zhou wrote.

If the Origin-Affy merger goes through, Origin would have a separate option of combining with another company founded by Zhou in 2009, Centrillion Technology Holdings Corporation.



Affymetrix: Sales $350 million, Acquisition Price $1.3 billion – Advantages: Cytogenetics, Genotyping and Gene Expression Analysis

Reporter: Aviva Lev-Ari, PhD, RN

Thermo Fisher Scientific Inc.

NYSE: TMO, Jan 12, 1:13 PM EST: 136.60, up 1.72 (1.28%)

Thermo Fisher Scientific to acquire Affymetrix for $1.3 billion

WALTHAM, Mass. – Thermo Fisher Scientific Inc. announced Jan. 8 that it has agreed to acquire Affymetrix Inc. for $14.00 per share in cash, or roughly $1.3 billion. The transaction, approved by the boards of directors of both companies, is pending shareholder approval and is expected to close in the second quarter this year.

Santa Clara, Calif.-based Affymetrix was founded in 1992 and is a pioneer in the field of

  • microarray technology, launching its
  • GeneChip line in 1994. Today, the company serves both the
  1. life sciences research and
  2. clinical markets

Over the past ten years, the company has broadened its portfolio of tools that enable both

  • multiplexed and
  • parallel analysis of
  • biological systems at the cell, protein and genetic level.

Notable acquisitions for Affymetrix have included genetic tools company ParAllele Bioscience (2005), genetic, protein and cellular analysis provider Panomics (2008), and eBioscience (2012), which included one of the world’s largest selections of

  • antibodies,
  • ELISAs, and
  • proteins

for life science research and diagnostics.

“The acquisition of Affymetrix will strengthen our leadership in biosciences and create new market opportunities for us in genetic analysis,” said Marc N. Casper, president and CEO of Thermo Fisher Scientific. “In biosciences, the company’s antibody portfolio will significantly expand our offering in the fast-growing flow cytometry market, and customers will have greater access to these products through our global scale and commercial reach. In genetic analysis, Affymetrix’s technologies are highly complementary and present new opportunities for us in targeted

  • clinical and
  • applied markets.”

According to Frank Whitney, president and CEO of Affymetrix, the acquisition will allow the company to continue to build upon the close relationships it has created with customers, while deepening its reach into the biopharma market. “We are excited about the opportunity to combine our portfolios and strengthen our position in high-growth markets such as

  • single-cell biology
  • reproductive health and
  • AgBio.”

According to information provided by Thermo Fisher, benefits of the acquisition include expanding its offerings of its antibody portfolio via the eBioscience line of products, which also includes

  • multiplex RNA,
  • protein assay
  • single-cell assays
  • genetic analysis capabilities via complementary products used in
  1. cytogenetics
  2. genotyping and
  3. gene expression.

Thermo expects Affymetrix will add $0.10 in adjusted earnings per share in the first full year of ownership, while creating $70 million in operational savings by year three. Affymetrix has annual revenues of approximately $350 million and will be integrated within Thermo Fisher’s Life Sciences Solutions business unit.



Other related articles published in this Open Access Online Scientific Journal include the following:


Gene Editing: The Role of Oligonucleotide Chips

Curator: Aviva Lev-Ari, PhD, RN


Articles on Immune-Oncology Molecules In Development

Curators: Stephen J Williams, PhD and Aviva Lev-Ari, PhD, RN




People with blood type O have been reported to be protected from coronary heart disease, cancer, and have lower cholesterol levels.

Reporter: Aviva Lev-Ari, PhD, RN


The New England Centenarian Study (NECS) led by Boston University identified 4 key genetic influences in Long-life:

1. ABO Locus

Controls blood type. The study results showed that centenarians are more likely to have the O blood group than controls. People with blood type O have been reported to be protected from coronary heart disease, cancer, and have lower cholesterol levels.




2. CDKN2B/ANRIL

Implicated in the regulation of the cell life cycle, SNPs from this region have previously been found to be associated with a surprising diversity of age-related diseases. These include cardiovascular disease, type 2 diabetes, intracranial aneurysms, amyotrophic lateral sclerosis (ALS) and several cancers in the case of ANRIL (through a study at the Paris Descartes University).

For cardiovascular disease, this locus shows the strongest association of any locus in the genome, with each copy of the risk allele increasing one’s risk of disease by 20–30%.


3. APOE/TOMM40

APOE was initially investigated because its ɛ4 allele was known to increase the risk of Alzheimer’s and coronary artery disease, and in the study the disease allele was shown to be depleted in long-lived populations.

There was also a relationship between the locus and incidence of age-related macular degeneration (vision loss) and total cholesterol levels.

4. SH2B3/ATXN2

Variation in this locus has been associated with a wide variety of diseases, including rheumatoid arthritis, type 2 diabetes, coronary artery disease, blood pressure and cholesterol levels.

iGWAS analysis also identified a SNP that is protective against lung and pancreatic cancers and that promotes good bone mineral density. SH2B3 specifically encodes a signaling protein, and loss-of-function mutations in the invertebrate equivalent gene (Lnk) in fruit flies (Drosophila) were also shown to result in an extended lifespan.






Larry H. Bernstein, MD, FCAP, Curator



Human Exomes Galore

A new database includes complete sequences of protein-coding DNA from 60,706 individuals.

By Karen Zusi | November 16, 2015

The ability to sequence a person’s entire genome has led many researchers to hunt for the genetic causes of certain diseases. But without a larger set of genomes to compare mutations against, putting these variations into context is difficult. An international group of researchers has banked the full exomes of 60,706 individuals in a database called the Exome Aggregation Consortium (ExAC). The team’s analysis, posted last month (October 30) on the preprint server bioRxiv, was presented at the Genome Science 2015 conference in Birmingham, U.K. (September 7).

Led by Daniel MacArthur from the Broad Institute of MIT and Harvard, the research team collected exomes from labs around the world for its dataset. “The resulting catalogue of human genetic diversity has unprecedented resolution,” the authors wrote in their preprint. Many of the variants observed in the dataset occurred only once.

“This is one of the most useful resources ever created for medical testing for genetic disorders,” Heidi Rehm, a clinical lab director at Harvard Medical School, told Science News.

Among other things, the team found 3,230 genes that are highly conserved across exomes, indicating likely involvement in critical cellular functions. Of these, 2,557 are not associated with diseases. The authors hypothesized that these genes, if mutated, either lead to embryonic death—before a problem can be diagnosed—or cause rare diseases that have not yet been genetically characterized.

“We should soon be able to say, with high precision: If you have a mutation at this site, it will kill you. And we’ll be able to say that without ever seeing a person with that mutation,” MacArthur said during his Genome Science talk, according to The Atlantic.

This is not the complete set of essential genes in the human body, David Goldstein, a geneticist at Columbia University in New York City, pointed out to Nature. Only by studying more exomes will researchers be able to refine that number, he noted.


Analysis of protein-coding genetic variation in 60,706 humans


Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) sequence data for 60,706 individuals of diverse ethnicities. The resulting catalogue of human genetic diversity has unprecedented resolution, with an average of one variant every eight bases of coding sequence and the presence of widespread mutational recurrence. The deep catalogue of variation provided by the Exome Aggregation Consortium (ExAC) can be used to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; we identify 3,230 genes with near-complete depletion of truncating variants, 79% of which have no currently established human disease phenotype. Finally, we show that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human knockout variants in protein-coding genes.




Over the last five years, the widespread availability of high-throughput DNA sequencing technologies has permitted the sequencing of the whole genomes or exomes (the protein-coding regions of genomes) of over half a million humans. In theory, these data represent a powerful source of information about the global patterns of human genetic variation, but in practice they are difficult to access for practical, logistical, and ethical reasons; in addition, the inconsistent processing complicates variant-calling pipelines used by different groups. Current publicly available datasets of human DNA sequence variation contain only a small fraction of all sequenced samples: the Exome Variant Server, created as part of the NHLBI Exome Sequencing Project (ESP)1, contains frequency information spanning 6,503 exomes; and the 1000 Genomes (1000G) Project includes individual-level genotype data from whole-genome and exome sequence data for 2,504 individuals2.

Databases of genetic variation are important for our understanding of human population history and biology1–5, but also provide critical resources for the clinical interpretation of variants observed in patients suffering from rare Mendelian diseases6,7. The filtering of candidate variants by frequency in unselected individuals is a key step in any pipeline for the discovery of causal variants in Mendelian disease patients, and the efficacy of such filtering depends on both the size and the ancestral diversity of the available reference data.

Here, we describe the joint variant calling and analysis of high-quality variant calls across 60,706 human exomes, assembled by the Exome Aggregation Consortium (ExAC). This call set exceeds previously available exome-wide variant databases by nearly an order of magnitude, providing unprecedented resolution for the analysis of very low-frequency genetic variants. We demonstrate the application of this data set to the analysis of patterns of genetic variation including the discovery of widespread mutational recurrence, the inference of gene-level constraint against truncating variation, the clinical interpretation of variation in Mendelian disease genes, and the discovery of human “knockout” variants in protein-coding genes.


Deleterious variants are expected to have lower allele frequencies than neutral ones, due to negative selection. This theoretical property has been demonstrated previously in human population sequencing data18,19 and here (Figure 1d, Figure 1e). This allows inference of the degree of natural selection against specific functional classes of variation: however, mutational recurrence as described above indicates that allele frequencies observed in ExAC-scale samples are also skewed by mutation rate, with more mutable sites less likely to be singletons (Figure 2c and Extended Data Figure 4d). Mutation rate is in turn non-uniformly distributed across functional classes – for instance, stop lost mutations can never occur at CpG dinucleotides (Extended Data Figure 4e). We corrected for mutation rates (Supplementary Information) by creating a mutability-adjusted proportion singleton (MAPS) metric. This metric reflects (as expected) strong selection against predicted PTVs, as well as missense variants predicted by conservation-based methods to be deleterious (Figure 2e).
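The idea behind a mutability-adjusted singleton proportion can be sketched in a few lines. The example below is a simplified illustration, not the procedure from the preprint's Supplementary Information: it assumes a plain unweighted linear fit of singleton proportion against mutation rate on synonymous variants, and all data values, units, and function names (`fit_line`, `maps_score`) are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return my - slope * mx, slope


def maps_score(variants, intercept, slope):
    """Observed minus expected singleton proportion for one variant class.

    `variants` is a list of (mutation_rate, is_singleton) pairs. The expected
    proportion comes from the synonymous calibration line.
    """
    observed = mean(1.0 if is_singleton else 0.0 for _, is_singleton in variants)
    expected = mean(intercept + slope * rate for rate, _ in variants)
    return observed - expected


# Toy calibration on synonymous variants: (mutation rate in arbitrary units,
# proportion of variants that are singletons). More mutable sites have fewer
# singletons, as the text describes.
synonymous = [(0.1, 0.59), (1.0, 0.50), (2.0, 0.40)]
intercept, slope = fit_line(
    [rate for rate, _ in synonymous], [prop for _, prop in synonymous]
)

# Toy "deleterious" class: three of four variants are singletons.
deleterious = [(1.0, True), (1.0, True), (2.0, True), (2.0, False)]
score = maps_score(deleterious, intercept, slope)
print(f"MAPS (toy data): {score:.2f}")
```

A positive score means the class has more singletons than its mutation rates alone would predict, which is the signature of selection against that class.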

The deep ascertainment of rare variation in ExAC also allows us to infer the extent of selection against variant categories on a per-gene basis by examining the proportion of variation that is missing compared to expectations under random mutation. Conceptually similar approaches have been applied to smaller exome datasets13,20 but have been underpowered, particularly for the analysis of depletion of PTVs. We compared the observed number of rare (MAF <0.1%) variants per gene to an expected number derived from a selection neutral, sequence-context based mutational model13. The model performs extremely well in predicting the number of synonymous variants, which should be under minimal purifying selection, per gene (r = 0.98; Extended Data Figure 5).
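The observed-versus-expected comparison can be illustrated with a small sketch. The depletion score below is a simple Poisson-style approximation chosen for illustration, not the constraint metric defined in the preprint, and the allele frequencies and expected count are made-up values.

```python
import math

MAF_CUTOFF = 0.001  # "rare" means minor allele frequency below 0.1%


def count_rare(allele_freqs, cutoff=MAF_CUTOFF):
    """Count variants whose minor allele frequency is below the cutoff."""
    return sum(1 for af in allele_freqs if min(af, 1.0 - af) < cutoff)


def depletion(observed, expected):
    """Return (observed/expected ratio, signed depletion score).

    The score (expected - observed) / sqrt(expected) is an illustrative
    approximation: positive values mean fewer variants than expected.
    """
    return observed / expected, (expected - observed) / math.sqrt(expected)


# Hypothetical gene: allele frequencies of its variants in one class, and the
# count a neutral mutational model would predict for that class.
gene_afs = [0.0001, 0.0005, 0.02, 0.00002, 0.3]
expected_rare = 12.0

observed_rare = count_rare(gene_afs)
ratio, score = depletion(observed_rare, expected_rare)
print(f"observed={observed_rare}, expected={expected_rare}, "
      f"obs/exp={ratio:.2f}, depletion score={score:.2f}")
```

A ratio well below 1 for a variant class, in a gene whose synonymous count matches expectation, is the pattern the paragraph above describes as evidence of selection.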


Critically, we note that LoF-intolerant genes include virtually all known severe haploinsufficient human disease genes (Figure 3b), but that 79% of LoF-intolerant genes have not yet been assigned a human disease phenotype despite the clear evidence for extreme selective constraint (Supplementary Information 4.11). These likely represent either undiscovered severe dominant disease genes, or genes in which loss of a single copy results in embryonic lethality.

The most highly constrained missense (top 25% missense Z scores) and PTV (pLI ≥0.9) genes show higher expression levels and broader tissue expression than the least constrained genes24 (Figure 3c). These most highly constrained genes are also depleted for eQTLs (p < 10⁻⁹ for missense and PTV; Figure 3d), yet are enriched within genome-wide significant trait-associated loci (χ² p < 10⁻¹⁴, Figure 3e). Intuitively, genes intolerant of PTV variation are dosage sensitive: natural selection does not tolerate a 50% deficit in expression due to the loss of a single allele. It is therefore unsurprising that these genes are also depleted of common genetic variants that have a large enough effect on expression to be detected as eQTLs with current limited sample sizes. However, smaller changes in the expression of these genes, through weaker eQTLs or functional variants, are more likely to contribute to medically relevant phenotypes. Therefore, highly constrained genes are dosage-sensitive, expressed more broadly across tissues (as expected for core cellular processes), and are enriched for medically relevant variation.

Finally, we investigated how these constraint metrics would stratify mutational classes according to their frequency spectrum, corrected for mutability as in the previous section (Figure 3f). The effect was most dramatic when considering stop-gained variants in the LoF-intolerant set of genes. For missense variants, the missense Z score offers information additional to Polyphen2 and CADD classifications, indicating that gene-level measures of constraint offer additional information to variant-level metrics in assessing potential pathogenicity.

We assessed the value of ExAC as a reference dataset for clinical sequencing approaches, which typically prioritize or filter potentially deleterious variants based on functional consequence and allele frequency6. To simulate a Mendelian variant analysis, we filtered variants in 100 ExAC exomes per continental population against ESP (the previous default reference data set for clinical analysis) or the remainder of ExAC, removing variants present at ≥0.1% allele frequency, a filter recommended for dominant disease variant discovery6. Filtering on ExAC reduced the number of candidate protein-altering variants by 7-fold compared to ESP, and was most powerful when the highest allele frequency in any one population (“popmax”) was used rather than average (“global”) allele frequency (Figure 4a). ESP is not well-powered to filter at 0.1% AF without removing many genuinely rare variants, as AF estimates based on low allele counts are both upward-biased and imprecise (Figure 4b). We thus expect that ExAC will provide a very substantial boost in the power and accuracy of variant filtering in Mendelian disease projects.
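The difference between “popmax” and “global” filtering is easiest to see on a single variant. The sketch below is illustrative only: the population labels, allele counts, and function names are hypothetical, and it assumes each variant carries a per-population allele count (AC) and allele number (AN).

```python
AF_THRESHOLD = 0.001  # remove variants at >= 0.1% allele frequency


def global_af(pop_counts):
    """Pooled allele frequency across all populations."""
    total_ac = sum(ac for ac, _ in pop_counts.values())
    total_an = sum(an for _, an in pop_counts.values())
    return total_ac / total_an if total_an else 0.0


def popmax_af(pop_counts):
    """Highest allele frequency observed in any single population."""
    return max((ac / an for ac, an in pop_counts.values() if an), default=0.0)


def keep_candidate(pop_counts, use_popmax=True, threshold=AF_THRESHOLD):
    """True if the variant survives the frequency filter."""
    af = popmax_af(pop_counts) if use_popmax else global_af(pop_counts)
    return af < threshold


# Hypothetical variant: absent in two populations, 0.5% in a third.
# Values are (allele count, allele number) per population.
variant = {
    "POP_A": (50, 10_000),
    "POP_B": (0, 60_000),
    "POP_C": (0, 8_000),
}

print(f"global AF = {global_af(variant):.5f}, popmax AF = {popmax_af(variant):.5f}")
print("kept by global filter:", keep_candidate(variant, use_popmax=False))
print("kept by popmax filter:", keep_candidate(variant, use_popmax=True))
```

Pooling dilutes a variant that is common in only one population, so the global filter keeps it as a candidate while the popmax filter removes it. This is why the popmax approach removes more candidates at the same threshold.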


The above curation efforts confirm the importance of allele frequency filtering in analysis of candidate disease variants. However, literature and database errors are prevalent even at lower allele frequencies: the average ExAC exome contains 0.89 reportedly Mendelian variants in well-characterized dominant disease genes at <1% popmax AF and 0.20 at <0.1% popmax AF. This inflation likely results from a combination of false reports of pathogenicity and incomplete penetrance, as we show for PRNP in the accompanying work [Minikel et al, submitted]. The abundance of rare functional variation in many disease genes in ExAC is a reminder that such variants should not be assumed to be causal or highly penetrant without careful segregation or case-control analysis28,7.

We investigated the distribution of PTVs, variants predicted to disrupt protein-coding genes through the introduction of a stop codon or frameshift or the disruption of an essential splice site; such variants are expected to be enriched for complete loss-of-function of the impacted genes. Naturally-occurring PTVs in humans provide a model for the functional impact of gene inactivation, and have been used to identify many genes in which LoF causes severe disease31, as well as rare cases where LoF is protective against disease32.

Among the 7,404,909 HQ variants in ExAC, we found 179,774 high-confidence PTVs (as defined in Supplementary Information Section 6), 121,309 of which are singletons. This corresponds to an average of 85 heterozygous and 35 homozygous PTVs per individual (Figure 5a). The diverse nature of the cohort enables the discovery of substantial numbers of novel PTVs: out of 58,435 PTVs with an allele count greater than one, 33,625 occur in only one population. However, while PTVs as a category are extremely rare, the majority of the PTVs found in any one person are common, and each individual has only ~2 singleton PTVs, of which 0.14 are found in PTV-constrained genes (pLI >0.9). The site frequency spectrum of these variants across the populations represented in ExAC recapitulates known aspects of demographic models, including an increase in intermediate-frequency (1%-5%) PTVs in Finland33 and relatively common (>0.1%) PTVs in Africans (Figure 5b).



Discussion

Here we describe the generation and analysis of the most comprehensive catalogue of human protein-coding genetic variation to date, incorporating high-quality exome sequencing data from 60,706 individuals of diverse geographic ancestry. The resulting call set provides unprecedented resolution for the analysis of very low-frequency protein-coding variants in human populations, as well as a powerful resource for the clinical interpretation of genetic variants observed in disease patients. The complete frequency and annotation data from this call set have been made freely available through a public website.

bioRxiv preprint first posted online October 30, 2015. The copyright holder for this preprint is the author/funder. It is made available under a CC-BY-ND 4.0 International license.

The ExAC resource provides the largest database to date for the estimation of allele frequency for protein-coding genetic variants, providing a powerful filter for analysis of candidate pathogenic variants in severe Mendelian diseases. Frequency data from ESP1 have been widely used for this purpose, but those data are limited by population diversity and by resolution at allele frequencies ≤0.1%. ExAC therefore provides substantially improved power for Mendelian analyses, although it is still limited in power at lower allele frequencies, emphasizing the need for more sophisticated pathogenic variant filtering strategies alongside ongoing data aggregation efforts. ExAC also highlights an unexpected tolerance of many disease genes to functional variation, and reveals that the literature and public databases contain an inflated number of reportedly pathogenic variants across the frequency spectrum, indicating a need for stringent criteria for assertions of pathogenicity.
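The resolution limit at low allele frequencies can be illustrated with a confidence interval on the frequency estimate. The sketch below uses the Wilson score interval, which is one common choice and not necessarily the method behind the figures in the paper. The ExAC chromosome count follows from the 60,706 individuals quoted in the text; the smaller panel of 6,500 individuals is an assumed size for comparison. The example only illustrates imprecision, not the upward bias also mentioned in the paper.

```python
import math


def wilson_interval(ac: int, an: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for an allele frequency.

    ac = observed allele count, an = number of chromosomes sampled.
    """
    if an <= 0:
        raise ValueError("an must be positive")
    p = ac / an
    denom = 1 + z**2 / an
    centre = (p + z**2 / (2 * an)) / denom
    half = z * math.sqrt(p * (1 - p) / an + z**2 / (4 * an**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# 60,706 individuals (from the text) give 121,412 chromosomes.
EXAC_AN = 2 * 60_706
# Hypothetical smaller reference panel of 6,500 individuals (assumed size).
SMALL_AN = 2 * 6_500

for label, an in [("small panel", SMALL_AN), ("ExAC-sized panel", EXAC_AN)]:
    ac = round(0.001 * an)  # a variant whose observed AF is about 0.1%
    low, high = wilson_interval(ac, an)
    print(f"{label}: AC={ac}, AN={an}, 95% CI = {low:.5f} to {high:.5f}")
```

With the same observed frequency, the interval from the smaller panel is several times wider, so a hard 0.1% cut-off applied to it will misclassify more variants on either side of the threshold.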

Finally, we show that different populations confer different advantages in the discovery of gene-disrupting PTVs, providing guidance for projects seeking to identify human “knockouts” to understand gene function. Individuals of African ancestry have more PTVs (140 on average), with this enrichment most pronounced at allele frequencies above 1% (Figure 5b). Finnish individuals, as a result of a population bottleneck, are depleted at the lowest (<0.1%) allele frequencies but have a peak in frequency at 1-5% (Figure 5b). However, these differences are diminished when considering only LoF-constrained (pLI > 0.9) genes (Extended Data Figure 10). Sampling multiple populations would likely be a fruitful strategy for a researcher investigating common PTV variation. However, discovery of homozygous PTVs is markedly enhanced in the South Asian samples, which come primarily from a Pakistani cohort with 38.3% of individuals self-reporting as having closely related parents, emphasizing the extreme value of consanguineous cohorts for “human knockout” discovery (Figure 5d) [Saleheen et al., to be co-submitted].
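Why parental relatedness helps so much can be seen from the textbook expression for homozygote frequency under inbreeding, q² + Fq(1 − q), where q is the allele frequency and F the inbreeding coefficient. The sketch below is an illustration under that standard model, not an analysis from the paper: the allele frequencies are arbitrary, F = 1/16 corresponds to offspring of first cousins, and the actual mix of relationships in the cohort is not modelled.

```python
def homozygote_frequency(q: float, f: float = 0.0) -> float:
    """Expected frequency of homozygotes for an allele of frequency q.

    f is the inbreeding coefficient; f = 0 gives Hardy-Weinberg (q**2).
    """
    if not (0.0 <= q <= 1.0 and 0.0 <= f <= 1.0):
        raise ValueError("q and f must lie in [0, 1]")
    return q * q + f * q * (1.0 - q)


F_FIRST_COUSINS = 1 / 16  # offspring of first cousins

# Arbitrary example allele frequencies, from uncommon to very rare.
for q in (0.01, 0.001, 0.0001):
    outbred = homozygote_frequency(q)
    inbred = homozygote_frequency(q, F_FIRST_COUSINS)
    print(f"q={q:g}: outbred={outbred:.3g}, inbred={inbred:.3g}, "
          f"ratio={inbred / outbred:.1f}")
```

The rarer the allele, the larger the relative gain: the ratio is 1 + F(1 − q)/q, so for rare PTVs almost all expected homozygotes arise through relatedness rather than chance pairing.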


While the ExAC dataset dramatically exceeds the scale of previously available frequency reference datasets, much remains to be gained by further increases in sample size. Indeed, the fact that even the rarest transversions have mutational rates13 on the order of 1 × 10⁻⁹ implies that almost all possible non-lethal SNVs likely exist in some person on Earth. ExAC already includes >70% of all possible protein-coding CpG transitions at well-covered sites; order of magnitude increases in sample size will eventually lead to saturation of other classes of variation.
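The saturation argument is a short piece of arithmetic, sketched here in Python. Two assumptions are not stated in the text: that the quoted rate is per site per generation, and a round world population of 7 billion. The Poisson approximation is a convenience of this sketch.

```python
import math

MU = 1e-9          # mutation rate quoted in the text (assumed per site, per generation)
WORLD_POP = 7.0e9  # assumed world population; not from the text


def expected_new_carriers(mu: float, population: float) -> float:
    """Expected de novo occurrences of one specific SNV in one generation.

    Each person carries two copies of the site, hence the factor of 2.
    """
    return 2.0 * population * mu


def prob_at_least_one(mu: float, population: float) -> float:
    """Poisson approximation to P(at least one de novo carrier)."""
    return 1.0 - math.exp(-expected_new_carriers(mu, population))


print(expected_new_carriers(MU, WORLD_POP))  # about 14 new occurrences per generation
print(prob_at_least_one(MU, WORLD_POP))      # very close to 1
```

Under these assumptions even the least mutable class of SNV is expected to arise about 14 times in each generation, before counting carriers who inherited the variant, which is the sense in which nearly every non-lethal SNV should exist in someone alive.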
