
Archive for the ‘BioIT: BioInformatics, NGS, Clinical & Translational, Pharmaceutical R&D Informatics, Clinical Genomics, Cancer Informatics’ Category


Structure-guided Drug Discovery: (1) The Coronavirus 3CL hydrolase (Mpro) enzyme (main protease) essential for proteolytic maturation of the virus and (2) viral protease, the RNA polymerase, the viral spike protein, a viral RNA as promising two targets for discovery of cleavage inhibitors of the viral spike polyprotein preventing the Coronavirus Virion the spread of infection

 

Curators and Reporters: Stephen J. Williams, PhD and Aviva Lev-Ari, PhD, RN

 

Therapeutic options for coronavirus (2019-nCoV) include consideration of the following:

(a) Monoclonal and polyclonal antibodies

(b)  Vaccines

(c)  Small molecule treatments (e.g., chloroquine and derivatives), including compounds already approved for other indications

(d)  Immuno-therapies derived from human or other sources

 

 

Structure of the nCoV trimeric spike

The World Health Organization has declared the outbreak of a novel coronavirus (2019-nCoV) to be a public health emergency of international concern. The virus binds to host cells through its trimeric spike glycoprotein, making this protein a key target for potential therapies and diagnostics. Wrapp et al. determined a 3.5-angstrom-resolution structure of the 2019-nCoV trimeric spike protein by cryo–electron microscopy. Using biophysical assays, the authors show that this protein binds at least 10 times more tightly than the corresponding spike protein of severe acute respiratory syndrome (SARS)–CoV to their common host cell receptor. They also tested three antibodies known to bind to the SARS-CoV spike protein but did not detect binding to the 2019-nCoV spike protein. These studies provide valuable information to guide the development of medical counter-measures for 2019-nCoV. [Bold Face Added by ALA]

Science, this issue p. 1260

Abstract

The outbreak of a novel coronavirus (2019-nCoV) represents a pandemic threat that has been declared a public health emergency of international concern. The CoV spike (S) glycoprotein is a key target for vaccines, therapeutic antibodies, and diagnostics. To facilitate medical countermeasure development, we determined a 3.5-angstrom-resolution cryo–electron microscopy structure of the 2019-nCoV S trimer in the prefusion conformation. The predominant state of the trimer has one of the three receptor-binding domains (RBDs) rotated up in a receptor-accessible conformation. We also provide biophysical and structural evidence that the 2019-nCoV S protein binds angiotensin-converting enzyme 2 (ACE2) with higher affinity than does severe acute respiratory syndrome (SARS)-CoV S. Additionally, we tested several published SARS-CoV RBD-specific monoclonal antibodies and found that they do not have appreciable binding to 2019-nCoV S, suggesting that antibody cross-reactivity may be limited between the two RBDs. The structure of 2019-nCoV S should enable the rapid development and evaluation of medical countermeasures to address the ongoing public health crisis.

SOURCE
Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation
  1. Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA.
  2. Vaccine Research Center, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA.

  Corresponding author. Email: jmclellan@austin.utexas.edu
  * These authors contributed equally to this work.

Science  13 Mar 2020:
Vol. 367, Issue 6483, pp. 1260-1263
DOI: 10.1126/science.abb2507

 

02/04/2020

New Coronavirus Protease Structure Available

PDB data provide a starting point for structure-guided drug discovery

A high-resolution crystal structure of COVID-19 (2019-nCoV) coronavirus 3CL hydrolase (Mpro) has been determined by Zihe Rao and Haitao Yang’s research team at ShanghaiTech University. Rapid public release of this structure of the main protease of the virus (PDB 6lu7) will enable research on this newly-recognized human pathogen.

Recent emergence of the COVID-19 coronavirus has resulted in a WHO-declared public health emergency of international concern. Research efforts around the world are working towards establishing a greater understanding of this particular virus and developing treatments and vaccines to prevent further spread.

While PDB entry 6lu7 is currently the only public-domain 3D structure from this specific coronavirus, the PDB contains structures of the corresponding enzyme from other coronaviruses. The 2003 outbreak of the closely-related Severe Acute Respiratory Syndrome-related coronavirus (SARS) led to the first 3D structures, and today there are more than 200 PDB structures of SARS proteins. Structural information from these related proteins could be vital in furthering our understanding of coronaviruses and in discovery and development of new treatments and vaccines to contain the current outbreak.

The coronavirus 3CL hydrolase (Mpro) enzyme, also known as the main protease, is essential for proteolytic maturation of the virus. It is thought to be a promising target for discovery of small-molecule drugs that would inhibit cleavage of the viral polyprotein and prevent spread of the infection.

Comparison of the protein sequence of the COVID-19 coronavirus 3CL hydrolase (Mpro) against the PDB archive identified 95 PDB proteins with at least 90% sequence identity. Furthermore, these related protein structures contain approximately 30 distinct small molecule inhibitors, which could guide discovery of new drugs. Of particular significance for drug discovery is the very high amino acid sequence identity (96%) between the COVID-19 coronavirus 3CL hydrolase (Mpro) and the SARS virus main protease (PDB 1q2w). Summary data about these closely-related PDB structures are available (CSV) to help researchers more easily find this information. In addition, the PDB houses 3D structure data for more than 20 unique SARS proteins represented in more than 200 PDB structures, including a second viral protease, the RNA polymerase, the viral spike protein, a viral RNA, and other proteins (CSV).
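As a rough illustration of what a figure such as "96% sequence identity" means, the sketch below computes percent identity between two sequences that have already been aligned. The function name, the convention of skipping gapped columns, and the toy sequences are all assumptions made for this example; they are not the real Mpro sequences and not necessarily the method used for the PDB comparison.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Made-up 25-residue toy sequences differing at one position (hypothetical data).
toy_a = "ACDEFGHIKLMNPQRSTVWYACDEF"
toy_b = "ACDEFGHIKLMNPQRSTVWYACDEY"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Other definitions of identity exist (for example, dividing by the full alignment length), so reported percentages can differ slightly between tools.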

Public release of the COVID-19 coronavirus 3CL hydrolase (Mpro), at a time when this information can prove most vital and valuable, highlights the importance of open and timely availability of scientific data. The wwPDB strives to ensure that 3D biological structure data remain freely accessible for all, while maintaining as comprehensive and accurate an archive as possible. We hope that this new structure, and those from related viruses, will help researchers and clinicians address the COVID-19 coronavirus global public health emergency.

Update: Released COVID-19-related PDB structures include

  • PDB structure 6lu7 (X. Liu, B. Zhang, Z. Jin, H. Yang, Z. Rao Crystal structure of COVID-19 main protease in complex with an inhibitor N3 doi: 10.2210/pdb6lu7/pdb) Released 2020-02-05
  • PDB structure 6vsb (D. Wrapp, N. Wang, K.S. Corbett, J.A. Goldsmith, C.-L. Hsieh, O. Abiona, B.S. Graham, J.S. McLellan (2020) Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation Science doi: 10.1126/science.abb2507) Released 2020-02-26
  • PDB structure 6lxt (Y. Zhu, F. Sun Structure of post fusion core of 2019-nCoV S2 subunit doi: 10.2210/pdb6lxt/pdb) Released 2020-02-26
  • PDB structure 6lvn (Y. Zhu, F. Sun Structure of the 2019-nCoV HR2 Domain doi: 10.2210/pdb6lvn/pdb) Released 2020-02-26
  • PDB structure 6vw1
    J. Shang, G. Ye, K. Shi, Y.S. Wan, H. Aihara, F. Li Structural basis for receptor recognition by the novel coronavirus from Wuhan doi: 10.2210/pdb6vw1/pdb
    Released 2020-03-04
  • PDB structure 6vww
    Y. Kim, R. Jedrzejczak, N. Maltseva, M. Endres, A. Godzik, K. Michalska, A. Joachimiak, Center for Structural Genomics of Infectious Diseases Crystal Structure of NSP15 Endoribonuclease from SARS CoV-2 doi: 10.2210/pdb6vww/pdb
    Released 2020-03-04
  • PDB structure 6y2e
    L. Zhang, X. Sun, R. Hilgenfeld Crystal structure of the free enzyme of the SARS-CoV-2 (2019-nCoV) main protease doi: 10.2210/pdb6y2e/pdb
    Released 2020-03-04
  • PDB structure 6y2f
    L. Zhang, X. Sun, R. Hilgenfeld Crystal structure (monoclinic form) of the complex resulting from the reaction between SARS-CoV-2 (2019-nCoV) main protease and tert-butyl (1-((S)-1-(((S)-4-(benzylamino)-3,4-dioxo-1-((S)-2-oxopyrrolidin-3-yl)butan-2-yl)amino)-3-cyclopropyl-1-oxopropan-2-yl)-2-oxo-1,2-dihydropyridin-3-yl)carbamate (alpha-ketoamide 13b) doi: 10.2210/pdb6y2f/pdb
    Released 2020-03-04
  • PDB structure 6y2g
    L. Zhang, X. Sun, R. Hilgenfeld Crystal structure (orthorhombic form) of the complex resulting from the reaction between SARS-CoV-2 (2019-nCoV) main protease and tert-butyl (1-((S)-1-(((S)-4-(benzylamino)-3,4-dioxo-1-((S)-2-oxopyrrolidin-3-yl)butan-2-yl)amino)-3-cyclopropyl-1-oxopropan-2-yl)-2-oxo-1,2-dihydropyridin-3-yl)carbamate (alpha-ketoamide 13b) doi: 10.2210/pdb6y2g/pdb
    Released 2020-03-04
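
For readers who want to retrieve the entries listed above, the following minimal sketch builds download URLs from the PDB IDs. It assumes the RCSB file-download pattern `https://files.rcsb.org/download/<ID>.<format>`; the helper name `rcsb_download_url` is hypothetical, and no network request is made here.

```python
# PDB IDs taken from the list of released COVID-19-related structures above.
COVID_PDB_IDS = ["6lu7", "6vsb", "6lxt", "6lvn", "6vw1", "6vww", "6y2e", "6y2f", "6y2g"]


def rcsb_download_url(pdb_id: str, fmt: str = "cif") -> str:
    """Build a download URL for one entry (assumed RCSB URL pattern)."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a 4-character PDB ID: {pdb_id!r}")
    if fmt not in ("cif", "pdb"):
        raise ValueError("fmt must be 'cif' or 'pdb'")
    return f"https://files.rcsb.org/download/{pdb_id}.{fmt}"


for pdb_id in COVID_PDB_IDS:
    print(pdb_id, rcsb_download_url(pdb_id))
```

The mmCIF format is used as the default in this sketch because it is the archive's primary format.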

Abstract

Coronavirus disease 2019 (COVID-19) is a global pandemic impacting nearly 170 countries/regions and more than 285,000 patients worldwide. COVID-19 is caused by the Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2), which invades cells through the angiotensin converting enzyme 2 (ACE2) receptor. Among those with COVID-19, there is a higher prevalence of cardiovascular disease and more than 7% of patients suffer myocardial injury from the infection (22% of the critically ill). Despite ACE2 serving as the portal for infection, the role of ACE inhibitors or angiotensin receptor blockers requires further investigation. COVID-19 poses a challenge for heart transplantation, impacting donor selection, immunosuppression, and post-transplant management. Thankfully there are a number of promising therapies under active investigation to both treat and prevent COVID-19. Key Words: COVID-19; myocardial injury; pandemic; heart transplant

SOURCE

https://www.ahajournals.org/doi/pdf/10.1161/CIRCULATIONAHA.120.046941

ACE2

  • Towler P, Staker B, Prasad SG, Menon S, Tang J, Parsons T, Ryan D, Fisher M, Williams D, Dales NA, Patane MA, Pantoliano MW (Apr 2004). “ACE2 X-ray structures reveal a large hinge-bending motion important for inhibitor binding and catalysis”. The Journal of Biological Chemistry. 279 (17): 17996–8007. doi:10.1074/jbc.M311191200. PMID 14754895.

  • Turner AJ, Tipnis SR, Guy JL, Rice G, Hooper NM (Apr 2002). “ACEH/ACE2 is a novel mammalian metallocarboxypeptidase and a homologue of angiotensin-converting enzyme insensitive to ACE inhibitors”. Canadian Journal of Physiology and Pharmacology. 80 (4): 346–53. doi:10.1139/y02-021. PMID 12025971.

  • Zhang, Haibo; Penninger, Josef M.; Li, Yimin; Zhong, Nanshan; Slutsky, Arthur S. (3 March 2020). “Angiotensin-converting enzyme 2 (ACE2) as a SARS-CoV-2 receptor: molecular mechanisms and potential therapeutic target”. Intensive Care Medicine. Springer Science and Business Media LLC. doi:10.1007/s00134-020-05985-9. ISSN 0342-4642. PMID 32125455.

  • Gurwitz, David (2020). “Angiotensin receptor blockers as tentative SARS‐CoV‐2 therapeutics”. Drug Development Research. doi:10.1002/ddr.21656. PMID 32129518.

 

Angiotensin converting enzyme 2 (ACE2)

ACE2 is an exopeptidase that catalyses the conversion of angiotensin I to the nonapeptide angiotensin 1-9[5] or the conversion of angiotensin II to angiotensin 1-7.[6][7] ACE2 has direct effects on cardiac function and is expressed predominantly in vascular endothelial cells of the heart and the kidneys.[8] ACE2 is not sensitive to the ACE inhibitor drugs used to treat hypertension.[9]
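
The two conversions described above each remove a single residue from the C-terminal end of the peptide. A minimal sketch of that bookkeeping follows; the one-letter sequences are the standard human angiotensin I and II sequences (they are not given in the text above), and the function is a toy string operation, not a model of the enzyme.

```python
# Standard human peptide sequences in one-letter code (not from the source text).
ANGIOTENSIN_I = "DRVYIHPFHL"   # decapeptide, residues 1-10
ANGIOTENSIN_II = "DRVYIHPF"    # octapeptide, residues 1-8


def remove_c_terminal_residue(peptide: str) -> str:
    """Toy stand-in for a carboxypeptidase step: drop the last residue."""
    if len(peptide) < 2:
        raise ValueError("peptide too short to trim")
    return peptide[:-1]


ang_1_9 = remove_c_terminal_residue(ANGIOTENSIN_I)   # nonapeptide
ang_1_7 = remove_c_terminal_residue(ANGIOTENSIN_II)  # heptapeptide
print("angiotensin 1-9:", ang_1_9, len(ang_1_9))
print("angiotensin 1-7:", ang_1_7, len(ang_1_7))
```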

ACE2 receptors have been shown to be the entry point into human cells for some coronaviruses, including the SARS virus.[10] A number of studies have identified that the entry point is the same for SARS-CoV-2,[11] the virus that causes COVID-19.[12][13][14][15]

Some have suggested that a decrease in ACE2 could be protective against COVID-19,[16] while others have suggested the opposite: that angiotensin II receptor blocker drugs could be protective against COVID-19 by increasing ACE2. These hypotheses need to be tested by data mining of clinical patient records.[17]

REFERENCES

https://en.wikipedia.org/wiki/Angiotensin-converting_enzyme_2

 

FOLDING@HOME TAKES UP THE FIGHT AGAINST COVID-19 / 2019-NCOV

We need your help! Folding@home is joining researchers around the world working to better understand the 2019 Coronavirus (2019-nCoV) to accelerate the open science effort to develop new life-saving therapies. By downloading Folding@home, you can donate your unused computational resources to the Folding@home Consortium, where researchers are working to advance our understanding of the structures of potential drug targets for 2019-nCoV that could aid in the design of new therapies. The data you help us generate will be quickly and openly disseminated as part of an open science collaboration of multiple laboratories around the world, giving researchers new tools that may unlock new opportunities for developing lifesaving drugs.

2019-nCoV is a close cousin to SARS coronavirus (SARS-CoV), and acts in a similar way. For both coronaviruses, the first step of infection occurs in the lungs, when a protein on the surface  of the virus binds to a receptor protein on a lung cell. This viral protein is called the spike protein, depicted in red in the image below, and the receptor is known as ACE2. A therapeutic antibody is a type of protein that can block the viral protein from binding to its receptor, therefore preventing the virus from infecting the lung cell. A therapeutic antibody has already been developed for SARS-CoV, but to develop therapeutic antibodies or small molecules for 2019-nCoV, scientists need to better understand the structure of the viral spike protein and how it binds to the human ACE2 receptor required for viral entry into human cells.

Proteins are not stagnant—they wiggle and fold and unfold to take on numerous shapes.  We need to study not only one shape of the viral spike protein, but all the ways the protein wiggles and folds into alternative shapes in order to best understand how it interacts with the ACE2 receptor, so that an antibody can be designed. Low-resolution structures of the SARS-CoV spike protein exist and we know the mutations that differ between SARS-CoV and 2019-nCoV.  Given this information, we are uniquely positioned to help model the structure of the 2019-nCoV spike protein and identify sites that can be targeted by a therapeutic antibody. We can build computational models that accomplish this goal, but it takes a lot of computing power.

This is where you come in! With many computers working towards the same goal, we aim to help develop a therapeutic remedy as quickly as possible. By downloading Folding@home here [LINK] and selecting to contribute to “Any Disease”, you can help provide us with the computational power required to tackle this problem. One protein from 2019-nCoV, a protease encoded by the viral RNA, has already been crystallized. Although the 2019-nCoV spike protein of interest has not yet been resolved bound to ACE2, our objective is to use the homologous structure of the SARS-CoV spike protein to identify therapeutic antibody targets.

This illustration, created at the Centers for Disease Control and Prevention (CDC), reveals ultrastructural morphology exhibited by coronaviruses. Note the spikes that adorn the outer surface of the virus, which impart the look of a corona surrounding the virion, when viewed electron microscopically. A novel coronavirus was identified as the cause of an outbreak of respiratory illness first detected in Wuhan, China in 2019.

Image and Caption Credit: Alissa Eckert, MS; Dan Higgins, MAM available at https://phil.cdc.gov/Details.aspx?pid=23311

Structures of the closely related SARS-CoV spike protein bound by therapeutic antibodies may help rapidly design better therapies. The three monomers of the SARS-CoV spike protein are shown in different shades of red; the antibody is depicted in green. [PDB: 6NB7 https://www.rcsb.org/structure/6nb7]

(post authored by Ariana Brenner Clerkin)

References:

PDB 6lu7 structure summary ‹ Protein Data Bank in Europe (PDBe) ‹ EMBL-EBI https://www.ebi.ac.uk/pdbe/entry/pdb/6lu7 (accessed Feb 5, 2020).

Tian, X.; Li, C.; Huang, A.; Xia, S.; Lu, S.; Shi, Z.; Lu, L.; Jiang, S.; Yang, Z.; Wu, Y.; et al. Potent Binding of 2019 Novel Coronavirus Spike Protein by a SARS Coronavirus-Specific Human Monoclonal Antibody; preprint; Microbiology, 2020. https://doi.org/10.1101/2020.01.28.923011.

Walls, A. C.; Xiong, X.; Park, Y. J.; Tortorici, M. A.; Snijder, J.; Quispe, J.; Cameroni, E.; Gopal, R.; Dai, M.; Lanzavecchia, A.; et al. Unexpected Receptor Functional Mimicry Elucidates Activation of Coronavirus Fusion. Cell 2019, 176, 1026-1039.e15. https://doi.org/10.2210/pdb6nb7/pdb.

SOURCE

https://foldingathome.org/2020/02/27/foldinghome-takes-up-the-fight-against-covid-19-2019-ncov/

UPDATED 3/13/2020

I am reposting the following Science blog post from Derek Lowe as is, and I ask readers to browse through the comments on his Science blog In the Pipeline. As Dr. Lowe states, in this current crisis it is important to disseminate good information as quickly as possible, so I wanted readers here to be able to read his excellent post on Covid-19. I would also like to direct readers to the opinion letter in the journal Science on how important it is to rebuild trust in good science and the scientific process. The full link for the following In the Pipeline post is: https://blogs.sciencemag.org/pipeline/archives/2020/03/06/covid-19-small-molecule-therapies-reviewed

A Summary of current potential repurposed therapeutics for COVID-19 Infection from In The Pipeline: A Science blog from Derek Lowe

Covid-19 Small Molecule Therapies Reviewed

Let’s take inventory on the therapies that are being developed for the coronavirus epidemic. Here is a very thorough list at Biocentury, and I should note that (like Stat and several other organizations) they’re making all their Covid-19 content free to all readers during this crisis. I’d like to zoom in today on the potential small-molecule therapies, since some of these have the most immediate prospects for use in the real world.

The ones at the front of the line are repurposed drugs that are already approved for human use, for a lot of obvious reasons. The Biocentury list doesn’t cover these, but here’s an article at Nature Biotechnology that goes into detail. Clinical trials are a huge time sink – they sort of have to be, in most cases, if they’re going to be any good – and if you’ve already done all that stuff it’s a huge leg up, even if the drug itself is not exactly a perfect fit for the disease. So what do we have? The compound that is most advanced is probably remdesivir from Gilead, at right. This has been in development for a few years as an RNA virus therapy – it was originally developed for Ebola, and has been tried out against a whole list of single-strand RNA viruses. That includes the related coronaviruses SARS and MERS, so Covid-19 was an obvious fit.

The compound is a prodrug – that phosphoramide gets cleaved off completely, leaving the active 5-OH compound GS-441524. Its mechanism of action is to get incorporated into viral RNA, since it’s taken up by RNA polymerase and it largely seems to evade proofreading. This causes RNA termination trouble later on, since that alpha-nitrile C-nucleoside is not exactly what the virus is expecting in its genome at that point, and thus viral replication is inhibited.

There are five clinical trials underway (here’s an overview at Biocentury). The NIH has an adaptive-design Phase II trial that has already started in Nebraska, with doses to be changed according to Bayesian readouts along the way. There are two Phase III trials underway at China-Japan Friendship Hospital in Hubei, double-blinded and placebo-controlled (since placebo is, as far as drug therapy goes, the current standard of care). And Gilead themselves are starting two open-label trials, one with no control arm and one with an (unblinded) standard-of-care comparison arm. Those might read out first, depending on when they get off the ground, but will be only rough readouts due to the fast-and-loose trial design. The two Hubei trials and the NIH one will add some rigor to the process, but I’m not sure when they’re going to report. My personal opinion is that I like the chances of this drug more than anything else on this list, but it’s still unlikely to be a game-changer.
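
To give a feel for what a "Bayesian readout" can look like in an adaptive design, here is a minimal sketch of a Beta-Binomial update, in which the estimated response rate of one trial arm is revised as each batch of patients reports. Everything here is an assumption for illustration: the class name, the uniform prior, and the invented patient counts. It is not the statistical design of the NIH trial or of any trial mentioned above.

```python
from dataclasses import dataclass


@dataclass
class BetaPosterior:
    """Beta distribution over one arm's response rate (illustrative only)."""
    alpha: float = 1.0  # prior pseudo-count of responders (uniform prior assumed)
    beta: float = 1.0   # prior pseudo-count of non-responders

    def update(self, responders: int, non_responders: int) -> "BetaPosterior":
        # Conjugate update: add observed counts to the pseudo-counts.
        return BetaPosterior(self.alpha + responders, self.beta + non_responders)

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


arm = BetaPosterior()
# Invented interim batches: (responders, non-responders).
for responders, non_responders in [(3, 7), (6, 4)]:
    arm = arm.update(responders, non_responders)
    print(f"posterior mean response rate: {arm.mean:.3f}")
```

In a real adaptive trial, quantities like this posterior would feed pre-specified rules for changing doses or allocation; those rules are not sketched here.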

There’s an RNA polymerase inhibitor (favipiravir) from Toyama, at right, that’s in a trial in China. It’s a thought – a broad-spectrum agent of this sort would be the sort of thing to try. But unfortunately, from what I can see, it has already turned up as ineffective in in vitro tests. The human trial that’s underway is honestly the sort of thing that would only happen under circumstances like the present: a developing epidemic with a new pathogen and no real standard of care. I hold out little hope for this one, but given that there’s nothing else at present, it probably should be tried. As you’ll see, this is far from the only situation like this.

One of the screens of known drugs in China that also flagged remdesivir noted that the old antimalarial drug chloroquine seemed to be effective in vitro. It had been reported some years back as a possible antiviral, working through more than one mechanism, probably both at viral entry and intracellularly thereafter. That part shouldn’t be surprising – chloroquine’s actual mode(s) of action against malaria parasites are still not completely worked out, either, and some of what people thought they knew about it has turned out to be wrong. There are several trials underway with it at Chinese facilities, some in combination with other agents like remdesivir. Chloroquine has of course been taken for many decades as an antimalarial, but it has a number of liabilities, including seizures, hearing damage, retinopathy and sudden effects on blood glucose. So it’s going to be important to establish just how effective it is and what doses will be needed. Just as with vaccine candidates, it’s possible to do more harm with a rushed treatment than the disease is doing itself.

Several other known antiviral drugs are being tried in China, but I don’t have too much hope for those, either. The neuraminidase inhibitors such as oseltamivir (better known as Tamiflu) were tried against SARS and were ineffective; there is no reason to expect anything versus Covid-19 although these drugs are a component of some drug cocktail trials. The HIV protease therapies such as darunavir and the combination therapy Kaletra are in trials, but that’s also a rather desperate long shot, since there’s no particular reason to think that they will have any such protease inhibition against what this new virus has to offer (and indeed, such agents weren’t much help against SARS in the end, either). The classic interferon/ribavirin combination seems to have had some activity against SARS and MERS, and is in two trials from what I can see. That’s not an awful idea by any means, but it’s not a great one, either: if your viral disease has interferon/ribavirin as a front line therapy, it generally means that there’s nothing really good available. No, unless we get really lucky none of these ideas are going to slow the disease down much.

There are a few other repurposed-protease-inhibitors ideas out there, such as this one. (Edit: I had seen this paper but couldn’t track it down, so thanks to those who sent it along). This paper suggests that the TMPRSS2 protease is important for viral entry on the human-cell-side of the process, a pathway that has been noted for other coronaviruses. And it points out that there is an approved inhibitor (in Japan) for this enzyme (camostat), so that would definitely seem to be worth a trial, probably in combination with remdesivir.

That’s about it for the existing small molecules, from what I can see. What about new ones? Don’t hold your breath, is all I can say. A drug discovery program from scratch against a new pathogen is, as many readers here well know, not a trivial exercise. As this Bloomberg article details, many such efforts in the past (small molecules and vaccines alike) have come to grief because by the time they had anything to deliver the epidemic itself had passed. Indeed, Gilead’s remdesivir had already been dropped as a potential Ebola therapy.

You will either need to have a target in mind up front or go phenotypic. For the former, what you’d see are better characterizations of the viral protease and more extensive screens against it. Two other big target areas are viral entry (which involves the “spike” proteins on the virus surface and the ACE2 protein on human cells) and viral replication. To the former, it’s worth quickly noting that ACE2 is so much unlike the more familiar ACE protein that none of the cardiovascular ACE inhibitors do anything to it at all. And targeting the latter mechanisms is how remdesivir was developed as a possible Ebola agent, but as you can see, that took time, too. Phenotypic screens are perfectly reasonable against viral pathogens as well, but you’ll need to put time and effort into that assay up front, just as with any phenotypic effort, because as anyone who does that sort of work will tell you, a bad phenotypic screen is a complete waste of everyone’s time.

One of the key steps for either route is identifying an animal model. While animal models of infectious disease can be extremely well translated to human therapy, that doesn’t happen by accident: you need to choose the right animal. Viruses in general (and coronaviruses are no exception) vary widely in their effects in different species, and not just across the gaps of bird/reptile/human and the like. No, you’ll run into things where even the usual set of small mammals are acting differently from each other, with some of them not even getting sick at all. This current virus may well have gone through a couple of other mammalian species before landing on us, but you’ll note that dogs (to pick one) don’t seem to have any problem with it.

All this means that any new-target new-chemical-matter effort against Covid-19 (or any new pathogen) is going to take years, and there is just no way around that. Update: see here for just such an effort to start finding fragment hits for the viral protease. This puts small molecules in a very bimodal distribution: you have the existing drugs that might be repurposed, and are presumably available right now. Nothing else is! At the other end, for completely new therapies you have the usual prospects of drug discovery: years from now, lots of money, low success rate, good luck to all of us. The gap between these two could in theory be filled by vaccines and antibody therapies (if everything goes really, really well) but those are very much their own area and will be dealt with in a separate post.

Either way, the odds are that we (and I mean “we as a species” here) are going to be fighting this epidemic without any particularly amazing pharmacological weapons. Eventually we’ll have some, but I would advise people, pundits, and politicians not to get all excited about the prospects for some new therapies to come riding up over the hill to help us out. The odds of that happening in time to do anything about the current outbreak are very small. We will be going for months, years, with the therapeutic options we have right now. Look around you: what we have today is what we have to work with.

Other related articles published in this Open Access Online Scientific Journal include the following:

 

Group of Researchers @ University of California, Riverside, the University of Chicago, the U.S. Department of Energy’s Argonne National Laboratory, and Northwestern University solve COVID-19 Structure and Map Potential Therapeutics

Reporters: Stephen J Williams, PhD and Aviva Lev-Ari, PhD, RN

https://pharmaceuticalintelligence.com/2020/03/06/group-of-researchers-solve-covid-19-structure-and-map-potential-therapeutic/

Predicting the Protein Structure of Coronavirus: Inhibition of Nsp15 can slow viral replication and Cryo-EM – Spike protein structure (experimentally verified) vs AI-predicted protein structures (not experimentally verified) of DeepMind (Parent: Google) aka AlphaFold

Curators: Stephen J. Williams, PhD and Aviva Lev-Ari, PhD, RN

https://pharmaceuticalintelligence.com/2020/03/08/predicting-the-protein-structure-of-coronavirus-inhibition-of-nsp15-can-slow-viral-replication-and-cryo-em-spike-protein-structure-experimentally-verified-vs-ai-predicted-protein-structures-not/

 

Coronavirus facility opens at Rambam Hospital using new Israeli tech

https://www.jpost.com/Israel-News/Coronavirus-facility-opens-at-Rambam-Hospital-using-new-Israeli-tech-619681

 



Predicting the Protein Structure of Coronavirus: Inhibition of Nsp15 can slow viral replication and Cryo-EM – Spike protein structure (experimentally verified) vs AI-predicted protein structures (not experimentally verified) of DeepMind (Parent: Google) aka AlphaFold

 

Curators: Stephen J. Williams, PhD and Aviva Lev-Ari, PhD, RN

This illustration, created at the Centers for Disease Control and Prevention (CDC), reveals ultrastructural morphology exhibited by coronaviruses. Note the spikes that adorn the outer surface of the virus, which impart the look of a corona surrounding the virion, when viewed electron microscopically. A novel coronavirus was identified as the cause of an outbreak of respiratory illness first detected in Wuhan, China in 2019.

Image and Caption Credit: Alissa Eckert, MS; Dan Higgins, MAM available at https://phil.cdc.gov/Details.aspx?pid=23311

 

UPDATED on 3/11/2020

Coronaviruses

According to the World Health Organization, coronaviruses make up a large family of viruses named for the crown-like spikes found on their surface (Figure 1). They carry their genetic material in single strands of RNA and cause respiratory problems and fever. Like HIV, coronaviruses can be transmitted between animals and humans.  Coronaviruses have been responsible for the Severe Acute Respiratory Syndrome (SARS) pandemic in the early 2000s and the Middle East Respiratory Syndrome (MERS) outbreak in South Korea in 2015. While the most recent coronavirus, COVID-19, has caused international concern, accessible and inexpensive sequencing is helping us understand COVID-19 and respond to the outbreak quickly.

Figure 1. Coronaviruses with the characteristic spikes as seen under a microscope.

First studies that explore genetic susceptibility to COVID-19 are now being published. The first results indicate that COVID-19 infects cells using the ACE2 cell-surface receptor. Genetic variants in the ACE2 receptor gene are thus likely to influence how effectively COVID-19 can enter the cells in our bodies. Researchers hope to discover genetic variants that confer resistance to a COVID-19 infection, similar to how some variants in the CCR5 receptor gene make people immune to HIV. At Nebula Genomics, we are monitoring the latest COVID-19 research and will add any relevant discoveries to the Nebula Research Library in a timely manner.

The Role of Genomics in Responding to COVID-19

Scientists in China sequenced COVID-19’s genome just a few weeks after the first case was reported in Wuhan. This stands in contrast to SARS, which was discovered in late 2002 but was not sequenced until April of 2003. It is through inexpensive genome-sequencing that many scientists across the globe are learning and sharing information about COVID-19, allowing us to track the evolution of COVID-19 in real-time. Ultimately, sequencing can help remove the fear of the unknown and allow scientists and health professionals to prepare to combat the spread of COVID-19.

Next-generation DNA sequencing technology has shown that the COVID-19 genome is ~30,000 bases long. Moreover, researchers in China determined that COVID-19 is almost identical to a coronavirus found in bats and is very similar to SARS. These insights have been critical in aiding the development of diagnostics and vaccines. For example, the Centers for Disease Control and Prevention developed a diagnostic test to detect COVID-19 RNA from nose or mouth swabs.

Moreover, a number of different government agencies and pharmaceutical companies are in the process of developing COVID-19 vaccines to stop COVID-19 from infecting more people. To protect humans from infection, inactivated virus particles or parts of the virus (e.g., viral proteins) can be injected into humans. The immune system will recognize the inactivated virus as foreign, priming the body to build immunity against possible future infection. Of note, Moderna Inc., the National Institute of Allergy and Infectious Diseases, and the Coalition for Epidemic Preparedness Innovations identified a COVID-19 vaccine candidate in a record 42 days. This vaccine will be tested in human clinical trials starting in April.

For more information about COVID-19, please refer to the World Health Organization website.

SOURCE

https://blog.nebula.org/role-of-genomics-coronavirus-covid-19/?utm_source=Nebula%20Genomics&utm_medium=email&utm_campaign=COVID-19

Aviva Lev-Ari
@AVIVA1950

Predicting the #ProteinStructure of #Coronavirus: #Inhibition of #Nsp15 #Cryo-EM – #spike #protein structure (#experimentally verified) vs #AI-predicted protein structures (not verified) of #AlphaFold

Quote Tweet
Eric Topol
@EricTopol
The problem w/ visionaries is that we don’t recognize them in a timely manner (too late) Ralph Baric @UNCpublichealth and Vineet Menachery deserve recognition for being 5 yrs ahead of #COVID19 nature.com/articles/nm.39 @NatureMedicine pnas.org/content/113/11 @PNASNews via @hoondy

Senior, A.W., Evans, R., Jumper, J. et al. Improved protein structure prediction using potentials from deep learning. Nature 577, 706–710 (2020). https://doi.org/10.1038/s41586-019-1923-7

Abstract

Protein structure prediction can be used to determine the three-dimensional shape of a protein from its amino acid sequence [1]. This problem is of fundamental importance as the structure of a protein largely determines its function [2]; however, protein structures can be difficult to determine experimentally. Considerable progress has recently been made by leveraging genetic information. It is possible to infer which amino acid residues are in contact by analysing covariation in homologous sequences, which aids in the prediction of protein structures [3]. Here we show that we can train a neural network to make accurate predictions of the distances between pairs of residues, which convey more information about the structure than contact predictions. Using this information, we construct a potential of mean force [4] that can accurately describe the shape of a protein. We find that the resulting potential can be optimized by a simple gradient descent algorithm to generate structures without complex sampling procedures. The resulting system, named AlphaFold, achieves high accuracy, even for sequences with fewer homologous sequences. In the recent Critical Assessment of Protein Structure Prediction [5] (CASP13), a blind assessment of the state of the field, AlphaFold created high-accuracy structures (with template modelling (TM) scores [6] of 0.7 or higher) for 24 out of 43 free modelling domains, whereas the next best method, which used sampling and contact information, achieved such accuracy for only 14 out of 43 domains. AlphaFold represents a considerable advance in protein-structure prediction. We expect this increased accuracy to enable insights into the function and malfunction of proteins, especially in cases for which no structures for homologous proteins have been experimentally determined [7].

[ALA added bold face]
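The abstract above describes predicting distances between residue pairs, building a potential from them, and minimizing that potential by gradient descent. The following is only a toy sketch of that idea, under loud assumptions: the squared-error potential is a simple stand-in and not AlphaFold's learned potential of mean force, the sketch moves raw 3D points rather than a real protein backbone, the "predicted" distances are computed from made-up coordinates, and every function name is hypothetical.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(coords):
    """All pairwise distances for a list of 3D points."""
    return [[distance(a, b) for b in coords] for a in coords]


def potential(coords, target):
    """Toy potential (an assumption, not AlphaFold's): sum of squared
    deviations between current and target pairwise distances."""
    n = len(coords)
    return sum(
        (distance(coords[i], coords[j]) - target[i][j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )


def random_coords(n, seed=0):
    """Random starting points in the cube [-1, 1]^3."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(n)]


def gradient_descent(target, start, steps=2000, lr=0.01):
    """Minimize the toy potential by plain gradient descent on coordinates."""
    coords = [list(p) for p in start]  # copy so the caller's list is untouched
    n = len(coords)
    for _ in range(steps):
        grads = [[0.0, 0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                d = distance(coords[i], coords[j])
                if d < 1e-9:
                    continue  # gradient is undefined for coincident points
                coeff = 2.0 * (d - target[i][j]) / d
                for k in range(3):
                    g = coeff * (coords[i][k] - coords[j][k])
                    grads[i][k] += g
                    grads[j][k] -= g
        for i in range(n):
            for k in range(3):
                coords[i][k] -= lr * grads[i][k]
    return coords


# Hypothetical five-point "chain" standing in for predicted distances.
reference = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1], [0, 1, 1]]
predicted_distances = distance_matrix(reference)
start = random_coords(len(reference), seed=0)
folded = gradient_descent(predicted_distances, start)
```

Pairwise distances alone cannot distinguish a structure from its mirror image, and plain gradient descent can stop in a local minimum, so a sketch like this only illustrates the optimization step, not a usable structure predictor.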

COVID-19 outbreak

The scientific community has galvanised in response to the recent COVID-19 outbreak, building on decades of basic research characterising this virus family. Labs at the forefront of the outbreak response shared genomes of the virus in open access databases, which enabled researchers to rapidly develop tests for this novel pathogen. Other labs have shared experimentally-determined and computationally-predicted structures of some of the viral proteins, and still others have shared epidemiological data. We hope to contribute to the scientific effort using the latest version of our AlphaFold system by releasing structure predictions of several under-studied proteins associated with SARS-CoV-2, the virus that causes COVID-19. We emphasise that these structure predictions have not been experimentally verified, but hope they may contribute to the scientific community’s interrogation of how the virus functions, and serve as a hypothesis generation platform for future experimental work in developing therapeutics. We’re indebted to the work of many other labs: this work wouldn’t be possible without the efforts of researchers across the globe who have responded to the COVID-19 outbreak with incredible agility.

Knowing a protein’s structure provides an important resource for understanding how it functions, but experiments to determine the structure can take months or longer, and some prove to be intractable. For this reason, researchers have been developing computational methods to predict protein structure from the amino acid sequence.  In cases where the structure of a similar protein has already been experimentally determined, algorithms based on “template modelling” are able to provide accurate predictions of the protein structure. AlphaFold, our recently published deep learning system, focuses on predicting protein structure accurately when no structures of similar proteins are available, called “free modelling”.  We’ve continued to improve these methods since that publication and want to provide the most useful predictions, so we’re sharing predicted structures for some of the proteins in SARS-CoV-2 generated using our newly-developed methods.

It’s important to note that our structure prediction system is still in development and we can’t be certain of the accuracy of the structures we are providing, although we are confident that the system is more accurate than our earlier CASP13 system. We confirmed that our system provided an accurate prediction for the experimentally determined SARS-CoV-2 spike protein structure shared in the Protein Data Bank, and this gave us confidence that our model predictions on other proteins may be useful. We recently shared our results with several colleagues at the Francis Crick Institute in the UK, including structural biologists and virologists, who encouraged us to release our structures to the general scientific community now. Our models include per-residue confidence scores to help indicate which parts of the structure are more likely to be correct. We have only provided predictions for proteins which lack suitable templates or are otherwise difficult for template modeling.  While these understudied proteins are not the main focus of current therapeutic efforts, they may add to researchers’ understanding of SARS-CoV-2.

Normally we’d wait to publish this work until it had been peer-reviewed for an academic journal. However, given the potential seriousness and time-sensitivity of the situation, we’re releasing the predicted structures as we have them now, under an open license so that anyone can make use of them.

Interested researchers can download the structures here, and can read more technical details about these predictions in a document included with the data. The protein structure predictions we’re releasing are for SARS-CoV-2 membrane protein, protein 3a, Nsp2, Nsp4, Nsp6, and Papain-like proteinase (C terminal domain). To emphasise, these are predicted structures which have not been experimentally verified. Work on the system continues for us, and we hope to share more about it in due course.

Citation:  John Jumper, Kathryn Tunyasuvunakool, Pushmeet Kohli, Demis Hassabis, and the AlphaFold Team, “Computational predictions of protein structures associated with COVID-19”, DeepMind website, 5 March 2020, https://deepmind.com/research/open-source/computational-predictions-of-protein-structures-associated-with-COVID-19

SARS-COV-2 MEMBRANE PROTEIN: A RENDERING OF ONE OF OUR PROTEIN STRUCTURE PREDICTIONS

SOURCES

Computational predictions of protein structures associated with COVID-19

https://deepmind.com/research/open-source/computational-predictions-of-protein-structures-associated-with-COVID-19

AlphaFold: Using AI for scientific discovery 

https://deepmind.com/blog/article/AlphaFold-Using-AI-for-scientific-discovery

 

DeepMind has shared its results with researchers at the Francis Crick Institute, a biomedical research lab in the UK, as well as offering it for download from its website.

“Normally we’d wait to publish this work until it had been peer-reviewed for an academic journal. However, given the potential seriousness and time-sensitivity of the situation, we’re releasing the predicted structures as we have them now, under an open license so that anyone can make use of them,” it said. [ALA added bold face]

There are 93,090 cases of COVID-19, and 3,198 deaths, spread across 76 countries, according to the latest report from the World Health Organization at time of writing. ®

SOURCE

https://www.theregister.co.uk/2020/03/06/deepmind_covid19_outbreak/

 

  • MHC content – The spike protein is thought to be the key to binding to cells via the angiotensin-converting enzyme 2 (ACE2) receptor; the major histocompatibility complex (MHC) is the major mechanism the immune system uses to distinguish self from non-self

Preliminary Identification of Potential Vaccine Targets for the COVID-19 Coronavirus (SARS-CoV-2) Based on SARS-CoV Immunological Studies

Syed Faraz Ahmed 1,† , Ahmed A. Quadeer 1, *,† and Matthew R. McKay 1,2, *

1 Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong, China; sfahmed@connect.ust.hk

2 Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Hong Kong, China

* Correspondence: eeaaquadeer@ust.hk.com (A.A.Q.); m.mckay@ust.hk (M.R.M.)

† These authors contributed equally to this work.

Received: 9 February 2020; Accepted: 24 February 2020; Published: 25 February 2020

Abstract:

The beginning of 2020 has seen the emergence of COVID-19 outbreak caused by a novel coronavirus, Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). There is an imminent need to better understand this new virus and to develop ways to control its spread. In this study, we sought to gain insights for vaccine design against SARS-CoV-2 by considering the high genetic similarity between SARS-CoV-2 and SARS-CoV, which caused the outbreak in 2003, and leveraging existing immunological studies of SARS-CoV. By screening the experimentally determined SARS-CoV-derived B cell and T cell epitopes in the immunogenic structural proteins of SARS-CoV, we identified a set of B cell and T cell epitopes derived from the spike (S) and nucleocapsid (N) proteins that map identically to SARS-CoV-2 proteins. As no mutation has been observed in these identified epitopes among the 120 available SARS-CoV-2 sequences (as of 21 February 2020), immune targeting of these epitopes may potentially offer protection against this novel virus. For the T cell epitopes, we performed a population coverage analysis of the associated MHC alleles and proposed a set of epitopes that is estimated to provide broad coverage globally, as well as in China. Our findings provide a screened set of epitopes that can help guide experimental efforts towards the development of vaccines against SARS-CoV-2.

Keywords: Coronavirus; 2019-nCoV; 2019 novel coronavirus; SARS-CoV-2; COVID-19; SARS-CoV; MERS-CoV; T cell epitopes; B cell epitopes; vaccine [ALA added bold face]
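The screening step described in the abstract (keeping SARS-CoV epitopes that map identically to SARS-CoV-2 proteins and show no mutation across the available sequences) can be illustrated with exact substring matching. This is a minimal sketch, not the authors' pipeline: the function names are hypothetical, and the peptide and protein strings are invented toy data rather than real epitopes or viral sequences.

```python
def identical_epitopes(epitopes, target_protein):
    """Return the epitopes that occur verbatim in the target protein sequence."""
    return [e for e in epitopes if e in target_protein]


def conserved_epitopes(epitopes, sequences):
    """Return the epitopes found unchanged in every sequence of the collection."""
    return [e for e in epitopes if all(e in s for s in sequences)]


# Toy data (invented for illustration only).
candidate_epitopes = ["KTLLVAGG", "SNNYLW", "AAAAAA"]
protein_isolates = [
    "MKTLLVAGGSNNYLWRA",
    "MKTLLVAGGSNNYLWRA",
    "MKTLLVAGGSNQYLWRA",  # one substitution inside the second epitope
]

mapped = identical_epitopes(candidate_epitopes, protein_isolates[0])
stable = conserved_epitopes(candidate_epitopes, protein_isolates)
```

Exact matching is deliberately strict: a single substitution anywhere in a peptide removes it from the conserved set, which mirrors the "map identically" criterion in the abstract but ignores near-identical epitopes that might still be recognized.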

SOURCE

https://www.mdpi.com/1999-4915/12/3/254/pdf

 

Selected Online COMMENTS to

https://forums.theregister.co.uk/forum/all/2020/03/06/deepmind_covid19_outbreak/

Muscleguy

Re: Protein structure prediction has been done for ages…

Not quite, Natural Selection does not measure methods, it measures outputs, usually at the organism level.

Sure correct folding is necessary for much protein function and we have prions and chaperone proteins to get it wrong and right.

The only way NS measures methods and mechanisms is if they are very energetically wasteful. But there are some very wasteful ones out there. Beta-Catenin at the end point of Wnt signalling comes particularly to mind.

Chemist

Re: Does not matter at all

“Determining the structure of the virus proteins might also help in developing a molecule that disrupts the operation of just those proteins, and not anything else in the human body.”

Well it might, but predicting whether a ‘drug’ will NOT interact with any other of the 20,000+ proteins in complex organisms is well beyond current science. If we could do that we could predict/avoid toxicity and other non-mechanism related side-effects & mostly we can’t.

rob miller

There are 480 structures on PDBe resulting from a search on ‘coronavirus,’ the top hits from MERS and SARS. PR stunt or not, they did win the most recent CASP ‘competition’, so arguably it’s probably our best shot right now – and I am certainly not satisfied that they have been sufficiently open in explaining their algorithms though I have not checked in the last few months. No one is betting anyone’s health on this, and it is not like making one wrong turn in a series of car directions. Latest prediction algorithms incorporate contact map predictions, so it’s not like a wrong dihedral angle sends the chain off in the wrong direction. A decent model would give something to run docking algorithms against with a series of already approved drugs, then we take that shortlist into the lab. A confirmed hit could be an instantly available treatment, no two year wait as currently estimated. [ALA added bold face]

jelabarre59

Re: these structure predictions have not been experimentally verified

Naaaah. Can’t possibly be a stupid marketing stunt.

Well yes, a good possibility. But it can also be trying to build on the open-source model of putting it out there for others to build and improve upon. Essentially opening that “peer review” to a larger audience quicker. [ALA added bold face]

We shall see.

Anonymous Coward

What bothers me, besides the obvious PR stunt, is that they say this prediction is licensed. How can a prediction from software be protected by, I presume, patents? And if this can be protected without even verifying which predictions actually work, what’s to stop someone spitting out millions of random, untested predictions just in case they can claim ownership later when one of them is proven to work? [ALA added bold face]

 

 

SOURCES

 

  • AI-predicted protein structures could unlock vaccine for Wuhan coronavirus… if correct… after clinical trials It’s not quite DeepMind’s ‘Come with me if you want to live’ moment, but it’s close, maybe

Experimentally derived by a group of scientists at the University of Texas at Austin and the National Institute of Allergy and Infectious Diseases, an agency under the US National Institutes of Health. They both feature a “Spike protein structure.”

  • Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation

Science  19 Feb 2020:
eabb2507
DOI: 10.1126/science.abb2507

 

  • Israeli scientists: We have developed a coronavirus vaccine

https://www.fromthegrapevine.com/health/coronavirus-vaccine-israel-migal-research-institute-david-zigdon

Other related articles published in this Open Access Online Scientific Journal include the following:

 

  • Group of Researchers @ University of California, Riverside, the University of Chicago, the U.S. Department of Energy’s Argonne National Laboratory, and Northwestern University solve COVID-19 Structure and Map Potential Therapeutics

Reporters: Stephen J Williams, PhD and Aviva Lev-Ari, PhD, RN

https://pharmaceuticalintelligence.com/2020/03/06/group-of-researchers-solve-covid-19-structure-and-map-potential-therapeutic/

 

  • Is It Time for the Virtual Scientific Conference?: Coronavirus, Travel Restrictions, Conferences Cancelled Curator:

Stephen J. Williams, PhD

https://pharmaceuticalintelligence.com/2020/03/06/is-it-time-for-the-virtual-scientific-conference-coronavirus-travel-restrictions-conferences-cancelled/


Group of Researchers @ University of California, Riverside, the University of Chicago, the U.S. Department of Energy’s Argonne National Laboratory, and Northwestern University solve COVID-19 Structure and Map Potential Therapeutics

Reporters: Stephen J Williams, PhD and Aviva Lev-Ari, PhD, RN

 

This illustration, created at the Centers for Disease Control and Prevention (CDC), reveals ultrastructural morphology exhibited by coronaviruses. Note the spikes that adorn the outer surface of the virus, which impart the look of a corona surrounding the virion, when viewed electron microscopically. A novel coronavirus was identified as the cause of an outbreak of respiratory illness first detected in Wuhan, China in 2019.

Image and Caption Credit: Alissa Eckert, MS; Dan Higgins, MAM available at https://phil.cdc.gov/Details.aspx?pid=23311

 

New coronavirus protein reveals drug target

Image of newly mapped coronavirus protein, called Nsp15, which helps the virus replicate.

Image Credit: Northwestern University

The 3-D structure of a potential drug target in a newly mapped protein of COVID-19, or coronavirus, has been solved by a team of researchers from the University of California, Riverside, the University of Chicago, the U.S. Department of Energy’s Argonne National Laboratory, and Northwestern University.

The scientists said their findings suggest drugs previously developed to treat the earlier SARS outbreak could now be developed as effective drugs against COVID-19.

The initial genome analysis and design of constructs for protein synthesis were performed by the bioinformatic group of Adam Godzik, a professor of biomedical sciences at the UC Riverside School of Medicine.

The protein Nsp15 from Severe Acute Respiratory Syndrome Coronavirus 2, or SARS-CoV-2, is 89% identical to the protein from the earlier outbreak of SARS-CoV. SARS-CoV-2 is responsible for the current outbreak of COVID-19. Studies published in 2010 on SARS-CoV revealed inhibition of Nsp15 can slow viral replication. This suggests drugs designed to target Nsp15 could be developed as effective drugs against COVID-19.
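A figure such as "89% identical" is typically obtained by aligning the two protein sequences and counting matching positions. The sketch below shows one common way to compute percent identity from an alignment that has already been made. It is an illustration only: the function name is hypothetical, the example strings are invented rather than real Nsp15 sequences, and the choice to ignore gapped columns is an assumption (other definitions divide by the full alignment length or by the shorter sequence).

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Invented ten-residue example: nine of ten aligned positions match.
identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
```

Because the denominator convention changes the result, a reported identity value is only comparable with others computed the same way.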

Adam Godzik, UC Riverside professor of biomedical sciences
Credit: Sanford Burnham Prebys Medical Discovery Institute

“While the [SARS-CoV-2] virus is very similar to the SARS virus that caused epidemics in 2003, new structures shed light on the small, but potentially important differences between the two viruses that contribute to the different patterns in the spread and severity of the diseases they cause,” Godzik said.

The structure of Nsp15, which will be released to the scientific community on March 4, was solved by the group of Andrzej Joachimiak, a distinguished fellow at the Argonne National Laboratory, University of Chicago Professor, and Director of the Structural Biology Center at Argonne’s Advanced Photon Source, a Department of Energy Office of Science user facility.

“Nsp15 is conserved among coronaviruses and is essential in their lifecycle and virulence,” Joachimiak said. “Initially, Nsp15 was thought to directly participate in viral replication, but more recently, it was proposed to help the virus replicate possibly by interfering with the host’s immune response.”

Mapping a 3D protein structure of the virus, also called solving the structure, allows scientists to figure out how to interfere in the pathogen’s replication in human cells.

“The Nsp15 protein has been investigated in SARS as a novel target for new drug development, but that never went very far because the SARS epidemic went away, and all new drug development ended,” said Karla Satchell, a professor of microbiology-immunology at Northwestern, who leads the international team of scientists investigating the structure of the SARS-CoV-2 virus to understand how to stop it from replicating. “Some inhibitors were identified but never developed into drugs. The inhibitors that were developed for SARS now could be tested against this protein.”

Rapid upsurge and proliferation of SARS-CoV-2 raised questions about how this virus could become so much more transmissible as compared to the SARS and MERS coronaviruses. The scientists are mapping the proteins to address this issue.

Over the past two months, COVID-19 infected more than 80,000 people and caused at least 2,700 deaths. Although currently mainly concentrated in China, the virus is spreading worldwide and has been found in 46 countries. Millions of people are being quarantined, and the epidemic has impacted the world economy. There is no existing drug for this disease, but various treatment options, such as utilizing medicines effective in other viral ailments, are being attempted.

Godzik, Satchell, and Joachimiak, along with the entire center team, will map the structure of some of the 28 proteins in the virus in order to see where drugs can throw a chemical monkey wrench into its machinery. The proteins are folded globular structures with precisely defined functions and their “active sites” can be targeted with chemical compounds.

The first step is to clone and express the genes of the virus proteins and grow them as protein crystals in miniature ice cube-like trays. The consortium includes nine labs across eight institutions that will participate in this effort.

Above is a modified version of the Northwestern University news release written by Marla Paul.


Diversity and Health Disparity Issues Need to be Addressed for GWAS and Precision Medicine Studies

Curator: Stephen J. Williams, PhD

 

 

From the POLICY FORUM ETHICS AND DIVERSITY Section of Science

Ethics of inclusion: Cultivate trust in precision medicine

Science  07 Jun 2019:
Vol. 364, Issue 6444, pp. 941-942
DOI: 10.1126/science.aaw8299

Precision medicine is at a crossroads. Progress toward its central goal, to address persistent health inequities, will depend on enrolling populations in research that have been historically underrepresented, thus eliminating longstanding exclusions from such research (1). Yet the history of ethical violations related to protocols for inclusion in biomedical research, as well as the continued misuse of research results (such as white nationalists looking to genetic ancestry to support claims of racial superiority), continue to engender mistrust among these populations (2). For precision medicine research (PMR) to achieve its goal, all people must believe that there is value in providing information about themselves and their families, and that their participation will translate into equitable distribution of benefits. This requires an ethics of inclusion that considers what constitutes inclusive practices in PMR, what goals and values are being furthered through efforts to enhance diversity, and who participates in adjudicating these questions. The early stages of PMR offer a critical window in which to intervene before research practices and their consequences become locked in (3).

Initiatives such as the All of Us program have set out to collect and analyze health information and biological samples from millions of people (1). At the same time, questions of trust in biomedical research persist. For example, although the recent assertions of white nationalists were eventually denounced by the American Society of Human Genetics (4), the misuse of ancestry testing may have already undermined public trust in genetic research.

There are also infamous failures in research that included historically underrepresented groups, including practices of deceit, as in the Tuskegee Syphilis Study, or the misuse of samples, as with the Havasupai tribe (5). Many people who are being asked to give their data and samples for PMR must not only reconcile such past research abuses, but also weigh future risks of potential misuse of their data.

To help assuage these concerns, ongoing PMR studies should open themselves up to research, conducted by social scientists and ethicists, that examines how their approaches enhance diversity and inclusion. Empirical studies are needed to account for how diversity is conceptualized and how goals of inclusion are operationalized throughout the life course of PMR studies. This is not limited to selection and recruitment of populations but extends to efforts to engage participants and communities, through data collection and measurement, and interpretations and applications of study findings. A commitment to transparency is an important step toward cultivating public trust in PMR’s mission and practices.

From Inclusion to Inclusive

The lack of diverse representation in precision medicine and other biomedical research is a well-known problem. For example, rare genetic variants may be overlooked—or their association with common, complex diseases can be misinterpreted—as a result of sampling bias in genetics research (6). Concentrating research efforts on samples with largely European ancestry has limited the ability of scientists to make generalizable inferences about the relationships among genes, lifestyle, environmental exposures, and disease risks, and thereby threatens the equitable translation of PMR for broad public health benefit (7).

However, recruiting for diverse research participation alone is not enough. As with any push for “diversity,” related questions arise about how to describe, define, measure, compare, and explain inferred similarities and differences among individuals and groups (8). In the face of ambivalence about how to represent population variation, there is ample evidence that researchers resort to using definitions of diversity that are heterogeneous, inconsistent, and sometimes competing (9). Varying approaches are not inherently problematic; depending on the scientific question, some measures may be more theoretically justified than others and, in many cases, a combination of measures can be leveraged to offer greater insight (10). For example, studies have shown that American adults who do not self-identify as white report better mental and physical health if they think others perceive them as white (11, 12).

The benefit of using multiple measures of race and ancestry also extends to genetic studies. In a study of hypertension in Puerto Rico, not only did classifications based on skin color and socioeconomic status better predict blood pressure than genetic ancestry, the inclusion of these sociocultural measures also revealed an association between a genetic polymorphism and hypertension that was otherwise hidden (13). Thus, practices that allow for a diversity of measurement approaches, when accompanied by a commitment to transparency about the rationales for chosen approaches, are likely to benefit PMR research more than striving for a single gold standard that would apply across all studies. These definitional and measurement issues are not merely semantic. They also are socially consequential to broader perceptions of PMR research and the potential to achieve its goals of inclusion.

Study Practices, Improve Outcomes

Given the uncertainty and complexities of the current, early phase of PMR, the time is ripe for empirical studies that enable assessment and modulation of research practices and scientific priorities in light of their social and ethical implications. Studying ongoing scientific practices in real time can help to anticipate unintended consequences that would limit researchers’ ability to meet diversity recruitment goals, address both social and biological causes of health disparities, and distribute the benefits of PMR equitably. We suggest at least two areas for empirical attention and potential intervention.

First, we need to understand how “upstream” decisions about how to characterize study populations and exposures influence “downstream” research findings of what are deemed causal factors. For example, when precision medicine researchers rely on self-identification with U.S. Census categories to characterize race and ethnicity, this tends to circumscribe their investigation of potential gene-environment interactions that may affect health. The convenience and routine nature of Census categories seemed to lead scientists to infer that the reasons for differences among groups were self-evident and required no additional exploration (9). The ripple effects of initial study design decisions go beyond issues of recruitment to shape other facets of research across the life course of a project, from community engagement and the return of results to the interpretation of study findings for human health.

Second, PMR studies are situated within an ecosystem of funding agencies, regulatory bodies, disciplines, and other scholars. This partly explains the use of varied terminology, different conceptual understandings and interpretations of research questions, and heterogeneous goals for inclusion. It also makes it important to explore how expectations related to funding and regulation influence research definitions of diversity and benchmarks for inclusion.

For example, who defines a diverse study population, and how might those definitions vary across different institutional actors? Who determines the metrics that constitute successful inclusion, and why? Within a research consortium, how are expectations for data sharing and harmonization reconciled with individual studies’ goals for recruitment and analysis? In complex research fields that include multiple investigators, organizations, and agendas, how are heterogeneous, perhaps even competing, priorities negotiated? To date, no studies have addressed these questions or investigated how decisions facilitate, or compromise, goals of diversity and inclusion.

The life course of individual studies and the ecosystems in which they reside cannot be easily separated and therefore must be studied in parallel to understand how meanings of diversity are shaped and how goals of inclusion are pursued. Empirically “studying the studies” will also be instrumental in creating mechanisms for transparency about how PMR is conducted and how trade-offs among competing goals are resolved. Establishing open lines of inquiry that study upstream practices may allow researchers to anticipate and address downstream decisions about how results can be interpreted and should be communicated, with a particular eye toward the consequences for communities recruited to augment diversity. Understanding how scientists negotiate the challenges and barriers to achieving diversity that go beyond fulfilling recruitment numbers is a critical step toward promoting meaningful inclusion in PMR.

Transparent Reflection, Cultivation of Trust

Emerging research on public perceptions of PMR suggests that although there is general support, questions of trust loom large. What we learn from studies that examine on-the-ground approaches aimed at enhancing diversity and inclusion, and how the research community reflects and responds with improvements in practices as needed, will play a key role in building a culture of openness that is critical for cultivating public trust.

Cultivating long-term, trusting relationships with participants underrepresented in biomedical research has been linked to a broad range of research practices. Some of these include the willingness of researchers to (i) address the effect of history and experience on marginalized groups’ trust in researchers and clinicians; (ii) engage concerns about potential group harms and risks of stigmatization and discrimination; (iii) develop relationships with participants and communities that are characterized by transparency, clear communication, and mutual commitment; and (iv) integrate participants’ values and expectations of responsible oversight beyond initial informed consent (14). These findings underscore the importance of multidisciplinary teams that include social scientists, ethicists, and policy-makers, who can identify and help to implement practices that respect the histories and concerns of diverse publics.

A commitment to an ethics of inclusion begins with a recognition that risks from the misuse of genetic and biomedical research are unevenly distributed. History makes plain that a multitude of research practices ranging from unnecessarily limited study populations and taken-for-granted data collection procedures to analytic and interpretive missteps can unintentionally bolster claims of racial superiority or inferiority and provoke group harm (15). Sustained commitment to transparency about the goals, limits, and potential uses of research is key to further cultivating trust and building long-term research relationships with populations underrepresented in biomedical studies.

As calls for increasing diversity and inclusion in PMR grow, funding and organizational pathways must be developed that integrate empirical studies of scientific practices and their rationales to determine how goals of inclusion and equity are being addressed and to identify where reform is required. In-depth, multidisciplinary empirical investigations of how diversity is defined, operationalized, and implemented can provide important insights and lessons learned for guiding emerging science, and in so doing, meet our ethical obligations to ensure transparency and meaningful inclusion.

References and Notes

  1. C. P. Jones et al., Ethn. Dis. 18, 496 (2008).
  2. C. C. Gravlee, A. L. Non, C. J. Mulligan
  3. S. A. Kraft et al., Am. J. Bioeth. 18, 3 (2018).
  4. A. E. Shields et al., Am. Psychol. 60, 77 (2005).



Medicine in 2045 – Perspectives by World Thought Leaders in the Life Sciences & Medicine

Reporter: Aviva Lev-Ari, PhD, RN

 

This report is based on an article in Nature Medicine | VOL 25 | December 2019 | 1800–1809 | http://www.nature.com/naturemedicine

Looking forward 25 years: the future of medicine.

Nat Med 25, 1804–1807 (2019) doi:10.1038/s41591-019-0693-y

 

Aviv Regev, PhD

Core member and chair of the faculty, Broad Institute of MIT and Harvard; director, Klarman Cell Observatory, Broad Institute of MIT and Harvard; professor of biology, MIT; investigator, Howard Hughes Medical Institute; founding co-chair, Human Cell Atlas.

  • millions of genome variants, tens of thousands of disease-associated genes, thousands of cell types and an almost unimaginable number of ways they can combine, we had to approximate a best starting point—choose one target, guess the cell, simplify the experiment.
  • In 2020, advances in polygenic risk scores, in understanding the cell and modules of action of genes through genome-wide association studies (GWAS), and in predicting the impact of combinations of interventions.
  • we need algorithms to make better computational predictions of experiments we have never performed in the lab or in clinical trials.
  • Human Cell Atlas and the International Common Disease Alliance—and in new experimental platforms: data platforms and algorithms. But we also need a broader ecosystem of partnerships in medicine that engages interaction between clinical experts and mathematicians, computer scientists and engineers

Feng Zhang, PhD

investigator, Howard Hughes Medical Institute; core member, Broad Institute of MIT and Harvard; James and Patricia Poitras Professor of Neuroscience, McGovern Institute for Brain Research, MIT.

  • fundamental shift in medicine away from treating symptoms of disease and toward treating disease at its genetic roots.
  • Gene therapy has gained clinical feasibility through improved delivery methods and the development of robust molecular technologies for gene editing in human cells, and affordable genome sequencing has accelerated our ability to identify the genetic causes of disease.
  • 1,000 clinical trials testing gene therapies are ongoing, and the pace of clinical development is likely to accelerate.
  • refine molecular technologies for gene editing, to push our understanding of gene function in health and disease forward, and to engage with all members of society

Elizabeth Jaffee, PhD

Dana and Albert “Cubby” Broccoli Professor of Oncology, Johns Hopkins School of Medicine; deputy director, Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins.

  • a single blood test could inform individuals of the diseases they are at risk of (diabetes, cancer, heart disease, etc.) and that safe interventions will be available.
  • developing cancer vaccines. Vaccines targeting the causative agents of cervical and hepatocellular cancers have already proven to be effective. With these technologies and the wealth of data that will become available as precision medicine becomes more routine, new discoveries identifying the earliest genetic and inflammatory changes occurring within a cell as it transitions into a pre-cancer can be expected. With these discoveries, the opportunities to develop vaccine approaches that prevent cancer development will grow.

Jeremy Farrar, OBE FRCP FRS FMedSci

Director, Wellcome Trust.

  • shape how the culture of research will develop over the next 25 years, a culture that cares more about what is achieved than how it is achieved.
  • building a creative, inclusive and open research culture will unleash greater discoveries with greater impact.

John Nkengasong, PhD

Director, Africa Centres for Disease Control and Prevention.

  • To meet its health challenges by 2050, the continent will have to be innovative in order to leapfrog toward solutions in public health.
  • Precision medicine will need to take center stage in a new public health order, whereby a more precise and targeted approach to screening, diagnosis, treatment and, potentially, cure is based on each patient’s unique genetic and biologic make-up.

Eric Topol, MD

Executive vice-president, Scripps Research Institute; founder and director, Scripps Research Translational Institute.

  • In 2045, a planetary health infrastructure based on deep, longitudinal, multimodal human data, ideally collected from and accessible to as many as possible of the 9+ billion people projected to then inhabit the Earth.
  • enhanced capabilities to perform functions that are not feasible now.
  • AI machines’ ability to ingest and process biomedical text at scale—such as the corpus of the up-to-date medical literature—will be used routinely by physicians and patients.
  • the concept of a learning health system will be redefined by AI.

Linda Partridge, PhD

Professor, Max Planck Institute for Biology of Ageing.

  • Geroprotective drugs, which target the underlying molecular mechanisms of ageing, are coming over the scientific and clinical horizons, and may help to prevent the most intractable age-related disease, dementia.

Trevor Mundel, MD

President of Global Health, Bill & Melinda Gates Foundation.

  • finding new ways to share clinical data that are as open as possible and as closed as necessary.
  • moving beyond drug donations toward a new era of corporate social responsibility that encourages biotechnology and pharmaceutical companies to offer their best minds and their most promising platforms.
  • working with governments and multilateral organizations much earlier in the product life cycle to finance the introduction of new interventions and to ensure the sustainable development of the health systems that will deliver them.
  • deliver on the promise of global health equity.

Josep Tabernero, MD, PhD

Vall d’Hebron Institute of Oncology (VHIO); president, European Society for Medical Oncology (2018–2019).

  • genomic-driven analysis will continue to broaden the impact of personalized medicine in healthcare globally.
  • Precision medicine will continue to deliver its new paradigm in cancer care and reach more patients.
  • Immunotherapy will deliver on its promise to dismantle cancer’s armory across tumor types.
  • AI will help guide the development of individually matched
  • genetic patient screenings
  • the promise of liquid biopsy policing of disease?

Pardis Sabeti, PhD

Professor, Harvard University & Harvard T.H. Chan School of Public Health and Broad Institute of MIT and Harvard; investigator, Howard Hughes Medical Institute.

  • the development and integration of tools into an early-warning system embedded into healthcare systems around the world could revolutionize infectious disease detection and response.
  • But this will only happen with a commitment from the global community.

Els Torreele, PhD

Executive director, Médecins Sans Frontières Access Campaign

  • we need a paradigm shift such that medicines are no longer lucrative market commodities but are global public health goods—available to all those who need them.
  • This will require members of the scientific community to go beyond their role as researchers and actively engage in R&D policy reform mandating health research in the public interest and ensuring that the results of their work benefit many more people.
  • The global research community can lead the way toward public-interest driven health innovation, by undertaking collaborative open science and piloting not-for-profit R&D strategies that positively impact people’s lives globally.



Evolution of the Human Cell Genome Biology Field of Gene Expression, Gene Regulation, Gene Regulatory Networks and Application of Machine Learning Algorithms in Large-Scale Biological Data Analysis

Curator & Reporter: Aviva Lev-Ari, PhD, RN

 

Subjects:

The Scientific Frontier is presented in Deciphering eukaryotic gene-regulatory logic with 100 million random promoters

Boer, C.G., Vaishnav, E.D., Sadeh, R. et al. Deciphering eukaryotic gene-regulatory logic with 100 million random promoters. Nat Biotechnol (2019) doi:10.1038/s41587-019-0315-8

Abstract

How transcription factors (TFs) interpret cis-regulatory DNA sequence to control gene expression remains unclear, largely because past studies using native and engineered sequences had insufficient scale. Here, we measure the expression output of >100 million synthetic yeast promoter sequences that are fully random. These sequences yield diverse, reproducible expression levels that can be explained by their chance inclusion of functional TF binding sites. We use machine learning to build interpretable models of transcriptional regulation that predict ~94% of the expression driven from independent test promoters and ~89% of the expression driven from native yeast promoter fragments. These models allow us to characterize each TF’s specificity, activity and interactions with chromatin. TF activity depends on binding-site strand, position, DNA helical face and chromatin context. Notably, expression level is influenced by weak regulatory interactions, which confound designed-sequence studies. Our analyses show that massive-throughput assays of fully random DNA can provide the big data necessary to develop complex, predictive models of gene regulation.
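The abstract's point that fully random sequences carry functional binding sites purely by chance can be illustrated with a small simulation. This is a minimal sketch, not the authors' method: the 80-bp sequence length, the 6-mer motif `TGACTC`, exact string matching on one strand, and all function names are assumptions chosen only for illustration.

```python
import random


def random_promoter(length, rng):
    """Return a uniformly random DNA sequence of the given length."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def count_motif(seq, motif):
    """Count exact, possibly overlapping, occurrences of motif in seq."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def expected_hits(length, motif_len):
    """Expected exact matches per sequence on one strand, assuming
    independent, equally likely bases: (L - k + 1) / 4**k."""
    return (length - motif_len + 1) / 4 ** motif_len


def fraction_with_motif(n_seqs, length, motif, seed=0):
    """Simulate n_seqs random sequences and return the fraction that
    contain the motif at least once."""
    rng = random.Random(seed)
    with_hit = sum(
        count_motif(random_promoter(length, rng), motif) > 0
        for _ in range(n_seqs)
    )
    return with_hit / n_seqs


if __name__ == "__main__":
    # Hypothetical parameters: 80-bp random region, illustrative 6-mer.
    print("expected hits per sequence:", expected_hits(80, 6))
    print("simulated fraction with a hit:", fraction_with_motif(5000, 80, "TGACTC"))
```

Under these assumptions any single 6-mer is rare in a given sequence, but across a library of many millions of sequences each motif still appears a very large number of times and in many different contexts. That is the intuition behind using random DNA at massive scale.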

The Evolution of the Human Cell Genome Biology Field of Gene Expression, Gene Regulation, Gene Regulatory Networks and Application of Machine Learning Algorithms in Large-Scale Biological Data Analysis is presented in the following Table

 

50 Liu, X., Li, Y. I. & Pritchard, J. K. Trans effects on gene expression can drive omnigenic inheritance. Cell 177, 1022–1034 e1026 (2019).
5 Muerdter, F. et al. Resolving systematic errors in widely used enhancer activity assays in human cells. Nat. Methods 15, 141–149 (2018).
6 Wang, X. et al. High-resolution genome-wide functional dissection of transcriptional regulatory regions and nucleotides in human. Nat. Commun. 9, 5380 (2018).
15 Yona, A. H., Alm, E. J. & Gore, J. Random sequences rapidly evolve into de novo promoters. Nat. Commun. 9, 1530 (2018).
4 van Arensbergen, J. et al. Genome-wide mapping of autonomous promoter activity in human cells. Nat. Biotechnol. 35, 145–153 (2017).
14 Cuperus, J. T. et al. Deep learning of the regulatory grammar of yeast 5’ untranslated regions from 500,000 random sequences. Genome Res. 27, 2015–2024 (2017).
31 Levo, M. et al. Systematic investigation of transcription factor activity in the context of chromatin using massively parallel binding and expression assays. Mol. Cell 65, 604–617 e606 (2017).
49 Boyle, E. A., Li, Y. I. & Pritchard, J. K. An expanded view of complex traits: from polygenic to omnigenic. Cell 169, 1177–1186 (2017).
54 de Boer, C. High-efficiency S. cerevisiae lithium acetate transformation. protocols.io https://doi.org/10.17504/protocols.io.j4tcqwn (2017).
59 Abadi, M. et al. TensorFlow: large-scale machine learning on heterogeneous systems. arXiv 1603.04467 (2016).
20 Shalem, O. et al. Systematic dissection of the sequence determinants of gene 3’ end mediated expression control. PLoS Genet. 11, e1005147 (2015).
55 Deng, C., Daley, T. & Smith, A. D. Applications of species accumulation curves in large-scale biological data analysis. Quant. Biol. 3, 135–144 (2015).
9 Hughes, T. R. & de Boer, C. G. Mapping yeast transcriptional networks. Genetics 195, 9–36 (2013).
10 Jolma, A. et al. DNA-binding specificities of human transcription factors. Cell 152, 327–339 (2013).
19 Kosuri, S. et al. Composability of regulatory sequences controlling transcription and translation in Escherichia coli. Proc. Natl Acad. Sci. USA 110, 14024–14029 (2013).
7 Sharon, E. et al. Inferring gene regulatory logic from high-throughput measurements of thousands of systematically designed promoters. Nat. Biotechnol. 30, 521–530 (2012).
18 de Boer, C. G. & Hughes, T. R. YeTFaSCo: a database of evaluated yeast transcription factor sequence specificities. Nucleic Acids Res. 40, D169–D179 (2012).
56 Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
61 Cherry, J. M. et al. Saccharomyces Genome Database: the genomics resource of budding yeast. Nucleic Acids Res. 40, D700–D705 (2012).
11 Nutiu, R. et al. Direct measurement of DNA affinity landscapes on a high-throughput sequencing instrument. Nat. Biotechnol. 29, 659–664 (2011).
26 Zhang, Z. et al. A packing mechanism for nucleosome organization reconstituted across a eukaryotic genome. Science 332, 977–980 (2011).
30 Ganapathi, M. et al. Extensive role of the general regulatory factors, Abf1 and Rap1, in determining genome-wide chromatin structure in budding yeast. Nucleic Acids Res. 39, 2032–2044 (2011).
52 Erb, I. & van Nimwegen, E. Transcription factor binding site positioning in yeast: proximal promoter motifs characterize TATA-less promoters. PLoS One 6, e24279 (2011).
3 Kinney, J. B., Murugan, A., Callan, C. G. Jr. & Cox, E. C. Using deep sequencing to characterize the biophysical mechanism of a transcriptional regulatory sequence. Proc. Natl Acad. Sci. USA 107, 9158–9163 (2010).
8 Gertz, J., Siggia, E. D. & Cohen, B. A. Analysis of combinatorial cis-regulation in synthetic and genomic promoters. Nature 457, 215–218 (2009).
16 Wunderlich, Z. & Mirny, L. A. Different gene regulation strategies revealed by analysis of binding motifs. Trends Genet. 25, 434–440 (2009).
27 Hesselberth, J. R. et al. Global mapping of protein–DNA interactions in vivo by digital genomic footprinting. Nat. Methods 6, 283–289 (2009).
29 Hartley, P. D. & Madhani, H. D. Mechanisms that specify promoter nucleosome location and identity. Cell 137, 445–458 (2009).
51 Gibson, D. G. et al. Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat. Methods 6, 343–345 (2009).
58 Segal, E. & Widom, J. From DNA sequence to transcriptional behaviour: a quantitative approach. Nat. Rev. Genet. 10, 443–456 (2009).
2 Yuan, Y., Guo, L., Shen, L. & Liu, J. S. Predicting gene expression from sequence: a reexamination. PLoS Comput. Biol. 3, e243 (2007).
46 Hibbs, M. A. et al. Exploring the functional landscape of gene expression: directed search of large microarray compendia. Bioinformatics 23, 2692–2699 (2007).
25 Liu, X., Lee, C. K., Granek, J. A., Clarke, N. D. & Lieb, J. D. Whole-genome comparison of Leu3 binding in vitro and in vivo reveals the importance of nucleosome occupancy in target site selection. Genome Res. 16, 1517–1528 (2006).
34 Roberts, G. G. & Hudson, A. P. Transcriptome profiling of Saccharomyces cerevisiae during a transition from fermentative to glycerol-based respiratory growth reveals extensive metabolic and structural remodeling. Mol. Genet. Genomics 276, 170–186 (2006).
48 Tanay, A. Extensive low-affinity transcriptional interactions in the yeast genome. Genome Res. 16, 962–972 (2006).
53 Tong, A. H. & Boone, C. Synthetic genetic array analysis in Saccharomyces cerevisiae. Methods Mol. Biol. 313, 171–192 (2006).
57 Li, W. & Godzik, A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22, 1658–1659 (2006).
62 Chua, G. et al. Identifying transcription factor functions and targets by phenotypic activation. Proc. Natl Acad. Sci. USA 103, 12045–12050 (2006).
17 Arnosti, D. N. & Kulkarni, M. M. Transcriptional enhancers: intelligent enhanceosomes or flexible billboards? J. Cell. Biochem. 94, 890–898 (2005).
21 Granek, J. A. & Clarke, N. D. Explicit equilibrium modeling of transcription-factor binding and gene regulation. Genome Biol. 6, R87 (2005).
1 Beer, M. A. & Tavazoie, S. Predicting gene expression from sequence. Cell 117, 185–198 (2004).
28 Bernstein, B. E., Liu, C. L., Humphrey, E. L., Perlstein, E. O. & Schreiber, S. L. Global nucleosome occupancy in yeast. Genome Biol. 5, R62 (2004).
44 Kim, T. S., Kim, H. Y., Yoon, J. H. & Kang, H. S. Recruitment of the Swi/Snf complex by Ste12-Tec1 promotes Flo8-Mss11-mediated activation of STA1 expression. Mol. Cell. Biol. 24, 9542–9556 (2004).
45 Harbison, C. T. et al. Transcriptional regulatory code of a eukaryotic genome. Nature 431, 99–104 (2004).
60 Kent, N. A., Eibert, S. M. & Mellor, J. Cbf1p is required for chromatin remodeling at promoter-proximal CACGTG motifs in yeast. J. Biol. Chem. 279, 27116–27123 (2004).
22 Kulkarni, M. M. & Arnosti, D. N. Information display by transcriptional enhancers. Development 130, 6569–6575 (2003).
24 Conlon, E. M., Liu, X. S., Lieb, J. D. & Liu, J. S. Integrating regulatory motif discovery and genome-wide expression analysis. Proc. Natl Acad. Sci. USA 100, 3339–3344 (2003).
43 Neely, K. E., Hassan, A. H., Brown, C. E., Howe, L. & Workman, J. L. Transcription activator interactions with multiple SWI/SNF subunits. Mol. Cell. Biol. 22, 1615–1625 (2002).
23 Bussemaker, H. J., Li, H. & Siggia, E. D. Regulatory element detection using correlation with expression. Nat. Genet. 27, 167–171 (2001).
37 Haurie, V. et al. The transcriptional activator Cat8p provides a major contribution to the reprogramming of carbon metabolism during the diauxic shift in Saccharomyces cerevisiae. J. Biol. Chem. 276, 76–85 (2001).
39 Grauslund, M. & Ronnow, B. Carbon source-dependent transcriptional regulation of the mitochondrial glycerol-3-phosphate dehydrogenase gene, GUT2, from Saccharomyces cerevisiae. Can. J. Microbiol. 46, 1096–1100 (2000).
42 Cullen, P. J. & Sprague, G. F. Jr. Glucose depletion causes haploid invasive growth in yeast. Proc. Natl Acad. Sci. USA 97, 13619–13624 (2000).
38 Sato, T. et al. The E-box DNA binding protein Sgc1p suppresses the gcr2 mutation, which is involved in transcriptional activation of glycolytic genes in Saccharomyces cerevisiae. FEBS Lett. 463, 307–311 (1999).
40 Madhani, H. D. & Fink, G. R. Combinatorial control required for the specificity of yeast MAPK signaling. Science 275, 1314–1317 (1997).
41 Gavrias, V., Andrianopoulos, A., Gimeno, C. J. & Timberlake, W. E. Saccharomyces cerevisiae TEC1 is required for pseudohyphal growth. Mol. Microbiol. 19, 1255–1263 (1996).
36 Hedges, D., Proft, M. & Entian, K. D. CAT8, a new zinc cluster-encoding gene necessary for derepression of gluconeogenic enzymes in the yeast Saccharomyces cerevisiae. Mol. Cell. Biol. 15, 1915–1922 (1995).
47 Bednar, J. et al. Determination of DNA persistence length by cryo-electron microscopy. Separation of the static and dynamic contributions to the apparent persistence length of DNA. J. Mol. Biol. 254, 579–594 (1995).
32 Axelrod, J. D., Reagan, M. S. & Majors, J. GAL4 disrupts a repressing nucleosome during activation of GAL1 transcription in vivo. Genes Dev. 7, 857–869 (1993).
33 Morse, R. H. Nucleosome disruption by transcription factor binding in yeast. Science 262, 1563–1566 (1993).
12 Oliphant, A. R., Brandl, C. J. & Struhl, K. Defining the sequence specificity of DNA-binding proteins by selecting binding sites from random-sequence oligonucleotides: analysis of yeast GCN4 protein. Mol. Cell. Biol. 9, 2944–2949 (1989).
35 Forsburg, S. L. & Guarente, L. Identification and characterization of HAP4: a third component of the CCAAT-bound HAP2/HAP3 heteromer. Genes Dev. 3, 1166–1178 (1989).
13 Horwitz, M. S. & Loeb, L. A. Promoters selected from random DNA sequences. Proc. Natl Acad. Sci. USA 83, 7405–7409 (1986).

 

To access each reference as a live link, find its number in the first column of the Table and look it up in the List of References at the link below.

https://www.nature.com/articles/s41587-019-0315-8

Author information

C.G.D. and A.R. drafted the manuscript, with all authors contributing. C.G.D. analyzed the data. C.G.D., E.D.V., E.L.A. and R.S. performed the experiments. A.R. and N.F. supervised the research.

Correspondence to Carl G. de Boer or Aviv Regev.

Ethics declarations

Competing interests

A.R. is an SAB member of Thermo Fisher Scientific, Neogene Therapeutics, Asimov, and Syros Pharmaceuticals, an equity holder of Immunitas, and a founder of and equity holder in Celsius Therapeutics. All other authors declare no competing interests.




Data Science & Analytics: What do Data Scientists Do in 2020 and a Pioneer Practitioner’s Portfolio of Algorithm-based Decision Support Systems for Operations Management in Several Industrial Verticals

Curator: Aviva Lev-Ari, PhD, RN

Building on Jesse Anderson’s work on data teams, Kathleen Walch, in “Why Data Scientists Aren’t Data Engineers,” draws several keen distinctions between the two skill sets.

I can attest that she is absolutely correct. See below a Pioneer Practitioner’s Portfolio of Algorithm-based Decision Support Systems for Operations Management in Several Industrial Verticals.

 

These key distinctions are:

Data Scientists vs Data Engineers

In the mid-2000s, we saw the emergence of the Data Scientist position. As cited in the O’Reilly article: “This increase in the demand for data scientists has been driven by the success of the major Internet companies. Google, Facebook, LinkedIn, and Amazon have all made their marks by using data creatively: not just warehousing data, but turning it into something of value.” Not surprisingly, any organization that has data of value is looking at data science and data scientists to increasingly extract more value from that information.

Originating from roots in statistical modeling and data analysis, data scientists have backgrounds in advanced math and statistics, advanced analytics, and increasingly machine learning / AI. The focus of data scientists is, unsurprisingly, data science: how to extract useful information from a sea of data, and how to translate business and scientific informational needs into the language of information and math. Data scientists need to be masters of statistics, probability, mathematics, and algorithms that help to glean useful insights from huge piles of information. They have usually learned programming out of necessity, in order to run programs and advanced analyses on data. As a result, the code that data scientists are usually tasked to write is minimal, only what is necessary to accomplish a data science task (R is a common language for them to use), and they work best when they are provided clean data to run advanced analytics on. A data scientist is a scientist who creates hypotheses, runs tests and analyses of the data, and then translates the results so that someone else in the organization can easily view and understand them.

Data scientists, however, can’t perform their jobs without access to large volumes of clean data. Extracting, cleaning, and moving data is not really the role of a data scientist, but rather that of a data engineer. Data engineers have programming and technology expertise, and have previously been involved with data integration, middleware, analytics, business data portals, and extract-transform-load (ETL) operations. The data engineer’s center of gravity is big data and distributed systems, along with experience in programming languages such as Java, Python, and Scala, and in scripting tools and techniques. Data engineers are challenged with taking data from a wide range of systems, in structured and unstructured formats; this data is usually not “clean”, with missing fields, mismatched data types, and other data-related issues. They need to use their programming, integration, architecture, and systems skills to clean all the data and put it into a format and system that data scientists can then use to analyze, build their data models, and provide value to the organization. In this way, a data engineer is an engineer who designs, builds and arranges data.
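To make the cleaning work described above concrete, here is a minimal sketch of one such step: dropping rows without a usable key and coercing mismatched types. The field names (`id`, `amount`, `date`), the two accepted date formats, and the helper names are hypothetical, chosen only for illustration; real pipelines would typically run logic like this inside an ETL framework.

```python
from datetime import datetime


def to_float(value):
    """Coerce numbers stored as text (e.g. ' 12.5 ') to float; None if unusable."""
    try:
        return float(str(value).strip())
    except (TypeError, ValueError):
        return None


def to_date(value):
    """Parse a date given in one of two assumed formats; None if neither fits."""
    text = str(value).strip()
    for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def clean_records(raw_records):
    """Return normalized records, skipping rows that lack a usable 'id'."""
    cleaned = []
    for raw in raw_records:
        raw_id = raw.get("id")
        record_id = "" if raw_id is None else str(raw_id).strip()
        if not record_id:
            continue  # a row without a key cannot be joined downstream
        cleaned.append({
            "id": record_id,
            "amount": to_float(raw.get("amount")),  # missing or bad -> None
            "date": to_date(raw.get("date")),       # missing or bad -> None
        })
    return cleaned


if __name__ == "__main__":
    rows = [
        {"id": " a1 ", "amount": "12.5", "date": "2019-12-01"},
        {"id": None, "amount": 3},
        {"id": 7, "amount": "n/a", "date": "12/01/2019"},
    ]
    for record in clean_records(rows):
        print(record)
```

Unusable values are kept as explicit `None` rather than guessed, so the analyst downstream can decide how to treat missing data.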

Can there be a combined Data Scientist-Engineer role?

While it might seem that the roles of a data scientist and data engineer are distinct, data scientists and data engineers share many traits and skill sets. These overlapping skills include the necessity to work with and manipulate big data sets, programming skills to apply operations to the data, data analytics skills, and general fluency with systems operations.

Rather than engineering and programming-centric tools, data scientists need data science-centric tools. Right now there’s a growing collection of these tools, often emerging from data or predictive analytics environments that suit the needs of data scientists. However, it’s possible that even more business-centric tools might be appropriate, especially as the data scientists become more embedded with the line of business. For example, decades ago if you wanted to operate on large volumes of data in a spreadsheet-like format, this involved programming, but tools like Excel introduced things like pivot tables and now business managers are able to perform all sorts of analyses. It’s only a matter of time before tools like Excel embed data science capabilities, or business-centric data mining and analysis tools into their products.

As the talent gap for data scientists continues to widen, there is no doubt that we will see new tools created out of necessity to allow non-technical (read: business) people to run, test, and analyze data. Strategic business managers will begin to learn data science, without needing or wanting programming or data integration experience. Traditional data scientists will still be needed to run very complex analyses of data. For the most part, however, basic analysis will move to the business unit as tools become increasingly easy to use. This means we have yet to see which tool or technology will be the dominant one for ML and data science in the enterprise.

 

 

My SOURCES for the evolution of the field of Data Science are the following:

 Jesse Anderson’s work on data teams

Learn How to Create and Manage Big Data Teams

This Free, 73 Page E-Book is the Complete Guide to Successful Big Data projects

I’m really tired of seeing Big Data projects fail. They fail for both technical and managerial reasons. They all fail for similar reasons and that’s just sad because we can fix or prevent them. Gartner’s research shows that 85% of Big Data projects don’t even make it into production.

“Only 15 percent of businesses reported deploying their big data project to production, effectively unchanged from last year (14 percent).”

October 4, 2016 Gartner Press Release

https://www.bigdatainstitute.io/data-engineering-teams-book/

 

December 1, 2019, 9:48 am

Why Data Scientists Aren’t Data Engineers

Kathleen Walch

Managing Partner & Principal Analyst at AI Focused Research and Advisory firm Cognilytica

https://www.forbes.com/sites/cognitiveworld/2019/12/01/why-data-scientists-arent-data-engineers/amp/?__twitter_impression=true

 

Translating Between Computer Science and Statistics

Posted on December 1, 2019

Gil Press

https://whatsthebigdata.com/2019/12/01/translating-between-computer-science-and-statistics/

 

Jan 8, 2019, 06:18am

The AI Chronicles: Combining Statistical Analysis And Computing From Hollerith To Zuckerberg

Gil Press Contributor

Enterprise & Cloud

https://www.forbes.com/sites/gilpress/2019/01/08/the-ai-chronicles-combining-statistical-analysis-and-computing-from-hollerith-to-zuckerberg/#23cf507c73b3

 

Jan 2, 2015, 10:48am

A Very Short History Of The Internet And The Web

Gil Press Contributor

Enterprise & Cloud

https://www.forbes.com/sites/gilpress/2015/01/02/a-very-short-history-of-the-internet-and-the-web-2/#a45c9307a4e2

 

May 28, 2013, 09:09am

A Very Short History Of Data Science

Gil Press Contributor

Enterprise & Cloud

https://www.forbes.com/sites/gilpress/2013/05/28/a-very-short-history-of-data-science/#1e7db3e155cf

 

May 9, 2013, 09:45am

A Very Short History Of Big Data

Gil Press Contributor

Enterprise & Cloud

https://www.forbes.com/sites/gilpress/2013/05/09/a-very-short-history-of-big-data/#16c2043b65a1

 

Apr 8, 2013, 09:16am

A Very Short History of Information Technology (IT)

Gil Press Contributor

Enterprise & Cloud

https://www.forbes.com/sites/gilpress/2013/04/08/a-very-short-history-of-information-technology-it/#3f5491022440

 

A Pioneer Practitioner’s Portfolio of Algorithm-based Decision Support Systems for Operations Management in Several Industrial Verticals: Analytics Designer, Aviva Lev-Ari, PhD, RN

Against this landscape of IT, the Internet, Analytics, Statistics, Big Data, Data Science and Artificial Intelligence, I tell stories of my own pioneering work in data science and algorithm-based decision support systems design for different organizations in several sectors of the US economy:

  • Startups:
  1. TimeØ Group
  2. Concept Five Technologies, Inc.
  3. MDSS, Inc.
  4. LPBI Group
  • Top Tier Management Consulting: SRI International, Monitor Group;
  • OEM: Amdahl Corporation;
  • Top 6th System Integrator: Perot Systems Corporation;
  • FFRDC: MITRE Corporation.
  • Publishing industry: Director of Research at McGraw-Hill/CTB;
  • Northeastern University: Researcher on Cardiovascular Pharmaco-therapy at Bouvé College of Health Sciences (independent research guided by a Professor of Pharmacology).

Types of institutions:

  • For-Profit corporations: Amdahl Corp, PSC, McGraw-Hill
  • For-Profit Top Tier Consulting: Monitor Company, now Deloitte
  • Not-for-Profit Top Tier Consulting: SRI International
  • FFRDC: MITRE
  • eScientific Publishing: LPBI Group: Developers of Curation methodology for e-Articles [N = 3,700], electronic Table of Contents for e-Books in Medicine [N = 16, https://lnkd.in/ekWGNqA] and e-Proceedings of Biotech Conferences [N = 70].

 

Autobiographical Annotations: Tribute to My Professors

 

Pioneering implementations of analytics to business decision making: contributions to domain knowledge conceptualization, research design, methodology development, data modeling and statistical data analysis: Aviva Lev-Ari, UCB, PhD’83; HUJI MA’76

https://pharmaceuticalintelligence.com/2018/05/28/pioneering-implementations-of-analytics-to-business-decision-making-contributions-to-domain-knowledge-conceptualization-research-design-methodology-development-data-modeling-and-statistical-data-a/

Recollections of Years at UC, Berkeley, Part 1 and Part 2

  • Recollections: Part 1 – My days at Berkeley, 9/1978 – 12/1983 – About my doctoral advisor, Allan Pred, other professors and other peers

https://pharmaceuticalintelligence.com/2018/03/15/recollections-my-days-at-berkeley-9-1978-12-1983-about-my-doctoral-advisor-allan-pred-other-professors-and-other-peer/

  • Recollections: Part 2 – “While Rolling” is preceded by “While Enrolling” Autobiographical Alumna Recollections of Berkeley – Aviva Lev-Ari, PhD’83

https://pharmaceuticalintelligence.com/2018/05/24/recollections-part-2-while-rolling-is-preceded-by-while-enrolling-autobiographical-alumna-recollections-of-berkeley-aviva-lev-ari-phd83/

Accomplishments

The Digital Age Gave Rise to New Definitions – New Benchmarks were born on the World Wide Web for the Intangible Asset of Firm’s Reputation: Pay a Premium for buying e-Reputation

For @AVIVA1950, Founder, LPBI Group @pharma_BI: Twitter Analytics [Engagement Rate, Link Clicks, Retweets, Likes, Replies] & Tweet Highlights [Tweets, Impressions, Profile Visits, Mentions, New Followers] https://analytics.twitter.com/user/AVIVA1950/tweets

Thriving at the Survival Calls during Careers in the Digital Age – An AGE like no Other, also known as, DIGITAL

Reflections on a Four-phase Career: Aviva Lev-Ari, PhD, RN, March 2018

Prepared for publication in the American Friends of the Hebrew University (AFHU) May 2018 Newsletter, in the Hebrew University (HUJI) Alumni Spotlight section.

Aviva Lev-Ari’s profile was posted on 5/3/2018 on the AFHU website under the Alumni Spotlight at https://www.afhu.org/

On 5/11/2018, excerpts were published in AFHU e-news.

https://us10.campaign-archive.com/?u=5c25136c60d4dfc4d3bb36eee&id=757c5c3aae&e=d09d2b8d72

https://www.afhu.org/2018/05/03/aviva-lev-ari/

 
