
Posts Tagged ‘@science 2_0’


Live Conference Coverage @Medcitynews Converge 2018 @Philadelphia: Promising Drugs and Breaking Down Silos

Reporter: Stephen J. Williams, PhD

Promising Drugs, Pricing and Access

The drug pricing debate rages on. What are the solutions to continuing to foster research and innovation while ensuring access and affordability for patients? Can biosimilars and generics expand market access in the U.S.?

Moderator: Bunny Ellerin, Director, Healthcare and Pharmaceutical Management Program, Columbia Business School
Speakers:
Patrick Davish, AVP, Global & US Pricing/Market Access, Merck
Robert Dubois M.D., Chief Science Officer and Executive Vice President, National Pharmaceutical Council
Gary Kurzman, M.D., Senior Vice President and Managing Director, Healthcare, Safeguard Scientifics
Steven Lucio, Associate Vice President, Pharmacy Services, Vizient

What is working and what needs to change in pricing models?

Robert:  He sees so many players in the oncology space discovering new drugs, and other drugs are going generic (that is what is working).  However, are we spending too much on cancer care relative to other diseases? (See their initiative Going Beyond the Surface.)

Steven:  the advent of biosimilars is good for the industry

Patrick:  there is a large effort in oncology, maybe too much (750 trials on Keytruda), and he says pharma is spending on R&D (however, clinical trials take a large chunk of this money)

Robert: cancer has gotten a free ride, but cost per year relative to benefit looks different than in other diseases.  Are we overinvesting in cancer, or is that a societal decision?

Gary:  maybe as we become more specific with precision medicines high prices may be a result of our success in specifically targeting a mutation.  We need to understand the targeted drugs and outcomes.

Patrick: “Cancer is the last big frontier,” but he says prices will come down in most cases.  He gives the example of Hep C treatment: previously the only therapeutic option was a very toxic yearlong treatment, but the newer drugs may be more cost effective and safer.

Steven: Blockbuster drugs could spread the expense over a large number of patients, but with precision medicines we can no longer diffuse the expense across so many patients.

President’s Cancer Panel Recommendations

Six recommendations

  1. promote value-based pricing
  2. enable communication of costs
  3. address financial toxicity
  4. stimulate competition (e.g., biosimilars)
  5. promote value-based care
  6. invest in biomedical research

Patrick: the government pricing regime is hurting.  There are a lot of practical barriers, but Merck has over 200 studies on cost basis.

Robert:  much of the concern/impetus on pricing started in Europe, which uses a set-price model (the EU won’t pay more than x for a drug). The US is moving more toward outcomes-based pricing. For every health outcome study that showed a benefit, three studies did not.  With cancer it is tricky to establish specific health outcomes.  Also, Medicare gets best-price status, so there needs to be a safe harbor for payers, and the biggest constraint is regulatory issues.

Steven: They all want value-based pricing, but we don’t have it yet, and there is a challenge in understanding the nuances of new therapies.  It is hard to align all the stakeholders, so the reimbursement-clinic-patient-pharma obstacles will remain until legislation starts to change them.  Possibly the big data efforts discussed here may help align each stakeholder’s goals.

Gary: What data are necessary to understand what is happening to patients?  Until we have that information, it will remain complicated to determine where investors in health care stand in this discussion.

Robert (who sits on an ICER methods advisory board): 1) there is great concern over costs and how we determine the fair value of a drug; 2) ICER is the only game in town, as other organizations only give recommendations; 3) ICER evaluates long-term value (cost per quality-adjusted year of life) and budget impact (will people go bankrupt?); 4) ICER is getting traction in the public eye and among advocates; 5) the problem is that ICER may not be ready for prime time, as the evidence keeps changing, it is unclear whether they keep societal factors in mind, and they do not have total transparency in their methodology.
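The “cost per quality year of life” value that ICER evaluates is conventionally expressed as an incremental cost-effectiveness ratio (cost per quality-adjusted life year, or QALY, gained). A minimal sketch of that calculation is below; the therapy costs and QALY figures are hypothetical and are not from the panel discussion.

```python
# Minimal sketch of an incremental cost-effectiveness ratio (ICER) calculation,
# i.e., the "cost per quality-adjusted life year" figure referenced above.
# All numbers are hypothetical and for illustration only.

def icer(cost_new, qaly_new, cost_old, qaly_old):
    """Incremental cost per QALY gained: (change in cost) / (change in QALYs)."""
    return (cost_new - cost_old) / (qaly_new - qaly_old)

# Hypothetical: a new therapy costs $150,000 and yields 2.5 QALYs;
# the standard of care costs $60,000 and yields 1.8 QALYs.
ratio = icer(cost_new=150_000, qaly_new=2.5, cost_old=60_000, qaly_old=1.8)
print(f"ICER: ${ratio:,.0f} per QALY gained")  # roughly $128,571 per QALY
```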

Steven: We need more transparency into all the costs associated with the drug and therapy and value-based outcome.  Right now price is more of a black box.

Moderator: pointed to a recent study showing that outpatient costs are going down while hospital-based care costs are rising rapidly (the cost depends on the site of care), so we need to figure out how to get people into lower-cost settings.

Breaking Down Silos in Research

“Silo” is healthcare’s four-letter word. How are researchers, life science companies and others sharing information that can benefit patients more quickly? Hear from experts at institutions that are striving to tear down the walls that prevent data from flowing.

Moderator: Vini Jolly, Executive Director, Woodside Capital Partners
Speakers:
Ardy Arianpour, CEO & Co-Founder, Seqster @seqster
Lauren Becnel, Ph.D., Real World Data Lead for Oncology, Pfizer
Rakesh Mathew, Innovation, Research, & Development Lead, HealthShareExchange
David Nace M.D., Chief Medical Officer, Innovaccer

Seqster: Seqster is a secure platform that helps you and your family manage medical records, DNA, fitness, and nutrition data—all in one place. The founder has a genomic sequencing background but realized that sequence information needs to be linked with medical records.

HealthShareExchange.org:

HealthShare Exchange envisions a trusted community of healthcare stakeholders collaborating to deliver better care to consumers in the greater Philadelphia region. HealthShare Exchange will provide secure access to health information to enable preventive and cost-effective care; improve quality of patient care; and facilitate care transitions. They have partnered with multiple players in healthcare field and have data on over 7 million patients.

Innovaccer

Data can be overwhelming, but it doesn’t have to be this way. To drive healthcare efficiency, we designed a modular suite of products for a smooth transition into a data-driven world within 4 weeks. Why does it take so much money to move data around and so slowly?

What is interoperability?

Ardy: We knew in the genomics field how to build algorithms to analyze big data, but how do we expand this from a consumer standpoint so people can see and share their own data?

Lauren: how can we use the data between patients, doctors, and researchers?  On the research side genomics represents only 2% of the data.  Silos are one issue, but the standards for data (collection, curation, analysis) are not yet set. We still need to improve semantic interoperability. For example, Flatiron had good annotated data on male metastatic breast cancer.

David: There are three levels: technical interoperability (platform), semantic interoperability (meaning or word usage), and format (syntactic) interoperability (data structure).  There is technical interoperability between health systems, and some semantic interoperability, but formats are all different (pharmacies use different systems and write different prescriptions using different suppliers).  In any value-based contract this problem is a big issue now (if we are going to pay you based on the quality of your performance, then there is a big need to coordinate across platforms).  We can solve it by bringing the data together in real time in one place, using mapping to integrate the formats (with quality control), and then democratizing the data among the players.
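To make the idea of “mapping to integrate the format” concrete, here is a minimal, hypothetical sketch of normalizing prescription records from two differently structured source systems into one common schema. The systems, field names, and records are invented for illustration and do not describe Innovaccer’s actual platform.

```python
# Minimal sketch of syntactic-format mapping: records from two hypothetical
# source systems are normalized into one common schema so they can be
# compared and aggregated. Field names and records are invented.

from datetime import datetime

COMMON_FIELDS = ("patient_id", "drug", "prescribed_on")

def from_system_a(rec: dict) -> dict:
    # System A style: {"pid": "A-001", "rx": "Metformin", "date": "2018-05-01"}
    return {
        "patient_id": rec["pid"],
        "drug": rec["rx"].lower(),
        "prescribed_on": datetime.strptime(rec["date"], "%Y-%m-%d").date(),
    }

def from_system_b(rec: dict) -> dict:
    # System B style: {"patientIdentifier": "B/77", "medicationName": "Metformin", "writtenDate": "05/01/2018"}
    return {
        "patient_id": rec["patientIdentifier"],
        "drug": rec["medicationName"].lower(),
        "prescribed_on": datetime.strptime(rec["writtenDate"], "%m/%d/%Y").date(),
    }

records = [
    from_system_a({"pid": "A-001", "rx": "Metformin", "date": "2018-05-01"}),
    from_system_b({"patientIdentifier": "B/77", "medicationName": "Metformin", "writtenDate": "05/01/2018"}),
]
for r in records:
    print({k: r[k] for k in COMMON_FIELDS})  # both records now share one schema
```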

Rakesh:  Patients’ data should follow the patient. Philadelphia has 12 health systems, and we had a challenge to make data interoperable among them, so they told providers not to use portals and made sure hospitals were sending standardized data. Health care data is complex.

David: 80% of clinical data is noise. For example, most electronic medical records are text. Another problem is defining a patient identifier, which the US has resisted adopting.

Please follow on Twitter using the following hashtags and @pharma_BI

#MCConverge

#cancertreatment

#healthIT

#innovation

#precisionmedicine

#healthcaremodels

#personalizedmedicine

#healthcaredata

And at the following handles:

@pharma_BI

@medcitynews




How Will FDA’s new precisionFDA Science 2.0 Collaboration Platform Protect Data?

Reporter: Stephen J. Williams, Ph.D.

As reported in MassDevice.com

FDA launches precisionFDA to harness the power of scientific collaboration

FDA Voice, by Taha A. Kass-Hout, M.D., M.S. and Elaine Johanson

Imagine a world where doctors have at their fingertips the information that allows them to individualize a diagnosis, treatment or even a cure for a person based on their genes. That’s what President Obama envisioned when he announced his Precision Medicine Initiative earlier this year. Today, with the launch of FDA’s precisionFDA web platform, we’re a step closer to achieving that vision.

PrecisionFDA is an online, cloud-based, portal that will allow scientists from industry, academia, government and other partners to come together to foster innovation and develop the science behind a method of “reading” DNA known as next-generation sequencing (or NGS). Next Generation Sequencing allows scientists to compile a vast amount of data on a person’s exact order or sequence of DNA. Recognizing that each person’s DNA is slightly different, scientists can look for meaningful differences in DNA that can be used to suggest a person’s risk of disease, possible response to treatment and assess their current state of health. Ultimately, what we learn about these differences could be used to design a treatment tailored to a specific individual.

The precisionFDA platform is a part of this larger effort and through its use we want to help scientists work toward the most accurate and meaningful discoveries. precisionFDA users will have access to a number of important tools to help them do this. These tools include reference genomes, such as “Genome in a Bottle,” a reference sample of DNA for validating human genome sequences developed by the National Institute of Standards and Technology. Users will also be able to compare their results to previously validated reference results as well as share their results with other users, track changes and obtain feedback.
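As a purely illustrative aside on the kind of comparison described above (checking a lab’s variant calls against a validated reference set such as Genome in a Bottle), here is a toy sketch; it is not the precisionFDA API, and the variants and counts are invented.

```python
# Toy sketch of comparing a lab's variant calls against a validated
# reference call set (e.g., a truth set such as Genome in a Bottle).
# Variants are represented as (chromosome, position, ref_base, alt_base).
# Illustrative only; this is not the precisionFDA API.

reference_calls = {("chr1", 1_014_143, "C", "T"), ("chr2", 47_641_559, "A", "G")}
lab_calls       = {("chr1", 1_014_143, "C", "T"), ("chr7", 140_453_136, "A", "T")}

true_positives  = lab_calls & reference_calls   # agree with the truth set
false_positives = lab_calls - reference_calls   # called by the lab, absent from truth set
false_negatives = reference_calls - lab_calls   # in the truth set, missed by the lab

precision = len(true_positives) / len(lab_calls)
recall    = len(true_positives) / len(reference_calls)
print(f"precision={precision:.2f} recall={recall:.2f}")
```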

Over the coming months we will engage users in improving the usability, openness and transparency of precisionFDA. One way we’ll achieve that is by placing the code for the precisionFDA portal on the world’s largest open source software repository, GitHub, so the community can further enhance precisionFDA’s features. Through such collaboration we hope to improve the quality and accuracy of genomic tests – work that will ultimately benefit patients.

precisionFDA leverages our experience establishing openFDA, an online community that provides easy access to our public datasets. Since its launch in 2014, openFDA has already resulted in many novel ways to use, integrate and analyze FDA safety information. We’re confident that employing such a collaborative approach to DNA data will yield important advances in our understanding of this fast-growing scientific field, information that will ultimately be used to develop new diagnostics, treatments and even cures for patients.

Taha A. Kass-Hout, M.D., M.S., is FDA’s Chief Health Informatics Officer and Director of FDA’s Office of Health Informatics. Elaine Johanson is the precisionFDA Project Manager.

 

The opinions expressed in this blog post are the author’s only and do not necessarily reflect those of MassDevice.com or its employees.

So What Are the Other Successes With Such Open Science 2.0 Collaborative Networks?

In the following post there are highlighted examples of these Open Scientific Networks and, as long as

  • transparency
  • equal contributions (lack of hierarchy)

exist, these networks can flourish and add interesting discourse.  Scientists are already relying on these networks to collaborate and share; however, resistance by certain members of an “elite” can still exist.  Social media platforms are now democratizing this new Science 2.0 effort.  In addition, the efforts of multiple biocurators (who mainly work for the love of science) have organized the plethora of data (genomic, proteomic, and literature) to provide ease of access and analysis.

Science and Curation: The New Practice of Web 2.0

Curation: an Essential Practice to Manage “Open Science”

The web 2.0 gave birth to new practices motivated by the will to have broader and faster cooperation in a more free and transparent environment. We have entered the era of an “open” movement: “open data”, “open software”, etc. In science, expressions like “open access” (to scientific publications and research results) and “open science” are used more and more often.

Curation and Scientific and Technical Culture: Creating Hybrid Networks

Another area, where there are most likely fewer barriers, is scientific and technical culture. This broad term involves different actors such as associations, companies, universities’ communication departments, CCSTI (French centers for scientific, technical and industrial culture), journalists, etc. A number of these actors do not limit their work to popularizing the scientific data; they also consider they have an authentic mission of “culturing” science. The curation practice thus offers a better organization and visibility to the information. The sought-after benefits will be different from one actor to the next.

Scientific Curation Fostering Expert Networks and Open Innovation: Lessons from Clive Thompson and others

  • Using Curation and Science 2.0 to build Trusted, Expert Networks of Scientists and Clinicians

Given the aforementioned problems of:

  I. the complex and rapid deluge of scientific information
  II. the need for a collaborative, open environment to produce transformative innovation
  III. the need for alternative ways to disseminate scientific findings

CURATION MAY OFFER SOLUTIONS

  I. Curation exists beyond the review: curation decreases the time needed to assess current trends, adding multiple insights and analyses WITH an underlying METHODOLOGY (discussed below), while NOT acting as mere reiteration or regurgitation.
  II. Curation provides insights from the WHOLE scientific community on multiple WEB 2.0 platforms.
  III. Curation makes use of new computational and Web-based tools to provide interoperability of data and reporting of findings (shown in the Examples below).

Therefore a discussion is given of methodologies, definitions of best practices, and tools developed to assist the content curation community in this endeavor, which has created a need for more context-driven scientific search and discourse.

However, another issue is individual bias: if these networks are closed, protocols need to be devised to reduce bias from individual investigators and clinicians.  This is where CONSENSUS built from OPEN ACCESS DISCOURSE would be beneficial, as discussed in the following post:

Risk of Bias in Translational Science

As per the article

Risk of bias in translational medicine may take one of three forms:

  1. a systematic error of methodology as it pertains to measurement or sampling (e.g., selection bias),
  2. a systematic defect of design that leads to estimates of experimental and control groups, and of effect sizes that substantially deviate from true values (e.g., information bias), and
  3. a systematic distortion of the analytical process, which results in a misrepresentation of the data with consequential errors of inference (e.g., inferential bias).

This post highlights many important points related to bias, but in summary there are methodologies and protocols that can be devised to eliminate such bias.  Risk of bias can seriously adulterate the internal and the external validity of a clinical study, and, unless it is identified and systematically evaluated, can seriously hamper the process of comparative effectiveness and efficacy research and analysis for practice. The Cochrane Group and the Agency for Healthcare Research and Quality have independently developed instruments for assessing the meta-construct of risk of bias. The present article begins to discuss this dialectic.

  • Information dissemination to all stakeholders is key to increasing their health literacy in order to ensure their full participation
  • threats to internal and external validity represent specific aspects of systematic errors (i.e., bias) in design, methodology and analysis

So what about the safety and privacy of Data?

A while back I did a post and some interviews on how doctors in developing countries are using social networks to communicate with patients, either over established networks like Facebook or more private in-house networks.  In addition, these doctor-patient relationships in developing countries are remote, using the smartphone to communicate with rural patients who don’t have ready access to their physicians.

Located in the post Can Mobile Health Apps Improve Oral-Chemotherapy Adherence? The Benefit of Gamification.

I discuss some of these problems in the following paragraph and associated posts below:

Mobile Health Applications on Rise in Developing World: Worldwide Opportunity

According to International Telecommunication Union (ITU) statistics, world-wide mobile phone use has expanded tremendously in the past 5 years, reaching almost 6 billion subscriptions. By the end of this year it is estimated that over 95% of the world’s population will have access to mobile phones/devices, including smartphones.

This presents a tremendous and cost-effective opportunity in developing countries, and especially rural areas, for physicians to reach patients using mHealth platforms.

How Social Media, Mobile Are Playing a Bigger Part in Healthcare

E-Medical Records Get A Mobile, Open-Sourced Overhaul By White House Health Design Challenge Winners

In summary, although there are restrictions here in the US governing what information can be disseminated over social media networks, developing countries appear to have less well-defined regulations and are more dependent on these types of social networks, given the difficulties in patient-physician access.

Therefore the question will be Who Will Protect The Data?

For some interesting discourse please see the following post

Atul Butte Talks on Big Data, Open Data and Clinical Trials

 



 

Yay! Bloomberg View Seems to Be On the Side of the Lowly Scientist!

 

Reporter: Stephen J. Williams, Ph.D.

Justin Fox at BloombergView has just published an article near and dear to the hearts of all those #openaccess scientists, and those of us @Pharma_BI and @MozillaScience who feel strongly about #openscience and #opendata and the movement to make scientific discourse freely accessible.

His article “Academic Publishing Can’t Remain Such a Great Business” discusses the history of academic publishing and how consolidation of smaller publishers into large scientific publishing houses (bigger publishers bought smaller ones) has produced a monopoly-like environment in which prices for journal subscriptions keep rising. He also discusses how the open access movement is challenging this model and may one day replace the big publishing houses.

A few tidbits from his article:

Publishers of academic journals have a great thing going. They generally don’t pay for the articles they publish, or for the primary editing and peer reviewing essential to preparing them for publication (they do fork over some money for copy editing). Most of this gratis labor is performed by employees of academic institutions. Those institutions, along with government agencies and foundations, also fund all the research that these journal articles are based upon.

Yet the journal publishers are able to get authors to sign over copyright to this content, and sell it in the form of subscriptions to university libraries. Most journals are now delivered in electronic form, which you think would cut the cost, but no, the price has been going up and up:

 

This isn’t just inflation at work: in 1994, journal subscriptions accounted for 51 percent of all library spending on information resources. In 2012 it was 69 percent.

Who exactly is getting that money? The largest academic publisher is Elsevier, which is also the biggest, most profitable division of RELX, the Anglo-Dutch company that was known until February as Reed Elsevier.

 

RELX reports results in British pounds; I converted to dollars in part because the biggest piece of the company’s revenue comes from the U.S. And yes, those are pretty great operating-profit margins: 33 percent in 2014, 39 percent in 2013. The next biggest academic publisher is Springer Nature, which is closely held (by German publisher Holtzbrinck and U.K. private-equity firm BC Partners) but reportedly has annual revenue of about $1.75 billion. Other biggies that are part of publicly traded companies include Wiley-Blackwell, a division of John Wiley & Sons; Wolters Kluwer Health, a division of Wolters Kluwer; and Taylor & Francis, a division of Informa.

And gives a brief history of academic publishing:

The history here is that most early scholarly journals were the work of nonprofit scientific societies. The goal was to disseminate research as widely as possible, not to make money — a key reason why nobody involved got paid. After World War II, the explosion in both the production of and demand for academic research outstripped the capabilities of the scientific societies, and commercial publishers stepped into the breach. At a time when journals had to be printed and shipped all over the world, this made perfect sense.

Once it became possible to effortlessly copy and disseminate digital files, though, the economics changed. For many content producers, digital copying is a threat to their livelihoods. As Peter Suber, the director of Harvard University’s Office for Scholarly Communication, puts it in his wonderful little book, “Open Access”:

And while NIH Tried To Force These Houses To Accept Open Access:

About a decade ago, the universities and funding agencies began fighting back. The National Institutes of Health in the U.S., the world’s biggest funder of medical research, began requiring in 2008 that all recipients of its grants submit electronic versions of their final peer-reviewed manuscripts when they are accepted for publication in journals, to be posted a year later on the NIH’s open-access PubMed depository. Publishers grumbled, but didn’t want to turn down the articles.

Big publishers are making money either by charging as much as they can or by focusing on new customers and services

For the big publishers, meanwhile, the choice is between positioning themselves for the open-access future or maximizing current returns. In its most recent annual report, RELX leans toward the latter while nodding toward the former:

Over the past 15 years alternative payment models for the dissemination of research such as “author-pays” or “author’s funder-pays” have emerged. While it is expected that paid subscription will remain the primary distribution model, Elsevier has long invested in alternative business models to address the needs of customers and researchers.

Elsevier’s extra services can add new avenues of revenue

https://www.elsevier.com/social-sciences/business-and-management

https://www.elsevier.com/rd-solutions

but they may be seeing the light on Open Access (possibly due to online advocacy, an army of scientific curators, and online scientific communities):

Elsevier’s Mendeley and Academia.edu – How We Distribute Scientific Research: A Case in Advocacy for Open Access Journals

SAME SCIENTIFIC IMPACT: Scientific Publishing – Open Journals vs. Subscription-based

e-Recognition via Friction-free Collaboration over the Internet: “Open Access to Curation of Scientific Research”

Indeed, we recently put up an interesting authored paper, “A Patient’s Perspective: On Open Heart Surgery from Diagnosis and Intervention to Recovery” (free of charge), letting the community of science freely peruse and comment. It was generally well accepted by both author and community as a nice way to share academic discourse without the enormous fees, especially for opinion papers in which a rigorous peer review may not be necessary.

But it was very nice to see a major news outlet like Bloomberg View understand the lowly scientist’s aggravations.

Thanks Bloomberg!



Artificial Intelligence Versus the Scientist: Who Will Win?

Will DARPA Replace the Human Scientist: Not So Fast, My Friend!

Writer, Curator: Stephen J. Williams, Ph.D.


An article in last month’s issue of Science by Jia You, “DARPA Sets Out to Automate Research”[1], gave a glimpse of how science could be conducted in the future: without scientists. The article focused on the U.S. Defense Advanced Research Projects Agency (DARPA) program called “Big Mechanism”, a $45 million effort to develop computer algorithms which read scientific journal papers with the ultimate goal of extracting enough information to design hypotheses and the next set of experiments,

all without human input.

The head of the project, artificial intelligence expert Paul Cohen, says the overall goal is to help scientists cope with the complexity of massive amounts of information. As Paul Cohen stated for the article:

“Just when we need to understand highly connected systems as systems, our research methods force us to focus on little parts.”

The Big Mechanism project aims to design computer algorithms that critically read journal articles, much as scientists do, to determine what the information contributes to the knowledge base and how.

As a proof of concept DARPA is attempting to model Ras-mutation driven cancers using previously published literature in three main steps:

  1. Natural Language Processing: machines read the literature on cancer pathways and convert the information into computational semantics and meaning.

One team is focused on extracting details on experimental procedures, mining certain phraseology to gauge a paper’s worth (for example, phrases like ‘we suggest’ or ‘suggests a role in’ might be considered weak, whereas ‘we prove’ or ‘provide evidence’ might flag an article as worthwhile to curate; a toy sketch of this kind of phrase scoring appears after this list). Another team, led by a computational linguistics expert, will design systems to map the meanings of sentences.

  2. Integrate each piece of knowledge into a computational model representing the Ras pathway in oncogenesis.
  3. Produce hypotheses and propose experiments based on the knowledge base, which can be experimentally verified in the laboratory.
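A toy illustration of the phrase-based weighting described in step 1: scan a passage for hedging versus assertive phrases and score it accordingly. The phrase lists, weights, and example passage are invented for illustration and are not the DARPA teams’ actual method.

```python
# Toy sketch of phrase-based scoring: weight a passage by how assertive its
# claim language is. Phrase lists and weights are invented for illustration;
# this is not the Big Mechanism teams' actual method.

WEAK_PHRASES   = ("we suggest", "suggests a role in", "may contribute to")
STRONG_PHRASES = ("we prove", "we demonstrate", "provide evidence that")

def claim_strength(text: str) -> int:
    """Return a crude score: +2 per assertive phrase, -1 per hedging phrase."""
    t = text.lower()
    score = sum(2 for p in STRONG_PHRASES if p in t)
    score -= sum(1 for p in WEAK_PHRASES if p in t)
    return score

passage = ("We demonstrate that mutant KRAS activates ERK signaling and "
           "suggests a role in resistance to MEK inhibition.")
print(claim_strength(passage))  # 2 - 1 = 1
```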

The Human no Longer Needed?: Not So Fast, my Friend!

The problems the DARPA research teams are encountering include:

  • Need for data verification
  • Text mining and curation strategies
  • Incomplete knowledge base (past, current and future)
  • Molecular biology does not necessarily “require causal inference” the way other fields do

Verification

Notice that this verification step (step 3) requires physical lab work, as do all other ‘omics strategies and other computational biology projects. As with high-throughput microarray screens, verification is needed, usually in the form of qPCR, or interesting genes are validated in a phenotypic (expression) system. In addition, there has been an ongoing issue surrounding the validity and reproducibility of some research studies and data.

See Importance of Funding Replication Studies: NIH on Credibility of Basic Biomedical Studies

Therefore, as DARPA attempts to recreate the Ras pathway from published literature and suggest new pathways/interactions, it will be necessary to experimentally confirm certain points (protein interactions, modification events, signaling events) in order to validate their computer model.

Text-Mining and Curation Strategies

The Big Mechanism Project is starting very small; this reflects some of the challenges in the scale of this project. Researchers were given only six paragraph-long passages and a rudimentary model of the Ras pathway in cancer and then asked to automate a text mining strategy to extract as much useful information as possible. Unfortunately this strategy could be fraught with issues frequently encountered in the biocuration community, namely:

Manual or automated curation of scientific literature?

Biocurators, the scientists who painstakingly sort through the voluminous scientific literature to extract and then organize relevant data into accessible databases, have debated whether manual, automated, or a combination of both curation methods [2] achieves the highest accuracy for extracting the information needed to enter into a database. Abigail Cabunoc, a lead developer for the Ontario Institute for Cancer Research’s WormBase (a database of nematode genetics and biology) and Lead Developer at Mozilla Science Lab, noted on her blog, covering the lively debate on biocuration methodology at the Seventh International Biocuration Conference (#ISB2014), that the massive amounts of information will require a Herculean effort regardless of the methodology.

Although I will have a future post on the advantages/disadvantages and tools/methodologies of manual vs. automated curation, there is a great article on researchinformation.info, “Extracting More Information from Scientific Literature”; also see “The Methodology of Curation for Scientific Research Findings” and “Power of Analogy: Curation in Music, Music Critique as a Curation and Curation of Medical Research Findings – A Comparison” for manual curation methodologies, and “A MOD(ern) perspective on literature curation” on the International Society for Biocuration site for a nice workflow paper.

The Big Mechanism team decided on a fully automated approach to text-mine their limited literature set for relevant information; however, they were able to extract only 40% of the information relevant to the given model from these six paragraphs. Although the investigators were happy with this percentage, most biocurators, whether using a manual or automated method to extract information, would consider 40% a low success rate. Biocurators, regardless of method, have reported the ability to extract 70–90% of relevant information from the whole literature (for example for the Comparative Toxicogenomics Database)[3-5].

Incomplete Knowledge Base

In an earlier posting (actually a press release for our first e-book) I discussed the problem with the “data deluge” we are experiencing in the scientific literature, as well as the plethora of ‘omics experimental data which needs to be curated.

Tackling the problem of scientific and medical information overload


Figure. The number of papers listed in PubMed (disregarding reviews) in ten-year periods has steadily increased since 1970.

Analyzing and sharing the vast amounts of scientific knowledge has never been so crucial to innovation in the medical field. The publication rate has steadily increased from the 70’s, with a 50% increase in the number of original research articles published from the 1990’s to the previous decade. This massive amount of biomedical and scientific information has presented the unique problem of an information overload, and the critical need for methodology and expertise to organize, curate, and disseminate this diverse information for scientists and clinicians. Dr. Larry Bernstein, President of Triplex Consulting and previously chief of pathology at New York’s Methodist Hospital, concurs that “the academic pressures to publish, and the breakdown of knowledge into “silos”, has contributed to this knowledge explosion and although the literature is now online and edited, much of this information is out of reach to the very brightest clinicians.”

Traditionally, organization of biomedical information has been the realm of the literature review, but most reviews are performed years after discoveries are made and, given the rapid pace of new discoveries, this is appearing to be an outdated model. In addition, most medical searches are dependent on keywords, hence adding more complexity to the investigator in finding the material they require. Third, medical researchers and professionals are recognizing the need to converse with each other, in real-time, on the impact new discoveries may have on their research and clinical practice.

These issues require a people-based strategy, having expertise in a diverse and cross-integrative number of medical topics to provide the in-depth understanding of the current research and challenges in each field as well as providing a more conceptual-based search platform. To address this need, human intermediaries, known as scientific curators, are needed to narrow down the information and provide critical context and analysis of medical and scientific information in an interactive manner powered by web 2.0 with curators referred to as the “researcher 2.0”. This curation offers better organization and visibility to the critical information useful for the next innovations in academic, clinical, and industrial research by providing these hybrid networks.

Yaneer Bar-Yam of the New England Complex Systems Institute was not confident that using details from past knowledge could produce adequate roadmaps for future experimentation, noting for the article: “The expectation that the accumulation of details will tell us what we want to know is not well justified.”

In a recent post I curated findings from four lung cancer ‘omics studies and presented some graphics on bioinformatic analysis of the novel genetic mutations resulting from these studies (see link below):

Multiple Lung Cancer Genomic Projects Suggest New Targets, Research Directions for Non-Small Cell Lung Cancer

which showed that, while multiple genetic mutations and related pathway ontologies were well documented in the lung cancer literature, there existed many significant genetic mutations and pathways identified in the genomic studies with little literature attributed to these lung cancer-relevant mutations.


  This ‘literomics’ analysis reveals a large gap between our knowledge base and the data resulting from large translational ‘omic’ studies.

Different Literature Analysis Approaches Yield Different Perspectives

A ‘literomics’ approach focuses on what we do NOT know about genes, proteins, and their associated pathways, while a text-mining machine learning algorithm focuses on building a knowledge base to determine the next line of research or what needs to be measured. Using each approach can give us different perspectives on ‘omics data.

Deriving Causal Inference

Ras is one of the best studied and characterized oncogenes, and the mechanisms behind Ras-driven oncogenesis are well understood.   This, according to computational biologist Larry Hunt of Smart Information Flow Technologies, makes Ras a great starting point for the Big Mechanism project. As he states, “Molecular biology is a good place to try (developing a machine learning algorithm) because it’s an area in which common sense plays a minor role.”

Even though some may think the project wouldn’t be able to tackle other mechanisms, such as those involving epigenetic factors, UCLA’s expert in causality Judea Pearl, Ph.D. (head of the UCLA Cognitive Systems Lab) feels it is possible for machine learning to bridge this gap. As summarized from his lecture at Microsoft:

“The development of graphical models and the logic of counterfactuals have had a marked effect on the way scientists treat problems involving cause-effect relationships. Practical problems requiring causal information, which long were regarded as either metaphysical or unmanageable can now be solved using elementary mathematics. Moreover, problems that were thought to be purely statistical, are beginning to benefit from analyzing their causal roots.”

According to him, one must first:

1) articulate assumptions

2) define the research question in counterfactual terms

Then it is possible to design an inference system, using calculus, that tells the investigator what they need to measure.

To watch a video of Dr. Judea Pearl’s April 2013 lecture at Microsoft Research Machine Learning Summit 2013 (“The Mathematics of Causal Inference: with Reflections on Machine Learning”), click here.
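To make “design an inference system using calculus” slightly more concrete, the best-known identity from Pearl’s framework is the back-door adjustment: if a set of covariates Z blocks all confounding paths between a treatment X and an outcome Y, the effect of intervening on X can be computed from purely observational quantities. This is a standard textbook formula offered here as illustration, not a summary of the lecture itself.

```latex
% Back-door adjustment (Pearl): Z satisfies the back-door criterion relative to (X, Y)
P(Y = y \mid do(X = x)) = \sum_{z} P(Y = y \mid X = x, Z = z)\, P(Z = z)
```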

The key for the Big Mechanism Project may be in correcting for the variables among studies, in essence building a model system that does not rely on fully controlled conditions. Dr. Peter Spirtes from Carnegie Mellon University in Pittsburgh, PA is developing a project called the TETRAD project with two goals: 1) to specify and prove under what conditions it is possible to reliably infer causal relationships from background knowledge and statistical data not obtained under fully controlled conditions, and 2) to develop, analyze, implement, test and apply practical, provably correct computer programs for inferring causal structure under conditions where this is possible.

In summary, such projects and algorithms will provide investigators with the what, and possibly the how, of what should be measured.

So for now it seems we are still needed.

References

  1. You J: Artificial intelligence. DARPA sets out to automate research. Science 2015, 347(6221):465.
  2. Biocuration 2014: Battle of the New Curation Methods [http://blog.abigailcabunoc.com/biocuration-2014-battle-of-the-new-curation-methods]
  3. Davis AP, Johnson RJ, Lennon-Hopkins K, Sciaky D, Rosenstein MC, Wiegers TC, Mattingly CJ: Targeted journal curation as a method to improve data currency at the Comparative Toxicogenomics Database. Database : the journal of biological databases and curation 2012, 2012:bas051.
  4. Wu CH, Arighi CN, Cohen KB, Hirschman L, Krallinger M, Lu Z, Mattingly C, Valencia A, Wiegers TC, John Wilbur W: BioCreative-2012 virtual issue. Database : the journal of biological databases and curation 2012, 2012:bas049.
  5. Wiegers TC, Davis AP, Mattingly CJ: Collaborative biocuration–text-mining development task for document prioritization for curation. Database : the journal of biological databases and curation 2012, 2012:bas037.

Other posts on this site on include: Artificial Intelligence, Curation Methodology, Philosophy of Science

Inevitability of Curation: Scientific Publishing moves to embrace Open Data, Libraries and Researchers are trying to keep up

A Brief Curation of Proteomics, Metabolomics, and Metabolism

The Methodology of Curation for Scientific Research Findings

Scientific Curation Fostering Expert Networks and Open Innovation: Lessons from Clive Thompson and others

The growing importance of content curation

Data Curation is for Big Data what Data Integration is for Small Data

Cardiovascular Original Research: Cases in Methodology Design for Content Co-Curation The Art of Scientific & Medical Curation

Exploring the Impact of Content Curation on Business Goals in 2013

Power of Analogy: Curation in Music, Music Critique as a Curation and Curation of Medical Research Findings – A Comparison

conceived: NEW Definition for Co-Curation in Medical Research

Reconstructed Science Communication for Open Access Online Scientific Curation

Search Results for ‘artificial intelligence’

 The Simple Pictures Artificial Intelligence Still Can’t Recognize

Data Scientist on a Quest to Turn Computers Into Doctors

Vinod Khosla: “20% doctor included”: speculations & musings of a technology optimist or “Technology will replace 80% of what doctors do”

Where has reason gone?



Twitter is Becoming a Powerful Tool in Science and Medicine

 Curator: Stephen J. Williams, Ph.D.

Updated 4/2016

Life-cycle of Science 2

A recent Science article (“Who are the science stars of Twitter?”; Sept. 19, 2014) reported the top 50 scientists followed on Twitter. However, the article tended to focus on the use of Twitter as a means to develop popularity, a sort of “Science Kardashian” as they coined it. So the writers at Science developed a “Kardashian Index” (K-Index) to gauge a scientist’s following and popularity on Twitter.

Now, for all the buzz Kim Kardashian or a Perez Hilton gets on social media, their purpose is solely entertainment and publicity; the Science piece sort of fell flat in that it focused mainly on the use of Twitter as a metric for either promotional or public outreach purposes. A notable scientist mentioned in the article used his Twitter feed to gauge the receptiveness of his presentation. In addition, relying on Twitter for effective public discourse of science is problematic, as:

  • Twitter feeds are rapidly updated and older feeds quickly get buried within the “Twittersphere” = LIMITED EXPOSURE TIMEFRAME
  • Short feeds may not provide the access to appropriate and understandable scientific information (The Science Communication Trap) which is explained in The Art of Communicating Science: traps, tips and tasks for the modern-day scientist. “The challenge of clearly communicating the intended scientific message to the public is not insurmountable but requires an understanding of what works and what does not work.” – from Heidi Roop, G.-Martinez-Mendez and K. Mills

However, as highlighted below, Twitter and other social media platforms are being used in creative ways to enhance the research, medical, and bio-investment collaborative, beyond a simple news feed.  The power of Twitter can be attributed to two simple features:

  1. Ability to organize – through use of the hashtag (#) and handle (@), Twitter assists in the very important task of organizing, indexing, and ANNOTATING content and conversations. A very good article, Why the Hashtag is Probably the Most Powerful Tool on Twitter by Vanessa Doctor, explains how hashtags and # search may be as popular as standard web-based browser search. Thorough annotation is crucial for any curation process, usually in the form of database tags or keywords. The use of # and @ allows curators to quickly find, index and relate disparate databases to link annotated information together. The discipline of scientific curation requires annotation to assist in the digital preservation, organization, indexing, and access of data and scientific & medical literature. For a description of scientific curation methodologies please see the following links:

Please read the following articles on CURATION

The Methodology of Curation for Scientific Research Findings

Power of Analogy: Curation in Music, Music Critique as a Curation and Curation of Medical Research Findings – A Comparison

Science and Curation: The New Practice of Web 2.0

  2. Information Analytics

Multiple analytic software packages are available to analyze information surrounding Twitter feeds, including feeds from #chat channels one can set up to cover a meeting, product launch, etc. Some of these tools include (a minimal sketch of computing such metrics from raw tweets appears after this list):

Twitter Analytics – measures metrics surrounding Tweets, including retweets, impressions, engagement, follow rate, …

Hashtags.org – determines the most impactful # for your Tweets. For example, meeting coverage of bioinvestment conferences or startup presentations using #startup generates automatic retweeting by the Startup tweetbot @StartupTweetSF.

Tweet sentiment analytics – tools that gauge the tone of tweets around a topic or event.
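A minimal sketch of the kind of organization and analytics described above: pull the hashtags and handles out of raw tweets, count them, and compute a crude engagement rate. The tweet data and the engagement definition are invented for illustration and do not reproduce Twitter Analytics’ or Symplur’s actual calculations.

```python
# Minimal sketch: extract hashtags/handles from raw tweets and compute a
# crude engagement rate. Data and the engagement definition are invented
# (not Twitter Analytics' or Symplur's actual metrics).

import re
from collections import Counter

tweets = [
    {"text": "Great #cancertreatment talk at #MCConverge via @pharma_BI",
     "impressions": 1200, "retweets": 8, "likes": 15},
    {"text": "#healthIT and #MCConverge panels on interoperability",
     "impressions": 900, "retweets": 3, "likes": 7},
]

# Hashtags and handles are the annotation layer that makes tweets indexable.
hashtags = Counter(tag.lower() for t in tweets for tag in re.findall(r"#\w+", t["text"]))
handles  = Counter(h.lower() for t in tweets for h in re.findall(r"@\w+", t["text"]))

def engagement_rate(t: dict) -> float:
    """Crude engagement: (retweets + likes) / impressions."""
    return (t["retweets"] + t["likes"]) / t["impressions"]

print(hashtags.most_common(3))
print(handles.most_common(3))
print([round(engagement_rate(t), 3) for t in tweets])
```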

Examples of Twitter Use

A. Scientific Meeting Coverage

In a paper entitled Twitter Use at a Family Medicine Conference: Analyzing #STFM13, authors Ranit Mishori, MD, Brendan Levy, MD, and Benjamin Donvan analyzed the public tweets from the 2013 Society of Teachers of Family Medicine (STFM) conference bearing the meeting-specific hashtag #STFM13. Thirteen percent of conference attendees (181 users) used #STFM13 to share their thoughts on the meeting (1,818 total tweets), showing a desire for social media interaction at conferences but suggesting growth potential in this area. As we have also seen, the heaviest volume of conference tweets originated from a small number of Twitter users; however, most tweets were related to session content.

However, as the authors note, although it is easy to measure common metrics such as number of tweets and retweets, determining quality of engagement from tweets would be important for gauging the value of Twitter-based social-media coverage of medical conferences.

The authors compared their results with similar analytics generated by the Healthcare Hashtag Project, a project and database of medically-related hashtag use coordinated and maintained by the company Symplur.  Symplur’s database includes medical and scientific conference Twitter coverage as well as Twitter usage related to patient care. In this case the database was used to compare meeting tweets and hashtag use with the 2012 STFM conference.

These are some of the published journal articles that have employed Symplur (www.symplur.com) data in their research of Twitter usage in medical conferences.

B. Twitter Usage for Patient Care and Engagement

Although the desire of patients to use and interact with their physicians over social media is increasing, along with increasing health-related social media platforms and applications, there are certain obstacles to patient-health provider social media interaction, including lack of regulatory framework as well as database and security issues. Some of the successes and issues of social media and healthcare are discussed in the post Can Mobile Health Apps Improve Oral-Chemotherapy Adherence? The Benefit of Gamification.

However, there is also a concern whether social media truly engages the patient and improves patient education. In a study of Twitter communications by breast cancer patients, the authors noticed that tweeting was a singular event. The majority of tweets did not promote any specific preventive behavior. The authors concluded “Twitter is being used mostly as a one-way communication tool.” (Using Twitter for breast cancer prevention: an analysis of breast cancer awareness month. Thackeray R, Burton SH, Giraud-Carrier C, Rollins S, Draper CR. BMC Cancer. 2013;13:508.)

In addition a new poll by Harris Interactive and HealthDay shows one third of patients want some mobile interaction with their physicians.

Some papers cited in Symplur’s HealthCare Hashtag Project database on patient use of Twitter include:

C. Twitter Use in Pharmacovigilance to Monitor Adverse Events

Pharmacovigilance is the systematic detection, reporting, collecting, and monitoring of adverse events pre- and post-market of a therapeutic intervention (e.g., drug, device, modality). In a Cutting Edge Information study, 56% of pharma companies’ databases include an adverse event channel, and more companies are turning to social media to track adverse events (in Pharmacovigilance Teams Turn to Technology for Adverse Event Reporting Needs). In addition there have been many reports (see Digital Drug Safety Surveillance: Monitoring Pharmaceutical Products in Twitter) showing that patients are frequently tweeting about their adverse events.

There have been concerns with using Twitter and social media to monitor for adverse events. For example, FDA funded a study in which a team of researchers from Harvard Medical School and other academic centers examined more than 60,000 tweets, of which 4,401 were manually categorized as resembling adverse events and compared with the FDA pharmacovigilance databases. Problems associated with such a social media strategy were the inability to obtain extra, needed information from patients and the difficulty of separating relevant tweets from irrelevant chatter.  The UK has launched a similar program called WEB-RADR to determine whether monitoring #drug_reaction could be useful for tracking adverse events. Many researchers have found the adverse-event related tweets “noisy” due to varied language, but have noticed that many people do understand some principles of causation, including when an adverse event subsides after discontinuing the drug.

However Dr. Clark Freifeld, Ph.D., from Boston University and founder of the startup Epidemico, feels his company has the algorithms that can separate out the true adverse events from the junk. According to their web site, their algorithm has high accuracy when compared to the FDA database. Dr. Freifeld admits that Twitter use for pharmacovigilance purposes is probably a starting point for further follow-up, as each patient needs to fill out the four-page forms required for data entry into the FDA database.
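As an illustration of how one might begin separating adverse-event-like tweets from irrelevant chatter, here is a deliberately naive keyword co-occurrence filter. It is a sketch only, not Epidemico’s algorithm or the FDA-funded study’s method; real pharmacovigilance pipelines need far more sophisticated language handling plus manual review and follow-up.

```python
# Deliberately naive sketch of flagging tweets that may describe an adverse
# drug event, by co-occurrence of a drug mention and a symptom term.
# Not Epidemico's algorithm or the FDA study's method; real pipelines need
# much more sophisticated NLP plus manual review and follow-up.

import re

DRUG_TERMS    = {"metformin", "warfarin", "ibuprofen"}            # hypothetical watch list
SYMPTOM_TERMS = {"nausea", "rash", "dizzy", "dizziness", "headache"}

def maybe_adverse_event(tweet: str) -> bool:
    """Flag a tweet when it mentions both a watched drug and a symptom term."""
    words = set(re.findall(r"[a-z]+", tweet.lower()))
    return bool(words & DRUG_TERMS) and bool(words & SYMPTOM_TERMS)

tweets = [
    "Started warfarin last week and now this rash won't go away",
    "Loving the new pharmacy app, refills in two taps",
]
print([maybe_adverse_event(t) for t in tweets])  # [True, False]
```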

D. Use of Twitter in Big Data Analytics

Published on Aug 28, 2012

http://blogs.ischool.berkeley.edu/i29…

Course: Information 290. Analyzing Big Data with Twitter
School of Information
UC Berkeley

Lecture 1: August 23, 2012

Course description:
How to store, process, analyze and make sense of Big Data is of increasing interest and importance to technology companies, a wide range of industries, and academic institutions. In this course, UC Berkeley professors and Twitter engineers will lecture on the most cutting-edge algorithms and software tools for data analytics as applied to Twitter microblog data. Topics will include applied natural language processing algorithms such as sentiment analysis, large scale anomaly detection, real-time search, information diffusion and outbreak detection, trend detection in social streams, recommendation algorithms, and advanced frameworks for distributed computing. Social science perspectives on analyzing social media will also be covered.

This is a hands-on project course in which students are expected to form teams to complete intensive programming and analytics projects using the real-world example of Twitter data and code bases. Engineers from Twitter will help advise student projects, and students will have the option of presenting their final project presentations to an audience of engineers at the headquarters of Twitter in San Francisco (in addition to on campus). Project topics include building on existing infrastructure tools, building Twitter apps, and analyzing Twitter data. Access to data will be provided.

Other posts on this site on USE OF SOCIAL MEDIA AND TWITTER IN HEALTHCARE and Conference Coverage include:

Methodology for Conference Coverage using Social Media: 2014 MassBio Annual Meeting 4/3 – 4/4 2014, Royal Sonesta Hotel, Cambridge, MA

Strategy for Event Joint Promotion: 14th ANNUAL BIOTECH IN EUROPE FORUM For Global Partnering & Investment 9/30 – 10/1/2014 • Congress Center Basel – SACHS Associates, London

REAL TIME Cancer Conference Coverage: A Novel Methodology for Authentic Reporting on Presentations and Discussions launched via Twitter.com @ The 2nd ANNUAL Sachs Cancer Bio Partnering & Investment Forum in Drug Development, 19th March 2014 • New York Academy of Sciences • USA

PCCI’s 7th Annual Roundtable “Crowdfunding for Life Sciences: A Bridge Over Troubled Waters?” May 12 2014 Embassy Suites Hotel, Chesterbrook PA 6:00-9:30 PM

CRISPR-Cas9 Discovery and Development of Programmable Genome Engineering – Gabbay Award Lectures in Biotechnology and Medicine – Hosted by Rosenstiel Basic Medical Sciences Research Center, 10/27/14 3:30PM Brandeis University, Gerstenzang 121

Tweeting on 14th ANNUAL BIOTECH IN EUROPE FORUM For Global Partnering & Investment 9/30 – 10/1/2014 • Congress Center Basel – SACHS Associates, London

https://pharmaceuticalintelligence.com/press-coverage/

Statistical Analysis of Tweet Feeds from the 14th ANNUAL BIOTECH IN EUROPE FORUM For Global Partnering & Investment 9/30 – 10/1/2014 • Congress Center Basel – SACHS Associates, London

1st Pitch Life Science- Philadelphia- What VCs Really Think of your Pitch

What VCs Think about Your Pitch? Panel Summary of 1st Pitch Life Science Philly

How Social Media, Mobile Are Playing a Bigger Part in Healthcare

Can Mobile Health Apps Improve Oral-Chemotherapy Adherence? The Benefit of Gamification.

Medical Applications and FDA regulation of Sensor-enabled Mobile Devices: Apple and the Digital Health Devices Market

E-Medical Records Get A Mobile, Open-Sourced Overhaul By White House Health Design Challenge Winners



Track 9 Pharmaceutical R&D Informatics: Collaboration, Data Science and Biologics @ BioIT World, April 29 – May 1, 2014 Seaport World Trade Center, Boston, MA

Aviva Lev-Ari, PhD, RN

 

April 30, 2014

 

Big Data and Data Science in R&D and Translational Research

10:50 Chairperson’s Remarks

Ralph Haffner, Local Area Head, Research Informatics, F. Hoffmann-La Roche AG

11:00 Can Data Science Save Pharmaceutical R&D?

Jason M. Johnson, Ph.D., Associate Vice President, Scientific Informatics & Early Development and Discovery Sciences IT, Merck

Although both premises – that the viability of pharmaceutical R&D is mortally threatened and that modern “data science” is a relevant superhero – are suspect, it is clear that R&D productivity is progressively declining and many areas of R&D suboptimally use data in decision-making. We will discuss some barriers to our overdue information revolution, and our strategy for overcoming them.

11:30 Enabling Data Science in Externalized Pharmaceutical R&D

Sándor Szalma, Ph.D., Head, External Innovation, R&D IT, Janssen Research & Development, LLC

Pharmaceutical companies have historically been involved in many external partnerships. With the recent proliferation of hosted solutions and the availability of cost-effective, massive high-performance computing resources there is an opportunity and a requirement now to enable collaborative data science. We discuss our experience in implementing robust solutions and pre-competitive approaches to further these goals.

12:00 pm Co-Presentation (Sponsored):

Collaborative Waveform Analytics: How New Approaches in Machine Learning and Enterprise Analytics will Extend Expert Knowledge and Improve Safety Assessment

  • Tim Carruthers, CEO, Neural ID
  • Scott Weiss, Director, Product Strategy, IDBS

Neural ID’s Intelligent Waveform Service (IWS) delivers the only enterprise biosignal analysis solution combining machine learning with human expertise. A collaborative platform supporting all phases of research and development, IWS addresses a significant unmet need, delivering scalable analytics and a single interoperable data format to transform productivity in life sciences. By enabling analysis from BioBook (IDBS) to original biosignals, IWS enables users of BioBook to evaluate cardio safety assessment across the R&D lifecycle.

12:15 Building a Life Sciences Data Lake: A Useful Approach to Big Data (Sponsored)

Ben Szekely, Director & Founding Engineer, Cambridge Semantics

The promise of Big Data is in its ability to give us technology that can cope with overwhelming volume and variety of information that pervades R&D informatics. But the challenges are in practical use of disconnected and poorly described data. We will discuss: Linking Big Data from diverse sources for easy understanding and reuse; Building R&D informatics applications on top of a Life Sciences Data Lake; and Applications of a Data Lake in Pharma.

12:40 Luncheon Presentation I (Sponsored): Chemical Data Visualization in Spotfire

Matthew Stahl, Ph.D., Senior Vice President, OpenEye Scientific Software

Spotfire deftly facilitates the analysis and interrogation of data sets. Domain specific data, such as chemistry, presents a set of challenges that general data analysis tools have difficulty addressing directly. Fortunately, Spotfire is an extensible platform that can be augmented with domain specific abilities. Spotfire has been augmented to naturally handle cheminformatics and chemical data visualization through the integration of OpenEye toolkits. The OpenEye chemistry extensions for Spotfire will be presented.

1:10 Luncheon Presentation II 

1:50 Chairperson’s Remarks

Yuriy Gankin, Ph.D., Co. Founder and CSO, GGA Software Services

1:55 Enable Translational Science by Integrating Data across the R&D Organization

Christian Gossens, Ph.D., Global Head, pRED Development Informatics Team, pRED Informatics, F. Hoffmann-La Roche Ltd.

Multi-national pharmaceutical companies face an amazingly complex information management environment. The presentation will show that a systematic system landscaping approach is an effective tool to build a sustainable integrated data environment. Data integration is not mainly about technology, but the use and implementation of it.

2:25 The Role of Collaboration in Enabling Great Science in the Digital Age: The BARD Data Science Case Study

Andrea DeSouza, Director, Informatics & Data Analysis, Broad Institute

BARD (BioAssay Research Database) is a new, public web portal that uses a standard representation and common language for organizing chemical biology data. In this talk, I describe how data professionals and scientists collaborated to develop BARD, organize the NIH Molecular Libraries Program data, and create a new standard for bioassay data exchange.

May 1, 2014

BIG DATA AND DATA SCIENCE IN R&D AND TRANSLATIONAL RESEARCH

10:30 Chairperson’s Opening Remarks

John Koch, Director, Scientific Information Architecture & Search, Merck

10:35 The Role of a Data Scientist in Drug Discovery and Development

Anastasia (Khoury) Christianson, Ph.D., Head, Translational R&D IT, Bristol-Myers Squibb

A major challenge in drug discovery and development is finding all the relevant data, information, and knowledge to ensure informed, evidence-based decisions in drug projects, including meaningful correlations between preclinical observations and clinical outcomes. This presentation will describe where and how data scientists can support pharma R&D.

11:05 Designing and Building a Data Sciences Capability to Support R&D and Corporate Big Data Needs

Shoibal Datta, Ph.D., Director, Data Sciences, Biogen Idec

To achieve Biogen Idec’s strategic goals, we have built a cross-disciplinary team to focus on key areas of interest and the required capabilities. To provide a reusable set of IT services we have broken down our platform to focus on the Ingestion, Digestion, Extraction and Analysis of data. In this presentation, we will outline how we brought focus and prioritization to our data sciences needs, our data sciences architecture, lessons learned and our future direction.

11:35 Data Experts: Improving Translational Drug-Development Efficiency (Sponsored)

Jamie MacPherson, Ph.D., Consultant, Tessella

We report on a novel approach to translational informatics support: embedding ‘Data Experts’ within drug-project teams. Data Experts combine first-line informatics support and business analysis. They help teams exploit data sources that are diverse in type, scale and quality; analyse user requirements; and prototype potential software solutions. We then explore scaling this approach from a specific drug-development team to all teams.

 

Read Full Post »


Scientific Curation Fostering Expert Networks and Open Innovation: Lessons from Clive Thompson

[Figure: The Life Cycle of Science 2.0]

Curators and Writer: Stephen J. Williams, Ph.D. with input from Curators Larry H. Bernstein, MD, FCAP, Dr. Justin D. Pearlman, MD, PhD, FACC and Dr. Aviva Lev-Ari, PhD, RN

(This discussion is part of a three-part series, which also includes:

Using Scientific Content Curation as a Method for Validation and Biocuration

Using Scientific Content Curation as a Method for Open Innovation)

 

Every month I get my Wired Magazine (yes, in hard print; I still like to turn pages manually, plus I don’t mind if I get grease or wing sauce on my magazine rather than on my e-reader), and I always love reading articles written by Clive Thompson. He has a certain flair for understanding the techno world we live in and the human/technology interaction, and he writes about interesting ways in which we almost inadvertently integrate new technologies into our day-to-day living, generating new entrepreneurship and new value.

An October 2013 Wired article by Clive Thompson, entitled “How Successful Networks Nurture Good Ideas: Thinking Out Loud”, describes how the voluminous writings, postings, tweets, and sharing on social media are fostering connections between people and ideas that previously had not existed. The article draws from Clive Thompson’s book Smarter Than You Think: How Technology is Changing Our Minds for the Better. Tom Peters also commented on the article in his blog (see here).

Clive gives a wonderful example of Ory Okolloh, a young Kenyan-born law student who, after becoming frustrated with the lack of coverage of problems back home, started a blog about Kenyan politics. Her blog not only drew interest from movie producers who were documenting female bloggers but also gained the interest of fellow Kenyans who, during the upheaval after the 2007 Kenyan elections, helped Ory develop a Google map for reporting violence (http://www.ushahidi.com/), which eventually became a global organization using open-source technology to aid crisis management. There are a multitude of examples of how networks, and the conversations within these circles, foster new ideas. As Clive states in the article:

 

Our ideas are PRODUCTS OF OUR ENVIRONMENT.

They are influenced by the conversations around us.

However, the article got me thinking about how Science 2.0 and the internet are changing how scientists contribute, share, and make connections to produce new and transformative ideas.

But HOW MUCH Knowledge is OUT THERE?

 

Clive’s article listed some amazing facts about the mountains of posts, tweets, words, etc. put out on the internet EVERY DAY, all of which exemplify the problem:

  • 154.6 billion EMAILS per DAY
  • 400 million TWEETS per DAY
  • 1 million BLOG POSTS (including this one) per DAY
  • 2 million COMMENTS on WordPress per DAY
  • 16 million WORDS on Facebook per DAY
  • TOTAL 52 TRILLION WORDS per DAY

As he estimates, this amounts to 520 million books per DAY (assuming an average book of 100,000 words).
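
That back-of-the-envelope conversion is easy to check; the short sketch below simply reproduces the arithmetic using the figures quoted above (the word counts are Clive’s, not independently verified):

```python
# Quick check of the books-per-day estimate, using the figures quoted above.
words_per_day = 52_000_000_000_000   # 52 trillion words posted per day (Clive's total)
words_per_book = 100_000             # assumed average length of a book

books_per_day = words_per_day / words_per_book
print(f"{books_per_day:,.0f} books per day")  # prints: 520,000,000 books per day
```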

A LOT of INFO. But as he suggests, it is not the volume but how we create and share this information that is critical; as the science fiction writer Theodore Sturgeon noted, “Ninety percent of everything is crap” (Sturgeon’s Law).

 

Internet Live Stats shows how congested the internet is each day (http://www.internetlivestats.com/). Needless to say, Clive’s numbers are a bit off. As of the writing of this article:

 

  • 2.9 billion internet users
  • 981 million websites (only 25,000 hacked today)
  • 128 billion emails
  • 385 million Tweets
  • > 2.7 million BLOG posts today (including this one)

 

The Good, The Bad, and the Ugly of the Scientific Internet (The Wild West?)

 

So how many science blogs are out there? Well, back in 2008 “grrlscientist” asked this question and turned up a total of 19,881 blogs; however, most were “pseudoscience” blogs, not written by Ph.D.- or M.D.-level scientists. A deeper search on Technorati using the search term “scientist PhD” turned up about 2,000 blogs written by trained scientists.

So granted, there is a lot of the good, the bad, and the ugly ….. when it comes to scientific information on the internet!

I had recently re-posted, on this site, a great example of how bad science and medicine can get propagated throughout the internet:

https://pharmaceuticalintelligence.com/2014/06/17/the-gonzalez-protocol-worse-than-useless-for-pancreatic-cancer/

 

and, in a Nature report, Stem cells: Taking a stand against pseudoscience

http://www.nature.com/news/stem-cells-taking-a-stand-against-pseudoscience-1.15408

Drs. Elena Cattaneo and Gilberto Corbellini document their long, hard fight against false and invalidated medical claims made by some “clinicians” about the utility and medical benefits of certain stem-cell therapies, sacrificing their time to debunk medical pseudoscience.

 

Using Curation and Science 2.0 to build Trusted, Expert Networks of Scientists and Clinicians

 

Establishing networks of trusted colleagues has been a cornerstone of the scientific discourse for centuries. For example, in the mid-1640s, the Royal Society began as:

 

“a meeting of natural philosophers to discuss promoting knowledge of the natural world through observation and experiment”, i.e. science. The Society met weekly to witness experiments and discuss what we would now call scientific topics. The first Curator of Experiments was Robert Hooke.

 

from The History of the Royal Society

 

[Image: Royal Society coat of arms]

The Royal Society of London for Improving Natural Knowledge.

(photo credit: Royal Society)

(Although one wonders why they met “incognito”)

Indeed, as discussed in “Science 2.0/Brainstorming” by the originators of OpenWetWare, an open-source science-notebook platform designed to foster open innovation, the new search and aggregation tools are making it easier to find, contribute, and share information with interested individuals. This paradigm is the basis for the shift from Science 1.0 to Science 2.0. Science 2.0 attempts to remedy the current drawbacks that are hindering rapid and open scientific collaboration and discourse, including:

  • Slow time frame of current publishing methods: reviews can take years to fashion, leading to outdated material
  • Level of information dissemination is currently one-dimensional: peer review, highly polished work, conferences
  • Current publishing does not encourage open feedback and review
  • Published articles edited for print do not take advantage of new web-based features, including tagging, search-engine features, interactive multimedia, and hyperlinks
  • Published data and methodology are often incomplete
  • Published data are not available in formats readily accessible across platforms: gene lists are now mandated to be supplied as files, but other data do not have to be supplied in file format

In short: the pace and volume of scientific output have outgrown the traditional publishing pipeline, and curation offers a way to filter, contextualize, and rapidly share findings while preserving expert judgment.

 

Curation in the Sciences: View from Scientific Content Curators Larry H. Bernstein, MD, FCAP, Dr. Justin D. Pearlman, MD, PhD, FACC and Dr. Aviva Lev-Ari, PhD, RN

Curation is an active filtering of the immense amount of relevant and irrelevant content found on the web and in the peer-reviewed literature. As a result, content may be disruptive. However, in doing good curation, one does more than simply assign value by presenting creative work in any category. Great curators comment and share experience across content, authors and themes. Great curators may see patterns others don’t, or may challenge or debate complex and apparently conflicting points of view. Answers to specifically focused questions come from the hard work of many in laboratory settings creatively establishing answers to definitive questions, each a part of the larger knowledge base of reference. There are those rare “Einsteins” who imagine a whole universe, unlike the three blind men of the Sufi tale: one held the tail, another the trunk, another the ear, and they all said “this is an elephant!”
In my reading, I learn that the optimal ratio of curation to creation may be as high as 90% curation to 10% creation. Creating content is expensive. Curation, by comparison, is much less expensive.

– Larry H. Bernstein, MD, FCAP

Curation is Uniquely Distinguished by the Historical Exploratory Ties that Bind –Larry H. Bernstein, MD, FCAP

The explosion of information by numerous media, hardcopy and electronic, written and video, has created difficulties tracking topics and tying together relevant but separated discoveries, ideas, and potential applications. Some methods to help assimilate diverse sources of knowledge include a content expert preparing a textbook summary, a panel of experts leading a discussion or think tank, and conventions moderating presentations by researchers. Each of those methods has value and an audience, but they also have limitations, particularly with respect to timeliness and pushing the edge. In the electronic data age, there is a need for further innovation, to make synthesis, stimulating associations, synergy and contrasts available to audiences in a more timely and less formal manner. Hence the birth of curation. Key components of curation include expert identification of data, ideas and innovations of interest, expert interpretation of the original research results, integration with context, digesting, highlighting, correlating and presenting in novel light.

– Justin D. Pearlman, MD, PhD, FACC, from The Voice of Content Consultant on The Methodology of Curation, in Cardiovascular Original Research: Cases in Methodology Design for Content Co-Curation The Art of Scientific & Medical Curation

 

In Power of Analogy: Curation in Music, Music Critique as a Curation and Curation of Medical Research Findings – A Comparison, Drs. Larry Bernstein and Aviva Lev-Ari liken the medical and scientific curation process to the curation of musical works into a thematic program:

 

  • Work of Original Music Curation and Performance ↔ Work of Original Expression (the original medical research findings)
  • Music Review and Critique as a Curation ↔ the methodology of Curation in the context of Medical Research Findings: Exposition of Synthesis and Interpretation of the significance of the results to Clinical Care

… leading to new, curated, and collaborative works by networks of experts to generate (in this case) ebooks on the most significant trends and interpretations of scientific knowledge as they relate to medical practice.

 

In Summary: How Scientific Content Curation Can Help

 

Given the aforementioned problems of:

I. the complex and rapid deluge of scientific information

II. the need for a collaborative, open environment to produce transformative innovation

III. the need for alternative ways to disseminate scientific findings

CURATION MAY OFFER SOLUTIONS

I. Curation exists beyond the review: curation decreases the time needed to assess current trends, adding multiple insights and analyses WITH an underlying METHODOLOGY (discussed below) while NOT acting as mere reiteration or regurgitation

II. Curation provides insights from the WHOLE scientific community on multiple WEB 2.0 platforms

III. Curation makes use of new computational and Web-based tools to provide interoperability of data and reporting of findings (shown in Examples below)

 

Therefore a discussion is given on methodologies, definitions of best practices, and tools developed to assist the content curation community in this endeavor.

Methodology in Scientific Content Curation as Envisioned by Aviva Lev-Ari, PhD, RN

 

At Leaders in Pharmaceutical Business Intelligence, site owner and chief editor Aviva Lev-Ari, PhD, RN has been developing a strategy “for the facilitation of Global access to Biomedical knowledge rather than the access to sheer search results on Scientific subject matters in the Life Sciences and Medicine”. According to Aviva, “for the methodology to attain this complex goal it is to be dealing with popularization of ORIGINAL Scientific Research via Content Curation of Scientific Research Results by Experts, Authors, Writers using the critical thinking process of expert interpretation of the original research results.” The following post:

Cardiovascular Original Research: Cases in Methodology Design for Content Curation and Co-Curation

 

https://pharmaceuticalintelligence.com/2013/07/29/cardiovascular-original-research-cases-in-methodology-design-for-content-curation-and-co-curation/

demonstrates two examples of how content co-curation attempts to achieve this aim and to develop networks of scientist and clinician curators who aid in the active discussion of scientific and medical findings, using scientific content curation as a means for critique and offering a “new architecture for knowledge”. Indeed, popular search engines such as Google and Yahoo, and even scientific search engines such as NCBI’s PubMed and the OVID search engine, rely on keywords and Boolean algorithms …

which has created a need for more context-driven scientific search and discourse.
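
To make the contrast concrete, here is a minimal sketch of the keyword-and-Boolean style of search described above, run against PubMed through NCBI’s public E-utilities `esearch` endpoint; the query string is purely illustrative, and the point is that the engine returns matches, not expert context:

```python
# Minimal sketch: a Boolean keyword search against PubMed via NCBI E-utilities.
# The query string is only an illustration of Boolean/keyword syntax.
import json
import urllib.parse
import urllib.request

query = '("content curation"[Title/Abstract]) AND (cardiovascular OR oncology)'
params = {"db": "pubmed", "term": query, "retmode": "json", "retmax": 5}
url = ("https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?"
       + urllib.parse.urlencode(params))

with urllib.request.urlopen(url) as resp:
    result = json.load(resp)["esearchresult"]

print("total hits:", result["count"])     # number of keyword matches
print("first PMIDs:", result["idlist"])   # no interpretation or context attached
```

A human curator starts where this output stops: selecting, contextualizing, and interpreting the hits for a specific audience.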

In Science and Curation: the New Practice of Web 2.0, Célya Gruson-Daniel (@HackYourPhd) states:

To address this need, human intermediaries, empowered by the participatory wave of web 2.0, naturally started narrowing down the information and providing an angle of analysis and some context. They are bloggers, regular Internet users or community managers – a new type of profession dedicated to the web 2.0. A new use of the web has emerged, through which the information, once produced, is collectively spread and filtered by Internet users who create hierarchies of information.

… where Célya considers curation an essential practice for managing open science and this new style of research.

As mentioned above, in her article Dr. Lev-Ari presents two examples of how content curation expanded thought and discussion, and eventually led to new ideas.

  1. Curator edifies content through an analytic process = NEW form of writing and organization leading to new interconnections of ideas = NEW INSIGHTS

i) Evidence: curation methodology leading to new insights for biomarkers

  2. Same as #1 but with multiple players (experts), each bringing unique insights, perspectives, and skills, yielding new research = NEW LINE of CRITICAL THINKING

ii) Evidence: co-curation methodology among cardiovascular experts leading to the cardiovascular series ebooks

[Figure: The Life Cycle of Science 2.0]

The Life Cycle of Science 2.0. Due to Web 2.0, new paradigms of scientific collaboration are rapidly emerging. Originally, scientific discovery was performed by individual laboratories or “scientific silos”, where the main methods of communication were peer-reviewed publication, meeting presentations, and ultimately news outlets and multimedia. In the digital era, data were organized for literature search and biocurated databases. In the era of social media and Web 2.0, a group of scientifically and medically trained “curators” organizes the piles of digitally generated data and fits them into an organizational structure which can be shared, communicated, and analyzed in a holistic approach, launching new ideas due to changes in the organizational structure of data and in data analytics.

 

The result, in this case, is a collaborative written work beyond the scope of a review. Currently, review articles are written by experts in the field and summarize the state of a research area. However, using collaborative, trusted networks of experts, the result is a real-time synopsis and analysis of the field, with the goal in mind to

INCREASE THE SCIENTIFIC CURRENCY.

For detailed description of methodology please see Cardiovascular Original Research: Cases in Methodology Design for Content Co-Curation The Art of Scientific & Medical Curation

 

In her paper, Curating e-Science Data, Maureen Pennock, from The British Library, emphasized the importance of using a diligent, validated, reproducible, and cost-effective methodology for curation by e-science communities over the ‘Grid’:

“The digital data deluge will have profound repercussions for the infrastructure of research and beyond. Data from a wide variety of new and existing sources will need to be annotated with metadata, then archived and curated so that both the data and the programmes used to transform the data can be reproduced for use in the future. The data represent a new foundation for new research, science, knowledge and discovery”

— JISC Senior Management Briefing Paper, The Data Deluge (2004)

 

As she states proper data and content curation is important for:

  • Post-analysis
  • Data and research result reuse for new research
  • Validation
  • Preservation of data in newer formats to prolong life-cycle of research results

However she laments the lack of

  • Funding for such efforts
  • Training
  • Organizational support
  • Monitoring
  • Established procedures

 

Tatiana Aders wrote a nice article based on an interview with Microsoft’s Robert Scoble, in which he emphasized the need for curation in a world where “Twitter is the replacement of the Associated Press Wire Machine” and new technology platforms are knocking out old platforms at a rapid pace. In addition, he notes that curation is also a social art form whose primary concerns are understanding an audience and a niche.

Indeed, part of the reason the need for curation remains unmet, as Mark Carrigan writes, is a lack of appreciation among academics of the utility of tools such as Pinterest, Storify, and Pearltrees for effectively communicating and building collaborative networks.

And teacher Nancy White, in her article Understanding Content Curation on her blog Innovations in Education, shows how curation can serve as an educational tool for students and teachers, demonstrating that students need to CONTEXTUALIZE what they collect to add enhanced value, using higher mental processes such as:

  • Knowledge
  • Comprehension
  • Application
  • Analysis
  • Synthesis
  • Evaluation

[Image: Collecting vs. Curating table] A GREAT table about the differences between Collecting and Curating by Nancy White at http://d20innovation.d20blogs.org/2012/07/07/understanding-content-curation/

University of Massachusetts Medical School has aggregated some useful curation tools at http://esciencelibrary.umassmed.edu/data_curation

Although many of these tools are geared toward biocuration and building databases, the common idea is curating data with indexing, analyses, and contextual value for an audience, in order to generate NETWORKS OF NEW IDEAS.

See here for a curation of how networks fosters knowledge, by Erika Harrison on ScoopIt

(http://www.scoop.it/t/mobilizing-knowledge-through-complex-networks)

 

“Nowadays, any organization should employ network scientists/analysts who are able to map and analyze complex systems that are of importance to the organization (e.g. the organization itself, its activities, a country’s economic activities, transportation networks, research networks).”

– Andrea Carafa, insight from the World Economic Forum New Champions 2012, “Power of Networks”

 

Creating Content Curation Communities: Breaking Down the Silos!

 

An article by Dr. Dana Rotman, “Facilitating Scientific Collaborations Through Content Curation Communities”, highlights how scientific information resources, traditionally created and maintained by paid professionals, are being crowdsourced to what she terms “content curation communities”: groups of professional and nonprofessional volunteers who create, curate, and maintain the various scientific database tools we use, such as Encyclopedia of Life, ChemSpider (for Slideshare see here), biowikipedia, etc. Although very useful and openly available, these projects create their own challenges, such as

  • information integration (various types of data and formats)
  • social integration (marginalized by scientific communities, no funding, no recognition)

The authors set forth some ways to overcome these challenges of the content curation community including:

  1. standardization in practices
  2. visualization to document contributions
  3. emphasizing role of information professionals in content curation communities
  4. maintaining quality control to increase respectability
  5. recognizing participation to professional communities
  6. proposing funding/national meeting – Data Intensive Collaboration in Science and Engineering Workshop

A few great presentations and papers from the 2012 DICOSE meeting are found below

Judith M. Brown, Robert Biddle, Stevenson Gossage, Jeff Wilson & Steven Greenspan. Collaboratively Analyzing Large Data Sets using Multitouch Surfaces. (PDF) NotesForBrown

 

Bill Howe, Cecilia Aragon, David Beck, Jeffrey P. Gardner, Ed Lazowska, Tanya McEwen. Supporting Data-Intensive Collaboration via Campus eScience Centers. (PDF) NotesForHowe

 

Kerk F. Kee & Larry D. Browning. Challenges of Scientist-Developers and Adopters of Existing Cyberinfrastructure Tools for Data-Intensive Collaboration, Computational Simulation, and Interdisciplinary Projects in Early e-Science in the U.S.. (PDF) NotesForKee

 

Ben Li. The mirages of big data. (PDF) NotesForLiReflectionsByBen

 

Betsy Rolland & Charlotte P. Lee. Post-Doctoral Researchers’ Use of Preexisting Data in Cancer Epidemiology Research. (PDF) NoteForRolland

 

Dana Rotman, Jennifer Preece, Derek Hansen & Kezia Procita. Facilitating scientific collaboration through content curation communities. (PDF) NotesForRotman

 

Nicholas M. Weber & Karen S. Baker. System Slack in Cyberinfrastructure Development: Mind the Gaps. (PDF) NotesForWeber

Indeed, the movement from Science 1.0 to Science 2.0 originated because these “silos” had frustrated many scientists, resulting in changes not only in publishing (Open Access) but also in the communication of protocols (online protocol sites and notebooks like OpenWetWare and BioProtocols Online) and in data and material registries (CGAP and tumor banks). Some examples are given below.

Open Science Case Studies in Curation

1. Open Science Project from Digital Curation Center

This project looked at what motivates researchers to work in an open manner with regard to their data, results and protocols, and whether advantages are delivered by working in this way.

The case studies consider the benefits and barriers to using ‘open science’ methods, and were carried out between November 2009 and April 2010 and published in the report Open to All? Case studies of openness in research. The Appendices to the main report (pdf) include a literature review, a framework for characterizing openness, a list of examples, and the interview schedule and topics. Some of the case study participants kindly agreed to us publishing the transcripts. This zip archive contains transcripts of interviews with researchers in astronomy, bioinformatics, chemistry, and language technology.

 

see: Pennock, M. (2006). “Curating e-Science Data”. DCC Briefing Papers: Introduction to Curation. Edinburgh: Digital Curation Centre. Handle: 1842/3330. Available online: http://www.dcc.ac.uk/resources/briefing-papers/introduction-curation

 

2. cBIO – cBio’s biological data curation group developed and operates using a methodology called CIMS, the Curation Information Management System. CIMS is a comprehensive curation and quality control process that efficiently extracts information from publications.

 

3. NIH Topic Maps – This website provides a database and web-based interface for searching and discovering the types of research awarded by the NIH. The database uses automated, computer generated categories from a statistical analysis known as topic modeling.
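
Since topic modeling is the statistical technique behind NIH Topic Maps, a toy sketch may help: the snippet below uses scikit-learn’s LDA implementation on a few made-up abstract fragments (this illustrates the general technique, not the NIH pipeline itself):

```python
# Toy illustration of topic modeling with LDA; not the actual NIH Topic Maps pipeline.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

abstracts = [  # hypothetical fragments of grant abstracts
    "stem cell differentiation for cardiac repair after infarction",
    "tumor genome sequencing and biomarker discovery in oncology",
    "cardiac stem cell therapy improves myocardial function",
    "cancer biomarker panels from tumor sequencing data",
]

vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(abstracts)          # document-term count matrix

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X)                                       # learn 2 latent "topics"

terms = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top_terms = [terms[j] for j in topic.argsort()[-4:][::-1]]
    print(f"topic {i}: {', '.join(top_terms)}")  # top words defining each topic
```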

 

4. SciKnowMine (USC) – We propose to create a framework to support biocuration called SciKnowMine (after ‘Scientific Knowledge Mine’): cyberinfrastructure that supports biocuration through the automated mining of text, images, and other amenable media at the scale of the entire literature.

 

5. OpenWetWare – OpenWetWare is an effort to promote the sharing of information, know-how, and wisdom among researchers and groups who are working in biology & biological engineering. If you would like edit access, would be interested in helping out, or want your lab website hosted on OpenWetWare, please join us. OpenWetWare is managed by the BioBricks Foundation. They also have a wiki about Science 2.0.

6. LabTrove – a lightweight, web-based laboratory “blog” as a route towards a marked-up record of work in a bioscience research laboratory. The authors of a PLOS ONE article from the University of Southampton report the development of an open scientific lab notebook using a blogging strategy to share information.

7. OpenScience Project – The OpenScience project is dedicated to writing and releasing free and Open Source scientific software. We are a group of scientists, mathematicians and engineers who want to encourage a collaborative environment in which science can be pursued by anyone who is inspired to discover something new about the natural world.

8. Open Science Grid is a multi-disciplinary partnership to federate local, regional, community and national cyberinfrastructures to meet the needs of research and academic communities at all scales.

 

9. Some ongoing biomedical knowledge (curation) projects at ISI

IICurate – This project is concerned with developing a curation and documentation system for information integration in collaboration with the II Group at ISI as part of the BIRN.

BioScholar – Its primary purpose is to provide software for experimental biomedical scientists that would permit a single scientific worker (at the level of a graduate student or postdoctoral worker) to design, construct and manage a shared knowledge repository for a research group derived from a local store of PDF files. This project is funded by NIGMS from 2008-2012 (RO1-GM083871).

10. Tools useful for scientific content curation

 

Research Analytic and Curation Tools from University of Queensland

 

Thomson Reuters information curation services for pharma industry

 

Microblogs as a way to communicate information about HPV infection among clinicians and patients; use of Chinese microblog SinaWeibo as a communication tool

 

VIVO for scientific communities – In order to connect information about research activities across institutions and make it available to others, taking into account smaller players in the research landscape and addressing their need for specific information (for example, by providing non-conventional research objects), the open-source software VIVO, which provides research information as linked open data (LOD), is used in many countries. So-called VIVO harvesters collect research information that is freely available on the web and convert the collected data in conformity with LOD standards. The VIVO ontology builds on prevalent LOD namespaces and, depending on the needs of the specialist community concerned, can be expanded.
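
Because VIVO exposes research information as linked open data, it can be queried with standard SPARQL tooling. The sketch below assumes a hypothetical institutional endpoint URL and keeps to the generic FOAF vocabulary, so treat the query as illustrative rather than a verbatim VIVO recipe:

```python
# Illustrative SPARQL query against a (hypothetical) VIVO linked-open-data endpoint.
from SPARQLWrapper import SPARQLWrapper, JSON  # pip install sparqlwrapper

endpoint = "https://vivo.example-university.edu/vivo/sparql"  # hypothetical URL
sparql = SPARQLWrapper(endpoint)
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?person ?name WHERE {
        ?person a foaf:Person ;
                foaf:name ?name .
    } LIMIT 10
""")

results = sparql.query().convert()
for row in results["results"]["bindings"]:
    print(row["person"]["value"], "-", row["name"]["value"])
```

The same pattern extends to the VIVO ontology’s own classes and properties once the target installation’s namespaces are known.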

 

 

11. Examples of scientific curation in different areas of Science/Pharma/Biotech/Education

 

From Science 2.0 to Pharma 3.0 Q&A with Hervé Basset

http://digimind.com/blog/experts/pharma-3-0/

A Q&A with Hervé Basset, specialist librarian in the pharmaceutical industry and owner of the blog “Science Intelligence”, about the inspiration behind his recent book entitled “From Science 2.0 to Pharma 3.0″, published by Chandos Publishing and available on Amazon, and about how health-care companies need a social media strategy to communicate with and convince the health-care consumer, not just the practitioner.

 

Thomson Reuters and NuMedii Launch Ground-Breaking Initiative to Identify Drugs for Repurposing. Companies leverage content, Big Data analytics and expertise to improve success of drug discovery

 

Content Curation as a Context for Teaching and Learning in Science

 

#OZeLIVE Feb2014

http://www.youtube.com/watch?v=Ty-ugUA4az0

Creative Commons license

 

DigCCur: A graduate-level program initiated by the University of North Carolina to train future digital curators in science and other subjects

 

Syracuse University offering a program in eScience and digital curation

 

Curation Tips from TED talks and tech experts

Steven Rosenbaum from Curation Nation

http://www.youtube.com/watch?v=HpncJd1v1k4

 

Pawan Deshpande from Curata on how content curation communities evolve and what makes good content curation:

http://www.youtube.com/watch?v=QENhIU9YZyA

 

 

Future postings on the relevance and application of scientific curation will include:

Using Scientific Content Curation as a Method for Validation and Biocuration

 

Using Scientific Content Curation as a Method for Open Innovation

 

Other posts on this site related to Content Curation and Methodology include:

The growing importance of content curation

Data Curation is for Big Data what Data Integration is for Small Data

6 Steps to More Effective Content Curation

Stem Cells and Cardiac Repair: Content Curation & Scientific Reporting

Cancer Research: Curations and Reporting

Cardiovascular Diseases and Pharmacological Therapy: Curations

Cardiovascular Original Research: Cases in Methodology Design for Content Co-Curation The Art of Scientific & Medical Curation

Exploring the Impact of Content Curation on Business Goals in 2013

Power of Analogy: Curation in Music, Music Critique as a Curation and Curation of Medical Research Findings – A Comparison

conceived: NEW Definition for Co-Curation in Medical Research

The Young Surgeon and The Retired Pathologist: On Science, Medicine and HealthCare Policy – The Best Writers Among the WRITERS

Reconstructed Science Communication for Open Access Online Scientific Curation

 

 

Read Full Post »
