Posts Tagged ‘Artificial intelligence’


Live Coverage: MedCity Converge 2018 Philadelphia: AI in Cancer and Keynote Address

Reporter: Stephen J. Williams, PhD

8:30 AM – 9:15 AM

Practical Applications of AI in Cancer

We are far from machine learning dictating clinical decision making, but AI has important niche applications in oncology. Hear from a panel of innovative startups and established life science players about how machine learning and AI can transform different aspects in healthcare, be it in patient recruitment, data analysis, drug discovery or care delivery.

Moderator: Ayan Bhattacharya, Advanced Analytics Specialist Leader, Deloitte Consulting LLP
Wout Brusselaers, CEO and Co-Founder, Deep 6 AI @woutbrusselaers ‏
Tufia Haddad, M.D., Chair of Breast Medical Oncology and Department of Oncology Chair of IT, Mayo Clinic
Carla Leibowitz, Head of Corporate Development, Arterys @carlaleibowitz
John Quackenbush, Ph.D., Professor and Director of the Center for Cancer Computational Biology, Dana-Farber Cancer Institute

Ayan: having worked at IBM and Thomson Reuters with structured datasets, and having gone through his own cancer battle, he is now working in healthcare AI, which deals with unstructured datasets

Carla: collecting medical images from around the world, mainly of tumors, and calculating tumor volumetrics

Tufia: a clinician treating drug-resistant breast cancer, but interested in AI and healthcare IT at Mayo

John: works with large-scale datasets but is a machine learning skeptic

moderator: how has imaging evolved?

Carla: ten times the images but not ten times the radiologists, so this stressed field needs help with image analysis; they have seen that measuring lung tumor volumetrics as a therapeutic diagnostic has worked

moderator: how has AI affected patient recruitment?

Tufia: the majority of patients are receiving great care, but AI can offer profiles and determine which patients can benefit from tertiary care

John: cited the 1980 paper on the no free lunch theorem; great enthusiasm about optimization algorithms fell short in application; still, great information can be extracted from, e.g., images

moderator: how is AI for healthcare delivery working at mayo?

Tufia: for every hour with a patient, two hours of data mining; for care delivery, the hope is to leverage cognitive systems to do the data mining

John: a problem is irreproducible research, which makes for a poor dataset; also, these care packages are based on population data, not personalized datasets; a challenge for AI is moving from correlation to causation

Carla: algorithms trained on one healthcare network are not good enough; Google tried and it failed

John: curation is very important and good annotation is needed; they had to go in and develop, with curators, a systematic way to curate medical records; standardization and reproducibility are needed; applications in radiomics can differ based on the data collection machines used; they developed a machine learning model hub where investigators can compare models; there is also a need to communicate with patients on healthcare information and quality information

Ayan: Australia and Canada have done the most concerning AI in the life science and healthcare space; AI in most cases is cognitive learning; there are really two types of companies: 1) the Microsofts and Googles, and 2) the startups that may be more pure AI


Final Notes: We are at a point where collecting massive amounts of healthcare-related data is simple, rapid, and shareable.  However, challenges exist in the quality of datasets, proper curation and annotation, the need for collaboration across all healthcare stakeholders including patients, and the dissemination of useful and accurate information.


9:15 AM–9:45 AM

Opening Keynote: Dr. Joshua Brody, Medical Oncologist, Mount Sinai Health System

The Promise and Hype of Immunotherapy

Immunotherapy is revolutionizing oncology care across various types of cancers, but it is also necessary to sort the hype from the reality. In his keynote, Dr. Brody will delve into the history of this new therapy mode and how it has transformed the treatment of lymphoma and other diseases. He will address the hype surrounding it, why so many still don’t respond to the treatment regimen and chart the way forward—one that can lead to more elegant immunotherapy combination paths and better outcomes for patients.

Joshua Brody, M.D., Assistant Professor, Mount Sinai School of Medicine @joshuabrodyMD

Director of Lymphoma Therapy at Mt. Sinai

  • lymphoma is a cancer with high PD-L1 expression
  • Hodgkin’s lymphoma is the best responder to PD-1 therapy (nivolumab), but with hepatic adverse effects
  • CAR-T (a chimeric construct combining BCR- and TCR-derived elements); a long process that includes apheresis, selection of CD3/CD28 cells, viral transfection with the chimeric construct, and purification
  • complete remissions of B cell lymphomas (NCI trial) and long-term remissions past 18 months
  • side effects like cytokine release (which has been controlled) and encephalopathy (he uses a handwriting test to follow progression of this adverse effect)


  • teaching the immune cells: since PD-1 inhibition addresses exhausted T cells, a vaccine boost could be an adjuvant to PD-1 or other checkpoint therapy
  • using a Flt3L-primed in-situ vaccine (a Toll-like receptor agonist can recruit dendritic cells to the tumor, followed by activation of a T cell response); therefore the vaccine does not need to be produced ex vivo; months after the vaccine the tumor is still in remission
  • versus rituximab, which can target many healthy B cells, this in-situ vaccine strategy is very specific for the tumorigenic B cells
  • however, they did see resistant tumor cells that did not overexpress PD-L1, but they discovered a novel checkpoint (which cannot be disclosed at this point)










Please follow on Twitter using the following #hashtags and @pharma_BI











And at the following handles:




Please see related articles on Live Coverage of Previous Meetings on this Open Access Journal

LIVE – Real Time – 16th Annual Cancer Research Symposium, Koch Institute, Friday, June 16, 9AM – 5PM, Kresge Auditorium, MIT

Real Time Coverage and eProceedings of Presentations on 11/16 – 11/17, 2016, The 12th Annual Personalized Medicine Conference, HARVARD MEDICAL SCHOOL, Joseph B. Martin Conference Center, 77 Avenue Louis Pasteur, Boston

Tweets Impression Analytics, Re-Tweets, Tweets and Likes by @AVIVA1950 and @pharma_BI for 2018 BioIT, Boston, 5/15 – 5/17, 2018

BIO 2018! June 4-7, 2018 at Boston Convention & Exhibition Center


Read Full Post »

Our Astrophysicist

Larry H. Bernstein, MD, FCAP, Curator



Ray Kurzweil talks with host Neil deGrasse Tyson, PhD: on invention & immortality

part of the week long event series 7 Days of Genius at 92 Street Y

92 Street Y | 7 Days of Genius
Conversation on stage during the week long event series, held at the historic community center.

featured talk | Ray Kurzweil with host Neil deGrasse Tyson, PhD — on Invention & Immortality


Inventor, author and futurist Ray Kurzweil is joined by astrophysicist and science communicator Neil deGrasse Tyson, PhD for a discussion of some of the biggest topics of our time. They explore the role of technology in the future, its impact on brain science — and coming innovations in artificial intelligence, energy, life extension and immortality.

Ray Kurzweil has been accurately predicting the future for decades. He explains to Star Talk show host Neil deGrasse Tyson, PhD how he does it.

Kurzweil also says microscopic robots called nanobots will connect your neocortex to the cloud — the expansion of the human brain that he predicts will happen in the 2030s.

This featured talk is part of a week long series of events called 7 Days of Genius. Presented by the celebrated, historic 92 Street Y cultural arts and community center.

video | 1.
Highlights from the talk with Ray Kurzweil and host Neil deGrasse Tyson, PhD


video | 2.
Highlights from the talk with Ray Kurzweil and host Neil deGrasse Tyson, PhD


Entrepreneur | The one tip for success shared by Ray Kurzweil and Neil deGrasse Tyson, PhD

March 9, 2016

Entrepreneur — March 8, 2016 | Catherine Clifford

This is a summary. Read original article in full here

Follow your passion deeply, Ray Kurzweil told an audience at an impressively humorous and entertaining talk hosted by astrophysicist Neil deGrasse Tyson, PhD at the 92 Street Y community center.

The talk with leading, innovative thinkers was part of 92 Street Y’s week long 7 Days of Genius festival. Kurzweil is an inventor, entrepreneur, author and futurist.

In the future, Kurzweil said, there will be a premium on specialized, comprehensive knowledge. If you have passion for art, music or literature — follow that, he says. Kurzweil learned when he was young he had a passion for inventing. “But for some people it’s not clear,” he says. “They should explore many different avenues.”

Money should not be the motivating factor, says Kurzweil, who is something of a romantic. “Don’t do what you think is practical, just because you think that’s a way to make a living. The best way to pursue the future is find an expression you have a passion for,” he says.

Tyson encourages people to seek out learning, visit museums and follow curiosity. Tyson says, “I’m here to make more people passionate, to transform the world for good.”


about | 7 Days of Genius at 92 Street Y
Background on the week long event series exploring science, innovation and culture.

92 Street Y |  7 Days of Genius is a multi-platform, week long festival with stage events featuring thought leaders in science, innovation and culture. It explores the concept of genius, and how it transforms lives and cultures.

Events are also hosted globally by partner organizations, and digital broadcast through partners MS • NBC and National Geographic.

Our yearly series of inspiring conversations with experts in politics, technology, knowledge, ethics is focused on the power of genius to change the world for the better.


92 Street Y | 7 Days of Genius
Some featured speakers from the series.

1.  Manjul Bhargava, PhD
2.  Esther Dyson
3.  Ray Kurzweil
4.  Martine Rothblatt, PhD
5.  Yancey Strickler
6.  Neil deGrasse Tyson, PhD


the festival celebrates Genius Revealed featuring:

1.  special installation at 92 Street Y on remarkable, historic female scientists and inventors throughout history
2.  series on female genius produced with Big Think
3.  20 world events with United Nations Women, exploring how genius can help gender equity
4.  global events celebrating innovative ideas of youth to improve communities with design, entrepreneurship
5.  look for Mental Floss campaign on women geniuses
6.  special programming on MS • NBC, and results of our Ultimate Genius Showdown
7.  see winners of our Global Challenges on design, entrepreneurship, religion


video | about 92 Street Y
Background on the historic cultural and community center

watch | video tour

about | 92 Street Y
Landmark community center for culture, arts and conversation.

The historic 92 Street Y is a famous cultural and community center where people from all over connect through culture, arts, entertainment and conversation. For 140 years, we have harnessed the power of arts and ideas to enrich, enlighten and change lives, and the power of community to repair the world. The 92 Street Y is a United States cultural institution in New York, New York at the corner of 92 Street and Lexington Avenue. It’s now a significant landmark center for music, arts, philosophy, celebrity talks and entertainment.

Its full name is the 92 Street Young Men’s and Young Women’s Hebrew Association. Founded in 1874 by German Jewish professionals, 92 Street Y has grown into an organization guided by Jewish principles but serves people of all races and faiths. We harness the power of arts and ideas to enrich, enlighten and change lives, and the power of community.

We enthusiastically reach out to all ages, backgrounds while embracing Jewish values like learning and self-improvement, importance of family, joy of life, and giving back to a wonderfully diverse and growing world.

We curate conversations with the world’s thought leaders — today’s most exceptional thinkers and influential partners for social good — to deepen understanding and engage.

Our performing arts center presents classical, jazz, popular and world music and dance performances. 92 Street Y is a legendary literary destination where the most celebrated writers and readers have gathered since 1939.

We’re a studio, school and workshop where dancers, musicians, jewelry makers, ceramicists, visual artists, poets, playwrights and novelists — professionals and eager amateurs — nourish the human spirit through the arts.

We provide an inspiring, safe and supportive home for families, decades of expertise in parenting, child development, after school sports and classes, special needs programs and summer camps. And offer seniors dozens of activities.

Our fitness center inspires health. 92 Street Y creates meaningful, relevant and joyous experiences for all those who want to connect, finding new ways to bring tradition into dialog with the modern world.

I would also encourage anybody to watch the Intelligence Squared debate video from the 92nd Street Y. It is quite interesting. It’s called “Don’t trust the promise of Artificial Intelligence.” I think both sides of the debate bring interesting arguments.


PBS Newshour | Tech’s next feats? Maybe on-demand kidneys, robot sex, cheap solar, lab meat

PBS Newshour | Optimists at Silicon Valley thinktank Singularity University are pushing the frontiers of human progress through innovation and emerging technologies, looking to greater longevity and better health. As part of his series on “Making Sense” of financial news, Paul Solman explores a future of “exponential growth.”

Paul Solman: Admittedly, solar now provides less than 1 percent of U.S. energy needs. But Singularity University’s other cofounder, Ray Kurzweil, whom we interviewed by something called Teleportec, says the public is pointlessly pessimistic.

Ray Kurzweil, Chancellor, Singularity University: And I think the major reason that people are pessimistic is they don’t realize that these technologies are growing exponentially.

For example, solar energy is doubling every two years. It’s now only seven doublings from meeting 100 percent of the world’s energy needs, and we have 10,000 times more sunlight than we need to do that. […]
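A quick check of the arithmetic behind this claim, taking Kurzweil’s figures at face value rather than verifying them, can be done in a few lines of Python:

```python
# Kurzweil's figures taken at face value: solar supplies ~1% of demand today
# and its share doubles every two years.
share = 0.01               # current share of world energy needs (~1%)
years_per_doubling = 2

doublings = 0
while share < 1.0:         # 1.0 = 100% of energy needs
    share *= 2
    doublings += 1

print(doublings, "doublings =", doublings * years_per_doubling, "years")
# 7 doublings = 14 years, since 0.01 * 2**7 = 1.28 (about 128% of demand)
```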


York University | “Google’s Ray Kurzweil receives honorary doctorate” — October 16, 2013

On October 16, 2013 York University conferred an honorary doctorate on Ray Kurzweil, Director of Engineering at Google, in a ceremony on campus. The Lassonde School of Engineering wishes to congratulate Ray Kurzweil on this tremendous honour.

An inventor, author, futurist and thinker, Ray Kurzweil is most certainly a Renaissance Engineer.

Read Full Post »

Unlocking the Microbiome

Larry H. Bernstein, MD, FCAP, Curator



Machine-learning technique uncovers unknown features of multi-drug-resistant pathogen

Relatively simple “unsupervised” learning system reveals important new information to microbiologists
January 29, 2016

According to the CDC, Pseudomonas aeruginosa is a common cause of healthcare-associated infections, including pneumonia, bloodstream infections, urinary tract infections, and surgical site infections. Some strains of P. aeruginosa have been found to be resistant to nearly all or all antibiotics. (illustration credit: CDC)

A new machine-learning technique can uncover previously unknown features of organisms and their genes in large datasets, according to researchers from the Perelman School of Medicine at the University of Pennsylvania and the Geisel School of Medicine at Dartmouth.

For example, the technique learned to identify the characteristic gene-expression patterns that appear when a bacterium is exposed to different conditions, such as low oxygen and the presence of antibiotics.

The technique, called “ADAGE” (Analysis using Denoising Autoencoders of Gene Expression), uses a “denoising autoencoder” algorithm, which learns to identify recurring features or patterns in large datasets — without being told what specific features to look for (that is, “unsupervised.”)*
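To make the idea concrete, here is a minimal sketch of a denoising autoencoder applied to a gene-expression matrix, written in Python with NumPy. This is not the authors’ ADAGE code; the 50-node hidden layer mirrors the model size described in the abstract below, but the toy data, corruption level, and learning rate are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy stand-in for a gene-expression compendium:
# rows = samples (arrays), columns = genes, values scaled to [0, 1].
n_samples, n_genes, n_nodes = 200, 500, 50
X = rng.random((n_samples, n_genes))

# Tied-weight denoising autoencoder: corrupt the input, then learn to
# reconstruct the clean version through a 50-node hidden layer.
W = rng.normal(0, 0.01, (n_genes, n_nodes))
b_hidden = np.zeros(n_nodes)
b_visible = np.zeros(n_genes)
lr, corruption, epochs = 0.1, 0.1, 50   # assumed hyperparameters

for _ in range(epochs):
    # Randomly zero out a fraction of inputs (the "denoising" corruption).
    mask = rng.random(X.shape) > corruption
    X_noisy = X * mask

    hidden = sigmoid(X_noisy @ W + b_hidden)       # encode
    recon = sigmoid(hidden @ W.T + b_visible)      # decode (tied weights)

    # Gradients of the cross-entropy reconstruction loss (per-sample average).
    d_recon = (recon - X) / n_samples
    d_hidden = (d_recon @ W) * hidden * (1 - hidden)

    W -= lr * (X_noisy.T @ d_hidden + (hidden.T @ d_recon).T)
    b_visible -= lr * d_recon.sum(axis=0)
    b_hidden -= lr * d_hidden.sum(axis=0)

# Each hidden node now carries a weight per gene; genes with large weights
# on the same node tend to vary together across the compendium.
node_signatures = W.T   # shape: (50 nodes, 500 genes)
print(node_signatures.shape)
```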

Last year,  Casey Greene, PhD, an assistant professor of Systems Pharmacology and Translational Therapeutics at Penn, and his team published, in an open-access paper in the American Society for Microbiology’s mSystems, the first demonstration of ADAGE in a biological context: an analysis of two gene-expression datasets of breast cancers.

Tracking down gene patterns of a multi-drug-resistant bacterium

The new study, published Jan. 19 in an open-access paper in mSystems, was more ambitious. It applied ADAGE to a dataset of 950 gene-expression arrays publicly available at the time for the multi-drug-resistant bacterium Pseudomonas aeruginosa. This bacterium is a notorious pathogen in the hospital and in individuals with cystic fibrosis and other chronic lung conditions; it’s often difficult to treat due to its high resistance to standard antibiotic therapies.

The data included only the identities of the roughly 5,000 P. aeruginosa genes and their measured expression levels in each published experiment. The goal was to see if this “unsupervised” learning system could uncover important patterns in P. aeruginosa gene expression and clarify how those patterns change when the bacterium’s environment changes — for example, when in the presence of an antibiotic.

Even though the model built with ADAGE was relatively simple — roughly equivalent to a brain with only a few dozen neurons — it had no trouble learning which sets of P. aeruginosa genes tend to work together or in opposition. To the researchers’ surprise, the ADAGE system also detected differences between the main laboratory strain of P. aeruginosa and strains isolated from infected patients. “That turned out to be one of the strongest features of the data,” Greene said.

“We expect that this approach will be particularly useful to microbiologists researching bacterial species that lack a decades-long history of study in the lab,” said Greene. “Microbiologists can use these models to identify where the data agree with their own knowledge and where the data seem to be pointing in a different direction … and to find completely new things in biology that we didn’t even know to look for.”

Support for the research came from the Gordon and Betty Moore Foundation, the William H. Neukom Institute for Computational Science, the National Institutes of Health, and the Cystic Fibrosis Foundation.

* In 2012, Google-sponsored researchers applied a similar method to randomly selected YouTube images; their system learned to recognize major recurring features of those images — including cats of course.

Abstract of ADAGE-Based Integration of Publicly Available Pseudomonas aeruginosa Gene Expression Data with Denoising Autoencoders Illuminates Microbe-Host Interactions

The increasing number of genome-wide assays of gene expression available from public databases presents opportunities for computational methods that facilitate hypothesis generation and biological interpretation of these data. We present an unsupervised machine learning approach, ADAGE (analysis using denoising autoencoders of gene expression), and apply it to the publicly available gene expression data compendium for Pseudomonas aeruginosa. In this approach, the machine-learned ADAGE model contained 50 nodes which we predicted would correspond to gene expression patterns across the gene expression compendium. While no biological knowledge was used during model construction, cooperonic genes had similar weights across nodes, and genes with similar weights across nodes were significantly more likely to share KEGG pathways. By analyzing newly generated and previously published microarray and transcriptome sequencing data, the ADAGE model identified differences between strains, modeled the cellular response to low oxygen, and predicted the involvement of biological processes based on low-level gene expression differences. ADAGE compared favorably with traditional principal component analysis and independent component analysis approaches in its ability to extract validated patterns, and based on our analyses, we propose that these approaches differ in the types of patterns they preferentially identify. We provide the ADAGE model with analysis of all publicly available P. aeruginosa GeneChip experiments and open source code for use with other species and settings. Extraction of consistent patterns across large-scale collections of genomic data using methods like ADAGE provides the opportunity to identify general principles and biologically important patterns in microbial biology. This approach will be particularly useful in less-well-studied microbial species.

Abstract of Unsupervised feature construction and knowledge extraction from genome-wide assays of breast cancer with denoising autoencoders

Big data bring new opportunities for methods that efficiently summarize and automatically extract knowledge from such compendia. While both supervised learning algorithms and unsupervised clustering algorithms have been successfully applied to biological data, they are either dependent on known biology or limited to discerning the most significant signals in the data. Here we present denoising autoencoders (DAs), which employ a data-defined learning objective independent of known biology, as a method to identify and extract complex patterns from genomic data. We evaluate the performance of DAs by applying them to a large collection of breast cancer gene expression data. Results show that DAs successfully construct features that contain both clinical and molecular information. There are features that represent tumor or normal samples, estrogen receptor (ER) status, and molecular subtypes. Features constructed by the autoencoder generalize to an independent dataset collected using a distinct experimental platform. By integrating data from ENCODE for feature interpretation, we discover a feature representing ER status through association with key transcription factors in breast cancer. We also identify a feature highly predictive of patient survival and it is enriched by FOXM1 signaling pathway. The features constructed by DAs are often bimodally distributed with one peak near zero and another near one, which facilitates discretization. In summary, we demonstrate that DAs effectively extract key biological principles from gene expression data and summarize them into constructed features with convenient properties.

Read Full Post »

Future of Big Data for Societal Transformation

Larry H. Bernstein, MD, FCAP, Curator



Musk, others commit $1 billion to non-profit AI research company to ‘benefit humanity’

Open-sourcing AI development to prevent an AI superpower takeover
(credit: OpenAI)

Elon Musk and associates announced OpenAI, a non-profit AI research company, on Friday (Dec. 11), committing $1 billion toward their goal to “advance digital intelligence in the way that is most likely to benefit humanity as a whole, unconstrained by a need to generate financial return.”

The funding comes from a group of tech leaders including Musk, Reid Hoffman, Peter Thiel, and Amazon Web Services, but the venture expects to only spend “a tiny fraction of this in the next few years.”

The founders note that it’s hard to predict how much AI could “damage society if built or used incorrectly” or how soon. But the hope is to have a leading research institution that can “prioritize a good outcome for all over its own self-interest … as broadly and evenly distributed as possible.”

Brains trust

OpenAI’s co-chairs are Musk, who is also the principal funder of Future of Life Institute, and Sam Altman, president of  venture-capital seed-accelerator firm Y Combinator, who is also providing funding.

“I think the best defense against the misuse of AI is to empower as many people as possible to have AI. If everyone has AI powers, then there’s not any one person or a small set of individuals who can have AI superpower.” — Elon Musk on Medium

The founders say the organization’s patents (if any) “will be shared with the world. We’ll freely collaborate with others across many institutions and expect to work with companies to research and deploy new technologies.”

OpenAI’s research director is machine learning expert Ilya Sutskever, formerly at Google, and its CTO is Greg Brockman, formerly the CTO of Stripe. The group’s other founding members are “world-class research engineers and scientists” Trevor Blackwell, Vicki Cheung, Andrej Karpathy, Durk Kingma, John Schulman, Pamela Vagata, and Wojciech Zaremba. Pieter Abbeel, Yoshua Bengio, Alan Kay, Sergey Levine, and Vishal Sikka are advisors to the group. The company will be based in San Francisco.

If I’m Dr. Evil and I use it, won’t you be empowering me?

“There are a few different thoughts about this. Just like humans protect against Dr. Evil by the fact that most humans are good, and the collective force of humanity can contain the bad elements, we think its far more likely that many, many AIs will work to stop the occasional bad actors than the idea that there is a single AI a billion times more powerful than anything else. If that one thing goes off the rails or if Dr. Evil gets that one thing and there is nothing to counteract it, then we’re really in a bad place.” — Sam Altman in an interview with Steven Levy on Medium.

The announcement follows recent announcements by Facebook to open-source the hardware design of its GPU-based “Big Sur” AI server (used for large-scale machine learning software to identify objects in photos and understand natural language, for example); by Google to open-source its TensorFlow machine-learning software; and by Toyota Corporation to invest $1 billion in a five-year private research effort in artificial intelligence and robotics technologies, jointly with Stanford University and MIT.

To follow OpenAI: @open_ai

Spot on, Elon! The threat is real, and current developments are unfortunately pointing exactly in the direction of AI being controlled by a handful of big and powerful corporations. Not surprisingly, none of those players are part of the OpenAI movement.

I like the sentiment, AI for all and for the common good, and at one level it seems doable but at another level it seems problematic on the scale of nation states and multinational entities.

If we all have AI systems then it will be those with control of the most energy to run their AI who will have the most influence, and that could be a “Dr. Evil”. It is the sum total of computing power on any given side of a conflict that will determine the outcome, if AI is a significant factor at all.

We could see bigger players looking at strategic questions such as, do they act now, or wait and put more resources into advancing the power of their AI so that they have better odds later, but at the risk of falling to a preemptive attack. Given this sort of thing I don’t see that AI will be a game changer, a leveller, rather it could just fit into the existing arms race type scenarios, at least until one group crosses a singularity threshold and then accelerates away from the pack while holding everyone else back so that they cannot catch up.

No matter how I look at it, I always see the scenarios running in the opposite direction to diversity, toward a singular dominant entity that “roots” all the other AI, sensor and actuator systems and then assimilates them.

How do they plan to stop this? How can one group of AIs have an ethical framework that allows them to “keep down” another group or single AI so that it does not get into a position to dominate them? How will this be any less messy than how the human super-powers have interacted in the last century?


I recommend the book “SuperIntelligence” by Nick Bostrom. Most thorough and penetrating. It covers many permutations of the intelligence explosion. The Allegory at the beginning is worth the price alone.


Elon, for goodness sake, focus! Get the big battery factory working, get the space industry off the ground and America back in the ISS resupply and re-crew business, but enough with the non-profit expenditures already! Keep sinking your capital into non-profits like the Hyperloop (a beautiful, high-tech version of the old “I just know I can make trains profitable again outside of the northeast” dream) and this non-profit AI, and you’ll eventually go one financial step too far.

Both for you and for all of us who benefit from your efforts, consider this. At least change your attitude about profit; keep the option open that this AI will bring some profit, even with the open source aspect. This is a great effort, as I see you possibly becoming the “good AI” element that Ray writes about in his first essay, in the essay section on this site. There, Ray is confident that the good people with AI will out-think the bad people with AI and so good AI will prevail.

Read Full Post »

Artificial Intelligence Versus the Scientist: Who Will Win?

Will DARPA Replace the Human Scientist: Not So Fast, My Friend!

Writer, Curator: Stephen J. Williams, Ph.D.


Last month’s Science article by Jia You, “DARPA Sets Out to Automate Research”[1], gave a glimpse of how science could be conducted in the future: without scientists. The article focused on the U.S. Defense Advanced Research Projects Agency (DARPA) program called “Big Mechanism”, a $45 million effort to develop computer algorithms that read scientific journal papers with the ultimate goal of extracting enough information to design hypotheses and the next set of experiments,

all without human input.

The head of the project, artificial intelligence expert Paul Cohen, says the overall goal is to help scientists cope with the complexity of massive amounts of information. As Paul Cohen stated for the article:


Just when we need to understand highly connected systems as systems,

our research methods force us to focus on little parts.


The Big Mechanism project aims to design computer algorithms to critically read journal articles, much as scientists do, to determine what the information contributes to the knowledge base and how.

As a proof of concept DARPA is attempting to model Ras-mutation driven cancers using previously published literature in three main steps:

  1. Natural Language Processing: Machines read literature on cancer pathways and convert information to computational semantics and meaning

One team is focused on extracting details on experimental procedures, mining certain phraseology to determine a paper’s worth (for example, phrases like “we suggest” or “suggests a role in” might be considered weak, whereas “we prove” or “provide evidence” might flag an article as worthwhile to curate; a crude sketch of this phrase-weighting idea appears after the list below). Another team, led by a computational linguistics expert, will design systems to map the meanings of sentences.

  2. Integrate each piece of knowledge into a computational model to represent the Ras pathway in oncogenesis.
  3. Produce hypotheses and propose experiments based on the knowledge base, which can be experimentally verified in the laboratory.
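As referenced in step 1 above, a crude sketch of the phrase-weighting idea might look like the following. The cue phrases and weights are invented for the example and are not DARPA’s actual system.

```python
import re

# Hypothetical cue phrases and weights: hedging language scores low,
# assertive evidential language scores high.
WEAK = {"we suggest": -1, "suggests a role in": -1, "may indicate": -1}
STRONG = {"we prove": 2, "provide evidence": 2, "we demonstrate": 2}

def evidence_score(text: str) -> int:
    """Score a passage by counting weighted cue phrases."""
    text = text.lower()
    score = 0
    for phrase, weight in {**WEAK, **STRONG}.items():
        score += weight * len(re.findall(re.escape(phrase), text))
    return score

passage = ("We provide evidence that KRAS activation drives proliferation; "
           "the data suggests a role in metabolic reprogramming.")
print(evidence_score(passage))   # 2 - 1 = 1: weakly curate-worthy
```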

The Human no Longer Needed?: Not So Fast, my Friend!

The problems the DARPA research teams are encountering are, namely:

  • Need for data verification
  • Text mining and curation strategies
  • Incomplete knowledge base (past, current and future)
  • Molecular biology does not necessarily “require causal inference” as other fields do


Notice that this verification step (step 3) requires physical lab work, as do all other ‘omics strategies and other computational biology projects. As with high-throughput microarray screens, verification is needed, usually in the form of conducting qPCR, or interesting genes are validated in a phenotypic (expression) system. In addition, there has been an ongoing issue surrounding the validity and reproducibility of some research studies and data.

See Importance of Funding Replication Studies: NIH on Credibility of Basic Biomedical Studies

Therefore as DARPA attempts to recreate the Ras pathway from published literature and suggest new pathways/interactions, it will be necessary to experimentally validate certain points (protein interactions or modification events, signaling events) in order to validate their computer model.

Text-Mining and Curation Strategies

The Big Mechanism Project is starting very small; this reflects some of the challenges in scale of this project. Researchers were only given six paragraph long passages and a rudimentary model of the Ras pathway in cancer and then asked to automate a text mining strategy to extract as much useful information. Unfortunately this strategy could be fraught with issues frequently occurred in the biocuration community namely:

Manual or automated curation of scientific literature?

Biocurators, the scientists who painstakingly sort through the voluminous scientific literature to extract and then organize relevant data into accessible databases, have debated whether manual, automated, or a combination of both curation methods [2] achieves the highest accuracy for extracting the information needed to enter into a database. Abigail Cabunoc, a lead developer for the Ontario Institute for Cancer Research’s WormBase (a database of nematode genetics and biology) and Lead Developer at Mozilla Science Lab, noted on her blog the lively debate on biocuration methodology at the Seventh International Biocuration Conference (#ISB2014), and that the massive amounts of information will require a Herculean effort regardless of the methodology.

Although I will have a future post on the advantages/disadvantages and tools/methodologies of manual vs. automated curation, there is a great article on researchinformation.info, “Extracting More Information from Scientific Literature”, and also see “The Methodology of Curation for Scientific Research Findings” and “Power of Analogy: Curation in Music, Music Critique as a Curation and Curation of Medical Research Findings – A Comparison” for manual curation methodologies, and “A MOD(ern) perspective on literature curation” for a nice workflow paper on the International Society for Biocuration site.

The Big Mechanism team decided on a fully automated approach to text-mine their limited literature set for relevant information; however, they were able to extract only 40% of the information relevant to the given model from these six paragraphs. Although the investigators were happy with this percentage, most biocurators, whether using a manual or automated method to extract information, would consider 40% a low success rate. Biocurators, regardless of method, have reported the ability to extract 70-90% of relevant information from the whole literature (for example, for the Comparative Toxicogenomics Database) [3-5].

Incomplete Knowledge Base

In an earlier posting (actually was a press release for our first e-book) I had discussed the problem with the “data deluge” we are experiencing in scientific literature as well as the plethora of ‘omics experimental data which needs to be curated.

Tackling the problem of scientific and medical information overload


Figure. The number of papers listed in PubMed (disregarding reviews) per ten-year period has steadily increased since 1970.

Analyzing and sharing the vast amounts of scientific knowledge has never been so crucial to innovation in the medical field. The publication rate has steadily increased from the 70’s, with a 50% increase in the number of original research articles published from the 1990’s to the previous decade. This massive amount of biomedical and scientific information has presented the unique problem of an information overload, and the critical need for methodology and expertise to organize, curate, and disseminate this diverse information for scientists and clinicians. Dr. Larry Bernstein, President of Triplex Consulting and previously chief of pathology at New York’s Methodist Hospital, concurs that “the academic pressures to publish, and the breakdown of knowledge into “silos”, has contributed to this knowledge explosion and although the literature is now online and edited, much of this information is out of reach to the very brightest clinicians.”

Traditionally, organization of biomedical information has been the realm of the literature review, but most reviews are performed years after discoveries are made and, given the rapid pace of new discoveries, this is appearing to be an outdated model. In addition, most medical searches are dependent on keywords, hence adding more complexity to the investigator in finding the material they require. Third, medical researchers and professionals are recognizing the need to converse with each other, in real-time, on the impact new discoveries may have on their research and clinical practice.

These issues require a people-based strategy, having expertise in a diverse and cross-integrative number of medical topics to provide the in-depth understanding of the current research and challenges in each field as well as providing a more conceptual-based search platform. To address this need, human intermediaries, known as scientific curators, are needed to narrow down the information and provide critical context and analysis of medical and scientific information in an interactive manner powered by web 2.0 with curators referred to as the “researcher 2.0”. This curation offers better organization and visibility to the critical information useful for the next innovations in academic, clinical, and industrial research by providing these hybrid networks.

Yaneer Bar-Yam of the New England Complex Systems Institute was not confident that using details from past knowledge could produce adequate roadmaps for future experimentation, and noted for the article, “The expectation that the accumulation of details will tell us what we want to know is not well justified.”

In a recent post I had curated findings from four lung cancer omics studies and presented some graphic on bioinformatic analysis of the novel genetic mutations resulting from these studies (see link below)

Multiple Lung Cancer Genomic Projects Suggest New Targets, Research Directions for Non-Small Cell Lung Cancer

which showed that, while multiple genetic mutations and related pathway ontologies were well documented in the lung cancer literature, there existed many significant genetic mutations and pathways identified in the genomic studies with little literature attributed to these lung cancer-relevant mutations.


  This ‘literomics’ analysis reveals a large gap between our knowledge base and the data resulting from large translational ‘omic’ studies.

Different Literature Analysis Approaches Yield Different Perspectives

A ‘literomics’ approach focuses on what we do NOT know about genes, proteins, and their associated pathways, while a text-mining machine learning algorithm focuses on building a knowledge base to determine the next line of research or what needs to be measured. Using each approach can give us different perspectives on ‘omics data.

Deriving Causal Inference

Ras is one of the best studied and characterized oncogenes, and the mechanisms behind Ras-driven oncogenesis are well understood. This, according to computational biologist Larry Hunt of Smart Information Flow Technologies, makes Ras a great starting point for the Big Mechanism project. As he states, “Molecular biology is a good place to try (developing a machine learning algorithm) because it’s an area in which common sense plays a minor role.”

Even though some may think the project would not be able to tackle other mechanisms that involve epigenetic factors, UCLA’s expert in causality Judea Pearl, Ph.D. (head of the UCLA Cognitive Systems Lab) feels it is possible for machine learning to bridge this gap. As summarized from his lecture at Microsoft:

“The development of graphical models and the logic of counterfactuals have had a marked effect on the way scientists treat problems involving cause-effect relationships. Practical problems requiring causal information, which long were regarded as either metaphysical or unmanageable can now be solved using elementary mathematics. Moreover, problems that were thought to be purely statistical, are beginning to benefit from analyzing their causal roots.”

According to him, one must first:

1) articulate the assumptions

2) define the research question in counterfactual terms

Then it is possible to design an inference system, using this calculus, that tells the investigator what they need to measure.

To watch a video of Dr. Judea Pearl’s April 2013 lecture at Microsoft Research Machine Learning Summit 2013 (“The Mathematics of Causal Inference: with Reflections on Machine Learning”), click here.
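To make Pearl’s two steps concrete, here is a toy sketch of adjustment-based causal inference in Python. The causal graph, the simulated data, and the assumption that mutation status confounds both treatment and response are all invented for illustration; the point is only that stated assumptions plus a counterfactual question determine what must be measured and how to adjust.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Step 1 (articulate assumptions): assume a causal graph
#   mutation -> treatment, mutation -> response, treatment -> response,
# i.e. mutation status confounds the treatment-response relationship.
mutation = rng.random(n) < 0.3
treatment = rng.random(n) < np.where(mutation, 0.7, 0.2)
p_resp = 0.1 + 0.25 * treatment + 0.3 * mutation        # true effect: +0.25
response = rng.random(n) < p_resp

# Naive (purely statistical) contrast mixes the confounder in.
naive = response[treatment].mean() - response[~treatment].mean()

# Step 2 (counterfactual question): "What would the response rate be if we
# treated everyone vs. no one?"  Under the assumed graph, the backdoor
# adjustment answers it by averaging within mutation strata.
adjusted = 0.0
for m in (False, True):
    stratum = mutation == m
    w = stratum.mean()
    adjusted += w * (response[stratum & treatment].mean()
                     - response[stratum & ~treatment].mean())

print(f"naive difference:    {naive:.3f}")     # biased upward by confounding
print(f"adjusted difference: {adjusted:.3f}")  # close to the true +0.25
```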

The key for the Big Mechanism project may be in correcting for the variables among studies, in essence building a model system that need not rely on fully controlled conditions. Dr. Peter Spirtes from Carnegie Mellon University in Pittsburgh, PA is developing a project called the TETRAD project with two goals: 1) to specify and prove under what conditions it is possible to reliably infer causal relationships from background knowledge and statistical data not obtained under fully controlled conditions, and 2) to develop, analyze, implement, test and apply practical, provably correct computer programs for inferring causal structure under conditions where this is possible.

In summary, such projects and algorithms will provide investigators with what, and possibly how, things should be measured.

So for now it seems we are still needed.


  1. You J: Artificial intelligence. DARPA sets out to automate research. Science 2015, 347(6221):465.
  2. Biocuration 2014: Battle of the New Curation Methods []
  3. Davis AP, Johnson RJ, Lennon-Hopkins K, Sciaky D, Rosenstein MC, Wiegers TC, Mattingly CJ: Targeted journal curation as a method to improve data currency at the Comparative Toxicogenomics Database. Database : the journal of biological databases and curation 2012, 2012:bas051.
  4. Wu CH, Arighi CN, Cohen KB, Hirschman L, Krallinger M, Lu Z, Mattingly C, Valencia A, Wiegers TC, John Wilbur W: BioCreative-2012 virtual issue. Database : the journal of biological databases and curation 2012, 2012:bas049.
  5. Wiegers TC, Davis AP, Mattingly CJ: Collaborative biocuration–text-mining development task for document prioritization for curation. Database : the journal of biological databases and curation 2012, 2012:bas037.

Other posts on this site on include: Artificial Intelligence, Curation Methodology, Philosophy of Science

Inevitability of Curation: Scientific Publishing moves to embrace Open Data, Libraries and Researchers are trying to keep up

A Brief Curation of Proteomics, Metabolomics, and Metabolism

The Methodology of Curation for Scientific Research Findings

Scientific Curation Fostering Expert Networks and Open Innovation: Lessons from Clive Thompson and others

The growing importance of content curation

Data Curation is for Big Data what Data Integration is for Small Data

Cardiovascular Original Research: Cases in Methodology Design for Content Co-Curation The Art of Scientific & Medical Curation

Exploring the Impact of Content Curation on Business Goals in 2013

Power of Analogy: Curation in Music, Music Critique as a Curation and Curation of Medical Research Findings – A Comparison

conceived: NEW Definition for Co-Curation in Medical Research

Reconstructed Science Communication for Open Access Online Scientific Curation

Search Results for ‘artificial intelligence’

 The Simple Pictures Artificial Intelligence Still Can’t Recognize

Data Scientist on a Quest to Turn Computers Into Doctors

Vinod Khosla: “20% doctor included”: speculations & musings of a technology optimist or “Technology will replace 80% of what doctors do”

Where has reason gone?

Read Full Post »


Larry H Bernstein, MD
Leaders in Pharmaceutical Intelligence


I call attention to an interesting article that just came out.  The improvement in cost savings in healthcare and in diagnostic accuracy is estimated to be substantial.  I have written about the unused potential that we have not yet seen.  In short, there is justification for substantial investment of resources in this, as has been proposed as a critical goal.  Does this mean a reduction in staffing?  I wouldn’t look at it that way.  The two huge benefits that would accrue are:


  1. workflow efficiency, reducing stress and facilitating decision-making.
  2. scientifically, primary knowledge-based decision support by well-developed algorithms that have been at the heart of computational genomics.




Can computers save health care? IU research shows lower costs, better outcomes

Cost per unit of outcome was $189, versus $497 for treatment as usual

 Last modified: Monday, February 11, 2013


BLOOMINGTON, Ind. — New research from Indiana University has found that machine learning — the same computer science discipline that helped create voice recognition systems, self-driving cars and credit card fraud detection systems — can drastically improve both the cost and quality of health care in the United States.



 Physicians using an artificial intelligence framework that predicts future outcomes would have better patient outcomes while significantly lowering health care costs.



Using an artificial intelligence framework combining Markov Decision Processes and Dynamic Decision Networks, IU School of Informatics and Computing researchers Casey Bennett and Kris Hauser show how simulation modeling that understands and predicts the outcomes of treatment could


  • reduce health care costs by over 50 percent while also
  • improving patient outcomes by nearly 50 percent.


The work by Hauser, an assistant professor of computer science, and Ph.D. student Bennett improves upon their earlier work that


  • showed how machine learning could determine the best treatment at a single point in time for an individual patient.


By using a new framework that employs sequential decision-making, the previous single-decision research


  • can be expanded into models that simulate numerous alternative treatment paths out into the future;
  • maintain beliefs about patient health status over time even when measurements are unavailable or uncertain; and
  • continually plan/re-plan as new information becomes available.

In other words, it can “think like a doctor.”  (Perhaps better because of the limitation in the amount of information a bright, competent physician can handle without error!)


“The Markov Decision Processes and Dynamic Decision Networks enable the system to deliberate about the future, considering all the different possible sequences of actions and effects in advance, even in cases where we are unsure of the effects,” Bennett said.  Moreover, the approach is non-disease-specific — it could work for any diagnosis or disorder, simply by plugging in the relevant information.  (This actually raises the question of what the information input is, and the cost of inputting.)
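As a rough illustration of what “deliberating about the future” means in a Markov Decision Process, consider the following minimal value-iteration sketch in Python. It is not Bennett and Hauser’s model; the states, treatments, transition probabilities, costs, and discount factor are all invented for the example.

```python
import numpy as np

# Toy clinical MDP: three health states and two treatment actions per visit.
states = ["remission", "moderate", "severe"]
actions = ["treatment_A", "treatment_B"]

# transitions[action][from_state][to_state] -- hypothetical probabilities.
transitions = {
    "treatment_A": np.array([[0.90, 0.08, 0.02],
                             [0.40, 0.45, 0.15],
                             [0.10, 0.40, 0.50]]),
    "treatment_B": np.array([[0.85, 0.10, 0.05],
                             [0.55, 0.35, 0.10],
                             [0.25, 0.45, 0.30]]),
}
# Immediate reward = clinical benefit of the resulting state minus cost of care.
state_value = np.array([10.0, 2.0, -8.0])      # benefit of landing in each state
action_cost = {"treatment_A": 1.0, "treatment_B": 4.0}
gamma = 0.9                                    # weight placed on future visits

# Value iteration: considers all future sequences of actions at once.
V = np.zeros(len(states))
for _ in range(200):
    Q = {a: transitions[a] @ (state_value + gamma * V) - action_cost[a]
         for a in actions}
    V = np.maximum(Q["treatment_A"], Q["treatment_B"])

policy = {s: max(actions, key=lambda a: Q[a][i]) for i, s in enumerate(states)}
print(policy)   # e.g. cheaper drug while in remission, stronger drug when severe
```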


The new work addresses three vexing issues related to health care in the U.S.:


  1. rising costs expected to reach 30 percent of the gross domestic product by 2050;
  2. a quality of care where patients receive correct diagnosis and treatment less than half the time on a first visit;
  3. and a lag time of 13 to 17 years between research and practice in clinical care.

  Framework for Simulating Clinical Decision-Making


“We’re using modern computational approaches to learn from clinical data and develop complex plans through the simulation of numerous, alternative sequential decision paths,” Bennett said. “The framework here easily out-performs the current treatment-as-usual, case-rate/fee-for-service models of health care.”  (see the above)


Bennett is also a data architect and research fellow with Centerstone Research Institute, the research arm of Centerstone, the nation’s largest not-for-profit provider of community-based behavioral health care. The two researchers had access to clinical data, demographics and other information on over 6,700 patients who had major clinical depression diagnoses, of which about 65 to 70 percent had co-occurring chronic physical disorders like diabetes, hypertension and cardiovascular disease.  Using 500 randomly selected patients from that group for simulations, the two


  • compared actual doctor performance and patient outcomes against
  • sequential decision-making models

using real patient data.

They found great disparity in the cost per unit of outcome change when the artificial intelligence model’s


  1. cost of $189 was compared to the treatment-as-usual cost of $497.
  2. the AI approach obtained a 30 to 35 percent increase in patient outcomes
Bennett said that “tweaking certain model parameters could enhance the outcome advantage to about 50 percent more improvement at about half the cost.”


While most medical decisions are based on case-by-case, experience-based approaches, there is a growing body of evidence that complex treatment decisions might be effectively improved by AI modeling.  Hauser said, “Modeling lets us see more possibilities out to a further point, because they just don’t have all of that information available to them.”  (Even then, the other issue is the processing of the information presented.)



Using the growing availability of electronic health records, health information exchanges, large public biomedical databases and machine learning algorithms, the researchers believe the approach could serve as the basis for personalized treatment through integration of diverse, large-scale data passed along to clinicians at the time of decision-making for each patient. Centerstone alone, Bennett noted, has access to health information on over 1 million patients each year. “Even with the development of new AI techniques that can approximate or even surpass human decision-making performance, we believe that the most effective long-term path could be combining artificial intelligence with human clinicians,” Bennett said. “Let humans do what they do well, and let machines do what they do well. In the end, we may maximize the potential of both.”



“Artificial Intelligence Framework for Simulating Clinical Decision-Making: A Markov Decision Process Approach” was published recently in Artificial Intelligence in Medicine. The research was funded by the Ayers Foundation, the Joe C. Davis Foundation and Indiana University.


For more information or to speak with Hauser or Bennett, please contact Steve Chaplin, IU Communications, at 812-856-1896 or



IBM Watson Finally Graduates Medical School


It’s been more than a year since IBM’s Watson computer appeared on Jeopardy and defeated several of the game show’s top champions. Since then the supercomputer has been furiously “studying” the healthcare literature in the hope that it can beat a far more hideous enemy: the 400-plus biomolecular puzzles we collectively refer to as cancer.




Anomaly Based Interpretation of Clinical and Laboratory Syndromic Classes

Larry H Bernstein, MD, Gil David, PhD, Ronald R Coifman, PhD.  Program in Applied Mathematics, Yale University, Triplex Medical Science.


 Statement of Inferential  Second Opinion

 Realtime Clinical Expert Support and Validation System

Gil David and Larry Bernstein have developed, in consultation with Prof. Ronald Coifman, in the Yale University Applied Mathematics Program, a software system that is the equivalent of an intelligent Electronic Health Records Dashboard that provides
  • empirical medical reference and suggests quantitative diagnostics options.


The current design of the Electronic Medical Record (EMR) is a linear presentation of portions of the record by
  • services, by
  • diagnostic method, and by
  • date, to cite examples.

This allows perusal through a graphical user interface (GUI) that partitions the information or necessary reports in a workstation entered by keying to icons.  This requires that the medical practitioner finds

  • the history,
  • medications,
  • laboratory reports,
  • cardiac imaging and EKGs, and
  • radiology
in different workspaces.  The introduction of a DASHBOARD has allowed a presentation of
  • drug reactions,
  • allergies,
  • primary and secondary diagnoses, and
  • critical information about any patient, for the caregiver needing access to the record.
 The advantage of this innovation is obvious.  The startup problem is what information is presented and how it is displayed, which is a source of variability and a key to its success.


We are proposing an innovation that supersedes the main design elements of a DASHBOARD and
  • utilizes the conjoined syndromic features of the disparate data elements.
So the important determinant of the success of this endeavor is that it facilitates both
  1. the workflow and
  2. the decision-making process
  • with a reduction of medical error.
 This has become extremely important and urgent in the 10 years since the publication “To Err is Human”, and the newly published finding that reduction of error is as elusive as reduction in cost.  Whether they are counterproductive when approached in the wrong way may be subject to debate.
We initially confine our approach to laboratory data because it is collected on all patients, ambulatory and acutely ill, because the data is objective and quality controlled, and because
  • laboratory combinatorial patterns emerge with the development and course of disease.  Continuing work is in progress in extending the capabilities with model data-sets, and sufficient data.
It is true that the extraction of data from disparate sources will, in the long run, further improve this process.  For instance, consider the finding of ST depression on EKG coincident with an increase of a cardiac biomarker (troponin) above a level determined by receiver operating characteristic (ROC) analysis, particularly in the absence of substantially reduced renal function.
The conversion of hematology-based data into useful clinical information requires the establishment of problem-solving constructs based on the measured data.  Traditionally this has been accomplished by an intuitive interpretation of the data by the individual clinician.  Through the application of geometric clustering analysis the data may be interpreted in a more sophisticated fashion in order to create a more reliable and valid knowledge-based opinion.
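As a hedged illustration of what geometric clustering of laboratory data can look like in practice, here is a sketch using a few synthetic hemogram-style features of the kind described below. This is not the system discussed here; the synthetic values and the choice of plain k-means are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic hemogram-style features per patient:
# [MCV (fL), WBC (10^3/uL), platelets (10^3/uL)] -- illustrative values only.
normal     = rng.normal([90, 7, 250],  [4, 1.5, 40], (100, 3))
microcytic = rng.normal([70, 7, 300],  [4, 1.5, 50], (100, 3))
infection  = rng.normal([88, 16, 220], [4, 3.0, 40], (100, 3))
X = np.vstack([normal, microcytic, infection])

def kmeans(X, k, iters=100):
    """Plain k-means on standardized features."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    centers = Z[rng.choice(len(Z), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((Z[:, None, :] - centers) ** 2).sum(axis=2), axis=1)
        centers = np.array([Z[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels

labels = kmeans(X, k=3)
# Each cluster groups patients with a similar combinatorial laboratory pattern;
# the clinician (or a downstream rule set) then attaches an interpretation.
for j in range(3):
    print(f"cluster {j}: n={np.sum(labels == j)}, "
          f"mean MCV={X[labels == j, 0].mean():.1f}")
```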
The most commonly ordered test used for managing patients worldwide is the hemogram that often incorporates the review of a peripheral smear.  While the hemogram has undergone progressive modification of the measured features over time the subsequent expansion of the panel of tests has provided a window into the cellular changes in the production, release or suppression of the formed elements from the blood-forming organ to the circulation.  In the hemogram one can view data reflecting the characteristics of a broad spectrum of medical conditions.
Progressive modification of the measured features of the hemogram has delineated characteristics expressed as measurements of
  • size,
  • density, and
  • concentration,
resulting in more than a dozen composite variables, including the
  1. mean corpuscular volume (MCV),
  2. mean corpuscular hemoglobin concentration (MCHC),
  3. mean corpuscular hemoglobin (MCH),
  4. total white cell count (WBC),
  5. total lymphocyte count,
  6. neutrophil count (mature granulocyte count and bands),
  7. monocytes,
  8. eosinophils,
  9. basophils,
  10. platelet count, and
  11. mean platelet volume (MPV),
  12. blasts,
  13. reticulocytes and
  14. platelet clumps,
  15. perhaps the percent immature neutrophils (not bands)
  16. as well as other features of classification.
The use of such variables combined with additional clinical information including serum chemistry analysis (such as the Comprehensive Metabolic Profile (CMP)) in conjunction with the clinical history and examination complete the traditional problem-solving construct. The intuitive approach applied by the individual clinician is limited, however,
  1. by experience,
  2. memory and
  3. cognition.
The application of rules-based, automated problem solving may provide a more reliable and valid approach to the classification and interpretation of the data used to determine a knowledge-based clinical opinion.
The classification of the available hematologic data in order to formulate a predictive model may be accomplished through mathematical models that offer a more reliable and valid approach than the intuitive knowledge-based opinion of the individual clinician.  The exponential growth of knowledge since the mapping of the human genome has been enabled by parallel advances in applied mathematics that have not been a part of traditional clinical problem solving.  In a univariate universe the individual has significant control in visualizing data because unlike data may be identified by methods that rely on distributional assumptions.  As the complexity of statistical models has increased, involving the use of several predictors for different clinical classifications, the dependencies have become less clear to the individual.  The powerful statistical tools now available are not dependent on distributional assumptions, and allow classification and prediction in a way that cannot be achieved by the individual clinician intuitively. Contemporary statistical modeling has a primary goal of finding an underlying structure in studied data sets.
In the diagnosis of anemia, the variables MCV, MCHC, and MCH classify the disease process into microcytic, normocytic, and macrocytic categories (a minimal sketch of this first classification axis follows this list).  Further consideration of
  • the proliferation of marrow precursors,
  • the domination of a cell line, and
  • features of suppression of hematopoiesis
provides a two-dimensional model.  Several other possible dimensions are created by consideration of
  • the maturity of the circulating cells.
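A minimal sketch of the first classification axis, binning a result by MCV alone into the three categories named above. The cutoffs of 80 and 100 fL are typical adult reference limits assumed here for illustration; they are not decision values taken from this article.

```python
# Minimal sketch: classifying an anemia work-up by MCV into microcytic,
# normocytic, and macrocytic categories. Cutoffs (80 and 100 fL) are typical
# adult reference limits, assumed here for illustration only.

def classify_by_mcv(mcv_fl: float) -> str:
    if mcv_fl < 80.0:
        return "microcytic"
    elif mcv_fl <= 100.0:
        return "normocytic"
    return "macrocytic"

for mcv in (72.0, 88.0, 112.0):
    print(mcv, classify_by_mcv(mcv))   # microcytic, normocytic, macrocytic
```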
The development of an evidence-based inference engine that can substantially interpret the data at hand and convert it in real time to a “knowledge-based opinion” may improve clinical problem solving by incorporating multiple complex clinical features as well as duration of onset into the model.
An example of a difficult area for clinical problem solving is found in the diagnosis of SIRS and associated sepsis, a costly diagnosis in hospitalized patients.   Failure to diagnose sepsis in a timely manner creates a potential financial and safety hazard.  The early diagnosis of SIRS/sepsis is made by the clinician's application of defined criteria (temperature, heart rate, respiratory rate, and WBC count).   The application of those clinical criteria, however, defines the condition after it has developed and has not provided a reliable method for the early diagnosis of SIRS.  The early diagnosis of SIRS may possibly be enhanced by the measurement of proteomic biomarkers, including transthyretin, C-reactive protein, and procalcitonin.  Immature granulocyte (IG) measurement has been proposed as a more readily available indicator of the presence of granulocyte precursors (left shift).
The use of such markers, obtained by automated systems in conjunction with innovative statistical modeling, may provide a mechanism to enhance workflow and decision making.
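For reference, the four criteria named above are commonly operationalized as in the sketch below, which simply counts how many are met; two or more constitute SIRS. The thresholds follow the widely used ACCP/SCCM consensus definition and are given for illustration, not as the decision rules proposed in this article.

```python
# Minimal sketch of a rule-based SIRS screen: two or more of the four
# conventional criteria (temperature, heart rate, respiratory rate, WBC count)
# define SIRS. Thresholds follow the ACCP/SCCM consensus definition and are
# shown for illustration only.

def sirs_criteria_met(temp_c, hr_bpm, rr_per_min, wbc_k_per_ul):
    criteria = [
        temp_c > 38.0 or temp_c < 36.0,            # fever or hypothermia
        hr_bpm > 90,                               # tachycardia
        rr_per_min > 20,                           # tachypnea
        wbc_k_per_ul > 12.0 or wbc_k_per_ul < 4.0  # leukocytosis or leukopenia
    ]
    return sum(criteria)

n = sirs_criteria_met(temp_c=38.6, hr_bpm=104, rr_per_min=22, wbc_k_per_ul=13.5)
print(n, "criteria met;", "SIRS" if n >= 2 else "not SIRS")
```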
An accurate classification based on the multiplicity of available data can be provided by an innovative system that utilizes  the conjoined syndromic features of disparate data elements.  Such a system has the potential to facilitate both the workflow and the decision-making process with an anticipated reduction of medical error.
This study is only an extension of our approach to repairing a longstanding problem in the construction of the many-sided electronic medical record (EMR).  Past history, combined with the development of Diagnosis Related Groups (DRGs) in the 1980s, has driven technology development in the direction of “billing capture”, which has been a focus of epidemiological studies in health services research using data mining.

In a classic study carried out at Bell Laboratories, Didner found that information technologies reflect the views of their creators, not the users, and that Front-to-Back Design (R. Didner) is needed.  He expresses the view:

“Pre-printed forms are much more amenable to computer-based storage and processing, and would improve the efficiency with which the insurance carriers process this information.  However, pre-printed forms can have a rather severe downside. By providing pre-printed forms that a physician completes to record the diagnostic questions asked, as well as tests and results, the sequence of tests and questions might be altered from that which a physician would ordinarily follow.  This sequence change could improve outcomes in rare cases, but it is more likely to worsen outcomes.”

Decision Making in the Clinical Setting.   Robert S. Didner

A well-documented problem in the medical profession is the level of effort dedicated to administration and paperwork necessitated by health insurers, HMOs and other parties (ref).  This effort is currently estimated at 50% of a typical physician's practice activity.  Obviously this contributes to the high cost of medical care.  A key element in the cost/effort composition is the retranscription of clinical data after the point at which it is collected.  Costs would be reduced, and accuracy improved, if the clinical data could be captured directly at the point it is generated, in a form suitable for transmission to insurers or machine-transformable into other formats.  Such data capture could also be used to improve the form and structure in which this information is viewed by physicians, and to form the basis of a more comprehensive database linking clinical protocols to outcomes, which could improve our knowledge of this relationship and hence clinical outcomes.
How we frame our expectations is so important that it determines the data we collect to examine the process.  In the absence of data to support an assumed benefit, there is no proof of validity, at whatever cost.   This has meaning for
  • hospital operations,
  • nonhospital laboratory operations,
  • companies in the diagnostic business, and
  • the planning of health systems.
In 1983, a vision for creating the EMR was introduced by Lawrence Weed and others.  This is expressed by McGowan and Winstead-Fry.
McGowan JJ, Winstead-Fry P. Problem Knowledge Couplers: reengineering evidence-based medicine through interdisciplinary development, decision support, and research. Bull Med Libr Assoc. 1999 Oct;87(4):462-470. PMCID: PMC226622.


Example of a Markov Decision Process (MDP) transition automaton (Photo credit: Wikipedia)

Control loop of a Markov Decision Process (Photo credit: Wikipedia)

IBM's Watson computer, Yorktown Heights, NY (Photo credit: Wikipedia)

Increasing decision stakes and systems uncertainties entail new problem-solving strategies. Image based on a diagram by Funtowicz, S. and Ravetz, J. (1993) “Science for the post-normal age” Futures 25:735-55 (Photo credit: Wikipedia)





Demonstration of a diagnostic clinical laboratory neural network agent applied to three laboratory data conditioning problems

Izaak Mayzlin, Principal Scientist, MayNet, Boston, MA

Larry Bernstein, MD, Technical Director, Methodist Hospital Laboratory, Brooklyn, NY

Our clinical chemistry section services a hospital emergency room seeing 15,000 patients with chest pain annually.  We have used a neural network agent, MayNet, for data conditioning.  Three applications are: troponin, CK-MB, and EKG for chest pain; B-type natriuretic peptide (BNP) and EKG for congestive heart failure (CHF); and red cell count (RBC), mean corpuscular volume (MCV), and hemoglobin A2 (Hgb A2) for beta thalassemia.  The three data sets were extensively validated prior to neural network analysis using receiver operating characteristic (ROC) analysis, latent class analysis, and a multinomial regression approach.  Optimum decision points for classification were determined using ROC (SYSTAT 11.0), LCM (Latent Gold), and ordinal regression (GOLDminer).   The ACS and CHF studies both had over 700 patients and used a validation sample different from the initial exploratory population.  MayNet incorporates prior clustering and sample-extraction features in its application.   MayNet results are in agreement with those of the other methods.

Introduction: A clinical laboratory servicing a hospital emergency room that sees 15,000 patients with chest pain annually produces over 2 million quality-controlled chemistry accessions each year.  We have used a neural network agent, MayNet, to tackle the quality control of the information product. The agent combines a statistical tool that first performs clustering of the input variables by Euclidean distance in multi-dimensional space with an artificial neural network that is trained on the clusters' averages to discriminate, non-linearly, the output variables.  In applying this new agent system to the diagnosis of acute myocardial infarction (AMI) we demonstrated that at an optimum clustering distance the number of classes is minimized while the neural network trains efficiently. The software agent also performs a random partitioning of the patients' data into training and testing sets, one-time neural network training, and an accuracy estimate on the testing data set. Three examples illustrate this: troponin, CK-MB, and EKG for acute coronary syndrome (ACS); B-type natriuretic peptide (BNP) and EKG for the estimation of ejection fraction in congestive heart failure (CHF); and red cell count (RBC), mean corpuscular volume (MCV), and hemoglobin A2 (Hgb A2) for identifying beta thalassemia.  We use three data sets that had been extensively validated prior to neural network analysis using receiver operating characteristic (ROC) analysis, latent class analysis, and a multinomial regression approach.

In previous studies1,2, CK-MB and LD1 sampled at 12 and 18 hours post-admission (near-optimum sampling times) were used to form a classification by the analysis of the information in the data set. The population consisted of 101 patients with and 41 patients without AMI, based on review of the medical records, clinical presentation, electrocardiography, serial enzyme and isoenzyme assays, and other tests. The clinical and EKG data, and other enzymes or sampling times, were not used to form a classification but could be handled by the program developed. All diagnoses were established by cardiologist review. An important methodological problem is the assignment of a correct diagnosis by a “gold standard” that is independent of the method being tested, so that the method can be suitably validated. This solution is not satisfactory in the case of myocardial infarction because the diagnosis depends on a constellation of observations with different sensitivities and specificities. We have argued that the accuracy of diagnosis is associated with the classes formed by combined features, and that the greatest uncertainty is associated with any single measure.

Methods:  Neural network analysis is by MayNet, developed by one of the authors.  Optimum decision points for classification using these data were determined using ROC (SYSTAT 11.0), LCM (Latent Gold)3, and ordinal regression (GOLDminer)4.   The validation sets for the ACS and CHF studies both had over 700 patients, and all studies used a validation sample different from the initial exploratory population.  MayNet incorporates prior clustering and sample-extraction features in its application.   We now report on a new classification method and its application to the diagnosis of acute myocardial infarction (AMI).  This method is based on the combination of clustering by Euclidean distance in multi-dimensional space and non-linear discrimination performed by an Artificial Neural Network (ANN) trained on the clusters' averages.   These studies indicate that at an optimum clustering distance the number of classes is minimized while training on the ANN remains efficient. This novel approach reduces the number of patterns used for ANN learning and also works as an effective tool for smoothing data, removing singularities, and increasing the accuracy of classification by the ANN. The studies conducted involve training and testing on separate clinical data sets, achieving a high accuracy of diagnosis (97%).

Unlike classification, which assumes the prior definition of borders between classes5,6, the clustering procedure establishes these borders as a result of processing statistical information, using a given criterion of difference (distance) between classes.  We perform clustering using the geometric (Euclidean) distance between two points in the n-dimensional space formed by the n variables, including both input and output variables. Since this distance assumes the compatibility of different variables, the values of all input variables are linearly transformed (scaled) to the range 0 to 1.
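A minimal sketch of that preprocessing step: each column is min-max scaled to [0, 1] and the Euclidean distance is then computed in the scaled space. The helper names and the (CK-MB, LD-1, LD-1/total LD) values are illustrative assumptions, not the authors' MayNet code or data.

```python
# Minimal sketch: min-max scaling of each variable to [0, 1], followed by the
# Euclidean distance in the scaled space. Illustrative only; not MayNet code.
import math

def min_max_scale(rows):
    """Linearly scale each column of `rows` (a list of equal-length lists) to [0, 1]."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)] for row in rows]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical (CK-MB, LD-1, LD-1/total LD) observations, in their native units:
data = [[12.0, 90.0, 0.18], [85.0, 240.0, 0.46], [15.0, 110.0, 0.22]]
scaled = min_max_scale(data)
print(euclidean(scaled[0], scaled[1]))   # distance between patients 1 and 2
print(euclidean(scaled[0], scaled[2]))   # patients 1 and 3 lie much closer together
```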

The ANN technique, for readers accustomed to classical statistics, can be viewed as an extension of multivariate regression analysis with such new features as non-linearity and the ability to process categorical data. Categorical (not continuous) variables represent two or more levels, groups, or classes of the corresponding feature; in our case this concept is used to signify patient condition, for example the presence or absence of AMI.

The ANN is an acyclic directed graph with input and output nodes corresponding respectively to the input and output variables. There are also “intermediate” nodes, comprising so-called “hidden” layers.  Each node n_j is assigned a value x_j, evaluated by the node's processing element as a non-linear function of the weighted sum of the values x_i of the nodes n_i connected to n_j by directed edges (n_i, n_j):

x_j = f( w_{i(1),j} x_{i(1)} + w_{i(2),j} x_{i(2)} + … + w_{i(l),j} x_{i(l)} ),

where x_k is the value at node n_k and w_{k,j} is the “weight” of the edge (n_k, n_j).  In our research we used the standard “sigmoid” function f(x) = 1/(1 + exp(-x)).  This function is suitable for categorical output and allows the use of an efficient back-propagation algorithm7 for calculating the optimal values of the weights, providing the best fit to the learning data set and, eventually, the most accurate classification.
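The node computation defined above amounts to the few lines sketched here; the weights and inputs are arbitrary illustrative numbers, not values trained in the study.

```python
# Minimal sketch of one node's computation: the sigmoid of the weighted sum of
# the values of its input nodes. Weights and inputs are illustrative only.
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def node_value(inputs, weights):
    """x_j = f(sum_k w_{k,j} * x_k), with f the sigmoid."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)))

x = [0.7, 0.2, 0.9]      # values of the connected input nodes (scaled to [0, 1])
w = [1.5, -0.8, 2.0]     # illustrative edge weights into node j
print(node_value(x, w))  # ~0.94; for the output node, >= 0.5 would be read as AMI
```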

Process description:  We implemented the proposed algorithm for the diagnosis of AMI. All calculations were performed on a PC with a Pentium 3 processor using the authors' software agent, MayNet. First, using an automatic random-extraction procedure, the initial data set (139 patients) was partitioned into two sets, training and testing.  This randomization also determined the sizes of these sets (96 and 43, respectively), since the program was instructed to assign approximately 70% of the data to the training set.

The main process consists of three successive steps: (1) clustering performed on training data set, (2) neural network’s training on clusters from previous step, and (3) classifier’s accuracy evaluation on testing data.

The classifier in this research is the ANN created in step 2, with output in the range [0, 1], which provides a binary result (1 = AMI, 0 = not AMI) using a decision point of 0.5.
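Taken together, the three steps can be sketched end to end as below. A nearest-centroid rule stands in for the trained ANN, and the synthetic data, the threshold_cluster helper, and the max_dist value are all assumptions of this sketch, so it illustrates the workflow rather than reimplementing MayNet.

```python
# Minimal sketch of the three-step process: (1) random ~70/30 partition,
# (2) single-pass threshold clustering of the training rows (inputs plus label)
# and computation of cluster averages, the "patterns", and (3) evaluation on the
# held-out rows at decision point 0.5, with a nearest-centroid rule standing in
# for the trained ANN. Data and parameters are illustrative only.
import math, random

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def threshold_cluster(rows, max_dist):
    """Assign each row to the first cluster whose centroid lies within max_dist,
    updating that centroid; otherwise start a new cluster."""
    centroids, members = [], []
    for row in rows:
        for k, c in enumerate(centroids):
            if euclidean(row, c) <= max_dist:
                members[k].append(row)
                centroids[k] = [sum(col) / len(col) for col in zip(*members[k])]
                break
        else:
            centroids.append(list(row))
            members.append([row])
    return centroids

# Hypothetical rows: three inputs already scaled to [0, 1], plus a label (1 = AMI).
random.seed(0)
rows = [[0.6 * random.random(), 0.6 * random.random(), 0.6 * random.random(), 0]
        for _ in range(40)]
rows += [[0.5 + 0.5 * random.random(), 0.5 + 0.5 * random.random(),
          0.5 + 0.5 * random.random(), 1] for _ in range(20)]
random.shuffle(rows)

split = int(0.7 * len(rows))                        # step 1: ~70/30 random partition
train, test = rows[:split], rows[split:]
patterns = threshold_cluster(train, max_dist=0.4)   # step 2: cluster averages

def predict(inputs):                                # stand-in for the trained ANN
    nearest = min(patterns, key=lambda c: euclidean(inputs, c[:-1]))
    return 1 if nearest[-1] >= 0.5 else 0           # decision point 0.5 on the averaged label

accuracy = sum(predict(r[:-1]) == r[-1] for r in test) / len(test)
print(len(patterns), "patterns; test accuracy", round(accuracy, 2))
```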

In this demonstration we used the data of the two previous studies1,2 with three patients, potential outliers, removed (n = 139).  The data contain three input variables, CK-MB, LD-1, and LD-1/total LD, and one output variable, the diagnosis, coded as 1 (AMI) or 0 (non-AMI).

Results: The application of this software intelligent agent is first demonstrated here using the initial model. Figures 1-2 illustrate the history of the training process. One curve shows the maximum error (among training patterns) and the lower curve shows the average error. The latter defines the duration of the training process: training terminates when the average error reaches 5%.

Convergence of the back-propagation algorithm applied directly to the training set of 96 patients was slow: 6,800 iterations were needed to achieve a sufficiently small (5%) average error.

Figure 1 shows the training process in step 2. It illustrates rapid convergence, because we deal with only 9 patterns representing the 9 classes formed in step 1.

Table 1 illustrates the effect of the selection of maximum distance on the number of classes formed and on the production of errors. The number of classes increased with decreasing distance, but the accuracy of classification did not decrease.

The rate of learning is inversely related to the number of classes. Using back-propagation to train on the entire data set without prior processing is slower than training on the patterns.

Figure 2 is a two-dimensional projection (CK-MB and LD1) of the three-dimensional space of input variables, with small dots corresponding to the patterns and rectangles to the cluster centroids (black = AMI, white = not AMI).

We carried out a larger study using troponin I (TnI, instead of LD1) and CK-MB for the diagnosis of myocardial infarction (MI).  The probabilities and odds ratios for TnI scaled into intervals near the entropy decision point are shown in Table 2 (N = 782).  The cross-table shows the frequencies for scaled TnI results versus observed MI, the percent of values within MI, and the predicted probabilities and odds ratios for MI within TnI intervals.  By regressing the scaled values, the optimum decision point is at or near 0.61 mg/L (the probability of MI at 0.46-0.6 mg/L is 3% with an odds ratio of 13, while the probability of MI at 0.61-0.75 mg/L is 26% with an odds ratio of 174).
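For orientation, the sketch below computes within-interval MI probabilities and odds ratios (relative to the lowest interval, with a 0.5 Haldane correction so that zero cells remain defined) directly from the raw counts in Table 2. The published probabilities and odds ratios come from the ordinal regression model, so these raw-count figures are not expected to reproduce Table 2 exactly.

```python
# Minimal sketch: empirical within-interval probability of MI and odds ratio
# relative to the reference (lowest) TnI interval, computed from raw counts.
# A 0.5 (Haldane) correction keeps the odds ratio defined when a cell is zero.
# These raw-count figures will differ from the model-based values in Table 2.

def interval_stats(counts):
    """counts: list of (interval_label, not_mi, mi), reference interval first."""
    ref_odds = (counts[0][2] + 0.5) / (counts[0][1] + 0.5)
    for label, not_mi, mi in counts:
        p = mi / (mi + not_mi)
        odds_ratio = ((mi + 0.5) / (not_mi + 0.5)) / ref_odds
        print(f"{label:>10}  P(MI)={p:5.2f}  OR={odds_ratio:10.1f}")

interval_stats([("< 0.45", 655, 2), ("0.46-0.6", 7, 0), ("0.61-0.75", 4, 0),
                ("0.76-0.9", 13, 59), ("> 0.9", 0, 42)])
```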

The RBC and MCV criteria used were applied to a series of 40 patients different from the series used in deriving the cutoffs.  A latent class cluster analysis is shown in Table 3.  MayNet was carried out on all three data sets (MI, CHF, and beta thalassemia) for comparison, and the results will be shown.

Discussion:  CK-MB has long been heavily used to identify heart attacks. It is used in conjunction with a troponin test and the EKG to identify MI, but it is not as sensitive as is needed. A joint committee of the American College of Cardiology and the European Society of Cardiology (ACC/ESC) has established the criteria for acute, recent or evolving AMI predicated on a typical increase in troponin in the clinical setting of myocardial ischemia (1), which includes the 99th percentile of a healthy normal population. The improper selection of a troponin decision value is, however, likely to increase overuse of hospital resources.  A study by Zarich8 showed that using an MI cutoff concentration for TnT derived from a non-acute coronary syndrome (ACS) reference improves risk stratification, but fails to detect a positive TnT in 11.7% of subjects with an ACS syndrome. The specificity of the test increased from 88.4% to 96.7%, with corresponding negative predictive values of 99.7% and 96.2%. Lin et al.9 recently reported that the use of the low reference cutoffs suggested by the new guidelines results in markedly increased TnI-positive cases overall. Associated with a positive TnI and a negative CK-MB, these cases are most likely false positives for MI. MayNet relieves this and the following problem effectively.

Monitoring BNP levels is a new and highly efficient way of diagnosing CHF as well as excluding non-cardiac causes of shortness of breath. Listening to breath sounds is only accurate when the disease has advanced to the stage at which the pumping function of the heart is impaired. When the pumping function is impaired, the circulation pressure rises above the osmotic pressure of the blood proteins that keep fluid in the circulation, and fluid passes into the lung's airspaces.  Our studies combine the BNP with the EKG measurement of QRS duration to predict whether a patient has a high or low ejection fraction, a measure used to stage the severity of CHF.

We also had to integrate the information from the hemogram (RBC, MCV) with the hemoglobin A2 quantitation (Bio-Rad Variant II) for the diagnosis of beta thalassemia.  We chose an approach to the data that requires no assumption about the distribution of test values or the variances.   Our detailed analyses validate an approach to thalassemia screening that has been widely used, the Mentzer index10, and in addition use critical decision values for the tests that appear in the Mentzer index. We also showed that Hgb S has an effect on both Hgb A2 and Hgb F.  This study is adequately powered to assess the usefulness of the Hgb A2 criteria but not adequately powered to assess thalassemias with elevated Hgb F.
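For readers unfamiliar with it, the Mentzer index cited above is the MCV (fL) divided by the RBC count (10^6/µL); by the usual rule of thumb, values below about 13 favor beta-thalassemia trait and values above 13 favor iron deficiency. The sketch below applies that conventional rule to hypothetical inputs; it does not use the critical decision values derived in this study.

```python
# Minimal sketch of the Mentzer index: MCV (fL) / RBC (10^6/uL). By the usual
# rule of thumb, < 13 favors beta-thalassemia trait and > 13 favors iron
# deficiency. Inputs are hypothetical; the study's own decision values are not used.

def mentzer_index(mcv_fl: float, rbc_millions_per_ul: float) -> float:
    return mcv_fl / rbc_millions_per_ul

for mcv, rbc in ((62.0, 5.8), (70.0, 3.9)):
    idx = mentzer_index(mcv, rbc)
    print(f"MCV {mcv}, RBC {rbc}: Mentzer index {idx:.1f} ->",
          "suggests thalassemia trait" if idx < 13 else "suggests iron deficiency")
```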


1.  Adan J, Bernstein LH, Babb J. Lactate dehydrogenase isoenzyme-1/total ratio: accurate for determining the existence of myocardial infarction. Clin Chem 1986;32:624-8.

2. Rudolph RA, Bernstein LH, Babb J. Information induction for predicting acute myocardial infarction.  Clin Chem 1988;34:2031- 2038.

3. Magidson J. “Maximum Likelihood Assessment of Clinical Trials Based on an Ordered Categorical Response.” Drug Information Journal, Maple Glen, PA: Drug Information Association 1996;309[1]: 143-170.

4. Magidson J, Vermunt JK. Latent class cluster analysis. In: Hagenaars JA, McCutcheon AL, eds. Applied Latent Class Analysis. Cambridge: Cambridge University Press, 2002: 89-106.

5. Mkhitarian VS, Mayzlin IE, Troshin LI, Borisenko LV. Classification of the base objects upon integral parameters of the attached network. Applied Mathematics and Computers.  Moscow, USSR: Statistika, 1976: 118-24.

6. Mayzlin IE, Mkhitarian VS. Determining the optimal bounds for objects of different classes. In: Dubrow AM, ed. Computational Mathematics and Applications. Moscow, USSR: Economics and Statistics Institute, 1976: 102-105.

7. Rumelhart DE, Hinton GE, Williams RJ. Learning internal representations by error propagation. In: Rumelhart DE, McClelland JL, eds. Parallel Distributed Processing. Cambridge, Mass: MIT Press, 1986; 1: 318-62.

8. Zarich SW, Bradley K, Mayall ID, Bernstein LH. Minor elevations in troponin T values enhance risk assessment in emergency department patients with suspected myocardial ischemia: analysis of novel troponin T cut-off values. Clin Chim Acta 2004 (in press).

9. Lin JC, Apple FS, Murakami MM, Luepker RV.  Rates of positive cardiac troponin I and creatine kinase MB mass among patients hospitalized for suspected acute coronary syndromes.  Clin Chem 2004;50:333-338.

10. Makris PE. Utilization of a new index to distinguish heterozygous thalassemic syndromes: comparison of its specificity to five other discriminants. Blood Cells. 1989;15(3):497-506.

Acknowledgements:   Jerard Kneifati-Hayek and Madeleine Schlefer, Midwood High School, Brooklyn, and Salman Haq, Cardiology Fellow, Methodist Hospital.

Table 1. Effect of selection of maximum distance on the number of classes formed and on the accuracy of recognition by ANN

Columns: clustering distance factor F (D = F * R); number of classes; number of nodes in the hidden layers; number of misrecognized patterns in the testing set of 43; percent misrecognized.






Figure 1.

Figure 2.

Table 2.  Frequency cross-table, probabilities and odds-ratios for scaled TnI versus expected diagnosis

Range        Not MI    MI      N    Pct in MI    Prob by TnI    Odds Ratio
< 0.45          655     2    657        2            0               1
0.46-0.6          7     0      7        0            0.03           13
0.61-0.75         4     0      4        0            0.26          175
0.76-0.9         13    59     72       57.3          0.82         2307
> 0.9             0    42     42       40.8          0.98        30482
Total           679   103    782      100


