Feeds:
Posts
Comments

Posts Tagged ‘medical decision-making’

Science Policy Forum: Should we trust healthcare explanations from AI predictive systems?

Some in industry voice their concerns

Curator: Stephen J. Williams, PhD

Post on AI healthcare and explainable AI

   In a Policy Forum article in ScienceBeware explanations from AI in health care”, Boris Babic, Sara Gerke, Theodoros Evgeniou, and Glenn Cohen discuss the caveats on relying on explainable versus interpretable artificial intelligence (AI) and Machine Learning (ML) algorithms to make complex health decisions.  The FDA has already approved some AI/ML algorithms for analysis of medical images for diagnostic purposes.  These have been discussed in prior posts on this site, as well as issues arising from multi-center trials.  The authors of this perspective article argue that choice of type of algorithm (explainable versus interpretable) algorithms may have far reaching consequences in health care.

Summary

Artificial intelligence and machine learning (AI/ML) algorithms are increasingly developed in health care for diagnosis and treatment of a variety of medical conditions (1). However, despite the technical prowess of such systems, their adoption has been challenging, and whether and how much they will actually improve health care remains to be seen. A central reason for this is that the effectiveness of AI/ML-based medical devices depends largely on the behavioral characteristics of its users, who, for example, are often vulnerable to well-documented biases or algorithmic aversion (2). Many stakeholders increasingly identify the so-called black-box nature of predictive algorithms as the core source of users’ skepticism, lack of trust, and slow uptake (3, 4). As a result, lawmakers have been moving in the direction of requiring the availability of explanations for black-box algorithmic decisions (5). Indeed, a near-consensus is emerging in favor of explainable AI/ML among academics, governments, and civil society groups. Many are drawn to this approach to harness the accuracy benefits of noninterpretable AI/ML such as deep learning or neural nets while also supporting transparency, trust, and adoption. We argue that this consensus, at least as applied to health care, both overstates the benefits and undercounts the drawbacks of requiring black-box algorithms to be explainable.

Source: https://science.sciencemag.org/content/373/6552/284?_ga=2.166262518.995809660.1627762475-1953442883.1627762475

Types of AI/ML Algorithms: Explainable and Interpretable algorithms

  1.  Interpretable AI: A typical AI/ML task requires constructing algorithms from vector inputs and generating an output related to an outcome (like diagnosing a cardiac event from an image).  Generally the algorithm has to be trained on past data with known parameters.  When an algorithm is called interpretable, this means that the algorithm uses a transparent or “white box” function which is easily understandable. Such example might be a linear function to determine relationships where parameters are simple and not complex.  Although they may not be as accurate as the more complex explainable AI/ML algorithms, they are open, transparent, and easily understood by the operators.
  2. Explainable AI/ML:  This type of algorithm depends upon multiple complex parameters and takes a first round of predictions from a “black box” model then uses a second algorithm from an interpretable function to better approximate outputs of the first model.  The first algorithm is trained not with original data but based on predictions resembling multiple iterations of computing.  Therefore this method is more accurate or deemed more reliable in prediction however is very complex and is not easily understandable.  Many medical devices that use an AI/ML algorithm use this type.  An example is deep learning and neural networks.

The purpose of both these methodologies is to deal with problems of opacity, or that AI predictions based from a black box undermines trust in the AI.

For a deeper understanding of these two types of algorithms see here:

https://www.kdnuggets.com/2018/12/machine-learning-explainability-interpretability-ai.html

or https://www.bmc.com/blogs/machine-learning-interpretability-vs-explainability/

(a longer read but great explanation)

From the above blog post of Jonathan Johnson

  • How interpretability is different from explainability
  • Why a model might need to be interpretable and/or explainable
  • Who is working to solve the black box problem—and how

What is interpretability?

Does Chipotle make your stomach hurt? Does loud noise accelerate hearing loss? Are women less aggressive than men? If a machine learning model can create a definition around these relationships, it is interpretable.

All models must start with a hypothesis. Human curiosity propels a being to intuit that one thing relates to another. “Hmm…multiple black people shot by policemen…seemingly out of proportion to other races…something might be systemic?” Explore.

People create internal models to interpret their surroundings. In the field of machine learning, these models can be tested and verified as either accurate or inaccurate representations of the world.

Interpretability means that the cause and effect can be determined.

What is explainability?

ML models are often called black-box models because they allow a pre-set number of empty parameters, or nodes, to be assigned values by the machine learning algorithm. Specifically, the back-propagation step is responsible for updating the weights based on its error function.

To predict when a person might die—the fun gamble one might play when calculating a life insurance premium, and the strange bet a person makes against their own life when purchasing a life insurance package—a model will take in its inputs, and output a percent chance the given person has at living to age 80.

Below is an image of a neural network. The inputs are the yellow; the outputs are the orange. Like a rubric to an overall grade, explainability shows how significant each of the parameters, all the blue nodes, contribute to the final decision.

In this neural network, the hidden layers (the two columns of blue dots) would be the black box.

For example, we have these data inputs:

  • Age
  • BMI score
  • Number of years spent smoking
  • Career category

If this model had high explainability, we’d be able to say, for instance:

  • The career category is about 40% important
  • The number of years spent smoking weighs in at 35% important
  • The age is 15% important
  • The BMI score is 10% important

Explainability: important, not always necessary

Explainability becomes significant in the field of machine learning because, often, it is not apparent. Explainability is often unnecessary. A machine learning engineer can build a model without ever having considered the model’s explainability. It is an extra step in the building process—like wearing a seat belt while driving a car. It is unnecessary for the car to perform, but offers insurance when things crash.

The benefit a deep neural net offers to engineers is it creates a black box of parameters, like fake additional data points, that allow a model to base its decisions against. These fake data points go unknown to the engineer. The black box, or hidden layers, allow a model to make associations among the given data points to predict better results. For example, if we are deciding how long someone might have to live, and we use career data as an input, it is possible the model sorts the careers into high- and low-risk career options all on its own.

Perhaps we inspect a node and see it relates oil rig workers, underwater welders, and boat cooks to each other. It is possible the neural net makes connections between the lifespan of these individuals and puts a placeholder in the deep net to associate these. If we were to examine the individual nodes in the black box, we could note this clustering interprets water careers to be a high-risk job.

In the previous chart, each one of the lines connecting from the yellow dot to the blue dot can represent a signal, weighing the importance of that node in determining the overall score of the output.

  • If that signal is high, that node is significant to the model’s overall performance.
  • If that signal is low, the node is insignificant.

With this understanding, we can define explainability as:

Knowledge of what one node represents and how important it is to the model’s performance.

So how does choice of these two different algorithms make a difference with respect to health care and medical decision making?

The authors argue: 

“Regulators like the FDA should focus on those aspects of the AI/ML system that directly bear on its safety and effectiveness – in particular, how does it perform in the hands of its intended users?”

A suggestion for

  • Enhanced more involved clinical trials
  • Provide individuals added flexibility when interacting with a model, for example inputting their own test data
  • More interaction between user and model generators
  • Determining in which situations call for interpretable AI versus explainable (for instance predicting which patients will require dialysis after kidney damage)

Other articles on AI/ML in medicine and healthcare on this Open Access Journal include

Applying AI to Improve Interpretation of Medical Imaging

Real Time Coverage @BIOConvention #BIO2019: Machine Learning and Artificial Intelligence #AI: Realizing Precision Medicine One Patient at a Time

LIVE Day Three – World Medical Innovation Forum ARTIFICIAL INTELLIGENCE, Boston, MA USA, Monday, April 10, 2019

Cardiac MRI Imaging Breakthrough: The First AI-assisted Cardiac MRI Scan Solution, HeartVista Receives FDA 510(k) Clearance for One Click™ Cardiac MRI Package

 

Read Full Post »

 

Larry H Bernstein, MD
Leaders in Pharmaceutical Intelligence

 

I call attention to an interesting article that just came out.   The estimate of improved costsavings in healthcare and diagnostic accuracy is extimated to be substantial.   I have written about the unused potential that we have not yet seen.  In short, there is justification in substantial investment in resources to this, as has been proposed as a critical goal.  Does this mean a reduction in staffing?  I wouldn’t look at it that way.  The two huge benefits that would accrue are:

 

  1. workflow efficiency, reducing stress and facilitating decision-making.
  2. scientifically, primary knowledge-based  decision-support by well developed algotithms that have been at the heart of computational-genomics.

 

 

 

Can computers save health care? IU research shows lower costs, better outcomes

Cost per unit of outcome was $189, versus $497 for treatment as usual

 Last modified: Monday, February 11, 2013

 

BLOOMINGTON, Ind. — New research from Indiana University has found that machine learning — the same computer science discipline that helped create voice recognition systems, self-driving cars and credit card fraud detection systems — can drastically improve both the cost and quality of health care in the United States.

 

 

 Physicians using an artificial intelligence framework that predicts future outcomes would have better patient outcomes while significantly lowering health care costs.

 

 

Using an artificial intelligence framework combining Markov Decision Processes and Dynamic Decision Networks, IU School of Informatics and Computing researchers Casey Bennett and Kris Hauser show how simulation modeling that understands and predicts the outcomes of treatment could

 

  • reduce health care costs by over 50 percent while also
  • improving patient outcomes by nearly 50 percent.

 

The work by Hauser, an assistant professor of computer science, and Ph.D. student Bennett improves upon their earlier work that

 

  • showed how machine learning could determine the best treatment at a single point in time for an individual patient.

 

By using a new framework that employs sequential decision-making, the previous single-decision research

 

  • can be expanded into models that simulate numerous alternative treatment paths out into the future;
  • maintain beliefs about patient health status over time even when measurements are unavailable or uncertain; and
  • continually plan/re-plan as new information becomes available.

In other words, it can “think like a doctor.”  (Perhaps better because of the limitation in the amount of information a bright, competent physician can handle without error!)

 

“The Markov Decision Processes and Dynamic Decision Networks enable the system to deliberate about the future, considering all the different possible sequences of actions and effects in advance, even in cases where we are unsure of the effects,” Bennett said.  Moreover, the approach is non-disease-specific — it could work for any diagnosis or disorder, simply by plugging in the relevant information.  (This actually raises the question of what the information input is, and the cost of inputting.)

 

The new work addresses three vexing issues related to health care in the U.S.:

 

  1. rising costs expected to reach 30 percent of the gross domestic product by 2050;
  2. a quality of care where patients receive correct diagnosis and treatment less than half the time on a first visit;
  3. and a lag time of 13 to 17 years between research and practice in clinical care.

  Framework for Simulating Clinical Decision-Making

 

“We’re using modern computational approaches to learn from clinical data and develop complex plans through the simulation of numerous, alternative sequential decision paths,” Bennett said. “The framework here easily out-performs the current treatment-as-usual, case-rate/fee-for-service models of health care.”  (see the above)

 

Bennett is also a data architect and research fellow with Centerstone Research Institute, the research arm of Centerstone, the nation’s largest not-for-profit provider of community-based behavioral health care. The two researchers had access to clinical data, demographics and other information on over 6,700 patients who had major clinical depression diagnoses, of which about 65 to 70 percent had co-occurring chronic physical disorders like diabetes, hypertension and cardiovascular disease.  Using 500 randomly selected patients from that group for simulations, the two

 

  • compared actual doctor performance and patient outcomes against
  • sequential decision-making models

using real patient data.

They found great disparity in the cost per unit of outcome change when the artificial intelligence model’s

 

  1. cost of $189 was compared to the treatment-as-usual cost of $497.
  2. the AI approach obtained a 30 to 35 percent increase in patient outcomes
Bennett said that “tweaking certain model parameters could enhance the outcome advantage to about 50 percent more improvement at about half the cost.”

 

While most medical decisions are based on case-by-case, experience-based approaches, there is a growing body of evidence that complex treatment decisions might be effectively improved by AI modeling.  Hauser said “Modeling lets us see more possibilities out to a further point –  because they just don’t have all of that information available to them.”  (Even then, the other issue is the processing of the information presented.)

 

 

Using the growing availability of electronic health records, health information exchanges, large public biomedical databases and machine learning algorithms, the researchers believe the approach could serve as the basis for personalized treatment through integration of diverse, large-scale data passed along to clinicians at the time of decision-making for each patient. Centerstone alone, Bennett noted, has access to health information on over 1 million patients each year. “Even with the development of new AI techniques that can approximate or even surpass human decision-making performance, we believe that the most effective long-term path could be combining artificial intelligence with human clinicians,” Bennett said. “Let humans do what they do well, and let machines do what they do well. In the end, we may maximize the potential of both.”

 

 

Artificial Intelligence Framework for Simulating Clinical Decision-Making: A Markov Decision Process Approach” was published recently in Artificial Intelligence in Medicine. The research was funded by the Ayers Foundation, the Joe C. Davis Foundation and Indiana University.

 

For more information or to speak with Hauser or Bennett, please contact Steve Chaplin, IU Communications, at 812-856-1896 or stjchap@iu.edu.

 

 

IBM Watson Finally Graduates Medical School

 

It’s been more than a year since IBM’s Watson computer appeared on Jeopardy and defeated several of the game show’s top champions. Since then the supercomputer has been furiously “studying” the healthcare literature in the hope that it can beat a far more hideous enemy: the 400-plus biomolecular puzzles we collectively refer to as cancer.

 

 

 

Anomaly Based Interpretation of Clinical and Laboratory Syndromic Classes

Larry H Bernstein, MD, Gil David, PhD, Ronald R Coifman, PhD.  Program in Applied Mathematics, Yale University, Triplex Medical Science.

 

 Statement of Inferential  Second Opinion

 Realtime Clinical Expert Support and Validation System

Gil David and Larry Bernstein have developed, in consultation with Prof. Ronald Coifman, in the Yale University Applied Mathematics Program, a software system that is the equivalent of an intelligent Electronic Health Records Dashboard that provides
  • empirical medical reference and suggests quantitative diagnostics options.

Background

The current design of the Electronic Medical Record (EMR) is a linear presentation of portions of the record by
  • services, by
  • diagnostic method, and by
  • date, to cite examples.

This allows perusal through a graphical user interface (GUI) that partitions the information or necessary reports in a workstation entered by keying to icons.  This requires that the medical practitioner finds

  • the history,
  • medications,
  • laboratory reports,
  • cardiac imaging and EKGs, and
  • radiology
in different workspaces.  The introduction of a DASHBOARD has allowed a presentation of
  • drug reactions,
  • allergies,
  • primary and secondary diagnoses, and
  • critical information about any patient the care giver needing access to the record.
 The advantage of this innovation is obvious.  The startup problem is what information is presented and how it is displayed, which is a source of variability and a key to its success.

Proposal

We are proposing an innovation that supercedes the main design elements of a DASHBOARD and
  • utilizes the conjoined syndromic features of the disparate data elements.
So the important determinant of the success of this endeavor is that it facilitates both
  1. the workflow and
  2. the decision-making process
  • with a reduction of medical error.
 This has become extremely important and urgent in the 10 years since the publication “To Err is Human”, and the newly published finding that reduction of error is as elusive as reduction in cost.  Whether they are counterproductive when approached in the wrong way may be subject to debate.
We initially confine our approach to laboratory data because it is collected on all patients, ambulatory and acutely ill, because the data is objective and quality controlled, and because
  • laboratory combinatorial patterns emerge with the development and course of disease.  Continuing work is in progress in extending the capabilities with model data-sets, and sufficient data.
It is true that the extraction of data from disparate sources will, in the long run, further improve this process.  For instance, the finding of both ST depression on EKG coincident with an increase of a cardiac biomarker (troponin) above a level determined by a receiver operator curve (ROC) analysis, particularly in the absence of substantially reduced renal function.
The conversion of hematology based data into useful clinical information requires the establishment of problem-solving constructs based on the measured data.  Traditionally this has been accomplished by an intuitive interpretation of the data by the individual clinician.  Through the application of geometric clustering analysis the data may interpreted in a more sophisticated fashion in order to create a more reliable and valid knowledge-based opinion.
The most commonly ordered test used for managing patients worldwide is the hemogram that often incorporates the review of a peripheral smear.  While the hemogram has undergone progressive modification of the measured features over time the subsequent expansion of the panel of tests has provided a window into the cellular changes in the production, release or suppression of the formed elements from the blood-forming organ to the circulation.  In the hemogram one can view data reflecting the characteristics of a broad spectrum of medical conditions.
Progressive modification of the measured features of the hemogram has delineated characteristics expressed as measurements of
  • size,
  • density, and
  • concentration,
resulting in more than a dozen composite variables, including the
  1. mean corpuscular volume (MCV),
  2. mean corpuscular hemoglobin concentration (MCHC),
  3. mean corpuscular hemoglobin (MCH),
  4. total white cell count (WBC),
  5. total lymphocyte count,
  6. neutrophil count (mature granulocyte count and bands),
  7. monocytes,
  8. eosinophils,
  9. basophils,
  10. platelet count, and
  11. mean platelet volume (MPV),
  12. blasts,
  13. reticulocytes and
  14. platelet clumps,
  15. perhaps the percent immature neutrophils (not bands)
  16. as well as other features of classification.
The use of such variables combined with additional clinical information including serum chemistry analysis (such as the Comprehensive Metabolic Profile (CMP)) in conjunction with the clinical history and examination complete the traditional problem-solving construct. The intuitive approach applied by the individual clinician is limited, however,
  1. by experience,
  2. memory and
  3. cognition.
The application of rules-based, automated problem solving may provide a more reliable and valid approach to the classification and interpretation of the data used to determine a knowledge-based clinical opinion.
The classification of the available hematologic data in order to formulate a predictive model may be accomplished through mathematical models that offer a more reliable and valid approach than the intuitive knowledge-based opinion of the individual clinician.  The exponential growth of knowledge since the mapping of the human genome has been enabled by parallel advances in applied mathematics that have not been a part of traditional clinical problem solving.  In a univariate universe the individual has significant control in visualizing data because unlike data may be identified by methods that rely on distributional assumptions.  As the complexity of statistical models has increased, involving the use of several predictors for different clinical classifications, the dependencies have become less clear to the individual.  The powerful statistical tools now available are not dependent on distributional assumptions, and allow classification and prediction in a way that cannot be achieved by the individual clinician intuitively. Contemporary statistical modeling has a primary goal of finding an underlying structure in studied data sets.
In the diagnosis of anemia the variables MCV,MCHC and MCH classify the disease process  into microcytic, normocytic and macrocytic categories.  Further consideration of
proliferation of marrow precursors,
  • the domination of a cell line, and
  • features of suppression of hematopoiesis

provide a two dimensional model.  Several other possible dimensions are created by consideration of

  • the maturity of the circulating cells.
The development of an evidence-based inference engine that can substantially interpret the data at hand and convert it in real time to a “knowledge-based opinion” may improve clinical problem solving by incorporating multiple complex clinical features as well as duration of onset into the model.
An example of a difficult area for clinical problem solving is found in the diagnosis of SIRS and associated sepsis.  SIRS (and associated sepsis) is a costly diagnosis in hospitalized patients.   Failure to diagnose sepsis in a timely manner creates a potential financial and safety hazard.  The early diagnosis of SIRS/sepsis is made by the application of defined criteria (temperature, heart rate, respiratory rate and WBC count) by the clinician.   The application of those clinical criteria, however, defines the condition after it has developed and has not provided a reliable method for the early diagnosis of SIRS.  The early diagnosis of SIRS may possibly be enhanced by the measurement of proteomic biomarkers, including transthyretin, C-reactive protein and procalcitonin.  Immature granulocyte (IG) measurement has been proposed as a more readily available indicator of the presence of
  • granulocyte precursors (left shift).
The use of such markers, obtained by automated systems in conjunction with innovative statistical modeling, may provide a mechanism to enhance workflow and decision making.
An accurate classification based on the multiplicity of available data can be provided by an innovative system that utilizes  the conjoined syndromic features of disparate data elements.  Such a system has the potential to facilitate both the workflow and the decision-making process with an anticipated reduction of medical error.
This study is only an extension of our approach to repairing a longstanding problem in the construction of the many-sided electronic medical record (EMR).  On the one hand, past history combined with the development of Diagnosis Related Groups (DRGs) in the 1980s have driven the technology development in the direction of “billing capture”, which has been a focus of epidemiological studies in health services research using data mining.

In a classic study carried out at Bell Laboratories, Didner found that information technologies reflect the view of the creators, not the users, and Front-to-Back Design (R Didner) is needed.  He expresses the view:

“Pre-printed forms are much more amenable to computer-based storage and processing, and would improve the efficiency with which the insurance carriers process this information.  However, pre-printed forms can have a rather severe downside. By providing pre-printed forms that a physician completes
to record the diagnostic questions asked,
  • as well as tests, and results,
  • the sequence of tests and questions,
might be altered from that which a physician would ordinarily follow.  This sequence change could improve outcomes in rare cases, but it is more likely to worsen outcomes. “

Decision Making in the Clinical Setting.   Robert S. Didner

 A well-documented problem in the medical profession is the level of effort dedicated to administration and paperwork necessitated by health insurers, HMOs and other parties (ref).  This effort is currently estimated at 50% of a typical physician’s practice activity.  Obviously this contributes to the high cost of medical care.  A key element in the cost/effort composition is the retranscription of clinical data after the point at which it is collected.  Costs would be reduced, and accuracy improved, if the clinical data could be captured directly at the point it is generated, in a form suitable for transmission to insurers, or machine transformable into other formats.  Such data capture, could also be used to improve the form and structure of how this information is viewed by physicians, and form a basis of a more comprehensive database linking clinical protocols to outcomes, that could improve the knowledge of this relationship, hence clinical outcomes.
 How we frame our expectations is so important that
  • it determines the data we collect to examine the process.
In the absence of data to support an assumed benefit, there is no proof of validity at whatever cost.   This has meaning for
  • hospital operations, for
  • nonhospital laboratory operations, for
  • companies in the diagnostic business, and
  • for planning of health systems.
In 1983, a vision for creating the EMR was introduced by Lawrence Weed and others.  This is expressed by McGowan and Winstead-Fry.
J J McGowan and P Winstead-Fry. Problem Knowledge Couplers: reengineering evidence-based medicine through interdisciplinary development, decision support, and research.
Bull Med Libr Assoc. 1999 October; 87(4): 462–470.   PMCID: PMC226622    Copyright notice

 

Example of Markov Decision Process (MDP) trans...

Example of Markov Decision Process (MDP) transition automaton (Photo credit: Wikipedia)

Control loop of a Markov Decision Process

Control loop of a Markov Decision Process (Photo credit: Wikipedia)

 

English: IBM's Watson computer, Yorktown Heigh...

English: IBM’s Watson computer, Yorktown Heights, NY (Photo credit: Wikipedia)

English: Increasing decision stakes and system...

English: Increasing decision stakes and systems uncertainties entail new problem solving strategies. Image based on a diagram by Funtowicz, S. and Ravetz, J. (1993) “Science for the post-normal age” Futures 25:735–55 (http://dx.doi.org/10.1016/0016-3287(93)90022-L). (Photo credit: Wikipedia)

 

 

Read Full Post »

%d bloggers like this: