
Archive for the ‘AI in Medical-Imaging’ Category


Cardiac MRI Imaging Breakthrough: The First AI-assisted Cardiac MRI Scan Solution, HeartVista Receives FDA 510(k) Clearance for One Click™ Cardiac MRI Package

Reporter: Aviva Lev-Ari, PhD, RN

 

HeartVista Receives FDA 510(k) Clearance for One Click™ Cardiac MRI Package, the First AI-assisted Cardiac MRI Scan Solution

The future of imaging is here—and FDA cleared.

LOS ALTOS, Calif.–(BUSINESS WIRE)–HeartVista, a pioneer in AI-assisted MRI solutions, today announced that it received 510(k) clearance from the U.S. Food and Drug Administration to deliver its AI-assisted One Click™ MRI acquisition software for cardiac exams. Despite the many advantages of cardiac MRI, or cardiac magnetic resonance (CMR), its use has been largely limited due to a lack of trained technologists, high costs, longer scan time, and complexity of use. With HeartVista’s solution, cardiac MRI is now simple, time-efficient, affordable, and highly consistent.

“HeartVista’s Cardiac Package is a vital tool to enhance the consistency and productivity of cardiac magnetic resonance studies, across all levels of CMR expertise,” said Dr. Raymond Kwong, MPH, Director of Cardiac Magnetic Resonance Imaging at Brigham and Women’s Hospital and Associate Professor of Medicine at Harvard Medical School.

A recent multi-center, outcome-based study (MR-INFORM), published in the New England Journal of Medicine, demonstrated that non-invasive myocardial perfusion cardiovascular MRI was as good as invasive FFR, the previous gold standard method, to guide treatment for patients with stable chest pain, while leading to 20% fewer catheterizations.

“This recent NEJM study further reinforces the clinical literature that cardiac MRI is the gold standard for cardiac diagnosis, even when compared against invasive alternatives,” said Itamar Kandel, CEO of HeartVista. “Our One Click™ solution makes these kinds of cardiac MRI exams practical for widespread adoption. Patients across the country now have access to the only AI-guided cardiac MRI exam, which will deliver continuous imaging via an automated process, minimize errors, and simplify scan operation. Our AI solution generates definitive, accurate and actionable real-time data for cardiologists. We believe it will elevate the standard of care for cardiac imaging, enhance patient experience and access, and improve patient outcomes.”

HeartVista’s FDA-cleared Cardiac Package uses AI-assisted software to prescribe the standard cardiac views with just one click, and in as few as 10 seconds, while the patient breathes freely. A unique artifact detection neural network is incorporated in HeartVista’s protocol to identify when the image quality is below the acceptable threshold, prompting the operator to reacquire the questioned images if desired. Inversion time is optimized with further AI assistance prior to the myocardial delayed-enhancement acquisition. A 4D flow measurement application uses a non-Cartesian, volumetric parallel imaging acquisition to generate high quality images in a fraction of the time. The Cardiac Package also provides preliminary measures of left ventricular function, including ejection fraction, left ventricular volumes, and mass.

HeartVista is presenting its new One Click™ Cardiac Package features at the Radiological Society of North America (RSNA) annual meeting in Chicago, on Dec. 4, 2019, at 2 p.m., in the AI Showcase Theater. HeartVista will also be at Booth #11137 for the duration of the conference, from Dec. 1 through Dec. 5.

About HeartVista

HeartVista believes in leveraging artificial intelligence with the goal of improving access to MRI and improving patient care. The company’s One Click™ software platform enables real-time MRI for a variety of clinical and research applications. Its AI-driven, one-click cardiac localization method received first place honors at the International Society for Magnetic Resonance in Medicine’s Machine Learning Workshop in 2018. The company’s innovative technology originated at the Stanford Magnetic Resonance Systems Research Laboratory. HeartVista is funded by Khosla Ventures and the National Institutes of Health’s Small Business Innovation Research program.

For more information, visit www.heartvista.ai

SOURCE

Reply-To: Kimberly Ha <kimberly.ha@kkhadvisors.com>

Date: Tuesday, October 29, 2019 at 11:01 AM

To: Aviva Lev-Ari <AvivaLev-Ari@alum.berkeley.edu>

Subject: HeartVista Receives FDA Clearance for First AI-assisted Cardiac MRI Solution



Showcase: How Deep Learning could help radiologists spend their time more efficiently

Reporter and Curator: Dror Nir, PhD

 

The debate over the role AI could or should play in modern radiology is lively, spanning a wide spectrum of positive expectations as well as fears.

The article “A Deep Learning Model to Triage Screening Mammograms: A Simulation Study,” published this month, shows what is, at present, the best and a very feasible use of AI in radiology. Radiologists and patients would benefit greatly if such applications were incorporated (with all safety precautions taken) into routine practice as soon as possible.

In a simulation study, a deep learning model to triage mammograms as cancer free improves workflow efficiency and significantly improves specificity while maintaining a noninferior sensitivity.

Background

Recent deep learning (DL) approaches have shown promise in improving sensitivity but have not addressed limitations in radiologist specificity or efficiency.

Purpose

To develop a DL model to triage a portion of mammograms as cancer free, improving performance and workflow efficiency.

Materials and Methods

In this retrospective study, 223 109 consecutive screening mammograms performed in 66 661 women from January 2009 to December 2016 were collected with cancer outcomes obtained through linkage to a regional tumor registry. This cohort was split by patient into 212 272, 25 999, and 26 540 mammograms from 56 831, 7021, and 7176 patients for training, validation, and testing, respectively. A DL model was developed to triage mammograms as cancer free and evaluated on the test set. A DL-triage workflow was simulated in which radiologists skipped mammograms triaged as cancer free (interpreting them as negative for cancer) and read mammograms not triaged as cancer free by using the original interpreting radiologists’ assessments. Sensitivities, specificities, and percentage of mammograms read were calculated, with and without the DL-triage–simulated workflow. Statistics were computed across 5000 bootstrap samples to assess confidence intervals (CIs). Specificities were compared by using a two-tailed t test (P < .05) and sensitivities were compared by using a one-sided t test with a noninferiority margin of 5% (P < .05).
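The simulated workflow and the bootstrap confidence intervals described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's code: the function names, the score threshold, and the four-exam toy data are all hypothetical, and the percentile bootstrap shown here is only one common way to obtain such intervals, since the abstract does not spell out the exact procedure.

```python
import random


def sens_spec(calls, truths):
    """Return (sensitivity, specificity) for boolean calls vs. boolean truths."""
    tp = sum(c and t for c, t in zip(calls, truths))
    fn = sum((not c) and t for c, t in zip(calls, truths))
    tn = sum((not c) and (not t) for c, t in zip(calls, truths))
    fp = sum(c and (not t) for c, t in zip(calls, truths))
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def triage_workflow(scores, radiologist_calls, threshold):
    """Exams scoring below the (hypothetical) threshold are skipped and counted
    as negative; all other exams keep the original radiologist's call."""
    return [False if s < threshold else call
            for s, call in zip(scores, radiologist_calls)]


def bootstrap_ci(calls, truths, which, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI; which=0 for sensitivity, 1 for specificity."""
    rng = random.Random(seed)
    n = len(calls)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        value = sens_spec([calls[i] for i in idx],
                          [truths[i] for i in idx])[which]
        if value == value:  # skip NaN from resamples lacking the needed class
            stats.append(value)
    if not stats:
        raise ValueError("no valid bootstrap resamples")
    stats.sort()
    lo = stats[int((alpha / 2) * len(stats))]
    hi = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lo, hi


# Hypothetical toy data: model scores, radiologist calls, registry outcomes.
scores = [0.1, 0.9, 0.2, 0.8]
radiologist_calls = [True, True, False, False]
truths = [False, True, False, False]

baseline = sens_spec(radiologist_calls, truths)
triaged_calls = triage_workflow(scores, radiologist_calls, threshold=0.5)
triaged = sens_spec(triaged_calls, truths)
fraction_read = sum(s >= 0.5 for s in scores) / len(scores)

print("all read:  ", baseline)
print("DL triage: ", triaged, "fraction read:", fraction_read)
print("specificity CI:", bootstrap_ci(triaged_calls, truths, which=1, n_boot=1000))
```

In this toy example the triage step removes one false positive (the low-scoring exam the radiologist had called positive) while halving the number of exams read, which mirrors the kind of effect the study measures at scale.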

Results

The test set included 7176 women (mean age, 57.8 years ± 10.9 [standard deviation]). When reading all mammograms, radiologists obtained a sensitivity and specificity of 90.6% (173 of 191; 95% CI: 86.6%, 94.7%) and 93.5% (24 625 of 26 349; 95% CI: 93.3%, 93.9%). In the DL-simulated workflow, the radiologists obtained a sensitivity and specificity of 90.1% (172 of 191; 95% CI: 86.0%, 94.3%) and 94.2% (24 814 of 26 349; 95% CI: 94.0%, 94.6%) while reading 80.7% (21 420 of 26 540) of the mammograms. The simulated workflow improved specificity (P = .002) and obtained a noninferior sensitivity with a margin of 5% (P < .001).
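The percentages reported above follow directly from the stated counts. The short check below recomputes them and derives the absolute differences between the two workflows; the variable names are illustrative, and only the counts come from the abstract.

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * numerator / denominator, 1)


# Counts taken from the Results paragraph.
cancers, non_cancers, total = 191, 26349, 26540
assert cancers + non_cancers == total  # the test-set counts are consistent

all_read = {
    "sensitivity": pct(173, cancers),        # 90.6
    "specificity": pct(24625, non_cancers),  # 93.5
}
dl_triage = {
    "sensitivity": pct(172, cancers),        # 90.1
    "specificity": pct(24814, non_cancers),  # 94.2
    "read": pct(21420, total),               # 80.7
}

# Absolute differences implied by the counts.
fewer_false_positives = (non_cancers - 24625) - (non_cancers - 24814)  # 189
additional_missed_cancers = (cancers - 172) - (cancers - 173)          # 1
mammograms_skipped = total - 21420                                     # 5120

print(all_read, dl_triage)
print(fewer_false_positives, additional_missed_cancers, mammograms_skipped)
```

Put in absolute terms, the simulated workflow skips 5120 of the 26 540 test mammograms, avoids 189 false positives, and misses one additional cancer.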

Conclusion

This deep learning model has the potential to reduce radiologist workload and significantly improve specificity without harming sensitivity.



These twelve artificial intelligence innovations are expected to start impacting clinical care by the end of the decade.

Reporter: Gail S. Thornton, M.A.

 

This article is excerpted from Health IT Analytics, April 11, 2019.

 By Jennifer Bresnick

April 11, 2019 – There’s no question that artificial intelligence is moving quickly in the healthcare industry.  Even just a few months ago, AI was still a dream for the next generation: something that would start to enter regular care delivery in a couple of decades – maybe ten or fifteen years for the most advanced health systems.

Even Partners HealthCare, the Boston-based giant on the very cutting edge of research and reform, set a ten-year timeframe for artificial intelligence during its 2018 World Medical Innovation Forum, identifying a dozen AI technologies that had the potential to revolutionize patient care within the decade.

But over the past twelve months, research has progressed so rapidly that Partners has blown up that timeline. 

Instead of viewing AI as something still lingering on the distant horizon, this year’s Disruptive Dozen panel was tasked with assessing which AI innovations will be ready to fundamentally alter the delivery of care by 2020 – now less than a year away.

Sixty members of the Partners faculty participated in nominating and narrowing down the tools they think will have an almost immediate benefit for patients and providers, explained Erica Shenoy, MD, PhD, an infectious disease specialist at Massachusetts General Hospital (MGH).

“These are innovations that have a strong potential to make significant advancement in the field, and they are also technologies that are pretty close to making it to market,” she said.

The results include everything from mental healthcare and clinical decision support to coding and communication, offering patients and their providers a more efficient, effective, and cost-conscious ecosystem for improving long-term outcomes.

In order from least to greatest potential impact, here are the twelve artificial intelligence innovations poised to become integral components of the next decade’s data-driven care delivery system.

NARROWING THE GAPS IN MENTAL HEALTHCARE

Nearly twenty percent of US patients struggle with a mental health disorder, yet treatment is often difficult to access and expensive to use regularly.  Reducing barriers to access for mental and behavioral healthcare, especially during the opioid abuse crisis, requires a new approach to connecting patients with services.

AI-driven applications and therapy programs will be a significant part of the answer.

“The promise and potential for digital behavioral solutions and apps is enormous to address the gaps in mental healthcare in the US and across the world,” said David Ahern, PhD, a clinical psychologist at Brigham & Women’s Hospital (BWH). 

Smartphone-based cognitive behavioral therapy and integrated group therapy are showing promise for treating conditions such as depression, eating disorders, and substance abuse.

While patients and providers need to be wary of commercially available applications that have not been rigorously validated and tested, more and more researchers are developing AI-based tools that have the backing of randomized clinical trials and are showing good results.

A panel of experts from Partners HealthCare presents the Disruptive Dozen at WMIF19.

Source: Partners HealthCare

STREAMLINING WORKFLOWS WITH VOICE-FIRST TECHNOLOGY

Natural language processing is already a routine part of many behind-the-scenes clinical workflows, but voice-first tools are expected to make their way into the patient-provider encounter in a new way. 

Smart speakers in the clinic are prepping to relieve clinicians of their EHR burdens, capturing free-form conversations and translating the content into structured documentation.  Physicians and nurses will be able to collect and retrieve information more quickly while spending more time looking patients in the eye.

Patients may benefit from similar technologies at home as the consumer market for virtual assistants continues to grow.  With companies like Amazon achieving HIPAA compliance for their consumer-facing products, individuals may soon have more robust options for voice-first chronic disease management and patient engagement.

IDENTIFYING INDIVIDUALS AT HIGH RISK OF DOMESTIC VIOLENCE

Underreporting makes it difficult to know just how many people suffer from intimate partner violence (IPV), says Bharti Khurana, MD, an emergency radiologist at BWH.  But the symptoms are often hiding in plain sight for radiologists.

Using artificial intelligence to flag worrisome injury patterns or mismatches between patient-reported histories and the types of fractures present on x-rays can alert providers to when an exploratory conversation is called for.

“As a radiologist, I’m very excited because this will enable me to provide even more value to the patient instead of simply evaluating their injuries.  It’s a powerful tool for clinicians and social workers that will allow them to approach patients with confidence and with less worry about offending the patient or the spouse,” said Khurana.

REVOLUTIONIZING ACUTE STROKE CARE

Every second counts when a patient experiences a stroke.  In far-flung regions of the United States and in the developing world, access to skilled stroke care can take hours, drastically increasing the likelihood of significant long-term disability or death.

Artificial intelligence has the potential to close the gaps in access to high-quality imaging studies that can identify the type of stroke and the location of the clot or bleed.  Research teams are currently working on AI-driven tools that can automate the detection of stroke and support decision-making around the appropriate treatment for the individual’s needs.  

In rural or low-resource care settings, these algorithms can compensate for the lack of a specialist on-site and ensure that every stroke patient has the best possible chance of treatment and recovery.

AI revolutionizing stroke care

Source: Getty Images

REDUCING ADMINISTRATIVE BURDENS FOR PROVIDERS

The costs of healthcare administration are off the charts.  Recent data from the Center for American Progress states that providers spend about $282 billion per year on insurance and medical billing, and the burdens are only going to keep getting bigger.

Medical coding and billing is a perfect use case for natural language processing and machine learning.  NLP is well-suited to translating free-text notes into standardized codes, which can move the task off the plates of physicians and reduce the time and effort spent on complying with convoluted regulations.

“The ultimate goal is to help reduce the complexity of the coding and billing process through automation, thereby reducing the number of mistakes – and, in turn, minimizing the need for such intense regulatory oversight,” Partners says.

NLP is already in relatively wide use for this task, and healthcare organizations are expected to continue adopting this strategy as a way to control costs and speed up their billing cycles.

UNLEASHING HEALTH DATA THROUGH INFORMATION EXCHANGE

AI will combine with another game-changing technology, known as FHIR, to unlock siloes of health data and support broader access to health information.

Patients, providers, and researchers will all benefit from a more fluid health information exchange environment, especially since artificial intelligence models are extremely data-hungry.

Stakeholders will need to pay close attention to maintaining the privacy and security of data as it moves across disparate systems, but the benefits have the potential to outweigh the risks.

“It completely depends on how everyone in the medical community advocates for, builds, and demands open interfaces and open business models,” said Samuel Aronson, Executive Director of IT at Partners Personalized Medicine.

“If we all row in the same direction, there’s a real possibility that we will see fundamental improvements to the healthcare system in 3 to 5 years.”

OFFERING NEW APPROACHES FOR EYE HEALTH AND DISEASE

Image-heavy disciplines have started to see early benefits from artificial intelligence since computers are particularly adept at analyzing patterns in pixels.  Ophthalmology is one area that could see major changes as AI algorithms become more accurate and more robust.

From glaucoma to diabetic retinopathy, millions of patients experience diseases that can lead to irreversible vision loss every year.  Employing AI for clinical decision support can extend access to eye health services in low-resource areas while giving human providers more accurate tools for catching diseases sooner.

REAL-TIME MONITORING OF BRAIN HEALTH

The brain is still the body’s most mysterious organ, but scientists and clinicians are making swift progress unlocking the secrets of cognitive function and neurological disease.  Artificial intelligence is accelerating discovery by helping providers interpret the incredibly complex data that the brain produces.

From predicting seizures by reading EEG tests to identifying the beginnings of dementia earlier than any human, artificial intelligence is allowing providers to access more detailed, continuous measurements – and helping patients improve their quality of life.

Seizures can happen in patients with other serious illnesses, such as kidney or liver failure, explained Brandon Westover, MD, PhD, executive director of the Clinical Data Animation Center at MGH, but many providers simply don’t know about it.

“Right now, we mostly ignore the brain unless there’s a special need for suspicion,” he said.  “In a year’s time, we’ll be catching a lot more seizures and we’ll be doing it with algorithms that can monitor patients continuously and identify more ambiguous patterns of dysfunction that can damage the brain in a similar manner to seizures.”

AUTOMATING MALARIA DETECTION IN DEVELOPING REGIONS

Malaria is a daily threat for approximately half the world’s population.  Nearly half a million people died from the mosquito-borne disease in 2017, according to the World Health Organization, and the majority of the victims are children under the age of five.

Deep learning tools can automate the process of quantifying malaria parasites in blood samples, a challenging task for providers working without pathologist partners.  One such tool achieved 90 percent accuracy and specificity, putting it on par with pathology experts.

This type of software can be run on a smartphone hooked up to a camera on a microscope, dramatically expanding access to expert-level diagnosis and monitoring.

AI for diagnosing and detecting malaria

Source: Getty Images

AUGMENTING DIAGNOSTICS AND DECISION-MAKING

Artificial intelligence has made especially swift progress in diagnostic specialties, including pathology. AI will continue to speed down the road to maturity in this area, predicts Annette Kim, MD, PhD, associate professor of pathology at BWH and Harvard Medical School.

“Pathology is at the center of diagnosis, and diagnosis underpins a huge percentage of all patient care.  We’re integrating a huge amount of data that funnels through us to come to a diagnosis.  As the number of data points increases, it negatively impacts the time we have to synthesize the information,” she said.

AI can help automate routine, high-volume tasks, prioritize and triage cases to ensure patients are getting speedy access to the right care, and make sure that pathologists don’t miss key information hidden in the enormous volumes of clinical and test data they must comb through every day.

“This is where AI can have a huge impact on practice by allowing us to use our limited time in the most meaningful manner,” Kim stressed.

PREDICTING THE RISK OF SUICIDE AND SELF-HARM

Suicide is the tenth leading cause of death in the United States, claiming 45,000 lives in 2016.  Suicide rates are on the rise due to a number of complex socioeconomic and mental health factors, and identifying patients at the highest risk of self-harm is a difficult and imprecise science.

Natural language processing and other AI methodologies may help providers identify high-risk patients earlier and more reliably.  AI can comb through social media posts, electronic health record notes, and other free-text documents to flag words or concepts associated with the risk of harm.

Researchers also hope to develop AI-driven apps to provide support and therapy to individuals likely to harm themselves, especially teenagers who commit suicide at higher rates than other age groups.

Connecting patients with mental health resources before they reach a time of crisis could save thousands of lives every year.

REIMAGINING THE WORLD OF MEDICAL IMAGING

Radiology is already one of AI’s early beneficiaries, but providers are just at the beginning of what they will be able to accomplish in the next few years as machine learning explodes into the imaging realm.

AI is predicted to bring earlier detection, more accurate assessment of complex images, and less expensive testing for patients across a huge number of clinical areas.

But as leaders in the AI revolution, radiologists also have a significant responsibility to develop and deploy best practices in terms of trustworthiness, workflow, and data protection.

“We certainly feel the onus on the radiology community to make sure we do deliver and translate this into improved care,” said Alexandra Golby, MD, a neurosurgeon and radiologist at BWH and Harvard Medical School.

“Can radiology live up to the expectations?  There are certainly some challenges, including trust and understanding of what the algorithms are delivering.  But we desperately need it, and we want to equalize care across the world.”

Radiologists have been among the first to overcome their trepidation about the role of AI in a changing clinical world, and are eagerly embracing the possibilities of this transformative approach to augmenting human skills.

“All of the imaging societies have opened their doors to the AI adventure,” Golby said.  “The community is very anxious to learn, codevelop, and work with all of the industry partners to turn this technology into truly valuable tools. We’re very optimistic and very excited, and we look forward to learning more about how AI can improve care.”

Source:

https://healthitanalytics.com/news/top-12-artificial-intelligence-innovations-disrupting-healthcare-by-2020

 



Retrospect on HistoScanning; an AI routinely used in diagnostic imaging for over a decade

Author and Curator: Dror Nir, PhD

This blog post looks back on more than a decade of experience with HistoScanning, an AI medical device for imaging-based tissue characterization.

Imaging-based tissue characterization by AI offers a change in imaging paradigm: it enhances the visual information obtained from diagnostic imaging beyond what the eye alone can see, while simplifying the patient’s clinical pathway and making it more cost-effective.

In the case of HistoScanning, imaging combines 3D ultrasound scanning with a real-time application of AI. The HistoScanning AI application comprises fast pattern-recognition algorithms trained on ultrasound scans and matched histopathology of cancer patients. It classifies millimetric tissue volumes by identifying differences in the scattered ultrasound that characterize the distinct mechanical and morphological properties of different pathologies. A user-friendly interface displays the analysis results on the live ultrasound video image.

Users of AI in diagnostic-imaging of cancer patients expect it to improve their ability to:

  • Detect clinically significant cancer lesions with high sensitivity and specificity
  • Accurately position lesions within an organ
  • Accurately estimate the lesion volume
  • Help determine the pre-clinical level of lesion aggressiveness

The last is achieved through real-time guidance of needle biopsy towards the most suspicious locations.

Unlike most technologies, which become obsolete as time passes, AI gets better. The availability of more processing power, better storage technologies, and faster memory translates into an ever-growing capacity of machines to learn. Moreover, the human perception of AI is changing fast, from disbelief at the time HistoScanning was first launched to full embrace.

During the last decade, 192 systems were put to use in the hands of urologists, radiologists, and gynecologists. Over 200 peer-reviewed papers, scientific posters, and white papers were written by HistoScanning users sharing their experiences and thoughts. Most of these papers are about HistoScanning for Prostate (PHS), which was launched as a medical device in 2007. The real-time guided prostate-biopsy application was added to it in late 2013. I have mentioned several of these papers in blog posts published on this open-access website, e.g.:

Today’s fundamental challenge in Prostate cancer screening (September 2, 2012)

The unfortunate ending of the Tower of Babel construction project and its effect on modern imaging-based cancer patients’ management (October 22, 2012)

On the road to improve prostate biopsy (February 15, 2013)

Ultrasound-based Screening for Ovarian Cancer (April 28, 2013)

Imaging-Biomarkers; from discovery to validation (September 28, 2014)

For people developing AI applications for health care, a retrospective on HistoScanning is an excellent opportunity to better plan the life cycle of such products and to understand what it would take to bring them to wide adoption by global health systems.

It would require many pages to cover in detail the lessons HistoScanning could teach all of us. I will therefore briefly discuss the highlights:

  • Regulations: FDA clearance for HistoScanning required a PMA and has not been achieved to date. The regulatory process in Europe was similar to that for ultrasound but has become harder in recent years.
  • Safety: Over more than a decade and many thousands of procedures, no safety issue was raised.
  • Learning curve: Many of the reports on HistoScanning conclude that, to maximize its potential, the sonographer must be experienced and well trained in using the system. Among other things, it became clear that there is a strong correlation between the clinical added value of HistoScanning and the quality of the ultrasound scan, which depends on the sonographer but also, in many cases, on the patient (e.g., his BMI).
  • Patient’s attitude: Post-market surveillance (PMS) reviews of HistoScanning show that patients are generally excited about the opportunity to have an AI application involved in their diagnostic process. It seems to increase their confidence in the validity of the results, and there was never a case of a patient refusing the analysis. Also, some of the early adopters of PHS (HistoScanning for prostate) charged their patients privately for the service, and patients were happy to accept that even though their health insurance did not reimburse the cost.
  • Adoption by practitioners: To date, PHS has not achieved wide market adoption, and user feedback on it is mixed, ranging from strongly positive recommendations to very negative and dismissive ones. Close examination of the reasons for such a variety of experiences reveals that most of the reports rely on small and widely varying samples, because clinical trials aimed at measuring its performance are relatively complex and costly. Moreover, without any available standards for assessing AI performance, what is good enough for one user can be totally insufficient for another. Realizing this led to recent efforts by some leading urologists to organize large patient registries related to routine use of PHS.

The most recent peer-reviewed paper on PHS is “Evaluation of Prostate HistoScanning as a Method for Targeted Biopsy in Routine Practice” by Petr V. Glybochko, Yuriy G. Alyaev, Alexandr V. Amosov, German E. Krupinov, Dror Nir, Mathias Winkler, and Timur M. Ganzha, published in European Urology Focus.

It studied PHS in a statistically reasonable number of patients (611) and concluded that “Our study results support supplementing the standard schematic transrectal ultrasound-guided biopsy with a few guided cores harvested using the ultrasound-based prostate HistoScanning true targeting approach in cases for which multiparametric magnetic resonance imaging is not available.”



Reported by Dror Nir, PhD

Deep Learning–Assisted Diagnosis of Cerebral Aneurysms Using the HeadXNet Model

Allison Park, BA; Chris Chute, BS; Pranav Rajpurkar, MS; et al. Original Investigation, Health Informatics, June 7, 2019. JAMA Netw Open. 2019;2(6):e195600. doi:10.1001/jamanetworkopen.2019.5600

Key Points

Question  How does augmentation with a deep learning segmentation model influence the performance of clinicians in identifying intracranial aneurysms from computed tomographic angiography examinations?

Findings  In this diagnostic study of intracranial aneurysms, a test set of 115 examinations was reviewed once with model augmentation and once without in a randomized order by 8 clinicians. The clinicians showed significant increases in sensitivity, accuracy, and interrater agreement when augmented with neural network model–generated segmentations.

Meaning  This study suggests that the performance of clinicians in the detection of intracranial aneurysms can be improved by augmentation using deep learning segmentation models.

 

Abstract

Importance  Deep learning has the potential to augment clinician performance in medical imaging interpretation and reduce time to diagnosis through automated segmentation. Few studies to date have explored this topic.

Objective  To develop and apply a neural network segmentation model (the HeadXNet model) capable of generating precise voxel-by-voxel predictions of intracranial aneurysms on head computed tomographic angiography (CTA) imaging to augment clinicians’ intracranial aneurysm diagnostic performance.

Design, Setting, and Participants  In this diagnostic study, a 3-dimensional convolutional neural network architecture was developed using a training set of 611 head CTA examinations to generate aneurysm segmentations. Segmentation outputs from this support model on a test set of 115 examinations were provided to clinicians. Between August 13, 2018, and October 4, 2018, 8 clinicians diagnosed the presence of aneurysm on the test set, both with and without model augmentation, in a crossover design using randomized order and a 14-day washout period. Head and neck examinations performed between January 3, 2003, and May 31, 2017, at a single academic medical center were used to train, validate, and test the model. Examinations positive for aneurysm had at least 1 clinically significant, nonruptured intracranial aneurysm. Examinations with hemorrhage, ruptured aneurysm, posttraumatic or infectious pseudoaneurysm, arteriovenous malformation, surgical clips, coils, catheters, or other surgical hardware were excluded. All other CTA examinations were considered controls.

Main Outcomes and Measures  Sensitivity, specificity, accuracy, time, and interrater agreement were measured. Metrics for clinician performance with and without model augmentation were compared.

Results  The data set contained 818 examinations from 662 unique patients with 328 CTA examinations (40.1%) containing at least 1 intracranial aneurysm and 490 examinations (59.9%) without intracranial aneurysms. The 8 clinicians reading the test set ranged in experience from 2 to 12 years. Augmenting clinicians with artificial intelligence–produced segmentation predictions resulted in clinicians achieving statistically significant improvements in sensitivity, accuracy, and interrater agreement when compared with no augmentation. The clinicians’ mean sensitivity increased by 0.059 (95% CI, 0.028-0.091; adjusted P = .01), mean accuracy increased by 0.038 (95% CI, 0.014-0.062; adjusted P = .02), and mean interrater agreement (Fleiss κ) increased by 0.060, from 0.799 to 0.859 (adjusted P = .05). There was no statistically significant change in mean specificity (0.016; 95% CI, −0.010 to 0.041; adjusted P = .16) and time to diagnosis (5.71 seconds; 95% CI, −7.22 to 18.63 seconds; adjusted P = .19).
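For readers unfamiliar with the interrater agreement statistic quoted above, the following is a minimal sketch of how Fleiss κ is computed from a table of rating counts. The function name and the four-examination toy data are hypothetical and are not taken from the study; only the formula itself is the standard definition.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] is the number of raters who assigned subject i to category j.
    Every row must sum to the same number of raters (at least 2).
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Overall proportion of all ratings that fall in each category.
    p_cat = [sum(row[j] for row in counts) / (n_subjects * n_raters)
             for j in range(n_categories)]

    # Observed agreement per subject, then averaged over subjects.
    p_subj = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
              for row in counts]
    p_bar = sum(p_subj) / n_subjects

    # Agreement expected by chance.
    p_exp = sum(p * p for p in p_cat)
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (p_bar - p_exp) / (1 - p_exp)


# Hypothetical example: 8 readers rate 4 examinations as
# [aneurysm present, aneurysm absent].
toy_counts = [[8, 0], [0, 8], [6, 2], [1, 7]]
print(round(fleiss_kappa(toy_counts), 3))
```

A value of 1 means the readers always agree, and 0 means agreement no better than chance, so the reported rise from 0.799 to 0.859 indicates that readers agreed with each other more often when shown the model's segmentations.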

Conclusions and Relevance  The deep learning model developed successfully detected clinically significant intracranial aneurysms on CTA. This suggests that integration of an artificial intelligence–assisted diagnostic model may augment clinician performance with dependable and accurate predictions and thereby optimize patient care.

Introduction

Diagnosis of unruptured aneurysms is a critically important clinical task: intracranial aneurysms occur in 1% to 3% of the population and account for more than 80% of nontraumatic life-threatening subarachnoid hemorrhages.1 Computed tomographic angiography (CTA) is the primary, minimally invasive imaging modality currently used for diagnosis, surveillance, and presurgical planning of intracranial aneurysms,2,3 but interpretation is time consuming even for subspecialty-trained neuroradiologists. Low interrater agreement poses an additional challenge for reliable diagnosis.4-7

Deep learning has recently shown significant potential in accurately performing diagnostic tasks on medical imaging.8 Specifically, convolutional neural networks (CNNs) have demonstrated excellent performance on a range of visual tasks, including medical image analysis.9 However, the ability of deep learning systems to augment clinician workflow remains relatively unexplored.10 The development of an accurate deep learning model to help clinicians reliably identify clinically significant aneurysms in CTA has the potential to provide radiologists, neurosurgeons, and other clinicians an easily accessible and immediately applicable diagnostic support tool.

In this study, a deep learning model to automatically detect intracranial aneurysms on CTA and produce segmentations specifying regions of interest was developed to assist clinicians in the interpretation of CTA examinations for the diagnosis of intracranial aneurysms. Sensitivity, specificity, accuracy, time to diagnosis, and interrater agreement for clinicians with and without model augmentation were compared.

Methods

The Stanford University institutional review board approved this study. Owing to the retrospective nature of the study, patient consent or assent was waived. The Standards for Reporting of Diagnostic Accuracy (STARD) reporting guideline was used for the reporting of this study.

Data

A total of 9455 consecutive CTA examination reports of the head or head and neck performed between January 3, 2003, and May 31, 2017, at Stanford University Medical Center were retrospectively reviewed. Examinations with parenchymal hemorrhage, subarachnoid hemorrhage, posttraumatic or infectious pseudoaneurysm, arteriovenous malformation, ischemic stroke, nonspecific or chronic vascular findings such as intracranial atherosclerosis or other vasculopathies, surgical clips, coils, catheters, or other surgical hardware were excluded. Examinations of injuries that resulted from trauma or contained images degraded by motion were also excluded on visual review by a board-certified neuroradiologist with 12 years of experience. Examinations with nonruptured clinically significant aneurysms (>3 mm) were included.11

Radiologist Annotations

The reference standard for all examinations in the test set was determined by a board-certified neuroradiologist at a large academic practice with 12 years of experience who determined the presence of aneurysm by review of the original radiology report, double review of the CTA examination, and further confirmation of the aneurysm by diagnostic cerebral angiograms, if available. The neuroradiologist had access to all of the Digital Imaging and Communications in Medicine (DICOM) series, original reports, and clinical histories, as well as previous and follow-up examinations during interpretation to establish the best possible reference standard for the labels. For each of the aneurysm examinations, the radiologist also identified the location of each of the aneurysms. Using the open-source annotation software ITK-SNAP,12 the identified aneurysms were manually segmented on each slice.

Model Development

In this study, we developed a 3-dimensional (3-D) CNN called HeadXNet for segmentation of intracranial aneurysms from CT scans. Neural networks are functions with parameters structured as a sequence of layers to learn different levels of abstraction. Convolutional neural networks are a type of neural network designed to process image data, and 3-D CNNs are particularly well suited to handle sequences of images, or volumes.

HeadXNet is a CNN with an encoder-decoder structure (eFigure 1 in the Supplement), where the encoder maps a volume to an abstract low-resolution encoding, and the decoder expands this encoding to a full-resolution segmentation volume. The segmentation volume is of the same size as the corresponding study and specifies the probability of aneurysm for each voxel, which is the atomic unit of a 3-D volume, analogous to a pixel in a 2-D image. The encoder is adapted from a 50-layer SE-ResNeXt network,13-15 and the decoder is a sequence of 3 × 3 transposed convolutions. Similar to UNet,16 skip connections are used in 3 layers of the encoder to transmit outputs directly to the decoder. The encoder was pretrained on the Kinetics-600 data set,17 a large collection of YouTube videos labeled with human actions; after pretraining the encoder, the final 3 convolutional blocks and the 600-way softmax output layer were removed. In their place, an atrous spatial pyramid pooling18 layer and the decoder were added.

Training Procedure

Subvolumes of 16 slices were randomly sampled from volumes during training. The data set was preprocessed to find contours of the skull, and each volume was cropped around the skull in the axial plane before resizing each slice to 208 × 208 pixels. The slices were then cropped to 192 × 192 pixels (using random crops during training and centered crops during testing), resulting in a final input of size 16 × 192 × 192 per example; the same transformations were applied to the segmentation label. The segmentation output was trained to optimize a weighted combination of the voxelwise binary cross-entropy and Dice losses.19

Before reaching the model, inputs were clipped to [−300, 700] Hounsfield units, normalized to [−1, 1], and zero-centered. The model was trained on 3 Titan Xp graphical processing units (GPUs) (NVIDIA) using a minibatch of 2 examples per GPU. The parameters of the model were optimized using a stochastic gradient descent optimizer with momentum of 0.9 and a peak learning rate of 0.1 for randomly initialized weights and 0.01 for pretrained weights. The learning rate was scheduled with a linear warm-up from 0 to the peak learning rate for 10 000 iterations, followed by cosine annealing20 over 300 000 iterations. Additionally, the learning rate was fixed at 0 for the first 10 000 iterations for the pretrained encoder. For regularization, L2 weight decay of 0.001 was added to the loss for all trainable parameters and stochastic depth dropout21 was used in the encoder blocks. Standard dropout was not used.
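The intensity windowing and the learning-rate schedule in this paragraph can be illustrated with a short sketch. Several details are assumptions: the text does not say how zero-centering was done (it is omitted here), whether the 300 000 annealing iterations start after the warm-up (assumed yes), or how the pretrained encoder's rate resumes after being held at 0 (assumed to join the cosine curve at its peak). The function names are hypothetical.

```python
import math

import numpy as np

HU_MIN, HU_MAX = -300.0, 700.0              # clipping window from the text
WARMUP_ITERS, ANNEAL_ITERS = 10_000, 300_000


def window_hu(volume_hu):
    """Clip to [-300, 700] Hounsfield units and rescale linearly to [-1, 1]."""
    clipped = np.clip(np.asarray(volume_hu, dtype=np.float32), HU_MIN, HU_MAX)
    return 2.0 * (clipped - HU_MIN) / (HU_MAX - HU_MIN) - 1.0


def learning_rate(iteration, peak_lr, frozen_during_warmup=False):
    """Linear warm-up from 0 to peak_lr, then cosine annealing down to 0.

    frozen_during_warmup=True models the pretrained encoder, whose rate is
    held at 0 for the first 10 000 iterations (resumption rule is assumed).
    """
    if iteration < WARMUP_ITERS:
        if frozen_during_warmup:
            return 0.0
        return peak_lr * iteration / WARMUP_ITERS
    progress = min((iteration - WARMUP_ITERS) / ANNEAL_ITERS, 1.0)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


# Peak rates from the text: 0.1 for new weights, 0.01 for pretrained weights.
decoder_lr = learning_rate(5_000, peak_lr=0.1)
encoder_lr = learning_rate(5_000, peak_lr=0.01, frozen_during_warmup=True)
```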

To control for class imbalance, 3 methods were used. First, an auxiliary loss was added after the encoder and focal loss was used to encourage larger parameter updates on misclassified positive examples. Second, abnormal training examples were sampled more frequently than normal examples such that abnormal examples made up 30% of training iterations. Third, parameters of the decoder were not updated on training iterations where the segmentation label consisted of purely background (normal) voxels.
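The three imbalance measures can be sketched roughly as below. The focusing parameter `gamma`, the per-example form of the focal loss, and all function names are assumptions for illustration; the text reports only that focal loss was used, that abnormal examples made up 30% of training iterations, and that decoder updates were skipped on background-only labels.

```python
import math
import random

import numpy as np


def focal_loss(prob, label, gamma=2.0, eps=1e-7):
    """Binary focal loss for one prediction: -(1 - p_t)^gamma * log(p_t).

    gamma is a hypothetical value; easy (well-classified) examples are
    down-weighted, so misclassified examples dominate the gradient.
    """
    prob = min(max(prob, eps), 1.0 - eps)
    p_t = prob if label == 1 else 1.0 - prob
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def sample_subvolume(abnormal, normal, rng, abnormal_fraction=0.30):
    """Draw one training example so abnormal ones fill ~30% of iterations."""
    pool = abnormal if rng.random() < abnormal_fraction else normal
    return rng.choice(pool)


def should_update_decoder(label_subvolume):
    """Skip decoder updates when the label contains only background voxels."""
    return bool(np.any(np.asarray(label_subvolume) > 0))


rng = random.Random(0)
draws = [sample_subvolume(["abnormal"], ["normal"], rng) for _ in range(20_000)]
abnormal_share = draws.count("abnormal") / len(draws)
```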

To produce a segmentation prediction for the entire volume, the segmentation outputs for sequential 16-slice subvolumes were simply concatenated. If the number of slices was not divisible by 16, the last input volume was padded with 0s and the corresponding output volume was truncated back to the original size.
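A minimal sketch of this whole-volume inference step is shown below. The callable `predict_subvolume` is a hypothetical stand-in for the trained network; only the chunking, zero padding, concatenation, and truncation follow the description above.

```python
import numpy as np

SUBVOLUME_SLICES = 16


def segment_volume(volume, predict_subvolume):
    """Segment a (num_slices, H, W) volume in sequential 16-slice chunks.

    predict_subvolume : hypothetical callable mapping a (16, H, W) array to
                        a (16, H, W) array of per-voxel probabilities.
    """
    num_slices = volume.shape[0]
    # Pad the slice axis with zeros up to the next multiple of 16.
    pad = (-num_slices) % SUBVOLUME_SLICES
    padded = np.pad(volume, ((0, pad), (0, 0), (0, 0)),
                    mode="constant", constant_values=0)

    outputs = [
        predict_subvolume(padded[start:start + SUBVOLUME_SLICES])
        for start in range(0, padded.shape[0], SUBVOLUME_SLICES)
    ]
    # Concatenate chunk outputs and drop the slices that were only padding.
    return np.concatenate(outputs, axis=0)[:num_slices]


# Usage with a toy "model" that thresholds intensities.
volume = np.arange(113 * 2 * 2, dtype=np.float32).reshape(113, 2, 2)
segmentation = segment_volume(volume, lambda sub: (sub > 100).astype(np.float32))
```

Because chunks are processed independently and simply concatenated, no context is shared across the 16-slice boundaries in this scheme.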

Study Design

We performed a diagnostic accuracy study comparing performance metrics of clinicians with and without model augmentation. Each of the 8 clinicians participating in the study diagnosed a test set of 115 examinations, once with and once without assistance of the model. The clinicians were blinded to the original reports, clinical histories, and follow-up imaging examinations. Using a crossover design, the clinicians were randomly and equally divided into 2 groups. Within each group, examinations were sorted in a fixed random order for half of the group and sorted in reverse order for the other half. Group 1 first read the examinations without model augmentation, and group 2 first read the examinations with model augmentation. After a washout period of 14 days, the augmentation arrangement was reversed such that group 1 performed reads with model augmentation and group 2 read the examinations without model augmentation (Figure 1A).

Clinicians were instructed to assign a binary label for the presence or absence of at least 1 clinically significant aneurysm, defined as having a diameter greater than 3 mm. Clinicians read alone in a diagnostic reading room, all using the same high-definition monitor (3840 × 2160 pixels) displaying CTA examinations on a standard open-source DICOM viewer (Horos).22 Clinicians entered their labels into a data entry software application that automatically logged the time difference between labeling of the previous examination and the current examination.

When reading with model augmentation, clinicians were provided the model’s predictions in the form of region of interest (ROI) segmentations directly overlaid on top of CTA examinations. To ensure an image display interface that was familiar to all clinicians, the model’s predictions were presented as ROIs in a standard DICOM viewing software. At every voxel where the model predicted a probability greater than 0.5, readers saw a semiopaque red overlay on the axial, sagittal, and coronal series (Figure 1C). Readers had access to the ROIs immediately on loading the examinations, and the ROIs could be toggled off to reveal the unaltered CTA images (Figure 1B). The red overlays were the only indication that was given whether a particular CTA examination had been predicted by the model to contain an aneurysm. Given these model results, readers had the option to take it into consideration or disregard it based on clinical judgment. When readers performed diagnoses without augmentation, no ROIs were present on any of the examinations. Otherwise, the diagnostic tools were identical for augmented and nonaugmented reads.

Statistical Analysis

On the binary task of determining whether an examination contained an aneurysm, sensitivity, specificity, and accuracy were used to assess the performance of clinicians with and without model augmentation. Sensitivity denotes the number of true-positive results over total aneurysm-positive cases, specificity denotes the number of true-negative results over total aneurysm-negative cases, and accuracy denotes the number of true-positive and true-negative results over all test cases. The microaverage of these statistics across all clinicians was also computed by pooling the true-positive, true-negative, false-positive, and false-negative results of all clinicians. In addition, to convert the model’s segmentation output into a binary prediction, a prediction was considered positive if the model predicted at least 1 voxel as belonging to an aneurysm and negative otherwise. The 95% Wilson score confidence intervals were used to assess the variability in the estimates for sensitivity, specificity, and accuracy.23
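These definitions and the Wilson score interval can be written out in a few lines. The sketch below is illustrative: the function names are hypothetical, the 0.5 voxel threshold is borrowed from the overlay rule in the Study Design section, and the example counts (56 of 59 positives detected, 37 of 56 negatives cleared) are an assumption, not values stated in the article; they are used only because they are consistent with the model's reported rates.

```python
import math


def exam_is_positive(voxel_probs, threshold=0.5):
    """An examination is positive if at least 1 voxel exceeds the threshold."""
    return any(p > threshold for p in voxel_probs)


def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


def wilson_interval(successes, total, z=1.959963984540054):
    """Wilson score interval for a proportion (z defaults to the 95% level)."""
    p = successes / total
    denom = 1.0 + z * z / total
    center = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return center - half, center + half


# Assumed counts for illustration only (see the note above).
metrics = diagnostic_metrics(tp=56, fn=3, tn=37, fp=19)
sens_low, sens_high = wilson_interval(56, 59)
```

Microaveraging amounts to summing the four counts over all clinicians before calling `diagnostic_metrics`, rather than averaging each clinician's rates.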

To assess whether the clinicians achieved significant increases in performance with model augmentation, a 1-tailed t test was performed on the differences in sensitivity, specificity, and accuracy across all 8 clinicians. To determine the robustness of the findings and whether results were due to inclusion of the resident radiologist and neurosurgeon, we performed a sensitivity analysis: we computed the t test on the differences in sensitivity, specificity, and accuracy across board-certified radiologists only.

The average time to diagnosis for the clinicians with and without augmentation was computed as the difference between the mean entry times into the spreadsheet of consecutive diagnoses; 95% t score confidence intervals were used to assess the variability in the estimates. To account for interruptions in the clinical read or time logging errors, the 5 longest and 5 shortest times to diagnosis for each clinician in each reading were excluded. To assess whether model augmentation significantly decreased the time to diagnosis, a 1-tailed t test was performed on the difference in average time with and without augmentation across all 8 clinicians.
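The trimming rule for reading times can be sketched as follows. This is one plausible reading of the procedure, assuming the logged entry times are timestamps in seconds and that each time to diagnosis is the gap between consecutive entries; the function name and the sample values are hypothetical.

```python
from itertools import accumulate


def mean_time_to_diagnosis(entry_times, trim=5):
    """Mean gap between consecutive label entries after trimming extremes.

    entry_times : timestamps (seconds) at which labels were entered, in order
    trim        : number of longest and of shortest gaps to discard
    """
    intervals = [later - earlier
                 for earlier, later in zip(entry_times, entry_times[1:])]
    ordered = sorted(intervals)
    kept = ordered[trim:len(ordered) - trim]
    if not kept:
        raise ValueError("not enough readings left after trimming")
    return sum(kept) / len(kept)


# Synthetic gaps: five logging glitches (1 s), five interruptions (1000 s),
# and two ordinary reads (50 s and 60 s).
gaps = [1000, 1, 50, 1, 1000, 1, 60, 1000, 1, 1000, 1, 1000]
timestamps = [0] + list(accumulate(gaps))
mean_seconds = mean_time_to_diagnosis(timestamps)
```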

The interrater agreement of clinicians and for the radiologist subset was computed using the exact Fleiss κ.24 To assess whether model augmentation increased interrater agreement, a 1-tailed permutation test was performed on the difference between the interrater agreement of clinicians on the test set with and without augmentation. The permutation procedure consisted of randomly swapping clinician annotations with and without augmentation so that a random subset of the test set that had previously been labeled as read with augmentation was now labeled as being read without augmentation, and vice versa; the exact Fleiss κ values (and the difference) were computed on the test set with permuted labels. This permutation procedure was repeated 10 000 times to generate the null distribution of the Fleiss κ difference (the interrater agreement of clinician annotations with augmentation is not higher than without augmentation), and the unadjusted P value was calculated as the proportion of Fleiss κ differences that were higher than the observed Fleiss κ difference.
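The permutation procedure can be sketched as below. Two simplifications are assumptions and should be read as such: the code uses the standard Fleiss κ formula rather than the exact variant named in the text, and it swaps the two conditions one examination at a time (all clinicians' labels for that examination together). Function names and the toy data are hypothetical.

```python
import random


def fleiss_kappa(ratings):
    """Standard Fleiss kappa for binary labels.

    ratings[i][j] is the 0/1 label given by rater j to examination i.
    Undefined (division by zero) if every label in the table is identical.
    """
    n_items, n_raters = len(ratings), len(ratings[0])
    agreement, positives = 0.0, 0
    for row in ratings:
        pos = sum(row)
        neg = n_raters - pos
        agreement += (pos * (pos - 1) + neg * (neg - 1)) / (n_raters * (n_raters - 1))
        positives += pos
    p_bar = agreement / n_items
    p_pos = positives / (n_items * n_raters)
    p_chance = p_pos ** 2 + (1.0 - p_pos) ** 2
    return (p_bar - p_chance) / (1.0 - p_chance)


def permutation_p_value(without_aug, with_aug, n_permutations=10_000, seed=0):
    """One-tailed permutation test on kappa(with) - kappa(without)."""
    rng = random.Random(seed)
    observed = fleiss_kappa(with_aug) - fleiss_kappa(without_aug)
    higher = 0
    for _ in range(n_permutations):
        perm_without, perm_with = [], []
        for row_without, row_with in zip(without_aug, with_aug):
            # Swap the condition labels of this examination with probability 1/2.
            if rng.random() < 0.5:
                row_without, row_with = row_with, row_without
            perm_without.append(row_without)
            perm_with.append(row_with)
        diff = fleiss_kappa(perm_with) - fleiss_kappa(perm_without)
        if diff > observed:
            higher += 1
    return higher / n_permutations


# Toy data: 20 examinations, 4 raters; augmentation removes all disagreements.
with_aug = [[1] * 4 for _ in range(10)] + [[0] * 4 for _ in range(10)]
without_aug = [list(row) for row in with_aug]
for i in (0, 1, 2):
    without_aug[i][3] = 0
for i in (10, 11, 12):
    without_aug[i][3] = 1
p_value = permutation_p_value(without_aug, with_aug, n_permutations=200)
```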

To control the false discovery rate, the Benjamini-Hochberg correction25 was applied to account for multiple hypothesis testing; a Benjamini-Hochberg–adjusted P ≤ .05 indicated statistical significance. All tests were 1-tailed.
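For reference, the Benjamini-Hochberg adjustment itself is short enough to write out. This is a generic sketch of the step-up procedure, not the authors' code; the function name and the example P values are hypothetical and unrelated to the study's results.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order.

    Each sorted P value p_(k) is scaled by m / k, then a running minimum is
    taken from the largest rank downward so adjusted values stay monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical unadjusted P values for four tests.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
significant = [p <= 0.05 for p in adjusted]
```

In this toy example only the first test remains significant at the .05 level after adjustment.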

Results

The data set contained 818 examinations from 662 unique patients with 328 CTA examinations (40.1%) containing at least 1 intracranial aneurysm and 490 examinations (59.9%) without intracranial aneurysms (Figure 2). Of the 328 aneurysm cases, 20 cases from 15 unique patients contained 2 or more aneurysms. One hundred forty-eight aneurysm cases contained aneurysms between 3 mm and 7 mm, 108 cases had aneurysms between 7 mm and 12 mm, 61 cases had aneurysms between 12 mm and 24 mm, and 11 cases had aneurysms 24 mm or greater. The location of the aneurysms varied according to the following distribution: 99 were located in the internal carotid artery, 78 were in the middle cerebral artery, 50 were cavernous internal carotid artery aneurysms, 44 were basilar tip aneurysms, 41 were in the anterior communicating artery, 18 were in the posterior communicating artery, 16 were in the vertebrobasilar system, and 12 were in the anterior cerebral artery. All examinations were performed either on a GE Discovery, GE LightSpeed, GE Revolution, Siemens Definition, Siemens Sensation, or a Siemens Force scanner, with slice thicknesses of 1.0 mm or 1.25 mm, using standard clinical protocols for head angiogram or head/neck angiogram. There was no difference between the protocols or slice thicknesses between the aneurysm and nonaneurysm examinations. For this study, axial series were extracted from each examination and a segmentation label was produced on every axial slice containing an aneurysm. The number of images per examination ranged from 113 to 802 (mean [SD], 373 [157]).

The examinations were split into a training set of 611 examinations (494 patients; mean [SD] age, 55.8 [18.1] years; 372 [60.9%] female) used to train the model, a development set of 92 examinations (86 patients; mean [SD] age, 61.6 [16.7] years; 59 [64.1%] female) used for model selection, and a test set of 115 examinations (82 patients; mean [SD] age, 57.8 [18.3] years; 74 [64.4%] female) to evaluate the performance of the clinicians when augmented with the model (Figure 2).

Using stratified random sampling, the development and test sets were formed to include 50% aneurysm examinations and 50% normal examinations; the remaining examinations composed the training set, of which 36.5% were aneurysm examinations. Forty-three patients had multiple examinations in the data set due to examinations performed for follow-up of the aneurysm. To account for these repeat patients, examinations were split so that there was no patient overlap between the different sets. Figure 2 contains pathology and patient demographic characteristics for each set.
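The no-overlap constraint can be illustrated with a patient-level split. The sketch below covers only that constraint: it does not reproduce the 50% aneurysm stratification of the development and test sets, and the split fractions, identifiers, and function name are hypothetical.

```python
import random
from collections import defaultdict


def split_by_patient(exam_to_patient, fractions, seed=0):
    """Assign every examination of a patient to the same split.

    exam_to_patient : dict mapping examination id -> patient id
    fractions       : dict mapping split name -> share of patients
                      (the last split receives whatever patients remain)
    """
    patients = sorted(set(exam_to_patient.values()))
    random.Random(seed).shuffle(patients)

    assignment, start = {}, 0
    names = list(fractions)
    for position, name in enumerate(names):
        is_last = position == len(names) - 1
        end = len(patients) if is_last else start + round(fractions[name] * len(patients))
        for patient in patients[start:end]:
            assignment[patient] = name
        start = end

    splits = defaultdict(list)
    for exam, patient in exam_to_patient.items():
        splits[assignment[patient]].append(exam)
    return dict(splits)


# Toy data: 14 examinations from 10 patients, 4 of whom have a follow-up.
exam_to_patient = {f"exam{i}": f"patient{i % 10}" for i in range(14)}
splits = split_by_patient(exam_to_patient, {"train": 0.6, "dev": 0.2, "test": 0.2})
```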

A total of 8 clinicians, including 6 board-certified practicing radiologists, 1 practicing neurosurgeon, and 1 radiology resident, participated as readers in the study. The radiologists’ years of experience ranged from 3 to 12 years, the neurosurgeon had 2 years of experience as attending, and the resident was in the second year of training at Stanford University Medical Center. Groups 1 and 2 consisted of 3 radiologists each; the resident and neurosurgeon were both in group 1. None of the clinicians were involved in establishing the reference standard for the examinations.

Without augmentation, clinicians achieved a microaveraged sensitivity of 0.831 (95% CI, 0.794-0.862), specificity of 0.960 (95% CI, 0.937-0.974), and an accuracy of 0.893 (95% CI, 0.872-0.912). With augmentation, the clinicians achieved a microaveraged sensitivity of 0.890 (95% CI, 0.858-0.915), specificity of 0.975 (95% CI, 0.957-0.986), and an accuracy of 0.932 (95% CI, 0.913-0.946). The underlying model had a sensitivity of 0.949 (95% CI, 0.861-0.983), specificity of 0.661 (95% CI, 0.530-0.771), and accuracy of 0.809 (95% CI, 0.727-0.870). The performances of the model, individual clinicians, and their microaverages are reported in eTable 1 in the Supplement.

With augmentation, there was a statistically significant increase in the mean sensitivity (0.059; 95% CI, 0.028-0.091; adjusted P = .01) and mean accuracy (0.038; 95% CI, 0.014-0.062; adjusted P = .02) of the clinicians as a group. There was no statistically significant change in mean specificity (0.016; 95% CI, −0.010 to 0.041; adjusted P = .16). Performance improvements across clinicians are detailed in the Table, and individual clinician improvement in Figure 3.

Individual performances with and without model augmentation are shown in eTable 1 in the Supplement. The sensitivity analysis confirmed that even among board-certified radiologists, there was a statistically significant increase in mean sensitivity (0.059; 95% CI, 0.013-0.105; adjusted P = .04) and accuracy (0.036; 95% CI, 0.001-0.072; adjusted P = .05). Performance improvements of board-certified radiologists as a group are shown in eTable 2 in the Supplement.

The mean diagnosis time per examination without augmentation microaveraged across clinicians was 57.04 seconds (95% CI, 54.58-59.50 seconds). The times for individual clinicians are detailed in eTable 3 in the Supplement, and individual time changes are shown in eFigure 2 in the Supplement.

With augmentation, there was no statistically significant decrease in mean diagnosis time (5.71 seconds; 95% CI, −7.22 to 18.63 seconds; adjusted P = .19). The model took a mean of 7.58 seconds (95% CI, 6.92-8.25 seconds) to process an examination and output its segmentation map. Confusion matrices, which are tables reporting true- and false-positive results and true- and false-negative results of each clinician with and without model augmentation, are shown in eTable 4 in the Supplement.

There was a statistically significant increase of 0.060 (adjusted P = .05) in the interrater agreement among the clinicians, with an exact Fleiss κ of 0.799 without augmentation and 0.859 with augmentation. For the board-certified radiologists, there was an increase of 0.063 in their interrater agreement, with an exact Fleiss κ of 0.783 without augmentation and 0.847 with augmentation.

Discussion

In this study, the ability of a deep learning model to augment clinician performance in detecting cerebral aneurysms using CTA was investigated with a crossover study design. With model augmentation, clinicians’ sensitivity, accuracy, and interrater agreement significantly increased. There was no statistical change in specificity and time to diagnosis.

Given the potential catastrophic outcome of a missed aneurysm at risk of rupture, an automated detection tool that reliably detects and enhances clinicians’ performance is highly desirable. Aneurysm rupture is fatal in 40% of patients and leads to irreversible neurological disability in two-thirds of those who survive; therefore, an accurate and timely detection is of paramount importance. In addition to significantly improving accuracy across clinicians while interpreting CTA examinations, an automated aneurysm detection tool, such as the one presented in this study, could also be used to prioritize workflow so that those examinations more likely to be positive could receive timely expert review, potentially leading to a shorter time to treatment and more favorable outcomes.

The significant variability among clinicians in the diagnosis of aneurysms has been well documented and is typically attributed to lack of experience or subspecialty neuroradiology training, complex neurovascular anatomy, or the labor-intensive nature of identifying aneurysms. Studies have shown that interrater agreement of CTA-based aneurysm detection is highly variable, with interrater reliability metrics ranging from 0.37 to 0.85,6,7,26-28 and performance levels that vary depending on aneurysm size and individual radiologist experience.4,6 In addition to significantly increasing sensitivity and accuracy, augmenting clinicians with the model also significantly improved interrater reliability from 0.799 to 0.859. This implies that augmenting clinicians with varying levels of experience and specialties with models could lead to more accurate and more consistent radiological interpretations.

Currently, tools to improve clinician aneurysm detection on CTA include bone subtraction,29 as well as 3-D rendering of intracranial vasculature,30-32 which rely on application of contrast threshold settings to better delineate cerebral vasculature and create a 3-D–rendered reconstruction to assist aneurysm detection. However, using these tools is labor- and time-intensive for clinicians; in some institutions, this process is outsourced to a 3-D lab at additional costs. The tool developed in this study, integrated directly in a standard DICOM viewer, produces a segmentation map on a new examination in only a few seconds. If integrated into the standard workflow, this diagnostic tool could substantially decrease both cost and time to diagnosis, potentially leading to more efficient treatment and more favorable patient outcomes.

Deep learning has recently shown success in various clinical image-based recognition tasks. In particular, studies have shown strong performance of 2-D CNNs in detecting intracranial hemorrhage and other acute brain findings, such as mass effect or skull fractures, on CT head examinations.33-36 Recently, one study10 examined the potential role for deep learning in magnetic resonance angiogram–based detection of cerebral aneurysms, and another study37 showed that providing deep learning model predictions to clinicians when interpreting knee magnetic resonance studies increased specificity in detecting anterior cruciate ligament tears. To our knowledge, prior to this study, deep learning had not been applied to CTA, which is the first-line imaging modality for detecting cerebral aneurysms. Our results demonstrate that deep learning segmentation models may produce dependable and interpretable predictions that augment clinicians and improve their diagnostic performance. The model implemented and tested in this study significantly increased sensitivity, accuracy, and interrater reliability of clinicians with varied experience and specialties in detecting cerebral aneurysms using CTA.

Limitations

This study has limitations. First, because the study focused only on nonruptured aneurysms, model performance on aneurysm detection after aneurysm rupture, lesion recurrence after coil or surgical clipping, or aneurysms associated with arteriovenous malformations has not been investigated. Second, since examinations containing surgical hardware or devices were excluded, model performance in their presence is unknown. In a clinical environment, CTA is typically used to evaluate for many types of vascular diseases, not just for aneurysm detection. Therefore, the high prevalence of aneurysm in the test set and the clinician’s binary task could have introduced bias in interpretation. Also, this study was performed on data from a single tertiary care academic institution and may not reflect performance when applied to data from other institutions with different scanners and imaging protocols, such as different slice thicknesses.

Conclusions

A deep learning model was developed to automatically detect clinically significant intracranial aneurysms on CTA. We found that the augmentation significantly improved clinicians’ sensitivity, accuracy, and interrater reliability. Future work should investigate the performance of this model prospectively and in application to data from other institutions and hospitals.

Article Information:

Accepted for Publication: April 23, 2019.

Published: June 7, 2019. doi:10.1001/jamanetworkopen.2019.5600

Open Access: This is an open access article distributed under the terms of the CC-BY License. © 2019 Park A et al. JAMA Network Open.

Corresponding Author: Kristen W. Yeom, MD, School of Medicine, Department of Radiology, Stanford University, 725 Welch Rd, Ste G516, Palo Alto, CA 94304 (kyeom@stanford.edu).

Author Contributions: Ms Park and Dr Yeom had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Ms Park and Messrs Chute and Rajpurkar are co–first authors. Drs Ng and Yeom are co–senior authors.

Concept and design: Park, Chute, Rajpurkar, Lou, Shpanskaya, Ni, Basu, Lungren, Ng, Yeom.

Acquisition, analysis, or interpretation of data: Park, Chute, Rajpurkar, Lou, Ball, Shpanskaya, Jabarkheel, Kim, McKenna, Tseng, Ni, Wishah, Wittber, Hong, Wilson, Halabi, Patel, Lungren, Yeom.

Drafting of the manuscript: Park, Chute, Rajpurkar, Lou, Ball, Jabarkheel, Kim, McKenna, Hong, Halabi, Lungren, Yeom.

Critical revision of the manuscript for important intellectual content: Park, Chute, Rajpurkar, Ball, Shpanskaya, Jabarkheel, Kim, Tseng, Ni, Wishah, Wittber, Wilson, Basu, Patel, Lungren, Ng, Yeom.

Statistical analysis: Park, Chute, Rajpurkar, Lou, Ball, Lungren.

Administrative, technical, or material support: Park, Chute, Shpanskaya, Jabarkheel, Kim, McKenna, Tseng, Wittber, Hong, Wilson, Lungren, Ng, Yeom.

Supervision: Park, Ball, Tseng, Halabi, Basu, Lungren, Ng, Yeom.

Conflict of Interest Disclosures: Drs Wishah and Patel reported grants from GE and Siemens outside the submitted work. Dr Patel reported participation in the speakers bureau for GE. Dr Lungren reported personal fees from Nines Inc outside the submitted work. Dr Yeom reported grants from Philips outside the submitted work. No other disclosures were reported.

Funding/Support: This work was supported by National Institutes of Health National Center for Advancing Translational Science Clinical and Translational Science Award UL1TR001085.

Role of the Funder/Sponsor: The National Institutes of Health had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

Disclaimer: The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

References

1.Jaja  BN, Cusimano  MD, Etminan  N,  et al.  Clinical prediction models for aneurysmal subarachnoid hemorrhage: a systematic review.  Neurocrit Care. 2013;18(1):143-153. doi:10.1007/s12028-012-9792-zPubMedGoogle ScholarCrossref
2.Turan  N, Heider  RA, Roy  AK,  et al.  Current perspectives in imaging modalities for the assessment of unruptured intracranial aneurysms: a comparative analysis and review.  World Neurosurg. 2018;113:280-292. doi:10.1016/j.wneu.2018.01.054PubMedGoogle ScholarCrossref
3.Yoon  NK, McNally  S, Taussky  P, Park  MS.  Imaging of cerebral aneurysms: a clinical perspective.  Neurovasc Imaging. 2016;2(1):6. doi:10.1186/s40809-016-0016-3Google ScholarCrossref
4.Jayaraman  MV, Mayo-Smith  WW, Tung  GA,  et al.  Detection of intracranial aneurysms: multi-detector row CT angiography compared with DSA.  Radiology. 2004;230(2):510-518. doi:10.1148/radiol.2302021465PubMedGoogle ScholarCrossref
5.Bharatha  A, Yeung  R, Durant  D,  et al.  Comparison of computed tomography angiography with digital subtraction angiography in the assessment of clipped intracranial aneurysms.  J Comput Assist Tomogr. 2010;34(3):440-445. doi:10.1097/RCT.0b013e3181d27393PubMedGoogle ScholarCrossref
6.Lubicz  B, Levivier  M, François  O,  et al.  Sixty-four-row multisection CT angiography for detection and evaluation of ruptured intracranial aneurysms: interobserver and intertechnique reproducibility.  AJNR Am J Neuroradiol. 2007;28(10):1949-1955. doi:10.3174/ajnr.A0699PubMedGoogle ScholarCrossref
7.White  PM, Teasdale  EM, Wardlaw  JM, Easton  V.  Intracranial aneurysms: CT angiography and MR angiography for detection prospective blinded comparison in a large patient cohort.  Radiology. 2001;219(3):739-749. doi:10.1148/radiology.219.3.r01ma16739PubMedGoogle ScholarCrossref
8.Suzuki  K.  Overview of deep learning in medical imaging.  Radiol Phys Technol. 2017;10(3):257-273. doi:10.1007/s12194-017-0406-5PubMedGoogle ScholarCrossref
9.Rajpurkar  P, Irvin  J, Ball  RL,  et al.  Deep learning for chest radiograph diagnosis: a retrospective comparison of the CheXNeXt algorithm to practicing radiologists.  PLoS Med. 2018;15(11):e1002686. doi:10.1371/journal.pmed.1002686PubMedGoogle ScholarCrossref
10.Bien  N, Rajpurkar  P, Ball  RL,  et al.  Deep-learning-assisted diagnosis for knee magnetic resonance imaging: development and retrospective validation of MRNet.  PLoS Med. 2018;15(11):e1002699. doi:10.1371/journal.pmed.1002699PubMedGoogle ScholarCrossref
11.Morita  A, Kirino  T, Hashi  K,  et al; UCAS Japan Investigators.  The natural course of unruptured cerebral aneurysms in a Japanese cohort.  N Engl J Med. 2012;366(26):2474-2482. doi:10.1056/NEJMoa1113260PubMedGoogle ScholarCrossref
12.Yushkevich  PA, Piven  J, Hazlett  HC,  et al.  User-guided 3D active contour segmentation of anatomical structures: significantly improved efficiency and reliability.  Neuroimage. 2006;31(3):1116-1128. doi:10.1016/j.neuroimage.2006.01.015PubMedGoogle ScholarCrossref
13.He  K, Zhang  X, Ren  S, Sun  J. Deep residual learning for image recognition. Paper presented at: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; June 27, 2016; Las Vegas, NV.
14.Xie  S, Girshick  R, Dollár  P, Tu  Z, He  K. Aggregated residual transformations for deep neural networks. Paper presented at: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); July 25, 2017; Honolulu, HI.
15.Hu  J, Shen  L, Sun  G. Squeeze-and-excitation networks. Paper presented at: 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); June 21, 2018; Salt Lake City, Utah.
16.Ronneberger  O, Fischer  P, Brox  T. U-net: Convolutional networks for biomedical image segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention. Basel, Switzerland: Springer International; 2015:234–241.
17.Carreira  J, Zisserman  A. Quo vadis, action recognition? a new model and the kinetics dataset. Paper presented at: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); July 25, 2017; Honolulu, HI.
18.Chen  L-C, Papandreou  G, Schroff  F, Adam  H. Rethinking atrous convolution for semantic image segmentation. https://arxiv.org/abs/1706.05587. Published June 17, 2017. Accessed May 7, 2019.
19.Milletari  F, Navab  N, Ahmadi  S-A. V-net: Fully convolutional neural networks for volumetric medical image segmentation. Paper presented at: 2016 IEEE Fourth International Conference on 3D Vision (3DV); October 26-28, 2016; Stanford, CA.
20.Loshchilov  I, Hutter  F. Sgdr: Stochastic gradient descent with warm restarts. Paper presented at: 2017 Fifth International Conference on Learning Representations; April 24-26, 2017; Toulon, France.
21.Huang  G, Sun  Y, Liu  Z, Sedra  D, Weinberger  KQ. Deep networks with stochastic depth. European Conference on Computer Vision. Basel, Switzerland: Springer International; 2016:646–661.
22.Horos. https://horosproject.org. Accessed May 1, 2019.
23.Wilson  EB.  Probable inference, the law of succession, and statistical inference.  J Am Stat Assoc. 1927;22(158):209-212. doi:10.1080/01621459.1927.10502953
24.Fleiss  JL, Cohen  J.  The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability.  Educ Psychol Meas. 1973;33(3):613-619. doi:10.1177/001316447303300309
25.Benjamini  Y, Hochberg  Y.  Controlling the false discovery rate: a practical and powerful approach to multiple testing.  J R Stat Soc Series B Stat Methodol. 1995;57(1):289-300.
26.Maldaner  N, Stienen  MN, Bijlenga  P,  et al.  Interrater agreement in the radiologic characterization of ruptured intracranial aneurysms based on computed tomography angiography.  World Neurosurg. 2017;103:876-882.e1. doi:10.1016/j.wneu.2017.04.131
27.Wang  Y, Gao  X, Lu  A,  et al.  Residual aneurysm after metal coils treatment detected by spectral CT.  Quant Imaging Med Surg. 2012;2(2):137-138.
28.Yoon  YW, Park  S, Lee  SH,  et al.  Post-traumatic myocardial infarction complicated with left ventricular aneurysm and pericardial effusion.  J Trauma. 2007;63(3):E73-E75. doi:10.1097/01.ta.0000246896.89156.70
29.Tomandl  BF, Hammen  T, Klotz  E, Ditt  H, Stemper  B, Lell  M.  Bone-subtraction CT angiography for the evaluation of intracranial aneurysms.  AJNR Am J Neuroradiol. 2006;27(1):55-59.
30.Shi  W-Y, Li  Y-D, Li  M-H,  et al.  3D rotational angiography with volume rendering: the utility in the detection of intracranial aneurysms.  Neurol India. 2010;58(6):908-913. doi:10.4103/0028-3886.73743
31.Lin  N, Ho  A, Gross  BA,  et al.  Differences in simple morphological variables in ruptured and unruptured middle cerebral artery aneurysms.  J Neurosurg. 2012;117(5):913-919. doi:10.3171/2012.7.JNS111766
32.Villablanca  JP, Jahan  R, Hooshi  P,  et al.  Detection and characterization of very small cerebral aneurysms by using 2D and 3D helical CT angiography.  AJNR Am J Neuroradiol. 2002;23(7):1187-1198.
33.Chang  PD, Kuoy  E, Grinband  J,  et al.  Hybrid 3D/2D convolutional neural network for hemorrhage evaluation on head CT.  AJNR Am J Neuroradiol. 2018;39(9):1609-1616. doi:10.3174/ajnr.A5742
34.Chilamkurthy  S, Ghosh  R, Tanamala  S,  et al.  Deep learning algorithms for detection of critical findings in head CT scans: a retrospective study.  Lancet. 2018;392(10162):2388-2396. doi:10.1016/S0140-6736(18)31645-3
35.Jnawali  K, Arbabshirani  MR, Rao  N, Patel  AA. Deep 3D convolution neural network for CT brain hemorrhage classification. Paper presented at: Medical Imaging 2018: Computer-Aided Diagnosis. February 27, 2018; Houston, TX. doi:10.1117/12.2293725
36.Titano  JJ, Badgeley  M, Schefflein  J,  et al.  Automated deep-neural-network surveillance of cranial images for acute neurologic events.  Nat Med. 2018;24(9):1337-1341. doi:10.1038/s41591-018-0147-y
37.Ueda  D, Yamamoto  A, Nishimori  M,  et al.  Deep learning for MR angiography: automated detection of cerebral aneurysms.  Radiology. 2019;290(1):187-194.



Applying AI to Improve Interpretation of Medical Imaging

Author and Curator: Dror Nir, PhD

 

 


The idea that we can use machine intelligence to help us perform daily tasks is no longer alien. As a consequence, the application of AI to improve the assessment of patients’ clinical condition is booming. What used to be the field of daring start-ups has now become a playground for the tech giants: Google, Amazon, Microsoft and IBM.

Interpretation of medical imaging involves standardised workflows and requires analysis of many data items. It is also well established that human subjectivity is a barrier to the reproducibility and transferability of medical-imaging results (as evidenced by reports of high inter-operator variability in image interpretation). Accepting that computers are better suited than humans to routine, repeated tasks involving “big-data” analysis makes AI a very good candidate to improve this situation. Google’s vision in that respect: “Machine learning has dozens of possible application areas, but healthcare stands out as a remarkable opportunity to benefit people – and working closely with clinicians and medical providers, we’re developing tools that we hope will dramatically improve the availability and accuracy of medical services.”

Google’s commitment to this vision is evident in its TensorFlow initiative. “TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML powered applications.” Two recent papers describe at length the use of TensorFlow in retrospective studies (supported by Google AI) in which medical images (from publicly accessible databases) were used:

Prediction of cardiovascular risk factors from retinal fundus photographs via deep learning, Nature Biomedical Engineering, Authors: Ryan Poplin, Avinash V. Varadarajan, Katy Blumer, Yun Liu, Michael V. McConnell, Greg S. Corrado, Lily Peng, and Dale R. Webster

This paper is a very interesting demonstration of the benefits expected from using AI to interpret medical imaging. The authors show how they extracted information relevant to assessing the risk of an adverse cardiac event from retinal fundus images collected while managing an entirely different medical condition: “Using deep-learning models trained on data from 284,335 patients and validated on two independent datasets of 12,026 and 999 patients, we predicted cardiovascular risk factors not previously thought to be present or quantifiable in retinal images, such as age (mean absolute error within 3.26 years), gender (area under the receiver operating characteristic curve (AUC) = 0.97), smoking status (AUC = 0.71), systolic blood pressure (mean absolute error within 11.23 mmHg) and major adverse cardiac events (AUC = 0.70).”
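
For readers less familiar with the two metrics quoted above, here is a minimal Python sketch of how mean absolute error (MAE) and the area under the ROC curve (AUC) are commonly computed. It is an illustration only: the function names and the toy numbers are assumptions made for this example and are not taken from the paper. The AUC uses the pairwise-ranking definition (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one), which is one standard way of computing it.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between predictions and reference values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy values, invented for illustration (not data from the paper).
toy_age_true = [50.0, 62.0, 47.0, 71.0]   # reference ages in years
toy_age_pred = [53.0, 60.0, 49.0, 68.0]   # model-predicted ages in years
toy_labels = [0, 0, 1, 1]                 # 1 = event occurred, 0 = no event
toy_scores = [0.1, 0.4, 0.35, 0.8]        # model risk scores

print(mean_absolute_error(toy_age_true, toy_age_pred))  # 2.5 (years)
print(roc_auc(toy_labels, toy_scores))                  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which puts the reported 0.97 for gender and 0.70 for major adverse cardiac events in context.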

 


Clearly, if such an algorithm were implemented as a generalised and transferable medical device for use in routine practice, it would contribute to the cost-effectiveness of screening programs.

 

End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography, Nature Medicine, Authors: Diego Ardila, Atilla P. Kiraly, Sujeeth Bharadwaj, Bokyung Choi, Joshua J. Reicher, Lily Peng, Daniel Tse , Mozziyar Etemadi, Wenxing Ye, Greg Corrado, David P. Naidich and Shravya Shetty.

This paper is in line with many previously published works demonstrating how AI can increase the accuracy of cancer diagnosis compared with the current state of the art: “Existing challenges include inter-grader variability and high false-positive and false-negative rates. We propose a deep learning algorithm that uses a patient’s current and prior computed tomography volumes to predict the risk of lung cancer. Our model achieves a state-of-the art performance (94.4% area under the curve) on 6,716 National Lung Cancer Screening Trial cases, and performs similarly on an independent clinical validation set of 1,139 cases.”


The benefit of using an AI-based application for lung cancer screening (if and when such an algorithm is implemented as a generalised and transferable medical device) is well summarised by the authors: “The strong performance of the model at the case level has important potential clinical relevance. The observed increase in specificity could translate to fewer unnecessary follow up procedures. Increased sensitivity in cases without priors could translate to fewer missed cancers in clinical practice, especially as more patients begin screening. For patients with prior imaging exams, the performance of the deep learning model could enable gains in workflow efficiency and consistency as assessment of prior imaging is already a key component of a specialist’s workflow. Given that LDCT screening is in the relatively early phases of adoption, the potential for considerable improvement in patient care in the coming years is substantial. The model’s localization directs follow-up for specific lesion(s) of greatest concern. These predictions are critical for patients proceeding for further work-up and treatment, including diagnostic CT, positron emission tomography (PET)/CT or biopsy. Malignancy risk prediction allows for the possibility of augmenting existing, manually created interpretation guidelines such as Lung-RADS, which are limited to subjective clustering and assessment to approximate cancer risk.”
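
To make the terms sensitivity and specificity in the quote above concrete, the following Python sketch computes both from binary reference labels and binary model calls. It is an illustration under stated assumptions: the function name and the toy labels are hypothetical and unrelated to the study data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    reference: Sequence[int], calls: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, 1 = cancer."""
    if len(reference) != len(calls):
        raise ValueError("reference and calls must have the same length")
    tp = sum(1 for r, c in zip(reference, calls) if r == 1 and c == 1)
    fn = sum(1 for r, c in zip(reference, calls) if r == 1 and c == 0)
    tn = sum(1 for r, c in zip(reference, calls) if r == 0 and c == 0)
    fp = sum(1 for r, c in zip(reference, calls) if r == 0 and c == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # share of true cancers that were flagged
    specificity = tn / (tn + fp)  # share of cancer-free cases left unflagged
    return sensitivity, specificity


# Hypothetical toy screening round, invented for illustration.
toy_reference = [1, 1, 1, 0, 0, 0, 0, 0]  # confirmed outcome
toy_calls = [1, 1, 0, 0, 0, 0, 0, 1]      # model decision at a fixed threshold

sens, spec = sensitivity_specificity(toy_reference, toy_calls)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy case one cancer is missed (sensitivity 2 of 3) and one cancer-free case is flagged (specificity 4 of 5). Higher specificity means fewer such false alarms, and hence fewer unnecessary follow-up procedures.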

BTW: The methods sections of these two papers are detailed enough to allow any interested party to reproduce the studies.

For the sake of balance-of-information, I would like to note that:

  • Amazon is encouraging access to its AI platform, Amazon SageMaker: “Amazon SageMaker provides every developer and data scientist with the ability to build, train, and deploy machine learning models quickly. Amazon SageMaker is a fully-managed service that covers the entire machine learning workflow to label and prepare your data, choose an algorithm, train the model, tune and optimize it for deployment, make predictions, and take action. Your models get to production faster with much less effort and lower cost.” Amazon is also offering training courses to help programmers gain proficiency in machine learning on its AWS platform: “We offer 30+ digital ML courses totaling 45+ hours, plus hands-on labs and documentation, originally developed for Amazon’s internal use. Developers, data scientists, data platform engineers, and business decision makers can use this training to learn how to apply ML, artificial intelligence (AI), and deep learning (DL) to their businesses unlocking new insights and value. Validate your learning and your years of experience in machine learning on AWS with a new certification.”
  • IBM is offering a general-purpose AI platform named Watson. Watson is also promoted as a platform for developing AI applications in the “health” sector, with the following positioning: “IBM Watson Health applies data-driven analytics, advisory services and advanced technologies such as AI, to deliver actionable insights that can help you free up time to care, identify efficiencies, and improve population health.”
  • Microsoft is offering its AI platform as a tool to accelerate the development of AI solutions. It is also offering an AI school: “Dive in and learn how to start building intelligence into your solutions with the Microsoft AI platform, including pre-trained AI services like Cognitive Services and Bot Framework, as well as deep learning tools like Azure Machine Learning, Visual Studio Code Tools for AI and Cognitive Toolkit. Our platform enables any developer to code in any language and infuse AI into your apps. Whether your solutions are existing or new, this is the intelligence platform to build on.”
