Showcase: How Deep Learning could help radiologists spend their time more efficiently
Reporter and Curator: Dror Nir, PhD
3.5.2.3 Showcase: How Deep Learning could help radiologists spend their time more efficiently, Volume 2 (Volume Two: Latest in Genomics Methodologies for Therapeutics: Gene Editing, NGS and BioInformatics, Simulations and the Genome Ontology), Part 3: AI in Medicine
The debate over the role AI could or should play in modern radiology is lively, spanning a wide spectrum of positive expectations as well as fears.
The article "A Deep Learning Model to Triage Screening Mammograms: A Simulation Study," published this month, shows the best, and very much feasible, use of AI in radiology at the present time. It would be of great benefit to radiologists and patients if such applications were incorporated (with all safety precautions taken) into routine practice as soon as possible.
Background:
Mammography is the main imaging modality used in breast cancer screening. Despite its benefits, challenges include variation in interpretive performance and the scarcity of specialized radiologists. A recent report of mammography screening performance in U.S. community practice demonstrated that radiologists’ diagnostic performance ranged from 66.7% to 98.6% for sensitivity and from 71.2% to 96.9% for specificity. False-negative examinations can result in delayed diagnosis, and false-positive examinations can lead to unnecessary procedures, impacting both patient experience and overall costs. Moreover, the availability of specialized radiologists to serve the global population of women eligible for breast cancer screening is limited by workflow inefficiencies (see article for references).
The authors of this article hypothesized that a deep learning model trained to triage mammograms as cancer free could save radiologists time and increase the specificity of their readings without harming sensitivity. They trained a model to predict cancer directly from full-resolution mammograms and chose a high-sensitivity threshold to identify a subset of cancer-free patients with near-perfect accuracy. The study simulates a scenario in which all patient examinations below this threshold are interpreted as negative for cancer and those above the threshold are read by radiologists who specialize in breast imaging.
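To make the triage idea concrete, here is a minimal Python sketch of the two steps described above: picking a score threshold that preserves a chosen sensitivity on a validation set, then simulating a workflow in which exams below the threshold are called negative and the rest keep the radiologist's original call. The function names, the `target_sensitivity` parameter, and the toy numbers are illustrative assumptions, not the authors' code or their actual threshold-selection procedure.

```python
import numpy as np

def pick_triage_threshold(val_scores, val_labels, target_sensitivity):
    """Hypothetical helper: largest threshold that keeps at least
    `target_sensitivity` of the validation cancers at or above it."""
    scores = np.asarray(val_scores, dtype=float)
    labels = np.asarray(val_labels)
    cancer_scores = np.sort(scores[labels == 1])  # ascending
    # Number of cancers we can afford to let fall below the threshold.
    n_missed_allowed = int(np.floor((1 - target_sensitivity) * len(cancer_scores)))
    return cancer_scores[n_missed_allowed]

def simulate_triage(scores, radiologist_calls, threshold):
    """Exams scoring below `threshold` are interpreted as negative (0);
    all others keep the radiologist's original call (0 or 1)."""
    scores = np.asarray(scores, dtype=float)
    calls = np.asarray(radiologist_calls)
    read_by_radiologist = scores >= threshold
    final_calls = np.where(read_by_radiologist, calls, 0)
    fraction_read = read_by_radiologist.mean()
    return final_calls, fraction_read

# Toy data (made up for illustration only).
val_scores = [0.1, 0.2, 0.3, 0.6, 0.8, 0.9]
val_labels = [0, 1, 0, 1, 1, 1]
threshold = pick_triage_threshold(val_scores, val_labels, target_sensitivity=0.75)

final_calls, fraction_read = simulate_triage(
    scores=[0.05, 0.7, 0.5, 0.95],
    radiologist_calls=[1, 0, 1, 1],
    threshold=threshold,
)
```

In this toy run two of the four exams fall below the threshold and are automatically called negative, which removes two radiologist recalls; whether those recalls were false positives or real cancers is exactly what the sensitivity and specificity comparison in the study measures.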
Cited from Radiology website:
Abstract
In a simulation study, a deep learning model to triage mammograms as cancer free improves workflow efficiency and significantly improves specificity while maintaining a noninferior sensitivity.
Background
Recent deep learning (DL) approaches have shown promise in improving sensitivity but have not addressed limitations in radiologist specificity or efficiency.
Purpose
To develop a DL model to triage a portion of mammograms as cancer free, improving performance and workflow efficiency.
Materials and Methods
In this retrospective study, 223 109 consecutive screening mammograms performed in 66 661 women from January 2009 to December 2016 were collected with cancer outcomes obtained through linkage to a regional tumor registry. This cohort was split by patient into 212 272, 25 999, and 26 540 mammograms from 56 831, 7021, and 7176 patients for training, validation, and testing, respectively. A DL model was developed to triage mammograms as cancer free and evaluated on the test set. A DL-triage workflow was simulated in which radiologists skipped mammograms triaged as cancer free (interpreting them as negative for cancer) and read mammograms not triaged as cancer free by using the original interpreting radiologists’ assessments. Sensitivities, specificities, and percentage of mammograms read were calculated, with and without the DL-triage–simulated workflow. Statistics were computed across 5000 bootstrap samples to assess confidence intervals (CIs). Specificities were compared by using a two-tailed t test (P < .05) and sensitivities were compared by using a one-sided t test with a noninferiority margin of 5% (P < .05).
Results
The test set included 7176 women (mean age, 57.8 years ± 10.9 [standard deviation]). When reading all mammograms, radiologists obtained a sensitivity and specificity of 90.6% (173 of 191; 95% CI: 86.6%, 94.7%) and 93.5% (24 625 of 26 349; 95% CI: 93.3%, 93.9%). In the DL-simulated workflow, the radiologists obtained a sensitivity and specificity of 90.1% (172 of 191; 95% CI: 86.0%, 94.3%) and 94.2% (24 814 of 26 349; 95% CI: 94.0%, 94.6%) while reading 80.7% (21 420 of 26 540) of the mammograms. The simulated workflow improved specificity (P = .002) and obtained a noninferior sensitivity with a margin of 5% (P < .001).
Conclusion
This deep learning model has the potential to reduce radiologist workload and significantly improve specificity without harming sensitivity.
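The percentages quoted in the Results above can be re-derived from the reported counts, and the confidence intervals can be approximated with a simple bootstrap. The Python sketch below does both. It assumes a percentile bootstrap over individual exams, which may differ from the exact procedure the authors used; the helper names are hypothetical.

```python
import numpy as np

# (numerator, denominator) pairs as reported in the abstract's Results.
REPORTED = {
    "sensitivity, all read": (173, 191),
    "specificity, all read": (24625, 26349),
    "sensitivity, DL triage": (172, 191),
    "specificity, DL triage": (24814, 26349),
    "mammograms read, DL triage": (21420, 26540),
}

def percent(numerator, denominator):
    """Percentage rounded to one decimal place, as in the abstract."""
    return round(100.0 * numerator / denominator, 1)

def bootstrap_ci(successes, total, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for a proportion.

    Resampling `total` binary outcomes with replacement is equivalent to
    drawing the number of successes from a binomial distribution, so the
    binomial draw is used here to keep memory use small.
    """
    rng = np.random.default_rng(seed)
    rates = rng.binomial(total, successes / total, size=n_boot) / total
    lo, hi = np.percentile(rates, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi

point_estimates = {name: percent(n, d) for name, (n, d) in REPORTED.items()}

# Quantities derived directly from the reported counts.
fewer_false_positives = 24814 - 24625  # 189 fewer false-positive reads
exams_skipped = 26540 - 21420          # 5120 exams triaged as cancer free
extra_missed_cancers = 173 - 172       # 1 additional missed cancer

sens_lo, sens_hi = bootstrap_ci(173, 191)

for name, value in point_estimates.items():
    print(f"{name}: {value}%")
print(f"bootstrap 95% CI for baseline sensitivity: {sens_lo:.3f} to {sens_hi:.3f}")
```

Read this way, the trade-off in the simulation is easy to state: skipping 5120 of 26 540 exams came at the cost of one additional missed cancer out of 191, while removing 189 false-positive reads.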