In this issue of our ongoing series, I want to take a closer look at several studies on the impact of artificial intelligence methods in musculoskeletal radiology, mostly for fracture detection. These articles have been selected for review based on their clinical significance and citation indexes of the journals. And they intrigue me. Why?
These studies, which include systematic reviews and meta-analyses, suggest that artificial intelligence systems can be trained to detect and classify wrist, hand, ankle, hip, proximal humerus, rib and thoracolumbar spine fractures on radiographs with a diagnostic accuracy comparable to that of radiologists.
With AI assistance, the improvement in reader sensitivity was significant at all anatomical locations except the shoulder, clavicle, and thoracolumbar spine. More subtle fractures (such as non-displaced femoral neck fractures or scaphoid fractures) require further study because artificial intelligence models may be less accurate for them. Even if the reduction in reading time is only a few seconds per radiographic examination, it can result in significant time savings for radiologists, who may read 200-300 radiographs per day.
Nevertheless, before AI algorithms can be transferred into routine practice, they must be externally validated in a prospective study representing a relevant sample of patients. Finally, as with other clinical prediction models, AI systems should be evaluated in the context of randomised clinical trials to assess their impact on patient-centered outcomes. But these challenges remain to be tackled.
1. Kuo RYL et al. Artificial Intelligence in Fracture Detection: A Systematic Review and Meta-Analysis. Radiology. 2022 Jul;304(1):50-62.
Fractures are a common reason for emergency presentation and can be missed on radiologic imaging. In fact, between 3% and 10% of patients are likely to experience a missed or delayed diagnosis of fractures on radiography. An increasing number of studies are using artificial intelligence (AI) techniques to detect fractures as an adjunct to clinical diagnosis. In a systematic review and meta-analysis of 42 studies (37 with radiography and five with computed tomography), Kuo et al. found that AI detected fractures with a pooled sensitivity of 92% and 91% and a pooled specificity of 91% and 91% at internal and external validation, respectively.
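For readers less familiar with these metrics, here is a minimal Python sketch of how sensitivity and specificity are derived from the four cells of a confusion matrix. The function name and the counts are hypothetical and chosen only to reproduce round percentages; they are not data from Kuo et al., and a meta-analysis pools such estimates across studies with statistical models that this sketch does not attempt.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: fractures correctly flagged      fn: fractures missed
    tn: normal exams correctly cleared   fp: normal exams wrongly flagged
    """
    sensitivity = tp / (tp + fn)  # share of true fractures that were detected
    specificity = tn / (tn + fp)  # share of fracture-free exams correctly cleared
    return sensitivity, specificity


# Hypothetical counts for illustration only (100 fractures, 100 normal exams).
sens, spec = sensitivity_specificity(tp=92, fn=8, tn=91, fp=9)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")
```

With these made-up counts, 8 of 100 fractures are missed and 9 of 100 normal examinations are wrongly flagged, which is the kind of trade-off the pooled figures summarise.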
The included studies analysed proximal femur, vertebral, proximal humerus, distal radius, scaphoid, calcaneal, and supracondylar or lateral condyle elbow fractures. The performance of clinicians was comparable to that of AI in detecting fractures (sensitivity 91%, 92%; specificity 94%, 94%). There were no statistically significant differences between clinician and AI performance. But the studies have several limitations.
Transparent reporting is necessary so that readers can assess the quality of a study and judge whether its results apply to the intended population. For example, 17 studies did not report the proportion of male and female participants, and 15 did not report the age of participants.
Conclusion: The results from this meta-analysis cautiously suggest that AI is not inferior to clinicians in terms of diagnostic performance in fracture detection, showing promise as a useful diagnostic tool. But many studies have limited real-world applicability because of flawed methods or unrepresentative data sets. As a result, future research must prioritise pragmatic algorithm development.
2. Guermazi A et al. Improving Radiographic Fracture Recognition Performance and Efficiency Using Artificial Intelligence. Radiology. 2022 Mar;302(3):627-636.
According to Guermazi's retrospective analysis, fracture interpretation errors can account for up to 24% of the harmful diagnostic errors encountered in emergency departments.
Inconsistencies in radiographic fracture diagnosis are more common during the evening and night hours. The purpose of this study was to assess the impact of AI assistance on physicians' diagnostic performance when diagnosing fractures on radiographs. In a retrospective study of 480 patients, AI-assisted interpretation of radiographs by six types of readers improved fracture detection sensitivity by 10.4 percentage points (75.2% vs. 64.8%) with no decrease in specificity. AI assistance also reduced radiograph reading time by 6.3 seconds per patient. According to the authors, a major benefit that AI can bring to clinical practice, particularly in the acute care setting, is its potential to function as a triage system in busy medical centers. Another benefit is the shorter reading time: even if the reduction is only a few seconds per radiographic examination, it can result in significant time savings for radiologists, who may read 200-300 radiographs per day. AI assistance can also be helpful for the detection of non-obvious or subtle fractures.
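To put the time saving in perspective, the short Python sketch below multiplies the reported 6.3 seconds by a daily workload of 200 to 300 radiographs. It assumes, for simplicity, that the per-patient saving applies equally to every examination read in a day; the function and constant names are illustrative and not taken from the study.

```python
SECONDS_SAVED_PER_EXAM = 6.3  # reading-time reduction reported by Guermazi et al.


def daily_minutes_saved(exams_per_day, seconds_saved=SECONDS_SAVED_PER_EXAM):
    """Estimate minutes saved per day, assuming a constant saving per exam."""
    return exams_per_day * seconds_saved / 60


# Workload range quoted in the text: 200 to 300 radiographs per day.
for exams in (200, 300):
    print(f"{exams} exams/day: about {daily_minutes_saved(exams):.1f} minutes saved")
```

Under this assumption the saving works out to roughly 21 to 31.5 minutes per reader per day.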
It should be noted that this study had some limitations.
Conclusion: According to the authors, artificial intelligence assistance for detecting skeletal fractures on radiographs improved the sensitivity and specificity of readers and shortened their reading time. The improvement in sensitivity was significant at all anatomical locations except the shoulder, clavicle, and thoracolumbar spine. In fact, the stand-alone AI outperformed human readers for the detection of rib and thoracolumbar spine fractures.
3. Langerhuizen DWG et al. What Are the Applications and Limitations of Artificial Intelligence for Fracture Detection and Classification in Orthopaedic Trauma Imaging? A Systematic Review. Clin Orthop Relat Res. 2019 Nov;477(11):2482-2491.
A systematic review by Langerhuizen et al. raised more specific questions. What is the proportion of correctly detected or classified fractures and the area under the receiver operating characteristic curve (AUC) of AI fracture detection and classification models? And what is the performance of AI in this setting compared with the performance of human examiners?
For fracture detection, the AUC in five studies reflected near-perfect prediction (range, 0.95-1.0), and the accuracy in seven studies ranged from 83% to 98%. For fracture classification, the AUC was 0.94 in one study, and the accuracy in two studies ranged from 77% to 90%. Langerhuizen’s review showed that AI is very good for detecting common fractures.
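One way to read an AUC value is as the probability that a randomly chosen fracture case receives a higher model score than a randomly chosen non-fracture case. The Python sketch below computes it that way by comparing every pair; the function name and the scores are invented for illustration and do not come from any study in the review.

```python
def pairwise_auc(scores_fracture, scores_no_fracture):
    """AUC as the fraction of (fracture, no-fracture) pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    wins = 0.0
    for pos in scores_fracture:
        for neg in scores_no_fracture:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_fracture) * len(scores_no_fracture))


# Hypothetical model scores (higher means "more likely a fracture").
fracture_scores = [0.9, 0.8, 0.7, 0.4]
no_fracture_scores = [0.6, 0.3, 0.2, 0.1]
print(pairwise_auc(fracture_scores, no_fracture_scores))  # 15 of 16 pairs -> 0.9375
```

In this toy example only one pair is ranked the wrong way round (the fracture scored 0.4 against the normal exam scored 0.6). An AUC of 1.0 would mean every fracture case outranks every normal case, and 0.5 would be no better than chance.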
Conclusion: AI may enhance processing and communicating probabilistic tasks in medicine, including orthopaedic surgery. AI outperformed human examiners for detecting and classifying hip and proximal humerus fractures and showed equivalent performance for detecting wrist, hand and ankle fractures. More subtle fractures (such as non-displaced femoral neck fractures or scaphoid fractures) require further study because artificial intelligence models may be less accurate for them. At present, inadequate reference standard assignment for training and testing AI is the biggest hurdle before integration into clinical workflows.
List of publications:
Kuo RYL, Harrison C, Curran TA, Jones B, Freethy A, Cussons D, Stewart M, Collins GS, Furniss D. Artificial Intelligence in Fracture Detection: A Systematic Review and Meta-Analysis. Radiology. 2022 Jul;304(1):50-62. doi: 10.1148/radiol.211785. Epub 2022 Mar 29. PMID: 35348381; PMCID: PMC9270679.
Guermazi A, Tannoury C, Kompel AJ, Murakami AM, Ducarouge A, Gillibert A, Li X, Tournier A, Lahoud Y, Jarraya M, Lacave E, Rahimi H, Pourchot A, Parisien RL, Merritt AC, Comeau D, Regnard NE, Hayashi D. Improving Radiographic Fracture Recognition Performance and Efficiency Using Artificial Intelligence. Radiology. 2022 Mar;302(3):627-636. doi: 10.1148/radiol.210937. Epub 2021 Dec 21. PMID: 34931859.
Langerhuizen DWG, Janssen SJ, Mallee WH, van den Bekerom MPJ, Ring D, Kerkhoffs GMMJ, Jaarsma RL, Doornberg JN. What Are the Applications and Limitations of Artificial Intelligence for Fracture Detection and Classification in Orthopaedic Trauma Imaging? A Systematic Review. Clin Orthop Relat Res. 2019 Nov;477(11):2482-2491. doi: 10.1097/CORR.0000000000000848. PMID: 31283727; PMCID: PMC6903838.
📸 image credit to DALL-E. Our search: "two meeting arms with stretched fingers xray with artificial intelligence in michelangelo style"