Research Summary

AI Matches Orthopedic Surgeons in Diagnosing Distal Radius Fractures on Wrist Radiographs

Key Highlights:

  • Artificial intelligence (AI)-guided detection of distal radius fractures (DRFs) on wrist radiographs demonstrated diagnostic accuracy comparable to that of an experienced orthopedic surgeon.
  • On anteroposterior radiographs, the AI achieved 95.9% accuracy, 92.0% sensitivity, and 98.5% specificity.
  • On lateral views, the AI attained 94.8% accuracy, with slightly lower sensitivity (89.8%) and high specificity (98.3%).

A retrospective study evaluated the performance of an artificial intelligence (AI) system, BoneView v2.5.1, in detecting distal radius fractures (DRFs) on plain wrist radiographs compared with an experienced orthopedic surgeon. Across all key performance metrics—accuracy, sensitivity, specificity, F1 score, and Cohen’s kappa—the AI model performed nearly identically to the human expert.

Distal radius fractures are among the most commonly encountered fractures in emergency and orthopedic care. Accurate diagnosis is critical, as delayed or missed detection can result in poor functional outcomes. Previous studies on AI-based detection lacked external validation, focused only on anteroposterior (AP) radiographs, or used limited datasets. This study was designed to address these gaps by employing a large, real-world dataset inclusive of both AP and lateral wrist views and directly comparing AI performance to that of an experienced clinician.

Researchers retrospectively analyzed 1145 wrist radiographs acquired between September 11, 2023, and September 10, 2024, at a single institution following the implementation of AI-assisted diagnosis. Radiographs included both AP (n = 556) and lateral views (n = 589), with DRFs confirmed by board-certified radiologist reports. An experienced orthopedic surgeon blinded to the AI results and patient data independently evaluated all radiographs. The AI system categorized radiographs based on a confidence threshold, and statistical comparisons were made between AI predictions and the ground truth using metrics such as sensitivity, specificity, accuracy, and Cohen’s kappa.
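The agreement metrics named above can all be derived from a 2×2 table of rater calls against the reference standard. The following is a minimal Python sketch of those standard formulas, not the study's own analysis code; the names ConfusionCounts and diagnostic_metrics and the example counts are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table of a rater's calls versus the reference standard."""
    tp: int  # fracture present, rater called fracture
    fp: int  # fracture absent, rater called fracture
    fn: int  # fracture present, rater called no fracture
    tn: int  # fracture absent, rater called no fracture


def diagnostic_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Return accuracy, sensitivity, specificity, F1 and Cohen's kappa."""
    n = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / n
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    f1 = 2 * c.tp / (2 * c.tp + c.fp + c.fn)

    # Cohen's kappa: observed agreement corrected for the agreement
    # expected by chance from the marginal totals of the table.
    p_both_yes = ((c.tp + c.fn) / n) * ((c.tp + c.fp) / n)
    p_both_no = ((c.tn + c.fp) / n) * ((c.tn + c.fn) / n)
    p_chance = p_both_yes + p_both_no
    kappa = (accuracy - p_chance) / (1 - p_chance)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only (NOT data from the study).
example = ConfusionCounts(tp=90, fp=5, fn=10, tn=95)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because kappa subtracts chance agreement, it is lower than raw accuracy for the same table, which is why the study reports both.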

A DRF was present on 40.5% of AP radiographs and 40.7% of lateral radiographs. On AP views, the AI system achieved 95.9% accuracy, 92.0% sensitivity, 98.5% specificity, an F1 score of 0.947, and a Cohen’s kappa of 0.913. In comparison, the orthopedic surgeon achieved 94.9% accuracy, 89.7% sensitivity, 98.5% specificity, an F1 score of 0.935, and a Cohen’s kappa of 0.894. Similar trends were observed for lateral views, with the AI demonstrating 94.8% accuracy and the orthopedic surgeon reaching 96.1%. The Youden Index also supported a high diagnostic value for both raters (AI: 90.5 on AP views; orthopedic surgeon: 90.6 on lateral views).
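The Youden Index is conventionally defined as sensitivity plus specificity minus 1. The short Python sketch below assumes the summary reports it on a percentage scale, and it reproduces the AI's AP figure from the sensitivity and specificity quoted above. The function name youden_index is hypothetical, and the surgeon's AP value is derived here from the reported percentages rather than quoted from the study.

```python
def youden_index(sensitivity_pct: float, specificity_pct: float) -> float:
    """Youden Index on a percentage scale: sensitivity + specificity - 100."""
    return sensitivity_pct + specificity_pct - 100.0


# Reported AP-view sensitivity and specificity, in percent.
ai_ap = youden_index(92.0, 98.5)
# Derived for comparison; this value is not stated in the summary.
surgeon_ap = youden_index(89.7, 98.5)

print(f"AI, AP views: {ai_ap:.1f}")
print(f"Surgeon, AP views (derived): {surgeon_ap:.1f}")
```

With equal specificity on AP views, the gap between the two values comes entirely from the difference in sensitivity.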

Despite strong findings, the study's retrospective design and single-center dataset may limit its generalizability. Additionally, the radiologist's report was used as the ground truth, which—though standard practice—can introduce interpretation variability. The AI also produced false positives and false negatives, underscoring the need for human oversight.

“AI-guided detection of distal radius fractures is highly accurate and comparable to human expert evaluation,” the study authors concluded. “AI has the potential to improve diagnostic efficiency and support clinicians in DRF assessment. However, further research is needed to validate AI performance across diverse clinical settings, different fractures, and to explore its integration into routine workflows. At the moment, AI should be viewed as a complementary tool that enhances, rather than replaces, human expertise in DRF diagnosis.”


Reference:
Ramadanov N, John P, Hable R, et al. Artificial intelligence-guided distal radius fracture detection on plain radiographs in comparison with human raters. J Orthop Surg Res. 2025;20(1):468. Published 2025 May 16. doi:10.1186/s13018-025-05888-9