Abstract
This thesis explores the development, performance, and clinical application of machine learning (ML) prediction models in spinal surgery, focusing on outcome prediction and decision support. The research highlights both advancements and challenges in applying ML models to surgical outcomes.

Part I: General Overview and Reporting Quality

Chapter 1: The increasing use of ML prediction models in orthopedic surgery, specifically in spine surgery, has led to studies exploring their outcomes and methodologies. Although most models show fair to good discriminative ability, calibration is often underreported. Studies vary in sample size, ML algorithms, and outcome timing. Future research should emphasize multi-institutional, prospective studies and develop multiple models for comparison.

Chapter 2: A systematic review evaluated adherence to the TRIPOD statement and risk of bias using PROBAST. Many studies reported study design, patient selection, and performance metrics incompletely, and high risk of bias was common. Future models must follow reporting guidelines and address issues such as incomplete performance measures and mishandling of missing data.

Part II: Development of Specific ML Models for Spine Surgery

Chapter 3: A nomogram predicted the failure of nonoperative management in spinal epidural abscess, identifying six predictors, including health status, neurological findings, and radiographic features.

Chapter 4: For lumbar spinal stenosis, a neural network model predicted discharge placement after surgery with good discrimination and calibration, potentially reducing hospital stays and costs.

Chapter 5: A Bayes Point Machine model, developed using data from the NSQIP database, predicted discharge disposition after surgery for degenerative spondylolisthesis. It outperformed previous models in discrimination and calibration.

Chapter 6: A Random Forest model predicted prolonged opioid use after surgery for degenerative spondylolisthesis. Key predictors included preoperative opioid use, age, BMI, and comorbidities. The model performed well in discrimination, calibration, and overall accuracy.

Part III: External Validation and Model Impact on Decision-Making

This part addresses the importance of external validation and the real-world impact of ML models on clinical decision-making.
Chapter 7: The SORG algorithm for predicting prolonged opioid use was externally validated in Taiwan. Despite differences in demographics and opioid policies, the algorithm showed good performance, demonstrating generalizability across populations.

Chapter 8: An interobserver survey is underway to assess how ML models influence decision-making for spinal epidural abscess. It compares treatment recommendations made with and without ML assistance, highlighting the potential impact of ML on clinical decisions.

This research underscores the potential of ML models to improve surgical outcomes and decision-making in spine surgery. However, it also highlights gaps in reporting standards, external validation, and model comparison. Adherence to reporting frameworks such as TRIPOD and the use of multi-institutional studies are both essential for integrating ML models into clinical practice.