Researchers at the First Affiliated Hospital of Xinjiang Medical University developed an XGBoost-based machine learning model that predicts in-hospital cardiac mortality from the electronic medical records of 18,727 atrial fibrillation patients. They selected 79 clinical variables, applied downsampling and fivefold cross-validation, and used SHAP for interpretability. The model reached an AUC of 0.964 on the training set and 0.932 on the validation set, alongside high reported precision and accuracy.
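The downsampling and fivefold cross-validation steps can be illustrated with a minimal sketch in plain Python. The labels are synthetic, the helper names `downsample` and `stratified_folds` are hypothetical, and the 1:1 class ratio is an assumption. The summary does not say whether the authors downsampled before splitting or inside each training fold, so the ordering here is only one possibility.

```python
import random

def downsample(labels, seed=0):
    """Return sorted row indices with the majority class cut to the minority size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = sorted((pos, neg), key=len)
    kept = rng.sample(majority, len(minority))  # random subset, no replacement
    return sorted(minority + kept)

def stratified_folds(indices, labels, k=5, seed=0):
    """Split indices into k folds, dealing each class out round-robin."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        members = [i for i in indices if labels[i] == cls]
        rng.shuffle(members)
        for position, i in enumerate(members):
            folds[position % k].append(i)
    return folds

# Synthetic labels (not study data): 1 = in-hospital cardiac death, 0 = survived.
labels = [1] * 20 + [0] * 180
balanced = downsample(labels)
folds = stratified_folds(balanced, labels)

for fold_number, held_out in enumerate(folds):
    # Train on the other four folds, validate on the held-out one.
    train = [i for i in balanced if i not in held_out]
    print(fold_number, len(train), len(held_out))
```

Each fold serves once as the validation set while the model is fitted on the remaining four, and the validation metric is then averaged over the five runs.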
Key points
- XGBoost applied to EMR data from 18,727 AF patients achieved AUC 0.964 (training) and 0.932 (validation).
- Data processing included downsampling to balance classes, median imputation for variables with less than 3% missing values, and removal of highly correlated variables.
- SHAP analysis identified thyroid function indices, procalcitonin, NT-proBNP, and INR as top predictors for in-hospital cardiac mortality.
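The imputation and correlation-filtering steps listed above might look like the following sketch. It uses only the standard library and synthetic columns. The column names, the 0.9 correlation cutoff, and the helper functions are assumptions for illustration; the summary states the 3% missingness figure but not the correlation threshold or how columns with more missing values were treated.

```python
from math import sqrt
from statistics import mean, median

MAX_MISSING = 0.03  # from the summary: impute only below 3% missingness
MAX_ABS_R = 0.9     # assumed cutoff; the summary gives no value

def missing_fraction(column):
    return sum(v is None for v in column) / len(column)

def impute_median(column):
    """Replace None entries with the median of the observed values."""
    fill = median(v for v in column if v is not None)
    return [fill if v is None else v for v in column]

def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def drop_correlated(table, threshold=MAX_ABS_R):
    """Keep a column only if |r| with every already-kept column is below threshold."""
    kept = []
    for name in table:
        if all(abs(pearson(table[name], table[k])) < threshold for k in kept):
            kept.append(name)
    return {name: table[name] for name in kept}

# Synthetic columns (not study data), 40 rows each.
n = 40
thyroid_index = [3.0 + 0.1 * i for i in range(n)]
raw = {
    "thyroid_index": thyroid_index,
    "thyroid_index_rescaled": [2 * v + 1 for v in thyroid_index],  # redundant copy
    "inr": [1.0 + ((i * 7) % 11) / 10 for i in range(n)],
}
raw["inr"][5] = None  # one missing value out of 40 = 2.5%

# Columns at or above the threshold are left untouched in this sketch.
table = {
    name: impute_median(col) if missing_fraction(col) < MAX_MISSING else col
    for name, col in raw.items()
}
cleaned = drop_correlated(table)
print(list(cleaned))
```

Keeping the first column of each correlated pair is an arbitrary choice made here; a clinical study might instead keep whichever variable is easier to measure or interpret.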
Why it matters: This interpretable XGBoost model offers precise risk stratification, helping clinicians identify atrial fibrillation patients at high risk of in-hospital cardiac death and target interventions accordingly.
Q&A
- What is XGBoost?
- How does SHAP explain model predictions?
- Why is class imbalance important in this study?
- What role do thyroid hormones play in model predictions?
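On the SHAP question above: SHAP attributes a prediction to individual features using Shapley values, which average each feature's marginal contribution over all subsets of the other features. The sketch below computes them by brute force for a hypothetical three-feature score. It is not the study's model: `toy_risk` and the all-zero baseline are assumptions, and for tree ensembles such as XGBoost the SHAP library uses a much faster tree-specific algorithm with a background distribution rather than a single baseline row.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for predict at x; absent features take baseline values."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for subsets of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for s in combinations(others, size):
                total += weight * (value(set(s) | {i}) - value(set(s)))
        phi.append(total)
    return phi

def toy_risk(z):
    # Hypothetical score: two additive terms plus an interaction between features 0 and 2.
    return 0.2 * z[0] + 0.1 * z[1] + 0.05 * z[0] * z[2]

x = [2.0, 1.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_risk, x, baseline)
print(phi, sum(phi), toy_risk(x) - toy_risk(baseline))
```

The attributions sum to the difference between the prediction and the baseline prediction, and the interaction term is split evenly between the two features involved. This additivity is what lets a ranking of mean absolute SHAP values be read as a ranking of predictors.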