Alfaisal University researchers evaluate twelve machine learning algorithms—including logistic regression, random forests, and neural networks—on UCI heart disease data, assessing how preprocessing steps like standardization and SMOTE affect accuracy, F1 score, and other key metrics.
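A minimal sketch of this kind of benchmark loop, assuming the scikit-learn and imbalanced-learn stack, a local `heart.csv` export of the UCI data with a binary `target` column, and two of the twelve models (logistic regression and random forest). The file name, column name, model settings, and cross-validation scheme are illustrative assumptions, not the authors' exact protocol.

```python
# Hypothetical benchmark sketch: standardize, oversample with SMOTE, then
# cross-validate each model on accuracy, F1, and log loss.
import pandas as pd
from sklearn.model_selection import cross_validate
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline

df = pd.read_csv("heart.csv")                     # hypothetical local copy of the UCI data
X, y = df.drop(columns="target"), df["target"]

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=300, random_state=42),
}

for name, model in models.items():
    pipe = Pipeline([
        ("scale", StandardScaler()),              # standardization step
        ("smote", SMOTE(random_state=42)),        # oversampling applied only to training folds
        ("model", model),
    ])
    scores = cross_validate(pipe, X, y, cv=5,
                            scoring=["accuracy", "f1", "neg_log_loss"])
    print(f"{name}: acc={scores['test_accuracy'].mean():.3f} "
          f"f1={scores['test_f1'].mean():.3f} "
          f"logloss={-scores['test_neg_log_loss'].mean():.4f}")
```

Using `imblearn.pipeline.Pipeline` keeps SMOTE inside the cross-validation loop, so synthetic samples are generated only from training folds and never leak into evaluation folds.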
Key points
- CatBoost achieves highest accuracy (89.71%) and lowest logloss (0.2735) in heart disease prediction.
- SMOTE balancing prevents class bias, improving recall for patients with heart disease.
- Comparing feature scaling methods reveals which preprocessing pipelines best support model convergence and predictive performance (a sketch of such a comparison follows this list).
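A rough sketch of how such a scaling comparison can be run inside one cross-validated pipeline. The scaler set (standard, min-max, robust), the logistic regression classifier, and the CV settings are assumptions for illustration, not the paper's protocol; the data loading reuses the same hypothetical `heart.csv` as above.

```python
# Compare scalers while holding the rest of the pipeline (SMOTE + classifier) fixed.
import pandas as pd
from sklearn.preprocessing import StandardScaler, MinMaxScaler, RobustScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline

df = pd.read_csv("heart.csv")                     # hypothetical local copy of the UCI data
X, y = df.drop(columns="target"), df["target"]

scalers = {
    "standard": StandardScaler(),                 # zero mean, unit variance
    "minmax": MinMaxScaler(),                     # rescale each feature to [0, 1]
    "robust": RobustScaler(),                     # center on median, scale by IQR
}

for name, scaler in scalers.items():
    pipe = Pipeline([
        ("scale", scaler),
        ("smote", SMOTE(random_state=42)),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    f1 = cross_val_score(pipe, X, y, cv=5, scoring="f1")
    print(f"{name:>8}: mean F1 = {f1.mean():.3f}")
```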
Why it matters: This systematic AI benchmark identifies optimal preprocessing and modeling strategies for reliable, scalable heart disease prediction in clinical settings.
Q&A
- What is SMOTE? The Synthetic Minority Over-sampling Technique balances a skewed dataset by generating synthetic minority-class examples, interpolating between a minority sample and its nearest minority-class neighbors rather than simply duplicating rows.
- Why does feature scaling matter in ML? Putting features on comparable ranges keeps large-magnitude features from dominating distance-based models and gradient-based optimizers, which speeds convergence and can improve accuracy for scale-sensitive algorithms such as logistic regression, SVMs, and neural networks.
- How do Gradient Boosting Machines work? They build an ensemble of shallow decision trees sequentially: each new tree is fit to the gradient of the loss (for squared error, the residuals) of the current ensemble, and its scaled predictions are added to the running model (see the toy sketch below).
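A toy, from-scratch illustration of the gradient-boosting idea behind the last answer: each shallow tree is fit to the residuals (the negative gradient of squared error) of the current ensemble, and its contribution is shrunk by a learning rate. The synthetic regression data and hyperparameters here are purely illustrative, not related to the heart disease study.

```python
# Minimal gradient boosting on squared error, using decision stumps as weak learners.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))             # toy 1-D inputs
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate, n_rounds = 0.1, 100
prediction = np.full_like(y, y.mean())            # start from the mean target value
trees = []

for _ in range(n_rounds):
    residuals = y - prediction                    # negative gradient of squared error
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
    prediction += learning_rate * tree.predict(X) # add the shrunk correction
    trees.append(tree)

print("training MSE after boosting:", np.mean((y - prediction) ** 2))
```

Libraries such as CatBoost follow this same additive scheme while adding refinements (ordered boosting, native categorical-feature handling, regularization) that make them practical on tabular clinical data.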