A team at Peking Union Medical College Hospital applies machine learning to integrate quantitative MRI radiomic features and clinical variables, building a Random Forest classifier that predicts bevacizumab response in metastatic brain tumor–induced peritumoral edema with 0.91 AUC.
Key points
Integrated 13 radiomic and 8 clinical features from 300 metastatic brain tumor patients.
Applied a stratified 70/30 train-test split, SMOTE oversampling, and 10-fold cross-validation across Random Forest (RF), logistic regression (LR), gradient-boosted trees (GBT), and naive Bayes (NB).
Random Forest achieved 0.89 accuracy and 0.91 AUC-ROC, and ranked edema volume as the most important predictor.
Why it matters:
Accurately predicting bevacizumab response could spare likely non-responders unnecessary treatment risks and costs, improving edema management in neuro-oncology.
Academy
Radiomics and Machine Learning in Biomedical Imaging
Introduction: Biomedical imaging provides noninvasive views of tissue structures and functions, enabling disease diagnosis, treatment planning, and monitoring. Traditional image interpretation relies on qualitative assessment by radiologists. Radiomics augments this by extracting quantitative features from medical images—such as CT, MRI, or PET scans—transforming images into high-dimensional data for computational analysis.
What Is Radiomics? Radiomics centers on feature extraction: within a segmented region of interest, voxel values are analyzed to compute metrics describing intensity, shape, texture, and higher-order statistics. These features quantify lesion heterogeneity, boundary irregularity, and internal tissue patterns. For example, texture features derived from the gray-level co-occurrence matrix (GLCM) describe how often pairs of gray levels occur at a given spatial offset, which is often interpreted as reflecting the heterogeneity of the tumor microenvironment.
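To make the GLCM idea concrete, here is a minimal pure-Python sketch. It is illustrative only and is not the pipeline used in the study: the function names (`glcm`, `contrast`, `homogeneity`) and the tiny 4x4 "images" are assumptions made up for this example, and a real workflow would typically rely on a dedicated radiomics library operating on segmented 3D volumes.

```python
from collections import Counter

def glcm(image, dx=1, dy=0, symmetric=True):
    """Normalized gray-level co-occurrence matrix for one pixel offset.

    Returns a dict mapping (gray_i, gray_j) -> probability that a pixel with
    level gray_i has a neighbor with level gray_j at offset (dx, dy).
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[(i, j)] += 1
                if symmetric:  # also count the pair in the reverse direction
                    counts[(j, i)] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

def contrast(p):
    """High when neighboring pixels differ strongly in gray level."""
    return sum(prob * (i - j) ** 2 for (i, j), prob in p.items())

def homogeneity(p):
    """High when neighboring pixels have similar gray levels."""
    return sum(prob / (1 + (i - j) ** 2) for (i, j), prob in p.items())

# Two toy 4x4 "images" (hypothetical data): a flat patch and a checkerboard.
uniform = [[1, 1, 1, 1] for _ in range(4)]
checker = [[0, 3, 0, 3], [3, 0, 3, 0], [0, 3, 0, 3], [3, 0, 3, 0]]

for name, img in [("uniform", uniform), ("checker", checker)]:
    p = glcm(img)
    print(name, "contrast:", contrast(p), "homogeneity:", homogeneity(p))
```

The flat patch yields zero contrast and maximal homogeneity, while the checkerboard yields high contrast and low homogeneity, which is the sense in which such features quantify texture heterogeneity.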
Machine Learning Basics: Machine learning (ML) methods train algorithms to recognize patterns within data. Supervised ML uses labeled examples—where outcomes like treatment response are known—to learn feature–outcome relationships. Popular algorithms include support vector machines, decision trees, and ensemble models like Random Forest and Gradient Boosting. Performance is evaluated using metrics such as accuracy, precision, recall, and AUC-ROC.
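The evaluation metrics named above can be written out in a few lines. The following is a dependency-free sketch on made-up labels and scores (the data and the 0.5 decision threshold are assumptions for illustration, not results from the study). AUC-ROC is computed here through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def accuracy(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)

def precision(y_true, y_pred):
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(y_true, y_pred):
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0

def auc_roc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = responder) and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.35, 0.4, 0.3, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("accuracy:", accuracy(y_true, y_pred))
print("precision:", precision(y_true, y_pred))
print("recall:", recall(y_true, y_pred))
print("AUC-ROC:", auc_roc(y_true, scores))
```

Accuracy, precision, and recall depend on the chosen threshold, whereas AUC-ROC summarizes ranking quality across all thresholds, which is one reason it is commonly reported alongside accuracy.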
ML Workflow in Imaging:
- Data Collection: Assemble imaging and clinical datasets from patient cohorts.
- Preprocessing: Standardize image resolution, normalize intensities, and segment regions of interest.
- Feature Extraction: Compute radiomic features and integrate clinical variables (e.g., age, comorbidities).
- Feature Selection: Reduce dimensionality and limit overfitting with selection methods such as recursive feature elimination, or with projection methods such as principal component analysis.
- Model Training: Apply cross-validation, hyperparameter tuning, and imbalance handling (e.g., SMOTE, applied only to the training data so that synthetic samples do not leak into evaluation) to train robust classifiers.
- Validation: Test on independent cohorts to assess generalizability and calibrate predictive thresholds.
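Two of the steps above, the stratified split and SMOTE-style oversampling, can be sketched without any libraries. This is a simplified illustration under stated assumptions, not the study's code: `stratified_split` and `smote` are hypothetical helper names, the toy dataset is invented, and in practice one would more likely use scikit-learn's `train_test_split(..., stratify=y)` together with the `SMOTE` class from imbalanced-learn.

```python
import random
from collections import defaultdict

def stratified_split(y, test_fraction=0.3, seed=0):
    """Split indices so each class keeps roughly the same train/test ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(y):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_fraction)
        test_idx += idxs[:n_test]
        train_idx += idxs[n_test:]
    return train_idx, test_idx

def smote(minority, n_new, k=3, seed=0):
    """SMOTE-style oversampling: interpolate between a minority sample
    and one of its k nearest minority-class neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [m for m in minority if m is not base]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return synthetic

# Toy dataset (hypothetical): 20 samples, 2 features, 6 positives.
X = [[float(i), float(i % 5)] for i in range(20)]
y = [1 if i % 10 < 3 else 0 for i in range(20)]

train_idx, test_idx = stratified_split(y, test_fraction=0.3)

# Oversample the minority class in the TRAINING set only, so that no
# synthetic sample is derived from, or evaluated against, test data.
minority_train = [X[i] for i in train_idx if y[i] == 1]
n_majority_train = sum(1 for i in train_idx if y[i] == 0)
synthetic = smote(minority_train, n_new=n_majority_train - len(minority_train))

print("train/test sizes:", len(train_idx), len(test_idx))
print("minority before/after:", len(minority_train),
      len(minority_train) + len(synthetic))
```

Keeping the oversampling inside the training portion (and, under cross-validation, inside each training fold) is the design choice that matters most here: synthetic points built from test samples would inflate the reported performance.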
Applications in Disease Detection and Prognosis: Radiomics and ML have been used to predict tumor subtype, genetic mutations, and treatment response across oncology. In neuro-oncology, radiomic features from MRI can forecast survival in glioblastoma, detect early recurrence, and now predict bevacizumab efficacy in peritumoral edema. Such predictive models support personalized therapy by identifying patients likely to benefit from specific treatments.
Link to Longevity Science: While radiomics primarily addresses disease management, its principles extend to aging research. Quantitative imaging of brain and organ tissue can reveal subclinical changes associated with aging. ML analysis of imaging biomarkers may detect early degeneration, monitor intervention effects, and guide strategies to promote healthy aging and extend healthspan.
Future Directions: Integrating multi-omics data—combining radiomics with genomics, proteomics, and metabolomics—will yield deeper insights into disease biology and aging processes. Advancements in deep learning enable automated feature learning directly from raw images, reducing the need for handcrafted features. Ethical deployment requires standardized data protocols, transparency in algorithmic decision-making, and rigorous clinical validation.