Researchers at Goethe University Frankfurt conducted a bibliometric study of 29,192 AI-in-medicine papers from 1969 to 2022, using the NewQIS platform and density-equalizing map procedures to chart global publication trends, socio-economic correlations, and equity patterns across countries.
Key points
Analyzed 29,192 AI-in-medicine articles from Web of Science (1969–2022) using NewQIS bibliometric methodologies.
Applied density-equalizing cartogram projections to visualize country-level research output and citation patterns.
Performed Spearman correlations and regression residual analysis with GDP, GII, and AI readiness indices to assess global equity and disparities.
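The correlation and residual analysis named in the last key point can be sketched as follows. This is a minimal illustration only: the country labels and numbers are made-up placeholders rather than data from the study, and the log-log linear fit is an assumption chosen for the example, not necessarily the regression model the authors used.

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical placeholder values, NOT data from the study.
countries = ["A", "B", "C", "D", "E", "F"]
gdp = np.array([21000.0, 14000.0, 4200.0, 2900.0, 1600.0, 400.0])  # GDP, arbitrary units
papers = np.array([9000.0, 6500.0, 1500.0, 1800.0, 300.0, 90.0])   # article counts

# Spearman's rho compares ranks, so it tolerates heavily skewed indicators.
rho, p_value = spearmanr(gdp, papers)
print(f"Spearman rho = {rho:.3f}, p = {p_value:.3f}")

# Assumed model for illustration: log(papers) = slope * log(gdp) + intercept.
slope, intercept = np.polyfit(np.log(gdp), np.log(papers), deg=1)
expected = slope * np.log(gdp) + intercept
residuals = np.log(papers) - expected

# A positive residual means a country publishes more than the fitted
# GDP trend predicts; a negative residual means it publishes less.
for name, r in zip(countries, residuals):
    print(f"{name}: residual = {r:+.2f}")
```

Read this way, a positive residual flags a country whose output exceeds what its economic size would suggest under the fitted trend.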
Why it matters
Mapping the global AI-in-medicine landscape exposes economic and innovation-driven inequities, guiding policies to foster inclusive research and deployment in underserved regions.
Q&A
What is NewQIS?
How do density-equalizing map projections work?
Why correlate AI publications with GDP and GII?
What does a positive regression residual indicate?
Why is AI readiness important for equity?
Academy
Machine Learning in Longevity Research
Introduction: Machine learning (ML) is a branch of artificial intelligence in which computers learn patterns from data rather than following rules written by hand for each task. In longevity research, ML is used to analyze large-scale biological and clinical datasets in order to discover aging biomarkers, model disease progression, and predict individual health trajectories. By comparing patterns across thousands of samples, ML algorithms help scientists understand the complex interactions among genes, proteins, and environmental factors that drive the aging process.
Key Concepts in Machine Learning
- Supervised Learning: Algorithms such as random forests, support vector machines, and neural networks learn from labeled datasets, where each sample includes input features (e.g., gene expression levels) and an associated outcome (e.g., measured biological age). These models are fitted to minimize prediction error on the labeled training data, and the fitted model is then applied to new, unlabeled cases.
- Unsupervised Learning: Techniques like clustering and principal component analysis identify hidden structures in unlabeled data. In aging studies, unsupervised methods group individuals by molecular profiles or detect novel aging-related subtypes without predefined categories.
- Deep Learning: Advanced neural networks with multiple hidden layers can model highly nonlinear relationships in complex datasets. Convolutional neural networks (CNNs) process medical imaging to assess tissue-level aging, while autoencoders reduce dimensionality of multiomics data to uncover latent aging signals.
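The supervised and unsupervised settings described in the list above can be contrasted in a short sketch. It assumes scikit-learn is available and uses synthetic data as a stand-in for gene expression measurements; the relationship between the features and "biological age" is invented for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for expression data: 300 samples x 20 features.
n_samples, n_features = 300, 20
X = rng.normal(size=(n_samples, n_features))
# Assumed ground truth: age depends on the first three features plus noise.
age = 50 + 8 * X[:, 0] - 5 * X[:, 1] + 3 * X[:, 2] + rng.normal(scale=2.0, size=n_samples)

# Supervised learning: fit on labeled samples, evaluate on held-out samples.
X_train, X_test, y_train, y_test = train_test_split(X, age, test_size=0.25, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
mae = mean_absolute_error(y_test, model.predict(X_test))
# Reference point: always predicting the training-set mean age.
baseline_mae = mean_absolute_error(y_test, np.full_like(y_test, y_train.mean()))
print(f"Random forest MAE: {mae:.2f} (mean-only baseline: {baseline_mae:.2f})")

# Unsupervised learning: labels are ignored; reduce dimensionality, then cluster.
components = PCA(n_components=2).fit_transform(X)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(components)
print("Cluster sizes:", np.bincount(labels))
```

Comparing the model against a mean-only baseline is a simple check that it has learned something from the features rather than just the average outcome.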
Applications in Longevity Science
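- Biological Age Predictors: ML-driven ‘epigenetic clocks’ estimate biological age from DNA methylation patterns. The gap between predicted and chronological age has been associated with mortality and disease risk in cohort studies, which is why it is used as a marker of accelerated or slowed aging.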
- Drug Discovery and Repurposing: Predictive models screen existing compounds for geroprotective effects by simulating molecular interactions with aging pathways, prioritizing candidates for laboratory testing.
- Personalized Intervention Strategies: By integrating lifestyle factors, clinical biomarkers, and genetic profiles, ML algorithms generate personalized aging assessments to tailor supplements, exercise regimens, or preventative screenings.
- Imaging Biomarkers: CNNs analyze medical images—CT scans, MRI, retinal photographs—to detect tissue degeneration and vascular changes as noninvasive measures of biological aging.
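The biological age predictor from the list above can be illustrated with a toy clock. Penalized regression such as elastic net is one common way to build these models; the sketch below assumes scikit-learn and uses fully synthetic methylation values, with the number of age-associated sites and their drift rates invented for the example.

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

# Synthetic methylation beta values (between 0 and 1): 400 samples x 200 CpG sites.
n_samples, n_cpgs, n_informative = 400, 200, 15
chron_age = rng.uniform(20, 90, size=n_samples)
beta = rng.uniform(0.2, 0.8, size=(n_samples, n_cpgs))
# Assumption: the first 15 sites drift linearly with age; the rest are pure noise.
drift = rng.choice([-0.003, 0.003], size=n_informative)
beta[:, :n_informative] = (
    0.5
    + drift * (chron_age[:, None] - 55)
    + rng.normal(scale=0.03, size=(n_samples, n_informative))
)
beta = np.clip(beta, 0.0, 1.0)

X_train, X_test, age_train, age_test = train_test_split(
    beta, chron_age, test_size=0.25, random_state=1
)

# Elastic net shrinks most site weights toward zero, keeping a sparse set of CpGs.
clock = ElasticNetCV(l1_ratio=0.5, cv=5, max_iter=10000, random_state=1)
clock.fit(X_train, age_train)
predicted_age = clock.predict(X_test)
n_selected = int(np.count_nonzero(clock.coef_))

# "Age acceleration": residual of predicted age after regressing on chronological age.
slope, intercept = np.polyfit(age_test, predicted_age, deg=1)
age_acceleration = predicted_age - (slope * age_test + intercept)

corr = np.corrcoef(predicted_age, age_test)[0, 1]
print(f"Sites with nonzero weight: {n_selected}, correlation with age: {corr:.2f}")
```

In real studies it is the age acceleration residual, not the raw prediction, that is typically tested for association with health outcomes.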
Data Challenges and Best Practices
The success of ML in longevity research depends on high-quality, diverse datasets. Heterogeneous sources often use different measurement platforms, population cohorts, and protocols. Rigorous data preprocessing—normalization, batch correction, feature selection—and cross-validation are essential to ensure model robustness and generalizability.
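One practical point behind the preprocessing and cross-validation advice above is that scaling and feature selection should be fitted inside each training fold; fitting them on the full dataset first leaks information from the validation samples. A minimal sketch, assuming scikit-learn and synthetic data (batch correction is not shown, and the choice of 20 features and ridge regression is arbitrary):

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)

# Synthetic high-dimensional data: far more features than informative signals.
n_samples, n_features = 200, 500
X = rng.normal(size=(n_samples, n_features))
y = 60 + 4 * X[:, 0] - 3 * X[:, 1] + rng.normal(scale=2.0, size=n_samples)

# Every step lives in the pipeline, so each is re-fitted on the training
# part of every fold and only applied to that fold's validation part.
pipeline = make_pipeline(
    StandardScaler(),
    SelectKBest(f_regression, k=20),
    Ridge(alpha=1.0),
)

cv = KFold(n_splits=5, shuffle=True, random_state=2)
scores = cross_val_score(pipeline, X, y, cv=cv, scoring="r2")
print("R^2 per fold:", np.round(scores, 2))
print(f"Mean R^2: {scores.mean():.2f}")
```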
Interpretability and Explainability: Complex ML models can act as ‘black boxes,’ making it hard to understand why a given prediction was made. Methods like SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) estimate how much each input feature contributes to an individual prediction, which supports trust in the model and can point to biologically meaningful features.
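SHAP builds on Shapley values from cooperative game theory. The sketch below computes exact Shapley values for a tiny hypothetical model by enumerating every feature coalition. It is not the API of the `shap` library, which uses faster approximations; `shapley_values` and `toy_model` are names invented for this illustration, and replacing absent features with a fixed background value is one simplifying assumption among several possible ones.

```python
from itertools import combinations
from math import factorial

import numpy as np


def shapley_values(predict, x, background):
    """Exact Shapley values for one sample by enumerating all coalitions.

    Features outside a coalition are set to their background (reference)
    value. Cost grows as 2**n_features, so this only suits toy examples.
    """
    n = len(x)

    def value(coalition):
        z = background.copy()
        idx = np.array(coalition, dtype=int)
        z[idx] = x[idx]
        return predict(z)

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi


def toy_model(z):
    # Hypothetical model: linear terms plus an interaction between features 0 and 1.
    return 2.0 * z[0] - 1.0 * z[1] + 0.5 * z[2] + 1.5 * z[0] * z[1]


x = np.array([1.0, 2.0, -1.0])
background = np.zeros(3)
phi = shapley_values(toy_model, x, background)
print("Contributions:", phi)
print("Sum of contributions:", phi.sum())
print("Prediction minus baseline:", toy_model(x) - toy_model(background))
```

The contributions add up to the difference between the prediction and the baseline prediction, which is the additivity property that gives SHAP its name. The interaction term is split equally between the two features involved.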
Ethical and Equity Considerations
Models trained on data from high-income regions may underperform in underrepresented populations. Ensuring diverse and inclusive datasets is crucial to avoid reinforcing health disparities. Researchers must follow ethical guidelines for consent, data privacy, and transparent reporting across demographic groups.
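Reporting performance separately for each demographic group, rather than as a single pooled number, is one simple way to surface the gap described above. A minimal sketch, with a hypothetical helper `mae_by_group` and made-up ages and group labels:

```python
import numpy as np


def mae_by_group(y_true, y_pred, groups):
    """Return a dict mapping each group label to its mean absolute error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    groups = np.asarray(groups)
    report = {}
    for g in np.unique(groups):
        mask = groups == g
        report[str(g)] = float(np.mean(np.abs(y_true[mask] - y_pred[mask])))
    return report


# Hypothetical toy values, for illustration only.
y_true = [50, 60, 70, 40, 55, 65]
y_pred = [52, 59, 71, 48, 45, 75]
groups = ["well_represented"] * 3 + ["underrepresented"] * 3

report = mae_by_group(y_true, y_pred, groups)
overall = float(np.mean(np.abs(np.array(y_true) - np.array(y_pred))))
print(f"Pooled MAE: {overall:.2f}")
for name, err in report.items():
    print(f"{name}: MAE = {err:.2f}")
```

In this toy case the pooled error hides the fact that predictions for the smaller group are several times worse.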
Future Directions
Integrating multiomics (genomics, transcriptomics, proteomics, metabolomics) with wearable device and electronic health record data is expected to improve predictive accuracy. Federated learning enables collaborative model training across institutions without sharing raw data, which helps protect privacy. Standardized data-sharing platforms and community-driven protocols will support reproducible, scalable ML pipelines and accelerate the development of geroprotective therapies to extend healthy lifespan globally.
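The federated learning idea above can be sketched with federated averaging: each institution trains on its own data and only model weights are sent to a coordinating server. The example below is a simplified simulation in NumPy with three synthetic sites and a linear model. The site sizes, learning rate, and number of rounds are arbitrary assumptions, and real deployments add secure aggregation and other safeguards that are not shown.

```python
import numpy as np

rng = np.random.default_rng(3)
true_w = np.array([2.0, -1.0, 0.5])  # assumed ground-truth weights for the simulation


def make_site(n):
    """Simulate one institution's private dataset."""
    X = rng.normal(size=(n, 3))
    y = X @ true_w + rng.normal(scale=0.1, size=n)
    return X, y


sites = [make_site(n) for n in (80, 120, 200)]


def local_update(w, X, y, lr=0.05, epochs=5):
    """Run a few gradient descent steps on one site's data, starting from w."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w


global_w = np.zeros(3)
sizes = np.array([len(y) for _, y in sites], dtype=float)
for _ in range(30):
    # Each site trains locally; raw X and y never leave the site.
    local_ws = [local_update(global_w, X, y) for X, y in sites]
    # The server averages the weights, weighting each site by its sample count.
    global_w = np.average(local_ws, axis=0, weights=sizes)

print("Learned weights:", np.round(global_w, 3))
```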