Researchers from the Department of Biomedical Engineering at Islamic University of Kushtia apply an XGBoost feature-importance approach to large RNA-Seq count datasets and classify active tuberculosis (TB) with 96.3% accuracy. Their workflow combines supervised machine learning models with bioinformatics analyses to identify biomarkers for TB diagnostics.
Key points
- XGBoost classified active TB from RNA-Seq count data with 96.3% accuracy and the lowest log loss (0.139) of the models compared.
- Feature-importance selection extracted the top 100 TB-associated genes for Gene Ontology (GO), pathway, protein-protein interaction (PPI), and hub-gene analyses.
- Combining AI and bioinformatics identified 20 hub genes, 24 GO terms, and 22 potential drug candidates for TB therapeutics.
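To make the metrics and the selection step concrete, here is a minimal pure-Python sketch of how accuracy and binary log loss are computed from predicted probabilities, and how a top-k gene list can be taken from importance scores. It is an illustration, not the authors' pipeline: the labels, probabilities, gene names (`GENE_A` and so on), and importance values are made-up placeholders, and no XGBoost model is trained here. In a real workflow the probabilities and importance scores would come from the fitted classifier.

```python
import math


def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the label."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, y_prob))
    return hits / len(y_true)


def binary_log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the true labels; lower is better."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def top_k_features(importances, k):
    """Return the k feature names with the highest importance scores."""
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical toy data: 1 = active TB, 0 = control.
y_true = [1, 0, 1, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1]  # assumed model outputs, P(active TB)

# Hypothetical importance scores standing in for a trained model's output.
gene_importance = {"GENE_A": 0.31, "GENE_B": 0.05, "GENE_C": 0.22, "GENE_D": 0.12}

acc = accuracy(y_true, y_prob)
loss = binary_log_loss(y_true, y_prob)
selected = top_k_features(gene_importance, k=2)  # the study keeps the top 100

print(f"accuracy={acc:.3f} log_loss={loss:.3f} selected={selected}")
```

Log loss penalizes confident wrong predictions more heavily than accuracy does, which is why the two metrics can rank models differently and why a summary like the one above reports both.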
Why it matters: By integrating AI and bioinformatics, this pipeline accelerates reliable TB biomarker discovery, enabling targeted diagnostics and potential drug repurposing.
Q&A
- What is RNA-Seq count data?
- How does XGBoost improve TB classification?
- What is feature importance in machine learning?
- What role do hub genes play in this study?
- How are potential drugs predicted from gene data?