Glioma Identification from microRNA Biomarkers using Machine Learning
Gliomas are the most aggressive malignant brain tumours, occurring mostly in adults and accounting for approximately 80% of malignant central nervous system tumours. Traditional diagnostic methods are both invasive and expensive; accurate, minimally invasive, and cost-effective early detection is therefore vital to guide personalised treatment plans. MicroRNAs (miRNAs) are stable non-coding RNAs, detectable in various body fluids (e.g., serum, plasma, and cerebrospinal fluid (CSF)), that regulate gene expression and influence cellular processes; their dysregulation is a significant factor in cancer development. This makes them promising biomarkers for glioma classification. In this article, we present a methodology for glioma identification from miRNA data using machine learning (ML), followed by data analysis for miRNA biomarker investigation. A machine learning pipeline is applied to distinguish glioma from controls, as well as from meningioma samples, using miRNA expression data obtained from four Gene Expression Omnibus (GEO) datasets (GSE112264, GSE113486, GSE113740, GSE139031). After preprocessing, five feature selection techniques (LASSO, mRMR, ReliefF, RFE, and RF importance) were employed. Six machine learning algorithms (LR, KNN, DT, RF, SVM, XGB) were used for classification, with and without SMOTE oversampling. Performance was assessed using 5-fold cross-validation in terms of accuracy, F1-score, precision, recall, and area under the curve (AUC). In binary classification (glioma vs controls), all models achieved up to 100% accuracy, and in multi-class classification (glioma vs meningioma vs controls), an F1-score of up to 100% was achieved with both the KNN and XGB classifiers. The top-ranked miRNAs were also analysed and compared with biomarkers previously reported in the literature. Seven miRNAs were identified as potential biomarkers, namely miR-125a-3p, miR-4276, miR-4648, miR-4763-3p, miR-663a, miR-6784-5p and miR-873-3p, and were independently validated on the GSE211692 dataset.
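The evaluation step described above (5-fold cross-validation scored by accuracy, precision, recall, and F1-score) can be illustrated with a minimal sketch. This is not the authors' pipeline: it uses only the Python standard library, a plain k-nearest-neighbours vote in place of the six classifiers, and synthetic Gaussian values in place of the GEO expression data, and it omits feature selection and SMOTE. All function names and parameters (`stratified_folds`, `knn_predict`, `binary_metrics`, `cross_validate`, `make_toy_data`) are hypothetical and chosen for illustration only.

```python
import random
from collections import Counter


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds so each fold keeps similar class ratios."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)  # deal each class round-robin across folds
    return folds


def knn_predict(train_X, train_y, x, n_neighbors=3):
    """Majority vote among the n_neighbors nearest training samples."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:n_neighbors])
    return votes.most_common(1)[0][0]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def cross_validate(X, y, k=5, n_neighbors=3, seed=0):
    """Average the per-fold metrics of a KNN classifier over k stratified folds."""
    scores = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(y)) if i not in held_out]
        preds = [knn_predict(train_X, train_y, X[i], n_neighbors) for i in test_idx]
        scores.append(binary_metrics([y[i] for i in test_idx], preds))
    return {name: sum(s[name] for s in scores) / k for name in scores[0]}


def make_toy_data(n_per_class=25, n_features=4, shift=6.0, seed=0):
    """Synthetic stand-in for expression data: label 0 = control, 1 = glioma."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(label * shift, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


X, y = make_toy_data()
results = cross_validate(X, y)
print(results)
```

Stratifying the folds keeps the class proportions of each held-out fold close to those of the full sample set, which matters when classes are imbalanced. If oversampling such as SMOTE is used, it is normally applied to the training folds only, so that synthetic samples do not leak into the held-out fold.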
| Item Type | Article |
|---|---|
| Identification Number | 10.3389/fsysb.2026.1771910 |
| Additional information | © 2026 Andugala, Cieslik, Braoudaki and Mporas. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). https://creativecommons.org/licenses/by/4.0/ |
| Date Deposited | 10 Jun 2026 07:48 |
| Last Modified | 10 Jun 2026 07:48 |
