Deep Learning for Condition Detection in Chest Radiographs: A Performance Comparison of Different Radiograph Views and Handling of Uncertain Labels
Author
Ahmad, Mubashir
Koay, Kheng
Sun, Yi
Jayaram, Vijay
Arunachalam, Ganesh
Amirabdollahian, Farshid
Attention
2299/27486
Abstract
Chest radiographs are the initial diagnostic modality for lung and chest-related conditions. Limited radiologist availability is believed to be a bottleneck that affects patient safety through long waiting times. With the arrival of machine learning, and especially deep learning, the race to find artificial intelligence (AI) based approaches that achieve the highest accuracy in detecting abnormalities on chest radiographs is at its peak. Classification of radiographs as normal or abnormal is based on the training and expertise of the reporting radiologist. The increase in the number of chest radiographs over time and the shortage of radiologists in the UK and worldwide have affected the number of chest radiographs that can be assessed and reported in a given time frame. Substantial work has been dedicated to machine learning for classifying radiographs as normal or abnormal based on a single pathology. The success of deep learning techniques in binary radiograph classification has encouraged the medical imaging community to apply them to multi-label radiographs. Deep learning techniques often require huge datasets to train their underlying models. Recently, the availability of large multi-label datasets has ignited new efforts to tackle this challenging task. This work presents multiple convolutional neural network (CNN) based models trained on the publicly available CheXpert multi-label data. Based on common pathologies seen on chest radiographs and their clinical significance, we have chosen pathologies such as pulmonary oedema, cardiomegaly, atelectasis, consolidation and pleural effusion. We trained our models using different projections, namely anteroposterior (AP), posteroanterior (PA), and lateral, and compared the performance of our models for each projection. Our results demonstrate that the model for the AP projection outperforms the remaining models, with an average AUC of 0.85.
Furthermore, we use the samples with uncertain labels in the CheXpert dataset and improve model performance by resolving the uncertainty using Gaussian mixture models (GMMs). The results show improvement in all three views, with AUCs of 0.91 for AP, 0.75 for PA, and 0.85 for the lateral view.
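The abstract does not describe how the GMM is applied to the uncertain labels, so the following is only an illustrative sketch of one plausible reading, not the authors' method. It assumes each uncertain sample has a one-dimensional score for a pathology (for example, a predicted probability from a model trained on the certain labels), fits a two-component Gaussian mixture to those scores with expectation-maximisation, and assigns each sample to the positive or negative class according to the component with the higher posterior. The function names, the choice of score, and the example values are all hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(xs, n_iter=100, var_floor=1e-6):
    """Fit a two-component 1D Gaussian mixture to xs with EM.

    Hypothetical helper: initialises the two means at the minimum and
    maximum score and both variances at the overall variance.
    """
    n = len(xs)
    overall_mean = sum(xs) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in xs) / n, var_floor)
    weights = [0.5, 0.5]
    means = [min(xs), max(xs)]
    variances = [overall_var, overall_var]

    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each score.
        resp = []
        for x in xs:
            p = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pk / total for pk in p])

        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            variances[k] = max(
                sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk,
                var_floor,
            )
    return weights, means, variances


def resolve_uncertain(scores, weights, means, variances):
    """Map each uncertain score to 1 (positive) or 0 (negative).

    Assumption: the component with the higher mean represents the
    positive class.
    """
    pos = 0 if means[0] > means[1] else 1
    labels = []
    for x in scores:
        p = [w * gaussian_pdf(x, m, v)
             for w, m, v in zip(weights, means, variances)]
        total = sum(p) or 1e-300
        labels.append(1 if p[pos] / total >= 0.5 else 0)
    return labels


# Hypothetical scores for eight samples labelled "uncertain" for one pathology.
uncertain_scores = [0.05, 0.10, 0.12, 0.08, 0.90, 0.85, 0.95, 0.88]
weights, means, variances = fit_gmm_1d(uncertain_scores)
resolved_labels = resolve_uncertain(uncertain_scores, weights, means, variances)
print(resolved_labels)
```

In practice a library implementation such as scikit-learn's `GaussianMixture` would likely replace the hand-written EM loop; the pure-Python version is used here only to keep the sketch self-contained.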