Interpreting chest X-rays via CNNs that exploit hierarchical disease dependencies and uncertainty labels
Introduction
Chest X-ray (CXR) is one of the most common radiological exams for diagnosing diseases of the lungs and heart, with millions of scans performed globally every year [1], [2]. Many of these diseases, such as pneumothorax [3], can be deadly if not diagnosed quickly and accurately. A computer-aided diagnosis (CAD) system that can correctly identify the most common observations from CXRs would significantly benefit clinical practice. In this work, we investigate the problem of multi-label classification for CXRs using deep convolutional neural networks (CNNs).
There has been a recent effort to harness advances in machine learning, especially deep learning, to build a new generation of CAD systems for classification and localization of common thoracic diseases from CXR images [4]. Several motivations are behind this transformation. First, interpreting CXRs to accurately diagnose pathologies is difficult. Even well-trained radiologists can easily make mistakes because many pathologies have similar visual features and are hard to tell apart [5]. A high-precision method for classifying and localizing common thoracic diseases can therefore serve as a second reader that supports radiologists' decision making and helps reduce diagnostic error. It also addresses the lack of diagnostic expertise in areas where radiologists are scarce or unavailable [6], [7]. Second, such a system can be used as a screening tool that reduces patient waiting time in hospitals and allows care providers to respond to emergencies sooner or to speed up the diagnostic imaging workflow [8]. Third, deep neural networks, in particular deep CNNs, have shown remarkable performance in various medical imaging applications [9], including the CXR interpretation task [10], [11], [12], [13].
Several deep learning-based approaches have been proposed for classifying lung diseases and have been shown to achieve human-level performance [10], [14]. Almost all of these approaches, however, aim to detect a specific disease such as pneumonia [15], tuberculosis [16], [17], or lung cancer [18]. Meanwhile, building a unified deep learning framework that accurately detects the presence of multiple common thoracic diseases from CXRs remains a difficult task that requires much research effort. In particular, we recognize that standard multi-label classifiers often ignore domain knowledge. For example, in the case of CXR data, how to leverage clinical taxonomies of disease patterns and how to handle uncertainty labels are still open questions that have not received much research attention. This observation motivates us to build and optimize a predictive model based on deep CNNs for CXR interpretation in which dependencies among labels and uncertainty information are taken into account during both the training and inference stages. Specifically, we develop a deep learning-based approach that combines the ideas of conditional training [19] and label smoothing [20] into a novel training procedure for classifying 14 common lung diseases and observations. We trained our system on more than 200,000 CXRs of the CheXpert dataset [21], one of the largest CXR datasets currently available, and evaluated it on the validation set of CheXpert containing 200 studies, which were manually annotated by 3 board-certified radiologists. The proposed method is also tested against the majority vote of 5 radiologists on the hidden test set of the CheXpert competition, which contains 500 studies.
This study makes several contributions. First, we propose a novel training strategy for multi-label CXR classification that incorporates (1) a conditional training process based on a predefined disease hierarchy and (2) a smoothing regularization technique for uncertainty labels. The benefits of these two key factors are empirically demonstrated through our ablation studies. Second, we train a series of state-of-the-art CNNs on frontal-view CXRs of the CheXpert dataset for classifying 14 common thoracic diseases. Our best model, which is an ensemble of various CNN architectures, achieves the highest area under the ROC curve (AUC) score on both the validation set and the test set of CheXpert at the time of writing. Specifically, on the validation set, it yields an averaged AUC of 0.940 in predicting 5 selected lung diseases: Atelectasis (0.909), Cardiomegaly (0.910), Edema (0.958), Consolidation (0.957) and Pleural Effusion (0.964). This model improves on the baseline method reported in [21] by a large margin of 5%. On the independent test set, we obtain a mean AUC of 0.930. More importantly, the proposed deep learning model is on average more accurate than 2.6 out of 3 individual radiologists in predicting the 5 selected thoracic diseases when presented with the same data.
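As a quick arithmetic check, the averaged validation AUC quoted above can be reproduced from the five per-disease scores. The snippet below is only a sanity check on the reported numbers; the variable names are illustrative and not taken from the authors' code.

```python
# Per-disease validation AUCs as quoted in the text above.
val_auc = {
    "Atelectasis": 0.909,
    "Cardiomegaly": 0.910,
    "Edema": 0.958,
    "Consolidation": 0.957,
    "Pleural Effusion": 0.964,
}

# Unweighted mean over the 5 selected diseases.
mean_auc = sum(val_auc.values()) / len(val_auc)
print(f"{mean_auc:.3f}")  # prints 0.940
```

The unweighted mean is 4.698 / 5 = 0.9396, which rounds to the reported 0.940.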
The rest of the paper is organized as follows. Related works on CNNs in medical imaging and the problem of multi-label classification in CXR images are reviewed in Section 2. In Section 3, we present the details of the proposed method with a focus on how to deal with dependencies among diseases and uncertainty labels. Section 4 provides comprehensive experiments on the CheXpert dataset. Section 5 discusses the experimental results, some key findings and limitations of this research. Finally, Section 6 concludes the paper.
Section snippets
Related works
Thanks to the increased availability of large-scale, high-quality labeled datasets [22], [21], [23] and high-performing deep network architectures [24], [25], [26], [27], deep learning-based approaches have been able to reach, and even exceed, expert-level performance on many medical image interpretation tasks [10], [12], [11], [28], [29], [16]. Most successful applications of deep neural networks in medical imaging rely on CNNs [30], [31], which utilize convolutions to extract local features
Proposed method
In this section, we present the details of the proposed method. We first formulate the multi-label classification problem for CXRs and describe the evaluation protocol used in this study (Section 3.1). We then describe a new training procedure that exploits the relationships among diseases to improve model performance (Section 3.2). This section also introduces the way we use label smoothing regularization (LSR) to deal with uncertain samples in the training data (Section 3.3).
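To make the idea of exploiting a disease hierarchy more concrete, here is a minimal sketch of one common way conditional training is set up. It is not the authors' implementation: the `PARENT` map, the label names chosen for it, and both helper functions are hypothetical and serve only as illustration. The sketch assumes that a model is first trained on samples whose parent labels are all positive, so that its outputs can be read as conditional probabilities, and that at inference the unconditional probability of a label is the product of the conditional probabilities along the path to the root.

```python
# Hypothetical hierarchy for illustration only: child -> parent (None = top level).
PARENT = {
    "Lung Opacity": None,
    "Consolidation": "Lung Opacity",
    "Pneumonia": "Consolidation",
}

def parents_positive(sample_labels, label, parent=PARENT):
    """Return True if every ancestor of `label` is positive (1) in this sample."""
    node = parent[label]
    while node is not None:
        if sample_labels.get(node) != 1:
            return False
        node = parent[node]
    return True

def unconditional_probability(label, conditional, parent=PARENT):
    """Multiply conditional probabilities along the path from `label` to the root."""
    prob = 1.0
    node = label
    while node is not None:
        prob *= conditional[node]
        node = parent[node]
    return prob

# Training side: keep a sample for a child label only if its ancestors are positive.
sample = {"Lung Opacity": 1, "Consolidation": 0, "Pneumonia": 0}
keep_for_consolidation = parents_positive(sample, "Consolidation")  # True
keep_for_pneumonia = parents_positive(sample, "Pneumonia")          # False

# Inference side: combine per-node conditional outputs into one probability.
cond = {"Lung Opacity": 0.8, "Consolidation": 0.5, "Pneumonia": 0.5}
p_pneumonia = unconditional_probability("Pneumonia", cond)  # 0.5 * 0.5 * 0.8
```

One consequence of this design is that a child label can never receive a higher probability than any of its ancestors, which is how the hierarchy constrains the predictions.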
CXR dataset and settings
The CheXpert dataset [21] was used to develop and evaluate the proposed method. It is one of the largest public CXR datasets currently available and contains 224,316 X-ray scans of 65,240 patients. The dataset was labeled for the presence of 14 observations, including 12 common thoracic pathologies. Each observation is labeled as positive (1), negative (0), or uncertain (−1). The main task on CheXpert is to predict the probability of multiple observations from an input CXR. The
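The three-valued labels described above (1, 0, −1) have to be turned into training targets somehow. The following is a minimal sketch of one way label smoothing can be applied to the uncertain entries, assuming each −1 is replaced by a soft target drawn uniformly from an interval. The function name `smooth_uncertain` and the interval bounds `low` and `high` are illustrative assumptions; this snippet does not state the values the authors used.

```python
import random

def smooth_uncertain(labels, low, high, rng=None):
    """Map CheXpert-style labels to training targets.

    Positive (1) and negative (0) labels are kept as hard targets; each
    uncertain label (-1) becomes a soft target drawn uniformly from
    [low, high]. The bounds are assumptions chosen by the caller.
    """
    rng = rng or random.Random()
    return [rng.uniform(low, high) if y == -1 else float(y) for y in labels]

# Example: treat uncertain labels as "probably positive" soft targets.
labels = [1, 0, -1, -1, 1]
targets = smooth_uncertain(labels, low=0.55, high=0.85, rng=random.Random(0))
```

Choosing an interval near 1 treats uncertain cases as likely positive, while an interval near 0 treats them as likely negative; in both cases the model is not forced to be fully confident on samples the labeler itself marked as uncertain.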
Key findings and meaning
By training a set of strong CNNs on a large-scale dataset, we built a deep learning model that can accurately predict multiple thoracic diseases from CXRs. In particular, we empirically showed a major improvement in AUC score from exploiting the dependencies among diseases and from applying the label smoothing technique to uncertain samples. We found that it is especially difficult to obtain a good AUC score for all diseases with a single CNN. It is also observed that the classification
Conclusion
In this paper, we presented a comprehensive approach to building a high-precision computer-aided diagnosis system for classifying common thoracic diseases from CXRs. We investigated almost every aspect of the task, including data cleaning, network design, training, and ensembling. In particular, we introduced a new training procedure in which dependencies among diseases and uncertainty labels are effectively exploited and integrated into the training of advanced CNNs. Extensive experiments
CRediT authorship contribution statement
Hieu H. Pham: Conceptualization, Methodology, Writing - original draft, Writing - review & editing. Tung T. Le: Visualization. Dat Q. Tran: Visualization. Dat T. Ngo: Visualization. Ha Q. Nguyen: Conceptualization, Methodology, Writing - original draft, Writing - review & editing, Supervision, Validation.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
This research was supported by the Vingroup Big Data Institute (VinBDI). The authors gratefully acknowledge Jeremy Irvin from the Machine Learning Group, Stanford University for helping us evaluate the proposed method on the hidden test set of CheXpert.
Huy H. Pham received the Ph.D. degree in Computer Science at the Toulouse Computer Science Research Institute and Toulouse Cerema Research Center, France. He is currently working as a staff research scientist in the Medical Imaging Department at the Vingroup Big Data Institute (VinBDI), Hanoi, Vietnam, with a focus on medical image analysis. His research interests include image processing, computer vision and machine learning.
References (51)
- et al. Increased incidence of spontaneous pneumothorax in very young people: observations and treatment. Chest (2016)
- et al. A survey on deep learning in medical image analysis. Medical Image Analysis (2017)
- et al. Identifying pneumonia in chest X-rays: A deep learning approach. Measurement (2019)
- et al. The prostate, lung, colorectal and ovarian (PLCO) cancer screening trial of the national cancer institute: History, organization, and status. Controlled Clinical Trials (2000)
- N. England, Diagnostic imaging dataset statistical release. February 2019, https://www.england.nhs.uk/, (accessed 30...
- L. Anderson, A. Dean, D. Falzon, K. Floyd, I. Baena, C. Gilpin, P. Glaziou, Y. Hamada, T. Hiatt, A. Char, et al.,...
- et al. Computer-aided detection in chest radiography based on artificial intelligence: a survey. Biomedical Engineering Online (2018)
- L. Delrue, R. Gosselin, B. Ilsen, A. Van Landeghem, J. de Mey, P. Duyck, Difficulties in the interpretation of chest...
- et al. Global supply of health professionals. New England Journal of Medicine (2014)
- T. Atlantic, Most of the world doesn’t have access to X-rays,...
- Automated triaging of adult chest radiographs with deep artificial neural networks. Radiology
- Deep learning for chest radiograph diagnosis: A retrospective comparison of the CheXNeXt algorithm to practicing radiologists. PLoS Medicine
- Boosted cascaded convnets for multilabel classification of thoracic diseases in chest radiographs
- Development and validation of a deep learning–based automated detection algorithm for major thoracic diseases on chest radiographs. JAMA Network Open
- Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology
- Efficient deep network architectures for fast chest X-ray tuberculosis screening and visualization. Scientific Reports
- Automatic lung cancer prediction from chest X-ray images using the deep learning approach
- When does label smoothing help?
Tung T. Le is currently pursuing the B.S. degree in Computer Science at the Department of Computer Science, University of Engineering and Technology, Vietnam National University, Hanoi, Vietnam. He is also a research intern in the Medical Imaging Department at the Vingroup Big Data Institute (VinBDI), Hanoi, Vietnam. His research interests include medical image analysis and deep learning techniques.
Dat Q. Tran received his B.S. degree in Biomedical Engineering from the Department of Biomedical Engineering, International University, Ho Chi Minh City, Vietnam, in 2019. He is currently working as a Computer Vision Research Engineer in the Medical Imaging Department at the Vingroup Big Data Institute (VinBDI), Hanoi, Vietnam. His work focuses on applying advanced computer vision algorithms to biomedical imaging.
Dat T. Ngo received his B.S. degree in Electronics and Communication Engineering from the Faculty of Electronics & Telecommunications, University of Engineering and Technology, Vietnam National University, Hanoi, Vietnam, in 2017. He is currently working as a Computer Vision Research Specialist in the Medical Imaging Department at the Vingroup Big Data Institute (VinBDI), Hanoi, Vietnam. His research interests include computer vision and deep learning techniques.
Ha Q. Nguyen was born in Hai Phong, Vietnam, in 1983. He received the B.S. degree in mathematics from the Hanoi National University of Education, Hanoi, Vietnam, the S.M. degree in electrical engineering and computer science from the Massachusetts Institute of Technology, Cambridge, MA, USA, and the Ph.D. degree in electrical and computer engineering from the University of Illinois at Urbana-Champaign, Champaign, IL, USA, in 2005, 2009, and 2014, respectively.
During 2009–2011, he was a Lecturer in electrical engineering with the International University, Vietnam National University, Ho Chi Minh City, Vietnam. During 2014–2017, he was a Postdoctoral Research Associate with the Biomedical Imaging Group, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland, and during 2017–2018, he was a Signal Processing Engineer with the Viettel Research & Development Institute, Hanoi, Vietnam. He is currently the Head of the Medical Imaging Department at the Vingroup Big Data Institute, Hanoi, Vietnam. His research interests include medical image analysis, machine learning, computational imaging, and data compression.
Dr. Nguyen was a Fellow of Vietnam Education Foundation, cohort 2007. He was the recipient of the Best Student Paper Award (second prize) of the IEEE International Conference on Acoustics, Speech and Signal Processing in 2014 for his paper (with P.A. Chou and Y. Chen) on the compression of human body sequences using graph wavelet filter banks.