Automatic cell nuclei segmentation and classification of breast cancer histopathology images
Introduction
Breast cancer is the leading type of malignant tumor in women, and early detection is very important for successful treatment. Diagnosis from histopathological images remains the "gold standard" for breast cancer. However, manual analysis of numerous biopsy slides by pathologists is labor-intensive and has suboptimal reproducibility. Thanks to recent advances in digital pathology, automatic image analysis has the potential to overcome subjective interpretation and reduce this workload. Computer-aided diagnosis (CADx) schemes are becoming an important tool to assist pathologists in breast cancer detection and diagnosis. A CADx scheme consists of two phases: a segmentation phase and a classification phase [1], [2], [3].
Segmentation of nuclei is an important first step towards automatic analysis of breast cancer histopathology (BCH) images. Several algorithms for segmenting nuclei in BCH images have been proposed. Most of them revolve around watershed segmentation, active contours, pixel classification, or combinations of these, supplemented by different pre-processing and post-processing phases. Watershed segmentation is typically improved by first obtaining marker locations that identify the objects of interest and the background [4], [5], [6], [7], [8]. However, these techniques suffer from over-segmentation and do not work well for overlapping cells. Active contours seek a minimum-energy fit of moving contours, and the algorithm is usually combined with a nuclei detection method. Methods of this kind initially define a large number of candidate regions and then select the ones that represent correctly segmented nuclei [8], [9], [10], [11], [12], [13]. However, these models have limitations in convergence: the optimization problem involved leads to uncertainty and poor stability in the segmentation results. Moreover, clustering-based methods such as K-means [14] and unsupervised or supervised machine learning [12], [15] have been applied to the segmentation of breast cancer cell nuclei. These methods require explicit prior knowledge of the image structure, and their computational complexity is relatively high. Because of the high variability of tissue appearance, reliable cell nuclei segmentation in BCH images is still a challenging task.
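As an illustration of the clustering-based family mentioned above, the following minimal sketch runs a two-cluster K-means on grayscale intensities and labels the darker cluster as nuclei. It is not the method of [14]: the function names `kmeans_1d` and `nuclei_mask`, the min/max initialization, and the assumption that nuclei are the darkest pixels are all choices made here for illustration.

```python
from typing import List


def kmeans_1d(values: List[float], k: int = 2, iterations: int = 20) -> List[float]:
    """Cluster scalar intensities with plain K-means (k >= 2).

    Centers are initialized evenly between the minimum and maximum value,
    which is an assumption of this sketch, not a requirement of K-means.
    """
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        clusters: List[List[float]] = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            clusters[nearest].append(v)
        # Keep the old center when a cluster ends up empty.
        updated = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return centers


def nuclei_mask(values: List[float]) -> List[bool]:
    """Mark a pixel as nucleus when its nearest center is the darkest one."""
    centers = kmeans_1d(values, k=2)
    darkest = centers.index(min(centers))
    return [
        min(range(len(centers)), key=lambda c: abs(v - centers[c])) == darkest
        for v in values
    ]


# Hypothetical flattened grayscale pixels: dark nuclei, bright background.
pixels = [30, 35, 40, 200, 210, 220, 38, 205]
mask = nuclei_mask(pixels)
```

Intensity clustering of this kind ignores spatial context, which is one reason such methods struggle with touching or overlapping nuclei.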
After the nuclei are precisely segmented, the classification phase is carried out. Classification performance depends mainly on the features extracted and the classification algorithm used. Some researchers have studied the analysis of BCH images by proposing new features or considering different classification algorithms. Most of the extracted features are morphology-based or texture-based. Newer features include textural features obtained with critical exponent analysis (CEA) [16] and complex Daubechies wavelets [13], [17], and distribution-based features of nuclei [18]. Most classification algorithms used in the literature revolve around the Support Vector Machine (SVM) [19], k-Nearest Neighbor (k-NN) [13], Naive Bayes (NB) [18], fuzzy c-means (FCM) [20], neural networks [19], or a combination of these algorithms [18]. Other classification approaches, such as Decision Tree (DT) and partial least squares regression [21], have been applied to breast cell classification as well. Almost all of these studies used every extracted feature as classifier input. However, feature selection can improve classification accuracy and reliability [18], [22].
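The idea of wrapper-style feature selection, in which candidate feature subsets are scored by the accuracy of a classifier trained on them, can be sketched as follows. This is a minimal sketch under stated assumptions: it uses a plain genetic algorithm with elitism and a leave-one-out nearest-centroid classifier as simple stand-ins, and the names `loo_accuracy` and `ga_select`, the parameter values, and the toy data are hypothetical.

```python
import random
from typing import List, Sequence


def loo_accuracy(X: Sequence[Sequence[float]], y: Sequence[int], mask: Sequence[bool]) -> float:
    """Leave-one-out accuracy of a nearest-centroid classifier on the masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        best_label, best_dist = None, float("inf")
        for label in sorted(set(y)):
            rows = [X[r] for r in range(len(X)) if r != i and y[r] == label]
            if not rows:
                continue
            centroid = [sum(row[j] for row in rows) / len(rows) for j in cols]
            dist = sum((X[i][j] - c) ** 2 for j, c in zip(cols, centroid))
            if dist < best_dist:
                best_label, best_dist = label, dist
        correct += best_label == y[i]
    return correct / len(X)


def ga_select(X, y, generations: int = 15, pop_size: int = 12,
              mutation: float = 0.1, seed: int = 0) -> List[bool]:
    """Search feature masks with a simple elitist genetic algorithm (needs >= 2 features)."""
    rng = random.Random(seed)
    n = len(X[0])
    # Seed the population with the all-features mask so the result is never worse than it.
    population = [[True] * n] + [
        [rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size - 1)
    ]

    def fitness(mask: Sequence[bool]) -> float:
        return loo_accuracy(X, y, mask)

    for _ in range(generations):
        elite = sorted(population, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)              # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < mutation else g for g in child]
            children.append(child)
        population = elite + children              # elites survive unchanged
    return max(population, key=fitness)


# Hypothetical toy data: feature 0 separates the classes, the others are noise.
X = [[1.0, 5.0, 0.2], [1.2, 1.0, 0.9], [0.9, 9.0, 0.1],
     [3.0, 4.0, 0.8], [3.1, 8.0, 0.3], [2.9, 2.0, 0.7]]
y = [0, 0, 0, 1, 1, 1]
best_mask = ga_select(X, y)
```

Because the elite half of each generation is carried over unchanged and the initial population contains the all-features mask, the returned subset scores at least as well as using every feature under the same evaluation.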
In this paper, an automatic CADx scheme for BCH images is proposed. For nuclei segmentation, a top-bottom hat transform is applied to enhance the grayscale image. Wavelet decomposition and multi-scale region-growing (WDMR) are combined to obtain regions of interest (ROIs), and a double strategy splitting model (DSSM), which combines adaptive mathematical morphology with the Curvature Scale Space (CSS) corner detection method, is applied to split overlapped cells for better accuracy and robustness. For classification of the cell nuclei, 4 shape-based features and 138 textural features based on color spaces are extracted as the initial feature set. The optimal feature set is then selected by a wrapper feature selection algorithm based on the chain-like agent genetic algorithm (CAGA) [23] and an SVM classifier to achieve high classification accuracy.
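The top-bottom hat enhancement step can be illustrated with a small sketch. The text above does not give the exact formula, so the code assumes one common formulation: enhanced = I + tophat(I) - bottomhat(I), where tophat(I) = I - opening(I) and bottomhat(I) = closing(I) - I. The flat square structuring element, its radius, and the function names are likewise assumptions made for illustration.

```python
from typing import Callable, List

Image = List[List[int]]


def _window_filter(img: Image, radius: int, pick: Callable[[List[int]], int]) -> Image:
    """Apply min or max over a square window, clipped at the image border."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            out[y][x] = pick(window)
    return out


def erode(img: Image, radius: int) -> Image:
    return _window_filter(img, radius, min)


def dilate(img: Image, radius: int) -> Image:
    return _window_filter(img, radius, max)


def enhance(img: Image, radius: int = 1) -> Image:
    """Contrast enhancement: I + tophat(I) - bottomhat(I), clamped to [0, 255]."""
    opened = dilate(erode(img, radius), radius)   # opening removes small bright details
    closed = erode(dilate(img, radius), radius)   # closing removes small dark details
    out = []
    for y, row in enumerate(img):
        new_row = []
        for x, value in enumerate(row):
            top_hat = value - opened[y][x]        # bright detail, always >= 0
            bottom_hat = closed[y][x] - value     # dark detail, always >= 0
            new_row.append(max(0, min(255, value + top_hat - bottom_hat)))
        out.append(new_row)
    return out


# Hypothetical 5x5 patch: uniform background with one darker pixel in the center.
patch = [[100] * 5 for _ in range(5)]
patch[2][2] = 60
enhanced = enhance(patch)
```

In this formulation, bright details smaller than the structuring element become brighter and dark details become darker, which increases the contrast between dark-stained nuclei and their surroundings before region growing.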
This paper is divided into five sections. Section 1 reviews methods for cell nuclei segmentation and classification. Section 2 describes the acquisition process for the images analyzed in this paper. Section 3 presents the methods used in the cell segmentation stage. Feature extraction, selection, and classification algorithms are described in Section 4. Section 5 discusses the experimental results of segmentation and classification of BCH images. Finally, the paper ends with conclusions.
Tissue preparation and imaging
The first step of the tissue preparation process was formalin fixation and embedding in paraffin [24], [25]. From the paraffin blocks, sections with a thickness of 4 μm were cut using a microtome (a high-precision cutting instrument) and mounted on glass slides. The mounted sections were then stained with H&E and cover-slipped to make the nuclei and cytoplasm visible. The histopathological images were acquired through a digital camera attached to an optical microscope with 40×
Nuclei detection and segmentation
Because all subsequent processes depend on the segmentation results, it is very important that all cells are isolated properly. An automatic segmentation method based on multi-scale region-growing with a Double Strategy Splitting Model is proposed. A block diagram giving an overview of the proposed method is shown in Fig. 1. The approach can be divided into three steps: pre-processing, ROI extraction using WDMR, and isolation of overlapped cells based on DSSM.
Classification
Efficient classification of cell nuclei requires meaningful features with very good discriminative ability. Morphometric, colorimetric, textural, and structural features have been used for feature selection and feature evaluation in previous studies on breast cell classification. To obtain better classification results and save time, shape-based features and textural features based on different color spaces are extracted and selected by CAGA. These
Results and discussion
The proposed automatic CADx scheme for BCH images is implemented in MATLAB. For the segmentation phase, the segmentation strategy is compared qualitatively and quantitatively with other methods to demonstrate its efficiency. The segmentation method is applied to the 68 images of the dataset, and its performance is evaluated. For the classification phase, two different classification schemes are applied: image classification and patient classification. The
Conclusions
This paper describes a computer-aided diagnosis system for quantitatively analyzing BCH images. A new fully automatic method is developed for the detection and segmentation of cell nuclei. The proposed method uses the wavelet transform and multi-scale region growing to locate ROIs. An adaptive morphological operation combined with the CSS corner detection algorithm is then applied to separate overlapping cells. Compared with other cell image segmentation methods, the segmentation
Acknowledgment
This research is funded by the National Natural Science Foundation of China (NSFC) (Nos. 61108086, 61171089, and 11304382), the Natural Science Foundation of Chongqing (cstc2012jjA40015), the Chongqing City Science and Technology Plan (cstc2012gg-yyjs0572), the Fundamental Research Funds for the Central Universities (CDJZR12160011, CDJZR13160008, and CDJZR155507), the China Postdoctoral Science Foundation (2013M532153), and the Chongqing Postdoctoral Science Special Foundation of China.
References (37)
- et al. Automatic image segmentation of nuclear stained breast tissue sections using color active contour model and an improved watershed method. Biomed. Signal Process. Control (2013)
- et al. Unsupervised cell nucleus segmentation with active contours. Signal Process. (1998)
- et al. Texture analysis of breast cancer cells in microscopic images using critical exponent analysis method. Procedia Eng. (2012)
- et al. Comparison of algorithms that select features for pattern classifiers. Pattern Recognit. (2000)
- et al. Two coding based adaptive parallel co-genetic algorithm with double agents structure. Eng. Appl. Artif. Intell. (2010)
- Wavelet-based corner detection using eigenvectors of covariance matrices. Pattern Recognit. Lett. (2003)
- et al. Estimates of global cancer prevalence for 27 sites in the adult population in 2008. Int. J. Cancer (2012)
- A better lens on disease: computerized pathology slides may help doctors make faster and more accurate diagnoses. Sci. Am. (2010)
- et al. Going fully digital: perspective of a Dutch academic pathology lab. J. Pathol. Inform. (2013)
- et al. Automatic nuclei segmentation in H&E stained breast cancer histopathology images. PLoS One (2013)