1 Introduction

Owing to the complex nature of polymeric composite materials, including the presence of interfaces and inherent anisotropy, their drawbacks include poorer performance at high temperature, poor through-the-thickness properties, and poor resistance to transverse impact loadings [13]. The in-service damage types include delamination, bond failure, cracking, moisture ingress, fibre buckling or fracture, failure of the interface between the matrix and fibres, and impact damage. Low-energy impacts are a common source of problems, since they cause the so-called barely visible impact damage (BVID): extensive internal damage accompanied by very limited visible marks on the impacted surface.

Modern non-destructive testing (NDT) methods allow for effective diagnostics of composite structures. The commonly used damage tolerance philosophy allows composite components with existing damage to be operated under certain conditions. A structure is considered damage tolerant if the existing damage does not weaken its structural integrity. Such structures are therefore included in a maintenance program, the aim of which is to identify damage before its development reduces the residual strength of the structure below an acceptable limit. The damage tolerance approach requires identifying the extent of damage and monitoring its growth by periodic inspections using NDT methods. This procedure is very important because of the complex nature of damage initiation and propagation in composites. In a general scenario, micro-cracks develop in the composite matrix due to cyclic loading; continued loading turns the micro-cracks into macroscopic cracks, which, in turn, spread through the composite plies and finally develop into local delamination due to stress concentrations. Once delamination has formed, the damage may grow rapidly up to failure [20]. In view of this progressive damage behaviour, composite elements are periodically inspected to monitor the damage progression.

1.1 Ultrasonic Testing of Composite Materials

One of the most commonly applied NDT methods for composite elements is ultrasonic testing (UT). During ultrasonic inspections, two fundamental parameters of the received energy are observed: the amplitude and the Time-of-Flight (ToF). The so-called C-Scan presentation mode of UT results, which shows damage as seen from the top side of the tested element, is used for damage sizing. Appropriate processing of ultrasonic C-Scans allows for damage detection, localisation, and calculation of its extent. However, numerous factors influence damage detectability and the occurrence of measurement errors during ultrasonic inspections. In [64], the authors presented a study on the uncertainty connected with the selection of ultrasonic testing techniques and parameters, which significantly affects the damage size estimation. This paper focuses on the uncertainty connected with the post-processing of ultrasonic data obtained after the inspection.

1.2 Ultrasonic C-Scan Processing

To calculate the extent of damage detected in an ultrasonic C-Scan, it is first necessary to separate the pixels representing damage from the other pixel regions visible in the scan. Appropriate processing of ultrasonic scans, which extracts the damage region properly before its extent is calculated, is therefore a very important factor in damage sizing. Industrial inspections using UT techniques must be performed by a certified expert. In practice, damage size is usually calculated based on manually selected areas on the C-Scans. Because of the need to increase the efficiency of ultrasonic inspections, i.e. to shorten the duration of UT data analysis, the application of image processing methods is one of the goals currently being pursued.

In the literature, one can find several approaches that aid damage size assessment from C-Scans using image processing methods. Examples include a method based on segmentation of ultrasonic scans by data clustering [49], segmentation based on the statistical mean and standard deviation [33], and an algorithm based on image filtering, thresholding and morphological operations [59]. The latest studies by the authors of this paper on the application of image processing methods to UT data cover an interactive algorithm for damage extraction from C-Scans and calculation of its extent [62], as well as the development of damage segmentation and 3D reconstruction approaches [63, 65, 66].

However, there is so far little work devoted to the development of a universal method of damage detection from C-Scans. Approaches found in the literature were adapted to specific tested cases and are not universal. Some methods that succeed in the case of inspection of simple structures, e.g. with a uniform thickness, may fail in the case of testing of more complex structures, e.g. with a varying thickness, and vice versa. There is a vast number of methods of image processing that may be helpful for the purpose of damage extraction from ultrasonic scans. The aim of this paper is to analyse various types of image segmentation methods in the light of their effectiveness in damage extraction, and select the most useful methods for the analysis of ultrasonic scans.

2 A Brief Review on Image Segmentation Methods

Image segmentation partitions an image into fragments (sets of pixels) corresponding to objects visible in the image. The main criteria considered in such a procedure are colour, intensity, or texture. Segmentation algorithms may be applied either to the images as originally recorded or after initial processing, e.g. the application of filters. After segmentation, methods of mathematical morphology can be used to improve the results. Finally, the segmentation results are used to extract quantitative information from the images.

In the literature, there is a large number of surveys on image segmentation methods, ranging from earlier general overviews (e.g. [17, 22, 53]) and surveys focused on threshold-based methods [47, 60] to more recent ones, both general (see e.g. [11, 32, 56, 68, 69]) and focused on image binarization [9, 58], thresholding [21, 52], or colour image segmentation [55].

The surveys and comparative studies mentioned above are mainly based on object detection from photographs, where the problems have a different character than in the case of ultrasonic scans. These are, for instance, the noise content in the photograph or the influence of the uniformity of the illumination. Moreover, many studies are dedicated to problems related to text/background separation (see e.g. [29, 58]) for optical character recognition (OCR) systems. Surveys of image segmentation found in the area of UT relate mainly to medical applications (e.g. [36, 48, 57]). In the survey of segmentation of X-ray and C-Scan images of composite materials performed by Jain and Dubuisson [25], only four methods, mainly adaptive thresholding, were tested and compared. The authors of [52] compared a larger number of image thresholding methods for NDT applications. There seems to be a lack of comprehensive analyses of image segmentation methods in the context of processing ultrasonic images in industrial applications, such as diagnostics of composite materials. As mentioned earlier, performing such an analysis is the aim of this paper.

Short descriptions of the most common segmentation methods, grouped into the following categories, are presented below. Given the introductory character of this section, full descriptions are omitted here; the reader can find the details in the cited literature.

2.1 Threshold-Based Segmentation

Thresholding is the simplest and one of the most commonly used segmentation methods. The pixels are divided depending on their intensity value. In a basic approach, a threshold value is selected from a grey-scale image and used to separate the foreground of the image from its background. This approach is also called a bi-modal segmentation, since it assumes that the image contains two classes. The threshold can be chosen manually or automatically using one of many methods developed for this purpose, which are described below. Thresholding can be categorized into global, variable, and multiple methods.

2.1.1 Global Thresholding

Global thresholding is based on using a single threshold T for the entire image. Several approaches of automated T selection are listed below.

  • Methods based on using a Gaussian-mixture distribution. Otsu’s method [39] aims at finding the optimal value for the global T. It is based on the interclass variance maximization (or, equivalently, the intraclass variance minimization) between dark and light regions, under the assumption that well-thresholded classes have well-discriminated intensity values. This method is also categorised as clustering-based thresholding. Ridler and Calvard [45] proposed an iterative version of Otsu’s method. Kittler and Illingworth [31] presented a minimum-error-thresholding method based on fitting a mixture of Gaussian distributions.

  • Methods based on a histogram shape, where, for example, the peaks, valleys and curvatures of the histogram are analysed. One of the examples is an approach proposed by Prewitt and Mendelsohn [42], where the histogram is smoothed iteratively until it has only two local maxima.

  • Methods based on maximizing the entropy of the histogram of grey levels of the resulting classes, e.g. proposed by Pun [43], and modified by Kapur et al. [26] or by Pal and Pal [40]. A faster, two-stage approach based on entropy was proposed by Chen et al. [10].
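
The interclass variance criterion used by Otsu’s method can be illustrated with a short sketch. It is a minimal illustration written for this text, not the implementation used in the study; the function name `otsu_threshold` and the toy histogram are assumptions, and only the Python standard library is used.

```python
def otsu_threshold(histogram):
    """Return the grey level t that maximises the between-class variance.

    Pixels with level <= t form one class, the remaining pixels the other.
    """
    total = sum(histogram)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_t, best_var = 0, -1.0
    w0 = 0        # number of pixels in the lower class
    sum0 = 0.0    # sum of grey levels in the lower class
    for t, count in enumerate(histogram):
        w0 += count
        sum0 += t * count
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the criterion is undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance up to the constant factor 1 / total**2
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Toy bimodal histogram over 8 grey levels (illustrative data only)
hist = [10, 20, 10, 0, 0, 10, 20, 10]
threshold = otsu_threshold(hist)
```

For this symmetric toy histogram every level inside the empty valley gives the same criterion value, and the sketch keeps the first of them.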

2.1.2 Variable Thresholding

The thresholding methods are called variable when T can change over the image. They can be categorised into local or regional thresholding, when T depends on a neighbourhood of the given pixel coordinates (x, y), and adaptive thresholding, when T is a function of (x, y). Examples of the most common algorithms are listed below.

  • Niblack’s algorithm [35] calculates a local threshold by sliding a rectangular window over the grey-level image. The computation of the threshold is based on the local mean and the local standard deviation of all the pixels in the window. This approach is the parent of many local image thresholding methods.

  • Sauvola’s algorithm [50] is the modification of the Niblack’s algorithm, also based on the local mean value and the local standard deviation, but the threshold is computed with the dynamic range of standard deviation.

  • Wolf’s algorithm [61] addresses a problem in Sauvola’s method when the grey level of the background and the foreground are close. The authors proposed to normalize the contrast and the mean grey value of the image before computing the threshold.

  • Feng’s algorithm [16] introduced the notion of two local windows, one contained within the other. This method can qualitatively outperform Sauvola’s thresholding; however, many parameters have to be determined empirically, which limits its use in practice.

  • Nick’s algorithm [30] derives the thresholding formula from the original Niblack’s algorithm. The method was developed for the OCR applications, especially for low quality ancient documents. The major advantage of this method is that it improves binarization for light page images by shifting down the threshold.

  • Mean and median thresholding algorithm. The mean-based method calculates the mean value in a local window and if the pixel’s intensity is below the mean the pixel is set to black, otherwise the pixel is set to white. In the median-based algorithm the threshold is selected as the median of the local grey-scale distribution.

  • Bernsen’s algorithm [2] is a method using a user-defined contrast threshold. When the local contrast is above or equal to the contrast threshold, the threshold is set as the mean value of the minimum and maximum values within the local window. When the local contrast is below the contrast threshold, the neighbourhood is set to only one class (an object or background) depending on the mean value.

  • Bradley’s algorithm [7] is an adaptive method, where each pixel is set to black if its value is t percent lower than the average of the surrounding pixels in the local window, otherwise it is set to white.

  • Triangle algorithm [67] calculates the threshold based on a line constructed between the global maximum of the histogram and a grey level near the end of the histogram. The threshold value is set as the histogram level from which the normal distance to the line is maximal.
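
The local thresholds of the Niblack and Sauvola type can be sketched as follows. This is a minimal illustration under stated assumptions: the commonly quoted formulas T = m + k·s (Niblack) and T = m·(1 + k·(s/R − 1)) (Sauvola) are used, the default values of k and of the dynamic range R are typical choices rather than values taken from this paper, and the function names are hypothetical.

```python
import math


def local_stats(image, row, col, half):
    """Mean and standard deviation in a (2*half+1)-wide window clipped at the borders."""
    rows = range(max(0, row - half), min(len(image), row + half + 1))
    cols = range(max(0, col - half), min(len(image[0]), col + half + 1))
    values = [image[r][c] for r in rows for c in cols]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var)


def niblack_threshold(mean, std, k=-0.2):
    # k = -0.2 is an assumed, commonly used default
    return mean + k * std


def sauvola_threshold(mean, std, k=0.5, dynamic_range=128.0):
    # k and dynamic_range are assumed defaults for 8-bit images
    return mean * (1.0 + k * (std / dynamic_range - 1.0))


def binarize(image, threshold_fn, half=1):
    """Set a pixel to 1 if it exceeds the threshold computed in its local window."""
    out = []
    for r in range(len(image)):
        out_row = []
        for c in range(len(image[0])):
            mean, std = local_stats(image, r, c, half)
            out_row.append(1 if image[r][c] > threshold_fn(mean, std) else 0)
        out.append(out_row)
    return out


# Toy 3x3 image with one bright pixel (illustrative data only)
toy = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]
result = binarize(toy, niblack_threshold)
```

Both functions share the same local statistics and differ only in how the standard deviation shifts the threshold, which is why the later algorithms in the list are often described as variants of Niblack’s approach.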

2.1.3 Multiple Thresholding

The thresholding methods are called multiple, or multi-modal, when more than one T is used. The most common examples are listed below.

  • A method of Reddi et al. [44] can be considered as an iterative form of Otsu’s original method, which is faster and generalized to multi-level thresholding.

  • Another extension of the Otsu’s method to multi-level thresholding is referred to as the multi Otsu method of Liao et al. [34].

  • A method proposed by Sezan [51] consists in detection of peaks of the histogram using zero-crossings and image data quantization based on thresholds set between the peaks.

2.2 Edge-Based Segmentation

In an ideal scenario, regions are bounded by closed boundaries and by filling the boundaries we can obtain the regions (objects). This assumption was the foundation to develop the edge-based segmentation methods. They are based on detection of rapid changes (discontinuities) of an intensity value in an image.

The edge detection approaches (see a comparative survey of Bhardwaj and Mittal [6]) use one of two criteria, i.e. they locate the edges when:

  • the first derivative of the intensity is greater in magnitude than a given threshold. Using this method, the input image is convolved by a mask to generate a gradient image. The most popular edge detectors (filters) are based on Sobel, Prewitt, and Roberts operators;

  • the second derivative of the intensity has a zero crossing. This approach is based on smoothing the image and extracting the zero-crossing points, which indicate local maxima of the intensity gradient. A popular approach is based on a Laplacian of Gaussian (LoG) operator.

Unfortunately, these procedures rarely produce satisfactory results in the image segmentation problems. Noisy and poorly contrasted images badly affect edge detection, thus producing a closed contour is not a trivial task.
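
The first-derivative criterion can be sketched with the Sobel operator. The code below is a minimal illustration, assuming a grey-scale image stored as a list of rows; the function name `sobel_edges` and the threshold value are assumptions made for this example, and border pixels are simply left unmarked.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_edges(image, threshold):
    """Mark interior pixels whose Sobel gradient magnitude exceeds the threshold."""
    rows, cols = len(image), len(image[0])
    edges = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0
            for i in range(3):
                for j in range(3):
                    pixel = image[r + i - 1][c + j - 1]
                    gx += SOBEL_X[i][j] * pixel
                    gy += SOBEL_Y[i][j] * pixel
            if math.hypot(gx, gy) > threshold:
                edges[r][c] = 1
    return edges


# Vertical step edge between columns 2 and 3 (illustrative data only)
step = [[0, 0, 0, 9, 9, 9] for _ in range(5)]
edge_map = sobel_edges(step, threshold=10)
```

Even in this noise-free case the result is a band of edge pixels rather than a closed contour, which hints at the difficulty described above.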

There are also many other methods aimed at finding straight lines and other parametrized shapes in images. The original Hough transform [24] was developed for the detection of straight lines. This method was later generalized to the detection of analytically described shapes, such as circles [14], and to the detection of arbitrary shapes [1]. These methods, however, are not useful for the problem undertaken in this study, since the general assumption is a baseline-free approach, i.e. the damage to be detected has an unknown shape.

2.3 Region-Based Segmentation

The region-based segmentation methods are based on the assumption that pixels in neighbouring regions have similar characteristics, i.e. values of a colour and intensity. The two basic methods are listed below.

  • Region growing is a method in which an initial pixel (a seed) is selected and the region grows by merging the neighbouring pixels for as long as they meet the similarity criteria (colour, intensity value).

  • Region splitting and merging methods. The splitting operation iteratively divides an image into homogeneous regions, whereas merging joins adjacent similar regions. There are approaches using only one of these operations (e.g. the statistical region merging (SRM) algorithm of Nock and Nielsen [37], or the region splitting method of Ohlander et al. [38]), or both of them.
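
Region growing from a single seed can be sketched as a breadth-first search. This is a minimal illustration under two assumptions not taken from the paper: 4-connectivity is used, and the similarity criterion compares each candidate pixel with the seed intensity (other variants compare with the running region mean). The name `region_grow` and the tolerance value are hypothetical.

```python
from collections import deque


def region_grow(image, seed, tolerance):
    """Return the set of (row, col) pixels connected to the seed and similar to it."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Visit the 4-connected neighbours of the current pixel
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_value) <= tolerance):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


# Toy scan: a bright connected patch and one isolated bright pixel
scan = [
    [10, 10, 10, 10],
    [10, 80, 85, 10],
    [10, 82, 10, 10],
    [10, 10, 10, 90],
]
damage = region_grow(scan, seed=(1, 1), tolerance=10)
```

The isolated bright pixel in the corner satisfies the intensity criterion but is not reached, because the region grows only through neighbouring pixels.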

One should mention a group of region-based methods called watershed-based segmentation. The idea of the watershed transforms comes from geography, i.e. the gradient of image is considered as a topographic map. For instance, in one of such approaches, an image is treated as a map of a landscape or topographic relief flooded by water, where watersheds are the borders of the domains of attraction of rain falling over the region [46]. One of the first algorithms based on the watershed transform was proposed by Beucher and Lantuéjoul [3]. The application of appropriate morphological operations after the watershed transform enables obtaining the segmented image.

The main disadvantage of the region-based approaches is that they are computationally expensive in terms of both time and memory.

2.4 Clustering-Based Segmentation

Clustering is a multidimensional extension of the concept of thresholding. It is mainly used to divide data into groups of similar objects. Clustering can be classified as either hard or fuzzy, depending on whether a data pattern belongs exclusively to a single cluster or to several clusters with different membership values. Some clustering methods can readily be applied to image segmentation, and the most common of them are described below.

  • Hard clustering is a simple clustering technique dividing an image into a set of clusters, which is best applicable to data sets that have a significant difference (sharp boundaries) between groups. The most popular hard clustering algorithm is k-means clustering [23], which is also an unsupervised classification method. In this method, initial centroids of a given number k of clusters are computed, and each pixel is assigned to the nearest centroid. Then, the centroids of the clusters are recomputed by taking the mean of the pixel intensity values within each cluster, and the pixels are reassigned. This process is repeated iteratively until the centroids stabilize. The main disadvantage of this method is that k must be specified in advance. Moreover, it may lead to different results for each execution, depending on the computation of the initial cluster centroids.

  • Soft clustering is applicable to noisy data sets, where the difference between groups is not sharp. An example of such a method is a fuzzy c-means clustering, developed by Dunn [15] and later improved by Bezdek [4]. The algorithm steps in the fuzzy c-means clustering are very similar to the k-means clustering. The main difference in this method is that pixels are partitioned into clusters based on partial membership, i.e. one pixel can belong to more than one cluster and this degree of belonging is described by membership values.

  • Mean shift clustering (which first appeared in [19]) is another clustering-based method. It seeks modes, or local maxima of density, in the feature space. Mean shift defines a window around each data point and calculates the mean of the data points within this window. Then, it shifts the centre of the window to the mean and repeats this step until it converges. This method does not need prior knowledge of the number of clusters, but it needs a mean shift bandwidth parameter.

  • The Expectation Maximization (EM) algorithm [12] is used to estimate the parameters of the Gaussian Mixture Model (GMM) of an image. The method iteratively re-estimates the means and variances of each Gaussian distribution until the best-fitting values are found. The EM algorithm can be efficient when the analysed data is incomplete, e.g. when there are missing data points. However, the method is computationally expensive, and prior knowledge of the number of clusters is needed. Exemplary studies on segmentation using the GMM and EM algorithm are presented in [18].
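
The k-means iteration described in the list above can be sketched for grey-level values as follows. This is a minimal illustration, not the implementation used in the study. To keep the result reproducible, the initial centroids are spaced evenly over the intensity range, which is an assumption made here; random initialisation, as noted above, can give different results on each run. The name `kmeans_1d` is hypothetical.

```python
def kmeans_1d(values, k, max_iter=100):
    """Cluster scalar intensities into k groups; returns (centroids, labels)."""
    lo, hi = min(values), max(values)
    # Assumed deterministic initialisation: centroids evenly spaced over the range
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest centroid
        labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        # Update step: each centroid becomes the mean of its members
        new_centroids = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centroids.append(sum(members) / len(members) if members else centroids[j])
        if new_centroids == centroids:
            break  # centroids have stabilised
        centroids = new_centroids
    return centroids, labels


# Toy intensities forming three groups (illustrative data only)
pixels = [10, 12, 11, 200, 205, 198, 100, 102]
centroids, labels = kmeans_1d(pixels, k=3)
```

Changing `k` in this sketch changes the partition of the same data, which is the parameter dependence discussed for this family of methods.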

There are many other advanced clustering-based segmentation methods, e.g. the Normalized Graph Cut method [54] based on graph theory. In this approach, each pixel is a vertex in a graph and edges link adjacent pixels. Weights on the edges are assigned according to the similarity, colour or grey level, texture, or distance between the two corresponding pixels. These methods, however, are time-consuming and require the values of many parameters to be determined.

Fig. 1 Ultrasonic testing with the use of the MAUS® automated system

Fig. 2 The tested composite elements

2.5 Other Segmentation Methods

Some other methods that can also be used to segment images are briefly mentioned below. In most cases these are very advanced algorithms, but their strong dependence on parameters makes them non-universal with respect to the problem considered in this paper.

  • Texture-based segmentation approaches are useful when objects that are needed to be detected have a distinguishable texture. These approaches are often based on making use of texture measures, such as co-occurrence matrices or wavelet transforms. By applying the appropriate filters together with morphological operations, an object of a given texture can be identified in the image.

  • Template matching methods (see for instance [8]) are used when an object looking exactly like a template is expected to be found in images. In such a method, a template is compared to all regions in the analysed image and if the match between the template and the region is close enough, this region is labelled as the template object.

One should also mention the advanced soft-computing methods that are used for image segmentation. The most common groups of such methods are briefly introduced below.

  • Artificial neural network-based segmentation methods simulate the learning strategies of human brain for the purpose of decision making. A neural network is made of a large number of connected nodes and each connection has a particular weight. A well-known example of neural networks used for data clustering is a Kohonen self-organising map (SOM) [41].

  • Genetic algorithms are randomised search and optimization methods guided by the principles of evolution and natural genetics. A study concerning the application of image segmentation using the genetic algorithms was broadly presented in a book of Bhanu and Lee [5]. Besides the genetic algorithms, there are many other optimisation approaches that can provide similar results of image segmentation.

3 Experimental Data

The comparative analysis of image segmentation methods, presented in Sect. 4, was performed based on exemplary C-Scans acquired during UT of two real composite structures. The testing of both elements was performed using a MAUS® automated system of the Boeing® company (see Fig. 1), which is widely used for the inspections of aircraft structures. For this purpose, a 5 MHz single transducer was selected and a resolution of 0.01” (0.254 mm per pixel) was set.

The first tested element is a specimen (see its fragment in Fig. 2a) made of a CFRP composite with impact damage of the BVID type. The BVID was introduced artificially using a test rig for drop weight impact tests (described in [28]), i.e. the specimen was impacted with an energy of 20 J, using an impactor with a rounded tip (see the impactor E presented in Fig. 1 in [27]). As can be seen, there are only barely visible marks of impact damage on the specimen’s surface, in the middle of the image.

The second element is a fragment of an aircraft panel made of a CFRP composite, with delamination formed during the aircraft operation. A demonstrative fragment of this element is presented in Fig. 2b. The delaminated areas developed around a flap of an elliptic shape as well as in the area of openings for the rivets. This more complicated structure was intentionally chosen for this study since the performed analysis aims at selecting universal segmentation methods that are suitable for both simple and more difficult tested cases.

The obtained C-Scans of the impacted specimen and the aircraft panel are presented in Fig. 3a, c, respectively. The ground-truth binary images of the C-Scans are presented in Fig. 3b, d, where the thresholds were carefully selected manually to extract all potential damage regions. The thresholds were adjusted empirically until the desired areas were obtained, as is done in practice by operators during ultrasonic inspections. These ground-truth images are the reference data to which the results obtained in this study will be compared.
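
Given a binary image of this kind, the extent of damage can be estimated from the pixel count and the scan resolution. The sketch below assumes that the extent is expressed as an area and that every damaged pixel covers a square of 0.254 mm × 0.254 mm, following the resolution stated above; the function name `damage_area_mm2` and the toy mask are hypothetical.

```python
PIXEL_SIZE_MM = 0.254  # scan resolution of 0.01 inch per pixel


def damage_area_mm2(binary_image, pixel_size_mm=PIXEL_SIZE_MM):
    """Area covered by pixels equal to 1, assuming square pixels."""
    damaged_pixels = sum(sum(row) for row in binary_image)
    return damaged_pixels * pixel_size_mm ** 2


# Toy binary mask with four damaged pixels (illustrative data only)
mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
area = damage_area_mm2(mask)
```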

Fig. 3 C-Scans of the tested elements (a, c) and corresponding binary images (b, d)

Additionally, a verification of potentially the best segmentation methods (selected based on the results presented in Sect. 4) was performed with the use of 5 other C-Scans of composite aircraft panels, presented in Sect. 5.

4 Comparative Analysis of Image Segmentation Algorithms

For a segmentation method to be considered a candidate for the analysis of ultrasonic C-Scans during inspections of composite materials, it should not require many input parameters to be set, i.e. it should be universal. Based on the performed review of image segmentation methods, and taking into consideration their advantages, disadvantages and limitations, appropriate methods were selected for the comparative analysis. The obtained C-Scans were processed using the Matlab® environment.

Two main criteria should be taken into consideration when analysing the accuracy of damage detection in ultrasonic C-Scans: the segmentation accuracy and the number of resulting classes. The accuracy was calculated as the correlation between the resulting binary image and the reference ground-truth image, given in the range of 0–1. The number of classes obtained as a result of image segmentation is also very important, since, for instance, achieving high accuracy by detecting a large number of small segments that together perfectly cover the area of the ground-truth object is not desirable.
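
The two criteria can be sketched as follows. The paper does not state which correlation measure was used, so the Pearson correlation coefficient between the flattened images is an assumption made for this illustration; it can in principle be negative, and how such cases map onto the reported 0–1 range is not specified in the text. The function names are hypothetical.

```python
import math


def correlation(image_a, image_b):
    """Pearson correlation coefficient between two images of equal size."""
    a = [v for row in image_a for v in row]
    b = [v for row in image_b for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # undefined for a constant image; 0 is a convention chosen here
    return cov / math.sqrt(var_a * var_b)


def count_classes(label_image):
    """Number of distinct labels present in a segmented image."""
    return len({v for row in label_image for v in row})


# Toy ground truth and a segmentation result that misses one pixel
truth = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
found = [[0, 1, 1], [0, 1, 0], [0, 0, 0]]
accuracy = correlation(found, truth)
```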

4.1 Analysis of Bi-modal Threshold-Based Segmentation Methods

Firstly, the threshold-based bi-modal segmentation methods were tested, namely the Bernsen’s, Bradley’s, Feng’s, Niblack’s, Nick’s, Otsu’s, Sauvola’s, Triangle, Wolf’s, as well as the Mean and Median thresholding algorithms. The exemplary results obtained with selected values of the algorithms’ parameters are presented in Figs. 4 and 5, respectively, for the C-Scan of the impacted specimen and the aircraft panel.

Fig. 4 Exemplary results of image thresholding (C-Scan of the specimen)

Fig. 5 Exemplary results of image thresholding (C-Scan of the aircraft panel)

From these results one can notice that bi-modal segmentation did not bring the expected results in any case. Such approaches are not appropriate for UT applications, since C-Scans should not be regarded as having only two classes, i.e. the damaged and undamaged regions. It is especially visible in the case of the aircraft panel that, besides damage and the healthy structure, there are also other regions, such as stiffeners, openings, or simply noise, which should be segmented separately. This fact entirely eliminates the bi-modal segmentation methods from further consideration.

4.2 Analysis of Edge-Based Segmentation Methods

The second group of tests was aimed at edge-based segmentation algorithms, where the Canny, Prewitt, Roberts, Sobel, and LoG detectors were used. In Fig. 6, the exemplary results of edge detection algorithms are presented.

Fig. 6 Exemplary results of edge detection (a, b C-Scan of the specimen, c, d C-Scan of the aircraft panel)

It can be noticed that the assumption of edge-based segmentation methods, namely that regions are bounded by closed boundaries, is not met. In many cases, the produced edges do not form closed contours, and a significant part of these edges does not represent contours of damage only. With this approach, as with bi-modal thresholding, it is difficult to extract the damage regions alone (i.e. to clearly separate them from other elements or noise); the edge detection methods are therefore regarded as unsuitable for the considered problem.

The above-mentioned observations lead to the conclusion that multi-modal segmentation approaches need to be applied, i.e. methods that yield more than two classes.

4.3 Analysis of Clustering-Based Segmentation Methods

Further tests were performed with the use of several methods, where the number of classes k must be provided as an input, namely the k-means, fuzzy c-means, multilevel Otsu, and GMM with EM clustering method. The experiment was performed for variable values of k in the range of 2–10 with a step of 1 in order to observe its influence on the segmentation accuracy. The exemplary segmentation results are presented in Fig. 7, for the impacted specimen, and in Fig. 8, for the aircraft panel.

Fig. 7 Exemplary results of image segmentation (C-Scan of the specimen) with different numbers of classes (given in brackets)

Fig. 8 Exemplary results of image segmentation (C-Scan of the aircraft panel) with different numbers of classes (given in brackets)

A quantitative analysis of the segmentation results was performed and its summary is presented in Fig. 9. The segmentation accuracy was calculated as the correlation between the ground-truth image and the binary image obtained by selecting one or more classes of the segmented image as damage.

Fig. 9 Segmentation accuracy obtained with the use of methods: k-means, c-means, multilevel Otsu thresholding and GMM-EM clustering

Fig. 10 Exemplary results obtained using the Mean Shift clustering algorithm (a, b C-Scan of the impacted specimen, c, d C-Scan of the aircraft panel)

In the case of the impacted specimen, the segmentation accuracy obtained using k-means clustering and multilevel Otsu thresholding is in most cases very high, whereas for c-means clustering and GMM-EM clustering it varies with k. However, when more complicated data are analysed, i.e. the C-Scan of the aircraft panel, the segmentation accuracy varies considerably for all the tested segmentation methods. These observations show that the selection of the number of classes k strongly affects the segmentation accuracy. Although the accuracy is in many cases very high, the need to select k makes these methods non-universal and inappropriate for NDT applications.

The next tested method is the Mean Shift clustering algorithm. As mentioned earlier, the advantage of this approach is that the number of classes does not need to be selected; however, a mean shift bandwidth parameter has to be provided. The experiments and quantitative analysis were performed for bandwidths in the range of 0.1–0.9 with a step of 0.1. The exemplary segmentation results are presented in Fig. 10, and the summary of the quantitative analysis of the segmentation accuracy in Fig. 11. The latter shows both the calculated correlation and the resulting number of classes for each selected bandwidth. It can be observed that, although the segmentation accuracy is in many cases very high, it depends strongly on the selected bandwidth, and there are several cases, especially for the aircraft panel, where the accuracy is not satisfactory.
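
The dependence on the bandwidth can be illustrated with a one-dimensional mean shift sketch. It is a simplified illustration, not the implementation used in the study: a flat kernel is applied to intensities assumed to be normalised to the range 0–1, and converged window centres closer than half the bandwidth are merged into one cluster, which is a rule chosen here for simplicity. The name `mean_shift_1d` is hypothetical.

```python
def mean_shift_1d(values, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift on scalars; returns (cluster_centres, labels)."""
    modes = []
    for start in values:
        centre = start
        for _ in range(max_iter):
            window = [v for v in values if abs(v - centre) <= bandwidth]
            if not window:
                break  # guard against rounding leaving the window empty
            new_centre = sum(window) / len(window)
            if abs(new_centre - centre) < tol:
                break  # the window no longer moves
            centre = new_centre
        modes.append(centre)

    # Assumed merging rule: centres closer than bandwidth / 2 form one cluster
    cluster_centres, labels = [], []
    for m in modes:
        for idx, c in enumerate(cluster_centres):
            if abs(m - c) < bandwidth / 2:
                labels.append(idx)
                break
        else:
            cluster_centres.append(m)
            labels.append(len(cluster_centres) - 1)
    return cluster_centres, labels


# Toy normalised intensities forming two groups (illustrative data only)
samples = [0.10, 0.12, 0.11, 0.80, 0.82, 0.81]
centres_narrow, labels_narrow = mean_shift_1d(samples, bandwidth=0.2)
centres_wide, labels_wide = mean_shift_1d(samples, bandwidth=0.9)
```

With the narrow bandwidth the two groups are separated, whereas the wide bandwidth merges all samples into a single class; the number of classes is an output of the method but is still governed by the bandwidth.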

Fig. 11 Segmentation accuracy obtained using the Mean Shift clustering algorithm

4.4 Analysis of Region-Based Segmentation Methods

The next tested group of methods is region-based segmentation. In Fig. 12, the exemplary results after the application of the SRM method are presented. Here, the number of classes is an output parameter; however, other input data, such as filtering parameters and the size of the smallest allowed region, have to be defined. As can be seen in Fig. 12, for a small number of classes the damaged region is entirely filled or merged with other neighbouring regions. For larger numbers of classes the regions seem to be better segmented; however, this is not confirmed by the results of the quantitative analysis presented in Fig. 13. The segmentation accuracy is far from satisfactory in all the cases, especially for the C-Scan of the aircraft panel.

Fig. 12 Exemplary results obtained using the SRM algorithm for the C-Scan of the impacted specimen (a–d) and the aircraft panel (e–h)

Fig. 13 Segmentation accuracy obtained using the SRM method

4.5 Analysis of Other Histogram-Based Segmentation Approaches

Since all of the tested methods described above have some disadvantages, mainly the need to select input parameters and the resulting lack of universality, the authors decided to test several non-parametric histogram-based approaches based on algorithms of their own design.

The first approach is a multilevel segmentation using a Minima-Between-Peaks (MBP) criterion. It consists in selecting all the local minima of the image histogram as the thresholds. Optionally, to reduce the number of resulting classes, intensity values represented by only single pixels can be excluded from the histogram, i.e. frequencies lower than a given value are set to zero. The histograms of the tested C-Scans are presented in Figs. 14a and 15a for the impacted specimen and the aircraft panel, respectively. The dotted lines show the locations of the obtained thresholds. The images resulting from segmentation using this approach are presented in Figs. 14b and 15b, where the resulting numbers of classes are also given.
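The MBP criterion described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, pixel values are assumed to be integer grey levels, and the treatment of flat valley floors (the centre bin of the plateau is taken as the threshold) is an assumption, since such plateaus appear as soon as low frequencies are set to zero.

```python
from bisect import bisect_right


def histogram(image, levels=256):
    """Count occurrences of each integer grey level in a 2-D list of pixels."""
    hist = [0] * levels
    for row in image:
        for px in row:
            hist[px] += 1
    return hist


def mbp_thresholds(hist, min_count=0):
    """Bin indices of all local minima of the histogram (MBP thresholds).

    Frequencies below `min_count` are set to zero first (optional step).
    Assumption: a flat valley floor yields one threshold at its centre bin.
    """
    h = [c if c >= min_count else 0 for c in hist]
    thresholds, i, n = [], 1, len(h)
    while i < n - 1:
        j = i
        while j + 1 < n and h[j + 1] == h[i]:  # walk across a plateau
            j += 1
        # A minimum: lower than the left neighbour and lower than the bin
        # that follows the plateau on the right
        if h[i] < h[i - 1] and j + 1 < n and h[j + 1] > h[i]:
            thresholds.append((i + j) // 2)
        i = j + 1
    return thresholds


def segment(image, thresholds):
    """Replace each pixel by its class index; a pixel equal to a threshold
    is assigned to the upper class."""
    return [[bisect_right(thresholds, px) for px in row] for row in image]
```

The number of classes is simply `len(thresholds) + 1`, so it follows from the histogram and is not supplied by the user.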

Fig. 14 Exemplary results obtained using the MBP approaches (for the specimen)

Fig. 15 Exemplary results obtained using the MBP approaches (for the aircraft panel)

The second approach is based on the MBP algorithm, with the difference that the histogram is smoothed before the minima detection step. Notably, eliminating noisy local maxima in the histogram produces a sharper distinction between the relevant segments. For smoothing, one-dimensional median filters of the 2nd, 3rd, and 4th order were tested. Exemplary results, i.e. the smoothed histograms with the threshold locations indicated and the corresponding segmented images, are presented in Fig. 14c, d for the impacted specimen and in Fig. 15c, d for the aircraft panel.
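The effect of histogram smoothing on the number of detected minima can be illustrated with the short sketch below. It rests on two assumptions that are not stated in the text: the filter order is read as the window length in bins, and the histogram is zero-padded at both ends. The helper `local_minima` is a hypothetical plateau-aware minima detector included only to make the example self-contained; it is needed because median filtering tends to produce flat valley floors.

```python
from statistics import median


def median_smooth(hist, order=3):
    """1-D median filter with a window of `order` bins, zero-padded at the ends."""
    before = order // 2          # bins taken to the left of the current bin
    after = order - 1 - before   # bins taken to the right of the current bin
    padded = [0] * before + list(hist) + [0] * after
    return [median(padded[i:i + order]) for i in range(len(hist))]


def local_minima(h):
    """Centre bins of local minima; a flat valley floor counts as one minimum."""
    out, i, n = [], 1, len(h)
    while i < n - 1:
        j = i
        while j + 1 < n and h[j + 1] == h[i]:
            j += 1
        if h[i] < h[i - 1] and j + 1 < n and h[j + 1] > h[i]:
            out.append((i + j) // 2)
        i = j + 1
    return out


if __name__ == "__main__":
    # Hypothetical noisy histogram with one genuine valley around bins 4-5
    noisy = [1, 8, 7, 9, 2, 1, 3, 9, 8, 10, 1]
    print(local_minima(noisy))                    # minima before smoothing
    print(local_minima(median_smooth(noisy, 3)))  # minima after smoothing
```

In this toy histogram the two shallow dips caused by noise disappear after smoothing and only the genuine valley remains, which corresponds to the reduction in the number of classes reported for the filtered variants.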

Filtering of the input C-Scans before processing their histograms was also tested. Various types of two-dimensional filters, such as averaging and Gaussian low-pass filters, were tested with a range of filter sizes. However, this approach did not bring satisfactory results, since the segmented images contain too many classes (for the C-Scan of the aircraft panel, in the range of 37–54). Interestingly, smoothing the histogram appears to be more effective than filtering the image itself.

The last approach is based on the MBP algorithm and a probability distribution model criterion, and is called here Gaussian-MBP. In this approach, the histogram is first segmented into levels using the basic MBP algorithm, and then a one-term Gaussian model is fitted to each histogram level individually. With appropriate normalisation, the fitted Gaussian density functions provide an immediate probabilistic interpretation of the assignment of a pixel to one region or another. The intersection points of the obtained Gaussian models are the new threshold values. The resulting Gaussian models of the histogram (based on the C-Scan of the aircraft panel) are presented in Fig. 16a, and a zoomed fragment, together with the original histogram and the thresholds detected using both methods (MBP and Gaussian-MBP), in Fig. 16b.
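The Gaussian-MBP steps can be sketched as below. Several details are assumptions made for illustration and are not taken from the study: each level is fitted by its weighted mean and standard deviation rather than by least squares, a floor of 0.5 bins is imposed on the standard deviation, only neighbouring models are intersected, and the original MBP threshold is kept whenever no intersection lies between the two means. Equating two models $a_1 e^{-(x-\mu_1)^2/2\sigma_1^2}$ and $a_2 e^{-(x-\mu_2)^2/2\sigma_2^2}$ and taking logarithms gives a quadratic $Ax^2 + Bx + C = 0$, which the code solves.

```python
import math


def fit_gaussian(hist, lo, hi):
    """Moment-based one-term Gaussian for bins lo..hi-1.

    Returns (amplitude, mean, sigma), or None for an empty level.
    """
    total = sum(hist[lo:hi])
    if total == 0:
        return None
    mean = sum(x * hist[x] for x in range(lo, hi)) / total
    var = sum(hist[x] * (x - mean) ** 2 for x in range(lo, hi)) / total
    sigma = max(math.sqrt(var), 0.5)  # assumed floor for single-bin levels
    amp = total / (sigma * math.sqrt(2 * math.pi))
    return amp, mean, sigma


def intersection(g1, g2):
    """Point between the two means where both Gaussians are equal, or None."""
    (a1, m1, s1), (a2, m2, s2) = g1, g2
    A = 1 / (2 * s2 ** 2) - 1 / (2 * s1 ** 2)
    B = m1 / s1 ** 2 - m2 / s2 ** 2
    C = m2 ** 2 / (2 * s2 ** 2) - m1 ** 2 / (2 * s1 ** 2) + math.log(a1 / a2)
    if abs(A) < 1e-12:               # equal widths: the equation is linear
        roots = [-C / B] if B != 0 else []
    else:
        disc = B * B - 4 * A * C
        if disc < 0:
            return None
        r = math.sqrt(disc)
        roots = [(-B - r) / (2 * A), (-B + r) / (2 * A)]
    inside = [x for x in roots if m1 < x < m2]
    return inside[0] if inside else None


def gaussian_mbp(hist, mbp_thresholds):
    """Refine MBP thresholds using intersections of neighbouring Gaussians."""
    edges = [0] + list(mbp_thresholds) + [len(hist)]
    models = [fit_gaussian(hist, lo, hi) for lo, hi in zip(edges, edges[1:])]
    refined = []
    for t, g1, g2 in zip(mbp_thresholds, models, models[1:]):
        x = intersection(g1, g2) if g1 and g2 else None
        refined.append(t if x is None else x)
    return refined
```

For two levels of equal width and equal pixel count the refined threshold lies midway between the means; when one level contains more pixels, the threshold moves towards the smaller level, which is the probabilistic behaviour described in the text.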

Fig. 16 Exemplary results of the Gaussian-MBP algorithm (for the aircraft panel)

Fig. 17 Segmentation accuracy obtained using the MBP and the Gaussian-MBP algorithms

The segmentation accuracy obtained using the three approaches described above, together with the resulting number of classes, is summarised in Fig. 17. These results show that these very simple, non-parametric approaches achieve very high accuracy. The MBP approach without any filtering brought perfect results for both tested C-Scans. Median filtering of the histogram reduces the number of classes as the filter order increases, although with a small loss of accuracy in certain cases. The Gaussian-MBP algorithm returned exact results for the C-Scan of the impacted specimen, whereas for the more complicated C-Scan the segmentation accuracy decreases slightly.

Fig. 18 Results obtained for the MBP algorithm without filtering (from the left to the right): original C-Scans, segmented C-Scans and binary results

The main advantage of the proposed approaches over the other segmentation methods tested in this study is that they do not require any input parameters, such as the number of classes k, to be set. The resulting k value is selected automatically, and the algorithms are universal for both the simple and the more complicated tested cases.

5 Verification of the Proposed Histogram-Based Segmentation Approaches

In order to verify the potentially best methods, namely the MBP algorithm without filtering and with median filtering of the histogram as well as the Gaussian-MBP algorithm, they were additionally tested on 5 other C-Scans of composite aircraft panels, presented in Fig. 18. For these new test cases, similarly to the previous research steps, ground-truth binary images were prepared, and the correlation as well as the resulting number of classes were calculated and summarised in Table 1.
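The accuracy measure can be illustrated by the sketch below, which assumes that the correlation is the two-dimensional Pearson correlation coefficient between the resulting binary image and the ground-truth binary image. The function name and the list-of-lists image representation are illustrative choices.

```python
import math


def correlation(result, truth):
    """2-D Pearson correlation coefficient between two equally sized images."""
    a = [px for row in result for px in row]
    b = [px for row in truth for px in row]
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    if den == 0:
        raise ValueError("correlation is undefined for a constant image")
    return num / den
```

Under this definition a value of 1 means that the extracted damage region coincides with the ground truth pixel for pixel, and every mismatched pixel lowers the value.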

Table 1 Segmentation accuracy obtained using histogram-based approaches for the additional test cases

The analysis of these results indicated that the MBP algorithm without filtering brought the best results, with a correlation of 1 for all the test cases. For the remaining approaches there is, in most cases, a very small loss of accuracy (a correlation decrease of 0.003 on average) that corresponds to discrepancies of single pixels. Therefore, it can be concluded that the best method selected from the experiments presented in this paper is the MBP algorithm without filtering, which allowed for the data reduction by approx. 80%.

The results of image segmentation using the MBP algorithm without filtering as well as the resulting binary images obtained for the 5 test cases are presented in Fig. 18.

6 Conclusions

The presented study was aimed at performing a comparative analysis of various types of image segmentation methods with respect to the accuracy of damage detection in ultrasonic C-Scans of composite structures. A brief review of image segmentation approaches with short descriptions was given. The vast majority of surveys and comparative analyses found in the literature concern the segmentation of photographs and documents, and the processing of ultrasonic images is mainly addressed in connection with medical imaging. Due to the lack of comprehensive analyses of the segmentation of ultrasonic images in industrial applications, the authors presented the results of an analysis of the accuracy of damage extraction in C-Scans of CFRP structures with different levels of complexity. The accuracy was determined based on the correlation between the resulting binary images and the ground-truth images, and the resulting number of classes in the segmented images was also taken into consideration. Several threshold-, edge-, clustering- and region-based methods, as well as the proposed non-parametric histogram-based segmentation approaches, were tested, and a quantitative analysis of the results was presented.

The presented findings reveal several problems with many of the methods, mainly related to the need to select input parameters or to the computation time. The obtained results lead to the conclusion that simple, very fast and non-parametric histogram-based methods are the most suitable for the analysis of ultrasonic scans of composite materials. Additional verification of the histogram-based segmentation methods on more test cases indicated that the MBP algorithm without filtering brought the best results in all the test cases (the correlation equals 1) with the data reduction by approx. 80%.

The selected histogram-based segmentation method is universal and can be applied to C-Scans of composite elements of any type of material, thickness, or other geometrical properties. This versatility results from the fact that the method sections the ultrasonic scan into groups of colours (values), i.e. individual areas lying at different depths of the tested object, according to the level of colour similarity, and thus does not depend on the numerical values themselves. Therefore, the method can be applied not only to C-Scan processing procedures but also in other applications related to image processing.