1 Introduction

Digital cameras and scanners have become inexpensive and ubiquitous, and with them comes an ever-growing volume of images. These data are of little use unless they are organized to allow efficient browsing, searching, and retrieval; hence the need for techniques such as content-based image retrieval (CBIR). Feature extraction is a prominent step, and the effectiveness of a CBIR system depends largely on how features are extracted from raw images. CBIR uses the visual content of an image, such as color, texture, shape, faces, and spatial layout, to represent and index it. Visual features can further be classified into general features, which include color, texture, and shape, and domain-specific features such as human faces and fingerprints. No single representation of an image is best for all perceptual subjectivity, because photographs may be taken under different conditions (viewing angle, illumination changes, etc.). Comprehensive and extensive literature surveys on CBIR are presented in [1, 2, 3, 4].

Texture analysis has attracted a great deal of attention due to its potential value for computer vision and pattern recognition applications. It is particularly well suited to the identification of products such as ceramic tiles, marble, and parquet slabs, and has therefore drawn considerable research interest. Ahmadian et al. used the wavelet transform for texture classification [5]. Discrete wavelet transform (DWT) based texture image retrieval using the generalized Gaussian density and the Kullback–Leibler distance is presented in [6], and texture classification and segmentation using wavelet frames in [7]. However, the DWT can extract features in only three directions (horizontal, vertical, and diagonal). Hence, the Gabor transform (GT) [8], rotated wavelet filters [9], and combinations of dual-tree complex wavelet filters (DT-CWF) and dual-tree rotated complex wavelet filters (DT-RCWF) [10] have been proposed in the literature to extract the directional features that are absent in the DWT. Manjunath et al. [8] applied the GT to image retrieval on the Brodatz texture database, using mean and standard deviation features from four scales and six orientations. Texture image retrieval that characterizes an image along different directions has been achieved with rotated wavelet filters [9], and with the combination of DT-CWF and DT-RCWF, including a rotationally invariant DT-RCWF, proposed by Kokare et al. [9, 10, 11], respectively.

A concise review of the literature most relevant to the development of our algorithm is given here. Local binary pattern (LBP) features were designed for texture description. Ojala et al. [12] proposed the LBP, which was later made rotationally invariant for texture classification [13]. Pietikainen et al. proposed rotationally invariant texture classification using feature distributions [14]. Ahonen et al. [15] and Zhao and Pietikainen [16] used the LBP operator for facial expression analysis and recognition. Heikkila et al. proposed background modeling and detection using the LBP [17]. Huang et al. [18] proposed an extended LBP for shape localization, and Heikkila et al. [19] used the LBP for interest region description. Li and Staunton [20] combined Gabor filters and the LBP for texture segmentation. Zhang et al. [21] proposed local derivative patterns (LDP) for face recognition, treating the LBP as a non-directional first-order local pattern collected from the first-order derivatives of an image. A block-based texture feature that uses the LBP as the source of image description is proposed in [22] for CBIR. The center-symmetric local binary pattern (CS-LBP), a modified version of the well-known LBP, is combined with the scale-invariant feature transform (SIFT) in [23] for the description of interest regions. Yao et al. [24] proposed two types of local edge pattern (LEP) histograms: LEPSEG for image segmentation and LEPINV for image retrieval. LEPSEG is sensitive to variations in rotation and scale, whereas LEPINV is resistant to them.

Directional features have already been shown to be very valuable for image retrieval applications [9, 10, 11], yet the LBP extensions discussed above are non-directional. To address this problem, in this paper we propose directional local extrema patterns (DLEP) for image retrieval; the main contributions of this work are given in the next subsection.

1.1 Main contributions

The main contributions of this work are summarized as follows:

  1. The DLEP is proposed in 0\(^{\circ }\), 45\(^{\circ }\), 90\(^{\circ }\), and 135\(^{\circ }\) directions, in contrast to the LBP. The DLEP differs from the existing LBP in that it extracts directional edge information based on local extrema.

  2. The performance of the proposed method is tested on benchmark image databases.

The paper is organized as follows: In Sect. 1, a brief review of CBIR and related work is given. Section 2 presents a concise review of local pattern operators. The proposed system framework and query matching are illustrated in Sect. 3. Experimental results and discussions are given in Sect. 4. Conclusions and future scope are given in Sect. 5.

2 Local patterns

2.1 Local binary patterns (LBP)

The LBP operator was introduced by Ojala et al. [12] for texture classification. Success in terms of speed (no parameters to tune) and performance has been reported in many research areas, such as texture classification [12, 13, 14], face recognition [15, 16], object tracking, biomedical image retrieval, and fingerprint recognition. Given a center pixel in a 3\(\times \)3 pattern, the LBP value is computed by comparing its gray value with those of its neighbors, using Eqs. (1) and (2):

$$\begin{aligned}&\text{ LBP}_{P,R} =\sum \limits _{p=1}^P {2^{(p-1)}\times f_1 (I(g_p )-I(g_c))}\end{aligned}$$
(1)
$$\begin{aligned}&f_1 (x)=\left\{ {\begin{array}{l@{\quad }l} 1&x\ge 0 \\ 0&\text{ else} \\ \end{array}} \right. \end{aligned}$$
(2)

where \(I(g_c )\) denotes the gray value of the center pixel, \(I(g_p)\) represents the gray value of its neighbors, \(P\) stands for the number of neighbors, and \(R\) the radius of the neighborhood.

After computing the LBP pattern for each pixel \((j, k)\), the whole image is represented by building a histogram as shown in Eq. (3).

$$\begin{aligned} H_\mathrm{ LBP} (l)&= \sum \limits _{j=1}^{N_1 } {\sum \limits _{k=1}^{N_2 } {f_2 (} } \text{LBP}(j,k),l);\; l\in [0,(2^P-1)]\nonumber \\ \end{aligned}$$
(3)
$$\begin{aligned} f_2 (x,y)&= \left\{ {\begin{array}{l@{\quad }l} 1&x=y \\ 0&\text{ else} \\ \end{array}} \right. \end{aligned}$$
(4)

where the size of input image is \(N_1 \times N_2 \).

Figure 1 shows an example of obtaining an LBP from a given \(3\times 3\) pattern. The histograms of these patterns contain the information on the distribution of edges in an image.
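As an illustration, Eqs. (1) and (2) can be sketched for a single 3\(\times \)3 neighborhood (\(P=8\), \(R=1\)). The function name, the neighbor ordering, and the sample values below are our own and not taken from the paper:

```python
def lbp_3x3(patch):
    """Return the LBP code of the centre of a 3x3 list of grey values."""
    center = patch[1][1]
    # Neighbours g_1..g_8 taken in a fixed circular order (assumed layout).
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2),
               (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for p, (r, c) in enumerate(offsets):
        if patch[r][c] - center >= 0:   # f1(x) = 1 when x >= 0, Eq. (2)
            code += 2 ** p              # weight 2^(p-1) for p starting at 1
    return code

patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
print(lbp_3x3(patch))  # a code in [0, 255]
```

A flat patch (all pixels equal) yields all-one bits, i.e. the code 255, since every difference is zero and \(f_1(0)=1\).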

Fig. 1
figure 1

Calculation of LBP

Fig. 2
figure 2

Example of obtaining DLEP for the \(3\times 3\) pattern

2.2 Center-symmetric local binary patterns (CS_LBP)

Instead of comparing each pixel with the center pixel, Heikkila et al. [23] have compared center-symmetric pairs of pixels for CS_LBP as shown in Eq. (5):

$$\begin{aligned} {\rm CS\_LBP}_{P,R} =\sum \limits _{p=1}^{P/2} {2^{(p-1)}\times f_1 (I(g_p )-I(g_{p+(P/2)} ))} \end{aligned}$$
(5)

After computing the CS_LBP pattern for each pixel \((j, k)\), the whole image is represented by building a histogram, similar to the LBP.
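A minimal sketch of the center-symmetric comparison of Eq. (5) follows, assuming eight neighbors indexed in a fixed circular order; the ordering and names are illustrative, not taken from [23]:

```python
def cs_lbp_3x3(patch):
    """CS_LBP code of the centre of a 3x3 patch (P = 8, R = 1)."""
    # Neighbours g_1..g_8 in a circular order (assumed layout), so that
    # g_p and g_{p+P/2} are centre-symmetric pairs.
    ring = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
            patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    code = 0
    for p in range(4):                  # P/2 = 4 centre-symmetric pairs
        if ring[p] - ring[p + 4] >= 0:  # f1 applied to the pair difference
            code += 2 ** p
    return code
```

Note that only \(P/2=4\) comparisons are made, so the CS_LBP code lies in \([0, 15]\), a much shorter descriptor than the 256-valued LBP.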

2.3 Directional local extrema patterns (DLEP)

The idea of the LBP proposed in [12] has been adopted to define directional local extrema patterns (DLEP). The DLEP describes the spatial structure of the local texture using the local extrema at the center pixel \(g_c \).

Fig. 3
figure 3

Example to obtain DLEP pattern in 0\(^{\circ }\) direction

In the proposed DLEP, for a given image, the local extrema in 0\(^{\circ }\), 45\(^{\circ }\), 90\(^{\circ }\), and 135\(^{\circ }\) directions are obtained by computing the local differences between the center pixel and its neighbors, as shown below:

$$\begin{aligned} I^{\prime }(g_i )=I(g_c )-I(g_i ); \quad i=1,2,\ldots ,8 \end{aligned}$$
(6)

The local extrema are obtained by Eq. (7).

$$\begin{aligned}&\hat{I}_\alpha (g_c )=f_3 (I^{\prime }(g_j ),\;I^{\prime }(g_{j+4} ));\quad j=(1+\alpha /45); \nonumber \\&\qquad \forall \alpha =0^{\circ }, 45^{\circ },90^{\circ },135^{\circ } \end{aligned}$$
(7)
$$\begin{aligned}&f_3 (I^{\prime }(g_j ),\quad I^{\prime }(g_{j+4} ))=\left\{ {\begin{array}{l@{\quad }l} 1&I^{\prime }(g_j )\times I^{\prime }(g_{j+4} )\ge 0 \\ 0&\text{ else} \\ \end{array}} \right. \end{aligned}$$
(8)

The DLEP is defined (\(\alpha =0^{\circ }, 45^{\circ }, 90^{\circ }\), and \(135^{\circ })\) as follows:

$$\begin{aligned} \left. {\text{ DLEP}(I(g_c ))} \right|_\alpha =\left\{ {\hat{I}_\alpha (g_c );\,\hat{I}_\alpha (g_1 );\,\hat{I}_\alpha (g_2 );\ldots \hat{I}_\alpha (g_8 )} \right\} \nonumber \\ \end{aligned}$$
(9)

The detailed representation of DLEP can be seen in Fig. 2.

In this way, the given image is converted to four DLEP images with values ranging from 0 to 511.

Fig. 4
figure 4

Example of LBP and DLEP feature maps: a sample image, b LBP feature map, c DLEP feature map in 0\(^{\circ }\) direction, d DLEP feature map in 45\(^{\circ }\) direction, e DLEP feature map in 90\(^{\circ }\) direction, and f DLEP feature map in 135\(^{\circ }\) direction

Fig. 5
figure 5

Proposed image retrieval system framework

Fig. 6
figure 6

Some sample images from database Corel-1K (one image per category)

After calculation of the DLEP, the whole image is represented by building a histogram using Eq. (10):

$$\begin{aligned} H_{\left. \mathrm{ DLEP} \right|_\alpha } (l)\!=\!\sum \limits _{j=1}^{N_1 } {\sum \limits _{k=1}^{N_2 } {f_2 (} } \left. {\text{DLEP}(j,k)} \right|_\alpha ,l);\quad \! l\!\in \! [0,\,511]\nonumber \\ \end{aligned}$$
(10)

where the size of input image is \(N_1 \times N_2 \).
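The histogram of Eq. (10) — and likewise Eq. (3) for the LBP — amounts to counting how often each pattern code occurs over the pattern map of the image. A minimal sketch (function and variable names are ours):

```python
def pattern_histogram(pattern_map, n_bins):
    """Histogram of an N1 x N2 map of pattern codes (Eq. 10).

    pattern_map: list of rows of integer codes, e.g. DLEP values in [0, 511].
    n_bins: number of possible codes (512 for DLEP, 256 for LBP).
    """
    hist = [0] * n_bins
    for row in pattern_map:
        for v in row:
            hist[v] += 1   # f2(DLEP(j,k), l) contributes 1 exactly when v == l
    return hist
```

For DLEP, one such 512-bin histogram is built per direction \(\alpha\).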

The DLEP computation for a center pixel (marked in red) is illustrated in Fig. 2. The local differences between the center pixel and its eight neighbors are used to evaluate the directions, as shown in Fig. 2. These directions are then used to obtain the DLEP patterns in 0\(^{\circ }\), 45\(^{\circ }\), 90\(^{\circ }\), and 135\(^{\circ }\) directions. The selected 3\(\times \)3 pattern for the DLEP calculation is represented with subscripts (0) to (8), as shown in Fig. 2.

An example of the DLEP computation in the 0\(^{\circ }\) direction for a center pixel (marked in red) is illustrated in Fig. 3. For the center pixel ‘6’, the local extremum is evaluated in the horizontal direction; both horizontal differences point away from the center pixel, so this bit is coded as ‘1’. The remaining bits of the DLEP are computed in the same way from the eight neighbors, giving the pattern ‘1 1 1 0 1 1 1 0 1’. The DLEP patterns for the center pixel in the 45\(^{\circ }\), 90\(^{\circ }\), and 135\(^{\circ }\) directions are computed in the same fashion.
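The 0\(^{\circ }\) case can be sketched as follows. A 5\(\times \)5 window is needed because the extrema bit of each neighbor of the center depends on that neighbor's own horizontal neighbors. The neighbor layout (\(g_1\) to the right of the pixel, \(g_5\) to its left, \(g_1 \ldots g_8\) anticlockwise) is our assumption; names are ours:

```python
# g_1..g_8 anticlockwise, so that g_j and g_{j+4} are opposite neighbours.
NEIGHBOURS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
              (0, -1), (1, -1), (1, 0), (1, 1)]

def extrema_bit_0deg(img, r, c):
    """f3 of Eq. (8) for alpha = 0: 1 iff the two horizontal differences
    I(g_c)-I(g_1) and I(g_c)-I(g_5) of Eq. (6) have the same sign."""
    d_right = img[r][c] - img[r][c + 1]
    d_left = img[r][c] - img[r][c - 1]
    return 1 if d_right * d_left >= 0 else 0

def dlep_0deg(img, r, c):
    """9-bit DLEP pattern of Eq. (9) at pixel (r, c): the centre's
    extrema bit followed by the bits of its eight neighbours."""
    bits = [extrema_bit_0deg(img, r, c)]
    for dr, dc in NEIGHBOURS:
        bits.append(extrema_bit_0deg(img, r + dr, c + dc))
    return bits
```

On a uniformly increasing ramp every horizontal difference pair has opposite signs, so all nine bits are 0; on a vertical ridge, the pixels sitting on the ridge crest produce bit 1.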

Table 1 Results of various techniques in terms of precision on Corel-1K database
Table 2 Results of various techniques in terms of recall on Corel-1K database
Fig. 7
figure 7

Comparison of proposed method with other existing methods in terms of a average precision and b average retrieval rate on Corel-1K database

The proposed DLEP differs from the well-known LBP: the DLEP encodes the spatial relation between a pair of neighbors in a local region along a given direction, whereas the LBP [12] encodes the relation between the center pixel and its neighbors. Therefore, the DLEP captures more spatial information than the LBP. Directional features have already been shown to be very valuable for image retrieval applications [9, 10, 11].

Figure 4 illustrates the results of applying the LBP and DLEP operators to a reference face image; a face image is chosen because the results make the differences between the approaches visually apparent. From Fig. 4, it can be observed that the DLEP yields more directional edge information than the LBP. The experimental results confirm that the proposed DLEP outperforms the LBP, indicating that it captures more edge information for texture extraction.

Table 3 Results of all methods in terms of precision and recall on Corel-5K and Corel-10K databases
Fig. 8
figure 8

Comparison of proposed method with other existing methods on Corel-5K. a Category-wise performance in terms of precision, b category-wise performance in terms of recall, c total database performance in terms of average precision, and d total database performance in terms of ARR

Fig. 9
figure 9

An example of image retrieval by proposed method (DLEP) on Corel-5K database

3 Proposed system framework

3.1 Proposed image retrieval system

Figure 5 depicts the flowchart of the proposed technique; the corresponding algorithm is presented here:

Algorithm:

Input: Image; Output: Retrieval result

  1. Load the gray-scale image.

  2. Calculate the local extrema in 0\(^{\circ }\), 45\(^{\circ }\), 90\(^{\circ }\), and 135\(^{\circ }\) directions.

  3. Compute the DLEP patterns in 0\(^{\circ }\), 45\(^{\circ }\), 90\(^{\circ }\), and 135\(^{\circ }\) directions.

  4. Construct the histograms for the DLEP patterns in 0\(^{\circ }\), 45\(^{\circ }\), 90\(^{\circ }\), and 135\(^{\circ }\) directions.

  5. Construct the feature vector by concatenating all histograms.

  6. Compare the query image with the images in the database using Eq. (11).

  7. Retrieve the images based on the best matches.
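Step 5 above can be sketched as a simple concatenation of the four directional histograms; with 512 bins per direction (DLEP values range over 0–511), the resulting feature vector has \(4 \times 512 = 2048\) dimensions. The function name is ours:

```python
def dlep_feature_vector(direction_histograms):
    """Concatenate the 0, 45, 90, and 135 degree DLEP histograms
    (512 bins each) into a single feature vector."""
    feat = []
    for hist in direction_histograms:
        feat.extend(hist)
    return feat
```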

3.2 Query matching

The feature vector of the query image \(Q\), obtained after feature extraction, is represented as \(f_Q =(f_{Q_1 } ,f_{Q_2 } ,\ldots ,f_{Q_{Lg} } )\). Similarly, each image in the database is represented by a feature vector \(f_{\mathrm{ DB}_j } =(f_{\mathrm{ DB}_{j1} } ,f_{\mathrm{ DB}_{j2} } ,\ldots ,f_{\mathrm{ DB}_{jLg} } );\,j=1,2, \ldots ,\left|\text{DB} \right|\). The goal is to select the \(n\) database images that best resemble the query image. This involves selecting the \(n\) top matches by measuring the distance between the query image and each image in the database \(\left|\text{DB} \right|\). To match the images, we use the \(d_{1}\) similarity distance metric computed by Eq. (11):

$$\begin{aligned} D(Q,\text{ DB})=\sum \limits _{i=1}^{Lg} {\left| {\frac{f_{\mathrm{ DB}_{ji} } -f_{Q_i } }{1+f_{\mathrm{ DB}_{ji} } +f_{Q_i } }} \right|} \end{aligned}$$
(11)

where \(f_{\mathrm{ DB}_{ji} } \) is the \(i\)th feature of the \(j\)th image in the database \(\left|{\text{ DB}} \right|\).
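Eq. (11) can be sketched directly; names are ours:

```python
def d1_distance(f_query, f_db):
    """d1 similarity distance of Eq. (11) between a query feature vector
    and one database feature vector of equal length Lg."""
    return sum(abs((a - b) / (1 + a + b))     # per-feature normalised difference
               for a, b in zip(f_db, f_query))
```

Each term is bounded by 1 for non-negative features, so large-valued histogram bins do not dominate the distance, which is why this metric is often preferred over a plain L1 distance for histogram features.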

4 Experimental results and discussions

The retrieval performance of the proposed method has been analyzed by conducting four experiments on two different databases, the Corel database (DB1) and the Brodatz database (DB2), and the results are presented separately.

In experiments #1, #2, and #3, images from the Corel database [25] are used. This database contains a large number of images of varied content, ranging from animals and outdoor sports to natural scenes. The images have been pre-classified by domain professionals into categories of 100 images each. Many researchers consider the Corel database to meet all the requirements for evaluating an image retrieval system, owing to its large size and heterogeneous content.

In all experiments, each image in the database is used as the query image. For each query, the system collects \(n\) database images \(X=(x_{1}, x_{2}, \ldots , x_{n})\) with the shortest matching distance computed using Eq. (11). If a retrieved image \(x_{i}\,(i=1, 2, \ldots , n)\) belongs to the same category as the query image, the system is said to have correctly identified the expected image; otherwise, it has failed to find it.

The performance of the proposed method is measured in terms of average precision, average recall, and average retrieval rate (ARR) as shown below:

Fig. 10
figure 10

Comparison of proposed method with other existing methods on Corel-10K. a Category-wise performance in terms of precision, b category-wise performance in terms of recall, c total database performance in terms of average precision and d total database performance in terms of ARR

Fig. 11
figure 11

An example of image retrieval by proposed method (DLEP) on Corel-10K database

For the query image \(I_q \), the precision is defined as follows:

$$\begin{aligned} P(I_q ,n)=\frac{1}{n}\sum \limits _{i=1}^{\left| {\text{DB}} \right|} {\left| {\delta ( {\Phi ( {I_i }),\Phi ( {I_q })})\vert \,\text{ Rank}(I_i ,I_q )\le n} \right|}\nonumber \\ \end{aligned}$$
(12)

where ‘\(n\)’ indicates the number of retrieved images, \(\left|{\text{DB}} \right|\) is size of image database. \(\Phi ( x)\) stands for the category of ‘\(x\)’, \(\text{ Rank}(I_i ,I_q )\) returns the rank of image \(I_i \) (for the query image\(I_q )\) among all images of \(\left| {\text{DB}} \right|\) and \(\delta ( {\Phi ( {I_i }),\Phi ( {I_q })})=\left\{ {\begin{array}{l@{\quad }l} 1&\Phi ( {I_i })=\Phi ( {I_q }) \\ 0&\text{Otherwise} \\ \end{array}} \right.\).

Recall is defined as below:

$$\begin{aligned} \left. {R(I_q ,n)=P(I_q ,N_G )} \right|_{n=N_G } \end{aligned}$$
(13)

The average precision for the \(j\)th similarity category of the reference image database is given by Eq. (14).

$$\begin{aligned} P_\mathrm{ ave}^j (n)=\frac{1}{N_G }\sum \limits _{i\in G} {P(I_i ,n)} \end{aligned}$$
(14)

Finally, the total average precision, and ARR for the whole reference image database are computed using Eqs. (15) and (16), respectively

$$\begin{aligned}&P_\mathrm{ ave}^\mathrm{ Total} (n)=\frac{1}{\left| {\text{DB}} \right|}\sum \limits _{i=1}^{\left| {\text{DB}} \right|} {P(I_i ,n)} \end{aligned}$$
(15)
$$\begin{aligned}&\text{ ARR}=\frac{1}{\left| {\text{DB}} \right|}\left. {\sum \limits _{i=1}^{\left| {\text{DB}} \right|} {R(I_i ,n)} } \right|_{n\le 100} \end{aligned}$$
(16)

The average recall \((R)\) is also defined in the same manner.
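For a single query, Eqs. (12) and (13) reduce to counting category matches among the top-\(n\) ranked results. A sketch, under the assumption that the database images have already been ranked by the distance of Eq. (11) (function names are ours):

```python
def precision_at_n(ranked_categories, query_category, n):
    """Eq. (12): fraction of the top-n retrieved images whose category
    matches the query's. ranked_categories is the category label of each
    database image, ordered by increasing distance to the query."""
    top = ranked_categories[:n]
    return sum(1 for c in top if c == query_category) / n

def recall(ranked_categories, query_category, n_g):
    """Eq. (13): recall is precision evaluated at n = N_G,
    the number of images in the query's category."""
    return precision_at_n(ranked_categories, query_category, n_g)
```

Averaging these quantities over a category and over the whole database gives Eqs. (14)–(16).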

4.1 Experiment #1

For this experiment, we collected 1000 images to form the Corel-1K database. The images come from ten different domains, namely Africans, beaches, buildings, buses, dinosaurs, elephants, flowers, horses, mountains, and food. Each category has \(N_{G}\) (100) images with a resolution of either \(256\times 384\) or \(384\times 256\). Figure 6 shows sample images from the Corel-1K database (one image from each category). The performance of the proposed method is measured in terms of average precision, average recall, and ARR, as shown in Eqs. (12)–(16).

Tables 1 and 2 show the results of proposed method and other existing methods (LBP, CS_LBP, LEPSEG, LEPINV, BLK_LBP) in terms of precision and recall. The results are considered to be better if average values of precision and recall are high.

Table 4 Average retrieval rate for the 116 texture classes of Brodatz database
Fig. 12
figure 12

Comparison of proposed method with other existing method in terms of ARR on DB2 database

From Tables 1 and 2, the following points are observed:

  1. The average precision of the proposed method (74.8%) is higher than that of the LBP (71.2%), CS_LBP (59.1%), LEPSEG (65.2%), LEPINV (60.8%), and BLK_LBP (70.1%).

  2. The average recall of the proposed method (49.16%) is higher than that of the LBP (45.71%), CS_LBP (40.9%), LEPSEG (38.1%), LEPINV (34.68%), and BLK_LBP (43.0%).

From the above observations, it is evident that the proposed method significantly improves results in terms of average precision and average recall. Figure 7a, b show the experimental results of proposed method and other existing methods. It is observed that the proposed method (DLEP) achieves a superior average precision and ARR on image database Corel-1K as compared with other existing methods.

4.2 Experiment #2

In this experiment, we used 5000 images to form the Corel-5K database. This database consists of 50 different categories, each containing 100 images. The performance of the proposed method is measured in terms of average precision, average recall, and ARR, as shown in Eqs. (12)–(16).

Table 3 illustrates the retrieval results of proposed method and other existing methods on Corel-5K and Corel-10K databases in terms of average precision and recall. Figure 8a, b show the category-wise performance of methods in terms of precision and recall on Corel-5K database. The performance of all techniques in terms of average precision and ARR on Corel-5K database can be seen in Fig. 8c, d, respectively. From Table 3 and Fig. 8, it is clear that the proposed method shows a significant improvement as compared with other existing methods in terms of their evaluation measures on Corel-5K database. Figure 9 illustrates the query results of proposed method on Corel-5K database (top left image is the query image).

4.3 Experiment #3

In experiment #3, we used 10,000 images to form the Corel-10K database. This database consists of 100 different categories, each containing 100 images. The performance of the proposed method is measured in terms of average precision, average recall, and ARR, as shown in Eqs. (12)–(16).

Figure 10a, b show the category-wise performance of methods in terms of precision and recall on Corel-10K database. The performance of all techniques in terms of average precision and ARR on Corel-10K database can be seen in Fig. 10c, d, respectively. From Table 3 and Fig. 10, it is clear that the proposed method shows a significant improvement as compared with other existing methods in terms of their evaluation measures on Corel-10K database. Figure 11 illustrates the query results of proposed method on Corel-10K database (top left image is the query image).

4.4 Experiment #4

Experiment #4 uses database DB2, which consists of 116 different textures: 109 textures from the Brodatz texture photographic album [26] and seven textures from the University of Southern California (USC) database [27]. Each texture is of size 512\(\times \)512 and is divided into sixteen 128\(\times \)128 non-overlapping sub-images, creating a database of 1856 (116\(\times \)16) images. In this experiment, each image in the database is used as the query image, and the performance of the proposed method is measured in terms of ARR as given by Eq. (17).

$$\begin{aligned} \text{ ARR}=\frac{1}{\left| {\text{DB}} \right|}\left. {\sum \limits _{i=1}^{\left| {\text{ DB}} \right|} {R(I_i ,\,n)} } \right|_{\begin{array}{ll} N_G\; =\; 16 \\ n\; \ge\; 16 \\ \end{array}} \end{aligned}$$
(17)
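The sub-image construction described above (splitting each 512\(\times \)512 texture into sixteen 128\(\times \)128 non-overlapping tiles) can be sketched as follows; the function name is ours:

```python
def tile(image, tile_size):
    """Split a square image (list of rows) into non-overlapping
    tile_size x tile_size sub-images, in row-major order."""
    n = len(image)
    tiles = []
    for r0 in range(0, n, tile_size):
        for c0 in range(0, n, tile_size):
            tiles.append([row[c0:c0 + tile_size]
                          for row in image[r0:r0 + tile_size]])
    return tiles
```

Applying this with `tile_size=128` to each of the 116 textures yields the \(116\times 16 = 1856\) database images.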

Database DB2 is used to compare the performance of the proposed method (DLEP) with other existing methods (GT, DT-CWT, DT-RCWT, DT-CWT+DT-RCWT, CS_LBP, LEPSEG, LEPINV, BLK_LBP, and LBP) in terms of ARR. From Table 4, it is evident that the proposed method outperforms the other existing methods. Figure 12a, b show graphs of the retrieval performance of the proposed and other existing methods as a function of the number of top matches; again, the proposed method outperforms the others in terms of ARR.

5 Conclusions and future work

A new approach for CBIR is presented in this paper. The proposed DLEP differs from the existing LBP in that it extracts directional edge information based on local extrema in the 0\(^{\circ }\), 45\(^{\circ }\), 90\(^{\circ }\), and 135\(^{\circ }\) directions of an image. The performance of the proposed method is tested in four experiments on benchmark image databases; the retrieval results show a significant improvement in the evaluation measures compared with other existing methods on the respective databases.

In the future, this work can be extended by combining the proposed method with the GT and by varying the number of neighbors (i.e., using more directions) of the referenced pixels.