Abstract
In this article, we propose a novel feature selection system. Feature selection is a key problem in content-based image indexing and retrieval, as well as in other research fields such as pattern classification and genomic data analysis. The proposed system aims to improve semantic image retrieval results, reduce the complexity of the retrieval process, and improve overall usability for end users of multimedia search engines. The system consists of three feature selection criteria and a decision method. Two novel feature selection criteria, based on inner-cluster and inter-cluster relations, are proposed. A majority voting-based method is adapted for the efficient selection of features and feature combinations. The performance of the proposed criteria is assessed on a large image database with a number of features and is compared against competing techniques from the literature. Experiments show that the proposed feature selection system improves semantic retrieval performance in image retrieval systems.
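The abstract does not spell out how the majority voting step works, so the following is only a minimal sketch of one plausible reading: each criterion votes for its top-ranked features, and a feature is kept when more than half of the criteria vote for it. The function name `majority_vote_select`, the `top_k` parameter, the feature names `f1` to `f4`, and all scores are hypothetical and are not taken from the article.

```python
from collections import Counter


def majority_vote_select(scores_by_criterion, top_k):
    """Keep features that receive votes from more than half of the criteria.

    scores_by_criterion maps a criterion name to a dict of feature -> score.
    Assumption: scores are normalized and a higher score is better.
    Each criterion casts one vote for each of its top_k features.
    """
    votes = Counter()
    for scores in scores_by_criterion.values():
        ranked = sorted(scores, key=scores.get, reverse=True)
        votes.update(ranked[:top_k])
    needed = len(scores_by_criterion) // 2 + 1  # strict majority
    selected = sorted(f for f, v in votes.items() if v >= needed)
    return selected, dict(votes)


# Made-up normalized scores for three criteria (illustrative only).
scores = {
    "MI": {"f1": 0.9, "f2": 0.4, "f3": 0.7, "f4": 0.2},
    "ICR": {"f1": 0.8, "f2": 0.6, "f3": 0.3, "f4": 0.5},
    "PPMC": {"f1": 0.9, "f2": 0.7, "f3": 0.6, "f4": 0.1},
}

selected, votes = majority_vote_select(scores, top_k=2)
print(selected)
print(votes)
```

With three criteria, a feature needs at least two votes to be selected. A different voting rule (for example, weighted votes or rank aggregation) would change only the body of the function, not the overall structure.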
Abbreviations
- \(p(x)\): Probability density function
- \(p(x, y)\): Joint probability density function
- \(I(X; Y)\): Mutual information
- \(H(X)\): Shannon’s entropy
- \(S\): Correlation measure for evaluating the discrimination power of a feature
- \(c\): Number of classes
- \(\delta\): Correlation between clusters
- \(f_{xi}\): ith item in cluster x
- \(\mu_{x}\): Mean of cluster x
- \(\sigma_{x}\): Standard deviation of cluster x
- \(N_{x}\): Cardinality of cluster x
- \(e_{1}\): Eigenvector corresponding to the largest eigenvalue of the covariance matrix
- \(\pi\): The best representative feature vector
- \(x_{N}\): Set of feature vectors
- \(x_{i}\): Feature vector corresponding to the ith item in the cluster
- \(x_{ij}\): jth element of the feature vector corresponding to the ith item of the cluster
- \(M\): Mean vector
- \(\mu_{j}\): Elements of \(M\) (mean values)
- \(\Delta\): Distance between \(\pi\) and \(M\)
- \(n\): Number of elements in the vectors \(\pi\) and \(M\)
- \(d\): Euclidean distance between cluster members
- \(S_{w1}, S_{w2}, S_{w3}\): Compactness measurements
- \(r\): Covering radius, the distance from the center to the farthest item in the cluster
- \(\Pi\): Probability
- \(\upsilon_{MI}\): Normalized numerical results from the mutual information criterion
- \(\upsilon_{ICR}\): Normalized numerical results from the inner-cluster relation criterion
- \(\upsilon_{PPMC}\): Normalized numerical results from the Pearson’s product-moment correlation criterion
- \({\nu_{{f_{i}}}}\): Votes for each feature
- \(F\): Number of features in the FSRL list
- \(\alpha_{i}\): Weights of the features in retrieval
- \(R_{i}\): Rank of the ith feature in the FSRL list
- \(\omega_{i}\): Weight of item i in the SPFL list
- \(\omega_{j}\): Weight of item j in the FL list
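Several of the quantities listed above (Shannon's entropy, mutual information, and Pearson's product-moment correlation) have standard textbook definitions. The sketch below computes them from samples using only the Python standard library. It assumes discrete-valued samples and simple plug-in frequency estimates; the function names are illustrative, and this is not necessarily the estimator used in the article.

```python
import math
from collections import Counter


def entropy(xs):
    """Shannon entropy H(X) in bits, estimated from sample frequencies."""
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())


def mutual_information(xs, ys):
    """Mutual information I(X; Y) in bits from paired discrete samples."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts.
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def pearson(xs, ys):
    """Pearson's product-moment correlation coefficient of two sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


h = entropy([0, 0, 1, 1])
mi_dependent = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
mi_independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
rho = pearson([1, 2, 3, 4], [2, 4, 6, 8])
```

In this toy data, a variable paired with an identical copy of itself carries all of its entropy as mutual information, while the second pairing is statistically independent, so its mutual information vanishes. Continuous features would first need to be discretized or handled with a density estimate.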
Additional information
Open Access
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
This work was supported by the Academy of Finland, Project No. 213462 (Finnish Centre of Excellence Program 2006–2011).
About this article
Cite this article
Guldogan, E., Gabbouj, M. Feature selection for content-based image retrieval. SIViP 2, 241–250 (2008). https://doi.org/10.1007/s11760-007-0049-9
DOI: https://doi.org/10.1007/s11760-007-0049-9