Enhancing relevance feedback in image retrieval using unlabeled data

Authors:
Zhi-Hua Zhou (Nanjing University, Nanjing, China),
Ke-Jia Chen (Nanjing University, Nanjing, China),
Hong-Bin Dai (Nanjing University, Nanjing, China)

ACM Transactions on Information Systems, Volume 24, Issue 2, pp. 219–244
https://doi.org/10.1145/1148020.1148023

Published: 01 April 2006

Abstract

Relevance feedback is an effective scheme bridging the gap between high-level semantics and low-level features in content-based image retrieval (CBIR). In contrast to previous methods which rely on labeled images provided by the user, this article attempts to enhance the performance of relevance feedback by exploiting unlabeled images existing in the database. Concretely, this article integrates the merits of semisupervised learning and active learning into the relevance feedback process. In detail, in each round of relevance feedback two simple learners are trained from the labeled data, that is, images from user query and user feedback. Each learner then labels some unlabeled images in the database for the other learner. After retraining with the additional labeled data, the learners reclassify the images in the database and then their classifications are merged. Images judged to be positive with high confidence are returned as the retrieval result, while those judged with low confidence are put into the pool which is used in the next round of relevance feedback. Experiments show that using semisupervised learning and active learning simultaneously in CBIR is beneficial, and the proposed method achieves better performance than some existing methods.
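
The round described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the choice of two Minkowski-distance learners of orders 1 and 2 follows the description in the review on this page, while the scoring formula, the function names (`minkowski`, `score`, `feedback_round`), the parameters `n_return` and `n_pool`, and the toy data are all assumptions made for the example.

```python
import numpy as np

def minkowski(X, x, p):
    """Minkowski distance of order p from every row of X to the vector x."""
    return np.sum(np.abs(X - x) ** p, axis=1) ** (1.0 / p)

def score(X, pos, neg, p):
    """Hypothetical confidence in [-1, 1]: > 0 leans relevant, < 0 leans irrelevant."""
    def mean_similarity(examples):
        if len(examples) == 0:
            return np.zeros(len(X))
        return np.mean([1.0 / (1.0 + minkowski(X, e, p)) for e in examples], axis=0)
    s = mean_similarity(pos) - mean_similarity(neg)
    peak = np.max(np.abs(s))
    return s / peak if peak > 0 else s

def feedback_round(X, pos_idx, neg_idx, orders=(1, 2), n_return=5, n_pool=5):
    """One round: two learners label for each other, are rescored, then merged."""
    pos_idx, neg_idx = list(pos_idx), list(neg_idx)
    labeled = set(pos_idx) | set(neg_idx)
    unlabeled = np.array([i for i in range(len(X)) if i not in labeled], dtype=int)

    # Each learner picks its most confident positive and negative unlabeled image.
    picks = []
    for p in orders:
        s = score(X[unlabeled], X[pos_idx], X[neg_idx], p)
        picks.append((unlabeled[np.argmax(s)], unlabeled[np.argmin(s)]))

    # Learner k is rescored on the user labels plus the labels from the other learner.
    merged = np.zeros(len(X))
    for k, p in enumerate(orders):
        extra_pos, extra_neg = picks[1 - k]
        merged += score(X, X[pos_idx + [extra_pos]], X[neg_idx + [extra_neg]], p)
    merged /= len(orders)

    # Most confident positives are returned; the least certain images form the pool.
    result = unlabeled[np.argsort(-merged[unlabeled])][:n_return]
    pool = unlabeled[np.argsort(np.abs(merged[unlabeled]))][:n_pool]
    return result, pool, merged

# Toy database (assumed): two well-separated clusters of 4-dimensional features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (10, 4)), rng.normal(5.0, 0.1, (10, 4))])
result, pool, merged = feedback_round(X, pos_idx=[0], neg_idx=[10])
```

In this sketch, `result` plays the role of the retrieval result and `pool` holds the images whose merged score is closest to zero, which would be the candidates shown to the user for labeling in the next round. A real system would use the article's own learners and confidence measure in place of `score`.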

Index Terms

1. Computing methodologies
  1. Machine learning
2. Information systems

Reviews

Reviewer: Richard CHBEIR

This paper addresses relevance feedback in content-based image retrieval (CBIR) and presents an interesting approach based on a preliminary method: semi-supervised active image retrieval with asymmetry (SSAIRA). The approach addresses three issues: a small sample size, an asymmetric training sample, and a real-time requirement. The authors propose a learning method that uses two learners trained from the labeled data. The user query is treated as the labeled positive example, while the image database is treated initially as a set of unlabeled data. The two learners are defined with respect to the Minkowski distance and are differentiated by the order of this distance. The learners are easy to update, which makes the relevance feedback process more efficient. In addition, the learning algorithm deals with negative image examples. The authors consider each image to be representative of a semantic class, and images close to a negative example may belong to the same class. To define the representative of the class, they calculate the k-nearest neighbors of negative examples. One may wonder why they use the Euclidean distance in the neighborhood calculation when other, more adaptive methods could be used instead. Nevertheless, the authors have conducted several rich and satisfactory experiments to validate their approach and to compare it with current approaches.

Online Computing Reviews Service
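
The neighborhood step mentioned in the review can be illustrated with a small sketch. The review does not say how the class representative is formed from the neighbors, so averaging a negative example with its k nearest neighbors is an assumption made here for illustration, as are the function name `negative_representative` and the default `k=3`.

```python
import numpy as np

def negative_representative(X, neg_index, k=3):
    """Average a negative example with its k nearest neighbours (Euclidean distance).

    Assumed reading of the review: the mean stands in for the semantic class
    that the negative example belongs to.
    """
    distances = np.linalg.norm(X - X[neg_index], axis=1)
    # The example itself has distance 0, so it is among the k + 1 closest rows.
    nearest = np.argsort(distances)[: k + 1]
    return X[nearest].mean(axis=0), nearest
```

Swapping `np.linalg.norm` for another distance would be the place to try the more adaptive neighborhood measures the reviewer alludes to.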

Published in

ACM Transactions on Information Systems, Volume 24, Issue 2
April 2006
150 pages
ISSN: 1046-8188
EISSN: 1558-2868
DOI: 10.1145/1148020

Copyright © 2006 ACM

Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Relevance feedback
active learning
content-based image retrieval
machine learning
learning with unlabeled data
semisupervised learning