DOI: 10.1145/1631272.1631305
research-article

Inferring semantic concepts from community-contributed images and noisy tags

Published: 19 October 2009

ABSTRACT

In this paper, we explore the problem of inferring the semantic concepts of images from community-contributed images and their associated noisy tags. To infer the concepts more accurately, we propose a novel sparse graph-based semi-supervised learning approach that harnesses labeled and unlabeled data simultaneously. The sparse graph, constructed by datum-wise one-vs-all sparse reconstructions of all samples, removes most of the concept-unrelated links among the data and is thus more robust and discriminative than conventional graphs. More importantly, we propose an effective training label refinement strategy within this graph-based learning framework to handle the noise in the tags, by introducing a dual regularization on both the quantity and the sparsity of the noise. In addition, to bridge the semantic gap, we construct an informative, compact concept space with a small semantic gap and infer the semantic concepts in this space. The relations among different concepts are inherently embedded in this space and help the concept inference. We conduct extensive experiments on a real-world community-contributed image database consisting of 55,615 Flickr images and their associated tags. The results demonstrate the effectiveness of the proposed approaches and the capability of our method to deal with the noise in the tags. We further show that inferring semantic concepts from training data with noisy tags achieves performance comparable to that obtained from training data with clean ground-truth labels.
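The graph construction and label inference described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the l1 solver (plain ISTA), the penalty `lam`, the propagation weight `alpha`, the closed-form propagation rule, and the toy data are all assumptions chosen for brevity, and the label refinement step with its dual regularization is not modeled at all.

```python
import numpy as np

def sparse_codes(X, lam=0.05, n_iter=300):
    """Reconstruct each row of X from all other rows under an l1 penalty (ISTA)."""
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        others = np.delete(np.arange(n), i)
        D = X[others].T                                    # columns are the other samples
        step = 1.0 / (np.linalg.norm(D, 2) ** 2 + 1e-12)   # 1 / Lipschitz constant
        a = np.zeros(n - 1)
        for _ in range(n_iter):
            z = a - step * (D.T @ (D @ a - X[i]))          # gradient step on the squared error
            a = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
        W[i, others] = np.abs(a)                           # coefficient magnitudes as edge weights
    return W

def propagate(W, Y, alpha=0.9):
    """Closed-form propagation F = (I - alpha * S)^-1 Y on the symmetrized graph."""
    A = 0.5 * (W + W.T)
    d = A.sum(axis=1)
    d[d == 0] = 1.0                                        # guard isolated nodes
    S = A / np.sqrt(np.outer(d, d))                        # symmetric normalization
    return np.linalg.solve(np.eye(len(A)) - alpha * S, Y)

# Hypothetical toy data: six samples near one direction, six near another.
rng = np.random.default_rng(0)
centers = np.repeat(np.eye(3)[:2], 6, axis=0)
X = centers + 0.01 * rng.standard_normal(centers.shape)
X /= np.linalg.norm(X, axis=1, keepdims=True)

# One labeled sample per concept; all other rows are unlabeled.
Y = np.zeros((12, 2))
Y[0, 0] = 1.0
Y[6, 1] = 1.0

W = sparse_codes(X)
pred = propagate(W, Y).argmax(axis=1)
```

In this sketch the l1 penalty drives the coefficients on samples from the other group toward zero, which is the sense in which a sparse reconstruction graph drops concept-unrelated links. The graph is symmetrized before propagation only so that the normalized matrix is well behaved; that is a design choice of this sketch and may differ from the paper.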


Published in

MM '09: Proceedings of the 17th ACM international conference on Multimedia
October 2009
1202 pages
ISBN: 9781605586083
DOI: 10.1145/1631272

Copyright © 2009 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 995 of 4,171 submissions, 24%
