research-article

Inferring semantic concepts from community-contributed images and noisy tags

Authors:
Jinhui Tang, National University of Singapore, Singapore
Shuicheng Yan, National University of Singapore, Singapore
Richang Hong, National University of Singapore, Singapore
Guo-Jun Qi, University of Illinois at Urbana-Champaign, Illinois, USA
Tat-Seng Chua, National University of Singapore, Singapore

MM '09: Proceedings of the 17th ACM international conference on Multimedia, October 2009, Pages 223–232. https://doi.org/10.1145/1631272.1631305

Published: 19 October 2009


ABSTRACT

In this paper, we study the problem of inferring images' semantic concepts from community-contributed images and their associated noisy tags. To infer the concepts more accurately, we propose a novel sparse graph-based semi-supervised learning approach that harnesses labeled and unlabeled data simultaneously. The sparse graph, constructed by datum-wise one-vs-all sparse reconstructions of all samples, removes most of the concept-unrelated links among the data and is thus more robust and discriminative than conventional graphs. More importantly, we propose an effective training label refinement strategy within this graph-based learning framework to handle the noise in the tags, by introducing a dual regularization on both the quantity and the sparsity of the noise. In addition, to bridge the semantic gap, we construct an informative, compact concept space with a small semantic gap and infer the semantic concepts in this space. The relations among different concepts are inherently embedded in this space and help the concept inference. We conduct extensive experiments on a real-world community-contributed image database consisting of 55,615 Flickr images and their associated tags. The results demonstrate the effectiveness of the proposed approaches and the capability of our method to deal with the noise in the tags. We further show that inferring semantic concepts from training data with noisy tags achieves performance comparable to that obtained from training data with clean ground-truth labels.
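The sparse-graph idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a nonnegative Lasso solved by plain iterative soft-thresholding as a stand-in for the sparse reconstruction, and a generic closed-form label propagation step in place of the paper's objective. The label refinement with dual regularization and the concept space are not covered. The names `sparse_weights`, `propagate`, `lam`, and `alpha`, and the toy data, are all hypothetical.

```python
import numpy as np

def sparse_weights(X, lam=0.05, n_iter=300):
    """Reconstruct each sample from all the others with a nonnegative
    l1-penalized least-squares fit (assumed stand-in solver: ISTA).
    Row i of the result holds the edge weights leaving sample i."""
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.array([j for j in range(n) if j != i])
        A = X[idx].T                      # dictionary: every other sample as a column
        x = X[i]
        step = 1.0 / (np.linalg.norm(A, 2) ** 2 + 1e-12)  # 1 / Lipschitz constant
        w = np.zeros(n - 1)
        for _ in range(n_iter):
            grad = A.T @ (A @ w - x)      # gradient of 0.5 * ||A w - x||^2
            w = np.maximum(w - step * grad - step * lam, 0.0)  # nonnegative shrinkage
        W[i, idx] = w
    return W

def propagate(W, Y, alpha=0.9):
    """Generic propagation: solve (I - alpha * P) F = (1 - alpha) * Y,
    where P is the row-normalized weight matrix."""
    P = W / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * P, (1 - alpha) * Y)

# Toy data (hypothetical): two groups of unit vectors in 3-D.
X = np.array([[1.0, 0.0, 0.0], [1.0, 0.1, 0.0], [1.0, -0.1, 0.0],
              [0.0, 0.0, 1.0], [0.0, 0.1, 1.0], [0.0, -0.1, 1.0]])
X = X / np.linalg.norm(X, axis=1, keepdims=True)

# One labeled sample per group; all other rows are unlabeled (all zeros).
Y = np.zeros((6, 2))
Y[0, 0] = 1.0
Y[3, 1] = 1.0

W = sparse_weights(X)
F = propagate(W, Y)
pred = F.argmax(axis=1)
```

In this toy setting the l1 penalty suppresses links between the two groups, so each labeled sample's score spreads only within its own group. Real image features and noisy tags would need the refinement step described in the abstract, which this sketch leaves out.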


Index Terms

Inferring semantic concepts from community-contributed images and noisy tags
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Scene understanding
2. Information systems
  1. Information retrieval
    1. Document representation
    2. Search engine architectures and scalability
      1. Search engine indexing


Published in
MM '09: Proceedings of the 17th ACM international conference on Multimedia
October 2009
1202 pages
ISBN:9781605586083
DOI:10.1145/1631272
General Chairs:
Wen Gao, Peking University, China
Yong Rui, Microsoft, China
Alan Hanjalic, Delft University of Technology, The Netherlands
Program Chairs:
Changsheng Xu, Institute of Automation, Chinese Academy of Sciences, China
Eckehard Steinbach, Technical University of Munich, Germany
Abdulmotaleb El Saddik, University of Ottawa, Canada
Michelle Zhou, IBM T. J. Watson Research Center, USA
Copyright © 2009 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
concept space
noisy tags
semi-supervised learning
sparse graph
web image
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 995 of 4,171 submissions, 24%