Abstract
Image representation is a crucial component of image analysis and understanding. However, in many situations the widely used low-level features cannot correctly represent the high-level semantic content of images, owing to the "semantic gap". To bridge this gap, we present in this paper a novel topic model that learns an effective and robust mid-level representation in a latent semantic space for image analysis. In our model, an ℓ1-graph is constructed to model the local image neighborhood structure, and word co-occurrence is computed to capture local word consistency. This local information is then incorporated into the model for topic discovery, and the generalized EM algorithm is used to estimate the parameters. Because our model considers the local image structure and the local word consistency simultaneously when estimating the probabilistic topic distributions, image representations in the learned latent semantic space have greater descriptive power. Extensive experiments on publicly available databases demonstrate the effectiveness of our approach.
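The two local-consistency ingredients named in the abstract can be made concrete. The sketch below is illustrative only, not the authors' implementation: it builds an ℓ1-graph by sparsely coding each sample over the remaining samples (here via a plain ISTA solver for the Lasso; the paper follows the ℓ1-graph construction of Cheng et al.), and computes a word-word co-occurrence matrix from a document-word count matrix. The function names, the regularization weight `lam`, and the ISTA solver are assumptions chosen for the sketch.

```python
import numpy as np

def l1_graph(X, lam=0.1, n_iter=200):
    """Illustrative l1-graph: code each row of X over the other rows.

    Solves min_a 0.5 * ||D a - y||^2 + lam * ||a||_1 with ISTA, where the
    dictionary D holds all samples except the one being coded. The absolute
    coefficients become (symmetrized) edge weights of the graph.
    """
    n, _ = X.shape
    W = np.zeros((n, n))
    for i in range(n):
        D = np.delete(X, i, axis=0).T        # dictionary: d x (n-1)
        y = X[i]
        L = np.linalg.norm(D, 2) ** 2        # Lipschitz const. of the gradient
        a = np.zeros(n - 1)
        for _ in range(n_iter):
            g = D.T @ (D @ a - y)            # gradient of the quadratic term
            a = a - g / L                    # gradient step
            a = np.sign(a) * np.maximum(np.abs(a) - lam / L, 0.0)  # soft-threshold
        W[i, np.arange(n) != i] = np.abs(a)  # off-diagonal edge weights
    return (W + W.T) / 2                     # symmetrize the adjacency matrix

def word_cooccurrence(counts):
    """Word-word co-occurrence from a document-word count matrix."""
    C = counts.T @ counts                    # words co-occurring across documents
    np.fill_diagonal(C, 0)                   # ignore self co-occurrence
    return C
```

In the model described above, both structures act as local regularizers on the topic distributions: the ℓ1-graph ties the topic proportions of neighboring images together, while the co-occurrence matrix encourages frequently co-occurring visual words to share topics.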
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, P., Cheng, J., Lu, H. (2013). Modeling Hidden Topics with Dual Local Consistency for Image Analysis. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds) Computer Vision – ACCV 2012. ACCV 2012. Lecture Notes in Computer Science, vol 7724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37331-2_49
DOI: https://doi.org/10.1007/978-3-642-37331-2_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37330-5
Online ISBN: 978-3-642-37331-2
eBook Packages: Computer Science (R0)