Abstract
Cross-lingual adaptation is a special case of domain adaptation and refers to the transfer of classification knowledge between two languages. In this article we describe an extension of Structural Correspondence Learning (SCL), a recently proposed algorithm for domain adaptation, for cross-lingual adaptation in the context of text classification. The proposed method uses unlabeled documents from both languages, along with a word translation oracle, to induce a cross-lingual representation that enables the transfer of classification knowledge from the source to the target language. The main advantages of this method over existing methods are resource efficiency and task specificity.
We conduct experiments in the area of cross-language topic and sentiment classification involving English as source language and German, French, and Japanese as target languages. The results show a significant improvement of the proposed method over a machine translation baseline, reducing the relative error due to cross-lingual adaptation by an average of 30% (topic classification) and 59% (sentiment classification). We further report on empirical analyses that reveal insights into the use of unlabeled data, the sensitivity with respect to important hyperparameters, and the nature of the induced cross-lingual word correspondences.
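The representation-induction step summarized above can be illustrated with a minimal sketch. It assumes the usual SCL recipe: for each pivot word pair supplied by the translation oracle, a linear predictor is trained on unlabeled documents from both languages, and a truncated SVD of the stacked predictor weights gives the projection. The abstract does not give these details, so the function name `induce_projection`, the toy data, and the use of closed-form ridge regression as the pivot predictor are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def induce_projection(X_unlabeled, pivot_pairs, k, ridge=1.0):
    """Return a (k, n_features) projection induced from unlabeled documents.

    X_unlabeled : (n_docs, n_features) term matrix over the joint vocabulary
                  of both languages (source and target documents stacked).
    pivot_pairs : list of (source_word_index, target_word_index), assumed to
                  come from a word translation oracle.
    k           : target dimensionality; should not exceed len(pivot_pairs).
    """
    X = np.asarray(X_unlabeled, dtype=float)
    n_features = X.shape[1]
    W = np.zeros((n_features, len(pivot_pairs)))

    for j, (s, t) in enumerate(pivot_pairs):
        # Pivot label: +1 if the document contains either word of the pair.
        y = np.where((X[:, s] > 0) | (X[:, t] > 0), 1.0, -1.0)

        # Mask the pivot words so the predictor must rely on co-occurring words.
        X_masked = X.copy()
        X_masked[:, [s, t]] = 0.0

        # Closed-form ridge regression (an assumed stand-in for the pivot classifier).
        gram = X_masked.T @ X_masked + ridge * np.eye(n_features)
        W[:, j] = np.linalg.solve(gram, X_masked.T @ y)

    # The top-k left singular vectors of W span the shared low-dimensional space.
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    return U[:, :k].T


# Toy joint vocabulary (hypothetical): indices 0-2 are source words, 3-5 target words.
rng = np.random.default_rng(0)
X_toy = rng.integers(0, 3, size=(40, 6)).astype(float)
theta = induce_projection(X_toy, pivot_pairs=[(0, 3), (1, 4)], k=2)
Z = X_toy @ theta.T  # documents from either language mapped into the shared space
```

In this sketch a classifier trained on projected source-language documents could be applied to projected target-language documents, since both are expressed in the same `k` dimensions. Pivot selection and hyperparameter choices are left out.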
Index Terms
- Cross-Lingual Adaptation Using Structural Correspondence Learning