Cross-Lingual Adaptation Using Structural Correspondence Learning

Authors:
Peter Prettenhofer, Bauhaus-Universität Weimar
Benno Stein, Bauhaus-Universität Weimar

ACM Transactions on Intelligent Systems and Technology, Volume 3, Issue 1, Article No. 13, pp. 1–22. https://doi.org/10.1145/2036264.2036277

Published: 01 October 2011

Abstract

Cross-lingual adaptation is a special case of domain adaptation and refers to the transfer of classification knowledge between two languages. In this article we describe an extension of Structural Correspondence Learning (SCL), a recently proposed algorithm for domain adaptation, for cross-lingual adaptation in the context of text classification. The proposed method uses unlabeled documents from both languages, along with a word translation oracle, to induce a cross-lingual representation that enables the transfer of classification knowledge from the source to the target language. The main advantages of this method over existing methods are resource efficiency and task specificity.

We conduct experiments in the area of cross-language topic and sentiment classification involving English as source language and German, French, and Japanese as target languages. The results show a significant improvement of the proposed method over a machine translation baseline, reducing the relative error due to cross-lingual adaptation by an average of 30% (topic classification) and 59% (sentiment classification). We further report on empirical analyses that reveal insights into the use of unlabeled data, the sensitivity with respect to important hyperparameters, and the nature of the induced cross-lingual word correspondences.

Index Terms

Cross-Lingual Adaptation Using Structural Correspondence Learning
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Language resources
2. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Document filtering
      2. Information extraction

Published in

ACM Transactions on Intelligent Systems and Technology Volume 3, Issue 1
October 2011
391 pages
ISSN:2157-6904
EISSN:2157-6912
DOI:10.1145/2036264

Copyright © 2011 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 1 October 2011
- Accepted: 1 December 2010
- Revised: 1 October 2010
- Received: 1 June 2010
Author Tags
Cross-language text classification
cross-lingual adaptation
structural correspondence learning
Qualifiers
- research-article
- Research
- Refereed