ABSTRACT
In this paper, we present a new learning scenario, heterogeneous transfer learning, which improves learning performance when the data lie in different feature spaces and no correspondence between data instances in these spaces is provided. In earlier work, we classified Chinese text documents using English training data under the heterogeneous transfer learning framework. Here, we use image clustering as an example to illustrate how unsupervised learning can be improved by transferring knowledge from auxiliary heterogeneous data obtained from the social Web. Image clustering is useful for image sense disambiguation in query-based image search, but its quality is often low because of the image data sparsity problem. We extend probabilistic latent semantic analysis (PLSA) to transfer knowledge from social Web data, which have mixed feature representations. Experiments on image-object clustering and scene clustering tasks show that our heterogeneous transfer learning approach, based on the auxiliary data, is effective and promising.
Index Terms
- Heterogeneous transfer learning for image clustering via the social web