ABSTRACT
With the rapidly increasing diversity and modality of user-generated content, sentiment analysis, a core area of social media analytics, has moved beyond traditional text-based analysis, and multimodal sentiment analysis has become an important research topic in recent years. Most existing work on multimodal sentiment analysis extracts features from image and text separately and directly combines them to train a classifier. Yet visual and textual information in multimodal data can mutually reinforce and complement each other when analyzing people's sentiment, and previous research ignores this mutual influence between image and text. To fill this gap, in this paper we consider the interrelation of visual and textual information and propose a novel co-memory network that iteratively models the interactions between visual content and textual words for multimodal sentiment analysis. Experimental results on two public multimodal sentiment datasets demonstrate the effectiveness of our proposed model compared to state-of-the-art methods.
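The iterative interaction the abstract describes can be sketched as follows. This is a minimal, hypothetical simplification using plain dot-product attention over region and word features projected into a shared dimension; the paper's actual memory-update layers, gating, and feature extractors are assumptions here, not the authors' implementation.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def co_memory_hops(visual_feats, text_feats, hops=2):
    """Sketch of co-memory attention.

    visual_feats: (num_regions, d) image-region features
    text_feats:   (num_words, d)   word features (same dimension d assumed)

    At each hop, the current textual summary guides attention over visual
    regions, and the current visual summary guides attention over words,
    so each modality's representation is refined by the other.
    """
    v_sum = visual_feats.mean(axis=0)  # initial visual summary
    t_sum = text_feats.mean(axis=0)    # initial textual summary
    for _ in range(hops):
        # attend over visual regions, conditioned on the textual summary
        v_att = softmax(visual_feats @ t_sum)
        new_v = v_att @ visual_feats
        # attend over words, conditioned on the visual summary
        t_att = softmax(text_feats @ v_sum)
        new_t = t_att @ text_feats
        v_sum, t_sum = new_v, new_t
    # joint representation fed to a sentiment classifier
    return np.concatenate([v_sum, t_sum])
```

In a real model the mean-initialized summaries and raw dot products would be replaced by learned transformations, but the sketch shows the key difference from naive fusion: the two modalities repeatedly condition each other before classification.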
Index Terms: A Co-Memory Network for Multimodal Sentiment Analysis