Sentiment analysis based on rhetorical structure theory: Learning deep neural networks from discourse trees

doi:10.1016/j.eswa.2018.10.002

Expert Systems with Applications

Volume 118, 15 March 2019, Pages 65-79

https://doi.org/10.1016/j.eswa.2018.10.002

Highlights

• Improves sentiment analysis with discourse trees from rhetorical structure theory.
• Extracts salient passages based on the position and relation in the discourse tree.
• Develops a tensor-based tree-structured neural network.
• Tensor structure distinguishes hierarchy and relation types.
• Overfitting is reduced by tree-based algorithms for data augmentation.

Abstract

Prominent applications of sentiment analysis are countless, covering areas such as marketing, customer service and communication. The conventional bag-of-words approach for measuring sentiment merely counts term frequencies; however, it neglects the position of the terms within the discourse. As a remedy, we develop a discourse-aware method that builds upon the discourse structure of documents. For this purpose, we utilize rhetorical structure theory to label (sub-)clauses according to their hierarchical relationships and then assign polarity scores to individual leaves. To learn from the resulting rhetorical structure, we propose a tensor-based, tree-structured deep neural network (named Discourse-LSTM) in order to process the complete discourse tree. The underlying tensors infer the salient passages of narrative materials. In addition, we suggest two algorithms for data augmentation (node reordering and artificial leaf insertion) that increase our training set and reduce overfitting. Our benchmarks demonstrate the superior performance of our approach. Moreover, our tensor structure reveals the salient text passages and thereby provides explanatory insights.

Introduction

Sentiment analysis reveals personal opinions towards entities such as products, services or events, which can benefit organizations and businesses in improving their marketing, communication, production and procurement. For this purpose, sentiment analysis quantifies the positivity or negativity of subjective information in narrative materials (Chen, Xu, He, & Wang, 2017; Feldman, 2013; Kratzwald, Ilic, Kraus, Feuerriegel, & Prendinger, 2018; Pang & Lee, 2008). Among the many applications of sentiment analysis are tracking customer opinions (Araque, Corcuera-Platas, Sánchez-Rada, & Iglesias, 2017; Bohanec, Kljajić Borštnar, & Robnik-Šikonja, 2017; Tanaka, 2010), mining user reviews (Kontopoulos, Berberidis, Dergiades, & Bassiliades, 2013; Mostafa, 2013; Ye, Zhang, & Law, 2009), trading upon financial news (Khadjeh Nassirtoussi, Aghabozorgi, Ying Wah, & Ngo, 2015; Kraus & Feuerriegel, 2017; Weng, Lu, Wang, Megahed, & Martinez, 2018), detecting social events (Yoo, Song, & Jeong, 2018) and predicting sales (Rui, Liu, & Whinston, 2013; Yu, Liu, Huang, & An, 2012).

Sentiment analysis traditionally utilizes bag-of-words approaches, which merely count the frequency of words (and tuples thereof) to obtain a mathematical representation of documents in matrix form (Dey, Jenamani, & Thakkar, 2018; Guzella & Caminhas, 2009; Manning & Schütze, 1999; Pang & Lee, 2008). As such, these approaches are not capable of taking into consideration semantic relationships between sections and sentences of a document. In naïve bag-of-words models, all clauses are assigned the same level of relevance, so the model cannot weight certain subordinate clauses more heavily than others when inferring the sentiment. Conversely, the objective of this paper is to develop a discourse-aware method for sentiment analysis that can recognize differences in salience between individual subordinate clauses, as well as discriminate the relevance of sentences based on their function (e.g., whether a sentence introduces a new fact or elaborates upon an existing one).

Let us, for instance, consider the two examples in Fig. 1, which express opposite polarities. By simply counting the frequency of positive and negative words, we cannot discriminate between the texts, as both contain the same number of polarity terms. To reliably analyze the sentiment, it is essential to account for the semantic structure and the variable importance across passages. That is, we can identify the main clauses and then infer the correct tone of the examples by looking at them. Similarly, RST trees can locate relevant parts in lengthy texts. For instance, the concluding section of a newspaper article is typically relevant as it reports the opinion of the author.
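To make this limitation concrete, the following minimal sketch counts polarity terms in two short texts. The lexicon and both sentences are hypothetical stand-ins (they are not the examples from Fig. 1), and `bow_polarity` is an illustrative helper name rather than part of the proposed method.

```python
# Hypothetical polarity lexicon and example texts (not those from Fig. 1).
POSITIVE = {"great", "good"}
NEGATIVE = {"boring", "bad"}


def bow_polarity(text: str) -> int:
    """Count positive minus negative terms, ignoring where they occur."""
    tokens = [token.strip(".,!?").lower() for token in text.split()]
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives


text_a = "Although the plot was boring, the movie was great."
text_b = "Although the cast was great, the movie was boring."

print(bow_polarity(text_a), bow_polarity(text_b))  # prints "0 0"
```

Both texts receive the same score although their main clauses express opposite opinions, which is the ambiguity that a discourse-aware method aims to resolve.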

Our method is based on rhetorical structure theory (RST), which incorporates the discourse structures of natural language. RST structures documents hierarchically (Mann & Thompson, 1988) by splitting the content into (sub-)clauses called elementary discourse units (EDUs). The EDUs are then connected to form a binary discourse tree. Here, RST discriminates between a nucleus, which conveys primary information, and a satellite, which conveys ancillary information. The formalization of nucleus/satellite can be loosely thought of as the main and subordinate parts of a clause. The edges are further labeled according to the type of discourse, for instance, whether it is an elaboration or an argument. Hence, this method essentially derives the function of a text passage. Both concepts of the RST tree help in localizing essential information within documents. Hence, the goal of this work is to develop a novel approach that identifies salient passages in a document based on their position in the discourse tree and incorporates their importance in the form of weights when computing sentiment scores.
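One possible in-memory representation of such a binary discourse tree is sketched below. The class `RSTNode`, its field names, the helper `list_edus` and the example sentence are assumptions made for illustration; storing the relation label on both children is a simplification and not necessarily how a discourse parser encodes its output.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RSTNode:
    """A node of a binary discourse tree (illustrative, hypothetical layout)."""
    role: str                         # "nucleus", "satellite" or "root"
    relation: Optional[str] = None    # label of the edge to the parent
    text: Optional[str] = None        # EDU text, set for leaves only
    left: Optional["RSTNode"] = None
    right: Optional["RSTNode"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def list_edus(node: RSTNode, depth: int = 0) -> List[Tuple[str, str, Optional[str], int]]:
    """Return (text, role, relation, depth) for every EDU, left to right."""
    if node.is_leaf():
        return [(node.text, node.role, node.relation, depth)]
    return list_edus(node.left, depth + 1) + list_edus(node.right, depth + 1)


tree = RSTNode(
    role="root",
    left=RSTNode(role="satellite", relation="contrast",
                 text="Although the plot was boring,"),
    right=RSTNode(role="nucleus", relation="contrast",
                  text="the movie was great."),
)

for edu in list_edus(tree):
    print(edu)
```

Each leaf carries the three pieces of information discussed above: its hierarchy type, its relation type and its depth in the tree.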

Previous research has demonstrated that discourse-related information can improve the performance of sentiment analysis (see Section 2 for details). The work by Taboada, Voll, and Brooke (2008) is the first to combine rhetorical structure theory and sentiment analysis. In this work, the authors weigh adjectives in a nucleus more heavily than those in a satellite. Beyond that, one can reweigh the importance of passages based on their relation type (Hogenboom, Frasincar, de Jong, & Kaymak, 2015b) or depth (Märkle-Huß, Feuerriegel, & Prendinger, 2017) in the discourse tree. Some methods prune the discourse trees at certain thresholds to yield a tree of fixed depth, e.g., 2 or 4 levels (Märkle-Huß et al., 2017). Other approaches train machine learning classifiers based on the relation types as input features (Hogenboom, Frasincar, de Jong, & Kaymak, 2015a). What the previous references have in common is that they try to map the tree structure onto mathematically simpler representations, thereby discarding part of the information in the tree.
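The rule-based reweighting idea can be illustrated with a few lines of code. This is only a sketch of the general principle of weighting nuclei more heavily than satellites; the weights 1.0 and 0.5, the function name `weighted_score` and the example scores are assumptions and are not taken from any of the cited works.

```python
from typing import List, Tuple


def weighted_score(edus: List[Tuple[float, str]],
                   w_nucleus: float = 1.0,
                   w_satellite: float = 0.5) -> float:
    """Sum EDU polarity scores, weighting each by its hierarchy type.

    `edus` holds (polarity_score, role) pairs; the default weights are
    hypothetical values chosen for illustration.
    """
    total = 0.0
    for score, role in edus:
        weight = w_nucleus if role == "nucleus" else w_satellite
        total += weight * score
    return total


# Two documents with identical term counts but opposite main clauses.
doc_a = [(-1.0, "satellite"), (+1.0, "nucleus")]
doc_b = [(+1.0, "satellite"), (-1.0, "nucleus")]

print(weighted_score(doc_a), weighted_score(doc_b))  # prints "0.5 -0.5"
```

With fixed weights, the two documents are separated, but the weights themselves must be chosen by hand, which is the kind of simplification that learning from the full tree is meant to avoid.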

An alternative strategy is to apply tree-structured neural networks that traverse discourse trees for representation learning. When encountering a node, these networks combine the information from the leaves and pass it on to the next higher level until reaching the root, at which point a prediction is made. Thereby, the approach merely adheres to the tree structure but does not account for either the relation type or whether a node is a nucleus or a satellite. To do so, one can extend the network to include different weights for each edge in the tree depending on, e.g., the relation type. This essentially introduces additional degrees of freedom that can weigh the different discourse units by their importance. The work by Fu, Liu, Xu, Yu, and Wang (2016) extends the network by such a mechanism with respect to the nucleus/satellite information but discards the relation type and merely applies the network to individual sentences instead of longer documents. The approach in Ji and Smith (2017) can only exploit the relation type and not the nucleus/satellite information. Furthermore, former approaches are based on traditional recursive neural networks, which are limited by the fact that they can persist information for only a few iterations (Bengio, Simard, & Frasconi, 1994). Therefore, these methods struggle with complex discourses, while we explicitly build upon tree-shaped long short-term memory models, since they are better equipped to handle very deep structures.
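The bottom-up traversal with edge-specific weights can be sketched as follows. This is a deliberately simplified, scalar recursive network (not an LSTM and not the model proposed in this paper): the nested-dictionary tree layout, the function name `encode` and the hand-set values in `EDGE_WEIGHTS` are assumptions, and in a trained network the weights would be learned rather than fixed.

```python
import math

# Hypothetical weights, one per (relation type, hierarchy type) edge label.
EDGE_WEIGHTS = {
    ("elaboration", "satellite"): 0.3,
    ("elaboration", "nucleus"): 1.0,
    ("contrast", "satellite"): 0.5,
    ("contrast", "nucleus"): 1.2,
}


def encode(node: dict) -> float:
    """Return a scalar state for the subtree rooted at `node`.

    Leaves carry a polarity score under "score"; inner nodes carry a
    list of children, each labeled with "relation" and "role".
    """
    if "score" in node:
        return node["score"]
    total = 0.0
    for child in node["children"]:
        weight = EDGE_WEIGHTS[(child["relation"], child["role"])]
        total += weight * encode(child)
    return math.tanh(total)  # squash before passing to the next level


tree_a = {"children": [
    {"relation": "contrast", "role": "satellite", "score": -1.0},
    {"relation": "contrast", "role": "nucleus", "score": +1.0},
]}
tree_b = {"children": [
    {"relation": "contrast", "role": "satellite", "score": +1.0},
    {"relation": "contrast", "role": "nucleus", "score": -1.0},
]}

print(encode(tree_a) > 0, encode(tree_b) < 0)  # prints "True True"
```

Because each state is squashed at every level, signals from deep leaves shrink on their way to the root, which hints at why plain recursive networks struggle with very deep trees.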

We build upon the previous works and advance them by proposing a specific neural network, called Discourse-LSTM. The Discourse-LSTM utilizes multiple tensors to localize salient passages within documents by incorporating the full discourse structure including nucleus/satellite information and relation types. In brief, our approach is as follows: we utilize rhetorical structure theory to represent the semantic structure of a document in the form of a hierarchical discourse tree. We then obtain sentiment scores for each leaf by utilizing both polarity dictionaries and word embeddings. The resulting tree is subsequently traversed by the Discourse-LSTM, thereby aggregating the sentiment scores based on the discourse structure in order to compute a sentiment score for the document. This approach thus weighs the importance of (sub-)clauses based on their position and relation in the discourse tree, which is learned during the training phase. As a consequence, this allows us to enhance sentiment analysis with discourse information. Another key contribution is that we propose two techniques for data augmentation that facilitate training and yield higher predictive accuracy.
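To give an intuition for how a tree-shaped LSTM aggregates child states, the sketch below implements a single binary tree-LSTM node update with scalar states. It is not the authors' Discourse-LSTM: the scalar simplification, the function `tree_lstm_node`, the table `PARAMS_BY_LABEL` and all parameter values are assumptions chosen for illustration. The only idea carried over from the text is that the parameters applied at a node are selected by its discourse label.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def tree_lstm_node(left, right, params):
    """Combine two child states (h, c) into the parent state (h, c).

    `params` maps each gate name to (w_left, w_right, bias); these are
    hypothetical scalar parameters selected by the caller.
    """
    (h_l, c_l), (h_r, c_r) = left, right

    def pre(gate: str) -> float:
        w_l, w_r, bias = params[gate]
        return w_l * h_l + w_r * h_r + bias

    i = sigmoid(pre("input"))
    f_l = sigmoid(pre("forget_left"))    # how much of the left memory is kept
    f_r = sigmoid(pre("forget_right"))   # how much of the right memory is kept
    o = sigmoid(pre("output"))
    u = math.tanh(pre("update"))
    c = i * u + f_l * c_l + f_r * c_r
    h = o * math.tanh(c)
    return h, c


# One parameter set per discourse label (values are made up): this one
# largely forgets the left child and keeps the right child.
PARAMS_BY_LABEL = {
    ("contrast", "nucleus_right"): {
        "input": (0.0, 0.0, 0.0),
        "forget_left": (0.0, 0.0, -5.0),
        "forget_right": (0.0, 0.0, 5.0),
        "output": (0.0, 0.0, 0.0),
        "update": (0.0, 0.0, 0.0),
    },
}

params = PARAMS_BY_LABEL[("contrast", "nucleus_right")]
h, c = tree_lstm_node((-1.0, -1.0), (1.0, 1.0), params)
print(h > 0, c > 0)  # prints "True True"
```

In this toy setting, the forget gates let the memory of the right-hand child dominate the parent state, which mirrors the notion of weighting (sub-)clauses by their position and relation in the discourse tree.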

The remainder of this paper is structured as follows. Section 2 reviews discourse parsing and RST-based sentiment analysis. Section 3 then introduces our Discourse-LSTM, as well as our algorithms for data augmentation. Section 4 describes our experimental setup in order to evaluate the performance of our deep learning methods in comparison to common baselines (Section 5). Section 6 concludes with a summary and suggestions for future research.

Section snippets

Rhetorical structure theory

Rhetorical structure theory formalizes the discourse in narrative materials by organizing sub-clauses, sentences and paragraphs into a hierarchy (Mann & Thompson, 1988). The premise is that a document is split into elementary discourse units, which constitute the smallest, indivisible segments. These EDUs are then connected by one of 18 different relation types, which represent edges in the discourse tree; see Table 1 for a list. Each relation is further labeled by a hierarchy type, i.e., either

Discourse-based sentiment analysis with deep learning

This section introduces our discourse-based methodology, which infers sentiment scores from textual materials. Fig. 3 illustrates the underlying framework and divides the procedure into steps for discourse parsing, computing low-level polarity features, data augmentation and prediction. The prediction phase implements either of the baselines or our proposed Discourse-LSTM.

Datasets

We build upon earlier work and utilize three common datasets. The first consists of 2000 movie reviews from Rotten Tomatoes (Pang & Lee, 2004), for which we perform 10-fold cross-validation and then average the predictive performance across splits. The second dataset comprises 50000 reviews from the Internet Movie Database (IMDb), which are split evenly into 25000 reviews for training and 25000 for testing (Maas et al., 2011). It includes, at most, 30 reviews for any one movie, since reviews

Results

In this section, we evaluate the performance of our Discourse-LSTM and compare it to the previous baselines. In addition, we perform statistical significance tests on the receiver operating characteristics (ROC) (DeLong, DeLong, & Clarke-Pearson, 1988). The evaluation provides evidence that incorporating semantic structure into the task of sentiment analysis improves the predictive performance.

Conclusion

Deep learning for natural language predominantly builds upon sequential models such as LSTMs. While these models usually achieve a high predictive power when applied to short texts, the complexity of linguistic discourse hampers performance for longer documents. As a remedy, our paper proposes an innovative, discourse-aware approach: we first parse the semantic structure based on rhetorical structure theory, thereby mapping the document onto a discourse tree that encodes its storyline. We then

Declarations of interest:

none

Acknowledgment

The valuable contribution of Ryan Grabowski is greatly appreciated.

References (49)

O. Araque et al.
Enhancing deep learning sentiment analysis with ensemble techniques in social applications
Expert Systems with Applications
(2017)
M. Bohanec et al.
Explaining machine learning models in sales predictions
Expert Systems with Applications
(2017)
C. de Boom et al.
Representation learning for very short texts using weighted word embedding aggregation
Pattern Recognition Letters
(2016)
T. Chen et al.
Improving sentiment analysis via sentence type classification using biLSTM-CRF and CNN
Expert Systems with Applications
(2017)
J.M. Chenlo et al.
Rhetorical structure theory for polarity estimation: an experimental study
Data & Knowledge Engineering
(2014)
A. Dey et al.
Senti-N-Gram: An n-gram lexicon for sentiment analysis
Expert Systems with Applications
(2018)
T.S. Guzella et al.
A review of machine learning approaches to spam filtering
Expert Systems with Applications
(2009)
A. Hogenboom et al.
Polarity classification using structure-based vector representations of text
Decision Support Systems
(2015)
A. Khadjeh Nassirtoussi et al.
Text mining of news-headlines for FOREX market prediction: A multi-layer dimension reduction algorithm with semantics and sentiment
Expert Systems with Applications
(2015)
E. Kontopoulos et al.
Ontology-based sentiment analysis of Twitter posts
Expert Systems with Applications
(2013)
M. Kraus et al.
Decision support from financial disclosures with deep neural networks and transfer learning
Decision Support Systems
(2017)
S. Liu et al.
Discovering sentiment sequence within email data through trajectory representation
Expert Systems with Applications
(2018)
M.M. Mostafa
More than words: Social networks’ text mining for consumer brand sentiments
Expert Systems with Applications
(2013)
H. Rui et al.
Whose and what chatter matters? The effect of tweets on movie sales
Decision Support Systems
(2013)
K. Tanaka
A sales forecasting model for new-released and nonlinear sales trend products
Expert Systems with Applications
(2010)
B. Weng et al.
Predicting short-term stock prices using ensemble methods and online data sources
Expert Systems with Applications
(2018)
Q. Ye et al.
Sentiment classification of online reviews to travel destinations by supervised machine learning approaches
Expert Systems with Applications
(2009)
S. Yoo et al.
Social media contents based sentiment analysis and prediction system
Expert Systems with Applications
(2018)
O. Abend et al.
The state of the art in semantic representation
Proceedings of the 55th annual meeting of the association for computational linguistics (ACL ’17)
(2017)
S. Baccianella et al.
SentiWordNet 3.0: An enhanced lexical resource for sentiment analysis and opinion mining
Proceedings of the international conference on language resources and evaluation (LREC ’10)
(2010)
Y. Bengio et al.
Learning long-term dependencies with gradient descent is difficult
IEEE Transactions on Neural Networks
(1994)
P. Bhatia et al.
Better document-level sentiment analysis from RST discourse parsing
Proceedings of the conference on empirical methods in natural language processing (EMNLP ’15)
(2015)
E.R. DeLong et al.
Comparing the areas under two or more correlated receiver operating characteristic curves: A nonparametric approach
Biometrics
(1988)
R. Feldman
Techniques and applications for sentiment analysis
Communications of the ACM
(2013)
