Sentiment analysis of expectation and perception of MILANO EXPO2015 in twitter data: a generalized cross entropy approach

  • Focus
  • Journal: Soft Computing

Abstract

In this paper, data concerning MILANO EXPO2015 are collected from the official Twitter page of the event before and after its opening. To extract a semi-supervised ontology and to evaluate the global sentiment around the event, a variety of language processing techniques were applied to the collected “tweets”: Latent Semantic Analysis and sentiment polarity tracking, together with gap analysis, allowed the semantic evaluation of users’ opinions. Moreover, the generalized cross entropy approach was applied for the first time to web data, adding prior information on the effect of semantic classes on the global sentiment, which improved accuracy and added detail to the analysis.
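As a rough illustration of what "adding prior information" means in a cross entropy setting, the sketch below computes the Kullback-Leibler cross entropy between a candidate distribution and a prior. It is not the estimator used in the paper: the function names, the three-point support and all numeric values are hypothetical, chosen only to show that with a uniform prior the cross entropy reduces to ln K minus the Shannon entropy, so an informative prior is what distinguishes the two criteria.

```python
import math

def cross_entropy(p, q):
    """Kullback-Leibler cross entropy: sum_i p_i * ln(p_i / q_i)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def shannon_entropy(p):
    """Shannon entropy: -sum_i p_i * ln(p_i)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

# Hypothetical values, for illustration only (not taken from the paper).
p = [0.5, 0.3, 0.2]               # candidate probabilities on a 3-point support
prior = [0.6, 0.3, 0.1]           # informative prior on the same support
uniform = [1.0 / 3.0] * 3         # uninformative (uniform) prior

ce_prior = cross_entropy(p, prior)      # divergence from the informative prior
ce_uniform = cross_entropy(p, uniform)  # divergence from the uniform prior

# With a uniform prior over K points: CE(p, uniform) = ln(K) - H(p),
# so minimizing cross entropy is equivalent to maximizing Shannon entropy.
identity_gap = abs(ce_uniform - (math.log(3) - shannon_entropy(p)))
```

The cross entropy is zero when the candidate distribution coincides with the prior and grows as the two diverge, which is why a minimizer is pulled toward the prior unless the data constraints say otherwise.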

Notes

  1. ‘Fuzzy’ here means embedding concepts from more than one cluster at once.

  2. Recall that our considerations are restricted to the case where no fuzzy k-means clustering is applied. In the fuzzy case, further considerations are needed: the most straightforward option would be to keep the same general approach, but then \(C' \notin \mathcal{R}^{n_o \times n_c}\).

  3. This matrix is actually a vector, because there is only one document: the query itself.

  4. See https://dev.twitter.com/streaming/public for more information.

  5. Available at https://store.continuum.io/cshop/anaconda/.

Author information

Corresponding author

Correspondence to Enrico Ciavolino.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Human and animal rights

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by Massimo Squillante.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Corallo, A., Fortunato, L., Massafra, A. et al. Sentiment analysis of expectation and perception of MILANO EXPO2015 in twitter data: a generalized cross entropy approach. Soft Comput 24, 13597–13607 (2020). https://doi.org/10.1007/s00500-019-04368-7

  • DOI: https://doi.org/10.1007/s00500-019-04368-7
