DOI: 10.1145/2527031.2527049
research-article

Dengue surveillance based on a computational model of spatio-temporal locality of Twitter

Published: 15 June 2011

ABSTRACT

Twitter is a unique social media channel in that users discuss the most diverse topics, including their health conditions. In this paper we analyze how the Dengue epidemic is reflected on Twitter and to what extent that information can be used for surveillance. Dengue is a mosquito-borne infectious disease that is a leading cause of illness and death in tropical and subtropical regions, including Brazil. We propose an active surveillance methodology based on four dimensions: volume, location, time, and public perception. First, we explore the public perception dimension by performing sentiment analysis, which enables us to filter out content that is not relevant to Dengue surveillance. Then, we verify the high correlation between the number of cases reported by official statistics and the number of tweets posted during the same time period (R² = 0.9578). A clustering approach is used to exploit the spatio-temporal dimension, and the quality of the resulting clusters becomes evident when they are compared to official data (Rand index = 0.8914). As an application, we propose a Dengue surveillance system that shows the evolution of the Dengue situation reported in tweets, available at www.observatorio.inweb.org.br/dengue/.
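The two evaluation measures quoted in the abstract can be illustrated with a minimal sketch. It assumes that R² is the squared Pearson correlation between weekly tweet counts and weekly official case counts (which equals the coefficient of determination of a simple linear fit), and that the Rand index is the standard pairwise-agreement measure between two clusterings. The abstract does not state how the authors computed either value, and all names and numbers below are hypothetical.

```python
from itertools import combinations


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov * cov / (var_x * var_y)


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two clusterings agree.

    A pair agrees when both clusterings put the two items in the same
    cluster, or both put them in different clusters.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical data, not taken from the paper.
weekly_tweets = [120, 340, 560, 910, 700, 300]   # Dengue-related tweets per week
weekly_cases = [15, 40, 70, 110, 85, 35]         # officially reported cases per week
tweet_clusters = [0, 0, 1, 1, 2, 2]              # cluster label per city, from tweets
official_clusters = [0, 0, 1, 2, 2, 2]           # cluster label per city, from official data

print("R^2:", r_squared(weekly_tweets, weekly_cases))
print("Rand index:", rand_index(tweet_clusters, official_clusters))
```

Both measures range from 0 to 1, with 1 indicating a perfect linear relationship or identical clusterings, respectively.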


        Published in

          WebSci '11: Proceedings of the 3rd International Web Science Conference
          June 2011
          483 pages
          ISBN: 9781450308557
          DOI: 10.1145/2527031

          Copyright © 2011 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          WebSci '11 Paper Acceptance Rate: 34 of 203 submissions, 17%. Overall Acceptance Rate: 218 of 875 submissions, 25%.
