Estimating micro-populations through social media analytics

  • Original Article
  • Social Network Analysis and Mining

Abstract

Estimating crowd sizes or the occupancy of buildings and skyscrapers is often essential. However, traditional estimation methods, such as manual counting, image processing or, in the case of skyscrapers, total water usage, are awkward, inefficient and often inaccurate. Social media has developed rapidly in the last decade. In this work, we provide novel solutions to estimate the population of suburbs and skyscrapers (so-called micro-populations) through the use of social media. We develop a big data solution leveraging large-scale harvesting and analysis of Twitter data. By harvesting real-time tweets and clustering them within suburbs and skyscrapers, we show how micro-populations can be calculated. To validate this, we construct linear and spatial models for the suburbs in four Australian cities using census data and geospatial data models (shapefiles). Our predictions of micro-populations show that Twitter can indeed be used for population prediction with a high degree of accuracy.
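
The pipeline outlined in the abstract (assign geotagged tweets to suburb polygons, then relate per-suburb activity to census population with a linear model) can be illustrated with a minimal sketch. The preview does not show the authors' implementation, so everything below is an assumption made for illustration: the function names, the choice of distinct tweeting users as the predictor, the ray-casting containment test, and the toy polygons, tweets and census figures, which are invented and are not results from the article.

```python
from collections import defaultdict


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the polygon,
    given as a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that horizontal line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def users_per_suburb(tweets, suburbs):
    """Count distinct tweeting users located inside each suburb polygon."""
    users = defaultdict(set)
    for user_id, lon, lat in tweets:
        for name, polygon in suburbs.items():
            if point_in_polygon(lon, lat, polygon):
                users[name].add(user_id)
                break  # a tweet is assigned to at most one suburb
    return {name: len(users[name]) for name in suburbs}


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical toy data: three unit-square "suburbs" and a few tweets
# given as (user_id, lon, lat). None of these values come from the article.
suburbs = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],
    "C": [(2, 0), (3, 0), (3, 1), (2, 1)],
}
tweets = [
    ("u1", 0.5, 0.5), ("u1", 0.6, 0.4), ("u2", 0.2, 0.8),
    ("u3", 1.5, 0.5), ("u4", 1.2, 0.2), ("u5", 1.8, 0.9), ("u6", 1.4, 0.6),
    ("u7", 2.5, 0.5),
    ("u8", 5.0, 5.0),  # falls outside every suburb and is ignored
]
census = {"A": 2000, "B": 4000, "C": 1000}  # invented census populations

counts = users_per_suburb(tweets, suburbs)
names = sorted(suburbs)
a, b = fit_line([counts[s] for s in names], [census[s] for s in names])


def predict(user_count):
    """Predicted population for a suburb with the given user count."""
    return a + b * user_count
```

In practice the suburb boundaries would be read from shapefiles and the containment test delegated to a geospatial library; the hand-written test here only keeps the sketch self-contained. A spatial model would additionally account for correlation between neighbouring suburbs, which this simple linear fit ignores.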

Acknowledgements

The authors wish to thank the NeCTAR project for the use of the Cloud systems underpinning this paper, and the AURIN project for the Census and suburb Shapefiles. The corresponding author is Prof. Richard O. Sinnott.

Corresponding author

Correspondence to Richard O. Sinnott.

About this article

Cite this article

Sinnott, R.O., Wang, W. Estimating micro-populations through social media analytics. Soc. Netw. Anal. Min. 7, 13 (2017). https://doi.org/10.1007/s13278-017-0433-6

  • DOI: https://doi.org/10.1007/s13278-017-0433-6
