Harvesting ambient geospatial information from social media feeds

GeoJournal

Abstract

Social media generated by many individuals is playing a greater role in our daily lives and provides a unique opportunity to gain valuable insight into information flow and social networking within a society. Through the collection and analysis of its content, it supports a greater mapping and understanding of the evolving human landscape. The information disseminated through such media represents a deviation from volunteered geography, in the sense that it is not geographic information per se. Nevertheless, the message often has geographic footprints, for example, in the form of locations from which the tweets originate, or references in their content to geographic entities. We argue that such data conveys ambient geospatial information, capturing, for example, people’s references to locations that represent momentary social hotspots. In this paper we present a framework to harvest such ambient geospatial information, and the resulting hybrid capabilities to analyze it in support of situational awareness as it relates to human activities. We argue that this emergence of ambient geospatial analysis represents a second step in the evolution of geospatial data availability, following on the heels of volunteered geographical information.

Notes

  1. While we present a general architecture upon which we based our own system to collect such information, we should note that a number of comparable tools also exist, such as 140kit (http://140kit.com/) and twapperkeeper (http://twapperkeeper.com/), but these are limited in their scalability with respect to large datasets. Sites such as ushahidi (http://www.ushahidi.com/) also provide a means to collect and disseminate information over the web. However, there are very few tools that allow one to add context to content, or to support detailed analysis.

  2. Readers are referred to http://www.casa.ucl.ac.uk/tom/ and http://urbantick.blogspot.com/ for further information.

  3. Hashtags represent a bottom-up, user-generated convention for adding content (in a sense, metadata) about a specific topic, by identifying keywords to describe content. They thus allow easy searching of tweets and trends. Sites such as http://hashtags.org/ monitor such trends from tweets and provide relevant statistics, but only over short periods of time.

  4. The tweets were obtained by searching with the twitter API within a 30 km radius of the given city, or by using the users’ twitter profile location.

  5. http://www.geovista.psu.edu/SensePlace2/.

  6. Examples include: monitoring swine flu (http://compepi.cs.uiowa.edu/~alessio/twitter-monitor-swine-flu/) or twitter traffic and the Oscars (http://www.neoformix.com/2009/OscarTwitterMap.html).

  7. Basically stated, social network analysis (SNA) allows us to explore how different parts of a social system (e.g. people, organizations) are linked together. Moreover, it allows one to define the system’s structure and evolution over time (e.g. kinship or role-based networks). SNA is a quantitative methodology using mathematical graphs to represent people or organizations, where each person is a node, and nodes are connected to others via links (edges). Such links can be directed or undirected (e.g. friendship networks don’t have to be reciprocal).

  8. The three most retweeted tweets in this specific dataset were, in chronological order: Tweet 1 from NHK_PR: 2011-03-11 13:51:23, asking people to remain calm; the number of retweets in the period of interest was 303. Tweet 2 from NHK_PR: 2011-03-11 13:54:59 was a warning to switch off power to homes before evacuating; the number of retweets in the period of interest was 435. Tweet 3 from NHK_PR: 2011-03-11 14:05:45 was a tsunami warning; the number of retweets in the period of interest was 258.

  9. http://ilektrojohn.github.com/creepy/.
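The 30 km radius search described in note 4 can be illustrated with a small client-side filter. This is only a sketch under stated assumptions, not the authors' collection code: it assumes each tweet has already been fetched and carries explicit coordinates under the hypothetical keys `lat` and `lon`, and it uses the haversine great-circle distance on a spherical Earth. It does not cover the profile-location fallback mentioned in the note.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(tweets, center, radius_km=30.0):
    """Keep tweets whose coordinates lie within radius_km of center.

    `tweets` is a list of dicts with hypothetical 'lat'/'lon' keys;
    `center` is a (lat, lon) tuple for the city of interest.
    """
    lat0, lon0 = center
    return [t for t in tweets
            if haversine_km(lat0, lon0, t["lat"], t["lon"]) <= radius_km]


if __name__ == "__main__":
    # Illustrative records only; these are not data from the paper.
    sample = [
        {"id": "a", "lat": 35.4437, "lon": 139.6380},  # near Yokohama
        {"id": "b", "lat": 34.6937, "lon": 135.5023},  # near Osaka
    ]
    tokyo = (35.6762, 139.6503)
    print([t["id"] for t in within_radius(sample, tokyo)])
```

Filtering after retrieval, as here, is useful mainly as a sanity check on results returned by a radius query, or when tweets are gathered by keyword and geographically subset afterwards.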
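The graph representation summarized in note 7 (people as nodes, links as directed or undirected edges) can be sketched in a few lines. The example below is a minimal illustration, assuming a hypothetical list of (source, target) pairs such as "who retweeted or follows whom"; the function names are invented for this sketch and do not come from the paper. It computes each node's in-degree and finds the reciprocated links, which shows why a directed network need not be symmetric.

```python
def build_directed_graph(edges):
    """Build an adjacency map {node: set of nodes it links to} from (source, target) pairs."""
    graph = {}
    for source, target in edges:
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set())  # register nodes that only receive links
    return graph


def in_degree(graph):
    """Count incoming links per node."""
    counts = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            counts[target] += 1
    return counts


def reciprocal_pairs(graph):
    """Return the unordered pairs {a, b} that are linked in both directions."""
    return {
        frozenset((a, b))
        for a, targets in graph.items()
        for b in targets
        if a != b and a in graph[b]
    }


if __name__ == "__main__":
    # Hypothetical links: alice and bob link to each other; carol links to alice only.
    links = [("alice", "bob"), ("bob", "alice"), ("carol", "alice")]
    g = build_directed_graph(links)
    print(in_degree(g))
    print(reciprocal_pairs(g))
```

A plain dictionary of sets keeps the sketch dependency-free; for larger networks a dedicated graph library would be the more usual choice.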

Author information

Corresponding author

Correspondence to Anthony Stefanidis.

About this article

Cite this article

Stefanidis, A., Crooks, A. & Radzikowski, J. Harvesting ambient geospatial information from social media feeds. GeoJournal 78, 319–338 (2013). https://doi.org/10.1007/s10708-011-9438-2

  • DOI: https://doi.org/10.1007/s10708-011-9438-2
