Abstract
Social media feeds, generated by many individuals, are playing a greater role in our daily lives and provide a unique opportunity to gain valuable insight into information flow and social networking within a society. Collecting and analyzing their content supports a greater mapping and understanding of the evolving human landscape. The information disseminated through such media represents a deviation from volunteered geography, in the sense that it is not geographic information per se. Nevertheless, the messages often have geographic footprints, for example, in the form of the locations from which tweets originate, or references in their content to geographic entities. We argue that such data convey ambient geospatial information, capturing, for example, people’s references to locations that represent momentary social hotspots. In this paper we present a framework to harvest such ambient geospatial information, and the resulting hybrid capabilities to analyze it in support of situational awareness as it relates to human activities. We argue that this emergence of ambient geospatial analysis represents a second step in the evolution of geospatial data availability, following on the heels of volunteered geographical information.
Notes
While we present a general architecture upon which we based our own system to collect such information, we should note that there also exist a number of comparable tools, such as 140kit (http://140kit.com/) or twapperkeeper (http://twapperkeeper.com/), but these are limited in their scalability with respect to large datasets. Sites such as ushahidi (http://www.ushahidi.com/) also provide a means to collect and disseminate information over the web. However, there are very few tools that allow one to add context to content, or to support detailed analysis.
Readers are referred to http://www.casa.ucl.ac.uk/tom/ and http://urbantick.blogspot.com/ for further information.
Hashtags represent a bottom-up, user-generated convention for adding content (in a sense, metadata) about a specific topic, by identifying keywords that describe the content. They thus allow easy searching of tweets and trends. Sites such as http://hashtags.org/ monitor such trends from tweets and provide relevant statistics, but only over short periods of time.
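As an illustration of how hashtags can serve as searchable metadata, the following is a minimal sketch that extracts and counts hashtags from tweet text. It is not the system described in this paper: the pattern `HASHTAG_RE`, the helper names, and the sample tweets are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumed pattern: a '#' followed by one or more word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(text):
    """Return the hashtags in one tweet's text, lowercased, without '#'."""
    return [tag.lower() for tag in HASHTAG_RE.findall(text)]


def count_hashtags(tweets):
    """Count hashtag occurrences over an iterable of tweet texts."""
    counts = Counter()
    for text in tweets:
        counts.update(extract_hashtags(text))
    return counts


# Invented sample data, for illustration only.
sample_tweets = [
    "Power is out downtown #earthquake #Tokyo",
    "Stay safe everyone #earthquake",
    "No tags in this one",
]

print(count_hashtags(sample_tweets).most_common(2))
```

Lowercasing treats `#Tokyo` and `#tokyo` as the same topic, which is a simplifying choice rather than a rule of the convention.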
The tweets were obtained by searching with the twitter API, either within a 30 km radius of the given city or by the location given in the users’ twitter profiles.
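The radius criterion can be illustrated with a small local filter. This sketch does not call the twitter API and is not the authors' collection code; it assumes tweets are already available as dictionaries with hypothetical `lat` and `lon` keys, and it uses the haversine great-circle distance on a spherical Earth as an approximation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(tweets, center, radius_km=30.0):
    """Keep tweets whose coordinates lie within radius_km of center."""
    lat0, lon0 = center
    return [t for t in tweets
            if haversine_km(lat0, lon0, t["lat"], t["lon"]) <= radius_km]


# Invented sample points: central Tokyo, Yokohama, and Osaka.
tokyo = (35.6895, 139.6917)
sample = [
    {"id": "a", "lat": 35.6895, "lon": 139.6917},
    {"id": "b", "lat": 35.4437, "lon": 139.6380},
    {"id": "c", "lat": 34.6937, "lon": 135.5023},
]

nearby = within_radius(sample, tokyo)
print([t["id"] for t in nearby])
```

Tweets located only through a free-text profile location would first need to be geocoded to coordinates, a step this sketch leaves out.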
Examples include: monitoring swine flu (http://compepi.cs.uiowa.edu/~alessio/twitter-monitor-swine-flu/) or twitter traffic and the Oscars (http://www.neoformix.com/2009/OscarTwitterMap.html).
Put simply, social network analysis (SNA) allows us to explore how different parts of a social system (e.g. people, organizations) are linked together. Moreover, it allows one to define the system’s structure and its evolution over time (e.g. kinship or role-based networks). SNA is a quantitative methodology that uses mathematical graphs to represent people or organizations: each person or organization is a node, and nodes are connected to one another via links (edges). Such links can be directed or undirected (e.g. friendship networks do not have to be reciprocal).
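The graph representation described in this note can be sketched with a small adjacency structure. The class `DirectedNetwork`, its method names, and the example users are hypothetical and chosen only for illustration; they show one way to store directed links and to check whether a tie is reciprocal.

```python
from collections import defaultdict


class DirectedNetwork:
    """Nodes are people or organizations; links are directed edges."""

    def __init__(self):
        self.nodes = set()
        self.successors = defaultdict(set)  # source -> set of targets

    def add_link(self, source, target):
        """Add a directed link from source to target."""
        self.nodes.update((source, target))
        self.successors[source].add(target)

    def is_reciprocal(self, a, b):
        """True only if links exist in both directions between a and b."""
        return (b in self.successors.get(a, set())
                and a in self.successors.get(b, set()))

    def in_degree(self, node):
        """Number of distinct nodes that link to the given node."""
        return sum(1 for targets in self.successors.values()
                   if node in targets)


# Invented example: carol follows alice, but alice does not follow back.
net = DirectedNetwork()
net.add_link("alice", "bob")
net.add_link("bob", "alice")
net.add_link("carol", "alice")

print(net.is_reciprocal("alice", "bob"))
print(net.is_reciprocal("carol", "alice"))
print(net.in_degree("alice"))
```

An undirected network could be represented with the same structure by adding every link in both directions.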
The three most retweeted tweets in this specific dataset, in chronological order, were as follows. Tweet 1, from NHK_PR (2011-03-11 13:51:23), asked people to remain calm; it was retweeted 303 times in the period of interest. Tweet 2, from NHK_PR (2011-03-11 13:54:59), was a warning to switch off power to homes before evacuating; it was retweeted 435 times in the period of interest. Tweet 3, from NHK_PR (2011-03-11 14:05:45), was a tsunami warning; it was retweeted 258 times in the period of interest.
About this article
Cite this article
Stefanidis, A., Crooks, A. & Radzikowski, J. Harvesting ambient geospatial information from social media feeds. GeoJournal 78, 319–338 (2013). https://doi.org/10.1007/s10708-011-9438-2
DOI: https://doi.org/10.1007/s10708-011-9438-2