ABSTRACT
Real spatial data, e.g., detailed road networks, rivers, buildings, and parks, is not easily available for most of the world. This hinders the practicality of many research ideas that need real spatial data for testing and experiments. Such data is often available for governmental use or at major software companies, but it is prohibitively expensive for academia or individual researchers to build or buy. This paper presents TAREEG, a web service that puts real spatial data, from anywhere in the world, at the fingertips of every researcher or individual. TAREEG obtains all its data from OpenStreetMap, the most comprehensive available spatial data set of the world. Obtaining OpenStreetMap data directly remains challenging, however, because of size limitations, its special data format, and the noisy nature of spatial data. TAREEG employs MapReduce-based techniques to extract OpenStreetMap data efficiently, in a standard form, and with minimal effort. Experimental results show that TAREEG is highly accurate and efficient.
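To make the idea of MapReduce-style extraction concrete, the following is a minimal single-machine sketch, not TAREEG's actual implementation. It assumes OpenStreetMap's XML layout (`node` elements with `lat`/`lon`, `way` elements with `nd` references and `tag` children), assumes that roads are the ways carrying a `highway` tag, and assumes Well-Known Text (WKT) as the "standard form". The sample data and all function names (`map_phase`, `shuffle`, `reduce_join`, `reduce_wkt`, `extract_roads`) are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Tiny hand-written sample in OSM XML layout (hypothetical data).
SAMPLE_OSM = """<osm>
  <node id="1" lat="44.97" lon="-93.26"/>
  <node id="2" lat="44.98" lon="-93.25"/>
  <node id="3" lat="44.99" lon="-93.24"/>
  <way id="10">
    <nd ref="1"/><nd ref="2"/><nd ref="3"/>
    <tag k="highway" v="residential"/>
  </way>
  <way id="11">
    <nd ref="1"/><nd ref="3"/>
    <tag k="building" v="yes"/>
  </way>
</osm>"""


def map_phase(root):
    """Emit (node_id, record) pairs for node coordinates and road memberships."""
    for node in root.iter("node"):
        yield node.get("id"), ("coord", node.get("lon"), node.get("lat"))
    for way in root.iter("way"):
        tags = {t.get("k"): t.get("v") for t in way.iter("tag")}
        if "highway" not in tags:  # assumption: roads are ways tagged 'highway'
            continue
        for pos, nd in enumerate(way.iter("nd")):
            yield nd.get("ref"), ("member", way.get("id"), pos)


def shuffle(pairs):
    """Group values by key, as the shuffle step of a MapReduce job would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_join(groups):
    """Join each road membership with its node's coordinate, keyed by way id."""
    for records in groups.values():
        coords = [r for r in records if r[0] == "coord"]
        if not coords:  # dangling reference: skip noisy data
            continue
        _, lon, lat = coords[0]
        for record in records:
            if record[0] == "member":
                _, way_id, pos = record
                yield way_id, (pos, lon, lat)


def reduce_wkt(groups):
    """Order each way's points by position and emit a WKT LINESTRING."""
    for way_id, points in groups.items():
        ordered = sorted(points)  # sorts by position within the way
        coords = ", ".join(f"{lon} {lat}" for _, lon, lat in ordered)
        yield way_id, f"LINESTRING ({coords})"


def extract_roads(osm_xml):
    """Run both map/shuffle/reduce rounds and return {way_id: WKT}."""
    root = ET.fromstring(osm_xml)
    joined = reduce_join(shuffle(map_phase(root)))
    return dict(reduce_wkt(shuffle(joined)))


if __name__ == "__main__":
    for way_id, wkt in extract_roads(SAMPLE_OSM).items():
        print(way_id, wkt)
```

The sketch uses two rounds because a way stores only node identifiers: the first round joins road memberships with node coordinates, and the second groups the joined points by way. In a real cluster each round would run as a distributed job over partitions of the input rather than over one in-memory document.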