ABSTRACT
This paper addresses two questions. First, how can we infer the real-time air quality of an arbitrary location, given environmental data and historical air quality data from a very sparse set of monitoring stations? Second, if a few new monitoring stations are to be established to improve inference quality, how should their locations be chosen? Both problems are challenging because for most locations in a city (>99%) there is no air quality data from which to train a model. We design a semi-supervised inference model that combines existing monitoring data with heterogeneous city dynamics, including meteorology, human mobility, the structure of road networks, and points of interest (POIs). We also propose an entropy-minimization model to suggest the best locations for new monitoring stations. We evaluate the proposed approach on Beijing air quality data, where it shows clear advantages over a series of state-of-the-art and commonly used methods.
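To make the two ideas in the abstract concrete, the following is a minimal sketch and not the paper's actual model. It assumes a generic graph-based semi-supervised technique (harmonic-function label propagation) in place of the inference model, and a simple stand-in for station recommendation that ranks unlabeled locations by the entropy of their inferred class distribution. The affinity matrix `W`, the function names, and the "highest entropy first" ranking rule are all illustrative assumptions; in practice `W` would be derived from features such as meteorology, mobility, road structure, and POIs.

```python
import numpy as np


def harmonic_inference(W, labeled_idx, labeled_probs):
    """Propagate class distributions from labeled to unlabeled locations.

    W             : (n, n) symmetric, non-negative affinity matrix (assumed given).
    labeled_idx   : indices of locations that have a monitoring station.
    labeled_probs : (n_labeled, k) class distributions at those locations.
    Returns (unlabeled_idx, probs), where probs has shape (n_unlabeled, k).
    """
    n = W.shape[0]
    labeled_idx = np.asarray(labeled_idx)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)
    laplacian = np.diag(W.sum(axis=1)) - W
    L_uu = laplacian[np.ix_(unlabeled_idx, unlabeled_idx)]
    W_ul = W[np.ix_(unlabeled_idx, labeled_idx)]
    # Harmonic solution: each unlabeled node equals the weighted mean of its neighbors.
    probs = np.linalg.solve(L_uu, W_ul @ labeled_probs)
    return unlabeled_idx, probs


def entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of each row of a distribution matrix."""
    p = np.clip(probs, eps, 1.0)
    return -(p * np.log(p)).sum(axis=1)


def rank_candidates(W, labeled_idx, labeled_probs):
    """Rank unlabeled locations from most to least uncertain (illustrative heuristic)."""
    unlabeled_idx, probs = harmonic_inference(W, labeled_idx, labeled_probs)
    h = entropy(probs)
    order = np.argsort(-h)
    return unlabeled_idx[order], h[order]


# Toy example: five locations on a line, stations at both ends with different classes.
W = np.zeros((5, 5))
for i in range(4):
    W[i, i + 1] = W[i + 1, i] = 1.0
stations = [0, 4]
station_probs = np.array([[1.0, 0.0], [0.0, 1.0]])
ranked, scores = rank_candidates(W, stations, station_probs)
```

In this toy graph the middle location is equally similar to both stations, so its inferred distribution is the most uncertain and it is ranked first. A real deployment would need a principled affinity construction and a selection rule that accounts for how a new station changes the uncertainty at other locations.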