Air Quality Prediction Based on Long Short-Term Memory (LSTM) and Clustering K-Means in Andahuaylas, Peru

  • Conference paper
Advances in Information and Communication (FICC 2021)

Abstract

Air pollution is a global problem that directly affects the health of living beings; the World Health Organization (WHO) estimates that about 7 million people die each year from exposure to polluted air. A prediction model for air pollutants is therefore an essential source of information for protecting health and life. Many methods and models exist for predicting air quality, but almost all of them focus on the world's large cities; models are lacking for cities that are considered underdeveloped and have high air pollution. To address this gap, the present project implemented an air quality prediction model for three air pollutants (PM2.5, NO2, and O3). The proposal combines a recurrent neural network architecture, LSTM, with feature augmentation through a K-means clustering process. The performance of the model was evaluated with the mean absolute error (MAE) and the root mean square error (RMSE) and compared with other machine learning algorithms (Linear Regression, K-Nearest Neighbors, Random Forest, Decision Tree, and plain LSTM). The proposed model (LSTM K-means) achieved lower errors than these algorithms: for particulate matter (PM2.5), an MAE of 1.5 and an RMSE of 2.39; for nitrogen dioxide (NO2), an MAE of 0.05 and an RMSE of 0.06; and for ozone (O3), an MAE of 7.5 and an RMSE of 9.81. These are the minimum values among the algorithms compared.
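
The ideas named in the abstract (adding a K-means cluster index as an extra feature, framing the series as windows for an LSTM, and scoring with MAE and RMSE) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the two-column toy readings, the choice of k = 2, and the window length are all assumptions made for illustration, the K-means here is a plain Lloyd's iteration written in pure Python rather than a library call, and the LSTM itself is omitted.

```python
import math
import random


def mae(y_true, y_pred):
    """Mean absolute error: average of |y - y_hat|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def kmeans_labels(rows, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns one cluster index per row."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(r, centers[c])))
            for r in rows
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def augment_with_cluster(rows, k):
    """Append the cluster index as an extra feature column (assumed scheme)."""
    labels = kmeans_labels(rows, k)
    return [list(r) + [float(lab)] for r, lab in zip(rows, labels)]


def sliding_windows(rows, target_col, steps):
    """Build (window, next value) pairs, the usual input framing for an LSTM."""
    X = [rows[i:i + steps] for i in range(len(rows) - steps)]
    y = [rows[i + steps][target_col] for i in range(len(rows) - steps)]
    return X, y


# Made-up readings: [PM2.5, NO2] per time step (illustrative values only).
readings = [[1.0, 0.1], [1.2, 0.1], [9.0, 0.9], [9.5, 1.0], [1.1, 0.1], [9.2, 0.9]]
augmented = augment_with_cluster(readings, k=2)
X, y = sliding_windows(augmented, target_col=0, steps=2)
```

Each element of `X` is a window of consecutive augmented rows and the matching element of `y` is the next PM2.5 value; a recurrent model would be trained on these pairs, and its predictions scored with `mae` and `rmse` against held-out targets.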

Author information

Corresponding authors

Correspondence to Herwin Alayn Huillcen Baca, Flor de Luz Palomino Valdivia or Melvin Edward Huillcen Baca.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Huillcen Baca, H.A., Valdivia, F.d.L.P., Ibarra, M.J., Cruz, M.A., Baca, M.E.H. (2021). Air Quality Prediction Based on Long Short-Term Memory (LSTM) and Clustering K-Means in Andahuaylas, Peru. In: Arai, K. (eds) Advances in Information and Communication. FICC 2021. Advances in Intelligent Systems and Computing, vol 1364. Springer, Cham. https://doi.org/10.1007/978-3-030-73103-8_11
