A Hybrid Missing Data Imputation Method for Constructing City Mobility Indices

  • Conference paper
  • Data Mining (AusDM 2018)

Abstract

An effective missing data imputation method is essential for data mining and knowledge discovery from a comprehensive database with missing values. This paper proposes a new hybrid imputation method to effectively deal with the missing data issue of the Mobility in Cities Database (MCD) to construct city mobility indices. The hybrid method integrates the advantages of decision trees and fuzzy clustering into an iterative algorithm for missing data imputation. Extensive experiments conducted on the MCD and three commonly used datasets demonstrate that the hybrid method outperforms other existing effective imputation methods. With the MCD’s missing values imputed by the hybrid method, and using factor analysis and principal component analysis, this paper constructs city mobility indices for 63 cities in the MCD based on the novel concept of city mobility supply and demand. The city mobility indices constructed under a hierarchical structure of mobility supply and demand indicators represent substantial city mobility knowledge discovered from mining the MCD. The proposed hybrid method represents a significant contribution to missing data imputation research.
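
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one ingredient it names: iterative imputation driven by fuzzy clustering. It is not the authors' hybrid method and it omits the decision-tree component entirely. The function names (`column_means`, `memberships`, `fuzzy_impute`), the fuzzifier `m = 2.0`, the fixed iteration count, and the deterministic choice of starting centres are all assumptions made for illustration. Missing entries are marked with `None`, initialised to column means, and then repeatedly replaced by the membership-weighted average of the fuzzy c-means cluster centres.

```python
from math import dist


def column_means(rows):
    """Mean of the observed (non-None) values in each column."""
    means = []
    for col in zip(*rows):
        observed = [v for v in col if v is not None]
        means.append(sum(observed) / len(observed))
    return means


def memberships(point, centres, m):
    """Fuzzy c-means membership of one point in every cluster."""
    d = [dist(point, c) for c in centres]
    if 0.0 in d:
        # The point coincides with one or more centres: share membership there.
        hits = [1.0 if x == 0.0 else 0.0 for x in d]
        return [h / sum(hits) for h in hits]
    inv = [x ** (-2.0 / (m - 1.0)) for x in d]
    total = sum(inv)
    return [v / total for v in inv]


def fuzzy_impute(rows, n_clusters=2, m=2.0, n_iter=30):
    """Return a copy of rows with None entries imputed (illustrative only)."""
    means = column_means(rows)
    n_cols = len(means)
    missing = [(i, j) for i, r in enumerate(rows)
               for j, v in enumerate(r) if v is None]
    # Step 1: start from mean imputation.
    filled = [[means[j] if v is None else float(v) for j, v in enumerate(r)]
              for r in rows]
    # Assumption: evenly spaced records serve as deterministic initial centres.
    step = max(1, (len(filled) - 1) // max(1, n_clusters - 1))
    centres = [list(filled[min(k * step, len(filled) - 1)])
               for k in range(n_clusters)]
    for _ in range(n_iter):
        # Step 2: standard fuzzy c-means centre update on the filled data.
        u = [memberships(p, centres, m) for p in filled]
        for k in range(n_clusters):
            w = [u_i[k] ** m for u_i in u]
            total = sum(w)
            if total > 0:
                centres[k] = [sum(wi * p[j] for wi, p in zip(w, filled)) / total
                              for j in range(n_cols)]
        # Step 3: re-impute each missing entry from the updated centres.
        for i, j in missing:
            u_i = memberships(filled[i], centres, m)
            filled[i][j] = sum(uk * c[j] for uk, c in zip(u_i, centres))
    return filled


if __name__ == "__main__":
    data = [[1.0, 1.1], [0.9, 1.0], [1.1, None],
            [5.0, 5.1], [5.1, 4.9], [None, 5.0]]
    for row in fuzzy_impute(data):
        print([round(v, 2) for v in row])
```

On a toy dataset with two well-separated groups, each imputed value drifts from the column mean toward the centre of the group its record resembles. A hybrid scheme of the kind the abstract describes would presumably use decision trees to choose which records are clustered together, but that step is not sketched here.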

Corresponding author

Correspondence to Sanaz Nikfalazar.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Nikfalazar, S., Yeh, C.-H., Bedingfield, S., Khorshidi, H.A. (2019). A Hybrid Missing Data Imputation Method for Constructing City Mobility Indices. In: Islam, R., et al. Data Mining. AusDM 2018. Communications in Computer and Information Science, vol 996. Springer, Singapore. https://doi.org/10.1007/978-981-13-6661-1_11

  • DOI: https://doi.org/10.1007/978-981-13-6661-1_11

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-6660-4

  • Online ISBN: 978-981-13-6661-1

  • eBook Packages: Computer Science, Computer Science (R0)
