Exponential distance-based fuzzy clustering for interval-valued data

Fuzzy Optimization and Decision Making

Abstract

In many real-life and research situations, data are collected in the form of intervals, the so-called interval-valued data. This paper presents a fuzzy clustering method for analysing interval-valued data, with particular attention to interval-valued data corrupted by outliers and noise. To cope with the presence of outliers, we propose to employ a robust metric based on the exponential distance within the Fuzzy C-medoids clustering framework, which yields the Fuzzy C-medoids clustering model for interval-valued data with exponential distance. The exponential distance assigns small weights to outliers and larger weights to points that are more compactly located in the data set, thus neutralizing the effect of anomalous interval-valued data. Simulation results on the behaviour of the proposed approach, together with two empirical applications, illustrate the practical usefulness of the method.
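The abstract does not state the formulas, so the following is only a minimal sketch of the idea under explicit assumptions, not the authors' model. It assumes that each interval is compared through its center and radius, that the exponential distance takes the common form 1 - exp(-beta * d^2), and that memberships and medoids follow the standard fuzzy C-medoids updates. The names `interval_sq_dist`, `exp_dist`, `exp_weight`, `memberships`, `update_medoids`, and the values of `beta` and `m` are hypothetical choices made for illustration.

```python
import math

def interval_sq_dist(a, b):
    """Squared distance between two interval-valued observations.

    Each observation is a list of (lower, upper) pairs, one per variable.
    Assumption: intervals are compared through centers and radii.
    """
    total = 0.0
    for (al, au), (bl, bu) in zip(a, b):
        dc = (al + au) / 2 - (bl + bu) / 2  # difference of centers
        dr = (au - al) / 2 - (bu - bl) / 2  # difference of radii
        total += dc * dc + dr * dr
    return total

def exp_dist(a, b, beta):
    """Assumed exponential form: bounded by 1, so far-away points saturate."""
    return 1.0 - math.exp(-beta * interval_sq_dist(a, b))

def exp_weight(a, b, beta):
    """Implied weight exp(-beta * d^2): near 1 for close points, near 0 for outliers."""
    return math.exp(-beta * interval_sq_dist(a, b))

def memberships(data, medoids, beta, m=2.0):
    """Standard fuzzy membership update, here fed with the exponential distance."""
    eps = 1e-12  # avoids division by zero when a point is itself a medoid
    rows = []
    for x in data:
        inv = [(1.0 / (exp_dist(x, med, beta) + eps)) ** (1.0 / (m - 1.0))
               for med in medoids]
        s = sum(inv)
        rows.append([v / s for v in inv])
    return rows

def update_medoids(data, U, beta, m=2.0):
    """For each cluster, pick the observation minimising the weighted distance sum."""
    new_medoids = []
    for k in range(len(U[0])):
        costs = [sum((U[i][k] ** m) * exp_dist(data[i], cand, beta)
                     for i in range(len(data)))
                 for cand in data]
        new_medoids.append(data[costs.index(min(costs))])
    return new_medoids

# Toy data: two compact groups plus one anomalous observation (all values made up).
data = [
    [(0.0, 1.0), (0.0, 1.0)],
    [(0.2, 1.2), (0.1, 1.1)],
    [(0.1, 0.9), (0.2, 1.0)],
    [(5.0, 6.0), (5.0, 6.0)],
    [(5.2, 6.2), (5.1, 6.1)],
    [(4.9, 6.1), (5.0, 5.8)],
    [(50.0, 60.0), (50.0, 60.0)],  # outlier
]
beta = 0.5
medoids = [data[0], data[3]]
for _ in range(5):  # a few alternating updates are enough for this toy example
    U = memberships(data, medoids, beta)
    medoids = update_medoids(data, U, beta)

w_inlier = exp_weight(data[1], medoids[0], beta)
w_outlier = exp_weight(data[6], medoids[0], beta)
print("medoids:", medoids)
print("inlier weight:", w_inlier, "outlier weight:", w_outlier)
```

Because the assumed distance is bounded by 1, the anomalous observation contributes at most a fixed amount to the objective, however far away it lies, which is one way to read the abstract's claim that outliers receive small weights. Restricting prototypes to observed units (medoids) also keeps each prototype an actual interval-valued observation.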

Notes

  1. http://www.brace.sinanet.apat.it/web/struttura.html. Data retrieved on 2015-05-03.

  2. http://www1.toronto.ca/City%20Of%20Toronto/Information%20&%20Technology/Open%20Data/Data%20Sets/Assets/Files/E-Bike_Survey_Responses.xls. Data retrieved on 2015-05-03.

Acknowledgments

The authors thank the Editors and the referees for their useful comments and suggestions which helped to improve the quality and presentation of this manuscript.

Author information

Corresponding author

Correspondence to Pierpaolo D’Urso.

About this article

Cite this article

D’Urso, P., Massari, R., De Giovanni, L. et al. Exponential distance-based fuzzy clustering for interval-valued data. Fuzzy Optim Decis Making 16, 51–70 (2017). https://doi.org/10.1007/s10700-016-9238-8

  • DOI: https://doi.org/10.1007/s10700-016-9238-8
