Multi-objective semi-supervised clustering to identify health service patterns for injured patients

  • Research
  • Health Information Science and Systems

Abstract

Purpose

This study develops a pattern recognition method that identifies patterns based on their similarity and their association with the outcome of interest. The practical purpose of the method is to group patients injured in transport accidents in the early stages post-injury. The grouping is based on distinctive patterns in health service use within the first week post-injury. The groups also provide predictive information about the total cost of the medication process. As a result, the groups of patients with undesirable outcomes are identified as early as possible based on health service use patterns.

Methods

We propose a multi-objective optimization model to group patients. One objective function is the cost function of k-medians clustering, which captures the similarity of patterns. The other is the cross-validated root-mean-square error, which measures the association with the total cost. The best grouping is obtained by minimizing both objective functions. The resulting model is therefore a semi-supervised clustering method that learns health service use patterns in both unsupervised and supervised ways. We also introduce an evolutionary computation approach that uses stochastic gradient descent and Pareto optimal solutions to find the optimal solution. In addition, we use the decision tree method to reproduce the optimal groups with an interpretable classification model.
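
The two objectives and the Pareto filter described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the toy data, and the use of training-fold group means as the cost predictor are all assumptions, and the evolutionary search with stochastic gradient descent is not reproduced. The sketch only scores a few hand-written candidate groupings and keeps the non-dominated ones.

```python
import random
import statistics


def kmedians_cost(points, labels):
    """Sum of L1 distances from each point to its cluster's coordinate-wise median."""
    clusters = {}
    for p, g in zip(points, labels):
        clusters.setdefault(g, []).append(p)
    cost = 0.0
    for members in clusters.values():
        median = [statistics.median(col) for col in zip(*members)]
        cost += sum(abs(x - m) for p in members for x, m in zip(p, median))
    return cost


def cv_rmse(labels, outcome, folds=5, seed=0):
    """Cross-validated RMSE where each held-out outcome is predicted by the
    training-fold mean of its group (an assumed, deliberately simple predictor)."""
    idx = list(range(len(labels)))
    random.Random(seed).shuffle(idx)
    sq_err = 0.0
    for f in range(folds):
        test = idx[f::folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        overall = statistics.fmean(outcome[i] for i in train)
        sums, counts = {}, {}
        for i in train:
            sums[labels[i]] = sums.get(labels[i], 0.0) + outcome[i]
            counts[labels[i]] = counts.get(labels[i], 0) + 1
        for i in test:
            g = labels[i]
            # Fall back to the overall training mean if the group is unseen.
            pred = sums[g] / counts[g] if g in counts else overall
            sq_err += (outcome[i] - pred) ** 2
    return (sq_err / len(idx)) ** 0.5


def pareto_front(scores):
    """Indices of candidates not dominated when every objective is minimized."""
    def dominates(a, b):
        return all(x <= y for x, y in zip(a, b)) and a != b
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s) for t in scores)]


# Hypothetical toy data: two service-use features per patient and a total cost.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
outcome = [1, 1, 1, 9, 9, 9]
candidates = [
    [0, 0, 0, 1, 1, 1],  # groups that follow the two visible patterns
    [0, 1, 0, 1, 0, 1],  # groups that mix the patterns
    [0, 0, 0, 0, 0, 0],  # a single group
]
scores = [(kmedians_cost(points, c), cv_rmse(c, outcome, folds=3))
          for c in candidates]
front = pareto_front(scores)
```

On this toy data the first candidate is better on both objectives, so it is the only member of `front`. With real data the two objectives usually conflict, and the front would contain several groupings to choose from.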

Results

The results show that the proposed multi-objective semi-supervised clustering identifies distinct groups of health service use and contributes to predicting the total cost. The performance of the multi-objective model has been examined using two metrics: the average silhouette width and the cross-validation error. The examination shows that the multi-objective model outperforms the single-objective ones. In addition, the interpretable classification model shows that imaging and therapeutic services are the critical services for grouping injured patients in the first week post-injury.
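
As a rough illustration of the first metric, the average silhouette width can be computed as below. This is a sketch under stated assumptions rather than the evaluation code used in the study: the function name and toy data are hypothetical, and Manhattan distance is chosen here only because it pairs naturally with k-medians.

```python
def average_silhouette_width(points, labels):
    """Mean silhouette value over all points, using Manhattan distance."""
    def dist(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))

    groups = {}
    for i, g in enumerate(labels):
        groups.setdefault(g, []).append(i)
    if len(groups) < 2:
        raise ValueError("silhouette width needs at least two clusters")

    total = 0.0
    for i, g in enumerate(labels):
        own = [j for j in groups[g] if j != i]
        if not own:
            continue  # singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for h, members in groups.items() if h != g
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical toy data: two well-separated patterns.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
separated = average_silhouette_width(points, [0, 0, 0, 1, 1, 1])
mixed = average_silhouette_width(points, [0, 1, 0, 1, 0, 1])
```

Values close to 1 indicate compact, well-separated groups, so `separated` is much higher than `mixed` here.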

Conclusion

The proposed multi-objective semi-supervised clustering finds optimal clusters that are not only well separated from each other but also provide informative insights regarding the outcome of interest. It also overcomes two drawbacks of clustering methods: sensitivity to the initial cluster centers and the need to specify the number of clusters.

Acknowledgements

This project was funded by the Transport Accident Commission (TAC) through the Institute for Safety, Compensation and Recovery Research (ISCRR).

Author information

Corresponding author

Correspondence to Hadi Akbarzadeh Khorshidi.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Informed consent

Statement not required. This study was performed using a de-identified administrative dataset, with ethics approval granted by Monash University Human Research Ethics Committee (CF09/3150—2009001727).

Cite this article

Akbarzadeh Khorshidi, H., Aickelin, U., Haffari, G. et al. Multi-objective semi-supervised clustering to identify health service patterns for injured patients. Health Inf Sci Syst 7, 18 (2019). https://doi.org/10.1007/s13755-019-0080-6

  • DOI: https://doi.org/10.1007/s13755-019-0080-6
