Abstract
To prevent accidents at workplaces, accident data must be analyzed properly. However, such high-dimensional data are often difficult to analyze for efficient decision making because of slow convergence and the local minima problem. To address these issues, the present study proposes a new clustering algorithm, growing self-organizing map (GSOM)-based genetic K-means (GSGKM), for classifying accident data into an optimal number of clusters. The tolerance rough set approach (TRSA) is then applied to each cluster to extract useful accident patterns, which helps in accident analysis and prevention. To validate the effectiveness of the proposed methodology, accident data obtained from an integrated steel plant are used as a case study. In addition, four benchmark datasets from the University of California, Irvine (UCI) machine learning repository are used in a comparative study to demonstrate the superiority of GSGKM over some other state-of-the-art algorithms. Experimental results reveal that the proposed methodology provides the highest clustering accuracy among the compared methods. A total of four clusters are obtained from the analysis. A set of 16 crisp accident patterns, or rules, is extracted from the clusters using TRSA. Company employees are found to be more exposed to accidents than contractors. Additionally, behavioral issues are identified as the most decisive factor behind injuries at work. The proposed methodology can be effectively used in decision making for different industries, including construction, manufacturing, and aviation.
Change history
29 July 2022
A Correction to this paper has been published: https://doi.org/10.1007/s00521-022-07544-3
Acknowledgements
We gratefully acknowledge the Centre of Excellence in Safety Engineering and Analytics (CoE-SEA) (https://www.iitkgp.ac.in/department/SE), IIT Kharagpur, and the Safety Analytics & Virtual Reality (SAVR) Laboratory (https://www.savr.iitkgp.ac.in) of the Department of Industrial & Systems Engineering, IIT Kharagpur, for providing the experimental, computational, and research facilities for this work. We would like to thank the management of the plant for providing relevant data and for their support and cooperation during the study.
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose. The authors have no conflicts of interest to declare that are relevant to the content of this article. There are no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript. The authors have no financial or proprietary interests in any material discussed in this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sarkar, S., Ejaz, N., Maiti, J. et al. An integrated approach using growing self-organizing map-based genetic K-means clustering and tolerance rough set in occupational risk analysis. Neural Comput & Applic 34, 9661–9687 (2022). https://doi.org/10.1007/s00521-022-06956-5
DOI: https://doi.org/10.1007/s00521-022-06956-5