Abstract
Over time, process model repositories tend to accumulate duplicate fragments (also called clones) as new process models are created or extended by copying and merging fragments from other models. This phenomenon calls for methods to detect clones in process models, so that these clones can be refactored as separate subprocesses in order to improve maintainability. This paper presents an indexing structure to support the fast detection of clones in large process model repositories. The proposed index is based on a novel combination of a method for process model decomposition (specifically the Refined Process Structure Tree), with established graph canonization and string matching techniques. Experiments show that the algorithm scales to repositories with hundreds of models. The experimental results also show that a significant number of non-trivial clones can be found in process model repositories taken from industrial practice.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Babai, L.: Monte carlo algorithms in graph isomorphism testing. Technical Report D.M.S. No. 79-10, Universite de Montreal (1979)
Baxter, I.D., Yahin, A., Moura, L., Sant’Anna, M., Bier, L.: Clone Detection Using Abstract Syntax Trees. In: The Int. Conf. on Software Maintenance, pp. 368–377 (1998)
Bellon, S., Koschke, R., Antoniol, G., Krinke, J., Merlo, E.: Comparison and Evaluation of Clone Detection Tools. IEEE Trans. on Software Engineering 33(9), 577–591 (2007)
Deissenboeck, F., Hummel, B., Jürgens, E., Schätz, B., Wagner, S., Girard, J.-F., Teuchert, S.: Clone Detection in Automotive Model-based Development. In: ICSE (2008)
Fahland, D., Favre, C., Jobstmann, B., Koehler, J., Lohmann, N., Völzer, H., Wolf, K.: Instantaneous soundness checking of industrial business process models. In: Dayal, U., Eder, J., Koehler, J., Reijers, H.A. (eds.) BPM 2009. LNCS, vol. 5701, pp. 278–293. Springer, Heidelberg (2009)
Gusfield, D.: Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press, Cambridge (1997)
He, H., Singh, A.K.: Closure-Tree: An Index Structure for Graph Queries. In: The 22nd Int. Conf. on Data Engineering. IEEE Computer Society, Los Alamitos (2006)
Jin, T., Wang, J., Wu, N., La Rosa, M., ter Hofstede, A.H.M.: Efficient and Accurate Retrieval of Business Process Models through Indexing. In: Meersman, R., Dillon, T.S., Herrero, P. (eds.) OTM 2010. LNCS, vol. 6426, pp. 402–409. Springer, Heidelberg (2010)
Keller, G., Teufel, T.: SAP R/3 Process Oriented Implementation: Iterative Process Prototyping. Addison-Wesley, Reading (1998)
Koschk, R.: Identifying and Removing Software Clones. In: Mens, T., Demeyer, S. (eds.) Software Evolution. Springer, Heidelberg (2008)
Krinke, J.: Identifying Similar Code with Program Dependence Graphs. In: WCRE (2001)
Pham, N.H., Nguyen, H.A., Nguyen, T.T., Al-Kofahi, J.M., Nguyen, T.N.: Complete and Accurate Clone Detection in Graph-based Models. In: ICSE, pp. 276–286. IEEE, Los Alamitos (2009)
Polyvyanyy, A., Vanhatalo, J., Völzer, H.: Simplified Computation and Generalization of the Refined Process Structure Tree. In: WSFM (2010)
Reijers, H.A., Mans, R.S., van der Toorn, R.A.: Improved Model Management with Aggregated Business Process Models. Data Knowl. Eng. 68(2), 221–243 (2009)
Rosemann, M.: Potential pitfalls of process modeling: Part a. Business Process Management Journal 12(2), 249–254 (2006)
Shasha, D., Wang, J.T.-L., Giugno, R.: Algorithmics and Applications of Tree and Graph Searching. In: PODS, pp. 39–52 (2002)
Ukkonen, E.: On-line construction of suffix trees. Algorithmica 14(3), 249–260 (1995)
Vanhatalo, J., Völzer, H., Koehler, J.: The Refined Process Structure Tree. Data Knowl. Eng. 68(9), 793–818 (2009)
Weber, B., Reichert, M., Mendling, J., Reijers, H.A.: Refactoring large process model repositories. Computers in Industry 62(5), 467–486 (2011)
Williams, D.W., Huan, J., Wang, W.: Graph database indexing using structured graph decomposition. In: ICDE, pp. 976–985 (2007)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Uba, R., Dumas, M., García-Bañuelos, L., La Rosa, M. (2011). Clone Detection in Repositories of Business Process Models. In: Rinderle-Ma, S., Toumani, F., Wolf, K. (eds) Business Process Management. BPM 2011. Lecture Notes in Computer Science, vol 6896. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23059-2_20
Download citation
DOI: https://doi.org/10.1007/978-3-642-23059-2_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23058-5
Online ISBN: 978-3-642-23059-2
eBook Packages: Computer ScienceComputer Science (R0)