Complex Symbolic Sequence Clustering and Multiple Classifiers for Predictive Process Monitoring

  • Conference paper
Business Process Management Workshops (BPM 2016)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 256)

Abstract

This paper addresses the following predictive business process monitoring problem: given the execution trace of an ongoing case and a set of traces of historical (completed) cases, predict the most likely outcome of the ongoing case. In this context, a trace is a sequence of events with corresponding payloads, where a payload consists of a set of attribute-value pairs. An outcome is a label associated with a completed case, for example a label indicating that the case completed “on time” (with respect to a given desired duration) or “late”, or a label indicating whether the case led to a customer complaint. The paper tackles this problem via a two-phase approach. In the first phase, prefixes of historical cases are encoded using complex symbolic sequences and clustered. In the second phase, a classifier is built for each cluster. To predict the outcome of an ongoing case at runtime given its (incomplete) trace, we select the cluster(s) closest to the trace in question and apply the respective classifier(s), taking into account the Euclidean distance of the trace from the centers of the clusters. We consider two families of clustering algorithms – hierarchical clustering and k-medoids – and use random forests for classification. The approach was evaluated on four real-life datasets.
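
The two-phase idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: prefixes are taken to be already encoded as numeric feature vectors (the complex symbolic sequence encoding itself is not reproduced), a naive k-medoids routine stands in for the clustering step, and a trivial majority-label rule stands in for the per-cluster random forest. The names `k_medoids`, `train`, `predict`, and `n_closest`, as well as the inverse-distance weighting of votes, are hypothetical choices made for illustration.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_medoids(points, k, max_iter=20):
    """Naive k-medoids; returns {medoid_index: [member_indices]}."""
    medoids = list(range(k))  # assumption: deterministic init from the first k points
    clusters = {}
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: euclidean(p, points[m]))
            clusters[nearest].append(i)
        # Update step: the new medoid minimizes total distance to its cluster.
        new_medoids = [
            min(members, key=lambda c: sum(euclidean(points[c], points[j]) for j in members))
            if members else m
            for m, members in clusters.items()
        ]
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return clusters

def train(points, labels, k):
    """Phase 1: cluster encoded prefixes. Phase 2: one 'classifier' per cluster."""
    model = []
    for m, members in k_medoids(points, k).items():
        if not members:
            continue
        # Stand-in classifier: the cluster's majority label (not a random forest).
        majority = Counter(labels[i] for i in members).most_common(1)[0][0]
        model.append((points[m], majority))
    return model

def predict(model, x, n_closest=1, eps=1e-9):
    """Vote among the n closest clusters, weighted by inverse Euclidean distance."""
    ranked = sorted(model, key=lambda entry: euclidean(x, entry[0]))[:n_closest]
    votes = Counter()
    for center, label in ranked:
        votes[label] += 1.0 / (euclidean(x, center) + eps)
    return votes.most_common(1)[0][0]

# Toy data: six already-encoded prefixes with their known outcomes.
prefixes = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
outcomes = ["on time", "on time", "late", "late", "late", "on time"]
model = train(prefixes, outcomes, k=2)
print(predict(model, (9, 9)))
print(predict(model, (0.5, 0.5), n_closest=2))
```

In a fuller version, the majority-label rule would be replaced by a classifier trained on the members of each cluster, and the prediction step would combine the classifiers' outputs rather than fixed labels.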


Notes

  1. Except for the “No clustering” approach.


Author information

Corresponding author

Correspondence to Ilya Verenich.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Verenich, I., Dumas, M., La Rosa, M., Maggi, F.M., Di Francescomarino, C. (2016). Complex Symbolic Sequence Clustering and Multiple Classifiers for Predictive Process Monitoring. In: Reichert, M., Reijers, H. (eds) Business Process Management Workshops. BPM 2016. Lecture Notes in Business Information Processing, vol 256. Springer, Cham. https://doi.org/10.1007/978-3-319-42887-1_18

  • DOI: https://doi.org/10.1007/978-3-319-42887-1_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-42886-4

  • Online ISBN: 978-3-319-42887-1

  • eBook Packages: Computer Science (R0)
