
FATOC: Bug Isolation Based Multi-Fault Localization by Using OPTICS Clustering

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Bug isolation is a popular approach to multi-fault localization (MFL): all failed test cases are clustered into several groups, and the failed test cases in each group, combined with all passed test cases, are used to localize a single fault. However, existing clustering algorithms cannot always produce completely correct clustering results, which is a potential threat to bug isolation based MFL approaches. To address this issue, we first analyze the influence of clustering accuracy on the performance of MFL; the results of a controlled study indicate that using the clustering algorithm with the highest accuracy achieves the best MFL performance. Moreover, previous studies on clustering algorithms show that the elements in a higher-density cluster have a higher similarity. Motivated by these observations, we propose a novel approach, FATOC (One-Fault-at-a-Time via OPTICS Clustering). In particular, FATOC first leverages the OPTICS (Ordering Points to Identify the Clustering Structure) clustering algorithm to group failed test cases, and then identifies the cluster with the highest density. OPTICS is a density-based clustering algorithm, which can reduce misgrouping and calculate a density value for each cluster. This density value helps find the cluster with the highest clustering effectiveness. FATOC then combines the failed test cases in this cluster with all passed test cases to localize a single fault through a traditional spectrum-based fault localization (SBFL) formula. After this fault is localized and fixed, FATOC uses the same method to localize the next single fault, until all test cases pass. Our evaluation results show that FATOC significantly outperforms the traditional SBFL technique and a state-of-the-art MFL approach, MSeer, on 804 multi-fault versions of nine real-world programs. Specifically, FATOC's performance is 10.32% higher than that of traditional SBFL when using the Ochiai formula in terms of the A-EXAM metric. In addition, the results indicate that, when checking 1%, 3% and 5% of the statements of all subject programs, FATOC can locate 36.91%, 48.50% and 66.93% of all faults, respectively, which is also better than traditional SBFL and MSeer.
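
The single-fault step described in the abstract (one cluster of failed test cases plus all passed test cases, ranked with an SBFL formula) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `ochiai_scores` and `rank_with_cluster`, the toy coverage matrix, and the cluster memberships are all hypothetical, and the clusters are given by hand here rather than produced by OPTICS. Only the standard Ochiai formula, ef / sqrt((ef + nf) * (ef + ep)), is assumed.

```python
import math

def ochiai_scores(coverage, failed):
    """Ochiai suspiciousness per statement.

    coverage[t][s] is 1 if test t executed statement s, else 0.
    failed[t] is True if test t failed, False if it passed.
    """
    total_failed = sum(failed)  # ef + nf, identical for every statement
    scores = []
    for s in range(len(coverage[0])):
        ef = sum(1 for row, f in zip(coverage, failed) if f and row[s])
        ep = sum(1 for row, f in zip(coverage, failed) if not f and row[s])
        denom = math.sqrt(total_failed * (ef + ep))
        scores.append(ef / denom if denom else 0.0)
    return scores

def rank_with_cluster(coverage, failed, cluster):
    """Rank statements using the failed tests in `cluster` plus all passed tests.

    `cluster` is a set of failed-test indices (hypothetical input; in FATOC
    it would be the highest-density cluster found by OPTICS).
    """
    keep = [t for t in range(len(coverage)) if not failed[t] or t in cluster]
    sub_cov = [coverage[t] for t in keep]
    sub_failed = [failed[t] for t in keep]
    scores = ochiai_scores(sub_cov, sub_failed)
    # Most suspicious statement first; ties keep statement order.
    return sorted(range(len(scores)), key=lambda s: -scores[s])

# Toy data (assumed): 4 statements, 3 failed tests, 2 passed tests.
coverage = [
    [1, 1, 0, 0],  # t0, failed
    [1, 1, 0, 0],  # t1, failed
    [1, 0, 1, 1],  # t2, failed
    [1, 0, 1, 0],  # t3, passed
    [1, 0, 0, 0],  # t4, passed
]
failed = [True, True, True, False, False]

print(rank_with_cluster(coverage, failed, {0, 1}))  # [1, 0, 2, 3]
print(rank_with_cluster(coverage, failed, {2}))     # [3, 2, 0, 1]
```

In this toy example, each cluster of failed tests points to a different top-ranked statement, which is the intuition behind isolating one fault at a time.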


Author information

Corresponding author

Correspondence to Yong Liu.

About this article

Cite this article

Wu, YH., Li, Z., Liu, Y. et al. FATOC: Bug Isolation Based Multi-Fault Localization by Using OPTICS Clustering. J. Comput. Sci. Technol. 35, 979–998 (2020). https://doi.org/10.1007/s11390-020-0549-4

  • DOI: https://doi.org/10.1007/s11390-020-0549-4
