
Fault density, fault types, and spectra-based fault localization

Empirical Software Engineering

Abstract

This paper presents multiple empirical experiments that investigate the impact of fault quantity and fault type on statistical, coverage-based fault-localization techniques and on fault-localization interference. Fault-localization interference is a phenomenon, revealed in earlier studies of coverage-based fault localization, in which faults obstruct, or interfere with, the ability of other faults to be localized. Previously, it had been asserted that a fault-localization technique’s effectiveness was negatively correlated with the quantity of faults in the program. To investigate this belief, we conducted an experiment on six programs comprising more than 72,000 multiple-fault versions. Our data suggest that the presence of multiple faults exerts a significant but slight influence on fault-localization effectiveness. In addition, we categorized faults according to four existing fault taxonomies and found no correlation between fault type and fault-localization interference. In general, even in the presence of many faults, at least one fault was found by fault localization with similar effectiveness. Additionally, our data show that fault-localization interference is prevalent and exerts a meaningful influence that may cause a fault’s localizability to vary greatly. Because almost all real-world software contains multiple faults, these results affect the practical use and understanding of statistical fault-localization techniques.
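As a minimal illustration of the kind of statistical, coverage-based technique the abstract refers to, the sketch below computes Tarantula-style suspiciousness scores (Jones and Harrold 2005, listed in the references) from a small coverage matrix. The function name, the matrix layout, and the toy data are assumptions made for this example and are not taken from the paper; the study itself may use other techniques or conventions.

```python
from typing import List, Sequence


def tarantula_scores(coverage: Sequence[Sequence[int]],
                     passed: Sequence[bool]) -> List[float]:
    """Return one suspiciousness score per statement.

    coverage[t][s] is 1 if test t executed statement s, else 0.
    passed[t] is True if test t passed (hypothetical input layout).
    """
    total_pass = sum(1 for ok in passed if ok)
    total_fail = len(passed) - total_pass
    scores = []
    for s in range(len(coverage[0])):
        # Count passing and failing tests that executed statement s.
        p = sum(1 for t, row in enumerate(coverage) if row[s] and passed[t])
        f = sum(1 for t, row in enumerate(coverage) if row[s] and not passed[t])
        pass_ratio = p / total_pass if total_pass else 0.0
        fail_ratio = f / total_fail if total_fail else 0.0
        denom = pass_ratio + fail_ratio
        # Statements never executed get a score of 0.0 by convention here.
        scores.append(fail_ratio / denom if denom else 0.0)
    return scores


# Toy spectrum: 4 tests (rows) over 4 statements (columns).
coverage = [
    [1, 1, 0, 1],  # test 0, passes
    [1, 0, 1, 1],  # test 1, fails
    [1, 1, 0, 0],  # test 2, passes
    [1, 0, 1, 0],  # test 3, fails
]
passed = [True, False, True, False]
print(tarantula_scores(coverage, passed))  # [0.5, 0.0, 1.0, 0.5]
```

In the toy data, statement 2 is executed only by failing tests and receives the highest score, while statement 3 is executed by one passing and one failing test and scores 0.5. When a program contains more than one fault, mixed evidence of this kind is one plausible way for a faulty statement's score to shift.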


References

  • Abreu R, Zoeteweij P, Van Gemund AJ (2007) On the accuracy of spectrum-based fault localization. In: Testing: academic and industrial conference practice and research techniques-MUTATION, 2007. TAICPART-MUTATION 2007, pp 89–98. IEEE

  • Ali S, Andrews JH, Dhandapani T, Wang W (2009) Evaluating the accuracy of fault localization techniques. In: Proceedings of the 2009 IEEE/ACM international conference on automated software engineering, pp 76–87. IEEE Computer Society

  • Arumuga Nainar P, Liblit B (2010) Adaptive bug isolation. In: Proceedings of the 32nd ACM/IEEE international conference on software engineering, vol 1, pp 255–264. ACM

  • Clark S, Cobb J, Kapfhammer GM, Jones JA, Harrold MJ (2011) Localizing SQL faults in database applications. In: Proceedings of the 26th IEEE/ACM international conference on automated software engineering (ASE), pp 213–222

  • Cleve H, Zeller A (2005) Locating causes of program failures. In: Proceedings of the 27th international conference on software engineering, pp 342–351. ACM

  • Debroy V, Wong WE (2009) Insights on fault interference for programs with multiple bugs. In: 20th international symposium on software reliability engineering, 2009. ISSRE’09, pp 165–174. IEEE

  • Denmat T, Ducassé M, Ridoux O (2005) Data mining and cross-checking of execution traces: a re-interpretation of Jones, Harrold and Stasko test information. Tech. rep.

  • Dickinson W, Leon D, Podgurski A (2001) Finding failures by cluster analysis of execution profiles. In: Proceedings of the international conference on software engineering. http://portal.acm.org/citation.cfm?id=.

  • Dickinson W, Leon D, Podgurski A (2001) Pursuing failure: the distribution of program failures in a profile space. In: Proceedings of the international symposium on foundations of software engineering. http://doi.acm.org/10.1145/503209.503243

  • DiGiuseppe N, Jones JA (2011a) Fault interaction and its repercussions. In: 2011 27th IEEE international conference on software maintenance (ICSM), pp 3–12. IEEE

  • DiGiuseppe N, Jones JA (2011b) On the influence of multiple faults on coverage-based fault localization. In: Proceedings of the 2011 international symposium on software testing and analysis, pp 210–220. ACM

  • DiGiuseppe N, Jones JA (2012a) Concept-based failure clustering. In: Proceedings of the ACM SIGSOFT 20th international symposium on the foundations of software engineering, p 29. ACM

  • DiGiuseppe N, Jones JA (2012b) Software behavior and failure clustering: an empirical study of fault causality. In: 2012 IEEE fifth international conference on software testing, verification and validation (ICST), pp 191–200. IEEE

  • Do H, Elbaum S, Rothermel G (2005) Supporting controlled experimentation with testing techniques: an infrastructure and its potential impact. Empir Softw Eng 10(4):405–435


  • Hayes JH (1994) Testing of object-oriented programming systems (OOPS): a fault-based approach. In: Object-oriented methodologies and systems, pp 205–220. Springer

  • Hayes JH, Chemannoor IR, Holbrook EA (2011) Improved code defect detection with fault links. Softw Test Verif Reliab 21(4):299–325


  • Jones JA (2008) Semi-automatic fault localization. Georgia Institute of Technology, Ph.D. thesis


  • Jones JA, Bowring JF, Harrold MJ (2007) Debugging in parallel. In: Proceedings of the 2007 international symposium on software testing and analysis, pp 16–26. ACM

  • Jones JA, Harrold MJ (2005) Empirical evaluation of the tarantula automatic fault-localization technique. In: Proceedings of the 20th IEEE/ACM international conference on automated software engineering, pp 273–282. ACM

  • Jones JA, Harrold MJ, Stasko J (2002) Visualization of test information to assist fault localization. In: Proceedings of the 24th international conference on software engineering, pp 467–477. ACM

  • Kung DC, Gao J, Kung CH (1998) Testing object-oriented software. Tech. rep.

  • Liblit B, Naik M, Zheng AX, Aiken A, Jordan MI (2005) Scalable statistical bug isolation, pp 15–26. ACM

  • Liu C, Han J (2006) Failure proximity: a fault localization-based approach. In: Proceedings of the 14th ACM SIGSOFT international symposium on foundations of software engineering, pp 46–56. ACM

  • Liu C, Yan X, Fei L, Han J, Midkiff SP (2005) SOBER: statistical model-based bug localization, pp 286–295. ACM

  • Offutt AJ, Lee A, Rothermel G, Untch RH, Zapf C (1996) An experimental determination of sufficient mutant operators. ACM Trans Softw Eng Methodol (TOSEM) 5(2):99–118


  • Parnin C, Orso A (2011) Are automated debugging techniques actually helping programmers? In: Proceedings of the 2011 international symposium on software testing and analysis, pp 199–209. ACM

  • Podgurski A, Leon D, Francis P, Masri W, Minch M, Sun J, Wang B (2003) Automated support for classifying software failure reports. In: Proceedings 25th international conference on software engineering, 2003, pp 465–475. IEEE

  • Renieres M, Reiss SP (2003) Fault localization with nearest neighbor queries. In: Proceedings. 18th IEEE international conference on automated software engineering, 2003, pp 30–39. IEEE

  • Santelices R, Jones JA, Yu Y, Harrold MJ (2009) Lightweight fault-localization using multiple coverage types. In: Proceedings of the 31st international conference on software engineering, ICSE ’09, pp 56–66

  • Smith M, Robson D (1992) A framework for testing object-oriented programs. J Object-Oriented Program 5(3):45–53


  • Srivastav M, Singh Y, Gupta C, Chauhan DS (2010) Complexity estimation approach for debugging in parallel. In: 2010 2nd international conference on computer research and development, pp 223–227. IEEE

  • Vessey I (1985) Expertise in debugging computer programs: a process analysis. Int J Man-Machine Stud 23(5):459–494


  • Voas J (1992) PIE: a dynamic failure-based technique. IEEE Trans Softw Eng 18(8):717–727. doi:10.1109/32.153381


  • Wong WE, Debroy V (2009) A survey of software fault localization. University of Texas at Dallas. Tech. Rep. UTDCS-45-09

  • Yu Y, Jones JA, Harrold MJ (2008) An empirical study of the effects of test-suite reduction on fault localization. In: Proceedings of the 30th international conference on software engineering, pp 201–210. ACM

  • Zeller A (2002) Isolating cause-effect chains from computer programs. In: Proceedings of the 10th ACM SIGSOFT symposium on foundations of software engineering, pp 1–10. ACM

  • Zeller A (2009) Why programs fail: a guide to systematic debugging. Morgan Kaufmann

  • Zheng AX, Jordan MI, Liblit B, Naik M, Aiken A (2006) Statistical debugging: simultaneous identification of multiple bugs. In: Proceedings of the 23rd international conference on machine learning, pp 1105–1112. ACM


Acknowledgments

This material is based upon work supported by the National Science Foundation, through Award CCF-1116943 and through a Graduate Research Fellowship under Grant No. DGE-0808392.

Author information


Corresponding author

Correspondence to Nicholas DiGiuseppe.

Additional information

Communicated by: James Miller

Appendix A: Fault-Type to Number Index


Smith92

  1. Inter-routine conceptual
  2. Inter-routine actual
  3. Intra-routine conceptual
  4. Intra-routine actual

Firesmith92

  1. Incorrect visibility
  2. Missing component
  3. Inconsistent component
  4. Incorrect allocation/deallocation of resources
  5. Does not meet requirements

Hayes94

  1. Abstraction
  2. Encapsulation
  3. Modularity
  4. Hierarchy

Hayes11

  1. Data declaration
  2. Data initialization
  3. Data representation
  4. Data accessing
  5. Incorrect equation
  6. Wrong manipulation
  7. Incorrect/missing processing
  8. Unnecessary processing
  9. Rampaging go to
  10. Incorrect labels
  11. Dead-end code
  12. Duplicate logic
  13. Unachievable path
  14. Incorrect initial value
  15. Incorrect terminal value
  16. Incorrect control value processing
  17. Incorrect exception exit processing
  18. Illogical conditions or impossible cases
  19. Incorrect module interaction
  20. Incorrect module-external data structure
  21. Incorrect input parameters
  22. Large response time
  23. Lack of naturalness
  24. Inconsistency
  25. Redundancy
  26. Complexity
  27. Lack of flexibility
  28. Non-supportiveness
  29. Unpredictable flows
  30. Visual stimulation
  31. Platform
  32. Wrong file included
  33. Incorrect environment variable setting
  34. Documentation

A.1 Expense for Prominent Fault

Fig. 11 The range of expense that was observed for the most localizable fault at each discrete quantity of faults

A.2 Median Expense for All Faults

Fig. 12 The aggregated expense for each fault, evaluated with a ranked list, in all our programs. The lowest plot line represents the fault with the least expense (i.e., the first localized), and each higher line represents the next fault found
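The ordering described in this caption can be sketched in a few lines. The paper's own definition of expense is not reproduced in this preview, so the code assumes a common one: the percentage of program statements that must be examined, walking a ranked list in descending suspiciousness, before a given fault is reached. The function names and the worst-case handling of ties are assumptions for illustration only.

```python
from typing import Dict, List


def expense_per_fault(scores: List[float], faulty: List[int]) -> Dict[int, float]:
    """Map each faulty statement index to its expense in percent.

    Assumed definition: expense = rank / number_of_statements * 100,
    where rank counts every statement scoring at least as high as the
    fault (a worst-case treatment of ties).
    """
    n = len(scores)
    result = {}
    for stmt in faulty:
        rank = sum(1 for s in scores if s >= scores[stmt])
        result[stmt] = 100.0 * rank / n
    return result


def ordered_expenses(scores: List[float], faulty: List[int]) -> List[float]:
    """Expenses sorted ascending: first entry is the first fault localized."""
    return sorted(expense_per_fault(scores, faulty).values())


# Hypothetical program with 4 statements, two of them faulty.
scores = [0.5, 0.0, 1.0, 0.5]
faulty = [2, 3]
print(expense_per_fault(scores, faulty))  # {2: 25.0, 3: 75.0}
print(ordered_expenses(scores, faulty))   # [25.0, 75.0]
```

Under these assumptions, the first element of the sorted list corresponds to the lowest plot line (the most localizable fault), and each later element corresponds to the next line up.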

A.3 Median Suspiciousness for All Faults

Fig. 13 Mean suspiciousness values for each fault, evaluated with a ranked list, across all our programs. The dotted line represents the mean suspiciousness of all instructions in the program

A.4 Expense Distribution per Fault Type

Fig. 14 The range of expense that was observed for every fault with each of the four taxonomies used, for all our programs

A.5 Fault Interference Distribution per Fault Type

Fig. 15 The range of interference that was observed for every fault for each of the four taxonomies used


About this article

Cite this article

DiGiuseppe, N., Jones, J.A. Fault density, fault types, and spectra-based fault localization. Empir Software Eng 20, 928–967 (2015). https://doi.org/10.1007/s10664-014-9304-1


  • DOI: https://doi.org/10.1007/s10664-014-9304-1
