research-article

A theoretical analysis of the risk evaluation formulas for spectrum-based fault localization

Published: 22 October 2013

Abstract

An important research area in Spectrum-Based Fault Localization (SBFL) is the effectiveness of risk evaluation formulas. Most previous studies have adopted an empirical approach, which can hardly be considered sufficiently comprehensive because of the huge number of combinations of various factors in SBFL. Although some studies have aimed at overcoming the limitations of the empirical approach, none has provided a completely satisfactory solution. We therefore provide a theoretical investigation of the effectiveness of risk evaluation formulas. We define two types of relations between formulas, namely, equivalent and better. To identify these relations, we develop an innovative framework for the theoretical investigation. Our framework is based on the concept that the determinant of a formula's effectiveness is the number of statements with risk values higher than that of the faulty statement. We group all program statements into three disjoint sets, with risk values higher than, equal to, and lower than that of the faulty statement, respectively. For different formulas, the sizes of these sets are compared using the notion of subset. We use this framework to identify the maximal formulas, which should be the only formulas used in SBFL.
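As a concrete illustration of this scheme, the following is a minimal sketch (not the paper's notation or implementation): it derives the usual spectrum counts from a test coverage matrix, scores each statement with the Tarantula formula as one example risk evaluation formula, and partitions the statements into the three sets relative to the faulty statement's risk value.

```python
def spectrum_counts(coverage, outcomes):
    """coverage: one 0/1 row per test (statement executed or not);
    outcomes: one bool per test, True meaning the test failed.
    Returns (ef, ep, nf, np) per statement: executed/not-executed
    crossed with failed/passed test counts."""
    n_stmts = len(coverage[0])
    counts = []
    for s in range(n_stmts):
        ef = ep = nf = np_ = 0
        for row, failed in zip(coverage, outcomes):
            if row[s]:
                if failed: ef += 1
                else:      ep += 1
            else:
                if failed: nf += 1
                else:      np_ += 1
        counts.append((ef, ep, nf, np_))
    return counts

def tarantula(ef, ep, nf, np_):
    """Tarantula risk formula (Jones et al.), shown here only as one
    example of a risk evaluation formula."""
    fail_ratio = ef / (ef + nf) if ef + nf else 0.0
    pass_ratio = ep / (ep + np_) if ep + np_ else 0.0
    denom = fail_ratio + pass_ratio
    return fail_ratio / denom if denom else 0.0

def partition(risks, faulty):
    """Split statement indices into the three disjoint sets with risk
    higher than, equal to, and lower than the faulty statement's risk."""
    rf = risks[faulty]
    higher = [i for i, r in enumerate(risks) if r > rf]
    equal  = [i for i, r in enumerate(risks) if r == rf]
    lower  = [i for i, r in enumerate(risks) if r < rf]
    return higher, equal, lower
```

Under the framework, a formula's effectiveness on a given spectrum is determined by the size of the `higher` set (plus, depending on the tie-breaking scheme, part of `equal`), which is why the comparison between two formulas reduces to subset relations between their respective sets.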



    Reviews

    T.H. Tse

Spectrum-based fault localization is a popular technique in automatic program debugging. Researchers analyze the distribution of passed and failed test cases using different risk evaluation formulas, and validate via empirical studies how their proposals improve on earlier work. In this paper, the authors propose a theoretical framework to compare 30 risk evaluation formulas in terms of the percentage of code examined before a fault is identified. They rank the formulas using "better" and "equivalent" relations. Only five formulas are proven to be the most effective, and many of the best-known formulas are not among them.

There is an unhealthy tendency toward empirical studies in software testing and debugging research. Researchers use hypothesis testing to determine whether their proposal is better than that of their predecessors, and reviewers demand more subject programs and larger test pools for further validation. It is refreshing to see that the authors of this paper do not simply rely on empirical studies, but prove mathematically whether various proposals have hit their mark.

This paper is not the only example of the successful application of mathematical theory by Chen's research group. Chen and Merkel prove in one paper [1] that no test case generation technique can be better than random testing by more than 50 percent; hence, their proposed adaptive random testing technique is close to this theoretical limit. In another paper [2], Chen and Yu prove that their proposed proportional sampling strategy is the only partition testing strategy that ensures that the probability of finding at least one failure is no lower than that of random testing for any program. Understandably, some researchers are disgruntled because these theoretical results stop them from making further incremental proposals.

Online Computing Reviews Service


• Published in

  ACM Transactions on Software Engineering and Methodology, Volume 22, Issue 4
  Testing, debugging, and error handling, formal methods, lifecycle concerns, evolution and maintenance
  October 2013, 387 pages
  ISSN: 1049-331X
  EISSN: 1557-7392
  DOI: 10.1145/2522920

      Copyright © 2013 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 22 October 2013
      • Accepted: 1 October 2012
      • Revised: 1 September 2012
      • Received: 1 January 2012


      Qualifiers

      • research-article
      • Research
      • Refereed
