
Repeatable software engineering experiments for comparing defect-detection techniques


Abstract

Techniques for detecting defects in source code are fundamental to the success of any software development approach. A software development organization therefore needs to understand the utility of techniques such as reading or testing in its own environment. Controlled experiments have proven to be an effective means of evaluating software engineering techniques and gaining the necessary understanding of their utility. This paper presents a characterization scheme for controlled experiments that evaluate defect-detection techniques. The characterization scheme permits the comparison of results from similar experiments and establishes a context for cross-experiment analysis of those results. We use the scheme to structure a detailed survey of four experiments that compared reading and testing techniques for detecting defects in source code. We encourage educators, researchers, and practitioners to use the characterization scheme to develop and conduct further instances of this class of experiments. By repeating such experiments, we expect that the software engineering community will gain quantitative insights into the utility of defect-detection techniques in different environments.




Additional information

This work was conducted while the author was with the Department of Computer Science, University of Kaiserslautern, 67653 Kaiserslautern, Germany.


About this article

Cite this article

Lott, C.M., Rombach, H.D. Repeatable software engineering experiments for comparing defect-detection techniques. Empirical Software Engineering 1, 241–277 (1996). https://doi.org/10.1007/BF00127447


