Using Slicing to Identify Duplication in Source Code

  • Conference paper
Static Analysis (SAS 2001)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2126)

Abstract

Programs often have a lot of duplicated code, which makes both understanding and maintenance more difficult. This problem can be alleviated by detecting duplicated code, extracting it into a separate new procedure, and replacing all the clones (the instances of the duplicated code) by calls to the new procedure. This paper describes the design and initial implementation of a tool that finds clones and displays them to the programmer. The novel aspect of our approach is the use of program dependence graphs (PDGs) and program slicing to find isomorphic PDG subgraphs that represent clones. The key benefits of this approach are that our tool can find non-contiguous clones (clones whose components do not occur as contiguous text in the program), clones in which matching statements have been reordered, and clones that are intertwined with each other. Furthermore, the clones that are found are likely to be meaningful computations, and thus good candidates for extraction.
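
The idea of pairing up matching PDG nodes by slicing backward from two starting statements can be illustrated with a minimal sketch. It is not the tool's implementation: the toy program, the label normalization, the dictionary encoding of the PDG, and the names `LABELS`, `PREDS`, and `find_clone_pair` are all assumptions made for illustration. The sketch grows two disjoint subgraphs in lock step, adding a pair of dependence predecessors only when their statement labels and edge kinds agree.

```python
from collections import deque

# Hypothetical toy program; the two computations are interleaved in the text:
#   1: a = read()      2: x = read()      3: b = a * 2
#   4: log("tick")     5: y = x * 2       6: print(b)      7: print(y)
# LABELS maps each PDG node to its statement shape with variable names abstracted.
LABELS = {
    1: "v = read()", 2: "v = read()",
    3: "v = v * 2", 4: "log(...)", 5: "v = v * 2",
    6: "print(v)", 7: "print(v)",
}

# PREDS maps a node to its dependence predecessors as (node, edge kind) pairs.
PREDS = {
    3: [(1, "data")],
    5: [(2, "data")],
    6: [(3, "data")],
    7: [(5, "data")],
}


def find_clone_pair(labels, preds, root_a, root_b):
    """Slice backward from two matching roots in lock step.

    Returns sorted (node_in_a, node_in_b) pairs, or [] if the roots
    do not match. The two node sets are kept disjoint.
    """
    if root_a == root_b or labels[root_a] != labels[root_b]:
        return []
    mapped_a = {root_a: root_b}  # node in first clone -> partner in second
    used_b = {root_b}            # nodes already claimed by the second clone
    worklist = deque([(root_a, root_b)])
    while worklist:
        a, b = worklist.popleft()
        for pa, kind_a in preds.get(a, []):
            if pa in mapped_a or pa in used_b:
                continue
            for pb, kind_b in preds.get(b, []):
                if pb == pa or pb in used_b or pb in mapped_a:
                    continue
                if kind_a == kind_b and labels[pa] == labels[pb]:
                    mapped_a[pa] = pb
                    used_b.add(pb)
                    worklist.append((pa, pb))
                    break
    return sorted(mapped_a.items())


print(find_clone_pair(LABELS, PREDS, 6, 7))
```

Starting from the two `print` statements, the sketch pairs statements 1, 3, 6 with statements 2, 5, 7, even though neither group is contiguous in the program text and the two groups are interleaved. Statement 4 is left out because nothing in either slice depends on it.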

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Komondoor, R., Horwitz, S. (2001). Using Slicing to Identify Duplication in Source Code. In: Cousot, P. (eds) Static Analysis. SAS 2001. Lecture Notes in Computer Science, vol 2126. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47764-0_3

  • DOI: https://doi.org/10.1007/3-540-47764-0_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42314-0

  • Online ISBN: 978-3-540-47764-8

  • eBook Packages: Springer Book Archive
