DOI: 10.1145/2384616.2384689
research-article

Typestate-based semantic code search over partial programs

Published: 19 October 2012

ABSTRACT

We present a novel code search approach for answering queries focused on API-usage with code showing how the API should be used. To construct a search index, we develop new techniques for statically mining and consolidating temporal API specifications from code snippets. In contrast to existing semantic-based techniques, our approach handles partial programs in the form of code snippets. Handling snippets allows us to consume code from various sources such as parts of open source projects, educational resources (e.g., tutorials), and expert code sites. To handle code snippets, our approach (i) extracts a possibly partial temporal specification from each snippet using a relatively precise static analysis tracking a generalized notion of typestate, and (ii) consolidates the partial temporal specifications, combining consistent partial information to yield consolidated temporal specifications, each of which captures a full(er) usage scenario.
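The consolidation step can be pictured with a small sketch. This is not PRIME's algorithm: it assumes, purely for illustration, that each partial specification is a flat sequence of method names on one object, and that two specifications are "consistent" when the end of one overlaps the start of the other. The helper names (`merge_if_consistent`, `consolidate`) and the FTP-style method names are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumption: a partial temporal specification is modeled as an ordered
# tuple of API method names. The paper works with richer abstractions.
Spec = Tuple[str, ...]


def merge_if_consistent(a: Spec, b: Spec) -> Optional[Spec]:
    """Join a and b when a non-empty suffix of a equals a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):  # prefer the longest overlap
        if a[-k:] == b[:k]:
            return a + b[k:]
    return None


def consolidate(specs: List[Spec]) -> List[Spec]:
    """Merge overlapping specs repeatedly until no pair can be merged."""
    pool = list(dict.fromkeys(specs))  # drop duplicates, keep order
    changed = True
    while changed:
        changed = False
        for i in range(len(pool)):
            for j in range(len(pool)):
                if i == j:
                    continue
                merged = merge_if_consistent(pool[i], pool[j])
                if merged is not None:
                    # Replace the two partial specs with their merge.
                    pool = [s for n, s in enumerate(pool) if n not in (i, j)]
                    if merged not in pool:
                        pool.append(merged)
                    changed = True
                    break
            if changed:
                break
    return pool


# Three hypothetical snippets, each showing only part of a usage scenario.
snippets = [
    ("connect", "login"),
    ("login", "retrieveFile"),
    ("retrieveFile", "logout", "disconnect"),
]
print(consolidate(snippets))
# [('connect', 'login', 'retrieveFile', 'logout', 'disconnect')]
```

Each merge shrinks the pool by at least one entry, so the loop terminates. The point of the sketch is only that fragments which individually show an incomplete protocol can combine into a fuller usage scenario.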

To answer a search query, we define a notion of relaxed inclusion matching a query against temporal specifications and their corresponding code snippets.
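One way to build intuition for relaxed inclusion is the sketch below. It is a deliberate simplification and not the definition used in the paper: it assumes specifications and queries are plain call sequences, and treats a query as matching when its calls occur in the specification in the same order, with arbitrary calls allowed in between. The function names, snippet identifiers, and method names are all hypothetical.

```python
from typing import Dict, List, Sequence


def relaxed_includes(spec: Sequence[str], query: Sequence[str]) -> bool:
    """True when the query's calls appear in spec in order, gaps allowed."""
    remaining = iter(spec)
    # `call in remaining` advances the iterator past the first match,
    # so later query calls must be found after earlier ones.
    return all(call in remaining for call in query)


def search(index: Dict[str, Sequence[str]], query: Sequence[str]) -> List[str]:
    """Return the ids of all indexed specifications that match the query."""
    return [sid for sid, spec in index.items() if relaxed_includes(spec, query)]


# Hypothetical index mapping snippet ids to their call sequences.
index = {
    "s1": ("connect", "login", "retrieveFile", "logout", "disconnect"),
    "s2": ("connect", "disconnect"),
    "s3": ("login", "connect"),
}

print(search(index, ("connect", "retrieveFile", "disconnect")))  # ['s1']
print(search(index, ("connect", "disconnect")))                  # ['s1', 's2']
```

The query does not have to spell out every intermediate call: `("connect", "retrieveFile", "disconnect")` matches `s1` even though `login` and `logout` are omitted, while `s3` is rejected because the order is wrong.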

We have implemented our approach in a tool called PRIME and applied it to search for API usage of several challenging APIs. PRIME was able to analyze and consolidate thousands of snippets per tested API, and our results indicate that the combination of a relatively precise analysis and consolidation allowed PRIME to answer challenging queries effectively.

Published in

OOPSLA '12: Proceedings of the ACM international conference on Object oriented programming systems languages and applications
October 2012, 1052 pages
ISBN: 9781450315616
DOI: 10.1145/2384616

ACM SIGPLAN Notices, Volume 47, Issue 10 (OOPSLA '12)
October 2012, 1011 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2398857

Copyright © 2012 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 268 of 1,244 submissions, 22%
