research-article
Open Access

Feedback-driven semi-supervised synthesis of program transformations

Published: 13 November 2020

Abstract

While editing code, developers commonly make multiple related, repeated edits that are all instances of a more general program transformation. Since this process can be tedious and error-prone, we study the problem of automatically learning program transformations from past edits, which can then be used to predict future edits. We take a novel view of the problem as a semi-supervised learning problem: apart from the concrete edits that are instances of the general transformation, the learning procedure also exploits access to additional inputs (program subtrees) that are marked as positive or negative depending on whether the transformation applies to those inputs. We present a procedure to solve the semi-supervised transformation learning problem using anti-unification and programming-by-example synthesis technology. To eliminate reliance on access to marked additional inputs, we generalize the semi-supervised learning procedure to a feedback-driven procedure that also generates the marked additional inputs in an iterative loop. We apply these ideas to build and evaluate three applications that use different mechanisms for generating feedback. Compared to existing tools that learn program transformations from edits, our feedback-driven semi-supervised approach is vastly more effective at predicting edits and requires significantly less past edit data.
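To make the role of anti-unification and of the marked additional inputs concrete, the following is a minimal sketch and not the procedure from the paper. It assumes program subtrees are encoded as nested Python tuples `(label, child, ...)` with string leaves; the function names `anti_unify` and `matches` and the sample trees are hypothetical and chosen only for illustration. Anti-unification computes a pattern that generalizes two example inputs by replacing differing subtrees with holes, and the marked inputs are then used to check whether that pattern accepts the positives and rejects the negatives.

```python
def anti_unify(a, b, holes=None):
    """Generalize two trees: equal parts are kept, differing parts become holes.

    Trees are strings (leaves) or tuples (label, child, ...). The same pair of
    differing subtrees always maps to the same hole, so repeated structure is
    preserved. This encoding is an assumption of the sketch.
    """
    if holes is None:
        holes = {}
    if a == b:
        return a
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        children = tuple(anti_unify(x, y, holes) for x, y in zip(a[1:], b[1:]))
        return (a[0],) + children
    key = (a, b)
    if key not in holes:
        holes[key] = f"?{len(holes)}"
    return holes[key]


def matches(pattern, tree, binding=None):
    """Check whether a tree is an instance of a pattern with consistent holes."""
    if binding is None:
        binding = {}
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in binding:
            return binding[pattern] == tree
        binding[pattern] = tree
        return True
    if isinstance(pattern, tuple) and isinstance(tree, tuple):
        return (len(pattern) == len(tree) and pattern[0] == tree[0]
                and all(matches(p, t, binding)
                        for p, t in zip(pattern[1:], tree[1:])))
    return pattern == tree


# Two inputs of past edits (hypothetical call expressions).
edit_input_1 = ("call", "Equals", "a", "b")
edit_input_2 = ("call", "Equals", "x", "y")
pattern = anti_unify(edit_input_1, edit_input_2)

# Additional inputs marked positive or negative, as in the semi-supervised setting.
positives = [("call", "Equals", "p", "q")]
negatives = [("call", "Compare", "p", "q")]
consistent = (all(matches(pattern, t) for t in positives)
              and not any(matches(pattern, t) for t in negatives))
print(pattern, consistent)
```

In this sketch, a pattern that fails the consistency check would signal that the generalization is too broad or too narrow. How the actual system refines the transformation in that case is described in the full paper, not here.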


Supplemental Material

oopsla20main-p494-p-video.mp4 (mp4, 247.9 MB)


Published in

Proceedings of the ACM on Programming Languages, Volume 4, Issue OOPSLA
November 2020
3108 pages
EISSN: 2475-1421
DOI: 10.1145/3436718

Copyright © 2020 Owner/Author

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery, New York, NY, United States