Feedback-driven semi-supervised synthesis of program transformations

Authors:
Xiang Gao (National University of Singapore, Singapore)
Shraddha Barke (University of California at San Diego, USA)
Arjun Radhakrishna (Microsoft, USA)
Gustavo Soares (Microsoft, USA)
Sumit Gulwani (Microsoft, USA)
Alan Leung (Microsoft, USA)
Nachiappan Nagappan (Microsoft Research, USA)
Ashish Tiwari (Microsoft, USA)

Proceedings of the ACM on Programming Languages, Volume 4, Issue OOPSLA, Article No. 219, pp. 1–30. https://doi.org/10.1145/3428287

Published: 13 November 2020

Abstract

While editing code, developers commonly make multiple related, repeated edits that are all instances of a more general program transformation. Since this process can be tedious and error-prone, we study the problem of automatically learning program transformations from past edits, which can then be used to predict future edits. We take a novel view of the problem as a semi-supervised learning problem: apart from the concrete edits that are instances of the general transformation, the learning procedure also exploits access to additional inputs (program subtrees) that are marked as positive or negative depending on whether the transformation applies to those inputs. We present a procedure to solve the semi-supervised transformation learning problem using anti-unification and programming-by-example synthesis technology. To eliminate reliance on access to marked additional inputs, we generalize the semi-supervised learning procedure to a feedback-driven procedure that also generates the marked additional inputs in an iterative loop. We apply these ideas to build and evaluate three applications that use different mechanisms for generating feedback. Compared to existing tools that learn program transformations from edits, our feedback-driven semi-supervised approach is vastly more effective at predicting edits, and it does so with significantly less past edit data.
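To make the role of anti-unification and of marked additional inputs concrete, the following is a minimal sketch and not the procedure from the paper. It assumes program subtrees are encoded as nested Python tuples of the form `(label, child, ...)` with string leaves; the names `Hole`, `anti_unify`, `matches`, and `learn_guard` are hypothetical and introduced only for illustration. The sketch generalizes the inputs of the example edits together with the positive inputs into a pattern with holes, then rejects the pattern if any negative input still matches it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hole:
    """A pattern variable that matches any subtree (hypothetical helper)."""
    name: str


def anti_unify(a, b, holes=None):
    """Return the least general pattern that matches both trees a and b.

    The same pair of differing subtrees is always mapped to the same hole.
    """
    if holes is None:
        holes = {}
    if a == b:
        return a
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and len(a) == len(b) and a[0] == b[0]):
        children = tuple(anti_unify(x, y, holes) for x, y in zip(a[1:], b[1:]))
        return (a[0],) + children
    if (a, b) not in holes:
        holes[(a, b)] = Hole(f"h{len(holes)}")
    return holes[(a, b)]


def matches(pattern, tree, binding=None):
    """Return the hole bindings if pattern matches tree, otherwise None."""
    if binding is None:
        binding = {}
    if isinstance(pattern, Hole):
        if pattern.name in binding and binding[pattern.name] != tree:
            return None
        binding[pattern.name] = tree
        return binding
    if pattern == tree:
        return binding
    if (isinstance(pattern, tuple) and isinstance(tree, tuple)
            and len(pattern) == len(tree) and pattern[0] == tree[0]):
        for p, t in zip(pattern[1:], tree[1:]):
            if matches(p, t, binding) is None:
                return None
        return binding
    return None


def learn_guard(example_inputs, positives, negatives):
    """Generalize over example and positive inputs (assumed simplification).

    Returns None when the generalized pattern also matches a negative input.
    """
    trees = list(example_inputs) + list(positives)
    pattern = trees[0]
    for tree in trees[1:]:
        pattern = anti_unify(pattern, tree)
    if any(matches(pattern, n) is not None for n in negatives):
        return None
    return pattern


# One example edit input, one positive input, one negative input.
pattern = learn_guard(
    example_inputs=[("==", "a", "null")],
    positives=[("==", "b", "null")],
    negatives=[("==", "a", "0")],
)
print(pattern)
```

With a single example edit the pattern would be the example itself; the positive input is what forces the left operand to become a hole, and the negative input guards against generalizing the right operand as well. A feedback-driven loop in the spirit of the abstract would generate such marked inputs itself rather than take them as arguments.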

Index Terms

1. Computing methodologies
  1. Artificial intelligence
2. Software and its engineering
  1. Software creation and management
    1. Software development techniques
      1. Automatic programming
  2. Software notations and tools
    1. Development frameworks and environments
      1. Integrated and visual development environments
    2. Software maintenance tools

Published in

Proceedings of the ACM on Programming Languages Volume 4, Issue OOPSLA
November 2020
3108 pages
EISSN:2475-1421
DOI:10.1145/3436718

Copyright © 2020 Owner/Author
This work is licensed under a Creative Commons Attribution International 4.0 License.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Program synthesis
Program transformation
Programming by Example
Refactoring