ABSTRACT
Software developers often submit questions to technical Q&A sites such as Stack Overflow (SO) to resolve their code-level problems. They usually include example code segments with their questions to explain the programming issues. When SO users attempt to answer the questions, they prefer to reproduce the reported issues using the given code segments. However, such code segments cannot always reproduce the issues due to several unmet challenges (e.g., a too short code segment), which might prevent questions from receiving appropriate and prompt solutions. A previous study produced a catalog of potential challenges that hinder the reproducibility of issues reported in SO questions. However, it is unknown how practitioners (i.e., developers) perceive this challenge catalog. Understanding the developers' perspective is essential to introducing interactive tool support that promotes reproducibility. We thus attempt to understand developers' perspectives by surveying 53 SO users. In particular, we attempt to (1) see developers' viewpoints on their agreement with those challenges, (2) find the potential impact of those challenges, (3) see how developers address them, and (4) determine and prioritize the needs for tool support. Survey results show that about 90% of participants agree with the previously exposed challenges. However, they report some additional challenges (e.g., a missing error log) that might prevent reproducibility. According to the participants, a too short code segment and the absence of a required Class/Interface/Method from the code segment severely prevent reproducibility, followed by a missing important part of the code. To promote reproducibility, participants strongly recommend introducing tool support that interacts with question submitters, suggesting improvements to their code segments when the given segments fail to reproduce the reported issues.
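To make the envisioned tool support concrete, the checks below sketch how a submission-time assistant might flag the top challenges the participants identified (too short code segment, missing Class/Interface/Method, missing entry point). This is a minimal illustrative sketch in Python using simple regular-expression heuristics on a Java snippet; the threshold and rules are our assumptions, not the tool proposed in the paper.

```python
import re

def snippet_warnings(code: str) -> list[str]:
    """Heuristically flag reproducibility challenges in a Java code segment.

    The rules loosely mirror challenges reported by survey participants;
    the 5-line minimum is an assumed, illustrative threshold.
    """
    warnings = []
    lines = [ln for ln in code.splitlines() if ln.strip()]
    if len(lines) < 5:
        warnings.append("too short code segment")
    if not re.search(r"\b(class|interface|enum)\s+\w+", code):
        warnings.append("no Class/Interface declaration")
    if not re.search(r"\bstatic\s+void\s+main\b", code):
        warnings.append("no main method (entry point missing)")
    return warnings
```

A one-line fragment such as `System.out.println(x);` would trigger all three warnings, whereas a complete class with a `main` method would pass; a real assistant would then prompt the question submitter with suggestions before the question is posted.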