Learning to answer programming questions with software documentation through social context embedding

https://doi.org/10.1016/j.ins.2018.03.014

Abstract

Official software documentation provides a comprehensive overview of software usage, but does not address specific programming tasks or use cases. Often there is a mismatch between the documentation and a question about a specific programming task because the two are worded differently. We observe from Stack Overflow that the best answers to programmers’ questions often contain links to formal documentation. In this paper, we propose a novel deep-learning-to-answer framework, named QDLinker, for answering programming questions with software documentation. QDLinker learns from the large volume of discussions on community-based question answering sites to bridge the semantic gap between programmers’ questions and software documentation. Specifically, QDLinker learns question-documentation semantic representations from these question answering discussions with a four-layer neural network, and incorporates semantic and content features into a learning-to-rank schema. Our approach does not require manual feature engineering or external resources to infer the degree of relevance between a question and documentation. Extensive experiments show that QDLinker effectively answers programming questions with direct links to software documentation. QDLinker significantly outperforms baselines based on traditional retrieval models and on Web search services dedicated to software documentation retrieval. A user study shows that QDLinker effectively bridges the semantic gap between the intent of a programming question and the content of software documentation.

Introduction

For most programming languages and software packages, there exist comprehensive language specifications, Application Programming Interface (API) documentation, and tutorials. Such official documentation provides information about functionality, structure, and parameters, but not about specific issues or usage scenarios [31], [42]. Programmers, on the other hand, often face very specific issues that are not explicitly covered in software documentation. For many such issues, software documentation still serves as a good reference for why the issues happen and how to address them. However, it is challenging to use a question as a keyword query to search for relevant software documents, because the documentation and the question are often worded differently: one is written as a generic reference, while the other arises from a specific usage scenario in practice.

With the emergence of Web 2.0 in modern software development, the behavior of developers has changed with respect to how they search for crowd-generated knowledge to fulfill their needs [21], [22], [25]. The mismatch between the needs of documentation consumers and the knowledge provided has led to an overwhelming volume of discussions accumulating on Community-based Question Answering (CQA) websites such as Quora and Stack Overflow. In these discussions, community users often refer to software documentation when answering programming questions. From Stack Overflow, we collected 45,288 best answers, each containing at least one link to official Java documentation. Fig. 1 plots the distribution of the number of links to Java documentation per best answer, which follows a power-law distribution. It shows that 72.6% of best answers have exactly one link to Java documentation and fewer than 10% have more than three links. This distribution suggests that for many Java programming questions, there exists an official Java document that serves as a good reference for addressing the question. The large volume of discussions also creates a ‘semantic link’ between programmers’ questions and software documentation through the community of programmers, as illustrated in Fig. 2.

Posting questions and waiting for answers from other programmers can take a long time. The immediate question is: can we answer a programmer’s question by providing a link to the most relevant software documentation? In this research, we aim to build an answering system in which the questions are posed by programmers in natural language and the answers are links to official documentation, as illustrated in Fig. 2. Such a system would benefit not only documentation consumers but also the companies that provide technical support.

However, understanding programming questions well enough to build an effective answering system is not trivial. First of all, mapping question-answer pairs into a discriminative feature space is a critical step. A widely adopted approach is to encode question-answer pairs using various features, e.g., lexical, linguistic, and syntactic features [36], [37], [53], [61], [67]. These hand-crafted features may depend heavily on external resources, at the cost of generality. Moreover, many existing knowledge bases cover lexical knowledge or open-domain facts. A typical example is WordNet [29], a lexical knowledge base for general English, which may not be suitable for building answering systems for technical questions about programming. Given these limitations, using neural networks to learn semantic representations of question-documentation pairs seems more appropriate for our task. Neural networks have proven to be powerful tools in many fields, such as machine transliteration [7], computer vision [50], electromagnetic theory [18], wire coating analysis [30], and bioinformatics [40]. Note that our task cannot be addressed by search engines for source code [13], [35]. Code search systems cannot answer natural language queries well, especially when the queries do not contain any code snippets or API-like terms.

In this paper, we propose a novel deep-learning-to-answer framework named QDLinker, which answers programming questions with software documentation through social context embedding. The social context of a link to software documentation refers to the words surrounding the link when community users use it to answer questions in CQA. QDLinker embeds social contexts in a latent space, and uses a four-layer Deep Neural Network (DNN) to learn semantic representations of question-documentation pairs. The learned semantic representations and simple content features are then passed to a learning-to-rank schema to train a ranker. Compared to prior work on software text retrieval [67], our approach does not require manual feature engineering or hand-coded resources beyond the pre-trained word vectors. The proposed architecture is beneficial not only for learning a ranker in the training phase, but also for automatically extracting features for incoming query-documentation pairs in the online phase. Moreover, our approach takes documentation content and social context into account simultaneously, which makes it effective at bridging the semantic gap between programming questions and software documentation.
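The notion of social context described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the URL pattern for official Java documentation, the whitespace tokenization, the `window` parameter, and the function name `social_contexts` are all assumptions made for illustration.

```python
import re

# Hypothetical pattern for links to official Java documentation; the URL
# prefix is an assumption for illustration, not taken from the paper.
DOC_LINK = re.compile(r"https?://docs\.oracle\.com/\S+")


def social_contexts(answer_text, window=5):
    """Return (link, context_words) pairs for each documentation link.

    The context is the `window` whitespace-separated tokens before and
    after the link. The window size is an assumed parameter.
    """
    tokens = answer_text.split()
    contexts = []
    for i, token in enumerate(tokens):
        match = DOC_LINK.search(token)
        if match is None:
            continue
        # Drop sentence punctuation that happens to trail the URL.
        link = match.group(0).rstrip(".,;:)")
        before = tokens[max(0, i - window):i]
        after = tokens[i + 1:i + 1 + window]
        # Exclude other documentation links from the context words.
        words = [w for w in before + after if not DOC_LINK.search(w)]
        contexts.append((link, words))
    return contexts
```

In a pipeline of this kind, the context words collected for each link would then be mapped to word vectors before being fed to a neural model; that step is omitted here.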

We conducted extensive experiments on a Stack Overflow dataset to evaluate the effectiveness of QDLinker. Empirical results show that QDLinker outperforms three baseline methods based on traditional retrieval models. Through a user study with 25 natural language queries collected from the test dataset, we show that QDLinker significantly outperforms a commercial search engine. In short, our empirical results show that QDLinker can effectively bridge the semantic gap between questions and software documentation. In this paper, we make the following contributions:

  • We propose QDLinker, a novel framework for answering programming questions with software documentation through social context embedding. It leverages the content on official sites and the social contexts in CQA to learn semantic representations of question-documentation pairs and to answer programming questions posed in natural language.

  • We conduct a large-scale automatic evaluation of the performance of QDLinker against three baseline methods. The empirical evaluation reveals that our approach answers Java technical questions more effectively than the traditional retrieval models.

  • We conduct a user study to compare the software documentation retrieval performance of QDLinker and Google search. The results show that QDLinker significantly outperforms Google search in the retrieval task.

The remainder of this paper is organized as follows. Section 2 summarizes the related work. Section 3 details our approach QDLinker. Section 4 presents the empirical evaluation. Section 5 presents the user study. Finally, we conclude the paper in Section 6.

Section snippets

Related work

Question Retrieval. Question retrieval has attracted much attention in recent years [4], [8], [10], [16]. Different retrieval models have been employed in the task, including the Okapi model [16], the translation model [63], the language model [8], and the vector space model [16], [17]. In addition, question category information has also been exploited for question retrieval [4]. Xue et al. [57] proposed a translation-based language model that combines the translation model and the language

Deep learning to answer

In this section, we first give an overview of the proposed framework QDLinker, and then detail the core modules in QDLinker in Sections 3.2–3.4. The input to the framework, i.e., the word embedding, is presented in Section 3.1.

As shown in Fig. 3, the QDLinker framework consists of three core modules: candidate documentation generation, a four-layer neural network, and learning a ranker. Given a programming question in natural language, candidate documentation generation returns a small set of

Empirical evaluation

We now evaluate the effectiveness of QDLinker by measuring its accuracy on linking questions on Stack Overflow to software documentation. Our evaluation assumes that the software documentation mentioned in a question’s best answer is the most relevant to the question.
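Under the assumption stated above, an evaluation of this kind can be sketched as follows. This excerpt does not name the metrics used, so mean reciprocal rank and hit rate at k are assumptions chosen for illustration, as are the function names and the dictionary-based data layout.

```python
def reciprocal_rank(ranked_docs, relevant_docs):
    """Return 1/rank of the first relevant document, or 0.0 if none is found."""
    for rank, doc in enumerate(ranked_docs, start=1):
        if doc in relevant_docs:
            return 1.0 / rank
    return 0.0


def hit_at_k(ranked_docs, relevant_docs, k):
    """Return True if any of the top-k documents is relevant."""
    return any(doc in relevant_docs for doc in ranked_docs[:k])


def evaluate(predictions, ground_truth, k=5):
    """Average both metrics over all questions in the ground truth.

    predictions:  question id -> ranked list of documentation links
    ground_truth: question id -> set of links found in the best answer
                  (treated as the relevant documents)
    """
    n = len(ground_truth)
    mrr = sum(
        reciprocal_rank(predictions.get(q, []), rel)
        for q, rel in ground_truth.items()
    ) / n
    hits = sum(
        hit_at_k(predictions.get(q, []), rel, k)
        for q, rel in ground_truth.items()
    ) / n
    return {"mrr": mrr, "hit_at_k": hits}
```

Questions with no prediction count as misses, so a system is not rewarded for declining to answer.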

User study

To the best of our knowledge, there is no existing work on answering programming questions in natural language. Commercial search engines, e.g., Google and Bing, are tools used daily in software development. This naturally motivates us to compare the results returned by QDLinker with those of such search engines. If we can improve on the search results of these engines, it will benefit not only developers but also the companies that provide documentation support.

In the previous

Conclusion

Developers often encounter questions in specific programming tasks. Although programming languages and software packages are well supported by formal documentation, the documentation aims at comprehensive coverage rather than at specific tasks. The semantic gap between developers’ questions and software documentation makes it difficult for developers to search for the most relevant documentation. Utilizing the social context available on Stack Overflow, we built QDLinker to bridge the gap

References (67)

  • H. Bagci et al., Context-aware location recommendation by using a random walk-based approach, Knowl. Inf. Syst. (2016)
  • A. Berger et al., Bridging the lexical chasm: statistical approaches to answer-finding, Proceedings of the SIGIR (2000)
  • R.D. Burke et al., Question answering from frequently asked question files: experiences with the FAQ finder system, AI Mag. (1997)
  • X. Cao et al., A generalized framework of exploring category information for question retrieval in community question answer archives, Proceedings of the WWW (2010)
  • Y. Cheng et al., An exploration of parameter redundancy in deep networks with circulant projections, Proceedings of the ICCV (2015)
  • R. Collobert et al., Natural language processing (almost) from scratch, J. Mach. Learn. Res. (2011)
  • T. Deselaers et al., A deep learning approach to machine transliteration, Proceedings of the Fourth Workshop on Statistical Machine Translation (2009)
  • H. Duan et al., Searching questions by identifying question topic and question focus, Proceedings of the ACL (2008)
  • A. Gani et al., A survey on indexing techniques for big data: taxonomy and performance evaluation, Knowl. Inf. Syst. (2016)
  • M. Grbovic et al., Context- and content-aware embedding for query rewriting in sponsored search, SIGIR (2015)
  • X. Gu et al., Deep API learning, Proceedings of the FSE (2016)
  • K. He et al., Convolutional neural networks at constrained time cost, Proceedings of the CVPR (2015)
  • Y. Hou et al., Hitszicrc: Exploiting classification approach for answer selection in community question answering, Proceedings of the SemEval (2015)
  • J. Jeon et al., Finding similar questions in large question and answer archives, Proceedings of the CIKM (2005)
  • Z. Ji et al., Question-answer topic model for question retrieval in community question answering, Proceedings of the CIKM (2012)
  • J.A. Khan et al., Nature-inspired computing approach for solving non-linear singular Emden–Fowler problem arising in electromagnetic theory, Conn. Sci. (2015)
  • Y. Kim, Convolutional neural networks for sentence classification, arXiv:1408.5882...
  • D. Kingma, J. Ba, Adam: a method for stochastic optimization, arXiv:1412.6980...
  • A.J. Ko et al., Information needs in collocated software development teams, Proceedings of the ICSE (2007)
  • A.J. Ko et al., An exploratory study of how developers seek, relate, and collect relevant information during software maintenance tasks, IEEE Trans. Softw. Eng. (2006)
  • Y. LeCun et al., Deep learning, Nature (2015)
  • J. Li et al., BPMiner: mining developers’ behavior patterns from screen-captured task videos, Proceedings of the SAC (2016)
  • J. Li et al., From discussion to wisdom: web resource recommendation for hyperlinks in stack overflow, Proceedings of the SAC (2016)