Cost-aware triage ranking algorithms for bug reporting systems

  • Regular Paper
Knowledge and Information Systems

Abstract

Bug triaging, the task of deciding who should fix a bug, has been studied actively. However, existing work does not consider that the cost of the same bug varies across developers with diverse backgrounds and experience. In contrast, we argue that the “cost” of a bug can be low for one developer but high for another. Based on this view, we study an automatic triaging system that considers both accuracy and cost. Our preliminary solution, CosTriage, models user-specific experience and estimated cost for each bug category, obtained from topic modeling, and assigns the bug to a developer who not only can fix it but is also expected to fix it fast. For user-specific cost modeling, we are inspired by recommender system work on estimating user-specific ratings of items, e.g., movies. From this perspective, existing triaging work, which categorizes bugs and assigns them to developers with experience in the category, falls under content-based recommendation (CBR). However, CBR is well known to cause overspecialization because it recommends only the types of bugs that each developer has solved before. This problem is critical because experienced developers can become overloaded with bugs they dislike fixing, even though there are other categories they can fix faster. CosTriage adopts content-boosted collaborative filtering (CBCF), which considers not only similar bugs (content-based) but also similar developers (collaborative) when estimating user-specific cost. In this paper, we extend CosTriage to handle special scenarios. First, a bug may have no textual report (e.g., a crash report), or its textual report may lack a topic word (e.g., 1957 of 48,424 Mozilla reports). Second, in some scenarios, developer profiles may change over time. For these scenarios, we extend CosTriage to support non-textual descriptions and dynamic profiles, and we denote the extended system CosTriage+. Our experimental evaluation shows that our solution reduces cost by 30 % without seriously compromising accuracy, compared with a baseline that considers only accuracy.
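
To make the idea in the abstract concrete, the following is a minimal Python sketch of cost-aware ranking. It is an illustration under stated assumptions, not the authors' implementation: the function names (`similarity`, `estimate_cost`, `rank_developers`), the gap-based similarity measure, the weighting parameter `alpha`, the normalization by the maximum value, and all numbers are hypothetical. Each developer has a profile of average fix costs per bug category (topic); a missing cost is filled in from developers with similar profiles (the collaborative step), and the final score blends a classifier's accuracy score with the inverse of the estimated cost.

```python
def similarity(a, b):
    """Similarity of two cost profiles over the topics both developers share.

    Assumed measure: 1 / (1 + mean absolute cost gap). Returns 0.0 when the
    profiles have no topic in common.
    """
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    mean_gap = sum(abs(a[t] - b[t]) for t in shared) / len(shared)
    return 1.0 / (1.0 + mean_gap)


def estimate_cost(dev, topic, costs):
    """Return the developer's cost for a topic, filling gaps collaboratively.

    costs maps developer -> {topic: average fix cost}; costs are assumed
    to be strictly positive (e.g., days to fix).
    """
    own = costs[dev]
    if topic in own:
        return own[topic]
    weighted, total = 0.0, 0.0
    for other, profile in costs.items():
        if other == dev or topic not in profile:
            continue
        w = similarity(own, profile)
        weighted += w * profile[topic]
        total += w
    if total > 0:
        return weighted / total
    # No similar developer: fall back to the topic mean, or "unknown" (inf).
    observed = [p[topic] for p in costs.values() if topic in p]
    return sum(observed) / len(observed) if observed else float("inf")


def rank_developers(accuracy, costs, topic, alpha=0.5):
    """Rank developers by a blend of accuracy score and inverse cost.

    alpha = 1.0 uses accuracy only; smaller values give cost more weight.
    """
    inv_cost = {d: 1.0 / estimate_cost(d, topic, costs) for d in accuracy}
    max_acc = max(accuracy.values()) or 1.0
    max_inv = max(inv_cost.values()) or 1.0
    merged = {
        d: alpha * accuracy[d] / max_acc + (1 - alpha) * inv_cost[d] / max_inv
        for d in accuracy
    }
    return sorted(merged, key=merged.get, reverse=True)


# Hypothetical data: fix costs per topic and classifier scores for a new
# bug that falls into topic 1.
costs = {
    "alice": {0: 2.0, 1: 10.0},
    "bob": {0: 2.5},
    "carol": {0: 9.0, 1: 4.0},
}
accuracy = {"alice": 0.9, "bob": 0.8, "carol": 0.5}

print(rank_developers(accuracy, costs, topic=1, alpha=1.0))
print(rank_developers(accuracy, costs, topic=1, alpha=0.5))
```

With these made-up numbers, the accuracy-only setting (`alpha=1.0`) ranks alice first, while the blended setting (`alpha=0.5`) moves carol to the top because her cost for topic 1 is much lower. Bob has never fixed a topic 1 bug, so his cost is borrowed mostly from alice, whose profile on topic 0 is closest to his.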

Notes

  1. Tai’s algorithm has \(O(V_1\times V_2\times D_1^2\times D_2^2)\) time complexity, where \(V_i\) is the number of nodes in \(T(\mathcal {I}_{\mathcal {B}_i})\), and \(D_i\) is the depth of \(T(\mathcal {I}_{\mathcal {B}_i})\).

  2. Zhang’s algorithm has \(O(V_1\times V_2\times \min (L_1,D_1)\times \min (L_2,D_2))\) time complexity, where \(L_i\) denotes the number of leaves in \(T_i\). An implementation is available at http://web.science.mq.edu.au/~swan/howtos/treedistance/.

  3. Apache, https://issues.apache.org/bugzilla/.

  4. Eclipse, https://bugs.eclipse.org/bugs/.

  5. Linux kernel, https://bugzilla.kernel.org/.

  6. Mozilla, https://bugzilla.mozilla.org/.

  7. http://svmlight.joachims.org/.

  8. Since the test set of Apache only has 131 bugs, several bug types have no or few bugs. In Fig. 6, we only show the bug types with more than 2.5 %.

References

  1. Park J, Lee M-W, Kim J, Hwang S, Kim S (2011) Costriage: a cost-aware triage algorithm for bug reporting systems. In: AAAI

  2. Anvik J (2007) Assisting bug report triage through recommendation. PhD thesis, University of British Columbia

  3. Jeong G, Kim S, Zimmermann T (2009) Improving bug triage with bug tossing graphs. In: ESEC/FSE

  4. Guo PJ, Zimmermann T, Nagappan N, Murphy B (2010) Characterizing and predicting which bugs get fixed: an empirical study of microsoft windows. In: ICSE

  5. Anvik J, Hiew L, Murphy GC (2006) Who should fix this bug? In: ICSE

  6. Anvik J, Murphy GC (2011) Reducing the effort of bug report triage: recommenders for development-oriented decisions. ACM Trans Softw Eng Methodol 20(3):10

  7. Čubranić D (2004) Automatic bug triage using text categorization. In: SEKE

  8. Canfora G, Cerulo L (2006) Supporting change request assignment in open source development. In: Proceedings of the 2006 ACM symposium on applied computing

  9. Canfora G, Cerulo L (2005) How software repositories can help in resolving a new change request. In: Workshop on empirical studies in reverse engineering

  10. di Lucca G (2002) An approach to classify software maintenance requests. In: ICSM

  11. Matter D, Kuhn A, Nierstrasz O (2009) Assigning bug reports using a vocabulary-based expertise model of developers. In: MSR

  12. Tamrawi A, Nguyen TT, Al-Kofahi JM, Nguyen TN (2011) Fuzzy set and cache-based approach for bug triaging. In: ESEC/FSE

  13. Kim S, Whitehead EJ Jr (2006) How long did it take to fix bugs? In: MSR

  14. Weiss C, Premraj R, Zimmermann T, Zeller A (2007) How long will it take to fix this bug? In: MSR

  15. Rahman MM, Ruhe G, Zimmermann T (2009) Optimized assignment of developers for fixing bugs: an initial evaluation for eclipse projects. In: ESEM

  16. Blei DM, Ng AY, Jordan MI (2003) Latent dirichlet allocation. J Mach Learn Res 3:993–1022

  17. Bettenburg N, Premraj R, Zimmermann T, Kim S (2008) Duplicate bug reports considered harmful... really? In: ICSM

  18. Chen L, Wang X, Liu C (2011) An approach to improving bug assignment with bug tossing graphs and bug similarities. J Softw 6(3):421–427

  19. Xuan J, Jiang H, Ren Z, Yan J, Luo Z (2010) Automatic bug triage using semi-supervised text classification. In: SEKE

  20. Bhattacharya P, Neamtiu I (2010) Fine-grained incremental learning and multi-feature tossing graphs to improve bug triaging. In: ICSM

  21. Lin Z, Shu F, Yang Y, Hu C, Wang Q (2009) An empirical study on bug assignment automation using chinese bug data. In: ESEM

  22. Kim J, Lee S, Hwang S, Kim S (2009) Adding examples into java documents. In: ASE

  23. Kim J, Lee S, Hwang S, Kim S (2010) Towards an intelligent code search engine. In: AAAI

  24. Kim J, Lee S, Hwang S, Kim S (2013) Enriching documents with examples: a corpus mining approach. ACM Trans Inf Syst 31(1):1

  25. Lee M-W, Roh J-W, Hwang S, Kim S (2010) Instant code clone search. In: FSE

  26. Park J, Lee M-W, Roh J-W, Hwang S, Kim S (2014) Surfacing code in the dark: an instant clone search approach. Knowl Inf Syst 41(3):727–759

  27. Melville P, Mooney RJ, Nagarajan R (2002) Content-boosted collaborative filtering for improved recommendations. In: AAAI

  28. Arun R, Suresh V, Veni Madhavan CE, Narasimha Murthy MN (2010) On finding the natural number of topics with latent dirichlet allocation: some observations. In: PAKDD

  29. Cao J, Xia T, Li J, Zhang Y, Tang S (2009) A density-based method for adaptive lda model selection. Neurocomputing 72(7–9):1775–1781

  30. Zavitsanos E, Petridis S, Paliouras G, Vouros GA (2008) Determining automatically the size of learned ontologies. In: ECAI

  31. Herlocker J, Konstan JA, Riedl J (2002) An empirical analysis of design choices in neighborhood-based collaborative filtering algorithms. Inf Retr 5(4):287–310

  32. Ma H, King I, Lyu MR (2007) Effective missing data prediction for collaborative filtering. In: SIGIR

  33. Allan J (1996) Incremental relevance feedback for information filtering. In: SIGIR

  34. Chen Z, Jiang Y, Zhao Y (2010) A collaborative filtering recommendation algorithm based on user interest change and trust evaluation. In: JDCTA

  35. Lathia N, Hailes S, Capra L, Amatriain X (2010) Temporal diversity in recommender systems. In: SIGIR

  36. Cavnar WB, Trenkle JM (1994) N-gram-based text categorization. In: SDAIR

  37. Bettenburg N, Premraj R, Zimmermann T, Kim S (2008) Extracting structural information from bug reports. In: MSR

  38. Microsoft (2010) Windows error reporting: getting started. http://www.microsoft.com/whdc/winlogo/maintain/StartWER.mspx

  39. Mozilla (2010) Crash stats. http://crash-stats.mozilla.com

  40. Apple (2010) Technical note TN2123: CrashReporter

  41. Tai K-C (1979) The tree-to-tree correction problem. J Assoc Comput Mach

  42. Chen W (2001) New algorithm for ordered tree-to-tree correction problem. J Algorithms 40(2):135–158

  43. Demaine ED, Mozes S, Rossman B, Weimann O (2009) An optimal decomposition algorithm for tree edit distance. ACM Trans Algorithms 6(1):2

  44. Dulucq S, Touzet H (2003) Analysis of tree edit distance algorithms. In: CPM

  45. Klein PN (1998) Computing the edit-distance between unrooted ordered trees. In: Proceedings of the 6th annual European symposium on algorithms

  46. Zhang K, Shasha D (1989) Simple fast algorithms for the editing distance between trees and related problems. SIAM J Comput 18(6):1245–1262

  47. Bremner D, Demaine E, Erickson J, Iacono J, Langerman S, Morin P, Toussaint G (2005) Output-sensitive algorithms for computing nearest-neighbour decision boundaries. In: Algorithms and Data Structures. Proceedings of 8th International Workshop, WADS 2003, Ottawa, Ontario, Canada, July 30-August 1, 2003. Springer, Heidelberg, pp 451–461

  48. Coomans D, Massart DL (1982) Alternative k-nearest neighbour rules in supervised pattern recognition: Part 1. k-Nearest neighbour classification by using alternative voting rules. Anal Chim Acta

  49. Cover T, Hart P (1967) Nearest neighbor pattern classification. IEEE Trans Inf Theory 13(1):21–27

  50. Brown RG (1963) Smoothing, forecasting and prediction of discrete time series. Prentice-Hall, Englewood Cliffs

  51. Weron R, Weron K, Weron A (1999) A conditionally exponential decay approach to scaling in finance. Phys A Stat Theor Phys 264(3–4):551–561

  52. Han J, Kamber M (2006) Data mining: concepts and techniques

  53. Jiang L, Misherghi G, Su Z, Glondu S (2007) Deckard: scalable and accurate tree-based detection of code clones. In: ICSE

  54. Bettenburg N, Just S, Schröter A, Weiss C, Premraj R, Zimmermann T (2008) What makes a good bug report? In: SIGSOFT FSE

  55. Hooimeijer P, Weimer W (2007) Modeling bug report quality. In: ASE

  56. Aranda J, Venolia G (2009) The secret life of bugs: going past the errors and omissions in software repositories. In: ICSE

  57. Giger E, Pinzger M, Gall H (2010) Predicting the fix time of bugs. In: RSSE

Acknowledgments

This work was supported by Institute for Information & communications Technology Promotion (IITP) Grant funded by the Korea government (MSIP) (No. 10041244, SmartTV 2.0 Software Platform).

Author information

Corresponding author

Correspondence to Seung-won Hwang.

Additional information

This work builds on and extends our preliminary work [1].

Cite this article

Park, Jw., Lee, MW., Kim, J. et al. Cost-aware triage ranking algorithms for bug reporting systems. Knowl Inf Syst 48, 679–705 (2016). https://doi.org/10.1007/s10115-015-0893-9

  • DOI: https://doi.org/10.1007/s10115-015-0893-9
