research-article

Measuring and Modeling Group Dynamics in Open-Source Software Development: A Tensor Decomposition Approach

Published: 17 November 2021

Abstract

Many open-source software projects depend on a few core developers, who take on the bulk of both coordination and programming tasks. They are supported by peripheral developers, who contribute either via discussions or programming tasks, often for a limited time. It is unclear what role these peripheral developers play in the programming and communication efforts, as well as the temporary task-related sub-groups in the projects. We mine code-repository data and mailing-list discussions to model the relationships and contributions of developers in a social network and devise a method to analyze the temporal collaboration structures in communication and programming, learning about the strength and stability of social sub-groups in open-source software projects. Our method uses multi-modal social networks on a series of time windows. Previous work has reduced the network structure representing developer collaboration to networks with only one type of interaction, which impedes the simultaneous analysis of more than one type of interaction. We use both communication and version-control data of open-source software projects and model different types of interaction over time. To demonstrate the practicability of our measurement and analysis method, we investigate 10 substantial and popular open-source software projects and show that, if sub-groups evolve, modeling these sub-groups helps predict the future evolution of interaction levels of programmers and groups of developers. Our method allows maintainers and other stakeholders of open-source software projects to assess instabilities and organizational changes in developer interaction and can be applied to different use cases in organizational analysis, such as understanding the dynamics of a specific incident or discussion.
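
The modeling idea described in the abstract (several interaction types observed over a series of time windows, then decomposed into latent sub-groups) can be illustrated with a small sketch. This is not the authors' implementation: the event format, the developer names, the fixed-length windows, the function names `build_tensor`, `cp_als`, and `reconstruct`, and the choice of a plain alternating-least-squares CP (canonical polyadic) decomposition are all assumptions made for illustration.

```python
import numpy as np

def build_tensor(events, developers, modalities, n_windows, window_len):
    """Count interactions in a developer x developer x modality x window tensor.

    `events` is a hypothetical list of (dev_a, dev_b, modality, day) tuples.
    """
    d = {name: i for i, name in enumerate(developers)}
    m = {name: i for i, name in enumerate(modalities)}
    T = np.zeros((len(developers), len(developers), len(modalities), n_windows))
    for a, b, modality, day in events:
        w = day // window_len          # index of the time window
        if w >= n_windows:
            continue                   # ignore events after the last window
        T[d[a], d[b], m[modality], w] += 1
        T[d[b], d[a], m[modality], w] += 1   # treat interactions as undirected
    return T

def unfold(T, mode):
    """Matricize T so that `mode` indexes the rows (C-order columns)."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def khatri_rao(mats):
    """Column-wise Kronecker product; the first matrix varies slowest."""
    rank = mats[0].shape[1]
    out = mats[0]
    for M in mats[1:]:
        out = (out[:, None, :] * M[None, :, :]).reshape(-1, rank)
    return out

def cp_als(T, rank, n_iter=200, seed=0):
    """Fit a rank-`rank` CP model with alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.random((s, rank)) for s in T.shape]
    for _ in range(n_iter):
        for mode in range(T.ndim):
            others = [factors[k] for k in range(T.ndim) if k != mode]
            gram = np.ones((rank, rank))
            for F in others:
                gram *= F.T @ F        # Hadamard product of Gram matrices
            factors[mode] = unfold(T, mode) @ khatri_rao(others) @ np.linalg.pinv(gram)
    return factors

def reconstruct(factors):
    """Rebuild the full tensor from CP factor matrices."""
    shape = [F.shape[0] for F in factors]
    return khatri_rao(factors).sum(axis=1).reshape(shape)

# Toy data: two modalities, three 30-day windows, four made-up developers.
developers = ["ann", "bob", "cy", "dee"]
modalities = ["mail", "cochange"]
events = [
    ("ann", "bob", "mail", 3), ("ann", "bob", "cochange", 5),
    ("cy", "dee", "mail", 40), ("cy", "dee", "cochange", 44),
    ("ann", "bob", "mail", 70), ("bob", "cy", "mail", 75),
]
tensor = build_tensor(events, developers, modalities, n_windows=3, window_len=30)
factors = cp_als(tensor, rank=2)
rel_error = np.linalg.norm(tensor - reconstruct(factors)) / np.linalg.norm(tensor)
print(f"relative reconstruction error: {rel_error:.3f}")
```

In a CP model of this kind, each column of the developer factor matrices can be read as a candidate sub-group, while the matching columns of the modality and time-window factors indicate which interaction type and which windows that sub-group is active in. Forecasting future interaction levels, as mentioned in the abstract, would require an additional time-series step that this sketch does not cover.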

    • Published in

      ACM Transactions on Software Engineering and Methodology, Volume 31, Issue 2
      April 2022
      789 pages
      ISSN: 1049-331X
      EISSN: 1557-7392
      DOI: 10.1145/3492439
      Editor: Mauro Pezzè

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 17 November 2021
      • Accepted: 1 June 2021
      • Revised: 1 May 2021
      • Received: 1 December 2020

      Qualifiers

      • research-article
      • Refereed
