research-article

Power laws in software

Published: 07 October 2008

Abstract

A single statistical framework, comprising power law distributions and scale-free networks, seems to fit a wide variety of phenomena. There is evidence that power laws appear in software at the class and function level. We show that distributions with long, fat tails in software are much more pervasive than previously established, appearing at various levels of abstraction, in diverse systems and languages. The implications of this phenomenon cover various aspects of software engineering research and practice.

Published in

ACM Transactions on Software Engineering and Methodology, Volume 18, Issue 1
September 2008
119 pages
ISSN: 1049-331X
EISSN: 1557-7392
DOI: 10.1145/1391984

Copyright © 2008 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 7 October 2008
• Accepted: 1 September 2007
• Revised: 1 January 2007
• Received: 1 August 2005

Qualifiers

• research-article
• Research
• Refereed
