The Degree Distribution of Networks: Statistical Model Selection

Bacterial Molecular Networks

Part of the book series: Methods in Molecular Biology (MIMB, volume 804)

Abstract

The degree distribution is widely regarded as an important characteristic of network data. Many biological networks have been labelled scale-free because their degree distribution can be approximately described by a power-law probability distribution. This chapter presents a formal statistical model selection procedure that determines which functional form, from a collection of specified models, best describes the degree distribution of network data. The empirical degree distribution is treated as belonging to a class of probability models, and the model that best describes the data is determined in a maximum likelihood framework. These statistical tests do not confirm the true underlying distribution of the observed data; they show only which models from a chosen set best describe the data. The approaches should therefore be viewed as providing evidence for which probability models do not adequately (or optimally) describe the data, and as giving an indication of the underlying sampling and true interaction properties of the system considered.
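As a rough illustration of this kind of procedure, the sketch below fits three one-parameter candidate models to a list of node degrees by maximum likelihood and ranks them with the Akaike information criterion (AIC). Everything specific here is an assumption made for illustration and is not taken from the chapter: the candidate set (a zeta power law, a geometric distribution and a zero-truncated Poisson, all on degrees k ≥ 1), the use of AIC and Akaike weights, and the helper names `zeta`, `maximise` and `select_model`.

```python
import math
from collections import Counter


def zeta(s, n_terms=1000):
    """Approximate the Riemann zeta function for s > 1.

    Sums the first n_terms - 1 terms directly and adds an
    Euler-Maclaurin estimate of the remaining tail.
    """
    head = sum(k ** -s for k in range(1, n_terms))
    tail = (n_terms ** (1 - s) / (s - 1)
            + 0.5 * n_terms ** -s
            + s * n_terms ** (-s - 1) / 12)
    return head + tail


def maximise(f, lo, hi, tol=1e-7):
    """Golden-section search for the maximum of a unimodal f on [lo, hi]."""
    ratio = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if f(c) < f(d):
            a = c  # the maximum lies in [c, b]
        else:
            b = d  # the maximum lies in [a, d]
    x = (a + b) / 2
    return x, f(x)


def select_model(degrees):
    """Fit each candidate by maximum likelihood and rank the fits by AIC."""
    counts = Counter(degrees)
    if min(counts) < 1:
        raise ValueError("all degrees must be >= 1 for these candidates")
    # Sufficient statistics of the observed degree sequence.
    n = sum(counts.values())
    sum_k = sum(k * c for k, c in counts.items())
    sum_log_k = sum(math.log(k) * c for k, c in counts.items())
    sum_log_fact = sum(math.lgamma(k + 1) * c for k, c in counts.items())

    # Hypothetical candidate set: (log-likelihood, lower bound, upper bound).
    candidates = {
        "power law (zeta)": (
            lambda g: -g * sum_log_k - n * math.log(zeta(g)), 1.01, 6.0),
        "geometric": (
            lambda p: n * math.log(p) + (sum_k - n) * math.log(1 - p),
            1e-6, 1 - 1e-6),
        "zero-truncated Poisson": (
            lambda lam: (sum_k * math.log(lam) - n * lam - sum_log_fact
                         - n * math.log(-math.expm1(-lam))),
            1e-6, float(max(counts))),
    }

    fits = []
    for name, (loglik, lo, hi) in candidates.items():
        param, ll = maximise(loglik, lo, hi)
        # Each candidate has exactly one free parameter.
        fits.append({"model": name, "parameter": param,
                     "loglik": ll, "aic": 2 * 1 - 2 * ll})

    # Akaike weights: relative support within the chosen candidate set only.
    best = min(f["aic"] for f in fits)
    raw = [math.exp(-(f["aic"] - best) / 2) for f in fits]
    for f, w in zip(fits, raw):
        f["weight"] = w / sum(raw)
    return sorted(fits, key=lambda f: f["aic"])


if __name__ == "__main__":
    # Synthetic degree sequence with frequencies proportional to k^-2.5.
    degrees = [k for k in range(1, 51)
               for _ in range(round(10000 * k ** -2.5))]
    for fit in select_model(degrees):
        print(fit["model"], round(fit["parameter"], 3), round(fit["aic"], 1))
```

Because every candidate in this sketch has a single parameter, the AIC ranking coincides with the ranking by maximised likelihood; the penalty term only changes the order when candidates differ in their number of parameters. The weights are relative to the chosen set, so a high weight means only that a model describes the data better than the other candidates supplied.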

Acknowledgments

This work has been funded by the Wellcome Trust and the BBSRC. MPHS is a Royal Society Wolfson Research Merit Award holder.

Author information

Corresponding author

Correspondence to Michael P. H. Stumpf.

Copyright information

© 2012 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Kelly, W.P., Ingram, P.J., Stumpf, M.P.H. (2012). The Degree Distribution of Networks: Statistical Model Selection. In: van Helden, J., Toussaint, A., Thieffry, D. (eds) Bacterial Molecular Networks. Methods in Molecular Biology, vol 804. Springer, New York, NY. https://doi.org/10.1007/978-1-61779-361-5_13

  • DOI: https://doi.org/10.1007/978-1-61779-361-5_13

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-61779-360-8

  • Online ISBN: 978-1-61779-361-5

  • eBook Packages: Springer Protocols
