The Degree Distribution of Networks: Statistical Model Selection

Kelly, William P.; Ingram, Piers J.; Stumpf, Michael P. H.

doi:10.1007/978-1-61779-361-5_13

Part of the book series: Methods in Molecular Biology (MIMB, volume 804)

Abstract

The degree distribution has been viewed as an important characteristic of network data. Many biological networks have been labelled scale-free because their degree distribution can be approximately described by a power-law probability distribution. This chapter presents a formal statistical model selection procedure that determines which functional form, from a collection of specified models, best describes the degree distribution of network data. The empirical degree distribution is viewed as belonging to a class of probability models, and the model that best describes the data is determined in a maximum likelihood framework. These statistical tests do not confirm the true underlying distribution of the observed data; they show only which models from a chosen set best describe the data. The approaches should therefore be viewed as providing evidence for which probability models do not adequately (or optimally) describe the data, and as giving an indication of the underlying sampling and true interaction properties of the system considered.
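The procedure summarised in the abstract can be illustrated with a minimal sketch. It is not the chapter's own code: the candidate set (power law, geometric, Poisson), the use of AIC, BIC and Akaike weights as the comparison criteria, and all function names are assumptions made for this example. Each model has one free parameter and is fitted by maximum likelihood to a list of node degrees, all assumed to be at least 1.

```python
import math

def zeta(gamma, n_terms=1000):
    """Approximate the Riemann zeta function for gamma > 1
    (partial sum plus an Euler-Maclaurin tail correction)."""
    head = sum(j ** -gamma for j in range(1, n_terms))
    tail = n_terms ** (1.0 - gamma) / (gamma - 1.0) + 0.5 * n_terms ** -gamma
    return head + tail

def fit_power_law(degrees, lo=1.05, hi=6.0, tol=1e-6):
    """MLE for P(k) = k^-gamma / zeta(gamma), k >= 1.
    The log-likelihood is concave in gamma, so golden-section search suffices."""
    n = len(degrees)
    sum_log = sum(math.log(k) for k in degrees)

    def loglik(g):
        return -g * sum_log - n * math.log(zeta(g))

    ratio = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if loglik(c) > loglik(d):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    gamma = (a + b) / 2
    return gamma, loglik(gamma)

def fit_geometric(degrees):
    """MLE for P(k) = p (1 - p)^(k - 1), k >= 1; closed form p = 1 / mean."""
    n, total = len(degrees), sum(degrees)
    p = n / total
    excess = total - n  # sum of (k - 1) over all nodes
    loglik = n * math.log(p) + (excess * math.log(1 - p) if excess else 0.0)
    return p, loglik

def fit_poisson(degrees):
    """MLE for a Poisson model; closed form lambda = sample mean."""
    lam = sum(degrees) / len(degrees)
    loglik = sum(k * math.log(lam) - lam - math.lgamma(k + 1) for k in degrees)
    return lam, loglik

def compare_models(degrees):
    """Fit every candidate and return parameter, log-likelihood, AIC, BIC
    and Akaike weight per model (each model has one free parameter)."""
    n = len(degrees)
    fits = {
        "power law": fit_power_law(degrees),
        "geometric": fit_geometric(degrees),
        "Poisson": fit_poisson(degrees),
    }
    table = {
        name: {"param": param, "loglik": ll,
               "aic": 2 * 1 - 2 * ll, "bic": 1 * math.log(n) - 2 * ll}
        for name, (param, ll) in fits.items()
    }
    best_aic = min(row["aic"] for row in table.values())
    raw = {name: math.exp(-0.5 * (row["aic"] - best_aic)) for name, row in table.items()}
    norm = sum(raw.values())
    for name in table:
        table[name]["weight"] = raw[name] / norm
    return table
```

Calling `compare_models(degrees)` on a degree list returns one row per model, and the row with the smallest AIC is the preferred candidate. Because every model here has a single parameter, the AIC ranking coincides with the likelihood ranking; the penalty terms only matter once models of different complexity are added. In line with the abstract, a low AIC marks the best description within the chosen set, not the true underlying distribution.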

Acknowledgments

This work has been funded by the Wellcome Trust and the BBSRC. MPHS is a Royal Society Wolfson Research Merit Award holder.

Author information

Authors and Affiliations

Theoretical Systems Biology group, Imperial College, London, UK
William P. Kelly & Michael P. H. Stumpf
Department of Mathematics, Imperial College, South Kensington Campus, London, UK
Piers J. Ingram

Corresponding author

Correspondence to Michael P. H. Stumpf.

Editor information

Editors and Affiliations

Labo. Bioinformatique des Génomes et des Réseaux (BiGRe), Université Libre de Bruxelles, Bd. du Triomphe, Bruxelles, 1050, Belgium
Jacques van Helden
Labo. Bioinformatique des Génomes et des Réseaux (BiGRe), Université Libre de Bruxelles, Bd. du Triomphe, Bruxelles, 1050, Belgium
Ariane Toussaint
INSERM 1024, Institut de Biologie de l'Ecole Normale, rue d'Ulm 46, Paris, 75230, France
Denis Thieffry

About this protocol

Cite this protocol

Kelly, W.P., Ingram, P.J., Stumpf, M.P.H. (2012). The Degree Distribution of Networks: Statistical Model Selection. In: van Helden, J., Toussaint, A., Thieffry, D. (eds) Bacterial Molecular Networks. Methods in Molecular Biology, vol 804. Springer, New York, NY. https://doi.org/10.1007/978-1-61779-361-5_13

DOI: https://doi.org/10.1007/978-1-61779-361-5_13
Published: 28 October 2011
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-61779-360-8
Online ISBN: 978-1-61779-361-5
eBook Packages: Springer Protocols