Machine Learning Methodology in Bioinformatics

Springer Handbook of Bio-/Neuroinformatics

Part of the book series: Springer Handbooks (SHB)

Abstract

Machine learning plays a central role in interpreting many of the datasets generated within the biomedical sciences. In this chapter we focus on two core topics within machine learning, supervised and unsupervised learning, and illustrate their application to these datasets. For supervised learning, we focus on support vector machines (SVMs), which belong to the broader family of kernel-based learning methods. Kernels can encode many different types of data, from continuous and discrete data through to graph and sequence data. Because bioinformatics involves such a variety of data types, kernel methods are a method of choice in this context. With unsupervised learning we are interested in discovering structure within data. We start by considering hierarchical cluster analysis (HCA), given its common usage in this context. We then point out the advantages of Bayesian approaches to unsupervised learning, which range from a principled approach to model selection (how many clusters are present in the data) to confidence measures for the assignment of datapoints to clusters. We outline five case studies illustrating these methods. For supervised learning we consider prediction of disease progression in cancer and protein fold prediction. For unsupervised learning we apply HCA to a small colon cancer dataset and then illustrate Bayesian unsupervised learning applied to breast and lung cancer datasets. Finally we consider network inference, which can be approached as an unsupervised or a supervised learning task, depending on the data available.
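As a rough illustration of how a kernel can encode sequence data, the sketch below computes a k-mer spectrum kernel between short DNA strings. The spectrum kernel is one well-known string kernel chosen here for illustration only; the abstract does not say which kernels the chapter uses, and the function names, the toy sequences, and the choice k = 3 are all assumptions.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every contiguous substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(s, t, k=3):
    """Inner product of the k-mer count vectors of s and t."""
    cs, ct = kmer_counts(s, k), kmer_counts(t, k)
    # A Counter returns 0 for k-mers it has not seen, so only shared k-mers contribute.
    return sum(n * ct[kmer] for kmer, n in cs.items())


def gram_matrix(seqs, k=3):
    """Kernel (Gram) matrix over a list of sequences."""
    return [[spectrum_kernel(a, b, k) for b in seqs] for a in seqs]


# Toy sequences (hypothetical, not taken from the chapter).
seqs = ["GATTACA", "GATTAGA", "CCCGGG"]
K = gram_matrix(seqs, k=3)
for row in K:
    print(row)
```

Because the kernel is an explicit inner product of count vectors, the resulting matrix is symmetric and positive semidefinite, so it could in principle be passed to any SVM implementation that accepts a precomputed kernel matrix.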

Abbreviations

DAG: directed acyclic graph
DNA: deoxyribonucleic acid
EGF: epidermal growth factor
EGFR: epidermal growth factor receptor
EM: expectation-maximization
ERK: extracellular signal-regulated kinase
GA: genetic algorithm
HCA: hierarchical cluster analysis
KL: Kullback–Leibler
LIBSVM: library for support vector machines
LOO: leave-one-out
LPD: latent process decomposition
MAP: maximum a posteriori
MCMC: Markov chain Monte Carlo
MKL: multiple kernel learning
ML: maximum likelihood
MRI: magnetic resonance imaging
ODE: ordinary differential equation
PSD: positive semidefinite
QP: quadratic programming
RNA: ribonucleic acid
SDP: semidefinite programming
SVM: support vector machine
TG: triacylglyceride
TSA: test set accuracy
cDNA: complementary DNA
log: logistic regression

Author information

Corresponding author

Correspondence to Colin Campbell.

Copyright information

© 2014 Springer-Verlag

About this chapter

Cite this chapter

Campbell, C. (2014). Machine Learning Methodology in Bioinformatics. In: Kasabov, N. (eds) Springer Handbook of Bio-/Neuroinformatics. Springer Handbooks. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30574-0_12

  • DOI: https://doi.org/10.1007/978-3-642-30574-0_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-30573-3

  • Online ISBN: 978-3-642-30574-0

  • eBook Packages: Engineering, Engineering (R0)
