EigenMS: De Novo Analysis of Peptide Tandem Mass Spectra by Spectral Graph Partitioning

  • Conference paper
Research in Computational Molecular Biology (RECOMB 2005)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3500)

Abstract

We report on a new de novo peptide sequencing algorithm that uses spectral graph partitioning. In this approach, relationships between m/z peaks are represented by attractive and repulsive springs, and the vibrational modes of the spring system are used to infer information about the peaks (such as “likely b-ion” or “likely y-ion”). We demonstrate the effectiveness of this approach by comparison with other de novo sequencers on test sets of ion-trap and QTOF spectra, including spectra of mixtures of peptides. On all data sets we outperform the other sequencers. Along with spectral graph theory techniques, EigenMS incorporates another improvement of independent interest: robust statistical methods for recalibration of time-of-flight mass measurements. Robust recalibration greatly outperforms simple least-squares recalibration, achieving about three times the accuracy for one QTOF data set.
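The spring picture in the abstract can be illustrated with a small sketch. The abstract does not give the paper's actual matrix construction, so the code below assumes one standard formulation: a signed graph Laplacian in which positive weights act as attractive springs and negative weights as repulsive springs, with the lowest-energy vibrational mode (the eigenvector of the smallest eigenvalue) splitting the peaks into two groups. The function names, the toy peak indices, and the spring weights are all hypothetical and are not taken from EigenMS.

```python
import numpy as np


def signed_laplacian(n, springs):
    """Build a signed Laplacian from (i, j, weight) springs.

    Assumption: weight > 0 is an attractive spring (peaks likely of the
    same ion type), weight < 0 is a repulsive spring (peaks likely of
    different ion types). This is a generic construction, not
    necessarily the one used in the paper.
    """
    W = np.zeros((n, n))
    for i, j, w in springs:
        W[i, j] = W[j, i] = w
    # Degrees use absolute weights so the matrix stays positive semidefinite.
    D = np.diag(np.abs(W).sum(axis=1))
    return D - W


def partition_peaks(n, springs):
    """Split peaks into two groups using the lowest-energy mode."""
    L = signed_laplacian(n, springs)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(L)
    mode = eigvecs[:, 0]
    # Eigenvectors are defined up to sign; fix it so peak 0 is positive.
    if mode[0] < 0:
        mode = -mode
    return mode > 0


# Hypothetical toy spectrum: peaks 0-2 form one ladder, peaks 3-5 the other.
springs = [
    (0, 1, 1.0), (1, 2, 1.0),                 # attractive within first ladder
    (3, 4, 1.0), (4, 5, 1.0),                 # attractive within second ladder
    (0, 5, -1.0), (1, 4, -1.0), (2, 3, -1.0), # repulsive between ladders
]
labels = partition_peaks(6, springs)
print(labels)
```

In this toy case every spring can be satisfied at once, so the lowest mode has zero energy and its signs separate the two ladders exactly. Real spectra contain noise peaks and conflicting springs, where the eigenvector gives only a soft indication such as "likely b-ion" or "likely y-ion".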

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bern, M., Goldberg, D. (2005). EigenMS: De Novo Analysis of Peptide Tandem Mass Spectra by Spectral Graph Partitioning. In: Miyano, S., Mesirov, J., Kasif, S., Istrail, S., Pevzner, P.A., Waterman, M. (eds) Research in Computational Molecular Biology. RECOMB 2005. Lecture Notes in Computer Science, vol 3500. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11415770_27

  • DOI: https://doi.org/10.1007/11415770_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25866-7

  • Online ISBN: 978-3-540-31950-4

  • eBook Packages: Computer Science, Computer Science (R0)
