Abstract
This paper describes a biomolecular classification methodology based on multilayer perceptron neural networks. The resulting system is used to classify enzymes found in the Protein Data Bank. The primary goal of classification here is to infer the function of an unknown enzyme by analysing its structural similarity to a given family of enzymes. A new codification scheme was devised to convert the primary structure of enzymes into a real-valued vector. The system was tested with different numbers of neural networks, training set sizes and training epochs. In all experiments, the proposed system achieved a higher accuracy rate than profile hidden Markov models. The results demonstrate the robustness of this approach and the possibility of implementing fast and efficient biomolecular classification using neural networks.
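As a rough illustration of the general idea (not the codification scheme of the paper, which the abstract does not specify), the sketch below maps an amino acid sequence to a fixed-length real-valued vector and passes it through a small multilayer perceptron. The use of the Kyte-Doolittle hydropathy index as the per-residue value, the scaling by 4.5, the zero padding, the vector length, the layer sizes and the random untrained weights are all assumptions made for this example; `encode` and `mlp_forward` are hypothetical names.

```python
import math
import random

# Kyte-Doolittle hydropathy index, keyed by one-letter amino acid code.
# Using it as the encoding is an assumption, not the paper's scheme.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(sequence, length=8):
    """Return a fixed-length vector of hydropathy values scaled to [-1, 1].

    Sequences longer than `length` are truncated; shorter ones are padded
    with 0.0. Unknown residue codes are also mapped to 0.0.
    """
    values = [KD.get(aa, 0.0) / 4.5 for aa in sequence.upper()[:length]]
    return values + [0.0] * (length - len(values))


def mlp_forward(x, w_hidden, w_out):
    """Forward pass: one tanh hidden layer, one sigmoid output unit."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    z = sum(w * h for w, h in zip(w_out, hidden))
    return 1.0 / (1.0 + math.exp(-z))


# Untrained, randomly initialised weights, for illustration only.
rng = random.Random(0)
n_in, n_hidden = 8, 4
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]

vector = encode("MKVLAI")
score = mlp_forward(vector, w_hidden, w_out)
print(vector)  # eight real values: six residues plus two padding zeros
print(score)   # a value in (0, 1); meaningless until the network is trained
```

In a real classifier the weights would be learned from labelled enzyme families, and one such network (or one output unit) per family could be compared to decide membership.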
Cite this article
Weinert, W.R., Lopes, H.S. Neural networks for protein classification. Appl-Bioinformatics 3, 41–48 (2004). https://doi.org/10.2165/00822942-200403010-00006
DOI: https://doi.org/10.2165/00822942-200403010-00006