Abstract
In the past, nearest neighbor algorithms for learning from examples have worked best in domains in which all features had numeric values. In such domains, the examples can be treated as points and distance metrics can use standard definitions. In symbolic domains, a more sophisticated treatment of the feature space is required. We introduce a nearest neighbor algorithm for learning in domains with symbolic features. Our algorithm calculates distance tables that allow it to produce real-valued distances between instances, and attaches weights to the instances to further modify the structure of feature space. We show that this technique produces excellent classification accuracy on three problems that have been studied by machine learning researchers: predicting protein secondary structure, identifying DNA promoter sequences, and pronouncing English text. Direct experimental comparisons with other learning algorithms show that our nearest neighbor algorithm is comparable or superior in all three domains. In addition, our algorithm has advantages in training speed, simplicity, and perspicuity. We conclude that experimental evidence favors the use and continued development of nearest neighbor algorithms for domains such as the ones studied here.
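The distance-table idea the abstract describes can be sketched as follows. This is an illustrative value-difference metric in the spirit of the paper: the distance between two symbolic values is the difference between the class distributions they induce, and per-instance weights stretch the distance to unreliable exemplars. Function names, the L1 normalization, and the toy data are illustrative choices, not the authors' exact implementation.

```python
from collections import defaultdict

def value_difference_tables(examples, labels):
    """Build, for each feature, a table of real-valued distances
    between that feature's symbolic values.  Two values are close
    when they induce similar class distributions."""
    n_features = len(examples[0])
    classes = sorted(set(labels))
    tables = []
    for f in range(n_features):
        # Count how often each value of feature f co-occurs with each class.
        counts = defaultdict(lambda: defaultdict(int))
        for x, y in zip(examples, labels):
            counts[x[f]][y] += 1
        # Distance between values v1, v2: L1 difference of P(class | value).
        table = {}
        values = list(counts)
        for v1 in values:
            n1 = sum(counts[v1].values())
            for v2 in values:
                n2 = sum(counts[v2].values())
                table[v1, v2] = sum(
                    abs(counts[v1][c] / n1 - counts[v2][c] / n2)
                    for c in classes
                )
        tables.append(table)
    return tables

def distance(x, y, tables):
    """Real-valued distance between two symbolic instances:
    sum of per-feature table lookups (unseen pairs default to 1)."""
    return sum(t.get((a, b), 1.0) for a, b, t in zip(x, y, tables))

def classify(query, examples, labels, tables, weights=None):
    """1-nearest-neighbor prediction.  Optional per-instance weights
    multiply the distance to each stored instance, so exceptional
    (unreliable) instances attract fewer queries."""
    if weights is None:
        weights = [1.0] * len(examples)
    best = min(range(len(examples)),
               key=lambda i: weights[i] * distance(query, examples[i], tables))
    return labels[best]
```

On a toy training set where one feature perfectly predicts the class and another is pure noise, the learned table assigns distance 0 between the noise feature's values, so nearest-neighbor lookups effectively ignore it.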
Cite this article
Cost, S., Salzberg, S. A Weighted Nearest Neighbor Algorithm for Learning with Symbolic Features. Machine Learning 10, 57–78 (1993). https://doi.org/10.1023/A:1022664626993