Abstract
We introduce an algorithm, called Large Width (LW), that produces a multi-category classifier (defined on a distance space) with a large ‘sample width,’ a notion similar to classification margin. LW is an incremental instance-based (also known as ‘lazy’) learning algorithm. Given a sample of labeled and unlabeled examples, it iteratively picks the next unlabeled example and classifies it while maintaining a large distance between each labeled example and its nearest-unlike prototype. (A prototype is either a labeled example or an unlabeled example that has already been classified.) Thus, LW gives higher priority to unlabeled points whose classification decision ‘interferes’ less with the labeled sample. On a collection of UCI benchmark datasets, LW ranks at the top when compared to 11 instance-based learning algorithms (or configurations). Compared to the best candidates among instance-based learners, MLP, SVM, a decision tree learner (C4.5) and Naive Bayes, LW ranks second, behind only MLP, which takes first place by a single extra win over LW. The LW algorithm can be implemented with parallel distributed processing to achieve a high speedup factor, and it is suitable for any distance space; the distance function need not satisfy the conditions of a metric.
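The greedy procedure described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `lw_classify`, `sample_width` and the Euclidean `dist` are illustrative assumptions, and any dissimilarity function could be substituted, since the distance need not be a metric.

```python
import math

def dist(x, y):
    # Illustrative dissimilarity; any function would do, since the
    # distance need not satisfy the metric axioms.
    return math.dist(x, y)

def sample_width(labeled, prototypes):
    # Smallest distance from a labeled example to its nearest-unlike
    # prototype (a prototype carrying a different label).
    return min(
        dist(x, p)
        for x, cx in labeled
        for p, cp in prototypes
        if cp != cx
    )

def lw_classify(labeled, unlabeled):
    """Greedily label the unlabeled points, always choosing the
    (point, label) pair that keeps the sample width largest."""
    prototypes = list(labeled)      # labeled examples start as prototypes
    classes = sorted({c for _, c in labeled})
    assigned = {}
    remaining = list(unlabeled)
    while remaining:
        best = None
        for u in remaining:
            for c in classes:
                w = sample_width(labeled, prototypes + [(u, c)])
                if best is None or w > best[0]:
                    best = (w, u, c)
        _, u, c = best
        prototypes.append((u, c))   # a classified point becomes a prototype
        assigned[u] = c
        remaining.remove(u)
    return assigned
```

The point whose best labeling leaves the width largest is the one that ‘interferes’ least with the labeled sample, so it is classified first, matching the priority rule described above.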
Acknowledgements
This work was supported in part by the Ministry of Science & Technology, Israel. The authors thank the reviewers for their thoughtful comments and suggestions.
Appendix
For the experiment displayed in Table 2, the algorithms’ parameter settings are as follows:
(1) meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.LW -- -M 0 -I 0 -R true' -4523450618538717400
(2) meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.IBk -- -K 1 -W 0 -A \"weka.core.neighboursearch.LinearNNSearch -A \\\"weka.core.EuclideanDistance -R first-last\\\"\"' -4523450618538717400
(3) meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.IBk -- -K 1 -W 0 -I -A \"weka.core.neighboursearch.LinearNNSearch -A \\\"weka.core.EuclideanDistance -R first-last\\\"\"' -4523450618538717400
(4) meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.IBk -- -K 1 -W 0 -F -A \"weka.core.neighboursearch.LinearNNSearch -A \\\"weka.core.EuclideanDistance -R first-last\\\"\"' -4523450618538717400
(5) meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.IBk -- -K 5 -W 0 -X -A \"weka.core.neighboursearch.LinearNNSearch -A \\\"weka.core.EuclideanDistance -R first-last\\\"\"' -4523450618538717400
(6) meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.IBk -- -K 5 -W 0 -X -I -A \"weka.core.neighboursearch.LinearNNSearch -A \\\"weka.core.EuclideanDistance -R first-last\\\"\"' -4523450618538717400
(7) meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.IBk -- -K 5 -W 0 -X -F -A \"weka.core.neighboursearch.LinearNNSearch -A \\\"weka.core.EuclideanDistance -R first-last\\\"\"' -4523450618538717400
(8) meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.IBk -- -K 10 -W 0 -X -A \"weka.core.neighboursearch.LinearNNSearch -A \\\"weka.core.EuclideanDistance -R first-last\\\"\"' -4523450618538717400
(9) meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.IBk -- -K 10 -W 0 -X -I -A \"weka.core.neighboursearch.LinearNNSearch -A \\\"weka.core.EuclideanDistance -R first-last\\\"\"' -4523450618538717400
(10) meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.IBk -- -K 10 -W 0 -X -F -A \"weka.core.neighboursearch.LinearNNSearch -A \\\"weka.core.EuclideanDistance -R first-last\\\"\"' -4523450618538717400
(11) meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.KStar -- -B 20 -M a' -4523450618538717400
(12) meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.LWL -- -U 0 -K -1 -A \"weka.core.neighboursearch.LinearNNSearch -A \\\"weka.core.EuclideanDistance -R first-last\\\"\" -W trees.DecisionStump' -4523450618538717400
For the experiment displayed in Table 6, the algorithms’ parameter settings are as follows:
(1) meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.LW -- -M 0 -I 0 -R true' -4523450618538717400
(2) meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.IBk -- -K 5 -W 0 -X -F -A \"weka.core.neighboursearch.LinearNNSearch -A \\\"weka.core.EuclideanDistance -R first-last\\\"\"' -4523450618538717400
(3) meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W functions.MultilayerPerceptron -- -L 0.3 -M 0.2 -N 500 -V 0 -S 0 -E 20 -H a' -4523450618538717400
(4) meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W trees.J48 -- -C 0.25 -M 2' -4523450618538717400
(5) meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W functions.SMO -- -C 1.0 -L 0.001 -P 1.0E-12 -N 0 -V -1 -W 1 -K \"functions.supportVector.PolyKernel -E 1.0 -C 250007\" -calibrator \"functions.Logistic -R 1.0E-8 -M -1 -num-decimal-places 4\"' -4523450618538717400
(6) meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W bayes.NaiveBayes' -4523450618538717400
Cite this article
Anthony, M., Ratsaby, J. Large-width machine learning algorithm. Prog Artif Intell 9, 275–285 (2020). https://doi.org/10.1007/s13748-020-00212-4