
Large-width machine learning algorithm

  • Regular Paper
  • Published:
Progress in Artificial Intelligence

Abstract

We introduce an algorithm, called Large Width (LW), that produces a multi-category classifier (defined on a distance space) with a large 'sample width,' a notion similar to the classification margin. LW is an incremental instance-based (also known as 'lazy') learning algorithm. Given a sample of labeled and unlabeled examples, it iteratively picks the next unlabeled example and classifies it while maintaining a large distance between each labeled example and its nearest-unlike prototype, where a prototype is either a labeled example or an unlabeled example that has already been classified. Thus, LW gives higher priority to unlabeled points whose classification decision 'interferes' less with the labeled sample. On a collection of UCI benchmark datasets, LW ranks at the top when compared with 11 instance-based learning algorithms (or configurations). When compared with the best of these instance-based learners together with MLP, SVM, a decision tree learner (C4.5), and Naive Bayes, LW ranks second, behind only MLP, which takes first place by a single extra win over LW. The LW algorithm can be implemented as a parallel distributed-processing algorithm with a high speedup factor, and it is suitable for any distance space, with a distance function that need not satisfy the conditions of a metric.
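
To fix ideas, the following is a minimal, illustrative sketch of an LW-style greedy loop. It is a reconstruction from the description above, not the authors' implementation; the class name LWSketch, the brute-force search over all remaining points and labels, and the exact way the width score is computed here are assumptions.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;

// Illustrative sketch only (not the authors' code): greedily classify
// unlabeled points so that each labeled example stays as far as possible
// from its nearest prototype of a different class ("nearest-unlike").
class LWSketch {
    static int[] classify(double[][] labeled, int[] labels,
                          double[][] unlabeled, int numClasses,
                          BiFunction<double[], double[], Double> dist) {
        // Prototypes start as the labeled sample; classified unlabeled
        // points are promoted to prototypes as the loop proceeds.
        List<double[]> protos = new ArrayList<>(Arrays.asList(labeled));
        List<Integer> protoLabels = new ArrayList<>();
        for (int y : labels) protoLabels.add(y);
        int[] assigned = new int[unlabeled.length];
        boolean[] done = new boolean[unlabeled.length];
        for (int step = 0; step < unlabeled.length; step++) {
            double bestWidth = Double.NEGATIVE_INFINITY;
            int bestIdx = -1, bestLabel = 0;
            for (int i = 0; i < unlabeled.length; i++) {
                if (done[i]) continue;
                for (int y = 0; y < numClasses; y++) {
                    // Width if unlabeled[i] were committed to class y: the
                    // smallest distance from any labeled example to a
                    // prototype carrying a different label.
                    double width = Double.POSITIVE_INFINITY;
                    for (int j = 0; j < labeled.length; j++) {
                        double nearestUnlike = (labels[j] == y)
                                ? Double.POSITIVE_INFINITY
                                : dist.apply(labeled[j], unlabeled[i]);
                        for (int p = 0; p < protos.size(); p++) {
                            if (protoLabels.get(p) != labels[j]) {
                                nearestUnlike = Math.min(nearestUnlike,
                                        dist.apply(labeled[j], protos.get(p)));
                            }
                        }
                        width = Math.min(width, nearestUnlike);
                    }
                    if (width > bestWidth) {
                        bestWidth = width;
                        bestIdx = i;
                        bestLabel = y;
                    }
                }
            }
            // Commit the least "interfering" choice and promote the point
            // to prototype status for later iterations.
            done[bestIdx] = true;
            assigned[bestIdx] = bestLabel;
            protos.add(unlabeled[bestIdx]);
            protoLabels.add(bestLabel);
        }
        return assigned;
    }
}

This brute-force loop serves only to make the selection rule concrete; an efficient or parallel implementation, as discussed in the paper, would organize the same computation differently.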

Acknowledgements

This work was supported in part by the Ministry of Science & Technology, ISRAEL. The authors thank the reviewers for their thoughtful comments and suggestions.

Author information

Corresponding author

Correspondence to Joel Ratsaby.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

For the experiment reported in Table 2, the algorithms' parameter settings are as follows (an illustrative WEKA API sketch for one of these configurations appears after the list):

(1)

meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.LW -- -M 0 -I 0 -R true' -4523450618538717400

(2)

meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.IBk -- -K 1 -W 0 -A \"weka.core.neighboursearch.LinearNNSearch -A /\"weka.core.EuclideanDistance -R first-last/\\"' -4523450618538717400

(3)

meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.IBk -- -K 1 -W 0 -I -A \"weka.core.neighboursearch.LinearNNSearch -A /\"weka.core.EuclideanDistance -R first-last/\\"' -4523450618538717400

(4)

meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.IBk -- -K 1 -W 0 -F -A \"weka.core.neighboursearch.LinearNNSearch -A /\"weka.core.EuclideanDistance -R first-last/\\"' -4523450618538717400

(5)

meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.IBk -- -K 5 -W 0 -X -A \"weka.core.neighboursearch.LinearNNSearch -A /\"weka.core.EuclideanDistance -R first-last/\\"' -4523450618538717400

(6)

meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.IBk -- -K 5 -W 0 -X -I -A \"weka.core.neighboursearch.LinearNNSearch -A /\"weka.core.EuclideanDistance -R first-last/\\"' -4523450618538717400

(7)

meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.IBk -- -K 5 -W 0 -X -F -A \"weka.core.neighboursearch.LinearNNSearch -A /\"weka.core.EuclideanDistance -R first-last/\\"' -4523450618538717400

(8)

meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.IBk -- -K 10 -W 0 -X -A \"weka.core.neighboursearch.LinearNNSearch -A /\"weka.core.EuclideanDistance -R first-last/\\"' -4523450618538717400

(9)

meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.IBk -- -K 10 -W 0 -X -I -A \"weka.core.neighboursearch.LinearNNSearch -A /\"weka.core.EuclideanDistance -R first-last/\\"' -4523450618538717400

(10)

meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.IBk -- -K 10 -W 0 -X -F -A \"weka.core.neighboursearch.LinearNNSearch -A /\"weka.core.EuclideanDistance -R first-last/\\"' -4523450618538717400

(11)

meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.KStar -- -B 20 -M a' -4523450618538717400

(12)

meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.LWL -- -U 0 -K -1 -A \"weka.core.neighboursearch.LinearNNSearch -A /\"weka.core.EuclideanDistance -R first-last/\\" -W trees.DecisionStump' -4523450618538717400
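
As a reproduction aid, the following is a minimal sketch, not taken from the paper, of how configuration (2) above (1-nearest-neighbour IBk wrapped in FilteredClassifier with the ClassBalancer filter) might be assembled and cross-validated through the WEKA Java API. The class name Table2Config2 and the dataset path data.arff are placeholders.

import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.lazy.IBk;
import weka.classifiers.meta.FilteredClassifier;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.supervised.instance.ClassBalancer;

// Sketch of Table 2, configuration (2): ClassBalancer filter plus 1-NN IBk
// inside a FilteredClassifier, evaluated by 10-fold cross-validation.
public class Table2Config2 {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("data.arff"); // placeholder dataset
        data.setClassIndex(data.numAttributes() - 1);  // class = last attribute

        ClassBalancer balancer = new ClassBalancer();  // "-num-intervals 10"
        balancer.setOptions(new String[] {"-num-intervals", "10"});

        IBk knn = new IBk();
        knn.setKNN(1); // "-K 1"; IBk's default LinearNNSearch with
                       // EuclideanDistance matches the "-A" part of the listing.

        FilteredClassifier fc = new FilteredClassifier();
        fc.setFilter(balancer);
        fc.setClassifier(knn);

        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(fc, data, 10, new Random(1)); // Random(1) seeds the fold split
        System.out.println(eval.toSummaryString());
    }
}

The same pattern applies to the other configurations by swapping in the corresponding base classifier and options.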

For the experiment displayed in Table 6, the algorithms’ parameter settings are as follows:

(1)

meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.LW -- -M 0 -I 0 -R true' -4523450618538717400

(2)

meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W lazy.IBk -- -K 5 -W 0 -X -F -A \"weka.core.neighboursearch.LinearNNSearch -A /\"weka.core.EuclideanDistance -R first-last/\\"' -4523450618538717400

(3)

meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W functions.MultilayerPerceptron -- -L 0.3 -M 0.2 -N 500 -V 0 -S 0 -E 20 -H a' -4523450618538717400

(4)

meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W trees.J48 -- -C 0.25 -M 2' -4523450618538717400

(5)

meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W functions.SMO -- -C 1.0 -L 0.001 -P 1.0E-12 -N 0 -V -1 -W 1 -K \"functions.supportVector.PolyKernel -E 1.0 -C 250007\" -calibrator \"functions.Logistic -R 1.0E-8 -M -1 -num-decimal-places 4\"' -4523450618538717400

(6)

meta.FilteredClassifier '-F \"supervised.instance.ClassBalancer -num-intervals 10\" -S 1 -W bayes.NaiveBayes' -4523450618538717400

About this article

Cite this article

Anthony, M., Ratsaby, J. Large-width machine learning algorithm. Prog Artif Intell 9, 275–285 (2020). https://doi.org/10.1007/s13748-020-00212-4

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s13748-020-00212-4
