Abstract
We propose a new method for solving the classification problem, based on separating two sets in the space R^d by constructing and separating their ε-nets in a range space with respect to hyperplanes. The separation domain is introduced as the set of values of ε for which the sets can be separated. The paper gives examples of separation domains for random variables with different distributions and proves a theorem on the convergence of the separation domain. The set of all possible ε-nets of a given set is introduced and its properties are proved. The normalized difference between the empirical and theoretical separation curves is proved to converge weakly to the normal distribution, which makes it possible to test a hypothesis about the location of the theoretical separation curve at a specific point.
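The idea of separating ε-nets instead of the full sets can be illustrated with a minimal Python sketch. It is not the authors' algorithm. As an assumption, a uniform random subset stands in for an ε-net, and a plain perceptron stands in for whatever hyperplane search the paper uses. All function names (`sample_net`, `side`, `perceptron_separate`, `wrong_side_fraction`) are hypothetical. The rationale is that if a hyperplane separates genuine ε-nets of the two sets, the half-space on the wrong side of each set contains no net point, so it can hold only a small (roughly ε) fraction of that set.

```python
import random


def sample_net(points, size, rng):
    """Uniform random subset of `points`; an assumed stand-in for an epsilon-net."""
    return rng.sample(points, min(size, len(points)))


def side(w, c, p):
    """Signed value w.p + c; its sign tells which side of the hyperplane p is on."""
    return sum(wi * xi for wi, xi in zip(w, p)) + c


def perceptron_separate(net_a, net_b, max_epochs=1000):
    """Look for (w, c) with w.p + c > 0 on net_a and < 0 on net_b.

    Returns None if no separating hyperplane is found within max_epochs.
    """
    w = [0.0] * len(net_a[0])
    c = 0.0
    labelled = [(p, 1) for p in net_a] + [(p, -1) for p in net_b]
    for _ in range(max_epochs):
        updated = False
        for p, y in labelled:
            if y * side(w, c, p) <= 0:
                # Point is on the wrong side: apply the perceptron update.
                w = [wi + y * xi for wi, xi in zip(w, p)]
                c += y
                updated = True
        if not updated:
            return w, c
    return None


def wrong_side_fraction(points, w, c, label):
    """Fraction of `points` not strictly on the side given by label (+1 or -1)."""
    return sum(1 for p in points if label * side(w, c, p) <= 0) / len(points)


if __name__ == "__main__":
    rng = random.Random(0)
    # Two illustrative point clouds in R^2 (synthetic data, not from the paper).
    set_a = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
    set_b = [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(200)]
    result = perceptron_separate(sample_net(set_a, 15, rng), sample_net(set_b, 15, rng))
    if result is None:
        print("nets not separated within the epoch limit")
    else:
        w, c = result
        print("wrong-side fraction of A:", wrong_side_fraction(set_a, w, c, 1))
        print("wrong-side fraction of B:", wrong_side_fraction(set_b, w, c, -1))
```

The wrong-side fractions printed for the full sets play the role of ε here: the nets may be separable even when the full sets are not, at the cost of a small share of misclassified points.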
Additional information
Translated from Kibernetika i Sistemnyi Analiz, No. 4, July–August, 2016, pp. 134–144.
Cite this article
Ivanchuk, M.A., Malyk, I.V. Solving the Classification Problem Using ε-nets. Cybern Syst Anal 52, 613–622 (2016). https://doi.org/10.1007/s10559-016-9863-9