Abstract
We describe and evaluate two algorithms for the Neyman-Pearson (NP) classification problem, which has recently been shown to be of particular importance for bipartite ranking. NP classification is a nonconvex problem involving a constraint on the false negative rate. We investigate a batch algorithm based on DC programming and a stochastic gradient method well suited to large-scale datasets. Empirical evidence illustrates the potential of the proposed methods.
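To make the setting concrete, the constrained problem can be attacked with stochastic primal-dual (Arrow-Hurwicz-Uzawa-style) updates on a convex surrogate: minimize a hinge surrogate of the false positive rate subject to a hinge surrogate of the false negative rate staying below a level alpha. The sketch below is a hypothetical illustration of that scheme, not the paper's exact algorithm; the function names and hyperparameters are assumptions.

```python
import numpy as np

def hinge_grad(w, x, y):
    # subgradient of max(0, 1 - y * w.x) with respect to w
    return -y * x if y * (x @ w) < 1 else np.zeros_like(x)

def np_classifier_sgd(X, y, alpha=0.1, eta=0.05, epochs=50, seed=0):
    """Sketch of stochastic primal-dual updates for NP classification:
    minimize the hinge surrogate of the false positive rate subject to
    the hinge surrogate of the false negative rate being <= alpha.
    Labels y are in {-1, +1}; +1 is the class whose errors are constrained."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, lam = np.zeros(d), 1.0          # primal weights, dual multiplier
    pos, neg = y == 1, y == -1
    for _ in range(epochs):
        # primal pass: SGD on the Lagrangian; positives carry weight lam
        for i in rng.permutation(n):
            weight = lam / pos.sum() if y[i] == 1 else 1.0 / neg.sum()
            w -= eta * weight * hinge_grad(w, X[i], y[i])
        # dual ascent on the false-negative constraint, projected to lam >= 0
        fn_surrogate = np.mean(np.maximum(0.0, 1.0 - X[pos] @ w))
        lam = max(0.0, lam + eta * (fn_surrogate - alpha))
    return w, lam
```

The per-epoch dual step keeps only one extra scalar in memory, which is what makes this style of update attractive for large-scale data; a batch DC-programming approach would instead solve a sequence of convex subproblems obtained by linearizing the concave part of a nonconvex (e.g., ramp-loss) surrogate.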
Index Terms
- Batch and online learning algorithms for nonconvex Neyman-Pearson classification