research-article

Batch and online learning algorithms for nonconvex Neyman-Pearson classification

Published: 06 May 2011

Abstract

We describe and evaluate two algorithms for the Neyman-Pearson (NP) classification problem, which has recently been shown to be of particular importance for bipartite ranking problems. NP classification is a nonconvex problem involving a constraint on the false negative rate. We investigate a batch algorithm based on DC programming and a stochastic gradient method well suited to large-scale datasets. Empirical evidence illustrates the potential of the proposed methods.
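To make the constraint on the false negative rate concrete, the following is a minimal sketch of the simplest NP-style decision: given real-valued scores from some already trained scorer, choose the highest threshold whose empirical false negative rate stays within a budget, which in turn gives the lowest false positive rate among thresholds on those scores. This is not the DC programming or stochastic gradient algorithm of the article. The function name `np_threshold`, the parameter `max_fnr`, and the +1/-1 label convention are assumptions made for illustration.

```python
import math


def np_threshold(scores, labels, max_fnr):
    """Return (threshold, fnr, fpr) for the rule "predict positive if score >= threshold".

    Hypothetical helper: labels are assumed to be +1 (positive) or -1 (negative),
    and both classes are assumed to be present.
    """
    positives = sorted(s for s, y in zip(scores, labels) if y == 1)
    negatives = [s for s, y in zip(scores, labels) if y == -1]

    # Largest number of positives that may be missed without exceeding the budget.
    allowed_misses = math.floor(max_fnr * len(positives))

    if allowed_misses >= len(positives):
        # Budget allows missing every positive: reject everything.
        threshold = math.inf
    else:
        # Positives scoring strictly below this value are the ones missed.
        threshold = positives[allowed_misses]

    fnr = sum(s < threshold for s in positives) / len(positives)
    fpr = sum(s >= threshold for s in negatives) / len(negatives)
    return threshold, fnr, fpr


if __name__ == "__main__":
    scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
    labels = [-1, 1, -1, 1, 1, -1]
    print(np_threshold(scores, labels, max_fnr=0.34))
```

Thresholding only adjusts the operating point of a fixed scorer; learning the scorer itself under the constraint is the harder, nonconvex problem the abstract refers to.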

      • Published in

        ACM Transactions on Intelligent Systems and Technology, Volume 2, Issue 3
        April 2011
        259 pages
        ISSN: 2157-6904
        EISSN: 2157-6912
        DOI: 10.1145/1961189

        Copyright © 2011 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 6 May 2011
        • Accepted: 1 August 2010
        • Revised: 1 July 2010
        • Received: 1 April 2010

        Qualifiers

        • research-article
        • Research
        • Refereed
