A randomized algorithm for two-cluster partition of a set of vectors

Computational Mathematics and Mathematical Physics

Abstract

A randomized algorithm is substantiated for the strongly NP-hard problem of partitioning a finite set of vectors in Euclidean space into two clusters of given sizes so as to minimize the sum of squared distances from the cluster elements to their centroids. The centroid of one cluster is to be optimized and is defined as the mean of all vectors in that cluster; the centroid of the other cluster is fixed at the origin. For a fixed value of its parameter, the algorithm finds an approximate solution in time that is linear in the space dimension and in the input size of the problem for given values of the relative error and failure probability. Conditions are established under which the algorithm is asymptotically exact and runs in time that is linear in the space dimension and quadratic in the input size of the problem.
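
To make the problem statement concrete, the following is a minimal Python sketch of the objective and of one plausible sampling scheme. It is not the authors' algorithm, whose details the abstract does not give. The function names, the sample size `k`, and the strategy of trying centroids of sample subsets as candidate centers are all assumptions made for illustration. The sketch relies on one fact that follows directly from the objective: for a fixed candidate center x and a fixed cluster size m, the cost is minimized by choosing the m vectors with the largest inner product with x.

```python
import random
from itertools import combinations


def cost(vectors, cluster_idx):
    """Objective from the abstract: squared distances of cluster members to
    their mean, plus squared norms (distance to the origin) of the rest."""
    dim = len(vectors[0])
    members = [vectors[i] for i in cluster_idx]
    centroid = [sum(v[d] for v in members) / len(members) for d in range(dim)]
    chosen = set(cluster_idx)
    total = 0.0
    for i, v in enumerate(vectors):
        ref = centroid if i in chosen else [0.0] * dim
        total += sum((a - b) ** 2 for a, b in zip(v, ref))
    return total


def best_cluster_for_center(vectors, center, m):
    """For a fixed center x, the m vectors with the largest <y, x> minimize
    sum_{y in C} ||y - x||^2 + sum_{y not in C} ||y||^2."""
    scores = [sum(a * b for a, b in zip(v, center)) for v in vectors]
    order = sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:m])


def randomized_partition(vectors, m, k, rng):
    """Hypothetical scheme (an assumption, not taken from the paper): draw k
    vectors with replacement, use the centroid of every nonempty subset of the
    sample as a candidate center, and keep the cheapest resulting cluster."""
    dim = len(vectors[0])
    sample = [rng.choice(vectors) for _ in range(k)]
    best_idx, best_cost = None, float("inf")
    for size in range(1, k + 1):
        for subset in combinations(sample, size):
            center = [sum(v[d] for v in subset) / size for d in range(dim)]
            idx = best_cluster_for_center(vectors, center, m)
            c = cost(vectors, idx)
            if c < best_cost:
                best_idx, best_cost = idx, c
    return best_idx, best_cost


if __name__ == "__main__":
    data = [(10.0, 10.0), (11.0, 9.0), (9.0, 11.0),
            (0.1, 0.0), (0.0, -0.1), (-0.1, 0.1)]
    cluster, value = randomized_partition(data, m=3, k=3, rng=random.Random(0))
    print(cluster, value)
```

In this sketch the number of candidate centers grows as 2^k, so `k` has to stay small; the result depends on the random sample, which is why any guarantee for such a scheme is stated in terms of a relative error and a failure probability.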

Author information

Correspondence to A. V. Kel’manov.

Additional information

Original Russian Text © A.V. Kel’manov, V.I. Khandeev, 2015, published in Zhurnal Vychislitel’noi Matematiki i Matematicheskoi Fiziki, 2015, Vol. 55, No. 2, pp. 335–344.

Cite this article

Kel’manov, A.V., Khandeev, V.I. A randomized algorithm for two-cluster partition of a set of vectors. Comput. Math. and Math. Phys. 55, 330–339 (2015). https://doi.org/10.1134/S096554251502013X

  • DOI: https://doi.org/10.1134/S096554251502013X
