EN|RU
English version:
Journal of Applied and Industrial Mathematics, 2016, 10:3, 349–355

Volume 23, No 3, 2016, P. 21-34

UDC 519.16 + 519.85
A. V. Kel’manov and A. V. Motkova
Exact pseudopolinomial algorithms for a balanced 2-clustering problem

Abstract:
We consider the strongly NP-hard problem of partitioning a set of Euclidean points into two clusters so as to minimize the sum (over both clusters) of the weighted sum of the squared intracluster distances from the elements of the clusters to their centers. The weights of sums are the sizes of the clusters. The center of one cluster is given as input, while the center of the other cluster is unknown and determined as the average value over all points in the cluster (the geometric center). The two versions of the problems are analyzed in which the cluster sizes are either parts of the input or optimization variables. We present and prove exact pseudopolynomial algorithms in the case of integer components of the input points and fixed space dimension.
Bibliogr. 24.

Keywords: Euclidean space, balanced clustering, NP-hardness, integer inputs, fixed space dimension, exact pseudopolynomial algorithm.

DOI: 10.17377/daio.2016.23.520

Alexander V. Kel’manov 1,2
Anna V. Motkova 2

1. Sobolev Institute of Mathematics,
4 Koptyug Ave., 630090 Novosibirsk, Russia
2. Novosibirsk State University,
2 Pirogov St., 630090 Novosibirsk, Russia
e-mail: kelm@math.nsc.ru, anitamo@mail.ru

Received 25 May 2016

References

[1] A. E. Baburin, E. Kh. Gimadi, N. I. Glebov, and A. V. Pyatkin, The problem of finding a subset of vectors with the maximum total weight, Diskretn. Anal. Issled. Oper., Ser. 2, 14, No. 1, 32–42, 2007. Translated in
J. Appl. Ind. Math., 2, No. 1, 32–38, 2008.

[2] N. Wirth, Algorithms + Data Structures = Programs, Prentice Hall, Upper Saddle River, USA, 1976. Translated under the title Algoritmy + struktury dannykh = programmy, Mir, Moscow, 1985.

[3] E. Kh. Gimadi, Yu. V. Glazkov, and I. A. Rykov, On two problems of choosing some subset of vectors with integer coordinates that has maximum norm of the sum of elements in Euclidean space, Diskretn. Anal. Issled. Oper., 15, No. 4, 30–43, 2008. Translated in J. Appl. Ind. Math., 3, No. 3, 343–352, 2009.

[4] E. Kh. Gimadi, A. V. Kel’manov, M. A. Kel’manova, and S. A. Khamidullin, A posteriori detecting a quasiperiodic fragment with a given number of repetitions in a numerical sequence, Sib. Zh. Ind. Mat., 9, No. 1, 55–74, 2006.

[5] A. V. Dolgushev and A. V. Kel’manov, An approximation algorithm for solving a problem of cluster analysis, Diskretn. Anal. Issled. Oper., 18, No. 2, 29–40, 2011. Translated in J. Appl. Ind. Math., 5, No. 4, 551–558, 2011.

[6] A. V. Dolgushev, A. V. Kel’manov, and V. V. Shenmaier, Polynomial-time approximation scheme for a problem of partitioning a finite set into two clusters, Tr. Inst. Mat. Mekh., 21, No. 3, 100–109, 2015.

[7] A. V. Kel’manov and A. V. Pyatkin, NP-hardness of some quadratic Euclidean 2-clustering problems, Dokl. Akad. Nauk, 464, No. 5, 535–538, 2015. Translated in Dokl. Math., 92, No. 2, 634–637, 2015.

[8] A. V. Kel’manov and A. V. Pyatkin, On the complexity of some quadratic Euclidean 2-clustering problems, Zh. Vychisl. Mat. Mat. Fiz., 56, No. 3, 150–156, 2016. Translated in Comput. Math. Math. Phys., 56, No. 3, 491–497, 2016.

[9] A. V. Kel’manov and S. M. Romanchenko, Pseudopolynomial algorithms for certain computationally hard vector subset and cluster analysis problems, Avtom. Telemekh., No. 2, 156–162, 2012. Translated in Autom. Remote Control, 73, No. 2, 349–354, 2012.

[10] A. V. Kel’manov and S. M. Romanchenko, An FPTAS for a vector subset search problem, Diskretn. Anal. Issled. Oper., 21, No. 3, 41–52, 2014. Translated in J. Appl. Ind. Math., 8, No. 3, 329–336, 2014.

[11] A. V. Kel’manov and V. I. Khandeev, A 2-approximation polynomial algorithm for a clustering problem, Diskretn. Anal. Issled. Oper., 20, No. 4, 36–45, 2013. Translated in J. Appl. Ind. Math., 7, No. 4, 515–521, 2013.

[12] A. V. Kel’manov and V. I. Khandeev, A randomized algorithm for two-cluster partition of a set of vectors, Zh. Vychisl. Mat. Mat. Fiz., 55, No. 2, 335–344, 2015. Translated in Comput. Math. Math. Phys., 55, No. 2, 330–339, 2015.

[13] A. V. Kel’manov and V. I. Khandeev, An exact pseudopolynomial algorithm for a problem of the two-cluster partitioning of a set of vectors, Diskretn. Anal. Issled. Oper., 22, No. 3, 36–48, 2015. Translated in J. Appl. Ind. Math., 9, No. 4, 497–502, 2015.

[14] D. Aloise, A. Deshpande, P. Hansen, and P. Popat, NP-hardness of Euclidean sum-of-squares clustering, Mach. Learn., 75, No. 2, 245–248, 2009.

[15] P. Brucker, On the complexity of clustering problems, in Optimization and Operations Research (Proc. Workshop Held Univ. Bonn, Bonn, Germany, Oct. 2–8, 1977), pp. 45–54, Springer-Verlag, Heidelberg, 1978 (Lect. Notes Econ. Math. Syst., Vol. 157).

[16] W. F. de la Vega, M. Karpinski, C. Kenyon, and Y. Rabani, Polynomial time approximation schemes for metric Min-Sum clustering, in Electron. Colloq. Comput. Complexity, Report No. 25, Hasso-Plattner-Institut Softwaresystemtechnik, Potsdam, 2002.

[17] W. F. de la Vega and C. Kenyon, A randomized approximation scheme for metric Max-Cut, J. Comput. Syst. Sci., 63, 531–541, 2001.

[18] R. A. Fisher, Statistical methods and scientific inference, Hafner Press, New York, 1959.

[19] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman, San Francisco, 1979. Translated under the title Vychislitel’nye mashiny i trudnoreshaemye zadachi, Mir, Moscow, 1982.

[20] E. Kh. Gimadi, A. V. Kel’manov, M. A. Kel’manova, and S. A. Khamidullin, A posteriori detecting a quasiperiodic fragment in a numerical sequence, Pattern Recognit. Image Anal., 18, No. 1, 30–42, 2008.

[21] S. Hasegawa, H. Imai, M. Inaba, N. Katoh, and J. Nakano, Efficient algorithms for variance-based k-clustering, in Computer Graphics and Applications (Proc. 1st Pac. Conf. Comput. Gr. Appl., Seoul, Korea, Aug. 30 – Sept. 2, 1993), pp. 75–89, World Scientific, River Edge, NJ, USA, 1993.

[22] M. Inaba, N. Katoh, and H. Imai, Applications of weighted Voronoi diagrams and randomization to variance-based k-clustering, in Proc. 10th Symp. Comput. Geom., Stony Brook, NY, USA, June 6–8, 1994, pp. 332–339, ACM, New York, 1994.

[23] M. R. Rao, Cluster analysis and mathematical programming, J. Am. Stat. Assoc., 66, 622–626, 1971.

[24] S. Sahni and T. Gonzalez, P-complete approximation problems, J. ACM, 23, 555–566, 1976.
 © Sobolev Institute of Mathematics, 2015