
Vol. 29, No. 1, 2022, pp. 18–32

UDC 519.87+519.854
O. A. Kutnenko
Computational complexity of two problems of cognitive data analysis

Abstract:
NP-hardness in the strong sense is proved for two problems of cognitive data analysis. The first is the taxonomy (clustering) problem, i.e., partitioning an unclassified sample of objects into disjoint subsets. The second is the problem of selecting a subset of typical representatives of a classified sample consisting of objects of two images. The first problem can be regarded as a special case of the second, provided that one of the images consists of a single object. To obtain a quantitative quality estimate for the set of selected typical representatives of the sample, the function of rival similarity (FRiS function) is used, which assesses the similarity of an object to its nearest typical object.
Illustr. 1, bibliogr. 18.
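
For orientation only (the abstract itself does not state the formula), the function of rival similarity is usually written in the FRiS literature roughly as follows: for an object $z$, let $r_1(z)$ denote the distance from $z$ to the nearest typical representative of its own image and $r_2(z)$ the distance to the nearest rival (competing) typical representative; then
$$F(z) = \frac{r_2(z) - r_1(z)}{r_2(z) + r_1(z)},$$
so that $F(z) \in [-1, 1]$, and $F(z)$ is close to $1$ when $z$ is much closer to its own representative than to any competitor. This form is an assumption based on the cited FRiS papers, not a quotation from this abstract.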

Keywords: NP-hardness, taxonomy (clustering), typical object (prototype) selection, function of rival similarity.

DOI: 10.33048/daio.2022.29.713

Olga A. Kutnenko 1,2
1. Sobolev Institute of Mathematics,
4 Koptyug Ave., 630090 Novosibirsk, Russia
2. Novosibirsk State University,
2 Pirogov St., 630090 Novosibirsk, Russia
e-mail: olga@math.nsc.ru

Received April 26, 2021
Revised December 2, 2021
Accepted December 3, 2021

References

[1] N. G. Zagoruiko, I. A. Borisova, V. V. Dyubanov, and O. A. Kutnenko, Methods of recognition based on the function of rival similarity, Pattern Recognit. Image Anal. 18 (1), 1–6 (2008).

[2] I. A. Borisova, V. V. Dyubanov, N. G. Zagoruiko, and O. A. Kutnenko, Similarity and compactness, in Proc. 14th All-Russian Conf. “Mathematical Methods for Pattern Recognition”, Suzdal, Russia, Sept. 21–25, 2009 (Maks Press, Moscow, 2009), pp. 89–92 [Russian].

[3] C. J. C. Burges, A tutorial on support vector machines for pattern recognition, Data Mining Knowl. Discov. 2 (2), 121–167 (1998).

[4] M. E. Tipping, The relevance vector machine, in Advances in Neural Information Processing Systems 12 (Proc. 1999 Conf., Denver, CO, USA, Nov. 29–Dec. 4, 1999) (MIT Press, Cambridge, MA, 2000), pp. 652–658.

[5] N. G. Zagoruiko, Applied Methods of Data and Knowledge Analysis (Izd. Inst. Mat., Novosibirsk, 1999) [Russian].

[6] K. V. Vorontsov and A. O. Koloskov, Compactness profiles and prototype object selection in metric classification algorithms, Iskusstv. Intell., No. 2, 30–33 (2006) [Russian].

[7] M. N. Ivanov and K. V. Vorontsov, Prototype selection based on minimization of the complete cross-validation functional, in Proc. 14th All-Russian Conf. “Mathematical Methods for Pattern Recognition”, Suzdal, Russia, Sept. 21–25, 2009 (Maks Press, Moscow, 2009), pp. 119–122 [Russian].

[8] S. Bermejo and J. Cabestany, Learning with nearest neighbor classifiers, Neural Proc. Lett. 13 (2), 159–181 (2001).

[9] V. N. Vapnik, The Task of Learning Pattern Recognition (Znanie, Moscow, 1971) [Russian].

[10] N. G. Zagoruiko, I. A. Borisova, V. V. Dyubanov, and O. A. Kutnenko, A quantitative measure of compactness and similarity in a competitive space, Sib. J. Ind. Math. 13 (1), 59–71 (2010) [Russian] [J. Appl. Ind. Math. 5 (1), 144–154 (2011)].

[11] I. A. Borisova, A taxonomy algorithm FRiS-Tax, Nauchn. Vestn. NGTU, No. 3, 3–12 (2007) [Russian].

[12] I. A. Borisova and N. G. Zagoruiko, A FRiS-TDR algorithm for solving a generalized taxonomy and recognition problem, in Proc. 2nd All-Russian Conf. “Knowledge–Ontology–Theory”, Novosibirsk, Russia, Oct. 22–24, 2009, Vol. 1 (Inst. Mat., Novosibirsk, 2009), pp. 93–102 [Russian].

[13] J. B. MacQueen, Some methods for classification and analysis of multivariate observations, in Proc. 5th Berkeley Symp. Math. Stat. Probab., Berkeley, USA, June 21–July 18, 1965; Dec. 27, 1965–Jan. 7, 1966, Vol. 1 (Univ. California Press, Berkeley, 1967), pp. 281–297.

[14] A. V. Zukhba, NP-completeness of the problem of prototype selection in the nearest neighbor method, Pattern Recognit. Image Anal. 20 (4), 484–494 (2010).

[15] I. A. Borisova, V. V. Dyubanov, O. A. Kutnenko, and N. G. Zagoruiko, Use of the FRiS-function for taxonomy, attribute selection and decision rule construction, in Knowledge Processing and Data Analysis (Rev. Sel. Pap. 1st Int. Conf. KONT 2007, Novosibirsk, Russia, Sept. 14–16, 2007; 1st Int. Conf. KPP 2007, Darmstadt, Germany, Sept. 28–30, 2007) (Springer, Heidelberg, 2011), pp. 256–270 (Lect. Notes Comput. Sci., Vol. 6581).

[16] N. G. Zagoruiko, I. A. Borisova, O. A. Kutnenko, and V. V. Dyubanov, A construction of a compressed description of data using a function of rival similarity, Sib. J. Ind. Math. 16 (1), 29–41 (2013) [Russian] [J. Appl. Ind. Math. 7 (2), 275–286 (2013)].

[17] I. A. Borisova, Computational complexity of the problem of choosing typical representatives in a 2-clustering of a finite set of points in a metric space, Discrete Anal. Oper. Res. 27 (2), 5–16 (2020) [Russian] [J. Appl. Ind. Math. 14 (2), 242–248 (2020)].

[18] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness (Freeman, San Francisco, 1979; Mir, Moscow, 1982 [Russian]).