EN|RU


Том 28, номер 4, 2021 г., Стр. 5-60

УДК 519.87
Васильев И. Л., Ушаков А. В.
Дискретные задачи размещения в машинном обучении

Аннотация:
Задачи размещения составляют довольно широкий класс оптимизационных задач, являющихся одними из базовых объектов исследования в области комбинаторной оптимизации и исследования операций. Помимо большого числа экономических приложений, такие задачи нашли широкое применение и в области машинного обучения. Например, задача кластеризации может быть представлена как задача размещения, в которой необходимо разделить множество клиентов на кластеры, обслуживаемые открытыми предприятиями. В настоящем обзоре планируется кратко проследить, как идеи и подходы, возникшие в области теории размещения, привели к появлению современных популярных алгоритмов машинного обучения, реализованных в большинстве коммерческих пакетов прикладных программ и технических вычислений. Помимо этого, предполагается провести обзор современных точных методов и эвристик, а также некоторых обобщений базовых задач и алгоритмов, возникающих непосредственно в практических приложениях из анализа данных. Отметим, что основное внимание будет уделено дискретным задачам размещения, лежащим, например, в основе многих популярных алгоритмов кластеризации (PAM, affinity propagation и т. д.) Поскольку объём современных данных создаёт существенные трудности для классических алгоритмов ввиду их высокой вычислительной сложности, в обзоре будут рассмотрены современные подходы к реализации упомянутых алгоритмов для анализа больших массивов данных.
Библиогр. 138.

Ключевые слова: машинное обучение, задачи размещения, кластеризация.

DOI: 10.33048/daio.2021.28.714

Васильев Игорь Леонидович 1
Ушаков Антон Владимирович 1
1. Институт динамики систем и теории управления им. В. М. Матросова,
ул. Лермонтова, 134, 664033 Иркутск, Россия
е-mail: vil@icc.ru, aushakov@icc.ru

Статья поступила 30 апреля 2021 г.
После доработки — 17 июня 2021 г.
Принята к публикации 21 июня 2021 г.

Литература

[1] Cooper L. Location-allocation problems // Oper. Res. 1963. Vol. 11, No. 3. P. 331–343.

[2] Cooper L. Heuristic methods for location-allocation problems // SIAM Rev. 1964. Vol. 6, No. 1. P. 37–53.

[3] Plastria F. The Weiszfeld algorithm: Proof, amendments, and extensions // Foundations of Location Analysis. New York: Springer, 2011. P. 357–389.

[4] Lloyd S. Least squares quantization in PCM // IEEE Trans. Inf. Theory. 1982. Vol. 28, No. 2. P. 129–137.

[5] Forgy E. W. Cluster analysis of multivariate data: efficiency versus interpretability of classifications // Biometrics. 1965. Vol. 21, No. 3. P. 768—769.

[6] Banerjee A., Merugu S., Dhillon I. S., Ghosh J. Clustering with Bregman divergences // J. Mach. Learn. Res. 2005. Vol. 6, No. 58. P. 1705–1749.

[7] MacQueen J. Some methods for classification and analysis of multivariate observations // Proc. 5th Berkeley Symp. Math. Stat. Probab. (Berkeley, USA, June 21–July 18, 1965; Dec. 27, 1965–Jan. 7, 1966). Vol. 1. Berkeley: Univ. California Press, 1967. P. 281–297.

[8] Vinod H. D. Integer programming and the theory of grouping // J. Am. Stat. Assoc. 1969. Vol. 64, No. 326. P. 506–519.

[9] Balinski M. L. Integer programming: Methods, uses, computations // Manage. Sci. 1965. Vol. 12, No. 3. P. 253–313.

[10] Efroymson M. A., Ray T. L. A branch-bound algorithm for plant location // Oper. Res. 1966. Vol. 14, No. 3. P. 361–368.

[11] ReVelle C. S., Swain R. W. Central facilities location // Geogr. Anal. 1970. Vol. 2, No. 1. P. 30–42.

[12] Hakimi S. L. Optimal location of switching centers and the absolute centers and medians of a graph // Oper. Res. 1964. Vol. 12, No. 3. P. 450–459.

[13] Hakimi S. L. Optimum distribution of switching centers in a communication network and some related graph theoretic problems // Oper. Res. 1965. Vol. 13, No. 3. P. 462–475.

[14] Kaufman L., Rousseeuw P. J. Clustering by means of medoids // Statistical Data Analysis Based on the L1-Norm and Related Methods. Amsterdam: North-Holland, 1987. P. 405–416.

[15] Charikar M., Guha S., Tardos E., Shmoys D. B. A constant-factor approximation algorithm for the $k$-median problem // J. Comput. Syst. Sci. 2002. Vol. 65, No. 1. P. 129–149.

[16] Balcan M.-F., Blum A., Gupta A. Approximate clustering without the approximation // Proc. 20th Annu. ACM-SIAM Symp. Discrete Algorithms (New York, USA, Jan. 4–6, 2009). Philadelphia: SIAM, 2009. P. 1068–1077.

[17] Ahmadian S., Norouzi-Fard A., Svensson O., Ward J. Better guarantees for $k$-means and Euclidean $k$-median by primal-dual algorithms // SIAM J. Comput. 2020. Vol. 49, No. 4. P. FOCS17-97–FOCS17-156.

[18] Kariv O., Hakimi S. An algorithmic approach to network location problems. II: The $p$-medians // SIAM J. Appl. Math. 1979. Vol. 37, No. 3. P. 539–560.

[19] Megiddo N., Supowit K. J. On the complexity of some common geometric location problems // SIAM J. Comput. 1984. Vol. 13, No. 1. P. 182–196.

[20] Mahajan M., Nimbhorkar P., Varadarajan K. The planar $k$-means problem is NP-hard // Theor. Comput. Sci. 2012. Vol. 442. P. 13–21.

[21] Megiddo N., Zemel E., Hakimi S. L. The maximum coverage location problem // SIAM J. Algebr. Discrete Methods. 1983. Vol. 4, No. 2. P. 253–261.

[22] Aloise D., Deshpande A., Hansen P., Popat P. NP-hardness of Euclidean sum-of-squares clustering // Mach. Learn. 2009. Vol. 75, No. 2. P. 245–248.

[23] Papadimitriou C. H. Worst-case and probabilistic analysis of a geometric location problem // SIAM J. Comput. 1981. Vol. 10, No. 3. P. 542–557.

[24] Rosing K. E., ReVelle C. S., Rosing-Vogelaar H. The $p$-median and its linear programming relaxation: An approach to large problems // J. Oper. Res. Soc. 1979. Vol. 30, No. 9. P. 815–823.

[25] Church R. L. COBRA: A new formulation of the classic $p$-median location problem // Ann. Oper. Res. 2003. Vol. 122, No. 1–4. P. 103–120.

[26] Cornuejols G., Nemhauser G. L., Wolsey L. A. A canonical representation of simple plant location problems and its applications // SIAM J. Algebr. Discrete Methods. 1980. Vol. 1, No. 3. P. 261–272.

[27] Hansen P., Brimberg J., Urosevic D., Mladenovic N. Solving large $p$-median clustering problems by primal-dual variable neighborhood search // Data Min. Knowl. Discov. 2009. Vol. 19, No. 3. P. 351–375.

[28] Elloumi S. A tighter formulation of the $p$-median problem // J. Comb. Optim. 2010. Vol. 19, No. 1. P. 69–83.

[29] Maranzana F. E. On the location of supply points to minimize transport costs // Oper. Res. Q. 1964. Vol. 15, No. 3. P. 261–270.

[30] Teitz M. B., Bart P. Heuristic methods for estimating the generalized vertex median of a weighted graph // Oper. Res. 1968. Vol. 16, No. 5. P. 955–961.

[31] Hartigan J. A., Wong M. A. Algorithm AS 136: A $k$-means clustering algorithm // J. R. Stat. Soc. Ser. C. 1979. Vol. 28, No. 1. P. 100–108.

[32] Church R. L., ReVelle C. S. Theoretical and computational links between the $p$-median, location set-covering, and the maximal covering location problem // Geogr. Anal. 1976. Vol. 8, No. 4. P. 406–415.

[33] Cornuejols G., Fisher M. L., Nemhauser G. L. Location of bank accounts to optimize float: An analytic study of exact and approximate algorithms // Manage. Sci. 1977. Vol. 23, No. 8. P. 789–810.

[34] Arya V., Garg N., Khandekar R., Meyerson A., Munagala K., Pandit V. Local search heuristics for $k$-median and facility location problems // SIAM J. Comput. 2004. Vol. 33, No. 3. P. 544–562.

[35] Gupta A., Tangwongsan K. Simpler analyses of local search algorithms for facility location. Ithaca, NY: Cornell Univ., 2008. (Cornell Univ. Libr. e-Print Archive; arXiv:0809.2554).

[36] Кочетов Ю. А., Пащенко М. Г., Плясунов А. В. О сложности локального поиска в задаче о $p$-медиане // Дискрет. анализ и исслед. операций. Сер. 2. 2005. Т. 12, № 2. С. 44–71.

[37] Alekseeva E. V., Kochetov Yu. A., Plyasunov A. V. Complexity of local search for the $p$-median problem // Eur. J. Oper. Res. 2008. Vol. 191, No. 3. P. 736–752.

[38] Whitaker R. A. A fast algorithm for the greedy interchange for large-scale clustering and median location problems // INFOR. 1983. Vol. 21. P. 95–108.

[39] Whitaker R. A. Some interchange algorithms for median location problems // Environ. Plann. Ser. B. 1982. Vol. 9, No. 2. P. 119–129.

[40] Densham P. J., Rushton G. A more efficient heuristic for solving large $p$-median problems // Pap. Reg. Sci. 1992. Vol. 71, No. 3. P. 307–329.

[41] Hansen P., Mladenovic N. Variable neighborhood search for the $p$-median // Locat. Sci. 1997. Vol. 5, No. 4. P. 207–226.

[42] Schubert E., Rousseeuw P. J. Faster $k$-medoids clustering: Improving the PAM, CLARA, and CLARANS algorithms // Similarity Search and Applications. Proc. 12th Int. Conf. (Newark, USA, Oct. 2–4, 2019). Cham: Springer, 2019. P. 171–187. (Lect. Notes Comput. Sci.; Vol. 11807).

[43] Resende M. G. C., Werneck R. F. A fast swap-based local search procedure for location problems // Ann. Oper. Res. 2007. Vol. 150, No. 1. P. 205–230.

[44] Feo T. A., Resende M. G. C. Greedy randomized adaptive search procedures // J. Glob. Optim. 1995. Vol. 6, No. 2. P. 109–133.

[45] Resende M. G. C., Werneck R. F. A hybrid heuristic for the $p$-median problem // J. Heuristics. 2004. Vol. 10, No. 1. P. 59–88.

[46] Mirzasoleiman B., Badanidiyuru A., Karbasi A., Vondrák J., Krause A. Lazier than lazy greedy // Proc. 29th AAAI Conf. Artificial Intelligence (Austin, USA, Jan. 25–30, 2015). Palo Alto: AAAI Press, 2015. P. 1812–1818.

[47] Tiwari M., Zhang M. J., Mayclin J., Thrun S. Bandit-PAM: Almost linear time $k$-medoids clustering via multi-armed bandits // Proc. 34th Conf. Neural Information Processing Systems (Vancouver, Canada, Dec. 6–12, 2020). Red Hook: Curran Assoc., 2020. P. 10211–10222.

[48] Hastie T., Tibshirani R., Friedman J. The elements of statistical learning: Data mining, inference, and prediction. New York: Springer, 2009.

[49] Park H.-S., Jun C.-H. A simple and fast algorithm for $k$-medoids clustering // Expert Syst. Appl. 2009. Vol. 36, No. 2, Pt. 2. P. 3336–3341.

[50] Newling J., Fleuret F. A sub-quadratic exact medoid algorithm // Proc. Mach. Learn. Res. 2017. Vol. 54. P. 185–193.

[51] Bagaria V., Kamath G., Ntranos V., Zhang M., Tse D. Medoids in almost- linear time via multi-armed bandits // Proc. Mach. Learn. Res. 2018. Vol. 84. P. 500–509.

[52] Paterlini A. A., Nascimento M. A., Jr. C. T. Using pivots to speed-up $k$-medoids clustering // J. Inf. Data Manage. 2011. Vol. 2, No. 2. P. 221–236.

[53] Kaufman L., Rousseeuw P. J. Finding groups in data: An introduction to cluster analysis. Hoboken, NJ: Wiley, 2005.

[ 54] Ng R. T., Han J. CLARANS: A method for clustering objects for spatial data mining // IEEE Trans. Knowl. Data Eng. 2002. Vol. 14, No. 5. P. 1003–1016.

[55] Newling J., Fleuret F. $K$-medoids for $K$-means seeding // Proc. 31st Int. Conf. Neural Information Processing Systems (Long Beach, CA, USA, Dec. 4–9, 2017). Red Hook, NY: Curran Assoc., 2017. P. 5201–5209.

[56] Van der Laan M., Pollard K., Bryan J. A new partitioning around medoids algorithm // J. Stat. Comput. Simul. 2003. Vol. 73, No. 8. P. 575–584.

[57] Zhang Q., Couloigner I. A new and efficient $k$-medoid algorithm for spatial clustering // Computational Science and Its Applications. Proc. Int. Conf. (Singapore, May 9–12, 2005). Pt. 3. Heidelberg: Springer, 2005. P. 181–189. (Lect. Notes Comput. Sci.; Vol. 3482).

[58] Yu D., Liu G., Guo M., Liu X. An improved $k$-medoids algorithm based on step increasing and optimizing medoids // Expert Syst. Appl. 2018. Vol. 92. P. 464–473.

[59] Wang X., Wang X., Wilkes D. M. Machine learning-based natural scene recognition for mobile robot localization in an unknown environment. Singapore: Springer, 2020. P. 85–108.

[60] Zadegan S. M. R., Mirzaie M., Sadoughi F. Ranked $k$-medoids: A fast and accurate rank-based partitioning algorithm for clustering large datasets // Knowl.-Based Syst. 2013. Vol. 39. P. 133–143.

[61] Rangel E. M., Hendrix W., Agrawal A., Liao W., Choudhary A. AGORAS: A fast algorithm for estimating medoids in large datasets // Procedia Comput. Sci. 2016. Vol. 80. P. 1159–1169.

[62] Fushimi T., Saito K., Ikeda T., Kazama K. Accelerating greedy $k$-medoids clustering algorithm with $L_1$ distance by pivot generation // Foundations of Intelligent Systems. Proc. 23rd Int. Symp. (Warsaw, Poland, June 26–29, 2017). Cham: Springer, 2017. P. 87–96. (Lect. Notes Comput. Sci.; Vol. 10352).

[63] An H.-C., Svensson O. Recent developments in approximation algorithms for facility location and clustering problems // Combinatorial Optimization and Graph Algorithms. Singapore: Springer, 2017. P. 1–19.

[64] Mladenovic N., Brimberg J., Hansen P., Moreno-Pérez J. The $p$-median problem: A survey of metaheuristic approaches // Eur. J. Oper. Res. 2007. Vol. 179, No. 3. P. 927–939.

[65] Reese J. Solution methods for the $p$-median problem: An annotated bibliography // Networks. 2006. Vol. 28, No. 3. P. 125–142.

[66] Avella P., Sassano A., Vasilyev I. L. Computational study of large-scale $p$-median problems // Math. Program. 2007. Vol. 109, No. 1. P. 89–114.

[67] Church R. L. BEAMR: An exact and approximate model for the $p$-median problem // Comput. Oper. Res. 2008. Vol. 35, No. 2. P. 417–426.

[68] García S., LabbéM., MarínA. Solving large $p$-median problems with a radius formulation // INFORMS J. Comput. 2011. Vol. 23, No. 4. P. 546–556.

[69] Mulvey J. M., Crowder H. P. Cluster analysis: An application of Lagrangian relaxation // Manage. Sci. 1979. Vol. 25, No. 4. P. 329–340.

[70] Geoffrion A. M. Lagrangian relaxation for integer programming // Approaches to Integer Programming. Heidelberg: Springer, 1974. P. 82–114. (Math. Program. Study; Vol. 2).

[71] Beltran C., Tadonki C., Vial J. Solving the $p$-median problem with a semi-Lagrangian relaxation // Comput. Optim. Appl. 2006. Vol. 35, No. 2. P. 239–260.

[72] Avella P., Boccia M., Salerno S., Vasilyev I. L. An aggregation heuristic for large scale $p$-median problem // Comput. Oper. Res. 2012. Vol. 39, No. 7. P. 1625–1632.

[73] Irawan C. A., Salhi S. Solving large $p$-median problems by a multistage hybrid approach using demand points aggregation and variable neighbourhood search // J. Glob. Optim. 2015. Vol. 63. P. 537–554.

[74] Cebecauer M., Buzna L. A versatile adaptive aggregation framework for spatially large discrete location-allocation problems // Comput. Ind. Eng. 2017. Vol. 111. P. 364–380.

[75] Jain K., Vazirani V. V. Approximation algorithms for metric facility location and $k$-median problems using the primal-dual schema and Lagrangian relaxation // J. ACM. 2001. Vol. 48, No. 2. P. 274–296.

[76] Li S., Svensson O. Approximating $k$-median via pseudo-approximation // Proc. 45th Annu. ACM Symp. Theory of Computing (Palo Alto, USA, June 1–4, 2013). New York: ACM, 2013. P. 901–910.

[77] Byrka J., Pensyl T., Rybicki B., Srinivasan A., Trinh K. An improved approximation for $k$-median and positive correlation in budgeted optimization // ACM Trans. Algorithms. 2017. Vol. 13, No. 2. P. 23:1–23:31.

[78] Jain K., Mahdian M., Markakis E., Saberi A., Vazirani V. V. Greedy facility location algorithms analyzed using dual fitting with factor-revealing LP // J. ACM. 2003. Vol. 50, No. 6. P. 795–824.

[79] Nellore A., Ward R. Recovery guarantees for exemplar-based clustering // Inf. Comput. 2015. Vol. 245. P. 165–180.

[80] Awasthi P., Bandeira A. S., Charikar M., Krishnaswamy R., Villar S., Ward R. Relax, no need to round: Integrality of clustering formulations // Proc. 2015 Conf. Innovations in Theoretical Computer Science (Rehovot, Israel, Jan. 11–13, 2015). New York: ACM, 2015. P. 191–200.

[81] Crainic T. G., Gendreau M., Hansen P., Mladenovic N. Cooperative parallel variable neighborhood search for the $p$-median // J. Heuristics. 2004. Vol. 10, No. 3. P. 293–314.

[82] Garcia-López F., Melián-Batista B., Moreno-Pérez J. A., Moreno-Vega J. M. The parallel variable neighborhood search for the $p$-median problem // J. Heuristics. 2002. Vol. 8, No. 3. P. 375–388.

[83] Garcia-López F., Melián-Batista B., Moreno-Pérez J. A., Moreno-Vega J. M. Parallelization of the scatter search for the $p$-median problem // Parallel Comput. 2003. Vol. 29, No. 5. P. 575–589.

[84] Crainic T.G., Toulouse M. Parallel meta-heuristics // Handbook of Metaheuristics. New York: Springer, 2010. P. 497–541. (Int. Ser. Oper. Res. Manage. Sci.; Vol. 146).

[85] Ma L., Lim G. J. GPU-based parallel vertex substitution algorithm for the $p$-median problem // Comput. Ind. Eng. 2013. Vol. 64, No. 1. P. 381–388.

[86] Xiao N. A parallel cooperative hybridization approach to the $p$-median problem // Environ. Plann. Ser. B. 2012. Vol. 39, No. 4. P. 755–774.

[87] Arbelaez A., Quesada L. Parallelising the $k$-medoids clustering problem using space-partitioning // Proc. 6th Annu. Symp. Combinatorial Search (Leavenworth, USA, July 11–13, 2013). Palo Alto: AAAI, 2013. P. 20–28.

[88] Blelloch G. E., Tangwongsan K. Parallel approximation algorithms for facility-location problems // Proc. 22nd Annu. ACM Symp. Parallelism in Algorithms and Architectures (Thira Santorini, Greece, June 13–15, 2010). New York: ACM, 2010. P. 315–324.

[89] Blelloch G. E., Gupta A., Tangwongsan K. Parallel probabilistic tree embeddings, $k$-median, and buy-at-bulk network design // Proc. 24th Annu. ACM Symp. Parallelism in Algorithms and Architectures (Pittsburgh, USA, June 25–27, 2012). New York: ACM, 2012. P. 205–213.

[90] Bandyapadhyay S., Inamdar T., Pai S., Pemmaraju S. V. Near-optimal clustering in the $k$-machine model // Proc. 19th Int. Conf. Distributed Computing and Networking (Varanasi, India, Jan. 4–7, 2018). New York: ACM, 2018. P. 15:1–15:10.

[91] Karloff H. J., Suri S., Vassilvitskii S. A model of computation for MapReduce // Proc. 21st Annu. ACM-SIAM Symp. Discrete Algorithms (Austin, USA, Jan. 17-19, 2010). Philadelphia, PA: SIAM, 2010. P. 938–948.

[92] Ene A., Im S., Moseley B. Fast clustering using MapReduce // Proc. 17th ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining (San Diego, USA, Aug. 21–24, 2011). New York: ACM, 2011. P. 681–689.

[93] Jakovits P., Srirama S. N. Clustering on the cloud: Reducing CLARA to MapReduce // Proc. 2nd Nordic Symp. Cloud Computing and Internet Technologies (Oslo, Norway, Sep. 2–3, 2013). New York: ACM, 2013. P. 64–71.

[94] Ushakov A. V., Vasilyev I. L. Near-optimal large-scale $k$-medoids clustering // Inf. Sci. 2021. Vol. 545. P. 344–362.

[95] Yang X., Lian L. A New data mining algorithm based on MapReduce and Hadoop // Int. J. Signal Process., Image Process., Pattern Recognit. 2014. Vol. 7, No. 2. P. 131–142.

[96] Martino A., Rizzi A., Frattale Mascioli F. M. Efficient approaches for solving the large-scale $k$-medoids problem: Towards structured data // Computational Intelligence. Proc. 9th Int. Joint Conf. (Funchal-Madeira, Portugal, Nov. 1–3, 2017). Cham: Springer, 2019. P. 199–219.

[97] Zhu Y.,Wang F., Shan X., Lv X. $K$-medoids clustering based on MapReduce and optimal search of medoids // Proc. 9th Int. Conf. Comput. Sci. Education (Vancouver, Canada, Aug 22–24, 2014). Piscataway: IEEE, 2014. P. 573–577.

[98] Song H., Lee J.-G., Han W.-S. PAMAE: Parallel $k$-medoids clustering with high accuracy and efficiency // Proc. 23rd ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining (Halifax, Canada, Aug. 13–17, 2017). New York: ACM, 2017. P. 1087–1096.

[99] Mirzasoleiman B., Karbasi A., Sarkar R., Krause A. Distributed submodular maximization: Identifying representative elements in massive data // Proc. 26th Int. Conf. Neural Information Processing Systems (Lake Tahoe, USA, Dec. 5–10, 2013). Vol. 2. Red Hook, NY: Curran Assoc., 2013. P. 2049–2057.

[100] Redondo J. L., Marín A., Ortigosa P. M. A parallelized Lagrangian relaxation approach for the discrete ordered median problem // Ann. Oper. Res. 2016. Vol. 246, No. 1. P. 253–272.

[101] Mancini E. P., Marcarelli S., Vasilyev I. L., Villano U. A grid-aware MIP solver: Implementation and case studies // Futur. Gener. Comp. Syst. 2008. Vol. 24, No. 2. P. 133–41.

[102] Lai P.-S., Fu H.-C. Variance enhanced $k$-medoid clustering // Expert Syst. Appl. 2011. Vol. 38, No. 1. P. 764–775.

[103] Ayyala D. N., Lin S. GrammR: Graphical representation and modeling of count data with application in metagenomics // Bioinformatics. 2015. Vol. 31, No. 10. P. 1648–1654.

[104] Elhamifar E., Sapiro G., Vidal R. Finding exemplars from pairwise dissimilarities via simultaneous sparse recovery // Proc. 25th Int. Conf. Neural Information Processing Systems (Lake Tahoe, USA, Dec. 3–8, 2012). Vol. 1. Red Hook, NY: Curran Assoc., 2012. P. 19–27.

[105] Charikar M., Khuller S., Mount D. M., Narasimhan G. Algorithms for facility location problems with outliers // Proc. 12th Annu. ACM-SIAM Symp. Discrete Algorithms (Washington, USA, Jan. 7–9, 2001). Philadelphia, PA: SIAM, 2001. P. 642–651.

[106] Frey B. J., Dueck D. Clustering by passing messages between data points // Science. 2007. Vol. 315, No. 5814. P. 972–976.

[107] Brusco M. J., Köhn H.-F. Comment on “Clustering by passing messages between data points” // Science. 2008. Vol. 319, No. 5864. P. 726–726.

[108] Brusco M. J., Steinley D. Affinity propagation and uncapacitated facility location problems // J. Classif. 2015. Vol. 32, No. 3. P. 443–480.

[109] Leone M., Sumedha,Weigt M. Clustering by soft-constraint affinity propagation: Applications to gene-expression data // Bioinformatics. 2007. Vol. 23, No. 20. P. 2708–2715.

[110] Mirchandani P., Jagannathan R. Discrete facility location with nonlinear diseconomies in fixed costs // Ann. Oper. Res. 1989. Vol. 18, No. 1. P. 213–224.

[111] Körkel M. Discrete facility location with nonlinear facility costs // RAIROOper. Res. 1991. Vol. 25, No. 1. P. 31–43.

[112] Carrizosa E., Ushakov A. V., Vasilyev I. L. A computational study of a nonlinear minsum facility location problem // Comput. Oper. Res. 2012. Vol. 39, No. 11. P. 2625–2633.

[113] Aghaee A., Ghadiri M., Baghshah M. S. Active distance-based clustering using $k$-medoids // Advances in Knowledge Discovery and Data Mining. Proc. 20th Pacific-Asia Conf. (Auckland, New Zealand, Apr. 19–22, 2016). Cham: Springer, 2016. P. 253–264. (Lect. Notes Comput. Sci.; Vol. 9651).

[114] Randel R., Aloise D., Mladenovic N., Hansen P. On the $k$-medoids model for semi-supervised clustering // Variable Neighborhood Search. Proc. 6th Int. Conf. (Sithonia, Greece, Oct. 4–7, 2018). Cham: Springer, 2019. P. 13–27. (Lect. Notes Comput. Sci.; Vol. 11328).

[115] Marín A., Pelegrín M. Adding incompatibilities to the simple plant location problem: Formulation, facets and computational experience // Comput. Oper. Res. 2019. Vol. 104. P. 174–190.

[116] Marín A., Pelegrín M. The double-assignment plant location problem with co-location // Comput. Oper. Res. 2021. Vol. 126. P. 105059.

[117] Fersini E., Messina E., Archetti F. A $p$-median approach for predicting drug response in tumour cells // BMC Bioinform. 2014. Vol. 15, No. 1. P. 1–19.

[118] Ushakov A. V., Klimentova X., Vasilyev I. L. Bi-level and bi-objective $p$-median type problems for integrative clustering: Application to analysis of cancer gene-expression and drug-response data // IEEE-ACM Trans. Comput. Biol. Bioinform. 2018. Vol. 15, No. 1. P. 46–59.

[119] Алексеева Е. А., КочетовЮ. А. Генетический локальный поиск для задачи о $p$-медиане с предпочтениями клиентов // Дискрет. анализ и ис- след. операций. 2007. Т. 14, № 1. С. 3–31.

[120] Cánovas L., García S., Labbé M., Marín A. A strengthened formulation for the simple plant location problem with order // Oper. Res. Lett. 2007. Vol. 35, No. 2. P. 141–150.

[121] Vasilyev I. L., Klimentova X., Boccia M. Polyhedral study of simple plant location problem with order // Oper. Res. Lett. 2013. Vol. 41, No. 2. P. 153–158.

[122] Васильев И. Л., Климентова К. Б. Метод ветвей и отсечений для задачи размещения с предпочтениями клиентов // Дискрет. анализ и исслед. операций. 2009. Т. 16, № 2. С. 21–41.

[123] Benati S., García S. A mixed integer linear model for clustering with variable selection // Comput. Oper. Res. 2014. Vol. 43. P. 280–285.

[124] Benati S., García S., Puerto J. Mixed integer linear programming and heuristic methods for feature selection in clustering // J. Oper. Res. Soc. 2018. Vol. 69, No. 9. P. 1379–1395.

[125] Kuehn A. A., Hamburger M. J. A heuristic program for locating warehouses // Manage. Sci. 1963. Vol. 9, No. 4. P. 643–666.

[126] Mulvey J. M., Beck M. P. Solving capacitated clustering problems // Eur. J. Oper. Res. 2003. Vol. 18, No. 3. P. 339–348.

[127] Negreiros M., Palhano A. The capacitated centred clustering problem // Comput. Oper. Res. 2006. Vol. 33, No. 6. P. 1639–1663.

[128] Boccia M., Sforza A., Sterle C., Vasilyev I. L. A cut and branch approach for the capacitated $p$-median problem based on Fenchel cutting planes // J. Math. Model. Algorithms. 2008. Vol. 7. P. 43–58.

[129] Gnägi M., Baumann P. A matheuristic for large-scale capacitated clustering // Comput. Oper. Res. 2021. Vol. 132. P. 105304.

[130] Lorena L. A. N., Senne E. L. F. A column generation approach to capacitated $p$-median problems // Comput. Oper. Res. 2004. Vol. 31, No. 6. P. 863–876.

[131] Mai F., Fry M. J., Ohlmann J. W. Model-based capacitated clustering with posterior regularization // Eur. J. Oper. Res. 2018. Vol. 271, No. 2. P. 594–605.

[132] Stefanello F., de Araújo O. C. B., Müller F. M. Matheuristics for the capacitated $p$-median problem // Int. Trans. Oper. Res. 2015. Vol. 22, No. 1. P. 149–167.

[133] Chou C.-A., Chaovalitwongse W. A., Berger-Wolf T. Y., DasGupta B., Ashley M. V. Capacitated clustering problem in computational biology: Combinatorial and statistical approach for sibling reconstruction // Comput. Oper. Res. 2012. Vol. 39, No. 3. P. 609–619.

[134] Frahm J.-M., Fite-Georgel P., Gallup D. [et al.]. Building Rome on a cloudless day // Computer Vision. Proc. 11th Eur. Conf. (Heraklion, Greece, Sep. 5–11, 2010). Pt. 4. Heidelberg: Springer, 2010. P. 368–381. (Lect. Notes Comput. Sci.; Vol. 6314).

[135] Gong Y., Pawlowski M., Yang F., Brandy L., Boundev L., Fergus R. Web scale photo hash clustering on a single machine // Proc. 2015 IEEE Conf. Computer Vision and Pattern Recognition (Boston, USA, June 7–12, 2015). Piscataway: IEEE, 2015. P. 19–27.

[136] Brusco M. J., Steinley D., Stevens J. $K$-medoids inverse regression // Commun. Stat. Theory Methods. 2019. Vol. 48, No. 20. P. 4999–5011.

[137] Suárez J. L., García S., Herrera F. A tutorial on distance metric learning: Mathematical foundations, algorithms, experimental analysis, prospects and challenges // Neurocomputing. 2021. Vol. 425. P. 300–322.

[138] Song H. O., Jegelka S., Rathod V., Murphy K. Deep metric learning via facility location // Proc. 2017 IEEE Conf. Computer Vision and Pattern Recognition (Honolulu, USA, July 21–26, 2017). Piscataway: IEEE, 2017. P. 2206–2214.

 © Институт математики им. С. Л. Соболева, 2015