Open Access Issue
Fair k-Center Problem with Outliers on Massive Data
Tsinghua Science and Technology 2023, 28 (6): 1072-1084
Published: 28 July 2023

Clustering of big data has been widely studied in the era of artificial intelligence. Because the datasets involved are so large, distributed algorithms are often used to process them. The distributed computing model has an attractive feature: it can handle massive datasets that do not fit into main memory. At the same time, since many decisions in today’s society are made automatically by machines, algorithmic fairness has become an important research area in machine learning. In this paper, we study two fair clustering problems: the centralized fair k-center problem with outliers and the distributed fair k-center problem with outliers. For each problem, we design a constant-factor approximation algorithm. We prove the approximation ratios and analyze the space used by the algorithms.
