
ASCFL: Accurate and Speedy Semi-Supervised Clustering Federated Learning

Jingyi He1, Biyao Gong1, Jiadi Yang1, Hai Wang1, Pengfei Xu1, Tianzhang Xing1,2( )
1. School of Information Science and Technology, Northwest University, Xi'an 710100, China
2. Internet of Things Research Center, Northwest University, Xi'an 710100, China

Abstract

The influence of non-Independent and Identically Distributed (non-IID) data on Federated Learning (FL) has been a serious concern. Clustered Federated Learning (CFL) is an emerging approach for reducing the impact of non-IID data; it groups clients according to similarity computed from relevant metrics. Unfortunately, existing CFL methods pursue accuracy improvements alone and ignore the convergence rate. Additionally, the client selection strategy affects the clustering results. Finally, traditional semi-supervised learning changes the distribution of data on clients, resulting in higher local costs and undesirable performance. In this paper, we propose a novel CFL method named ASCFL, which selects the clients that participate in each training round and can dynamically adjust the balance between accuracy and convergence speed on datasets consisting of both labeled and unlabeled data. To deal with unlabeled data, a label prediction strategy predicts labels with encoders. A client selection strategy improves accuracy and reduces overhead by selecting clients with higher losses to participate in the current round. Moreover, a similarity-based clustering strategy uses a new indicator to measure the similarity between clients. Experimental results show that ASCFL outperforms three state-of-the-art methods in model accuracy and convergence speed on two popular datasets.
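
For concreteness, the sketch below illustrates the two ingredients the abstract mentions: loss-based client selection and similarity-based grouping of client updates. It is a minimal toy example on simulated data; the losses, updates, the cosine-similarity stand-in for the paper's similarity indicator, and the threshold value are all assumptions made for illustration, not the authors' actual ASCFL algorithm or released code.

# Hypothetical sketch (not the authors' code): select high-loss clients,
# then group the selected clients by similarity of their model updates.
import numpy as np

rng = np.random.default_rng(0)

# Simulated per-client training losses and flattened model updates.
num_clients = 8
client_losses = rng.uniform(0.2, 2.0, size=num_clients)
client_updates = rng.normal(size=(num_clients, 16))

# Client selection: pick the clients with the highest local losses,
# assuming they contribute most to improving the global model this round.
num_selected = 4
selected = np.argsort(client_losses)[-num_selected:]

def cosine_similarity(a, b):
    # Cosine similarity used here as a placeholder similarity indicator.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

threshold = 0.0  # assumed clustering threshold
clusters = []
for c in selected:
    placed = False
    for cluster in clusters:
        # Join the first cluster whose representative update is similar enough.
        if cosine_similarity(client_updates[c], client_updates[cluster[0]]) > threshold:
            cluster.append(c)
            placed = True
            break
    if not placed:
        clusters.append([c])

print("selected clients:", sorted(int(i) for i in selected))
print("clusters:", [sorted(int(i) for i in cl) for cl in clusters])

The actual similarity indicator and selection rule are defined in the full text of the paper; the sketch only conveys the overall flow of selecting high-loss clients and then clustering similar ones.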

Keywords: semi-supervised learning, federated learning, clustered federated learning, non-Independent and Identically Distributed (non-IID) data, similarity indicator, client selection


Publication history

Received: 09 November 2022
Accepted: 27 November 2022
Published: 19 May 2023
Issue date: October 2023

Copyright

© The author(s) 2023.

Acknowledgements

This work was supported by the National Key Research and Development Program of China (No. 2019YFC1520904) and the National Natural Science Foundation of China (No. 61973250).

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
