Personalized federated learning: new algorithms and statistical rates

Background: Federated Learning (FL) has emerged as a powerful paradigm for distributed, privacy-preserving machine learning over a large network of devices [1]. Most existing work on FL focuses on learning a single model that is deployed to all devices. Given the diverse characteristics of users and application scenarios, personalization is highly desirable and will be inevitable in the near future. Personalized Federated Learning (PFL) aims to improve the experience of individual users by training personalized on-device models that overcome the limitations of a common model. The vision of PFL brings many challenges [1] and calls for research advances on several fronts: statistical learning theory, domain-specific modeling, optimization algorithms, privacy-preserving machine learning, and security.

In this project, we focus on developing new algorithms and statistical learning theory for PFL. In particular, training a large number of personalized models simultaneously requires more flexible and robust distributed algorithms than training a common model. Meanwhile, we need to address statistical learning questions of when and how we can leverage the large amount of data from heterogeneous distributions to improve the training of each personalized model (see [2] for some recent results).

Research agenda: We propose to study proximal-based optimization models and algorithms for PFL. In contrast to the consensus-based models and algorithms for training a common model (which impose the hard constraint that the local models at all devices be the same), proximal-based models allow the local models to differ and use proximity regularization to penalize their divergence from a virtual common model. An appropriate strength of regularization brings the benefits of collective statistics (through the virtual common model) while still allowing flexible personalization (a form of bias-variance tradeoff). An important research question is what the right regularization strengths are for different local models (convex models or deep learning models) and how to determine and adjust them efficiently; a concrete formulation is sketched below.
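As a concrete point of reference, a standard proximal-based PFL formulation (written here in our own notation, as a sketch rather than a committed design) jointly optimizes the personalized models w_1, ..., w_m and a virtual common model w:

    \min_{w,\, w_1, \dots, w_m} \;\; \sum_{i=1}^{m} \Big( f_i(w_i) + \frac{\lambda_i}{2} \, \lVert w_i - w \rVert^2 \Big),

where f_i is the local empirical loss at device i and \lambda_i \ge 0 is the regularization strength: \lambda_i \to \infty recovers the consensus (common-model) formulation, while \lambda_i = 0 decouples the devices into purely local training.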

Proximal-based models open many avenues for innovative algorithmic research for PFL. In particular, we aim to extend the power and rich theory of the classical proximal-point algorithm to the distributed optimization setting. Specifically, we can reformulate the proximal-based PFL models as minimizing the sum of the Moreau envelopes of the local empirical loss functions [3]. Working with the Moreau envelopes has many advantages over working directly with the local loss functions, including better smoothness and convexity properties (even when the local loss functions are nonconvex!). In addition, proximal-based algorithms are closely related to operator splitting methods, which we recently studied for Federated Learning [4]. There are exciting synergies between the two that we plan to investigate.
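To make the reformulation concrete, the Moreau envelope of the local loss f_i with parameter \lambda, and its gradient, are

    F_i(w) \;=\; \min_{\theta} \Big\{ f_i(\theta) + \frac{\lambda}{2} \lVert \theta - w \rVert^2 \Big\},
    \qquad
    \nabla F_i(w) \;=\; \lambda \big( w - \mathrm{prox}_{f_i/\lambda}(w) \big),

so minimizing \sum_i F_i(w) only requires each device to (approximately) compute its proximal point, which also serves as its personalized model. The NumPy sketch below is our own illustration of this server/device structure on a synthetic least-squares problem (the names local_prox_step and federated_moreau_round and all parameter choices are hypothetical; this is not the algorithm of [3] or [4]):

    import numpy as np


    def local_prox_step(w_global, X, y, lam, lr=0.1, n_steps=50):
        # Approximately solve the proximal subproblem
        #   min_theta  f_i(theta) + (lam / 2) * ||theta - w_global||^2
        # for a local least-squares loss f_i(theta) = ||X theta - y||^2 / (2 n),
        # using plain gradient descent as a stand-in for any local solver.
        theta = w_global.copy()
        n = X.shape[0]
        for _ in range(n_steps):
            grad = X.T @ (X @ theta - y) / n + lam * (theta - w_global)
            theta -= lr * grad
        return theta


    def federated_moreau_round(w_global, devices, lam, server_lr=1.0):
        # One synchronous round: each device computes (approximately) its
        # proximal point prox_{f_i/lam}(w_global); the server then takes a
        # gradient step on the average of the Moreau envelopes, whose gradient
        # at w_global is lam * (w_global - prox_i).
        prox_points = [local_prox_step(w_global, X, y, lam) for (X, y) in devices]
        envelope_grad = lam * (w_global - np.mean(prox_points, axis=0))
        w_new = w_global - (server_lr / lam) * envelope_grad
        return w_new, prox_points  # prox_points play the role of personalized models


    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        d, m, n = 5, 4, 100
        devices = []
        for _ in range(m):
            # Heterogeneous devices: a shared signal plus device-specific shifts.
            w_true = np.ones(d) + 0.3 * rng.normal(size=d)
            X = rng.normal(size=(n, d))
            y = X @ w_true + 0.1 * rng.normal(size=n)
            devices.append((X, y))
        w, lam = np.zeros(d), 1.0
        for _ in range(30):
            w, personalized = federated_moreau_round(w, devices, lam)
        print("virtual common model:", np.round(w, 3))

With server_lr = 1 the update reduces to averaging the proximal points, while smaller server steps give a damped proximal-point iteration; in practice the local subproblem can be solved with whatever stochastic solver fits the device, which is exactly the flexibility that proximal-based models afford.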

References: 

  1. P. Kairouz, H. B. McMahan, et al., Advances and Open Problems in Federated Learning, Foundations and Trends in Machine Learning, vol. 14, no. 1-2, June 2021.

  2. S. Chen, Q. Zheng, Q. Long, and W. J. Su, A Theorem of the Alternative for Personalized Federated Learning, arXiv preprint arXiv:2103.01901, 2021.

  3. C. T. Dinh, N. H. Tran, and T. D. Nguyen, Personalized Federated Learning with Moreau Envelopes, Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020.

  4. R. Pathak and M. J. Wainwright, FedSplit: An Algorithmic Framework for Fast Federated Optimization, Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020.

  5. K. Wei, J. Li, M. Ding, C. Ma, H. H. Yang, F. Farokhi, S. Jin, T. Q. S. Quek, and H. V. Poor, Federated Learning with Differential Privacy: Algorithms and Performance Analysis, IEEE Transactions on Information Forensics and Security, vol. 15, pp. 3454-3469, 2020.