Toward Privacy-Preserving, Secure, and Fair Federated Learning

Abstract
Federated learning is a collaborative machine learning approach that enables multiple clients to train a shared model while keeping their local data private. Because sensitive data remains decentralized, the risk of data breaches is reduced, and by leveraging the collective knowledge of diverse datasets, federated learning improves model performance and generalization. It is therefore particularly valuable in scenarios where data privacy and security are paramount. This dissertation enhances federated learning by addressing three key challenges: privacy preservation, security against malicious attacks, and fairness across diverse demographic groups.

First, we propose a novel privacy-preserving federated learning mechanism. Although local datasets remain private, the intermediate model parameters exchanged during training can still leak sensitive information. Existing solutions either add noise, which reduces model accuracy, or rely on inefficient cryptographic techniques. Our method employs two non-colluding servers and efficient cryptographic primitives to perform secure aggregation, preserving privacy without sacrificing accuracy or efficiency.

Second, we present a secure federated learning method that defends against malicious clients. Federated learning is vulnerable to Byzantine attacks, in which malicious clients corrupt their local data or updates to degrade the global model. Current defenses either filter malicious updates ineffectively or rely on a potentially biased trusted dataset. Our method evaluates client trustworthiness based on the similarity of client updates to those derived from a trusted dataset and incrementally builds a trusted client set. This approach effectively defends against Byzantine attacks, achieving high model accuracy even when the trusted dataset is initially biased.

Finally, we introduce a fair federated learning method that ensures equitable accuracy across demographic groups. Machine learning models often favor majority groups because of data imbalances or biased training. Existing fairness methods are typically centralized and require access to the entire dataset, making them unsuitable for federated learning. Our approach guarantees fairness by exchanging minimal additional data among clients, preserving the global model's utility while ensuring equitable accuracy across all groups.
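
To make the two-server design concrete, below is a minimal sketch of secure aggregation via additive secret sharing over a finite field. The abstract does not specify the cryptographic primitives used, so the sharing scheme, field modulus, and fixed-point encoding here are illustrative assumptions: each client splits its update into two uniformly random shares, one per server, so that neither non-colluding server alone learns anything about an individual update, yet their summed shares reveal the aggregate.

```python
import numpy as np

PRIME = 2**31 - 1  # field modulus for additive sharing (illustrative choice)
SCALE = 10**6      # fixed-point scaling factor for encoding float updates

def share_update(update):
    """Split a client's model update into two additive shares.

    Each server alone sees a uniformly random vector; only the sum
    of both shares reveals the (encoded) update.
    """
    encoded = np.mod(np.round(update * SCALE).astype(np.int64), PRIME)
    share_a = np.random.randint(0, PRIME, size=encoded.shape, dtype=np.int64)
    share_b = np.mod(encoded - share_a, PRIME)
    return share_a, share_b

def aggregate(shares_a, shares_b, num_clients):
    """Each server sums the shares it holds; combining the two sums
    yields the averaged update without exposing any individual one."""
    sum_a = np.mod(np.sum(shares_a, axis=0), PRIME)  # computed by server A
    sum_b = np.mod(np.sum(shares_b, axis=0), PRIME)  # computed by server B
    total = np.mod(sum_a + sum_b, PRIME)
    # Map back from the field to signed fixed-point, then to floats.
    total = np.where(total > PRIME // 2, total - PRIME, total)
    return total.astype(np.float64) / SCALE / num_clients

# Example: three clients, each holding a 4-dimensional update.
updates = [np.random.randn(4) for _ in range(3)]
shares = [share_update(u) for u in updates]
avg = aggregate([a for a, _ in shares], [b for _, b in shares], len(updates))
assert np.allclose(avg, np.mean(updates, axis=0), atol=1e-5)
```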
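The trust-bootstrapping idea behind the second contribution can be sketched as follows. The cosine-similarity test, the threshold, and the rule for growing the trusted client set are illustrative assumptions standing in for the dissertation's actual algorithm: the server compares each client's update against a reference update computed on its trusted dataset and aggregates only the updates of clients that pass the test, so the trusted set is built up across rounds.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity between two flattened model updates."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)

def robust_aggregate(client_updates, reference_update, trusted_ids,
                     threshold=0.5):
    """Average only the updates of clients currently deemed trustworthy.

    A client joins the trusted set when its update is sufficiently
    similar to the reference update computed on the trusted dataset,
    so the set is built incrementally across training rounds.
    """
    for cid, update in client_updates.items():
        if cosine_similarity(update, reference_update) >= threshold:
            trusted_ids.add(cid)        # admit clients that look benign
        else:
            trusted_ids.discard(cid)    # evict clients that turn suspicious
    accepted = [u for cid, u in client_updates.items() if cid in trusted_ids]
    if not accepted:                    # fall back if no client is trusted yet
        return reference_update, trusted_ids
    return np.mean(accepted, axis=0), trusted_ids

# One round: client 2 submits a poisoned, sign-flipped update.
ref = np.array([0.10, -0.20, 0.05])
updates = {0: ref + 0.01, 1: ref - 0.02, 2: -5.0 * ref}
agg, trusted = robust_aggregate(updates, ref, trusted_ids=set())
print(sorted(trusted))  # -> [0, 1]; the poisoned client is excluded
```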
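For the fairness contribution, the abstract states only that clients exchange minimal additional data. One plausible instance of such a statistic, shown below purely as a hedged illustration with hypothetical helper names, is having each client share small per-group (correct, total) count vectors instead of raw examples, so the server can monitor the gap between the best- and worst-performing demographic groups.

```python
import numpy as np

def local_group_stats(model_predict, X, y, groups, num_groups):
    """Per-group (correct, total) counts on one client's local data.

    These small count vectors are the only extra information a client
    shares, rather than raw examples.
    """
    correct = np.zeros(num_groups, dtype=np.int64)
    total = np.zeros(num_groups, dtype=np.int64)
    preds = model_predict(X)
    for g in range(num_groups):
        mask = groups == g
        total[g] = mask.sum()
        correct[g] = (preds[mask] == y[mask]).sum()
    return correct, total

def fairness_gap(stats):
    """Aggregate client counts and report the best-vs-worst group
    accuracy gap along with per-group accuracies."""
    correct = sum(c for c, _ in stats)
    total = sum(t for _, t in stats)
    acc = correct / np.maximum(total, 1)
    return acc.max() - acc.min(), acc

# Example with a dummy classifier that always predicts class 1.
predict = lambda X: np.ones(len(X), dtype=int)
X = np.zeros((6, 2))
y = np.array([1, 1, 0, 1, 0, 0])
g = np.array([0, 0, 0, 1, 1, 1])
stats = [local_group_stats(predict, X, y, g, num_groups=2)]
gap, per_group_acc = fairness_gap(stats)  # group 0: 2/3, group 1: 1/3 -> gap 1/3
```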