Federated Learning: Train AI Models Without Sharing Private Data

Build privacy-preserving ML systems across distributed data sources

Level: Advanced | Duration: 45 minutes

Learn federated learning fundamentals and implementation using PySyft and Flower. Build ML models that train across multiple clients without centralizing sensitive data.

Tags: federated-learning, privacy, distributed-ml, flower, differential-privacy

Federated Learning: Privacy-Preserving AI

What is Federated Learning?

Federated Learning (FL) trains ML models across multiple devices or servers without exchanging raw data. Each participant trains locally and shares only model updates with a coordinating server.

Key Benefits

  • Raw data stays on the local device
  • Supports compliance with regulations such as GDPR and HIPAA
  • Works with sensitive data (medical, financial)
  • Reduces data transfer costs

How Federated Learning Works

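  • The central server sends the current global model to the selected clients
  • Each client trains the model on its local data
  • Clients send model updates (weights or weight deltas) back to the server
  • The server aggregates the updates (FedAvg: an average weighted by each client's number of examples)
  • The process repeats until convergence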
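
The round structure above can be sketched in plain Python without any FL framework. This is a minimal illustration, not how Flower implements it: the toy linear model, the `local_train` and `fed_avg` helper names, and the synthetic client data are all assumptions made for the example.

```python
import random


def local_train(weights, data, lr=0.1, epochs=1):
    """One client's local training: plain SGD on y ~ w*x + b (squared error)."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return [w, b]


def fed_avg(client_weights, client_sizes):
    """Average client weights, weighting each client by its number of examples."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


def run_rounds(clients, num_rounds=20):
    """Server loop: broadcast, train locally, aggregate, repeat."""
    global_weights = [0.0, 0.0]
    for _ in range(num_rounds):
        # Each client starts from a copy of the current global model
        updates = [local_train(list(global_weights), data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = fed_avg(updates, sizes)
    return global_weights


# Hypothetical toy data: three clients, each holding private samples of y = 2x + 1
random.seed(0)
clients = [
    [(x, 2 * x + 1) for x in (random.uniform(-1, 1) for _ in range(n))]
    for n in (20, 50, 30)
]
final_weights = run_rounds(clients)
print("learned [w, b]:", final_weights)
```

Only the weight lists cross the client boundary here; the `(x, y)` samples are read solely inside `local_train`, which is the property the steps above describe.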

Implementation with Flower

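    python
    from collections import OrderedDict

    import flwr as fl
    import torch
    from torch import nn

    # `train` and `test` are user-defined helpers (not shown here); `model`,
    # `trainloader`, and `testloader` are assumed to be created beforehand.

    class CifarClient(fl.client.NumPyClient):
        def __init__(self, model, trainloader, testloader):
            self.model = model
            self.trainloader = trainloader
            self.testloader = testloader

        def get_parameters(self, config):
            # Use state_dict() so buffers (e.g. BatchNorm statistics) are included
            return [val.cpu().numpy() for val in self.model.state_dict().values()]

        def set_parameters(self, parameters):
            # Load the server's weights into the local model, keeping its device
            keys = self.model.state_dict().keys()
            state_dict = OrderedDict((k, torch.tensor(v)) for k, v in zip(keys, parameters))
            self.model.load_state_dict(state_dict, strict=True)

        def fit(self, parameters, config):
            self.set_parameters(parameters)
            train(self.model, self.trainloader, epochs=1)
            return self.get_parameters({}), len(self.trainloader.dataset), {}

        def evaluate(self, parameters, config):
            self.set_parameters(parameters)
            loss, accuracy = test(self.model, self.testloader)
            return float(loss), len(self.testloader.dataset), {"accuracy": float(accuracy)}

    # Start client
    # Note: newer Flower releases deprecate start_numpy_client in favor of
    # fl.client.start_client(..., client=CifarClient(...).to_client()).
    fl.client.start_numpy_client(
        server_address="server:8080",
        client=CifarClient(model, trainloader, testloader),
    )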

Server Strategy

    python
    import flwr as fl

    # FedAvg aggregation
    strategy = fl.server.strategy.FedAvg(
        fraction_fit=0.5,         # sample 50% of available clients for training each round
        fraction_evaluate=0.3,    # sample 30% of available clients for evaluation
        min_fit_clients=3,
        min_evaluate_clients=2,
        min_available_clients=5,
    )

    fl.server.start_server(
        server_address="0.0.0.0:8080",
        config=fl.server.ServerConfig(num_rounds=10),
        strategy=strategy,
    )
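
To see how `fraction_fit` and `min_fit_clients` interact, the sketch below computes how many clients would be sampled for training in a round. It assumes the strategy takes the larger of the fractional count and the minimum, which matches the behavior of Flower's FedAvg as far as we know; the standalone `num_fit_clients` function here is a hypothetical helper written for illustration, so check the strategy source for the Flower version you run.

```python
def num_fit_clients(num_available, fraction_fit=0.5, min_fit_clients=3):
    """Clients sampled for training in one round (assumed rule: max of the two)."""
    return max(int(num_available * fraction_fit), min_fit_clients)


for available in (5, 10, 40):
    print(available, "available ->", num_fit_clients(available), "sampled for fit")
```

With only a handful of clients connected, the minimum dominates; the fraction only starts to matter once enough clients are available.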

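Differential Privacy

Add calibrated noise to clipped gradients during local training to limit what the model can reveal about any individual data point:

    python
    from opacus import PrivacyEngine

    # `model`, `optimizer`, and `train_loader` are the client's existing PyTorch objects
    privacy_engine = PrivacyEngine()
    model, optimizer, train_loader = privacy_engine.make_private(
        module=model,
        optimizer=optimizer,
        data_loader=train_loader,
        noise_multiplier=1.1,  # noise std relative to max_grad_norm
        max_grad_norm=1.0,     # per-sample gradient clipping threshold
    )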
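
The two parameters passed to `make_private` correspond to the two steps of DP-SGD: clip each sample's gradient to a maximum norm, then add Gaussian noise scaled by `noise_multiplier * max_grad_norm` before averaging. The sketch below illustrates that idea in plain Python. It is a simplified picture under those assumptions, not Opacus's implementation, and the `clip` and `private_mean_gradient` helpers and the sample gradients are hypothetical.

```python
import math
import random


def clip(grad, max_grad_norm):
    """Scale a per-sample gradient down so its L2 norm is at most max_grad_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_grad_norm / (norm + 1e-12))
    return [g * scale for g in grad]


def private_mean_gradient(per_sample_grads, noise_multiplier, max_grad_norm, rng):
    """Clip each gradient, sum, add Gaussian noise, then average over the batch."""
    clipped = [clip(g, max_grad_norm) for g in per_sample_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * max_grad_norm  # noise std for the summed gradient
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(per_sample_grads) for v in noisy]


# Hypothetical per-sample gradients for a two-parameter model
rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
print(private_mean_gradient(grads, noise_multiplier=1.1, max_grad_norm=1.0, rng=rng))
```

Clipping bounds how much any single example can move the update, and the noise masks whatever influence remains; a larger `noise_multiplier` gives stronger privacy at the cost of noisier training.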

Use Cases

  • Healthcare: Multi-hospital model training
  • Finance: Cross-bank fraud detection
  • Mobile: Keyboard prediction (Google Gboard)

Related Tools

flower, pysyft, opacus, tensorflow-federated