Towards Fair and Privacy Preserving Federated Learning for the Healthcare Domain
Federated learning (FL) enables collaborative learning from data in healthcare contexts where sharing it might otherwise be difficult due to data-use ordinances or security and communication constraints. Training a shared model across distributed clients allows it to generalize and learn from heterogeneous clients. Because the data itself is not shared across nodes in a given learning network, FL addresses data security, privacy, and vulnerability considerations. On the other hand, FL models often struggle with variable client data distributions, since they operate on an assumption of independent and identically distributed (IID) data. As the field has grown, fairness-aware federated learning (FAFL) mechanisms have also been introduced; these are of distinct significance to the healthcare domain, where many sensitive groups and protected classes exist. In this paper, we create a benchmark methodology for FAFL mechanisms under various heterogeneous conditions on healthcare datasets in formats, such as medical imaging and waveform data, that are typically outside the scope of current federated learning benchmarks. Our results indicate considerable variation in how various FAFL schemes respond to high levels of data heterogeneity. Additionally, running these schemes under privacy-preserving conditions can create significant increases in network communication cost and latency compared to the typical federated learning scheme.