Abstract:
We present a federated, asynchronous, and $(\varepsilon, \delta)$-differentially private algorithm for $\PCA$ in the memory-limited setting.
%
Our algorithm incrementally computes local model updates using a streaming procedure and adaptively estimates the $r$ leading principal components of the data when only $\mathcal{O}(dr)$ memory is available, where $d$ is the dimensionality of the data.
%
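To illustrate the $\mathcal{O}(dr)$ memory footprint, the following is a minimal sketch of a streaming rank-$r$ estimator in the spirit of Oja-style updates; the step size \texttt{eta} and the specific update rule are illustrative assumptions, not the paper's exact procedure.
\begin{verbatim}
import numpy as np

def streaming_pca(stream, d, r, eta=0.01):
    """Maintain an estimate U of the r leading principal components
    using only O(dr) state. Illustrative Oja-style sketch; the
    paper's actual streaming update may differ."""
    rng = np.random.default_rng(0)
    U, _ = np.linalg.qr(rng.standard_normal((d, r)))  # O(dr) state
    for x in stream:                    # x is one d-dimensional sample
        x = np.asarray(x).reshape(d, 1)
        U += eta * (x @ (x.T @ U))      # push U toward the top subspace
        U, _ = np.linalg.qr(U)          # re-orthonormalize the basis
    return U
\end{verbatim}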
We guarantee differential privacy via an input-perturbation scheme in which the covariance matrix of a dataset $\B{X} \in \R^{d \times n}$ is perturbed with a non-symmetric random Gaussian matrix whose entries have variance in $\mathcal{O}\left(\left(\frac{d}{n}\right)^2 \log d \right)$, thus improving upon the state-of-the-art.
%
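The input-perturbation step can be sketched as below; the noise scale follows the stated $\mathcal{O}\left(\left(\frac{d}{n}\right)^2 \log d \right)$ variance up to a constant \texttt{c}, which here absorbs the $(\varepsilon, \delta)$ calibration; the exact constant is an assumption, not the paper's formula.
\begin{verbatim}
import numpy as np

def perturbed_covariance(X, c=1.0):
    """Input perturbation: add a non-symmetric Gaussian matrix to the
    empirical covariance of X in R^{d x n}. The scale c is a placeholder
    for the (epsilon, delta)-dependent calibration constant."""
    d, n = X.shape
    C = (X @ X.T) / n                        # empirical covariance, d x d
    sigma = c * (d / n) * np.sqrt(np.log(d)) # std dev ~ (d/n) sqrt(log d)
    E = np.random.default_rng(0).standard_normal((d, d)) * sigma
    return C + E                             # non-symmetric perturbation
\end{verbatim}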
Furthermore, contrary to previous federated or distributed algorithms for $\PCA$, our algorithm is also invariant to permutations in the incoming data, which provides robustness against straggler or failed nodes.
%
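One standard way such order-invariance arises is by merging local subspace estimates through an SVD of concatenated, singular-value-weighted bases, so the merged subspace depends only on the set of inputs and not their arrival order; the sketch below shows this construction, and whether it matches the paper's exact merge step is an assumption.
\begin{verbatim}
import numpy as np

def merge_subspaces(U1, S1, U2, S2, r):
    """Merge two local rank-r estimates (basis U_i, singular values S_i)
    via an SVD of the concatenated weighted bases. Up to sign, the result
    is invariant to the order in which the two estimates arrive."""
    Z = np.hstack([U1 * S1, U2 * S2])  # d x 2r, columns weighted by S_i
    U, S, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :r], S[:r]             # truncate back to rank r
\end{verbatim}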
Numerical simulations show that, while using limited memory, our algorithm matches or outperforms traditional non-federated algorithms, and, in the absence of communication latency, it exhibits attractive horizontal scalability.