Part of Proceedings of Machine Learning and Systems 6 (MLSys 2024) Conference
Kiwan Maeng, G. Edward Suh
Secure multi-party computation (MPC) allows users to offload machine learning inference to untrusted servers without having to share their privacy-sensitive data. Despite its strong security properties, MPC-based private inference has not been widely adopted due to its high communication overhead, which is mostly incurred when evaluating non-linear layers. This paper presents HummingBird, an MPC framework that significantly reduces the communication overhead of ReLU. HummingBird leverages the insight that determining whether a value is positive or negative mostly does not require communicating all of its bits. With its theoretical analyses and an efficient search engine, HummingBird discards 66--72% of the bits during ReLU without altering the outcome, and discards 87--91% when some accuracy degradation is tolerated. On a realistic MPC setup, HummingBird achieves on average 2.03--2.67$\times$ end-to-end speedup without introducing any errors, and up to 8.42$\times$ when some accuracy degradation is tolerated.
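The insight that a sign test does not need every bit can be illustrated with a small sketch. This is not HummingBird's implementation: it assumes two-party additive secret sharing over a 64-bit ring, and the helper names (`share`, `reduce_share`, `is_negative_reduced`) and the bound on the secret value are hypothetical. Under these assumptions, if the secret fits in k bits as a signed integer, each party can drop the high bits of its share locally, and the sign bit of the k-bit sum still matches the sign of the original value. The sketch reconstructs the sum in the clear only to check this property; a real protocol would instead run a secure comparison on the reduced shares.

```python
import secrets

N = 64  # assumed full ring width in bits (not taken from the paper)


def share(x, n_bits=N):
    """Split a signed integer x into two additive shares modulo 2**n_bits."""
    mod = 1 << n_bits
    s0 = secrets.randbelow(mod)   # uniformly random first share
    s1 = (x - s0) % mod           # second share so that s0 + s1 = x (mod 2**n_bits)
    return s0, s1


def reduce_share(s, k):
    """Keep only the k low-order bits of a share (a local operation)."""
    return s & ((1 << k) - 1)


def is_negative_reduced(s0, s1, k):
    """Illustration only: add the reduced shares in the clear and read bit k-1.

    The result equals the sign of the secret whenever the secret lies in
    [-2**(k-1), 2**(k-1) - 1], because 2**k divides 2**N.
    """
    total = (s0 + s1) % (1 << k)
    return ((total >> (k - 1)) & 1) == 1


if __name__ == "__main__":
    k = 12  # hypothetical reduced width; valid for secrets in [-2048, 2047]
    for x in (-2048, -7, 0, 5, 2047):
        s0, s1 = share(x)
        r0, r1 = reduce_share(s0, k), reduce_share(s1, k)
        print(x, is_negative_reduced(r0, r1, k))
```

In this sketch the reduced shares carry 12 bits instead of 64, so any comparison protocol run on them would operate on far fewer bits. Dropping low-order bits as well is a separate step that can change the outcome for values close to zero, which is consistent with the abstract's distinction between error-free and accuracy-degrading settings.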