Part of Proceedings of Machine Learning and Systems 3 (MLSys 2021)
Omid Aramoon, Pin-Yu Chen, Gang Qu
Engineering a top-notch deep learning model is an expensive procedure that involves collecting data, hiring human resources with expertise in machine learning, and providing high computational resources. For that reason, deep learning models are considered as valuable Intellectual Properties (IPs) of the model vendors. To ensure reliable commercialization of deep learning models, it is crucial to develop techniques to protect model vendors against IP infringements. One of such techniques that recently has shown great promise is digital watermarking. However, current watermarking approaches can embed very limited amount of information and are vulnerable against watermark removal attacks. In this paper, we present GradSigns, a novel watermarking framework for deep neural networks (DNNs). GradSigns embeds the owner's signature into the gradient of the cross-entropy cost function with respect to inputs to the model. Our approach has a negligible impact on the performance of the protected model and it allows model vendors to remotely verify the watermark through prediction APIs. We evaluate GradSigns on DNNs trained for different image classification tasks using CIFAR-10, SVHN, and YTF datasets. Experimental results show that GradSigns is robust against all known counter-watermark attacks and can embed a large amount of information into DNNs.