Poster
Invisible Image Watermarks Are Provably Removable Using Generative AI
Xuandong Zhao · Kexun Zhang · Zihao Su · Saastha Vasan · Ilya Grishchenko · Christopher Kruegel · Giovanni Vigna · Yu-Xiang Wang · Lei Li
Invisible watermarks safeguard images’ copyright by embedding hidden messages detectable only by their owners. They also help prevent misuse of images, especially those generated by AI models. We propose a family of regeneration attacks to remove these invisible watermarks. The proposed attack first adds random noise to an image to destroy the watermark and then reconstructs the image. This approach is flexible and can be instantiated with many existing image-denoising algorithms and pre-trained generative models such as diffusion models. Through formal proofs and empirical results, we demonstrate that pixel-level invisible watermarks are vulnerable to the proposed attack. For a particularly resilient watermark, RivaGAN, regeneration attacks remove 93-99% of the invisible watermarks, while the baseline attacks remove no more than 3%. However, watermarks that keep the image semantically similar can be an alternative defense against our attack. Our findings underscore the need for a shift in research and industry emphasis from invisible watermarks to semantic-preserving watermarks. We provide the code in the supplementary materials.
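The two-step structure described in the abstract (add noise, then reconstruct) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `sigma` and `radius` parameters, and the use of a plain mean filter as the reconstruction step are all assumptions made for illustration. The abstract names image-denoising algorithms and pre-trained generative models such as diffusion models for that step, and a mean filter would not match their reconstruction quality.

```python
import random

def add_gaussian_noise(image, sigma, rng):
    """Step 1: perturb every pixel with zero-mean Gaussian noise."""
    return [[px + rng.gauss(0.0, sigma) for px in row] for row in image]

def box_denoise(image, radius=1):
    """Step 2 (stand-in): mean filter over a window clipped at the borders.

    Hypothetical placeholder for a real denoiser or generative model.
    """
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            vals = [image[j][i] for j in ys for i in xs]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out

def regenerate(image, sigma=0.1, radius=1, seed=0):
    """Noise-then-reconstruct pipeline; all parameters are illustrative."""
    rng = random.Random(seed)
    noisy = add_gaussian_noise(image, sigma, rng)
    return box_denoise(noisy, radius)

# Usage: an 8x8 flat gray "image" with pixel values in [0, 1].
image = [[0.5] * 8 for _ in range(8)]
result = regenerate(image)
```

The point of the sketch is only the composition: the reconstruction function is a swappable component, which mirrors the abstract's claim that the attack can be instantiated with many different denoisers.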