Poster
in
Workshop: Workshop on Machine Learning and Compression

Perception Loss Function Adaptive to Rate for Learned Video Compression

Sadaf Salehkalaibar · Truong Buu Phan · João Atz Dick · Ashish Khisti · Jun Chen · Wei Yu


Abstract:

We consider causal, low-latency, sequential video compression with mean squared error (MSE) as the distortion loss and a perception loss function (PLF) to enhance the realism of outputs. Prior works have employed two PLFs: one based on the joint distribution (JD) of all frames up to the current one, and the other on the frame-wise marginal distributions (FMD). We introduce a new PLF, called \emph{adaptive to rate (AR)}, which preserves the joint distribution of the current frame with all previous reconstructions. Through information-theoretic analysis and deep-learning experiments, we show that PLF-AR can rectify past errors in future reconstructions when the initial frame is compressed at a low bitrate, whereas in the same low-bitrate regime PLF-JD exhibits the error-permanence phenomenon, propagating mistakes into subsequent outputs. When the initial frame is compressed at a high bitrate, PLF-AR maintains temporal correlation among frames and prevents error propagation in future reconstructions---unlike PLF-JD, which remains stuck in past mistakes. Furthermore, PLF-FMD does not preserve temporal correlation as effectively as PLF-AR. These characteristics of the PLFs are especially apparent in scenarios with sharp frame movements. When frame movements are smoother, the three PLFs differ only slightly: PLF-AR and PLF-JD yield more diverse outputs, while PLF-FMD tends to replicate the initial frame in all future reconstructions. We validate our findings through an information-theoretic analysis of the rate-distortion-perception tradeoff for the Gauss-Markov source model and through deep-learning experiments on the Moving MNIST and UVG datasets.
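The three perception constraints described above can be sketched as distribution-matching conditions. The notation below is assumed for illustration (source frames $X_t$, reconstructions $\hat{X}_t$); the paper's formal definitions may differ in detail:

```latex
% Sketch of the three PLF constraints (notation assumed, not from this listing).

% PLF-JD: match the joint distribution of all frames up to time t
P_{\hat{X}_1,\dots,\hat{X}_t} = P_{X_1,\dots,X_t}

% PLF-FMD: match only the per-frame marginal distribution
P_{\hat{X}_t} = P_{X_t}

% PLF-AR: match the joint distribution of the current source frame
% together with all previous reconstructions
P_{\hat{X}_t,\,\hat{X}_1,\dots,\hat{X}_{t-1}} = P_{X_t,\,\hat{X}_1,\dots,\hat{X}_{t-1}}
```

Under this reading, PLF-AR conditions the current frame's realism on what was actually reconstructed rather than on the original past frames, which is consistent with the abstract's claim that it can rectify past errors instead of propagating them.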