Poster in Workshop: Workshop on Machine Learning and Compression

SNeRV: Scalable Neural Representations for Video Coding

Yiying Wei · Hadi Amirpour · Christian Timmerer


Abstract:

Scalable (or layered) video coding encodes a video stream into multiple layers so that it can be decoded at different levels of quality or resolution, depending on the capabilities of the device or the available network bandwidth. Traditional approaches are built as extensions of existing video codec standards but lack industry deployment. In this paper, we propose a Scalable Neural Representation (SNeRV) for video coding that encodes multi-resolution/-quality videos into a single neural network comprising multiple layers. The base layer (BL) of the neural network encodes the lowest resolution/quality of the video stream. Enhancement layers (ELs) encode additional information that, using the BL as a starting point, can be used to reconstruct a higher-resolution/-quality video during the decoding process. This multi-layered structure allows the scalable bitstream to be truncated to adapt to the client's bandwidth conditions or computational decoding requirements. Unlike conventional video codecs, which are constrained by complex, highly engineered modules, SNeRV represents a video as a neural network and can employ any model weight compression method for video compression. Experimental results demonstrate that SNeRV outperforms H.264/AVC's Scalable Video Coding (SVC) extension and achieves comparable decoding speed at high resolutions.
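The base-layer/enhancement-layer idea and the effect of truncating the bitstream can be illustrated with a small sketch. This is not the SNeRV network: it swaps in a classic residual pyramid over a 1-D signal as a stand-in, and the names `encode`, `decode`, `downsample`, and `upsample` are hypothetical helpers chosen for illustration. It only shows how a decoder that receives fewer layers still obtains a valid, lower-resolution reconstruction.

```python
# Toy illustration only: a residual pyramid over a 1-D signal, NOT the SNeRV
# neural network. All function names here are hypothetical.

def downsample(x):
    """Halve the resolution by averaging adjacent pairs of samples."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]

def upsample(x):
    """Double the resolution by repeating each sample."""
    return [v for v in x for _ in range(2)]

def encode(signal, num_enhancement_layers):
    """Return [base_layer, el_1, ..., el_n], ordered coarsest first."""
    # Build the resolution pyramid from finest to coarsest, then flip it.
    pyramid = [list(signal)]
    for _ in range(num_enhancement_layers):
        pyramid.append(downsample(pyramid[-1]))
    pyramid.reverse()

    layers = [pyramid[0]]  # base layer: the lowest resolution
    for coarse, fine in zip(pyramid, pyramid[1:]):
        predicted = upsample(coarse)
        # Each enhancement layer stores only what the coarser level misses.
        layers.append([f - p for f, p in zip(fine, predicted)])
    return layers

def decode(layers, num_layers_received):
    """Reconstruct from the base layer plus the enhancement layers received."""
    recon = layers[0]
    for residual in layers[1:num_layers_received]:
        recon = [p + r for p, r in zip(upsample(recon), residual)]
    return recon

signal = [1.0, 3.0, 2.0, 6.0, 5.0, 5.0, 0.0, 4.0]
layers = encode(signal, num_enhancement_layers=2)

base_only = decode(layers, 1)  # truncated stream: lowest resolution
mid_level = decode(layers, 2)  # base layer + first enhancement layer
full = decode(layers, 3)       # all layers: original resolution
```

In this toy setting, dropping trailing layers corresponds to truncating the scalable bitstream: a client with less bandwidth or compute stops after the base layer, while a more capable client keeps adding enhancement layers.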
