Andrey Norkin

Deep learning for speeding up a video encoder

Deep learning can actually make video encoding faster. In this work, we used a CNN to speed up VP9 intra-frame encoding by more than three times at the cost of a 1.71% bitrate increase at the same quality. A fully convolutional CNN predicts how to partition a superblock without performing the rate-distortion optimization (RDO) recursively. More details here.
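The speedup comes from replacing the exhaustive recursive RDO search with a lookup of CNN-predicted split decisions. A minimal sketch of that idea, with the CNN itself omitted and stubbed by a probability function (the function names, quadtree-only splits, and the 0.5 threshold are illustrative assumptions, not the exact design from the paper):

```python
def partition(x, y, size, split_prob, min_size=8, threshold=0.5):
    """Return a list of leaf blocks (x, y, size), chosen top-down.

    split_prob(x, y, size) -> predicted probability that the block
    should be split further; in the real encoder this lookup replaces
    a recursive rate-distortion search over all partitionings.
    """
    if size > min_size and split_prob(x, y, size) >= threshold:
        half = size // 2
        blocks = []
        for dy in (0, half):
            for dx in (0, half):
                blocks += partition(x + dx, y + dy, half, split_prob,
                                    min_size, threshold)
        return blocks
    return [(x, y, size)]

# Toy predictor: keep splitting only the top-left corner of a 64x64
# superblock, producing an uneven quadtree.
pred = lambda x, y, size: 0.9 if (x, y) == (0, 0) else 0.1
blocks = partition(0, 0, 64, pred)
```

Each leaf block is then encoded directly, so the cost of trying and discarding candidate partitionings disappears.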

Film Grain Synthesis in AV1

Film grain is widely present in motion picture and TV material and is often part of the creative intent. Since film grain has a high degree of randomness, it is difficult to compress efficiently. The AV1 video codec includes mandatory support for film grain synthesis: the encoder models the film grain or sensor noise, the video is denoised before encoding, and the synthesized grain is added back after decoding, which enables significant bitrate savings on grainy content. More details here.
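In the spirit of AV1's tool, grain can be synthesized with a small autoregressive filter over Gaussian noise and then scaled and added to the decoded (denoised) picture. The sketch below is a simplified illustration of that pipeline; the AR lag, coefficients, and scaling are illustrative assumptions and do not reproduce AV1's actual film grain syntax:

```python
import random

def generate_grain(width, height, ar_coeff=0.25, sigma=4.0, seed=0):
    """Gaussian noise shaped by a causal AR-style predictor that uses
    the left and top neighbours, giving spatially correlated grain."""
    rng = random.Random(seed)
    grain = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            pred = ar_coeff * (grain[y][x - 1] if x else 0.0) \
                 + ar_coeff * (grain[y - 1][x] if y else 0.0)
            grain[y][x] = pred + rng.gauss(0.0, sigma)
    return grain

def add_grain(pixel, g, strength=1.0):
    """Add scaled grain to one 8-bit luma sample, clipping to [0, 255]."""
    return min(255, max(0, round(pixel + strength * g)))
```

Because only the grain parameters (not the grain itself) are transmitted, the decoder can regenerate perceptually similar grain at a tiny bitrate cost.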

HDR video pre-processing

One of the options for streaming high dynamic range (HDR) video is HDR10, which uses HEVC Main10 encoding, the ST.2084 transfer function, the BT.2020 color space, and the Y'CbCr 4:2:0 non-constant luminance color format. The combination of non-constant luminance Y'CbCr 4:2:0 with a highly non-linear transfer function can sometimes cause artifacts in saturated colors at the boundaries of the color gamut. A fast algorithm that alleviates this problem via a closed-form solution has been proposed. More details here.
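The underlying objective can be sketched as follows: with Cb' and Cr' held at their subsampled values, re-derive Y' so that the decoded linear luminance matches the original. The closed-form solution from the paper is not reproduced here; a bisection search illustrates the same objective. The ST 2084 constants and BT.2020 luma weights are the published values, while the function names are illustrative:

```python
# SMPTE ST 2084 (PQ) constants
m1, m2 = 2610 / 16384, 2523 / 4096 * 128
c1, c2, c3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32

def pq_oetf(y):
    """Linear light in [0, 1] (1.0 = 10000 nits) -> PQ code value."""
    p = y ** m1
    return ((c1 + c2 * p) / (1 + c3 * p)) ** m2

def pq_eotf(v):
    """PQ code value -> linear light in [0, 1]."""
    p = v ** (1 / m2)
    return (max(p - c1, 0.0) / (c2 - c3 * p)) ** (1 / m1)

def adjust_luma(y_target, cb, cr, iters=30):
    """Bisect for Y' whose decoded linear luminance equals y_target,
    with Cb', Cr' fixed to their (subsampled) values. Uses BT.2020
    non-constant luminance weights and inverse-quantization factors."""
    wr, wg, wb = 0.2627, 0.6780, 0.0593
    def luminance(yp):
        r = min(max(yp + 1.4746 * cr, 0.0), 1.0)
        b = min(max(yp + 1.8814 * cb, 0.0), 1.0)
        g = min(max((yp - wr * r - wb * b) / wg, 0.0), 1.0)
        return wr * pq_eotf(r) + wg * pq_eotf(g) + wb * pq_eotf(b)
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if luminance(mid) < y_target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Because decoded luminance is monotone in Y' (for fixed chroma), the search converges quickly; a closed-form solution removes the iteration entirely.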

Deblocking filter in HEVC/H.265

Like many other video codecs, HEVC uses block-based prediction and transform coding. In such a coding scheme, discontinuities, called blocking artifacts, can occur in the reconstructed signal at block boundaries. The HEVC/H.265 standard defines an in-loop deblocking filter that can significantly reduce these visual artifacts, improving perceptual video quality and contributing to more efficient compression. Various aspects of the HEVC deblocking filter design are discussed in this article.

Keeping perceived 3D scene proportions by adjusting virtual camera parameters

Stereo perception is a common factor in 3D displays, 360-degree video, and AR/VR applications. 3D and stereo material is typically optimized for a particular display size and viewing distance. When the content is shown on a different display size and/or from a different viewing distance, the perceived proportions of objects are distorted compared to the intended ones. This can make the scene look unnatural and even lead to eye strain and fatigue when observing the content. This article proposes a solution for adapting 3D scene rendering parameters to new viewing conditions such that the perceived proportions of objects in the 3D scene match those under the reference viewing conditions.