AV1@Scale: Film Grain Synthesis, The Awakening

Original link: https://netflixtechblog.com/av1-scale-film-grain-synthesis-the-awakening-ee09cfdff40b

Li-Heng Chen and team at Netflix are enhancing the streaming experience with AV1 Film Grain Synthesis (FGS), now enabled at scale. FGS tackles the challenge of compressing film grain, a crucial element for visual depth and realism that traditional compression handles poorly. FGS models film grain with two components: a Film Grain Pattern, which replicates the grain's spatial correlation using auto-regressive (AR) coefficients, and a Film Grain Intensity, which adjusts grain strength to pixel brightness and color using a piecewise linear scaling function. The encoder denoises the video, compresses it, and transmits the estimated grain pattern and intensity parameters. During playback, the grain is recreated and added back with a block-based method suited to consumer devices. Because the grain is removed before compression, the video is easier to compress, which yields significant bitrate savings while preserving the artistic intent of the original film grain. The article illustrates this with a frame from "They Cloned Tyrone".

Here's a summary of the Hacker News thread discussing the Netflix article on AV1 film grain synthesis: The article describes how Netflix removes film grain before encoding and re-synthesizes it at playback. Some commenters criticize the implementation for potentially over-blurring and not perfectly replicating the original grain, while others appreciate the "at scale" aspect, since it means the technology can be deployed broadly. The merits of adding noise are debated: some find that grain enhances realism and depth and hides compression artifacts, while others argue it is an artistic choice and aesthetic preference that gets associated with authenticity because grainy media has historically been correlated with real-life footage. Some mention a possible improvement in perceived quality via stochastic resonance. Others dislike the grain and want to be able to disable it, and some users feel more attached to the grain of particular eras. Overall, there is interest in the technology, its deployment challenges, and the range of views on its aesthetic value.

Original text

Li-Heng Chen, Andrey Norkin, Liwei Guo, Zhi Li, Agata Opalach and Anush Moorthy

Picture this: you’re watching a classic film, and the subtle dance of film grain adds a layer of authenticity and nostalgia to every scene. This grain, formed from tiny particles during the film’s development, is more than just a visual effect. It plays a key role in storytelling by enhancing the film’s depth and contributing to its realism. However, film grain is as elusive as it is beautiful. Its random nature makes it notoriously difficult to compress. Traditional compression algorithms struggle to manage it, often forcing a choice between preserving the grain and reducing file size.

In the digital age, noise remains a ubiquitous element in video content. Camera sensor noise introduces its own characteristics, while filmmakers often add intentional grain during post-production to evoke mood or a vintage feel. These elements create a visually rich experience that tests conventional compression methods.

We’re giving members globally a transformed streaming experience with the recent rollout of AV1 Film Grain Synthesis (FGS) streams. While FGS has been part of the AV1 standard since its inception, we only enabled it for a limited number of titles during our initial launch of the AV1 codec in 2021. Now, we’re enabling this innovative technology at scale, leveraging it to preserve the artistic integrity of film grain while optimizing data efficiency. In this blog post, we’ll explore how FGS revolutionizes video streaming and enhances your viewing experience.

The AV1 Film Grain Synthesis tool models film grain through two key components, with model parameters estimated before the encoding of the denoised video:

Film Grain Pattern: an auto-regressive (AR) model is used to replicate the pattern of film grain. The key parameters are the AR coefficients, which can be estimated from the residual between the source video and the denoised video, essentially capturing the noise. This model captures the spatial correlation between the grain samples, ensuring that the noise characteristics of the original content are accurately preserved. By adjusting the auto-regressive coefficients {a_i}, the model can control the grain’s shape, making it appear coarser or finer. With these coefficients, a 64x64 noise template is generated, as illustrated in the animation below. To construct the noise layer during playback, random 32x32 patches are extracted from the 64x64 noise template and added to the decoded video.

Fig. 1 The synthesis process of the 64x64 noise template using the simplest AR kernel with a lag parameter L=1. Each noise value is calculated as a linear combination of previously synthesized noise sample values, with AR coefficients a0, a1, a2, a3 and a white Gaussian noise (wgn) component.
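
The pattern model can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the normative AV1 procedure: the function names `estimate_ar_coeffs` and `synthesize_template`, the floating-point arithmetic, the padding width, and the ordering of a0..a3 (top-left, top, top-right, left) are all choices made here for the example. The AV1 specification defines its own integer arithmetic and pseudo-random generator.

```python
import numpy as np


def estimate_ar_coeffs(residual):
    """Least-squares fit of lag-1 AR coefficients to a grain residual.

    `residual` stands for (source - denoised) on one plane. Returns four
    coefficients for the neighbours top-left, top, top-right and left
    (an ordering assumed for this sketch).
    """
    r = np.asarray(residual, dtype=np.float64)
    # Every sample that has all four causal neighbours inside the plane.
    target = r[1:, 1:-1].ravel()
    neighbours = np.stack(
        [
            r[:-1, :-2].ravel(),   # top-left
            r[:-1, 1:-1].ravel(),  # top
            r[:-1, 2:].ravel(),    # top-right
            r[1:, :-2].ravel(),    # left
        ],
        axis=1,
    )
    coeffs, *_ = np.linalg.lstsq(neighbours, target, rcond=None)
    return coeffs


def synthesize_template(coeffs, size=64, sigma=1.0, seed=0, pad=8):
    """Generate a size x size grain template with a lag-1 AR filter.

    Each sample is white Gaussian noise plus a linear combination of
    already-synthesized neighbours, visited in raster order. A padded
    border (hypothetical width `pad`) is synthesized and cropped away
    so that start-up effects at the edges are reduced.
    """
    a0, a1, a2, a3 = coeffs
    rng = np.random.default_rng(seed)
    n = size + 2 * pad
    g = rng.normal(0.0, sigma, (n, n))  # starts out as pure white noise
    for y in range(1, n):
        for x in range(1, n - 1):
            g[y, x] += (
                a0 * g[y - 1, x - 1]
                + a1 * g[y - 1, x]
                + a2 * g[y - 1, x + 1]
                + a3 * g[y, x - 1]
            )
    return g[pad:pad + size, pad:pad + size]
```

Larger positive coefficients increase the correlation between neighbouring samples, which reads as coarser grain; all-zero coefficients leave plain white noise. The coefficients need to stay small enough for the recursion to remain stable.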

Film Grain Intensity: a scaling function is employed to control the grain’s appearance under varying lighting conditions. This function, estimated during the encoding process, models the relationship between pixel value and noise intensity using a piecewise linear function. This allows for precise adjustments to the grain strength based on video brightness and color. Consequently, the film grain strength is adapted to the areas of the picture, closely recreating the look of the original video. The animation below demonstrates how the grain intensity is adjusted by the scaling function:

Fig. 2 Illustration of the scaling function’s impact on film grain intensity. Left: The scaling function graph showing the relationship between pixel value and scaling intensity. Right: A grayscale SMPTE bars frame with film grain applied according to the scaling function.
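
A piecewise linear scaling function of this kind can be sketched as follows. The control points, the names `grain_scale` and `add_scaled_grain`, and the use of floating-point multipliers are assumptions for illustration; the actual AV1 bitstream encodes the scaling points and applies them in fixed-point arithmetic.

```python
import numpy as np


def grain_scale(pixels, points):
    """Evaluate a piecewise linear scaling function.

    `points` is a list of (pixel_value, scale) pairs with increasing
    pixel_value. Values outside the covered range take the scale of
    the nearest end point.
    """
    xs, ys = zip(*points)
    return np.interp(pixels, xs, ys)


def add_scaled_grain(plane, grain, points, bit_depth=8):
    """Add grain to a plane, with strength set by the local pixel value."""
    noisy = plane.astype(np.float64) + grain_scale(plane, points) * grain
    max_value = (1 << bit_depth) - 1
    return np.clip(np.rint(noisy), 0, max_value).astype(plane.dtype)


# Hypothetical curve: no grain in black, strongest in mid-tones,
# weaker again in highlights.
example_points = [(0, 0.0), (128, 8.0), (255, 2.0)]
```

With `example_points`, a pixel value of 64 falls halfway along the first segment and receives a scale of 4.0, so unit-variance grain is added with a standard deviation of about four code values there.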

With these models specified by the AV1 standard, the encoding process first removes the film grain from the video. The standard does not mandate a specific method for this step, allowing users to choose their preferred denoiser. Following the denoising, the video is compressed, and the grain’s pattern and intensity are estimated and transmitted alongside the compressed video data. During playback, the film grain is recreated and reintegrated into the video using a block-based method. This approach is optimized for consumer devices, ensuring smooth playback and high-quality visuals. For a more detailed explanation, please refer to the original paper.
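
The block-based reconstruction at playback can be sketched as tiling the frame with randomly positioned 32x32 patches taken from the 64x64 template. This is a simplified sketch: the function name `build_noise_layer` and the NumPy random generator are assumptions made here, and it ignores details such as how a real decoder derives its random offsets and handles block edges.

```python
import numpy as np


def build_noise_layer(height, width, template, block=32, seed=0):
    """Tile a frame-sized noise layer from random patches of a template.

    For each block x block tile of the frame, a patch at a random offset
    inside `template` is copied. Tiles on the right and bottom edges are
    cropped to the frame size.
    """
    rng = np.random.default_rng(seed)
    th, tw = template.shape
    layer = np.empty((height, width), dtype=np.float64)
    for y in range(0, height, block):
        for x in range(0, width, block):
            # Offsets chosen so the full patch stays inside the template.
            oy = int(rng.integers(0, th - block + 1))
            ox = int(rng.integers(0, tw - block + 1))
            bh = min(block, height - y)
            bw = min(block, width - x)
            layer[y:y + bh, x:x + bw] = template[oy:oy + bh, ox:ox + bw]
    return layer
```

The resulting layer would then be scaled by the intensity function and added to the decoded frame. Because only a small template and a few parameters are needed, the per-frame work is limited to copying patches, scaling, and adding, which is consistent with the goal of running on consumer devices.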

By combining these components, the AV1 Film Grain Synthesis tool preserves the artistic integrity of film grain while making the content “easier to compress” by denoising the source video prior to encoding. This process enables high-quality video streaming, even in content with heavy grain, resulting in significant bitrate savings and improved visual quality.

In our pursuit of premium streaming quality, enabling AV1 Film Grain Synthesis has led to significant bitrate reduction, allowing us to deliver high-quality video with less data while preserving the artistic integrity of film grain. Below, we showcase visual examples highlighting the improved quality and reduced bitrate, using a frame from the Netflix title They Cloned Tyrone: