Andrey Norkin

AOMedia Film Grain Synthesis 1 (AFGS1)

Published March 5, 2024.

In January 2024, the Alliance for Open Media (AOMedia) published the AOMedia Film Grain Synthesis 1 (AFGS1) specification. AFGS1 is a standalone film grain synthesis specification that can be used with a number of video codecs. Film grain is widely present in professional video content, such as movies and TV shows, and is notoriously difficult to compress using traditional video compression approaches. To overcome this problem, AFGS1 can be used to facilitate the following approach for compressing video with film grain (see Fig. 1), which is similar to the AV1 film grain synthesis technology.

Fig. 1. Film grain framework in AFGS1 and AV1 [1].

When the grain is detected, the source video is denoised. The encoder efficiently compresses the denoised video. The denoised video is also analyzed to find flat regions. These flat regions are used to estimate film grain model parameters from the difference between the noisy and denoised versions of the video. The compressed video is sent to the decoder along with the film grain parameters. In the decoder, the compressed denoised video is decoded, and the grain is synthesized from the received grain parameters and added to the pictures before they are displayed. You can find more information on the importance of film grain preservation and difficulties encoding the film grain titles in my post on the AV1 film grain synthesis.

AFGS1 specification basics

The AFGS1 specification establishes a standalone film grain synthesis technology that can be used with a variety of video codecs. When applied to a decoded picture, this technology is equivalent to the film grain synthesis used in AV1. This means that when applied to a frame, AFGS1 uses exactly the same film grain parameters and the same low-level reference algorithm as the AV1 film grain synthesis, which results in essentially the same output. As described in the blog post on AV1 film grain synthesis, an auto-regressive (AR) process is used to create film grain templates for three color components (or one component in the case of monochrome video). The grain strength is signaled using so-called scaling functions that map a decoded sample's luma value or a linear combination of chroma and luma components' values to the grain strength. After the grain templates have been generated, the grain is applied to the decoded picture in a raster scan order in 32x32 blocks. The reader can find more details on the film grain synthesis reference process in AV1 and AFGS1 in the blog post on AV1 film grain. Except for the reference process, which is exactly the same in both specifications, AFGS1 has some differences from the AV1 film grain synthesis, which are described below.

Transport of AFGS1 model parameters

AFGS1 specifies the format for transmission of the film grain model parameters that uses the ITU-T T.35 registered user data mechanism. Many modern video codecs use ITU-T T.35 to define registered user data SEI or metadata messages, which allows defining SEI message syntax outside of the main video codec specification while also providing means to uniquely identify the SEI message for the decoder that supports it. Decoders that do not support this SEI message would simply ignore it. Another example of technology that uses the ITU-T T.35 SEI/metadata messages is HDR10+.

Using ITU-T T.35 allows AFGS1 parameters to be signaled in most video codec bitstreams, including AVC/H.264, HEVC/H.265, VVC/H.266, and AV1 without the need to modify any of these specifications. The relevant ITU-T T.35 syntax in AFGS1 is shown in Fig. 2. The syntax elements that identify the AFGS1 payload are itu_t_t35_country_code that shall be equal to 0xB5 (USA), itu_t_t35_terminal_provider_code equal to 0x5890 (AOMedia), and itu_t_t35_terminal_provider_oriented_code equal to 0x01 (AFGS1 metadata). If a video decoding device supports the AFGS1 specification, it will be able to parse the film grain parameters and apply the film grain synthesis process to the decoded video according to the transmitted parameters.

Fig. 2. AFGS1 ITU-T T.35 SEI message / metadata syntax.
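To make the identification step more concrete, here is a minimal sketch of how a decoder might recognize an AFGS1 payload among other T.35 messages. The three code values come from the text above. The byte layout (one byte for the country code, two big-endian bytes for the provider code, one byte for the provider-oriented code) and the function name is_afgs1_payload are assumptions made for illustration; the AFGS1 specification remains the authority on the exact syntax.

```python
AFGS1_COUNTRY_CODE = 0xB5            # itu_t_t35_country_code (USA)
AFGS1_PROVIDER_CODE = 0x5890         # itu_t_t35_terminal_provider_code (AOMedia)
AFGS1_PROVIDER_ORIENTED_CODE = 0x01  # itu_t_t35_terminal_provider_oriented_code


def is_afgs1_payload(payload: bytes) -> bool:
    """Return True if a T.35 payload starts with the AFGS1 identifiers.

    Assumed layout (an illustration, not quoted from the specification):
    1-byte country code, 2-byte big-endian provider code,
    1-byte provider-oriented code.
    """
    if len(payload) < 4:
        return False
    country = payload[0]
    provider = int.from_bytes(payload[1:3], "big")
    oriented = payload[3]
    return (
        country == AFGS1_COUNTRY_CODE
        and provider == AFGS1_PROVIDER_CODE
        and oriented == AFGS1_PROVIDER_ORIENTED_CODE
    )
```

A decoder without AFGS1 support would never run such a check and would simply skip the message, which is the behavior that makes the T.35 mechanism safe to deploy.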

Multiple AFGS1 parameter sets per picture

In AV1, a film grain model parameter set is sent for the decoded frame, which has certain characteristics, such as the resolution, color space (typically YCbCr) etc. The AV1 reference implementation applies the film grain synthesis process at the decoded frame resolution, as shown in Fig. 3. If the video needs to be upscaled for display, it happens after applying the grain. Other possible blocks in the display pipeline are omitted for simplicity.

Fig. 3. Example of AV1 display pipeline.

An AFGS1 parameter set includes the frame (or picture) resolution and other parameters, such as CICP color space information. Including this information serves two purposes. First, this allows the decoder to check that a correct set of the film grain parameters has been sent with the bitstream, for example, when transcoding has been applied to the bitstream at a certain point. In addition, AFGS1 allows sending several sets of the film grain parameters for each decoded frame, where these parameter sets correspond to unique combinations of resolution, color space, chroma components subsampling, bit depth, and video range flag. Up to 8 film grain parameter sets can be sent for one frame. Also, when the encoder is sending an AFGS1 SEI message with multiple parameter sets, it must send a set of film grain parameters that corresponds to the decoded picture format and characteristics.

In other words, the encoder is expected to send a set of film grain parameters that corresponds to the decoded picture resolution and may also choose to send additional parameter sets that correspond to other (usually higher) picture resolutions [ 2 ]. If a decoder supports this optional feature, it may choose the set of film grain parameters that corresponds to its display resolution (or other higher resolution) and apply the film grain synthesis after upscaling the decoded picture (see Fig. 4). This approach may be helpful when sending video at low resolutions, where the film grain may not be preserved because of the downscaling.

Fig. 4. Example of AFGS1 optional display pipeline.
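The selection logic described above can be sketched as follows. This is only an illustration under simplifying assumptions: the GrainParamSet type and the select_param_set function are hypothetical names, a parameter set is identified by resolution only (the real identification also covers color space, chroma subsampling, bit depth, and video range), and only an exact match with the display resolution is considered.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class GrainParamSet:
    idx: int     # film_grain_param_set_idx
    width: int   # resolution the set was estimated for
    height: int


def select_param_set(
    sets: List[GrainParamSet],
    decoded_res: Tuple[int, int],
    display_res: Tuple[int, int],
    grain_after_upscaling: bool,
) -> GrainParamSet:
    """Pick the set for the resolution at which grain will be applied."""
    by_res = {(s.width, s.height): s for s in sets}
    # Optional path: apply the grain after upscaling, if a matching set exists.
    if grain_after_upscaling and display_res in by_res:
        return by_res[display_res]
    # Fallback: the set for the decoded picture resolution.
    if decoded_res not in by_res:
        raise ValueError("no parameter set for the decoded picture resolution")
    return by_res[decoded_res]
```

A device that does not implement the optional pipeline would call this with grain_after_upscaling set to False and would always end up with the set for the decoded resolution.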

The mandatory set of parameters for the decoded picture resolution serves several purposes. First, some decoders may choose to add film grain at the decoded picture resolution when the video is shown in a smaller window on the display. Second, some decoders may output video with added film grain for further transcoding. A device may also decide to add the grain at the decoded resolution in a lower energy mode. Finally, the mandatory set of film grain parameters for the decoded picture resolution and format provides compatibility with AV1 film grain synthesis implementations. This use case is shown in Fig. 5, where the AFGS1 film grain pipeline is similar to the one used in the AV1 film grain synthesis.

Fig. 5. Example of AV1-compatible display pipeline in AFGS1.

Film grain model parameters prediction and signaling

Temporal (horizontal) parameters copying

The film grain parameters are not necessarily expected to change with every frame. In some video sequences, a set of film grain parameters may represent the film grain sufficiently well for a whole scene. In this case, it is more efficient to reuse a previously signaled set of parameters rather than sending it again. To support this scenario, AV1 allows copying of film grain parameters that belong to previously decoded frames. In particular, AV1 can use the reference frame index to select a set of film grain parameters associated with a frame in the decoded frame buffer.

Likewise, AFGS1 allows using a previously sent film grain parameter set. In particular, the film grain parameters are associated with a film_grain_param_set_idx that can take values from 0 to 7. As shown in Fig. 6, when update_grain_flag is equal to 1, a new set of film grain parameters is parsed and written to the memory associated with the value of film_grain_param_set_idx signaled for this parameter set.

Fig. 6. Updating and saving a parameter set.

Later, another film grain parameter set with update_grain_flag equal to 0 can load film grain model parameters from the memory associated with the (same) value of film_grain_param_set_idx (see Fig. 7 for details). This kind of parameter copy is only allowed between parameter sets that share the same resolution and other picture characteristics and that are associated with different output pictures. Memory associated with a previously signaled parameter set can also be overwritten by a subsequent parameter set. Naturally, for a coded video sequence, the encoder shall first update/save a parameter set associated with a certain value of film_grain_param_set_idx before it can load a parameter set associated with that film_grain_param_set_idx value. A maximum of 8 different film grain parameter sets can be kept by the decoder at the same time.

Fig. 7. Loading a parameter set.
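The update and load behavior from Fig. 6 and Fig. 7 can be sketched as a small store with 8 slots. The class name GrainParamStore and the receive method are hypothetical, the parameters are treated as an opaque object, and the check that a copy happens only between sets with the same picture characteristics is left out for brevity.

```python
class GrainParamStore:
    """Decoder-side memory indexed by film_grain_param_set_idx (0..7)."""

    NUM_SLOTS = 8

    def __init__(self):
        self._slots = [None] * self.NUM_SLOTS

    def receive(self, film_grain_param_set_idx, update_grain_flag, params=None):
        """Handle one signaled parameter set and return the active parameters."""
        idx = film_grain_param_set_idx
        if not 0 <= idx < self.NUM_SLOTS:
            raise ValueError("film_grain_param_set_idx must be in 0..7")
        if update_grain_flag:
            # Update: save the newly parsed parameters, overwriting the slot.
            self._slots[idx] = params
            return params
        # Load: reuse what was saved earlier under the same index.
        stored = self._slots[idx]
        if stored is None:
            raise LookupError(f"slot {idx} was never updated")
        return stored
```

The LookupError branch corresponds to a bitstream in which a set is loaded before it has been saved, which a conforming encoder is expected to avoid.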

Inter-resolution (vertical) parameters prediction

It is well understood that multiple frames of the same resolution can share a set of film grain parameters. It may be less obvious whether film grain parameters from lower and higher resolution representations of the same picture or video are related. In particular, AFGS1 uses an auto-regressive model to represent spatial correlation between the film grain samples (you can find more details here). Typically, different resolutions would use different sets of AR coefficients to represent the film grain.

Unlike the AR coefficients, film grain strength associated with different luma and chroma sample values is coded with the piece-wise linear scaling function, which indicates how much the grain from the template generated with the AR process is scaled for each reconstructed sample value. This function would be different between different resolution representations of a video sequence. In particular, one can expect that the film grain variance decreases with decreasing picture resolution due to low-pass filtering applied during the downsampling. Fig. 8 demonstrates scaling functions at different resolutions of the same video sequence [ 3 ]. The results in Fig. 8 have been generated with the libaom film grain analysis software. One can see that higher resolutions have higher grain strength, but in general, the scaling functions have a similar shape.

Fig. 8. Scaling functions at different resolutions.

One can use a linear model to predict a scaling function estimated for one resolution from a scaling function estimated for another resolution of the same sequence. Fig. 9 shows scaling functions of several resolutions predicted from the scaling function of another resolution. Predicted scaling functions are shown with dashed lines, and the function used as a reference with the bold line. The least squares method has been used to fit the linear model. The y coordinates of the functions are found as

PointYScaling[ i ] = y_scaling_mult * PointYScalingRef[ i ] + y_scaling_add,

where PointYScalingRef[ i ] are values of the scaling function used for prediction, PointYScaling[ i ] are values of the function that is predicted, and y_scaling_mult and y_scaling_add are the linear model parameters. The x (sample) values of the scaling functions are the same as in the reference function.

Fig. 9. Scaling functions prediction between resolutions.
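The least squares fit of the linear model above can be sketched in a few lines. This is a floating-point illustration only: the function names fit_linear_model and predict_scaling are hypothetical, and the sketch ignores how y_scaling_mult and y_scaling_add are quantized and represented in the actual syntax.

```python
def fit_linear_model(ref, target):
    """Least squares fit of target[i] ~ mult * ref[i] + add.

    ref holds the PointYScalingRef[i] values and target the scaling function
    values at the other resolution, sampled at the same x coordinates.
    """
    n = len(ref)
    if n != len(target) or n < 2:
        raise ValueError("need two or more points and equal lengths")
    mean_ref = sum(ref) / n
    mean_target = sum(target) / n
    var_ref = sum((r - mean_ref) ** 2 for r in ref)
    if var_ref == 0:
        raise ValueError("reference scaling function is constant")
    cov = sum((r - mean_ref) * (t - mean_target) for r, t in zip(ref, target))
    mult = cov / var_ref
    add = mean_target - mult * mean_ref
    return mult, add


def predict_scaling(ref, mult, add):
    """Apply PointYScaling[i] = mult * PointYScalingRef[i] + add."""
    return [mult * r + add for r in ref]
```

The difference between the target values and the output of predict_scaling is the prediction error that the linear model leaves behind.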

It can be seen in Fig. 9 that the scaling functions are predicted somewhat closely, even though not exactly. For some applications, this precision of scaling function values may be sufficient. For applications that need to signal the model parameters more precisely, the specification offers optional signaling of residual values point_y_scaling_res[ i ] that are added to the scaling function after prediction. It is possible to choose the granularity of the residual values with the syntax element y_scaling_res_granularity, which is similar to quantization used in image and video codecs. In the description above, the parameters for the luma ( Y ) component have been used. The parameter names for the chroma components can be obtained by replacing y with cb and cr in the syntax element names, respectively. The scaling function prediction in AFGS1 is only allowed from the first parameter set signaled in an AFGS1 SEI message.

Note that the same range of film grain parameter set indices film_grain_param_set_idx is shared between "horizontal" and "vertical" predictions. This means that the total number of parameter sets simultaneously stored in an AFGS1 module for all resolutions and timestamps cannot exceed 8 at any moment of time.

Parameter prediction restrictions

As previously mentioned, film grain parameters in AFGS1 can either be copied "horizontally", from a previously received picture at the same resolution, or predicted "vertically", from a parameter set at a different resolution associated with the same picture. "Vertical" scaling function prediction is only allowed inside one SEI/metadata message, and only from the first film grain parameter set signaled in this SEI/metadata message. Scaling function prediction between different SEI messages is not allowed, even though more than one AFGS1 SEI message can be associated with a picture. A "diagonal" prediction or copy, from a parameter set associated with another picture and a different resolution, is explicitly prohibited. Fig. 10 shows examples of parameter copy and scaling function prediction. Note that the first parameter set does not have to be the one associated with the decoded picture resolution. The order of the parameter sets is flexible, and any set can be signaled first and used for prediction of other sets.

Fig. 10. Copy and prediction of parameter sets.

In addition to that, the parameter set associated with the decoded picture resolution cannot use the scaling function ("vertical") prediction. It can use a copy from a previously signaled parameter set, but only from a parameter set that is associated with a decoded picture resolution equal to the resolution of the current picture (this restriction is relevant to codecs that allow varying output picture resolution in the same coded video sequence, such as AV1 and VVC). The rationale for these restrictions is that a device that applies film grain synthesis at the decoded picture resolution only needs to keep track of the film grain parameters associated with the decoded picture resolution.

There is also a requirement that the parameter sets signaled for the same picture should all have unique combinations of resolution, color space, and other parameters used to identify the parameter set, and should also use unique values of the film_grain_param_set_idx within one picture.

There are other, more general, restrictions on parameter prediction. For example, a parameter set associated with a certain value of film_grain_param_set_idx should first be updated after a random access point, and only then can it be used for parameter copy or prediction. The parameter prediction and copy should also adhere to restrictions on scalability layers in the underlying video codec bitstream. The encoder has to make sure that FGS prediction and copy do not break, for example, when frames in a temporal scalability layer are dropped to reduce the bitrate.
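Several of the restrictions in this section can be collected into a simple checker for the parameter sets of a single SEI message. This is a hedged sketch, not a conformance tool: the ParamSetInfo type and the check_message function are hypothetical, the picture characteristics are reduced to an opaque tuple, and the rules that span several pictures or messages (copy between pictures, random access, scalability layers) are not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

MAX_PARAM_SETS = 8


@dataclass(frozen=True)
class ParamSetInfo:
    idx: int                              # film_grain_param_set_idx
    characteristics: Tuple                # e.g. (width, height, bit_depth)
    predicted_from: Optional[int] = None  # position of the reference set, if any


def check_message(sets: List[ParamSetInfo], decoded: Tuple) -> List[str]:
    """Return a list of violated restrictions (empty if none were found)."""
    problems = []
    if len(sets) > MAX_PARAM_SETS:
        problems.append("more than 8 parameter sets")
    if len({s.idx for s in sets}) != len(sets):
        problems.append("film_grain_param_set_idx values are not unique")
    if len({s.characteristics for s in sets}) != len(sets):
        problems.append("picture characteristics are not unique")
    if all(s.characteristics != decoded for s in sets):
        problems.append("no set for the decoded picture format")
    for position, s in enumerate(sets):
        if s.predicted_from is None:
            continue
        # "Vertical" prediction may only reference the first set in the message.
        if position == 0 or s.predicted_from != 0:
            problems.append(f"set {s.idx}: prediction must use the first set")
        # The set for the decoded picture resolution must not be predicted.
        if s.characteristics == decoded:
            problems.append(f"set {s.idx}: decoded-resolution set is predicted")
    return problems
```

For example, a message whose first set is for 960x540 (the decoded resolution) and whose second set for 1920x1080 is predicted from it passes the check, while the reverse arrangement, where the 960x540 set is predicted from the 1920x1080 set, is flagged.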

Skipping parameter set

To enable decoders and devices to skip film grain parameter sets they are not planning to use, each parameter set in the AFGS1 SEI message is associated with its payload_size. This allows a decoder to quickly check the parameter set characteristics, such as the resolution it is associated with, and skip further parsing of the parameter set if it is not needed. It also allows parallel parsing of film grain parameter sets. Note that in general, a decoder can only safely discard parameter sets if it only applies film grain synthesis at the decoded picture resolution. Otherwise, the flexible choice of a reference parameter set may result in prediction of the needed parameter set from a parameter set associated with a resolution dropped by the decoder. Therefore, the skipping functionality should be exercised with care.
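The idea of size-prefixed skipping can be illustrated with a toy example. The byte layout below is deliberately simplified and is NOT the AFGS1 syntax: it assumes, purely for illustration, that each record starts with a one-byte payload_size counting the bytes that follow, and that the record begins with a 16-bit width and a 16-bit height in big-endian order. The function name find_param_set is hypothetical as well.

```python
import struct


def find_param_set(buf: bytes, wanted_res):
    """Return the record for wanted_res, skipping all other records.

    Toy layout (an assumption, not the AFGS1 syntax):
    [payload_size: 1 byte][width: 2 bytes][height: 2 bytes][remaining bytes]
    """
    pos = 0
    while pos < len(buf):
        payload_size = buf[pos]
        start = pos + 1
        end = start + payload_size
        if payload_size < 4 or end > len(buf):
            raise ValueError("truncated or malformed record")
        width, height = struct.unpack_from(">HH", buf, start)
        if (width, height) == wanted_res:
            return buf[start:end]
        # Not the set we need: jump over it without parsing its contents.
        pos = end
    return None
```

As noted above, skipping in this way is only safe when the needed set does not depend on a skipped one, which is guaranteed for the set associated with the decoded picture resolution.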

Other improvements to parameters signaling

In addition to prediction of parameter sets from a reference set, more syntax modifications have been added to AFGS1 for efficient parameter signaling. These modifications improve parameter signaling when a parameter set is updated / signaled rather than predicted. When the scaling function is signaled explicitly, the x-coordinate of the scaling function PointYValue[ i ] is signaled as an increment from the previous point coordinate [ 3 ]:

PointYValue[ i ] = PointYValue[ i - 1 ] + point_y_value_increment[ i ].

Since there is a requirement that the scaling function points are signaled in the order of increasing x-coordinate, all increments have positive values. This enables more efficient signaling of the scaling functions since the number of bits used to signal the increment is controlled by point_y_value_increment_bits_minus1 and can be decreased when the number of points is on the higher end.
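The incremental coding of the x-coordinates amounts to a running sum, as the following sketch shows. The function names are hypothetical, the sketch assumes that the coordinate preceding the first point is 0, and the increment values in the usage note are made up for illustration.

```python
from itertools import accumulate


def decode_point_values(increments):
    """Rebuild PointYValue[i] as the running sum of the increments.

    Assumes the coordinate before the first point is 0.
    """
    return list(accumulate(increments))


def bits_for_increments(increments):
    """Smallest bit width (at least 1) that can hold the largest increment."""
    return max(1, max(increments, default=0).bit_length())
```

For instance, the increments 16, 32, 48, 64 decode to the coordinates 16, 48, 96, 160. The largest increment, 64, fits in 7 bits, whereas the largest coordinate, 160, would need 8 bits if it were signaled directly.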

The bits used to signal the scaling function's y-coordinate value PointYScaling[ i ] can also be reduced with the point_y_scaling_bits_minus5 syntax element. The highest number of bits used to signal the parameter is the same as in the AV1 film grain synthesis. However, a lower number of bits may be used when the scaling function's y-coordinate does not use the whole allowed range [0, 255]. In addition to that, the chroma component can use an offset cb_scaling_offset or cr_scaling_offset since in some cases, the scaling function of the chroma component looks somewhat flat and the values may be more efficiently signaled as an offset plus a value in a narrower range.

Finally, it has been observed that the AR coefficients do not always use the whole range of allowed values. The range of AR coefficient values can change depending on the content. Therefore, the standard allows setting the number of bits used to signal AR coefficients with the bits_per_ar_coeff_y_minus5, bits_per_ar_coeff_cb_minus5, and bits_per_ar_coeff_cr_minus5 syntax elements for the luma and two chroma components, respectively. Since up to 74 AR coefficients can be signaled for a film grain parameter set, this optimization can help to reduce the parameter set size significantly in some cases. Note that the number of AR coefficients used by the model can also be reduced by decreasing the value of the syntax element ar_coeff_lag (see the blog post on AV1 film grain synthesis for more details).
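The figure of 74 coefficients follows from the AR model inherited from AV1: for a given ar_coeff_lag, the luma component uses 2 * lag * (lag + 1) coefficients, and each chroma component uses one more, which weights the luma grain. The sketch below reproduces this arithmetic and estimates the number of bits spent on the coefficients. The function names are hypothetical, and the sketch assumes that luma grain is present and that every coefficient of a component is coded with the same number of bits.

```python
def num_ar_coeffs(ar_coeff_lag):
    """Return (luma, cb, cr) AR coefficient counts for a given lag.

    Assumes luma grain is present, so each chroma component has one extra
    coefficient for the luma contribution.
    """
    num_luma = 2 * ar_coeff_lag * (ar_coeff_lag + 1)
    num_chroma = num_luma + 1
    return num_luma, num_chroma, num_chroma


def ar_coeff_bits(ar_coeff_lag, bits_y, bits_cb, bits_cr):
    """Total bits spent on AR coefficients for the given per-component widths."""
    n_y, n_cb, n_cr = num_ar_coeffs(ar_coeff_lag)
    return n_y * bits_y + n_cb * bits_cb + n_cr * bits_cr
```

With ar_coeff_lag equal to 3, the counts are 24, 25, and 25, which adds up to 74. At 8 bits per coefficient this costs 592 bits, while at 5 bits per coefficient, the minimum implied by the minus5 syntax elements, the cost drops to 370 bits.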

It is worth remembering that none of the aforementioned optimizations increases the range of film grain model parameter values compared to AV1. The ranges of some parameter values can only be decreased to enable more efficient signaling when the whole value range is not used.

Relation to AV1 film grain synthesis

As previously mentioned, in AFGS1, the film grain synthesis model and reference algorithm are the same as in AV1. The same set of model parameters is used, and the parameters utilize the same range. The main differences between AFGS1 and AV1 film grain are the registered user data SEI mechanism for sending the model parameters, a different syntax for parameter signaling, and an option to send multiple film grain parameter sets for the output picture that correspond to different resolutions. The decoder should pick one parameter set and apply it at the intended resolution. Since the parameter set for the decoded resolution and format is always present, some devices have an option to always apply the film grain synthesis at the decoded picture resolution.

This makes it possible for some systems that implemented AV1 film grain synthesis to also support AFGS1 after updating the parsing, which is usually implemented in firmware rather than hardware. Certain restrictions to the film grain parameters prediction have been added to facilitate that approach. Some devices supporting AV1 film grain could perhaps support applying the film grain at other resolutions too.

Conformance

There are two conformance options for AFGS1, similar to the AV1 film grain synthesis. Option 1 is to implement a film grain synthesis algorithm whose output exactly matches the output of the reference model. Option 2 is to implement a film grain synthesis algorithm that produces pictures without perceptually significant differences from those produced by the reference model. Option 2 allows more flexibility with respect to the implementation details of the display pipeline.

Further reading

To learn more about the film grain synthesis tool in AV1, which is the basis for the AFGS1 low-level processing, the reader is encouraged to read the DCC 2018 paper, written by me and Neil Birkbeck from YouTube. In addition to that, the readers can refer to my blog post on AV1 film grain synthesis. The AFGS1 specification is the source that provides all details on the AFGS1 film grain synthesis algorithm. In case of further questions, the readers can contact me using the information at the bottom of this page.

Acknowledgement

The acknowledgement goes to other contributors to the AFGS1 specification, in particular to Andrew Segall (Amazon), Alexis Tourapis, Dmitry Podborski (Apple), Li-Heng Chen (Netflix), Wade Wan (Broadcom), Mark Thompson (AMD) and others who actively participated in the meetings and provided their valuable feedback.

References

  1. A. Norkin and N. Birkbeck, Film Grain Synthesis for AV1 Video Codec, in Proc. Data Compression Conference 2018, March 27-30, Snowbird, UT

  2. A. Segall, B. Choi, and K. Misra, On the AOMedia Film Grain Synthesis Specification (AFGS1), AOMedia document CWG-D119, Sept. 2023

  3. A. Norkin and L.-H. Chen, Film grain model parameters signaling, AOMedia document CWG-D156, Nov. 2023