FFmpeg at Meta: Media Processing at Scale

Original link: https://engineering.fb.com/2026/03/02/video-engineering/ffmpeg-at-meta-media-processing-at-scale/

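## Meta's Reliance on and Contributions to FFmpeg

FFmpeg is critical to Meta: it is an industry-standard tool that Meta runs tens of billions of times a day for media processing, including transcoding and analyzing video uploads. For years, Meta maintained an internal FFmpeg fork to support features unavailable upstream, notably threaded multi-lane encoding and real-time quality metric computation. However, as open source FFmpeg kept evolving with new codecs and reliability improvements, maintaining a diverging fork became increasingly difficult. To simplify operations, Meta worked with FFmpeg developers from FFlabs and VideoLAN to integrate these key features directly into upstream FFmpeg. The resulting contributions brought more efficient threading (starting in FFmpeg 6.0 and completed in 8.0) and “in-loop” decoding for real-time quality metrics (FFmpeg 7.0+). Meta has also added support for its custom ASIC, MSVP, through FFmpeg's standard hardware APIs, but those patches remain internal because outside developers could not test them without access to the hardware. By prioritizing upstream contributions with broad community impact, Meta was able to deprecate its internal fork, benefiting both its own pipelines and the wider FFmpeg ecosystem. Meta remains committed to investing in FFmpeg's continued development and reliability.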

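A recent Hacker News discussion centered on Meta's relationship with the open source project FFmpeg, a key tool for media processing. Meta described how it moved from an internal FFmpeg fork to relying entirely on the upstream version, contributing improvements along the way. Commenters saw this as a positive example of a company both benefiting from and supporting open source development, which avoids duplicated internal work. However, some argued that Meta's financial contributions, while appreciated, are insufficient given the company's enormous wealth, and that funding critical FOSS projects like FFmpeg should be a priority in order to avoid future security incidents (such as the XZ Utils incident). Others praised Meta's broader commitment to open source, citing its support for PHP and its many grants to other projects. A recent post from FFmpeg acknowledged Meta's funding while also stressing the ongoing need for more sustainable support.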

Original article

FFmpeg is truly a multi-tool for media processing. As an industry-standard tool it supports a wide variety of audio and video codecs and container formats. It can also orchestrate complex chains of filters for media editing and manipulation. For the people who use our apps, FFmpeg plays an important role in enabling new video experiences and improving the reliability of existing ones.

Meta executes ffmpeg (the main CLI application) and ffprobe (a utility for obtaining media file properties) binaries tens of billions of times a day, introducing unique challenges when dealing with media files. FFmpeg can easily perform transcoding and editing on individual files, but our workflows have additional requirements. For many years we had to rely on our own internally developed fork of FFmpeg to provide features that have only recently been added to FFmpeg, such as threaded multi-lane encoding and real-time quality metric computation.

Over time, our internal fork came to diverge significantly from the upstream version of FFmpeg. At the same time, new versions of FFmpeg brought support for new codecs and file formats, and reliability improvements, all of which allowed us to ingest more diverse video content from users without disruptions. This meant that we had to support both recent open source versions of FFmpeg and our internal fork. Not only did this create a gradually divergent feature set, it also created challenges around safely rebasing our internal changes to avoid regressions.

As our internal fork became increasingly outdated, we collaborated with FFmpeg developers, FFlabs, and VideoLAN to develop features in FFmpeg that allowed us to fully deprecate our internal fork and rely exclusively on the upstream version for our use cases. Using upstreamed patches and refactorings we’ve been able to fill two important gaps that we had previously relied on our internal fork to fill: threaded, multi-lane transcoding and real-time quality metrics.

Building More Efficient Multi-Lane Transcoding for VOD and Livestreaming

A video transcoding pipeline producing multiple outputs at different resolutions.

When a user uploads a video through one of our apps, we generate a set of encodings to support Dynamic Adaptive Streaming over HTTP (DASH) playback. DASH playback allows the app’s video player to dynamically choose an encoding based on signals such as network conditions. These encodings can differ in resolution, codec, framerate, and visual quality level but they are created from the same source encoding, and the player can seamlessly switch between them in real time.
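
As a rough illustration of the player-side choice described above, the sketch below picks the highest-bitrate encoding that fits within a measured throughput. The `Rendition` type, the ladder values, and the 0.8 safety factor are assumptions made for illustration; they are not Meta's player logic and are not prescribed by DASH itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    """One encoding of the same source (hypothetical fields)."""
    name: str
    height: int
    bitrate_kbps: int


def choose_rendition(ladder, throughput_kbps, safety=0.8):
    """Return the highest-bitrate rendition that fits the usable bandwidth.

    Falls back to the lowest-bitrate rendition when nothing fits.
    """
    budget = throughput_kbps * safety
    by_bitrate = sorted(ladder, key=lambda r: r.bitrate_kbps)
    fitting = [r for r in by_bitrate if r.bitrate_kbps <= budget]
    return fitting[-1] if fitting else by_bitrate[0]


# Illustrative ladder; the values are made up.
LADDER = [
    Rendition("360p", 360, 800),
    Rendition("720p", 720, 2500),
    Rendition("1080p", 1080, 5000),
]
```

A real player would typically also weigh buffer level, screen size, and codec support; throughput alone is used here to keep the idea visible.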

In a very simple system, separate FFmpeg command lines can generate the encodings for each lane one after another. This could be optimized by running each command in parallel, but this quickly becomes inefficient due to the duplicate work done by each process.

To work around this, multiple outputs could be generated within a single FFmpeg command line, decoding the frames of a video once and sending them to each output’s encoder instance. This removes the duplicated video decoding and process startup costs that each separate command line would otherwise incur. Given that we process over 1 billion video uploads daily, each requiring multiple FFmpeg executions, reductions in per-process compute usage yield significant efficiency gains.
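
The single-command approach can be sketched as a small helper that builds one ffmpeg argument list with one input and several outputs. This is a minimal sketch, not Meta's actual command line: the helper name, the lane tuples, and the choice of `libx264` with a fixed bitrate are assumptions, and audio handling is left out. The `split` and `scale` filters and the `-filter_complex`, `-map`, `-c:v`, and `-b:v` options are standard FFmpeg.

```python
def build_multi_output_cmd(src, lanes):
    """Build an ffmpeg argv that decodes `src` once and encodes every lane.

    `lanes` is a list of (height, bitrate, output_path) tuples (illustrative).
    """
    n = len(lanes)
    # split duplicates the decoded frames; each copy is scaled for one lane.
    split_labels = "".join(f"[s{i}]" for i in range(n))
    chains = [f"[0:v]split={n}{split_labels}"]
    for i, (height, _, _) in enumerate(lanes):
        chains.append(f"[s{i}]scale=-2:{height}[v{i}]")
    cmd = ["ffmpeg", "-i", src, "-filter_complex", ";".join(chains)]
    for i, (_, bitrate, out_path) in enumerate(lanes):
        # Per-output options go before each output file name.
        cmd += ["-map", f"[v{i}]", "-c:v", "libx264", "-b:v", bitrate, out_path]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine whose FFmpeg build includes `libx264`. The input appears once, so decoding and process startup are paid once regardless of the number of lanes.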

Our internal FFmpeg fork provided an additional optimization to this: parallelized video encoding. While individual video encoders are often internally multi-threaded, previous FFmpeg versions executed each encoder in serial for a given frame when multiple encoders were in use. By running all encoder instances in parallel, better parallelism can be obtained overall.
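
The structure of that optimization, one decode loop feeding encoders that each run on their own thread, can be sketched in a few lines. This is a toy model under stated assumptions, not FFmpeg's scheduler: the function name and the callables standing in for encoders are hypothetical, and CPython threads only run truly in parallel when the work happens in native code, as it does in real encoders.

```python
import queue
import threading


def transcode_parallel(frames, encoders):
    """Decode once, then run every lane's encoder in its own thread.

    `frames` is an iterable of decoded frames; `encoders` maps a lane name
    to a callable that "encodes" one frame (stand-ins for real encoders).
    """
    queues = {name: queue.Queue(maxsize=8) for name in encoders}
    outputs = {name: [] for name in encoders}

    def worker(name):
        encode = encoders[name]
        while True:
            frame = queues[name].get()
            if frame is None:  # sentinel: no more frames
                break
            outputs[name].append(encode(frame))

    threads = [threading.Thread(target=worker, args=(n,)) for n in encoders]
    for t in threads:
        t.start()
    for frame in frames:           # single decode loop
        for q in queues.values():  # fan each frame out to every lane
            q.put(frame)
    for q in queues.values():
        q.put(None)
    for t in threads:
        t.join()
    return outputs
```

The bounded queues let a fast lane run a few frames ahead of a slow one without buffering the whole video, which is the main difference from encoding each frame for every lane in turn.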

Thanks to contributions from FFmpeg developers, including those at FFlabs and VideoLAN, more efficient threading was implemented starting with FFmpeg 6.0, with the finishing touches landing in 8.0. This was directly influenced by the design of our internal fork and was one of the main features we had relied on it to provide. This development led to the most complex refactoring of FFmpeg in decades and has enabled more efficient encodings for all FFmpeg users.

To fully migrate off of our internal fork we needed one more feature implemented upstream: real-time quality metrics.

Enabling Real-Time Quality Metrics While Transcoding for Livestreams

Visual quality metrics, which give a numeric representation of the perceived visual quality of media, can be used to quantify the quality loss incurred from compression. These metrics are categorized as reference or no-reference metrics, where the former compares a reference encoding to some other distorted encoding.

FFmpeg can compute various visual quality metrics such as PSNR, SSIM, and VMAF using two existing encodings in a separate command line after encoding has finished. This is okay for offline or VOD use cases, but not for livestreaming where we might want to compute quality metrics in real time.
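
For reference, the offline workflow looks roughly like the following: a second ffmpeg invocation that reads both finished encodings and runs the `psnr` filter over them. The helper name and file names are assumptions; the `psnr` filter and the `-lavfi` and `-f null` options are standard FFmpeg.

```python
def build_psnr_cmd(distorted, reference):
    """Build an ffmpeg argv comparing two finished encodings with PSNR.

    The psnr filter takes the distorted ("main") video as its first input
    and the reference as its second; results are written to ffmpeg's log.
    """
    return [
        "ffmpeg",
        "-i", distorted,
        "-i", reference,
        "-lavfi", "[0:v][1:v]psnr",
        "-f", "null", "-",  # discard the filtered frames; only the stats matter
    ]
```

Both inputs must already exist on disk and have matching dimensions, which is why this approach suits VOD but not a stream that is still being produced.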

To do this, we need to insert a video decoder after each video encoder used by each output lane. These provide bitmaps for each frame in the video after compression has been applied so that we can compare against the frames before compression. In the end, we can produce a quality metric for each encoded lane in real time using a single FFmpeg command line.
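
The idea of placing a decoder directly after each encoder can be illustrated with a toy pipeline. Everything here is an assumption for illustration: the "codec" is a simple quantizer rather than a real video encoder, frames are flat lists of 8-bit samples, and PSNR is used as the metric. It is not how FFmpeg implements in-loop decoding.

```python
import math


def encode(frame, step):
    """Toy lossy encoder: quantize each 8-bit sample to a multiple of `step`."""
    return [round(p / step) for p in frame]


def decode(coded, step):
    """Inverse of the toy encoder, clamped to the 8-bit range."""
    return [min(255, q * step) for q in coded]


def psnr(reference, distorted, peak=255):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, distorted)) / len(reference)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


def transcode_with_metrics(frames, lanes):
    """Yield (lane, frame_index, psnr) as each frame is encoded.

    `lanes` maps a lane name to its quantizer step (hypothetical setup).
    """
    for i, frame in enumerate(frames):
        for lane, step in lanes.items():
            coded = encode(frame, step)
            recon = decode(coded, step)  # decoder placed right after the encoder
            yield lane, i, psnr(frame, recon)
```

Because the comparison happens per frame inside the same loop that encodes, a metric is available for every lane as soon as each frame is produced, without waiting for a finished file.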

Thanks to “in-loop” decoding, which was enabled by FFmpeg developers including those from FFlabs and VideoLAN, beginning with FFmpeg 7.0, we no longer have to rely on our internal FFmpeg fork for this capability.

We Upstream When It Will Have the Most Community Impact

Things like real-time quality metrics while transcoding and more efficient threading can bring efficiency gains to a variety of FFmpeg-based pipelines both in and outside of Meta, and we strive to enable these developments upstream to benefit the FFmpeg community and wider industry. However, there are some patches we’ve developed internally that don’t make sense to contribute upstream. These are highly specific to our infrastructure and don’t generalize well.

FFmpeg supports hardware-accelerated decoding, encoding, and filtering with devices such as NVIDIA’s NVDEC and NVENC, AMD’s Unified Video Decoder (UVD), and Intel’s Quick Sync Video (QSV). Each device is supported through an implementation of standard APIs in FFmpeg, allowing for easier integration and minimizing the need for device-specific command line flags. We’ve added support for the Meta Scalable Video Processor (MSVP), our custom ASIC for video transcoding, through these same APIs, enabling the use of common tooling across different hardware platforms with minimal platform-specific quirks.
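
One way to picture the benefit of a shared API is a command builder in which only two options change per device. This is a hedged sketch: the `BACKENDS` table and helper name are assumptions, the NVIDIA and Intel entries use option and encoder names that exist in upstream FFmpeg (subject to how a given build was configured), and MSVP is omitted because its option names are not public.

```python
# Decoder/encoder options per backend. Availability depends on the FFmpeg
# build and on the hardware present in the machine.
BACKENDS = {
    "software": {"hwaccel": None, "encoder": "libx264"},
    "nvidia": {"hwaccel": "cuda", "encoder": "h264_nvenc"},
    "intel": {"hwaccel": "qsv", "encoder": "h264_qsv"},
}


def build_transcode_cmd(src, dst, backend="software"):
    """Build an ffmpeg argv whose only backend-specific parts are two options."""
    cfg = BACKENDS[backend]
    cmd = ["ffmpeg"]
    if cfg["hwaccel"]:
        cmd += ["-hwaccel", cfg["hwaccel"]]  # input option: must precede -i
    cmd += ["-i", src, "-c:v", cfg["encoder"], dst]
    return cmd
```

The rest of the pipeline (inputs, outputs, and the surrounding tooling) stays the same across platforms, which is the property the paragraph above describes.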

As MSVP is only used within Meta’s own infrastructure, it would create a challenge for FFmpeg developers to support it without access to the hardware for testing and validation. In this case, it makes sense to keep patches like this internal since they wouldn’t provide benefit externally. We’ve taken on the responsibility of rebasing our internal patches onto more recent FFmpeg versions over time, utilizing extensive validation to ensure robustness and correctness during upgrades.

Our Continued Commitment to FFmpeg

With more efficient multi-lane encoding and real-time quality metrics, we were able to fully deprecate our internal FFmpeg fork for all VOD and livestreaming pipelines. And thanks to standardized hardware APIs in FFmpeg, we’ve been able to support our MSVP ASIC alongside software-based pipelines with minimal friction.

FFmpeg has withstood the test of time with over 25 years of active development. Developments that improve resource utilization, add support for new codecs and features, and increase reliability enable robust support for a wider range of media. For people on our platforms, this means enabling new experiences and improving the reliability of existing ones. We plan to continue investing in FFmpeg in partnership with open source developers, bringing benefits to Meta, the wider industry, and people who use our products.

Acknowledgments

We would like to acknowledge contributions from the open source community, our partners in FFlabs and VideoLAN, and many Meta engineers, including Max Bykov, Jordi Cenzano Ferret, Tim Harris, Colleen Henry, Mark Shwartzman, Haixia Shi, Cosmin Stejerean, Hassene Tmar, and Victor Loh.