Timeliness without datagrams using QUIC

Original link: https://quic.video/blog/never-use-datagrams/

In short, this article discusses the choice between the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP) when building applications for the internet. TCP guarantees reliable data delivery, while UDP offers faster but less reliable transmission. The author argues against wanting "unreliability" as such, focusing instead on achieving timeliness for real-time data such as live video and conversations. They explain that, due to network constraints, packets can be dropped or delivered out of order, leading to a queuing problem known as bufferbloat. To prevent it, techniques such as congestion control, splitting data into prioritized streams, and delay-based controllers are used. The article also covers recent developments, including QUIC's datagram support and the growing trend of building live video protocols on top of UDP. The author closes with a call to action, encouraging readers to join the ongoing discussion about the best way to stream media online.

TCP and UDP serve different purposes in network communication. TCP ensures reliable, ordered data delivery, while UDP offers faster, more flexible transmission with no guarantees of order or integrity. A brief comparison of the main differences:

TCP (Transmission Control Protocol):

  1. Reliable: TCP waits for acknowledgements and retransmits lost packets to ensure data integrity. This adds overhead, and therefore more latency, compared to UDP.
  2. Ordered: data is delivered in the order it was sent, preserving message sequence.
  3. Flow control: TCP manages flow control to avoid overwhelming the receiver and to make the best use of available bandwidth.
  4. Connection-oriented: TCP establishes a virtual connection between two devices, allowing a continuous stream of data between them.
  5. Error correction: if errors occur in transit, TCP detects and corrects them, improving reliability.

UDP (User Datagram Protocol):

  1. Unreliable: UDP does not guarantee delivery and does not report error conditions. Packets may be lost or arrive out of order.
  2. No ordering guarantee: unlike TCP, UDP makes no promise that data arrives in the order it was sent.
  3. Connectionless: UDP operates at the transport layer without establishing a persistent connection between the two parties. Each datagram carries enough information to identify its source and destination addresses.
  4. No error handling: since UDP lacks error reporting, applications must implement their own mechanisms to detect and respond to errors.
  5. Lower latency: thanks to its minimal design, UDP typically introduces less delay, making it suitable for applications that need fast response times.

Common applications: TCP is typically used by applications that depend on consistent data delivery, such as email, file sharing, and web browsing. UDP, by contrast, is used by applications that need fast response times and tolerate loose synchronization, such as online gaming, VoIP, and multimedia streaming.

While both TCP and UDP have their strengths, choosing the right protocol for a given use case is essential for optimizing network performance and meeting the requirements of the intended application.
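For a concrete feel of the difference, here is a minimal sketch in Rust using only the standard library. The addresses and payloads are placeholders for illustration, not anything from the original post.

```rust
// TCP vs UDP in miniature, using std::net only.
use std::io::Write;
use std::net::{TcpStream, UdpSocket};

fn main() -> std::io::Result<()> {
    // TCP: connection-oriented byte stream; the OS handles ordering,
    // retransmission, and flow control behind this one write call.
    let mut tcp = TcpStream::connect("127.0.0.1:4000")?;
    tcp.write_all(b"delivered reliably and in order")?;

    // UDP: fire-and-forget datagram; it may be dropped, duplicated,
    // or reordered, and nobody will tell you.
    let udp = UdpSocket::bind("0.0.0.0:0")?;
    udp.send_to(b"maybe delivered, maybe not", "127.0.0.1:4001")?;

    Ok(())
}
```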

Original article

Click-bait title, but hear me out.

TCP vs UDP

So you’re reading this blog over the internet. I would wager you do a lot of things over the internet.

If you’ve built an application on the internet, you’ve undoubtedly had to decide whether to use TCP or UDP. Maybe you’re trying to make, oh I dunno, a live video protocol or something. There are more choices than just those two but let’s pretend like we’re a networking textbook from the 90s.

The common wisdom is:

  • use TCP if you want reliable delivery
  • use UDP if you want unreliable delivery

What the fuck does that mean? Who wants unreliability?

  • You don’t want a hard-drive that fails 5% of writes.
  • You don’t want something with random holes in the middle (unless it’s cheese).
  • You don’t want a service that is randomly unavailable because ¯\_(ツ)_/¯.

Nobody* wants memory corruption or deadzones or artifacts or cosmic rays. Unreliability is a consequence, not a goal.

Video glitch

*Unless you’re making some cursed GIF art. Source

Properties

So what do we actually want?

If you go low enough level, you can use electrical impulses to do neat stuff like:

  • Power on LEDs in a desired configuration.
  • Spin magnets at ludicrous speeds.
  • Make objects tingle and shake.
  • etc you get the idea.

But we don’t want to deal with electrical impulses. We want higher level functionality.

Fortunately, software engineering is all about standing on the shoulders of others. There are layers on top of layers on top of layers of abstraction. Each layer provides properties so you don’t have to reinvent the personal computer every time.

Our job as developers is to decide which shoulders we want to stand on. But some shoulders are awful, so we have to be selective. Over-abstraction is bad but so is under-abstraction.

What user experience are we trying to build, and how can we leverage the properties of existing layers to achieve that?

“Unreliable”

There was a recent MoQ interim in Denver. For those unaware, it’s basically a meetup of masochistic super nerds who want to design a live video protocol. We spent hours debating the semantic differences between FETCH and SUBSCRIBE among other riveting topics.

Denver interim

I’m the one in the back right corner, the one with the stupid grin on their face.

A few times, it was stated that SUBSCRIBE should be unreliable. The room cringed, and I cringed hard enough to write this blog post.

What I actually want is timeliness. If the internet can choose between delivering two pieces of data, I want it to deliver the newest one.

In the live video scenario, this is the difference between buffering and skipping ahead. If you’re trying to have a conversation with someone on the internet, there can’t be a delay. You don’t want a buffering spinner on top of their face, nor do you want to hear what they said 5 seconds ago.

To accomplish timeliness, the live video industry often uses UDP datagrams instead of TCP streams. As does the video game industry apparently. But why?

Datagrams

A datagram, aka an IP packet, is an envelope of 0s and 1s that gets sent from a source address to a destination address. Each device has a different maximum size allowed, which is super annoying, but 1200 bytes is generally safe. And of course, they can be silently lost or even arrive out of order.
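As an illustration of that size limit, here is a sketch that splits a payload into datagrams under the 1200-byte guideline before sending. The 4-byte sequence prefix is a made-up detail, only there to show that any loss detection or reassembly logic is on you.

```rust
// Sketch: chunking a payload into "generally safe" sized UDP datagrams.
use std::net::UdpSocket;

const SAFE_DATAGRAM_SIZE: usize = 1200;

fn send_chunked(socket: &UdpSocket, dest: &str, payload: &[u8]) -> std::io::Result<()> {
    // Leave 4 bytes of each datagram for an illustrative sequence number.
    for (seq, chunk) in payload.chunks(SAFE_DATAGRAM_SIZE - 4).enumerate() {
        let mut datagram = Vec::with_capacity(SAFE_DATAGRAM_SIZE);
        datagram.extend_from_slice(&(seq as u32).to_be_bytes());
        datagram.extend_from_slice(chunk);

        // Each send_to is independent: any datagram may be silently lost or
        // reordered, and the receiver has to cope using the sequence number.
        socket.send_to(&datagram, dest)?;
    }
    Ok(())
}
```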

But the physical world doesn’t work in discrete packets; it’s yet another layer of abstraction. I’m not a scientist-man, but the data is converted to analog signals and sent through some medium. It all gets serialized and deserialized and buffered and queued and retransmitted and dropped and corrupted and delayed and reordered and duplicated and lost and all sorts of other things.

So why does this abstraction exist?

Internet of Queues

It’s pretty simple actually: something’s got to give.

Screamer Rock

Let the packets hit the FLOOR

When there’s too much data sent over the network, the network has to decide what to do. In theory it could drop random bits but oh lord that is a nightmare, as evidenced by over-the-air TV. So instead, a bunch of smart people got together and decided that routers should drop at packet boundaries.

But why drop packets again? Why can’t we just queue and deliver them later? Well yeah, that’s what a lot of routers do these days since RAM is cheap. It’s a phenomenon called bufferbloat and my coworkers can attest that it’s my favorite thing to talk about. 🐷

But RAM is a finite resource so the packets will eventually get dropped. Then you finally get the unreliability you wanted all along…

Oh no

Oh shit, I forgot, I actually want timeliness and bufferbloat is the worst possible scenario. Naively, you would expect the internet to deliver packets immediately, with some random packets getting dropped. However bufferbloat causes all packets to get queued, possibly for seconds, ruling out any hope of timely delivery.

How do you avoid this? Basically, the only way to avoid queuing is to detect it, and then send less. The sender uses some feedback from the receiver to determine how long it took a packet to arrive. We can use that signal to infer when routers are queuing packets, and back off to drain any queues.

This is called congestion control and it’s a huge, never ending area of research. I briefly summarized it in the Replacing WebRTC post if you want more CONTENT. But all you need to know is that sending packets at unlimited rate is a recipe for disaster.
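To make the back-off idea concrete, here is a toy delay-based controller in Rust, fed one RTT sample per acknowledgement. It illustrates the principle only; it is not BBR or any real algorithm, and the thresholds and multipliers are arbitrary assumptions.

```rust
// Toy delay-based congestion controller (illustration only, not BBR).
// The idea: rising delivery delay means queues are building somewhere,
// so the sender should back off until they drain.
struct DelayController {
    min_rtt: f64,   // lowest RTT seen so far, a proxy for the "empty queue" path
    rate_bps: f64,  // current sending rate
}

impl DelayController {
    fn new(initial_rate_bps: f64) -> Self {
        Self { min_rtt: f64::INFINITY, rate_bps: initial_rate_bps }
    }

    // Called with each RTT sample derived from receiver feedback.
    fn on_rtt_sample(&mut self, rtt: f64) {
        self.min_rtt = self.min_rtt.min(rtt);

        if rtt > self.min_rtt * 1.25 {
            // Delay is inflating: packets are sitting in router queues.
            // Back off multiplicatively to let the queues drain.
            self.rate_bps *= 0.85;
        } else {
            // No sign of queuing: gently probe for more bandwidth.
            self.rate_bps *= 1.05;
        }
    }
}
```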

BBR

Source: Riveting slides from IETF meetings that you’re missing out on.

You, The Application Developer

Speaking of a recipe for disaster. Let’s say you made the mistake of using UDP directly because you want them datagrams. You’re bound to mess up, and you won’t even realize why.

If you want to build your own transport protocol on top of UDP, you “need” to implement:

And if you want a great protocol, you also need:

And if you want an AMAZING protocol, you also need:

Let’s be honest, you don’t even know what half of those are, nor why they are worth implementing. Just use a QUIC library instead.

But if you still insist on UDP, you’re actually in good company with a lot of the video industry. Building a live video protocol on top of UDP is all the rage; for example, WebRTC, SRT, Sye, RIST, etc. With the exception of Google, it’s very easy to make a terrible protocol on top of UDP. Look forward to the upcoming Replacing RTMP but please not with SRT blog post!

Timeliness

But remember, I ultimately want to achieve timeliness. How can we do that with QUIC?

  1. Avoid bloating the buffers 🐷. Use a delay-based congestion controller like BBR that will detect queueing and back off. There are better ways of doing this, like how WebRTC uses transport-wide-cc, which I’ll personally make sure gets added to QUIC.

  2. Split data into streams. The bytes within each stream are ordered, reliable, and can be any size; it’s nice and convenient. Each stream could be a video frame, or a game update, or a chat message, or a JSON blob, or really any atomic unit.

  3. Prioritize the streams. Streams are independent and can arrive in any order. But you can tell the QUIC stack to focus on delivering important streams first. The low priority streams will be starved, and can be closed to avoid wasting bandwidth; a sketch of steps 2 and 3 follows this list.
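Here is that sketch: one stream per frame with a priority attached, written against the quinn crate (open_uni, set_priority, write_all). The Frame type and send_frame function are my own illustrative names, and the exact finish() signature differs between quinn versions, so treat this as a sketch rather than copy-paste code.

```rust
// Sketch: one QUIC stream per video frame, prioritized, using quinn.
use anyhow::Result;
use quinn::Connection;

// Illustrative frame type; a real encoder would produce something richer.
struct Frame {
    keyframe: bool,
    payload: Vec<u8>,
}

async fn send_frame(conn: &Connection, frame: Frame) -> Result<()> {
    // One stream per frame: ordered and reliable *within* the frame,
    // but independent of every other frame.
    let mut stream = conn.open_uni().await?;

    // Higher numbers are delivered first; favor keyframes over delta frames.
    stream.set_priority(if frame.keyframe { 1 } else { 0 })?;

    stream.write_all(&frame.payload).await?;
    stream.finish()?; // synchronous in quinn 0.11; earlier versions are async
    Ok(())
}
```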

That’s it. That’s the secret behind Media over QUIC. Now all that’s left is to bikeshed the details.

And guess what? This approach works with higher latency targets too. It turns out that the fire-and-forget nature of datagrams only works when you need real-time latency. For everything else, there’s QUIC streams.

You don’t need datagrams.

QUIC logo

In Defense of Datagrams

Never* use Datagrams got you to click, but the direction of QUIC and MoQ seems to tell another story:

  1. QUIC has support for datagrams via an extension.
  2. WebTransport requires support for datagrams.
  3. The latest MoQ version adds support for datagrams.
  4. The next MoQ version will require support for datagrams.

Like all things designed by committee, there’s going to be some compromise. There are some folks who think datagram support is important. And frankly, it’s trivial to support and allow people to experiment. For example, OPUS has FEC support built-in, which is why MoQ supports the ability to send each audio “frame” as a datagram.
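For completeness, this is roughly what that escape hatch looks like with quinn's Connection::send_datagram. The send_audio_frame helper and the idea of shipping a raw encoded Opus frame per datagram are my own illustration, not how MoQ actually frames audio; a lost datagram is simply concealed (or recovered via Opus's in-band FEC) rather than retransmitted.

```rust
// Sketch: fire-and-forget audio via QUIC's datagram extension (quinn).
use bytes::Bytes;
use quinn::Connection;

fn send_audio_frame(conn: &Connection, opus_frame: Vec<u8>) -> anyhow::Result<()> {
    // No stream, no retransmission: if this datagram is lost, it's gone.
    conn.send_datagram(Bytes::from(opus_frame))?;
    Ok(())
}
```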

But it’s a trap. Designed to lure in developers who don’t know any better. Who wouldn’t give up their precious UDP datagrams otherwise.

If you want some more of my hot-takes:

Conclusion

There is no conclusion. This is a rant.

Please don’t design your application on top of datagrams. Old protocols like DNS get a pass, but be like DNS over HTTPS instead.

And please, please don’t make yet another video protocol on top of UDP. Get involved with Media over QUIC instead! Join our Discord and tell me how wrong I am.

Written by @kixelated.
