My tutorial and take on C++20 coroutines (2021)

Original link: https://www.scs.stanford.edu/~dm/blog/c++-coroutines.html

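## C++20 coroutines: summary

This article describes one developer's deep dive into C++20 coroutines, motivated by a wish to improve event-driven programming, which has traditionally been tedious because many callback functions scatter the code. C++11 lambda expressions brought some improvement, but coroutines promise a more elegant solution.

The author found existing explanations of coroutines confusing and had to consult the C++ specification directly. In essence, coroutines let a function suspend and resume execution without losing its state, thanks to the `co_await` operator. That operator saves local variables to the heap and creates a callable object that resumes execution later.

The implementation is criticized, however, for lacking standard library support and for being overly complex, requiring the programmer to wade through a "heap of garbage" and do custom memory management. Key concepts include `coroutine_handle` (a pointer-like object for managing coroutine state) and `promise_type`, which handles return values, exceptions, and suspension.

The article works through increasingly complex examples, culminating in a generic generator, and highlights the intricacies of `co_yield` and `co_return`. The author concludes that coroutines are a major improvement but that their design is clumsy and can introduce undefined behavior, particularly around coroutine completion, and suggests possible simplifications in future iterations. Despite these criticisms, coroutines are expected to be a valuable addition to the C++ toolbox.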

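A 2021 tutorial on C++20 coroutines sparked a discussion on Hacker News. Users found the article helpful for *understanding* how coroutines work, but some commenters questioned their practical use beyond simple examples.

One user noted that coroutines are mainly useful in combination with libraries such as Boost Asio, to manage asynchronous I/O (sockets, event loops) and multithreading complexity. Others said they had trouble following existing explanations of coroutines.

The conversation also included some jokes about compiler versions (some people are still using very old GCC releases) and a reminder that the tutorial itself dates from 2021. Finally, the bottom of the thread carried an announcement about Y Combinator applications.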
Original text
My tutorial and take on C++20 coroutines

Over the last 25 years, I’ve written a lot of event-driven code in C++. A typical example of event-driven code is registering a callback that gets invoked every time a socket has data to be read. Once you have read an entire message, possibly after many invocations, you parse the message and invoke another callback from a higher layer of abstraction, and so forth. This kind of code is painful to write because you have to break your code up into a bunch of different functions that, because they are different functions, don’t share local variables.

As an example, here’s a subset of the methods on the smtpd class of Mail Avenger, my SMTP server written in C++03:

C++11 improved the situation with lambda expressions. Now you only need one cmd_rcpt method on the class, and can use nested lambda expressions for the remaining ones. Better yet, lambdas can capture local variables from enclosing functions. Nonetheless, you still need to break your code into many functions. It’s clumsy to skip multiple steps or support situations where the order of issuing asynchronous events may change at runtime. Finally, you often end up fighting the right-hand margin of your text editor as your nested lambda expressions get further and further indented.

I was super excited to see that C++20 supports coroutines, which should hugely improve the process of writing event-driven code. Now that someone has finally published a book (or at least a draft of a book) on C++20, I eagerly got a copy a few days ago and read it. While I found the book did a reasonable job on concepts (the language feature) and other C++20 improvements, I sadly found the explanation of coroutines utterly incomprehensible. Same for almost every other explanation I found on the web. Hence, I had to dig through the specification and cppreference.com to figure out what was really going on.

This blog post represents my attempt to explain coroutines—basically the tutorial I wish I’d had 48 hours ago when I just wanted to figure this stuff out.

Roughly speaking, coroutines are functions that can invoke each other but do not share a stack, so can flexibly suspend their execution at any point to enter a different coroutine. In the true spirit of C++, C++20 coroutines are implemented as a nice little nugget buried underneath heaps of garbage that you have to wade through to access the nice part. Frankly, I was disappointed by the design, because other recent language changes were more tastefully done, but alas not coroutines. Further obfuscating coroutines is the fact that the C++ standard library doesn’t actually supply the heap of garbage you need to access coroutines, so you actually have to roll your own garbage and then wade through it. Anyway, I’ll try to save any further editorializing for the end of this blog post…

One other complication to be aware of is that C++ coroutines are often explained and even specified using the terms future and promise. These terms have nothing to do with the types std::future and std::promise available in the C++ <future> header. Specifically, std::promise is not a valid type for a coroutine promise object. Nothing in my blog post outside this paragraph has anything to do with std::future or std::promise.

With that out of the way, the nice little nugget C++20 gives us is a new operator called co_await. Roughly speaking, the expression “co_await a;” does the following:

  1. Ensures all local variables in the current function—which must be a coroutine—are saved to a heap-allocated object.
  2. Creates a callable object that, when invoked, will resume execution of the coroutine at the point immediately following evaluation of the co_await expression.
  3. Calls (or more accurately jumps to) a method of co_await’s target object a, passing that method the callable object from step 2.

Note that the method in step 3, when it returns, does not return control to the coroutine. The coroutine only resumes execution if and when the callable from step 2 is invoked. If you’ve used a language supporting call with current continuation, or played with the Haskell Cont monad, the callable in step 2 is a bit like a continuation.

Compiling code using coroutines

Since C++20 is not yet fully supported by compilers, you’ll need to make sure your compiler implements coroutines to play with them. I’m using GCC 10.2, which seems to support coroutines so long as you compile with the following flags:

g++ -fcoroutines -std=c++20

Clang’s support is less far along. You need to install llvm libc++ and compile with:

clang++ -std=c++20 -stdlib=libc++ -fcoroutines-ts

Unfortunately, with clang you also need to include the coroutine header as <experimental/coroutine> rather than <coroutine>. Moreover, a number of types are named std::experimental::xxx instead of std::xxx. Hence, as of this writing, the examples below won’t compile out of the box with clang, but ideally should with a future release.

If you want to play around, all the demos in this blog post are available in a single file corodemo.cc.

Coroutine handles

As previously mentioned, the new co_await operator ensures the current state of a function is bundled up somewhere on the heap and creates a callable object whose invocation continues execution of the current function. The callable object is of type std::coroutine_handle<>.

A coroutine handle behaves a lot like a C pointer. It can be easily copied, but it doesn’t have a destructor to free the memory associated with coroutine state. To avoid leaking memory, you must generally destroy coroutine state by calling the coroutine_handle::destroy method (though in certain cases a coroutine can destroy itself on completion). Also like a C pointer, once a coroutine handle has been destroyed, coroutine handles referencing that same coroutine will point to garbage and exhibit undefined behavior when invoked. On the plus side, a coroutine handle is valid for the entire execution of a coroutine, even as control flows in and out of the coroutine many times.

Now let’s look more specifically at what co_await does. When you evaluate the expression co_await a, the compiler creates a coroutine handle and passes it to the method a.await_suspend(coroutine_handle). The type of a must support certain methods, and is sometimes referred to as an “awaitable” object or an “awaiter.”

Now let’s look at a complete program that uses co_await. For now, ignore the ReturnObject type—it’s just part of the garbage we have to get through to access co_await.

The C++ standard library provides two predefined awaiters, std::suspend_always and std::suspend_never. As their names imply, suspend_always::await_ready always returns false, while suspend_never::await_ready always returns true. The other methods on these classes are empty and do nothing.

The coroutine return object

In the previous example, we ignored the return type of counter. However, the language restricts the allowable return types of coroutines. Specifically, the return type of a coroutine—call it R—must be an object type with a nested type R::promise_type. Among other requirements, R::promise_type must include a method R get_return_object() that returns an instance of the outer type R. The result of get_return_object() is the return value of the coroutine function, in this case counter(). Note that in many discussions of coroutines, the return type R is referred to as a future, but for clarity I’ll just call it the return object type.

Instead of passing a coroutine_handle<>* into counter, it would be nicer if we could just return the handle from counter(). We can do that if we put the coroutine handle inside the return object. Since promise_type::get_return_object computes the return object, we simply need that method to stick the coroutine handle into the return object. How can we get a coroutine handle from within get_return_object? As it happens, the coroutine state referenced by a coroutine_handle contains an instance of promise_type at a known offset, and so std::coroutine_handle allows us to compute a coroutine handle from the promise object.

Thus far, we’ve glossed over the template argument to coroutine handles, which are actually declared like this:

A coroutine handle can be computed from a promise object with the static method coroutine_handle::from_promise:

First, since the coroutine handle now travels in the return object, the coroutine no longer needs a custom awaiter and can simply co_await std::suspend_always{}. Second, note that the return object goes out of scope and is destroyed in the first line of main2. However, a coroutine_handle is like a C pointer, not like an object. It doesn’t matter that we’ve destroyed the object containing ReturnObject2::h_, because we’ve copied the pointer into h. On the other hand, somebody needs to reclaim the space pointed to by h, which we do at the end of main2 by calling h.destroy(). In particular, if any code calls counter2() and ignores the return value (or otherwise fails to destroy the handle in the ReturnObject2 object), it creates a memory leak.

The promise object

Our examples thus far are a bit unsatisfactory in that even though we can pass control back and forth between a main function and a coroutine, we have not passed any data. It would be great if our counter function, instead of writing to standard output, just returned values to main, which could then either print them or use them in calculations.

Since we know the coroutine state includes an instance of promise_type, we can add a field value_ to this type and use that field to transmit values from the coroutine to our main function. How do we get access to the promise type? In the main function, this isn’t too hard. Instead of converting our coroutine handle to a std::coroutine_handle<>, we can keep it as a std::coroutine_handle<ReturnObject3::promise_type>. The method promise() on this coroutine handle will return the promise_type& that we need.

What about within counter—how can a coroutine obtain its own promise object? Recall the Awaiter object in our first example, and how it squirreled away a copy of the coroutine handle for main1. We can use a similar trick to get the promise within the coroutine: co_await on a custom awaiter that gives us the promise object. Unlike our previous type Awaiter, however, we don’t want this new custom awaiter to suspend the coroutine. After all, until we get our hands on the promise object, we can’t stick a valid return value inside it, so wouldn’t be returning anything valid from the coroutine.

Even though previously our Awaiter::await_suspend method returned void, that method is also allowed to return a bool. In that case, if await_suspend returns false, the coroutine is not suspended after all. In other words, a coroutine isn’t actually suspended unless first await_ready returns false, then await_suspend (if it returns type bool instead of void) returns true.

We thus define a new awaiter type GetPromise that contains a field promise_type *p_. We have its await_suspend method store the address of the promise object in p_, but then return false to avoid actually suspending the coroutine. Until now, we have only seen co_await expressions of type void. This time, we want our co_await to return the address of the promise object, so we also add an await_resume function returning p_.

To signal that a coroutine is complete, C++20 adds a new co_return operator. There are three ways for a coroutine to signal that it is complete:

  1. The coroutine can use “co_return e;” to return a final value e.

  2. The coroutine can use “co_return;” with no value (or with a void expression) to end the coroutine without a final value.

  3. The coroutine can let execution fall off the end of the function, which is similar to the previous case.

In case 1, the compiler inserts a call to p.return_value(e) on the promise object p. In cases 2–3, the compiler calls p.return_void(). To find out if a coroutine is complete, you can call h.done() on its coroutine handle h. (Do not confuse coroutine_handle::done() with coroutine_handle::operator bool(). The latter merely checks whether the coroutine handle contains a non-null pointer to coroutine memory, not whether execution is complete.)

Here is a new version of counter in which the counter function itself decides to produce only 3 values, while the main function just keeps printing values until the coroutine is done. There’s one more change we need to make to promise_type::final_suspend(), but let’s first look at the new code, then discuss the promise object below.

If execution falls off the end of a coroutine whose promise_type lacks a return_void method, you get undefined behavior. I’ll have more to say about that in the editorial below, but suffice it to say that undefined behavior is really, really bad, like use-after-free or array-bounds-overflow bad. So be careful not to drop off the end of a coroutine whose promise object lacks a return_void method!

The other thing to note about co_return is that promise_type::return_void() and promise_type::return_value(v) both return void; in particular they don’t return awaitable objects. This is presumably out of a desire to unify handling of return values and exceptions (which we’ll discuss further down). Nonetheless, there’s an important question about what to do at the end of a coroutine. Should the compiler update the coroutine state and suspend the coroutine one final time, so that even after evaluating co_return, code in the main function can access the promise object and make sane use of the coroutine_handle? Or should returning from a coroutine automatically destroy the coroutine state, like an implicit call to coroutine_handle::destroy()?

This question is resolved by the final_suspend method on the promise_type. The C++ spec says that a coroutine’s function-body is effectively wrapped in the following pseudo-code:

To handle exceptions, the promise type’s unhandled_exception method uses std::current_exception to obtain a std::exception_ptr that it stores in the promise object. When this exception_ptr is non-NULL, the generator uses std::rethrow_exception to propagate the exception in the main function.

Another important point is that up until now, our coroutines have been computing the first value (0) as soon as they are invoked, before the first co_await, and hence before the return object is constructed. There are two reasons you might want to defer computation of the first value until after the first coroutine suspension. First, in cases where values are expensive to compute, it may be better to save work in case the coroutine is never resumed (perhaps because of an error in a different coroutine). Second, because of the need to destroy coroutine handles manually, things can get awkward if a coroutine throws an exception before the first time it has been suspended. Take the following example: