In today’s final article covering the C++ language, we’ll explore a new C++20 feature: coroutines. These are analogous to both C# iterator functions (i.e. those with yield) and C# async functions. There are a lot of interesting aspects of coroutines, so let’s dive in and explore!

Coroutine Basics

As is typical for C++, we are given low-level tools on which to build higher-level features. In the case of coroutines, we’re given the tools necessary to build the equivalent of C# iterator functions that yield and the equivalent of C# async functions that await. Because these relatively higher-level features aren’t built into the language directly, we can customize how they work and build many more features like them.

Like C# iterator functions that become so simply by having one or more yield return or yield break statements, a C++ coroutine becomes so by having one or more co_yield, co_return, or co_await operators. There’s no async keyword to put on functions in C++ like we do in C#. A coroutine may simply go ahead and co_await. Here’s an equivalency between the two languages:

Purpose                    | C#                                | C++
Yield a value              | yield return x;                   | co_yield x;
End with no value          | yield break;                      | co_return;
End with a value           | yield return x; then yield break; | co_return x;
Wait for an async process  | await x;                          | co_await x;

Let’s jump in and look at perhaps the simplest possible coroutine we can build in C++:

ReturnObj SimpleCoroutine()
{
    DebugLog("Start of coroutine");
    co_return;
    DebugLog("End of coroutine");
}

This is a coroutine because it has co_return in it. Instead of using IEnumerable as the return type, as in C#, we’re using ReturnObj. This type has to be a class that has particular members. C++ just calls it a “return object” but it’s usually named after its purpose: Generator for iterators, Task for asynchronous tasks, Lazy for lazy evaluation, etc. Our ReturnObj is essentially useless though, so it gets a very generic name.

Let’s take a look at ReturnObj:

struct ReturnObj
{
    ReturnObj()
    {
        DebugLog("ReturnObj ctor");
    }
 
    ~ReturnObj()
    {
        DebugLog("ReturnObj dtor");
    }
 
    struct promise_type
    {
        promise_type()
        {
            DebugLog("promise_type ctor");
        }
 
        ~promise_type()
        {
            DebugLog("promise_type dtor");
        }
 
        ReturnObj get_return_object()
        {
            DebugLog("promise_type::get_return_object");
            return ReturnObj{};
        }
 
        NeverSuspend initial_suspend()
        {
            DebugLog("promise_type::initial_suspend");
            return NeverSuspend{};
        }
 
        void return_void()
        {
            DebugLog("promise_type::return_void");
        }
 
        // C++20 requires final suspension to be non-throwing
        NeverSuspend final_suspend() noexcept
        {
            DebugLog("promise_type::final_suspend");
            return NeverSuspend{};
        }
 
        void unhandled_exception()
        {
            DebugLog("promise_type unhandled_exception");
        }
    };
};

ReturnObj mostly exists to call DebugLog at various points of the coroutine’s lifecycle. That lifecycle is broadly similar to the lifecycle of a C# iterator function or async function. Behind the scenes there is a class that implements a state machine. The class holds the current state and all the local variables of the function as fields and a compiler-rewritten version of the function to operate that state machine.

In C++, we’re usually going to use “return object” classes that implement C# iterator- and async-like functionality. The C++20 Standard Library lacks such classes, so the community has filled in the gap. C++23 adds std::generator for the iterator case, but there is still no standard asynchronous task type. For the purposes of learning how these classes work in this article, we’ll actually implement our own.

So, looking at ReturnObj we see that it has a promise_type member class. We’ll get a compiler error if this class isn’t present or doesn’t have this name. This is the type of the “promise” that the coroutine is making to its caller. Its purpose is to control the behavior of the coroutine at various points of execution.

The first point of execution is represented by get_return_object. As the name suggests, this is used to get the “return object” and we simply return a default-constructed one in this example.

The second point is initial_suspend and it’s called after get_return_object to determine what to do right at the start.

Third, we have return_void which is called when the coroutine uses co_return without a return value: i.e. just co_return;.

Fourth, final_suspend is called to determine what to do when the coroutine reaches the end of execution.

Fifth, and finally, unhandled_exception is called when an exception escapes the coroutine.

initial_suspend and final_suspend are both returning a default-constructed NeverSuspend class, so let’s look at that next:

#include <coroutine>
 
struct NeverSuspend
{
    NeverSuspend()
    {
        DebugLog("NeverSuspend ctor");
    }
 
    ~NeverSuspend()
    {
        DebugLog("NeverSuspend dtor");
    }
 
    // The await_* functions are noexcept so this type can be used for final suspension
    bool await_ready() noexcept
    {
        DebugLog("NeverSuspend::await_ready");
        return true;
    }
 
    void await_suspend(std::coroutine_handle<>) noexcept
    {
        DebugLog("NeverSuspend::await_suspend");
    }
 
    void await_resume() noexcept
    {
        DebugLog("NeverSuspend::await_resume");
    }
};

This class also has some mandatory members. First, await_ready is called to check if the coroutine should suspend at all. When the coroutine is synchronous, like this one, we can return true to indicate that there should be no suspension.

Second, await_suspend is passed a std::coroutine_handle<> object from the Standard Library’s coroutine header. We don’t use it or even give it a name in this example, but it’s a way to access the coroutine state when suspending the coroutine.

Third and finally, await_resume is called when the coroutine resumes.

Now let’s actually use the coroutine and see a log of its lifecycle via all these DebugLog messages:

void Foo()
{
    DebugLog("Calling coroutine");
    ReturnObj ret = SimpleCoroutine();
    DebugLog("Done");
}

Running this prints the following log:

Calling coroutine
promise_type ctor
promise_type::get_return_object
ReturnObj ctor
promise_type::initial_suspend
NeverSuspend ctor
NeverSuspend::await_ready
NeverSuspend::await_resume
NeverSuspend dtor
Start of coroutine
promise_type::return_void
promise_type::final_suspend
NeverSuspend ctor
NeverSuspend::await_ready
NeverSuspend::await_resume
NeverSuspend dtor
promise_type dtor
Done
ReturnObj dtor

Of course we begin with “Calling coroutine” but things immediately get interesting right after that. We see that the promise_type class is instantiated. It lives inside the coroutine’s state, often called the coroutine frame, which is allocated using operator new. The default is therefore to put it on the heap, but we can define operator new in promise_type to control that behavior.

Next get_return_object is called to build the return object, which yields a call to the ReturnObj constructor.

At this point we get our first chance to suspend with the call to initial_suspend. We return a NeverSuspend object, and the compiler consults it to decide what happens at this initial suspension point. Its await_ready is called and returns true to indicate that we don’t want to suspend. We therefore get a call to await_resume, and then the NeverSuspend is destroyed because the initial suspension phase is over and its job is done.

Since we didn’t suspend, the coroutine can actually begin! We see “Start of coroutine” and immediately reach co_return; which makes a call to return_void. Since we’ve terminated the coroutine, we get a call to our final_suspend and we once again create and return a NeverSuspend object that goes through the same lifecycle.

The coroutine is now over, so our promise_type is destroyed and control is returned to Foo right after the call to SimpleCoroutine. We print “Done” and the ReturnObj local variable goes out of scope and is destroyed.

Notice that “End of coroutine” was never logged. This is because we called co_return; before that could happen. This ended the coroutine just like return; would in a normal function so that statement was never executed.

Lazy Evaluation

Now that we have a good handle on a coroutine’s lifecycle, let’s do something useful with it. Say we have something expensive to compute but we want to delay computing it and possibly never compute it at all. We can use a coroutine to achieve this. We’ll need to make several changes though, starting with an AlwaysSuspend suspension policy:

struct AlwaysSuspend
{
    AlwaysSuspend()
    {
        DebugLog("AlwaysSuspend ctor");
    }
 
    ~AlwaysSuspend()
    {
        DebugLog("AlwaysSuspend dtor");
    }
 
    // The await_* functions are noexcept so this type can be used for final suspension
    bool await_ready() noexcept
    {
        DebugLog("AlwaysSuspend::await_ready");
        return false;
    }
 
    void await_suspend(std::coroutine_handle<>) noexcept
    {
        DebugLog("AlwaysSuspend::await_suspend");
    }
 
    void await_resume() noexcept
    {
        DebugLog("AlwaysSuspend::await_resume");
    }
};

The only difference between AlwaysSuspend and NeverSuspend is that we return false in await_ready to indicate that we should suspend.

Now let’s look at our new “return object”, Lazy:

class Lazy
{
public:

    // Declared public so the compiler can find Lazy::promise_type
    struct promise_type;
 
private:

    std::coroutine_handle<promise_type> handle;
    bool haveVal{ false };
 
public:
 
    Lazy(std::coroutine_handle<promise_type> handle)
        : handle{ handle }
    {
        DebugLog("Lazy ctor");
    }
 
    Lazy(const Lazy&) = delete;
 
    Lazy(Lazy&& s)
        : handle(s.handle)
        , haveVal(s.haveVal) // keep track of whether s already resumed the coroutine
    {
        DebugLog("Lazy move ctor");
        s.handle = nullptr;
    }
 
    ~Lazy()
    {
        DebugLog("Lazy dtor");
        if (handle)
        {
            handle.destroy();
        }
    }
 
    Lazy& operator=(const Lazy&) = delete;
 
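    Lazy& operator=(Lazy&& s)
    {
        DebugLog("Lazy move assignment operator");
        if (this != &s)
        {
            // Destroy the coroutine we currently own before taking over s's
            if (handle)
            {
                handle.destroy();
            }
            handle = s.handle;
            haveVal = s.haveVal;
            s.handle = nullptr;
        }
        return *this;
    }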
 
    int GetValue()
    {
        DebugLog("Lazy::GetValue");
        if (!haveVal)
        {
            handle.resume();
            haveVal = true;
        }
        return handle.promise().value;
    }

First, we declare promise_type so that the compiler knows it’s a struct before we define it later on. This enables us to declare a std::coroutine_handle<promise_type> data member. This is a class template in the Standard Library that provides our return type (Lazy) with access to the promise_type for the coroutine. We take the handle in our constructor and save it to that data member. We also store whether we’ve already computed the value.

Next, we disallow copying of Lazy by deleting the copy constructor and copy assignment operator. This is because Lazy “owns” that handle by calling handle.destroy() in its destructor. We wouldn’t want two copies of Lazy to be destroyed and call handle.destroy() twice. Move construction and assignment are OK though, so we define those functions.

Finally, we provide a GetValue rather than making the value public. This gives us control to call handle.resume() to compute the value. We use handle.promise() to get a reference to the promise_type and access the computed value data member. We use haveVal to remember if we’ve already called handle.resume() so we don’t resume the coroutine twice.

Now let’s take a look at the promise type, which is still a member of Lazy:

struct promise_type
{
    int value{0};
 
    promise_type()
    {
        DebugLog("promise_type ctor");
    }
 
    ~promise_type()
    {
        DebugLog("promise_type dtor");
    }
 
    Lazy get_return_object()
    {
        DebugLog("promise_type::get_return_object");
        return Lazy{
            std::coroutine_handle<promise_type>::from_promise(*this) };
    }
 
    AlwaysSuspend initial_suspend()
    {
        DebugLog("promise_type::initial_suspend");
        return AlwaysSuspend{};
    }
 
    void return_value(int value)
    {
        DebugLog("promise_type::return_value");
        this->value = value;
    }
 
    // C++20 requires final suspension to be non-throwing
    AlwaysSuspend final_suspend() noexcept
    {
        DebugLog("promise_type::final_suspend");
        return AlwaysSuspend{};
    }
 
    void unhandled_exception()
    {
        DebugLog("promise_type unhandled_exception");
    }
};

The first change is that the promise now has a value data member. The promise lives inside the coroutine’s state, so we don’t need to allocate separate storage for the value: it stays alive for as long as the coroutine does. To give Lazy access to it, we pass std::coroutine_handle<promise_type>::from_promise(*this) to the Lazy constructor in our get_return_object. This creates the handle for the coroutine that owns our promise object.

The second change is that our initial_suspend and final_suspend functions now return AlwaysSuspend objects. Suspending prevents the coroutine function from being executed until it’s explicitly resumed by GetValue.

The third and last change is that we now have a return_value instead of a return_void. This allows for the co_return value to be passed to us and, at least in this example, for us to store it for later retrieval in GetValue.

Now we can use Lazy in our coroutine:

Lazy VeryExpensiveCalculation()
{
    DebugLog("Start of coroutine");
    co_return 123;
    DebugLog("End of coroutine");
}

This is obviously not really an expensive calculation in this example as we simply co_return 123;.

We call it the same way as before, but we need to use GetValue now:

void Foo()
{
    DebugLog("Calling coroutine");
    Lazy ret = VeryExpensiveCalculation();
    DebugLog("Get value first time");
    DebugLog(ret.GetValue());
    DebugLog("Get value second time");
    DebugLog(ret.GetValue());
}

Running this outputs these log messages:

Calling coroutine
promise_type ctor
promise_type::get_return_object
Lazy ctor
promise_type::initial_suspend
AlwaysSuspend ctor
AlwaysSuspend::await_ready
AlwaysSuspend::await_suspend
Get value first time
Lazy::GetValue
AlwaysSuspend::await_resume
AlwaysSuspend dtor
Start of coroutine
promise_type::return_value
promise_type::final_suspend
AlwaysSuspend ctor
AlwaysSuspend::await_ready
AlwaysSuspend::await_suspend
123
Get value second time
Lazy::GetValue
123
Lazy dtor
AlwaysSuspend dtor
promise_type dtor

The beginning part is all the same except that we see AlwaysSuspend being used for the initial_suspend phase. After that we don’t see the “Start of coroutine” right away. Before it runs, we see “Get value first time” indicating that we could inject arbitrary code between the point where we called the coroutine and the point we decided to get the calculated value.

Once we do call GetValue, we see await_resume prompted by the handle.resume() call. Initial suspension is now done, so the AlwaysSuspend is destroyed and the coroutine begins execution.

The value is then “calculated” and given to co_return which calls return_value with the value. We then enter the final suspension phase and simply suspend. Control is handed back to GetValue which returns the value to Foo and it’s printed.

The second call to GetValue skips the handle.resume() call and simply returns the value. We don’t see any coroutine functions being called at this point.

Finally, the Lazy object goes out of scope and calls handle.destroy() to destruct the promise_type and final suspension AlwaysSuspend object.

Yielding

The next common usage of coroutines involves co_yield. This “generator” pattern allows us to produce a series of values. The series can even be infinite if we so choose. Here’s how our example generator coroutine looks:

Generator Squares(int count)
{
    DebugLog("Start of coroutine");
    for (int i = 1; i < count+1; ++i)
    {
        int square = i * i;
        DebugLog("Yielding", square, "for", i);
        co_yield square;
        DebugLog("Done yielding", square, "for", i);
    }
    DebugLog("End of coroutine");
}

In this case we just generate the first count squares of integers. Instead of using co_return, we use co_yield to output a single value and be able to pick up the function right after that operator.

To support this, we need an updated “return object” class called Generator:

class Generator
{
public:

    // Declared public so the compiler can find Generator::promise_type
    struct promise_type;
 
private:

    std::coroutine_handle<promise_type> handle;
 
public:
 
    Generator(std::coroutine_handle<promise_type> handle)
        : handle{ handle }
    {
        DebugLog("Generator ctor");
    }
 
    Generator(const Generator&) = delete;
 
    Generator(Generator&& s)
        : handle(s.handle)
    {
        DebugLog("Generator move ctor");
        s.handle = nullptr;
    }
 
    ~Generator()
    {
        DebugLog("Generator dtor");
        if (handle)
        {
            handle.destroy();
        }
    }
 
    Generator& operator=(const Generator&) = delete;
 
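    Generator& operator=(Generator&& s)
    {
        DebugLog("Generator move assignment operator");
        if (this != &s)
        {
            // Destroy the coroutine we currently own before taking over s's
            if (handle)
            {
                handle.destroy();
            }
            handle = s.handle;
            s.handle = nullptr;
        }
        return *this;
    }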
 
    bool MoveNext()
    {
        DebugLog("Generator::MoveNext");
        handle.resume();
        bool done = handle.done();
        DebugLog("done?", done);
        return !done;
    }
 
    int GetValue()
    {
        DebugLog("Generator::GetValue");
        return handle.promise().value;
    }
 
    struct promise_type
    {
        int value{ 0 };
 
        promise_type()
        {
            DebugLog("promise_type ctor");
        }
 
        ~promise_type()
        {
            DebugLog("promise_type dtor");
        }
 
        Generator get_return_object()
        {
            DebugLog("promise_type::get_return_object");
            return Generator{
                std::coroutine_handle<promise_type>::from_promise(*this) };
        }
 
        AlwaysSuspend initial_suspend()
        {
            DebugLog("promise_type::initial_suspend");
            return AlwaysSuspend{};
        }
 
        AlwaysSuspend yield_value(int value)
        {
            DebugLog("promise_type::yield_value", value);
            this->value = value;
            return AlwaysSuspend{};
        }
 
        void return_void()
        {
            DebugLog("promise_type::return_void");
        }
 
        // C++20 requires final suspension to be non-throwing
        AlwaysSuspend final_suspend() noexcept
        {
            DebugLog("promise_type::final_suspend");
            return AlwaysSuspend{};
        }
 
        void unhandled_exception()
        {
            DebugLog("promise_type unhandled_exception");
        }
    };
};

Very little has changed here! We no longer have a haveVal data member since that doesn’t apply to a generator. Instead, we’ve split some of GetValue out into a new MoveNext. We now call handle.resume() in MoveNext and use handle.done() to check if the coroutine is done. GetValue now simply gets the value from the promise object.

The only other change is that we now have a yield_value instead of a return_value. This is called with the value given to co_yield and it returns AlwaysSuspend to control suspension at that point of the coroutine’s execution. We also have a return_void since there’s no co_return and therefore we have an implicit co_return; at the end of the coroutine.

Now let’s call this coroutine:

void Foo()
{
    DebugLog("Calling coroutine");
    Generator ret = Squares(3);
    while (ret.MoveNext())
    {
        DebugLog("Get value");
        DebugLog(ret.GetValue());
    }
}

Running this gives the following log:

Calling coroutine
promise_type ctor
promise_type::get_return_object
Generator ctor
promise_type::initial_suspend
AlwaysSuspend ctor
AlwaysSuspend::await_ready
AlwaysSuspend::await_suspend
Generator::MoveNext
AlwaysSuspend::await_resume
AlwaysSuspend dtor
Start of coroutine
Yielding 1 for 1
promise_type::yield_value 1
AlwaysSuspend ctor
AlwaysSuspend::await_ready
AlwaysSuspend::await_suspend
done? false
Get value
Generator::GetValue
1
Generator::MoveNext
AlwaysSuspend::await_resume
AlwaysSuspend dtor
Done yielding 1 for 1
Yielding 4 for 2
promise_type::yield_value 4
AlwaysSuspend ctor
AlwaysSuspend::await_ready
AlwaysSuspend::await_suspend
done? false
Get value
Generator::GetValue
4
Generator::MoveNext
AlwaysSuspend::await_resume
AlwaysSuspend dtor
Done yielding 4 for 2
Yielding 9 for 3
promise_type::yield_value 9
AlwaysSuspend ctor
AlwaysSuspend::await_ready
AlwaysSuspend::await_suspend
done? false
Get value
Generator::GetValue
9
Generator::MoveNext
AlwaysSuspend::await_resume
AlwaysSuspend dtor
Done yielding 9 for 3
End of coroutine
promise_type::return_void
promise_type::final_suspend
AlwaysSuspend ctor
AlwaysSuspend::await_ready
AlwaysSuspend::await_suspend
done? true
Generator dtor
AlwaysSuspend dtor
promise_type dtor

We can see the coroutine being repeatedly suspended and resumed as calls are made to handle.resume() via MoveNext. Values are output to the yield_value function and returned to Foo via GetValue.

Since a “generator” like this represents a sequence of values, it’s common to adapt it to C++’s “iterator” paradigm so it can be used in a range-based for loop. To do so, we add the required begin and end functions to Generator and make them return an object that fulfills all the requirements of range-based for loops:

class Iterator
{
    Generator& owner;
    bool done;
 
public:
 
    Iterator(Generator& o, bool d)
        : owner(o)
        , done(d)
    {
        if (!done)
        {
            MoveNext();
        }
    }
 
    void MoveNext()
    {
        owner.handle.resume();
        done = owner.handle.done();
    }
 
    bool operator!=(const Iterator& other) const
    {
        return done != other.done;
    }
 
    Iterator& operator++()
    {
        MoveNext();
        return *this;
    }
 
    int operator*() const
    {
        return owner.handle.promise().value;
    }
};
 
Iterator begin()
{
    return Iterator{ *this, false };
}
 
Iterator end()
{
    return Iterator{ *this, true };
}

Now we can use the coroutine like this:

for (int val : Squares(3))
{
    DebugLog(val);
}

And we’ll get the expected values:

1
4
9

Asynchronous Coroutines

The last coroutine keyword is co_await. It’s typically used to form asynchronous coroutines similar to C# async functions. For example, we might write file = co_await DownloadUrl("https://test.com/big-file"); in order to suspend our coroutine until a file is downloaded.

For now, let’s just make our Lazy return object support co_await. All we really need to do is add a few member functions to it:

bool await_ready()
{
    DebugLog("Lazy::await_ready");
    const auto done = handle.done();
    DebugLog("Done?", done);
    return done;
}
 
void await_suspend(std::coroutine_handle<> awaitHandle)
{
    DebugLog("Lazy::await_suspend");
    DebugLog("Resuming handle");
    handle.resume();
    DebugLog("Resuming awaitHandle");
    awaitHandle.resume();
}
 
auto await_resume()
{
    DebugLog("Lazy::await_resume");
    return handle.promise().value;
}

These member functions correspond with different points of execution in a coroutine that’s using co_await. These are exactly the suspension functions we’ve already seen in NeverSuspend and AlwaysSuspend. In fact, those classes can be used with co_await!

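Lazy AwaitSuspender()
{
    co_await NeverSuspend{};
    // Lazy's promise has return_value rather than return_void,
    // so the coroutine must end by returning a value
    co_return 0;
}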

Now that we have them in Lazy, we can use them from other coroutines. Let’s repurpose VeryExpensiveCalculation as the source of the count in our Squares coroutine:

Lazy VeryExpensiveCalculation()
{
    DebugLog("Start of coroutine");
    co_return 3;
    DebugLog("End of coroutine");
}
 
Generator Squares()
{
    int count = co_await VeryExpensiveCalculation();
 
    for (int i = 1; i < count + 1; ++i)
    {
        int square = i * i;
        co_yield square;
    }
}

We can run this exactly as before:

void Foo()
{
    DebugLog("Calling coroutine");
    for (int val : Squares())
    {
        DebugLog(val);
    }
}

Here’s the log we get:

Calling coroutine
promise_type ctor
promise_type::get_return_object
Generator ctor
promise_type::initial_suspend
AlwaysSuspend ctor
AlwaysSuspend::await_ready
AlwaysSuspend::await_suspend
AlwaysSuspend::await_resume
AlwaysSuspend dtor
promise_type ctor
promise_type::get_return_object
Lazy ctor
promise_type::initial_suspend
AlwaysSuspend ctor
AlwaysSuspend::await_ready
AlwaysSuspend::await_suspend
Lazy::await_ready
Done? false
Lazy::await_suspend
Resuming handle
AlwaysSuspend::await_resume
AlwaysSuspend dtor
Start of coroutine
promise_type::return_value
promise_type::final_suspend
AlwaysSuspend ctor
AlwaysSuspend::await_ready
AlwaysSuspend::await_suspend
Resuming awaitHandle
Lazy::await_resume
Lazy dtor
AlwaysSuspend dtor
promise_type dtor
promise_type::yield_value 1
AlwaysSuspend ctor
AlwaysSuspend::await_ready
AlwaysSuspend::await_suspend
1
AlwaysSuspend::await_resume
AlwaysSuspend dtor
promise_type::yield_value 4
AlwaysSuspend ctor
AlwaysSuspend::await_ready
AlwaysSuspend::await_suspend
4
AlwaysSuspend::await_resume
AlwaysSuspend dtor
promise_type::yield_value 9
AlwaysSuspend ctor
AlwaysSuspend::await_ready
AlwaysSuspend::await_suspend
9
AlwaysSuspend::await_resume
AlwaysSuspend dtor
promise_type::return_void
promise_type::final_suspend
AlwaysSuspend ctor
AlwaysSuspend::await_ready
AlwaysSuspend::await_suspend
Generator dtor
AlwaysSuspend dtor
promise_type dtor

We can see the calls to Lazy::await_ready, Lazy::await_suspend, and Lazy::await_resume at the start as Squares uses co_await to get the count. Afterward we simply see Squares go on to generate the sequence as before.

Notice that co_await doesn’t imply any particular multi-threading system as it does in C#. We’re free to keep everything on one thread as we do here, employ a thread pool, funnel coroutine return objects through a job system, or anything else we deem appropriate.

Conclusion

Coroutines in C++ fulfill very similar purposes to iterator functions and async functions in C#. As usual, C++ provides a lower level of control over their behavior. This level of control is at least powerful enough to implement both C# features and will likely be used for various creative purposes as more and more codebases adopt C++20.

While no guarantees are made, C++ compilers typically optimize coroutines aggressively. Immediate, synchronous usage such as we’ve seen today is likely to avoid any heap allocation, and most if not all of the function calls and storage requirements may simply be optimized away, leaving only a raw loop or a constant. When that’s not possible, coroutines can still be used flexibly, such as by storing asynchronous coroutine return and promise objects for the long term.