Coroutine in C Language : Implementation Example

It’s been quite a while since I last published anything on my blog, but that’s due to a job change. I hope you understand that it is never easy to settle into a new environment with new people while keeping up with a steep technical learning curve. It takes time to tune yourself accordingly. Anyway, I wrote “Coroutine in C Language” as a prelude to my upcoming post on C++20 Coroutine. Today we will see “How Coroutine Works Internally?”.

Prologue

If you are an absolute beginner, then go through the prerequisites below. And if you are not a beginner, you know best what to skip!

Note: The context switching APIs getcontext, setcontext, makecontext and swapcontext were obsoleted in POSIX.1-2004 and removed in POSIX.1-2008, citing portability issues. So, please do not use them in production code. Here I have used them for demonstration purposes only.

Coroutine Basics

What Is Coroutine?

A coroutine is a function/sub-routine (a co-operative sub-routine, to be precise) that can be suspended and resumed.
In other words, you can think of a coroutine as an in-between solution of a normal function & a thread. Once a function/sub-routine is called, it executes till the end. On the other hand, a thread can be blocked by synchronization primitives (like mutexes, semaphores, etc.) or suspended by the OS scheduler. But again, you cannot decide when it is suspended & resumed, as that is done by the OS scheduler.
A coroutine, on the other hand, can be suspended at a pre-defined point & resumed later on a need basis by the programmer. So here the programmer has complete control of the execution flow, and that too with minimal overhead as compared to a thread.
Closely related concepts go by names such as fibers (on Windows), lightweight threads, green threads (in early versions of Java), etc.

Why Do We Need Coroutine?

Before learning anything new, you should be asking yourself this question, as I usually do. But let me answer it:
Coroutines can provide a very high level of concurrency with very little overhead, as switching between them doesn’t need OS intervention in scheduling. In a threaded environment, you have to bear the OS scheduling overhead.
A coroutine can suspend only at a pre-determined point, so you can often avoid locking on data structures shared between coroutines of the same thread. After all, you would never tell your code to switch to another coroutine in the middle of a critical section.
With threads, each thread needs its own stack with thread-local storage & other things, so your memory usage grows linearly with the number of threads you have. A stackful coroutine (like the ones in this post) still needs a stack, but you decide how big it is, so the per-coroutine cost can be much smaller than that of a thread.
For workloads with many tasks that spend most of their time waiting, coroutines are often the better choice, as switching between them is usually cheaper than switching between threads.
And if you are still not convinced then wait for my C++20 Coroutine post.

To-the-point Context Switching API Theory

Before we dive into an implementation of Coroutine in C, we need to understand the below foundation functions/APIs for context switching. Of course, as we always do, with less to-the-point theory & more code examples.
1. setcontext
2. getcontext
3. makecontext
4. swapcontext
If you are already familiar with setjmp/longjmp, then you will find these functions easy to understand. You can consider them an advanced version of setjmp/longjmp.
The key difference is that setjmp/longjmp allows only a single non-local jump up the stack, whereas these APIs allow the creation of multiple cooperative threads of control, each with its own stack and entry point.

Data Structure To Store Execution Context

The ucontext_t structure, defined as below, is used to store the execution context.
All four control flow functions (setcontext, getcontext, makecontext & swapcontext) operate on this structure.

typedef struct ucontext_t {
    struct ucontext_t *uc_link;
    stack_t     uc_stack;
    mcontext_t  uc_mcontext;
    sigset_t    uc_sigmask;
    ...
} ucontext_t;

uc_link points to the context which will be resumed when the current context exits, if the context was created with makecontext (a secondary context).
uc_stack is the stack used by the context.
uc_mcontext stores the execution state, including all registers and CPU flags: the frame/base pointer (i.e. indicates the current execution frame), the instruction pointer (i.e. program counter), the link register (i.e. stores the return address, on architectures that have one) and the stack pointer (i.e. indicates the current top of the stack or end of the current frame). mcontext_t is an opaque, machine-specific type.
uc_sigmask is used to store the set of signals blocked in the context, which isn’t the focus for today.

`int setcontext(const ucontext_t *ucp)`

This function transfers control to the context in ucp. Execution continues from the point at which the context was stored in ucp. On success, setcontext does not return; it returns -1 only if the switch fails.

`int getcontext(ucontext_t *ucp)`

Saves current context into ucp. This function returns in two possible cases:
1. after the initial call,
2. or when a thread switches to the context in ucp via setcontext or swapcontext.
The getcontext function does not provide a return value to distinguish the cases (its return value is used solely to signal error), so the programmer must use an explicit flag variable, which must not be a register variable and must be declared volatile to avoid constant propagation or other compiler optimisations.

`void makecontext(ucontext_t *ucp, void (*func)(void), int argc, ...)`

The makecontext function sets up an alternate thread of control in ucp, which has previously been initialised using getcontext.
The ucp->uc_stack member should be pointed to an appropriately sized stack; the constant SIGSTKSZ or MINSIGSTKSZ is commonly used, though a function that calls into the C library (printf, for example) may need considerably more.
When ucp is jumped to using setcontext or swapcontext, execution will begin at the entry point of the function pointed to by func, with argc arguments as specified. When func returns, control is transferred to the context specified in ucp->uc_link.

`int swapcontext(ucontext_t *oucp, const ucontext_t *ucp)`

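Saves the current execution state into oucp and then transfers execution control to ucp. When some other context later resumes oucp, the swapcontext call appears to return 0 to its original caller.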

[Example 1]: Understanding Context Switching With `setcontext` & `getcontext` Functions

Now that we have read a lot of theory, let’s make something meaningful out of it.
Consider the program below, which implements a plain infinite loop that prints “Hello world” every second.

#include <stdio.h>
#include <ucontext.h>
#include <unistd.h>
#include <stdlib.h>

int main( ) {
    ucontext_t ctx = {0};

    getcontext(&ctx);   // Loop start
    puts("Hello world");
    sleep(1);
    setcontext(&ctx);   // Loop end 

    return EXIT_SUCCESS;
}

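Here, getcontext returns in both of the cases we mentioned earlier, i.e.:
1. after the initial call,
2. when a thread switches to the context via setcontext.
Each setcontext call jumps back to the point just after getcontext, so the puts and sleep calls run again, and so on forever. The rest is, I think, self-explanatory.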

[Example 2]: Understanding Control Flow With `makecontext` & `swapcontext` Functions

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <signal.h>
#include <ucontext.h>

void assign(uint32_t *var, uint32_t val) { 
    *var = val; 
}

int main( ) {
    uint32_t var = 0;
    ucontext_t ctx = {0}, back = {0};

    getcontext(&ctx);

    ctx.uc_stack.ss_sp = calloc(1, MINSIGSTKSZ);
    ctx.uc_stack.ss_size = MINSIGSTKSZ;
    ctx.uc_stack.ss_flags = 0;

    ctx.uc_link = &back; // Will get back to main as `swapcontext` call will populate `back` with current context
    // ctx.uc_link = 0;  // Will exit directly after `swapcontext` call

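    // Note: `&var` is a pointer, but makecontext officially takes only int arguments.
    // This works with glibc on x86-64, but it is not portable.
    makecontext(&ctx, (void (*)())assign, 2, &var, 100);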
    swapcontext(&back, &ctx);    // Calling `assign` by switching context

    printf("var = %u\n", (unsigned)var);   // `var` is unsigned, so use %u

    return EXIT_SUCCESS;
}

Here, the makecontext function sets up an alternate thread of control in ctx. When the jump to ctx is made using swapcontext, execution begins at assign, with the arguments as specified.
When assign returns, control switches to the context that ctx.uc_link points to, i.e. back, which swapcontext populated just before the jump/context-switch.
If ctx.uc_link is set to 0 (NULL), there is no context to resume, so the thread exits when assign returns.
Before a call is made to makecontext, the application/developer needs to ensure that the context being modified has a pre-allocated stack, and that argc matches the number of arguments of type int passed to func. Otherwise, the behavior is undefined.