Using eBPF to add arbitrary delay to kernel functions
It used to be the case that eBPF programs were very limited in what they could do in the kernel: no loops, only a very limited number of instructions, and so on.
Over time, the capabilities of eBPF have expanded quite dramatically. And even though the verifier makes sure eBPF programs will eventually exit, “eventually” does not necessarily mean “in a timely manner”. This is relatively well known within the eBPF community itself, though it sometimes comes as a surprise to outside observers (as was seen in a mailing list discussion back in February).
However, while some may argue that the ability to stall the kernel is a bad thing, this can also be useful! The kernel sometimes manages to stall itself (the RTNL mutex is a common cause), and when this happens it is often intermittent and unpredictable, making it very challenging to debug.
But what if we could deterministically inject delay into arbitrary places in the running kernel? Well, with eBPF it turns out we can!
The code to do this is actually quite trivial; it just requires a recursive bpf_loop() invocation:
    static int recurse_loop(__u64 idx, void *ctx)
    {
        return 0;
    }

    static int run_delay(__u64 idx, void *ctx)
    {
        bpf_loop(100000, recurse_loop, NULL, 0);
        return 0;
    }

    SEC("fentry/veth_dellink")
    int BPF_PROG(delay_function)
    {
        bpf_loop(100000, run_delay, NULL, 0);
        return 0;
    }
This will loop 10 billion times every time the veth_dellink function is called, which on my machine takes around 10 seconds [1].
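As a quick sanity check on those numbers, here is a small userspace C sketch (not part of the BPF program itself). The helper names total_iterations and ns_per_iteration are made up for illustration, and the 10-second figure is simply the rough measurement quoted above, so treat the result as a ballpark rather than a benchmark.

```c
#include <assert.h>
#include <stdint.h>

/* Number of inner callback invocations for two nested bpf_loop() calls.
 * Hypothetical helper, for illustration only. */
uint64_t total_iterations(uint64_t outer, uint64_t inner)
{
    return outer * inner;
}

/* Average cost of one inner iteration in nanoseconds, given the total
 * wall-clock time measured for the whole nested loop. */
double ns_per_iteration(double total_seconds, uint64_t iterations)
{
    assert(iterations > 0);
    return total_seconds * 1e9 / (double)iterations;
}
```

Plugging in the values from the example, total_iterations(100000, 100000) gives 10,000,000,000, and ns_per_iteration(10.0, 10000000000) works out to roughly 1 ns per callback invocation on a machine where the whole thing takes about 10 seconds.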
Turning this into a more general utility was not too hard either; I implemented this in the bpf-examples repository. The utility only has a few additions on top of the basic example above: it allows attaching to arbitrary kernel functions, and it allows specifying a target delay in microseconds. A calibration run is performed before attaching to find the number of iterations to use in the outer loop to hit that target.
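To make the calibration idea a bit more concrete, here is a minimal sketch of how an outer iteration count could be derived from one timed calibration run. This is an assumption about the general approach (plain linear scaling), not the code from the bpf-examples utility; the function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: given that `calib_iterations` outer iterations took
 * `calib_elapsed_ns` nanoseconds in a calibration run, estimate how many
 * outer iterations are needed to reach `target_delay_us` microseconds.
 * Assumes the delay scales linearly with the iteration count and that
 * target_ns * calib_iterations fits in 64 bits. */
uint64_t outer_iterations_for_delay(uint64_t calib_iterations,
                                    uint64_t calib_elapsed_ns,
                                    uint64_t target_delay_us)
{
    assert(calib_iterations > 0);
    assert(calib_elapsed_ns > 0);

    uint64_t target_ns = target_delay_us * 1000;

    /* Integer division, rounded to the nearest iteration count. */
    uint64_t iters = (target_ns * calib_iterations + calib_elapsed_ns / 2)
                     / calib_elapsed_ns;

    /* Always run at least one outer iteration. */
    return iters > 0 ? iters : 1;
}
```

For example, if a calibration run of 1000 outer iterations took 100 ms (100,000,000 ns), then a target of 10,000,000 microseconds (10 seconds) yields 100000 outer iterations, which matches the hard-coded value in the basic example. A real implementation would probably want to repeat the calibration a few times, since a single timing run can be noisy.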
Note the warning in the README file, though:
Also, THERE IS NO SAFETY VALVE. This program will run in a tight loop every time it is executed inside the kernel! Depending on which function it is attached to, this may COMPLETELY STALL THE SYSTEM, leaving no way to interrupt it other than a hard power cycle of the machine. Use at your own risk!
And yes, I did manage to stall my laptop while implementing this; thank you for asking! :)
[1] This reminds me of when I was first learning to program (in Visual Basic at the age of 12). I wanted to add some delay to the application, so I inserted an empty for loop that did 100000 iterations in the middle of the code.