eBPF is a set of tools that lets us inject programs into the Linux kernel at runtime, attaching them to events in the kernel and user space, and doing interesting things with those events. Now we can extend the kernel without rebuilding it, and because the programs are run through a verifier at load time, we can do it in a safe, reliable fashion.
“Interesting things” can range from implementing flexible network firewalling - the original purpose of eBPF - to end-to-end observability, security & network controls (cilium), performance monitoring (BCC), load balancing (katran), and all sorts of other exciting things. It’s hard to overstate how many new use cases it makes practical - there’s a great primer on this stuff over at eBPF.io.
I have yet to play with eBPF directly, besides using tools like cilium and Brendan Gregg’s rather comprehensive eBPF tools.
A toy project - Dumping unencrypted SSL traffic
I learn how things work by building with them. To get a proper grasp on eBPF, I’ll build a program that leverages eBPF to intercept SSL traffic in user-space, capturing data before it is encrypted (outgoing messages) and after it is decrypted (incoming messages). This’ll let us read wire-encrypted SSL traffic on our local system without proxies, or having to meddle directly with the processes involved.
Step 1 - How’s curl do SSL?
Let’s start by seeing how curl works, with an eye to understanding what it’s using for SSL. I’ll go through the process I used to work out what’s going on here, rather than jumping straight to the “…and that’s how it works” part.
I’ve spent a fair bit of time working in C-land, and figure a good place to start is dumping the shared objects used by curl:
Looks like we’ve got libssl.so.3, which is part of OpenSSL 3.0. Let’s check what exports it has, so we can try and work out what we want to hook. We can use ltrace to watch dynamically linked library calls - maybe we’ll see it using libssl to do something:
Whoops - there’s no libssl, but there is libnghttp2. If we add -S to include system calls, we get a lot more output, but it doesn’t help much. We can see we call read to read some OpenSSL configuration. Let’s use a debugger, set a breakpoint on read, and see what our stack looks like when we hit it. Hopefully this will go through OpenSSL and give us a thread to pull:
Look! Some OpenSSL frames right in the middle there. Looks like we’ve caught OpenSSL starting up, as we’d hoped. Let’s set some other breakpoints and see if we can catch its involvement in encryption - looks like we can use rbreak SSL.* to catch every call through the library.
Running through these breakpoints, there’s lots of noise in there, and it seems a fair bit is internal details of OpenSSL. A bit more of a look around the symbol tables reveals SSL_read and SSL_write. Let’s break on them, and see if we can get some info out on how they are being called:
We can’t easily see the locals or the args, because we don’t have a build of curl with symbols - we have a normal release build that’s been stripped.
Let’s park this debugging adventure for the moment - we have enough information to see that SSL_read and SSL_write are what we want, and we can see their function signatures in the OpenSSL manual anyway.
Step 2 - What tooling do we need to app-ify this?
Now we know that to capture curl’s traffic, we need to userspace-hook SSL_write and SSL_read from the OpenSSL library. I know that eBPF has userspace probes that I want to use for this, but I still need some tooling to develop in. A casual google turns up a bunch of options - ebpf-go for Go, libbpf-rs for Rust, and libbpf for C. There’s also a bunch of tools you can use to save writing the userspace part - but as this is a journey of discovery, I’ll discard those.
Based entirely on vibes (and GitHub popularity!) I’ve decided to start with ebpf-go. It’s part of the Cilium project, is very well documented, and pops up constantly when you google in the eBPF domain.
Step 3 - Pulling apart an example
First we need to decide what sort of probe we need - there’s all sorts - kernel functions, user-space functions, hardware events, etc. In our case we know we need user probes - which brings us to uprobe and uretprobe. Looking at the API documentation for SSL_read and SSL_write, we see that we have a buffer and a length as the second and third arguments of each function call. In the write case, the buffer contains the unencrypted payload when the function is called; in the read case, the buffer is filled with the decrypted payload once the function has completed (and the return value of SSL_read indicates the number of bytes read).
ebpf-go has a great example of a uretprobe we can start with: main.go is the user-space part, which loads the eBPF program and reads its output, and uretprobe.c is the eBPF program itself. A few bits jump out:
First, the data interface between the eBPF and the user-space program. We can see we return “events” that contain the process ID and some text. BPF_MAP_TYPE indicates we have a BPF map; maps are the storage types supported by eBPF. Despite the name, they aren’t all hashmaps.
In main.go, we can see these events being read out:
Sticking with main.go, we can also see how we attach the probe to some user-space function - specifying the binary in question. In our case, we will need to provide the path to the OpenSSL shared object (.so) that we want to hook:
Finally, in the eBPF, we see the probe itself:
At this point, we have a pretty good idea how our program should look.
Step 4 - Code! Code! Code!
For this bit I’ll break the code down by component rather than following along with how I incrementally built it; I think this will be easier to follow.
Let’s start with the probes themselves - we need to hook both SSL_read and SSL_write. We have two options - a uprobe lets us hook a function before it begins running, and a uretprobe lets us hook it as it returns. For write it’s straightforward enough - the user provides OpenSSL the unencrypted data, so we simply hook a uprobe up and grab that buffer:
The read side is a bit more interesting, as the caller provides an empty buffer that OpenSSL fills with the decrypted payload. I naively thought I could just use a uretprobe here, but at the point of return, the arguments to SSL_read are no longer reliably in the registers, so we can’t get at the buffer the caller provided. This means we need both a uprobe (to get the decryption buffer address) and a uretprobe (to read the decrypted data), and some way of passing the details from the uprobe to the uretprobe, so we can ultimately push it all back to our userspace app. Let’s see what that looks like:
At this point, we need two “external stores” - places to stash data that aren’t on the stack. The first is obviously to pass data back to user-space so we can get at our unencrypted data. We will need this mechanism for both read/write. The second is so we can pass the buffer address and length from the uprobe to the uretprobe.
The mechanism for this in eBPF world is the “map”. These aren’t just dictionaries/hashmaps, but rather a bunch of general-purpose data structures we can use to share data between eBPF programs and user space - including hashmaps, arrays, ringbuffers and so on.
Let’s start with how we share data between the two “read” probes. For this, we will use a per-CPU array (BPF_MAP_TYPE_PERCPU_ARRAY), and simply set the 0th element of the array to the data we want to share. We can see the retrieval of the data from the map in uprobe_libssl_read above; the definition of the map looks like this:
To send data back to userspace, we use a ring buffer (BPF_MAP_TYPE_RINGBUF). This gives us a low-overhead way of pushing data back to userspace, at the expense of potentially losing events if userspace fails to keep up: when the buffer is full, new entries are dropped rather than older unread ones being overwritten. For our use cases, this is fine. The definition looks like this:
To write to the map, we have another function available to us:
Finally, to access this from user-space, using ebpf-go:
And … that’s the big pieces in place! We can use our new tool to dump out some curl SSL traffic (note that we’re forcing HTTP/1.1 here, because HTTP/2 adds a framing layer that makes it more complicated to read than simply dumping out the unencrypted session):
That’s it! It’s far from complete as a tool for dumping SSL sessions, but it was a great toy project to get hands-on with eBPF. And of course - you can grab the whole project from GitHub if you are interested.
Open Questions
In the interests of not ending up in a bottomless rabbit hole of discovery, here are some open questions that I will try to answer … another day.
We use a per-CPU array to share data between our two “read” probes, assuming that the function we are book-ending does not get preempted onto another CPU between the probes. If this happened, we wouldn’t be able to find our data, or we’d find data from an unrelated invocation! I’m not sure if this assumption always holds.
When I attach a debugger to a process and dump out the values of registers, what am I actually reading? Have the registers been saved to system memory, as happens during preemption or a context switch, and GDB is simply reading from there?
How are the userspace probes injected into the process - is it part of the kernel linking the shared objects up when the executable is loaded? Do the uprobes appear as extra frames in the stack, or, are they “weaved” into the start and end of the function they are hooking?
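On the first question, one commonly used way to avoid depending on CPU placement is to key the hand-off store by thread ID (the value bpf_get_current_pid_tgid() returns inside a BPF program) instead of using a per-CPU slot. This is a different technique from the per-CPU array used in this project. The sketch below is a plain-Go model of that bookkeeping, not real eBPF; the inflight type and the onEntry/onReturn names are hypothetical.

```go
package main

import "fmt"

// readArgs is what the entry probe would stash: the caller's buffer
// address and its capacity.
type readArgs struct {
	Buf uint64
	Len uint32
}

// inflight models a hash map keyed by thread ID. Each thread has at most
// one pending call, so entries from different threads cannot collide even
// if they run on, or migrate between, the same CPUs.
type inflight map[uint64]readArgs

// onEntry models the uprobe: remember the arguments for this thread.
func (m inflight) onEntry(tid uint64, buf uint64, n uint32) {
	m[tid] = readArgs{buf, n}
}

// onReturn models the uretprobe: fetch and clear this thread's arguments,
// then work out how many bytes are valid. ok is false when there was no
// matching entry or the call returned an error (ret <= 0).
func (m inflight) onReturn(tid uint64, ret int32) (args readArgs, n uint32, ok bool) {
	args, found := m[tid]
	if !found {
		return readArgs{}, 0, false
	}
	delete(m, tid) // always clean up, even on failed reads
	if ret <= 0 {
		return readArgs{}, 0, false
	}
	n = uint32(ret)
	if n > args.Len {
		n = args.Len // never read past the caller's buffer
	}
	return args, n, true
}

func main() {
	m := inflight{}
	// Two threads enter the hooked function before either returns.
	m.onEntry(101, 0x1000, 4096)
	m.onEntry(202, 0x2000, 4096)

	// Returns arrive in the opposite order; each finds its own buffer.
	for _, r := range []struct {
		tid uint64
		ret int32
	}{{202, 120}, {101, 512}} {
		if args, n, ok := m.onReturn(r.tid, r.ret); ok {
			fmt.Printf("tid=%d buf=%#x bytes=%d\n", r.tid, args.Buf, n)
		}
	}
}
```

The trade-off is an extra map lookup and delete per call, in exchange for not needing the entry and return probes to observe the same CPU.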