BPF Development: Starting with Hello World

Part 1 Overview

Background

BPF technology has been listed as one of the hottest new areas in the Linux kernel domain in recent years. It has successfully endowed the Linux kernel with a degree of dynamic programmability, allowing for real-time modifications to the kernel's behavior during runtime, without the need to recompile or reboot the kernel.

Consequently, BPF shines in three major areas within the Linux world: networking, observability, and security. Mastering BPF technology is very meaningful for Linux kernel and application developers.

What is BPF?

BPF is implemented in the Linux kernel as a highly streamlined virtual machine, offering near lossless performance efficiency. It allows for code to be written in a syntax similar to C, which is then compiled into assembly code that can run on the BPF virtual machine, enabling its powerful logic processing capabilities.

Today, starting with Hello World, we will take you into the world of BPF.

Starting with BPF

Following the tradition of learning computer programming languages, we begin our journey into eBPF development with the classic Hello World program. First, let's briefly compare the similarities and differences between standard C programs and eBPF programs.

C Language Program

HelloWorld.c

#include <stdio.h>

int main()
{
  printf("HelloWorld\n");
  return 0;
}

Compilation and Execution of C Language

eBPF Program

HelloWorld.bpf.c

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

int helloworld(void *ctx)
{
  bpf_printk("Hello world!\n");
  return 0;
}

Compilation and Execution of eBPF Program

Compiling, loading, and running eBPF is a bit more complex than for standard C programs. The entire process will be introduced step by step.

Part 2 Development Environment

We use Vagrant + VirtualBox + Ubuntu 22.04 to set up our development environment. Note, it only supports x86 CPUs (Windows, Linux, or Mac). Apple's M chip is not supported.

Save the Vagrantfile, the virtual machine description file:

Vagrant.configure('2') do |config|
  config.vm.define 'bpf' do |bpf|
    bpf.vm.provider 'virtualbox' do |vb|
      vb.gui = false
      vb.memory = '4096'
      vb.cpus = 2
    end
    bpf.vm.synced_folder '.', '/vagrant', disabled: true
    bpf.vm.box = 'bento/ubuntu-22.04'
    bpf.vm.network 'private_network', type: 'dhcp'
    bpf.vm.provision 'shell', inline: <<-SHELL
        set -x
        set -e
        apt-get install -y linux-tools-generic linux-tools-$(uname -r) gcc-multilib clang libbpf0 libbpf-dev
    SHELL
  end
end

Some common commands:

vagrant up: Start the virtual machine
vagrant halt: Power off the virtual machine
vagrant destroy: Destroy the virtual machine
vagrant ssh: Log into the virtual machine

Part 3 Starting Development

Taking the previously mentioned HelloWorld.bpf.c as an example.

Compiling eBPF

Generate eBPF bytecode, including debug information:

clang -target bpf -Wall -O2 -g -c HelloWorld.bpf.c -o HelloWorld.o

View the compiled eBPF bytecode

llvm-objdump -d -r -S --print-imm-hex HelloWorld.o

root@vagrant:~# llvm-objdump -d -r -S --print-imm-hex HelloWorld.o
HelloWorld.o: file format elf64-bpf
Disassembly of section .text:
0000000000000000 <helloworld>:
; {
 0: b7 01 00 00 0a 00 00 00 r1 = 0xa
; bpf_printk("Hello world!\n");
 1: 6b 1a fc ff 00 00 00 00 *(u16*)(r10 - 0x4) = r1
 2: b7 01 00 00 72 6c 64 21 r1 = 0x21646c72
 3: 63 1a f8 ff 00 00 00 00 *(u32*)(r10 - 0x8) = r1
 4: 18 01 00 00 48 65 6c 6c 00 00 00 00 6f 20 77 6f r1 = 0x6f77206f6c6c6548 ll
 6: 7b 1a f0 ff 00 00 00 00 *(u64*)(r10 - 0x10) = r1
 7: bf a1 00 00 00 00 00 00 r1 = r10
 8: 07 01 00 00 f0 ff ff ff r1 += -0x10
; bpf_printk("Hello world!\n");
 9: b7 02 00 00 0e 00 00 00 r2 = 0xe
 10: 85 00 00 00 06 00 00 00 call 0x6
; return 0;
 11: b7 00 00 00 00 00 00 00 r0 = 0x0
 12: 95 00 00 00 00 00 00 00 exit
}

This chunk of eBPF assembly code is available for interested readers to reference: eBPF Instruction Set Specification, v1.0

Running BPF

Process:

Process

There are many events in the Linux kernel that support eBPF. After mounting an eBPF program, you need to wait for an event to be asynchronously triggered to see the effects.

Loading eBPF Bytecode

The Linux kernel's official tool, bpftool, encapsulates various operations for eBPF bytecode. In this article, we use it to mount and debug eBPF.

Generating eBPF File Structure in the Kernel

eBPF exists in the Linux kernel as a special file form, which will automatically be destroyed if the process exits. To allow eBPF programs to persist independently of processes within the Linux kernel, the kernel provides a bpf filesystem where eBPF programs can be pinned. Ubuntu 22.04 comes with the bpf filesystem mounted by default, located at /sys/fs/bpf, and can be used directly.

The current eBPF bytecode is just a piece of pure code. To pass it into the kernel, it is necessary to specify the type of the eBPF bytecode. You can list all the eBPF types supported by the kernel through bpftool feature:

...
Scanning eBPF program types...
eBPF program_type socket_filter is available
eBPF program_type kprobe is available
eBPF program_type sched_cls is available
eBPF program_type sched_act is available
eBPF program_type tracepoint is available
eBPF program_type xdp is available
eBPF program_type perf_event is available
eBPF program_type cgroup_skb is available
eBPF program_type cgroup_sock is available
eBPF program_type lwt_in is available
eBPF program_type lwt_out is available
...

HelloWorld.bpf.c is a "universal" eBPF program because it only outputs "HelloWorld". Therefore, it can be designated as any type of eBPF program, and here we choose raw_tracepoint.

bpftool prog load HelloWorld.o /sys/fs/bpf/HelloWorld type raw_tracepoint

Now, it can be seen in /sys/fs/bpf/:

root@vagrant:~# ls /sys/fs/bpf/HelloWorld
/sys/fs/bpf/HelloWorld

Or:

root@vagrant:~# bpftool prog show pinned /sys/fs/bpf/HelloWorld
353: raw_tracepoint name helloworld tag fc3c56cde923df12 gpl
 loaded_at 2023-03-11T01:59:48+0000 uid 0
 xlated 104B jited 71B memlock 4096B
 btf_id 118

This indicates that our eBPF program has been successfully loaded into the kernel.

Mounting the eBPF Program in the Kernel to Events

Normally, we would mount eBPF to specific events and then wait for the events to be asynchronously triggered before executing the eBPF program.

For convenience in this article, we choose the simplest interface used for debugging and testing: BPF_PROG_TEST_RUN, also called BPF_PROG_RUN,

They are fully equivalent. This interface can directly execute the eBPF program synchronously, without needing to mount it to an event and wait for the event to occur asynchronously.

BPF_PROG_TEST_RUN only supports a limited number of eBPF program types:

BPF_PROG_TYPE_SOCKET_FILTER
BPF_PROG_TYPE_SCHED_CLS
BPF_PROG_TYPE_SCHED_ACT
BPF_PROG_TYPE_XDP
BPF_PROG_TYPE_SK_LOOKUP
BPF_PROG_TYPE_CGROUP_SKB
BPF_PROG_TYPE_LWT_IN
BPF_PROG_TYPE_LWT_OUT
BPF_PROG_TYPE_LWT_XMIT
BPF_PROG_TYPE_LWT_SEG6LOCAL
BPF_PROG_TYPE_FLOW_DISSECTOR
BPF_PROG_TYPE_STRUCT_OPS
BPF_PROG_TYPE_RAW_TRACEPOINT
BPF_PROG_TYPE_SYSCALL

Previously, we designated our eBPF bytecode as raw_tracepoint, which precisely meets the requirements of that interface.

By using bpftool with the BPF_PROG_TEST_RUN interface to run it (how to set parameters will be analyzed in a special topic later, do not modify bpftool parameters here): OK, we have now successfully executed our first, simplest BPF program.

bpftool prog run pinned /sys/fs/bpf/HelloWorld repeat 0

Result:

root@vagrant:~# cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
# entries-in-buffer/entries-written: 1/1   #P:4
#
#                                _-----=> irqs-off
#                               / _----=> need-resched
#                              | / _---=> hardirq/softirq
#                              || / _--=> preempt-depth
#                              ||| / _-=> migrate-disable
#                              |||| /     delay
#           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
#              | |         |   |||||     |         |
        bpftool-39498   [001] d.... 979047.864950: bpf_trace_printk: Hello world!

OK, we have now successfully executed our first, simplest BPF program.

Part 4 BPF in MatrixOne

As our flagship database product, MatrixOne will fully utilize BPF technology in the future to enhance the performance, stability, and security of the database cluster.

Networking

As a cloud-native database running on standard Kubernetes clusters, networking is one of its most crucial infrastructures. We will use BPF technology to reduce the overhead of the Linux kernel network protocol stack, achieving network performance optimization for MatrixOne.

Observability

We will use BPF to collect and analyze various runtime metrics of the database in real-time within our observability component, for automated analysis and fault diagnosis. Simultaneously, we will provide a series of BPF tools for SREs to manually analyze and tune MatrixOne.

Security

In our accompanying security component, we will provide critical operation monitoring and prohibition based on BPF, for real-time detection and defense against security threats to the database.