67 changes: 67 additions & 0 deletions pocs/linux/kernelctf/CVE-2025-40019_mitigation_2/docs/exploit.md
@@ -0,0 +1,67 @@
# Exploit

This submission targets `mitigation-v4-6.12` as a novelty-only follow-up for CVE-2025-40019. The regular vulnerability slot for this CVE and target was already taken; the relevant new part is the page-table adaptation described in `docs/novel-techniques.md`.

## High-Level Flow

The exploit turns the ESSIV scatterwalk offset underflow into a 16-byte write through a reclaimed scatterlist entry.

1. Create a sacrificial AF_ALG AEAD request that builds a chained receive scatterlist.
2. Free that request so the inline first receive SGL, a second receive SGL, and the tag SGL remain as residual slab contents.
3. Reclaim the second receive SGL slab slot with a Unix socket control-buffer allocation containing a crafted scatterlist entry.
4. Reclaim the freed anonymous pipe pages as user page-table pages.
5. Trigger the ESSIV decryption path with `assoclen == 0` and an output length of zero, causing `scatterwalk_ffwd()` to walk the residual SGL chain and write the encrypted IV into a page-table page.
6. Use a two-pass flow: pass 1 leaks the physical base needed for the target mapping, and pass 2 writes a coredump helper into `core_pattern`.

The exploit does not use user namespaces, `io_uring`, BPF, or a separate KASLR leak service.

## Scatterlist Shaping

The trigger sends exactly 32 bytes to `essiv(authenc(hmac(sha256),cbc(aes)),sha256)`. For this transform, the authentication tag size is 32 bytes. During decrypt, `_aead_recvmsg()` computes:

```text
outlen = used - authsize = 0x20 - 0x20 = 0
```

Because `outlen` is zero, `af_alg_get_rsgl()` does not initialize the receive SGL. The request nevertheless passes the receive SGL to ESSIV as both source and destination. The vulnerable ESSIV offset calculation wraps to `0xfffffff0`, so `scatterwalk_ffwd()` walks past the inline SGL and follows stale chain entries.
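The zero-length condition is plain arithmetic. The following is a minimal sketch, assuming only the decrypt-side rule stated above (`outlen = used - authsize`). `aead_decrypt_outlen` is a hypothetical helper for illustration, not a kernel function.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel function): decrypt-side output length
 * as described above, i.e. queued input bytes minus the tag size.
 * The caller is assumed to guarantee used >= authsize.
 */
static size_t aead_decrypt_outlen(size_t used, size_t authsize)
{
    return used - authsize;
}
```

With `used = 0x20` and `authsize = 0x20` the helper returns 0, which is the case where no receive SGL entries are initialized.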

The sacrificial request constructs the stale chain before the trigger request:

```text
first_rsgl[0..15] -> second_rsgl -> tsgl -> anonymous pipe pages
```

After the sacrificial request is freed, the exploit reclaims the `second_rsgl` allocation with a Unix socket `msg_control` buffer. The crafted entry supplies a large length value that steers the final `scatterwalk_ffwd()` position into the stale `tsgl` entries. The `tsgl` entries still encode pages that were freed after pipe closure; those pages are then reclaimed as page-table pages by a controlled `mmap()` spray.

## IV Encoding

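ESSIV encrypts the IV before copying it back, so the bytes that land in memory are the encrypted IV, not the IV that userspace supplied. The exploit embeds AES code so it can precompute the IV whose ESSIV encryption equals the desired 16-byte page-table write for the selected pass.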

In pass 1, the write maps a physical window that contains a known kernel trampoline page. Reading through the resulting huge mapping reveals enough physical address information to derive the `_stext` physical base used by pass 2.

In pass 2, the write maps the 1 GB physical window containing `core_pattern` and writes the helper payload through the corrupted user mapping.

## Privilege Escalation

The payload written into `core_pattern` is:

```text
|/proc/%P/root/tmp/ex %P
```

The `/proc/%P/root` prefix is required because the coredump helper path is resolved outside the process's jail-local mount namespace. The helper reopens the crashing process's standard file descriptors with `pidfd_open()` and `pidfd_getfd()`, then reads the flag from the target VM as root.

## Reproduction

Build and run on `mitigation-v4-6.12`:

```sh
make
./exploit
```

For the vulnerability-only KASAN check:

```sh
./exploit --vuln-trigger
```
@@ -0,0 +1,32 @@
# Novel Technique: Huge-PUD Recovery From a Wrong-Level Page-Table Write

The existing CVE-2025-40019 exploits targeted the same ESSIV scatterwalk primitive, but this submission handles a different landing condition on `mitigation-v4-6.12`: the repeated page-table write landed on a PUD page instead of a PTE page.

The usual response to this condition is to treat it as a failed PTE hit. This exploit instead converts the wrong-level page-table write into a useful primitive by writing valid 1 GB huge-PUD entries.

## Why This Works

The ESSIV primitive gives a 16-byte write, but the copied value is the encrypted IV. By precomputing the IV, the exploit controls the 16 bytes that the kernel writes after ESSIV transforms the IV.

When the reclaimed page is a PUD page, those 16 bytes can be used as two adjacent PUD entries. The exploit writes present, user-accessible huge entries that map a chosen 1 GB physical window into userspace.

The exploit uses this twice:

1. Pass 1 maps a physical window that exposes a stable value derived from a kernel address, then uses that value to derive the `_stext` physical base.
2. Pass 2 maps the 1 GB physical window containing `core_pattern`, then writes a coredump helper string through the resulting user mapping.

This avoids needing the ESSIV write to hit a PTE page. It also avoids ROP and avoids depending on user namespaces or `io_uring`.

## Namespace-Safe Coredump Helper

The second novelty is operational rather than a new corruption primitive. Writing `|/tmp/ex %P` to `core_pattern` was not reliable in this environment because the kernel resolves the coredump helper outside the jail-local mount namespace. The working payload is:

```text
|/proc/%P/root/tmp/ex %P
```

This resolves the helper through the crashing process's root, so the exploit can install `/tmp/ex` inside the target process namespace while the kernel still starts the correct helper as root.

## Practical Impact

This technique makes a page-table exploit usable even when the allocator consistently gives the vulnerable write a PUD page instead of a PTE page. For this target, it turned repeated near-misses into a working mitigation bypass and flag capture.
@@ -0,0 +1,48 @@
# CVE-2025-40019

## Requirements

- Capabilities: none
- User namespaces: not required
- io_uring: not required
- Kernel configuration: `CONFIG_CRYPTO_USER_API`, `CONFIG_CRYPTO_USER_API_AEAD`, `CONFIG_CRYPTO_ESSIV`
- Affected component: Linux kernel crypto, ESSIV AEAD template
- Fixed by: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6bb73db6948c2de23e407fe1b7ef94bf02b7529f
- Introduced by: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=be1eb7f78aa8fbe34779c56c266ccd0364604e71
- Affected versions: v5.4 through v6.18

## Summary

`essiv_aead_crypt()` performs an unchecked unsigned subtraction when copying the encrypted IV back into the destination scatterlist for decryption or in-place encryption. If `req->assoclen` is smaller than the AEAD IV size, the offset passed to `scatterwalk_map_and_copy()` wraps to a large unsigned value.

The vulnerable path is reachable by an unprivileged local user through AF_ALG AEAD sockets using the ESSIV template, for example `essiv(authenc(hmac(sha256),cbc(aes)),sha256)`.

## Root Cause

In the vulnerable code, the decryption and in-place paths copy the transformed IV to:

```c
scatterwalk_map_and_copy(req->iv, req->dst,
                         req->assoclen - crypto_aead_ivsize(tfm),
                         crypto_aead_ivsize(tfm), 1);
```

`req->assoclen` and `crypto_aead_ivsize(tfm)` are unsigned. With `assoclen == 0` and `ivsize == 16`, the subtraction becomes `0xfffffff0`. The later `ssize < 0` check existed only in the out-of-place encryption path, so it did not protect decryption or in-place encryption.
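The wraparound can be reproduced in isolation. This is a minimal sketch of the arithmetic only, assuming 32-bit unsigned operands as described above. `essiv_iv_offset` is a hypothetical name used for illustration and does not exist in the kernel.

```c
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): mirrors the unsigned
 * subtraction used for the IV copy offset. Unsigned arithmetic wraps
 * modulo 2^32, so assoclen < ivsize yields a very large offset instead
 * of a negative one.
 */
static uint32_t essiv_iv_offset(uint32_t assoclen, uint32_t ivsize)
{
    return assoclen - ivsize;
}
```

Calling `essiv_iv_offset(0, 16)` gives `0xfffffff0`, matching the value quoted above, while `essiv_iv_offset(16, 16)` gives 0, the smallest well-formed case.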

AF_ALG allows userspace to set `ALG_SET_AEAD_ASSOCLEN` to zero and then issue a decrypt request. In `_aead_recvmsg()`, sending exactly the authentication tag size makes the receive output length zero. That causes `af_alg_get_rsgl()` to return without initializing the receive scatterlist, while the ESSIV layer still receives that scatterlist as `req->dst`.

The wrapped offset makes `scatterwalk_ffwd()` walk beyond the initialized scatterlist entries and eventually treat residual heap data as a scatterlist entry. The ESSIV layer then writes the 16-byte encrypted IV to the page, offset, and length described by that residual or reclaimed entry.

## Fix

The fix moves the signed size validation to the start of `essiv_aead_crypt()`, before the decryption and in-place paths can use the underflowed value.
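The shape of such an early check can be sketched as follows. This is an illustration under stated assumptions, not the upstream patch: `essiv_check_assoclen` is a hypothetical helper, and returning `-EINVAL` is assumed as the rejection value.

```c
#include <errno.h>

/*
 * Hypothetical helper (not the upstream patch): reject requests whose
 * associated data is too short to hold the IV before any offset is
 * derived from the difference. Returns the non-negative size on success
 * and -EINVAL (an assumed error code) otherwise.
 */
static int essiv_check_assoclen(unsigned int assoclen, unsigned int ivsize)
{
    if (assoclen < ivsize)
        return -EINVAL;
    return (int)(assoclen - ivsize);
}
```

Because the check runs before any path uses the difference, `assoclen == 0` with `ivsize == 16` is rejected rather than turned into `0xfffffff0`.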

## Minimal Trigger

The submitted exploit supports:

```sh
./exploit --vuln-trigger
```

That mode opens the ESSIV AF_ALG AEAD transform, sends a decrypt request with `ALG_SET_AEAD_ASSOCLEN = 0`, and calls `recv()` without the heap grooming or privilege escalation stages.
@@ -0,0 +1,21 @@
CC ?= gcc
CFLAGS ?= -O2 -w -DMIT_612
DEBUG_CFLAGS ?= -O2 -g -w -DMIT_612
STATIC_LDFLAGS ?= -static

all: exploit

exploit: exploit.c wrapper.c
	$(CC) -B/usr/bin/ $(CFLAGS) -o payload_bin exploit.c
	cp /lib64/ld-linux-x86-64.so.2 ld_bin
	if [ -e /lib/x86_64-linux-gnu/libc.so.6 ]; then cp /lib/x86_64-linux-gnu/libc.so.6 libc_bin; else cp /usr/lib64/libc.so.6 libc_bin; fi
	ld -r -b binary -o payload_bin.o payload_bin
	ld -r -b binary -o ld_bin.o ld_bin
	ld -r -b binary -o libc_bin.o libc_bin
	$(CC) -B/usr/bin/ -O2 -w $(STATIC_LDFLAGS) -o exploit wrapper.c payload_bin.o ld_bin.o libc_bin.o

exploit_debug: exploit.c
	$(CC) -B/usr/bin/ $(DEBUG_CFLAGS) -o $@ $< $(STATIC_LDFLAGS)

clean:
	rm -f exploit exploit_debug payload_bin ld_bin libc_bin payload_bin.o ld_bin.o libc_bin.o
Binary file not shown.