Android 15 container crashes with Exit Code 129 on Kubernetes (containerd) due to read-only sysfs #925

@ZhouJze

Describe the bug

The Android 15 redroid container crashes immediately with Exit Code 129 (SIGHUP) when running on Kubernetes with the containerd runtime. The root cause is that containerd mounts /sys as read-only even for privileged containers, while Android 15's apexd-bootstrap requires write access to /sys/block/loopX/queue/read_ahead_kb during APEX initialization.

Other Android versions (11, 12, 13, 16) work fine on the same nodes because their APEX initialization paths don't depend on writing to read_ahead_kb.
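
For readers unfamiliar with the exit code, 129 follows the usual shell convention of 128 plus the signal number, so it maps to signal 1 (SIGHUP). A minimal sketch that decodes it, where `decode_exit_code` is a hypothetical helper name and the only real interface used is the POSIX `kill -l <exit_status>` form:

```shell
#!/bin/sh
# Hypothetical helper: decode an exit code into the signal that ended the process.
# Codes above 128 are treated as 128 + signal number (the common shell convention).
decode_exit_code() {
    code=$1
    if [ "$code" -gt 128 ]; then
        # kill -l <exit_status> prints the signal name without the SIG prefix.
        echo "signal $((code - 128)) (SIG$(kill -l "$code"))"
    else
        echo "exit status $code"
    fi
}

decode_exit_code 129   # signal 1 (SIGHUP)
```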

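Please describe the problem in detail (ZH_CN)

The Android 15 redroid container crashes immediately on startup in a Kubernetes (containerd runtime) environment, with exit code 129 (SIGHUP).

Root cause: containerd mounts /sys read-only even for containers with privileged: true. Android 15's apexd-bootstrap needs to write to /sys/block/loopX/queue/read_ahead_kb while initializing APEX modules. The write fails, APEX activation fails, and init triggers InitFatalReboot.

Under Docker, /sys in a privileged container is rw, so Docker is not affected. Other Android versions (11/12/13/16) run normally on the same nodes because their APEX initialization paths do not depend on this write.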

Environment:

  • Host kernel: 5.15.0-170-generic (Ubuntu 22.04 ARM64)
  • Container runtime: containerd 1.6.34 (via Kubernetes)
  • Image: redroid/redroid:15.0.0-latest (based on redroid 15)
  • Docker version on working host: 28.2.2 (same kernel, works fine)

dmesg output from host:

apexd-bootstrap: DM_DEV_CREATE failed for [com.android.adservices]: Bad file descriptor
apexd-bootstrap: Failed to create empty device com.android.adservices
apexd-bootstrap: DM_DEV_CREATE failed for [com.android.art]: Bad file descriptor
apexd-bootstrap: Failed to create empty device com.android.art
apexd-bootstrap: Failed to activate /system/apex/com.android.tzdata.apex(com.android.tzdata@352090000): Could not create loop device for /system/apex/com.android.tzdata.apex: Failed to open /sys/block/loop47/queue/read_ahead_kb: Read-only file system
apexd-bootstrap: Failed to activate /system/apex/com.android.i18n.apex(com.android.i18n@1): Could not create loop device for /system/apex/com.android.i18n.apex: Failed to open /sys/block/loop46/queue/read_ahead_kb: Read-only file system
apexd-bootstrap: Failed to activate /system/apex/com.android.runtime.apex(com.android.runtime@1): Could not create loop device for /system/apex/com.android.runtime.apex: Failed to open /sys/block/loop47/queue/read_ahead_kb: Read-only file system
apexd-bootstrap: Failed to activate bootstrap apex files : Failed to activate 3 APEX packages.
init: Sending signal 9 to service 'apexd-bootstrap' (pid 11) process group...

Key difference between Docker and containerd:

# Inside Docker privileged container:
$ cat /proc/1/mountinfo | grep sysfs
... /sys rw,nosuid,nodev,noexec,relatime - sysfs sysfs rw

# Inside Kubernetes privileged container (containerd):
$ cat /proc/1/mountinfo | grep sysfs
... /sys rw,nosuid,nodev,noexec,relatime - sysfs sysfs ro
#                                                       ^^^ read-only!
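
The difference above sits in the last field of the mountinfo line (the superblock options), not in the per-mount options that follow the mount point. The sketch below prints both for /sys so the two environments can be compared directly. `sysfs_modes` and `sysfs_is_readonly` are hypothetical helper names, not part of redroid, Docker, or containerd, and the field layout assumed is the one documented for /proc/[pid]/mountinfo.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting how sysfs is mounted at /sys.

# Print the per-mount and superblock options of the sysfs mounted at /sys.
# $1: mountinfo file to read (defaults to /proc/self/mountinfo).
sysfs_modes() {
    awk '
        $5 == "/sys" {
            # Skip the optional fields, which end at a lone "-".
            for (i = 7; i <= NF; i++) if ($i == "-") break
            # After "-": filesystem type, mount source, superblock options.
            if ($(i + 1) == "sysfs") print "mount=" $6 " super=" $(i + 3)
        }
    ' "${1:-/proc/self/mountinfo}"
}

# Succeed if either option list starts with "ro".
sysfs_is_readonly() {
    sysfs_modes "${1:-}" | grep -Eq '=ro(,| |$)'
}
```

Running `sysfs_modes /proc/1/mountinfo` inside each container should show whether the read-only flag is on the mount, the superblock, or both.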

Workaround:

Run mount -o remount,rw /sys before starting /init. In Kubernetes, this can be done via a privileged initContainer or in the container's startup command:

initContainers:
  - name: sysfs-fix
    image: python:3.9
    command: ["sh", "-c", "mount -o remount,rw /sys"]
    securityContext:
      privileged: true

Running mount -o remount,rw /sys before starting /init resolves the problem. In K8s this can be done with a privileged initContainer.
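
The initContainer command above succeeds whenever mount returns 0. A slightly more defensive variant, sketched here, re-reads mountinfo after the remount and fails if /sys still reports read-only. `remount_sysfs_rw` is a hypothetical name, and the sketch assumes sh, mount, and awk are available in the initContainer image.

```shell
#!/bin/sh
# Hypothetical helper: remount /sys read-write and verify the result.
# $1: mountinfo file to check (defaults to /proc/self/mountinfo).
remount_sysfs_rw() {
    mount -o remount,rw /sys || return 1
    # Fail if the per-mount or superblock options of /sys still start with "ro".
    if awk '
        $5 == "/sys" {
            # The superblock options are the third field after the lone "-".
            for (i = 7; i <= NF; i++) if ($i == "-") break
            if ($6 ~ /^ro(,|$)/ || $(i + 3) ~ /^ro(,|$)/) ro = 1
        }
        END { exit (ro ? 0 : 1) }
    ' "${1:-/proc/self/mountinfo}"; then
        echo "sysfs is still read-only after remount" >&2
        return 1
    fi
}
```

Called with no arguments from the initContainer, a remount that did not take effect should then surface as a failed init step rather than as a crash of the redroid container.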


Make sure the required kernel modules are present

Make sure the required kernel features are enabled (ZH_CN)

  • grep binder /proc/filesystems — binder is available (built-in to 5.15 kernel)
  • grep ashmem /proc/misc — N/A (not required for this kernel version)

Kernel modules are NOT the issue. The problem is sysfs mount mode (ro vs rw).

Kernel modules are not the problem. The problem is the sysfs mount mode (read-only vs read-write).


Collect debug logs

Collect debug logs (ZH_CN)

The container exits immediately (within 1 second), so no container logs are produced. The diagnostic information comes from the host dmesg output (shown above).

The container exits within 1 second and produces no container logs. The diagnostic information comes from the host's dmesg (shown above).


Screenshots

Screenshots (ZH_CN)

N/A. The dmesg logs above fully document the issue.


Additional context

This appears to be a containerd-specific issue. The same image with the same kernel version works perfectly under Docker because Docker mounts /sys as rw for privileged containers.

Affected: Android 15 only (apexd-bootstrap behavior changed)
Not affected: Android 11, 12, 13, 16 (different APEX init paths)

Suggested fix for redroid: Add a fallback in apexd-bootstrap when read_ahead_kb is not writable, or document the mount -o remount,rw /sys workaround for containerd/Kubernetes users.
