Race condition: concurrent JDK auto-provisioning corrupts JDK installation #2446

@dzied-baradzied

Description

When multiple jbang processes start simultaneously and no JDK is installed, they all attempt to download and extract the same JDK archive concurrently into ~/.jbang/cache/jdks/<version>.tmp/. The concurrent tar extractions corrupt each other, leaving a broken JDK that fails all subsequent jbang invocations.

This is related to but distinct from #2445 (same-script compilation race). That issue is about the jar build cache; this one is about JDK provisioning.

Reproducer

The failure reproduces 100% of the time with 4 parallel processes when no JDK is pre-installed.

Files

ToolA.java (also create ToolB.java, ToolC.java, and ToolD.java with the same content but different class names):

///usr/bin/env jbang "$0" "$@" ; exit $?
//JAVA 21+

public class ToolA {
    public static void main(String[] args) {
        System.out.println("ToolA OK — JDK " + System.getProperty("java.version"));
    }
}

reproduce.sh:

#!/bin/bash
set -euo pipefail

rm -rf ~/.jbang/cache/jdks/*

echo "=== jbang JDK provisioning race condition reproducer ==="
echo "Running 4 DIFFERENT scripts in parallel (all require //JAVA 21+)"
echo "No JDK is pre-installed — all processes will try to provision one."

pids=""
for script in ToolA ToolB ToolC ToolD; do
    jbang --verbose "${script}.java" > "run-${script}.log" 2>&1 &
    pids="$pids $!"
done

exits=""
for pid in $pids; do
    wait "$pid" && exits="$exits 0" || exits="$exits $?"
done

echo "=== Results ==="
failed=0
i=0
for code in $exits; do
    i=$((i + 1))
    if [ "$code" -ne 0 ]; then
        echo "Process $i: FAIL (exit $code)"
        failed=1
    else
        echo "Process $i: PASS"
    fi
done

if [ $failed -eq 1 ]; then
    echo "RACE CONDITION REPRODUCED"
    for f in run-Tool*.log; do
        echo "--- $f ---"
        cat "$f"
    done
    exit 1
fi

Dockerfile:

FROM alpine:3.21
RUN apk add --no-cache bash curl
RUN curl -Ls https://sh.jbang.dev | bash -s - app setup --quiet
ENV PATH="/root/.jbang/bin:${PATH}"
RUN ! which java && echo "No JDK pre-installed"
WORKDIR /workspace
COPY ToolA.java ToolB.java ToolC.java ToolD.java reproduce.sh ./
RUN chmod +x reproduce.sh
CMD ["./reproduce.sh"]

Run

docker build -t jbang-jdk-race -f Dockerfile .
docker run --rm jbang-jdk-race

Observed Failure Modes

Three distinct failure modes were observed, all caused by concurrent tar extraction into the same directory:

1. Directory conflicts during extraction

Multiple tar processes try to create the same directory structure simultaneously:

tar: can't create directory 'jdk-17.0.19+9/': No such file or directory
tar: can't open 'jdk-17.0.19+9/': Is a directory
Error installing JDK

2. Symlink/rename conflicts

One process creates files while another rearranges the directory:

tar: can't create symlink 'jdk-17.0.19+9/legal/jdk.naming.rmi/ASSEMBLY_EXCEPTION' to '../java.base/ASSEMBLY_EXCEPTION'
mv: can't rename '/root/.jbang/cache/jdks/17.tmp/jdk-17.0.19+9/lib': Directory not empty
mv: can't rename '/root/.jbang/cache/jdks/17.tmp/lib/jfr': Not a directory
Error installing JDK

3. Corrupt JDK binary (most severe)

The JDK appears to install but is incomplete — shared libraries are missing:

Error relocating /root/.jbang/cache/jdks/17/bin/java: JLI_InitArgProcessing: symbol not found

This is the worst case: the JDK directory exists and the java binary is present, but the installation is broken. All subsequent jbang invocations fail until the cache is manually cleared.

Root Cause

jbang's JDK provisioning code downloads and extracts the JDK archive without any locking mechanism. When multiple processes need the same JDK version:

  1. All processes check if the JDK exists — none find it
  2. All processes download the same archive (redundant network traffic)
  3. All processes extract into the same ~/.jbang/cache/jdks/<version>.tmp/ directory
  4. The concurrent tar extractions corrupt each other
  5. One process "wins" and moves the .tmp dir to the final location, but the moved directory may be incomplete

There is no:

  • Lock file to serialize JDK provisioning
  • Unique temp directory per process
  • Post-installation integrity check
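
To illustrate the last point, a post-installation integrity check could be as small as launching the installed java binary once and checking its exit code. The sketch below is not jbang code: the class name JdkHealthCheck, the method isUsable, and the 30-second timeout are assumptions made for illustration. A JDK in the state shown under failure mode 3 would be expected to fail this check, because the dynamic loader aborts before java -version can succeed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of jbang: smoke-tests an installed JDK.
final class JdkHealthCheck {

    /** Returns true only if <jdkHome>/bin/java exists and "java -version" exits with 0. */
    static boolean isUsable(Path jdkHome) {
        boolean windows = System.getProperty("os.name").toLowerCase().contains("win");
        Path java = jdkHome.resolve("bin").resolve(windows ? "java.exe" : "java");
        if (!Files.isExecutable(java)) {
            return false; // binary missing or not executable
        }
        try {
            Process p = new ProcessBuilder(java.toString(), "-version")
                    .redirectErrorStream(true)                       // merge stderr into stdout
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD) // output is not needed
                    .start();
            // Assumed timeout; a hung launcher counts as unusable.
            if (!p.waitFor(30, TimeUnit.SECONDS)) {
                p.destroyForcibly();
                return false;
            }
            return p.exitValue() == 0;
        } catch (IOException e) {
            return false; // could not start the process at all
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A caller could run such a check right after installation and discard the directory when it returns false, so that a half-extracted JDK never becomes the cached one.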

Environment

  • jbang 0.138.0
  • Alpine 3.21 / Docker
  • No pre-installed JDK (jbang auto-provisions JDK 17 as bootstrap)
  • Tested on macOS (Docker Desktop)

Suggested Fix

  1. Use a lock file (e.g., ~/.jbang/cache/jdks/<version>.lock) to serialize JDK provisioning
  2. Each process should extract to a unique temp directory (e.g., <version>.tmp.<pid>)
  3. Use atomic rename to move the completed extraction to the final path
  4. If the final path already exists when the rename is attempted, delete the temp dir (another process won the race)
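
The four steps above can be sketched in Java roughly as follows. This is a minimal illustration under stated assumptions, not jbang's actual provisioning code: JdkProvisionSketch, Extractor, and provision are hypothetical names, and the real download and extraction logic is replaced by a callback. The temp directory gets a random suffix from Files.createTempDirectory rather than the pid, which serves the same purpose.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch, not jbang's implementation.
final class JdkProvisionSketch {

    /** Stand-in for "download and unpack the JDK archive into this directory". */
    @FunctionalInterface
    interface Extractor {
        void extractInto(Path targetDir) throws IOException;
    }

    static Path provision(Path jdksDir, String version, Extractor extractor) throws IOException {
        Files.createDirectories(jdksDir);
        Path finalDir = jdksDir.resolve(version);
        Path lockFile = jdksDir.resolve(version + ".lock");

        // Step 1: an exclusive file lock serializes provisioning across processes.
        try (FileChannel channel = FileChannel.open(lockFile,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileLock lock = channel.lock()) {

            // Re-check under the lock: another process may have finished already.
            if (Files.isDirectory(finalDir)) {
                return finalDir;
            }

            // Step 2: unique temp directory, created next to the final path so
            // that the rename below stays on one filesystem.
            Path tmpDir = Files.createTempDirectory(jdksDir, version + ".tmp.");
            try {
                extractor.extractInto(tmpDir);
                // Step 3: publish the finished extraction with an atomic rename.
                Files.move(tmpDir, finalDir, StandardCopyOption.ATOMIC_MOVE);
            } finally {
                // Step 4 and error path: remove the temp dir if it is still there.
                deleteRecursively(tmpDir);
            }
            return finalDir;
        }
    }

    private static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Deepest entries first, so directories are empty when deleted.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}
```

With the lock held, the existence check and the rename no longer race, so only the first process extracts and the others return the finished directory. FileChannel.lock() guards against other processes only; threads within one JVM would need additional in-process synchronization.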

Workaround

Pre-install a JDK before running parallel jbang scripts:

# Option 1: use jbang to install JDK once
jbang jdk install 21

# Option 2: install system JDK
apk add openjdk21-jdk
