fix(commands_file): read from git when missing from the working tree #59

Merged: skipi merged 3 commits into master from db/commands-file-sparse-ondemand on Jun 22, 2026

@DamjanBecirovic DamjanBecirovic commented Jun 15, 2026

Collaborator

What

When a repository is checked out with a sparse working tree (only part of the
tree materialized), a commands_file referenced outside the sparse paths is not
present on disk and spc compile fails to read it. This adds a fallback: when
the file is missing from the working tree, read its content from the checked-out
revision via git show HEAD:<path>. For partial (blobless) clones this fetches
the blob on demand, so commands_file references keep working regardless of the
sparse paths.
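
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in pkg/commands/file.go: the names `readWithGitFallback` and `gitShow` are hypothetical, and the sketch assumes `git` is on the PATH and that `relPath` is already relative to the repository root.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// gitShow returns the content of relPath at HEAD. It is a variable so that a
// test can stub it out. Hypothetical helper, not the name used in spc.
var gitShow = func(repoRoot, relPath string) ([]byte, error) {
	// git expects forward slashes in HEAD:<path>.
	cmd := exec.Command("git", "show", "HEAD:"+filepath.ToSlash(relPath))
	cmd.Dir = repoRoot
	return cmd.Output()
}

// readWithGitFallback prefers the working tree and asks git only when the
// file does not exist on disk; any other read error is returned unchanged.
func readWithGitFallback(repoRoot, relPath string) ([]byte, error) {
	data, err := os.ReadFile(filepath.Join(repoRoot, relPath))
	if err == nil {
		return data, nil
	}
	if !os.IsNotExist(err) {
		return nil, err
	}
	return gitShow(repoRoot, relPath)
}

func main() {
	// The path below is only an example; the result depends on where this runs.
	data, err := readWithGitFallback(".", "scripts/build.sh")
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Printf("read %d bytes\n", len(data))
}
```

In a blobless clone the `git show` call is what triggers the on-demand blob fetch; a full checkout never reaches it because the disk read succeeds first.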

Changes

  • pkg/commands/file.go: on os.IsNotExist, fall back to
    git show HEAD:<repo-relative-path>. Path resolution mirrors the existing
    absolute/relative rules and rejects paths that escape the repository root.
    Full-checkout behavior is unchanged (the file is read from disk as before).
  • pkg/commands/file_test.go: test that a commands_file removed from the
    working tree is still read from Git (absolute and relative paths).
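
One way the repo-relative path could be derived and checked is sketched below. The rules shown are an assumption made for illustration (a reference starting with `/` is taken relative to the repository root, anything else relative to the directory of the pipeline YAML); the function name `repoRelativePath` is hypothetical and the actual rules live in pkg/commands/file.go.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// repoRelativePath turns a commands_file reference into a slash-separated
// path relative to repoRoot, or returns an error if it points outside it.
// Assumed rules: "/x" is relative to repoRoot, "x" is relative to yamlDir.
// Both repoRoot and yamlDir are expected to be absolute paths.
func repoRelativePath(repoRoot, yamlDir, ref string) (string, error) {
	var full string
	if strings.HasPrefix(ref, "/") {
		full = filepath.Join(repoRoot, ref)
	} else {
		full = filepath.Join(yamlDir, ref)
	}
	rel, err := filepath.Rel(repoRoot, full)
	if err != nil {
		return "", err
	}
	// After cleaning, any path outside the root starts with "..".
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("path %q escapes the repository root", ref)
	}
	return filepath.ToSlash(rel), nil
}

func main() {
	refs := []string{"/scripts/build.sh", "../scripts/build.sh", "../../etc/passwd"}
	for _, ref := range refs {
		rel, err := repoRelativePath("/repo", "/repo/.semaphore", ref)
		fmt.Printf("%s -> %q, err=%v\n", ref, rel, err)
	}
}
```

Rejecting escaping paths before building the `HEAD:<path>` argument keeps the Git fallback from being asked for anything the on-disk read would not have been allowed to reach.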

Testing

  • go test ./pkg/commands/..., go vet, gofmt, and revive all clean.

This enables a sparse/partial checkout in the pipeline initialization job; it is
required so that pipelines using commands_file continue to compile under that checkout.

Related PRs

🤖 Generated with Claude Code

DamjanBecirovic and others added 2 commits June 12, 2026 13:59
When the repository is checked out with a sparse working tree (e.g. the pipeline
initialization job materializing only the pipeline directory), a commands_file
referenced outside the sparse paths is absent on disk. Fall back to reading its
content from the checked-out revision via `git show HEAD:<path>`, which fetches
the blob on demand for partial (blobless) clones. Resolution mirrors the
existing absolute/relative path rules and rejects paths escaping the repo root.
Full-checkout behavior is unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

revive's add-constant rule flagged the magic 0o755/0o644 perms in the new test.
Extract them into testDirPerm/testFilePerm constants.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Empty commit to fire the push webhook now that spc is connected as a
Semaphore project, so CI runs on this PR branch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@skipi skipi merged commit 658a7a8 into master Jun 22, 2026
1 check passed
skipi added a commit to semaphoreci/toolbox that referenced this pull request Jun 23, 2026
…#542)

## What

Adds an opt-in "optimized" checkout path to the `checkout` toolbox command, for
callers that only need the Git history (commits/trees) and a subset of the
working tree rather than the full repository. It is controlled by two env vars
and is fully backwards compatible: when neither is set, `checkout` behaves
exactly as before.

```
SEMAPHORE_GIT_PARTIAL_CLONE_FILTER   e.g. "blob:none"  -> git clone --filter=...
SEMAPHORE_GIT_SPARSE_CHECKOUT_PATHS  e.g. ".semaphore" -> cone-mode sparse checkout
```

When either is set, `checkout` performs a `--no-checkout` (optionally filtered)
clone and a cone-mode sparse checkout limited to the requested paths, handling
push/branch, pull-request, and tag refs. For large repositories this avoids
downloading blob content and materializing tens of thousands of files when only
a small subset is needed.
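
The sequence of Git invocations this implies can be modelled as below. The real implementation is the bash `libcheckout`; this Go sketch only illustrates the dispatch on the two env vars. Everything except the env var names and the Git flags is an assumption: the type and function names are hypothetical, multiple sparse paths are assumed to be whitespace-separated, the sparse-checkout steps are skipped when no paths are given, a plain clone stands in for the existing checkout paths, and the ref handling for pull requests and tags is omitted.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// checkoutOptions holds the two opt-in settings (hypothetical type).
type checkoutOptions struct {
	Filter      string   // e.g. "blob:none"
	SparsePaths []string // e.g. [".semaphore"]
}

// optionsFromEnv reads the env vars through getenv so tests can inject values.
func optionsFromEnv(getenv func(string) string) checkoutOptions {
	return checkoutOptions{
		Filter: getenv("SEMAPHORE_GIT_PARTIAL_CLONE_FILTER"),
		// Assumption: several paths are separated by whitespace.
		SparsePaths: strings.Fields(getenv("SEMAPHORE_GIT_SPARSE_CHECKOUT_PATHS")),
	}
}

// optimized reports whether either env var was set.
func (o checkoutOptions) optimized() bool {
	return o.Filter != "" || len(o.SparsePaths) > 0
}

// gitPlan returns the git commands to run, one argv slice per command.
func gitPlan(o checkoutOptions, url, dir, ref string) [][]string {
	if !o.optimized() {
		// Stand-in for the existing, unchanged checkout paths.
		return [][]string{
			{"git", "clone", url, dir},
			{"git", "-C", dir, "checkout", ref},
		}
	}
	clone := []string{"git", "clone", "--no-checkout"}
	if o.Filter != "" {
		clone = append(clone, "--filter="+o.Filter)
	}
	clone = append(clone, url, dir)
	plan := [][]string{clone}
	if len(o.SparsePaths) > 0 {
		plan = append(plan, []string{"git", "-C", dir, "sparse-checkout", "init", "--cone"})
		set := append([]string{"git", "-C", dir, "sparse-checkout", "set"}, o.SparsePaths...)
		plan = append(plan, set)
	}
	return append(plan, []string{"git", "-C", dir, "checkout", ref})
}

func main() {
	opts := optionsFromEnv(os.Getenv)
	for _, cmd := range gitPlan(opts, "https://example.com/repo.git", "repo", "main") {
		fmt.Println(strings.Join(cmd, " "))
	}
}
```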

## Changes

- `libcheckout`: new `checkout::optimized` / `checkout::configure_sparse` paths,
  dispatched from `checkout()` only when the new env vars are present. Existing
  `shallow` / `refbased` / `use-cache` paths are untouched.
- **Graceful fallback:** `git sparse-checkout` (cone mode) requires git >= 2.25.
  On older clients the optimized path detects the missing subcommand and falls
  back to a standard checkout, so the request degrades into a correct
  (non-optimized) clone instead of a full checkout from a blobless clone.
- `tests/libcheckout.bats`: capability-aware tests for push/branch, PR and tag,
  asserting the sparse working tree where supported and the full-tree fallback
  where not.
- CI: the macOS `xcode26 arm` block's prologue ran `brew upgrade ruby-build`,
  which newer Homebrew turns into an interactive `[y/n]` prompt that blocked on
  stdin and stopped the block (also broken on `master`). Pipe `yes` so it
  proceeds non-interactively.
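
The graceful-fallback decision can be illustrated with a small capability check. Note that the PR detects the missing `sparse-checkout` subcommand; the sketch below swaps in a simpler stand-in that compares the output of `git version` against the 2.25 threshold mentioned above. The function name `supportsConeSparseCheckout` is hypothetical, and unparseable output is treated as "not supported" so the caller falls back to the standard checkout.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// supportsConeSparseCheckout reports whether the given `git version` output
// names git 2.25 or newer. Anything it cannot parse counts as unsupported.
func supportsConeSparseCheckout(versionOutput string) bool {
	// Expected shape: "git version 2.39.3" (possibly followed by a vendor suffix).
	fields := strings.Fields(versionOutput)
	if len(fields) < 3 || fields[0] != "git" || fields[1] != "version" {
		return false
	}
	parts := strings.Split(fields[2], ".")
	if len(parts) < 2 {
		return false
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return false
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return false
	}
	return major > 2 || (major == 2 && minor >= 25)
}

func main() {
	for _, out := range []string{"git version 2.20.1", "git version 2.25.0"} {
		fmt.Printf("%q -> optimized path allowed: %v\n", out, supportsConeSparseCheckout(out))
	}
}
```

Probing for the subcommand, as the PR does, has the advantage of not depending on the version string format; the version comparison is shown only because it is easy to test in isolation.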

## Testing

- bats `libcheckout` suite green on Docker, Linux, Ubuntu 24.04, and Alpine 3.9
  (git 2.20 -> exercises the fallback), plus `shellcheck -s bash libcheckout`.

> Pairs with a consumer change (a pipeline initialization job that sets these env
> vars) and a related compiler change; the env-var interface is additive and safe
> to merge independently.

## Related PRs

- semaphoreio/semaphore#1063
- semaphoreci/spc#59

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Mikołaj Kutryj <mikolaj.kutryj@gmail.com>
skipi added a commit to semaphoreio/semaphore that referenced this pull request Jun 25, 2026
…ature flag (#1063)

## What

The pipeline initialization (compilation) job only needs the pipeline YAML and
the Git history (commits/trees, used by `change_in`) to compile a pipeline, not
the full repository working tree. For large repositories the full checkout
dominates the init-job runtime (cloning hundreds of MiB and materializing tens
of thousands of files on every run).

This makes the init job perform a **blobless partial clone + sparse checkout** of
the pipeline directory, gated by a per-organization feature flag and disabled
when pre-flight checks are present.

## How it works

The init job emits `SEMAPHORE_GIT_PARTIAL_CLONE_FILTER=blob:none` and
`SEMAPHORE_GIT_SPARSE_CHECKOUT_PATHS=<working_dir>` before `checkout`. The
optimization is applied only when **both** hold:

- there are **no pre-flight checks** (their custom commands may rely on the full
  working tree), and
- the `sparse_checkout_init_job` feature is enabled for the organization.

The feature check **fails closed**: a missing org id or an unreachable Feature
service keeps the standard full checkout. `change_in` keeps working because it
relies only on `git diff --name-only` / `--shortstat` / `merge-base`, which need
tree/commit objects (present in a blobless clone), not blob contents.
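
The gate described above can be modelled in a few lines. The actual change is in the Elixir `ppl` service; this Go sketch is only an illustration of the fail-closed logic. The names `useOptimizedCheckout`, `featureChecker`, `checkoutEnv`, and `errUnreachable` are hypothetical; only the feature name and the two env var assignments come from the description.

```go
package main

import (
	"errors"
	"fmt"
)

const sparseCheckoutFeature = "sparse_checkout_init_job"

// errUnreachable simulates a Feature service that cannot be contacted.
var errUnreachable = errors.New("feature service unreachable")

// featureChecker asks whether a feature is enabled for an organization.
type featureChecker func(orgID, feature string) (bool, error)

// useOptimizedCheckout applies the optimization only when there are no
// pre-flight checks and the feature is enabled. It fails closed: a missing
// org id or a failing feature lookup keeps the standard full checkout.
func useOptimizedCheckout(orgID string, hasPreflightChecks bool, enabled featureChecker) bool {
	if hasPreflightChecks || orgID == "" {
		return false
	}
	on, err := enabled(orgID, sparseCheckoutFeature)
	if err != nil {
		return false
	}
	return on
}

// checkoutEnv returns the env assignments to emit before `checkout`.
func checkoutEnv(optimized bool, workingDir string) []string {
	if !optimized {
		return nil
	}
	return []string{
		"SEMAPHORE_GIT_PARTIAL_CLONE_FILTER=blob:none",
		"SEMAPHORE_GIT_SPARSE_CHECKOUT_PATHS=" + workingDir,
	}
}

func main() {
	unreachable := func(string, string) (bool, error) { return false, errUnreachable }
	optimized := useOptimizedCheckout("org-1", false, unreachable)
	fmt.Println(optimized, checkoutEnv(optimized, ".semaphore"))
}
```

Returning `false` on every uncertain branch means the worst outcome of a misconfigured or unavailable flag is a slower init job, never a checkout that is missing files.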

## Changes

- `proto`: generate `InternalApi.Feature` stubs; add `INTERNAL_API_LOCAL_PATH`
  support to the proto Makefile so they can be regenerated from a local
  `internal_api` checkout.
- `ppl`: add the `feature_provider` dependency and a FeatureHub-backed provider +
  gRPC client (`INTERNAL_API_URL_FEATURE`); `Ppl.Features` exposes the
  fail-closed flag check; init `FeatureProvider` and a `:feature_cache` in the
  application supervisor.
- `ppl`: gate the optimized checkout in the compilation init-job command
  generation; auto-disable when pre-flight checks are configured.
- Build: move ppl's Docker build context to the repository root (matching the
  top-level services) so `feature_provider`, which lives outside `plumber/`, is
  included in the image.
- Tests: feature-flag gate + provider coverage, with a Feature gRPC mock; plus a
  small fix to make an unrelated integration test's env restore nil-safe.

> Note: `yaml_elixir` is pinned to `~> 1.3` via override because the umbrella's
> YAML validators rely on 1.x return semantics; this means feature_provider's
> YAML provider is not used here (the FeatureHub gRPC provider is). A follow-up
> is tracked to align on yaml_elixir 2.x for YAML-defined features.

## Dependencies

This relies on the toolbox `checkout` partial/sparse support and the spc
`commands_file` on-demand fetch being released (and the toolbox image rebuilt).

## Testing

- `Ppl: QA` and `Ppl: Integration QA` green; dev and prod image builds validated.

## Related PRs

- semaphoreci/toolbox#542
- semaphoreci/spc#59

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Mikołaj Kutryj <mikolaj.kutryj@gmail.com>