Skip to content

feat: per-skill content hash for change detection + differential autosync#205

Merged
raymondk merged 3 commits into
mainfrom
rk/versioning
Jun 3, 2026
Merged

feat: per-skill content hash for change detection + differential autosync#205
raymondk merged 3 commits into
mainfrom
rk/versioning

Conversation

@raymondk
Copy link
Copy Markdown
Collaborator

@raymondk raymondk commented Jun 3, 2026

Summary

Adds a per-skill content hash to the skills discovery index and rewrites the autosync-ic-skills sync script to use it for differential sync — so consumer projects re-download only the skills that actually changed instead of blindly re-mirroring everything every session.

Two commits:

  1. Publish hash in .well-known/skills/index.json — each skill entry gains "hash": "sha256:<hex>", an aggregate over the skill's served files (per-file sha256 + path, sorted). Content-based (not git-based), so it covers references//scripts/ files and is rename-sensitive. Purely additive: files stays a string array, so already-deployed consumers are unaffected.

  2. Differential sync in autosync-ic-skills — the script fetches index.json once, diffs each skill's hash against a {name: hash} manifest, and downloads only changed/new skills (pruning removed ones). No-op syncs are silent; on change it emits a SessionStart JSON object (systemMessage + additionalContext) so the summary surfaces in the UI and Claude's context. The script is now shipped as an attached file and fetched via curl for byte-exact delivery rather than transcribed from an inline block.

Behavior / robustness

  • Hashless fallback — if the server publishes no hash for a skill, the script re-downloads it every run (records an empty hash that never matches), so it stays correct against an un-upgraded server.
  • Failure-safe — keeps cached skills on network/jq failure; on a partial download keeps the old hash so the next run retries.
  • Legacy migration — transparently upgrades the old bare-array manifest format.

Testing

getSkillHash verified deterministic across rebuilds and sensitive to subdir-file changes. Sync script exercised against a local server serving the hash-bearing build:

Sync script behavior matrix
Run Expected Actual
Cold all added, real hashes stored 23 added, hashes sha256:f3ee…
Warm silent no output
One hash changed 1 updated 0 added, 1 updated, 0 removed (22 unchanged)
Legacy array manifest migrate + report 23 updated, manifest now object
Hashless index (current live) always re-download 23 updated (fallback)

npm run build clean; node scripts/check-project.js passes (pre-existing missing-eval warnings only).

Note

Step 1 of the skill fetches the script from the live site, which won't serve the new scripts/sync-ic-skills.sh path or the hash field until this is merged and deployed.

🤖 Generated with Claude Code

Add a `hash` field ("sha256:<hex>") to each skill entry so consumers can
detect which skills changed from a single index fetch, without downloading
and hashing every file. The hash is computed over the skill's served files
(path + per-file sha256, sorted) — order-independent and rename-sensitive.

`files` stays a string array; the field is purely additive, so already
deployed sync scripts are unaffected. Includes the design doc documenting
the hash contract.

This is the publishing half; the differential-sync consumer upgrade in
autosync-ic-skills is a follow-up.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@raymondk raymondk requested review from a team and JoshDFN as code owners June 3, 2026 15:26
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 3, 2026

Skill Validation Report

Validating skill: /home/runner/work/icskills/icskills/skills/autosync-ic-skills

Structure

  • Pass: SKILL.md found
  • Pass: all files in scripts/ are referenced

Frontmatter

  • Pass: name: "autosync-ic-skills" (valid)
  • Pass: description: (644 chars)
  • Pass: license: "Apache-2.0"
  • Pass: metadata: (2 entries)

Markdown

  • Pass: no unclosed code fences found

Tokens

File Tokens
SKILL.md body 1,579
Total 1,579

Content Analysis

Metric Value
Word count 1,067
Code block ratio 0.05
Imperative ratio 0.14
Information density 0.10
Instruction specificity 0.90
Sections 8
List items 24
Code blocks 4

Contamination Analysis

Metric Value
Contamination level low
Contamination score 0.01
Primary language category shell
Scope breadth 2
  • Warning: Language mismatch: config (1 category differ from primary)

Result: passed

Project Checks


WARNINGS (1):
  ⚠ autosync-ic-skills/SKILL.md: missing evaluations/autosync-ic-skills.json — see CONTRIBUTING.md for evaluation guidance

✓ Project checks passed for 1 skills (1 warnings)

Replace the blind full-mirror with a differential sync. The script fetches
index.json once, compares each skill's published `hash` against a
{name: hash} manifest (.ic-managed.json), and re-downloads only changed or
new skills — pruning removed ones. Unchanged skills are skipped with no
per-file downloads, and a no-op sync is silent.

Falls back to re-downloading any skill the server publishes no `hash` for,
keeps cached skills on network/jq failure, retains the old hash on a failed
download so the next run retries, and transparently migrates the legacy
bare-array manifest format.

The script is now shipped as an attached file
(scripts/sync-ic-skills.sh) and the installer fetches it via curl for
byte-exact delivery instead of transcribing an inline block. On change it
emits a SessionStart JSON object (systemMessage + additionalContext) so the
summary surfaces in the Claude Code UI and Claude's context.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Member

@marc0olo marc0olo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hash mechanism and differential sync logic are solid — good addition. Two things worth addressing before or after merge:

reloadSkills: true missing from JSON output

The script emits a SessionStart JSON object but doesn't include reloadSkills: true. Without it, skills downloaded during the first-ever sync (or when a new skill is added) won't be triggerable until the next session start — even though the script just fetched them. Claude Code re-scans skill directories only before SessionStart hooks run, so anything installed by the hook is invisible to the current session unless you request a re-scan.

reloadSkills: true triggers that re-scan immediately after the hook completes. It doesn't load skill content into context — it just registers newly installed skills so they can be triggered in the same session. The docs even show this exact pattern (sync a skill repo, return reloadSkills: true) as the canonical example.

Fix is a one-liner in sync-ic-skills.sh:

jq -n --arg msg "$summary" '{
  systemMessage: $msg,
  reloadSkills: true,
  hookSpecificOutput: {
    hookEventName: "SessionStart",
    additionalContext: $msg
  }
}'

curl-to-fetch vs inline script

Moving the script out of the SKILL.md into a remote fetch avoids transcription errors, which is a real concern for 130-line bash scripts. The tradeoff worth being aware of: an agent following this skill can no longer audit what it's installing before writing it to disk, and setup fails hard if the URL is unavailable at install time (rather than falling back to cached content like the sync itself does).

Not a blocker — the approach works and the URL will be stable — just flagging the tradeoff so it's an informed design decision.

@raymondk raymondk merged commit 49e32e5 into main Jun 3, 2026
6 checks passed
@raymondk raymondk deleted the rk/versioning branch June 3, 2026 17:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants