Rebuild Solid skill + add AGENTS.md template for AI assistants #97
Merged
The skill is rewritten to lead with pubspec.yaml detection of solid_annotations / solid_generator as the authoritative signal, so it applies even when the user never says "Solid".

Two new capabilities:
- Decision shortcuts table for the most common Flutter→Solid moves (adding widgets, fetch-on-change, lib/-edit diagnoses, etc.).
- references/third-party-packages.md catalogue for redirecting go_router / freezed / riverpod / drift / json_serializable file creation from lib/ to source/, the case where another AI follows package READMEs literally.

scripts/verify.sh now chains dart fix --apply after build_runner so generated lib/ output is lint-clean (adds const, removes unused imports, prefers relative imports). dart fix failure is non-fatal.

Adds evals/ with four seed prompts covering counter creation, reactive query, lib-edit migration, and go_router redirect, plus trigger_queries.json (20 should/should-not queries) for running the skill-creator description optimization loop later.

Docs: new "Clean up after generation" subsection in guides/getting-started.mdx with dev + CI guidance for dart fix.
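The chaining described for scripts/verify.sh could look roughly like the sketch below. It is not the actual script: the function name `verify` and the messages are assumptions, and only the two `dart` commands are taken from this PR. The point it illustrates is that a build_runner failure stops the run, while a `dart fix` failure only warns.

```shell
# Hypothetical sketch of the verify step; not the real scripts/verify.sh.
verify() {
  # Step 1: regenerate lib/ from source/. A failure here is fatal.
  if ! dart run build_runner build --delete-conflicting-outputs; then
    echo "error: build_runner failed" >&2
    return 1
  fi

  # Step 2: lint cleanup of the generated output. A failure here is
  # reported but does not fail the overall run.
  if ! dart fix --apply; then
    echo "warning: dart fix --apply failed; continuing" >&2
  fi

  return 0
}
```

Calling `verify` from the app root would then run both steps in order.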
skill-creator's run_loop.py produces per-run workspaces (results.json, report.html, per-iteration logs) alongside the skill. These are ephemeral evaluation outputs, not artefacts to track in the repo.

Ran 5 iterations of run_loop against skills/solid (sonnet-4.5, 20 trigger queries split 12 train / 8 test, 3 runs per query). All five iterations tied on test pass count (4/8). The optimizer picked the original hand-written description as the best by held-out score. No SKILL.md change needed; the description is already at the local optimum that the loop could find.
Ran nine description-optimization passes through skill-creator's
run_loop.py against 20 trigger queries (12 train / 8 test split,
held-out scoring). The clean ceiling on this query set is 4/8 due to
Claude's tendency to handle simple Flutter intents without consulting
a skill (the agentskills.io "undertrigger on one-step queries" mode).
The PRIORITY framing breaks past the noise floor by reframing the
skill as a precondition for safe Dart edits in Solid projects:
- v1 (original "use whenever working on…", sonnet): 4/8
- v3 ("encodes knowledge Claude can't infer alone", opus): 5/8 noisy
- v4 (REQUIRED + trigger-phrase listing, opus): 5/8 noisy
- v5-v6 (hybrid / bulleted variants): 4/8 (dilution)
- v7 (v3 verified with 5 runs/query): 4/8 (variance flushed out)
- v8 (PRIORITY — read first, opus, 5 runs): 5/8 reliably ← chosen
- v9 (v8 + annotation-contract emphasis): regressed to 4/8
The v8 description catches the "scaffold a new page that fetches…"
query at 3/5 trigger rate, which no other framing reached. Final
sentence retains the @SolidQuery-no-params / @SolidEnvironment-late
hooks that v3 used to catch @SolidEnvironment, giving the
description coverage on two distinct query types at once.
Per-iteration logs in solid-workspace/ (gitignored). Full
run_loop_v8.log was the best at 25/40 runs correct, recall 25%,
precision 100%.
After 10 description-optimization iterations against the skill triggering loop, the empirical ceiling on the test query set is ~5/8 due to Claude's documented "undertrigger on simple queries" pattern (agentskills.io). Solo skill triggering can't reliably catch routine-looking Flutter intents like "scaffold a page that fetches /api/profile" even when the project's pubspec.yaml clearly declares solid_annotations.

AGENTS.md sidesteps the trigger problem entirely. Most modern AI coding tools (Claude Code, Cursor, Codex, GitHub Copilot, Amp, OpenHands, …) auto-load AGENTS.md at session start for every interaction, with no trigger gate. Users who copy this template to their Solid app root get the inverted source/-vs-lib/ rule applied to every prompt, not just ones that mention "Solid" or annotation names.

The skill remains the on-demand reference (full annotation contract, references/, scripts, evals). AGENTS.md is the always-loaded baseline. Two-layer setup: AGENTS.md for the rule, skill for the depth.

Docs updated to recommend AGENTS.md installation as Step 1 of AI assistant setup, with a curl one-liner and a CLAUDE.md symlink fallback.
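As a rough illustration of the install step, the sketch below copies a local copy of the template into an app root and adds the CLAUDE.md symlink fallback. It is an assumption-laden sketch, not the documented curl one-liner: the function names `is_solid_project` and `install_agents_md` are hypothetical, and the pubspec.yaml guard is an added safety check rather than something the docs prescribe.

```shell
# Hypothetical helpers; names and arguments are illustrative only.

# Succeeds if pubspec.yaml in the given app root declares a Solid package.
# $1: app root directory
is_solid_project() {
  [ -f "$1/pubspec.yaml" ] &&
    grep -Eq '^[[:space:]]*(solid_annotations|solid_generator)[[:space:]]*:' \
      "$1/pubspec.yaml"
}

# Copies the template to the app root and adds a CLAUDE.md symlink fallback.
# $1: path to a local copy of the AGENTS.md template
# $2: app root directory
install_agents_md() {
  if ! is_solid_project "$2"; then
    echo "no solid_annotations/solid_generator in pubspec.yaml; skipping" >&2
    return 1
  fi

  cp "$1" "$2/AGENTS.md" || return 1

  # Only create the symlink when no CLAUDE.md exists yet, so an existing
  # file is never overwritten.
  if [ ! -e "$2/CLAUDE.md" ] && [ ! -L "$2/CLAUDE.md" ]; then
    (cd "$2" && ln -s AGENTS.md CLAUDE.md)
  fi
}
```

With a checkout of this repo, the template argument would presumably be skills/solid/assets/AGENTS.md, for example `install_agents_md skills/solid/assets/AGENTS.md path/to/app`.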
- Restructured getting-started AI assistant setup into proper Steps component for clarity
- Updated example code to use relative imports instead of package imports (source/ pattern)
- Added analyzer rule to allow non-package imports
- Removed unnecessary const from MaterialApp in source example
Deploying solid with
| Latest commit: | 0ca8443 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://1fe445b8.solid-4m0.pages.dev |
| Branch Preview URL: | https://skill-creator-rebuild.solid-4m0.pages.dev |
The fixture simulates a user's mis-edit of generated lib/ output. CI runs `dart format --set-exit-if-changed .` over the whole tree including eval/ fixtures, so the file needs to satisfy the formatter even though its semantic role is to look like build_runner output.
`dart analyze --fatal-infos` from the repo root was flagging `include: package:flutter_lints/flutter.yaml` in the eval fixture analysis_options.yaml files — flutter_lints isn't part of the workspace so the URI couldn't resolve, and --fatal-infos elevates the warning to a failure. The fixtures are template project skeletons used by skill-creator's eval runner, not real source to be analyzed, so the right fix is to exclude them at the root. Verified locally with the full CI sequence (flutter pub get, dart format --set-exit-if-changed ., dart analyze --fatal-infos, dart analyze packages/solid_generator/test/golden/outputs/, dart analyze packages/solid_annotations, dart test packages/solid_generator/, flutter test packages/integration_tests/) — all 7 steps green.
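The seven-step local sequence listed above can be wrapped in a single function that stops at the first failing step. This is only a sketch: the function name `ci_check` is hypothetical, and the commands are copied from the list in this commit message rather than from the actual CI workflow file.

```shell
# Hypothetical wrapper around the local CI sequence; run from the repo root.
# Each step runs only if the previous one succeeded.
ci_check() {
  flutter pub get &&
    dart format --set-exit-if-changed . &&
    dart analyze --fatal-infos &&
    dart analyze packages/solid_generator/test/golden/outputs/ &&
    dart analyze packages/solid_annotations &&
    dart test packages/solid_generator/ &&
    flutter test packages/integration_tests/
}
```

The function's exit status is that of the first failing command, or zero when all seven steps pass.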
The eval-1/2/4 fixtures were byte-identical clones of eval-1's blank starter. The prompts pass either way, but identical fixtures hide what each scenario actually tests. Differentiate:
- eval-2 (Fetch posts when userId changes): add source/posts_page.dart as a placeholder widget with a user dropdown and a 'No user selected' body. The agent is expected to extend this (add @SolidState userId and @SolidQuery, replace the placeholder body with .when(...)). Assertion updated from 'new or modified widget' to 'extends posts_page.dart'.
- eval-4 (Add go_router): add source/home_page.dart and source/settings_page.dart as existing widgets. The agent should wire them into a GoRouter, not recreate them. New assertion: 'the agent did NOT recreate or overwrite' the pre-existing pages, isolating the test from 'did the agent also create the pages'.
- eval-3 (lib/ edits keep disappearing): add lib/main.dart alongside the existing lib/counter.dart so the simulated 'generated lib/' tree matches what build_runner would actually emit from source/.

Verified locally with the full CI sequence (flutter pub get, dart format --set-exit-if-changed ., dart analyze --fatal-infos, dart analyze packages/solid_generator/test/golden/outputs/, dart analyze packages/solid_annotations, dart test packages/solid_generator/, flutter test packages/integration_tests/). All 7 steps green.
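A check along the following lines could catch byte-identical fixture clones like the ones described above before they hide what a scenario tests. It is a sketch under assumptions: the function name `find_identical_fixtures` is hypothetical, and the fixture directories are whatever paths the caller passes in, since the repo's actual fixture layout is not shown here.

```shell
# Hypothetical check: prints every pair of directories whose contents are
# byte-identical, and returns non-zero if any such pair exists.
find_identical_fixtures() {
  found=0
  # Take the first directory, compare it with all remaining ones, repeat.
  while [ "$#" -gt 1 ]; do
    a="$1"
    shift
    for b in "$@"; do
      # diff -rq exits 0 only when the two trees have identical contents.
      if diff -rq "$a" "$b" >/dev/null 2>&1; then
        echo "identical: $a $b"
        found=1
      fi
    done
  done
  # Exit status 0 means every fixture differs from every other one.
  [ "$found" -eq 0 ]
}
```

It would be called with the fixture directories as arguments, and a non-zero status signals that at least two fixtures still need differentiating.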
Summary
Rebuild the Solid skill from scratch using Anthropic's skill-creator, and ship an `AGENTS.md` template for the always-loaded layer.

New skill structure (`skills/solid/`):
- Detection (`solid_annotations` / `solid_generator` in `pubspec.yaml` is the authoritative signal) so the skill applies even when the user never says "Solid"
- Third-party package catalogue (`references/third-party-packages.md`): when go_router/freezed/riverpod/drift/json_serializable docs say `lib/`, agents substitute `source/`
- `scripts/verify.sh` chains `dart run build_runner build --delete-conflicting-outputs` + `dart fix --apply`
- Evals (`evals/`) for `run_loop`

Always-loaded layer (`skills/solid/assets/AGENTS.md`):
- `source/` vs `lib/` rule applies to every prompt regardless of phrasing

Docs (`docs/guides/getting-started.mdx`):
- `## AI assistant setup` section (matching the Installation `` UI)

Why two layers
Description-optimization loops capped at ~5/8 trigger rate on the test query set due to Claude's documented "undertrigger on simple queries" pattern (per agentskills.io). `AGENTS.md` bypasses the trigger problem entirely: it's loaded into context on every interaction. The skill remains the on-demand reference layer for depth (annotation contract, scripts, troubleshooting).

The old skill content is preserved in GitHub history (and was parked at `/tmp/solid-skill-reference/` during the rebuild for reference).
Test plan