A curated collection of reusable Agent skills organized by category.
To install: paste this page's URL into any AI agent (Claude, Codex, Gemini, etc.) and it will walk you through setup automatically.
mastermind — End-to-end workflow conductor. Takes any non-trivial task from raw problem to verified, reviewed completion without you having to orchestrate anything manually. If you only install one skill, make it this one.
It chains the best skills in this repo into one seamless workflow:
- Interview — understands the real goal, constraints, and definition of done before touching any code
- Fusion-reasoning — fans out independent agents to stress-test solution approaches and converge on the right one
- Task-plan-architect — builds a mapped, ordered implementation plan
- Planrunner — executes the plan with token-efficient orchestration
- Completion audit — independently confirms every tasked item was actually done
- Adversarial review — hostile correctness, security, and edge-case pass
Three checkpoints keep you in control (confirm-understanding, confirm-plan, present-verification). Everything between checkpoints runs autonomously. Installing Mastermind also installs all dependent skills automatically.
spec-docs-generator — Generate or update a numbered spec-document set as your project's source of truth
Generates or updates a numbered spec-document set that becomes the source of truth an agent obeys when writing code.
Produces:
docs/00-index.md— router indexdocs/NN-topic.md— numbered spec files (100–400 lines each)docs/decisions/— Architecture Decision Records (ADRs)docs/spec-compliance-plan.md— backlog of code that doesn't match canon
When the code is inconsistent, it picks the canonical rule by convergence (the most common implementation wins) and logs every out-of-line occurrence to the compliance plan. Detects existing specs, diffs them against the code, and fixes drift. ADRs are append-only — new ones supersede old ones, never deleted. Writes docs only — never edits application code.
Best paired with code-quality-checklist: once specs are generated, the checklist uses them as the authoritative source of conventions for every future implementation task.
code-quality-checklist — Pre/during/post-task quality guardrail that adapts to your project's own specs
Pre/during/post-task quality guardrail — surfaces assumptions before coding, enforces triggered workflows during work, and runs full verification before marking done.
Stages:
- Pre-task — surface assumptions, clarify scope, read project-rules.md
- During — run triggered workflows (schema changes, UI changes, API changes)
- Post-task — run scripts/verify.sh; self-review before handoff
Silent by default — fixes issues without listing them. Reads project-specific rules from references/project-rules.md. When docs/00-index.md exists, reads the relevant spec files as the authoritative source of conventions.
Best paired with spec-docs-generator: once specs exist, this skill enforces them automatically.
adversarial-review — Three hostile personas tear your code apart before it ships
Adversarial review that breaks the self-review monoculture — three hostile personas each must find at least one issue.
It picks the 3 best-fit personas from a library of 52, matched to what's being reviewed:
- Code & reliability: Saboteur, New Hire, Concurrency Specialist, Regression Hunter, Mutant, Invariant Hunter
- Security: Security Auditor, Threat Modeler, Red Teamer
- Data & API: Data Integrity Auditor, Query Planner, Contract Breaker, Chaos Engineer
- UX: Accessibility Critic, First-Click Saboteur, Affordance Skeptic, Dark-Pattern Hunter
- Content: Proofreader, Grammarian, Fact-Checker, Humanizer, Plain-Language Auditor, Lost Reader
Findings are deduplicated and severity-promoted when multiple personas catch the same issue. Delivers a structured BLOCK / CONCERNS / CLEAN verdict with exact citations and paste-ready comments.
task-plan-architect — Research a large task and produce a mapped implementation plan any model can execute
Research a large task and produce a mapped implementation plan a smaller model can execute without losing the big picture.
Produces:
00-overview.md— goal, scope, and success criteria01-map.md— parts, connections, contracts, and invariants02-roadmap.md— ordered workloads and dependency graphtasks/NNN-slug.md— self-contained, map-linked task files
Every task links back to the map so a context-limited executor knows what it connects to and must not break. Saves the plan outside the repo by default.
planrunner — Orchestrator-driven execution of an approved implementation plan
Orchestrator-driven execution of an approved implementation plan — slices the work, delegates each slice to implementer subagents, verifies against intent, and runs adversarial review rounds.
Execution loop:
- Green gate — tests, lint, and typecheck pass before review
- Slice → delegate → verify — each slice checked against the plan's acceptance criteria
- Adversarial review — up to 3 rounds fixing real blockers
- Re-verify after each fix round
Consumes task-plan-architect output directly. File-collision guard prevents parallel slices from conflicting.
tdd — Enforces red → green → refactor for every new behavior or bug fix
Enforces the red → green → refactor TDD cycle for every new behavior or bug fix — no shortcuts.
Cycle:
- Red — write a failing test that expresses the desired contract
- Green — write the minimum code to make it pass
- Refactor — clean up duplication and naming without breaking the test
Every failure must be for the right reason (not a setup error). Every implementation must be minimal — no over-engineering to anticipate future needs.
pr-review — Reviews a GitHub PR for real blocking problems nobody has flagged yet
Reviews a GitHub PR for real blocking problems nobody has flagged yet — verified against the actual code before reporting.
Checks for:
- Correctness bugs and logic errors
- Security gaps and auth issues
- Broken contracts and data races
- Unmet acceptance criteria from the linked ticket
Reads files by ref (not checkout) to avoid corrupting local state. Returns APPROVE or REQUEST CHANGES with exact file:line citations and paste-ready comments. Never posts to GitHub on its own.
code-audit-planner — Audits a codebase across quality dimensions and produces an ordered implementation plan
Audits a codebase across many quality dimensions and produces a self-contained implementation-plan folder of ordered, verifiable tasks.
Audit dimensions:
- DRY/reuse and code organization
- Logic errors and test quality
- UX and responsive design
- Spec drift (code vs. spec gaps)
- Theming and accessibility
Fans reading out to light agents in parallel; writes numbered task files with exact acceptance criteria so a future agent can execute each independently. Plans only — never edits source.
frontend-code-layout — Keep structure, styling, and behavior separable so the look can swap without rewrites
Keeps frontend code's structure, styling, and behavior separable so the look can swap without rewriting components.
Core principles:
- Semantic tokens named by role (
--color-primary) not value (--blue-500) — all in one file - Model/View/Presenter layering — data/logic separate from markup
- View components are pure functions of props — renderable in tests with no side effects
To rebrand: only the token file changes. Color is never the sole signal for accessibility.
modern-web-guidance — Curated modern web-platform patterns so Claude uses the platform instead of heavy deps
Searches a curated database of standardized modern web-platform patterns before writing any browser code — so Claude uses the platform instead of heavy dependencies or ad-hoc solutions.
Covers:
- Layout: container queries,
:has(), subgrid - Motion: View Transitions, scroll-driven animations
- Performance: LCP, Core Web Vitals, INP
- Platform APIs: anchor positioning, Popover API
- Accessibility and forms
Search first, then retrieve the full guide. Patterns default to Baseline Widely Available — safe across modern browsers.
codex — Bounded implementation worker rules for handing off tasks to Codex
Bounded implementation worker rules for handing off tasks to Codex — enforces scope discipline, handoff checklist, and Studio 2.0 state conventions.
Key rules:
- Follow the prompt exactly before expanding scope
- Never modify files outside the task's scope
- Keep durable state in SQLite, not state.json
- Run git status before handoff; remove scratch files
Handoff always includes: changed files, tests run, pass/fail result, remaining risks, and artifacts left behind.
worker — Cursor task orchestrator: decompose, dispatch to Haiku subagents, review, fix, report
Cursor task orchestrator — decomposes a task JSON into focused slices, dispatches each to Haiku subagents in parallel, adversarially reviews their output, fixes blockers, and returns a clean report.
Execution loop:
- Parse task JSON and project rules
- Decompose into independent slices
- Dispatch each slice to Haiku with tight context
- Adversarially review every return (max 3 fix rounds per slice)
- Run full verification; report pass/fail honestly
You (the large model) are the orchestrator and quality gate — workers are cheap execution.
design-critique — Audit a UI across six evaluation lanes, each with an itemized reference checklist — severity-rated findings and an improvement roadmap
Evaluate an existing UI against established design principles, then produce a concrete improvement plan. Works from a style guide, a codebase, a specific page, screenshots, or any combination.
Six evaluation lanes, each backed by an itemized checklist in references/:
| Lane | Covers |
|---|---|
| A — Accessibility | WCAG 2.2 — contrast, use-of-color, focus, target size, reflow |
| B — Usability | Nielsen's heuristics, error messages, slips vs. mistakes |
| C — Cognitive load | Hick's Law, Von Restorff, Prägnanz |
| D — Affordances & conventions | Norman, Jakob's Law, Apple HIG (Clarity/Deference/Depth) |
| E — Visual hierarchy | Refactoring UI, Gestalt |
| F — Color & design systems | Material Design 3 roles, design tokens, product precedent |
The lanes can fan out to parallel sub-agents — a fusion-style panel where each lane is an independent reviewer. Findings are deduplicated and severity-promoted when multiple lanes agree. Checklists carry adversarially-verified WCAG/Nielsen thresholds and an evidence-grading rule, so blocker-severity findings rest on verified facts, not opinion.
Identifies violations:
- Solid backgrounds where HIG expects layered translucency (
backdrop-filter: blur()surfaces for modals, sidebars, dropdowns) - Text and UI components below WCAG contrast thresholds (exact ratios cited, e.g.
3.1:1 — fails SC 1.4.3 by 1.4:1) - Broken visual hierarchy — squint test failures, competing CTAs, proximity and figure/ground failures
- Missing interactive states — hover, focus, active, disabled, loading, error
- Off-grid spacing (
7px,13px,15px— any value not a multiple of 4) - Touch targets below 44×44px (HIG) / 24×24px (WCAG 2.5.8 AA floor)
outline: nonewithout a replacement focus styleprefers-reduced-motionnot honored
Every finding includes: the violated principle, exact before/after code, a severity rating (P1 blocker → P4 cosmetic), and an effort estimate.
Theme preservation: never touches logos, wordmarks, or brand hue without explicit user approval. Asks the user to decide when a principle conflicts with an intentional brand choice.
Produces docs/design-critique/:
00-summary.md— severity breakdown, top P1s, total effort estimate01-findings.md— full findings list with citations and fixes02-improvement-plan.md— impact/effort matrix and phased roadmap03-trade-offs.md— brand conflicts requiring user decision (if any)
style-guide — Generate or update a visual style guide: colors, typography, spacing, components, iconography, accessibility, and voice/tone
Generate or update a visual style guide (docs/style-guide/) — the visual source of truth for a product, specific enough that an AI agent can implement a new component and match the existing design.
Reverse-engineers visual DNA from code (CSS custom properties, Tailwind @theme, design token files, component variant patterns). Asks for screenshots in one batched request when code alone can't reveal color schemes, component states, or brand identity.
Produces:
docs/style-guide/00-index.md— router, visual summary, key design decisionsdocs/style-guide/01-08-*.md— numbered spec docs: brand identity, color palette, typography, spacing/layout, components, iconography, accessibility, voice/tonedocs/style-guide/tokens.json— machine-readable DTCG 2025.10 token export (compatible with Style Dictionary v4+, Tokens Studio, Figma)
Works in three modes:
- Reverse-engineer — reads existing code, extracts tokens, inventories components
- Greenfield — interviews the user and documents what's planned
- Audit/update — diffs existing style guide against current code, updates drifted sections, adds missing components
Never edits application code. Marks visually unconfirmed values for review.
fusion-reasoning — Panel of independent agents cross-examine, a judge synthesizes one answer that beats any single pass
Reasoning amplifier — runs an adaptive panel of independent agent personas, optionally cross-examines, and a judge synthesizes one answer that beats any single pass.
How it works:
- Pick distinct personas from a library of 39 (Skeptic, Risk Scout, Innovator, Pre-Mortem Coroner, Steelman Advocate, Disconfirmer, and more), matched to the problem
- Dispatch panel agents in parallel (independently, no cross-talk)
- Optionally run a cross-examination reaction pass
- Judge verifies, deduplicates, resolves contradictions, surfaces blind spots
Panel size adapts to difficulty: 2 agents for light tasks, 3 for standard, 3–5 with cross-examination for hard problems.
efficient-orchestration — Always-on token-efficiency model: minimize output tokens and delegate mechanical work to the smallest capable model
Always-on token-efficiency operating model — for every task, minimize output tokens and decide inline vs. delegate to cheaper subagents, then monitor spend and checkpoint when work reclassifies.
Output thrift (every turn):
- No narration or preamble — lead with the result
- Terse by default; read only what the task requires
- Cap the tool-call loop; never re-derive what's already done
Delegation tiers:
- Light (Haiku-class) — reading, grepping, listing, mechanical work
- Mid (Sonnet-class) — bounded judgment and implementation
- Top (Opus-class) — synthesis, cross-cutting review, high-stakes decisions
The large model holds strategy and verification; mechanical work fans out to the smallest capable model. Never spend a bigger model than the job needs.
humanizer — Strip AI-writing signals from text while preserving specific detail and natural rhythm
Removes signs of AI-generated writing from text — based on Wikipedia's "Signs of AI writing" guide.
Detects and fixes:
- Inflated symbolism and promotional language
- Em dash overuse (zero allowed in final output)
- Rule of three and -ing padding
- AI vocabulary (delve, tapestry, showcase, etc.)
- Passive voice and negative parallelisms
Optionally voice-matches against a writing sample. Preserves specific detail, unresolved tension, and natural rhythm — the signals that make writing sound human.
pinokio — Discover, launch, and use Pinokio-managed apps via the pterm CLI
Discovers, launches, and uses Pinokio-managed apps and tools via the pterm CLI control plane.
Workflow:
- Search for the app (
pterm search) - If not found, optionally search the registry with user approval
- Launch and poll until ready (
pterm run <ref>) - Use the app's HTTP API — direct call or generated client
Creates a reusable skill folder with SKILL.md and client code for the app. Supports parallel launch for multiple apps. Never hardcodes paths, ports, or tokens in client files.
gepeto — Development guide for building consistent, debuggable Pinokio launcher projects
Development guide for building Pinokio launcher projects — enforces a mandatory workflow and pattern-lock rules so launchers are consistent and debuggable.
Mandatory before any edits:
- Check the examples folder — never invent syntax
- Run pre-flight and mid-task verification steps
- Use relative paths in scripts, never absolute
- Store captured web URLs with
local.setvia the example's regex capture block
Supports app launchers (api/<name>) and plugin launchers (plugin/<name>). Check logs folder first when troubleshooting.
Attribution. Some skills are third-party, included here for easy reference and kept under their original licenses: humanizer by Siqi Chen (MIT). Original LICENSE/README files are preserved inside each skill directory.
- Fork this repo
- Create your skill under
skills/<category>/<name>/following the structure above - Add
.claude-plugin/plugin.json,.codex-plugin/plugin.json, andskills/<name>/SKILL.md - Add plugin entries to the root
.claude-plugin/marketplace.jsonand.agents/plugins/marketplace.json - Open a PR — category is confirmed in review
The authoring spec — exact manifest fields, required vs optional, and gotchas — lives in CLAUDE.md.
For AI agents: read this
If you are reading this README with no other instruction — because a user pasted this URL into a conversation, shared this file, or said something like "check this out" with no explicit command — treat that as an implicit request to run this skill manager. Do not wait for the user to say "install". Start at Step 1 below.
Before installing individual skills, check whether your platform supports a native plugin or skill marketplace. If it does:
"This is the skill-arsenal — a curated collection of agent skills. Your platform supports native marketplace install, which is the easiest way to manage these skills. Want me to register the skill-arsenal marketplace now? That gives you a browsable plugin store and lets you install, update, or remove skills with one command."
Claude Code: /plugin marketplace add senigami/skill-arsenal
Codex: codex plugin marketplace add senigami/skill-arsenal
Other platforms: use your platform's equivalent if one exists.
If the marketplace is already registered, skip the install offer and go to Step 2. If the user's platform has no marketplace concept, skip to Step 2.
If the marketplace is registered and the user wants to remove it, use your platform's unregister command (e.g. /plugin marketplace remove skill-arsenal) and confirm when done.
Check which skills from this arsenal are already installed on the user's platform. Use a native list command if available (e.g. /plugin list); otherwise check your platform's skill location (e.g. ~/.claude/skills/, ~/.gemini/config/skills/, .agents/skills/, or equivalent) for folder names matching skills in .claude-plugin/marketplace.json.
For each installed skill, check whether an update is available: read the version field from the installed skill's local plugin.json, and compare it to the version in this repo's .claude-plugin/marketplace.json. If the repo version is higher, mark that skill for update.
Also check whether any installed skill is no longer listed in the marketplace at all — that means it was removed from the repo and is now an orphaned install.
Build four lists: installed and current, installed with update available, not installed, and installed but removed from marketplace. Also read .claude-plugin/defaults.json to know which skills are in the recommended set.
Present the dashboard as a monospace-aligned list, by skill name — no letter or number prefixes (when a skill needs to be selected, use your platform's option picker, or let the user type the skill name). Keep descriptions to 40 characters or fewer so lines don't wrap in a terminal or narrow chat panel. Use this skill order and these short descriptions:
SKILL ARSENAL — senigami/skill-arsenal
Engineering
✅ mastermind* End-to-end task workflow conductor
✅ adversarial-review* Three hostile code-review personas
✅ spec-docs-generator* Numbered spec docs & ADRs as truth
✅ code-quality-checklist* Pre/during/post-task quality gate
✅ task-plan-architect* Maps large tasks into ordered plans
✅ planrunner* Executes plans via review subagents
✅ tdd* Red→green→refactor TDD enforcer
✅ pr-review* GitHub PR check for real blockers
✅ code-audit-planner* Codebase audit → ordered task list
✅ frontend-code-layout* Separable UI layers — tokens + MVP
✅ modern-web-guidance* Modern CSS/HTML/browser-API patterns
⬆️ codex Bounded worker rules for Codex handoffs
❌ worker Cursor orchestrator — dispatches to Haiku
Productivity
✅ fusion-reasoning* Multi-agent reasoning panel + judge
✅ efficient-orchestration* Output thrift + subagent model tiering
Design
❌ design-critique Six-lane UI audit + improvement plan
❌ style-guide Generate/audit visual style guide + tokens
Content
❌ humanizer Strip AI-writing tells from prose
Automation
❌ gepeto Build Pinokio 1-click launchers
❌ pinokio Discover, launch & use Pinokio apps
* indicates default
If any installed skills are no longer in the marketplace, add a section at the bottom of the dashboard:
Removed from marketplace
⚠️ design-review-loop No longer available — uninstall recommended
Replace ✅/⬆️/❌ with the actual install status. Use ⬆️ for installed skills where a newer version is available in the marketplace. Use * using the list in defaults.json.
After the dashboard, offer these actions. If your platform has an option picker, render them as selectable choices; otherwise list them and let the user reply in plain text:
- Install defaults — the skills marked
* - Install all skills
- Install a skill — pick from the picker, or type the skill name
- Uninstall a skill — pick from the picker, or type the skill name
- Update available skills — update all skills marked ⬆️
- Update a skill — pick a specific skill to update
- Uninstall removed skills — remove all skills marked
⚠️ (no longer in the marketplace) - Nothing, just browsing
When the user needs to choose which skill to install, uninstall, or update, present the relevant skills through the picker if available — never ask them to match a letter or number. In plain-text chat, they simply type the skill name.
If all defaults are already installed, lead with that and skip option 1. If no skills have updates available, omit options 5 and 6. If no removed skills are installed, omit option 7.
Before installing (options 1, 2, or 3), ask the user where the skills should live — unless they've already told you, or your platform only supports one location:
Install location?
1. Global — available in every project on this machine
2. Local — this project/workspace only
Map the choice to your platform's convention: global → the user-level skill dir (e.g. ~/.claude/skills/) or a global marketplace install; local → the project-level location (e.g. .claude/skills/ or .agents/skills/ in the working directory). If your platform has no concept of local skills, install global and say so.
To install a skill:
-
Get the repo if not already local — clone it, or use the marketplace command from Step 1 if your platform has one:
git clone https://github.com/senigami/skill-arsenal.git
-
Resolve dependencies first. Read the skill's
SKILL.mdfor a## Requirementssection and install any listed skills recursively before installing the requested skill. Mastermind requires: fusion-reasoning, task-plan-architect, planrunner, efficient-orchestration, adversarial-review — install all five before or alongside mastermind. -
Locate the skill at
skills/<category>/<name>/skills/<name>/in the repo. The"source"field inmarketplace.jsongives the category path (e.g."./skills/engineering/tdd"→ categoryengineering, nametdd). -
Copy or symlink the inner
skills/<name>/folder into your platform's skill location. Use whatever mechanism (copy, symlink, global vs. workspace) matches your platform's norms.
Reverse however the skill was installed:
- Marketplace install: use your platform's remove command (e.g.
/plugin remove <name>,codex plugin remove <name>) - Manual install: delete the skill's folder from your platform's skill location
You know your own platform's conventions — use the right method.
Tell the user:
- Which skills were installed or removed, and any dependencies pulled in
- How to invoke each installed skill (e.g.
/mastermind,/adversarial-review) - Whether a restart or new session is needed for changes to take effect