certmate-agent

Conversational assistant for CertMate (source).

Embedded local LLM (LM Studio by default) + a 1:1 mapping of CertMate's REST API as LLM-callable tools. Read tools auto-execute; write tools queue a pending_action with a human-readable summary and require an explicit confirmation from the UI.

Ecosystem

Part of the CertMate ecosystem:

CertMate — open-source SSL certificate management (API + UI).
certmate-tools — free, privacy-first, client-side TLS / certificate / ACME diagnostics.
nis2-public — NIS2 continuous posture management & remediation.

Enterprise / high-scale — multi-tenant, mTLS, white-label and NIS2-aligned deployments are available through CertMate-ng (source-available, BSL 1.1, EU-built). Contact fabrizio.salmi@gmail.com.

Operating modes

Set AGENT_MODE to choose between:

full (default) — sidecar to a real CertMate instance. All tools available: live state reads, write commands with confirm, admin (/reindex), RAG over docs. Used for self-hosters running their own CertMate.
docs_only — public docs assistant. No CertMate API connection. Only docs_search + /help + /docs available; the LLM's system prompt is also adjusted to tell it there is no live state. Used for agent.certmate.org-style deployments where anyone can ask questions about CertMate features and configuration.

Mode-dependent behavior:

Behavior	`full`	`docs_only`
Tools registered	23	1 (`docs_search`)
Slash commands	all	`/help`, `/docs`, `/ask`
CertMate API client	opened per turn	not opened
`/health.certmate`	live check	`"status": "disabled"`
System prompt	"you can read live state + docs"	"no live state, docs only"
Widget header badge	hidden	`DOCS ONLY`

The widget discovers the mode by hitting /health on mount and adapts its autocomplete + intro hint accordingly.
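
As a rough illustration of that discovery step, the sketch below infers the mode from an already-decoded /health payload. The payload shape is an assumption: only the `"status": "disabled"` value under `certmate` comes from the table above, and `detect_mode` and `autocomplete_for` are hypothetical helper names, not part of the agent.

```python
# Slash commands listed for docs_only in the table above.
DOCS_ONLY_COMMANDS = ["/help", "/docs", "/ask"]


def detect_mode(health: dict) -> str:
    """Infer the agent mode from a decoded /health payload (assumed shape)."""
    certmate = health.get("certmate") or {}
    # In docs_only mode the CertMate check reports "disabled".
    return "docs_only" if certmate.get("status") == "disabled" else "full"


def autocomplete_for(mode: str, all_commands: list) -> list:
    """Restrict the autocomplete list when there is no live CertMate state."""
    if mode == "docs_only":
        return [c for c in all_commands if c in DOCS_ONLY_COMMANDS]
    return list(all_commands)
```

A real client would fetch /health over HTTP first; the fetch is left out so the example stays self-contained.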

Architecture

 user ──► widget (vanilla web component)
            │ SSE
            ▼
       FastAPI agent ──► LM Studio (chat + embeddings, OpenAI-compatible)
            │
            └─► CertMate REST API (Bearer token)

Single LLM endpoint, configurable via env (LM Studio by default).
Read tools run inline; write tools save a pending_action and return a confirm_token. The widget posts to /tools/execute with the token to actually run them.
sqlite for conversations / pending actions / audit log.
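
The read/write split above can be sketched as follows. This is a minimal in-memory illustration, not the agent's implementation: `PendingStore`, `dispatch`, the five-minute expiry and the response fields other than `confirm_token` are assumptions, and a dict stands in for the sqlite table.

```python
import secrets
import time
from dataclasses import dataclass, field

READ_TOOLS = {"cert_list", "cert_get"}        # subset, for illustration
WRITE_TOOLS = {"cert_renew", "cert_deploy"}   # subset, for illustration


@dataclass
class PendingStore:
    """In-memory stand-in for the sqlite pending_action table."""
    ttl_seconds: float = 300.0  # hypothetical expiry window
    _items: dict = field(default_factory=dict)

    def queue(self, tool: str, args: dict) -> dict:
        """Save the call and return a token plus a human-readable summary."""
        token = secrets.token_urlsafe(16)
        self._items[token] = (tool, args, time.monotonic() + self.ttl_seconds)
        rendered = ", ".join(f"{k}={v!r}" for k, v in sorted(args.items()))
        return {"status": "pending", "confirm_token": token,
                "summary": f"{tool}({rendered})"}

    def confirm(self, token: str):
        """Return (tool, args) once; unknown or expired tokens yield None."""
        item = self._items.pop(token, None)  # single use: always removed
        if item is None or item[2] < time.monotonic():
            return None
        return item[0], item[1]


def dispatch(store: PendingStore, tool: str, args: dict, run):
    """Run read tools inline; queue write tools for explicit confirmation."""
    if tool in READ_TOOLS:
        return {"status": "done", "result": run(tool, args)}
    if tool in WRITE_TOOLS:
        return store.queue(tool, args)
    raise ValueError(f"unknown tool: {tool}")
```

In this sketch, the handler behind /tools/execute would call `store.confirm(token)` and only then invoke the tool, so a write never runs without the second request.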

Run

cp .env.example .env
# set CERTMATE_URL + CERTMATE_TOKEN, point LMSTUDIO_URL at your LM Studio server

uv pip install -e .          # or: pip install -e .
python -m agent.main

Open http://127.0.0.1:8765/widget/ for the standalone test page.

Embed in CertMate or any page:

<script type="module" src="http://127.0.0.1:8765/widget/certmate-agent.js"></script>
<link rel="stylesheet" href="http://127.0.0.1:8765/widget/certmate-agent.css">
<certmate-agent endpoint="http://127.0.0.1:8765"></certmate-agent>

Endpoints

Method	Path	What
GET	`/health`	agent + LM Studio + CertMate health
GET	`/models`	check the configured chat/embed models are loaded
POST	`/chat`	SSE stream — body `{message, history?}`
POST	`/tools/execute`	confirm a queued write — body `{token}`
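
Since /chat answers with an SSE stream, a client has to split the response into events. The sketch below follows the standard server-sent events framing (blank line ends an event, lines starting with `:` are comments). The event names the agent actually emits are not listed here, so the names in the usage note are assumptions.

```python
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs from the lines of an SSE response body."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event, if it has data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment / keep-alive line
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if name == "event":
            event = value
        elif name == "data":
            data.append(value)
```

Feeding it `["event: status\n", "data: served via openrouter\n", "\n"]` yields one `("status", "served via openrouter")` pair; with an HTTP client you would pass the response's line iterator instead of a list.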

Model notes

Configured by default with google/gemma-4-e2b (chat) and text-embedding-embeddinggemma-300m (embeddings).

Gemma's small variants are "thinking" models — they spend tokens on internal reasoning before producing output. If you see empty assistant replies, raise AGENT_MAX_TOKENS (try 2048+), or swap to an instruct model with reliable native tool-calling such as qwen/qwen3-8b.

Tool surface (1:1 with CertMate API)

Read (auto-executed): system_overview, system_health, cert_list, cert_get, cert_deployment_status, cert_dns_alias_check, settings_get, dns_providers_info, dns_accounts_list, dns_account_get, backups_list, storage_info, client_certs_list.

Write (require confirm): cert_create, cert_renew, cert_auto_renew_toggle, cert_deploy, cache_clear, backup_create, dns_account_add.

Destructive (require confirm + extra UI warning): backup_delete, dns_account_delete.

Slash commands (deterministic — skip the LLM)

Command	What
`/help` (`/?`)	List all commands
`/health`	CertMate service health
`/status` (`/overview`)	Health + cert count + certs expiring within 30d
`/expiring [days]`	Certificates expiring within N days (default 30)
`/list` (`/certs`, `/ls`)	All managed certificates
`/cert <domain>`	Details for one certificate
`/providers` (`/dns`)	Supported + configured DNS providers
`/accounts [provider]`	Configured DNS accounts
`/backups`	Available backups
`/renew <domain> [--force]`	Renew (queues a confirm)
`/deploy <domain>`	Run deploy hook (queues a confirm)
`/cache-clear`	Clear server cache (queues a confirm)
`/docs <query>` (`/ask`)	Search the CertMate docs (RAG)
`/reindex [repo] [branch]`	Rebuild the docs index (admin only)

Slash commands bypass the LLM entirely — sub-200ms response, deterministic output, and write commands reuse the same confirm-token flow as LLM-emitted tool calls.
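
A deterministic router of this kind might look like the sketch below. The alias map and the set of confirm-queuing commands are taken from the table above; `parse_slash` and the returned dict shape are hypothetical.

```python
# Aliases from the command table, mapped to their canonical name.
ALIASES = {"?": "help", "overview": "status", "certs": "list",
           "ls": "list", "dns": "providers", "ask": "docs"}
# Commands the table marks as "queues a confirm".
WRITE_COMMANDS = {"renew", "deploy", "cache-clear"}


def parse_slash(text: str):
    """Return a parsed command, or None so the message goes to the LLM."""
    text = text.strip()
    if not text.startswith("/"):
        return None
    parts = text[1:].split()
    if not parts:
        return None
    name = parts[0].lower()
    name = ALIASES.get(name, name)
    return {"command": name, "args": parts[1:],
            "needs_confirm": name in WRITE_COMMANDS}
```

For example, `parse_slash("/renew example.com --force")` returns the `renew` command with two arguments and `needs_confirm` set, while plain chat text returns `None`.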

Docs RAG (built once, queried locally)

The agent ships with an indexer that pulls README.md and docs/*.md from the CertMate GitHub repo, chunks by markdown headings, and embeds with the local text-embedding-embeddinggemma-300m. Queries embed with the same model and cosine-rank in pure Python (no numpy / vector DB).

python -m agent.rag.indexer                  # defaults: fabriziosalmi/certmate@main
python -m agent.rag.indexer --repo X/Y --branch main
python -m agent.rag.indexer --paths README.md,docs/api.md
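
To make "chunks by markdown headings" concrete, here is one way such a splitter could work. It is a sketch under assumptions: the real indexer's chunk format, size limits and fence handling are not described here, and `chunk_by_headings` is a hypothetical name.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
FENCE = "`" * 3  # marker that opens or closes a fenced code block


def _flush(chunks: list, title: str, body: list) -> None:
    """Append the buffered section unless it has no text."""
    text = "\n".join(body).strip()
    if text:
        chunks.append({"heading": title, "text": text})


def chunk_by_headings(markdown: str) -> list:
    """Split a markdown document into one chunk per heading."""
    chunks, title, body, in_fence = [], "", [], False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        # Lines starting with '#' inside a code fence are not headings.
        match = None if in_fence else HEADING.match(line)
        if match:
            _flush(chunks, title, body)
            title, body = match.group(2), []
        else:
            body.append(line)
    _flush(chunks, title, body)
    return chunks
```

Each chunk keeps its heading next to the text, so the embedding and the search result both carry some context about where the passage came from.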

Index is written to docs_index/index.pkl (~2 MB for 271 chunks). The agent loads it lazily at first docs_search call — restart not required after rebuild.

Both /docs <query> (slash, sub-50ms after embed) and the LLM tool docs_search use the same path. The system prompt instructs the LLM to call docs_search for any conceptual / how-to question, which keeps small models like gemma-4-e2b grounded.
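
The pure-Python cosine ranking mentioned in this section needs only the standard library. The following is a minimal sketch, assuming each chunk is stored as a `(text, vector)` pair; the real index layout inside index.pkl may differ.

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query_vec: list, chunks: list, top_k: int = 3) -> list:
    """Return the top_k (score, text) pairs, best match first."""
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

A linear scan like this is proportional to the number of chunks times the embedding dimension, which stays small for an index of a few hundred chunks.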

Admin gate

Set AGENT_ADMIN_TOKEN=<secret> to enable admin-only commands like /reindex. Clients prove admin status by sending one of:

HTTP header: X-Agent-Admin: <secret> (preferred)
JSON body field: "admin_token": "<secret>"

When AGENT_ADMIN_TOKEN is empty, all admin commands are disabled (404-equivalent: the request is refused with an explanatory error in the chat). Token comparison is constant-time.
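
A gate with these properties can be sketched with `hmac.compare_digest` from the standard library. The function name `is_admin` and the rule that the header wins over the body field are assumptions made for the example.

```python
import hmac
from typing import Optional


def is_admin(configured: str,
             header_value: Optional[str] = None,
             body_value: Optional[str] = None) -> bool:
    """Check a client-supplied admin token against the configured secret."""
    if not configured:
        # Empty secret: admin commands are disabled. Without this guard an
        # empty client token would compare equal to the empty secret.
        return False
    # Assumption: the X-Agent-Admin header takes precedence over admin_token.
    provided = header_value if header_value is not None else body_value
    if provided is None:
        return False
    # compare_digest avoids leaking the position of the first mismatch.
    return hmac.compare_digest(provided.encode("utf-8"),
                               configured.encode("utf-8"))
```

The early return for an empty secret matters more than it looks: it is what keeps a deployment without AGENT_ADMIN_TOKEN from accepting an empty token.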

In the widget, set the attribute only on admin-facing pages:

<certmate-agent endpoint="…" admin-token="MY_SECRET"></certmate-agent>

Conversation persistence

Optional. Off by default (stateless: client passes history each turn). Set AGENT_PERSIST_CONVERSATIONS=true to:

mount /conversations/<session_id> (GET to fetch, DELETE to clear)
store user + assistant messages in sqlite, keyed by session_id
on each turn the server loads history server-side and ignores the client's history field — survives page reloads, multi-tab use, and works when the agent is behind a load balancer
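
Server-side history keyed by session_id could be stored along these lines. The table name conversation_messages appears elsewhere in this README; the column layout and the helper names are assumptions for illustration.

```python
import sqlite3
import time


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (assumed) message table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS conversation_messages ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, session_id TEXT NOT NULL, "
        "role TEXT NOT NULL, content TEXT NOT NULL, created_at REAL NOT NULL)"
    )
    return db


def append(db, session_id: str, role: str, content: str) -> None:
    """Store one user or assistant message for a session."""
    db.execute(
        "INSERT INTO conversation_messages "
        "(session_id, role, content, created_at) VALUES (?, ?, ?, ?)",
        (session_id, role, content, time.time()),
    )
    db.commit()


def history(db, session_id: str) -> list:
    """Load a session's messages in insertion order (the GET case)."""
    rows = db.execute(
        "SELECT role, content FROM conversation_messages "
        "WHERE session_id = ? ORDER BY id", (session_id,))
    return [{"role": role, "content": content} for role, content in rows]


def clear(db, session_id: str) -> None:
    """Delete a session's messages (the DELETE case)."""
    db.execute("DELETE FROM conversation_messages WHERE session_id = ?",
               (session_id,))
    db.commit()
```

Because every query filters on session_id, two browser tabs sharing a session see the same history while other sessions stay isolated.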

In the widget, opt in with the persist attribute (generates a per-host session_id in localStorage and adds a "New session" button):

<certmate-agent endpoint="…" persist></certmate-agent>

Background scheduler

A single asyncio task runs every AGENT_CLEANUP_INTERVAL_SECONDS (default 1 hour) and prunes:

expired pending_actions (always, regardless of persistence)
conversation_messages older than AGENT_CONVERSATION_TTL_DAYS (only when AGENT_PERSIST_CONVERSATIONS=true)

Set AGENT_CLEANUP_INTERVAL_SECONDS=0 to disable. A pass runs on boot so a long-stopped instance cleans backlog before serving traffic.
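
The scheduling behavior described above can be approximated with a single coroutine. This is a sketch, not the agent's code: `cleanup_loop` and `max_passes` are hypothetical (the latter exists only so the example terminates), and treating an interval of 0 as "skip the boot pass too" is an assumption.

```python
import asyncio


async def cleanup_loop(prune, interval_seconds: float, max_passes=None) -> int:
    """Call prune() once immediately, then once per interval.

    Returns the number of passes run. max_passes=None loops forever.
    """
    if interval_seconds <= 0:
        return 0  # assumption: 0 disables the scheduler entirely
    passes = 0
    while max_passes is None or passes < max_passes:
        if passes:
            # Sleep only between passes, so the first one runs on boot.
            await asyncio.sleep(interval_seconds)
        prune()
        passes += 1
    return passes
```

In a server, the coroutine would typically be started with `asyncio.create_task(...)` during startup and cancelled on shutdown.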

Optional fallback LLM

Set OPENROUTER_API_KEY to enable a fallback chat provider. The agent tries the primary LM Studio first and falls back to OpenRouter only when the primary errors out (connection, timeout, 5xx). A small circuit breaker trips the primary after LLM_PRIMARY_FAILURE_THRESHOLD consecutive failures and keeps it tripped for LLM_PRIMARY_COOLDOWN_SECONDS before retrying.

Embeddings always stay on the primary (only LM Studio runs the embedding model). The widget receives an extra status event "served via openrouter" when the fallback handled the turn, so you can see it in the chat log.
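
The primary/fallback switching can be illustrated with a small breaker. Everything named here (`PrimaryBreaker`, `chat`, the injectable clock) is hypothetical; only the threshold and cooldown semantics come from the description above. The sketch treats `ConnectionError` and `TimeoutError` as failures, and a real client would also map 5xx responses to a failure.

```python
import time


class PrimaryBreaker:
    """Trips after N consecutive failures, then blocks for a cooldown."""

    def __init__(self, failure_threshold, cooldown_seconds, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self.failures = 0
        self.tripped_at = None

    def primary_allowed(self) -> bool:
        if self.tripped_at is None:
            return True
        if self.clock() - self.tripped_at >= self.cooldown_seconds:
            # Cooldown elapsed: reset and let the primary be retried.
            self.tripped_at = None
            self.failures = 0
            return True
        return False

    def record(self, ok: bool) -> None:
        if ok:
            self.failures = 0  # only consecutive failures count
            return
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.tripped_at = self.clock()


def chat(breaker, primary, fallback, message):
    """Return (provider, reply), using the fallback when the primary fails."""
    if breaker.primary_allowed():
        try:
            reply = primary(message)
            breaker.record(True)
            return "primary", reply
        except (ConnectionError, TimeoutError):
            breaker.record(False)
    return "fallback", fallback(message)
```

The returned provider label is what would let the caller emit a status event such as "served via openrouter" for that turn.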

Public deployment (`docs_only` on Fly.io + weekly index)

Pre-built workflows + manifest ship the public docs assistant (agent.certmate.org-style) end-to-end:

1. Weekly index rebuild — `.github/workflows/rebuild-docs-index.yml`

Runs every Monday 06:00 UTC and on manual trigger. Pulls fabriziosalmi/certmate@main, chunks + embeds the docs, publishes the resulting pickle as an index-latest GitHub Release.

Required repo secrets (Settings → Secrets and variables → Actions):

Secret	Example value
`INDEX_EMBED_URL`	`https://api.cloudflare.com/client/v4/accounts/<ACCOUNT_ID>/ai/v1`
`INDEX_EMBED_API_KEY`	Cloudflare API token (Workers AI scope)
`INDEX_EMBED_MODEL`	`@cf/baai/bge-base-en-v1.5` (768-dim)

OpenAI works too:

Secret	Example value
`INDEX_EMBED_URL`	`https://api.openai.com/v1`
`INDEX_EMBED_API_KEY`	`sk-...`
`INDEX_EMBED_MODEL`	`text-embedding-3-small`

Important: the same embedding model must be configured on the runtime that serves the index. The store warns loudly if they don't match.

2. Fly.io deploy — `fly.toml` + `.github/workflows/deploy-fly.yml`

The Fly app runs AGENT_MODE=docs_only, fetches the published index on cold start via AGENT_INDEX_BOOTSTRAP_URL, and uses Cloudflare Workers AI (or OpenRouter) for chat. Set up once:

fly apps create certmate-agent              # pick any name
# Edit fly.toml: replace REPLACE_ME in LMSTUDIO_URL with your CF account id
fly secrets set LMSTUDIO_API_KEY=<token>    # CF Workers AI token
fly volumes create certmate_agent_data --size 1 --region fra
fly deploy                                   # one-shot bootstrap

Get a deploy token for GitHub:

fly tokens create deploy -x 99999h
# paste into repo secret FLY_API_TOKEN

After that, the deploy workflow runs automatically on:

push to main (rebuilds the image)
successful rebuild-docs-index (just restarts machines so they re-bootstrap the index, no image rebuild)
manual dispatch

Point your DNS:

agent.certmate.org → CNAME → <app-name>.fly.dev

Fly handles TLS issuance + renewal — fitting for the CertMate ecosystem.

Why this split scales

Layer	Updates when	Cost
Docker image	code/config changes (rare)	0
Index artifact	weekly cron or doc push	$0.001/run on OpenAI; free on CF WAI
Fly machine	code change or index refresh	scale-to-zero, ~free under low traffic

Docker

docker/Dockerfile + docker/docker-compose.example.yml provided. sqlite db + RAG index live under /data (volume). Build the index from inside the container the first time:

docker compose -f docker/docker-compose.example.yml up -d
docker compose -f docker/docker-compose.example.yml exec certmate-agent \
    python -m agent.rag.indexer

After that you can rebuild via the /reindex admin command from the widget (no shell access needed) — hot-swap, no restart.
