Here are 3 public repositories matching this topic...
Stage 2 of the Daktus CDSS pipeline: eval and correction harness for LLM-generated clinical protocols. Pydantic schemas, AST-based logic validation, LLM-as-judge audits, multi-model routing via OpenRouter, closed-loop feedback learning, cost telemetry.
Updated May 15, 2026 · Python
Quality layer for Agent Skills (agentskills.io): lint, hash, sign, version, and evaluate your SKILL.md files. 10 structural checks, Ed25519 signing, semantic diff, 3-stage eval pipeline. Bidirectional SKILL.md ↔ AIF conversion.
Updated May 11, 2026 · Rust
Production-grade RAG pipeline with section-based scraping, retrieval tuning, and LLM-as-judge evaluation — built on Pinecone and Claude.
Updated May 23, 2026 · Python