eval-pipeline

Here are 3 public repositories matching this topic...

daanmt / agente-daktus-qa

Stage 2 of the Daktus CDSS pipeline: eval and correction harness for LLM-generated clinical protocols. Pydantic schemas, AST-based logic validation, LLM-as-judge audits, multi-model routing via OpenRouter, closed-loop feedback learning, cost telemetry.

feedback-loop observability json-validation cdss pydantic medical-ai ast-analysis openrouter cost-tracking llm-as-judge structured-outputs model-routing clinical-protocol deterministic-validation daktus eval-pipeline

Updated May 15, 2026
Python

LiqunChen0606 / skillforge

Quality layer for Agent Skills (agentskills.io): lint, hash, sign, version, and eval your SKILL.md files. 10 structural checks, Ed25519 signing, semantic diff, 3-stage eval pipeline. Bidirectional SKILL.md ↔ AIF conversion.

lint rust signing semver codex claude llm ai-native agent-skills skill-verification token-efficient skillmd document-compiler skill-authoring semantic-format eval-pipeline

Updated May 11, 2026
Rust

brianmcd08 / civ-rag-pipeline

Production-grade RAG pipeline with section-based scraping, retrieval tuning, and LLM-as-judge evaluation — built on Pinecone and Claude.

pinecone rag vector-search llm langchain llm-as-judge rag-pipeline eval-pipeline

Updated May 23, 2026
Python
