This project is currently under active development.
ClaimCheck AI is a source-grounded claim verification backend. It helps users check whether a written claim is supported by evidence from uploaded PDF documents.
Users upload source PDFs; the backend parses and chunks the document text, then matches each claim against relevant evidence chunks. The current MVP supports ranked keyword-based evidence retrieval and rule-based claim verification.
AI-generated writing often contains unsupported claims, misleading citations, or fabricated references. Even when a source document is provided, the cited document may not actually support the claim.
ClaimCheck AI addresses this problem by grounding claims in uploaded source documents and returning evidence-linked verification results.
- Upload source PDF documents
- Parse PDF text using PyMuPDF
- Split PDF text into overlapping chunks
- Store documents and chunks in PostgreSQL
- Create and store claims
- Retrieve relevant evidence chunks for each claim
- Rank evidence chunks by keyword match score
- Automatically verify a claim using a rule-based verifier
- Store verification results in the database
- Query verification results by claim or by result ID
- Test all backend APIs through FastAPI Swagger UI
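The overlapping chunking step listed above could look roughly like the following. This is a minimal sketch, not the project's actual implementation: the function name `chunk_text`, the character-based windows, and the default sizes are all assumptions made for illustration.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into character windows that share `overlap` characters.

    Hypothetical helper: the real chunk size, overlap, and unit
    (characters vs. tokens) used by the backend are not documented here.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which helps keyword matching at the cost of some duplicated storage.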
The current MVP uses a simple rule-based verifier:
- If no evidence is found:
  - status: not_enough_evidence
  - confidence: 0.2
- If the top evidence chunk has a keyword match score of 3 or higher:
  - status: likely_supported
  - confidence: 0.8
- If the top evidence chunk has a keyword match score below 3:
  - status: weak_evidence
  - confidence: 0.5
This rule-based system is intentionally simple and will later be replaced or enhanced with embedding-based retrieval and LLM-based verification.
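The three rules above can be expressed in a few lines of Python. The sketch below uses the statuses, confidences, and threshold stated in this README; the names `Verdict`, `verify_claim`, and `SUPPORT_THRESHOLD`, and the choice to pass in a plain list of scores, are assumptions for illustration and may not match `services/verification.py`.

```python
from dataclasses import dataclass

SUPPORT_THRESHOLD = 3  # keyword match score needed for "likely_supported"


@dataclass
class Verdict:
    status: str
    confidence: float


def verify_claim(evidence_scores: list[int]) -> Verdict:
    """Apply the rule-based verifier to the keyword scores of retrieved chunks.

    Hypothetical signature: the real verifier may receive chunk objects
    rather than bare scores.
    """
    if not evidence_scores:
        return Verdict("not_enough_evidence", 0.2)

    top_score = max(evidence_scores)  # only the best chunk decides the outcome
    if top_score >= SUPPORT_THRESHOLD:
        return Verdict("likely_supported", 0.8)
    return Verdict("weak_evidence", 0.5)
```

Because only the top-scoring chunk is considered, a single strong match is enough for `likely_supported`, regardless of how many weaker chunks were retrieved.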
- FastAPI
- Python
- SQLAlchemy
- Pydantic
- PostgreSQL
- PyMuPDF
- Docker
- Docker Compose
- Environment variables
- FastAPI Swagger UI
- Git and GitHub
Planned additions to the stack:
- Embedding-based semantic retrieval
- pgvector for vector similarity search
- LLM-based claim extraction
- LLM-based evidence-aware verification
- Structured JSON outputs
FastAPI Backend
|
+--> API Layer
| +--> documents.py
| +--> chunks.py
| +--> upload.py
| +--> claims.py
| +--> verification.py
|
+--> Service Layer
| +--> retrieval.py
| +--> verification.py
|
+--> Database Layer
| +--> SQLAlchemy models
| +--> PostgreSQL
|
+--> Schemas
  +--> Pydantic request and response models
Upload PDF
↓
Parse PDF text
↓
Split text into chunks
↓
Store document and chunks in PostgreSQL
↓
Create claim
↓
Retrieve ranked evidence chunks
↓
Run rule-based verification
↓
Store and return verification result
GET /health
GET /health/db
POST /documents/
GET /documents/
GET /documents/{document_id}
GET /documents/{document_id}/chunks/
POST /upload/pdf
Uploads a PDF, extracts text, splits it into chunks, and stores the document and chunks in PostgreSQL.
POST /claims/
GET /claims/
GET /claims/{claim_id}
GET /claims/{claim_id}/evidence
POST /claims/{claim_id}/verify
GET /claims/{claim_id}/evidence returns ranked evidence chunks with a keyword match score.
POST /claims/{claim_id}/verify automatically retrieves evidence, applies the rule-based verifier, stores a verification result, and returns it.
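This README does not specify how the keyword match score is computed. One plausible reading, sketched below, counts the distinct claim words that also appear in a chunk and sorts chunks by that count. The function names `tokenize` and `rank_chunks` and the scoring formula itself are assumptions for illustration, not the project's confirmed logic.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its distinct alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_chunks(claim: str, chunks: list[str]) -> list[tuple[int, str]]:
    """Return (score, chunk) pairs sorted by descending keyword overlap.

    Assumed scoring: number of distinct claim words found in the chunk.
    Chunks with no shared words are dropped, so an empty result means
    no evidence was found.
    """
    claim_terms = tokenize(claim)
    scored = []
    for chunk in chunks:
        score = len(claim_terms & tokenize(chunk))
        if score > 0:
            scored.append((score, chunk))
    # sort() is stable, so chunks with equal scores keep document order
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored
```

A scheme like this treats common words such as "the" as matches, which is one reason stronger retrieval scoring is a natural follow-up.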
POST /verification-results/
GET /verification-results/
GET /verification-results/claim/{claim_id}
GET /verification-results/{verification_id}
These endpoints support manual creation and querying of verification results.
- Upload a PDF through POST /upload/pdf.
- Check the stored chunks with GET /documents/{document_id}/chunks/.
- Create a claim with POST /claims/.
- Retrieve evidence with GET /claims/{claim_id}/evidence.
- Automatically verify the claim with POST /claims/{claim_id}/verify.
- View saved verification results with GET /verification-results/claim/{claim_id}.
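For scripting the workflow above, it can help to generate the ordered list of calls for a given document and claim. The sketch below only builds methods and URLs; it sends nothing, because the request and response bodies are not documented here. The function name `workflow_requests` and the integer ID types are assumptions; the base URL is the local address used for the API docs.

```python
BASE_URL = "http://127.0.0.1:8000"  # local development address from this README


def workflow_requests(document_id: int, claim_id: int) -> list[tuple[str, str]]:
    """Return the (HTTP method, URL) pairs of the example workflow, in order.

    Hypothetical helper: IDs are assumed to be integers returned by the
    upload and claim-creation calls.
    """
    paths = [
        ("POST", "/upload/pdf"),
        ("GET", f"/documents/{document_id}/chunks/"),
        ("POST", "/claims/"),
        ("GET", f"/claims/{claim_id}/evidence"),
        ("POST", f"/claims/{claim_id}/verify"),
        ("GET", f"/verification-results/claim/{claim_id}"),
    ]
    return [(method, BASE_URL + path) for method, path in paths]
```

In practice the document ID comes from the upload response and the claim ID from the claim-creation response, so the later URLs can only be filled in once those calls have completed.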
backend/
  app/
    api/
      claims.py
      documents.py
      health.py
      upload.py
      verification.py
    core/
      config.py
    db/
      database.py
      init_db.py
      models.py
    schemas/
      claim.py
      chunk.py
      document.py
      evidence.py
      verification.py
    services/
      retrieval.py
      verification.py
    main.py
  requirements.txt
docker-compose.yml
README.md
- FastAPI backend setup
- PostgreSQL connection with SQLAlchemy
- Database models for documents, chunks, claims, and verification results
- PDF upload and parsing
- Text chunking with overlap
- Claim creation and query APIs
- Ranked keyword-based evidence retrieval
- Rule-based automatic claim verification
- Verification result creation and query APIs
- Service layer refactor for retrieval and verification logic
Planned next steps:
- Add stronger retrieval scoring
- Add embedding generation for document chunks
- Add pgvector similarity search
- Add LLM-based claim extraction
- Add LLM-based verification with structured JSON output
- Add frontend dashboard
- Add authentication and user-specific documents
- Add tests for core API endpoints
Start the PostgreSQL database:
docker compose up -d
Start the FastAPI backend:
cd backend
uvicorn app.main:app --reload
Open the API docs:
http://127.0.0.1:8000/docs
This project is under active development. The current version is a backend MVP with source document ingestion, ranked evidence retrieval, and rule-based claim verification.