Class-based, modular Text-to-SQL app for SQLite with guardrails and security.
python cli_main.py # normal mode
python cli_main.py --debug # verbose logs + SQLAlchemy echo
python cli_main.py --quiet # only final answerOptional API key gate:
set APP_API_KEY=your-secret # Windows
python cli_main.py --api-key your-secret- args_logging.py: CLI flags and logging
- database.py: schema creation and idempotent seed
- llm_prompt.py: schema load (redacted) and prompt/LLM builder
- agent.py: TextToSQLAgent (validation, retries, execution)
- sql_guard.py: allowlist validation, LIMIT injection, (extensible) timeout hooks
- security.py: API key validation, rate limiting, abuse checks
- summarizer.py: concise answer generation from query results
- cli_main.py: interactive CLI entrypoint
- API key check: set
APP_API_KEYand pass--api-keyto require a key. - Rate limiting: simple token-bucket style (30 req/min default) in CLI.
- Abuse checks: rejects overly long inputs or those containing risky tokens.
- Schema redaction: sensitive columns (
email,address,salary) are removed from the schema string provided to the LLM.
Adjust redaction in llm_prompt.py via SENSITIVE_COLUMN_SUBSTRINGS.
- Only
SELECTqueries allowed (validated by allowlist insql_guard.py). - Automatic LIMIT added when missing (default 100) to bound result size.
- Future hook: SQLite progress handler for timeouts; see
sql_guard.pyfor stubs.
--quiethides intermediary prints and only prints the final answer.--debugenables detailed logs and SQLAlchemy engine echo.
Environment variables:
APP_API_KEY: Optional. Require matching--api-keyfor access.OLLAMA_MODEL: Defaults tomistral:latestif not provided.
Linting: project runs clean with built-in lints. To run the CLI locally use Python 3.12 and install dependencies in your venv as needed.
- This app remains demo-grade. For production, add HTTP service layer (FastAPI), persisted rate limits, structured logging, centralized metrics/tracing, and comprehensive tests.
Text-to-SQL Agent is a command-line tool that lets you query a SQL database using natural language. Powered by a language model (LLM), it translates your question into SQL, runs it, and summarizes the results — all without needing to write a single line of SQL.
$ python cli_main.py --debug
> What are the top 5 products by sales this month?
SQL Generated:
SELECT name, SUM(sales) FROM products WHERE date >= '2025-10-01' GROUP BY name ORDER BY SUM(sales) DESC LIMIT 5;
Summary:
The top 5 products by sales this month are...-
✅ Natural Language to SQL: Ask questions in plain English.
-
🔌 Pluggable LLM: Swap in different language models.
-
🗃 Built-in DB Setup: Auto-creates schema and seed data if needed.
-
🧼 Readable Output: SQL + summarized results.
-
🛠 Modular Design: Easy to extend and maintain.
.
├── args_logging.py # CLI arguments and logging
├── database.py # DB engine + schema + seed data
├── llm_prompt.py # LLM + prompt handling
├── agent.py # TextToSQLAgent class
├── summarizer.py # Summarizes query results
├── cli_main.py # Main CLI entry point
-
Python 3.8+
-
(Optional) OpenAI API Key if using an OpenAI-backed LLM
-
Install dependencies:
-
pip install -r requirements.txt
You can run either the new modular CLI or the legacy script.
python cli_main.py [--debug | --quiet]
CLI Flags
--debug: Enables verbose logging
--quiet: Hides non-essential output
-
You ask a question in natural language.
-
The system builds a prompt for the LLM using the current database schema.
-
The LLM generates SQL.
-
The agent validates and executes the SQL.
-
A summary of the results is returned to you.
To customize the LLM or prompt, check llm_prompt.py.
To change or extend the DB schema, see database.py.