A natural language to MongoDB query converter built on a multi-agent AI system. It transforms plain English queries such as "Find technology companies with high profit margins" into optimized MongoDB queries.
Finks Naive uses a multi-agent pipeline to interpret natural language queries and convert them into MongoDB queries. The system employs five specialized AI agents working in concert:
- Field Extraction Agent - Identifies relevant database fields from natural language
- Sorting Extraction Agent - Determines sorting intent and direction
- Instruction Processing Agent - Applies context-aware interpretation to fields
- Synthesis Agent - Combines interpretations into a unified query structure
- Query Generation Agent - Produces the final MongoDB query
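The flow through the five agents can be sketched in plain Python. This is only an illustration under stated assumptions: the real agents call LLMs through Pydantic AI, whereas the stubs below use keyword rules so the sketch runs offline, and every function name, field name (`sector`, `profit_margin`), and threshold is hypothetical rather than taken from the codebase. Field and sorting extraction are run concurrently here, matching the parallel execution mentioned later in this README.

```python
import asyncio

# Hypothetical stand-ins for the five agents; names and rules are assumptions.

async def extract_fields(text: str) -> list[str]:
    """Field Extraction Agent: pick database fields mentioned in the text."""
    known = {"technology": "sector", "profit margin": "profit_margin"}
    return [field for phrase, field in known.items() if phrase in text.lower()]

async def extract_sorting(text: str) -> list[tuple[str, int]]:
    """Sorting Extraction Agent: infer sort field and direction (-1 = descending)."""
    return [("profit_margin", -1)] if "high profit margin" in text.lower() else []

async def process_instructions(text: str, fields: list[str]) -> dict:
    """Instruction Processing Agent: interpret each extracted field in context."""
    rules = {
        "sector": {"sector": "Technology"},
        "profit_margin": {"profit_margin": {"$gte": 0.2}},  # assumed meaning of "high"
    }
    return {name: rules[name] for name in fields}

def synthesize(interpretations: dict, sorting: list) -> dict:
    """Synthesis Agent: merge per-field conditions into one query plan."""
    filter_doc: dict = {}
    for condition in interpretations.values():
        filter_doc.update(condition)
    return {"filter": filter_doc, "sort": sorting}

def generate_query(plan: dict, max_results: int) -> dict:
    """Query Generation Agent: produce the final query description."""
    return {**plan, "limit": max_results}

async def run_pipeline(text: str, max_results: int = 10) -> dict:
    # The two extraction steps do not depend on each other, so run them together.
    fields, sorting = await asyncio.gather(extract_fields(text), extract_sorting(text))
    interpretations = await process_instructions(text, fields)
    return generate_query(synthesize(interpretations, sorting), max_results)

query = asyncio.run(run_pipeline("Find technology companies with high profit margins"))
print(query)
```

The resulting dictionary separates `filter`, `sort`, and `limit`, which map onto the arguments of a MongoDB `find` call; the actual output format of the service may differ.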
- Runtime: Python 3.12
- Framework: FastAPI
- AI: Pydantic AI with Gemini/OpenAI models
- Database: MongoDB Atlas
- Package Manager: UV (ultra-fast Python package manager)
- Deployment: AWS Lambda + API Gateway (via Pulumi)
- Architecture: ARM64 (AWS Graviton2)
Prerequisites:
- Python 3.12+
- UV package manager
- MongoDB Atlas cluster
- API keys for Gemini and/or OpenAI
git clone https://github.com/your-org/finks-naive.git
cd finks-naive
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync
Create a .env file in the project root:
# AI API Keys
GEMINI_API_KEY=your_gemini_api_key
OPENAI_API_KEY=your_openai_api_key
# MongoDB Configuration
MONGODB_URL=mongodb+srv://username:password@cluster.mongodb.net
MONGODB_DB_NAME=your_database_name
# AWS Configuration (for deployment)
AWS_REGION=ca-central-1
# Start the FastAPI server
uv run uvicorn app.main:app --reload
# Or run the main script directly
uv run python main.py
The API will be available at http://localhost:8000
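The query endpoint can also be called from Python. The following is a minimal sketch using only the standard library, assuming the same path (`/api/v1/query`) and request body as the curl sample in this README. The helper name `build_query_request` is hypothetical, and the response format is not documented here, so the sending step is left commented out and would simply print whatever the server returns.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/api/v1/query"  # local dev server from this README

def build_query_request(query: str, max_results: int = 10) -> urllib.request.Request:
    """Build a POST request carrying the natural language query as JSON."""
    payload = json.dumps({"query": query, "max_results": max_results}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_query_request("Find technology companies with high profit margins")

# Sending requires the server to be running:
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(json.load(response))
```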
# Test with a sample query
curl -X POST http://localhost:8000/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Find technology companies with high profit margins",
    "max_results": 10
  }'
finks-naive/
├── app/                             # Application code
│   ├── core/                        # Core utilities (config, database, cache)
│   ├── modules/
│   │   └── agents/                  # AI agents implementation
│   │       ├── field_extraction/
│   │       ├── instruction_processing/
│   │       ├── sorting_extraction/
│   │       ├── synthesis/
│   │       └── query_generation/
│   └── main.py                      # FastAPI application
├── config/                          # Configuration files
│   ├── field_mappings.yaml          # Field to collection mappings
│   ├── field_instructions.yaml      # Natural language interpretation rules
│   ├── field_categories.yaml        # Field categorization
│   └── unavailable_fields.yaml      # Fields pending implementation
├── scripts/                         # Utility scripts
│   ├── normalize_collections.py     # Populate master_search collection
│   ├── process_vic_csv.py           # Generate field instructions from CSV
│   └── test_yaml_files.py           # Validate YAML configurations
├── tests/                           # Test files
├── deployment_config.yaml           # AWS Lambda deployment settings
└── pyproject.toml                   # UV/Python dependencies
All configuration files use YAML format for better readability and documentation:
- field_mappings.yaml: Maps field names to MongoDB collections
- field_instructions.yaml: Provides AI guidance for interpreting user queries
- field_categories.yaml: Groups fields into categories with selection rules
- unavailable_fields.yaml: Lists fields that need future implementation
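Because the four files describe the same set of fields, they need to stay consistent with one another. The sketch below shows one way such a cross-check could look. It is not the project's `scripts/test_yaml_files.py`: the dictionaries stand in for already-parsed YAML, and their shapes, the field names, and the function `find_config_problems` are all assumptions made for illustration.

```python
# Stand-ins for the parsed YAML files; the real schemas may differ.
field_mappings = {"profit_margin": "financials", "sector": "companies"}
field_instructions = {"profit_margin": "Treat 'high' as the top quartile."}
field_categories = {
    "financial": ["profit_margin"],
    "descriptive": ["sector", "employee_count"],
}
unavailable_fields = ["employee_count"]

def find_config_problems(mappings, instructions, categories, unavailable):
    """Report categorized fields that lack a mapping or instructions."""
    problems = []
    for category, fields in categories.items():
        for name in fields:
            if name in unavailable:
                continue  # known gap, tracked in unavailable_fields.yaml
            if name not in mappings:
                problems.append(f"{category}.{name}: no collection mapping")
            elif name not in instructions:
                problems.append(f"{category}.{name}: no interpretation instructions")
    return problems

problems = find_config_problems(
    field_mappings, field_instructions, field_categories, unavailable_fields
)
print(problems)
```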
# Run unit tests
uv run pytest
# Test YAML configuration validity
uv run python scripts/test_yaml_files.py
# Test the complete pipeline
uv run python test_sequential_simple.py --fresh
To add a new field:
- Add field mapping to config/field_mappings.yaml
- Add interpretation instructions to config/field_instructions.yaml
- Categorize the field in config/field_categories.yaml
- Run normalization to update master_search collection:
uv run python scripts/normalize_collections.py
To update field instructions from a CSV file:
uv run python scripts/process_vic_csv.py
This project uses Pulumi for infrastructure as code deployment to AWS Lambda.
- AWS CLI configured with appropriate credentials
- Pulumi CLI installed
- AWS account with permissions for Lambda, API Gateway, and CloudWatch
# Install Pulumi
curl -fsSL https://get.pulumi.com | sh
# Login to Pulumi (using local backend)
pulumi login --local
# Or login to Pulumi Cloud
pulumi login
cd pulumi
pulumi stack init dev
pulumi config set aws:region ca-central-1
pulumi up
cd pulumi
pulumi stack select staging
pulumi config set aws:region ca-central-1
pulumi config set finks-naive:environment staging
pulumi up
cd pulumi
pulumi stack select production
pulumi config set aws:region ca-central-1
pulumi config set finks-naive:environment production
pulumi config set finks-naive:reservedConcurrency 10
pulumi up
# List all stacks
pulumi stack ls
# Switch between stacks
pulumi stack select dev
# View stack outputs (API endpoint, etc.)
pulumi stack output
# Destroy stack resources
pulumi destroy
See deployment_config.yaml for detailed Lambda and API Gateway settings including:
- Memory allocation and timeout settings
- Environment variables
- Caching configuration
- Monitoring and alerting
- Cost optimization strategies
Once running, visit:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
POST /api/v1/query: convert a natural language query to a MongoDB query. Request body:
{
  "query": "Find profitable tech companies",
  "max_results": 10,
  "use_cache": true
}
Health check endpoint
The system includes several optimization strategies:
- Parallel Agent Execution: Field and sorting extraction run concurrently
- Multi-level Caching: In-memory and MongoDB-based query caching
- Query Fingerprinting: Normalizes queries for better cache hits
- Connection Pooling: Optimized MongoDB connections for Lambda
- ARM Architecture: Uses AWS Graviton2 for cost efficiency
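Query fingerprinting and multi-level caching can be illustrated together. The sketch below is an assumption about how these ideas might fit, not the project's implementation: the normalization rules (lowercasing, stripping punctuation, collapsing whitespace), the use of SHA-256, and the `TwoLevelCache` class are hypothetical, and a plain dict stands in for the MongoDB-backed level.

```python
import hashlib
import re

def fingerprint(query: str) -> str:
    """Normalize a query and hash it so equivalent phrasings share a cache key."""
    normalized = re.sub(r"[^\w\s]", "", query.lower())  # drop punctuation
    normalized = " ".join(normalized.split())           # collapse whitespace
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

class TwoLevelCache:
    """In-memory cache in front of a slower persistent store."""

    def __init__(self, backing_store: dict):
        self.memory: dict = {}
        self.backing_store = backing_store  # a dict stands in for a MongoDB collection

    def get(self, query: str):
        key = fingerprint(query)
        if key in self.memory:
            return self.memory[key]
        if key in self.backing_store:
            # Promote to the fast level so the next lookup skips the store.
            self.memory[key] = self.backing_store[key]
            return self.memory[key]
        return None

    def put(self, query: str, result) -> None:
        key = fingerprint(query)
        self.memory[key] = result
        self.backing_store[key] = result

shared_store: dict = {}
warm = TwoLevelCache(shared_store)
warm.put("Find profitable tech companies", {"filter": {"sector": "Technology"}})

# A fresh instance (for example, a new Lambda container) has empty memory
# but can still hit the shared store.
cold = TwoLevelCache(shared_store)
print(cold.get("find  profitable tech companies!"))
```

In a Lambda setting the in-memory level only survives while a container stays warm, which is why a second, shared level is useful.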
When deployed, the system provides CloudWatch metrics:
- Query processing time
- Cache hit rate
- Concurrent queries
- Error rates
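To make these figures concrete, the sketch below shows how cache hit rate, error rate, and average processing time could be derived from per-query records. It is a local illustration only: the `QueryMetrics` class is hypothetical and does not publish anything to CloudWatch, and the sample durations are made up for the example.

```python
from dataclasses import dataclass, field

@dataclass
class QueryMetrics:
    """Accumulates per-query observations and derives summary rates."""
    durations_ms: list[float] = field(default_factory=list)
    cache_hits: int = 0
    errors: int = 0

    def record(self, duration_ms: float, cache_hit: bool = False, error: bool = False) -> None:
        self.durations_ms.append(duration_ms)
        self.cache_hits += int(cache_hit)
        self.errors += int(error)

    def summary(self) -> dict:
        total = len(self.durations_ms)
        if total == 0:
            return {"queries": 0}
        return {
            "queries": total,
            "avg_processing_ms": sum(self.durations_ms) / total,
            "cache_hit_rate": self.cache_hits / total,
            "error_rate": self.errors / total,
        }

metrics = QueryMetrics()
metrics.record(120.0)                 # cache miss, full pipeline
metrics.record(8.0, cache_hit=True)   # served from cache
metrics.record(300.0, error=True)     # failed query
metrics.record(12.0, cache_hit=True)
print(metrics.summary())
```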
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is proprietary and confidential.
For issues and questions:
- Check the deployment documentation
- Review the API documentation
- Contact the development team