AI-powered document search and Q&A system
Transform your documents into intelligent, searchable knowledge. Upload files, ask questions, and get contextual answers with source attribution.
- Semantic Search: Vector-based document discovery with Azure AI Search
- AI Answers: GPT-powered contextual responses from your documents
- Multi-Format: PDF, DOCX, TXT, and Markdown support
- Chat Memory: Conversational AI with context awareness
- Optimized: 80-95% cost reduction through smart context windows
- Secure: Token-based file downloads with expiration
- Smart Queries: AI-enhanced search with query expansion
- .NET 10.0 SDK
- Azure OpenAI Service (text-embedding-ada-002, gpt-5.4)
- Azure AI Search Service
- Azure Blob Storage
# Clone and setup
git clone <repository-url>
cd DriftMind
dotnet restore
# Configure services (see Configuration section)
# Edit appsettings.json with your Azure credentials
# Run the application
dotnet run
# Access Swagger UI
open http://localhost:5175/swagger
Pre-built Docker images are available on GitHub Container Registry (GHCR):
- Release builds: Tagged with version numbers (v0.0.24-alpha) and latest
- Security builds: Weekly rebuilds every Monday to include upstream security fixes
- Manual builds: Available for critical security updates
docker pull ghcr.io/justdanman/driftmind:latest
- Azure OpenAI Service
  - Deploy text-embedding-ada-002 model
  - Deploy gpt-5.4 model
  - Note endpoint and API key
- Azure AI Search Service
  - Create service (Basic tier recommended)
  - Note endpoint and admin API key
- Azure Blob Storage
  - Create storage account
  - Create documents container
  - Note connection string
Update appsettings.json:
{
"AzureOpenAI": {
"Endpoint": "https://your-openai.openai.azure.com/",
"ApiKey": "your-api-key",
"EmbeddingDeploymentName": "text-embedding-ada-002",
"ChatDeploymentName": "gpt-5.4",
"AnswerReasoningEffort": "",
"QueryExpansionReasoningEffort": ""
},
"AzureSearch": {
"Endpoint": "https://your-search.search.windows.net",
"ApiKey": "your-search-api-key"
},
"AzureStorage": {
"ConnectionString": "your-storage-connection-string",
"ContainerName": "documents"
}
}
Reasoning effort is optional and should only be configured for reasoning-capable deployments such as o3, o4-mini, or GPT-5 reasoning deployments that support it. Supported values depend on model and SDK support and are typically low, medium, high, and in some environments minimal. Leave the reasoning values empty for standard chat deployments.
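The rule above (empty means "send nothing", otherwise one of a few known values) can be sketched as a small validation step. This is a minimal Python illustration, not the project's implementation: the helper names `normalize_reasoning_effort` and `chat_options`, and the `reasoning_effort` option key, are assumptions made for the example.

```python
from typing import Optional

# Values the README names as typical; real support depends on model and SDK.
KNOWN_EFFORTS = {"minimal", "low", "medium", "high"}


def normalize_reasoning_effort(raw: Optional[str]) -> Optional[str]:
    """Return a lowercase effort value, or None when the setting is empty."""
    if raw is None or not raw.strip():
        return None  # standard chat deployment: send no reasoning setting
    value = raw.strip().lower()
    if value not in KNOWN_EFFORTS:
        raise ValueError(f"unsupported reasoning effort: {raw!r}")
    return value


def chat_options(config: dict) -> dict:
    """Build hypothetical per-call options from the AzureOpenAI config section."""
    options = {"deployment": config["ChatDeploymentName"]}
    effort = normalize_reasoning_effort(config.get("AnswerReasoningEffort"))
    if effort is not None:
        # Only reasoning-capable deployments should receive this key.
        options["reasoning_effort"] = effort
    return options
```

Leaving the key out entirely, rather than sending an empty string, keeps standard chat deployments from receiving a parameter they may reject.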
POST /upload
Upload and process documents into searchable chunks.
curl -X POST "http://localhost:5175/upload" \
-F "file=@document.pdf" \
-F "documentId=my-doc" \
-F "metadata=Documentation"
Response:
{
"documentId": "my-doc",
"chunksCreated": 15,
"success": true,
"message": "File processed successfully"
}
POST /search
Search documents and generate AI-powered answers.
{
"query": "How do I configure authentication?",
"maxResults": 10,
"includeAnswer": true,
"enableQueryExpansion": true,
"chatHistory": [
{
"role": "user",
"content": "What is Azure AD?",
"timestamp": "2025-08-15T10:00:00Z"
}
]
}
Response:
{
"query": "How do I configure authentication?",
"expandedQuery": "configure setup authentication Azure Active Directory",
"results": [
{
"id": "doc-123_5",
"content": "To configure authentication...",
"documentId": "doc-123",
"chunkIndex": 5,
"score": 0.87,
"vectorScore": 0.85,
"metadata": "File: auth-guide.pdf",
"createdAt": "2025-08-15T10:00:00Z",
"originalFileName": "auth-guide.pdf",
"contentType": "application/pdf",
"fileSizeBytes": 1048576,
"blobPath": "documents/auth-guide.pdf"
}
],
"generatedAnswer": "Based on your documents, here's how to configure authentication...",
"success": true,
"totalResults": 1
}
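A client might wrap the /search exchange above in two small helpers: one that assembles the request body and one that pulls source attribution out of the response. The sketch below only builds and reads dictionaries using the field names shown in the examples; the function names `build_search_request` and `cite_sources` are hypothetical, and sending the request over HTTP is left to the caller.

```python
def build_search_request(query, history=(), max_results=10):
    """Assemble a /search body using the field names from the example request.

    `history` is an iterable of (role, content, timestamp) tuples.
    """
    return {
        "query": query,
        "maxResults": max_results,
        "includeAnswer": True,
        "enableQueryExpansion": True,
        "chatHistory": [
            {"role": role, "content": content, "timestamp": timestamp}
            for role, content, timestamp in history
        ],
    }


def cite_sources(response):
    """Return (file name, chunk index, score) per result, best score first."""
    hits = sorted(response.get("results", []),
                  key=lambda r: r["score"], reverse=True)
    return [(r["originalFileName"], r["chunkIndex"], r["score"]) for r in hits]
```

Passing earlier turns in `chatHistory` is what lets a follow-up question be interpreted in context.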
GET /documents
List all documents using query parameters.
Query Parameters:
- maxResults (optional, 1-100, default: 50)
- skip (optional, default: 0)
- documentId (optional filter)
POST /documents
List all documents using request body.
{
"maxResults": 20,
"skip": 0,
"documentIdFilter": "optional-filter"
}
Response:
{
"documents": [
{
"documentId": "doc-123",
"chunkCount": 15,
"fileName": "auth-guide.pdf",
"fileType": ".pdf",
"fileSizeBytes": 1048576,
"metadata": "File: auth-guide.pdf",
"createdAt": "2025-08-15T10:00:00Z",
"lastUpdated": "2025-08-15T10:00:00Z",
"sampleContent": [
"This guide covers authentication...",
"Chapter 1: Getting Started..."
]
}
],
"totalDocuments": 1,
"returnedDocuments": 1,
"success": true,
"message": "Retrieved 1 documents successfully."
}
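Because listings are paged with `maxResults` and `skip`, a client that wants every document has to loop. The following Python sketch shows one way to do that. It assumes `totalDocuments` is the count across all pages, which the example response suggests but does not state, and it takes a caller-supplied `fetch_page` function (hypothetical) so the HTTP call stays out of the example.

```python
from typing import Callable, Iterator


def iter_documents(fetch_page: Callable[[int, int], dict],
                   page_size: int = 50) -> Iterator[dict]:
    """Yield every document by advancing `skip` one page at a time.

    `fetch_page(max_results, skip)` must return a parsed listing response.
    """
    if not 1 <= page_size <= 100:
        raise ValueError("maxResults must be between 1 and 100")
    skip = 0
    while True:
        page = fetch_page(page_size, skip)
        docs = page.get("documents", [])
        yield from docs
        skip += len(docs)
        # Stop on an empty page or once the reported total has been reached.
        if not docs or skip >= page.get("totalDocuments", 0):
            break
```

The empty-page check guards against looping forever if the total changes while paging.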
DELETE /documents/{documentId}
Delete a document and all of its chunks.
POST /documents/delete
Alternative delete endpoint using a JSON body:
{
"documentId": "doc-123"
}
POST /download/token
Generate a secure, time-limited download token.
{
"documentId": "doc-123",
"expirationMinutes": 15
}
Response:
{
"token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
"documentId": "doc-123",
"expiresAt": "2025-08-15T10:15:00Z",
"downloadUrl": "/download/file",
"success": true
}
POST /download/file
Download a file using a secure token in the request body.
{
"token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
}
Returns binary file download with appropriate headers.
POST /admin/migrate/optimize-metadata
Optimize storage by consolidating metadata to the first chunk only.
POST /admin/migrate/fix-content-types
Fix incorrect MIME types for existing documents.
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│ File Upload │───▶│ Text Chunks │───▶│ Embeddings  │
└─────────────┘    └─────────────┘    └─────────────┘
       │                                     │
       ▼                                     ▼
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│ Blob Storage│    │ AI Search   │    │ Vector Store│
└─────────────┘    └─────────────┘    └─────────────┘
       │                  │                  │
       └──────────────────┼──────────────────┘
                          ▼
               ┌─────────────────────┐
               │ Search & AI Answers │
               └─────────────────────┘
- DocumentProcessingService: File upload and indexing
- SearchService: Vector and semantic search
- ChatService: AI answer generation with context
- QueryExpansionService: Intelligent query enhancement
- BlobStorageService: File storage and retrieval
Uses an adjacent-chunks strategy for an 80-95% cost reduction compared with sending full documents:
- Smart context windows instead of full documents
- Maintains document flow and coherence
- Linear scaling with document count
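The adjacent-chunks idea can be illustrated with a short Python sketch: instead of sending a whole document to the model, take each matching chunk plus a few neighbors on either side. The `radius` parameter and the function names are assumptions for illustration; the project's actual window size and merging rules are not described here.

```python
def context_window(hit_indices, total_chunks, radius=1):
    """Return sorted chunk indices: each hit plus `radius` neighbors per side."""
    selected = set()
    for index in hit_indices:
        first = max(0, index - radius)
        last = min(total_chunks - 1, index + radius)
        selected.update(range(first, last + 1))
    return sorted(selected)


def build_context(chunks, hit_indices, radius=1):
    """Join the selected chunks in document order to preserve reading flow."""
    window = context_window(hit_indices, len(chunks), radius)
    return "\n".join(chunks[i] for i in window)
```

For a 15-chunk document with a single hit and `radius=1`, the prompt carries 3 chunks instead of 15, which is where the token savings come from. Overlapping windows are merged by the set, so neighboring hits do not duplicate text.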
- Query Expansion: Automatically enhances vague queries
- Multi-Language: German/English cross-language search
- Chat Memory: Contextual conversations with history
- Smart Relevance: Hybrid vector + text scoring
- Metadata Efficiency: 98% storage reduction through smart indexing
- Embedding Cache: 80-90% API call reduction
- Batch Processing: Optimized bulk operations
- Linear Scaling: Predictable cost growth
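An embedding cache of the kind listed above can be sketched as a dictionary keyed by a hash of the input text, so repeated queries or re-uploaded chunks skip the embedding API. This is a hypothetical in-memory version; the class name, the whitespace normalization, and the injected `embed_fn` are assumptions, and the project's real cache may differ in storage and eviction.

```python
import hashlib


class EmbeddingCache:
    """In-memory cache in front of an embedding function."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn  # callable: text -> embedding vector
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(text):
        # Collapse whitespace so trivially different inputs share one entry.
        normalized = " ".join(text.split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get(self, text):
        """Return the cached embedding, computing it only on a miss."""
        key = self._key(text)
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = self._embed_fn(text)
        return self._store[key]
```

The `hits` and `misses` counters make it easy to measure how many API calls the cache actually avoids for a given workload.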
- Token-based: HMAC-SHA256 signed download tokens
- Time-limited: 15-minute default, 60-minute maximum
- Audit logging: All download activity tracked
- No direct URLs: Files never accessible via direct links
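To make the token properties above concrete, here is a minimal Python sketch of an HMAC-SHA256 signed, time-limited token with the 15-minute default and 60-minute cap. It uses a simplified two-part `payload.signature` format rather than whatever format the service actually emits, and the function names and claim keys are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json
import time

DEFAULT_MINUTES = 15
MAX_MINUTES = 60


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: bytes, payload: str) -> str:
    return _b64(hmac.new(secret, payload.encode("ascii"), hashlib.sha256).digest())


def issue_token(secret, document_id, minutes=DEFAULT_MINUTES, now=None):
    """Return a signed token that expires `minutes` after `now`."""
    if not 1 <= minutes <= MAX_MINUTES:
        raise ValueError("expiration must be between 1 and 60 minutes")
    issued = time.time() if now is None else now
    claims = {"documentId": document_id, "exp": int(issued + minutes * 60)}
    payload = _b64(json.dumps(claims).encode("utf-8"))
    return f"{payload}.{_sign(secret, payload)}"


def verify_token(secret, token, now=None):
    """Return the document ID if the token is authentic and unexpired, else None."""
    try:
        payload, signature = token.split(".")
        expected = _sign(secret, payload)
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(signature.encode("ascii"),
                                   expected.encode("ascii")):
            return None
        claims = json.loads(_unb64(payload))
    except ValueError:  # wrong shape, non-ASCII, bad base64, or bad JSON
        return None
    current = time.time() if now is None else now
    return claims["documentId"] if current < claims["exp"] else None
```

The signature is checked before the payload is decoded, so untrusted claims are never parsed, and the expiry lives inside the signed payload so a client cannot extend it.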
- Use Azure Key Vault for secrets in production
- Enable managed identity for Azure services
- Configure proper CORS policies
- Implement rate limiting as needed
Additional detailed documentation:
- Azure Blob Storage Integration
- Chat History Integration
- Query Expansion Feature
- Adjacent Chunks Optimization
- PDF/Word Integration
MIT License - see LICENSE file for details.
The following third-party packages are used in this project. Their respective licenses apply to those components. The overall project license remains MIT as noted above.
| Package | Version | License | Copyright |
|---|---|---|---|
| Azure.AI.OpenAI | 2.1.0 | MIT | © Microsoft Corporation |
| Azure.Search.Documents | 11.7.0 | MIT | © Microsoft Corporation |
| Azure.Storage.Blobs | 12.27.0 | MIT | © Microsoft Corporation |
| DocumentFormat.OpenXml | 3.5.1 | MIT | © Microsoft Corporation |
| PdfPig | 0.1.14 | Apache-2.0 | © UglyToad / PdfPig Contributors |
| Microsoft.AspNetCore.OpenApi | 10.0.5 | MIT | © Microsoft Corporation |
| Swashbuckle.AspNetCore | 10.1.7 | MIT | © Swashbuckle Contributors |
Full license texts: see THIRD-PARTY-NOTICES.md.
Built with ❤️ for intelligent document search and AI-powered knowledge management.