LifeTrace is an AI-powered life-recording system that helps users record and retrieve their daily activity through automatic screenshot capture, OCR text recognition, and search. Recorded content can be searched by keyword, by semantic similarity, or multimodally across text and images.
- Automatic Screenshot Recording: Timed automatic screen capture to record user activities
- Intelligent OCR Recognition: Uses RapidOCR to extract text content from screenshots
- Multimodal Search: Supports text, image, and semantic search
- Vector Database: Efficient vector storage and retrieval based on ChromaDB
- Web API Service: Provides complete RESTful API interfaces
- Frontend Integration: Supports integration with various frontend frameworks
┌─────────────────────────────────────────────────────────────┐
│               LifeTrace Backend Architecture                │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   ┌─────────────┐     ┌─────────────┐     ┌─────────────┐   │
│   │   Web API   │     │ Frontend UI │     │ Admin Tools │   │
│   │  (FastAPI)  │     │             │     │             │   │
│   └─────────────┘     └─────────────┘     └─────────────┘   │
│          │                   │                   │          │
│          └───────────────────┼───────────────────┘          │
│                              │                              │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │                      Core Services                      │ │
│ ├─────────────────────────────────────────────────────────┤ │
│ │                                                         │ │
│ │   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐   │ │
│ │   │Screenshot   │   │File         │   │OCR          │   │ │
│ │   │Recorder     │   │Processor    │   │Service      │   │ │
│ │   └─────────────┘   └─────────────┘   └─────────────┘   │ │
│ │          │                 │                 │          │ │
│ │          └─────────────────┼─────────────────┘          │ │
│ │                            │                            │ │
│ │   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐   │ │
│ │   │Vector       │   │Multimodal   │   │Storage      │   │ │
│ │   │Service      │   │Service      │   │Manager      │   │ │
│ │   └─────────────┘   └─────────────┘   └─────────────┘   │ │
│ └─────────────────────────────────────────────────────────┘ │
│                              │                              │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │                      Data Storage                       │ │
│ ├─────────────────────────────────────────────────────────┤ │
│ │                                                         │ │
│ │   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐   │ │
│ │   │SQLite DB    │   │Vector DB    │   │File Storage │   │ │
│ │   │Metadata     │   │ChromaDB     │   │Screenshots  │   │ │
│ │   └─────────────┘   └─────────────┘   └─────────────┘   │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
RESTful API service built on FastAPI, providing the following main endpoints:
- Screenshot Management
  - GET /api/screenshots - Get screenshot list
  - GET /api/screenshots/{id} - Get single screenshot details
  - GET /api/screenshots/{id}/image - Get screenshot image file
- Search Services
  - POST /api/search - Traditional keyword search
  - POST /api/semantic-search - Semantic search
  - POST /api/multimodal-search - Multimodal search
- System Management
  - GET /api/statistics - Get system statistics
  - GET /api/config - Get system configuration
  - GET /api/health - Health check
  - POST /api/cleanup - Clean old data
- Vector Database Management
  - GET /api/vector-stats - Vector database statistics
  - POST /api/vector-sync - Sync vector database
  - POST /api/vector-reset - Reset vector database
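
As a rough illustration of how a client might talk to these endpoints, the sketch below builds GET and JSON POST requests with the Python standard library. The base URL uses the default host and port from the configuration; the helper names `build_request` and `call` are hypothetical, and the request body keys are not specified in this document, so any payload you pass is an assumption to check against the Swagger UI.

```python
import json
import urllib.request

# Default host and port from the configuration file; adjust if changed.
BASE_URL = "http://127.0.0.1:8840"


def build_request(path, payload=None):
    """Build a GET request, or a JSON POST request when a payload is given."""
    url = BASE_URL + path
    if payload is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def call(path, payload=None, timeout=10):
    """Send the request and decode the JSON response body.

    Assumes the endpoint returns JSON; this needs a running server.
    """
    request = build_request(path, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `call("/api/health")` would query the health check, and something like `call("/api/semantic-search", {"query": "meeting notes"})` would run a semantic search, provided the endpoint accepts a `query` field (an assumption here).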
Defines the core data models of the system:
- Screenshot: Screenshot record model
- OCRResult: OCR recognition result model
- SearchIndex: Search index model
- ProcessingQueue: Processing queue model
Unified configuration management system:
- Supports YAML configuration files
- Environment variable override
- Default configuration
- Configuration validation and type conversion
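
The layering described above (defaults, then file values, then environment variables, with type conversion) can be sketched roughly as follows. This is not the code in config.py: the `LIFETRACE_SECTION_KEY` variable naming and all function names are assumptions made for illustration, and YAML parsing is left out, so `file_values` stands for an already parsed configuration file.

```python
import os

# A subset of the defaults shown in config/default_config.yaml.
DEFAULTS = {
    "server": {"host": "127.0.0.1", "port": 8840, "debug": False},
    "multimodal": {"enabled": True, "text_weight": 0.6, "image_weight": 0.4},
}


def deep_merge(base, override):
    """Return a new dict where values in override replace those in base."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def convert(raw, reference):
    """Convert an environment string to the type of the existing value."""
    if isinstance(reference, bool):  # bool must be checked before int
        lowered = raw.strip().lower()
        if lowered in ("1", "true", "yes", "on"):
            return True
        if lowered in ("0", "false", "no", "off"):
            return False
        raise ValueError(f"not a boolean: {raw!r}")
    if isinstance(reference, int):
        return int(raw)
    if isinstance(reference, float):
        return float(raw)
    return raw


def apply_env(config, environ, prefix="LIFETRACE"):
    """Override values from variables named PREFIX_SECTION_KEY (hypothetical scheme)."""
    result = {}
    for section, values in config.items():
        if not isinstance(values, dict):
            result[section] = values
            continue
        result[section] = {}
        for key, value in values.items():
            name = f"{prefix}_{section}_{key}".upper()
            result[section][key] = convert(environ[name], value) if name in environ else value
    return result


def load_config(file_values=None, environ=None):
    """Layer defaults, file values, and environment overrides, in that order."""
    merged = deep_merge(DEFAULTS, file_values or {})
    return apply_env(merged, os.environ if environ is None else environ)
```

Under this scheme, setting `LIFETRACE_SERVER_PORT=9000` in the environment would take precedence over both the file and the built-in default.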
Database management and storage services:
- DatabaseManager: SQLite database management
- Transaction management support
- Automatic database migration
- Connection pool management
- Data cleanup and maintenance
Image text recognition service:
- SimpleOCRProcessor: Text recognition based on RapidOCR
- Supports multiple image formats
- Batch processing capability
- Result caching mechanism
- Integration with vector services
Vector search services:
- VectorService: Text semantic search service
- Text embedding based on sentence-transformers
- ChromaDB vector database storage
- Supports reranking
- Automatic synchronization mechanism
- MultimodalVectorService: Image + text joint search
- Multimodal embedding based on CLIP model
- Separate text and image vector storage
- Weight fusion search algorithm
- Cross-modal semantic understanding
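
To make the weight fusion idea concrete, here is a minimal sketch that combines per-screenshot similarity scores from the text and image collections with a weighted sum. It assumes both score sets are already on a comparable scale and that an item missing from one collection contributes zero; the actual algorithm in MultimodalVectorService may differ. The default weights mirror `text_weight` and `image_weight` from the configuration.

```python
def fuse_scores(text_scores, image_scores, text_weight=0.6, image_weight=0.4):
    """Rank items by a weighted sum of text and image similarity.

    Both arguments map an item id to a similarity score. Items missing
    from one mapping are treated as scoring 0.0 there (an assumption).
    """
    total = text_weight + image_weight
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    fused = {}
    for item_id in set(text_scores) | set(image_scores):
        text_part = text_weight * text_scores.get(item_id, 0.0)
        image_part = image_weight * image_scores.get(item_id, 0.0)
        # Dividing by the total keeps the result on the input scale
        # even when the weights do not sum to 1.
        fused[item_id] = (text_part + image_part) / total
    # Highest score first; ties are broken by item id for stable output.
    return sorted(fused.items(), key=lambda pair: (-pair[1], pair[0]))
```

Raising `text_weight` favors screenshots whose OCR text matches the query, while raising `image_weight` favors visual similarity.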
File system monitoring and processing:
- FileProcessor: File monitoring and processing
- ScreenshotHandler: Screenshot file event handling
- Asynchronous processing queue
- File change monitoring
- Batch processing optimization
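
The combination of an asynchronous queue and batch processing can be sketched with the standard library alone. In the sketch below, file events would be pushed onto a queue by the monitoring side, and a worker thread drains them in small batches. The names `take_batch` and `worker`, the batch size, and the timeout are illustrative assumptions, not the interface of FileProcessor.

```python
import queue
import threading


def take_batch(pending, max_items, timeout=0.5):
    """Wait briefly for one item, then grab up to max_items without blocking."""
    batch = []
    try:
        batch.append(pending.get(timeout=timeout))
        while len(batch) < max_items:
            batch.append(pending.get_nowait())
    except queue.Empty:
        pass  # Return whatever was collected, possibly an empty list.
    return batch


def worker(pending, handle_batch, stop, max_items=8):
    """Process queued paths in batches until stop is set and the queue is empty."""
    while not (stop.is_set() and pending.empty()):
        batch = take_batch(pending, max_items)
        if batch:
            handle_batch(batch)
```

A producer would call `pending.put(path)` for each new screenshot, and the worker would be started with `threading.Thread(target=worker, args=(pending, handler, stop_event))`. Batching lets the handler amortize per-call overhead such as model invocation or database commits.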
Automatic screenshot functionality:
- ScreenRecorder: Screen recording management
- Multi-screen support
- Intelligent deduplication mechanism
- Configurable screenshot interval
- Active window information acquisition
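
One simple way to deduplicate captures is to compare a hash of each new image with the previous capture from the same screen. The sketch below assumes exact byte-level comparison with SHA-256 and a hypothetical `Deduplicator` class; the recorder's actual mechanism is not described here and may use a perceptual or similarity-based comparison instead.

```python
import hashlib


class Deduplicator:
    """Remember the last capture hash per screen and flag exact repeats."""

    def __init__(self):
        self._last_digest = {}

    def is_duplicate(self, screen_id, image_bytes):
        """Return True if image_bytes equals the previous capture for this screen."""
        digest = hashlib.sha256(image_bytes).hexdigest()
        if self._last_digest.get(screen_id) == digest:
            return True
        # New content: remember it so the next capture is compared against it.
        self._last_digest[screen_id] = digest
        return False
```

A recorder loop would call `is_duplicate(screen_id, data)` before writing each capture and skip the write when it returns True. Exact hashing only catches pixel-identical frames, so a changing clock or cursor defeats it.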
Common utility functions:
- Log configuration management
- File hash calculation
- Active window information acquisition
- Cross-platform compatibility
- File cleanup tools
Screenshot   → File Monitor  → OCR Process  → Vector     → Storage
    ↓              ↓               ↓             ↓            ↓
Scheduled      File Events     Text Extract   Embedding    Database
    ↓              ↓               ↓             ↓            ↓
Multi-screen   Queue Process   RapidOCR       CLIP         SQLite
                                                 ↓            ↓
                                                   Vector DB
                                                  (ChromaDB)
                User Query
                     ↓
┌─────────────┬─────────────┬─────────────┐
│Keyword      │Semantic     │Multimodal   │
│Search       │Search       │Search       │
├─────────────┼─────────────┼─────────────┤
│SQL LIKE     │Vector       │Image-Text   │
│Full-text    │Similarity   │Fusion       │
│Exact Match  │Semantic     │CLIP Model   │
│             │Understanding│Cross-modal  │
└─────────────┴─────────────┴─────────────┘
       ↓             ↓             ↓
Result Ranking → Reranking → Weight Fusion
                     ↓
           Unified Result Format
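
The keyword branch of this flow can be illustrated with a small SQL LIKE query against SQLite. The table and column names (`ocr_results`, `screenshot_id`, `text_content`) and the sample rows are hypothetical and only stand in for the real schema in models.py. The sketch escapes `%` and `_` in the user's term so they match literally, and pages results with LIMIT and OFFSET.

```python
import sqlite3


def create_demo_db():
    """Create an in-memory database with a hypothetical OCR text table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ocr_results (screenshot_id INTEGER, text_content TEXT)")
    conn.executemany(
        "INSERT INTO ocr_results VALUES (?, ?)",
        [
            (1, "Quarterly budget review"),
            (2, "Flight booking confirmation"),
            (3, "Budget spreadsheet, 100% complete"),
        ],
    )
    return conn


def escape_like(term):
    """Escape LIKE wildcards so the term is matched literally."""
    return term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def keyword_search(conn, term, limit=20, offset=0):
    """Return (screenshot_id, text) rows whose text contains the term."""
    pattern = "%" + escape_like(term) + "%"
    cursor = conn.execute(
        "SELECT screenshot_id, text_content FROM ocr_results "
        "WHERE text_content LIKE ? ESCAPE '\\' "
        "ORDER BY screenshot_id LIMIT ? OFFSET ?",
        (pattern, limit, offset),
    )
    return cursor.fetchall()
```

SQLite's LIKE is case-insensitive for ASCII characters by default, so a search for "budget" here would match both "budget" and "Budget". Passing the pattern as a bound parameter, rather than formatting it into the SQL string, avoids SQL injection.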
- FastAPI: Web framework and API service
- SQLAlchemy: ORM and database operations
- SQLite: Main database
- ChromaDB: Vector database
- RapidOCR: Text recognition engine
- sentence-transformers: Text embedding models
- CLIP: Multimodal embedding model
- transformers: Transformer model library
- Pillow: Image processing
- watchdog: File system monitoring
- psutil: System information acquisition
- pydantic: Data validation
- Python 3.8+
- Supported OS: Windows, macOS, Linux
- Optional: CUDA support (for GPU acceleration)
pip install -r requirements.txt

Main configuration file: config/default_config.yaml

server:
  host: 127.0.0.1
  port: 8840
  debug: false
vector_db:
  enabled: true
  collection_name: "lifetrace_ocr"
  embedding_model: "shibing624/text2vec-base-chinese"
  rerank_model: "BAAI/bge-reranker-base"
  persist_directory: "vector_db"
multimodal:
  enabled: true
  text_weight: 0.6
  image_weight: 0.4

# Start all services
python start_all_services.py

# Start only the web API service
python -m lifetrace_backend.server --port 8840

# Start recorder
python -m lifetrace_backend.recorder
# Start processor
python -m lifetrace_backend.processor
# Start OCR service
python -m lifetrace_backend.simple_ocr

After starting the service, access API documentation at:
- Swagger UI: http://localhost:8840/docs
- ReDoc: http://localhost:8840/redoc
LifeTrace/
├── lifetrace_backend/        # Core modules
│   ├── server.py             # Web API service
│   ├── models.py             # Data models
│   ├── config.py             # Configuration management
│   ├── storage.py            # Storage management
│   ├── simple_ocr.py         # OCR processing
│   ├── vector_service.py     # Vector service
│   ├── multimodal_*.py       # Multimodal services
│   ├── processor.py          # File processing
│   ├── recorder.py           # Screen recording
│   └── utils.py              # Utility functions
├── config/                   # Configuration files
├── doc/                      # Documentation
├── data/                     # Data directory
├── logs/                     # Log directory
└── requirements.txt          # Dependencies
- Add new search algorithms: Extend vector_service.py
- Support new OCR engines: Modify simple_ocr.py
- Add new API endpoints: Extend server.py
- Custom data models: Modify models.py
- Regular index rebuilding
- Batch insert optimization
- Memory usage monitoring
- Image preprocessing
- Batch processing
- Result caching
- Result pagination
- Query caching
- Index optimization
- Log files: logs/lifetrace_YYYYMMDD.log
- Log levels: DEBUG, INFO, WARNING, ERROR
- Regular cleanup of old data
- Database backup
- Index rebuilding
- Service health check: GET /api/health
- System statistics: GET /api/statistics
- Queue status: GET /api/queue/status
- Vector database initialization failure
  - Check ChromaDB dependency installation
  - Verify data directory permissions
- Poor OCR recognition quality
  - Adjust image preprocessing parameters
  - Check RapidOCR model files
- Multimodal search unavailable
  - Install CLIP-related dependencies
  - Check model download status
Start the server in debug mode:
python -m lifetrace_backend.server --debug

Contribution steps:
- Fork the project
- Create a feature branch
- Commit changes
- Create a Pull Request
This project is licensed under the MIT License.
For detailed documentation, please refer to the doc/ directory: