Autonomous AI Software Engineer

An enterprise-grade LangGraph pipeline that autonomously writes, executes, and debugs Python code inside an isolated Docker sandbox, with multi-model LLM failover.

    ┌─────────────────────────────────────────────────────────────┐
    │                     Streamlit Dashboard                     │
    │           (app.py: Control Center + Live Console)           │
    └────────────────────────┬────────────────────────────────────┘
                             │
                             ▼
    ┌─────────────────────────────────────────────────────────────┐
    │                   LangGraph State Machine                   │
    │                     (src/core/graph.py)                     │
    │                                                             │
    │  ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌─────────┐   │
    │  │ Planner  │──▸│  Coder   │──▸│ Terminal │──▸│Debugger │   │
    │  └──────────┘   └──────────┘   └────┬─────┘   └────┬────┘   │
    │                                     │              │        │
    │                                     ├──────────────┘        │
    │                                     │ (repair loop,         │
    │                                     │  capped at 5)         │
    └─────────────────────────────────────┼───────────────────────┘
                                          │
                            ┌─────────────▼──────────────┐
                            │       Docker Sandbox       │
                            │     (python:3.11-slim)     │
                            │    Ephemeral, read-only    │
                            │   10s execution timeout    │
                            └────────────────────────────┘

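
The sandbox constraints shown in the diagram (ephemeral container, read-only filesystem, 10-second timeout) could be expressed roughly as follows. This is a minimal sketch, not the code in src/agents/terminal.py: the helper names `build_sandbox_command` and `run_in_sandbox`, the `/sandbox` mount point, and the `--network none` flag are assumptions made for illustration.

```python
import subprocess
from pathlib import Path

SANDBOX_IMAGE = "python:3.11-slim"  # image named in the diagram
TIMEOUT_SECONDS = 10                # execution timeout named in the diagram


def build_sandbox_command(script_path: str) -> list:
    """Build a `docker run` argv for one throwaway, read-only container."""
    script = Path(script_path).resolve()
    return [
        "docker", "run",
        "--rm",               # ephemeral: remove the container when it exits
        "--read-only",        # mount the container's root filesystem read-only
        "--network", "none",  # assumption: generated code needs no network
        "-v", f"{script.parent}:/sandbox:ro",  # hypothetical mount point
        SANDBOX_IMAGE,
        "python", f"/sandbox/{script.name}",
    ]


def run_in_sandbox(script_path: str) -> dict:
    """Run the script and report stdout, stderr, and whether it timed out."""
    try:
        proc = subprocess.run(
            build_sandbox_command(script_path),
            capture_output=True,
            text=True,
            timeout=TIMEOUT_SECONDS,
        )
    except subprocess.TimeoutExpired:
        return {"stdout": "", "stderr": "execution timed out",
                "returncode": None, "timed_out": True}
    return {"stdout": proc.stdout, "stderr": proc.stderr,
            "returncode": proc.returncode, "timed_out": False}
```

Note that `subprocess.run(..., timeout=...)` stops the `docker` client process only; a production version would likely also name the container and stop it explicitly after a timeout.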
| Stage | Role | Implementation |
|---|---|---|
| Planner | Proposes target artifact and augments the user prompt | src/core/graph.py |
| Coder | Generates Python via LLM, sanitizes markdown fences, writes to disk | src/agents/coder.py |
| Terminal | Executes code in an ephemeral Docker container with 10s timeout | src/agents/terminal.py |
| Debugger | Analyzes errors, rewrites code, and re-runs (max 5 attempts) | src/agents/debugger.py |
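
The fence sanitizing done by the Coder stage can be illustrated with a small helper. This is a sketch under assumptions: `strip_markdown_fences` is a hypothetical name, and the actual logic in src/agents/coder.py may handle more cases (for example, several fenced blocks in one response).

```python
import re

FENCE = "`" * 3  # three backticks, built indirectly to keep this example readable

# One fenced block: opening fence, optional language tag, body, closing fence.
_FENCE_RE = re.compile(
    re.escape(FENCE) + r"[ \t]*[\w+-]*[ \t]*\n(.*?)" + re.escape(FENCE),
    re.DOTALL,
)


def strip_markdown_fences(llm_output: str) -> str:
    """Return the code inside the first fenced block, or the text unchanged."""
    match = _FENCE_RE.search(llm_output)
    code = match.group(1) if match else llm_output
    return code.strip() + "\n"
```

Writing only the extracted body to disk keeps prose such as "Here is the code:" out of the file that the Terminal stage later executes.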
The system uses a two-tier LLM strategy implemented in src/core/llm_fallback.py:
- Primary: Google Gemini 2.5 Flash via `ChatGoogleGenerativeAI`
- Fallback: Meta Llama-3-8B-Instruct via `ChatHuggingFace` (triggered on 429 / RESOURCE_EXHAUSTED / 503)

A graceful Visual Portfolio Simulation fallback activates automatically if all backends are unavailable, ensuring the dashboard always renders a complete demonstration.
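
The failover order described above can be sketched in plain Python. This is an illustration only, not the contents of src/core/llm_fallback.py: `invoke_with_failover`, `is_retryable`, and the callable-based backends are hypothetical stand-ins for the real LangChain chat models, and re-raising unrelated errors is an assumption about the desired behavior.

```python
from typing import Callable, Sequence

# Error markers this README lists as failover triggers.
RETRYABLE_MARKERS = ("429", "RESOURCE_EXHAUSTED", "503")


def is_retryable(error: Exception) -> bool:
    """True when the error text contains one of the failover markers."""
    text = str(error)
    return any(marker in text for marker in RETRYABLE_MARKERS)


def invoke_with_failover(
    prompt: str,
    backends: Sequence[Callable[[str], str]],
    simulate: Callable[[str], str],
) -> str:
    """Try each backend in order; use `simulate` if every backend is unavailable."""
    for backend in backends:
        try:
            return backend(prompt)
        except Exception as error:
            if not is_retryable(error):
                raise  # assumption: unrelated failures should not be masked
            # Quota or availability error: move on to the next backend.
    return simulate(prompt)
```

Here `backends` would hold the primary model first and the fallback second, and `simulate` plays the role of the simulation fallback that keeps the dashboard rendering.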
- Python 3.9+
- Docker Desktop running (required for sandboxed code execution)
- API keys configured in `.env`:
  - `GEMINI_API_KEY=your_gemini_api_key`
  - `HUGGINGFACEHUB_API_TOKEN=your_hf_token`

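
A quick way to confirm that both keys are visible to the process before launching is sketched below. It assumes the values from `.env` have already been loaded into the environment; `missing_api_keys` is a hypothetical helper, not part of the repository.

```python
import os

REQUIRED_KEYS = ("GEMINI_API_KEY", "HUGGINGFACEHUB_API_TOKEN")


def missing_api_keys(env=None) -> list:
    """Return the required key names that are unset or blank."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]
```

An empty list means both keys are present; any names returned are the ones still to be filled in.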

    # 1. Clone the repository
    git clone https://github.com/Harsh-Sharma29/Devin-s.git
    cd Devin-s

    # 2. Install pinned dependencies
    pip install -r requirements.txt

    # 3. Configure environment variables
    cp .env.example .env
    # Edit .env with your API keys

    # 4. Launch the dashboard
    python -m streamlit run app.py

The dashboard opens at http://localhost:8501.

    ├── app.py                      # Streamlit dashboard entry point
    ├── requirements.txt            # Pinned production dependencies
    ├── .env                        # API keys (gitignored)
    ├── src/
    │   ├── core/
    │   │   ├── graph.py            # LangGraph state machine & routing
    │   │   ├── llm_fallback.py     # Multi-model LLM wrapper with failover
    │   │   ├── config.py           # Configuration constants
    │   │   ├── logger.py           # Logging configuration
    │   │   └── prompts.py          # System prompt templates
    │   ├── agents/
    │   │   ├── coder.py            # Code generation agent
    │   │   ├── debugger.py         # Autonomous repair agent
    │   │   ├── terminal.py         # Docker sandbox execution agent
    │   │   └── planner.py          # Task planning agent
    │   └── tools/
    │       └── file_ops.py         # File I/O & Docker execution helpers
    ├── Dockerfile                  # Container build config
    ├── docker-compose.yml          # Multi-service orchestration
    └── tests/                      # Test suite

- Python 3.9 Compatibility: A global `importlib.metadata.packages_distributions` monkey-patch in `app.py` isolates dependency conflicts from `google-auth`/`langchain-core` on older runtimes.
- Immediate Termination Routing: The LangGraph `route_from_terminal` function returns `"END"` immediately when `detected_errors` is empty or `is_verified` is `True`, preventing unnecessary recursion loops.
- Dual State Access: All agent nodes support both Pydantic model objects and raw dictionaries for maximum compatibility across LangGraph versions.
- Session State Hygiene: On each pipeline execution, `st.session_state` buffers are fully reset before invocation to prevent stale error displays.

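
The termination routing and dual state access described in the list above might look roughly like this. It is a sketch, not the code in src/core/graph.py: `get_field`, the `attempts` field, and the `"debugger"` return label are assumptions, while `route_from_terminal`, `detected_errors`, `is_verified`, and the cap of 5 come from this README.

```python
from typing import Any

MAX_REPAIR_ATTEMPTS = 5  # repair loop cap stated in this README


def get_field(state: Any, name: str, default: Any = None) -> Any:
    """Read a field from a raw dict or from an attribute-style (Pydantic) state."""
    if isinstance(state, dict):
        return state.get(name, default)
    return getattr(state, name, default)


def route_from_terminal(state: Any) -> str:
    """Return "END" on success or when attempts run out, else "debugger"."""
    verified = get_field(state, "is_verified", False)
    errors = get_field(state, "detected_errors", [])
    if verified or not errors:
        return "END"  # nothing left to repair
    if get_field(state, "attempts", 0) >= MAX_REPAIR_ATTEMPTS:  # hypothetical field
        return "END"  # give up once the cap is reached
    return "debugger"  # hypothetical node label
```

Checking the success conditions first means a clean run never enters the repair loop, and the attempt cap bounds the loop when the errors persist.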
MIT