Your AI pair-programmer in the terminal. Describe what you want — the CLI reads your codebase, plans changes, writes code, runs commands, spawns sub-agents for complex tasks, and saves every session. All powered by DeepSeek models.
- Node.js 18+ (or Docker — see below)
- A DeepSeek (or OpenAI-compatible) API key and base URL
git clone <repo-url>
cd Deepseek-Agentic-CLI

Create a .env file (git-ignored):

DEEPSEEK_API_KEY=your_api_key_here
DEEPSEEK_BASE_URL=https://api.deepseek.com/v1

Install and run:

npm install
node main.js

docker-compose up --build

The Docker setup mounts your .env as a Docker secret. No Node.js installation required.
npm test # Full test suite
npm run test:unit # Unit tests only (offline)

- Multi-turn conversation: persistent chat history with automatic context window management (`/save`, auto-save).
- Streaming output: real-time token-by-token display of model responses, including reasoning/thinking blocks.
- Reasoning/Thinking content: optional display of the model's chain-of-thought (DeepSeek `thinking.type`).
- Tool-use agent loop: the model can autonomously invoke tools in parallel, receive results, and continue the conversation.
- Role-based sub-agent delegation: delegate tasks to specialised sub-agents (e.g. `execution`) with isolated context, independent terminal windows, and role-specific tool sets.
- Agent / Plan mode: toggle between unrestricted Agent mode and restricted Plan mode (file mutations blocked outside `artifacts/active/`). Iteration guard prevents unbounded tool loops.
- Token estimation & cost tracking: real-time input/output token counts via `tiktoken` (with heuristic fallback), accumulated cost estimates, and per-turn audit trail (`/audit`).
- Sliding context window: when approaching the token limit, older messages are pruned while preserving conversation integrity.
- Session commands: `/help`, `/plan`, `/agent`, `/save`, `/clear`, `/status`, `/verbose`, `/audit`, `/exit`.
- Chat history persistence: every session is saved to `chat_history/` with timestamped JSON files, sanitised filenames, and an auto-save option.
- Artifact management: plans and deliverables are written to `artifacts/active/` and can be archived to `artifacts/history/` with timestamps.
- Colour-coded terminal output: clear visual distinction between reasoning, model output, tool alerts, batch summaries, errors, and system info.
- Read-only caching: idempotent read tools (`get_project_tree`, `read_file_chunk`, `multi_file_search_string`) are cached per-turn to avoid redundant API calls.
- Iteration guard: warns when the inner tool-execution loop exceeds the configured limit, preventing diagnostic spirals.
- Session Memory: tracks files created/modified, user preferences, and key decisions across turns.
- Modular architecture: single-responsibility files, DRY boilerplate via a tool handler factory, Open/Closed principle-friendly tool registry.
Deepseek-Agentic-CLI/
├── .gitignore
├── .env # API configuration (git-ignored)
├── .dockerignore
├── Dockerfile # Docker image for containerised deployment
├── docker-compose.yml # Docker Compose with secrets support
├── eslint.config.js # ESLint flat config
├── main.js # Entry point (4 lines: imports runChat, invokes it)
├── helper.js # Barrel re-exports for backward compatibility
├── package.json # Dependencies & scripts
├── lib/
│ ├── orchestrator.js # Chat loop, model invocation, sliding context, mode switching (1096 lines)
│ ├── tokenizer.js # Token estimation (tiktoken + heuristic fallback)
│ ├── cliInput.js # User I/O (ask, askYesNo, startChat, thinkingToggle, createPromptLoop)
│ ├── streamHandler.js # Streaming response parser (reasoning, content, tool calls)
│ ├── subAgentLoop.js # Independent model loop for sub-agents (role-based tool sets)
│ ├── subAgentTerminal.js # Spawn and manage dedicated PowerShell windows for sub-agents
│ ├── artifactManager.js # Artifact lifecycle (active/ → history/ archive, plan validation)
│ ├── chatHistory.js # Conversation persistence (timestamped JSON, sanitised filenames)
│ ├── colors.js # Central colour definitions and colorize() helper
│ └── fileReader.js # Shared UTF-8 file reading with \r\n → \n normalisation
├── tools/
│ ├── registry.js # Central tool map: ALL_TOOLS, ORCHESTRATOR_TOOLS, buildSubagentTools()
│ ├── template.js # createToolHandler() factory — DRY log + consent + try/catch wrapper
│ ├── callToolsInBatch.js # Batch execution engine (consent tools serial, read-only concurrent, caching)
│ ├── executeTerminal.js # Shell command execution (with Plan Mode safety checks)
│ ├── patchFile.js # Targeted string search-and-replace / line-number replacement
│ ├── readFileChunk.js # Read a range of lines from a file
│ ├── getProjectTree.js # Directory tree walk respecting .gitignore
│ ├── fetchUrl.js # Fetch URL → clean Markdown extraction (with proxy & Wayback fallback)
│ ├── askUserPreferences.js # Multi-question preference prompts
│ ├── writeOrCreateFile.js # Write/create files (with Plan Mode guard, line-range overwrite)
│ ├── multiFileSearchString.js # Search for a string across multiple files with glob support
│ ├── delegateSubAgent.js # Delegate 1..N sub-tasks to role-based sub-agents in isolated terminals
│ └── roleSystemPrompts.js # Canonical role definitions (execution) with tool permissions
├── docs/
│ ├── README.md # Skill document writing guidelines (for AI agents & human authors)
│ └── skills/
│   ├── docker/SKILL.md # Docker domain knowledge
│   ├── fullstack/SKILL.md # Full-stack development domain knowledge
│   ├── githubcicd/SKILL.md # GitHub CI/CD domain knowledge
│   ├── uiux/SKILL.md # UI/UX design domain knowledge
│   └── web-research/SKILL.md # Web research methodology
├── artifacts/
│ ├── active/ # Active plans & deliverables (git-ignored)
│ └── history/ # Archived plans & deliverables (git-ignored)
├── chat_history/ # Saved conversation JSON files (git-ignored)
└── test/
  ├── setup.js # Vitest setup
  ├── helpers.js # Test utilities
  ├── fixtures/ # Test fixture files
  ├── functionality/ # Functionality tests (13 test files)
  └── reliability/ # Reliability tests (14 test files)
Run the agent from the project root:
node main.js

- Model selection: choose between available models (e.g. `deepseek-v4-flash`, `deepseek-v4-pro`).
- Reasoning toggle: enable or disable the display of chain-of-thought / thinking content.
- Auto-save prompt: choose whether to auto-save chat history every turn.
On startup, a session banner displays the active model, mode (Plan/Agent), thinking status, and token limit.
- Type your message and press Enter.
- The model may respond with text, and/or invoke one or more tools.
- Tool results are fed back to the model, which can then respond or call additional tools.
- The model may also delegate sub-tasks to role-based sub-agents, which open independent terminal windows.
- Type `exit` to end the conversation (you'll be prompted to save).
| Command | Description |
|---|---|
| `/help` | Show available commands |
| `/plan` | Switch to Plan Mode (mutation blocked, `artifacts/active/` exempt) |
| `/agent` | Switch to Agent Mode (all tools available) |
| `/save` | Save the current session immediately |
| `/clear` | Clear the terminal screen |
| `/status` | Show session info (mode, model, tokens, messages, estimated cost) |
| `/verbose` | Toggle detailed per-iteration telemetry on/off |
| `/audit` | Show sub-agent token & cost breakdown with orchestrator totals |
| `/exit` | Quit the chat session |
Token estimates (input/output/total) are shown before each user prompt when context usage exceeds 15% of the configured token limit. The context window automatically slides (removing older messages) when usage exceeds 80% of the configured limit.
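The two thresholds above can be sketched as follows. This is a minimal illustration, not the project's implementation: `roughTokens`, `shouldShowEstimate`, and `slideContext` are hypothetical names, the four-characters-per-token estimate stands in for the real tokenizer, and keeping system messages while dropping orphaned tool results is an assumption about how pruning might keep the conversation consistent.

```javascript
// Thresholds taken from the text; everything else is a hypothetical sketch.
const SHOW_ESTIMATE_RATIO = 0.15;
const SLIDE_RATIO = 0.8;

// Crude stand-in for a real tokenizer: about 4 characters per token.
function roughTokens(messages) {
  return messages.reduce(
    (sum, m) => sum + Math.ceil((m.content || "").length / 4),
    0
  );
}

// True once usage is above 15% of the configured limit.
function shouldShowEstimate(usedTokens, tokenLimit) {
  return usedTokens > tokenLimit * SHOW_ESTIMATE_RATIO;
}

// Drops the oldest non-system messages until usage is at or below 80%.
function slideContext(messages, tokenLimit) {
  const kept = [...messages];
  while (roughTokens(kept) > tokenLimit * SLIDE_RATIO) {
    const idx = kept.findIndex((m) => m.role !== "system");
    // Stop when only the newest non-system message (or nothing) is left.
    if (idx === -1 || idx === kept.length - 1) break;
    kept.splice(idx, 1);
    // Assumption: tool results that followed the removed message go with it.
    while (idx < kept.length - 1 && kept[idx].role === "tool") {
      kept.splice(idx, 1);
    }
  }
  return kept;
}
```

The function returns a new array and leaves the input untouched, so a caller can compare message counts before and after pruning.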
The AI has 9 tools at its disposal, grouped by capability. Read-only tools run autonomously; mutation/network tools require your y/n consent.
| Tool | What it does |
|---|---|
| `get_project_tree` | Walk directory tree (respects .gitignore) |
| `read_file_chunk` | Read a range of lines from any file |
| `multi_file_search_string` | Grep-style search across files with glob patterns |

| Tool | What it does | Consent |
|---|---|---|
| `patch_file` | Targeted search-and-replace or line-number edit | ❌ No |
| `write_or_create_file` | Write new file or overwrite sections (line-range, append, create parents) | ❌ No |
| `execute_terminal_command` | Run any PowerShell command | ✅ Yes |

| Tool | What it does |
|---|---|
| `fetch_url` | Fetch URL, extract clean Markdown (proxy & Wayback fallback) |
| `ask_user_preferences` | Ask you multi-choice questions to resolve ambiguity |

| Tool | What it does |
|---|---|
| `delegate_sub_agents` | Spawn 1..N isolated sub-agents in separate terminals, each with role-scoped tools |

`search_web` is temporarily disabled pending DuckDuckGo rate-limit resolution.
| Registry | Contents | Used By |
|---|---|---|
| `ALL_TOOLS` | All 9 tools as `[schema, handler]` pairs (no consent flags) | Master catalog |
| `ORCHESTRATOR_TOOLS` | 9 tools with consent flags | Main orchestrator |
| `buildSubagentTools(role)` | Dynamically built from role definitions in `roleSystemPrompts.js`; all consent flags false | Sub-agents (autonomous) |
Three exploration tools (get_project_tree, read_file_chunk, multi_file_search_string) are cached per-turn. Identical calls within the same turn return cached results — no redundant I/O.
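A per-turn cache of this kind can be sketched with a `Map` keyed by tool name and serialised arguments. The names `createTurnCache`, `run`, and `reset` are hypothetical, and keying on `JSON.stringify(args)` is an assumption; it treats two calls as identical only when their arguments serialise to the same string, so property order matters.

```javascript
// Tool names come from the text; the cache API below is a hypothetical sketch.
const CACHEABLE_TOOLS = new Set([
  "get_project_tree",
  "read_file_chunk",
  "multi_file_search_string",
]);

function createTurnCache() {
  const cache = new Map();
  return {
    // Runs the handler, reusing an earlier result for identical read-only calls.
    run(toolName, args, handler) {
      if (!CACHEABLE_TOOLS.has(toolName)) return handler(args);
      const key = `${toolName}:${JSON.stringify(args)}`;
      if (!cache.has(key)) cache.set(key, handler(args));
      return cache.get(key);
    },
    // Called at the start of each turn so stale reads never leak across turns.
    reset() {
      cache.clear();
    },
  };
}
```

Because the handler's return value is stored as-is, the same approach works for async handlers: the cached value is then the promise, and concurrent identical calls share it.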
The system operates in two modes:
| Mode | Description | Mutation Blocked |
|---|---|---|
| Agent Mode | All tools available, unrestricted file mutations | ❌ No |
| Plan Mode | File mutations (`patch_file`, `write_or_create_file`, `execute_terminal_command`) are blocked unless the target path is inside `artifacts/active/`. Git status/diff and commands redirecting to `artifacts/` are allowed. | ✅ Yes (except `artifacts/active/`) |
Switch modes at any time with /plan or /agent.
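The Plan Mode rule can be pictured as a single predicate over the tool name and its arguments. The sketch below is illustrative only: `isAllowedInPlanMode` and `normalizeRelative` are hypothetical names, the argument fields `path` and `command` are assumptions, and the allowlist covers only `git status` and `git diff` (the redirect-to-`artifacts/` exemption is left out for brevity).

```javascript
// Tool names come from the text; argument shapes and helpers are assumptions.
const MUTATING_TOOLS = new Set([
  "patch_file",
  "write_or_create_file",
  "execute_terminal_command",
]);
const SAFE_COMMANDS = [/^git\s+(status|diff)(\s+[\w./~^-]+)*$/];

// Resolves "." and ".." segments; returns null for absolute or escaping paths.
function normalizeRelative(rawPath) {
  const unified = rawPath.replace(/\\/g, "/");
  if (unified.startsWith("/") || /^[A-Za-z]:/.test(unified)) return null;
  const parts = [];
  for (const segment of unified.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      if (parts.length === 0) return null;
      parts.pop();
    } else {
      parts.push(segment);
    }
  }
  return parts.join("/");
}

function isAllowedInPlanMode(toolName, args) {
  if (!MUTATING_TOOLS.has(toolName)) return true; // read-only tools pass
  if (toolName === "execute_terminal_command") {
    const command = String(args.command || "").trim();
    return SAFE_COMMANDS.some((pattern) => pattern.test(command));
  }
  const relative = normalizeRelative(String(args.path || ""));
  return relative !== null && relative.startsWith("artifacts/active/");
}
```

Normalising before the prefix check matters: without it, a path such as `artifacts/active/../../main.js` would pass a naive `startsWith` test.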
When the model encounters a complex multi-step task, it can use delegate_sub_agents to:
- Generate a structured Markdown prompt file with the sub-agent's goal, purpose, deliverables, and context.
- Look up the role definition (e.g., `execution`) from `roleSystemPrompts.js`.
- Spawn an independent PowerShell terminal window via `subAgentTerminal.js`.
- Run a separate model loop (`subAgentLoop.js`) in that window with its own context, using only the tools allowed by the role.
- The sub-agent operates autonomously, writing results back to the project files.
- The main agent continues its own conversation in the original terminal, checking the sub-agent's output when done.
Roles are defined in tools/roleSystemPrompts.js:
| Role | Description | Tools Allowed |
|---|---|---|
| `execution` | Implement code changes, create/modify files, execute terminal commands | `execute_terminal_command`, `patch_file`, `read_file_chunk`, `get_project_tree`, `fetch_url`, `ask_user_preferences`, `write_or_create_file`, `multi_file_search_string` |
This provides true context isolation — the sub-agent's token budget, message history, and reasoning do not consume the main agent's context window.
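Role-scoped tool sets can be sketched as a filter over a tool catalog. The data shapes below are assumptions made for illustration: `ROLE_DEFINITIONS`, `TOOL_CATALOG`, and `buildToolsForRole` are hypothetical stand-ins, and only the role name, the tool names, and the "all consent flags false" rule come from the text.

```javascript
// Hypothetical role table; the tool list mirrors the execution role above.
const ROLE_DEFINITIONS = {
  execution: {
    allowedTools: [
      "execute_terminal_command",
      "patch_file",
      "read_file_chunk",
      "get_project_tree",
      "fetch_url",
      "ask_user_preferences",
      "write_or_create_file",
      "multi_file_search_string",
    ],
  },
};

// Stand-in catalog: each entry is a [schema, handler] pair with dummy handlers.
const TOOL_CATALOG = Object.fromEntries(
  [...ROLE_DEFINITIONS.execution.allowedTools, "delegate_sub_agents"].map(
    (name) => [name, [{ name }, async () => `${name} ran`]]
  )
);

// Returns only the tools the role may use, each with consent switched off.
function buildToolsForRole(role) {
  if (!Object.hasOwn(ROLE_DEFINITIONS, role)) {
    throw new Error(`Unknown role: ${role}`);
  }
  return ROLE_DEFINITIONS[role].allowedTools.map((name) => {
    const [schema, handler] = TOOL_CATALOG[name];
    return { schema, handler, needsConsent: false };
  });
}
```

Under these assumptions a sub-agent never sees `delegate_sub_agents`, which keeps delegation a privilege of the main orchestrator.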
- Every session is saved to `chat_history/{DD.MM.YYYY}/{HH.MM.SS} - {model} - {title}.json`.
- Filenames are sanitised (Windows-safe, max 50 chars).
- The LLM generates a chat title from the first user message (fallback: truncated user input).
- Auto-save can be enabled at startup or toggled per-session.
- Use `/save` to trigger a manual save at any time.
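The path layout and filename rules above can be sketched as follows. `sanitizeTitle` and `chatHistoryPath` are hypothetical helpers, not the project's `sanitizeFilename` or `saveChatHistory`; the exact character set replaced and the use of local time are assumptions.

```javascript
// Replaces characters Windows rejects in file names and caps the length.
function sanitizeTitle(raw, maxLen = 50) {
  return raw
    .replace(/[<>:"\/\\|?*\x00-\x1f]/g, "_")
    .slice(0, maxLen)
    .replace(/[. ]+$/, ""); // Windows also rejects trailing dots and spaces
}

const pad = (n) => String(n).padStart(2, "0");

// Builds chat_history/{DD.MM.YYYY}/{HH.MM.SS} - {model} - {title}.json
function chatHistoryPath(date, model, title) {
  const day = `${pad(date.getDate())}.${pad(date.getMonth() + 1)}.${date.getFullYear()}`;
  const time = `${pad(date.getHours())}.${pad(date.getMinutes())}.${pad(date.getSeconds())}`;
  return `chat_history/${day}/${time} - ${model} - ${sanitizeTitle(title)}.json`;
}
```

For example, a session started at 09:03:07 on 5 January 2024 with the title `Fix: "auth" bug?` would map to `chat_history/05.01.2024/09.03.07 - deepseek-v4-flash - Fix_ _auth_ bug_.json` under these rules.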
- Plans and deliverables are written to `artifacts/active/`.
- On task completion, `archiveActiveToHistory(taskName)` moves all files to `artifacts/history/{taskName}/{YYYY-MM-DD_HH.MM.SS}/`.
- Plan files (starting with `PLAN-`) are validated for structural integrity before archiving.
When the inner tool-execution loop exceeds iteration_limit (default: 30), the iteration guard prompts:
- (Y) Continue with extended budget (+25 iterations)
- (N) Abort the current task
- (P) Switch to Plan Mode and continue
This prevents unbounded diagnostic spirals and excessive token consumption.
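The three choices can be modelled as a small state transition. This is a sketch under stated assumptions: the state shape and the names `createGuardState`, `guardTriggered`, and `applyGuardChoice` are hypothetical, and it assumes that choosing (P) also extends the budget so that work can continue, which the text does not spell out.

```javascript
// Defaults from the text; the state shape and function names are hypothetical.
const ITERATION_LIMIT = 30;
const ITERATION_CONTINUE_BUDGET = 25;

function createGuardState() {
  return { iteration: 0, limit: ITERATION_LIMIT, mode: "agent", aborted: false };
}

// The guard fires once the loop has gone past the current limit.
function guardTriggered(state) {
  return state.iteration > state.limit;
}

function applyGuardChoice(state, choice) {
  switch (choice.trim().toUpperCase()) {
    case "Y":
      return { ...state, limit: state.limit + ITERATION_CONTINUE_BUDGET };
    case "P":
      // Assumption: Plan Mode continues with the same extended budget.
      return {
        ...state,
        mode: "plan",
        limit: state.limit + ITERATION_CONTINUE_BUDGET,
      };
    default:
      // "N" and any unrecognised input abort (assumption: fail safe).
      return { ...state, aborted: true };
  }
}
```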
The orchestrator maintains a session memory object that tracks across turns:
- Files Created: tracked from `write_or_create_file` calls
- Files Modified: tracked from `patch_file` calls
- User Preferences: captured from `ask_user_preferences` calls
- Key Decisions: inferred from tool interactions
This memory is injected into the system prompt at the start of each turn for continuity.
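One way to picture this is a plain object updated after each tool call and rendered into the system prompt. Everything here is a hypothetical sketch: the memory shape, the `path` argument field, and the rendered format are assumptions, and key-decision inference is omitted because the text does not describe how it works.

```javascript
function createSessionMemory() {
  return { filesCreated: new Set(), filesModified: new Set(), preferences: [] };
}

// Updates memory from a finished tool call (argument fields are assumed).
function recordToolCall(memory, toolName, args, result) {
  if (toolName === "write_or_create_file") memory.filesCreated.add(args.path);
  if (toolName === "patch_file") memory.filesModified.add(args.path);
  if (toolName === "ask_user_preferences") memory.preferences.push(String(result));
}

// Renders the memory as text appended to the base system prompt each turn.
function buildSystemPrompt(basePrompt, memory) {
  const list = (items) => [...items].join(", ") || "none";
  return [
    basePrompt,
    "",
    "Session memory:",
    `Files created: ${list(memory.filesCreated)}`,
    `Files modified: ${list(memory.filesModified)}`,
    `User preferences: ${list(memory.preferences)}`,
  ].join("\n");
}
```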
Hyperparameters are defined in lib/orchestrator.js under the HYPERPARAMETERS object:
| Parameter | Default | Description |
|---|---|---|
| `token_limit` | 300000 | Maximum tokens per request |
| `token_multiplier` | 1.5 | Fallback multiplier for heuristic token estimation |
| `stream` | `true` | Enable streaming responses |
| `reasoning_effort` | `"high"` | Reasoning effort level (when thinking is enabled) |
| `iteration_limit` | 30 | Max iterations before iteration guard triggers |
| `iteration_continue_budget` | 25 | Additional iterations granted on "Continue" |
| `system_prompt` | (custom) | Full agent system prompt with planning pipeline |

| Model | Input | Output | Cache Miss |
|---|---|---|---|
| `deepseek-v4-flash` | $0.0028 | $0.28 | $0.14 |
| `deepseek-v4-pro` | $0.003625 | $0.87 | $0.435 |

| Parameter | Default | Description |
|---|---|---|
| `token_limit` | 65535 | Maximum tokens per sub-agent request |
| `token_multiplier` | 1.5 | Fallback multiplier for heuristic token estimation |
| `stream` | `true` | Enable streaming responses |
| `max_iterations` | 20 | Default max iterations (overridable via prompt banner) |
| `max_output_tokens` | 8192 | Max output tokens per sub-agent response |
| Reasoning | Disabled | Sub-agents are autonomous workers |
- Imports `runChat` from `lib/orchestrator.js` and invokes it with an IIAFE.
- No other logic: pure entry dispatch.

- Sets up OpenAI client and loads `HYPERPARAMETERS`.
- `runChat()`: top-level entry (model selection, thinking toggle, auto-save prompt).
- `multiTurnLoop()`: conversation orchestrator with sliding context, inner tool-execution loop, iteration guard, mode switching, and per-iteration telemetry.
- `callModel()`: thin wrapper over `OpenAI.chat.completions.create()` with DeepSeek/Gemini compatibility.
- `SessionContext`: mutable session state object (mode, tokens, audit trail, session memory).
- Slash command handler (`/help`, `/plan`, `/agent`, `/save`, `/clear`, `/status`, `/verbose`, `/audit`).
- Iteration guard, chat title generation, session memory management.

- Initialises `tiktoken` encoder (`cl100k_base`) with heuristic fallback.
- `estimateTokens(messages, reasoning_history, token_multiplier)`: iterates messages, sums input/output tokens accounting for tool calls, reasoning history, and structural overhead.

- `ask(question)`: wraps `readline.question()` in a Promise with stdin mutex for Windows console mode safety.
- `askYesNo(question, defaultYes)`: boolean prompt with defaults.
- `startChat()`: model selection menu.
- `thinkingToggle()`: reasoning content enable/disable menu.
- `createPromptLoop()`: persistent prompt loop with input history (arrow-up recall).

- `printStreamResponse(stream, extra_body, role)`: async generator consumer that returns `{ reasoning_content, content, tool_calls, usage }`.
- Handles thinking content (coloured), standard content, and incremental tool-call assembly from chunks.
- Captures actual API usage from the final stream chunk.
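Incremental tool-call assembly can be sketched as below. In OpenAI-compatible streaming responses, each chunk's `delta.tool_calls` entries carry an `index` plus fragments of `id`, `function.name`, and `function.arguments`, and the fragments have to be concatenated per index. `assembleToolCalls` is a hypothetical helper that takes an array of already-collected deltas; it is not the project's `printStreamResponse`.

```javascript
// Merges streamed tool-call fragments into complete tool-call objects.
function assembleToolCalls(deltas) {
  const calls = [];
  for (const delta of deltas) {
    for (const part of delta.tool_calls || []) {
      // Create the slot for this index the first time it appears.
      const slot = (calls[part.index] ||= {
        id: "",
        type: "function",
        function: { name: "", arguments: "" },
      });
      if (part.id) slot.id = part.id;
      if (part.function?.name) slot.function.name += part.function.name;
      if (part.function?.arguments) {
        slot.function.arguments += part.function.arguments;
      }
    }
  }
  // Drop holes in case indexes arrived sparsely.
  return calls.filter(Boolean);
}
```

The `arguments` string is only valid JSON once the stream has finished, so parsing it is best deferred until after assembly.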
- Mirrors the main orchestrator loop but uses `buildSubagentTools(role)` for role-specific tool sets.
- Reasoning is disabled by default: sub-agents are autonomous workers.
- Reads the delegated task prompt, runs until complete, reports final summary.

- `createSubAgentTerminal(subAgentName)`: spawns a new PowerShell window that tails a temp log file.
- Returns a logger object with `write()` and `close()` methods.
- All sub-agent output appears in the dedicated window, isolated from the main terminal.

- `ensureActiveDir()`: creates `artifacts/active/` if missing.
- `archiveActiveToHistory(taskName)`: moves files from `active/` to `history/{taskName}/{timestamp}/`.
- `isPlanFile(filename)`: identifies plan files by `PLAN-` prefix.
- `validatePlanContent(content)`: validates plan structure.

- `sanitizeFilename(raw, maxLen)`: Windows-safe filename sanitisation.
- `saveChatHistory(messages, modelName, title)`: saves conversation to timestamped JSON.
- `saveAuditHistory(auditData)`: saves audit trail data.

- Central `C` object with semantic colour codes (user, model, system, warning, error, success, tool, border, heading).
- `colorize(text, color)`: apply colour with reset.

- `readFileUtf8Normalized(filePath)`: reads UTF-8 file with `\r\n` → `\n` normalisation for cross-platform consistency.

- Imports all 9 tool schemas and handlers.
- Exports `ALL_TOOLS`, `ORCHESTRATOR_TOOLS`, `buildSubagentTools(role)`.
- Exports `callToolsInBatch` for batch execution.

- `createToolHandler(name, handlerFn, needsConsent)`: wraps any pure handler with:
  - Console alert (tool name + truncated args)
  - Optional user consent prompt (`y/n`)
  - `try/catch` with formatted error return
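The three wrapper steps can be sketched as a higher-order function. `createToolHandlerSketch` is a hypothetical illustration, not the project's `createToolHandler`: the fourth `deps` parameter (injectable `log` and `askConsent`), the 80-character truncation, and the message formats are all assumptions.

```javascript
// Wraps a plain handler with logging, optional consent, and error formatting.
function createToolHandlerSketch(name, handlerFn, needsConsent, deps = {}) {
  const log = deps.log || console.log;
  // Default consent function approves everything; a CLI would prompt y/n here.
  const askConsent = deps.askConsent || (async () => true);

  return async function handler(args) {
    log(`[tool] ${name} ${JSON.stringify(args ?? {}).slice(0, 80)}`);
    if (needsConsent && !(await askConsent(name, args))) {
      return `Tool ${name} was declined by the user.`;
    }
    try {
      return await handlerFn(args);
    } catch (err) {
      // Errors become a string result so the model can read and react to them.
      return `Error in ${name}: ${err.message}`;
    }
  };
}
```

Returning errors as strings instead of throwing keeps the agent loop alive, since a failed tool call is still a result the model can act on.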
- Runs multiple tool calls from a single model response.
- Phase 1: Batch summary display, printing all tool calls with consent tags.
- Phase 2: Unified execution, with consent tools serialised via lock and read-only tools concurrent via `Promise.all`.
- Phase 3: Progress indicators with per-tool timing.
- Read-only caching for idempotent tools.
- Plan Mode mutation blocking with safe-command allowlist (`git status/diff`, `artifacts/` redirects).
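The Phase 2 scheduling idea (consent tools one at a time, read-only tools in parallel) can be sketched with a promise chain used as a lock. `runBatchSketch` and the `{ needsConsent, run }` call shape are hypothetical; the summary display, timing, caching, and Plan Mode checks described above are left out.

```javascript
// Runs consent-gated calls serially and read-only calls concurrently,
// returning results in the same order as the input calls.
async function runBatchSketch(calls) {
  let lock = Promise.resolve();
  const tasks = calls.map((call) => {
    if (!call.needsConsent) return call.run(); // starts immediately
    // Chain onto the lock so only one consent prompt is active at a time.
    const result = lock.then(() => call.run());
    lock = result.catch(() => {}); // a failure must not block later calls
    return result;
  });
  return Promise.all(tasks);
}
```

Serialising only the consent-gated calls avoids interleaved y/n prompts while letting file reads and searches overlap. In this sketch a rejected call rejects the whole batch; wrapping handlers so they return error strings avoids that.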
- Defines the `execution` role with description, output constraints, and allowed tools.
- `getRoleEntry(role)`: lookup function for role resolution.
- `VALID_ROLES`: convenience array for validation.

- Plan Mode blocks dangerous mutations (`patch_file`, `write_or_create_file`, `execute_terminal_command`) unless targeting `artifacts/active/`.
- Safe command allowlist in Plan Mode: `git status`, `git diff`, commands redirecting to `artifacts/`.
- Dangerous command patterns (e.g. `Get-Content *`) are blocked before execution.
- All destructive/network operations in the orchestrator prompt the user for approval before proceeding.
- Sub-agents operate autonomously in independent terminal windows; all their tool calls are logged to stdout via `callToolsInBatch`.
- File paths in `writeOrCreateFile.js` are sanitised to prevent path traversal.
- Plan content is validated for structural integrity before archiving.
The project uses Vitest as the test runner with two test suites:
npm run test:functionality # Functionality tests (13 test files)
npm run test:reliability # Reliability tests (14 test files)
npm run test:unit # Unit tests only (excludes @network tests)
npm run test:network # Network-dependent tests only

Test files cover: askUserPreferences, callToolsInBatch, delegateSubAgent, executeTerminal, fetchUrl, getProjectTree, multiFileSearchString, patchFile, readFileChunk, registry, searchWeb, template, writeOrCreateFile.
The project includes a Dockerfile (Node 22 Alpine) and docker-compose.yml for containerised deployment:
docker-compose up --build

- Uses Docker secrets for `.env` injection at runtime.
- Runs as non-root `node` user.
- Exposes port 3000.
The docs/skills/ directory contains SKILL.md files encoding industry-standard principles, constraints, and anti-patterns for specific domains. When a user request falls into a covered domain, the orchestrator reads the relevant SKILL.md to inform planning and implementation decisions.
Available skill domains:
- Docker — containerisation best practices
- Full-stack — full-stack development standards
- GitHub CI/CD — continuous integration & deployment
- UI/UX — user interface & experience design
- Web Research — web research methodology & tooling
See docs/README.md for SKILL.md authoring guidelines.