AI Assistant Tool

Date: 2026.03.30

Author: ZF, https://github.com/jumphone/

Python: 3.12+ required

A command-line AI assistant supporting multiple interaction modes: basic chat, local document retrieval (RAG), web search, and combined capabilities. Built with Moonshot AI's Kimi models and a FastAPI proxy for secure API key management. Includes optional voice input support for macOS.

Feature Matrix

| Command | Mode | RAG | Web Search | Verbose Output | Description |
|---|---|---|---|---|---|
| `ai` | Basic Chat | ❌ | ❌ | ✅ | Standard conversation with history |
| `air` | AI + RAG | ✅ | ❌ | ✅ | Search local documents |
| `aiw` | AI + Web | ❌ | ✅ | ✅ | Search internet |
| `aiwr` | AI + Web + RAG | ✅ | ✅ | ✅ | Combined search |
| `aiq` | Quiet Chat | ❌ | ❌ | ❌ | Non-streaming output |
| `aiwq` | Quiet Web | ❌ | ✅ | ❌ | Non-streaming web search |
| `ragindex` | Index Builder | - | - | - | Build FAISS vector store |

Installation

# Clone repository
git clone https://github.com/jumphone/ai.git
cd ai

# Install Python 3.12 dependencies
pip install -r docs/requirements.txt

# Verify installation
python3 bin/ai.py "who are you"  # Will prompt for API key on first run

Configuration

Initial Setup

On first run, the tool prompts for an "ai-key". You can provide:

Real Moonshot API key (format: sk-...)
Proxy password (from host's ~/.ai/pkey.txt)

The key is validated against two endpoints:

API_URL='https://api.moonshot.cn/v1' (real endpoint)
PROXY_URL='http://127.0.0.1:5260/v1' (local proxy)

Validated credentials are stored in:

~/.ai/key.txt - Real API key (700 permissions)
~/.ai/pkey.txt - Proxy password (700 permissions)
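
The validation step can be pictured as trying the entered key against each endpoint in turn and storing it in the matching file. The following is a minimal sketch under stated assumptions: `classify_key` and the `probe(base_url, key)` callable are hypothetical names, and the order of the checks is a guess, not something confirmed from src/check.py.

```python
from typing import Callable, Optional

API_URL = "https://api.moonshot.cn/v1"    # real endpoint (from this README)
PROXY_URL = "http://127.0.0.1:5260/v1"    # local proxy (from this README)


def classify_key(key: str, probe: Callable[[str, str], bool]) -> Optional[str]:
    """Return the file name the key would be stored under, or None if rejected.

    `probe` is a hypothetical callable: probe(base_url, key) -> bool,
    returning True when the endpoint accepts the key.
    """
    if probe(API_URL, key):
        return "key.txt"     # accepted by the real API
    if probe(PROXY_URL, key):
        return "pkey.txt"    # accepted by the local proxy
    return None              # rejected by both endpoints
```

Passing the probe in as a parameter keeps the decision logic separate from any network call, so it can be exercised without an API key.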

Directory Structure

The tool creates ~/.ai/ with 700 permissions:

~/.ai/
├── bkg.txt              # System background rules
├── key.txt              # Real API key
├── pkey.txt             # Proxy password
├── tmp/                 # Conversation history
│   └── YYYYMMDD_tmp_ai_<user>.txt
├── log/                 # Interaction logs
│   └── YYYYMMDD_log_ai_<user>.txt
├── database/            # RAG source documents (.txt, .md)
└── vectorstore/         # FAISS index
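
As an illustration of the layout above, the sketch below creates the folders with owner-only permissions and builds the dated file names. The function names `init_ai_folder` and `daily_file` are hypothetical; the actual initialization lives in src/prepare.py and may differ.

```python
import getpass
import os
from datetime import date
from pathlib import Path
from typing import Optional


def init_ai_folder(root: Path) -> Path:
    """Create root and its subfolders, each with 700 permissions."""
    for sub in ("", "tmp", "log", "database", "vectorstore"):
        folder = root / sub
        folder.mkdir(parents=True, exist_ok=True)
        # The mode passed to mkdir is filtered by the umask, so set it explicitly.
        os.chmod(folder, 0o700)
    return root


def daily_file(root: Path, kind: str, user: Optional[str] = None,
               day: Optional[date] = None) -> Path:
    """Return e.g. root/tmp/20260330_tmp_ai_alice.txt for kind='tmp'."""
    user = user or getpass.getuser()
    day = day or date.today()
    return root / kind / f"{day:%Y%m%d}_{kind}_ai_{user}.txt"
```

For example, `daily_file(Path.home() / ".ai", "log")` yields today's log path for the current user.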

Background Rules

Default system prompt (src/prepare.py:DEFAULT_BKG):

1. Only state factual content, do not guess or fabricate.
2. For unknown, uncertain, or unconfirmed content, directly state "I don't know", do not invent.
3. Do not fabricate names of people, places, times, data, papers, literature, titles, authors, or sources.
4. Do not exaggerate, imagine, or embellish.
5. If citation is needed, only cite content that actually exists, do not fabricate citations.
6. Respond directly without redundant tone.
7. Must answer, provide the most reliable conclusion even if uncertain.

Usage

Basic Chat

# Simple query
bin/ai.py "What is Python?"

# Include file content
bin/ai.py document.txt "Summarize this"

# Enable conversation history
bin/ai.py usetmp

RAG Quick Start Guide

0. Run "ai" at least once to initialize the ai resource folder.
1. Upload all PDF files (ending with lowercase ".pdf") to a new empty folder on your Linux server.
2. Use the "cd" command to enter that folder, then type "dirpdf2txt ./" (copy-paste the quoted content to avoid errors).
3. Use "cp ./*.txt ~/.ai/database/" (same principle, copy-paste the quoted content).
4. Type "ragindex". If it reports errors, run it again until it shows "finished".
5. Use "air" to ask questions that will search the database before responding.

# Note: If your documents are already .txt files, skip step 2: run "cp ./*.txt ~/.ai/database/" and then continue with steps 4-5.

RAG Mode (Local Documents)

# 1. Place documents in ~/.ai/database/
cp my_notes.md ~/.ai/database/

# 2. Build vector index
bin/ragindex.py

# 3. Query documents
bin/air.py "What did I write about authentication?"

Web Search Mode

# Current events and internet queries
bin/aiw.py "Latest Python 3.12 features"

Combined Mode

# Search both internet and local documents
bin/aiwr.py "Compare cloud providers with my internal notes"

Quiet Mode (Non-Streaming)

# For scripting and piping
bin/aiq.py "Calculate 2+2" > result.txt
bin/aiwq.py "Current Bitcoin price" | grep -i price

Conversation Management

bin/ai.py usetmp     # Enable history for current day
bin/ai.py cleantmp   # Reset history (remove and recreate)
bin/ai.py stoptmp    # Disable history (delete tmp file)
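
The three subcommands map naturally onto simple file operations on the day's history file. The sketch below assumes history is considered enabled whenever that file exists; the function names mirror the subcommands but are illustrative, not the tool's actual code.

```python
from pathlib import Path


def usetmp(tmp_file: Path) -> None:
    """Enable history: make sure the history file exists."""
    tmp_file.parent.mkdir(parents=True, exist_ok=True)
    tmp_file.touch(exist_ok=True)


def stoptmp(tmp_file: Path) -> None:
    """Disable history: delete the history file if it is present."""
    tmp_file.unlink(missing_ok=True)


def cleantmp(tmp_file: Path) -> None:
    """Reset history: remove the file, then recreate it empty."""
    stoptmp(tmp_file)
    usetmp(tmp_file)


def history_enabled(tmp_file: Path) -> bool:
    """Assumption: history is active exactly when the file exists."""
    return tmp_file.exists()
```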

Proxy Server

# Start FastAPI proxy for key protection
bash server.sh

# Server runs at http://127.0.0.1:5260
# Configure clients to use this endpoint

Architecture

Key Functions

loadBKG() - Load system prompt and timestamp
loadTMP() - Load conversation history from ~/.ai/tmp/
loadNEW() - Process CLI arguments (files or text)
getResult() - Standard streaming chat
getResult_web() - Chat with web search tool
retrieve_rag() - RAG retrieval with double-check
build_vectorstore() - FAISS index management

RAG Pipeline

Keyword Generation (RAG_KEYWORD_MODEL): Extract search intent from user query
Vector Retrieval: FAISS similarity search (VECTOR_K=10 chunks)
Double-Check (RAG_CHECK_MODEL): Verify relevance with temperature=0.0
Answer Generation: Combine verified references with user query using RAG_ANSWER_PROMPT
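
The four stages can be sketched as one function that takes each stage as a caller-supplied callable. Everything here other than `VECTOR_K` is an assumption for illustration: `build_rag_prompt`, the three callables, and the prompt layout are hypothetical and do not reproduce retrieve_rag() or RAG_ANSWER_PROMPT.

```python
from typing import Callable, List

VECTOR_K = 10  # number of chunks retrieved, as listed in this README


def build_rag_prompt(
    query: str,
    generate_keywords: Callable[[str], str],
    search: Callable[[str, int], List[str]],
    is_relevant: Callable[[str, str], bool],
    k: int = VECTOR_K,
) -> str:
    """Run the four stages with caller-supplied (hypothetical) callables."""
    # 1. Keyword generation: condense the query into search intent.
    keywords = generate_keywords(query)
    # 2. Vector retrieval: fetch up to k similar chunks.
    candidates = search(keywords, k)
    # 3. Double-check: keep only chunks judged relevant to the query.
    verified = [chunk for chunk in candidates if is_relevant(query, chunk)]
    # 4. Answer generation input: combine verified references with the query.
    #    This layout is illustrative; it is not the real RAG_ANSWER_PROMPT.
    references = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(verified, 1))
    return f"References:\n{references}\n\nQuestion: {query}"
```

In the real tool, the keyword and relevance callables would be model calls and the search callable a FAISS similarity search; plain functions are enough to follow the data flow.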

Web Search Integration

Uses Moonshot's built-in $web_search tool (src/util.py:chat()). The tool is declared in each request, and tool calls are handled automatically in a loop that continues until finish_reason is no longer "tool_calls".
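
The loop described above can be sketched as follows. This follows the built-in tool pattern from Moonshot's public documentation as best understood here, and has not been checked against src/util.py. The name `chat_with_web_search` and the `create` parameter are hypothetical; with the openai client, `create` would typically be `client.chat.completions.create`.

```python
import json
from typing import Any, Callable, List

# Declaration of the built-in tool, sent with every request.
WEB_SEARCH_TOOL = {"type": "builtin_function", "function": {"name": "$web_search"}}


def chat_with_web_search(create: Callable[..., Any], model: str,
                         messages: List[Any]) -> str:
    """Call the chat endpoint until finish_reason is no longer 'tool_calls'."""
    while True:
        response = create(model=model, messages=messages, tools=[WEB_SEARCH_TOOL])
        choice = response.choices[0]
        if choice.finish_reason != "tool_calls":
            return choice.message.content
        # Keep the assistant's tool-call message in the conversation.
        messages.append(choice.message)
        for call in choice.message.tool_calls:
            # For a built-in tool the search runs server-side; the client
            # echoes the arguments back as the tool result (assumption).
            arguments = json.loads(call.function.arguments)
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "name": call.function.name,
                "content": json.dumps(arguments),
            })
```

Injecting `create` keeps the loop independent of any particular client object, which also makes it easy to try with a stub.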

File Structure

.
├── bin/                      # Executable entry points
│   ├── ai.py                # Basic chat
│   ├── air.py               # + RAG
│   ├── aiw.py               # + Web
│   ├── aiwr.py              # + Web + RAG
│   ├── aiq.py               # Quiet mode
│   ├── aiwq.py              # Quiet web
│   └── ragindex.py          # Build index
├── mains/                    # Main logic modules
│   ├── main_ai.py
│   ├── main_air.py
│   ├── main_aiw.py
│   ├── main_aiwr.py
│   ├── main_aiq.py
│   ├── main_aiwq.py
│   └── main_ragindex.py
├── src/                      # Core implementation
│   ├── check.py             # API key validation
│   ├── config.py            # Configuration constants
│   ├── safe.py              # Secure password input
│   ├── prepare.py           # Directory initialization
│   ├── rag.py               # FAISS and embeddings
│   ├── util.py              # Main utility functions
│   └── server.py            # FastAPI proxy
├── supp/                     # Supplementary tools
│   ├── voice2ssh_mac.py     # macOS voice input
│   └── ssh2ai.py            # Voice processing bridge
├── docs/
│   └── requirements.txt     # Python dependencies
├── server.sh                # Proxy startup script
└── README.md                # This file

Security Features

Key Protection: Proxy server validates tokens before forwarding to real API
File Permissions: ~/.ai/ directory created with 700 permissions
Secure Input: Password masking with getpass_star() (src/safe.py)
Key Storage: Separate files for real key and proxy password
No Key Exposure: Client applications use proxy password, not real API key

Proxy Server Details

Endpoint: http://127.0.0.1:5260/v1
Validation: Checks token against PROXY_API_KEY list
Forwarding: Injects real API key, strips sensitive headers
Logging: Prints request bodies to stdout (configurable)
CORS: Not configured (intended for local use only)
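
The validation and forwarding steps can be illustrated with a small pure function that checks the client token and rewrites the headers. This is a sketch only: `build_upstream_headers`, the `SENSITIVE_HEADERS` set, and the sample token are hypothetical, and the real logic in src/server.py may strip different headers.

```python
import hmac
from typing import Dict, Iterable

PROXY_API_KEY = ["client-token-1"]  # placeholder value; name taken from this README
# Hypothetical list of headers that should not reach the upstream API.
SENSITIVE_HEADERS = {"authorization", "host", "content-length", "cookie"}


def build_upstream_headers(incoming: Dict[str, str], real_key: str,
                           allowed: Iterable[str] = PROXY_API_KEY) -> Dict[str, str]:
    """Validate the client's bearer token, then swap in the real API key."""
    auth = next((v for k, v in incoming.items() if k.lower() == "authorization"), "")
    scheme, _, token = auth.partition(" ")
    # compare_digest avoids leaking token contents through timing differences.
    token_ok = any(hmac.compare_digest(token.encode(), t.encode()) for t in allowed)
    if scheme.lower() != "bearer" or not token_ok:
        raise PermissionError("invalid proxy token")
    headers = {k: v for k, v in incoming.items() if k.lower() not in SENSITIVE_HEADERS}
    headers["Authorization"] = f"Bearer {real_key}"
    return headers
```

With this shape, the client only ever sends the proxy password, and the real key is added on the server side just before forwarding.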

Voice Input (macOS)

The tool includes optional voice input support for macOS:

voice2ssh_mac.py: Records audio using the right Command key, transcribes it with Faster-Whisper, and sends the text to the Linux server via SSH
ssh2ai.py: Monitors the server for transcribed text and triggers bin/ai commands

Configuration required in supp/voice2ssh_mac.py:

LINUX_HOST, LINUX_USER, LINUX_PASS, LINUX_PORT
MODEL_PATH - Path to Faster-Whisper model
REMOTE_OUTPUT_FILE - Where to store transcriptions

You can download the "turbo" model from Baidu Cloud Disk:

https://pan.baidu.com/s/1l_GcVR1nBLaQfF_UodYjyw?pwd=biuh

Configuration Reference

Models (`src/config.py`)

MODEL_LIST = [
  'kimi-k2-thinking-turbo',
  'kimi-k2-turbo-preview',
  'kimi-k2-0905-preview',
  'kimi-k2-thinking',
  'kimi-latest',
  'moonshot-v1-auto',
  'moonshot-v1-8k',
  'moonshot-v1-32k',
  'moonshot-v1-128k'
]
BASIC_MODEL = MODEL_LIST[0]
RAG_KEYWORD_MODEL = MODEL_LIST[1]
RAG_CHECK_MODEL = MODEL_LIST[1]

RAG Parameters

CHUNK_SIZE = 5000          # Document chunk size
CHUNK_OVERLAP = 500        # Chunk overlap
VECTOR_K = 10              # Retrieved chunks
RAG_KEYWORD_TEMPERATURE = 0.2
RAG_CHECK_TEMPERATURE = 0.0  # Deterministic checking
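
To show how CHUNK_SIZE and CHUNK_OVERLAP interact, here is a simple character-based splitter. It is a sketch of the general sliding-window idea only; the splitter actually used in src/rag.py (for instance a LangChain text splitter) may break on separators and produce different boundaries. `split_text` is a hypothetical name.

```python
from typing import List

CHUNK_SIZE = 5000     # from this README
CHUNK_OVERLAP = 500   # from this README


def split_text(text: str, chunk_size: int = CHUNK_SIZE,
               overlap: int = CHUNK_OVERLAP) -> List[str]:
    """Split text into windows of chunk_size that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - overlap  # each window starts this far after the previous one
    # Stop once the remaining tail is already covered by the previous window.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]
```

With the defaults, each chunk starts 4500 characters after the previous one, so a 12000-character document yields three chunks starting at offsets 0, 4500, and 9000.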

Paths

AI_FOLDER = '~/.ai/'
RAG_SRC_DATABASE = '~/.ai/database/'
RAG_SRC_VECTOR = '~/.ai/vectorstore/'
SEARCH_MODEL_PATH = '/home/toolkit/tools/ai_local_src/models/multilingual-model'

You can download the "multilingual-model" from Baidu Cloud Disk:

https://pan.baidu.com/s/1EnD5oFqJ7-mj2cr5NhJFMw?pwd=biuh

Troubleshooting

"No sentence-transformers model found"

The tool suppresses this warning via warnings.filterwarnings() and sys.stderr redirection in src/rag.py:init_huggingface_embeddings(). Ensure SEARCH_MODEL_PATH contains valid model files.

Key Validation Fails

Check network connectivity to API_URL and PROXY_URL
Verify the key format (a real Moonshot API key should start with sk-)
Inspect ~/.ai/key.txt and ~/.ai/pkey.txt permissions
Use fast_verify_key() in src/safe.py for manual testing

RAG Returns No Results

Confirm documents in ~/.ai/database/ (.txt or .md only)
Rebuild index: bin/ragindex.py
Check embedding model path: SEARCH_MODEL_PATH
Verify VECTOR_K value isn't too small

History Not Saving

Run bin/ai.py usetmp to create temp file
Check ~/.ai/tmp/ permissions (should be 700)
Verify TMP_USE is True in src/prepare.py

Proxy Server Errors

Ensure port 5260 is available: lsof -i :5260
Check PROXY_KEY_FILE exists and contains valid token
Review server logs for request/response details
Verify REAL_KEY_FILE contains valid Moonshot API key
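
Where lsof is not available, the port check can be approximated from Python. This is a small sketch using only the standard library; `port_in_use` is a hypothetical helper and not part of the tool.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

Calling `port_in_use(5260)` before starting the proxy shows whether another process already listens there; calling it afterwards confirms the proxy is reachable.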

Development

Adding New Commands

Create bin/newcmd.py with shebang and import pattern
Add mains/main_newcmd.py with run() function
Implement logic in src/util.py or new module
Update feature matrix in this README

Modifying Models

Edit src/config.py:MODEL_LIST and BASIC_MODEL. Changes take effect on the next run.

Custom Embeddings

Update SEARCH_MODEL_PATH in src/config.py. The tool expects a HuggingFace-compatible model.

Extending Proxy

Modify src/server.py:PROXY_API_KEY to support multiple client tokens. Add routes as needed in the FastAPI app.

Requirements

See docs/requirements.txt for complete list. Key dependencies:

AI/ML: torch==2.6.0, transformers==5.3.0, sentence-transformers==5.3.0
RAG: langchain==1.2.13, faiss-cpu==1.8.0, langchain-huggingface==1.2.1
API: openai==1.60.0, httpx==0.25.2, fastapi==0.135.2
Utilities: pydantic==2.12.5, python-dotenv==1.2.2, PyYAML==6.0.3

CUDA libraries are optional and included for GPU acceleration support.
