Automated LinkedIn job discovery, scoring, and application powered by LLM-based resume matching, browser automation, and market intelligence.
AI Job Hunter finds fresh LinkedIn jobs matching your background, scores each one against your resume using embedding similarity and LLM evaluation, and — when a job passes your thresholds — applies via Easy Apply automatically. A built-in Market Intelligence engine analyses technology trends, builds role archetypes, and matches your skills against market demand.
- Features
- Prerequisites
- Installation
- Getting Started — Step by Step
- Web GUI
- Authentication & Multi-User
- Scheduling
- Email Notifications
- Operating Modes
- CLI Reference
- Application Pipeline
- Market Intelligence Pipeline
- Configuration
- Data Directory
- Data Models
- Project Structure
- Testing
- Docker Deployment
- Safety & Ethics
- Development Roadmap
- License
| Feature | Description |
|---|---|
| Profile generation | Extract skills & experience from resume PDF + LinkedIn (URL or PDF) via LLM |
| Job discovery | Cookie-based LinkedIn search with pagination, CSS + JS fallback parsing |
| AI scoring | Embedding similarity + LLM fit evaluation (0–100) with skill gap analysis |
| Industry preferences | Boost/penalise scores based on preferred and disliked industries |
| Easy Apply automation | Multi-step wizard with LLM-powered form filling for arbitrary questions |
| Challenge detection | Pauses on captcha, marks job as BLOCKED — no bypass attempts |
| Market Intelligence | Technology trend analysis, role archetypes, candidate–role matching, gap analysis |
| Web GUI | Full command & control dashboard (FastAPI + HTMX + Pico CSS) |
| Multi-user auth | JWT login, per-user settings & API keys, account management, admin panel |
| Visual dashboard | Donut chart, fit histogram, skill gap bars, activity timeline, market panel |
| Scheduled runs | APScheduler cron-based automation with configurable days, time, and pipeline mode |
| Email notifications | Pipeline summary emails via Resend API or SMTP (Gmail, Outlook, custom) |
| Resume review | AI-powered gap analysis comparing your resume to target jobs |
| Daily reports | Markdown + JSON summaries with stats, job tables, and market section |
| Docker support | Dockerfile + docker-compose with health check, volume mount, auto-restart |
| Settings persistence | Web UI settings saved to .env — survives restarts |
| Mock mode | Full pipeline testable offline with HTML fixtures — no API keys needed |
| CLI | hunt command with 10+ subcommands and global flags |
| .env support | API keys and settings from .env file |
| SQLite database | Job, Score, ApplicationAttempt + 13 market tables, WAL mode |
- Python 3.13+
- uv — fast Python package manager
- OpenAI API key — for scoring, profile generation, form filling, and market extraction (not needed for mock/offline testing)
git clone https://github.com/kertser/AIJobHunter.git
cd AIJobHunter
uv sync # installs all dependencies including dev tools

Windows note: All `uv` and `pytest` commands work the same in PowerShell. For `.env` creation, use `Copy-Item .env.example .env` instead of `cp`.
Verify the installation:
uv run hunt --help
uv run pytest -q # 377 tests, all offline

This is the recommended sequence to go from zero to a fully operational system.
# Install dependencies
uv sync
# Create your .env file
cp .env.example .env # Linux/macOS
# Copy-Item .env.example .env # Windows PowerShell

Edit .env and set your OpenAI API key:
JOBHUNTER_OPENAI_API_KEY=sk-proj-...
Initialise the database:
uv run hunt init

This creates data/job_hunter.db and the data/reports/ directory.
The system needs to understand your background. Provide your resume (PDF) and optionally your LinkedIn profile:
# Resume + LinkedIn URL (Playwright scrapes the public profile)
uv run hunt profile --resume path/to/resume.pdf --linkedin https://www.linkedin.com/in/your-name/
# Resume + LinkedIn PDF export (if you've downloaded it)
uv run hunt profile --resume path/to/resume.pdf --linkedin linkedin.pdf
# Resume only
uv run hunt profile --resume path/to/resume.pdf

Or use the web GUI: start the server (`uv run hunt --real --dry-run serve`) and go to http://localhost:8000, where the Setup page will guide you.
This generates two files:
| File | Purpose |
|---|---|
| `data/user_profile.yml` | Your extracted profile: name, skills, experience, desired roles, education |
| `data/profiles.yml` | Search profiles: keywords, location, seniority, scoring thresholds |
Verify what was generated:
uv run hunt profile --show

You can edit both files manually or via the Profiles page in the web GUI.
The system uses browser cookies for authentication (no password storage):
uv run hunt login

A browser window opens. Log in to LinkedIn manually. Once login is detected, the browser closes and cookies are saved to data/cookies.json.

Note: Cookies expire periodically. If discovery stops working, run `hunt login` again.
Before applying to real jobs, test the full pipeline in dry-run mode:
# Discover + score + "apply" without submitting
uv run hunt --real --dry-run run --profile default

This will:
- Discover — search LinkedIn for jobs matching your profile keywords
- Score — compute embedding similarity + LLM fit score for each job
- Queue — mark qualifying jobs for application
- Apply (dry-run) — walk through Easy Apply forms but stop before submitting
- Report — generate a daily summary in `data/reports/`
Check the results:
# View the dashboard
uv run hunt --real --dry-run serve
# Open http://localhost:8000

Once you're satisfied with the scoring and job selection:
# Full pipeline — will submit applications
uv run hunt --real run --profile default

Or use the web GUI:
uv run hunt --real serve
# Open http://localhost:8000 → Pipeline page → click "Run Pipeline"

After discovering and scoring some jobs, run market analysis to understand technology trends and how your skills compare to market demand:
Via the web GUI (easiest):
uv run hunt --real serve
# Open http://localhost:8000 → Pipeline page → click "Analyse Market"

Via CLI:
# Full market pipeline (7 steps)
uv run hunt --real market run-all --extractor openai --normalizer openai --profile default

This runs: Ingest → Extract → Graph → Trends → Role Model → Candidate Model → Match.
View results:
- Web: http://localhost:8000/market — trends, role archetypes, skill gaps, company demand
- CLI: `uv run hunt market report` generates a market intelligence report in `data/reports/`
- Job details: each job now shows market-enhanced scoring data
Start the web server:
uv run hunt --real --dry-run serve # safe mode — won't submit applications
uv run hunt --real serve # live mode — will submit applications
uv run hunt --mock serve # offline testing with mock data
uv run hunt serve --host 0.0.0.0 --port 3000 # custom bind address

Open http://localhost:8000 in your browser.
| Page | Path | Description |
|---|---|---|
| Dashboard | `/` | Visual stats — donut chart, fit histogram, skill gap bars, activity timeline, market intelligence panel |
| Jobs | `/jobs` | Sortable/filterable table with bulk actions, status management, persistent sort |
| Job Detail | `/api/jobs/{hash}` | Full description, scores, market boost, Easy Apply button, application history |
| Pipeline | `/run` | Trigger Discover / Score / Apply / Full Pipeline / Market Analysis with live SSE progress |
| Market | `/market` | Technology trends, rising entities, role archetypes, candidate matches, company demand |
| Profiles | `/profiles` | Edit user profile and search profiles (skills, keywords, thresholds) |
| Resume Review | `/resume-review` | AI gap analysis — missing skills, improvement suggestions, quick wins |
| Reports | `/reports` | Browse and view daily pipeline + market reports |
| Account | `/account` | Personal account settings — edit display name, email, change password |
| Settings | `/settings` | Toggle mock/dry-run/headless, configure LinkedIn session cookies, email notifications, API keys |
| Schedule | `/schedule` | Cron-style automation — set time, days, pipeline mode, view run history |
| Admin | `/admin` | User management, profile management, database reset (admin-password protected) |
| Setup | `/onboarding` | Upload resume PDF + LinkedIn URL to generate profiles (first-run wizard) |
Live progress streaming: Pipeline and market operations stream real-time progress via Server-Sent Events (SSE) — you see each step as it happens, with keepalive pings during long operations.
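For readers unfamiliar with SSE framing, here is a minimal sketch of how a progress event and a keepalive ping could be serialised on the wire. The function names and payload fields are hypothetical and are not taken from the project's `task_manager.py`; only the framing rules (an `event:` line, a `data:` line, a blank line, and `:`-prefixed comment lines) come from the SSE format itself.

```python
import json


def format_sse_event(event: str, payload: dict) -> str:
    """Serialise one Server-Sent Event: event name, JSON data line, blank line."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def format_keepalive() -> str:
    """A line starting with ':' is an SSE comment.

    Clients ignore it, but sending one periodically stops proxies and
    browsers from closing an otherwise idle connection.
    """
    return ": keepalive\n\n"


if __name__ == "__main__":
    # Hypothetical progress payload for a scoring step.
    print(format_sse_event("progress", {"step": "score", "done": 3, "total": 10}), end="")
    print(format_keepalive(), end="")
```

A browser-side `EventSource` would receive the first message as a `progress` event and silently discard the second.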
The web GUI is login-protected. All pages (except login/register) require a valid session.
- First user — when no users exist, the registration page is shown automatically. The first registered user is auto-promoted to admin and can optionally set the admin panel password.
- Subsequent users — can register while `JOBHUNTER_REGISTRATION_ENABLED=true` (default).
- Sessions — JWT-based via `access_token` cookie (7-day expiry). Logging in from another browser/device replaces the session.
Click your name in the top navigation bar to open /account, where you can:
- Change your display name and email address
- Change your password (requires current password)
- View account info: user ID, role, status, registration date, last login
Each user can override global settings from the Settings page:
- OpenAI API key — each user can set their own key
- Runtime flags — mock mode, dry-run, headless, slow-mo
- Email notifications — provider, credentials, recipient
Settings that are left unset on the user row inherit from the global AppSettings (.env file).
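The inheritance rule can be illustrated with a small sketch. The class and field names below are hypothetical stand-ins, not the project's actual `AppSettings` or user model; the only behaviour taken from the text is that an unset per-user value falls back to the global one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GlobalSettings:
    """Stand-in for the global settings loaded from .env (hypothetical fields)."""
    openai_api_key: str = ""
    dry_run: bool = True
    headless: bool = True


@dataclass
class UserOverrides:
    """Stand-in for the per-user row; None means 'not set, inherit the global value'."""
    openai_api_key: Optional[str] = None
    dry_run: Optional[bool] = None
    headless: Optional[bool] = None


def effective_settings(global_cfg: GlobalSettings, user: UserOverrides) -> GlobalSettings:
    """Return the global settings with every non-None user override applied on top."""
    merged = {}
    for name, global_value in vars(global_cfg).items():
        user_value = getattr(user, name)
        # Compare against None explicitly so that False is a valid override.
        merged[name] = global_value if user_value is None else user_value
    return GlobalSettings(**merged)
```

Checking `is None` rather than truthiness matters here: a user who turns dry-run off (`False`) must not silently inherit the global `True`.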
The admin panel (/admin) is protected by a standalone admin password (not tied to any user account). If no admin password is configured, any logged-in user can access it.
The admin panel provides:
- User management — activate/deactivate, promote/demote admin, delete users
- Database reset — erase all data and start fresh (including admin password)
Set the admin password via JOBHUNTER_ADMIN_PASSWORD in .env, during first-user registration, or from the admin panel itself.
The built-in scheduler runs pipeline jobs automatically on a cron-like schedule. Configure it from the Schedule page in the web GUI or via data/schedule.yml.
| Setting | Default | Description |
|---|---|---|
| Enabled | off | Toggle the scheduler on/off |
| Time of day | `09:00` | When to run (24 h format) |
| Days of week | Mon–Fri | Which days to run |
| Pipeline mode | `full` | `discover` / `discover_score` / `full` / `market` |
| Profile | `default` | Which search profile to use |
- The scheduler runs inside the web server process (APScheduler `AsyncIOScheduler`).
- Only one task runs at a time — if a pipeline is already running, the scheduled trigger is skipped.
- After each run, a summary is recorded in `data/schedule_history.yml` (capped at 100 entries).
- If email notifications are enabled, a pipeline summary email is sent after each run.
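The "skip if already running" rule and the capped history can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's scheduler: `run_pipeline`, `scheduled_trigger`, and the in-memory `history` are hypothetical, and the real system records history in a YAML file rather than a deque.

```python
import asyncio
from collections import deque

# Bounded history: appending to a full deque drops the oldest record.
history: deque = deque(maxlen=100)
_pipeline_lock = asyncio.Lock()


async def run_pipeline(mode: str) -> None:
    """Placeholder for the real pipeline; sleeps briefly to simulate work."""
    await asyncio.sleep(0.05)


async def scheduled_trigger(mode: str) -> str:
    """Run the pipeline unless one is already in progress, in which case skip."""
    if _pipeline_lock.locked():
        history.append({"mode": mode, "result": "skipped"})
        return "skipped"
    async with _pipeline_lock:
        await run_pipeline(mode)
    history.append({"mode": mode, "result": "completed"})
    return "completed"


async def demo() -> list:
    # Two triggers fire at nearly the same time; the second finds the lock held.
    return list(await asyncio.gather(scheduled_trigger("full"), scheduled_trigger("full")))


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Skipping rather than queueing keeps a slow run from causing a backlog of overlapping LinkedIn sessions.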
schedule:
  enabled: true
  time_of_day: '09:00'
  days_of_week: [mon, tue, wed, thu, fri]
  pipeline_mode: full
  profile_name: default

Receive email summaries after each scheduled pipeline run. Two providers are supported:
- Sign up at resend.com (free tier: 100 emails/day)
- Create an API key
- In Settings → Email Notifications, select Resend, paste the API key, enter your email
That's it — no SMTP configuration needed.
For Gmail, Outlook, or custom mail servers:
| Provider | Host | Port | TLS | Notes |
|---|---|---|---|---|
| Gmail | `smtp.gmail.com` | 587 | ✅ | Requires an App Password |
| Outlook | `smtp.office365.com` | 587 | ✅ | |
| Local relay | `localhost` | 25 | ❌ | Leave credentials blank |
| Variable | Description |
|---|---|
| `JOBHUNTER_EMAIL_PROVIDER` | `resend` (default) or `smtp` |
| `JOBHUNTER_RESEND_API_KEY` | Resend API key |
| `JOBHUNTER_NOTIFICATION_EMAIL` | Recipient email address |
| `JOBHUNTER_NOTIFICATIONS_ENABLED` | `true` / `false` |
| `JOBHUNTER_SMTP_HOST` | SMTP server hostname |
| `JOBHUNTER_SMTP_PORT` | SMTP port (default: 587) |
| `JOBHUNTER_SMTP_USER` | SMTP username (optional for relays) |
| `JOBHUNTER_SMTP_PASSWORD` | SMTP password |
| `JOBHUNTER_SMTP_USE_TLS` | `true` / `false` |
All settings can also be configured from the Settings page in the web GUI. Changes are persisted to .env automatically.
The system has three operating modes controlled by global flags:
| Mode | Flags | LinkedIn | Applications | Use Case |
|---|---|---|---|---|
| Mock | `--mock` | Local HTML fixtures | Never | Development & testing |
| Dry-run | `--real --dry-run` | Real LinkedIn | Stops before submit | Preview & verification |
| Live | `--real` | Real LinkedIn | Submits for real | Production use |
Recommended workflow: Start with --mock to verify setup, then --real --dry-run to preview real jobs, then --real when ready.
hunt [GLOBAL OPTIONS] COMMAND [COMMAND OPTIONS]
| Flag | Default | Description |
|---|---|---|
| `--mock` | off | Use mock LinkedIn (local HTML fixtures) |
| `--real` | off | Use real LinkedIn (requires cookies) |
| `--dry-run` | off | Run without submitting applications |
| `--headless` / `--no-headless` | headless | Browser visibility |
| `--slowmo-ms INT` | `0` | Slow-motion delay in ms (for debugging) |
| `--data-dir PATH` | `data` | Data directory path |
| `--log-level` | `INFO` | `DEBUG` / `INFO` / `WARNING` / `ERROR` |
| Command | Description |
|---|---|
| `hunt init` | Initialise database and data directory |
| `hunt login` | Open browser for manual LinkedIn login, save cookies |
| `hunt profile` | Generate or view user profile and search profiles |
| `hunt discover --profile NAME` | Discover LinkedIn jobs for a search profile |
| `hunt score --profile NAME` | Compute fit-scores for discovered jobs |
| `hunt apply --profile NAME` | Apply to qualified jobs via Easy Apply |
| `hunt run --profile NAME` | Run full pipeline: discover → score → apply → report |
| `hunt report` | Generate daily Markdown + JSON report |
| `hunt serve` | Start web GUI (default: http://localhost:8000) |
All market commands are under hunt market:
| Command | Description |
|---|---|
| `hunt market run-all` | Run the full 7-step market pipeline |
| `hunt market ingest` | Convert discovered jobs → market events |
| `hunt market extract -e TYPE` | Run signal extraction (`heuristic` / `openai` / `fake`) |
| `hunt market graph` | Build entity + evidence graph from extractions |
| `hunt market trends` | Compute frequency, momentum, novelty, burst |
| `hunt market role-model -n TYPE` | Build role archetypes (`heuristic` / `openai` / `fake` / `legacy`) |
| `hunt market candidate-model -p NAME` | Project user profile into candidate capabilities |
| `hunt market match -p NAME` | Match candidate against role archetypes |
| `hunt market export -f FORMAT` | Export graph (`json` / `graphml`) |
| `hunt market report` | Generate market intelligence report |
| `hunt market dialogue-list` | List dialogue sessions |
| `hunt market dialogue-evaluate` | Evaluate un-assessed dialogue sessions |
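# Options: --extractor heuristic|openai|fake, --normalizer heuristic|openai|fake|legacy, --profile <candidate key>
# (comments are kept off the continued lines: a trailing comment after "\" breaks the line continuation)
uv run hunt --real market run-all \
  --extractor openai \
  --normalizer openai \
  --profile default

# Quick local test (no API key needed)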
uv run hunt --mock --dry-run run --profile default
# Real LinkedIn, preview only
uv run hunt --real --dry-run run --profile backend-python
# Full pipeline, live applications
uv run hunt --real run --profile default
# Market analysis with OpenAI
uv run hunt --real market run-all --extractor openai --normalizer openai
# Market analysis with heuristic (free, no API key)
uv run hunt --real market run-all --extractor heuristic --normalizer heuristic
# Web GUI
uv run hunt --real --dry-run serve

┌──────────┐    ┌───────┐    ┌───────┐    ┌───────┐    ┌────────┐
│ Discover │───>│ Score │───>│ Queue │───>│ Apply │───>│ Report │
└──────────┘    └───────┘    └───────┘    └───────┘    └────────┘
     │              │                         │
     ▼              ▼                         ▼
  LinkedIn     Embeddings +           Easy Apply wizard
  search       LLM evaluation         + LLM form filler
  + parsing    + industry prefs       (Playwright)
               + market boost
- Discover — searches LinkedIn using your profile keywords, parses HTML results
- Score — computes embedding similarity (0.0–1.0) and LLM fit score (0–100) for each job; applies industry preference boosts; optionally enriches with market opportunity signals
- Queue — marks jobs as `QUEUED` if: `easy_apply=true`, fit ≥ `min_fit_score`, similarity ≥ `min_similarity`
- Apply — automates the Easy Apply wizard via Playwright; LLM generates contextual answers for form questions
- Report — produces daily Markdown + JSON summaries with stats, job tables, and market section
Job statuses: new → scored → queued → applied / skipped / blocked / review / failed
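The Queue rule described above amounts to a three-way conjunction. The following is a minimal sketch, assuming a hypothetical `ScoredJob` container; the default thresholds are borrowed from the sample search profile in this README (`min_fit_score: 75`, `min_similarity: 0.35`) and are not necessarily the project's built-in defaults.

```python
from dataclasses import dataclass


@dataclass
class ScoredJob:
    """Hypothetical view of a job after the Score step."""
    title: str
    easy_apply: bool
    llm_fit_score: int           # LLM evaluation, 0-100
    embedding_similarity: float  # embedding similarity, 0.0-1.0


def should_queue(job: ScoredJob, min_fit_score: int = 75, min_similarity: float = 0.35) -> bool:
    """A job is queued only if all three conditions from the Queue step hold."""
    return (
        job.easy_apply
        and job.llm_fit_score >= min_fit_score
        and job.embedding_similarity >= min_similarity
    )


if __name__ == "__main__":
    job = ScoredJob("Backend Engineer", easy_apply=True, llm_fit_score=82, embedding_similarity=0.41)
    print(should_queue(job))
```

Because the conditions are combined with `and`, a high LLM score cannot compensate for a job that lacks Easy Apply or falls below the similarity floor.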
LLM form filling: the Easy Apply wizard encounters arbitrary questions (dropdowns, text fields, radio buttons). The system uses GPT to generate contextually appropriate answers based on your user profile.
┌────────┐   ┌─────────┐   ┌───────┐   ┌────────┐   ┌────────────┐   ┌───────────┐   ┌───────┐
│ Ingest │──>│ Extract │──>│ Graph │──>│ Trends │──>│ Role Model │──>│ Candidate │──>│ Match │
└────────┘   └─────────┘   └───────┘   └────────┘   └────────────┘   └───────────┘   └───────┘
                                                                         Model
- Ingest — converts discovered jobs into market events (idempotent)
- Extract — runs signal extraction on job descriptions: technologies, skills, tools, methodologies, certifications, companies (heuristic keyword matching or OpenAI LLM)
- Graph — normalises entities (fuzzy matching + alias resolution), builds evidence and co-occurrence edges
- Trends — computes per-entity frequency, momentum (rising/falling), novelty, and burst scores
- Role Model — clusters jobs by normalised title, builds role archetypes with entity importance weights; title normalisation via heuristic rules or OpenAI LLM
- Candidate Model — projects your `user_profile.yml` skills into the entity graph with confidence scoring
- Match — compares your capabilities against each role archetype: success score, confidence, learning upside, mismatch risk, hard/soft/learnable gap classification; graph proximity boost
Outputs:
- Per-role opportunity scores with gap analysis
- Rising technology trends and skill demand signals
- Market-enhanced scoring on individual job detail pages
- Market Intelligence section in daily reports
| Variable | Required | Default | Description |
|---|---|---|---|
| `JOBHUNTER_OPENAI_API_KEY` | For real use | `""` | OpenAI API key |
| `JOBHUNTER_LLM_PROVIDER` | No | `"openai"` | LLM provider: `openai` or `local` |
| `JOBHUNTER_LOCAL_LLM_URL` | For local | `"http://localhost:8080/v1"` | Local LLM server URL |
| `JOBHUNTER_LOCAL_LLM_MODEL` | No | `""` | Local model name (blank = auto-detect) |
| `JOBHUNTER_LLM_TEMPERATURE` | No | `0.2` | Global default temperature for LLM calls |
| `JOBHUNTER_LLM_MAX_TOKENS` | No | `0` | Global default max tokens (0 = no limit) |
| `JOBHUNTER_DATA_DIR` | No | `"data"` | Path to the data directory |
| `JOBHUNTER_SECRET_KEY` | No | auto-generated | JWT signing key (auto-persisted on first run) |
| `JOBHUNTER_ADMIN_PASSWORD` | No | `""` | Admin panel password (empty = no gate) |
| `JOBHUNTER_REGISTRATION_ENABLED` | No | `true` | Allow new user registration |
| `JOBHUNTER_EMAIL_PROVIDER` | No | `"resend"` | Email provider: `resend` or `smtp` |
| `JOBHUNTER_RESEND_API_KEY` | For Resend | `""` | Resend API key |
| `JOBHUNTER_NOTIFICATION_EMAIL` | For email | `""` | Notification recipient email |
| `JOBHUNTER_NOTIFICATIONS_ENABLED` | No | `false` | Enable email notifications |
| `JOBHUNTER_SMTP_HOST` | For SMTP | `""` | SMTP server hostname |
| `JOBHUNTER_SMTP_PORT` | No | `587` | SMTP port |
| `JOBHUNTER_SMTP_USER` | No | `""` | SMTP username |
| `JOBHUNTER_SMTP_PASSWORD` | For SMTP | `""` | SMTP password |
| `JOBHUNTER_SMTP_USE_TLS` | No | `true` | Use TLS for SMTP |
All settings are GUI-configurable. You can set every variable above from the Settings page in the web GUI, with no need to manually edit `.env`. Settings saved from the GUI are automatically persisted to `.env` and survive restarts.
Create a .env file in the project root (or configure everything from the GUI):
JOBHUNTER_OPENAI_API_KEY=sk-proj-...
JOBHUNTER_EMAIL_PROVIDER=resend
JOBHUNTER_RESEND_API_KEY=re_...
JOBHUNTER_NOTIFICATION_EMAIL=you@example.com

The app reads .env on startup. Shell environment variables take precedence.
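The precedence order (shell environment, then `.env`, then the built-in default) can be sketched as follows. This is an illustration only: `parse_dotenv` and `resolve` are hypothetical helpers, and the project's own settings loader may work differently.

```python
import os


def parse_dotenv(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve(name: str, dotenv: dict, environ: dict, default: str = "") -> str:
    """Shell environment first, then .env, then the built-in default."""
    if name in environ:
        return environ[name]
    return dotenv.get(name, default)


if __name__ == "__main__":
    dotenv = parse_dotenv("JOBHUNTER_EMAIL_PROVIDER=resend\n")
    # os.environ wins if the variable is also exported in the shell.
    print(resolve("JOBHUNTER_EMAIL_PROVIDER", dotenv, dict(os.environ)))
```

In practice this means `JOBHUNTER_EMAIL_PROVIDER=smtp uv run hunt serve` would override a `resend` value stored in `.env` for that one invocation.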
Your extracted profile — generated by hunt profile or the Setup wizard. Editable manually or via the Profiles page.
user_profile:
  name: Jane Doe
  first_name: Jane
  last_name: Doe
  email: jane@example.com
  phone: "555-0123"
  phone_country_code: "+1"
  title: Senior Python Developer
  summary: Experienced backend engineer with 8 years in Python ecosystems.
  skills:
    - Python
    - FastAPI
    - AWS
  experience_years: 8
  seniority_level: Senior
  spoken_languages:
    - English
    - Spanish
  programming_languages:
    - Python
    - SQL
  preferred_industries: # Boost score for jobs in these industries
    - startups
    - healthcare
  disliked_industries: # Penalise score for jobs in these industries
    - fintech
    - adtech

Search profiles define what jobs to look for and scoring thresholds:
profiles:
  - name: backend-python
    keywords:
      - Senior Python Developer
      - Backend Engineer
    location: Remote
    remote: true
    seniority:
      - Senior
      - Mid-Senior
    blacklist_companies: []
    blacklist_titles:
      - Intern
    min_fit_score: 75 # LLM score threshold (0–100)
    min_similarity: 0.35 # Embedding similarity threshold (0.0–1.0)
    max_applications_per_day: 25

You can have multiple search profiles and select them with `--profile NAME`.
After setup, data/ contains:
data/
├── job_hunter.db ← SQLite database (application + market tables)
├── user_profile.yml ← your extracted profile
├── profiles.yml ← search profiles
├── cookies.json ← LinkedIn session cookies (after hunt login)
├── schedule.yml ← scheduler configuration (time, days, mode)
├── schedule_history.yml ← last 100 scheduled run records
├── users/ ← per-user data directories (data/users/<user_id>/)
└── reports/ ← daily Markdown + JSON reports
Configurable via --data-dir or JOBHUNTER_DATA_DIR.
| Field | Type | Description |
|---|---|---|
| `id` | UUID | Primary key |
| `user_id` | UUID (nullable) | FK to users — multi-user data isolation |
| `source` | string | Job source (default `"linkedin"`) |
| `external_id` | string | LinkedIn job ID |
| `title` | string | Job title |
| `company` | string | Company name |
| `location` | string | Job location |
| `posted_at` | datetime | When posted on LinkedIn |
| `description_text` | text | Full job description |
| `easy_apply` | bool | Easy Apply available |
| `hash` | string | SHA-256 dedup hash |
| `status` | enum | new → scored → queued → applied / skipped / blocked / review / failed |
| `notes` | text | User notes on the job |
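To make the dedup hash concrete, here is a sketch of how a SHA-256 key could be derived from a job's identifying fields. Which fields the project actually hashes, and how it normalises them, is not stated in this README, so the field choice, the `|` separator, and the `job_hash` name are all assumptions.

```python
import hashlib


def job_hash(source: str, external_id: str, title: str, company: str) -> str:
    """Hash a normalised identity string so a re-discovered job maps to the same key.

    Assumption: identity = source + external ID + title + company,
    compared case-insensitively and ignoring surrounding whitespace.
    """
    parts = [source, external_id, title, company]
    canonical = "|".join(p.strip().lower() for p in parts)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(job_hash("linkedin", "3912345678", "Senior Python Developer", "Acme"))
```

Normalising before hashing means cosmetic differences between two scrapes of the same posting (extra spaces, changed capitalisation) do not create duplicate rows.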
| Field | Type | Description |
|---|---|---|
| `user_id` | UUID (nullable) | FK to users — multi-user data isolation |
| `job_hash` | string | References Job |
| `resume_id` | string | Resume identifier (default `"default"`) |
| `embedding_similarity` | float | Cosine similarity (0.0–1.0) |
| `llm_fit_score` | int | LLM evaluation (0–100) |
| `missing_skills` | JSON | Skills the candidate lacks |
| `risk_flags` | JSON | Red flags identified by LLM |
| `decision` | enum | apply / skip / review |
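The `embedding_similarity` field stores a cosine similarity. As a reference for what that number means, here is a plain-Python sketch of the formula; it is not the project's `embeddings.py`, and the toy vectors stand in for real embedding vectors. Note that cosine similarity can in general be negative; the 0.0–1.0 range quoted in the table presumably reflects how the embeddings behave in practice or a clamp applied elsewhere.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as "no similarity".
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    resume_vec = [0.2, 0.7, 0.1]  # toy stand-in for a resume embedding
    job_vec = [0.25, 0.6, 0.3]    # toy stand-in for a job description embedding
    print(round(cosine_similarity(resume_vec, job_vec), 3))
```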
| Field | Type | Description |
|---|---|---|
| `user_id` | UUID (nullable) | FK to users — multi-user data isolation |
| `job_hash` | string | References Job |
| `result` | enum | success / failed / blocked / dry_run / already_applied |
| `failure_stage` | string | Which wizard step failed |
| `form_answers_json` | JSON | Answers submitted in the form |
| Table | Purpose |
|---|---|
| `market_events` | Jobs converted to analysable events |
| `market_extractions` | Signal extraction results per event |
| `market_entities` | Canonical technology/skill/tool entities |
| `market_aliases` | Entity name aliases |
| `market_evidence` | Entity–extraction linkages |
| `market_edges` | Entity co-occurrence edges |
| `market_snapshots` | Per-entity trend data (frequency, momentum, novelty, burst) |
| `dialogue_sessions` | Evaluation dialogue sessions |
| `dialogue_turns` | Individual dialogue turns |
| `dialogue_assessments` | Session quality assessments |
| `candidate_capabilities` | Candidate skill projections |
| `role_requirements` | Role archetype requirements |
| `match_explanations` | Candidate–role match details |
AIJobHunter/
├── pyproject.toml # Dependencies & build config (hatchling)
├── Dockerfile # Multi-stage Docker build
├── Dockerfile.llm # Local LLM sidecar image
├── docker-compose.yml # One-command container deployment
├── README.md
├── AGENTS.md # Detailed architecture & conventions
├── .env.example # Environment config template
│
├── config/
│ └── llm_server.json # LLM sidecar server params (n_ctx, threads, etc.)
│
├── src/job_hunter/
│ ├── cli.py # Typer CLI — all commands + market sub-app
│ ├── llm_client.py # Central LLM client factory + task param resolution
│ │
│ ├── auth/
│ │ ├── models.py # User ORM — credentials, per-user settings
│ │ ├── repo.py # CRUD: create, authenticate, update profile/password
│ │ └── security.py # bcrypt hashing, JWT tokens, admin tokens
│ │
│ ├── config/
│ │ ├── models.py # AppSettings, SearchProfile, UserProfile, ScheduleConfig
│ │ └── loader.py # YAML load/save, settings factory, .env persistence
│ │
│ ├── db/
│ │ ├── models.py # ORM: Job, Score, ApplicationAttempt
│ │ ├── repo.py # DB init, session, CRUD helpers
│ │ └── migrations.py # Lightweight ALTER TABLE migrations (idempotent)
│ │
│ ├── profile/
│ │ ├── extract.py # PDF text + LinkedIn URL scraping
│ │ └── generator.py # LLM profile generation
│ │
│ ├── linkedin/
│ │ ├── session.py # Cookie-based Playwright session
│ │ ├── discover.py # Job search + pagination
│ │ ├── parse.py # HTML → structured data
│ │ ├── apply.py # Easy Apply wizard automation
│ │ ├── forms.py # Form-filling helpers
│ │ ├── form_filler_llm.py # LLM-powered form answers
│ │ ├── selectors.py # CSS/XPath selectors
│ │ └── mock_site/fixtures/ # HTML fixtures for mock mode
│ │
│ ├── matching/
│ │ ├── embeddings.py # Embedding providers (OpenAI + Fake)
│ │ ├── llm_eval.py # LLM evaluators (OpenAI + Fake)
│ │ ├── scoring.py # Combined scoring + market boost
│ │ └── description_cleaner.py # Rule-based + LLM description cleanup
│ │
│ ├── orchestration/
│ │ ├── pipeline.py # Async: discover → score → apply → report
│ │ └── policies.py # Rate limits, blacklists, daily caps
│ │
│ ├── reporting/
│ │ └── report.py # Markdown + JSON reports (+ market section)
│ │
│ ├── notifications/
│ │ └── email.py # Email providers: Resend, SMTP, Fake
│ │
│ ├── scheduling/
│ │ └── scheduler.py # APScheduler cron automation + history
│ │
│ ├── market/ # Market Intelligence package
│ │ ├── pipeline.py # Full 7-step market pipeline
│ │ ├── cli.py # hunt market … subcommands
│ │ ├── events.py # Job → market event ingestion
│ │ ├── extract.py # Signal extraction (Heuristic/OpenAI/Fake)
│ │ ├── normalize.py # Entity canonicalisation + fuzzy matching
│ │ ├── title_normalizer.py # Job title cleaning (Heuristic/OpenAI/Fake)
│ │ ├── role_model.py # Role archetype builder
│ │ ├── candidate_model.py # Candidate capability projection
│ │ ├── matching.py # Candidate ↔ role matching
│ │ ├── opportunity.py # Opportunity scoring + gap analysis
│ │ ├── dialogue.py # Dialogue session CRUD
│ │ ├── dialogue_eval.py # Session evaluation (RuleBased/Fake)
│ │ ├── report.py # Market intelligence report generation
│ │ ├── db_models.py # 13 SQLAlchemy market tables
│ │ ├── schemas.py # Extraction I/O Pydantic models
│ │ ├── repo.py # Market CRUD helpers
│ │ ├── graph/
│ │ │ ├── builder.py # Entity + evidence graph materialisation
│ │ │ └── metrics.py # NetworkX export (JSON / GraphML)
│ │ ├── trends/
│ │ │ ├── compute.py # Trend computation engine
│ │ │ └── queries.py # SQL helpers for trend analysis
│ │ ├── web/
│ │ │ └── router.py # Market web pages + JSON APIs
│ │ └── data/
│ │ └── aliases.yml # Technology alias dictionary
│ │
│ ├── web/ # Web GUI (FastAPI + HTMX + Pico CSS)
│ │ ├── app.py # App factory + lifespan
│ │ ├── deps.py # Dependency injection
│ │ ├── task_manager.py # Background tasks + SSE broadcasting
│ │ ├── routers/
│ │ │ ├── auth.py # Login, register, logout, /api/auth/me
│ │ │ ├── account.py # Account settings — profile & password
│ │ │ ├── admin.py # Admin panel — user mgmt, DB reset
│ │ │ ├── dashboard.py # Visual stats + charts + market panel
│ │ │ ├── jobs.py # Jobs CRUD + bulk actions + market boost
│ │ │ ├── onboarding.py # First-run profile wizard
│ │ │ ├── profiles.py # User + search profile editing
│ │ │ ├── resume_review.py # AI resume gap analysis
│ │ │ ├── run.py # Pipeline + market trigger + SSE progress
│ │ │ ├── reports.py # Daily report viewer
│ │ │ ├── settings.py # Runtime settings + .env persistence
│ │ │ └── schedule.py # Scheduler config + run history
│ │ ├── templates/ # Jinja2 HTML templates
│ │ └── static/ # Banner, favicon
│ │
│ └── utils/
│ ├── logging.py # Structured logging setup
│ ├── rate_limit.py # Token-bucket rate limiter
│ ├── retry.py # Exponential back-off decorator
│ └── hashing.py # SHA-256 job dedup
│
├── tests/ # 377 tests — all run fully offline
│ ├── test_web.py # Web GUI endpoints (incl. schedule, settings, email)
│ ├── test_market.py # Market Intelligence (all stages)
│ ├── test_notifications.py # Email providers + pipeline summary
│ ├── test_scheduling.py # Scheduler config, YAML, start/stop/reschedule
│ ├── test_apply_mock_flow.py # Easy Apply wizard
│ ├── test_discover_parse_mock.py # Discovery & parsing
│ ├── test_matching_scoring.py # Scoring logic
│ ├── test_pipeline.py # Pipeline orchestration
│ ├── test_db_repo.py # Database CRUD
│ ├── test_profile_generation.py # Profile generation
│ ├── test_reporting.py # Report generation
│ ├── test_description_cleaner.py # Description cleanup
│ ├── test_config.py # Config loading
│ ├── test_llm_eval_schema.py # LLM evaluator schema
│ ├── test_linkedin_session.py # Session & URLs
│ └── fixtures/ # Sample data
│
└── data/ # Runtime data (gitignored)
All tests run fully offline — no API keys, internet, or LinkedIn account needed.
uv run pytest -q # Quick summary
uv run pytest -v # Verbose output
uv run pytest tests/test_web.py # Specific test file
uv run pytest tests/test_market.py # Market Intelligence tests
uv run pytest -k "test_upsert" # Pattern matching

Current: 377 tests passed.
Tests use fake implementations for all external services:
- `FakeEmbedder` — fixed similarity scores
- `FakeLLMEvaluator` — deterministic fit scores
- `FakeProfileGenerator` — canned profile output
- `FakeMarketExtractor` — deterministic signal extraction
- `FakeTitleNormalizer` — deterministic title cleaning
- `FakeDialogueEvaluator` — deterministic session scores
- `FakeNotifier` — records emails instead of sending
All database tests use in-memory SQLite. Mock discovery tests spin up a local HTTP server with HTML fixtures.
Primary development is on Windows (PowerShell). Common equivalents:
| Unix | PowerShell |
|---|---|
| `cat file` | `Get-Content file` |
| `head -n 20 file` | `Get-Content file -Head 20` |
| `tail -n 20 file` | `Get-Content file -Tail 20` |
| `tail -f file` | `Get-Content file -Wait -Tail 20` |
| `grep pattern file` | `Select-String -Pattern "pattern" file` |
| `find . -name "*.py"` | `Get-ChildItem -Recurse -Filter *.py` |
| `export VAR=val` | `$env:VAR = "val"` |
| `cp src dst` | `Copy-Item src dst` |
All uv, pytest, and docker compose commands work identically on Windows. Only deploy.sh requires WSL or Docker Desktop.
The included deploy.sh script handles the full lifecycle via docker compose — stop, pull, build, start:
chmod +x deploy.sh
./deploy.sh # app only — web GUI on port 80
./deploy.sh --with-llm # app + local LLM sidecar (auto-downloads model if missing)

Windows: `deploy.sh` is Linux/macOS only. On Windows, use Docker Desktop with the `docker compose` commands below, or run the script in WSL.
This stops running containers, pulls the latest code, rebuilds images, and starts
the app on port 80 with the data/ volume and .env config. The --with-llm
flag additionally builds and starts the local LLM sidecar on port 8080.
# Build and start — app only (works on all platforms including Windows)
docker compose up -d
# App + local LLM sidecar
docker compose --profile local-llm up -d
# View logs
docker compose logs -f
# Stop
docker compose down

The app container:
- Exposes the web GUI on port 80
- Mounts `./data` as a volume for persistent storage
- Reads configuration from `.env`
- Runs a health check every 30 s at `/api/health`
- Restarts automatically (`unless-stopped`)
- Connects to the LLM sidecar at `http://llm:8080/v1` via Docker networking (when running)
An optional self-hosted LLM container provides an OpenAI-compatible API, so the system can run fully offline without an OpenAI API key. Uses llama-cpp-python with GGUF models.
Easiest way — use the deploy script:
./deploy.sh --with-llm # downloads model if missing, builds & starts both containers

Manual setup:
# 1. Download a model (~1.8 GB default)
.\scripts\download_model.ps1 # Windows
./scripts/download_model.sh # Linux/macOS
# 2. Start both containers (app + LLM sidecar on same Docker network)
docker compose --profile local-llm up -d
# 3. Verify the sidecar is running
curl http://localhost:8080/v1/models

Then set LLM Provider → Local LLM in the web Settings page, or add to .env:
JOBHUNTER_LLM_PROVIDER=local

Note: When running via docker compose, the app automatically uses `http://llm:8080/v1` (Docker internal hostname). The `JOBHUNTER_LOCAL_LLM_URL` env var in `docker-compose.yml` is pre-configured, so no manual URL setup is needed.
Server configuration is in config/llm_server.json (mounted into the container):
{
  "model": "/models/model.gguf",
  "n_ctx": 4096,
  "n_batch": 512,
  "n_threads": 4,
  "n_threads_batch": 8,
  "use_mlock": true,
  "cache": true,
  "host": "0.0.0.0",
  "port": 8080
}

Edit this file to tune context size, thread count, caching, etc. Changes take effect on container restart.
Recommended models:
| Model | Size | Quality |
|---|---|---|
| Llama-3.2-1B-Instruct-Q4_K_S | ~700 MB | Basic (fast) |
| Llama-3.2-3B-Instruct-Q4_K_M | ~1.8 GB | Good (default) |
| Phi-3.5-mini-Q4_K_M | ~2.2 GB | Good (128K context) |
| Mistral-7B-Instruct-Q4_K_M | ~4.4 GB | Great (slow) |
Note: For reliable JSON output (scoring, profiles), 3B+ models are recommended. Embeddings still use OpenAI when an API key is set; without a key, similarity defaults to a fixed value.
All LLM calls use centralized temperature and max-token settings from AppSettings. Per-task presets ship with the following values:
| Task | Temperature | Max Tokens | Used By |
|---|---|---|---|
| `scoring` | 0.2 | — | Job fit evaluation |
| `description_clean` | 0.1 | 2000 | Job description formatting |
| `profile_gen` | 0.3 | — | Profile generation |
| `market_extract` | 0.0 | — | Market signal extraction |
| `title_normalize` | 0.0 | — | Job title normalization |
| `form_fill` | 0.1 | 1000 | Easy Apply form filling |
| `resume_review` | 0.4 | 2000 | Resume gap analysis |
Global defaults (temperature, max tokens) are configurable from the Settings page or via environment variables. Per-task overrides automatically apply on top of the global defaults.
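One way to read "per-task overrides apply on top of the global defaults" is as a dictionary merge. The sketch below uses the preset values from the table above; the `TASK_PRESETS` and `resolve_llm_params` names are hypothetical and this is not the project's `llm_client.py`. A task with no max-token preset is assumed to fall back to the global value, where `0` means no limit.

```python
# Per-task presets copied from the table; a missing key means "use the global default".
TASK_PRESETS = {
    "scoring": {"temperature": 0.2},
    "description_clean": {"temperature": 0.1, "max_tokens": 2000},
    "profile_gen": {"temperature": 0.3},
    "market_extract": {"temperature": 0.0},
    "title_normalize": {"temperature": 0.0},
    "form_fill": {"temperature": 0.1, "max_tokens": 1000},
    "resume_review": {"temperature": 0.4, "max_tokens": 2000},
}


def resolve_llm_params(task: str, global_temperature: float = 0.2, global_max_tokens: int = 0) -> dict:
    """Start from the global defaults, then let the task preset override them."""
    params = {"temperature": global_temperature, "max_tokens": global_max_tokens}
    params.update(TASK_PRESETS.get(task, {}))
    return params


if __name__ == "__main__":
    print(resolve_llm_params("form_fill"))
    print(resolve_llm_params("scoring", global_max_tokens=4096))
```

Unknown task names simply receive the global defaults, which keeps new call sites working before a preset is defined for them.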
- No captcha bypassing. Challenges pause the bot and mark the job as `BLOCKED`.
- Rate limiting — configurable delays, daily application caps.
- Dry-run mode — runs everything without submitting applications.
- No secret logging — cookies and API keys never appear in logs or reports.
- Respectful automation — navigates like a human with realistic delays.
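The project tree lists a token-bucket rate limiter under `utils/rate_limit.py`. For readers new to the technique, here is a generic, self-contained sketch of a token bucket; the class shape, parameter names, and injectable clock are assumptions made for illustration and do not mirror the project's implementation.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` actions, refilled at `rate_per_sec` tokens per second."""

    def __init__(self, rate_per_sec: float, capacity: int, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.clock = clock             # injectable so the bucket can be tested without sleeping
        self.updated = clock()

    def try_acquire(self) -> bool:
        """Consume one token if available; return False when the caller should wait."""
        now = self.clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate_per_sec=0.5, capacity=2)  # at most one action every 2 s on average
    print([bucket.try_acquire() for _ in range(3)])
```

A caller that receives `False` would typically sleep and retry, which is what spaces out page loads and form submissions.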
| Phase | Description | Status |
|---|---|---|
| 1 | Skeleton + DB + CLI | ✅ |
| 2 | Profile generation from resume PDF + LinkedIn | ✅ |
| 3 | Mock LinkedIn + HTML parser + discovery | ✅ |
| 4 | Embeddings + LLM scoring | ✅ |
| 5 | Easy Apply automation + LLM form filling | ✅ |
| 6 | Real LinkedIn integration + cookies | ✅ |
| 7 | Orchestration + reporting | ✅ |
| 8 | Web GUI dashboard (FastAPI + HTMX) | ✅ |
| 9 | .env support, industry preferences, UI polish | ✅ |
| 10 | Resume review, visual dashboard | ✅ |
| 11 | Market Intelligence — graph foundation | ✅ |
| 12 | Market Intelligence — trends, roles, dialogue, web | ✅ |
| 13 | Market Intelligence — candidate model, matching, scoring | ✅ |
| 14 | Market Intelligence — UI panels, reporting, evaluation | ✅ |
| 15 | Operational integration — pipeline, SSE, title normalisation | ✅ |
| 16 | Scheduled pipelines, email notifications (Resend + SMTP), Docker | ✅ |
| 17 | Multi-user authentication, account management, admin panel | ✅ |
| 18 | Centralized LLM configuration — server config file, per-task params, GUI controls | ✅ |
| Next | Outcome learning, career trajectory parsing, fairness-aware reranking | 🔜 |
This project is for personal use. See LICENSE for details.
