AI Job Hunter

Automated LinkedIn job discovery, scoring, and application powered by LLM-based resume matching, browser automation, and market intelligence.

AI Job Hunter finds fresh LinkedIn jobs matching your background, scores each one against your resume using embedding similarity and LLM evaluation, and — when a job passes your thresholds — applies via Easy Apply automatically. A built-in Market Intelligence engine analyses technology trends, builds role archetypes, and matches your skills against market demand.

Features
Prerequisites
Installation
Getting Started — Step by Step
- Step 1: Install & Configure
- Step 2: Generate Your Profile
- Step 3: Log in to LinkedIn
- Step 4: Test with a Dry Run
- Step 5: Run for Real
- Step 6: Market Intelligence (Optional)
Web GUI
Authentication & Multi-User
Scheduling
Email Notifications
Operating Modes
CLI Reference
- Global Flags
- Core Commands
- Market Intelligence Commands
Application Pipeline
Market Intelligence Pipeline
Configuration
- Environment Variables
- Profile Files
Data Directory
Data Models
Project Structure
Testing
Docker Deployment
Safety & Ethics
Development Roadmap
License

Features

Feature	Description
Profile generation	Extract skills & experience from resume PDF + LinkedIn (URL or PDF) via LLM
Job discovery	Cookie-based LinkedIn search with pagination, CSS + JS fallback parsing
AI scoring	Embedding similarity + LLM fit evaluation (0–100) with skill gap analysis
Industry preferences	Boost/penalise scores based on preferred and disliked industries
Easy Apply automation	Multi-step wizard with LLM-powered form filling for arbitrary questions
Challenge detection	Pauses on captcha, marks job as BLOCKED — no bypass attempts
Market Intelligence	Technology trend analysis, role archetypes, candidate–role matching, gap analysis
Web GUI	Full command & control dashboard (FastAPI + HTMX + Pico CSS)
Multi-user auth	JWT login, per-user settings & API keys, account management, admin panel
Visual dashboard	Donut chart, fit histogram, skill gap bars, activity timeline, market panel
Scheduled runs	APScheduler cron-based automation with configurable days, time, and pipeline mode
Email notifications	Pipeline summary emails via Resend API or SMTP (Gmail, Outlook, custom)
Resume review	AI-powered gap analysis comparing your resume to target jobs
Daily reports	Markdown + JSON summaries with stats, job tables, and market section
Docker support	Dockerfile + docker-compose with health check, volume mount, auto-restart
Settings persistence	Web UI settings saved to `.env` — survives restarts
Mock mode	Full pipeline testable offline with HTML fixtures — no API keys needed
CLI	`hunt` command with 10+ subcommands and global flags
`.env` support	API keys and settings from `.env` file
SQLite database	Job, Score, ApplicationAttempt + 13 market tables, WAL mode

Prerequisites

Python 3.13+
uv — fast Python package manager
OpenAI API key — for scoring, profile generation, form filling, and market extraction (not needed for mock/offline testing)

Installation

git clone https://github.com/kertser/AIJobHunter.git
cd AIJobHunter
uv sync          # installs all dependencies including dev tools

Windows note: All uv and pytest commands work the same in PowerShell. For .env creation, use Copy-Item .env.example .env instead of cp.

Verify the installation:

uv run hunt --help
uv run pytest -q          # 377 tests, all offline

Getting Started — Step by Step

This is the recommended sequence to go from zero to a fully operational system.

Step 1: Install & Configure

# Install dependencies
uv sync

# Create your .env file
cp .env.example .env          # Linux/macOS
# Copy-Item .env.example .env  # Windows PowerShell

Edit .env and set your OpenAI API key:

JOBHUNTER_OPENAI_API_KEY=sk-proj-...

Initialise the database:

uv run hunt init

This creates data/job_hunter.db and the data/reports/ directory.

Step 2: Generate Your Profile

The system needs to understand your background. Provide your resume (PDF) and optionally your LinkedIn profile:

# Resume + LinkedIn URL (Playwright scrapes the public profile)
uv run hunt profile --resume path/to/resume.pdf --linkedin https://www.linkedin.com/in/your-name/

# Resume + LinkedIn PDF export (if you've downloaded it)
uv run hunt profile --resume path/to/resume.pdf --linkedin linkedin.pdf

# Resume only
uv run hunt profile --resume path/to/resume.pdf

Or use the web GUI: start the server (uv run hunt --real --dry-run serve) and go to http://localhost:8000 — the Setup page will guide you.

This generates two files:

File	Purpose
`data/user_profile.yml`	Your extracted profile: name, skills, experience, desired roles, education
`data/profiles.yml`	Search profiles: keywords, location, seniority, scoring thresholds

Verify what was generated:

uv run hunt profile --show

You can edit both files manually or via the Profiles page in the web GUI.

Step 3: Log in to LinkedIn

The system uses browser cookies for authentication (no password storage):

uv run hunt login

A browser window opens. Log in to LinkedIn manually. Once login is detected, the browser closes and cookies are saved to data/cookies.json.

Note: Cookies expire periodically. If discovery stops working, run uv run hunt login again to refresh data/cookies.json.

Step 4: Test with a Dry Run

Before applying to real jobs, test the full pipeline in dry-run mode:

# Discover + score + "apply" without submitting
uv run hunt --real --dry-run run --profile default

This will:

Discover — search LinkedIn for jobs matching your profile keywords
Score — compute embedding similarity + LLM fit score for each job
Queue — mark qualifying jobs for application
Apply (dry-run) — walk through Easy Apply forms but stop before submitting
Report — generate a daily summary in data/reports/

Check the results:

# View the dashboard
uv run hunt --real --dry-run serve
# Open http://localhost:8000

Step 5: Run for Real

Once you're satisfied with the scoring and job selection:

# Full pipeline — will submit applications
uv run hunt --real run --profile default

Or use the web GUI:

uv run hunt --real serve
# Open http://localhost:8000 → Pipeline page → click "Run Pipeline"

Step 6: Market Intelligence (Optional)

After discovering and scoring some jobs, run market analysis to understand technology trends and how your skills compare to market demand:

Via the web GUI (easiest):

uv run hunt --real serve
# Open http://localhost:8000 → Pipeline page → click "Analyse Market"

Via CLI:

# Full market pipeline (7 steps)
uv run hunt --real market run-all --extractor openai --normalizer openai --profile default

This runs: Ingest → Extract → Graph → Trends → Role Model → Candidate Model → Match.

View results:

Web: http://localhost:8000/market — trends, role archetypes, skill gaps, company demand
CLI: uv run hunt market report — generates a market intelligence report in data/reports/
Job details: each job now shows market-enhanced scoring data

Web GUI

Start the web server:

uv run hunt --real --dry-run serve          # safe mode — won't submit applications
uv run hunt --real serve                    # live mode — will submit applications
uv run hunt --mock serve                    # offline testing with mock data
uv run hunt serve --host 0.0.0.0 --port 3000  # custom bind address

Open http://localhost:8000 in your browser (or the host and port you passed to serve).

Pages

Page	Path	Description
Dashboard	`/`	Visual stats — donut chart, fit histogram, skill gap bars, activity timeline, market intelligence panel
Jobs	`/jobs`	Sortable/filterable table with bulk actions, status management, persistent sort
Job Detail	`/api/jobs/{hash}`	Full description, scores, market boost, Easy Apply button, application history
Pipeline	`/run`	Trigger Discover / Score / Apply / Full Pipeline / Market Analysis with live SSE progress
Market	`/market`	Technology trends, rising entities, role archetypes, candidate matches, company demand
Profiles	`/profiles`	Edit user profile and search profiles (skills, keywords, thresholds)
Resume Review	`/resume-review`	AI gap analysis — missing skills, improvement suggestions, quick wins
Reports	`/reports`	Browse and view daily pipeline + market reports
Account	`/account`	Personal account settings — edit display name, email, change password
Settings	`/settings`	Toggle mock/dry-run/headless, configure LinkedIn session cookies, email notifications, API keys
Schedule	`/schedule`	Cron-style automation — set time, days, pipeline mode, view run history
Admin	`/admin`	User management, profile management, database reset (admin-password protected)
Setup	`/onboarding`	Upload resume PDF + LinkedIn URL to generate profiles (first-run wizard)

Live progress streaming: Pipeline and market operations stream real-time progress via Server-Sent Events (SSE) — you see each step as it happens, with keepalive pings during long operations.
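
As a rough illustration of the wire format behind this streaming, the sketch below formats one named SSE message and one keepalive ping. The function names and the payload fields (`step`, `done`, `total`) are hypothetical and not taken from the project; only the SSE framing itself (an `event:` line, a `data:` line, a blank line, and `:`-prefixed comment lines that clients ignore) is standard.

```python
import json


def sse_event(event: str, payload: dict) -> str:
    """Format one Server-Sent Events message with a named event and JSON data."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def sse_keepalive() -> str:
    """Format a keepalive ping. Lines starting with ':' are comments that clients ignore."""
    return ": keepalive\n\n"


# Example: what a progress update and a ping could look like on the wire.
progress = sse_event("progress", {"step": "score", "done": 3, "total": 10})
ping = sse_keepalive()
```

Sending the ping as a comment keeps proxies and browsers from closing an idle connection without triggering any client-side event handler.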

Authentication & Multi-User

The web GUI is login-protected. All pages (except login/register) require a valid session.

Registration & Login

First user — when no users exist, the registration page is shown automatically. The first registered user is auto-promoted to admin and can optionally set the admin panel password.
Subsequent users — can register while JOBHUNTER_REGISTRATION_ENABLED=true (default).
Sessions — JWT-based via access_token cookie (7-day expiry). Logging in from another browser/device replaces the session.

Account Settings

Click your name in the top navigation bar to open /account, where you can:

Change your display name and email address
Change your password (requires current password)
View account info: user ID, role, status, registration date, last login

Per-User Settings

Each user can override global settings from the Settings page:

OpenAI API key — each user can set their own key
Runtime flags — mock mode, dry-run, headless, slow-mo
Email notifications — provider, credentials, recipient

Any setting a user leaves unset falls back to the global AppSettings value loaded from .env.
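
The fallback rule can be pictured as a simple merge. This is a minimal sketch under the assumption that an unset per-user value is stored as `None`; the setting keys and the `effective_settings` helper are illustrative names, not the project's actual code.

```python
from typing import Any

# Hypothetical global defaults, standing in for values loaded from .env.
GLOBAL_SETTINGS: dict[str, Any] = {
    "openai_api_key": "sk-global",
    "dry_run": True,
    "headless": True,
    "slowmo_ms": 0,
}


def effective_settings(user_overrides: dict[str, Any],
                       global_settings: dict[str, Any]) -> dict[str, Any]:
    """Return the global settings with every non-None user override applied."""
    merged = dict(global_settings)
    for key, value in user_overrides.items():
        if value is not None:  # None means "unset for this user", so inherit
            merged[key] = value
    return merged


# A user who turns dry-run off but has no personal API key.
settings = effective_settings({"dry_run": False, "openai_api_key": None}, GLOBAL_SETTINGS)
```

Testing against `None` rather than truthiness matters here: a user who explicitly sets `dry_run` to `False` must not silently inherit the global `True`.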

Admin Panel

The admin panel (/admin) is protected by a standalone admin password (not tied to any user account). If no admin password is configured, any logged-in user can access it.

The admin panel provides:

User management — activate/deactivate, promote/demote admin, delete users
Database reset — erase all data and start fresh (including admin password)

Set the admin password via JOBHUNTER_ADMIN_PASSWORD in .env, during first-user registration, or from the admin panel itself.

Scheduling

The built-in scheduler runs pipeline jobs automatically on a cron-like schedule. Configure it from the Schedule page in the web GUI or via data/schedule.yml.

Configuration

Setting	Default	Description
Enabled	off	Toggle the scheduler on/off
Time of day	`09:00`	When to run (24 h format)
Days of week	Mon–Fri	Which days to run
Pipeline mode	`full`	`discover` / `discover_score` / `full` / `market`
Profile	`default`	Which search profile to use

How it works

The scheduler runs inside the web server process (APScheduler AsyncIOScheduler).
Only one task runs at a time — if a pipeline is already running, the scheduled trigger is skipped.
After each run, a summary is recorded in data/schedule_history.yml (capped at 100 entries).
If email notifications are enabled, a pipeline summary email is sent after each run.
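
The skip-if-busy rule and the capped history can be sketched with standard-library pieces. This is an illustration only: `SingleRunGuard` and its in-memory history are hypothetical stand-ins, whereas the real scheduler uses APScheduler and writes to data/schedule_history.yml.

```python
import asyncio
from collections import deque
from datetime import datetime, timezone

HISTORY_LIMIT = 100  # matches the documented cap on history entries


class SingleRunGuard:
    """Run at most one pipeline at a time and keep a bounded run history."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self.history: deque = deque(maxlen=HISTORY_LIMIT)

    async def trigger(self, run_pipeline) -> bool:
        """Run the coroutine function unless a run is active. Returns True if it ran."""
        if self._lock.locked():
            return False  # a pipeline is already running, so skip this trigger
        async with self._lock:
            summary = await run_pipeline()
            self.history.append(
                {"finished_at": datetime.now(timezone.utc).isoformat(), "summary": summary}
            )
        return True
```

A `deque` with `maxlen` drops the oldest record automatically once the cap is reached, so no explicit trimming step is needed.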

`data/schedule.yml` example

schedule:
  enabled: true
  time_of_day: '09:00'
  days_of_week: [mon, tue, wed, thu, fri]
  pipeline_mode: full
  profile_name: default
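
To show how such a file could map onto a cron-style trigger, here is a small sketch that validates the `schedule` mapping and converts it into `day_of_week`, `hour`, and `minute` fields. Those three names mirror APScheduler's `CronTrigger` keyword arguments; how the project actually wires them up is an assumption, and `cron_fields` is a hypothetical helper.

```python
VALID_DAYS = ("mon", "tue", "wed", "thu", "fri", "sat", "sun")


def cron_fields(schedule: dict) -> dict:
    """Translate the `schedule` mapping into cron-style fields."""
    hour_text, minute_text = schedule["time_of_day"].split(":")
    hour, minute = int(hour_text), int(minute_text)
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError(f"invalid time_of_day: {schedule['time_of_day']!r}")
    days = [day.lower() for day in schedule["days_of_week"]]
    unknown = [day for day in days if day not in VALID_DAYS]
    if unknown:
        raise ValueError(f"unknown days_of_week: {unknown}")
    return {"day_of_week": ",".join(days), "hour": hour, "minute": minute}


# The example file above, already parsed into a dict.
fields = cron_fields({
    "enabled": True,
    "time_of_day": "09:00",
    "days_of_week": ["mon", "tue", "wed", "thu", "fri"],
    "pipeline_mode": "full",
    "profile_name": "default",
})
```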

Email Notifications

Receive email summaries after each scheduled pipeline run. Two providers are supported:

Resend (recommended — simple)

Sign up at resend.com (free tier: 100 emails/day)
Create an API key
In Settings → Email Notifications, select Resend, paste the API key, enter your email

That's it — no SMTP configuration needed.

SMTP (advanced)

For Gmail, Outlook, or custom mail servers:

Provider	Host	Port	TLS	Notes
Gmail	`smtp.gmail.com`	587	✅	Requires an App Password
Outlook	`smtp.office365.com`	587	✅
Local relay	`localhost`	25	❌	Leave credentials blank

Environment variables

Variable	Description
`JOBHUNTER_EMAIL_PROVIDER`	`resend` (default) or `smtp`
`JOBHUNTER_RESEND_API_KEY`	Resend API key
`JOBHUNTER_NOTIFICATION_EMAIL`	Recipient email address
`JOBHUNTER_NOTIFICATIONS_ENABLED`	`true` / `false`
`JOBHUNTER_SMTP_HOST`	SMTP server hostname
`JOBHUNTER_SMTP_PORT`	SMTP port (default: 587)
`JOBHUNTER_SMTP_USER`	SMTP username (optional for relays)
`JOBHUNTER_SMTP_PASSWORD`	SMTP password
`JOBHUNTER_SMTP_USE_TLS`	`true` / `false`

All settings can also be configured from the Settings page in the web GUI. Changes are persisted to .env automatically.

Operating Modes

The system has three operating modes controlled by global flags:

Mode	Flags	LinkedIn	Applications	Use Case
Mock	`--mock`	Local HTML fixtures	Never	Development & testing
Dry-run	`--real --dry-run`	Real LinkedIn	Stops before submit	Preview & verification
Live	`--real`	Real LinkedIn	Submits for real	Production use

Recommended workflow: Start with --mock to verify setup, then --real --dry-run to preview real jobs, then --real when ready.
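
The table above can be read as a small decision function. The sketch below is one possible reading, with two loudly stated assumptions that the README does not confirm: `--mock` and `--real` are treated as mutually exclusive, and running with neither flag is treated as mock. `Mode`, `resolve_mode`, and `submits_applications` are illustrative names.

```python
from enum import Enum


class Mode(Enum):
    MOCK = "mock"
    DRY_RUN = "dry-run"
    LIVE = "live"


def resolve_mode(mock: bool = False, real: bool = False, dry_run: bool = False) -> Mode:
    """Map the global flags onto one of the three documented modes."""
    if mock and real:
        # Assumption: the two sources cannot be combined.
        raise ValueError("--mock and --real are mutually exclusive")
    if real:
        return Mode.DRY_RUN if dry_run else Mode.LIVE
    # Assumption: without --real the tool stays on local fixtures.
    return Mode.MOCK


def submits_applications(mode: Mode) -> bool:
    """Only live mode submits applications, per the table above."""
    return mode is Mode.LIVE
```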

CLI Reference

hunt [GLOBAL OPTIONS] COMMAND [COMMAND OPTIONS]

Global Flags

Flag	Default	Description
`--mock`	off	Use mock LinkedIn (local HTML fixtures)
`--real`	off	Use real LinkedIn (requires cookies)
`--dry-run`	off	Run without submitting applications
`--headless / --no-headless`	headless	Browser visibility
`--slowmo-ms INT`	`0`	Slow-motion delay in ms (for debugging)
`--data-dir PATH`	`data`	Data directory path
`--log-level`	`INFO`	`DEBUG` / `INFO` / `WARNING` / `ERROR`

Core Commands

Command	Description
`hunt init`	Initialise database and data directory
`hunt login`	Open browser for manual LinkedIn login, save cookies
`hunt profile`	Generate or view user profile and search profiles
`hunt discover --profile NAME`	Discover LinkedIn jobs for a search profile
`hunt score --profile NAME`	Compute fit-scores for discovered jobs
`hunt apply --profile NAME`	Apply to qualified jobs via Easy Apply
`hunt run --profile NAME`	Run full pipeline: discover → score → apply → report
`hunt report`	Generate daily Markdown + JSON report
`hunt serve`	Start web GUI (default: http://localhost:8000)

Market Intelligence Commands

All market commands are under hunt market:

Command	Description
`hunt market run-all`	Run the full 7-step market pipeline
`hunt market ingest`	Convert discovered jobs → market events
`hunt market extract -e TYPE`	Run signal extraction (`heuristic` / `openai` / `fake`)
`hunt market graph`	Build entity + evidence graph from extractions
`hunt market trends`	Compute frequency, momentum, novelty, burst
`hunt market role-model -n TYPE`	Build role archetypes (`heuristic` / `openai` / `fake` / `legacy`)
`hunt market candidate-model -p NAME`	Project user profile into candidate capabilities
`hunt market match -p NAME`	Match candidate against role archetypes
`hunt market export -f FORMAT`	Export graph (`json` / `graphml`)
`hunt market report`	Generate market intelligence report
`hunt market dialogue-list`	List dialogue sessions
`hunt market dialogue-evaluate`	Evaluate un-assessed dialogue sessions

Key Options for `hunt market run-all`

# --extractor:  heuristic | openai | fake
# --normalizer: heuristic | openai | fake | legacy
# --profile:    candidate key
uv run hunt --real market run-all \
    --extractor openai \
    --normalizer openai \
    --profile default

Examples

# Quick local test (no API key needed)
uv run hunt --mock --dry-run run --profile default

# Real LinkedIn, preview only
uv run hunt --real --dry-run run --profile backend-python

# Full pipeline, live applications
uv run hunt --real run --profile default

# Market analysis with OpenAI
uv run hunt --real market run-all --extractor openai --normalizer openai

# Market analysis with heuristic (free, no API key)
uv run hunt --real market run-all --extractor heuristic --normalizer heuristic

# Web GUI
uv run hunt --real --dry-run serve

Application Pipeline

┌──────────┐    ┌───────┐    ┌───────┐    ┌───────┐    ┌────────┐
│ Discover │───>│ Score │───>│ Queue │───>│ Apply │───>│ Report │
└──────────┘    └───────┘    └───────┘    └───────┘    └────────┘
     │               │                        │
     ▼               ▼                        ▼
  LinkedIn      Embeddings +           Easy Apply wizard
  search        LLM evaluation         + LLM form filler
  + parsing     + industry prefs       (Playwright)
                + market boost

Discover — searches LinkedIn using your profile keywords, parses HTML results
Score — computes embedding similarity (0.0–1.0) and LLM fit score (0–100) for each job; applies industry preference boosts; optionally enriches with market opportunity signals
Queue — marks jobs as QUEUED if: easy_apply=true, fit ≥ min_fit_score, similarity ≥ min_similarity
Apply — automates the Easy Apply wizard via Playwright; LLM generates contextual answers for form questions
Report — produces daily Markdown + JSON summaries with stats, job tables, and market section
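
The Queue condition can be sketched as a single predicate. This assumes all three listed conditions must hold at the same time, which the text implies but does not state outright; `ScoredJob` and `should_queue` are hypothetical names, with field names borrowed from the Score model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredJob:
    easy_apply: bool
    llm_fit_score: int            # 0 to 100
    embedding_similarity: float   # 0.0 to 1.0


def should_queue(job: ScoredJob, min_fit_score: int, min_similarity: float) -> bool:
    """Assumption: every documented condition must hold for a job to be queued."""
    return (
        job.easy_apply
        and job.llm_fit_score >= min_fit_score
        and job.embedding_similarity >= min_similarity
    )


# Thresholds taken from the sample search profile (75 and 0.35).
strong = ScoredJob(easy_apply=True, llm_fit_score=82, embedding_similarity=0.41)
weak = ScoredJob(easy_apply=True, llm_fit_score=82, embedding_similarity=0.20)
```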

Job statuses: new → scored → queued → applied / skipped / blocked / review / failed

LLM form filling: the Easy Apply wizard presents arbitrary questions (dropdowns, text fields, radio buttons). The system uses the configured LLM to generate contextually appropriate answers based on your user profile.

Market Intelligence Pipeline

┌────────┐   ┌─────────┐   ┌───────┐   ┌────────┐   ┌────────────┐   ┌───────────┐   ┌───────┐
│ Ingest │──>│ Extract │──>│ Graph │──>│ Trends │──>│ Role Model │──>│ Candidate │──>│ Match │
└────────┘   └─────────┘   └───────┘   └────────┘   └────────────┘   └───────────┘   └───────┘
                                                                         Model

Ingest — converts discovered jobs into market events (idempotent)
Extract — runs signal extraction on job descriptions: technologies, skills, tools, methodologies, certifications, companies (heuristic keyword matching or OpenAI LLM)
Graph — normalises entities (fuzzy matching + alias resolution), builds evidence and co-occurrence edges
Trends — computes per-entity frequency, momentum (rising/falling), novelty, and burst scores
Role Model — clusters jobs by normalised title, builds role archetypes with entity importance weights; title normalisation via heuristic rules or OpenAI LLM
Candidate Model — projects your user_profile.yml skills into the entity graph with confidence scoring
Match — compares your capabilities against each role archetype: success score, confidence, learning upside, mismatch risk, hard/soft/learnable gap classification; graph proximity boost
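
The entity normalisation in the Graph step (alias resolution plus fuzzy matching) might look roughly like the sketch below. It uses `difflib` from the standard library as a stand-in; the project's actual matcher, cutoff, and alias contents are not documented here, so `ALIASES`, `CANONICAL`, and the `0.85` cutoff are all assumptions.

```python
import difflib

# Hypothetical alias table; the project keeps its own in market/data/aliases.yml.
ALIASES = {"k8s": "kubernetes", "postgres": "postgresql", "js": "javascript"}
CANONICAL = ["kubernetes", "postgresql", "javascript", "python", "fastapi"]


def normalise_entity(raw: str, cutoff: float = 0.85) -> str:
    """Resolve a raw mention to a canonical entity name.

    Order: exact alias, exact canonical name, closest fuzzy match,
    otherwise the cleaned input unchanged (treated as a new entity).
    """
    name = raw.strip().lower()
    if name in ALIASES:
        return ALIASES[name]
    if name in CANONICAL:
        return name
    close = difflib.get_close_matches(name, CANONICAL, n=1, cutoff=cutoff)
    return close[0] if close else name
```

Checking aliases before fuzzy matching keeps short abbreviations such as `k8s` from being compared by spelling, where they would never clear the cutoff.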

Outputs:

Per-role opportunity scores with gap analysis
Rising technology trends and skill demand signals
Market-enhanced scoring on individual job detail pages
Market Intelligence section in daily reports

Configuration

Environment Variables

Variable	Required	Default	Description
`JOBHUNTER_OPENAI_API_KEY`	For real use	`""`	OpenAI API key
`JOBHUNTER_LLM_PROVIDER`	No	`"openai"`	LLM provider: `openai` or `local`
`JOBHUNTER_LOCAL_LLM_URL`	For local	`"http://localhost:8080/v1"`	Local LLM server URL
`JOBHUNTER_LOCAL_LLM_MODEL`	No	`""`	Local model name (blank = auto-detect)
`JOBHUNTER_LLM_TEMPERATURE`	No	`0.2`	Global default temperature for LLM calls
`JOBHUNTER_LLM_MAX_TOKENS`	No	`0`	Global default max tokens (0 = no limit)
`JOBHUNTER_DATA_DIR`	No	`"data"`	Path to the data directory
`JOBHUNTER_SECRET_KEY`	No	auto-generated	JWT signing key (auto-persisted on first run)
`JOBHUNTER_ADMIN_PASSWORD`	No	`""`	Admin panel password (empty = no gate)
`JOBHUNTER_REGISTRATION_ENABLED`	No	`true`	Allow new user registration
`JOBHUNTER_EMAIL_PROVIDER`	No	`"resend"`	Email provider: `resend` or `smtp`
`JOBHUNTER_RESEND_API_KEY`	For Resend	`""`	Resend API key
`JOBHUNTER_NOTIFICATION_EMAIL`	For email	`""`	Notification recipient email
`JOBHUNTER_NOTIFICATIONS_ENABLED`	No	`false`	Enable email notifications
`JOBHUNTER_SMTP_HOST`	For SMTP	`""`	SMTP server hostname
`JOBHUNTER_SMTP_PORT`	No	`587`	SMTP port
`JOBHUNTER_SMTP_USER`	No	`""`	SMTP username
`JOBHUNTER_SMTP_PASSWORD`	For SMTP	`""`	SMTP password
`JOBHUNTER_SMTP_USE_TLS`	No	`true`	Use TLS for SMTP

All settings are GUI-configurable: every variable above can be set from the Settings page in the web GUI, so there is no need to edit .env by hand. Settings saved from the GUI are persisted to .env automatically and survive restarts.

Create a .env file in the project root (or configure everything from the GUI):

JOBHUNTER_OPENAI_API_KEY=sk-proj-...
JOBHUNTER_EMAIL_PROVIDER=resend
JOBHUNTER_RESEND_API_KEY=re_...
JOBHUNTER_NOTIFICATION_EMAIL=you@example.com

The app reads .env on startup. Shell environment variables take precedence.
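
The precedence rule can be illustrated with a tiny loader. This is a sketch only: the parser handles plain KEY=VALUE lines and nothing more, and `parse_dotenv` and `load_settings` are hypothetical helpers rather than the project's loader.

```python
import os
from typing import Optional


def parse_dotenv(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(dotenv_text: str, environ: Optional[dict] = None) -> dict:
    """Merge .env values with the process environment. The environment wins."""
    environ = dict(os.environ) if environ is None else environ
    merged = parse_dotenv(dotenv_text)
    merged.update({k: v for k, v in environ.items() if k.startswith("JOBHUNTER_")})
    return merged


# The shell value overrides the one from the file.
example = load_settings(
    "# mail\nJOBHUNTER_EMAIL_PROVIDER=resend\nJOBHUNTER_SMTP_PORT=587\n",
    {"JOBHUNTER_EMAIL_PROVIDER": "smtp", "PATH": "/usr/bin"},
)
```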

Profile Files

`data/user_profile.yml`

Your extracted profile — generated by hunt profile or the Setup wizard. Editable manually or via the Profiles page.

user_profile:
  name: Jane Doe
  first_name: Jane
  last_name: Doe
  email: jane@example.com
  phone: "555-0123"
  phone_country_code: "+1"
  title: Senior Python Developer
  summary: Experienced backend engineer with 8 years in Python ecosystems.
  skills:
    - Python
    - FastAPI
    - AWS
  experience_years: 8
  seniority_level: Senior
  spoken_languages:
    - English
    - Spanish
  programming_languages:
    - Python
    - SQL
  preferred_industries:      # Boost score for jobs in these industries
    - startups
    - healthcare
  disliked_industries:       # Penalise score for jobs in these industries
    - fintech
    - adtech
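
As an illustration of how the two industry lists could influence a score, the sketch below shifts a 0 to 100 fit score up or down and clamps the result. The size of the shift (`delta=10`) and exact-name matching are assumptions; the README does not state how large the boost or penalty is, and `adjust_for_industry` is a hypothetical helper.

```python
def adjust_for_industry(fit_score: int, job_industry: str,
                        preferred: list, disliked: list,
                        delta: int = 10) -> int:
    """Shift a 0-100 fit score by +/- delta based on the industry lists, then clamp."""
    industry = job_industry.strip().lower()
    if industry in {item.lower() for item in preferred}:
        fit_score += delta
    elif industry in {item.lower() for item in disliked}:
        fit_score -= delta
    return max(0, min(100, fit_score))


# Lists taken from the sample profile above.
PREFERRED = ["startups", "healthcare"]
DISLIKED = ["fintech", "adtech"]
boosted = adjust_for_industry(70, "Healthcare", PREFERRED, DISLIKED)
```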

`data/profiles.yml`

Search profiles define what jobs to look for and scoring thresholds:

profiles:
  - name: backend-python
    keywords:
      - Senior Python Developer
      - Backend Engineer
    location: Remote
    remote: true
    seniority:
      - Senior
      - Mid-Senior
    blacklist_companies: []
    blacklist_titles:
      - Intern
    min_fit_score: 75           # LLM score threshold (0–100)
    min_similarity: 0.35        # Embedding similarity threshold (0.0–1.0)
    max_applications_per_day: 25

You can have multiple search profiles and select them with --profile NAME.

Data Directory

After setup, data/ contains:

data/
├── job_hunter.db          ← SQLite database (application + market tables)
├── user_profile.yml       ← your extracted profile
├── profiles.yml           ← search profiles
├── cookies.json           ← LinkedIn session cookies (after hunt login)
├── schedule.yml           ← scheduler configuration (time, days, mode)
├── schedule_history.yml   ← last 100 scheduled run records
├── users/                 ← per-user data directories (data/users/<user_id>/)
└── reports/               ← daily Markdown + JSON reports

Configurable via --data-dir or JOBHUNTER_DATA_DIR.

Data Models

Job

Field	Type	Description
`id`	UUID	Primary key
`user_id`	UUID (nullable)	FK to `users` — multi-user data isolation
`source`	string	Job source (default `"linkedin"`)
`external_id`	string	LinkedIn job ID
`title`	string	Job title
`company`	string	Company name
`location`	string	Job location
`posted_at`	datetime	When posted on LinkedIn
`description_text`	text	Full job description
`easy_apply`	bool	Easy Apply available
`hash`	string	SHA-256 dedup hash
`status`	enum	`new` → `scored` → `queued` → `applied` / `skipped` / `blocked` / `review` / `failed`
`notes`	text	User notes on the job
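
A dedup hash like the `hash` field could be computed as follows. Which fields feed the hash and how they are normalised are assumptions made for this sketch; only the use of SHA-256 comes from the table, and `job_hash` is a hypothetical helper.

```python
import hashlib


def job_hash(source: str, external_id: str, title: str, company: str) -> str:
    """Return a SHA-256 hex digest over normalised identifying fields."""
    parts = [source, external_id, title, company]
    # Lower-case and collapse whitespace so cosmetic differences do not matter.
    normalised = "|".join(" ".join(part.lower().split()) for part in parts)
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


# The same posting scraped twice with different spacing and casing.
first = job_hash("linkedin", "123", "Senior  Python Developer", "Acme")
second = job_hash("LinkedIn", "123", "senior python developer", " ACME ")
```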

Score

Field	Type	Description
`user_id`	UUID (nullable)	FK to `users` — multi-user data isolation
`job_hash`	string	References Job
`resume_id`	string	Resume identifier (default `"default"`)
`embedding_similarity`	float	Cosine similarity (0.0–1.0)
`llm_fit_score`	int	LLM evaluation (0–100)
`missing_skills`	JSON	Skills the candidate lacks
`risk_flags`	JSON	Red flags identified by LLM
`decision`	enum	`apply` / `skip` / `review`
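
For reference, the cosine similarity stored in `embedding_similarity` is the dot product of two vectors divided by the product of their lengths. The sketch below is a plain standard-library version for illustration; the project presumably computes it over embedding vectors from its provider. Note that cosine similarity can be negative in general, and whether the project clamps to the documented 0.0 to 1.0 range is not stated.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```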

ApplicationAttempt

Field	Type	Description
`user_id`	UUID (nullable)	FK to `users` — multi-user data isolation
`job_hash`	string	References Job
`result`	enum	`success` / `failed` / `blocked` / `dry_run` / `already_applied`
`failure_stage`	string	Which wizard step failed
`form_answers_json`	JSON	Answers submitted in the form

Market Tables (13 total)

Table	Purpose
`market_events`	Jobs converted to analysable events
`market_extractions`	Signal extraction results per event
`market_entities`	Canonical technology/skill/tool entities
`market_aliases`	Entity name aliases
`market_evidence`	Entity–extraction linkages
`market_edges`	Entity co-occurrence edges
`market_snapshots`	Per-entity trend data (frequency, momentum, novelty, burst)
`dialogue_sessions`	Evaluation dialogue sessions
`dialogue_turns`	Individual dialogue turns
`dialogue_assessments`	Session quality assessments
`candidate_capabilities`	Candidate skill projections
`role_requirements`	Role archetype requirements
`match_explanations`	Candidate–role match details

Project Structure

AIJobHunter/
├── pyproject.toml                        # Dependencies & build config (hatchling)
├── Dockerfile                            # Multi-stage Docker build
├── Dockerfile.llm                        # Local LLM sidecar image
├── docker-compose.yml                    # One-command container deployment
├── README.md
├── AGENTS.md                             # Detailed architecture & conventions
├── .env.example                          # Environment config template
│
├── config/
│   └── llm_server.json                   # LLM sidecar server params (n_ctx, threads, etc.)
│
├── src/job_hunter/
│   ├── cli.py                            # Typer CLI — all commands + market sub-app
│   ├── llm_client.py                     # Central LLM client factory + task param resolution
│   │
│   ├── auth/
│   │   ├── models.py                     # User ORM — credentials, per-user settings
│   │   ├── repo.py                       # CRUD: create, authenticate, update profile/password
│   │   └── security.py                   # bcrypt hashing, JWT tokens, admin tokens
│   │
│   ├── config/
│   │   ├── models.py                     # AppSettings, SearchProfile, UserProfile, ScheduleConfig
│   │   └── loader.py                     # YAML load/save, settings factory, .env persistence
│   │
│   ├── db/
│   │   ├── models.py                     # ORM: Job, Score, ApplicationAttempt
│   │   ├── repo.py                       # DB init, session, CRUD helpers
│   │   └── migrations.py                 # Lightweight ALTER TABLE migrations (idempotent)
│   │
│   ├── profile/
│   │   ├── extract.py                    # PDF text + LinkedIn URL scraping
│   │   └── generator.py                  # LLM profile generation
│   │
│   ├── linkedin/
│   │   ├── session.py                    # Cookie-based Playwright session
│   │   ├── discover.py                   # Job search + pagination
│   │   ├── parse.py                      # HTML → structured data
│   │   ├── apply.py                      # Easy Apply wizard automation
│   │   ├── forms.py                      # Form-filling helpers
│   │   ├── form_filler_llm.py            # LLM-powered form answers
│   │   ├── selectors.py                  # CSS/XPath selectors
│   │   └── mock_site/fixtures/           # HTML fixtures for mock mode
│   │
│   ├── matching/
│   │   ├── embeddings.py                 # Embedding providers (OpenAI + Fake)
│   │   ├── llm_eval.py                   # LLM evaluators (OpenAI + Fake)
│   │   ├── scoring.py                    # Combined scoring + market boost
│   │   └── description_cleaner.py        # Rule-based + LLM description cleanup
│   │
│   ├── orchestration/
│   │   ├── pipeline.py                   # Async: discover → score → apply → report
│   │   └── policies.py                   # Rate limits, blacklists, daily caps
│   │
│   ├── reporting/
│   │   └── report.py                     # Markdown + JSON reports (+ market section)
│   │
│   ├── notifications/
│   │   └── email.py                      # Email providers: Resend, SMTP, Fake
│   │
│   ├── scheduling/
│   │   └── scheduler.py                  # APScheduler cron automation + history
│   │
│   ├── market/                           # Market Intelligence package
│   │   ├── pipeline.py                   # Full 7-step market pipeline
│   │   ├── cli.py                        # hunt market … subcommands
│   │   ├── events.py                     # Job → market event ingestion
│   │   ├── extract.py                    # Signal extraction (Heuristic/OpenAI/Fake)
│   │   ├── normalize.py                  # Entity canonicalisation + fuzzy matching
│   │   ├── title_normalizer.py           # Job title cleaning (Heuristic/OpenAI/Fake)
│   │   ├── role_model.py                 # Role archetype builder
│   │   ├── candidate_model.py            # Candidate capability projection
│   │   ├── matching.py                   # Candidate ↔ role matching
│   │   ├── opportunity.py                # Opportunity scoring + gap analysis
│   │   ├── dialogue.py                   # Dialogue session CRUD
│   │   ├── dialogue_eval.py              # Session evaluation (RuleBased/Fake)
│   │   ├── report.py                     # Market intelligence report generation
│   │   ├── db_models.py                  # 13 SQLAlchemy market tables
│   │   ├── schemas.py                    # Extraction I/O Pydantic models
│   │   ├── repo.py                       # Market CRUD helpers
│   │   ├── graph/
│   │   │   ├── builder.py                # Entity + evidence graph materialisation
│   │   │   └── metrics.py                # NetworkX export (JSON / GraphML)
│   │   ├── trends/
│   │   │   ├── compute.py                # Trend computation engine
│   │   │   └── queries.py                # SQL helpers for trend analysis
│   │   ├── web/
│   │   │   └── router.py                 # Market web pages + JSON APIs
│   │   └── data/
│   │       └── aliases.yml               # Technology alias dictionary
│   │
│   ├── web/                              # Web GUI (FastAPI + HTMX + Pico CSS)
│   │   ├── app.py                        # App factory + lifespan
│   │   ├── deps.py                       # Dependency injection
│   │   ├── task_manager.py               # Background tasks + SSE broadcasting
│   │   ├── routers/
│   │   │   ├── auth.py                   # Login, register, logout, /api/auth/me
│   │   │   ├── account.py                # Account settings — profile & password
│   │   │   ├── admin.py                  # Admin panel — user mgmt, DB reset
│   │   │   ├── dashboard.py              # Visual stats + charts + market panel
│   │   │   ├── jobs.py                   # Jobs CRUD + bulk actions + market boost
│   │   │   ├── onboarding.py             # First-run profile wizard
│   │   │   ├── profiles.py               # User + search profile editing
│   │   │   ├── resume_review.py          # AI resume gap analysis
│   │   │   ├── run.py                    # Pipeline + market trigger + SSE progress
│   │   │   ├── reports.py                # Daily report viewer
│   │   │   ├── settings.py               # Runtime settings + .env persistence
│   │   │   └── schedule.py               # Scheduler config + run history
│   │   ├── templates/                    # Jinja2 HTML templates
│   │   └── static/                       # Banner, favicon
│   │
│   └── utils/
│       ├── logging.py                    # Structured logging setup
│       ├── rate_limit.py                 # Token-bucket rate limiter
│       ├── retry.py                      # Exponential back-off decorator
│       └── hashing.py                    # SHA-256 job dedup
│
├── tests/                                # 377 tests — all run fully offline
│   ├── test_web.py                       # Web GUI endpoints (incl. schedule, settings, email)
│   ├── test_market.py                    # Market Intelligence (all stages)
│   ├── test_notifications.py             # Email providers + pipeline summary
│   ├── test_scheduling.py                # Scheduler config, YAML, start/stop/reschedule
│   ├── test_apply_mock_flow.py           # Easy Apply wizard
│   ├── test_discover_parse_mock.py       # Discovery & parsing
│   ├── test_matching_scoring.py          # Scoring logic
│   ├── test_pipeline.py                  # Pipeline orchestration
│   ├── test_db_repo.py                   # Database CRUD
│   ├── test_profile_generation.py        # Profile generation
│   ├── test_reporting.py                 # Report generation
│   ├── test_description_cleaner.py       # Description cleanup
│   ├── test_config.py                    # Config loading
│   ├── test_llm_eval_schema.py           # LLM evaluator schema
│   ├── test_linkedin_session.py          # Session & URLs
│   └── fixtures/                         # Sample data
│
└── data/                                 # Runtime data (gitignored)
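
The utils/hashing.py entry above mentions SHA-256 job dedup. As a rough illustration only, a helper of that kind might look like the sketch below. The function names (job_fingerprint, is_new_job) and the choice of company, title, and location as the identifying fields are assumptions made for this example, not the project's actual implementation.

```python
import hashlib


def job_fingerprint(company: str, title: str, location: str) -> str:
    """Return a stable SHA-256 hex digest for a job posting (hypothetical helper)."""
    # Lower-case and collapse whitespace so trivial formatting differences
    # between two scrapes of the same posting produce the same digest.
    parts = [" ".join(p.lower().split()) for p in (company, title, location)]
    # "\n" cannot survive the whitespace collapse above, so it is a safe separator.
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


def is_new_job(seen: set, company: str, title: str, location: str) -> bool:
    """Record the posting in `seen`; return False if it was already present."""
    fingerprint = job_fingerprint(company, title, location)
    if fingerprint in seen:
        return False
    seen.add(fingerprint)
    return True
```

Hashing normalised fields instead of the raw HTML keeps the key stable when LinkedIn changes markup around an otherwise identical posting.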

Testing

All tests run fully offline — no API keys, internet, or LinkedIn account needed.

uv run pytest -q                    # Quick summary
uv run pytest -v                    # Verbose output
uv run pytest tests/test_web.py     # Specific test file
uv run pytest tests/test_market.py  # Market Intelligence tests
uv run pytest -k "test_upsert"      # Pattern matching

Current: 377 tests passed.

Tests use fake implementations for all external services:

FakeEmbedder — fixed similarity scores
FakeLLMEvaluator — deterministic fit scores
FakeProfileGenerator — canned profile output
FakeMarketExtractor — deterministic signal extraction
FakeTitleNormalizer — deterministic title cleaning
FakeDialogueEvaluator — deterministic session scores
FakeNotifier — records emails instead of sending

All database tests use in-memory SQLite. Mock discovery tests spin up a local HTTP server with HTML fixtures.
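
To illustrate the fake-implementation pattern listed above, here is a minimal sketch of a notifier that records emails instead of sending them. The class name FakeNotifier comes from the list, but its interface (a send method and a sent list), the SentEmail record, and the notify_summary function are assumptions made for this example; the real test doubles may be shaped differently.

```python
from dataclasses import dataclass, field


@dataclass
class SentEmail:
    to: str
    subject: str
    body: str


@dataclass
class FakeNotifier:
    """Test double: stores every email in memory (hypothetical interface)."""
    sent: list = field(default_factory=list)

    def send(self, to: str, subject: str, body: str) -> None:
        # No network call; the test inspects `sent` afterwards.
        self.sent.append(SentEmail(to, subject, body))


def notify_summary(notifier, to: str, applied: int, skipped: int) -> None:
    """Code under test: depends only on the notifier's send() method."""
    notifier.send(to, "Pipeline summary", f"applied={applied} skipped={skipped}")


def test_summary_is_recorded() -> None:
    notifier = FakeNotifier()
    notify_summary(notifier, "me@example.com", applied=3, skipped=7)
    assert len(notifier.sent) == 1
    assert notifier.sent[0].body == "applied=3 skipped=7"
```

Because the code under test receives the notifier as an argument, swapping in the fake needs no patching and no SMTP or Resend credentials.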

Windows Development Notes

Primary development is on Windows (PowerShell). Common equivalents:

Unix	PowerShell
`cat file`	`Get-Content file`
`head -n 20 file`	`Get-Content file -Head 20`
`tail -n 20 file`	`Get-Content file -Tail 20`
`tail -f file`	`Get-Content file -Wait -Tail 20`
`grep pattern file`	`Select-String -Pattern "pattern" file`
`find . -name "*.py"`	`Get-ChildItem -Recurse -Filter *.py`
`export VAR=val`	`$env:VAR = "val"`
`cp src dst`	`Copy-Item src dst`

All uv, pytest, and docker compose commands work identically on Windows. Only deploy.sh is Unix-only: run it in WSL, or use the docker compose commands directly with Docker Desktop.

Docker Deployment

One-command deploy

The included deploy.sh script handles the full lifecycle via docker compose — stop, pull, build, start:

chmod +x deploy.sh
./deploy.sh              # app only — web GUI on port 80
./deploy.sh --with-llm   # app + local LLM sidecar (auto-downloads model if missing)

Windows: deploy.sh is Linux/macOS only. On Windows, use Docker Desktop with the docker compose commands below, or run the script in WSL.

The script stops running containers, pulls the latest code, rebuilds images, and starts the app on port 80 with the data/ volume and .env config. The --with-llm flag additionally builds and starts the local LLM sidecar on port 8080.

Docker Compose

# Build and start — app only (works on all platforms including Windows)
docker compose up -d

# App + local LLM sidecar
docker compose --profile local-llm up -d

# View logs
docker compose logs -f

# Stop
docker compose down

The app container:

Exposes the web GUI on port 80
Mounts ./data as a volume for persistent storage
Reads configuration from .env
Runs a health check every 30 s at /api/health
Restarts automatically (unless-stopped)
Connects to the LLM sidecar at http://llm:8080/v1 via Docker networking (when running)

Local LLM Sidecar

An optional self-hosted LLM container provides an OpenAI-compatible API, so the system can run fully offline without an OpenAI API key. The sidecar uses llama-cpp-python with GGUF models.

Easiest way — use the deploy script:

./deploy.sh --with-llm    # downloads model if missing, builds & starts both containers

Manual setup:

# 1. Download a model (~1.8 GB default)
.\scripts\download_model.ps1          # Windows
./scripts/download_model.sh           # Linux/macOS

# 2. Start both containers (app + LLM sidecar on same Docker network)
docker compose --profile local-llm up -d

# 3. Verify the sidecar is running
curl http://localhost:8080/v1/models

Then set LLM Provider → Local LLM in the web Settings page, or add this line to .env:

JOBHUNTER_LLM_PROVIDER=local

Note: When running via docker compose, the app automatically uses http://llm:8080/v1 (Docker internal hostname). The JOBHUNTER_LOCAL_LLM_URL env var in docker-compose.yml is pre-configured — no manual URL setup needed.

Server configuration is in config/llm_server.json (mounted into the container):

{
  "model": "/models/model.gguf",
  "n_ctx": 4096,
  "n_batch": 512,
  "n_threads": 4,
  "n_threads_batch": 8,
  "use_mlock": true,
  "cache": true,
  "host": "0.0.0.0",
  "port": 8080
}

Edit this file to tune context size, thread count, caching, etc. Changes take effect on container restart.
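
If you prefer to script the change rather than edit the file by hand, a small helper along these lines could work. This is a sketch under assumptions: the function name tune_llm_server is hypothetical, and it only accepts keys that already exist in the file, so it cannot tell you which other options the server supports.

```python
import json
from pathlib import Path


def tune_llm_server(path: Path, **overrides) -> dict:
    """Update existing keys in an llm_server.json-style file and return the result."""
    config = json.loads(path.read_text(encoding="utf-8"))
    # Refuse keys that are not already in the file, to catch typos such as "n_ctxt".
    unknown = set(overrides) - set(config)
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    config.update(overrides)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, tune_llm_server(Path("config/llm_server.json"), n_ctx=8192, n_threads=8) would raise the context size and thread count; restart the sidecar container afterwards for the change to apply.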

Recommended models:

Model	Size	Quality
Llama-3.2-1B-Instruct-Q4_K_S	~700 MB	Basic (fast)
Llama-3.2-3B-Instruct-Q4_K_M	~1.8 GB	Good (default)
Phi-3.5-mini-Q4_K_M	~2.2 GB	Good (128K context)
Mistral-7B-Instruct-Q4_K_M	~4.4 GB	Great (slow)

Note: For reliable JSON output (scoring, profiles), 3B+ models are recommended. Embeddings still use OpenAI when an API key is set; without a key, similarity defaults to a fixed value.

LLM Inference Parameters

All LLM calls use centralized temperature and max-token settings from AppSettings. Per-task presets ship with the values below; a dash in the Max Tokens column means the task sets no override, so the global default applies:

Task	Temperature	Max Tokens	Used By
`scoring`	0.2	—	Job fit evaluation
`description_clean`	0.1	2000	Job description formatting
`profile_gen`	0.3	—	Profile generation
`market_extract`	0.0	—	Market signal extraction
`title_normalize`	0.0	—	Job title normalization
`form_fill`	0.1	1000	Easy Apply form filling
`resume_review`	0.4	2000	Resume gap analysis

Global defaults (temperature, max tokens) are configurable from the Settings page or via environment variables. Per-task overrides automatically apply on top of the global defaults.
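
The layering of per-task presets over global defaults can be sketched as follows. The preset values are copied from the table above; everything else is an assumption for illustration: the names InferenceParams, TASK_PRESETS, and resolve_params are hypothetical, a dash in the table is modelled as None meaning "use the global default", and the global default values used in the usage note are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InferenceParams:
    temperature: float
    max_tokens: int


# (temperature, max_tokens) per task; None means "no per-task override".
TASK_PRESETS = {
    "scoring": (0.2, None),
    "description_clean": (0.1, 2000),
    "profile_gen": (0.3, None),
    "market_extract": (0.0, None),
    "title_normalize": (0.0, None),
    "form_fill": (0.1, 1000),
    "resume_review": (0.4, 2000),
}


def resolve_params(task: str, defaults: InferenceParams) -> InferenceParams:
    """Apply the task preset on top of the global defaults."""
    temperature, max_tokens = TASK_PRESETS.get(task, (None, None))
    # Compare against None explicitly: a temperature of 0.0 is a valid
    # override and would be lost with a truthiness check such as `a or b`.
    return InferenceParams(
        temperature=defaults.temperature if temperature is None else temperature,
        max_tokens=defaults.max_tokens if max_tokens is None else max_tokens,
    )
```

With hypothetical global defaults of InferenceParams(0.7, 1500), resolve_params("scoring", ...) yields a temperature of 0.2 with 1500 max tokens, while an unlisted task simply gets the defaults back.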

Safety & Ethics

No captcha bypassing. Challenges pause the bot and mark the job as BLOCKED.
Rate limiting — configurable delays, daily application caps.
Dry-run mode — runs everything without submitting applications.
No secret logging — cookies and API keys never appear in logs or reports.
Respectful automation — navigates like a human with realistic delays.
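
For readers unfamiliar with the token-bucket approach named in utils/rate_limit.py, the sketch below shows the general technique. It is not the project's code: the class name, constructor parameters, and the injectable clock are assumptions chosen to keep the example small and testable.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock          # injectable so tests need not sleep
        self._tokens = capacity      # start with a full bucket
        self._last = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Take `cost` tokens if available; return False without blocking otherwise."""
        now = self._clock()
        # Refill for the elapsed time, never exceeding the bucket capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

A caller that gets False back can wait and retry, which produces spaced-out requests. A daily application cap is a separate concern and would typically be a persisted counter, not a bucket.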

Development Roadmap

Phase	Description	Status
1	Skeleton + DB + CLI	✅
2	Profile generation from resume PDF + LinkedIn	✅
3	Mock LinkedIn + HTML parser + discovery	✅
4	Embeddings + LLM scoring	✅
5	Easy Apply automation + LLM form filling	✅
6	Real LinkedIn integration + cookies	✅
7	Orchestration + reporting	✅
8	Web GUI dashboard (FastAPI + HTMX)	✅
9	`.env` support, industry preferences, UI polish	✅
10	Resume review, visual dashboard	✅
11	Market Intelligence — graph foundation	✅
12	Market Intelligence — trends, roles, dialogue, web	✅
13	Market Intelligence — candidate model, matching, scoring	✅
14	Market Intelligence — UI panels, reporting, evaluation	✅
15	Operational integration — pipeline, SSE, title normalisation	✅
16	Scheduled pipelines, email notifications (Resend + SMTP), Docker	✅
17	Multi-user authentication, account management, admin panel	✅
18	Centralized LLM configuration — server config file, per-task params, GUI controls	✅
Next	Outcome learning, career trajectory parsing, fairness-aware reranking	🔜

License

This project is for personal use. See LICENSE for details.
