JOBSENSE 3B

language

en

license

apache-2.0

INPUT: RAW_TEXT ▶ PROCESS: LORA_COMPILER ▶ OUTPUT: JSON_OBJECT

DEVELOPER_URI: github://mantraraval

JSON CONFORMANCE 99.6%

THROUGHPUT 112.5 T/s

LATENCY 0.58s

1. Abstract & Architectural Strategy

Extracting structured data from unstructured recruitment documents (job descriptions) is a core bottleneck in automated HR technologies. While general-purpose Large Language Models (LLMs) can perform this task, their production deployment is hindered by three major issues: structural hallucinations (invalid JSON), high inference latency, and substantial API usage costs.

JobSense is a specialized 3B-parameter model developed to address these constraints. Built by fine-tuning Qwen2.5-3B-Instruct on the structured dataset mantraraval/jd-extraction-dataset, JobSense functions as a deterministic compiler for job descriptions. It combines Parameter-Efficient Fine-Tuning (PEFT) via Low-Rank Adaptation (LoRA) with grammar-constrained decoding (logit filters) to enforce strict schema alignment. In the evaluation reported in Section 3, this yields 99.6% schema-conformant JSON at a fraction of the cost and latency of commercial API models.

graph TD
    JD[Unstructured Job Description] --> Ingestion[Inference Pipeline]
    Ingestion --> Prompt[Structure Prompt: ChatML Wrapper]
    Prompt --> Base[Qwen2.5-3B-Instruct Base Model]
    Base --> LoRA[JobSense PEFT/LoRA Adapter]
    LoRA --> OutputTokens[Generated Logits]
    OutputTokens --> GrammarFilter[Logit Bias Processor / Schema Constraints]
    GrammarFilter --> JSON[Validated JSON Payload]
    
    style JD fill:#10161a,stroke:#3880b8,stroke-width:2px,color:#fff
    style Ingestion fill:#182026,stroke:#303e48,stroke-width:1px,color:#a7b6c2
    style Prompt fill:#182026,stroke:#303e48,stroke-width:1px,color:#a7b6c2
    style Base fill:#182026,stroke:#303e48,stroke-width:1px,color:#a7b6c2
    style LoRA fill:#106ba3,stroke:#3880b8,stroke-width:1px,color:#fff
    style OutputTokens fill:#182026,stroke:#303e48,stroke-width:1px,color:#a7b6c2
    style GrammarFilter fill:#0f2b1d,stroke:#0f9960,stroke-width:2px,color:#fff
    style JSON fill:#0f2b1d,stroke:#0f9960,stroke-width:2px,color:#fff
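
The grammar filter stage in the diagram can be illustrated with a toy example. This is a minimal sketch, not JobSense's actual processor: the six-token vocabulary, the raw scores, and the `mask_logits` and `greedy_pick` helpers are all made up for illustration. It shows only the core idea: tokens the schema does not allow at the current position get a logit of negative infinity, so greedy selection can never choose them.

```python
import math

# Hypothetical toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ["junior", "mid", "senior", "lead", "expert", "ninja"]

# Values the schema allows for the `seniority` field.
ALLOWED_SENIORITY = {"junior", "mid", "senior", "lead"}


def mask_logits(logits, vocab, allowed):
    """Set the logit of every token outside `allowed` to -inf."""
    return [
        score if token in allowed else -math.inf
        for token, score in zip(vocab, logits)
    ]


def greedy_pick(logits, vocab):
    """Return the token with the highest logit."""
    best_index = max(range(len(logits)), key=lambda i: logits[i])
    return vocab[best_index]


# Made-up raw scores where the unconstrained favourite is off-schema.
raw_logits = [0.1, 0.3, 2.0, 1.2, 2.5, 0.4]

unconstrained = greedy_pick(raw_logits, VOCAB)
constrained = greedy_pick(mask_logits(raw_logits, VOCAB, ALLOWED_SENIORITY), VOCAB)
print(unconstrained, constrained)
```

Production grammar-constrained decoders apply the same masking step per generated token, with the allowed set derived from the JSON schema and the text emitted so far.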

2. Interactive Verification

The system is hosted as an interactive dashboard on Hugging Face Spaces, allowing real-time evaluation of the model's extraction capabilities:

👉 Launch JobSense Demo Space

3. Empirical Evaluation & Benchmarks

To establish empirical validity, JobSense was evaluated against leading general-purpose open models and commercial APIs.

Evaluation Setup

Test Dataset: Held-out split of jd-extraction-dataset ($N=500$ distinct job descriptions, manually verified and annotated).
Metrics:
- JSON Conformance Rate: The percentage of generated model outputs that parsed as syntactically valid JSON matching the target schema.
- Skill F1-Score: Harmonic mean of precision and recall for extracted technologies and capabilities.
- Seniority Match Accuracy: Absolute matching accuracy on a 4-tier seniority scale (junior, mid, senior, lead).
- Throughput: Average tokens generated per second under identical hardware configurations.
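
As an illustration of the Skill F1-Score, the sketch below scores one document by comparing predicted and reference skill names as sets. The matching rule (case-insensitive exact match on the name) and the per-document scoring are assumptions; the evaluation's exact matching and averaging procedure is not specified here, and `skill_f1` is a hypothetical helper.

```python
def skill_f1(predicted, reference):
    """Precision, recall, and F1 over case-insensitive skill-name sets."""
    pred = {name.strip().lower() for name in predicted}
    gold = {name.strip().lower() for name in reference}
    true_positives = len(pred & gold)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative lists: three of four reference skills were extracted.
predicted = ["FastAPI", "MongoDB", "httpx"]
reference = ["FastAPI", "MongoDB", "httpx", "Uvicorn"]

precision, recall, f1 = skill_f1(predicted, reference)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
# precision=1.000 recall=0.750 f1=0.857
```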

Performance Comparison

Model	Parameters	JSON Conformance (%)	Skill F1-Score (%)	Seniority Accuracy (%)	Throughput (tok/sec)	Relative Cost / 1M tokens
Llama-3-8B-Instruct	8B	89.4%	76.2%	81.0%	45.2 tok/s	1.00x
Claude 3 Haiku	-	97.8%	82.5%	84.1%	API Dependent	2.50x
GPT-4o-mini	-	98.2%	83.1%	85.6%	API Dependent	1.50x
Qwen2.5-3B-Instruct (Zero-Shot)	3B	84.1%	71.5%	74.3%	78.4 tok/s	0.38x
JobSense (Ours)	3B	99.6%	89.4%	91.2%	112.5 tok/s	0.38x

Hardware Config: Local benchmarks executed on a single NVIDIA A10G (24GB GDDR6 VRAM) utilizing vLLM optimization with FP16 precision.

4. Signal Extraction Framework & Schema

JobSense translates unstructured text into a validated, typed schema. Below are the structural guidelines enforced by the model:

Root Properties

Field	Type	Extraction Target & Constraints
`role`	`string`	Canonical title of the job opening (e.g. "Backend Developer").
`sub_role`	`string`	Niche specialization or specific tech alignment (e.g. "FastAPI Backend Dev").
`seniority`	`string`	Normalizes to: `junior` · `mid` · `senior` · `lead`
`skills`	`array`	A list of structured Skill Objects (defined below).
`experience`	`string`	Explicit or implicit years of experience required (e.g. "6 to 8 years").
`location`	`string`	Core target location of the job.
`location_type`	`string`	Normalizes to: `city` · `region` · `country` · `remote`
`work_mode`	`string`	Normalizes to: `hybrid` · `remote` · `onsite`
`joining`	`string`	Normalizes to: `immediate` · `notice_period` · `flexible`
`salary`	`string`	Normalized compensation details or qualitative state (e.g. "competitive").

Skill Object Schema

Field	Type	Extraction Target & Constraints
`name`	`string`	Canonical name of the tool, language, or capability.
`importance`	`string`	Normalizes to: `required` · `preferred` · `contextual`
`category`	`string`	Normalizes to technical domains (e.g., `backend`, `frontend`, `database`, `devops`).
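
The constraints in the two tables above can be checked downstream with a few lines of standard-library Python. This is a minimal sketch: `validate_extraction` is a hypothetical helper, it only checks types and enum values for root fields that are present, and it assumes every skill object carries all three keys. Which root fields are mandatory is not stated in the tables, so missing root fields are not flagged.

```python
import json

ROOT_STRING_FIELDS = ("role", "sub_role", "experience", "location", "salary")
ROOT_ENUM_FIELDS = {
    "seniority": {"junior", "mid", "senior", "lead"},
    "location_type": {"city", "region", "country", "remote"},
    "work_mode": {"hybrid", "remote", "onsite"},
    "joining": {"immediate", "notice_period", "flexible"},
}
SKILL_IMPORTANCE = {"required", "preferred", "contextual"}


def validate_extraction(raw):
    """Return a list of problems; an empty list means the payload conforms."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return ["top-level value is not an object"]

    problems = []
    for field in ROOT_STRING_FIELDS:
        if field in payload and not isinstance(payload[field], str):
            problems.append(f"{field}: expected string")
    for field, allowed in ROOT_ENUM_FIELDS.items():
        value = payload.get(field)
        if field in payload and (not isinstance(value, str) or value not in allowed):
            problems.append(f"{field}: {value!r} is not one of {sorted(allowed)}")

    skills = payload.get("skills", [])
    if not isinstance(skills, list):
        return problems + ["skills: expected array"]
    for index, skill in enumerate(skills):
        if not isinstance(skill, dict):
            problems.append(f"skills[{index}]: expected object")
            continue
        for key in ("name", "category"):
            if not isinstance(skill.get(key), str):
                problems.append(f"skills[{index}].{key}: expected string")
        importance = skill.get("importance")
        if not isinstance(importance, str) or importance not in SKILL_IMPORTANCE:
            problems.append(f"skills[{index}].importance: {importance!r} is not allowed")
    return problems


good = '{"role": "Backend Developer", "seniority": "senior", "skills": [{"name": "FastAPI", "importance": "required", "category": "backend"}]}'
bad = '{"role": "Backend Developer", "seniority": "principal", "skills": [{"name": "FastAPI", "importance": "must"}]}'

print(validate_extraction(good))  # []
for problem in validate_extraction(bad):
    print(problem)
```

A check like this is also one way to compute a conformance rate: count the outputs for which the returned list is empty.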

5. Standard Ingestion Pipeline

Input Text

We are looking for an experienced Backend Developer to lead our team.
FastAPI is mandatory. MongoDB, httpx, Uvicorn are preferred.
Hybrid role in Delhi. 6-8 years exp. Immediate joiners preferred.
Competitive salary.

Extracted JSON Output

{
  "role": "Backend Developer",
  "sub_role": "FastAPI Backend Dev",
  "seniority": "senior",
  "skills": [
    { "name": "FastAPI", "importance": "required", "category": "backend" },
    { "name": "MongoDB", "importance": "preferred", "category": "database" },
    { "name": "httpx", "importance": "preferred", "category": "networking" }
  ],
  "experience": "6 to 8 years",
  "location": "Delhi",
  "location_type": "city",
  "work_mode": "hybrid",
  "joining": "immediate",
  "salary": "competitive"
}

6. Training Mechanics & Curation Protocol

Data Synthesis & Curation

The fine-tuning dataset mantraraval/jd-extraction-dataset was curated to balance industry domain coverage and minimize geographic bias:

Core Volumes: 5,200 curated, high-quality document-schema pairs.
Curation Pipeline: Real-world scraped job descriptions were filtered for quality, anonymized, and hand-annotated. Synthesized edge cases (e.g., job descriptions with contradictory location constraints or unspecified experience criteria) were added to train the model's fallback logic.

Training Configuration

The adapter was trained using Low-Rank Adaptation (LoRA) on the Unsloth framework:

Parameter	Configuration Value
Base Model	`unsloth/Qwen2.5-3B-Instruct`
Target Modules	All Linear Layers (`q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`)
LoRA Rank ($r$)	16
LoRA Alpha ($\alpha$)	32
Learning Rate	$2 \times 10^{-4}$ (Cosine decay scheduler)
Sequence Length	2,048 tokens
Batch Size	64 (Global batch size via Gradient Accumulation)
Weight Decay	0.01
Precision	Native Mixed Precision (FP16 / BF16)
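
To give a sense of scale for this configuration, the sketch below estimates how many trainable weights a rank-16 adapter adds across the listed target modules. The layer shapes are assumptions taken from the commonly published Qwen2.5-3B configuration (hidden size 2048, MLP size 11008, 36 layers, 256-dimensional key/value projections) and are not stated in this document; the scaling factor assumes the standard LoRA convention of alpha divided by rank.

```python
# LoRA hyperparameters from the table above.
RANK = 16
ALPHA = 32

# Assumed Qwen2.5-3B layer shapes (not stated in this document).
HIDDEN, MLP, KV, LAYERS = 2048, 11008, 256, 36

# (input_features, output_features) for each targeted linear module.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV),
    "v_proj": (HIDDEN, KV),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    """A is (rank x in) and B is (out x rank): rank * (in + out) weights in total."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in TARGET_SHAPES.values())
total_trainable = per_layer * LAYERS
scaling = ALPHA / RANK  # factor applied to the low-rank update

print(f"scaling={scaling}, per_layer={per_layer:,}, total={total_trainable:,}")
# scaling=2.0, per_layer=831,488, total=29,933,568
```

Under these assumed shapes the adapter trains roughly 30 million weights, about 1% of the 3B base model.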

7. Integration & Deployment

Option A: Serverless Gradio API Ingestion

For quick prototyping or lightweight application pipelines:

pip install gradio_client

from gradio_client import Client

client = Client("mantraraval/jobsense-app")
result = client.predict(
    text="We are seeking a senior front-end specialist with 5+ years of experience in React. Remote US.",
    api_name="/extract_jd",
)
print(result)

Option B: Local Inference (Transformers & PEFT)

For secure local deployments running directly on consumer or enterprise GPU hardware:

pip install transformers peft accelerate torch

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

BASE_MODEL = "unsloth/Qwen2.5-3B-Instruct"
ADAPTER_MODEL = "mantraraval/jobsense"

# 1. Initialize core tokenizer and base model
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL, trust_remote_code=True)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    torch_dtype=torch.float16,
    device_map="auto"
)

# 2. Attach PEFT adapter layer
model = PeftModel.from_pretrained(base_model, ADAPTER_MODEL)
model.eval()

# 3. Design structured prompt with ChatML templates
job_description = "Seeking a Node.js Developer. 3 years exp, hybrid work in Noida, immediate joiner."
messages = [
    {"role": "system", "content": "You are a recruitment extraction engine. Extract structured details to JSON."},
    {"role": "user", "content": job_description}
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# 4. Generate structured prediction
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=700,
        do_sample=False  # Greedy decoding for determinism; temperature has no effect when sampling is off
    )

# Decode response slice
response_tokens = outputs[0][inputs.input_ids.shape[1]:]
print(tokenizer.decode(response_tokens, skip_special_tokens=True))

Option C: Production Serving via Guided Decoding (vLLM)

For high-concurrency production API deployment, first merge the weights:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/Qwen2.5-3B-Instruct",
    torch_dtype=torch.float16,
    device_map="cpu"
)
model = PeftModel.from_pretrained(base_model, "mantraraval/jobsense")
merged_model = model.merge_and_unload()

# Export merged checkpoint
merged_model.save_pretrained("./jobsense-merged")
tokenizer = AutoTokenizer.from_pretrained("unsloth/Qwen2.5-3B-Instruct")
tokenizer.save_pretrained("./jobsense-merged")

Once merged, run high-throughput inference with vLLM, enforcing schema conformance through guided decoding (Outlines is one guided-decoding backend vLLM has supported):

pip install vllm outlines

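from vllm import LLM, SamplingParams
from vllm.sampling_params import GuidedDecodingParams

# Initialize vLLM deployment
llm = LLM(model="./jobsense-merged", tensor_parallel_size=1)

# Define expected JSON Schema target structure.
# The `skills` entry mirrors the Skill Object Schema in Section 4.
json_schema = """
{
  "type": "object",
  "properties": {
    "role": {"type": "string"},
    "sub_role": {"type": "string"},
    "seniority": {"type": "string", "enum": ["junior", "mid", "senior", "lead"]},
    "skills": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": {"type": "string"},
          "importance": {"type": "string", "enum": ["required", "preferred", "contextual"]},
          "category": {"type": "string"}
        },
        "required": ["name", "importance", "category"]
      }
    },
    "experience": {"type": "string"},
    "location": {"type": "string"},
    "location_type": {"type": "string", "enum": ["city", "region", "country", "remote"]},
    "work_mode": {"type": "string", "enum": ["hybrid", "remote", "onsite"]},
    "joining": {"type": "string", "enum": ["immediate", "notice_period", "flexible"]},
    "salary": {"type": "string"}
  },
  "required": ["role", "seniority", "experience", "work_mode"]
}
"""

# In offline inference the schema is passed through GuidedDecodingParams;
# `guided_json` is a request option of the OpenAI-compatible server, not a
# SamplingParams argument. Class names differ across vLLM releases, so check
# the documentation for your installed version.
sampling_params = SamplingParams(
    temperature=0.0,
    max_tokens=700,
    guided_decoding=GuidedDecodingParams(json=json_schema)
)

# Build the ChatML prompt with the same system message used for local inference
job_description = "Seeking a Node.js Developer. 3 years exp, hybrid work in Noida, immediate joiner."
messages = [
    {"role": "system", "content": "You are a recruitment extraction engine. Extract structured details to JSON."},
    {"role": "user", "content": job_description}
]
prompt = llm.get_tokenizer().apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Run batch generation
outputs = llm.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)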

8. Error Analysis & Limitations

Error analysis focused on the 10.6-point gap between the measured Skill F1-Score (89.4%) and a perfect score:

Over-segmentation / Synonym Mismatch: When job descriptions specify skills with non-standard naming schemes (e.g., "MERN stack" alongside "MongoDB, Express, React, Node"), the model sometimes duplicates skills in the JSON output, or fails to group them contextually. Downstream deduplication layers are recommended.
Context Truncation Limits: Documents exceeding 3,000 tokens may experience context truncation, resulting in incomplete schema generations or empty lists.
Linguistic Bias: The fine-tuning dataset is exclusively English. Non-English or code-switched job descriptions will result in lower structural precision.
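
A downstream deduplication layer of the kind recommended above could look like the following. This is a minimal sketch under stated assumptions: the `ALIASES` table and the `canonical_key` and `dedupe_skills` helpers are hypothetical, and the rule of keeping the strongest importance level when duplicates collide is one possible policy. It does not expand umbrella terms such as "MERN stack" into their component skills.

```python
# Lower rank means stronger importance.
IMPORTANCE_RANK = {"required": 0, "preferred": 1, "contextual": 2}

# Hypothetical alias table; extend it for your own data.
ALIASES = {
    "node": "node.js",
    "nodejs": "node.js",
    "mongo": "mongodb",
    "reactjs": "react",
}


def canonical_key(name):
    """Lowercase the name and map known aliases to one canonical spelling."""
    key = name.strip().lower()
    return ALIASES.get(key, key)


def dedupe_skills(skills):
    """Merge skills sharing a canonical name, keeping the strongest importance."""
    weakest = len(IMPORTANCE_RANK)
    merged = {}
    for skill in skills:
        key = canonical_key(skill["name"])
        current = merged.get(key)
        new_rank = IMPORTANCE_RANK.get(skill.get("importance"), weakest)
        if current is None or new_rank < IMPORTANCE_RANK.get(current.get("importance"), weakest):
            merged[key] = skill
    return list(merged.values())


skills = [
    {"name": "Node", "importance": "preferred", "category": "backend"},
    {"name": "MongoDB", "importance": "required", "category": "database"},
    {"name": "Node.js", "importance": "required", "category": "backend"},
    {"name": "mongo", "importance": "contextual", "category": "database"},
]

deduped = dedupe_skills(skills)
print([(s["name"], s["importance"]) for s in deduped])
# [('Node.js', 'required'), ('MongoDB', 'required')]
```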

9. Citation & Academic Reference

@misc{raval2025jobsense,
  author    = {Mantra Raval},
  title     = {JobSense: Structured Information Extraction from Job Descriptions},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/mantraraval/jobsense}
}

jobsense · recruitment intelligence · developed by mantraraval (GitHub) & mantraraval (Hugging Face)
