Fake Job Detector is an AI-powered tool that helps job seekers identify fraudulent job postings. It combines a trained machine learning model with the Gemini-2.5-Flash-Lite LLM: the model classifies a posting as Real or Fake, and the LLM explains why the posting is likely fraudulent, helping users stay safe from scams that demand personal data or money.
- Used a dataset from Kaggle.
- Contains job descriptions labeled as fake or real.
- Dropped unnecessary columns (e.g., job ID, salary range).
- Replaced null values with blank strings for consistency.
- Used RandomUnderSampler (from imblearn) to handle data imbalance, since real jobs far outnumbered fake ones.
- Visualized:
  - Number of job postings per country (USA dominated).
  - Job experience distribution (Entry-level had the highest share).
  - Most common words in real vs. fake job descriptions using word clouds.
[Figure: word cloud of the most common words used in real job descriptions]

[Figure: word cloud of the most common words used in fake job descriptions]

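The undersampling step can be illustrated without any dependencies. The sketch below is a plain-Python stand-in for what RandomUnderSampler does conceptually (randomly dropping majority-class rows until the classes match). It is not imblearn's implementation, and the `fraudulent` label key and the toy rows are assumptions made for illustration.

```python
import random
from collections import defaultdict

def random_undersample(rows, label_key="fraudulent", seed=42):
    """Randomly drop rows from larger classes until every class matches the smallest one."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    # Every class is cut down to the size of the smallest class.
    target = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, target))

    rng.shuffle(balanced)  # avoid leaving the rows grouped by class
    return balanced

# Toy data (hypothetical): 8 real postings (label 0) and 2 fake ones (label 1).
rows = [{"description": f"real job {i}", "fraudulent": 0} for i in range(8)]
rows += [{"description": f"fake job {i}", "fraudulent": 1} for i in range(2)]

balanced = random_undersample(rows)
counts = {label: sum(r["fraudulent"] == label for r in balanced) for label in (0, 1)}
print(counts)  # {0: 2, 1: 2}
```

Undersampling discards real postings, so it trades some training data for a balanced class ratio.
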
- Removed stopwords using NLTK to clean up textual noise.
- Normalized and tokenized the text data.
- Split data into train and test sets.
- Used CountVectorizer to convert job descriptions into numerical form (Document-Term Matrix).
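
To make the preprocessing steps above concrete, here is a minimal plain-Python sketch of stopword removal, tokenization, and building a document-term matrix. It only mirrors the idea behind NLTK's stopword list and CountVectorizer. The short `STOPWORDS` set, the regex tokenizer, and the sample documents are assumptions, not the project's actual settings.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; NLTK's English list is much longer.
STOPWORDS = {"a", "an", "and", "are", "the", "to", "for", "of", "is", "in", "we", "no"}

def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def document_term_matrix(docs):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in document i."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for tokens in tokenized for t in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[term] for term in vocab])
    return vocab, matrix

docs = [
    "Earn money fast, no experience needed!",
    "We are hiring a software engineer for the backend team.",
]
vocab, matrix = document_term_matrix(docs)
for term, column in zip(vocab, zip(*matrix)):
    print(f"{term:<12}{column}")
```

In a real pipeline the vocabulary would normally be built from the training split only and then reused to transform the test split, so that no information leaks from the test data.
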
- Trained using Naive Bayes, which is known for strong performance on text data.
- Also tried Decision Tree, but Naive Bayes + CountVectorizer gave better accuracy.
- Evaluation Metrics:
  - Accuracy
  - Precision
  - F1 Score
  - Support
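
As a worked example of the metrics listed above, the sketch below computes them by hand for the "fake" class. The labels and predictions are made up for illustration and are not results from this project. Recall is computed as well because F1 is the harmonic mean of precision and recall.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy plus precision, recall, F1 and support for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # fake, predicted fake
    fp = sum(t != positive and p == positive for t, p in pairs)  # real, predicted fake
    fn = sum(t == positive and p != positive for t, p in pairs)  # fake, predicted real
    correct = sum(t == p for t, p in pairs)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "support": tp + fn,  # number of true examples of the positive class
    }

# Hypothetical labels: 1 = fake, 0 = real.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Here 3 of the 4 fake postings are caught and 1 real posting is flagged by mistake, so accuracy, precision, recall, and F1 all come out to 0.75, with a support of 4.
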
- Saved the trained model and vectorizer using pickle for web deployment.
- Integrated Google’s Gemini Flash model to explain why a job is real or fake.
- After prediction, users can click the "Explain Why" button to get an AI-generated explanation.
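
One way the "Explain Why" step could be wired up is sketched below. This is an assumption about the approach, not the project's actual code: the prompt wording, the function names, and the `GEMINI_API_KEY` environment variable are all hypothetical. The Gemini call assumes the `google-generativeai` package is installed and is imported lazily, so the prompt helper works without it.

```python
import os

def build_explanation_prompt(job_text, label):
    """Compose the instruction sent to the LLM after the classifier has predicted `label`."""
    return (
        f"A classifier labeled the following job posting as {label}.\n"
        "In three short bullet points, explain which details in the posting "
        "support or contradict that label.\n\n"
        f"Job posting:\n{job_text}"
    )

def explain_with_gemini(job_text, label, model_name="gemini-2.5-flash-lite"):
    """Ask Gemini for an explanation. Requires google-generativeai and an API key."""
    import google.generativeai as genai  # lazy import: only needed for the real call

    genai.configure(api_key=os.environ["GEMINI_API_KEY"])  # env var name is an assumption
    model = genai.GenerativeModel(model_name)
    response = model.generate_content(build_explanation_prompt(job_text, label))
    return response.text

# Building the prompt needs neither network access nor an API key.
prompt = build_explanation_prompt(
    "Pay a $50 registration fee to start working from home.", "Fake"
)
print(prompt)
```

Passing the classifier's label into the prompt keeps the explanation consistent with the prediction the user has already seen.
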
| Component | Tool/Library |
|---|---|
| Frontend | HTML, CSS, JavaScript |
| Backend | Flask |
| ML & NLP | Scikit-learn, NLTK, CountVectorizer, Imbalanced-learn |
| LLM | Gemini-2.5-Flash-Lite via Google Generative AI API |
| Model | Naive Bayes Classifier |
- Freshers & students verifying internship offers.
- Job seekers avoiding scam job listings.
- Platforms enhancing job post moderation.
- Career counselors verifying job leads before sharing.
- Clone the repo:

      git clone https://github.com/yourusername/fake-job-detector.git
      cd fake-job-detector

- Install requirements:

      pip install -r requirements.txt

- Run the Flask server:

      python app.py