Harshitpant12/SkillSync

SkillSync - AI-Powered ATS Optimizer & Microservices Architecture

SkillSync is a full-stack SaaS application that calculates ATS (Applicant Tracking System) compatibility scores for resumes against job descriptions. It uses a decoupled microservices architecture: a MERN-stack primary server is isolated from a dedicated Python NLP engine, which handles compute-heavy document parsing and cosine-similarity scoring so that this work does not block the Node.js event loop.


Live Demo

View Live Application


Key Features

  • Microservices Architecture: A Node.js REST API on Vercel and a Python/Flask NLP engine on Render are deployed independently, so CPU-heavy NLP work does not compete with API request handling.
  • Advanced NLP Pipeline: Extracts text with pdfminer.six, then uses spaCy and scikit-learn to compute vector similarity between resumes and job descriptions in under 15 seconds.
  • Enterprise-Grade Authentication: JWT-based auth with silent-refresh interceptors and HTTP-only cookies. Vercel edge rewrites proxy API calls through the frontend domain, so the cookies are first-party and are not blocked by browser restrictions on cross-site cookies.
  • Client-Side PDF Generation: Uses the browser's native print API to generate highlightable, vector-based PDF reports directly from the React DOM, avoiding server-side rendering and its cloud compute cost.
  • Dynamic Client Dashboard: A responsive React/Vite frontend built with Tailwind CSS and Framer Motion, featuring drag-and-drop file uploads and real-time polling.
  • High-Performance Routing: Load-tested with Apache JMeter; CORS configuration and payload routing were tuned to keep average response times under 400 ms under concurrent load.

Tech Stack

Frontend (Vercel)

  • React.js (Vite): Modern, component-based UI library.
  • Tailwind CSS: Utility-first CSS framework for responsive design.
  • Framer Motion: Declarative animation library for fluid UI state transitions.
  • Axios: HTTP client with custom interceptors for automated JWT refresh cycles.

Primary Backend (Vercel Serverless)

  • Node.js & Express.js: RESTful API structured for serverless deployment.
  • MongoDB & Mongoose: NoSQL database for storing user records and analysis history.
  • JWT: Stateless authentication securely managed via sameSite: lax cookies.
  • Multer: Memory storage middleware for handling binary PDF uploads.

Python Service (Render)

  • Python (Flask): Lightweight WSGI web application framework.
  • spaCy: Industrial-strength Natural Language Processing for entity recognition.
  • scikit-learn: Machine learning library utilized for TF-IDF vectorization and cosine similarity calculations.
  • pdfminer.six: High-precision PDF document text extraction.
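
To make the TF-IDF and cosine-similarity step more concrete, here is a minimal standard-library sketch of the underlying math. The project itself relies on scikit-learn for this; the function names below (`tokenize`, `tfidf_vectors`, `cosine_similarity`, `match_score`) are hypothetical and are not taken from the repository. The weighting uses the smoothed IDF formula, but scikit-learn's tokenization and normalization differ, so the numbers will not match the service's scores exactly.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document using smoothed TF-IDF."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * vec_b.get(term, 0.0) for term, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty document has no direction to compare.
    return dot / (norm_a * norm_b)


def match_score(resume_text, job_description):
    """Score a resume against a job description on a 0-100 scale."""
    resume_vec, jd_vec = tfidf_vectors([resume_text, job_description])
    return round(100 * cosine_similarity(resume_vec, jd_vec), 1)
```

Calling `match_score(resume_text, job_description)` returns a higher value the more weighted vocabulary the two texts share; texts with no terms in common score 0.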

Screenshots (Visuals)

Landing_Page New_Scan Result_Page

Architecture & Project Structure

The repository isolates the frontend client, the Node.js API, and the Python NLP engine into independent directories for scalable deployment.

skillsync/
├── backend/             # Node.js / Express Serverless API
│   ├── src/
│   │   ├── controllers/ # Request logic (auth, analysis processing)
│   │   ├── middleware/  # JWT validation, Multer config, error handling
│   │   ├── models/      # Mongoose schemas (User, AnalysisRecord)
│   │   ├── routes/      # API endpoints routing
│   │   ├── utils/       # DB connection, Axios instance for Python API
│   │   └── index.js     # App entry point (Trust Proxy & CORS configured)
│   └── .env             # Environment variables
├── frontend/            # React / Vite Client
│   ├── public/          
│   ├── src/
│   │   ├── api/         # Axios global config with token interceptors
│   │   ├── components/  # Reusable UI components
│   │   ├── context/     # AuthContext for global user state
│   │   ├── pages/       # Views (Dashboard, History, Results, Settings)
│   │   └── main.jsx     # React DOM render entry
│   └── vercel.json      # Reverse proxy rules for first-party cookies
└── python-service/      # Python / Flask Microservice
    ├── app.py           # Flask server and routing
    ├── ats_scorer.py    # Core scoring logic and TF-IDF vectorization
    ├── extractor.py     # pdfminer.six text extraction logic
    ├── matcher.py       # spaCy entity matching and similarity calculation
    ├── requirements.txt # Python dependencies
    └── skills_list.py   # Curated dictionary/dataset of technical skills
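
As a rough illustration of how a curated skills list can drive matching, the sketch below compares the skills found in a job description with those found in a resume. It is a simplified stand-in that uses plain regular expressions rather than spaCy, and the `SKILLS` set, `find_skills`, and `compare_skills` are hypothetical names, not the contents of `skills_list.py` or `matcher.py`.

```python
import re

# Hypothetical stand-in for the curated list in skills_list.py.
SKILLS = {"python", "react", "node.js", "mongodb", "machine learning", "docker"}


def find_skills(text, skills=SKILLS):
    """Return the subset of `skills` that appears in `text` as whole phrases."""
    # Collapse whitespace so multi-word skills match across line breaks.
    normalized = " ".join(text.lower().split())
    found = set()
    for skill in skills:
        # Lookarounds keep "react" from matching inside "reactive".
        pattern = r"(?<![a-z0-9])" + re.escape(skill) + r"(?![a-z0-9])"
        if re.search(pattern, normalized):
            found.add(skill)
    return found


def compare_skills(resume_text, job_description):
    """Split the skills the job asks for into matched and missing lists."""
    required = find_skills(job_description)
    present = find_skills(resume_text)
    return {
        "matched": sorted(required & present),
        "missing": sorted(required - present),
    }
```

A lemmatizing matcher such as the spaCy-based one in the real service would also catch inflected forms, which this literal phrase match does not.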

## Installation & Setup

Follow these steps to run the complete microservices architecture locally.

### Prerequisites

* **Node.js** (v18+)
* **Python** (v3.9+)
* **MongoDB** (Local instance or MongoDB Atlas URL)
* **Git**

### 1. Clone the Repository

```bash
git clone https://github.com/Harshitpant12/skillsync.git
cd skillsync
```

### 2. Node.js Backend Setup

Navigate to the backend folder and install dependencies:

```bash
cd backend
npm install
```

Create a `.env` file in the `backend/` root folder:

```env
PORT=5000
MONGO_URI=your_mongodb_connection_string
JWT_ACCESS_SECRET=your_access_token_secret
JWT_REFRESH_SECRET=your_refresh_token_secret
ACCESS_TOKEN_EXPIRES=15m
REFRESH_TOKEN_EXPIRES=7d
PYTHON_SERVICE_URL=http://localhost:5001
NODE_ENV=development
GEMINI_API_KEY=your_gemini_api_key
GOOGLE_CLIENT_ID=your_google_client_id
GOOGLE_CLIENT_SECRET=your_google_client_secret
```

Start the backend server:

```bash
npm run dev
```

### 3. Python Service Setup

Open a new terminal, navigate to the python-service folder, create a virtual environment, and install dependencies:

```bash
cd python-service
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
python -m spacy download en_core_web_sm
```

Start the Flask server:

```bash
python app.py
```

(The Python service runs on http://localhost:5001)

### 4. Frontend Setup

Open a third terminal window, navigate to the frontend folder, and install dependencies:

```bash
cd frontend
npm install
```

Create a `.env` file in the `frontend/` root folder:

```env
VITE_API_URL=http://localhost:5000/api
VITE_GOOGLE_CLIENT_ID=your_google_client_id
```

Start the React development server:

```bash
npm run dev
```
## API Reference

### Analysis Routes (Node.js)

| Method | Endpoint | Description | Protected? |
| --- | --- | --- | --- |
| POST | `/api/analysis/run` | Forwards PDF/JD to Python service & saves results | Yes |
| GET | `/api/analysis/my` | Retrieves user's historical scan data | Yes |
| GET | `/api/analysis/wake-python` | Pre-flight ping to prevent cold-starts | Yes |

### Auth Routes (Node.js)

| Method | Endpoint | Description | Protected? |
| --- | --- | --- | --- |
| POST | `/api/auth/register` | Hash password & create user | No |
| POST | `/api/auth/login` | Authenticate & issue cross-domain HTTP-only cookies | No |
| POST | `/api/auth/refresh` | Silent access token regeneration via refresh cookie | No |
| POST | `/api/auth/logout` | Destroy secure session cookies | Yes |
| GET | `/api/auth/me` | Verify active session state | Yes |
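
The access/refresh token cycle behind `/api/auth/login` and `/api/auth/refresh` can be sketched in a few lines. The real backend implements this in Node.js; the Python below is only a language-neutral illustration using the standard library, assuming HS256-signed tokens and the 15-minute and 7-day lifetimes from the example `.env`. All function names (`sign_token`, `verify_token`, `refresh_access_token`) and the `sub` claim layout are assumptions, not the project's code.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()


def sign_token(payload: dict, secret: str, ttl_seconds: int, now=None) -> str:
    """Create an HS256 token that expires `ttl_seconds` after `now`."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    body = dict(payload, iat=now, exp=now + ttl_seconds)
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, body)
    )
    return signing_input + "." + _b64url(_sign(signing_input, secret))


def verify_token(token: str, secret: str, now=None):
    """Return the payload if signature and expiry check out, else None."""
    try:
        signing_input, signature = token.rsplit(".", 1)
        expected = _sign(signing_input, secret)
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(_b64url_decode(signature), expected):
            return None
        payload = json.loads(_b64url_decode(signing_input.split(".")[1]))
    except (ValueError, IndexError):
        return None
    now = int(time.time()) if now is None else now
    return payload if payload.get("exp", 0) > now else None


def refresh_access_token(refresh_token, access_secret, refresh_secret, now=None):
    """Issue a new 15-minute access token if the refresh token is still valid."""
    payload = verify_token(refresh_token, refresh_secret, now)
    if payload is None:
        return None  # Expired or tampered refresh token: the user must log in again.
    return sign_token({"sub": payload["sub"]}, access_secret, 15 * 60, now)
```

Using separate secrets for the two token types, as the `.env` example does with `JWT_ACCESS_SECRET` and `JWT_REFRESH_SECRET`, means a refresh token is rejected when it is presented where an access token is expected, and the reverse.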

### Internal NLP Engine (Python / Flask)

This microservice acts as the dedicated compute engine for the platform, offloading CPU-intensive NLP tasks from the primary Node.js server to avoid blocking the event loop.

| Method | Endpoint | Description | Payload Type |
| --- | --- | --- | --- |
| GET | `/` or `/health` | Service Heartbeat: Used by Render for health monitoring and uptime checks. | N/A |
| GET | `/wake` | Pre-warm Trigger: Targeted ping from the frontend to initialize spaCy models in RAM. | N/A |
| POST | `/extract-text` | PDF Extraction: Accepts a multipart form-data PDF and returns raw unstructured text via pdfminer.six. | multipart/form-data |
| POST | `/extract` | NLP Orchestration: Executes the core pipeline including TF-IDF vectorization, Cosine Similarity matching, and skill lemmatization. | application/json |
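
A caller that wants to be sure the service is warm before submitting work can poll one of the GET endpoints with exponential backoff. The sketch below is one possible way to do that with the standard library; `ping` and `wait_until_awake` are hypothetical helpers, and the retry counts and delays are illustrative assumptions rather than values used by the project.

```python
import time
import urllib.error
import urllib.request


def ping(url, timeout=5.0):
    """Return True if `url` answers with HTTP 200 within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_awake(url, attempts=5, base_delay=1.0, ping_fn=ping, sleep_fn=time.sleep):
    """Ping `url` until it responds, doubling the pause after each failure."""
    for attempt in range(attempts):
        if ping_fn(url):
            return True
        if attempt < attempts - 1:
            # Wait 1x, 2x, 4x, ... the base delay between attempts.
            sleep_fn(base_delay * 2 ** attempt)
    return False
```

For a local setup this could be called as `wait_until_awake("http://localhost:5001/wake")`. The `ping_fn` and `sleep_fn` parameters exist so the retry logic can be exercised without network access or real delays.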

## Contributing

Contributions are welcome!

  1. Fork the project.
  2. Create your feature branch (`git checkout -b feature/Optimization`).
  3. Commit your changes (`git commit -m 'Implement faster vectorization'`).
  4. Push to the branch (`git push origin feature/Optimization`).
  5. Open a Pull Request.

## License

Distributed under the MIT License. See LICENSE for more information.

About

An NLP Driven ATS Matching Engine and Scoring Platform built with a decoupled MERN stack and a Python/Flask NLP microservice for real time skill matching.
