From 49e2acdbca318d0a2492eb046b0a01df235e1ff7 Mon Sep 17 00:00:00 2001
From: Hari
Date: Tue, 2 Jun 2026 10:26:22 +0530
Subject: [PATCH 1/2] mermaid fix

---
 backend/README.md | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/backend/README.md b/backend/README.md
index f0e9a7a..0df86e3 100644
--- a/backend/README.md
+++ b/backend/README.md
@@ -34,35 +34,35 @@ flowchart TD
     classDef storage fill:#F5A623,stroke:#C68015,stroke-width:1.5px,color:#fff;
     classDef trigger fill:#D0021B,stroke:#9E0010,stroke-width:1.5px,color:#fff;
 
-    A[Gmail Sync Push / Webhook] :::trigger --> B(GmailWebhookService) :::service
-    B --> C{SmartExtractionService} :::service
+    A[Gmail Sync Push / Webhook]:::trigger --> B(GmailWebhookService):::service
+    B --> C{SmartExtractionService}:::service
 
     %% Template Ingest Pathway
-    C -->|LinkedIn / Indeed Template| D(TemplateParser) :::utility
+    C -->|LinkedIn / Indeed Template| D(TemplateParser):::utility
     D -->|Extract Role & Company| E{Body Template Matches?}
-    E -->|Yes| F[Extract from Body & Clean Location] :::utility
-    E -->|No| G[Extract from Subject & Find Location] :::utility
-    D -->|Extract & Clean URL| H[UrlParser: Decode next= Parameter] :::utility
-    D -->|Determine Status| I[Strip Safety Warnings & Scan Keywords] :::utility
+    E -->|Yes| F[Extract from Body & Clean Location]:::utility
+    E -->|No| G[Extract from Subject & Find Location]:::utility
+    D -->|Extract & Clean URL| H[UrlParser: Decode next= Parameter]:::utility
+    D -->|Determine Status| I[Strip Safety Warnings & Scan Keywords]:::utility
 
     %% AI Ingest Pathway
     C -->|Unmatched Emails| J{app.gemini.enabled?}
-    J -->|true| K(GeminiExtractionService) :::service
-    J -->|false| L(MockGeminiService) :::service
+    J -->|true| K(GeminiExtractionService):::service
+    J -->|false| L(MockGeminiService):::service
 
     %% Job Matching & Storage
-    F --> M(JobService: createOrUpdateJob) :::service
+    F --> M(JobService: createOrUpdateJob):::service
     G --> M
     K --> M
     L --> M
 
-    M --> N{findBestMatch} :::service
-    N -->|Stricter Token Matching for Short Names| O[Update Existing Job & Append Notes] :::storage
-    N -->|No Active Match| P[Create New Job Entry] :::storage
+    M --> N{findBestMatch}:::service
+    N -->|Stricter Token Matching for Short Names| O[Update Existing Job & Append Notes]:::storage
+    N -->|No Active Match| P[Create New Job Entry]:::storage
 
-    O --> Q[(Database: PostgreSQL)] :::storage
+    O --> Q[(Database: PostgreSQL)]:::storage
     P --> Q
-    Q --> R(Caffeine Cache Eviction) :::service
+    Q --> R(Caffeine Cache Eviction):::service
 ```
 
 ### Core Architecture Components

From a51f60cf6157de26d9c51e665716be907e06f46e Mon Sep 17 00:00:00 2001
From: Hari
Date: Tue, 2 Jun 2026 10:30:07 +0530
Subject: [PATCH 2/2] readme update

---
 README.md          | 89 ++++++++++++++++++++++++++++++++++------------
 frontend/README.md |  4 +--
 2 files changed, 69 insertions(+), 24 deletions(-)

diff --git a/README.md b/README.md
index 6687062..950a223 100644
--- a/README.md
+++ b/README.md
@@ -38,29 +38,74 @@
 
 ---
 
-## 🧠 System Architecture
-
-This application is built as a distributed system focusing on separation of concerns, data integrity, and automation.
-
-### 1. 🤖 AI-Powered Email Ingestion Pipeline
-I engineered an event-driven pipeline to eliminate manual data entry using **Google Gemini 2.0 Flash**.
-
-* **Smart Upsert:** Uses fuzzy matching and Jaccard Similarity to detect if an incoming email is a *new application* or an *interview update* for an existing job.
-* **Resilience:** Gracefully handles unstructured data, HTML-only emails, and missing headers using regex fallback strategies.
-
-### 2. 🔐 Hybrid Security Architecture
-* **Implementation:** Custom `OAuth2SuccessHandler` that merges identities based on trusted email verification.
-* **Flow:** Users can log in via **Google/GitHub** OR **Email/Password** interchangeably without creating duplicate accounts or data silos.
-* **Stateless:** Fully secured via **JWT (RS256)** with a custom Security Filter Chain.
-
-### 3. ☁️ Atomic Cloud Storage
-* **Provider:** Cloudflare R2 (AWS S3 Compatible).
-* **Transactional Integrity:** Profile updates are atomic. If a user uploads a new image but the database transaction fails, the image upload is rolled back.
-* **Garbage Collection:** The system automatically issues delete commands for old/orphaned images in the R2 bucket when a user updates their photo, preventing storage leaks and reducing costs.
+## 🏗️ System Architecture & App Flow
+
+JobTrackerPro is built as a highly automated, decoupled system focusing on security, latency reduction, and seamless user experiences.
+
+### 🔄 End-to-End Application Flow
+
+```mermaid
+flowchart TD
+    %% Styling
+    classDef service fill:#4A90E2,stroke:#357ABD,stroke-width:1.5px,color:#fff;
+    classDef utility fill:#50E3C2,stroke:#34A790,stroke-width:1.5px,color:#333;
+    classDef storage fill:#F5A623,stroke:#C68015,stroke-width:1.5px,color:#fff;
+    classDef trigger fill:#D0021B,stroke:#9E0010,stroke-width:1.5px,color:#fff;
+
+    A[Gmail Webhook / Push Notice]:::trigger --> B(GmailWebhookService):::service
+    B --> C{SmartExtractionService}:::service
+
+    %% Template Ingest Pathway
+    C -->|LinkedIn / Indeed Template| D(TemplateParser):::utility
+    D -->|Extract Role & Company| E{Body Template Matches?}
+    E -->|Yes| F[Extract from Body & Clean Location]:::utility
+    E -->|No| G[Extract from Subject & Find Location]:::utility
+    D -->|Extract & Clean URL| H[UrlParser: Decode next= Parameter]:::utility
+    D -->|Determine Status| I[Strip Safety Warnings & Scan Keywords]:::utility
+
+    %% AI Ingest Pathway
+    C -->|Unmatched Emails| J{app.gemini.enabled?}
+    J -->|true| K(GeminiExtractionService):::service
+    J -->|false| L(MockGeminiService):::service
+
+    %% Job Matching & Storage
+    F --> M(JobService: createOrUpdateJob):::service
+    G --> M
+    K --> M
+    L --> M
+
+    M --> N{findBestMatch}:::service
+    N -->|Stricter Token Matching for Short Names| O[Update Existing Job & Append Notes]:::storage
+    N -->|No Active Match| P[Create New Job Entry]:::storage
+
+    O --> Q[(Database: PostgreSQL)]:::storage
+    P --> Q
+    Q --> R(Caffeine Cache Eviction / Angular Signal Update):::service
+```
 
-### 4. ⚡ High-Performance Analytics
-* **Backend:** Leverages **Java Stream API** for efficient in-memory aggregation of job statistics, reducing database hits to a single optimized read operation per dashboard load.
-* **Frontend:** Uses **Angular Signals** for reactive state management and **Optimistic UI** updates, ensuring zero-latency feedback for the user even on slow networks.
+The system operates across four key pipeline phases:
+
+#### 1. 📬 Automated Ingestion & Webhook Layer
+- **Gmail Sync Integration**: Users link their Gmail account via OAuth2. A Google Pub/Sub topic listens to mailbox events and sends real-time push notifications to `GmailWebhookService`.
+- **Smart routing (`SmartExtractionService`)**: Ingested emails are routed through a `@Primary` interceptor. It checks if the email is a standard template from **LinkedIn** or **Indeed**.
+  - **Matched**: Processed locally instantly (0ms AI cost, sub-millisecond execution).
+  - **Unmatched**: Delegated to **Google Gemini 2.0 Flash** (or local `MockGeminiService` if offline).
+
+#### 2. ⚙️ Manual Extraction Engine (`TemplateParser`)
+- **Forward Header Reconstruction**: Recognizes forwarded email structures so you can forward confirmation emails directly to your sync mailbox.
+- **Indeed Body Parser**: Extracts the role, company name, and location directly from standard Indeed body layouts to avoid misidentifying hyphenated roles (e.g. `Java Backend Developer - MS` parses `Java Backend Developer - MS` as the role and `Capco` as the company).
+- **Url query decoding**: Recognizes `apply.indeed.com` confirmation links, extracts their `next=` query parameter, URL-decodes it, and stores the direct public listing URL (e.g., `https://in.indeed.com/viewjob?jk=...`), reducing length from 220+ to ~50 characters.
+- **Location & Status Cleaning**: Removes reviews meta-data (e.g. `Remote 95 reviews` -> `Remote`) and filters safety warning disclaimers to prevent false matches (e.g., preventing *"without an interview"* from triggering an *Interview Scheduled* status).
+
+#### 3. 🧠 Smart Deduplication & Matching (`JobService`)
+- When a parsed job hits `JobService.createOrUpdateJob()`, it compares the company and role with your active board entries:
+  - **Strict Token matching**: If either company name is $\le 3$ characters (e.g., `MS`), it requires an exact word-token match rather than a simple substring `contains` check (preventing `MS` from matching `ORION SYSTEMS`).
+  - **Status Updates**: If a match is found (e.g. an interview invite email for a job you already applied to), it updates the existing entry in-place and appends the email details to the transaction notes. Otherwise, it inserts a new job entry.
+
+#### 4. 🎨 Reactive UI & In-Memory Caching
+- **State Management**: The Angular frontend uses **Angular Signals** for reactive state updates, giving the user instant UI response.
+- **Caffeine In-Memory Caching**: User profiles, dashboard analytics, and job lists are cached on the Spring Boot backend, automatically evicted upon job modifications to keep the UI up-to-date while protecting the DB from heavy read traffic.
+- **Don't-Drown Slice Scaling**: The D3.js Donut Chart scales tiny slices (e.g., 1 application status out of 400+) up to a minimum 2.5% display size so they are visible and hoverable, while keeping tooltip values mathematically accurate.
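Reviewer note (not part of the patch): the `next=` decoding and the short-name matching rule described in the added README text can be sketched as below. This is a minimal illustration in Python, not the project's Java code. The function names, the `SHORT_NAME_MAX` constant, and the example link path are hypothetical; only the `apply.indeed.com` host, the `next=` parameter, the 3-character threshold, and the `MS` vs `ORION SYSTEMS` case come from the README.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumption: names of this length or shorter need an exact word-token match.
SHORT_NAME_MAX = 3


def decode_next_url(link: str) -> str:
    """Return the listing URL carried in `next=`, or the link unchanged."""
    parsed = urlparse(link)
    if parsed.hostname != "apply.indeed.com":
        return link
    # parse_qs percent-decodes the parameter values for us.
    targets = parse_qs(parsed.query).get("next")
    return targets[0] if targets else link


def word_tokens(name: str) -> set[str]:
    """Lower-cased alphanumeric word tokens of a company name."""
    return set(re.findall(r"[a-z0-9]+", name.lower()))


def company_matches(incoming: str, existing: str) -> bool:
    """Substring match for long names, exact token match for short ones."""
    a, b = incoming.strip().lower(), existing.strip().lower()
    if not a or not b:
        return False
    if min(len(a), len(b)) <= SHORT_NAME_MAX:
        short, long_ = (a, b) if len(a) <= len(b) else (b, a)
        short_tokens = word_tokens(short)
        # Every token of the short name must appear as a whole word.
        return bool(short_tokens) and short_tokens <= word_tokens(long_)
    return a in b or b in a
```

With this sketch, `company_matches("MS", "ORION SYSTEMS")` is false even though `"systems"` contains `"ms"`, which is the false positive the stricter rule is meant to prevent.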
 
 ---
 
diff --git a/frontend/README.md b/frontend/README.md
index a0d0770..d3197d6 100644
--- a/frontend/README.md
+++ b/frontend/README.md
@@ -14,10 +14,10 @@ The modern, responsive frontend for **JobTrackerPro**. Built with Angular and Ta
 
 ## ✨ Key Features
 
-* **📊 Interactive Dashboard:** Real-time statistics using **D3.js** interactive charts.
+* **📊 Interactive Dashboard:** Real-time statistics using **D3.js** interactive charts, featuring custom slice scaling to ensure tiny status percentages (e.g., 1 application out of 400+) remain visible and hoverable.
 * **👤 Advanced Profile:** Atomic updates for profile data, supporting file uploads (R2) and external URLs.
 * **🌗 Theming:** Built-in **Dark Mode** support persisted via LocalStorage.
-* **⚡ Optimistic UI:** Smart caching and context-aware data fetching to minimize network latency.
+* **⚡ Optimistic UI:** Smart caching and context-aware data fetching (leveraging Angular Signals) to minimize network latency.
 * **🔒 Security:** JWT-based authentication with route guards (`AuthGuard`, `GuestGuard`).
 
 ## 🛠️ Tech Stack