Skip to content

anasahhm/Specter

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

39 Commits
 
 
 
 
 
 
 
 

Repository files navigation

SPECTER: Wire API-Powered AI Threat Intelligence Platform

Real-time URL threat analysis. Wire API extracts technical signals, Google Generative AI recognizes behavioral patterns, deterministic scoring outputs risk (0-100). Async architecture handles 120s Wire API + 45s AI calls without blocking.

Live Demo


Why I Built This

Static URL threat detection is broken:

  • Blacklist-based tools miss 60%+ of new phishing campaigns
  • SaaS solutions cost $500/month and have 5-minute latencies
  • Open-source tools use only regex patterns (too many false positives)

I needed hybrid intelligence: raw technical signals (domain age, SSL validity, redirect chains) + AI pattern recognition (behavioral clustering, social engineering vectors). And it had to be fast.


The Architecture Problem

The Issue: Wire API takes 120s, Google Generative AI takes 45s. If I block the request thread waiting for both, the user sits staring at a loading spinner for 175 seconds.

The Solution: Async worker architecture.

POST /api/investigations/start → returns investigation ID immediately (300ms)
Background: Wire API (120s) → AI Analysis (45s) → Threat Scoring (10s)
Frontend polls GET /api/investigations/:id every 2s with 3-min graceful timeout
Status persisted: processing → completed
User gets results without waiting (8-15s typical, 180s max)

This pattern scales. Investigate 50 URLs and come back later. No polling hell, no WebSocket complexity.


Tech Stack

Frontend: React 18 + Vite (3-4x faster than Webpack) + Tailwind + Framer Motion
Backend: Node.js/Express + MongoDB + Mongoose + Helmet.js + express-rate-limit
Intelligence: Wire API (technical metadata) + Google Generative AI (pattern analysis)
Auth: JWT (30-day expiry) + bcryptjs (salt: 10) + input validation
Deployment: Vercel (frontend) + Render (backend) + MongoDB Atlas (database)


Installation & Setup

Prerequisites

Node.js v18+, npm/yarn, MongoDB (Atlas free tier works)

Clone & Install

git clone https://github.com/anasahhm/specter.git
cd specter

# Backend
cd backend
npm install
cp .env.example .env  # Add your API keys

# Frontend
cd frontend
npm install
cp .env.example .env

Environment Variables

Backend (.env):

NODE_ENV=development
PORT=5000
MONGODB_URI=mongodb+srv://user:password@cluster.mongodb.net/specter
JWT_SECRET=your-super-secret-key-minimum-32-characters
WIRE_API_KEY=your-wire-api-key-here
GOOGLE_GENERATIVE_AI_KEY=your-key-here
FRONTEND_URL=http://localhost:5173

Frontend (.env):

VITE_API_URL=http://localhost:5000
VITE_APP_NAME=SPECTER

Run Locally

Terminal 1 (Backend):

cd backend && npm run dev
# Should output:
# ╔══════════════════════════════════════════╗
# ║        SPECTER - SERVER STARTED          ║
# ║ Port: 5000 | Database: Connected         ║
# ║ Wire API: ✓ | Google AI: ✓               ║
# ╚══════════════════════════════════════════╝

Terminal 2 (Frontend):

cd frontend && npm run dev
# http://localhost:5173

Terminal 3 (Test API):

curl http://localhost:5000/api/health
# {"status":"operational","timestamp":"2024-05-31T12:00:00.000Z"}

How It Works

3-Stage Pipeline

Step 1: Wire API (120s timeout)

  • Domain metadata, SSL certificates, age, MX records
  • Redirect chains, technology stack detection
  • Embedded links, forms, scripts
  • Output: Raw technical signals

Step 2: AI Analysis (45s timeout)

  • Google Generative AI pattern recognition
  • Behavioral clustering against known threats
  • Phishing vector identification
  • Confidence scoring and summary generation
  • Fallback: Rule-based analysis if AI unavailable

Step 3: Threat Scoring (10s timeout)

  • Risk score (0-100)
  • Threat classification (Critical/High/Medium/Low/Safe)
  • Scam probability, toxicity rating, confidence
  • Output: Final verdict

Data Flow

1. User submits URL
2. POST /api/investigations/start
3. Backend returns investigationId (status: processing)
4. Frontend polls GET /api/investigations/:id every 2s
5. Background: Step 1 → Step 2 → Step 3
6. Status changes to completed
7. Frontend renders results

Gotchas I Solved

1. Slow External APIs Without Blocking

  • Problem: Wire API + AI = 165s. Blocking the request thread kills UX.
  • Solution: Async workers + polling. POST returns instantly with ID, frontend polls every 2s.
  • Lesson: For external APIs >10s, always use async + polling or WebSockets.

2. Rate Limit Exploitation

  • Problem: Users hammer the API. Bots scrape URL intelligence.
  • Solution: Dual-axis rate limiting:
    • Global: 100 requests/15min (catches distributed attacks)
    • Per-user: 5 investigations/min (prevents individual abuse)
    • Sliding window (not fixed buckets)
  • Lesson: Single rate limit is insufficient. Attack from one user looks different than botnet traffic.

3. External API Resilience

  • Problem: What if Wire API is down? What if Google AI returns an error?
  • Solution: Graceful degradation:
    • Wire API failure → Use cached domain reputation data
    • Google AI timeout → Fall back to rule-based threat scoring
    • Both failures → Return partial results with explicit warnings
  • Lesson: Single point of failure cascades. Build fallbacks at every layer.

4. JWT Token Expiry Handling

  • Problem: Users investigate for hours but tokens expire after 30 days.
  • Solution: Token refresh pattern:
    • 30-day access tokens + refresh tokens
    • Frontend axios interceptor refreshes automatically
    • No sensitive data in error messages
  • Lesson: Never leak token details in error responses.

5. JavaScript-Heavy Sites

  • Problem: Wire API sees static HTML. Dynamic forms, obfuscated links, JS-rendered content are invisible.
  • Solution: Hybrid approach:
    • Wire API for structural/technical analysis
    • Google AI for behavioral/pattern analysis
    • Triangulation catches what either misses
  • Lesson: No single tool is complete. Combine strengths.

6. MongoDB Connection Pooling

  • Problem: Mongoose default pool size (5) was too small under concurrent load.
  • Solution: Tuned pool settings in connection URI, added connection monitoring.
  • Lesson: Database bottlenecks surface under load, not in dev.

API Reference

Authentication

POST /api/auth/register
  { email, password, displayName? }

POST /api/auth/login
  { email, password }

GET /api/auth/profile
  Headers: Authorization: Bearer {token}

Investigations

POST /api/investigations/start
  { targetType: "url", targetValue: "https://..." }
  Returns: { investigationId, status: "processing" }

GET /api/investigations/:investigationId
  Returns: Complete threat analysis

GET /api/investigations?page=1&limit=10
  Returns: User's investigation history

PUT /api/investigations/:investigationId/bookmark
  { isBookmarked: boolean }

Reports & Analytics

GET /api/reports/:investigationId
GET /api/reports/:investigationId/export?format=pdf|json

GET /api/analytics/user-stats
GET /api/analytics/threat-distribution

Threat Metrics

Metric Range Meaning
Risk Score 0-100 Overall threat severity
Threat Level Critical/High/Medium/Low/Safe Classification
Phishing Detected Yes/No Known phishing patterns
Scam Probability 0-100% Fraudulent intent likelihood
Toxicity Score 0-100 Content toxicity
Confidence Score 0-100% Analysis certainty

Performance

Metric Target Actual
Page Load <2s 1.2s
Investigation Start (API) <500ms 300ms
Results Available <30s 8-15s
API Response Time <1s 200-400ms
Database Query <100ms 50-80ms

Security

JWT Auth - 30-day token expiry + refresh rotation
Password Hashing - bcryptjs (salt rounds: 10)
Rate Limiting - 100 req/15min global + 5/min per user
Helmet.js - CSP, X-Frame-Options, HSTS, etc.
CORS - Whitelist frontend origin only
Input Validation - Email format, password entropy, URL structure
Error Handling - No sensitive data leakage
Environment Isolation - Secrets in .env, never in code


Deployment

Frontend (Vercel):

  1. Push to GitHub
  2. vercel.com/new → Import repo
  3. Set VITE_API_URL env var
  4. Deploy

Backend (Render):

  1. render.com → Create Web Service
  2. Connect GitHub repo
  3. Set all env vars (MONGODB_URI, WIRE_API_KEY, etc.)
  4. Deploy

Database (MongoDB Atlas):

  1. cloud.mongodb.com → Create cluster (free tier)
  2. Get connection string
  3. Whitelist your IP
  4. Set as MONGODB_URI

Project Structure

specter/
├── frontend/
│   ├── src/
│   │   ├── pages/          # Route components
│   │   ├── components/     # Reusable UI components
│   │   ├── hooks/          # Custom React hooks
│   │   ├── api/            # API client + interceptors
│   │   ├── context/        # Auth context
│   │   ├── utils/          # Helpers
│   │   └── styles/         # Global CSS
│   ├── vite.config.js
│   ├── tailwind.config.js
│   └── package.json
│
├── backend/
│   ├── src/
│   │   ├── routes/         # Express route handlers
│   │   ├── services/       # Business logic (Wire, AI, Scoring)
│   │   ├── models/         # Mongoose schemas
│   │   ├── config/         # Validation, constants
│   │   ├── scripts/        # Database seeders
│   │   ├── server.js       # Express app setup
│   │   └── index.js        # Entry point
│   └── package.json
│
└── docs/
    ├── ARCHITECTURE.md     # System design
    ├── API.md              # Endpoint reference
    └── DEPLOYMENT.md       # Production setup

Testing

Manual API Tests

# Register
curl -X POST http://localhost:5000/api/auth/register \
  -H "Content-Type: application/json" \
  -d '{"email":"test@example.com","password":"Test123!"}'

# Login
curl -X POST http://localhost:5000/api/auth/login \
  -H "Content-Type: application/json" \
  -d '{"email":"test@example.com","password":"Test123!"}'

# Start investigation
curl -X POST http://localhost:5000/api/investigations/start \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"targetType":"url","targetValue":"https://example.com"}'

# Get results
curl http://localhost:5000/api/investigations/INVESTIGATION_ID \
  -H "Authorization: Bearer YOUR_TOKEN"

Browser Testing

  1. Register account
  2. Test URLs: example.com (safe), malicious-url.com (suspicious)
  3. Verify threat scores, phishing detection, report generation
  4. Check investigation history and bookmarks

Troubleshooting

Frontend can't reach backend

# Check backend is running
curl http://localhost:5000/api/health

# Check VITE_API_URL in frontend/.env matches backend
# Check FRONTEND_URL in backend/.env matches frontend origin (http://localhost:5173)
# Check browser console for CORS errors

MongoDB connection failed

# Verify connection string
MONGODB_URI=mongodb+srv://user:password@cluster.mongodb.net/specter

# Check IP whitelist in MongoDB Atlas (add 0.0.0.0/0 for development)
# Verify database user has correct credentials

Wire API errors

  • Check API key is valid and quota isn't exceeded
  • Review Wire API docs for rate limits
  • Enable debug logging: DEBUG=* npm run dev

Investigation timeout (>180s)

  • Default timeout is 180 seconds
  • Check backend logs for step-specific errors
  • Test with simple URL first (e.g., example.com)

Stats

  • 48 hour build (hackathon sprint)
  • 2100+ LOC (1200 frontend, 900 backend)
  • 12 API endpoints (Auth, Investigations, Reports, Analytics)
  • 3-stage pipeline (Wire API → AI → Scoring)
  • 8-15s typical latency (8-180s max with timeouts)
  • 4 database collections (users, investigations, reports, analytics)
  • 18 React components (modular, reusable)
  • 3 backend services (Wire client, AI analyzer, threat scorer)

What I Learned

  1. Async beats blocking. External APIs >10s? Don't wait. Async + polling scales better.
  2. Graceful degradation saves systems. When Wire API fails, use cached data. When AI times out, use rules.
  3. Rate limiting is multidimensional. Global limits catch botnets. Per-user limits catch individual abuse.
  4. Hybrid intelligence works. One data source has blind spots. Wire API + AI catch what each misses.
  5. Security is layering. JWT + bcryptjs + Helmet + CORS + input validation = defense in depth.

License

MIT — see LICENSE


Made By Anas Ahmed

Questions? Open a GitHub issue

Live: https://specter-weld.vercel.app

About

AI-powered digital threat intelligence platform built for Anakin Build-a-thon 2026 using Anakin Wire.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors