Healy PAHEALYCODES

Healy Patel

AI engineer focused on LLM safety, alignment, and reliable systems. Mississauga, ON · pahealyai@gmail.com · LinkedIn

I build LLM-powered systems with an emphasis on hallucination mitigation, adversarial robustness, and auditable decision-making. My projects ship runnable code and the evaluation harnesses to know when the code stops working.

I care about scalable oversight, interpretability, and the kind of boring engineering discipline that makes alignment techniques actually usable in production — structured outputs, refusal strategies, per-claim validation, and eval sets you can read in an afternoon.

Featured projects

AI Safety RAG Tutor

Grounded Q&A over AI safety research papers, with multi-layer source validation and a 30-prompt adversarial eval harness.

Every claim has to trace back to a retrieved chunk — or the system refuses. Built to make citation hallucination a first-class failure mode. Ships with a MockLLM so the pipeline runs without an API key, plus OpenAI + FAISS for real deployments.

Python · LangChain · FAISS · OpenAI API

LLM Lead Qualification Agent

Chain-of-thought reasoning agent that produces auditable, Pydantic-validated JSON across 500+ synthetic test cases.

Three-layer architecture (reason → classify → score), each layer unit-tested in isolation. Retry + fallback logic; NEEDS_REVIEW is a first-class class for the cases the model is genuinely unsure about.

Python · LangChain · Pydantic · Structured Reasoning

Multi-Agent Content Repurposer

CrewAI-style multi-agent workflow with a Reviewer critique loop and a built-in prompt-sensitivity study.

Models scalable-oversight patterns on a mundane task: one source → three formats (Twitter, LinkedIn, email). Every run writes a full transcript. The sensitivity command measures how much the output varies under prompt perturbations — because "it worked last time" is not a guarantee.

Python · CrewAI · Prompt Engineering

Adversarial Prompt Robustness Evaluator

Red-teaming toolkit with 40+ curated attacks spanning jailbreaks, prompt injections, goal misspecification, and boundary probes.

Each attack has an expected field — so the evaluator catches over-refusal, not just under-refusal. Generates a readable markdown report with per-category rates and paraphrase-consistency checks. Works against any target model via the Defender protocol.

Python · OpenAI API · Red-teaming

What I'm currently working on

Reading: Constitutional AI (Anthropic), Scaling Monosemanticity (Anthropic), Eliciting Latent Knowledge (ARC), Sleeper Agents (Anthropic)
Following: Anthropic, ARC, Redwood Research, EleutherAI, Apollo Research
Building: adversarial evaluation tooling and better refusal-boundary measurement

Tech I use

AI/ML: LangChain · CrewAI · LlamaIndex · Hugging Face Transformers · OpenAI API · Anthropic API RAG / search: FAISS · ChromaDB · embedding pipelines · source grounding Languages: Python (advanced) · Bash · TypeScript Practice: RLHF concepts · Constitutional AI · adversarial prompt testing · structured evaluation frameworks · refusal design

Open to AI safety / alignment engineering roles, research assistantships, and fellowship programs.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly