ML Systems Engineer
MS Computer Science, Texas State University · GPA 3.56/4.0 · Thesis defended April 2026 · San Marcos, TX · F-1 OPT (STEM) · Seeking H-1B sponsorship
I build end-to-end systems — from CUDA/PyTorch research code to production APIs and published packages.
My MS thesis (defended April 2026, IMICS Lab) is a codec-agnostic, ROI-aware video compression pipeline for wildlife camera-trap footage under edge constraints: it splits the animal region and the background into separate streams with independent temporal sampling and pluggable codec backends, achieving 97.4% transmitted-size reduction at 36.12 dB ROI PSNR on a 20-clip held-out test set. Manuscript in preparation with my advisor (co-author).
Alongside the thesis I shipped two public artifacts: a deterministic INT16 CUDA runtime for the DCVC-RT codec family that produces cross-device byte-reproducible bitstreams (validated by encoding on an NVIDIA DGX and decoding on an RTX 3070 Ti laptop GPU — matching SHA-256), and llm-code-validator on PyPI, a static-analysis CLI that detects version-incompatible Python API usage. I also maintain a live English↔Spanish translation demo on Hugging Face Spaces.
Before grad school I spent 1.5 years as an Associate Software Engineer at NeoSOFT (CMMI Level 5) — AWS Lambda, MySQL, React → Next.js migration, Agile delivery.
Languages
ML / Systems
Backend / DevOps
Codec-agnostic dual-stream design · PyTorch · MegaDetector · KLT · AMT · Jetson Orin Nano · AV1 / HEVC
ROI-aware video compression for wildlife camera traps under edge constraints. The pipeline treats the animal region and the background as two streams with independent temporal sampling and different codec operating points, so bitrate is spent where it matters for downstream review.
The methodology is codec-agnostic by design: the main controlled study fixes a learned neural codec backend (DCVC-RT) so the effect of the dual-stream split can be isolated from codec choice; the deployment study on a Jetson Orin Nano swaps in a portable conventional stack (AV1 for the ROI stream, HEVC for the background stream) and demonstrates the same methodology end-to-end on edge hardware.
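As a rough illustration of independent temporal sampling per stream, the sketch below only decides which frame indices each stream would encode. It is not the thesis implementation: `StreamConfig`, `schedule`, and the stride values are hypothetical, and the codec labels simply echo the deployment study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamConfig:
    codec: str   # backend label, e.g. "av1" or "hevc" as in the deployment study
    stride: int  # keep every stride-th frame (illustrative value, not a thesis setting)


def schedule(roi_present, roi_cfg, bg_cfg):
    """Return the frame indices each stream would encode.

    roi_present[t] is True when the detector reports an animal in frame t.
    """
    # ROI stream: only frames that contain an animal, sampled at the ROI stride.
    roi_idx = [t for t, present in enumerate(roi_present)
               if present and t % roi_cfg.stride == 0]
    # Background stream: sampled on its own, usually much sparser, schedule.
    bg_idx = [t for t in range(len(roi_present)) if t % bg_cfg.stride == 0]
    return roi_idx, bg_idx


roi_cfg = StreamConfig(codec="av1", stride=1)
bg_cfg = StreamConfig(codec="hevc", stride=30)
roi_present = [15 <= t < 45 for t in range(60)]  # synthetic 60-frame clip
roi_idx, bg_idx = schedule(roi_present, roi_cfg, bg_cfg)
```

Because each stream carries its own configuration, swapping the codec backend does not touch the scheduling logic, which is the sense in which the split is codec-agnostic.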
| Metric | Result |
|---|---|
| Transmitted-size reduction (held-out 20 clips) | 97.4% (399.16 MB → 10.36 MB, 38.52× ratio) |
| Mean ROI PSNR | 36.12 dB |
| Mean ROI MS-SSIM | 0.9758 (vs full-frame 0.9625) |
| ROI-stage detector calls reduced | 93.28% |
| ROI-stage speedup vs dense detection | 4.41× |
| ROI-presence recall | 96.24% |
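
The reduction percentage and the compression ratio in the first row are two views of the same pair of sizes. A quick check using the rounded megabyte figures from the table (the helper name is illustrative):

```python
def size_reduction(original_mb, transmitted_mb):
    """Return (compression ratio, percent size reduction) for a pair of sizes."""
    ratio = original_mb / transmitted_mb
    reduction_pct = 100.0 * (1.0 - transmitted_mb / original_mb)
    return ratio, reduction_pct


ratio, reduction_pct = size_reduction(399.16, 10.36)
print(f"{ratio:.2f}x smaller, {reduction_pct:.1f}% reduction")
```

With the rounded sizes this comes out at about 38.5× and 97.4%, in line with the table.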
Defended April 3, 2026 — IMICS Lab, Texas State University. Advisor: Dr. Vangelis Metsis. Manuscript in preparation with thesis advisor (co-author). Funded in part by the BioStream grant.
CUDA · PyTorch C++ Extensions · INT16 · rANS · SHA-256 · Apache-2.0
The upstream DCVC-RT codec runs FP16 inference, whose rounding behavior differs across NVIDIA GPU microarchitectures — small numerical divergences accumulate through the entropy context and corrupt reference frames when encode and decode run on different GPUs. This project replaces the FP16 inference path with a signed INT16 arithmetic profile using fixed power-of-two quantization scales, so every multiply-accumulate follows an explicit integer contract and bitstreams are bit-identical across GPUs.
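
The idea of an explicit integer contract can be sketched in plain Python. This is not the project's CUDA code: the function names, the 8 fractional bits, and the sample values are assumptions chosen for illustration. Python integers are unbounded, so a real kernel would also need a fixed-width accumulator.

```python
import hashlib
import struct

INT16_MIN, INT16_MAX = -32768, 32767


def quantize(x, frac_bits):
    """Map a float to signed INT16 using the power-of-two scale 2**-frac_bits."""
    q = int(round(x * (1 << frac_bits)))
    return max(INT16_MIN, min(INT16_MAX, q))


def dot_int16(a_q, b_q, shift):
    """Integer multiply-accumulate, then requantize with a rounding right shift."""
    acc = sum(a * b for a, b in zip(a_q, b_q))  # exact, order-independent
    out = (acc + (1 << (shift - 1))) >> shift   # add half, then shift: round half up
    return max(INT16_MIN, min(INT16_MAX, out))


def digest(values):
    """SHA-256 over little-endian INT16 values, as a stand-in for a bitstream hash."""
    return hashlib.sha256(struct.pack(f"<{len(values)}h", *values)).hexdigest()


weights = [quantize(w, 8) for w in (0.5, -0.25, 0.125, 0.75)]
inputs = [quantize(x, 8) for x in (1.0, 2.0, -3.0, 1.5)]

out_fwd = dot_int16(weights, inputs, shift=8)
out_rev = dot_int16(weights[::-1], inputs[::-1], shift=8)  # different summation order
```

Integer addition is associative, so the accumulation order does not change the result. Floating-point sums carry no such guarantee, which is why the FP16 path can diverge between devices.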
| Metric | Result |
|---|---|
| Cross-device determinism | SHA-256 match — encode on NVIDIA DGX, decode on RTX 3070 Ti laptop GPU |
| End-to-end validation clip | 2,593 frames, 1280×720, QP 32 |
| Average RGB PSNR | 28.90 dB |
| Average RGB MS-SSIM | 0.934 |
| Test suite | 35 tests (CUDA kernel parity, entropy roundtrip, bitstream SHA-256) |
| Size reduction vs original MP4 | 94.27% (17.45× smaller) |
Native CUDA kernels for INT16 operations; C++ rANS entropy-coding PyTorch extension; calibration pipeline that converts upstream DCVC-RT FP16 checkpoints into INT16 bundles. Apache-2.0 with upstream Microsoft DCVC attribution.
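
A cross-device check of the kind reported above reduces to comparing digests of the two outputs. A minimal sketch using only the standard library; the helper name and the in-memory stand-ins for the two devices' bitstreams are hypothetical:

```python
import hashlib
import io


def sha256_stream(fileobj, chunk_size=1 << 20):
    """Hash a binary stream in chunks so large bitstreams need not fit in memory."""
    h = hashlib.sha256()
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


# Stand-ins for bitstreams produced on two different machines.
bitstream_a = io.BytesIO(b"\x00\x01\x02\x03" * 1024)
bitstream_b = io.BytesIO(b"\x00\x01\x02\x03" * 1024)
match = sha256_stream(bitstream_a) == sha256_stream(bitstream_b)
```

In practice the two streams would be files opened with `open(path, "rb")` on each machine, with only the hex digests exchanged.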
PyTorch · FastAPI · ChromaDB · Hugging Face · Docker · MIT
▶ Live demo on Hugging Face Spaces
Custom encoder-decoder Transformer trained from scratch on a 3.5M English-Spanish sentence-pair corpus (Europarl, TED2020, News Commentary, OpenSubtitles). Deployed behind a FastAPI service with an optional ChromaDB bilingual evidence retrieval layer and an optional GPT reviewer over retrieved context — each response returns the draft, retrieved evidence, review status, final wording, and per-request provenance fields.
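
The response shape described above could look roughly like the following. The field names, status strings, and the `respond` helper are assumptions for illustration, not the service's actual schema:

```python
from dataclasses import asdict, dataclass, field


@dataclass
class TranslationResponse:
    draft: str          # raw model output
    final: str          # wording returned to the caller
    review_status: str  # "reviewed", or "skipped" when the optional reviewer is off
    evidence: list = field(default_factory=list)    # retrieved bilingual pairs, if any
    provenance: dict = field(default_factory=dict)  # which optional stages ran


def respond(draft, evidence=None, reviewed=None):
    """Assemble a response; retrieval and review are both optional stages."""
    evidence = evidence or []
    return TranslationResponse(
        draft=draft,
        final=reviewed if reviewed is not None else draft,
        review_status="reviewed" if reviewed is not None else "skipped",
        evidence=evidence,
        provenance={"retrieval_used": bool(evidence),
                    "reviewer_used": reviewed is not None},
    )


plain = respond("Hola mundo")
checked = respond("Hola mundo", evidence=[("Hello world", "Hola, mundo")],
                  reviewed="Hola, mundo")
```

Keeping the draft alongside the final wording lets a caller see whether the optional stages changed anything.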
| Metric | Result |
|---|---|
| Held-out sacreBLEU | 31.41 |
| Test pairs | 878,564 |
| Training corpus | 3.5M sentence pairs |
| Retrieval index | 50K bilingual pairs · all-MiniLM-L6-v2 |
| Baseline comparison | DeepL, Google Cloud Translate, Azure Custom Translator, MarianMT |
Docker + docker-compose deployment, GitHub Actions CI, six dependency profiles (api / dev / demo / training / rag / gpt), Hugging Face Spaces Gradio deployment.
Python · AST · CLI · pre-commit · GitHub Actions · No-LLM-key
pip install llm-code-validator

Static-analysis CLI that scans Python source code without executing it, identifies version-incompatible third-party API usage via AST, and reports issues before runtime. Targeted at codebases that depend on fast-moving libraries: OpenAI, Anthropic, LangChain, LangGraph, LlamaIndex, FastAPI, Pydantic, pandas, NumPy, scikit-learn, SQLAlchemy, PyTorch, Transformers, TensorFlow, ChromaDB, Pinecone, Weaviate, Qdrant, and more.
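
The general technique, matching call sites against known API drift without running the code, can be sketched with the standard `ast` module. This is not the package's implementation: the one-entry rule table and the function names are hypothetical. The example rule relies on `openai.ChatCompletion` having been removed in the 1.0 release of the OpenAI Python library.

```python
import ast

# Hypothetical rule table: dotted call path -> message. Not the package's database.
RULES = {
    "openai.ChatCompletion.create":
        "removed in openai>=1.0; use client.chat.completions.create",
}


def dotted_name(node):
    """Rebuild 'a.b.c' from nested ast.Attribute / ast.Name nodes, else None."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def scan(source):
    """Parse source (never execute it) and report calls that match a rule."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in RULES:
                findings.append((node.lineno, name, RULES[name]))
    return findings


sample = (
    "import openai\n"
    'resp = openai.ChatCompletion.create(model="gpt-4", messages=[])\n'
)
findings = scan(sample)
```

A production checker would also need to resolve import aliases and the installed library version. The sketch skips both.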
| Metric | Result |
|---|---|
| Rule database | 68 API-drift rules · 15 safe fixes · 22 supported libraries |
| Test suite | 84 tests passing |
| Internal benchmark | precision 1.0 / recall 1.0 |
| Output formats | text · JSON · GitHub Actions annotations |
| Integrations | pre-commit · GitHub Actions · staged-Git scan |
| LLM API keys required | None — default checks are 100% local, no network calls |
Supports private/internal signature databases, conservative "safe-fix" preview-and-apply mode, and exit codes appropriate for CI gating.
Graduate Teaching Assistant, Texas State University (Sep 2025 – Present)
Lead weekly C++ programming labs for 25–30 undergraduate students in a first-year CS foundation course required of every engineering undergraduate at Texas State. Coordinate workload distribution, grading consistency, and student-support standards across a team of 21 Graduate Instructional Assistants.
Graduate Instructional Assistant, Texas State University (Dec 2024 – Sep 2025)
Tutored undergraduate students and graded C++ lab submissions for the same first-year CS foundation course. Promoted to Graduate Teaching Assistant after one semester.
Associate Software Engineer, NeoSOFT Technologies (CMMI Level 5) (Aug 2021 – Jan 2023)
Built AWS Lambda services for a multi-tenant CRM platform. Led a React → Next.js migration with server-side rendering for a media client. Diagnosed and corrected production MySQL data inconsistencies with targeted SQL repair scripts that ran without service interruption.
MS Computer Science (Thesis Option), Texas State University · GPA 3.56/4.0 · Aug 2024 – May 2026
Thesis: Neural Region-of-Interest-Aware Video Compression for Wildlife Monitoring Under Edge Computing Constraints
Advisor: Dr. Vangelis Metsis · IMICS Lab
BTech Computer Science and Engineering — CSMSS Chh. Shahu College of Engineering (Dr. Babasaheb Ambedkar Technological University) · 2018 – 2021