🔥 TorchCode

title	TorchCode
emoji	🔥
colorFrom	red
colorTo	yellow
sdk	docker
app_port	7860
pinned	false

Crack the PyTorch interview.

Practice implementing operators and architectures from scratch — the exact skills top ML teams test for.

Like LeetCode, but for tensors. Self-hosted. Jupyter-based. Instant feedback.

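🎓 About This Fork: W2-W5 Practice Track

A personal practice problem set built on a fork of duoan/TorchCode. It organizes the main topics of CNN training in PyTorch into four weeks (W2-W5), with all 29 problems translated into Japanese. On top of the 40 upstream problems, it adds 16 problems tied directly to typical CNN training recipes (pooling, augmentation, evaluation metrics, modern optimizers), together with spec-driven generation infrastructure.

Weekly Mapping

Week	Theme	Problems	Folder
W2	MLP / basic classification / basic optimization	8	`practice/W2/`
W3	Regularization / normalization / advanced optimization	9	`practice/W3/`
W4	CNN basics + basic transforms	7	`practice/W4/`
W5	Advanced CIFAR-10 recipes	5	`practice/W5/`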

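Each week's README.md lists the problems in study order. In each .ipynb you write your implementation and then run check("...") for automatic grading (5 tests per problem, 145 tests in total). To open a problem other than the first, click the Colab badge of any .ipynb from that week's README.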
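
The totals quoted above follow directly from the weekly mapping table. A quick check, using a hand-copied dictionary that is only an illustration (the real mapping lives in `scripts/week_mapping.py`, whose format is not shown here):

```python
# Hypothetical copy of the weekly mapping table above; not the actual
# contents of scripts/week_mapping.py.
PROBLEMS_PER_WEEK = {"W2": 8, "W3": 9, "W4": 7, "W5": 5}
TESTS_PER_PROBLEM = 5  # stated in the text: 5 tests per problem

total_problems = sum(PROBLEMS_PER_WEEK.values())
total_tests = total_problems * TESTS_PER_PROBLEM

print(total_problems, total_tests)  # 29 145
```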

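Usage

In Colab (recommended, no setup required): click the Colab badge at the top right of any .ipynb under practice/W{n}/, choose Run All, write your implementation in the ✏️ cell, then grade it with the final check("...") cell.

Locally (Docker / JupyterLab):

make run                                # Start Docker
# In your browser, open http://localhost:8888 and navigate to practice/W2/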

Maintenance and Extension

Run the regeneration script that matches the type of change:

Goal	File to edit	Regeneration command
Add a new problem	Create `problem_specs/{id}.py`	`python scripts/build.py --verify`
Fix the description, tests, or solution of a spec problem (#41-56)	`problem_specs/{id}.py`	`python scripts/build.py --verify`
Fix the intro of an upstream problem (#01-40)	cell 0 of `templates/{file}.ipynb`	`python scripts/sync_solutions.py`
Change the weekly mapping	`scripts/week_mapping.py`	`python scripts/build_weeks.py`
Fully reset the week folders (discards in-progress work)	(same as above)	`python scripts/build_weeks.py --reset`
Sanity-check all 56 solutions	(none)	`python scripts/verify_all_solutions.py`

Sources and Generated Files

This repository is a hybrid of spec-driven problems (16) and hand-written upstream problems (40). If you edit a do-not-edit file directly, the change is lost at the next regeneration.

Source (OK to edit)	Generated (do not edit; regenerated)
`problem_specs/*.py` (16 problems)	`torch_judge/tasks/{id}.py` + `templates/{4,5}.ipynb` + `solutions/{4,5}_solution.ipynb`
cell 0 of `templates/0-40_.ipynb` (40 existing problems)	cell 0 of the matching `solutions/*_solution.ipynb` (intro only; the code stays as in upstream)
`scripts/week_mapping.py`	`practice/W{n}/`, `practice/W{n}/README.md`, `practice/README.md`

After large changes, it is safest to run verify_all_solutions.py and confirm that all 56 solutions pass.

For details, see also Architecture and Adding Your Own Problems below.

License

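The upstream duoan/TorchCode is released under the MIT License. This fork inherits that license and is also released under the MIT License. The fork's own additions and modifications are likewise available under MIT. See LICENSE for details.

What follows is the upstream TorchCode README (in English, covering all 56 problems). The problems added by this fork are numbered #41 and above.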

🎯 Why TorchCode?

Top companies (Meta, Google DeepMind, OpenAI, etc.) expect ML engineers to implement core operations from memory on a whiteboard. Reading papers isn't enough: you need to be able to write softmax, LayerNorm, MultiHeadAttention, and full Transformer blocks in code.

TorchCode gives you a structured practice environment with:

	Feature
🧩	56 curated problems	The most frequently asked PyTorch interview topics (40 upstream, plus 16 added in this fork)
⚖️	Automated judge	Correctness checks, gradient verification, and timing
🎨	Instant feedback	Colored pass/fail per test case, just like competitive programming
💡	Hints when stuck	Nudges without full spoilers
📖	Reference solutions	Study optimal implementations after your attempt
📊	Progress tracking	What you've solved, best times, and attempt counts
🔄	One-click reset	Toolbar button to reset any notebook back to its blank template — practice the same problem as many times as you want
	Open in Colab	Every notebook has an "Open in Colab" badge + toolbar button — run problems in Google Colab with zero setup

No cloud. No signup. No GPU needed. Just make run — or try it instantly on Hugging Face.

🚀 Quick Start

Option 0 — Try it online (zero install)

Launch on Hugging Face Spaces — opens a full JupyterLab environment in your browser. Nothing to install.

Or open any problem directly in Google Colab: every notebook has an "Open in Colab" badge.

Option 0b — Use the judge in Colab (pip)

In Google Colab, install the judge from this fork's git URL (so you get the full task set, including the additions in this fork that aren't on the upstream PyPI package):

!pip install -q --force-reinstall --no-deps git+https://github.com/alextfkd/TorchCode.git

(The notebook templates already have this install cell at the top — just Run All in Colab.)

Then in a notebook cell:

from torch_judge import check, status, hint, reset_progress
status()           # list all problems and your progress
check("relu")      # run tests for the "relu" task
hint("relu")       # show a hint

Option 1 — Pull the pre-built image (fastest)

docker run -p 8888:8888 -e PORT=8888 ghcr.io/duoan/torchcode:latest

If the registry image is unavailable for your platform, use Option 2 instead. This is the common path on Apple Silicon / arm64.

Option 2 — Build locally

make run

make run will try the prebuilt image first and automatically fall back to a local build when needed.

Open http://localhost:8888 — that's it. Works with both Docker and Podman (auto-detected).

📋 Problem Set

Frequency: 🔥 = very likely in interviews, ⭐ = commonly asked, 💡 = emerging / differentiator

🧱 Fundamentals — "Implement X from scratch"

The bread and butter of ML coding interviews. You'll be asked to write these without torch.nn.

#	Problem	What You'll Implement	Freq	Key Concepts
1	ReLU	`relu(x)`	🔥	Activation functions, element-wise ops
2	Softmax	`my_softmax(x, dim)`	🔥	Numerical stability, exp/log tricks
16	Cross-Entropy Loss	`cross_entropy_loss(logits, targets)`	🔥	Log-softmax, logsumexp trick
17	Dropout	`MyDropout` (nn.Module)	🔥	Train/eval mode, inverted scaling
18	Embedding	`MyEmbedding` (nn.Module)	🔥	Lookup table, `weight[indices]`
19	GELU	`my_gelu(x)`	⭐	Gaussian error linear unit, `torch.erf`
20	Kaiming Init	`kaiming_init(weight)`	⭐	`std = sqrt(2/fan_in)`, variance scaling
21	Gradient Clipping	`clip_grad_norm(params, max_norm)`	⭐	Norm-based clipping, direction preservation
31	Gradient Accumulation	`accumulated_step(model, opt, ...)`	💡	Micro-batching, loss scaling
40	Linear Regression	`LinearRegression` (3 methods)	🔥	Normal equation, GD from scratch, nn.Linear
3	Linear Layer	`SimpleLinear` (nn.Module)	🔥	`y = xW^T + b`, Kaiming init, `nn.Parameter`
4	LayerNorm	`my_layer_norm(x, γ, β)`	🔥	Normalization over the feature dimension (no running stats), affine transform
7	BatchNorm	`my_batch_norm(x, γ, β)`	⭐	Batch vs layer statistics, train/eval behavior
8	RMSNorm	`rms_norm(x, weight)`	⭐	LLaMA-style norm, simpler than LayerNorm
15	SwiGLU MLP	`SwiGLUMLP` (nn.Module)	⭐	Gated FFN, `SiLU(gate) * up`, LLaMA/Mistral-style
22	Conv2d	`my_conv2d(x, weight, ...)`	🔥	Convolution, unfold, stride/padding
41	2D Max Pooling	`my_max_pool2d(x, k, stride, padding)`	🔥	Unfold + amax, pad with `-inf` for negative inputs
49	2D Average Pooling	`my_avg_pool2d(x, k, stride, padding)`	⭐	Unfold + mean, `count_include_pad=True` default
50	Global Average Pooling	`global_avg_pool(x)`	🔥	Mean over (H, W), ResNet/MobileNet head replacing FC
51	Label Smoothing CE	`label_smoothing_ce(logits, targets, ε)`	⭐	Smoothed target dist, modern training recipe
52	Top-k Accuracy	`top_k_accuracy(logits, targets, k)`	🔥	`topk` indices + `any`, ImageNet eval standard
53	NLL Loss	`my_nll_loss(log_probs, targets)`	⭐	Advanced indexing, `CE = log_softmax + NLL`
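
To make the "numerical stability" and "logsumexp trick" entries above concrete, here is a minimal sketch, not the repository's reference solution. The function names `stable_softmax` and `cross_entropy_from_logits` are made up for this illustration.

```python
import torch

def stable_softmax(x: torch.Tensor, dim: int = -1) -> torch.Tensor:
    # Subtracting the per-slice max leaves the result unchanged but keeps
    # exp() from overflowing on large logits.
    shifted = x - x.max(dim=dim, keepdim=True).values
    exp = shifted.exp()
    return exp / exp.sum(dim=dim, keepdim=True)

def cross_entropy_from_logits(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    # log_softmax = shifted - logsumexp(shifted); then pick the target entry per row.
    shifted = logits - logits.max(dim=-1, keepdim=True).values
    log_probs = shifted - torch.logsumexp(shifted, dim=-1, keepdim=True)
    rows = torch.arange(logits.shape[0])
    return -log_probs[rows, targets].mean()

logits = torch.tensor([[1000.0, 1001.0, 1002.0], [0.0, 0.0, 0.0]])
targets = torch.tensor([2, 0])

# The naive formula overflows on the first row: exp(1000) is inf, and inf / inf is nan.
naive = logits.exp() / logits.exp().sum(dim=-1, keepdim=True)

probs = stable_softmax(logits)
loss = cross_entropy_from_logits(logits, targets)
```

The stable version agrees with `torch.softmax` and `torch.nn.functional.cross_entropy` on this input, while the naive one returns `nan` for the row with large logits.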

🧠 Attention Mechanisms — The heart of modern ML interviews

If you're interviewing for any role touching LLMs or Transformers, expect at least one of these.

#	Problem	What You'll Implement	Freq	Key Concepts
23	Cross-Attention	`MultiHeadCrossAttention` (nn.Module)	⭐	Encoder-decoder, Q from decoder, K/V from encoder
5	Scaled Dot-Product Attention	`scaled_dot_product_attention(Q, K, V)`	🔥	`softmax(QK^T/√d_k)V`, the foundation of everything
6	Multi-Head Attention	`MultiHeadAttention` (nn.Module)	🔥	Parallel heads, split/concat, projection matrices
9	Causal Self-Attention	`causal_attention(Q, K, V)`	🔥	Autoregressive masking with `-inf`, GPT-style
10	Grouped Query Attention	`GroupQueryAttention` (nn.Module)	⭐	GQA (LLaMA 2), KV sharing across heads
11	Sliding Window Attention	`sliding_window_attention(Q, K, V, w)`	⭐	Mistral-style local attention, O(n·w) complexity
12	Linear Attention	`linear_attention(Q, K, V)`	💡	Kernel trick, `φ(Q)(φ(K)^TV)`, O(n·d²)
14	KV Cache Attention	`KVCacheAttention` (nn.Module)	🔥	Incremental decoding, cache K/V, prefill vs decode
24	RoPE	`apply_rope(q, k)`	🔥	Rotary position embedding, relative position via rotation
25	Flash Attention	`flash_attention(Q, K, V, block_size)`	💡	Tiled attention, online softmax, memory-efficient
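
As a rough illustration of the `softmax(QK^T/√d_k)V` formula and the `-inf` causal mask listed above, the following sketch assumes inputs of shape (batch, seq, d_k). The name `sdpa_sketch` and the `causal` flag are choices made for this example, not the signatures the judge expects.

```python
import math
import torch

def sdpa_sketch(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor, causal: bool = False) -> torch.Tensor:
    d_k = q.shape[-1]
    # Similarity of every query with every key, scaled by sqrt(d_k).
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # (..., T_q, T_k)
    if causal:
        t_q, t_k = scores.shape[-2:]
        # True above the diagonal marks "future" positions to hide.
        future = torch.triu(torch.ones(t_q, t_k, device=scores.device), diagonal=1).bool()
        scores = scores.masked_fill(future, float("-inf"))
    weights = torch.softmax(scores, dim=-1)  # each row sums to 1
    return weights @ v

torch.manual_seed(0)
q, k, v = (torch.randn(2, 4, 8) for _ in range(3))
out = sdpa_sketch(q, k, v, causal=True)
```

With the causal mask, the first position can attend only to itself, so `out[:, 0]` equals `v[:, 0]`.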

🏗️ Architecture & Adaptation — Put it all together

#	Problem	What You'll Implement	Freq	Key Concepts
26	LoRA	`LoRALinear` (nn.Module)	⭐	Low-rank adaptation, frozen base + `BA` update
27	ViT Patch Embedding	`PatchEmbedding` (nn.Module)	💡	Image → patches → linear projection
13	GPT-2 Block	`GPT2Block` (nn.Module)	⭐	Pre-norm, causal MHA + MLP (4x, GELU), residual connections
28	Mixture of Experts	`MixtureOfExperts` (nn.Module)	⭐	Mixtral-style, top-k routing, expert MLPs

🎨 Data Augmentation — "Boost CIFAR-10 accuracy without changing the model"

The data side of the recipe. Together with normalization + cosine LR, these turn a baseline CNN into a competitive one.

#	Problem	What You'll Implement	Freq	Key Concepts
42	Per-Channel Normalize	`my_normalize(x, mean, std)`	🔥	Channel-wise `(x − μ) / σ`, broadcast to (C, 1, 1)
43	Random Horizontal Flip	`random_horizontal_flip(x, p)`	🔥	Per-sample Bernoulli mask + `torch.flip`
44	Random Crop with Padding	`random_crop(x, size, padding)`	🔥	`F.pad` + per-sample random offset slice
45	Cutout / RandomErasing	`cutout(x, size)`	⭐	Random rectangle zero-mask, DeVries 2017
46	Mixup	`mixup(x, y, α)`	⭐	`Beta(α, α)`, 4-tuple `(x_mix, y_a, y_b, lam)` interface
47	CutMix	`cutmix(x, y, α)`	⭐	Area-based λ recomputed after boundary clipping
48	TTA (Horizontal Flip)	`tta_hflip(model, x)`	💡	Probability-space averaging, free 0.3–1% bump
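
Two of the ideas in the table, broadcasting per-channel statistics as (C, 1, 1) and drawing one Bernoulli decision per sample, can be sketched as follows. The function names and the optional `generator` argument are assumptions made for this example and may differ from the notebook templates.

```python
import torch

def normalize_per_channel(x: torch.Tensor, mean, std) -> torch.Tensor:
    # x: (N, C, H, W). Reshape the stats to (C, 1, 1) so they broadcast over H and W.
    mean_t = torch.as_tensor(mean, dtype=x.dtype).view(-1, 1, 1)
    std_t = torch.as_tensor(std, dtype=x.dtype).view(-1, 1, 1)
    return (x - mean_t) / std_t

def hflip_per_sample(x: torch.Tensor, p: float = 0.5, generator=None) -> torch.Tensor:
    # One independent flip decision per sample in the batch.
    flip = torch.rand(x.shape[0], generator=generator) < p
    flipped = torch.flip(x, dims=[-1])  # reverse the width axis
    return torch.where(flip.view(-1, 1, 1, 1), flipped, x)

torch.manual_seed(0)
x = torch.rand(4, 3, 8, 8)
x_norm = normalize_per_channel(x, mean=(0.5, 0.5, 0.5), std=(0.25, 0.25, 0.25))
x_aug = hflip_per_sample(x, p=0.5)
```

Setting `p=1.0` flips every sample and `p=0.0` leaves the batch untouched, which is a convenient way to test the mask logic deterministically.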

⚙️ Training & Optimization

#	Problem	What You'll Implement	Freq	Key Concepts
29	Adam Optimizer	`MyAdam`	⭐	Momentum + RMSProp, bias correction
30	Cosine LR Scheduler	`cosine_lr_schedule(step, ...)`	⭐	Linear warmup + cosine annealing
54	SGD with Momentum	`MySGDMomentum`	🔥	`v = μ·v + g` (PyTorch convention — no `(1−μ)` factor)
55	Weight Decay (L2)	`apply_weight_decay(params, wd)`	⭐	`g += wd·p`, compare with decoupled WD (#56)
56	AdamW	`MyAdamW`	🔥	Decoupled WD: `p *= (1 − lr·λ)`, Transformer default
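
The update rules quoted in the table can be checked on plain scalars before writing the tensor versions. This is a sketch only: the function names are invented for the example, and the real `MySGDMomentum` and `MyAdamW` classes operate on parameter tensors (AdamW also applies the Adam step, which is omitted here).

```python
def sgd_momentum_step(p, g, v, lr, mu):
    # PyTorch convention from the table: v = mu * v + g (no (1 - mu) factor).
    v = mu * v + g
    p = p - lr * v
    return p, v

def l2_weight_decay_grad(p, g, wd):
    # Classic L2 regularization: fold the decay into the gradient.
    return g + wd * p

def decoupled_weight_decay(p, lr, lam):
    # AdamW-style decay: shrink the parameter directly, independent of the gradient.
    return p * (1.0 - lr * lam)

# Two momentum steps with a constant gradient of 0.5.
p, v = 1.0, 0.0
p, v = sgd_momentum_step(p, 0.5, v, lr=0.1, mu=0.9)  # v = 0.5,  p = 0.95
p, v = sgd_momentum_step(p, 0.5, v, lr=0.1, mu=0.9)  # v = 0.95, p = 0.855

g_l2 = l2_weight_decay_grad(p=1.0, g=0.5, wd=0.01)      # 0.51
p_decayed = decoupled_weight_decay(p=1.0, lr=0.1, lam=0.01)  # 0.999
```

The difference matters for adaptive optimizers: with L2 the decay term passes through the gradient scaling, while the decoupled form shrinks every weight by the same factor.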

🎯 Inference & Decoding

#	Problem	What You'll Implement	Freq	Key Concepts
32	Top-k / Top-p Sampling	`sample_top_k_top_p(logits, ...)`	🔥	Nucleus sampling, temperature scaling
33	Beam Search	`beam_search(log_prob_fn, ...)`	🔥	Hypothesis expansion, pruning, eos handling
34	Speculative Decoding	`speculative_decode(target, draft, ...)`	💡	Accept/reject, draft model acceleration
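
One possible reading of "temperature scaling" plus top-k and nucleus (top-p) filtering, written in plain Python for clarity. The function name, argument defaults, and the convention that `top_k=0` disables top-k are assumptions for this sketch; the `sample_top_k_top_p` task in this repo may define them differently.

```python
import math
import random

def filter_top_k_top_p(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return renormalized probabilities after temperature, top-k, and top-p filtering."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]  # stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]

    # Token indices sorted from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        order = order[:top_k]

    # Keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    mass = sum(probs[i] for i in kept)
    return [probs[i] / mass if i in kept else 0.0 for i in range(len(probs))]

logits = [math.log(0.5), math.log(0.3), math.log(0.15), math.log(0.05)]
filtered = filter_top_k_top_p(logits, top_k=3, top_p=0.7)

# Sample a token id from the filtered distribution (seeded for repeatability).
token = random.Random(0).choices(range(len(filtered)), weights=filtered, k=1)[0]
```

Here top-k keeps three tokens, top-p then trims to the first two (0.5 + 0.3 reaches 0.7), and the survivors are renormalized to 0.625 and 0.375.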

🔬 Advanced — Differentiators

#	Problem	What You'll Implement	Freq	Key Concepts
35	BPE Tokenizer	`SimpleBPE`	💡	Byte-pair encoding, merge rules, subword splits
36	INT8 Quantization	`Int8Linear` (nn.Module)	💡	Per-channel quantize, scale/zero-point, buffer vs param
37	DPO Loss	`dpo_loss(chosen, rejected, ...)`	💡	Direct preference optimization, alignment training
38	GRPO Loss	`grpo_loss(logps, rewards, group_ids, eps)`	💡	Group relative policy optimization, RLAIF, within-group normalized advantages
39	PPO Loss	`ppo_loss(new_logps, old_logps, advantages, clip_ratio)`	💡	PPO clipped surrogate loss, policy gradient, trust region
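
The "clipped surrogate" idea behind the PPO entry can be sketched on Python lists. This is an illustrative version under the assumption that the loss is the negated mean of the clipped objective; the name `ppo_clipped_loss` and the default `clip_ratio=0.2` are not taken from the repository.

```python
import math

def ppo_clipped_loss(new_logps, old_logps, advantages, clip_ratio=0.2):
    losses = []
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio pi_new / pi_old, computed from log-probabilities.
        ratio = math.exp(new_lp - old_lp)
        clipped = min(max(ratio, 1.0 - clip_ratio), 1.0 + clip_ratio)
        # Take the pessimistic (smaller) objective, then negate to get a loss.
        losses.append(-min(ratio * adv, clipped * adv))
    return sum(losses) / len(losses)

# Ratio of 2 with a positive advantage: the gain is capped at 1 + clip_ratio.
loss_pos = ppo_clipped_loss([math.log(2.0)], [0.0], [1.0])
# Same ratio with a negative advantage: the unclipped (worse) term is kept.
loss_neg = ppo_clipped_loss([math.log(2.0)], [0.0], [-1.0])
```

The asymmetry is the point of the clipping: the policy gets no extra credit for moving far in a good direction, but is still fully penalized for moving far in a bad one.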

⚙️ How It Works

Each problem has two notebooks:

File	Purpose
`01_relu.ipynb`	✏️ Blank template — write your code here
`01_relu_solution.ipynb`	📖 Reference solution — check when stuck

Workflow

1. Open a blank notebook           →  Read the problem description
2. Implement your solution         →  Use only basic PyTorch ops
3. Debug freely                    →  print(x.shape), check gradients, etc.
4. Run the judge cell              →  check("relu")
5. See instant colored feedback    →  ✅ pass / ❌ fail per test case
6. Stuck? Get a nudge              →  hint("relu")
7. Review the reference solution   →  01_relu_solution.ipynb
8. Click 🔄 Reset in the toolbar  →  Blank slate — practice again!

In-Notebook API

from torch_judge import check, hint, status

check("relu")               # Judge your implementation
hint("causal_attention")    # Get a hint without full spoiler
status()                    # Progress dashboard — solved / attempted / todo

📅 Suggested Study Plan

Total: ~12–16 hours spread across 3–4 weeks. Perfect for interview prep on a deadline.

Week	Focus	Problems	Time
1	🧱 Foundations	ReLU → Softmax → CE Loss → Dropout → Embedding → GELU → Linear → LayerNorm → BatchNorm → RMSNorm → SwiGLU MLP → Conv2d	2–3 hrs
2	🧠 Attention Deep Dive	SDPA → MHA → Cross-Attn → Causal → GQA → KV Cache → Sliding Window → RoPE → Linear Attn → Flash Attn	3–4 hrs
3	🏗️ Architecture + Training	GPT-2 Block → LoRA → MoE → ViT Patch → Adam → Cosine LR → Grad Clip → Grad Accumulation → Kaiming Init	3–4 hrs
4	🎯 Inference + Advanced	Top-k/p Sampling → Beam Search → Speculative Decoding → BPE → INT8 Quant → DPO Loss → GRPO Loss → PPO Loss + speed run	3–4 hrs

🏛️ Architecture

┌──────────────────────────────────────────┐
│           Docker / Podman Container      │
│                                          │
│  JupyterLab (:8888)                      │
│    ├── templates/  (reset on each run)   │
│    ├── solutions/  (reference impl)      │
│    ├── torch_judge/ (auto-grading)       │
│    ├── torchcode-labext (JLab plugin)    │
│    │     🔄 Reset — restore template     │
│    │     🔗 Colab — open in Colab        │
│    └── PyTorch (CPU), NumPy              │
│                                          │
│  Judge checks:                           │
│    ✓ Output correctness (allclose)       │
│    ✓ Gradient flow (autograd)            │
│    ✓ Shape consistency                   │
│    ✓ Edge cases & numerical stability    │
└──────────────────────────────────────────┘

Single container. Single port. No database. No frontend framework. No GPU.

🛠️ Commands

make run    # Build & start (http://localhost:8888)
make stop   # Stop the container
make clean  # Stop + remove volumes + reset all progress

🧩 Adding Your Own Problems

TorchCode uses auto-discovery — just drop a new file in torch_judge/tasks/:

TASK = {
    "id": "my_task",
    "title": "My Custom Problem",
    "difficulty": "medium",
    "function_name": "my_function",
    "hint": "Think about broadcasting...",
    "tests": [ ... ],
}

No registration needed. The judge picks it up automatically.
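
This README does not show how the auto-discovery is implemented. One plausible way to build such a mechanism, assuming each task file exposes a module-level `TASK` dict as above, is to import every file in the directory and index the dicts by `id`. The `discover_tasks` helper below is hypothetical and is not part of `torch_judge`.

```python
import importlib.util
import tempfile
from pathlib import Path

def discover_tasks(tasks_dir):
    """Import every public .py file in tasks_dir and collect its TASK dict by id."""
    registry = {}
    for path in sorted(Path(tasks_dir).glob("*.py")):
        if path.name.startswith("_"):
            continue  # skip __init__.py and private helpers
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        task = getattr(module, "TASK", None)
        if isinstance(task, dict) and "id" in task:
            registry[task["id"]] = task
    return registry

# Demo on a throwaway directory standing in for torch_judge/tasks/.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "__init__.py").write_text("")
    Path(tmp, "my_task.py").write_text(
        'TASK = {"id": "my_task", "title": "My Custom Problem", "tests": []}\n'
    )
    tasks = discover_tasks(tmp)

print(sorted(tasks))  # ['my_task']
```

With a scheme like this, adding a problem really is just adding a file, which matches the "no registration needed" behavior described here.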

📦 Publishing `torch-judge` to PyPI (maintainers)

The judge is published as a separate package so Colab/users can pip install torch-judge without cloning the repo.

Automatic (GitHub Action)

Pushing to master after changing the package version triggers .github/workflows/pypi-publish.yml, which builds and uploads to PyPI. No git tag is required.

1. Bump the version in torch_judge/_version.py (e.g. __version__ = "0.1.1").
2. Configure a PyPI Trusted Publisher (one-time):
   - PyPI → Your project torch-judge → Publishing → Add a new pending publisher
   - Owner: duoan, Repository: TorchCode, Workflow: pypi-publish.yml, Environment: (leave empty)
   - Run the workflow once (push a version bump to master or Actions → Publish torch-judge to PyPI → Run workflow); PyPI will then link the publisher.
3. Release: commit the version bump and git push origin master.

Alternatively, use an API token: add repository secret PYPI_API_TOKEN (value = pypi-... from PyPI) and set TWINE_USERNAME=__token__ and TWINE_PASSWORD from that secret in the workflow if you prefer not to use Trusted Publishing.

Manual

pip install build twine
python -m build
twine upload dist/*

Version is in torch_judge/_version.py; bump it before each release.

❓ FAQ

Do I need a GPU?

No. Everything runs on CPU. The problems test correctness and understanding, not throughput.

Can I keep my solutions between runs?

Blank templates reset on every make run so you practice from scratch. Save your work under a different filename if you want to keep it. You can also click the 🔄 Reset button in the notebook toolbar at any time to restore the blank template without restarting.

Can I use Google Colab instead?

Yes! Every notebook has an Open in Colab badge at the top. Click it to open the problem directly in Google Colab — no Docker or local setup needed. You can also use the Colab toolbar button inside JupyterLab.

How are solutions graded?

The judge runs your function against multiple test cases using torch.allclose for numerical correctness, verifies gradients flow properly via autograd, and checks edge cases specific to each operation.
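
To give a feel for what such grading involves, here is a small sketch of a check that compares shapes and values against a reference and confirms the output is still connected to the input through autograd. `check_candidate` is a hypothetical helper written for this example; the actual `torch_judge` internals are not shown in this README and may work differently.

```python
import torch

def check_candidate(candidate, reference, x, rtol=1e-5, atol=1e-6):
    """Hypothetical grading helper: returns (passed, message)."""
    x_c = x.clone().requires_grad_(True)
    out = candidate(x_c)
    expected = reference(x)
    if out.shape != expected.shape:
        return False, "shape mismatch"
    if not torch.allclose(out, expected, rtol=rtol, atol=atol):
        return False, "value mismatch"
    if not out.requires_grad:
        return False, "no gradient"  # output was detached from the input
    out.sum().backward()
    if x_c.grad is None:
        return False, "no gradient"
    return True, "ok"

def my_relu(x):
    # Candidate under test: zero out negatives with a boolean mask.
    return x * (x > 0)

torch.manual_seed(0)
x = torch.randn(4, 5)
ok, msg = check_candidate(my_relu, torch.relu, x)
bad_value = check_candidate(lambda t: torch.zeros_like(t), torch.relu, x)
bad_grad = check_candidate(lambda t: torch.relu(t.detach()), torch.relu, x)
```

The last two calls show the two failure modes: an implementation with wrong values, and one with correct values that breaks gradient flow.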

Who is this for?

Anyone preparing for ML/AI engineering interviews at top tech companies, or anyone who wants to deeply understand how PyTorch operations work under the hood.

🤝 Contributors

Thanks to everyone who has contributed to TorchCode.

- duoan
- Ando233
- ThierryHJ

Auto-generated from the GitHub contributors graph with avatars and GitHub usernames.

Built for engineers who want to deeply understand what they build.

If this helped your interview prep, consider giving it a ⭐

☕ Buy Me a Coffee

Scan to support
