ROADMAP M8 just landed cross-port. The new spec at docs/HARDENING.md defines the decoder-safety contract every port must enforce against untrusted input; the adversarial corpus tests it; the 9-port matrix workflow runs it on every protowire PR.
Companion check_decode reference for this port: scripts/check_decode.py (commit 51df4a6).
The workflow ships in advisory mode (continue-on-error: true) so failures do not block PRs. This issue tracks driving this port's regressions to zero so we can flip the gate to required for python.
Current baseline — 4 advisory failures (1 crash)
| Corpus | Verdict | Reason |
|---|---|---|
| pxf/deep-nesting-200.pxf | FAIL_VERDICT | accepted; missing MaxNestingDepth cap (heavy lifting in C++ FFI) |
| pxf/deep-nesting-1000.pxf | FAIL_VERDICT | same |
| pxf/deep-nesting-100000.pxf | FAIL_CRASH | C++ FFI stack overflow leaks through to Python |
| pxf/invalid-utf8-string.pxf | FAIL_VERDICT | C++ FFI's PXF decoder doesn't enforce UTF-8 on proto3 string |
The Python port passes most corpus inputs because:
- pb/deep-submessage-200.binpb rejected by google.protobuf (which has built-in recursion limits).
- pb/length-prefix-truncated.binpb rejected by google.protobuf.
- pxf/long-numeric.pxf rejected by Python's int(str) raising ValueError on the int64 parser path.
- pxf/lone-surrogate.pxf rejected at the lexer (already validates ValidRune).
What to fix
All 4 PXF failures are on the C++ FFI side (the _protowire/module.cc extension and its protowire-cpp dependencies). Enforcing two HARDENING.md invariants in protowire-cpp will fix them upstream:
- §Recursion: a MaxNestingDepth = 100 cap on the C++ PXF parser/decoder. Tracked in [protowire-cpp's M8 issue]; once that lands, this port inherits the fix automatically. The 100k-deep crash converts to a clean error returned through the FFI.
- §UTF-8: strict UTF-8 enforcement when populating proto3 string fields in protowire-cpp's PXF decoder. Same upstream fix.
If the maintainers want a Python-side defense-in-depth in addition (e.g. setting sys.setrecursionlimit lower, or wrapping the FFI call in a stack-budget check), that's optional but not required to clear the corpus — the upstream C++ fixes are sufficient.
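For maintainers who do opt into the Python-side defense, one possible shape is a pre-flight scan that bounds nesting depth and checks UTF-8 before the bytes reach the extension. The sketch below is not the port's API: `preflight_pxf` and `MAX_NESTING_DEPTH` are hypothetical names, and it assumes PXF nests with `{`/`}` and `[`/`]` and quotes strings with `"` plus backslash escapes, which should be verified against the PXF grammar. It ignores comments, and whole-input UTF-8 validation would not catch invalid bytes that only appear after escape processing.

```python
MAX_NESTING_DEPTH = 100  # mirrors the MaxNestingDepth = 100 cap named for protowire-cpp

_OPEN = frozenset(b"{[")   # assumed nesting openers (hypothetical PXF syntax)
_CLOSE = frozenset(b"}]")  # assumed nesting closers


def preflight_pxf(data: bytes, max_depth: int = MAX_NESTING_DEPTH) -> None:
    """Raise ValueError if data is not valid UTF-8 or nests deeper than max_depth."""
    try:
        data.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError as exc:
        raise ValueError(f"input is not valid UTF-8: {exc}") from None

    depth = 0
    in_string = False
    escaped = False
    for byte in data:  # iterative scan: no recursion per nesting level
        if in_string:
            if escaped:
                escaped = False
            elif byte == 0x5C:  # backslash starts an escape
                escaped = True
            elif byte == 0x22:  # closing double quote
                in_string = False
        elif byte == 0x22:  # opening double quote
            in_string = True
        elif byte in _OPEN:
            depth += 1
            if depth > max_depth:
                raise ValueError(f"nesting depth exceeds {max_depth}")
        elif byte in _CLOSE:
            depth = max(depth - 1, 0)  # unbalanced closers are left to the real parser
```

Calling a check like this before the FFI decode would turn the 100k-deep input into a ValueError without recursing. Note that sys.setrecursionlimit only bounds Python frames, so on its own it would not limit recursion inside the C++ extension.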
The pure-Python envelope.py decoder also has a couple of IndexError-vs-ValueError exception-type leaks the cross-port review flagged (envelope.py:131-132, 85-93); those aren't tested by the current corpus but are good hygiene fixes.
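The code at those line ranges is not shown here, so the following only illustrates the general pattern behind such a leak and one way to close it. `read_length_prefixed` and its one-byte length prefix are hypothetical and not taken from envelope.py.

```python
def read_length_prefixed(buf: bytes, offset: int) -> tuple[bytes, int]:
    """Read a 1-byte length followed by that many payload bytes.

    Hypothetical wire layout, for illustration only. Every malformed input
    surfaces as ValueError, never IndexError.
    """
    if offset < 0 or offset >= len(buf):
        # Without this check, buf[offset] raises IndexError on truncated input.
        raise ValueError(f"truncated input: no length byte at offset {offset}")
    length = buf[offset]
    start = offset + 1
    end = start + length
    if end > len(buf):
        # Slicing would silently return a short payload, so check explicitly.
        available = len(buf) - start
        raise ValueError(f"truncated input: need {length} payload bytes, have {available}")
    return buf[start:end], end
```

Checking bounds up front, rather than catching IndexError after the fact, keeps the exception type uniform for callers that treat ValueError as the clean-reject signal.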
Reproduce locally
# In the venv with `pip install -e .` already done:
WALLCLOCK_SECONDS=10 bash ../protowire/scripts/cross_security_check.sh
# Or run a single corpus input:
python3 scripts/check_decode.py --format pxf \
  --schema adversarial.v1.Tree \
  --proto ../protowire/testdata/adversarial/adversarial.proto \
  --input ../protowire/testdata/adversarial/pxf/deep-nesting-100000.pxf
echo $? # 0 = accepted, 1 = clean reject, 134/139 = crash via FFI
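The exit-code convention in that last comment can also be applied from Python, for example when scripting over several corpus files. This is a sketch under assumptions: `classify_exit` is a hypothetical helper, not part of check_decode.py, and it assumes a POSIX system where SIGABRT is 6 and SIGSEGV is 11. A shell reports death by signal N as 128 + N (hence 134 and 139), while Python's `subprocess` reports it as a return code of -N.

```python
import signal

# Signals treated as a crash; 134 and 139 in a shell correspond to these on POSIX.
CRASH_SIGNALS = {int(signal.SIGABRT), int(signal.SIGSEGV)}


def classify_exit(returncode: int) -> str:
    """Map an exit status to 'accepted', 'rejected', 'crash', or 'unknown'."""
    if returncode == 0:
        return "accepted"
    if returncode == 1:
        return "rejected"
    if returncode < 0:
        signum = -returncode       # subprocess convention: -N for signal N
    elif returncode > 128:
        signum = returncode - 128  # shell convention: 128 + N for signal N
    else:
        return "unknown"
    return "crash" if signum in CRASH_SIGNALS else "unknown"
```

Passing `subprocess.run([...]).returncode` for the command above to this helper would distinguish a clean reject from a crash that escaped through the FFI.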
Convergence
When all 4 corpus inputs pass on this port, comment here. We'll then set continue-on-error: false for the python matrix entry in protowire/.github/workflows/security.yml. When all 9 ports converge, the workflow becomes a required check.
Cross-port context: 8 sibling per-port issues track the same convergence. The full cross-port matrix lives in protowire's ROADMAP § M8. Most of this port's findings originate upstream in protowire-cpp; see that repo's M8 issue.