Speed up PlutusData/CBORSerializable decode ~5.4x and encode ~4.3x (no behavior change) #492
Measures decode/encode/to_json for typed PlutusData and untyped RawPlutusData across synthetic complexity sweeps, to locate the chain-indexing bottleneck. Run: python benchmarks/plutus_bench.py [iters] (set CBOR_C_EXTENSION=1 to compare a fast backend). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Typed PlutusData/dataclass decode recomputed get_type_hints(cls) once per decoded node and getfullargspec(t.from_primitive) once per typed field on every node. Both depend only on the class, not the data, yet dominated typed decode (~422 get_type_hints + ~421 getfullargspec calls per decode of a 200-element datum — together ~70% of decode time). Memoize both in module-level WeakKeyDictionary caches (so dynamically created classes can still be garbage collected). Generic aliases, which are not always weakly referenceable, are computed without caching. Result: ~3.6x faster typed PlutusData decode (200-inner datum 4.94ms -> 1.36ms, cbor2pure), backend-independent. All 568 tests pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
PlutusData.__post_init__ ran on every decoded instance, re-checking each field's declared type against the allowed set (a class-invariant check) and recomputing fields() each time. Cache the validated fields tuple per class in a WeakKeyDictionary (safe for dynamically created classes); cached instances run only the per-instance byte-length check. First instance preserves the original interleaved type/length validation exactly, and a class with an invalid field type is never cached so it keeps raising identically. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Decode: _restore_typed_primitive re-derived a field's decode strategy (issubclass / __origin__ / isinstance / try-except chains) on every value even though it depends only on the field type. Resolve it once into a memoized "decode plan" callable per type, and build per-class array/map field plans, all cached in WeakKeyDictionaries (with safe fallbacks for unhashable or non-weakreferenceable types). Behavior is identical: same DeserializeException cases, Union fallback order, list/dict/Optional handling, IndefiniteList preservation, object_hook metadata, and the one-time f.type resolution.
Encode: the recursive to_primitive descent re-validated the large Primitive Union return type via typeguard at every node. Route base-implementation recursion through an un-annotated _to_primitive worker (public to_primitive keeps its annotation and top-level check; overrides still dispatch polymorphically). Output is byte-for-byte identical.
Result (typed PlutusData, cbor2pure, backend-independent): ~1.5x faster decode and ~4.3x faster encode on top of the type-hint caching. All 568 tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
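The "decode plan" idea can be illustrated with a small sketch: the type is inspected once, and the result is a callable that is reused for every value of that type. Everything here (`decode_plan`, `_build_plan`, `_PLAN_CACHE`) is a hypothetical stand-in, it handles only a few scalar types and `List[T]`, and it uses a plain dict rather than the weak-keyed caches with fallbacks that the commit describes.

```python
from typing import Any, Callable, Dict, List, get_args, get_origin

# Hypothetical cache: type -> callable that decodes one value of that type.
_PLAN_CACHE: Dict[Any, Callable[[Any], Any]] = {}


def _build_plan(tp):
    """Inspect the type once and return a specialised decoder for it."""
    if get_origin(tp) is list:
        item_plan = decode_plan(get_args(tp)[0])
        return lambda value: [item_plan(v) for v in value]
    if tp in (int, bytes, str):
        def check_scalar(value, _tp=tp):
            if not isinstance(value, _tp):
                raise ValueError(f"expected {_tp.__name__}, got {value!r}")
            return value
        return check_scalar
    raise TypeError(f"unsupported type in this sketch: {tp!r}")


def decode_plan(tp):
    """Return the memoized decoder for `tp`, building it on first use."""
    plan = _PLAN_CACHE.get(tp)
    if plan is None:
        plan = _build_plan(tp)
        _PLAN_CACHE[tp] = plan
    return plan


plan = decode_plan(List[int])
decoded = plan([1, 2, 3])  # no type inspection happens on this call
```

The type dispatch runs when the plan is built; each later value only pays for the work the plan itself does.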
Codecov Report
❌ Patch coverage is
@@ Coverage Diff @@
## main #492 +/- ##
==========================================
+ Coverage 90.62% 90.83% +0.20%
==========================================
Files 34 34
Lines 5154 5289 +135
Branches 781 802 +21
==========================================
+ Hits 4671 4804 +133
Misses 304 304
- Partials 179 181 +2
…tive OrderedSet.append/remove re-encoded each element via dumps() twice (once for the membership check, once for the dict key); compute the CBOR de-dup key once. This dominated decode of set-heavy transactions (dumps was ~78% of decode cumtime on real fixtures). to_validated_primitive carried a `-> Primitive` return annotation, so the @TypeChecked class decorator re-validated the result against the 26-member Primitive Union even though to_primitive (which it calls) already return-checks it once. Drop the annotation (mirrors the existing _to_primitive worker). Result: set-heavy tx decode 886 -> 384 us (2.3x); tx encode ~1.4x. Byte-identical output, all 568 tests pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
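The "check once at the public entry point, recurse through an unchecked worker" pattern mentioned above might look like the following sketch. The `return_checked` decorator is a hypothetical stand-in for a typeguard-style return check, and `Node` is an illustrative class rather than anything from pycardano.

```python
import functools

CHECKS = {"count": 0}  # counts how often the (simulated) return check runs


def return_checked(func):
    """Hypothetical stand-in for a typeguard-style return-value check."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        CHECKS["count"] += 1
        if not isinstance(result, (int, bytes, list)):
            raise TypeError(f"unexpected primitive: {result!r}")
        return result
    return wrapper


class Node:
    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)

    @return_checked
    def to_primitive(self):
        # Public entry point: the result is validated once, at the top level.
        return self._to_primitive()

    def _to_primitive(self):
        # Unchecked worker: the recursion stays here, so nested nodes
        # do not trigger another validation of the whole result.
        if not self.children:
            return self.value
        return [self.value] + [child._to_primitive() for child in self.children]


tree = Node(1, [Node(2), Node(3, [Node(4)])])
primitive = tree.to_primitive()
checks_run = CHECKS["count"]
```

In this sketch a four-node tree triggers a single check; recursing through `to_primitive` instead would run one per node.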
…present to_shallow_primitive did deepcopy(self).normalize() on every encode solely to avoid mutating self while stripping zero/empty entries. Scan first and skip the deepcopy when there is nothing to strip (the common case). Result: MultiAsset.to_shallow_primitive 130 -> 33 us (~3.9x) for a typical token-transfer multi-asset. Byte-identical output, all 568 tests pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
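A minimal sketch of the scan-first approach, using nested dicts as a stand-in for a multi-asset structure. The function name `strip_empty` and the data layout are assumptions for illustration only.

```python
from copy import deepcopy


def strip_empty(assets):
    """Return `assets` without zero quantities or empty policies.

    The input is never mutated; a copy is made only when something
    actually has to be stripped.
    """
    needs_strip = any(
        not tokens or any(qty == 0 for qty in tokens.values())
        for tokens in assets.values()
    )
    if not needs_strip:
        return assets  # common case: nothing to strip, so no deepcopy
    result = deepcopy(assets)
    for policy in list(result):
        tokens = result[policy]
        for name in [n for n, qty in tokens.items() if qty == 0]:
            del tokens[name]
        if not tokens:
            del result[policy]
    return result


clean = {"policy_a": {"token_1": 5}}
dirty = {"policy_a": {"token_1": 0, "token_2": 7}, "policy_b": {}}
clean_out = strip_empty(clean)
dirty_out = strip_empty(dirty)
```

The read-only scan is cheap compared with a deep copy, which is why skipping the copy pays off when zero or empty entries are rare.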
- OrderedSet (#4): key de-duplication by the element's native hash, falling back to CBOR bytes only for unhashable elements (namespaced so it can't collide). This avoids a dumps() per element entirely for hashable set members. BEHAVIOR NOTE: de-dup for hashable elements is now by Python __eq__/__hash__ rather than CBOR-byte equality. These coincide for pycardano's set element types (TransactionInput, key hashes, witnesses); unhashable elements keep the original CBOR-byte semantics. Added tests for the unhashable/mixed paths.
- _dfs encode recursion (#5): scalar-leaf fast path + iterate IndefiniteList.data directly (avoids the slow collections.abc.Sequence.__iter__).
- Cache dataclasses.fields() per class in to_shallow_primitive (Python-Cardano#6).
Result (cbor2pure, backend-independent): set-heavy tx decode 395 -> 224 us (1.76x), typed PlutusData encode 3257 -> 2285 us (1.43x), datum_hash 3959 -> 3027 us (1.31x). All 569 tests pass; byte-identical output (except the documented OrderedSet de-dup-key semantics).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
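The namespaced de-dup key can be sketched as below. This is an illustration under stated assumptions: `MiniOrderedSet` and `dedup_key` are hypothetical, and `json.dumps` stands in for the CBOR `dumps()` so that the example has no third-party dependency.

```python
import json


def dedup_key(element):
    """Build a de-dup key: the element itself if hashable, else its serialized form.

    The first tuple member namespaces the two kinds of key so that a
    serialized fallback can never equal a native key.
    """
    try:
        hash(element)
    except TypeError:
        # json.dumps is a stand-in for the CBOR encoder (assumption).
        return ("serialized", json.dumps(element, sort_keys=True))
    return ("native", element)


class MiniOrderedSet:
    """Insertion-ordered set that de-duplicates through dedup_key()."""

    def __init__(self):
        self._items = {}  # key -> element; dicts preserve insertion order

    def append(self, element):
        # The key is computed once and used for both lookup and storage.
        self._items.setdefault(dedup_key(element), element)

    def remove(self, element):
        del self._items[dedup_key(element)]

    def __iter__(self):
        return iter(self._items.values())


s = MiniOrderedSet()
for e in [b"a", [1, 2], b"a", [1, 2], (3,)]:
    s.append(e)
items = list(s)
```

Hashable elements never reach the serializer here, while unhashable ones such as lists still de-duplicate by their serialized form.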
Address.from_primitive ran a speculative cbor2.loads() on every address to detect a Byron tag-24 wrapper. A Byron address is a 2-element CBOR array whose first element is tag 24, i.e. bytes starting b"\x82\xd8\x18"; no Shelley header byte is 0x82. Only run the probe when the prefix matches, skipping it on the common Shelley path. Byte-identical behavior. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
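To make the prefix concrete, the sketch below splits each CBOR initial byte into its major type (top three bits) and additional information (low five bits). The helper names `cbor_head` and `needs_byron_probe` are hypothetical.

```python
BYRON_PREFIX = b"\x82\xd8\x18"  # array(2) followed by tag(24)


def cbor_head(byte):
    """Split a CBOR initial byte into (major type, additional information)."""
    return byte >> 5, byte & 0x1F


def needs_byron_probe(value: bytes) -> bool:
    """Return True only when the bytes could be a tag-24-wrapped 2-element array."""
    return value[:3] == BYRON_PREFIX


# 0x82 -> major type 4 (array) with length 2.
array_head = cbor_head(0x82)
# 0xd8 -> major type 6 (tag); additional info 24 means the tag number
# is in the next byte, which is 0x18 == 24.
tag_head = cbor_head(0xD8)
```

A cheap three-byte comparison therefore decides whether the more expensive speculative decode is worth attempting.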
Cover the reachable new branches with targeted tests:
- OrderedSet de-dup unhashable/mixed fallback and remove() (test_serialization)
- direct _restore_typed_primitive entry + ByteString passthrough/wrap
- Asset/MultiAsset.to_shallow_primitive deepcopy/normalize path (zero values)
- PlutusData.__post_init__ cached fast-path byte-length validation
- Byron address decode from raw CBOR bytes + invalid-CBOR fallback
Mark genuinely-unreachable defensive branches with `# pragma: no cover` (impossible generic-alias arities; the is-CBORSerializable-AND-PRIMITIVE_TYPE case that cannot co-occur; non-hashable / non-weakreferenceable type fallbacks; non-init map fields). All diff lines are now covered or excludable. 574 tests pass; flake8/mypy/black/isort clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
cffls
left a comment
Looks pretty good! Added a few minor comments.
    for f, v in zip(all_fields, values):
        if not isclass(f.type):
            f.type = type_hints[f.name]
        v = _restore_dataclass_field(f, v)
Function _restore_dataclass_field could be removed since it isn't used anymore.
    # b"\x82\xd8\x18" (array(2) + tag(24)). Guarding on that prefix avoids a
    # speculative cbor2.loads() on every (Shelley) address, whose header byte is
    # never 0x82.
    if value[:3] == b"\x82\xd8\x18":
b"\x82\xd8\x18" is used in multiple places, including tests. Would be good to make it a constant.
    tv is int
    or tv is str
    or tv is bytes
    or tv is bool
    or tv is float
    or value is None
Can we simplify here, something like the following?
_SCALAR_TYPES = frozenset({int, str, bytes, bool, float, type(None)})
...
if type(value) in _SCALAR_TYPES:
    return value
    assert _restore_typed_primitive(ByteString, b"hello") == bs
    assert _restore_typed_primitive(ByteString, bs) is bs
Is it possible to add some tests to ensure the caching continues to work, in case future code changes bypass it?
- Remove the now-unused `_restore_dataclass_field` helper and its `Field` import.
- Extract the Byron-address CBOR prefix `b"\x82\xd8\x18"` to a named `_BYRON_ADDRESS_CBOR_PREFIX` constant, used in address.py and the tests.
- Simplify the scalar-leaf fast path to a `_SCALAR_TYPES` frozenset membership test.
- Add regression tests that the per-class introspection caches (type hints, fields) memoize and aren't bypassed by repeated (de)serialization.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
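One way such a cache regression test could be written is to wrap the underlying introspection call with a spy and assert that it runs once regardless of how many lookups follow. This is a sketch of the idea only: `cached_hints`, `_HINTS`, and `Sample` are hypothetical names, not the PR's test code.

```python
import typing
from unittest import mock
from weakref import WeakKeyDictionary

_HINTS = WeakKeyDictionary()  # hypothetical cache under test


def cached_hints(cls):
    """Look up type hints through the cache, computing them on a miss."""
    if cls not in _HINTS:
        _HINTS[cls] = typing.get_type_hints(cls)
    return _HINTS[cls]


class Sample:
    x: int


def count_introspection_calls(repeats):
    """Return how many times get_type_hints really ran for `repeats` lookups."""
    _HINTS.pop(Sample, None)  # start from a cold cache
    with mock.patch(
        "typing.get_type_hints", wraps=typing.get_type_hints
    ) as spy:
        for _ in range(repeats):
            cached_hints(Sample)
    return spy.call_count


calls = count_introspection_calls(100)
```

If a later change bypassed the cache, the call count would grow with `repeats` and an assertion on it would fail.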
Summary
A set of backend-independent optimizations to pycardano's (de)serialization hot paths, primarily benefiting `PlutusData`- and set-heavy workloads such as on-chain data indexing and transaction building. No new dependencies and no change to the CBOR library. Output is byte-for-byte identical (one documented exception noted below). All tests pass on Python 3.9–3.13.
Motivation
Profiling showed the cost of typed `PlutusData` decode and real-transaction (de)serialization is dominated by repeated, data-independent work in pycardano itself, not the CBOR backend: per-node `get_type_hints()` / signature introspection, per-value type dispatch, redundant typeguard return-checks, re-encoding set elements for de-duplication, and defensive `deepcopy`s on every multi-asset encode.
Changes
Decode
- Cache `get_type_hints(cls)` and `from_primitive` `type_args` introspection per class.
- Cache `PlutusData.__post_init__` field-type validation per class (per-instance byte-length check preserved).
- Resolve a per-type decode plan once instead of re-running the `issubclass`/`__origin__`/`isinstance` chain on every value.
- `OrderedSet` keys de-duplication by the element's native hash (CBOR-bytes fallback for unhashable elements), avoiding a `dumps()` per element, which was previously ~78% of real-transaction decode time.
- Gate the Byron-address `cbor2.loads` probe behind a byte-prefix check.
Encode
- Route the recursive `to_primitive` descent through an un-annotated worker, and drop the redundant typeguard return-check on `to_validated_primitive`, so the large `Primitive` `Union` isn't re-validated at every node.
- `_dfs`: scalar-leaf fast path and direct `IndefiniteList.data` iteration (avoids the slow `Sequence.__iter__`).
- Cache `dataclasses.fields()` per class in `to_shallow_primitive`.
- Skip `deepcopy().normalize()` in `Asset`/`MultiAsset.to_shallow_primitive` when there are no zero/empty entries.
All caches are keyed by class, not data (so the wins hold across millions of unique objects) and use `WeakKeyDictionary` so dynamically-created classes are still garbage-collected.
Benchmarks
Pure-Python CBOR backend, best-of-5 (reproduce with the included `benchmarks/plutus_bench.py`). Measured cases:
- `PlutusData` decode (200 fields)
- `PlutusData` encode (200 fields)
- `MultiAsset.to_shallow_primitive`
Correctness
- Same `DeserializeException` cases, `Union` fallback order, list/dict/`Optional` handling, `IndefiniteList` preservation, `object_hook` metadata, and one-time `f.type` resolution.
- For hashable `OrderedSet` elements, de-duplication is now by Python `__eq__`/`__hash__` rather than CBOR-byte equality. These coincide for pycardano's set element types (transaction inputs, key hashes, witnesses); unhashable elements retain the original CBOR-byte semantics. Covered by new tests.
- `flake8`, `mypy`, `black`, `isort` clean. New code paths are covered by added tests; genuinely-unreachable defensive branches are marked `# pragma: no cover`.
🤖 Generated with Claude Code