Release 0.4.4: async tasks, nested refs, and hash correctness#22
Conversation
There was a problem hiding this comment.
PR Review
PR: Release 0.4.4: async tasks, nested refs, and hash correctness
Important
Verdict: Request changes - 4 actionable findings, highest severity P2.
Findings (4)
visited set not exception-safe in _contains_resolvable_ref
src/cashet/async_executor.py:61
The function adds an object id to visited set before recursing and discards it afterwards. If any exception occurs inside the recursive any() call, the id is never removed, potentially causing false negatives for subsequent lookups of the same object within the same call stack.
Set resolution may silently drop duplicate values
src/cashet/async_executor.py:330
When resolving refs inside a set, if two distinct refs load to identical objects, the set may shrink because sets naturally deduplicate. The original set's cardinality is not preserved, which could break functions that depend on the number of refs passed.
visited set not exception-safe in _collect_input_refs
src/cashet/dag.py:29
The function adds an object id to visited set before iterating over container items and discards it afterwards without a finally block. If the iteration or the recursive call raises an exception, the id remains in the set, potentially causing incorrect deduplication or missing refs in subsequent calls.
Deep argument walk in resolve_input_refs may degrade performance
src/cashet/dag.py:45
The new deep traversal of all argument containers via _collect_input_refs can be expensive for tasks with large data structures that contain no refs. Previously, only top-level refs were detected. Consider short-circuiting when the argument cannot contain any ref (e.g., using a fast-path check similar to _contains_resolvable_ref in async_executor).
Optional Recheck
After pushing a fix, DS-Review will review the new commit automatically when pull-request sync reviews are enabled.
Reply @ds-review recheck under the relevant inline finding for targeted verification of that thread.
| for k, v in value.items(): | ||
| key = await self._resolve_value(k, memo) |
There was a problem hiding this comment.
Dict key resolution can produce unhashable objects
The _resolve_value method resolves dict keys via _resolve_value, which can convert a ResultRef key into an unhashable object like a list or dict, causing a TypeError when inserting into the new dict. Keys should not be deeply resolved, or the resolution should validate hashability.
| for k, v in value.items(): | |
| key = await self._resolve_value(k, memo) | |
| for k, v in value.items(): | |
| dict_result[k] = await self._resolve_value(v, memo) |
| for val in kwargs.values(): | ||
| if hasattr(val, "__cashet_ref__"): | ||
| refs.append(val.__cashet_ref__()) | ||
| _collect_input_refs(val, refs, visited) |
There was a problem hiding this comment.
Input refs not deduplicated causing commit hash variability
The _collect_input_refs function appends refs without deduplication. If the same ref appears multiple times across nested containers, the commit hash changes unnecessarily because the duplicate hashes are included. Deduplicate refs by their hash before returning.
| _collect_input_refs(val, refs, visited) | |
| return list({ref.hash: ref for ref in refs}.values()) |
| if isinstance(value, frozenset): | ||
| resolved_items = [await self._resolve_value(item, memo) for item in value] | ||
| frozenset_result = frozenset(resolved_items) | ||
| memo[value_id] = frozenset_result | ||
| return frozenset_result |
There was a problem hiding this comment.
frozenset resolution may fail with unhashable items
When resolving items inside a frozenset, a resolved value may be unhashable (e.g., a list from a ref), causing a TypeError when constructing the frozenset. Add a hashability check before creating the frozenset.
| if isinstance(value, frozenset): | |
| resolved_items = [await self._resolve_value(item, memo) for item in value] | |
| frozenset_result = frozenset(resolved_items) | |
| memo[value_id] = frozenset_result | |
| return frozenset_result | |
| if isinstance(value, frozenset): | |
| resolved_items = [await self._resolve_value(item, memo) for item in value] | |
| for item in resolved_items: | |
| try: | |
| hash(item) | |
| except TypeError: | |
| raise TypeError( | |
| f"Cannot resolve frozenset item {item!r} because it is not hashable" | |
| ) | |
| frozenset_result = frozenset(resolved_items) | |
| memo[value_id] = frozenset_result | |
| return frozenset_result |
Summary
This release fixes several cache correctness and execution edge cases:
AsyncClientnow supportsasync deftask callablesResultRef/AsyncResultRefvalues now resolve inside containersrange,slice, anddatetimevalues are included in function hashes0.4.4Details
AsyncClient.submit()can now run both synchronous and asynchronous task functions.Function hashing now includes immutable global values referenced by the function or nested code objects such as comprehensions, so changing a module-level constant invalidates affected cache entries.
Refs can now be passed inside common containers like dicts, lists, tuples, sets, and frozensets. They resolve before execution and are recorded in commit metadata as input refs.
The nested resolver keeps normal arguments intact when no refs are present, preserves tuple subclass behavior where possible, and avoids constructing invalid dict/frozenset containers from unhashable resolved values.
Task execution now awaits the task function itself when needed, but it does not recursively await the value returned by that function. That keeps returned awaitable objects as the function's actual result.
Redis exports from a base install no longer fail as
NoneTypecallables. They now raise an install error pointing users to the Redis extra.