Context
Today each entry in `tests/test_patches.py` is a pair `(botocore_symbol, {hash_set})`. The classifier (the new check-async-need skill in #1567, plus the follow-up branch `amohr/classifier-sonnet-tuning`) needs to know, for each overridden symbol, the corresponding aiobotocore function so that it can reason about async-contamination.
Currently the classifier computes this via two separate registries:
- `overrides` from test_patches.py (authoritative botocore→hash map)
- `async_names` scraped from `async def` declarations in `aiobotocore/**/*.py` (derived; not authoritative)
The two can drift in both directions. A new aiobotocore-only async helper is absent from test_patches.py but still enters `async_names`. Conversely, a sync-signature delegate (like `emit`) is in test_patches.py but is not picked up as an async name automatically; this case is currently handled by a small curated `_SYNC_BUT_CONTAMINATED_NAMES` set in `plugins/aiobotocore-bot/evals/_common.py`.
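The two drift directions can be made concrete with plain set arithmetic. The sketch below is illustrative only: `registry_drift` and its parameter names are hypothetical, and it assumes both registries have been reduced to bare function names, which may not match how the classifier actually keys them.

```python
def registry_drift(
    override_names: set[str],
    async_names: set[str],
    curated_sync_names: set[str],
) -> dict[str, set[str]]:
    """Report names that appear on only one side of the two registries.

    All three arguments are assumed to hold bare function names
    (an assumption for this sketch, not the classifier's real keys).
    """
    return {
        # async helpers that exist only on the aiobotocore side
        "async_only": async_names - override_names,
        # overrides that are neither `async def` nor curated as sync
        # delegates; these are candidates for manual review
        "uncovered_overrides": override_names - async_names - curated_sync_names,
    }
```

An override such as `emit` would land in `uncovered_overrides` until it is added to the curated set.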
Proposal
Extend each test_patches.py entry with a pointer to the aiobotocore counterpart. Shape:
```python
(
    ClientArgsCreator.get_client_args,     # botocore side
    AioClientArgsCreator.get_client_args,  # aiobotocore side (import added to the module)
    {'hash'},
),
```
Benefits:
- Single source of truth for "this botocore symbol is overridden by that aiobotocore function"
- Classifier can derive the full async registry from one file
- Catches drift: if the aiobotocore import is removed but the entry stays, the entry no longer resolves and test_patches.py fails at import, so the mismatch cannot go unnoticed
- No more curated `_SYNC_BUT_CONTAMINATED_NAMES` — the delegate mapping becomes explicit
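As a rough illustration of how a single file could feed the classifier, the sketch below walks three-element entries and records, for each botocore symbol, its aiobotocore counterpart and whether that counterpart is declared `async def`. Everything named here is hypothetical: `Base`, `AioBase`, `ENTRIES`, and `build_async_registry` are stand-ins rather than real botocore or aiobotocore objects, and the hash strings are placeholders.

```python
import inspect


class Base:  # stand-in for a botocore class
    def fetch(self):
        return "data"

    def emit(self):
        return "event"


class AioBase(Base):  # stand-in for the aiobotocore subclass
    async def fetch(self):  # a true coroutine function
        return "data"

    def emit(self):  # sync signature, but hands back an awaitable
        return self.fetch()


# Proposed shape: (botocore side, aiobotocore side, {hashes})
ENTRIES = [
    (Base.fetch, AioBase.fetch, {"placeholder-hash-1"}),
    (Base.emit, AioBase.emit, {"placeholder-hash-2"}),
]


def build_async_registry(entries):
    """Map each botocore qualname to its aiobotocore counterpart."""
    registry = {}
    for botocore_fn, aio_fn, _hashes in entries:
        registry[botocore_fn.__qualname__] = {
            "aio_qualname": aio_fn.__qualname__,
            "declared_async": inspect.iscoroutinefunction(aio_fn),
        }
    return registry


registry = build_async_registry(ENTRIES)
```

Because every listed counterpart is an override by construction, a delegate such as the stand-in `emit` stays in the registry even though `declared_async` is false, which is the case the curated set covers today.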
Drawbacks:
- Every entry changes shape (large diff)
- Pytest parametrize needs to be updated to handle the new shape
- Slightly harder to read at a glance
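One way to soften the first two drawbacks, offered only as a sketch, is a small adapter that accepts both the current two-element entries and the proposed three-element entries, so that entries can migrate incrementally. `normalize_entry` is a hypothetical helper, not something in the test suite today.

```python
def normalize_entry(entry):
    """Return (botocore_obj, aio_obj_or_None, digests) for either shape.

    Two-element entries (the current shape) get None as the aiobotocore
    side; three-element entries (the proposed shape) pass through.
    """
    if len(entry) == 2:
        botocore_obj, digests = entry
        return botocore_obj, None, digests
    if len(entry) == 3:
        botocore_obj, aio_obj, digests = entry
        return botocore_obj, aio_obj, digests
    raise ValueError(f"expected 2 or 3 elements, got {len(entry)}")
```

The normalized triples could then feed `pytest.mark.parametrize("botocore_obj, aio_obj, digests", [normalize_entry(e) for e in entries])`, where `entries` stands for whatever list test_patches.py already builds.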
Alternatives considered
- Keep two registries (current state); tolerate some drift; curate where needed.
- Add a parallel `aiobotocore_overrides.py` file that only contains the aiobotocore-side refs; keep test_patches.py unchanged.
- Derive the mapping purely from convention (`X` → `AioX` in same module); accept edge cases via curation.
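The convention-based alternative could look roughly like the following. This is a sketch under the stated `X` → `AioX` convention only; `guess_aio_qualname` and the `curated` table are hypothetical names, and anything that is not a `Class.method` qualname is deliberately left to curation.

```python
from typing import Optional


def guess_aio_qualname(
    botocore_qualname: str,
    curated: Optional[dict[str, str]] = None,
) -> Optional[str]:
    """Guess the aiobotocore qualname for a botocore `Class.method` qualname.

    Returns None when the convention does not apply and no curated
    mapping exists, signalling that the entry needs manual curation.
    """
    curated = curated or {}
    if botocore_qualname in curated:
        return curated[botocore_qualname]  # a curated edge case wins
    class_name, dot, attr = botocore_qualname.partition(".")
    if not dot:
        return None  # module-level function: no class name to prefix
    return f"Aio{class_name}.{attr}"
```

For the pair used in the proposal, `guess_aio_qualname("ClientArgsCreator.get_client_args")` yields `"AioClientArgsCreator.get_client_args"`; whether the guessed symbol actually exists would still need to be checked against aiobotocore.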
Priority
Low — current two-registry setup works for the classifier after #1567. Worth revisiting if we hit classification errors traceable to registry drift, or if we add more `_SYNC_BUT_CONTAMINATED_NAMES` entries and the curation-maintenance cost grows.