Fix ContextBuilder checkpoint loading for non-default architectures by harens · Pull Request #13 · Thijsvanede/DeepCASE

harens · 2026-04-23T22:31:30Z

This PR updates ContextBuilder persistence so checkpoints store the architecture metadata needed to reconstruct the model reliably.

Previously, save() wrote only the raw state_dict, and load() inferred constructor settings from tensor shapes. That worked for some default models, but was unreliable for non-default configurations such as custom num_layers, bidirectional encoders, or LSTM-based models.

Changes:

- Store ContextBuilder constructor metadata alongside the state_dict.
- Restore saved input_size, output_size, hidden_size, num_layers, max_length, bidirectional, and LSTM values on load.
- Preserve backwards compatibility with older raw state_dict checkpoints.
- Infer num_layers, bidirectional, and LSTM from recurrent tensor keys/shapes where older checkpoints do not include metadata.
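
As an illustration of the save/load changes listed above, here is a minimal sketch of a checkpoint layout that carries constructor metadata next to the weights while still accepting legacy raw state_dicts. The key names `config` and `state_dict` and the helpers `pack_checkpoint` / `unpack_checkpoint` are hypothetical and not taken from this PR's diff; only the seven constructor field names come from the description.

```python
from typing import Any, Dict, Mapping, Optional, Tuple

# Hypothetical key names; the actual checkpoint layout in the PR may differ.
CONFIG_KEY = "config"
STATE_KEY = "state_dict"

# Constructor settings named in the PR description.
CONFIG_FIELDS = (
    "input_size", "output_size", "hidden_size", "num_layers",
    "max_length", "bidirectional", "LSTM",
)


def pack_checkpoint(state_dict: Mapping[str, Any],
                    config: Mapping[str, Any]) -> Dict[str, Any]:
    """Bundle constructor metadata with the weights into one object."""
    missing = [field for field in CONFIG_FIELDS if field not in config]
    if missing:
        raise ValueError(f"missing constructor settings: {missing}")
    return {
        CONFIG_KEY: {field: config[field] for field in CONFIG_FIELDS},
        STATE_KEY: dict(state_dict),
    }


def unpack_checkpoint(
    checkpoint: Mapping[str, Any],
) -> Tuple[Optional[Dict[str, Any]], Dict[str, Any]]:
    """Return (config, state_dict); config is None for a legacy checkpoint.

    A legacy checkpoint is assumed to be a raw state_dict, whose keys are
    parameter names and therefore never both CONFIG_KEY and STATE_KEY.
    """
    if CONFIG_KEY in checkpoint and STATE_KEY in checkpoint:
        return dict(checkpoint[CONFIG_KEY]), dict(checkpoint[STATE_KEY])
    return None, dict(checkpoint)
```

With PyTorch, the packed dictionary would be what gets passed to `torch.save`, and the object returned by `torch.load` would go through `unpack_checkpoint`; a `None` config signals that the settings have to be inferred instead.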

This should make checkpoint round-tripping reliable for non-default architectures while keeping existing saved models loadable.

For legacy checkpoints, architecture parameters are inferred from PyTorch recurrent weight naming conventions (e.g. weight_ih_l{k}, the _reverse suffix, and gate dimensionality). This is a best-effort heuristic.
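
The heuristic can be sketched roughly as follows. This is not the PR's code: `infer_recurrent_config` is a hypothetical helper that works on a mapping from parameter name to shape, and it assumes a single recurrent module with no LSTM projection (`proj_size`). It relies on the PyTorch convention that `weight_hh_l{k}` has shape `(gates * hidden_size, hidden_size)`, with 4 gates for an LSTM and 3 for a GRU.

```python
import re
from typing import Dict, Mapping, Tuple

# Matches e.g. "encoder.recurrent.weight_hh_l1" or "...weight_hh_l0_reverse".
_HH_KEY = re.compile(r"weight_hh_l(\d+)(_reverse)?$")


def infer_recurrent_config(
    shapes: Mapping[str, Tuple[int, ...]],
) -> Dict[str, object]:
    """Best-effort guess of recurrent settings from parameter names/shapes."""
    layers = set()
    bidirectional = False
    gates = None
    hidden_size = None

    for key, shape in shapes.items():
        match = _HH_KEY.search(key)
        if match is None:
            continue
        layers.add(int(match.group(1)))       # layer index k in weight_hh_l{k}
        if match.group(2):                    # "_reverse" marks the second direction
            bidirectional = True
        rows, cols = shape                    # (gates * hidden_size, hidden_size)
        hidden_size = cols
        gates = rows // cols

    if not layers:
        raise ValueError("no weight_hh_l{k} keys found; cannot infer settings")
    if gates not in (3, 4):
        raise ValueError(f"unexpected gate count {gates}; expected 3 (GRU) or 4 (LSTM)")

    return {
        "num_layers": max(layers) + 1,
        "bidirectional": bidirectional,
        "LSTM": gates == 4,
        "hidden_size": hidden_size,
    }
```

Given a real state_dict, the input mapping could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`. Raising on an unexpected gate count, rather than silently defaulting, is one way to keep a wrong guess from producing a confusing shape-mismatch error later.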

Save ContextBuilder architecture settings alongside the state_dict so non-default num_layers, bidirectional, and LSTM configurations can be restored. Keep loading older raw state_dict checkpoints by inferring constructor settings from stored tensor shapes and recurrent layer keys where possible.

harens wants to merge 1 commit into Thijsvanede:main from harens:fix/contextbuilder-load-metadata
