PickleStorage speed up by AbhinovKoutharapu · Pull Request #48 · nasa/SMCPy

AbhinovKoutharapu · 2026-06-09T17:54:41Z

Cached the phi_sequence and mut_ratio_sequence for faster load times.


Copilot

Pull request overview

This PR aims to speed up PickleStorage by caching phi_sequence and mut_ratio_sequence in memory, avoiding repeated full-file deserialization when those sequences are accessed.

Changes:

Added in-memory caches for phi_sequence and mut_ratio_sequence in PickleStorage.
Replaced the restart/length scan mechanism with _scan_new_records() and updated how record offsets/length are tracked.
Updated phi_sequence / mut_ratio_sequence properties to return cached values.


    def save_step(self, step):
        file = self._open_file(self._mode)
        self._mode = "ab"
        pickle.dump(step, file, pickle.HIGHEST_PROTOCOL)
        self._close(file)
+        self._phi_sequence.append(step.attrs["phi"])
+        self._mut_ratio_sequence.append(step.attrs["mutation_ratio"])


peleser-nasa · 2026-06-17T15:15:38Z

+            while True:
+                start = f.tell()
+                try:
+                    obj = pickle.load(f)


@AbhinovKoutharapu The current implementation makes sense to me, but I think we're still going to run into issues with having to load pickle objects, especially for large numbers of particles. The pickle.load time increases with the size of the particle objects. When we use this in PySIPS, those objects can be quite large.
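To make the concern concrete, here is a minimal sketch of the scan pattern from the diff above, run against an in-memory buffer. The function name `scan_offsets_with_load` and the dictionary used as a stand-in for a step object are assumptions for illustration, not the PR's actual code.

```python
import io
import pickle


def scan_offsets_with_load(f):
    """Return the start offset of every pickled record in f.

    Each record is fully deserialized only to learn where the next one
    begins, so the cost of the scan grows with the size of the records.
    """
    offsets = []
    while True:
        start = f.tell()
        try:
            pickle.load(f)  # builds the whole object, then discards it
        except EOFError:
            break  # clean end of file
        offsets.append(start)
    return offsets


buf = io.BytesIO()
for phi in (0.1, 0.5, 1.0):
    # Hypothetical stand-in for a step; the real one holds particle arrays.
    record = {"phi": phi, "particles": list(range(1000))}
    pickle.dump(record, buf, pickle.HIGHEST_PROTOCOL)

buf.seek(0)
offsets = scan_offsets_with_load(buf)
```

Once the offsets are known, a single record can be reached with `f.seek(offsets[i])` followed by `pickle.load(f)`, but building the list this way already paid for loading every record once.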

I think my preferred approach would be:

On write:

In save_step, save the metadata in a separate pickle.dump so that the file structure is interleaved: <metadata 0>, <particles 0>, <metadata 1>, <particles 1>, etc.:

    pickle.dump(meta_data_dict, file, pickle.HIGHEST_PROTOCOL)
    pickle.dump(step, file, pickle.HIGHEST_PROTOCOL)

On read:

Use pickletools.genops() to build the byte offset list without having to load any objects (fast)

Create different read behavior depending on whether the function requests the particles objects or whether it just wants metadata (e.g., mutation_ratio_sequence)

For the interleaved file structure, this means metadata would be the even indices [0, 2, 4, ...] and particle objects would be the odd indices [1, 3, 5, ...]. To be a bit more explicit, consider two separate lists for each (e.g., meta_byte_offsets and step_byte_offsets).

I think the end result is the ability to load the really small metadata objects lightning fast, even for simulations with very many particles and steps.
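A rough sketch of this proposal, under stated assumptions: the free functions `save_step`, `scan_offsets`, and `read_meta_sequence` are hypothetical stand-ins for PickleStorage methods, and a plain list stands in for the particles object. `pickletools.genops` walks the opcodes of one pickle and stops at its STOP opcode without constructing the object, so calling it repeatedly yields the start of each record. It still reads through every byte in pure Python, so whether it beats `pickle.load` for a given payload would need to be measured.

```python
import io
import pickle
import pickletools


def save_step(f, meta_data_dict, step):
    """Append one step as two pickles: small metadata first, then the payload."""
    pickle.dump(meta_data_dict, f, pickle.HIGHEST_PROTOCOL)
    pickle.dump(step, f, pickle.HIGHEST_PROTOCOL)


def scan_offsets(f):
    """Return (meta_byte_offsets, step_byte_offsets) without unpickling anything.

    Assumes the file holds complete, strictly interleaved records; a
    truncated final record would make genops raise ValueError.
    """
    f.seek(0, io.SEEK_END)
    end = f.tell()
    f.seek(0)
    offsets = []
    while f.tell() < end:
        offsets.append(f.tell())
        # Consume the opcodes of exactly one pickle, up to and including STOP.
        for _ in pickletools.genops(f):
            pass
    # Even records are metadata, odd records are particle objects.
    return offsets[0::2], offsets[1::2]


def read_meta_sequence(f, meta_byte_offsets, key):
    """Load only the metadata records and collect one attribute from each."""
    values = []
    for offset in meta_byte_offsets:
        f.seek(offset)
        values.append(pickle.load(f)[key])
    return values


buf = io.BytesIO()
for phi, ratio in [(0.1, 0.9), (0.5, 0.7), (1.0, 0.6)]:
    save_step(buf, {"phi": phi, "mutation_ratio": ratio}, list(range(1000)))

meta_byte_offsets, step_byte_offsets = scan_offsets(buf)
phi_sequence = read_meta_sequence(buf, meta_byte_offsets, "phi")
```

Keeping two offset lists means a request for `phi_sequence` or `mut_ratio_sequence` never touches the particle records, while a request for a specific step seeks directly to `step_byte_offsets[i]`.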


Abhinov Koutharapu and others added 2 commits June 8, 2026 15:36

PickleStorage speed up

b2fc5d1

Combined functionality of _rebuild_attributes and _init_length_on_restart into one

97f7723

peleser-nasa requested a review from Copilot June 11, 2026 15:15

Copilot started reviewing on behalf of peleser-nasa June 11, 2026 15:15 View session

Copilot AI reviewed Jun 11, 2026

View reviewed changes

AbhinovKoutharapu and others added 4 commits June 11, 2026 12:08

Check for pickle.UnpicklingError in _scan_new_records

2137dd5

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Fixed issue with duplicating cached sequences

19d9c1b

Merge branch 'pickle_speed_up' of https://github.com/AbhinovKoutharapu/SMCPy into pickle_speed_up

763b07d

Implemented interleaved file structure and used genops instead of pickle.load

5c35ae9

AbhinovKoutharapu wants to merge 6 commits into nasa:develop from AbhinovKoutharapu:pickle_speed_up


3 participants
