Skip to content

WIP: cache woodbury intermediate matrix and GLS noise blocks#1994

Open
dlakaplan wants to merge 12 commits into
nanograv:masterfrom
dlakaplan:cache_woodbury
Open

WIP: cache woodbury intermediate matrix and GLS noise blocks#1994
dlakaplan wants to merge 12 commits into
nanograv:masterfrom
dlakaplan:cache_woodbury

Conversation

@dlakaplan
Copy link
Copy Markdown
Contributor

Part of #1985

Cache some of the intermediate matrix calculations that don't change if the noise stays the same. I am not sure if this choice of key is appropriate.

@dlakaplan
Copy link
Copy Markdown
Contributor Author

I don't yet have a good test for this to demonstrate the speed. If anybody can suggest a good data-set which includes some noise fitting but is still finite in time, let me know. I am also still unsure if the key is constructed in the best way.

@dlakaplan dlakaplan changed the title WIP: cache woodbury intermediate matrix WIP: cache woodbury intermediate matrix and GLS noise blocks May 21, 2026
@dlakaplan
Copy link
Copy Markdown
Contributor Author

I think in the testing that Daniel did, he used @jeremy-baier's GP branch, so not clear if this can be properly tested without that. But any feedback would be welcome.

@jeremy-baier
Copy link
Copy Markdown
Contributor

jeremy-baier commented May 21, 2026

I don't yet have a good test for this to demonstrate the speed. If anybody can suggest a good data-set which includes some noise fitting but is still finite in time, let me know. I am also still unsure if the key is constructed in the best way.

I think that these should work fine for testing ( in /test/datafiles/ )! It just contains the achromatic red noise basis.
"B1855+09_NANOGrav_9yv1.gls.par"
"B1855+09_NANOGrav_9yv1.tim”

The speed ups with the above alone should not be anywhere near as dramatic as with a larger noise basis but I would imagine still apparent.
You could additionally add these noise bases to that par/tim but this test will run a bit slower:

def add_DM_noise_to_model(model):
    all_components = Component.component_types
    model.add_component(all_components["PLDMNoise"](), validate=False)
    model["TNDMAMP"].quantity = -13
    model["TNDMGAM"].quantity = 1.2
    model["TNDMC"].value = 30
    model["TNDMFLOG"].value = 4
    model["TNDMFLOG_FACTOR"].value = 2
    model.validate()


def add_SW_noise_to_model(model):
    all_components = Component.component_types
    model.add_component(all_components["PLSWNoise"](), validate=False)
    model["TNSWAMP"].quantity = -12
    model["TNSWGAM"].quantity = -2.0  # blue spectrum
    model["TNSWC"].value = 50
    model["TNSWFLOG"].value = 4
    model["TNSWFLOG_FACTOR"].value = 2
    model.validate()


def add_chrom_noise_to_model(model):
    all_components = Component.component_types
    model.add_component(all_components["PLChromNoise"](), validate=False)
    model["TNCHROMAMP"].quantity = -14
    model["TNCHROMGAM"].quantity = 1.2
    model["TNCHROMC"].value = 30
    model["TNCHROMFLOG"].value = 4
    model["TNCHROMFLOG_FACTOR"].value = 2

    model.add_component(all_components["ChromaticCM"](), validate=False)
    model["TNCHROMIDX"].value = 4

    model.validate()

@jeremy-baier
Copy link
Copy Markdown
Contributor

I think in the testing that Daniel did, he used @jeremy-baier's GP branch, so not clear if this can be properly tested without that. But any feedback would be welcome.

My PR #1966 will certainly run into some conflicts with this, but I should be able to resolve them and all of the speed ups should carry over. Making a note for myself, that I will need to look into the phi matrix caching because in #1966 , I am making the phi matrix optionally non-diagonal.

@dlakaplan
Copy link
Copy Markdown
Contributor Author

I don't yet have a good test for this to demonstrate the speed. If anybody can suggest a good data-set which includes some noise fitting but is still finite in time, let me know. I am also still unsure if the key is constructed in the best way.

I think that these should work fine for testing ( in /test/datafiles/ )! It just contains the achromatic red noise basis. "B1855+09_NANOGrav_9yv1.gls.par" "B1855+09_NANOGrav_9yv1.tim”

Those are what I was using, but I don't think they have any of the noise parameters as free, so they won't change. therefore it will result in different caching behavior. I could just free them, but I don't know if there is something better.

@dlakaplan
Copy link
Copy Markdown
Contributor Author

@jeremy-baier : do you know what cache key was used for those tests? I tried one, and I think it works OK, but it might be too conservative. I am not finding any caching of the GLS sub matrix, only on the Woodbury side.

@jeremy-baier
Copy link
Copy Markdown
Contributor

@jeremy-baier : do you know what cache key was used for those tests? I tried one, and I think it works OK, but it might be too conservative. I am not finding any caching of the GLS sub matrix, only on the Woodbury side.

I don’t know what cache key Daniel (@DanielJOliver )was using in his Claude testing of this. When I tried some caching earlier, I used the toas and frequencies I think but did not see the speed ups that I thought. ( Though I think I was doing this wrong. )

@dlakaplan
Copy link
Copy Markdown
Contributor Author

@jeremy-baier : do you know what cache key was used for those tests? I tried one, and I think it works OK, but it might be too conservative. I am not finding any caching of the GLS sub matrix, only on the Woodbury side.

I don’t know what cache key Daniel (@DanielJOliver )was using in his Claude testing of this. When I tried some caching earlier, I used the toas and frequencies I think but did not see the speed ups that I thought. ( Though I think I was doing this wrong. )

OK, I don't think that is a proper cache. The matrices depend on the noise properties, right? So they should change if the noise values change, which is what I did.

@DanielJOliver
Copy link
Copy Markdown

Agreed that the matrices do depend on the noise values, so the key has to include them. In my speedups with Claude the key I used was:

def _noise_designmatrix_cache_key(self, toas):
    noise_params = self.get_params_of_component_type("NoiseComponent")
    noise_state = []
    for pname in noise_params:
        par = getattr(self, pname)
        noise_state.append((pname,
                            repr(par.quantity),
                            repr(par.key),
                            repr(par.key_value)))
    return (id(toas), len(toas), tuple(noise_state))

So it keys on the noise parameter state (value plus key/key_value, which also covers per-flag noise like per-backend EFAC/EQUAD), not just the TOAs. The same key was used for the Woodbury intermediate matrix (_calc_gls_chi2) and the GLS sub-matrix / mtcm noise-noise block in step()

On the GLS sub-matrix not caching, in my testing it wasn't the key, it was where the cache lived. The Woodbury cache is module-level, but the mtcm cache as it's written here is an instance attribute (self._mtcm_cache = {} in GLSState.__init__), and GLSState.take_step() constructs a fresh GLSState on every downhill iteration so the _mtcm_cache starts empty each iteration and never hits across iterations. The fix was making the noise-noise block cache module-level (like _woodbury_cache) specifically so it survived the per-iteration state reconstruction (and the deep-copies in take_step_model). That's probably the one change that would get the GLS sub-matrix caching to work.

One small thing for down the line, id(toas) keys on the object's identity, which is good within a single fit, but Python can reuse an address once the old object is freed, so in principle a new TOAs object could collide with the id of an earlier one and produce a false cache hit. The len(toas) + noise tuple make that unlikely and it never came up for me, but it is possible.

Happy to share the exact diff for the module-level mtcm cache if it would be useful.

@dlakaplan
Copy link
Copy Markdown
Contributor Author

Agreed that the matrices do depend on the noise values, so the key has to include them. In my speedups with Claude the key I used was:

def _noise_designmatrix_cache_key(self, toas):
    noise_params = self.get_params_of_component_type("NoiseComponent")
    noise_state = []
    for pname in noise_params:
        par = getattr(self, pname)
        noise_state.append((pname,
                            repr(par.quantity),
                            repr(par.key),
                            repr(par.key_value)))
    return (id(toas), len(toas), tuple(noise_state))

So it keys on the noise parameter state (value plus key/key_value, which also covers per-flag noise like per-backend EFAC/EQUAD), not just the TOAs. The same key was used for the Woodbury intermediate matrix (_calc_gls_chi2) and the GLS sub-matrix / mtcm noise-noise block in step()

OK, I think this is much more like what I had written (without looking at your Claude code):

https://github.com/nanograv/PINT/pull/1994/changes#diff-7cbfb3877d78c1fbb69c8d3e37bf61c30f2bf40de4934ad13675f1064686b003

  def _noise_tuple(self) -> Tuple:
        out = []
        for cp in self.NoiseComponent_list:
            out.append((cp.category,) + tuple([self[p].value for p in cp.params]))
        return tuple(out)

    def _noise_designmatrix_cache_key(self, toas: TOAs) -> Tuple:
        return (id(toas), len(toas), self._noise_tuple())

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants