Update to stitching procedure by mattgebert · Pull Request #44 · xraysoftmat/kkcalc

mattgebert · 2026-04-22T06:31:28Z

Closes #4 by adding a new stiching method prepost_fit. This makes the joining and scaling of the atomic scattering factors data much more aligned to the atomic scattering factor background.

Some other minor changes too

… groups

Add github release source for GUI executables Modify API page title

…o create an extended db object

Major update to correctly (and smoothly) stitch optical data into the database data

make formula mass and stoichiometry more independent. fix appropriate tests

…ges to the polynomials class, some small errors. Changed representation to now specify the size of the object domain, rather than displaying (unhelpfully) rows of the asp coefficients

Copilot

Pull request overview

Adds a new distortion-correction (“stitching”) method (prepost_fit) intended to improve scaling/merging of weak spectra against the ASF database background (per Issue #4), along with some API tightening, display tweaks, docs updates, and CI adjustments.

Changes:

Extend database scaling/extension logic with fix_distortions_method including new prepost_fit pathway.
Tighten atomic_scattering consistency rules (warnings → ValueError) and adjust tests accordingly.
Misc updates: string representations, docs links/text, and Coveralls parallelization.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 15 comments.

Show a summary per file

File	Description
`kkcalc2/models/db_models.py`	Introduces `prepost_fit` and new parameters for distortion-fixing during scaling/extension; expands typing/overloads.
`kkcalc2/models/common.py`	Converts warnings to `ValueError` for conflicting inputs; adds attempted stoichiometry normalization.
`tests/models/test_common.py`	Updates tests to assert `ValueError` instead of warnings; adds formula-mass immutability check.
`kkcalc2/models/polynomials.py`	Adjusts `__str__/__repr__`, tweaks kwargs typing/stoichiometry propagation, adds an energy-domain assertion.
`kkcalc2/models/factors.py`	Simplifies `__str__/__repr__`; updates `scale_to_database` to use keyword args.
`kkcalc2/conversions.py`	Changes `ASP_to_ASF` return shape via `squeeze()`; suppresses a warning.
`kkcalc2/__main__.py`	Fixes/cleans optional PyQt import handling and internal imports.
`pyproject.toml`	Adds `pip >= 25.1` to build-system requires; adjusts doc-lint ignores.
`docs/source/install.rst`	Adds GitHub releases info and improves a pip link.
`docs/source/contributing.rst`	Fixes issue URL but leaves spelling/capitalization issues.
`docs/source/tutorials/index.rst`	Adds an “Introduction” stub.
`docs/source/api.rst`	Renames API page title to KKCalc2.
`docs/index.rst`	Fixes repo links but leaves “Github” capitalization.
`.github/workflows/tests.yml`	Enables Coveralls parallel uploads and adds a “finish” job.
`.github/workflows/release.yml`	Runs artifact re-zip step unconditionally.

Comments suppressed due to low confidence (1)

kkcalc2/models/db_models.py:898

These branches are typed to accept Sequence[...], but runtime checks only handle list (e.g. isinstance(data_asf, list) / isinstance(fix_distortions, list)). Passing a tuple or other Sequence will follow the non-sequence path and behave incorrectly. Either constrain the types back to list[...] or update the runtime checks to handle general sequences (while excluding str/bytes).

    ) -> None:  # numpydoc ignore=GL08
        # Check if data_asf is a list
        if isinstance(data_asf, list):
            # Check if merge_domain is a list
            if isinstance(merge_domain, list):
                # Check if the lengths match
                if len(data_asf) != len(merge_domain):
                    raise ValueError("Length of data_asf and merge_domain must match")

                # Check each merge domain doesn't overlap

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-22T06:39:06Z

        scaled_data_y = (data_y - data_merge_range[0]) * scale + db_asp_merge_range[0]

        # Energy values within the merge domain to use for fitting
        energies = data_e[
            data_merge_lb_idx:db_merge_ub_end
        ]  # essential to only use domain data to perform fit.

        if fix_distortions:
-            # Perform a fit along the domain
-            db_y = asp.eval_asf_on_coefs(
-                target_energies=energies,
-                energies=db_e,
-                coefs=db_coefs,
-            )  # Find equivalent values of the db_asp energies to the data energies
-            guess_grad = (
-                -(data_merge_range[1] - data_merge_range[0])
-                / (db_asp_merge_range[1] - db_asp_merge_range[0])
-                / data_y[-1]
-            )
-            fit_func = asp_db_extended.grad_min
            fit_y = scaled_data_y[
                data_merge_lb_idx:db_merge_ub_end
            ]  # essential to only use domain data to perform fit.

-            (grad,), _ = opt.leastsq(
-                func=fit_func,
-                x0=guess_grad,
-                args=(energies, fit_y, db_asp_merge_range, db_y),
-                # Minimizes difference to the db_y values.
-            )
-            # Reassign the scaled data, but use full energy domain to generate values.
-            scaled_data_y = db_asp_merge_range[0] + asp_db_extended.grad_min(
-                grad,
-                data_e,
-                scaled_data_y,
-                db_asp_merge_range,
-                db_y=0,  # No minimization here.
-                # Adjust merge indexes, so we can scale whole dataset.
-                idx0=data_merge_lb_idx,
-                idx1=data_merge_ub_idx,
-            )
+            match fix_distortions_method:
+                case "grad_min":
+                    # Perform a fit along the domain
+                    db_y = asp.eval_asf_on_coefs(
+                        target_energies=energies,
+                        energies=db_e,
+                        coefs=db_coefs,
+                    )  # Find equivalent values of the db_asp energies to the data energies
+                    guess_grad = (
+                        -(data_merge_range[1] - data_merge_range[0])
+                        / (db_asp_merge_range[1] - db_asp_merge_range[0])
+                        / data_y[-1]
+                    )
+                    fit_func = asp_db_extended.grad_min

+                    (grad,), _ = opt.leastsq(
+                        func=fit_func,
+                        x0=guess_grad,
+                        args=(energies, fit_y, db_asp_merge_range, db_y),
+                    )
+                    # Reassign the scaled data
+                    scaled_data_y = db_asp_merge_range[0] + asp_db_extended.grad_min(
+                        grad,
+                        energies,
+                        fit_y,
+                        db_asp_merge_range,
+                        db_y=0,
+                        idx0=0,
+                        idx1=-1,
+                    )


asp_db_abstract.scale_data() overwrites scaled_data_y with a distortion-corrected array computed only on the merge-domain slice (fit_y/energies). That changes the return length from len(data_y) to len(energies) when fix_distortions=True, which breaks callers like asf.scale_to_database() that expect the full-length scaled dataset. Apply distortion corrections to the full scaled_data_y (using the merge-domain indices for fitting), or clearly change the API to return a truncated dataset and update all callers + docstrings accordingly.

Copilot · 2026-04-22T06:39:07Z

+                    # Apply the linear correction to the full domain
+                    intermediate_y = fit_y - (
+                        pre_m * (energies - pre_x[0])
+                    )  # Correct the gradient


In the new prepost_fit branch, np.polyfit returns (pre_m, pre_c) and (post_m, post_c) but the intercept terms are never used in the corrections. As written, the method can only adjust slope, not offset, despite explicitly fitting the offset; this looks like an implementation error and will under-correct pre/post-edge mismatches. Incorporate the intercept terms into the applied correction (or remove them from the fit if intentionally unused).

Suggested change

# Apply the linear correction to the full domain

intermediate_y = fit_y - (

pre_m * (energies - pre_x[0])

) # Correct the gradient

# Apply the full linear correction (slope and offset) to the full domain

intermediate_y = fit_y - (

pre_m * energies + pre_c

) # Correct the fitted mismatch

Copilot · 2026-04-22T06:39:07Z

+                    correction = (
+                        np.arange(N)
+                        / N  # Ramping factor to slowly apply the correction over the domain


The ramp factor for applying the post-edge correction uses np.arange(N) / N, which never reaches 1.0 (it ends at (N-1)/N). If the intent is for the correction to be fully applied at the endpoint, use a normalization that reaches 1.0 (e.g. divide by N-1 when N>1).

Suggested change

correction = (

np.arange(N)

/ N # Ramping factor to slowly apply the correction over the domain

ramp = (

np.arange(N) / (N - 1)

if N > 1

else np.zeros(N, dtype=float)

)

correction = (

ramp

Copilot · 2026-04-22T06:39:07Z

+        assert re.energies.shape == im.energies.shape and np.all(
+            re.energies == im.energies
+        ), (
+            "Real and imaginary parts must have the same energy domains after truncation and extension."
+        )


Using assert for this validation is unsafe in library code because assertions can be disabled with Python optimizations (-O), which would skip the check and allow inconsistent energy domains to propagate. Replace this with an explicit conditional and raise ValueError (or a domain-specific exception) when energies still differ after truncation/extension.

Suggested change

assert re.energies.shape == im.energies.shape and np.all(

re.energies == im.energies

), (

"Real and imaginary parts must have the same energy domains after truncation and extension."

)

if re.energies.shape != im.energies.shape or not np.all(

re.energies == im.energies

):

raise ValueError(

"Real and imaginary parts must have the same energy domains after truncation and extension."

)

Copilot · 2026-04-22T06:39:08Z

+        #         kwargs["max_rows"] = 6
+        #     return self.dataframe().to_string(**kwargs)
+        # else:
+        return f"<{self.__class__.__name__} ({len(self.energies)} points, {self.energies[0]:0.2f} eV to {self.energies[-1]:0.2f} eV)>"


__str__ now indexes self.energies[0] and self.energies[-1], which will raise IndexError for empty datasets (which can be created via slicing, e.g. obj[:0]). Guard against len(self.energies) == 0 and return a sensible empty representation in that case.

Suggested change

return f"<{self.__class__.__name__} ({len(self.energies)} points, {self.energies[0]:0.2f} eV to {self.energies[-1]:0.2f} eV)>"

n_points = len(self.energies)

if n_points == 0:

return f"<{self.__class__.__name__} (0 points, empty)>"

return f"<{self.__class__.__name__} ({n_points} points, {self.energies[0]:0.2f} eV to {self.energies[-1]:0.2f} eV)>"

Copilot · 2026-04-22T06:39:09Z

+Want to contribute to developement? That's fantastic! Have a discussion with us first via `issues <https://github.com/xraysoftmat/kkcalc/issues>`_ to avoid wasted effort, and so we can make our visions align.

 To install for development, create your own `fork/branch <https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/fork-a-repo>`_ of ``kkcalc2`` on ``github`` so you can save and contribute your own changes, and clone that to your local file system using ``Github Desktop`` or ``Git`` via Windows Terminal / CMD.


Spelling/capitalization: "developement" -> "development", and use consistent "GitHub" capitalization (including "GitHub Desktop" and "Git" / "github").

Suggested change

Want to contribute to developement? That's fantastic! Have a discussion with us first via `issues <https://github.com/xraysoftmat/kkcalc/issues>`_ to avoid wasted effort, and so we can make our visions align.

To install for development, create your own `fork/branch <https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/fork-a-repo>`_ of ``kkcalc2`` on ``github`` so you can save and contribute your own changes, and clone that to your local file system using ``Github Desktop`` or ``Git`` via Windows Terminal / CMD.

Want to contribute to development? That's fantastic! Have a discussion with us first via `issues <https://github.com/xraysoftmat/kkcalc/issues>`_ to avoid wasted effort, and so we can make our visions align.

To install for development, create your own `fork/branch <https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/fork-a-repo>`_ of ``kkcalc2`` on ``GitHub`` so you can save and contribute your own changes, and clone that to your local file system using ``GitHub Desktop`` or ``Git`` via Windows Terminal / CMD.

Copilot · 2026-04-22T06:39:09Z

+- Github Releases: https://github.com/xraysoftmat/kkcalc/releases
+
+Note that the Github releases also contains compiled binaries for Windows and MacOS, which can be used without installing Python or any dependencies.


Spelling/capitalization/grammar: use "GitHub" (not "Github"), and "releases also contain" (not "contains"). These strings are user-facing docs and should match standard branding/grammar.

Suggested change

- Github Releases: https://github.com/xraysoftmat/kkcalc/releases

Note that the Github releases also contains compiled binaries for Windows and MacOS, which can be used without installing Python or any dependencies.

- GitHub Releases: https://github.com/xraysoftmat/kkcalc/releases

Note that the GitHub releases also contain compiled binaries for Windows and MacOS, which can be used without installing Python or any dependencies.

Copilot · 2026-04-22T06:39:10Z

+Please raise any `issues <https://github.com/xraysoftmat/kkcalc/issues>`_ here.

- Development (Github): https://github.com/xraysoftmat/kkcalc-xraysoftmat/
+- Development (Github): https://github.com/xraysoftmat/kkcalc/


Spelling/capitalization: "GitHub" should be capitalized consistently ("Development (GitHub)").

Suggested change

- Development (Github): https://github.com/xraysoftmat/kkcalc/

- Development (GitHub): https://github.com/xraysoftmat/kkcalc/

Copilot · 2026-04-22T06:39:10Z

-                "Setting a formula mass will not be internally used when a `stoichiometry` has been assigned.",
-                UserWarning,
+            raise ValueError(
+                "Formula mass is immutable when a stoichiometry is defined, as it is derived from `stoichiometry`"


The ValueError message here is missing whitespace/punctuation between sentences because two string literals are concatenated without a separating space. This makes the error hard to read ("...stoichiometryPlease..."). Add a space/newline or combine into a single formatted string.

Suggested change

"Formula mass is immutable when a stoichiometry is defined, as it is derived from `stoichiometry`"

"Formula mass is immutable when a stoichiometry is defined, as it is derived from `stoichiometry` "

Copilot · 2026-04-22T06:39:10Z

+To do this,
+
+
+



The new Tutorials "Introduction" section currently ends with an incomplete sentence ("To do this,") followed by blank lines. This will render oddly and reads like unfinished documentation; either complete the introductory paragraph or remove the dangling clause/extra blank lines.

Suggested change

To do this,

The tutorials in this section explain how to use these features.

previously, asp.extend_energies accidentially truncated the final energy value incorrectly

added properties for `betas`, `refractive`, `deltas` and `refractive_index`, which return the existing `eval_<name>` using the internal energies rather than requiring specification of the values.

mattgebert and others added 13 commits March 18, 2026 23:43

build(pyproject.toml): add pip requirement to install with dependency…

4bd8d4e

… groups

docs(index.rst, contributing.rst): fix broken links to github

8d4f360

docs(install.rst, api.rst): minor doc fixes

164b077

Add github release source for GUI executables Modify API page title

ci(tests.yml): enable coveralls parallel builds

d432d93

ci(release.yml): remove doc-zip dependency on gui

c468082

Merge branch 'xraysoftmat:v2' into v2

c900cfa

wip

4491a7f

fix(asp_db_im_extended): added shortcut method to input nexafs data t…

f05cfd6

…o create an extended db object

refactor(__main__.py): adjust import positions

3d66870

Merge branch 'v2' of https://github.com/mattgebert/kkcalc into v2

89eaf7e

feat(db_models.py): add new datascaling method prepost_fit

180fb3d

Major update to correctly (and smoothly) stitch optical data into the database data

fix(common.py): remove variable propogation errors

d844205

make formula mass and stoichiometry more independent. fix appropriate tests

refactor(modify-__repr__-for-asp-and-other-minor-changes): Small chan…

ee0b090

…ges to the polynomials class, some small errors. Changed representation to now specify the size of the object domain, rather than displaying (unhelpfully) rows of the asp coefficients

Copilot AI review requested due to automatic review settings April 22, 2026 06:31

Copilot started reviewing on behalf of mattgebert April 22, 2026 06:31 View session

Copilot AI reviewed Apr 22, 2026

View reviewed changes

mattgebert added 2 commits April 23, 2026 08:57

fix(asp): fix asp.extend_energies bug

d566225

previously, asp.extend_energies accidentially truncated the final energy value incorrectly

fix(polynomials): add properties for refractive index values to ASP

3319e73

added properties for `betas`, `refractive`, `deltas` and `refractive_index`, which return the existing `eval_<name>` using the internal energies rather than requiring specification of the values.

mattgebert force-pushed the v2 branch from 907c058 to 3319e73 Compare April 23, 2026 02:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Update to stitching procedure#44

Update to stitching procedure#44
mattgebert wants to merge 15 commits into
xraysoftmat:v2from
mattgebert:v2

mattgebert commented Apr 22, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 22, 2026

Uh oh!

Copilot AI Apr 22, 2026

Uh oh!

Copilot AI Apr 22, 2026

Uh oh!

Copilot AI Apr 22, 2026

Uh oh!

Copilot AI Apr 22, 2026

Uh oh!

Copilot AI Apr 22, 2026

Uh oh!

Copilot AI Apr 22, 2026

Uh oh!

Copilot AI Apr 22, 2026

Uh oh!

Copilot AI Apr 22, 2026

Uh oh!

Copilot AI Apr 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

-        return f"<{self.__class__.__name__} ({len(self.energies)} points, {self.energies[0]:0.2f} eV to {self.energies[-1]:0.2f} eV)>"
+        n_points = len(self.energies)
+        if n_points == 0:
+            return f"<{self.__class__.__name__} (0 points, empty)>"
+        return f"<{self.__class__.__name__} ({n_points} points, {self.energies[0]:0.2f} eV to {self.energies[-1]:0.2f} eV)>"

		Want to contribute to developement? That's fantastic! Have a discussion with us first via `issues <https://github.com/xraysoftmat/kkcalc/issues>`_ to avoid wasted effort, and so we can make our visions align.

		To install for development, create your own `fork/branch <https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/fork-a-repo>`_ of ``kkcalc2`` on ``github`` so you can save and contribute your own changes, and clone that to your local file system using ``Github Desktop`` or ``Git`` via Windows Terminal / CMD.

		- Github Releases: https://github.com/xraysoftmat/kkcalc/releases

		Note that the Github releases also contains compiled binaries for Windows and MacOS, which can be used without installing Python or any dependencies.

	- Development (Github): https://github.com/xraysoftmat/kkcalc/
	- Development (GitHub): https://github.com/xraysoftmat/kkcalc/

	"Formula mass is immutable when a stoichiometry is defined, as it is derived from `stoichiometry`"
	"Formula mass is immutable when a stoichiometry is defined, as it is derived from `stoichiometry` "

	To do this,
	The tutorials in this section explain how to use these features.

Uh oh!

Conversation

mattgebert commented Apr 22, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants