Skip to content

Update to stitching procedure#44

Open
mattgebert wants to merge 15 commits into
xraysoftmat:v2from
mattgebert:v2
Open

Update to stitching procedure#44
mattgebert wants to merge 15 commits into
xraysoftmat:v2from
mattgebert:v2

Conversation

@mattgebert

Copy link
Copy Markdown
Member

Closes #4 by adding a new stiching method prepost_fit. This makes the joining and scaling of the atomic scattering factors data much more aligned to the atomic scattering factor background.

Some other minor changes too

Copilot AI review requested due to automatic review settings April 22, 2026 06:31

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new distortion-correction (“stitching”) method (prepost_fit) intended to improve scaling/merging of weak spectra against the ASF database background (per Issue #4), along with some API tightening, display tweaks, docs updates, and CI adjustments.

Changes:

  • Extend database scaling/extension logic with fix_distortions_method including new prepost_fit pathway.
  • Tighten atomic_scattering consistency rules (warnings → ValueError) and adjust tests accordingly.
  • Misc updates: string representations, docs links/text, and Coveralls parallelization.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 15 comments.

Show a summary per file
File Description
kkcalc2/models/db_models.py Introduces prepost_fit and new parameters for distortion-fixing during scaling/extension; expands typing/overloads.
kkcalc2/models/common.py Converts warnings to ValueError for conflicting inputs; adds attempted stoichiometry normalization.
tests/models/test_common.py Updates tests to assert ValueError instead of warnings; adds formula-mass immutability check.
kkcalc2/models/polynomials.py Adjusts __str__/__repr__, tweaks kwargs typing/stoichiometry propagation, adds an energy-domain assertion.
kkcalc2/models/factors.py Simplifies __str__/__repr__; updates scale_to_database to use keyword args.
kkcalc2/conversions.py Changes ASP_to_ASF return shape via squeeze(); suppresses a warning.
kkcalc2/__main__.py Fixes/cleans optional PyQt import handling and internal imports.
pyproject.toml Adds pip >= 25.1 to build-system requires; adjusts doc-lint ignores.
docs/source/install.rst Adds GitHub releases info and improves a pip link.
docs/source/contributing.rst Fixes issue URL but leaves spelling/capitalization issues.
docs/source/tutorials/index.rst Adds an “Introduction” stub.
docs/source/api.rst Renames API page title to KKCalc2.
docs/index.rst Fixes repo links but leaves “Github” capitalization.
.github/workflows/tests.yml Enables Coveralls parallel uploads and adds a “finish” job.
.github/workflows/release.yml Runs artifact re-zip step unconditionally.
Comments suppressed due to low confidence (1)

kkcalc2/models/db_models.py:898

  • These branches are typed to accept Sequence[...], but runtime checks only handle list (e.g. isinstance(data_asf, list) / isinstance(fix_distortions, list)). Passing a tuple or other Sequence will follow the non-sequence path and behave incorrectly. Either constrain the types back to list[...] or update the runtime checks to handle general sequences (while excluding str/bytes).
    ) -> None:  # numpydoc ignore=GL08
        # Check if data_asf is a list
        if isinstance(data_asf, list):
            # Check if merge_domain is a list
            if isinstance(merge_domain, list):
                # Check if the lengths match
                if len(data_asf) != len(merge_domain):
                    raise ValueError("Length of data_asf and merge_domain must match")

                # Check each merge domain doesn't overlap

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 258 to +299
scaled_data_y = (data_y - data_merge_range[0]) * scale + db_asp_merge_range[0]

# Energy values within the merge domain to use for fitting
energies = data_e[
data_merge_lb_idx:db_merge_ub_end
] # essential to only use domain data to perform fit.

if fix_distortions:
# Perform a fit along the domain
db_y = asp.eval_asf_on_coefs(
target_energies=energies,
energies=db_e,
coefs=db_coefs,
) # Find equivalent values of the db_asp energies to the data energies
guess_grad = (
-(data_merge_range[1] - data_merge_range[0])
/ (db_asp_merge_range[1] - db_asp_merge_range[0])
/ data_y[-1]
)
fit_func = asp_db_extended.grad_min
fit_y = scaled_data_y[
data_merge_lb_idx:db_merge_ub_end
] # essential to only use domain data to perform fit.

(grad,), _ = opt.leastsq(
func=fit_func,
x0=guess_grad,
args=(energies, fit_y, db_asp_merge_range, db_y),
# Minimizes difference to the db_y values.
)
# Reassign the scaled data, but use full energy domain to generate values.
scaled_data_y = db_asp_merge_range[0] + asp_db_extended.grad_min(
grad,
data_e,
scaled_data_y,
db_asp_merge_range,
db_y=0, # No minimization here.
# Adjust merge indexes, so we can scale whole dataset.
idx0=data_merge_lb_idx,
idx1=data_merge_ub_idx,
)
match fix_distortions_method:
case "grad_min":
# Perform a fit along the domain
db_y = asp.eval_asf_on_coefs(
target_energies=energies,
energies=db_e,
coefs=db_coefs,
) # Find equivalent values of the db_asp energies to the data energies
guess_grad = (
-(data_merge_range[1] - data_merge_range[0])
/ (db_asp_merge_range[1] - db_asp_merge_range[0])
/ data_y[-1]
)
fit_func = asp_db_extended.grad_min

(grad,), _ = opt.leastsq(
func=fit_func,
x0=guess_grad,
args=(energies, fit_y, db_asp_merge_range, db_y),
)
# Reassign the scaled data
scaled_data_y = db_asp_merge_range[0] + asp_db_extended.grad_min(
grad,
energies,
fit_y,
db_asp_merge_range,
db_y=0,
idx0=0,
idx1=-1,
)

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

asp_db_abstract.scale_data() overwrites scaled_data_y with a distortion-corrected array computed only on the merge-domain slice (fit_y/energies). That changes the return length from len(data_y) to len(energies) when fix_distortions=True, which breaks callers like asf.scale_to_database() that expect the full-length scaled dataset. Apply distortion corrections to the full scaled_data_y (using the merge-domain indices for fitting), or clearly change the API to return a truncated dataset and update all callers + docstrings accordingly.

Copilot uses AI. Check for mistakes.
Comment on lines +330 to +333
# Apply the linear correction to the full domain
intermediate_y = fit_y - (
pre_m * (energies - pre_x[0])
) # Correct the gradient

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the new prepost_fit branch, np.polyfit returns (pre_m, pre_c) and (post_m, post_c) but the intercept terms are never used in the corrections. As written, the method can only adjust slope, not offset, despite explicitly fitting the offset; this looks like an implementation error and will under-correct pre/post-edge mismatches. Incorporate the intercept terms into the applied correction (or remove them from the fit if intentionally unused).

Suggested change
# Apply the linear correction to the full domain
intermediate_y = fit_y - (
pre_m * (energies - pre_x[0])
) # Correct the gradient
# Apply the full linear correction (slope and offset) to the full domain
intermediate_y = fit_y - (
pre_m * energies + pre_c
) # Correct the fitted mismatch

Copilot uses AI. Check for mistakes.
Comment on lines +356 to +358
correction = (
np.arange(N)
/ N # Ramping factor to slowly apply the correction over the domain

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ramp factor for applying the post-edge correction uses np.arange(N) / N, which never reaches 1.0 (it ends at (N-1)/N). If the intent is for the correction to be fully applied at the endpoint, use a normalization that reaches 1.0 (e.g. divide by N-1 when N>1).

Suggested change
correction = (
np.arange(N)
/ N # Ramping factor to slowly apply the correction over the domain
ramp = (
np.arange(N) / (N - 1)
if N > 1
else np.zeros(N, dtype=float)
)
correction = (
ramp

Copilot uses AI. Check for mistakes.
Comment on lines +2025 to +2029
assert re.energies.shape == im.energies.shape and np.all(
re.energies == im.energies
), (
"Real and imaginary parts must have the same energy domains after truncation and extension."
)

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using assert for this validation is unsafe in library code because assertions can be disabled with Python optimizations (-O), which would skip the check and allow inconsistent energy domains to propagate. Replace this with an explicit conditional and raise ValueError (or a domain-specific exception) when energies still differ after truncation/extension.

Suggested change
assert re.energies.shape == im.energies.shape and np.all(
re.energies == im.energies
), (
"Real and imaginary parts must have the same energy domains after truncation and extension."
)
if re.energies.shape != im.energies.shape or not np.all(
re.energies == im.energies
):
raise ValueError(
"Real and imaginary parts must have the same energy domains after truncation and extension."
)

Copilot uses AI. Check for mistakes.
Comment thread kkcalc2/models/factors.py
# kwargs["max_rows"] = 6
# return self.dataframe().to_string(**kwargs)
# else:
return f"<{self.__class__.__name__} ({len(self.energies)} points, {self.energies[0]:0.2f} eV to {self.energies[-1]:0.2f} eV)>"

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__str__ now indexes self.energies[0] and self.energies[-1], which will raise IndexError for empty datasets (which can be created via slicing, e.g. obj[:0]). Guard against len(self.energies) == 0 and return a sensible empty representation in that case.

Suggested change
return f"<{self.__class__.__name__} ({len(self.energies)} points, {self.energies[0]:0.2f} eV to {self.energies[-1]:0.2f} eV)>"
n_points = len(self.energies)
if n_points == 0:
return f"<{self.__class__.__name__} (0 points, empty)>"
return f"<{self.__class__.__name__} ({n_points} points, {self.energies[0]:0.2f} eV to {self.energies[-1]:0.2f} eV)>"

Copilot uses AI. Check for mistakes.
Comment on lines +5 to 7
Want to contribute to developement? That's fantastic! Have a discussion with us first via `issues <https://github.com/xraysoftmat/kkcalc/issues>`_ to avoid wasted effort, and so we can make our visions align.

To install for development, create your own `fork/branch <https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/fork-a-repo>`_ of ``kkcalc2`` on ``github`` so you can save and contribute your own changes, and clone that to your local file system using ``Github Desktop`` or ``Git`` via Windows Terminal / CMD.

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling/capitalization: "developement" -> "development", and use consistent "GitHub" capitalization (including "GitHub Desktop" and "Git" / "github").

Suggested change
Want to contribute to developement? That's fantastic! Have a discussion with us first via `issues <https://github.com/xraysoftmat/kkcalc/issues>`_ to avoid wasted effort, and so we can make our visions align.
To install for development, create your own `fork/branch <https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/fork-a-repo>`_ of ``kkcalc2`` on ``github`` so you can save and contribute your own changes, and clone that to your local file system using ``Github Desktop`` or ``Git`` via Windows Terminal / CMD.
Want to contribute to development? That's fantastic! Have a discussion with us first via `issues <https://github.com/xraysoftmat/kkcalc/issues>`_ to avoid wasted effort, and so we can make our visions align.
To install for development, create your own `fork/branch <https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/fork-a-repo>`_ of ``kkcalc2`` on ``GitHub`` so you can save and contribute your own changes, and clone that to your local file system using ``GitHub Desktop`` or ``Git`` via Windows Terminal / CMD.

Copilot uses AI. Check for mistakes.
Comment thread docs/source/install.rst
Comment on lines +13 to +15
- Github Releases: https://github.com/xraysoftmat/kkcalc/releases

Note that the Github releases also contains compiled binaries for Windows and MacOS, which can be used without installing Python or any dependencies.

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling/capitalization/grammar: use "GitHub" (not "Github"), and "releases also contain" (not "contains"). These strings are user-facing docs and should match standard branding/grammar.

Suggested change
- Github Releases: https://github.com/xraysoftmat/kkcalc/releases
Note that the Github releases also contains compiled binaries for Windows and MacOS, which can be used without installing Python or any dependencies.
- GitHub Releases: https://github.com/xraysoftmat/kkcalc/releases
Note that the GitHub releases also contain compiled binaries for Windows and MacOS, which can be used without installing Python or any dependencies.

Copilot uses AI. Check for mistakes.
Comment thread docs/index.rst
Please raise any `issues <https://github.com/xraysoftmat/kkcalc/issues>`_ here.

- Development (Github): https://github.com/xraysoftmat/kkcalc-xraysoftmat/
- Development (Github): https://github.com/xraysoftmat/kkcalc/

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling/capitalization: "GitHub" should be capitalized consistently ("Development (GitHub)").

Suggested change
- Development (Github): https://github.com/xraysoftmat/kkcalc/
- Development (GitHub): https://github.com/xraysoftmat/kkcalc/

Copilot uses AI. Check for mistakes.
Comment thread kkcalc2/models/common.py
"Setting a formula mass will not be internally used when a `stoichiometry` has been assigned.",
UserWarning,
raise ValueError(
"Formula mass is immutable when a stoichiometry is defined, as it is derived from `stoichiometry`"

Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ValueError message here is missing whitespace/punctuation between sentences because two string literals are concatenated without a separating space. This makes the error hard to read ("...stoichiometryPlease..."). Add a space/newline or combine into a single formatted string.

Suggested change
"Formula mass is immutable when a stoichiometry is defined, as it is derived from `stoichiometry`"
"Formula mass is immutable when a stoichiometry is defined, as it is derived from `stoichiometry` "

Copilot uses AI. Check for mistakes.
Comment on lines +9 to 13
To do this,




Copilot AI Apr 22, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new Tutorials "Introduction" section currently ends with an incomplete sentence ("To do this,") followed by blank lines. This will render oddly and reads like unfinished documentation; either complete the introductory paragraph or remove the dangling clause/extra blank lines.

Suggested change
To do this,
The tutorials in this section explain how to use these features.

Copilot uses AI. Check for mistakes.
previously, asp.extend_energies accidentially truncated the final energy value incorrectly
added properties for `betas`, `refractive`, `deltas` and `refractive_index`, which return the existing `eval_<name>` using the internal energies rather than requiring specification of the values.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Normalisation of weak spectra

2 participants