Update to stitching procedure#44
Conversation
Add github release source for GUI executables Modify API page title
…o create an extended db object
Major update to correctly (and smoothly) stitch optical data into the database data
make formula mass and stoichiometry more independent. fix appropriate tests
…ges to the polynomials class, some small errors. Changed representation to now specify the size of the object domain, rather than displaying (unhelpfully) rows of the asp coefficients
There was a problem hiding this comment.
Pull request overview
Adds a new distortion-correction (“stitching”) method (prepost_fit) intended to improve scaling/merging of weak spectra against the ASF database background (per Issue #4), along with some API tightening, display tweaks, docs updates, and CI adjustments.
Changes:
- Extend database scaling/extension logic with
fix_distortions_methodincluding newprepost_fitpathway. - Tighten
atomic_scatteringconsistency rules (warnings →ValueError) and adjust tests accordingly. - Misc updates: string representations, docs links/text, and Coveralls parallelization.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 15 comments.
Show a summary per file
| File | Description |
|---|---|
kkcalc2/models/db_models.py |
Introduces prepost_fit and new parameters for distortion-fixing during scaling/extension; expands typing/overloads. |
kkcalc2/models/common.py |
Converts warnings to ValueError for conflicting inputs; adds attempted stoichiometry normalization. |
tests/models/test_common.py |
Updates tests to assert ValueError instead of warnings; adds formula-mass immutability check. |
kkcalc2/models/polynomials.py |
Adjusts __str__/__repr__, tweaks kwargs typing/stoichiometry propagation, adds an energy-domain assertion. |
kkcalc2/models/factors.py |
Simplifies __str__/__repr__; updates scale_to_database to use keyword args. |
kkcalc2/conversions.py |
Changes ASP_to_ASF return shape via squeeze(); suppresses a warning. |
kkcalc2/__main__.py |
Fixes/cleans optional PyQt import handling and internal imports. |
pyproject.toml |
Adds pip >= 25.1 to build-system requires; adjusts doc-lint ignores. |
docs/source/install.rst |
Adds GitHub releases info and improves a pip link. |
docs/source/contributing.rst |
Fixes issue URL but leaves spelling/capitalization issues. |
docs/source/tutorials/index.rst |
Adds an “Introduction” stub. |
docs/source/api.rst |
Renames API page title to KKCalc2. |
docs/index.rst |
Fixes repo links but leaves “Github” capitalization. |
.github/workflows/tests.yml |
Enables Coveralls parallel uploads and adds a “finish” job. |
.github/workflows/release.yml |
Runs artifact re-zip step unconditionally. |
Comments suppressed due to low confidence (1)
kkcalc2/models/db_models.py:898
- These branches are typed to accept
Sequence[...], but runtime checks only handlelist(e.g.isinstance(data_asf, list)/isinstance(fix_distortions, list)). Passing a tuple or otherSequencewill follow the non-sequence path and behave incorrectly. Either constrain the types back tolist[...]or update the runtime checks to handle general sequences (while excludingstr/bytes).
) -> None: # numpydoc ignore=GL08
# Check if data_asf is a list
if isinstance(data_asf, list):
# Check if merge_domain is a list
if isinstance(merge_domain, list):
# Check if the lengths match
if len(data_asf) != len(merge_domain):
raise ValueError("Length of data_asf and merge_domain must match")
# Check each merge domain doesn't overlap
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| scaled_data_y = (data_y - data_merge_range[0]) * scale + db_asp_merge_range[0] | ||
|
|
||
| # Energy values within the merge domain to use for fitting | ||
| energies = data_e[ | ||
| data_merge_lb_idx:db_merge_ub_end | ||
| ] # essential to only use domain data to perform fit. | ||
|
|
||
| if fix_distortions: | ||
| # Perform a fit along the domain | ||
| db_y = asp.eval_asf_on_coefs( | ||
| target_energies=energies, | ||
| energies=db_e, | ||
| coefs=db_coefs, | ||
| ) # Find equivalent values of the db_asp energies to the data energies | ||
| guess_grad = ( | ||
| -(data_merge_range[1] - data_merge_range[0]) | ||
| / (db_asp_merge_range[1] - db_asp_merge_range[0]) | ||
| / data_y[-1] | ||
| ) | ||
| fit_func = asp_db_extended.grad_min | ||
| fit_y = scaled_data_y[ | ||
| data_merge_lb_idx:db_merge_ub_end | ||
| ] # essential to only use domain data to perform fit. | ||
|
|
||
| (grad,), _ = opt.leastsq( | ||
| func=fit_func, | ||
| x0=guess_grad, | ||
| args=(energies, fit_y, db_asp_merge_range, db_y), | ||
| # Minimizes difference to the db_y values. | ||
| ) | ||
| # Reassign the scaled data, but use full energy domain to generate values. | ||
| scaled_data_y = db_asp_merge_range[0] + asp_db_extended.grad_min( | ||
| grad, | ||
| data_e, | ||
| scaled_data_y, | ||
| db_asp_merge_range, | ||
| db_y=0, # No minimization here. | ||
| # Adjust merge indexes, so we can scale whole dataset. | ||
| idx0=data_merge_lb_idx, | ||
| idx1=data_merge_ub_idx, | ||
| ) | ||
| match fix_distortions_method: | ||
| case "grad_min": | ||
| # Perform a fit along the domain | ||
| db_y = asp.eval_asf_on_coefs( | ||
| target_energies=energies, | ||
| energies=db_e, | ||
| coefs=db_coefs, | ||
| ) # Find equivalent values of the db_asp energies to the data energies | ||
| guess_grad = ( | ||
| -(data_merge_range[1] - data_merge_range[0]) | ||
| / (db_asp_merge_range[1] - db_asp_merge_range[0]) | ||
| / data_y[-1] | ||
| ) | ||
| fit_func = asp_db_extended.grad_min | ||
|
|
||
| (grad,), _ = opt.leastsq( | ||
| func=fit_func, | ||
| x0=guess_grad, | ||
| args=(energies, fit_y, db_asp_merge_range, db_y), | ||
| ) | ||
| # Reassign the scaled data | ||
| scaled_data_y = db_asp_merge_range[0] + asp_db_extended.grad_min( | ||
| grad, | ||
| energies, | ||
| fit_y, | ||
| db_asp_merge_range, | ||
| db_y=0, | ||
| idx0=0, | ||
| idx1=-1, | ||
| ) |
There was a problem hiding this comment.
asp_db_abstract.scale_data() overwrites scaled_data_y with a distortion-corrected array computed only on the merge-domain slice (fit_y/energies). That changes the return length from len(data_y) to len(energies) when fix_distortions=True, which breaks callers like asf.scale_to_database() that expect the full-length scaled dataset. Apply distortion corrections to the full scaled_data_y (using the merge-domain indices for fitting), or clearly change the API to return a truncated dataset and update all callers + docstrings accordingly.
| # Apply the linear correction to the full domain | ||
| intermediate_y = fit_y - ( | ||
| pre_m * (energies - pre_x[0]) | ||
| ) # Correct the gradient |
There was a problem hiding this comment.
In the new prepost_fit branch, np.polyfit returns (pre_m, pre_c) and (post_m, post_c) but the intercept terms are never used in the corrections. As written, the method can only adjust slope, not offset, despite explicitly fitting the offset; this looks like an implementation error and will under-correct pre/post-edge mismatches. Incorporate the intercept terms into the applied correction (or remove them from the fit if intentionally unused).
| # Apply the linear correction to the full domain | |
| intermediate_y = fit_y - ( | |
| pre_m * (energies - pre_x[0]) | |
| ) # Correct the gradient | |
| # Apply the full linear correction (slope and offset) to the full domain | |
| intermediate_y = fit_y - ( | |
| pre_m * energies + pre_c | |
| ) # Correct the fitted mismatch |
| correction = ( | ||
| np.arange(N) | ||
| / N # Ramping factor to slowly apply the correction over the domain |
There was a problem hiding this comment.
The ramp factor for applying the post-edge correction uses np.arange(N) / N, which never reaches 1.0 (it ends at (N-1)/N). If the intent is for the correction to be fully applied at the endpoint, use a normalization that reaches 1.0 (e.g. divide by N-1 when N>1).
| correction = ( | |
| np.arange(N) | |
| / N # Ramping factor to slowly apply the correction over the domain | |
| ramp = ( | |
| np.arange(N) / (N - 1) | |
| if N > 1 | |
| else np.zeros(N, dtype=float) | |
| ) | |
| correction = ( | |
| ramp |
| assert re.energies.shape == im.energies.shape and np.all( | ||
| re.energies == im.energies | ||
| ), ( | ||
| "Real and imaginary parts must have the same energy domains after truncation and extension." | ||
| ) |
There was a problem hiding this comment.
Using assert for this validation is unsafe in library code because assertions can be disabled with Python optimizations (-O), which would skip the check and allow inconsistent energy domains to propagate. Replace this with an explicit conditional and raise ValueError (or a domain-specific exception) when energies still differ after truncation/extension.
| assert re.energies.shape == im.energies.shape and np.all( | |
| re.energies == im.energies | |
| ), ( | |
| "Real and imaginary parts must have the same energy domains after truncation and extension." | |
| ) | |
| if re.energies.shape != im.energies.shape or not np.all( | |
| re.energies == im.energies | |
| ): | |
| raise ValueError( | |
| "Real and imaginary parts must have the same energy domains after truncation and extension." | |
| ) |
| # kwargs["max_rows"] = 6 | ||
| # return self.dataframe().to_string(**kwargs) | ||
| # else: | ||
| return f"<{self.__class__.__name__} ({len(self.energies)} points, {self.energies[0]:0.2f} eV to {self.energies[-1]:0.2f} eV)>" |
There was a problem hiding this comment.
__str__ now indexes self.energies[0] and self.energies[-1], which will raise IndexError for empty datasets (which can be created via slicing, e.g. obj[:0]). Guard against len(self.energies) == 0 and return a sensible empty representation in that case.
| return f"<{self.__class__.__name__} ({len(self.energies)} points, {self.energies[0]:0.2f} eV to {self.energies[-1]:0.2f} eV)>" | |
| n_points = len(self.energies) | |
| if n_points == 0: | |
| return f"<{self.__class__.__name__} (0 points, empty)>" | |
| return f"<{self.__class__.__name__} ({n_points} points, {self.energies[0]:0.2f} eV to {self.energies[-1]:0.2f} eV)>" |
| Want to contribute to developement? That's fantastic! Have a discussion with us first via `issues <https://github.com/xraysoftmat/kkcalc/issues>`_ to avoid wasted effort, and so we can make our visions align. | ||
|
|
||
| To install for development, create your own `fork/branch <https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/fork-a-repo>`_ of ``kkcalc2`` on ``github`` so you can save and contribute your own changes, and clone that to your local file system using ``Github Desktop`` or ``Git`` via Windows Terminal / CMD. |
There was a problem hiding this comment.
Spelling/capitalization: "developement" -> "development", and use consistent "GitHub" capitalization (including "GitHub Desktop" and "Git" / "github").
| Want to contribute to developement? That's fantastic! Have a discussion with us first via `issues <https://github.com/xraysoftmat/kkcalc/issues>`_ to avoid wasted effort, and so we can make our visions align. | |
| To install for development, create your own `fork/branch <https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/fork-a-repo>`_ of ``kkcalc2`` on ``github`` so you can save and contribute your own changes, and clone that to your local file system using ``Github Desktop`` or ``Git`` via Windows Terminal / CMD. | |
| Want to contribute to development? That's fantastic! Have a discussion with us first via `issues <https://github.com/xraysoftmat/kkcalc/issues>`_ to avoid wasted effort, and so we can make our visions align. | |
| To install for development, create your own `fork/branch <https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/fork-a-repo>`_ of ``kkcalc2`` on ``GitHub`` so you can save and contribute your own changes, and clone that to your local file system using ``GitHub Desktop`` or ``Git`` via Windows Terminal / CMD. |
| - Github Releases: https://github.com/xraysoftmat/kkcalc/releases | ||
|
|
||
| Note that the Github releases also contains compiled binaries for Windows and MacOS, which can be used without installing Python or any dependencies. |
There was a problem hiding this comment.
Spelling/capitalization/grammar: use "GitHub" (not "Github"), and "releases also contain" (not "contains"). These strings are user-facing docs and should match standard branding/grammar.
| - Github Releases: https://github.com/xraysoftmat/kkcalc/releases | |
| Note that the Github releases also contains compiled binaries for Windows and MacOS, which can be used without installing Python or any dependencies. | |
| - GitHub Releases: https://github.com/xraysoftmat/kkcalc/releases | |
| Note that the GitHub releases also contain compiled binaries for Windows and MacOS, which can be used without installing Python or any dependencies. |
| Please raise any `issues <https://github.com/xraysoftmat/kkcalc/issues>`_ here. | ||
|
|
||
| - Development (Github): https://github.com/xraysoftmat/kkcalc-xraysoftmat/ | ||
| - Development (Github): https://github.com/xraysoftmat/kkcalc/ |
There was a problem hiding this comment.
Spelling/capitalization: "GitHub" should be capitalized consistently ("Development (GitHub)").
| - Development (Github): https://github.com/xraysoftmat/kkcalc/ | |
| - Development (GitHub): https://github.com/xraysoftmat/kkcalc/ |
| "Setting a formula mass will not be internally used when a `stoichiometry` has been assigned.", | ||
| UserWarning, | ||
| raise ValueError( | ||
| "Formula mass is immutable when a stoichiometry is defined, as it is derived from `stoichiometry`" |
There was a problem hiding this comment.
The ValueError message here is missing whitespace/punctuation between sentences because two string literals are concatenated without a separating space. This makes the error hard to read ("...stoichiometryPlease..."). Add a space/newline or combine into a single formatted string.
| "Formula mass is immutable when a stoichiometry is defined, as it is derived from `stoichiometry`" | |
| "Formula mass is immutable when a stoichiometry is defined, as it is derived from `stoichiometry` " |
| To do this, | ||
|
|
||
|
|
||
|
|
||
|
|
There was a problem hiding this comment.
The new Tutorials "Introduction" section currently ends with an incomplete sentence ("To do this,") followed by blank lines. This will render oddly and reads like unfinished documentation; either complete the introductory paragraph or remove the dangling clause/extra blank lines.
| To do this, | |
| The tutorials in this section explain how to use these features. |
previously, asp.extend_energies accidentially truncated the final energy value incorrectly
added properties for `betas`, `refractive`, `deltas` and `refractive_index`, which return the existing `eval_<name>` using the internal energies rather than requiring specification of the values.
Closes #4 by adding a new stiching method
prepost_fit. This makes the joining and scaling of the atomic scattering factors data much more aligned to the atomic scattering factor background.Some other minor changes too