Tweaking code example tests#258
Merged
mabruzzo merged 9 commits intoMar 26, 2025
Merged
Conversation
…n executed by pytest)
Previously, the logic invoking core-lib tests (i.e. through `ctest`) was implement as a bunch of shell commands directly embedded into a "run-step" of the test-suite job. This commit relocates the logic to a named command called "run-core-tests." This command contains logic for running the cxx_omp_example example (while this commit simply duplicates the logic, the next commit will rectify the duplication)
I renamed the "run-tests" command to be called "run-pygrackle-tests". I also removed the logic for running the OpenMP example from this command For context, the prior commit configured the "run-core-tests" command to have the logic for running the OpenMP example. But, to make changes more atomic, that commit didn't remove the logic from the command now known "run-pygrackle-tests"
This was referenced Feb 28, 2025
brittonsmith
approved these changes
Mar 25, 2025
brittonsmith
left a comment
Contributor
There was a problem hiding this comment.
Looks fine to me. If you could create an issue to make documentation for how to run these tests locally, that would be great. Otherwise, merge when ready.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This depends on #254
Overview
The purpose of this PR is to move the code-example tests out of the pytest test-suite into the test suite associated with the closely associated with the core library. Additionally, I
Note
The collapsible sections in this PR mostly s mostly here for the sake of completeness and clarity. This PR has been written in a way that you should be able to skip over them.
Describing "the problem" with these tests
In my experience, the code_examples tests have always been an extremely flaky part of Grackle’s test suite! (Way back when I first started contributing to Grackle, it was a major struggle to get them working)
These tests consist of compiling our-code examples, running the examples, and then comparing the outputs to reference results.
I provide a detailed describe of the tests' issues in the following collapsed section. The main highlights are summarized at the start of the next subsection
Detailed description of issues with easily running these tests
Detailed description of the tests' issues
Compiling anything is obviously complex. In order to make these tests work as “automatically as possible,” the tests have made a LOT of assumptions. These assumptions include:
pygrackle. Specifically, we need to find the auto-generated headers at compile-time and the copy of the shared library for linkingLD_LIBRARY_PATH). This may not seem like a problem since this was historically a requirement for runningpygrackle(this isn't the case anymore). But, it makes development of Grackle extremely difficult without breaking your current installation (historically, I would alway delete my current installation of Grackle before I would start developing it).If things aren’t set up “just so” on your system, the tests won’t run properly.
When I introduced the CMake build-system, these tests were a major pain. One of CMake’s killer features is its support for out-of-source builds.2 This is fundamentally at odds with the way these tests work. But I came up with a work-around that let us temporarily limp along.
This problem has been further exacerbated now that I shifted our python build-system from
setuptoolstoscikit-build-core. This change made it extremely easy for somebody to installpygrackle:pip install .orpip install -e .. This performs an "embedded" build of the grackle library, which is packaged within the pygrackle modulepip(anduv) perform builds and installation in one steppygracklein this waySummary of the tests' issues
To summarize the issues with easily running these tests:
pygrackle. where the core grackle library (libgrackle.so) is automatically built as part ofpygrackle(and is packaged into thepygracklewheel) when you invokepip install.orpip install -e .I've given a lot of thoughts on how to solve the issue. At a glance, both issues seem solvable. As I detail in the following collapsed section, it doesn't make sense to solve the issue in this way. The common feature among all of these "solutions" is that they require that we bend over backwards (and the result will still be somewhat flaky)
Details
Flawed "Solutions"
To solve question 1: I originally asked "why not package the required headers as part of pygrackle?" The short answer for why we shouldn't do this is that this simply isn't done (e.g.
h5pydoesn't do that). More importantly, the presence of these headers may falsely suggest to users that applications could directly link against their code agains the copy oflibgrackle.sopackaged withpygrackle.3 This is something we NEED TO ACTIVELY DISCOURAGE!! I don't think that simply explaining to people that they shouldn't do this is enough (when there are better alternatives):grackleheaders packaged intopygrackle(and using the copy oflibgrackle.sopackaged intopygrackle, if present).pygracklewill actually work sometimes. This depends on:libgrackle.sois actually packaged withpygrackle. While the easy/simple way of installingpygrackledoes this, we will almost certainly want to maintain support for linkingpygrackleagainst an external of grackle.libgrackle.sopackaged inpygrackleto be compiled/linked agains version of shared dependencies that conflict with the versions used by the application.4 If we ever distribute binary wheels ofpygrackle(which will include copies of these runtime libraries), these conflicts would be unavoidableTo solve question 2: I thought about using
--rpathto avoid requiring a global installation. The issue with this solution is that it is not portable.--rpathis not a standard flag across compilers/linkers (most toolchains provide similar functionality). Plus, the behavior differs across platforms. For example, there are some difference on macOS and on Linux (I think from differences between MACH-O and ELF). Abstracting over this machinery is possible (CMake has support for doing this!), but I don't think it's worth our time to do that.An alternative way to solve both question involves having CMake directly build the code examples and having the pytest test-suite directly invoke those tests. You can already build the code-examples this way (SPOILER: this has some resemblance to the final solution).
pytesttelling it where it can find the build-directory. We would also need to make sure that the code-examples are built ahead of time.pip installapproach using somescikit-build-corefeatures (but it requires that user passes some special flags while installingpygrackle). I actually had a discussion with thescikit-build-coredevelopers about how I could wire this all up in a fairly seamless manner... While I'm confident my plan would work, the discussion have helped me conclude that there is a better solution.Motivating The Solution
It's instructive to step back and consider how this repository is essentially a monorepo that provides source code for 2 distinct entities:
libgrackle)5pygracklepython packageTo drive home this point, consider that this pretty unusual for the large community-maintained open-source scientific software packages (with python interfaces) that we typically encounter as astronomers:
hdf5/h5pyOR mpi implementations andmpi4py). While there are plenty of python package repositories that include C/C++/Fortran code (e.g.numpy,scipy,yt), the python package is the main "product" (the compiled code isn't separately distributed)5libgrackleandpygrackleand that (ii) grackle users can installlibgracklewithoutpygrackleis indicative of this distinction.The distinction between the core library and
pygrackleis also reflected by the source code separation, the fact thatpygrackleconsumesgrackle's public API (and has no special access to internal functions), and there is a clear distinction in the build-systems.6The point of this drawn-out discussion is to emphasize that having distinct test-suites for both entities is sensible. While the majority could belong to either suite, a subset of tests must to one suite or the other. Obviously, tests of
pygrackle's python-logic must go intopygrackle's test-suite (thepytesttest-suite). Likewise, unit-tests of internal core logic or parts of grackle's API that aren't covered wrapped bypygrackleneed to go into the core-library's test-suite. (We have already started to respect to account for this by introducing unit-tests written withgoogletest, but we never actually hashed any of this out)We resolve the "problems" with these tests by transferring the code-example tests to the core library's test suite.
The Solution
In more detail:
-DGRACKLE_BUILD_TESTS=ON, the code-examples (and all other test-code) are all built alongsidelibgrackle. CMake is smart enough to link all of the code-examples against thelibgracklelibrary in the build-directory in a way that they can be run without fully installinglibgrackle(it leverages machinery related to--rpathunder the hood)ctestprogram that is shipped as part ofCMake.CTesttest-driver is integrated withCMakeand it is designed to flexibly run various kinds of tests by invoking the command-lineCTest(CMake provides nice built-in functionality to automatically detect each unit-test written with that framework and support running them as distinct test-cases)cxx_omp_example, we just compile and run the program (which is what we currently do)There is an important detail I have largely ignored until now: these tests (other than
cxx_omp_example) are currently answer-tests because they don't have an obvious "correct answer". My solution is as follows:To actually achieve this outcome, I needed to slightly modify some of the code-examples. All I did for
c_local_example,c_example, andcxx_examplewas adjust the precision used to print out the result.fortran_exampleneeded more workfortran_exampleso that the logical structure more closely resembledc_exampleI ultimately realized that the annotation of the
dtargument in Fortran's interface declaration ofsolve_chemistry(units, fields, dt)was wrong.dtvalue into the C function.dtargument is adoublepassed by value. Consequently, the C function would essentially treat the pointer address specified by Fortran as a double (it didn't know it had to dereference it).Some discrepancies still persist between
fortran_exampleand the other examples, but the discrepancy is a relative error has a magnitude of ~6e-7. I think this acceptable for the moment (to fix this, I think we would need to remove the dynamic calculation of all input values and instead we would need to hardcode all inputs with literals)Important
This solution doesn't actually support running the code-examples when you use the easy-install mode of
pygrackle(it is definitely possible, if you pass a bunch of extra flags). The main point is that there is no longer an expectation that the code-example tests should run this way.Instead, the code-example tests are now run alongside the rest of the core-library tests:
Other thoughts
I realize that we need documentation for the core-library test-suite. That's on my todo list
If you really don't want us to use ctest, I could move all the code logic into googletest and invoke them with the popen posix system call. There are 2 reasons why we may not want to do this:
googletestand usingpopen(to avoid breaking things). It's definitely possible (Cholla does it), but it doesn't seem like a great idea8ctestor they will need to exist entirely outside ofgoogletestandpytest(Issue Add tests for linking #249)Footnotes
I suspect that the C and Fortran examples originally had much more similar structure. Then over the years, as the structure of the C example improved, the Fortran example wasn’t updated. (I suspect this was at least partially my fault) ↩
Out-of-source builds are extremely useful. It’s also the reason CMake projects (like Grackle) are able to support automatic dependency management. Creating multiple different build directories can radically speed up development of large projects (if you maintain a different directory for each branch you are working on, rebuilds will be much faster). It also allows easier IDE support. ↩
I originally thought that this could be a really easy, simple, attractive way for us to distribute pre-compiled copies of
grackle. It seems like a fantastic idea before you understand the full picture. ↩Imagine the application that uses parallel HDF5 while
pygracklegot compiled with serial HDF5. Or that different fortran compilers got used with incompatible Fortran Runtime libraries. Or that compilers with incompatible OpenMP runtimes are used. These incompatibilities don't actually create problems when different python extension modules are compiled with incompatible dependencies because python internally usesdlopen. ↩A notable exception to all of this is something like
pytorch. It has an underlying C++ library,libtorch, that you can use without the rest ofpytorch(my impression is thatlibtorchis much less feature-rich thanpytorch). ↩ ↩2Aside: the scikit-build-core is explicitly designed to support cases where the line between the python extensions-modules and a C library are much more blurry and are much more tightly coupled. When I adopted scikit-build-core, I made sure to maintain the clear division between pygrackle and grackle ↩
Note: this didn't have to be a python. It could theoretically be any portable scripting language (bash/awk) or I could have written a C/C++ program (that CMake would compile) in order to run this test for us. ↩
In the future, I actually want to migrate the logic of the
test_chemistry_struct_synched.pytest into googletest (doing this will let us write a bunch of useful new tests that require access to the internal details of grackle). Doing that will require us to usepopen. (so this undermines my point to some degree) ↩