Allows odd N mesh for nonzero Temp#401
Open
jplauzie wants to merge 1 commit into
Open
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Hi,
This PR adds support for using odd grids with temperature enabled. Previously, N (N=NxNyNz) was required to be even when temperature was nonzero, due to Curand's use of the Box-Muller algorithm which produces random values in pairs. Essentially the same as the mumaxplus fix (mumax/plus#155).
Instead of doing a cuda memcpy, it is done by the existing cuda kernel. In the cuda kernel the size of B_th and noise for the purpose of the kernel uses size of B_th, so since B_th is size N and noise is size N+1, the kernel uses N, and the final float inside noise is essentially invisible and unused.
In that discussion, there were some concerns over speed and memory usage when using a buffer instead of just writing the values directly. Mumax3 currently uses a buffer. The speed difference was negligible. For a test grid size of ~515x515x55, there is a memory savings of ~ 53 MB for directly writing, or 0.9% of the ~6GB total memory required for the simulation. This seemed small enough that I just kept the buffer method for both N even or odd, since it seems a bit cleaner. The biggest concern is that some future change might accidentally use the N+1 size of noise, but the noise buffer doesn't get touched in the code outside of this section, so the odds of the N+1 size being unintentionally used elsewhere in a future update is small. (I spent a little bit of time trying to see if I could get mumaxplus to avoid needing the buffer for odd N, but it's structured a little differently, seems like it'd be a bit more of a headache to track the hidden float.)
When reviewing, probably the two big things to check is there is no unintentional buffer overflow, and that the N+1 value never gets used unintentionally, but given how noise is used I think I avoided that.
This does mean that for generated values in the stream in N odd simulations, 1 float gets consumed without being used per component, per time step. Otherwise the random value streams line up exactly as before, as expected. As example, I've attached an example script along with a few ovf files for a grid of 3x3x3 and 3x3x2. They match exactly up until the value 0.14437014 (missing from step0 3x3x3. this also corresponds exactly to the bottom of the first column/top of the second column). You see the same thing in future steps, every 3x3x3+1=27+1th value gets skipped, as expected, but otherwise the stream values match exactly.
oddtest.txt
Thermal field000000_332.txt
Thermal field000000_333.txt
Thermal field000001_332.txt
Thermal field000001_333.txt
I didn't include a test script, but the example could probably be converted to one fairly easily if it's worth it.
Best regards,
Josh L.