Skip to content

GK: loss cone updater refactored for arbitrary geometry#997

Open
Maxwell-Rosen wants to merge 60 commits into
mainfrom
gk-lc-mask-yushmanov
Open

GK: loss cone updater refactored for arbitrary geometry#997
Maxwell-Rosen wants to merge 60 commits into
mainfrom
gk-lc-mask-yushmanov

Conversation

@Maxwell-Rosen

@Maxwell-Rosen Maxwell-Rosen commented Apr 21, 2026

Copy link
Copy Markdown
Collaborator

Hamiltonian-based Classification of Loss and Trapped Orbits in Mirror Geometry

This PR was generated with the assistance of GitHub Copilot for the unit tests and the updater functions. Most of it is LLM-generated, but I've read it all and understand it.

Consider a one-dimensional coordinate $z\in[z_L,z_R]$ aligned with a magnetic field line, with magnetic-field magnitude $B(z)$ and electrostatic potential $\phi(z)$. For a species of mass $m_s$ and charge $q_s$, the guiding-center Hamiltonian in $(z,v_\parallel,\mu)$ coordinates is

$$ H(z,v_\parallel,\mu)=\frac{1}{2}m_s v_\parallel^2+\mu B(z)+q_s\phi(z), $$

where $v_\parallel$ is the parallel velocity and $\mu$ is the magnetic moment. In the absence of collisions, $H$ and $\mu$ are invariants of motion, and particle trajectories lie along contours of constant $H$ in $(z, v_\parallel)$.
Turning points satisfy $v_\parallel=0$, hence

$$H=U(z,\mu), \qquad U(z,\mu)= \mu B(z)+q\phi(z), $$

where $U$ is the Yushmanov potential.

Prior work simply identified the trapped/passing boundary based on the
loss cone criterion

$$\mu = \frac{\frac{1}{2}m_s v_\parallel^2 + q_s\left(\phi(z) - \phi_m\right)}{B_m - B(z)}.$$

In that implementation, $B_m,\phi_m$ are determined at the peaks of $B$. However, the effective potential experienced by particles is $U$, which combines both $B$ and $\phi$. This can be essential for identifying the trapped region of phase space induced by angled neutral-beam injection of ions at $45^\circ$, where an off-axis density peak occurs. Furthermore, this implementation only uses a single loss criterion per region, when multiple can exist, for instance, in tandem mirrors. A Hamiltonian-based region-identification algorithm is more accurate and flexible because it directly addresses the Yushmanov potential.

Two-Wall Escape Criterion

One issue with a Hamiltonian-based identifier is that $H$ is non-unique. For instance, consider a symmetric simple mirror without an electric potential, just two peaks in B. The magnetic field has fourfold degeneracy at the half-maximum. The points outside the peaks in $B$ should be labeled passing, while those between peaks of $B$ should be trapped.

A trajectory at fixed $(H,\mu)$ can escape through the left wall at $z_L$ if it can traverse the interval $[z_L,z]$ without violating $H\ge U$. For a phase-space point $(z,v_\parallel,\mu)$, define the escape barrier to the left wall as

$$EB_L(z,\mu)=\max_{s\in[z_L,z]}U(s,\mu), $$

and the escape barrier to the right wall as

$$EB_R(z,\mu)=\max_{s\in[z,z_R]}U(s,\mu). $$

The minimum escape barrier required for escape through at least one wall is

$$EB(z,\mu)=\min\left(EB_L(z,\mu),EB_R(z,\mu)\right). $$

Defining the signed loss function

$$\mathcal{F}(z,v_\parallel,\mu)=H(z,v_\parallel,\mu)-EB(z,\mu). $$

$\mathcal{F}\ge 0$ means that there is no energy barrier between the wall and this point, indicating a passing orbit. If $\mathcal{F}<0$, then the escape barrier is at a higher energy than this point, so it is trapped. So the orbiting mask is

$$\chi_{\rm orbit}(z, v_\parallel,\mu) = \mathrm{bool}(\mathcal{F}(z,v_\parallel,\mu)<0)$$

Inclusion of Sheath Wall Potentials

The field $\phi(z)$ does not directly represent the sheath drop at the wall, but instead it is assumed that $\phi=0$ in Gkeyll's sheath boundary conditions. Let $\phi_L^{\mathrm{bc}}$ and $\phi_R^{\mathrm{bc}}$ denote prescribed wall potentials at $z_L$ and $z_R$, respectively, and let $\phi(z_L)$ and $\phi(z_R)$ be the values implied by the loaded field data near the walls. These are left general, but taken as zero in the simulations. The effective wall potentials are taken as

$$\begin{aligned} U_L^{\mathrm{wall}}(\mu) &=\max\left(\mu B(z_L)+q\phi(z_L),\mu B(z_L)+q\phi_L^{\mathrm{bc}}\right), \\ U_R^{\mathrm{wall}}(\mu) &=\max\left(\mu B(z_R)+q\phi(z_R),\mu B(z_R)+q\phi_R^{\mathrm{bc}}\right). \end{aligned}$$

The corresponding augmented path barriers are

$$\begin{aligned} \widetilde{EB}_L(z,\mu)&=\max\left(\mathcal{B}_L(z,\mu),U_L^{\mathrm{wall}}(\mu)\right), \\ \widetilde{EB}_R(z,\mu)&=\max\left(\mathcal{B}_R(z,\mu),U_R^{\mathrm{wall}}(\mu)\right), \end{aligned}$$

with an augmented escape barrier

$$\widetilde{EB}(z,\mu)=\min\left(\widetilde{EB}_L(z,\mu),\widetilde{EB}_R(z,\mu)\right). $$

Replacing $EB$ by $\widetilde{EB}$
yields a classifier that includes sheath reflection consistently.

Integration into the Gkeyll Workflow

Within Gkeyll, one first interpolates the fields to nodal values, computes $EB$, calculates the loss barrier, and then, if any corners of the cell are not orbiting, the cell is not evolved.

Community Standards

  • Documentation has been updated.
  • My code follows the project's coding guidelines.
  • Changes to layer/zero should have a unit test, e.g., core/zero.

Testing: (x (yes), blank (no))

  • I added a regression test to test this feature.
  • I added this feature to an existing regression test.
  • I added a unit test to test this feature.
  • Ran make check and unit tests all pass.
  • I ran the code with make valcheck, and it is clean.
  • I ran the code through computer-sanitizer on GPU, and it is clean.
  • I ran a few regression tests to ensure no apparent errors.
  • Tested and works on CPU.
  • Tested and works on multi-CPU.
  • Tested and works on GPU.
  • Tested and works on multi-GPU.

Additional notes

Here is a comparison in a beam simulation with the current loss mask and the old one.
This is an R=32 mirror case with Boltzmann electrons, showing the difference in the fdot multiplier. Red means these cells are now masked with the new updater, while blue means cells that were masked, but are not with this version. In other words, this is new_mask - old_mask. We see the population of small mu particles which are trapped due to the off-axis beam injection, which is what we expect.

newplot

Here is another view, this time in pyvista
image

Here is a simulation run with the new loss cone mask. I'm re-running a beam R=32 case, which was used for the paper Gyrokinetic equilibria of high temperature superconducting magnetic mirrors

This did decrease the time-step during the OAP to 3.4e-7 versus 2.6e-6 seconds. Perhaps a more coarse mesh in mu would help.
yushmanov-comparison

…e maximum magnetic field is determined. Bmag max is stored as a gkyl_array. Right now, we only do this for bmag, but we need to store phi as a 1d maximum array. I haven't decided on the final design for how 2x OAP simulations should be accomodated. Perhpas we need some general gkyl_dg_array_reduce methods that take a 2D array and turn it into a 1D array instead of a 0D number. This is a kind of reduction method, but it's not a total reduction. I tested the regression test included for a 2x2v boltzmann mirror and the output of the magnetic field looks correct. The current implementation evaluates bmag at cell corners, but we ideally should do the corners in Z, but the quadrature nodes in psi.
- Introduced a new header file `gkyl_array_dg_find_peaks.h` that defines a structure and functions for finding peaks (local maxima, minima, and boundary values) in DG fields.
- Implemented an internal structure in `gkyl_array_dg_find_peaks_priv.h` to manage peak finding operations, including storage for peak values and coordinates.
- Removed unused initialization and writing of `bmag_max` arrays in `gyrokinetic.c` to streamline the geometry setup process.
- Deleted the `gkyl_gk_geometry_bmag_max_init` and `gkyl_gk_geometry_bmag_max_release` functions from `gk_geometry.c` as they are no longer needed, simplifying the geometry management.
…es another DG array at the peaks of the initilized array. Add appropriate unit tests which pass to ctest_array_dg_find_peaks. Update gk_species_fdot_multiplier to use the project_on_peaks function with phi. Now, everything passed to loss_cone_mask_gyrokinetic is a gkyl_array. The loss_cone_mask is updated accordingly
…ount of compution we need for evaluating phi at its peak in the app. Unit tests pass. Regression tests look fine as well. They're all valgrind clean. I think the right way to do the paralellism is to do the peak finding on a global bmag, just like how it is done for the position_map, then when we evaluate phi, all processes evaluate it at this peak, however only one will return a true value. This process will broadcast the array to the rest of the processes
… method, which is just like find_peaks, but it computes the global maximum or minimum.
…nd regression tests are brought over from another branch. Unit tests for the array mask, loss cone mask, and the regression tests for the kinetic electron POA mirror are valgrind free
…grind clean. Regression test is added and produces reasonable results.
…formance

- Introduced a helper function `mkarr` to streamline array allocation for GPU and CPU.
- Removed the `gkyl_loss_cone_mask_gyrokinetic_Dbmag_quad_wall` function and integrated its logic into the main processing flow.
- Updated the GPU kernel `gkyl_loss_cone_mask_gyrokinetic_Dbmag_quad_cu_ker` to compute `Dbmag_quad` directly from `bmag_peak` instead of `bmag_max`.
- Enhanced tandem mirror support by adding handling for `bmag_peak` and `phi_m` in the GPU kernels.
- Simplified the logic for determining trapped particles in the `gkyl_loss_cone_mask_gyrokinetic_ker` and `gkyl_loss_cone_mask_gyrokinetic_quad_ker` functions.
- Improved readability and maintainability by restructuring conditional checks and variable assignments.
…plier. Add possibility of kinetic electrons and tandem mirrors. The damping regression test is failing, both here and on main. They are for different issues. Main fails because the loss_cone updater has an issue. Here, it fails because it's using scale_by_cell with a multi-component array. I'm not sure the right way to fix this
…ove the aspects about the cellwise evaluation and quadrature points because that breaks the array_scale_by_cell method which is used
… refactor this in the future, but it is just proof of concept for now to make sure it works correctly.
… arrays need to be passed to objects like the loss_cone_mask, where it expects these to be GPU arrays. It's just easier to have this module fit the archatecture of the rest of the code, rather than doing something different and copying between device and host. It wouldn't interface well. Claude generated most of the cuda code, with strong guidence from Maxwell
…h compute sanitizer with the array_find_peaks which was causing crashes in the loss cone mask. These issues are fixed. There was some funny business regarding the basis being on the host vs device.

Refactor the allocations in the GPU kernels to not be inside the kernels. Instead, it's allocated at init time. The GPU code pulses, which is odd, but it runs. The 2x2v POA regression test runs and is compute sanitizer clean. The other POA tests do not error either on GPU and are compute sanitizer clean.
…itting code to main and broke a unit test with the geometry enum changes
…hi_smooth_global array for improved performance and consistency across computations.
…ion of doing the kinetic electron and tandem mirror. The code is built again to make sure nothing is affected.
…egression test (and in my production simulations)
…erything is done in computational coordinates
…and switch to user-defined damping type in simulation context.
- Removed unused includes and commented-out code for clarity.
- Simplified the initialization of quadrature values by consolidating the logic into a single function.
- Introduced a new function to compute escape barriers based on the potential and magnetic field.
- Replaced the previous approach of handling quadrature nodes with a more efficient nodal representation.
- Updated the main advance function to utilize the new escape barrier calculations and streamline the trapped particle detection logic.
- Cleaned up memory management by ensuring proper release of allocated arrays.
…nce communication logic for global data assembly
- Removed unused inline functions for node coordinate matching and field value calculations in loss_cone_mask_gyrokinetic.c.
- Simplified the escape_barriers function by directly integrating its logic into the main kernel.
- Updated the CUDA kernel for loss cone mask gyrokinetic to streamline the computation of escape barriers and Hamiltonian checks.
- Enhanced the handling of configuration and phase ranges in the CUDA implementation for better clarity and performance.
- Adjusted the kernel launch parameters to accommodate the new structure of the code and ensure proper execution.
@Maxwell-Rosen Maxwell-Rosen marked this pull request as draft April 21, 2026 15:50
@Maxwell-Rosen Maxwell-Rosen marked this pull request as ready for review April 24, 2026 21:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant