Skip to content

Variable sound speed in LinearEuler2D + HOHQMesh ISM/ISM-MM reader#123

Merged
fluidnumericsJoe merged 7 commits into
mainfrom
feature/variable-sound-speed-lineareuler2d
May 14, 2026
Merged

Variable sound speed in LinearEuler2D + HOHQMesh ISM/ISM-MM reader#123
fluidnumericsJoe merged 7 commits into
mainfrom
feature/variable-sound-speed-lineareuler2d

Conversation

@fluidnumericsJoe
Copy link
Copy Markdown
Member

@fluidnumericsJoe fluidnumericsJoe commented May 13, 2026

Summary

  • Variable sound speed in LinearEuler2D. Sound speed c becomes the 5th solution variable (nvar 4→5). Its flux and source are identically zero so c is held bitwise fixed in time at each location, while still being available point-wise to the pressure flux and the Riemann L-F dissipation. Volume flux, Riemann flux, entropy function, no-normal-flow / radiation BCs, GPU kernels, GPU Fortran wrappers, and the three existing 2D examples are all updated. The model's scalar %c attribute is removed.
  • HOHQMesh ISM and ISM-MM mesh reader for Mesh2D. Auto-detects format from the first line, parses nodes and per-element blocks, runs transfinite (Coons-patch) interpolation for curved interiors using Chebyshev-Gauss-Lobatto sample coordinates, and reconstructs full side connectivity (which neither format carries) by hashing sorted corner-node-id pairs. Adds nMaterials, elemMaterial(:), materialNames(:) to Mesh2D_t (default-allocated to a single "default" entry so existing HOPr / structured paths stay unchanged). Mirrored on the GPU Mesh2D Init/Free.
  • Multi-material BoneAndMarrow example. Reads the ISM-MM BoneAndMarrow.mesh (973 elements, polyOrder 4, three materials: Muscle / Bone / Marrow), stamps c per material into solution(...,5), seeds a small Gaussian density+pressure bump in the muscle annulus, applies CPU radiation BCs on the outer disk boundary, and integrates 500 RK3 steps. Entropy decays monotonically from 2.83e-7 to 1.97e-7; c is preserved bitwise at [0.92, 2.3].

Test plan

  • CPU build is clean (cmake --build build).
  • New unit tests pass: mesh2d_read_ism (Block2D.mesh: 25 elems, polyOrder 1, single "default" material, 20 boundary + 80 interior sides) and mesh2d_read_ismmm (BoneAndMarrow.mesh: 973 elems, polyOrder 4, ≥3 materials including Muscle and Bone, 4·nElem sides accounted for).
  • All three existing 2D LinearEuler examples (spherical_soundwave_closeddomain, planewave_propagation, planewave_reflection) still run cleanly with min(c) == max(c) == 1.0 bitwise across 500 timesteps.
  • BoneAndMarrow example runs on CPU: 500 RK3 steps complete, entropy monotone-decreasing, c preserved bitwise.
  • Existing ctest results — 0 new regressions vs. main. The 50 pre-existing failures are all Failed to open /share/mesh/... from missing test fixtures unrelated to this change.
  • GPU long-run + visualization. Build with SELF_ENABLE_GPU=ON on a GPU host, run linear_euler2d_boneandmarrow with an extended endtime (e.g. 4.0–8.0) so the pulse fully crosses into and reflects off the Bone region. Render the output HDF5 frames (e.g. via tecplot / paraview) to confirm: (a) the wavefront refracts/reflects sensibly at the muscle-bone and bone-marrow interfaces, (b) c displays the material-banded map throughout, (c) the outer boundary remains transparent (no spurious reflection back into the domain).
  • Confirm the GPU Mesh2D Init/Free compile with the new material-table allocations.

🤖 Generated with Claude Code

fluidnumericsJoe and others added 4 commits May 11, 2026 16:19
Move c from a scalar model attribute to solution variable index 5
so the sound speed can vary in space. Volume flux, Riemann flux,
and source for the new variable are identically zero, so c is held
fixed in time at each location.

- _t module: nvar=5, c metadata, drop %c scalar, flux/riemann/entropy/
  no-normal-flow BC updated; SphericalSoundWave now takes c as an arg.
- GPU Fortran: drop c parameter from interfaces and call sites;
  radiation BC wrapper now also passes the interior boundary buffer.
- GPU C++ kernels: read c from solution[idof+4*ndof] (or boundary state),
  use max(cL,cR) for Lax-Friedrichs dissipation, emit zero flux for var 5;
  no-normal-flow copies c through; radiation preserves c from interior.
- Examples: planewave_{propagation,reflection} carry their own c
  attribute (parent scalar is gone) and write solution(...,5) in both
  initial conditions and prescribed BCs; spherical_soundwave passes c
  into SphericalSoundWave.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `Read_HOHQMesh` to `Mesh2D_t` that auto-detects ISM vs ISM-MM
from the first line of the file, parses nodes and per-element
blocks, and reconstructs SELF's full mesh state — including side
connectivity, which is not present in either format and must be
derived from corner-node-id matching.

- New fields on `Mesh2D_t`: `nMaterials`, `elemMaterial(:)`,
  `materialNames(:)`. Default-allocated by `Init` to a single
  "default" material so HOPr/structured readers stay unchanged.
- HOHQMesh curve samples are at Chebyshev-Gauss-Lobatto nodes;
  the reader sets `quadrature = CHEBYSHEV_GAUSS_LOBATTO` and uses
  SELF's `ChebyshevQuadrature` for the TFI parametric coords.
- Curved-edge interiors (polyOrder > 1) are filled via transfinite
  (Coons-patch) interpolation. Sample order is reversed on the
  North/West sides to match SELF's (ξ, η) convention.
- Side connectivity rebuilt by hashing sorted corner-id pairs:
  fills `sideInfo(3:4,...)`, deduplicates global side ids, and
  marks boundary edges via a bc-name lookup table; `RecalculateFlip`
  resolves orientation.
- Adds tests `mesh2d_read_ism` and `mesh2d_read_ismmm` plus the
  BoneAndMarrow ISM-MM fixture (and its .control file). The Block2D
  fixture is reused. Side accounting and material tables are
  validated against expected values.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tation

New CPU example demonstrates the variable-sound-speed LinearEuler2D
model on the ISM-MM BoneAndMarrow mesh. A small Gaussian pressure
and density bump is seeded in the Muscle annulus, c is initialised
per element from the parsed material name (Muscle=1.0, Bone=2.3,
Marrow=0.92 in normalized units), and a CPU radiation BC handler
is registered on top of the base class. All four outer-boundary
sides remap to SELF_BC_RADIATION via ResetBoundaryConditionType.

While bringing the example up I found the transfinite interpolation
was reversing curve samples for the North and West sides. HOHQMesh's
edgeMap (1,2;2,3;4,3;1,4) already runs sides 3 and 4 in SELF's +ξ
and +η directions, so my reversal was folding curved elements at
material interfaces (62 of 973 elements had Jacobians that changed
sign). With the reversal removed, every Jacobian is positive and
the simulation runs stably: entropy decays monotonically from
2.83e-7 to 1.97e-7 over 500 RK3 steps, c is preserved bitwise at
the material values, and the acoustic pulse spreads cleanly through
the heterogeneous medium and out through the radiation boundary.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The GPU Mesh2D duplicates host-array allocations inline rather than
delegating to the parent Init/Free, so the new elemMaterial and
materialNames fields would be left unallocated on a GPU build —
Read_HOHQMesh's deallocate / assignment would then segfault.

Replicate the parent's allocation pattern: allocate a single
"default" material entry covering every element in Init, and
deallocate the pair in Free.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented May 13, 2026

Codecov Report

❌ Patch coverage is 92.19331% with 21 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/SELF_Mesh_2D_t.f90 91.21% 21 Missing ⚠️

📢 Thoughts on this report? Let us know!

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@fluidnumericsJoe
Copy link
Copy Markdown
Member Author

solutions are looking good on MI210 gpus. Just a simple gaussian bump in the initial pressure propagates through the different regions with jump discontinuities in the sound speed. Exterior boundaries are using the radiation boundary condition.

untitled.mp4

Once we have the formatting cleaned up, we should be good to go here. Our cpu-only node on the cluster is down at the moment causing the buildkite pipeline to hang, so I might adjust that as well

fluidnumericsJoe and others added 2 commits May 14, 2026 11:15
The 2-D linear-Euler model with variable sound speed (variable 5)
was using LLF / Rusanov with cmax = max(c_L, c_R). On a curved
multi-material mesh with discontinuous c across element interfaces
this is non-conservative for the pressure equation (each side's
flux uses its own c^2; the central average mismatches both) and at
high polynomial order the resulting aliasing instability blows the
solution up the first time an acoustic wave encounters the c jump.

Replaced with the exact (characteristic-decomposition) Riemann flux
for linear acoustics with variable c. The normal-flux Jacobian
eigenstructure is

    +c : right-going acoustic   W_+ = rho0*c*u_n + p
    -c : left-going  acoustic   W_- = -rho0*c*u_n + p
     0 : entropy density        W_0 = rho' - p/c^2
     0 : tangential vorticity   u_t

Upwinding each mode by its characteristic direction across the face
(W_+|* = W_+|_L, W_-|* = W_-|_R) yields the impedance-matched
interface state

    u_n* = (Z_L u_n,L + Z_R u_n,R + (p_L - p_R)) / (Z_L + Z_R)
    p*   = (Z_R p_L + Z_L p_R + Z_L Z_R (u_n,L - u_n,R)) / (Z_L + Z_R)

with Z = rho0*c. This is exact upwind / Godunov for the linearised
acoustic system and reduces correctly to Fresnel-style reflection /
transmission at a c-discontinuity. The pressure flux uses an averaged
c^2 as a pragmatic treatment of the non-conservative product
rho0*c^2*div(v) at the face; a fully path-conservative (Castro-Pares)
treatment would use each side's own c^2 in its surface integral.

Verified on the BoneAndMarrow example with N=7 controlDegree on
GAUSS quadrature, dt=1e-4, endtime=40. The previous LLF scheme
diverged at t~4 (the first wave-bone encounter) at N=5 and N=7
regardless of dt; the new flux runs cleanly through 40 sim seconds
with pressure remaining bounded and entropy monotonically decreasing
as the disturbance radiates out of the domain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ctually runs

The example overrode AdditionalInit and chained to AdditionalInit_LinearEuler2D_t
(the CPU base). On a GPU build that bypassed AdditionalInit_LinearEuler2D
(the GPU parent), so the GPU radiation BC wrapper hbc2d_Radiation_LinearEuler2D_GPU_wrapper
was never registered. The example's own CPU radiation handler ran instead,
modifying host buffers that the device pipeline never reads — leaving
solution%extBoundary_gpu at outer boundary faces with c = 0.

With c_R = 0 the impedance-matched Riemann flux degenerates: u_n* picks
up the entire interior pressure (p_L / Z_L), p* collapses to zero. That's
not a radiation condition — it's effectively a soft-wall reflector
that inverts the wave amplitude. Visible in slices as strong reflections
off the outer disk boundary.

Fix is to remove the AdditionalInit override (and the now-redundant CPU
radiation handler hbc2d_Radiation_lineareuler2d_boneandmarrow). The
example type inherits AdditionalInit from lineareuler2d, which is the
GPU parent's AdditionalInit_LinearEuler2D. That registers both the GPU
no_normal_flow wrapper and the GPU radiation wrapper, and the wrapper
correctly preserves c on the device-side extBoundary buffer.

On the 4-radius disk with the Gaussian pulse, the bounded amplitude
after 40 simulation seconds drops by a factor of ~3000 in peak |P| vs
the broken-BC run, and visible boundary reflections vanish. The viz mp4
size shrinks from 1.3 MB to 477 KB — most of the wave content is
genuinely radiating out instead of bouncing back.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@fluidnumericsJoe
Copy link
Copy Markdown
Member Author

Looking at that video, there's a fair amount of reflection of the wave off the domain boundary. This shouldn't happen for radiation boundary conditions; at least not to that extent. Dug into this a bit and it looked like the AdditionalInit method for the extended LinearEuler class didn't assign gpu accelerated kernels for the boundary conditions. This caused the external state for the sound speed to be effectively zero leading to the strong reflection.

Additionally, there was some artifacting at the edges where the sound speed transitioned from one value to another. This completely unstable for higher order approximations. The riemann boundary flux was updated to reconstruct from a diagonalization of the normal flux and using appropriate upwinded stated. This fixed the stability issue for higher order approximations.

@fluidnumericsJoe
Copy link
Copy Markdown
Member Author

bone_pressure_N7_fixed_bc.mp4

Attached is a video of the same simulation but with the radiation boundary condition and riemman solver fixes. Waves are properly radiated from the domain without the need for a sponge layer. Only major improvement for robust radiation conditions would be a perfectly matched layer setup, but that is out of scope for this PR

@fluidnumericsJoe fluidnumericsJoe merged commit b76d8fd into main May 14, 2026
11 checks passed
@fluidnumericsJoe fluidnumericsJoe deleted the feature/variable-sound-speed-lineareuler2d branch May 14, 2026 19:55
fluidnumericsJoe added a commit that referenced this pull request May 16, 2026
Reflect the recent changes that landed on main with #123 and the
extensible BC system from #121:

- Solution vector is now 5 variables with c carried as a per-node
  solution variable; flux table and entropy expression updated to
  match the code (P/rho0 momentum flux, p^2/(rho0 c^2) entropy, zero
  flux for c).
- Riemann solver section replaced with the impedance-matched
  characteristic-decomposition flux including the eigenstructure
  list, the u_n* / p* formulas, and the rationale for replacing
  local Lax-Friedrichs (over-dissipation and aliasing instability
  at material interfaces).
- Boundary-condition section rewritten around hyperbolicBCs%
  RegisterBoundaryCondition; bcids example uses the SELF_BC_*
  constants. Notes call out that custom prescribed handlers must
  also fill solution%extBoundary(:,:,:,5).
- Added a "Setting the sound speed" subsection and the
  SphericalSoundWave(c=...) signature.
- Examples list now includes linear_euler2d_boneandmarrow.f90 and
  documents that the planewave examples carry their own c
  attribute. Doc link to BoundaryConditions.md fixed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fluidnumericsJoe added a commit that referenced this pull request May 16, 2026
* Add Points module for off-grid scalar sampling

Introduces a Points class that holds a cloud of physical coordinates,
locates each point inside a SEMQuad/SEMHex element via spatial-hash +
Newton inverse-map, and samples MappedScalar2D/3D fields at the located
reference coordinates. This enables probes, observation operators, and
arbitrary off-grid diagnostics that previously had no in-tree primitive.

The Newton step uses the covariant Jacobian dxds interpolated to xi via
high-order Lagrange basis; the inverse is closed-form (2x2 / 3x3).
Search is rank-local: points whose element is not owned by this rank
get the sentinel elements(p) = 0 and sample to zero.

Per-point Lagrange basis values are cached on the Points object during
LocatePoints (a function of reference coords only), so repeated calls
to EvaluateScalar skip basis evaluation and run only the tensor
contraction. EvaluateScalar falls back to on-the-fly basis evaluation
if the cache is stale or absent.

Follows the existing _t + cpu/gpu split (gpu wrapper is structural).
All 96 serial tests pass; new tests cover 2D/3D round-trip accuracy
and the off-mesh sentinel path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Test both cached and fallback paths in Points EvaluateScalar

Adds points_eval_cache_2d and points_eval_cache_3d, each of which
runs EvaluateScalar twice on the same Points cloud: once with the
basis cache populated by LocatePoints, then again after manually
clearing nCached so the on-the-fly Lagrange evaluation runs instead.
Asserts both paths produce bitwise-equivalent values and that both
match the analytic reference. Without these, the fallback branch
in EvaluateScalar was dead code at test time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add GPU backend for Points scalar evaluation

src/gpu/SELF_Points.f90 replaces the structural-only stub with a
device-resident wrapper. Init allocates device buffers for the
per-point state (elements, coordinates), and UpdateDevice lazily
allocates and fills the device-side cached basis once LocatePoints
determines the polynomial degree. The point search itself stays on
the host (it is irregular and one-shot per Points cloud); only the
hot path EvaluateScalar moves to the GPU.

src/gpu/SELF_Points.cpp implements EvalScalarPoints_2D_gpu and
EvalScalarPoints_3D_gpu. Both launch one thread per (point, variable)
and consume only the precomputed lS/lT[/lU] cache — there is no
Lagrange-polynomial evaluation in the kernel, matching the design
goal. Layouts are column-major to mirror Fortran.

src/SELF_Points_t.f90: the LocatePoints/EvalScalar specifics on the
base class are now `procedure, public` so the GPU child can override
LocatePoints_2D_Points_t / LocatePoints_3D_Points_t (each override
runs the inherited host search then calls UpdateDevice). The GPU
Points adds device-output EvalScalar specifics under the same
generic name, so callers can request either a host array or a
caller-allocated device pointer.

Validation: the CPU path is unchanged and all 5 points tests still
pass. The GPU path has not been exercised — this machine has no
HIP/CUDA toolchain; the new code requires a GPU host to compile
(SELF_ENABLE_GPU=ON) and run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add GPU EvaluateScalar tests for Points

points_eval_gpu_2d / _3d exercise the device-output EvaluateScalar
path: allocate a device output buffer via hipMalloc, dispatch the
generic on a c_ptr argument (selects the GPU specific), copy the
result back with hipMemcpy, and assert agreement with both the
host-cached path (call this%Points_t%EvaluateScalar on the same
field) and the analytic reference.

Both tests are gated by `#ifdef ENABLE_GPU`. On CPU builds the body
is a no-op and the test passes trivially, so the suite stays green
in both configurations from a single test file.

Cannot run these on this machine (no HIP/CUDA toolchain). The CPU
no-op path is verified; the full GPU path needs validation on a GPU
host.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Document SELF_Points off-grid sampling

Adds docs/Learning/PointSampling.md and wires it into mkdocs nav
under Learning > Code. Covers the data layout, the spatial-hash +
Newton inverse-map algorithm with the index conventions used
throughout SELF, the per-point basis cache (why and how), MPI
rank-local semantics, off-mesh sentinel behavior, and end-to-end
CPU and GPU usage examples. Includes the LaTeX expressions for the
isoparametric map and the tensor-product sampling formula so the
math stays close to the API description.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Change attributes to type pointer to support gpu memcpy's

* Refresh LinearEuler2D model docs

Reflect the recent changes that landed on main with #123 and the
extensible BC system from #121:

- Solution vector is now 5 variables with c carried as a per-node
  solution variable; flux table and entropy expression updated to
  match the code (P/rho0 momentum flux, p^2/(rho0 c^2) entropy, zero
  flux for c).
- Riemann solver section replaced with the impedance-matched
  characteristic-decomposition flux including the eigenstructure
  list, the u_n* / p* formulas, and the rationale for replacing
  local Lax-Friedrichs (over-dissipation and aliasing instability
  at material interfaces).
- Boundary-condition section rewritten around hyperbolicBCs%
  RegisterBoundaryCondition; bcids example uses the SELF_BC_*
  constants. Notes call out that custom prescribed handlers must
  also fill solution%extBoundary(:,:,:,5).
- Added a "Setting the sound speed" subsection and the
  SphericalSoundWave(c=...) signature.
- Examples list now includes linear_euler2d_boneandmarrow.f90 and
  documents that the planewave examples carry their own c
  attribute. Doc link to BoundaryConditions.md fixed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Migrate Points_t per-point arrays from allocatable to pointer

Changes the elements / coordinates / lS_cache / lT_cache / lU_cache
members of Points_t from allocatable to pointer (with => null()
default-initialization), aligning with the SELF data-type convention
already used by Scalar2D_t, Vector2D_t, Tensor2D_t, etc.

Surrounding code updated to match the new attribute:
- allocated(this%foo) -> associated(this%foo) wherever foo is one of
  the now-pointer members (Free, LocatePoints cache rebuild,
  EvaluateScalar useCache, GPU Free, UpdateDevice).
- After deallocate, each pointer is explicitly nullified so Free is
  idempotent and a second call is safe.
- Tests (points_eval_cache_2d / _3d) updated similarly where they
  probe pts%lS_cache et al.

this%x stays allocatable since it's user-input shape and not handed
to any GPU c_loc; no behavioural change there.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant