
Solid Particle Implementation for E-L Solver#1301

Draft
jaguilar37 wants to merge 9 commits into MFlowCode:master from jaguilar37:MovingParticlesFresh-Final

Conversation

@jaguilar37

@jaguilar37 jaguilar37 commented Mar 11, 2026

Description

This update expands upon the pre-existing (in development) E-L solver for bubble dynamics to include solid particle dynamics. This is in support of the PSAAP center, which requires the capability to model solid particle dynamics in MFC.

Type of change

  • New feature
  • Refactor

Testing

The solver has been tested by running various 2D/3D problems involving fluid-particle interactions, such as spherical blasts surrounded by a layer of particles, shock-particle curtains, collision tests, etc.

The inputs to the EL solid particle solver have each been toggled on and off to verify that they work independently of each other, and together.

The code has been tested for CPU and GPU usage. The GPU usage has been tested on Tuolumne.

Two new files have been added:

m_particles_EL.fpp
m_particles_EL_kernels.fpp
File 1 contains the main particle dynamics subroutines: it initializes the particles, computes fluid forces and coupling terms, computes collision forces, enforces boundary conditions, and writes the data for post-processing.

File 2 contains the Gaussian kernel projection code and the subroutine that computes the force on the particle due to the fluid: the quasi-steady drag force, pressure-gradient force, added-mass force, Stokes drag, and gravitational force. Models for the quasi-steady drag are implemented here.
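For context, a minimal sketch of a normalized 3-D Gaussian smearing weight of the kind such a kernel projection typically uses; the function name, signature, and normalization here are illustrative assumptions, not the actual routine in m_particles_EL_kernels.fpp, which may use a compact-support cutoff:

```fortran
! Hypothetical sketch: weight of a particle at radial distance dist from a
! cell center, for smearing stdev sigma. Normalization is (2*pi*sigma**2)**(3/2).
pure function f_gaussian_weight(dist, sigma) result(w)
    real(wp), intent(in) :: dist, sigma
    real(wp) :: w
    w = exp(-0.5_wp*(dist/sigma)**2)/(sigma*sqrt(2._wp*acos(-1._wp)))**3
end function f_gaussian_weight
```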

Checklist

  • I added or updated tests for new behavior
  • I updated documentation if user-facing behavior changed

See the developer guide for full coding standards.

GPU changes (expand if you modified src/simulation/)
  • GPU results match CPU results
  • Tested on NVIDIA GPU or AMD GPU

@sbryngelson sbryngelson force-pushed the MovingParticlesFresh-Final branch from a2bf4ac to e527f8f Compare March 16, 2026 20:08
@github-actions

github-actions bot commented Mar 16, 2026

Claude Code Review

Incremental review from: e527f8f
Head SHA: 069237f

Previously-flagged issues addressed in this update: save variable, bare float literals in all four drag functions, kahan_comp dead code, weights_*_grad missing deallocations, stage missing intent, 5.0d-11 literal, bub_pp broadcast restoration, s_mpi_reduce_int_sum non-MPI path, s_transfer_collision_forces non-MPI path, hardcoded viscosity (conditionally fixed).


New findings since last Claude review

[HIGH] Declaration-after-executable-statement: Fortran syntax error in MPI builds
src/common/m_mpi_common.fpp, around the new bubs_glb = 0 line

The fix initializes bubs_glb before the #ifdef MFC_SIMULATION / #ifdef MFC_MPI-guarded local variable declarations (integer :: ierr, etc.). In a simulation+MPI build, the preprocessor expands to:

bubs_glb = 0           ! ← executable statement
integer :: ierr        ! ← local declaration after executable: Fortran syntax error
integer :: i, j, k, ...

Per the Fortran standard, all declarations must precede the first executable statement in a scoping unit. Move bubs_glb = 0 to after all local variable declarations (i.e., after the #ifdef-guarded block).
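The suggested fix can be sketched as follows (argument list omitted and structure illustrative; only the ordering matters):

```fortran
subroutine s_mpi_reduce_stability_criteria_extrema()
#ifdef MFC_SIMULATION
#ifdef MFC_MPI
    integer :: ierr          ! all local declarations come first
    integer :: i, j, k
#endif
#endif
    bubs_glb = 0             ! executable statements only after the declarations
    ! ...
end subroutine s_mpi_reduce_stability_criteria_extrema
```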


Remaining from prior review (not re-examined here): vL_x/vL_y/vL_z/vR_x/vR_y/vR_z intent(inout)→intent(in), case_validator.py entries for new parameters.

@github-actions

Claude Code Review

Incremental review from: 069237f
Head SHA: 3c49f42

Previously-flagged issues addressed or carried forward: The declaration-after-executable bug (bubs_glb = 0 before integer :: ierr) flagged in the last review is still present in this commit; see [HIGH] below. The prior remaining note about vL_x/vL_y/vL_z/vR_x/vR_y/vR_z intent(inout) → intent(in) is now re-examined and partially confirmed for the new s_gradient_field in m_particles_EL.fpp; the same finding applies there.


Findings since last Claude review

[HIGH] Declaration-after-executable still present — Fortran standard violation
src/common/m_mpi_common.fpp, s_mpi_reduce_stability_criteria_extrema

This was flagged in the previous review and remains unfixed. In an MFC_SIMULATION + MFC_MPI build the preprocessor expands to:

bubs_glb = 0           ! ← executable statement
! ...
integer :: ierr        ! ← local declaration AFTER executable: illegal per Fortran standard

Fix: move bubs_glb = 0 to after the #ifdef MFC_SIMULATION / #ifdef MFC_MPI guard block (i.e., after all local variable declarations), or restructure so the #ifdef-guarded integer :: ierr block comes first.


[MEDIUM] beta_vars allocated but never deallocated in pre_process and post_process

In src/pre_process/m_global_parameters.fpp and src/post_process/m_global_parameters.fpp, beta_vars is allocated (via bare allocate, not @:ALLOCATE) in s_compute_derived_variables when bubbles_lagrange or particles_lagrange is true, but s_finalize_global_parameters_module in both targets has no corresponding deallocation. The simulation target correctly uses @:DEALLOCATE(beta_vars) in its finalizer. The pre_process and post_process finalizers should be made consistent.
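A sketch of the consistent cleanup in the pre_process and post_process finalizers; since beta_vars was allocated there with a bare allocate, a bare deallocate matches (the guard condition mirrors the allocation site and is an assumption):

```fortran
! In s_finalize_global_parameters_module of pre_process and post_process:
if (bubbles_lagrange .or. particles_lagrange) then
    if (allocated(beta_vars)) deallocate (beta_vars)
end if
```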


[MEDIUM] s_gradient_field in m_particles_EL.fpp: vL_field/vR_field declared intent(inout) but only read

src/simulation/m_particles_EL.fpp, ~line 2034:

real(wp), ..., intent(inout) :: vL_field
real(wp), ..., intent(inout) :: vR_field

The implementation only reads these arrays (computes dq = (vR - vL)/dx); it never writes to them. They should be intent(in). This is the same class of issue previously flagged for the analogous variables in m_bubbles_EL.fpp.
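The suggested declarations would then read as follows (the array specs are elided in the finding above, so assumed-shape specs are shown here as a placeholder):

```fortran
! s_gradient_field: these arrays are only read, never written
real(wp), dimension(:, :, :), intent(in) :: vL_field
real(wp), dimension(:, :, :), intent(in) :: vR_field
```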


[LOW] Bare integer literal 3 in mixed real expression, m_bubbles.fpp f_advance_step case (3)

src/simulation/m_bubbles.fpp, ~line 600:

aTemp(l) = 2._wp*f_bTemp/(fmass_g + fmass_v) - 3*fV*fVel(l)/fR

The bare integer 3 is inconsistent with the style rule (use 3._wp) and with the analogous line in m_bubbles_EL.fpp which correctly writes 3._wp*myV*myVel(l)/myR. While Fortran's type promotion makes this numerically correct at wp, the inconsistency is a style violation and should use 3._wp.


[LOW] Typo in NVTX range label: BETA-COMM-SENDRECV-NO-RMDA

src/common/m_mpi_common.fpp, ~line 1344:

call nvtxStartRange("BETA-COMM-SENDRECV-NO-RMDA")

Should be NO-RDMA (the same typo also exists in the pre-existing RHS path at line 968, but this is new code adding it again).


Remaining from prior reviews (not re-examined here): case_validator.py entries for new particle parameters (currently no validation logic for particles parameters beyond definitions.py entries).

@github-actions

Claude Code Review

Incremental review from: 3c49f42
Head SHA: 38f7dd2

Previously-flagged issues addressed in this update: bubs_glb = 0 declaration-after-executable (fixed by moving it into the #ifdef MFC_MPI block), beta_vars missing deallocation in pre/post finalizers (fixed), vL_field/vR_field intent(inout) → intent(in) in s_gradient_field (fixed), bare integer literal 3 → 3._wp in m_bubbles.fpp (fixed), NVTX typo NO-RMDA → NO-RDMA (fixed), inc_ghost / only_beta moved after declarations in m_particles_EL.fpp (fixed).


New findings since last Claude review

[HIGH] Module-level max_dt array: missing deallocation + shadowed by local variable
src/simulation/m_time_steppers.fpp, lines 73 and 741

This commit adds:

! Module level (line 73)
real(wp), allocatable, dimension(:, :, :) :: max_dt

allocated conditionally in the initializer:

if (cfl_dt) then
    @:ALLOCATE(max_dt(0:m, 0:n, 0:p))   ! line 474
end if

but s_finalize_time_steppers_module has no corresponding @:DEALLOCATE(max_dt) — memory and GPU device memory leak whenever cfl_dt is true.

Additionally, inside s_compute_dt (line 741):

real(wp) :: max_dt   ! local scalar — shadows the module-level 3-D array

The GPU parallel loop at lines 766-769 passes this local scalar to s_compute_dt_from_cfl, not the module-level array. The 3-D array is therefore never written or read — it is allocated dead code, and the GPU declaration for it in GPU_DECLARE is superfluous.

Fix: add if (cfl_dt) @:DEALLOCATE(max_dt) to s_finalize_time_steppers_module, and either remove the module-level array (if the scalar per-cell caching was not actually intended) or rename the local scalar so it no longer shadows the module variable.
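If the module-level array is kept, the two fixes can be sketched as (the local name max_dt_loc is an illustrative suggestion):

```fortran
! In s_finalize_time_steppers_module:
if (cfl_dt) then
    @:DEALLOCATE(max_dt)      ! releases host and device memory
end if

! In s_compute_dt: rename the local so it no longer shadows the module array
real(wp) :: max_dt_loc
```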


Remaining from prior reviews (not re-examined here): case_validator.py entries for new particle parameters (no physics-constraint validation logic currently).

@github-actions

github-actions bot commented Mar 17, 2026

Claude Code Review

Incremental review from: 754cc32
Head SHA: 0756c08

Previously-flagged findings from last review: No new issues were found in the prior increment. All previously flagged issues (declaration-after-executable, beta_vars deallocation, vL_field/vR_field intent, bare integer literal, NVTX typo, max_dt shadow/leak, END_GPU_PARALLEL_LOOP terminator, p > 1 → p > 0) were resolved.


New findings since last Claude review

[MEDIUM] Optional type(scalar_field) argument in GPU device routine f_advance_step
src/simulation/m_bubbles.fpp, lines 469–484

f_advance_step is now a GPU device routine ($:GPU_ROUTINE(parallelism='[seq]')) with optional dummy arguments including type(scalar_field), intent(in), dimension(sys_size), optional :: q_prim_vf. scalar_field contains a Fortran pointer component, making it a derived type with pointer components — a class of optional argument that has incomplete support in OpenACC and OpenMP target device routines across all four CI-gated compilers.

The if (bubbles_lagrange) guard correctly prevents accessing absent arguments at runtime (safe for the EE call path without optional args). The existing codebase already uses optional real(wp) args in GPU device routines (s_vflux, f_bpres_dot), but extending this to a derived type with pointer components is higher-risk. The PR description notes successful testing on Tuolumne (Cray/OpenMP), but portability across nvfortran (--gpu acc), Intel ifx, and gfortran builds should be verified in CI.


[LOW] No-op assignments in f_advance_step default case
src/simulation/m_bubbles.fpp, lines 621–625

The case default block of the select case (lag_vel_model) loop contains self-assignments:

case default
    do l = 1, num_dims
        fVel(l) = fVel(l)
        fPos(l) = fPos(l)
    end do

These are no-ops. The block can be removed entirely or replaced with a comment.
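One possible comment-only replacement; since f_advance_step runs on the device, a runtime abort cannot be raised here, so a documented fall-through is the simplest option:

```fortran
case default
    ! Intentionally no update: positions and velocities are left
    ! unchanged for unrecognized lag_vel_model values. Input validation
    ! on the host should reject such values before the solver runs.
```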


[LOW] Unconditional MPI broadcast of hardcoded-IC variables in pre_process
src/pre_process/m_mpi_proxy.fpp, lines 1473–1477

interface_file, normFac, normMag, g0_ic, p0_ic are always broadcast regardless of whether particles_lagrange is set. No correctness impact (values are unused if the feature is off), but they should be guarded by if (particles_lagrange) for consistency with the rest of the file.
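The suggested guard can be sketched using the file's existing MPI_BCAST pattern; the count, datatype, and ierr handling here are illustrative and must match how the rest of m_mpi_proxy.fpp broadcasts each variable (interface_file, being a string, needs a character count and MPI_CHARACTER):

```fortran
if (particles_lagrange) then
    call MPI_BCAST(normFac, 1, mpi_p, 0, MPI_COMM_WORLD, ierr)
    ! ... same pattern for interface_file, normMag, g0_ic, p0_ic ...
end if
```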


Remaining from prior reviews (not re-examined here): case_validator.py physics-constraint validation entries for new particles_lagrange parameters.

@sbryngelson sbryngelson force-pushed the MovingParticlesFresh-Final branch from d973f42 to eb03599 Compare March 17, 2026 23:23
@github-actions

Claude Code Review

Incremental review from: 0756c08
Head SHA: eb03599

Previously-flagged issues addressed in this update: optional type(scalar_field) GPU device routine concern (still present but carried/noted in prior review), case default no-op in f_advance_step (still present — noted again below).


New findings since last Claude review

[HIGH] s_finalize_mpi_proxy_module does not deallocate any of the new particle/bubble MPI buffers
src/simulation/m_mpi_proxy.fpp, s_finalize_mpi_proxy_module (line 1442)

s_initialize_particles_mpi allocates p_send_buff, p_recv_buff, p_send_ids via @:ALLOCATE.
s_initialize_solid_particles_mpi allocates all of the above plus force_send_counts, force_recv_counts, force_send_ids, force_send_vals, flat_send_ids, flat_send_vals via @:ALLOCATE.

The finalizer currently only handles the pre-existing IB buffers:

subroutine s_finalize_mpi_proxy_module()
#ifdef MFC_MPI
    if (ib) then
        @:DEALLOCATE(ib_buff_send, ib_buff_recv)
    end if
#endif
end subroutine s_finalize_mpi_proxy_module

All nine new module-level allocatable variables leak their host and GPU device memory. Add matching @:DEALLOCATE calls guarded by the appropriate bubbles_lagrange / particles_lagrange flags.
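A sketch of the extended finalizer; the flag guards are assumed to mirror the corresponding initializers (p_* buffers allocated for both Lagrangian bubbles and particles, force/flat buffers only for solid particles):

```fortran
subroutine s_finalize_mpi_proxy_module()
#ifdef MFC_MPI
    if (ib) then
        @:DEALLOCATE(ib_buff_send, ib_buff_recv)
    end if
    if (bubbles_lagrange .or. particles_lagrange) then
        @:DEALLOCATE(p_send_buff, p_recv_buff, p_send_ids)
    end if
    if (particles_lagrange) then
        @:DEALLOCATE(force_send_counts, force_recv_counts, force_send_ids)
        @:DEALLOCATE(force_send_vals, flat_send_ids, flat_send_vals)
    end if
#endif
end subroutine s_finalize_mpi_proxy_module
```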


[MEDIUM] flat_send_ids / flat_send_vals allocated with @:ALLOCATE but not in GPU_DECLARE
src/simulation/m_mpi_proxy.fpp, s_initialize_solid_particles_mpi

@:ALLOCATE expands to Fortran allocate plus GPU_ENTER_DATA(create=...), creating unnecessary GPU device copies of two CPU-only temporaries. force_send_counts, force_send_ids, force_send_vals are correctly in GPU_DECLARE (used by s_add_force_to_send_buffer on the device), but flat_send_ids and flat_send_vals are only used in s_transfer_collision_forces on the host. Use plain allocate() for these two variables.
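The suggested change for the two host-only temporaries, with the extents shown as a placeholder since the actual sizes are not given in the finding:

```fortran
! flat_send_ids / flat_send_vals are only touched on the host in
! s_transfer_collision_forces, so a plain allocate avoids the
! GPU_ENTER_DATA device mirror that @:ALLOCATE would create.
allocate (flat_send_ids(max_transfers))   ! max_transfers: placeholder extent
allocate (flat_send_vals(max_transfers))
```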


[LOW] Missing intent on dummy arguments in s_add_particles_to_transfer_list and both init routines
src/simulation/m_mpi_proxy.fpp, lines 576–581 and s_initialize_particles_mpi / s_initialize_solid_particles_mpi

In s_add_particles_to_transfer_list:

impure subroutine s_add_particles_to_transfer_list(nBub, pos, posPrev, include_ghost)
    real(wp), dimension(:, :) :: pos, posPrev   ! no intent
    integer :: bubID, nbub                       ! nbub == nBub (case-insensitive); no intent

pos, posPrev, and nBub (declared via the case-insensitive alias nbub) lack intent declarations. Convention requires explicit intent on all dummy arguments.

Similarly, both s_initialize_particles_mpi and s_initialize_solid_particles_mpi declare lag_num_ts in the argument list without intent(in).
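The convention-compliant declarations can be sketched as follows (the spelling of nBub is unified; include_ghost is omitted because its type is not shown in the finding):

```fortran
! s_add_particles_to_transfer_list:
integer, intent(in) :: nBub
real(wp), dimension(:, :), intent(in) :: pos, posPrev
integer :: bubID                 ! purely local, no intent needed

! s_initialize_particles_mpi / s_initialize_solid_particles_mpi:
integer, intent(in) :: lag_num_ts
```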


[LOW] No-op case default self-assignments in f_advance_step (still present)
src/simulation/m_bubbles.fpp, f_advance_step

Previously flagged; still present:

case default
    do l = 1, num_dims
        fVel(l) = fVel(l)
        fPos(l) = fPos(l)
    end do

These are no-ops. The case default block should be removed or replaced with a comment/error.


@github-actions

github-actions bot commented Mar 18, 2026


@github-actions

github-actions bot commented Mar 19, 2026

Claude Code Review

Incremental review from: 8bd0d15
Head SHA: 978e707

New findings since last Claude review:

  • Missing $:END_GPU_PARALLEL_LOOP() in s_smear_field_contributions (src/simulation/m_particles_EL.fpp): The GPU parallel loop over do k = 1, n_el_particles_loc (the particle Gaussian smearing loop) is not terminated before call s_populate_beta_buffers(...) and the subsequent post-processing. For GPU builds (OpenACC or OpenMP target offload), the MPI communication and void-fraction clamping code after end do would be incorrectly included inside the GPU parallel region, producing either a compiler error or incorrect runtime behavior. Every other GPU_PARALLEL_LOOP in this file has a matching $:END_GPU_PARALLEL_LOOP().
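The required structure can be sketched as follows (the private list and post-loop call are shown schematically, following the pattern of the other loops in the file):

```fortran
$:GPU_PARALLEL_LOOP(private='[k]')
do k = 1, n_el_particles_loc
    ! Gaussian smearing of particle k onto the Eulerian fields
end do
$:END_GPU_PARALLEL_LOOP()    ! must close the device region here,
                             ! before the host-side MPI communication
call s_populate_beta_buffers(...)
```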

  • Junk separator comment !!!!!!!!!!!!!!!!!!! (src/simulation/m_particles_EL.fpp, near the end of s_smear_field_contributions): This comment is a visual separator and falls under the "junk pattern" rule enforced by toolchain/mfc/lint_source.py. It will cause ./mfc.sh precheck to fail.

  • Silent behavioral change: qs_drag_model values 2 and 3 swapped (docs/documentation/case.md, src/simulation/m_particles_EL_kernels.fpp): The documentation update explicitly reassigns model IDs — old model 2 (Modified Parmar) becomes new model 3, and old model 3 (Osnes full correlation) becomes new model 2. Since m_particles_EL_kernels.fpp is in the changed-files list and the documentation reflects an intentional renumbering, any existing case file using qs_drag_model=2 or qs_drag_model=3 will silently compute with a different drag model after this change. There is no runtime warning or migration path in the diff.
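One possible migration aid, which is hypothetical and not part of the PR, would be a startup notice for the renumbered IDs:

```fortran
! Hypothetical: warn users whose case files predate the renumbering.
if (qs_drag_model == 2 .or. qs_drag_model == 3) then
    print '(a)', 'NOTE: qs_drag_model IDs 2 and 3 were swapped: 2 now '// &
        'selects the Osnes full correlation, 3 the Modified Parmar model.'
end if
```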

@sbryngelson sbryngelson force-pushed the MovingParticlesFresh-Final branch 5 times, most recently from 82cd581 to dad92d8 Compare March 26, 2026 19:40
@sbryngelson sbryngelson force-pushed the MovingParticlesFresh-Final branch from dad92d8 to 61a4b80 Compare March 27, 2026 00:56
sbryngelson and others added 7 commits March 27, 2026 10:47
1.
Fixes volume fraction and source term smearing. Previously, the two were combined in a convoluted way: the volume fraction field needs to be computed and communicated at the start of the timestep, while the source term contributions need to be computed and communicated after computing the fluid force on the particle. These are now split cleanly.

2.
The filling of the buffer cells now uses the sum-and-replace algorithm (implemented by Ben W.). It was further modified to take an array of the indices of the variables in q_particles/q_beta that the algorithm should update, so that the volume fraction update could be split from the source term contribution update.

3.
The collision force parameters are now defined in the inputs in the particle physical properties. Includes:

particle_pp%ksp_col, particle_pp%nu_col, particle_pp%E_col, particle_pp%cor_col. These must be set if collisions are turned on.

The collision forces are no longer communicated. Instead, each local particle is looped over and checked for overlap with its neighbors, and only the collision force on that local particle is accumulated onto it. This avoids communicating forces between ranks.

4.
The Sutherland viscosity for air is hardcoded into the force subroutine if "viscous" is turned off. This is a temporary band-aid. Use lag_params%mu_ref = 1.716E-5.
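For reference, the standard Sutherland's law for air that such a fallback presumably evaluates, using the usual reference values (mu_ref = 1.716e-5 Pa s at T_ref = 273.15 K, Sutherland constant S = 110.4 K); the variable names here are illustrative:

```fortran
! Sutherland's law: mu(T) = mu_ref * (T/T_ref)**1.5 * (T_ref + S)/(T + S)
mu = lag_params%mu_ref*(T/273.15_wp)**1.5_wp &
     *(273.15_wp + 110.4_wp)/(T + 110.4_wp)
```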

5.
Implements the quasi-steady drag fluctuation force (logical input: lag_params%qs_fluct_force).
The subroutine lives in the kernels file and uses the random number generator in src/common/m_model.fpp, which was modified to make the random number generator subroutine public.

6.
The documentation for qs_drag_model was corrected. New documentation was added for the collision force inputs, the quasi-steady fluctuation force, and the fluid (air) reference viscosity.
I removed the collision force sending arrays in mpi_proxy and added a check in the deallocation for a processor count greater than 0. This fixes a bug when running MFC with MPI on a single rank.