Interpolate laser from file on GPU #1330

AlexanderSinn · 2025-12-22T10:13:00Z

This PR rewrites the laser file reader, which substantially improves performance for a production simulation: in the profiles below, the average total run time drops from about 108 s to about 40 s, and peak pinned memory drops from about 25 GiB to about 305 MiB. Previously, the laser input grid was interpolated onto the full 3D simulation grid by a single CPU core at initialization. This could take a long time and consume a lot of CPU memory even for moderately sized simulations. Now the laser file is read directly into pinned memory, and the GPU interpolates it slice by slice into the main laser array during the first time step. The implementation is similar to that of the plasma density file reader.
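To illustrate the per-slice idea, here is a minimal CPU-only sketch, not the code in this PR. It assumes a uniform Cartesian (xyz) input grid, linear interpolation, and zero fill outside the input grid. None of these are confirmed by the description above, and `InputGrid`, `Stencil`, `locate`, and `interpolate_slice` are hypothetical names. In the actual code the loop over the transverse grid would run as a GPU kernel reading from pinned memory.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical uniform input grid as read from the laser file.
struct InputGrid {
    int nx, ny, nz;            // number of points per axis
    double x0, y0, z0;         // coordinate of the first point per axis
    double dx, dy, dz;         // grid spacing per axis
    std::vector<double> data;  // flat storage, x fastest

    double at(int ix, int iy, int iz) const {
        const std::size_t sx = static_cast<std::size_t>(nx);
        const std::size_t sy = static_cast<std::size_t>(ny);
        return data[ix + sx * (iy + sy * iz)];
    }
};

// Lower/upper neighbor indices and the weight of the upper neighbor along one axis.
struct Stencil { int i0; int i1; double w1; bool inside; };

Stencil locate(double pos, double origin, double spacing, int n) {
    const double s = (pos - origin) / spacing;  // position in index space
    if (s < 0.0 || s > n - 1) return {0, 0, 0.0, false};
    const int i0 = std::min(static_cast<int>(std::floor(s)), n - 1);
    const int i1 = std::min(i0 + 1, n - 1);
    return {i0, i1, s - i0, true};
}

// Interpolate the input grid onto one transverse slice at longitudinal position z.
// Only this slice is allocated, never the full 3D simulation grid.
std::vector<double> interpolate_slice(const InputGrid& in, double z,
                                      int nx_out, int ny_out,
                                      double x0_out, double y0_out,
                                      double dx_out, double dy_out) {
    std::vector<double> slice(static_cast<std::size_t>(nx_out) * ny_out, 0.0);
    const Stencil sz = locate(z, in.z0, in.dz, in.nz);
    if (!sz.inside) return slice;  // assumption: zero outside the input grid

    auto lerp = [](double a, double b, double w) { return a + w * (b - a); };

    for (int j = 0; j < ny_out; ++j) {
        const Stencil sy = locate(y0_out + j * dy_out, in.y0, in.dy, in.ny);
        if (!sy.inside) continue;
        for (int i = 0; i < nx_out; ++i) {
            const Stencil sx = locate(x0_out + i * dx_out, in.x0, in.dx, in.nx);
            if (!sx.inside) continue;
            // Bilinear interpolation in x and y on one z plane of the input grid.
            auto plane = [&](int iz) {
                const double lo = lerp(in.at(sx.i0, sy.i0, iz), in.at(sx.i1, sy.i0, iz), sx.w1);
                const double hi = lerp(in.at(sx.i0, sy.i1, iz), in.at(sx.i1, sy.i1, iz), sx.w1);
                return lerp(lo, hi, sy.w1);
            };
            // Linear interpolation in z between the two neighboring planes.
            slice[i + static_cast<std::size_t>(nx_out) * j] =
                lerp(plane(sz.i0), plane(sz.i1), sz.w1);
        }
    }
    return slice;
}
```

Calling `interpolate_slice` once per slice keeps the working set at one transverse slice plus the input grid, which is the property that removes the full 3D intermediate array.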

I also changed the profiling regions to be less redundant and to actually measure the time used for laser initialization, and I fixed some of the formatting for the other laser initialization types.

PR:

00:00:27 Rank 7 started step 31 at time = 7.755365213e-13 with dt = 2.501730714e-14

TinyProfiler total time across processes [min...avg...max]: 37.76 ... 39.92 ... 41.65

-------------------------------------------------------------------------------------------------------
Name                                                    NCalls  Excl. Min  Excl. Avg  Excl. Max   Max %
-------------------------------------------------------------------------------------------------------
main()                                                       1   0.004496     0.1325      1.006   2.41%
MultiLaser::GetEnvelopeFromFile()                            1  8.577e-06     0.1287     0.1724   0.41%
MultiLaser::InitLaserSlice()                               205          0    0.01663      0.133   0.32%
-------------------------------------------------------------------------------------------------------

Pinned Memory Usage:
------------------------------------------------------------------------------------------------------------------------
Name                                      Nalloc  AvgMem min  AvgMem avg  AvgMem max  MaxMem min  MaxMem avg  MaxMem max
------------------------------------------------------------------------------------------------------------------------
MultiLaser::GetEnvelopeFromFile()              1       0   B      38 MiB     304 MiB       0   B      38 MiB     305 MiB
main()                                       536     434   B     457   B     479   B     480   B     480   B     480   B
------------------------------------------------------------------------------------------------------------------------

Dev:

TinyProfiler total time across processes [min...avg...max]: 105.8 ... 107.8 ... 109.6

-------------------------------------------------------------------------------------------------------
Name                                                    NCalls  Excl. Min  Excl. Avg  Excl. Max   Max %
-------------------------------------------------------------------------------------------------------
main()                                                       1      6.758      60.17      68.09  62.11%
MultiLaser::GetEnvelopeFromFile()                            0          0      7.666      61.33  55.94%
MultiLaser::InitLaserSlice()                               205          0     0.1201     0.9606   0.88%
MultiLaser::GetEnvelopeFromFileHelper()                      0          0  0.0004775    0.00382   0.00%
MultiLaser::InitSliceEnvelope()                            205          0  0.0001499   0.001199   0.00%
-------------------------------------------------------------------------------------------------------

Pinned Memory Usage:
------------------------------------------------------------------------------------------------------------------------
Name                                      Nalloc  AvgMem min  AvgMem avg  AvgMem max  MaxMem min  MaxMem avg  MaxMem max
------------------------------------------------------------------------------------------------------------------------
main()                                       537     361   B    3077 MiB      24 GiB     480   B    3280 MiB      25 GiB
------------------------------------------------------------------------------------------------------------------------

Tested with rt and xyz laser input files

Small enough (< few 100s of lines), otherwise it should probably be split into smaller PRs
Tested (describe the tests in the PR description)
Runs on GPU (basic: the code compiles and runs well with the new module)
Contains an automated test (checksum and/or comparison with theory)
Documented: all elements (classes and their members, functions, namespaces, etc.) are documented
Constified (All that can be const is const)
Code is clean (no unwanted comments)
Style and code conventions (see the bottom of https://github.com/Hi-PACE/hipace) are respected
Proper label and GitHub project, if applicable

AlexanderSinn added 6 commits December 19, 2025 18:36

per slice laser from file interp

9ae777e

add per slice reader

5fc8c74

fix msvc

a992311

add static_cast

cf3ad9f

simplify coordinate transform calculation

dcdc7dd

fix

137b6ba

AlexanderSinn changed the title ~~Per slice laser file reader~~ Interpolate laser from file on GPU Dec 22, 2025

AlexanderSinn added the labels GPU (Related to GPU acceleration), performance (optimization, benchmark, profiling, etc.), and component: laser envelope (About the laser envelope solver) Dec 22, 2025

AlexanderSinn requested a review from MaxThevenet December 23, 2025 09:23
