AlexanderSinn commented Dec 22, 2025

This PR rewrites the laser file reader, which greatly improves performance for a production simulation. Previously, a single CPU core interpolated the laser input grid onto the full 3D simulation grid at initialization. This could take a long time and consume a lot of CPU memory, even for moderately sized simulations. Now the laser file is read directly into pinned memory, and the GPU interpolates it slice by slice into the main laser array during the first time step. The implementation is similar to that of the plasma density file reader.
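For readers unfamiliar with the per-slice approach, here is a minimal sketch of the idea in plain C++. It is not the code in this PR: `InputGrid`, `Stencil`, `locate` and `interpolate_slice` are hypothetical names, the values are real instead of complex, the loops run on the CPU where the actual reader would launch a GPU kernel, and positions outside the input grid are clamped to the edge value (an assumption made only to keep the example short).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for the envelope as read from the laser file.
struct InputGrid {
    int nx, ny, nz;            // number of points per axis
    double x0, y0, z0;         // coordinate of the first point per axis
    double dx, dy, dz;         // uniform spacing per axis
    std::vector<double> data;  // nx*ny*nz values, x index fastest

    double at (int i, int j, int k) const {
        return data[(static_cast<std::size_t>(k) * ny + j) * nx + i];
    }
};

// Lower index and linear weight of a position along one axis.
struct Stencil { int lo; double w; };

inline Stencil locate (double pos, double p0, double dp, int n) {
    assert(n >= 2);
    double s = (pos - p0) / dp;
    // Assumption: clamp to the input grid instead of extrapolating.
    s = std::max(0.0, std::min(s, static_cast<double>(n - 1)));
    const int lo = std::min(static_cast<int>(s), n - 2);
    return {lo, s - lo};
}

// Trilinear interpolation of the input grid onto ONE transverse slice of
// the simulation grid at longitudinal position z. Only nx*ny output
// values exist at a time; the full 3D simulation grid is never built.
void interpolate_slice (const InputGrid& in, double z,
                        int nx, int ny, double x0, double y0,
                        double dx, double dy, std::vector<double>& slice)
{
    slice.resize(static_cast<std::size_t>(nx) * ny);
    const Stencil sz = locate(z, in.z0, in.dz, in.nz);
    for (int j = 0; j < ny; ++j) {
        const Stencil sy = locate(y0 + j * dy, in.y0, in.dy, in.ny);
        for (int i = 0; i < nx; ++i) {
            const Stencil sx = locate(x0 + i * dx, in.x0, in.dx, in.nx);
            double v = 0.0;
            // Sum over the 8 corners of the enclosing input cell.
            for (int c = 0; c < 8; ++c) {
                const int ox = c & 1, oy = (c >> 1) & 1, oz = (c >> 2) & 1;
                const double w = (ox ? sx.w : 1.0 - sx.w)
                               * (oy ? sy.w : 1.0 - sy.w)
                               * (oz ? sz.w : 1.0 - sz.w);
                v += w * in.at(sx.lo + ox, sy.lo + oy, sz.lo + oz);
            }
            slice[static_cast<std::size_t>(j) * nx + i] = v;
        }
    }
}
```

In a slice-by-slice loop, `interpolate_slice` would be called once for each slice as the loop reaches it, so the only large buffer is the input grid itself. That is consistent with the pinned memory numbers below, where the allocation is attributed to `MultiLaser::GetEnvelopeFromFile()` rather than to a full 3D array.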

I also changed the profiling regions so that they are less redundant and actually measure the time spent on laser initialization, and I fixed some of the formatting for the other laser initialization types.

PR:

00:00:27 Rank 7 started step 31 at time = 7.755365213e-13 with dt = 2.501730714e-14

TinyProfiler total time across processes [min...avg...max]: 37.76 ... 39.92 ... 41.65

-------------------------------------------------------------------------------------------------------
Name                                                    NCalls  Excl. Min  Excl. Avg  Excl. Max   Max %
-------------------------------------------------------------------------------------------------------
main()                                                       1   0.004496     0.1325      1.006   2.41%
MultiLaser::GetEnvelopeFromFile()                            1  8.577e-06     0.1287     0.1724   0.41%
MultiLaser::InitLaserSlice()                               205          0    0.01663      0.133   0.32%
-------------------------------------------------------------------------------------------------------

Pinned Memory Usage:
------------------------------------------------------------------------------------------------------------------------
Name                                      Nalloc  AvgMem min  AvgMem avg  AvgMem max  MaxMem min  MaxMem avg  MaxMem max
------------------------------------------------------------------------------------------------------------------------
MultiLaser::GetEnvelopeFromFile()              1       0   B      38 MiB     304 MiB       0   B      38 MiB     305 MiB
main()                                       536     434   B     457   B     479   B     480   B     480   B     480   B
------------------------------------------------------------------------------------------------------------------------

Dev:

TinyProfiler total time across processes [min...avg...max]: 105.8 ... 107.8 ... 109.6

-------------------------------------------------------------------------------------------------------
Name                                                    NCalls  Excl. Min  Excl. Avg  Excl. Max   Max %
-------------------------------------------------------------------------------------------------------
main()                                                       1      6.758      60.17      68.09  62.11%
MultiLaser::GetEnvelopeFromFile()                            0          0      7.666      61.33  55.94%
MultiLaser::InitLaserSlice()                               205          0     0.1201     0.9606   0.88%
MultiLaser::GetEnvelopeFromFileHelper()                      0          0  0.0004775    0.00382   0.00%
MultiLaser::InitSliceEnvelope()                            205          0  0.0001499   0.001199   0.00%
-------------------------------------------------------------------------------------------------------

Pinned Memory Usage:
------------------------------------------------------------------------------------------------------------------------
Name                                      Nalloc  AvgMem min  AvgMem avg  AvgMem max  MaxMem min  MaxMem avg  MaxMem max
------------------------------------------------------------------------------------------------------------------------
main()                                       537     361   B    3077 MiB      24 GiB     480   B    3280 MiB      25 GiB
------------------------------------------------------------------------------------------------------------------------

Tested with rt and xyz laser input files

  • Small enough (< few 100s of lines), otherwise it should probably be split into smaller PRs
  • Tested (describe the tests in the PR description)
  • Runs on GPU (basic: the code compiles and runs well with the new module)
  • Contains an automated test (checksum and/or comparison with theory)
  • Documented: all elements (classes and their members, functions, namespaces, etc.) are documented
  • Constified (All that can be const is const)
  • Code is clean (no unwanted comments)
  • Style and code conventions (listed at the bottom of https://github.com/Hi-PACE/hipace) are respected
  • Proper label and GitHub project, if applicable

AlexanderSinn changed the title from "Per slice laser file reader" to "Interpolate laser from file on GPU" on Dec 22, 2025
AlexanderSinn added the labels "GPU" (Related to GPU acceleration), "performance" (optimization, benchmark, profiling, etc.), and "component: laser envelope" (About the laser envelope solver) on Dec 22, 2025