@Taepper Taepper commented Oct 30, 2025

Reduces the impact of #1034.

Summary

We were already using mimalloc indirectly through Arrow (e.g. when Arrow creates RecordBatches). Now we use mimalloc for all allocations in SILO.

I tested this by starting an instance with Loculus data and stress-testing the sequence download:

Before:

alexander@dev-1:~/LAPIS-SILO$ monitor_rss
Monitoring RSS of process 177715...
Timestamp, PID, RSS (KB)
2025-10-30 11:07:48, 177715, 61696
2025-10-30 11:07:49, 177715, 61696
2025-10-30 11:07:50, 177715, 437364
2025-10-30 11:07:51, 177715, 517664
2025-10-30 11:07:52, 177715, 467140
2025-10-30 11:07:53, 177715, 513908
2025-10-30 11:07:54, 177715, 468708
2025-10-30 11:07:55, 177715, 468708
2025-10-30 11:07:56, 177715, 468708
2025-10-30 11:07:57, 177715, 465888
2025-10-30 11:07:58, 177715, 440084
2025-10-30 11:07:59, 177715, 425920
2025-10-30 11:08:00, 177715, 460664
2025-10-30 11:08:01, 177715, 463024
2025-10-30 11:08:02, 177715, 483336
2025-10-30 11:08:03, 177715, 433832
2025-10-30 11:08:05, 177715, 491012
2025-10-30 11:08:06, 177715, 471748

After:

alexander@dev-1:~/LAPIS-SILO$ monitor_rss
Monitoring RSS of process 236049...
Timestamp, PID, RSS (KB)
2025-10-30 11:15:55, 236049, 61824
2025-10-30 11:15:56, 236049, 61824
2025-10-30 11:15:57, 236049, 254328
2025-10-30 11:15:58, 236049, 243360
2025-10-30 11:15:59, 236049, 315176
2025-10-30 11:16:00, 236049, 259712
2025-10-30 11:16:01, 236049, 269188
2025-10-30 11:16:02, 236049, 259024
2025-10-30 11:16:03, 236049, 223536

I let the stress test run for more than 10 minutes, and memory did not rise above that peak.
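
The `monitor_rss` helper used above is not part of this PR, and its implementation is not shown. As a rough sketch of how the same number could be sampled on Linux, assuming the value is taken from the `VmRSS` field of `/proc/<pid>/status`, the function names `parseRssKb` and `readRssKb` below are hypothetical:

```cpp
#include <cstdint>
#include <fstream>
#include <istream>
#include <optional>
#include <sstream>
#include <string>

// Extracts the VmRSS value (in KB) from the text of a /proc/<pid>/status file.
// Returns nullopt if the field is missing or cannot be parsed.
std::optional<std::uint64_t> parseRssKb(std::istream& status) {
   std::string line;
   while (std::getline(status, line)) {
      // rfind(prefix, 0) == 0 is a "starts with" check
      if (line.rfind("VmRSS:", 0) == 0) {
         std::istringstream fields(line.substr(6));
         std::uint64_t kb = 0;
         if (fields >> kb) {
            return kb;
         }
         return std::nullopt;
      }
   }
   return std::nullopt;
}

// Reads the current RSS of a process; returns nullopt if the file is unavailable
// (for example on non-Linux systems or if the process has exited).
std::optional<std::uint64_t> readRssKb(int pid) {
   std::ifstream status("/proc/" + std::to_string(pid) + "/status");
   if (!status) {
      return std::nullopt;
   }
   return parseRssKb(status);
}
```

Calling `readRssKb(pid)` once per second and printing a timestamp next to the result would give output in the same shape as the tables above.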

github-actions bot commented Oct 30, 2025

This is a preview of the changelog of the next release. If this branch is not up-to-date with the current main branch, the changelog may not be accurate. Rebase your branch on the main branch to get the most accurate changelog.

Note that this might contain changes that are on main, but not yet released.

Changelog:

0.9.2 (2025-11-10)

Bug Fixes

  • silo: reduce peak memory load by changing default allocator and tuning mimalloc options (fe1db01)


if (soft_memory_limit_in_kb.has_value() && rss.value() > soft_memory_limit_in_kb.value()) {
   SPDLOG_INFO("Manually invoking malloc_trim() to give back memory to OS.");
   malloc_trim(0);

Contributor

Why are you removing this in the non-mimalloc case (making benchmarking numbers incomparable)?

Collaborator Author

But this is also invoked during benchmarking or am I missing something?
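
One way to keep the soft-limit path present for both allocators is to dispatch on the allocator at compile time. This is only a sketch and not what the PR does: `SILO_USE_MIMALLOC` is the macro that appears in this diff, while `exceedsSoftLimit`, `releaseUnusedMemory`, and `trimIfOverSoftLimit` are hypothetical names. It assumes glibc's `malloc_trim(0)` for the default allocator and mimalloc's `mi_collect(true)` as the closest counterpart.

```cpp
#include <cstdint>
#include <optional>

#ifdef SILO_USE_MIMALLOC
#include <mimalloc.h>
#elif defined(__GLIBC__)
#include <malloc.h>
#endif

// Pure decision logic, kept separate so it can be tested without touching the allocator.
bool exceedsSoftLimit(
   std::optional<std::uint64_t> rss_kb,
   std::optional<std::uint64_t> soft_limit_kb
) {
   return rss_kb.has_value() && soft_limit_kb.has_value() && *rss_kb > *soft_limit_kb;
}

// Asks whichever allocator is compiled in to hand unused memory back to the OS.
// Does nothing if neither mimalloc nor glibc is in use.
void releaseUnusedMemory() {
#ifdef SILO_USE_MIMALLOC
   mi_collect(true);
#elif defined(__GLIBC__)
   malloc_trim(0);
#endif
}

// Returns true if the limit was exceeded and a release was requested.
bool trimIfOverSoftLimit(
   std::optional<std::uint64_t> rss_kb,
   std::optional<std::uint64_t> soft_limit_kb
) {
   if (!exceedsSoftLimit(rss_kb, soft_limit_kb)) {
      return false;
   }
   releaseUnusedMemory();
   return true;
}
```

With this shape, benchmark runs with and without mimalloc would go through the same soft-limit check, which is the comparability concern raised above.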

void operator>>(std::ostream& output) {
   for (size_t i = 0; i < BATCH_SIZE; ++i) {
      output << streams[i].rdbuf();
      writeChunked(output, std::move(streams[i]).str());

Contributor

I guess this change is unrelated to mimalloc and you made it while trying to find out where the leak comes from?


"hwloc/*:shared": False,

"mimalloc/*:override": True,

Contributor

What does this do? (Can't find it on https://conan.io/center/recipes/mimalloc)

#ifdef SILO_USE_MIMALLOC
// If this option is not set, memory remains very high even when no requests are sent
// Also reduces peak memory usage under concurrency
mi_option_set(mi_option_purge_delay, 0);

Contributor

This is worrisome with regard to performance: I guess it fixes the "leaking", but at the cost of speed: "Setting N to 0 purges immediately when a page becomes unused which can improve memory usage but also decreases performance." (https://github.com/microsoft/mimalloc/blob/main/doc/mimalloc-doc.h)
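
If the memory/speed trade-off needs to be measured, the purge delay could be made configurable instead of hard-coded. The following is a sketch under assumptions, not part of this PR: the environment variable `SILO_MIMALLOC_PURGE_DELAY_MS` and the function names are hypothetical, while `mi_option_set` and `mi_option_purge_delay` are the calls already used in the diff. The accepted range follows the mimalloc documentation, where the value is a delay in milliseconds, `0` purges immediately, and `-1` disables purging.

```cpp
#include <cerrno>
#include <cstdlib>
#include <optional>

#ifdef SILO_USE_MIMALLOC
#include <mimalloc.h>
#endif

// Parses a purge delay in milliseconds.
// Returns nullopt for null, empty, malformed, or out-of-range input (below -1).
std::optional<long> parsePurgeDelayMs(const char* text) {
   if (text == nullptr || *text == '\0') {
      return std::nullopt;
   }
   char* end = nullptr;
   errno = 0;
   const long value = std::strtol(text, &end, 10);
   // Reject overflow, trailing characters such as "10ms", and values below -1
   if (errno != 0 || *end != '\0' || value < -1) {
      return std::nullopt;
   }
   return value;
}

// Reads the hypothetical environment variable and falls back to 0,
// the value this PR hard-codes (purge immediately).
long configuredPurgeDelayMs() {
   return parsePurgeDelayMs(std::getenv("SILO_MIMALLOC_PURGE_DELAY_MS")).value_or(0);
}

// Applies the setting; a no-op when SILO is built without mimalloc.
void applyPurgeDelay() {
#ifdef SILO_USE_MIMALLOC
   mi_option_set(mi_option_purge_delay, configuredPurgeDelayMs());
#endif
}
```

Running the same stress test with the delay set to `0` and to mimalloc's documented default of `10` would show how much throughput the immediate purge costs for this workload.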
