Conversation

andershermansen (Contributor)

What?

Stream the S3 object directly to the response instead of creating a Buffer in memory, and wire up an abort controller to stop streaming if the user aborts the download.

Why?

To avoid excessive memory usage, and to abort the S3 download if the user has aborted the request anyway.

How?

In a Node environment, the AWS S3 client always returns a Readable. The streamToBuffer method already required this, but the `any` type hid that it was actually needed. Now there is an explicit type check, though it should never trigger in a Node server environment.

Wire up an abort controller to the request so that we also tell the S3 object to stop streaming further if the user aborts.
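The wiring can be sketched roughly like this, using only Node built-ins (the helper name `streamWithAbort` and its shape are illustrative assumptions, not the actual PR code):

```typescript
import { Readable } from "node:stream";

// Hypothetical helper: destroy the upstream S3 stream when the request's
// abort signal fires, so S3 stops sending bytes we will never deliver.
export function streamWithAbort(source: Readable, signal: AbortSignal): Readable {
  if (signal.aborted) {
    // Request is already gone; drop the stream immediately.
    source.destroy();
  } else {
    signal.addEventListener("abort", () => source.destroy(), { once: true });
  }
  return source;
}
```

In the real handler, the source would be the `Body` of the GetObjectCommand response and the signal would come from the incoming request.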

Fixes #10286
It may also help with other issues around S3 and resource usage.

@andershermansen (Contributor Author)

Script to validate memory usage:

```shell
#!/bin/bash

PID=5556

while :; do
  date +"%T"
  ps -o rss=,vsz= -p "$PID" | awk '{printf "RSS: %.1fMB  VSZ: %.1fMB\n", $1/1024, $2/1024}'
  sleep 1
done
```

Downloading a 50MB PDF file twice.

Output from script when running a version without the fix:

```
11:14:47
RSS: 256.2MB  VSZ: 444977.0MB
11:14:48
RSS: 256.2MB  VSZ: 444977.0MB
11:14:49
RSS: 256.2MB  VSZ: 444977.0MB
11:14:50
RSS: 263.6MB  VSZ: 444979.7MB
11:14:51
RSS: 272.1MB  VSZ: 444995.3MB
11:14:52
RSS: 281.4MB  VSZ: 445131.5MB
11:14:53
RSS: 291.6MB  VSZ: 445148.0MB
11:14:54
RSS: 302.3MB  VSZ: 445161.2MB
11:14:55
RSS: 400.2MB  VSZ: 445253.4MB
11:14:56
RSS: 404.5MB  VSZ: 445253.4MB
11:14:57
RSS: 400.6MB  VSZ: 445245.4MB
11:14:58
RSS: 401.6MB  VSZ: 445245.4MB
11:14:59
RSS: 407.3MB  VSZ: 445270.2MB
11:15:01
RSS: 405.7MB  VSZ: 445262.7MB
11:15:02
RSS: 390.0MB  VSZ: 445374.7MB
11:15:03
RSS: 390.0MB  VSZ: 445374.7MB
```

Memory usage goes from 256MB up to over 400MB when downloading the file twice.

Output from script when running with the fix applied:

```
11:16:33
RSS: 274.8MB  VSZ: 445001.2MB
11:16:34
RSS: 274.8MB  VSZ: 445001.2MB
11:16:35
RSS: 274.8MB  VSZ: 445001.2MB
11:16:36
RSS: 274.8MB  VSZ: 445001.2MB
11:16:37
RSS: 282.0MB  VSZ: 445003.5MB
11:16:38
RSS: 288.2MB  VSZ: 445007.0MB
11:16:39
RSS: 290.2MB  VSZ: 445135.3MB
11:16:41
RSS: 290.4MB  VSZ: 445263.3MB
11:16:42
RSS: 290.5MB  VSZ: 445263.3MB
11:16:43
RSS: 290.8MB  VSZ: 445391.3MB
11:16:44
RSS: 294.3MB  VSZ: 445395.5MB
11:16:45
RSS: 294.4MB  VSZ: 445395.5MB
11:16:46
RSS: 294.4MB  VSZ: 445395.5MB
11:16:47
RSS: 294.4MB  VSZ: 445395.5MB
11:16:48
RSS: 294.5MB  VSZ: 445395.5MB
11:16:49
RSS: 294.5MB  VSZ: 445395.8MB
```

Memory only goes from 274MB up to 294MB.

As you can see, this is a significant memory and resource saving.

@DanRibbens DanRibbens requested a review from Copilot August 12, 2025 15:45

@Copilot Copilot AI left a comment


Pull Request Overview

This PR optimizes S3 file serving by streaming S3 objects directly to HTTP responses instead of buffering them in memory, while adding abort controller support to properly handle client disconnections.

  • Removes the streamToBuffer function that was creating full file buffers in memory
  • Adds abort controller integration to stop S3 downloads when clients abort requests
  • Improves type safety for the stream detection helper function
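For context, the removed buffering pattern can be reconstructed roughly like this (an illustrative sketch, not the exact deleted code):

```typescript
import { Readable } from "node:stream";

// Illustrative version of the removed pattern: the whole object is
// accumulated in memory before a single Buffer is returned.
async function streamToBuffer(stream: Readable): Promise<Buffer> {
  const chunks: Buffer[] = [];
  for await (const chunk of stream) {
    chunks.push(Buffer.from(chunk));
  }
  return Buffer.concat(chunks); // peak memory ≈ full object size
}
```

With a 50MB PDF, this holds at least 50MB per concurrent download, which matches the RSS growth measured above.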

```typescript
})

streamed = true
return new Response(stream, { headers, status: 200 })
```

Copilot AI Aug 12, 2025


Returning a Node.js Readable stream directly to the Response constructor may not work in all environments. Consider checking whether the runtime supports streaming responses, or provide a fallback mechanism.


andershermansen (Contributor Author)

We should be safe with newer Node versions, but I'm not sure about other edge runtimes.
We could wrap it in Readable.toWeb(stream), but I'm not sure about the upsides/downsides of that.
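A minimal sketch of the wrapping in question, using only Node built-ins (behavior taken from the Node docs, not from the PR; `Readable.toWeb` requires Node 17+ and the global `Response` requires Node 18+):

```typescript
import { Readable } from "node:stream";

// Convert a Node Readable into a web ReadableStream before handing it
// to the WHATWG Response constructor.
const nodeStream = Readable.from([Buffer.from("hello")]);
const webStream = Readable.toWeb(nodeStream);
const response = new Response(webStream as unknown as ReadableStream, { status: 200 });
```

The cast is needed because Node's `stream/web` ReadableStream type does not line up exactly with the global ReadableStream type.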

@andershermansen andershermansen changed the title perf(storage-s3): stream files from static handler perf(storage-s3): stream files and abort s3 request from static handler Aug 13, 2025
@andershermansen (Contributor Author)

Updated with some refactoring to make the code clearer.

I have tested the following cases:

  • OK path — this seems to work fine; I connected it to a web project and all images show correctly.
  • 304 path — easy to test by opening an image URL from the CMS in a browser and comparing a regular refresh to a hard refresh. The regular refresh sends the correct ETag and gets a 304 response; I have confirmed this is handled in the finally path.
  • Abort path — this can be triggered by requesting a binary file with curl without saving it to a file; curl then warns that the output is binary and that you need to force the output or save to a file, and aborts the request. In this scenario I have confirmed the abort signal is received on req and the handler executes.

For performance, it seems best to leave Readable.toWeb out of it if we can.

@DanRibbens (Contributor)

Would this also solve #12366?

@andershermansen (Contributor Author)

> Would this also solve #12366?

I don't know. I'm unable to reproduce that issue myself, so it's hard to say.

Maybe you can create a canary package with this change and let the people in that issue test it?

@DanRibbens (Contributor)

Fantastic callout @andershermansen!
I am publishing a prerelease of your branch 3.53.0-internal.2dadf5b.
Thanks for merging main so it was easy.

@DanRibbens DanRibbens merged commit 36fd6e9 into payloadcms:main Aug 20, 2025
84 checks passed

🚀 This is included in version v3.53.0

Successfully merging this pull request may close these issues.

OOM downloading large files