Skip to content

Fix ListObjectsV2 pagination to support listing more than 1000 objects #2652

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

Copilot
Copy link

@Copilot Copilot AI commented Jul 4, 2025

Problem

The AwsObjectStorage.doList() method only returned the first 1000 objects from S3's ListObjectsV2 API, which has a maximum response limit of 1000 objects per request. This caused issues when trying to list buckets with more than 1000 objects, as subsequent objects were silently ignored.

According to the AWS ListObjectsV2 documentation:

Returns some or all (up to 1,000) of the objects in a bucket with each request.

The implementation should support listing all objects with a given prefix, regardless of the total count.

Solution

Implemented proper pagination support by:

  1. Added recursive pagination logic using listObjectsWithPagination() helper method
  2. Checks response truncation using resp.isTruncated() to determine if more pages exist
  3. Uses continuation tokens (resp.nextContinuationToken()) to fetch subsequent pages
  4. Accumulates results from all pages into a single list
  5. Maintains async behavior using CompletableFuture composition

Before:

@Override
CompletableFuture<List<ObjectInfo>> doList(String prefix) {
    return readS3Client.listObjectsV2(builder -> builder.bucket(bucket).prefix(prefix))
        .thenApply(resp -> /* process single page only */);
}

After:

@Override
CompletableFuture<List<ObjectInfo>> doList(String prefix) {
    return listObjectsWithPagination(prefix, null, new ArrayList<>());
}

private CompletableFuture<List<ObjectInfo>> listObjectsWithPagination(String prefix, String continuationToken, List<ObjectInfo> accumulator) {
    // Handles pagination recursively until all objects are retrieved
}

Testing

Added comprehensive test coverage:

  • Pagination scenario: Multiple pages with continuation tokens
  • Single page scenario: All objects fit in one response
  • Empty results: No objects match the prefix
  • Existing functionality: All existing tests continue to pass

Impact

This fix ensures that operations like log cleanup in LogUploader and other components that rely on listing objects will now work correctly with buckets containing more than 1000 objects.

Closes #2650


💬 Share your feedback on Copilot coding agent for the chance to win a $200 gift card! Click here to start the survey.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@Gezi-lzq Gezi-lzq changed the title [WIP] @AutoMQ/automq/issues/2650 我希望你可以尝试解决这个issue。 ListObjectsV2 的文档描述中:Returns some or all (up to 1,000) of the objects in a bucket with each request. 但是此接口实现预期应该是支持list超过1000个,也就是list出真实的该前缀的object。 [WIP] @AutoMQ/automq/issues/2650ListObjectsV2 的文档描述中:Returns some or all (up to 1,000) of the objects in a bucket with each request. 但是此接口实现预期应该是支持list超过1000个,也就是list出真实的该前缀的object。 Jul 4, 2025
@Gezi-lzq Gezi-lzq closed this Jul 4, 2025
@Gezi-lzq Gezi-lzq changed the title [WIP] @AutoMQ/automq/issues/2650ListObjectsV2 的文档描述中:Returns some or all (up to 1,000) of the objects in a bucket with each request. 但是此接口实现预期应该是支持list超过1000个,也就是list出真实的该前缀的object。 [WIP] test agent Jul 4, 2025
@Gezi-lzq Gezi-lzq deleted the copilot/fix-16f483e0-e131-4def-b665-433e297351e8 branch July 4, 2025 07:40
@Copilot Copilot AI restored the copilot/fix-16f483e0-e131-4def-b665-433e297351e8 branch July 4, 2025 07:41
@Gezi-lzq Gezi-lzq deleted the copilot/fix-16f483e0-e131-4def-b665-433e297351e8 branch July 4, 2025 07:41
@Copilot Copilot AI changed the title [WIP] test agent Fix ListObjectsV2 pagination to support listing more than 1000 objects Jul 4, 2025
@Copilot Copilot AI requested a review from Gezi-lzq July 4, 2025 07:52
Copilot finished work on behalf of Gezi-lzq July 4, 2025 07:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] AwsObjectStorage#doList only list first 1000 object
3 participants