Skip to content

Conversation

@jeqo
Copy link
Contributor

@jeqo jeqo commented Apr 8, 2024

As it is currently validated, there is a chance indexes are not fully consumed when transforming, as the validation is assuming the whole stream is consumed, when it is not.

final var inputStream = transformFinisher.nextElement();
segmentIndexBuilder.add(indexType, singleChunk(transformFinisher.chunkIndex()).range().size());
return inputStream;

By introducing an output stream to hold the processed content, and passing the buffer to the storage layer we guarantee that the validation is correctly validating that there is only a single chunk processed.

As it is currently validated, there is a chance indexes are not fully consumed when transforming, as the validation is assuming the whole stream is consumed, when it is not.

By introducing an output stream to hold the processed content, and passing the buffer to the storage layer we guarantee that the validation is correctly validating that there is only a single chunk processed.
@jeqo jeqo marked this pull request as ready for review April 9, 2024 08:44
@jeqo jeqo requested a review from a team as a code owner April 9, 2024 08:44
@ivanyu
Copy link
Contributor

ivanyu commented Apr 9, 2024

Do I understand correctly that the problem is that the index may be longer than nextElement()? Or something else?

@jeqo
Copy link
Contributor Author

jeqo commented Apr 11, 2024

Yes, that was my understanding; however I may have rushed a bit here and actually it's working correctly (though not obvious):

  • Chunk size is about source, not target content. This means that even if the transformation changes the size (e.g. encryption) if the chunk size is equal to the original size, that's enough to get a single chunk as a result.
  • The chunk index is fully built by next element if there are no more elements, validating the conditions to fully read the original content.

I got confused by the mixture of a outdated past memories. I will create a quick PR to document these less obvious conditions so the future me doesn't get confused again 😅

@jeqo jeqo closed this Apr 11, 2024
@jeqo
Copy link
Contributor Author

jeqo commented Apr 11, 2024

Added doc comments as part of #535

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants