GODRIVER-3587 Use raw bytes in valueReader #2120
Open
prestonvasquez wants to merge 42 commits into mongodb:master from prestonvasquez:GODRIVER-3587
2ce17b5 Add valueReaderByteSrc interface (prestonvasquez)
2adb7cc Add bufferedValueReader valueReaderByteSrc implementation (prestonvasquez)
1a436da Add streamingValueReader valueReaderBytSrc implementation (prestonvasquez)
5e1fad9 Rename newDocumentReader to newBufferedDocumentReader (prestonvasquez)
8f960b0 Reorganize newBufferedDocumentReader to use bufferedValueReader (prestonvasquez)
36d3c18 Reorganize NewDocumentReader to use streamingValueReader (prestonvasquez)
031bf94 Update (*valueReader).pop() to support bVR + streaming (prestonvasquez)
54d08c4 Update (*valueReader).readValueBytes() to support bVR + streaming (prestonvasquez)
26765bd Update (*valueReader).Skip() to support bVR + streaming (prestonvasquez)
8167d47 Update (*valueReader).ReadArray() to support bVR + streaming (prestonvasquez)
02d38df Update (*valueReader).ReadBinary() to support bVR + streaming (prestonvasquez)
50a265a Add comment to (*valueReader).ReadBoolean() (prestonvasquez)
4bf9f61 Update (*valueReader).ReadDocument() to support bVR + streaming (prestonvasquez)
828593c Update (*valueReader).ReadCodeWithScope() to support bVR + streaming (prestonvasquez)
96b5eb8 Update (*valueReader).ReadDBPointer() to support bVR + streaming (prestonvasquez)
0861119 Update (*valueReader).ReadDateTime() to support bVR + streaming (prestonvasquez)
443fb81 Update (*valueReader).ReadDecimal128() to support bVR + streaming (prestonvasquez)
942e52e Add comment to (*valueReader).ReadDouble() (prestonvasquez)
f934c70 Update (*valueReader).ReadInt32() to support bVR + streaming (prestonvasquez)
c3ba283 Update (*valueReader).ReadInt64() to support bVR + streaming (prestonvasquez)
1a3d33d Update (*valueReader).ReadJavascript() to support bVR + streaming (prestonvasquez)
b7d0042 Update (*valueReader).ReadObjectID() to support bVR + streaming (prestonvasquez)
0cbacc1 Add comment to (*valueReader).Read(MinKey|MaxKey|Null)() (prestonvasquez)
7d16cf8 Add comment to (*valueReader).ReadRegex() (prestonvasquez)
a497774 Update (*valueReader).ReadString() to support bVR + streaming (prestonvasquez)
dfe5502 Update (*valueReader).ReadSymbol() to support bVR + streaming (prestonvasquez)
062adeb Add comment to (*valueReader).ReadTimestamp() (prestonvasquez)
83a47d5 Add comment to (*valueReader).ReadUndefined() (prestonvasquez)
5b570de Update (*valueReader).ReadTimestamp() to support bVR + streaming (prestonvasquez)
403445f Update (*valueReader).ReadElement() to support bVR + streaming (prestonvasquez)
dd709d1 Update (*valueReader).ReadValue() to support bVR + streaming (prestonvasquez)
dea88a6 Update (*valueReader).readValue() to support bVR + streaming (prestonvasquez)
2e8be2a Update (*valueReader).readCString() to support bVR + streaming (prestonvasquez)
0036af6 Update (*valueReader).readString() to support bVR + streaming (prestonvasquez)
f62a9de Update (*valueReader).peekLength() to support bVR + streaming (prestonvasquez)
0ba5775 Update (*valueReader).readLength() to support bVR + streaming (prestonvasquez)
296ff25 Update (*valueReader).read(i32|u32|i64|u64)() to support bVR + streaming (prestonvasquez)
5a12f54 Remove read and appendBytes (prestonvasquez)
ef6b024 Update newValueReader to use bVR (prestonvasquez)
620eb5a Update tests to support bRV + streaming (prestonvasquez)
7e44186 Update bson/value_reader.go (prestonvasquez)
383f64d Extend valueReader tests for both streaming and buffered (prestonvasquez)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,128 @@ | ||
// Copyright (C) MongoDB, Inc. 2025-present. | ||
// | ||
// Licensed under the Apache License, Version 2.0 (the "License"); you may | ||
// not use this file except in compliance with the License. You may obtain | ||
// a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 | ||
|
||
package bson | ||
|
||
import ( | ||
"bytes" | ||
"io" | ||
) | ||
|
||
// bufferedValueReader implements the low-level valueReaderByteSrc interface by | ||
// reading directly from an in-memory byte slice. It provides efficient, zero-copy | ||
// access for parsing BSON when the entire document is buffered in memory. | ||
type bufferedValueReader struct { | ||
buf []byte // Entire BSON document | ||
offset int64 // Current read index into buf | ||
} | ||
|
||
var _ valueReaderByteSrc = (*bufferedValueReader)(nil) | ||
|
||
// readExact reads up to len(p) bytes from the in-memory buffer, advancing the | ||
// offset by the number of bytes read. It returns io.EOF if no bytes remain. | ||
func (b *bufferedValueReader) readExact(p []byte) (int, error) { | ||
if b.offset >= int64(len(b.buf)) { | ||
return 0, io.EOF | ||
} | ||
n := copy(p, b.buf[b.offset:]) | ||
b.offset += int64(n) | ||
return n, nil | ||
} | ||
|
||
// ReadByte returns the single byte at buf[offset] and advances offset by 1. | ||
func (b *bufferedValueReader) ReadByte() (byte, error) { | ||
if b.offset >= int64(len(b.buf)) { | ||
return 0, io.EOF | ||
} | ||
Comment on lines +37 to +39:
What are your thoughts about returning the remaining buf as well as the EOF for |
||
b.offset++ | ||
return b.buf[b.offset-1], nil | ||
} | ||
|
||
// peek returns buf[offset:offset+n] without advancing offset. | ||
func (b *bufferedValueReader) peek(n int) ([]byte, error) { | ||
// Ensure we don't read past the end of the buffer. | ||
if int64(n)+b.offset > int64(len(b.buf)) { | ||
return b.buf[b.offset:], io.EOF | ||
} | ||
|
||
// Return the next n bytes without advancing the offset | ||
return b.buf[b.offset : b.offset+int64(n)], nil | ||
} | ||
|
||
// discard advances offset by n bytes, returning the number of bytes discarded. | ||
func (b *bufferedValueReader) discard(n int) (int, error) { | ||
// Ensure we don't read past the end of the buffer. | ||
if int64(n)+b.offset > int64(len(b.buf)) { | ||
// If we have exceeded the buffer length, discard only up to the end. | ||
left := len(b.buf) - int(b.offset) | ||
b.offset = int64(len(b.buf)) | ||
|
||
return left, io.EOF | ||
} | ||
|
||
// Advance the read position | ||
b.offset += int64(n) | ||
return n, nil | ||
} | ||
|
||
// readSlice scans buf[offset:] for the first occurrence of delim, returns the | ||
// bytes up to and including it, and advances offset past it. It returns io.EOF | ||
// if delim is not found. | ||
func (b *bufferedValueReader) readSlice(delim byte) ([]byte, error) { | ||
// Ensure we don't read past the end of the buffer. | ||
if b.offset >= int64(len(b.buf)) { | ||
return nil, io.EOF | ||
} | ||
|
||
// Look for the delimiter in the remaining bytes | ||
rem := b.buf[b.offset:] | ||
idx := bytes.IndexByte(rem, delim) | ||
if idx < 0 { | ||
return nil, io.EOF | ||
} | ||
|
||
// Build the result slice up through the delimiter. | ||
result := rem[:idx+1] | ||
|
||
// Advance the offset past the delimiter. | ||
b.offset += int64(idx + 1) | ||
|
||
return result, nil | ||
} | ||
|
||
// pos returns the current read position in the buffer. | ||
func (b *bufferedValueReader) pos() int64 { | ||
return b.offset | ||
} | ||
|
||
// regexLength returns the total byte length of a BSON regex value. | ||
func (b *bufferedValueReader) regexLength() (int32, error) { | ||
rem := b.buf[b.offset:] | ||
|
||
// Find end of the first C-string (pattern). | ||
i := bytes.IndexByte(rem, 0x00) | ||
if i < 0 { | ||
return 0, io.EOF | ||
} | ||
|
||
// Find end of second C-string (options). | ||
j := bytes.IndexByte(rem[i+1:], 0x00) | ||
if j < 0 { | ||
return 0, io.EOF | ||
} | ||
|
||
// Total length = first C-string length (pattern) + second C-string length | ||
// (options) + 2 null terminators | ||
return int32(i + j + 2), nil | ||
} | ||
|
||
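// streamable reports whether the source is backed by a stream; the buffered | ||
// source is not. | ||
func (*bufferedValueReader) streamable() bool { | ||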
return false | ||
} | ||
|
||
// reset drops the reference to the buffer and zeroes the offset. | ||
func (b *bufferedValueReader) reset() { | ||
b.buf = nil | ||
b.offset = 0 | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,153 @@ | ||
// Copyright (C) MongoDB, Inc. 2025-present. | ||
// | ||
// Licensed under the Apache License, Version 2.0 (the "License"); you may | ||
// not use this file except in compliance with the License. You may obtain | ||
// a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 | ||
|
||
package bson | ||
|
||
import ( | ||
"bytes" | ||
"io" | ||
"testing" | ||
|
||
"go.mongodb.org/mongo-driver/v2/internal/assert" | ||
"go.mongodb.org/mongo-driver/v2/internal/require" | ||
) | ||
|
||
func TestBufferedValueReader_discard(t *testing.T) { | ||
tests := []struct { | ||
name string | ||
buf []byte | ||
n int | ||
want int | ||
wantOffset int64 | ||
wantErr error | ||
}{ | ||
{ | ||
name: "nothing", | ||
buf: bytes.Repeat([]byte("a"), 1024), | ||
n: 0, | ||
want: 0, | ||
wantOffset: 0, | ||
wantErr: nil, | ||
}, | ||
{ | ||
name: "amount less than buffer size", | ||
buf: bytes.Repeat([]byte("a"), 1024), | ||
n: 100, | ||
want: 100, | ||
wantOffset: 100, | ||
wantErr: nil, | ||
}, | ||
{ | ||
name: "amount greater than buffer size", | ||
buf: bytes.Repeat([]byte("a"), 1024), | ||
n: 10000, | ||
want: 1024, | ||
wantOffset: 1024, | ||
wantErr: io.EOF, | ||
}, | ||
{ | ||
name: "exact buffer size", | ||
buf: bytes.Repeat([]byte("a"), 1024), | ||
n: 1024, | ||
want: 1024, | ||
wantOffset: 1024, | ||
wantErr: nil, | ||
}, | ||
{ | ||
name: "from empty buffer", | ||
buf: []byte{}, | ||
n: 10, | ||
want: 0, | ||
wantOffset: 0, | ||
wantErr: io.EOF, | ||
}, | ||
} | ||
|
||
for _, tt := range tests { | ||
t.Run(tt.name, func(t *testing.T) { | ||
reader := &bufferedValueReader{buf: tt.buf, offset: 0} | ||
n, err := reader.discard(tt.n) | ||
if tt.wantErr != nil { | ||
assert.ErrorIs(t, err, tt.wantErr, "Expected error %v, got %v", tt.wantErr, err) | ||
} else { | ||
require.NoError(t, err, "Expected no error when discarding %d bytes", tt.n) | ||
} | ||
|
||
assert.Equal(t, tt.want, n, "Expected to discard %d bytes, got %d", tt.want, n) | ||
assert.Equal(t, tt.wantOffset, reader.offset, "Expected offset to be %d, got %d", tt.wantOffset, reader.offset) | ||
}) | ||
} | ||
} | ||
|
||
func TestBufferedValueReader_peek(t *testing.T) { | ||
tests := []struct { | ||
name string | ||
buf []byte | ||
n int | ||
offset int64 | ||
want []byte | ||
wantErr error | ||
}{ | ||
{ | ||
name: "nothing", | ||
buf: bytes.Repeat([]byte("a"), 1024), | ||
n: 0, | ||
want: []byte{}, | ||
wantErr: nil, | ||
}, | ||
{ | ||
name: "amount less than buffer size", | ||
buf: bytes.Repeat([]byte("a"), 1024), | ||
n: 100, | ||
want: bytes.Repeat([]byte("a"), 100), | ||
wantErr: nil, | ||
}, | ||
{ | ||
name: "amount greater than buffer size", | ||
buf: bytes.Repeat([]byte("a"), 1024), | ||
n: 10000, | ||
want: bytes.Repeat([]byte("a"), 1024), | ||
wantErr: io.EOF, | ||
}, | ||
{ | ||
name: "exact buffer size", | ||
buf: bytes.Repeat([]byte("a"), 1024), | ||
n: 1024, | ||
want: bytes.Repeat([]byte("a"), 1024), | ||
wantErr: nil, | ||
}, | ||
{ | ||
name: "from empty buffer", | ||
buf: []byte{}, | ||
n: 10, | ||
want: []byte{}, | ||
wantErr: io.EOF, | ||
}, | ||
{ | ||
name: "peek with offset", | ||
buf: append(bytes.Repeat([]byte("a"), 100), bytes.Repeat([]byte("b"), 100)...), | ||
offset: 100, | ||
n: 100, | ||
want: bytes.Repeat([]byte("b"), 100), | ||
wantErr: nil, | ||
}, | ||
} | ||
|
||
for _, tt := range tests { | ||
t.Run(tt.name, func(t *testing.T) { | ||
reader := &bufferedValueReader{buf: tt.buf, offset: tt.offset} | ||
n, err := reader.peek(tt.n) | ||
if tt.wantErr != nil { | ||
assert.ErrorIs(t, err, tt.wantErr, "Expected error %v, got %v", tt.wantErr, err) | ||
} else { | ||
require.NoError(t, err, "Expected no error when peeking %d bytes", tt.n) | ||
} | ||
|
||
assert.Equal(t, tt.want, n, "Expected to peek %d bytes, got %d", len(tt.want), len(n)) | ||
assert.Equal(t, tt.offset, reader.offset, "Expected offset to be %d, got %d", tt.offset, reader.offset) | ||
}) | ||
} | ||
} |
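
The two tests shown cover `discard` and `peek`. A comparable set of table-driven cases for `readSlice` might look like the sketch below. It is written as a standalone program rather than against the driver's internal `assert` and `require` packages; `sliceSrc`, `readSliceCase`, `readSliceCases`, and `runCase` are hypothetical names, and `sliceSrc.readSlice` is a copy of the logic in the diff.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// sliceSrc is a hypothetical copy of the readSlice logic from the diff.
type sliceSrc struct {
	buf    []byte
	offset int64
}

func (b *sliceSrc) readSlice(delim byte) ([]byte, error) {
	if b.offset >= int64(len(b.buf)) {
		return nil, io.EOF
	}
	rem := b.buf[b.offset:]
	idx := bytes.IndexByte(rem, delim)
	if idx < 0 {
		return nil, io.EOF
	}
	b.offset += int64(idx + 1)
	return rem[:idx+1], nil
}

type readSliceCase struct {
	name       string
	buf        []byte
	offset     int64
	want       []byte
	wantOffset int64
	wantErr    error
}

// readSliceCases lists inputs that use 0x00 as the delimiter.
func readSliceCases() []readSliceCase {
	return []readSliceCase{
		{name: "delimiter found", buf: []byte("key\x00rest"), want: []byte("key\x00"), wantOffset: 4},
		{name: "delimiter missing", buf: []byte("abc"), wantErr: io.EOF},
		{name: "from empty buffer", buf: []byte{}, wantErr: io.EOF},
		{name: "with offset", buf: []byte("a\x00b\x00"), offset: 2, want: []byte("b\x00"), wantOffset: 4},
	}
}

// runCase returns nil when the case passes and a descriptive error otherwise.
func runCase(c readSliceCase) error {
	src := &sliceSrc{buf: c.buf, offset: c.offset}
	got, err := src.readSlice(0x00)
	if !errors.Is(err, c.wantErr) {
		return fmt.Errorf("%s: error = %v, want %v", c.name, err, c.wantErr)
	}
	if !bytes.Equal(got, c.want) {
		return fmt.Errorf("%s: got %q, want %q", c.name, got, c.want)
	}
	if src.offset != c.wantOffset {
		return fmt.Errorf("%s: offset = %d, want %d", c.name, src.offset, c.wantOffset)
	}
	return nil
}

func main() {
	for _, c := range readSliceCases() {
		if err := runCase(c); err != nil {
			fmt.Println("FAIL:", err)
			continue
		}
		fmt.Println("ok:", c.name)
	}
}
```

Note that in the "delimiter missing" case the offset stays where it was, which differs from `discard`, where a short buffer moves the offset to the end.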
Optional: This isn't a `ValueReader`, so the name is a bit confusing. Consider a name like `bufferedByteSrc`.