Use doc_value based value fetcher for patterned_text #134693
Pinging @elastic/es-storage-engine (Team:StorageEngine)
@@ -154,6 +154,24 @@ public void testSmallValueNotStored() throws IOException {
}
}

public void testPhraseQuery() throws IOException {
The only way I could find to test this change was to step through a phrase query in the debugger and verify that the values were coming from the field data loader. If you can think of a better way, I'd be interested. I decided to leave in this test, which runs a phrase query: testQueryResultsSameAsMatchOnlyText already does this, but it's a bit too smart, which makes it harder to trust.
Let's also add a unit test to PatternedTextFieldTypeTests? I think using MapperTestCase#assertFetch(...) will help assert that we read from doc values.
👍 Implementing generateRandomInputValue lets testFetch run this test. We still have to disable assertFetchMany, though, since patterned_text only allows a single value per field.
...ogsdb/src/main/java/org/elasticsearch/xpack/logsdb/patternedtext/PatternedTextFieldType.java
return SourceValueFetcher.toString(name(), context, format);
// This operation is really a SEARCH, not a SCRIPT operation.
// But we only allow direct access to field data for scripts.
// The value fetcher uses field data internally though, so pretend the operation is script.
Using SCRIPT here is a bit sketchy, but it's the only way I could think of to disallow aggregations while still letting the value fetcher use doc_values (through the field data API).
I agree that this is sketchy. I think the alternative is implementing a custom ValueFetcher that uses PatternedTextDocValues.from(...) directly, to avoid the validation in PatternedTextFieldType#fielddataBuilder(...).
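To make the trade-off in this thread concrete, here is a minimal, self-contained sketch of the two options. It does not use Elasticsearch classes: `MessageDocValues`, `Fetcher`, `Operation`, `fieldData`, and `directFetcher` are all hypothetical stand-ins, and the real `ValueFetcher` and `PatternedTextDocValues.from(...)` signatures are not reproduced here. The point is only that a fetcher reading doc values directly never passes through an operation check, so there is no need to pretend the operation is a script.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class DirectDocValueFetcherSketch {

    /** Hypothetical stand-in for a per-segment doc-values reader (not an Elasticsearch type). */
    interface MessageDocValues {
        Optional<String> valueFor(int docId);
    }

    /** Hypothetical stand-in for a value fetcher; at most one value per document. */
    interface Fetcher {
        List<Object> fetchValues(int docId);
    }

    enum Operation { SEARCH, AGGREGATION, SCRIPT }

    /** Mimics a field-data entry point that rejects every operation except SCRIPT. */
    static MessageDocValues fieldData(Operation op, MessageDocValues docValues) {
        if (op != Operation.SCRIPT) {
            throw new IllegalArgumentException("field data is only available to scripts");
        }
        return docValues;
    }

    /** Reads doc values directly, so no operation check is involved. */
    static Fetcher directFetcher(MessageDocValues docValues) {
        return docId -> docValues.valueFor(docId)
            .map(v -> List.<Object>of(v))
            .orElse(List.of());
    }

    public static void main(String[] args) {
        // Assumed sample data: doc 1 has no message value.
        Map<Integer, String> segment = Map.of(0, "connection reset by peer", 2, "disk quota exceeded");
        MessageDocValues docValues = docId -> Optional.ofNullable(segment.get(docId));

        Fetcher fetcher = directFetcher(docValues);
        System.out.println(fetcher.fetchValues(0)); // [connection reset by peer]
        System.out.println(fetcher.fetchValues(1)); // []
    }
}
```

In this sketch aggregations stay blocked because `fieldData` still rejects them, while the fetcher takes a separate path that only the field type itself can construct.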
The value fetcher is used to produce message values during the second phase of the two-phase iterator in a source-confirmed query, to check that the message actually matches. A query may need to scan many values, so the value fetcher must be fast. Currently the value fetcher requires building the source. Instead, it should use the doc_value iterator and load only the message values.
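As a rough illustration of why the fetcher's cost matters, the confirmation phase can be sketched as below. This is a simplified model, not the actual source-confirmed query implementation: `confirm` and `messageFetcher` are hypothetical names, the candidate list stands in for the approximation phase, and a plain substring check stands in for real phrase matching.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

public class TwoPhaseConfirmSketch {

    /**
     * Phase 1 (not shown) supplies candidate doc IDs. Phase 2 loads each candidate's
     * message through the given fetcher and keeps the doc only if the phrase occurs.
     */
    static List<Integer> confirm(List<Integer> candidates, IntFunction<String> messageFetcher, String phrase) {
        List<Integer> hits = new ArrayList<>();
        for (int docId : candidates) {
            String message = messageFetcher.apply(docId); // one fetch per candidate
            if (message != null && message.contains(phrase)) {
                hits.add(docId);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumed sample data: every doc contains both "login" and "failed".
        String[] messages = {
            "user login failed for admin",
            "failed login attempt by user",
            "login failed: bad password"
        };
        List<Integer> candidates = List.of(0, 1, 2);

        // Only docs where the two terms are adjacent and in order survive.
        System.out.println(confirm(candidates, docId -> messages[docId], "login failed")); // [0, 2]
    }
}
```

Because the fetcher is called once per candidate, a fetcher that rebuilds the whole source pays that cost for every candidate, whereas one that reads only the message value does not.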