PHOENIX-7705 Support for a row size function #2292
@tkhurana has already asked, in a comment on the Jira, how to account for all versions and delete markers in a row.
@kadirozde A related question: if we introduce a separate RAW_ROW_SIZE for such cases, can it be used in conjunction with other functions (e.g. select raw_row_size,count(*) from table group by tenant_id;)? I ask because raw_row_size would need to do a raw scan, which may not be compatible with other queries (which assume that a scan only ever returns the most recent version).
@ujjawal4046, I added support for RAW_ROW_SIZE() in this PR.
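To make the distinction in this thread concrete, here is a minimal Java sketch of how a raw row size could differ from a regular one. Everything in it is hypothetical: SketchCell, RawRowSizeSketch, and the per-cell sizeBytes are illustrative stand-ins, not the Phoenix or HBase API, and delete handling is simplified to column-level markers that mask every version at or below their timestamp.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified stand-in for a stored cell; NOT the real HBase Cell API. */
final class SketchCell {
    final String family;
    final String qualifier;
    final long timestamp;
    final boolean deleteMarker;
    final int sizeBytes; // assumed to be the stored size of this cell

    SketchCell(String family, String qualifier, long timestamp,
               boolean deleteMarker, int sizeBytes) {
        this.family = family;
        this.qualifier = qualifier;
        this.timestamp = timestamp;
        this.deleteMarker = deleteMarker;
        this.sizeBytes = sizeBytes;
    }
}

final class RawRowSizeSketch {
    /** Sums every stored cell: all versions plus delete markers, as a raw scan would see them. */
    static long rawRowSize(List<SketchCell> row) {
        long total = 0;
        for (SketchCell c : row) {
            total += c.sizeBytes;
        }
        return total;
    }

    /** Sums only the newest live version of each column, as a regular scan would see it. */
    static long latestRowSize(List<SketchCell> row) {
        Map<String, SketchCell> newest = new HashMap<>();
        for (SketchCell c : row) {
            String column = c.family + ":" + c.qualifier;
            SketchCell seen = newest.get(column);
            // Assumption: on a timestamp tie, the delete marker wins.
            boolean newer = seen == null
                    || c.timestamp > seen.timestamp
                    || (c.timestamp == seen.timestamp && c.deleteMarker);
            if (newer) {
                newest.put(column, c);
            }
        }
        long total = 0;
        for (SketchCell c : newest.values()) {
            if (!c.deleteMarker) {
                total += c.sizeBytes; // a column whose newest cell is a marker contributes nothing
            }
        }
        return total;
    }
}
```

For a row holding two versions of column A (40 and 30 bytes) and a 20-byte delete marker over a 50-byte version of column B, rawRowSize returns 140 while latestRowSize returns 40. The two results answer different questions, which is why mixing them with aggregates that assume latest-version semantics needs care.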
     * Function to return the total size of the HBase cells that constitute a given row
     */
    @BuiltInFunction(name = RowSizeFunction.NAME, nodeClass = RowSizeParseNode.class, args = {})
    public class RowSizeFunction extends ScalarFunction {
Are scalar functions evaluated on the server side as well, or only on the client side? If only on the client side, would we need to fetch the whole row back to the client to compute its size?
A scalar function can be evaluated on the client side in two cases: to check whether the where clause evaluates to true on an empty tuple, or when the function is specified as a top-level expression node in a select clause. This PR does not allow the row_size function to be a top-level node in a select clause. In this PR, a row is never returned to the client; only its size is returned, as part of an aggregation function result.
            boolean asSubquery, boolean allowPageFilter, QueryPlan innerPlan, boolean inJoin,
            boolean inUnion) throws SQLException {
        for (AliasedNode node : select.getSelect()) {
            if (node.getNode() instanceof RowSizeParseNode) {
I am not clear on the usage. Does this mean we can't run queries where row_size needs to be fetched for each row (e.g. select row_size() from table, or select row_size() from table group by tenant_id)?
Please see the exception message, and also the row size test, for how to get individual row sizes.
    }
    if (context.hasRowSizeFunction()) {
        scan.getFamilyMap().clear();
        ScanUtil.removePageFilter(scan);
I see the need to clear the family map so that the size accounts for cells across all column families.
Why do we need to remove the page filter?
Removing the page filter was not intentional. Good catch!
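The family map point from this thread can be illustrated with a small Java sketch. It relies on the HBase convention that a scan with an empty family map covers every column family. FamilyCell and FamilyMapSketch are hypothetical names used only for illustration; they are not part of this PR, and the per-cell sizes are made up.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a cell, reduced to its column family and stored size. */
final class FamilyCell {
    final String family;
    final int sizeBytes;

    FamilyCell(String family, int sizeBytes) {
        this.family = family;
        this.sizeBytes = sizeBytes;
    }
}

final class FamilyMapSketch {
    /**
     * Sums the cells a scan would visit. An empty family set means
     * "all families", mirroring a scan whose family map has been cleared.
     */
    static long scannedSize(List<FamilyCell> row, Set<String> scanFamilies) {
        long total = 0;
        for (FamilyCell c : row) {
            if (scanFamilies.isEmpty() || scanFamilies.contains(c.family)) {
                total += c.sizeBytes;
            }
        }
        return total;
    }
}
```

With cells of 100, 60, and 40 bytes in families "0", "CF1", and "CF2", a scan restricted to family "0" would report 100 bytes, while a scan with a cleared family map would report the full 200. A query that projects only some columns would otherwise undercount the row.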
@kadirozde The integration tests are timing out and being aborted. https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/view/change-requests/job/PR-2292/
@tkhurana, Thank you for pointing out the test failure. It should be fixed now.
No description provided.