HIVE-29106: When drop a table, there is no need to verify the directo… #5997
Open: zxl-333 wants to merge 12 commits into apache:master from zxl-333:HIVE-29106
12 commits
7a22655 HIVE-29106: When drop a table, there is no need to verify the directo… (zxl-333)
478f7e8 path to string (zxl-333)
1026bf7 formatted (zxl-333)
c96481c change formatted (zxl-333)
73ff004 throw exception (zxl-333)
f49a5b4 modify test (zxl-333)
caa4e90 modify test (zxl-333)
7e49d10 modify test (zxl-333)
e7455a3 Ensure the rollback of failed transactions is deleted (zxl-333)
83ca188 get tbl path (zxl-333)
07ac244 modify throw exception (zxl-333)
a5a78ac import ioexception (zxl-333)
@zxl-333 initially, you added the permission check in HIVE-28804, mentioning that some Hive clusters don't use Ranger and now you're removing it?
I'm not sure I understand the reasoning.
I see you've added ms.rollbackTransaction(). So the flow is: make a DB call to drop the metadata, and then, depending on whether the directory on the filesystem was successfully removed, either commit or roll back, right?
@deniskuzZ When Ranger is not enabled, HIVE-28804 causes no issue. When Ranger is enabled, however, the HMS code does not retrieve authorization information from Ranger during verification. As a result, even when write permission on the table has been granted through Ranger, if the HDFS ACLs do not include write access, the user gets a "no write permission" error and the drop table fails.
We need to fix both failure modes, regardless of whether Ranger is enabled: table deletion failing when the user has permission, and succeeding when the user does not. The solution is to skip the HDFS ACL check on the database and table directories and leave permission verification at delete time to the NameNode. If the deletion fails because of permissions, the metadata change must also be rolled back, and the engine side receives the exception describing the deletion failure.
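The flow described above can be sketched roughly as follows. This is only an illustration of the commit-or-rollback idea, not the code in this PR: MetaStore, WarehouseFs, DropTableSketch, and all method names other than rollbackTransaction() are hypothetical stand-ins, and the sketch assumes the filesystem delete throws IOException when the NameNode denies it.

```java
import java.io.IOException;

// Hypothetical stand-in for the metastore's transactional store.
interface MetaStore {
    void openTransaction();
    void dropTableMetadata(String db, String table);
    void commitTransaction();
    void rollbackTransaction();
}

// Hypothetical stand-in for the warehouse filesystem.
interface WarehouseFs {
    // Assumed to throw IOException when the NameNode rejects the delete.
    void deleteDir(String path) throws IOException;
}

final class DropTableSketch {
    // Drops the metadata inside a transaction and commits only if the
    // directory delete succeeds; otherwise the metadata change is rolled back
    // and the exception propagates to the caller.
    static void dropTable(MetaStore ms, WarehouseFs fs,
                          String db, String table, String tablePath) throws IOException {
        boolean success = false;
        ms.openTransaction();
        try {
            ms.dropTableMetadata(db, table);
            fs.deleteDir(tablePath); // no up-front ACL check; the NameNode decides
            ms.commitTransaction();
            success = true;
        } finally {
            if (!success) {
                ms.rollbackTransaction();
            }
        }
    }
}
```

With this shape, a permission failure surfaces as the NameNode's own exception, and the table metadata stays intact because the transaction never commits.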
@zxl-333, please share detailed step-by-step reproduction steps. I don't recall any customer using Ranger raising escalations related to that functionality.
cc @saihemanth-cloudera, @dengzhhu653
@zxl-333 - Based on the scenario you mentioned, I believe this is a Hive or Ranger config issue.
Regarding Hive config, what is hive.server2.enable.doAs set to? It should be set to true. By doing so, assuming that you are running queries from a JDBC client like beeline, when an end user runs a query, user 'hive' does all the operations in HDFS. If an end user drops a table, user 'hive' will try to drop the data for the table path and its partitions. If user 'hive' then has permission issues with that path, that points to the Ranger config issue.
Regarding Ranger config, user 'hive' should be added to the HDFS service in the Ranger policy for the Hive warehouse table path, and I believe the issue here is caused by this policy missing from the HDFS service. Also, when Ranger policies are missing, the Hadoop ACL permissions take effect.
So I feel this is not a product issue, and you might have to abandon this patch.
@saihemanth-cloudera @deniskuzZ
Without Ranger: refer to the steps in HIVE-18888.
With Ranger: even if access has been granted through Ranger, the HDFS ACL write-permission check may still fail, preventing the table from being dropped.
The specific steps are as follows:
1. user_a has only read permission in HDFS on the test db and the tables under it:
hdfs://nn1/usr/warhouse/hive/test.db
2. Read and write permissions on the 'test' db and its tables have, however, been granted through Ranger.
3. If user_a attempts to drop the 'test_1' table in the 'test' db, the following exception is thrown:
test_1 metadata not deleted since hdfs://nn//usr/warhouse/hive/test.db is not writable by user_a
Because of this exception, the user cannot drop the table even though Ranger authorization has been granted.
Therefore, regardless of whether the Ranger permission check is enabled, when the user lacks read or write permission on the database or table, the operation must fail and the metastore database transaction must be rolled back.
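The mismatch in the steps above can be modeled in a few lines. This is a deliberately simplified sketch, not Hive or Ranger code: WriteCheckSketch and both methods are hypothetical, the sets stand in for real ACL entries and Ranger policies, and it assumes (as noted elsewhere in this thread) that Hadoop ACLs apply only when no Ranger policy covers the user.

```java
import java.util.Set;

// Hypothetical, simplified model of the two permission decisions discussed
// in this thread. Real HDFS ACLs and Ranger policies are far richer.
final class WriteCheckSketch {
    // Pre-check that looks only at HDFS ACLs, ignoring any Ranger grant.
    static boolean aclOnlyCheck(Set<String> aclWriters, String user) {
        return aclWriters.contains(user);
    }

    // Assumed effective decision at delete time: a Ranger grant allows the
    // write; otherwise the Hadoop ACLs take effect.
    static boolean effectiveCheck(Set<String> policyWriters,
                                  Set<String> aclWriters, String user) {
        return policyWriters.contains(user) || aclWriters.contains(user);
    }
}
```

For user_a in the scenario above (granted through Ranger, read-only in the HDFS ACLs), aclOnlyCheck returns false while effectiveCheck returns true, which is the discrepancy being reported.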
From your comments, you seem to use Ranger only for metadata permissions and not file permissions (HDFS service is not being used in Ranger for auth OR there are no policies for user 'hive'). So you would rely on Hadoop ACLs for file authorization. So (2) you mentioned doesn't apply. And (3) is expected to happen and it should happen in the absence of Ranger related Hadoop policies.
Yes. Once Ranger controls HDFS file permissions, it is usually unnecessary to grant HDFS ACL permissions as well. Without this change, even after using Ranger to control HDFS permissions, the user must still be granted HDFS ACL permissions. This is unacceptable because the user has to be authorized twice for one table path.
@zxl-333 -- Can you please give me a detailed repro of this issue (probably a doc with screenshots of the Ranger policies, core-site.xml (in the Hadoop conf), etc.), as previously requested by @deniskuzZ as well?
I ask because I cannot follow this scenario at all.
If you are using Ranger to control HDFS permissions, then you don't need to grant the end user the required permissions.
I'm guessing that you are missing this config in the core-site.xml of your Hadoop service:
<property> <name>hadoop.proxyuser.hive.hosts</name> <value>*</value> </property>
<property> <name>hadoop.proxyuser.hive.groups</name> <value>*</value> </property>
core-site.xml already sets these parameters.
Steps:
1. Create the database and table.
2. Check the HDFS permissions on the database and table.
3. Grant all permissions to the dmp_query user through Ranger.
4. Verify whether the dmp_query user has write permission.
5. Drop the table as the dmp_query user.
We observed that, although access was granted through Ranger, the drop failed because of the HDFS permission check.