Revert "[SPARK-52576][SDP] Drop/recreate on full refresh and MV update"
This reverts commit 8b43757.
### What changes were proposed in this pull request?
Reverts SPARK-52576. That is, materialized view updates and full refreshes go back to truncating and altering the table instead of dropping and recreating it.
### Why are the changes needed?
Some pipeline runs result in wiping out and replacing all the data for a table:
- Every run of a materialized view
- Runs of streaming tables that have the "full refresh" flag
Prior to SPARK-52576, this "wipe out and replace" was implemented by:
- Truncating the table
- Altering the table to drop/update/add columns that don't match the columns in the DataFrame for the current run
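
As a rough illustration of the truncate + alter approach described above, here is a minimal, Spark-free sketch. It is not the code in this PR: the names `plan_alter` and `full_refresh_steps` are hypothetical, schemas are modeled as plain `name -> type` dicts, and "update" is assumed to mean a column whose type differs between the table and the DataFrame.

```python
from typing import Dict, List, NamedTuple, Tuple


class AlterPlan(NamedTuple):
    drop: List[str]         # columns in the table but not in the DataFrame
    update: Dict[str, str]  # columns present in both whose type differs (name -> new type)
    add: Dict[str, str]     # columns in the DataFrame but not in the table


def plan_alter(table_cols: Dict[str, str], df_cols: Dict[str, str]) -> AlterPlan:
    """Diff the existing table schema against the current run's DataFrame schema."""
    drop = [name for name in table_cols if name not in df_cols]
    update = {
        name: dtype
        for name, dtype in df_cols.items()
        if name in table_cols and table_cols[name] != dtype
    }
    add = {name: dtype for name, dtype in df_cols.items() if name not in table_cols}
    return AlterPlan(drop, update, add)


def full_refresh_steps(
    table: str, table_cols: Dict[str, str], df_cols: Dict[str, str]
) -> List[Tuple[str, ...]]:
    """List the abstract steps of a 'wipe out and replace': truncate first, then alter."""
    plan = plan_alter(table_cols, df_cols)
    steps: List[Tuple[str, ...]] = [("truncate", table)]
    steps += [("drop_column", table, name) for name in plan.drop]
    steps += [("update_column", table, name, dtype) for name, dtype in plan.update.items()]
    steps += [("add_column", table, name, dtype) for name, dtype in plan.add.items()]
    return steps


if __name__ == "__main__":
    existing = {"id": "int", "name": "string", "old": "double"}
    current = {"id": "bigint", "name": "string", "new": "date"}
    for step in full_refresh_steps("my_table", existing, current):
        print(step)
```

Because the table itself is never dropped in this flow, its identity is preserved across the refresh; only its rows and mismatched columns change.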
However, we discovered that this didn't work on Hive, so we moved to drop + recreate, which did. Compared to truncate + alter, though, drop + recreate has some undesirable effects: for example, it interrupts readers of the table and wipes away things like ACLs.
This Hive behavior was fixed here: apache#51007.
So now we can switch back to truncate + alter.
### Does this PR introduce _any_ user-facing change?
Yes, described above
### How was this patch tested?
Existing tests
### Was this patch authored or co-authored using generative AI tooling?
Closes apache#51497 from sryza/revert-drop-recreate.
Authored-by: Sandy Ryza <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>