You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Is your feature request related to a problem? Please describe.
I want to add a 'created' column to all my tables, which should only be set when the row is 'NOT MATCHED' in a 'MERGE INTO' query. Right now the generated SQL uses the same df.columns for both cases ('WHEN MATCHED THEN' and 'WHEN NOT MATCHED THEN')
Describe the solution you'd like
Add optional arguments to the to_iceberg function to specify which columns should be used for 'match' / 'not matched', defaulting to None in which case the code will use all the dataframe columns. E.g:
_match_update_columns = match_update_columns or df.columns
_no_match_insert_columns = no_match_insert_columns or df.column
sql_statement = f"""
MERGE INTO "{database}"."{table}" target
USING "{database}"."{temp_table}" source
ON {' AND '.join([f'target."{x}" = source."{x}"' for x in merge_cols])}
WHEN MATCHED THEN
UPDATE SET {', '.join([f'"{x}" = source."{x}"' for x in df.columns if x in _match_update_columns ])}
WHEN NOT MATCHED THEN
INSERT ({', '.join([f'"{x}"' for x in df.columns if x in _no_match_insert_columns ])})
VALUES ({', '.join([f'source."{x}"' for x in df.columns if x in _no_match_insert_columns ])})
Describe alternatives you've considered
No current alternative exists, except not using the package