Athena to_iceberg - allow to distinguish between columns that should be merged when matched and columns that should be merged when not matched

**Is your feature request related to a problem? Please describe.**
I want to add a 'created' column to all my tables, which should only be set when the row is 'NOT MATCHED' in a 'MERGE INTO' query. Right now the generated SQL uses the same df.columns for both cases ('WHEN MATCHED THEN' and 'WHEN NOT MATCHED THEN')

**Describe the solution you'd like**
Add optional arguments to the to_iceberg function to specify which columns should be used for 'match' / 'not matched', defaulting to None in which case the code will use all the dataframe columns. E.g:
```
match_update_columns: list[str] | None,
no_match_insert_columns: list[str] | None,
```

And then something along the lines of:
```
_match_update_columns = match_update_columns or df.columns
_no_match_insert_columns = no_match_insert_columns or df.column
sql_statement = f"""
                MERGE INTO "{database}"."{table}" target
                USING "{database}"."{temp_table}" source
                ON {' AND '.join([f'target."{x}" = source."{x}"' for x in merge_cols])}
                WHEN MATCHED THEN
                    UPDATE SET {', '.join([f'"{x}" = source."{x}"' for x in df.columns if x in _match_update_columns ])}
                WHEN NOT MATCHED THEN
                    INSERT ({', '.join([f'"{x}"' for x in df.columns if x in _no_match_insert_columns ])})
                    VALUES ({', '.join([f'source."{x}"' for x in df.columns if x in _no_match_insert_columns ])})
```

**Describe alternatives you've considered**
No current alternative exists, except not using the package



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Athena to_iceberg - allow to distinguish between columns that should be merged when matched and columns that should be merged when not matched #3176

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Athena to_iceberg - allow to distinguish between columns that should be merged when matched and columns that should be merged when not matched #3176

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions