Skip to content

Conversation

@dqkqd
Copy link
Contributor

@dqkqd dqkqd commented Oct 26, 2025

Which issue does this PR close?

Rationale for this change

with_param_values doesn't substitute params' type if it is used on EmptyRelation.
Thus, causing SELECT $1, $2 to have incorrect schema after substitution.

For example: after replacing $1 = 1, $2 = "s", the schema is [Null, Null], but it should
be [Int64, Utf8].

This schema type mismatch is resolved before converting to physical plan by
the type_coercion rule in the analyzer.

.map_data(|plan| plan.recompute_schema())

So I'm not quite sure should we fix it in with_param_values.

What changes are included in this PR?

  • recompute the schema after replacing param values.

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

@github-actions github-actions bot added logical-expr Logical plan and expressions core Core DataFusion crate labels Oct 26, 2025
@dqkqd dqkqd marked this pull request as ready for review October 26, 2025 09:45
Copy link
Contributor

@Jefffrey Jefffrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me 👍

Do you think it's also possible to add the queries from the original issue as tests?

@dqkqd
Copy link
Contributor Author

dqkqd commented Oct 27, 2025

Thanks @Jefffrey

I tried to add test for the subquery alias: SELECT a, b FROM (VALUES ($1, $2)) AS t(a, b); similar to the issue:

let df = ctx.sql("SELECT a, b FROM (VALUES ($1, $2)) AS t(a, b)").await?;
let df_with_params_replaced = df.with_param_values(vec![
    ScalarValue::UInt32(Some(1)),
    ScalarValue::Utf8(Some("foofy".to_string())),
])?;
dbg!(df_with_params_replaced.collect().await?[0].schema());
#> Error: ArrowError(InvalidArgumentError("column types must match schema types, expected Null but found UInt32 at column index 0"), Some(""))

But the test still fail, the plan after replacing params (the schema for t is [Null, Null])

        Projection: t.a, t.b [a:Null;N, b:Null;N]
          SubqueryAlias: t [a:Null;N, b:Null;N]
            Projection: column1 AS a, column2 AS b [a:Null;N, b:Null;N]
              Values: (Int32(1) AS $1, Utf8("s") AS $2) [column1:Null;N, column2:Null;N]

The expected params:

        Projection: t.a, t.b [a:Int32;N, b:Utf8;N]
          SubqueryAlias: t [a:Int32;N, b:Utf8;N]
            Projection: column1 AS a, column2 AS b [a:Int32;N, b:Utf8;N]
              Values: (Int32(1) AS $1, Utf8("s") AS $2) [column1:Int32;N, column2:Utf8;N]

I think it needs more works, convert to draft for now.

@dqkqd dqkqd marked this pull request as draft October 27, 2025 19:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate logical-expr Logical plan and expressions

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SubqueryAlias, Values, and/or EmptyRelation have incorrect schemas after replacing Placeholder values

2 participants