Skip to content

Conversation

codluca
Copy link
Member

@codluca codluca commented Sep 24, 2025

Fix 'IS NOT DISTINCT FROM' not working in JOIN when nulls are the only value.

Fixes issue: #26257

Description

The changes try to enable Domain with only null value to be used.

Modify JoinDomainBuilder to take into consideration the nulls from the value blocks.

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
(X) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

## Section
* Fix some things. ({issue}`26257`)

@cla-bot cla-bot bot added the cla-signed label Sep 24, 2025
@codluca codluca marked this pull request as draft September 24, 2025 10:50
@codluca codluca force-pushed the 26257-is-not-distinct-from-with-nulls branch 6 times, most recently from a6880af to b12ce1c Compare September 24, 2025 17:30
@codluca codluca marked this pull request as ready for review September 24, 2025 18:31
@codluca
Copy link
Member Author

codluca commented Sep 24, 2025

@raunaqmorarka The pull request is ready.
There is a failed check (ci / test plugin/trino-pinot), but it is something random, not related to the code changes.

@wendigo wendigo requested a review from Copilot September 25, 2025 21:14
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a bug where 'IS NOT DISTINCT FROM' operations in JOIN clauses were not working correctly when null values were involved. The issue occurred because the Domain building logic was incorrectly treating null-only domains as empty domains.

  • Modified JoinDomainBuilder to properly handle null values by tracking when nulls are present in value blocks
  • Updated Domain creation logic to include null allowance when nulls are detected in the data
  • Fixed DynamicFilters to correctly handle null-allowed domains in IS NOT DISTINCT FROM comparisons

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
TestJoinIsNotDistinct.java Adds comprehensive test coverage for IS NOT DISTINCT FROM with various null scenarios
TestDynamicFilterSourceOperator.java Updates existing tests to expect correct null-allowed domain behavior
DynamicFilters.java Fixes domain comparison logic to properly handle null allowance in IS NOT DISTINCT FROM operations
JoinDomainBuilder.java Adds null tracking and removes incorrect comments about join behavior with nulls

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@codluca codluca force-pushed the 26257-is-not-distinct-from-with-nulls branch from bddd4fb to ff85da8 Compare September 28, 2025 11:15
import static java.lang.String.format;

public class TestJoinIsNotDistinct
extends AbstractTestQueryFramework
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use QueryAssertions instead of AbstractTestQueryFramework. Take a look at the tests under io.trino.sql.query.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If on master (without the fix) I add this test to io.trino.sql.query.TestJoin, it will succeed:

    @Test
    public void testJoinWithIsNotDistinctFromOnNullsInMemory()
    {
        assertThat(assertions.query(
                """
                WITH
                  a AS (SELECT k1, k2 FROM (VALUES (1, NULL)) AS t(k1, k2)),
                  b AS (SELECT k1, k2 FROM (VALUES (1, NULL)) AS t(k1, k2))
                SELECT *
                FROM a
                INNER JOIN b ON a.k1 IS NOT DISTINCT FROM b.k1
                            AND a.k2 IS NOT DISTINCT FROM b.k2
                """))
                .matches("VALUES (1, NULL, 1, NULL)");
    }

The code doesn't follow the same path, using the JoinDomainBuilder, if the temporary tables are used. The Domain is build differently.
I added the extra test from below (testJoinWithIsNotDistinctFromOnNullsInMemory) just for the future, as the test was working before the fix.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to use temporary tables with QueryAssertions, as well. You just need to initialize QueryAssertions with the memory plugin and populate the tables for the test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced with QueryAssertions

@codluca codluca force-pushed the 26257-is-not-distinct-from-with-nulls branch from 5a1943b to f96c605 Compare October 4, 2025 17:31
The changes try to enable Domain with only null value to be used.

Modify JoinDomainBuilder to take into consideration the nulls from the value blocks.
@codluca codluca force-pushed the 26257-is-not-distinct-from-with-nulls branch from c0083a9 to 19e3e58 Compare October 4, 2025 17:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

2 participants