Skip to content

Conversation

@kabo87777
Copy link

Fix Non-Deterministic Behavior in AggregationDistributionTest.testGroupByLevel2Series2Devices3Regions

Problem

The test testGroupByLevel2Series2Devices3Regions was failing non-deterministically under NonDex with a 60% failure rate (3 out of 5 runs) due to order-dependent fragment access and assumptions about node positions in the plan tree.

Way to Reproduce

cd iotdb-core/datanode
mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex \
  -Dtest=AggregationDistributionTest#testGroupByLevel2Series2Devices3Regions \
  -DnondexRuns=5

# Expected: Test fails with certain NonDex seeds
# Failure: AssertionError or ClassCastException when accessing nodes at fixed positions

Root Cause

The test made multiple order-dependent assumptions:

  1. Fixed fragment indices: Accessed fragments by hardcoded positions (get(0), get(2))
  2. Fixed tree positions: Assumed GroupByLevelNode is always at specific child positions
  3. Rigid structure checks: Used instanceof checks at fixed depth levels
// Assumed fragment 0 has nested GroupByLevelNode at specific position
assertTrue(
    fragmentInstances.get(0)
        .getFragment().getPlanNodeTree()
        .getChildren().get(0)
        .getChildren().get(0)
    instanceof GroupByLevelNode);

// Assumed fragment 2 has GroupByLevelNode at first child
assertTrue(
    fragmentInstances.get(2).getFragment().getPlanNodeTree()
        .getChildren().get(0)
    instanceof GroupByLevelNode);

// Assumed descriptor patterns match fragment order
verifyGroupByLevelDescriptor(expectedDescriptorValue,
    fragmentInstances.get(0)...);
verifyGroupByLevelDescriptor(expectedDescriptorValue2,
    fragmentInstances.get(2)...);

When NonDex shuffled collection iteration order, fragments appeared in different sequences and with varying tree structures, causing failures despite the query plan being semantically correct.

Solution

Made the test order-independent by searching all fragments for expected patterns instead of relying on fixed positions:

1. Added Helper Method for Tree Search

private <T extends PlanNode> T findFirstNodeOfType(PlanNode root, Class<T> nodeType) {
  if (root == null) return null;
  if (nodeType.isInstance(root)) return (T) root;
  for (PlanNode child : root.getChildren()) {
    T result = findFirstNodeOfType(child, nodeType);
    if (result != null) return result;
  }
  return null;
}

2. Changed from Position-Based to Pattern-Based Verification

Before (Order-Dependent):

// Hardcoded positions
verifyGroupByLevelDescriptor(expectedDescriptorValue,
    (GroupByLevelNode) fragmentInstances.get(0)
        .getFragment().getPlanNodeTree().getChildren().get(0));
verifyGroupByLevelDescriptor(expectedDescriptorValue2,
    (GroupByLevelNode) fragmentInstances.get(2)
        .getFragment().getPlanNodeTree().getChildren().get(0));

After (Order-Independent):

// Define expected descriptor patterns
Map<String, List<String>> expectedDescriptorValue1 = new HashMap<>();
expectedDescriptorValue1.put(groupedPathS1, Arrays.asList(groupedPathS1, d2s1Path));
expectedDescriptorValue1.put(groupedPathS2, Collections.singletonList(groupedPathS2));

Map<String, List<String>> expectedDescriptorValue2 = new HashMap<>();
expectedDescriptorValue2.put(groupedPathS1, Collections.singletonList(d1s1Path));
expectedDescriptorValue2.put(groupedPathS2, Collections.singletonList(d1s2Path));

// Search all fragments for matching patterns
boolean foundPattern1 = false;
boolean foundPattern2 = false;
int groupByLevelNodeCount = 0;

for (FragmentInstance instance : fragmentInstances) {
  PlanNode root = instance.getFragment().getPlanNodeTree();
  if (countNodesOfType(root, GroupByLevelNode.class) > 0) {
    groupByLevelNodeCount++;
    // Find GroupByLevelNode at any depth
    GroupByLevelNode groupByLevel = findFirstNodeOfType(root, GroupByLevelNode.class);
    if (groupByLevel != null) {
      try {
        verifyGroupByLevelDescriptor(expectedDescriptorValue1, groupByLevel);
        foundPattern1 = true;
      } catch (AssertionError e) {
        try {
          verifyGroupByLevelDescriptor(expectedDescriptorValue2, groupByLevel);
          foundPattern2 = true;
        } catch (AssertionError e2) {
          // This fragment doesn't match any expected pattern
        }
      }
    }
  }
}

// Verify both patterns exist somewhere in the plan
assertTrue("Expected at least 2 GroupByLevelNode instances", groupByLevelNodeCount >= 2);
assertTrue("Expected to find fragment matching pattern 1", foundPattern1);
assertTrue("Expected to find fragment matching pattern 2", foundPattern2);

Key Improvements:

  1. Recursive search: Uses findFirstNodeOfType() to locate GroupByLevelNode at any depth
  2. Pattern matching: Verifies descriptor patterns instead of fragment positions
  3. Flexible counting: Ensures minimum required nodes exist without assuming exact count
  4. Try-catch pattern matching: Attempts to match each fragment against expected patterns

Verification

Tested with 50 NonDex runs - 0 failures (100% success rate):

mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex \
  -Dtest=AggregationDistributionTest#testGroupByLevel2Series2Devices3Regions \
  -DnondexRuns=50
# Result: 0 failures

Key Changed Classes

  • AggregationDistributionTest:
    • Modified testGroupByLevel2Series2Devices3Regions to use order-independent verification
    • Added findFirstNodeOfType() helper method for recursive node search at any depth
    • Leverages existing countNodesOfType() helper method

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant