@Jack-LuoHongyi

Summary

  • ensure web page fixtures are flushed and visible before running result-size assertions
  • add Hive-specific helpers reused by the result-size test suite
  • add assertResultSize as the Hive-tailored counterpart to DataStoreTestUtil.testResultSize

Root Cause
The result-size checks assumed that the schema is visible immediately and that results are iterated in a deterministic order. Hive may expose newly written rows with a delay, so the generic helper can observe divergent counts when it runs without the additional synchronization that the other deterministic branches already have.
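To illustrate the kind of synchronization the root cause calls for, here is a minimal polling sketch. It is not the code from this PR: the class name VisibilityWait, the method awaitCondition, and the timeout values are all hypothetical, and the "store" is simulated by a counter rather than a real Hive-backed data store.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical helper, for illustration only; not part of Gora or of this PR.
final class VisibilityWait {

    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition became true, false on timeout or interrupt.
     */
    static boolean awaitCondition(BooleanSupplier condition,
                                  Duration timeout,
                                  Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Simulated store: one more row becomes "visible" on every poll.
        AtomicInteger visibleRows = new AtomicInteger();
        boolean ready = awaitCondition(
                () -> visibleRows.incrementAndGet() >= 3,
                Duration.ofSeconds(2),
                Duration.ofMillis(10));
        System.out.println("rows visible: " + ready);
    }
}
```

A test would call such a helper after writing fixtures and before counting results, failing fast with a clear message if the wait times out instead of asserting on a partially visible data set.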

Fix

  • introduce awaitWebPageSchema, populateWebPages, and sortedWebPageUrls so every query operates on a confirmed schema and stable key ordering
  • implement assertResultSize by mirroring the core logic of DataStoreTestUtil.testResultSize(...), while adding Hive-specific guarding against duplicate or missing rows
  • route the result-size test methods through the new helper, ensuring consistent setup with the existing query-oriented helpers
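The stable key ordering and the guard against duplicate or missing rows can be sketched as follows. This is an assumption-laden illustration rather than the PR's implementation: ResultSizeCheck, sortedKeys, and assertExactKeys are hypothetical names, and plain strings stand in for web page URLs returned by a query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, for illustration only; not the code added by this PR.
final class ResultSizeCheck {

    /** Returns the keys in natural order so range queries use stable bounds. */
    static List<String> sortedKeys(Collection<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        Collections.sort(sorted);
        return sorted;
    }

    /**
     * Verifies that the returned keys match the expected keys exactly,
     * independent of iteration order, rejecting duplicates and gaps.
     */
    static void assertExactKeys(Collection<String> expected, Iterable<String> returned) {
        Set<String> seen = new HashSet<>();
        for (String key : returned) {
            if (!seen.add(key)) {
                throw new AssertionError("duplicate row: " + key);
            }
        }
        Set<String> missing = new TreeSet<>(expected);
        missing.removeAll(seen);
        Set<String> unexpected = new TreeSet<>(seen);
        unexpected.removeAll(expected);
        if (!missing.isEmpty() || !unexpected.isEmpty()) {
            throw new AssertionError("missing=" + missing + ", unexpected=" + unexpected);
        }
    }

    public static void main(String[] args) {
        List<String> expected = sortedKeys(Arrays.asList("http://b", "http://a", "http://c"));
        // The order of the returned rows does not matter for the check.
        assertExactKeys(expected, Arrays.asList("http://c", "http://a", "http://b"));
        System.out.println("keys match: " + expected);
    }
}
```

Comparing sets instead of relying on positional order is what keeps a check like this stable when a tool such as NonDex shuffles iteration order.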

This change is constrained to test-only sources; production code remains untouched.

Validation

  • the focused result-size test suite continues to succeed under repeated runs
  • exercising reordered iteration via edu.illinois:nondex-maven-plugin:2.1.7 reports no discrepancies

Additional Notes
assertResultSize preserves the loop structure and expectations from DataStoreTestUtil.testResultSize(...), adding only the visibility waits and set-based checks required for Hive. The helper functions introduced here are consistent with earlier deterministic fixes for the same class.
