Conversation

@srinathk10 srinathk10 commented Jul 7, 2025

Why are these changes needed?

concat: Handle mixed Tensor types for structs

unify_schemas

  • Handle duplicate column names in a schema.
  • For structs, invoke unify_schemas recursively on the struct's fields.
  • For tensors, handle missing fields.
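
The unify_schemas behavior listed above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code in this PR: schemas are modeled as lists of (name, type) pairs, structs as dicts, and the names unify_types and unify_schemas_sketch are hypothetical. Rejecting duplicate column names with an error and promoting "null" to the other type are assumed policies; the description only says these cases are handled.

```python
from typing import Dict, List, Tuple, Union

# A type is a primitive name ("int64") or a struct, modeled as a dict mapping
# field name -> type. A schema is a list of (column name, type) pairs.
# These are hypothetical stand-ins, not pyarrow or Ray types.
FieldType = Union[str, Dict[str, "FieldType"]]
Schema = List[Tuple[str, FieldType]]


def unify_types(a: FieldType, b: FieldType) -> FieldType:
    """Unify two types, recursing into structs and keeping fields from both."""
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)
        for name, typ in b.items():
            # A field missing on one side is simply carried over.
            merged[name] = unify_types(merged[name], typ) if name in merged else typ
        return merged
    # Assumed policy: a "null" type is promoted to the other side's type.
    if a == "null":
        return b
    if b == "null":
        return a
    if a != b:
        raise TypeError(f"cannot unify {a!r} with {b!r}")
    return a


def unify_schemas_sketch(schemas: List[Schema]) -> Schema:
    """Unify schemas column by column, rejecting duplicate column names."""
    unified: Dict[str, FieldType] = {}
    for schema in schemas:
        names = [name for name, _ in schema]
        # Assumed policy: duplicates within one schema are an error.
        if len(names) != len(set(names)):
            raise ValueError(f"schema has duplicate column names: {names}")
        for name, typ in schema:
            unified[name] = unify_types(unified[name], typ) if name in unified else typ
    return list(unified.items())
```

For example, unifying a schema whose struct column has only field "a" with one whose struct column has only field "b" yields a struct column containing both fields.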

concat

  • For structs, _align_struct_fields is invoked to handle missing fields and align schemas. Tensor type mismatches are now handled in _backfill_missing_fields.
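
A rough sketch of the two ideas in this step, in plain Python rather than Arrow. Everything here is a hypothetical stand-in: a tensor type is modeled as a (dtype, shape) pair with shape None meaning variable-shaped, and the function names are illustrative only. Falling back to a variable-shaped type when shapes differ is an assumption suggested by the test names; failing on incompatible dtypes matches test_struct_with_incompatible_tensor_dtypes_fails.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical stand-in for a tensor type: (dtype, shape).
# shape=None means "variable-shaped".
TensorType = Tuple[str, Optional[Tuple[int, ...]]]


def unify_tensor_types(types: List[TensorType]) -> TensorType:
    """Pick one tensor type that can hold values of all the given types."""
    dtypes = {dtype for dtype, _ in types}
    if len(dtypes) != 1:
        raise TypeError(f"incompatible tensor dtypes: {sorted(dtypes)}")
    shapes = {shape for _, shape in types}
    # Assumed policy: identical fixed shapes are kept; anything else
    # falls back to a variable-shaped type.
    shape = shapes.pop() if len(shapes) == 1 else None
    return (dtypes.pop(), shape)


def backfill_missing_fields(
    rows: List[Dict[str, Any]], field_names: List[str]
) -> List[Dict[str, Any]]:
    """Align struct rows to the unified field list, filling gaps with None."""
    return [{name: row.get(name) for name in field_names} for row in rows]
```

With these definitions, two blocks whose struct column holds float32 tensors of shapes (2, 2) and (3, 3) would be concatenated under a single variable-shaped float32 type, and any struct field absent from a block would be filled with nulls.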

Tests

  • Added test fixtures to existing tests. No logic changes:
test_arrow_concat_empty
test_arrow_concat_single_block
test_arrow_concat_basic
test_arrow_concat_null_promotion
test_arrow_concat_tensor_extension_uniform
test_arrow_concat_tensor_extension_variable_shaped
test_arrow_concat_tensor_extension_uniform_and_variable_shaped
test_arrow_concat_tensor_extension_uniform_but_different
test_arrow_concat_with_objects
test_struct_with_different_field_names
test_nested_structs
test_struct_with_null_values
test_struct_with_mismatched_lengths
test_struct_with_empty_arrays
test_arrow_concat_object_with_tensor_fails
test_unify_schemas
test_unify_schemas_type_promotion
test_arrow_block_select
test_arrow_block_slice_copy
test_arrow_block_slice_copy_empty
  • Added test coverage for concat of tables with structs and tensors:
test_struct_with_arrow_variable_shaped_tensor_type
test_mixed_tensor_types_same_dtype
test_mixed_tensor_types_fixed_shape_different
test_mixed_tensor_types_variable_shaped
test_mixed_tensor_types_in_struct
test_nested_struct_with_mixed_tensor_types
test_multiple_tensor_fields_in_struct
test_struct_with_incompatible_tensor_dtypes_fails
test_struct_with_additional_fields
test_struct_with_null_tensor_values

  • Added test coverage for unify_schemas:
test_unify_schemas_null_typed_lists
test_unify_schemas_object_types
test_unify_schemas_duplicate_fields
test_unify_schemas_incompatible_tensor_dtypes
test_unify_schemas_objects_and_tensors
test_unify_schemas_missing_tensor_fields
test_unify_schemas_nested_struct_tensors
test_unify_schemas_edge_cases
test_unify_schemas_mixed_tensor_types


Related issue number

"Closes #54186"

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Srinath Krishnamachari <[email protected]>
Signed-off-by: Srinath Krishnamachari <[email protected]>
@srinathk10 srinathk10 added the go add ONLY when ready to merge, run all tests label Jul 7, 2025
Signed-off-by: Srinath Krishnamachari <[email protected]>
Signed-off-by: Srinath Krishnamachari <[email protected]>
Signed-off-by: Srinath Krishnamachari <[email protected]>
@srinathk10 srinathk10 changed the title concat: Handle mixed Tensor types concat: Handle mixed Tensor types for structs Jul 7, 2025
@srinathk10 srinathk10 changed the title concat: Handle mixed Tensor types for structs concat: Handle mixed Tensor types for structs Jul 7, 2025
@srinathk10 srinathk10 marked this pull request as ready for review July 7, 2025 23:50
@srinathk10 srinathk10 requested a review from a team as a code owner July 7, 2025 23:50
Signed-off-by: Srinath Krishnamachari <[email protected]>

srinathk10 and others added 3 commits July 10, 2025 07:26
Signed-off-by: Srinath Krishnamachari <[email protected]>
Signed-off-by: Srinath Krishnamachari <[email protected]>


-def test_arrow_concat_single_block():
+def test_arrow_concat_single_block(simple_concat_data):
Contributor Author

Added pytest fixtures



-def test_arrow_concat_basic():
+def test_arrow_concat_basic(basic_concat_blocks, basic_concat_expected):
Contributor Author

Added pytest fixtures

# Check equivalence.
expected = pa.concat_tables(ts)
assert out == expected


-def test_arrow_concat_null_promotion():
+def test_arrow_concat_null_promotion(null_promotion_blocks, null_promotion_expected):
Contributor Author

Added pytest fixtures

# Check equivalence.
expected = pa.concat_tables(ts, promote=True)
assert out == expected


-def test_arrow_concat_tensor_extension_uniform():
+def test_arrow_concat_tensor_extension_uniform(
Contributor Author

Added pytest fixtures


# Check equivalence.
expected = pa.concat_tables(ts, promote=True)
assert out == expected


-def test_arrow_concat_tensor_extension_variable_shaped():
+def test_arrow_concat_tensor_extension_variable_shaped(
Contributor Author

Added pytest fixtures

assert "objects and tensors" in str(exc_info.value.__cause__)


-def test_unify_schemas():
+def test_unify_schemas(unify_schemas_basic_schemas, unify_schemas_multicol_schemas):
Contributor Author

Added pytest fixtures

pa.field("A", pa.int32(), nullable=True),
]
)
def test_unify_schemas_type_promotion(unify_schemas_type_promotion_data):
Contributor Author

Added pytest fixtures

df = pd.DataFrame({"one": [10, 11, 12], "two": [11, 12, 13], "three": [14, 15, 16]})
table = pa.Table.from_pandas(df)
block_accessor = BlockAccessor.for_block(table)
def test_arrow_block_select(block_select_data):
Contributor Author

Added pytest fixtures


with pytest.raises(ValueError):
    block = block_accessor.select([lambda x: x % 3, "two"])


-def test_arrow_block_slice_copy():
+def test_arrow_block_slice_copy(block_slice_data):
Contributor Author

Added pytest fixtures

@@ -917,12 +775,12 @@ def check_for_copy(table1, table2, a, b, is_copy):
check_for_copy(table, table2, a, b, is_copy=False)


-def test_arrow_block_slice_copy_empty():
+def test_arrow_block_slice_copy_empty(block_slice_data):
Contributor Author

Added pytest fixtures

@alexeykudinkin alexeykudinkin enabled auto-merge (squash) July 11, 2025 18:53
Signed-off-by: Srinath Krishnamachari <[email protected]>
@github-actions github-actions bot disabled auto-merge July 11, 2025 19:24
@alexeykudinkin alexeykudinkin merged commit 781e61f into master Jul 11, 2025
5 checks passed
@alexeykudinkin alexeykudinkin deleted the srinathk10/mixed_tensors branch July 11, 2025 21:38
aslonnie added a commit that referenced this pull request Jul 12, 2025
Signed-off-by: Lonnie Liu <[email protected]>
aslonnie added a commit that referenced this pull request Jul 12, 2025
Labels
go add ONLY when ready to merge, run all tests
Development

Successfully merging this pull request may close these issues.

[Ray Data] Ray doesn't correctly handle variable shaped tensors in .map after change in blocklevel metadata in ray>2.45
3 participants