Skip to content

Conversation

kosiew
Copy link
Contributor

@kosiew kosiew commented Jul 9, 2025

Which issue does this PR close?

Closes #7886.

Rationale for this change

Casting large Decimal256 values to Float64 can exceed the representable range of floating point numbers. Previously, this could result in a panic due to unwrapping a failed conversion.

This PR introduces a safe conversion that saturates overflowing values to INFINITY or -INFINITY, following standard floating point semantics. This ensures stable, predictable behavior without runtime crashes.

What changes are included in this PR?

  • Introduced a helper function decimal256_to_f64 that converts i256 to f64, returning INFINITY or -INFINITY when the value is out of range.
  • Updated the casting logic for Decimal256Float64 to use the new safe conversion.
  • Improved inline and module-level documentation to reflect that this conversion is lossy and saturating.
  • Added a unit test test_cast_decimal256_to_f64_overflow to validate overflow behavior.

Are there any user-facing changes?

Yes.

  • Behavior Change: When casting Decimal256 values that exceed the f64 range, users now receive INFINITY or -INFINITY instead of a panic.
  • Improved Docs: Updated documentation clarifies the lossy and saturating behavior of decimal-to-float casting.
  • Not a Breaking Change: There are no API changes, but users relying on panics for overflow detection may observe different behavior.

@github-actions github-actions bot added the arrow Changes to the arrow crate label Jul 9, 2025
kosiew added 2 commits July 9, 2025 17:05
… detailed context in error messages, such as failing element index and input value.
@tustvold
Copy link
Contributor

tustvold commented Jul 9, 2025

Is it possible that the issue is that the float conversion is fallible, I wouldn't have expected the conversion to be fallible? Can we fix this? Changing to try_unary will significantly regress performance and may be the wrong fix?

@kosiew
Copy link
Contributor Author

kosiew commented Jul 9, 2025

Thanks @tustvold for the quick feedback

✅ What we’re doing now (safe, slow):

  • Replace .unwrap() with proper error handling.
  • Use try_unary to safely propagate conversion errors.
  • This is correct, but may slow things down.

🛠 Alternative approaches (fast, risky or complex):

  1. Clamp or saturate

    • Convert what we can, and clamp values to f64::MAX or f64::MIN when overflow is detected.
    • Might hide data corruption; unsafe for financial data.
  2. Ignore errors silently

    • Keep using .unary() and fallback to 0.0 or NaN on error.
    • Fast but potentially dangerous and violates correctness expectations.
  3. Make overflow handling configurable

    • Introduce a CastOptions flag like allow_float_overflow.
    • Use unary() when allowed, try_unary() when strict.

    Which of the above (or another option) would you recommend?

@tustvold
Copy link
Contributor

tustvold commented Jul 9, 2025

I would expect us to follow standard floating point overflow behaviour. Ultimately if you're opting for floating point numbers you are opting for this behaviour. Floating point in general is not appropriate for financial data as it is lossy by design.

kosiew added 2 commits July 9, 2025 19:27
…ion failures"

This reverts commit f25ec6b3ba8561f8a66b276d9d7869f8636ce48c.
…f error on overflow

- Changed decimal-to-float casts to use lossy conversion consistent with IEEE semantics,
  saturating to ±INFINITY instead of returning an error on overflow or out-of-range values.
- Updated `cast_decimal_to_float` to use infallible conversion function signature.
- Added `decimal256_to_f64` helper for Decimal256 to f64 conversion with saturation.
- Adjusted casting logic in `cast_with_options` accordingly.
- Removed tests that expected errors on decimal-to-float overflow since now conversion saturates.
- Clarified documentation to specify that decimal to float casts are lossy and saturate on overflow.
@kosiew kosiew changed the title Improve Decimal to Float Casting with Error Propagation and Overflow Handling Add lossy decimal to float casting with saturation for overflows in Arrow Jul 9, 2025
@kosiew kosiew marked this pull request as ready for review July 9, 2025 13:58
);
}
#[test]
fn test_cast_decimal256_to_f64_overflow() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to cover negative infinity?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@klion26

Good catch.
I amended the test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also please either add or ensure there is an existing test for casting Decimal128 (i128::MIN and i128::MAX to f64)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update, I didn't notice that @kosiew added a follow on ticket to track this work:

@alamb alamb changed the title Add lossy decimal to float casting with saturation for overflows in Arrow Fix panic on lossy decimal to float casting: round to saturation for overflows Jul 10, 2025
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kosiew @tustvold and @klion26 -- this looks like a clear improvement to me (no panics 👏 )

I think we should add the equivalent test for Decimal128 but I don't think it is required for this PR (we could do it in another PR)

from_type,
to_type,
|x: i256| x.to_f64().unwrap(),
|x: i256| decimal256_to_f64(x),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to do something similar for Decimal128 above?

Copy link
Contributor

@scovich scovich Jul 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having a hard time understanding the semantics of i256::to_64, but it seems to be a provided method of impl ToPrimitive for i256, which

tries to convert through to_i64(), and failing that through to_u64()

In contrast, the Decimal128 case above is doing a rust cast x as f64, which I would assume already handles this case Just Fine (and probably doesn't even need to resort to +/- INF, because f64 has a dynamic range of ~2048 powers of two (from 2**-1023 to 2**1023) -- far more than enough to handle even Decimal256 (which has dynamic range of only 512 powers of two (from 2**-255 to 2**255)

If we're anyway trying to convert the value to f64, it seems like we should fix this bug by following the advice from the docs for ToPrimitive::to_f64:

Types implementing this trait should override this method if they can represent a greater range.

This shouldn't actually be very difficult. Assuming the two i128 parts together form the halves of one giant two's complement value, something like this should do the trick, and would always return a finite value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(just be careful -- the least-significant leading zero is the sign bit, and mustn't be shifted out; I'm not 100% certain my code above handles that correctly, could be off-by-one)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}

/// Convert a [`i256`] to `f64` saturating to infinity on overflow.
fn decimal256_to_f64(v: i256) -> f64 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Saturating to INF seems a better solution than panic'ing

);
}
#[test]
fn test_cast_decimal256_to_f64_overflow() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also please either add or ensure there is an existing test for casting Decimal128 (i128::MIN and i128::MAX to f64)

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again @kosiew and @scovich -- I took a look at this PR and I think it is strictly better than what is on main.

I think we have captured the follow ons as well:

So 🚀

from_type,
to_type,
|x: i256| x.to_f64().unwrap(),
|x: i256| decimal256_to_f64(x),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alamb alamb merged commit a7f3ba8 into apache:main Jul 23, 2025
27 checks passed
@kosiew
Copy link
Contributor Author

kosiew commented Jul 24, 2025

Thanks @alamb, @klion26 , @scovich for your review comments

alamb pushed a commit that referenced this pull request Jul 29, 2025
…cimal256 → Float64 casts (#7986)

# Which issue does this PR close?

Closes #7985

---

# Rationale for this change

The existing Decimal256 → Float64 conversion was changed to **saturate**
out-of-range values to `±INFINITY` (PR #7887) in order to avoid panics.
However, every 256-bit signed integer actually fits within the exponent
range of an IEEE-754 `f64` (±2¹⁰²³), so we can always produce a
**finite** `f64`, only sacrificing mantissa precision.
By overriding `i256::to_f64` to split the full 256-bit magnitude into
high/low 128-bit halves, recombine as
```text
(high as f64) * 2^128 + (low as f64)
```
and reapply the sign (special-casing i256::MIN), we:

- Eliminate both panics and infinite results

- Match Rust’s built-in (i128) as f64 rounding (ties-to-even)

- Simplify casting logic—no saturating helpers or extra flags required

# What changes are included in this PR?

- Added full-range fn to_f64(&self) -> Option<f64> for i256, using
checked_abs() + to_parts() + recombination

- Removed fallback through 64-bit to_i64()/to_u64() and .unwrap()

- Replaced the old decimal256_to_f64 saturating helper with a thin
wrapper around the new i256::to_f64() (always returns Some)

- Updated Decimal256 → Float64 cast sites to call the new helper

## Tests

- Reworked “overflow” tests to assert finite & correctly signed results
for i256::MAX and i256::MIN

- Added typical-value tests; removed expectations of ∞/-∞

# Are there any user-facing changes?
Behavior change:

- Very large or small Decimal256 values no longer become +∞/-∞.

- They now map to very large—but finite—f64 values (rounded to nearest
mantissa).

## API impact:

No public API signatures changed.

Conversion remains lossy by design; users relying on
saturation-to-infinity will observe different (more faithful) behavior.

---------

Co-authored-by: Ryan Johnson <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Panic when casting large Decimal256 to f64 due to unchecked unwrap()
5 participants