Skip to content

Fix Sema unreachable when shfting by vector containing undef elems #24396

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

Justus2308
Copy link
Contributor

@Justus2308 Justus2308 commented Jul 10, 2025

Closes #24392

I added some missing checks for undef values before using compareHetero and getUnsignedInt to avoid hitting an unreachable prong.
This also enables a small AIR optimization basically for free that replaces a vec that only contains undef elems with a single undef value and returns early from zirShl/zirShr, skipping some unnecessary logic and emitting a shift instruction:

export fn foobar(x: @Vector(4, u8)) @Vector(4, u8) {
    const a = x << @as(@Vector(4, u3), .{ undefined, undefined, undefined, undefined });
    return a;
}
old AIR
# Begin Function AIR: repro.foobar:
# Total AIR+Liveness bytes: 206B
# AIR Instructions:         6 (54B)
# AIR Extra Data:           8 (32B)
# Liveness tomb_bits:       8B
# Liveness Extra Data:      0 (0B)
# Liveness special table:   0 (0B)
  %0 = arg(@Vector(4, u8), 0)
  %1!= save_err_return_trace_index()
  %2!= dbg_stmt(2:14)
  %3 = shl(%0!, <@Vector(4, u3), .{ undefined, undefined, undefined, undefined }>)
  %4!= dbg_stmt(2:5)
  %5!= ret_safe(%3!)
# End Function AIR: repro.foobar
new AIR
# Begin Function AIR: repro.foobar:
# Total AIR+Liveness bytes: 180B
# AIR Instructions:         4 (36B)
# AIR Extra Data:           6 (24B)
# Liveness tomb_bits:       8B
# Liveness Extra Data:      0 (0B)
# Liveness special table:   0 (0B)
  %0!= arg(@Vector(4, u8), 0)
  %1!= save_err_return_trace_index()
  %2!= dbg_stmt(2:5)
  %3!= ret_safe(<@Vector(4, u8), undefined>)
# End Function AIR: repro.foobar
old LLVM IR
; Function Attrs: nounwind sspstrong uwtable
define dso_local <4 x i8> @foobar(<4 x i8> %0) #0 !dbg !195 {
Entry:
  %1 = alloca <4 x i8>, align 4
  %2 = alloca <4 x i8>, align 4
  store <4 x i8> %0, ptr %2, align 4, !dbg !196
  call void @llvm.dbg.declare(metadata ptr %2, metadata !197, metadata !DIExpression()), !dbg !196
  %3 = zext <4 x i3> <i3 undef, i3 undef, i3 undef, i3 undef> to <4 x i8>, !dbg !198
  %4 = shl <4 x i8> %0, %3, !dbg !198
  store <4 x i8> %4, ptr %1, align 4, !dbg !198
  call void @llvm.dbg.declare(metadata ptr %1, metadata !199, metadata !DIExpression()), !dbg !198
  ret <4 x i8> %4, !dbg !200
}

ReleaseFast:

; Function Attrs: nounwind uwtable
define dso_local <16 x i8> @foobar(<16 x i8> %0) #0 !dbg !182 {
Entry:
  call void @llvm.dbg.value(metadata <16 x i8> %0, metadata !184, metadata !DIExpression()), !dbg !183
  %1 = zext <16 x i3> <i3 undef, i3 undef, i3 undef, i3 undef, i3 undef, i3 undef, i3 undef, i3 undef, i3 undef, i3 undef, i3 undef, i3 undef, i3 undef, i3 undef, i3 undef, i3 undef> to <16 x i8>, !dbg !185
  %2 = shl <16 x i8> %0, %1, !dbg !185
  call void @llvm.dbg.value(metadata <16 x i8> %2, metadata !186, metadata !DIExpression()), !dbg !185
  ret <16 x i8> %2, !dbg !187
}
new LLVM IR
; Function Attrs: nounwind sspstrong uwtable
define dso_local <4 x i8> @foobar(<4 x i8> %0) #0 !dbg !194 {
Entry:
  %1 = alloca <4 x i8>, align (4)
  %2 = alloca <4 x i8>, align (4)
  %3 = alloca <4 x i8>, align (4)
  store <4 x i8> %0, ptr %3, align (4), !dbg !195
  call void @llvm.dbg.declare(metadata ptr %3, metadata !196, metadata !DIExpression()), !dbg !195
  store <4 x i8> undef, ptr %2, align (4), !dbg !197
  call void @llvm.dbg.declare(metadata ptr %2, metadata !198, metadata !DIExpression()), !dbg !197
  call void @llvm.memset.p0.i64(ptr align (4) %1, i8 -86, i64 4, i1 false), !dbg !199
  %4 = load <4 x i8>, ptr %1, align (4), !dbg !199
  ret <4 x i8> %4, !dbg !199
}

ReleaseFast:

; Function Attrs: nounwind uwtable
define dso_local <16 x i8> @foobar(<16 x i8> %0) #0 !dbg !181 {
Entry:
  call void @llvm.dbg.value(metadata <16 x i8> %0, metadata !183, metadata !DIExpression()), !dbg !182
  call void @llvm.dbg.value(metadata <16 x i8> undef, metadata !185, metadata !DIExpression()), !dbg !184
  ret <16 x i8> undef, !dbg !186
}

@alexrp alexrp requested a review from mlugg July 11, 2025 02:52
@Justus2308
Copy link
Contributor Author

Justus2308 commented Jul 11, 2025

I cannot reproduce the CI failure locally from a M1 MacBook Pro using

zig build test-behavior -fqemu -Duse-llvm -Dtest-target-filter="loongarch64-linux-musl" --seed 0xa2eb50f3 --summary all

I'm using qemu 10.0.2 from homebrew, could this be something that can only be reprod with ziglang/static-qemu?

EDIT: I was able to retrieve a loongarch64 coredump by running static-qemu qemu-loongarch64 on utm debian 11

@alexrp
Copy link
Member

alexrp commented Jul 11, 2025

It is quite possible that this failure is not actually your fault. Did you find anything interesting in the coredump?

@alexrp
Copy link
Member

alexrp commented Jul 11, 2025

Managed to repro that failure locally, will investigate.

@alexrp
Copy link
Member

alexrp commented Jul 11, 2025

Going with #24407 for now to get CI stable.

@Justus2308
Copy link
Contributor Author

Thanks for the help! I haven't had time to look into the coredump yet but I'll let you know if I find anything.

Copy link
Member

@mlugg mlugg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch. I'm not entirely sure about these changes. For one thing, the use of bit_count in place of undefined at runtime isn't great -- the backend would like to know about undefined values so it can use them to optimize!

But more importantly, I actually suspect that shifting with an undefined operand should be a compile error. Our usual rule (as of #23177) is that if an operator can trigger IB, then passing undefined -- to any operand -- does trigger IB. Since e.g. my_u31 << runtime_31 can indeed trigger IB, it seems to me that it logically follows that shifting should follow the "undefined causes IB" rule. Of course, my_u32 << x can't trigger IB, but I think we should avoid type-dependence for IB rules. #3806 would theoretically eliminate the IB, but my gut still tells me that this should be IB.

If it is IB, then this gets much simpler: it's just a compile error for either operand to be undefined or have undefined as a vector element. I think it would be reasonable for now to implement that, because I believe it is more in line with the direction we're taking.

(Of course, if you don't want to make that change, feel free to just close this PR, and we'll get to this at some point!)

@Justus2308
Copy link
Contributor Author

Thanks for the feedback, makes sense in light of #23177!
Looking at the code again, should @shrExact also be a special case for undef vals then? if the lhs is undef it currently just returns undef for scalars and fails for vectors with undef elems because undef doesn't compare equal to zero when checking the to-be-shifted-out bits for any 1s. That's already an inconsistency anyway. If the rhs is undef it just returns undef.
Both of these cases can't really cause IB on a bit level (?) but they definitely can in the context of their operator.
I propose that @shrExact with any operand that returns true for anyScalarIsUndef should always cause a compile error. That would also be in line with the behaviour of @divExact which was rewritten in #23177.
I'd also be happy to implement these changes.

@Justus2308 Justus2308 requested a review from mlugg July 30, 2025 12:14
@Justus2308 Justus2308 force-pushed the vector-undef-shift-segfault branch 2 times, most recently from ab010b4 to d54fbb4 Compare July 30, 2025 12:18
@mlugg
Copy link
Member

mlugg commented Jul 30, 2025

Yep, I agree that @shrExact should also trigger safety-checked IB if either operand is (fully or partially) undefined.

Added additional checks for undefined values and vectors
containing undefined elements in `zirShl` and `zirShr`.
This also fixes a bug when shifting by a partially undefined
vector that reached unreachable in `compareHetero` and
`getUnsignedInt`.
This patch pretends that `Value.isUndefDeep()` is already
implemented (TBC).
@Justus2308 Justus2308 force-pushed the vector-undef-shift-segfault branch from d54fbb4 to 678fe03 Compare July 30, 2025 17:19
@Justus2308
Copy link
Contributor Author

Justus2308 commented Jul 30, 2025

I noticed that quite a lot of functions in Sema already pretend to handle the case of a vector containing only undefined elements by using isUndefDeep, but currently that function is basically only an alias for isUndef so it doesn't actually check for that case. So I just did the same in the first commit and also pretended that is was already implemented and then actually implemented it (hopefully correctly) in the second one.
The only case my implementation doesn't really catch is a plain union with an undefined value and without runtime safety, I think that should be checked at the call sites or maybe with something like isUndefAdvanced. With runtime safety it has a safety tag and we can't just pretend that's undefined too, but without runtime safety it could be treated like any other undefined value. If that's a desired feature I will open a separate issue for that.
I'm also unsure whether pointer-like optionals should return true if they are undefined since they technically don't have a dedicated tag and their entire memory region is really undefined if the value is undefined.

Now correctly returns `true` for aggregates
containing only undefined elements, undefined
unions without a (safety) tag.
@Justus2308 Justus2308 force-pushed the vector-undef-shift-segfault branch from 678fe03 to c7c9b92 Compare July 30, 2025 19:00
@mlugg
Copy link
Member

mlugg commented Jul 30, 2025

Ah, so, the doc comment on isUndefDeep is wrong -- doing that check recursively is actually a bad idea. Instead, the InternPool should essentially be canonicalizing .{ undefined, undefined } into undefined when interning (preserving appropriate types obviously). For that reason, could you revert that last commit? If you'd like, you can remove isUndefDeep, replacing its call sites with isUndef, and document this TODO (converting all-undefined aggregates into undefined in the InternPool) in a comment on isUndef. Sorry for the confusion!

Of course, if you'd like to try your hand at implementing that, by all means feel free! However, the InternPool might be a little tricky if you've not worked in that code before. Although, thinking about it, it might be a better idea to do it in a layer outside of the InternPool. So if you want to do this, perhaps introduce a Zcu.PerThread.aggregateValue wrapper, which does this all-undef check and interns an undef key instead in that case. Then switch any pt.intern(.{ .aggregate = ... }) where the fields might be undef to that wrapper, and assert in InternPool.get that the fields of an aggregate key are not all undef. That'd probably be preferable I think, since it's probably worth maintaining the property that InternPool.indexToKey of InternPool.get preserves the Key tag.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Compiler segfauts when shifting by a vector with an undefined value
3 participants