Labels: kernels (Things about kernels and how they are compiled.)
Description
In a simple case where I need to take the floor of a number and convert it to an integer, compilation through GPUCompiler fails with:
ERROR: InvalidIRError: compiling kernel #36#37(MtlDeviceVector{Float32, 1}) resulted in invalid LLVM IR
Reason: unsupported call to an unknown function (call to gpu_malloc)
The failing code is the following:
import Metal
Metal.@sync Metal.@metal (xs-> (x = xs[1]; k = Int(x); xs[1] = k; nothing))(Metal.MtlVector(Float32[1,2,3]))
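A possible explanation, offered as an assumption rather than a confirmed diagnosis: `Int(x)` on a `Float32` is a checked conversion that throws `InexactError` when `x` is not an integer, and the throwing path may be what pulls in the allocation. The CPU-only sketch below shows the checked conversion failing and `unsafe_trunc(Int, floor(x))` as an unchecked alternative. Whether this avoids the `gpu_malloc` call inside a Metal kernel has not been tested here.

```julia
# CPU-only sketch; Metal is not required to run it.
x = 2.7f0

# Checked conversion: throws InexactError because 2.7 is not an integer.
checked = try
    Int(x)
catch err
    err
end
@assert checked isa InexactError

# Unchecked alternative: floor first, then truncate without a range check.
# Assumption: the absence of a throwing branch is what matters on the GPU.
k = unsafe_trunc(Int, floor(x))
@assert k == 2
```

Note that `unsafe_trunc` returns an arbitrary value when the input is outside the range of the target type, so it is only a reasonable substitute when the input range is known.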