-
Notifications
You must be signed in to change notification settings - Fork 793
[SYCL][NFC] Mark external libc functions. #19368
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: sycl
Are you sure you want to change the base?
Conversation
<sycl/detail/defines_elementary.hpp> defines a __DPCPP_SYCL_EXTERNAL macro for functions that are defined elsewhere. This macro is used both for functions defined by the platform's system libraries and for functions defined in libclc/libdevice. For Native CPU, these should be treated differently: functions defined by the platform's system libraries should be processed according to the platform's ABI, functions defined by libclc/libdevice should be processed in a target-independent way. Therefore, this PR introduces a __DPCPP_SYCL_EXTERNAL_LIBC macro so that libc functions can be tagged as such. This PR does not yet make the changes to Native CPU to define __DPCPP_SYCL_EXTERNAL and __DPCPP_SYCL_EXTERNAL_LIBC differently.
The naming is not ideal, I'm happy to take alternative suggestions for |
@bader , @AlexeySachkov , any input from you? |
I know almost nothing about Native CPU, so I'm unsure the input I can provide is valuable. From the description alone, it's not clear what is the problem we are trying to solve. Follow-up patches are supposed to provide the justification for another macro.
According to my understanding libclc/libdevice are not target independent. We build dedicated version for each target separately. |
What we are currently applying in our internal Native CPU testing, and what I am working on cleaning up in order to hopefully be suitable to merge: What this PR prepares for is diff --git a/sycl/include/sycl/detail/defines_elementary.hpp b/sycl/include/sycl/detail/defines_elementary.hpp
index 17107c9216b3..ba3401496849 100644
--- a/sycl/include/sycl/detail/defines_elementary.hpp
+++ b/sycl/include/sycl/detail/defines_elementary.hpp
@@ -25,7 +25,12 @@
#endif // __SYCL_ALWAYS_INLINE
#ifdef SYCL_EXTERNAL
+#ifdef __NativeCPU__
+#define __DPCPP_SYCL_EXTERNAL SYCL_EXTERNAL __attribute__((intel_ocl_bicc))
+#define __DPCPP_SYCL_EXTERNAL_LIBC SYCL_EXTERNAL
+#else
#define __DPCPP_SYCL_EXTERNAL SYCL_EXTERNAL
+#endif
#else
#ifdef __SYCL_DEVICE_ONLY__
#define __DPCPP_SYCL_EXTERNAL __attribute__((sycl_device)) where whether we actually use
Apologies, I should have phrased that differently. I'm aware we build libclc/libdevice for each target separately, but we don't build for each aux (host) target separately, nor should we. We just build libclc once for Native CPU, and that version can be used with any host target. That is, the same libclc files are used for In order for this to work, we either need to default to the platform-independent ABI (calling into libclc) for translating to LLVM IR and explicitly mark those functions where we do need the host ABI (calling into libc), or we default to the host ABI and explicitly mark those functions where we need the platform-independent ABI. Either would work, and both approaches require this PR or something equivalent. |
intel_ocl_bicc was created for one very specific use case, which is not relevant to SYCL. Modifying SYCL headers to use calling convention for "Intel OpenCL CPU builtins" doesn't sound right me and goes against the purpose of Native CPU as I understand it. This calling convention works only for x86 processors, moreover it's supposed to be used only for SVML functions implementing OpenCL built-in functions.
I'm lost here. libclc is written in OpenCL. OpenCL compiler doesn't have the concept of "aux-target".
When you say "build", what exactly are we producing? My assumption it's LLVM IR. I would prefer this patch to be reviewed by someone who understands Native CPU architecture and this issue. It will take me some time to dig deep into this issue and I don't want @hvdijk to waste his time on dealing with all my questions. |
Agreed, it is a bit of a misuse of the calling convention, that's why I'm looking at changing that before submitting that as a PR. We need some calling convention that we give special treatment, and in our local patches we just went with
It's a calling convention that in that patch is handled entirely by the Native CPU target, not by the x86 target: we bypass the host target's regular handling for it. This really does work with Native CPU on x86_64, on AArch64, and on RISC-V. Even if in what we end up submitting as a PR, we end up using a different calling convention, the approach will be the same. In that patch, that is handled by the change to
The idea of Native CPU is that SYCL device code runs directly on the host CPU, so any device code is host code.
For the initial device compilation, we indeed produce LLVM IR. Then, at link time, when we embed that LLVM IR in the host executable, we translate that device LLVM IR using the host backend to produce machine code for whatever target is selected.
Native CPU does not have things like pointer size, it copies all those attributes from the host. It cannot generally be used standalone. There are a few other targets in LLVM upstream already that similarly copy a lot of attributes from the host, that handling was taken from the NVPTX and SPIR targets.
In general, you're right, Native CPU LLVM IR is not host-target-independent, nor is it meant to be. In libclc, however, we avoid anything that results in LLVM IR that varies between host targets and we have nothing that depends on the size of pointers, and those LLVM IR files are meant to be host-target-independent. As far as I am aware, DPC++ does not support SYCL on CPUs with 32-bit pointers. In a local build, months ago, I tried it and used the compiler to build SYCL-CTS (for spir32). It did not go well. There is also no CI to test this, and no documentation that suggests this should work. If that ever changes, though, it should also work for Native CPU as well. I can double-check later this week that we have no libclc/libdevice bugs in this area just to be sure (separate from this PR).
I'm happy to wait for someone else to review but they're good questions to ask to get answers written out for later reference. |
I might as well update here anyway: we don't have problems with pointers directly, we do have problems with assuming |
<sycl/detail/defines_elementary.hpp> defines a __DPCPP_SYCL_EXTERNAL macro for functions that are defined elsewhere. This macro is used both for functions defined by the platform's system libraries and for functions defined in libclc/libdevice. For Native CPU, these should be treated differently: functions defined by the platform's system libraries should be processed according to the platform's ABI, functions defined by libclc/libdevice should be processed in a host-target-independent way. Therefore, this PR introduces a __DPCPP_SYCL_EXTERNAL_LIBC macro so that libc functions can be tagged as such.
This PR does not yet make the changes to Native CPU to define __DPCPP_SYCL_EXTERNAL and __DPCPP_SYCL_EXTERNAL_LIBC differently.