
Commit 3be3fbf

Tan Jerry authored and thewilsonator committed
update documentation dflags, full example kernel
1 parent 595abfc commit 3be3fbf

File tree

3 files changed: +86 −10 lines changed


README.md

Lines changed: 11 additions & 9 deletions
@@ -51,19 +51,21 @@ To build DCompute you will need:
 * a SPIRV capable LLVM (available [here](https://github.com/thewilsonator/llvm/tree/compute)) to build LDC with SPIRV support (required for OpenCL).
 * or LDC built with any LLVM 3.9.1 or greater that has the NVPTX backend enabled, to support CUDA.
 * [dub](https://github.com/dlang/dub), then just run `dub build`.
+
 Alternatively, you can include dcompute as a dependency, as shown below:
 * add
 ```json
-    "dcompute": {
-        "version": "~>0.1.1",
-        "dflags": [
-            "-mdcompute-targets=cuda-800",
-            "-mdcompute-targets=ocl-300",
-            "-oq"
-        ]
-    }
+    "dependencies": {
+        "dcompute": {
+            "version": "~>0.1.1"
+        }
+    },
 ```
-to your `dub.json` under `dependencies`. The dflags will be passed to LDC to generate code for the specified targets. You can run `ldc2 --help` to look for that flag. Use `ocl-xy0` for OpenCL x.y and `cuda-xy0` for CUDA Compute Capability x.y. So the above flags are for OpenCL 3.0 and CUDA CC 8.0. The two flags must be included separately as shown in the `dub.json`.
+to your `dub.json`. You should also include the following dub flags under `dflags-ldc`, which are passed to the compiler:
+```json
+"dflags-ldc": ["-mdcompute-targets=cuda-800", "-mdcompute-targets=ocl-300", "-version=LDC_DCompute", "-oq"],
+```
+The dflags are passed to LDC to generate code for the specified targets; run `ldc2 --help` for details on the flag. Use `ocl-xy0` for OpenCL x.y and `cuda-xy0` for CUDA Compute Capability x.y, so the flags above target OpenCL 3.0 and CUDA Compute Capability 8.0. The two target flags must be given separately, as shown above.
 * If you get an error saying `Need to use a DCompute enabled compiler`, you likely forgot the `-mdcompute-targets` flags.
 * Check NVIDIA's [website](https://developer.nvidia.com/cuda-gpus) for your CUDA Compute Capability.
 * Alternatively, add the equivalent to your `dub.sdl` (`dependency "dcompute" version="~>0.1.1"`) and include the dflags there as well.
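
The `dub.sdl` equivalent mentioned in the last bullet is not spelled out in this commit. As a rough, illustrative sketch only: the project name below is a placeholder, and the `platform="ldc"` attribute is assumed to be the SDL counterpart of the JSON `dflags-ldc` key.

```sdl
name "myproject"
dependency "dcompute" version="~>0.1.1"
dflags "-mdcompute-targets=cuda-800" "-mdcompute-targets=ocl-300" "-version=LDC_DCompute" "-oq" platform="ldc"
```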

docs/05-driver/00-intro.md

Lines changed: 69 additions & 0 deletions
@@ -44,3 +44,72 @@ context's devices.
 
 **Event:** Represents a future return value from executing an asynchronous operation, such
 as a data transfer or kernel launch.
+
+# Running a Kernel
+
+Now, let's run the `mykernel` kernel that we have built up (see `04-std/01-index.md`). Recall
+that the kernel code should live in a separate file. For our main function, we can have something
+like the code shown below. This assumes compilation for the CUDA backend. Note that we import the
+`mykernels` module containing our kernel code as well as the DCompute driver for CUDA.
+
+```d
+import std.stdio;
+import ldc.dcompute;
+import std.algorithm;
+import std.file;
+import std.traits;
+import std.meta;
+import std.exception : enforce;
+import std.experimental.allocator;
+import std.array;
+import mykernels;
+import dcompute.driver.cuda;
+
+int main()
+{
+    enum size_t N = 128;
+    float c = 5.0;
+    float[N] res, x;
+    foreach (i; 0 .. N)
+    {
+        x[i] = i;
+    }
+
+    Platform.initialise();
+
+    auto devs = Platform.getDevices(theAllocator);
+    auto dev  = devs[0];
+    auto ctx  = Context(dev); scope(exit) ctx.detach();
+
+    // Change the file to match your GPU.
+    Program.globalProgram = Program.fromFile("kernels_cuda800_64.ptx");
+    auto q = Queue(false);
+
+    Buffer!(float) b_res, b_x;
+    b_res = Buffer!(float)(res[]); scope(exit) b_res.release();
+    b_x   = Buffer!(float)(x[]);   scope(exit) b_x.release();
+
+    b_x.copy!(Copy.hostToDevice);
+
+    q.enqueue!(mykernel)
+        ([N, 1, 1], [1, 1, 1])
+        (b_res, b_x, c);
+    b_res.copy!(Copy.deviceToHost);
+
+    foreach (i; 0 .. N)
+        enforce(res[i] == x[i] + c);
+    writeln(res[]);
+
+    return 0;
+}
+```
+It is important to change the file path in the `Program.fromFile("kernels_cuda800_64.ptx")` line
+to the .ptx file generated by the compilation step. Depending on how you set up dub, it may be in
+`./.dub/obj` or just your project directory. You should verify that your kernels actually show
+up in the .ptx file after running `dub build` (the file is plain text).
+
+With the above example, we should get a successful run, with the values 5 through 132 printed, since
+our kernel adds c (5 in this case) to the input vector, which holds 0 through 127.
+
+See `source/dcompute/tests` for an example of a slightly more complicated kernel and of running with the OpenCL driver.
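
The `mykernels` module imported above is the one built up in `04-std/01-index.md` and is not part of this diff. A minimal sketch of what it might contain, assuming the kernel simply adds `c` to each element of `x` (the name and signature are inferred from the host code above):

```d
@compute(CompileFor.deviceOnly) module mykernels;

import ldc.dcompute;
import dcompute.std.index;

// One work item per element: res[i] = x[i] + c.
@kernel void mykernel(GlobalPointer!(float) res,
                      GlobalPointer!(float) x,
                      float c)
{
    auto i = GlobalIndex.x;
    res[i] = x[i] + c;
}
```

Whatever the real kernel looks like, its parameter list has to match the `q.enqueue!(mykernel)(...)(b_res, b_x, c)` call in the host code.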

docs/README.md

Lines changed: 6 additions & 1 deletion
@@ -21,7 +21,12 @@ These docs are designed to help getting started installing & using DCompute.
 4.1 index
 5. The compute API driver
 
-## D
+You can find the corresponding Readme for each of the listed items in the parent `docs` directory, labelled with names
+starting with 00 through 05. For the device standard library and the compute API driver, look in the
+subdirectories `04-std` and `05-driver`, respectively. These instructions will help you install and execute
+your first kernel with DCompute.
+
+## D Basics Refresher
 
 This guide assumes that the reader is familiar with the basics of D, although anyone
 familiar with the C family of languages should be able to understand most of it.
