251 changes: 251 additions & 0 deletions contributor_docs/p5.strands.md
@@ -0,0 +1,251 @@
<!-- How p5.strands JS-to-GLSL compilation works. -->

# p5.strands Overview

Shader programming is an area of creative coding that can feel like a dark art to many. People share lots of stunning visuals that are created with shaders, but shaders feel like a completely different way of coding, requiring you to learn a new language, pipeline, and paradigm.

p5.strands hopes to address all of those issues by letting you write shader snippets in JavaScript and compiling them to OpenGL Shading Language (GLSL) for you!

## Code processing pipeline

At its core, p5.strands works in four steps:
1. The user writes a function in pseudo-JavaScript.
2. p5.strands transpiles that into actual JavaScript and rewrites aspects of your code.
3. The transpiled code is run. Function calls that modify variables are tracked in a graph data structure.
4. p5.strands generates GLSL code from that graph.

## Why pseudo-JavaScript?

The code the user writes when using p5.strands is mostly JavaScript, with some extensions. Shader code heavily encourages use of vectors, and the extensions all make this as easy in JavaScript as in GLSL.
- JavaScript has no vector data type. In p5.strands, you create a vector by writing an array literal, e.g. `myVec = [1, 0, 0]`. You can't use actual arrays in p5.strands; all arrays are fixed-size vectors.
- In JavaScript, you can only use mathematical operators like `+` between numbers and strings, not with vectors. In p5.strands, we allow use of these operators between vectors.
- In GLSL, you can do something called *swizzling*, where you create new vectors out of the components of an existing vector, e.g. `myVec.xy`, `myVec.bgr`, or even `myVec.zzzz`. p5.strands adds support for this on its vectors.

When we transpile the input code, we rewrite these into valid JavaScript. Array literals are turned into function calls like `vec3(1, 0, 0)` which return vector class instances. These instances are wrapped in a `Proxy` that handles property accesses that look like swizzles, and converts them into sub-vector references. Operators between vectors like `a + b` are rewritten into method calls, like `a.add(b)`.

If a user writes something like this:

```js
baseMaterialShader().modify(() => {
  const t = uniformFloat(() => millis())
  getWorldInputs((inputs) => {
    inputs.position += [20, 25, 20] * sin(inputs.position.y * 0.05 + t * 0.004)
    return inputs
  })
})
```

...it gets transpiled to something like this:
```js
baseMaterialShader().modify(() => {
  const t = uniformFloat('t', () => millis())
  getWorldInputs((inputs) => {
    inputs.position = inputs.position.add(strandsNode([20, 25, 20]).mult(sin(inputs.position.y.mult(0.05).add(strandsNode(t).mult(0.004)))))
    return inputs
  })
})
```
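
As a rough illustration of the `Proxy` step described above, the sketch below wraps a toy vector so that swizzle-like property reads return sub-vectors. The names `ToyVec`, `SWIZZLE_SETS`, and `withSwizzles` are hypothetical and are not the p5.strands implementation, which builds graph nodes rather than computing numbers directly.

```javascript
// Hypothetical toy vector: holds plain numbers instead of graph nodes.
class ToyVec {
  constructor(...values) {
    this.values = values;
  }
  // Component-wise addition, standing in for the rewritten `a + b`.
  add(other) {
    const summed = this.values.map((v, i) => v + other.values[i]);
    return withSwizzles(new ToyVec(...summed));
  }
}

// Each set names the same four lanes, as in GLSL.
const SWIZZLE_SETS = ['xyzw', 'rgba', 'stpq'];

function withSwizzles(vec) {
  return new Proxy(vec, {
    get(target, property, receiver) {
      // Only string properties that are not real members can be swizzles.
      if (typeof property === 'string' && !(property in target)) {
        const chars = [...property];
        for (const set of SWIZZLE_SETS) {
          const indices = chars.map((c) => set.indexOf(c));
          const valid =
            chars.length > 0 &&
            chars.length <= 4 &&
            indices.every((i) => i >= 0 && i < target.values.length);
          if (valid) {
            const picked = indices.map((i) => target.values[i]);
            // A single component reads as a scalar, several as a sub-vector.
            return picked.length === 1
              ? picked[0]
              : withSwizzles(new ToyVec(...picked));
          }
        }
      }
      return Reflect.get(target, property, receiver);
    },
  });
}

const v = withSwizzles(new ToyVec(1, 2, 3));
const zyx = v.zyx;          // sub-vector built from lanes 2, 1, 0
const sum = v.xy.add(v.yz); // what a transpiled `v.xy + v.yz` would call
```

In this sketch a swizzle that names a lane the vector does not have (such as `v.w` on a 3-component vector) falls through to the normal property lookup and yields `undefined`; how p5.strands reports that case is not covered here.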

## The program graph

The overall structure of a shader program is represented by a **control-flow graph (CFG)**. This divides up a program into chunks that need to be outputted in linear order based on control flow. A program like the one below would get chunked up around the if statement:

```js
// Start chunk 1
let a = 0;
let b = 1;
// End chunk 1

// Start chunk 2
if (a < 2) {
  b = 10;
}
// End chunk 2

// Start chunk 3
b += 2;
return b;
// End chunk 3
```

We store the individual states that variables can be in as nodes in a **directed acyclic graph (DAG)**. This is a fancy name that basically means each of these variable states may depend on previous variable states, and outputs can't feed back into inputs. Each time you modify a variable, that represents a new state of that variable. For example, below, it is not sufficient to know that `c` depends on `a` and `b`; you also need to know *which version of `b`* it branched off from:

```js
let a = 0;
let b = 1;
b += 1;
let c = a + b;
return c;
```

We can imagine giving each of these states a separate name to make it clearer. In fact, that's what we do when we output GLSL, because we don't need to preserve variable names.
```js
let a_0 = 0;
let b_0 = 1;
let b_1 = b_0 + 1;
let c_0 = a_0 + b_1;
return c_0;
```

When we generate GLSL from the graph, we start from the variables we need to output: the return values of the function (e.g. `c_0` in the example above). From there, we track dependencies through the DAG (in this case, `a_0` and `b_1`). Each dependency has its own dependencies. We make sure we output the dependencies for a node before the node itself.

Each node in the DAG belongs to a chunk in the CFG. This helps us keep track of key points in the code. If we need to, for example, generate a temporary variable at the end of an if statement, we can refer to that CFG chunk rather than whatever the last value node in the if statement happens to be.
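
The dependency-first output order described above can be sketched as a depth-first walk from the return value. This is a minimal illustration under simplifying assumptions: the `nodes` table, the `expr` strings, and `emitFrom` are hypothetical, every value is treated as a `float`, and CFG chunks are ignored.

```javascript
// Hypothetical DAG for:
//   a_0 = 0; b_0 = 1; b_1 = b_0 + 1; c_0 = a_0 + b_1
const nodes = {
  a_0: { expr: '0.0', dependsOn: [] },
  b_0: { expr: '1.0', dependsOn: [] },
  b_1: { expr: 'b_0 + 1.0', dependsOn: ['b_0'] },
  c_0: { expr: 'a_0 + b_1', dependsOn: ['a_0', 'b_1'] },
};

function emitFrom(outputName) {
  const lines = [];
  const emitted = new Set();

  function visit(name) {
    if (emitted.has(name)) return; // each state is declared only once
    emitted.add(name);
    // Output every dependency before the node that uses it.
    for (const dep of nodes[name].dependsOn) visit(dep);
    lines.push(`float ${name} = ${nodes[name].expr};`);
  }

  visit(outputName);
  lines.push(`return ${outputName};`);
  return lines;
}

const glslLines = emitFrom('c_0');
```

Because the walk starts from the output, a state that nothing depends on is never visited and so never appears in the generated lines.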

## Control flow

p5.strands has to convert any control flow that should show up in GLSL into function calls instead of JavaScript keywords. If we don't, they run in JavaScript, and are invisible to GLSL generation. For example, if you had a loop that runs 10 times and adds 2 each time, the generated code would repeat the add-2 line 10 times rather than containing a for loop.

<table>
<tr>
<th>Input</th>
<th>Output without converting control flow</th>
</tr>
<tr>
<td>

```js
let a = 0;
for (let i = 0; i < 10; i++) {
  a += 2;
}
return a;
```

</td>
<td>

```glsl
float a = 0.0;
a += 2.0;
a += 2.0;
a += 2.0;
a += 2.0;
a += 2.0;
a += 2.0;
a += 2.0;
a += 2.0;
a += 2.0;
a += 2.0;
return a;
```

</td>
</tr>
</table>

However, once we have a function call instead of real control flow, we also need a way to make sure that when the user's JavaScript subsequently references nodes that were updated in the control flow, it references the modified value after the `if` or `for` and not the original value.

<table>
<tr>
<th>Input</th>
<th>Transpiled without updating references</th>
<th>States without updating references</th>
</tr>
<tr>
<td>

```js
let a = 0;
for (let i = 0; i < 10; i++) {
  a += 2;
}
let b = a + 1;
return b;
```

</td>
<td>

```js
let a = 0;
p5.strandsFor(
  () => 0,
  (i) => i.lessThan(10),
  (i) => i.add(1),

  () => {
    a = a.add(2);
  }
);
let b = a.add(1);
return b;
```

</td>
<td>

```js
let a_0 = 0;

p5.strandsFor(
  // ...
)
// At this point, the final state of a is a_n

// ...but since we didn't actually run the loop,
// b still refers to the initial state of a!
let b_0 = a_0.add(1);
return b_0;
```

</td>
</tr>
</table>

For that, we make the function calls return updated values, and we generate JS code that assigns these updated values back to the original JS variables. So for loops end up transpiled to something like this, inspired by the JavaScript `reduce` function:

<table>
<tr>
<th>Input</th>
<th>Transpiled with updated references</th>
</tr>
<tr>
<td>

```js
let a = 0;
for (let i = 0; i < 10; i++) {
  a += 2;
}
let b = a + 1;
return b;
```

</td>
<td>

```js
let a = 0;

const outputState = p5.strandsFor(
  () => 0,
  (i) => i.lessThan(10),
  (i) => i.add(1),

  // Explicitly output new state based on prev state
  (i, prevState) => {
    return { a: prevState.a.add(2) };
  },

  { a } // Pass in initial state
);
a = outputState.a; // Update reference

// b now correctly is based off of the final state of a
let b = a.add(1);
return b;
```

</td>
</tr>
</table>

We use a special kind of node in the DAG called a **phi node**, something used in compilers to refer to the result of some conditional execution. In the example above, the state of `a` in the output state is represented by a phi node.

In the CFG, we surround chunks producing phi nodes with a `BRANCH` and a `MERGE` chunk. In the `BRANCH` chunk, we can initialize phi nodes, sometimes giving them initial values. In the `MERGE` chunk, the value of each phi node has stabilized, and other nodes can use it as a dependency.
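
To make the role of the phi node more concrete, here is a deliberately simplified sketch of a reduce-style loop helper. It is an assumption-heavy toy, not the real `p5.strandsFor`: `toyStrandsFor` is a hypothetical name, values are GLSL expression strings rather than DAG nodes, every state entry is assumed to be a `float`, and the output is a flat list of lines rather than CFG chunks.

```javascript
// Hypothetical toy: values are GLSL expression strings, not real graph nodes.
function toyStrandsFor(init, cond, update, body, initialState) {
  const lines = [];
  const phi = {};

  // BRANCH: declare one phi variable per state entry, seeded with its initial value.
  for (const [name, value] of Object.entries(initialState)) {
    phi[name] = `${name}_phi`;
    lines.push(`float ${phi[name]} = ${value};`);
  }

  // The body callback runs exactly once in JavaScript,
  // only to record what a single iteration does to the state.
  const next = body('i', phi);

  lines.push(`for (int i = ${init()}; ${cond('i')}; i = ${update('i')}) {`);
  for (const name of Object.keys(phi)) {
    lines.push(`  ${phi[name]} = ${next[name]};`);
  }
  lines.push('}');

  // MERGE: after the loop, the phi variables hold the final state.
  return { lines, outputState: phi };
}

let a = '0.0';
const { lines, outputState } = toyStrandsFor(
  () => '0',
  (i) => `${i} < 10`,
  (i) => `${i} + 1`,
  (i, prevState) => ({ a: `${prevState.a} + 2.0` }),
  { a } // initial state
);
a = outputState.a;      // update the reference, as the transpiled code does
const b = `${a} + 1.0`; // now refers to the phi variable, not the initial value
```

The point of the sketch is that `b` ends up depending on `a_phi`, the value that is valid after the loop, even though the loop body only ran once in JavaScript.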

## GLSL generation

GLSL is currently the only output format we support, but p5.strands is designed to be able to generate multiple formats. In particular, WebGPU uses the WebGPU Shading Language (WGSL). Our goal is that the same JavaScript p5.strands code can be used in WebGL or WebGPU without any modifications.

To support this, p5.strands separates out code generation into **backends.** A backend is responsible for converting each type of CFG chunk into a string of shader source code. We currently have a GLSL backend, but in the future we'll have a WGSL backend too!
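
One way to picture the backend split is as interchangeable objects that turn the same intermediate description into different source text. The sketch below is hypothetical throughout: `toyGlslBackend`, `toyWgslBackend`, the `statements` list, and `generate` are illustrative names, and the real backends work on CFG chunks rather than a flat statement list.

```javascript
// Hypothetical backends: same methods, different output syntax.
const toyGlslBackend = {
  declareFloat: (name, expr) => `float ${name} = ${expr};`,
  returnValue: (name) => `return ${name};`,
};

const toyWgslBackend = {
  declareFloat: (name, expr) => `var ${name}: f32 = ${expr};`,
  returnValue: (name) => `return ${name};`,
};

// A backend-independent description of a tiny program.
const statements = [
  { kind: 'declareFloat', name: 'a_0', expr: '0.0' },
  { kind: 'declareFloat', name: 'b_0', expr: 'a_0 + 1.0' },
  { kind: 'returnValue', name: 'b_0' },
];

// Code generation only talks to the backend interface.
function generate(backend, stmts) {
  return stmts
    .map((s) =>
      s.kind === 'declareFloat'
        ? backend.declareFloat(s.name, s.expr)
        : backend.returnValue(s.name)
    )
    .join('\n');
}

const glslSource = generate(toyGlslBackend, statements);
const wgslSource = generate(toyWgslBackend, statements);
```

Keeping the program description free of target syntax is what lets the same user code feed either output format.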
19 changes: 8 additions & 11 deletions src/strands/ir_builders.js
@@ -2,7 +2,7 @@ import * as DAG from './ir_dag'
import * as CFG from './ir_cfg'
import * as FES from './strands_FES'
import { NodeType, OpCode, BaseType, DataType, BasePriority, OpCodeToSymbol, typeEquals, } from './ir_types';
-import { createStrandsNode, StrandsNode } from './strands_api';
+import { createStrandsNode, StrandsNode } from './strands_node';
import { strandsBuiltinFunctions } from './strands_builtins';

//////////////////////////////////////////////
@@ -165,7 +165,6 @@ export function memberAccessNode(strandsContext, parentNode, componentNode, memb

export function structInstanceNode(strandsContext, structTypeInfo, identifier, dependsOn) {
const { cfg, dag, } = strandsContext;

if (dependsOn.length === 0) {
for (const prop of structTypeInfo.properties) {
const typeInfo = prop.dataType;
@@ -266,7 +265,6 @@ export function primitiveConstructorNode(strandsContext, typeInfo, dependsOn) {
};

const id = constructTypeFromIDs(strandsContext, finalType, mappedDependencies);

if (typeInfo.baseType !== BaseType.DEFER) {
CFG.recordInBasicBlock(cfg, cfg.currentBlock, id);
}
@@ -419,11 +417,11 @@ export function functionCallNode(
return { id, dimension: inferredReturnType.dimension };
}

-export function statementNode(strandsContext, opCode) {
+export function statementNode(strandsContext, statementType) {
const { dag, cfg } = strandsContext;
const nodeData = DAG.createNodeData({
nodeType: NodeType.STATEMENT,
-opCode
+statementType
});
const id = DAG.getOrCreateNode(dag, nodeData);
CFG.recordInBasicBlock(cfg, cfg.currentBlock, id);
@@ -458,7 +456,7 @@ export function swizzleTrap(id, dimension, strandsContext, onRebind) {
return Reflect.get(...arguments);
} else {
for (const set of swizzleSets) {
-if ([...property].every(char => set.includes(char))) {
+if ([...property.toString()].every(char => set.includes(char))) {
const swizzle = [...property].map(char => {
const index = set.indexOf(char);
return swizzleSets[0][index];
@@ -476,12 +474,11 @@ export function swizzleTrap(id, dimension, strandsContext, onRebind) {
chars.every(c => swizzleSet.includes(c)) &&
new Set(chars).size === chars.length &&
target.dimension >= chars.length;

if (!valid) continue;

const dim = target.dimension;

// lanes are the underlying values of the target vector
// lanes are the underlying values of the target vector
// e.g. lane 0 holds the value aliased by 'x', 'r', and 's'
// the lanes array is in the 'correct' order
const lanes = new Array(dim);
@@ -521,7 +518,7 @@ export function swizzleTrap(id, dimension, strandsContext, onRebind) {
}

// The canonical index refers to the actual value's position in the vector lanes
// i.e. we are finding (3,2,1) from .zyx
// i.e. we are finding (3,2,1) from .zyx
// We set the correct value in the lanes array
for (let j = 0; j < chars.length; j++) {
const canonicalIndex = swizzleSet.indexOf(chars[j]);
@@ -538,9 +535,9 @@ export function swizzleTrap(id, dimension, strandsContext, onRebind) {

target.id = newID;

// If we swizzle assign on a struct component i.e.
// If we swizzle assign on a struct component i.e.
// inputs.position.rg = [1, 2]
// The onRebind callback will update the structs components so that it refers to the new values,
// The onRebind callback will update the structs components so that it refers to the new values,
// and make a new ID for the struct with these new values
if (typeof onRebind === 'function') {
onRebind(newID);
8 changes: 6 additions & 2 deletions src/strands/ir_cfg.js
@@ -31,6 +31,11 @@ export function popBlock(graph) {
graph.currentBlock = graph.blockStack[len-1];
}

export function pushBlockForModification(graph, blockID) {
graph.blockStack.push(blockID);
graph.currentBlock = blockID;
}

export function createBasicBlock(graph, blockType) {
const id = graph.nextID++;
graph.blockTypes[id] = blockType;
@@ -75,7 +80,6 @@ export function printBlockData(graph, id) {
export function sortCFG(adjacencyList, start) {
const visited = new Set();
const postOrder = [];

function dfs(v) {
if (visited.has(v)) {
return;
@@ -86,7 +90,7 @@ export function sortCFG(adjacencyList, start) {
}
postOrder.push(v);
}

dfs(start);
return postOrder.reverse();
}