[p5.strands] Significant refactor for p5.strands #8009

Draft
wants to merge 55 commits into
base: dev-2.0
from

Conversation

lukeplowden
Member

@lukeplowden lukeplowden commented Jul 30, 2025

Addresses #7868

Changes:

This (draft) PR is for a significant refactor for p5.strands I've been working on for the past month. Thank you to the contributors in other issues who have been patient in waiting for this update as it has blocked some progress in other areas. And thanks to all in Discord showing an interest too. I would love to get the thoughts of those who have been interested in contributing to p5.strands thus far (or any newcomers). This refactor is all about developer ergonomics for p5.strands:
@LalitNarayanYadav @perminder-17 @reshma045 @pratham-radadiya @ShaunakMishra25 @Orsenna187

Currently, the refactor is only missing swizzles, and a slight change needs to be made to the transpiler to make unary operations work. After that, I need to do a once-over and remove any extra types etc. that are left over from earlier stages of this refactor.

Overview of the refactor

The main purpose of this refactor is to make it more extendable to WGSL in the future, to modularise for developer ergonomics generally, and to make tests and FES easier to re-implement. It separates concerns throughout the p5.strands architecture and adds a much clearer type system. By modularising the codebase, a more straightforward roadmap and contributor documentation can be written up for p5.strands. More on that at the end of this PR, and I will leave some stubs for new issues related to this.

Entry point

p5.strands is still accessible through the same p5.Shader.modify method. The function override for this now exists in p5.strands.js, however. This file also initialises a strandsContext object, and also initialises the user API with this context. In the future, this file could potentially override createShader().

User API

The user API in strands_api.js includes all of the hooks, i.e. the methods available inside p5.Shader.modify() such as getWorldInputs(), getFinalColor() and so on.

It also includes StrandsNode, a simplified class compared to the previous implementation. Previously, the user had handles to classes derived from BaseNode. There were between 10 and 15 of these, each with slightly different methods and data, to handle all edge cases for both operations and code generation. This was confusing for developers, and it also made it hard to know where to document strands features, and what to document.

The StrandsNode class only contains user-facing methods, like .add() and .mult(), and members for swizzling such as .xyz, .rrg etc. Apart from that, it has a this.id which corresponds to an ID in the compiler's Intermediate Representation. More on this later, but overall the user API is now less tied to backend specifics.
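
To illustrate the idea of a node that is only a handle, here is a minimal sketch. It assumes a plain array as a stand-in for the IR and a hypothetical createIRNode() helper; the real StrandsNode and builder functions in this PR may look different.

```javascript
// Hypothetical sketch: `irNodeStore` and `createIRNode` are illustrative stand-ins,
// not the PR's actual IR storage or builder functions.
const irNodeStore = [];

function createIRNode(data) {
  irNodeStore.push(data);
  return irNodeStore.length - 1; // the ID is simply the index into the IR storage
}

class StrandsNode {
  constructor(id) {
    this.id = id; // the only state: a handle into the IR
  }
  add(other) {
    const id = createIRNode({ opCode: 'add', dependsOn: [this.id, other.id] });
    return new StrandsNode(id); // returning a node keeps the methods chainable
  }
  mult(other) {
    const id = createIRNode({ opCode: 'mult', dependsOn: [this.id, other.id] });
    return new StrandsNode(id);
  }
}

const a = new StrandsNode(createIRNode({ value: 1 }));
const b = new StrandsNode(createIRNode({ value: 2 }));
const c = a.add(b).mult(a);
```

Here c holds nothing but an ID, and everything needed for code generation lives in the IR entry it points to.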

This file also contains a few more functions like type constructors (vec3, float, also now with ivec3, bool etc.), strandsIf() and discard() which are in progress, and (now I'm reminded I need to add this:) instanceID() as before.

Finally, it also pulls in functions from strands_builtins.js. These are similar to those in the previous implementation, except now with a more robust type system, which is explained below. @LalitNarayanYadav, you might be interested in reviewing this and potentially re-porting lerp here (sorry!) and copying noise across too, which shouldn't need to change!

Stages of the compiler

The p5.strands compiler is now broken more clearly into separate stages. These are similar, but not identical, to the classic three stages of a compiler. Previously, these stages were shared between the BaseNode class and its children, the ShaderGenerator class, and the p5.Shader.modify() method. The resulting codebase was becoming difficult to extend, and also difficult to summarise.

1. Front-end: Transpile Stage

Overview: Transpiles from the p5.strands 'language' to the JavaScript API.
Files: strands_transpiler.js
External Dependencies: ESCodegen and Acorn

  • Adds operator overloading to allow normal JS operators ([], +, -, == etc.) to work on StrandsNodes.
  • It works by using Acorn to generate an AST, then traversing the AST and replacing nodes. Finally, it uses ESCodegen to turn the AST back into code.
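
The node replacement step can be sketched roughly as follows. This is a simplified illustration, not the code in strands_transpiler.js: it skips parsing and printing (Acorn's and ESCodegen's jobs) by working on a hand-written ESTree-shaped AST, and the OPERATOR_METHODS mapping, rewrite() and print() are hypothetical names.

```javascript
// Hypothetical rewrite step: turn `a + b` into `a.add(b)` on an ESTree-shaped AST.
const OPERATOR_METHODS = { '+': 'add', '*': 'mult' }; // assumed operator-to-method mapping

function rewrite(node) {
  if (node === null || typeof node !== 'object') return node;
  if (Array.isArray(node)) return node.map(rewrite);
  // Rewrite children first so nested expressions are handled bottom-up.
  const out = {};
  for (const key of Object.keys(node)) out[key] = rewrite(node[key]);
  if (out.type === 'BinaryExpression' && OPERATOR_METHODS[out.operator]) {
    return {
      type: 'CallExpression',
      callee: {
        type: 'MemberExpression',
        object: out.left,
        property: { type: 'Identifier', name: OPERATOR_METHODS[out.operator] },
        computed: false,
        optional: false,
      },
      arguments: [out.right],
      optional: false,
    };
  }
  return out;
}

// Tiny printer for the three node types used here (a code generator's job in practice).
function print(node) {
  switch (node.type) {
    case 'Identifier':
      return node.name;
    case 'MemberExpression':
      return `${print(node.object)}.${print(node.property)}`;
    case 'CallExpression':
      return `${print(node.callee)}(${node.arguments.map(print).join(', ')})`;
    default:
      throw new Error(`Unhandled node type: ${node.type}`);
  }
}

// Hand-written AST for the expression: a + b * c
const ast = {
  type: 'BinaryExpression',
  operator: '+',
  left: { type: 'Identifier', name: 'a' },
  right: {
    type: 'BinaryExpression',
    operator: '*',
    left: { type: 'Identifier', name: 'b' },
    right: { type: 'Identifier', name: 'c' },
  },
};
const transpiled = print(rewrite(ast)); // "a.add(b.mult(c))"
```

Because the parser has already resolved operator precedence in the tree, the rewrite does not need to know that * binds tighter than +.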

2. Middle-end: Building the Intermediate Representation (IR)

Overview: Builds graphs which represent the user's code
Files: ir_dag.js, ir_cfg.js, ir_types.js, ir_builders.js

ir_builders.js is one step beyond the User API file. The functions in here do most of the heavy lifting in building up the IR graphs. All of the functions in the User API call to here.

When the user calls methods like .add() or vec3(), they are returned a user-facing StrandsNode as mentioned above. However, this also builds a node in the IR's directed acyclic graph (DAG), which models data dependencies, and records its existence in the control flow graph (CFG), which models control flow. These graphs are implemented in the ir_dag.js and ir_cfg.js files respectively.

The user's nodes are handles to nodes in the DAG. This includes variables and operations (that's it for the most part). There are currently no 'no-ops'. Inside of strandsIf(), a new 'basic block' is made in the CFG. The strandsContext (via the builder functions) keeps track of the current block, and any user instructions (like a function call or addition) are recorded in the current block.
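
As a rough sketch of how the two graphs relate, the snippet below stores both as structs of arrays and records every new DAG node in the current CFG block. Apart from blockInstructions, which appears in the codegen in this PR, all field and function names here are assumptions made for illustration.

```javascript
// Hypothetical layout: names other than `blockInstructions` are illustrative only.
function createGraphs() {
  return {
    dag: { nodeTypes: [], opCodes: [], dependsOn: [] }, // one array per field
    cfg: { blockTypes: [], blockInstructions: [], outgoingEdges: [] },
    currentBlock: -1,
  };
}

function createBlock(ctx, blockType) {
  ctx.cfg.blockTypes.push(blockType);
  ctx.cfg.blockInstructions.push([]);
  ctx.cfg.outgoingEdges.push([]);
  return ctx.cfg.blockTypes.length - 1; // a block is just an index
}

function createNode(ctx, nodeType, opCode, dependsOn) {
  const { dag, cfg } = ctx;
  dag.nodeTypes.push(nodeType);
  dag.opCodes.push(opCode);
  dag.dependsOn.push(dependsOn);
  const id = dag.nodeTypes.length - 1;
  cfg.blockInstructions[ctx.currentBlock].push(id); // record in the current block
  return id;
}

const ctx = createGraphs();
const entry = createBlock(ctx, 'default');
ctx.currentBlock = entry;
const x = createNode(ctx, 'variable', null, []);
const y = createNode(ctx, 'variable', null, []);

// Entering an if-branch: make a new block, link it, and make it current.
const branch = createBlock(ctx, 'if_body');
ctx.cfg.outgoingEdges[entry].push(branch);
ctx.currentBlock = branch;
const sum = createNode(ctx, 'operation', 'add', [x, y]);
```

The DAG answers "what does this value depend on?", while the CFG answers "which block does this instruction belong to, and where can control go next?".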

The ir_types.js file has a number of pseudo-enums and lookup tables for different types. These include BlockType for basic blocks, NodeType for variables vs operations (maybe the name is too ambiguous now, but its use is obvious), etc.

The most obvious (and complex) of these are DataTypes, which model types such as float, int and their vector variations. As this is a shader DSL based in JS, I've arrived at objects with a shape like { baseType: 'float', dimension: '1', priority: '3' }. You can therefore compose a final shader type by doing node.baseType + node.dimension, which separates our types from GLSL a bit for down the road.
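
A minimal sketch of how such type objects could be turned into GLSL type names and combined is shown below. The table entries, the numeric dimension and priority values, the GLSL_VECTOR_PREFIX map and the promotion rule in resultType() are all assumptions for illustration, and may not match ir_types.js.

```javascript
// Hypothetical sketch: the real DataType tables in ir_types.js may differ.
const DataType = {
  float1: { baseType: 'float', dimension: 1, priority: 3 },
  float3: { baseType: 'float', dimension: 3, priority: 3 },
  int1: { baseType: 'int', dimension: 1, priority: 2 }, // assumed priority value
};

// GLSL spells scalars as the base type and vectors with a prefix (vec3, ivec3, bvec3).
const GLSL_VECTOR_PREFIX = { float: 'vec', int: 'ivec', bool: 'bvec' };

function glslTypeName({ baseType, dimension }) {
  return dimension === 1 ? baseType : `${GLSL_VECTOR_PREFIX[baseType]}${dimension}`;
}

// Assumed promotion rule: the higher-priority base type wins,
// and a scalar broadcasts to the larger dimension.
function resultType(a, b) {
  const winner = a.priority >= b.priority ? a : b;
  return {
    baseType: winner.baseType,
    dimension: Math.max(a.dimension, b.dimension),
    priority: winner.priority,
  };
}

const promoted = resultType(DataType.int1, DataType.float3);
const promotedName = glslTypeName(promoted); // "vec3"
```

Keeping baseType and dimension separate means only glslTypeName() is GLSL-specific; a WGSL backend could supply its own naming function over the same objects.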

Once the user's code has finished running and all of the graphs are built, we do a topological sort on the CFG. We are able to topo sort because, although there are back-edges of a sort in the graph, we don't need to model gotos faithfully; we just need to output if() in the codegen. This is still a work in progress, however.
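
For reference, one common way to topologically sort a graph like this is a depth-first traversal that emits blocks in reverse post-order. The sketch below assumes an outgoingEdges array indexed by block ID and no cycles; it is not necessarily how the sort in this PR is written.

```javascript
// Depth-first topological sort (reverse post-order) over block IDs.
// `outgoingEdges[b]` lists the successors of block b; this layout is an assumption.
function sortBlocks(outgoingEdges, entry) {
  const visited = new Set();
  const postOrder = [];
  function visit(block) {
    if (visited.has(block)) return;
    visited.add(block);
    for (const next of outgoingEdges[block]) visit(next);
    postOrder.push(block); // a block is emitted only after all of its successors
  }
  visit(entry);
  return postOrder.reverse();
}

// A single if statement: 0 = entry, 1 = condition, 2 = if body, 3 = merge.
// The condition lists the merge block first to show that edge order does not matter.
const edges = [[1], [3, 2], [3], []];
const order = sortBlocks(edges, 0); // [0, 1, 2, 3]
```

The if body always comes out before the merge block, which is the order needed to print the branch and then the code that follows it.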

3. Back-end: Code generation

Overview: Generates GLSL code from the intermediate representation
Files: strands_codegen.js, strands_glslBackend.js

This does as it says: generates GLSL code from the IR. We currently do the CFG sort in this stage, and create a generationContext object to store our lines of generated code and temporary variable names. Next, we loop over the basic blocks and output the code for each visited node.

We only have to use some of the same types from the IR, but most of the heavy lifting is already done (as mentioned) and the code that outputs GLSL is relatively simple. It is structured similarly to Acorn's visitor functions: we define an object with different visitor functions for different node types.
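
The visitor pattern described here could look something like the sketch below. The node fields (identifier, value, opCode, dependsOn) follow the field lists in ir_types in this PR, but the visitors object, generate() and the shape of generationContext are assumptions, and everything is hard-coded to float for brevity.

```javascript
// Hypothetical visitor table keyed by node type, in the spirit of Acorn's walkers.
const visitors = {
  literal(node) {
    // GLSL float literals need a decimal point.
    return Number.isInteger(node.value) ? `${node.value}.0` : `${node.value}`;
  },
  variable(node) {
    return node.identifier;
  },
  operation(node, nodes, gen) {
    const [lhs, rhs] = node.dependsOn.map((id) => generate(id, nodes, gen));
    const temp = `T${gen.nextTempID++}`; // fresh temporary variable name
    gen.lines.push(`float ${temp} = ${lhs} ${node.opCode} ${rhs};`);
    return temp;
  },
};

// Dispatch to the visitor for this node's type and return a GLSL expression.
function generate(id, nodes, gen) {
  const node = nodes[id];
  return visitors[node.nodeType](node, nodes, gen);
}

const irNodes = [
  { nodeType: 'variable', identifier: 'brightness' },
  { nodeType: 'literal', value: 2 },
  { nodeType: 'operation', opCode: '*', dependsOn: [0, 1] },
];
const generationContext = { lines: [], nextTempID: 0 };
const result = generate(2, irNodes, generationContext);
// generationContext.lines is ["float T0 = brightness * 2.0;"] and result is "T0"
```

A second backend would only need to supply a different visitors object, since the traversal itself does not mention GLSL.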

Importantly, the WGSL implementation should be a similarly simple process to add, and could be done by a direct port of the strands_glslBackend.js file.

FES file

As before, I disable and re-enable FES via the strandsContext object. However, I have also added a temporary strands_FES.js file here. There are several places in which I have added user errors, but I'm not sure of the best approach for this and have to look more deeply at the rest of FES before overriding it.

Next steps / input

  • Right now, there are few classes (only user-facing ones, in order to have chainable methods easily). I was reading about data-oriented design whilst making this (not saying it's perfect) and ended up having few classes because of this. It also means that the graphs are structs of arrays, and nodes are just indices into them. If people feel strongly I can refactor these into classes. For example, strandsContext could become class StrandsRuntime or similar:
function initStrandsContext(ctx, backend) {
    ctx.dag = createDirectedAcyclicGraph();
    ctx.cfg = createControlFlowGraph();
    ctx.uniforms = [];
    ctx.hooks = [];
    ctx.backend = backend;
    ctx.active = true;
    ctx.previousFES = p5.disableFriendlyErrors;
    p5.disableFriendlyErrors = true;
}
  • I'm not sure how I feel about the current (broken) approach to swizzling. Maybe it was better to have Proxy objects as in the previous implementation. I don't like attaching hundreds of members of xyzw permutations to the StrandsNode prototype. What do you think, @davepagurek?
  • There are a fair number of new files and a new strands folder added to the repo in this PR. How do you feel about that, and also about the naming conventions, @ksen0?
  • In the coming weeks, this writeup could be adapted into proper contributor docs outlining all of this in a succinct (and visual) way.
  • It could be neat to represent the IR graphs in a p5 sketch (a shader) and use this as a visual test. Visualising the language as much as possible will help contributors to understand how it's working.
  • Once I have properly figured out the if statements, loops should follow a similar structure, and would be a good issue for somebody to tackle if they want to.
  • There is a possibility to optimize the shader code after the IR is built; for example, there's a template for constant folding already in ir_types. I'm just not sure whether this will actually optimize anything, or whether the respective backend compiler (GLSL/WGPU) will do a better job anyway.
  • As mentioned, it would be good to get somebody from FES (@IIITM-Jay) looking at this at some point, although there is no rush for now.
  • As the type system has matured, we are more capable of defining the input structs to the hooks in our IR. I can see a possibility that more and more of our internal shader workings could run on p5.strands.
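
On the swizzling question above, a Proxy-based version could be sketched along these lines. This is only an illustration of the trade-off, not the previous implementation or the one in this PR: parseSwizzle(), withSwizzles() and the components array are hypothetical, and a real version would build an IR node rather than copy values.

```javascript
// Hypothetical sketch of the Proxy alternative; not the PR's implementation.
const SWIZZLE_SETS = ['xyzw', 'rgba', 'stpq'];

// Returns component indices for a valid swizzle name, or null otherwise.
// All characters must come from the same set, so `xr` is rejected.
function parseSwizzle(name) {
  if (typeof name !== 'string' || name.length < 1 || name.length > 4) return null;
  const set = SWIZZLE_SETS.find((s) => [...name].every((ch) => s.includes(ch)));
  return set ? [...name].map((ch) => set.indexOf(ch)) : null;
}

function withSwizzles(node) {
  return new Proxy(node, {
    get(target, prop, receiver) {
      const indices = parseSwizzle(prop);
      // Fall through to normal property access for anything that is not a swizzle.
      if (indices === null || prop in target) return Reflect.get(target, prop, receiver);
      return withSwizzles({ components: indices.map((i) => target.components[i]) });
    },
  });
}

const color = withSwizzles({ components: [0.1, 0.2, 0.3, 1.0] });
const swizzled = color.rrg.components; // [0.1, 0.1, 0.2]
```

The Proxy resolves names on access instead of defining every permutation up front, at the cost of wrapping each node and making the available members invisible to tooling such as autocomplete.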

Will write anything down here as I think of more

PR Checklist

…not p5 defined structs such as Vertex inputs)
@lukeplowden lukeplowden added this to the 2.1 milestone Jul 30, 2025
@lukeplowden lukeplowden marked this pull request as draft July 30, 2025 12:36
@lukeplowden lukeplowden requested a review from davepagurek July 30, 2025 12:37
Contributor

@davepagurek davepagurek left a comment

Thanks for all this work, it's looking good!

Just for my own understanding (and to maybe put in a doc somewhere at some point), the distinction between the control flow graph and the DAG is that the DAG stores a node for each state of each variable as it goes through the program, and the CFG controls the higher level constructs like functions, loops, and if statements that break up the values? I'm sort of picturing the CFG and DAG nodes as all inhabiting the same overall graph, sort of like these rough diagrams from when we were talking earlier:

image

But in the above picture it's not clear what goes in each block, so like these would both be equivalent:

let a = 1
let b = 2
if (b > 1) {
  let c = 3
  a = c
}
return a

let a = 1
let b = 2
let condition = b > 1
let c = 3
if (condition) {
  a = c
}
return a

So are the control flow graph nodes sort of like the big IfElse block in there but that also draw a line around which values should be within the different parts of the if?

let { dimension, baseType } = typeInfo;

if (dimension !== 1) {
FES.internalError('Created a literal node with dimension > 1.')

Suggested change
- FES.internalError('Created a literal node with dimension > 1.')
+ FES.internalError('Created a scalar literal node with dimension > 1.')

p5.disableFriendlyErrors = true;
}

function deinitStrandsContext(ctx) {

Do we need to set ctx.active = false in here? Looks like some of the test failures may be due to the context remaining active

// The callbacks for AssignmentExpression and BinaryExpression handle
// operator overloading including +=, *= assignment expressions
ArrayExpression(node, _state, _ancestors) {
const original = JSON.parse(JSON.stringify(node));

I think we'll need to re-apply the early returns added in #7961, where we check for an ancestor being a uniform

}

function ancestorIsUniform(ancestor) {
return ancestor.type === 'CallExpression'

There's also an updated version of this in #7961 that handles instance mode

@@ -1116,13 +1116,12 @@ function shadergenerator(p5, fn) {
GLOBAL_SHADER = this;
this.userCallback = userCallback;
this.srcLocations = srcLocations;
this.cleanup = () => {};
this.generateHookOverrides(originalShader);
this.output = {

Is this file still being used?

}

export function getOrCreateNode(graph, node) {
// const key = getNodeKey(node);

haha I guess we're not getting, just creating? just double checking if these need to be uncommented

},
...(hasDuplicates ? {} : {
set(value) {
return assignSwizzleNode(strandsContext, this, swizzle, value);

Is this defined?

[NodeType.VARIABLE]: ["identifier", "dimension", "baseType"],
[NodeType.CONSTANT]: ["value", "dimension", "baseType"],
[NodeType.STRUCT]: [""],
[NodeType.PHI]: ["dependsOn", "phiBlocks", "dimension", "baseType"],

What does this type represent?

[NodeType.OPERATION]: ["opCode", "dependsOn", "dimension", "baseType"],
[NodeType.LITERAL]: ["value", "dimension", "baseType"],
[NodeType.VARIABLE]: ["identifier", "dimension", "baseType"],
[NodeType.CONSTANT]: ["value", "dimension", "baseType"],

is this used currently, or just for the future?

[BlockType.DEFAULT]: (blockID, strandsContext, generationContext) => {
const { dag, cfg } = strandsContext;

const instructions = cfg.blockInstructions[blockID] || [];

Do these have to be sorted by DAG order to be valid? (Are these naturally stored in sorted order already?)

@ksen0 ksen0 moved this to Open for Discussion in p5.js 2.x 🌱🌳 Jul 31, 2025

2 participants