
Conat integrated cluster support #8388


Merged Jul 6, 2025 (118 commits)
118 commits
c86f5b8
starting work on supercluster support
williamstein Jun 25, 2025
898ec4f
Merge branch 'master' into conat-supercluster
williamstein Jun 25, 2025
007be51
conat supercluster: add initial updates to stream; more tests
williamstein Jun 25, 2025
e961c7f
supercluster: working on it.
williamstein Jun 25, 2025
4358848
work-in-progress implement waitForInterest for superclusters...
williamstein Jun 25, 2025
d8e5bc6
Merge branch 'master' into conat-supercluster
williamstein Jun 25, 2025
484ec3c
supercluster -- implement first not-efficient waitForInterest, since …
williamstein Jun 25, 2025
cdc2b74
conat: fix some unit tests
williamstein Jun 25, 2025
445e32e
supercluster config -- switch to organizing by name
williamstein Jun 25, 2025
a904b09
supercluster -- making progress :-)
williamstein Jun 25, 2025
ff7340d
Merge branch 'master' into conat-supercluster
williamstein Jun 25, 2025
0ea68be
Merge branch 'master' into conat-supercluster
williamstein Jun 26, 2025
fc9c29e
Merge branch 'master' into conat-supercluster
williamstein Jun 26, 2025
ef18eea
fix typescript error
williamstein Jun 26, 2025
43a1733
Merge branch 'master' into conat-supercluster
williamstein Jun 26, 2025
f1dbacc
make waitForInterest cancellable and use this when waiting for intere…
williamstein Jun 26, 2025
4376b68
conat cluster: run dedicated persist server for cluster state
williamstein Jun 27, 2025
dcf1570
Merge branch 'master' into conat-supercluster
williamstein Jun 27, 2025
7d8582e
implement automated triming of supercluster stream when interest in p…
williamstein Jun 27, 2025
cb923bb
...
williamstein Jun 27, 2025
fa64267
Merge branch 'master' into conat-supercluster
williamstein Jun 27, 2025
8ca25f3
Merge branch 'master' into conat-supercluster
williamstein Jun 27, 2025
026d555
Merge branch 'master' into conat-supercluster
williamstein Jun 27, 2025
a601fcd
Merge branch 'master' into conat-supercluster
williamstein Jun 27, 2025
828f6e5
cluster: starting work on sticky stream
williamstein Jun 27, 2025
5272b12
Merge branch 'master' into conat-supercluster
williamstein Jun 27, 2025
056a9d7
cluster: working on making sticky state be in a stream
williamstein Jun 28, 2025
b327064
cluster: very basic test of sticky queue group
williamstein Jun 28, 2025
fa9c829
refactor sticky tests
williamstein Jun 28, 2025
3500b3c
try to clarify sticky queue groups a little
williamstein Jun 28, 2025
7be0b60
rename: "supercluster" --> "cluster" everywhere
williamstein Jun 28, 2025
e9a536f
work in progress setting up a way to benchmark/test conat clustering
williamstein Jun 28, 2025
480ba99
Merge branch 'master' into conat-supercluster
williamstein Jun 28, 2025
022dafa
mainly writing benchmarking code for conat server
williamstein Jun 28, 2025
c70c2d9
Merge branch 'master' into conat-supercluster
williamstein Jun 28, 2025
2486ba9
Merge branch 'master' into conat-supercluster
williamstein Jun 28, 2025
4310e65
conat server clustering -- decide to just go fully with my own approach
williamstein Jun 29, 2025
4b6f6c4
rewrite all mention of valkey to use our own clustering approach
williamstein Jun 29, 2025
956c0ff
Merge branch 'master' into conat-supercluster
williamstein Jun 29, 2025
2883a88
server cluster: unjoin
williamstein Jun 29, 2025
8876341
work in progress on cluster implementation
williamstein Jun 29, 2025
f556a72
fix typings for sys api
williamstein Jun 29, 2025
bf0cfd4
better cluster visibility
williamstein Jun 29, 2025
9fb13a3
work in progress on cluster management
williamstein Jun 29, 2025
0e0bd59
unit testing buiding a cluster iteratively, node discovery, etc.
williamstein Jun 30, 2025
3ae1747
automatic cluster discovery protocol fully implemented and unit tested
williamstein Jun 30, 2025
15efc41
provide option to spawn a cluster
williamstein Jun 30, 2025
5f7d7cd
given the use, just make creating a local cluster even simpler
williamstein Jun 30, 2025
3a56bf6
conat: integrating new clustering with hub
williamstein Jun 30, 2025
9d83014
conat: clustering -- got it working integrated with dev environment.
williamstein Jun 30, 2025
e757543
Merge branch 'master' into conat-supercluster
williamstein Jul 1, 2025
974b579
typescript
williamstein Jul 1, 2025
9828bbd
work in progress on cluster testing
williamstein Jul 1, 2025
4e62a33
mainly -- add client.id to sync caches
williamstein Jul 1, 2025
bf8f5a7
wait for intereset when pinging client socket
williamstein Jul 1, 2025
17f01de
clustering: address subtle issues involving creating sockets and asyn…
williamstein Jul 1, 2025
4a35124
must wait for interest before configuring socket, too.
williamstein Jul 1, 2025
487ce03
correctly update connection info on reconnect
williamstein Jul 1, 2025
8426ab4
move a console.trace to debug
williamstein Jul 1, 2025
111d331
disable ssl for cluster nodes
williamstein Jul 1, 2025
f95111c
cluster: implement actual first rough draft of cluster routing algorithm
williamstein Jul 1, 2025
47d3ab1
cluster: fix a bug and some unit tests (showing sticky queue groups i…
williamstein Jul 1, 2025
bd82967
more cluster sticky fixes and unit tests
williamstein Jul 1, 2025
e001efb
fix table order for connections
williamstein Jul 2, 2025
3bad736
do not set password for conat in database -- it's too confusing
williamstein Jul 2, 2025
2630cb3
cluster: fix another issue with sticky routing
williamstein Jul 2, 2025
853b7ff
conat cluster -- test suite is pretty broken for mysterious reasons
williamstein Jul 2, 2025
250f6ba
fix some entaglement of different clusters in the unit testing
williamstein Jul 2, 2025
10e7106
fix some conat unit testing
williamstein Jul 2, 2025
8fbb512
cluster: fix remaning issue with unit testing the cluster, so now the…
williamstein Jul 2, 2025
45536e0
fix the fronend jupyter tests (mainly) and some other testing fixes
williamstein Jul 2, 2025
96d86db
reorganize how tests are run to be more flexible and support retries,…
williamstein Jul 2, 2025
f99c53b
test counter not showing properly
williamstein Jul 2, 2025
ff27263
replace the pointless test in static by an even more pointless one, b…
williamstein Jul 2, 2025
5ec7d0f
update next testing for react19
williamstein Jul 2, 2025
2ebe23e
all tests might pass (?)... but flakie tests get listed at the end
williamstein Jul 2, 2025
e08f668
make a time stress test less stressful/flakie
williamstein Jul 2, 2025
b63f9f9
fix installing into older python's pip
williamstein Jul 2, 2025
3580c8a
fix more testing/building issues
williamstein Jul 2, 2025
58f5446
build and passing tests
williamstein Jul 2, 2025
aa14cf4
attempt to fix process leaks in jupyter kernel unit tests
williamstein Jul 2, 2025
9e10558
Merge branch 'conat-supercluster' of github.com:sagemathinc/cocalc in…
williamstein Jul 2, 2025
5bb414d
add a tough conat test of persist servers in a cluster getting removed
williamstein Jul 3, 2025
594766c
fix bug in core stream state update when there is automatic failover,…
williamstein Jul 3, 2025
2794b21
make conat unit tests more robust to running under load
williamstein Jul 3, 2025
b8c7135
cluster: support for detecting and removing node from cluster
williamstein Jul 3, 2025
009815a
conat cluster: implement handling removing nodes from cluster
williamstein Jul 3, 2025
8fedf17
implement trimming sticky state
williamstein Jul 3, 2025
a73a261
do not require secret token for starting cocalc project in unit testi…
williamstein Jul 3, 2025
6bf59ec
Merge branch 'master' into conat-supercluster
williamstein Jul 3, 2025
1549ae5
making conat tests more robust
williamstein Jul 3, 2025
6a84cbe
conat cluster: fix bug with canceling a wait promise; make tests more…
williamstein Jul 3, 2025
219fdb8
conat: found a way to make it so a critical test always fails -- this…
williamstein Jul 4, 2025
b024ef5
reorganize metrics recorder, error listener, and enable for cluster n…
williamstein Jul 4, 2025
5737964
Merge branch 'master' into conat-supercluster
williamstein Jul 4, 2025
4064d38
work in progress rewriting autoconnect logic
williamstein Jul 4, 2025
06f3e2c
fix for moving hub_register
williamstein Jul 4, 2025
9102d74
revert attempt to have nodejs conat client manage own reconnect logic
williamstein Jul 5, 2025
c75cbff
conat core -- carefully tracking down a subtle test failure
williamstein Jul 5, 2025
8717103
ts errors in some unit testing
williamstein Jul 5, 2025
7382c44
conat core -- working on streams
williamstein Jul 5, 2025
a756f03
conat streams - take into account cases where messages can be dropped…
williamstein Jul 5, 2025
7617309
upgrade prom-client
williamstein Jul 5, 2025
e5fb606
conat: surpress errors when client already closed -- cb is still call…
williamstein Jul 5, 2025
b8bc737
relax some conat test tollerances
williamstein Jul 5, 2025
f383452
(hsy) fix spelling of "flakie"
williamstein Jul 5, 2025
7dc2541
sync tests hang so force exit (for now)
williamstein Jul 5, 2025
ec4b035
conate-core --> conat-router
williamstein Jul 5, 2025
bb0eb01
Merge branch 'conat-supercluster' of github.com:sagemathinc/cocalc in…
williamstein Jul 5, 2025
641a164
conat server: add some kucalc support code (not really tested)
williamstein Jul 6, 2025
e0a54f4
kucalc conat: set the cluster name; do not include self with addresses
williamstein Jul 6, 2025
2b654a7
conat kucalc: set the server id based on hostname
williamstein Jul 6, 2025
51df338
conat kucalc router -- password
williamstein Jul 6, 2025
b0829bc
conat kucalc -- remove pods that have left
williamstein Jul 6, 2025
bb298be
conat core-stream init: properly handle an error
williamstein Jul 6, 2025
c961835
conat tests -- refactoring and improving
williamstein Jul 6, 2025
a426aae
new version
williamstein Jul 6, 2025
66205ef
more testing tweaking
williamstein Jul 6, 2025
4 changes: 2 additions & 2 deletions src/package.json
@@ -15,9 +15,9 @@
     "database": "cd dev/project && ./start_postgres.py",
     "database-remove-locks": "./scripts/database-remove-locks",
     "c": "LOGS=/tmp/ DEBUG='cocalc:*' ./scripts/c",
-    "version-check": "pip3 install --break-system-packages typing_extensions mypy || pip3 install --break-system-packages typing_extensions mypy && ./workspaces.py version-check && mypy scripts/check_npm_packages.py",
+    "version-check": "pip3 install typing_extensions mypy || pip3 install --break-system-packages typing_extensions mypy && ./workspaces.py version-check && mypy scripts/check_npm_packages.py",
     "test-parallel": "unset DEBUG && pnpm run version-check && cd packages && pnpm run -r --parallel test",
-    "test": "unset DEBUG && pnpm run version-check && cd packages && pnpm run -r test",
+    "test": "unset DEBUG && pnpm run version-check && ./workspaces.py test",
     "depcheck": "cd packages && pnpm run -r --parallel depcheck",
     "prettier-all": "cd packages/",
     "local-ci": "./scripts/ci.sh",
3 changes: 3 additions & 0 deletions src/packages/backend/bin/conat-connections.cjs
@@ -1,12 +1,15 @@
 const { conat } = require('@cocalc/backend/conat')
 const { showUsersAndStats } = require('@cocalc/conat/monitor/tables');
 const { conatServer } = require('@cocalc/backend/data')
+const { delay } = require("awaiting");

 async function main() {
   console.log("Connecting to", conatServer);
   const maxMessages = process.argv[2] ? parseInt(process.argv[2]) : undefined;
   const maxWait = process.argv[3] ? parseInt(process.argv[3]) : 3000;
   const client = conat();
+  await client.waitUntilSignedIn();
+  await delay(1000);
   if(!maxMessages) {
     console.log("\nUsage: pnpm conat-connnections [num-servers] [max-wait-ms]\n")
   }
7 changes: 6 additions & 1 deletion src/packages/backend/bin/conat-disconnect.cjs
@@ -1,12 +1,17 @@
 const { conat } = require('@cocalc/backend/conat')
 const { connections } = require('@cocalc/conat/monitor/tables');
+const { sysApi } = require("@cocalc/conat/core/sys");
+const { delay } = require("awaiting");

 async function main() {
   console.log("Disconnect Clients From Server")
   const ids = process.argv.slice(2);
   console.log(ids);
   const client = await conat()
-  await client.call('sys.conat.server').disconnect(ids);
+  await client.waitUntilSignedIn();
+  await delay(1000);
+  const sys = sysApi(client);
+  await sys.disconnect(ids);
   process.exit(0);
 }

24 changes: 24 additions & 0 deletions src/packages/backend/bin/conat-test-server.cjs
@@ -0,0 +1,24 @@
+/*
+Starts a node for testing and benchmarking. It has no auth or security, and is just meant
+for unit testing and benchmarking purposes.
+*/
+for(const name of ['PORT', 'BASE_PATH', 'CONAT_SERVER', 'COCALC_PROJECT_ID']) {
+  delete process.env[name];
+}
+const { initConatServer } = require("@cocalc/backend/conat/test/setup");
+const { setConatServer } = require("@cocalc/backend/data");
+
+async function main() {
+  const clusterName = process.argv[2];
+  const server = await initConatServer({
+    id: "0",
+    clusterName,
+    systemAccountPassword: "test",
+  });
+  console.log("ADDRESS: ", server.address())
+  setConatServer(server.address())
+}
+
+main();


27 changes: 27 additions & 0 deletions src/packages/backend/conat/test/cluster/bench.ts
@@ -0,0 +1,27 @@
+/*
+create a cluster consisting of $n$ distinct nodejs processes, each
+listening on separate ports on localhost. They are all connected.
+*/
+
+interface Node {
+  port: number;
+  child; // the spawned child process
+}
+
+export class Cluster {
+  nodes: Node[] = [];
+
+  constructor(public N: number) {}
+
+  init = async () => {
+    for (let i = 0; i < this.N; i++) {
+
+    }
+  };
+}
+
+export async function createCluster(N: number): Promise<Cluster> {
+  const C = new Cluster(N);
+  await C.init();
+  return C;
+}
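
The `init` loop in `bench.ts` is still an empty stub in this PR. As a rough sketch only, and not the author's implementation, the loop could assign one localhost port per node and pass each port to a launcher that starts the process. The names `planPorts`, `Launcher`, `basePort`, and `close` are hypothetical and introduced purely for illustration; nothing below is part of the conat API, and the step that connects the nodes to each other is left out.

```typescript
import type { ChildProcess } from "node:child_process";

interface Node {
  port: number;
  child: ChildProcess; // the spawned child process
}

// Hypothetical helper: one port per node, counting up from basePort.
export function planPorts(N: number, basePort = 3100): number[] {
  if (!Number.isInteger(N) || N < 0) {
    throw new Error("N must be a non-negative integer");
  }
  return Array.from({ length: N }, (_, i) => basePort + i);
}

// Hypothetical launcher: given a port, start a process listening on it.
// Injecting it keeps the class independent of how a node is started.
export type Launcher = (port: number) => ChildProcess;

export class Cluster {
  nodes: Node[] = [];

  constructor(
    public N: number,
    private launch: Launcher,
    private basePort = 3100, // assumed default, not taken from the PR
  ) {}

  // Start N processes, each on its own port.
  init = async () => {
    for (const port of planPorts(this.N, this.basePort)) {
      this.nodes.push({ port, child: this.launch(port) });
    }
  };

  // Kill every spawned process and forget the nodes.
  close = () => {
    for (const { child } of this.nodes) {
      child.kill();
    }
    this.nodes = [];
  };
}

export async function createCluster(
  N: number,
  launch: Launcher,
): Promise<Cluster> {
  const C = new Cluster(N, launch);
  await C.init();
  return C;
}
```

A launcher could be, for example, `(port) => spawn(process.execPath, ["some-node-script.js", String(port)])` with `spawn` from `node:child_process`, assuming a script that reads its port from the command line. The `conat-test-server.cjs` script shown in this PR deletes `PORT` from its environment and reads only a cluster name from `process.argv[2]`, so it would need changes before it could be used this way.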