Commit 1232f02

Merge pull request #44 from dora-rs/cli-docs
Improve CLI docs
2 parents 3e8bf12 + 503a6c2 commit 1232f02

3 files changed

+227
-31
lines changed

docs/.templates/cli.md

Lines changed: 108 additions & 11 deletions
@@ -3,47 +3,144 @@ This file is auto-generated using:
33
npm run update-cli
44
-->
55

6-
## Overview
6+
# CLI Commands
7+
8+
Dora's _command line interface_ (CLI) is the main way to control and inspect Dora dataflows.
9+
The CLI provides commands to build and run dataflows, both locally and on remote machines.
10+
It also provides various helper commands, for example for creating new nodes from a template or for visualizing a dataflow graph.
11+
12+
## Local Build & Run
13+
14+
The two most important commands when getting started are `dora build` and `dora run`.
15+
16+
- **`dora build`** reads the [build](../dataflow-config/#build) and [git](../dataflow-config/#git) fields of the given [dataflow configuration file](../dataflow-config/) and executes the specified clone and build instructions.
17+
- If the dataflow contains Python nodes, you can pass the `--uv` argument to `dora build` to replace `pip` commands with [`uv`](https://docs.astral.sh/uv/) commands
18+
- **`dora run`** starts a dataflow locally. Each node is spawned as specified in its [`path`](../dataflow-config/#path-required) and [`args`](../dataflow-config/#args-and-env) fields.
19+
- It does not run any build commands, so you might want to run `dora build` first.
20+
- `dora run` _attaches_ to the started dataflow. This means that the dataflow logs are printed to the terminal and that the dataflow is stopped on `ctrl-c`.
21+
- If the dataflow contains Python nodes, you can pass the `--uv` argument to `dora run` to invoke nodes whose `path` ends with `.py` through [`uv`](https://docs.astral.sh/uv/)
22+
23+
With just these two commands you should be ready to get started!
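
As a minimal sketch of this workflow, assuming a dataflow configuration file named `dataflow.yml` (a hypothetical name) that contains Python nodes, the two commands could be used one after the other. The `dry` helper is only an illustration device that prints each command instead of executing it; drop the `dry` prefix to run the commands for real.

```shell
# Illustration helper (not part of Dora): print the command instead of running it.
dry() { echo "$@"; }

# Hypothetical dataflow configuration file; replace with your own.
DATAFLOW="dataflow.yml"

# Execute the clone and build instructions; --uv replaces pip commands with uv.
dry dora build "$DATAFLOW" --uv

# Start the dataflow locally and attach to it; stop it with ctrl-c.
dry dora run "$DATAFLOW" --uv
```

Omit `--uv` in both commands if the dataflow has no Python nodes or if you prefer plain `pip`.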
24+
25+
## Helper Commands
26+
27+
The Dora CLI provides some commands that help during development:
28+
29+
- `dora new` helps with creating new nodes and dataflows from templates
30+
- Example: run the following to create a new Python node: `dora new my_node_name --kind node --lang python`
31+
- `dora graph` visualizes a given dataflow configuration file as a graph
32+
- pass `--open` to view the generated graph in your web browser
33+
- pass `--mermaid` to create a [mermaid](https://mermaid.live/) diagram description
34+
- the `dora self` command allows updating and uninstalling dora
35+
- the `dora help` command prints help text
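
The helper commands listed above might be used as in the following sketch. The node name `my_node_name` comes from the example above; the file name `dataflow.yml` is an assumption, and the `dry` helper only prints each command so that the snippet is safe to paste.

```shell
# Illustration helper (not part of Dora): print the command instead of running it.
dry() { echo "$@"; }

# Hypothetical dataflow configuration file; replace with your own.
DATAFLOW="dataflow.yml"

# Create a new Python node from a template.
dry dora new my_node_name --kind node --lang python

# Visualize the dataflow and open the generated graph in the web browser.
dry dora graph "$DATAFLOW" --open

# Create a mermaid diagram description instead.
dry dora graph "$DATAFLOW" --mermaid
```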
36+
37+
## Distributed Build & Run
38+
39+
A dataflow can be run in a distributed fashion, with its nodes running on different machines.
40+
This makes it possible, for example, to move compute-intensive nodes to more powerful machines, such as machines in the cloud.
41+
42+
Working with distributed dataflows is more complex and requires additional commands.
43+
The `dora run` command does not support distributed dataflows. Instead, use the commands described in the following sections.
44+
45+
### Setup
46+
To set up a distributed Dora network, you need a machine that is reachable by all machines that you want to use.
47+
On this machine, spawn the `dora coordinator`, which is the central point of contact for all machines and the CLI.
48+
If you like, you can adjust the port numbers that it listens on through command-line arguments.
49+
50+
After spawning the coordinator, spawn a `dora daemon` instance on all machines that should run parts of the dataflow.
51+
Use the following command for this:
52+
53+
```
54+
dora daemon --machine-id SOME_UNIQUE_ID --coordinator-addr 0.0.0.0
55+
```
56+
57+
Each daemon instance needs to have a different `--machine-id` argument.
58+
These machine IDs can later be referenced in dataflow configuration files to specify the target machine for each node.
59+
60+
Set the `--coordinator-addr` argument to the IP address of the `dora coordinator` instance.
61+
If you chose different port numbers for the coordinator, you additionally need to specify a `--coordinator-port` argument.
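
A setup with one coordinator and two daemons could look like the following sketch. The address `192.0.2.10` and the machine IDs `machine-a` and `machine-b` are placeholders chosen for illustration, and each command is meant to be run on a different machine. The `dry` helper only prints the commands; remove it to execute them.

```shell
# Illustration helper (not part of Dora): print the command instead of running it.
dry() { echo "$@"; }

# Hypothetical IP address of the machine that runs the coordinator.
COORDINATOR_ADDR="192.0.2.10"

# On the machine that is reachable by all others: spawn the coordinator.
dry dora coordinator

# On the first worker machine: spawn a daemon with a unique machine ID.
dry dora daemon --machine-id machine-a --coordinator-addr "$COORDINATOR_ADDR"

# On the second worker machine: same command, different machine ID.
dry dora daemon --machine-id machine-b --coordinator-addr "$COORDINATOR_ADDR"
```

The machine IDs used here are the values that a dataflow configuration file would later reference to place nodes on a specific machine.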
62+
63+
### Distributed Build, Run, and Stop
64+
65+
Building a distributed dataflow uses the same `dora build` command as a local build.
66+
The command will inspect the given dataflow specification file and look for `_unstable_deploy` keys in it.
67+
If it finds any, it will do a distributed build instead of a local build.
68+
69+
A distributed build works by instructing the coordinator to send build commands to all connected daemons.
70+
This way, each node is built directly on the machine where it will run later.
71+
If the CLI and coordinator are running on different machines, you need to specify a `--coordinator-addr` argument:
72+
73+
```
74+
dora start --coordinator-addr 0.0.0.0
75+
```
76+
77+
You can also specify a `--coordinator-port` (if the coordinator listens on a non-default port).
78+
79+
To run a distributed dataflow, use the **`dora start`** command.
80+
This command is very similar to `dora run`, but it sends the dataflow to the coordinator instead of running it directly.
81+
Again, you need to specify the `--coordinator-addr` (and `--coordinator-port`) arguments.
82+
Like `dora run`, you can pass `--uv` to spawn Python nodes through [`uv`](https://docs.astral.sh/uv/).
83+
84+
By default `dora start` will _attach_ to the started dataflow.
85+
This means that the dataflow logs are printed to the terminal and that the dataflow is stopped on `ctrl-c`.
86+
You can also use the `--detach` flag to let the dataflow run in the background.
87+
88+
If you want to stop a dataflow, you can use the `dora stop` command.
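
Put together, a distributed build, run, and stop cycle might look like the sketch below. It assumes a dataflow file named `dataflow.yml` that contains `_unstable_deploy` keys and a coordinator at the placeholder address `192.0.2.10`; both values are hypothetical. Whether `dora stop` needs coordinator address options in your setup is not covered here, so the sketch shows it without any. The `dry` helper only prints the commands.

```shell
# Illustration helper (not part of Dora): print the command instead of running it.
dry() { echo "$@"; }

# Hypothetical values; replace with your own.
DATAFLOW="dataflow.yml"
COORDINATOR_ADDR="192.0.2.10"

# Distributed build: the coordinator forwards build commands to the daemons.
dry dora build "$DATAFLOW" --coordinator-addr "$COORDINATOR_ADDR"

# Start the dataflow through the coordinator and let it run in the background.
dry dora start "$DATAFLOW" --coordinator-addr "$COORDINATOR_ADDR" --detach

# Stop a dataflow; without an ID, the CLI lets you choose a running one.
dry dora stop
```

Leaving out `--detach` keeps the CLI attached, so the logs are printed to the terminal and `ctrl-c` stops the dataflow.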
89+
90+
#### Dismantle a Dora Network
91+
92+
The `dora destroy` command instructs the coordinator to stop all running dataflows, then tells all connected daemons to exit, and finally exits itself.
93+
It therefore provides a way to stop everything Dora-related.
94+
95+
#### Inspect
96+
97+
You can use the following commands to request information about distributed dataflows from the coordinator:
98+
99+
- use `dora list` to get a list of all running dataflow instances
100+
- use `dora logs` to retrieve the log output of an active or finished dataflow run
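
A short session that inspects the network and then dismantles it could be sketched as follows. `dora logs` is left out because its arguments depend on the dataflow and node you want to inspect. The `dry` helper is an illustration device that prints the commands instead of executing them.

```shell
# Illustration helper (not part of Dora): print the command instead of running it.
dry() { echo "$@"; }

# Ask the coordinator for all running dataflow instances.
dry dora list

# Stop all dataflows, let the daemons exit, and shut down the coordinator.
dry dora destroy
```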
101+
102+
103+
## All Commands
7104

8105
{}
9106

10-
## `up`
107+
### `up`
11108

12109
{up}
13110

14-
## `new`
111+
### `new`
15112

16113
{new}
17114

18-
## `start`
115+
### `start`
19116

20117
{start}
21118

22-
## `list`
119+
### `list`
23120

24121
{list}
25122

26-
## `logs`
123+
### `logs`
27124

28125
{logs}
29126

30-
## `check`
127+
### `check`
31128

32129
{check}
33130

34-
## `stop`
131+
### `stop`
35132

36133
{stop}
37134

38-
## `destroy`
135+
### `destroy`
39136

40137
{destroy}
41138

42-
## `graph`
139+
### `graph`
43140

44141
{graph}
45142

46-
## `--version`
143+
### `--version`
47144

48145
```
49146
Returns the current version of dora

docs/api/cli.md

Lines changed: 118 additions & 19 deletions
@@ -3,7 +3,104 @@ This file is auto-generated using:
33
npm run update-cli
44
-->
55

6-
## Overview
6+
# CLI Commands
7+
8+
Dora's _command line interface_ (CLI) is the main way to control and inspect Dora dataflows.
9+
The CLI provides commands to build and run dataflows, both locally and on remote machines.
10+
It also provides various helper commands, for example for creating new nodes from a template or for visualizing a dataflow graph.
11+
12+
## Local Build & Run
13+
14+
The two most important commands when getting started are `dora build` and `dora run`.
15+
16+
- **`dora build`** reads the [build](../dataflow-config/#build) and [git](../dataflow-config/#git) fields of the given [dataflow configuration file](../dataflow-config/) and executes the specified clone and build instructions.
17+
- If the dataflow contains Python nodes, you can pass the `--uv` argument to `dora build` to replace `pip` commands with [`uv`](https://docs.astral.sh/uv/) commands
18+
- **`dora run`** starts a dataflow locally. Each node is spawned as specified in its [`path`](../dataflow-config/#path-required) and [`args`](../dataflow-config/#args-and-env) fields.
19+
- It does not run any build commands, so you might want to run `dora build` first.
20+
- `dora run` _attaches_ to the started dataflow. This means that the dataflow logs are printed to the terminal and that the dataflow is stopped on `ctrl-c`.
21+
- If the dataflow contains Python nodes, you can pass the `--uv` argument to `dora run` to invoke nodes whose `path` ends with `.py` through [`uv`](https://docs.astral.sh/uv/)
22+
23+
With just these two commands you should be ready to get started!
24+
25+
## Helper Commands
26+
27+
The Dora CLI provides some commands that help during development:
28+
29+
- `dora new` helps with creating new nodes and dataflows from templates
30+
- Example: run the following to create a new Python node: `dora new my_node_name --kind node --lang python`
31+
- `dora graph` visualizes a given dataflow configuration file as a graph
32+
- pass `--open` to view the generated graph in your web browser
33+
- pass `--mermaid` to create a [mermaid](https://mermaid.live/) diagram description
34+
- the `dora self` command allows updating and uninstalling dora
35+
- the `dora help` command prints help text
36+
37+
## Distributed Build & Run
38+
39+
A dataflow can be run in a distributed fashion, with its nodes running on different machines.
40+
This makes it possible, for example, to move compute-intensive nodes to more powerful machines, such as machines in the cloud.
41+
42+
Working with distributed dataflows is more complex and requires additional commands.
43+
The `dora run` command does not support distributed dataflows. Instead, use the commands described in the following sections.
44+
45+
### Setup
46+
To set up a distributed Dora network, you need a machine that is reachable by all machines that you want to use.
47+
On this machine, spawn the `dora coordinator`, which is the central point of contact for all machines and the CLI.
48+
If you like, you can adjust the port numbers that it listens on through command-line arguments.
49+
50+
After spawning the coordinator, spawn a `dora daemon` instance on all machines that should run parts of the dataflow.
51+
Use the following command for this:
52+
53+
```
54+
dora daemon --machine-id SOME_UNIQUE_ID --coordinator-addr 0.0.0.0
55+
```
56+
57+
Each daemon instance needs to have a different `--machine-id` argument.
58+
These machine IDs can later be referenced in dataflow configuration files to specify the target machine for each node.
59+
60+
Set the `--coordinator-addr` argument to the IP address of the `dora coordinator` instance.
61+
If you chose different port numbers for the coordinator, you additionally need to specify a `--coordinator-port` argument.
62+
63+
### Distributed Build, Run, and Stop
64+
65+
Building a distributed dataflow uses the same `dora build` command as a local build.
66+
The command will inspect the given dataflow specification file and look for `_unstable_deploy` keys in it.
67+
If it finds any, it will do a distributed build instead of a local build.
68+
69+
A distributed build works by instructing the coordinator to send build commands to all connected daemons.
70+
This way, each node is built directly on the machine where it will run later.
71+
If the CLI and coordinator are running on different machines, you need to specify a `--coordinator-addr` argument:
72+
73+
```
74+
dora start --coordinator-addr 0.0.0.0
75+
```
76+
77+
You can also specify a `--coordinator-port` (if the coordinator listens on a non-default port).
78+
79+
To run a distributed dataflow, use the **`dora start`** command.
80+
This command is very similar to `dora run`, but it sends the dataflow to the coordinator instead of running it directly.
81+
Again, you need to specify the `--coordinator-addr` (and `--coordinator-port`) arguments.
82+
Like `dora run`, you can pass `--uv` to spawn Python nodes through [`uv`](https://docs.astral.sh/uv/).
83+
84+
By default `dora start` will _attach_ to the started dataflow.
85+
This means that the dataflow logs are printed to the terminal and that the dataflow is stopped on `ctrl-c`.
86+
You can also use the `--detach` flag to let the dataflow run in the background.
87+
88+
If you want to stop a dataflow, you can use the `dora stop` command.
89+
90+
#### Dismantle a Dora Network
91+
92+
The `dora destroy` command instructs the coordinator to stop all running dataflows, then tells all connected daemons to exit, and finally exits itself.
93+
It therefore provides a way to stop everything Dora-related.
94+
95+
#### Inspect
96+
97+
You can use the following commands to request information about distributed dataflows from the coordinator:
98+
99+
- use `dora list` to get a list of all running dataflow instances
100+
- use `dora logs` to retrieve the log output of an active or finished dataflow run
101+
102+
103+
## All Commands
7104

8105
```
9106
dora-rs cli client
@@ -13,30 +110,30 @@ Usage: dora <COMMAND>
13110
Commands:
14111
check Check if the coordinator and the daemon is running
15112
graph Generate a visualization of the given graph using mermaid.js. Use --open to open
16-
browser
113+
browser
17114
build Run build commands provided in the given dataflow
18-
new Generate a new project or node. Choose the language between Rust, Python, C or
19-
C++
115+
new Generate a new project or node. Choose the language between Rust, Python, C or C++
116+
run Run a dataflow locally
20117
up Spawn coordinator and daemon in local mode (with default config)
21118
destroy Destroy running coordinator and daemon. If some dataflows are still running, they
22-
will be stopped first
23-
start Start the given dataflow path. Attach a name to the running dataflow by using
24-
--name
119+
will be stopped first
120+
start Start the given dataflow path. Attach a name to the running dataflow by using --name
25121
stop Stop the given dataflow UUID. If no id is provided, you will be able to choose
26-
between the running dataflows
122+
between the running dataflows
27123
list List running dataflows
28124
logs Show logs of a given dataflow and node
29125
daemon Run daemon
30126
runtime Run runtime
31127
coordinator Run coordinator
128+
self Dora CLI self-management commands
32129
help Print this message or the help of the given subcommand(s)
33130
34131
Options:
35132
-h, --help Print help
36133
-V, --version Print version
37134
```
38135

39-
## `up`
136+
### `up`
40137

41138
```
42139
Spawn coordinator and daemon in local mode (with default config)
@@ -47,7 +144,7 @@ Options:
47144
-h, --help Print help
48145
```
49146

50-
## `new`
147+
### `new`
51148

52149
```
53150
Generate a new project or node. Choose the language between Rust, Python, C or C++
@@ -59,13 +156,13 @@ Arguments:
59156
60157
Options:
61158
--kind <KIND> The entity that should be created [default: dataflow] [possible values:
62-
dataflow, custom-node]
159+
dataflow, node]
63160
--lang <LANG> The programming language that should be used [default: rust] [possible values:
64161
rust, python, c, cxx]
65162
-h, --help Print help
66163
```
67164

68-
## `start`
165+
### `start`
69166

70167
```
71168
Start the given dataflow path. Attach a name to the running dataflow by using --name
@@ -80,11 +177,13 @@ Options:
80177
--coordinator-addr <IP> Address of the dora coordinator [default: 127.0.0.1]
81178
--coordinator-port <PORT> Port number of the coordinator control server [default: 6012]
82179
--attach Attach to the dataflow and wait for its completion
180+
--detach Run the dataflow in background
83181
--hot-reload Enable hot reloading (Python only)
182+
--uv
84183
-h, --help Print help
85184
```
86185

87-
## `list`
186+
### `list`
88187

89188
```
90189
List running dataflows
@@ -97,7 +196,7 @@ Options:
97196
-h, --help Print help
98197
```
99198

100-
## `logs`
199+
### `logs`
101200

102201
```
103202
Show logs of a given dataflow and node
@@ -114,7 +213,7 @@ Options:
114213
-h, --help Print help
115214
```
116215

117-
## `check`
216+
### `check`
118217

119218
```
120219
Check if the coordinator and the daemon is running
@@ -128,7 +227,7 @@ Options:
128227
-h, --help Print help
129228
```
130229

131-
## `stop`
230+
### `stop`
132231

133232
```
134233
Stop the given dataflow UUID. If no id is provided, you will be able to choose between the running
@@ -147,7 +246,7 @@ Options:
147246
-h, --help Print help
148247
```
149248

150-
## `destroy`
249+
### `destroy`
151250

152251
```
153252
Destroy running coordinator and daemon. If some dataflows are still running, they will be stopped
@@ -161,7 +260,7 @@ Options:
161260
-h, --help Print help
162261
```
163262

164-
## `graph`
263+
### `graph`
165264

166265
```
167266
Generate a visualization of the given graph using mermaid.js. Use --open to open browser
@@ -177,7 +276,7 @@ Options:
177276
-h, --help Print help
178277
```
179278

180-
## `--version`
279+
### `--version`
181280

182281
```
183282
Returns the current version of dora
