| 1 | +# Testing Nodes |
| 2 | + |
| 3 | +To ensure that a node behaves as expected, you should test it. Testing can happen at several |
| 4 | +levels: |
| 5 | + |
| 6 | +- **[Unit testing](https://en.wikipedia.org/wiki/Unit_testing):** |
| 7 | + Verify that a specific function or component behaves as expected. |
| 8 | + |
| 9 | + Nodes are normal executables/scripts, so you can use the standard testing tools of your chosen |
| 10 | + programming language for unit testing. For example, for Rust nodes you can use Rust's built-in |
| 11 | + [test framework](https://doc.rust-lang.org/book/ch11-01-writing-tests.html) combined with |
| 12 | + `cargo test`. |
| 13 | +- **[Integration testing](https://en.wikipedia.org/wiki/Integration_testing):** |
| 14 | + Verify that a node reacts as expected to a set of inputs and that it produces the expected outputs. |
| 15 | + |
| 16 | + Dora does _not_ offer an automated integration testing feature yet. We plan to add a way |
| 17 | + to run nodes in a standalone "test mode" where inputs are supplied through a special input file |
| 18 | + and outputs are written to an output file. This will enable integration testing of nodes, as you |
| 19 | + can verify that each node reacts as expected to given inputs. |
| 20 | + |
| 21 | + However, Dora nodes can be run in a standalone _"[interactive mode](#interactive-mode)"_, where |
| 22 | + inputs are given through the command line. This feature is useful for manual integration testing. |
| 23 | + See [below](#interactive-mode) for details. |
| 24 | +- **[End-to-end testing](https://en.wikipedia.org/wiki/System_testing):** |
| 25 | + Verify that a full dataflow with multiple nodes works as expected. |
| 26 | + |
| 27 | + This sort of testing is often done manually, using the `dora run` or `dora start` CLI commands. |
| 28 | + If your dataflow has well-defined exit conditions, you can also run automated tests through |
| 29 | + `dora run`: the exit status will report whether any error occurred. |
| 30 | + |
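As a minimal sketch of the unit-testing level described above, the core logic of a node can be kept in a plain function that does not touch the Dora API, so that it can be checked with `cargo test`. The function name `format_tick_message` and its behaviour are hypothetical and chosen only for illustration; they are not taken from the example node.

```rust
/// Hypothetical piece of node logic, kept free of any Dora API calls so that
/// it can be unit-tested in isolation.
fn format_tick_message(tick: u64, value: u64) -> String {
    format!("tick {tick}, sending {value:#x}")
}

fn main() {
    // The value is borrowed from the interactive-mode example in this guide.
    println!("{}", format_tick_message(0, 0x943ed1be20c711a4));
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn formats_tick_and_value() {
        // 255 is rendered as lowercase hexadecimal with a `0x` prefix.
        assert_eq!(format_tick_message(3, 255), "tick 3, sending 0xff");
    }
}
```

Keeping such logic separate from the event loop means the test needs neither a running daemon nor the interactive mode.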
| 31 | + |
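One way to automate the end-to-end check described above is a small harness that launches a command and treats a non-zero exit status as a failure. The sketch below uses only the Rust standard library. The helper name `run_and_check` and the file name `dataflow.yml` are assumptions for illustration, and the example presumes that the `dora` CLI is available on the `PATH`.

```rust
use std::process::Command;

/// Runs `program` with `args` and returns an error if it cannot be started
/// or if it exits with a non-zero status.
/// (Hypothetical helper, not part of Dora.)
fn run_and_check(program: &str, args: &[&str]) -> Result<(), String> {
    let status = Command::new(program)
        .args(args)
        .status()
        .map_err(|err| format!("failed to start `{program}`: {err}"))?;
    if status.success() {
        Ok(())
    } else {
        Err(format!("`{program}` exited with {status}"))
    }
}

fn main() {
    // `dataflow.yml` is a placeholder path for your own dataflow file.
    match run_and_check("dora", &["run", "dataflow.yml"]) {
        Ok(()) => println!("dataflow finished without errors"),
        Err(err) => eprintln!("dataflow test failed: {err}"),
    }
}
```

This approach only makes sense for dataflows that terminate on their own, since the harness blocks until the command exits.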
| 32 | +## Interactive Mode |
| 33 | + |
| 34 | +The interactive mode enables starting a node in a standalone mode that prompts for inputs on the |
| 35 | +terminal. It is available for all nodes that use the `init_from_env` or `init_interactive` |
| 36 | +function for their initialization. To start the interactive mode, start your node executable/script |
| 37 | +manually like a normal executable. |
| 38 | + |
| 39 | +Instead of connecting to a `dora daemon`, this interactive mode will prompt for node inputs |
| 40 | +on the terminal. In this mode, the node is completely isolated from the dora daemon and |
| 41 | +other nodes, so it cannot be part of a dataflow. |
| 42 | + |
| 43 | +### Example |
| 44 | + |
| 45 | +Run any node that uses `init_interactive` or `init_from_env` directly |
| 46 | +from a terminal. The node will then start in "interactive mode" and prompt you for the next |
| 47 | +input: |
| 48 | + |
| 49 | +```bash |
| 50 | +> cargo build -p rust-dataflow-example-node |
| 51 | +> target/debug/rust-dataflow-example-node |
| 52 | +hello |
| 53 | +Starting node in interactive mode as DORA_NODE_CONFIG env variable is not set |
| 54 | +Node asks for next input |
| 55 | +? Input ID |
| 56 | +[empty input ID to stop] |
| 57 | +``` |
| 58 | + |
| 59 | +The `rust-dataflow-example-node` expects a `tick` input, so let's set the input ID to |
| 60 | +`tick`. Tick messages don't have any data, so we leave the "Data" empty when prompted: |
| 61 | + |
| 62 | +```bash |
| 63 | +Node asks for next input |
| 64 | +> Input ID tick |
| 65 | +> Data |
| 66 | +tick 0, sending 0x943ed1be20c711a4 |
| 67 | +node sends output random with data: PrimitiveArray<UInt64> |
| 68 | +[ |
| 69 | + 10682205980693303716, |
| 70 | +] |
| 71 | +Node asks for next input |
| 72 | +? Input ID |
| 73 | +[empty input ID to stop] |
| 74 | +``` |
| 75 | + |
| 76 | +We see that both the node's `stdout` output and the output messages that it sends |
| 77 | +are printed to the terminal. Then we get another prompt for the next input. |
| 78 | + |
| 79 | +If you want to send an input with data, you can either send it as text (for string data) |
| 80 | +or as a JSON object (for struct data). Other data types are not supported currently. |
| 81 | + |
| 82 | +Empty input IDs are interpreted as stop instructions: |
| 83 | + |
| 84 | +```bash |
| 85 | +> Input ID |
| 86 | +given input ID is empty -> stopping |
| 87 | +Received stop |
| 88 | +Node asks for next input |
| 89 | +event channel was stopped -> returning empty event list |
| 90 | +node reports EventStreamDropped |
| 91 | +node reports closed outputs [] |
| 92 | +node reports OutputsDone |
| 93 | +``` |
| 94 | + |
| 95 | +In addition to the node output, we see log messages for the different events that the node |
| 96 | +reports. After `OutputsDone`, the node should exit. |
| 97 | + |
| 98 | +### JSON data |
| 99 | + |
| 100 | +Besides text input, the `Data` prompt supports JSON objects, which are |
| 101 | +converted to Apache Arrow struct arrays: |
| 102 | + |
| 103 | +```bash |
| 104 | +Node asks for next input |
| 105 | +> Input ID some_input |
| 106 | +> Data { "field_1": 42, "field_2": { "inner": "foo" } } |
| 107 | +``` |
| 108 | + |
| 109 | +This JSON data is converted to the following Arrow array: |
| 110 | + |
| 111 | +``` |
| 112 | +StructArray |
| 113 | +-- validity: [valid, ] |
| 114 | +[ |
| 115 | + -- child 0: "field_1" (Int64) |
| 116 | + PrimitiveArray<Int64> |
| 117 | + [42,] |
| 118 | + -- child 1: "field_2" (Struct([Field { name: "inner", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }])) |
| 119 | + StructArray |
| 120 | + -- validity: [valid,] |
| 121 | + [ |
| 122 | + -- child 0: "inner" (Utf8) |
| 123 | + StringArray |
| 124 | + ["foo",] |
| 125 | + ] |
| 126 | +] |
| 127 | +``` |
| 128 | + |