|
| 1 | +.. _flux-sched-container: |
| 2 | + |
| 3 | +==================================== |
| 4 | +Quick Start with Flux in a Container |
| 5 | +==================================== |
| 6 | + |
| 7 | +Do you want to submit a job to Flux? Here's a short tutorial on how to do so via a container! |
| 8 | +This works because a Flux instance can run anywhere... |
| 9 | + |
| 10 | + | You can run it on a cluster... 🥜️ |
| 11 | + | You can run it alongside Lustre! 📁️ |
| 12 | + | You can run it on a share... 🤗️ |
| 13 | + | You can run it anywhere! 🏔️ |
| 14 | + | You can run it in a container... 📦️ |
| 15 | + | Using Flux? Total no-brainer! 🧠️ |
| 16 | +
|
| 17 | +Flux has a regularly updated Docker image on `Docker Hub <https://hub.docker.com/u/fluxrm>`_. |
| 18 | +You will need to `install Docker <https://docs.docker.com/engine/install/>`_ first. |
| 19 | + |
| 20 | +---------------------- |
| 21 | +Starting the Container |
| 22 | +---------------------- |
| 23 | + |
| 24 | +Run the container as follows: |
| 25 | + |
| 26 | +.. code-block:: bash |
| 27 | +
|
| 28 | + # This says "run an interactive terminal" |
| 29 | + $ docker run -it fluxrm/flux-sched:latest |
| 30 | +
|
| 31 | +The container's entrypoint is set up so that when you run the container as shown above, |
| 32 | +you are dropped into a shell inside a Flux instance of size 1. |
| 33 | + |
| 34 | + |
| 35 | +.. raw:: html |
| 36 | + :file: include/entrypoint-details.html |
| 37 | + |
| 38 | +.. code-block:: shell |
| 39 | +
|
| 40 | + # What is the size of the current Flux instance (i.e. number of flux-broker processes) |
| 41 | + $ flux getattr size |
| 42 | + 1 |
| 43 | +
|
| 44 | + # What resources do we have, and what are their states? |
| 45 | + $ flux resource list |
| 46 | + STATE NNODES NCORES NGPUS NODELIST |
| 47 | + free 1 4 0 f9004106893d |
| 48 | + allocated 0 0 0 |
| 49 | + down 0 0 0 |
| 50 | +
|
| 51 | +
|
| 52 | +In the above, because we are running in a single container, we only see one node with |
| 53 | +four cores. Note that you could also imagine an example with docker compose (and we will |
| 54 | +be adding this example shortly). Let's look at more Flux interactions, first to see Flux |
| 55 | +environment variables: |
| 56 | + |
| 57 | +.. code-block:: shell |
| 58 | +
|
| 59 | + # Where are different Flux things located (the environment) |
| 60 | + $ flux env |
| 61 | +
|
| 62 | +------------ |
| 63 | +Running Jobs |
| 64 | +------------ |
| 65 | + |
| 66 | +``flux submit`` submits a job which will be scheduled and run in the background. |
| 67 | +It prints the jobid to standard output upon successful submission. To run a job interactively, |
| 68 | +use ``flux run`` which submits and then attaches to the job, displays job output in real time, |
| 69 | +and does not exit until the job finishes. |
| 70 | + |
| 71 | +.. code-block:: shell |
| 72 | +
|
| 73 | + $ flux submit hostname |
| 74 | + ƒM5k8m7m |
| 75 | + $ flux run hostname |
| 76 | + f9004106893d |
| 77 | +
|
| 78 | +Run either of the commands above with ``--help`` to see the many options |
| 79 | +for customizing a job submission, such as the number of nodes, tasks, and cores. |
| 80 | +For further details, please see :core:man1:`flux-submit` or run |
| 81 | +``flux help submit``. |
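
As a rough illustration of those options, the sketch below wraps ``flux run`` with the ``-N`` (nodes) and ``-n`` (tasks) flags. The helper name ``run_two_tasks`` is hypothetical and the values are arbitrary; they assume the single-node, four-core container shown earlier.

```shell
# Hypothetical helper: run a command as 2 tasks on 1 node.
# -N sets the node count and -n the task count; adjust both to your resources.
run_two_tasks() {
    flux run -N 1 -n 2 "$@"
}

# Example (inside the container): run_two_tasks hostname
```

Because both tasks land on the same node in this container, each task should report the same hostname.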
| 82 | + |
| 83 | +The identifier shown above is a :ref:`jobid<fluid>` (e.g., ``ƒM5k8m7m``). This kind of identifier |
| 84 | +(or similar) is returned for every job submitted, and will be how you interact with your job moving forward. |
| 85 | +Let's throw in a few more sleep jobs, and immediately ask to see them with ``flux jobs``: |
| 86 | + |
| 87 | +.. code-block:: shell |
| 88 | +
|
| 89 | + $ flux submit sleep 360 |
| 90 | + $ flux submit sleep 360 |
| 91 | +
|
| 92 | +
|
| 93 | +.. code-block:: shell |
| 94 | +
|
| 95 | + $ flux jobs |
| 96 | + JOBID USER NAME ST NTASKS NNODES TIME INFO |
| 97 | + ƒq3d755R fluxuser sleep R 1 1 2.811s 848fc387afd7 |
| 98 | + ƒpcByPgK fluxuser sleep R 1 1 3.805s 848fc387afd7 |
| 99 | +
|
| 100 | +
|
| 101 | +But wait, what happened to our first two jobs? ``flux jobs`` only shows "active" jobs by default; |
| 102 | +add ``-a`` (for "all") to see every job. |
| 103 | + |
| 104 | +.. code-block:: shell |
| 105 | +
|
| 106 | + $ flux jobs -a |
| 107 | + JOBID USER NAME ST NTASKS NNODES TIME INFO |
| 108 | + ƒq3d755R fluxuser sleep R 1 1 39.98s 848fc387afd7 |
| 109 | + ƒpcByPgK fluxuser sleep R 1 1 40.97s 848fc387afd7 |
| 110 | + ƒDi2hxdm fluxuser hostname CD 1 1 0.019s 848fc387afd7 |
| 111 | + ƒBmTaVZ9 fluxuser hostname CD 1 1 0.025s 848fc387afd7 |
| 112 | +
|
| 113 | +.. note:: |
| 114 | + |
| 115 | + See if you can figure out how to list jobs by a particular status, e.g., ``R`` in the output |
| 116 | + above means "running." Try ``flux jobs --help`` or ``flux help jobs``. |
| 117 | + |
| 118 | + |
| 119 | + |
| 120 | +Did you figure it out? It would be ``flux jobs --filter=RUNNING``. What if you were running a long |
| 121 | +process, and you wanted to check on output? Let's do that. Here is a script to loop, print, and sleep. |
| 122 | + |
| 123 | +.. code-block:: bash |
| 124 | +
|
| 125 | + #!/bin/bash |
| 126 | + # Save this as loop.sh |
| 127 | + for i in {0..10}; do |
| 128 | + echo "Hello I am loop iteration $i." |
| 129 | + sleep ${i} |
| 130 | + done |
| 131 | +
|
| 132 | +Now make the script executable, and submit the job with Flux. |
| 133 | + |
| 134 | +.. code-block:: shell |
| 135 | +
|
| 136 | + $ chmod +x ./loop.sh |
| 137 | + $ flux submit ./loop.sh |
| 138 | +
|
| 139 | +
|
| 140 | +To see output (and wait until completion) use ``flux job attach``: |
| 141 | + |
| 142 | + |
| 143 | +.. code-block:: shell |
| 144 | +
|
| 145 | + $ flux job attach ƒ4evXWb9Z |
| 146 | + Hello I am loop iteration 0. |
| 147 | + Hello I am loop iteration 1. |
| 148 | + Hello I am loop iteration 2. |
| 149 | + Hello I am loop iteration 3. |
| 150 | + Hello I am loop iteration 4. |
| 151 | + Hello I am loop iteration 5. |
| 152 | +
|
| 153 | +See ``flux help job`` or :core:man1:`flux-job` for more information on ``flux job attach``. |
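
Since ``flux submit`` prints the jobid to standard output, you can capture it in a shell variable rather than copying it by hand. The following is a minimal sketch under that assumption; ``submit_and_attach`` is a hypothetical helper name, not a Flux command.

```shell
# Hypothetical helper: submit a command, remember its jobid,
# then stream the job's output until it finishes.
submit_and_attach() {
    jobid=$(flux submit "$@") || return 1   # flux submit prints the jobid
    echo "submitted ${jobid}" >&2           # report the jobid on stderr
    flux job attach "${jobid}"
}

# Example (inside the container): submit_and_attach ./loop.sh
```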
| 154 | + |
| 155 | +------------ |
| 156 | +Viewing Jobs |
| 157 | +------------ |
| 158 | + |
| 159 | +Aside from listing jobs with ``flux jobs`` there are other ways to get metadata about jobs. |
| 160 | +For your running jobs, you can use ``flux pstree`` to see exactly that: a tree of jobs. |
| 161 | +Let's say we run another sleep job: |
| 162 | + |
| 163 | + |
| 164 | +.. code-block:: shell |
| 165 | +
|
| 166 | + $ flux submit sleep 350 |
| 167 | + ƒ744GwLs |
| 168 | +
|
| 169 | +The tree will show us that one sleep! |
| 170 | + |
| 171 | + |
| 172 | +.. code-block:: shell |
| 173 | +
|
| 174 | + $ flux pstree |
| 175 | + flux |
| 176 | + └── sleep |
| 177 | +
|
| 178 | +Submit the same command a few more times? We see that reflected in the tree! |
| 179 | + |
| 180 | +.. code-block:: shell |
| 181 | +
|
| 182 | + $ flux submit sleep 350 |
| 183 | + ƒAivupEb |
| 184 | + $ flux submit sleep 350 |
| 185 | + ƒB621bj5 |
| 186 | +
|
| 187 | + $ flux pstree |
| 188 | + flux |
| 189 | + └── 3*[sleep] |
| 190 | +
|
| 191 | +
|
| 192 | +And have you heard of a Flux jobspec? This is a data structure that describes the resources, tasks, |
| 193 | +and attributes of a job. You can view one as follows: |
| 194 | + |
| 195 | +.. code-block:: shell |
| 196 | +
|
| 197 | + $ flux job info ƒB621bj5 jobspec | jq |
| 198 | +
|
| 199 | +Finally, ``flux top`` is a cool way to see a summary of your jobs: |
| 200 | + |
| 201 | + |
| 202 | +.. code-block:: shell |
| 203 | +
|
| 204 | + ƒ ƒ63WcEKAP 3.6e+04d⌚ |
| 205 | + nodes [ 0/1] 0 pending |
| 206 | + cores [ 0/4] 0 running |
| 207 | + gpus [ 0/0] 3 complete, 0 failed ♡ |
| 208 | +
|
| 209 | + size: 1 depth: 1 uptime: 6.7m 0.47.0-148-ge2b96308f |
| 210 | + JOBID USER ST NTASKS NNODES RUNTIME NAME |
| 211 | +
|
| 212 | +Akin to vim, you can hit ``q`` to exit. And that's it! |
| 213 | +If you have any questions, please `let us know <https://github.com/flux-framework/flux-docs/issues>`_. |