
Commit 7559aa9 (2 parents: 15db1bb + 24fd006)

Merge pull request #231 from jameshcorbett/coral2-updates: Rabbit usage documentation

File tree

3 files changed: +192 −70 lines


tutorials/lab/coral2.rst

Lines changed: 58 additions & 70 deletions
@@ -1,13 +1,13 @@
 .. _coral2:
 
-==================================
-CORAL2: Native Flux on Cray Shasta
-==================================
+===========================
+CORAL2: Flux on Cray Shasta
+===========================
 
-The LLNL, LBNL, and ORNL next-generation systems like RZNevada, Perlmutter,
-El Capitan, and Frontier are in various stages of early access. They are
-similar in that they all use the HPE Cray Shasta platform, which requires
-a few additional components to integrate completely with Flux.
+The LLNL, LBNL, and ORNL systems like Tioga, Perlmutter,
+El Capitan, and Frontier are similar in that they all use the
+HPE Cray Shasta platform, which requires
+an additional component to integrate completely with Flux.
 
 --------------
 Things to Know
@@ -18,96 +18,84 @@ Things to Know
    Attempting to run further multi-node jobs will cause the excess jobs
    to fail. There is no limit on *submitted* multi-node jobs, and
    single-node jobs do not count towards the limit.
-#. All nested Flux instances (e.g. instances created with ``flux batch``,
-   ``flux alloc``, or ``flux submit ... flux start``)
-   should meet one of the following criteria:
+#. All Flux instances should meet one of the following criteria:
 
    - Occupy a single node
   - Have exclusive access to the nodes they are running on (e.g. they
     do not share their resources with sibling instances).
 
   Instances that do not meet one of the above criteria will not work properly.
 
+By default Flux reserves ports 11000-11999 for itself. At any given
+level of the Flux hierarchy, this can be changed by configuring Flux
+to load the ``cray_pals_port_distributor`` jobtap plugin with a different
+range of ports, like so:
+
+.. code-block:: toml
+
+   [job-manager]
+   plugins = [
+       { load = "cray_pals_port_distributor.so", conf = { port-min = 11000, port-max = 13000 } }
+   ]
+
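The older text removed in this diff also describes a runtime alternative: the plugin can be loaded into an already-running instance with ``flux jobtap load``. A console sketch (the port range is illustrative):

```console
$ flux jobtap load cray_pals_port_distributor.so port-min=11000 port-max=12000
$ flux jobtap list
```

Per the removed troubleshooting note, job failures mentioning "no cray_pals_port_distribution event posted" usually mean the plugin is not loaded.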
 ------------------------
 Building Flux for CORAL2
 ------------------------
 
 The basic steps to building Flux for Cray Shasta systems are as follows:
 
-#. :ref:`Build flux-core and flux-sched manually <manual_installation>`
-   with some prefix *P*.
+#. :ref:`Build flux-core (version >= 0.49.0) and flux-sched manually
+   <manual_installation>` with some prefix *P*.
 #. Build `flux-coral2 <https://github.com/flux-framework/flux-coral2>`_
    with the same prefix *P*.
-#. Create a Flux config file specifying that the ``cray_pals_port_distributor.so``
-   plugin should be loaded with some given port range (see below for an example).
-   If you have other config files, put the new file in with the others.
-   Before launching Flux, point Flux to the *directory* containing your config
-   file(s) by setting the ``FLUX_CONF_DIR`` environment variable, or by passing
-   ``-o"-c/path/to/config"`` to ``flux start``.
-#. As an alternative to creating a config file and setting ``FLUX_CONF_DIR``,
-   you can, after starting Flux, execute ``flux jobtap load
-   cray_pals_port_distributor.so port-min=$N port-max=$M`` for some *N* and *M*.
-
-If you see job failures with an error message like "no cray_pals_port_distribution
-event posted", check that you have the ``cray_pals_port_distributor.so`` plugin
-loaded by running ``flux jobtap list``. If you don't see it in the list, retry
-step 3 or 4 above.
-
-A script to build Flux is below.
-
-.. code-block:: sh
-
-   #!/bin/bash
-
-   set -e
-
-   PREFIX=$HOME/local  # a good default, but modify as needed
-   PORT_MIN=11000      # a good default, but modify as needed
-   PORT_MAX=12000      # a good default, but modify as needed
-
-   # Step 1: Build flux-core 0.29 or later
-
-   wget https://github.com/flux-framework/flux-core/releases/download/v0.29.0/flux-core-0.29.0.tar.gz
-   tar -xzvf flux-core-0.29.0.tar.gz && cd flux-core-0.29.0
-   ./configure --prefix=$PREFIX && make -j && make install && cd ..
-
-   # The `flux` executable will now be in ~/local/bin/flux but it needs some
-   # additional flux-coral2 extensions
 
-   # Step 2: Build flux-sched 0.18 or later (optional but recommended)
+------------------
+Flux with Cray PMI
+------------------
 
-   wget https://github.com/flux-framework/flux-sched/releases/download/v0.18.0/flux-sched-0.18.0.tar.gz
+Applications linked to Cray MPICH will work natively with Flux
+provided the Cray MPICH library uses the PMI2 protocol instead of
+the homespun Cray PMI and libPALS. For Flux to support libPALS,
+flux-coral2 must be built (see above) and Flux must be configured
+to offer libPALS support. This is done by setting the "pmi" shell
+option to include "cray-pals" on a per-job basis like so:
 
-   tar -xzvf flux-sched-0.18.0.tar.gz && cd flux-sched-0.18.0
+.. code-block:: console
 
-   ./configure --prefix=$PREFIX && make -j && make install && cd ..
+   $ flux submit -n2 -opmi=cray-pals ./mpi_hello
 
+or by configuring Flux to offer such support by default, by adding
+the following lines to the shell's ``initrc.lua`` file:
 
-   # Step 3: Build flux-coral2
+.. code-block:: lua
 
-   git clone https://github.com/flux-framework/flux-coral2.git && cd flux-coral2
+   if shell.options['pmi'] == nil then
+       shell.options['pmi'] = 'cray-pals,simple'
+   end
 
-   ./autogen.sh && ./configure --prefix=$PREFIX && make -j && make install
 
-   libtool --finish $PREFIX/lib/flux/job-manager/
-   libtool --finish $PREFIX/lib/flux/shell/plugins/
-   cd ..
+The lines should come before any call to load plugins.
 
+If Flux jobs that use Cray MPICH end up as a collection of singletons,
+that is usually a sign that Cray MPICH is trying to use libPALS.
 
-   # Step 4: add a config file to automatically load a flux-coral2 plugin
+-----------------------------
+Configuring Flux with Rabbits
+-----------------------------
 
-   mkdir -p $PREFIX/etc/flux/config
+In order for a Flux system instance to be able to allocate
+rabbit storage, the ``dws_jobtap.so`` plugin must be loaded.
+The plugin can be loaded in a config file like so:
 
-   echo "[job-manager]
-   plugins = [
-       { load = \"cray_pals_port_distributor.so\", conf = { port-min = $PORT_MIN, port-max = $PORT_MAX } }
-   ]
-   " > $PREFIX/etc/flux/config/cray_pals_ports.toml
+.. code-block:: toml
 
-   echo "Done! Now set FLUX_CONF_DIR=$PREFIX/etc/flux/config
-   in your environment and run with $PREFIX/bin/flux"
+   [job-manager]
+   plugins = [
+       { load = "dws-jobtap.so" }
+   ]
 
+Also, the ``flux-coral2-dws`` systemd service must be started
+on the same node as the rank 0 broker of the system instance
+(i.e. the management node). The ``flux`` user must have
+a kubeconfig file in its home directory granting it read
+and write access.
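To sanity-check the rabbit setup described above, one might confirm that the jobtap plugin is loaded and that the service is running. A sketch, assuming a systemd-based management node:

```console
$ flux jobtap list                    # the dws jobtap plugin should appear
$ systemctl status flux-coral2-dws    # run on the rank 0 (management) node
```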

tutorials/lab/index.rst

Lines changed: 1 addition & 0 deletions
@@ -13,4 +13,5 @@ provided there are for collaborating systems.
 LLNL Introduction to Flux <https://hpc-tutorials.llnl.gov/flux/>
 coral
 coral2
+rabbit

tutorials/lab/rabbit.rst

Lines changed: 133 additions & 0 deletions
@@ -0,0 +1,133 @@ (new file)

.. _rabbit:

=================
Flux with Rabbits
=================


How to Allocate Rabbit Storage
------------------------------

Request rabbit storage allocations for a job
by setting the ``.attributes.system.dw`` field in a jobspec to
a string containing one or more DW directives, or to a list of
singleton DW directives.

DW directives are strings that start with ``#DW``. Directives
that begin with ``#DW jobdw`` are for requesting storage that
lasts the lifetime of the associated Flux job. Directives that
begin with ``#DW copy_in`` and ``#DW copy_out`` are for
describing data movement to and from the rabbits, respectively.

The usage is most easily understood by example.
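As a sketch of the two accepted shapes of this field, the following hypothetical helper (not part of flux-coral2) normalizes either form, a single newline-separated string or a list of singleton directives, into a list of directive strings:

```python
def normalize_dw(dw):
    """Normalize a jobspec ``.attributes.system.dw`` value.

    Accepts either one string containing one or more newline-separated
    DW directives, or a list of singleton directive strings, and returns
    a flat list of directive strings.
    """
    if isinstance(dw, str):
        # One string may hold several directives, one per line.
        directives = [line.strip() for line in dw.splitlines() if line.strip()]
    else:
        directives = [d.strip() for d in dw]
    for directive in directives:
        # Every DW directive must begin with the "#DW" marker.
        if not directive.startswith("#DW "):
            raise ValueError(f"not a DW directive: {directive!r}")
    return directives
```

For example, ``normalize_dw("#DW jobdw type=xfs capacity=10GiB name=project1")`` yields a one-element list containing that directive.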
Examples of jobdw directives
----------------------------

Requesting a 10 gigabyte XFS file system per compute node on the
command line:

.. code-block:: console

   $ flux alloc -N2 --setattr=dw="#DW jobdw type=xfs capacity=10GiB name=project1"

Requesting both XFS and Lustre file systems in a batch script:

.. code-block:: bash

   #!/bin/bash

   #FLUX: -N 2
   #FLUX: -q pdebug
   #FLUX: --setattr=dw="""
   #FLUX: #DW jobdw type=xfs capacity=1TiB name=xfsproject
   #FLUX: #DW jobdw type=lustre capacity=10GiB name=lustreproject
   #FLUX: """

   echo "Hello World!" > $DW_JOB_lustreproject/world.txt

   flux submit -N2 -n2 /bin/bash -c "echo 'Hello World!' > $DW_JOB_xfsproject/world.txt"


jobdw directive fields
----------------------

The **type** field can be one of ``xfs``, ``lustre``, ``gfs2``, or ``raw``.
``lustre`` storage is shared by all nodes in the job. By contrast, the other
types are node-local: processes on one node will not be able to read
data written on other nodes. Currently only ``xfs`` and ``lustre`` are known
to work properly.

The **capacity** field specifies how much storage to allocate. For ``lustre``
the capacity refers to the overall capacity; for all other types, it refers
to the capacity per node. See
`this Wikipedia article <https://en.wikipedia.org/wiki/Byte#Multiple-byte_units>`_
for the meaning of the suffixes.

The **name** field determines the suffix of the ``DW_JOB_`` environment variable.
See below for more detail.
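For illustration, a hypothetical helper (not part of flux-coral2) that converts a capacity value with one of the binary suffixes from the linked article into a byte count:

```python
# Binary multiple-byte units: KiB = 2**10, MiB = 2**20, GiB = 2**30, TiB = 2**40.
_UNITS = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}

def capacity_to_bytes(capacity: str) -> int:
    """Convert a capacity string such as "10GiB" into bytes."""
    for suffix, factor in _UNITS.items():
        if capacity.endswith(suffix):
            return int(capacity[: -len(suffix)]) * factor
    raise ValueError(f"unrecognized capacity: {capacity!r}")
```

For example, ``capacity_to_bytes("10GiB")`` returns ``10 * 2**30`` bytes.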
Using Rabbit Storage
--------------------

For each ``jobdw`` directive associated with your job, your job will have
an environment variable ``DW_JOB_[name]`` where ``[name]`` is the value
of the ``name`` field in the directive. The value of the environment variable
will be the path to the associated file system.

For instance, for a directive ``#DW jobdw type=xfs capacity=10GiB name=project1``,
the associated job will have an environment variable ``DW_JOB_project1``.


Data Movement Directives
------------------------

To request that files be moved to or from the rabbits, DW directives
must be added to the job in addition to ``jobdw`` directives.
The ``copy_in`` directive is for moving data to the rabbits before the job
starts, and the ``copy_out`` directive is for moving data from the rabbits
after the job completes.

Both ``copy_in`` and ``copy_out`` directives have ``source`` and ``destination``
fields which indicate where data is to be taken from and where it is to be
moved to. The source must exist.


Data Movement Examples
----------------------

.. warning::

   When writing ``copy_in`` and ``copy_out`` directives *on the command line*,
   be careful to always escape the ``$`` character when writing ``DW_JOB_[name]``
   variables. Otherwise your shell will expand them. This warning does not apply
   to batch scripts.

Requesting a 10 gigabyte XFS file system per compute node on the command
line with data movement both to and from the rabbits (the source directory
is assumed to exist):

.. code-block:: console

   $ flux alloc -N2 --setattr=dw="#DW jobdw type=xfs capacity=10GiB name=project1
     #DW copy_in source=/p/lustre1/$USER/dir_in destination=\$DW_JOB_project1/
     #DW copy_out source=\$DW_JOB_project1/ destination=/p/lustre1/$USER/dir_out/"

Requesting a Lustre file system, with data movement out from the rabbits,
in a batch script:

.. code-block:: bash

   #!/bin/bash

   #FLUX: -N 2
   #FLUX: -q pdebug
   #FLUX: --setattr=dw="""
   #FLUX: #DW jobdw type=lustre capacity=10GiB name=lustreproject
   #FLUX: #DW copy_out source=$DW_JOB_lustreproject destination=/p/lustre1/$USER/lustreproject_results
   #FLUX: """

   echo "Hello World!" > $DW_JOB_lustreproject/world.txt
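The escaping rule in the warning above can be demonstrated with a plain shell snippet (the ``project1`` name is hypothetical, and no rabbit software is involved):

```shell
unset DW_JOB_project1   # ensure the variable is unset, as it is outside a job

# Without escaping, the local shell expands the variable before Flux sees it:
unescaped="source=$DW_JOB_project1/"
echo "$unescaped"       # prints "source=/"

# With escaping, the literal text survives for expansion inside the job:
escaped="source=\$DW_JOB_project1/"
echo "$escaped"         # prints "source=$DW_JOB_project1/"
```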
