Dask-MPI with Interactive Jobs
Dask-MPI can be used to easily launch an entire Dask cluster in an existing MPI environment, and attach a client to that cluster in an interactive session.
In this scenario, you would launch the Dask cluster using the Dask-MPI command-line interface:
mpirun -np 4 dask-mpi --scheduler-file scheduler.json
This command uses MPI to launch the Dask Scheduler on MPI rank 0 and Dask Workers (or Nannies) on all remaining MPI ranks.
As shown in the previous example, it is advisable to use the --scheduler-file option when using the
dask-mpi CLI. The --scheduler-file option saves the location of the Dask Scheduler to a file that
can be referenced later in your interactive session. For example, the following code creates a
Dask Client and connects it to the Scheduler using the scheduler JSON file:
from distributed import Client
client = Client(scheduler_file='/path/to/scheduler.json')
As long as your interactive session has access to the same filesystem where the scheduler JSON
file is saved, this procedure lets you easily attach your interactive session to the Dask cluster.
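Because the interactive session and the MPI job start independently, the scheduler file may not exist yet when you try to connect. The following is a minimal sketch, not part of Dask-MPI, that makes the wait explicit. The helper names wait_for_scheduler_file and connect are hypothetical, and the sketch assumes the scheduler file is JSON containing an "address" entry.

```python
import json
import os
import time


def wait_for_scheduler_file(path, timeout=60.0, poll_interval=0.5):
    """Block until the scheduler file exists and parses, then return its address.

    Hypothetical helper: assumes the file is JSON with an "address" key.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            try:
                with open(path) as f:
                    info = json.load(f)
                return info["address"]
            except (json.JSONDecodeError, KeyError):
                # The file may exist but still be partially written; retry.
                pass
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no usable scheduler file at {path!r} after {timeout} s")
        time.sleep(poll_interval)


def connect(path):
    """Wait for the scheduler file, then attach a Dask Client to the cluster."""
    # Imported lazily so the helper above needs only the standard library.
    from distributed import Client

    wait_for_scheduler_file(path)
    return Client(scheduler_file=path)
```

In an interactive session you might then call client = connect('/path/to/scheduler.json'), using the same path that was passed to --scheduler-file.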
After a Dask cluster has been created, the dask-mpi CLI can be used to add more workers to
the cluster by using the --no-scheduler option:
mpirun -n 5 dask-mpi --scheduler-file scheduler.json --no-scheduler
In this example, 5 more workers are created and registered with the Scheduler (whose address is read from the scheduler JSON file).
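The worker arithmetic from the two commands can be sketched as follows: a launch that includes the Scheduler contributes one worker fewer than its rank count, while a --no-scheduler launch contributes one worker per rank. The function names expected_workers and wait_for_cluster are hypothetical. The client argument is assumed to be a distributed Client, whose wait_for_workers method blocks until the requested number of workers has registered.

```python
def expected_workers(n_ranks, no_scheduler=False):
    """Number of workers a single dask-mpi launch contributes.

    With a scheduler, rank 0 is the Scheduler and the remaining ranks are workers.
    With --no-scheduler, every rank is a worker.
    """
    if n_ranks < 1:
        raise ValueError("n_ranks must be at least 1")
    return n_ranks if no_scheduler else n_ranks - 1


def wait_for_cluster(client, total_workers, timeout=120):
    """Block until the scheduler reports at least total_workers workers."""
    client.wait_for_workers(n_workers=total_workers, timeout=timeout)


# The two launches shown in this section: -np 4 with a scheduler, then -n 5 without.
initial = expected_workers(4)                    # 3 workers
added = expected_workers(5, no_scheduler=True)   # 5 workers
total = initial + added                          # 8 workers
```

After both mpirun commands have started, calling wait_for_cluster(client, total) from the interactive session would pause until all 8 workers are available.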
Running with a Job Scheduler
In High-Performance Computing environments, job schedulers such as LSF, PBS, or SLURM are
commonly used to request the resources needed for an MPI job, such as the number of CPU cores,
the total memory, and/or the number of nodes over which to spread the MPI job. In such a case,
it is advisable to place the mpirun ... dask-mpi ... command in a job submission script, with the
number of MPI ranks (e.g., -np 4) matching the number of cores requested from the job scheduler.
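One way to keep the rank count in step with the allocation is to derive it from the job environment instead of hard-coding it. The sketch below is an illustration under assumptions, not a Dask-MPI feature. It assumes a SLURM job in which the SLURM_NTASKS environment variable is set (other schedulers use different variables), and the function names are hypothetical.

```python
import os


def ranks_from_environment(default=4):
    """Read the rank count from SLURM_NTASKS, falling back to a default.

    Assumption: the job runs under SLURM with the task count specified.
    """
    value = os.environ.get("SLURM_NTASKS")
    return int(value) if value else default


def dask_mpi_command(n_ranks, scheduler_file="scheduler.json"):
    """Build the mpirun command line as an argument list."""
    if n_ranks < 2:
        # Rank 0 runs the Scheduler, so a single rank leaves none for workers.
        raise ValueError("need at least 2 ranks: one scheduler plus one worker")
    return [
        "mpirun", "-np", str(n_ranks),
        "dask-mpi", "--scheduler-file", scheduler_file,
    ]
```

Inside the job, the resulting list could be handed to subprocess.run(dask_mpi_command(ranks_from_environment()), check=True).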
MPI Jobs and Dask Nannies
It is often useful to launch your Dask-MPI cluster (using dask-mpi) with Dask Nannies
(i.e., with the --worker-class distributed.Nanny option), rather than strictly with Dask Workers.
This is because Dask Nannies can relaunch a worker when a failure occurs. However, in some MPI
environments, Dask Nannies will not work as expected, because some installations of MPI restrict
the number of running processes from exceeding the number of MPI ranks requested. When using
Dask Nannies, each Nanny process forks a Worker process and then keeps running in the background,
so one Worker process exists for each Nanny process and the total process count exceeds the number
of MPI ranks. Some MPI installations will kill any forked process, and you will see many errors
arising from the Worker processes being killed. If this happens, disable the use of Nannies by
passing the --worker-class distributed.Worker option to dask-mpi.
For more details on how to use the
dask-mpi command, see the Command-Line Interface (CLI).