.. _interactive:

Dask-MPI with Interactive Jobs
==============================

Dask-MPI can be used to easily launch an entire Dask cluster in an existing
MPI environment, and to attach a client to that cluster in an interactive
session.  In this scenario, you would launch the Dask cluster using the
Dask-MPI command-line interface (CLI), ``dask-mpi``.

.. code-block:: bash

    mpirun -np 4 dask-mpi --scheduler-file scheduler.json

The above command uses MPI to launch the Dask Scheduler on MPI rank 0 and
Dask Workers (or Nannies) on all remaining MPI ranks.

It is advisable, as shown in the previous example, to use the
``--scheduler-file`` option when using the ``dask-mpi`` CLI.  The
``--scheduler-file`` option saves the location of the Dask Scheduler to a
file that can be referenced later in your interactive session.  For example,
the following code creates a Dask Client and connects it to the Scheduler
using the scheduler JSON file.

.. code-block:: python

    from distributed import Client

    client = Client(scheduler_file='/path/to/scheduler.json')

As long as your interactive session has access to the same filesystem where
the scheduler JSON file is saved, this procedure lets your interactive
session easily attach to your separate ``dask-mpi`` job.

After a Dask cluster has been created, the ``dask-mpi`` CLI can be used to
add more workers to the cluster with the ``--no-scheduler`` option.

.. code-block:: bash

    mpirun -n 5 dask-mpi --scheduler-file scheduler.json --no-scheduler

In this example, 5 more workers are created and registered with the
Scheduler (whose address is read from the scheduler JSON file).

.. tip::

    **Running with a Job Scheduler**

    In High-Performance Computing environments, job schedulers such as LSF,
    PBS, or SLURM are commonly used to request the resources needed for an
    MPI job, such as the number of CPU cores, the total memory, and/or the
    number of nodes over which to spread the MPI job.  In such a case, it is
    advisable to place the ``mpirun ... dask-mpi ...`` command in a job
    submission script, with the number of MPI ranks (e.g., ``-np 4``)
    matching the number of cores requested from the job scheduler.  A sketch
    of such a submission script is shown at the end of this page.

.. warning::

    **MPI Jobs and Dask Nannies**

    It is often useful to launch your Dask-MPI cluster (using ``dask-mpi``)
    with Dask Nannies (i.e., with the ``--worker-class distributed.Nanny``
    option), rather than strictly with Dask Workers, because Dask Nannies
    can relaunch a worker when a failure occurs.  However, in some MPI
    environments, Dask Nannies will not work as expected, because some MPI
    installations restrict the number of running processes from exceeding
    the number of MPI ranks requested.  When using Dask Nannies, each Nanny
    forks a Worker process that runs alongside it, so one Worker process
    exists for each Nanny process.  Some MPI installations will kill any
    forked process, and you will see many errors arising from the Worker
    processes being killed.  If this happens, disable the use of Nannies
    with the ``--worker-class distributed.Worker`` option to ``dask-mpi``,
    as shown in the example at the end of this page.

For more details on how to use the ``dask-mpi`` command, see the :ref:`cli`.
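To make the job-scheduler tip above more concrete, below is a minimal sketch
of a submission script using SLURM.  SLURM is chosen purely for illustration
(LSF and PBS submissions follow the same pattern with their own directives),
and the job name, walltime, and task count are placeholder values to adapt
to your own site and allocation.

.. code-block:: bash

    #!/bin/bash
    # Minimal sketch of a SLURM submission script for a Dask-MPI cluster.
    # Job name, walltime, and task count are placeholders; any "module load"
    # lines needed to provide MPI are site-specific and omitted here.
    #SBATCH --job-name=dask-mpi
    #SBATCH --ntasks=4
    #SBATCH --time=01:00:00

    # The -np value matches the --ntasks request above: rank 0 runs the
    # Scheduler and the remaining ranks run Workers (or Nannies).
    mpirun -np 4 dask-mpi --scheduler-file scheduler.json

Submitting this script (e.g., with ``sbatch``) starts the cluster; you can
then connect from an interactive session via the scheduler file, as shown
earlier on this page.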
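Similarly, if you encounter the forked-process problem described in the
warning above, the launch command from the first example can be adapted to
use plain Workers instead of Nannies (the rank count and scheduler-file path
are the same placeholder values used earlier):

.. code-block:: bash

    # Same launch as before, but with Nannies disabled so that no extra
    # Worker processes are forked.
    mpirun -np 4 dask-mpi --scheduler-file scheduler.json --worker-class distributed.Worker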