
Commit a3baf6d: Update Documentation (#558)

* Update Documentation
* update keywords
* fix readme bug

1 parent 53b791a, commit a3baf6d
9 files changed: +105 −119 lines

README.md (32 additions, 32 deletions)

@@ -23,25 +23,25 @@ with the [ProcessPoolExecutor](https://docs.python.org/3/library/concurrent.futu
 [ThreadPoolExecutor](https://docs.python.org/3/library/concurrent.futures.html#threadpoolexecutor) for parallel
 execution of Python functions on a single computer. executorlib extends this functionality to distribute Python
 functions over multiple computers within a high performance computing (HPC) cluster. This can be either achieved by
-submitting each function as individual job to the HPC job scheduler - [HPC Submission Mode](https://executorlib.readthedocs.io/en/latest/2-hpc-submission.html) -
-or by requesting a compute allocation of multiple nodes and then distribute the Python functions within this - allocation -
-[HPC Allocation Mode](https://executorlib.readthedocs.io/en/latest/3-hpc-allocation.html). Finally, to accelerate the
-development process executorlib also provides a - [Local Mode](https://executorlib.readthedocs.io/en/latest/1-local.html) -
-to use the executorlib functionality on a single workstation for testing. Starting with the [Local Mode](https://executorlib.readthedocs.io/en/latest/1-local.html)
-set by setting the backend parameter to local - `backend="local"`:
+submitting each function as individual job to the HPC job scheduler with an [HPC Cluster Executor](https://executorlib.readthedocs.io/en/latest/2-hpc-cluster.html) -
+or by requesting a job from the HPC cluster and then distribute the Python functions within this job with an
+[HPC Job Executor](https://executorlib.readthedocs.io/en/latest/3-hpc-job.html). Finally, to accelerate the
+development process executorlib also provides a [Single Node Executor](https://executorlib.readthedocs.io/en/latest/1-single-node.html) -
+to use the executorlib functionality on a laptop, workstation or single compute node for testing. Starting with the
+[Single Node Executor](https://executorlib.readthedocs.io/en/latest/1-single-node.html):
 ```python
-from executorlib import Executor
+from executorlib import SingleNodeExecutor


-with Executor(backend="local") as exe:
+with SingleNodeExecutor() as exe:
     future_lst = [exe.submit(sum, [i, i]) for i in range(1, 5)]
     print([f.result() for f in future_lst])
 ```
 In the same way executorlib can also execute Python functions which use additional computing resources, like multiple
 CPU cores, CPU threads or GPUs. For example if the Python function internally uses the Message Passing Interface (MPI)
 via the [mpi4py](https://mpi4py.readthedocs.io) Python libary:
 ```python
-from executorlib import Executor
+from executorlib import SingleNodeExecutor


 def calc(i):
@@ -52,7 +52,7 @@ def calc(i):
     return i, size, rank


-with Executor(backend="local") as exe:
+with SingleNodeExecutor() as exe:
     fs = exe.submit(calc, 3, resource_dict={"cores": 2})
     print(fs.result())
 ```
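The hunks above only show the changed lines plus a few lines of context, so the MPI example is split across them. For reference, a minimal sketch of how the full example presumably reads after this commit: the body of `calc()` is not part of the diff, and the mpi4py calls shown here are an assumption inferred from the `size` and `rank` return values mentioned in the surrounding README text.

```python
from executorlib import SingleNodeExecutor


def calc(i):
    # Assumed body (not shown in the diff): query the MPI communicator size and
    # rank via mpi4py, matching the "i, size, rank" return value in the context lines.
    from mpi4py import MPI

    size = MPI.COMM_WORLD.Get_size()
    rank = MPI.COMM_WORLD.Get_rank()
    return i, size, rank


with SingleNodeExecutor() as exe:
    # Request two CPU cores for this single function call.
    fs = exe.submit(calc, 3, resource_dict={"cores": 2})
    # Expected to print one tuple per MPI rank, e.g. [(3, 2, 0), (3, 2, 1)].
    print(fs.result())
```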
@@ -66,11 +66,11 @@ This flexibility to assign computing resources on a per-function-call basis simp
 Only the part of the Python functions which benefit from parallel execution are implemented as MPI parallel Python
 funtions, while the rest of the program remains serial.

-The same function can be submitted to the [SLURM](https://slurm.schedmd.com) queuing by just changing the `backend`
-parameter to `slurm_submission`. The rest of the example remains the same, which highlights how executorlib accelerates
-the rapid prototyping and up-scaling of HPC Python programs.
+The same function can be submitted to the [SLURM](https://slurm.schedmd.com) job scheduler by replacing the
+`SingleNodeExecutor` with the `SlurmClusterExecutor`. The rest of the example remains the same, which highlights how
+executorlib accelerates the rapid prototyping and up-scaling of HPC Python programs.
 ```python
-from executorlib import Executor
+from executorlib import SlurmClusterExecutor


 def calc(i):
@@ -81,16 +81,16 @@ def calc(i):
     return i, size, rank


-with Executor(backend="slurm_submission") as exe:
+with SlurmClusterExecutor() as exe:
     fs = exe.submit(calc, 3, resource_dict={"cores": 2})
     print(fs.result())
 ```
 In this case the [Python simple queuing system adapter (pysqa)](https://pysqa.readthedocs.io) is used to submit the
 `calc()` function to the [SLURM](https://slurm.schedmd.com) job scheduler and request an allocation with two CPU cores
-for the execution of the function - [HPC Submission Mode](https://executorlib.readthedocs.io/en/latest/2-hpc-submission.html). In the background the [sbatch](https://slurm.schedmd.com/sbatch.html)
+for the execution of the function - [HPC Cluster Executor](https://executorlib.readthedocs.io/en/latest/2-hpc-submission.html). In the background the [sbatch](https://slurm.schedmd.com/sbatch.html)
 command is used to request the allocation to execute the Python function.

-Within a given [SLURM](https://slurm.schedmd.com) allocation executorlib can also be used to assign a subset of the
+Within a given [SLURM](https://slurm.schedmd.com) job executorlib can also be used to assign a subset of the
 available computing resources to execute a given Python function. In terms of the [SLURM](https://slurm.schedmd.com)
 commands, this functionality internally uses the [srun](https://slurm.schedmd.com/srun.html) command to receive a subset
 of the resources of a given queuing system allocation.
@@ -106,7 +106,7 @@ def calc(i):
     return i, size, rank


-with Executor(backend="slurm_allocation") as exe:
+with SlurmJobExecutor() as exe:
     fs = exe.submit(calc, 3, resource_dict={"cores": 2})
     print(fs.result())
 ```
@@ -116,29 +116,29 @@ In addition, to support for [SLURM](https://slurm.schedmd.com) executorlib also
 to address the needs for the up-coming generation of Exascale computers. Still even on traditional HPC clusters the
 hierarchical approach of the [flux](http://flux-framework.org) is beneficial to distribute hundreds of tasks within a
 given allocation. Even when [SLURM](https://slurm.schedmd.com) is used as primary job scheduler of your HPC, it is
-recommended to use [SLURM with flux](https://executorlib.readthedocs.io/en/latest/3-hpc-allocation.html#slurm-with-flux)
+recommended to use [SLURM with flux](https://executorlib.readthedocs.io/en/latest/3-hpc-job.html#slurm-with-flux)
 as hierarchical job scheduler within the allocations.

 ## Documentation
 * [Installation](https://executorlib.readthedocs.io/en/latest/installation.html)
   * [Minimal](https://executorlib.readthedocs.io/en/latest/installation.html#minimal)
   * [MPI Support](https://executorlib.readthedocs.io/en/latest/installation.html#mpi-support)
   * [Caching](https://executorlib.readthedocs.io/en/latest/installation.html#caching)
-  * [HPC Submission Mode](https://executorlib.readthedocs.io/en/latest/installation.html#hpc-submission-mode)
-  * [HPC Allocation Mode](https://executorlib.readthedocs.io/en/latest/installation.html#hpc-allocation-mode)
+  * [HPC Cluster Executor](https://executorlib.readthedocs.io/en/latest/installation.html#hpc-cluster-executor)
+  * [HPC Job Executor](https://executorlib.readthedocs.io/en/latest/installation.html#hpc-job-executor)
   * [Visualisation](https://executorlib.readthedocs.io/en/latest/installation.html#visualisation)
   * [For Developers](https://executorlib.readthedocs.io/en/latest/installation.html#for-developers)
-* [Local Mode](https://executorlib.readthedocs.io/en/latest/1-local.html)
-  * [Basic Functionality](https://executorlib.readthedocs.io/en/latest/1-local.html#basic-functionality)
-  * [Parallel Functions](https://executorlib.readthedocs.io/en/latest/1-local.html#parallel-functions)
-  * [Performance Optimization](https://executorlib.readthedocs.io/en/latest/1-local.html#performance-optimization)
-* [HPC Submission Mode](https://executorlib.readthedocs.io/en/latest/2-hpc-submission.html)
-  * [SLURM](https://executorlib.readthedocs.io/en/latest/2-hpc-submission.html#slurm)
-  * [Flux](https://executorlib.readthedocs.io/en/latest/2-hpc-submission.html#flux)
-* [HPC Allocation Mode](https://executorlib.readthedocs.io/en/latest/3-hpc-allocation.html)
-  * [SLURM](https://executorlib.readthedocs.io/en/latest/3-hpc-allocation.html#slurm)
-  * [SLURM with Flux](https://executorlib.readthedocs.io/en/latest/3-hpc-allocation.html#slurm-with-flux)
-  * [Flux](https://executorlib.readthedocs.io/en/latest/3-hpc-allocation.html#flux)
+* [Single Node Executor](https://executorlib.readthedocs.io/en/latest/1-single-node.html)
+  * [Basic Functionality](https://executorlib.readthedocs.io/en/latest/1-single-node.html#basic-functionality)
+  * [Parallel Functions](https://executorlib.readthedocs.io/en/latest/1-single-node.html#parallel-functions)
+  * [Performance Optimization](https://executorlib.readthedocs.io/en/latest/1-single-node.html#performance-optimization)
+* [HPC Cluster Executor](https://executorlib.readthedocs.io/en/latest/2-hpc-cluster.html)
+  * [SLURM](https://executorlib.readthedocs.io/en/latest/2-hpc-cluster.html#slurm)
+  * [Flux](https://executorlib.readthedocs.io/en/latest/2-hpc-cluster.html#flux)
+* [HPC Job Executor](https://executorlib.readthedocs.io/en/latest/3-hpc-job.html)
+  * [SLURM](https://executorlib.readthedocs.io/en/latest/3-hpc-job.html#slurm)
+  * [SLURM with Flux](https://executorlib.readthedocs.io/en/latest/3-hpc-job.html#slurm-with-flux)
+  * [Flux](https://executorlib.readthedocs.io/en/latest/3-hpc-job.html#flux)
 * [Trouble Shooting](https://executorlib.readthedocs.io/en/latest/trouble_shooting.html)
   * [Filesystem Usage](https://executorlib.readthedocs.io/en/latest/trouble_shooting.html#filesystem-usage)
   * [Firewall Issues](https://executorlib.readthedocs.io/en/latest/trouble_shooting.html#firewall-issues)
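Taken together, the README changes mean the scheduler is now selected by choosing an executor class instead of a `backend` string, while the submission interface stays the same. A minimal sketch of that pattern, using only the class names and calls that appear in this diff (which class is appropriate depends on your environment):

```python
from executorlib import SingleNodeExecutor, SlurmClusterExecutor, SlurmJobExecutor

# Pick the executor for the target environment; the submission code below is
# identical for all three classes named in this documentation update:
#   SingleNodeExecutor   - laptop, workstation or a single compute node
#   SlurmClusterExecutor - submits each call as its own SLURM job (sbatch via pysqa)
#   SlurmJobExecutor     - distributes calls inside an existing SLURM job (srun)
executor_cls = SingleNodeExecutor

with executor_cls() as exe:
    future_lst = [exe.submit(sum, [i, i]) for i in range(1, 5)]
    print([f.result() for f in future_lst])  # [2, 4, 6, 8]
```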

docs/_toc.yml (3 additions, 3 deletions)

@@ -2,9 +2,9 @@ format: jb-book
 root: README
 chapters:
 - file: installation.md
-- file: 1-local.ipynb
-- file: 2-hpc-submission.ipynb
-- file: 3-hpc-allocation.ipynb
+- file: 1-single-node.ipynb
+- file: 2-hpc-cluster.ipynb
+- file: 3-hpc-job.ipynb
 - file: trouble_shooting.md
 - file: 4-developer.ipynb
 - file: api.rst

docs/installation.md (9 additions, 9 deletions)

@@ -33,10 +33,10 @@ used. The mpi4py documentation covers the [installation of mpi4py](https://mpi4p
 in more detail.

 ## Caching
-While the caching is an optional feature for [Local Mode](https://executorlib.readthedocs.io/en/latest/1-local.html) and
-for the distribution of Python functions in a given allocation of an HPC job scheduler [HPC Allocation Mode](https://executorlib.readthedocs.io/en/latest/3-hpc-allocation.html),
-it is required for the submission of individual functions to an HPC job scheduler [HPC Submission Mode](https://executorlib.readthedocs.io/en/latest/2-hpc-submission.html).
-This is required as in [HPC Submission Mode](https://executorlib.readthedocs.io/en/latest/2-hpc-submission.html) the
+While the caching is an optional feature for [Single Node Executor](https://executorlib.readthedocs.io/en/latest/1-single-node.html) and
+for the distribution of Python functions in a given allocation of an HPC job scheduler [HPC Job Executors](https://executorlib.readthedocs.io/en/latest/3-hpc-job.html),
+it is required for the submission of individual functions to an HPC job scheduler [HPC Cluster Executors](https://executorlib.readthedocs.io/en/latest/2-hpc-cluster.html).
+This is required as in [HPC Cluster Executors](https://executorlib.readthedocs.io/en/latest/2-hpc-cluster.html) the
 Python function is stored on the file system until the requested computing resources become available. The caching is
 implemented based on the hierarchical data format (HDF5). The corresponding [h5py](https://www.h5py.org) package can be
 installed using either the [Python package manager](https://pypi.org/project/h5py/):
@@ -51,12 +51,12 @@ Again, given the C++ bindings of the [h5py](https://www.h5py.org) package to the
 recommended. The h5py documentation covers the [installation of h5py](https://docs.h5py.org/en/latest/build.html) in
 more detail.

-## HPC Submission Mode
-[HPC Submission Mode] requires the [Python simple queuing system adatper (pysqa)](https://pysqa.readthedocs.io) to
+## HPC Cluster Executor
+[HPC Cluster Executor](https://executorlib.readthedocs.io/en/latest/2-hpc-cluster.html) requires the [Python simple queuing system adatper (pysqa)](https://pysqa.readthedocs.io) to
 interface with the job schedulers and [h5py](https://www.h5py.org) package to enable caching, as explained above. Both
 can be installed via the [Python package manager](https://pypi.org/project/pysqa/):
 ```
-pip install executorlib[submission]
+pip install executorlib[cluster]
 ```
 Or alternatively using the [conda package manager](https://anaconda.org/conda-forge/pysqa):
 ```
@@ -67,8 +67,8 @@ dependencies, still at least for [SLURM](https://slurm.schedmd.com) no additiona
 documentation covers the [installation of pysqa](https://pysqa.readthedocs.io/en/latest/installation.html) in more
 detail.

-## HPC Allocation Mode
-For optimal performance in [HPC Allocation Mode](https://executorlib.readthedocs.io/en/latest/3-hpc-allocation.html) the
+## HPC Job Executor
+For optimal performance in [HPC Allocation Mode](https://executorlib.readthedocs.io/en/latest/3-hpc-job.html) the
 [flux framework](https://flux-framework.org) is recommended as job scheduler. Even when the [Simple Linux Utility for Resource Management (SLURM)](https://slurm.schedmd.com)
 or any other job scheduler is already installed on the HPC cluster. [flux framework](https://flux-framework.org) can be
 installed as a secondary job scheduler to leverage [flux framework](https://flux-framework.org) for the distribution of
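Since the installation notes above make caching a hard requirement for the HPC Cluster Executor, a short sketch of how a cached submission could look may help. The `cache_directory` parameter name is an assumption and does not appear in this diff, so check the executorlib API reference before relying on it.

```python
from executorlib import SlurmClusterExecutor

# Hypothetical sketch: with caching enabled, the serialized function is written
# to the cache directory (HDF5 via h5py, see the Caching section above) until
# SLURM grants the requested resources. The cache_directory argument is an
# assumption, not taken from this diff.
with SlurmClusterExecutor(cache_directory="./executorlib_cache") as exe:
    fs = exe.submit(sum, [4, 4])
    print(fs.result())  # 8
```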

docs/trouble_shooting.md (3 additions, 3 deletions)

@@ -20,8 +20,8 @@ dependency. The installation of this and other optional dependencies is covered

 ## Missing Dependencies
 The default installation of executorlib only comes with a limited number of dependencies, especially the [zero message queue](https://zeromq.org)
-and [cloudpickle](https://github.com/cloudpipe/cloudpickle). Additional features like [caching](https://executorlib.readthedocs.io/en/latest/installation.html#caching), [HPC submission mode](https://executorlib.readthedocs.io/en/latest/installation.html#hpc-submission-mode)
-and [HPC allocation mode](https://executorlib.readthedocs.io/en/latest/installation.html#hpc-allocation-mode) require additional dependencies. The dependencies are explained in more detail in the
+and [cloudpickle](https://github.com/cloudpipe/cloudpickle). Additional features like [caching](https://executorlib.readthedocs.io/en/latest/installation.html#caching), [HPC submission mode](https://executorlib.readthedocs.io/en/latest/installation.html#hpc-cluster-executor)
+and [HPC allocation mode](https://executorlib.readthedocs.io/en/latest/installation.html#hpc-job-executor) require additional dependencies. The dependencies are explained in more detail in the
 [installation section](https://executorlib.readthedocs.io/en/latest/installation.html#).

 ## Python Version
@@ -38,7 +38,7 @@ The resource dictionary parameter `resource_dict` can contain one or more of the
 * `openmpi_oversubscribe` (bool): adds the `--oversubscribe` command line flag (OpenMPI and SLURM only) - default False
 * `slurm_cmd_args` (list): Additional command line arguments for the srun call (SLURM only)

-For the special case of the [HPC allocation mode](https://executorlib.readthedocs.io/en/latest/3-hpc-allocation.html)
+For the special case of the [HPC Job Executor](https://executorlib.readthedocs.io/en/latest/3-hpc-job.html)
 the resource dictionary parameter `resource_dict` can also include additional parameters define in the submission script
 of the [Python simple queuing system adatper (pysqa)](https://pysqa.readthedocs.io) these include but are not limited to:
 * `run_time_max` (int): the maximum time the execution of the submitted Python function is allowed to take in seconds.
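The keys listed in this file can be combined in a single `resource_dict` per submission. A hedged sketch using only keys named above; the values are illustrative, and whether every key is accepted by every executor class is not verified here.

```python
from executorlib import SlurmJobExecutor


def calc(i):
    return i * 2


with SlurmJobExecutor() as exe:
    fs = exe.submit(
        calc,
        21,
        resource_dict={
            "cores": 2,                         # CPU cores assigned to this call
            "openmpi_oversubscribe": False,     # keep the default, no --oversubscribe flag
            "slurm_cmd_args": ["--exclusive"],  # additional srun arguments (SLURM only)
            "run_time_max": 600,                # pysqa parameter: wall time limit in seconds
        },
    )
    print(fs.result())  # 42
```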
