Management system to run several molecular dynamics simulations from one batch job on Lomonosov-2 cluster
This tool allows you to run several molecular dynamics simulations on the Lomonosov-2 cluster simultaneously using only one SLURM job. The total number of nodes needed for the allocation is computed automatically from the provided taskfile.
Though this tool has been designed specifically for the Lomonosov-2 supercomputer, it can easily be modified for any cluster that uses the SLURM workload manager. Even if your cluster uses a different workload manager, l2-multimd can still be adapted to it without major code modifications.
Feel free to use it, modify it and contribute by reporting bugs, suggesting enhancements and features, and submitting pull requests.
Currently the following engines are supported:
- AMBER
- NAMD
- Gaussian. WARNING: support is still experimental because we have no Gaussian distribution at hand, so the run wrappers are based on publicly available documentation only! No real-world checks have been made, so bugs may well be lurking inside. Feel free to suggest changes/bugfixes/improvements via issues and pull requests.
- CP2K
For successful installation make sure you have the following packages/utilities installed and available:
- bash (>= 4.0)
- md5sum
- awk (make sure that the awk executable is in your PATH)
- sed
To use l2-multimd on your cluster make sure that the following packages/utilities are available in your environment:
- bash (>= 4.0)
- SLURM package (sbatch and srun executables)
- awk
- sed

Also, if you want to run MPI versions of the executables, please install an MPI package that provides the mpirun binary (OpenMPI should work fine).
Clone this repo (or download and extract the archive) into your home folder on the Lomonosov-2 cluster. Then run install.sh; it will install all needed files into ~/_scratch/opt/l2-multimd.
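A typical installation session might look like this (the repository URL is left as a placeholder):

cd ~                                   # home folder on a Lomonosov-2 login node
git clone <repo-url> l2-multimd        # or download and extract the archive here
cd l2-multimd
./install.sh                           # installs everything into ~/_scratch/opt/l2-multimd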
Load all necessary modules (if you use env-modules) and/or set the proper environment variables. Don't forget about SLURM itself (the script will remind you if the sbatch command isn't found). Then prepare the file tree for your computations somewhere on the scratch filesystem. After that, prepare the <taskfile> (see the corresponding chapter below) and set the necessary keywords. Finally, cd to where your <taskfile> resides, choose the necessary <engine> and run:
~/_scratch/opt/l2-multimd/multimd.sh <engine> <taskfile>
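For example, for an AMBER batch described in a taskfile called tasks.conf (the directory and file names here are just placeholders):

cd ~/_scratch/my-md-project
~/_scratch/opt/l2-multimd/multimd.sh amber tasks.conf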
To make things even easier you can employ the handy alias which l2-multimd provides for you: include the following lines in your ~/.bashrc file (where <username> refers to your user name on the cluster):
if [[ -e "/home/<username>/_scratch/opt/l2-multimd/bash-completion/multimd" ]]
then
source /home/<username>/_scratch/opt/l2-multimd/bash-completion/multimd
fi
Don't forget to source ~/.bashrc from your ~/.bash_profile (see the snippet below) and re-login to make the changes effective. After that you can run multiple jobs at once with this simple incantation:
l2-multimd <engine> <taskfile>
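If your ~/.bash_profile does not source ~/.bashrc yet, a minimal snippet like this one is enough:

# in ~/.bash_profile
if [[ -f "${HOME}/.bashrc" ]]
then
    source "${HOME}/.bashrc"
fi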
The TASKFILE consists of KEYWORD options pairs. Comments are allowed and marked with #. They can't be used inside a line - only a whole line can be commented out. Any extra spaces at the beginning of a line are ignored, as are empty lines. Keywords are case-insensitive.
Keep in mind that the TASKFILE is processed line by line. For example, task definitions can make use of job-wide defaults such as the number of nodes and executable names, so if you want your defaults to be applied properly, make sure that all corresponding task definitions appear below the job-wide keywords they rely on.
Keywords can occur in the TASKFILE several times. Every occurrence overrides the previous one (with the exception of the TASK keyword, which simply creates one more task). However, repeating a keyword really only makes sense for two of them - NUMNODES and BIN - and only if you want to run a large job with dozens of tasks. Consider the following example:
...
# this is the first bunch of tasks; they should be run with sander.MPI and on 3 cluster nodes each
BIN sander.MPI
NUMNODES 3
TASK run1
TASK run2
...
TASK runN
# the rest of the tasks should be run with pmemd.cuda and only on 1 node
BIN pmemd.cuda
NUMNODES 1
TASK runN1
TASK runN2
...
TASK runNM
...
Here, tasks in directories run1 to runN will be run with the sander.MPI executable on 3 nodes per task. Within the same job, tasks in directories runN1 to runNM will use pmemd.cuda and 1 node per task.
Here is the full list of supported keywords:
- DATAROOT
- AMBERROOT
- NAMDROOT
- GAUSSIANROOT
- CP2KROOT
- RUNTIME
- PARTITION
- NUMNODES
- BIN
- TASK
Some of the keywords (DATAROOT, AMBERROOT/NAMDROOT/GAUSSIANROOT/CP2KROOT and TASK) are vital and must be present in the TASKFILE in any case. RUNTIME, PARTITION, NUMNODES and BIN have reasonable defaults hardcoded in multimd.sh. All unknown keywords are ignored.
This is the root directory where the folders with data for MD simulations are stored. multimd.sh looks here for the task directories. The path may contain whitespace, but in that case it should be quoted. Remember that on the Lomonosov-2 cluster all computations are carried out on the scratch filesystem!
Syntax:
DATAROOT /path/to/MD/root/dir
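For instance, a path containing spaces must be quoted (the path itself is hypothetical):

DATAROOT "/home/<username>/_scratch/md projects/set 1"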
This is the root directory where the AMBER computational package is installed (bin, lib and other directories should reside here). The path may contain whitespace but should then be enclosed in quotes. Because of how the Lomonosov-2 cluster works, the AMBER distribution should be installed somewhere on the scratch filesystem. Ignored if the selected engine is not amber.
Syntax:
AMBERROOT /path/to/amber/installation
This is the root directory where the NAMD computational package is installed (namd2, numd-runscript.sh and other files should sit here). The path may contain whitespace but should then be quoted. Because of how the Lomonosov-2 cluster works, the NAMD distribution should be installed on the scratch filesystem too. Ignored if the selected engine is not namd.
Syntax:
NAMDROOT /path/to/namd/installation
This is the root directory where the Gaussian computational package is installed. The gVER directory (where gVER should match the BIN keyword, e. g. g09 or g16), gv and (perhaps) other directories should sit here. The path may contain whitespace but should then be quoted. Because of the nature of the Lomonosov-2 cluster, the Gaussian installation should reside on the scratch filesystem. Important: only use this engine if you have a licensed Gaussian installation! Ignored if the selected engine is not gaussian.
Syntax:
GAUSSIANROOT /path/to/gaussian/installation
This is the directory where the CP2K executable files (cp2k.<VERSION>, cp2k_shell.<VERSION>, graph.<VERSION>, libcp2k_unittest.<VERSION>) are located. Due to the extreme variety of CP2K custom build configurations, this location cannot be deduced from the CP2K root directory alone. Usually these executables are located in the exe/<ARCH> subdirectory of the CP2K root directory, for example /home/user/cp2k/exe/local_cuda_plumed. The path may contain whitespace but should then be quoted. Because of how the Lomonosov-2 cluster works, the CP2K distribution and all its computational libraries should be installed on the scratch filesystem too. Ignored if the selected engine is not cp2k.
Syntax:
CP2KROOT /path/to/cp2k/installation/
Sets the runtime limit for the whole bunch of tasks. After that time SLURM will interrupt the job. The default value is 05:00.
Syntax:
RUNTIME DD-HH:MM:SS
NB: any higher-order part of the runtime specification is optional, e. g. you could simply use RUNTIME 30:00
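For example, to request three full days for the job:

RUNTIME 3-00:00:00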
Sets the cluster partition to run the simulations on.
Syntax:
PARTITION partition-name
The number of CPU cores and GPU cards per node is computed automatically and depends on partition-name. Possible values are test, compute and pascal. The default value is test.
Sets the default number of nodes per task. Useful if every task in the list should use the same number of nodes. The default value is 1.
Syntax:
NUMNODES n
This is the default binary which should be used to perform the calculations. Useful if every task uses the same executable. May contain spaces (quotes are necessary in this case). The default value is sander. Important: if you're using the Gaussian engine then GAUSSIANROOT should contain a directory named exactly like the BIN value, for example g03, g09 or g16!
Syntax:
BIN executable-name
This is the core keyword for job queueing. It allows you to specify the directory name for every task, the input config file, the output file and, if the AMBER engine is used, an AMBER-compatible list of options such as topology files, restart files etc. The only mandatory argument is the directory name for the task. The other options can be derived automatically (see the default values below). Unknown options are ignored. All options (with the exception of the nodes/threads number) can contain whitespace, but remember the quoting!
Syntax:
TASK dir-name [{-N|--nodes n} | {-T|--threads t}] [-b|--bin executable-name] [-i|--cfg config] [-o|--out output] [-p|--prmtop prmtop] [-c|--inpcrd coordinates] [-r|--restrt restart] [-ref|--refc restraints] [-x|--mdcrd trajectory] [-v|--mdvel velocities] [-e|--mden energies] [-inf|--mdinfo info] [-cpin cph-input] [-cpout cph-output] [-cprestrt cph-restart] [-groupfile remd-groupfile] [-ng replicas] [-rem re-type]
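A couple of illustrative TASK lines (all directory and file names are hypothetical):

# everything derived from the defaults
TASK md-water
# explicit nodes, binary, config, topology and starting coordinates
TASK md-complex -N 2 -b pmemd.MPI -i prod.in -p complex.prmtop -c prod.ncrst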
These options have the same meaning for any engine.
This is the directory name where all necessary files for one task are stored. Absolutely required.
Number of nodes for executing the task in parallel. If not specified, the value of NUMNODES is used. If the -T|--threads t option is present, it overrides this option (only for certain binaries/engines, see below). Also see the parallelization policy for more details.
Number of threads for executing the task in parallel. The number of nodes needed to supply this thread count is computed automatically based on the selected PARTITION. If the -N|--nodes n option is also present, the current option takes precedence over it. This option only has an effect for some executables of the AMBER engine: sander.MPI, pmemd.MPI and pmemd.cuda.MPI. Also see the parallelization policy for more details.
Replacement for the default BIN executable. Allows using a specific binary for this task.
File where all settings for the calculation are specified. The default value is <dir-name>.in for the AMBER engine, <dir-name>.conf for NAMD, <dir-name>.gin for Gaussian and <dir-name>.inp for CP2K.
Where all output is kept. Default value is <dir-name>.out
.
These options are specific to the AMBER engine. If another engine is requested, they are skipped.
Topology file for the task. The default value is <dir-name>.prmtop.
File with starting coordinates (and possibly velocities) for the run. The default value is <dir-name>.ncrst.
AMBER will save restart snapshots here. The default value is <dir-name>.ncrst. NB: this may collide with the <coordinates> file!
AMBER reads positional restraints from that file. There is no default value for this parameter.
File in which MD trajectory should be saved. Default value is <dir-name>.nc
.
AMBER will save velocities here, unless the ntwv parameter in the simulation config is equal to -1. There is no default value for this parameter.
AMBER will save energies here. There is no default value for this parameter.
File where all MD run statistics are kept. Default value is <dir-name>.mdinfo
.
File with protonation state definitions. Default value is empty.
Protonation state definitions will be saved here. Default value is empty.
Protonation state definitions for restart will be saved here. Default value is empty.
Reference groupfile for replica exchange run. Default value is empty.
Number of replicas. Default value is empty.
Replica exchange type. Default value is empty.
The l2-multimd software manages its own policy for running jobs in parallel. The specific technique depends on the selected engine, executable and number of nodes/threads requested. Currently the Lomonosov-2 cluster provides the following partitions for performing calculations:
| Partition name | CPU type | GPU type | Number of CPU cores (NUMCORES) | Number of GPUs (NUMGPUS) | Available RAM, GB |
|---|---|---|---|---|---|
| test | Intel Haswell-EP E5-2697v3, 2.6 GHz | NVidia Tesla K40M | 14 | 1 | 64 |
| compute | Intel Haswell-EP E5-2697v3, 2.6 GHz | NVidia Tesla K40M | 14 | 1 | 64 |
| pascal | Intel Xeon Gold 6126, 2.6 GHz | NVidia Tesla P100 | 12 | 2 | 92 |
| volta1 | Intel Xeon Gold 6126, 2.6 GHz | NVidia Tesla V100 | 12 | 2 | 92 |
| volta2 | 2x Intel Xeon Gold 6240, 2.6 GHz | NVidia Tesla V100 | 36 | 2 | 1536 |
Here are the basic rules for all combinations supported by l2-multimd.
The sander and pmemd executables can only be run as a single instance (1 thread and 1 node). Thus, the -T|--threads t option is incompatible with these executables.
For pmemd.cuda the script will place independent tasks on all GPUs present on the nodes. For example, if you ask for 8 tasks on the pascal partition with 2 GPUs per node, the script will allocate 4 nodes and run 2 tasks per node on different GPUs. You can override this behavior by setting the desired number of tasks per node via the -T|--threads t option (but not more than the number of GPUs per node), for example if your task consumes a lot of memory and has to be run as a single instance per node.
For maximum node utilization efficiency, the number of tasks you create should be divisible by the number of GPUs per node.
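For instance, a memory-hungry pmemd.cuda batch on pascal could be forced to one task per node like this (directory names are hypothetical):

# memory-hungry tasks: force one task (GPU) per node
BIN pmemd.cuda
TASK big-sys-1 -T 1
TASK big-sys-2 -T 1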
If no -T|--threads t option is specified in the task definition then sander.MPI and pmemd.MPI will be run with NODES * NUMCORES threads without oversubscribing. However, if that option is present then the -N|--nodes n option will be ignored and the required number of nodes for the task will be recalculated according to the NUMCORES property of the selected partition.
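A quick worked example on the test partition (NUMCORES = 14); the node count in the -T case is presumably rounded up to whole nodes:

# -N 3 on test: 3 nodes * 14 cores = 42 MPI threads
TASK run-a -N 3 -b sander.MPI
# -T 20 on test: 20 threads fit into 2 nodes (any -N given here would be ignored)
TASK run-b -T 20 -b pmemd.MPI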
For the pmemd.cuda.MPI executable, thread allocation depends on the selected partition. If the pascal partition has been requested then 2 threads per node will be allocated for execution and both GPUs will be available for calculations. Otherwise, pmemd.cuda.MPI will get only 1 thread per node. The -T|--threads t option is incompatible with this executable.
For the NAMD engine the number of threads for the calculation is THREADS = NODES * NUMCORES. If the pascal partition has been requested then both GPUs will be available for namd2. The -T|--threads t option is incompatible with this engine.
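For example (the directory name is hypothetical), a task that asks for 2 nodes on the pascal partition (NUMCORES = 12) will run namd2 with 2 * 12 = 24 threads:

TASK membrane-eq -N 2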
Currently l2-multimd doesn't support Linda + Gaussian bindings, so every task is run on exactly 1 node; the number of threads, however, depends on the user-supplied config file. Because of the specific way Gaussian calculations are set up in parallel, it is entirely up to the user to prepare the input properly so that it runs on the desired number of CPU cores and GPU cards. If the pascal partition has been requested then both GPUs will be available for Gaussian executables. The -T|--threads t option is incompatible with this engine.
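As a rough illustration only (based on publicly available Gaussian documentation, in line with the experimental status of this engine), parallel resources are usually requested via Link 0 directives at the top of the user-supplied input file, e.g. for a pascal node:

%NProcShared=12
%GPUCPU=0,1=0,1
%Mem=32GB

Note that %GPUCPU is understood by g16 only; consult the documentation of your licensed Gaussian distribution for the exact directives it supports.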
CP2K has three versions with different parallelization models to be used on a supercomputer:
- the ssmp version with OpenMP parallelization, which runs only on a single node. The variables OMP_NUM_THREADS and MKL_NUM_THREADS will be set to the number of CPU cores. Not recommended for use on a multi-node cluster, but it can support CUDA. The main executable binary is cp2k.ssmp.
- the popt version with single-threaded MPI parallelization. The script will execute a number of MPI processes equal to the number of cores. The main executable binary is cp2k.popt.
- the psmp version, which provides combined MPI and OpenMP parallelization. This version can also be built with CUDA support. Currently the script runs a number of MPI processes equal to the total number of cores divided by OMP_NUM_THREADS=MKL_NUM_THREADS=2 (see the sketch after this list). The main executable binary is cp2k.psmp. Recommended for use on Lomonosov-2 with CUDA support. If the pascal partition has been requested then both GPUs will be available for CP2K executables. The -T|--threads t option is not yet compatible with this engine.
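A minimal sketch of the psmp arithmetic (the actual launch is performed by multimd.sh; the numbers below assume 2 pascal nodes and are purely illustrative):

NODES=2
NUMCORES=12                                        # per pascal node
export OMP_NUM_THREADS=2 MKL_NUM_THREADS=2
NRANKS=$(( NODES * NUMCORES / OMP_NUM_THREADS ))   # 2 * 12 / 2 = 12 MPI ranks
mpirun -np "${NRANKS}" cp2k.psmp -i task.inp -o task.out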