
[Roadmap] SLURM support #3467

Open
@germa89

Description


Context

Check #2865 for a bit of historical context, which led to #3091.
In #2865 we proposed implementing PyHPS support to interact with HPC clusters. While PyHPS is very powerful, it is not a scheduler, so it needs to be installed on top of a scheduler (such as SLURM) and depends on it.
In this issue we are going to support SLURM HPC clusters only and directly, without PyHPS.

Research

Check #3397 for the research done on launching MAPDL and PyMAPDL on SLURM clusters.

Introduction

For the moment, we are going to focus on launching single MAPDL instances, leaving aside MapdlPool since it creates issues regarding resource splitting. I think coming up with a good default resource-sharing scheme might be a bit tricky.
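As a rough illustration of the single-instance focus, here is a minimal sketch (not the final API or implementation) of launching one MAPDL instance from inside a SLURM job and sizing it from the allocation. Picking these values up automatically is the kind of behavior this roadmap should provide; the environment variables are standard SLURM ones, and the rest uses existing launch_mapdl arguments.

```python
# Minimal sketch, assuming it runs inside an existing SLURM allocation
# (for example, submitted with sbatch). Not the final implementation.
import os

from ansys.mapdl.core import launch_mapdl

# SLURM exposes the allocated resources through environment variables.
nproc = int(os.environ.get("SLURM_CPUS_ON_NODE", "1"))
run_location = os.environ.get("SLURM_SUBMIT_DIR")  # falls back to a temp dir if unset

# Launch a single MAPDL instance sized to the allocation.
mapdl = launch_mapdl(nproc=nproc, run_location=run_location)
print(mapdl.version)
mapdl.exit()
```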

Also, we are going to focus on the most useful stuff:

  • [Case 1] Batch script submission (Scenario A in PyMAPDL and PyHPS #2865)
  • [Case 2] Interactive MAPDL instance on the HPC cluster, with PyMAPDL running on the entry point (Scenario B in PyMAPDL and PyHPS #2865). See the sketch after this list.
  • [Case 3] Interactive MAPDL instance on the HPC cluster, with PyMAPDL running on a machine outside the cluster (similar to Scenario B in PyMAPDL and PyHPS #2865).
    We might need to SSH to the entry point machine.
  • [Case 4] Batch submission from a machine outside the cluster. This is tricky because transferring the input files is complicated. The problem goes away when running interactively, because PyMAPDL can take care of uploading the files to the instance, so we will leave this case for the very end.
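For Cases 2 and 3, a rough sketch of how the connection could look, assuming MAPDL is already running in gRPC mode on a compute node and that the client machine can reach it (directly or through an SSH tunnel). The IP address, port, and file names are placeholders; the calls themselves (launch_mapdl with start_instance=False, upload) are existing PyMAPDL API.

```python
# Rough sketch for Cases 2/3: connect PyMAPDL to an MAPDL instance that is
# already running on a compute node. IP, port, and file names are placeholders.
from ansys.mapdl.core import launch_mapdl

mapdl = launch_mapdl(
    start_instance=False,  # do not start MAPDL; connect to a running gRPC server
    ip="10.0.0.42",        # compute node where MAPDL is listening (placeholder)
    port=50052,            # PyMAPDL's default gRPC port
)

# In interactive mode, PyMAPDL can upload the input files itself, which is why
# Case 4 (batch submission from outside the cluster) is the hard one.
mapdl.upload("model.cdb")
mapdl.cdread("db", "model", "cdb")
mapdl.exit()
```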

Roadmap

Start implementing this in the following PRs:

Other features

  • Detach the remote SSH session from the cluster. We should be able to launch MAPDL instances on remote machines without a SLURM scheduler (see the sketch below).
  • Check that:
    • run_location exists.
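A possible sketch covering these two items, assuming plain SSH access with key-based authentication and a gRPC-enabled MAPDL start. The host, paths, port, and MAPDL command-line flags are assumptions about the target setup, not a final design.

```python
# Sketch only: start MAPDL on a remote machine over SSH (no SLURM involved),
# after checking that the run location exists, then connect with PyMAPDL.
# Host, paths, port, and MAPDL flags below are assumptions/placeholders.
import subprocess
import time

from ansys.mapdl.core import Mapdl

host = "10.0.0.42"                                 # placeholder remote machine
run_location = "/scratch/my_job"                   # placeholder working directory
exec_file = "/ansys_inc/v242/ansys/bin/ansys242"   # placeholder MAPDL executable
port = 50052

# Check that run_location exists on the remote machine before launching.
subprocess.run(["ssh", host, f"test -d {run_location}"], check=True)

# Start MAPDL in gRPC mode, detached from the local SSH session.
subprocess.Popen(
    ["ssh", "-f", host, f"cd {run_location} && {exec_file} -grpc -port {port}"]
)

time.sleep(30)  # crude wait; a real implementation would poll the port instead

# Connect PyMAPDL to the remote gRPC server.
mapdl = Mapdl(ip=host, port=port)
print(mapdl.version)
```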

Related issues:
