Commit
feat: support for launching an MAPDL instance in an SLURM HPC cluster (#3497)

* feat: adding env vars needed for multinode
* feat: adding env vars needed for multinode
* feat: renaming hpc detection argument
* docs: adding documentation
* chore: adding changelog file 3466.documentation.md
* feat: adding env vars needed for multinode
* feat: renaming hpc detection argument
* docs: adding documentation
* chore: adding changelog file 3466.documentation.md
* fix: vale issues
* chore: To fix sphinx build

  Squashed commit of the following:

  commit c1d1a3e
  Author: German <[email protected]>
  Date: Mon Oct 7 15:33:19 2024 +0200

      ci: retrigger CICD

  commit b7b5c30
  Author: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Date: Mon Oct 7 13:31:55 2024 +0000

      ci: auto fixes from pre-commit.com hooks.

      for more information, see https://pre-commit.ci

  commit 32a1c02
  Author: Revathy Venugopal <[email protected]>
  Date: Mon Oct 7 15:31:24 2024 +0200

      fix: add suggestions

      Co-authored-by: German <[email protected]>

  commit 575a219
  Merge: f2afe13 be1be2e
  Author: Revathyvenugopal162 <[email protected]>
  Date: Mon Oct 7 15:09:01 2024 +0200

      Merge branch 'fix/add-build-cheatsheet-as-env-varaible' of https://github.com/ansys/pymapdl into fix/add-build-cheatsheet-as-env-varaible

  commit f2afe13
  Author: Revathyvenugopal162 <[email protected]>
  Date: Mon Oct 7 15:08:58 2024 +0200

      fix: precommit

  commit be1be2e
  Author: pyansys-ci-bot <[email protected]>
  Date: Mon Oct 7 13:07:35 2024 +0000

      chore: adding changelog file 3468.fixed.md

  commit f052a4d
  Author: Revathyvenugopal162 <[email protected]>
  Date: Mon Oct 7 15:05:56 2024 +0200

      fix: add build cheatsheet as env variable within doc-build

* docs: expanding a bit troubleshooting advices and small format fix
* docs: fix vale
* fix: nproc tests
* feat: adding env vars needed for multinode
* feat: renaming hpc detection argument
* docs: adding documentation
* chore: adding changelog file 3466.documentation.md
* fix: vale issues
* docs: fix vale
* docs: expanding a bit troubleshooting advices and small format fix
* fix: nproc tests
* revert: "chore: To fix sphinx build"

  This reverts commit e45d2e5.

* docs: clarifying where everything is running.
* docs: expanding bash example
* tests: fix
* docs: adding `PYMAPDL_NPROC` to env var section
* feat: adding 'pymapdl_proc' to non-slurm run. Adding tests too.
* docs: fix vale issue
* docs: fix vale issue
* fix: replacing env var name
* feat: first 'launch_mapdl_on_cluster` draft
* feat: added arguments to 'launch_mapdl_on_cluster'. Added also properties `hostname`, `jobid` and `_mapdl_on_slurm`.
* feat: better error messages. Created 'generate_sbatch_command'.
* refactor: rename 'detect_HPC' to 'detect_hpc'. Introducing 'launch_on_hpc'.
* refactor: move all the functionality to launch_mapdl
* feat: launched is fixed now in 'launcher' silently.
* refactor: using `PYMAPDL_RUNNING_ON_HPC` as env var. Fixing bugs and tests
* chore: adding changelog file 3497.documentation.md [dependabot-skip]
* refactor: rename to `scheduler_args`
* fix: launching issues
* fix: tests
* docs: formatting changes.
* docs: more cosmetic changes.
* tests: adding 'launch_grpc' testing.
* tests: adding some unit tests
* fix: unit tests
* chore: adding changelog file 3466.documentation.md [dependabot-skip]
* fix: adding missing import
* refactoring: `check_mapdl_launch_on_hpc` and addressing codacity issues
* fix: test
* refactor: exit method. Externalising to _exit_mapdl function.
* fix: not running all tests.
* tests: adding test to __del__.
* refactor: patching exit to avoid raising exception. I need to fix this later better.
* refactor: not asking for version or checking exec_file path if 'launch_on_hpc' is true.
* tests: increasing coverage
* test: adding stack for patching MAPDL launching.
* refactor: to allow more coverage
* feat: avoid checking the underlying processes when running on HPC
* tests: increasing coverage
* chore: adding coverage to default pytesting. Adding _commands for checking coverage.
* fix: remote launcher
* fix: raising exceptions in __del__ method
* fix: weird missing reference (import) when exiting
* chore/making sure we regress to the right state after the tests
* test: fix test
* fix: not checking the mode
* refactor: reorg ip section on init. Adding better str representation to MapdlGrpc
* feat: avoid killing MAPDL if not `finish_job_on_exit`. Adding also a property for `finish_job_on_exit`.
* feat: raising error if specifying IP when `launch_on_hpc`.
* feat: increasing grpc error handling options to 3s or 5 attempts.
* feat: renaming to scheduler_options. Using variable default start_timeout. Raise an exception if scheduler options are given, but not nproc. Fix scontrol call.
* refactor: added types
* refactor: launcher args order
* refactor: tests
* fix: reusing connection attr.
* fix: pass start_timeout to `get_job_info`.
* fix: test
* fix: test
* tests: not requiring warning if on minimal since ATP is not present.
* feat: simplifying directory property
* feat: using cached version of directory.
* feat: simplifying directory property
* chore: adding changelog file 3517.miscellaneous.md [dependabot-skip]
* test: adding test
* feat: caching directory in cwd
* refactor: mapdl patcher
* feat: caching directory in cwd
* feat: caching directory for sure.
* feat: caching dir at the cwd level.
* feat: retry mechanism inside /INQUIRE
* feat: changing exception message
* feat: adding tests
* feat: caching directory
* chore: adding changelog file 3517.added.md [dependabot-skip]
* refactor: avoid else in while.
* refactor: using a temporary variable to avoid overwrite self._path

  Raise error if empty response only if non_interactive mode.

* fix: not keeping state between tests
* fix: making sure the state is reset between tests
* fix: warning when exiting.
* fix: test
* feat: using a trimmed version for delete.
* refactor: test to pass
* refactor: removing all cleaning from __del__ except ending HPC job.
* refactor: changing `detect_hpc` with `running_on_hpc`. Simplifying `launch_mapdl_on_cluster`.
* docs: adding-sbatch-support (#3513)

  * docs: expanding a bit the `PyMAPDL on HPC clusters` section
  * docs: adding info about launching MAPDL in HPC.
  * chore: adding changelog file 3513.documentation.md [dependabot-skip]
  * fix: vale issues
  * docs: changing the name to `scheduler_options`. Add warning about adding nproc.
  * fix: vale issues
  * docs: apply suggestions from Kathy code review

    Co-authored-by: Kathy Pippert <[email protected]>

  * docs: adding CPUs.

  ---------

  Co-authored-by: pyansys-ci-bot <[email protected]>
  Co-authored-by: Kathy Pippert <[email protected]>

* feat: avoid exceptions on `__del__`
* tests: adding tests for get_port and get_ip
* feat: using a submitter function for grouping.
* tests: attempting clean exit
* feat: externalising to function getting the batchhost
* tests: increasing coverage
* tests: fix
* fix: doc builds
* tests: increasing coverage
* fix: not passing args
* tests: increase coverage
* fix: tests
* fix: fixture
* ci: uploading bandit reports as artifact.
* docs: adding descriptor to phrase

---------

Co-authored-by: pyansys-ci-bot <[email protected]>
Co-authored-by: Kathy Pippert <[email protected]>
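
For readers unfamiliar with the feature, the commit message mentions a `generate_sbatch_command` helper and a rule that scheduler options require `nproc`. The snippet below is only a minimal sketch of that idea, not the code added in this commit: the function name `build_sbatch_command`, its signature, and the mapping of `nproc` to `--ntasks` are all assumptions made for illustration. It relies only on the standard `sbatch` flags `--wrap`, `--ntasks`, and `--nodes`.

```python
import shlex
from typing import Dict, Optional


def build_sbatch_command(
    cmd: str,
    scheduler_options: Optional[Dict[str, object]] = None,
    nproc: Optional[int] = None,
) -> str:
    """Hypothetical helper: wrap ``cmd`` in an ``sbatch --wrap`` call.

    This is NOT PyMAPDL's implementation; it only illustrates the idea.
    """
    options = dict(scheduler_options or {})

    # Mirror the rule named in the commit message: scheduler options
    # given without an explicit CPU count are rejected.
    if options and nproc is None:
        raise ValueError("'nproc' must be given when scheduler options are used.")

    # Assumption: the CPU count is passed to SLURM as the number of tasks,
    # unless the caller already set 'ntasks' explicitly.
    if nproc is not None:
        options.setdefault("ntasks", nproc)

    parts = ["sbatch"]
    for key, value in options.items():
        quoted = shlex.quote(str(value))
        # Single-letter keys become short flags, all others long flags.
        if len(key) == 1:
            parts.append(f"-{key} {quoted}")
        else:
            parts.append(f"--{key}={quoted}")

    # Quote the wrapped command so the shell passes it to sbatch as one argument.
    parts.append(f"--wrap={shlex.quote(cmd)}")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_sbatch_command("echo hello", {"nodes": 2}, nproc=4))
```

Returning a single string keeps the sketch easy to log and inspect; a production version might return an argument list instead so that it can be passed to `subprocess.run` without going through a shell.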