Relieve users of cluster management on HPC? #557

Open
TomNicholas opened this issue Aug 19, 2024 · 0 comments

In #554 and #555 @applio started to add support for running Cubed on Dragon, with the intention of allowing users to run Cubed on HPC.

One thing that's super nice about the way Cubed runs on serverless cloud executors is that the user no longer has to think about the concept of a "cluster" at all. From the blog post:

> No cluster to manage: A serverless design means that the user does not have to deploy and manage a cluster at all. Arguably conceptually simpler, this model also means less boilerplate code, no error-prone deployment step, and only paying for computation you actually do, not for the time the cluster is up.

Can we get a similar user experience on HPC somehow? The challenge is that HPC resources are normally controlled by a queuing system like SLURM or PBS, and are therefore not ephemeral in the same way that serverless functions are.

There are at least 2 ways in which I would expect users to want to run Cubed on HPC:

  1. Submitting a Python script as a batch job
  2. From an interactive node (e.g. in a Jupyter notebook)

We also don't want the users to have to think about exact configuration details on an allocation, e.g. how many threads/processes should be created on a particular system. Even if they do have to choose the size of the resource allocation manually, ideally Dragon would automatically create a sensible number of processes for them.
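To make the "sensible number of processes" idea a bit more concrete, here is a rough sketch of how a worker count could be inferred from the allocation without user input. This is only an illustration: `suggest_worker_count` and its `reserve_cpus` parameter are hypothetical names, not part of Cubed or Dragon, and I'm not claiming this is how Dragon chooses its process count. It reads `SLURM_CPUS_ON_NODE` (which SLURM sets inside a job) and otherwise falls back to what the OS reports; PBS is not handled specially here.

```python
import os


def suggest_worker_count(env=None, reserve_cpus=0):
    """Guess how many worker processes to start on the current node.

    Hypothetical helper for illustration only; not part of Cubed or Dragon.
    """
    env = os.environ if env is None else env

    # Inside a SLURM job, this variable holds the CPUs allocated on this node.
    slurm_cpus = env.get("SLURM_CPUS_ON_NODE")
    if slurm_cpus and slurm_cpus.isdigit():
        cpus = int(slurm_cpus)
    elif hasattr(os, "sched_getaffinity"):
        # On Linux, the affinity mask lists the CPUs this process may run on,
        # which is often narrower than the whole machine on a shared node.
        cpus = len(os.sched_getaffinity(0))
    else:
        # Last resort: total CPU count of the machine (may be None).
        cpus = os.cpu_count() or 1

    # Optionally leave some CPUs free (e.g. for the notebook kernel itself),
    # but always return at least one worker.
    return max(1, cpus - reserve_cpus)
```

The same function would work in both scenarios above: in a submitted script the scheduler's environment variable is present, and on an interactive node without one it falls back to the CPUs visible to the process.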
