This repository has been archived.
An exploration of using Parsl for data processing pipelines in Kubernetes (k8s):
Use Parsl to create parallel programs composed of Python functions and external components. Execute Parsl programs on any compute resource from laptops to supercomputers.
The following need to be set up on the target cluster to make this work.
Note
Some of this may be "done for you" on the ADC cluster, but you'll still need to set up your local (e.g. Rancher Desktop) cluster.
kubectl create namespace qgnet
Update your config to add a context referencing the new namespace:
kubectl config --kubeconfig={config-file-path} \
set-context {context-name} \
--cluster={cluster-name} \
--namespace=qgnet \
--user={user}
For a local Rancher Desktop cluster, this looks like:
kubectl config --kubeconfig="${HOME}/.kube/config" \
set-context rancher-desktop-qgnet \
--cluster=rancher-desktop \
--namespace=qgnet \
--user=rancher-desktop
The target cluster should be configured to have a qgnet service account with permissions for managing pods (creation/deletion).
kubectl apply -f k8s/serviceaccount.yml
Note
Role bindings are based on the DataONE example.
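To confirm that the service account really can create and delete pods, one option is `kubectl auth can-i` with impersonation. The sketch below assumes the service account is named qgnet and lives in the qgnet namespace, and that your own user is allowed to impersonate service accounts. The helper names `can_i_command` and `check_pod_permissions` are hypothetical and not part of this repository.

```python
import subprocess


def can_i_command(verb, namespace="qgnet", service_account="qgnet"):
    """Build a `kubectl auth can-i` command that impersonates a service account.

    The default namespace and service account names are assumptions.
    """
    return [
        "kubectl", "auth", "can-i", verb, "pods",
        "--namespace", namespace,
        "--as", f"system:serviceaccount:{namespace}:{service_account}",
    ]


def check_pod_permissions(verbs=("create", "delete")):
    """Run the check once per verb. Requires kubectl and access to the cluster."""
    results = {}
    for verb in verbs:
        completed = subprocess.run(
            can_i_command(verb), capture_output=True, text=True, check=False
        )
        # kubectl prints "yes" or "no" on stdout for this subcommand.
        results[verb] = completed.stdout.strip() == "yes"
    return results
```

Calling `check_pod_permissions()` with the right context selected returns a mapping such as `{"create": True, "delete": True}` when the role bindings are in place.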
First, select the appropriate k8s context. E.g., to run locally:
kubectl config use-context rancher-desktop-qgnet
To run on the remote dev-qgnet k8s cluster:
Warning
Deployment to dev-qgnet currently does not work. See #3.
kubectl config use-context dev-qgnet
Submit the example job defined in run.py with:
python run.py
Note
The local versions of Python and Parsl must match the remote versions!
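A quick way to catch a mismatch before submitting is to compare the local versions against the ones you expect in the worker image. This is only a sketch: it assumes that comparing Python at the major.minor level is sufficient, and the remote versions shown are placeholders, not the versions this project uses. The function names are hypothetical.

```python
import sys
from importlib import metadata


def local_versions():
    """Return the local Python (major.minor) and parsl versions as strings."""
    python_version = f"{sys.version_info.major}.{sys.version_info.minor}"
    try:
        parsl_version = metadata.version("parsl")
    except metadata.PackageNotFoundError:
        parsl_version = None  # parsl is not installed in this environment
    return {"python": python_version, "parsl": parsl_version}


def find_mismatches(local, remote):
    """List human-readable differences between two version mappings."""
    problems = []
    for name in ("python", "parsl"):
        if local.get(name) != remote.get(name):
            problems.append(
                f"{name}: local={local.get(name)} remote={remote.get(name)}"
            )
    return problems


if __name__ == "__main__":
    # Placeholder remote versions (assumption); use those of your worker image.
    remote = {"python": "3.11", "parsl": "2024.4.8"}
    for problem in find_mismatches(local_versions(), remote):
        print("MISMATCH", problem)
```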
Warning
If a run fails, it is possible that a pod will get "stuck" and not get cleaned up properly. This may require manual cleanup!
Running a Parsl "job" on a remote cluster has a frustrating complexity: The remote workers need to be able to connect back to the host running the Parsl program. If you're behind a firewall you don't control, this may not be possible!
The workaround we're using is to submit a Kubernetes Job that runs the Parsl init program from a ConfigMap. See run-on-remote-cluster.sh and job.yml for an example of this.
Important
We are currently using our own fork of Parsl to add support for getting "in-cluster" configuration. See: Parsl/parsl#3357
See "Inspect a Kubernetes PersistentVolumeClaim" by Frank Sauerburger for an excellent tutorial.
kubectl apply -f k8s/pvc-inspector.yml
- You may need to wait a few seconds...
kubectl exec -it pvc-inspector -- sh
- Inspect the /pvc directory
- Quit
kubectl delete pod pvc-inspector
MountVolume.SetUp failed for volume "parsl-init-script-volume" : object "qgnet"/"parsl-init-script" not registered
This error does not always occur.
Some failure states leave pods stuck in a restart loop, and these pods are not cleaned up automatically. To find pods in this state:
kubectl get pods
To remove a pod that is stuck:
kubectl delete pod {pod-name}
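When there are many pods, it can help to filter the output of `kubectl get pods -o json` instead of reading the table by eye. The sketch below treats a pod as stuck if any container is waiting with reason `CrashLoopBackOff` or has restarted at least `restart_threshold` times. Both the function name and the threshold of 3 are assumptions, not rules from this project.

```python
import json


def find_stuck_pods(pod_list, restart_threshold=3):
    """Return names of pods that look stuck in a restart loop.

    `pod_list` is the parsed JSON from `kubectl get pods -o json`.
    """
    stuck = []
    for pod in pod_list.get("items", []):
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for status in statuses:
            waiting = status.get("state", {}).get("waiting") or {}
            crash_looping = waiting.get("reason") == "CrashLoopBackOff"
            if crash_looping or status.get("restartCount", 0) >= restart_threshold:
                stuck.append(pod["metadata"]["name"])
                break  # one bad container is enough to flag the pod
    return stuck


# Example with a hand-written pod list (made-up data for illustration).
example = json.loads("""
{"items": [
  {"metadata": {"name": "worker-ok"},
   "status": {"containerStatuses": [{"restartCount": 0, "state": {"running": {}}}]}},
  {"metadata": {"name": "worker-looping"},
   "status": {"containerStatuses": [
     {"restartCount": 5, "state": {"waiting": {"reason": "CrashLoopBackOff"}}}]}}
]}
""")
stuck_names = find_stuck_pods(example)
```

Each name in `stuck_names` can then be passed to `kubectl delete pod`.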
You must have a valid $KUBECONFIG path. Paths including ~ or paths to files which do not exist will cause Rancher to fail to start the cluster.
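The two failure cases above can be checked before starting the cluster. The following sketch assumes that $KUBECONFIG may hold several paths joined by the platform's path separator, as kubectl allows. The function name `kubeconfig_problems` is hypothetical.

```python
import os


def kubeconfig_problems(value):
    """Return a list of problems found in a KUBECONFIG-style value."""
    if not value:
        return ["KUBECONFIG is not set or is empty"]
    problems = []
    for path in value.split(os.pathsep):
        if not path:
            continue  # ignore empty entries such as a trailing separator
        if "~" in path:
            problems.append(f"{path}: contains '~'")
        elif not os.path.isfile(path):
            problems.append(f"{path}: file does not exist")
    return problems


if __name__ == "__main__":
    for problem in kubeconfig_problems(os.environ.get("KUBECONFIG", "")):
        print(problem)
```

Using `${HOME}/.kube/config` rather than `~/.kube/config` when exporting the variable avoids the first problem, since the shell expands `${HOME}` before the value is stored.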
- The Parsl user guide's "Kubernetes Clusters" section is a good place to start.