Detect GPU tasks by inspecting inputs/outputs #4656

Open
mrocklin opened this issue Mar 31, 2021 · 4 comments

@mrocklin
Member

It would be useful to automagically detect which tasks engaged the GPU. This would allow us to more easily use both the CPU and GPU in mixed workloads, and require less configuration by the user. Unfortunately automatically detecting GPU tasks is hard.

There are a few approaches to this:

  1. Years ago I tried achieving this by inspecting the serialized form of the task for text like b"cudf" or b"torch". This was surprisingly effective, but also kludgy as heck.
  2. Libraries like cudf could annotate layers, though this may help less with PyTorch and delayed/futures.
  3. Users can handle this themselves with annotations and resource restrictions (see the sketch just after this list).
  4. New idea! We could learn this by looking at the inputs and outputs of a function for common protocols like __cuda_array_interface__ and send that information back to the scheduler.
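
To make option 3 concrete, here is a hedged sketch of the user-side workaround using the public dask.annotate API with worker resources (the "GPU" resource name is arbitrary and only means something if workers were started with a matching --resources flag):

```python
import dask
import dask.array as da
import cupy  # assumes a CUDA-capable environment

x = da.random.random((10_000, 10_000), chunks=(1_000, 1_000))

# Only workers started with a matching resource, e.g.
#   dask-worker scheduler:8786 --resources "GPU=1"
# are eligible to run the annotated tasks.
with dask.annotate(resources={"GPU": 1}):
    y = x.map_blocks(cupy.asarray)  # move each block onto the GPU
```

Depending on the Dask version, low-level graph optimization can drop annotations, which is part of why this approach demands extra care from the user.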

So, to restate the new idea: whenever a task created a result that engaged the __cuda_array_interface__ protocol, we would include that information as we sent it up to the scheduler. This probably requires a new attribute like cuda_nbytes on the TaskState (which I'm personally fine with). The scheduler would watch for this signal and, if it occurred, flip a cuda flag on the TaskPrefix. That flag would then be sent down to all of the workers, which might push matching tasks to run in a different ThreadPoolExecutor (see #4655).
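
A minimal sketch of that worker-side detection, assuming a hypothetical helper; cuda_nbytes is just the attribute name floated above, and nothing here is distributed's actual API:

```python
import math

def cuda_result_nbytes(obj):
    """Return device-memory size in bytes if ``obj`` implements
    __cuda_array_interface__, else None (hypothetical helper)."""
    cai = getattr(obj, "__cuda_array_interface__", None)
    if cai is None:
        return None
    itemsize = int(cai["typestr"][2:])  # e.g. "<f8" -> 8 bytes per item
    return math.prod(cai["shape"]) * itemsize

# On task completion the worker could attach this to the task-finished
# message it already sends, e.g.:
#   nbytes = cuda_result_nbytes(result)
#   if nbytes is not None:
#       msg["cuda_nbytes"] = nbytes
```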

This would misallocate the first few tasks to the CPU Executor, but mostly it would do the right thing, and it wouldn't require any intervention from the user.

cc @dask/gpu

@jakirkham
Member

I think @ayushdg has been exploring using annotations for heterogeneous cluster use cases. So he may have thoughts on that approach as well 🙂

@kkraus14
Member

I think only using task results is going to cause problems. One thing we see our users do somewhat commonly is to self-contain all of their GPU work within a task, where neither the inputs nor the outputs of the task are GPU objects.
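
For illustration, the pattern being described looks something like this (sketched with CuPy, assuming it is available); inspecting the task's inputs or outputs alone would classify it as CPU-only:

```python
import numpy as np

def self_contained_gpu_task(arr: np.ndarray) -> np.ndarray:
    import cupy
    gpu_arr = cupy.asarray(arr)   # host -> device copy
    out = cupy.fft.fft(gpu_arr)   # the actual work runs on the GPU
    return cupy.asnumpy(out)      # device -> host: the result is NumPy again,
                                  # so __cuda_array_interface__ never shows up
                                  # on the task's inputs or outputs
```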

@mrocklin
Member Author

Hrm, good point. Maybe there is a holistic "try many different things" mix of approaches that we use to cover this space.

@mrocklin
Member Author

Regardless, we should probably think about annotating tasks with a cuda flag, likely on the TaskPrefix, and make sure that it propagates down to the Worker. Even if this doesn't do anything yet, it might be useful to see how the plumbing there could work.
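
A rough sketch of that plumbing; all of the names here (the cuda flag, the broadcast hook) are hypothetical, not distributed's actual internals:

```python
class TaskPrefix:
    """Stand-in for the scheduler's per-prefix state (sketch only)."""

    def __init__(self, name):
        self.name = name
        self.cuda = False  # flipped once any task in this prefix reports GPU use

def on_task_finished(scheduler, prefix, msg):
    # If a worker reported device memory for this result, mark the whole
    # prefix as CUDA and notify every worker, so that future tasks with this
    # prefix can be routed to a GPU executor (see #4655).
    if msg.get("cuda_nbytes") and not prefix.cuda:
        prefix.cuda = True
        scheduler.broadcast({"op": "cuda-prefix", "prefix": prefix.name})
```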
