Scheduling: Hierarchies #2
Closed
anshumanmohan
started this conversation in
Ideas
Replies: 1 comment
-
Just updating this discussion to reflect decisions we've made synchronously: we have mostly abandoned this line, because our setup (Core |
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
How would a PIFO tree ease scheduling on a smartNIC? In this discussion I suggest that we focus on hierarchies. At the moment I am primarily seeking feedback regarding whether my setup is realistic. First, I present this setup. Then, I explain how a PIFO tree with minor extensions can help in this setting. I sketch those extensions. Finally, I point out some known issues.
Setup
A
,B
, andC
.A
andB
have distinct capabilities (neither can do the work that the other can). These kinds of cores are very performant, but we have few of them.C
can do the work of bothA
andB
. Cores of this kind are less performant, but we have more of them.Our Scheduler
PIFO Tree
I propose that our scheduler live on the smartNIC and be in the shape of a binary-branching PIFO tree of height 2. As a reminder, a PIFO tree already has a scheduling transaction, which has two roles:
Our tree will partition the incoming traffic by core: packets that can be served by cores of category
A
(resp.B
) will go into leafa
(resp.b
). We will default to maintaining FIFO order within a leaf.Popping the Tree
When a core is ready for a task, it pops the tree (via a signal sent right-to-left over the PCIe link). In particular:
C
, which can handle either kind of task, just pops the tree as usual.A
orB
executes a parameterized pop, which is a new but relatively straightforward operation: pop a subtree by providing the address of the subtree's root, and housekeep above the subtree to ensure well-formedness. Details about this operation are below.Relaying Additional Information
A core can also send a right-to-left signal to the tree to communicate information that may only be known to a core, or only revealed once work on a packet has begun. We can put this information into the tree via a dummy push: the push affects a reordering of previously buffered packets and itself enqueues a dummy packet at a leaf with very low priority. Details about this operation are below.
Alternative Strategy
In the strategy above, cores "pull" packets over the PCIe link by (perhaps parametrically) popping the tree. Each core does this when it is ready for more work. The alternate would be for the tree to "push" packets across the link on its own. However, this is challenging:
We reject this alternative strategy.
Extensions
Parameterized Pop
Consider the tree at the top of the following figure. If popped three times as usual, this tree will yield
P1
, thenP2
, thenP3
.However, say we perform a new kind of pop, which takes the address of a subtree. Instead of$pop(t)$ , we now do $pop(t, [L])$ . That is, start at the root of $t$ , follow the address given (here, $[L]$ ) to find a subtree, and pop that tree as usual. Doing this yields the packet $L$ in the room PIFO. In the figure below, I have marked the order of operations in blue.
P2
. However, just doing this results in an ill-formed tree. We must also go housekeep "above" the subtree by removing an occurrence ofDummy Push
Consider the same initial tree as above. If popped three times as usual, this tree will yield
P1
, thenP2
, thenP3
. We can push a dummy value, marked with a hole in red, down some insertion path marked in red. The result is that the tree will now yieldP3
, thenP1
, thenP2
, and then the dummy value.Note that we cannot always guarantee that the dummy value will come out last. Consider the example below. After the dummy value is pushed, the tree will yield
P3
, thenP1
, thenP2
, then the dummy value, and thenP4
. To get around this, the pop method will need to be altered a little: if a pop yields a dummy value, recurse and pop again. Popping an empty tree is undefined, so this will terminate.Known Issues
A
andB
) pull work over to themselves, we are basically not doing scheduling. It's actually worse than not doing scheduling, since we do do some scheduling when inserting the packet into the PIFO tree and must then undo that work via a parameterized pop. Anyway, if the hierarchical arrangement feels awkward and expensive, the natural next option is to just do away with the PIFO tree arrangement and let each category of cores maintain its own queue. These per-category queues may not need to be as sophisticated as PIFO trees.Beta Was this translation helpful? Give feedback.
All reactions