Compiler paths: policy DSL to Calyx eDSL #38
Replies: 1 comment 5 replies
-
Thanks for writing this down! This is a fantastic and very clear overview. Here is a quick summary of what I think is the central tension: Let "simple policies" be the kind that is currently representable in our policy DSL. Let "complex policies" be the kind that is the result from T2TC, i.e., a generalization of simple policies that adds transient nodes. Here are some facts (let me know if I am wrong):
I think the latter implies that we almost certainly want to build centralized-controller hardware, right? That is, we don't think there's a lot of hope that we might come up with a decentralized way to implement complex policies, so a decentralized hardware generator would end up limited to simple policies, which would be a shame. Here is one way that we could proceed, which is a kind of blend of blue and green:
The point of this ordering is that it lets us get started generating hardware immediately without figuring out how to do complex policies yet. In parallel to that, we can work on figuring out what we want to do in phase 2. It leaves open two options for that:
Neither option frees us from the challenge of implementing actual hardware for the complex-policy controller. That is unavoidable. |
Beta Was this translation helpful? Give feedback.
-
This is an attempt to nail down the various handshakes that need to happen as we try to compile programs written in our policy DSL to hardware designs in the Calyx eDSL. From the Calyx eDSL we will go to hardware, but for our purposes the Calyx eDSL is the target.
Overview, and Challenge
Let's work with work-conserving policies for now; I'll talk about PIFO trees and not PIEO trees for that reason. Let's ignore the
peek
operation. Also,pop
s do need to be supported but are not all that challenging, so I will ignore them for now.The source of our compiler is very clear: it is a policy, as defined in our policy DSL. This is a tree-shaped OCaml object whose leaves are classes and whose internal nodes are policies. The policies, for the purposes of this discussion, are just easy "off the shelf" things:
FIFO
,Fair
, andStrict
. That is, we aren't thinking about handwritten policies in the style of this issue.What is the target of our compiler? Well, given a user-written policy, we need to figure out two things that will together realize the policy in the Calyx eDSL:
push
command. That is, some of our PIFOs are acting like the leaves of a tree and the others are acting as nodes; atpush
we need to push a payload value at a leaf and push integer values into the leaf's ancestor nodes as described in Formal Abstractions. In addition to figuring our what to put into what PIFO, we must figure out the ranks for each of these pushes.It is easy to see that step 1 above is not all that hard. It is step 2 that is the challenge. How are we going to figure out what to push where, and with what rank? And that too in hardware?
In the figure below, the rows have the same values. This is because the left column represents the given tree while the right column represents the equivalent tree after tree-to-tree compilation in the Formal Abstractions sense (hereafter, T2TC). Only one version of T2TC exists right now, and so it is shown with a solid arrow. Our starting point is on the top left (policy DSL) and our end goal is the bottom right (Calyx eDSL, after T2TC). The rest of this post discusses the various ways I see of getting from our source to our target.
Approach 1: decentralized clever pieces of hardware (red way)
This approach handles the logic of rank-selection in a decentralized way. It goes directly from the policy to the Calyx eDSL.
Say the user wrote the following policy using the DSL:
strict(fair(A, B, C), D)
. This policy has six nodes. We won't just spin up six naive PIFOs, but rather we'll spin up four PIFOs that havefifo
shims on them, a PIFO that has a3-way fair
shim on it, and a PIFO that has a2-way strict
shim on it.If we spin up clever little pieces of hardware like this, then the big challenge above (finding "some logical way to react to a
push
command") is no longer so hard. Our PIFO+shim gadgets have the tricky logic baked in; we just need to remember the parent-child relation between them. Whenpush
is called, we will be given a payload and a target leaf (one of the FIFO-shimmed gadgets). We will simply give the payload to the leaf, letting the leaf itself determine a rank, and then we will approach the leaf's parent and have it enqueue an indexn
corresponding to which index child the leaf was from the PoV of the parent (e.g., if I was my parent's second child and I receive an input, I will complete the enqueue into myself and then tell my parent to please enqueue a2
). Again we must just furnishn
, and the parent node will itself figure out what rankn
should get. This will recurse upwards until we hit the root.At first blush I actually think this could work great. It directly goes from the policy DSL to the Calyx eDSL, it makes use of the many little gadgets that the undergrads have built over the summer. A drawback of this approach, though, is that the "decentralized but clever" approach is not very amenable to T2TC. We don't have this version of T2TC written out obviously, and further, the T2TC is currently designed to operate on a centralized logic (where an insertion path is computed for an incoming packet) and this is different from the decentralized system discussed above. Since each "shim+PIFO" gadget will basically be a black box, it may be hard to enact a hardware-level version of the T2TC that introduces transient nodes.
I'm not saying that T2TC here is impossible, but I am saying that the trade-off is becoming clear: we get an easier solution to the problem of setting up a working hardware solution that can handle a
push
, but we need to do some interesting new work for T2TC. In the figure, this approach is shown in red.Approach 2: exploit OCaml T2TC (blue way)
Burned by the above, this approach seeks to exploit the existing T2TC that we have! That T2TC is from a "source"
control
to a targetcontrol
. So how do we exploit this?c_src
, which is a control that one might hand-write in the OCaml development. Examples of such handwritten policies can be found inalg.ml
. I don't think this will be so bad. Aside: we can also run our existing visualizer onc_src
to make sure that the user-written policies are behaving as expected.c_tar
that simulatesc_src
.c_tar
into Calyx. This will be tricky. Let's talk about it after breaking out of this little list.Recall that a control is a triple of a state, a PIFO tree, and a scheduling transaction. The scheduling transaction's whole job is to look at incoming packets and issue an insertion path, including all the appropriate ranks, for that packet to successfully be enqueued. Sounds great, right? We can just set up a number of naive PIFOs and then use the scheduling transaction to get paths? Well the issue is that this path-computation is happening in software, not hardware. Moving this logic to hardware will be its own challenge, and I don't have really anything else to say about how we'll tackle that challenge.
Just to make some progress, we could lean into this path-computation-in-software thing. We could get the OCaml control
c_tar
to tell us, for a given batch of input packets, exactly what insertion paths it would have assigned to those packets. This is not so hard: just run the OCaml-level simulator and make it produce some kind of trace. This trace will need to be a little like our.data
files that Calyx queues currently take, but obviously it will need to be beefed up beyond the shared interface's current capabilities because it will need to communicate entire paths to the underlying hardware. The.data
input will also include calls topop
. The underlying hardware will be a number of naive PIFOs (as many PIFOs as there are nodes inc_tar
's PIFO tree) and it will need to learn how to read these new beefed-up path inputs. The hardware will also need to produce a result, which will go into an.expect
file. As an aside, we could tweak our Python visualizer to read in this.expect
file and graph out the result of our Calyx-level hardware work.In the figure, this approach is shown in blue. I have a little empty brace there to show that the final blue arrow doesn't quite get home: there remains a software-hardware gap.
Approach 3: T2TC early (green way)
This approach is to somehow find a way to do T2TC at the policy DSL level, and then go directly from the policy DSL to the Calyx eDSL.
As we saw in approach 1 above, translation from the source policy written in the policy DSL to some runnable Calyx was really quite easy. But then we got stuck in the left column, and we needed to implement T2TC in the hardest possible setting: the bottom row, trying to go from left to right.
We have a longish discussion about finding a way to coherently represent transient nodes in the policy DSL. A few ideas have been aired, but the tension there is always to keep enough information around such that translation to Calyx remains easy. If we can figure that out, we'll be able to:
In the figure, this approach is shown in green. This is the approach I like most in theory, but I understand the least in practice. Send help!
Beta Was this translation helpful? Give feedback.
All reactions