Reduce Extra Cycle Latency of Pipeline Calls #49

Open
dz333 opened this issue Sep 28, 2021 · 1 comment
Labels
code generation (Related to Generating RTL Code), enhancement (New feature or request)

Comments


dz333 commented Sep 28, 2021

Currently, when you call a different PDL pipeline, this translates into enqueuing onto an input FIFO, and getting the response requires dequeuing from an output FIFO.

This means that if you make a request in cycle 1, the earliest cycle in which the client can use the response is cycle 3
(enq to the input FIFO in cycle 1, enq to the output FIFO in cycle 2, deq from the output FIFO in cycle 3).
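To make the cycle counting concrete, here is a minimal Python sketch of the current protocol. It is only a toy model, not the generated RTL: `RegisteredFifo` and `simulate` are hypothetical names, and it assumes a one-stage callee that responds in the same cycle it dequeues the request.

```python
class RegisteredFifo:
    """Toy FIFO whose enqueued values become visible only in the next cycle."""

    def __init__(self):
        self.items = []    # values visible to deq in the current cycle
        self.pending = []  # values enqueued during the current cycle

    def enq(self, value):
        self.pending.append(value)

    def can_deq(self):
        return bool(self.items)

    def deq(self):
        return self.items.pop(0)

    def end_cycle(self):
        # Commit this cycle's enqueues, like a register update on the clock edge.
        self.items.extend(self.pending)
        self.pending = []


def simulate(request_cycle=1, max_cycles=10):
    """Return the first cycle in which the caller can dequeue the response."""
    inp, out = RegisteredFifo(), RegisteredFifo()
    for cycle in range(1, max_cycles + 1):
        # Caller: take the response if one is visible this cycle.
        if out.can_deq():
            out.deq()
            return cycle
        # Callee (assumed single stage): answer in the cycle it dequeues.
        if inp.can_deq():
            out.enq(inp.deq() + 1)
        # Caller: issue the request.
        if cycle == request_cycle:
            inp.enq(41)
        inp.end_cycle()
        out.end_cycle()
    return None


print(simulate())  # 3
```

Under these assumptions, a request issued in cycle 1 is answered no earlier than cycle 3, matching the count above.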

We'd really like to bring this minimum latency down by 1 cycle: calling a PDL pipeline can take a long time, but the minimum should be a single cycle.

We need to pick a solution that doesn't affect how looping pipelines (i.e., pipelines that send data to themselves) work.

Options

  1. The input "FIFO" can be read in the same cycle it is written (i.e., a bypass queue in BSV library terms)
  2. The output "FIFO" can be read in the same cycle it is written

I prefer (1), except that it makes calling "recursively" different from calling out to a different pipeline (we don't want a recursive call to execute the first stage in the same cycle as the call).
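As a rough way to compare the two options, the sketch below models a FIFO that can optionally expose an enqueue in the same cycle. This is an illustrative toy, not the BSV library implementation: `Fifo`, `first_response_cycle`, and the single-stage callee are all assumptions made for the example.

```python
class Fifo:
    """Toy FIFO; with bypass=True an enqueue is visible to deq in the same cycle."""

    def __init__(self, bypass=False):
        self.bypass = bypass
        self.items = []    # visible to deq in the current cycle
        self.pending = []  # becomes visible after the clock edge

    def enq(self, value):
        (self.items if self.bypass else self.pending).append(value)

    def can_deq(self):
        return bool(self.items)

    def deq(self):
        return self.items.pop(0)

    def end_cycle(self):
        self.items.extend(self.pending)
        self.pending = []


def first_response_cycle(input_bypass, output_bypass, max_cycles=10):
    """Request issued in cycle 1; return the cycle the caller gets the response."""
    inp, out = Fifo(input_bypass), Fifo(output_bypass)
    for cycle in range(1, max_cycles + 1):
        if cycle == 1:
            inp.enq(41)             # caller issues the request
        if inp.can_deq():
            out.enq(inp.deq() + 1)  # assumed one-stage callee responds
        if out.can_deq():
            out.deq()               # caller consumes the response
            return cycle
        inp.end_cycle()
        out.end_cycle()
    return None


print(first_response_cycle(False, False))  # 3 (current behavior)
print(first_response_cycle(True, False))   # 2 (option 1)
print(first_response_cycle(False, True))   # 2 (option 2)
```

In this model either option removes one cycle. The difference is where the same-cycle path sits: with option 1 the callee's first stage runs in the cycle of the call, which is exactly the behavior that is undesirable for recursive calls.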

@dz333 dz333 added enhancement New feature or request code generation Related to Generating RTL Code labels Sep 28, 2021

dz333 commented Dec 13, 2021

I'm going to re-use this issue to track generating a different protocol for calling pipelines:

Input queues will be "Bypass Queues" (in BSV terms): if the queue is empty, the request isn't written to a register; the stage just processes it directly.

Output queues will effectively be a single register, and only the next request can execute an output(val) statement to make its response available. This forces outputs to be in the same order as requests, but I think we can consider how to support "non-blocking" requests in the future.
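One possible way to picture the output side is a single response register guarded by request order. The sketch below is an assumption-heavy illustration, not the planned code generation: `OutputSlot`, its ticket counters, and the method names are hypothetical, and the ticket scheme is just one way to express "only the next request can execute output(val)".

```python
class OutputSlot:
    """Single-register response slot that accepts outputs only in request order."""

    def __init__(self):
        self.next_to_issue = 0   # ticket handed to the next incoming request (assumed scheme)
        self.next_to_output = 0  # ticket currently allowed to write the register
        self.value = None
        self.valid = False

    def accept_request(self):
        ticket = self.next_to_issue
        self.next_to_issue += 1
        return ticket

    def try_output(self, ticket, value):
        # Stall if it is not this request's turn or the register is still full.
        if ticket != self.next_to_output or self.valid:
            return False
        self.value, self.valid = value, True
        self.next_to_output += 1
        return True

    def take_response(self):
        # Caller reads the register and frees it; None means nothing is ready.
        if not self.valid:
            return None
        self.valid = False
        return self.value


slot = OutputSlot()
first, second = slot.accept_request(), slot.accept_request()
print(slot.try_output(second, "B"))  # False: the earlier request has not output yet
print(slot.try_output(first, "A"))   # True
print(slot.take_response())          # A
print(slot.try_output(second, "B"))  # True
```

The stalled `try_output` call for the second request is what enforces in-order responses here; supporting "non-blocking" requests would mean relaxing that check.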
