Write Operation Cycles #2

Open
Hrayo712 opened this issue Apr 9, 2019 · 1 comment

Hrayo712 commented Apr 9, 2019

Hello!

Hope you're doing well, and sorry to bother you. I am currently using your design for one of my implementations. However, I need the write operation to take fewer than 4 cycles (2 if possible).
Could you give me a pointer on how I could achieve this?

Thanks!

@alexforencich
Owner

Well, you're not going to get the SRL version below 16 or 32 cycles. You might be able to make this work with the BRAM version, but I think this will require a significant amount of re-working.

The BRAM based CAM update requires two read-modify-write operations to clear the match bits for the old value and then set them for the new value. With only one port available for updates, four operations means at least four clock cycles. With one set of BRAMs, one port is used for matching and one port is used for updating. It may be possible to pipeline the current implementation and get it to a throughput of one update every four clock cycles.

If you add a second set of BRAMs to "shadow" the first set but don't use that set for matching, then you can use the second port on those instances for the reads and then tie the write ports together so both sets of BRAM would have the same contents. This should enable pipelined operations with a throughput of one operation every two cycles (two reads and two writes per update, on separate ports).

The latency will probably be at least four cycles, though, as there needs to be a read against the 'previous value' RAM as well as wait states for the BRAM output registers. You'll also need to add hazard detection logic to make sure concurrent read-modify-write operations to the same address are handled correctly.
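To make the four-operation count concrete, here is a rough behavioral sketch in Python of the update sequence described above. It is not the actual RTL and not cycle-accurate. The class name `BramCamModel`, the single match RAM indexed directly by the data value, and the default widths are assumptions made for illustration only.

```python
class BramCamModel:
    """Behavioral (not cycle-accurate) model of a RAM-based CAM update."""

    def __init__(self, data_width=8, addr_width=4):
        # Match RAM (assumed layout): indexed by data value, each word holds
        # one match bit per CAM address.
        self.match_ram = [0] * (1 << data_width)
        # 'Previous value' RAM: indexed by CAM address. Starting at 0 is safe
        # in this model because clearing an already-clear bit changes nothing.
        self.prev_ram = [0] * (1 << addr_width)

    def match(self, data):
        """Return a bitmask of CAM addresses whose stored value equals data."""
        return self.match_ram[data]

    def write(self, addr, data):
        """Store data at addr; return the number of match-RAM port accesses."""
        bit = 1 << addr
        old = self.prev_ram[addr]  # read against the 'previous value' RAM
        ops = 0

        # Read-modify-write 1: clear the match bit under the old value.
        word = self.match_ram[old]
        ops += 1
        self.match_ram[old] = word & ~bit
        ops += 1

        # Read-modify-write 2: set the match bit under the new value.
        word = self.match_ram[data]
        ops += 1
        self.match_ram[data] = word | bit
        ops += 1

        self.prev_ram[addr] = data
        return ops


cam = BramCamModel()
print(cam.write(3, 0x2A))       # 4 accesses on the update port
print(bin(cam.match(0x2A)))     # 0b1000: address 3 matches
```

Each update costs two reads and two writes on the match RAM, which is where the four-cycle floor with a single update port comes from.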

Now, if you're going to be doing a lot of updates at the same location in the CAM, then you can design the pipeline logic to 'merge' operations against the same address and get either one operation per two cycles with one set of BRAMs or one operation per cycle with two sets of BRAMs. This is probably not the case, though.
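The throughput figures in this thread can be reproduced with a little port arithmetic. The sketch below assumes each RAM port performs one access per cycle, and that a "merged" update needs only one read and one write. That second assumption is not stated in the thread, it is simply one reading that is consistent with the quoted numbers. The function name `cycles_per_update` is hypothetical.

```python
def cycles_per_update(reads, writes, shadow_brams):
    """Minimum cycles between updates, assuming one access per port per cycle."""
    if shadow_brams:
        # Reads use the spare port of the shadow set and writes use the tied
        # write ports, so reads and writes of different updates can overlap.
        return max(reads, writes)
    # A single update port has to serialize every read and every write.
    return reads + writes


# (label, reads per update, writes per update); the merged row is an assumption.
for label, reads, writes in (("unmerged", 2, 2), ("merged", 1, 1)):
    for shadow in (False, True):
        sets = "two BRAM sets" if shadow else "one BRAM set"
        print(label, sets, cycles_per_update(reads, writes, shadow))
```

This only bounds throughput. Latency and the hazard handling for back-to-back updates are separate concerns that the arithmetic does not capture.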

If you really need the fastest possible updates, then just implement your CAM trivially on normal logic, and you can get one update per clock cycle, no problem.
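For comparison, a register-based CAM can be modeled as below. This is only a behavioral Python sketch of the idea, with a hypothetical `RegisterCam` class. In hardware each entry would be a register with its own comparator, so the cost is logic area rather than update time.

```python
class RegisterCam:
    """Behavioral model of a CAM built from plain registers and comparators."""

    def __init__(self, depth=16):
        self.valid = [False] * depth
        self.data = [0] * depth

    def write(self, addr, data):
        # A single register write: no read-modify-write and no erase step,
        # so in hardware this would fit in one clock cycle.
        self.data[addr] = data
        self.valid[addr] = True

    def match(self, data):
        # In hardware every entry is compared in parallel; the loop here
        # only models that comparison.
        mask = 0
        for addr, (valid, stored) in enumerate(zip(self.valid, self.data)):
            if valid and stored == data:
                mask |= 1 << addr
        return mask


cam = RegisterCam()
cam.write(3, 0x2A)
print(bin(cam.match(0x2A)))  # 0b1000: address 3 matches
```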
