Skip to content

[otp_ctrl] Re-express PartInvDefault as an array #27069

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

alees24
Copy link
Contributor

@alees24 alees24 commented Apr 29, 2025

Expressed as a single bit vector PartInvDefault was of illegal width for Darjeeling. Section 6.9.1 of the SystemVerilog specification permits implementations to be constrained to 65536-bit wide vectors.

The much larger OTP of Darjeeling (8 times that of Earl Grey) was thus unacceptable to Verilator and could be faulted by other EDA tools too. This was discovered when creating a Verilator environment for Darjeeling; with this change we are now able to run some top-level software tests faster.

A bitwise comparison of the previous and new PartInvDefault values has been performed for both Earl Grey and Darjeeling to ensure that the data remains unchanged.

@alees24 alees24 requested review from matutem and rswarbrick April 29, 2025 13:52
@qmn
Copy link
Contributor

qmn commented Apr 29, 2025

Heh, it looks like we ran into exactly the same problem (issue #26883, PR #26998). Unfortunately, I got roundly hosed and hadn't gotten around to incorporating Rupert's feedback ("today is the day," I said, for many days). But now that there are two PRs, it might be worth checking how to proceed.

This is a perfectly valid solution; I will just opine that otp_ctrl_part_pkg.sv becomes less succinct (e.g., the many rows of 8'h00). At this time I'm still unfamiliar with OTP contents or their structure, so I'm not sure if those long literal strings are more meaningful kept together or split bytewise.

@qmn qmn self-requested a review April 29, 2025 18:27
@alees24
Copy link
Contributor Author

alees24 commented Apr 29, 2025

Heh, it looks like we ran into exactly the same problem (issue #26883, PR #26998). Unfortunately, I got roundly hosed and hadn't gotten around to incorporating Rupert's feedback ("today is the day," I said, for many days). But now that there are two PRs, it might be worth checking how to proceed.

This is a perfectly valid solution; I will just opine that otp_ctrl_part_pkg.sv becomes less succinct (e.g., the many rows of 8'h00). At this time I'm still unfamiliar with OTP contents or their structure, so I'm not sure if those long literal strings are more meaningful kept together or split bytewise.

That is unfortunate. I agree that the repeated bytes are inelegant but it isn't obvious to me that there is an easy way to avoid them; ideally I'd at least gather them using replication constructs or format the source to be multiple bytes/line.

I'm not confident that decomposing individual contributions and then concatenating them into a single parameter that still violates the specification (131072-bit vector) is a sufficient solution, however. Elsewhere in otp_ctrl the data is already expressed as a byte array, and I think that's what we must use for the parameter too.

To be clear, and hopefully to avoid further duplicated effort, I have an under-construction PR that provides a Verilator simulation environment and DT-related sw files for Darjeeling. They allow uart_smoketest to run to completion. May I ask how you happened upon this issue, please?

@qmn
Copy link
Contributor

qmn commented Apr 30, 2025

Hi Adrian, thanks so much for letting me know. LRM 6.9.1 doesn't seem to say anything explicit about the width of the final parameter -- it could be Verilator that is picky only when we have overly wide literal vectors.

That Verilator PR would be really really useful. I happened on this bug because I was using Verilator-based lint elsewhere. I wasn't focused on this file in particular, but it wouldn't proceed on other modules until I resolved this error.

Copy link
Contributor

@matutem matutem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The verilator --max-num-width switch may eliminate the need for this. It is documented in this page https://verilator.org/guide/latest/exe_verilator.html#

I am not sure it will solve this issue, but it is worth a try to avoid this problematic change.

Also, fusesoc can pass flags to the simulator via edalize: https://github.com/olofk/edalize#

@alees24
Copy link
Contributor Author

alees24 commented May 4, 2025

I've done a lot of additional experimentation, invoking Verilator manually with a dedicated test bench and reporting the resultant value of PartInvDefault, as well as running chip sims. Here are my findings:

  • There are two separate issues here: the specification (section 6.9.1) permits the maximum vector length to be restricted to 65536 bits, which means that that PartInvDefault for Darjeeling is presently not properly portable and could cause issues with other tools. There seems to be no differentiation between nets and parameters in the specification, so the expectation should be that this constraint applies to all bit vectors. IMHO it would be surprising if that were not the case in any given tool.
  • Verilator imposes a specified maximum width (in bits) on numeric literals, In the documentation referenced by @matutem it explicitly says literals in the documentation, as distinct from bit vectors.
  • Splitting the problematic literal using either the -max-num-width option or the programmatic solution in [otp_ctrl] Expand overly wide literals (vectors) #26998 does permit the otp_ctrl to boot the chip in Verilator simulation without issue, in the same manner as per the more portable, specification-compliant approach taken in [otp_ctrl] Re-express PartInvDefault as an array #27069.

We therefore have three possible solutions:

  1. Use the command-line switch as a Verilator-specific fix.
  2. Use the code in [otp_ctrl] Expand overly wide literals (vectors) #26998 (perhaps with modifications) to produce a Verilator-specific fix.
  3. Use a tidied form of the code in [otp_ctrl] Re-express PartInvDefault as an array #27069 (this PR) to avoid the possibility of problems later with any tools that do enforce a constraint on bit-vector length; my impression is that Verilator does not do this, just on literals.

All three solutions should be considered equally proven since chip sims have been run successfuly with each and I have compared the resultant parameters bitwise.
Since we're building open source hardware, and ideally not constraining the choice of tools, I'd prefer to see the portable solution tidied by use of replication constructs and/or aggregating multiple bytes to reduce the line count, but I'm happy to run with 2 since it seems more generic than 1 and addresses the presently-known problem.

@matutem Do you consider 3 to be problematic for reasons other than the number of lines/bytes? I imagine that neither the long hex literal expression nor the byte array is really more helpful in retrieving the item/value information from the binary data, although you may have noticed I took the opportunity at least to emit the partition names in this PR.

It would be good to get a Darjeeling+Verilator smoke test in CI soon, now that that's possible, please.

@qmn
Copy link
Contributor

qmn commented May 5, 2025

Thanks so much for getting down to the bottom of this with all three methods. I'd also prioritize portability / compliance -- 1. a Verilator-specific fix in this RTL is a little messy (#26998's code really needs to be hoisted somewhere else for other modules) and 2. for Verilator-specific work, it's easy enough to patch over with a larger (probably top-specific) constant argument to --max-num-width in the applicable FuseSoC files.

Re: the file size / repetition, perhaps run-length encoding (i.e. replication constructs) on the 8'h00's could get you about 80% of the way there.

... which is to say I have no strongly held opinion on the matter, and would like to hear from others (@rswarbrick, who so kindly looked at my PR?)

@@ -526,171 +526,15989 @@ package otp_ctrl_part_pkg;


// OTP invalid partition default for buffered partitions.
parameter logic [131071:0] PartInvDefault = 131072'({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm generally enthusiastic about this PR but, crikey, the generated files have lots of lines. Maybe it makes sense to put more bytes to a line? I tried the following:

  // OTP invalid partition default for buffered partitions.
  parameter logic [${offset - 1}:0][7:0] PartInvDefault = {
  % for k, part in enumerate(otp_mmap["partitions"][::-1]):
    { // Partition '${item["name"]}'
    % for item in part["items"][::-1]:
      % if offset != item["offset"] + item["size"]:
      { // unallocated space
        {${"{}".format(offset - item["size"] - item["offset"])}{8'h00}}
      },<% offset = item['offset'] + item['size'] %>
      % endif
<%
        item_data = []
        for lsb in range(0, 8 * item['size'], 128):
          msb = min(lsb + 128, 8 * item['size']) - 1
          width = msb - lsb + 1
          value = (item['inv_default'] >> lsb) & ((1 << width) - 1)
          item_data.append(f"{width}'h{value:0{(width + 3) // 4}X}")
        item_data.reverse()

        item_suff = "" if loop.last else ","
        offset -= item["size"]
%>\
      % if len(item_data) > 1:
      {
        % for word in item_data:
        ${word}${"" if loop.last else ","}
        % endfor
      }${item_suff}
      % else:
      ${item_data[0]}${item_suff}
      % endif
    % endfor
    }${"{}".format("," if k < len(otp_mmap["partitions"])-1 else "")}
  % endfor

This gives 1993 lines in the top_darjeeling version instead of 16811: I think it's a bit less crazy! The generated text looks like this:

  // OTP invalid partition default for buffered partitions.
  parameter logic [16383:0][7:0] PartInvDefault = {
    { // Partition 'SOC_DBG_STATE'
      {
        64'h01136C663A36C3E3,
        128'hE817E760B27AE937BFCDF15A3429452A,
        128'h851B80674A2B6FBE93B61DE417B9FB33
      },
      {
        128'hD68C96F0B3D1FEED688098A43C33459F,
        128'h0279FC51CC7C626E315FD2B871D88819,
        128'hA0D1E90E8C9FDDFA01E46311FD36D954
      }
    },
    { // Partition 'LC_TRANSITION_CNT'
      64'h3BF7D79A9FF747F6,
      {
        128'hD0BAC511D08ECE0E2C0DBDDEDF7A854D,
...

Maybe this is a reasonable halfway house?

(I haven't ported the "unallocated space" section, so we'd probably want to do that too)

Copy link
Contributor Author

@alees24 alees24 May 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all that. I guess I didn't make this clear; the PR was an immediate fix and to raise visibility. I am insufficiently fluent in Python/Mako presently to make an initial attempt at gathering bytes/lines.
I have pushed changes incorporating your tidying, widening the literals to 256 bits which is more compatible with the rest of the _pkg source, and annotating the partition items too. Hopefully that will be acceptable to everyone. Thanks again!

(I haven't ported the "unallocated space" section, so we'd probably want to do that too

I'm inclined to leave the unallocated space as a simple replication since it is defined to be all zeros and some such sections are quite large.

@matutem
Copy link
Contributor

matutem commented May 7, 2025

I personally favor option 1, passing the argument to verilator:

  • the LRM mention lower bounds for parameters, and verilator has mechanisms to get over this bound
  • the dvsim flow supports per simulator specific flags
  • we already have multiple per simulator settings

The underlying issue is that there is no way to use simulators without specific flags: these are complex tools with lots of ways to configure them to do what we need. If you think portability is possible, just compare the xcelium and vcs.cfg files under hw/dv/tools/dvsim to get a reality check.

I think readabilty is more important than a hypothetical portability.

@alees24 alees24 force-pushed the otp-inv-default branch 2 times, most recently from e879f3f to d6913e3 Compare May 8, 2025 14:45
Expressed as a single bit vector PartInvDefault was of illegal
width for Darjeeling. Section 6.9.1 of the SystemVerilog
specification permits implementations to be constrained to
65536-bit wide vectors.

The much larger OTP of Darjeeling was thus unacceptable to
Verilator.

Annotate the partition and item names in the auto-generated SV
package.

Co-authored-by: Rupert Swarbrick <[email protected]>
Signed-off-by: Adrian Lees <[email protected]>
@alees24 alees24 force-pushed the otp-inv-default branch from d6913e3 to d92cafe Compare May 9, 2025 10:49
@alees24 alees24 requested a review from a team as a code owner May 9, 2025 10:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants