Add recovery instructions #112

Open
hughsie opened this issue Sep 24, 2020 · 3 comments

Comments

@hughsie
Contributor

hughsie commented Sep 24, 2020

I know from real feedback from end users that flashing hardware makes a lot of people nervous. This is going to be especially so for firmware explicitly not from the chip vendor.

In the README file there is a big warning followed by:

...an external programmer is required, or the external flash must be temporarily disabled during boot-up

...but it doesn't actually say which programmer would be required -- and because the SPI flash is a 2.7V part I'm guessing someone is going to connect up a typical 3.3V programmer and blow it to bits. Perhaps link to the programmer that you use, making it explicit that it needs to be 2.7V.

I think if the README was also expanded with, for example, a diagram showing the jumper to set for device recovery that would make a lot of people more likely to try this firmware. Thanks!

@meklort
Owner

meklort commented Sep 24, 2020

During firmware development, I ran into a number of cases where I bricked the network card on the Talos II. This happens when incorrect settings are programmed into certain registers on the NIC, resulting in the card dropping off the PCI bus.
When this happens, the only way to recover is to either (1) stop the card from booting off the NVRAM or (2) invalidate the firmware in NVRAM.

Unfortunately, on the Talos II and Blackbird, there are no jumpers to enable/disable the NVM. Additionally, there are no inline resistors and so an external device cannot easily overdrive the SPI signalling. That leaves two options:

  • Short CSb to VCC or GND. This causes the firmware to think the NVM is invalid, as it does not read valid firmware, and as a result it will stop booting off of it. The pins generally need to stay shorted until the final OS boots, as the Linux driver causes device resets which may reload firmware. Once booted, the NVM config needs to be reset (there's code to do it in bcmflash) because the auto-detection failed during boot. This is sufficient to corrupt the flash, reset the system, and then flash using the normal mechanism.

  • Write to the EEPROM in-circuit. Generally this would require attaching wires to all SPI lines and overdriving as appropriate. I had problems with this approach (as mentioned above) even when the Talos II was unpowered. As this requires soldering to all SPI lines on the board, I suggest the first option (it only requires soldering to CSb).
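
To illustrate why the first option works, here is a toy model of the boot decision. It is only a sketch: the header marker `HYPOTHETICAL_MAGIC`, the helper names, and the assumption that a shorted CSb makes reads return a constant pattern are all made up for illustration and are not taken from the NIC or from bcmflash.

```python
# Toy model only: the magic value and read behaviour are assumptions,
# not taken from the real NIC or its firmware.
HYPOTHETICAL_MAGIC = bytes.fromhex("a5a55a5a")  # placeholder header marker


def read_nvm(contents: bytes, length: int, cs_shorted: bool = False) -> bytes:
    """Return the bytes the NIC would see when reading the flash.

    While CSb is shorted the flash is never selected properly, so the
    model returns a constant pattern instead of the stored image.
    """
    if cs_shorted:
        return b"\xff" * length  # assumed idle level; could equally be 0x00
    return contents[:length]


def boots_from_nvm(header: bytes) -> bool:
    """The card only boots an image whose header carries the expected marker."""
    return header[:4] == HYPOTHETICAL_MAGIC


# A stage1 image with a valid header but register settings that brick the card.
bad_image = HYPOTHETICAL_MAGIC + b"bad register settings"

normal_boot = boots_from_nvm(read_nvm(bad_image, 4))
recovery_boot = boots_from_nvm(read_nvm(bad_image, 4, cs_shorted=True))
print(normal_boot, recovery_boot)
```

In the model, the bad image is loaded on a normal boot but skipped while CSb is shorted, which is what keeps the card on the PCI bus long enough to reflash it.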

I'll take a look at the Dell NIC and see if there is a good way to recover that doesn't require soldering, but as it stands soldering is required for the Blackbird products.
To be clear: any released firmware images have been tested to boot properly and should not have this issue. The only time I'd expect a possible bricking event is if a user is modifying and rebuilding stage1. That said, all other bad flashes are recoverable without an external programmer or soldering.

In any case, adding the above recovery procedure does make sense, if only to document it. I'll see what the best way is to add:

  • Pins to solder to in order to enable the SPI console (and possibly external flashing).
  • Pins to short and the procedure to use to recover from a bricked device for the Talos II / Blackbird.
  • Possible devices to use to flash / access the SPI console. Note that there should be no issue with hooking up a 3.3v programmer so long as the board is unpowered. At that point you are providing your own power supply. As for the signaling, 3.3v should be fine here without any real issue.

@hughsie
Contributor Author

hughsie commented Sep 24, 2020

Additionally, there are no inline resistors and so an external device cannot easily overdrive the SPI signalling

Hmm, that's unfortunate.

but as it stands soldering is required

I used to do PCB rework under a 'scope for job :)

adding the above recovery procedure does make sense

Many thanks, and sorry for adding to the ever-growing list of things I ask from you.

@meklort
Owner

meklort commented Sep 24, 2020

I used to do PCB rework under a 'scope for job :)

In this case, a scope is definitely a requirement.

Many thanks, and sorry for adding to the ever-growing list of things I ask from you.

The comments are good - they help me see what issues users are running into so that the documentation / quality can improve. Plus, anything I fix here means you have more time to help me with fwupd.


2 participants