Integration Test: Multiprocessor Test fails in QEMU/OVMF #103
Comments
Possible clue for current CI failures, but does not explain the one you observed previously where stripping out all test code wouldn't resolve the problem: #114 (comment).
Spent some time today investigating this. I wasn't able to locate the root issue (which is probably deep inside qemu), but my findings should hopefully help anybody willing to look into this further: Locally on my Arch Linux system with KVM enabled, the test runner finishes the tests successfully but fails with error code
This is qemu 4.2.0 from Arch's package repository. A from-source build of qemu 5.0.0-rc0 didn't show this issue, so hopefully this resolves itself in a month. Ubuntu 18.04, which is used by CI, has an ancient qemu 2.11. I've set up a local VM and can observe the same behavior as in the CI runs: an endless boot loop which eventually fails with a spurious serial port error in an earlier test. The test causing this is
The interesting thing here is that with KVM enabled (by adding the user to the So in summary, the problem probably lies somewhere in qemu's TCG or in its interaction with OVMF's code, which I don't know how to debug. In uefi-rs's CI runs, Ubuntu's old qemu version boot loops instead of panicking. (Unfortunately GitHub Actions doesn't seem to support nested virtualization, so we can't just use KVM to save us here.) It'd be interesting to check whether the same test written in C also exhibits the problem, just to rule out any funny business with uefi-rs.
Following up on the previous comment, I've verified that behavior is still the case with recentish QEMU and OVMF. With OVMF debug output enabled, there's an interesting difference between the KVM case and the TCG case:
That "QEMU v2.7" string is hardcoded here; I'm actually testing on QEMU 6.1.0. It seems OVMF sees conflicting data about the number of CPUs, so it hits that branch and resets to one CPU. I'm not sure if this is actually a bug in QEMU or OVMF, or maybe both. The next step here is probably to ask about it on the edk2-devel mailing list.
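For anyone who wants to reproduce the KVM vs. TCG comparison described above, here is a rough sketch of how the two QEMU invocations could be built. This is not the uefi-rs test runner's actual command line: the helper name `qemu_args`, the firmware paths, the memory size, and the log file name are all placeholders I made up for illustration. It assumes a debug build of OVMF, which writes its debug output to I/O port 0x402.

```python
from typing import List


def qemu_args(
    ovmf_code: str,
    ovmf_vars: str,
    cpus: int = 4,
    use_kvm: bool = False,
    debug_log: str = "ovmf-debug.log",
) -> List[str]:
    """Build a QEMU argv for booting OVMF with several CPUs.

    Hypothetical helper: paths, memory size and log name are assumptions,
    not values taken from the uefi-rs repository.
    """
    args = [
        "qemu-system-x86_64",
        "-nodefaults",
        "-machine", "q35",
        # Ask for more than one CPU so the multiprocessor test has work to do.
        "-smp", str(cpus),
        "-m", "256M",
        # OVMF firmware code (read-only) and variable store as pflash drives.
        "-drive", f"if=pflash,format=raw,readonly=on,file={ovmf_code}",
        "-drive", f"if=pflash,format=raw,file={ovmf_vars}",
        # Capture OVMF debug output (debug builds write to port 0x402).
        "-debugcon", f"file:{debug_log}",
        "-global", "isa-debugcon.iobase=0x402",
        "-display", "none",
    ]
    # The only difference between the two runs is the accelerator.
    args += ["-accel", "kvm" if use_kvm else "tcg"]
    return args


if __name__ == "__main__":
    for kvm in (False, True):
        print(" ".join(qemu_args("OVMF_CODE.fd", "OVMF_VARS.fd", use_kvm=kvm)))
```

Running both variants with separate `debug_log` names and diffing the two logs should make the CPU-count discrepancy easier to spot.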
For future reference, I did report the issue to edk2-devel: https://edk2.groups.io/g/devel/message/87303 Hopefully the issue doesn't get lost there; unfortunately they don't allow new accounts on their bug tracker for the general public (I asked and they said to use the mailing list instead), so that's the best we can do for now.
Minor update: I was hoping that this issue would be fixed by tianocore/edk2#3935. However, I have tested with
There is a multiprocessor test which, if enabled, prevents CI from running all the tests correctly. We should investigate how to fix it.
@Bobo1239 has looked into the cause for this: #103 (comment)