qemu-ga: segmentation fault on VM shutdown from host #25209

Open
val-kulkov opened this issue Oct 27, 2024 · 6 comments

Comments

@val-kulkov
Contributor

Maintainer: @yousong
Environment: x86_64, generic, OpenWrt SNAPSHOT r27910-1dc86af356
Package version:
qemu-firmware-seabios - 9.1.0-r2
qemu-ga - 9.1.0-r2

Description:
A host-initiated shutdown of a VM running OpenWrt never completes. The kvm process on the host has to be killed manually for the VM to stop.

qemu-ga running in the foreground on the guest OpenWrt machine produces the following on stdout:

...
1730056540.243011: debug: received EOF
1730056540.343262: debug: read data, count: 109, data: {"arguments":{"id":3042929038},"execute":"guest-sync-delimited"}
{"arguments":{},"execute":"guest-shutdown"}

1730056540.343402: debug: process_event: called
1730056540.343427: debug: processing command
1730056540.343492: debug: sending data, count: 24
1730056540.343639: debug: process_event: called
1730056540.343680: debug: processing command
1730056540.343773: debug: g_unix_open_pipe() called with FD_CLOEXEC; please migrate to using O_CLOEXEC instead
Segmentation fault

I believe the second-to-last line in the output above gives a hint about what might be going wrong.

@janh
Contributor

janh commented Jan 17, 2025

I noticed the same issue on OpenWrt 24.10.0-rc5.

A message like this appears on the console, and the VM does not shut down:

[   90.391011] traps: qemu-ga[3585] general protection fault ip:5631794ca3e7 sp:7fffcd4a7070 error:0 in qemu-ga[5631794b5000+1d000]

This seems to be an issue with the shutdown patch, which was recently updated by @vooon (#25053, #25106).

The current version of the patch from Alpine works fine:
https://gitlab.alpinelinux.org/alpine/aports/-/blob/b720d51ec844d4754dd5b29084350aa1f5c9a74d/community/qemu/guest-agent-shutdown.patch

@vooon
Contributor

vooon commented Jan 17, 2025

Hmm, I think the only possible cause is that fallback_cmd remains NULL.
I can update the patch; on OpenWrt we always take the fallback path anyway.

vooon added a commit to vooon/openwrt-packages that referenced this issue Jan 17, 2025
Replace to fix openwrt#25209

Signed-off-by: Vladimir Ermakov <[email protected]>
@janh
Contributor

janh commented Jan 18, 2025

@vooon: I debugged this using valgrind. It is actually caused by another bug in QEMU (qemu/qemu@9cfe110). The fix is also included in QEMU 9.1.2, so I think it would make sense to cherry-pick 6ee7a47 to the 24.10 branch.

In addition, I found an unrelated bug in the shutdown patch: local_err is not cleared in the fallback branch. With the fix for ga_pipe_read_str applied, the shutdown does work, but the error from the first call to ga_run_command is still returned, so the host thinks the shutdown failed when it actually succeeded. This could be fixed by calling error_free_or_abort(&local_err);.
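The stale-error behaviour described above can be sketched in isolation. The following is a minimal, self-contained C model, not the actual patch: `SketchError` and all `sketch_*` functions are hypothetical stand-ins that only imitate the shape of QEMU's `Error` handling, and the sketch assumes the primary command fails while the fallback succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for QEMU's Error object (not the real type). */
typedef struct SketchError {
    const char *msg;
} SketchError;

/* Record an error in *errp (simplified; no checks for a previous error). */
void sketch_error_set(SketchError **errp, const char *msg)
{
    SketchError *err = malloc(sizeof(*err));
    if (!err) {
        abort();
    }
    err->msg = msg;
    *errp = err;
}

/* Imitates error_free_or_abort(): an error must be set; free it and
 * reset the pointer so later code sees "no error". */
void sketch_error_free_or_abort(SketchError **errp)
{
    assert(errp && *errp);
    free(*errp);
    *errp = NULL;
}

/* Hypothetical command runner: sets an error only when `ok` is false. */
void sketch_run_command(bool ok, SketchError **errp)
{
    if (!ok) {
        sketch_error_set(errp, "command failed");
    }
}

/* Returns true if the caller (the host) would be told the shutdown failed. */
bool sketch_shutdown(bool clear_stale_error)
{
    SketchError *local_err = NULL;

    /* Assumption: the primary shutdown command fails. */
    sketch_run_command(false, &local_err);

    if (local_err) {
        if (clear_stale_error) {
            /* The proposed fix: drop the first error before falling back. */
            sketch_error_free_or_abort(&local_err);
        }
        /* Assumption: the fallback command succeeds and sets no error. */
        sketch_run_command(true, &local_err);
    }

    /* Whatever is left in local_err is what gets reported. */
    bool reported_failure = (local_err != NULL);
    free(local_err);
    return reported_failure;
}
```

In this model, `sketch_shutdown(false)` reports a failure even though the fallback succeeded, while `sketch_shutdown(true)` reports success. The real patch may structure the fallback differently; the sketch only illustrates why an uncleared error pointer survives a successful retry.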

But I see you already created a pull request to switch to the patch from Alpine. That of course fixes the second bug as well.

@vooon
Contributor

vooon commented Jan 18, 2025

@janh Aha, thanks. Regarding backporting: sorry, I'm always on master, so I have never done that for OpenWrt :)
So I suggest you do that.

@janh
Contributor

janh commented Jan 18, 2025

@vooon No problem, I'll create a pull request for that once #25780 has been applied.

@bobdig

bobdig commented Feb 15, 2025

This bit me when I did the upgrade; I hope it will be fixed soon. Proxmox can't shut down my OpenWrt VMs anymore.

vooon added a commit to vooon/openwrt-packages that referenced this issue Feb 15, 2025
Replace to fix openwrt#25209

Signed-off-by: Vladimir Ermakov <[email protected]>