-
Notifications
You must be signed in to change notification settings - Fork 57
Added snapshot feature for vhost-device-vsock. #827
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@WeiChungHsu CI is failing, please fix it. |
We can't merge code that depends on ongoing work, so I mark this as draft. |
Sorry, I am fixing and working on it now. |
pub(crate) struct VhostUserVsockBackend { | ||
config: VirtioVsockConfig, | ||
queue_size: usize, | ||
pub threads: Vec<Mutex<VhostUserVsockThread>>, | ||
queues_per_thread: Vec<u64>, | ||
pub exit_event: EventFd, | ||
pub handler_thread: Mutex<HandlerThread>, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this whole abstraction is needed, because if a thread is created, then Option<JoinHandle<()>>
is not None
. Plus, created
field does not guard against handle
being None
because even if they are guarded under the same mutex, they are accessed separately (!). Change the handler_thread
field to Mutex<Option<JoinHandle<()>>>
and in check_device_state
function you can lock the mutex and use Option::take
to move the thread handle and join it if it's not None
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just fixed the codes based on the suggestion. Please let me know if it meets your expectation.
vhost-device-vsock/src/vhu_vsock.rs
Outdated
self.handler_thread.lock().unwrap().created = false; | ||
self.handler_thread.lock().unwrap().handle.take().unwrap().join().unwrap(); | ||
} | ||
return Ok(()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/return Ok(());/Ok(())
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the commit message is all on one line. Please put a short descriptive commit title and use the commit message body for longer text.
There is a hang issue during snapshot disable vring, my solution is to check vring enable/disable status but it requires a new function get_enabled from vring. Signed-off-by: Wei-Chung Hsu <[email protected]>
Fixed the commit message with sign-off. Thanks! |
Fixed CI errors and reviewer's comments. Signed-off-by: Wei-Chung Hsu <[email protected]>
I fixed the most of CI failures, but the this failure is based on another PR in vhost crate: "error[E0599]: no method named |
As I mentioned here #827 (comment) can you change Cargo.toml to point to your branch used for rust-vmm/vhost#284 ? In this way we can test also this PR. When that PR will be merged and a new vhost-user-backend released, we can remove that change and mark this PR ready to be merged. |
@WeiChungHsu also, instead of clicking on "Update branch", you can go to the arrow and do "Update with rebase" (maybe I should see if I can enable it by default) |
This PR needs a new vring function in another PR. Signed-off-by: Wei-chung Hsu <[email protected]>
Added snapshot feature for vhost-device-vsock.
There is a hang issue during snapshot disable vring, my solution is to check vring enable/disable status but it requires a new function get_enabled from vring.
The vring new function get_enabled PR is here: rust-vmm/vhost#284
More detailed explain the hang issue:
When the Vring is disabled (and queue set to not ready), the vhu_vsock_thread.rs still try to iterate and read RX queue, therefore, it hang at this line:
https://github.com/rust-vmm/vhost-device/blob/main/vhost-device-vsock/src/vhu_vsock_thread.rs#L591-L594
Furthermore, it hang because this Queue iter() operation return error at this line:
https://github.com/rust-vmm/vm-virtio/blob/main/virtio-queue/src/queue.rs#L592-L597
Requirements
Before submitting your PR, please make sure you addressed the following
requirements:
git commit -s
), and the commit message has max 60 characters for thesummary and max 75 characters for each description line.
test.
Release" section of CHANGELOG.md (if no such section exists, please create one).
unsafe
code is properly documented.