[lxd] race-y deadlock on purge #1777

Open
Saviq opened this issue Oct 2, 2020 · 0 comments · May be fixed by #1821

Saviq commented Oct 2, 2020

I had a couple of instances that LXD was happy with, but Multipass wasn't (they were stuck in the Starting state). I was running out of memory, so everything was heavily throttled, and I asked Multipass to purge the instances. That put it in a bind (gdb output attached). I then removed the instances with lxc rm -f to regain resources, but multipassd was already deadlocked.

  Id   Target Id                     Frame 
* 1    LWP 2512420 "multipassd"      0x00007f016b9c89f3 in pthread_cond_wait@@GLIBC_2.3.2 () from target:/lib/x86_64-linux-gnu/libpthread.so.0
  2    LWP 2512483 "multipassd"      0x00007f016979e31c in sigtimedwait () from target:/lib/x86_64-linux-gnu/libc.so.6
  3    LWP 2512484 "Qt bearer threa" 0x00007f0169872cf9 in poll () from target:/lib/x86_64-linux-gnu/libc.so.6
  4    LWP 2512485 "QDBusConnection" 0x00007f0169872cf9 in poll () from target:/lib/x86_64-linux-gnu/libc.so.6
  5    LWP 2512492 "default-executo" 0x00007f016b9c89f3 in pthread_cond_wait@@GLIBC_2.3.2 () from target:/lib/x86_64-linux-gnu/libpthread.so.0
  6    LWP 2512493 "resolver-execut" 0x00007f016b9c89f3 in pthread_cond_wait@@GLIBC_2.3.2 () from target:/lib/x86_64-linux-gnu/libpthread.so.0
  7    LWP 2512494 "grpc_global_tim" 0x00007f016b9c89f3 in pthread_cond_wait@@GLIBC_2.3.2 () from target:/lib/x86_64-linux-gnu/libpthread.so.0
  8    LWP 2512496 "multipassd"      0x00007f016b9c8f85 in pthread_cond_timedwait@@GLIBC_2.3.2 () from target:/lib/x86_64-linux-gnu/libpthread.so.0
  9    LWP 2512541 "grpc_global_tim" 0x00007f016b9c8ed9 in pthread_cond_timedwait@@GLIBC_2.3.2 () from target:/lib/x86_64-linux-gnu/libpthread.so.0
  10   LWP 2518071 "grpcpp_sync_ser" 0x00007f0169879959 in syscall () from target:/lib/x86_64-linux-gnu/libc.so.6
  11   LWP 2524055 "grpcpp_sync_ser" 0x00007f0169879959 in syscall () from target:/lib/x86_64-linux-gnu/libc.so.6
  12   LWP 2627764 "grpcpp_sync_ser" 0x00007f0169879959 in syscall () from target:/lib/x86_64-linux-gnu/libc.so.6
  13   LWP 2640192 "grpcpp_sync_ser" 0x00007f016987fd67 in epoll_wait () from target:/lib/x86_64-linux-gnu/libc.so.6
#0  0x00007f016b9c89f3 in pthread_cond_wait@@GLIBC_2.3.2 () from target:/lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#1  0x00007f016a1bc8bc in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from target:/usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#2  0x00005620ceab4343 in std::condition_variable::wait<multipass::LXDVirtualMachine::stop()::<lambda()> > (__p=..., __lock=..., this=0x5620d1ec9f80) at /usr/include/c++/7/condition_variable:99
No locals.
#3  multipass::LXDVirtualMachine::stop (this=0x5620d1ec9f50) at /root/parts/multipass/src/src/platform/backends/lxd/lxd_virtual_machine.cpp:260
        lock = {_M_device = 0x5620d1ec9fb0, _M_owns = true}
        present_state = multipass::VirtualMachine::State::starting
        state_task = {d = 0x5620d1f73fb0, o = 0x5620d1f50c98}
#4  0x00005620cea33bed in multipass::Daemon::delet (this=0x7ffce21ed5f0, request=0x7f013801f220, server=<optimized out>, status_promise=0x7f014a7ba700) at /root/parts/multipass/src/src/daemon/daemon.cpp:1658
        instance = @0x5620d1ecd6b8: {<std::__shared_ptr<multipass::VirtualMachine, (__gnu_cxx::_Lock_policy)2>> = {<std::__shared_ptr_access<multipass::VirtualMachine, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>}, _M_ptr = 0x5620d1ec9f50, 
            _M_refcount = {_M_pi = 0x5620d1eb66b0}}, <No data fields>}
        name = <optimized out>
        __for_range = @0x7ffce21ecd50: {<std::_Vector_base<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >> = {
            _M_impl = {<std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >> = {<__gnu_cxx::new_allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >> = {<No data fields>}, <No data fields>}, _M_start = 0x5620d22b59e0, _M_finish = 0x5620d22b5a00, _M_end_of_storage = 0x5620d22b5a00}}, <No data fields>}
        __for_begin = <optimized out>
        __for_end = <optimized out>
        purge = true
        logger = {<multipass::logging::Logger> = {_vptr.Logger = 0x5620cf1b81f8 <vtable for multipass::logging::ClientLogger<multipass::DeleteReply>+16>}, logging_level = multipass::logging::Level::error, server = 0x7f014a7ba810, mpx_logger = @0x5620d11e8fe0}
        operational_instances_to_delete = <optimized out>
        trashed_instances_to_delete = <optimized out>
        status = <optimized out>

gdb.txt
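
Frame #3 in the trace above shows LXDVirtualMachine::stop() blocked in a predicate std::condition_variable::wait while present_state is still starting. The following is only a minimal sketch of that general pattern, not the Multipass code: Instance, set_state and wait_until_not_starting are hypothetical names, and the predicate is an assumption, since the real lambda in stop() is not visible in the trace. It illustrates how a bounded wait_for returns on timeout, where an unbounded wait would hang if nothing ever updates the state again.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for an instance whose state is updated elsewhere.
enum class State { starting, running, stopped };

struct Instance
{
    std::mutex state_mutex;
    std::condition_variable state_wait;
    State state{State::starting};

    // Called by whoever observes a state change in the backend.
    void set_state(State s)
    {
        {
            std::lock_guard<std::mutex> lock{state_mutex};
            state = s;
        }
        state_wait.notify_all();
    }

    // Bounded wait: returns true if the instance left `starting` in time,
    // false on timeout. An unbounded state_wait.wait(lock, pred) in this
    // spot would block forever if set_state() were never called again.
    bool wait_until_not_starting(std::chrono::milliseconds timeout)
    {
        assert(timeout.count() >= 0);
        std::unique_lock<std::mutex> lock{state_mutex};
        return state_wait.wait_for(lock, timeout,
                                   [this] { return state != State::starting; });
    }
};
```

With a bounded wait the caller can decide what to do when the instance never leaves the starting state (for example, log and force the delete) instead of blocking the daemon thread indefinitely.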

Saviq added the bug and low (low importance) labels Oct 2, 2020
townsend2010 pushed a commit that referenced this issue Oct 30, 2020
If no timeout is set, LXD uses a hardcoded 30-second timeout when waiting on
operations to complete. If that wait times out, it can lead to incorrect
behavior in the LXD backend.

Fixes #1777
townsend2010 self-assigned this Oct 30, 2020