Package deadlocks the whole Go process, 90% of the time, under recent macOS #3
Thanks for the heads up, Eric. I’m afraid I won’t be able to help. I no
longer work much with Go. If you figure out a fix, please feel free to
fork. I’d be happy to mark this package as deprecated in favor of a working
alternative.
…On Wed, Sep 13, 2023 at 9:25 PM Eric Gouyer ***@***.***> wrote:
FYI, this package has nearly stopped working on recent versions of macOS
(at least on Intel machines).
The symptom: in about 90% of runs, the Go program stops making progress,
yet uses 100% CPU. It looks like a deadlock, or more precisely a livelock.
It probably depends on how active the Go program is, but as an
example, in my case, on a 20-second run, about 90% of runs fail. The
remaining ~10% run to completion, and when they do, I get the
deep C symbols in my profiles.
See golang/go#45558 (comment):
[...] deadlock is happening because the program is trying to fetch a stack
trace while calling the C function calloc. That can happen if a signal is
received during the call to calloc. That will cause a deadlock if the
cgosymbolizer code itself calls calloc or malloc. The
github.com/benesch/cgosymbolizer code uses macOS libunwind, and at least
some versions of macOS libunwind call malloc. So it's not surprising that a
deadlock can occur.
Also golang/go#45558 (comment):
On macOS, unwind.h is an Apple/LLVM-provided implementation that was
introduced with Big Sur according to this comment in unwind.h [...]
The above issue also suggests:
You might be able to fix the problem by changing the file
cgosymbolizer_darwin.c, the function cgo_traceback, to simply return if
arg->sig_ctx != 0.
This does indeed fix the symptoms, but the traces (at least in my case) then
have no C symbols, as if I weren't using this package at all.
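For illustration, the suggested early return could look roughly like the sketch below. It is not the package's actual code: the struct name, the function name cgo_traceback_sketch and the stand-in unwinder are made up, the four-word layout is the one the Go runtime documents for runtime.SetCgoTraceback, and writing a terminating 0 into the buffer is an assumption about how an empty trace should be reported.

```c
#include <stdint.h>

/* Hypothetical struct name. The layout (context, signal context, buffer,
 * capacity) follows the runtime.SetCgoTraceback documentation; the field
 * name sig_ctx is the one quoted in the issue. */
struct traceback_arg {
    uintptr_t  context;
    uintptr_t  sig_ctx;
    uintptr_t *buf;
    uintptr_t  max;
};

/* Hypothetical stand-in for the real libunwind walk, which may call
 * malloc internally. It records one placeholder PC so the sketch runs. */
static void unwind_with_libunwind(struct traceback_arg *arg) {
    arg->buf[0] = (uintptr_t)0x1000; /* placeholder, not a real PC */
    if (arg->max > 1) {
        arg->buf[1] = 0; /* terminate the short trace */
    }
}

void cgo_traceback_sketch(void *parg) {
    struct traceback_arg *arg = parg;
    if (arg->max == 0) {
        return; /* no room to report anything */
    }
    if (arg->sig_ctx != 0) {
        /* Called from a signal handler: the interrupted code may be
         * inside malloc/calloc, so do not unwind at all. Report an
         * empty trace (assumed convention: 0 ends the list). */
        arg->buf[0] = 0;
        return;
    }
    unwind_with_libunwind(arg);
}
```

The cost is the one described above: profiling samples arrive through a signal, so every sample takes the early-return branch and no C frames are recorded.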
I don't have a solution; I just figured it should be mentioned
somewhere.
This package is still useful, but sadly luck is involved: I have to run it
multiple times until a run avoids the hang.
FWIW I'm on macOS Ventura 13.4.1 (Intel CPU).
I've just come back from spending several hours trying to find a solution, but I failed. I hope you don't mind me laying out here what I've read and tried:
https://maskray.me/blog/2020-11-08-stack-unwinding
/!\ I've read somewhere that [...]. Of course clang (at least macOS's default one) does not support [...]. I think I understood that it's the linker which does all the work of generating those debug sections, but I'm not even sure.
https://clang.llvm.org/docs/UsersManual.html
I'm not even sure whether llvm/libunwind is the same thing as https://www.nongnu.org/libunwind/
https://reviews.llvm.org/D11897
I've tried to put all the compiler flags mentioned above, as much as possible, in [...].
https://faultlore.com/blah/compact-unwinding/
https://lists.nongnu.org/archive/html/libunwind-devel/2016-02/msg00027.html
https://www.mail-archive.com/[email protected]/msg02305.html
I confirm that "returning early" in [...]
As said in the link above: [...]
So I think libunwind chooses not to pay an upfront cost, and instead calls dl_iterate_phdr at run time, only when needed. Which makes me realise that maaaaybe, at Go init() time, one could call a C function that loops over all the dynamic libraries linked into the program and calls some unw_*() functions to force an upfront call to dl_iterate_phdr()...?
Other random ideas / things to try: [...]
My computer: macOS Ventura 13.4.1, Intel CPU, clang 14.0.3.
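The init-time warm-up idea from this comment could be sketched as the C function below, meant to be called once from a Go init() through cgo. This is only a sketch under assumptions: it swaps the unw_*() calls for _Unwind_Backtrace from <unwind.h>, the name warm_up_unwinder is made up, and whether one early walk really moves the lazy lookups (and any allocation they do) out of the signal path on a given macOS libunwind is untested.

```c
#include <stddef.h>
#include <unwind.h>

/* Callback for _Unwind_Backtrace: counts every frame that has a
 * non-zero instruction pointer and asks the unwinder to keep going. */
static _Unwind_Reason_Code count_frame(struct _Unwind_Context *ctx, void *arg) {
    size_t *frames = arg;
    if (_Unwind_GetIP(ctx) != 0) {
        (*frames)++;
    }
    return _URC_NO_REASON;
}

/* Hypothetical helper: walks the current stack once, outside of any
 * signal handler, so that whatever the unwinder sets up lazily on first
 * use is set up here. Returns the number of frames visited. */
size_t warm_up_unwinder(void) {
    size_t frames = 0;
    _Unwind_Backtrace(count_frame, &frames);
    return frames;
}
```

A single walk only touches the images that appear on the stack at that moment, so frames from libraries loaded or first entered later could still trigger lazy work.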
(On macOS Intel, latest Ventura; Xcode/clang are the latest official releases from Apple.)
Sorry, I'm still laying out some random attempts and discussion here. At the bottom I also show how to recompile and link against a locally fetched (git) upstream LLVM libunwind, with its sources then showing up in the lldb debugger.
Attempt to hook and divert reentrant calls to malloc() to a mmap(NULL)
I've worked on https://github.com/folays/cgosymbolizer_macos/tree/test, which could be used on a local build with a: [...]
The intention was to hook *alloc() calls at RUNTIME (WHICH WORKS) without an LD_PRELOAD equivalent.
The idea was that if a reentrant *alloc() was ongoing, and if that was "rare" (only a few unlucky calls to malloc() from deep inside dl_iterate_phdr()), those allocations would be served from a mmap(NULL)...
lldb (the macOS equivalent of gdb) showed me a SEGFAULT in libunwind at address 0x0 when unwinding deep from LuaJIT calls. (I'm sure I get my calls to LuaJIT right, though.)
Attempt to recompile libunwind to get its debug symbols, when using [...]
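Going back to the malloc() hook attempt in this comment, the "serve reentrant allocations from mmap" idea can be sketched as follows. This is not the code from the linked branch: every name here (alloc_enter, alloc_leave, guarded_malloc, guarded_fallback_count) is hypothetical, the runtime hooking itself is left out, and the sketch assumes that calling mmap from the interrupted context is acceptable even though mmap is not on the POSIX list of async-signal-safe functions.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANON on glibc in strict modes */
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Per-thread depth counter: non-zero while this thread is inside the
 * allocator. A signal handler runs on the interrupted thread, so it
 * sees the same counter. */
static _Thread_local volatile sig_atomic_t alloc_depth;
static _Thread_local size_t fallback_count;

void alloc_enter(void) { alloc_depth++; }
void alloc_leave(void) { alloc_depth--; }

/* Fallback path: a fresh anonymous mapping, zero-filled by the kernel.
 * Memory from here must never be passed to free(); a real hook would
 * have to recognise such pointers. */
static void *fallback_alloc(size_t size) {
    if (size == 0) {
        size = 1; /* mmap rejects zero-length mappings */
    }
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED) {
        return NULL;
    }
    fallback_count++;
    return p;
}

void *guarded_malloc(size_t size) {
    if (alloc_depth > 0) {
        /* Reentrant call: the allocator's lock may already be held by
         * this thread, so stay away from malloc. */
        return fallback_alloc(size);
    }
    alloc_enter();
    void *p = malloc(size);
    alloc_leave();
    return p;
}

size_t guarded_fallback_count(void) { return fallback_count; }
```

Wrapping a call in alloc_enter() and alloc_leave() simulates a signal arriving in the middle of an allocation. Each fallback allocation takes at least one page and is never returned to the heap, which is only tolerable if reentrant calls really are rare.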