This is the bug report from hell. It occurs only when the moons (openblas_nolapack, JavaCPP) and the stars (Win11 + Java + SBT + Play Framework) align.
I originally filed this in JavaCPP bytedeco/javacpp-presets#1203 where the workaround was "don't use openblas". Ok. That worked while MKL was available, but apparently that's now being dropped by JavaCPP (bytedeco/javacpp-presets#1575 (comment)), so I'm rapidly heading up that creek and have no idea where I left the paddle.
I can readily reproduce the issue on my local Win11 machine (and there's a small repro project in the JavaCPP issue). Great. It hangs on:
I have even attached WinDbg to the process, identified the relevant thread, and taken a stack dump:
o3-mini-high has this to say about that:
The stack trace shows that the hang occurs during the native library's initialization, specifically within the Fortran runtime code (libgfortran) that OpenBLAS uses. Key points include:
Where the hang occurs:
The thread is stuck in a Windows system call (NtFsControlFile via PeekNamedPipe) that’s invoked during a file-status check (common_stat_handle_file_opened). This call is part of the Fortran runtime’s unit initialization (e.g. in gfortrani_init_units).
Implication:
The Fortran runtime (libgfortran_5) appears to be performing some file/pipe I/O (likely to initialize Fortran I/O units) that hangs when it calls into the Windows API. This suggests that something in the initialization code is either incompatible with Windows 11 or is misbehaving in the context of your Java process.
Where the library is loaded:
The hang happens during the JVM’s library loading process (via JVM_LoadLibrary), so it’s not your Java code per se but the native library’s (OpenBLAS’s) initialization that is problematic.
In summary, the dump indicates that the native library (openblas_nolapack), via its Fortran runtime initialization, is hanging on a Windows file-system call. This points to a potential issue with the library’s (or Fortran runtime’s) initialization code on Windows 11, such as waiting indefinitely for a file/pipe status that never returns. You might need to check for known compatibility issues with OpenBLAS on Windows 11 or consider using a different build/configuration of the library.
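To tell a load-time hang apart from a later failure without attaching a debugger, one option is to run the load on a daemon thread and give up after a timeout. The following is a minimal sketch only: `LoadWatchdog` and `loadWithTimeout` are hypothetical names, not part of JavaCPP or OpenBLAS, and the dump it prints contains Java frames only, so the native frames (libgfortran, PeekNamedPipe) would still need WinDbg.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: runs a native-library load on a daemon thread and reports a hang. */
final class LoadWatchdog {
    private LoadWatchdog() {}

    /**
     * Runs loadAction on a daemon thread and waits up to timeoutMillis.
     * Returns true if it finished, false if it is still running (a likely hang).
     */
    static boolean loadWithTimeout(Runnable loadAction, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "native-load-watchdog");
            t.setDaemon(true); // a stuck load must not keep the JVM alive
            return t;
        });
        Future<?> result = executor.submit(loadAction);
        try {
            result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Print every thread's Java stack so the stuck Java frame is visible.
            Thread.getAllStackTraces().forEach((thread, frames) -> {
                System.err.println(thread.getName());
                for (StackTraceElement frame : frames) {
                    System.err.println("    at " + frame);
                }
            });
            return false;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("load action failed", e);
        } finally {
            // Only interrupts the worker; it cannot unblock a thread stuck in native code.
            executor.shutdownNow();
        }
    }
}
```

With JavaCPP, the action passed in would presumably be something like `() -> Loader.load(org.bytedeco.openblas.global.openblas_nolapack.class)`; that call is an assumption about the repro project, not something taken from this report. Note that a thread stuck inside the load may keep holding JVM loader locks, so this helps with diagnosis rather than recovery.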
Any suggestions? Thanks!
Versions: it (still) happens with openblas-0.3.28.
Not sure what to make of that - gfortran is "only" needed for LAPACK, and I'm not aware that it would be trying to open any files on initialization (except perhaps pseudo-files for reading environment variables or other system parameters). If you can build OpenBLAS yourself in this context, you could try building with NOFORTRAN=1 (which would result in an older version of LAPACK - but one translated to C - being used). Or if you do not expect to call any LAPACK function at all, only using BLAS, you could even compile with NO_LAPACK=1.
What is known to have caused problems in the past is the small stack size that Java provides by default (or used to provide in the past), so perhaps the trace you got is misleading and you are seeing some kind of heap-stack collision as OpenBLAS tries to allocate memory buffers? In that case you could try starting your Java environment with a larger -M (if that still makes sense today).
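One way to test the stack-size theory without changing JVM startup flags is to run the load on a thread that requests a larger stack explicitly. This is a rough sketch under assumptions: `BigStackRunner` and `runWithStack` are hypothetical names, and the size passed to the `Thread` constructor is only a hint that the JVM may round or ignore. On HotSpot, the default thread stack size is normally set with `-Xss`.

```java
/** Hypothetical helper: runs an action on a thread with an explicitly requested stack size. */
final class BigStackRunner {
    private BigStackRunner() {}

    /** Runs action on a new thread that asks for stackBytes of stack, then waits for it. */
    static void runWithStack(Runnable action, long stackBytes) {
        // The four-argument Thread constructor takes a stack size hint in bytes;
        // the JVM is free to round it or ignore it on some platforms.
        Thread worker = new Thread(null, action, "big-stack-loader", stackBytes);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the action", e);
        }
    }
}
```

For example, `BigStackRunner.runWithStack(loadAction, 64L * 1024 * 1024)` would request a 64 MiB stack for whatever `loadAction` does. If the hang persists with a generous stack, a heap-stack collision becomes a less likely explanation.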