Node 22 version bump with ABI possibly causing severe timeout problem in buildbot #26078
Comments
Do you have any idea what might make node occasionally react badly to extreme concurrency (-j 12 or 14) when building? Are some node modules dependent on each other without declaring that explicitly? Is the new ABI versioning you implemented with that version bump really mandatory? Unless a fix is figured out rather soon, we might need to test my debugging results either by
|
Alternatively, we could disable node subpackages like node-yarn, which seems to be the one that gets built first (and maybe hangs). Looking at its Makefile, it has not been updated along with the main node package. We seem to be using a really ancient yarn version, 1.22, and it is quite possible that it is not in sync with the much newer main node. https://github.com/yarnpkg/yarn#readme
|
@hnyman I tried building node-yarn multiple times with 32 threads locally in the snapshot SDK but I cannot get it to fail |
In my experience, there is no problem in building node.js itself, but I am aware of an extreme increase in npm cli threads when building node packages. |
I also tried testing it on -j32, and it built without any problems. (4 cores, VT-x 8 threads, 32GB memory) When building node packages, the number of threads increases to over 100, but it built without any problems. |
@hnyman
diff --git a/lang/node-arduino-firmata/Makefile b/lang/node-arduino-firmata/Makefile
index 90c1c5b34..6c0e94eb0 100644
--- a/lang/node-arduino-firmata/Makefile
+++ b/lang/node-arduino-firmata/Makefile
@@ -18,6 +18,7 @@ PKG_HASH:=d7157e02867eae82887cb5e17b90c963fe7489bacd464110bfd20c672b8d5a98
PKG_BUILD_DEPENDS:=node/host
PKG_BUILD_FLAGS:=no-mips16
+PKG_BUILD_PARALLEL:=0
PKG_MAINTAINER:=Hirokazu MORIKAWA <[email protected]>
PKG_LICENSE:=MIT
diff --git a/lang/node-cylon/Makefile b/lang/node-cylon/Makefile
index 28b3c635b..3bb1c16d0 100644
--- a/lang/node-cylon/Makefile
+++ b/lang/node-cylon/Makefile
@@ -20,6 +20,7 @@ PKG_SOURCE_SUBDIR:=$(PKG_SRC_NAME)-$(PKG_VERSION)
PKG_BUILD_DEPENDS:=node/host
PKG_BUILD_FLAGS:=no-mips16
+PKG_BUILD_PARALLEL:=0
PKG_MAINTAINER:=Hirokazu MORIKAWA <[email protected]>
PKG_LICENSE:=Apache-2.0
diff --git a/lang/node-hid/Makefile b/lang/node-hid/Makefile
index 575f9d579..0437fb63d 100644
--- a/lang/node-hid/Makefile
+++ b/lang/node-hid/Makefile
@@ -18,6 +18,7 @@ PKG_HASH:=6c1f05935215feed4e8d2f4aecf31abbad8fa783d252b0bd6041ed2f2e96e9ba
PKG_BUILD_DEPENDS:=node/host
PKG_BUILD_FLAGS:=no-mips16
+PKG_BUILD_PARALLEL:=0
PKG_MAINTAINER:=Hirokazu MORIKAWA <[email protected]>
PKG_LICENSE:=MIT or X11
diff --git a/lang/node-homebridge/Makefile b/lang/node-homebridge/Makefile
index 7c6d124bc..d638a2fdc 100644
--- a/lang/node-homebridge/Makefile
+++ b/lang/node-homebridge/Makefile
@@ -15,6 +15,7 @@ PKG_HASH:=f91ab0058707a0498d97d87f45f19682065f80660fac942e0985caf9bb205f2a
PKG_BUILD_DEPENDS:=node/host
PKG_BUILD_FLAGS:=no-mips16
+PKG_BUILD_PARALLEL:=0
PKG_MAINTAINER:=Hirokazu MORIKAWA <[email protected]>
PKG_LICENSE:=ISC Apache-2.0
diff --git a/lang/node-javascript-obfuscator/Makefile b/lang/node-javascript-obfuscator/Makefile
index 281656331..fc2b3c5f4 100644
--- a/lang/node-javascript-obfuscator/Makefile
+++ b/lang/node-javascript-obfuscator/Makefile
@@ -14,10 +14,10 @@ PKG_SOURCE_URL:=https://registry.npmjs.org/$(PKG_NPM_NAME)/-/
PKG_HASH:=9bc89b04c78277130bc6f699563871d211f6fc85803c874f6114a632d9456f7b
PKG_BUILD_DEPENDS:=node/host
-HOST_BUILD_PARALLEL:=1
+HOST_BUILD_PARALLEL:=0
HOST_BUILD_DEPENDS:=node/host
-PKG_BUILD_PARALLEL:=1
+PKG_BUILD_PARALLEL:=0
PKG_BUILD_FLAGS:=no-mips16
PKG_MAINTAINER:=Zbynek Kocur <[email protected]>
diff --git a/lang/node-serialport-bindings/Makefile b/lang/node-serialport-bindings/Makefile
index e6352781f..d0daa9b39 100644
--- a/lang/node-serialport-bindings/Makefile
+++ b/lang/node-serialport-bindings/Makefile
@@ -16,6 +16,7 @@ PKG_HASH:=aec200860bd175e4b14b4ab1aa56a5f750172b6c8e20ccb234846206395848d4
PKG_BUILD_DEPENDS:=node/host
PKG_BUILD_FLAGS:=no-mips16
+PKG_BUILD_PARALLEL:=0
PKG_MAINTAINER:=Hirokazu MORIKAWA <[email protected]>
PKG_LICENSE:=MIT
diff --git a/lang/node-serialport/Makefile b/lang/node-serialport/Makefile
index 336d4b2e7..4c0f4af02 100644
--- a/lang/node-serialport/Makefile
+++ b/lang/node-serialport/Makefile
@@ -18,6 +18,7 @@ PKG_HASH:=e19fe993ad16ae0e03fc42e24cfe4babf8fd90f8358e1885d5e216277dda1086
PKG_BUILD_DEPENDS:=node/host
PKG_BUILD_FLAGS:=no-mips16
+PKG_BUILD_PARALLEL:=0
PKG_MAINTAINER:=Hirokazu MORIKAWA <[email protected]>
PKG_LICENSE:=MIT
diff --git a/lang/node-yarn/Makefile b/lang/node-yarn/Makefile
index 47c7112f2..b5527189f 100644
--- a/lang/node-yarn/Makefile
+++ b/lang/node-yarn/Makefile
@@ -19,7 +19,7 @@ PKG_LICENSE_FILES:=LICENSE
PKG_HOST_ONLY:=1
HOST_BUILD_DEPENDS:=node/host
-HOST_BUILD_PARALLEL:=1
+HOST_BUILD_PARALLEL:=0
include $(INCLUDE_DIR)/host-build.mk
include $(INCLUDE_DIR)/package.mk |
Disable parallel builds for node downstream packages, as the buildbot is showing frequent timeout problems for aarch64, arm, i386 and x86, and node & node packages are the primary suspect. Based on discussion in openwrt#26078 Signed-off-by: Hannu Nyman <[email protected]>
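To check which node package Makefiles already carry the parallel-build switches touched by the patch above, something like the following could be used. This is only a minimal sketch: the helper name `parallel_settings` is made up for illustration, and it assumes the variables are assigned on a single line with `:=` or `=`, as in the diff.

```python
import re
from pathlib import Path

# Matches single-line assignments such as "PKG_BUILD_PARALLEL:=0"
# or "HOST_BUILD_PARALLEL:=1" (spaces around the operator are tolerated).
_PARALLEL_RE = re.compile(
    r"^(PKG_BUILD_PARALLEL|HOST_BUILD_PARALLEL)[ \t]*:?=[ \t]*(\S*)",
    re.MULTILINE,
)


def parallel_settings(makefile_text: str) -> dict[str, str]:
    """Map each *_BUILD_PARALLEL variable in the text to its last assigned value."""
    return {name: value for name, value in _PARALLEL_RE.findall(makefile_text)}


if __name__ == "__main__":
    # Assumption: run from the root of the packages feed, where lang/node-* lives.
    lang_dir = Path("lang")
    if lang_dir.is_dir():
        for makefile in sorted(lang_dir.glob("node-*/Makefile")):
            print(makefile, parallel_settings(makefile.read_text(errors="replace")))
```

A Makefile with neither variable yields an empty dict, which makes it easy to spot packages the patch did not touch.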
Thanks. I applied that to the master branch to test if that is enough to fix things. Ps. |
And it timed out again on a couple of archs |
I think that we should temporarily mark the node package itself as BROKEN just to verify that it really is the reason for the frequent hangups. |
Sounds fine to me |
@hnyman thanks a lot for looking into this!
I just bumped it from 1 to 2 hours, lets see. |
I marked node BROKEN an hour ago, so let's see if that takes care of the timeouts. Having the timeout period lengthened to two hours might help in case the node package really is that hard to compile. But that raises the question of whether it is wise to spend that many resources on a package that is probably really rarely used. Node.js is not something typically installed into an OpenWrt home router, I think. |
Maybe there is some issue, where node's build system doesn't honor the build concurrency constraints? Or there is some deadlock/race somewhere, being exhibited only on build systems with lower I/O throughput? If it was changed in that update, maybe the diff between those two versions could show the culprit? How was the previous node version (one which built fine on buildbots) behaving?
Indeed, but it seems to be actively maintained, so there are users.
We could say this about a lot of other packages as well :) |
cc @robimarko @Ansuel @nxhack @ianchi
We have a frequent timeout/stall problem in the packages buildbot. The timeout is destroying quite a few builds due to a hangup like
failed 'make -j12 ...' (failure) (timed out
About 1/3 of the builds in the affected targets end in a timeout. As it does not affect all builds of the same target architecture, and the build succeeds 2/3 of the time, it is likely some kind of concurrency/race problem: for example, the build order of the packages (or submodules) might affect the features detected by a second package, causing a config prompt, or something like that.
That seems to have started approx 3 months ago.
The oldest failures that I have spotted are from around
November 23, 2024
The failures happen on aarch64, arm, i386, x86
But not on armeb, arm_xscale, mips, powerpc, loongarch
It is hard to figure out what is happening, as the buildbot compile step logs available to casual users show just the launch of each package's compilation. And due to concurrent building, the packages are built in a slightly different order each time, so a direct diff of the 4000+ line logs is not possible.
However, I did some debugging by sorting the compile step output and then comparing it with a recent ok build from the same target.
I noticed that for both analysed targets (x86_64, arm8vfpv3), the exact same package lines were missing from the timed-out build:
The node main compilation is started and also node-yarn host-compile gets started (as the first node module?). But then there is no trace that compiling other modules ever starts, until a timeout kills the whole buildbot build round.
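The sort-and-compare step described above can be sketched roughly as follows. This is a minimal illustration, not the exact procedure used here: the function name `missing_lines` is hypothetical, and it assumes that a package's launch line reads identically in both logs, so that a plain set difference is meaningful.

```python
import sys
from pathlib import Path


def missing_lines(ok_log: str, failed_log: str) -> list[str]:
    """Return sorted lines that occur in ok_log but never in failed_log."""
    ok = {line.strip() for line in ok_log.splitlines() if line.strip()}
    failed = {line.strip() for line in failed_log.splitlines() if line.strip()}
    # Ordering differences caused by concurrent builds vanish in a set
    # difference, leaving only the steps that were never launched.
    return sorted(ok - failed)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python compare_logs.py <ok stdio log> <timed-out stdio log>
    ok_text = Path(sys.argv[1]).read_text(errors="replace")
    failed_text = Path(sys.argv[2]).read_text(errors="replace")
    for line in missing_lines(ok_text, failed_text):
        print(line)
```

Running it on an ok log and a timed-out log from the same target prints the compile-step lines that never appeared before the timeout.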
So, my guess for the reason is #25435 :
node: upgrade to 22.11.0 LTS
on 23 Nov 2024. In addition to the major version bump, that commit also added ABI versioning to node modules. Node is restricted with
DEPENDS:=@HAS_FPU @(i386||x86_64||arm||aarch64)
to build only on the affected targets, which increasingly points to node being the reason for the major timeouts. So for some reason, the node build likely fails 1/3 of the time, but succeeds 2/3 of the time.
Curious.
Sorted logs:
sort x86 ok stdio.txt
sort x86 error stdio.txt
Original:
x86 ok stdio.txt
x86 error stdio.txt