#3319 led me to an idea:
respondd should have the ability to report the device being in an error state.
possible error states:
/overlay is read-only
/overlay doesn't have enough erase-blocks
VPN doesn't connect despite having link and an IP address on WAN
lost connection to gateway
the last update failed to install
autoupdater cannot find device in manifest
manifest doesn't exist
autoupdater failed to connect to update server y
a Wi-Fi interface failed to come up
lost WAN connection (info string)
DFS event detected at time y (info string)
NAND flash has y badblocks (info string)
This information would also be available when you are connected to an offline node via the Wi-Fi mesh but don't yet have a key on the affected device. You could read out the error state before rebooting, which deletes all logs.
Communities could define custom errors like these:
available space on VM is smaller than y (updates cannot be installed)
RAM is smaller than y and therefore insufficient (VM)
EdgeRouter X: there are bad blocks / there are no bad blocks (and maybe a further info string: needs to be updated manually)
custom error (mcu timeout, some ath10k error, etc.)
These are just examples. Some communities have already done similar things in the past by using a package to rename the release name or the hostname, but those approaches were fairly limited.
example device: https://map.freifunk-winterberg.net/#!/de/map/fcecda7cc036
These error messages could be displayed on the map and also evaluated further in Grafana (including timestamps of when the device started showing an error, and how often it occurred).
They could help in evaluating major version updates.
I'm not sure how verbose we want to make these messages since they are queried constantly. We could define error codes instead of sending full strings over the air.
Or we could define a new data type in addition to the current one where you can query the full message:
nodeinfo: 158
statistics: 159
neighbours: 160
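To make the trade-off between error codes and full strings concrete, here is a minimal sketch. It assumes a JSON payload; the code numbers, the `errors` field name, and the function names are all hypothetical and not part of respondd or any existing proposal.

```python
import json
from enum import IntEnum


class ErrorCode(IntEnum):
    # Hypothetical numbering; nothing in this proposal assigns codes yet.
    OVERLAY_READ_ONLY = 1
    VPN_NOT_CONNECTED = 2
    UPDATE_FAILED = 3


# Full messages live on the receiving side (map, collector), not on the air.
MESSAGES = {
    ErrorCode.OVERLAY_READ_ONLY: "/overlay is read-only",
    ErrorCode.VPN_NOT_CONNECTED: "VPN doesn't connect despite having link and an IP on WAN",
    ErrorCode.UPDATE_FAILED: "the last update failed to install",
}


def encode_compact(codes):
    """Serialize only the numeric codes (what a node could send)."""
    body = {"errors": [int(code) for code in codes]}
    return json.dumps(body, separators=(",", ":")).encode()


def encode_verbose(codes):
    """Serialize the full strings, for size comparison."""
    body = {"errors": [MESSAGES[code] for code in codes]}
    return json.dumps(body, separators=(",", ":")).encode()


def decode_compact(payload):
    """Translate received codes back into readable messages."""
    return [MESSAGES[ErrorCode(code)] for code in json.loads(payload)["errors"]]


if __name__ == "__main__":
    active = [ErrorCode.OVERLAY_READ_ONLY, ErrorCode.UPDATE_FAILED]
    print(len(encode_compact(active)), "bytes compact")
    print(len(encode_verbose(active)), "bytes verbose")
    print(decode_compact(encode_compact(active)))
```

The downside of codes is that community-defined custom errors would need either a reserved code range or a way to query the full message, which is where a separate data type might help.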
What are your thoughts on my idea?
Since it's too complex to design a system that resolves errors, the system will always report all errors since the last boot.
Edit: a few error types could emit "resolved" messages that clear previous messages of that specific type, so the map or Grafana only shows current errors.
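A rough sketch of how a consumer could apply such "resolved" messages. The `kind` and `resolved` fields are hypothetical; they only stand in for whatever identifies "a specific type" of error.

```python
def current_errors(events):
    """Return the errors still active after applying "resolved" events.

    `events` is a chronological list of dicts. Each dict has a hypothetical
    "kind" key and may carry "resolved": True to clear earlier events of
    that kind.
    """
    active = {}
    for event in events:
        kind = event["kind"]
        # Drop any earlier entry of this kind; a repeated error then lands
        # at the end because dicts preserve insertion order.
        active.pop(kind, None)
        if not event.get("resolved"):
            active[kind] = event
    return list(active.values())


if __name__ == "__main__":
    history = [
        {"kind": "wan-lost"},
        {"kind": "overlay-read-only"},
        {"kind": "wan-lost", "resolved": True},
    ]
    print(current_errors(history))
```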
the respondd format would be a list containing items of:
error-type: info/warning/error
error-message: string
error-count: counting the number of reports since the last boot
error-date: timestamp of the last occurrence
This list is sorted in the order the errors appeared. If an error reappears, it is moved to the end of the list.
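The list format and its ordering rule could look roughly like this. It is only a sketch: the `ErrorLog` class is hypothetical, and it assumes that two reports count as the same error when both type and message match.

```python
import time
from collections import OrderedDict

VALID_TYPES = ("info", "warning", "error")


class ErrorLog:
    """Errors since the last boot, ordered by most recent appearance."""

    def __init__(self):
        # Assumption: an error is identified by its (type, message) pair.
        self._entries = OrderedDict()

    def report(self, error_type, message, now=None):
        if error_type not in VALID_TYPES:
            raise ValueError(f"unknown error type: {error_type!r}")
        timestamp = int(time.time()) if now is None else now
        key = (error_type, message)
        entry = self._entries.get(key)
        if entry is None:
            entry = {
                "error-type": error_type,
                "error-message": message,
                "error-count": 0,
                "error-date": timestamp,
            }
            self._entries[key] = entry
        entry["error-count"] += 1
        entry["error-date"] = timestamp
        # A reappearing error is sorted back in at the end of the list.
        self._entries.move_to_end(key)

    def to_list(self):
        """Return copies of the entries in reporting order."""
        return [dict(entry) for entry in self._entries.values()]


if __name__ == "__main__":
    log = ErrorLog()
    log.report("error", "/overlay is read-only")
    log.report("warning", "lost wan connection")
    log.report("error", "/overlay is read-only")
    for item in log.to_list():
        print(item)
```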
nodeinfo could carry an ID counter for the errors, so yanic only queries respondd for new errors if the counter has changed since the last request. (Either this or a timestamp of the last error, but a timestamp could lead to race conditions.)
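On the collector side, the counter check might work like the following sketch. The `error-counter` field name and the `fetch_errors` callback are assumptions for illustration; yanic itself is written in Go and has no such API today.

```python
class ErrorPoller:
    """Fetch a node's error list only when its counter has changed."""

    def __init__(self, fetch_errors):
        # fetch_errors(node_id) is a hypothetical callback that performs the
        # extra respondd query and returns the node's error list.
        self._fetch_errors = fetch_errors
        self._last_counter = {}
        self.errors = {}

    def handle_nodeinfo(self, node_id, nodeinfo):
        """Return True if the error list was (re)fetched."""
        if "error-counter" not in nodeinfo:
            # Firmware without the feature: nothing to query.
            return False
        counter = nodeinfo["error-counter"]
        # Compare for inequality rather than "greater than", so a counter
        # that restarts after a reboot still triggers a fetch.
        if self._last_counter.get(node_id) == counter:
            return False
        self.errors[node_id] = self._fetch_errors(node_id)
        self._last_counter[node_id] = counter
        return True


if __name__ == "__main__":
    poller = ErrorPoller(lambda node_id: [{"error-message": "/overlay is read-only"}])
    print(poller.handle_nodeinfo("fcecda7cc036", {"error-counter": 1}))
    print(poller.handle_nodeinfo("fcecda7cc036", {"error-counter": 1}))
```

One gap in this sketch: if a node reboots and its counter happens to reach the same value it had before, the change goes unnoticed until the next error.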
This should not interfere with existing systems like yanic, since it only adds a new feature.