Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[modbus] doesn't recover after network errors #18111

Open
calle2010 opened this issue Jan 16, 2025 · 2 comments
Open

[modbus] doesn't recover after network errors #18111

calle2010 opened this issue Jan 16, 2025 · 2 comments
Labels
bug An unexpected problem or unintended behavior of an add-on

Comments

@calle2010
Copy link

Expected Behavior

When my network fails, e. g. because the router restarts after an update, modbus polling may fail to reach my device via network, but should recover automatically once the network is back.

Current Behavior

When the network fails sometimes the modbus binding is not recovering even after the network is up again. It doesn't poll the networked device anymore, but the device is reachable and functional over the local network with other Modbus clients.

The modbus bridge configuration:

UID: modbus:tcp:193680686a
label: "Modbus TCP Server: Samson Trovis 5573"
thingTypeUID: modbus:tcp
configuration:
  rtuEncoded: false
  connectMaxTries: 1
  reconnectAfterMillis: 0
  timeBetweenTransactionsMillis: 60
  port: 8899
  timeBetweenReconnectMillis: 0
  connectTimeoutMillis: 10000
  host: ew10
  afterConnectionDelayMillis: 0
  id: 247
  enableDiscovery: false

Modbus related log entries (WARN level):

2025-01-16 04:34:58.877 [DEBUG] [ing.ModbusSlaveConnectionFactoryImpl] - Connect try 1/1 failed: Connect timed out. Connection TCPMasterConnection [m_Socket=Socket[unconnected], m_Timeout=3000, m_Connected=false, m_Address=ew10/192.168.204.85, m_Port=8899, m_ModbusTransport=net.wimpi.modbus.io.ModbusTCPTransport@c14610, m_ConnectTimeoutMillis=10000, rtuEncoded=false]. Endpoint ModbusIPSlaveEndpoint [address=ew10, port=8899]
2025-01-16 04:34:58.878 [WARN ] [ing.ModbusSlaveConnectionFactoryImpl] - Connect reached max tries 1, throwing last error: Connect timed out. Connection TCPMasterConnection [m_Socket=Socket[unconnected], m_Timeout=3000, m_Connected=false, m_Address=ew10/192.168.204.85, m_Port=8899, m_ModbusTransport=net.wimpi.modbus.io.ModbusTCPTransport@c14610, m_ConnectTimeoutMillis=10000, rtuEncoded=false]. Endpoint ModbusIPSlaveEndpoint [address=ew10, port=8899]
2025-01-16 04:34:58.883 [WARN ] [ing.ModbusSlaveConnectionFactoryImpl] - Error connecting connection TCPMasterConnection [m_Socket=Socket[unconnected], m_Timeout=3000, m_Connected=false, m_Address=ew10/192.168.204.85, m_Port=8899, m_ModbusTransport=net.wimpi.modbus.io.ModbusTCPTransport@c14610, m_ConnectTimeoutMillis=10000, rtuEncoded=false] for endpoint ModbusIPSlaveEndpoint [address=ew10, port=8899]: Connect timed out
2025-01-16 04:34:58.886 [WARN ] [ing.ModbusSlaveConnectionFactoryImpl] - KeyedPooledModbusSlaveConnectionFactory: Unknown host: ew10: Temporary failure in name resolution. Connection creation failed.
2025-01-16 04:34:58.887 [WARN ] [rt.modbus.internal.ModbusManagerImpl] - Could not connect to endpoint ModbusIPSlaveEndpoint [address=ew10, port=8899] -- aborting request ModbusReadRequestBlueprint [slaveId=247, functionCode=READ_MULTIPLE_REGISTERS, start=105, length=2, maxTries=3] [operation ID a4913cb9-689c-4672-b15a-5033a8cb903b]
2025-01-16 04:34:58.891 [WARN ] [rt.modbus.internal.ModbusManagerImpl] - Could not connect to endpoint ModbusIPSlaveEndpoint [address=ew10, port=8899] -- aborting request ModbusReadRequestBlueprint [slaveId=247, functionCode=READ_MULTIPLE_REGISTERS, start=12, length=6, maxTries=3] [operation ID 838f1ec6-838c-4c51-ad96-a0b885761192]

I had only set org.openhab.core.io.transport.modbus.internal.pooling.ModbusSlaveConnectionFactoryImpl to DEBUG level because I was expecting the error here.

I took a thread dump when modbus was stuck and found five threads blocked on the same object:

"OH-modbusManagerPollerThreadPool-23" #6798 [21325] daemon prio=5 os_prio=0 cpu=1817182.00ms elapsed=5061243.12s tid=0x3f7f9550 nid=21325 waiting on condition  [0x457fe000]
   java.lang.Thread.State: WAITING (parking)
	at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
	- parking to wait for  <0x63507030> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:371)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block([email protected]/AbstractQueuedSynchronizer.java:519)
	at java.util.concurrent.ForkJoinPool.unmanagedBlock([email protected]/ForkJoinPool.java:3780)
	at java.util.concurrent.ForkJoinPool.managedBlock([email protected]/ForkJoinPool.java:3725)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await([email protected]/AbstractQueuedSynchronizer.java:1712)
	at org.apache.commons.pool2.impl.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:1323)
	at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:447)
	at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:350)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl.borrowConnection(ModbusManagerImpl.java:393)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl.lambda$3(ModbusManagerImpl.java:476)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl$$Lambda/0x44f0e298.get(Unknown Source)
	at org.openhab.core.io.transport.modbus.internal.SimpleStopWatch.timeSupplier(SimpleStopWatch.java:118)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl.getConnection(ModbusManagerImpl.java:476)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl.executeOperation(ModbusManagerImpl.java:567)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl$ModbusCommunicationInterfaceImpl.lambda$1(ModbusManagerImpl.java:820)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl$ModbusCommunicationInterfaceImpl$$Lambda/0x44f0ca08.run(Unknown Source)
	at java.util.concurrent.Executors$RunnableAdapter.call([email protected]/Executors.java:572)
	at java.util.concurrent.FutureTask.runAndReset([email protected]/FutureTask.java:358)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run([email protected]/ScheduledThreadPoolExecutor.java:305)
	at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1144)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:642)
	at java.lang.Thread.runWith([email protected]/Thread.java:1596)
	at java.lang.Thread.run([email protected]/Thread.java:1583)

"OH-modbusManagerPollerThreadPool-27" #6819 [21463] daemon prio=5 os_prio=0 cpu=1814634.82ms elapsed=5061153.12s tid=0x3ca000c8 nid=21463 waiting on condition  [0x48efe000]
   java.lang.Thread.State: WAITING (parking)
	at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
	- parking to wait for  <0x63507030> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:371)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block([email protected]/AbstractQueuedSynchronizer.java:519)
	at java.util.concurrent.ForkJoinPool.unmanagedBlock([email protected]/ForkJoinPool.java:3780)
	at java.util.concurrent.ForkJoinPool.managedBlock([email protected]/ForkJoinPool.java:3725)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await([email protected]/AbstractQueuedSynchronizer.java:1712)
	at org.apache.commons.pool2.impl.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:1323)
	at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:447)
	at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:350)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl.borrowConnection(ModbusManagerImpl.java:393)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl.lambda$3(ModbusManagerImpl.java:476)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl$$Lambda/0x44f0e298.get(Unknown Source)
	at org.openhab.core.io.transport.modbus.internal.SimpleStopWatch.timeSupplier(SimpleStopWatch.java:118)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl.getConnection(ModbusManagerImpl.java:476)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl.executeOperation(ModbusManagerImpl.java:567)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl$ModbusCommunicationInterfaceImpl.lambda$1(ModbusManagerImpl.java:820)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl$ModbusCommunicationInterfaceImpl$$Lambda/0x44f0ca08.run(Unknown Source)
	at java.util.concurrent.Executors$RunnableAdapter.call([email protected]/Executors.java:572)
	at java.util.concurrent.FutureTask.runAndReset([email protected]/FutureTask.java:358)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run([email protected]/ScheduledThreadPoolExecutor.java:305)
	at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1144)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:642)
	at java.lang.Thread.runWith([email protected]/Thread.java:1596)
	at java.lang.Thread.run([email protected]/Thread.java:1583)

"OH-modbusManagerPollerThreadPool-29" #6826 [21504] daemon prio=5 os_prio=0 cpu=1813324.49ms elapsed=5061133.12s tid=0x3f75e7b0 nid=21504 waiting on condition  [0x517fe000]
   java.lang.Thread.State: WAITING (parking)
	at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
	- parking to wait for  <0x63507030> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:371)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block([email protected]/AbstractQueuedSynchronizer.java:519)
	at java.util.concurrent.ForkJoinPool.unmanagedBlock([email protected]/ForkJoinPool.java:3780)
	at java.util.concurrent.ForkJoinPool.managedBlock([email protected]/ForkJoinPool.java:3725)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await([email protected]/AbstractQueuedSynchronizer.java:1712)
	at org.apache.commons.pool2.impl.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:1323)
	at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:447)
	at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:350)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl.borrowConnection(ModbusManagerImpl.java:393)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl.lambda$3(ModbusManagerImpl.java:476)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl$$Lambda/0x44f0e298.get(Unknown Source)
	at org.openhab.core.io.transport.modbus.internal.SimpleStopWatch.timeSupplier(SimpleStopWatch.java:118)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl.getConnection(ModbusManagerImpl.java:476)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl.executeOperation(ModbusManagerImpl.java:567)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl$ModbusCommunicationInterfaceImpl.lambda$1(ModbusManagerImpl.java:820)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl$ModbusCommunicationInterfaceImpl$$Lambda/0x44f0ca08.run(Unknown Source)
	at java.util.concurrent.Executors$RunnableAdapter.call([email protected]/Executors.java:572)
	at java.util.concurrent.FutureTask.runAndReset([email protected]/FutureTask.java:358)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run([email protected]/ScheduledThreadPoolExecutor.java:305)
	at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1144)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:642)
	at java.lang.Thread.runWith([email protected]/Thread.java:1596)
	at java.lang.Thread.run([email protected]/Thread.java:1583)

"OH-modbusManagerPollerThreadPool-30" #6841 [21593] daemon prio=5 os_prio=0 cpu=1813703.60ms elapsed=5061073.12s tid=0x02eb73a8 nid=21593 waiting on condition  [0x522fe000]
   java.lang.Thread.State: WAITING (parking)
	at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
	- parking to wait for  <0x63507030> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:371)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block([email protected]/AbstractQueuedSynchronizer.java:519)
	at java.util.concurrent.ForkJoinPool.unmanagedBlock([email protected]/ForkJoinPool.java:3780)
	at java.util.concurrent.ForkJoinPool.managedBlock([email protected]/ForkJoinPool.java:3725)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await([email protected]/AbstractQueuedSynchronizer.java:1712)
	at org.apache.commons.pool2.impl.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:1323)
	at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:447)
	at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:350)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl.borrowConnection(ModbusManagerImpl.java:393)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl.lambda$3(ModbusManagerImpl.java:476)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl$$Lambda/0x44f0e298.get(Unknown Source)
	at org.openhab.core.io.transport.modbus.internal.SimpleStopWatch.timeSupplier(SimpleStopWatch.java:118)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl.getConnection(ModbusManagerImpl.java:476)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl.executeOperation(ModbusManagerImpl.java:567)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl$ModbusCommunicationInterfaceImpl.lambda$1(ModbusManagerImpl.java:820)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl$ModbusCommunicationInterfaceImpl$$Lambda/0x44f0ca08.run(Unknown Source)
	at java.util.concurrent.Executors$RunnableAdapter.call([email protected]/Executors.java:572)
	at java.util.concurrent.FutureTask.runAndReset([email protected]/FutureTask.java:358)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run([email protected]/ScheduledThreadPoolExecutor.java:305)
	at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1144)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:642)
	at java.lang.Thread.runWith([email protected]/Thread.java:1596)
	at java.lang.Thread.run([email protected]/Thread.java:1583)

"OH-modbusManagerPollerThreadPool-31" #6844 [21598] daemon prio=5 os_prio=0 cpu=1814670.01ms elapsed=5061069.10s tid=0x4f0d3500 nid=21598 waiting on condition  [0x491fe000]
   java.lang.Thread.State: WAITING (parking)
	at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
	- parking to wait for  <0x63507030> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:371)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block([email protected]/AbstractQueuedSynchronizer.java:519)
	at java.util.concurrent.ForkJoinPool.unmanagedBlock([email protected]/ForkJoinPool.java:3780)
	at java.util.concurrent.ForkJoinPool.managedBlock([email protected]/ForkJoinPool.java:3725)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await([email protected]/AbstractQueuedSynchronizer.java:1712)
	at org.apache.commons.pool2.impl.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:1323)
	at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:447)
	at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:350)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl.borrowConnection(ModbusManagerImpl.java:393)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl.lambda$3(ModbusManagerImpl.java:476)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl$$Lambda/0x44f0e298.get(Unknown Source)
	at org.openhab.core.io.transport.modbus.internal.SimpleStopWatch.timeSupplier(SimpleStopWatch.java:118)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl.getConnection(ModbusManagerImpl.java:476)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl.executeOperation(ModbusManagerImpl.java:567)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl$ModbusCommunicationInterfaceImpl.lambda$1(ModbusManagerImpl.java:820)
	at org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl$ModbusCommunicationInterfaceImpl$$Lambda/0x44f0ca08.run(Unknown Source)
	at java.util.concurrent.Executors$RunnableAdapter.call([email protected]/Executors.java:572)
	at java.util.concurrent.FutureTask.runAndReset([email protected]/FutureTask.java:358)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run([email protected]/ScheduledThreadPoolExecutor.java:305)
	at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1144)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:642)
	at java.lang.Thread.runWith([email protected]/Thread.java:1596)
	at java.lang.Thread.run([email protected]/Thread.java:1583)

It doesn't seem to be a deadlock, though, because the id in "parking to wait for" changes over time.

The situation seems to be similar to #10071.

It may make a difference that I also use a name (ew10) instead of a fixed IP address. Still, I'd expect the Modbus binding to recover from network failures instead of block indefinitely.

When I restart the "Modbus Binding", the system recovers and works normally.

Steps to Reproduce (for Bugs)

  1. Modbus TCP connection established via name, not fixed IP address
  2. Bring the network down by restarting the router
  3. Modbus binding stops working, sometimes with Things going to error state, sometimes even without

Your Environment

My OpenHAB installation details:

runtimeInfo:
  version: 4.2.0
  buildString: Release Build
locale: de-DE
systemInfo:
  configFolder: /var/services/homes/openhab/openhab-4/conf
  userdataFolder: /var/services/homes/openhab/openhab-4/userdata
  logFolder: /var/services/homes/openhab/openhab-4/userdata/logs
  javaVersion: 21.0.5
  javaVendor: BellSoft
  osName: Linux
  osVersion: 3.10.108
  osArchitecture: arm
  availableProcessors: 2
  freeMemory: 10698296
  totalMemory: 150994944
  uptime: 5087759
  startLevel: 100
addons:
  - automation-jsscripting
  - binding-http
  - binding-hue
  - binding-livisismarthome
  - binding-modbus
  - binding-mqtt
  - binding-systeminfo
  - binding-tr064
  - misc-homekit
  - persistence-inmemory
  - persistence-rrd4j
  - transformation-jsonpath

running on a Synology NAS, ARM 32bit.

@calle2010 calle2010 added the bug An unexpected problem or unintended behavior of an add-on label Jan 16, 2025
@calle2010
Copy link
Author

I switched on TRACE log level before restarting the "modbus binding" bundle. This produced of course a lot of log messages, multiple like this:

5-01-16 09:21:42.503 [TRACE] [bus.handler.ModbusPollerThingHandler] - unregisterPollTask()
2025-01-16 09:21:42.504 [DEBUG] [bus.handler.ModbusPollerThingHandler] - Unregistering polling from ModbusManager
2025-01-16 09:21:42.504 [DEBUG] [rt.modbus.internal.ModbusManagerImpl] - Unregistering regular poll task BasicPollTask [getEndpoint=ModbusIPSlaveEndpoint [address=ew10, port=8899], request=ModbusReadRequestBlueprint [slaveId=247, functionCode=READ_COILS, start=87, length=3, maxTries=3], getResultCallback()=org.openhab.binding.modbus.handler.ModbusPollerThingHandler$ReadCallbackDelegator@1a37db3, getFailureCallback()=org.openhab.binding.modbus.handler.ModbusPollerThingHandler$ReadCallbackDelegator@1a37db3] (interrupting if necessary)
2025-01-16 09:21:42.505 [WARN ] [rt.modbus.internal.ModbusManagerImpl] - Error getting a new connection for endpoint ModbusIPSlaveEndpoint [address=ew10, port=8899]. Error was: java.lang.InterruptedException null
2025-01-16 09:21:42.506 [TRACE] [rt.modbus.internal.ModbusManagerImpl] - borrowing connection (got Optional.empty) for endpoint ModbusIPSlaveEndpoint [address=ew10, port=8899] took 17213181 ms
2025-01-16 09:21:42.506 [TRACE] [rt.modbus.internal.ModbusManagerImpl] - Executing task BasicPollTask [getEndpoint=ModbusIPSlaveEndpoint [address=ew10, port=8899], request=ModbusReadRequestBlueprint [slaveId=247, functionCode=READ_COILS, start=87, length=3, maxTries=3], getResultCallback()=org.openhab.binding.modbus.handler.ModbusPollerThingHandler$ReadCallbackDelegator@1a37db3, getFailureCallback()=org.openhab.binding.modbus.handler.ModbusPollerThingHandler$ReadCallbackDelegator@1a37db3] (oneOff=false)! Connection received in 17213181 ms [operation ID 6fd02b80-4fbe-4d76-825f-8e32421255c7]
2025-01-16 09:21:42.507 [WARN ] [rt.modbus.internal.ModbusManagerImpl] - Could not connect to endpoint ModbusIPSlaveEndpoint [address=ew10, port=8899] -- aborting request ModbusReadRequestBlueprint [slaveId=247, functionCode=READ_COILS, start=87, length=3, maxTries=3] [operation ID 6fd02b80-4fbe-4d76-825f-8e32421255c7]
2025-01-16 09:21:42.507 [TRACE] [rt.modbus.internal.ModbusManagerImpl] - Calling error response callback org.openhab.binding.modbus.handler.ModbusPollerThingHandler$ReadCallbackDelegator@1a37db3 for request ModbusReadRequestBlueprint [slaveId=247, functionCode=READ_COILS, start=87, length=3, maxTries=3]. Error was org.openhab.core.io.transport.modbus.exception.ModbusConnectionException Error connecting to endpoint ModbusIPSlaveEndpoint [address=ew10, port=8899]
2025-01-16 09:21:42.507 [TRACE] [rt.modbus.internal.ModbusManagerImpl] - Called write response callback org.openhab.binding.modbus.handler.ModbusPollerThingHandler$ReadCallbackDelegator@1a37db3 for request ModbusReadRequestBlueprint [slaveId=247, functionCode=READ_COILS, start=87, length=3, maxTries=3]. Error was org.openhab.core.io.transport.modbus.exception.ModbusConnectionException Error connecting to endpoint ModbusIPSlaveEndpoint [address=ew10, port=8899]
2025-01-16 09:21:42.508 [TRACE] [rt.modbus.internal.ModbusManagerImpl] - Operation with task BasicPollTask [getEndpoint=ModbusIPSlaveEndpoint [address=ew10, port=8899], request=ModbusReadRequestBlueprint [slaveId=247, functionCode=READ_COILS, start=87, length=3, maxTries=3], getResultCallback()=org.openhab.binding.modbus.handler.ModbusPollerThingHandler$ReadCallbackDelegator@1a37db3, getFailureCallback()=org.openhab.binding.modbus.handler.ModbusPollerThingHandler$ReadCallbackDelegator@1a37db3]. Got a connection which was unconnected (connection issue) [operation ID 6fd02b80-4fbe-4d76-825f-8e32421255c7]
2025-01-16 09:21:42.508 [TRACE] [rt.modbus.internal.ModbusManagerImpl] - Initial connection was not successful, aborting. [operation ID 6fd02b80-4fbe-4d76-825f-8e32421255c7]
2025-01-16 09:21:42.509 [TRACE] [rt.modbus.internal.ModbusManagerImpl] - returning connection for endpoint ModbusIPSlaveEndpoint [address=ew10, port=8899] took 0 ms
2025-01-16 09:21:42.509 [TRACE] [rt.modbus.internal.ModbusManagerImpl] - Connection was returned to the pool, ending operation [operation ID 6fd02b80-4fbe-4d76-825f-8e32421255c7]
2025-01-16 09:21:42.509 [DEBUG] [rt.modbus.internal.ModbusManagerImpl] - Modbus operation ended, timing info: {total: 17213184 ms, connection: 17213181, transaction=0, callback=1} [operation ID 6fd02b80-4fbe-4d76-825f-8e32421255c7]

Look at total time if 17213184ms, which is nearly five hours and corresponds to the time the Modbus binding was not polling the data. It seems like restarting the bundle unstucks the blocked threads.

@calle2010
Copy link
Author

calle2010 commented Jan 16, 2025

I could replicate the situation by restarting my router and having trace logs enabled:

org.openhab.binding.modbus                         │ TRACE
org.openhab.core.io.transport.modbus               │ TRACE

The logs start briefly before the outage. At the end, there is total silence from Modbus in the logs. You can see that after 6 minutes a rule was logging. The network and name resolution was then available already again for several minutes.

openhab-modbus-issue.log
openhab-modbus-issue-threads.txt

It is also interesting to see that only some of the things go into an error state. Actually all of them should be in error since no Modbus communication is happening anymore. But since all the threads are stuck, also no exceptions are passed back.

Image

Please let me know if there are other loggers that I should enable to analyse the situation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug An unexpected problem or unintended behavior of an add-on
Projects
None yet
Development

No branches or pull requests

1 participant