exception warnings in log #737

Open
awb99 opened this issue Oct 20, 2024 · 6 comments

@awb99

awb99 commented Oct 20, 2024

When I use the aleph http client, at some point I get these exceptions in the application log.
I don't think it is an issue; I believe they happen after I make a series of requests. So
what I guess is happening is that netty is closing connections that it has opened, and somehow
aleph does not catch these exceptions, which is why I see them as warnings in the log.
Any ideas what should be done?
Thanks!!

Below is the log:

2024-10-20T23:32:13.865Z nuc12 WARN [aleph.http.client:326] - exception-handler #error {
 :cause Connection reset
 :via
 [{:type java.net.SocketException
   :message Connection reset
   :at [sun.nio.ch.SocketChannelImpl throwConnectionReset SocketChannelImpl.java 401]}]
 :trace
 [[sun.nio.ch.SocketChannelImpl throwConnectionReset SocketChannelImpl.java 401]
  [sun.nio.ch.SocketChannelImpl read SocketChannelImpl.java 434]
  [io.netty.buffer.PooledByteBuf setBytes PooledByteBuf.java 255]
  [io.netty.buffer.AbstractByteBuf writeBytes AbstractByteBuf.java 1132]
  [io.netty.channel.socket.nio.NioSocketChannel doReadBytes NioSocketChannel.java 356]
  [io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe read AbstractNioByteChannel.java 151]
  [io.netty.channel.nio.NioEventLoop processSelectedKey NioEventLoop.java 788]
  [io.netty.channel.nio.NioEventLoop processSelectedKeysOptimized NioEventLoop.java 724]
  [io.netty.channel.nio.NioEventLoop processSelectedKeys NioEventLoop.java 650]
  [io.netty.channel.nio.NioEventLoop run NioEventLoop.java 562]
  [io.netty.util.concurrent.SingleThreadEventExecutor$4 run SingleThreadEventExecutor.java 994]
  [io.netty.util.internal.ThreadExecutorMap$2 run ThreadExecutorMap.java 74]
  [manifold.executor$thread_factory$reify__18508$f__18509 invoke executor.clj 71]
  [clojure.lang.AFn run AFn.java 22]
  [io.netty.util.concurrent.FastThreadLocalRunnable run FastThreadLocalRunnable.java 30]
  [java.lang.Thread run Thread.java 1589]]}
@arnaudgeiser
Collaborator

Hey @awb99!

> So
> what I guess is happening is that netty is closing connections that it has opened, and somehow
> aleph does not catch these exceptions, which is why I see them as warnings in the log.

That's most likely the opposite.
The connection has been closed on the server side, and the client eventually noticed the socket had been closed while trying to read from it.

If you don't want to keep the connection open for multiple requests, you can set keep-alive? false [1], but there is a cost you will have to pay: establishing a new connection for each request.

[1] : https://github.com/clj-commons/aleph/blob/master/src/aleph/http.clj#L168
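
For reference, a minimal sketch of how the keep-alive? option mentioned above could be applied, assuming it is passed through a dedicated connection pool under :connection-options. The names no-keep-alive-pool and head-without-keep-alive are hypothetical and only for illustration.

```clojure
(require '[aleph.http :as http])

;; A dedicated pool whose connections are not reused between requests.
;; Assumption: :keep-alive? is set under :connection-options.
(def no-keep-alive-pool
  (http/connection-pool
   {:connection-options {:keep-alive? false}}))

(defn head-without-keep-alive
  "Hypothetical helper: issues a HEAD request through the pool above.
  Returns a manifold deferred."
  [url]
  (http/head url {:pool no-keep-alive-pool}))
```

Each request made this way pays for a fresh connection, so this is more of a diagnostic switch than a fix.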

@awb99
Author

awb99 commented Oct 23, 2024

Thanks @arnaudgeiser! I am sending out a lot of http requests. What I notice is that whenever I see these log entries, some of my http requests never return. So my fear is that aleph is not passing these exceptions on and just writes them to the log. I use manifold's
deferred/on-realized, which takes both a success and a failure callback fn. When the remote server closes a connection, I should be able to catch the error. But I cannot. I guess it's a bug in Aleph; this is hard to imagine, as exceptions on http requests should happen from time to time. But this is my best explanation so far.

@arnaudgeiser
Collaborator

> What I notice is that whenever I see these log entries, some of my http requests never return.

Yes, this should not happen. A request should always end up in a terminal state (success or error) and should return one way or another. You should be able to use d/on-realized or d/catch on it.
However, a connection being closed while no requests are in flight on it is something that can happen (on a remote service timeout, for example).
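
To illustrate the "terminal state" point above, here is a minimal sketch using only manifold. It bounds a deferred with d/timeout! and turns every outcome into a value, so a caller cannot wait forever. The helper name with-terminal-state and the {:ok ...} / {:error ...} shape are assumptions for illustration, not something aleph provides.

```clojure
(require '[manifold.deferred :as d])

(defn with-terminal-state
  "Hypothetical helper: returns a deferred that always realizes within
  roughly `timeout-ms`, as {:ok value} or {:error exception}."
  [request-deferred timeout-ms]
  (-> request-deferred
      ;; Errors the deferred with a TimeoutException if it is still pending.
      (d/timeout! timeout-ms)
      (d/chain (fn [response] {:ok response}))
      ;; Any error, including the timeout, becomes a regular value.
      (d/catch (fn [ex] {:error ex}))))

;; Usage with stand-in deferreds instead of real HTTP calls:
@(with-terminal-state (d/success-deferred :response) 100)
@(with-terminal-state
   (d/error-deferred (java.net.SocketException. "Connection reset")) 100)
@(with-terminal-state (d/deferred) 50) ; never realized on its own
```

If the third case is what happens in your run (an {:error ...} holding a TimeoutException), that would suggest the request deferred is never realized at all, rather than an error callback being skipped.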

Can you share some code with us so we can try to reproduce your scenario?

[1] : https://github.com/clj-commons/aleph/blob/master/src/aleph/http/client.clj#L965

@awb99
Author

awb99 commented Oct 23, 2024

So what I am doing is running a few threads in parallel that work through a seq of tasks,
each of which makes an http head request.

The function that I am using is download-link-info in
https://github.com/clojure-quant/quanta-market/blob/main/src/quanta/market/barimport/kibot/http.clj

When I ran this on 2000 requests, at some point the exceptions would be logged, and my task-runner
would hang forever. This is why I am so sure that I am not catching the aleph socket timeout exceptions
correctly.

I am using missionary as an FRP framework, and I made this wrapper code so I can use aleph from
within missionary. The most obvious culprit would be that my wrapper is broken. This is the wrapper:
https://github.com/clojure-quant/quanta-market/blob/main/src/quanta/market/util/aleph.clj
I doubt that I have an error here, as my wrapper is in line with other wrappers for core.async and promises.
I use d/on-realized and hand it a success and a failure callback. I believe the failure callback should get
the exception and I don't need d/catch. Is this correct?
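
One way to check the callback question above in isolation, without any HTTP involved, is a small sketch like the following. It only uses manifold; the helper name observe and the [:success v] / [:failure ex] / [:pending nil] tags are made up for illustration.

```clojure
(require '[manifold.deferred :as d])

(defn observe
  "Hypothetical helper: registers both callbacks via d/on-realized and
  returns a promise delivered with [:success v] or [:failure ex]."
  [dfd]
  (let [outcome (promise)]
    (d/on-realized dfd
                   (fn [v]  (deliver outcome [:success v]))
                   (fn [ex] (deliver outcome [:failure ex])))
    outcome))

;; An errored deferred reaches the failure callback with the exception:
(deref (observe (d/error-deferred (java.net.SocketException. "Connection reset")))
       100 [:pending nil])

;; A deferred that is never realized triggers neither callback:
(deref (observe (d/deferred)) 100 [:pending nil])
```

So with d/on-realized alone, a hang would be consistent with a deferred that never reaches either state, which is a different problem from the failure callback being skipped.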

When I ran into this issue, I wrote a wrapper for clj-http and switched to clj-http:
https://github.com/clojure-quant/quanta-market/blob/main/src/quanta/market/util/clj_http.clj
With clj-http these errors no longer happened at all.

The webserver I am getting the data from is kibot.com; and unfortunately I cannot share my account
publicly. But I am very happy to do that privately.

Perhaps it would be easiest if I find a server that never responds to a head (or get) request at all, and just
closes the connection after some time to simulate this behavior.
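
The reproduction idea above could be sketched with plain JDK sockets, so no external server is needed. This is only an assumption about what such a test server might look like: start-silent-server is a hypothetical name, and whether the client sees a clean close or a "Connection reset" may depend on the OS and on whether the client had already sent data.

```clojure
(import '(java.net ServerSocket Socket))

(defn- daemon-thread
  "Runs f on a daemon thread so the JVM can still exit."
  [f]
  (doto (Thread. ^Runnable f)
    (.setDaemon true)
    (.start)))

(defn start-silent-server
  "Hypothetical test server: accepts TCP connections on an ephemeral port,
  never writes a response, and closes each connection after `close-after-ms`.
  Returns {:port <int> :stop <fn>}."
  [close-after-ms]
  (let [server (ServerSocket. 0)]
    (daemon-thread
     (fn []
       (try
         (loop []
           (let [sock (.accept server)]
             (daemon-thread
              (fn []
                (Thread/sleep (long close-after-ms))
                (.close ^Socket sock))))
           (recur))
         ;; .accept throws once the server socket is closed by :stop.
         (catch java.io.IOException _ nil))))
    {:port (.getLocalPort server)
     :stop (fn [] (.close server))}))
```

A client under test would then be pointed at (str "http://127.0.0.1:" port "/") and the server stopped with (stop) afterwards.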

I would like to use aleph, because its threading model is pretty good.
Thanks a lot for the help!

@arnaudgeiser
Collaborator

> Perhaps it would be easiest if I find a server that never responds to a head (or get) request at all, and just
> closes the connection after some time to simulate this behavior.

I would like to reproduce your scenario by doing exactly that. Unfortunately, I haven't had time to look at it yet. I'm going to give it a shot over the course of the next week.

@awb99
Author

awb99 commented Nov 3, 2024

I just had a few more of these errors on an endpoint that does not require credentials. I believe that I could set up a deps.edn alias in one of my projects to get this error more consistently. So I would set up one version that works with clj-http and another with aleph that produces the errors. Should I try to do that?
