Added log processing #75
base: master
Conversation
Hey @ilyakooo0 Thanks for the PR. 🍰 I've only taken a cursory look as I'm short on time, but I'll try and review it properly tomorrow. It seems the CI is erroring for unrelated reasons.
processLog :: Monad m => ConduitT BSS.ByteString BSS.ByteString m ()
processLog = do
  -- metadata (is the next string stdout or stderr)
Do I understand correctly that the first 4 bytes in the stream just denote whether the data was collected from stdout or stderr?
As I explained in the issue, I couldn't find any documentation regarding the format. All I could find was this repo: https://github.com/mafintosh/docker-raw-stream
In that repo, the second 4 bytes represent the length of the subsequent string, and the first bit denotes whether the string is from stdout or stderr.
I assume that the first 4 bytes are reserved for metadata.
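To make that concrete, here is a minimal, self-contained sketch of decoding that second 4-byte group as a big-endian length (decodeLen is an illustrative name, not part of this PR):

import Data.Bits (shiftL, (.|.))
import qualified Data.ByteString as BS
import Data.Word (Word32)

-- Interpret 4 bytes as a big-endian length, i.e. the second 4-byte
-- group of the 8-byte header described above.
decodeLen :: BS.ByteString -> Word32
decodeLen = BS.foldl' (\acc b -> acc `shiftL` 8 .|. fromIntegral b) 0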
I have had the opportunity to test this branch and it works as expected.
processLog = do
  -- metadata (is the next string stdout or stderr)
  _ <- CB.take 4
  len' <- CB.take 4
Why do we only take 4 bytes at a time?
You mean, why don't we read the first 8 bytes in one go?
I'm not sure reading it in one go and then splitting it into two parts would be noticeably more efficient.
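For comparison, reading the whole 8-byte header in one go and then splitting it might look like this (a sketch only; readHeader is an illustrative name):

import Data.Conduit (ConduitT)
import qualified Data.ByteString as BSS
import qualified Data.ByteString.Lazy as BL
import qualified Data.Conduit.Binary as CB

-- Read the full 8-byte header at once and split it into the
-- metadata half and the length half.
readHeader :: Monad m => ConduitT BSS.ByteString o m (BL.ByteString, BL.ByteString)
readHeader = BL.splitAt 4 <$> CB.take 8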
Actually, some of the builds errored because I used the new … UPD: I've now fixed it, and the builds are failing only for unrelated reasons.
Hi everyone, I ran into the same problem yesterday, but found that the format is documented in https://docs.docker.com/engine/api/v1.40/#operation/ContainerAttach. Here's some (simplified) code showing how I deal with it at the moment (which seems to work fine):
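For reference, a minimal sketch of handling that documented format with conduit, bytestring, and binary (demultiplexDockerStream is an illustrative name, and the stream-type byte is simply discarded here):

import Data.Binary.Get (getWord32be, runGet)
import Data.Conduit (ConduitT, yield)
import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL
import qualified Data.Conduit.Binary as CB

-- Each frame starts with an 8-byte header:
--   [stream type (1 byte), padding (3 bytes), payload length (4 bytes, big-endian)]
-- followed by that many bytes of payload.
demultiplexDockerStream :: Monad m => ConduitT BS.ByteString BS.ByteString m ()
demultiplexDockerStream = do
  header <- CB.take 8
  if BL.length header < 8
    then pure ()  -- end of stream (or a truncated header)
    else do
      let len = fromIntegral (runGet getWord32be (BL.drop 4 header))
      payload <- CB.take len
      yield (BL.toStrict payload)
      demultiplexDockerStream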
It is worth noting that it may not be sensible to deal with this directly in the implementation of getContainerLogsStream.
Hey @mbg thanks for the clarification and the example. It was always the intention of the library to leave the stream processing outside of the library, i.e. let the user define the sink that gets passed to the getContainerLogsStream function. So I'm not sure if it's wise to include this entire machinery in the library itself. That said, there is too much abstraction leaking outside of the library, and the user has to know Docker API internals to construct a proper sink (i.e. the message structure, what the first bytes mean, and so on). So I think we need to address this somehow.
I'd love to get this one merged but I haven't had time to look into it in more detail. Would an example stream that just prints stuff to stdout and a pointer to it in the docs be the best course of action? @ilyakooo0 @mbg what do you think?
IMO the main problem is that the raw stream produces invalid characters (since some of the data is not text at all). So the stream is mostly useless unless you do the exact processing that is done in this PR, even if you just want the binary output of the executable. I can't imagine a reason why someone would use this library and want the raw unprocessed stream.
I agree with @ilyakooo0. I think that there should be an implementation of this in the library, at the very least as an opt-in (i.e. even if it was just something along the lines of the code above).
Neither of these is a particularly big issue; they are just design decisions that need to be made.
@denibertovic have you had any more thoughts on this, or have you made a decision?
@mbg I agree with your previous comment. This should be implemented as a separate function and not directly in getContainerLogsStream. I don't have time to work on this, but I'd be happy to merge if the above changes were made.
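As a rough illustration of that split, a caller could fuse a standalone demultiplexing conduit (such as the demultiplexDockerStream sketch earlier in this thread) in front of their own consumer; logSink is an illustrative name and no particular docker-hs signature is assumed:

import Control.Monad.IO.Class (MonadIO, liftIO)
import Data.Conduit (ConduitT, (.|))
import Data.Void (Void)
import qualified Data.ByteString as BSS
import qualified Data.Conduit.List as CL

-- A sink the caller could pass to the logs-streaming function: opt-in
-- demultiplexing fused in front of a plain print-to-stdout consumer.
-- (demultiplexDockerStream refers to the sketch above.)
logSink :: MonadIO m => ConduitT BSS.ByteString Void m ()
logSink = demultiplexDockerStream .| CL.mapM_ (liftIO . BSS.putStr)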
closes #74