post_batch ... differentiate reception batching from transmission. #1213
Comments
I'm assuming we'd also like to have I've added changes for the new configuration option + made |
I've run the static flow tests with |
I think post_batch says how many files we send... but I'm confused... if you have a:
what happens? I'm guessing it should:
Is that what you do? |
if batch < post_batch... then do we gather more than once? or do we just make post_batch == batch? |
No. I only added the option and made the option tunable in the transfer class (ftp / sftp). If it's not too hard it would be nice to have it configurable both ways (post_batch > batch) && (batch > post_batch). If post_batch > batch, I think what we'd need to do is
|
hey @reidsunderland ? I'm starting to wonder if this is worthwhile... I thought it would just be a new setting and a few lines... but it looks like we are putting a new looping layer everywhere to deal with the case where batch != post_batch: loop multiple times in gather to accumulate post_batch worth of messages (if post_batch > batch), and loop multiple times in work+post to accumulate batch worth of messages (if post_batch < batch)... I started this... but I don't remember any use cases... is it something worth doing? |
I think the reason we were looking at it is because we observed that the tx_commit can be slow. We currently do 1 tx_commit for every 1 message we publish, and we thought that it would be faster to do 1 tx_commit for multiple messages. When we tested that, it was a bit faster, but we weren't sure how it would impact error handling or other failures. I think we decided that it wasn't worth changing tx_commit right now, and it sounds like it's not worth the added complexity to implement a separate post_batch option |
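To make the tx_commit trade-off above concrete, here is a minimal sketch of committing once per group of published messages rather than once per message. `FakeChannel`, `publish_with_commits` and `commit_every` are hypothetical names invented for illustration, not part of the project or of any AMQP library. The fake channel only counts calls so that the number of commits can be compared.

```python
from typing import Iterable, List


class FakeChannel:
    """Hypothetical stand-in for a broker channel; it only records calls."""

    def __init__(self) -> None:
        self.published: List[str] = []
        self.commits = 0

    def basic_publish(self, message: str) -> None:
        self.published.append(message)

    def tx_commit(self) -> None:
        self.commits += 1


def publish_with_commits(channel: FakeChannel,
                         messages: Iterable[str],
                         commit_every: int) -> None:
    """Publish every message, committing after each group of commit_every."""
    if commit_every < 1:
        raise ValueError("commit_every must be at least 1")
    uncommitted = 0
    for message in messages:
        channel.basic_publish(message)
        uncommitted += 1
        if uncommitted >= commit_every:
            channel.tx_commit()
            uncommitted = 0
    # Commit any trailing partial group so that nothing is left pending.
    if uncommitted:
        channel.tx_commit()
```

With `commit_every=1` this matches the one-commit-per-message behaviour described above. The sketch does not address the open question raised in the comment: what should happen to the uncommitted messages if a publish or commit fails partway through a group.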
Another wrinkle is... if there simply isn't a full batch worth of incoming messages (i.e. gather results in zero new messages), then we want to proceed to posting regardless of batch... |
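The extra looping layer discussed in this thread, including the empty-gather wrinkle, could be sketched roughly as follows. This is not the project's code: `regroup` is a hypothetical helper, each inner list stands for the result of one gather call, and messages are plain strings for simplicity.

```python
from typing import Iterable, Iterator, List


def regroup(gathered: Iterable[List[str]], post_batch: int) -> Iterator[List[str]]:
    """Regroup gather results into groups of at most post_batch messages.

    An empty gather result flushes whatever is pending, so posting is not
    held up waiting for messages that may never arrive.
    """
    if post_batch < 1:
        raise ValueError("post_batch must be at least 1")
    pending: List[str] = []
    for messages in gathered:
        if not messages:
            # Nothing new arrived: post the partial group now.
            if pending:
                yield pending
                pending = []
            continue
        pending.extend(messages)
        # Covers both cases: accumulate when batch < post_batch,
        # split when batch > post_batch.
        while len(pending) >= post_batch:
            yield pending[:post_batch]
            pending = pending[post_batch:]
    # Flush the remainder once gathering stops.
    if pending:
        yield pending


# Usage: gather results of sizes 2, 2, 1, 0, 1 with post_batch = 3.
groups = list(regroup([["a", "b"], ["c", "d"], ["e"], [], ["f"]], 3))
```

Even in this reduced form there is a second buffer and a flush rule to maintain, which gives a sense of the added complexity being weighed here.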
batch settings are things that should be tuned, and the best tuning value may be different for consuming versus publishing... likely need independent tuning.