#270 Support periodic manual commits #275

Open
wants to merge 1 commit into
base: main

@lepek lepek commented May 18, 2018

I know we have talked about sanity checks, but in the worst case this would commit after every poll, which is the same thing a manual commit with interval = 0 would do. So I think it is OK to do it like this. But let me know what you think!

karmi commented May 18, 2018

Hi @lepek, we have found your signature in our records, but it seems you signed with a different e-mail than the one used in your Git commit. Can you please add both of these e-mails to your GitHub profile (they can be hidden), so we can match your e-mails to your GitHub profile?

Author

lepek commented May 18, 2018

Do we want to upgrade the Travis build for Logstash 5.6? It is failing due to:

Installing rake 12.3.1
Gem::RuntimeRequirementNotMetError: rake requires Ruby version >= 2.0.0. The
current ruby version is 1.9.
An error occurred while installing rake (12.3.1), and Bundler
cannot continue.

@jsvd jsvd self-requested a review May 18, 2018 15:51
Member

@jsvd jsvd left a comment

I performed some manual tests and found some issues that must be addressed before we can merge this.

The wrong boolean means that commits are never done, and the two scenarios that aren't addressed cause Logstash to not commit during idle or shutdown, leading to unnecessary replays and duplicates.


def has_to_commit?(last_commit_time)
# If auto_commit is enable we just leave the commit to the client library on poll and close actions
return false if @enable_auto_commit == "false"
Member

This should be: return false if @enable_auto_commit == "true" (we don't want to commit manually if auto commit is on).

@@ -266,8 +268,9 @@ def thread_runner(logstash_queue, consumer)
end
end
# Manual offset commit
if @enable_auto_commit == "false"
if has_to_commit?(last_commit_time)
Member

By no longer committing on every poll operation, we now have two issues that must be addressed:

  1. If you receive and process some events before has_to_commit? returns true and then no other events arrive, we'll never commit the offset, because a guard at the start of the loop skips the iteration when poll returns no records.
  2. If events are processed but Logstash is asked to terminate gracefully, we don't commit the offset, since the stop operation doesn't do it explicitly. Currently it relies on either commit per poll or auto commit.
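
One possible way to cover both cases is sketched below. This is a minimal, self-contained illustration and not the plugin's code: FakeConsumer, PollLoop, commit_sync, the pending flag and the injected clock are all hypothetical names invented for the example. The idea is to remember whether anything was processed since the last commit, check the interval even on empty polls, and commit once more when the loop exits.

```ruby
# Hypothetical stand-in for the Kafka consumer; it only counts commits.
class FakeConsumer
  attr_reader :commits

  def initialize(batches)
    @batches = batches
    @commits = 0
  end

  # Returns the next batch of records, or an empty array when idle.
  def poll
    @batches.shift || []
  end

  def commit_sync
    @commits += 1
  end
end

# Sketch of a poll loop that tracks uncommitted work, so that idle polls
# and a graceful stop still commit the offset.
class PollLoop
  def initialize(consumer, interval_ms, clock)
    @consumer = consumer
    @interval_ms = interval_ms
    @clock = clock                 # callable returning the current time in ms
    @last_commit_ms = clock.call
    @pending = false               # true if events were processed since the last commit
  end

  def run(iterations)
    iterations.times do
      records = @consumer.poll
      @pending = true unless records.empty?
      # Issue 1: check the interval even when poll returned nothing.
      commit if @pending && interval_elapsed?
    end
  ensure
    # Issue 2: commit whatever is left when the loop is asked to stop.
    commit if @pending
  end

  private

  def interval_elapsed?
    @interval_ms <= 0 || (@last_commit_ms + @interval_ms) < @clock.call
  end

  def commit
    @consumer.commit_sync
    @last_commit_ms = @clock.call
    @pending = false
  end
end
```

In the real plugin the stop path and the empty-poll guard live in different places, so the same idea would need to be split between thread_runner and the shutdown handling.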

Author

True, I only took into account my own case, in which I have a pretty stable flow.

return false if @enable_auto_commit == "false"

# If auto_commit is disable, we need to commit, we will do it depending on the manual_commit_interval option
@manual_commit_interval_ms <= 0 || (last_commit_time + @manual_commit_interval_ms) < timestamp_ms
Member

for clarity, we can change the big conditional into two operations:

  def has_to_commit?(last_commit_time)
    # auto_commit is enabled so we just leave the commit to the client library on poll and close actions
    return false if @enable_auto_commit == "true"

    # auto_commit is disabled but interval committing is disabled as well, so commit on every poll
    return true if @manual_commit_interval_ms <= 0

    # auto_commit is disabled and an interval is set, so let's check if enough time passed since last commit
    (last_commit_time + @manual_commit_interval_ms) < current_timestamp_ms
  end
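
To see the three branches in isolation, the method can be dropped into a small throwaway class. This is only a sketch: CommitPolicy and its fixed clock are hypothetical, and current_timestamp_ms is assumed to return the current time in milliseconds, since the snippet above does not show its definition.

```ruby
# Hypothetical wrapper class, only here to exercise the three branches.
class CommitPolicy
  def initialize(enable_auto_commit, manual_commit_interval_ms, now_ms)
    @enable_auto_commit = enable_auto_commit              # "true" or "false", as a string
    @manual_commit_interval_ms = manual_commit_interval_ms
    @now_ms = now_ms                                      # fixed clock for the example
  end

  def has_to_commit?(last_commit_time)
    # Auto commit is on: leave committing to the client library.
    return false if @enable_auto_commit == "true"

    # Manual commit with no interval: commit on every poll.
    return true if @manual_commit_interval_ms <= 0

    # Manual commit with an interval: commit only once it has elapsed.
    (last_commit_time + @manual_commit_interval_ms) < current_timestamp_ms
  end

  private

  # Assumed helper; a real implementation would read the system clock.
  def current_timestamp_ms
    @now_ms
  end
end

puts CommitPolicy.new("true", 0, 10_000).has_to_commit?(0)          # false
puts CommitPolicy.new("false", 0, 10_000).has_to_commit?(0)         # true
puts CommitPolicy.new("false", 5_000, 10_000).has_to_commit?(4_000) # true
puts CommitPolicy.new("false", 5_000, 10_000).has_to_commit?(6_000) # false
```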

Member

jsvd commented May 28, 2018

Btw, a rebase against master should solve the test failures we're seeing here.
