Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Send measure metric even when the block raises #111

Closed
wants to merge 1 commit into from

Conversation

maximebedard
Copy link

Fixes #110

In order to fix this issue, I had to break down the duration method since we need to handle situation where the block would raise in order to calculate the remaining time. Since it was the only place used, I decided to expose the gettime method that uses the method available to gather the current time.

Copy link
Contributor

@wvanbergen wvanbergen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One issue is that calls that fail with an exception most likely have much different characteristics compared to calls that are successful. This simply combines those two histograms into one.

Maybe that makes sense, but I can also see many cases where it doesn't. What do you think?

start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
yield
Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
def self.gettime
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is part of the public API. We cannot really change it like this. I know we have code out there that uses this function

end
else
collect_metric(hash_argument(metric_options).merge(type: :ms, name: key, value: value))
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can still use the old duration method.

value = 1000 * StatsD::Instrument.duration do
  begin
    result = yield
  ensure
    ...
  end
end

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that we need to have the duration by the time we reach the ensure block in order to deliver the metric with the actual length of the call. I dont think this would work :(

@maximebedard
Copy link
Author

most likely have much different characteristics compared to calls that are successful.

True, I haven't considered it that way. Should we publish a different metrics in this case?

@maximebedard
Copy link
Author

TBH I'm not 100% sure what the correct behaviour is. For compairaison, datadog's library also delivers the metric when the block raises https://github.com/DataDog/dogstatsd-ruby/blob/master/lib/datadog/statsd.rb#L203-L204.

For me, I believe the noise may be a good thing since it will skew metrics and potentially help pinpoint errors/unexpected behaviour.

@tjoyal
Copy link
Member

tjoyal commented Oct 31, 2017

One issue is that calls that fail with an exception most likely have much different characteristics compared to calls that are successful. This simply combines those two histograms into one.

I see your concern and to tell the truth I might be biased by the recent work I've been doing.

The way I see it I want this gem to "tell me information on what happens when I run this block" and silencing it is counter intuitive for me. (eg.: measuring around an http call might tell me all is fine but in reality a lot of requests timeout)

Something else to note, other events would still track failures. eg.:
statsd_count_success would keep track of exceptions
statsd_count increment before yielding the old method that might fail

statsd_count_if would on the other end not consider failures

So maybe what is really missing is:
statsd_measure_success that would keep track of success in a tag
statsd_measure_if that would track only success

That leaves us with statsd_measure that should track everything.

Existing code point in the direction I want, so I'm using it as my argument. I'm open to changing how the statsd_count to ignore failures if we prefer to go the other direction. Breaking changes either way.

This gets my 👍 if we are considering "making statsd_measure behave the same way as statsd_count" good enough as a first step (even if we eventually would like to go back).

@wvanbergen
Copy link
Contributor

If you think this is the right way to go, I am on board.

We just need to make sure to keep the duration method in place. Rather than replacing it, let's add current_timestamp (instead of gettime which I don't find very Ruby-like), and rewrite duration to use it.

@dnlserrano
Copy link

dnlserrano commented Nov 2, 2017

This is embarassing... I just noticed you guys had this PR open... 🤦‍♂️

EDIT: Anyways, you guys' version looks good, so let's just get this thing rolling. 🚀

Copy link
Member

@tjoyal tjoyal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo :shipit:

If we do 100% agree on the direction I'm open to discuss it, but I think statsd_measure should handle exception exit path the same way as statsd_count.

I prefer this PR so 👍 for me.
If we do not want, then we fix statsd_count.
But we need to move the needle and keep the ball rolling.

If we go the route of this PR, I think it would mean we could add statsd_measure_if and statsd_measure_success to make "measure" works like "count" and "everything possible"

@dnlserrano
Copy link

dnlserrano commented Nov 9, 2017

I'm sorry if I start to sound like a broken record, but when can we expect to have this merged in? 😇

Thanks guys! ❤️

begin
result = yield
ensure
value = 1000 * StatsD::Instrument.current_timestamp - start
Copy link
Contributor

@wvanbergen wvanbergen Nov 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add parentheses here? 1000 * (StatsD::Instrument.current_timestamp - start). I am not too sure what the rules are in Ruby, with parentheses it's clear that it will be correct.

@maximebedard
Copy link
Author

I've updated this PR, but I've also started building the new API here: #114 for anyone interested

@dnlserrano
Copy link

Hey guys, it's me again. 😊 While we don't have the final version of the new API that @maximebedard is working on, can we get this PR merged and release a new minor of the gem?

This would help me a lot to get better metrics in the short to mid-term. I can also try to help on a solution for a more definitive way of doing things, if that's needed. Thanks! ❤️

@jcmfernandes
Copy link
Contributor

I was bitten by this today as well. Can we please get this PR merged and have a minor version released? 🙏

Thank you!

@wvanbergen
Copy link
Contributor

This has been implemented in #134

@wvanbergen wvanbergen closed this Sep 12, 2019
@wvanbergen wvanbergen deleted the measure-exceptions branch September 12, 2019 16:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Some metrics seems to be skipped on exceptions
5 participants