skyline.analyzer.metrics #90
base: master
Conversation
Added functionality to analyzer.py for skyline to feed all of its own metrics back to graphite. This results in skyline analyzing its own metrics for free :)

The resultant graphite metrics and carbon files (if using whisper and not ceres) that this provides are (e.g.):

skyline/
├── analyzer
│   ├── anomaly_breakdown
│   │   ├── first_hour_average.wsp
│   │   ├── grubbs.wsp
│   │   ├── histogram_bins.wsp
│   │   ├── ks_test.wsp
│   │   ├── least_squares.wsp
│   │   ├── mean_subtraction_cumulation.wsp
│   │   ├── median_absolute_deviation.wsp
│   │   ├── stddev_from_average.wsp
│   │   └── stddev_from_moving_average.wsp
│   ├── duration.wsp
│   ├── exceptions
│   │   ├── Boring.wsp
│   │   └── Stale.wsp
│   ├── projected.wsp
│   ├── run_time.wsp
│   ├── total_analyzed.wsp
│   ├── total_anomalies.wsp
│   └── total_metrics.wsp
└── horizon
    └── queue_size.wsp

There will be more for other exceptions and any further added algorithms; this is, however, quite trivial in terms of whisper storage and new metric adds.

Modified:
src/analyzer/analyzer.py
This should probably close #89
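For context, here is a minimal sketch of what a send_graphite_metric helper can look like, assuming the standard carbon plaintext protocol (a `<metric> <value> <timestamp>` line over TCP, port 2003 by default). The host/port values are assumptions and the actual implementation in analyzer.py may differ:

```python
import socket
import time

# Assumed values; in skyline these would come from src/settings.py
GRAPHITE_HOST = 'localhost'  # hypothetical host
CARBON_PORT = 2003           # carbon plaintext listener default port

def send_graphite_metric(name, value):
    # carbon plaintext protocol: "<metric path> <value> <timestamp>\n"
    line = '%s %s %d\n' % (name, value, int(time.time()))
    sock = socket.create_connection((GRAPHITE_HOST, CARBON_PORT), timeout=5)
    try:
        sock.sendall(line.encode('utf-8'))
    finally:
        sock.close()
```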
self.send_graphite_metric('skyline.analyzer.total_metrics', '%d' % len(unique_metrics))
for key, value in exceptions.items():
    send_metric = 'skyline.analyzer.exceptions.%s' % key
    self.send_graphite_metric(send_metric, '%d' % value)
Won't this give crazy tracebacks? You don't really want to put that in Graphite as a metric name.
Hi Abe

No, it seems fine... I hope, although the fact that you raised the question makes me wonder :) Crazy feedback loop, I do see what you are saying; however, although it may appear so, it does not result in that, because they are actually expected to become anomalous. This is not for alerting on per se, but to keep a timeseries set for each of them. So basically it is taking what the logger is reporting and turning it into metrics.
So what the logger wrote there gets sent to graphite as:

skyline.analyzer.exceptions.Boring 4501

etc. Now, seeing as the exceptions are class names and the anomaly_breakdown keys are algorithm definitions (defs), the namespaces should not be messed up by spaces, etc. And this results in the first crazy skyline feedback graph in the world? :)
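To make the mapping concrete, a tiny illustration of the loop from the diff applied to a hypothetical exceptions dict (the Boring count is taken from the comment above; the Stale count is invented for the example):

```python
# Illustrative only: the loop from the diff applied to a hypothetical
# exceptions dict. The Boring count is from the comment above; the
# Stale count is invented for the example.
exceptions = {'Boring': 4501, 'Stale': 27}

for key, value in exceptions.items():
    send_metric = 'skyline.analyzer.exceptions.%s' % key
    print('%s %d' % (send_metric, value))
# -> skyline.analyzer.exceptions.Boring 4501
# -> skyline.analyzer.exceptions.Stale 27
```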
Hi Abe

Having assessed further, there is some crazy feedback, although I am not certain the feedback exaggeration is any worse than the actual anomaly count + 1, the + 1 being the total_anomalies value itself becoming anomalous, as each anomaly_breakdown item is only one metric sent every analyzer run, as is the total_anomalies count. This would be fairly simple to unexaggerate by adding skyline.analyzer.anomaly_breakdown to the SKIP_LIST as a default.

I know this is quite a fork and not necessarily in the direction of anomaly detection (or is it?), however it does give one a better operational view and visualisation of what skyline is doing and how things are triggered. It is interesting to see that ks_test appears to be the least triggered algorithm. Maybe good for testing new algorithms as well, if you were not using crucible to do that for some reason and just wanted to visually determine how anomalous an algorithm might be.

Going to add it to the SKIP_LIST.
Added skyline.analyzer.anomaly_breakdown. to SKIP_LIST

Modified:
src/settings.py.example
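For reference, a sketch of roughly what that change in src/settings.py.example can look like; the first entry and the comment are placeholders, not the file's actual contents:

```python
# A sketch of the SKIP_LIST change; the first entry and the comment are
# placeholders, not the actual contents of settings.py.example.
SKIP_LIST = [
    'example.statsd.metric',                # placeholder entry
    'skyline.analyzer.anomaly_breakdown.',  # skip the feedback metrics
]
```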
Hi Abe

The results are in and it appears that there is a small but noticeable difference in the total_anomalies when skyline.analyzer.anomaly_breakdown is not in the SKIP_LIST. When analyzing the skyline.analyzer.anomaly_breakdown metrics, the total_anomalies graph does seem a bit more peaky, but it is very hard to say when you look into further retention periods. That said, I see no reason not to have it in the SKIP_LIST by default.
The plot thickens. There was me thinking that my SKIP_LIST had applied; I forgot that SKIP_LIST does not get applied in analyzer but rather in horizon (which I did not restart), so the visual peaky diff must have just been in the actual anomalies then. I just got a load of anomaly_breakdown alerts that I was no longer expecting from skyline (rkhunter is running, so it is actually normal). So I will update with more info when we have a total_anomalies graph without skyline.analyzer.anomaly_breakdown feedback.
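For anyone following along, a minimal sketch of the kind of substring check horizon applies at ingestion, which is why horizon (not analyzer) must be restarted for SKIP_LIST changes to take effect; the function and variable names here are illustrative, not horizon's actual worker code:

```python
import settings  # assumed to expose SKIP_LIST as sketched above

def should_skip(metric_name):
    # Horizon-style check: drop any metric whose name contains one of
    # the SKIP_LIST substrings, so it never reaches analyzer at all.
    for to_skip in settings.SKIP_LIST:
        if to_skip in metric_name:
            return True
    return False

# e.g. should_skip('skyline.analyzer.anomaly_breakdown.grubbs') -> True
```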
Hi Abe

The status quo is maintained, in the context of the skyline feedback, with the anomaly_breakdown metrics now skipped. The total_anomalies timeseries and graph have already helped us identify and distribute some process intensive tasks over a longer time period, so as to spread the load out more, so it has added some value.

Therefore, in terms of this pull request, that is it :)