Description
The issue I'm facing is with the multi-server update, which I'm very grateful for. However, it added a new, automatic label called "server" that gives the IP:PORT of the database server the exporter is connecting to. It's a useful label, but because Prometheus treats a metric with a different label set as a completely different series, every existing metric now shows up as a brand-new series in Grafana for regular graphs: new line colors and duplicated legend entries for every metric whenever the time range spans both before and after the update. That's not so bad and can sort of be dealt with.
What does break, however, is any SingleStat panel that uses these metrics. It throws a multi-series error if the time frame includes the period both before and after the upgrade, because you're asking for a single named metric and getting back two series with different label sets.
Also, even with a new install where everything has the "server" label from the beginning, we were using the pg_up metric from postgres_exporter in some simple math to get server status. For some reason, pg_up doesn't include the "server" label. So any expression that combines another postgres_exporter metric with pg_up is now broken, because the label sets are no longer consistent. By default, Prometheus only does math between series whose label sets match exactly.
The only solution I've found is to strip the "server" label in the Prometheus config, so things go back to working the way they did before.
    - source_labels: [__name__, server]
      regex: "ccp_.*;.+"
      action: replace
      target_label: server
      replacement: ""
All of our custom queries start with ccp_, so we can at least limit this filter to our own metrics.
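A rule like this has to go under metric_relabel_configs for the scrape job, since the "server" label only exists on the scraped samples and not on the target itself. A minimal sketch of the placement follows; the job name and target address are hypothetical placeholders, not values from our setup:

```yaml
scrape_configs:
  - job_name: "postgres"             # hypothetical job name
    static_configs:
      - targets: ["localhost:9187"]  # hypothetical exporter address
    metric_relabel_configs:
      # source_labels are joined with ";" by default, so the regex sees
      # "<metric name>;<server value>". It matches only ccp_* series
      # that carry a non-empty server label.
      - source_labels: [__name__, server]
        regex: "ccp_.*;.+"
        action: replace
        target_label: server
        replacement: ""              # an empty value removes the label
```

Series that don't start with ccp_, or that have no "server" label at all, don't match the regex and pass through unchanged.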
I don't find filtering out the "server" label ideal in the long run, but since we're not actually monitoring multiple database servers from a single exporter, it hasn't become an issue yet.
Is there a better way to deal with this change in either Prometheus or Grafana?