New "server" label breaks existing installations in Grafana #339

Open
@keithf4

Description

The issue I'm facing is with the multi-server update, which I'm very grateful for. However, it added a new, automatic label called "server" that gives the IP:PORT of the database server it's connecting to. It's a useful label, but because Prometheus treats a metric with a different label set as a completely different series, each metric now shows up as a brand-new series in Grafana's regular graphs. Any graph whose time range spans the update gets new line colors and duplicated legend entries for every metric. That's not so bad and can sort of be dealt with.

What does break, however, is any SingleStat panel that uses these metrics. It throws a multi-series error for any time frame that includes periods both before and after the upgrade, because you're asking for a single named metric and getting back two series with different label sets.
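As a rough illustration of why the panel sees two series (a toy model in Python, not Prometheus code; the metric name `ccp_connections` and all label values here are made up), a series is identified by its metric name plus its full label set:

```python
# Toy model: a series is identified by its metric name plus its full label set.
def series_id(name, labels):
    return (name, frozenset(labels.items()))


# Hypothetical samples for one metric, before and after the exporter upgrade.
before = series_id("ccp_connections", {"instance": "db1:9187", "job": "pg"})
after = series_id(
    "ccp_connections",
    {"instance": "db1:9187", "job": "pg", "server": "10.0.0.5:5432"},
)


def distinct_series(samples):
    """Count how many different series identities appear in a list of samples."""
    return len(set(samples))


# A time range that spans the upgrade contains samples with both identities.
count = distinct_series([before, before, after, after])
print(count)  # 2
```

A panel that expects exactly one series gets two here, which is the shape of the multi-series error.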

Also, even on a new install where everything has the server label from the beginning, we were using the pg_up metric from postgres_exporter in some simple math to get server status. For some reason, the pg_up metric doesn't include the server label. So any expression that combines another postgres_exporter metric with pg_up is now completely broken, because pg_up's labels aren't consistent with all the other metrics. You can't do plain math between metrics whose label sets don't match.
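To make the mismatch concrete, here is a small sketch of one-to-one label matching between two metrics. It is a simplified Python model, not how Prometheus is implemented, and the metric values and label values are invented. The `ignoring` argument is modelled on PromQL's `ignoring(...)` matching modifier, which I have not tested against this setup.

```python
def match_key(labels, ignoring=()):
    """Label set used for matching, with any ignored label names left out."""
    return frozenset((k, v) for k, v in labels.items() if k not in ignoring)


def multiply(left, right, ignoring=()):
    """Pair up series whose match keys are equal and multiply their values."""
    right_by_key = {match_key(lbls, ignoring): val for lbls, val in right}
    out = []
    for lbls, val in left:
        key = match_key(lbls, ignoring)
        if key in right_by_key:
            out.append((dict(key), val * right_by_key[key]))
    return out


# Hypothetical samples: a custom metric carries "server", pg_up does not.
ccp = [({"instance": "db1:9187", "server": "10.0.0.5:5432"}, 42.0)]
pg_up = [({"instance": "db1:9187"}, 1.0)]

strict = multiply(ccp, pg_up)                         # label sets differ
relaxed = multiply(ccp, pg_up, ignoring=("server",))  # "server" left out

print(strict)   # []
print(relaxed)  # [({'instance': 'db1:9187'}, 42.0)]
```

With strict matching the result is empty, which is what we see with pg_up today.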

The only solution I've found is to strip the server label in the Prometheus config, so things are back to the way they were before.

    # Blank out "server" on our ccp_* metrics; an empty value removes the label
    - source_labels: [__name__, server]
      regex: "ccp_.*;.+"
      action: replace
      target_label: server
      replacement: ""

All of our own custom queries start with ccp_, so we can at least limit this filter to our own metrics.
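For anyone unsure what that rule does, here is a simplified Python model of a `replace` relabel step. It assumes the documented behaviour that source label values are joined with `;`, the regex must match the whole joined string, and a label set to an empty value is dropped. Capture-group expansion in `replacement` is deliberately not modelled, and the metric names and label values are made up.

```python
import re


def apply_replace(labels, source_labels, regex, target_label, replacement,
                  separator=";"):
    """Simplified 'replace' relabel action (no capture-group expansion)."""
    # Missing source labels contribute an empty string to the joined value.
    joined = separator.join(labels.get(name, "") for name in source_labels)
    if re.fullmatch(regex, joined) is None:
        return dict(labels)  # rule does not apply, labels unchanged
    out = dict(labels)
    if replacement == "":
        out.pop(target_label, None)  # an empty value removes the label
    else:
        out[target_label] = replacement
    return out


rule = dict(source_labels=["__name__", "server"], regex="ccp_.*;.+",
            target_label="server", replacement="")

# Hypothetical series: one of our ccp_ metrics and one built-in metric.
ours = apply_replace({"__name__": "ccp_connections", "server": "10.0.0.5:5432"}, **rule)
builtin = apply_replace({"__name__": "pg_stat_x", "server": "10.0.0.5:5432"}, **rule)

print(ours)     # {'__name__': 'ccp_connections'}
print(builtin)  # {'__name__': 'pg_stat_x', 'server': '10.0.0.5:5432'}
```

Only the ccp_ series loses its server label; everything else passes through untouched.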

I don't find filtering out the server label ideal in the long run, but since we're not actually monitoring multiple database servers from a single exporter, it hasn't become an issue yet.

Is there a better way to deal with this change in either Prometheus or Grafana?
