Add VictoriaMetrics switch guide for TiUP cluster (#20335) #20477

Merged
33 commits
4f654f0 Add VM switch guide for TiUP cluster (nolouch, May 13, 2025)
0852522 Update maintain-tidb-using-tiup.md (nolouch, May 26, 2025)
0511628 Update maintain-tidb-using-tiup.md (nolouch, May 26, 2025)
7e825ba Update maintain-tidb-using-tiup.md (nolouch, May 26, 2025)
748dd30 Update maintain-tidb-using-tiup.md (nolouch, May 26, 2025)
c8f0797 Update maintain-tidb-using-tiup.md (nolouch, May 26, 2025)
6df96a6 Update maintain-tidb-using-tiup.md (nolouch, May 26, 2025)
69366a5 Update maintain-tidb-using-tiup.md (nolouch, May 26, 2025)
7a29a54 Update maintain-tidb-using-tiup.md (nolouch, May 26, 2025)
fd7779d Update maintain-tidb-using-tiup.md (nolouch, May 26, 2025)
2f4e2ca Update maintain-tidb-using-tiup.md (nolouch, May 26, 2025)
66080c8 Update maintain-tidb-using-tiup.md (nolouch, May 26, 2025)
a9c8abc Update maintain-tidb-using-tiup.md (nolouch, May 26, 2025)
9b8e0ec Update maintain-tidb-using-tiup.md (nolouch, May 26, 2025)
d6f3844 Update maintain-tidb-using-tiup.md (nolouch, May 26, 2025)
14f4e11 Update maintain-tidb-using-tiup.md (nolouch, May 26, 2025)
946c86a Update maintain-tidb-using-tiup.md (nolouch, May 26, 2025)
f6df362 Update maintain-tidb-using-tiup.md (nolouch, May 26, 2025)
09c28c1 Update maintain-tidb-using-tiup.md (nolouch, May 26, 2025)
42c612b Remove unnecessary copyable snippet (lilin90, May 28, 2025)
01774aa Update format (lilin90, May 28, 2025)
43fda70 Add a necessary space for body heading (lilin90, May 28, 2025)
5ace06e Update maintain-tidb-using-tiup.md (lilin90, May 28, 2025)
f1ed389 Keep the order of content consistent with en (lilin90, May 28, 2025)
ce117c2 Fix format (lilin90, May 30, 2025)
a30e299 Update maintain-tidb-using-tiup.md (lilin90, Jun 10, 2025)
18348ea Update maintain-tidb-using-tiup.md (nolouch, Jun 10, 2025)
6fa51d8 Update maintain-tidb-using-tiup.md (nolouch, Jun 10, 2025)
3d9a19a Update maintain-tidb-using-tiup.md (nolouch, Jun 10, 2025)
ea5ee40 Update maintain-tidb-using-tiup.md (nolouch, Jun 10, 2025)
24bf96b Update maintain-tidb-using-tiup.md (nolouch, Jun 10, 2025)
8628151 Update maintain-tidb-using-tiup.md (nolouch, Jun 10, 2025)
a4a6a06 Update wording (lilin90, Jun 10, 2025)
156 changes: 155 additions & 1 deletion maintain-tidb-using-tiup.md
@@ -5,7 +5,7 @@ summary: TiUP is a tool for managing TiDB clusters. It can be used to view the cluster

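# Common TiUP operations

This document describes common operations for maintaining a TiDB cluster using TiUP, including viewing the cluster list, starting a cluster, viewing the cluster status, modifying configuration parameters, stopping a cluster, and destroying a cluster.
This document describes common operations for maintaining a TiDB cluster using TiUP.

## View the cluster list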

@@ -271,3 +271,157 @@ tiup cluster clean ${cluster-name} --all --ignore-node 172.16.13.12
```bash
tiup cluster destroy ${cluster-name}
```

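## Switch from Prometheus to VictoriaMetrics

In large clusters, Prometheus might hit performance bottlenecks when it handles a large number of instances. Starting from TiUP v1.16.3, TiUP supports switching the metrics monitoring component from Prometheus to VictoriaMetrics (VM), which provides better scalability, higher performance, and lower resource consumption.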
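
Because the switch depends on the TiUP version, it can help to compare the installed version against 1.16.3 before editing any configuration. The following is a minimal sketch, not part of TiUP: `version_ge` and the `TIUP_VERSION` variable are hypothetical names, the version string is supplied by you rather than parsed from TiUP output, and `sort -V` is assumed to be available (it is in GNU coreutils).

```shell
# Hypothetical helper: succeeds when version $1 is greater than or equal to $2.
# Relies on `sort -V` (version sort), which GNU coreutils provides.
version_ge() {
  lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)"
  [ "$lowest" = "$2" ]
}

# Assumed input: set TIUP_VERSION to the version your TiUP installation reports,
# without a leading "v". The default below is only a placeholder.
installed="${TIUP_VERSION:-1.16.3}"

if version_ge "$installed" "1.16.3"; then
  echo "TiUP $installed can switch to VictoriaMetrics"
else
  echo "TiUP $installed is older than 1.16.3; upgrade TiUP first"
fi
```

Version sort is used instead of a plain string comparison so that, for example, 1.9.0 is treated as older than 1.16.3.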

### Enable VictoriaMetrics in a new deployment

By default, TiUP uses Prometheus as the metrics monitoring component. To use VictoriaMetrics instead of Prometheus in a new deployment, configure the topology file as follows:

```yaml
# Monitoring server configuration
monitoring_servers:
  # IP address of the monitoring server
  - host: ip_address
    ...
    prom_remote_write_to_vm: true
    enable_prom_agent_mode: true

# Grafana server configuration
grafana_servers:
  # IP address of the Grafana server
  - host: ip_address
    ...
    use_vm_as_datasource: true
```

### Migrate an existing deployment to VictoriaMetrics

You can complete the migration without interrupting services. TiUP keeps the existing metrics data in Prometheus and writes new metrics data to VictoriaMetrics.

#### Enable remote write from Prometheus to VictoriaMetrics

1. Edit the cluster configuration:

    ```bash
    tiup cluster edit-config ${cluster-name}
    ```

2. Under `monitoring_servers`, add `prom_remote_write_to_vm: true`:

    ```yaml
    monitoring_servers:
      - host: ip_address
        ...
        prom_remote_write_to_vm: true
    ```

3. Reload the configuration for the change to take effect:

    ```bash
    tiup cluster reload ${cluster-name} -R prometheus
    ```
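
Both this step and the Grafana data source switch end with a reload scoped to one role through `-R`. As a rough sketch of that sequence, the wrapper below prints each reload command before anything is executed. `reload_role`, `DRY_RUN`, and the cluster name `tidb-test` are hypothetical and only serve the illustration. The `tiup cluster edit-config` steps are interactive and still have to be done by hand before each reload.

```shell
# Hypothetical wrapper around `tiup cluster reload`. With DRY_RUN=1 (the default
# in this sketch) it only prints the command so the sequence can be reviewed.
DRY_RUN="${DRY_RUN:-1}"

reload_role() {
  cluster="$1"
  role="$2"
  if [ "$DRY_RUN" = "1" ]; then
    echo "tiup cluster reload $cluster -R $role"
  else
    tiup cluster reload "$cluster" -R "$role"
  fi
}

# Assumed cluster name; replace it with your own.
CLUSTER_NAME="${CLUSTER_NAME:-tidb-test}"

# Order used in this guide: Prometheus first (remote write), then Grafana (data source).
reload_role "$CLUSTER_NAME" prometheus
reload_role "$CLUSTER_NAME" grafana
```

Setting `DRY_RUN=0` makes the wrapper run the real commands. Reloading only the affected role keeps the other components of the cluster untouched.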

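#### Switch the default Grafana data source to VictoriaMetrics

1. Edit the cluster configuration:

    ```bash
    tiup cluster edit-config ${cluster-name}
    ```

2. Under `grafana_servers`, add `use_vm_as_datasource: true`:

    ```yaml
    grafana_servers:
      - host: ip_address
        ...
        use_vm_as_datasource: true
    ```

3. Reload the configuration for the change to take effect:

    ```bash
    tiup cluster reload ${cluster-name} -R grafana
    ```

#### View historical metrics from before the switch (optional)

If you need to view historical metrics generated before the switch, take the following steps to switch the Grafana data source back to Prometheus:

1. Edit the cluster configuration:

    ```bash
    tiup cluster edit-config ${cluster-name}
    ```

2. Comment out `use_vm_as_datasource` under `grafana_servers`:

    ```yaml
    grafana_servers:
      - host: ip_address
        ...
        # use_vm_as_datasource: true
    ```

3. Reload the configuration for the change to take effect:

    ```bash
    tiup cluster reload ${cluster-name} -R grafana
    ```

4. To switch back to VictoriaMetrics, repeat the steps in [Switch the default Grafana data source to VictoriaMetrics](#switch-the-default-grafana-data-source-to-victoriametrics).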

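### Clean up old metrics and services

After confirming that the old metrics have expired, you can take the following steps to remove the redundant services and files. This does not affect the normal operation of the cluster.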
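
One way to estimate when the old metrics expire is to add the Prometheus retention period to the date of the switch. The sketch below does that arithmetic. `expiry_date` is a hypothetical helper, the 30-day retention is an assumed value that you should replace with the retention configured for your monitoring server, and the `-d` option requires GNU `date`.

```shell
# Hypothetical helper: print the date (UTC) on which data written before the
# switch falls outside the retention window.
# $1 = switch date (YYYY-MM-DD), $2 = retention in days (assumed value).
# Requires GNU date for the -d option.
expiry_date() {
  start_epoch="$(date -u -d "$1" +%s)" || return 1
  date -u -d "@$((start_epoch + $2 * 86400))" +%F
}

# Example with assumed values: switched on 2025-06-10, 30 days of retention.
expiry_date 2025-06-10 30   # prints 2025-07-10
```

The result is only an estimate. Check in Grafana that the pre-switch data is no longer needed before removing anything.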

#### Set Prometheus to agent mode

1. Edit the cluster configuration:

    ```bash
    tiup cluster edit-config ${cluster-name}
    ```

2. Enable agent mode and make sure that the related parameters are configured correctly.

    Under `monitoring_servers`, set `enable_prom_agent_mode` to `true`, and make sure that `prom_remote_write_to_vm` and `use_vm_as_datasource` are also set correctly:

    ```yaml
    monitoring_servers:
      - host: ip_address
        ...
        prom_remote_write_to_vm: true
        enable_prom_agent_mode: true

    grafana_servers:
      - host: ip_address
        ...
        use_vm_as_datasource: true
    ```

3. Reload the configuration for the change to take effect:

    ```bash
    tiup cluster reload ${cluster-name} -R prometheus
    ```

#### Delete the old Prometheus data directory

1. In the configuration file, find the data directory path `data_dir` of the monitoring server:

    ```yaml
    monitoring_servers:
      - host: ip_address
        ...
        data_dir: "/tidb-data/prometheus-8249"
    ```

2. Delete the data directory:

    ```bash
    rm -rf /tidb-data/prometheus-8249
    ```
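
Because `rm -rf` is irreversible, a few sanity checks before deleting can prevent accidents such as an empty or mistyped path. The following is a minimal sketch under stated assumptions: `safe_remove_prom_dir` is a hypothetical helper, and the check that the directory name starts with `prometheus-` is taken from the example path above and might not match your `data_dir`. It is meant to run on the host where the monitoring server stores its data.

```shell
# Hypothetical helper: delete a Prometheus data directory only after basic checks.
safe_remove_prom_dir() {
  dir="$1"
  # Refuse empty paths and the filesystem root.
  if [ -z "$dir" ] || [ "$dir" = "/" ]; then
    echo "refusing to remove '$dir'" >&2
    return 1
  fi
  # Assumption: the directory name starts with "prometheus-", as in the example path.
  case "$(basename "$dir")" in
    prometheus-*) ;;
    *) echo "unexpected directory name: $dir" >&2; return 1 ;;
  esac
  # Fail if the path does not exist or is not a directory.
  if [ ! -d "$dir" ]; then
    echo "not a directory: $dir" >&2
    return 1
  fi
  rm -rf -- "$dir"
}

# Usage, with the data_dir value from your own configuration:
#   safe_remove_prom_dir /tidb-data/prometheus-8249
```

The helper returns a non-zero status instead of deleting whenever a check fails, so it can be used inside a larger script without risking an unintended removal.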