diff --git a/TOC.md b/TOC.md
index 9ac74ccbe7028..44ba9bed50077 100644
--- a/TOC.md
+++ b/TOC.md
@@ -165,6 +165,8 @@
- [Maintain TiDB Using TiUP](/maintain-tidb-using-tiup.md)
- [Modify Configuration Dynamically](/dynamic-config.md)
- [Online Unsafe Recovery](/online-unsafe-recovery.md)
+ - [Use Witness Replicas to Save Costs](/use-witness-to-save-costs.md)
+ - [Use Witness Replicas to Speed Up Failover](/use-witness-to-speed-up-failover.md)
- [Replicate Data Between Primary and Secondary Clusters](/replicate-between-primary-and-secondary-clusters.md)
- Monitor and Alert
- [Monitoring Framework Overview](/tidb-monitoring-framework.md)
diff --git a/configure-placement-rules.md b/configure-placement-rules.md
index eb5312c6f115e..a643fdc13d40a 100644
--- a/configure-placement-rules.md
+++ b/configure-placement-rules.md
@@ -12,7 +12,7 @@ aliases: ['/docs/dev/configure-placement-rules/','/docs/dev/how-to/configure/pla
Placement Rules, introduced in v5.0, is a replica rule system that guides PD to generate corresponding schedules for different types of data. By combining different scheduling rules, you can finely control the attributes of any continuous data range, such as the number of replicas, the storage location, the host type, whether to participate in Raft election, and whether to act as the Raft leader.
-The Placement Rules feature is enabled by default in v5.0 and later versions of TiDB. To disable it, refer to [Disable Placement Rules](#disable-placement-rules).
+The Placement Rules feature is enabled by default in v5.0 and later versions of TiDB. To disable it, refer to [Disable Placement Rules](#disable-placement-rules).
## Rule system
@@ -37,6 +37,7 @@ The following table shows the meaning of each field in a rule:
| `StartKey` | `string`, in hexadecimal form | Applies to the starting key of a range. |
| `EndKey` | `string`, in hexadecimal form | Applies to the ending key of a range. |
| `Role` | `string` | Replica roles, including voter/leader/follower/learner. |
+| `IsWitness` | `true`/`false` | Whether the replica is a [Witness](/glossary.md#witness) replica. |
| `Count` | `int`, positive integer | The number of replicas. |
| `LabelConstraint` | `[]Constraint` | Filters nodes based on the label. |
| `LocationLabels` | `[]string` | Used for physical isolation. |
@@ -486,3 +487,32 @@ The rule group:
"override": true,
}
```
+
+### Scenario 6: Configure Witness replicas in a highly reliable storage environment
+
+The following rule shows how to configure `IsWitness`, using Amazon EBS as an example of saving costs by configuring [Witness](/glossary.md#witness) replicas.
+
+The rule is as follows:
+
+```json
+[
+ {
+ "group_id": "pd",
+ "id": "default",
+ "start_key": "",
+ "end_key": "",
+ "role": "voter",
+ "is_witness": false,
+ "count": 2
+ },
+ {
+ "group_id": "pd",
+ "id": "witness",
+ "start_key": "",
+ "end_key": "",
+ "role": "voter",
+ "is_witness": true,
+ "count": 1
+ }
+]
+```
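As an illustration only (not part of TiDB or pd-ctl tooling), a rule file like the one above can be sanity-checked before you save it to PD — for example, verifying that the total voter count matches the intended replica count and that exactly one replica is a Witness. A minimal Python sketch, with the Scenario 6 rules embedded as a string:

```python
import json

# The Scenario 6 rule set from above, embedded for illustration.
RULES_JSON = """
[
  {"group_id": "pd", "id": "default", "start_key": "", "end_key": "",
   "role": "voter", "is_witness": false, "count": 2},
  {"group_id": "pd", "id": "witness", "start_key": "", "end_key": "",
   "role": "voter", "is_witness": true, "count": 1}
]
"""

def summarize_rules(rules_json: str) -> dict:
    """Return the total voter count and Witness replica count for a rule list."""
    rules = json.loads(rules_json)
    voters = sum(r["count"] for r in rules if r["role"] == "voter")
    witnesses = sum(r["count"] for r in rules if r.get("is_witness"))
    return {"voters": voters, "witnesses": witnesses}

print(summarize_rules(RULES_JSON))  # {'voters': 3, 'witnesses': 1}
```

The check confirms that the two rules together still describe three voting replicas, of which one is a Witness.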
\ No newline at end of file
diff --git a/glossary.md b/glossary.md
index 7dc51be06ca62..02f30a858c15d 100644
--- a/glossary.md
+++ b/glossary.md
@@ -159,3 +159,12 @@ Because TiKV is a distributed storage system, it requires a global timing servic
### TTL
[Time to live (TTL)](/time-to-live.md) is a feature that allows you to manage TiDB data lifetime at the row level. For a table with the TTL attribute, TiDB automatically checks data lifetime and deletes expired data at the row level.
+
+## W
+
+### Witness
+
+A Witness replica only stores the most recent Raft logs for majority confirmation, but does not store data. Witness replicas are applicable to the following scenarios:
+
+- Save costs in a highly reliable storage environment. For more details, see [Use Witness replicas to save costs](/use-witness-to-save-costs.md).
+- Quickly recover from any failure to improve system availability. For more details, see [Use Witness replicas to speed up failover](/use-witness-to-speed-up-failover.md).
diff --git a/pd-configuration-file.md b/pd-configuration-file.md
index c1d2a6043cba6..947932c8e0e3e 100644
--- a/pd-configuration-file.md
+++ b/pd-configuration-file.md
@@ -222,6 +222,11 @@ Configuration items related to scheduling
+ Controls the time interval between the `split` and `merge` operations on the same Region. That means a newly split Region will not be merged for a while.
+ Default value: `1h`
+### `switch-witness-interval` New in v7.0.0
+
++ Controls the time interval for switching the same Region between [Witness](/glossary.md#witness) and non-Witness. That means a Region that has just been switched to non-Witness cannot be switched back to Witness for a while.
++ Default value: `1h`
+
### `max-snapshot-count`
+ Controls the maximum number of snapshots that a single store receives or sends at the same time. PD schedulers depend on this configuration to prevent the resources used for normal traffic from being preempted.
@@ -277,6 +282,21 @@ Configuration items related to scheduling
+ The number of the `Region Merge` scheduling tasks performed at the same time. Set this parameter to `0` to disable `Region Merge`.
+ Default value: `8`
+### `witness-schedule-limit` New in v7.0.0
+
++ Controls the concurrency of Witness scheduling tasks.
++ Default value: `4`
++ Minimum value: `1`
++ Maximum value: `9`
+
+### `enable-witness` New in v7.0.0
+
++ Controls whether to enable the Witness replica feature.
++ Witness replicas are applicable to the following scenarios:
+ - Save costs in a highly reliable storage environment. For more details, see [Use Witness replicas to save costs](/use-witness-to-save-costs.md).
+ - Quickly recover from any failure to improve system availability. For more details, see [Use Witness replicas to speed up failover](/use-witness-to-speed-up-failover.md).
++ Default value: `false`
+
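+In the PD configuration file, the Witness-related items described above belong to the scheduling configuration. The following is an illustrative `pd.toml` fragment (a sketch, assuming the `[schedule]` section used for these items), with `enable-witness` switched on and the other values set to the defaults shown above:
+
+```toml
+[schedule]
+# Enable the Witness replica feature (disabled by default).
+enable-witness = true
+# Concurrency of Witness scheduling tasks.
+witness-schedule-limit = 4
+# Minimum interval for switching the same Region between Witness and non-Witness.
+switch-witness-interval = "1h"
+```
+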
### `high-space-ratio`
+ The threshold ratio below which the capacity of the store is sufficient. If the space occupancy ratio of the store is smaller than this threshold value, PD ignores the remaining space of the store when performing scheduling, and balances load mainly based on the Region size. This configuration takes effect only when `region-score-formula-version` is set to `v1`.
diff --git a/pd-control.md b/pd-control.md
index e8b909674012f..0dcb5378b57d2 100644
--- a/pd-control.md
+++ b/pd-control.md
@@ -140,6 +140,7 @@ Usage:
},
"schedule": {
"enable-cross-table-merge": "true",
+ "enable-witness": "true",
"high-space-ratio": 0.7,
"hot-region-cache-hits-threshold": 3,
"hot-region-schedule-limit": 4,
@@ -1088,6 +1089,17 @@ unsafe remove-failed-stores show
]
```
+To enable the [Witness replica](/glossary.md#witness) feature, run the following command:
+
+```bash
+config set enable-witness true
+```
+
+Witness replicas are applicable to the following scenarios:
+
+- Save costs in a highly reliable storage environment. For more details, see [Use Witness replicas to save costs](/use-witness-to-save-costs.md).
+- Quickly recover from any failure to improve system availability. For more details, see [Use Witness replicas to speed up failover](/use-witness-to-speed-up-failover.md).
+
## Jq formatted JSON output usage
### Simplify the output of `store`
diff --git a/use-witness-to-save-costs.md b/use-witness-to-save-costs.md
new file mode 100644
index 0000000000000..85af57dd22de5
--- /dev/null
+++ b/use-witness-to-save-costs.md
@@ -0,0 +1,48 @@
+---
+title: Use Witness Replicas to Save Costs
+summary: Learn how to use Witness replicas to save costs in a highly reliable storage environment.
+---
+
+# Use Witness Replicas to Save Costs
+
+This document describes how to use Witness replicas to save costs in a highly reliable storage environment. If you need to use Witness replicas to improve durability when a TiKV node is down, refer to [Use Witness replicas to speed up failover](/use-witness-to-speed-up-failover.md).
+
+## Feature description
+
+In cloud environments, it is recommended to use Amazon Elastic Block Store (EBS) with 99.8%~99.9% durability or Persistent Disk of Google Cloud Platform (GCP) with 99.99%~99.999% durability as the storage of each TiKV node. In this case, running TiKV with three full Raft replicas is possible but not always necessary. To reduce costs, TiKV introduces the Witness replica, a "2 Replicas With 1 Log Only" mechanism. The "1 Log Only" replica stores only Raft logs and does not apply data, while data consistency is still ensured through the Raft protocol. Compared with the standard three-replica architecture, a Witness replica saves storage resources and CPU usage.
+
+> **Warning:**
+>
+> The Witness replica feature is introduced in v6.6.0 and is not compatible with earlier versions. Downgrading is not supported.
+
+## User scenarios
+
+In a highly reliable storage environment, such as Amazon EBS (99.8%~99.9% durability) or Persistent Disk of GCP (99.99%~99.999% durability), you can enable and configure Witness replicas to save costs.
+
+## Usage
+
+### Step 1: Enable Witness
+
+To enable Witness, use PD Control to run the `config set enable-witness true` command:
+
+```bash
+pd-ctl config set enable-witness true
+```
+
+If the command returns `Success`, the Witness replica feature is enabled. If you have not configured Witness replicas using Placement Rules, no Witness replicas are created by default. When a TiKV node is down, a Witness replica is added immediately and is later promoted to a normal Voter.
+
+### Step 2: Configure Witness replicas
+
+Assume that three replicas are present. Modify `rule.json` to the configuration in [Scenario 6: Configure Witness replicas in a highly reliable storage environment](/configure-placement-rules.md#scenario-6-configure-witness-replicas-in-a-highly-reliable-storage-environment).
+
+After editing the file, use the following command to save the configuration to the PD server:
+
+```bash
+pd-ctl config placement-rules save --in=rule.json
+```
+
+## Notes
+
+- It is recommended to configure Witness replicas only when TiKV nodes use a highly reliable storage environment, such as Amazon EBS with 99.8%~99.9% durability or Persistent Disk of GCP with 99.99%~99.999% durability.
+- Since a Witness replica does not apply Raft logs, it cannot provide read or write services. When the Leader is down and the remaining Voters do not have the latest Raft logs, Raft elects the Witness replica as the Leader. After being elected, the Witness replica sends Raft logs to the Voters and then transfers the leadership to a Voter. If the Witness replica cannot transfer the leadership in time, the application might receive an `IsWitness` error after the backoff timeout.
+- When a Voter in the system is pending, the system promotes the Witness replica to a normal Voter to prevent the Witness replica from accumulating too many Raft logs and occupying the entire disk space.
diff --git a/use-witness-to-speed-up-failover.md b/use-witness-to-speed-up-failover.md
new file mode 100644
index 0000000000000..0163adc677e69
--- /dev/null
+++ b/use-witness-to-speed-up-failover.md
@@ -0,0 +1,30 @@
+---
+title: Use Witness Replicas to Speed Up Failover
+summary: Learn how to use a Witness replica to speed up failover.
+---
+
+# Use Witness Replicas to Speed Up Failover
+
+This document describes how to use Witness replicas to improve durability when a TiKV node is down. If you need to use Witness replicas to save costs in a highly reliable storage environment, refer to [Use Witness replicas to save costs](/use-witness-to-save-costs.md).
+
+## Feature description
+
+The Witness feature speeds up recovery from failures (failover), improving system availability and data durability. For example, in a Raft group of three replicas, if one replica fails, the system still meets the majority requirement but becomes fragile. Recovering a new member takes a long time (the process first copies a snapshot and then applies the latest logs), especially when the Region snapshot is large. In addition, copying replicas might put more pressure on the unhealthy group members. Adding a Witness replica lets the system quickly remove the unhealthy node, reduces the risk that another node failure makes the Raft group unavailable while a new member is being recovered (a Learner replica cannot participate in election or log commit), and ensures the safety of logs during recovery.
+
+> **Warning:**
+>
+> The Witness replica feature is introduced in v6.6.0 and is not compatible with earlier versions. Downgrading is not supported.
+
+## User scenarios
+
+If you want to recover quickly from failures to improve durability, you only need to enable Witness; you do not need to configure Witness replicas.
+
+## Usage
+
+To enable Witness, use PD Control to run the `config set enable-witness true` command:
+
+```bash
+pd-ctl config set enable-witness true
+```
+
+If the command returns `Success`, the Witness replica feature is enabled. If you have not configured Witness replicas as described in [Use Witness replicas to save costs](/use-witness-to-save-costs.md), no Witness replicas are created by default. When a TiKV node is down, a Witness replica is added immediately and is later promoted to a normal Voter.