
Commit d69dde0

committed
for save
1 parent e54d76c commit d69dde0


3 files changed: +33 -33 lines changed


command-line-flags-for-pd-configuration.md

Lines changed: 1 addition & 1 deletion
@@ -110,7 +110,7 @@ PD is configurable using command-line flags and environment variables.
110110
111111
- Force to create a new cluster using current nodes.
112112
- Default: `false`
113-
- It is recommended to use this flag only when recovering services due to PD loses most of the data, which might cause data loss.
113+
- It is recommended to use this flag only when recovering services because PD has lost most replicas, which might cause data loss. See the sketch below for an example invocation.
114114
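For illustration, here is a minimal sketch of how the flag might be passed when restarting a surviving PD node to rebuild the cluster; the node name, data directory, and URLs are placeholders for a typical deployment, not values from this commit:

```shell
# Sketch only: restart a surviving PD node and force it to form a new cluster.
# The name, data directory, and URLs below are illustrative placeholders.
pd-server --name="pd-1" \
    --data-dir="/data/pd-1" \
    --client-urls="http://192.168.1.101:2379" \
    --peer-urls="http://192.168.1.101:2380" \
    --force-new-cluster
```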
115115
## `-V`, `--version`
116116

tikv-control.md

Lines changed: 20 additions & 20 deletions
@@ -463,6 +463,26 @@ success!
463463
> - The argument of the `-p` option specifies the PD endpoints without the `http` prefix. Specifying the PD endpoints is to query whether the specified `region_id` is validated or not.
464464
> - You need to run this command for all stores where specified Regions' peers are located.
465465
466+
### Recover from ACID inconsistency data
467+
468+
To recover data from ACID inconsistency caused by, for example, the loss of most replicas or incomplete data synchronization, you can use the `reset-to-version` command. When using this command, you need to provide an old version number at which ACID consistency is guaranteed. Then, `tikv-ctl` cleans up all data written after the specified version.
469+
470+
- The `-v` option is used to specify the version number to restore to. To get the value of the `-v` parameter, you can use the `pd-ctl min-resolved-ts` command (see the sketch after the notes below).
471+
472+
```shell
473+
tikv-ctl --host 127.0.0.1:20160 reset-to-version -v 430315739761082369
474+
```
475+
476+
```
477+
success!
478+
```
479+
480+
> **Note:**
481+
>
482+
> - The preceding command only supports the online mode. Before executing the command, you need to stop processes that will write data to TiKV, such as TiDB. After the command is executed successfully, it will return `success!`.
483+
> - You need to execute the same command for all TiKV nodes in the cluster.
484+
> - All PD scheduling tasks should be stopped before executing the command.
485+
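To complement the `-v` description above, the following sketch shows one way to obtain the version from PD Control and pass it to `reset-to-version`; the PD address is a placeholder and the timestamp reuses the example value from this section:

```shell
# Sketch only: the PD address below is a placeholder.
# 1. Query PD for the minimum resolved timestamp and note the returned value.
pd-ctl -u http://127.0.0.1:2379 min-resolved-ts

# 2. On every TiKV node, clean up data written after that version
#    (the timestamp below reuses the example value from this section).
tikv-ctl --host 127.0.0.1:20160 reset-to-version -v 430315739761082369
```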
466486
### Ldb Command
467487
468488
The `ldb` command line tool offers multiple data access and database administration commands. Some examples are listed below. For more information, refer to the help message displayed when running `tikv-ctl ldb` or check the documents from RocksDB.
@@ -580,23 +600,3 @@ From the output above, you can see that the information of the damaged SST file
580600
+ In the `sst meta` part, `14` means the SST file number; `552997` means the file size, followed by the smallest and largest sequence numbers and other meta-information.
581601
+ The `overlap region` part shows the information of the Region involved. This information is obtained through the PD server.
582602
+ The `suggested operations` part provides you suggestion to clean up the damaged SST file. You can take the suggestion to clean up files and restart the TiKV instance.
583-
584-
### Recover ACID inconsistency data
585-
586-
On the cluster, if there are problems such as data loss of most replicas or incomplete data synchronization, resulting in data containing incomplete transactions, you can use the `reset-to-version`command to recover data with inconsistent ACID. At this time, you need to provide an old version number that guarantees ACID consistency and then `tikv-ctl` erases all data after this version.
587-
588-
- The `-v` option is used to specify the version number to restore. To obtain the `-v` parameter, you can use the `pd-ctl min-resolved-ts`.
589-
590-
```shell
591-
tikv-ctl --host 127.0.0.1:20160 reset-to-version -v 430315739761082369
592-
```
593-
594-
```
595-
success!
596-
```
597-
598-
> **Note:**
599-
>
600-
> - This command only supports online mode. Before executing the command, you need to stop processes such as TiDB that will write data to TiKV. After the command is executed successfully, it will return `success!`.
601-
> - You need to execute the same command for all TiKV nodes in the cluster.
602-
> - All scheduling of the PD should be stopped before the command.

two-data-centers-in-one-city-deployment.md

Lines changed: 12 additions & 12 deletions
@@ -273,32 +273,32 @@ The details for the status switch are as follows:
273273

274274
### Disaster recovery
275275

276-
This section introduces the disaster recovery solution of the two data centers in one city deployment. The disaster discussed in this section is the overall failure of the primary data center, or the multiple TiKV nodes in the primary/secondary data center fail, resulting in the loss of most replicas and it is unable to provide services.
276+
This section introduces the disaster recovery solution of the two data centers in one city deployment. The disaster discussed in this section is the overall failure of the primary data center, or the failure of multiple TiKV nodes in the primary or secondary data center, which results in the loss of most replicas and makes the cluster unable to provide services.
277277

278278
#### Overall failure of the primary data center
279279

280-
In this situation, all Regions in the primary data center have lost most of their replicas, so the cluster is unable to use. At this time, it is necessary to use the secondary data center to recover the service. The recovery ability is determined by the replication status before failure:
280+
In this situation, all Regions in the primary data center have lost most of their replicas, so the cluster is unavailable. At this time, it is necessary to use the secondary data center to recover the service. The replication status before the failure determines the recovery ability:
281281

282282
- If the status before failure is in the synchronous replication mode (the status code is `sync` or `async_wait`), you can use the secondary data center to recover using `RPO = 0`.
283283

284-
- If the status before failure is in the asynchronous replication mode (the status code is `async`), the written data in the primary data center in the asynchronous replication mode is lost after using the secondary data center to recover. A typical scenario is that the primary cluster disconnects with the secondary cluster and the primary cluster switches to the asynchronous replication mode and provides service for a period of time before overall failure.
284+
- If the status before the failure is the asynchronous replication mode (the status code is `async`), the data written to the primary data center in the asynchronous replication mode is lost after the service is recovered using the secondary data center. A typical scenario is that the primary data center disconnects from the secondary data center, switches to the asynchronous replication mode, and provides service for a while before the overall failure.
285285

286-
- If the status before failure is switching from the asynchronous to synchronous (the status code is `sync-recover`), part of the written data in the primary data center in the asynchronous replication mode is lost after using the secondary data center to recover . This might cause the ACID inconsistency, and you need to recover ACID consistency additionally. A typical scenario is that the the primary cluster disconnects with the secondary cluster, the connection is restored after switching to asynchronous mode and data is written. But during the data synchronization between primary and secondary, something goes wrong and causes overall failure of the primary data center.
286+
- If the status before the failure is switching from the asynchronous mode to the synchronous mode (the status code is `sync-recover`), part of the data written to the primary data center in the asynchronous replication mode is lost after the service is recovered using the secondary data center. This might cause ACID inconsistency, which you need to recover additionally. A typical scenario is that the primary data center disconnects from the secondary data center, the connection is restored after the switch to the asynchronous mode, and data is written. But during the data synchronization between the primary and secondary data centers, something goes wrong and causes the overall failure of the primary data center.
287287

288288
The process of disaster recovery is as follows:
289289

290-
1. Stop all PD, TiKV and TiDB services of the secondary data center.
290+
1. Stop all PD, TiKV, and TiDB services of the secondary data center.
291291

292-
2. Start PD nodes of the secondary data center using a replica mode with the [`--force-new-cluster`](/command-line-flags-for-pd-configuration.md#--force-new-cluster) flag.
292+
2. Start the PD nodes of the secondary data center in the single-replica mode with the [`--force-new-cluster`](/command-line-flags-for-pd-configuration.md#--force-new-cluster) flag.
293293

294-
3. Use [Online Unsafe Recovery](/online-unsafe-recovery.md) to process the TiKV data in the secondary data center and the parameters are the list of all Store IDs in the primary data center.
294+
3. Use [Online Unsafe Recovery](/online-unsafe-recovery.md) to process the TiKV data in the secondary data center, with the list of all Store IDs in the primary data center as the parameters (see the command sketch at the end of this section).
295295

296-
4. Write a new placement rule configuration using [PD Control](/pd-control.md), and the Voter replica configuration of the Region is the same as the original cluster in the secondary data center.
296+
4. Write a new placement rule configuration using [PD Control](/pd-control.md), so that the Voter replica configuration of Regions in the secondary data center is the same as that of the original cluster.
297297

298-
5. Start the PD and TiKV services of the primary cluster.
298+
5. Start the PD and TiKV services of the primary data center.
299299

300-
6. To recover ACID consistency (the status of `DR_STATE` in the old PD is `sync-recover`), you can use [`reset-to-version`](/tikv-control.md#recover-acid-inconsistency-data) to process TiKV data and the `version` parameter used can be obtained from `pd-ctl min-resolved-ts`.
300+
6. To recover ACID consistency (when the status of `DR_STATE` in the old PD is `sync-recover`), you can use [`reset-to-version`](/tikv-control.md#recover-from-acid-inconsistency-data) to process the TiKV data. The `version` parameter can be obtained from `pd-ctl min-resolved-ts`.
301301

302-
7. Start the TiDB service in the primary cluster and check the data integrity and consistency.
302+
7. Start the TiDB service in the primary data center and check the data integrity and consistency.
303303

304-
If you need a support for disaster recovery, you can contact the TiDB team for a recovery solution.
304+
If you need support for disaster recovery, you can contact the TiDB team for a recovery solution.
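As a rough illustration of steps 2, 3, and 6 above, the following sketch assembles the commands involved; all names, addresses, Store IDs, and the timestamp are placeholders, and the exact invocations should be verified against the linked documents:

```shell
# Sketch only: names, addresses, Store IDs, and the timestamp are placeholders.

# Step 2: restart a PD node of the secondary data center, forcing a new cluster.
pd-server --name="pd-dc2-1" --data-dir="/data/pd-dc2-1" --force-new-cluster

# Step 3: Online Unsafe Recovery, passing the Store IDs of the primary data center.
pd-ctl -u http://127.0.0.1:2379 unsafe remove-failed-stores 1,2,3

# Step 6: get the version from PD, then clean up later data on every TiKV node.
pd-ctl -u http://127.0.0.1:2379 min-resolved-ts
tikv-ctl --host 127.0.0.1:20160 reset-to-version -v 430315739761082369
```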
