tikv-control.md (+20 -20)
@@ -463,6 +463,26 @@ success!
> - The argument of the `-p` option specifies the PD endpoints without the `http` prefix. Specifying the PD endpoints is to query whether the specified `region_id` is valid or not.
> - You need to run this command for all stores where the specified Regions' peers are located.

+### Recover from ACID inconsistency data
+
+To recover data from ACID inconsistency, such as the loss of most replicas or incomplete data synchronization, you can use the `reset-to-version` command. When using this command, you need to provide an old version number that guarantees ACID consistency. Then, `tikv-ctl` cleans up all data after the specified version.
+
+- The `-v` option specifies the version number to restore. To get the value of the `-v` parameter, you can use the `pd-ctl min-resolved-ts` command.
+> - The preceding command only supports the online mode. Before executing the command, you need to stop processes that write data to TiKV, such as TiDB. After the command is executed successfully, it returns `success!`.
+> - You need to execute the same command for all TiKV nodes in the cluster.
+> - All PD scheduling tasks should be stopped before executing the command.
+
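Below is a minimal sketch of how these two commands fit together. The PD and TiKV addresses and the version number are placeholders, and the exact flags may differ depending on your `pd-ctl` and `tikv-ctl` versions:

```shell
# Get a version (timestamp) that is safe to restore to.
pd-ctl -u http://127.0.0.1:2379 min-resolved-ts

# Roll the data on this TiKV node back to that version (online mode).
# Repeat the same command on every TiKV node in the cluster.
tikv-ctl --host 127.0.0.1:20160 reset-to-version -v 430315739761082369
```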
### Ldb Command
The `ldb` command line tool offers multiple data access and database administration commands. Some examples are listed below. For more information, refer to the help message displayed when running `tikv-ctl ldb` or check the RocksDB documentation.
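As an illustration of the call pattern only (the data directory path below is a placeholder, and the subcommands are assumed to be the standard RocksDB `ldb` ones passed through unchanged):

```shell
# Scan the raw key-value pairs in a TiKV RocksDB instance.
tikv-ctl ldb --db=/path/to/tikv/data/db scan
```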
@@ -580,23 +600,3 @@ From the output above, you can see that the information of the damaged SST file
+ In the `sst meta` part, `14` means the SST file number; `552997` means the file size, followed by the smallest and largest sequence numbers and other meta-information.
+ The `overlap region` part shows the information of the Region involved. This information is obtained through the PD server.
+ The `suggested operations` part provides suggestions for cleaning up the damaged SST file. You can follow the suggestions to clean up the files and restart the TiKV instance.
-
-### Recover ACID inconsistency data
-
-On the cluster, if there are problems such as data loss of most replicas or incomplete data synchronization, resulting in data containing incomplete transactions, you can use the `reset-to-version` command to recover data with inconsistent ACID. At this time, you need to provide an old version number that guarantees ACID consistency and then `tikv-ctl` erases all data after this version.
-
-- The `-v` option is used to specify the version number to restore. To obtain the `-v` parameter, you can use the `pd-ctl min-resolved-ts`.
-> - This command only supports online mode. Before executing the command, you need to stop processes such as TiDB that will write data to TiKV. After the command is executed successfully, it will return `success!`.
-> - You need to execute the same command for all TiKV nodes in the cluster.
-> - All scheduling of the PD should be stopped before the command.
two-data-centers-in-one-city-deployment.md (+12 -12)
@@ -273,32 +273,32 @@ The details for the status switch are as follows:

### Disaster recovery

-This section introduces the disaster recovery solution of the two data centers in one city deployment. The disaster discussed in this section is the overall failure of the primary data center, or the multiple TiKV nodes in the primary/secondary data center fail, resulting in the loss of most replicas and it is unable to provide services.
+This section introduces the disaster recovery solution of the two data centers in one city deployment. The disaster discussed in this section is an overall failure of the primary data center, or a failure of multiple TiKV nodes in the primary or secondary data center that results in the loss of most replicas and makes the cluster unable to provide services.

#### Overall failure of the primary data center

-In this situation, all Regions in the primary data center have lost most of their replicas, so the cluster is unable to use. At this time, it is necessary to use the secondary data center to recover the service. The recovery ability is determined by the replication status before failure:
+In this situation, all Regions in the primary data center have lost most of their replicas, so the cluster is unavailable. In this case, you need to use the secondary data center to recover the service. The replication status before the failure determines the recovery capability:

- If the status before failure is in the synchronous replication mode (the status code is `sync` or `async_wait`), you can use the secondary data center to recover with `RPO = 0`.

-- If the status before failure is in the asynchronous replication mode (the status code is `async`), the written data in the primary data center in the asynchronous replication mode is lost after using the secondary data center to recover. A typical scenario is that the primary cluster disconnects with the secondary cluster and the primary cluster switches to the asynchronous replication mode and provides service for a period of time before overall failure.
+- If the status before the failure is the asynchronous replication mode (the status code is `async`), the data written in the primary data center in the asynchronous replication mode is lost after you use the secondary data center to recover. A typical scenario is that the primary data center disconnects from the secondary data center, switches to the asynchronous replication mode, and provides service for a while before the overall failure.

-- If the status before failure is switching from the asynchronous to synchronous (the status code is `sync-recover`), part of the written data in the primary data center in the asynchronous replication mode is lost after using the secondary data center to recover. This might cause the ACID inconsistency, and you need to recover ACID consistency additionally. A typical scenario is that the the primary cluster disconnects with the secondary cluster, the connection is restored after switching to asynchronous mode and data is written. But during the data synchronization between primary and secondary, something goes wrong and causes overall failure of the primary data center.
+- If the status before the failure is switching from asynchronous to synchronous replication (the status code is `sync-recover`), part of the data written in the primary data center in the asynchronous replication mode is lost after you use the secondary data center to recover. This might cause ACID inconsistency, which you need to recover additionally. A typical scenario is that the primary data center disconnects from the secondary data center, the connection is restored after the switch to the asynchronous mode, and data is written. However, during the data synchronization between the primary and secondary data centers, something goes wrong and causes the overall failure of the primary data center.

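To determine which of these states the cluster was in before the failure, you can check the replication status recorded by PD. A minimal sketch, assuming the PD replication mode status API is exposed at its usual path (the address is a placeholder):

```shell
# Query the replication mode status from a surviving PD node.
# The state reported in the response (for example, sync, async_wait, async, or
# sync_recover) tells you which of the cases above applies.
curl http://127.0.0.1:2379/pd/api/v1/replication_mode/status
```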
The process of disaster recovery is as follows:

-1. Stop all PD, TiKV and TiDB services of the secondary data center.
+1. Stop all PD, TiKV, and TiDB services of the secondary data center.

-2. Start PD nodes of the secondary data center using a replica mode with the [`--force-new-cluster`](/command-line-flags-for-pd-configuration.md#--force-new-cluster) flag.
+2. Start the PD nodes of the secondary data center in the single replica mode with the [`--force-new-cluster`](/command-line-flags-for-pd-configuration.md#--force-new-cluster) flag.

-3. Use [Online Unsafe Recovery](/online-unsafe-recovery.md) to process the TiKV data in the secondary data center and the parameters are the list of all Store IDs in the primary data center.
+3. Use [Online Unsafe Recovery](/online-unsafe-recovery.md) to process the TiKV data in the secondary data center, passing the list of all Store IDs in the primary data center as the parameters.

-4. Write a new placement rule configuration using [PD Control](/pd-control.md), and the Voter replica configuration of the Region is the same as the original cluster in the secondary data center.
+4. Write a new placement rule configuration using [PD Control](/pd-control.md) so that the Voter replica configuration of the Regions in the secondary data center is the same as in the original cluster.

-5. Start the PD and TiKV services of the primary cluster.
+5. Start the PD and TiKV services of the primary data center.

-6. To recover ACID consistency (the status of `DR_STATE` in the old PD is `sync-recover`), you can use [`reset-to-version`](/tikv-control.md#recover-acid-inconsistency-data) to process TiKV data and the `version` parameter used can be obtained from `pd-ctl min-resolved-ts`.
+6. To recover ACID consistency (if the `DR_STATE` status in the old PD is `sync-recover`), use [`reset-to-version`](/tikv-control.md#recover-from-acid-inconsistency-data) to process the TiKV data. The required `version` parameter can be obtained from `pd-ctl min-resolved-ts`.

-7. Start the TiDB service in the primary cluster and check the data integrity and consistency.
+7. Start the TiDB service in the primary data center and check the data integrity and consistency.

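The `pd-ctl` and `tikv-ctl` invocations behind steps 2 to 6 might look roughly like the following. This is a sketch only: all addresses, store IDs, file names, and the version value are placeholders, and the exact flags depend on your deployment and tool versions:

```shell
# Step 2: start a PD node of the secondary data center as a new single-replica cluster.
pd-server --force-new-cluster <your-usual-pd-startup-flags>

# Step 3: Online Unsafe Recovery, listing the Store IDs of the primary data center.
pd-ctl -u http://127.0.0.1:2379 unsafe remove-failed-stores 1,2,3
pd-ctl -u http://127.0.0.1:2379 unsafe remove-failed-stores show   # check the progress

# Step 4: save a new placement rule configuration prepared in rules.json.
pd-ctl -u http://127.0.0.1:2379 config placement-rules save --in=rules.json

# Step 6: if DR_STATE was sync-recover, roll every TiKV node back to a consistent version.
pd-ctl -u http://127.0.0.1:2379 min-resolved-ts
tikv-ctl --host 127.0.0.1:20160 reset-to-version -v <version-from-min-resolved-ts>
```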
-If you need a support for disaster recovery, you can contact the TiDB team for a recovery solution.
+If you need support for disaster recovery, you can contact the TiDB team for a recovery solution.