Commit f66d3f9 (parent 7f40046): readme-en.md

2 files changed: +123 -0 lines

README-en.md (new file, +122)
x-pipe
================

[![Build Status](https://travis-ci.com/ctripcorp/x-pipe.svg?branch=master)](https://travis-ci.com/ctripcorp/x-pipe)
[![Coverity Scan Build Status](https://scan.coverity.com/projects/8884/badge.svg)](https://scan.coverity.com/projects/ctripcorp-x-pipe)
[![github CI](https://github.com/ctripcorp/x-pipe/actions/workflows/build.yml/badge.svg?branch=master)](https://github.com/ctripcorp/x-pipe/actions/workflows/build.yml)
[![codecov](https://codecov.io/gh/ctripcorp/x-pipe/branch/master/graph/badge.svg?token=wj3MUNTPcF)](https://codecov.io/gh/ctripcorp/x-pipe)

See the [中文文档](https://github.com/ctripcorp/x-pipe/blob/master/README.md) for the Chinese readme.

<!-- MarkdownTOC -->
- [What Problems Does XPipe Solve](#xpipe-解决什么问题)
- [System Details](#系统详述)
- [Overall Architecture](#整体架构)
- [Redis Data Replication Issues](#redis-数据复制问题)
- [Data Center Switching](#机房切换)
- [Switching Process](#切换流程)
- [High Availability](#高可用)
- [XPipe System High Availability](#xpipe-系统高可用)
- [Redis's Own High Availability](#redis-自身高可用)
- [Test Data](#测试数据)
- [Latency Test](#延时测试)
- [Cross-Public-Network Deployment and Architecture](#跨公网部署及架构)
- [Docker Quick Start](#docker快速启动)
- [In-Depth Understanding](#深入了解)
- [Technical Exchange](#技术交流)
- [License](#license)

<!-- /MarkdownTOC -->

<a name="xpipe-解决什么问题"></a>
# What Problems Does XPipe Solve
Redis is widely used within Ctrip: client-side statistics show a total of 20 million read and write requests per second across all Redis instances, of which roughly 1 million per second are writes. Many businesses even use Redis as a persistent in-memory database. This creates a strong demand for running Redis across multiple data centers, primarily to improve availability, provide data center disaster recovery (DR), and improve access performance. XPipe was developed to meet these needs.

For brevity, DC (Data Center) is used below to denote a data center.

<a name="系统详述"></a>
# System Details
<a name="整体架构"></a>
## Overall Architecture
The overall architecture is shown below:
![design](https://raw.github.com/ctripcorp/x-pipe/master/doc/image/total.jpg)

- Console: manages metadata for multiple data centers and provides a user interface for configuration and DR switching operations.
- Keeper: caches Redis operation logs and handles cross-data-center transfer, including compression and encryption.
- Meta Server: manages the state of all keepers within a single data center and corrects abnormal states.
<a name="redis-数据复制问题"></a>
## Redis Data Replication Issues
The primary challenge in a multi-data-center deployment is data replication: how to move data from one DC to another. XPipe adopts a pseudo-slave approach: it implements the Redis replication protocol and masquerades as a Redis slave, so the Redis master pushes data to the pseudo-slave, called a keeper. The following diagram illustrates this:
![keepers](https://raw.github.com/ctripcorp/x-pipe/master/doc/image/keepers.jpg)

Advantages of using a keeper:

- Fewer full syncs on the master: the keeper caches the RDB file and the replication log, so slaves in remote DCs can fetch data directly from the keeper, improving master stability.
- Less cross-data-center network traffic: data between two data centers is transmitted through the keeper only once, and the keeper-to-keeper transmission protocol can be customized to support compression (not yet implemented).
- Fewer full syncs after network problems: the keeper caches Redis log data on disk, so it can hold a large amount of log data and keep transmitting it even after a prolonged network outage between data centers.
- Better security: cross-data-center traffic often traverses public networks, so data security is crucial; keeper-to-keeper transmission can also be encrypted (not yet implemented), improving security.
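At the protocol level, a pseudo-slave simply speaks Redis replication: it sends a `PSYNC` command and reads the master's reply. As a rough illustration only (not XPipe's actual code; the function names here are made up), this is the RESP framing of a first-time `PSYNC ? -1` and the parsing of a `+FULLRESYNC <replid> <offset>` reply:

```python
def encode_resp_command(*args: str) -> bytes:
    """Encode a Redis command as a RESP array of bulk strings."""
    out = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)

def parse_fullresync(reply: bytes) -> tuple[str, int]:
    """Parse a '+FULLRESYNC <replid> <offset>\r\n' simple-string reply."""
    line = reply.decode().strip()
    if not line.startswith("+FULLRESYNC "):
        raise ValueError(f"unexpected reply: {line!r}")
    _, replid, offset = line.split()
    return replid, int(offset)

# A first-time pseudo-slave knows no replication ID or offset, hence "? -1".
cmd = encode_resp_command("PSYNC", "?", "-1")
assert cmd == b"*3\r\n$5\r\nPSYNC\r\n$1\r\n?\r\n$2\r\n-1\r\n"
```

After the `+FULLRESYNC` reply, the master streams the RDB payload followed by the incremental command stream, which is exactly what the keeper caches on disk for remote slaves.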

<a name="机房切换"></a>
## Data Center Switching
<a name="切换流程"></a>
### Switching Process
- Check whether DR switching can be performed: similar to the prepare phase of the 2PC protocol, this up-front check helps the rest of the flow proceed smoothly.
- Disable writes in the original master data center: this guarantees that only one master exists during migration, avoiding data loss from concurrent writes to two masters.
- Promote the new master in the target data center.
- Point the other data centers at the new master for synchronization.

Rollback and retry are both supported. Rollback reverts the system to its initial state; retry lets a DBA fix the abnormal condition manually and then continue the switch.
<a name="高可用"></a>
## High Availability
<a name="xpipe-系统高可用"></a>
### XPipe System High Availability
If a keeper goes down, data transmission between data centers may be interrupted. To address this, each keeper has a primary and a backup node: the backup continuously replicates data from the primary, and if the primary goes down, the backup is promoted to primary and continues service. This promotion is performed by a third party, the MetaServer, which is responsible for transitioning keeper states and storing the data center's internal metadata. The MetaServer is itself highly available: each MetaServer is responsible for specific Redis clusters, and when a MetaServer node goes down, another node takes over its clusters; when a new node joins, load is rebalanced automatically by transferring some clusters to it.
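The takeover-and-rebalance behavior described above can be illustrated with a toy assignment function (the round-robin policy here is an assumption for illustration, not XPipe's actual balancing algorithm):

```python
def rebalance(assignment: dict[str, list[str]], nodes: list[str]) -> dict[str, list[str]]:
    """Reassign Redis clusters across the currently alive MetaServer nodes,
    keeping per-node cluster counts within one of each other."""
    clusters = sorted(c for owned in assignment.values() for c in owned)
    alive = sorted(nodes)
    new: dict[str, list[str]] = {n: [] for n in alive}
    for i, cluster in enumerate(clusters):
        new[alive[i % len(alive)]].append(cluster)  # round-robin spread
    return new

# Two MetaServers share four clusters; one node dies, then a new node joins.
a = rebalance({"ms1": ["c1", "c2"], "ms2": ["c3", "c4"]}, ["ms1", "ms2"])
a = rebalance(a, ["ms1"])        # ms2 down: ms1 takes over its clusters
assert a == {"ms1": ["c1", "c2", "c3", "c4"]}
a = rebalance(a, ["ms1", "ms3"])  # new node joins: load is rebalanced
assert all(len(v) == 2 for v in a.values())
```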

<a name="redis-自身高可用"></a>
### Redis's Own High Availability
Redis itself provides the Sentinel mechanism for cluster high availability. However, in versions prior to Redis 4.0, promoting a new master forces every other node to perform a full resynchronization when it connects to the new master. This makes slaves unavailable during the full sync, reduces master availability because of the RDB dump, and destabilizes the whole system because large amounts of data (RDB files) are transferred inside the cluster.

At the time of writing, Redis 4.0 has not been released, and the version used internally at Ctrip is 2.8.19. To address this, we optimized Redis based on 3.0.7, implementing the psync2.0 protocol for incremental synchronization after a master switch. The Redis author's introduction to the protocol can be found [here](https://gist.github.com/antirez/ae068f95c0d084891305).

[Ctrip's internal Redis fork](https://github.com/ctripcorp/redis)

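The core idea behind psync2-style incremental synchronization can be shown with a simplified decision function (a conceptual sketch, not the real implementation, which additionally checks the offset at which the replication ID changed): a slave may resume incrementally only if its replication ID belongs to the new master's lineage and its offset is still inside the master's backlog.

```python
def can_partial_resync(slave_replid: str, slave_offset: int,
                       master_replid: str, master_replid2: str,
                       backlog_start: int, backlog_end: int) -> bool:
    """Decide full vs. partial resync, in the spirit of psync2:
    the new master remembers the previous master's replication ID
    (replid2), so slaves of the old master can resync incrementally."""
    same_history = slave_replid in (master_replid, master_replid2)
    in_backlog = backlog_start <= slave_offset <= backlog_end
    return same_history and in_backlog

# Slave replicated from the old master (id "old"); the new master keeps
# "old" as its secondary replication ID, so a partial resync is possible.
assert can_partial_resync("old", 500, "new", "old", 100, 900)
# An offset that has fallen out of the backlog still forces a full sync.
assert not can_partial_resync("old", 50, "new", "old", 100, 900)
```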
<a name="测试数据"></a>
## Test Data
<a name="延时测试"></a>
### Latency Test
#### Test Plan
The test method is illustrated below: data is sent from the client to the master, and the slave notifies the client via a keyspace notification. The measured latency is the sum of t1, t2, and t3.
![test](https://raw.github.com/ctripcorp/x-pipe/master/doc/image/delay.jpg)

#### Test Data
As a baseline, direct replication from a Redis master to a slave showed a latency of 0.2 ms. Inserting a keeper between the master and the slave added about 0.1 ms, bringing the latency to 0.3 ms.

In production at Ctrip, with a round-trip time (RTT) of roughly 0.61 ms between two data centers, the average latency through two layers of cross-data-center keepers was about 0.8 ms, with a 99.9th-percentile latency of 2 ms.
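Computing such statistics from raw keyspace-notification round trips is straightforward; a minimal sketch (the sample values below are made up):

```python
import statistics

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: smallest sample covering fraction p of the data."""
    ranked = sorted(samples)
    k = max(0, min(len(ranked) - 1, round(p * len(ranked)) - 1))
    return ranked[k]

latencies_ms = [0.7, 0.8, 0.8, 0.9, 0.8, 2.0, 0.7, 0.8, 0.9, 0.8]
print(f"avg={statistics.mean(latencies_ms):.2f}ms "
      f"p99.9={percentile(latencies_ms, 0.999):.1f}ms")
# prints "avg=0.92ms p99.9=2.0ms"
```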

<a name="跨公网部署及架构"></a>
## Cross-Public-Network Deployment and Architecture
See [cross-public-network deployment and architecture](https://github.com/ctripcorp/x-pipe/blob/master/doc/proxy.md) for details.

<a name="docker快速启动"></a>
# Docker Quick Start
See [docker quick start](https://github.com/ctripcorp/x-pipe/wiki/QuickStart#docker-start) for details.

<a name="深入了解"></a>
# In-Depth Understanding
- [XPipe Wiki](https://github.com/ctripcorp/x-pipe/wiki): read this first if you have any questions
- [XPipe Q&A](https://github.com/ctripcorp/x-pipe/wiki/XPipe-Q&A): a compilation of questions from current users
- [Article] [Ctrip Redis Multi-Data-Center Solution: XPipe](https://mp.weixin.qq.com/s/Q3bt0-5nv8uNMdHuls-Exw?)
- [Article] [Ctrip Redis Overseas Data Synchronization Practice](https://mp.weixin.qq.com/s/LeSSdT6bOEFzZyN26PRVzg)
- [PPT] [Introduction to XPipe Usage within Ctrip](https://docs.c-ctrip.com/files/6/portal/0AS2w12000947w1mw6A59.pptx)

<a name="技术交流"></a>
# Technical Exchange
![tech-support-qq](https://raw.github.com/ctripcorp/x-pipe/master/doc/xpipe_qq.png)

<a name="license"></a>
# License
The project is licensed under the [Apache 2 license](https://github.com/ctripcorp/x-pipe/blob/master/LICENSE).

README.md (+1)

One line was added after the MarkdownTOC block:

See the [English](https://github.com/ctripcorp/x-pipe/blob/master/README-en.md) readme here.