TiDB Architecture and Key Features
The TiDB architecture diagram shows three core components: TiDB Server, PD Server, and TiKV Server. A fourth component, TiSpark, addresses users' complex OLAP needs.

TiDB Server receives SQL requests, processes the SQL-related logic, uses PD to find the address of the TiKV node that stores the data needed for the computation, interacts with TiKV to obtain the data, and finally returns the result. TiDB Server is stateless: it stores no data itself and is responsible only for computation. It can therefore be scaled out horizontally without a fixed limit, and it exposes a unified access address to the outside world through a load-balancing component such as LVS, HAProxy, or F5.

The Placement Driver (PD) is the management module for the entire cluster and has three main tasks: first, it stores the cluster's metadata (which TiKV node a given key is stored on); second, it schedules and load-balances the TiKV cluster (for example, migrating data and transferring Raft group leaders); and third, it allocates globally unique, monotonically increasing transaction IDs.
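The third task, handing out globally unique and increasing transaction IDs, can be illustrated with a small sketch. This is not PD's implementation: the class name `TimestampOracle`, the injectable `clock` parameter, and the 18-bit logical counter are assumptions made for illustration. The idea is to combine a millisecond wall-clock value with a counter so that IDs keep increasing even when many requests arrive within the same millisecond or the clock steps backwards.

```python
import threading
import time

LOGICAL_BITS = 18  # assumed width of the logical counter in this sketch


class TimestampOracle:
    """Toy allocator of unique, strictly increasing transaction IDs."""

    def __init__(self, clock=None):
        # clock returns milliseconds; injectable so the sketch is testable
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._lock = threading.Lock()
        self._physical = 0
        self._logical = 0

    def next_ts(self):
        with self._lock:
            now = self._clock()
            if now > self._physical:
                # the clock moved forward: restart the logical counter
                self._physical = now
                self._logical = 0
            else:
                # same millisecond (or clock went backwards): bump the counter
                self._logical += 1
                if self._logical >= (1 << LOGICAL_BITS):
                    # counter exhausted: borrow the next millisecond
                    self._physical += 1
                    self._logical = 0
            return (self._physical << LOGICAL_BITS) | self._logical


if __name__ == "__main__":
    oracle = TimestampOracle()
    print([oracle.next_ts() for _ in range(3)])
```

A single allocator guarded by a lock keeps the sketch simple; in a real cluster only the elected PD leader would hand out IDs, which is what makes them globally ordered.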

PD guarantees the safety of its data through the Raft protocol. The Raft leader handles all operations, and the remaining PD servers exist only to provide high availability. Deploying an odd number of PD nodes is recommended.
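The recommendation to use an odd number of nodes follows from majority-based consensus: a Raft group makes progress only while more than half of its members are reachable. The short sketch below works through that arithmetic; the function names `quorum` and `tolerated_failures` are hypothetical helpers, not part of any PD tooling.

```python
def quorum(nodes: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many members can fail while a majority is still available."""
    return nodes - quorum(nodes)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, quorum(n), tolerated_failures(n))
```

Three nodes tolerate one failure and so do four, while five tolerate two. Going from an odd count to the next even count adds a machine without adding fault tolerance.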

TiKV Server is responsible for storing the data. From the outside, TiKV is a distributed, transactional Key-Value storage engine. The basic unit of data storage is the Region: each Region stores the data for one Key Range (the left-closed, right-open interval from StartKey to EndKey), and each TiKV node is responsible for multiple Regions. TiKV replicates data with the Raft protocol to maintain data consistency and provide disaster recovery. Replicas are managed per Region: the replicas of a Region, placed on different nodes, form a Raft Group and act as copies of one another. PD schedules the load balancing of data across TiKV nodes, and this scheduling is also done per Region.
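To make the left-closed, right-open Key Range concrete, here is a minimal sketch of looking up the Region that owns a key. The `Region` dataclass, the `find_region` function, and the convention that an empty `end_key` means "no upper bound" are assumptions of this sketch rather than the actual TiKV or PD data structures.

```python
import bisect
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Region:
    region_id: int
    start_key: bytes  # inclusive
    end_key: bytes    # exclusive; b"" means unbounded in this sketch


def find_region(regions: List[Region], key: bytes) -> Optional[Region]:
    """Return the Region whose [start_key, end_key) contains key.

    regions must be sorted by start_key and must not overlap.
    """
    starts = [r.start_key for r in regions]
    # index of the last Region whose start_key <= key
    idx = bisect.bisect_right(starts, key) - 1
    if idx < 0:
        return None
    region = regions[idx]
    if region.end_key == b"" or key < region.end_key:
        return region
    return None  # key falls in a gap between Regions


if __name__ == "__main__":
    regions = [
        Region(1, b"", b"g"),
        Region(2, b"g", b"p"),
        Region(3, b"p", b""),
    ]
    # b"g" is the StartKey of Region 2 and the excluded EndKey of Region 1
    print(find_region(regions, b"g").region_id)
```

Because the interval excludes EndKey, a boundary key belongs to exactly one Region, so adjacent Regions can share a boundary without ambiguity.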

TiSpark is the main TiDB component for users' complex OLAP needs. It runs Spark SQL directly on the TiDB storage layer, takes advantage of the distributed TiKV cluster, and integrates with the big data community ecosystem. With TiSpark, TiDB can support both OLTP and OLAP in a single system, so users no longer need to synchronize data between separate systems.

TiDB has two core features: horizontal scaling and high availability.

Horizontal scaling without a fixed limit is a major feature of TiDB, and it covers two aspects: computing capacity and storage capacity. TiDB Server handles SQL requests, so as the business grows, adding TiDB Server nodes increases the overall processing capacity and provides higher throughput. TiKV stores the data, so as the data volume grows, deploying more TiKV Server nodes addresses the problem of data scale. PD schedules data between TiKV nodes per Region and migrates part of the data to the newly added nodes. In the early stage of a business you can therefore deploy only a small number of service instances (at least 3 TiKV, 3 PD, and 2 TiDB), and add TiKV or TiDB instances as needed when the business grows.
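The migration of Regions to a newly added TiKV node can be pictured with a deliberately simplified sketch. It balances only the number of Regions per node; the function name `rebalance` and the node names are hypothetical, and the real PD scheduler weighs more factors (replica placement, leader distribution, load) that are ignored here.

```python
from typing import Dict, List, Tuple


def rebalance(stores: Dict[str, List[int]]) -> List[Tuple[int, str, str]]:
    """Move Regions from the fullest node to the emptiest one until the
    Region counts differ by at most one. Mutates stores in place and
    returns the moves as (region_id, source, destination)."""
    moves = []
    while True:
        src = max(stores, key=lambda s: len(stores[s]))
        dst = min(stores, key=lambda s: len(stores[s]))
        if len(stores[src]) - len(stores[dst]) <= 1:
            return moves
        region_id = stores[src].pop()
        stores[dst].append(region_id)
        moves.append((region_id, src, dst))


if __name__ == "__main__":
    cluster = {
        "tikv-1": [1, 2, 3],
        "tikv-2": [4, 5, 6],
        "tikv-3": [7, 8, 9],
        "tikv-4": [],  # newly added, empty node
    }
    for move in rebalance(cluster):
        print(move)
    print({name: len(ids) for name, ids in cluster.items()})
```

Moving one Region at a time mirrors the point in the text that scheduling happens per Region: the new node takes over whole Regions rather than arbitrary slices of data.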

High availability is the other core feature of TiDB. All three components, TiDB, TiKV, and PD, can tolerate the failure of some instances without affecting the availability of the entire cluster. The following describes the availability of each of these components, what happens when a single instance fails, and how to recover.