The same is true in the field of databases. Over the past 50 years, databases have gone through a long cycle of integrating, separating and reintegrating OLTP and OLAP workloads, because database development has always tracked changes in users' scenario requirements. Today, with the rise of cloud computing and big data, business scenarios are undergoing unprecedented change, and the database field has set off a wave of HTAP.
Gartner has emphasized in many reports that HTAP is one of the most important development trends in the database field and an important data platform for users' digital transformation. The industry even believes that the rise of HTAP represents the beginning of the era of database integration.
So why are database vendors and cloud service giants betting on HTAP? Why is open source + cloud a booster for HTAP adoption? And with the rise of a new generation of HTAP databases, has the MySQL ecosystem, built up over many years, finally found its best home?
A few years ago, HTAP might have been considered a niche product in the database field, and it remained to be seen whether it would take hold.
As data resources, data consumption habits and data-driven scenarios have changed dramatically, the gap between what users need and what traditional databases can supply has become increasingly prominent. This has made HTAP, a new-era database characterized by "supporting OLTP and OLAP at the same time, innovating the compute and storage architecture, and removing ETL", an irresistible trend.
Today, almost all database vendors and cloud service giants are investing in HTAP. For example, OceanBase officially announced its entry into the HTAP database market with version 3.0, launched last year. In May of this year, Google Cloud released AlloyDB, an HTAP cloud database, to provide HTAP database services for PostgreSQL users. Add Oracle's MySQL HeatWave to the list, and even Snowflake has released Unistore to ride the HTAP wave.
If you count the new HTAP products of the past year, you will find that almost all of them are built on the cloud. The combination of new generation HTAP and cloud is becoming an important trend in the database market. For example, TiDB 6.0, recently released by PingCAP, is also a new generation HTAP database closely tied to the cloud.
In fact, PingCAP is a very important leader in the HTAP database field. As early as TiDB 3.0, PingCAP officially turned to HTAP, moving from an OLTP main engine with auxiliary OLAP capability, to an OLTP engine with an external analysis engine, and then to an OLTP engine with an integrated analysis engine. PingCAP has advanced slowly but surely in the HTAP field, with each version a step up from the last.
Now, with the release of TiDB 6.0, its HTAP capabilities have matured further, and TPC-C performance is 76.32% higher than in version 5.0. TiDB 6.0 has also enhanced a number of enterprise-level features to better meet the needs of cloud-era users for HTAP databases.
Of course, some people dismiss HTAP as old wine in a new bottle, with few new ideas. However, the general view in the industry is that the new generation of HTAP is completely different from the past: it is built on open source and cloud, and many products are enhanced by AI. These databases are born for data agility, show unprecedented innovation and iteration speed, and are gradually forming a new trend in database technology.
PingCAP CTO Huang put it plainly: "The rapid evolution and iteration of TiDB in recent years has benefited from open source and cloud."
HTAP is favored by users, to some extent because users are extremely eager for data agility.
"In the digital age, customers are most concerned about how to go live quickly. This requires data agility, and HTAP is precisely the core capability of data agility," Huang said.
In recent years, the demand for "massive, real-time and online" data has become more and more widespread. A new generation of enterprises using the open source databases MySQL and PostgreSQL need to improve their real-time online analysis of hot data. Almost all internet companies and digitally transforming companies engaged in online business have this demand, and the ability to analyze fresh data in real time directly determines whether these businesses live or die. The traditional OLTP + OLAP + ETL data architecture has seriously degraded the consumer experience, and this demand has given birth to the technological change of HTAP.
What really connects HTAP to users' needs is open source + cloud. The popularity and influence of open source in the database field have been increasing steadily in recent years. According to DB-Engines data, of the 383 databases it tracks worldwide, open source databases account for 51.7%, and six of the top ten are open source. In new areas such as HTAP, open source is becoming the source of database innovation.
Take PingCAP's TiDB as an example. Its product R&D is built on an open source model and an open source community, which sustains an iteration speed of one major version a year and one minor version a month. Huang revealed: "Open source is the first growth engine of TiDB. Through the open source model, developers, contributors, promoters and users can be linked together to form a flywheel effect, which leads the product into a positive cycle of accelerating iteration and innovation."
It is reported that more than 40% of TiDB's code is updated every year, and a large part of that code comes from external contributors. TiDB ranks among the top open source projects both in China and worldwide.
If open source has changed the development model and iteration speed of HTAP products, then the cloud provides the most direct feedback on user demand. Cloud databases have changed how databases are deployed, operated, maintained and scaled, and have made them easier to use in the form of cloud services. More importantly, as cloud computing spreads, the number of cloud users keeps growing, and feedback on their needs arrives continuously, which is very important for the evolution and iteration of database products.
"Real product iteration is about shortening the feedback time for user questions and needs. The cloud undoubtedly provides this value for basic software such as databases, so that products can iterate better," Huang said. Take TiDB as an example. Since the public beta of TiDB Cloud, a fully managed database-as-a-service (DBaaS) product, was released in May last year, it has been listed in the marketplaces of world-renowned cloud service providers such as Amazon Web Services (AWS) and Google Cloud, and it became generally available worldwide in May this year. In June this year, in cooperation with Alibaba Cloud, it launched on the Alibaba Cloud marketplace, becoming one of the few database services in the world to span these three major clouds.
Among the many database products, MySQL has long ranked among the three most popular databases in the world because it is open source, free and well suited to Internet scenarios. According to statistics from the Slintel website, MySQL has the highest share of the global relational database market, at 43.04%.
Over the past twenty years, the open source MySQL database has had a far-reaching impact on all walks of life and won over users in the Internet, finance, retail, transportation and other industries, earning it the nickname "heartthrob". In China, for example, more than 90% of financial institutions have adopted MySQL.
But any database trend is the product of "demand change + technology change + architecture innovation", and MySQL and HTAP are no exception. The data scale, business concurrency and processing speed of today's scenarios are orders of magnitude beyond those of the past. Against this backdrop, the limitations of MySQL are becoming more and more prominent, and its scalability struggles to meet users' needs. Enterprises that want to keep growing have to shard their data across multiple databases and tables, but this makes the data architecture more complex.
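To make the complexity of sharding across databases and tables (the "sub-database and sub-table" scheme) concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of how MySQL, TiDB or any sharding middleware works: the shard count, the `ShardedOrders` class and the `shard_for` helper are hypothetical names, and in-memory dictionaries stand in for separate database instances.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical: number of separate databases/tables


def shard_for(user_id: str) -> int:
    """Route a row to a shard using a stable hash of the sharding key."""
    return zlib.crc32(user_id.encode("utf-8")) % NUM_SHARDS


class ShardedOrders:
    """Toy order store; each dict stands in for one physical shard."""

    def __init__(self) -> None:
        self.shards = [defaultdict(list) for _ in range(NUM_SHARDS)]

    def insert(self, user_id: str, amount: int) -> None:
        # Transactional write: touches exactly one shard.
        self.shards[shard_for(user_id)][user_id].append(amount)

    def orders_for(self, user_id: str) -> list:
        # Point lookup on the sharding key: still a single shard.
        return list(self.shards[shard_for(user_id)].get(user_id, []))

    def total_revenue(self) -> int:
        # Analytical query: the application must fan out to every
        # shard and merge the partial results itself.
        partials = [sum(sum(v) for v in shard.values()) for shard in self.shards]
        return sum(partials)
```

Lookups by the sharding key stay simple, but any query that cuts across keys, such as `total_revenue`, becomes scatter-gather logic that the application has to own. That routing and merging burden is the kind of architectural complexity the paragraph above refers to.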
The new generation HTAP database can handle massive real-time OLTP workloads and real-time data analysis without sharding databases and tables. It also scales well and is well suited to displaying data in real time while keeping massive transaction volumes running smoothly across a variety of business scenarios. Given these architectural advantages, the rise of HTAP is inevitable.
"The biggest change on the user demand side is that many users need to use hot data for real-time analysis at the operational level and get real-time insight to support decision-making, which greatly promotes the demand for a new generation of HTAP databases," added Liu Song, vice president of PingCAP.
Although MySQL has added the columnar engine HeatWave to gain HTAP capability, it mainly addresses large-scale queries. The architecture of the system itself has not undergone revolutionary change, and its scalability and OLTP throughput remain quite limited. "Smart new energy vehicles look almost the same as traditional fuel vehicles from the outside, and the same is true of databases. A new generation HTAP database like TiDB is very different from traditional databases in architecture design, target scenarios and user experience," as Liu Song put it in a vivid metaphor.
In fact, unlike the niche and expensive HTAP of SAP HANA in the past, the new generation HTAP is highly compatible. Database vendors such as Google Cloud and PingCAP use the new generation HTAP architecture to extend OLTP and OLAP capabilities for enterprises that have adopted the open source databases MySQL or PostgreSQL.
For example, AlloyDB, the HTAP cloud database released by Google Cloud, offers the best choice for users of single-node PostgreSQL, while TiDB has become the best destination for users in the MySQL ecosystem. Among PingCAP's large user base there are many successful cases of deploying TiDB alongside MySQL. Thanks to its openness, TiDB can also be combined with other data service products to form new data service solutions. For example, pairing it with Flink, a big data computing engine that is likewise open source, yields a real-time data warehouse solution and extends the capability boundary of the HTAP database.
Huang said plainly that a database vendor needs to care about user experience in addition to products and technologies. "HTAP should feel easy to use and shield users from the complexity of the database." It is reported that PingCAP was the only Chinese database company selected in the cloud database category of Gartner Peer Insights in 2022, with an overall customer score of 4.7 (out of 5), ranking first among all selected companies. Among the PingCAP users who took part in the Gartner Peer Insights scoring, users in key industries such as Internet and finance strongly endorsed HTAP's modern database concept.
Overall, this is a big year for HTAP, with major vendors bringing innovations to market. As new generation HTAP database products multiply, the market's acceptance and adoption of HTAP concepts and products will accelerate. And as the new generation HTAP database continues to improve, users in the MySQL ecosystem can finally see an excellent migration path for the era of big data.