How to add fields to mysql data table

Traditional Situation

Let's first review how an add column operation is carried out when the "Add Column Now" feature is not available. This also lets us get familiar with the illustrations used in this issue:

When an add column operation is performed, all rows of data must be augmented with a segment of data (Column 4 data in the illustration)

As described in the previous issue, when the length of a data row changes, the tablespace needs to be rebuilt (the gray-blue portion of the illustration is the portion that changes)

Column definitions in the data dictionary are updated as well

The problem with the operations above is that every add column operation requires a tablespace rebuild, which takes a lot of IO and a lot of time
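
As a rough illustration of that cost, the toy Python model below rewrites every row when a column is added. It is not InnoDB code: the function name `add_column_with_rebuild` and the list-of-tuples "tablespace" are assumptions made for this sketch.

```python
# Toy model, not InnoDB code: a "tablespace" is a list of rows,
# and each row is a tuple of column values.

def add_column_with_rebuild(rows, default):
    """Return a rebuilt copy of the table in which every row gains one value."""
    rebuilt = []
    rows_rewritten = 0
    for row in rows:
        rebuilt.append(row + (default,))  # the row grows, so it must be rewritten
        rows_rewritten += 1
    return rebuilt, rows_rewritten


table = [(1, "a", 10), (2, "b", 20), (3, "c", 30)]
new_table, cost = add_column_with_rebuild(table, default=0)
print(new_table)
print(cost)
```

In this model the number of rewritten rows always equals the table size, which is why the traditional approach gets more expensive as the table grows.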

Add Columns Now

The "Add Column Now" process is shown below:

[Figure: the "Add Column Now" process]

"Add Column Now" changes only the contents of the data dictionary, which includes:

Adding the definition of the new column to the column definitions

Adding the default value of the new column

When reading data from the table after "Add Column Now":

Because "Add Column Now" did not change the row data, only three columns are read from each row

MySQL appends the default value of the new fourth column to the data it has read

The above procedure describes how to read data that was written before "Add Column Now". In essence, the new column is "faked" at read time, using the default value stored in the data dictionary
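
The dictionary-only change and the read-time "faking" described above can be sketched in Python. This is a toy model under stated assumptions, not InnoDB's implementation: the dictionary layout and the names `add_column_now` and `read_old_row` are hypothetical.

```python
# Toy model: the dictionary holds the column names plus the defaults of
# columns added by "Add Column Now". Stored rows live elsewhere.
dictionary = {"columns": ["c1", "c2", "c3"], "instant_defaults": {}}
stored_rows = [(1, "a", 10), (2, "b", 20), (3, "c", 30)]


def add_column_now(dictionary, name, default):
    """Record the new column and its default; no stored row is touched."""
    dictionary["columns"].append(name)
    dictionary["instant_defaults"][name] = default


def read_old_row(stored, dictionary):
    """Return the row as a client would see it, padding the missing columns."""
    values = list(stored)
    for name in dictionary["columns"][len(stored):]:
        values.append(dictionary["instant_defaults"][name])  # the "faked" value
    return tuple(values)


add_column_now(dictionary, "c4", default=0)
visible_rows = [read_old_row(row, dictionary) for row in stored_rows]
print(visible_rows)
```

The stored rows still hold three values each; only the result handed back to the reader has four.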

So how is data that was written after "Add Column Now" read? The process is shown below:

When reading row 4:

[Figure: reading row 4]

By checking the instant flag bit in the header of a data row, MySQL can tell that the row uses the "new format": the row's header is followed by a new "number of columns" field

By reading the row's "number of columns" field, MySQL knows how many columns of the row hold "real" data, and reads that many columns

As the figure above shows, data written before "Add Column Now" and data written after it are read by different processes
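
The two read paths can be sketched as follows. This is a simplified Python model, not the real row format: the `Row` class, its `instant` and `n_cols` fields, and the `cols_before_instant` parameter (the column count that old-format rows are assumed to store) are hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class Row:
    values: tuple
    instant: bool = False  # stand-in for the instant flag bit in the row header
    n_cols: int = 0        # stand-in for the "number of columns" field


def read_row(row, all_columns, defaults, cols_before_instant):
    # New-format rows state how many columns they store; old-format rows
    # store exactly the columns that existed before "Add Column Now".
    stored = row.n_cols if row.instant else cols_before_instant
    values = list(row.values[:stored])
    for name in all_columns[stored:]:
        values.append(defaults[name])  # fill in columns the row does not store
    return tuple(values)


all_columns = ["c1", "c2", "c3", "c4"]
defaults = {"c4": 0}

old_row = Row(values=(1, "a", 10))                             # written before
new_row = Row(values=(4, "d", 40, 7), instant=True, n_cols=4)  # written after

old_result = read_row(old_row, all_columns, defaults, cols_before_instant=3)
new_result = read_row(new_row, all_columns, defaults, cols_before_instant=3)
print(old_result, new_result)
```

The old row is padded with the default for `c4`, while the new row is returned with the value it actually stores.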

From the discussion above, we can summarize why "Add Column Now" is efficient:

Executing "Add Column Now" does not change the structure of the existing data rows

When reading "old" data, the new columns are "faked" so that the result is correct

When writing "new" data, a new row format is used (with the instant flag bit and the "number of columns" field added) to distinguish old data from new

When reading "new" data, the data can be read as is

So can the "faking" be kept up forever? When does it break down?

Consider the following scenario:

Add column A with "add column now"

Write data row 1

Add column B with "add column now"

Write data row 2

Remove column B

Let's estimate the minimum cost of "removing column B": it would require modifying either the instant flag bit or the "number of columns" field in the data rows. This affects at least the rows written after "Add Column Now", at a cost similar to rebuilding the data

From this estimate it follows that when an operation incompatible with "Add Column Now" is performed, the table needs to be rebuilt, as shown in the following figure:

[Figure: the table is rebuilt]
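
The scenario above can be walked through with a small Python sketch. It is only a toy model: the row dictionaries and the helper `rows_to_rewrite_on_drop` are hypothetical, and they track nothing more than which columns each row physically stores.

```python
# Toy model of the scenario: each row remembers the instantly added
# columns it physically stores.
rows = [
    {"name": "row 1", "stored_cols": ["A"]},       # written after adding column A
    {"name": "row 2", "stored_cols": ["A", "B"]},  # written after adding column B
]


def rows_to_rewrite_on_drop(rows, column):
    """List the rows whose stored data contains the column being removed."""
    return [row["name"] for row in rows if column in row["stored_cols"]]


affected = rows_to_rewrite_on_drop(rows, "B")
print(affected)
```

Every row written after column B was added stores a value for it, so removing B cannot be handled in the dictionary alone in this model; those rows have to be touched.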

Extended thought question: can another data format be devised to replace the instant flag bit and the "number of columns" field, so that both adding and removing columns can be done "right away"? (Hint: consider the sequence add column, remove column, add column again)

Limitations

Now that we understand how it works, let's look at the limitations of "Add Column Now". The first two are easy to understand:

"Add Column Now" can only add columns at the end of the table, not between existing columns

In the metadata, only the number of columns a data row should have is recorded, not where those columns sit. So it is not possible to specify a position for the new column

"Add Column Now" cannot add primary key columns

Adding a column must not involve changes to the clustered index, otherwise it becomes a "rebuild" operation rather than an "immediate" one

"Add Column Now" does not support the compressed table format

According to the WL: "Compressed is no need to supported"

Summary review

Let's summarize the above discussion:

The reason why "Add Columns Now" is efficient is that:

When performing "Add Columns Now", the structure of the data rows is not altered

When reading "old" data, the added columns are "faked" so that the result is correct

When writing "new" data, a new data format is used (with the addition of the instant flag bit and the "number of columns" field) to distinguish between old and new data

When reading "new" data, the data can be read as it is

The "faking" performed by "Add Column Now" cannot be kept up forever. When an operation incompatible with "Add Column Now" is performed, the table data is rebuilt

Returning to the two remaining questions:

How does "Add Column Now" work?

We have already answered that question

Is the so-called "Add Column Now" completely invisible to the business, and is it truly done "immediately"?

Even "Add Column Now" still has to change the data dictionary, so the associated locks cannot be avoided. In other words, "immediately" here means "without changing the structure of the data rows"; it does not mean "completing the task at zero cost"