Current location - Loan Platform Complete Network - Big data management - Analyzing the Aeron IPC Implementation
Analyzing the Aeron IPC Implementation

Recently, I played with Aeron, and mainly used the IPC communication function, and I felt that it was well packaged and easy to use, so I spent some time studying the internal implementation, and I made some notes briefly.

As for UDP communication and archive, cluster advanced features did not try, this article does not involve.

There is a very clear architecture diagram in the Aeron Cookbook:

This diagram is very good, but it corresponds to the UDP communication scenario, and the IPC communication scenario is actually not that complex, so I simplified it:

In the IPC communication scenario:

In the case of two Conductors, the first thing you have to do is understand the interaction between them. Conductor, you first need to understand the cnc.dat file.

The full name of cnc is command and control, and as the name implies, management operations are realized through this file.

There are two parts to focus on here: the [to-driver Buffer] and the [to-clients Buffer].

The Client sends commands to the Driver through the [to-driver Buffer] "channel"; the Driver replies to the Client with commands or notifications through the [to-clients Buffer] "channel". The Driver replies with commands or notifications to the Client through the [to-clients Buffer] "channel".

The [to-driver Buffer] "channel" is encapsulated in a ManyToOneRingBuffer, which is a ring buffer; Many To One, which means that multiple Clients can write concurrently, but only one Driver can read. Read.

[to-clients Buffer] This "channel" is a broadcast "channel", so the encapsulation in the Driver is BroadcastTransmitter, and the encapsulation in the Client is BroadcastRemitter. Client is a BroadcastReceiver.

The BroadcastTransmitter doesn't guarantee reliable delivery of messages, but the [to-driver Buffer] and the [to-clients Buffer] ask and answer each other, with occasional notifications, and basically don't lose any data.

As for this, it's not a problem.

As for how this "question-and-answer" correlation works, it's really quite simple: each command has a unique correlationId, and the corresponding response will have that information, so the unique id matches the request to the response.

The correlationId is generated by the CORRELATION_COUNTER at the end of the RingBuffer to ensure uniqueness.

The Driver Conductor updates the CONSUMER_HEARTBEAT at the end of the RingBuffer after processing a command, and the client can use this heartbeat time to determine if the Media Driver is working.

First: Concurrent Writes

Writes use a combination of tryClaim + commit. tryClaim declares a certain length of space for commands to be written, and the solution to the concurrency problem is in the tryClaim method.

Simply put, cas updates the TAIL_POSITION field.

Second: reading commands

From the above, you can see that you can't simply read commands from TAIL_POSITION, but in fact the Driver Conductor determines if the Length field is greater than 0.

This behavior is guaranteed during a commit operation, where you first write a negative length in the Length field, then write the next Type and Encoded Message, and finally write a positive length in the Length field, which guarantees that the full command will be read.

Third: Write Thread Exceptions

As you can see from the above, the writing of commands is not atomic, and there is a possibility that the command is half-written, and a write thread exception occurs, which will continue to block the Driver Conductor's reads if the Length field is always negative.

The solution to this problem is simple: Driver Conductor will actively unblock after a timeout period, which means it recognizes the area where the exception was skipped.

[to-clients Buffer] The logic for this broadcast is even simpler, since writes are single-threaded, so there are no concurrency scenarios to deal with.

The first one: recognizing if a message has been overwritten

The Driver Conductor first updates TAIL_INTENT_COUNTER before writing the message (as well as updating TAIL_COUNTER and LATEST_COUNTER). the Client. The Conductor checks to see if the currently read cursor (local cache) is behind TAIL_INTENT_COUNTER before consuming it, but of course there's no locking here, it's an optimistic strategy.

The above two diagrams have shown the interaction between addPublication and addSubscription, so let's look at the specific publish-subscribe relationship.

Before we get into the specific publish-subscribe logic, let's take a look at the structure of the logbuffer:

The entire ****-enjoyed memory used for data transfers is divided into three terms, which are rotated into active, dirty, and clean states. (AeronCookbook has an animation demonstrating this.)

Specifically in the implementation, Term rotation is accomplished through the Active Term Count in Log Meta Data, and the active Term index is (Active Term Count % 3), and rotation is a self-incrementing operation.

In addition, the Tail Counter # in Log Meta Data indicates the current write position.

This Counter is divided into two parts: the high 32 bits are the termId, from which you can calculate the start position of the term, and the low 32 bits are the most significant offset in the term.

Since termId is 32 bits, it is conceivably possible to overflow, which limits the maximum amount of data that can be written "over" by a single logbuffer to termBufferLength * (1L << 31).

Specifically for publishing logic, Aeron provides ConcurrentPublication wrappers that support concurrency, and ExclusivePublication wrappers that are single-threaded, so you can imagine that there is some performance loss in supporting concurrency.

Like ManyToOneRingBuffer in Section 2, Publication provides both offer and tryClaim+commit APIs, except that Client Conductor only uses tryClaim+commit, so offer was not introduced above, and is not looked at here. The reason we don't look at offer is that it's easier to understand the nature of tryClaim+commit.

Let's look at the ExclusivePublication implementation first. It's easier to understand the backbone logic and then concurrency control.

There is a backpressure threshold, pub-lmt, that is maintained in the Driver Conductor, which is simply the position of consumption plus termWindowLength. The logbuffer can be interpreted as a ManyToManyRingBuffer, so controlling the position of writes is a bit more complicated, and requires a separate logbuffer. This logbuffer can be interpreted as a ManyToManyRingBuffer, so controlling where it is written to is a bit more complicated and requires a separate coordinator (i.e., the Driver Conductor), which is not the same as the way ManyToOneRingBuffer was handled in Section 2.

The main logic is intuitive: first find the term to be written, and then call the claim method of the ExclusiveTermAppender. The logic here is the same as ManyToOneRingBuffer, updating the Tail Counter, then writing to the header, and writing a negative value to the length field, and then writing a positive value when you commit.

But there are also cases where there is not enough space left in the Term, so we fill in a PADDING_FRAME and return -1.

Finally, we update the local position variable, and if it returns -1, then we need to rotate the Term, which we do by updating the Active Term Count and the corresponding Tail Counter #.

For concurrent ConcurrentPublication scenarios, the backbone logic is the same, and there are two thread-safe areas:

First: write thread exceptions

As in section 2.1, there is a possibility that a logbuffer could be written halfway through a thread exception. thread exception. This relies on the Driver Conductor as the coordinator to do an unblock operation based on the timeout.

From the diagram in Section 2, we can see that there are two steps in the Conductor's interaction before consumption:

That is, the image of the subscribing client corresponds to the specific ****-enjoyed memory body, IpcPublication, and the core read logic is also in the image.

With position you can calculate the term index, and the position offset in that term, and with those two pieces of information you can read the data.

The logic for reading is the same as ManyToOneRingBuffer, which determines whether there is a message by whether the Length field is greater than 0.

The read logic is the same as ManyToOneRingBuffer, which determines if there is a message.

The logic of reading is the same as ManyToOneRingBuffer.