Big data distributed queue
Kafka is a buffering component for log processing, used to handle large volumes of data. Compared with a traditional message queue, it simplifies the queue's structure and features, and it processes persisted messages (mainly logs) as streams. Log volume is often too large for downstream processing components to absorb directly, so Kafka sits in front of them as a high-throughput buffer layer. To prevent data loss, messages are not discarded as soon as they are consumed; they are retained and removed only after a configured retention period expires. Traditional message queues and Redis do not behave this way by default.

The main features are as follows. Large storage capacity: it supports TB-scale or even PB-scale data. High throughput and high I/O: a reasonably configured single server can typically transfer more than 100K messages per second. Message partitioning and distributed consumption: messages are delivered in order within each partition. Offline and real-time processing: both are supported. Horizontal scaling: capacity can be expanded online by adding machines to handle larger data volumes.
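The retention and partitioning behaviour described above can be illustrated with a small in-memory sketch. This is not the Kafka API: the class `PartitionedLog`, its methods, and the CRC32-based partition choice are hypothetical names and assumptions made only for illustration. It shows that reading does not remove messages, that messages with the same key stay in order within one partition, and that messages disappear only once they are older than the retention period.

```python
import time
import zlib


class PartitionedLog:
    """Toy stand-in for a partitioned, retained log (hypothetical, not Kafka's API)."""

    def __init__(self, num_partitions=3, retention_seconds=7 * 24 * 3600):
        self.num_partitions = num_partitions
        self.retention_seconds = retention_seconds
        # Each partition is an append-only list of (offset, timestamp, key, value).
        self.partitions = [[] for _ in range(num_partitions)]
        self.next_offsets = [0] * num_partitions

    def append(self, key, value, now=None):
        # The same key always maps to the same partition, so per-key order is kept.
        p = zlib.crc32(key.encode("utf-8")) % self.num_partitions
        offset = self.next_offsets[p]
        timestamp = time.time() if now is None else now
        self.partitions[p].append((offset, timestamp, key, value))
        self.next_offsets[p] += 1
        return p, offset

    def read(self, partition, from_offset=0):
        # Reading never deletes; each consumer tracks its own offset.
        return [rec[3] for rec in self.partitions[partition] if rec[0] >= from_offset]

    def expire(self, now=None):
        # Drop only the records older than the retention period.
        now = time.time() if now is None else now
        cutoff = now - self.retention_seconds
        for p in range(self.num_partitions):
            self.partitions[p] = [rec for rec in self.partitions[p] if rec[1] >= cutoff]


log = PartitionedLog(num_partitions=2, retention_seconds=60)
partition, _ = log.append("user-1", "login", now=0)
log.append("user-1", "click", now=10)
first_read = log.read(partition)    # both messages, in order
second_read = log.read(partition)   # still both: reading did not consume them
log.expire(now=65)                  # only "login" is older than 60 seconds
after_expiry = log.read(partition)
```

Because consumption is just a read from an offset, several consumers can process the same data independently, for example one in real time and another in a later offline batch.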

Redis provides only a high-performance in-memory key-value store with atomic operations. Its fast access makes it usable as the storage behind a message queue, but it does not supply the functions and logic of a message queue. To use it as one, the application layer has to implement that logic itself.
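As a rough sketch of what "implemented by the upper application" can mean, the example below builds queue behaviour (enqueue, take, acknowledge) on top of nothing but list operations. `FakeListStore` is a hypothetical in-memory stand-in so the example runs without a server; its three methods are modelled on the Redis list commands LPUSH, RPOPLPUSH and LREM, and `SimpleQueue` and the `:processing` key suffix are illustrative assumptions, not part of Redis.

```python
from collections import defaultdict, deque


class FakeListStore:
    """In-memory stand-in modelled on three Redis list commands (assumption)."""

    def __init__(self):
        self._lists = defaultdict(deque)

    def lpush(self, key, value):
        # Push onto the head of the list and return the new length.
        self._lists[key].appendleft(value)
        return len(self._lists[key])

    def rpoplpush(self, src, dst):
        # Atomically move the tail of src to the head of dst.
        if not self._lists[src]:
            return None
        value = self._lists[src].pop()
        self._lists[dst].appendleft(value)
        return value

    def lrem(self, key, count, value):
        # Simplified: remove up to `count` occurrences, scanning from the head.
        removed = 0
        while removed < count and value in self._lists[key]:
            self._lists[key].remove(value)
            removed += 1
        return removed


class SimpleQueue:
    """Queue logic that lives in the application, not in the store."""

    def __init__(self, store, name):
        self.store = store
        self.pending = name
        self.processing = name + ":processing"

    def put(self, message):
        self.store.lpush(self.pending, message)

    def get(self):
        # Park the message in a processing list so a crashed worker
        # does not lose it; it stays there until ack() is called.
        return self.store.rpoplpush(self.pending, self.processing)

    def ack(self, message):
        self.store.lrem(self.processing, 1, message)


store = FakeListStore()
queue = SimpleQueue(store, "jobs")
queue.put("a")
queue.put("b")
first = queue.get()    # oldest message first
queue.ack(first)
second = queue.get()
empty = queue.get()    # nothing left to take
```

Everything a message queue normally offers beyond this, such as retries, timeouts or routing, would also have to be written by the application.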

Take RabbitMQ as an example of a traditional MQ. It is an open-source message queue written in Erlang that supports multiple protocols, including AMQP, XMPP, SMTP and STOMP (some of them through plugins), and it is well suited to enterprise development.

MQ uses a broker architecture: messages are queued in a central queue before being delivered to clients. It has good support for routing, load balancing and data persistence.
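The broker idea can be sketched in a few lines: producers hand messages to a central component, which routes them into queues, and consumers take them from there. The `Broker` class and its `bind`, `publish` and `consume` methods are hypothetical names for this illustration and do not correspond to RabbitMQ's client API; persistence is left out.

```python
from collections import defaultdict, deque


class Broker:
    """Toy central broker with routing by key (illustrative assumption)."""

    def __init__(self):
        self.queues = defaultdict(deque)    # queue name -> waiting messages
        self.bindings = defaultdict(set)    # routing key -> bound queue names

    def bind(self, queue_name, routing_key):
        self.queues[queue_name]             # make sure the queue exists
        self.bindings[routing_key].add(queue_name)

    def publish(self, routing_key, message):
        # Copy the message into every queue bound to this routing key.
        targets = self.bindings.get(routing_key, ())
        for queue_name in targets:
            self.queues[queue_name].append(message)
        return len(targets)

    def consume(self, queue_name):
        # Each message is handed to exactly one consumer of a queue, so
        # several workers reading the same queue share the load.
        queue = self.queues[queue_name]
        return queue.popleft() if queue else None


broker = Broker()
broker.bind("errors", "log.error")
broker.bind("all", "log.error")
broker.bind("all", "log.info")
routed_error = broker.publish("log.error", "disk full")   # goes to both queues
routed_info = broker.publish("log.info", "started")       # goes to "all" only
routed_debug = broker.publish("log.debug", "ignored")     # no binding, dropped
```

Unlike the retained log in Kafka, a message here is removed from its queue as soon as a consumer takes it.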

There are also ActiveMQ, ZeroMQ and others, with broadly similar functionality. In terms of TPS (transactions per second), ZeroMQ performs best, RabbitMQ comes second, and ActiveMQ is the slowest.
