Batch computing, stream computing, interactive computing, and graph computing are four major Big Data computing models. Among them, stream computing and batch computing are the two main modes, each suited to different big data application scenarios.
Streaming data (or a data stream) is a dynamic sequence of data that is unbounded in both time and volume. The value of the data decreases as time passes, so it must be processed in real time, with responses delivered within seconds. Stream computing, as the name suggests, is real-time computing that processes streams of data.
Batch computing, on the other hand, first collects the data and stores it in a database, and then processes the stored data in batches. The differences between the two are mainly reflected in the following aspects:
1. Data timeliness: stream computing is real-time with low latency, while batch computing is non-real-time with high latency.
2. Data characteristics: the data in stream computing is generally dynamic and unbounded, while the data in batch computing is generally static.
3. Application scenarios: stream computing is used in real-time scenarios with high timeliness requirements, such as real-time recommendation and business monitoring. Batch computing, usually just called batch processing, is used in offline scenarios where real-time requirements are low, such as data analysis and offline reports.
4. Mode of operation: a stream computing task runs continuously, while a batch computing task runs once and then completes.
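
The contrast above can be illustrated with a minimal Python sketch. It is not tied to any particular big data framework. The function names `batch_average` and `stream_average` and the sample `readings` are hypothetical, chosen only for illustration. The batch version needs the whole, bounded data set before it can produce its single result, while the stream version produces an updated result as each record arrives and would keep running for as long as records keep coming.

```python
from typing import Iterable, Iterator, List


def batch_average(records: List[float]) -> float:
    """Batch style: the full, static data set is collected first,
    then processed in one run that produces a single result."""
    return sum(records) / len(records)


def stream_average(records: Iterable[float]) -> Iterator[float]:
    """Stream style: keep a small running state and emit an updated
    result for every record as it arrives. The input may be unbounded."""
    count = 0
    total = 0.0
    for value in records:
        count += 1
        total += value
        yield total / count


# Hypothetical sample data standing in for collected sensor readings.
readings = [10.0, 20.0, 30.0, 40.0]

# Batch: one result, available only after all data has been collected.
batch_result = batch_average(readings)

# Stream: one result per incoming record, available immediately.
stream_results = list(stream_average(readings))

print(batch_result)    # 25.0
print(stream_results)  # [10.0, 15.0, 20.0, 25.0]
```

The last streamed value matches the batch result, but the stream version has already reported three intermediate answers by the time the batch version reports its only one. That is the latency difference described in point 1.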