Do enterprises most commonly use SparkCore or SparkSQL?
In practice, enterprises most commonly use SparkCore and SparkSQL together.

Spark Streaming is a micro-batch based streaming compute engine, and enterprises typically use SparkCore and SparkSQL together to process the data it delivers. In enterprise real-time processing architectures, the integration of Spark Streaming with Kafka is usually one of the core pieces of the overall big data processing pipeline.
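As a rough illustration of that pattern, the sketch below pulls micro-batches from Kafka with Spark Streaming, reshapes the raw records with a SparkCore RDD transformation, and then aggregates each batch with SparkSQL. The broker address (localhost:9092), topic name (events), and the simple "user,action" record layout are illustrative assumptions, not part of the original article.

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

case class Event(user: String, action: String)   // assumed record layout

object KafkaSparkDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("KafkaSparkDemo")
    val ssc  = new StreamingContext(conf, Seconds(5))     // 5-second micro-batches

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "localhost:9092",           // assumed broker address
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "demo-group",
      "auto.offset.reset"  -> "latest"
    )

    // Spark Streaming: consume the Kafka topic as a direct stream
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Array("events"), kafkaParams))

    stream.foreachRDD { rdd =>
      // SparkCore: plain RDD transformation on the raw "user,action" strings
      val events = rdd.map(_.value.split(","))
                      .filter(_.length == 2)
                      .map(a => Event(a(0), a(1)))

      // SparkSQL: treat the micro-batch as structured data and query it
      val spark = SparkSession.builder.config(rdd.sparkContext.getConf).getOrCreate()
      import spark.implicits._
      events.toDF().createOrReplaceTempView("events")
      spark.sql("SELECT action, COUNT(*) AS cnt FROM events GROUP BY action").show()
    }

    ssc.start()
    ssc.awaitTermination()
  }
}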

SparkSQL is built on top of SparkCore and is designed specifically for processing structured data (not just SQL). It wraps SparkCore and adds many optimizations and improvements for structured data processing. In simple terms, SparkSQL supports many kinds of structured data sources, which lets you skip complex hand-written parsing and read data easily from a variety of sources. When you use SQL to query these sources and reference only some of the fields, SparkSQL can intelligently scan just the fields that are used, instead of scanning all of the data in a brute-force manner the way SparkContext.hadoopFile does.
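To make the column-pruning point concrete, here is a small sketch (the file paths and column names are hypothetical) that reads a Parquet file with SparkSQL and selects only two fields, contrasted with a raw SparkContext read that has to parse every line no matter which fields are actually used:

import org.apache.spark.sql.SparkSession

object ColumnPruningDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("ColumnPruningDemo").getOrCreate()

    // SparkSQL: a columnar source such as Parquet carries its own schema,
    // so no hand-written parsing is needed.
    val users = spark.read.parquet("/data/users.parquet")   // hypothetical path
    users.createOrReplaceTempView("users")

    // Only name and age are referenced, so the Parquet reader scans just
    // those two columns rather than the whole file.
    spark.sql("SELECT name, age FROM users WHERE age >= 18").show()

    // Contrast: a plain SparkCore read (textFile/hadoopFile) must load and
    // parse every line, regardless of which fields the job actually uses.
    val firstNames = spark.sparkContext
      .textFile("/data/users.csv")                           // hypothetical path
      .map(_.split(",")(0))
    println(firstNames.count())

    spark.stop()
  }
}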