Newer fast data architectures differ significantly from big data architectures, and fast data adds true online transaction processing capabilities. Understanding the changing needs of big data and fast data can help you make the right hardware and software choices.
Big Data Architecture
In contrast to the way organizations typically collected data in the past, Big Data is the practice of analyzing much larger volumes of data to gain deeper insights, much of it (e.g., social media data about customers) accessible in the public cloud. This kind of data emphasizes fast access and de-emphasizes consistency, and it has given rise to a range of Big Data tools such as Hadoop. As a result, the following architectural changes and emphases are common:
Support for on-premises software such as Hadoop and Hive, as well as horizontally scalable, cloud-enabled hardware for social media and other scenarios where big data input plays a role.
Support for virtualization and private cloud software for existing data architectures.
Software tools that support large-scale, deep, and ad hoc analytics and that let data scientists tailor analyses to the enterprise's needs (see the sketch after this list).
Massively scalable storage capacity, especially for near real-time analytics.
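As an illustration of the kind of ad hoc, large-scale analysis these tools support, here is a minimal PySpark sketch. It assumes a hypothetical customer_mentions dataset stored as Parquet in a Hadoop-backed data lake; the path, column names, and question asked are assumptions for the example, not details from any particular deployment.

```python
# Minimal ad hoc analytics sketch against a Hadoop-backed data lake (assumed layout).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("adhoc-customer-insights").getOrCreate()

# Load a large, append-only dataset from distributed storage (hypothetical path).
mentions = spark.read.parquet("hdfs:///datalake/customer_mentions")

# Ad hoc question: which products are mentioned most, and with what average sentiment?
summary = (
    mentions
    .groupBy("product")
    .agg(
        F.count("*").alias("mention_count"),
        F.avg("sentiment_score").alias("avg_sentiment"),
    )
    .orderBy(F.desc("mention_count"))
)

summary.show(20)
spark.stop()
```

Because the query is expressed against the cluster rather than a single machine, the same few lines scale from a sample file to the full data lake, which is the point of the horizontally scalable hardware noted above.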
Fast Data Architecture
Fast Data is an architecture that processes streaming sensor and IoT data in near real time. The emphasis is on rapid updates: constraints on reads are routinely relaxed rather than locking until incoming data has been written to disk. Whether they use existing databases or specially designed Hadoop-related tools, organizations working with this architecture typically need some initial streaming analysis of the data. Changes in architecture and focus are common in this nascent field:
Database software for rapid updates and initial streaming data analysis (see the sketch after this list).
Dramatically increased use of non-volatile RAM and solid-state drives for fast data storage (e.g., 1 TB of main memory and 1 petabyte of SSDs).
Just-in-time software constraints, similar to those of older real-time operating systems.
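As a concrete illustration of that first point, the sketch below shows one minimal form of initial streaming analysis: a purely in-memory sliding-window aggregate over incoming sensor readings, with no disk write on the ingest path. The sensor feed, window length, and alert threshold are assumptions for the example rather than details from any product.

```python
# Minimal in-memory sliding-window analysis of a sensor stream (illustrative values).
import time
from collections import deque

WINDOW_SECONDS = 60          # keep only the last minute of readings in memory
ALERT_THRESHOLD = 80.0       # hypothetical "hot" temperature threshold

window = deque()             # (timestamp, value) pairs, oldest first

def ingest(timestamp: float, value: float) -> None:
    """Add a reading, evict expired ones, and check a rolling average."""
    window.append((timestamp, value))

    # Drop readings that have fallen outside the sliding window.
    while window and window[0][0] < timestamp - WINDOW_SECONDS:
        window.popleft()

    rolling_avg = sum(v for _, v in window) / len(window)
    if rolling_avg > ALERT_THRESHOLD:
        print(f"ALERT: rolling average {rolling_avg:.1f} exceeds threshold")

# Example: feed in a few synthetic sensor readings.
for i, reading in enumerate([72.0, 75.5, 83.1, 85.9, 90.4]):
    ingest(time.time() + i, reading)
```

Everything stays in main memory, which is why the large NVRAM and SSD capacities listed above matter: the fast tier must hold enough recent data to answer questions before anything reaches disk.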
Fusion of Fast Data Architecture with Big Data Architecture
Fast Data is intended to be fused with the Big Data architecture. To merge the two approaches:
Data is divided between fast data, for quick response, and big data storage, where constraints can be relaxed.
The converged architecture then allows data held in the fast data tier to be accessed with big data databases and analytic tools.
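To make that division of labor concrete, the sketch below keeps recent events in a fast in-memory tier and periodically spills them to files standing in for big data storage, while a single query function reads across both tiers. The tier layout, file format, and flush policy are assumptions for illustration, not a description of any vendor's product.

```python
# Minimal fused-architecture sketch: fast in-memory tier plus a file-based "data lake".
import csv
from pathlib import Path

FAST_TIER: list[dict] = []        # in-memory, low-latency tier
LAKE_DIR = Path("datalake")       # stand-in for HDFS/S3 big data storage
FLUSH_BATCH = 3                   # spill to the lake every N records

def write_event(event: dict) -> None:
    """Land new events in the fast tier; spill full batches to the lake."""
    FAST_TIER.append(event)
    if len(FAST_TIER) >= FLUSH_BATCH:
        LAKE_DIR.mkdir(exist_ok=True)
        path = LAKE_DIR / f"batch_{len(list(LAKE_DIR.glob('*.csv')))}.csv"
        with path.open("w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=sorted(event))
            writer.writeheader()
            writer.writerows(FAST_TIER)
        FAST_TIER.clear()

def query_all() -> list[dict]:
    """Converged view: combine lake batches with not-yet-flushed fast data."""
    rows = list(FAST_TIER)
    for path in sorted(LAKE_DIR.glob("*.csv")):
        with path.open(newline="") as f:
            rows.extend(csv.DictReader(f))
    return rows

for i in range(5):
    write_event({"sensor": "s1", "reading": str(70 + i)})
print(len(query_all()), "events visible across both tiers")
```

The fast tier answers quick-response queries over the newest data, while the files accumulate for the heavier analytics described in the Big Data section; the converged query simply unions the two.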
This is a very brief overview of a typical implementation, and there is a range of options. The major vendors sell a wide variety of software and hardware covering all Big Data architectures and the vast majority of Fast Data architectures, while open source vendors cover most of the same software areas. As a result, Fast Data and Big Data implementations are often a balancing act between cost and speed, and smart buyers can gain a competitive advantage by adopting effective architectures.
Smaller vendors such as Redis Labs and GridGain in the fast data space, and larger vendors such as Oracle and SAP, all play important roles in both fast data and big data; SAP may be the more appropriate vendor in the fast data tools space. On the hardware side, Intel has a strong interest in fast data. Other traditional big data vendors, such as IBM and Dell, have yet to publish much, with Dell still absorbing its EMC acquisition. Between the two, EMC has earned its reputation in storage, so Dell may prove more relevant than IBM for fast data architectures in the future.