Big data is becoming increasingly important as organizations deal with ever-growing amounts of stored data from multiple sources.
The adoption of big data can be called a perfect storm. Cheap storage and an influx of structured and unstructured data have led to the development of big data tools that help organizations "unlock" their accumulated data, from customer records to product performance results and more.
Like traditional business intelligence (BI), these new big data tools can analyze past trends and help companies identify important patterns, such as specific sales trends. Many big data tools now also offer a new generation of predictive and prescriptive insights drawn from all the data buried deep in enterprise data centers.
Asked about the challenges organizations face, Doug Laney, an analyst at research firm Gartner Inc., said the main difficulty lies less in scaling infrastructure to handle all of this data than in the variety of the data itself.
"The real challenge is that organizations are processing, integrating, and analyzing not only their own transactional data and that of their customers, but also data from partners and vendors, and exogenous data such as open data and aggregated data from social media, and these are just scratching the surface," Laney said in an email.
Gartner's clients say by a 2-to-1 ratio that the variety of data is a bigger problem than its ever-faster growth, yet data processing vendors continue to offer bigger and faster solutions.
Big data solutions are certainly evolving and changing, said Doug Hensant, an analyst at Constellation Research.
"In my book, 2014 was the year of SQL-on-Hadoop announcements, but this year enterprises and vendors are starting to recognize that there are opportunities for big data beyond expanding traditional BI and databases," Hensant said. "As a result, the Apache Spark open source framework and other analytics options overtook SQL in 2015. Hundreds of vendors and large companies began adopting Apache Spark, with IBM being the most visible vendor advocate, and many other companies committed to data integration and big data platforms joined the bandwagon."
In fact, the big data wave seems to be upon us, with vendors introducing a variety of solutions on a daily basis, including some relatively comprehensive designs. While it's hard to compile a complete list, these four tools belong on users' shortlists.
(1) H2O.ai for data scientists
H2O.ai is an independent, open-source machine learning platform launched in late 2014 by startup Oxdata to give data scientists and developers a fast machine learning engine for their applications. Oxdata says it runs on commercially available hardware and can process and analyze data from any source (such as Hadoop or SQL), even when running on thousands of network nodes or on Amazon's AWS cloud. Individuals can try and continue to use H2O.ai for free. Oxdata will charge business users.
"A lot of companies use Spark instead of Hadoop as short-term memory, which is like working memory for big data," said Oleg Rogosko, vice president of marketing and growth at H2O. "In terms of reading from that short-term memory, H2O.ai outperforms Spark and basically provides ultra-fast analytics."
Rogosko said H2O.ai is a new breed of data tool designed to provide predictive analytics. He noted that SQL helped drive the early, descriptive stage of data analytics, or "tell me what's going on," followed by "predictive" offerings that look at what's going on and try to help customers predict what will happen next, such as running out of inventory or having a product breakthrough.
"The third phase we'll see come into play in the next few years is the prescriptive phase, where the system says, 'Here's what I've learned, here's what I think is going to happen in the future, and here's what you should do to maximize your goals,'" Rogosko said. He also pointed to Google Maps' ability to proactively suggest alternative routes as an example of a prescriptive solution.
H2O.ai positions itself as a predictive tool and "box" used by data scientists in a variety of industries. For example, networking giant Cisco has 60,000 models that predict buying decisions, and the company uses H2O.ai to score those models. "The results were fantastic, and we saw H2O.ai outperform its counterparts by a factor of three to seven," said Cisco's chief data scientist. "In terms of individual model scoring, the H2O.ai environment is 10 to 15 times better."
(2) ThoughtSpot 3 - Big Data Applications
With the help of search engines like Google, it's easy to scour the web for the social and web data that users need, but enterprise data is generally harder to find and harder to use. For this reason, seven engineers co-founded ThoughtSpot with the goal of developing a Google-like search engine for finding business data.
The company, much as Google did in its early days, supplies a hardware appliance that provides ultra-fast search behind an organization's firewall, combined with the new search engine application, which features a fast in-memory database for searching through massive amounts of information. The company also plans to offer a cloud-based service.
ThoughtSpot 3, which has a starting price of $90,000, is aimed at organizations that want to find answers in big data quickly, a task that usually relies on data scientists. "We're already seeing an increase in data scientists in organizations using the product," said Scott Holton, vice president of marketing at ThoughtSpot. "Two billion people are searching, but at work we still rely on data experts."
Holton conducted a demo at the company's headquarters in Palo Alto, California, showing how the system works using the familiar search bar interface. The just-released ThoughtSpot 3.0 has a number of new features, including "DataRank," which works much like Google's PageRank and typeahead and uses machine-learning algorithms to suggest search keywords to customers, speeding up the process.
Popcharts is definitely the coolest new feature. When you type "sold by East Coast ......" into the search box, ThoughtSpot instantly creates charts related to the query and uses machine learning to give you more than 10 charts to choose from.
Another "on-the-fly" feature is AutoJoins, which is designed to help navigate the hundreds of data sources that organizations typically have. AutoJoins uses ThoughtSpot's data indexing, together with indexing patterns and machine learning, to determine whether tables are related, and presents the results in less than a second.
ThoughtSpot is currently more focused on traditional BI analysis of historical data (which is super fast and very easy to use), and predictive and prescriptive analytics capabilities will arrive in future versions of the software, Holton said.
(3) Connotate Software
Connotate is a company that classifies and analyzes unstructured data from thousands of Web sites around the world in real time for major companies such as the Associated Press, Reuters, Dow Jones and others. Connotate bills its software as a simple and cost-effective solution for Web data extraction and monitoring, one that mines massive amounts of data for information valuable to business growth and allows for highly scalable data monitoring and collection.
Doug Laney, an analyst at Gartner, said Connotate and BrightPlanet are on his list of big data tools because they help harvest and build rich content from a company's own databases and the Internet.
"With digitization and economic growth, companies are recognizing that focusing only on their own data is no longer a foolproof recipe for innovation, and they're increasingly turning to exogenous data (i.e., data outside the company)," Laney said.
Connotate said its patented technology for extracting content from Web pages goes far beyond Web crawling or custom scripts. Instead, it uses machine learning to build an intuitive visual understanding of how websites work, Connotate said, making its content extraction "accurate, reliable and scalable."
According to the company, the Connotate platform "can easily handle hundreds of websites and megabytes" and provides targeted information that is relevant to the business. It offers content capture at an average cost 55 percent lower than traditional methods.
As an example of a use case, Connotate helped a sales intelligence provider extract contact information (name, title, phone, e-mail and affiliation) from thousands of hospital Web sites and build a national database of physician profiles.
Connotate says the resulting big data solution was sold to several large pharmaceutical companies and required no additional hardware or IT resources. The extraction scales to provide data on as many as 500,000 physicians.
(4) BrightPlanet Tools
BrightPlanet, which also extracts data from the Web, claims that its searches are capable of so-called "deep Web" insights. It mines data from password-protected Web sites and other sites that aren't typically indexed by traditional search engines.
BrightPlanet says it collects millions of data entries, including tweets and data from news databases and medical journals, and can filter them according to a company's specific needs and conditions.
The company offers a free Data-as-a-Service (DaaS) consultation with its data collection engineers and describes the service as a good option. The purpose of the consultation is to help enterprises find the right data to collect and get it in the right format, so customers get a good idea of the process and results.
The end user or client can choose which sites to harvest content from. In turn, BrightPlanet enriches the content. For example, unstructured data such as comments on a social media site can be delivered to the client in a more user-friendly custom format.