The era of big data: five trends in business analytics technology
At the center of these trends, companies are paying as much attention to coping with new analysis challenges as to exploiting the business opportunities a fresh perspective can open up. For example, as more and more companies confront massive volumes of data and consider how to use them, technologies for managing and analyzing large and varied data sets are emerging. Favorable trends in cost and performance mean that companies can ask more complicated questions than before and get more useful answers for running their business.
In interviews, CIOs identified five IT trends affecting their analytics efforts: the growth of big data, faster processing technology, the falling cost of IT commodities, the spread of mobile devices and the growth of social media.
1. Big data
Big data refers to very large data sets, especially those that are not neatly structured and do not fit into traditional data warehouses. Web crawler data, social media feeds and server logs, along with data from supply chain, industrial, environmental and surveillance sensors, all make corporate data more complicated than ever before.
Although not every company needs technology for handling large, unstructured data sets, Perry Rotella, CIO of Verisk Analytics, believes all CIOs should be watching big data analysis tools. Verisk helps financial firms assess risk and works with insurers to prevent insurance fraud; its revenue exceeded $1 billion in 2010.
Rotella believes technology leaders should take the attitude that the more data, the better, and should welcome its substantial growth. Rotella's job is to look ahead, finding the connections and patterns among things.
Cynthia Nustad, chief information officer of HMS, sees "explosive" growth in big data. HMS's business includes helping to control costs for Medicare and Medicaid programs and for private health plans. Its clients include health and human services programs in more than 40 states and more than 130 Medicaid managed care plans. By preventing erroneous payments, HMS helped its customers recover $1.8 billion in losses in 2010 and save billions of dollars more. Nustad said: "We are collecting and tracking a great deal of material, both structured and unstructured data, because you don't always know what you will be looking for."
One of the most talked-about big data technologies is Hadoop, an open source distributed data processing platform originally developed for tasks such as compiling web search indexes. Hadoop is one of several "NoSQL" technologies (along with CouchDB and MongoDB) that organize web-scale data in novel ways.
Hadoop parcels out subsets of data to hundreds of servers for processing, and a master job scheduler collates the results each server reports, giving it the ability to work through petabytes of data. Hadoop can be used both to prepare data for analysis and as an analysis tool in its own right. Companies without thousands of idle servers can buy on-demand access to Hadoop instances from cloud vendors such as Amazon.
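To make that division of labor concrete, here is a minimal sketch of the map-and-reduce pattern described above, written as a Hadoop Streaming job in Python. The specifics, the log layout, the file names and the idea of counting hits per URL, are illustrative assumptions rather than anything from the companies quoted here.

```python
#!/usr/bin/env python3
"""Minimal Hadoop Streaming sketch: count hits per URL in tab-separated server logs.

Illustrative invocation (paths and names are assumptions, not from the article):
  hadoop jar hadoop-streaming.jar -files hitcount.py \
      -mapper "hitcount.py map" -reducer "hitcount.py reduce" \
      -input logs/ -output hit_counts/
"""
import sys
from itertools import groupby


def map_phase(lines):
    # Each worker receives a slice of the input on stdin and emits (url, 1) pairs.
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields and fields[0]:
            print(f"{fields[0]}\t1")


def reduce_phase(lines):
    # Hadoop sorts mapper output by key, so identical URLs arrive together;
    # sum their counts and emit one total per URL.
    pairs = (line.rstrip("\n").split("\t") for line in lines)
    for url, group in groupby(pairs, key=lambda pair: pair[0]):
        print(f"{url}\t{sum(int(count) for _, count in group)}")


if __name__ == "__main__":
    phase = sys.argv[1] if len(sys.argv) > 1 else "map"
    (map_phase if phase == "map" else reduce_phase)(sys.stdin)
```

Hadoop runs the mapper on whichever nodes hold each slice of the input, sorts the emitted keys, and feeds grouped keys to the reducers, which is the collation step performed by the master job scheduler described above.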
Nustad said that HMS is exploring NoSQL technology, though not for its large database of Medicare and Medicaid claims, which consists of structured data and can be handled with traditional data warehouse technology. She said it would be unwise to walk away from tried-and-true relational database management for the questions that relational technology has proved best at answering. However, Nustad believes Hadoop has an important role to play in fraud-and-waste prevention analysis, and it has the potential to analyze patient medical records, which are reported in a variety of formats.
The CIOs interviewed who have hands-on experience with Hadoop, including Rotella and Jody Mulkey, CIO of Shopzilla, all work at companies whose business is data services.
Mulkey said, "We are using Hadoop to do what we used to do with data warehouses. More importantly, we have obtained practical and useful analytical techniques that have never been used before. " For example, as a comparative shopping website, Shopzilla accumulates several terabytes of data every day. He said: "In the past, we had to sample the data and classify the data. When dealing with massive data, this workload is very heavy. " Since Hadoop was adopted, Shopzilla has been able to analyze raw data and skip many intermediate links.
Good Samaritan Hospital, a community hospital in southwestern Indiana, is at the other end of the spectrum. Chuck Christian, the hospital's chief information officer, said, "We don't have anything I would consider big data." Nevertheless, regulations require it to store whole new types of data, such as huge volumes of electronic medical records. That, he said, will undoubtedly require the ability to glean healthcare quality information from the data, but it is more likely to happen through regional or national healthcare associations than inside his single hospital. For that reason, Christian is unlikely to invest in this new technology.
John Ternent, chief information officer of Island One Resorts, said that whether the analysis challenges his company faces count as big data depends on whether you emphasize the "big" or the "data." Still, he is cautiously considering Hadoop instances in the cloud as an economical way to analyze complex mortgage portfolios. The company currently manages eight timeshare resorts in Florida. "That solution could address real problems we are encountering right now," he said.
2. Speeding up business analysis
Vince Kellen, chief information officer of the University of Kentucky, believes big data technology is just one element of a broader trend toward faster analysis. "What we are looking for is a more sophisticated way of analyzing very large amounts of data," he said. The size of the data matters less than analyzing it quickly, "because you want that process to happen fast."
Because today's computers can hold more data in memory, they can compute results far faster than when data must be fetched from the hard disk. That holds even if you are only working with a few gigabytes of data.
For decades, database performance has been improved by caching frequently accessed data in memory. The technique becomes far more practical when an entire large data set can be loaded into the memory of a server or server cluster, with the hard disk used only as backup. Because retrieving data from a spinning disk is a mechanical process, it is far slower than processing data in memory.
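As a toy illustration of the in-memory idea (not anything the interviewees described), the sketch below loads a table from an on-disk SQLite file once and then answers repeated questions entirely from RAM; the database name and schema are invented.

```python
import sqlite3
from functools import lru_cache

# The slow, mechanical path: an on-disk database file (hypothetical name and schema).
conn = sqlite3.connect("claims.db")

# Load the whole table into memory once; from here on, the disk is only a backup.
in_memory = dict(conn.execute("SELECT claim_id, amount FROM claims"))

@lru_cache(maxsize=None)
def total_for_prefix(prefix: str) -> float:
    # Repeated analytical queries are served from RAM (and memoized), never from disk.
    return sum(amount for claim_id, amount in in_memory.items()
               if str(claim_id).startswith(prefix))

# e.g. total_for_prefix("2011-") scans memory the first time and is instant afterwards.
```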
Rotella said that analysis he can now run in a few seconds would have taken a whole night five years ago. His company does forward-looking analysis on large data sets, which often means running a query, looking for patterns and making adjustments before the next query; here, query completion time is critical to the pace of analysis. "It used to be that run times took longer than modeling times, but now modeling takes longer than the runs," he said.
Columnar database servers address other performance needs by turning the traditional row-oriented structure of relational databases on its side. Instead of reading whole records and picking out the needed fields, queries touch only the relevant columns, which greatly improves performance for applications that group or aggregate a few key columns.
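A small sketch of the column-oriented idea, using the open-source Parquet format via the pyarrow library rather than any particular columnar database product; the file and column names are hypothetical.

```python
import pyarrow.parquet as pq

# A row store would read every record and then discard most of its fields.
# A columnar layout lets the query pull only the two columns it actually needs,
# no matter how many other columns each record carries.
table = pq.read_table("sales.parquet", columns=["region", "revenue"])

# Group and aggregate on the key column without ever touching the rest of the record.
revenue_by_region = table.to_pandas().groupby("region")["revenue"].sum()
print(revenue_by_region)
```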
Ternent warned that the performance advantages of columnar databases must be matched with the right application and query design. "To make a difference, you have to ask it the right questions in the right way," he said. He also pointed out that columnar databases really only make sense for applications working with more than about 500 gigabytes of data. "You have to amass data at some scale before a columnar database pays off, because it depends on a certain level of repetition in the data to achieve its efficiencies," he said.
Allan Hackney, CIO of insurance and financial services giant John Hancock, said that improving analysis performance also requires hardware upgrades, such as adding GPU chips of the same kind used as graphics processors in game systems. "The calculations used for visualization are very similar to those used in statistical analysis," he said. "Graphics processors are hundreds of times faster than ordinary PC and server processors. Our analysts love this equipment."
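As a rough sketch of why the same silicon that renders game graphics helps with statistics, the snippet below runs the same dense matrix product on the CPU with NumPy and on a GPU with the optional CuPy library; it assumes CuPy is installed and a CUDA-capable card is present, and actual speedups will vary.

```python
import numpy as np
import cupy as cp  # open-source GPU array library; requires a CUDA-capable GPU

x_cpu = np.random.rand(4000, 4000)
x_gpu = cp.asarray(x_cpu)               # copy the matrix into GPU memory

gram_cpu = x_cpu.T @ x_cpu              # covariance-style product on the CPU
gram_gpu = cp.asnumpy(x_gpu.T @ x_gpu)  # same product on the GPU, copied back to RAM

# Both devices compute the same statistic; only the speed differs.
assert np.allclose(gram_cpu, gram_gpu, atol=1e-6)
```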
3. Declining technology costs
Along with the growth in computing power, analytics is benefiting from falling memory and storage prices. At the same time, as open source software becomes a credible alternative to commercial products, competitive pressure is pushing commercial prices down further.
Ternent is a supporter of open source software. Before joining Island One, he was vice president of engineering at Pentaho, an open source business intelligence company. "For me, open source levels the playing field," he said, because it lets a mid-size company like Island One use the open source application R instead of SAS for statistical analysis.
Open source tools used to offer only basic reporting, but they can now deliver the most advanced predictive analytics. "There is now an open source player across the whole continuum, which means anyone can use it," he said.
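Ternent's example is R displacing SAS; the same point holds across the open-source stack. As one illustration, the sketch below fits a basic predictive model with the open-source scikit-learn library for Python on one of its bundled demo datasets; nothing here comes from Island One or Pentaho.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# A bundled demo dataset stands in for real business data.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Train a simple predictive model and check it on held-out data.
model = LogisticRegression(max_iter=5000).fit(X_train, y_train)
print(f"Held-out accuracy: {model.score(X_test, y_test):.3f}")
```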
According to HMS's Nustad, shifting computing costs are changing some basic architectural choices. For example, one traditional reason for building a data warehouse was to bring data together on servers with enough computing power to process it; when computing power was scarce, separating analytical workloads from operational systems kept the daily workload from being degraded. Today, Nustad said, that is no longer necessarily the right choice.
She said, "As hardware and storage become cheaper, you can make these operating systems handle a business intelligence layer." By reformatting the data and loading the data into the warehouse, the analysis directly based on the operational application can provide answers more quickly.
Hackney observed that although these price/performance trends help with cost management, the potential savings tend to be swallowed up by ever-growing demand for capacity. Although John Hancock's storage cost per device fell 2 to 3 percent this year, consumption grew by 20 percent.
4. The popularity of mobile devices
Like all applications, business intelligence is going increasingly mobile. For Nustad, mobile business intelligence is a priority, because she wants to be able to check, anytime and anywhere, reports on whether her company is meeting its service-level agreements. She also wants to give the company's customers mobile access to data to help them monitor and manage healthcare costs. "It's a feature customers really like," she said. "Five years ago they weren't asking for it; now they are."
For CIOs, this trend is more about building suitable user interfaces for smartphones, tablets and touch-screen devices than about more sophisticated analytical capabilities. Perhaps for that reason, Kellen considers it relatively easy. "To me, this is kind of a small thing," he said.
Rotella doesn't think it's so easy. "Mobile computing affects everyone," he said. "Many people are starting to work on iPads, and other mobile devices are exploding. The trend is accelerating and changing how we interact with computing resources inside the company." For example, Verisk has developed products that let claims adjusters run analysis quickly in the field, so they can estimate replacement costs on the spot. "That has an impact on our analytics, putting them in the hands of everyone who needs them, wherever they are," he said.
Rotella said: "The factor that causes this challenge lies in the speed of technology update. Two years ago, we didn't have an iPad, but now many people are using it. With the emergence of various operating systems, we are trying to figure out how they affect our research and development, so that we don't have to write these applications again and again. "
Island One's Ternent points out, on the other hand, that the need to create native applications for every mobile platform may be fading, because browsers on phones and tablets are becoming far more capable. "If I can use a web-based application tailored for mobile devices, I'm not sure I would invest in custom mobile-device applications," Ternent said.
5. Adding social media to the mix
With the rise of social media such as Facebook and Twitter, more and more companies want to analyze the data these sites generate. Newly available analytics applications support statistical techniques such as natural language processing, sentiment analysis and network analysis that are not part of the typical business intelligence tool kit.
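To give a flavor of sentiment analysis in particular, here is a minimal sketch using the open-source NLTK library's VADER scorer on two invented posts; it stands in for, and is far simpler than, the commercial tools discussed next.

```python
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)   # one-time download of the sentiment lexicon
scorer = SentimentIntensityAnalyzer()

posts = [
    "Love the new checkout flow on this site!",
    "My laptop died again, worst support experience ever.",
]
for post in posts:
    # The 'compound' score runs from -1 (strongly negative) to +1 (strongly positive).
    print(f"{scorer.polarity_scores(post)['compound']:+.2f}  {post}")
```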
Because they are so new, many social media analysis tools are available as services. A prominent example is Radian6, a software-as-a-service (SaaS) product recently acquired by Salesforce.com. Radian6 is a social media dashboard that tracks mentions of specific terms, typically brand names, in Twitter messages, Facebook posts, and posts and comments on blogs and discussion boards, scoring them as positive or negative. When marketing and customer service departments buy such tools, they barely depend on the IT department at all. Even so, Kellen of the University of Kentucky believes he needs to pay close attention to them. "My job is to identify these technologies, evaluate which of these algorithms matter for the organization's competitiveness, and then start getting the right people trained," he said.
Like companies, universities are keenly interested in monitoring their reputations. At the same time, Kellen said, he may also look for opportunities to develop applications aimed at problems the school cares about, such as student retention. Monitoring students' posts on social media, for example, could help the school and its administrators learn about the troubles students are running into as early as possible. Kellen noted that Dell already does something similar in product support, detecting tweets from people complaining about broken laptops. IT developers, he said, should also look for ways to push alerts from social media analytics into applications so that companies can respond quickly to relevant events.
Hackney said, "We don't have the know-how or tools to process and tap the value of massive social media posts. However, once you collect the data, you need to be able to get enough information about the company's events to relate them. " Although Hac