Big Data Analytics Best Practices Based on the Rules of the Road
"Big data" analytics can feel unfamiliar because of its new vocabulary, technologies, products, and providers, but tried-and-true best practices for data management are just as effective in this still-emerging discipline.
As with business intelligence (BI) and data warehousing projects of all kinds, experts say it's important to have a clear understanding of an organization's data management needs and a defined strategy before embarking on a big data analytics project. Big data analytics is widely discussed, and companies in a variety of industries are inundated with new data sources and ever-increasing amounts of information. However, investing significant resources in big data technologies before it's clear what value they can bring to the company is widely regarded as one of the worst mistakes a user can make.
David Menninger, an analyst at Ventana Research who focuses on BI, analytics, and information management technologies, advises against adopting the technology too aggressively. Instead, he recommends starting from a business perspective: CIOs, data scientists, and business users should work together to define business goals and expected value before getting started.
Accurately defining what data is available and determining how the organization can best use it is the most critical part of the process, Menninger noted. CIOs, IT managers, and BI staff need to determine what data is being retained, aggregated, and used, and compare it with what is being discarded. They should also consider external data sources that are not currently used but may be added.
Menninger notes that even if a company is unsure when and how to apply big data analytics, it's still beneficial to make that assessment early. In addition, starting to capture data now can help prepare for the eventual move. "Even if you don't know what you're going to use it for, capture the data first," he says. "Otherwise, you're missing an opportunity because you don't have enough historical data to analyze."
Start small with big data
It's just as important to start with small opportunities to analyze big data sets and then build on them. As companies expand the data sources and types of information they analyze, and begin to create analytic models that help them discover patterns and correlations in structured and unstructured data, they need to stay focused on the outcomes that matter most to their intended business goals.
Yvonne Genovese, an analyst at Gartner, noted, "If you end up just looking for new patterns and they're not useful, then you've definitely hit a dead end."
ComScore, which specializes in tracking Internet use, provides Web analytics and sales intelligence services to corporate clients. The company recognized early on that it needed a big data strategy, but it picked a few very targeted starting points before slowly building out its big data analytics program.
Will Duckworth, ComScore's vice president of software engineering, said, "We started small -- extracting individual streams of data and then transferring them to different systems. You can't do that overnight if you can't get to a certain scale."
Scale is exactly what matters at comScore, given the amount of data the company handles. Back in 2009, when it was capturing just 300 million records a day (a figure that has since risen to 23 billion records a day and is still growing), Duckworth started looking for new systems and technology infrastructure to process comScore's data efficiently.
Don't forget that the end goal is still big data
Leveraging open source Hadoop technologies and new analytics tools, Duckworth optimized the open source environment so that it would be more accessible to business analysts who are familiar with SQL. He noted that companies must pay attention to scale when planning a big data analytics implementation.
He explains, "You've got to think about change -- how much data you're going to need to process six months from now, how many servers you're going to need to add, and whether or not you're going to have software to do those tasks. People don't take into account how much the data grows and how popular the program will be when it's deployed to a production environment."
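The kind of six-month projection Duckworth describes can be roughed out with simple compound-growth arithmetic. The following is only an illustrative sketch: the function names, the starting volume, the monthly growth rate, and the per-server capacity are all hypothetical placeholders, not comScore's figures or method.

```python
import math


def project_daily_records(current_per_day: float, monthly_growth: float, months: int) -> float:
    """Compound today's daily record count forward at a fixed monthly growth rate."""
    return current_per_day * (1.0 + monthly_growth) ** months


def servers_needed(records_per_day: float, records_per_server_per_day: float) -> int:
    """Round up, because a partially used server still has to be provisioned."""
    return math.ceil(records_per_day / records_per_server_per_day)


# Hypothetical inputs chosen only to illustrate the arithmetic.
current = 1_000_000_000   # records captured per day today (assumed)
growth = 0.10             # 10% growth per month (assumed)
capacity = 50_000_000     # records one server can process per day (assumed)

future = project_daily_records(current, growth, months=6)

print(f"Projected records per day in 6 months: {future:,.0f}")
print(f"Servers needed today: {servers_needed(current, capacity)}")
print(f"Servers needed in 6 months: {servers_needed(future, capacity)}")
```

With these assumed inputs, the server count rises from 20 to 36 in six months, which shows how quickly a steady monthly growth rate compounds into extra hardware. A real plan would replace the placeholders with measured growth and benchmarked per-server throughput.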
Once companies settle into the "new normal" of big data, another point many of them overlook is that the "old normal" of data management is still valid.
Another Gartner analyst, Marcus Collins, noted, "Information management practices are just as important for Big Data today as they were for data warehousing. Even for companies looking to increase processing flexibility, one thing they need to remember is that information is an enterprise asset and should remain as important as ever."