In a way, Google is the librarian of the twenty-first century. No matter how obscure your request, if you bring it to her, she will find the information you are looking for with her own search system, one even more inscrutable than the Dewey Decimal system that libraries historically relied on. Among Google's many advantages, one in particular stands out for anyone researching trends and gauging what is hot: the search engine's perfect memory. Google Trends is a helpful tool: type in a search term, and it reports back the historical volume of searches for that term.
This gives us a way of testing how much buzz there really is about something, whether it is a passing fad or here to stay, and when it really took off. People in the world of business have been talking about Big Data for years now (when a term becomes a proper noun, you know it has made it), so we decided to point Google Trends at Big Data and see what could be learned.
Big Data really took off as a concept in the fall of 2011, when searches exploded; they reached their zenith around October 2014. Search volume has dipped only modestly since then, and the topic is still going strong today. Big Data has as much staying power as any other major business concept.
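Readers who want to reproduce this check can do so with the unofficial pytrends Python library, a third-party wrapper around Google Trends (not a Google product). A minimal sketch, with an illustrative keyword and timeframe:

```python
from pytrends.request import TrendReq

# Connect to Google Trends through the unofficial pytrends wrapper.
pytrends = TrendReq(hl="en-US", tz=360)

# Keyword and timeframe are illustrative; adjust to taste.
pytrends.build_payload(kw_list=["Big Data"], timeframe="2010-01-01 2018-01-01")

interest = pytrends.interest_over_time()  # pandas DataFrame of search volume over time
print(interest["Big Data"].idxmax())      # date of peak search interest
```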
The Four V’s of Big Data
Within Big Data circles, there is a concept of the four V’s:
- Volume
- Velocity
- Variety
- Veracity
Originally conceived at IBM, this way of looking at data has taken hold industry-wide. It lets you contextualize a problem and home in on its root cause.
Volume, in short, describes the scale of data a company holds. The amount of data in existence grows exponentially every year, so quickly that putting a number in writing is almost pointless, though the total was estimated at 2.7 zettabytes in 2017 (you can Google how big that is!). Needless to say, sifting insights out of a desert of data that size is daunting.
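To save you the search, here is a quick back-of-the-envelope conversion; the 2.7 zettabyte figure is the 2017 estimate mentioned above, and the comparisons are only illustrative:

```python
# Back-of-the-envelope: how big is 2.7 zettabytes?
ZETTABYTE = 10 ** 21                                  # bytes, using decimal (SI) units

total_bytes = 2.7 * ZETTABYTE
print(f"{total_bytes / 10 ** 9:.1e} gigabytes")       # ~2.7e+12, i.e. 2.7 trillion GB
print(f"{total_bytes / 10 ** 12:.1e} 1 TB drives")    # ~2.7e+09, i.e. 2.7 billion drives
```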
Velocity speaks to the speed at which incoming data is processed. This matters most for operations that require real-time data processing, but even companies that analyze data retroactively can get buried by the pace of incoming data. With the right people and systems in place to deal with data as it arrives, issues are discovered sooner rather than later, saving precious time and money.
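As a toy illustration of why velocity matters, the sketch below simulates a feed of incoming records and checks each one the moment it arrives; the stream, field names, and threshold are all invented for the example:

```python
import random
import time

def sensor_stream(n=20):
    """Simulate a live feed; stands in for a real source such as a message queue."""
    for i in range(n):
        yield {"id": i, "value": random.gauss(100, 15)}
        time.sleep(0.05)  # records trickle in over time rather than all at once

# Real-time handling: each record is inspected as it arrives, so an anomaly
# surfaces immediately instead of in next quarter's retrospective report.
for record in sensor_stream():
    if record["value"] > 130:
        print(f"alert: record {record['id']} out of range ({record['value']:.1f})")
```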
Variety refers to the different types of data a firm has access to. Within this is the distinction between structured and unstructured data: structured data conforms to an obvious, discernible pattern (money, dates, numbers, and so on), whereas unstructured data takes more humanistic forms such as video or free text. These are two very different types of data, and they require two very different types of systems to interpret.
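A small sketch of the difference, with invented data: the structured rows parse directly with standard tooling, while the unstructured review needs a different kind of analysis entirely (a crude keyword scan stands in for real text analytics here):

```python
import csv
import io

# Structured: fixed fields, predictable types; a standard parser handles it.
structured = "date,amount\n2017-03-01,49.99\n2017-03-02,12.50\n"
rows = list(csv.DictReader(io.StringIO(structured)))
print(sum(float(r["amount"]) for r in rows))   # 62.49

# Unstructured: free-form text with no schema; answering even a simple
# question about it requires text analysis rather than a parser.
review = "Loved the product, but shipping took forever and the box was damaged."
print("mentions shipping" if "shipping" in review.lower() else "no shipping mention")
```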
Veracity considers how reliable data is: does the data you are collecting accurately reflect reality? Beyond that, veracity takes into account how the data is collected, processed, and interpreted, making sure errors and biases are avoided. The only thing worse than having no data is having incorrect or invalid data.
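A minimal sketch of the idea: a few sanity checks that flag records which cannot reflect reality. The field names, rules, and cutoff date are assumptions made up for the example:

```python
def veracity_checks(record):
    """Flag values that cannot plausibly be real."""
    problems = []
    amount = record.get("amount")
    if amount is None:
        problems.append("missing amount")
    elif amount < 0:
        problems.append("negative amount")
    if record.get("date", "") > "2017-12-31":  # assumed collection cutoff
        problems.append("date in the future")
    return problems

records = [
    {"date": "2017-03-01", "amount": 49.99},   # passes
    {"date": "2018-06-01", "amount": -5.00},   # fails both checks
]
for r in records:
    issues = veracity_checks(r)
    if issues:
        print(f"rejecting {r}: {', '.join(issues)}")
```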
This framework of the four V's (volume, velocity, variety, and veracity) is what BIG uses when considering Big Data solutions. If an organization can handle the amount of data it collects but cannot process it in a timely manner, it may have a velocity issue. If its data has proven generally reliable but not relevant, it likely suffers from a variety issue, and new data sources can open up new insights.
Making Sense of It All
Only once the problem's source is identified can data begin to fulfill its potential. By plugging the leaks in a company's data systems, we can set to work optimizing them. Just as there are four categories for describing data, there are four broad categories into which analytics work falls:
- Visualization
- Decision Making
- Forecasting
- Prescriptive Analytics
Visualization – Data is not useful if it cannot be quickly understood and interpreted. As the old saying goes, “a picture is worth a thousand words!” With visual dashboards, graphs, and charts, anyone within an organization can read and act on the data.
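As a minimal sketch of the idea, the snippet below turns a small, invented sales table into a bar chart with matplotlib; a real dashboard would sit on live data, but the principle is the same:

```python
import matplotlib.pyplot as plt

# Illustrative monthly sales figures; any tabular source would do.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
sales = [120, 135, 128, 160, 172, 190]

fig, ax = plt.subplots()
ax.bar(months, sales)
ax.set_title("Monthly Sales")
ax.set_ylabel("Units sold")
fig.savefig("monthly_sales.png")   # ready to drop into a dashboard or report
```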
Decision Making – If historical data has proven reliable, it can be mined for insights. Sometimes all it takes is a (trained) fresh pair of eyes on a dataset to spot the million-dollar idea sitting right in front of you.
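Here is one small example of that fresh look: grouping an invented order history with pandas to see which segments actually drive revenue. The column names and figures are made up for illustration:

```python
import pandas as pd

# Invented order history; in practice this would come from a database.
orders = pd.DataFrame({
    "region":  ["North", "South", "North", "South", "West"],
    "channel": ["web", "store", "web", "web", "store"],
    "revenue": [200, 150, 340, 90, 410],
})

# The "fresh pair of eyes" question: which segments actually drive revenue?
summary = (orders.groupby(["region", "channel"])["revenue"]
                 .sum()
                 .sort_values(ascending=False))
print(summary)
```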
Forecasting – Past data is useful not only for decision making but also for predicting trends and the likelihood of future events. While we have yet to find a crystal ball, forecasting is probably the next best thing. Everything from inventory optimization to sales funnel decisions can benefit from data forecasting.
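A crystal ball it is not, but a simple linear-trend fit shows the flavor. The sketch below extrapolates invented monthly sales one month ahead with numpy; real forecasting would use richer models and far more history:

```python
import numpy as np

months = np.arange(1, 7)                              # six months of (invented) history
sales = np.array([120, 135, 128, 160, 172, 190])      # units sold each month

slope, intercept = np.polyfit(months, sales, deg=1)   # fit a straight line to the trend
forecast = slope * 7 + intercept                      # extrapolate to month 7
print(f"month 7 forecast: {forecast:.0f} units")
```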
Prescriptive Analytics – Prescriptive analytics centers on uncovering why the data behaves the way it does and making decisions based on the answer. Structured experimentation and statistical modeling can be just what the doctor ordered to deliver healthy bottom line figures.
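As a taste of structured experimentation, the sketch below runs a two-sample t-test with scipy on invented A/B results: did a change to the checkout page shift order values, or is the difference just noise? All numbers are illustrative:

```python
from scipy import stats

control = [52.1, 48.3, 50.7, 49.9, 51.5, 47.8, 50.2, 49.0]  # order values, old page
variant = [55.4, 53.2, 56.1, 52.8, 54.9, 53.7, 55.0, 54.3]  # order values, new page

t_stat, p_value = stats.ttest_ind(control, variant)
if p_value < 0.05:
    print(f"p = {p_value:.4f}: the difference is unlikely to be noise")
else:
    print(f"p = {p_value:.4f}: no clear effect; keep experimenting")
```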
BIG specializes in each of these four categories of data analytics. Over the next four weeks, we will profile one of these techniques each week, along with the tangible benefits it can bring to you.