Worley Blog

IS BIG DATA TOO BIG?

Posted on: September 10th, 2019 by Clifford F. Lynch

For years we have heard the old adage that “Information is Power”, the thought being that if one controlled the information, the power would come. A few years ago, a consultant put a damper on that when he said that if information were power, the librarians would own the world. The real power would be in knowing how to use the information. Even before that, Einstein said, “Know where to find the information and how to use it – that is the secret of success.”

I think most of us have found that to be true. We complain about being overwhelmed with data, exposed to more than we can possibly use. Recently, we have found that there is a name for that: Big Data. The term was first used in the 1990s, was popularized in 2008, and is being used more and more frequently today. For some, reading about it will simply bring on a headache, but others are finding ways to harness at least some of it to reduce costs and increase profits.

Well, what is Big Data? There are several definitions. A 2014 article in Forbes listed twelve of them. The most popular current definition, however, can be found in Wikipedia. There, Big Data is defined as “a field that treats ways to analyze, systematically extract information from, or otherwise deal with data sets that are too large or complex to be dealt with by traditional data processing application software. Challenges include such things as capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating, information privacy and data source.”

How big is it? IBM maintains that businesses around the world generate 2.5 quintillion bytes of data daily, and that 90% of the world’s data has been produced in the past two years. Abhinav Rai, of Upgrad, cites such things as the Google search index, Facebook user profiles, and the Amazon product list as examples of Big Data. Whatever the source, the characteristics of Big Data include Volume (the quantity of generated and stored data), Variety (the type and nature of the data), Velocity (the speed at which it is generated), and Veracity (data quality and value).
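
To put the IBM figure in more familiar units, here is a small back-of-the-envelope sketch. It assumes decimal (SI) units, where one petabyte is 10^15 bytes and one exabyte is 10^18 bytes, and it assumes the data arrives evenly over a 24-hour day. The helper name to_unit is made up for this illustration.

```python
# Decimal (SI) unit sizes in bytes. Binary units (PiB, EiB) would give
# slightly different numbers; SI is an assumption made for this sketch.
UNITS = {"TB": 10**12, "PB": 10**15, "EB": 10**18}

# 2.5 quintillion bytes per day, the IBM figure quoted above.
DAILY_BYTES = 25 * 10**17

SECONDS_PER_DAY = 24 * 60 * 60


def to_unit(n_bytes, unit):
    """Convert a byte count to the given SI unit (hypothetical helper)."""
    return n_bytes / UNITS[unit]


daily_eb = to_unit(DAILY_BYTES, "EB")
daily_pb = to_unit(DAILY_BYTES, "PB")
# Average rate, assuming the data is generated evenly through the day.
per_second_tb = to_unit(DAILY_BYTES / SECONDS_PER_DAY, "TB")

print(f"Per day:    {daily_eb} exabytes ({daily_pb:,.0f} petabytes)")
print(f"Per second: about {per_second_tb:.1f} terabytes")
```

Under those assumptions, 2.5 quintillion bytes a day works out to 2.5 exabytes, or an average of roughly 29 terabytes every second.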

The major question is, how do you get your arms around this mass of data? To do this, we have turned to predictive analytics, user behavior analytics, and other forms of analysis. Amazon groups them into three types: Descriptive Analytics (what happened, and why?), Predictive Analytics (what is the probability of a given event in the future?), and Prescriptive Analytics (what should I do if X happens?). Manufacturers are using predictive analytics to get product out faster, reduce the time required to establish a market presence, and predict who will buy the product and in what quantities. In the retail sector, Walmart handles one million transactions per hour, which are imported into a database containing 2.5 petabytes of data, the equivalent of 167 times the information contained in every book in the Library of Congress.
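
For readers who like to see the idea in miniature, the three kinds of analytics can be sketched on a toy example. Everything here is made up for illustration: the weekly sales figures, the stock level, and the variable names. A simple straight-line trend stands in for the predictive step, which is far cruder than what a real analytics platform would do, and it produces a point forecast rather than a probability.

```python
from statistics import mean

# Hypothetical data: units sold in each of the last five weeks.
weeks = [1, 2, 3, 4, 5]
units_sold = [100, 110, 120, 130, 140]
stock_on_hand = 120  # hypothetical inventory level

# Descriptive: what happened? Summarize the history.
average = mean(units_sold)

# Predictive: what is likely to happen? Fit a least-squares line
# through the history and extend it one week ahead.
x_bar, y_bar = mean(weeks), mean(units_sold)
slope = (
    sum((x - x_bar) * (y - y_bar) for x, y in zip(weeks, units_sold))
    / sum((x - x_bar) ** 2 for x in weeks)
)
intercept = y_bar - slope * x_bar
forecast = intercept + slope * 6  # forecast for week 6

# Prescriptive: what should I do? Order enough to cover any shortfall.
order_qty = max(0, round(forecast - stock_on_hand))

print(f"Descriptive:  average weekly sales were {average} units")
print(f"Predictive:   forecast for week 6 is {forecast:.0f} units")
print(f"Prescriptive: order {order_qty} more units")
```

The point is not the arithmetic but the progression: the same small set of numbers is first summarized, then projected forward, and finally turned into a recommended action.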

The analysis of Big Data played a major role in Barack Obama’s 2012 re-election campaign. The healthcare and insurance industries rely on predictive analytics to anticipate future outcomes.

It is widely accepted that the effective analysis of Big Data will help us make better decisions, but the major issue, especially for smaller companies, is the lack of resources needed to perform these analyses. This is no small task. The McKinsey Global Institute has said there is a shortage of over 1.5 million trained data professionals. The University of Tennessee and the University of California, Berkeley have established master’s programs to help meet this demand. UC Berkeley has an online program that promises a certification in data analytics in 24 weeks.

There are other ways to meet the Big Data challenge. It probably will come as no great surprise to the reader that Amazon Web Services is here to help you. They promise immediate availability, broad and deep capabilities, and hundreds of partners and solutions. One of their major accounts, of course, is Amazon.com.

Regardless of your approach, there is value to be gained through the analysis of Big Data. The major challenge will be in finding the resources to do it.