The Big Data Tsunami!


Organizations today are "drowning in information, yet starved of knowledge." This information is generated in the quest to understand the real-time behavior of dynamic systems inside and outside the organization, in order to gain a competitive edge. For example, a retailer may want to study customers' buying patterns and stock-replenishment times to offer a personalized service, such as predicting a customer's weekly shopping list, or to plan store and warehouse inventory. A manufacturer may want to make its supply chain smarter by gaining end-to-end visibility into how countless entities inside and outside the chain interact with each other, so as to optimize the flow of supply and demand and offer innovative after-sales service. A bank may want to predict potential credit fraud. Information management is a key competitive differentiator in today's world; however, ensuring quality insights is a task that is far from easy!

To sustain this competitive edge, organizations must collect data and generate meaningful, real-time information and insights from it, which in turn drive action and, at times, even predictions of the future! The business scenarios above would require analyzing millions of transactions, massive activity logs, huge numbers of customer calls, and countless audio/video streams.

For any organization, data generated by transactional systems, flat files, Excel sheets, and databases is structured data, and conventional OLAP tools and data warehouses are generally used to derive insights from it. A typical data warehouse ranges in size from a few hundred gigabytes to several terabytes.

Then we have data captured in emails, logs, social media interactions, swipe cards, sensors, pictures, customer-care calls, support tickets, videos, and so on; all of this is called unstructured data. For an enterprise, unstructured data volumes may range anywhere from a few terabytes to petabytes every day! According to IDC and Wikipedia:

  • 80% of data growth today is due to unstructured data.
  • Over the next decade, the amount of information managed by enterprises will grow 50 times.
  • The information created and stored in 2011 amounted to 1.8 zettabytes (1 zettabyte = 1 billion terabytes). To give readers a feel for how much that is, it is the equivalent of roughly 250 billion DVDs, or 75 billion 16 GB Apple iPads.
  • 90% of the data in the world has been created in the last two years alone.
  • Walmart handles more than 1 million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes of data, the equivalent of 167 times the information contained in all the books in the US Library of Congress.
  • Facebook handles 40 billion photos from its user base.
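A zettabyte is hard to picture, but the arithmetic behind such equivalences is simple unit conversion. A minimal sketch (assuming decimal units, a 4.7 GB single-layer DVD, and a 16 GB iPad; the figures quoted above rest on slightly different assumptions, so the exact counts differ):

```python
# Back-of-the-envelope conversion of 1.8 zettabytes into everyday media,
# using decimal (SI) units throughout.
ZB = 10**21          # bytes in one zettabyte
GB = 10**9           # bytes in one gigabyte

total_bytes = 1.8 * ZB

dvds  = total_bytes / (4.7 * GB)   # single-layer DVD capacity
ipads = total_bytes / (16 * GB)    # 16 GB iPad capacity

print(f"{dvds / 1e9:.0f} billion DVDs")    # ~383 billion
print(f"{ipads / 1e9:.1f} billion iPads")  # ~112.5 billion
```

The point is not the exact count but the order of magnitude: hundreds of billions of consumer devices to hold a single year's output.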

This data explosion is outpacing the rate at which computing speed doubles and storage gets cheaper. Every passing day, therefore, our data problems are becoming net harder, not net easier, and capturing, processing, and managing this kind of data is beyond the capability of our traditional RDBMSs. All of these events and problems have given rise to the IT trend called Big Data. Almost every IT company has started claiming Big Data solutions and offerings in the technology stack it offers customers, and "Big Data" has become the latest buzzword! It has also increased the demand for information-management specialists: Oracle, IBM, Microsoft, and SAP have spent more than $15 billion on software firms specializing in data management and analytics. This industry is worth more than $100 billion on its own and is growing at almost 10% a year, roughly twice as fast as the software business as a whole.

It doesn't end at volume: in time-sensitive processes such as crime detection or healthcare, a delay of even a minute may be too late. The bigger challenge for a big data solution is to deliver insights and alerts by scanning massive streams of information from sensors, machine logs, and transactional systems as events happen in the real world. This requirement also leaves scope for developing the next generation of computing machines, platforms, architectures, and algorithms. Big Data analytics uses techniques such as text analytics, voice analytics, machine learning, and NLP; however, it still remains a big research area, owing to the limits of our computing power for processing real-time events and the lack of efficient, faster algorithms to churn meaningful information out of massive information stores.
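To make "text analytics" concrete, the simplest building block is the word count that underpins the classic MapReduce demo: stream through text, normalize tokens, and tally frequencies. A minimal single-machine sketch in Python (the sample input is illustrative; frameworks such as Hadoop or Spark distribute exactly this pattern across a cluster):

```python
import re
from collections import Counter

def word_count(lines):
    """Tally word frequencies across an iterable of text lines.

    Single-machine version of the canonical MapReduce word count:
    the "map" step tokenizes each line, the "reduce" step sums tallies.
    """
    counts = Counter()
    for line in lines:
        # Lowercase and split on non-letters to normalize tokens.
        counts.update(re.findall(r"[a-z]+", line.lower()))
    return counts

# Illustrative usage on a tiny in-memory sample; a real deployment
# would stream terabytes of log or document files through the same logic.
sample = [
    "Big data, big insights",
    "Data beats opinion",
]
print(word_count(sample).most_common(2))  # [('big', 2), ('data', 2)]
```

The same tokenize-and-aggregate shape generalizes to log analysis and simple text analytics; the research challenge mentioned above is doing this at petabyte scale, in real time.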

Other problems in this space are big as well. According to a study by McKinsey, there will be a shortage of the talent organizations need to take advantage of big data: by 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills, as well as 1.5 million managers and analysts with the know-how to use big data analysis to make effective decisions.

At this point in time, it looks like there are even bigger challenges ahead for big data!

2 Responses to The Big Data Tsunami!

  1. Hi Anu. Excellent ‘data packed’ article on data ;)

    The day is not really far off when quantum computing will take over from traditional chip-based computing as we know it today.

  2. Anukool says:

    Sir, the field of quantum computing is quite old; however, it still has many challenges to overcome related to decoherence/wave-function collapse.
