Something amazing happened at the turn of the millennium. For the first time, advances in IT enabled retailers to serve customers regardless of location. The age of e-commerce had begun.

The beginning

In the early days of e-commerce, each organization controlled what information was collected and how. Back then, basic search and analytical tools were sufficient to reveal what the data contained. That allowed retailers to highlight potential areas of growth, redundancy in goods and services, and gaps in the market. This new knowledge made it possible to provide recommendations and other personalized services to consumers. IT management set its sights on improving the infrastructure for warehousing what was turning into a flood of data.

By 2005, the increasing popularity of social media and personal, portable devices had created a simple way for individuals to share opinions and buying choices in ever-widening spheres. This led to a surge in public data, with humans acting as sensors and as active participants in defining the goods and services they purchased. This shared information came as unstructured data in a variety of formats. Retailers soon found that they could increase the value of their user-supplied data by mixing in data from other sources to supply context. For example: where did the customer live, whom did they know, and what were their historical buying habits?

A new approach

With this shift, the IT industry recognized the need for new approaches to capturing and modeling the data and extracting knowledge from it. Making sense of the data was complicated by the fact that it was often of low ‘quality’: it might be ‘noisy’ or of questionable origin. Ownership, privacy and legal constraints introduced yet another layer of complexity. Often the sheer volume of the data masked the value it contained.

The term “big data” was coined to describe this volume and its associated challenges. However, data today is not simply big but also diverse. It comes from a myriad of distributed and heterogeneous sources. Cloud computing as we know it today was born to provide the additional computational resources required to store and aggregate this new data and to extract the knowledge it contains. Today the cloud provides high-end computing power and resources to anyone who needs access to this new wealth of data.

With this came the need for new business models. “Pay as you go” models and customizable services allow individuals, SMEs and large organizations to “try out” services and fit them to budget and need. New models for managing infrastructure were also required, because larger enterprises wanted the best of both worlds. They wanted access to “data and analytics as a service,” but still needed to maintain their own legacy systems for sensitive data and bespoke requirements. This gave rise to hybrid cloud models. Organizations such as Microsoft and IBM, already delivering off-the-shelf and custom analytical and other services, continue to develop a wide range of cloud-based services that fit this hybrid model. Newer Internet-based companies such as Amazon are, in response to demand, also moving from almost entirely cloud-based services to a hybrid model.

Game changer

Big data not only changed the way data was processed and stored; it also changed the skills required to do so. Organizations everywhere are recognizing the need to retrain and retool their workforce to meet this new challenge. New professions, including that of the “data scientist,” have emerged to manage very large-scale, complex data and derive knowledge from it.

As cloud computing matures and data science evolves, things will only get more interesting. Clouds and the data sets they manage will become much more distributed and heterogeneous. As the size and complexity of data grow, it will become more efficient to move the analytics to the data, rather than following the tradition of moving the data to the analytics. Even more exciting will be innovations like cognitive computing, which will give us analytics systems that learn and reason about data much as humans do. The age of the cloud and big data is only beginning.
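
For readers who want to see what "moving the analytics to the data" might look like in practice, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the `DataNode` class, its `order_values` field, and the two helper functions do not come from any particular product. Each function returns the average order value along with a rough count of how many numbers had to travel across the network to compute it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DataNode:
    """Hypothetical storage node holding one partition of order values."""
    name: str
    order_values: List[float]

    def ship_all(self) -> List[float]:
        # Traditional approach: every record leaves the node.
        return list(self.order_values)

    def local_summary(self) -> Tuple[float, int]:
        # Analytics moved to the data: aggregate locally and
        # send back only a (sum, count) pair.
        return sum(self.order_values), len(self.order_values)


def mean_by_moving_data(nodes: List[DataNode]) -> Tuple[float, int]:
    """Pull all records to one place, then compute the mean there."""
    records = [value for node in nodes for value in node.ship_all()]
    return sum(records) / len(records), len(records)


def mean_by_moving_analytics(nodes: List[DataNode]) -> Tuple[float, int]:
    """Ask each node for a partial aggregate, then combine the partials."""
    summaries = [node.local_summary() for node in nodes]
    total = sum(partial_sum for partial_sum, _ in summaries)
    count = sum(partial_count for _, partial_count in summaries)
    # Two numbers (sum and count) are transferred per node.
    return total / count, 2 * len(summaries)


if __name__ == "__main__":
    nodes = [
        DataNode("eu", [10.0, 20.0, 30.0]),
        DataNode("us", [40.0, 50.0]),
        DataNode("asia", [60.0, 70.0, 80.0, 90.0]),
    ]
    print(mean_by_moving_data(nodes))
    print(mean_by_moving_analytics(nodes))
```

Both functions arrive at the same average, but the second transfers only two numbers per node no matter how many records each node holds. With three tiny partitions the saving is trivial; the gap widens as the partitions grow, which is the intuition behind the claim above.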