Excel spreadsheets, stats, taxes (see my last blog post) – these things have been around for a long time. Why have we only heard about big data in the last few years? The reason is two-fold:
1) we now have the tech and theory to properly analyze big data and begin drawing conclusions from it, and
2) more data is being generated and collected now than ever before.
Let’s start with 1). The advent of cloud technology, the Internet, and now crazy fast internet speeds are some of the biggest things responsible for the recent emergence of big data. With the cloud, we no longer have to store data on-site, or go through the pain of moving data from one data center to another. This used to be the norm in the 80s and 90s…not anymore!
The cloud can also be used to quickly perform calculations on big data. We can slice a large dataset into smaller chunks, perform calculations on every chunk at the same time (each on a different server), and then combine the results from all the chunks at the very end. In the past, doing calculations of this sort (known as parallel computing) was expensive and required lots of CPUs on site…but cloud technologies mean we can do this on remote servers all over the world. Services like AWS, Microsoft Azure, and Google Cloud all offer packages which perform parallel computing.
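To make the slice, calculate, combine idea a bit more concrete, here's a tiny Python sketch. It's only an illustration under some assumptions: the helper names (`split_into_chunks`, `chunk_sum`, `parallel_sum`) are made up for this post, the "calculation" is just a sum, and threads on one machine stand in for the separate servers a cloud service would actually use.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, chunk_size):
    """Slice a list into consecutive chunks of at most chunk_size items."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_sum(chunk):
    """The per-chunk calculation. Here it is just a sum (illustrative)."""
    return sum(chunk)


def parallel_sum(data, chunk_size=1000, workers=4):
    """Sum a list by working on its chunks concurrently, then combining."""
    chunks = split_into_chunks(data, chunk_size)
    # Assumption: worker threads play the role of remote servers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(chunk_sum, chunks))
    # Combine the per-chunk results at the very end.
    return sum(partial_sums)


total = parallel_sum(list(range(1, 10001)))
print(total)
```

The answer is the same as a plain `sum(...)` over the whole list; the point is that each chunk's work doesn't depend on any other chunk, which is what lets it be spread across many machines.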
Another thing which has helped is data science and mathematical modeling in general. We can build predictive models from large datasets to spot trends and predict what is likely to happen in the future. This starts getting into Machine Learning and Artificial Intelligence. Since this can prove consequential in so many industries, more money and research are going into big data than in the past.
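Here's roughly what a (very) simple predictive model looks like in Python: fit a straight line through past data points, then extend it to guess the next one. Treat it as a sketch only. The numbers are invented for illustration, `fit_line` and `predict` are names I made up, and real models are usually far more involved than a straight line.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Use the fitted line to estimate y for a new x."""
    return slope * x + intercept


# Made-up data: month number vs. some quantity we care about.
months = [1, 2, 3, 4, 5]
values = [10, 20, 30, 40, 50]

slope, intercept = fit_line(months, values)
print(predict(slope, intercept, 6))
```

With more data (and more variables), the same basic idea of learning a pattern from the past and projecting it forward is what the fancier models build on.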
Number 2) is easy. YouTube, Netflix, Spotify, Wikipedia…a *lot* of data is being created for consumption by others. Data in older formats (vinyl, video, microfiche, etc.) is now being converted to digital and stored online. And as I already blogged about, mass data collection is happening all over the place. So yeah…you’re now hearing about big data because the cat is out of the bag about how much big data is affecting our lives.
So you know *what* big data is and *why* it recently emerged. So *how* is it being used (besides for suggesting the next YouTube video you should watch)? Read on!


