We have all heard about big data. It is the catchphrase of today’s technology giants. Google and Yahoo and, to a lesser extent, Microsoft, Amazon and Cisco all have a hold on what big data is going to achieve.
In 2012, big data was used to track Hurricane Sandy and its aftermath, both to predict what was going to happen and to validate those predictions. Big data is the capability of disparate systems with totally different data sets to answer questions that none of those data sets was originally designed to answer.
Big data tracks trends. It connects the information from one set of data to something totally different and obscure in another data set. These correlations allow us normal humans to make very sophisticated and accurate predictions about what could happen. Going back to Sandy, information about wind speed, water temperature and location allowed the meteorologists to predict landfall to within a couple of kilometres. This allowed emergency services to be in the right place at the right time.
Law enforcement and anti-terrorist organisations also need to be in the right place at the right time, and they are using big data to do it. Combine a conversation between two people with a parking ticket and information from customs, and you may or may not have the right information concerning a crime someone is planning. No system can predict the future with perfect accuracy, but correlating records like these is a good place to start! The recent use of big data analytics in Vancouver is a good example of how law enforcement are using these systems to keep an eye on everyday crime.
Cybercrime can be fought in the same way. Big data analytics makes it possible to see trends that could not be perceived within any one isolated data set. People can make business, personal and commercial decisions based on those trends.
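To make the idea concrete, here is a minimal Python sketch of a trend that only shows up once two data sets are combined. The event logs, the IP addresses, the `flag_sources` helper and the threshold of three events are all hypothetical, chosen purely for illustration; real analytics platforms work at a very different scale and with far richer signals.

```python
from collections import Counter

# Hypothetical event logs from two unrelated systems (illustrative data only).
# Each record is (source_ip, hour_of_day).
failed_logins = [("10.0.0.5", 1), ("10.0.0.7", 1), ("10.0.0.5", 2)]
firewall_blocks = [("10.0.0.5", 1), ("10.0.0.9", 2), ("10.0.0.5", 3)]


def flag_sources(*datasets, threshold=3):
    """Return the sources whose combined event count reaches the threshold."""
    totals = Counter()
    for dataset in datasets:
        # Count one event per record, keyed by source address.
        totals.update(source for source, _hour in dataset)
    return sorted(source for source, count in totals.items() if count >= threshold)


# Each data set on its own stays below the threshold...
alone_a = flag_sources(failed_logins)
alone_b = flag_sources(firewall_blocks)
# ...but the combined view singles out one address.
combined = flag_sources(failed_logins, firewall_blocks)

print(alone_a, alone_b, combined)
```

Neither log flags anything by itself, yet together they point at a single source that deserves a closer look.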
Probability and statistics are a very powerful combination, and are already being used by the operators of huge search engines like Google and Bing to craft their top-secret algorithms. The more they know about you, the more effectively they can influence what you search for and how you do it. Using a focused approach with a larger number of data sets, they gather massive amounts of information about individual internet users. This information is then used to target you with marketing and advertising.
There is already so much information out there that normal humans lack the capability to understand it, never mind control it. This is where big data comes to the fore. Through the Google API (Application Programming Interface), an organisation can draw on the information gathered by Google's spiders (the automated programs that Google deploys to scour and record content on the internet) to look deeper into trends concerning cybercrime.
Just like the chatter on the street before a major importation of drugs, the chatter on the internet concerning a virus outbreak, for instance, is always there. An attack over the internet by a high-profile group like Anonymous is similarly preceded by chatter concerning the attack. Cyber criminals, like everyone else, like to brag. The problem is getting into the circles where the bragging is happening. Data analytics can allow researchers to pinpoint the areas where incriminating information is being shared. This information can then be passed on to places where it can be used in our defence.
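One simple way to picture "listening for chatter" is to count daily mentions of a keyword and flag any day that jumps well above its recent baseline. The Python sketch below does this with a trailing mean and standard deviation. The mention counts, the five-day window, the multiplier of three and the `find_spikes` name are all assumptions made for illustration, not a description of how any real monitoring service works.

```python
from statistics import mean, stdev


def find_spikes(daily_counts, window=5, k=3.0):
    """Return indexes of days whose count exceeds the trailing baseline.

    A day is flagged when its count is more than k standard deviations
    above the mean of the previous `window` days.
    """
    spikes = []
    for i in range(window, len(daily_counts)):
        history = daily_counts[i - window:i]
        baseline = mean(history)
        # Use a floor of 1.0 so a perfectly flat history (stdev of 0)
        # does not turn every tiny increase into a spike.
        spread = max(stdev(history), 1.0)
        if daily_counts[i] > baseline + k * spread:
            spikes.append(i)
    return spikes


# Hypothetical daily mention counts for one keyword (illustrative only).
mentions = [4, 5, 3, 4, 6, 5, 4, 21, 5, 4]
spikes = find_spikes(mentions)
print(spikes)
```

A flagged day is only a prompt for a human analyst to look closer; a sudden rise in mentions can just as easily come from a news story as from a planned attack.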
If a tree falls in the forest, does it still make a sound? In the new world of big data, this old question is easy to answer. The smallest noise made by a cyber criminal or group will be picked up. Big data will allow us not only to hear those tell-tale sounds, but to do something about them.