SPECIAL REPORT / Computer algorithms are better at diagnosing severe cancer than humans, Kenneth Cukier told EurActiv, and big data can predict crimes before they are committed and earn businesses money.
Kenneth Cukier is data editor at The Economist and co-author, with Viktor Mayer-Schönberger, of Big Data: A Revolution That Will Transform How We Live, Work, and Think. Translated into 20 languages, the book was a New York Times bestseller. He spoke to EurActiv’s James Crisp about what big data can teach us.
What is big data?
Well, there’s no single definition, which is probably a good thing, because to define it is to constrain it. Broadly speaking, though, mankind has more information now than ever, and these huge amounts of data can teach us things that are extremely interesting, in fact things we would never have been able to find out with smaller amounts. That’s done by applying different algorithms to these large amounts of data.
Let me give you an example. Google handles more than a billion searches in the United States every day and stores them all. It took the 50 million most commonly searched terms between 2003 and 2008 and compared them against historical influenza data from the Centers for Disease Control and Prevention. The idea was to see whether certain searches made in a certain area coincided with flu outbreaks. The CDC tracks patient visits to hospitals and clinics, but the information suffers from a reporting lag of a week or two, an eternity in the case of a pandemic. Google’s system could work in near-real time.
Google ran all the terms through an algorithm, a way of making a calculation, that ranked the terms by how well they correlated with flu outbreaks. Then the system tried combining the terms. With a billion searches a day, it would have been impossible for a person to guess which ones might work best.
After running half a billion calculations against its data, Google identified 45 terms that strongly coincided with the CDC’s data on flu outbreaks.
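As a rough illustration of the ranking-and-combining idea described above, here is a minimal Python sketch. It is not Google's actual code, whose details are not given in this interview. It assumes a plain Pearson correlation as the score and a simple "add terms in rank order" rule for combining them. The function names and the six weeks of data are made up for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no signal
    return cov / sqrt(var_x * var_y)


def rank_terms(term_counts, flu_rates):
    """Sort search terms by how well their weekly counts track flu rates."""
    scores = {t: pearson(counts, flu_rates) for t, counts in term_counts.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def best_combination(ranked, term_counts, flu_rates):
    """Add terms in rank order; keep the prefix whose summed counts fit best."""
    combined = [0.0] * len(flu_rates)
    chosen, best_terms, best_score = [], [], float("-inf")
    for term, _ in ranked:
        combined = [c + t for c, t in zip(combined, term_counts[term])]
        chosen.append(term)
        score = pearson(combined, flu_rates)
        if score > best_score:
            best_score, best_terms = score, list(chosen)
    return best_terms, best_score


# Hypothetical weekly data, invented purely for illustration.
flu_rates = [1.0, 2.0, 4.0, 7.0, 5.0, 2.0]
term_counts = {
    "flu symptoms": [10, 22, 41, 68, 52, 19],
    "cough medicine": [30, 35, 50, 70, 55, 33],
    "football scores": [80, 20, 75, 25, 78, 22],
}

ranked = rank_terms(term_counts, flu_rates)
terms, score = best_combination(ranked, term_counts, flu_rates)
print(ranked)
print(terms, round(score, 3))
```

In this toy data the unrelated term ends up at the bottom of the ranking and is left out of the chosen combination. The real exercise did the same kind of thing across 50 million terms, which is why it could not be done by hand.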
The Google Trends method has been criticised, because it’s been wrong in some instances. However, that is not the whole story. It’s only been wrong the way a weather forecast is wrong when it turns out sunny after a 90% chance of rain was predicted. If one takes the research and blends it with the CDC’s data, it improves the focus of both, which shows it is still a valuable resource.
Another good example is Lufthansa. The autopilot system on their airplanes collects data. Some of the data it collects has actually improved the accuracy of German weather forecasting by 7%, which is a considerable improvement.
Lufthansa now sells that data to a meteorological company, which is a great example of how big data can be commodified.
So big data can be sold?
Absolutely. In fact, big data is a potential gold mine. There are a few forward-thinking companies that have realised they can sell the data they collect as they go about their everyday work. It will be a revenue generator. In the future I expect to see companies employing data officers or chief information officers, who will be responsible for this.
It’s not just companies. In the future, each of us will be able to sell our data. People will upload data to online data exchanges, neutral platforms which can bring the data to the marketplace for a fair price. And there will be a market for this data, as people realise the enormous potential of big data.
Will there be an impact on how people work?
There will be a significant impact. This will be a revolution in the workplace. Both white-collar and blue-collar jobs will be replaced by big data, but that destruction will also create jobs.
It’s a demonstrable fact that a computer algorithm is better at diagnosing severe cancer than a human. But in a world where data shape decisions more and more, what purpose will remain for people, or for intuition, or for going against the facts?
Personally, I believe there will always remain a need for the human touch. But it is hard to predict the impact of the big data revolution.
What can policymakers do to ensure that the power of big data can be exploited?
The issue of data privacy and protection has been deservedly getting a lot of attention recently. What needs to happen is a change in law to reflect the reality of this type of statistical collection and ensure it is aligned with our values.
Current laws are broadly based on the idea of notice and consent. Essentially, this means that if you want to use someone’s data, you have to tell them what you are collecting and why.
That isn’t really feasible with big data. For a start, it is impossible to know what purpose the data will be used for.
Small data is like a waltz. There’s a clear tempo with known steps. Big data is like a mosh pit or jazz-improv. No one knows what’s coming next.
So regulators need to support this new reality, not least because of the huge potential of big data.
We need to move from notice and consent to a system which allows a person to consent to their data being used and reused without knowing what the specific purpose is.
What are the dangers of big data?
Of course there are risks, and there will be challenging questions for us to answer as we enter this new reality, a time when the “information society” truly fulfills its potential.
Big data could be used to predict which people are most likely to commit murder. That throws up interesting questions.
Should the person be arrested because they are likely to kill? Or do the authorities have to wait until he or she actually does it? How is that fair on the victim?
There are dangers. There is an argument that the 2008 financial crisis was, in a way, a crisis of big data. Decisions were based on economic models that turned out to be false.
But despite that I am convinced big data will change the world for the better.