Forget Big Data, Small Data is the Real Revolution
There is a lot of talk about “big data” at the moment. For example, this is Big Data Week, which will see events about big data in dozens of cities around the world. But the discussions around big data miss a much bigger and more important picture: the real opportunity is not big data, but small data. Not centralized “big iron”, but decentralized data wrangling. Not “one ring to rule them all” but “small pieces loosely joined”.
Big data smacks of the centralization fads we’ve seen in each computing era. The thought that ‘hey, there’s more data than we can process!’ (something that has no doubt been true every year since computing began) is dressed up as the latest trend, complete with its associated technology must-haves.
Meanwhile we risk overlooking the much more important story here, the real revolution, which is the mass democratisation of the means of access, storage and processing of data. This story isn’t about large organisations running parallel software on tens of thousands of servers, but about more people than ever being able to collaborate effectively around a distributed ecosystem of information, an ecosystem of small data.
Just as we now find it ludicrous to talk of “big software” – as if size in itself were a measure of value – we should, and will one day, find it equally odd to talk of “big data”. Size in itself doesn’t matter – what matters is having the data, of whatever size, that helps us solve a problem or address the question we have.
For many problems and questions, small data in itself is enough. The data on my household energy use, the times of local buses, government spending – these are all small data. Everything processed in Excel is small data. When Hans Rosling shows us how to understand our world through population change or literacy he’s doing it with small data.
And when we want to scale up, the way to do that is through componentized small data: by creating and integrating small data “packages” rather than building big data monoliths, and by partitioning problems in a way that works across people and organizations rather than creating massive centralized silos.
This next decade belongs to distributed models not centralized ones, to collaboration not control, and to small data not big data.
Want to create the real data revolution? Come join our community creating the tools and materials to make it happen — sign up here: