Data Data Data!
This was originally posted on my blog. It is the post that helped shape the concept of Visual-Data.
I’ve been experimenting with different ways of displaying data, putting it on maps and generally finding ways to make it look good. This is in preparation for working with a larger bunch of data sets that I’ve mined over the last couple of months.
Some pieces of data help in making things clearer to understand, some in better decision making. Some you see as infographics in your daily newspaper.
One way of visualizing data is to use charts. Line graphs, for example, are good for viewing time-series data like stock prices. Google Finance has interactive charts that let you explore stocks in several ways.
It allows you to compare multiple stocks and indices. There are options for adding other technical details too. Sliders help you go over stock price history for specific periods.
There are other tools that aid you in better data presentation. If you want to lay out a geographical representation of data, there’s Geocommons. Here’s what I was able to do with some data that I had:
I was able to lay my hands on some Election Commission data and put it on the map. Geocommons helps you by doing some of the processing for you once you’ve uploaded a basic data set. You could use the search feature and browse through the library to see different ways in which people have used it for visualising data.
Then there are other sites like Natural Earth, which provides map data that you can layer over base maps like OpenStreetMap. I discovered both (at different times) via @planemad, who is a superb cartographer himself.
Which brings me to my grouse. Data about India in the public domain is messy. It is available, but it is unformatted, dispersed over various websites, and buried deep inside some of them. One act of a transparent government would be to make data available freely and in easily accessible formats. The example I’d cite here is data.gov. Vivek Kundra, the US Federal CIO, has done a great job of taking some steps toward making U.S. government operations transparent.
In the Indian context, very good data is available at the Election Commission site, but you have to dig into the archives and sift through notification PDFs and whatnot to reach it.
This, for example, is the information on delimitation (MS Excel spreadsheet) in Bihar. But you wouldn’t expect to find detailed data on assembly constituencies in Bihar hidden there.
Strangely enough, the government itself suffers from incomplete information. The data used for delimitation is from the Census Data of 2001. No effort has been made to update or even extrapolate data to the year of delimitation. The delimitation notification was itself issued in 2007 (pdf).
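To give a sense of how little it would take, here is a minimal sketch of extrapolating a census count to a later year. It assumes a constant annual growth rate between two census counts, which is only one simple way to do it and not necessarily what a statistics office would use. The function name and the constituency figures are hypothetical and made up purely for illustration; they are not taken from the delimitation data.

```python
def extrapolate_population(pop_start, year_start, pop_end, year_end, target_year):
    """Project a population to target_year.

    Assumes the population grew at a constant annual rate between the
    two census counts and keeps growing at that rate afterwards.
    """
    years_between = year_end - year_start
    # Annual growth factor implied by the two census counts.
    annual_growth = (pop_end / pop_start) ** (1 / years_between)
    # Carry the last census count forward to the target year.
    return pop_end * annual_growth ** (target_year - year_end)


# Hypothetical figures for one constituency (not real census data).
estimate_2007 = extrapolate_population(200_000, 1991, 250_000, 2001, 2007)
print(round(estimate_2007))
```

Even a rough estimate like this would bring the 2001 figures closer to the population in 2007, the year the notification was issued.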
The Central Bureau of Health Intelligence has great datasets dating back to 2001. Half are in HTML while the other half are in PDF. There are also broken links. The list goes on. The complete list of Government of India websites can be accessed from here.
I hope that, as with the UID, someone also centralizes information dissemination. It would help citizens know where their elected representatives and the nation as a whole stand. As they say, Information is Power.