It’s not just about machine readable

One of the approaches I’ve been using in trying to analyse the 50 or so different cases of open data use I’ve collected in this study is to divide up the different processes of data use.

I started with a simple schematic and, building on the theoretical reading I’ve been doing around data and information, looked at different cases that took data and tried to give it context, creating information. A distinction between fixed and interactive representations of data quickly became apparent. Some uses of data present it in just one way with a fixed context, or present just one extract from a dataset. Other uses try to provide a way for others to navigate the data and explore it in the contexts that matter to the viewer, such as when you search by postcode to see only data about things local to you. I’ve called this a distinction between data->information and data->interface processes of using data.

Many of the cases I looked at, instead of providing interfaces that represented the data, simply provided new ways of accessing the data: through an API, or combined with other datasets. I’ve called these data->data processes.

A few of the cases do not show the data to end-users at all, but use it behind the scenes to provide people with a service: for example, using administrative geography data to route reports of faults to the right authorities. I’ve called these data->service uses of data, although my sample doesn’t include enough of them to explore them in depth.
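To make the data->interface idea a little more concrete, here is a minimal Python sketch of a postcode search. The column names, the sample rows and the `find_local` function are all invented for illustration; they are not taken from any of the cases in the study.

```python
import csv
import io

# Hypothetical sample data; columns and values are invented for illustration.
SAMPLE_CSV = """postcode,service,name
OX1 1AA,school,Example Primary School
OX1 2BB,library,Example Central Library
SW1A 1AA,school,Another Example School
"""


def normalise(postcode):
    """Upper-case a postcode and strip spaces so 'ox1 1aa' matches 'OX1 1AA'."""
    return postcode.replace(" ", "").upper()


def find_local(rows, postcode_prefix):
    """Return the rows whose postcode starts with the given prefix."""
    prefix = normalise(postcode_prefix)
    return [row for row in rows if normalise(row["postcode"]).startswith(prefix)]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
local = find_local(rows, "ox1")
for row in local:
    print(row["name"], "-", row["service"])
```

The same rows could sit behind a web form or a map; the point is only that the viewer, not the publisher, chooses the context in which the data is shown.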

However, the set of data uses I was most surprised to find, given that the discourse tends to focus on ‘data for developers’, were what I’ve called data->fact uses: cases where people have downloaded a dataset in whatever form it was, and rooted through it until they found a particular fact they were after. Perhaps it’s a fact they’ve been asking their local authority to give them for years about local school provision, and that they can now find in a nationally released dataset. Perhaps it’s a fact that will help them in writing a funding application for their local charity, enabling them to give real local statistics and set better outcome measures for their project. Perhaps it’s just something they were curious about.
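A data->fact use needs no application at all. As a rough sketch, someone comfortable with a few lines of Python (or, just as well, a spreadsheet filter) might pull a single figure out of a downloaded file like this. The dataset, its column names and the numbers are made up for illustration, and `lookup_fact` is a hypothetical helper.

```python
import csv
import io

# Invented stand-in for a nationally released dataset; none of these figures are real.
NATIONAL_CSV = """local_authority,primary_school_places,pupils_on_roll
Exampleshire,12000,11500
Sampleton,8000,8200
Testford,5000,4100
"""


def lookup_fact(csv_text, authority, column):
    """Scan the file row by row and return one value for one local authority."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["local_authority"] == authority:
            return int(row[column])
    return None  # the authority is not in the dataset


places = lookup_fact(NATIONAL_CSV, "Sampleton", "primary_school_places")
pupils = lookup_fact(NATIONAL_CSV, "Sampleton", "pupils_on_roll")
print("Spare places in Sampleton:", places - pupils)
```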

What is interesting about data->fact uses is that they can exist at the very long tail of data use. They may be the bits of data that a developer drops from an application because they are only of very niche interest. Or they may be the bits of data that no-one will ever build an application around. This means that an implicit or explicit focus solely on machine-readable data misses out a vast range of use cases for open government data. Human-readable data can be just as important.

The first three of the five-stars of linked data offer a good model for thinking about the release of open data:

★ make your stuff available on the web (whatever format)
★★ make it available as structured data (e.g. excel instead of image scan of a table)
★★★ make it non-proprietary format (e.g. csv instead of excel)

But far too often these get seen as just steps to be leapt over on the way to a machine-readable web of data, rather than as valuable parts of the release of data in and of themselves. They are incredibly useful to many citizens who aren’t app builders, but who do want to know what their government is doing, and who want to be empowered in their interactions with government rather than operating in circumstances of informational inequality.

A half-star?

Reflecting on these three stars, and on the first of the draft Public Data Transparency Principles, which states that “Public data policy and practice will be clearly driven by the public and businesses who want and use the data, including what data is released when and in what form”, suggests there may be a valuable 0.5 star before these ones even get started:

★/2 – Publish and keep updated a list of the data you have even if it’s not open yet – and provide a clear way for people to get in touch to talk about opening up datasets.

As many respondents to the Public Data Transparency Principles have noted, not all data will be opened overnight. But making sure that citizens searching for facts, as well as developers creating apps, can drive data release could be an important part of the onward process.
