Data.gov.uk provides listings of non-personal government-held data, contributed by government departments or available through the Office of Public Sector Information Data Unlocking Service. Whilst often described as ‘raw data’, many catalogue records currently refer to derived data and summary statistics, or link to query interfaces providing interactive access to sub-sets of data, rather than to bulk dataset downloads.
Providing open government data is a matter of accessibility, format and license. Data must be available (generally online), in forms, and under licenses, allowing for re-use (i.e. non-proprietary formats; open license). David Eaves expresses a version of these ideas in ‘The Three Laws of Open Government Data’:
1) If it can’t be spidered or indexed, it doesn’t exist
2) If it isn’t available in open and machine-readable format, it can’t engage
3) If a legal framework doesn’t allow it to be repurposed, it doesn’t empower
The much-cited Resource.org OGD principles (Malamud et al. 2007) emphasize the need for complete and primary data, “with the highest possible level of granularity, not in aggregate or modified forms”. The Open Knowledge Definition (OKD) defines knowledge as “…open if you are free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and share-alike” (OKF 2006). For datasets, which are not subject to standard copyright protection under UK law, but which may be covered by other intellectual property rights, permission for re-use needs to be explicitly stated (Hatcher & Waelde 2007). Many open data advocates oppose ‘non-commercial’ license terms, and any other terms imposing requirements on downstream users of data, such as restrictions on production of derivative works (Murray-Rust et al. 2009). Data.gov.uk lists datasets available under Crown Copyright, which is not OKD-compliant, and under ‘Crown Copyright with data.gov.uk rights’, which explicitly permits use, re-use and sub-licensing of data. At the time of writing, Data.gov.uk hosts 3396 datasets, 1773 of which are recorded as OKD-compliant.
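The OKD test quoted above can be read as a simple rule: a license must grant use, reuse and redistribution, and may impose at most attribution and share-alike conditions. The following Python fragment is a minimal sketch of that reading, not an official compliance check. The `License` class, its field names and the two example licenses are hypothetical illustrations and do not encode the actual terms of any Crown Copyright license. The final lines compute the share of catalogue records recorded as OKD-compliant from the two figures given above.

```python
from dataclasses import dataclass

# Conditions the OKD tolerates, following the definition quoted in the text.
ALLOWED_CONDITIONS = frozenset({"attribution", "share-alike"})


@dataclass(frozen=True)
class License:
    """Hypothetical, simplified description of a license's terms."""
    name: str
    permits_use: bool
    permits_reuse: bool
    permits_redistribution: bool
    conditions: frozenset = frozenset()


def is_okd_compliant(license_terms):
    """True if all three freedoms are granted and no extra conditions apply."""
    grants_all = (
        license_terms.permits_use
        and license_terms.permits_reuse
        and license_terms.permits_redistribution
    )
    # Any condition outside attribution/share-alike (e.g. non-commercial) fails.
    return grants_all and license_terms.conditions <= ALLOWED_CONDITIONS


# Illustrative licenses only; neither corresponds to a real license text.
attribution_only = License(
    "hypothetical attribution license", True, True, True,
    frozenset({"attribution"}),
)
non_commercial = License(
    "hypothetical non-commercial license", True, True, True,
    frozenset({"attribution", "non-commercial"}),
)

print(is_okd_compliant(attribution_only))  # True
print(is_okd_compliant(non_commercial))    # False

# Share of data.gov.uk records recorded as OKD-compliant (figures from the text).
share = 1773 / 3396
print(f"{share:.1%}")  # 52.2%
```

On these figures, just over half of the catalogue records are recorded as OKD-compliant.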
Particular efforts have been undertaken with data.gov.uk to encourage use of linked-data conventions for publishing data (Alani et al. 2007; Berners-Lee 2009; Tennison & Sheridan 2010; Ding et al. 2010), whereby data is represented using flexible RDF schemas, and datasets are linked to create a ‘web of linked-data’. The skills, experience-base and tool-chains for working with linked-data are still at relatively early stages of development (Tennison & Sheridan 2010; Pellegrini 2009), but advocates of linked-data approaches believe it has the “potential to enable a revolution in how data is accessed and utilized” (Bizer et al. 2009). It should be noted, however, that few data.gov.uk datasets are, as yet, published as linked data.
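The linking idea can be illustrated with a small sketch. RDF represents data as subject-predicate-object triples, with URIs identifying things; two datasets become linked when one refers to a URI that the other describes. The Python fragment below models triples as plain tuples rather than using an RDF library. All URIs, vocabulary terms and values are invented for illustration (under `example.org`) and do not correspond to any data.gov.uk dataset.

```python
# Hypothetical vocabulary and resource URIs, for illustration only.
VOCAB = "http://example.org/vocab/"
SCHOOL = "http://example.org/school/1"
DISTRICT = "http://example.org/district/A"

# Dataset 1: a school record that points at a district by URI.
schools = {
    (SCHOOL, VOCAB + "name", "Example Primary School"),
    (SCHOOL, VOCAB + "inDistrict", DISTRICT),
}

# Dataset 2: published separately, describing the same district URI.
districts = {
    (DISTRICT, VOCAB + "population", 120000),
}


def objects(graph, subject, predicate):
    """Return every object of triples matching the subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


# Because both datasets use the same URI for the district, merging them
# is a set union, and a query can follow the link across datasets.
web = schools | districts
district_uri = objects(web, SCHOOL, VOCAB + "inDistrict")[0]
population = objects(web, district_uri, VOCAB + "population")[0]
print(population)  # 120000
```

The shared URI does the work here: neither dataset needs to know the other's schema in advance, which is the flexibility that linked-data advocates emphasize.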
The European Commission’s PSI Programme has developed methodologies for tracking economic value of PSI. In the UK, Pollock (2009) and Newbury et al. (2008) have estimated the economic value to be realized by releasing data from trading funds. The Web Science community are tracking the successes and challenges of ‘bootstrapping the semantic web’ (Alani et al. 2007) from government data. However, as Hogge notes in her recent review of global potentials for OGD programmes, although OGD release has seen dramatic progress, further research is needed into “the social impact of … data catalogues like data.gov and data.gov.uk” (Hogge 2010, p.42). Understanding the different ways in which OGD is being used is essential groundwork for such an evaluation. The following section sets out a theoretical background to data use, exploring relationships between data and information.