GIS Watch 2012 article: Who is doing what when it comes to technology for transparency, accountability and anti-corruption

The 2012 issue of Global Information Society Watch was focussed on ‘The Internet and Corruption’, exploring how online technologies are being used in the fight against corruption across the world. GIS Watch is focussed on country level analysis of information society issues around the world, but also includes a number of wider articles. I was asked to put together the ‘institutional review’ on transparency and accountability work. You can find the full GIS Watch here, or read the institutional review article below (licensed under Creative Commons Attribution 3.0 license, please attribute to GIS Watch).

Who is doing what when it comes to technology for transparency, accountability and anti-corruption

Fighting corruption is a responsibility that all global institutions, funders and NGOs have to take seriously. Institutions are engaged with the fight against corruption on a number of fronts. Firstly, for those institutions such as the World Bank that distribute funds or loans there is a responsibility to address potential corruption in their own project portfolios through accessible and well-equipped review and inspection mechanisms. Secondly, institutions with a regulatory role need to ensure that the markets they regulate are free of corruption, and that regulations minimise the potential space for corrupt activity. And thirdly, recognising the potential of corruption to undermine development, institutions may choose to actively support local, national and international anti-corruption activities and initiatives. This report provides a critical survey of some of the areas where multilateral, intergovernmental, multi-stakeholder institutions, NGOs and community groups have engaged with the internet as a tool for driving transparency and accountability.

Although transparency is an often-cited element of the anti-corruption toolbox, technology-enabled transparency remains a relatively small part of the mainstream discourse around anti-corruption efforts in formal international institutional processes. The UN Convention Against Corruption (UNCAC),[1] adopted in 2003 and ratified by 160 countries, and the OECD Anti-Bribery Convention,[2] adopted in 1997, provide a backbone of international co-operation against corruption. Both focus heavily on legal harmonisation, improved law enforcement, criminalisation of cross-border bribery and better mechanisms for asset recovery, addressing many of the pre-conditions for being able to act on corruption when it is identified. Continued co-operation on UNCAC takes place through the UN Office on Drugs and Crime,[3] with input and advocacy from the UNCAC Civil Society Coalition. In international development, the outcomes document of the Fourth High Level Forum on Aid Effectiveness,[4] held in Korea in November 2011, notes these foundations and highlights “fiscal transparency” as a key element of the fight against corruption. However, the document more often discusses transparency as part of the aid-effectiveness agenda, rather than as part of anti-corruption. This illustrates an important point: transparency is just one element of the fight against corruption, and reduced corruption is just one of the outcomes that might be sought from transparency projects. Transparency might also be used as a tool to get policy making better aligned with the demands of citizens, or to support co-operation between different agencies.

The review below starts by looking at technology for transparency in this broader context, before briefly assessing how far efforts are contributing towards anti-corruption goals.

Transparency and open data

The last three years have seen significant interest in online open data initiatives as a tool for transparency, with over 100 now existing worldwide. Open data can be defined as the online publication of datasets in machine-readable, standardised formats that can be re-used without intellectual property or other legal restrictions.[5] A core justification put forward for opening up government or institutional data is that it leads to increased transparency as new data is being made available, and existing data on governments, institutions and companies becomes easier to search, visualise and explore.

Following high-profile Open Government Data (OGD) initiatives in the US (data.gov) and UK (data.gov.uk), in April 2010 the World Bank launched its own open data portal (data.worldbank.org), providing open access to hundreds of statistical indicators. Here is how the World Bank describes the data portal’s mission:

The World Bank recognizes that transparency and accountability are essential to the development process and central to achieving the Bank’s mission to alleviate poverty. The Bank’s commitment to openness is also driven by a desire to foster public ownership, partnership and participation in development from a wide range of stakeholders.[6]

The World Bank has also sponsored the development of Open Government Data initiatives in Kenya (opendata.go.ke) and Moldova (data.gov.md), as well as funding policy research and outreach to promote open data through the Open Development Technology Alliance (ODTA).[7]

Central to many narratives about open data is the idea that it can provide a platform on which a wide range of intermediaries can build tools and interfaces that take information closer to people who can use it. The focus is often on web and mobile application developers as the intermediaries. Many of the applications that have been built on open data are convenience tools, providing access to public transport times or weather information, but others have a transparency focus. For example, some apps visualise financial or political information from a government, seeking to give citizens the information they need to hold the state to account.

Apps alone may not be enough for transparency though. In an early case study of the Kenya open data initiative, Rahemtulla et al., writing for the ODTA, note that “the release of public sector information to promote transparency represents only the first step to a more informed citizenry…”, and that initiatives should also address digital inclusion and information literacy. This involves ensuring ICT access, and the presence of an ‘info-structure’ of intermediaries who can take data and turn it into useful information that actively supports transparency and accountability.[8] World Bank investments in Kenya linked to the open data project go some way to addressing this, seeking to stimulate and develop the skills of both journalists and technology developers to access and work with open data. However, much of the focus here is on e-government efficiency, or stimulating economic growth through creation of commercial apps with open data, rather than on transparency and accountability goals.

Open data was also a common theme in the first plenary meeting of the Open Government Partnership (OGP)[9] in Brazil in April 2012. The OGP is a new multilateral initiative run by a joint steering committee of governments and civil society. Launched in 2011 by eight governments, it now has over 55 member states. Members commit to create concrete National Action Plans that will “promote transparency, empower citizens, fight corruption, and harness new technologies to strengthen governance”.[10] The OGP has the potential to play an influential role over the next few years in networking civil society technology-for-transparency groups with each other, and with governments, and placing the internet at the centre of the open government debate.

The rapid move of open data from the fringes of policy into the mainstream for many institutions has undoubtedly been influenced by the activities of a number of emerging online networks and organisations. The Open Knowledge Foundation (OKF)[11] has played a particularly notable role through their e-mail lists, working groups and conferences in connecting up different groups pushing for access to open data. OKF was founded in 2004 as a community-based non-profit organisation in the UK and now has 15 chapters across the world. OKF explain that they ‘build tools, projects and communities’ that support anyone to “create, use and share open knowledge”.[12] The OKF paid staff and volunteer team are behind the CKAN software used to power many open data portals, and the OpenSpending.org platform that has the ambition to “track every government financial transaction across the world and present it in useful and engaging forms for everyone from a school-child to a data geek”.[13] This sort of ‘infrastructure work’ – building online platforms that bring government data into the open and seek to make it accessible for a wide range of uses – is characteristic of a number of groups, both private firms and civil society, emerging in the open data space.

Another open data actor gaining attention on the global stage has been the small company OpenCorporates.com.[14] OpenCorporates founder Chris Taggart describes their goal as gathering data on every registered company in the world, providing unique identifiers that can be used to tie together information on corporations, from financial reporting to licensing and pollution reports. Although sometimes working with open data from company registrars, much of the OpenCorporates database of over 40 million company records has been created through “screen scraping” data off official government websites. In early 2012 OpenCorporates were invited to the advisory panel of the Financial Stability Board’s[15] Global Legal Entity Identifier (LEI) project, being conducted on behalf of the G20. The LEI project aims to give a unique identifier to all financial institutions and counterparties, supporting better tracking of information and transactions. Importantly the recommendations, which have been accepted by the G20, will operate “according to the principles of open access and the nature of the LEI system as a public good… without limit on use or redistribution”.[16]

International transparency initiatives and standards

A number of sector-specific international transparency initiatives have developed in recent years, with a greater or lesser reliance on the internet within their processes.

Online sharing of data is at the heart of the International Aid Transparency Initiative (IATI),[17] which was launched at the third High Level Forum on Aid Effectiveness in Accra, Ghana in 2008, and now has over 19 international aid donors as signatories. The initiative’s political secretariat is hosted by the UK Department for International Development (DfID),[18] and a technical secretariat, which maintains a data standard for publishing data on aid flows, is hosted by the AidInfo programme.[19] IATI sets out the sorts of information on each of their aid activities that donors should publish, and provides an XML standard for representing this as open data.[20] A catalogue of available data is then maintained at http://www.iatiregistry.org, and a number of tools have been developed to visualise and make this data more accessible. Through IATI, countries and institutions, from the Asian Development Bank (ADB) to the UN Office for Project Services (UNOPS), have made information on their aid spending or management more accessible.
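To give a feel for what publishing aid information as XML open data makes possible, here is a minimal Python sketch that reads a small activity file and totals the transactions for each activity. The XML fragment is made up and heavily simplified: it borrows the general shape of an IATI activity file, but the identifiers, titles and amounts are invented, and real IATI files carry many more elements and differ between versions of the standard. The function name `summarise_activities` is likewise just an illustrative choice.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Made-up, simplified fragment in the general shape of an IATI activity file.
# Identifiers, titles and amounts are invented for illustration only.
SAMPLE_XML = """
<iati-activities>
  <iati-activity>
    <iati-identifier>XX-EXAMPLE-001</iati-identifier>
    <title>Rural water supply</title>
    <transaction><value currency="USD">150000</value></transaction>
    <transaction><value currency="USD">50000</value></transaction>
  </iati-activity>
  <iati-activity>
    <iati-identifier>XX-EXAMPLE-002</iati-identifier>
    <title>Teacher training</title>
    <transaction><value currency="GBP">80000</value></transaction>
  </iati-activity>
</iati-activities>
"""


def summarise_activities(xml_text):
    """Return one summary dict per activity, with transaction totals per currency."""
    root = ET.fromstring(xml_text.strip())
    summaries = []
    for activity in root.findall("iati-activity"):
        totals = defaultdict(float)
        # Sum every transaction value, keeping currencies separate.
        for value in activity.findall("transaction/value"):
            totals[value.get("currency", "unknown")] += float(value.text)
        summaries.append({
            "id": activity.findtext("iati-identifier"),
            "title": activity.findtext("title"),
            "totals": dict(totals),
        })
    return summaries


if __name__ == "__main__":
    for summary in summarise_activities(SAMPLE_XML):
        print(summary["id"], summary["title"], summary["totals"])
```

Because every publisher uses the same element names, the same few lines can in principle be pointed at files from many different donors, which is the practical argument for a shared standard.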

The Open Aid Partnership,[21] working closely with IATI, and hosted by the World Bank Institute, is focusing specifically on geodata standards for aid information, using the ‘Mapping for Results’ methodology developed with AidData[22] to geocode the location of aid projects and make this information available online. Geocoded data is seen as important to “promote ICT-enabled citizen feedback loops for reporting on development assistance”.[23]

Other high-profile sector transparency initiatives, such as the Extractive Industries Transparency Initiative (EITI)[24] and the Construction Sector Transparency Initiative (CoST),[25] are less open data or ICT-centred, opting instead for processes based on disclosure and audit of documents through local multi-stakeholder processes. However, the Global Initiative for Fiscal Transparency (GIFT),[26] which aims to “advance and institutionalise global norms and continuous improvement on fiscal transparency, participation and accountability in countries around the world”, has a ‘Harnessing new technologies working group’ led by the OKF, which has outlined a number of ways technology can be used for transparent and accountable finance.[27] The ‘Lead Steward’ organisations for GIFT are the International Monetary Fund, World Bank Group, Brazil Ministry of Planning, Budget and Management, the Department of Budget and Management Philippines, and the Washington-based CSO project, the International Budget Partnership.[28]

Crowd-sourcing 

Transparency and accountability aren’t just about information and data from governments, companies or multi-lateral institutions. Input from citizens is crucial too. Crowd-sourcing projects such as Ushahidi,[29] first developed to monitor post-election violence in Kenya, have been deployed or replicated in a number of anti-corruption settings. Accepting submissions by SMS or online, these tools allow citizens to report problems with public services that might point to misappropriation of funds, or to directly report cases of corruption. Reports are generally geocoded and the resulting maps are presented publicly online. With UN Development Programme (UNDP)[30] support, a Ushahidi-based corruption monitoring platform was established in Kosovo.[31] In India, the IPaidABribe.com platform, which was launched in 2010 by Bangalore-based non-profit Janaagraha,[32] has collected over 20,000 reports of bribery requests or payments.

UNDP analysis suggests that the success of social media and use of crowd-sourcing in transparency and accountability projects relies upon transparent mechanisms for verifying reports, and the backing of institutions or systems that can convert information into action – such as ensuring corrupt tenders are cancelled.[33] In a global mapping of technology for transparency and accountability, the Transparency & Accountability Initiative[34] (a donor collaboration chaired by DfID and the Open Society Foundations)[35] found that many of the one hundred projects they reviewed were started by technology-savvy activists.[36] Where these were tailored to local context, and able to adopt a collaborative approach, involving governments and/or service providers, they were more likely to be sustainable and successful. Global Voices Online maintain a directory of over 60 case studies as part of their ‘technology for transparency network’.[37]

The internet is also being used actively by global advocacy networks such as the Land Matrix Partnership, who launched an online database of land deals at the World Bank Land and Poverty Conference in April 2012, seeking to highlight the growing issue of large scale land acquisitions across the world, particularly in Africa. This database, initially created through online collaboration of researchers, also accepts submissions through its website at http://landportal.info/landmatrix where reported data can also be visualised and explored.

Further activity and institutions

For reasons of space this report can only make passing mention of initiatives aimed at increasing parliamentary transparency through developing and implementing online tools for tracking legislative process and parliamentary debates. These have been established by civil society networks in a number of countries following models developed by the independent GovTrack in the US,[38] and the charity MySociety[39] with their TheyWorkForYou.com platform in the UK. MySociety, with support from Open Society Foundations and Omidyar Network,[40] have been focusing in 2012 on making their transparency and civic action tools easier to implement in other jurisdictions, opening up the Alaveteli code that powers the public right to information services WhatDoTheyKnow.com and AskTheEU.org, amongst others.

The funding for this work from the Omidyar Network, established by eBay founder Pierre Omidyar, draws attention to another set of important institutions and actors in the tech-for-transparency space: donors from the technology industry. Google, Omidyar Network, Cisco Foundation, and Mozilla Foundation amongst others have all been involved in sponsoring technology for transparency open source projects like Ushahidi, the work of MySociety, or data-journalism projects across the world. It is likely that without access to funding derived from internet industry profits, many of the current technology-for-transparency projects around would be far less advanced. 

This report has also not explored how institutions have responded to online leaking of information as part of transparency and accountability efforts. However, one project deserves a brief mention: the WCITLeaks website,[41] established to accept leaked documents relating to the revision of the International Telecommunication Regulations (ITRs) in response to the secrecy surrounding International Telecommunication Union (ITU) processes, and the lack of a civil society voice at the forthcoming World Conference on International Telecommunications (WCIT).

Exploring impact 

Technology for transparency is a rapidly growing field. The innovations may be emerging from civil society and internet experts (with much of the funding to scale up projects often coming ultimately from internet firms), but governments and international institutions are opting in to open data-based transparency initiatives, and a number of institutions, from the World Bank to the newly formed OGP, are active in spreading the technology for transparency message to their clients and members. However, there is little hard evidence yet of the internet becoming an integrated and core part of the global anti-corruption architecture. Many tools and platforms remain experimental, hosting just tens or hundreds of reported issues, and offering only limited stories of where crowd-sourced SMS reports, or irregularities spotted in open data, have led to corruption being challenged, and offenders being held to account.

McGee and Gaventa, in a review of general transparency and accountability initiatives funded by DfID, explain that the evidence base on their impact is limited across the field.[42] Limited evidence of the anti-corruption impacts of technology for transparency should therefore be taken as a challenge to improve the evidence base and focus on impact, rather than to step back from developing new internet-based approaches for transparency and accountability. Working out the impact of those projects that provide online information infrastructures as foundations for accountability efforts, from general open government data projects to targeted transparency initiatives, will need particular attention if these efforts are to continue to receive institutional backing, and if the new loose-knit networks that provide many of these platforms are to continue to thrive.

 

 

(All links accessed 7th July 2012)


[3] UN Office on Drugs and Crime: http://www.unodc.org/unodc/en/corruption/

[4] Busan 2011 High Level Forum on Aid Effectiveness: http://www.aideffectiveness.org/busanhlf4/

[5] The Open Definition: http://opendefinition.org/

[6] World Bank Open Data Portal: http://data.worldbank.org/about

[7] Open Development Technology Alliance: http://www.opendta.org

[8] Rahemtulla, H., Kaplan, J., Gigler, B.-S., Cluster, S., Kiess, J., & Brigham, C. (2011). Open Data Kenya: Case study of the Underlying Drivers, Principle Objectives and Evolution of one of the first Open Data Initiatives in Africa. http://www.scribd.com/doc/75642393/Open-Data-Kenya-Long-Version

[9] Open Government Partnership: http://www.opengovpartnership.org/

[11] Open Knowledge Foundation: http://okfn.org/

[12] http://www.okfn.org/about/faq

[13] Open Spending (project): http://www.openspending.org

[14] Open Corporates: http://opencorporates.com/

[15] Financial Stability Board: http://www.financialstabilityboard.org/

[17] International Aid Transparency Initiative: http://www.aidtransparency.net

[18] UK Department for International Development: http://www.dfid.gov.uk

[21] Open Aid Partnership: http://www.openaidmap.org/

[24] Extractive Industries Transparency Initiative: http://eiti.org/

[25] Construction Sector Transparency Initiative: http://www.constructiontransparency.org/

[26] Global Initiative for Fiscal Transparency: http://fiscaltransparency.net/

[28] International Budget Partnership: http://internationalbudget.org/

[29] Ushahidi: http://ushahidi.com/about-us

[30] UN Development Programme: http://www.undp.org/

[33] Tsegaye Lemma (2012), Corruption Prevention and ICT: UNDP’s Experience from the field. Presented at Joint Experts Group Meeting and Capacity Development Workshop on Preventing Corruption in Public Administration, UN DESA, New York, USA, 26 – 28 June. http://unpan1.un.org/intradoc/groups/public/documents/un-dpadm/unpan049778.pdf

[34] Transparency and Accountability Initiative: http://www.transparency-initiative.org/

[35] Open Society Foundations: http://www.soros.org/

[36] Avila, R., Feigenblatt, H., Heacock, R., & Heller, N. (2011). Global mapping of technology for transparency and accountability: New technologies. http://www.transparency-initiative.org/reports/global-mapping-of-technology-for-transparency-and-accountability

[37] Technology for Transparency Network: http://transparency.globalvoicesonline.org/

[40] Omidyar Network: http://www.omidyar.com/

[41] WCIT Leaks (project): http://wcitleaks.org/

[42] McGee, R., & Gaventa, J. (2010). Review of the Impact and Effectiveness of Transparency and Accountability Initiatives: Synthesis Report. http://www.dfid.gov.uk/R4D/Output/187208/Default.aspx. See also http://www.dfid.gov.uk/R4D/Search/SearchResults.aspx?ProjectID=60827 for other outputs of the research programme this report is taken from.

Open Data, Land, Gender

[Summary: very rough and speculative notes in response to a land coalition online dialogue]

The Land Coalition are hosting an online dialogue until 20th Feb looking at “using online platforms to increase access to open data and share best practices of monitoring women’s land rights”. It’s an interesting topic for a dialogue, particularly given that one of the most widely cited cases used to highlight potential downsides of open data relates to the digitisation of land records and their exploitation to the detriment of poor landholders. However, as platforms like the LandMatrix (aggregating together land investment reports from research and advocacy groups across the world) and Open Development Cambodia demonstrate, open data is also being used by citizens to monitor land rights issues.
In this post I share a few quick thoughts on the broad theme of open data, land and gender.

Open data and land

The dialogue asks about how online platforms are contributing to the opening of land data. There are three broad sources of data I can see:

Official data – where governments have well managed land ownership databases, then as part of national open government data programmes citizens may be able to secure the ongoing publication of this data in open forms. In the United Kingdom we’ve recently seen the Land Registry place data online, detailing land sale transactions in CSV and linked data; and publicly owned land is a commonly featured dataset on local open data portals in the UK. However, this data itself may be tricky to use directly, and intermediaries are needed to make it accessible. In Kirklees, the Who Owns My Neighbourhood project presents an interesting approach to using official data, and combining it with social features for citizens to input local knowledge and news about publicly owned plots of land: making official land data more ‘social’.

Crowdsourced data – in many cases there may not be an official source for the data activists want, or there may be limited prospect of getting access to the official data. Here a range of ‘crowdsourcing’ approaches exist. The LandMatrix approach uses researchers, and works to verify reports before sharing them. There may be other approaches available that use tools like PyBossa to crowdsource extraction of structured information from semi-structured documents, or to split analysis of records into micro-tasks. The OpenStreetMap platform may also be able to act as a source of data, allowing tags to be applied to land. Tools like CrowdMap (based on the Ushahidi platform) make it possible to collate reports submitted on a range of platforms including phone, and to verify reports, although the challenge with any crowdmap project is recruiting people to submit data.

Inferred data – at one of the RHOK Hack Days I took part in at Southampton I was interested to hear about a group’s project using satellite data to work out crop types on plots of land. I suspect there are ways this data could be used to detect changes in land use that might also indicate changes in ownership – such as the conversion of land from multiple crops to large agribusiness.

Using land data

Having open data on land ownership and land rights is only one part of the story. As the Bhoomi case illustrates, the regulatory framework around the data matters: is a dataset taken as authoritative, or are documents or other customary practices able to override the descriptions held in data? Does the data model through which land ownership and rights are described capture the subtlety and nuance of land use practices (see Srinivasan’s field note for a discussion of the need to mash-up multiple schemas of data to get a view of complex land practices)? And what intermediaries are active to help citizens mobilise land records to secure their rights, rather than those records being only truly accessible to private actors with technical and financial capital?

In the ongoing Land Coalition dialogue I’m interested to learn more about the cases of how data on land rights is being mobilised to create change: whether at the level of global advocacy, where big numbers may matter most; or at the level of individual struggles over ownership, access and rights, where detailed, accurate and timely data on particular plots is likely to be most important.

Open data and women’s land rights

I will admit to knowing very little about the specific issues around women’s land rights. However, in making the connection between open data and women’s land rights I did want to briefly explore whether a focus on digital platforms and open data introduces any particular gender issues. For example, whilst statistics on mobile phone penetration in developing countries suggest widespread access to mobile devices, there is a significant gender gap in mobile ownership and access, with women much less likely to have control of a handset than men. Gender issues may also arise in relation to the culture and practices around open data.

In a recent First Monday article, Joseph Reagle suggests that the ‘free culture’ movement associated with open source software and open knowledge products like Wikipedia possesses a gender gap that is potentially even greater than that of the very gender-unequal general computing culture from which it arose. Reagle argues that the ideas of ‘openness’ current in these communities can be used to dismiss concerns about gender gaps, and paint them as an issue of choice, rather than highlighting the wider structural factors that lead to the massive underrepresentation of women in online free software and open knowledge construction. For example, Reagle points to the “double shift” of women’s time, and the ways in which the ‘free time’ used to contribute to creation of open culture, whether through evenings away from work, or hack-days and other events, is unequally distributed between women and men.

Does this critique carry across to open data? It is apparent that the open data field is far from gender equal – at least in terms of advocates for open data, and the creators of tools, platforms and analysis built upon data – although whether it is male dominated to the extent that other fields such as open source contribution are is yet to be measured. In part any gender imbalance may be attributed to the connections between the open data community and the open source and free culture communities, which already have a significant gender imbalance. However, we should also be open to deeper issues of epistemology: whether the very notion of resolving questions of ownership or fact through datasets, rather than through processes of dialogue, is itself gendered. How far advocacy to open up datasets moves into advocacy for the primacy of data over other ways of knowing, and how data is used and interpreted, has a bearing on whether gendered systems of power are being reinforced or challenged.

An ongoing discussion…

The above remarks are just some first thoughts on the topic. The Land Portal dialogue is running for another week, and I’m looking forward to spending time looking at what others are saying to better understand how open data and land can connect in constructive and positive ways.

I hope we might also develop some lines of the gender discussion more in upcoming work of the Open Data in Developing Countries project.

Notes on open government data evaluation and assessment frameworks

The evaluation of open data initiatives has become an increasingly pressing concern for many. As open data initiatives have proliferated, there have been a number of attempts to develop assessment, monitoring and measurement frameworks that can inform policy, and that will support comparative assessment of different open data efforts, or that can guide the creation of new initiatives. In this post I look at a number of the frameworks that have been put forward, or are currently in development. This post is part of my thinking aloud in planning for some common research tools in the Exploring the Emerging Impacts of Open Data in Developing Countries project, and in putting together a methods section for my PhD.

My working notes for this post, with a short summary of each of the frameworks described can be found here.

What is being measured?

The frameworks I explored fall into three broad categories:

  • Readiness assessments – looking at whether the conditions exist for an open data initiative to be started or successful. This category includes the Web Foundation Open Government Data Feasibility Studies and World Bank Open Data Readiness Assessment.
  • Evaluating implementation – looking at whether existing initiatives, or organisations, meet some criteria for ‘good’ open data implementation. This was the largest group, including the Five Stars of Linked Open Data (Berners-Lee, 2010); The Open Data Census [LINK]; The Open Data Index (Farhan, D’Agostino, & Worthington, 2012); mOGD-I; MELODA (Garcia, 2011); The State of Open Data method (Braunschweig, Eberius, Thiele, & Lehner, 2012); the assessment of open budgetary data in Brazil (Craveiro, Santana, & Alburquerque, 2013); Grading Government’s Open Data Publication Practices (Harper, 2012); and the Data Openness Index and Government Data Openness Index (Murillo, 2012).
  • Impact assessment – none of the frameworks I looked at explicitly address impact (though there are a number of studies that have developed methods to try and quantify economic impacts of open data (Vickery, 2011)), but a few frameworks in development do seek to make connections between implementation and different kinds of potential open data impacts (Jetzek, Avital & Bjorn-andersen, 2012; Huber, 2012).

The frameworks I explored operate at a number of different levels. Readiness assessments tend to operate at the country level, although the World Bank suggest their Open Data Readiness Assessment can also be applied at sub-national levels.

Implementation assessments may target a variety of:

  • Individual datasets
  • Open data portals
  • Individual institutions
  • Open data initiatives
  • Whole countries

A number of frameworks generate aggregate assessments of initiatives, portals or institutions based on aggregating up numerical scores for the ‘openness’ of datasets belonging to that parent entity. For example, MELODA, and a recent implementation of the Five Stars of Open Data on Data.gov.uk assign scores to institutions based on an average of the scores assigned to their individually published datasets.
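As a rough illustration of this kind of roll-up, the sketch below averages per-dataset star ratings into a score for each publishing institution. The department names and ratings are made up for illustration, and neither MELODA nor Data.gov.uk necessarily computes its scores in exactly this way.

```python
from collections import defaultdict
from statistics import mean


def institution_scores(dataset_ratings):
    """Average per-dataset ratings (e.g. 0-5 stars) for each institution.

    dataset_ratings: iterable of (institution, rating) pairs.
    Returns a dict mapping institution name to its mean rating.
    """
    grouped = defaultdict(list)
    for institution, rating in dataset_ratings:
        grouped[institution].append(rating)
    return {name: mean(ratings) for name, ratings in grouped.items()}


# Hypothetical ratings, for illustration only.
ratings = [
    ("Dept A", 3), ("Dept A", 5), ("Dept A", 1),
    ("Dept B", 2), ("Dept B", 2),
]
scores = institution_scores(ratings)
```

An average like this hides the spread of the underlying scores: an institution with one five-star and one one-star dataset ends up with the same score as one with two three-star datasets.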

How does measurement take place?

There are a number of non-mutually exclusive approaches to measurement, including:

  • Survey of technical features – identifying a list of features that datasets or data portals should possess, and carrying out an automated, or manual, survey of whether these features are in place. These approaches are generally agnostic as to the subject of the data, but are interested in whether datasets are machine readable, openly licensed and well catalogued. Examples include Braunschweig et al. (2012), Garcia (2011) and the 5 Stars of Linked Open Data.
  • Specific dataset checklist – these approaches determine a short list of particularly important datasets and ask about whether these are available, and then conduct a technical assessment of these particular datasets. The Open Data Index, and Open Data Census both adopt this approach.
  • Domain specific assessments – Harper’s grading of US departments’ dataset publication practices identifies ideal features of specific datasets, and evaluates them against these (Harper, 2012). For example, where a standard exists for representation of a particular kind of data, it would judge a department higher where it adopts this standard.
  • Added value features – The Open Data Index, and the proposed mOGD-I model include questions on whether applications have been built on top of data, or whether there are accompanying tools around datasets. The readiness assessments also consider the capacity of states to support and stimulate activities that might increase uptake and use of open data.
  • Features of the environment – the readiness assessments major on this, describing social, technical, legal, political, economic and organisational contexts for open data.
  • Expert surveys – most assessment frameworks draw to a degree on survey methods, even though some attempt to automate elements. In most cases a single informant is used.

Some frameworks look to generate a single number that can be used to rank the subject of analysis, as in the case of the Open Data Index, MELODA, or the Data.gov.uk implementation of the 5-stars of open data model. Other frameworks present a multi-dimensional assessment of their subject, either omitting aggregation altogether, or providing aggregation along a number of dimensions such as legal, organisational, technical etc.

What does all this mean for the ODDC project?

In the Exploring the Emerging Impacts of Open Government Data in Developing Countries research project there are a number of things we want to try and understand.

  1. How does the context that an open data initiative operates within affect the use of data in governance processes?
  2. How do the technical features of an open data initiative affect the use of data in governance processes?

The first question draws upon the sort of data that might feature in a readiness assessment. The second draws upon the sort of data gathered in an implementation assessment. Like Huber (2012), and Jetzek et al. (2012) we are hypothesising that the way an open data initiative is implemented may be slanted towards particular kinds of data re-use and thus impacts. By trying to connect context, implementation and impacts, we will be looking to both draw upon, and inform the further development of, evaluation frameworks.

Within the project we need to be able to perform evaluation at two levels:

  • The macro level – as we build upon learning from the Web Index to refine methods of generating country-level indicators that can inform an assessment of the extent to which a country has capacity to benefit from open data, and the extent to which this is being realised.
  • The case level – as the individual qualitative cases in developing countries generate comparable descriptions of how open data has been used.

The development of the macro level framework will be an ongoing task over the next year, but with the individual cases kicking off very soon, there is some immediate work to be done to develop two resources: a simple contextual questionnaire for describing the environment in a country or city; and a dataset assessment tool that can be applied at the level of individual datasets, collections of datasets, or intermediary platforms.

Hopefully a further iteration of working through the frameworks listed in this post will inform the development of these. As I get started on this task I would welcome pointers to any resources I have missed.

References

Berners-Lee, T. (2010, July). Linked Data – Design Issues. Retrieved from http://www.w3.org/DesignIssues/LinkedData.html

Braunschweig, K., Eberius, J., Thiele, M., & Lehner, W. (2012). The State of Open Data: Limits of Current Open Data Platforms. WWW2012. Retrieved from http://www2012.wwwconference.org/proceedings/nocompanion/wwwwebsci2012_braunschweig.pdf

Craveiro, G. da S., Santana, M. T. De, & Alburquerque, J. P. de. (2013). Assessing Open Government Budgetary Data in Brazil. ICDS 2013.

Farhan, H., D’Agostino, D., & Worthington, H. (2012). Web Index 2012. Retrieved from http://thewebindex.org/2012/09/2012-Web-Index-Key-Findings.pdf

Garcia, A. A. (2011). Methodology for Releasing Free Data (MELODA) (pp. 1–15). Retrieved from http://meloda.org/index.php/meloda/category/1-meloda

Harper, J. (2012). Grading the Government’s Data Publication Practices.

Huber, S. (2012). The fitness of OGD for the creation of public value. In P. Parycek, N. Edelmann, & M. Sachs (Eds.), CeDEM12 – Proceeding of the Conference for E-Democracy and Open Government. CeDEM.

Jetzek, T., Avital, M., & Bjorn-andersen, N. (2012). The Value of Open Government Data: A Strategic Analysis Framework. Orlando. Retrieved from http://openarchive.cbs.dk/handle/10398/8621

Murillo, M. J. (2012). Including all audiences in the government loop: From transparency to empowerment through open government data.

Vickery, G. (2011). Review of Recent Studies on PSI re-use and related market developments. Paris.

 

Exploring incentives for transparency in developing countries

[Summary: brief reflections on the dynamics of transparency in developing countries]

Doug Hadden of FreeBalance (developers of Public Financial Management software) has posed the question “What are the Incentives for Transparency in Developing Country Governments?”. Doug notes that many of their developing-country customers have been interested in implementing transparency portals such as Transparency.gov.tl, and that transparency has been a major topic of conversation at their annual user group meeting.

My initial draft of a comment became rather long, so here are a few reflections in reply to that question by way of a blog post.

Framing the question

First, we need to identify whether a distinction between developed and developing countries has particular relevance to this question. There are three main areas where the distinction could be drawn: degree of political freedoms and democracy; levels of corruption; and state capacity and effectiveness. Malesky et al. comment on the fact that we might expect the dynamics of transparency initiatives to be different in more authoritarian regimes. We might anticipate both that authoritarian governments have less incentive to pursue transparency, and that if transparency is pursued, it is less likely to be effective in changing policy and implementation outcomes, further undermining the case for its adoption. A similar incentive issue may exist for regimes with high levels of corruption. If political elites are seen to be corrupt, then it may be surprising to see those elites adopt and pursue transparency policies. Lastly, on the question of state effectiveness, it might be argued that it is surprising that a democratic state with limited capacity adopts transparency as a policy instrument over other available public sector reforms. In his chapter in Corruption and Democracy in Brazil Bruno Speck discusses the importance of empowered audit and oversight institutions to ensuring effective use of public finance. Transparency may be a means by which actors outside the state can put things on the agenda of empowered institutions, but without effective state mechanisms to enforce compliance with laws once problems are identified, it may look to be a flawed policy tool. All these distinctions (levels of freedom, corruption and state capacity) might have some degree of correlation with the development status of a country, though the line is not clear cut.
Alexandru Grigorescu’s paper on international organisations and government transparency points to one further distinction worth noting: the higher levels of involvement of international organisations in developing countries.

Secondly, we need to identify what sort of transparency we are talking about. David Heald suggests we need to distinguish four directions of transparency: upwards (hierarchical relationships; when the superior can see the actions of the subordinate), downwards (when the ruled can see the behaviour/results of their rulers; agencies can see behaviour up the management chain); outwards (when agents inside an organisation can see what is happening outside it); and inwards (when those outside can observe what is happening inside the organisation). Using these categories we can interrogate how a particular transparency initiative is functioning. For example, a transparency portal may be giving inwards and upwards transparency to government, but it may not only be giving new insights to citizens, it may also be allowing agencies who previously struggled to get hold of information due to bureaucratic blocks in mid-level agencies or departments, to more effectively access information they need to do their jobs. It is also important to answer the question ‘transparency of what?’. Transparency of outdated information, or information with little political salience is dramatically different from releasing up-to-date information on the most recent public spending, such as occurs through Brazil’s transparency portal.

With these distinctions in mind, what might some of the incentives for transparency be? All the following are hypotheses only, and more work would be needed to track down data to explore them further, or studies that might look at these effects in more depth.

1) The figleaf
Starting with a sceptical suggestion: publishing low-salience information with a large fanfare can be a good way to gain attention and initial credibility without actually facing high political costs. Similarly, in regimes with low state effectiveness, where corrupt activity isn’t captured in the data, or where there are no balancing audit and reconciliation mechanisms such as exist in the Extractive Industries Transparency Initiative, the potential credibility gain from developing a transparency initiative outweighs the potential risks. With growing international focus on transparency initiatives, the reputational pay-off from adopting an initiative may be high right now, and may allow other more substantive reforms to be sidelined.

2) International and external pressure
Less sceptically, we might see transparency initiative adoption as a genuine measure by governments, but primarily taking place in response to international pressure or funding. This might be from international agencies, as donors fund and require transparency and governance reforms. Aid Transparency portals in particular may come down to pressure from donors to have accountability on how funds are being spent. Or it might be from business, and markets, as assessments of doing business in a country are affected by the degree of transparency.

3) Bottom up citizen and political pressure
Citizens may be demanding transparency. Certainly in the global development of Right to Information legislation, bottom up citizen pressure has played a significant role. Where democratic mechanisms are operating, then citizen pressure can provide incentives for greater transparency. Similarly, as Francis Maude often states, political parties in opposition are often advocates of transparency.

4) Improving information flow
Effective states need to process a lot of information, and transfer it between many different organisations and agencies. Doing this inside the state, in access-controlled ways, through person-to-person relationships can be complex and costly, and involve lots of interoperable IT systems. By contrast, with open data, you place data online in a standard format, and then anyone who needs it can come and take a copy (or so the theory goes; note that feedback loops from the previous person-to-person relationships fall out of the picture here). Publishing data transparently can get around bottlenecks in information exchange. This may be particularly important when public services are being delivered by lots of non-state actors who could not be brought inside government systems in any case.

This is certainly part of the idea behind the International Aid Transparency Initiative, which seeks to ensure aid receiving governments and agencies can get a view of available resources without having to spend considerable labour requesting and reconciling information from many different sources. Here, the goal is efficiency through outwards and horizontal transparency, and other forms of upwards transparency and visibility of data to citizens may be a by-product.

5) Addressing principal-agent problems
Principal-agent problems concern the challenges a principal (e.g. the government) faces in motivating an agent (e.g. a contractor) to act in the interests of the principal, rather than in the agent’s self-interest. There are all sorts of principal-agent problems at work in government. For example, the citizen as principal, trying to get government as agent to act in their interests; central government as principal, trying to get an implementing agency to act in their interest; or donor as principal, trying to get a government to act in their interest. Transparency can play a role in all of these, though the form the transparency may take can vary.

Governments are not monolithic. Corruption benefits certain actors in government, and not others. Transparency can be a policy that one area of government uses to secure the behaviour of another, through allowing parties outside of government to provide the scrutiny or political pressure needed to address an issue. The nature of transparency mandates is interesting to explore here. Transparency in one area of government can also empower another. For example, both the UK and China have sought to increase the transparency of local government. This may increase citizen oversight of government, but it can also increase upwards transparency of the periphery to the centre, strengthening central government capacity.

Exploring further
This post has taken a fairly general view of some of the dynamics that might be in play in a decision to adopt a transparency initiative. There are undoubtedly other significant dynamics I’ve missed. And going with my own point on distinguishing both the type and subject of data being made more transparent, any more detailed account is likely to need to be about transparency in particular domains rather than in general.

Of course, looking back I suspect I may have misread Doug’s question, which could have been asking more for arguments that can be used to convince governments to adopt transparency, rather than an analytical look. However, I hope some persuasive arguments in favour of transparency can also be distilled from the above.

References
Grigorescu, A. (2003). International Organizations and Government Transparency: Linking the International and Domestic Realms. International Studies Quarterly, 47, 643–667.

Malesky, E., Schuler, P., & Tran, A. (2012). The Adverse Effects of Sunshine: A Field Experiment on Legislative Transparency in an Authoritarian Assembly. American Political Science Review, 106(4). doi:10.1017/S0003055412000408

Power, T. J., & Taylor, M. M. (2011). Corruption and Democracy in Brazil: The struggle for accountability. University of Notre Dame.

Of nonsensical numbers: openness score

[Summary: A brief critique of the ‘openness score’]

A recent Cabinet Office press release, picked up by Kable Government Computing, states that: “The average openness score for all departments is 52%”.

What’s an openness score I hear you ask? Well, apparently it’s “based on the percentage of the datasets published by each department and its arms-length bodies that achieve three stars and above against the Five Star Rating for Open Data set out in the Open Data White Paper”. That is, it’s calculated by an algorithm that looks over all the datasets published by a department on Data.gov.uk and checks to see what format the files linked to are in.

Which seems to display both a category mistake on the part of the Cabinet Office, and a rather worrying lack of statistical literacy and awareness of how such a number might be gamed.

On the category mistake: the openness score appears to equate openness with file format – but ‘openness’ in general is not equivalent to the use of an ‘open’ file format. Firstly, even when using a machine readable format data can be non-open and non-machine readable depending on how it is formatted: a garbled CSV is to all intents and purposes less accessible and open than a well formatted Excel file. Secondly, openness is not just a technical concept, and is not just about data (I’ve commented on that in more detail here). To take the number of well-formatted datasets as a proxy for departmental openness is reductive and narrow in the extreme. This may just be an issue of communication, such that Cabinet Office should be talking about an ‘open data score’ rather than ‘openness score’, but as an input into narratives on open government this risks creating confusion, and again muddling the relationship between openness of government in general, and open data.

On nonsensical numbers: even as an ‘open data score’, the current number is practically meaningless, as it is just the proportion of a department’s published datasets that are machine readable. The score can be increased by removing non-machine readable datasets from Data.gov.uk, and is skewed according to how many datasets a department publishes. A department publishing two datasets, one machine readable and one not, gets a score of 50%. If they publish an extra dataset, full of meaningful information, but that is not yet machine-readable, their score drops to 33%. This means the score is not only misleading, but potentially creates perverse incentives that run counter to the very notion of Tim Berners-Lee’s 5-Star rating of open data (which I should remind readers is not a rigorously designed set of criteria, but something Tim prepared just before a conference presentation as a rough heuristic for how data should be opened), which calls for people to put data online as a first step, even if it can’t be made machine readable right away.

An openness score constructed like the current one potentially incentivises less data publishing, not more.
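The arithmetic can be made concrete with a minimal sketch. It assumes the score is simply the share of a department’s datasets rated three stars or above, as the press release describes; the function name and the example ratings are illustrative, and this is not the Cabinet Office’s actual code.

```python
def openness_score(star_ratings, threshold=3):
    """Percentage of datasets rated at or above `threshold` stars."""
    if not star_ratings:
        raise ValueError("no datasets published")
    passing = sum(1 for stars in star_ratings if stars >= threshold)
    return 100 * passing / len(star_ratings)


# Hypothetical department: one machine-readable dataset (3 stars), one not (1 star).
before = openness_score([3, 1])               # 50.0

# Publishing one more useful, but not yet machine-readable, dataset lowers the score.
after_publishing = openness_score([3, 1, 1])  # roughly 33.3

# Withdrawing the non-machine-readable dataset instead raises it.
after_removing = openness_score([3])          # 100.0
```

Under this assumed formula, the quickest route to a perfect score is to take data offline, which is exactly the perverse incentive described above.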

I hope whoever came up with the idea of the openness score is encouraged to go back to the drawing board and think both about its design, and how it is communicated.

 

Data and Trust: Raw data now? Or only after rigorous review?

I’ve just come across a report from Science Wise on an ‘Open Data dialogue’ they held earlier this year. The dialogue brought together 40 demographically diverse members of the UK public to discuss how they felt about the application of open data policies to research data. Whilst the dialogue was centred on research data rather than government data, it appears it also touched on public datasets such as health, crime and education statistics, and so the findings have a lot of relevance for the open government data movement as well as the research field. The report from the dialogue is available as a PDF here.

One of the key findings that jumped out at me was amongst a list of ‘8 key principles [the public identified] that could be used to promote more effective open data policies’, and stated the view that ‘Data should be checked for inaccuracies before being made open’ (a number of the other points, such as ‘Raw data should include full details explaining what the data relates to, how it was collected, who collected it and how formatted’ were also interesting in providing further basis for the Five Stars of Open Data Engagement, but I’ve written about that plenty already). The interesting thing about the idea that data should be checked before being made open is that it runs counter to the call for ‘Raw Data Now’ commonly heard in open data advocacy, where the argument is made that putting data out will allow the errors to be spotted and fixed.

The reason, the Science Wise report explains, that the public in the dialogue were reluctant to accept this was to do with trust (though it should be noted this is something Science Wise were explicitly interested in exploring in their dialogues, so was a considered, but not unprompted response from participants). With lots of data out there, subject to different interpretations, and potentially inaccurate, trust in the data, and the work based upon it may be eroded. Although building trust is one of the reasons often given for openness, the idea that openness can in fact undermine trust is not a new one: see for example Grimmelikhuijsen on Linking transparency, knowledge and citizen trust in government, and Archon Fung speaking at the Open Data Research Network’s Boston workshop. What some of this past work on trust and openness does do, however, is suggest this is an area open to empirical research to test the claims in either direction.

For example, studies could be constructed to ask:

  • Does putting out ‘Raw Data Now’ actually lead to errors being (a) spotted; (b) fixed in the source data; and (c) corrections propagated so the impact of the errors is minimised in work based on the data?
  • Is research or policy based on open data, where that data has been used by third parties, more or less trusted than comparable research or policy without the underlying data being open? What are the confounding factors in either direction?

The call for ‘raw data now’ may be as much strategic (an attempt to head off objections to releasing data) as anything else, but it will take work to understand when it’s a strategy with a short-term gain and longer term risks, or when it makes sense to pursue.

Perhaps to end with an image:

[Image: WhatDoWeWant]

How might open data contribute to good governance?

Below is the pre-print full text of an article of mine forthcoming in the 2012/13 edition of the Commonwealth Governance Handbook.

You can find a PDF copy over here.

How might open data contribute to good governance?

Access to information is increasingly recognised as a fundamental component of good governance. Citizens need access to information on the decision-making processes of government, and on the performance of the state to be able to hold governments to account. States often require disclosure of information from public and private bodies, making use of targeted transparency1 to regulate the actions of both public and private actors.

Conventionally, access to information has involved access to documents: to published reports and print-outs. However, over the last few years an open data movement has emerged, seeking to move beyond static documents, and asking for direct access to raw datasets from governments (and from other institutions). This movement wants access to data in ways that allow it to be searched, sorted, remixed, visualised and shared through the Internet. Governments have been encouraged to establish open data initiatives and data portals, providing online access to data on everything from national budgets, to school performance, health statistics and aid spending. This article considers the potential implications of open data for democratic governance.

What is open data?

Open data can be formally defined as data that is accessible, machine readable, and openly licensed. In practice, that means data: that can be downloaded from the Internet; that can be manipulated in standard software; and where the user is not prohibited in any way from sharing the data further.

An example may help illustrate this: imagine a national budget that is released in a printed report, made up of hundreds of different tables, each with a slightly different layout. To compare this budget to actual spending, or to see a breakdown of funds by different categories from those the publisher has chosen to present, citizens would have to re-type all the data into a spreadsheet manually. For a budget this could be weeks upon weeks of laborious work. Even once done, citizens might find that the data is covered by copyright that prohibits their wider use of the information. With open data, these barriers are removed: original spreadsheets of budget information should be published, and the intellectual property license applied to the data should permit citizens to use the data as they choose – including for promoting transparency and accountability and even to support commercial enterprises, perhaps based on providing market intelligence to others.
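The three-part definition can also be written down as a simple check. The sketch below is illustrative only: the class, field names and example datasets are assumptions made for this example, and real assessments would need to look at actual licences and file formats rather than three yes/no flags.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    title: str
    downloadable_online: bool  # accessible: can be downloaded from the Internet
    machine_readable: bool     # can be manipulated in standard software
    openly_licensed: bool      # users are not prohibited from sharing it further


def is_open_data(ds: Dataset) -> bool:
    """A dataset counts as open data only if it meets all three criteria."""
    return ds.downloadable_online and ds.machine_readable and ds.openly_licensed


# Hypothetical examples mirroring the budget scenario.
printed_budget = Dataset("Budget (printed report)", False, False, False)
budget_spreadsheet = Dataset("Budget (spreadsheet, open licence)", True, True, True)
```

Here `is_open_data(printed_budget)` is false and `is_open_data(budget_spreadsheet)` is true; a spreadsheet published online under restrictive copyright would fail on the licensing criterion alone.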

Open data advocates argue that, by freeing public data (which has commonly already been paid for by citizens through taxation) for re-use, technically skilled developers can build applications and visualisations that support citizens to access it more effectively, and a wide community of innovators can use the data in ways that bring social and economic value the government could never have imagined.

The rise of open data

Although the current open data movement draws upon diverse roots,2 it really burst onto the policy scene in 2009, when US President Barack Obama signed a Memorandum on Transparency and Open Government as one of his first acts in office, leading to the creation of the data.gov platform hosting hundreds of federal datasets for public access. This US move was quickly followed by the UK, launching data.gov.uk in early 2010 and starting a programme of open data reforms across government that continued and were expanded under a new administration from mid 2010 onwards. In April 2010 the World Bank launched an open data portal, providing free access to hundreds of economic and social indicators, and in July 2011, with World Bank support, Kenya launched its own open data portal (opendata.go.ke), becoming one of the first developing countries to have a national government open data platform. In September 2013, India launched a trial version of data.gov.in, bringing open government data to the world’s largest democracy.

Open data has also been a key topic in the Open Government Partnership (OGP)3, co-chaired in 2012/13 by the United Kingdom. Seven Commonwealth countries (United Kingdom, Canada, Ghana, South Africa, Malta, Trinidad and Tobago and Kenya) are amongst those who joined the Open Government Partnership in its first year. The OGP is a multilateral initiative, jointly run by governments and civil society, focusing on transparent, effective and accountable government. The founding declaration of the OGP highlights the importance of technologies in driving more open government:

“New technologies offer opportunities for information sharing, public participation, and collaboration. We intend to harness these technologies to make more information public in ways that enable people to both understand what their governments do and to influence decisions.”

Of the 45 OGP national action plans delivered by July 2012, analysis by Global Integrity4 found that ‘open data’ related commitments were amongst the most common, with countries pledging to create open data portals, or launch open data related programmes of activity.

Open data and governance

A number of connections can be drawn between open data and governance. Open data can drive greater transparency and accountability. It could lead to greater inclusion of citizens in decision-making. And it can support innovation, both in processes of governance, and in the delivery of public services. Let us explore these connections in more detail.

Modern democracy is based upon the idea that governments, institutions and officials in power can be held to account for their decisions and particularly for their use of public funds. Increasingly it is recognised that citizens should also be able to exercise rights to call companies to account for their actions, including their use of natural resources. However, parliaments, citizens and civil society can only exercise their right to call power to account when they have access to transparent, accessible information – and in modern complex states, this may require access to open data.

Open data can allow information from many different sources to be brought together, and for patterns to be found. Instead of searching through boxes of papers, with open data accountability activists, or watchdog organisations, may be able to more easily find out where money is being spent, how government is performing in different regions, or which companies are the worst polluters in a region. In the UK the government has required all local councils to publish open data on their spending transactions over £500, allowing anyone with an Internet connection to see where money is being spent, and which organisations are receiving public funds. Journalists have been some of the most frequent users of this data, but it has also been drawn upon by individual citizens and local campaign groups.

Platforms like OpenSpending.org go a step further in seeking to make data accessible and to promote citizen engagement with key issues like national budget decisions. Open Spending shows budget and spend data from governments through interactive graphs and a searchable database. The OpenSpending platform now contains budget data for Nigeria, India, Kenya, South Africa and the UK amongst others – all input by a network of volunteers working with datasets and documents from their governments.

The emerging power of open data can also be seen in projects like the International Aid Transparency Initiative (IATI) that has created a common data standard for information on aid activities. In the past, aid receiving governments have had to rely on regularly requesting data from the donors operating in their countries to find out what projects are funded where, and citizens have had to search across the websites of many different donors to find out about projects in their country, often finding that only limited information was publicly available. Now, over 50% of official Overseas Development Assistance is published in the IATI standard format, giving governments and citizens up-to-date access to information on who is giving what to whom.5 The data is far from perfect (it is still early days for IATI), but because it is published as open data, third-parties can build upon it, adding extra information such as geographic locations of projects, and ‘mashing up’ the data into visualisations and other products that make it accessible to a wide range of groups. IATI points to the potential of open data to support good governance across borders – and to promote transparency of multilateral institutions that often seem opaque and distant to citizens in any particular country.

Meeting the challenges:

Few strong arguments can be made against the idea that governments should open up access to data. However, open data policies have not been entirely uncontroversial. Firstly, there are questions over whether opening access to data simply ‘empowers the already empowered’6 – as the technical skills required to work with datasets can be relatively advanced. Secondly, concerns have been raised that open data policies can be politically manipulated, with governments choosing to selectively release data that serves their interests, using open data as an instrument of state deregulation and marketisation of public services7. Thirdly, as the International Records Management Trust have highlighted, you can only open up data if you have it – and data can only be effectively used for accountability purposes when it is reliable. As such, open data for governance relies upon good records management, which remains a weakness in many countries8. Fourthly, some in the Right to Information (RTI) movement have expressed concern that open data policies, which are often based on voluntary proactive publication of data by government, might displace a focus on the need for RTI legislation, which ensures citizens’ rights to demand information that is then reactively shared.

These issues can, to an extent, be addressed by recognising that open data needs to be about more than just publishing datasets on the Internet. Open data policy should sit as a complement to, not a replacement of, RTI legislation. And open data advocates need to recognise that adopting open data policies also requires investment in capacity building to ensure citizens, civil society, and a new generation of technically-skilled civic activists and intermediaries, can take raw data and turn it into transparent information that supports efforts on accountability and democratic inclusion. The iHub in Nairobi, Kenya has been responding to this challenge by creating an ‘incubator’ to develop the skills and focus of potential open data users9. And in the UK, participants at the 2012 UK GovCamp conference articulated a series of principles for ‘Open Data Engagement’ highlighting the need for open data policy to be demand led, and for governments to see open data as an opportunity for greater collaboration with citizens, rather than just as a one-way route to push out information10.

Whether open data initiatives will fully live up to the high expectations many have for them remains to be seen. However, it is likely that open data will come to play a part in the governance landscape across many Commonwealth countries in coming years, and indeed, could provide a much needed tool to increase the transparency of Commonwealth institutions. Good governance, pro-social and civic outcomes of open data are not inevitable, but with critical attention they can be realised11.

References

1 Fung, A., Graham, M., & Weil, D. (2007). Full Disclosure: The Perils and Promise of Transparency (p. 282). Cambridge University Press.

2 Including, amongst other roots, advocacy for Public Sector Information (PSI) regulation liberalisation in the 1990s and early 21st Century; long established and more recent Right to Information (RTI) campaigns; e-government programmes; and Access to Knowledge campaigns that emerged in response to a global tightening of intellectual property regimes.

3 www.opengovpartnership.org

4 http://globalintegrity.org/blog/whats-in-OGP-action-plans

5 See http://www.iatistandard.org and find the data at http://www.iatiregistry.org

6 Gurstein, M. (2011). Open data: Empowering the empowered or effective data use for everyone? First Monday, 16(2). Retrieved from http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/3316/2764

7 Bates, J. (2012). “This is what modern deregulation looks like” : co-optation and contestation in the shaping of the UK’s Open Government Data Initiative. The Journal of Community Informatics, 8(2). Retrieved from http://ci-journal.net/index.php/ciej/article/view/845/916

8 See http://irmt.org/open-government-trustworthy-records-presentation

9 http://www.ihub.co.ke/blog/2012/07/is-open-data-making-an-impact/

10 http://www.opendataimpacts.net/engagement/

11 Davies, T. (2010). Open data, democracy and public sector reform: A look at open government data use from data.gov.uk. Practical Participation. Retrieved from http://www.opendataimpacts.net/report/

Notes on a National Information Infrastructure

In October the Advisory Panel on Public Sector Information (APPSI) released a short discussion paper based on the idea of building a ‘National Information Framework’ to build an ‘information infrastructure’ for the United Kingdom. The idea of an information infrastructure draws upon the metaphor of physical infrastructure – noting that governments play a role planning, investing in, and strategically co-ordinating projects like road, rail, electricity and water supplies, working with a range of private and public stakeholders – and suggesting that similar levels of co-ordination could be applied to the resources that power an information economy.

APPSI’s vision of a National Information Framework incorporates not only datasets, but also skills, standards, meta-data, directories, tools and guidance to ensure the country has the necessary resources and processes to capitalise on available information. It notes that not all PSI is open data. Whilst it welcomes the impetus that an initial ‘transparency’ push in the open data movement gave to efforts to open up PSI (which had perhaps not seen the same dynamism under attempts to implement the European Public Sector Information directive), it is not overly concerned with whether data is strictly open data or not, but does emphasise the importance of ‘core datasets’ being free at the point of use. One of APPSI’s core contentions is that the push towards open data has lacked a strategic approach, and that the mandates and terms of reference of the groups that have emerged recently, such as the Data Strategy Board, Open Data User Group, and the Open Data Institute, limit their ability to take a wide-reaching view that addresses not only specific datasets, but also the wider structures, skills and investment needed to make the most of data.

The concept of an ‘information infrastructure’ is a very useful one: both to support more strategic thinking about how data resources are made available, and at the same time to support critical analysis of which groups may win or lose from the approaches taken. Whether that critical analysis would support the sort of designed infrastructure APPSI are aiming for (they note that it would require both top-down and bottom-up components, but there is a strong sense in the report of an interest in the top-down) is something to explore. Let us look at a few properties of infrastructures worth noting:

  • The design of an infrastructure (or the historical accidents in its creation) has a big impact on what can be built upon it: take for example UK railways, where the track gauge and low bridges rule out running continental style double-decker trains as a way to respond to overcrowding. Choices made about infrastructure can leave a long legacy.
  • National infrastructures often involve a mix of public and private investment: and the UK experience, again with railways, and with private-public partnership building projects for schools and hospitals, shows this does not always lead to a net gain for citizens.
  • Infrastructure decisions are political: choosing where to build a road or an airport can be an important decision to bring economic development to an area, but such decisions can also be made not based on demand or need, but on political horse-trading over projects, popularity, votes and support.
  • Infrastructures require governance and regulation: this was perhaps a missing term in APPSI’s list – but many infrastructures have complex governance systems – from shareholdings in joint public-private companies, to user-groups and watchdogs. The power and responsibility of infrastructures requires strong checks and balances.
  • Infrastructures can be justified on results, or provided in principle: when I was in Norway earlier this year I was struck by hearing about the right of citizens to choose where to live across the vast and remote lands of the North, and to have infrastructure provided to their homes supported by the state. Here, the provision of infrastructure was not on the basis of economic return, but was based on the responsibility of the state towards citizens. By contrast, in many places an infrastructure would only be provided where there was a return to be seen. Infrastructures can be built to serve social value, but often the assessment criterion in a ‘cost benefit analysis’ is the financial bottom line.

As my own work turns to look also at the emergence of open data initiatives and activities across a range of different developing world countries, the infrastructure framing also highlights some useful perspectives and experiences to draw upon – noting that the creation of national infrastructures has often been a space of considerable contestation, corruption and challenge, whilst at the same time being generally recognised as essential for development.

Whilst APPSI do not appear to be terribly strong on the outreach and engagement needed to really spark a debate in the UK over a National Information Framework, I certainly hope this paper does generate more discussion, and thought, about going beyond ad-hoc datasets, and also generates some innovative thinking about how to do that in a way that is not just about central control and planning.

 

Aside: will a shift to using private data undermine the availability of public data?

Paragraph 29 of the APPSI report (LINK) talks about the fact that governments may be able to collect less data, and more of our national information infrastructure may come from private companies. This raises a risk: governments may face a different price for data depending on whether it is for private use within government, or whether it is to be more widely shared and made available. It could end up a lot cheaper to buy in data just for internal use in policy making, and very expensive to release that data in ways that would let citizens and other actors in the governance arena scrutinise policy making and hold government to account. If governments’ access to data for decision making did take this turn, then open policy making may become comparatively more expensive and be undermined.

It is important then for us to be alert when discussions about discontinuing state collection of certain data, in favour of buying in that data from private providers, emerge. If the comparison is between the cost of collecting the data and the cost of buying in the data just for government use, then the calculations need to be carefully questioned – as the substitution is not like-for-like. In the status quo where government collects data, it can (and increasingly does) share more widely the data upon which policy is based; where government becomes only the holder of a licence for limited use of a commercially provided dataset the situation is very different.

Right to Data Code of Practice Consultation

[Summary: notes on another open data consultation response]

As if to provide plenty of opportunities for procrastination from working on my PhD, government is providing a constant stream of open data related consultations right now. Next up, a consultation on the Code of Practice to be issued concerning the ‘Right to Data’ introduced by the 2012 Protection of Freedoms Act.

This one is hosted over on Data.gov.uk, and takes the form of a copy of the current draft with space for paragraph-by-paragraph commenting.

I’ve added in a few responses, in particular to note that:

There are also some wider issues the guidance should perhaps address explicitly, on the additional requirements of attention to detail in privacy terms when releasing data, and there is probably space for the guidance (and consultation) to be in clearer English without sacrificing the legal detail that may be required of it.
It will certainly be interesting to see how the Right to Data plays out in coming years…

Opening the National Pupil Database?

Cross posted from my personal blog.

[Summary: some preparatory notes for a response to the National Pupil Database consultation]

The Department for Education are currently consulting on changing the regulations that govern who can gain access to the National Pupil Database (NPD). The NPD holds detailed data on every student in England, going back over ten years, and covering topics from test and exam results, to information on gender, ethnicity, first language, eligibility for free school meals, special educational needs, and detailed information on absences or school exclusion. At present, only a specified list of government bodies are able to access the data, with the exception that it can be shared with suitably approved “persons conducting research into the educational achievements of pupils”. The DFE consultation proposes opening up access to a far wider range of users, in order to maximise the value of this rich dataset.

The idea that government should maximise the value of the data it holds has been well articulated in the open data policies and white paper that suggests open data can be an “effective engine of economic growth, social wellbeing, political accountability and public service improvement”. However, the open data movement has always been pretty unequivocal on the claim that ‘personal data’ is not ‘open data’ – yet the DFE proposals seek to apply an open data logic to what is fundamentally a personal, private and sensitive dataset.

The DFE is not, in practice, proposing that the NPD is turned into an open dataset, but it is consulting on the idea that it should be available not only for a wider range of research purposes, but also to “stimulate the market for a broader range of services underpinned by the data, not necessarily related to educational achievement”. Users of the data would still go through an application process, with requests for the most sensitive data subject to additional review, and users agreeing to hold the data securely: but, the data, including easily de-anonymised individual level records, would still be given out to a far wider range of actors, with increased potential for data leakage and abuse.

Consultation and consent

I left school in 2001 and further education in 2003, so as far as I can tell, little of my data is captured by the NPD – but, if it was, it would have been captured based not on my consent to it being handled, but simply on the basis that it was collected as an essential part of running the school system. The consultation documents state that “The Department makes it clear to children and their parents what information is held about pupils and how it is processed, through a statement on its website. Schools also inform parents and pupils of how the data is used through privacy notices”, yet, it would be hard to argue this would constitute informed consent for the data to now be shared with commercial parties for uses far beyond the delivery of education services.

In the case of the NPD, it would appear particularly important to consult with children and young people on their views of the changes – as it is, after all, their personal data held in the NPD. However the DFE website shows no evidence of particular efforts being taken to make the consultation accessible to under 18s. I suspect a carefully conducted consultation with diverse groups of children and young people would be very instructive to guide decision making in the DFE.

The strongest argument for reforming the current regulations in the consultation document is that, in the past, the DFE has had to turn down requests to use the data for research which appears to be in the interests of children and young people’s wellbeing. For example, “research looking at the lifestyle/health of children; sexual exploitation of children; the impact of school travel on the environment; and mortality rates for children with SEN”. It might well be that, consulted on whether they would be happy for their data to be used in such research, many children, young people and parents would be happy to permit a wider wording of the research permissions for the NPD, but I would be surprised if most would happily consent to just about anyone being able to request access to their sensitive data. We should also note that, whilst some of the research DFE has turned down sounds compelling, this does not necessarily mean this research could not happen in any other way: nor that it could not be conducted by securing explicit opt-in consent. Data protection principles that require data to only be used for the purpose it was collected cannot just be thrown away because they are inconvenient, and even if consultation does highlight that people may be willing for some wider sharing of their personal data for good, it is not clear this can be applied retroactively to data already collected.

Personal data, state data, open data

The NPD consultation raises an important issue about the data that the state has a right to share, and the data it holds in trust. Aggregate, non-disclosive information about the performance of public services is data the state has a clear right to share and is within the scope of open data. Detailed data on individuals that it may need to collect for the purpose of administration, and generating that aggregate data, is data held in trust – not data to be openly shared.

However, there are many ways to aggregate or process a dataset – and many different non-personally identifying products that could be built from a dataset. Many of these government will never have the need to create – yet they could bring social and economic value. So perhaps there are spaces to balance the potential value in personally sensitive datasets with the necessary primacy of data protection principles.
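As a rough illustration of what a non-personally identifying product might look like, the sketch below turns individual-level records into counts per group and withholds any count that falls below a minimum size. The record fields, the sample rows and the threshold of three are all assumptions made for the example, not anything drawn from the NPD or from DFE practice, and small-count suppression on its own should not be taken as sufficient anonymisation.

```python
from collections import Counter

# Hypothetical individual-level records. The fields and values are invented
# for illustration and do not reflect the NPD schema.
RECORDS = [
    {"region": "North", "free_meals": True},
    {"region": "North", "free_meals": False},
    {"region": "North", "free_meals": True},
    {"region": "North", "free_meals": False},
    {"region": "South", "free_meals": True},
    {"region": "South", "free_meals": False},
]


def aggregate_with_suppression(records, key, min_count=3):
    """Count records per group; report None for groups smaller than min_count."""
    counts = Counter(record[key] for record in records)
    return {
        group: (count if count >= min_count else None)
        for group, count in counts.items()
    }


if __name__ == "__main__":
    print(aggregate_with_suppression(RECORDS, "region"))
```

The output contains only group-level figures, with the small 'South' group withheld, so no row relating to an individual leaves the system. Deciding what threshold and what further checks are adequate is exactly the kind of governance question that would need careful review.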

Practice accommodations: creating open data products

In his article for the Open Data Special Issue of the Journal of Community Informatics I edited earlier this year, Rollie Cole talks about ‘practice accommodations’ between open and closed data. Getting these accommodations right for datasets like the NPD will require careful thought and could benefit from innovation in data governance structures. In early announcements of the Public Data Corporation (now the Public Data Group and Open Data User Group), there was a description of how the PDC could “facilitate or create a vehicle that can attract private investment as needed to support its operations and to create value for the taxpayer”. At the time I read this as exploring the possibility that a PDC could help private actors with an interest in public data products that were beyond the public task of the state, but were best gathered or created through state structures, to pool resources to create or release this data. I’m not sure that’s how the authors of the point intended it, but the idea potentially has some value around the NPD. For example, if there is a demand for better “demographic models [that can be] used by the public and commercial sectors to inform planning and investment decisions” derived from the NPD, are there ways in which new structures, perhaps state-linked co-operatives, or trusted bodies like the Open Data Institute, can pool investment to create these products, and to release them as open data? This would ensure access to sensitive personal data remained tightly controlled, but would enable more of the potential value in a dataset like NPD to be made available through more diverse open aggregated non-personal data products.

Such structures would still need good governance, including open peer-review of any anonymisation taking place, to ensure it was robust.

The counter argument to such an accommodation might be that it would still stifle innovation, by leaving some barriers to data access in place. However, the alternative, of DFE staff assessing each application for access to the NPD, and having to make a decision on whether a commercial re-use of the data is justified, and the requestor has adequate safeguards in place to manage the data effectively, also involves barriers to access – and involves more risk – so the counter argument may not take us that far.

I’m not suggesting this model would necessarily work – but introduce it to highlight that there are ways to increase the value gained from data without just handing it out in ways that inevitably increase the chance it will be leaked or mis-used.

A test case?

The NPD consultation presents a critical test case for advocates of opening government data. It requires us to articulate more clearly the different kinds of data the state holds, to be much more nuanced about the different regimes of access that are appropriate for different kinds of data, and to consider the relative importance of values like privacy over ideas of exploiting value in datasets.

I can only hope DFE listen to the consultation responses they get, and give their proposals a serious rethink.

 

Further reading and action: Privacy International and Open Rights Group are both preparing group consultation inputs, and welcome input from anyone with views or expert insights to offer.
