How do you evaluate or compare open government data initiatives (ODI)? With initiatives like data.gov and data.gov.uk well established, and well over 100 national and city-level open data initiatives emerging across the globe, the question of evaluating these initiatives is coming up more and more.
As Jose Alonso of the World Wide Web Foundation has noted, the elements of an evaluation framework are, as yet, few and far between. Linked Open Data (LOD) actors often turn to the ‘5-Stars of linked data’ to provide some metrics for evaluating particular datasets, and Jose has proposed that a 5-Star framework might be extended to provide a more general evaluative framework. Identifying six dimensions which should be taken into account in an open data initiative (political, legal, organizational, technical, social and economic), Jose suggests that:
5-star scale [Open Data Initiative] is one that is 5-star on every single of the six dimensions.
Exactly what it means to be a ‘5-star’ initiative is as yet unspecified, and the World Wide Web Foundation are starting to explore how such a framework might be developed, and where it might connect with the development of a multi-dimensional composite World Wide Web Index to rank the impact of the web/open data on countries around the world.
In this post I’ll explore some of the challenges ahead in constructing a 5-star evaluation framework for open data initiatives, offering some remarks on possible routes to explore in the future.
Heading for an index? Reduction and ranking
Simple metrics and indexes are clearly very useful advocacy tools, encouraging behaviour change amongst government, civil society and business communities. ‘Official’ UN, OECD or World Bank Statistics on health or education can drive Ministerial focus in a desire for a country not to slip down the rankings; and civil society indexes like IFPRI’s Global Hunger Index, and Publish What You Fund’s recently released pilot Aid Transparency Index are useful tools in putting an issue on the public agenda. A well-constructed index, based on open inputs, has the potential to balance the simplicity conventionally demanded by public advocacy, with the depth required to identify the complexity of creating change around a particular issue.
In outputting a single number and allowing rankings to be constructed, indexes can capture news attention as evaluated entities (often countries) look to see their relative positioning on the scale. But if an index is based on good open data, and this is also published clearly (as in the case of the excellent online interface to the PWYF Index), then the index also provides pointers for countries, companies or whichever institution was evaluated to identify areas where they should focus their efforts for change to get a higher ranking next time. In a good index, the input measures should each be linked directly to, or be proxies for, states and actions that have a proven connection with positive change in the overall domain the index is concerned with. For example, in an ideal context, PWYF should be able to account for how improving against each measured input of their Aid Transparency Index can support improvements in the ultimate effectiveness of a donor’s aid.
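To make the mechanics concrete, here is a minimal sketch of how a composite index turns several inputs into one number and a ranking, while keeping the inputs visible so an evaluated entity can see where to focus. Everything in it is an assumption for illustration: the entity names, the three input names, the scores and the weights are all made up, and this is not how PWYF or any other real index is calculated.

```python
# Hypothetical input scores (0-100) for three made-up entities; not real data.
scores = {
    "Entity A": {"commitment": 80, "publication": 40, "accessibility": 60},
    "Entity B": {"commitment": 50, "publication": 70, "accessibility": 90},
    "Entity C": {"commitment": 30, "publication": 35, "accessibility": 50},
}

# Assumed weights (summing to 1); choosing them is itself a value judgement.
weights = {"commitment": 0.25, "publication": 0.5, "accessibility": 0.25}


def composite(inputs, weights):
    """Reduce a set of input scores to a single weighted number."""
    return sum(inputs[name] * weight for name, weight in weights.items())


def ranking(scores, weights):
    """Order entities from highest to lowest composite score."""
    return sorted(scores, key=lambda e: composite(scores[e], weights), reverse=True)


def weakest_input(inputs):
    """Point to the lowest-scoring input: one place to focus next time."""
    return min(inputs, key=inputs.get)


for entity in ranking(scores, weights):
    print(entity, composite(scores[entity], weights), weakest_input(scores[entity]))
```

The single number drives the ranking, but it is the published inputs (here, `weakest_input`) that tell an entity what to change. Shifting the assumed weights can reorder the ranking, which is one reason the inputs and weights need to be open.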
Finding the right inputs for an index is challenging: indexes tend to rely not only on reducing the output to a single number, but also on reducing each of the inputs to the index to things that are easily quantifiable and comparable, and this can introduce significant national and cultural biases. For example, the Open Knowledge Foundation Open Economics group’s pilot Open Knowledge Index includes attempts to capture the existence of an ‘Open Knowledge Society’ by looking at indicators such as “Number of Wikipedia edits per 100.000 inhabitants”, not only prioritizing a particular technical platform and failing to take into account the complexity of comparing editing practices between different countries (and their potentially diverse language communities), but also ignoring the likely double-counting of earlier index elements such as “Tertiary Education Rates” and “Fixed broadband Internet subscribers (per 100 people)” introduced by looking at edits of a written online resource per head of population. In fairness, the Open Knowledge Index is just in its early stages, but its reliance upon existing comparable datasets highlights a key limitation of international index construction: it would be hard to use one input dataset for one region, and another in another region, without finding some way of making these comparable.
The reductionism of indexes has a further problem: what exactly is to be compared? A number of the indexes above rank ‘countries’, but an open data initiative index might need to cover not only national government-driven open data schemes, but also local government, community and transnational projects. Putting these in a single ranking would obviously be fraught with difficulty – and it might be appropriate to weight elements differently depending on the type of initiative being evaluated. But even amongst national open government data initiatives: would it be right to plot all governments on the same axis? How would comparing Kenya, India, Moldova and the US on an index help develop open data practice in each?
Whilst the political attraction of an index might prove a strong one for open data advocates, and indexes are certainly in vogue, the reduction of indicators to an index needs careful and critical thought – and, if the driving force behind an evaluation framework, could lead to some potentially damaging distortions in its development.
5-Star scales; 6 domains; at least 2 sides
Jose’s proposal however isn’t yet for an index. Rather, the post suggests that in addition to the 5-Stars of open linked data in the technical domain, similar ‘scales’ are needed in the political, legal, organizational, social and economic domains. This raises a number of questions.
Firstly, to what extent is the existing 5-Stars of Linked Data model truly a ‘scale’? I’ve commented before on the importance of seeing the stars as incremental and cumulative actions to be taken: as a checklist to work through in order, rather than as a ‘score’ where leaping to the top score without moving through the stages before is desirable. The 5-Stars might better be conceived of as a set of ‘indicators’, with early stars setting out the foundations that future steps should build upon. In the Hear by Right framework (PDF) for Organisational Change co-authored by Bill, my colleague at Practical Participation, 49 indicators are organised around a 7-S organisational change model, and divided into ‘Emerging’, ‘Established’ and ‘Advanced’ levels – highlighting that it’s important to move through ‘Emerging’ practice, to become ‘Established’ and to aspire to ‘Advanced’ forms of practice. It might be possible to maintain a ‘5-star’ model to indicate the movement through from emerging, to established and then advanced practice, but ensuring the design of indicators is not ambiguous about their cumulative nature will be important.
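The difference between a cumulative checklist and a ‘score’ can be sketched in a few lines. This is only an illustration under my own assumptions: each step is represented as a simple met / not met flag, listed in order from the first star to the fifth, and the function names are hypothetical.

```python
def cumulative_stars(steps_met):
    """Count stars in order, stopping at the first step that has not been met.

    Later steps only count if every earlier step is already in place.
    """
    stars = 0
    for met in steps_met:
        if not met:
            break
        stars += 1
    return stars


def additive_score(steps_met):
    """Count every step met, regardless of order (the 'score' reading)."""
    return sum(1 for met in steps_met if met)


# A hypothetical dataset that meets steps 1, 3, 4 and 5 but has skipped step 2.
steps = [True, False, True, True, True]
print(cumulative_stars(steps))  # 1
print(additive_score(steps))    # 4
```

Under the additive reading, the example looks close to the top of the scale; under the cumulative reading, it has only laid the first foundation. Which reading an indicator set intends needs to be unambiguous.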
Secondly, we should ask to what extent each dimension (technical, political, legal, organizational, social and economic) can have a single set of cumulative indicators, or to what extent we might identify multiple sets of indicators in each. For example, the five-stars of linked open data (which Jose’s post might imply could simply be adopted as the indicator set for the ‘technical’ domain) only focusses on one set of technical issues in open data publishing: the format and publishing platform (i.e. non-proprietary / linked data; and the web). However, in looking at the use of open data in practice, we find there are important further technical elements to open data initiatives – including providing tools for data discovery (catalogues), providing open source code and tools for working with data, and ensuring technical platforms can cope with demand. Similarly, there might not be one simple sequential set of indicators for the economic or political domain (for example), but rather a parallel set of states that are good to get to, including having political leadership for open data; having open data about politics available; and having open data used in political decision making.
Which leads to the third question, and perhaps one of the most fundamental for an evaluation framework: what exactly are we evaluating? The current 5-Stars of Linked Open Data is primarily a supply-side evaluation: is data being provided? But we might look at the demand or use-side of each of the dimensions Jose points to, asking not only ‘is this domain contributing to the availability of open data?’, but also ‘is open data being effectively used in this domain?’ (which is also different from asking ‘is data about this domain being used effectively?’).
Jose has already noted the further challenge with the 5-star scale in terms of working out when a star is reached. Is an open data initiative only 5-star when all the data within its ambit is published according to Linked Open Data standards, when all public organisations are fully equipped to share and work with open data, and when the whole enterprise sector is engaged with open data and using it to create new jobs? Or is there some threshold of 10% of datasets; and 50% of organisations? Or is that threshold based on ‘valuable datasets’, which, as Jose notes, raises the question “What does “most valuable” mean? For whom?”
A six-domain five-star model quickly loses its potential simplicity when we find the need to focus on both input and impact sides of the equation.
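A small sketch shows how much the answer depends on where the threshold is set. The star ratings below are made-up numbers for ten hypothetical datasets, and the function names and threshold values are my own assumptions, used only to illustrate the question.

```python
def share_at_level(dataset_stars, level=5):
    """Fraction of datasets rated at or above the given star level."""
    return sum(1 for stars in dataset_stars if stars >= level) / len(dataset_stars)


def reaches_star(dataset_stars, level, threshold):
    """Does the initiative count as reaching `level` under this threshold?"""
    return share_at_level(dataset_stars, level) >= threshold


# Made-up star ratings for ten hypothetical datasets in one initiative.
dataset_stars = [5, 5, 3, 2, 1, 1, 1, 1, 1, 1]

print(share_at_level(dataset_stars))          # 0.2
print(reaches_star(dataset_stars, 5, 0.10))   # True: a 10% threshold is met
print(reaches_star(dataset_stars, 5, 0.50))   # False: a 50% threshold is not
print(reaches_star(dataset_stars, 5, 1.00))   # False: 'all data' is not met
```

The same initiative is or is not ‘5-star’ depending purely on the threshold chosen, and the sketch treats every dataset as equally important, which sidesteps the question of which datasets are most valuable, and for whom.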
A refined proposal: creating organisational change, measuring social change
So where does this leave us? Again, turning back to learning from Hear by Right, it may be useful to draw a clear distinction between a framework for organisational change, and measuring the impacts of open data.
An organisational change framework for open data initiatives would draw upon the 5-stars already put forward as indicators for mapping and planning: organisations can map their own performance against these indicators (with the possibility of some external assessment and audit too) and can identify actions to move towards a higher level of indicator. Each indicator would identify a set of states or actions an organisation can take to effectively run an open data initiative. Each indicator should be based on a hypothesis about how that state or action increases the impact of open data, but the measurement should simply be based on whether or not the initiative has achieved that state, or taken that action. For example, in the economic dimension, an organisational change framework might include indicators for: ‘The initiative supports the development of a marketplace connecting potential infomediaries with possible sustainable sources of revenue for their services’, and would measure this on the basis of whether the initiative self-assesses (or others judge) that this is in place. The organisational change framework would not include any metrics about impact, although if, over time, it became clear an indicator did not lead to the sorts of changes it was hypothesized to support, then it might be removed or amended.
An impact framework would identify key dimensions of change which could take the form of statements about the sorts of impacts an open data strategy might have. For example, “Open data supports economic growth” in the economic dimension; or “Open data is actively used by citizens in policy making processes”. These might have ‘suggested evidence’ requirements, but it’s unlikely these will be reducible to a single number in most cases. Both organisational change and social change are, to a significant extent, subjective. Whilst we can measure certain ‘states’ (existence of organisational policies and practices; performance statistics; etc.), any measurement of organisational and social change also needs to include a narrative component – highlighting experiences of change, and how the benefits of change are distributed, as well as looking at aggregate measures of change.
In this approach, we disentangle ‘best practices’ and ‘impacts’ – and allow them to each be evaluated on their own terms. Both are still needed: pursuing organisational change without asking ‘what difference does this make?’ isn’t helpful. And equally, measuring impacts without hypothesizing about how to further them, and planning concrete steps to do so, creates massive missed opportunities.
It might even be possible to fit this approach with the elegance of a 5-star formulation.