Ramping up Linked Data

As a keen observer of the digital ecosystem, I am fascinated by the evolution of Linked Data…

The Web took off the way it did due to many factors. Crucially, it was easy not only to consume information but also to produce it – by the latter I mean that a Unix systems administrator could, without specialist knowledge, put up a Web server and people could easily write some HTML to be served by it. To use our meme from last September, there was a “ramp” for consumers and one for producers too.

Something very interesting is happening with Linked Data: the producer ramp seems to have arrived before the consumer ramp; i.e. circumstances are such that there are incentives to produce before the consumers are fully tooled up.  This is different but not unnatural – the case for consumption is clearly better if there is something to consume (e.g. DVD players wouldn’t have been useful without DVDs!).  The downside is that production practice might miss some usability requirements but, in the fluid world of the Web, we can expect producers and consumers to co-evolve.

How did this early producer ramp come about?  Some of it is due to the openness “wave” that the Linked (Open) Data community is both encouraging and surfing, such that data providers get a tick in a box for publishing this way.  Some of it is because of lobbying and hard work by key players – activists, academics, academic activists – with influence.

Sometimes the business case might not be founded entirely on delivering publicly open data. This is perhaps most evident in a corporation or enterprise with a complexity of information systems (I quite like “complexity” as the collective noun for information systems…).  There are clear internal efficiencies to a common data sharing technology which facilitates internal reuse and perhaps extension to business partners too.  We used to call this the RDF bus; it works well, and now it’s emerging as public transport!

The BBC has demonstrated this admirably by using RDF internally to deliver web sites, delivering linked data externally, and bringing in other sources (e.g. the wildlifefinder). By the same argument, open government data represents a cost efficiency in-house together with an empowerment of the citizen through access to open data. Whichever incentive may dominate, each attracts applause and encourages progress towards a culture change in data publishing.

Interestingly, in linked data, consumers may be producers too.  Some linked data apps consume multiple sources and present them to a human reader – many mashups are really visualisations where the integration of information occurs somewhere in the cognitive workflow. But the producer mindset (i.e. it’s better to publish linked data for everyone to use than just put up yet another web site) reminds us that it’s valuable to integrate and republish, not just juxtapose in the UI. This culture of republication is already evident on the web in news aggregation and we can expect to see it in data too.
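
To make "integrate and republish" a little more concrete, here is a minimal sketch in Python. It assumes two hypothetical sources that each yield (subject, predicate, object) triples; every example.org URI and value is invented for illustration, and the helper names (integrate, to_ntriples) are mine rather than part of any library. It merges the sources, drops exact duplicates, and writes the result out in an N-Triples-like form so that someone else could consume it in turn. A real application would more likely use a proper RDF toolkit.

```python
# Hypothetical triples from two sources; all URIs and values are made up.
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

SOURCE_A = [
    ("http://example.org/species/otter", RDFS_LABEL, "Otter"),
]
SOURCE_B = [
    ("http://example.org/species/otter", RDFS_LABEL, "Otter"),  # duplicate of A
    ("http://example.org/species/otter",
     "http://example.org/vocab/habitat",
     "http://example.org/habitat/river"),
]


def integrate(*sources):
    """Merge triples from several sources, keeping first-seen order
    and dropping exact duplicates."""
    seen = set()
    merged = []
    for source in sources:
        for triple in source:
            if triple not in seen:
                seen.add(triple)
                merged.append(triple)
    return merged


def format_object(value):
    """Render an object as a URI reference or a plain string literal.

    Simplifying assumption: anything starting with http:// or https://
    is treated as a URI; everything else is a plain literal.
    """
    if value.startswith(("http://", "https://")):
        return f"<{value}>"
    escaped = (value.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n"))
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialise triples one per line, in an N-Triples-like form."""
    return "\n".join(
        f"<{s}> <{p}> {format_object(o)} ." for s, p, o in triples
    )


if __name__ == "__main__":
    # Republish the integrated data rather than only displaying it.
    print(to_ntriples(integrate(SOURCE_A, SOURCE_B)))
```

The point of the sketch is the last step: the merged data goes back out as data, not only as pixels in a UI.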

Anyway, the tooling for the consumer ramp is emerging now, powered by the energy of communities like the consumers of open.gov.uk.  This is good news for Linked Data, because we need both ramps to flourish.

Data analysts might reasonably be concerned that there is more to the consumer ramp than software tooling: there is also a question of “data literacy”.  This was less of an issue last time round because we humans are rather good at processing images and text – even making rapid assessments of information quality from a glance at a list of search results.  We can do this with BBC programme data too, but working with datasets can be more specialist, and doing it wrong could ultimately be damaging.

Examples of misinterpretation of data abounded long before linked data, and we can hope that opening up analysis will lead to more challenge and debate, and to the emergence of better understanding and practice. There is some good practice emerging. We can publish data with a set of caveats to protect the producer from accusations resulting from misinterpretation or, far better, we can publish data with an online tutorial so that people learn how to interpret it – building capability for sustained understanding rather than (or as well as) planning for inevitable misinterpretation. This is part of the consumer ramp too.

So there we are. I hope the producers are responsive to the emerging needs of the consumer as they see what is possible, that the wave of openness and business efficiency drives a culture change in data use, and that citizen analysts lead to better public understanding. In my next post I will explain how to achieve some of this!


