I believe that the academic paper is now obsolescent as the fundamental sharable description of a piece of research. In the future we will be sharing some other form of scholarly artefact, something which is digital and designed for reuse and to drop easily into the tooling of e-Research, and better suited to the emerging practices of data-centric researchers. These could be called Knowledge Objects or Publication Objects or whatever: I shall refer to them as Research Objects, because they capture research.
Many people are coming at this by tweaking what we have already in the scholarly knowledge lifecycle – like publishers with supplemental materials on a web site. But for a minute let’s do a thought experiment and let go of this augmentation of an archaic form. Forget papers: How would we define a Research Object instead?
At school we just had the “Three Rs” – Reading, writing and arithmetic. I suggest that in e-Research there are Six Rs and they are the essential characteristics of the research record in contemporary research. Research Objects should have these key properties:
- Replayable – go back and see what happened. Whether observing the planet, the population or an automated experiment, data collection can occur over milliseconds or months. The ability to replay the experiment, and to focus on crucial parts, is essential for human understanding of what happened.
- Repeatable – run the experiment again. Enough information for the original researcher or others to be able to repeat the experiment, perhaps years later, in order to verify the results or validate the experimental environment. This also helps scale to the repetition of processing demanded by data-intensive research.
- Reproducible – an independent experiment to reproduce the results. To reproduce (or replicate) a result is for someone else to start with the description of the experiment and see if a result can be reproduced. This is one of the tenets of the scientific method as we know it.
- Reusable – use as part of new experiments. One experiment may call upon another, and by assembling methods in this way we can conduct research, and ask research questions, at a higher level.
- Repurposable – reuse the pieces in a new experiment. An experiment which is a black box is only reusable as a black box. By opening the lid we find parts, and combinations of parts, available for reuse, and the way they are assembled is a clue to how they can be reassembled.
- Reliable – robust under automation. This applies to the robustness of science provided by systematic processing with the human out of the loop, and to the comprehensive handling of failure demanded in complex systems where success may be the exception not the norm.
These points of definition have evolved over a series of talks and numbers vary. An interesting contender for number 7 is reflective - you can run a Research Object like a program but you can also look inside it like data; in other words it needs to be self-contained and self-describing. But that’s a means to an end rather than the end. And contender number 8 is replicatable, but to a computer scientist this is like repeatable and to a scientist it is like reproducible. I’m not sure how many of the six you have to score before something really is a Research Object, but maybe these six are actually necessary and sufficient.
How do we do this? In the Open Repositories world, the Object Reuse and Exchange standard is using RDF graphs to describe collections of things – like all the pieces that make up an experiment – even if they are distributed across the Web. It’s a great starting point for describing Research Objects – especially because, if we’re right, it is Research Objects rather than papers that will be collected in our repositories in the future. One day people will be saying “could I have a copy of that <Research Object> please?”
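As a rough illustration only, here is a minimal sketch of what "an RDF graph describing a collection of things" might look like for a Research Object. It uses plain Python tuples rather than an RDF library, and writes the graph out as N-Triples. The terms `ResourceMap`, `Aggregation`, `describes` and `aggregates` come from the Object Reuse and Exchange vocabulary; the function names and all the example.org addresses are made up for the example.

```python
# Hypothetical sketch: a Research Object described as an ORE-style aggregation.
# Triples are plain (subject, predicate, object) tuples of URI strings.

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
ORE = "http://www.openarchives.org/ore/terms/"


def describe_research_object(resource_map_uri, aggregation_uri, part_uris):
    """Return triples saying a resource map describes an aggregation of parts."""
    triples = [
        (resource_map_uri, RDF_TYPE, ORE + "ResourceMap"),
        (resource_map_uri, ORE + "describes", aggregation_uri),
        (aggregation_uri, RDF_TYPE, ORE + "Aggregation"),
    ]
    # The parts may live anywhere on the Web; the graph only points at them.
    for part in part_uris:
        triples.append((aggregation_uri, ORE + "aggregates", part))
    return triples


def to_ntriples(triples):
    """Serialise URI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


if __name__ == "__main__":
    # All of these addresses are invented for illustration.
    parts = [
        "http://example.org/experiment/data.csv",
        "http://example.org/experiment/workflow.xml",
        "http://example.org/experiment/paper.pdf",
    ]
    graph = describe_research_object(
        "http://example.org/ro/1.rdf",
        "http://example.org/ro/1",
        parts,
    )
    print(to_ntriples(graph))
```

The point of the sketch is that the data, the method and the narrative are all just aggregated parts of one addressable thing, and the paper is merely one of them.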
In my panel position at the European Semantic Web conference I suggested papers are an archaic, linear, human-readable form of Research Object and will be superseded. Actually, there will continue to be value in a human-readable narrative of an experiment, and of course we have a massive corpus in that form – though even today papers are increasingly read by machine rather than people!
Heraklion, Crete, June 2009 (revised August 2009)