Knowledge Futures
Dai Davies, April 2014

This article is about tools for the Knowledge Age, in which trade in knowledge – its value, provenance, and reliability – will be a primary economic driver. Tools like TheWordMachine will map the structure of a knowledgebase and provide basic grammatical and logical assessment. Knowledge Engineers will provide manual refinement of the process. WordMuller (work in progress) is designed as an interface for Knowledge Engineers.

Putting a value on knowledge is nothing new. All life-forms do it – but we've made it our speciality. All life-forms use language in some form or another – usually chemical, and usually instinctive. We consciously manipulate it, refine it, and communicate it widely. Until recently, these processes have been manual and intuitive, but now we are adding automation that will have an impact on knowledge management comparable to the impact that the spreadsheet and database have had on information management.

One knowledge trading institution we have already developed to quite a sophisticated degree is the stock exchange, but exchanges don't trade in knowledge about companies; they trade in knowledge about the reliability of such knowledge – how we assess our knowledge of a company and its present activities and, with futures markets, how we expect it to perform at some time in the future. In these giant casinos people gamble on their understanding of the market. Those whose knowledge is generally reliable gain – otherwise they lose.

This process can be generalised, with people staking credibility points on evaluations of knowledge – adding to those points if they are correct or losing them if they are wrong. Aside from stock markets, our evaluation of expertise is highly subjective. In academia this has been a long-standing problem. A few people come up with ideas that are clearly ground-breaking and useful and are recognised with prizes. More generally, credibility is gauged by factors such as the number of papers a researcher has published in a particular field, taken as an indication of depth of knowledge in that field, though this leads to a lot of recycling of old ideas and results. Seasoned academics may also add the spice of variety to their CV and gain recognition as a generalist in a broader arena. Too often, other factors – such as style, connections, or throwing good dinner parties – dominate.
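As a minimal sketch of how such staking might work – the names and the scoring rule here are my own illustration, not anything implemented in TheWordMachine:

```typescript
// Hypothetical credibility-point staking: a reviewer wagers points on a
// prediction; when an agreed outcome is known, the closer the prediction,
// the larger the gain, and a bad miss costs points.

interface Stake {
  reviewer: string;
  claim: string;       // the evaluation being staked on
  points: number;      // credibility points wagered
  predicted: number;   // predicted value in 0..1
}

// Net change in credibility points once the outcome (0..1) is known.
function settle(stake: Stake, outcome: number): number {
  const error = Math.abs(stake.predicted - outcome);  // 0 = perfect call
  return stake.points * (1 - 2 * error);              // positive or negative
}

// A 10-point stake on a 0.9 prediction nets +8 if the outcome is 0.8,
// and -8 if the outcome turns out to be 0.0.
console.log(settle({ reviewer: "a", claim: "c", points: 10, predicted: 0.9 }, 0.8));
```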

One approach that has been taken to try to assess the quality of work, rather than just apparent quantity, is to rank journals on a status ladder. This has inevitably led to an intense need to be published in particular journals, and gaming the system has become common practice. Academics socialise at regular conferences where, mixed with status wrangling and speed dating, they form mutual admiration groups that inflate the significance of each other's work, with pal review replacing peer review. Old ideas are often dressed up in new terminology to appear novel – or at least improved.

Even peer review is quite new in its present form. In the past, scientists would discuss ideas among colleagues and establish novelty, significance and provenance before formally publishing. Over the last century the number of practising scientists has multiplied by several orders of magnitude. The contemporary peer review process is sometimes conducted with painstaking attention to the veracity of content – particularly if the work undermines the credibility of others. More commonly it is reduced to little more than proof-reading, demands for spurious references to the works of the reviewer or their mates, or a sloppy job performed in a rush.

Scientific research is big business – a big industry – but the business model of the publishing industry is hardly typical. It takes publicly funded research results, uses the publicly funded time of academics to review work, then charges extortionate prices to libraries to stock their output. Now the scam has become even more acute, with publishing on the internet virtually free. The old model is well and truly broken and is only slowly, and often reluctantly, being replaced by models that are more suited to our current technology, potentially more transparent in their assessment processes, and cheaper.

There is no cogent argument for public research to be hidden from the public behind pay-walls. A few decades ago the public could walk into a library and access journals. Now we have a class system where a few have free access and the rest pay. The kind of library research I was able to do last century would cost tens of thousands of dollars now, with charges of thirty dollars per article common. The situation wouldn't be quite so bad if the money went to a worthy cause, such as supporting further research, but it goes in windfall profits to publishing dinosaurs.

So, rants aside, progress is being made and online journals run by researchers themselves are starting to appear. With this in mind I've pushed the WordMuller interface beyond what is minimally needed at the moment for TheWordMachine in the hope that it might contribute to the broader project of internet publishing.

As I write, it stands incomplete – little more than a proof-of-concept for a web-based interface. Minimally, it needs to be able to aggregate numerical assessments and provide keyword search through all written comments and reviews. To facilitate this it needs to be able to display the results of many assessments – hundreds – in a convenient form. The plan is to display data in colour-coded form as swatches – columns of coloured bands representing different aspects of an assessment. As an alternative to word search of reviews, swatches will be highlighted to match assessment values as they are varied in the assessment panel.
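As a rough sketch of the kind of data structure this implies – the names and scales here are my own illustration, not the actual WordMuller code:

```typescript
// Hypothetical swatch model: one assessment becomes a column of
// coloured bands, one band per assessed aspect.

interface Assessment {
  reviewer: string;
  scores: Record<string, number>;  // aspect name -> value in 0..100
  notes?: string;                  // accompanying written comments
}

// Map a 0..100 score to a hue running from red (low) to green (high).
function bandColour(score: number): string {
  const hue = Math.round((score / 100) * 120);  // 0 = red, 120 = green
  return `hsl(${hue}, 80%, 50%)`;
}

// Render one assessment as a swatch: a list of coloured bands.
function renderSwatch(a: Assessment): string[] {
  return Object.entries(a.scores).map(
    ([aspect, score]) => `${aspect}: ${bandColour(score)}`
  );
}
```

Hundreds of such columns side by side give the table of swatches; extra rows per column could then carry the machine assessments and aggregate values mentioned below.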

You can see a simple Photoshop mock-up of this in the swatchtests directory in the articles list. In the functional version, clicking on one of the swatch sets will display it in detail along with any accompanying notes. Optionally, there will be more rows in each swatch set to represent machine assessments, aggregate values, author credibility, etc.

The fixed highlighting of the swatchtests can be seen as representing search hits for some criterion such as 'value greater than 80% for this review aspect', or within some range of values. But the highlights have two variables: colour and intensity. These can be put to use with the right controls.
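One way to put those two variables to work – purely a sketch, with hypothetical names – is to let colour identify which criterion matched and intensity show how strongly a value satisfies it:

```typescript
// Hypothetical highlight rule: colour identifies the matched criterion,
// intensity grows with the margin by which the value clears it.

interface Criterion {
  aspect: string;
  min: number;      // e.g. 80 for "value greater than 80%"
  max?: number;     // optional upper bound for a range query
  colour: string;   // base hue for hits on this criterion
}

// Returns an opacity in 0..1: zero for a miss, rising towards 1 as the
// value moves from the lower bound towards the upper bound.
function highlightIntensity(value: number, c: Criterion): number {
  const upper = c.max ?? 100;
  if (value < c.min || value > upper) return 0;          // no hit
  return (value - c.min) / Math.max(upper - c.min, 1);   // 0..1
}
```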

The evaluation panel provides ideal controls. The plan for this is simple. If you have a table of past evaluations displayed – as in the swatchtests – the highlights will vary as you move the sliders or select and deselect aspects. This could be used by some reviewers to influence their review, which is generally not a good thing – but, hey, we're all human and subject to such influences anyway, so they might as well be accurate. And if we see we're going strongly against others' opinions on some point, it might give us cause to stop and think why.
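A minimal sketch of that live highlighting, under the same hypothetical data model as above: each stored evaluation lights up in proportion to how closely it matches the current slider settings over the selected aspects.

```typescript
// Hypothetical live-highlight update: intensity reflects how close a
// stored evaluation sits to the current slider positions.

function closeness(
  sliders: Record<string, number>,  // current panel values, 0..100
  stored: Record<string, number>,   // one past evaluation's scores
  selected: string[]                // aspects currently toggled on
): number {
  if (selected.length === 0) return 0;
  const meanError =
    selected.reduce(
      (sum, aspect) =>
        sum + Math.abs((sliders[aspect] ?? 0) - (stored[aspect] ?? 0)),
      0
    ) / (selected.length * 100);    // normalise to 0..1
  return 1 - meanError;             // 1 = identical settings
}
```

A dummy evaluation is then nothing more than a set of slider positions that is never submitted as a review: sweeping the sliders sweeps the highlights across the table, which is the exploratory use described next.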

The main power of this design lies in the ability to use a dummy evaluation to explore the data. Add some means of selecting by author, and graphs to display time trends, and it could be a valuable tool for Knowledge Engineers.

This topic will need a more detailed description which I will give in another article – when I've thought it through in more detail and had some feedback.