Knowledge Systems and Conflict
Dai Davies, April 2014
Brindabella.id.au


In times of rapid change the world seems more complex. New ideas and words abound but we are also forced to re-examine complexity that has already been dealt with, perhaps many times, in the past.
There's nothing new in humans being overwhelmed by information while trying to make decisions. There's also no good reason to think that problems today are more complex than they were in the past. We've always been limited by the complexity that a single mind can cope with and by our ability to communicate and aggregate the knowledge of many individuals. Humans have used technique, process and technology since we learned to use and refine language – expanded its value by extending and refining vocabulary – realised that oral transmission of knowledge between generations could be made more reliable with the use of rhyme and rhythm in stories, song and dance – expanded communication through writing then printing.


Now we have machines to store and communicate large quantities of information but our decision-making is still constrained by the abilities of our individual minds – the strength of our motivation – our eye for detail and precision – freedom from attachment to preconceived notions – the ability to resist the rush to premature conclusions or to be pressured into conforming to consensus views – the freedom to hold and express different views – in other words, a host of social conditions.

We need to resist the assumption that through some wisdom gained through ‘progress’ we can ignore the hard-won lessons of the past – that we are superior to our ancestors. A skilled seventeenth-century gamekeeper or poacher would have known more about animal behaviour than modern science. Being able to use a mobile phone – or even the ability to actually make one – doesn't make us more clever than people in the past. Thinking that we are just makes us dumber.
What I’m trying to do here is look at digital technology – particularly knowledge technologies or KTec – and see what we might achieve to help us address complex problems – particularly in an adversarial context. At the same time we need to keep in mind that as these technologies increase in sophistication we are likely to lose track of the fact that we created them and start to look on them as oracles handed down to us from the gods of technology.

We have choices: the choice between creating machines that think for themselves, or machines that act as a prosthetic to assist our own personal thinking; the choice between creating machines that hide knowledge processing in arcane computer languages and impenetrable statistical models, or using human languages and developing better ways to explain and monitor mathematical models.

Digital processing started with data – numbers – with the knowledge of their meaning residing in the minds of the people who used them and partially encoded in the software that manipulated them. Then we embedded data in structures that recorded basic interrelationships between different sets of data: tables in the case of the relational database, or tree structures where the relationships were better represented as a hierarchy. This was a move from data manipulation to information management.

Some of the numbers being manipulated represented letters of the alphabet as part of text files displayed and manipulated with text editors or word processors – still data with the structure and meaning in the minds of users. Tables and trees were of little benefit here. Web technology added a new structure – the link – that provided an arbitrary network which could represent interrelationships between words, sentences or whole documents. The next step in this progression is knowledge technologies that recognise the meanings of words in all their semantic variations, recognise the grammatical structure of sentences, then trace and map the logical connections and inference chains that exist within a body of text or knowledgebase.

These systems can take many forms but here I’ll restrict my discussion to the Natural Language Inference Engine. There are many advantages in using natural language (NL) in KTec. The most compelling one is transparency. If the basic operational rules of a system are NL it is possible for anyone to check the detailed processes that lead to a particular result. It avoids having to rely on experts trained in the obscure symbolism of formal logic, or situations where even experts have no idea how a system came to a particular conclusion. Preferably, anyone prepared to make the effort to think clearly and logically can evaluate information or build operational rule-sets to suit their own personal needs, rather than wait for a developer to do something similar and trust that it's operating the way they think it is.

I add a further simplification by considering only English, though most of what I say applies equally to any other human language. Automated parsing of English – recognising sentence structure – is a well established art where the text is grammatical. Where it is not, the problem is as open-ended as the language itself, but even in ungrammatical sentences there is usually some recognisable structure. An interactive system can allow the user to assist and extend the parser by supplying templates for novel structures.

Grammar rules can be expanded beyond normal English to include templates for lists, tables and tree structures, mathematical equations, or arbitrary structures such as web URLs and street addresses. Once sentences are parsed they can be inter-connected through logical inference by an inference engine or the meanings of particular words. This gives us the rich connectivity of logic and semantic networks – not just the meaning of a particular sentence but how it fits into its context. Once we tap in to meaning we move from information systems to knowledge systems where we can take a body of text and quiz it. This may appear to be slipping into science fiction but the basic technologies I describe here – and have built – go back at least to the 1980s.
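
As a rough illustration of parsing sentences with templates and then connecting them by inference, here is a minimal sketch. It is not the system described in this essay: the two templates, the relation names `is_a` and `subclass_of`, and the functions `parse` and `infer` are all assumptions invented for the example. Each derived fact carries a plain-English explanation, which is the kind of transparency argued for above.

```python
import re

# Hypothetical templates: each maps a sentence pattern to a relation name.
TEMPLATES = [
    (re.compile(r"^every (\w+) is an? (\w+)$", re.I), "subclass_of"),
    (re.compile(r"^(?:(?:an?|the)\s+)?(\w+) is an? (\w+)$", re.I), "is_a"),
]


def parse(sentence):
    """Return (relation, subject, object), or None if no template matches."""
    text = sentence.strip().rstrip(".")
    for pattern, relation in TEMPLATES:
        match = pattern.match(text)
        if match:
            return (relation, match.group(1).lower(), match.group(2).lower())
    return None


def infer(sentences):
    """Parse sentences, then forward-chain one rule until nothing new appears.

    Rule (an assumption for this sketch):
        X is a Y, and every Y is a Z  =>  X is a Z
    Returns a dict mapping each fact to a readable justification.
    """
    facts = {}
    for sentence in sentences:
        fact = parse(sentence)
        if fact is not None:
            facts[fact] = f'stated: "{sentence}"'

    changed = True
    while changed:
        changed = False
        for rel1, x, y in list(facts):
            if rel1 != "is_a":
                continue
            for rel2, y2, z in list(facts):
                if rel2 == "subclass_of" and y2 == y:
                    new_fact = ("is_a", x, z)
                    if new_fact not in facts:
                        facts[new_fact] = (
                            f"because {x} is a {y} and every {y} is a {z}"
                        )
                        changed = True
    return facts


if __name__ == "__main__":
    text = [
        "Tom is a poacher.",
        "Every poacher is a hunter.",
        "Every hunter is a tracker.",
    ]
    for fact, reason in infer(text).items():
        print(fact, "<-", reason)
```

A sentence that matches no template is simply skipped here. In an interactive system of the kind described above, that is the point at which the user would be asked to supply a new template.
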
In addition to providing structure to knowledge, we need a simple, intuitive, and flexible operational model. Software Engineering has refined and automated approaches to project management that can be used by KTec to help us organise how we go about solving problems. We define what it is we want to achieve – our requirements; then we need to look at how we can achieve them – design; we act – the implementation phase; then we evaluate the success of the project – the review phase. This four phase approach – requirements, design, implementation, and review (in science we talk of aims, methods, results, and conclusions) – can be seen as fundamental to all our activities, possibly embedded in our brains as instinctive, or logically intrinsic to anything we call a 'task'. We can expand this linear view to create operational hierarchies, taking each phase as a project in its own right and talk about the design of the requirements phase, or the implementation of the design phase, and so on until we have broken a large and complex project down into small manageable tasks.
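
The recursive expansion of the four phases can be sketched in a few lines. The `Task` class and the `expand` and `leaves` functions are hypothetical names chosen for this illustration, and the fixed depth is an assumption. In practice each branch would be expanded only as far as is useful.

```python
from dataclasses import dataclass, field

PHASES = ("requirements", "design", "implementation", "review")


@dataclass
class Task:
    name: str
    subtasks: list = field(default_factory=list)


def expand(task, depth):
    """Treat each phase of a task as a task in its own right, `depth` levels deep."""
    if depth > 0:
        task.subtasks = [
            expand(Task(f"{phase} of {task.name}"), depth - 1) for phase in PHASES
        ]
    return task


def leaves(task):
    """List the smallest tasks at the bottom of the hierarchy."""
    if not task.subtasks:
        return [task.name]
    return [name for sub in task.subtasks for name in leaves(sub)]


if __name__ == "__main__":
    project = expand(Task("project"), depth=2)
    for name in leaves(project):
        print(name)
```

Two levels of expansion turn one project into sixteen small tasks, including "design of requirements of project" and "implementation of design of project".
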

Much of the emphasis and effort in Software Engineering has been shifting from the construction phase of a project to the upstream design phase and then further upstream to the initial specification phase. The end goal of this transformation is that a detailed specification of the requirements of a project will lead automatically to the construction.

What, then, are the broad requirements for the knowledge management system I'm describing here? Briefly they are, in no particular order: usability – not restricting its use to experts; flexibility – dealing with 'found data' in any format; adaptability – readily adding and modifying information; generality – representing diverse and conflicting information; transparency – the ability to readily delve into processes; structure – automatically building structural representations; disambiguation – providing consistent definitions and, to maintain generality, the capacity to deal with multiple interpretations; evaluation – attaching numerical likelihoods to information, then combining and propagating these through an inference chain; visualisation – the ability to display complex networks in an intuitive graphical form. This is not an exhaustive list but it will do for our discussion here. What we can see at a glance is that all but the last point can be directly addressed by using natural language.

KTec has many potential applications, some of which I've written about elsewhere. Here I want to focus on one particular area – its application to representing and analysing complex problems in the interface between developments in science and technology and their application to public policy. This is a deeply problematic context and often leads to highly polarised debate in situations where our knowledge is incomplete and/or conflicting, and policy options are varied and controversial.

We can approach the problem by starting with some general principles that we want to maintain such as diversity, flexibility, transparency and accountability. We need to allow all views to be aired so that they can be evaluated transparently within the system rather than by hidden gatekeeper tactics. We need to allow for evolution of a debate within the system rather than just having the presentation of fixed, inflexible views. Crucially, we need the will and energy to reach a practical resolution.

We need to implement processes for the evaluation of uncertainties in the information presented and in the diverse social implications of solutions. Inevitably, we will also want to evaluate the expertise of individuals. We are usually dealing with topics that are too complex for any one person – expert or not – to grasp the detail of every area involved. Everyone involved will have to rely on the judgements of others if they want to view the full picture.

Expert judgement is always necessary in applied science. The accuracy of a physical measurement may often be logically defined and uncontroversial. Its practical significance and fitness for the problem at hand is inevitably more subjective. It is a central goal of the approach outlined here to develop processes for the reliable evaluation of expertise. The methods we rely on now have not been particularly reliable.

We use criteria such as:
Years of experience: a younger person might be energetic and up-to-date in recent developments but prone to oversimplification of problems and overconfidence. An older person, while hopefully having a broader and deeper knowledge, might have become lazy or locked-in to insupportable views.
Reputation in their field: this is often well earned, but skill at writing grant proposals and throwing good dinner parties can play important and confounding roles.
Numbers of papers published: this is not particularly meaningful without some index of quality. Apart from a few outstanding contributions (good and bad) that are read and discussed widely, there is rarely any broad evaluation beyond the opaque and fundamentally flawed peer review process.
Citation indexes: these are widely used as a metric to gauge the relevance of a paper but can be gamed by gratuitous pal referencing, the deliberate promotion of unproductive controversy, or simply publishing work full of errors.

I used to assume that peer review was as old as the tradition of the scientific process. In essence it has been, with scientists passing around copies of research reports among friends and peers before presenting them more broadly at meetings or in journals. The expression 'peer review', however, only passed into general usage in the 1970s. It rose as part of the commercialisation and industrialisation of knowledge – the Great Knowledge Heist – that developed over the 20th century helped along by the pulp-science produced in the production line publish-or-perish mentality of academia.

Publishing houses take publicly funded research results, call on publicly funded researchers for anonymous and private assessment, then sell the work at prices that have become a crippling burden on the finances of libraries across the world and an impediment to private research. In libraries the work was once available to the broad public. Now, on the internet that largely eliminates the costs of publication, it is hidden behind pay-walls with charges of around $30 per article – sometimes per view. When the cost of a thorough literature search can be a six figure sum, access to the results of publicly funded academic research is now generally restricted to large institutions. It is becoming widely recognised that the system is well and truly broken – and that's before we factor in the consequences of the political or commercial bias and corruption that hide behind the opacity of the present review process, and the fragmentation of knowledge.

KTec has the potential to improve our assessment of both knowledge and expertise. Just as a spreadsheet can detect errors in mathematical equations and link multiple equations to produce a combined result, an inference engine can detect grammatical errors in sentences and link sentences via logical and semantic associations to produce an inferred result. The reliability, or likelihood, of a result can be calculated from the individual likelihoods of the information used to derive it.

The accountants of these knowledge systems – knowledge engineers – will work their way through knowledge systems refining the language used, reducing ambiguity, and increasing consistency of expression across multiple knowledgebases to enhance the reliability of automated inference. They will also highlight gaps in the knowledge and the weakest links of inference chains. Working alongside the engineers will be people with specialist expertise in the relevant knowledge fields who work to strengthen the connection between the system and the real world and provide increasingly accurate assessments of the likelihood of individual elements of the embedded knowledge.
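
The likelihood calculation and the search for weak links described above can be sketched as follows. This is a deliberately simple model and not a description of any existing system: it assumes every step in an inference chain is independent of the others, so the likelihood of the conclusion is the product of the step likelihoods. Real chains often violate that assumption, and the function names and sample figures are invented for the example.

```python
from math import prod


def chain_likelihood(step_likelihoods):
    """Likelihood of a conclusion reached through a chain of inference steps.

    Assumes the steps are independent, so the likelihoods multiply.
    """
    values = list(step_likelihoods)
    for p in values:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"likelihood out of range: {p}")
    return prod(values)


def weakest_link(steps):
    """Return the label of the least likely step in a {label: likelihood} dict."""
    return min(steps, key=steps.get)


if __name__ == "__main__":
    # Illustrative figures only.
    steps = {
        "the measurement is accurate": 0.9,
        "the sample is representative": 0.8,
        "the model applies to this case": 0.5,
    }
    print("conclusion likelihood:", chain_likelihood(steps.values()))
    print("weakest link:", weakest_link(steps))
```

Even with individually plausible steps, the conclusion here ends up at roughly 0.36. This is why exposing the whole chain, and its weakest link, matters more than quoting the confidence of any single statement.
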

While error analysis is generally accepted as a necessary part of the scientific process it is often omitted and commonly poorly performed (the IPCC aggregates guesstimates from anonymous self-appointed experts – a further degradation of the already broken review system). The formalisation of knowledge analysis will improve tools and techniques for error analysis and its meta-analysis. For example, errors are usually assumed to be randomly distributed. This assumption is rarely questioned and often wrong. We will have tools to inject more rationalism into science practice.

My prediction for the evolution of KTec is the development of knowledge assessment markets much like our present stock exchanges but with information likelihoods replacing stock values. Players in the market will look for information that they think is over, or under, valued. They will invest credibility points in their assessed value for a future likelihood and if they turn out to be correct in their judgement they gain points. Over time they build, or lose, credibility capital. The present chaotic chasm between knowledge and public policy can be reduced by people who have transparently earned their claim to expertise.
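
One way such a market might settle a bet is sketched below. The essay does not specify a scoring rule, so this example substitutes a quadratic (Brier-style) rule as an assumption: a player gains points when their stated likelihood turns out closer to the outcome than the market's, and loses points when it is further away. The function name, parameters and figures are hypothetical.

```python
def settle(stake, market_likelihood, player_likelihood, outcome):
    """Credibility points won (positive) or lost (negative) on one claim.

    outcome is 1 if the claim turned out true, 0 if false.
    Assumed rule: stake times the difference in squared error
    between the market's estimate and the player's estimate.
    """
    if outcome not in (0, 1):
        raise ValueError("outcome must be 0 or 1")
    market_error = (market_likelihood - outcome) ** 2
    player_error = (player_likelihood - outcome) ** 2
    return stake * (market_error - player_error)


if __name__ == "__main__":
    credibility = 100.0
    # The market rates a claim at 0.5; the player thinks it is undervalued at 0.9.
    credibility += settle(10, 0.5, 0.9, outcome=1)  # the claim proved true
    print(round(credibility, 2))
```

Under this rule a player who merely repeats the market's figure neither gains nor loses, so credibility capital accumulates only through judgements that differ from the consensus and prove correct.
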

The personal credibility gained can be monetised in various ways. People with high capital will be sought after for employment as consultants and analysts. Also, organisations with a vested interest in increasing the reliability of certain knowledge can offer financial rewards for new information or for improving the contribution of information already within decision systems.

Beyond formal knowledge evaluation, actual political decisions need the subjective evaluation of social costs, benefits and priorities that our political processes provide. As flawed as these institutions are, KTec should be used to improve them rather than replace them. As useful as I think KTec can be, it will be necessary for us to be constantly aware that the knowledge embedded in it is the product of human judgement with all its inevitable frailties and strengths. Viewing machines as omniscient oracles is delusional and dangerous. Ultimately all our knowledge is based on human intuition – its diversity, intransigence and creativity. To forget that is to lose our humanity.