USING ISTAR - FOR INFERENCE

This is a step-by-step tutorial on how to make good use of Istar, and we start by looking at how to make use of inference. We take real-life kinds of tasks you might want to use it for and guide you in building knowledge bases (KBs) of various kinds, to teach you some of the features of Istar, and also to prod your thinking about how you could use it. But there are many facilities that you will not meet in this chapter.

Before you start here, read the sections "How do I start it?" and "Going further" in the Driving Istar document. This guide is at a higher level: it does not describe the features of Istar so much as when you might make use of them, for what purposes or human tasks. Also, check your version of Istar: this is geared to version 1.08, though earlier versions will have only slight differences.

We present several sections here, each a small project with Istar of about a quarter of an hour. Each is designed to let you see what Istar could be useful for and at the same time how to employ its facilities. Together they provide an introduction not just to the use of Istar facilities but to the creation of KBs in general. You will find some useful tips. The first eight or so sections take you through some of the potential benefits of knowledge based systems and introduce the main features. But the KB you build during these is far from accurate. Only towards the end do we consider how to make the KB more accurate.

The structure of the projects is to introduce useful features and techniques one by one, and later projects make use of things learned in earlier ones. So it is advisable to carry them out, or at least read them, in the order given. At the end you should be able not only to drive Istar but to take your first steps in building real knowledge bases. Have fun.

1. EVALUATING STOCKS AND SHARES

Suppose your friend asked you for advice on buying stocks and shares. As an alternative to talking or writing to him you could give him a knowledge base. Then he could run it, it would ask him questions about a company he is interested in and would offer advice. In this section you will create a (much simplified) knowledge base (KB) that gives such advice. (A fuller version of such a knowledge base is available via the Istar Knowledge Server; you can run it over the Internet using your browser from a link on that page. After reading this, perhaps connect to it for a short run to see what you will be aiming at as you go through this tutorial.)

What other reasons are there for building such a KB?

So, let's start building it.

# Fuzzy knowledge. Buying shares depends on 'fuzzy' factors like the quality of the management, the strength of the market, etc. as well as on quantitative ones like share price. So we will use Bayesians to build our KB - these allow accumulation of fuzzy evidence for and against various propositions, and involve special arithmetic that is explained elsewhere. In the top-right window (Item Types) select 'Bayesian'.

# Drawing an inference net. Draw a box (Driving step 4) towards the right of the screen and name it 'Buy it' (Driving step 9). (Note the wee 'OK' button to the right of the Label gadget; it is a version of the main one further below, there for your convenience when all you are doing is to fill in a name or meaning.) 'Buy it' is the 'goal' item, the thing we want to find out, the overall purpose of our KB.

# Draw two more boxes a couple of inches to the left of 'Buy it', calling them 'Will grow' and 'More profits', which express the ideas that the company will grow in the future and that its profits are likely to increase.

# (Labelling. You can use longer names/labels if you wish, but the shorter the better as long as it is not too ambiguous. A good rule for labelling this kind of fuzzy qualitative concept: make the name a short proposition rather than a numeric value. So just 'Profits' can be confusing since it could mean "Profits are likely to increase" or "The numerical value of last year's profits". In this case 'More profits' or 'LYP value', respectively, might be good labels as long as you also fill in a longer Meaning.)

# Both 'Will grow' and 'More profits' are good indicators that shares in the company are worth buying, so we link them to 'Buy it'. Draw a link from 'Will grow' to 'Buy it' and from 'More profits' to 'Buy it'. (Your KB should now resemble a V on its side if the two antecedents, 'Will grow' and 'More profits', are to the left of 'Buy it' and slightly higher and lower than it.)

# Run it (Driving step 12: bring up Attribute Action Panel for 'Buy it', hit Reset and Infer) with a company in mind - fictitious or real. It will ask you to provide slider values that represent your degree of belief, first your belief that the company will grow, then your belief that the company's profits will increase. Belief is positive to the right, negative to the left, the further from the centre, the stronger your belief. (Note: at this stage, it just puts the label or meaning before you.)

# Look at the result (Driving step 13). Not very interesting yet and probably totally wrong in its result, but you have built your initial KB. Not really ready for your friend, so let's add some more knowledge.

# What determines whether a company - any company - will grow? There are many factors, and we'll add a few below. But for now we'll only add two: the company is in a growth sector of the economy and the company has good management. So add two antecedents to 'Will grow' (draw two boxes with names e.g. 'Strong sector' and 'Good management', and link them to 'Will grow'). Now create a couple of antecedents for 'More profits' and link them in: the company has a good financial history ('Good fin hist') and the company has low overheads ('Low overheads'). Now the KB looks like a branching tree on its side.

# But most KBs are not pure trees but networks. Connect 'Good management' to 'More profits' just as it is already connected to 'Will grow', showing that we believe that increases in profits are more likely with good management.

# Run the KB; it will now seek your degree of belief about four things: Strong sector, Good management, Good fin hist and Low overheads. Make some positive, some negative.

# Save it, by clicking the 'Save As' button on the left side of the KB Panel and giving a name. (You will also be asked whether to make the SaveAs file the Save file; just hit 'Yes' for now; it allows you to use the Save button in future.)
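The net built in this section, and the way a run from 'Buy it' ends up asking about only the four leftmost items, can be sketched in a few lines of Python. This is an illustration only: the NET dictionary and the questions_for function are hypothetical names, not part of Istar, and the order in which Istar actually puts its questions may differ from the order shown here.

```python
# Hypothetical sketch of the section's inference net; not Istar's internals.
# Each consequent maps to the antecedents that are linked into it.
NET = {
    "Buy it": ["Will grow", "More profits"],
    "Will grow": ["Strong sector", "Good management"],
    "More profits": ["Good fin hist", "Low overheads", "Good management"],
}


def questions_for(goal, net):
    """Backward-chain from the goal and collect the items with no antecedents.

    Those are the items whose values must come from the user.
    Each item is collected once, even if it is reached by several links.
    """
    asked = []
    seen = set()

    def visit(item):
        if item in seen:
            return
        seen.add(item)
        antecedents = net.get(item, [])
        if not antecedents:
            asked.append(item)  # nothing feeds this item, so ask the user
        for antecedent in antecedents:
            visit(antecedent)

    visit(goal)
    return asked


print(questions_for("Buy it", NET))
```

Although 'Good management' is linked into two boxes, it is collected only once, which matches the four questions you were asked when running the KB.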

# To summarize, in this section we have:

1a. A Fuller KB

To give you some idea of what a fuller share-evaluation KB might look like, locate the Main Control Panel (left side of control screen unless you have moved it). Hit the 'Get KB' button, and select the Share Dealing file. This KB will be loaded, with its easel, and you will see it has around thirty boxes. It's a bit messy, and the knowledge is incomplete, but it is starting to show the factors that affect the decision to buy shares.

# Run it with 'Buy it' as goal item (Driving step 12). You will be asked a couple of dozen questions.

# Then hit the 'Rid' button bottom left of the KB Panel, and the Share Dealing KB will be removed, leaving only the one you are working on.

2. DECISION SUPPORT - WHAT-IFFING

A KB like the one you have just created can be used in a decision support mode. If we run it several times giving different answers each time then we can see the effect of the various factors. This is called what-iffing. From this we might find that certain factors are more important than others and this can help us plan e.g. where to put resources or, in this scenario, what portfolio of shares to buy.

# Run the KB (Driving steps 11 and 12) with 'Strong sector' and 'Good fin hist' having strong positive belief (100) and 'Good management' and 'Low overheads' having strong negative belief (-100). The result should be slightly to the left of centre, around 33%.

# (Notice that your input is in terms of degrees of belief while the answer is in terms of probability, but visually they should be similar at present: left of centre is negative indication, right is positive.)

# Now run it again (shortcut from the data panel to the action panel by pressing the OK-Act button rather than the OK button). Reverse all the answers and you should obtain a slightly positive indication, 66%.

# We can derive some initial knowledge from this: the combination of Good management and Low overheads is a stronger indication of whether to buy than the combination of Strong sector and Good financial history. (That assumes our knowledge base is correct and complete, of course, which it certainly is not!) This is the way we use KBs for decision support: try various combinations of factors against others.

# But is there any overriding factor? Run again, giving the first two 100 and the second two -100. The result should again be slightly positive. Then try other combinations of pairs of 100 and -100.

# What should happen is that the pair that contains Good management should always determine the final result. What this means is that good management is a more important indicator than the others. In looking at the KB it should be obvious why this is so: Good management feeds its influence through to the final goal ('Buy it') by two routes while the others feed their influence by only one. All other things being equal, the more routes by which a factor feeds through to the goal the more important it becomes.

# In a large KB it can be difficult to find these multiple inference paths, and Istar provides mechanisms for finding them; see later. But in most real KBs all other things are not equal; the links themselves have varying strengths (weights). So you cannot determine the strong factors merely by looking at the net; you must run it to find out.

# (Above we always divided the factors into pairs and always gave them extreme values; that was just to give you the idea; in reality all sorts of values would be used and we might vary only one or two factors at a time.)
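The observation about routes in this section can be checked mechanically. The sketch below counts the distinct inference paths from each input factor to 'Buy it' in the four-factor net. It is a minimal illustration in Python, not the path-finding mechanism that Istar itself provides; LINKS and count_routes are hypothetical names, and the function assumes the net has no loops.

```python
# Hypothetical sketch; LINKS and count_routes are illustrative names only.
# Each item maps to the consequents it is linked to (left to right in the net).
LINKS = {
    "Strong sector": ["Will grow"],
    "Good management": ["Will grow", "More profits"],
    "Good fin hist": ["More profits"],
    "Low overheads": ["More profits"],
    "Will grow": ["Buy it"],
    "More profits": ["Buy it"],
}


def count_routes(item, goal, links):
    """Count the distinct paths from item to goal (assumes no loops in the net)."""
    if item == goal:
        return 1
    return sum(count_routes(nxt, goal, links) for nxt in links.get(item, []))


for factor in ["Strong sector", "Good management", "Good fin hist", "Low overheads"]:
    print(factor, count_routes(factor, "Buy it", LINKS))
```

Only 'Good management' has two routes; the other three factors have one each. As the text says, once links carry different weights a route count alone is not enough, and the KB has to be run to find the strong factors.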

# To summarize, in this section we have:

3. MAKING IT EASIER AND TIDIER TO USE

When your KB becomes larger the above procedure becomes a bit cumbersome. Here are a couple of things to tidy it up and make it less cumbersome.

# First, let us put in proper question text for the degrees of belief. What we do is to provide question text for each of the four antecedent factors. To do this, for each in turn do the following:

# Click with the RIGHT mouse button on the middle of the box expressing the item. (The Attribute Details Panel should appear; if nothing happens, it is probably that the main easel is not active; to make it active, click on the easel with the left mouse button as per normal Intuition practice.)

# In the Attribute Details Panel, find the first long string gadget, the 'Q' to the right of 'User supplied'. This is the question text. Click in it and fill in the question texts, e.g. "Do you believe the company is in a strong sector?", "What is your degree of belief that the company has good management?", "Have they a strong financial history?", "Do you believe they have low overheads?"

# Now, run the KB again, and your questions should appear.

# Second, it's a nuisance having to bring up the attribute data panel each time to see the result. We can show the result directly on the main easel. To do this bring up the Attribute Details Panel for 'Buy it' (Driving step 9). At the left end of the third row of gadgets is a check box 'Show Value'. Click that so that a tick appears. Click Ok-Act to bring up the action panel and run the KB again. You should see a short black value line part way across the bottom of the 'Buy it' box. This shows its current value as a probability; the longer it is, the higher the probability of the concept expressed by the box (note: probability rather than degree of belief). Run it a few times and you can see how the value changes, not so precisely as with the data panel but enough to give a useful indication. (Note: the show value facility does not work for all data types at present; only booleans, probabilities, proportions and bayesians.)

# Such a value line is solid when the attribute is answered, and dotted when it is unanswered. By judicious choice of which attributes have value lines you can gain a visual indication of how the KB is progressing.

# Third, you can do this with all the antecedents too, showing their values.

# Fourth, if you just want to change a single input antecedent, you don't have to reset and re-ask all the others. Suppose you want to see the effect of varying just 'Good management'. Click on it with the left mouse button to bring up an action panel for it. Then hit Reset and Infer. You will be asked only for Good management. But the answer will be propagated through to 'Buy it'. Try this several times. Forward propagation does not stop at the current goal (i.e. for which an action panel has been raised), but spreads throughout the entire KB net as far as possible. By varying 'Good management' between its extremes you can see the maximum potential effect it can have, which is a good indication of its importance. (Note: For this to work properly, you should ensure that the final goal, 'Buy it', has been answered. If it is not answered then the effect of 'Good management' will not be propagated through to it because propagation usually only occurs once an attribute is answered.)

# (There is also an override facility, by which you can do similar what-iffing with items in the middle of a net; we will cover that later.)

# A goal is an attribute whose value we are interested in obtaining, by the backward and forward chaining process and the question sequence. So far, to obtain the value of a goal like 'Buy it' you have had to bring up the Attribute Action Panel, then hit Reset and Infer. But suppose we have twenty such goals; it would be tedious to do this twenty times. As we will see below, we can set up goal lists to make this less cumbersome.

# Not yet saleable. Your KB is probably tidy enough to be usable by you and by your friend if s/he is sympathetic. But it is not yet tidy enough to sell (even supposing that the knowledge is complete and accurate). But this version of Istar is not designed to take you all the way to building a commercially attractive KB; it is, rather, a tool to help you create knowledge bases in ill structured areas. Then you can export your KB to other software if you wish.
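Returning to the single-factor what-iffing described in this section: the idea of swinging one input between its extremes, while the others stay put, and watching the goal can be sketched as below. Note the assumption: this sketch combines antecedents by a plain average on the -100 to 100 belief scale, purely as a stand-in. It is not Istar's Bayesian arithmetic, so the numbers it produces will not match what Istar shows. The names NET, evaluate and swing are hypothetical.

```python
# Hypothetical sketch of single-factor what-iffing.
# ASSUMPTION: antecedents are combined by a plain average. This is a stand-in,
# NOT Istar's Bayesian arithmetic, so results will differ from Istar's.
NET = {
    "Buy it": ["Will grow", "More profits"],
    "Will grow": ["Strong sector", "Good management"],
    "More profits": ["Good fin hist", "Low overheads", "Good management"],
}


def evaluate(item, answers, net):
    """Return the value of item, inferring it from its antecedents if it has any."""
    antecedents = net.get(item)
    if not antecedents:
        return answers[item]  # a user-supplied belief, from -100 to 100
    values = [evaluate(a, answers, net) for a in antecedents]
    return sum(values) / len(values)


def swing(factor, goal, baseline, net):
    """Change in the goal when one factor moves from -100 to 100, others fixed."""
    low = evaluate(goal, {**baseline, factor: -100}, net)
    high = evaluate(goal, {**baseline, factor: 100}, net)
    return high - low


# All other inputs are held at the centre (0) while one factor is varied.
baseline = {
    "Strong sector": 0,
    "Good management": 0,
    "Good fin hist": 0,
    "Low overheads": 0,
}

for factor in baseline:
    print(factor, round(swing(factor, "Buy it", baseline, NET), 1))
```

Under this stand-in arithmetic 'Good management' produces the largest swing in 'Buy it', because its influence arrives by two routes. In Istar itself the size of each swing also depends on the link weights.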

# To summarize, in this section we have:

4. KNOWLEDGE REFINEMENT AND CLARIFICATION

By now you have probably thought about other factors that should be taken into account when deciding whether to purchase shares. Also, you have probably thought that concepts like 'Good management' are a bit vague.

You'd be right on both counts. What has happened is that in the process of putting the KB together and then trying it a few times your thinking about the domain of knowledge has been stimulated. This can be just remembering things; it can also be actual refinement or clarification of your own knowledge. Istar can help you to clarify and refine your knowledge, which is especially useful in decision support.

# The first step in refinement or clarification of knowledge is to set down precise meanings. By 'Buy it' we mean something like "This company's shares are worth buying at the moment" rather than, for instance, "I command you to buy this share" or even "This share is, and always will be, worth buying". So bring up the attribute details panel for 'Buy it'. Top right is a string gadget for 'Meaning'; into it put text similar to the first above.

# Important Tip: Normally you should fill in the meaning as soon as you create the box. (We have done so later here because of the order in which these tutorial sessions have been planned.)

# (Tip: Notice several things that have been specified when making the meaning of 'Buy it' precise:

# What we buy: "shares"

# Which we buy: "of this company"

# The situation (when, where, etc.): "at the moment"

# etc.

Making meaning precise often involves asking what, when, where, who, which, etc.)

# Now make the meaning of 'Strong sector' precise. What about: "The company operates in a strong and growing sector." Notice the inclusion of growth as well as general strength.

# This inclusion of sector growth sets us thinking: we are happy about steady, well-founded growth, but maybe not about artificially induced boom-type growth. So, for now, go back to 'Strong sector' and alter its Meaning to "... strong and growing (but steady, not boom) sector." (Note how we are frequently accessing the Meaning gadget - just hit return once to get there - and how useful is the wee 'OK' button.)

# Now make the meaning of 'Low overheads' precise; devise your own text.

# Now try 'Good management'. This one's perhaps not so easy to define. The two links we have from it, to 'Will grow' and 'More profits', speak of two aspects of quality of management. The link to 'Will grow' speaks about the extent to which the management has a vision to grow the company and has the skills to do so, such as marketing. The link to 'More profits' speaks more about the financial policies of the management: is their spending under control, is their investment policy sound, and so on. So, as we try to make the meaning precise we see that perhaps there are several things currently bound together in the single concept 'Good management'.

# We have several options here. We can retain the single, composite concept. We can split the concept into several others. Or we can do both. We will do both, first splitting it and later reinstating it for usability purposes.

# Pull the 'Good management' box over to the left of the others, leaving enough space to place a couple of boxes between it and the others. The links extend to follow it. (Oops: Not enough room to the left? Then move the other boxes to the right. Or hit the right-arrow key and the whole KB moves.)

# Draw a box and label it 'Vision for growth' and give it a meaning like "The management of this company has a vision for growth." Place it somewhere north-east of 'Good management'.

# Now we will redirect a link. Hold the left shift key down. This will enable you to 'pick up' the end of a link and move it to another box. Place the mouse cursor over the end of the link to 'Will grow' where it leaves the 'Good management' box, press the left mouse button and drag the mouse. The link should now leave the box and follow the mouse. Release the end of the link over the new 'Vision for growth' box. (Oops: The wrong link (the one to 'More profits') is picked up instead? It picks up the link nearest to the mouse cursor position. Hit Escape and try again. Oops: It doesn't pick up the link but rather starts drawing a new link or box? You didn't keep the Left Shift key down while you pressed the LMB. Oops: The box grows or shrinks in size as you move the mouse? That happens if you hold the Right Shift key down instead. In either case, hit Escape and try again.)

# How to remember it is the Left Shift key? Well, you are shifting a link, and 'link' and 'left' both begin with L.

# If you run the KB now you will be asked about 'Vision for growth' as well as all the others.

# (Note: The ability to redirect a link so easily is a boon: in many pieces of software you have to delete the link and draw a new one. While this is not much of a problem here, once you have added weights to the link and perhaps routed it around the diagram you lose all that information and have to reinstate it all again.)

# Now draw a box 'Good fin policy', somewhere south east of 'Good management', with meaning like "The management of this company has sound financial policy." Redirect the link between 'Good management' and 'More profits' in the same way as above so that it now starts at 'Good fin policy'.

# (Notice: We still have a 'Good management' box but it is not connected to the goal 'Buy it'. If you run the KB now it will no longer ask about 'Good management'; backward chaining only reaches those parts that are connected to the goal. We could delete it, but as there is no need to do so and as we sometimes find we need such items later, just leave it and ignore it. We will come back to it at the very end, making the KB more usable.)

# Now that we have identified sound financial policy as a relevant concept in our KB we notice its similarity with good financial history. Assuming that the management has been in place for some time, presumably the good financial history is due in part to sound financial policy. So we really want 'Good fin policy' to link into 'Good fin hist' as well as directly into 'More profits'. Link them (drawing a line from the right hand edge of 'Good fin policy' into 'Good fin hist' - if 'Good fin policy' is above 'Good fin hist' then you might want to make the link S-shaped, using the space bar to insert bends).

# Come to think of it, maybe we shouldn't have 'Good fin hist' at all. Maybe it is almost completely subsumed under 'Good fin policy' as far as 'More profits' is concerned? That is:

# If the management has good financial policy then there will have been a good financial history.

# If the management has bad financial policy then there is no way that there could have been a good financial history.

The latter is bound tightly to the former and is fully dependent on it. But putting it as starkly as that makes us think: are there any reasons why we might have a poor financial history even though the policy is good, or that we might have a good financial history even though the policy is bad? Can you think of any? I can.

# For the purpose of this tutorial, we will assume these two items have the same meaning, and merely delete the link from 'Good fin hist' to 'More profits' so that the former no longer has influence on the latter.

# Deleting a link. With right mouse button click over the link between 'Good fin hist' and 'More profits'. Up comes a Relationship Instance panel with details of this particular link. (Notice, in passing, the Unary Op type ('Normal') and the Weight figures towards the bottom, which are contained in four small boxes and should read 3,1, 1,3.) But ignore all else but the 'Delete' button. Click it. The link disappears.

# (We have the option of at least two other courses of action: delete the item 'Good fin hist' or merge it with 'Good fin policy'. See the section on Common Net Manipulations. Deletion of an item is more drastic than deletion of a link and in knowledge refinement it is often best to err on the side of caution. Even though the two concepts appear similar as far as 'More profits' is concerned, they are not in fact identical in terms of their actual meaning, and it might come about that later parts of the KB require 'Good fin hist' as distinct from 'Good fin policy'.)

# Save (Save-As) the KB as 'R1' so we can pick it up later.

# While we have been manipulating the concepts above, maybe you have thought of other factors that contribute to a belief that this company's shares are worth buying? If so, you can always add it: just place a box for it, link it to 'Buy it' and give it a name and meaning. Maybe you've thought of something extra that contributes to a belief in 'Will grow' or any other box. Add the new concept(s) in the same way.

# Negative evidence. Lastly, for now, maybe you have also thought of some factor that would lower your belief in 'Buy it'. For instance, even if we believe the company will grow and its profits will increase, if there are rumours that its management have been involved in shady dealings, then perhaps it would be dangerous to buy shares. So, in a suitable space to the left of 'Buy it' draw a box 'Shady dealing' with meaning "The management is believed to have been involved in shady deals." Now start drawing a link from 'Shady dealing' towards 'Buy it', but while you are drawing it, hit the Minus key (to right of '0' on top row of keyboard). You should see the line change colour from red to black, indicating that it is now a negative link. (If you change your mind, and want to change it back, hit the Plus key.) Then release the link over 'Buy it'. (Oops: Hitting minus key has no effect? You've probably hit the one on the numeric keypad; hit the one above the main keyboard instead.)

# Negative evidence acts in a similar way to normal evidence, except that its effect is reversed. When you run a KB with a negative link, as you increase belief in the negative evidence then it decreases belief in its consequent. Try it in the what-iffing mode described above.

# Click right mouse button over the negative link to obtain its Relationship Instance panel. Notice that the Unary Operator is 'Negate'. For all the positive links in your KB the Unary Operator is 'Normal'. If you ever want to change a link from positive to negative or vice versa once it is drawn, then bring up this panel and change its Unary Op type (click the wee button to left of its name, as described for Inference type in Driving step 21). Then click 'OK'.

# Notice how our process of knowledge refinement works, in several ways, and of course there are others. Istar provides an easy-to-use tool to aid the process of knowledge refinement and clarification.
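The reversed effect of the negative link added in this section can be illustrated with a small Python sketch. Two things here are assumptions made only for illustration: a 'Negate' link is modelled as a simple sign flip on the -100 to 100 belief scale, and the antecedents of 'Buy it' are combined by a plain average. Neither is claimed to be Istar's actual arithmetic, and the names LINKS_INTO, contribution and combine are hypothetical.

```python
# Hypothetical sketch of a negative ('Negate') link.
# ASSUMPTIONS: 'Negate' is modelled as a sign flip and antecedents are combined
# by a plain average; neither is Istar's real Bayesian arithmetic.
LINKS_INTO = {
    "Buy it": [
        ("Will grow", "Normal"),
        ("More profits", "Normal"),
        ("Shady dealing", "Negate"),
    ],
}


def contribution(belief, unary_op):
    """Evidence passed along a link: reversed when the unary operator is 'Negate'."""
    return -belief if unary_op == "Negate" else belief


def combine(goal, beliefs, links_into):
    """Average the contributions of all links into the goal (stand-in arithmetic)."""
    parts = [contribution(beliefs[item], op) for item, op in links_into[goal]]
    return sum(parts) / len(parts)


# Same positive evidence in both runs; only belief in 'Shady dealing' differs.
clean = {"Will grow": 60, "More profits": 60, "Shady dealing": -100}
rumoured = {"Will grow": 60, "More profits": 60, "Shady dealing": 100}

print("No shady dealing:", round(combine("Buy it", clean, LINKS_INTO), 1))
print("Shady dealing:   ", round(combine("Buy it", rumoured, LINKS_INTO), 1))
```

With the positive evidence held fixed, raising belief in 'Shady dealing' lowers the value for 'Buy it', which is the behaviour to look for when you try the negative link in what-iffing mode.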

# To summarize, in this section we have:

5. KNOWLEDGE COMMUNICATION AND MUTUAL UNDERSTANDING

(Section contains no software action.)

The above exercise, especially that of deciding to remove 'Good fin hist', shows knowledge clarification in action, and it comes about as a direct result of creating a KB with a tool like Istar. Indeed, as explained in Basden and Hibberd (1996) and Basden, Brown, Tetlow and Hibberd (1996), this is precisely why Istar was designed the way it has been, and why much other knowledge based software is not so easy to use. The process of building a good KB often involves amending what you have done before, and changing your mind several times! But it's even better when knowledge engineer (you) and expert work together.

Imagine carrying out knowledge refinement with a colleague who has some expertise, both of you around the screen. To have someone to bounce ideas off is often very fruitful. And it is often of benefit to ensure the newly emerging ideas are communicated to a colleague - and understood by a colleague. Istar is designed for partnership working, where two (or a few) people are around the screen. (And, in fact, one major reason for Istar being on the Amiga is that the Amiga produces a direct PAL/NTSC video signal which can be fed directly into a large screen TV or video projector, for use in such situations.)

(Note: This is NOT the same as group working where each has their own screen linked by some electronic means, though in principle Istar could support that. What we are talking about here is the technically more modest situation - but probably socially and practically more useful situation - where two or more people sit around a single screen.)

You have a partially developed KB up in front of you and your colleague. Think of the knowledge refinement steps we have been through:

If a colleague sat with you, there are four things that could be going on. The first two occur when knowledge is being refined, as above.

# First, your colleague just observes your actions and listens to you as you refine your knowledge. Then your thinking and reasoning would be communicated. S/he would understand why you believe, for instance, that 'Good management' is not a sufficiently precise concept. Using Istar helps this communication and mutual understanding of what are often ill-defined areas.

# Second, your colleague takes an active part in the refinement process, as the two of you discuss whether 'Good management' should be split in two, in three, or kept as a whole. Using Istar facilitates this discussion by providing a graphical 'language' in which to express and try out ideas.

The third and fourth occur when knowledge is already in a reasonably clear state and is merely being set down into the KB without being refined. (This might happen, for instance, when you are entering knowledge from a rulebook or knowledge of established good practice.)

# Third, your colleague just observes you building the KB. But the order in which you build it communicates something of importance. So does your 'body language' while building it - for instance, the act of moving 'Good management' to the left to make room gives information about your intentions. Istar is then facilitating simple one-way communication, but of more than pure information.

# Fourth, your colleague takes an active part in the construction process: "Haven't you forgotten X factor?", "Is that how you interpret that rule? I would interpret it in a different way." Again, Istar facilitates communication, but this time a two-way communication.

In addition to its clear graphical display and intuitive mechanisms for drawing knowledge bases, it has been found that a third facility is very important for communication: the flashing up of the meaning of an item at the bottom of the screen as the mouse runs over it. You've almost certainly noticed it, but if not, just move the mouse over the items of your KB. (If nothing happens, it is probably because the main easel window is not active, and so you must make it so by clicking over its background.)

Superficially like the speech bubbles that come up in MUI, Windows or the Mac, that explain what a button is for, this has a rather more sophisticated use in showing high level meaning rather than low level action. It was found during the INCA project (Basden, Brown, Tetlow and Hibberd, 1996) that the knowledge engineer made little use of it since it was not in the main field of view but the onlooker (your colleague) made enormous use of it to see the meanings of the various items of interest.

To summarize, in this section we have:

6. TYPES OF BENEFIT OF A KB

(Section contains no software action.)

We have now discussed four types of benefit that can accrue from a KB. In the first two:

the benefit accrues from running a completed KB, while in the latter two,

# refinement and clarification of your knowledge and

# communication and discussion,

the benefit accrues from the process of constructing the KB, rather than running it. For the remaining sections we will return to types of benefit that can accrue from running a completed KB, and how these benefits influence the form and style of the KB:

All these assume a completed KB and the benefits accrue from running it. The KB you have already built is general purpose and can fulfil most of the purposes but with varying degrees of clumsiness. For most effective use the KB should be tailored to the purpose for which it is designed.

I say purposes rather than purpose because many KBs hold knowledge for more than one purpose. For instance the Wheat Counsellor KB first predicted what diseases were likely in a crop of winter wheat and then selected the appropriate preventative treatment. But we will look at the style required for each purpose, and as we do, will meet and learn about various facilities of Istar.

7. PREDICTING THE BEHAVIOUR OF YOUR SHARE

Your knowledge has been refined and the KB now expresses what you believe about share purchasing (or let us suppose so). As you might have realised, your KB can be used to predict outcomes. Suppose you have knowledge of a particular company and its situation. Then if you poke this information into the KB it will predict the attractiveness of the shares of that company to investors.

# All you have to do is to run the KB; there is no change required to the knowledge. (Bring up the action panel for 'Buy it' and click Reset and Infer.)

# What this underlines is that a well designed KB is actually multi-purpose when run: for evaluation, critique, understanding, communication and prediction. Just as with any good model. So, in a sense, a good KB is a model of reality and Istar can be seen as modelling software. But, as those of you who have been involved in modelling will have realised, a rather different style of software.

# So far you have been dealing with bayesians, representing fuzzy concepts. But much modelling deals with harder or more precise concepts, and for these we need numbers, booleans, etc. Istar provides a host of these, though some obvious ones like date are still missing. For the full list, see the file on Value Types. For now we will use numbers and booleans.

# A boolean is like a sharp, hard bayesian, the common true-false, yes-no distinction. Bring up the attribute details panel for 'Strong sector'. On the second row of gadgets, headed 'Value', the leftmost shows the value type (Bayesian). Click on the wee type-change gadget at its left-hand end and a list of value types appears. Select Boolean (and hit its OK button). (If you get a warning message that value type is inconsistent with inference method, it just means that 'Strong sector' is set to 'Infer' rather than 'User supplied'; ignore it for now as it will have no effect since there are no antecedents.) You will see the value type in the attribute detail panel change - but it has not come into effect yet. So click 'OK'. Then bring up the attribute detail panel again, and you will see that the value gadget itself, to the right of the value type identifier, is no longer a slider but a checkbox. You have changed value type.

# Now run it, and you will see that when it asks about 'Strong sector' it no longer gives you a slider but a check box. (Rather ugly, this tiny checkbox; in future versions there will be an option of a larger gadget saying Yes/No.)

# Now for a numeric value. We'll use floating point numbers here, though there are other types including integers, ratios, ordinals, enumerators, proportions, etc. First, we'll practice with just numeric values, then we'll use a numeric value in our bayesian network, which takes a bit more thought. Because: how can we integrate a numeric concept like Share Price into a fuzzy propositional concept like whether or not to buy the share? Think about it as we practice some numerics.

# Distance is half of acceleration times the square of time (if I remember my school physics correctly). On the Select Item Type panel click on Free Float.

1. In a clear part of the easel (notice how it smoothly scrolls as you move the mouse - a wonderful feature of the Amiga!) lay down an item and label it 'Distance'. This will be our goal.

2. Then lay down three antecedents to the left of it: 'Time', 'Acceleration' and 'Constant'.

3. Link all of them to 'Distance'.

4. Bring up the attribute details panel for 'Distance' and locate the Inference Method type gadget. It should lie just to the right of the 'Infer' radio button and should say "X = A + B + C .." or something similar, meaning that the value of this attribute is calculated by adding together the values of all the antecedents. We want multiplication, so click on the wee type-change button at its left hand end and select "X = A * B * C .." from its list. Click OK.

5. But we want the square of time, not time itself: draw a second link from 'Time' to 'Distance', putting a bend in it (Driving step 6b) to visually distinguish it from the first one; having two links from the same antecedent into a multiply inference gives the square of that antecedent (here, of time), having three will give the cube, etc.

6. Bring up the attribute details panel for 'Constant', set the derivation radio button to 'Const' and set the value to '0.5', click its OK button.

# What you have is a wee inference net that says the result is the product of Time, Time, Acceleration and the constant, 0.5. Bring up the action panel. Run it, giving a time of 2 and an acceleration of 3, say. The result should appear as 6. With this you can predict how far a stone will fall from the top of a tower block in a given time. (Or, as mentioned in the first paragraph, you can put this to other uses: evaluating whether the stone will reach the 23rd floor in three seconds, doing a what-if on different strengths of gravity, or refining your knowledge of what factors contribute to distance fallen.)
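If it helps to see the arithmetic outside Istar, here is a minimal Python sketch of what the multiply inference does with this wee net. The function name multiply_inference and the list of values are my own illustration, not part of Istar; the only thing taken from the steps above is that 'Time' is linked in twice.

```python
from math import prod

def multiply_inference(antecedent_values):
    """Hypothetical stand-in for the "X = A * B * C .." method:
    the consequent is the product of the values on all incoming links."""
    return prod(antecedent_values)

time = 2.0
acceleration = 3.0
constant = 0.5

# 'Time' appears twice because two links run from 'Time' to 'Distance'.
distance = multiply_inference([time, time, acceleration, constant])
print(distance)  # 6.0
```

Adding a third copy of the same value to the list would give the cube, just as a third link would.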

# Now, how can we link a numeric concept into a bayesian network? The answer is that we cannot meaningfully link it in directly. Instead, we must often compare the numeric attribute with something and emerge with a truth statement. So, for instance, we could say that if the Share Price is going up (greater than it was a week ago) then we predict that the share is worth buying. So let's put that in. (I know that knowledge is wrong; you can refine it below!) It takes several stages, since you will learn and use several new features.

# First, place two Float boxes south west of 'Buy it' with enough space for another between them. Label them 'Share Price' and 'Price last week', giving meanings of "Current share price of this company" and "The share price of the company a week ago". If you like, give them suitable question text.

# Now select Boolean item type, draw an item between 'Buy it' and these two. Label it 'Shares Rising' with a meaning "The share price is rising".

# Now link 'Share Price' as antecedent to 'Shares Rising'. Then link 'Price last week' similarly. Do it in that order.

# Bring up the attribute details panel for 'Shares Rising'. Change the inference method to "Whether A > all". What this inference method does is to see whether the first antecedent is greater than all the rest (that is why the order was important). It returns a boolean result (unlike most inference methods, in which the consequent is of the same value type as the antecedents). Click OK.
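A rough Python sketch of this comparison may make the importance of link order clearer. The function name is invented for the illustration, and I am assuming a strict "greater than" (so equal prices count as not rising); Istar's own behaviour on ties may differ.

```python
def whether_a_greater_than_all(antecedent_values):
    """Hypothetical stand-in for "Whether A > all": True only if the
    first antecedent is strictly greater than every other one."""
    first, *rest = antecedent_values
    return all(first > other for other in rest)

share_price = 120.0      # made-up figures
price_last_week = 110.0

# Order matters: 'Share Price' was linked first, so it is the "A".
shares_rising = whether_a_greater_than_all([share_price, price_last_week])
print(shares_rising)  # True

# Swapping the order asks the opposite question.
print(whether_a_greater_than_all([price_last_week, share_price]))  # False
```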

# (If you wish to see this in operation, run the wee KB in which 'Shares Rising' is the goal, giving various pairs of values for 'Share Price' and 'Price last week'.)

# Now link 'Shares Rising' as antecedent into 'Buy it'. Strictly, the bayesian accumulation inference method (which 'Buy it' has) needs bayesian antecedents. But for your convenience Istar is tolerant, making automatic conversions from probability and boolean. (If the antecedent is a simple probability it has no a-priori and this is assumed to be half (0.5, 50%). If the antecedent is boolean, as here, then it is treated as a probability with a value of either 0 or 100% with an a-priori of 50%, as follows:

For details of a-priori values, see below.) You will find automatic value type conversion quite frequently, for instance allowing you to mix integers and floats when doing arithmetic.
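As a rough illustration of the two conversions just described (probability to bayesian, boolean to bayesian), here is a small Python sketch. The class and function names are invented for the purpose; the only facts taken from the text are the assumed a-priori of 0.5 and the 0 or 100% value for a boolean.

```python
from dataclasses import dataclass

@dataclass
class BayesianValue:
    probability: float  # current belief, 0.0 to 1.0
    a_priori: float     # belief before any evidence, 0.0 to 1.0

def from_probability(p):
    # A simple probability carries no a-priori, so 0.5 is assumed.
    return BayesianValue(probability=p, a_priori=0.5)

def from_boolean(flag):
    # A boolean becomes 100% or 0%, again with an a-priori of 0.5.
    return BayesianValue(probability=1.0 if flag else 0.0, a_priori=0.5)

print(from_boolean(True))  # BayesianValue(probability=1.0, a_priori=0.5)
print(from_probability(0.8))
```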

# Now run the whole KB, and see the effect of changing share price compared with price last week.

# Knowledge refinement: You probably disagree with the idea that shares should be bought when their price is rising. Many will say they should be bought when price is falling and sold when rising. That simple rule was at the centre of the stock market crash of 1989, so I thought we'd put in its opposite here! You are free to correct my possibly wrong knowledge to the conventional rule, and there are several ways of doing it:

Probably, the real knowledge of the link between share price movement and whether to buy is more complex, depending on the rate of price rise or drop, how long it has been rising or dropping, and what other dealers are doing. It would be a good exercise to try to work out a small knowledge base for this, now you have the ability to connect numeric and bayesian information.

In this section we have:

8. SELECTION: DECIDING WHICH SHARE TO PURCHASE

(This is a fairly long exercise since a number of new features are to be introduced.)

Istar can be used to select options, especially where the selection criteria are fuzzy and make use of human 'judgement'. This section looks at one way of doing this.

So far, our KB is run for a single company, and we have been evaluating whether its shares are, or predicting whether its shares will be, worth buying, but we can also use it to select the best company from which to buy shares. To use our existing KB we must run it several times, once for each company, and remember the result for each one. That is, the general purpose KB we have constructed can also be used for selection.

But running it for each company can be inconvenient. For a start, we need to remember the result for each. To continue, there are situations where there is common information and we find we are having to enter the same information each time. This section looks at one way of allowing several selections in a single KB.

This method would not normally be used for selecting between such varied things as company shares, but mainly for selection from a small and static range of options. For instance, to select the best of five alloys from which to make machine parts, depending on their properties.

But we will continue with our shares KB because our aim here is to learn a number of features and techniques, rather than end up with perfect knowledge.

We have two companies between which we must make a decision: Acme plc and Bloggs Ltd. Both are in the Information Technology sector. What we must do is to create separate parts of the KB for each, one part for Acme, one for Bloggs, but using as much common knowledge as possible. (Current version of Istar does not have a knowledge duplication facility, so we must do it manually.)

# Load the KB you saved as 'R1'.

# First, let's rename the attributes in the existing KB as relevant to Acme. Easiest way to do this in our present version (few attributes) is to bring up the attribute details for each attribute in turn and put an "A:" (for Acme) in front of both label and meaning. Do so - except for 'Buy it' which is better labelled 'Buy Acme' and for 'Strong sector' which is common data and thus not specific to either. You can ignore the dangling 'Good fin hist'.

# Now we build a similar KB for Bloggs, but without 'Strong Sector'. Preferably underneath the current one. Create items for:

(but not for 'Strong sector'), and link them in the same pattern as for Acme. (Don't bother duplicating 'Good fin hist'.) For 'Buy Bloggs' hit the Show Value button on the Attribute Details panel, so, like 'Buy Acme', the value is shown as a line in the box.

# Now link the common 'Strong sector' above Acme into 'B: Will grow'. 'Strong sector' is common to both because both are in the same sector, I.T. (Note the long almost-vertical link you have created: this is a visual cue in the diagram that says 'this link is different': unlike most, it is common to the two parts of the KB.)

# Make 'Strong sector' the first antecedent of 'B: Will grow', as described in the following paragraphs ...

# (A note about order of antecedents. Bring up the Attribute Details Panel for 'A: Will grow' and look at the bottom left corner List of Antecedents. This tells you which items/attributes are the direct antecedents to this one. Notice that 'Strong sector' is first. Now bring up that for 'B: Will grow' (no need to send 'A: Will grow' away; Istar allows several such panels to be up at the same time). 'Strong sector' should appear as second in the list. This is because we linked 'Strong sector' to 'B: Will grow' after we linked the rest, not before; any new link is added at the end. What this means is that if you run just 'Buy Acme' you will be asked about sector strength first, while if you run just 'Buy Bloggs' sector strength is asked second. Normally this doesn't matter much because the order should not affect the results (except for some order-sensitive inference methods like "Whether A > all"). The main concern is over usability; the user - your friend of the first section, perhaps - might wonder why the two parts behave differently and whether this has any significance. The attribute details panel allows you to change the order by selecting an antecedent and making it first. So select 'Strong sector' on the list (by hitting it with the LMB to highlight it) and click the attached '1' button. It should jump to first in list. And if you run 'Buy Bloggs' now it should ask about sector strength first.)

# (The other buttons by the antecedent list:

Try them, sending each away when you have seen them. These are some of the buttons around the general panel list.)

# Run both parts of the KB (Acme and Bloggs) by bringing up the Action panels for both 'Buy Acme' and 'Buy Bloggs' (you can have any number of attribute panels up at one time). Hit Reset and Infer for Acme, and answer the questions that come up. Hit Reset and Infer for Bloggs, and answer the questions that come up.

# Goal lists. But notice also how cumbersome the process of bringing up all the attribute action panels individually was; also, you probably noticed that you were asked twice about the common factor, 'Strong sector'. It gets worse the more goals you have. But we can make it easier by using goal lists. Bring up the KB panel and look down the second column of buttons to the one with three parts; the middle part, a string, might hold 'Goal List'. (Versions earlier than 1.08 are a bit different.)

# (To remove a goal from the goal list, bring up the goal list Item Details Panel by hitting the 'S' key as above, select the goal in its right hand list gadget and hit the attached Rid button.)

# (To find a goal in a large KB, select it on the goal list Item Details Panel consequent list, hit the 'See' button to bring up its Attribute Details Panel, and hit 'Show' button. Then look at the easel: everything has disappeared except the goal and the attributes it is connected to. To bring the easel back, make sure it is active and hit the Numeric Enter key (bottom right on keyboard).)

# Now, to compare Acme and Bloggs shares, one way is to bring up the Attribute Details Panel of each and look at their value sliders.

# Best save your KB: SaveAs 'R2'.

# To summarize, in this section, we have (not in this order):

9. SHOWING RESULTS

# Run the R2 KB with both goals (using the goal list or individually, as you wish). Now look at the two value lines for 'Buy Acme' and 'Buy Bloggs'. (Oops: No value line? Bring up the Attribute Details panel for each and ensure the Show Value button is ticked.)

# Notice how useful these value lines are for comparing two or more similar attributes - the longer one is the share to buy. If you move the boxes so that they are aligned vertically above each other, then comparison is fast and easy. (But it only works for bayesians, booleans, probabilities and proportions at present, not integers nor floats.)

# Value lines not only give a quick visual indication of a comparison, but they are particularly good for comparing fuzzy outcomes. We discuss why, below, after we have used a different way of displaying results.

# Some domains require something more specific, and also require the knowledge base to recommend the best. The following steps show a way of putting out the name of the better buy. It is more suited to numeric goals which cannot be shown by the horizontal value line. We add some extra inference net which compares the two and provides the name of the best goal. In doing so you will learn a new value type and a couple more facilities of Istar: the Block type and the Which Max inference method.

# On the Select Item Type panel list, towards the bottom select 'Block'. The Free Block attribute type means that the attribute contains the DSAP (data structure area pointer, an internal reference number) of a block of data in the KB. Place such a box to the right of the two goals, 'Buy Acme' and 'Buy Bloggs'. (Label: 'Best share', meaning e.g.: "This holds the DSAP of the block of the share which is most attractive".) Link them to it as its antecedents.

# Bring up the Attribute Detail Panel for 'Best Share', and the inference method should be, by default, "Which Max". (If not, then change it.) "Which Max" looks to find which of the antecedents has the highest value and returns an identifier to show which. In our case the resulting identifier is a DSAP of a block, but it can be an index number if the consequent is integer or ordinal; see the Inference file for more details.
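To make the idea concrete, here is a minimal Python sketch of a "Which Max" style of inference. The function name, the identifiers and the belief values are all made up for the illustration (the identifiers merely imitate what a DSAP might look like), and I am assuming that on a tie the first antecedent wins, which may not be what Istar does.

```python
def which_max(antecedents):
    """Hypothetical stand-in for "Which Max": given (identifier, value)
    pairs, return the identifier of the antecedent with the highest value."""
    best_id, _ = max(antecedents, key=lambda pair: pair[1])
    return best_id

# Made-up identifiers standing in for the DSAPs of the two goal blocks.
BUY_ACME = 23720
BUY_BLOGGS = 23944

best = which_max([(BUY_ACME, 0.62), (BUY_BLOGGS, 0.71)])
print(best)  # 23944
```

Note that only the identifier comes out; the winning value itself is not passed on.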

# (If you run it using 'Best share' as the goal - ignore the goal list, bring up its action panel and click Reset and Infer - then the result will be a number like 23720. Now if you bring up the attribute details panels for 'Buy Acme' and 'Buy Bloggs' you will see over the right hand side, about 2/3 of the way down, a gadget 'DSAP' and one of them should be the number you have just seen. Rather meaningless since it is an internal identifier.)

# Now to convert this Block value (DSAP) into something more meaningful: the name of the best share. In the Select Item Type panel, select 'Free String'. Place a string item to the right of 'Best share' and link 'Best share' into it as antecedent. (You can call the new item 'Best share' if you like as Istar allows duplicate names, but it's probably better to differentiate it as a string version either by a different label or in the meaning.)

# (Notice how the colours of the labels in the various items/attributes are a clue to the value type held.)

# Now run the KB with this string 'Best share (string)' as goal, and look at its result value. It should contain as its value either of the strings, 'Buy Acme' or 'Buy Bloggs'. This string value can then be placed in a document (though we will not do so here; in the current version of Istar this facility is not available).
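The conversion from a Block value to a String value amounts to looking up the label of the block referred to. A small Python sketch of that idea, with an invented lookup table and invented identifiers (Istar does this internally and I make no claim about how):

```python
# Made-up block identifiers and the labels of the attributes they refer to.
labels_by_block = {
    23720: "Buy Acme",
    23944: "Buy Bloggs",
}

def block_to_string(block_id):
    """Hypothetical stand-in for linking a Block attribute into a String
    attribute: the string takes the label of the block referred to."""
    return labels_by_block[block_id]

best_share = 23944  # what 'Best share' might hold after a run
print(block_to_string(best_share))  # Buy Bloggs
```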

# Which is the better method to show which share to buy - value lines, or name of best share? It depends. But the former (value lines) has two advantages in a fuzzy area of knowledge like share dealing. One is that it is more immediately graphical: look for the longest line. But, more importantly, suppose both results were low (e.g. 3% and 9%). Then the second method would simply say 'Buy Bloggs', whereas in reality it would probably be inappropriate to buy either of them.

# The second method, using "Which Max" inference, makes the decision for the user, while the first method merely provides decision support for the user. Decision support is often the more useful of the two, since if two or more options have approximately the same value (e.g. 85%, 87%, 82%) then it might be appropriate to choose one that is not numerically the highest, for extraneous reasons.

# Let's review what we've done. We have a KB by which the user can choose between Acme and Bloggs shares, given information about both companies and the strength of the sector. The KB is identical for each. But in many selector KBs there will be small variations in the knowledge from one option to another.

# Suppose Acme is a small company and Bloggs a large one, and the user might have a preference for large or small companies. Select Bayesian again and create an item labelled 'Prefers small', meaning "I prefer small company shares". (Notice that this is information pertaining to the user, not to the companies or their situation. Perfectly valid.) Then link it to 'Buy Acme' with a positive link and to 'Buy Bloggs' with a negative link. From previous work both 'Buy Acme' and 'Buy Bloggs' will be answered and their values shown as horizontal lines in their boxes. (If not, make sure they are.) Then bring up the action panel for 'Prefers small' and reset/infer it several times with various values. Depending on the other information, you should see the balance between the two shares change as you change preference.

# In a selector KB, which has knowledge on how to select between a small, static number of known options (like the alloys above) you will find three main types of knowledge:

This is a general pattern.

To summarize, in this section we have:

(Short version of course: You may skip to Section 14 on Making Your KB More Accurate, and simply read the intervening sections.)

10. SELECTION: DECIDING WHAT CRITERIA TO JUDGE THEM ON

The previous section looked at the construction of a KB to select between shares of two specific companies. This is not very useful since there are many companies to consider and different ones at different times; it would be more useful where the range of options is small and fixed, such as selecting the best alloy for constructing a chemical reaction vessel.

In company shares we do not have a small number of known options, but rather a large number of options which are not known until we run the KB. That is, between yesterday's run and today's five new companies might have come into existence and another twelve might have gone bust. In such domains it is better to take a different approach to selection.

I once found that this approach was needed when building a KB to select tree species suitable for planting on a particular plot of land. Just for a change from share dealing, let us try a tree species selector.

A KB of the type we are about to build is offered on-line by the Istar Knowledge Server. It is a fuller version of a tree species selector. Why not connect now, and try it out? Then you can see what we are aiming at here.

The approach we take here is not to hold knowledge of each and every option in the KB, but rather to use the KB to decide what selection criteria are important, and then these selection criteria can be applied to a database containing thousands of options (companies, tree species, etc.) and the best few examined more closely, often manually.

"But why not just use the database direct and apply a SQL query?" you may ask. The answer is that with each run a different set of selection criteria must be used, so that if we just used SQL direct we would have to write a complex SQL statement each and every run. In effect, the KB holds the knowledge on how to write the SQL statement needed for each run. (Though we will not attempt that in this section, one could do so with the current version of Istar, using string value types and the "Concat" inference method).

Also - and this is what we found in the tree selector KB - simple SQL or database access is not enough. It is better if the KB itself can perform inference on data values obtained from the database. SQL etc. do not normally have bayesian mechanisms built into them.
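The idea of letting the KB assemble the SQL statement for each run can be sketched in Python as below. Everything specific here is an assumption made for the illustration: the table name tree_species, the column names, the importance figures and the concat function, which merely imitates what a "Concat" style of inference would do with string antecedents.

```python
def concat(parts):
    """Hypothetical stand-in for a "Concat" inference method:
    the consequent string is the antecedent strings joined in order."""
    return "".join(parts)

# Importance of each criterion as it might come out of one run of the KB.
importance = {"hardiness": 0.9, "timber_yield": 0.1}

# Build one weighted term per criterion, e.g. "0.9 * hardiness".
terms = [f"{weight} * {column}" for column, weight in importance.items()]

sql = concat([
    "SELECT name FROM tree_species ORDER BY ",
    " + ".join(terms),
    " DESC LIMIT 5",
])
print(sql)
# SELECT name FROM tree_species ORDER BY 0.9 * hardiness + 0.1 * timber_yield DESC LIMIT 5
```

A different run of the KB would give different importances and hence a different statement, which is the point being made above.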

This kind of KB is rather more sophisticated in structure (as well as in detail) than the ones above. In the ones above, three kinds of information came together into the eventual goals:

In this kind of selector, they separate out to some extent. First we use (a) and some of (b) to arrive at a set of selection criteria, then we apply these to (c) and some more of (b) to find the relative attractiveness of each of the options. In this kind of selector, (c) is often held in the database, but inferences from it must be made in the KB (not just in SQL) as discussed above.

We will build a small KB for the first phase of a tree species selector. It will first be a simple KB using facilities we have already met above. Then we will refine, learning new facilities and approaches as we do so.

# Move the easel so that the whole screen is empty.

# Select bayesian item type.

# The goals. Lay down, to the right, boxes for three selection criteria:

with meanings like:

These are our three criteria for selection of tree species. Click the Show Value box for each so that we can see easily the importance of each criterion.

(Tip: Might be useful to shorten the meanings so that more than the common part ("It is important that trees") appears in the mouse-position window. For instance, make the meanings like "Trees must have a good yield of timber.")

# Over the left hand side, place boxes to represent the requirements of the user:

with meanings:

Notice that 'Near housing' is part of (b) above, while the others are part of (a).

# In the middle, place a box:

with the meaning:

# Link 'Financial return' to both 'High timber yield' and 'Game', because money can be made either out of timber or game (pheasant shooting, etc.).

# Link 'Timber' to 'High timber yield' with a positive link and to 'Game' with a negative link. This means that if they don't want timber but do want a financial yield then it must be through game.

# Link both 'Wildlife' and 'Game' to 'Good cover'. Both require good cover.

# Link 'Housing nearby' to 'Hardy'. This is because if there is housing nearby then children etc. will visit the wood and are likely to do some damage to the trees, so they must be hardy.

# Now clear the goal list and put the three right hand criteria into it. (See above for how to do this if you've forgotten.)

# Run it (ResetGoals, InferGoals on KB panel). That is our basic tree selector KB.

# (Its results show how much importance should be attached to each criterion when assessing each tree species. Normally (i.e. apart from what-iffing) the KB would be run once to obtain such importances and then these would be applied to the data for all the tree species. The algorithm for applying criterion weights to data will depend on need and data available, but might consist of multiplying each degree of importance by the appropriate data and adding the products together. For instance, suppose 'Hardy' turns out to be high (90%) and 'High timber yield' low (10%). Then for a tree with a hardiness factor of 8 out of 10 and a yield factor of 2 out of 10 the total would be 8 * 90% + 2 * 10% = 7.2 + 0.2 = 7.4. For a tree with a hardiness factor of 2 and a yield factor of 8 the result would be 2 * 90% + 8 * 10% = 1.8 + 0.8 = 2.6. So the former tree species would be preferred. If running the KB had resulted in the opposite weights being given then the tree preferences would be reversed.)
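The weighted sum in that example can be sketched in Python as follows. This is only the "multiply and add" algorithm suggested above, using the same figures; the function and variable names are mine.

```python
def score(importance, factors):
    """Sum of importance (0.0 to 1.0) times factor (0 to 10) per criterion."""
    return sum(importance[name] * factors[name] for name in importance)

# Importances as they might come out of one run of the KB.
importance = {"Hardy": 0.90, "High timber yield": 0.10}

tree_a = {"Hardy": 8, "High timber yield": 2}
tree_b = {"Hardy": 2, "High timber yield": 8}

print(round(score(importance, tree_a), 2))  # 7.4
print(round(score(importance, tree_b), 2))  # 2.6
```

Swapping the two importances swaps the two scores, and with them the preferred tree.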

# This KB demonstrates basically how to set up this type of selector: have a group of items that represent the selection criteria, a group that represent the user requirements and a group that represent the common situation, and link them together.

# Did you notice that even if you said you did not want a financial return it still asked you whether the income should be from timber or not? (If you didn't notice it, run it again, answering -100 to 'Financial return'.) This is obviously not right. So we will use a couple of facilities of Istar to amend it.

# First, if the demand for financial return is low then there is little point in asking about 'Timber'. Istar offers a cut-off on bayesians; you might have noticed the two gadgets on the bayesian attribute detail panel called LCO and UCO. These are lower and upper cut-off, and when zero they are ignored. But when not zero they come into play. Take the lower cut-off, LCO. If LCO is 30% then, as soon as it is certain that the probability value of the attribute cannot exceed 30%, the attribute is considered answered and no more of its antecedents are sought by backward chaining. This can be used to stop questions about 'Timber' when it is known that there is no financial interest. Conversely with the upper cut-off, UCO.

# Set the LCO of 'High timber yield' to 30%.

# To use the cut-offs effectively we need to alter the weights of evidence. So far we have used mild weights of 3/1 and 1/3, symmetrical about unity. We need to give the link a strong asymmetry to pull the consequent down to near zero when the antecedent ('Financial return') is low. Then, whatever value the other antecedent takes, the consequent will never rise above the LCO and so the other antecedent will not be asked. (For an understanding of weights, see the section in Inference on bayesians.)

# Bring up the relationship instance panel for the link between 'Financial return' and 'High timber yield'. It has four integers in four boxes called 'Weight', and they should be 3/1 and 1/3. Set the fourth box to 30 rather than 3, giving 3/1 and 1/30. (Also, ensure that the unary operator is 'Normal', not 'Negate'.)

# Now run the KB from just 'High timber yield' and answer 'Financial return' with -100. It should now not ask about 'Timber'. (If it does, perhaps you have run it by clicking Reset Goals and Infer Goals, because 'Timber' is still needed for 'Game', or perhaps you have not set the LCO of 'High timber yield' to 30%.)

# To stop bayesian questioning we must therefore do two things: set the LCO (or occasionally the UCO) and set the weights on the link from the controlling antecedent. So we must also do this for 'Game'. Set its LCO also to 30% and set the weight on the link between 'Financial return' and 'Game' to 3/1, 1/30. Then, when we ResetGoals and InferGoals, 'Timber' should only be asked if the 'Financial return' is answered more positively than about -30.
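Why the asymmetric weight is needed can be seen from a small Python sketch. It rests on an assumption of mine that is not stated in this tutorial: that each weight acts as a multiplier on the odds of the consequent (3/1 when the antecedent is fully true, 1/3 or 1/30 when fully false), starting from an a-priori of 50%. See the Inference file for how Istar really accumulates evidence; the sketch only shows the shape of the argument.

```python
def probability(odds):
    return odds / (1.0 + odds)

def upper_bound(prior_odds, answered_factors, unanswered_max_factors):
    """Highest probability the consequent could still reach, if every
    unanswered antecedent turned out as favourably as possible."""
    odds = prior_odds
    for factor in answered_factors:
        odds *= factor
    for factor in unanswered_max_factors:
        odds *= factor
    return probability(odds)

LCO = 0.30
PRIOR_ODDS = 1.0  # assumed a-priori of 50%

# 'Financial return' answered -100: its "false" weight applies.
# 'Timber' is still unanswered; at best it contributes its "true" weight, 3/1.
mild = upper_bound(PRIOR_ODDS, [1 / 3], [3 / 1])
strong = upper_bound(PRIOR_ODDS, [1 / 30], [3 / 1])

print(round(mild, 3), mild <= LCO)      # 0.5 False
print(round(strong, 3), strong <= LCO)  # 0.091 True
```

With the mild weight a favourable answer to 'Timber' could still bring the consequent back to 50%, so the question must be asked; with 1/30 it can get no higher than about 9%, which is below the 30% cut-off, so the question can be dropped.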

# To summarize, in this section we have:

11. DEPENDENT ATTRIBUTES - IRRELEVANT QUESTIONS

It often happens that one attribute is a sub-type of another, or is somehow dependent on it, so that if the other is answered false then the dependent attribute is no longer applicable. In the example above, 'High timber yield' is dependent on 'Financial return' in this way: if the user does not want a financial return then high timber yield becomes irrelevant, 'not applicable'. When this happens, it is inappropriate to ask questions that solely establish the value of the dependent attribute; we need to suppress such irrelevant questions.

In the above example we did this by means of the Lower Cut Off. But this is specific to Bayesian attributes. There are two other methods. The second method works only with booleans and an AND inference method. The third, the most general, is only available in version 1.04 and later, and works with any consequent though it requires a boolean antecedent.

# Preventing irrelevant questions by boolean AND. Change the attribute type of 'Financial Return' and of 'High timber yield' to boolean (bring up attribute details panel for each and alter the type gadget). Also alter the inference method of 'High timber yield' to 'A & B & ..'. Ensure that 'Financial return' is the first antecedent of 'High timber yield'. Run it several times. When you answer 'Financial return' with 'No' it should not ask about 'Timber'; when you answer 'Yes' it should.
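The effect of the AND method on questioning can be sketched in Python as follows. The function names and the canned answers are invented; the point is only that antecedents are consulted in order and that a 'No' from the first makes the rest unnecessary.

```python
def and_inference(antecedents, ask):
    """Hypothetical stand-in for 'A & B & ..' with backward chaining:
    antecedents are asked in order and questioning stops at the first 'No'."""
    for name in antecedents:
        if not ask(name):
            return False
    return True

asked = []

def ask(name):
    # Canned answers in place of a real user; records each question put.
    asked.append(name)
    return {"Financial return": False, "Timber": True}[name]

result = and_inference(["Financial return", "Timber"], ask)
print(result, asked)  # False ['Financial return']
```

This is also why 'Financial return' has to be the first antecedent: with 'Timber' first, it would be asked before the 'No' was known.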

# Using the CONTROL Unary Operator (Version 1.04 onwards). Change 'High timber yield' back to a Bayesian attribute and its inference method back to Bayesian. Turn off the Lower Cut Off, if active, by dragging it back to zero. Keep 'Financial return' as boolean. Now bring up the relationship details panel for the link between them, and change its Unary Operator to 'CONTROL'. Now run the KB. Answering 'Financial return' with 'No' should set the consequent of the CONTROL link ('High timber yield') to answered and (probably) unknown and thus suppress any further attempts to find its value, so 'Timber' is not asked on its account. When you answer with 'Yes' its value will be sought in the normal way. This method will work with any type of consequent.

To see the usefulness of the CONTROL unary operator, try the following without it. The cost of a house includes a number of optional factors, such as a conservatory. If the conservatory is wanted, then its cost needs to be worked out (e.g. from its size and quality) and added to that of the house. But if not wanted, then don't bother asking size and quality of conservatory, and don't add it to the cost of the house. Using the CONTROL unary operator, we simply link 'Want Conservatory' to 'Conservatory cost'. Without it, the latter will probably have to employ a Chooser, to choose between zero and a worked out cost. While not a complicated thing to do, it nevertheless complicates the KB diagram by at least two extra superfluous attributes, and if there are many such instances, the whole loses its clarity.
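The conservatory example can be sketched in Python like this. It is an illustration of the idea only: the function names and the figures are made up, None stands in for 'answered but unknown', and leaving unknown values out of the sum is my assumption about how the house cost would treat them.

```python
def controlled(control_answer, work_out):
    """Hypothetical stand-in for a CONTROL link: if the controlling boolean
    is False the consequent is 'answered but unknown' (None here) and its
    other antecedents are never evaluated."""
    return work_out() if control_answer else None

def total(values):
    # Assumption for this sketch: unknown (None) values are left out of the sum.
    return sum(v for v in values if v is not None)

questions = []

def conservatory_cost():
    questions.append("size and quality")  # would be asked of the user
    return 12000.0  # made-up figure

house_cost = 150000.0  # made-up figure

print(total([house_cost, controlled(False, conservatory_cost)]), questions)
# 150000.0 []
print(total([house_cost, controlled(True, conservatory_cost)]), questions)
# 162000.0 ['size and quality']
```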

12. CAUSAL MODELS

(Section has no software action.)

Istar can be used to build certain types of causal model. A causal model tries to express what happens in some limited area of activity, such as machinery, electronics, biological systems, the weather, social systems, history and many more. Some are deterministic, some normative (that is, there are laws that pertain but entities are not forced to obey them). With a causal model we can predict what will happen, and a major use is simulation (e.g. simulate the weather, or simulate an electronic circuit to see whether it will malfunction, or simulate social development).

The more normative the model, the more 'fuzzy' the information about it and the less accurately we can predict or simulate. Deterministic models can often make use of numeric and boolean information while bayesians and probabilities are more useful for normative models.

As you might have realised, the predictive KB above was a simple kind of causal model. The causality was economic and logical in nature rather than physical, and perhaps social and ethical, but it was still a valid kind of causality. There are grounds in philosophy for seeing logical entailment as a form of causality. In the same way evaluation and selection can involve causality. That is why the same KB could be used, with minor alterations, for all these purposes.

But there is one purpose of a KB that does not normally make use of a causal model - diagnosis.

13. DIAGNOSING WHY IT WENT WRONG OR RIGHT

Diagnosis is finding out what went wrong or, more generally, which initial state resulted in the observed final state.

In most of the uses for KBs we have looked at so far (evaluation, understanding, prediction, etc.) the inference has followed and modelled the causality, so that input information is of some initial state or cause and output information is of an outcome. But in diagnosis inference and causality flow in opposite directions, so that input information is of the outcome and output information is about the initial state or cause. This gives a diagnostic KB a different flavour, though it employs exactly the same inference mechanisms.

We will construct a small KB that seeks to explain why a certain company share did not give us good profits, why it went down in value when we expected it to go up.

First, note that the causal KB we have used so far could in principle be used for diagnosis. To do so we would try all combinations of input information and see which ones gave the state we now observe of an unattractive share. (Try it if you like: run it several times and see which combinations give low belief that we should 'Buy it'.) But there are two problems:

So we need to build a different type of KB, in which the goal attributes on the right hand side are such things as 'Was not strong sector', 'Management had no vision', etc. and the input information on the left hand side includes what was our goal above (negated) 'Should not have bought it' and other information which we have not represented above.

# The first thing we can do is to try reversing the KB we have. Load 'R1'. Rather than reverse the actual KB we will create another below it, reversed. Move the easel down so there is space below it but so that the original KB can be seen. Squash some of its items up a bit to make room (and also to ensure that the main easel is active).

# Move the mouse over 'Buy it' and press the 'T' key. This selects item type to be the same type as the one under the mouse. It's usually more convenient than using the select item type panel, and here we will make good use of it.

# Place an item to the left hand side, corresponding to 'Buy it', but label it something like 'ShdNotHaveBought' with meaning like "We should not have bought this share".

# Now select the type for 'Will' using the 'T' key, and place a box to the right of 'ShdNotHaveBought' and above, labelled 'DidNotGrow' with suitable meaning. (Of course, in this KB most items are free bayesians, so strictly we don't have to keep using the 'T' key. But it is a good habit to get into.)

# Do similarly for 'More Profits', perhaps labelling it 'ProfitsFell'.

# (Tip: Notice the different style of labelling, without spaces. It is used here because the labels tend to be longer, but it could also be useful as a visual way of distinguishing between causal and diagnostic knowledge.)

# Link 'ShdNotHaveBought' as antecedent to both these. (Note: antecedence and consequence are reversed.)

# Then do similarly for all other items/attributes in the original KB, creating one below it that is the mirror image of it.

# Now, what do you notice about this KB? It has only one input variable, from which all the others are derived. So, however it is answered, there will be no real way of distinguishing between what are now the inference goals. We cannot tell whether it was lack of vision or weakness of the sector or any other factor that meant the shares did not do well. We must add some further information to help us do so.

# One way to do this is to focus on one of the factors that could have been a cause, such as 'SectorWasWeak' and ask "What else would weakness in the sector have led to?" For instance, if the sector as a whole was weak then other firms in that sector would also have been weak. So, if other firms also did badly then this is evidence that sector weakness was the problem, but if other firms did well then this is unlikely to have been the problem.

# Add a bayesian 'OthersDidBadlyToo' as antecedent of 'SectorWasWeak'.

# You may also want to set the LCO (Lower Cut Off) of 'SectorWasWeak'. Then, on its attribute list, select 'DidNotGrow' and click the 'R' button to bring up the relationship instance panel. Set the fourth weight figure to 60. This will prevent 'OthersDidBadlyToo' from being asked if in fact the company did grow.

# In the same way, ask "What else would have happened if the company did not grow?" e.g. The turnover would have stayed the same, or the number of employees would have stayed the same or reduced. Take one or both of these as evidence for 'DidNotGrow' in the same way.

# Note that instead of asking about belief in the negative statement 'TurnoverDidNotIncrease' you could ask the positive statement 'TurnoverIncreased' and link it to 'DidNotGrow' as negative evidence (that is, with a unary operator of Negate). You will come across many situations where it might be better to reverse a proposition and thus all its relationships (both antecedent and consequent).

# (Knowledge refinement: Instead of asking about turnover and employees separately, perhaps what we want is simply to ask "Did the company in fact grow during this period?")

# In the same way, by asking "What else could xxx have led to?" and linking what you come up with as antecedents of xxx, you can build a diagnostic KB. Be sensitive to the possibility that one of these observations might be caused by several factors in your KB, in which case it should be linked as antecedent to them all.

# To summarize, in this section we have:

14. MAKING YOUR KB MORE ACCURATE

So far most of the links in your KB have had the same weight and the a-priori settings have remained at 50%. This is obviously not true to real life. Some factors are more important than others. In this section we start to tailor some of the weights (strengths) of links and set the a-priori probabilities of your bayesian attributes.

# The a-priori probability is the foundation of bayesian evidence. It forms the base point, to which to add evidence as it is collected. It is the starting belief in the proposition represented by the bayesian, the belief you would have in it if you could obtain no evidence one way or another. Usually the a-priori is the statistical probability of the proposition being true for the class of situations in which you will use the KB.

# Load the 'R1' KB that you saved earlier.

# Bring up the 'Buy it' attribute details panel. Given 100 different companies, on average, for how many of them would you consider their shares attractive? 10%? 30%? 50%? 70%? Suppose your answer were 20%, then find the 'AP' slider gadget to the right of the main value and move it to 20. You have set the a-priori.

# Now ask yourself the equivalent question ("Given 100 different companies, on average how many of them will grow?" etc.) for all other bayesians in the KB, and adjust their a-prioris to suit.

# It's basically a simple operation to set the a-priori and with a little practice it becomes almost second nature. But there can be difficulties, especially when you start off.

# One difficulty is that you start to think, "Well, it depends ..." On what? It might, for example, depend on sector. If you can identify a factor on which this average depends then that factor is actually a piece of evidence that should be incorporated into your KB. It depends on sector? Then make the influence of sector a piece of evidence, and change your a-priori question to "Given 100 companies, when I take the average across all sectors, how many of them ...?"

# (Notice how asking the a-priori question sometimes helps in knowledge refinement by forcing hidden influences out into the open. But, in this case, you already have a 'Strong Sector' attribute.)

# 'Will grow' depends on sector, and that is already taken care of, via the item 'Strong sector'. But perhaps 'Vision for growth' also depends on sector, in that in certain sectors the culture of that sector is for growth while in others it is not. The I.T. sector tends to have a growth culture while the agricultural sector might not. The construction industry at present is still in recession. The sports sector seems to have a culture of growing. So let us take these four as an example. We will employ yet another facility of Istar, the Chooser inference method.

# What we want is a piece of evidence for 'Vision for growth' that depends on sector. Place a bayesian 'Sector vision' as antecedent to it. Bring up its attribute details panel and change its inference method to Chooser. The Chooser takes an integer (or an Enum or Ordinal) as its first antecedent, and then all the others should normally be the same type as the consequent. The Chooser first finds the value of the first antecedent. Then, if 1, it takes the first of the rest (i.e. the second antecedent), if 2 it takes the 2nd of the rest, and so on, and copies its value into the consequent. In this case the attributes from which it chooses will be sector constants.

# Select Integer item type and place an integer attribute as antecedent to 'Sector vision'. Label it 'Sector Id' with meaning "The identifier for the sector". Give it a question text of "To which sector does the company belong? 1 = I.T., 2 = Agriculture, 3 = Construction, 4 = Sports". (For a less clumsy method than this long question, see Enumerated Types below.)

# Now create four bayesian attributes as antecedents to 'Sector vision' and, by bringing up their Attribute Details Panels, make each a Constant. Then give each a name and value something like:

Try resetting and inferring 'Sector vision' to see what you obtain with different values of the Id.

# (If you wish, you could also change 'Strong sector' from being an input information to a Chooser controlled by 'Sector Id' and to take its value from a similar bank of constants. Then, by answering 'Sector Id', the user would automatically supply values to both 'Strong sector' and 'Sector vision'.)

# Now, don't forget to set the a-priori for 'Vision for growth' as an average across all sectors. If you expect about the same number of companies in each sector then you could just take the average of the sectors you are dealing with ((90 + 10 + 20 + 60) / 4 = 45%). The a-priori for 'Sector vision' can stay at 50% since it is merely a modifying factor.
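The Chooser's selection rule and the averaged a-priori can be sketched as follows. This is an illustrative Python sketch, not Istar's implementation. The function name `chooser` is hypothetical, and the pairing of the four figures with the four sectors (in the order of the 'Sector Id' question text) is an assumption made for the example.

```python
def chooser(selector, options):
    """Return the option picked by a 1-based selector.

    Mirrors the rule described above: 1 picks the first of the remaining
    antecedents, 2 the second, and so on.
    """
    if not 1 <= selector <= len(options):
        raise ValueError(f"selector {selector} is outside 1..{len(options)}")
    return options[selector - 1]


# Assumed sector constants (percent belief), in the order of the question text:
# 1 = I.T., 2 = Agriculture, 3 = Construction, 4 = Sports.
SECTOR_VISION = [90, 10, 20, 60]

sector_id = 3
print(chooser(sector_id, SECTOR_VISION))  # value copied into 'Sector vision'

# A-priori for 'Vision for growth' as a plain average over equally likely sectors.
a_priori = sum(SECTOR_VISION) / len(SECTOR_VISION)
print(a_priori)
```

Rejecting an out-of-range selector is a design choice of the sketch; it highlights why a free integer answer is a weak way to drive a Chooser.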

# (Tip: Don't worry too much about accuracy of a-prioris or weights; they are more tolerant than you might think.)

# We have dealt with a-prioris. Now to deal with weights of evidence. We will take as an example the item 'Will grow' and its antecedents 'Strong sector' and 'Vision for growth'. In doing so we deal with odds rather than probabilities. Odds and probabilities are linked as:

O = P / (1 - P)

and

P = O / (1 + O).

In Istar odds are held as two integers, numerator and denominator.

# Let us suppose you have set the a-priori for 'Will grow' to 25%. Its odds are therefore 0.25 / (1 - 0.25) = 0.25 / 0.75 = 1/3 ("one to three against").
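The two conversion formulas can be checked with a short sketch. It uses Python's `fractions.Fraction` so that odds stay as an exact numerator and denominator, in the spirit of the two integers mentioned above; the function names are hypothetical and this is not how Istar itself is coded.

```python
from fractions import Fraction


def prob_to_odds(p):
    """O = P / (1 - P), kept as an exact numerator/denominator pair."""
    p = Fraction(p)
    if not 0 <= p < 1:
        raise ValueError("probability must be in [0, 1)")
    return p / (1 - p)


def odds_to_prob(o):
    """P = O / (1 + O)."""
    o = Fraction(o)
    return o / (1 + o)


# The a-priori of 25% used for 'Will grow'.
odds = prob_to_odds(Fraction(25, 100))
print(odds.numerator, odds.denominator)
print(odds_to_prob(odds))
```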

# For each antecedent we ask two questions:

QT: "Suppose we know that the antecedent is completely true; then what would the belief in the consequent be?" and

QF: "Suppose we know that the antecedent is completely false; then what would the belief in the consequent be?"

Ask them of 'Strong sector': "Suppose we know for certain that it is a strong sector; what would our belief in 'Will grow' be?" and conversely. Suppose we find QT gives 75% (odds = 3/1) and QF gives 10% (odds = 1/9).

# This gives us the means to work out the weights on the link between them. QT provides the left hand pair of numbers on the relationship instance panel, and QF the right hand. Bring up the panel.

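# Each pair is an odds multiplier, such that when multiplied into the a-priori odds of the consequent it gives the answer to QT and QF respectively. Take the left hand pair and QT. The a-priori odds are 1/3 and QT gives odds of 3/1, so the multiplier must be (3/1) / (1/3) = 9/1.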

So alter the left hand pair to 9/1.

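# In a similar way, take the right hand pair and QF. QF gives odds of 1/9 and the a-priori odds are 1/3, so the multiplier must be (1/9) / (1/3) = 1/3. So alter the right hand pair to 1/3.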

Click on the 'OK' button.
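The same arithmetic can be repeated for any link with a small helper. This is a hedged sketch in Python, assuming only the rule stated above (multiplier times a-priori odds equals the odds of the QT or QF answer); the function names are hypothetical and percentages are taken as whole numbers.

```python
from fractions import Fraction


def to_odds(percent):
    """Convert a whole-number belief in percent to exact odds, O = P / (1 - P)."""
    p = Fraction(percent, 100)
    return p / (1 - p)


def odds_multiplier(a_priori_percent, answer_percent):
    """Multiplier m such that m * a-priori odds == odds of the QT or QF answer."""
    return to_odds(answer_percent) / to_odds(a_priori_percent)


# Figures from the text: a-priori 25%, QT 75%, QF 10%.
left = odds_multiplier(25, 75)
right = odds_multiplier(25, 10)
print(f"left pair  {left.numerator}/{left.denominator}")
print(f"right pair {right.numerator}/{right.denominator}")
```

Working with exact fractions avoids rounding surprises when the result has to be entered as a pair of integers.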

# Now process the other link weights in a similar manner:

# Save the KB as 'R3'.

# Now run the KB. You should start to find the results are more meaningful and interesting than before. (NOTE: At present there is a wee bug in the Chooser, in not resetting its antecedents. To overcome this, you might have to press Reset a couple of times.)

# To summarize, in this section we have:

15. MAKING YOUR KBS MORE USABLE

In this section we note some ways to make the KB more usable, easier to use, more friendly to the user.

# Enumerators

The Chooser was driven by an integer, which meant the question text had to be cumbersome. A more serious problem was that the user was free to enter any value. Future versions of Istar might have value checking on input for integers etc. but a better method is available now: use an Enumerated Type.

An enumerated type is an identifier of which option is selected from a set, e.g. which business sector. One is already available as standard: Weekdays. But we must create another.

It is often useful to create Enumerated types for the major identifiers your KB uses. Some will be Enums, some Ordinals. Ordinals differ from Enums in that the options are in some numeric order rather than being just options, e.g. low, medium and high.
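The distinction between an Enum and an Ordinal, and the benefit of restricting answers to a fixed set, can be illustrated outside Istar. The sketch below uses Python's standard `enum` module as an analogy only; the class names, the sector list and the `parse_sector` helper are hypothetical.

```python
from enum import Enum, IntEnum


class Sector(Enum):
    """An Enum: a set of options with no meaningful order."""
    IT = 1
    AGRICULTURE = 2
    CONSTRUCTION = 3
    SPORTS = 4


class Level(IntEnum):
    """An Ordinal: options that also have a numeric order."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def parse_sector(answer):
    """Accept only a valid sector name; anything else is rejected."""
    try:
        return Sector[answer.strip().upper()]
    except KeyError:
        raise ValueError(f"unknown sector: {answer!r}") from None


print(parse_sector("sports"))
print(Level.LOW < Level.HIGH)
```

Unlike a free integer, the user cannot supply a value outside the set, and the question no longer needs to spell out a numeric code for each option.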

# The order in which questions are asked

The order in which questions are asked is seldom important to the final results given by a KB but can make a lot of difference to the user. For instance it is sensible to have the major questions at the start and also to try to put questions relating to a given topic together. The order in which they are asked depends on the backward chaining process, the inference methods employed and the answers to previous questions. But there are several ways of controlling the order of asking.

# Suppressing irrelevant questions

Often certain questions are irrelevant to the flow of the run of a KB. For instance, for the I.T. sector it is irrelevant to ask about farming practices. There are various ways to prevent irrelevant questions being asked.

# The Chooser. This will backward chain only up the chosen route and will ignore the others.

# Bayesian cut-offs. Setting these will stop seeking input once it is certain that the value of the bayesian attribute either cannot exceed the lower cut-off or cannot be less than the upper cut-off, however the remaining unanswered antecedents are answered.

# Certain arithmetic and logical inference methods stop when an extreme answer is known, e.g. the process stops when one of the antecedents is zero in multiplication, when one is false in boolean AND, or when one is true in boolean OR.

# In the comparisons that give a boolean result, as soon as the comparison fails the process stops. So "A > the rest" will stop as soon as one of the rest is found to be >= A.
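The early stopping described for boolean AND and for multiplication can be sketched as lazy evaluation. This is an illustrative Python sketch, not Istar's inference engine: each antecedent is modelled as a function that "asks" only when called, and `question` is a hypothetical stand-in that records which questions were put.

```python
def lazy_and(antecedents):
    """Boolean AND that stops seeking values once one antecedent is false."""
    for get_value in antecedents:
        if not get_value():
            return False  # remaining antecedents are never asked
    return True


def lazy_product(antecedents):
    """Multiplication that stops as soon as one antecedent is zero."""
    result = 1
    for get_value in antecedents:
        result *= get_value()
        if result == 0:
            break  # nothing asked later can change the answer
    return result


asked = []


def question(name, value):
    """Hypothetical stand-in for asking the user: logs the name, returns a canned value."""
    def get_value():
        asked.append(name)
        return value
    return get_value


print(lazy_and([question("A", True), question("B", False), question("C", True)]))
print(asked)
```

In the run above, question C is never recorded, because the answer to B already fixes the result.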

# Forms

The normal method of obtaining information when running a KB in Istar is question by question, since that allows the session to be most responsive to the information given. But sometimes it is useful to have several related pieces of information all on the same screen.

For this, Istar offers a simple 'Forms' facility.

This facility is as yet undeveloped; new versions should appear later.

# Requiring less detail of the user

As it stands, our KB will ask us two things about the quality of management. As the KB develops we might find these split into several more, which can feel rather too detailed to the user. For this reason it is sometimes useful to first ask a more general question about the overall quality of management and derive these other factors from that. Then, only when detail is needed, ask these factors separately.

There are several ways of doing this. The easiest is to link all the management factors back to 'Good management' as their single antecedent, and give weights accordingly.

But that does not differentiate much: they will all rise and fall together. There are several ways to overcome this, such as having another item that asks whether the management's strengths are in marketing, production or finance, and sets the various factors accordingly.



Copyright (c) Andrew Basden 4 February 1998.

Updated: 12 June 1999: a couple of hrefs put right.