
Example 4: For Want of a Nail

July 28, 2017

Sometimes small details can have big consequences. There is a famous proverb that makes this point; it goes something like this:

For want of a nail the shoe was lost.
For want of a shoe the horse was lost.
For want of a horse the rider was lost.
For want of a rider the message was lost.
For want of a message the battle was lost.
For want of a battle the kingdom was lost.
And all for the want of a horseshoe nail.

But suppose you are an unfortunate farrier (a specialist who nails horseshoes to hooves) who did, indeed, recently attach a horseshoe with one fewer nail than is proper. And yes, indeed, the kingdom has subsequently collapsed. You might be wondering whether you are the one to blame. Given the observables of a missing nail and the collapse of the kingdom, you'd like to know the likelihood that the first is the cause of the second. Let's use Etcetera Abduction to answer this question.

Let's start with what we observe. The kingdom, which you know by its name "K", has definitely collapsed. There was some nail, which you don't know the name of, that you failed to set on a particular shoe, which you are happy to call "S" (for shoe).

;; The observables

(collapse K)
(missing_nail n S)

Notice that the nail is represented by the lowercase letter "n". The other arguments are all logical constants represented with uppercase letters: "K" is a particular known kingdom, and "S" is a particular known shoe. By representing "n" as a lowercase letter, we are signifying that "n" is a variable. Specifically, "n" is an existentially quantified variable. What we are saying here is that there exists some nail that is missing in the shoe "S", but we don't know its name.
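Written with an explicit quantifier - notation that will come up again when we discuss graphing below - this second observable says:

(exists (n) (missing_nail n S))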

This is in contrast to all of the variables that we've seen previously in our axioms. In those axioms, all of the variables are universally quantified. A universally quantified variable can take on any value, and the axiom will still be true.

For example, the axioms that encode the prior probabilities of our observables both have universally quantified variables in them.

;; Priors of the observables

(if (etc0_collapse 0.001 x) (collapse x))
(if (etc0_missing_nail 0.001 x y) (missing_nail x y))

Because "x" and "y" in these axioms are universally quantified variables, these axioms state that anytime these etc0 predicates are true for any entity in the domain, be it a kingdom "K" or a pepperoni pizza "P", then the consequent is true as well, e.g. the pepperoni pizza has collapsed. While it may be a bit disconcerting to be writing an axiom that might apply to a pepperoni pizza in this manner, remember that we're doing abductive reasoning here. These etc0 literals are never going to be observed directly - they can only be assumed as antecedents when trying to explain a consequent. As long as we don't observe collapsing pizzas in the world, the etc0_collapse literal will never be considered as its explanation.

If we execute our abduction program with these observables and these prior probability axioms, we get something interesting:

$ python -m etcabductionpy -i nail-collapse.lisp 
((etc0_collapse 0.001 K) (etc0_missing_nail 0.001 $1 S))
1 solutions.

There is only one solution to this interpretation problem so far, which is simply to assume the prior probabilities. But notice that "$1" appears in the "missing_nail" assumption. Etcetera Abduction has invented a name for the nail that was missing in shoe "S". Technically, this is called a "Skolem constant" after the Norwegian mathematician Thoralf Skolem. You can think of it this way: our abductive reasoning engine needed some way to refer to a specific entity in the world that didn't have a name, so it made up one by joining a dollar-sign "$" with a unique integer "1". If the engine had the occasion to refer to this entity elsewhere, it would have called it "$1" again. If instead it needed to talk about some other entity that was distinct but also unnamed, it might call it "$2" or "$3".
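As an aside, the probability of an interpretation in Etcetera Abduction is the product of the probabilities of its assumed etcetera literals (treating them as independent). For this solution, that is 0.001 * 0.001 = 0.000001, or one in a million. Keep this number in mind; it is the baseline that every alternative interpretation will have to beat.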

When we graph the best solution, another interesting thing happens:

[Proof graph: etc0_collapse 0.001 -> (collapse K); etc0_missing_nail 0.001 -> (missing_nail $1 S)]

Notice how only one of our two observables is plotted with a double line around the literal. Why is that?

The reason is that the literals that are entailed by our solution are not exactly the same as our observations. We said that we observed an "n", but the assumptions entail a "$1" instead. What's up with that?

This is because Etcetera Abduction doesn't keep track of the names of existential variables that you've provided in the observables, e.g. "n" in the example above. When it has finished its search for entailing assumptions, it changes all variables into Skolem constants - effectively giving a name to any entity whose name wasn't previously known, including those entities that originally appeared in the observables. Technically speaking, every set of assumptions does actually entail the given observations, but it requires a bit of extra reasoning to recognize that the existential variable in the observable is the same entity as the named Skolem constant. The graphing algorithm doesn't do this extra reasoning step, so it doesn't recognize that (missing_nail $1 S) entails (exists (n) (missing_nail n S)). Thus, no double line around the inference.

Confused? Don't worry about it. Just know that any time your observables have existential variables in them (lowercase letters), your graphs will be drawn without double lines around them, and those lowercase letters will be replaced with dollar-signs and numbers.

With that behind us, let's get down to this business of the collapse of the kingdom. It's true that kingdoms can be lost when a battle is lost. I guess that this happens about half the time when you have battling kingdoms.

;; For want of a battle the kingdom was lost
(if (and (battle_lost_to x y)
	 (etc1_collapse 0.5 x y))
    (collapse x))

(if (etc0_battle_lost_to 0.001 x y) (battle_lost_to x y))

This additional knowledge doesn't change our top interpretation, but now we have a second-best interpretation. Here, we've got to imagine that some kingdom exists that defeated "K" in battle; "$2" is the name given to it.

[Proof graph: etc0_battle_lost_to 0.001 -> (battle_lost_to K $2); (battle_lost_to K $2) and etc1_collapse 0.5 -> (collapse K); etc0_missing_nail 0.001 -> (missing_nail $1 S)]
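Why only second-best? Multiplying the probabilities of its assumed etcetera literals gives 0.001 * 0.5 * 0.001 = 0.0000005, exactly half the probability of assuming the two priors alone.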

Let's keep going. I estimate that missing battle messages lead to lost battles about half the time. Missing riders lead to missing messages about half the time. Missing horses lead to missing riders about half the time. Missing shoes lead to missing horses about half the time. And finally, missing nails lead to missing shoes about half the time. Let's encode these conditional probabilities and see what happens.

;; For want of a message the battle was lost
(if (and (missing_message_about m y)
	 (etc1_battle_lost_to 0.5 x y m))
    (battle_lost_to x y))

(if (etc0_missing_message_about 0.001 m y) (missing_message_about m y))

;; For want of a rider the message was lost
(if (and (missing_rider r m)
	 (etc1_missing_message_about 0.5 r m y))
    (missing_message_about m y))

(if (etc0_missing_rider 0.001 r m) (missing_rider r m))

;; For want of a horse the rider was lost
(if (and (missing_horse h r)
	 (etc1_missing_rider 0.5 h r m))
    (missing_rider r m))

(if (etc0_missing_horse 0.001 h r) (missing_horse h r))

;; For want of a shoe the horse was lost
(if (and (missing_shoe s h)
	 (etc1_missing_horse 0.5 h s r))
    (missing_horse h r))

(if (etc0_missing_shoe 0.001 s h) (missing_shoe s h))

;; For want of a nail the shoe was lost
(if (and (missing_nail n s)
	 (etc1_missing_shoe 0.5 n s h))
    (missing_shoe s h))

OK, now it's time for the moment of truth. Is your sloppy horseshoeing the cause of the collapse of the kingdom? Let's graph the top interpretation:

$ python -m etcabductionpy -i nail-collapse.lisp -g | util/dot2safari 
[Proof graph: etc0_collapse 0.001 -> (collapse K); etc0_missing_nail 0.001 -> (missing_nail $1 S)]

Phew! The best interpretation is still the priors. Hopefully that will help you sleep better at night.

Let's use the "grep" command to find interpretations further down the list where the missing nail plays a role in the collapse of the kingdom.

$ python -m etcabductionpy -i nail-collapse.lisp -a | grep "etc1_missing_shoe"
$

Uh oh. That is not good. There are no interpretations at all in which the missing nail plays a causal role. Did we mess up the axioms in some way? No, they are all syntactically correct. So what is the problem?

The problem is the length of the causal chain that links missing nails to the collapse of the kingdom. Beginning from the collapse, we need to backchain on 6 successive axioms to get to a missing nail, and one more to get to the prior probability of that missing nail. That is seven steps of backchaining, but the Etcetera Abduction engine has a default depth of only five steps. Given the default setting, the engine never considers explanations deeper than a missing horse, and therefore never makes the connection to the missing nail.
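Counting the links makes this concrete. Each arrow below is one step of backchaining, starting from the observed collapse:

(collapse K) <- (battle_lost_to K y) <- (missing_message_about m y) <- (missing_rider r m) <- (missing_horse h r) <- (missing_shoe s h) <- (missing_nail n s) <- (etc0_missing_nail 0.001 n s)

Seven arrows, two more than the default depth of five.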

What happens if we set the depth to seven, using the "--depth" (or "-d") flag, and graph the top result?

$ python -m etcabductionpy -d 7 -i nail-collapse.lisp -g | util/dot2safari
[Proof graph: etc0_missing_nail 0.001 -> (missing_nail $1 S) -> (missing_shoe S $4) -> (missing_horse $4 $5) -> (missing_rider $5 $2) -> (missing_message_about $2 $3) -> (battle_lost_to K $3) -> (collapse K), with an etc1 literal (0.5) also feeding each link in the chain]

Oh no! The collapse of the kingdom really was your fault! The best interpretation indicates that the missing nail, the lost shoe "S", some horse "$4", some rider "$5", and some message "$2" about rival kingdom "$3" all led to the collapse of kingdom "K".
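A quick product of the etcetera literals shows why this interpretation now comes out on top: the causal chain assumes one prior and six conditionals, for a probability of 0.001 * 0.5^6 = 0.0000156 or so, more than fifteen times the 0.000001 we get from assuming the two priors independently.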

Now, before you wander off in self-pity about your failings as a farrier, you might want to consider whether our probability estimates are actually reasonable. In this example, we set all of the prior probabilities to 0.001 (very unlikely), but set all of the conditional probabilities to 0.5 (coin-toss likely). As it stands now, we have the conditional probability of the collapse of the kingdom given a missing nail at around 0.015625 (0.5 ^ 6), or 1/64. That actually seems really high, when you think about all of the other crummy farriers in the kingdom, and how sloppy they are with their nails. If skipping a nail really led to the collapse of the kingdom 1 in 64 times, the kingdom would never have lasted more than a single day. Something must be wrong with our numbers.

Indeed, the conditional probabilities in our axioms are way, way too high. The reasons are subtle, though, and require us to think very carefully about the universally quantified variables in our axioms. Let's take a look at one of our axioms, and consider exactly what it is that we've said about its conditional probability:

;; For want of a nail the shoe was lost
(if (and (missing_nail n s)
	 (etc1_missing_shoe 0.5 n s h))
    (missing_shoe s h))

On the surface, this looks fine. We're trying to say that if a horseshoe is missing a nail, it's going to fall off about half of the time. But that is not precisely what this axiom is saying. Remember, all of the variables in these axioms are universally quantified. That means the axiom is true for any assignment of these variables to any entities in the world of our problem.

A good technique for thinking about this axiom is to substitute imagined constants for the variables, and see if the numbers still make sense. Here's what it would look like if we were talking about a specific nail "NAIL9", a specific shoe "SHOE14", and a specific horse "HORSE93":

;; For want of NAIL9, SHOE14 was lost on HORSE93
(if (and (missing_nail NAIL9 SHOE14)
	 (etc1_missing_shoe 0.5 NAIL9 SHOE14 HORSE93))
    (missing_shoe SHOE14 HORSE93))

Now, let's consider whether our conditional probability estimate is still reasonable. What is the conditional probability of "HORSE93" missing "SHOE14", given that "SHOE14" is missing "NAIL9"? Sure, shoes without nails fail half the time, but what is the chance that this particular shoe ended up on this particular horse?

Let's further estimate that there are at least 1,000 horses in this kingdom, and that each one of them has four hooves, and each hoof takes a shoe with eight nails. The conditional probability that we're looking for is the probability of seeing a particular shoe missing from a particular horse, given that a particular nail is missing from that very shoe:

Pr[(missing_shoe SHOE14 HORSE93) | (missing_nail NAIL9 SHOE14)]

Put this way, we can see that we must factor in the likelihood that this particular crummy shoe-job was applied to this particular horse, out of all 1,000 of the horses in the kingdom (1/1000), and multiply that by the likelihood of losing the shoe due to the missing nail (1/2). The likelihood of this particular horse missing this particular shoe, given that the shoe is missing a nail, is closer to 1/2000, or 0.0005. And because this conditional probability is the same regardless of which particular nail, shoe, and horse we're concerned with, we should change our original axiom to reflect the more accurate estimate:

;; For want of a nail the shoe was lost
(if (and (missing_nail n s)
	 (etc1_missing_shoe 0.0005 n s h))
    (missing_shoe s h))
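With this one change, the probability of the full causal chain drops to 0.001 * 0.0005 * 0.5^5 = 0.0000000156 or so, far below the 0.000001 of simply assuming the two priors.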

We should really fix the numbers in the other axioms via a similar analysis, but even with this change we have a new order in our interpretations:

$ python -m etcabductionpy -d 7 -i nail-collapse.lisp -a
((etc0_missing_nail 0.001 $1 S) (etc1_battle_lost_to 0.5 K $3 $2) (etc1_collapse 0.5 K $3) (etc1_missing_horse 0.5 $4 S $5) (etc1_missing_message_about 0.5 $5 $2 $3) (etc1_missing_rider 0.5 $4 $5 $2) (etc1_missing_shoe 0.5 $1 S $4))
((etc0_collapse 0.001 K) (etc0_missing_nail 0.001 $1 S))
((etc0_battle_lost_to 0.001 K $2) (etc0_missing_nail 0.001 $1 S) (etc1_collapse 0.5 K $2))
((etc0_missing_message_about 0.001 $2 $3) (etc0_missing_nail 0.001 $1 S) (etc1_battle_lost_to 0.5 K $3 $2) (etc1_collapse 0.5 K $3))
((etc0_missing_nail 0.001 $1 S) (etc0_missing_rider 0.001 $4 $2) (etc1_battle_lost_to 0.5 K $3 $2) (etc1_collapse 0.5 K $3) (etc1_missing_message_about 0.5 $4 $2 $3))
((etc0_missing_horse 0.001 $4 $5) (etc0_missing_nail 0.001 $1 S) (etc1_battle_lost_to 0.5 K $3 $2) (etc1_collapse 0.5 K $3) (etc1_missing_message_about 0.5 $5 $2 $3) (etc1_missing_rider 0.5 $4 $5 $2))
((etc0_missing_nail 0.001 $1 S) (etc0_missing_shoe 0.001 $6 $4) (etc1_battle_lost_to 0.5 K $3 $2) (etc1_collapse 0.5 K $3) (etc1_missing_horse 0.5 $4 $6 $5) (etc1_missing_message_about 0.5 $5 $2 $3) (etc1_missing_rider 0.5 $4 $5 $2))
((etc0_missing_nail 0.001 $1 S) (etc0_missing_nail 0.001 $6 $7) (etc1_battle_lost_to 0.5 K $3 $2) (etc1_collapse 0.5 K $3) (etc1_missing_horse 0.5 $4 $7 $5) (etc1_missing_message_about 0.5 $5 $2 $3) (etc1_missing_rider 0.5 $4 $5 $2) (etc1_missing_shoe 0.5 $6 $7 $4))
8 solutions.

And here is the graph of the most likely interpretation.

[Proof graph: etc0_collapse 0.001 -> (collapse K); etc0_missing_nail 0.001 -> (missing_nail $1 S)]

Congratulations! It likely wasn't your crummy job as a farrier that led to the kingdom's collapse. Kingdoms collapse for unknown reasons, and nails fail to be set into horseshoes for unknown reasons. A causal connection between the two is rather unlikely. In fact, it's nearly the least likely interpretation that you can imagine, given the knowledge that you have. The only interpretation less likely is that some other missing nail altogether, other than the one you observed yourself, is to blame.

The point of this example is to encourage you to be careful with your conditional probabilities, especially when you are working with variables that appear only in the etcetera literal and the consequent. You are much better off avoiding such axioms altogether.

Fortunately, Etcetera Abduction is perfectly capable of helping you represent the uncertainty about whether the horseshoe with the missing nail ended up on a particular horse. All we need to do is add this assumption to the antecedents of our axiom, and make its probability explicit as a prior:

;; Alternative formulation of missing shoe axiom

(if (and (missing_nail n s)
	 (shoe_on_horse s h)
	 (etc2_missing_shoe 0.5 n s h))
    (missing_shoe s h))

(if (etc0_shoe_on_horse 0.001 s h) (shoe_on_horse s h))

Represented this way, the shoe-on-horse assumption is made explicit, as illustrated in the following (low-probability) interpretation.

[Proof graph: etc0_missing_nail 0.001 -> (missing_nail $1 S); etc0_shoe_on_horse 0.001 -> (shoe_on_horse S $4); both, together with etc2_missing_shoe 0.5, -> (missing_shoe S $4) -> (missing_horse $4 $5) -> (missing_rider $5 $2) -> (missing_message_about $2 $3) -> (battle_lost_to K $3) -> (collapse K), with an etc1 literal (0.5) also feeding each later link]
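Interestingly, the product of etcetera literals here is 0.001 * 0.001 * 0.5^6 = 0.0000000156 or so - exactly the same probability we computed when we folded the 1/1000 shoe-on-horse likelihood into the 0.0005 conditional. Nothing is lost by breaking the assumption out as an explicit prior, and the interpretation now shows it plainly.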

It might be a good exercise for you to rewrite all of the axioms in this manner. This way, you can make explicit the likelihoods that a given rider was on the horse with the missing shoe, that a given message about the enemy was being delivered by that rider, and so forth, as in the sketch below.
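For instance, here is a sketch of what a rewritten rider axiom might look like. Note that the "rider_on_horse" predicate and its etcetera literals are hypothetical names invented for this sketch, following the pattern of the shoe-on-horse axiom above:

;; Alternative formulation of missing rider axiom (a sketch)
(if (and (missing_horse h r)
	 (rider_on_horse r h)
	 (etc2_missing_rider 0.5 h r m))
    (missing_rider r m))

(if (etc0_rider_on_horse 0.001 r h) (rider_on_horse r h))

Good luck!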