
Example 6: Man Bites Dog

April 21, 2018

Imagine that you open up your neighborhood newsletter and read the following headline:

Man Bites Dog!

You do a double-take. That can't be right. Surely, they meant to write "Dog bites man." Was this an editorial failure?

Your mental anguish over this headline stems from a conflict between syntax and semantics. Or put another way, the order of these three words is promoting an interpretation that is different than the one you would favor if these words were unordered. That is probably a good way to think about syntax, in general: our linguistic capacity for organizing words into syntactic structures based on their order serves to promote some interpretations of the words over others.

In the examples on this page, we are going to use Etcetera Abduction to interpret the meaning of natural language words, first by ignoring their order. We'll see that syntax is superfluous in some cases, and that it is entirely possible to construct structured representations of the meaning of sentences by ignoring syntax altogether - but sometimes we'll get it wrong, as in the simple example "Man Bites Dog."

To begin, we need some way to represent an unordered set of words as our "observables." One way to do this is to have each word be its own literal, each represented as a predicate with no arguments. Technically, we would call this a "propositional logic" representation.

;; the observables

(man)
(bites)
(dog)

The important thing to note here is that the input observations in our interpretation problem are the words of the sentence. In Etcetera Abduction, we are searching for the most probable set of assumptions that logically entail the observations. So we're looking for the most probable set of assumptions that logically entail these words. The interpretation of this sentence, then, is a set of assumptions. The "meaning" of the sentence explains why these words are observed.

For this particular trio of words, there are a variety of interpretations that account for their co-occurrence. At least one of them, however, posits some "biting" event, where some man and some dog are both participants. To encode this meaning, we'll need first-order logic. Indeed, the concepts of biting, man, and dog are different literals than the ones we have articulated as observations. The words can be represented as zero-argument propositions, but the concepts that explain them are going to be relational.

Here I provide the core concepts that explain the words - namely a man, a dog, and a biting event - each encoded using eventuality notation.

;; why word bites? maybe bite'

(if (and (bite' e x y)
	 (etc1_word_bites 0.1 e x y))
    (bites))

;; why bites'? why not!

(if (etc0_bites 0.1 e x y) (bite' e x y))

;; why word man? maybe a man

(if (and (man' e x)
	 (etc1_word_man 0.1 e x))
    (man))

;; why man'? why not!

(if (etc0_man 0.1 e x) (man' e x))

;; why word dog? maybe a dog

(if (and (dog' e x)
	 (etc1_word_dog 0.1 e x))
    (dog))

;; why dog'? why not! 

(if (etc0_dog 0.1 e x) (dog' e x))
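
If you collect the three observables and these six axioms into a single input file (I'll call it man-bites-dog.lisp here; the filename is just my own label), you can graph the most probable interpretation with the same -i and -g flags used in the command later in this post:

$ python -m etcabductionpy -i man-bites-dog.lisp -g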

With these six axioms in place, we get the beginning of a structured representation of three propositional observations:

[Proof graph: the assumptions (bite' $4 $5 $3), (dog' $2 $1), and (man' $6 $7), each made via its etc0 prior (0.1), combine with the etc1_word assumptions (0.1 each) to entail the observed words (bites), (dog), and (man).]

So far we have three new eventualities ($4, $2, and $6), one agent of biting ($5), one patient of biting ($3), one dog ($1), and one man ($7). What we would rather have is a more parsimonious interpretation, where the entity doing the biting is either the dog or the man ($5 unifies with either $1 or $7), with the thing bitten being the other one. Facilitating these unifications will be the central goal of the axioms we write next, so that the best interpretation is a fully connected graph.

For our first attempt, let's add some commonsense knowledge about dogs and men. One of the big fears of many dog owners is that their dog might bite someone. It's one of the things dogs occasionally do, and is the reason that people want to see dogs on short leashes. The following three axioms encode this bit of commonsense knowledge.

;; why a dog? maybe a dog biting a person

(if (and (person' e1 a)
	 (bite' e2 d a)
	 (etc1_dog 0.2 e e1 e2 d a))
    (dog' e d))

;; why person? why not!

(if (etc0_person 0.1 e x) (person' e x))

;; why person? maybe a man

(if (and (man' e1 x)
	 (etc1_person 1.0 e e1 x))
    (person' e x))

It is the first of these three axioms that encodes the core bit of commonsense knowledge: when a person gets bitten, it's possible that a dog was to blame. Or put another way, when you observe a dog, it might be engaged in some person-biting. The second axiom gives a prior for persons. The third axiom is somewhat interesting, though. It states that whenever you have a man, that man is always a person (probability of 1.0). This axiom encodes a simple bit of taxonomic knowledge. When forward-chaining on this axiom, we ascend the taxonomy: a man is a type of person. When backward-chaining on it, we descend this same taxonomy: if it's a person, it might be a man.

With these three axioms in place, we have a new most-probable interpretation, where we envision some dog ($6) and some man ($4), where it is the dog that is biting the man.

[Proof graph: (bite' $5 $6 $4) and (man' $1 $4) are assumed via their priors (0.1 each); (man' $1 $4) entails (person' $3 $4) via etc1_person (1.0), which together with the biting entails (dog' $2 $6) via etc1_dog (0.2); these literals, with the etc1_word assumptions, entail the observed words (man), (bites), and (dog).]
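
Why does this interpretation beat the unconnected one? In Etcetera Abduction, an interpretation's probability is the product of the probabilities of its assumed etcetera literals (treated as independent). Multiplying the numbers from the two interpretations we have seen so far:

  unconnected interpretation:    0.1 x 0.1 x 0.1 x 0.1 x 0.1 x 0.1        = 1.0e-06
  dog-bites-man interpretation:  0.1 x 0.1 x 0.1 x 0.1 x 0.1 x 0.2 x 1.0  = 2.0e-06

Even though the new interpretation makes seven assumptions rather than six, trading the dog's prior (etc0_dog, 0.1) for the person-biting explanation (etc1_dog, 0.2) and the free taxonomic step (etc1_person, 1.0) doubles the joint probability.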

Not bad. We successfully constructed a structured first-order representation that captures the meaning of a sentence of three (unordered) words.

Back in the 1970s and 1980s, there were a number of natural language processing researchers who thought that syntax (derived from word order) was overrated as a focus of research effort, and that natural language understanding was better approached as a commonsense reasoning problem. The example above provides at least one approach toward deep interpretation without using syntax. You might apply the same approach to understand stories that have no syntax whatsoever, like: "Driving. Night. Raining. Curve. Crash. Coma." Also, this approach seems useful in understanding nonstandard grammatical constructions, e.g., coercing a non-transitive verb into a transitive one, as in "The dog barked the cat up the tree."

Of course the main problem here is that we got the wrong interpretation! The headline was "Man Bites Dog!" Reading this headline does indeed conjure up some vision of some man and some dog, but it is the man that is doing the biting, not the dog. How do we get the right interpretation?

The second thing we can try is to simply provide more commonsense knowledge. Just like we know that dogs are prone to biting people, we know that people are prone to biting food. Indeed, people bite into food around three meals a day. If we can envision the dog as a type of food, then we can get the right interpretation. Here are three axioms that get the job done:

;; why man'? Maybe eating food

(if (and (food' e1 y)
	 (bite' e2 x y)
	 (etc1_man 0.2 e e1 e2 x y))
    (man' e x))

;; why food? why not!

(if (etc0_food 0.1 e x) (food' e x))

;; why food? maybe a dog

(if (and (dog' e1 x)
	 (etc2_dog 0.9 e e1 x))
    (food' e x))

Adding these axioms opens up some new interpretations of our three words. Still, the "correct" interpretation that we sought is only #2 in the list of most probable interpretations. We use the "--solution" (or "-s") flag when we want to graph solutions further down the list, as follows:

$ python -m etcabductionpy -i man-bites-dog-v1.lisp -g -s 2
[Proof graph, solution #2: (bite' $5 $6 $4) and (dog' $3 $4) are assumed via their priors (0.1 each); the dog is construed as food (food' $2 $4) via etc2_dog (0.9), which together with the biting entails (man' $1 $6) via etc1_man (0.2); these literals, with the etc1_word assumptions, entail the observed words.]
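
To see why this reading only comes in second, multiply the probabilities of the assumed etcetera literals in each of the top two interpretations:

  dog bites man (via person):  0.1 x 0.1 x 0.1 x 0.1 x 0.1 x 0.2 x 1.0  = 2.0e-06
  man bites dog (via food):    0.1 x 0.1 x 0.1 x 0.1 x 0.1 x 0.2 x 0.9  = 1.8e-06

The only numerical difference is the certain man-is-a-person step (1.0) versus the 0.9 dog-is-food step, so the dog-biting interpretation still narrowly wins.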

It is not a totally outlandish idea; there are certainly some countries in the world where dogs are eaten as a type of food. Still, the likelihood that a given dog is food is probably not 90 percent, as we stated in our axiom, above. The whole approach seems somewhat wonky, though. There must be an easier way to have the man be the agent of the bite event.

Version 2: The sequence "man bites dog"

In natural language, the order of the words provides you with a lot of information that points you toward the intended interpretation. It is from the order of our three words that we can infer that the man is the agent of the bite, and the dog is the thing that was bitten.

To get Etcetera Abduction to do the right thing, we need to make this ordering part of the observation. While our simple no-argument propositions were good enough to represent an unordered set of input words, specifying both a word and its order requires first-order logic. There are lots of representational options, but one simple way is to posit three constants, W1, W2, and W3, one for each of the three words. Then, the labels for the different words and their sequential order can be predications on these constants, as follows:

;; "man bites dog" version 2

;; The observables

(man W1)
(bites W2)
(dog W3)
(seq W1 W2 W3)

Just as we did in version 1, we'll need axioms that bridge the gap between the words and the entities that they refer to. This time, however, we can make explicit that these words refer to these imagined entities, using a "ref" relation.

;; why word man? maybe referring to a man

(if (and (man' e x)
	 (ref w x)
	 (etc1_word_man 0.1 e w x))
    (man w))

;; why man'? why not!

(if (etc0_man 0.1 e x) (man' e x))

;; why word bites? Maybe referring to a biting eventuality

(if (and (bite' e x y)
	 (ref w e)
	 (etc1_word_bites 0.1 e x y w))
    (bites w))

;; why bites'? why not!

(if (etc0_bite 0.1 e x y) (bite' e x y))

;; why word dog? Maybe referring to a dog

(if (and (dog' e d)
	 (ref w d)
	 (etc1_word_dog 0.1 e w d))
    (dog w))

;; why dog'? why not!

(if (etc0_dog 0.1 e x) (dog' e x))

;; why ref? Why not!

(if (etc0_ref 0.1 w c) (ref w c))
      

If we were to ignore the sequence literal (seq W1 W2 W3), we could use these axioms alone to get an unconnected interpretation that imagines a man, a dog, and a biting eventuality, looking like this:

[Proof graph: (bite' $4 $5 $2), (dog' $7 $6), and (man' $1 $3) are assumed via their priors, along with (ref W1 $3), (ref W2 $4), and (ref W3 $6); these entail the observed words (man W1), (bites W2), and (dog W3), but the man, the dog, and the biting event remain unconnected to one another.]

But we don't want to ignore the sequence literal - that is where all the important information lies. To exploit it, we need a bit of syntactic knowledge. Actually, we need two parts:

  1. A sequence of three words might be an expression involving a monotransitive verb, i.e., where an agent does something to a single patient, where the first word refers to the agent, the second to the verb, and the third to the patient, and
  2. The three-argument eventuality literal bite' maps directly to a monotransitive expression.

Here's how I would express these two bits of syntactic knowledge:

;; why seq? maybe monotransitive construction

(if (and (monotransitive e x y)
	 (ref s1 x)
	 (ref s2 e)
	 (ref s3 y)
	 (etc1_seq 0.1 e x y s1 s2 s3))
    (seq s1 s2 s3))

;; why monotransitive? maybe bite'

(if (and (bite' e x y)
	 (etc1_monotransitive 0.1 e x y))
    (monotransitive e x y))

Adding back in our (seq W1 W2 W3) literal as an input, the best interpretation now makes all of the right unifications, leading us to interpret this input as meaning a man is doing the biting, and the dog is the thing that is bitten.

[Proof graph: (bite' $1 $4 $3) is assumed, with the man ($4) as the biter and the dog ($3) as the thing bitten; (ref W1 $4), (ref W2 $1), and (ref W3 $3) link the words to these entities; the biting entails (monotransitive $1 $4 $3), which together with the ref literals entails (seq W1 W2 W3), and all of the word observations are entailed as well.]
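
To reproduce this graph yourself, you can run the version 2 observables and axioms through the same command as before (man-bites-dog-v2.lisp is, again, just my own name for a file containing this version):

$ python -m etcabductionpy -i man-bites-dog-v2.lisp -g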

Great! But that seemed like a lot of work for a pretty simple semantic role labeling problem. Wouldn't it be a lot easier just to apply the Stanford Parser to this text?

Yes it would! Indeed, if you apply the Stanford Parser to this three-word sequence, you'll get an output that is essentially equivalent to what we've done here using abduction, particularly when you look at the Universal dependencies section.

Stanford Parser

Your query:

  man bites dog

Tagging:
  
  man/NN bites/VBZ dog/NN

Parse:

  (ROOT
    (S
       (NP (NN man))
         (VP (VBZ bites)
           (NP (NN dog)))))

Universal dependencies:

  nsubj(bites-2, man-1)
  root(ROOT-0, bites-2)
  dobj(bites-2, dog-3)

Both the Stanford Parser and our Version 2 are finding structured representations that explain the ordered sequence of words, using very different methods. Still, there are some analogies to be drawn between the two algorithmic approaches, particularly around how the search is conducted.

If you were really ambitious, you could probably write a pretty good syntactic parser using Etcetera Abduction. The advantage of doing so is that you could provide an integrated account of language understanding that included syntax, semantics, and pragmatics all under the umbrella of logical abduction. This idea was one of the great achievements of Prof. Jerry Hobbs and his colleagues at SRI back in the 1990s, immortalized in their famous NLP paper "Interpretation as Abduction" (Hobbs, Stickel, Appelt, and Martin, Artificial Intelligence, 1993).

But since then, data-driven statistical parsers like Stanford's have gotten really good at the syntax part of the puzzle. So much so that even the few researchers who are really interested in the idea of "interpretation as abduction" will opt to use a high-performance statistical parser at the front end of the language interpretation pipeline. The current favorites among language-logicians are probably the Combinatory Categorial Grammar parsers coming out of the University of Edinburgh, but people have also had some luck converting the Stanford Parser's Universal Dependencies into a usable logical form for use as input literals. Give it a try!