2.3. Full clausal logic#
Relational logic extends propositional logic by means of the logical variable, which enables us to talk about arbitrary un-named individuals. However, consider the following statement:
Everybody loves somebody.
The only way to express this statement in relational clausal logic is by explicitly listing every pair of persons such that the first loves the second, e.g.
loves(peter,peter).
loves(anna,paul).
loves(paul,anna).
First of all, this is not a precise translation of the above statement into logic, because it is too explicit (e.g. the fact that Peter loves himself does not follow from the original statement). Secondly, this translation works only for finite domains, while the original statement also allows infinite domains. Many interesting domains are infinite, such as the set of natural numbers. Full clausal logic allows us to reason about infinite domains by introducing more complex terms besides constants and variables. The above statement translates into full clausal logic as
loves(X,person_loved_by(X)).
The fact loves(peter,person_loved_by(peter))
is a logical consequence of this clause. Since we know that everybody loves somebody, there must exist someone whom Peter loves. We have given this person the abstract name
person_loved_by(peter)
without explicitly stating whom it is that Peter loves. As we will see, this way of composing complex names from simple names also gives us the possibility to reflect the structure of the domain in our logical formulas.
Translate to clausal logic:
every mouse has a tail;
somebody loves everybody;
every two numbers have a maximum.
Syntax#
A term is either simple or complex. Constants and variables are simple terms. A complex term is a functor (which follows the same notational conventions as constants and predicates) followed by a number of terms, enclosed in brackets and separated by commas, e.g. eldest_child_of(anna,paul). The terms between brackets are called the arguments of the functor, and the number of arguments is the functor’s arity. Again, a ground term is a term without variables. All the other definitions (atom, clause, literal, program) are the same as for relational clausal logic.
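To illustrate these definitions, here is a small sketch (the queries are ours, assuming a Prolog system that provides the standard built-ins functor/3 and ground/1) inspecting the functor, arity and groundness of a complex term:
% the functor and arity of a complex term
?- functor(eldest_child_of(anna,paul),F,Arity).
F = eldest_child_of,
Arity = 2.

% a term is ground if it contains no variables
?- ground(eldest_child_of(anna,paul)).
true.

?- ground(person_loved_by(X)).
false.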
Unification vs. evaluation#
Functors should not be confused with mathematical functions. Although both can be viewed as mappings from objects to objects, an expression containing a functor is not evaluated to determine the value of the mapping, as in mathematics. Rather, the outcome of the mapping is a name, which is determined by unification. For instance, given the complex term person_loved_by(X), if we want to know the name of the object to which Peter is mapped, we unify X with peter to get person_loved_by(peter); this ground term is not evaluated any further.
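In a Prolog system this behaviour can be observed directly (a small sketch; the query is ours): the answer binds the variable to the complex name, which is not evaluated any further.
loves(X,person_loved_by(X)).

?- loves(peter,Somebody).
Somebody = person_loved_by(peter).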
This approach has the disadvantage that we introduce different names for individuals that might turn out to be identical, e.g. person_loved_by(peter) might be the same as peter. Consequently, reasoning about equality (of different names for the same object) is a problem in clausal logic. Several possible solutions exist, but they fall outside the scope of this book.
Semantics#
Although there is no syntactic difference in full clausal logic between terms and atoms, their meaning and use are totally different, a fact which should be adequately reflected in the semantics. A term always denotes an individual from the domain, while an atom denotes a proposition about individuals, which can get a truth value. Consequently, we must change the definition of the Herbrand universe in order to accommodate complex terms: given a program \(P\), the Herbrand universe is the set of ground terms that can be constructed from the constants and functors in \(P\) (if \(P\) contains no constants, choose an arbitrary one). For instance, let \(P\) be the program
plus(0,X,X).
plus(s(X),Y,s(Z)):-plus(X,Y,Z).
then the Herbrand universe of \(P\) is { 0, s(0), s(s(0)), s(s(s(0))), ... }. Thus, as soon as a program contains a functor, the Herbrand universe (the set of individuals we can reason about) is an infinite set.
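As a sketch (the predicate name herbrand_term is ours), this Herbrand universe can be enumerated by a small Prolog program that builds ground terms from the constant 0 and the functor s:
herbrand_term(0).
herbrand_term(s(T)):-herbrand_term(T).

?- herbrand_term(T).
T = 0 ;
T = s(0) ;
T = s(s(0)) ;
...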
Determine the Herbrand universe of the following program:
listlength([],0).
listlength([_X|Y],s(L)):-listlength(Y,L).
(Hint: recall that [] is a constant, and that [X|Y] is an alternative notation for the complex term .(X,Y) with binary functor ‘.’!)
The Herbrand base of \(P\) remains the set of ground atoms that can be constructed using the predicates in \(P\) and the ground terms in the Herbrand universe. For the above program, the Herbrand base is
{ plus(0,0,0), plus(s(0),0,0), ...,
plus(0,s(0),0), plus(s(0),s(0),0), ...,
...,
plus(s(0),s(s(0)),s(s(s(0)))), ... }
As before, a Herbrand interpretation is a subset of the Herbrand base, whose elements are assigned the truth value true. For instance,
{ plus(0,0,0), plus(s(0),0,s(0)), plus(0,s(0),s(0)) }
is an interpretation of the above program.
Is this interpretation also a model of the program? As in the propositional case, we define an interpretation to be a model of a program if it is a model of every ground instance of every clause in the program. But since the Herbrand universe is infinite, there are infinitely many grounding substitutions, hence we must generate the ground clauses in a systematic way, e.g.
plus(0,0,0).
plus(s(0),0,s(0)):-plus(0,0,0).
plus(s(s(0)),0,s(s(0))):-plus(s(0),0,s(0)).
plus(s(s(s(0))),0,s(s(s(0)))):-plus(s(s(0)),0,s(s(0))).
...
plus(0,s(0),s(0)).
plus(s(0),s(0),s(s(0))):-plus(0,s(0),s(0)).
plus(s(s(0)),s(0),s(s(s(0)))):-plus(s(0),s(0),s(s(0))).
...
plus(0,s(s(0)),s(s(0))).
plus(s(0),s(s(0)),s(s(s(0)))):-plus(0,s(s(0)),s(s(0))).
plus(s(s(0)),s(s(0)),s(s(s(s(0))))):-plus(s(0),s(s(0)),s(s(s(0)))).
...
Now we can reason as follows: according to the first ground clause, plus(0,0,0) must be in any model; but then the second ground clause requires that plus(s(0),0,s(0)) must be in any model, the third ground clause requires plus(s(s(0)),0,s(s(0))) to be in any model, and so on. Likewise, the second group of ground clauses demands that
plus(0,s(0),s(0))
plus(s(0),s(0),s(s(0)))
plus(s(s(0)),s(0),s(s(s(0))))
...
are in the model; the third group of ground clauses requires that
plus(0,s(s(0)),s(s(0)))
plus(s(0),s(s(0)),s(s(s(0))))
plus(s(s(0)),s(s(0)),s(s(s(s(0)))))
...
are in the model, and so forth.
In other words, every model of this program is necessarily infinite. Moreover, as you should have guessed by now, it contains every ground atom such that the number of s’s in the third argument equals the number of s’s in the first argument plus the number of s’s in the second argument. The way we generated this infinite model is particularly interesting, because it is essentially what was called the naive proof method in the relational case: generate all possible ground instances of program clauses by applying every possible grounding substitution, and then apply (propositional) resolution as long as you can. While, in the case of relational clausal logic, there inevitably comes a point where applying resolution will not give any new results (i.e. you reach a fixpoint), in the case of full clausal logic with infinite Herbrand universe you can go on applying resolution forever. On the other hand, as we saw above, we get a clear idea of what the infinite model[1] we’re constructing looks like, which means that it is still a fixpoint in some sense. There are mathematical techniques to deal with such infinitary fixpoints, but we will not dive into this subject here.
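The following sketch (the predicate names s_count and in_model are ours) makes this characterisation executable: it counts the s’s in each argument of a ground atom and checks whether the third count equals the sum of the first two.
s_count(0,0).
s_count(s(T),N):-s_count(T,M), N is M+1.

in_model(plus(X,Y,Z)):-
    s_count(X,NX), s_count(Y,NY), s_count(Z,NZ),
    NZ =:= NX+NY.

?- in_model(plus(s(0),s(s(0)),s(s(s(0))))).
true.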
Although the introduction of only a single functor already results in an infinite Herbrand universe, models are not necessarily infinite. Consider the following program:
reachable(oxford,charing_cross,piccadilly).
reachable(X,Y,route(Z,R)):-
    connected(X,Z,_L),
    reachable(Z,Y,R).
connected(bond_street,oxford,central).
with intended meaning ‘Charing Cross is reachable from Oxford Circus via Piccadilly Circus’, ‘if X is connected to Z by line L and Y is reachable from Z via R then Y is reachable from X via a route consisting of Z and R’ and ‘Bond Street is connected to Oxford Circus by the Central line’. The minimal model of this program is the finite set
{ connected(bond_street,oxford,central),
reachable(oxford,charing_cross,piccadilly),
reachable(bond_street,charing_cross,route(oxford,piccadilly)) }
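For instance, querying the program in a Prolog system (the query is ours) yields the route recorded in the minimal model:
?- reachable(bond_street,charing_cross,Route).
Route = route(oxford,piccadilly).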
A Prolog program for constructing models of a given set of clauses (or submodels if the models are infinite) can be found in Section 5.4.
Proof theory#
Resolution for full clausal logic is very similar to resolution for relational clausal logic: we only have to modify the unification algorithm in order to deal with complex terms. For instance, consider the atoms
plus(s(0),X,s(X))
and
plus(s(Y),s(0),s(s(Y)))
Their mgu is { Y → 0, X → s(0) }, yielding the atom
plus(s(0),s(0),s(s(0)))
In order to find this mgu, we first of all have to make sure that the two atoms do not have any variables in common; if needed, some of the variables should be renamed. Then, after making sure that both atoms contain the same predicate (with the same arity), we scan the atoms from left to right, searching for the first subterms at which the two atoms differ. In our example, these are 0 and Y. If neither of these subterms is a variable, then the two atoms are not unifiable; otherwise, substitute the other term for all occurrences of the variable in both atoms, and remember this partial substitution (in the above example: { Y → 0 }), because it is going to be part of the unifier we are constructing. Then, proceed with the next subterms at which the two atoms differ. Unification is finished when no such subterms can be found (the two atoms are made equal).
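As a small sketch (the query is ours), the same mgu can be obtained by letting a Prolog system unify the two atoms; the reported bindings correspond to the substitution constructed above:
?- plus(s(0),X,s(X)) = plus(s(Y),s(0),s(s(Y))).
X = s(0),
Y = 0.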
Although the two atoms initially have no variables in common, this may change during the unification process. Therefore, it is important that, before a variable is replaced by a term, we check whether the variable already occurs in that term; this is called the occur check. If the variable does not occur in the term by which it is to be replaced, everything is in order and we can proceed; if it does, the unification should fail, because it would lead to circular substitutions and infinite terms. To illustrate this, consider again the clause
loves(X,person_loved_by(X)).
We want to know whether this implies that someone loves herself; thus, we add the query :-loves(Y,Y) to this clause and try to apply resolution. To this end, we must unify the two atoms. The first subterms at which they differ are the first arguments, so we apply the partial substitution { Y → X } to the two atoms, resulting in
loves(X,person_loved_by(X))
and
loves(X,X)
The next subterms at which these atoms differ are their second arguments, one of which is a variable. Suppose that we ignore the fact that this variable, X, already occurs in the other term; we construct the substitution { X → person_loved_by(X) }. Now, we have reached the end of the two atoms, so unification has succeeded, we have derived the empty clause, and the answer to the query is
X = person_loved_by(person_loved_by(person_loved_by(...)))
which is an infinite term.
Now we have two problems. The first is that we did not define any semantics for infinite terms, because there are no infinite terms in the Herbrand base. But even worse, the fact that there exists someone who loves herself is not a logical consequence of the above clause! That is, this clause has models in which nobody loves herself. So, unification without occur check would make resolution unsound.
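The following is a hedged sketch of unification with the occur check, along the lines of the left-to-right scan described above (the predicate names unify_oc, unify_args and occurs_in are ours, member/2 is assumed from the standard list library, and this is not the version developed in Chapter 3):
% unify two terms, failing if a variable would be bound to a term containing it
unify_oc(X,Y):-var(X), var(Y), X=Y.
unify_oc(X,T):-var(X), nonvar(T), \+ occurs_in(X,T), X=T.
unify_oc(T,X):-nonvar(T), var(X), \+ occurs_in(X,T), X=T.
unify_oc(T1,T2):-nonvar(T1), nonvar(T2),
    T1=..[F|Args1], T2=..[F|Args2],
    unify_args(Args1,Args2).

unify_args([],[]).
unify_args([A|As],[B|Bs]):-unify_oc(A,B), unify_args(As,Bs).

% occurs_in(X,T): the variable X occurs somewhere inside the term T
occurs_in(X,T):-var(T), X==T.
occurs_in(X,T):-nonvar(T), T=..[_|Args], member(A,Args), occurs_in(X,A).

With this definition the problematic query fails, as it should:
?- unify_oc(loves(Y,Y),loves(X,person_loved_by(X))).
false.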
If possible, unify the following pairs of terms:
plus(X,Y,s(Y)) and plus(s(V),W,s(s(V)));
length([X|Y],s(0)) and length([V],V);
larger(s(s(X)),X) and larger(V,s(V)).
The disadvantage of the occur check is that it can be computationally very costly. Suppose that you need to unify X with a list of a thousand elements; then the complete list has to be searched in order to check whether X occurs somewhere in it. Moreover, cases in which the occur check is needed often look somewhat exotic. Since the developers of Prolog were also taking the efficiency of the Prolog interpreter into consideration, they decided to omit the occur check from Prolog’s unification algorithm. On the whole, this makes Prolog unsound; but this unsoundness only occurs in very specific cases, and it is the duty of the programmer to avoid such cases. In case you really need sound unification, most available Prolog implementations provide it as a library routine, but you must build your own Prolog interpreter in order to incorporate it. In Chapter 3, we will see that this is in fact amazingly simple: it can even be done in Prolog!
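For example, many Prolog systems provide the standard predicate unify_with_occurs_check/2 for this purpose (a sketch, assuming such a system):
?- unify_with_occurs_check(plus(s(0),X,s(X)),plus(s(Y),s(0),s(s(Y)))).
X = s(0),
Y = 0.

?- unify_with_occurs_check(loves(Y,Y),loves(X,person_loved_by(X))).
false.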
Meta-theory#
Most meta-theoretical results concerning full clausal logic have already been mentioned. Full clausal resolution is sound (as long as unification is performed with the occur check) and refutation-complete, but not complete. Moreover, due to the possibility of infinite interpretations, full clausal logic is only semi-decidable: that is, if \(A\) is a logical consequence of \(B\), then there is an algorithm that will check this in finite time; however, if \(A\) is not a logical consequence of \(B\), then there is no algorithm which is guaranteed to check this in finite time for arbitrary \(A\) and \(B\). Consequently, there is no general way to prevent Prolog from looping if no (further) answers to a query can be found.