Bottom-up induction

9.2. Bottom-up induction#

The induction program we will develop in this section constructs \(\theta\)-LGG’s of two examples, relative to a partial model \(M\) which consists of all positive examples plus ground facts for the background predicates, of which the definitions are given beforehand. Such \(\theta\)-LGG’s are called relative least general generalisations or RLGG’s. Typically, RLGG’s are quite big clauses, that contain many redundant or otherwise useless literals, but also one or two useful literals. For instance, suppose \(M\) consists of the following positive examples for the predicate append/3:

append([1,2],[3,4],[1,2,3,4]).   append([a],[],[a]).
append([],[],[]).                append([2],[3,4],[2,3,4]).

The RLGG of two examples \(E_1\) and \(E_2\) relative to a model \(M\) is defined as the \(\theta\)-LGG of the clauses \(E_1 \texttt{:-} Conj(M)\) and \(E_2 \texttt{:-} Conj(M)\), where \(Conj(M)\) denotes the conjunction of the ground facts in \(M\). So, the RLGG of the first two examples above is the \(\theta\)-LGG of the following two clauses:

append([1,2],[3,4],[1,2,3,4]):-
    append([1,2],[3,4],[1,2,3,4]),append([a],[],[a]),
    append([],[],[]),append([2],[3,4],[2,3,4]).

append([a],[],[a]):-
    append([1,2],[3,4],[1,2,3,4]),append([a],[],[a]),
    append([],[],[]),append([2],[3,4],[2,3,4]).

The body of the resulting clause consists of 16 literals, constructed by pairwise anti-unification of facts in \(M\):

append([A|B],C,[A|D]):-
    append([1,2],[3,4],[1,2,3,4]),append([A|B],C,[A|D]),
    append(W,C,X),append([S|B],[3,4],[S,T,U|V]),
    append([R|G],K,[R|L]),append([a],[],[a]),
    append(Q,[],Q),append([P],K,[P|K]),append(N,K,O),
    append(M,[],M),append([],[],[]),append(G,K,L),
    append([F|G],[3,4],[F,H,I|J]),append([E],C,[E|C]),
    append(B,C,D),append([2],[3,4],[2,3,4]).

Clearly, this clause contains many redundant literals. First of all, removing the ground facts from \(M\) does not change the logical meaning of the clause, since they are known to be true. Furthermore, note that most literals introduce new variables, that do not appear in the head of the clause[1]. For simplicity, we will assume that this does not occur in the intended program, i.e. all variables in the body of a hypothesis clause also occur in the head. Such clauses are also called constrained. Under this assumption, the clause can be considerably reduced:

append([A|B],C,[A|D]):-
    append([A|B],C,[A|D]),append(B,C,D).

Note that the first body literal turns the clause into a tautology: a clause that is true by definition. We will exclude this literal as well by assuming that hypothesis clauses are strictly constrained, i.e. the set of body variables is a proper subset of the set of head variables (see Exercise 9.4 for a discussion of the kind of program excluded by this restriction). Under this assumption, we arrive at the recursive clause for append/3:

append([A|B],C,[A|D]):-
    append(B,C,D).

It is interesting to trace the literal append(B,C,D) back to its origin: it is the anti-unification of the facts append([],[],[]) and append([2],[3,4],[2,3,4]). These are exactly the ground bodies of the last clause, if we unify its head with the two original examples!

The program for computing the RLGG of two examples is given below. It is a slight modification of the program for computing \(\theta\)-LGG’s, given in the previous section. After the head of the clause is constructed, the variables in the head are passed on to the predicate rlgg_bodies/9, which will only construct literals of which all the variables occur in the head.

% rlgg(E1,E2,M,C) <- C is RLGG of E1 and E2 relative to M
rlgg(E1,E2,M,(H:-B)):-
    anti_unify(E1,E2,H,[],S10,[],S20),
    varsin(H,V),  % determine variables in head of clause
    rlgg_bodies(M,M,[],B,S10,_S1,S20,_S2,V).

% varsin(T,V) <- V is list of variables occuring in term T
%                (standard predicate in many Prologs)

rlgg_bodies([],_B2,B,B,S1,S1,S2,S2,_V).
rlgg_bodies([L|B1],B2,B0,B,S10,S1,S20,S2,V):-
    rlgg_literal(L,B2,B0,B00,S10,S11,S20,S21,V),
    rlgg_bodies(B1,B2,B00,B,S11,S1,S21,S2,V).

rlgg_literal(_L1,[],B,B,S1,S1,S2,S2,_V).
rlgg_literal(L1,[L2|B2],B0,B,S10,S1,S20,S2,V):-
    same_predicate(L1,L2),
    anti_unify(L1,L2,L,S10,S11,S20,S21),
    varsin(L,Vars),
    var_proper_subset(Vars,V),  % no new variables
    !,rlgg_literal(L1,B2,[L|B0],B,S11,S1,S21,S2,V).
rlgg_literal(L1,[_L2|B2],B0,B,S10,S1,S20,S2,V):-
    rlgg_literal(L1,B2,B0,B,S10,S1,S20,S2,V).

%%% var_… uses == rather than unification (Section 10.2 (appendix))

For simplicity, the body of the RLGG thus constructed is a list of literals rather than a conjunction.

The main algorithm of the RLGG-program is relatively simple: construct the RLGG of two positive examples, and remove all positive examples that are extensionally covered by this clause. Such an algorithm, which induces each clause separately, is also called a covering algorithm. Positive and negative examples, identified by a sign, are first separated by means of the predicate pos_neg/3, and the positive examples are combined with a (possibly empty) background model for the background predicates, to yield the model to be used for construction of RLGG’s.

induce_rlgg(Exs,Clauses):-
    pos_neg(Exs,Poss,Negs),  % split pos. & neg. examples
    bg_model(BG),            % ground background model
    append(Poss,BG,Model),   % Model includes pos.exs.
    induce_rlgg(Poss,Negs,Model,Clauses).

induce_rlgg(Poss,Negs,Model,Clauses):-
    covering(Poss,Negs,Model,[],Clauses).

% split positive and negative examples
pos_neg([],[],[]).
pos_neg([+E|Exs],[E|Poss],Negs):-
    pos_neg(Exs,Poss,Negs).
pos_neg([-E|Exs],Poss,[E|Negs]):-
    pos_neg(Exs,Poss,Negs).

% covering algorithm
covering(Poss,Negs,Model,H0,H):-
    construct_hypothesis(Poss,Negs,Model,Hyp),!,
    remove_pos(Poss,Model,Hyp,NewPoss),
    covering(NewPoss,Negs,Model,[Hyp|H0],H).
covering(P,_N,_M,H0,H):-
    append(H0,P,H).  % add uncovered examples to hypothesis

% remove covered positive examples
remove_pos([],_M,_H,[]).
remove_pos([P|Ps],Model,Hyp,NewP):-
    covers_ex(Hyp,P,Model),!,
    write('Covered example: '),write(P),nl,
    remove_pos(Ps,Model,Hyp,NewP).
remove_pos([P|Ps],Model,Hyp,[P|NewP]):-
    remove_pos(Ps,Model,Hyp,NewP).

Tip

If you run the first suggested query in this code example you will see that it induces the following definition of element/2:

element(A,[A|_]).
element(A,[_,B|C]):-element(A, [B|C]).

What is interesting here is that the recursive clause stipulates that the second argument is a list with at least two elements. This is logically correct, since one-element lists are taken care of by the base case, and in fact slightly more efficient.

The two predicates called by the covering algorithm are construct_hypothesis/4 to construct a new clause, and covers_ex/3 to check extensional coverage.

% extensional coverage, relative to a ground model
covers_ex((Head:-Body),Example,Model):-
    try((Head=Example,
        forall(element(L,Body),element(L,Model)))).

% construct a clause by means of RLGG
construct_hypothesis([E1,E2|Es],Negs,Model,Clause):-
    write('RLGG of '),write(E1),
    write(' and '),write(E2),write(' is'),
    rlgg(E1,E2,Model,Cl),
    reduce(Cl,Negs,Model,Clause),!,  % no backtracking
    nl,tab(5),write(Clause),nl.
construct_hypothesis([E1,E2|Es],Negs,Model,Clause):-
    write(' too general'),nl,
    construct_hypothesis([E2|Es],Negs,Model,Clause).

try(Goal) succeeds if and only if Goal succeeds, but without instantiating variables in Goal (see Section 10.2 (appendix)).

The remaining predicate is reduce/4. This predicate first removes all the ground facts in the background model from the body of the clause. In a second step, the clause is further generalised by removing as many literals as possible, as long as the resulting clause does not cover any negative example (this is the only point where negative examples are used). This is needed because an RLGG might still contain redundant literals. For instance, given the following model

append([1,2],[3,4],[1,2,3,4])    append([a],[],[a])
append([],[],[])                 append([],[1,2,3],[1,2,3])
append([2],[3,4],[2,3,4])        append([],[3,4],[3,4])

the RLGG of the first two facts is

append([A|B],C,[A|E]):-
    append(B,C,D),append([],C,C).

This clause contains the redundant literal append([],C,C), which is true in the intended interpretation. Therefore, removing it will not change the meaning of the clause in the intended interpretation.

% remove redundant literals
reduce((H:-B0),Negs,M,(H:-B)):-
    setof0(L,(element(L,B0),not var_element(L,M)),B1),
    reduce_negs(H,B1,[],B,Negs,M).

% reduce_negs(H,B1,B0,B,N,M) <- B is a subsequence of B1
%                               such that H:-B does not
%                               cover elements of N
reduce_negs(H,[L|B0],In,B,Negs,M):-
    append(In,B0,Body),
    not covers_neg((H:-Body),Negs,M,N),!,  % remove L
    reduce_negs(H,B0,In,B,Negs,M).
reduce_negs(H,[L|B0],In,B,Negs,M):-        % keep L
    reduce_negs(H,B0,[L|In],B,Negs,M).
reduce_negs(H,[],Body,Body,Negs,M):-       % fail if clause
    not covers_neg((H:-Body),Negs,M,N).    % covers neg.ex.

covers_neg(Clause,Negs,Model,N):-
    element(N,Negs),
    covers_ex(Clause,N,Model).

%%% var_element/2: see Section 10.2 (appendix)

We illustrate the program by applying it to two induction problems, one without and one with additional background predicates. The first example is the familiar append/3 predicate.

bg_model([]).

?-induce_rlgg([ +append([1,2],[3,4],[1,2,3,4]),
                +append([a],[],[a]),
                +append([],[],[]),
                +append([],[1,2,3],[1,2,3]),
                +append([2],[3,4],[2,3,4]),
                +append([],[3,4],[3,4]),
                -append([a],[b],[b]),
                -append([c],[b],[c,a]),
                -append([1,2],[],[1,3])   ],Clauses).

RLGG of append([1,2],[3,4],[1,2,3,4]) and append([a],[],[a]) is
    append([X|Xs],Ys,[X|Zs]):-[append(Xs,Ys,Zs)]
Covered example: append([1,2],[3,4],[1,2,3,4])
Covered example: append([a],[],[a])
Covered example: append([2],[3,4],[2,3,4])
RLGG of append([],[],[]) and append([],[1,2,3],[1,2,3]) is
    append([],Y,Y):-[]
Covered example: append([],[],[])
Covered example: append([],[1,2,3],[1,2,3])
Covered example: append([],[3,4],[3,4])

Clauses = [ (append([],Y,Y):-[]),
            (append([X|Xs],Ys,[X|Zs]):-[append(Xs,Ys,Zs)]) ]

Note that, because of the use of extensional coverage, we have to provide complete ‘recursive chains’ like

append([1,2],[3,4],[1,2,3,4])
append([2],[3,4],[2,3,4])
append([],[3,4],[3,4])

Note also that the recursive clause is induced before the non-recursive one. This is due to the order in which the examples are presented; of course, it is only possible if we apply extensional coverage rather than intensional coverage.

The second example concerns the use of a non-empty background model. The background predicate num/2 converts the numbers 1…5 to the numerals one…five and vice versa; the predicate listnum/2, which does the same for lists of numbers and numerals, is to be induced.

bg_model([ num(1,one),
           num(2,two),
           num(3,three),
           num(4,four),
           num(5,five) ]).

?-induce_rlgg([ +listnum([],[]),
                +listnum([2,three,4],[two,3,four]),
                +listnum([4],[four]),
                +listnum([three,4],[3,four]),
                +listnum([two],[2]),
                -listnum([1,4],[1,four]),
                -listnum([2,three,4],[two]),
                -listnum([five],[5,5])],Clauses).

RLGG of listnum([],[]) and listnum([2,three,4],[two,3,four]) is
    too general
RLGG of listnum([2,three,4],[two,3,four]) and listnum([4],[four]) is
    listnum([X|Xs],[Y|Ys]):-[num(X,Y),listnum(Xs,Ys)]
Covered example: listnum([2,three,4],[two,3,four])
Covered example: listnum([4],[four])
RLGG of listnum([],[]) and listnum([three,4],[3,four]) is
    too general
RLGG of listnum([three,4],[3,four]) and listnum([two],[2]) is
    listnum([V|Vs],[W|Ws]):-[num(W,V),listnum(Vs,Ws)]
Covered example: listnum([three,4],[3,four])
Covered example: listnum([two],[2])

Clauses =
    [ (listnum([V|Vs],[W|Ws]):-[num(W,V),listnum(Vs,Ws)]),
      (listnum([X|Xs],[Y|Ys]):-[num(X,Y),listnum(Xs,Ys)]),
      listnum([],[]) ]

The RLGG of the first two examples is listnum(X,Y):-[], which is too general since it covers the negative examples. Therefore, the first example is temporarily discarded. After construction of the first clause, it is tried again, without success. Finally, since all examples except the first are covered by the two clauses found, the first example is simply added to the hypothesis as a ground fact.

Exercise 9.4 #

The restriction that the head of a hypothesis clause contains at least one variable that does not occur in the body excludes many useful programs with accumulators, like reverse/3 (Section 3.6). Choose another method to exclude tautological clauses, and demonstrate that your program can learn reverse/3.