<!--H3: Section 7.2-->
(sec:7.2)=
# Definite Clause Grammars #

If we want to build a parser in Prolog, we need a representation for sentences. Ignoring capitals and punctuation marks, a sentence can simply be represented by the list of its words in the same order, for instance
```Prolog
[the,rapid,turtle,beats,achilles]
```
Given this representation, a grammar rule like
```Prolog
sentence --> noun_phrase,verb_phrase.
```
has the following meaning: a list of words represents a sentence, if some first part of it represents a noun phrase, and the rest represents a verb phrase. This statement can easily be expressed as a Prolog clause:
```Prolog
sentence(S):-
    noun_phrase(NP),
    verb_phrase(VP),
    append(NP,VP,S)
```
Similarly, a grammar rule containing a terminal
```Prolog
verb --> [sleeps].
```
means: a list of words represents a verb if it is the list consisting of the single word 'sleeps'. Translated to Prolog:
```Prolog
verb([sleeps]).
```
Obviously, there is a very close relationship between context-free grammar rules and definite clauses, and any context-free grammar can easily be translated to a set of Prolog clauses. The exciting thing about this is that these Prolog clauses are nothing less than a parsing program: for instance, we could ask the query
```Prolog
?-sentence([the,rapid,turtle,beats,achilles]).
```
and get an affirmative answer.

+++

<!--section 3.6-->
We can actually push the correspondence between grammar rules and definite clauses further by employing difference lists ({numref}`sec:3.6`). This allows us to get rid of the `append` literals:
```Prolog
sentence(NP1-VP2):-
    noun_phrase(NP1-VP1),
    verb_phrase(VP1-VP2)
```
This clause should be read as follows: `NP1` is a sentence followed by `VP2`, if `NP1` is a noun phrase followed by `VP1`, and `VP1` is a verb phrase followed by `VP2` ({numref}`fig:7.2`). Queries now should take the form
```Prolog
?-sentence([the,rapid,turtle,beats,achilles]-[]).
```
(after parsing the initial part of the list as a sentence, nothing should be left).

```{figure} /src/fig/part_iii/image004.svg
---
width: 80%
name: 'fig:7.2'
---
The use of difference lists in grammar rules.
```

+++

<!--section 3.6-->
We have shown that there is a one-to-one correspondence between context-free grammars and Prolog programs interpreting those grammars. In fact, the translation from the first to the second is so straightforward that it is built into Prolog. That is, meta-level grammar rules like
```Prolog
sentence --> noun_phrase,verb_phrase.
```
are allowed in Prolog programs. When interpreting these rules, Prolog will invisibly convert them to object-level program clauses like
```Prolog
sentence(L,L0):-
    noun_phrase(L,L1),
    verb_phrase(L1,L0).
```
in which the additional variable is an accumulator rather than the minus list of a difference list ({numref}`sec:3.6`). Furthermore, Prolog provides the meta-level predicate `phrase/2`, such that the object-level query `?-sentence(L,[])` can be replaced by the meta-level query `?-phrase(sentence,L)` ({numref}`fig:7.3`).

```{figure} /src/fig/part_iii/image006.svg
---
width: 80%
name: 'fig:7.3'
---
Meta-level and object-level in Definite Clause Grammars.
```

+++

These Prolog grammars are known as *Definite Clause Grammars* (DCG's). They are an excellent illustration of the power of declarative programming: *specifying a grammar gives you the parser for free*. That is, a grammar is a declarative specification of the corresponding parser, and Prolog directly converts this specification into an executable parser. Moreover, since a grammar is purely declarative, the program is also a sentence **generator**: for instance, it is possible to generate every sentence starting with 'Achilles' by means of the query `?-phrase(sentence,[achilles|Rest])`.

+++

Definite Clause Grammars further extend the power of context-free grammars in two ways:

1. arguments can be added to non-terminals;
2. Prolog goals can be added to the body of grammar rules.

As an illustration of the first feature, we show how plurality agreement can be achieved by means of a DCG instead of a context-sensitive grammar:
```{swish} swish:7.2.1
```
The first rule states that the pluralities of noun phrase and verb phrase should correspond. The second rule states that the plurality of a noun phrase is determined by both article and noun, which should have corresponding pluralities as well. The remaining rules assign pluralities to specific articles, nouns and verbs.

+++

<!--section 4.1 section 4.1-->
We can also use this feature to construct a parse tree while parsing a sentence. Parse trees can be represented by Prolog terms ({numref}`sec:4.1`):

* a parse tree for a terminal `T` of syntactic category `S` is represented by the term `S(T)`;
* a parse tree for a sequence `N1` &hellip; `Nk` of non-terminals of syntactic category `S` is represented by the term `S(N1,` &hellip; `,Nk)`.

Thus, a parse tree for the verb 'sleeps' is represented by the term `verb(sleeps)`, and a parse tree for the sentence 'the turtle sleeps' is represented by the term
```Prolog
s(np(art(the),n(turtle)),vp(iv(sleeps)))
```
(for brevity, syntactic categories are abbreviated). The following grammar indicates how parse trees are built up from their constituents.
```{swish} swish:7.2.1_2
---
query-id: swishq:7.2.1_2
query-text: ?-phrase(sentence(s(np(pn(achilles)), vp(tv(beats),np(art(the),adj(lazy),n(turtle))))), L).
---
```
In the query, the argument of the non-terminal `sentence` will be instantiated to the final parse tree:
```{swish-query} swishq:7.2.1_2
?-phrase(sentence(T),[achilles,beats,the,lazy,turtle]).
  T = s(np(pn(achilles)),
        vp(tv(beats),
           np(art(the),
              adj(lazy),
              n(turtle))))
```
If we use the predicate `term_write/1` given in {numref}`sec:4.1`, a nice tree-like output is obtained:
```Prolog
?-phrase(sentence(T),[achilles,beats,the,lazy,turtle]),
  term_write(T).

  ---------s--------np--------pn--achilles
            --------vp--------tv-----beats
                      --------np-------art-------the
                                -------adj------lazy
                                ---------n----turtle
```

+++

These examples show one way to use arguments of non-terminals: to collect information coming out of the parsing process. In addition, we might want to express that arguments of different non-terminals in a rule are related in some way. To this end, we can add Prolog goals to the body of grammar rules, by enclosing them in curly brackets `{}`. For instance, suppose we have a grammar for English numerals like 'one hundred twenty three', and we want to calculate the number represented by such numerals during parsing. We could write the following DCG:
```{swish} swish:7.2.2
---
query-text: ?-phrase(numeral(N),[nine,thousand,nine,hundred,ninety,nine]).
query-id: swishq:7.2.2
---
```
We could use this DCG for parsing a given numeral, but also for generating the numeral corresponding to a given number:
```{swish-query} swishq:7.2.2
?-phrase(numeral(2211),N).
  N = [two,thousand,two,hundred,eleven]
```

```{exercise} ex:7.3
```

````{tip}
A parser for Roman numerals (adapted from a post on the `comp.compilers` Google group).
```{swish} swish:roman
```
````

+++

In this section, we have seen that writing parsers in Prolog is easy: just write the context-free grammar, possibly extended by arguments to non-terminals and Prolog goals in the body of grammar rules, and you have a program for parsing and sentence generation. However, parsing is not an end in itself: we want to assign an interpretation to a sentence. This is the topic of the following section.
