7. Reasoning with natural language#
A language which is used for communication between humans is commonly called a natural language, in order to distinguish it from an artificial computer language. Despite their apparent differences, artificial and natural language can be described by the same tools, some of which will be studied in this chapter.
Natural language can be described on a number of different levels:
Prosody: rhythm and intonation of spoken language;
Phonology: how to combine simple sounds (phonemes) in spoken language;
Morphology: how to build words from meaningful components (morphemes);
Syntax: how to build sentences from words;
Semantics: how to assign meaning to words and sentences;
Pragmatics: how to use sentences in communication.
Here, we are mainly concerned with written language, so we will not talk about prosody and phonology. Morphology tells us, for instance, that in English the plural of many nouns can be obtained by adding the suffix -s (house–houses, chair–chairs). Syntax allows us to distinguish well-formed sentences (like ‘I sleep’) from ill-formed ones (like ‘me sleeps’), and to discover their grammatical structure. Semantics allows us to understand sentences like ‘time flies like an arrow, but fruit flies like a banana’. Pragmatics tells us that ‘yes’ is in general not a very helpful answer to questions of the form ‘do you know …?’.
It should be noted that this distinction between different levels is not as clear-cut as it may seem. For instance, the sentences ‘time flies like an arrow’ and ‘fruit flies like a banana’ look very similar; yet, semantic analysis shows that they have a different grammatical structure: ‘time (noun) flies (verb) like an arrow’ in the first case, and ‘fruit flies (noun) like (verb) a banana (noun phrase)’ in the second. That is, both sentences have at least two possible grammatical structures, and we need semantics to prefer one over the other.
Without doubt, the language level which has been formalised most successfully is the syntactic level. The process of deriving the grammatical structure of a sentence is called parsing. The outcome of the parsing process is a parse tree, showing the grammatical constituents of the sentence, like verb phrase and noun phrase. This grammatical structure can be further used in the semantic analysis of the sentence. The reverse process, starting from a semantic representation and producing a sentence, is called sentence generation. It is applied in dialogue systems, where answers to queries must be formulated in natural language.