Morfologische aspecten van het ideale woordenboek : Een theoretische en empirische studie naar de lexicale samenhang van het Nederlands ten behoeve van een morfologische kennisbank

Koornwinder, N.O.

(2005) Utrecht University Repository

(Dissertation)

Abstract

This thesis defends the idea that morphological structure corresponds to a cognitive device for assigning maximal cohesion to a lexical network of knowledge units. This idea is worked out in a theory called Lexical Knowledge Representation by Inductive Name Giving (L-KRING). The L-KRING theory is designed as an ideal model of the lexicon. Such a model should provide an integral, dynamic account of all cognitive aspects of the lexicon and should be compatible with computational requirements. As a consequence, the ideal lexicon model needs a statistical basis: it should be able to derive the contents of the lexicon (both data and data categories) from the input spectrum of the language user. This kind of model will also be useful for the development of artificial language systems. In chapter 2, the lexical metamodel is used to evaluate a broad spectrum of existing lexicon theories, taken from different fields of research, ranging from computational grammars to psycholinguistic network models of the mental lexicon. In chapters 3 and 4, the same metamodel is taken as a guide for the development of a more general lexicon theory. This challenge is made concrete by discussing a number of empirical problems in the standard approach to Dutch morphology, as described in the Morphological Handbook of Dutch (MHB). These problems are solved by proposing an inductive classification approach based on paradigmatic pattern analysis. This approach accounts not only for the problem of morphological rule construction but also for the existence of syntactic categories, which are claimed to emerge from morphological inflection patterns. In chapter 4, this L-KRING approach is worked out more formally. In chapters 5 and 6, the L-KRING theory is used to describe the method and results of a large-scale lexicographic study of the morphological structure of Dutch.
For this purpose, all words in Van Dale's Large Dictionary of Dutch are assigned a morpheme structure in a semi-automatic way, resulting in a Morphological Data Bank of Dutch (MGBN). Chapter 5 describes and motivates the structure of this databank. Chapter 6 presents a number of statistical analyses of this databank. These analyses focus on an inventory of the available morphemes and their syntagmatic combination patterns. They show that the Dutch vocabulary has a high degree of cohesion: the 250,000 words in Van Dale's dictionary appear to consist of only 80,000 different base lexemes, which themselves consist of ca. 20,000 different roots, 300 prefixes and 700 suffixes. These prefixes and suffixes correspond to 950 different prefix sequences and 3,750 different suffix sequences. The analysis reports are used to evaluate the quality of the MGBN against the morpheme knowledge in the MHB, and vice versa. This leads to the conclusion that the MGBN clearly improves on the MHB in the number of affixes described and in the coverage of their combinatorial relations. Moreover, the MGBN provides detailed information about the syntagmatic and paradigmatic properties of all morphemes of Dutch, both stems and affixes. The MGBN may therefore be a useful resource for both linguistic research and computational applications.

Keywords: lexicology, lexicography, computational morphology, network model, mental lexicon, automatic knowledge acquisition, knowledge representation system

ISBN: 90-76864-83-7

Publisher: LOT