Abstract
The fundamental question that this thesis tries to contribute to is this: is it possible to code and compute the meaning of phrases, such as "fluffy dog", in a way that not only encodes logical reasoning, but also returns our (possibly multiple) common understanding(s) of such a
phrase? We rely on a compositional approach: the grammar rules act as the "computational rules", and the meanings of the individual words compose with each other following those same rules, in a way that is consistent with the type of meaning that we wish to assign to each word. To use the grammar rules in this way, we rely on Lambek types. In this framework, a common noun is represented by a Lambek type n and an adjective by a type n/n. Following the appropriate type composition rules, we can show that these two types together reduce to the type n. This means that an adjective applied to a noun gives something that can be used in the same place in a sentence. This is very intuitive: "fluffy dog" could always be used in the place of "dog". The meanings must compose in a way that is homomorphic to this grammar composition.

Additionally, it is convenient that these individual meanings have a format that a computer can understand. The choice made here comes from an approach to NLP called "distributional" semantics, which extracts vector representations of words by looking at the other words that usually appear next to them: "You shall know a word by the company it keeps". Each Lambek type is assigned a vector space, and so the composition takes place using operations in vector spaces. Each word of a certain type is then assigned an element of the corresponding vector space, a vector. The vector space associated with the Lambek type n is N, which means that a word like "dog" is represented by a vector in the noun space. In turn, the vector space associated with the Lambek type n/n is N⊗N, whose elements describe linear maps between noun vectors. These elements can be represented as matrices, and so an adjective such as "fluffy" is represented by a matrix.
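As a minimal sketch of this setup: below, "dog" is a vector in a three-dimensional noun space N and "fluffy" is a matrix in N⊗N. The dimension and all numerical values are hypothetical choices made purely for illustration, not taken from the thesis; in practice such representations would be learned from corpus data.

```python
import numpy as np

# Hypothetical 3-dimensional noun space N (values are illustrative).
dog = np.array([0.9, 0.1, 0.3])          # "dog": a vector in N (Lambek type n)

# An adjective of Lambek type n/n lives in N (x) N, i.e. it acts as a
# linear map on noun vectors and can be stored as a matrix.
fluffy = np.array([[1.2, 0.0, 0.0],
                   [0.0, 0.5, 0.1],
                   [0.2, 0.0, 0.8]])

# The grammar reduction (n/n) . n -> n becomes matrix-vector application:
fluffy_dog = fluffy @ dog                 # again a vector in N

print(fluffy_dog)                         # a vector of shape (3,)
```

The point of the sketch is only that the grammatical reduction and the numerical operation have the same shape: applying the n/n matrix to the n vector lands back in N.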
The composition of the adjective and the noun is thus automatically computed by a computer as the application of the respective matrix to the respective vector, which results in a vector in the space N, consistent with the Lambek type n produced by the grammar composition. This framework can be extended to more complex types and grammar formulations, which can become rather computationally heavy. One possibility for making these calculations more efficient is to resort to quantum computation, by encoding the vectors as quantum states and performing the contractions with a quantum circuit. In this thesis we explore those avenues, with special focus on ambiguities that arise from different possible readings of the same phrase, as with the scope in "old men and women", or with relative clauses in Dutch, where "man die de hond bijt" can translate either to "man that bites the dog" or "man that the dog bites". For these, the different possible contractions represent different readings, and they can exist simultaneously in quantum superposition. In the process, we explore ways of improving cosine similarity calculations using a metric tensor, and we develop a quantum algorithm that is capable of speeding up the process of finding an answer to an open multiple-choice question.
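To make the metric-tensor idea concrete, here is a small sketch of cosine similarity in which the usual Euclidean inner product u·v is replaced by uᵀgv for a positive-definite metric g. The function name, the vectors, and the particular g below are hypothetical illustrations, not the thesis's actual construction:

```python
import numpy as np

def cosine_sim(u, v, g=None):
    """Cosine similarity; an optional metric tensor g replaces the
    Euclidean inner product u.v with the weighted product u^T g v."""
    if g is None:
        g = np.eye(len(u))                # default: ordinary cosine similarity
    inner = u @ g @ v
    return inner / np.sqrt((u @ g @ u) * (v @ g @ v))

u = np.array([1.0, 0.0])
v = np.array([1.0, 1.0])
g = np.array([[2.0, 0.5],
              [0.5, 1.0]])               # hypothetical positive-definite metric

print(cosine_sim(u, v))                  # plain cosine similarity
print(cosine_sim(u, v, g))               # metric-weighted cosine similarity
```

With g set to the identity this reduces to the standard cosine similarity, so the metric can be seen as a tunable refinement of the usual comparison between word vectors.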