Constructions in Wonderland: Exploring the functionality of constructions through N-grams

Kim Ebensgaard Jensen, Yoshikata Shibuya

    Publication: Conference contribution without publisher/journal; Paper without publisher/journal; Research; peer-reviewed


    Abstract

    In constructionist theory (e.g. Fillmore et al. 1988; Goldberg 1995; Croft 2001), constructions are functional entities that pair form with conventionalized semantic and/or discourse-pragmatic function. In this paper, we explore the extent to which N-gram analysis, an information retrieval technique that has seen use in phraseology (Stubbs 2009), is applicable to the identification of constructions and their functionality in discourse. An N-gram is a contiguous sequence of a specified number (N) of entities, here words, that frequently (co)occur in a data set. We apply N-gram analysis to several texts, including Lewis Carroll's 1865 novel Alice's Adventures in Wonderland (AW), Mark Twain's 1884/5 novel The Adventures of Huckleberry Finn (HF), and some US presidential speeches, to identify frequently (co)occurring words that may be indicative of underlying constructions (in passing, we also show 1-grams). For example, a 3-gram search in AW yields a list of N-grams, the most frequent of which include said the King, said the Hatter, said the Caterpillar, said the Duchess, said the Gryphon, and said the Cat; the most frequent 2-gram is said the. These 3-grams most frequently appear in the discursive context illustrated below:


    (1) 'How do you like the Queen?' said the Cat in a low voice.

    (2) 'Why, what are YOUR shoes done with?' said the Gryphon.

    (3) 'It isn't,' said the Caterpillar.
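
    The kind of N-gram search described above can be sketched in a few lines of Python. This is only a minimal illustration and not the authors' tooling: the tokenizer (lowercasing, keeping word-internal apostrophes) and the helper names tokenize and ngram_counts are assumptions, and the sample text is just examples (1) to (3), not the full novel.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and extract word tokens (assumed tokenization;
    apostrophes inside a word, as in "isn't", are kept)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

def ngram_counts(tokens, n):
    """Count every contiguous run of n tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Examples (1) to (3) from the abstract, used here as a tiny stand-in corpus.
sample = (
    "'How do you like the Queen?' said the Cat in a low voice. "
    "'Why, what are YOUR shoes done with?' said the Gryphon. "
    "'It isn't,' said the Caterpillar."
)

tokens = tokenize(sample)
bigrams = ngram_counts(tokens, 2)
trigrams = ngram_counts(tokens, 3)

# The most frequent 2-gram in this small sample, with its count.
print(bigrams.most_common(1))
# All 3-grams that begin with "said the".
print([gram for gram in trigrams if gram[:2] == ("said", "the")])
```

    Even on three sentences, the 2-gram said the recurs while each 3-gram said the X occurs once, which is the pattern that motivates treating the final slot as a variable.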


    From examples (1) to (3) we can extrapolate the constructional schema [QUOTE said NPdef], whose function is to assign a piece of dialog to a character already known to the reader. An N-gram analysis of HF yields several N-grams, including I reckon, there warn't no, I says to myself, and it warn't no use, which point to strings of words represented in the narrative as constructions in the dialect spoken by the narrator and titular character; these contribute to his mind-style (Fowler 1977) and characterization. Moreover, a number of recurring strings are imbued with narrative-structural functions, such as the 2-gram and then and the 4-gram and by and by. As for the presidential speeches, we compare them by identifying N-grams and use Fisher’s exact test to discover the (strings of) words that characterize each speech. Going a step beyond N-grams, we also show how strings of words are interconnected by representing them in networks. A network represents a collection of constructions, so the strength of the connections between its nodes helps quantify how strongly the constructions (here represented as strings of words) are related to each other. Furthermore, we address “betweenness” in the network, which helps identify the centrality of nodes. In the case of AW, for example, in the network consisting of the 50 most frequent 2-grams, the node the is the most central (functioning as the most important “hub”), and less central but still important nodes include she, on, went, and said.
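
    To make the use of Fisher’s exact test on N-gram counts concrete, the following is a hedged sketch. The abstract does not say how the contingency tables are built, so the 2x2 layout (an N-gram's count versus all other N-gram tokens, in speech A versus speech B) is an assumption, the counts are invented for illustration, and fisher_exact_two_sided is a hypothetical helper written with the standard library only.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # A small relative tolerance guards against floating-point ties.
    total = sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_obs * (1 + 1e-7)
    )
    return min(1.0, total)

# Invented counts: a 2-gram occurs 12 times among 1000 2-gram tokens in
# speech A and 2 times among 1000 2-gram tokens in speech B.
p_value = fisher_exact_two_sided(12, 988, 2, 998)
print(f"p = {p_value:.4f}")
```

    A small p-value under this setup would suggest that the N-gram is distributed unevenly across the two speeches and may therefore characterize one of them.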

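    The network step can be illustrated in the same hedged way. The abstract does not specify how the network is constructed, so this sketch assumes that every word is a node and every 2-gram an undirected, unweighted edge. It computes unnormalized betweenness centrality with Brandes' algorithm. The seven 2-grams are an invented toy list, not the 50 most frequent 2-grams of AW, and build_graph and betweenness are hypothetical helper names.

```python
from collections import defaultdict, deque

def build_graph(bigrams):
    """Undirected graph: one node per word, one edge per 2-gram."""
    graph = defaultdict(set)
    for left, right in bigrams:
        if left != right:
            graph[left].add(right)
            graph[right].add(left)
    return graph

def betweenness(graph):
    """Unnormalized betweenness centrality (Brandes' algorithm, unweighted)."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search that counts shortest paths from the source.
        order = []
        preds = {node: [] for node in graph}
        sigma = dict.fromkeys(graph, 0)
        dist = dict.fromkeys(graph, -1)
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies, starting from the farthest nodes.
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {node: value / 2 for node, value in score.items()}

# Invented toy 2-grams for illustration only.
toy_bigrams = [
    ("said", "the"), ("the", "cat"), ("the", "king"), ("the", "queen"),
    ("she", "said"), ("she", "went"), ("went", "on"),
]
scores = betweenness(build_graph(toy_bigrams))
for word, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(word, value)
```

    In this toy graph the node the lies on the largest number of shortest paths between other words, which is the sense in which a node functions as a "hub".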
    Original language: English
    Publication date: 10 Dec 2014
    Status: Published - 10 Dec 2014
    Event: Sprogets Funktionalitet - Aalborg Universitet, Aalborg, Denmark
    Duration: 10 Dec 2014 → …

    Seminar

    Seminar: Sprogets Funktionalitet
    Location: Aalborg Universitet
    Country/Territory: Denmark
    City: Aalborg
    Period: 10/12/2014 → …

    • Constructions

      Jensen, K. E.

      01/12/2010 → …

      Projects: Project; Research
