Vorfeld
Starting a sentence in L2 German – Discourse annotation of a learner corpus

Heike Zinsmeister, University of Konstanz, Germany
Margit Breckle, Vilnius Pedagogical University, Lithuania

Introduction

• Local coherence
– Transition from one sentence to the next
– Entity-based coherence / discourse relation-based coherence
• Assumption
– German Vorfeld as an ideal position for linking a sentence to its preceding discourse and establishing local coherence
• Research question
– Do Chinese L2 learners of German use the Vorfeld in the same way as L1 speakers?
• Method
– Annotation of categories and relations related to local coherence
– Contrastive interlanguage analysis of (comparable) L2 texts and L1 texts

German

[Figure: example sentences labelled with the topological fields Vorfeld (pre-field), finite verb (verb-second), middle field and verbal complex]

‘Frau Thimm will go on a journey in winter. During this journey she will relax and she will learn about the culture of China. Therefore it is not just an escape.’

Coherence-related Vorfeld (Dipper and Zinsmeister 2009)

• 133 German Europarl turns
• Coherence-related functions are important, but sentence-internal, presentational functions are predominant.
• The proportion of Vorfelds related to the pre-context varies with the text type:
– letters to the editor: 32%
– scientific radio programme: 71%

Preference hierarchy of pragmatic Vorfeld functions (cf. Speyer 2007)

1. Brand-new / frame-setting
2. Element of a partly ordered set (Poset)
3. Backward-looking center

Chinese

Topic-prominent: the topic always comes first.

• Topic
– what the sentence is about (Li and Thompson 1989: 15)
– sets a spatial, temporal or individual framework within which the main predication holds (Li and Thompson 1989: 85)
• familiar referent
• time or locative phrase
• Word order
– basic: SVO
– definite, non-topic object: SOdefV

ADVtimeSV
  zuótian xuě xià de hěn jǐn
  yesterday snow descend CSC very incessant
  ‘Yesterday it snowed incessantly.’

ADVlocativeSV
  qiáng shang pá zhe hěn duō bìhǔ
  wall on climb DUR very many salamander
  ‘The wall has a lot of salamanders crawling on it.’

OtopicSV
  xìnfēng lǐ zhuāng bu jìn zhèi xiē zhàopiàn
  envelope in fit can’t enter this several photo
  ‘These photos won’t fit in this envelope.’

The ALeSKo Corpus

From data collection to empirical studies:

• Hand-written text
• Transcription
• EXMARaLDA (Schmidt 2004)
• MMAX2 (Müller & Strube 2006)

The ALeSKo corpus consists of L2 essays (advanced Chinese L2 learners of German, level: ~B2), L1 essays, metadata, annotation and annotation guidelines:

• wdt07: 25 L2 texts – topic: Are holidays an unsuccessful escape from everyday life? (6,902 tokens)
• wdt08: 18 L2 texts – topic: Does tourism support understanding among nations? (6,685 tokens)
• Falko Essays L1 0.5: 39 essays – different topics (34,155 tokens) (cf. Falko, online)

Two-step annotation process: (i) primary (partly parallel) annotation, (ii) expert decision.

Inter-annotator agreement on backward-looking center (167 Vorfelds): α(coder1, coder2) = 0.21; α(coder1, experts) = 0.53; α(coder2, experts) = 0.33.

Categories and Examples

Brand-new

Constituents that are not linked referentially to the previous discourse (cf. Prince 1981), wdt07_04:

  [Die Leute, die viele Reise machen,] haben immer mehr Geld als die, die selten reisen.
  the people that many journeys do have always more money than those that seldom travel
  ‘The people who travel a lot always have more money than those who seldom travel.’

The term “Leute” (‘people’) occurs earlier in the text but does not refer to ‘people who go on many journeys’.

Poset

Constituents that belong to a partly ordered set of which other elements are already introduced (cf. Prince 1999), wdt08_10:

  [Jeden Morgen] stehen wir auf, um pünktlich zur Arbeit zu sein. (...)
  every morning get we up for punctual to_the work to be (...)
  [Jeden Abend] bleiben wir zu Hause, sehen sinnlose Serien im Fernsehn.
  every evening stay we at home, watch senseless shows in_the television
  ‘Every morning, we get up to be at work on time. (...) Every evening, we stay at home, watch the senseless shows on TV.’

Poset: “Jeden Abend”. Implicit set: “Tageszeiten” (‘times of the day’); elements of the set: “jeden Morgen” (‘every morning’), “jeden Abend” (‘every evening’).
– annotation aid: co-hyponyms

Frame-setting

Constituents that set a frame in which the sentence is interpreted (cf. Jacobs 2001), wdt08_10:

  [In den Attraktionspunkten] werden (...) notwendige Einrichtungen konzentriert angeboten.
  at the attraction_sites are necessary facilities focussed offered
  ‘Necessary facilities are especially offered at the attraction sites.’

Locative frame

Backward-looking center

Constituents that are linked referentially to a salient element in the previous sentence (cf. Grosz et al. 1995), wdt07_22:

  Durch Reisen können sie auch andere Kultur und Lebenstile kennenlernen.
  by travelling can they also other culture and lifestyles get_to_know
  [Sie] können auch ihre Kenntnisse durch Reisen erweitern.
  they can also their knowledge by travelling broaden
  ‘By travelling, they_1 can become acquainted with other culture and lifestyles. They_1 can also broaden their knowledge by travelling.’

• Referential expression corefers with expression in previous sentence
• Antecedent is highest on a saliency hierarchy in comparison to other potential antecedents: subject > object(s) > others

Study

• Data: all 43 L2 texts (884 relevant Vorfelds); subcorpus of 24 L1 texts (764 Vorfelds)
• Assumption: Chinese topics roughly correspond to backward-looking centers and frame-setting elements

Result

Vorfeld functions in L1 and L2 texts (percent):

      brand-new   poset   frame-setting   backward-looking center
L1    16.9        2.4     7.3             27.1
L2    21.3        1.9     7.5             32.5

Backward-looking center, L2 vs. L1: significant (χ2 = 5.61, df = 1, p < 0.05); not significant if normalized with text length.

Almost no verb-third errors. L2 speakers use the function backward-looking center significantly more often in the Vorfeld than L1 speakers (cf. Breckle & Zinsmeister, in preparation).

Transfer effect from topic-prominent L1 Chinese to L2 German.

Future Work

• Discourse-related coherence: first results indicate that L2 learners mark contingency (e.g. Damit ‘hence’, Aus diesem Grund ‘therefore’) and expansion (e.g. Ferner ‘furthermore’, Auf solche Frage ‘on such a question’) more often in the Vorfeld than L1 speakers
• Error annotation: marking of errors and target hypotheses (e.g. [Was macht den Tourismus anders als die anderen Branchen], ist ... (wdt08_02); word order error (finite verb); target hypothesis: Was den Tourismus anders als die anderen Branchen macht, ist ...)
• Readability and Vorfeld use: rating and rewriting experiment (cf. Rosén 2006)
• Application in language teaching: creation of teaching material for training the effects of the Vorfeld on (local) coherence

References

Margit Breckle and Heike Zinsmeister. 2009. Annotationsrichtlinien Funktion des Vorfelds. Manuscript. December 2009. Pedagogical University Vilnius and University of Konstanz.
Margit Breckle and Heike Zinsmeister. In preparation. A corpus-based contrastive analysis of local coherence in L1 and L2 German. In Proceedings of the HDLP conference. Frankfurt/Main [a.o.]: Peter Lang.
Stefanie Dipper and Heike Zinsmeister. 2009. The Role of the German Vorfeld for Local Coherence. In Christian Chiarcos, Richard Eckart de Castilho and Manfred Stede (eds.) Von der Form zur Bedeutung: Texte automatisch verarbeiten / From Form to Meaning: Processing Texts Automatically. Tübingen: Narr. 69–79.
Falko, online. http://www.linguistik.hu-berlin.de/institut/professuren/korpuslinguistik/forschung-en/falko
Barbara Grosz, Arvind Joshi and Scott Weinstein. 1995. Centering: A Framework for Modeling the Local Coherence of Discourse. Computational Linguistics, 21. 203–225.
Joachim Jacobs. 2001. The dimensions of topic-comment. Linguistics, 39 (4). 641–681.
Charles N. Li and Sandra A. Thompson. 1989. Mandarin Chinese: A Functional Reference Grammar. Berkeley and Los Angeles, CA: University of California Press.
Ellen F. Prince. 1981. Toward a taxonomy of given-new information. In Peter Cole (ed.) Radical Pragmatics. New York: Academic Press. 223–255.
Ellen F. Prince. 1999. How not to mark topics: ‘Topicalization’ in English and Yiddish. 8 Texas Linguistics Forum.
Christina Rosén. 2006. Warum klingt das nicht deutsch? Probleme der Informationsstrukturierung in deutschen Texten schwedischer Schüler und Studenten. Stockholm: Almqvist & Wiksell International.
Helmut Schmid. 1994. Probabilistic Part-of-Speech Tagging Using Decision Trees. In D. H. Jones and H. Somers (eds.) New Methods in Language Processing. UCL Press. 154–164.
Thomas Schmidt. 2004. EXMARaLDA – ein Modellierungs- und Visualisierungsverfahren für die computergestützte Transkription gesprochener Sprache. In Proceedings of Konvens. Vienna, Austria.
Augustin Speyer. 2007. Die Bedeutung der Centering Theory für Fragen der Vorfeldbesetzung im Deutschen. Zeitschrift für Sprachwissenschaft, 26. 83–115.

ALeSKo homepage: http://ling.uni-konstanz.de/pages/home/zinsmeister/alesko.html
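
Worked example for the Result section: the reported significance test for the backward-looking center (χ2 = 5.61, df = 1, p < 0.05) can be approximated with a minimal sketch. The poster gives only percentages, so the raw counts below are an assumption, reconstructed from the rounded values (32.5% of 884 L2 Vorfelds, 27.1% of 764 L1 Vorfelds). The test is assumed to be a Pearson chi-square without continuity correction, and the function name is hypothetical. Because of the rounding, the statistic should come out close to the reported value, not identical to it.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For df = 1 the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Totals are taken from the Study section of the poster.
L2_TOTAL, L1_TOTAL = 884, 764

# ASSUMPTION: counts reconstructed from rounded percentages, not from the corpus.
l2_blc = round(0.325 * L2_TOTAL)  # backward-looking centers in L2 Vorfelds
l1_blc = round(0.271 * L1_TOTAL)  # backward-looking centers in L1 Vorfelds

chi2, p = pearson_chi2_2x2(l2_blc, L2_TOTAL - l2_blc, l1_blc, L1_TOTAL - l1_blc)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```

This sketch covers only the unnormalized comparison. The poster's remark that the difference is not significant when normalized with text length would need per-text data that is not given here.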