Linguistique de corpus appliquée à l’enseignement des collocations


Appropriation et transmission des langues et des cultures du monde:
Actes du Séminaire Doctoral International 2012
coordonnés par G. Ziegler, I. Schneider, G. Torresin et A. Simpson
Linguistique de corpus appliquée à
l’enseignement des collocations
françaises aux étudiants russophones
Corpus linguistics applied to
teaching French collocations to
Russian-speaking students
Elena Akborisova, INALCO
PLIDAM/INALCO, 2, rue de Lille, 75343 Paris cedex 07
Le but de cette publication est de présenter une expérience que l’auteur envisage
d’entreprendre en début d’année prochaine. Les résultats de l’expérience
constitueront la partie pratique d’une thèse de doctorat actuellement en préparation.
En premier lieu, sont passés en revue les concepts clés de l’étude tels que la
collocation, le verbe nucléaire, le corpus, et de là, s’ensuit une description des
exercices illustrés par des exemples concrets tirés du corpus bilingue. L’auteur a
développé trois types d’exercices destinés à l’apprentissage des collocations en
français qu’elle envisage de tester avec un groupe d’étudiants russophones. Les
étudiants se verront, alors, proposer d’explorer le corpus bilingue français-russe à
l’aide d’un logiciel de traitement de texte afin d’y trouver des collocations et leurs
équivalents de traduction. Les données obtenues ainsi que les avis des étudiants
concernant l’approche permettront de tirer une série des conclusions sur l’efficacité
de la méthode et alimenteront, dès lors, la discussion autour de l’apprentissage des
collocations assisté par ordinateur.
Mots-clés: corpus bilingue, didactique des collocations, exercices de vocabulaire,
acquisition du vocabulaire, apprentissage assisté par ordinateur
This paper presents an overview of an experiment that the author plans to carry out
at the beginning of next year. The experimental results will constitute the practical
part of a PhD thesis currently in preparation. The study’s key concepts, such as
collocation, nuclear verb and corpus, are presented first, followed by a description
of exercises illustrated by examples from the bilingual corpus. The author has
developed three types of exercises for teaching French collocations and will test
them with a group of Russian-speaking students. The students will be asked to explore
a bilingual French-Russian comparable corpus with the help of text-processing
software in order to find collocations and their translation equivalents. The data
obtained, together with the students’ opinions of the approach, will make it possible
to draw conclusions about the effectiveness of the method and will feed the
discussion about computer-assisted collocation teaching.
Keywords: bilingual corpus, collocation teaching, vocabulary exercises, vocabulary
acquisition, computer-assisted learning
1. Study presentation
The author is preparing a thesis entitled “Corpus linguistics and foreign language and
culture teaching: Russian-French comparative study”. This article covers the teaching
of verbal collocations through comparative analysis of tokens from a French-Russian
comparable corpus. Verbal collocations, the main subject of interest, are the word
associations the author seeks to identify and extract from the bilingual corpus.
1.1 Bilingual Corpus
The corpus is composed of articles from Metro, a daily newspaper published in major
European cities. For the moment, the author has limited the number of articles in
each language to 50. Metro is familiar to people living in cities all over Europe; it
is the main and probably the longest-running free source of information for
underground users in Moscow, St. Petersburg and Paris. The language of the newspaper
is characterized by its simplicity, authenticity and stylistic neutrality. The
articles chosen cover multiple domains such as environmental protection, security,
medicine, schooling, religion and the economy. These topics are expected to interest
university students.
1.2 Basic verbs
We cannot but agree with Åke Viberg, who writes in ‘Basic Verbs in Second Language
Acquisition’:
“Verbs have a central role in language processing but simultaneously verbs tend to
represent a greater cognitive load on processing than nouns. An important
characteristic of the verb lexicon is that in all languages a small number of verbs
appear to be dominant in terms of frequency. The most frequent verbs in an
individual language are referred to as basic verbs. Among the basic verbs in any
language, there is a set of nuclear verbs which tend to have the same basic meaning
in all languages (a universal tendency). In addition, there are some basic verbs that
have a language-specific meaning.” (2002: 53)
The idea is to explore what Viberg calls a “universal tendency” and to observe it in
the context of foreign vocabulary acquisition. Moreover, a corpus-based approach
makes it possible to study the use of language-specific basic verbs.
Language-specific basic verbs, as opposed to nuclear verbs, are verbs which may lack
an equivalent in other languages. For example, one of the most frequent basic verbs,
‘to have’ (‘avoir’ in French), is very often translated into Russian as ‘иметь’ when
it is part of a set expression. In other cases, when it enters into free,
unrestricted word combinations, other translation equivalents are more appropriate.
For example,
Example 1: Il a beaucoup d’amis (Fr) / У него много друзей (Ru)
J’ai froid (Fr) / Мне холодно (Ru)
Example 2: Personnes supposées y avoir droit (Fr) / Знайте: вы
имеете на это право (Ru);
des opérateurs alternatifs qui ne peuvent y avoir accès (Fr) / Люди с
городской Хукоу имеют доступ к выделенным государством рабочим
местам (Ru)
In Example 1 the verb ‘avoir’ is translated by an impersonal construction without a
subject. In Example 2 we see similar constructions in both languages. The following
extract from the corpus, with co-occurrences of the Russian verb ‘иметь’, illustrates
the earlier statement about its tendency to be part of set expressions:
1. оптимистичны: более 2/3 из них считают, что будут иметь
финансовый успех – когда-нибудь. – Я редко бываю… иметь
успех = avoir du succès
2. и отправить своих детей назад в родной город, не имеют
выбора, кроме как полагаться на старые… иметь выбор =
avoir le choix
3. у или видом на жительство. Люди с городской Хукоу имеют
доступ к выделенным государством рабочим местам… иметь
доступ = avoir accès
4. заведующему отделением, главному врачу. Знайте: вы имеете
на это право… иметь право = avoir le droit
The occurrences in bold are bound word combinations.
Together with language-specific basic verbs, Viberg distinguishes nuclear basic
verbs common to such European languages as English, German, Swedish, French,
Spanish, Italian, Romanian, Russian, Polish, Finnish and Hungarian. They belong to
seven categories:
Motion – go and come;
Possession – give and take;
Production – make;
Verbal communication – say;
Perception – see;
Cognition – know;
Desire – want.
1.3 Collocations
As collocation is the main point of interest of the study, the author explains to
students the concept of collocation, its semantic, syntagmatic and paradigmatic
particularities, and its types. This is followed by an explanation of Mel'čuk’s
theory of the semantic relationship between the base and its collocative partner
within a collocation. According to Mel'čuk’s account of the semi-phraseme (2006: 13),
the base is chosen by the speaker exclusively for its signified, while the choice of
the collocative partner depends on the meaning of the whole and on the base. The
types of collocations and the criteria for deciding whether a word combination is a
collocation will be taught according to the ideas proposed in A. Tutin and
F. Grossmann’s article “Collocations régulières et irrégulières: esquisse de
typologie du phénomène collocatif” (see references).
1.3.1 Collocations with nuclear verbs
The author’s idea is to explore the bilingual corpus in order to identify
collocations containing the above-mentioned verbs in both languages and to draw
conclusions about the semantic differences and/or similarities of Russian and French
nuclear basic verbs and their combinatorial properties. The occurrences would provide
valuable material for vocabulary exercises. Nuclear verbs have been chosen for the
exercises because the author believes that the best way to acquire new knowledge is
to identify similarities with what is already familiar. Viberg writes:
“Nuclear verbs are often the best choice when the lexical repertoire is very restricted
due to their semantic coverage and the choice of a nuclear verb can in many cases
be interpreted as a communication strategy.” (2002: 63)
The major motivation for this study is to look at two languages which are genetically
close, but structurally different. That is why the methodology can go two ways: (i)
identifying the differences, or (ii) insisting on searching for similarities between
the two languages. Both approaches may bring success. The choice was made to insist on
1.4 Target audience
The experiment is designed for students from Russian universities specializing in
foreign languages and literature (known in Russia as philology) as well as in
linguistics, in particular translation and translation studies. A questionnaire is
presented to a group of students from Udmurt State University in Izhevsk. The results
of the after-study test and an analysis of the questionnaires will be presented in the
2. Vocabulary exercises
Exercises are an important component of the teaching process. The author tries to
determine linguistic mechanisms particular to both languages and to find out how they
may apply to language teaching and help simplify the process. From this perspective,
a pedagogical tool can be created for learning and consolidating knowledge.
2.1 Tools
Free language-processing software is available, such as the AntConc concordancer, as
are official general-language corpora: Frantext for French and the Russian National
Corpus for Russian. Corpus studies facilitate the use of authentic language and allow
students to be more active and independent in analyzing language on the basis of
empirical evidence.
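To make concrete what a concordancer returns, the following minimal Python sketch produces keyword-in-context lines of the kind shown in the exercises. It is an illustration only and not how AntConc itself works: the function name kwic, its width parameter and the bracket notation are assumptions, and the sample text is adapted from two Metro fragments quoted in this paper.

```python
import re

def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per whole-word match of `keyword`."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        # Take up to `width` characters on each side of the match.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so that the keywords line up in a column.
        lines.append(f"{left:>{width}}[{match.group(0)}]{right}")
    return lines

# Sample adapted from two Metro fragments quoted in this paper (assumption:
# the real corpus would be read from the article files instead).
sample = (
    "c'est normal de payer. Ce n'est pas l'Etat qui va donner de l'argent. "
    "Детям мигрантов не дают учиться"
)

for line in kwic(sample, "donner"):
    print(line)
for line in kwic(sample, "дают"):
    print(line)
```

The search matches a single word form only, which is why each inflected form of a verb has to be queried separately or listed explicitly.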
2.2 Identification exercises
After the introduction, the students are asked to find the occurrences of nuclear
verbs in context in both corpora and to identify the occurrences of collocations.
Their proposals are then discussed in class and collocation candidates are examined
one by one. Below are occurrences of the nuclear verb ‘to give’, ‘donner’ (Fr) and
‘дать/давать’ (Ru), from the corpora.
кого не секрет, что сегодня рост социальных медиа дал старт
так называемым «деньгам толпы».
тем, как была построена моя презентация. В ней я дал слово
своим коллегам. Абсолютно все относятся
хся музыке, есть шанс побывать на уроках, которые дают
музыканты мировой величины.
и повышал иммунитет. Сообщается, что каждой особи дают по 50
граммов вина в день, наливая его в тарелочк
равление от участкового педиатра. Если вам его не дают,
требуйте письменный отказ, обращайтесь к заведую
Детям мигрантов не дают учиться В Пекине родители выступают за
равные пра
photovoltaïques chez les particuliers", afin de se donner dans
le même temps la capacité d'intervenir "sur
e, il s'en crée deux par heure". Difficile de lui donner to
c'est normal de payer. Ce n'est pas l'Etat qui va donner de
l'argent". Ce médecin est membre de l'église d
i de conserver des ovocytes frais non fécondés et donne la
possibilité de préserver la fertilité des pati
s instances officielles. Ce qui, dans le rapport, donne : "La
bioéquivalence entre produit référent et
In the Russian extract, only the last three occurrences are not bound by any
restriction, while ‘donner tort’ and ‘donner la possibilité’ in French are bound by
lexical restrictions.
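As a possible aid to the classroom discussion, the words that follow a nuclear verb can be counted automatically to suggest collocation candidates. The sketch below is an assumption of this illustration, not part of the author's procedure: the function right_collocates, the hand-written list DAT_FORMS and the window size are hypothetical, and the sample is adapted from the Russian concordance lines above.

```python
import re
from collections import Counter

def right_collocates(text, verb_forms, span=1):
    """Count the words occurring within `span` tokens to the right of any verb form."""
    tokens = re.findall(r"\w+", text.lower())
    forms = {form.lower() for form in verb_forms}
    counts = Counter()
    for i, token in enumerate(tokens):
        if token in forms:
            counts.update(tokens[i + 1:i + 1 + span])
    return counts

# Hypothetical, incomplete list of forms of 'дать/давать'.
DAT_FORMS = ["дать", "давать", "дал", "дала", "дали", "дает", "дают"]

# Fragments adapted from the Russian concordance lines above.
sample = (
    "рост социальных медиа дал старт так называемым «деньгам толпы». "
    "В ней я дал слово своим коллегам. "
    "Если вам его не дают, требуйте письменный отказ."
)

for word, freq in right_collocates(sample, DAT_FORMS).most_common():
    print(word, freq)
```

The count ignores punctuation, so a word after a comma (here ‘требуйте’) is also listed; such candidates still have to be checked by the students against the context.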
2.3 Comprehension exercises
In the second exercise, students are asked to work out a collocation’s meaning from
its context of use. The students are expected to guess the meaning from the cues
provided by the situational context, which Viberg (2002) calls situational
inferencing. Viberg contrasts situational inferencing with linguistic inferencing
(identification of word meaning with the help of linguistic cues) and points out that
linguistic inferencing is less important than situational inferencing in naturalistic
acquisition.
Here are some examples from the French corpus. Students should find the meaning of
the set expression ‘voir le jour’, which doesn’t have any direct equivalent in
Russian.
1. Beaucoup plus visuel, le nouveau Facebook s'accompagne
l'intégration de services de streaming de musique
(Spotify, Deezer) qui permettront d'écouter des morceaux
entre amis.
De nouvelles applications vont également voir le jour,
pour partager beaucoup plus d'informations sur ses goûts
personnels (livres, cuisine, films...) et ses activités
(sport, voyage, etc.). Timeline, actuellement en bêta,
sera lancé dans quelques semaines auprès du public.
2. premier bébé conçu par fécondation in vitro après
vitrification d'un ovocyte est né début mars en
Première naissance française après congélation rapide
d'un ovocyte
Le premier bébé français conçu par fécondation in vitro
après vitrification de l'ovocyte a vu le jour le 4 mars
2012 à Paris.
3. Une nouvelle drogue a été détectée presque chaque semaine
sur le marché européen, selon le rapport annuel de
l'Observatoire européen des drogues et des toxicomanies
(OEDT). Photo : Sipa
UE : 49 nouvelles drogues ont vu le jour en 2011
Au total 49 nouvelles drogues ont été identifiées en 2011
dans l'Union européenne, a indiqué jeudi l'Observatoire
européen des drogues et toxicomanies (OEDT) dans son
rapport annuel.
Comprehension exercises are, in fact, translation exercises: students are expected to
find the most appropriate translation equivalent in corpora such as Frantext and the
Russian National Corpus, with the help of the Metro articles corpus.
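Contexts such as those quoted above for ‘voir le jour’ can be retrieved with a short script that keeps only the sentences containing the expression. This is a sketch under stated assumptions: the pattern VOIR_LE_JOUR lists only a few forms of ‘voir’, the function sentences_with and the naive sentence splitting are hypothetical, and the sample is abridged from the corpus extracts above.

```python
import re

# Hypothetical pattern covering a few forms of 'voir' followed by 'le jour'.
VOIR_LE_JOUR = re.compile(
    r"\b(?:voir|voit|voient|vu|verra|verront)\s+le\s+jour\b", re.IGNORECASE
)

def sentences_with(pattern, text):
    """Return the sentences of `text` in which `pattern` occurs."""
    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return [s.strip() for s in sentences if pattern.search(s)]

# Sample abridged from the Metro extracts quoted above.
sample = (
    "De nouvelles applications vont également voir le jour, pour partager "
    "beaucoup plus d'informations sur ses goûts personnels. "
    "Timeline, actuellement en bêta, sera lancé dans quelques semaines auprès du public. "
    "Le premier bébé français conçu par fécondation in vitro après vitrification "
    "de l'ovocyte a vu le jour le 4 mars 2012 à Paris."
)

for sentence in sentences_with(VOIR_LE_JOUR, sample):
    print(sentence)
```

Returning whole sentences rather than fixed-width windows gives the students enough situational context to infer the meaning before looking for a Russian equivalent.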
2.4 Comparison exercises
The third type of exercise offers the possibility of comparing nuclear verbs in
Russian and French collocations, their semantic equivalence and the degree of freedom
of their constituent parts.
Below are the results of corpus queries for different forms of the verb ‘to take’:
‘prendre’ in French and ‘брать/взять’ in Russian.
e. Mais son étude annuelle "Forces de travail" ne prend pas en
compte les périodes non travaillées (congé
jusque-là plutôt écouté et respecté, qu'on s'en prend. Monsieur
Sekiguchi, employé de 48 ans, hausse
de fourrage. L'Etat a donc choisi cette année de prendre les
devants. Le ministère de l'Agriculture a réun
tielle", rapporte l'AFP. Faire preuve d’optimisme Prenant en
compte la baisse du prix des panneaux photovol
(12,35 euros) à des revendeurs à la sauvette pour prendre
place dans la file d'attente. Selon le site Sina.
le deuxième suivait une thérapie traditionnelle, prenant la
forme de cinq consultations avec des psycholog
s est présente ce vendredi matin. Toutes viennent prendre des
"leçons" qui enseignent la doctrine. Des leço
avocat, pour qui l'Etat français "doit maintenant prendre ses
responsabilités".
Médicaments périmés : prenez soin de les trier Cyclamed
collecte les médicaments
lentes et complexes. Une évolution que la France prend
aujourd’hui mal en compte. C’est ce qui ressort d
tronqué (sorti le 15 février chez Calmann-Lévy), prend position en
faveur des soins palliatifs
n'avons pas les moyens d'empêcher une personne de prendre le
volant", même en cas d'alcoolémie très excessi
ка, достигается сложнее всего. Но если немедленно взяться за
дело улучшения «качества» и цвета свого лица,
дпринимать шаги по улучшению цвета лица, придется взять на
вооружение правильный образ жизни. Организм по
Студенты обычно платят заслуженным музыкантам, чтобы брать у
них уроки, и платят немало. Ведь есть индивидуальные навыки,
постигнуть которые очень важно, – объясняет Мария Ханько
С семьи умерших могут начать взимать долги, которые были
совершены уже после смерти их близкого человека и в счет долга
могут даже забрать дом или другое имущество.
The word combinations in bold above are collocations. There is only one example
with the same meaning in Russian and French (Toutes viennent prendre des
"leçons" qui enseignent la doctrine and Студенты обычно платят заслуженным
музыкантам, чтобы брать у них уроки, и платят немало). The other examples either
are semantically similar, such as ‘prendre ses responsabilités’ = ‘взять на
себя ответственность’, or have a different meaning, as in ‘взять на вооружение’ =
‘s’armer de…’. The French collocation ‘prendre le volant’ (‘take the wheel’) is
translated into Russian as ‘садиться за руль’ (‘sit at the wheel’).
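The comparison can be recorded in a small data structure that pairs each French expression with its Russian equivalent and checks whether both are built on the nuclear verb ‘take’. The sketch below is only one possible way of organizing the students' findings: the class CollocationPair, the helper functions and the short lists of verb forms are hypothetical, and the Russian expressions are given in citation form (e.g. ‘брать уроки’ for ‘брать у них уроки’).

```python
from dataclasses import dataclass

# Hypothetical, incomplete lists of forms of the nuclear verb 'take'.
FRENCH_TAKE = ("prendre", "prend", "prenant", "prenez")
RUSSIAN_TAKE = ("брать", "взять")

@dataclass(frozen=True)
class CollocationPair:
    french: str
    russian: str

# Pairs taken from the discussion above.
PAIRS = [
    CollocationPair("prendre des leçons", "брать уроки"),
    CollocationPair("prendre ses responsabilités", "взять на себя ответственность"),
    CollocationPair("s'armer de", "взять на вооружение"),
    CollocationPair("prendre le volant", "садиться за руль"),
]

def uses_verb(expression, forms):
    """True if the first word of the expression is one of the given verb forms."""
    return expression.split()[0].lower() in forms

def shares_nuclear_verb(pair):
    """True if both members of the pair are built on the nuclear verb 'take'."""
    return uses_verb(pair.french, FRENCH_TAKE) and uses_verb(pair.russian, RUSSIAN_TAKE)

for pair in PAIRS:
    print(pair.french, "|", pair.russian, "|", shares_nuclear_verb(pair))
```

Pairs for which the check fails are the ones where the two languages choose different verbs, which is where explicit teaching is most likely to be needed.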
3. Conclusion
The above method of teaching French collocations to Russian-speaking students will
be piloted in February 2013. At the current stage, the author can only put forward
hypotheses about the efficiency and applicability of her method. If the author’s
assumptions are right, the audience will appreciate the computer-assisted learning
experience, which is new to the typical French classroom setting. The results of
teacher-student interactions as well as of the students’ autonomous work will support
the hypothesis that bilingual analogies not only simplify vocabulary retention but
also contribute to creating a positive attitude towards foreign language learning.
Moreover, if similar bound constructions with nuclear verbs are found in several
languages, they will represent a key to foreign language phraseology. Some examples
of Russian and French verbal collocations that are semantically close and name
phraseological concepts will be presented.
Another important expectation of the author is that the students will adopt the
techniques of corpus investigation and use them in language acquisition.
4. References
1. Mel'čuk, I. (2006). Phrasemes in Language and Phraseology in Linguistics.
Montréal: Presses de l’Université de Montréal.
2. Mel'čuk, I. & Polguère, A. (2007). Lexique actif du français : l’apprentissage du
vocabulaire fondé sur 20 000 dérivations sémantiques et collocations du français.
Bruxelles: De Boeck & Larcier.
3. Sinclair, J. McH. (1991). Corpus, Concordance, Collocation. Oxford: Oxford
University Press.
4. Tutin, A. & Grossmann, F. (2002). Collocations régulières et irrégulières :
esquisse de typologie du phénomène collocatif. Revue française de linguistique
appliquée, 2002/1, Vol.VII, 7-25.
5. Viberg, Å. (2002). Basic Verbs in Second Language Acquisition. Revue française
de linguistique appliquée, 2002/2, Vol.VII, 61-79.
6. AntConc
7. Frantext
8. Russian National Corpus