Proceedings of the 9th International Conference on

TIA 2011
9th International Conference on
Terminology and
Artificial Intelligence
Proceedings of the Conference
8–10 November 2011
INALCO
Paris, France
Conference program
Tuesday, November 8, 2011
8:30-9:15
Welcome and registration
9:15-9:30
Opening remarks
Invited talk 1: Béatrice Daille (LINA, Nantes University)
Term variation in texts
Béatrice Daille
10:30-11:00
Coffee break
Session 1: Terminology structuring
Utilisation de méthodes de structuration de terminologies pour la création de
groupements de termes de pharmacovigilance
Marie Dupuch, Amandine Périnet, Thierry Hamon and Natalia Grabar
Dommages collatéraux de la fusion de terminologies
Natalia Grabar, Marie Dupuch and Fleur Mougin
Poster session (see posters below)
12:30-14:00
Lunch
Session 2: From text to ontology
Bootstrapping a Domain-specific Terminological Taxonomy from Scientific Text
Magdalena Wolska, Ulrich Schäfer and Pham The Nghia
EvOnto une approche pour maintenir la cohérence entre une ressource termino-ontologique et des annotations sémantiques
Anis Tissaoui, Nathalie Aussenac-Gilles, Nathalie Hernandez and Philippe Laublet
Nous appelons X cet Y : X est-il un terme émergent ?
Marie-Paule Jacques
15:30-16:00
Coffee break
Session 3: Multilingual and lay terminology
Comparative analysis of the motivatedness structure of Japanese and English terminologies
Takuma Asaishi and Kyo Kageura
Japanese-English Cross-Language Headword Search of Wikipedia
Satoshi Sato and Masaya Okada
Exploring the terminological nature of citizens’ queries in the domain of consumer
justice
Meritxell Fernández-Barrera
Wednesday, November 9, 2011
8:30-9:00
Welcome and registration
9:00-10:00
Invited talk 2: Gregory Grefenstette (Exalead / Dassault Systèmes)
Multilingual Terminology and Information Retrieval
Session 4: Terminology and communication
Asymmetric Similarity and Cross-Cultural Communication Process
Fumiko Kano Glückstad
10:30-11:00
Coffee break
Poster session (see posters below)
12:30-14:00
Lunch
Session 5: Monolingual syntagmatic extraction
Étude de l’influence de la taille du corpus de référence sur l’extraction terminologique automatique contrastive
Audrey Laroche, Patrick Drouin and Gabriel Bernier-Colborne
Enhancing Multi-word Term Extraction for Designated Theme Embedded in a Domain Corpus
Teruo Koyama and Koichi Takeuchi
Le poids des entités nommées dans le filtrage des termes d’un domaine
Nouha Omrane, Adeline Nazarenko and Sylvie Szulman
15:30-16:00
Coffee break
Session 6: Paradigmatic and syntagmatic relations
Simple methods for dealing with term variation and term alignment
Marion Weller, Anita Gojun, Ulrich Heid, Béatrice Daille and Rima Harastani
Identification des participants de lexies prédicatives : évaluation en performance
et en temps d’un système d’annotation automatique
Fadila Hadouche, Suzanne DesGroseilliers, Janine Pimentel, Marie-Claude
L’Homme and Guy Lapalme
16:30-17:00
Conclusions, Best paper prize, Presentation of workshops
Poster presentations with long papers
Formalizing Specialized Knowledge Events in Satellite Ontologies
Pamela Faber, Pilar León Araúz and Arianne Reimerink
ULiS: An Expert System on Linguistics to Support Multilingual Management of
Interlingual Knowledge bases
Maxime Lefrançois and Fabien Gandon
Poster presentations with short papers
An Analysis and Evaluation of English Arabic Statistical Machine Translation of
Terminology-Rich Text
Mohammad Ghoniem, Khaled Tokal and Ahmed Y. Tawfik
Multiterminology cross-lingual model to create the European Health Terminology/Ontology Portal
Julien Grosjean, Tayeb Merabti, Nicolas Griffon, Badisse Dahamna and Stefan
Darmoni
BootCatting Comparable Corpora
Adam Kilgarriff, Avinesh Pvs and Jan Pomikálek
Multiple terminological equivalence of word combinations in scientific discourse:
PEPCSAD — pilot study
Katarzyna Kołudzka-Ziętek
Search Engine Metrics to Discover Terms Characteristic of a Database of Images
with Captions
Michael Oakes and Lynne Hall
Terminological mismatches in English non legally-binding criminal law texts
Katia Peruzzo
Une étude terminologique de la communication hypertexte web. Caractéristiques
du domaine universitaire.
David Reymond, Nathalie Pinède, Véronique Lespinet-Najib and Benoît Le Blanc
A Case Study of Knowledge-Rich Context Extraction in Russian
Anne-Kathrin Schumann
A Study for Identifying Domain-specific Introductory Terms in Research Papers
Kiyoko Uchiyama
Program Committee Chairs:
Kyo Kageura (University of Tokyo, Tokyo, Japan)
Pierre Zweigenbaum (LIMSI-CNRS & CRIM-INALCO, Paris, France)
Invited Speakers:
Béatrice Daille (LINA, University of Nantes)
Gregory Grefenstette (Exalead/Dassault Systèmes)
Program Committee:
Guadalupe Aguado de Cea (Universidad Politécnica de Madrid, Spain)
Amparo Alcina (Universitat Jaume-I, Castellón de la Plana, Spain)
Sophia Ananiadou (NaCTeM, Manchester, UK)
Nathalie Aussenac-Gilles (IRIT, Toulouse, France)
Caroline Barrière (CRIM, Montréal, Canada)
Maria Teresa Cabré (Universitat Pompeu Fabra, Barcelona, Spain)
Farid Cerbah (Dassault Aviation, Paris, France)
Jean Charlet (AP-HP & INSERM, Paris, France)
Anne Condamines (CLLE-ERSS, Toulouse, France)
Lyne Da Sylva (EBSI, Montréal, Canada)
Béatrice Daille (LINA, Université de Nantes, Nantes, France)
Valérie Delavigne (Institut national du cancer, France)
Pascaline Dury (Université Lyon 2, Lyon, France)
Fidelia Ibekwe-San Juan (Université Lyon 3, Lyon, France)
Kyo Kageura (University of Tokyo, Tokyo, Japan)
Olivia Kwong (City University Hong Kong, Hong Kong, China)
Marie-Claude L’Homme (OLST, Université de Montréal, Canada)
Adeline Nazarenko (LIPN, Université Paris 13, Villetaneuse, France)
Minako O’Hagan (Dublin City University, Dublin, Ireland)
Pascale Sébillot (IRISA, Rennes, France)
Serge Sharoff (University of Leeds, Leeds, UK)
Monique Slodzian (ERTIM-INALCO, Paris, France)
Sylvie Szulman (LIPN, Université Paris 13, Villetaneuse, France)
Koichi Takeuchi (Okayama University, Okayama, Japan)
Rita Temmerman (Erasmushogeschool, Bruxelles, Belgium)
Yannick Toussaint (LORIA, Nancy, France)
Spela Vintar (University of Ljubljana, Ljubljana, Slovenia)
Pierre Zweigenbaum (LIMSI-CNRS & CRIM-INALCO, Paris, France)
TIA 2011 is organised by the Multilingual Engineering Research Centre CRIM/ERTIM (EA2520)
of INALCO, Institut National des Langues et Civilisations Orientales.
Organising Committee Chairs:
Mathieu Valette (ERTIM-INALCO, Paris, France)
Monique Slodzian (ERTIM-INALCO, Paris, France)
Organising Committee:
Jugurtha Aït-Hamlat
Béatrice Billo
Evelyne Bourion
Jean-Michel Daube
Océane Hồ Đình
Claire Lega
Nadia Makouar
Pierre Marchal
Gaël Patin
Egle Ramdani
Monique Slodzian
Mathieu Valette
Zhen Wang
Proceedings assembled with LaTeX using the ACLPUB package.
Preface
Terminology and Artificial Intelligence (TIA) is one of the leading international forums in
terminology. The first TIA conference was held in 1995 in Villetaneuse, France, organised
by the French-oriented group “Terminology and Artificial Intelligence”. Since then, it has
continued as a biennial conference. While it started out as a predominantly francophone
conference, promoting text-based terminology, it has gradually attracted a wider audience,
and over the past six years it has grown to become a truly international conference. The 9th
TIA conference has matured into a significant gathering, attracting researchers and
practitioners working in a variety of fields including terminology per se,
computational linguistics, translation studies, specialised communication and socio-linguistics.
Taking into account recent developments and changes in the international communication
scene, we have set multilingual aspects of terminology as the central topic for TIA 2011, while
at the same time welcoming a broad range of related topics. Reflecting the importance of TIA
as a forum, TIA 2011 attracted 42 submissions from 18 countries in Europe, Asia and North
and South America. These papers were reviewed by a Programme Committee consisting of 30
experts in the domain from all over the world. Of the 42 submissions, 14 papers were accepted
as long papers, and a further 11 papers were accepted as short papers. They are included in
this volume. In order to maximise interactions among participants, the conference will consist
of a single-track session with two keynote speakers: Professor Béatrice Daille of LINA,
University of Nantes, and Dr. Gregory Grefenstette of Exalead/Dassault Systèmes. We believe
that TIA 2011 will prove to be an exciting and intriguing forum for people involved in research,
practice and applications related to terminology.
We would like to thank everyone involved in TIA 2011. Special thanks go to all members of
the Programme Committee who worked hard at completing their meticulous reviews on time,
and all members of the Organising Committee for taking care of every aspect of the conference
and workshops on the ground. We also appreciate the help of the many volunteers who have
worked hard to ensure TIA 2011 is a pleasant and comfortable experience for all. Last but not
least, we would like to express our thanks to all the authors who submitted their papers on time,
and all the delegates for participating in the conference. Surely, TIA 2011 will not be the last of
the TIA series, and we look forward to the ongoing development of the series and hope that it
continues to be an exciting forum for the exchange of important ideas among all those working
in the field of terminology and related fields.
Kyo Kageura (University of Tokyo, Tokyo, Japan)
Pierre Zweigenbaum (LIMSI-CNRS & CRIM-INALCO, Paris, France)
Table of Contents
Abstracts of keynote presentations
Term variation in texts
Béatrice Daille
Long papers
Utilisation de méthodes de structuration de terminologies pour la création de groupements de
termes de pharmacovigilance
Marie Dupuch, Amandine Périnet, Thierry Hamon and Natalia Grabar
Dommages collatéraux de la fusion de terminologies
Natalia Grabar, Marie Dupuch and Fleur Mougin
Bootstrapping a Domain-specific Terminological Taxonomy from Scientific Text
Magdalena Wolska, Ulrich Schäfer and Pham The Nghia
EvOnto une approche pour maintenir la cohérence entre une ressource termino-ontologique et
des annotations sémantiques
Anis Tissaoui, Nathalie Aussenac-Gilles, Nathalie Hernandez and Philippe Laublet
Nous appelons X cet Y : X est-il un terme émergent ?
Marie-Paule Jacques
Comparative analysis of the motivatedness structure of Japanese and English terminologies
Takuma Asaishi and Kyo Kageura
Japanese-English Cross-Language Headword Search of Wikipedia
Satoshi Sato and Masaya Okada
Exploring the terminological nature of citizens’ queries in the domain of consumer justice
Meritxell Fernández-Barrera
Asymmetric Similarity and Cross-Cultural Communication Process
Fumiko Kano Glückstad
Étude de l’influence de la taille du corpus de référence sur l’extraction terminologique automatique contrastive
Audrey Laroche, Patrick Drouin and Gabriel Bernier-Colborne
Enhancing Multi-word Term Extraction for Designated Theme Embedded in a Domain Corpus
Teruo Koyama and Koichi Takeuchi
Le poids des entités nommées dans le filtrage des termes d’un domaine
Nouha Omrane, Adeline Nazarenko and Sylvie Szulman
Simple methods for dealing with term variation and term alignment
Marion Weller, Anita Gojun, Ulrich Heid, Béatrice Daille and Rima Harastani
Identification des participants de lexies prédicatives : évaluation en performance et en temps
d’un système d’annotation automatique
Fadila Hadouche, Suzanne DesGroseilliers, Janine Pimentel, Marie-Claude L’Homme and
Guy Lapalme
Formalizing Specialized Knowledge Events in Satellite Ontologies
Pamela Faber, Pilar León Araúz and Arianne Reimerink
ULiS: An Expert System on Linguistics to Support Multilingual Management of Interlingual
Knowledge bases
Maxime Lefrançois and Fabien Gandon
Short papers
An Analysis and Evaluation of English Arabic Statistical Machine Translation of Terminology-Rich Text
Mohammad Ghoniem, Khaled Tokal and Ahmed Y. Tawfik
Multiterminology cross-lingual model to create the European Health Terminology/Ontology
Portal
Julien Grosjean, Tayeb Merabti, Nicolas Griffon, Badisse Dahamna and Stefan Darmoni
BootCatting Comparable Corpora
Adam Kilgarriff, Avinesh Pvs and Jan Pomikálek
Multiple terminological equivalence of word combinations in scientific discourse: PEPCSAD
— pilot study
Katarzyna Kołudzka-Ziętek
Search Engine Metrics to Discover Terms Characteristic of a Database of Images with Captions
Michael Oakes and Lynne Hall
Terminological mismatches in English non legally-binding criminal law texts
Katia Peruzzo
Une étude terminologique de la communication hypertexte web. Caractéristiques du domaine
universitaire.
David Reymond, Nathalie Pinède, Véronique Lespinet-Najib and Benoît Le Blanc
A Case Study of Knowledge-Rich Context Extraction in Russian
Anne-Kathrin Schumann
A Study for Identifying Domain-specific Introductory Terms in Research Papers
Kiyoko Uchiyama