published as: CLAUS Report number 1, Saarland University, July 1990
GRAMMAR ENGINEERING: PROBLEMS AND PROSPECTS
Report on the Saarbrücken Grammar Engineering Workshop
Gregor Erbach and Hans Uszkoreit
University of the Saarland and
German Research Center for Artificial Intelligence
Abstract
The "Saarbrücken Workshop on Grammar Engineering" took place from June 21st to
23rd, 1990. The aim of the workshop was to bring together for 3 days of intensive
discussion a number of people with practical experience in the development of large-coverage grammars and researchers who have investigated concepts and tools for
grammar development. The workshop focused on the methodology of grammar
engineering, testing and evaluation of grammars, the problem of distributed
development, the formalisms and tools needed, and grammar maintenance and
reusability. A variety of approaches to grammar writing were presented. Prerequisites
for effective grammar engineering were identified.
Introduction
Purpose and Scope of the Workshop
At Coling 1988 in Budapest, M. Nagao organized a panel discussion on "Language
Engineering: The Real Bottleneck of Natural Language Processing." The main question
was "How can grammar writers use linguistic theory?" Indeed, linguistic engineering
constitutes a serious bottleneck in the development of useful NL systems. On the other
hand, recent developments in theoretical linguistics have reduced the distance between
linguistic theory and linguistic engineering.
A major problem of linguistic engineering is the lack of appropriate concepts, skills,
methods and tools for this special type of knowledge engineering. In order to get a
clearer understanding of the problems involved, the workshop on Grammar
Engineering was organized by the authors of this report.
Grammar Engineering is the development of linguistic knowledge bases with broad
coverage to be employed in natural language systems. Our workshop focused on the
development of syntactic grammars.
The Grammar Engineering bottleneck seriously hinders the commercial exploitation of
NL research for product development. It also limits the value of research systems as
simulation devices for human linguistic competence and performance that could be
used for developing, testing and improving linguistic theories.
There are four observations that add evidence to this claim:
1. The grammars and the linguistic technology in NL products that are on the market
today are usually 10 to 15 years old.
2. There are no NL products on the market yet that exhibit sufficient coverage, i.e.,
something close to the linguistic competence of the human language user.
Extending existing large grammars constitutes a real problem.
3. For every new product, the grammar is written from scratch.
4. There are no means for specifying the coverage of a grammar or for comparing
systems according to coverage.
These observations indicate serious problems that need to be solved before essential
progress can be made. Thus the workshop dealt with methods, tools and formalisms
needed for Grammar Engineering, not with the development of specific applications
and products.
The participants contributed expertise in three relevant research areas: development of
large grammars, theoretical concepts for grammar development, and tools for grammar
development. We are very pleased that the developers of some of the largest
computational grammars ever written participated in the workshop.
List of Participants
Tania Avgustinova, Bulgarian Academy of Sciences, Sofia
Igor Boguslavski, USSR Academy of Sciences, Moscow
Stephan Busemann, German Research Center for AI (DFKI), Saarbrücken
Dagmar Dwehus, IPSI GMD, Darmstadt
Gregor Erbach, University of the Saarland, Saarbrücken
Karin Harbusch, German Research Center for AI (DFKI), Saarbrücken
Robert Ingria, BBN, Cambridge, MA
Mark Johnson, MIT, Cambridge, MA
Martin Kay, Stanford University and XEROX Palo Alto Research Center, CA
Esther König, University of Stuttgart
John Nerbonne, German Research Center for AI (DFKI), Saarbrücken
Klaus Netter, University of Stuttgart
Karel Oliva, University of the Saarland, Saarbrücken
Stanley Peters, Center for the Study of Language and Information, Stanford, CA
Bettina Rehse, University of the Saarland, Saarbrücken
Jane Robinson, Palo Alto, CA
Stefanie Schachtl, Siemens, München
Paul Schmidt, IAI EUROTRA-D, Saarbrücken
Petra Steffens, IBM Germany, Institute for Knowledge-Based Systems, Stuttgart
Harald Trost, German Research Center for AI (DFKI), Saarbrücken
Hans Uszkoreit, DFKI and University of the Saarland, Saarbrücken
Wolfgang Wahlster, DFKI and University of the Saarland, Saarbrücken
Susan Warwick, ISSCO, Genève
Annie Zaenen, XEROX Palo Alto Research Center, CA
Magdalena Zoeppritz, IBM Germany, Inst. for Knowledge-Based Systems, Heidelberg
Karen Jensen, IBM Hawthorne and Bethesda, who was unable to attend the workshop,
sent a summary of her opinions on the topic.
Report on the Workshop
How does Grammar Engineering relate to Theoretical Linguistics?
If linguistic principles are sensible, you will rediscover them as practical necessities.
(Annie Zaenen)
The participants of the workshop reported general dissatisfaction with using analyses
proposed in the linguistic literature for grammar writing. The major problems
encountered were:
• Incorrectness of the analyses. This is a serious problem for languages like
Bulgarian, which have not been studied as extensively as English or German.
• Lack of explicitness, especially in traditional grammars, where some necessary
distinctions are not made.
• Not enough attention to "messy details" like dates, names, etc. Linguistic theory
concentrates too much on the core grammar and neglects the periphery.
• Problems with implementation, as exemplified by "movement" accounts.
• Insufficient coverage. Linguistic theory does not provide coherent descriptions of
large fragments.
A serious problem is the shortage of well-trained computational linguists with
expertise and experience in the area of grammar writing. Theoretical linguists learn to
develop theories of grammar. Very few have learned to design grammars for larger
fragments of the language that would really work.
The main reason is the lack of methodology that could be taught. Another reason is the
relative recency of the confluence of computational and theoretical linguistics. Until
very recently the analyses and languages of the computational linguist were quite
different from the ones used in theoretical linguistics. Only very few researchers were
able to transfer results from linguistics to computational linguistics.
The flow of information should not only be from theoretical linguistics to grammar
development, but grammar development should produce linguistic descriptions of
high quality, and thus have an impact on linguistic theory.
Methodology of Grammar Engineering
As the grammar gets larger, the number of rules written per week decreases.
(Wolfgang Wahlster)
Ideally, developing a grammar should start with defining the functionality and
coverage of the grammar. In practice, however, there are no established methods for
determining the coverage that is needed for a specific application, and for the
specification of coverage. It was suggested that the coverage should be specified
semantically rather than syntactically because a user of a natural-language processing
system cannot be expected to use only certain syntactic constructions, but can be
expected to use only a specific semantic domain.
The goals of grammar development should be clearly specified: the coverage, the
domain of application, and the output of the grammar. The grammars presented at the
workshop provided as output phrase structure trees, f-structures, dependency
structures, and semantic representations. There was general agreement that some
semantic representation as output is important.
Stanley Peters stressed the need for a generic semantic interface language which would
serve three purposes:
1. It would make it possible to specify exactly what the output of the grammar for a set
of sentences should be. This would facilitate testing because a test suite could contain
pairs of sentences and semantic representations (see the section on testing and
evaluation below).
2. The performance of different grammars would be comparable, because they
produce similar output.
3. Different grammars can be used for an application, because they would produce the
same output.
There is no systematic method for grammar engineering. There was agreement that some
planning ahead is necessary for the development of solid grammars, but that there is no
foolproof method for working one's way from the specification of the coverage to the
final grammar — linguistic intuition is important. However, "legislation", i.e., carefully
documented design decisions, was considered beneficial.
Jane Robinson advocated an empirical approach, not tied inflexibly to one currently
available formalism or theoretical school. The availability of linguistic data, for
example representative texts and concordances, is very important for grammar
development.
Some methods can be taken over from programming; this is true of structured
programming, regular testing, and good documentation.
Jane Robinson suggested the principle that there should be no syntactic ambiguities
that do not correspond to semantic ambiguities.
While Klaus Netter insisted that grammars should be developed without interruption,
Stefanie Schachtl reported that she could easily resume work on her grammar after an
intermission of several weeks.
Another problem for grammar engineering is to determine exactly which linguistic data
should be described. Stanley Peters suggested statistical analysis of large corpora in
order to find co-occurrences of phenomena that would otherwise remain unnoticed.
Another question is whether a large-coverage grammar developed by a linguist should
be re-implemented more efficiently for use in a natural-language processing system.
Good documentation is a prerequisite for continuous work, reusability of the grammar,
and collaborative work. There are two kinds of documentation: linguistic
documentation, which discusses the ideas and principles behind the design of the
grammar, and technical documentation pertaining to implementation issues, in which
all the details and hacks are documented. An example of the first kind of
documentation is the description of the ETAP-2 machine translation system, which is
published as a book [Apresyan et al. 1989].
As a tool for documentation, Wolfgang Wahlster suggested using one of the available
truth maintenance systems to keep track of interdependencies between rules and to
indicate which rules must be modified together. Jane Robinson considered this too
intricate a problem; similar attempts had been dropped in other projects because of the
complexity of the task.
Modularization and Distributed Development
Linguists view programmers as slaves on the plantations of their excellent ideas.
(Karel Oliva)
The development of large grammars is extremely slow. Existing large grammars have
usually been developed by a single person (at any given time), sometimes with very
limited assistance from a few coworkers.
In computer programming, modularization has proved to be a useful concept for the
distributed development of large programs. A requirement for a module is that its
adequacy and correctness can be specified independently from other modules.
No methods exist for efficient distributed grammar engineering since no methods exist
for the modularization of grammars. The organization of grammars into rules does not
accommodate a useful modularization because the rules are highly interdependent
(noun phrases contain verb phrases, verb phrases contain noun phrases, etc.).
More recent linguistic approaches organize the grammatical knowledge in principles
instead of rules. Since the principles interact even more closely than the rules do,
modularization as it is needed for distributed development becomes even harder, at
least at first glance.
However, the organization of knowledge elements in lattice-based type hierarchies, as
employed in feature unification formalisms, offers a very promising scheme for
modularization. A modularization concept would not only further efficient
development, it would also boost reusability and grammar evaluation.
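To make this concrete, here is a minimal sketch, in Python, of a lattice-based type
hierarchy in which the unification of two types is their most general common subtype.
The type names and the hierarchy itself are invented for illustration; actual typed
feature formalisms unify whole feature structures, not bare type symbols.

    # Invented illustration of a lattice-based type hierarchy.
    SUBTYPES = {                      # immediate subtypes of each type
        "sign":   ["word", "phrase"],
        "word":   ["verb", "noun"],
        "verb":   ["transitive-verb"],
        "phrase": [], "noun": [], "transitive-verb": [],
    }

    def descendants(t):
        """All types at or below t in the hierarchy."""
        result = {t}
        for s in SUBTYPES.get(t, []):
            result |= descendants(s)
        return result

    def unify_types(t1, t2):
        """Most general common subtype, or None if incompatible."""
        common = descendants(t1) & descendants(t2)
        maximal = [t for t in common
                   if not any(t != u and t in descendants(u) for u in common)]
        return maximal[0] if len(maximal) == 1 else None

    print(unify_types("sign", "verb"))  # -> verb
    print(unify_types("noun", "verb"))  # -> None (incompatible)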
There is no general agreement about the division of labor in a natural-language group,
but the experiences of the workshop participants can be summarized as follows.
• The linguistic work can be subdivided along the traditional linguistic levels of
description: phonology, morphology, syntax, semantics, discourse, pragmatics
would qualify as modules. An exception is compositional semantics which should
be handled in parallel with syntax. Nonetheless, regular integration on a variety of
levels is needed (as was practiced in the LILOG project, where "milestones" were set
at which the entire system was integrated).
• There is a division of labor between programmers and grammar developers. They
should know enough about each other's fields to communicate effectively. In
particular, the linguist should have an idea about the parsing problem. If the
linguists don't know what they can or cannot expect of the programmers, they tend
to use the programmers as "slaves on the plantations of their excellent ideas".
• There was also agreement that work on the lexicon can be given to people other
than the grammar developers. The use of abbreviatory devices like templates (or
macros) allows the lexicon worker to classify a lexeme as belonging to a certain
class, say [transitive-verb, present, 3rd, singular], without having to spell out what
these abbreviations mean in the linguistic analysis adopted by the grammar writer.
This method has the advantage that the lexicon is only viewed as a database, and
the linguistic analyses can be changed without having to modify every single
lexical entry. Only the definitions of templates like "transitive-verb" must be
changed. (A minimal sketch of this mechanism follows this list.)
• Cooperative development of a grammar may help its extensibility and reusability,
because problems like "legislation", interfaces and documentation must be taken
care of at an early stage in the development.
• Everything in syntax is interdependent so that it is not easy to split syntax into
modules. There was no consensus about how to divide the work of syntactic
grammar development. In most groups, one person was responsible for the entire
grammar.
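The following minimal Python sketch illustrates the template mechanism described in
the list above. The template names, features and values are invented; real formalisms
expand templates into feature structures and combine them by unification rather than
by merging flat dictionaries.

    # Invented illustration of lexicon templates: the lexicon only names
    # classes; what the names mean is defined once and can be changed
    # centrally without touching the lexical entries.
    TEMPLATES = {
        "transitive-verb": {"cat": "v", "subcat": ["np", "np"]},
        "present":         {"tense": "pres"},
        "3rd":             {"person": "3"},
        "singular":        {"number": "sg"},
    }

    LEXICON = {"sees": ["transitive-verb", "present", "3rd", "singular"]}

    def expand(word):
        """Merge the feature bundles of all templates named in the entry."""
        features = {}
        for name in LEXICON[word]:
            features.update(TEMPLATES[name])
        return features

    print(expand("sees"))
    # {'cat': 'v', 'subcat': ['np', 'np'], 'tense': 'pres',
    #  'person': '3', 'number': 'sg'}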
Several groups reported that they had one person responsible for the noun phrases and
one person responsible for the verb phrases. However, this is only possible if the two
people work together very closely. NP syntax and VP syntax are too closely related to
be good candidates for modules.
Igor Boguslavski reported that their work within a dependency-grammar framework
was subdivided according to syntactic relations. At Siemens, Stefanie Schachtl reported,
coordination was treated as a separate module.
Within stratificational frameworks like Meaning-Text Theory or the EUROTRA levels of
description there have been attempts to have one person responsible for each stratum,
but it turns out that these strata are too closely interrelated to be viewed as
independent modules.
Jane Robinson suggested modularization according to semantic field rather than
syntactic phenomenon. Semantic fields like time or comparison would then have their
own syntactic effects and manifestations. Grammars should be extended by adding new
semantic fields to the coverage.
There is a need for thorough design and legislation. This means that there should be
explicit agreement about which categories and features are used and about the spelling
of category and feature names. Likewise, all design decisions and all hacks must be well
documented to make it possible for another person to work on the grammar.
There must be clear ownership of files: each file should have only one owner.
Magdalena Zoeppritz said "I think it is not a good idea to share files. You can share
ideas and worries, but one person must be responsible for integration." It was
suggested that responsibility for a component be separated from control.
Petra Steffens reported synchronization problems with grammar development in a large
team: the tools evolved at the same time as the linguistic descriptions.
John Nerbonne sketched two models for the development of natural language systems:
one in which everything is tightly integrated from the beginning (the way large Lisp
machine environments were developed), and a more chaotic way in which separate
components are developed independently, and then integrated (the way UNIX
evolved). The latter approach has the advantage that there is more room for creativity
within each module, but the risk is higher that the modules will not fit together very
well.
Testing and Evaluation
Grammars are like Swiss cheeses when it comes to coverage.
(Petra Steffens)
There is no agreed-upon measure for the size or the coverage of a grammar.
Participants of the workshop reported the sizes of their grammars in terms of bytes,
lines of code, the number of rules and/or template definitions, the number of
unifications, distinct node descriptions, and a list of phenomena covered. GPSG
illustrates that the number of rules per se is not a good measure because some highly
schematic rules are equivalent to a very large number of context-free rules, and the
latter number can be used to compare GPSG grammars.
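As a rough illustration of how a schematic rule multiplies out, assume it is an ID rule
whose daughters are unordered except for a set of LP (linear precedence) constraints;
the rule and the constraints below are invented. Counting the admissible orderings
gives the number of context-free rules that the single schematic rule stands for:

    # Invented ID rule with four unordered daughters and two LP constraints.
    from itertools import permutations

    daughters = ["v", "np", "pp", "adv"]
    lp = [("v", "np"), ("np", "pp")]  # "v precedes np", "np precedes pp"

    def respects_lp(order):
        return all(order.index(a) < order.index(b) for a, b in lp)

    cf_rules = [o for o in permutations(daughters) if respects_lp(o)]
    print(len(cf_rules))  # -> 4 context-free rules from one ID rule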
As far as coverage is concerned, there are no generally accepted standards for
determining the coverage of a given grammar. The participants agreed that test suites
are needed that cover a wide range of grammatical phenomena. Corpora are not
considered adequate for grammar testing because they do not contain a systematic
sample of phenomena.
In order to control overgeneration, the test suite should contain negative examples of
ungrammatical strings. Since not all possible negative examples can be included in the
test suite, a generator is needed that produces a representative sample of the sentences
licensed by the grammar. This is particularly important if the grammar is to be used not
only for analysis, but also for generation. A good test suite should contain at least 500 to
1000 sentences, judging from the numbers that the workshop participants gave.
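Such a generator can be approximated by random derivation from the grammar, as in
the following sketch over an invented toy context-free grammar; a real tool would
sample from the grammar actually under test and aim at a representative spread over
its rules:

    # Random derivation from a toy grammar (invented for illustration).
    import random

    GRAMMAR = {
        "S":  [["NP", "VP"]],
        "NP": [["the", "N"], ["NP", "PP"]],
        "VP": [["V", "NP"], ["VP", "PP"]],
        "PP": [["P", "NP"]],
        "N":  [["dog"], ["telescope"]],
        "V":  [["saw"]],
        "P":  [["with"]],
    }

    def generate(symbol="S", depth=0):
        if symbol not in GRAMMAR:               # terminal symbol
            return [symbol]
        # beyond a depth bound, take the first expansion (chosen to terminate)
        rhs = GRAMMAR[symbol][0] if depth > 4 else random.choice(GRAMMAR[symbol])
        return [w for s in rhs for w in generate(s, depth + 1)]

    for _ in range(5):
        print(" ".join(generate()))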
A problem with the use of test suites is to define what counts as successful processing
of the sentences contained in the test suite. Five different criteria were given:
1. Can the sentence be parsed (or rejected in the case of negative examples)?
2. Does it get the right number of parses?
3. Does it get the correct analysis?
4. Is it assigned the right logical form?
5. Does an application based on the grammar give correct answers or translations?
In our opinion, the first two criteria are too weak. They would be adequate only if the
sole purpose of the grammar were to characterize what is a sentence of the language
and what is not (observational adequacy). The fifth criterion may blur the distinction
between grammar testing and system testing, unless the grammar is hooked into an
application whose behavior is thoroughly understood so that any changes in behavior
can be attributed to the grammar. This is the approach taken by Hewlett-Packard, who
have a standard test database.
Criteria 3 and 4 raise the question of how a test suite should be organized. It should not
be a list of sentences, but rather a list of pairs <sentence, syntactic analysis> (the
approach of the Treebank project) or a list of pairs <sentence, logical form>. The latter
would again presuppose a generic semantic interface language, as suggested by
Stanley Peters. For machine translation, a list of pairs of sentences was suggested as a
test suite.
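A test suite entry might then pair a sentence with its grammaticality status, the
expected number of parses and, given an agreed interface language, an expected logical
form. The following sketch is invented; the parse function stands in for whatever
parser is under test:

    # Invented test suite of <sentence, expectation> pairs (criteria 1-4).
    TEST_SUITE = [
        # (sentence, grammatical?, parses expected, expected logical form)
        ("the dog sleeps",      True,  1, "sleep(the-dog)"),
        ("the dog saw the cat", True,  1, "see(the-dog, the-cat)"),
        ("dog the sleeps",      False, 0, None),   # negative example
    ]

    def run_suite(parse):
        """parse(sentence) is assumed to return a list of logical forms."""
        failures = []
        for sentence, grammatical, n, lf in TEST_SUITE:
            analyses = parse(sentence)
            ok = (len(analyses) > 0) == grammatical     # criterion 1
            ok = ok and len(analyses) == n              # criterion 2
            ok = ok and (lf is None or lf in analyses)  # criterion 4
            if not ok:
                failures.append(sentence)
        return failures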
Stanley Peters suggested using the "mean time between failures" as a measure for the
performance of the grammar, because grammar engineering is interested in developing
grammars that show adequate performance for a particular application. The advantage
of the approach is that one may use a corpus for evaluation so that frequently occurring
phenomena are given higher weight than exceptions. What counts as success and
failure may depend on the five criteria given above. The use of semi-automatic
statistical methods was suggested. Another related issue is the robustness of the
grammar (or parser), i.e., its ability to process ungrammatical input.
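The measure can be sketched as follows, under the simplifying assumption that a
failure is a corpus sentence for which the parser returns no analysis (any of the five
criteria above could be substituted):

    # Mean distance (in sentences) between failures over a corpus.
    def mean_sentences_between_failures(parse, corpus):
        failures = [i for i, s in enumerate(corpus) if not parse(s)]
        if len(failures) < 2:
            return float("inf")         # zero or one failure observed
        gaps = [b - a for a, b in zip(failures, failures[1:])]
        return sum(gaps) / len(gaps)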
John Nerbonne proposed and the participants of the workshop agreed that testing
should be taken into the hands of the natural-language community, and not be left to
the funders. The reasons are that regular testing is needed for the development of
accurate grammars, and that the NL community can do the testing more intelligently.
Maintenance and Reusability
The term reusability was used in two senses: reusability of grammars for other
applications than that for which they were originally developed, and reusability of
ideas and analyses for writing new grammars. Jane Robinson remarked that grammar
writing can be shortened by looking at existing grammars in other formalisms. No one
had a neutral language for writing down syntactic analyses, while a theory-neutral
lexicon was considered feasible. Martin Kay reported on a morphological dictionary for
English, in which each word is associated with a code, which can be interpreted in
different ways, depending on the application.
It is not always easy to decide whether an existing grammar should be extended or
whether the grammar should be redesigned (the life cycle problem). Klaus Netter
reported that their German grammar was rewritten every two years.
There was general agreement that it is useful to have a core grammar, which can be
extended in different ways for different applications. Igor Boguslavski reported that
they selected and expanded a subset of an existing large grammar for application
building. Bob Ingria claimed that the semantics can easily be adapted to a new domain.
There was no doubt that good documentation of the grammar is an essential
prerequisite for maintenance and reusability.
In summary, the following points were made about the methodology of grammar
engineering:
• Collection of linguistic data is a prerequisite.
• A non-formal description of linguistic phenomena should exist, according to
which the coverage of a formal grammar can be specified.
• A generic semantic interface language is needed in order to have a uniform output
for different grammars and a criterion for evaluating the correctness of the output.
• Grammar engineering is similar to programming in that structuring, regular
testing, and extensive documentation are needed.
• Linguistic intuition and solid linguistic training are essential.
• Grammar writing should be introduced into (computational) linguistics curricula.
• "Legislation", i.e., careful documentation of design decisions, is useful for
collaborative work.
Formalisms
No one knows what to do with an f-structure.
(Annie Zaenen)
The participants of the workshop reported experiences with special-purpose
formalisms (Tree Adjoining Grammar, Trace Unification Grammar, Lexical Functional
Grammar, Meaning-Text Model) and with general-purpose formalisms (PATR-II, STUF,
Prolog), in which grammars based on theories like HPSG or CUG were implemented.
Formalisms have been shaped by practical needs. Jane Robinson talked of the evolution
rather than the development of a formalism. For example, parametrized templates were
introduced because they were needed by the grammar developers.
General formalisms are preferred to those that are constrained by a particular linguistic
theory. There was agreement that formalisms cannot and should not be idiot proof (as a
formalism would be in which one can express only what is within the bounds of
universal grammar).
Theoretical linguists are interested in finding the most constrained formalism that
embodies universal grammar, while grammar developers need a general formalism,
because they also have to describe phenomena that do not belong to the core grammar.
The necessity to handle exceptions can lead to a proliferation of features in a grammar
that would otherwise be simple and elegant. General purpose formalisms are also
preferred over specialized formalisms because the linguistic theory is modified and
developed during grammar writing.
As to expressiveness, M. Zoeppritz proposed that new ideas must be expressible in one
place, and not be scattered all over the grammar. Klaus Netter demanded the
possibility of rule-independent declarations and global specifications. In general,
dissatisfaction with stratificational approaches (like EUROTRA) was expressed.
Esther König argued that the current practice puts too much load into feature
structures. She suggested that movement phenomena should not be handled within
feature structures by means of slash features, but rather by the deductive component of
the grammar, as exemplified by extended categorial grammar.
In summary, the following are the requirements of grammar formalisms:
• Formalisms must be declarative.
• Formalisms must be expressive and provide convenient notation; templates and
macros are needed.
• Grammatical principles and generalizations must be expressible in one place.
• Formalisms must allow efficient, incremental and bidirectional processing.
• Formalisms cannot and should not be idiot proof.
• General-purpose formalisms are preferred for an explorative style of grammar
development. They may be replaced by a special-purpose formalism after certain
parameters of the grammar have been fixed.
• The handling of exceptions must be supported.
Tools
There are no types of knowledge that exhibit more complexity and interdependence
than grammatical competence. Clearly, the development of large grammars cannot be
done with pen and paper alone; tools are urgently needed for maintaining the
consistency of the grammar and for checking its correctness and completeness with
respect to an intended fragment. The role of development tools in software engineering
cannot be overestimated, but appropriate development tools for grammar engineering
are still missing. Although there has been noticeable progress in the design of grammar
development environments, existing tools do not support distributed development.
Neither do they offer sufficient facilities for the organization and presentation of the
grammar that could help the computational linguist cope with the complexity of the
subject matter. The most advanced technologies for working with highly associative
knowledge need to be exploited. Among them are visually supported knowledge
navigation techniques.
Most of the tools in use today consist of a grammar formalism, a parser, and means for
inspecting the parse results. Grammar engineering today is based on the
edit-parse-inspect cycle: a grammar is written or modified, then sentences are parsed
and the results are analyzed in order to evaluate and debug the grammar.
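Reduced to a schematic driver loop, the cycle looks roughly as follows; the loader,
parser and display function are invented stubs standing in for the corresponding
components of a real environment:

    # The edit-parse-inspect cycle as a driver loop (stubs invented).
    def load_grammar(path):
        with open(path) as f:           # re-read the grammar after each edit
            return f.read()

    def parse(grammar, sentence):
        return [f"(S {sentence})"]      # stub: pretend there is one analysis

    def show(sentence, analysis):
        print(sentence, "->", analysis)

    def edit_parse_inspect(grammar_file, test_sentences):
        while True:
            input(f"Edit {grammar_file}, then press return to re-parse... ")
            grammar = load_grammar(grammar_file)
            for sentence in test_sentences:
                for analysis in parse(grammar, sentence):
                    show(sentence, analysis)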
Tools are not always used as intended: the LFG workbench was designed as an
educational tool, but it is used as a grammar engineering tool today.
Wolfgang Wahlster has conducted a survey of existing grammar engineering
environments (in particular GDE, Wednesday2, TagDevEnv, D-PATR and the LFG
Workbench) and compared their functionalities. The results of this survey and the
discussion during the workshop led to the following list of requirements of grammar
engineering tools:
• Tools must be convenient to use, and include navigation, browsing and help
facilities.
• Tools should be unobtrusive and not interrupt the creative process. For example,
it should be possible to turn off syntax checking and consistency checking while
writing a grammar.
• The tools must support grammar administration and be able to keep different
versions of a grammar.
• Tools for constant testing of each extension and revision of the grammar are
needed.
• Speed is an important factor because the tools are constantly used. A grammar
engineering tool should include a fast parser, and allow for incremental
compilation of the grammar or for mixing of interpreted and compiled code.
• Reasonable error messages must be provided.
• Structure-oriented editors, especially graphic editors for trees or feature structures,
are useful because they allow the same graphic format to be used for data input
(the grammar) and output (the parse results).
• Macro processors are needed to facilitate grammar writing, and type and template
hierarchies are needed to capture generalizations.
• The tools should support documentation of the grammar and provide visually
supported knowledge navigation techniques that give access to the different
knowledge sources, explanatory texts, and to the parser, generator etc.
• Tools for debugging should include presentation of parse results and partial
analyses, the possibility to discover where unifications failed, and the possibility
to see which knowledge elements (rules, templates, lexical entries) are responsible
for errors and to locate the definitions of these knowledge elements in the source
files.
• A tracer, stepper and backtracer were suggested for debugging.
• "Instrumentation of the parser" is recommended to obtain measurements and
statistics of parse times, the structures that were built, and the behavior of
individual rules.
• Facilities for display and inspection of type and template hierarchies are needed.
• Tools for lexicon development are needed.
• Facilities for comparing files and parse results are needed.
• A generator of representative language samples is needed to control
overgeneration.
• Consistency checking of the grammar was considered desirable, but there were
doubts about its feasibility. A grammar engineering tool should support
"legislation", i.e., agreements about the names of categories, features etc.
• If structures have been converted to some normal form for processing, a
correspondence mapper is needed to display structures in the original notation.
Currently the linguists build their own tools, especially in the United States, but there
was general agreement that more effort must be put into the implementation of
grammar engineering tools by professional software developers. A grammar
engineering tool must meet all the requirements of professionally developed software,
e.g., recovery from a system crash, backup copies of files etc. However, the question
was raised whether the market for such tools is large enough.
Wahlster proposed two types of systems. The first is a polytheoretical workbench,
which can process different grammatical formalisms but provides a uniform user
interface for all of them. For cooperative grammar development, a number of
workbenches and a workbench server should be connected by a fast FDDI network
(100 Mbit/s). The tools
should support conflict resolution, take minutes of team discussions, keep different
versions of the grammar etc., by making use of Computer Supported Cooperative Work
(CSCW) technology. The other system proposed by Wahlster is the Linguistic Toolbox,
a laptop-based natural language system for linguistic field work. The toolbox should
support communication with the workbench located in a different place.
While Wahlster advocated one integrated powerful workbench, John Nerbonne
considered a collection of simple tools more useful.
Conclusion
Until very recently, all large grammar development was performed with representation
languages and tools that did not permit the direct utilization of progress in theoretical
linguistics for natural language processing. With the emergence of declarative
grammar formalisms in linguistics, this situation was remedied. Several contemporary
feature unification formalisms are shared by theoretical linguistics and computational
systems building. They have opened promising new directions in the abstract
specification and modular organization of linguistic knowledge.
However, presentations and discussion at the workshop have shown that prerequisites
are still missing for four tasks:
1. efficient large scale development (distributed grammar engineering),
2. extending the coverage of large grammars (engineering of very large grammars),
3. recycling of existing grammars (grammar reusability),
4. specifying, evaluating and comparing grammars (grammar specification).
Despite the progress in the area of representation formalisms and development tools,
the new means have not yet enabled grammar developers to overcome the language
engineering bottleneck. At this stage, large-scale collective grammar engineering
efforts are highly unlikely to yield systems in the short or medium term.
The missing prerequisites fall into three classes: concepts, tools and training.
First, new concepts are needed for the specification, organization, modularization, and
implementation of grammatical competence. The logic-based declarative grammar
formalisms offer powerful means for developing these new concepts, but those will not
come about without extensive efforts dedicated to this task. Existing engineering tools
such as convenient development environments and toolboxes support state-of-the-art
formalisms, but they do not offer means for diagnosis, consistency control and, most
importantly, distributed collective development. Engineering tools with such extended
functionality can obviously only be built on the basis of the envisaged new formal
engineering concepts. Finally, computational linguists need to be trained for the
development of grammars according to the new concepts and tools.
Grammar formalisms have changed rapidly and drastically, and so have the linguistic
processing systems that depend on them. The fast evolution in this area has had the
undesirable side effect that existing large grammars are not used anymore in new
systems. Often the development stops when the project ends or when the main
developer leaves the project. Since the development of grammars is a very costly
endeavor, valuable resources are wasted.
Prerequisites for achieving the reusability of grammatical resources are mathematical
concepts and a representation language for the abstract specification of grammatical
knowledge. An abstract declarative specification language with a clean semantics is
needed for the specification of grammatical competence. Current developments in the
area of typed feature unification formalisms already move in the right direction. The
observed convergence of formalisms could lead to specification standards that may
serve as the basis for grammar reusability. Again, this development will not take place
without considerable additional efforts.
As a result of the workshop, we feel strongly that the following practical steps need to
be taken in order to overcome the deficiencies of current grammar engineering:
• research projects on the modularization of grammars in close connection with the
development of grammar engineering methods,
• research projects on the theory-neutral abstract specification of linguistic analyses
and observations,
• development of a generic semantic interface language, which would make it
possible to specify the input/output behavior of a grammar,
• professional implementation of convenient and powerful engineering tools,
• the large-scale collection, annotation and classification of linguistic data as the
basis for evaluation and for faster processing models (statistical methods),
• linguistic test suites as the basis for tools for diagnosis and consistency
maintenance,
• mandatory courses on grammar development in computational linguistics
curricula.
The existence of the desired concepts and tools will undoubtedly boost the
productivity of grammar development with obvious implications for the commercial
success of computational linguistics research. We are well aware of the enormous
efforts in linguistics research still required before computational grammars will
approximate the linguistic competence of human speakers. But the current situation
where it takes a decade to develop a grammar that covers a language fragment that has
been covered before hinders progress in language technology. The way out of this
deplorable state of affairs is a new kind of grammar engineering with methods, tools
and practitioners as effective as their counterparts in today's software engineering.
Acknowledgments
We want to thank all those who contributed to the success of the workshop. The
workshop was financially supported by IBM Germany through the project LILOG-SB
conducted at the University of the Saarland at Saarbrücken and by the German Federal
Ministry of Research and Technology through the project DISCO carried out at the
German Research Center for Artificial Intelligence (DFKI) in Saarbrücken. During the
workshop, all talks and discussions were taken down in shorthand by Petra
Schwenderling, a stenographer in the State Parliament of Baden-Württemberg, and
were also recorded on tape.
Special thanks to Bobbye Pernice, who proofread the 90 pages produced by the
stenographer and transcribed some of the tapes.
We are especially grateful to those participants who provided helpful comments and
corrections on earlier drafts.
References
[Apresyan et al. 1989]
Yu. D. Apresyan, I. M. Boguslavski, L. L. Iomdin, A. V. Lazurski, I. V. Pertsov, V. Z.
Sannikov, L. L. Tsinman. Lingvisticheskoe obespechenie sistemy ETAP-2. Moscow:
Nauka, 1989.
[Nagao 1988]
Makoto Nagao (Panel Organizer). Panel: Language Engineering: The Real Bottleneck
of Natural Language Processing. Proceedings of COLING '88, Budapest, pp. 448-453.