Columns Reviews Volume 12, Number 1 February 2008



From the Editors
by Dorothy Chun & Irene Thompson
pp. 1-2
On the Net
You've Got Some GALL: Google-Assisted
Language Learning
by George Chinnery
pp. 3-11
Emerging Technologies
Of Elastic Clouds and Treebanks: New
Opportunities for Content-Based and Data-Driven Language Learning
by Robert Godwin-Jones
pp. 12-18
News from Sponsoring Organizations
pp. 19-22
Edited by Sigrun Biesenbach-Lucas
Review of Five English Learners' Dictionaries
Reviewed by Alfonso Rizo-Rodríguez
pp. 23-42
Peer Feedback on Language Form in Telecollaboration
Paige Ware, Southern Methodist University
Robert O'Dowd, Universidad de León
pp. 43-63
The Role of Offline Metalanguage Talk in
Asynchronous Computer-Mediated Communication
Keiko Kitade
Ritsumeikan University, Kyoto
pp. 64-84
Methodological Hurdles in Capturing CMC Data:
The Case of the Missing Self-Repair
Bryan Smith
Arizona State University
pp. 85-103
Commentary: Can Free Reading Take You All the
Way? A Response to Cobb (2007)
Jeff McQuillan, Center for Educational Development
Stephen D. Krashen, University of Southern California
pp. 104-108
Commentary: Response to McQuillan and Krashen
Tom Cobb
Université du Québec à Montreal
pp. 109-114
Contact: Editors or Editorial Assistant
Copyright © 2007 Language Learning & Technology, ISSN 1094-3501.
Articles are copyrighted by their respective authors.
About Language Learning & Technology
Language Learning & Technology is a refereed journal which began publication in July 1997. The journal
seeks to disseminate research to foreign and second language educators in the US and around the world
on issues related to technology and language education.
Language Learning & Technology is sponsored and funded by the University of Hawai'i National
Foreign Language Resource Center (NFLRC) and the Michigan State University Center for
Language Education And Research (CLEAR), and is co-sponsored by the Center for Applied
Linguistics (CAL).
Language Learning & Technology is a fully refereed journal with an editorial board of scholars in
the fields of second language acquisition and computer-assisted language learning. The focus of
the publication is not technology per se, but rather issues related to language learning and
language teaching, and how they are affected or enhanced by the use of technologies.
Language Learning & Technology is published exclusively on the World Wide Web. In this way,
the journal seeks to (a) reach a broad audience in a timely manner, (b) provide a multimedia
format which can more fully illustrate the technologies under discussion, and (c) provide
hypermedia links to related background information.
Beginning with Volume 7, Number 1, Language Learning & Technology is indexed in the
exclusive Institute for Scientific Information's (ISI) Social Sciences Citation Index (SSCI), ISI
Alerting Services, Social Scisearch, and Current Contents/Social and Behavioral Sciences.
Language Learning & Technology is currently published three times per year (January, May,
Sponsors, Board, and Editorial Staff
Volume 12, Number 1
University of Hawai`i National Foreign Language Resource Center (NFLRC)
Michigan State University Center for Language Education and Research (CLEAR)
Center for Applied Linguistics (CAL)
Advisory and Editorial Boards
Advisory Board
Susan Gass, Michigan State University
Richard Schmidt, University of Hawai`i
Editorial Board
Sigrun Biesenbach-Lucas, Georgetown University
Thierry Chanier, Université de Franche-Comté
Graham Crookes, University of Hawai`i
Robert Godwin-Jones, Virginia Commonwealth Univ.
Lucinda Hart-González, Second Language Testing, Inc.
Philip Hubbard, Stanford University
Michelle Knobel, Montclair State University
Marcus Kötter, University of Münster
Marie-Noelle Lamy, The Open University
Lara Lomicka, University of South Carolina
Allan Luke, University of Queensland
Mary Ann Lyman-Hager, San Diego State University
Alison Mackey, Georgetown University
Carla Meskill
Denise Murray, San Jose State University
Noriko Nagata, University of San Francisco
John Norris, University of Hawai`i
Lourdes Ortega, University of Hawai`i
Jill Pellettieri, Santa Clara University
Joy Kreeft Peyton, Center for Applied Linguistics, Washington, DC
Maggie Sokolik, University of Cal., Berkeley
Susana Sotillo, Montclair State University
Leo van Lier, Monterey Institute of International Studies
Mark Warschauer, Univ. of California, Irvine
Editorial Staff
Dorothy Chun
Irene Thompson
Associate Editors
Richard Kern
Editorial Assistant
Web Production
Book & Multimedia
Review Editor
On the Net Editors
Technologies Editor
Copy Editors
Batia Laufer
Hunter Hatfield
Carol Wilson-Duffy
Sigrun Biesenbach-Lucas
Jean W. LeLoup
Robert Ponterio
Robert Godwin-Jones
Stephanie Alexis
Matthew Buscemi
Elizabeth Pfaff
Suann Robinson
University of CA, Santa
The George Washington
University (Emerita)
University of CA,
University of Haifa
University of Hawai`i
Michigan State
Georgetown University
SUNY at Cortland
SUNY at Cortland
Virginia Commonwealth
Indiana University
University of Hawai`i
University of Hawai`i
University of Hawai`i
The contents of this publication were developed under a grant from the Department of Education (CFDA 84.229,
P229A60012-96 and P229A6007). However, the contents do not necessarily represent the policy of the Department
of Education, and one should not assume endorsement by the Federal Government.
Information for Contributors
Language Learning & Technology is seeking submissions of previously unpublished manuscripts on any
topic related to the area of language learning and technology. Articles should be written so that they are
accessible to a broad audience of language educators, including those individuals who may not be familiar
with the particular subject matter addressed in the article. General guidelines are available for reporting
on both quantitative and qualitative research.
Manuscripts are being solicited in the following categories:
Articles | Commentaries | Reviews
Articles should report on original research or present an original framework that links previous research,
educational theory, and language teaching practices that utilize technology. Articles containing only
descriptions of software, classroom procedures, or those presenting results of attitude surveys without
discussing data on actual language learning outcomes will not be considered. Full-length articles should
be no more than 8,500 words in length, including references, and should include an abstract of no more
than 200 words. Appendices should be limited to no more than 1,500 words. We encourage articles that
take advantage of the electronic format by including hypermedia links to multimedia material both within
and outside the article.
All article manuscripts submitted to Language Learning & Technology go through a two-step review process:
Step 1: Internal Review. The editors of the journal first review each manuscript to see if it meets the basic
requirements for articles published in the journal (i.e., that it reports on original research or presents an
original framework linking previous research, educational theory, and teaching practices), and that it is of
sufficient quality to merit external review. Manuscripts which do not meet these requirements or are
principally descriptions of classroom practices or software are not sent out for further review, and authors
of these manuscripts are encouraged to submit their work elsewhere. This internal review takes about 1-2
weeks. Following the internal review, authors are notified by e-mail as to whether their manuscript has
been sent out for external review or, if not, why.
Step 2: External Review. Submissions which meet the basic requirements are then sent out for blind peer
review from 2-3 experts in the field, either from the journal's editorial board or from our larger list of
reviewers. This second review process takes 2-3 months. Following the external review, the authors are
sent copies of the external reviewers' comments and are notified as to the decision (accept as is, accept
pending changes, revise and resubmit, or reject).
Titles should be concise (preferably fewer than 10 words) and adequately descriptive of the content of the
article. Some good examples are
Social Dimensions of Telecollaborative Foreign Language Study
"Reflective Conversation" in the Virtual Language Classroom
Teaching German Modal Particles: A Corpus-Based Approach
Commentaries are short articles, usually no more than 2,000 words, discussing material previously
published in Language Learning & Technology or otherwise offering interesting opinions on theoretical
and research issues related to language learning and technology. Commentaries which comment on
previous articles should do so in a constructive fashion. Hypermedia links to additional information may
be included. Commentaries go through the same two-step review process as for articles described above.
Submission Guidelines for Articles and Commentaries
Please list the names, institutions, e-mail addresses, and if applicable, World Wide Web addresses
(URLs), of all authors. Also include a brief biographical statement (maximum 50 words, in sentence
format) for each author. (This information will be temporarily removed when the articles are distributed
for blind review.)
Articles and commentaries can be transmitted in either of the following ways:
1. By electronic mail, send the main document and any accompanying files (images, etc.) to
[email protected]
2. By mail, send the material on a Macintosh or IBM diskette to
University of Hawai'i at Manoa
1859 East-West Road, #106
Honolulu, HI 96822
Please check the General Policies below for additional guidelines.
Language Learning & Technology publishes reviews of professional books, classroom texts, and
technological resources related to the use of technology in language learning, teaching, and testing.
Reviews should normally include references to published theory and research in SLA, CALL, pedagogy,
or other relevant disciplines. Reviewers are encouraged to incorporate images (e.g., screen shots or book
covers) and hypermedia links that provide additional information, as well as specific ideas for classroom
or research-oriented implementations.
Reviews of individual books or software are generally 1,200-1,600 words long, while comparative
reviews of multiple products may be 2,000 words or longer. They can be submitted in ASCII, Rich Text
Format, Word, or HTML. Accompanying images should be sent separately as jpeg or gif files. Reviews
should include the name, institutional affiliation, e-mail address, URL (if applicable), and a short
biographical statement (maximum 50 words) of the reviewer(s). In addition, the following information
should be included in a table at the beginning of the review:
Series (if applicable)
City and country
Title (including previous titles, if applicable) and
version number
Minimum hardware requirements
Publisher (with contact information)
Support offered
Year of publication
Number of pages
Target language
Target audience (type of user, level, etc.)
ISBN (if applicable)
LLT does not accept unsolicited reviews. Contact Sigrun Biesenbach-Lucas if you are interested in
having material reviewed or in serving as a reviewer ([email protected]).
Sigrun Biesenbach-Lucas
21333 Comus Court
Ashburn, VA 20147
General Policies
The following policies apply to all articles, reviews, and commentaries:
All submissions should conform to the requirements of the Publication Manual of the American
Psychological Association (4th edition). Authors are responsible for the accuracy of references and
citations, which must be in APA format.
Manuscripts that have already been published elsewhere or are being considered for publication
elsewhere are not eligible to be considered for publication in Language Learning & Technology. It is the
responsibility of the author to inform the editor of any similar work that is already published or under
consideration for publication elsewhere.
Authors of accepted manuscripts will assign to Language Learning & Technology the permanent right to
electronically distribute their article, but authors will retain copyright and, after the article has appeared in
Language Learning & Technology, authors may republish their text (in print and/or electronic form) as
long as they clearly acknowledge Language Learning & Technology as the original publisher.
The editors of Language Learning & Technology reserve the right to make editorial changes in any
manuscript accepted for publication for the sake of style or clarity. Authors will be consulted only if the
changes are major.
Authors of published articles, commentaries, and reviews will receive 10 free hard-copy offprints of their
articles upon publication.
Articles and reviews may be submitted in the following formats:
HTML files
Microsoft Word documents
RTF documents
ASCII text
If a different format is required in order to better handle foreign language fonts, please consult with the
Language Learning & Technology
February 2008, Volume 12, Number 1
p. 1-2
It is our pleasure to introduce Volume 12, Number 1 of Language Learning &
Technology, a regular issue of our journal. We want to take this opportunity to wish you
a happy and productive 2008, a year proclaimed by the United Nations General
Assembly as the International Year of Languages. We are proud to be a part of this
international effort to promote the study of languages and cultures worldwide.
We want to thank our contributors, reviewers, and readers for making 2007 a very
successful year for our journal. The number of subscribers grew from 8,500 in 2006 to
10,600 in 2007. We received a record number of 144 submissions from 31 countries in
2007, up from 105 in the previous year. We are looking forward to 2008 being an even
better year.
This issue features three articles and two commentaries in addition to our regular
columns. The three articles coincidentally all deal with various issues involved in
computer-mediated communication.
"Peer feedback on language form in telecollaboration" by Paige Ware and Robert
O’Dowd explores corrective peer feedback on form in asynchronous discussions. Their
findings indicate that such feedback occurred only when students were explicitly
required to provide it. Pedagogical implications include the need to situate peer
feedback on form within current models of telecollaboration and to assist students in
finding feedback strategies that do not require a sophisticated understanding of L1 or L2
"The role of metalanguage talk in asynchronous computer-mediated communication" by
Keiko Kitade examines the benefits of offline dialogue in an asynchronous computer-mediated
communication (ACMC) activity. The study suggests that offline dialogue
may compensate for lack of instant tailored feedback in ACMC. The author
recommends further investigation of the potential of offline interactions for creating a
collaborative context, not only among online interlocutors but also among offline peers.
"Methodological hurdles in capturing CMC data: The case of the missing self-repair" by
Bryan Smith studies the use of self-repair among learners of German in a task-based
CMC environment in order to (1) establish how potential interpretations of CMC data
may depend on the method of data collection and evaluation and (2) explicitly examine
the nature of CMC self-repair in the task-based foreign language CALL classroom. His
results show that the interpretation of the chat interaction is a function of the data
collection and evaluation methods employed. The findings also suggest a possible
difference in the nature of self-repair across face-to-face and SCMC environments. In
view of the results, this paper calls for CALL researchers to abandon the reliance on
printed chat log files when attempting to interpret SCMC interactional data.
The commentary "Can free reading take you all the way: A response to Cobb (2007)" by
Jeff McQuillan and Stephen Krashen argues that in "Computing the Demands of
Vocabulary Acquisition from Reading" (Language Learning & Technology, October,
2007), Cobb underestimated the amount of reading that even a very modest reading
habit would afford L2 readers, and therefore underestimated the impact of free reading
on L2 vocabulary development. In addition, the authors point out that Cobb’s own data
show that free reading is a very powerful tool in L2 vocabulary acquisition.
Copyright © 2008, ISSN 1094-3501
In his commentary "Response to McQuillan and Krashen (2008)", Cobb questions the
adequacy of free reading for vocabulary development in the typical time frame of
instructed L2 acquisition. He suggests that the development of an adequate L2 lexicon
results from well-designed L2 instruction that includes, but is not limited to, reading.
The "On the Net" column "You’ve got some GALL: Google-assisted language learning"
by guest contributor George Chinnery proposes a number of interesting ways in which
the power of Google can be harnessed for pedagogical purposes.
The "Emerging Technologies" column "Of elastic clouds and treebanks: New
opportunities for content-based and data-driven language learning" by Robert Godwin-Jones
describes a plethora of new technical developments that make it possible to use
large data sets for language learning.
Our Reviews column edited by Sigrun Biesenbach-Lucas contains a detailed review of
five English learners’ dictionaries on CD-ROM by Alfonso Rizo-Rodríguez.
Please take a look at the updated list of PhD dissertations dealing with language learning
and technology. We would like to thank Dr. Evelyn Reder Wade of UC Santa Barbara
who compiled the original list and provided the current updated one.
Please take a few minutes to complete the LLT subscription form if you have not already
done so.
Dorothy Chun and Irene Thompson,
Language Learning & Technology
Language Learning & Technology
February 2008, Volume 12, Number 1
pp. 3-11
You’ve Got Some GALL: Google-Assisted Language Learning
George M. Chinnery
University of Maryland – Baltimore County
"Just google it!"
"Have you googled it yet?"
"I'll google it later."
Commands, inquiries, and intentions of this sort have become so commonplace in class discussions,
during meetings, over dinner, and on the phone as to approach cliché. One article making the rounds on
the AP wire even investigated "googling your date" (Irvine, 2007).
The impact of the internet on the English, and global, lexicon is nothing new. It has become habitual to
send e-mails or text messages in lieu of using snail-mail or calling on the phone. Many other forms of
computer-mediated communication have similarly found themselves both publicly and officially
recognized. In 2004, blog was named Merriam-Webster's word of the year ("Blog Picked," 2004;
Merriam-Webster Online, 2005). Likewise, the New Oxford American Dictionary named podcast
its 2005 word of the year ("Wordsmiths Hail Podcast," 2005). Google entered Merriam-Webster the
next year, though only as runner-up for word of the year, losing out to truthiness (Ahrens, 2006;
Merriam-Webster, 2006). What is unique about Google’s cross-over is not only the fact that its brand
name has trumped its function, in the same way many of us blow our noses in kleenex, toss frisbees, and
dress our wounds with band-aids, but that it is this function with which it is synonymous (i.e., it’s a verb).
As such, no longer do we simply ‘search’ for something online. Now we google it.
In 1998, co-founders Larry Page and Sergey Brin launched their newly renamed search engine, Google.
Acknowledging the mathematical origins of its moniker (a ‘googol’ is 1 followed by 100 zeros), a
statement on its website indicates that "Google's play on the term reflects the company's mission to
organize the immense amount of information available on the web." In the eyes of the public, this mission
was seemingly accomplished very quickly, such that in 2003, a New York Times columnist somewhat
sarcastically asked "Is Google God?" (Friedman, 2003). In 2004, Wired magazine celebrated
Googlemania, chronicling the site’s rise to the summit of search and its impending death-match with
Microsoft (Malone, 2004). In 2005, the fictional retrospective documentary, EPIC 2015, mockingly
documented the Evolving Personalized Information Construct, portending a mammoth cyber-merger with
Amazon which would deliver its GoogleZon progeny to the world. In 2006, ‘Google’ itself was the most
searched term on AOL’s search engine during a three-month period; that is, Google itself was apparently
being googled (Nakashima, 2006a). Come 2007, half of all US searches were being conducted on Google
(comScore, 2007a). By March of the same year, Google was reportedly the "world’s most-visited site"
(Kopytoff, 2007). And at least two major universities were offering courses in ‘Google’ (McCloskey,
2007). Numerous books have been published on the subject of Google, covering it both as a successful
business model and a powerful internet tool. Following the theme of the latter, Google offers a range of
practical applications for language instructors and learners alike.
Since the internet’s inception, language instructors have recognized its informational potential.
Corpus linguistics, for instance, went online with web-based linguistic corpora and KWIC (key
word in context) concordancers (e.g., MiCASE). Google has itself even been proposed as a ‘quick ‘n
dirty’ concordancer (Robb, 2003; Rundell, 2000). But it also has the capability to do much more than
simply facilitate basic Boolean-type searches.
Google as an Informative Tool
At a basic level, Google, by default, checks for and corrects spelling errors, such that a query for ‘cofee’
proffers ‘Did you mean: coffee.’ Beyond superficial form, however, learners can discover meaning by
appending a dictionary command to the start of a term (e.g., ‘define:coffee’). Google can also focus on
usage. The 'define:coffee' command offers several common collocations (e.g., coffee break, Turkish
coffee) at the top of the page. Typing into Google Suggest will preview similar collocations. And in using
a wildcard command such as ‘I drink * coffee’, the asterisk acts as a placeholder for a gap-fill, and results
in a range of potential responses. This is also useful for phrasal verbs, such that by typing ‘come * with’,
learners discover ‘come up with’, ‘come away with’ and more. Another way to maintain context is to
search authentic texts in Google Books, where a search for ‘coffee’ introduces learners to rich prose
describing ‘roasted coffee’ and ‘steaming coffee’. Learners curious about the different synonyms for
coffee can compare the regional popularity of their usage at Google Trends. A search for ‘cup of coffee,
cup of joe, cup of mud, cup of java’ will not only inform the inquisitive learners of the global popularity
of ‘cup of coffee’, but also that ‘cup of joe’ is not uncommon in the United States, particularly in New
York. Conducting such comparative searches on Google Fight, a Google hack (an unauthorized
modification by a third party), provides a more animated and entertaining display of the results. To
discover synonyms in the first place, learners can prefix a given term with a tilde (e.g., ~coffee), which
searches not only for said term, but also popular related terms.
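The query patterns described in this paragraph can also be assembled programmatically, for instance when an instructor prepares a worksheet of search links. What follows is only a minimal Python sketch and not part of the original column: the helper names (search_url, define, gap_fill, related) are hypothetical, the address https://www.google.com/search with a q parameter is an assumption, and the operators are reproduced as the column describes them, so they may not all behave this way today.

```python
from urllib.parse import urlencode

# Assumed search endpoint; the column itself gives no URL.
BASE = "https://www.google.com/search"


def search_url(query: str) -> str:
    """Return a search URL with the query percent-encoded."""
    return f"{BASE}?{urlencode({'q': query})}"


def define(term: str) -> str:
    """Dictionary lookup as described in the column, e.g. define:coffee."""
    return f"define:{term}"


def gap_fill(before: str, after: str) -> str:
    """Wildcard query: the asterisk stands in for the missing word(s)."""
    return f"{before} * {after}"


def related(term: str) -> str:
    """Tilde prefix, which the column says also matches related terms."""
    return f"~{term}"


if __name__ == "__main__":
    # The example queries below are the ones used in the column.
    queries = [
        define("coffee"),
        gap_fill("I drink", "coffee"),
        gap_fill("come", "with"),
        related("coffee"),
    ]
    for q in queries:
        print(f"{q:20} {search_url(q)}")
```

Keeping the query builders separate from the URL encoding means the same strings can be typed directly into the search box by learners or turned into clickable links by the instructor.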
Vocabulary development can be encouraged more interactively through use of the Google Image Labeler,
a real-time two-player game where the goal is to reciprocally label a series of digital photos. Google Sets
provides another option useful even to beginning learners: an opportunity for listing and brainstorming. A
search for ‘black’, for instance, displays an extensive list of other colors. Google also offers several tools
for beginning learners’ numeracy work. Typing ‘3 x 2’ into Google turns it into an instant calculator.
Queries patterned after official exchange lingo (e.g., ‘3.99 USD in RUB’) offer updated currency
conversions. Adapting ‘weather Seattle’ displays local forecasts both textually and graphically. And
simply typing an accurate address into Google directs learners to Google Maps.
Another option well-suited to beginning, as well as more advanced, language learners is the Google
Language Tools page, which provides interfacing in over 100 legitimate and faux (e.g., Elmer Fudd)
languages. Interface and search language ‘preferences’ can also be set from Google’s home page, such
that all results are restricted to the language of choice. Also available for many of these are search and
translation services. Entire websites can be translated in mere seconds. The creative 1888usa Google hack
combines Google's translations with AT&T’s speech synthesis (a.k.a., text-to-speech) demo.
More advanced learners can be encouraged to manipulate and interact with their target language by
conducting creative webquests on Google. For example, learners can type in a few random ingredients
(e.g., ‘black beans brown rice tomatoes cilantro’) to see what recipes Google can concoct. The Cookin'
with Google hack performs similar searches, exclusively on several popular recipe sites. Google can also
be used to guide learners in more traditional webquests. Returning to the coffee illustration, Google
would enable learners conducting research on the history of coffee to search for information on preselected sites (e.g., ‘coffee’). By clicking the ‘Cached’ link under any of
the search results, the search terms are brightly highlighted. And if said learners are in need of more
information, use of the link command (e.g., will provide referral to sites
linked to the given source.
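A guided webquest of this kind could be prepared in advance by generating one restricted query per approved site. The sketch below is illustrative only: the column's own examples did not survive in this text, so the use of the site: operator for restricting a search, the domains example.org and example.edu, and the function names are all assumptions made for the illustration; link: is the command named above.

```python
from urllib.parse import urlencode

# Assumed search endpoint; not given in the column.
BASE = "https://www.google.com/search"


def restrict_to_sites(terms: str, sites: list[str]) -> list[str]:
    """Build one query per pre-selected site: '<terms> site:<domain>'."""
    return [f"{terms} site:{site}" for site in sites]


def linked_to(page: str) -> str:
    """Build a link-command query for pages that link to the given page."""
    return f"link:{page}"


def to_url(query: str) -> str:
    """Percent-encode a query into a search URL."""
    return f"{BASE}?{urlencode({'q': query})}"


if __name__ == "__main__":
    # Hypothetical pre-selected sites for a history-of-coffee webquest.
    approved = ["example.org", "example.edu"]
    for q in restrict_to_sites("history of coffee", approved):
        print(to_url(q))
    print(to_url(linked_to("example.org")))
```

Generating the links ahead of time keeps learners on the sites the instructor has vetted while still letting them practice reading and refining real search results.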
Instructors desiring greater control over learners’ search activities can tailor their own search engines
using Google Coop. For example, this Ethiopian Coffee search engine will only search pre-selected sites
identified on that page. Instructors and learners unable or uninterested in recalling the aforementioned
commands can access Google’s Advanced Search page, which provides a more user-friendly interface for
many of these.
Though Google’s range of search tools is in and of itself impressive, perhaps what makes it all the more
powerful is its recognition of the internet’s potential. e-Language Learning describes the use of modern
web-based tools for learning opportunities which are informative, productive, collaborative,
communicative, and aggregative. The preceding examples illustrate how Google successfully provides
myriad opportunities for the first of these, in essence employing its most traditional use as an information
provider. Google also offers a comprehensive suite of free programs (to anyone who registers for an
account) which help facilitate the remainder.
Google as a Productive Tool
An article in the New York Times once declared that "all the Internet’s a stage" (Stross, 2006). Thus,
whereas the heavily informative quality of Google can be aligned with language input, its productive tools
foster opportunities for output. This reality is reflected in blogging sites like Google’s Blogger, which
allows learners to instantly author, publish, and syndicate their own textual, audiovisual, and generally
multimedia productions for a global audience. Google Docs offers collaborative web-based word
processing. Essentially, it is like a free web-based version of Microsoft Word. One of its key strengths is
its ability to be shared, like a wiki (incidentally, Google has acquired Jotspot, a popular wiki), a feature
which allows for a plethora of creative applications. An instructor might post a text, intentionally replete
with errors, for learners to correct. Likewise, learners can easily peer-edit, as this program leaves an
editing trail. Another option is chain storytelling, where an instructor begins a story which each learner
contributes to in turn. Such a tool is useful in group projects in general. Another feature of Google Docs is
web-based spreadsheets, similar to Microsoft Excel, that instructors can use for attendance-keeping and
Google as a Collaborative Tool
Vygotskyan constructivism (Vygotsky, 1978) posits that knowledge construction and meaning-making
are best facilitated via scaffolded collaboration. The aforementioned tools clearly provide for such
collaborative opportunities. This potential is further enhanced through use of Google Calendar, which can
be used for scheduling and sending out reminders, and Google Groups, which can also be used to send
out announcements, as well as to facilitate asynchronous class discussions.
Google as a Communicative Tool
Google also offers its own versions of some of the more standard communicative tools, which provide
opportunities for interaction/negotiation in the target language. Gmail is Google’s email program, and
Google Talk is its instant messenger-cum-internet telephony service, which allows users to save, print and
email text chats. The latter can be used as the medium of communication between pairs of learners
engaging in classic cooperative activities such as jigsaw tasks. The benefits of doing so via chat have been
summarized by Swaffar (1998), who indicates that chats "seem to help all individuals in language classes
engage more frequently, with greater confidence, and with greater enthusiasm in the communicative
process than is characteristic for similar students in oral classrooms" (p. 1). Another communicative tool
with which to focus on form is GOOG-411, an automated telephone directory which integrates speech
recognition and text messaging. Used effectively, it can aid in the development of learners’
Google as an Aggregative Tool
In addition to providing learning opportunities that are informative, productive, collaborative and
communicative, Google offers several tools that recognize linguistic, visual, audio, gestural and spatial
literacies in aggregate (New London Group, 1996). On iGoogle, for example, learners can create ‘start
pages’ that collect many of the aforementioned Google tools, as well as many others. Google Reader is a
web feed aggregator which can be used by learners and instructors to collect updated news feeds, blogs,
podcasts and vidcasts together into a single interface. Google Gears allows them to view this content
offline, in the same way that podcasting allows audiovisual content to be downloaded from the internet
for later use. Google Page Creator is a deceptively simple webpage creation tool. The ‘My Maps’ feature
of Google Maps is a mash-up tool which allows learners to tailor-make maps, which they can embed with
descriptive text, and digital images and drawings. Google Earth, which has the ability to take learners
home with satellite precision, can be similarly utilized. Video mash-ups can be created using the Google-owned YouTube Remixer. And another feature of Google Docs is a presentation tool tantamount to
Microsoft PowerPoint. All of these can be used to promote digital storytelling by language learners. On
an iGoogle start-page, for example, learners can present their hometowns, complete with digital images,
weather forecasts, current events, and more. Using the Google Maps mash-up tool, immigrants and
sojourners alike can imaginatively narrate their travels.
Despite the benefits its tools offer to those involved in language instruction and learning, as well as the
population-at-large, Google is not without its critics. Publishers are worried about the repercussions of the
Google Books Library Project, which aspires to create a digital archive of the world’s books, public
domain and otherwise (Ekman, 2006; Sipress, 2007). Spurred by similar copyright concerns of newspaper
editors, Belgian courts ordered Google to stop posting headlines from its national papers on Google News
(White, 2007).
Google has also been accused of selective censorship. For a time, Google blocked web content critical of
the Church of Scientology due to pressure from the Church (Gallagher, 2002). It removed YouTube
videos which the Thai government alleged were insulting to its monarchy (Vandenberghe, 2007). And it
voluntarily agreed to censor itself in China (Crampton, 2006). More globally, it has been argued that sites
ranked highest by Google tend to remain the most popular, thereby restricting public exposure to new
sites, essentially a rich-get-richer phenomenon or ‘googlearchy’ (Hindman, Tsioutsiouliklis, & Johnson,
2003). Though this unique form of technological determinism has been accused of widely influencing the
media (Lohr, 2006), allegations have not been Google-specific (Introna & Nissenbaum, 2000).
Furthermore, these findings have been disputed by others who portray the search engine as more of an
egalitarian ‘googlocracy’ (Menczer, Fortunato, Flammini, & Vespignani, 2006).
Then there is the question of its expansion, prompting discussions over "How much more should it be
allowed to grab?" (Pearlstein, 2007), "Is Google too big?" (Spanbauer, 2007), and even "Is Google’s data
grinder dangerous?" (Keen, 2007), echoing comparisons to a monopolistic Microsoft (Rivlin, 2005).
Perhaps the most widely publicized concerns over Google pertain to privacy. Despite having refused
similar requests from US authorities (Mohammed, 2006), it handed over identifying information of its
users to Brazilian Courts (Nakashima, 2006b). This issue was also raised with the advent of the Street
View feature on Google Maps, which—as the name implies—posts street-level screen shots of certain
locations, complete with unsuspecting passersby (Liedtke, 2007). Google’s decision to save all search
queries by default and for an indefinite period of time prompted calls "to shift the default when storing
personal information back to where it has been for millennia, from remembering forever to forgetting
over time" (Mayer-Schönberger, 2007, p. 17). This all culminated in a 2007 report evaluating internet
privacy, in which Google ranked worst amongst a group of popular websites (Privacy International,
2007). Still, it has requested assistance from the US government to battle international censorship
(Rugaber, 2007) and has agreed to ‘anonymize’ search histories after 18 months and auto-delete cookies
after 24 ("Google Cookies," 2007; Wearden, 2007).
Though these controversies might discourage some from using Google, intrepid instructors can
pedagogically transform them into opportunities for critical thinking, akin to the higher levels of Bloom’s
Taxonomy (Bloom, 1956). When learners come across these issues during an assignment, the instructor
could turn the issue into a class discussion or writing assignment. Moreover, already cognizant of these
issues, the instructor might intentionally plan these ‘teachable moments’ as part of the lesson. In addition
to the aforementioned topics of inquiry, for instance, learners might do some of the following:
1) Compare the results of a Google search with those of Yahoo or another search engine, or with a
search conducted on Scroogle, which claims to ‘scrape’ Google of all its tracking potential.
2) Send emails on pre-selected topics to one another over Gmail and analyze the forthcoming
advertisements embedded at the bottom of the message for their relevance to the original message.
3) Search for Tibet on Google Maps or Google Earth, and use its absence as a discussion prompt over
Google’s policy with China.
4) Google themselves and write a paper based on the results.
5) Hold a mock trademark trial between Google and the inventor of the number ‘googol’ (using stories
from NPR or The Inquirer as prompts).
6) Debate the response of Google’s Advertising Team to the release of Sicko, Michael Moore’s
cinematic attack on the US health care system, using Google’s Health Advertising Blog as a prompt.
7) Discuss the notion of a ‘Google Generation’ and develop a concerted and comprehensive definition of
cyberplagiarism (using this BBC News article as a prompt).
Given Google’s prominence, there is unlikely to be a shortage of provocative issues. Any number of other
controversies can be culled from perusals of Google Watch, Google Blogoscoped, Googlified, and
Google Operating System.
Remember Excite? AltaVista? Even Yahoo, the one-time premier search tool, ultimately suffered with
the arrival of Google, and for a while even adopted the latter’s search technology (Perrone, 2004). Google
has grown so exponentially as to surpass popular estimations on the advancement of technology
(Kurzweil, 2001; Moore, 1965), and the prevailing signs indicate continued development and acquisition.
Co-founder Larry Page has been quoted as stating that "[t]he ultimate search engine would understand
everything in the world. It would understand everything that you asked it and give you back the exact
right thing instantly" (Wray, 2006). Google’s CEO, Eric Schmidt, has more explicitly envisaged Google’s
role in said engine’s development, indicating that "[t]he goal is to enable Google users to be able to ask
the question such as ‘What shall I do tomorrow?’ and ‘What job shall I take?’" (Daniel & Palmer, 2007).
Such statements are suggestive of an ambition to rise from mere search engine to total internet engine. It
could be argued, however, that such efforts to dominate will ultimately destroy the minimalist appeal
which attracted its legions of fans in the first place.
There are indeed hints that Google may falter. Despite its continued dominance in the US and Europe
(comScore, 2007a; 2007b), according to Hargittai (2004), "many people do not use it, do not know about
it, or even if they use it they may not know how to do so well." China’s Baidu, for instance, remains the
nation's most popular search engine, despite Google’s attempts to gain ground (La Monica, 2007). There
are also signs that Google recognizes its own mortality. Concerned over the potential ‘genericide’ of its
name, and its resultant loss of prestige—and even trademark—it has actively canvassed for an end to such
genericized usage, and a return to its status as a proper adjective (à la Xerox) (Ahrens, 2006; Duffy, 2003;
Sturgeon, 2006).
In the meantime, a horde of ‘Google killers’ is looming. And the latest of these search engines are born
more finely tuned than their forebears, so as to be more accurate and useful. Yahoo’s Mindset, for
example, allows users to quantify the degree to which their intentions are commercial or informational.
Natural language processing (NLP) search engines such as Powerset take into account ‘stop words’ (e.g.,
prepositions, articles) which Google ignores by default, thereby being more likely to consider the
difference between ‘taking off’ and ‘taking in’ a shirt. Along the same lines, Q & A search engines like
Hakia allow users to ask questions directly indicative of their meaning. Some, including ChaCha, even
provide live guides. Clustered or federated search engines such as Clusty and the more visually
stimulating KartOO utilize semantic data-mining technologies. Social search engines (e.g., Swicki) take a
‘wisdom of crowds’ approach, learning from the search strategies of their community. Social
bookmarking sites (e.g., similarly utilize user-generated ‘tags’. A modern version of
keywords, tagging is a system of classification which employs an information retrieval method known as
folksonomy, a portmanteau of folks and taxonomy, literally a classification system by and for the people.
Personalized search engines (e.g., Rollyo) allow users to create their own search engines, using sites of
their choice. Yet others (e.g., Collarity) combine features of clustered, social, and personalized search.
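The stop-word distinction drawn above can be illustrated with a minimal sketch. The stop list and both function names are hypothetical, chosen only for illustration; no actual search engine's behavior is being reproduced.

```python
# Hypothetical, deliberately tiny stop list; real engines use longer ones.
STOP_WORDS = {"a", "an", "the", "in", "off", "on", "of", "to"}


def keyword_query(query: str) -> list[str]:
    """Drop stop words, as a keyword-only engine might do by default."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]


def full_query(query: str) -> list[str]:
    """Keep every word, as a natural language engine would."""
    return query.lower().split()


q1 = "taking off a shirt"
q2 = "taking in a shirt"

# With stop words removed, the two queries become indistinguishable.
print(keyword_query(q1) == keyword_query(q2))  # True
# With every word kept, the difference in meaning survives.
print(full_query(q1) == full_query(q2))  # False
```

Once the prepositions are discarded, both queries reduce to the same two keywords, which is why an engine that retains them is better placed to tell the two meanings apart.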
Search providers aspire to offer services that are not only more useful than their competitors, but also
more convenient. Some, such as Snap, recognize that many of today’s searchers demand immediate
gratification and may have limited attention spans, and therefore provide instant previews of search
results. Others (e.g., Riya) take the notion of visual literacy to the extreme, excluding text altogether by
searching only images. Also in existence are a host of other search tools—so-called vertical search
engines—that directly target a specific range of topics (e.g., WebMD for health information), media (e.g.,
EveryZing searches content within podcasts and web-based videos), and populations (e.g., cRANKy for
people over 50). And the list goes on…
For more information on this topic, just google it.
George M. Chinnery is an English language instructor, e-teacher trainer, and PhD candidate in Language,
Literacy and Culture at the University of Maryland Baltimore County (UMBC). His research and
practical interests include the cross-cultural uses of information and communication technologies; e-language learning and e-teaching; and the global digital divide.
Ahrens, F. (2006a, July 7). goo-gle (goo'gul). The Washington Post. Retrieved June 30, 2007, from
Ahrens, F. (2006b, August 5). So Google is no Brand X, but what is 'Genericide'? The Washington Post.
Retrieved June 22, 2007, from
Blog picked as word of the year. (2004, December 1). BBC News. Retrieved June 30, 2007, from
Bloom, B. S. (Ed.). (1956). Taxonomy of educational objectives. New York: Longman.
comScore. (2007a, December 21). comScore releases November US search engine rankings. Retrieved
January 6, 2008, from
comScore. (2007b, June 4). comScore releases first comprehensive review of Pan-European
online activity. Retrieved July 1, 2007, from
Conan, N. (2004, May 3). Talk of the Nation. NPR. Retrieved June 30, 2007, from
Crampton, T. (2006, January 25). Google puts muzzle on itself in China. International Herald Tribune.
Retrieved June 22, 2007, from
Daniel, C., & Palmer, M. (2007, May 22). Google’s goal: To organize your daily life. Retrieved
July 14, 2007, from
Duffy, J. (2003, June 20). Google calls in the ‘language police’. BBC News. Retrieved June 22, 2007,
Ekman, R. (2006, August 22). The books Google could open. The Washington Post. Retrieved June 17,
2007, from
Fortune. (2007, January 22). 100 best companies to work for 2007. Retrieved June 9, 2007, from
Friedman, T. L. (2003, June 29). Is Google God? The New York Times. Retrieved June 30,
2007, from
Gallagher, D. F. (2002, April 22). A copyright dispute with the Church of Scientology is forcing Google
to do some creative linking. The New York Times. Retrieved June 22, 2007, from
Google cookies will ‘auto delete’. (2007, July 17). BBC News. Retrieved July 17, 2007, from
Hargittai, E. (2004). Do you google? First Monday, 9(3). Retrieved July 1, 2007, from
Hindman, M., Tsioutsiouliklis, K., & Johnson, J. A. (2003). 'Googlearchy': How a few heavily-linked
sites dominate politics on the web. Paper presented at the Annual Meeting of the Midwest Political
Science Association (Chicago, IL; April 4, 2003). Retrieved June 22, 2007, from
Inquirer Staff. (2004, May 18). Googol may sue Google. Inquirer. Retrieved June 30, 2006, from
Introna, L. D., & Nissenbaum, H. (2000). Shaping the web: Why the politics of search engines matters.
The Information Society, 16(3). Retrieved June 30, 2007, from
Irvine, M. (2007, April 9). Love in the age of Google. CBS News. Retrieved July 16, 2007, from
Keen, A. (2007, July 12). Is Google’s data grinder dangerous? Los Angeles Times. Retrieved January 6,
2008, from
Kopytoff, V. (2007, April 25). Google surpasses Microsoft as world's most visited site. San Francisco
Chronicle. Retrieved June 10, 2007, from
Kurzweil, R. (2001, March 7). The law of accelerating returns. Retrieved January 6, 2008, from
La Monica, P. R. (2007, June 28). Here comes China 2.0. Retrieved July 1, 2007, from
Liedtke, M. (2007, June 1). Google hits streets, raises privacy concerns. MSNBC. Retrieved June 30,
2007, from
Lohr, S. (2006, April 9). This boring headline is written for Google. The New York Times. Retrieved June
30, 2007, from
Malone, M. S. (2004, March). The complete guide to Googlemania! Wired, 12(3). Retrieved June 9, 2007,
Mayer-Schönberger, V. (2007, April). Useful void: The art of forgetting in the age of ubiquitous
computing. Retrieved June 17, 2007, from
McCloskey, P. (2007, March 6). Google 101 courses: From hardcore to high concept. Campus
Technology. Retrieved June 8, 2007, from
Menczer, F., Fortunato, S., Flammini, A., & Vespignani, A. (2006, February). Googlearchy or
Googlocracy? IEEE Spectrum Online. Retrieved June 30, 2007, from
Merriam-Webster Online. (2005). Previous words of the year. Retrieved June 30, 2007, from
Merriam-Webster Online. (2006). Merriam-Webster’s words of the year 2006. Retrieved January 6, 2008,
Mohammed, A. (2006, January 20). Google refuses demand for search information. The Washington Post.
Retrieved June 30, 2007, from
Moore, G. E. (1965). Cramming more components onto integrated circuits. Electronics, 38(8), 114-117.
Retrieved January 6, 2008, from
Nakashima, E. (2006a, August 17). AOL search queries open windows onto users' worlds. The
Washington Post. Retrieved June 8, 2007, from
Nakashima, E. (2006b, September 2). Google to give data to Brazilian court. The Washington Post.
Retrieved June 30, 2007, from
New London Group. (1996). A pedagogy of multiliteracies: Designing social futures. Harvard
Educational Review, 66(1), 60-92. Retrieved July 17, 2007, from
Pearlstein, S. (2007, April 22). How much more should it be allowed to grab? The Washington Post.
Retrieved June 30, 2007, from
Perrone, J. (2004, February 19). Yahoo! challenges for Google's crown. Guardian Unlimited. Retrieved
July 1, 2007, from
Privacy International. (2007, June 9). A race to the bottom: Privacy ranking of internet service companies.
Retrieved July 1, 2007, from
Rivlin, G. (2005, August 24). It's Google's turn as the villain. The New York Times. Retrieved June 30,
2007, from
Robb, T. (2003, September). Google as a quick 'n dirty corpus tool. TESL-EJ, 7(2). Retrieved June 23,
2007, from
Rugaber, C. S. (2007, June 25). Google fights global internet censorship. Forbes. Retrieved January 6,
2008, from
Rundell, M. (2000, May 17). The biggest corpus of all. Humanising Language Teaching, 2(3). Retrieved
June 23, 2007, from
Sipress, A. (2007, March 7). Microsoft attacks Google over book search. The Washington Post. Retrieved
June 23, 2007, from
Spanbauer, S. (2007, June 19). Is Google too big? The Washington Post. Retrieved June 30, 2007, from
Stross, R. (2006, June 30). All the internet's a stage: Why don't CEO's use it? The New York Times.
Retrieved June 2, 2007, from
Sturgeon, W. (2006, August 16). Google wants people to stop googling. CNET. Retrieved June
17, 2007, from
Swaffar, J. (1998). Networked language learning: Introduction. In J. Swaffar, S. Romano, P. Markley, &
K. Arens (Eds.), Language learning online: Theory and practice in the ESL and L2 computer classroom.
Austin, TX: The Daedalus Group. Retrieved July 17, 2007, from
Vandenberghe, M. (2007, May 11). YouTube to remove some clips mocking Thai king. Reuters.
Retrieved June 30, 2007, from
Vygotsky, L. S. (1978). Mind in society: The development of higher psychological processes. Cambridge, MA:
Harvard University Press.
Wearden, G. (2007, June 12). Google to cut time it holds data. Guardian Unlimited. Retrieved July 1,
2007, from
White, A. (2007, February 13). Court orders Google to pull Belgian news. The Washington Post.
Retrieved June 17, 2007, from
Wordsmiths hail podcast success. (2005, December 7). BBC News. Retrieved June 30, 2007, from
Wray, R. (2006, May 23). Google users promised artificial intelligence. Guardian Unlimited. Retrieved
June 17, 2007, from
Language Learning & Technology
February 2008, Volume 12, Number 1
pp. 12-18
Robert Godwin-Jones
Virginia Commonwealth University
Creating effective electronic tools for language learning frequently requires large data sets containing
extensive examples of actual human language use. Collections of authentic language in spoken and
written forms provide developers the means to enrich their applications with real world examples. As the
Internet continues to expand exponentially, the vast "cloud" of Web pages created provides a nearly
inexhaustible and continuously updated language bank, particularly in English. The issue remains,
however, of how to make practical use of large amounts of data for language learning, given storage and
data processing demands. Recently, new developments in storage virtualization and distributed computing
offer practical solutions, as demonstrated by Amazon's Elastic Compute Cloud and SimpleDB. At the
same time, the move to XML encoding of language corpora and text collections provides the
compatibility and interchange whose absence has hampered their practical exploitation for language learning.
Tools are also being created to facilitate the transformation of text collections into more usable formats,
particularly into syntactically annotated corpora called treebanks. These developments offer opportunities
for content-based language learning in particular.
Rich data collections are especially important for the development of learner-focused language applications.
In recent years there has been a sharp increase in the development of language learning tools for specific
learner populations. Not surprisingly, this has been most in evidence in Europe, as the European Union
has continually added new member nations bringing with them additional official languages. The EU
Europa Web site lists 171 different projects in the area of content-based language learning that have, since
1999, earned the "European Language Label", awarded for creative applications in language learning. A
number of these projects have been created with funding supplied by EU grant programs, including
Lingua, Leonardo, and Socrates. Most involve the creation of electronic tools and multimedia and
increasingly are using the Web for delivery. Many are designed for use in either instructor-led or self-study settings, or both.
The EU site highlights a variety of projects in language learning for special purposes, including such
diverse targeted areas as agricultural workers, apprentices, architectural workers, automotive workers,
building maintenance workers, computer scientists, construction workers, customs officers, dock workers,
entrepreneurs, hospital patients, insurance industry workers, isolated rural inhabitants, teachers, prison
officers, the unemployed, and young immigrants. Some projects are even more narrowly focused, such as
French for racing apprentices, Polish for missionaries, or English for ski lift cashiers. The largest number
of projects targets the hospitality sector, where the need for multi-lingual workers is evident. The
VIRTEX project was recently awarded first place in the European Language Label competition and is
designed for workers in the hotel and restaurant industries learning English or German. Originally a CD-ROM project, it now incorporates a rich set of online tools, including streaming video.
Several of the vocational language projects make use of a full-fledged virtual learning environment. The
EUROVOLT project, which offers vocationally-oriented language learning in a variety of languages for
many industries, is implemented in Moodle and makes extensive use of new media and collaborative
tools. It also incorporates language e-portfolios. Interesting projects in this area also include BeCult and
Online VoCAL/Weblingua, both of which have richly developed tools and media.
Copyright © 2008, ISSN 1094-3501
Robert Godwin-Jones
Content-Based and Data-Driven Language Learning
The Web sites of the projects listed above do not give enough information to determine to what extent
they make use of word sets or data collections. An example that shows the benefit of word sets for
content-based language learning is the Academic Word List (AWL) for English, developed by Averil
Coxhead. The 570 words on the list (sub-divided into ten categories) were compiled from a corpus of 400
written academic texts. It excludes the most common 2000 English words. The list targets students
entering an English-speaking university and provides an efficient base on which to create language
learning exercises such as matching or cloze. The AWL Highlighter offers a nice example of the benefits
of having such a list: it allows users to enter an arbitrary text, which is then parsed for AWL words and
returned as a new document with the AWL items in bold, allowing students to work with the words in
context. This helps guide the students to focus on vocabulary likely to be found in the text repeatedly,
rather than learning items that are unlikely to be encountered again.
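The highlighting step described above can be sketched briefly in Python. This is an illustration only, not the AWL Highlighter's actual code: the five-word set stands in for the full 570-word list, the function name is hypothetical, and only exact word forms are matched, whereas a real tool would also need to recognize inflected and derived forms.

```python
import re

# Hypothetical stand-in for the full Academic Word List (a handful of items only).
SAMPLE_AWL = {"analyze", "concept", "data", "research", "theory"}


def highlight(text: str, word_list: set[str]) -> str:
    """Wrap each listed word in <b>...</b>, leaving all other text untouched."""

    def mark(match: re.Match) -> str:
        word = match.group(0)
        # Compare case-insensitively but keep the original spelling in the output.
        return f"<b>{word}</b>" if word.lower() in word_list else word

    # Visit every alphabetic run; punctuation and spacing pass through unchanged.
    return re.sub(r"[A-Za-z]+", mark, text)


print(highlight("The research data support the theory.", SAMPLE_AWL))
# The <b>research</b> <b>data</b> support the <b>theory</b>.
```

Substituting a content-specific word list for the sample set would give the same kind of guided reading for a vocational text.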
Content-based language learning is inherently learner-centered, focusing as it does on the specific context
in which the target language will be used. It also lends itself well to task-based learning activities. Many
of the projects targeting language for special purposes are built around real-life scenarios, often delivered
through digital video clips, as an example from the VIRTEX project demonstrates. The students watch a real
or simulated conversational exchange or an on-the-job interaction and are provided with comprehension
aids such as full/partial transcripts, isolated audio playback, cultural notes, or lists of idiomatic
expressions. Students are then asked to use the expressions from the dialogues in on-line exercises,
written assignments, or group work. The importance of vocabulary development in content-based
language learning necessitates that the vocabulary items chosen are those needed by the learners.
Developing content-specific word lists in the manner of AWL would be highly beneficial, assuming
enough texts can be found to build a specialized corpus.
One of the advantages of having a corpus to draw from is the possibility of using concordances as a
vocabulary and grammar learning tool. Concordances are not effective for all learners, but for many
motivated students they can provide a means for working with language structures through real world use.
Students using concordances can be asked to reflect on areas such as inflections and collocations
involving core vocabulary for the areas they are studying. Since the materials are tailored specifically for
students' needs, it is more likely that such efforts will be successful. Some interesting examples of the use
of concordances are collected by Bernd Rüschoff based on workshops and other sources. Tom Cobb's
Lextutor enriches the use of concordances by linking the found items to the on-line WordNet dictionary.
WordNet is a large lexical database of English that was first made public in 1991 and has since inspired
similar collections in other languages.
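For readers unfamiliar with the format, a concordance in its simplest keyword-in-context form can be produced with a few lines of Python. The sketch below is a bare-bones illustration under stated assumptions: the function name, the sample hospitality text, and the three-word window are all invented for the example, and it does not reproduce Lextutor or any other tool mentioned here.

```python
import re


def concordance(text: str, keyword: str, width: int = 3) -> list[str]:
    """Return keyword-in-context lines: `width` words on either side of each hit."""
    words = re.findall(r"[\w'-]+", text)
    lines = []
    for i, word in enumerate(words):
        if word.lower() == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append(f"{left} [{word}] {right}".strip())
    return lines


sample = (
    "Guests check in at the front desk. Please check the reservation "
    "before guests check out of the hotel."
)
for line in concordance(sample, "check"):
    print(line)
# Guests [check] in at the
# front desk Please [check] the reservation before
# reservation before guests [check] out of the
```

Even this small sample lets a learner notice that "check" combines with "in" and "out" in the hotel context, the kind of collocation work described above.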
The percentage of Web-based vocabulary and discrete grammar exercises based on language corpora is
quite low. There are many understandable reasons for this, including lack of access to appropriate
corpora, incompatibility of the data with authoring tools, ignorance of how to incorporate data sets, and
the need to focus on vocabulary prioritized in textbooks. The process could be made considerably easier
for the average language instructor if available tools interfaced more readily with language corpora or text
collections. Many popular tools for creating Web-based exercises, such as Hot Potatoes, allow for
importation of text files for creating cloze or gap exercises. However, they do not allow for retrieval and
incorporation of texts from large data sets or concordances. This situation is largely a by-product of the
proprietary format in which language corpora and text collections have traditionally been encoded. Data
with idiosyncratic encoding schemes and interfaces does not lend itself to searching or sharing. In many
cases tools created in conjunction with the data have not been designed to be interoperable.
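What retrieval from a text collection into a gap exercise might look like can be sketched as follows. The three-sentence "corpus", the target word, and the function name are hypothetical, and the code is not based on Hot Potatoes or any other authoring tool; it only shows how little glue is needed once the texts are accessible in a plain, shareable form.

```python
import re


def gap_items(sentences: list[str], target: str) -> list[str]:
    """Blank out `target` in each sentence that contains it as a whole word."""
    pattern = re.compile(rf"\b{re.escape(target)}\b", re.IGNORECASE)
    return [pattern.sub("_____", s) for s in sentences if pattern.search(s)]


# Hypothetical miniature text collection for a hospitality course.
corpus = [
    "The guest paid the invoice at reception.",
    "Breakfast is served from seven.",
    "Please send the invoice to the travel agency.",
]

for item in gap_items(corpus, "invoice"):
    print(item)
# The guest paid the _____ at reception.
# Please send the _____ to the travel agency.
```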
Fortunately, the widespread use of XML for encoding corpora and text collections is moving towards a
resolution of this problem. XML has become the de facto standard for encoding of language corpora.
XML recommends itself because of its platform independence, extensibility, and widespread acceptance
by software companies and researchers. Standardizing text encoding in XML greatly facilitates data
interchange. Since structural and semantic information about a text is separated from its presentation in
XML, the same encoded text can be displayed in multiple ways, using CSS style sheets or XSLT
transformations. With the advent of XML as the preferred system for representation of corpus resources,
existing tools have been modified to work with XML, while new applications have been created that are
designed to be XML ready. The Linguist's Toolbox, for example, now features export to XML. The text
searching software, Xaira, designed to be used with the British National Corpus, has been re-written as a
general purpose XML search engine with full Unicode support. The Unicode editor CLaRK has been
designed specifically to work with XML. Language archives can now be submitted to OLAC (Open
Language Archives Community) by uploading a single XML file containing the necessary metadata
information about the resource. Tools for the semi-automatic annotation of corpus data are being
developed, such as @nnotate from the University of Saarland. DepAnn is a treebank creation tool, which
uses Tiger-XML, the accepted standard for treebank encoding. EULIA, from the University of the Basque
Country, provides a graphical Web interface for editing annotated corpora. These kinds of tools will
become increasingly important as language data sets increase in size, since manually annotating texts to
create treebanks is a slow and expensive process.
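The separation of structure from presentation mentioned above can be illustrated with a small sketch. It uses Python's standard XML parser in place of CSS or XSLT, and the element and attribute names (`s`, `w`, `pos`) are invented for the example rather than taken from any particular corpus encoding scheme.

```python
import xml.etree.ElementTree as ET

# Hypothetical mini-corpus: one sentence, each word annotated with a part of speech.
DOC = """<s id="s1">
  <w pos="PRON">She</w>
  <w pos="VERB">reads</w>
  <w pos="NOUN">books</w>
</s>"""

sentence = ET.fromstring(DOC)


def as_plain_text(s: ET.Element) -> str:
    """Presentation 1: running text only, for a reading exercise."""
    return " ".join(w.text for w in s.iter("w"))


def as_tagged_text(s: ET.Element) -> str:
    """Presentation 2: each word with its annotation, for grammar work."""
    return " ".join(f"{w.text}/{w.get('pos')}" for w in s.iter("w"))


print(as_plain_text(sentence))   # She reads books
print(as_tagged_text(sentence))  # She/PRON reads/VERB books/NOUN
```

The encoded sentence is stored once; each display is simply a different view of it, which is what makes the same annotated corpus reusable across tools.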
One of the most widely used XML encoding schemes for text archives is TEI (Text Encoding Initiative). A
new version of the TEI Guidelines was released in November 2007. It offers a number of enhancements,
including more support for manuscript descriptions and better support for multimedia and graphics.
Additionally, a Web application called Roma has been developed which provides a visual editor for
working with TEI. An example of the power and versatility of TEI is the Henry III Fine Rolls project,
from the British National Archives. These are fiscal and administrative records in Latin from the 13th
century. The site provides user-friendly access to graphic representations of the original parchment rolls,
as well as the original texts, translations, and notes/annotations. TEI allows the Henry III project to be
included in general searches and to be easily referenced within other text projects. Version 4 of the
Perseus Digital Library, a collection of classics texts, also uses TEI encoding and adds a set of XML-based Web services which allow for chunking larger texts into smaller units, as well as for sophisticated
morphological analysis.
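To give a sense of what chunking a TEI-encoded text involves, the sketch below splits a document into its `div` sections. It is an assumption-laden illustration, not the Perseus Web services: the sample is a simplified fragment (a valid TEI document would also carry a header), and the function name is hypothetical. The namespace URI and the `text`, `body`, `div`, and `p` elements are standard TEI.

```python
import xml.etree.ElementTree as ET

# The TEI namespace is standard; the sample document below is a simplified fragment.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div n="1"><p>First section of the text.</p></div>
    <div n="2"><p>Second section.</p><p>It has two paragraphs.</p></div>
  </body></text>
</TEI>"""


def chunks(tei_xml: str) -> dict[str, str]:
    """Map each div's n attribute to the joined text of its paragraphs."""
    root = ET.fromstring(tei_xml)
    result = {}
    for div in root.findall(".//tei:div", TEI_NS):
        paragraphs = [p.text or "" for p in div.findall("tei:p", TEI_NS)]
        result[div.get("n")] = " ".join(paragraphs)
    return result


for number, content in chunks(SAMPLE).items():
    print(number, content)
# 1 First section of the text.
# 2 Second section. It has two paragraphs.
```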
Projects that house discrete, well-defined collections of texts can usually manage storage and delivery
resources using traditional options, namely one or more servers housing a database, a Web server, and
any associated Web services. If the site is popular, redundant servers might be needed. However, if the
project is unusually large, such as the American National Corpus, being created as an American English
cousin to the British National Corpus, the traditional project paradigm may not suffice. This is
particularly the case if the goal is not just to deliver static text selections, but to allow for dynamically
generated resources selected by sophisticated search, retrieval, and concatenating options such as are
available with the Perseus project. In this scenario, there are significant demands in terms of processing
which may well overwhelm the traditional setup for a text repository.
In recent years, some new options have emerged which make it easier to set up and manage a large-scale
text project. The technical means have been available for some time to enable load balancing and parallel
processing, but traditionally such systems have been difficult to create and run and tended to be so
expensive as to be beyond the means of most academic projects. Today, through tools and services
originating with Google and Amazon, there are ways for programmers without experience with parallel or
distributed systems to use the resources of a large distributed environment to achieve high performance
with off-the-shelf PCs that are linked together. Large Web companies such as these, as well as Yahoo and
eBay, have established developer outreach programs, through which they hope to drive more users to visit
their sites. As part of those programs, these companies provide application programming interfaces (APIs)
which instruct developers on how to write Web applications that take advantage of their sites and
Google's MapReduce is one example which has generated considerable interest. It is a programming
model and associated code library for processing and generating large data sets. The design simplifies the
process of enabling multiple computers to process information and then collect back the results centrally.
MapReduce assigns program instructions to multiple computers to be accomplished in parallel. It breaks
down the calculations into two steps. In step one (the "map" function) a key/value pair is processed,
providing a set of intermediate results. In step two (the "reduce" function), these intermediate results
themselves are merged to compute a final answer. An example of MapReduce from a Google developer
presentation shows how the phrase "to be or not to be" would be processed in the MapReduce model:
Figure 1: MapReduce processing of "to be or not to be"
This seems very simple, and it is, but by extending the process to several levels of analysis (i.e. further
mapping of reduced results) it allows for very complex calculations to be broken down into simple steps.
The general technique can be applied to many analytical problems.
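Assuming the example is the familiar word-count job, the two steps can be sketched in a few lines of Python. This is a single-machine illustration of the programming model only, not Google's library: the function names are hypothetical, and the grouping step in the middle, which the framework normally performs behind the scenes, is written out explicitly.

```python
from collections import defaultdict


def map_phase(text):
    """Map: emit an intermediate (word, 1) pair for every word in the input."""
    return [(word, 1) for word in text.split()]


def shuffle(pairs):
    """Group intermediate values by key (done by the framework in practice)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: merge the values collected for each key into a final count."""
    return {key: sum(values) for key, values in grouped.items()}


counts = reduce_phase(shuffle(map_phase("to be or not to be")))
print(counts)  # {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```

Because each (word, 1) pair is produced independently and each key is reduced independently, both phases can be spread across as many machines as are available.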
MapReduce includes its own middleware that automatically breaks down computing jobs, doles out tasks
to multiple computers, and collects the results. It also creates duplicate copies of each map-and-reduce
function, finds idle machines to which to assign the tasks, and tracks the results. The worker machines
load their individual piece of data processing, do the work, and notify the master machine when the work
is completed ("mapped") and ready to be collected ("reduced"). If a machine freezes or breaks down, the
master re-assigns that task after a specific period of not being able to communicate with the worker. The
process is used by Google in many different ways, including machine translation between languages.
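The re-assignment rule described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function reassign_stalled, the worker names, and the timestamp bookkeeping are hypothetical and do not reproduce Google's middleware.

```python
def reassign_stalled(assignments, last_seen, now, timeout, idle_workers):
    """Move tasks held by silent workers to idle ones.

    assignments:  task name -> worker currently responsible for it
    last_seen:    worker -> time of the last message the master received from it
    now, timeout: current time and the allowed silence, in the same units
    idle_workers: workers available to take over a task
    """
    updated = dict(assignments)
    idle = list(idle_workers)
    for task, worker in assignments.items():
        # A worker that has been silent for longer than the timeout is presumed failed.
        if now - last_seen[worker] > timeout and idle:
            updated[task] = idle.pop(0)
    return updated

plan = reassign_stalled(
    assignments={"map-0": "worker-1", "map-1": "worker-2"},
    last_seen={"worker-1": 98.0, "worker-2": 40.0},
    now=100.0,
    timeout=30.0,
    idle_workers=["worker-3"],
)
print(plan)
```

Here worker-2 has been silent for 60 time units, so its task moves to worker-3, while worker-1 keeps its task.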
While MapReduce itself is proprietary to Google, Hadoop, an open source implementation of the MapReduce
method, has been released. Recently, the New York Times used Hadoop as the basis for creating a system to
serve up archived newspaper articles. It needed to implement a large-scale operation as the decision had been made to make all the Times archives from 1851 to 1980 publicly
available for free. In addition to Hadoop, the project was implemented using several Web services
available through Amazon, namely Amazon Simple Storage Service (S3) and the Amazon Elastic
Compute Cloud (EC2). S3 is an archive storage service that uses the same scalable system as is
implemented in Amazon's retail site. EC2 is a computing service on which one can load and run
applications. Both use a standard Web services interface, as does the recently announced Amazon
SimpleDB, a database service. Collectively, these services provide the ability to store, process and query
data sets residing on the Internet. Traditionally, this would require a relational database (such as Oracle or
MySQL) and a dedicated database administrator. In contrast, the Amazon system is designed to be
relatively easy to use. While it is not free, its pricing is low enough that it may be cheaper than operating
a home server, let alone setting up a cluster-based computing environment. The Amazon services used by
the New York Times work well not only with text and graphics, but with other media as well. For
example, CastingWords, a podcasting transcription service, stores audio files and transcribed text on S3.
Clearly, this could be an interesting option for large-scale language projects.
One could envision something like the Harvard Text Annotator, an authoring tool for creating online
glossed texts, running under Amazon and serving up vast quantities of on-the-fly annotated texts culled
from Internet sources. For such a project to be successful, however, more than just text searching would
have to be possible, even if sophisticated search options are available. Items collected in large data sets
also need accompanying metadata to allow for more efficient narrowing of searches. This is important as
well for finding and retrieving structured language learning resources, often labelled "learning objects"
(LO). The OLAC metadata set implements a consensus approach among language corpora researchers.
However, the modified Dublin Core metadata used in OLAC does not fulfil all the needs for materials to
be used in language learning. One project that moves in this direction is the FLORE learning objects
repository (LOR) for teaching and learning French. FLORE takes advantage of the French/English
CanCore Learning Resource Metadata Initiative, a collaborative Canadian project, itself based on the
IEEE Learning Object Metadata (LOM) standard. FLORE leverages a number of the LOM elements to
provide additional information important for judging the appropriateness of a resource for language
learners, including level of language proficiency and type of language learning environment targeted (e.g.,
immersion or self-study). The FLORE project is noteworthy also because it supports the Open Archives
Initiative Protocol for Metadata Harvesting (OAI-PMH), which allows FLORE's metadata records to be
shared with other repositories and linked directly with other systems.
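To give a sense of what harvesting involves, the sketch below builds a ListRecords request for the Open Archives Initiative harvesting protocol and reads the Dublin Core titles out of a response. The repository address and the sample record are invented for illustration; only the verb, the metadataPrefix parameter, and the XML namespaces come from the protocol itself.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace used for Dublin Core elements in harvested records.
NS = {"dc": "http://purl.org/dc/elements/1.1/"}

def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request; base_url is whatever address the repository exposes."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"

def titles(response_xml):
    """Extract every Dublin Core title from a harvesting response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.findall(".//dc:title", NS)]

# Hypothetical response, trimmed to a single record for illustration.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Le passé composé</dc:title>
          <dc:language>fr</dc:language>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("https://repository.example.org/oai"))
print(titles(SAMPLE))
```

A harvester would fetch the URL, parse each page of results in this way, and store the records in its own index.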
There are, in fact, more and more collections of learning resources on the Web. A recent study features an
extensive international listing. However, relatively few of the LORs include standard metadata such as
that provided by OAI-PMH. The GLOBE initiative (Global Learning Objects Brokered Exchange) is an
effort to move repositories in this direction. The CORDRA project is also attempting to standardize LO
encoding. Including standard identifying information with learning resources would help enormously in
making searches across multiple data sets, known as federated searches, faster and more efficient.
Federated searches for learning objects are now available from LOR sites such as Merlot and Ariadne
(which even include searching of sites such as Flickr and YouTube), but the search results are
inconsistent and incomplete and do not allow for advanced search options.
A language learning LOR that exemplifies best practices in this area is the L2O project out of the
University of Southampton. This is a collaborative project building on the work of the eLanguage group,
which produced a set of lessons for English for Academic Purposes. The L2O project has been generating
reusable LOs created mostly from existing materials. The project has developed a metadata set based on
the LOM, but which adds contextual information important for language learning such as accent/region
and subtitles/transcript. It complements the work done in this area by the FLORE group. The tagged LOs
are retrievable from the project's repository, CLARe (Contextualized Learning Activity Repository).
CLARe is currently being expanded to include social networking tools such as tag clouds and ratings.
A related project, MURLLO, has begun to develop a user-friendly LO editor. One of the features that
would be helpful to see included in both LO editors and repositories is support for RSS feeds. The
required information for the feeds could be automatically collected from the LO metadata and used by
teachers or learners to be notified whenever new learning resources in targeted areas become available.
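A feed of this kind could be generated directly from metadata records. The following sketch assumes each learning object is described by a small dictionary with title, url, description, and language fields; these field names and the example records are hypothetical and do not come from any particular repository.

```python
import xml.etree.ElementTree as ET

def build_feed(title, link, learning_objects):
    """Return an RSS 2.0 document with one item per learning object."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Newly added learning objects"
    for lo in learning_objects:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = lo["title"]
        ET.SubElement(item, "link").text = lo["url"]
        ET.SubElement(item, "description").text = lo["description"]
        # The target language is exposed as a category so readers can filter on it.
        ET.SubElement(item, "category").text = lo["language"]
    return ET.tostring(rss, encoding="unicode")

feed = build_feed(
    "New French listening activities",
    "https://repository.example.org/french",
    [
        {"title": "Ordering coffee", "url": "https://repository.example.org/lo/1",
         "description": "Listening task, beginner level", "language": "fr"},
        {"title": "Asking directions", "url": "https://repository.example.org/lo/2",
         "description": "Listening task, intermediate level", "language": "fr"},
    ],
)
print(feed)
```

A repository could regenerate such a file whenever a record is added, and teachers or learners would subscribe to it in any feed reader.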
Developing easy-to-use tools for LO editing is a high priority if there is an expectation that subject matter
experts such as language teachers create the resource, rather than it being created by technical specialists.
A Swiss project from the University of Zurich is developing a tool for use with its LO model known as
eLML. One of the better-known open source LO editors, eXe, has recently released a new version
available for Windows, Mac, and Linux. A commercial LO editor, the SoftChalk LessonBuilder, is also
about to see a new version with additional features including more support for multiple languages. These
editors support SCORM, an LO standard that originated with the U.S. Department of Defense but which
has recently been transferred to a new international organization, LETSI, Learning-Education-Training
Systems Interoperability. These and other editors will likely support the new IMS standard, Common
Cartridge. This is a project designed to combine e-learning standards including SCORM, LOM, and IMS
QTI (Question and Test Interoperability), along with other Web services, to create a fully developed
learning module which can be imported into learning management systems such as Moodle or
Blackboard. In the US, it is generating considerable interest as an electronic alternative to traditional
textbooks. This is also the thrust of the new Digital Marketplace initiative, an outgrowth of the Merlot
project based at California State University. This has been hailed as a possible model for a "national
digital marketplace," advanced recently by a US government study on the price of textbooks. The Global
Text Project and Wikibooks are non-commercial efforts in this direction.
Content-based Language Learning
• E-Lingua European project for learning the language of hotel service and management
• BeCult European language project for students in hospitality industries
• EUROVOLT European Vocational Online Language Teaching and Language learning via a VLE
• VIRTEX Project for Hotel and Catering
• Education & Training Programs in the EU List of projects related to content-based language learning
• MapReduce: Simplified Data Processing on Large Clusters By Jeffrey Dean and Sanjay Ghemawat
Corpora and Data-driven Language Learning
• The Compleat Lexical Tutor Example of data-driven language learning
• WordNet Lexical database for English
• Wordnets in the world Wordnets in multiple languages
• Academic Word List From Averil Coxhead
• Sample Exercises Data-driven Language Learning examples
• Best Practice Recommendations for Language Resource Description For language archives
• Penn Treebank Online Searchable tagged corpora in English
• Xaira Corpus search engine
• The Linguist's Toolbox and XML Technologies By Chris Hellmuth, Tom Myers & Alexander
• CanCore Metadata system from Canada
• Raise and Rise Example of learning object from wisc online
• The Open Archives Initiative Protocol for Metadata Harvesting OAI guidelines
• Harvesting Issues About implementing OAI metadata harvesting
• Open Archives Initiative Metadata Harvesting Project University of Illinois project
• Exposing information resources for e-learning Combining OAI and IMS metadata harvesting
• Digital Repositories Specification From IMS
• Real-time demonstration of interoperability between Learning Object Repositories Interoperability
demonstration involving the ARIADNE network and the FIRE federation
• CORDRA Content Object Repository Discovery and Registration/Resolution Architecture
• GLOBE Global Learning Objects Brokered Exchange
• Federated Search Through Ariadne
• Sharing Language Learning Objects Example walk through the technological and pedagogical
'process models' of L2O project
• A Typology of Learning Object Repositories By Rory McGreal
• EML eLesson Markup Language for creating structured eLessons using XML
• Common Cartridge: e-Learning Made Easy IMS Standard in place of textbooks
• LETSI The international group now in charge of the SCORM standard
• TEI Text Encoding Initiative
Distributed Computing
• Self-service, Prorated Super Computing Fun! NY Times archive use of Amazon S3
• hadoop Open source implementation of MapReduce
• MapReduce The Google white paper
• Running Hadoop MapReduce on Amazon EC2 and Amazon S3 From Amazon development services
• S3 Amazon Simple Storage Service
• EC2 Amazon Elastic Compute Cloud
• SimpleDB Amazon database service
• Windows Live Web services from Microsoft
Language Learning & Technology
News From Our Sponsors
University of Hawai`i National Foreign Language Resource Center (NFLRC)
Michigan State University Center for Language Education and Research (CLEAR)
Center for Applied Linguistics (CAL)
University of Hawai‘i National Foreign Language Resource Center (NFLRC)
The University of Hawai‘i National Foreign Language Resource Center engages in research and materials
development projects and conducts workshops and conferences for language professionals among its
many activities.
This conference, to be held on March 17-19, 2008 at the University of Hawai‘i at Manoa, is a venue for
bringing together scholars, writers, language teachers, researchers and other practitioners from around the
world to discuss issues pertaining to the role of Filipino as a global language. Participants can be teachers,
researchers, program administrators/coordinators and other practitioners who are directly involved in the
promotion and nurturing of the Filipino language, literature and culture. This first conference is geared
towards establishing a tradition of scholarly meetings of this kind among practitioners in the field of
Filipino language, literature and culture studies. (The NFLRC serves as co-sponsor for this event).
Just in! With the theme Exploring SLA: Perspectives, Positions, and Practices, the Second Language
Research Forum (SLRF) returns to the University of Hawai‘i at Manoa for the third time on October 17-19, 2008
(with the NFLRC serving as co-sponsor). Check out our website as more information becomes available.
Selected papers from Pragmatics in the CJK Classroom: The State of the Art
This online collection presents 10 selected papers from the
forum on Pragmatics in the CJK Classroom: The State of the Art held from June 5 to June 7, 2006 at the
University of Hawai‘i-Manoa in Honolulu, Hawai‘i. The papers are representative of the many
outstanding contributions to the field of L2 pragmatics that were presented at the gathering. The papers
are also representative of the diverse range of research interests and pedagogical issues taken up by the
conference presenters. Cumulatively, the papers in this volume address current concerns in L2 pragmatics
that range from the development of pragmatic competence by children and college-age students in both
foreign and second language settings to pragmatics-focused instruction in the foreign language classroom
for students at all levels of foreign language learning. The pedagogically-oriented contributions are
diverse in their scope: from innovative approaches for teaching true beginners to the specialized
curriculum of students receiving post-graduate professional training. Instructional innovations for L2
classroom pragmatics-focused teaching from each of the language groups — Chinese, Japanese, and
Korean — are included.
Check out our many other publications.
Language Learning & Technology is a refereed online journal, jointly sponsored by the University of
Hawai`i NFLRC and the Michigan State University Center for Language Education and Research
(CLEAR). LLT focuses on issues related to technology and language education. For more information on
submission guidelines, visit the LLT submissions page.
Language Documentation & Conservation is a fully refereed, open-access journal sponsored by NFLRC
and published exclusively in electronic form by the University of Hawai‘i Press. LD&C publishes papers
on all topics related to language documentation and conservation. For more information on submission
guidelines, visit the LD&C submissions page.
Reading in a Foreign Language is a refereed online journal, jointly sponsored by the University of
Hawai`i NFLRC and the Department of Second Language Studies. RFL serves as an excellent source for
the latest developments in the field, both theoretical and pedagogic, including improving standards for
foreign language reading. For more information on submission guidelines, visit the RFL submissions page.
Michigan State University Center for Language Education and Research (CLEAR)
CLEAR's mission is to promote the teaching and learning of foreign languages in the United States.
Projects focus on materials development, professional development training, and foreign language
Selected Products
The list below comprises just some of our free and low-cost materials for language educators. Be sure to
visit our website occasionally for updates and announcements on new products:
• NEW! Celebrating the World’s Languages: A Guide to Creating a World Languages Day Event
(guide) – This publication provides a step-by-step guide to planning "World Languages Day," a
university event for high school students designed to stimulate interest in learning languages and
to highlight the importance of cultural awareness.
• NEW! La phonétique française (CD-ROM) – Now available in beta version, this cross-platform
multimedia program consists of interactive lessons that can be used by French teachers to learn
how to teach pronunciation, or by advanced students working independently.
• CLEAR’s Rich Internet Applications initiative has been underway for over a year. RIA is a
research and development lab where our programmers are working on free tools that language
teachers can use to create online language teaching materials – or have their students create
activities themselves!
o NEW! Audio Dropboxes (put a dropbox in any web page; students’ recordings get put
into your dropbox automatically)
o NEW! Conversations (record prompts for students to do virtual interviews and
o Mashups (combine media elements to create a new resource for language teaching)
o Viewpoint (record or upload videos to link from other sites or embed inside your own
web pages)
o SMILE (tool for creating interactive online exercises)
• MIMEA: Multimedia Interactive Modules for Education and Assessment (German, Chinese,
Arabic, Vietnamese, Korean, Russian; online video clips and activities)
Language Learning Materials for Russian: A Content-Based Course Pack (online learning
Coming Soon!
Introductory Business German (CD-ROM)
In January and February 2008, CLEAR collaborated with the American Council on the Teaching of
Foreign Languages (ACTFL) to manage their student video podcast contest, “Not Just a Language
Class!” This contest was part of the Discover Languages… Discover the World! national public awareness
campaign to build public support for
language education. Students were asked to create a two-minute video podcast depicting how the study
of other languages had had an impact on their lives. ACTFL contacted CLEAR and requested that we
create an online submission and storage system for the podcast entries based on our Rich Internet
Application called Viewpoint. We were able to tailor the contest website to ACTFL’s needs, and look
forward to future collaboration – watch for this annual contest! (ACTFL is using our Rich Internet
Applications… are you?)
CLEAR exhibits at local and national conferences year-round. We hope to see you at ACTFL, CALICO,
MiWLA, NCOLCTL, Central States, and other conferences.
Central States Conference - Dearborn, MI - March 6-8, 2008
Workshop: Using Rich Internet Applications in Your Classroom
Session: Report from the R&D Lab: Rich Internet Applications for Language Learning
Session: Reaching Out and Building Enrollment through a "World Languages Day"
CALICO - San Francisco, CA - March 18-22, 2008
Session: Learners' Perception and Preference of Audio Stimuli During an Online Pragmatics
CLEAR News is a biyearly publication covering FL teaching techniques, research, and materials. Contact
the CLEAR office to join the mailing list or check it out on the Web at
We welcome your submissions!
The Center for Applied Linguistics (CAL)
The Center for Applied Linguistics is a private, nonprofit organization that promotes and improves the
teaching and learning of languages, identifies and solves problems related to language and culture, and
serves as a resource for information about language and culture. CAL carries out a wide range of
activities in the fields of English as a second language, foreign languages, cultural education, and
Featured Resources:
CAL News
CAL News is our new electronic newsletter, created to provide periodic updates about our
projects and research as well as information about new publications, online resources, products, and
services of interest to our readers. Visit our Web site to sign up.
Alliance for the Advancement of Heritage Languages
The Alliance for the Advancement of Heritage Languages (the Alliance) consists of individuals and
organizations who share a commitment to advancing language development for heritage language
speakers in the United States. The Alliance is committed to fostering the development of the heritage
language proficiencies of individuals in this country as part of a larger effort to educate members of
our society who can function professionally in English and other languages. The Alliance has
revamped its Web site to offer expanded content and improved navigation.
National K-12 Foreign Language Survey Underway
CAL conducts a national survey of foreign language instruction in elementary and secondary schools
every decade to gain greater understanding of current patterns and shifts over time in enrollments, the
number of schools offering foreign language classes, the types of foreign language offerings, foreign
language curricula and methodologies, teacher qualifications and training, and the effects of NCLB,
among other issues. We are currently conducting the third survey to be able to show trends in foreign
language education at three points in time (1987, 1997, 2007). For further details, see the fall 2007
Center for Research on the Educational Achievement and Teaching of English Language
Learners (CREATE)
Visit the newly expanded CREATE Web site to learn more about CREATE, its research and
upcoming events. To keep current on CREATE activities, sign up to receive an electronic newsletter
and periodic announcements.
Spotlight on Language Series
In support of the Discover Languages campaign led by ACTFL, CAL has developed a regular Web
series to provide information about specific languages. These language spotlights are introductory in
nature and are intended to encourage readers to explore these languages and CAL’s work with them
in more detail. Different languages will be highlighted periodically.
CAL Services
CAL provides a variety of professional development and technical assistance services related to
language education and assessment needs.
Recent Publications:
Guiding Principles for Dual Language Education, Second Edition
Developing Reading and Writing in Second Language Learners
Refugees from Burma: Their Backgrounds and Refugee Experiences
An Insider’s Guide to SIOP Coaching
Realizing the Vision of Two-Way Immersion: Fostering Effective Programs and Classrooms
What’s Different About Teaching Reading to Students Learning English?
Visit CAL’s Web site to learn more about our projects, resources, and services.
Language Learning & Technology
February 2008, Volume 12, Number 1
pp. 23-42
Cambridge advanced learner’s dictionary on CD-ROM. (2nd Ed., version 2.0a,
Collins Cobuild advanced learner’s English dictionary on CD-ROM. (5th Ed.,
Windows 98, NT4, 2000, ME, or XP
Windows 98, 2000, ME, XP, or NT
350 MHz. Free hard disk: 120 MB
Longman dictionary of contemporary English, writing assistant edition CD-ROM. (Updated 4th Ed., 2005)
Windows 98, 2000, or XP. Mac 10.2 or higher. Linux Redhat 9, 10/10.1, Suse 9.1, Debian
2000 or XP
350 MHz. 128 MB RAM. Free hard disk: 460 MB
300 MHz. 128 MB RAM. Free hard disk: 450 MB
Windows 98, 2000, or XP. Mac 10.2 or higher. Linux Redhat 9 or
350 MHz. 128 MB RAM. Free hard disk: 110 MB
Macmillan English dictionary for advanced learners CD-ROM. (2nd Ed., version 2.0.0702,
Oxford advanced learner’s compass. (7th Ed., 2005)
300 MHz. 128 MB RAM. Free hard disk: 200 MB
Brief manual attached to the disk. Guided
Brief manual attached to the disk. Help
Brief manual attached to the disk. Guided
Brief manual attached to the disk. Guided
Review by Alfonso Rizo-Rodríguez, Department of English, University of Jaén, Spain
English learners’ dictionaries on CD-ROM are attracting more and more attention, given the enormous
potential afforded by the new technologies to enhance language description. McCorduck (1996) states
that this "shows the exciting promise of the application of multimedia computer technology to
lexicography and language learning" (p. 225). As an electronic resource, a dictionary on CD-ROM is
based on its printed counterpart, "a synchronic monolingual dictionary intended to meet the demands of
the foreign user" (Herbst, 1990, p. 1379).
This review focuses on the latest editions of five advanced learners’ dictionaries of English on CD-ROM,
each of which comes packaged with its printed edition. The review highlights their most outstanding
characteristics and constraints and compares them over ten dimensions: graphical user interface,
accessibility and information retrieval, macrostructure, microstructure, thesaurus-like consultation,
complex searches, copy and print functions, extras, multimedia resources, and customization. The
comparison also addresses the advantages of computer-aided lookup over paper-based consultation methods.
Copyright © 2008, ISSN 1094-3501
Alfonso Rizo-Rodríguez
Review of Five English Learners' Dictionaries on CD-ROM
The GUI is a key feature of electronic dictionaries (EDs) since users expect to gain access to every
function in an electronic dictionary in a simple, direct manner. A GUI is graphics-based, rather than
character-based, although it reproduces the entry content of paper-based dictionaries. Consequently,
GUI design may make consultation easier while extra attributes (e.g., color and clear typography) can act
as psychological incentives for users. Corris, Manning, Poetsch, and Simpson (2000) note that "electronic
interfaces still possess the charm of novelty" (p. 178), and this helps explain users’ satisfaction with EDs
and their preference for them (Nesi, 2000b).
A clear evolution is obvious in the graphical interfaces of current EDs, compared to their earlier editions:
graphical innovations have been added to make the interface more modern and stylish and to make
access to menus and options easier; further links to extras have been included; and small pop-up windows
have been designed for joint use of EDs with other computer applications (e.g., word processors,
hypertext on the Internet, or e-mail). These elements, and others mentioned below (pop-up menus, tool
bars, on-screen buttons, and dialogue boxes), add to the user-friendliness of EDs and ensure "fail-safe"
lookup procedures (de Schryver, 2003, p. 182). Figures 1 through 5 in the Appendix illustrate the GUIs of
the five EDs examined here, all showing the same search word, catch.
All five EDs have similar interfaces, although certain differences are apparent. As far as layout is
concerned, three dictionaries—Cambridge Advanced Learner’s Dictionary (henceforth CALD2), Collins
Cobuild Advanced Learner’s English Dictionary (COBUILD5), and Macmillan English Dictionary for
Advanced Learners (MED2)—present a long, narrow panel on the left displaying an alphabetical index
list and a results list of all entries containing the search word. This design might be more informative than
the interfaces of the Oxford Advanced Learner’s Compass (OALD7) and the Longman Dictionary of
Contemporary English (LDOCE4), which simply include a drop-down menu next to the search box
showing the search word within a limited list of words beginning with the letters keyed in. Neither of the
last two dictionaries displays the full A-Z index of entries; however, OALD7 offers a results list window
showing a word in four sections: headwords, idioms, phrasal verbs, and structures.
The second element on the screen is a definition (or entry) window showing the entry for the word. The
design of this window is different in each of the dictionaries. In paper dictionaries, "all the information [is
presented] in a linear order on the same level (unless using different typesets or colours)" (Tono, 2000, p.
855); in contrast, three electronic dictionaries (CALD2, MED2, and LDOCE4) opt for the so-called
"layered" presentation (p. 857). They do not display all information at once; rather, by
clicking on different tabs, the user can retrieve further details about the search word not supplied directly
in the text of the definition window. This utility adds to the simplicity of entries and facilitates
customization, that is, adaptation to the lookup aims of each user. In contrast, the other two EDs
examined have a "traditional interface", where "information is provided in a similar way to that in a paper
dictionary" (Tono, 2000, p. 856), and, hence, they reproduce the entry text of their printed counterparts
accurately. As a result, the definition window is packed with information and might be a little hard to
Standard elements of the interface of EDs are a menu bar and a tool bar. The former includes Windows-like
menus, such as File, Edit, Options, History, or Help, and is found in two dictionaries, CALD2 and
COBUILD5. In contrast, MED2, LDOCE4 and OALD7 opt for tool bars of various kinds and designs,
including options such as Back, Forward, Copy, Print, Paste, History, Help, and Quick Search, also used
in CALD2 and COBUILD5. Tool bars may prove more user-friendly than Windows-like menus since they
provide quicker, more direct access to the software utilities.
A more recent feature of the interface of these EDs is the inclusion of on-screen buttons, or tabs, which
give direct access to dictionary extras, as well as to other complementary books accompanying the
dictionary proper. These elements are very practical for users because, by simply clicking a button, they
can easily consult other resources. Some of the EDs examined also incorporate additional components on
the right-hand side of the interface. LDOCE4 offers access to a Phrase Bank, an Examples Bank, and an
Activate-Your-Language option. OALD7 offers Word Origin, Example Sentences, and a Wordfinder tool.
The interfaces of CALD2, COBUILD5 and MED2 do not contain these types of panels, and, hence,
LDOCE4 and OALD7 can be considered superior in this regard.
From the user’s point of view, all five interfaces allow quick and easy access to the desired information
(see Table 1). The interfaces of LDOCE4 and OALD7 stand out among the others since they enable access
to the largest number of dictionary extras and complementary books. MED2 and CALD2 are unique due
to their modern, layered interface. COBUILD5’s interface (see Figure 2) proves rather plain in its content
and visual appeal.
Table 1. Comparison of Graphical User Interface Features
Index list panel
Results list panel
Definition window
Layered interface
Traditional interface
Menu bar
Tool bar
Access to complementary books
Extra type of panels
EDs surpass hard copy dictionaries in their search potential. The software locates every occurrence of a
word in the entire dictionary. Thus, users automatically get information about lexical items as they appear
not only in main entries, but also in derived words, compounds, phrasal verbs, idioms, collocations,
definitions, and examples. However, the search capabilities of the five EDs are not identical. MED2,
CALD2 and COBUILD5 (in that order) offer the widest range whereas OALD7 is slightly less informative
because its simple search function locates a word only in headwords, idioms, phrasal verbs, and
collocations. LDOCE4 looks for a word only in main entries. Which search function is most practical to a
given user depends on his/her needs and proficiency level. A wide search range is most useful to
advanced users, researchers, and EFL teachers, who may want to seek detailed information about a lexical
item as it is used in idioms, examples, collocations, or definitions. Conversely, language learners may just
need to access a word in a quick manner by referring directly to its main entry.
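The difference between a narrow and a wide search range can be made concrete with a toy example. The sketch below is purely illustrative: the two entries, the field names, and the search function are invented and do not reflect how any of the five EDs is implemented.

```python
# A toy dictionary: each entry has a headword plus other searchable fields.
ENTRIES = [
    {"headword": "catch",
     "idioms": ["catch someone's eye"],
     "examples": ["She caught the ball."],
     "definition": "to take hold of something moving"},
    {"headword": "fire",
     "idioms": ["catch fire"],
     "examples": ["The curtains caught fire."],
     "definition": "flames produced by burning"},
]

def search(word, fields):
    """Return (headword, field) pairs for every entry field containing the word."""
    hits = []
    for entry in ENTRIES:
        for field in fields:
            value = entry[field]
            texts = value if isinstance(value, list) else [value]
            if any(word in text.lower().split() for text in texts):
                hits.append((entry["headword"], field))
    return hits

# Narrow range: main entries only.
print(search("catch", ["headword"]))
# Wide range: headwords, idioms, examples, and definitions.
print(search("catch", ["headword", "idioms", "examples", "definition"]))
```

The narrow search returns only the main entry for catch, whereas the wide search also finds the word inside idioms listed under other headwords.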
CALD2 uses a special color code for entries, which helps users identify the type of entry appearing in the
results list, e.g., headwords are displayed in dark blue, phrasal verbs in green. This feature might be
particularly helpful if the user needs to identify the specific location of a search term in the text of the
dictionary quickly, e.g. in a definition or in an idiomatic construction. In contrast, OALD7 uses explicit
labels (headwords, idioms, phrasal verbs, and collocations) for that purpose. This practice, typical of
OALD7, is even more helpful to users because it is more self-explanatory. None of the other three
dictionaries feature this type of color code or label convention, which makes them less user-friendly.
Accessibility is a major and distinctive attribute of EDs, realized in different word look-up methods
(Corris et al., 2000). Users may key in a word in the search box, they may choose a lexical item from a
word list after typing in "the first three or so letters of a word" since "the word list automatically scroll[s]
down to that point", or they may locate it by means of "fuzzy spelling options" (p. 175). This last search
method is made possible as a result of a very helpful feature of the five dictionaries, their spell-check
function. When a word typed in the search box is spelled incorrectly, a small window is automatically
activated providing the correct form, as in COBUILD5. The four other EDs also display other lexical
items with a similar spelling. To illustrate, if the user types *assesment, the EDs list assessment,
abasement, assortment, amusement, amazement, and so forth. This fuzzy spelling option helps language
learners retrieve words from the A-Z list when they are not certain about correct spelling and are guided
merely by the way the words sound. Other instances of the flexibility of EDs as search tools are their
"hyperlinking" function, "a search mechanism by which a double click on a word on screen will call up a
dictionary entry for that word" (Nesi, 1999, p. 61) and instant retrieval of fixed expressions and idioms,
which can be looked up with great ease.
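The fuzzy spelling behaviour described above can be approximated with a generic string-similarity ranking. The following sketch is only an illustration under stated assumptions: it relies on Python's standard difflib module and a tiny invented word list, and it does not claim to reproduce the algorithm used by any of the five EDs, whose internals are not documented in this review.

```python
import difflib

# Hypothetical miniature A-Z word list (invented for illustration);
# a real learner's dictionary holds tens of thousands of headwords.
WORD_LIST = [
    "abasement", "amazement", "amusement",
    "assessment", "assortment", "dictionary",
]

def suggest(misspelling, words=WORD_LIST, limit=5):
    """Return up to `limit` headwords, most similar to the typed form first."""
    # cutoff=0.6 is difflib's default similarity threshold (0.0 to 1.0).
    return difflib.get_close_matches(misspelling, words, n=limit, cutoff=0.6)

suggestions = suggest("assesment")
print(suggestions)
```

In this sketch the correctly spelled assessment is ranked first, and unrelated headwords such as dictionary fall below the similarity threshold and are not offered at all.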
The accessibility of the entry text in the five EDs is superior to that of print editions due to the design of
the entry window. The entry word is clearly highlighted; extensive use is made of indentation in order to
separate meanings, example sentences, derivatives, and so forth; spacing is generous; and use of color and
fonts adds to the clarity and neatness of the text (see Figures 1 – 5). As a result, the layout of entries and
their visual impression is entirely distinct from that characteristic of print editions.
Some variation among the five EDs is, nevertheless, evident in the presentation of entries. As shown in
Figures 1 through 5, only MED2 offers a comprehensive list, or menu, of the various meanings of
polysemous words at the top of the definition window—e.g. catch (verb). This facilitates the lookup
process for language learners since the whole range of uses of the word can be seen at a glance.
Moreover, each of the numbered meanings provides hyperlinks to the word’s definition and examples.
LDOCE4 optionally displays this type of menu by allowing the user to click on the appropriate tab on the
toolbar at the top of an entry. The other EDs, like their printed versions, do not feature this element, and
hence the user inevitably has to scroll through the text, which proves cumbersome at times.
Finally, access to further lexicographical information is enhanced through inclusion of a link to various
online electronic dictionaries. This is a feature exclusive to CALD2 and MED2. Users may thus obtain
details about a word not supplied by the dictionary on CD-ROM. More specifically, in CALD2, users may
conduct searches in various works published by Cambridge University Press, such as the Dictionary of
American English, the International Dictionary of Phrasal Verbs, the International Dictionary of Idioms,
or the Diccionario Klett Compact Spanish-English. In MED2, the link enables free access to the British
and American editions of this work, as well as regular updates of new words.
Table 2. Accessibility and Information Retrieval
Simple A-Z search in dictionary text
Color/label-coded results list
Spell-check function
Hyperlinking function
Easy-to-read entry text
List of meanings
Online dictionaries link
Table 2 summarizes the characteristics of the five EDs as far as accessibility and information retrieval are
concerned. The first criterion in the table is satisfied by each dictionary to various degrees, so that a
gradation is established by means of one or more plus signs. On the whole, MED2 and CALD2 have the
greatest number of accessibility and information retrieval devices.
The macrostructure of the five EDs, that is, the alphabetical list of entries, is identical to that of their
printed counterparts. Moreover, the latest editions of these printed works, based on solid corpus evidence,
include a large variety of newly coined terms, as well as subject-specific vocabulary, especially from the
areas of science, computing, and communications technology. For example, the terms ozone-friendly and
neural network are included in all five dictionaries.
However, the coverage of regional varieties of English is not uniform in the five EDs. MED2, CALD2 and
OALD7 (in this order) stand out in this respect because they include words characteristic of different
varieties, such as American, Australian, or Indian English, e.g., anchorman (American), bonzer
(Australian), or prepone (Indian). In contrast, regionalisms, with the exception of American English, are
scarce in LDOCE4 and COBUILD5. Similarly, the treatment of spoken and informal registers is much
more exhaustive in MED2, CALD2, OALD7, and LDOCE4 than in COBUILD5. For example, the terms
nohow, irregardless, foul something up, allnighter, and argy-bargy are included only in the former four
dictionaries. Representative coverage of regionalisms and register differences adds to the usefulness of a
learner’s dictionary for decoding purposes.
Interestingly, CALD2 has maintained an approach characteristic of the first and second editions of its
printed versions: the different meanings of polysemous words are described in separate entries. For
example, the noun line is described in fifteen different entries, and its derivative line (verb) appears three
times under specific meanings of its noun. Similarly, the description of very common functional words,
e.g., in, off, and lexical ones, e.g., go, put, requires a very large number of entries. For example, the verb
go itself (not its use in a phrasal verb) is described in twenty-six different entries accompanied by
semantic indications such as move, leave, become, weaken, and happen. This proliferation of entries may
be a serious obstacle for users interested in one particular meaning of a word because, instead of finding
all the information at a glance within the same entry, users will have to scroll down on the screen.
Another unfortunate consequence of this practice is that the pronunciation and irregular forms (in the case
of verbs) of words are repeated in every single occurrence of each entry (Rizo-Rodríguez, 2005).
On a positive note, the normal macrostructure of the printed edition of LDOCE4 is interspersed with
9,000 encyclopaedic entries taken from the Longman Dictionary of English Language and Culture (2nd
revised edition, 1999). Similarly, the alphabetical list of OALD7 words is supplemented with 10,000
cultural entries from the Oxford Guide to British and American Culture (1999). This is particularly
advantageous because learners can gain access not only to the core vocabulary of English, as is expected
in a learner’s dictionary, but also to a significant number of terms (typical of an encyclopaedia) which
may enrich their background and cultural knowledge.
Table 3. Macrostructure
A-Z list of entries:
Identical to printed edition
Regional varieties of English
Spoken and informal registers
Addition of encyclopaedic entries
Table 3 depicts the macrostructure features of the five EDs. Two or more plus signs indicate differences
in coverage. Accordingly, the A-Z lists of entries in MED2, CALD2, and OALD7 are more representative of the
different uses and varieties of the English language than those of LDOCE4 and COBUILD5. Besides, the
inclusion of encyclopaedic entries in LDOCE4 and OALD7 constitutes a significant addition to the
macrostructure of their printed counterparts. In contrast, CALD2, COBUILD5, and MED2 are still heavily
dependent on the A-Z list of their printed editions, an obvious shortcoming.
The microstructures of the five EDs, that is, the content of entries, are exact replicas of their hard-copy
versions. Two dictionaries, COBUILD5 and OALD7, opt for the so-called traditional interface, which
faithfully reproduces the entry text of the book. In contrast, the microstructure of CALD2, MED2, and
LDOCE4 exhibits a layered presentation which permits users to select extra details not supplied directly
in the definition window. As a result, entries are less compact and users can customize their searches (see
Table 4).
The entry window in CALD2 displays links to pictures, study pages, and terms related to the target word,
as well as the on-screen buttons: Smart Thesaurus, Word Building, Verb Endings, Extra Examples,
Collocations, Common Learner Errors, and Usage Notes. Both the links and the notes on Collocations,
Common Learner Errors, and Usage are appended to respective entries in the printed version, but the
other buttons supply additional information found only in the electronic version. Similarly, MED2
includes different tabs in some entries—Am/BrE Differences, Animations, Avoiding Offense,
Collocations, Cultural Notes, Exercises, Expressing Yourself, Get-It-Right Notes, Illustrations,
Metaphors, Sound Effects, Synonyms, Usage Notes, Weblinks, Word Sets, and Word Stories. All of
these, except Exercises, Sound Effects, and Weblinks, also form part of the print dictionary. Moreover,
every meaning of an entry on the CD-ROM is connected to a thesaurus button, and entries for verbs,
nouns, and adjectives include an Inflections button which can be clicked to get the inflected forms.
LDOCE4 includes a toolbar showing extra details about a lexical item: Pronunciation, Menu, Word
Family, Word Origin, Verb Forms, Word Sets, and Frequency of Use. These details, except the last, are
available only in the electronic version, which has maintained three features of the hard-copy edition—
Word Choice, Word Focus, and Collocations—by appending them to some entries. Finally, OALD7
supplements entry information with the addition of three small panels (exclusive to the CD-ROM): Word
Origin, Example Sentences, and Wordfinder.
On the whole, as Table 4 illustrates, the microstructure of LDOCE4 offers the largest number of extras, as
compared with its paper-based edition, while COBUILD5 contains the least amount of extra information.
This variety of supplements adds to the usefulness of an ED since they are intended to complete the
description of a search term. Users can thus obtain a wealth of information about a word, and this extra
knowledge may significantly contribute to enhancing their command of the language.
Table 4. Microstructure
Traditional presentation of entry text
Layered presentation of entry text
Extra information
Four EDs—CALD2, MED2, LDOCE4, and OALD7—feature thesaurus-like "onomasiological" resources
not furnished by their book counterparts, apart from their A-Z "semasiological" dictionaries (Rizo-Rodríguez, 2004, p. 37). Every meaning of a word in CALD2 is accompanied by a button which opens up
a Smart Thesaurus, whose internal semantic typology of concepts, drawn up very much in the style of
Roget’s Thesaurus of English Words and Phrases (150th anniversary edition, 2002), though less
exhaustive, constitutes the basis of a very detailed onomasiological classification of English words that
proceeds from concepts to words (Rizo-Rodríguez, 2004). The Smart Thesaurus utility in CALD2 is
perfectly integrated with the A-Z list, so that the user can obtain a large variety of expressions
semantically related to each meaning of a search word, as well as definitions and examples. Finally, the
user can consult an index to the Smart Thesaurus categories for words classified in related categories.
The second edition of MED also includes a Thesaurus that supplies synonyms, antonyms, related words,
and their definitions next to every meaning of a word. Conceptual categories semantically associated with
a search word can be looked up, but they are not organized into an index. Another shortcoming of this
Thesaurus is that it can only be consulted jointly with the A-Z dictionary, not as an independent resource.
LDOCE4 features the Longman Language Activator (2nd edition, 2002). Conceived as a production
dictionary, the Activator organizes vocabulary around 866 key words denoting basic concepts, which in
turn are expressed by more specific lexical items. The internal organization of the Activator is
alphabetical with each particular word explained and illustrated. Users can refer to this onomasiological
dictionary separately or can browse through it in conjunction with the A-Z dictionary.
Similarly, OALD7 incorporates the Oxford Learner’s Wordfinder Dictionary (1997), an onomasiological
lexicon which classifies vocabulary into 630 keywords in alphabetical order. Each of these semantic
spheres comprises semantically related subareas and their corresponding terms, all of them accompanied
by a definition and example sentences, as in the Activator. The Wordfinder is available as an independent
resource on this compact disk, or it can be looked up in combination with the A-Z dictionary.
In summary (see Table 5), the most complete onomasiological information is found in CALD2, whose
Smart Thesaurus is easy to use. Equally accessible is LDOCE4’s Activator, another excellent source of
terms semantically related to a search word. OALD7’s Wordfinder is clearly inferior in coverage to the
Activator since it targets intermediate learners. MED2’s Thesaurus is a user-friendly tool, but, as
explained above, it has some limitations. Finally, COBUILD5 does not include thesaurus resources.
Instead, many of its entries list synonyms and antonyms of a search word; however, the electronic version
contains a higher number of these types of terms than its printed edition.
Table 5. Thesaurus-Like Consultation
Thesaurus-like resources
Definition of related words
Exemplification of related words
Index of semantic categories
Alphabetical organization of concepts
Thesaurus as independent resource
A feature exclusive to EDs is their capacity to carry out complex word searches in a manner that exceeds
the capabilities of the most meticulous paper-based dictionary user. It is one of the clearest indications of
the potential of EDs as language learning/teaching tools. Only four of the EDs examined incorporate an
advanced function that allows users to carry out complex lexical searches: CALD2, LDOCE4, MED2 and
OALD7 (see Table 6). In order to compensate for the absence of this function, COBUILD5 enables users
to conduct two extra searches: a phonetic search (e.g., by entering the word rite, one obtains rite, right
and write) and a morphological search (e.g., the program gives the singular form of an irregular plural
noun keyed in (e.g., mouse, mice) or the bare infinitive of any inflected verbal form (e.g., lie, lying)).
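The two extra COBUILD5 searches can be pictured as simple table lookups. The sketch below is a hedged illustration only: the transcriptions, the table of irregular forms, and the function names are all invented for the example and say nothing about how COBUILD5 is actually implemented.

```python
from collections import defaultdict

# Hypothetical mini-lexicon: headword -> broad phonemic transcription.
PRONUNCIATIONS = {"rite": "raɪt", "right": "raɪt", "write": "raɪt", "rate": "reɪt"}

# Hypothetical table mapping irregular inflected forms to their base forms.
BASE_FORMS = {"mice": "mouse", "lying": "lie"}

def build_sound_index(pronunciations):
    """Group headwords that share exactly the same transcription."""
    index = defaultdict(list)
    for word, sound in pronunciations.items():
        index[sound].append(word)
    return index

SOUND_INDEX = build_sound_index(PRONUNCIATIONS)

def phonetic_search(word):
    """Return every headword pronounced like `word`, in alphabetical order."""
    sound = PRONUNCIATIONS.get(word)
    return sorted(SOUND_INDEX[sound]) if sound else []

def morphological_search(form):
    """Return the base form of an inflected form, or the form itself if unknown."""
    return BASE_FORMS.get(form, form)

print(phonetic_search("rite"))
print(morphological_search("mice"), morphological_search("lying"))
```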
Three of the EDs, CALD2, MED2 and LDOCE4, possess an elaborate system of filters for customizing
complex searches. For example, users can look for all the adverbs ending in –ly, all the verbs followed by
a that-clause, or all the Spanish loan words. While the search window of each of these EDs shows its own
distinctive design, the searching mechanisms are similar. They are based on filters and are graphically
displayed in a dialog box where the filters are conveniently organized and ready to be selected
individually or in combination with others. Searches can be conducted by entering a word in an
appropriate box and by selecting one or more filters. The software then looks for that word according to
the conditions established by the filters selected.
For instance, in CALD2, after entering the word get and marking the option Linking verb in the grammar
filter, the software identifies only the copulative uses of that verb. Alternatively, a search can be launched
simply by selecting some of the filters. In that case, the program returns all the words in the dictionary
that match the filter criteria. For example, in MED2, by choosing the option Australian in the filter
Region, the user obtains all the terms registered in this dictionary typically used in that variety of English.
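The filter mechanism just described (a search word combined with one or more filters, or filters on their own) can be sketched as follows. This is a minimal illustration, not the software of CALD2 or MED2: the record structure, the field names, and the sample entries are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sense:
    """One sense of a headword, carrying the labels that filters test (hypothetical schema)."""
    headword: str
    part_of_speech: str
    grammar: str = ""
    region: str = ""

# Invented sample records; a real ED stores many thousands of labelled senses.
SENSES = [
    Sense("get", "verb", grammar="linking verb"),
    Sense("get", "verb", grammar="transitive"),
    Sense("bonzer", "adjective", region="Australian"),
    Sense("anchorman", "noun", region="American"),
]

def filtered_search(senses, word=None, **filters):
    """Keep the senses that match the optional search word and every selected filter."""
    hits = []
    for sense in senses:
        if word is not None and sense.headword != word:
            continue
        if all(getattr(sense, name) == value for name, value in filters.items()):
            hits.append(sense)
    return hits

# A word plus a grammar filter: only the copulative use of "get" is returned.
linking_uses = filtered_search(SENSES, word="get", grammar="linking verb")
# A filter on its own: every sense labelled as Australian English.
australian = filtered_search(SENSES, region="Australian")
print(linking_uses, australian)
```

Combining filters simply narrows the result set, which is why a search with no word and a single region filter returns a whole class of vocabulary.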
From the user’s perspective, these sophisticated searching mechanisms exceed the demands of the
average learner, who will not normally make use of them. However, these tools do serve the needs of the
EFL teacher or the language researcher concerned with the description of English or the retrieval of very
specific linguistic data which can be used in the classroom.
MED2 and CALD2’s search systems are quite exhaustive. They include six filters: part of speech,
grammar, region, style (or usage), frequency, and subject (or topic). CALD2 is more meticulous than
MED2 in its grammar filters. It includes refined options, such as "+ two objects" and "+ object + to-infinitive", for the description of verb complementation patterns. Conversely, the region filter is clearly
more specific in MED2 (e.g., Indian, New Zealand, Scottish, and Irish English) than in CALD2 (which
includes only American English, British English, and Other Regions).
In contrast, LDOCE4 has only three filters—frequency, part of speech, and style— but it features its own
original types of searches for multimedia additions, word origin, subject, and pronunciation. Multimedia,
like MED2’s Extra Features Search, locates all the terms that are accompanied by illustrations or sound
effects in the dictionary. Word Origin returns lexical borrowings that entered English and classifies them
by century. Subject search finds terms specific to a discipline. Finally, pronunciation locates all the words
containing a sequence of phonemes, and, hence, it is useful in locating words whose exact spelling is not
known, or in retrieving homophones and words that rhyme. MED2 and COBUILD5, unlike CALD2 and
OALD7, possess identical pronunciation functions.
In contrast, the advanced search utility in the Oxford Advanced Learner’s Dictionary is quite different.
Searches have to be conducted within a blank window without the help of any filter menus. Instead,
searches require the use of specific labels listed in the advanced search window. After a test of this
system, it was found that it is in need of substantial revision and simplification, mainly because queries
must be formulated in a syntax barely explained in the Help menu, and also because many of the labels
are, unfortunately, interpreted literally by the software. For example, after keying in the label
"computing", the user retrieves not just the terms belonging to this discipline, but every occurrence
of this lexical unit in the dictionary text.
Finally, advanced searches can be formulated with the help of wild cards: the symbols ? (standing for one
letter) and * (standing for zero or more letters) and Boolean operators AND, OR, and BUT. This is
possible only in MED2, LDOCE4 and OALD7. This sophisticated tool can be exploited by EFL teachers
to obtain supplementary classroom materials. For instance, by typing in will / would AND if, teachers can
retrieve a large number of examples of conditional sentences. Similarly, language researchers may want
to expand the scope of a search in order to look for certain types of lexical items. The search NOUN:*ee
returns all the nouns ending in –ee recorded in the dictionary.
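The wild card and Boolean searches described above can be imitated with a few lines of code. The sketch below is an assumption-laden illustration: it uses Python's standard fnmatch module, whose ? and * have the same meaning as in the text, together with invented headwords and example sentences. It does not reproduce the query syntax of MED2, LDOCE4, or OALD7.

```python
import re
from fnmatch import fnmatchcase

# Invented sample data: (headword, part of speech) pairs and example sentences.
HEADWORDS = [
    ("employee", "noun"), ("referee", "noun"), ("agree", "verb"),
    ("coffee", "noun"), ("tree", "noun"), ("table", "noun"),
]
EXAMPLES = [
    "If it rains, we will stay at home.",
    "She would help if you asked her.",
    "They will arrive tomorrow.",
]

def wildcard_search(pattern, part_of_speech, entries=HEADWORDS):
    """Headwords of one part of speech matching a pattern with ? and *."""
    return [w for w, pos in entries if pos == part_of_speech and fnmatchcase(w, pattern)]

def tokens(sentence):
    """Lower-cased set of the words in a sentence."""
    return set(re.findall(r"[a-z']+", sentence.lower()))

def boolean_search(sentences, any_of, all_of):
    """Sentences containing at least one word of `any_of` (OR) and all of `all_of` (AND)."""
    hits = []
    for sentence in sentences:
        words = tokens(sentence)
        if words & set(any_of) and set(all_of) <= words:
            hits.append(sentence)
    return hits

nouns_in_ee = wildcard_search("*ee", "noun")
conditionals = boolean_search(EXAMPLES, any_of=["will", "would"], all_of=["if"])
print(nouns_in_ee)
print(conditionals)
```

Here the first query corresponds to the search for nouns ending in -ee, and the second to the search for will or would combined with if.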
On the whole, MED2 has the most powerful advanced search instrument. CALD2 and LDOCE4’s are also
very efficient. The three of them, unlike OALD7, are equipped with an elaborate system of filters which
simplify searches. In contrast, the advanced search function of OALD7 is extremely intricate and actually
hinders complex searches while COBUILD5 does not incorporate this utility at all.
Table 6. Complex Searches
Advanced search function
Filter system
Pronunciation search
Wild cards search
Boolean operators search
Much of the usefulness of the results obtained with the advanced search function described above resides
in their readiness to be exported to a word processor. The copy and print functions will be particularly
appreciated by EFL teachers and language researchers, who can easily employ the results furnished by the
sophisticated search mechanism. CALD2 and MED2 can be successfully exploited for this purpose. With
CALD2, both the hits listed in the results window after performing a normal search and those appearing in
the advanced search window can be copied (up to 1,000 hits) or printed. MED2 also permits copying,
printing, and saving with the advanced search utility (up to 200 hits). The other dictionaries, however, fall
short in this respect. LDOCE4 and OALD7’s complex search results cannot be copied or printed at all.
This is a shortcoming which significantly lessens the utility of the advanced search mechanism.
Another restriction of OALD7 is that the text from the Word Origin window and the content of the
Example Sentences window cannot be exported or printed. In contrast, COBUILD5’s Wordbank,
LDOCE4’s Examples Bank and Phrase Bank, and MED2’s extra examples can be copied or printed, while
those in CALD2 can only be printed. Moreover, COBUILD5 and MED2 are the only dictionaries
reviewed which allow copying of a complete entry without altering its text properties (font, color,
phonological transcription, indentation, spacing) so that it can be used for teaching purposes after pasting
it into a word processor document. The other EDs can copy an entry but turn it into unformatted, plain text.
To conclude, as shown in Table 7, as far as the handling and export of search results and dictionary text
are concerned, the best dictionaries are MED2 and CALD2. COBUILD5 and LDOCE4 are less flexible,
and OALD7 is deficient in this regard.
Table 7. Copy and Print Functions
Powerful copy facility
Copying and printing results list items
Copying and printing advanced search hits
Copying extra examples
Printing extra examples
Copying entry text without alterations
The latest versions of EDs all contain a number of extra features to cater to users’ different needs.
According to Varantola (2002), "the future dictionary is […] an integrated tool or a number of tools in a
professional user’s toolbox where it coexists with other language technology products such as
encyclopedic [sic] sources of reference, different types of corpora, corpus analysis tools" (p. 35).
However, an ED should not just be a mere amalgam of extra materials (Nesi, 2000a); instead, "there is
need for ‘multi-referencing’: for simultaneous signalling to the user that the same query item is to be
found in a number of different resources" (Leech & Nesi, 1999, p. 301). An examination of the five EDs
under review reveals that four of them—LDOCE4, OALD7, MED2, and CALD2—stand out in terms of
their extras (see Table 8).
Both LDOCE4 and OALD7 are collections of reference books: the former includes the text of its printed
counterpart as well as the Longman Language Activator. Similarly, OALD7 comprises the 7th edition of
its printed version, The Oxford Guide to British and American Culture (1999), The Oxford Learner’s
Wordfinder Dictionary (1997), and also Oxford Genie. All these books work in perfect conjunction, as
advocated by Leech & Nesi (1999), and guarantee multi-referencing. The other three EDs do not include
further reference works.
An exclusive feature of OALD7 is its Word Origin window that furnishes etymological information on
20,000 words. LDOCE4 is notable for its Phrase Bank (a vast collection of phrases containing the search
word and collocates used with the entry word, all of them illustrated with additional examples), its lesson
plans for teachers (a mixture of notes for teachers and language activities intended to promote
familiarization with the dictionary and its use), and its grammar section (an appendix offering brief
summaries of various grammar points). All these extra elements constitute "evidence of pedagogical
design and consideration of the learners’ needs" (Seedhouse, 1997, p. 63). They will benefit the advanced
user who wants to obtain further details about a search term and also the teacher concerned with
promoting dictionary skills.
A similar attempt on the part of LDOCE4, OALD7, CALD2 and MED2 to serve those needs is the
inclusion of a writing assistant. But the capabilities of this tool are not identical in the four EDs. The most
practical writing assistant is that of LDOCE4. It provides thesaurus-like details and diverse grammatical
information, as well as typical learner mistakes and their correct versions. In contrast, it is questionable
whether the Know-how utility in OALD7 is equally useful for every user. Learners must type a sentence in
order to check its acceptability against various example sentences, but the effectiveness of this resource
ultimately depends on the users’ ability to infer grammatical information from linguistic chunks. Despite
its name, CALD2’s Superwrite is not a genuine writing assistant since it simply displays any word in a
text pointed at with the cursor with its complete entry. Finally, MED2 does not feature a writing assistant
proper either, but its sections Improve Your Writing Skills and Expand Your Vocabulary offer appendix-like explanations on a number of communicative functions and expressions accompanied by interactive
writing exercises.
Additional extra features are MED2’s language awareness articles (fourteen contributions by leading
specialists on idioms, metaphors, pragmatics, or word formation), its atlas featuring geographical
information, and its links to websites offering cultural and encyclopaedic information. Similarly, CALD2
contains study pages about grammar, vocabulary, and pronunciation. These will all be helpful to language
learners, who can easily access and use these resources through on-screen buttons.
LDOCE4, CALD2, OALD7, and MED2 also provide a large variety of exercises (accompanied by both
check-answer and show-answer buttons) intended for upper intermediate and advanced level learners of
English. LDOCE4 offers the widest range on grammar, vocabulary, culture, listening comprehension,
intonation, sentence dictation, and word dictation. OALD7 incorporates a variety of vocabulary exercises,
as does CALD2, which additionally includes grammar exercises. Finally, MED2 features lexical and
grammatical activities, matching Get-It-Right explanatory notes on typical learner errors (appended to
104 entries) as well as writing exercises. Moreover, LDOCE4, CALD2, and OALD7 include exam papers
from a variety of language certificates, e.g., Certificate of Proficiency in English, Test of English for
International Communication, or Business English Certificate, which should be useful especially for
advanced learners of English preparing for these examinations.
Another outstanding feature of the latest editions of EDs on CD-ROM is the inclusion of extra examples,
or small language corpora, a component metalexicographers find useful (de Schryver, 2003; Svartvik,
1999) as they provide ample evidence of language use. This will prove most advantageous for encoding
tasks, mainly writing, and also for teaching purposes. Four of the EDs feature this component, but to
different degrees (indicated in Table 8 by means of multiple plus signs). COBUILD5 stands out with its
Wordbank, a 5-million-word representative sample of British and American English, both written and
spoken. Likewise, LDOCE4’s Examples Bank window offers a wealth of example sentences (76,000
from other dictionaries published by Longman and over one million examples from the Longman Corpus
Network). OALD7 contains 200,000 extra example
sentences. These examples
have been taken from the entire text of the dictionary, as is the case with MED2’s Example Sentence
search. Finally, CALD2 gives just five extra examples, apart from those included in the entry itself, for the
most important words, marked with the symbols E(ssential), I(mprover), and A(dvanced).
Finally, CALD2, MED2, and LDOCE4 feature a Guided Tour of the dictionary. The first two are
particularly effective because their tutorials are actual videos combining animation and narration. OALD7
has a graphic tutorial, and COBUILD5 offers a Help menu. This type of graphical element is in line with
our present-day, predominantly visual culture, as it may help learners to quickly become familiar
with the content and use of the dictionary. In contrast, a typical traditional paper dictionary merely
includes an introduction and a brief key to its entries, which proves much less user-friendly than a visual tutorial.
In brief, LDOCE4 is the most outstanding dictionary in terms of the usefulness of its extras (Table 8).
MED2, CALD2, and OALD7 are also characterized by the inclusion of abundant supplementary materials,
and COBUILD5 exceeds the others only with its Wordbank.
Table 8. Extras
CD-ROM as collection of reference books
Etymological information
Phrase and collocations bank
Lesson plans for teachers
Grammar section
Writing assistant
Language awareness articles
Study pages
Language exercises
Exam practice
Extra examples
Guided tour
A distinctive feature of EDs is their integration of various kinds of multimedia resources (see Table 9).
The five dictionaries offer both American and British English recorded pronunciation of each entry, as
well as pronunciation practice. In LDOCE4, even example sentences are accompanied by recordings, and
users can practice reading them aloud and comparing their pronunciation to that of the original. This ED,
like MED2, also features the inclusion of recordings of some musical instruments and of onomatopoeic
verbs and nouns denoting types of sounds difficult to define (e.g. bleat, chirp). Moreover, all of the EDs,
except OALD7, include an option for "automatic pronunciation replay" or "always play
pronunciation/sound," which activates the recorded sound of every word keyed in the search box. These
types of multimedia resources add to the usefulness of an electronic dictionary, since exposure to the
spoken language is always beneficial for language acquisition (Brown & Yule, 1983).
Interestingly, video clips are no longer included in the latest EDs, except in MED2, whose animation
videos illustrate hard to define verbs or nouns (e.g. juggle, lob). Earlier works, such as the Longman
Interactive English Dictionary (2nd edition, 2000) and the Oxford Advanced Learner’s CD-ROM
Dictionary (6th edition, 2000), made use of video clips. This feature might have been eliminated from the
majority of the EDs in order to save space on the CD-ROM and to give priority to the type of original
extras discussed above, which may prove more informative to users.
As far as illustrations or pictures are concerned, images and color also play a prominent role in the EDs
examined, except in COBUILD5. LDOCE4 uses illustrations and photographs exclusively to support the
definition of certain terms and, hence, they form an integral part of the entries. In CALD2, pictures are
presented in an appendix accessed by means of an on-screen button, and they serve to enrich lexical
description by grouping illustrations of semantically related terms (for example, in the office). In MED2,
illustrations accompany some entries, or alternatively, they can be viewed separately as members of a list.
Finally, in OALD7 users have no direct access to pictures, which accompany only some entries in a small
window, but illustrations labelled "expand" open in a larger window depicting semantically associated items
(for example, vegetables).
MED2 and LDOCE4 are the only EDs whose complete list of illustrations and photographs can be
retrieved by means of the advanced search mode. This might be helpful for teaching purposes in order to
present semantically related vocabulary. In addition, the pictures in CALD2 and MED2 can be rendered
interactive through "hot spots": by holding the cursor over some parts of a drawing, users can activate
terms denoting items in a picture. In contrast, in LDOCE4 and OALD7, illustrations include labels
denoting specific vocabulary.
All in all, multimedia resources are abundant in MED2 and LDOCE4 but not in the other EDs examined.
Table 9. Multimedia Resources
Features compared: recorded pronunciation; recorded example sentences; sound effects; automatic
pronunciation replay; animation videos; illustrations appended to entries; illustrations in appendix.
One recurrent demand from metalexicographers is that EDs should allow customization in terms of the
lookup aims of particular users (Atkins, 2002; Corris et al., 2000; de Schryver, 2003; Varantola, 2002).
The five examined EDs allow some degree of customization (see Table 10). Interactivity, a feature of the
advanced search mode, is typical of all the dictionaries except COBUILD5. Users can carry out complex
searches according to their individual reference needs. Customization and flexibility are also evident in
the layered presentation characteristic of the graphical interface of CALD2, MED2, and LDOCE4. This
ensures that users can decide what type of information is of interest to them.
Other customization elements are restricted to the display style of entries on the screen and recorded
pronunciation. All dictionaries offer very similar alternatives: selection of font size, American or British
English pronunciation, and display options (e.g., phonetic transcription, grammar labels, spell check, and
quick/full view). MED2 also features an annotation facility, which allows users to add personal notes to
any entry. For example, a translation equivalent or a list of synonyms previously retrieved with the
advanced search function can be permanently appended to an entry and, afterwards, they can be edited or
removed. COBUILD5 allows customization through its "My Dictionary" function, which enables the
creation of a personal lexicon. On the whole, the five EDs offer very similar customization options.
Table 10. Customization
Features compared: interactive advanced search; on-screen button access to specific information;
display style options; full/compact entry view; annotation facility; personal lexicon.
Based on the preceding comparison of five EDs, several conclusions may be drawn. The electronic
versions of these works are superior to their hard-copy counterparts in terms of accessibility and
flexibility of information retrieval (enhanced by their graphical user interface), wider macrostructure (in
some EDs), more detailed microstructure, thesaurus-like resources, complex search mechanisms, copy
and print functions, extra components, multimedia resources, and the degree of customization.
Additionally, the comparison shows that all EDs are equally easily accessible to users and that they all
provide a large variety of lookup operations. At a more specific level, however, the preceding analysis
suggests that MED2 ranks highest because it exhibits the largest number of functions and innovative
features. It is closely followed by LDOCE4 and CALD2. OALD7 also possesses a large number of
valuable features, while COBUILD5 is the most basic. In particular, MED2, LDOCE4, CALD2, and
OALD7 stand out due to their modern graphical interface, addition of entries from other reference books
(LDOCE4 and OALD7), supplementary information in their entries, inclusion of abundant thesaurus-like
information, powerful search utilities (MED2 and CALD2), excellent copy and print functions (MED2 and
CALD2), high number of extra elements, recorded pronunciation and practice of every example sentence
(LDOCE4), and layered presentation of content (CALD2, MED2, and LDOCE4).
Nevertheless, this comparison has also exposed some weak points in the EDs. For example, the
macrostructure of COBUILD5, CALD2, and MED2 merely reproduces that of their hard-copy
counterparts; LDOCE4’s dictionary search function proves insufficient for retrieving very specific types
of language data (it consists of only three filters); and OALD7’s search function (non-filter based) would
benefit from revision and simplification. In addition, all dictionaries, except MED2 and CALD2, which
already possess it, should incorporate a flexible, unrestricted copy and print function in order to help users
utilize the findings of the advanced search function. Furthermore, a first-rate attribute of LDOCE4 – its
recorded pronunciation and sentence practice functions – might well be a "must" in subsequent editions of
other EDs, as they facilitate exposure to the spoken language. These dictionaries might also
consider including animated videos like those featured in MED2, since they facilitate learning of
words that defy semantic description. Finally, the EDs might benefit from extending customization
features to suit users’ reference needs. These additions would result in more efficient, powerful language
tools that meet the increasing demands of learners, teachers, and metalexicographers.
Finally, although the macrostructure and microstructure of these electronic works depend to a
considerable extent on those of their printed versions, these EDs also possess properties which set them
apart from paper dictionaries and which make them very effective. The present survey of ED features has
shown that significant moves are being made by lexicographers and publishers to produce versatile,
multipurpose electronic dictionaries that clearly surpass their printed editions.
Figure 1. Cambridge advanced learner’s dictionary on CD-ROM (2nd Ed., 2005)
Figure 2. Collins Cobuild advanced learner’s English dictionary on CD-ROM (5th Ed., 2006)
Figure 3. Longman dictionary of contemporary English, writing assistant edition CD-ROM (Updated 4th
Ed., 2005)
Figure 4. Macmillan English dictionary for advanced learners on CD-ROM (2nd Ed., 2007)
Figure 5. Oxford advanced learner’s compass (7th Ed., 2005)
1. Research leading to this review has been sponsored by the Spanish Ministry of Science and
Technology under I+D contract HUM2007-61766/FILO entitled "ADELEX: Assessing and Developing
Lexis through New Technologies."
The author is deeply grateful to Dr. Sigrun Biesenbach-Lucas, Georgetown University, Washington, DC,
USA, and Dr. Carmen Pérez-Basanta, University of Granada, Spain, for their insightful comments on a
previous version of this review.
Dr. Alfonso Rizo-Rodríguez (Ph.D., University of Granada, Spain) is senior lecturer in English
Linguistics at the University of Jaén, Spain. His research interests comprise English grammar and
lexicographical theory. His publications include a monograph on English catenative verbs, as well as
numerous articles on diverse grammatical aspects and on dictionary use and criticism.
Email: [email protected]
Atkins, B. T. S. (2002). Bilingual dictionaries: Past, present and future. In M.-H. Corréard (Ed.),
Lexicography and natural language processing. A Festschrift in honour of B. T. S. Atkins (pp. 1-29).
Stuttgart: Euralex.
Brown, G. & Yule, G. (1983). Teaching the spoken language. Cambridge: Cambridge University Press.
Corris, M., Manning, C., Poetsch, S., & Simpson, J. (2000). Bilingual dictionaries for Australian
languages: User studies on the place of paper and electronic dictionaries. In U. Heid, S. Evert, E.
Lehmann, & C. Rohrer (Eds.), Proceedings of the ninth Euralex International Congress, EURALEX 2000
(pp. 169-181). Stuttgart: Universität Stuttgart.
De Schryver, G.-M. (2003). Lexicographers’ dreams in the electronic-dictionary age. International
Journal of Lexicography, 16(2), 143-199.
Herbst, T. (1990). Dictionaries for foreign language teaching: English. In F. Hausmann, O. Reichmann,
H. Wiegand, & L. Zgusta (Eds.), Dictionaries. An International Encyclopaedia of Lexicography. Volume
2 (pp. 1379-1385). Berlin: De Gruyter.
Leech, G., & Nesi, H. (1999). Moving towards perfection: The learners’ (electronic) dictionary of the
future. In T. Herbst & K. Popp (Eds.), The perfect learners’ dictionary (?) (pp. 295-306). Tübingen:
Max Niemeyer Verlag.
McCorduck, E. (1996). Review article of the Longman interactive English dictionary on CD-ROM.
Dictionaries: Journal of the Dictionary Society of North America, 17, 225-235.
Nesi, H. (1999). A user’s guide to electronic dictionaries for language learners. International Journal of
Lexicography, 12(1), 55-66.
Nesi, H. (2000a). Electronic dictionaries in second language vocabulary comprehension and acquisition:
The state of the art. In U. Heid, S. Evert, E. Lehmann, & C. Rohrer (Eds.), Proceedings of the ninth
Euralex International Congress, EURALEX 2000 (pp. 839-847). Stuttgart: Universität Stuttgart.
Nesi, H. (2000b). On screen or in print? Students’ use of a learner’s dictionary on CD-ROM and in book
form. In P. Howarth & R. Herington (Eds.), EAP learning technologies (BALEAP Conference
Proceedings) (pp. 106-114). Leeds: Leeds University Press.
Rizo-Rodríguez, A. (2004). Current lexicographical tools in EFL: Monolingual resources for the
advanced learner. Language Teaching, 37(1), 29-46.
Rizo-Rodríguez, A. (2005). Advanced monolingual learners’ dictionaries of English in book form: A
preliminary state-of-the-art survey. In J.-L. Martínez-Dueñas, C. Pérez-Basanta, N. McLaren, & L.
Quereda (Eds.), Towards an understanding of the English language: Past, present and future. Studies in
honour of Fernando Serrano (pp. 565-580). Granada, Spain: Editorial Universidad de Granada.
Seedhouse, P. (1997). Review article of Collins Cobuild on CD-ROM (1995). ReCALL, 9(1), 61-63.
Svartvik, J. (1999). Corpora and dictionaries. In T. Herbst & K. Popp (Eds.), The perfect learners’
dictionary (?) (pp. 283-294). Tübingen: Max Niemeyer Verlag.
Tono, Y. (2000). On the effects of different types of electronic dictionary interfaces on L2 learners’
reference behaviour in productive/receptive tasks. In U. Heid, S. Evert, E. Lehmann, & C. Rohrer (Eds.),
Proceedings of the ninth Euralex International Congress, EURALEX 2000 (pp. 855-861). Stuttgart:
Universität Stuttgart.
Varantola, K. (2002). Use and usability of dictionaries: Common sense and context sensibility? In M.-H.
Corréard (Ed.), Lexicography and natural language processing. A Festschrift in honour of B. T. S. Atkins
(pp. 30-44). Stuttgart: Euralex.
Language Learning & Technology
February 2008, Volume 12, Number 1
pp. 43-63
Paige D. Ware
Southern Methodist University
Robert O'Dowd
Universidad de León, Spain
We performed a two-phase, year-long research project that explored the impact of peer
feedback on language development. We investigated specifically how and when post-secondary learners of English and Spanish provide corrective feedback on their partners'
use of the target language in weekly asynchronous discussions by assigning them to one of
two conditions: e-tutoring, in which students were asked to provide peer feedback on any
linguistic form they perceived as incorrect; and e-partnering, in which students were not
required to provide peer feedback but could do so on their own initiative. We examined
the frequency and type of language use by coding the feedback for language-related
episodes (Swain & Lapkin, 1998) and for feedback strategies (Ros i Solé & Truman,
2005). The findings indicate that students in both conditions preferred an inclusion of
feedback on form as part of their exchange, but such feedback only occurred when
explicitly required in the e-tutoring condition. Pedagogical implications include the need
to situate peer feedback on form within current models of telecollaboration and to assist
students in using feedback strategies such as reformulations, which do not rely on a deep
understanding of the target or native language grammar.
Online communication tools have been taken up eagerly by the foreign language teaching community. An
early focus on within-class communication among foreign language students was quickly followed by a
second stage of network-based language teaching in the late 1990s in which language students were
linked with learners in other contexts to form international partnerships (Kern, 1995, 1996; Tella, 1991;
Warschauer, 1997). Goals of such partnerships, often called telecollaboration, include assisting students'
linguistic and pragmatic development and intercultural awareness (Belz, 2003; Kern, Ware, &
Warschauer, 2004; Thorne, 2006). In recent years, research has explored in greater depth how different
configurations of telecollaboration, from real-time chatting to videoconferencing, have impacted students'
language development through online interaction with peers using the target language (Bauer,
deBenedette, Furstenberg, Levet, & Waryn, 2006; Belz, 2003; Belz & Kinginger, 2003; Belz & Vyatkina,
2005; Dussias, 2006; Kern, 1996; Kinginger, 1998; Kinginger & Belz, 2005; Lee, 2004). A smaller
number of studies within this paradigm (Belz, 2006; Lee, 2006; Levy & Kennedy, 2004; Sotillo, 2005)
have focused on the value of having students actively reflect on language form for linguistic development
in telecollaborative exchanges.
We build on this growing research base by reporting on a two-phase, year-long research project that
explores the impact of peer feedback on language development. We investigated specifically how and
when post-secondary learners of English and Spanish provide corrective feedback on their partners' use of
the target language in weekly asynchronous discussions by assigning them to one of two conditions: e-tutoring, in which students were asked to provide peer feedback on any linguistic form they perceived as
incorrect, and e-partnering, in which students were not required to provide peer feedback but could do so
on their own initiative. We examined the frequency and type of language use by coding for language-related episodes (Swain & Lapkin, 1998) and for feedback strategies (Ros i Solé & Truman, 2005), both
of which are discussed in detail in the methods section.
Copyright © 2008, ISSN 1094-3501
Paige Ware and Robert O’Dowd
Peer Feedback on Language Form in Telecollaboration
Research on language use in telecollaboration has drawn on several areas of applied linguistics research.
With this in mind, we review both sociocultural and interactionist interpretations of telecollaborative
language learning, and we pay particular attention to how a focus on form has been integrated into online
exchanges to date.
Sociocognitive and Sociocultural Perspectives
Researchers have studied a range of issues in synchronous and asynchronous exchanges, such as
intercultural exploration and understanding (Belz, 2003; Furstenberg, Levet, English, & Maillet, 2001;
Liaw, 2006; O'Dowd, 2003, 2006), the role of the instructor (Belz & Müller-Hartmann, 2003; Müller-Hartmann, 2006; O'Dowd & Ritter, 2006; Ware & Kramsch, 2005), cultural patterns of use (Kramsch &
Thorne, 2002; Thorne, 2003), and the influence of socioinstitutional contexts on students' participation
patterns and attitudes toward online correspondence (Belz, 2002; Ware, 2005). Much of this research has
yielded rich analyses of language development, including the acquisition of pronouns of address (Belz &
Kinginger, 2003; Kern, 1996), the development of modality and expressions of appraisal (Belz, 2003), the
development of null-overt subject use and gender agreement (Dussias, 2006), and the acquisition of
modal particles (Belz & Vyatkina, 2005).
Interactionist Perspective
Research examining how online interaction can contribute to learners' grammatical competence and
syntactic complexity stems from the literature base of task-based learning, focus on form, and negotiation
of meaning in second language acquisition. These studies are often based on the application of Long and
Robinson's (1998) interaction hypothesis to online environments. This hypothesis proposes that
negotiation of meaning in interaction exposes learners to input that is both linguistically and
interactionally modified. Such input is expected to draw learners' attention towards grammatical form and
to push them to modify their own output. Negotiation of meaning is seen as a natural and automatic
process as interlocutors seek to understand and clarify each other's utterances.
Studies in the interactionist tradition have tended to focus on synchronous online interaction, for example,
MOOs (Multi-User Domain Object-Oriented applications) and chats, either between students within the
same classroom (Blake, 2000; Pellettieri, 2000; Smith, 2005) or between native speakers and learners of
the target language (Dussias, 2006; Kötter, 2003; Lee, 2004, 2006; Tudini, 2003). Lee (2004)
demonstrated that native speakers of Spanish assisted non-native speakers in composing their ideas and in
improving their grammar, although she found that language proficiency, computer skills, and age also
impacted the nature of the interactions. In a later study using the Blackboard virtual learning platform
(2006), Lee focused on open-ended and goal-oriented tasks in synchronous interactions between native
speakers of Spanish and American students of Spanish as a foreign language. She found that the Spanish
native speakers provided mostly recasts and focused mainly on lexical rather than syntactical errors.
Tudini (2003) examined Italian language learner interaction in native Italian Web-based chat rooms and
found that negotiation sequences in synchronous interaction occurred in over 9% of total turns and that
language learners received both implicit and explicit feedback on their language from their native speaker
interlocutors. In short, work in the interactionist tradition has shed much-needed light on how real-time
written interaction can support language development in online interactions. However, it focuses mainly
on interactions involving negotiation of meaning, not on additional ways that students can support one
another when attending to form.
Focus on Form
Focus on form in online interaction is considered important for several reasons. First, Lee (2004) and
Levy and Kennedy (2004) have argued that computer-mediated communication should balance fluency
and linguistic accuracy. Second, studies of foreign language students in the US have found that students
often consider the "real" part of language learning to involve the study of grammar (Chavez, 2002) and
that a focus on culture takes away from the primary goals of classroom instruction (Kubota, Austin, &
Saito-Abbott, 2003). In a study of telecollaboration by one of the authors (Ware, 2005), many students
cited their preference for focusing on language.
The noticing of language forms can occur through ongoing interactional support provided during the
normal flow of conversation (Foster & Ohta, 2005) and in explicit feedback in electronic tandem (e-tandem) partnerships (Appel & Mullen, 2000; Brammerts, 1996; O'Rourke, 2005). Foster and Ohta
(2005) provide an example of how the cognitivist approach of the interactionist tradition can be combined
with a sociocultural lens to explore data on oral negotiated interaction among English and Japanese
learners. They found that students helped one another not only through negotiation of episodes that
focused on clarifying meaning, but also through assistance in formulating their messages even when a
communication breakdown did not occur. This type of interaction draws students' attention to language
form by providing opportunities to discuss language choices, to play with language, and to notice the
difference between their own linguistic formulations and those of native speakers. Research on the e-tandem approach focuses on one-on-one partnerships in which learners provide feedback on one another's
errors whether or not they impede meaning. These take place either outside of a traditional classroom
(Brammerts, 1996) or within a classroom (O'Rourke, 2005). Students can refer to L2 structures and
vocabulary that were used earlier by their partners and reuse them in other situations and contexts.
More recent work has examined how telecollaboration can help students to actively notice, process, and
discuss specific language forms and functions (Belz, 2006; Dussias, 2006; Levy & Kennedy, 2004). For
example, Dussias (2006) compared the linguistic gains of U.S. students of Spanish in a treatment group
who were each paired in telecollaborative partnerships with students in Spain against the gains of U.S.
students in a control group who performed the same tasks with non-native speaking peers. She found
greater gains in the treatment group in overt-null subjects, gender agreement, and communicative fluency.
Belz (2006) proposed using learner corpus analysis to assist learners in examining their own patterns of
error and in tracing their language development. In a study of "stimulated reflection," Levy and Kennedy
(2004) examined how teachers used online communication tools to engage their students in reflection on
form. Students of Italian engaged in audio-conferencing with various interlocutors including classmates
and Italian native speakers. The recordings of the audio interaction and the shared screen content were
then analysed together by the teacher and students with a focus on grammar, vocabulary, pronunciation,
and register. The sessions served to focus on the process of interaction in the L2 and to encourage learners
to reflect on the accuracy and complexity of their target language and on their communication strategies,
including social appropriateness.
A focus on the social aspects of language use stems from the potential of telecollaboration to provide
opportunities for students to see language and culture as two sides of the same coin (Belz, 2003;
Furstenberg et al., 2001; Kern, 1996; Thorne, 2003, 2006; Ware & Kramsch, 2005). Therefore, the tasks
given to the students during both phases of this study focused on highlighting the link between language
and culture and on developing learners' intercultural awareness (see Appendices A and B).
Our study contributes to the above research base by examining peer feedback and attention to language
form in asynchronous writing. It is theoretically grounded in a sociocultural approach that views language
learning as embedded in a particular sociocultural context (Lantolf, 2000). This implies that any study of
focus on form in an online intercultural exchange must take into account sociocultural factors such as the
attitudes of each set of learners to the culture of their interlocutors and issues of face and communication
breakdowns that regularly occur in intercultural interaction. Sociocultural issues identified as particularly
relevant in this study included cultural differences in the techniques used by Spanish and North American
students to correct their partners and how previous experiences of formal language learning shaped
students' attitudes towards the importance of a focus on form in online intercultural exchange.
Background Information
This two-phase study investigated the integration of peer feedback on language into classroom-based
adult foreign language learning using qualitative and quantitative methods. We examined the type and
frequency of language-related episodes, feedback strategies students used to focus on morphosyntactic
forms, and students' attitudes toward the presence or absence of an explicit focus on language in their
online interactions. Students were assigned to one of two conditions:
e-tutoring, in which they were asked to provide corrective feedback to their partners on language
errors or, in the absence of errors, to provide suggestions for language improvement such as different
wording or increased vocabulary. Students in this condition received training from their teachers in
how to provide such feedback and suggestions (see Appendix C).
e-partnering, in which students were not explicitly encouraged or trained to provide corrective
feedback to their partners. Instead, they were told that they could provide feedback or suggestions if
they chose to or if their partners asked them to do so.
Research Questions
The following questions guided our study:
1) What are the types and frequencies of language-related episodes in each of the two online conditions
of e-partnering and e-tutoring?
2) What feedback strategies did participants use when integrating a focus on morphosyntactic form into
their online interactions?
3) What were the attitudes of the participants in each condition toward the presence or absence of a
focus on language form in their online interactions?
Stages and Procedures
To answer these questions, the research was conducted in two phases (Table 1).
Table 1. Organization of the Two-Phase Study
Phase      Condition      Participants                      Duration
Phase I    e-tutoring     2 U.S. students & 11 Spanish      8 weeks
Phase I    e-partnering   2 U.S. students & 11 Spanish      8 weeks
Phase II   e-tutoring     14 U.S. students & 14 Spanish     8 weeks
Phase II   e-partnering   22 U.S. students & 22 Chilean     10 weeks
Phase I
Phase I took place during the spring semester of 2006 as a monolingual online exchange in English
between 22 EFL students in Spain and 4 post-secondary students in the US. Conducting this pilot phase in
only one language allowed us to control for the effect of instructor, syllabus, classroom, and semester as
we explored the potential for conducting a larger follow-up study involving more students and instructors.
In this first phase, we randomly assigned 22 post-secondary advanced EFL students (ages 19-22) at a
university in Spain to either the e-tutoring or e-partnering condition. All of the students were in the same
language course conducted by the second author. Their online partners were a cohort of four post-secondary
students (ages 19-21) enrolled in a small university in the US. Two of the U.S. students were
required to provide weekly feedback to 11 Spanish students in the e-tutoring condition, and two were
asked to provide feedback only when solicited by their EFL partners in the e-partnering condition.
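The random assignment described above can be illustrated with a short sketch. It is not the authors' actual procedure, which the article does not specify; the function name `assign_conditions`, the student labels, and the seed are hypothetical and chosen only for illustration. The sketch shuffles a roster of 22 students and splits it into two equal groups of 11.

```python
import random


def assign_conditions(students, seed=None):
    """Randomly split a roster into two equal-sized condition groups.

    Hypothetical helper: the article reports random assignment but does
    not describe the mechanism used.
    """
    rng = random.Random(seed)      # seeded generator so the split is reproducible
    shuffled = list(students)      # copy, so the caller's roster is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "e-tutoring": shuffled[:half],
        "e-partnering": shuffled[half:],
    }


# Placeholder labels standing in for the 22 EFL students in Phase I.
students = [f"S{i:02d}" for i in range(1, 23)]
groups = assign_conditions(students, seed=2006)
print(len(groups["e-tutoring"]), len(groups["e-partnering"]))
```

Fixing the seed makes the assignment repeatable, which can help when an assignment needs to be documented or audited later.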
The four U.S. students met weekly with the first author to discuss the tasks and to review the research
protocol, so the pilot phase functioned not as a typical classroom telecollaborative project, but as a small
controlled experiment, in which the four U.S. students were responsible for maintaining the distinctions
between the control and treatment groups. All students completed the same task cycle (see Appendix A)
using a course management system called Moodle, an open source platform similar to commercial course
management systems such as Blackboard and WebCT, that allows for data storage, file sharing, and
asynchronous and synchronous interaction (Robb, 2004).
Phase II
Phase II, in the fall of 2006, was a shift from the more tightly controlled design to an implementation
phase, in which we examined the presence or absence of peer feedback on form in a condition more
typical of bilingual classroom-based telecollaborative projects. The same two conditions were established:
e-tutoring and e-partnering. In the e-tutoring condition, 14 students who were enrolled in an advanced
Spanish grammar course at a university in the US were matched with 14 students in the second author's
Advanced EFL course in Spain. These 28 students were assigned to the e-tutoring condition for eight
weeks. In the e-partnering condition, 22 U.S. students enrolled in an advanced Spanish conversation
course were paired with 22 students enrolled in an advanced EFL course in Chile. They participated in the
e-partnering condition for 10 weeks. The differences in the lengths of the exchanges were due to differing
institutional constraints at the three universities.
The students in each telecollaborative project were required to write at least 300 words in each language
weekly. In both projects, students were placed into pairs (one native English speaker and one native
Spanish speaker), and these pairs remained constant for the duration of the exchange. Students in the e-tutoring condition were allowed to choose among different tasks (see Appendix B), and students in the e-partnering condition wrote on themes related to movies they watched as part of their coursework. All
students in Phase II communicated in asynchronous interactions on Blackboard, a widely used, licensed,
password-protected course management system.
Data Collection and Analysis
Language Related Episodes
The data sources were a database of weekly online transcripts, surveys that provided descriptive
information on students' attitudes, and student-produced writing such as language reflection essays and
term papers. To answer the research questions related to the frequency and type of corrective feedback
and feedback strategies the students used (i.e., questions 1 and 2), language-related episodes (LREs) were
used as a unit of analysis. These are described by Swain and Lapkin (1998) as "any part of a dialogue
where the students talk about the language they are producing, question their language use, or correct
themselves or others" (p. 326). The online written dialogue was coded for any evidence of writing that
focused on language use including mechanics, vocabulary, grammar, style, and other types of corrections
and feedback. The total number of words written in the LREs was divided by the total number of words to
provide the percentage of writing that focused on language in the LREs. The LREs were categorized as
three types of feedback: morphosyntactic, lexical, and affective (see Table 2).
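The percentage calculation described above can be sketched as follows. This is a minimal illustration, not the authors' tooling: the function name `lre_percentage`, the sample messages, and whitespace-based word counting are all assumptions, since the article does not say how words were counted.

```python
def lre_percentage(lre_segments, all_messages):
    """Return the share of words (in percent) that fall inside LREs.

    Assumption: a "word" is any whitespace-delimited token.
    """
    lre_words = sum(len(text.split()) for text in lre_segments)
    total_words = sum(len(text.split()) for text in all_messages)
    if total_words == 0:
        raise ValueError("transcript contains no words")
    return 100.0 * lre_words / total_words


# Hypothetical two-message transcript; the second message is coded as an LRE
# (its wording is borrowed from Table 2 of the article).
all_messages = [
    "Hello, how was your weekend in Spain?",
    "We use the word transport as a verb, and transportation is the noun.",
]
lre_segments = [all_messages[1]]

print(f"{lre_percentage(lre_segments, all_messages):.1f}% of words occur in LREs")
```

A different tokenization (for example, one that strips punctuation or splits contractions) would shift the counts slightly, so the same counting rule should be applied to both the numerator and the denominator.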
In addition to these three categories, we further sub-coded the morphosyntactic LREs using a coding
scheme of Ros i Solé and Truman (2005). Sub-codes for lexical items and affective feedback were not
needed because no apparent patterns emerged within those categories. Feedback in the morphosyntactic
LREs, however, was provided in two ways: specific feedback, in which partners provided the correct
answer for mistakes or made suggestions for improving style and syntactic complexity, and
commentaries, in which partners not only corrected or pointed out errors but also provided extended
metalinguistic commentaries justifying the suggested revisions (see Table 3).
Table 2. Examples of Coding for LREs
LRE Code: Examples From Interactional Data
Morphosyntactic:
a. "We use the word transport as a verb, and transportation is the noun."
b. "I can see why you thought it could be used there, but if the two sentences it
connected had been about the same subject, then it would have worked, but they
are two completely different sentences and should be separate."
Lexical:
"Also 'dumb' is like saying she becomes stupid. If that's what you meant, fine, but
it may be better understood if you said 'dumbfounded' or 'speechless.'"
Affective:
"Anyways, I thought now would be a good opportunity to tell you about some of
your English. Overall it sounds very nice and can be read smoothly. There are just
a few changes you should make."
Table 3. Examples of Coding for Feedback Strategies
Specific Feedback
a. "Instead of 'Forest fires every year devastate north Spain…' you should say
'Forest fires devastate northern Spain every year' (order of words)."
b. "Instead of saying 'In add,' say 'In addition.'"
c. "'She changes of topic' should be said like 'she changes the topic' or just 'she
changes topic.'"
Commentaries
"I don't know if I told you about the trick of using 'FANBOYS' or not .... Adding
commas and semicolons in long sentences makes the sentence more
understandable and easier to read. This is when you should use commas in a
sentence, when you have any of the FANBOYS: For, And, Nor, But, Or, Yet, So."
All transcripts in the e-tutoring and e-partnering conditions of both phases were coded using these
categories. To analyze the attitudes of the participants to an absence (or presence) of a focus on form (i.e.,
research question 3), we used traditional qualitative research methods (Bogdan & Biklen, 1998; Erikson,
1986) including interviews and surveys, which were structured around these areas: background
information concerning experience with the target language and with technology, preference for task
types in the exchange, perceptions of the usefulness of partners' feedback, self-reported increased use of
new forms, and level of interest in online exchanges.
Frequency and Type of Language-Related Episodes
Phase I
Analysis of the data from Phase I in which students interacted only in English reveals that a much greater
percentage of LREs occurred in the e-tutoring condition, in which the students were asked to provide
feedback on their partner's language whether it was solicited or not (see Table 4). This is not surprising as
those students had been asked to provide such feedback, while it was optional for students in the e-partnering condition. We did not expect, however, so few LREs in the e-partnering condition because
student surveys had earlier revealed their preferences for having at least some focus on form.
Table 4. Percentage of Total Interactions Related to Language Form in Phase I
Total number of words across all interactions
Total number of words related to LREs
Percentage of words related to LREs
Phase II
The results of Phase I led us to expect that students in the e-partnering condition of the second phase
would most likely not provide or elicit feedback unless explicitly directed to do so by their instructors.
Analysis of the data for Phase II confirmed these expectations in that only the students in the e-tutoring
condition tended to provide language-related feedback (see Table 5). Again, while this finding is not
surprising, note that students in the e-partnering condition also indicated in their final surveys a preference for
having a language focus. Given this preference, why they did not actively elicit such language feedback is
unclear to us; possible explanations include a real or perceived lack of time, reluctance to switch the focus
from fluency and conversation, lack of confidence in knowing what feedback to provide, or discomfort
with taking on a role they might see as more fitting for a teacher.
Table 5. Percentage of Total Interaction Related to Language Form in Phase II
Total number of words across all interactions
Total number of words related to LREs
Percentage of words related to LREs
In our analysis of the type of feedback provided (see Table 6), students assigned to the e-tutoring
condition of the bilingual exchange in Phase II of our project put a major focus on morphosyntactic LREs
and a secondary focus on affective moves such as praise and mitigation. Lexical items received the least attention.
Table 6. Focus of LREs in E-Tutoring Bilingual Forums
                        Number (percentage) of words    Number (percentage) of words
                        in English forums               in Spanish forums
Morphosyntactic LREs    8,110 (77.1%)                   4,075 (68.0%)
Lexical LREs            968 (9.2%)                      749 (12.5%)
Affective LREs          1,441 (13.7%)                   1,168 (19.5%)
Total                   10,519 (100%)                   5,992 (100%)
While the English forums included a slightly higher focus on morphosyntactic LREs and the Spanish
forums a slightly higher focus on lexical LREs, both forums have an overall higher focus on
morphosyntactic LREs. The results also show that the students in both conditions produced affective
feedback at a higher rate than they produced lexical feedback.
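As a quick arithmetic check, the percentages in Table 6 can be recomputed from the reported word counts. The sketch below uses only the counts given in the table; the dictionary layout and the `shares` helper are illustrative names, not part of the study.

```python
# Word counts reported in Table 6 (e-tutoring bilingual forums, Phase II).
TABLE_6 = {
    "English forums": {"morphosyntactic": 8110, "lexical": 968, "affective": 1441},
    "Spanish forums": {"morphosyntactic": 4075, "lexical": 749, "affective": 1168},
}


def shares(counts):
    """Return each category's share of the forum total, in percent (1 decimal)."""
    total = sum(counts.values())
    return {code: round(100 * n / total, 1) for code, n in counts.items()}


if __name__ == "__main__":
    for forum, counts in TABLE_6.items():
        print(forum, sum(counts.values()), shares(counts))
```

In both forums the recomputed shares place affective LREs above lexical LREs, which is the ordering the paragraph above describes.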
Students' Strategies for Focusing on Language Form in Online Discussions
Based on the coding categories from Ros i Solé and Truman (2005), the most frequent type of feedback
provided by the U.S. students to the Spanish students in Phase I was that of commentaries (provision of
metalinguistic explanations). This was also the case in Phase II as can be seen in Table 7.
Table 7. Feedback Strategies Used in the Morphosyntactic LREs in Phase II
                     Number (percentage) of    Number (percentage) of    Total number
                     words in morphosyntactic  words in morphosyntactic  (percentage) of
                     LREs in Spanish           LREs in English           morphosyntactic LREs
Specific feedback    1,625 (39.9%)             7,663 (94.5%)             9,288 (76.2%)
Commentaries         2,450 (60.1%)             447 (5.5%)                2,897 (23.8%)
Total                4,075 (100%)              8,110 (100%)              12,185 (100%)
An interesting distinction emerged between the morphosyntactic LREs written by the U.S. students (in
English) and those written by the Spanish students (in Spanish). The Spanish students used more
metalinguistic commentaries (60.1%) than did the American students (5.5%). When the American
students did provide commentaries, their explanations tended to be limited in depth and not quite
accurate, as in these two examples:
" ... 'is visited' is passive voice, and is generally frowned upon in the English language.
Also we say during weekends because it is a period of time."
"Many of the verbs in English are followed by 'to' or take the 'ing' ending as you talked
about. The 'ing' form can be used in multiple tenses as well, such as: 'I was playing soccer'
and 'I am playing soccer.'"
Additionally, the students in the US did not seem as well versed in metalinguistic knowledge or
terminology as their EFL partners in Spain. When his partner requested feedback on phrasal verbs, for
example, this student in the US tried to be helpful but was unsure how to proceed:
"I am not totally sure what you mean by phrasal verbs. If you give me a few phrases of
phrasal verbs and then your own I can correct them again for you. I just looked phrasal
verbs up on the internet, are they verbs like, add up, and act up etc. I haven't taken
grammar in a really long time but if you give me an example I can make up a few phrases.
I can also make up some incorrect ones and you can try to fix them."
In stark contrast to their U.S. peers, however, the Spanish students were much more familiar with
metalinguistic terminology and explanations. They provided significantly more commentaries (60.1% of
the total), most of which were accurate, even if sometimes only partial, explanations, as these examples show:
"Y por fin cuidado con el verbo 'saber' que es irregular: 'sepa' al subjuntivo y 'supieron' al
pretérito" [And finally be careful with the verb 'saber' because it is irregular: 'sepa' in the
subjunctive and 'supieron' in the preterit.]
"Cuando dices 'querréis' sería queréis ya que es el verbo querer en tiempo presente. Yo
quiero, tú quieres ... vosotros queréis. Sólo llevaría dos erres en el condicional: querriáis."
[When you say 'querréis' it would be 'queréis' because it's the verb querer in the present
tense. I want, you want ... you (plural) want. It would only have two rr's in the conditional: querriáis.]
The Spanish students' greater familiarity with metalinguistic terminology may be related to their
participation in foreign language classes throughout elementary and secondary education. In contrast,
students in the US often only take two or three years of language before post-secondary education. While
the mismatch in students' access to the language about language did not impede their attempts at
providing feedback, more research would be needed to determine if shared terminology and grammatical
awareness might enhance the type of feedback provided and the manner in which it could be acted upon.
Turning to the code of specific feedback, the American students relied mostly on this strategy (94.5%),
whereas the Spanish students used it less frequently (39.9%). Almost all instances of specific feedback
took the form of reformulations of their partner's original message, which parallels the findings in Lee's
(2006) study on synchronous interaction among learners. In the case of asynchronous interaction,
however, students would first restate the original phrasing, then indicate how it might be better expressed.
A typical episode, for example, would be initiated by the non-native speaker in the form of a generic
request for feedback, which would be coded as an "affective LRE": "I'm looking forward to learning more
through these emails with you. Please tell me about anything that doesn't sound quite right to you!";
"Well, I'm sorry if I have mistakes, I would like to hear your suggestions to how the text can be
improved." Subsequently, the native speaker would choose several specific areas from the non-native
speaker's message on which to provide feedback and then offer specific feedback:
"Everything you wrote was really good, but I have a few suggestions. Instead of saying
'To put out the fires more or less 1,200 soldiers have been deployed in Galicia region' you
can say 'About 1,200 soldiers have been deployed in Galicia region to put out the fires.'.
Instead of 'Forest fires every year devastate north Spain…' you should say 'Forest fires
devastate northern Spain ever year' (order of words)."
To end the episode, the non-native speaker would either provide a general acknowledgement of the advice
or simply request more feedback on the new message, once again coded as an affective LRE ("Thank you
for correcting my english [sic] mistakes, it really helps me.").
In the lexical LREs, feedback tended to come in two forms, either by providing a definition with
examples or by exemplifying the word's use in different contexts:
"When you say 'of the taste' you should say 'with the taste.' Also 'dumb' is like saying she
becomes stupid. If that's what you meant, fine, but it may be better understood if you said
'dumbfounded' or 'speechless.'"
"Cuando dices que vistéis una peli sería el personaje principal ya que no es una persona
sino un actor que hace de esa persona." [When you say that you saw a movie, it would be
'the main character' since it's not a 'person' but rather an actor who acts out this person.]
Such reformulations, with a secondary focus on vocabulary, were more time-efficient for the students,
and they were less likely to lead to inaccurate explanations of grammar. Using these reformulations, their
target partners could use the strategy of noticing (Schmidt, 1993) to compare their own original writing
against the more "native-sounding" rephrasing.
Participant Attitudes Toward Presence or Absence of Focus on Language Form
The role and status of grammar in foreign language education among the different groups of learners in
this study differed slightly. Spanish students taking an English philology degree at the university in this
study tended to attribute considerable importance to the grammar aspects of their language courses
despite their open preference for day-to-day class activities based on the development of communicative
skills and intercultural awareness. Informal comments by students often gave the impression that while a
language was best learned by practicing speaking and listening, the real business of language learning in
educational contexts involved the study and mastery of grammatical forms and vocabulary: "Teachers
only just do the textbook or give us photocopies. I think there should be more grammar from which to
take notes. And apart from that there should be interesting exercises from which we can learn." Clearly
reflecting these attitudes to language learning, many of the Spanish students who had participated in
online exchanges in the past had complained of the lack of a clear focus on elements of form in their
collaborative work.
In contrast, students in the United States who were assigned to the e-tutoring condition in which they
were asked to provide corrective feedback were initially hesitant to write commentaries about their
partner's language use. In the U.S. students' institutional context, online learning is frequently a part of
their regular university coursework, and students often participate in student-based discussion boards as
part of their out-of-class coursework. These boards are often informal spaces for sharing ideas, and most
evaluative feedback remains the role of the course instructor, so the U.S. students' concerns centered
mainly on fears of transforming their online conversations into less informal sessions.
At the end of Phase I, Spanish students assigned to both the e-tutoring and e-partnering conditions of the
exchange reported seeing their participation not only as an opportunity to get to know and understand
members of the target culture, but also as a way to improve their English and be exposed to informal
English language from native speakers. Many of those students who had been assigned to the e-partnering
condition were disappointed when the American partners did not explicitly provide language feedback
and concluded the exchange with feelings of frustration:
"No, she's too polite [to comment on grammar]. But I prefer it if she does because if they
don't correct you, you can't improve. It [participating in an exchange] is useful because
you see how language works but it's not enough because you can't improve your writing
because they don't say to you what you are doing wrong."
Interestingly, in the second phase of our study, surveys distributed at the end of the exchange found no
significant differences between the two conditions of e-tutoring and e-partnering in student attitudes
toward language feedback (see Table 8).
Table 8. Student Attitudes Toward Language Feedback in Phase II
Writing to native speakers should be a part of all
language classes.
When writing to native-speaking peers, it is
important to include a focus on grammar.
I prefer my partners to be primarily conversation
partners without a strong focus on grammar.
*Note. E-tutoring, n = 23; e-partnering, n = 27.
Clearly, both groups of students strongly favored writing to native speakers as part of their language
classes. Both groups of students also favored including grammar in the exchange. Students' attitudes
differed slightly in the degree to which they believed a grammar focus was important. Those students who
had participated in the e-tutoring condition tended to favor a stronger focus on grammar than those in the
e-partnering condition. Although the survey indicated no clear consensus as to why, we speculate that the
e-tutoring group had concrete, positive experiences with the language focus, whereas for the e-partnering
group, the question was hypothetical because they had primarily focused on conversational fluency.
In summary, students in the e-tutoring condition who did receive feedback on the accuracy of their
writing spoke very positively about this feature of the exchange. These students highlighted the difference
between focusing on form with their online partners compared to the traditional grammar focus in their
contact classes with their teachers. They mentioned, for example, that the corrections they received from
their online partners made a greater impact on their learning than normal classroom feedback and that the
corrections were experienced in a more personalized and unthreatening manner:
"In class you write down notes about grammar and vocabulary and it stays in your
notebook. With an exchange partner she corrects and the information stays with you ....
You learn more from mistakes in the forums than from reading rules from the blackboard
…. Maybe it's more interesting by the net. You are chatting so you are enjoying. If the
teacher gives me a corrected essay, I just read it and that's all."
"My partner helped me with sentence structure because in his emails I saw how he wrote
and I try to learn with his emails and another thing was the vocabulary, because I want to
write something and he had already written another words and it was very useful."
Students viewed online correction as a more contextualized way of learning about grammar and
vocabulary. From the students' perspective, the discussion forums provided them with a springboard for
reflecting on language form that differed from the classroom-based style to which they were accustomed,
and they appreciated the newer style.
Several findings that are worthy of further discussion and analysis emerge from the data. First, note that
the limited focus on feedback (3% in Phase I and 0.003% in Phase II) in the e-partnering conditions of
these asynchronous exchanges replicates the findings from similar research on synchronous interactions.
In his extensive study of a MOO-based synchronous tandem exchange between students of German and
English, for example, Schwienhorst (2000) found that even though students were encouraged to correct
their partners' grammatical errors, as was the case with our e-partnering condition, very little evidence of
error correction appeared in the transcripts (see Note 1). The author suggested that this was due to the students
perceiving the point of the activity as being primarily one of communicating and establishing
relationships with their online partners. Focusing on grammatical corrections was considered of only
secondary importance to the learners.
Several explanations are possible for the primary focus on morphosyntactic LREs and secondary focus on
affective LREs during the exchange. First, because the students had more time to compose their messages
in asynchronous forums, they were able to look up vocabulary instead of relying on synchronous
negotiation of meaning to clarify unfamiliar terms. With the extra time available to read, interpret, and
respond to messages, they were better positioned to infer vocabulary from the larger context of the
message. Students might also have understood "grammar" to exclude a focus on vocabulary and thereby
focused their attention on morphosyntactic forms, even though they were told to focus on whatever
aspects of language they deemed important, including lexical items. Another possibility is that because all
of the students were in advanced language courses, they might not have had any immediate difficulty
understanding the gist of the messages, thereby eliminating the need to negotiate meaning.
The higher proportion of affective LREs than lexical LREs suggests that students in telecollaborative
exchanges might not feel comfortable providing corrective feedback (Lee, 2004) and therefore want to
mitigate or contextualize their language-related feedback. Students in this study used various ways of
talking about the process of focusing on language, including offering praise on one another's use of the
target language, mitigating the importance of their language-related comments, and thanking one another
for the language-related feedback:
"Concerning your grammar, you did a great job in this forum! There are only a very few
mistakes that I saw and they were very small."
"With this phrase, the only problem is ... "
"The sentence is just a little awkward ... "
Attempts to engage in grammar correction as sensitively as possible through the use of praise and
mitigation strategies were well received by the participants. Belz's (2003) findings were similar in that the
American students in her study used positive appraisal with their German partners. Comments from the
Spanish students in their interviews and portfolios confirmed that such affective strategies were
appreciated and a key factor in the success of the exchange: "I found her very helpful, she was really nice
to me and I'm very grateful. The corrections have helped me not to commit so many errors when writing.
And I think if a native speaker corrects you, you'll pay more attention since they do it in an informal
Finally, in relation to students' strategies for focusing on form, it is important to note that individual
students were differently equipped to provide accurate feedback. As mentioned earlier, the Spanish
students as a general rule used more metalinguistic terminology and typically provided more substantive
feedback than the American students. Even so, the feedback students provided was often considerably
less complete than what a trained teacher could provide. In short, the feedback was sometimes very well
intentioned but misleading.
Pedagogical Implications
The findings of this study raise several issues for instructors and researchers interested in exploring an
explicit focus on peer feedback on language form in online exchanges. First, our research indicates that
language learners do appreciate their partners' active attempts to provide them with individualized
feedback. However, even though they favor this aspect of telecollaboration, they do not integrate it into
their online interactions unless given explicit directions to do so by the language instructor. To counteract
this avoidance of focusing on form, teachers may therefore have to go further than merely encouraging
students to correct their partners. Strategies could include dedicating sufficient class time to modeling
effective feedback strategies and requiring that parts of students' portfolios or final essays be dedicated to
reflecting on how error correction was dealt with during their online interaction.
In our study, students claimed that they used the online discussions to notice how their partners used
language and then re-used that language themselves later. However, we found little evidence of this re-use within the transcripts themselves, as was also the case in the empirical work in the e-tandem tradition
by Little et al. (1999) and Schwienhorst (2000). We speculate that, from a student's perspective, online
exchanges are likely "forward-oriented" toward the next message containing new information, unlike,
perhaps, teacher-directed class assignments that can be iterative products that are revised multiple times
for accuracy (and a grade). Therefore, we would suggest that teachers structure carefully sequenced tasks
so that they build on the previous interaction.
We have evidence that the feedback provided by peers is often limited in scope or accuracy. The
limitations of peers' metalinguistic comments may well be an indication that peer feedback, in the sense
of asking students to provide accurate explanations of their native language grammar, may not be an
appropriate use of telecollaboration. A more effective frame for peer feedback in telecollaboration could
be to request that language learners provide one another with reformulations as they tended to do
naturally both in our asynchronous study and in the synchronous one conducted by Lee (2006). In this
way, the online forum can serve as an alternative type of language learner reflection journal, in which
students document what they notice about the target language.
Instructors can use transcripts, or as Belz (2006) suggests, a learner-based corpus stemming from the
transcripts, as a starting point for reflecting on language use and form. To give students ample
opportunities to reflect on their online interaction and to study new linguistic structures and lexical items,
portfolios and learner diaries, as proposed by researchers in the e-tandem tradition (Little et al.,
1999), are invaluable. Learner diaries, for example, can be used by students to maintain an ongoing record
of their experiences of the online exchange and to reflect on what they are learning, both culturally and
linguistically, from their interaction with their partners. When teachers use portfolios as part of the
evaluation process for telecollaboration, students also have an opportunity to show how they have
benefited from their exchange using presentations in which they demonstrate their use of the feedback.
Limitations and Recommendations for Future Studies
Until more studies are undertaken that can replicate our findings with different groups of students across
other online learning contexts, our conclusions are limited to this particular context of Spanish and
English post-secondary learners. Because the two cohorts of EFL learners were located in Spain and in
Chile, there are potential sociocultural factors at the national and institutional levels (e.g., different
emphasis placed on language or culture at the tertiary level, different secondary educational experiences
with English, etc.) that might have affected the students' interactions. Second, measures of student uptake
of and acquisition of particular language forms are needed to determine whether the online feedback has
an impact on either language learning or metalinguistic awareness beyond the positive evaluation given to
it by the participants in this study. A final limitation is the difficulty of attributing effects to any single
factor, such as the use of asynchronous instead of synchronous forums, the type of in-class instruction
used, or the assignment to e-tutoring or e-partnering. What this study does provide is rich descriptive data
on how peer feedback on form plays out under two types of telecollaboration, those of e-tutoring and e-partnering.
Opportunities for future research are multiple. As mentioned previously, strong measures are needed to
examine if and how specific language forms are taken up and acquired in the short and long term as the
result of peer feedback in asynchronous writing. For example, in a recent study of learner uptake in
synchronous chat, Smith (2005) cautions that a "diminished role" (p. 33) is possible for uptake in online
contexts because he found no relationship between uptake and the acquisition of lexical items. Similar
rigorous methods need to be applied to asynchronous contexts and to other aspects of language use such
as morphosyntactic complexity. This could be done using a pre- and post-test design targeting specific
items or through researcher-derived instruments that monitor the ongoing progress of individuals on items
specific to the interactions of each partnership.
Research is also needed that continues to explore the role of task type in promoting attention to language
form along with intercultural learning (Müller-Hartmann, 2000). To create a greater number of online
sequences that involve either negotiation of meaning or peer feedback on language, specific tasks may
need to be adopted that enhance the amount of negotiation between partners or reflection on language use
(Lee, 2006; Pellettieri, 2000). Finally, more research needs to investigate the extent to which
foregrounding a focus on language form might impact the ways in which students establish working
relationships with their partners and grapple with intercultural learning online.
Taking into account the quantitative and qualitative findings of our research project, the students clearly
favored an integration of language form into their online exchanges, but they were not always equipped
with a strong enough understanding of the structure of their native languages to provide quality
metalinguistic explanations. Therefore, telecollaborative projects that intend to have a language focus
need to borrow both principles and techniques from various models of online exchange, with special
emphasis given to the role of the instructor. Instructors must not only make clear their expectations that
students provide feedback, but they must also provide examples of when and how to provide feedback.
Students will learn how to work with their partners in the second language in a sensitive and efficient way
when course instructors provide their students with appropriate training and awareness-raising activities
in their contact classes. For this reason, the principle of carefully integrating and linking contact classes
with online activities as proposed by intercultural telecollaborative models (e.g., Belz, 2003; Furstenberg
et al., 2001; Thorne, 2006) is highly recommended for an approach that integrates peer feedback on
language form.
Notes
1. Although asynchronous computer-mediated communication (ACMC) and its synchronous equivalent
(SCMC) may differ in some ways, these forms of communication share many key characteristics that
justify taking into account research findings from both contexts in the area of online foreign language
learning research. Both ACMC and SCMC, for example, are text-based forms of communication that
provide learners with a level of anonymity that would not usually be possible in face-to-face learning
contexts. The fact that both are text-based also means that learners have the opportunity to focus on the
written form of their own and their partners' output to a greater extent than they would in oral forms of
interaction. This can encourage learners to reflect on accuracy and content, especially when extracts of
interactions from either form of CMC are saved, printed, and reflected on by learners and teachers at a
later point. Nevertheless, it is important to recognize that the immediate nature of SCMC may lead
learners to engage more regularly in negotiation of meaning to resolve misunderstandings that arise in
their interactions. In ACMC, learners usually have more time to reflect on their partners' texts and to
decide what was meant without actually needing to ask them to clarify or reformulate their ideas.
Appendix A. Tasks for Phase I
1: February
(Task 1) Introductions:
This task involves getting to know your partner and their local culture.
You should create background texts on two themes: 1) Personal
biographies -- describing who you are (100-150 words) and 2) An
introduction to life in your town and university -- taking into account the
aspects which would be of particular interest to someone from the other
culture (400-500 words). Students should (in the following week) respond
to their partners' texts, asking questions and making comments on the
original posts.
Also this month, post your first draft of your film review so your Spanish
and American partners can make suggestions for improvements.
2: March
(Task 2) Both the United States and Spain have experienced periods of
incredible economic growth and social change over the past 15-20 years.
Even if you are too young yourself to remember what life was like in your
country 15 years ago, the media is constantly reminding us about how life
has changed so dramatically. In relation to this, you have two 'sub-tasks'
to do this month: Firstly, discuss with your partner how life has changed
in your country for young people in recent years. How have young
people's lifestyles changed? What are their 'new' interests and hobbies?
What are the main worries and problems of your generation? Are young
people better off now than 15/20 years ago? (Students should write a
minimum of two posts on this part of the task.)
(Task 3) Each group will find on our platform a set of graphs and
statistics which show developing attitudes of your society to certain
topical issues, such as immigration and the death penalty. Describe the
graphs to your partner and then compare how these different issues are
viewed in the USA and Spain. (Students should write a minimum of two
posts on this part of the task.)
3: April
(Task 4) In this round you have the opportunity to carry out an
ethnographic interview with your partner(s) on the topic of your choice.
(This involves two separate interviews: One in which the American
student interviews the Spanish partners and a second where the Spaniards
interview their American partner.) The two interviews do not need to be
based on the same theme. You can choose any topic related to your
partner's culture which you are particularly interested in (e.g. the
education system in the other country, the issue of immigration, etc.). In
each interview you should send three 'rounds' of questions to your
partners finding out about their attitudes and experiences in relation to
your theme. Each round of questions should expand on the responses you
receive from your partner.
Appendix B. Sample Tasks From the E-Tutoring Condition in Phase II
Choose an advertisement (for example about Coke or some other product aimed at
young people) and write an adaptation of the script in your target language. You can
change the content as well as the language style, so that the ad is appropriate for the
other culture. Your partners should comment on the language, style, and cultural
Suggest that your partner listen to a song or radio station in Spanish or English and use
that as a basis for talking about music in your life, in your generation. A useful website:
You can also use other websites to download songs or podcasts.
Translation Help: Choose a short text (song lyrics, article, letter, etc.) in your native language and translate
it into your target language. Without seeing the original text, your partner needs to
correct the translation to make it as appropriate and "natural-sounding" as possible.
Discuss the errors your partner made in their translation and try to explain why it
"sounds wrong."
Creative Expression: Express yourself in a creative genre (poem, song, story) and share it with your partner;
have your partner comment on the way you are using Spanish and/or provide you with
tips on making it more colloquial, more formal, or more culturally relevant.
Idiomatic Expressions: Compose a text (or texts) in which you use at least 5 idiomatic expressions that your
partner has asked you to explain and have your partner help you make sure you are
using them appropriately. Useful website:
You and your partner can propose a different activity as one of your four choices.
Appendix C. Suggestions for Language-Related Commentaries
1) Distinguish between "global errors" and "local mistakes": Local mistakes are typically small mistakes
that language learners make when they are in a hurry. Often, the learners know the rules they are breaking
but they are so focused on writing or speaking fluently that they overlook them. Sometimes they are easy
to identify: misspelled words, missing articles, missing accent marks, or the occasional wrong verb tense.
In contrast, global errors are identified as sentences or phrases that sound awkward to your native-speaking ear.
2) Use specific strategies for providing feedback: It is often helpful to use these strategies:
* Provide feedback: Look for patterns in the errors and provide feedback. Instead of simply
writing in the correct answer for your partner, go back through their text and highlight with a
different font all of the errors of a particular type.
* Selective correction: It is important to focus on just one or two types of errors per message (for
example, focus on verb tenses or on comma usage but not on both at once).
* Reformulation: You can rewrite one or two sentences for your partners so they can compare
the "native-sounding" version to their own. This is a useful technique!
* Give examples: When you explain a grammar rule or a vocabulary word, give multiple
examples so your partner has a context for using the new expression.
* Ask clarification questions: If you do not understand a particular sentence or think there might
be multiple meanings, ask your partner directly what they mean by such-and-such.
* Provide "mini-grammar lessons": If you feel comfortable explaining your native language,
try giving your partner short lessons. Think of these mini-lessons as teaching patterns and
reasons, not necessarily rules.
3) Ask your partners what they would like help with (and specify this for yourself, too): It is often easier
to provide feedback when your partner tells you specifically what they would like help with. Here are
some sample requests when asking for focused feedback:
* Could you please read this and comment on how I'm using the subjunctive?
* As you read this, will you write down any more sophisticated vocabulary words that
come to mind? I think mine are still very simple. Please ignore accent marks this time!
* I'd like to learn more idioms -- as you read this, do any come to mind that I might use?
4) Keep the tone positive: Upbeat comments certainly help encourage your partner to take risks in trying
out more complicated and sophisticated target language writing.
* Praise specific points; don't just make general comments:
Good: "I really like how descriptive you are -- you have such a wide range of vocabulary!"
Not-so-good: "Good work!"
5) Don't worry if you don't know how to explain something in your native language: Even language
teachers have to look up language explanations some of the time. You can always help out by looking up
resources online or by asking your own teacher to explain something to you.
* Remember that there are regional and national differences in language. Look at these layers of
language as potential areas to explore, not to "correct."
* Realize that context often influences grammar or vocabulary choice. Help your partners to
obtain a more complex view of how English or Spanish is used by pointing out differences
between "registers" of language.
We are grateful for the funding support of The International Research Foundation for English Language
Education (TIRF) and for the invaluable suggestions made by the LLT reviewers.
Paige Ware is an assistant professor at Southern Methodist University. She has a Ph.D. in Language,
Literacy, and Culture from the University of California at Berkeley. Her publications include research on
the use of new technologies to support adolescent language learners and the integration of
telecollaboration into ESL and EFL courses. She currently directs and teaches in Project CONNECT, a
program for secondary teachers to work with English language learners.
Robert O'Dowd teaches EFL and Foreign Language Methodology at the University of León in Spain and
has a Ph.D. on the development of intercultural competence through the use of networked technologies.
He is currently on the executive committees of both Eurocall and IALIC and has published widely on the
themes of on-line foreign language education and on the role of culture in foreign language learning.
Appel, C., & Mullen, T. (2000). Pedagogical considerations for a web-based tandem language learning
environment. Computers and Education, 34, 291-308.
Bauer, B., deBenedette, L., Furstenberg, G., Levet, S., & Waryn, S. (2006). The Cultura project. In J. Belz
& S. Thorne (Eds.), Internet-mediated intercultural foreign language education (pp. 31-62). Boston:
Heinle & Heinle.
Belz, J. (2002). Social dimensions of telecollaborative language study. Language Learning &
Technology, 6(1), 60-81. Retrieved January 8, 2008, from
Belz, J. (2003). Linguistic perspectives on the development of intercultural competence in
telecollaboration. Language Learning & Technology, 7(2), 68-117. Retrieved January 8, 2008, from
Belz, J. (2006). At the intersection of telecollaboration, learner corpus analysis, and L2 pragmatics:
Considerations for language program direction. In J. Belz & S. Thorne (Eds.), Internet-mediated
intercultural foreign language education (pp. 207-246). Boston: Heinle & Heinle.
Belz, J., & Kinginger, C. (2003). Discourse options and the development of pragmatic competence by
classroom learners of German: The case of address forms. Language Learning, 53(4), 591-647.
Belz, J. A., & Müller-Hartmann, A. (2003). Teachers as intercultural learners: Negotiating German-American telecollaboration along the institutional fault line. The Modern Language Journal, 87(1), 71-89.
Belz, J., & Vyatkina, N. (2005). Learner corpus analysis and the development of L2 pragmatic
competence in networked intercultural language study: The case of German modal particles. Canadian
Modern Language Review/Revue canadienne des langues vivantes, 62(1), 17-48.
Blake, R. (2000). Computer mediated communication: A window on Spanish L2 interlanguage. Language
Learning & Technology, 4(1), 120-136. Retrieved January 8, 2008, from
Bogdan, R. C., & Biklen, S. K. (1998). Qualitative research for education: An introduction to theory and
methods (3rd ed.). Boston: Allyn & Bacon.
Brammerts, H. (1996). Language learning in tandem using the Internet. In M. Warschauer (Ed.),
Telecollaboration in foreign language learning (pp. 121-130). Honolulu, HI: University of Hawaii Press.
Chavez, M. (2002). We say "culture" and students ask "What?": University students' definitions of
foreign language culture. Die Unterrichtspraxis, 35(2), 129-140.
Dussias, P. E. (2006). Morphological development in Spanish-American telecollaboration. In J. Belz & S.
Thorne (Eds.), Internet-mediated intercultural foreign language education (pp. 121-146). Boston: Heinle
& Heinle.
Erickson, F. (1986). Qualitative methods in research on teaching. In M. Wittrock (Ed.), Handbook of
research on teaching. New York: Macmillan.
Foster, P., & Ohta, A. (2005). Negotiation for meaning and peer assistance in second language
classrooms. Applied Linguistics, 26(3), 402-430.
Furstenberg, G., Levet, S., English, K., & Maillet, K. (2001). Giving a virtual voice to the silent language
of culture: The CULTURA project. Language Learning & Technology, 5(1), 55-102. Retrieved January
8, 2008, from
Kern, R. (1995). Restructuring classroom interaction with networked computers: Effects on quality and
quantity of language production. Modern Language Journal, 79(4), 457-476.
Kern, R. (1996). Computer-mediated communication: Using e-mail exchanges to explore personal
histories in two cultures. In M. Warschauer (Ed.), Telecollaboration in foreign language learning (pp.
105-120). Honolulu, HI: University of Hawaii Press.
Kern, R., Ware, P., & Warschauer, M. (2004). Crossing frontiers: New directions in online pedagogy and
research. Annual Review of Applied Linguistics, 24(1), 243-260.
Kinginger, C. (1998). Videoconferencing as access to spoken French. Modern Language Journal, 82(4),
Kinginger, C., & Belz, J. (2005). Socio-cultural perspectives on pragmatic development in foreign
language learning: Microgenetic and ontogenetic case studies from telecollaboration and study abroad.
Intercultural Pragmatics, 2(4), 369-422.
Kötter, M. (2003). Negotiation of meaning and codeswitching in online tandems. Language Learning &
Technology, 7(2), 145-172. Retrieved January 8, 2008, from
Kramsch, C., & Thorne, S. (2002). Foreign language learning as global communicative practice. In D.
Block & D. Cameron (Eds.), Globalization and language teaching (pp. 83-100). London: Routledge.
Kubota, R., Austin, T., & Saito-Abbott, Y. (2003). Diversity and inclusion of sociopolitical issues in
foreign language classrooms: An exploratory survey. Foreign Language Annals, 36, 12-24.
Lantolf, J. P. (2000). "Introducing sociocultural theory." In Sociocultural theory and second language
learning (pp. 1-26). Oxford, UK: Oxford University Press.
Lee, L. (2004). Learners' perspectives on networked collaborative interaction with native speakers of
Spanish in the U.S. Language Learning & Technology, 8(1), 83-100. Retrieved January 8, 2008, from
Lee, L. (2006). A study of native and nonnative speakers' feedback and responses in Spanish-American
networked collaborative interaction. In J. Belz & S. Thorne (Eds.), Internet-mediated intercultural foreign
language education (pp. 147-176). Boston: Heinle & Heinle.
Levy, M., & Kennedy, C. (2004). A task-cycling pedagogy using stimulated reflection and audio-conferencing in foreign language learning. Language Learning & Technology, 8(2), 50-69. Retrieved
January 8, 2008, from
Liaw, M-L. (2006). E-learning and the development of intercultural competence. Language Learning &
Technology, 10(3), 49-64. Retrieved January 8, 2008, from
Little, D., Ushioda, E., Appel, C., Moran, J., O'Rourke, B., & Schwienhorst, K. (1999). Evaluating
tandem language learning by e-mail: Report on a bilateral project. Dublin, Trinity College: CLCS
Occasional Paper.
Long, M. H., & Robinson, P. (1998). Focus on form: Theory, research and practice. In C. Doughty & J.
Williams (Eds.), Focus on form in second language acquisition (pp. 15-41). Cambridge, UK: Cambridge
University Press.
Müller-Hartmann, A. (2000). The role of tasks in promoting intercultural learning in electronic learning
networks. Language Learning & Technology, 4(2), 129-147. Retrieved January 8, 2008, from
Müller-Hartmann, A. (2006). Learning how to teach intercultural communicative competence via
telecollaboration: A model for language teacher education. In J. Belz & S. Thorne (Eds.), Internet-mediated intercultural foreign language education (pp. 63-84). Boston: Heinle & Heinle.
O'Dowd, R. (2003). Understanding the "other side": Intercultural learning in a Spanish-English email
exchange. Language Learning & Technology, 7(2), 118-144. Retrieved January 8, 2008, from
O'Dowd, R. (2006). Telecollaboration and the Development of Intercultural Communicative Competence.
Munich, Germany: Langenscheidt-Longman.
O'Dowd, R., & Ritter, M. (2006). Understanding and working with "failed communication" in
telecollaborative exchanges. CALICO Journal, 23(3), 623-642.
O'Rourke, B. (2005). Form-focused interaction in online tandem learning. CALICO Journal, 22(3), 433-466.
Pellettieri, J. (2000). Negotiation in cyberspace: The role of chatting in the development of grammatical
competence in the virtual foreign language classroom. In M. Warschauer & R. Kern (Eds.), Network-based language teaching: Concepts and practice (pp. 59-86). Cambridge: Cambridge University Press.
Robb, T. (2004). Moodle: A virtual learning environment for the rest of us. TESL-EJ, 8(2), 1-8.
Ros i Solé, C. & Truman, M. (2005). Feedback in distance language learning: Current practices and new
directions. In B. Holmberg, M. Shelley, & C. White (Eds.). Distance education and languages: evolution
and change (pp. 72-91). Clevedon, UK: Multilingual Matters.
Schmidt, R. (1993). Consciousness, learning, and interlanguage pragmatics. In G. Kasper & S. Blum-Kulka (Eds.), Interlanguage pragmatics (pp. 21-42). Oxford, UK: Oxford University Press.
Schwienhorst, K. (2000). Virtual reality and learner autonomy in second language acquisition.
Unpublished manuscript, Trinity College, Dublin, Ireland.
Smith, B. (2005). The relationship between negotiated interaction, learner uptake, and lexical acquisition
in task-based computer-mediated communication. TESOL Quarterly, 39(1), 33-58.
Sotillo, S. M. (2005). Corrective feedback via Instant Messenger learning activities in NS-NNS and NNS-NNS dyads. CALICO Journal, 22(3), 467-496.
Swain, M., & Lapkin, S. (1998). Interaction and second language learning: Two adolescent French
immersion students working together. Modern Language Journal, 82(3), 320-337.
Tella, S. (1991). Introducing international communications networks and electronic mail into foreign
language classrooms: A case study in Finnish senior secondary schools. Helsinki, Finland:
Thorne, S. (2003). Artifacts and cultures-of-use in intercultural communication. Language Learning &
Technology, 7(2), 38-67. Retrieved January 8, 2008, from
Thorne, S. (2006). Pedagogical and praxiological lessons from Internet-mediated intercultural foreign
language education research. In J. Belz & S. Thorne (Eds.), Internet-mediated intercultural foreign
language education (pp. 2-30). Boston: Heinle & Heinle.
Tudini, V. (2003). Using native speakers in chat. Language Learning & Technology, 7(3), 141-159.
Retrieved January 8, 2008, from
Ware, P. (2005). "Missed communication" in online communication: Tensions in fostering successful
online interactions. Language Learning & Technology, 9(2), 64-89. Retrieved January 8, 2008, from
Ware, P., & Kramsch, C. (2005). Toward an intercultural stance: Teaching German and English through
telecollaboration. Modern Language Journal, 89(2), 190-205.
Warschauer, M. (1997). Computer-mediated collaborative learning: Theory and practice. Modern
Language Journal, 81(4), 470-481.
Language Learning & Technology
February 2008, Volume 12, Number 1
pp. 64-84
The Role of Offline Metalanguage Talk in Asynchronous Computer-Mediated Communication
Keiko Kitade
Ritsumeikan University, Kyoto
In order to demonstrate how learners utilize the text-based asynchronous attributes of the
Bulletin Board System, this study explored Japanese-as-a-second-language learners'
metalanguage episodes (Swain & Lapkin, 1995, 1998) in offline verbal peer speech and
online asynchronous discussions with their Japanese key pals. The findings suggest the
crucial role of offline collaborative dialogue, the interactional modes in which the episodes
occur, and the unique discourse structure of metalanguage episodes concerning online and
offline interactions. A high score on the posttest also suggests the high retention of
linguistic knowledge constructed through offline peer dialogue. In the offline mode, the
learners were able to collaboratively construct knowledge with peers in the stipulated time,
while simultaneously focusing on task content in the online interaction. The retrospective
interviews and questionnaires reveal the factors that could affect the benefits of the
asynchronous computer-mediated communication medium for language learning.
Asynchronous computer-mediated communication (ACMC) enables language learners to actively engage
in interactions with a wider range of interlocutors because the interactions are both place-independent and
time-independent. In addition to the accessibility for learners' engagement in real online communities, the
unique interactional features of ACMC are considered to facilitate second language (L2) learning. By
reexamining the potential of text-based interactions and the time interval between messages within a
sociocultural perspective, this study attempts to investigate learners' behaviors in ACMC activities
beyond the period of online interaction.
Text-Mediated Interactive Features
Studies of both SCMC (Synchronous CMC) and ACMC indicate the significant potential of text-based
interaction within a sociocultural perspective, based on the work of Vygotsky (1978). Warschauer (1997)
employs this framework in computer-mediated communication (CMC) to stress the role of text-mediation
and the context for collaborative learning. From a sociocultural viewpoint, language is one of the semiotic
tools that mediate both higher mental functioning and actions. Considering such cognitive and self-regulative functions of language, text is viewed as a "thinking device," since the writer or reader is able to
describe and reflect upon its immediate interpretation and extract new meanings on the basis of its written
representation (Lotman, 1988). Chang-Wells and Wells (1992) observe children's engagement in writing
activities and indicate that text-based activity fosters the development of "literate thinking." Through this
engagement, children are required to explicitly posit their arguments, keep their arguments consistent
with their own position, consider alternatives and justify them, and carefully evaluate the consequences of
their stance. Text-based communication allows learners to store, edit, reevaluate, revise, and perform such
activities that may enhance their reflective process.
Additionally, CMC's interactive dimension promotes a collaborative context for learning. Vygotsky
(1978) claims that in the process of higher cognitive development in an individual, knowledge is first
constructed through social interaction and then internalized through private speech. According to this
view, learning occurs in collaborative dialogues where learners, with their partners' assistance, are able to
bridge the "zone of proximal development" (ZPD)—the gap between the level of development that
learners are capable of independently attaining and the level that they can achieve with guidance or
Copyright © 2008, ISSN 1094-3501
Keiko Kitade
The Role of Offline Metalanguage Talk in Asynchronous CMC
collaborative assistance. Second language acquisition (SLA) studies based on a sociocultural perspective
(e.g., Lantolf, 2000; Lantolf & Appel, 1994) agree with the significant role played by collaboration in
expert-novice and peer interactions in the L2 learning context. By combining the text-based nature of
communication with interactive attributes, CMC may enhance collaborative activities. Kitade (2000) and
Darhower (2002) examined the text-based chat interactions of L2 learners in a discussion task. They
indicate that the learners in online groups work collaboratively by providing guidance to each other and
strategize ways in which to achieve intersubjectivity. Studies in telecollaboration (Belz & Kinginger,
2002; Kinginger, 2000) also suggest that the pragmatic competence of French and German L2 learners
develops through collaborative e-mail and chat exchanges with their French/German partners. ACMC
provides opportunities for collaborative learning to some extent; however, the potential of collaborative
learning in this context is more complex, given the time interval between messages.
Time Interval as a Controversial Factor in L2 Learning
The time interval between the interactions in ACMC is a controversial aspect in L2 learning. It prevents
learners from receiving immediate feedback, which is a key element in collaborative learning. Studies on
novice-expert dialogue describe how experts guide novices in task completion by adjusting the task
difficulty (Radziszewska & Rogoff, 1991; Wertsch, Minick, & Arns, 1984). Rogoff and Gardner (1984)
state that scaffolded assistance enables learners to grasp new task components that novices would be
unable to complete without assistance. From this perspective, in order to address the needs of novices,
feedback should be provided through dialogue. Describing the procedure of effective assistance in the
Zone of Proximal Development (ZPD), Aljaafreh and Lantolf (1994) state, "First, intervention should be
graduated. Help provided by a more experienced member in a joint activity is designed to discover the
novice's ZPD in order to offer the appropriate level of assistance and to encourage the learner to function
at his or her potential level of ability… Second, help should be contingent, meaning that it should be
offered only when it is needed, and withdrawn as soon as the novice shows signs of self-control and the
ability to function independently" (p. 468).
Unlike synchronous interaction, exchanges in ACMC often have significant time delays between
messages, reducing the opportunity to provide adjusted assistance. Kitade (2006) suggests that half of
the initiation moves (i.e., requests for solving linguistic problems) in e-mail exchanges between learners
of the Japanese language and Japanese students are ignored. Stockwell (2004) indicates that in L2
contexts, learners of Japanese in Australia rarely surmount conversational breakdown with their online
Japanese partners. Lamy and Goodfellow (1999) and Kitade (2006) suggest that the time intervals
between messages in asynchronous conferences and e-mail exchanges may decrease the coherence of the
discourse and lessen the pressure on participants to negotiate the meaning of written communication.
On the other hand, the positive aspect of ACMC is that its asynchronous nature offers abundant time,
which amplifies the abovementioned advantages of text-mediation. Lapadat (2002) emphasizes the
similarities between the benefits of ACMC and those of conventional writing by stating that "online
participants can and do take time to think, to polish what they say, and edit. Participants in asynchronous
conferences produce less in total quantity (e.g., number of words), but their contributions to the
discussion tend to be carefully crafted, adapted to the audience, dense with meaning, coherent, and
complete" (p. 8). In order to assess the status of the interlocutors' knowledge and to frame their messages,
participants in ACMC need to consider the perspectives and metalinguistic sensibilities of others. Lamy
and Goodfellow (1999) propose that asynchronous conferences are particularly appropriate for "reflective
conversations," in which the learners discuss metalinguistic and L2 learning issues, because of the time
flexibility and access to previous texts.
In sum, the asynchronous nature of interaction may reduce opportunities for scaffolding in the context of
collaborative learning; however, it may enhance the reflective process. An and Frick (2006) examine
college students' perceptions of ACMC and report that its biggest advantage in the L2 learning context is
the ample time available to the participants to reflect upon and develop ideas. At the same time, they also
regard as a shortcoming the lack of immediate feedback.
The Collaborative Context in ACMC In-Class Activity
In order to amplify the benefits of ACMC and to compensate for its shortcomings, it is necessary to
examine an ACMC in-class activity, paying close attention to the learners' total engagement, rather than
limiting the attention to their online interaction. Learners engaging in ACMC in-class activities can
undertake two types of activities: online interactions and offline interactions with peers, referring to the
online texts they are attempting to write or comprehend.
According to Wells (1999), combining the advantages of spoken and text-based communication helps
expedite a child's learning process. The collaborative activity of "talk about text," where speech and text
function interdependently within an activity, enables learners not only to engage in reflective thinking
with text-based communication but also to receive assistance from their partners in the collaborative
context. By observing children's talk in activities involving reading and composing texts, Wells (1999)
discovers that the talk about text activity is successful in providing direct assistance to students when they
are restricted by their individual knowledge. The offline peer interaction during ACMC activities emerges
in the writing process and may have some functions in common with those in the collaborative talk-about-text activity.
Studies on L2 writing activities also explore the potential of collaborative talk about written text during
the following activities: revision (de Guerrero & Villamil, 2000; Villamil & de Guerrero, 1998), joint
writing of a story (Swain & Lapkin, 1998), and joint reflection with native speakers regarding the revised
text (Swain & Lapkin, 2002; Tocalli-Beller & Swain, 2005). Peer collaboration during writing or revision
has been recognized as an effective technique for enhancing the writing skills of L2 learners (Cumming,
in press; Villamil & de Guerrero, 1998). Several studies discussed below claim that peer dialogue plays a
crucial role in L2 learning, particularly when it involves metalanguage talk during writing activities.
Collaborative Dialogue as a Medium for Observing Learning in an ACMC Activity
As discussed above, dialogic interactions can play a significant role in student learning. Expanding on
Vygotsky's original claim about expert-novice dialogic interactions, some studies examine the scaffolding
behavior in peer interactions and illustrate how learners are capable of assisting each other in bridging
their ZPDs (Brooks & Donato, 1994; Donato, 1994; Ohta, 1995; Platt & Brooks, 1994). These studies
employ descriptive analyses to illustrate the learners' behaviors on a moment-to-moment basis and the
changes that take place during collaborative dialogue.
Vygotsky (1978) perceives learners' mental processes to be dynamic phenomena. Underlining Vygotsky's
claim, Wertsch (1985, 1991) suggests that a microgenetic analysis is required to observe the development
of such dynamic phenomena. De Guerrero and Villamil (2000) describe a microgenetic approach as "one
in which moment-to-moment changes in participants' behavior were noted and examined" (p. 54). In their
investigation of peer talk among intermediate ESL learners during the revision of writing, they show that
learners provide each other with knowledge about language, and that the opportunity to exteriorize their
thoughts allows students to reinforce and reconstruct their knowledge of the target language.
Studies by Swain and others (Swain, 2000; Swain, 2006; Swain, Brooks, & Tocalli-Beller, 2002; Swain &
Lapkin, 1998) propose that the observation of peer dialogue reveals learners' mental processes. Swain and
Lapkin (1998) suggest that verbalization in a collaborative context not only enacts the thoughts
constituting the mental process but also makes them observable, since "in a joint problem-solving
activity, what normally remains hidden in individually internalized thought may manifest itself in
dialogue" (Swain & Lapkin, 1998, p. 321).
In order to address the question of how learners solve their linguistic problems and the extent to which
scaffolding may impact the knowledge of individual learners, Swain and Lapkin (1995, 1998) highlight
the importance of metalinguistic episodes in dialogues. They refer to these episodes as "language related
episodes" (LREs), which are parts of a dialogue "where the students talk about the language they are
producing, question their language use, or correct themselves or others" (Swain, 1998, p. 70). Provided
below is an example of a metalanguage episode from the offline data used in this study. B is looking for
the word ataeru to state Warui eikyo: o ataeru 'have a bad influence' in B1. Then, B and W search for the
word by listing candidates: agaru, ageru, ataeru, ataeteageru.
B1: Warui eikyo: ga agaru?
W2: Ageru?
B3: Ageru? Ataeru?
W4: Ataeteageru?
B5: Ataeru. Ataeru.
W6: Ataeru.
B7: Un.
When the episodes related to linguistic aspects are identified and included in a tailor-made test, the
dyad-specific posttest may measure how well the linguistic issues discussed in dialogue were retained by the
learners for at least a short period of time. The dyad-specific posttests are created on the basis of the
metalanguage episodes, as determined by the audio recordings of the peer dialogue during the
performance of the collaborative tasks. Swain (1998, 2000) states that in joint dictogloss tasks, the
learners were able to remember the solutions they arrived at with respect to 70–80% of the items in the
LREs on the posttest, which was held 7–10 days later. The high scores in the posttest suggest that
metalanguage talk in collaborative peer dialogue may be important for L2 learning.
In sum, the findings of the analysis of collaborative dialogue that occur during the writing and revising
activities indicate the potential of offline interaction in ACMC to serve as a learning opportunity.
Moreover, an analysis of the offline interaction may provide a verbal protocol that demonstrates a
learner's status on a moment-to-moment basis. However, offline interaction in ACMC may differ from
dialogues that emerge during other writing activities. In an ACMC activity, learners are required to
comprehend the received online messages and compose text messages that are framed specially for their
partners. In addition, they have two types of interlocutors who can provide assistance for both
task-oriented and linguistic needs: online partner(s) and offline peers. The incorporation of a descriptive
approach should be effective in revealing the learners' actual behaviors beyond the domain of online
interaction and the developmental process of learning that occurs in this context.
Many previous studies on ACMC have examined only online interactions (e.g., Kinginger, 2000; Kitade,
2006; Lamy & Goodfellow, 1999; Schwienhorst, 2003; Stockwell & Levy, 2001) without addressing the
role of offline interactions or the learners' engagement in combined online and offline interactions. In
order to fully understand how learners implement a task in the ACMC context and the potential of this
task with regard to L2 learning, this study incorporates a sociocultural perspective and examines both
online and offline interactions to reveal how each type of interaction—online, offline, or combined
interactions—can provide learners with opportunities for collaborative learning. The study investigates
learners' metalinguistic talk in online and offline interactions in order to identify the types of knowledge
used and to show how they are co-constructed from the two types of interaction. A posttest is also
employed to investigate the extent to which learners retain the co-constructed knowledge, at least in the
short term. The research questions are as follows:
RQ1: What kind of discourse structure do the learners engage in to construct metalinguistic episodes in
the ACMC activity with respect to the following: (a) online versus offline interaction and (b) receptive
versus productive modes?
RQ2: What kinds of metalinguistic episodes are discussed by the learners to convey their intentions to
their native key pals in terms of the linguistic focus (lexical-, syntactic-, phonological-, or discourse-based)?
RQ3: To what extent do learners retain the knowledge co-constructed with their peers in the
metalinguistic episodes?
In order to examine learners' interactions in a classroom environment rather than in an experimental
context, this study was conducted in two content-based Japanese study classes held in two terms: Term 1
(Fall, 2005) and Term 2 (Spring, 2006). In these classes, the learners and their classmates studied and
discussed social problems and cultural aspects related to Japan through in-class discussions or using the
bulletin board system (BBS). The participants comprised 36 exchange students (Term 1: 8 students; Term
2: 28 students) studying Japanese in half- or one-year language programs at a Japanese college. They
were enrolled in the advanced-low level Japanese course, comprising eight classes per week, the
content-based class being held once every week. During the data collection process, the students stayed in Japan
for one month, immediately after learning Japanese in their own countries: Korea, China, Taiwan, France,
Germany, Italy, Sweden, Denmark, England, the Philippines, Australia, Canada, and the U.S.A. They
indicated in the questionnaire that they regularly wrote e-mails in Japanese and had no difficulty typing in
Japanese. The 32 Japanese volunteers were all undergraduate students; during the data collection, 10 of
them were attending the Japanese language teaching seminar class.
Task and Procedures
The participants were randomly paired with their classmates and engaged in a decision-making task with
one or two Japanese partners; they could interact with their classmates only through the BBS. The
participants engaged in the task during 60 minutes of a 90-minute class, which was held once a week for
four weeks. To accomplish the task, the Japanese partners were also instructed to hold discussions with
the participants through the BBS.
During the first week of each term, the participants were given instructions on the use of the BBS and
were introduced to the available online dictionaries. The students in both Terms 1 and 2 began comparing
the educational or job-hunting systems in Japan with those of their own countries. In the subsequent three
weeks, the participants in Term 1 were instructed to discuss their ideal school with the instruction: "If you
were to start a school, what kind of school would you want to establish?" They had to answer this
question with respect to the educational objectives, educational system, content covered, educational
environment and facilities, and name of the school. The participants in Term 2 discussed their ideal job.
They were provided with the following instruction: "When you look for a job, what kinds of conditions
do you need?" They had to answer this question with respect to salary, holidays, working hours, interests,
stability, and human relations. After discussing and arriving at an agreement in four weeks, all the
participants were instructed to write a summary of their responses in Japanese. The Japanese learners
were also required to submit a handwritten report about the linguistic and cultural aspects they had
learned through the activity.
Data were obtained from three sources. The first was the interactions conducted during the task with two
types of traceable interactions: text-based online interaction through the BBS and audio-recorded offline
verbal interaction. A total of 16 recordings, each lasting an average of 59.0 minutes for Term 1, and 39
recordings, each lasting an average of 51.5 minutes for Term 2, were transcribed in order to create the
posttest and identify the metalinguistic episodes. The online BBS messages included 60 messages for
Term 1 and 186 messages for Term 2. The type of BBS employed for the ACMC activity, termed
"zoops," enabled registered participants to engage in discussions using a group thread.
Secondly, two research assistants took notes on their observations in the classroom during all sessions in
order to capture the learners' nonverbal behavior, including the use of dictionaries, which could not be
captured by audio-recorded data. Finally, interviews were conducted and questionnaires were distributed
for the purpose of documenting the behavior and perceptions of the Japanese-as-a-second-language
participants. The questions focused primarily on three aspects: (a) the learners' behavior while they were
reading and writing online messages (i.e., if and in what order they paid attention to the organization,
content, or form of the messages); (b) their opinions of the pair work with their classmates; and (c) their
impression of the ACMC interactions for language learning, compared to the other modes of interaction.
In both terms, a research assistant conducted audio-recorded interviews with students; these lasted for an
average of 13 minutes each. The interviews included more open-ended questions that were designed to
extract more detailed answers.
The audio-recorded offline data were transcribed and the metalinguistic episodes were identified based on
the definition of LREs provided by Swain and Lapkin (1998). The discourse structure of the
metalinguistic episodes in the ACMC activities was identified, and then the preferred types of
metalinguistic episodes were determined with respect to the linguistic focus of the episodes
and their corresponding solutions. Initially, the interrater reliability of the two raters in
identifying and categorizing the metalinguistic episodes was found to be 90.6%.
After the raters discussed the items on which their assessments differed,
the disagreements were resolved and the interrater reliability reached 100%. Based on the
identified metalinguistic episodes in the online and offline interactions, test items were individually
developed for each L2 learning participant for the posttest; this test was administered during the sixth
week. Similar to the dyad-specific posttest by Swain and Lapkin (1998), the format of the questions used
in this study varied depending on how the test items were originally discussed in the metalinguistic
episodes. Some examples of the posttest are provided in Appendix A.
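The interrater reliability figures above read as simple percent agreement, although the formula is not stated in the text. The following is a minimal sketch under that assumption; the function name and the ten codings are hypothetical illustrations and are not data from this study.

```python
def percent_agreement(rater_a, rater_b):
    """Share of items (in %) to which two raters assigned the same category."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * matches / len(rater_a)


# Hypothetical codings of ten episodes (NOT data from this study).
rater_1 = ["lexis", "form", "lexis", "discourse", "form",
           "lexis", "lexis", "form", "phon/orth", "lexis"]
rater_2 = ["lexis", "form", "lexis", "form", "form",
           "lexis", "lexis", "form", "phon/orth", "lexis"]

agreement = percent_agreement(rater_1, rater_2)
print(f"{agreement:.1f}%")  # the two raters differ on one of the ten items
```

Once the raters discuss and reconcile the differing item, the two lists become identical and the same function returns 100.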
Discourse Structure of Metalinguistic Episodes
One of the most salient structural features of the metalinguistic episodes was that they were conducted
through a combination of online and offline interactions. Figure 1 illustrates the dual interactions in the
ACMC activities: the online interaction between a learner and a Japanese student and the offline
interaction between Learners A and B, peers. All the metalinguistic episodes were triggered either by a
linguistic item in the Japanese partner's online message or a linguistic item that Learner A attempted to
write in the online message to his/her Japanese partner. The metalinguistic episodes took place in both
online and offline modes. After the learners discussed and agreed upon a certain linguistic form in the
offline mode, as shown by the rectangles in Figure 1 (off1, off2, and off3), Learner A replied in an online
message to his/her Japanese partner, using the agreed upon linguistic item. The time interval between the
online messages enabled the learners to engage in offline peer interaction, while communicating with
their online Japanese partners.
The metalinguistic episodes in the offline interaction are illustrated in a simplified structure in Figure 1
(off1, off2, and off3), but the actual exchanges in the data are more complicated and varied,
depending on each episode. In one of the more complicated structures, the learners asked the instructor or
the teaching assistant to provide assistance when they were unable to solve the problem through
discussions with their partners.
Figure 1. Combination of online and offline interactions in the ACMC activity. J: Learner A's Japanese
partner; LA: Learner A; LB: Learner B (Learner A's peer partner); On: online interaction; Off: offline
interaction.
The unique functions of repetition in the collaborative dialogue and the evidence of written repetition
constitute another significant feature of the metalinguistic episodes. From a sociocultural perspective,
DiCamilla and Antón (1997), who examine L2 peer dialogue in a joint writing task, report the
extensive use of repetition (32% of the total utterances) and demonstrate that repetition plays an essential
role in establishing and maintaining intersubjectivity (Rommetveit, 1985) among peers. Repetition
enables learners to indicate and maintain a mental space wherein they can confirm their agreement with
what has been constructed thus far and add new information to it.
The availability of written repetition also shapes the offline dialogue in a manner that differs from regular
face-to-face interactions. After or while solving the linguistic problem in the peer dialogue, the learners
returned to the online message (on2 in Figure 1) and replied to their online partners using the decided
linguistic item. Therefore, the online texts frequently show traceable evidence of not only the learners'
transferred knowledge but also the shared information obtained through peer collaboration. Moreover, the
written repetition enabled the learners to establish and maintain intersubjectivity, as mentioned above,
particularly in contexts in which the learners sat side-by-side and viewed text that was typed by another
learner on his/her computer screen.
Excerpt 1 illustrates how, when writing a response to their Japanese partners, spoken repetitions were
used to collaboratively construct the learners' knowledge in the offline peer dialogue.
Excerpt 1 (J & E, Session 3, 16:21–18:35): J and E are peers summarizing the group discussion on a
young Japanese individual (Furi:ta:) who makes a living by working a part-time job. In the excerpt, they
are discussing and writing the reason why they do not support Furi:ta:. During the discussion, E is typing.
In Excerpt 1, Learners J and E are negotiating the choice of the most appropriate particle (case marker)
for the sentence they are creating. In Line 3, J is misusing o, an object marker particle; in Line 4, E
suggests another particle, ni, a dative case marker particle. However, the verb haratta 'paid' sounds
awkward, and in Line 7, J suggests the use of another particle, no. Finally, E suggests ni with the verb
kakatta 'cost' instead of haratta. In Line 15, J repeats the complete sentence he uttered in line 1 with the
appropriate particle (dative case marker), ni, and the verb, kakatta.
The following (partial) notation system was used in the transcripts:
(.), (..): pauses
[brackets]: The contents within brackets are the transcriber's comments.
*asterisk: The words/phrases marked with an asterisk are incorrect.
Boldface: Boldface is used to highlight the grammatical aspects under discussion.
1J: Nazeka to yu: to ano (..) kyo:iku kyo:
The reason is. Well (..) Education Edu
2E: ((typing)) Kyo:iku ((typing))
((typing)) Education ((typing))
3J: Kyo:iku *o haratta okane no imi wa nai (.) kana?
The money paid for the education [with the wrong usage of the object marker particle o]
would be meaningless (.) I wonder?
4E: Kyo:iku ni ka.
For the education [with the correct usage of the dative marker particle ni], is it?
5J: Kyo:iku ni haratta
Paid for the education [with the correct usage of the dative marker particle ni]
6E: A a =
Ah a =
7J:= No tame no kane da no imi da. tabun no.
It means money for the sake of it. It is probably no [with the particle no]
8E: Kyo:iku ni
For the education [with the correct usage of the dative marker particle ni]
9J: (Kyo:iku)
10E: Ah, kyo:iku NI: kakatta okane?
Ah, the cost of education? [Emphasis with the correct usage of the dative marker particle ni]
11J: A un kyo:iku
Ah, yeah. Education.
12E: Kakatta okane?
The cost?
13J: Un, kakatta okane.
Yes, the cost.
14E: ((typing)) okane
((typing)) The money
15J: Kyo:iku ni kakatta okane no imi wa nai.
The cost for education would be meaningless. [with the correct usage of the dative
marker particle ni]
16E: ((typing))
This demonstrates that the learners pay attention to linguistic accuracy as well as the content of the
message, and they co-construct the knowledge to produce the most grammatically appropriate sentence.
Interestingly, the word kyo:iku 'education' first appears in Line 1 and is repeated in Lines 2, 3, 4, 5, 8, 9,
10, 11, and 15. As J and E search for the appropriate particle following kyo:iku, they repeat the word to
indicate the point of agreement, that is, the point at which their knowledge is shared to add new
information. In other words, kyo:iku strategically facilitates the scaffolding by indicating the momentary
mental space and producing the correct particle. Repetition also functions as a confirmation check, as is
the case in Line 12, and an acceptance, as in Line 13. After collaboratively solving the linguistic
problems, J repeats the completed sentence to reconfirm its modified, completed version (Line 15), and E
types the sentence. The BBS message typed by E was also confirmed to be identical to the sentence that J
utters in Line 15.
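The distribution of the repeated word across the turns of Excerpt 1 can be traced mechanically. The sketch below is illustrative only: the turn list is transcribed from the romanized lines of Excerpt 1, and the helper name lines_containing is a hypothetical label, not part of the original analysis.

```python
# Romanized turns of Excerpt 1 as (line number, speaker, utterance).
excerpt_1 = [
    (1, "J", "Nazeka to yu: to ano (..) kyo:iku kyo:"),
    (2, "E", "((typing)) Kyo:iku ((typing))"),
    (3, "J", "Kyo:iku *o haratta okane no imi wa nai (.) kana?"),
    (4, "E", "Kyo:iku ni ka."),
    (5, "J", "Kyo:iku ni haratta"),
    (6, "E", "A a ="),
    (7, "J", "= No tame no kane da no imi da. tabun no."),
    (8, "E", "Kyo:iku ni"),
    (9, "J", "(Kyo:iku)"),
    (10, "E", "Ah, kyo:iku NI: kakatta okane?"),
    (11, "J", "A un kyo:iku"),
    (12, "E", "Kakatta okane?"),
    (13, "J", "Un, kakatta okane."),
    (14, "E", "((typing)) okane"),
    (15, "J", "Kyo:iku ni kakatta okane no imi wa nai."),
    (16, "E", "((typing))"),
]


def lines_containing(turns, word):
    """Return the line numbers whose utterance contains `word` (case-insensitive)."""
    word = word.lower()
    return [number for number, _speaker, text in turns if word in text.lower()]


kyoiku_lines = lines_containing(excerpt_1, "kyo:iku")
print(kyoiku_lines)  # [1, 2, 3, 4, 5, 8, 9, 10, 11, 15]
```

The same helper can be applied to kakatta or okane to see where the confirmation check (Line 12) and the acceptance (Line 13) fall.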
BBS messages, such as those composed during this discussion, provide noticeable written evidence of the
knowledge gained through peer dialogue. However, in one case, the learners acquired non-target
knowledge through the dialogue. Such an instance implies that learners can co-construct the knowledge
gained and reproduce this knowledge in the subsequent text; however, the acquired knowledge may be
non-target and, therefore, may require confirmation by experts.
Preferred Modes for Metalinguistic Episodes
With respect to the preferred mode of metalinguistic episodes, participants clearly selected the offline
mode: As shown in Table 1, most of the metalinguistic episodes (Term 1: all episodes; Term 2: all
episodes, except two online instances) were discussed during offline verbal interactions rather than online
interactions. There are two possible explanations for this finding. As previous studies suggest, learners
are reluctant to ask their online partners for help with linguistic matters, due to the less frequent
exchanges and lack of instant responses (Kitade, 2006). It is difficult to obtain extensive exchanges with
repetitions in asynchronous interactions. Further, the act of soliciting linguistic help from online native
partners may be threatening. However, the opportunity for offline collaborative dialogue through
synchronous peer dialogue, where learners feel less threatened to ask for linguistic assistance, may avoid
these disadvantages of ACMC.
The other question regarding the preferred types of metalinguistic episodes is the manner in which these
episodes are triggered. In comprehending their online partners' messages and in producing their own
messages, learners may face linguistic problems. The data from both Terms 1 and 2 indicate that 91.7%
and 93.5% of the metalinguistic episodes, respectively, took place during the productive mode, where the
learners were trying to compose messages. More complex cognitive processes are required during the
productive mode than during the receptive mode and these processes may promote more metalinguistic
episodes. In addition, the use of online dictionaries may reduce the burden of comprehending messages
with unfamiliar words. Apart from the metalinguistic episodes in the audio-recorded data, the class
observations and learners' interviews suggested that there were more instances in which the learners faced
linguistic problems. In these instances, the learners solved the linguistic challenges by consulting online
or electronic dictionaries. More possible explanations will be discussed in the following section, which
will deal with the learners' perceptions.
Table 1. Frequency of Metalinguistic Episodes in Different Modes
Columns: Online metalinguistic episodes | Offline metalinguistic episodes | Productive mode | Receptive mode
Rows: Term 1 | Term 2 | Total (%)
Total (%): Online 2 (0.6) | Offline 288 (99.3)
Note. Three metalinguistic episodes were triggered when the learners attempted to write a response, but the resources are
originally from the online partners' messages.
Another noteworthy finding is that the number of metalinguistic episodes varied considerably among the
pairs, as shown in Tables 1-a and 1-b. For instance, Pairs 2 and 6 each engaged in more than 30
metalinguistic episodes, more than five times the 6 episodes engaged in by each of Pairs 4 and 15.
Thus, the factors affecting the number of metalinguistic episodes in pairs should be investigated
using a larger population.
Table 1-a. Frequency of Metalinguistic Episodes (Offline) Among Pairs: Term 1
Pair (Gender)     Metalinguistic episodes (%)
Pair 1 (M-F)      22 (25.8)
Pair 2 (F-F)      33 (38.8)
Pair 3 (M-F)      24 (28.2)
Pair 4 (M-F)      6 (7.0)
Total             85 (100)
Table 1-b. Frequency of Metalinguistic Episodes (Offline) Among Pairs: Term 2
Pair (Gender)     Metalinguistic episodes (%)
Pair 5 (M-F)      15 (7.3)
Pair 6 (F-F)      32 (15.7)
Pair 7 (M-F)      9 (4.4)
Pair 8 (F-F)      13 (6.4)
Pair 9 (M-F)      8 (3.9)
Pair 10 (F-F)     8 (3.9)
Pair 11 (M-F)     14 (6.8)
Pair 12 (M-F)     8 (3.9)
Pair 13 (M-F)     26 (12.8)
Pair 14 (M-F)     29 (14.2)
Pair 15 (M-F)     6 (2.9)
Pair 16 (M-F)     8 (3.9)
Pair 17 (M-M)     15 (7.3)
Pair 18 (M-M)     12 (5.9)
Total             203 (100)
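The variation among pairs can be checked directly against the counts in Tables 1-a and 1-b. The following is a small sketch for that purpose; the variable names are illustrative, and the counts are copied from the two tables.

```python
# Offline metalinguistic episodes per pair, copied from Tables 1-a and 1-b.
term1 = {"Pair 1": 22, "Pair 2": 33, "Pair 3": 24, "Pair 4": 6}
term2 = {
    "Pair 5": 15, "Pair 6": 32, "Pair 7": 9, "Pair 8": 13, "Pair 9": 8,
    "Pair 10": 8, "Pair 11": 14, "Pair 12": 8, "Pair 13": 26, "Pair 14": 29,
    "Pair 15": 6, "Pair 16": 8, "Pair 17": 15, "Pair 18": 12,
}
all_pairs = {**term1, **term2}

# Pairs with more than 30 episodes, and pairs with the fewest episodes.
over_30 = [pair for pair, n in all_pairs.items() if n > 30]
fewest = min(all_pairs.values())
least = [pair for pair, n in all_pairs.items() if n == fewest]

# Ratio between the most and the least active pairs.
ratio = max(all_pairs.values()) / fewest

print(sum(term1.values()), sum(term2.values()))  # 85 203
print(over_30, least, ratio)
```

The column totals reproduce the 85 and 203 episodes reported for the two terms, and the most active pair produced 5.5 times as many episodes as the least active pairs.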
Linguistic Focus of Metalinguistic Episodes in the ACMC Activity
All the metalinguistic episodes are categorized as lexis-based, form-based, discourse-based, phonological-
or orthographic-based, a combination of lexical and syntactic, or a combination of phonological and
lexis-based. The categorization is modified from the classification suggested by Swain and Lapkin (1998,
2002) because some aspects, such as the phonological and orthographic focus, are salient in CMC. In
lexis-based metalinguistic episodes, the learners search for and confirm or select the appropriate
vocabulary from alternative Japanese vocabulary items. In form-based metalinguistic episodes, the
learners address one aspect of Japanese syntax or morphology. The orthographic focus (i.e., spelling and kanji)
and the phonological focus (e.g., voiced or voiceless sounds) are categorized together in an independent group.
In discourse-based metalinguistic episodes, the learners focus on aspects such as discourse
markers, logical sequencing, stylistics including the degree of politeness, or text structure. Some
metalinguistic episodes pertain to more than one linguistic focus and are categorized in the combined
groups (see Appendix B for examples).
As shown in Table 2, the results of both Terms 1 and 2 were similar with regard to the linguistic focus of
the metalinguistic episodes, although the different topics of the task may have affected the number and
types of metalinguistic episodes that occurred. Out of a total of 288 metalinguistic episodes, 142 (49.3%)
were lexis-based, 74 (25.6%) were form-based, 36 (12.5%) were phonological and orthographic-based,
and 16 (5.5%) were discourse-based. The metalinguistic episodes involving a combination of the lexical
and syntactic and the lexical and phonological focus account for less than 5% each.
Table 2. Linguistic Focus of Metalinguistic Episodes
Linguistic focus               Term 1 (%)   Term 2 (%)   Total (%)
Lexical                        43 (50.5)    99 (48.7)    142 (49.3)
Form                           27 (31.7)    47 (23.1)    74 (25.6)
Phonological & orthographic    8 (9.4)      28 (13.7)    36 (12.5)
Discourse                      5 (5.8)      11 (5.4)     16 (5.5)
Lexical & form                 0 (0)        13 (6.4)     13 (4.5)
Phonological & lexical         2 (2.3)      5 (2.4)      7 (2.4)
Total                          85 (100)     203 (100)    288 (100)
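The percentages in Table 2 can be recomputed from the raw counts. The sketch below assumes that the first, second, and fourth categories are Lexical, Form, and Discourse, as the running text indicates; the helper name pct_truncated is hypothetical. The reported figures appear to be truncated rather than rounded to one decimal place (for example, 43/85 is 50.588...%, reported as 50.5), so the helper truncates.

```python
categories = ["Lexical", "Form", "Phonological & orthographic",
              "Discourse", "Lexical & form", "Phonological & lexical"]
term1 = [43, 27, 8, 5, 0, 2]      # Term 1 counts from Table 2
term2 = [99, 47, 28, 11, 13, 5]   # Term 2 counts from Table 2
totals = [a + b for a, b in zip(term1, term2)]


def pct_truncated(count, total):
    """Percentage of `total`, truncated (not rounded) to one decimal place."""
    return (count * 1000 // total) / 10


for name, c1, c2, ct in zip(categories, term1, term2, totals):
    print(f"{name:28s}"
          f"{c1:4d} ({pct_truncated(c1, sum(term1))})"
          f"{c2:5d} ({pct_truncated(c2, sum(term2))})"
          f"{ct:5d} ({pct_truncated(ct, sum(totals))})")
```

Integer arithmetic is used before the final division so that the truncation does not depend on floating-point rounding.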
Approximately half the metalinguistic episodes had a lexical focus, but the percentage of form-related
episodes (approximately 31% in Term 1 and, when the 6.4% of episodes combining lexical and form aspects
are included, approximately 30% in Term 2) was also substantial. Due to the availability of both online and electronic dictionaries, many
metalinguistic episodes involved more than just vocabulary searching. Most metalinguistic episodes arose
under one of three conditions. The first is when learners lack confidence about their knowledge or
hypothesis and request quick verbal assurance from their peers or the instructor. The second is when
learners are unable to choose the appropriate item from those known to them or in the list suggested in the
dictionary. The last condition is when the problem encountered by learners is beyond the scope of the
dictionary. For example, when a learner seeks an expression to describe a highly abstract concept, he/she
would be unable to find a suitable expression even in his/her L1. On these occasions, learners are unable
to solve the problem using dictionaries and need to ask their partners or instructors for further assistance.
The last two conditions, in particular, often lead learners to engage in more complex metalinguistic
explanations (i.e., why one is more appropriate/inappropriate than the other) where learners need to
verbalize their moment-to-moment state of knowledge.
Excerpt 2 illustrates a metalinguistic episode in which the pair Y and D engage in a dialogue to
co-construct grammatical knowledge. In order to formulate the educational objective suggested by Y and D
for their ideal school, Y suggests the use of the expression they have just learned in the other class, A to
yu: no wa B no koto de aru [Noun A is to Noun B], as seen in Lines 3, 11, and 13. However, D explains to
Y that the use of the expression is inappropriate in the sentence due to the following reasons: (a) the
structure suggested by Y contains a verb, but the expression requires a noun, as observed in Lines 14–18
and (b) the expression is suitable for a more general definition, when in reality, they are attempting to pen
their opinions, as in Lines 18–22. Y understands D's explanations and suggests a different expression,
Verb koto da to omoimasu [we think it is to Verb] in Line 23.
Excerpt 2 (Y & D, Session 2): After the discussion on the educational objectives of their ideal school, Y
and D begin noting down their ideas. Based on their discussion, D is typing the message.
1Y: Tabun kyo:iku mokuhyo: to yu: no wa (.)
The educational objective is to probably (.)
2D: U:n. Kyo:iku mokuhyo: ((typing)) nn.
Yeah:. The educational objective ((typing)) mm.
3Y: Tabun to yu: no wa naninani no koto de aru toka kakeba i: ka? (.) un etto:
Would it be okay to write something like to yu: no wa naninani no koto de aru?? [a
structure to express it is to such and such] (.) mnn well:
4D: Etto:
5Y: Ki-ho-n-te-ki-na
The basic
6D: Kihonteki na? ((typing))
The basic? ((typing))
7Y: Kihonteki na (.) un. (.) koto-o
The basic (.) un (.) things
8D: Koto-o ((typing))
Things [with the objective case marker o] ((typing))
9Y: Benkyo: saseru, benkyo: suru? benkyo: saseru? ...saseru kana? sase
Benkyo: saseru [Let them study, using causative form for study], benkyo: suru [study]?
...Benkyo: saseru, I wonder? (.) sase. [the use of the causative form (study or let them study)]
10D: Benkyo:suru
(benkyo: suru)[study without the causative form]
11Y: *Saseru no koto de aru* (.) *no koto* ((typing))
*Saseru no koto de aru [*it is to let them study]* (.) *no koto ((typing)) [the wrong
usage of the structure]
12D: (Demo)
13Y: Demo nanka to yu: no wa naninani no koto de aru desho? kono hyo:gen dakara:
But nanka to yu: no wa naninani no koto de aru [a structure something is to such and
such], isn't it? Because of this expression:
14D: mnn. Ano:: (..) no no nai ho: ga i: to omo: kedo
mnn. well:: (.) but I think it would be better without no [a nominalizer]
15Y: Demo nanka=
But somehow=
16D: =Demo saseru wa do:shi de meishi ja nai
=But saseru [the causative verb suffix] is a verb, but not a noun.
17Y: Demo etto getsuyo:bi kana? kono iroiro na hyo:gen yatta desho? renshu: shita dakara
sono hyo:gen wa naninani to yu: nowa naninani no koto de aru kara.
But I wonder if it was Monday? We learned these kinds of expressions, didn't we? We
practiced it, and the expression was naninani to yu: nowa naninani no koto de aru
[something is to such and such.]
18D: Tabun me:shi, me:shi no ho: ga tekito: to omo: n da ne. sore wa hyakkajiten ja
I think probably a noun; a noun is more appropriate. This is not an encyclopedia, but=
19Y: =Un, so so hai wakatta nanka (.) minna to=
=Yeah. Right right. Yes, I understand. Some (.) Everyone and=
20D: =Jijitsu no yo:na
=Like a fact
21Y: Un un wakatta hai=
Yeah. I got it. Yes =
22D: =Watashitachi no iken
=Our opinion.
23Y: Un. so so so ka. da to omoimasu tte kaita ho: ga i: desho? ((typing)) ja saseru koto
da to omoimasu.
Yeah. Right right right. It would be better to write, da to omoimasu [we think it is],
wouldn't it? ((typing)) Well, saseru koto da to omoimasu [we think it is to let them study].
24D: Ano: ((unintelligible)) un.
Well: ((unintelligible)) yeah
The feedback provided by D contains a metalinguistic explanation addressing what Y had overlooked.
Metalinguistic feedback is claimed to promote a particular type of learners' repair that engages the
learners in deeper cognitive processing (Lyster & Ranta, 1997). Furthermore, D's feedback matches Y's
requirement because D and Y share the same knowledge of the expression learnt in the same class. Such
instances of metalinguistic episodes indicate that if the learners conduct the ACMC task by themselves at
home or individually, they will not be able to solve many of the challenges they will encounter. They may
miss the opportunities for metalinguistic episodes where learning may take place.
Results of the Individually Tailored Posttest
The individually tailored posttest was developed on the basis of the metalinguistic episodes identified in
the audio-recorded offline dialogue in Term 2. The multiple-choice questions were created to include the
choices of both the correct and incorrect items discussed (and not discussed) in the metalinguistic
episodes (see Appendix A for examples). The posttest was administered six weeks after the last ACMC
session; the learners were told not to use any assistance during this test and that the result would not affect
their grades.
From the 180 questions developed from the metalinguistic episodes, 132 (73.3%) were correctly resolved
during the posttest, as shown in Table 3. This finding confirms the results of dyad-specific posttests by
LaPierre (1994) and Swain (1998), which were conducted one week to ten days after the task session and
indicate a 70–80% correspondence. Although this study does not ignore the possibility of any change in
the effect of language learning subsequent to the metalinguistic episodes and prior to the posttests, it
suggests that a high rate of linguistic knowledge constructed through dialogue can remain in memory for
a minimum of 6 weeks. The self-reported lexical items that the learners indicated they had learned
through the ACMC activities were tested in the same exam sheet; 28 (49.1%) out of 57 were correctly
resolved. Compared to the results for the self-reported lexical items, the lexical items in the metalinguistic
episodes indicate a higher rate of resolution (68.8%). Interestingly, most newly learned lexical items
reported by the learners were originally from the online partner's messages and not the item discussed in
the metalinguistic episodes. However, the posttest result demonstrates that the lexical items discussed in
the peer dialogue had a higher rate of resolution than those that the learners believed they had learned.
The other significant finding in the posttest is the high rate of resolution (79.5%) of syntactic items.
Selecting the correct syntactic items in the posttest may not necessarily imply that the learners completely
understand the syntactic aspect and are capable of applying it to any given context. However, the data
demonstrate that the learners were at least able to choose the correct syntactic item from the alternatives
they listed in the metalinguistic episodes and could do so by themselves—something they were unable to
accomplish before.
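The resolution rates reported above are simple proportions of correctly answered items. As a minimal sketch of that arithmetic (the function name `resolution_rate` is a hypothetical helper, not part of the study), the two rates for which the text gives both counts can be recomputed as follows:

```python
def resolution_rate(resolved: int, total: int) -> float:
    """Return the percentage of posttest items answered correctly, to one decimal place."""
    return round(100 * resolved / total, 1)

# Counts taken from the text: items drawn from the metalinguistic episodes
# and the lexical items the learners reported having learned.
overall = resolution_rate(132, 180)
self_reported = resolution_rate(28, 57)

print(f"Episode-based items: {overall}%")
print(f"Self-reported lexical items: {self_reported}%")
```

The category-level rates (68.8% for lexical items and 79.5% for syntactic items) cannot be recomputed in the same way here, because their item counts are not given in the running text.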
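Table 3. Posttest Result
[Table values not recoverable from the source. Surviving row labels: total number of items in posttest, items answered, resolution rate. Surviving column labels: phonological & orthographic, lexical & form, phonological & lexical.]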
Significance of Offline Metalanguage Talk
The analysis of the metalinguistic episodes during the offline talk demonstrated the unique discourse
structure of such episodes and suggested the possibility of high retention (73.3%) of linguistic knowledge
when it was discussed among peers. The high scores on the posttest imply that the opportunities to
discuss linguistic aspects with peers and instructors not only reflect the linguistic challenges encountered
by the learners but also have the potential to promote longer maintenance of the item in their individual
memories. The unique structural features of the metalinguistic episodes also demonstrate opportunities for
the learners to enhance their knowledge.
One major finding is that the learners' metalanguage talk did not occur in the online interaction with the
Japanese speakers but rather in the offline verbal interaction. The retrospective interviews and
questionnaires indicate that the learners were highly motivated to interact with online Japanese partners but
preferred the offline mode for metalanguage talk because of the availability of prompt and
comprehensible responses. The offline context—in which extensive exchanges with repetitions are
available—helps learners to establish and maintain intersubjectivity and obtain graduated and contingent
assistance. Unlike offline peer dialogues, asynchronous online interactions lack the exchanges that are
needed to create and maintain such discourse. The other explanation for the preference for offline
interactions is related to the interlocutors' effect. In offline modes, the assistance provided by peers who
share similar background knowledge is more comprehensible. Further, it is less face-threatening to
request linguistic help from offline peers than to request assistance from online Japanese partners that the
learners have never met.
The other crucial feature of the structure of metalinguistic episodes is the written and spoken repetition
discussed by the peers. The learners had the opportunity to incorporate the linguistic solution discussed in
the offline interaction into the online messages they subsequently wrote to their Japanese partners. The
learners' written repetition of what was already discussed in offline metalinguistic dialogues functions not
only as a message to the online Japanese partners but also as visualized evidence indicating the
intersubjectivity agreed upon by the peers in their offline interaction. By viewing the repeated written
words/phrases on the shared computer screen and listening to the spoken repetition, the learners are able
to indicate their stance to one another and be acknowledged for it. Further studies examining the role and
effect of written repetition may explore the potential of the distinguishable discourse structures of
metalinguistic episodes during an ACMC activity.
Although ACMC activities are frequently conducted as outside-the-classroom assignments, the findings
in this study indicate the significance of the in-class ACMC activity, since this entails the beneficial
aspects of offline talk. Although reference to online dictionaries is useful, the learners' retrospective
interviews suggest that there are limitations in the scope of these dictionaries. Unlike the receptive mode
(reading), which requires only comprehension, the productive mode requires the selection of the correct
linguistic knowledge and awareness of how to apply that knowledge in a particular context. Collaborative
peer context is able to meet such complicated demands that cannot be solved using dictionaries.
Factors Affecting Opportunities for Learning: Pair Work and Task Activities
Although most learners indicated in the questionnaires that they took advantage of the allotted time and
peers' help between the online asynchronous messages to write more complex and accurate texts, some
learners were more self-directed and hesitated to ask for assistance frequently. Such individual differences
are also apparent in the learners' perception of pair work. Previous studies based on the sociocultural
perspective, particularly in classroom-based research (Foster, 1993; Swain & Lapkin, 1998), suggest that
the manner in which the learners perceive, interact, and conduct pair work varies depending on the pair.
The responses to the question regarding the perception of pair work indicate that 61% of learners
considered it to be helpful, 30.5% perceived no difference between pair and individual work, and 8%
experienced difficulties working in a pair.
As indicated by previous studies examining pair interactions (Storch, 2002; Storch & Wigglesworth, in
press; Swain & Lapkin, 1998), the amount and pattern of metalinguistic episodes observed in each pair
and the manner in which the tasks were approached varied. The learners' perceptions of pair work may be
related to the congeniality of the two learners and may affect the amount and pattern of metalinguistic
episodes in the pair. Learners' reasons for disliking pair work were a preference for an independent
learning style and an inability to get along with their partners. The pairs who had more metalinguistic
episodes indicated that they enjoyed pair work and gained significant knowledge from their peers in terms
of the language and content of the discussion. On the other hand, the peers who had fewer metalinguistic
episodes tended to perceive offline peer interactions as an ineffective context for language learning. As
some studies suggest (Berg, 1999; Swain, Brooks, & Tocalli-Beller, 2002; Tang & Tithecott, 1999),
providing explicit instructions about the rationale of employing peer collaboration and training on
collaboration may promote a positive perception and increase collaborative work.
The type, complexity, and operationalization of the task also moderate the benefits of offline interaction.
The learners were instructed to collaboratively write more than one online message in each session, but
how they collaborated to carry out the task (write the messages) varied between the pairs. The
metalinguistic episodes seemed to consistently take place in the coauthoring (i.e., joint writing) context,
where the pair, using the same computer screen, collaboratively discussed and decided how and with what
content they should respond to their online Japanese partners. However, the pairs who discussed ideas and
decided who would write what, and then individually wrote a message, tended to have relatively fewer
metalinguistic episodes. The nature of the task may vary depending on how the learners construct the task
up to the final outcome (Coughlan & Duff, 1994), and a more qualitative analysis examining the
operationalization of the writing task in each process (e.g., prewriting discussion, composition, and
revision) and the pattern of collaboration (Storch, 2002) should be incorporated in order to address the
tasks involving collaborative writing in CMC. In particular, the effect of coauthoring, in which both
learners are equally responsible for online messages, may be one area of investigation for future research.
Another factor that may reduce the opportunity for learning through peer collaboration is the restrictive
nature of peer assistance. In one episode, a pair reached a non-target solution during the metalinguistic
talk. In another episode, a pair was unsuccessful in finding the correct grammatical form and instead used
an easier, alternative word in its place. This pair was attempting to find a subjunctive form of the word
yokereba 'good' (yokereba is conjugated rather uniquely in Japanese). After listing the incorrect forms, the
pair agreed instead to employ the more well-known word ok. Most of the episodes in the data indicate that
the learners solicited the instructor or the assistant for help when they were unsure about their linguistic
solution or unable to arrive at one. Such instances suggest that the availability of assistance from an
instructor or expert is crucial during peer collaboration.
Methodological Suggestions for Future Studies
Unlike other studies that focus on the effect of planning time in experimental settings, this study
incorporated a microgenetic approach to examine the learners' actual behaviors in an offline setting in
which there is a time stipulation and where peer collaboration occurs naturally. Although offline
behaviors are not stored in the scripts and are not as easy to observe as online behaviors, the analysis of
offline data reveals some of the learners' actual behavior while executing ACMC tasks, such as
collaborative knowledge building during the asynchronous exchanges. The incorporation of audio and
visual recordings may capture the non-verbal cues and demonstrate further details of the peer
collaborative process. The findings from the observation of the learners' offline behaviors suggest that the
planning time of ACMC is not an independent factor. Rather, the availability of external
resources during the time interval between the messages (dictionaries and metalanguage talk with peers
and the instructor) is advantageous for L2 learning. While an experimental study is useful in addressing
the general and statistical significance of the effect of planning time on the quality of production, a more
naturalistic and microgenetic approach that takes into account the availability of external resources,
learners' actual behaviors in executing tasks, and their long-term development should also be considered
when studying the effective practical application of ACMC in L2 contexts.
Besides the variety in the amount and quality of pair work and the actual activity carried out in a task, it is
necessary to improve the methods of measuring the transfer of knowledge obtained in peer collaborative
dialogue. This study employed a tailor-made test based on items that were resolved during peer-to-peer
dialogue; therefore, it addressed the extent to which the knowledge obtained through collaborative
dialogue was individually transferred and maintained. The items in the posttest posed questions in a new
context; however, each item was asked in varied contexts in order to examine whether the learner had
truly acquired the encountered knowledge item in question and was capable of applying it. A follow-up
posttest to study long-term effects would also be useful.
Although the tailor-made posttest is a relatively new method that requires some modification and is
difficult to apply to larger data, Swain (2000) indicates that this method directly demonstrates the dual
role of a language, suggested in the Vygotskian perspective: language as a mediation tool of cognition
("saying as a cognitive activity") and the construction of knowledge that reflects itself ("what they said
becomes an outcome of that activity"). At the same time, traditional pre-experimental and post-experimental
studies examining the quality of the production (i.e., accuracy, complexity, and fluency)
with statistical evidence using a larger data sample should also contribute to the exploration of the effect
of collaboration and planning time. A study with a combination of methodologies and varied approaches
should suggest the effective pedagogical application of ACMC where learners engage in different modes
of interaction.
This study examined the benefits of offline dialogue in an ACMC activity. Since the size of the sample
was small, further research considering factors such as the proficiency levels of learners and online
partners, the use of the L1 (in the context of foreign language learning), and the goal of the activity is
necessary for generalizing the findings. Addressing the preferred type of metalinguistic episodes, the
points at which they occur, and their relationship with language learning in an ACMC activity, this study
demonstrates how ACMC tasks can be structured to allow learners to take advantage of text-mediated
reflective processes that are amplified with sufficient time stipulations and peer collaboration. More so
than SCMC, ACMC provides greater access to real interactions with expert speakers (i.e., without the
difficulty posed by time differences), particularly for learning intercultural communication and pragmatic
competence. However, the asynchronous nature of ACMC can be perceived as unfavorable for L2
learning because it reduces the opportunity for instant and tailored feedback. This study suggests that
offline dialogue may compensate for this shortcoming and serve as an occasion for L2 learning and
knowledge building. The offline verbal peer dialogue data demonstrate that knowledge of the target
language may be collaboratively constructed, and the tailored posttests suggest that learners retain this
knowledge for at least a short period of time.
The findings related to the role of offline peer dialogue in the ACMC activity suggest a need for the
reexamination of CMC and the alternative pedagogical application of ACMC activities. The
methodological implication raised for future studies on CMC is the incorporation of a more detailed
analysis examining the learners' actual behaviors in carrying out CMC activities. Previous studies on
CMC have paid less attention to the role of offline interaction in language learning; however, the potential
of offline interactions to create a collaborative context, not only among online interlocutors but also
among offline peers, should be investigated in future studies. The collaborative peer relationship enables
learners to engage in interactions whereby they deepen their knowledge not only in terms of the content
but also linguistic aspects and at a level higher than they would have achieved individually. To effectively
incorporate collaborative learning in a CMC context, more pedagogical techniques (e.g., encouraging coauthoring activities, taking careful consideration when matching pairs, guiding the learners in pairing
activities) should be carefully considered.
Appendix A
Examples of Posttest Questions
{ }の中から一番適切なものを選んでください。(Please choose the most appropriate one from { }.)
Appendix B
Examples of Each Metalinguistic Episode Category
To choose one from two Japanese words for "every": daigaku "goto-ni" vs. daigaku "*tabi-ni"
To choose one from two Japanese words for "position": daigaku no "*tachiba" vs. daigaku no "ichi" (for
The potential verb form for "tsukuru (to make)":"*tsuku-rareru" vs. "tsukur-eru"
The use of the causative form: Kodomo ni "asonde-hoshii" vs. "asobasete-hoshii"
Inserting "tatoeba" (for example) to create cohesion.
Deciding that "-yo: dearu (sentence final expression)" is too formal and changing it to "yo:-desu"
<Lexis and form> Searching for both the lexical item(s) and structure together.
Discussing to find the expression "he got fired" in Japanese.
<Phonological and lexical>
To choose one from two phonologically similar words: "seikaku (personality)"or "seikatsu (life)"
<Phonological & orthographic>
The spelling of "message" in Japanese is "messe:gi," but the learner wrote it as "*mesegi."
This research is supported in part by a Grant-in-Aid for Scientific Research of the Ministry of Education,
Culture, Sports, Science and Technology. I would like to acknowledge and thank Maiko Ikeda for
transcribing the data, as well as the reviewers and editors of LLT for the helpful comments and feedback.
Keiko Kitade (Ph.D., University of Hawaii at Manoa) is Associate Professor of Japanese at Ritsumeikan
University, Japan. She teaches Japanese language, Japanese linguistics, and Japanese language pedagogy.
Her research interests are second language (L2) learning, computer-mediated communication (CMC),
discourse analysis, and technology and learning.
Email: [email protected]
Aljaafreh, A., & Lantolf, J. P. (1994). Negative feedback as regulation and second language learning in
the zone of proximal development. The Modern Language Journal, 78, 465-487.
An, Y.-J., & Frick, T. (2006). Student perceptions of asynchronous computer-mediated communication in
face-to-face courses. Journal of Computer-Mediated Communication, 11(2), 5. Retrieved December 27,
2007, from
Belz, A. J., & Kinginger, C. (2002). The cross-linguistic development of address form use in
telecollaborative language learning: Two case studies. The Canadian Modern Language Review, 59(2),
Berg, E. C. (1999). The effects of trained peer response on ESL students' revision types and writing
quality. Journal of Second Language Writing, 8(3), 215-241.
Brooks, F. B., & Donato, R. (1994). Vygotskian approaches to understanding foreign language learner
discourse during communicative tasks. Hispania, 77, 261-274.
Chang-Wells, G. L., & Wells, G. (1992). Constructing knowledge together. Portsmouth, NH: Heinemann.
Coughlan, P., & Duff, P. (1994). Same task, different activities: Analysis of a second language
acquisition task from an activity theory perspective. In J. P. Lantolf & G. Appel (Eds.), Vygotskian
approaches to second language research (pp. 183-193). Norwood, NJ: Ablex.
Cumming, A. (in press). Writing in the L2 classroom: Issues in research and pedagogy. In R. Manchon
(Ed.), International Journal of English Studies, 1, 2 [special issue].
Darhower, M. (2002). Interactional features of synchronous computer-mediated communication in the
intermediate L2 class: A sociocultural case study. CALICO Journal, 19(2), 249-277.
de Guerrero, M., & Villamil, O. S. (2000). Activating the ZPD: Mutual scaffolding in L2 peer revision.
The Modern Language Journal, 84(1), 51-68.
Dicamilla, F. J., & Anton, M. (1997). Repetition in the collaborative discourse of L2 learners: A
Vygotskian perspective. The Canadian Modern Language Review, 53, 609-633.
Donato, R. (1994). Collective scaffolding in second language learning. In J. P. Lantolf & G. Appel (Eds.),
Vygotskian approaches to second language research (pp. 33-56). Norwood, NJ: Ablex.
Foster, P. (1993). Discoursal outcomes of small group work in an EFL classroom: A look at the
interaction of non-native speakers. Thames Valley University Working Papers in English Language
Teaching, 2, 1-30.
Kinginger, C. (2000). Learning the pragmatics of solidarity in the networked foreign classroom. In J. K.
Hall & L. S. Verplaetse (Eds.), Second and foreign language learning through classroom interaction (pp.
23-46). Mahwah, NJ: Lawrence Erlbaum.
Kitade, K. (2000). L2 learners' discourse and SLA theories in CMC: Collaborative interaction in internet
chat. Computer Assisted Language Learning, 13(2), 143-166.
Kitade, K. (2006). The negotiation model in asynchronous computer-mediated communication. Computer
Assisted Language Instruction Consortium (CALICO) Journal, 23(2), 319-348.
Lamy, M., & Goodfellow, R. (1999). "Reflective conversation" in the virtual language classroom.
Language Learning & Technology, 2(2), 43-61.
Lantolf, J. P. (Ed.) (2000). Sociocultural theory and second language learning. Oxford: Oxford
University Press.
Lantolf, J., & Appel, G. (1994). Vygotskian approaches to second language research. Norwood, NJ:
Ablex.
Lapadat, J. C. (2002). Written interaction: A key component in online learning. Journal of Computer
Mediated Communication, 7(4). Retrieved January 31, 2008, from
LaPierre, D. (1994). Language output in a cooperative learning setting: Determining its effects on second
language learning. M.A. thesis, University of Toronto.
Lotman, Y. M. (1998). Text within a text. Soviet Psychology, 26(3), 32-51.
Lyster, R., & Ranta, L. (1997). Corrective feedback and learner uptake: Negotiation of form in
communicative classrooms. Studies in Second Language Acquisition, 19, 37-66.
Ohta, A. (1995). Applying sociocultural theory to an analysis of learner discourse: Learner-learner
collaborative interaction in the zone of proximal development. Issues in Applied Linguistics, 6(2), 93-121.
Platt, E., & Brooks, F. B. (1994). The 'acquisition rich environment' revisited. The Modern Language
Journal, 78(4), 497-511.
Radziszewska, B., & Rogoff, B. (1991). Children's guided participation in planning imaginary errands
with skilled adult or peer partners. Developmental Psychology, 27, 381-397.
Rogoff, B., & Gardner, W. (1984). Adult guidance of cognitive development. In B. Rogoff & J. Lave
(Eds.), Everyday cognition: Its development in social context (pp. 95-116). Cambridge, MA: Harvard
University Press.
Rommetveit, R. (1985). Language acquisition as increasing linguistic structuring of experience and
symbolic behavior control. In J. V. Wertsch (Ed.), Culture, communication, and cognition (pp. 183-204).
Cambridge: Cambridge University Press.
Schwienhorst, K. (2003). Learner autonomy and tandem learning: Putting principles into practice in
synchronous and asynchronous telecommunications environments. Computer Assisted Language
Learning, 16(5), 427-443.
Stockwell, G. (2004). Communication breakdown in asynchronous computer-mediated communication
(CMC). Australian Language and Literacy Matters, 1(3), 7-31.
Stockwell, G., & Levy, M. (2001). Sustainability of e-mail interaction between native speaker and
nonnative speakers. Computer Assisted Language Learning, 14(5), 419-442.
Storch, N. (2002). Patterns of interaction in ESL pair work. Language Learning, 52(1), 119-158.
Storch, N., & Wigglesworth, G. (in press). Writing tasks: Comparing individual and collaborative writing. In
M. P. García Mayo (Ed.), Investigating tasks in formal language settings. Clevedon, UK:
Multilingual Matters.
Swain, M. (1998). Focus on form through conscious reflection. In C. Doughty & J. Williams (Eds.),
Focus on form in classroom second language acquisition. Cambridge: Cambridge University Press.
Swain, M. (2000). The output hypothesis and beyond: Mediating acquisition through collaborative
dialogue. In J. Lantolf (Ed.), Sociocultural theory and second language learning (pp. 97-114). Oxford:
Oxford University Press.
Swain, M. (2006). Verbal protocols: What does it mean for research to use speaking as a data collection
tool? In M. Chalhoub-Deville, C. Chapelle, & P. Duff (Eds.), Inference and generalizability in applied
linguistics: Multiple perspectives (pp. 97-113). Amsterdam: John Benjamins.
Swain, M., Brooks, L., & Tocalli-Beller, A. (2002). Peer-peer dialogue as a means of second language
learning. Annual Review of Applied Linguistics, 22, 171-185.
Swain, M., & Lapkin, S. (1995). Problems in output and the cognitive processes they generate: A step
towards second language learning. Applied Linguistics, 16(3), 371-391.
Swain, M., & Lapkin, S. (1998). Interaction and second language learning: Two adolescent French
immersion students working together. The Modern Language Journal, 82(3), 320-337.
Swain, M., & Lapkin, S. (2002). Talking it through: Two French immersion learners' response to
reformulation. International Journal of Educational Research, 37(3-4), 285-304.
Tang, G. M., & Tithecott, J. (1999). Peer response in ESL writing. TESL Canada Journal, 16(2), 20-38.
Tocalli-Beller, A., & Swain, M. (2005). Reformulation: The cognitive conflict and L2 learning it generates.
International Journal of Applied Linguistics, 15(1), 5-28.
Villamil, O. S., & de Guerrero, M. C. M. (1998). Assessing the impact of peer revision on L2 writing.
Applied Linguistics, 19(4), 491-514.
Vygotsky, L. S. (1978). Mind in society: The development of higher psychological processes. Cambridge,
MA: Harvard University Press.
Warschauer, M. (1997). Computer-mediated collaborative learning: Theory and practice. The Modern
Language Journal, 81(3), 470-481.
Wells, G. (1999). Dialogic inquiry: Toward a sociocultural practice and theory of education. Cambridge:
Cambridge University Press.
Wertsch, J. (1985). Vygotsky and the social formation of mind. Cambridge, MA: Harvard University
Press.
Wertsch, J. (1991). Voices of the mind: A sociocultural approach to mediated action. Cambridge, MA:
Harvard University Press.
Wertsch, J. V., Minick, N., & Arns, F. J. (1984). The creation of context in joint problem-solving. In B.
Rogoff & J. Lave (Eds.), Everyday cognition: Its development in social context (pp. 151-171).
Cambridge, MA: Harvard University Press.
Language Learning & Technology
February 2008, Volume 12, Number 1
pp. 85-103
Methodological Hurdles in Capturing CMC Data: The Case of the Missing Self-Repair
Bryan Smith
Arizona State University
This paper reports on a study of the use of self-repair among learners of German in a task-based
CMC environment. The purpose of the study was two-fold. The first goal sought to
establish how potential interpretations of CMC data may be very different depending on
the method of data collection and evaluation employed. The second goal was to explicitly
examine the nature of CMC self-repair in the task-based foreign language CALL
classroom. Paired participants (n=46) engaged in six jigsaw tasks over the course of one
university semester via the chat function in Blackboard. Chat data were evaluated first by
using only the chat log file and second by examining a video file of the screen capture of
the entire interaction. Results show a fundamental difference in the interpretation of the
chat interaction which varies as a function of the data collection and evaluation methods
employed. The findings also suggest a possible difference in the nature of self-repair
across face-to-face and SCMC environments. In view of the results, this paper calls for
CALL researchers to abandon the reliance on printed chat log files when attempting to
interpret SCMC interactional data.
Though the field of computer-assisted language learning (CALL) is a rapidly emerging area within
applied linguistics, the amount of CALL research on many current SLA topics is relatively modest.
Though computer-mediated communication is fundamentally different than face-to-face communication,
CALL/SLA studies often do not adequately ground their inquiry in existing "traditional" SLA research.
Further, many CALL studies do not make use of existing technology in their data collection and analysis
methods, which can severely limit the impact and relevance of their findings. This is unfortunate because
CALL can be a powerful vehicle for exploring many of the core elements of current SLA theory. One of
these core areas is the role of self-repair in L2 development. The present study explores self-repair in a
task-based synchronous computer-mediated communicative (SCMC) environment and employs video
screen capture software to evaluate the amount and nature of self-initiated self-repair (SISR) in an SCMC
context. In this paper I will first establish the relevance of self-repair to SLA in both traditional and CMC
environments. Then, through the discussion of a classroom-based empirical L2 study I will point out how
using video screen capture technology yields a markedly different and more precise picture of the nature
of SISR in an SCMC environment.
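The contrast between a chat log and a record of the typing process can be illustrated with a small sketch. This is not the study's procedure, which relied on video screen capture; it assumes a hypothetical list of key events (with `"\b"` standing for a backspace) and shows how text that a learner deletes before sending never appears in the logged message. The function name `replay` and the sample German input are illustrative assumptions.

```python
BACKSPACE = "\b"  # assumed marker for a backspace key event

def replay(keys):
    """Replay key events and return (sent_text, deleted_runs).

    sent_text is what a chat log would store; deleted_runs lists each
    stretch of text removed by consecutive backspaces before sending.
    """
    buffer = []        # characters currently in the message box
    deleted_runs = []  # completed runs of deleted text
    current = []       # characters removed by the current backspace run
    for key in keys:
        if key == BACKSPACE:
            if buffer:
                current.append(buffer.pop())
        else:
            if current:
                # Backspaces remove text right to left, so reverse the run.
                deleted_runs.append("".join(reversed(current)))
                current = []
            buffer.append(key)
    if current:
        deleted_runs.append("".join(reversed(current)))
    return "".join(buffer), deleted_runs

# A learner types "ich habe ein Hund", deletes " Hund", and retypes the ending.
keys = list("ich habe ein Hund") + [BACKSPACE] * 5 + list("en Hund")
sent, repairs = replay(keys)
print(sent)     # the only text a chat log would preserve
print(repairs)  # the self-repair that the log alone would miss
```

In this sketch the logged message contains only the final wording, while the replayed key events also expose the deleted stretch, which is the kind of evidence a log-only analysis cannot recover.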
Self-repair in SLA
Learner self-repair or self-correction has been explored in a variety of educational contexts from various
theoretical perspectives and with a focus on both native speakers and second/foreign language learners1.
Self-repairs are seen as important from an SLA perspective because they provide us insights into a
learner's interlanguage (IL) development. Indeed, self-repair is viewed by many as evidence of noticing:
an observable behavior from which we can infer that a learner has engaged in some monitoring strategy
or has noticed a production error (Kormos, 1999).
Self-repair occurs when speakers detect that their output is faulty or inappropriate in some way. The
speech flow is halted and a self-correction is executed. Foster & Ohta (2005) define self-correction as
"self-initiated, self-repair, [which] occurs when a learner corrects his or her own utterance without being
prompted to do so by another person" (p. 420). Wouk (2005) describes same-turn
Copyright © 2008, ISSN 1094-3501
Bryan Smith
Methodological Hurdles in Capturing CMC Data
self-repair in which a speaker, in the process of producing an utterance, stops that utterance before
completion and continues it in some way that involves alteration of the syntactic structure that is being
produced. The speaker may abort the utterance in progress and begin a completely new structure or
change the syntactic framework of the utterance, utilizing lexical elements of the old syntax in a new
syntactic frame. Buckwalter (2001) notes that the SLA literature normally equates repair with correction.
This view precludes any investigation of difficulty that occurs in the absence of observable error such as a
learner's preemptive action when anticipating difficulty. Kormos (2000) labels such occurrences as covert
self-repair, whereby a learner notices an error prior to articulation and repairs it. Most of the SLA
research on self-repair involves the overt variety since the phenomenon of covert self-repair can only be
explored in highly controlled experimental settings or through the use of verbal reports or stimulated
recall (Gass & Mackey, 2000).
Self-repair is a type of modified output first argued by Swain (1985) to be key in SLA and is now
considered to be a fundamental construct in current SLA theory (Izumi, 2003; Shehadeh, 1999, 2002;
Swain & Lapkin, 1995). Modified or "pushed" output refers to corrections or rephrasings that are elicited
by the L2 learner's interlocutor. Swain's (1985; 2005) comprehensible output hypothesis argues for the
importance of learner output in terms of enhancing the noticing of one's own errors and states that the role
of output is, at a minimum, "to provide opportunities for contextualized, meaningful use, to test out
hypotheses about the target language, and to move the learner from a purely semantic analysis of the
language to a syntactic analysis of it" (p. 252). Swain & Lapkin (1995) argue that when learners produce
the target language, external or internal feedback leads them to notice a gap in their existing (IL)
knowledge. This noticing pushes them to consciously reprocess their utterances to produce modified
output. Research also suggests that learning depends partly on learners' ability to focus on form when they
notice such a gap in their IL and also on the extent to which noticing is learner-initiated (Doughty &
Williams, 1998; Long & Robinson, 1998).
The benefits of pushed output have also been discussed in terms of learner collaboration, which results in
language related episodes "where students reflect consciously on the language they are producing" (Swain,
2001, p. 53). Though much of the interactionist research to date suggests that language related episodes are
often triggered by lexis, there is also evidence that a great deal of learner collaboration is related to form
(Ohta, 2001; Swain, 2001).
Though technically different from pushed output, self-initiated self-repairs are functionally similar to
pushed output in that they serve to test hypotheses about the target language, trigger creative solutions to
problems, and expand the learner's existing resources (Kormos, 1999). Self-repair, therefore, occupies an
important position in SLA theory.
Research on Self-repair
Self-correction data have been collected through a variety of means in L2 research, including picture
description, spatial description, interviews, storytelling, open narration, and information gap activities
(Camps, 2003; Fathman, 1980; Kormos, 2000; Lennon, 1990; van Hest, 1996; Verhoeven, 1989).
Generally speaking, the research to date on self-repair suggests that language learners tend to prefer self- over other-repair (Buckwalter, 2001; Foster & Ohta, 2005; van Lier, 1988); that L2 speakers self-repair
more often than native speakers (Kormos, 2000; van Hest, 1996); and that self-repair more often leads to
modified output than does other-initiated repair. Researchers have also explored various types of self-repairs such as repairs of the message conveyed, repairs in the manner of expression (or appropriateness
repairs), and error repairs, which have included lexical, phonetic, and grammatical repairs. Out of this
work emerge trends in the type of SISR that occurs during learner interaction as well as several key
variables that affect the nature and amount of self-repair, including learner preferences, developmental
factors, and task type.
Language Learning & Technology
Bryan Smith
Methodological Hurdles in Capturing CMC Data
Types of self-repair
In most of the SLA research on self-repair, researchers have generally concentrated on the focus and
structure of self-repair. Kormos (2000) found that lexical errors were repaired considerably more
frequently than grammatical errors in the L2. This is in line with a general assumption by researchers of
L2 production that L2 learners pay considerably more attention to lexical appropriateness than to
grammatical accuracy. Kormos also found that the distribution of self-repairs shows no considerable
difference between L1 and L2 data in the frequency of the various types of self-repairs. In contrast,
Lennon (1990) found relatively little self-correction among his university level L1 German L2 English
speakers, though when they did self-correct they most often focused on lexical items. Fathman (1980)
found that 50% of all repairs were lexical in nature.
Van Hest (1996) discusses self-repair structure in terms of three components: 1) a reparandum, which is
either an error or an inappropriate expression; 2) an editing phase, which occurs immediately following
the interruption of the flow of speech; and 3) a reparatum, which is the actual correction or change of the
problematic item. Much of this work is based on research from L1 psycholinguistics and Levelt's (1983)
repair classification system (Kormos, 2000; van Hest, 1996). This work makes a distinction between overt
and covert repair. From this perspective, covert repairs (those made before articulation) proceed the same
way as overt repairs. One must infer covert repair through indirect evidence such as word or phrase
repetitions, syllabic repetition, silent pauses, etc. (Postma & Kolk, 1992). Van Hest proposes the model
below for classifying overt self-repair. The present study adapts this model for classifying instances of
SISR. The model proposed in van Hest (1996) was chosen for the current study for three reasons. First,
it is based largely on Levelt (1983), which has been applied widely in the literature on
self-repair (Kormos, 2000). Second, this model is quite systematic and reliable, emerging out of a corpus
of almost 5,000 self-repairs produced by Dutch speakers in their L1 and L2 across various task types.
Finally, adapting such a strong existing taxonomy allows for more powerful comparisons of results across
related studies.
Overt Self-Repairs
• Error repair (E-repair): Those repairs made because the speaker has made an error.
• Appropriateness repair (A-repair): Those repairs made because the speaker thinks the original
message is inappropriate in some way. For example, a message may be perceived as not having been
specific enough.
• Different repair (D-repair): Those repairs in which the speaker interrupts his current message to
introduce a new, totally different topic.
• Rest repair (R-repair): All other types of overt self-repair.
Covert Self-Repairs
• Those cases whereby the speaker discovers imminent trouble in his/her message and “interrupts”
him/herself before the troublesome item is uttered.
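The three-part repair structure and the classification above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class and field names, the single-letter code "C" for covert repairs, and the example utterance are hypothetical and do not come from van Hest (1996).

```python
from dataclasses import dataclass
from enum import Enum


class RepairType(Enum):
    """Repair categories as listed above; the code letter "C" is an assumption."""
    ERROR = "E"            # the speaker has made an error
    APPROPRIATENESS = "A"  # the original message is judged inappropriate
    DIFFERENT = "D"        # the speaker switches to a new, different topic
    REST = "R"             # all other types of overt self-repair
    COVERT = "C"           # trouble is intercepted before it is uttered


@dataclass(frozen=True)
class SelfRepair:
    reparandum: str     # the error or inappropriate expression
    editing_phase: str  # what immediately follows the interruption (may be empty)
    reparatum: str      # the actual correction or change
    repair_type: RepairType

    def is_overt(self) -> bool:
        # Every category except the covert one leaves a visible trace.
        return self.repair_type is not RepairType.COVERT


# Hypothetical example, invented for illustration only.
example = SelfRepair(
    reparandum="the red",
    editing_phase="uh",
    reparatum="the blue",
    repair_type=RepairType.ERROR,
)
print(example.repair_type.value, example.is_overt())
```

Keeping the reparandum and reparatum as separate fields makes it straightforward to compare what was first produced with what replaced it when coding each episode.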
In this work van Hest (1996) found that appropriateness repairs accounted for 39.7% of self-repairs,
error repairs for 22.4%, and different repairs for 10.1%, with 12.3% of self-repairs remaining unclassified.
Covert repairs made up 15.5% of all L2 self-corrections. Interestingly, Levelt's model does not explain
where errors of morphology should go, though some researchers have collapsed syntactic and
morphological errors together in a broader category of "grammatical" errors (Lai & Zhao, 2006).
Developmental factors in self-repair
Both age and L2 proficiency level seem to affect the amount and nature of self-repair (Camps, 2003;
Fathman 1980; Kormos, 1999; van Hest, 1996; Verhoeven, 1989). Of these factors, proficiency-related
variables in self-repair are particularly relevant to the current study. Camps (2003) suggests that learners
who make a large number of errors possess a more limited knowledge of the target language, and
therefore are not as well prepared to notice errors and correct them. They may be unaware of errors
because either they do not know what a correct form would look like or they are too busy attending to
other elements in their production (like finding suitable lexical items to express their ideas).
In contrast, van Hest (1996) found that advanced learners correct themselves less frequently than lower
level learners. Additionally, beginning and intermediate L2 speakers produced significantly more self-repairs of lexical errors and significantly fewer repairs of lexical appropriateness than the advanced L2
speakers. Likewise, Kormos (1999) found that participants at a higher level of proficiency self-corrected
linguistic errors significantly less frequently than learners at the pre-intermediate level, whereas they
repaired the appropriateness of informational content more frequently than pre-intermediate students did.
Task type as a variable in self-repair
L2 self-repair research also reveals that the frequency of repairs concerning the information content of the
message varies across tasks (Poulisse, 1997; van Hest, 1996). Van Hest, for example, suggests an effect
for task type, concluding that tasks requiring more precise expression will result in more appropriateness
repairs than in tasks with less-rigid structure. Kormos (2000) also suggests that the frequency of
appropriateness repairs both in L1 and L2 is affected by task characteristics and the situational variables
of the interaction.
Self-Repair in a CALL Context
Self-repair has been accepted by many as evidence of noticing (Lai & Zhao, 2006), which has been
argued to be fundamental to the SLA process (Schmidt, 1993). Text-based chat has been argued to be a
good venue for exploring self-repair, since it seems to provide an increase in processing time and
opportunity for learners to focus on form (Pellettieri, 1999; Shehadeh, 2001; Smith, 2004), which may
lead to a heightened potential for noticing one's own errors. Indeed, Yuan (2003) suggests that the nature
of SCMC requires learners to attend to both linguistic forms as well as the meaning of their
The printed text may also add to the salience of input and output in general and the noticing of non-target-like input and output in particular (Izumi, 2002; Salaberry, 2000; Smith, 2004). Smith has also argued that
a heightened saliency of linguistic input and output is a favorable byproduct of the CMC interface, with
increased saliency due largely to the permanence of the message. This notion of permanence has also
been used to explain the lack of learner uptake in a synchronous computer-mediated communication
(SCMC) environment (Smith, 2005). Kitade (2000) suggests that internet chat provides opportunities for
learners to self-correct both grammatical and pragmatic errors in their own linguistic output for essentially
two reasons: first, there is no turn-taking competition and, second, there is more time for things like self-monitoring. Also, there are few paralinguistic cues available in text-based chat, which might reduce the
sense of urgency to respond, and this, in turn, might facilitate learners' ability to monitor their language
output more closely. SCMC texts are not ephemeral like oral/aural input and learners can scroll up/down
to access an earlier message quite easily. Whether they in fact do this is an empirical question. Taken
together, these features may positively influence a learner's ability to notice and subsequently correct non-target-like language (Lai & Zhao, 2006; Smith & Gorsuch, 2004).
Recent studies on CMC self-repair
Studies of self-repair in a CMC context are few. Those that do exist tend not to build on the existing work
on self-repair (from the non-CMC applied linguistics literature) and take a limited view of the types of
self-repair investigated. These studies are often stifled by a failure to employ existing technology in the
data collection and evaluation phases of the research.
Jepson (2005) found that though both voice and text chats contained various types of repair moves, self-correction was not among them. He suggests that self-correction in an SCMC context may be rare
because speakers do not notice their errors or because the non-native speaker – non-native speaker SCMC
context is not conducive to self-correction. Lee (2002) considered self-corrections in the framework of a
broader study on the nature of modification devices in a CMC environment and categorized them as
belonging to one of two categories, lexical or grammatical. Though no statistical data were provided, Lee
reports that most of the self-corrections made by her intermediate-level L2 learners of Spanish were made
on the "concordance of gender and number" (p. 284), and only occasionally was incorrect usage of lexical
items recognized and self-corrected. She suggests that learner keyboarding skills, language proficiency,
and attention to linguistic aspects might contribute to a high number of errors in (written) production.
Yuan (2003) found that of 512 errors in his CMC data, only 44 or 8.59% of them were self-repaired. Over
43% of the self-repairs were grammatical in nature (sentence structure, agreement, noun/article, and
preposition), though errors of nouns/articles and prepositions were only very rarely self-corrected. Errors
of verb tense, modals, and adjective-noun sequences were never self-corrected. Lexical self-repairs
accounted for almost 30% of the self-repairs. Yuan also counted self-repairs of spelling errors (25%).
Most notable is the fact that although learners made 57 verb tense errors (exactly as many as sentence
structure errors), none of these errors were self-corrected. Yuan argues that evaluating such CMC chat
logs allows one the advantage of seeing certain processes that the learner undergoes while trying to
construct meaning in their L2, arguing that they provide "real, recorded examples of errors (repaired or
unrepaired) learners made while trying to achieve certain communicative goals" (p. 204).
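The self-repair rate reported for Yuan (2003) follows directly from the two counts given above. The short check below is a sketch that uses only those two figures; the variable names are illustrative.

```python
# Counts as reported above for Yuan (2003).
total_errors = 512
self_repaired = 44

# Share of all errors that were self-repaired, as a percentage.
repair_rate = self_repaired / total_errors * 100
print(f"{repair_rate:.2f}% of errors were self-repaired")
```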
One of the very few studies to employ screen capture technology in an SLA/CALL study is Lai & Zhao's
(2006) study which examines the capacity of text-based chat to promote learners' noticing of their own
problematic production. In this study instances of learner self-correction are viewed as evidence of
noticing. Lai & Zhao found that online chat was superior to face-to-face interaction for promoting
noticing of mistakes even after controlling for differences in the amount of language output produced in
each condition.
As we can see from these few studies, using printed chat logs in the evaluation of SCMC data is the
methodological "industry standard" (with the notable exception of Lai & Zhao). Though chat logs
certainly do have value for interpreting SCMC interaction, they fail to capture a significant portion of the
data. It is precisely these "missing data" that may provide the most insight into the potential roles of
monitoring, attention and noticing, and pushed output in interlanguage development within a CMC
context. In order to gain a more complete view of learner CMC interaction, especially that which involves
learner self-repair, use of a dynamic screen capture record is required. Relying on a static artifact to make
claims about a dynamic process requires an uncomfortably wide and unnecessary leap of faith.
For example, Jepson (2005) reports that there were absolutely no self-correction moves in his data,
claiming that the SCMC environment is perhaps not conducive to self-correction. This seems unlikely
since there is ample research that strongly suggests a heightened degree of attention to form in a CMC
context. More likely is the possibility that significant self-repair did occur, but the data collection
methodology employed was not sensitive enough to detect it. Jepson (2005) suggests employing
technology that records each keystroke in an effort to uncover what he calls "hidden" self-correction. He
comments that in his study it was not possible to observe if participants edited their own messages before
they sent them and acknowledges that some self-correction repair moves may not have been measurable.
This is an important point of which some CALL researchers have taken note (see, for example,
Pellettieri's, 1999, use of YTalk). However, this approach is not only cumbersome when it comes to data
analysis; it also obscures other potentially interesting elements of online interaction such as scrolling as a
strategy to "recapture" previous content. Due to this limitation, Smith & Gorsuch (2004) suggest that
claims about the occurrence of certain interactional moves and strategies largely require one to infer too
much in those studies that use only a hard copy transcript of the interaction.
It seems, then, that if there are instances of self-repair that appear on the chat logs, there must be many
more that are attempted, but edited out before the message is sent to the interlocutor. It is important to
note that from an interactionist perspective on SLA, the potential value of this output in the form of self-repairs is not diminished by the fact that they may be subsequently edited out by the learner.
CMCovert repair
In an SCMC context a unique type of self-repair is possible that may be considered a CMC-specific form
of self-repair. In this case a message is typed, but a self-initiated correction or rephrasing is executed
before the message is sent to the interlocutor. The self-repair is certainly overt from a psychological
perspective, but essentially "covert" from the interlocutor's perspective in that there is never any evidence
of there having been a self-repair or rephrasing. Further, such repair is often not immediate and regularly
contains several "embedded" self-repairs in the same evolving message. The proposed methodology
allows an examination of this type of self-repair, heretofore lost to methodological limitations of the
research design.
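To make the idea concrete, the sketch below replays a simplified record of typing and deleting, and recovers both the message as sent and the text that was removed before sending. It is a hedged illustration only: the study itself relies on screen capture video rather than a keystroke log, and the event format, the `<BS>` token, the function name, and the example sequence are all hypothetical. The sketch also assumes that text is only added or deleted at the end of the message (no cursor movement).

```python
BACKSPACE = "<BS>"  # hypothetical token standing for one backspace keypress


def replay(events):
    """Rebuild the sent message and collect text deleted before sending.

    `events` is a list of single typed characters and BACKSPACE tokens.
    Returns (final_message, deleted_runs).
    """
    buffer = []        # characters currently in the message box
    deleted_runs = []  # one string per run of consecutive deletions
    current_run = []   # characters removed by the deletion run in progress
    for event in events:
        if event == BACKSPACE:
            if buffer:
                current_run.append(buffer.pop())
        else:
            if current_run:
                # Deletions remove characters last-to-first, so reverse them.
                deleted_runs.append("".join(reversed(current_run)))
                current_run = []
            buffer.append(event)
    if current_run:
        deleted_runs.append("".join(reversed(current_run)))
    return "".join(buffer), deleted_runs


# Invented sequence: "ist" is typed, deleted, and replaced by "hat auch".
events = list("ist") + [BACKSPACE] * 3 + list("hat auch")
message, deleted = replay(events)
print(message)  # what the interlocutor would see
print(deleted)  # what only a dynamic record would reveal
```

The interlocutor, and a printed chat log, would contain only the first return value; the second is the material that a static record cannot show.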
This type of self-repair is different from true covert repair since, in the latter, we may not expect the same
output-related benefit that may only be present upon actually producing the target language. Of course,
some of the argued benefits of pushed output for the speaker are obtained only once the interlocutor reacts
to a speaker's productive output. In an SCMC environment, these conditions will be the same, as will
possible benefits achieved when one engages in truly covert self-repair.²
However, what I term here as "CMCovert" self-repair is an interesting and largely unexplored
phenomenon that may provide us valuable insights into the nature of such self-repair as well as the effect
of CMCovert self-repair on SLA. Given the pervasiveness of such repairs I argue that any examination of
the occurrence and possible effects of SCMC self-repair from the repairer’s point of view simply must
include this CMCovert self-repair. To do this one must abandon the practice of simply using printed chat
logs to analyze CMC interactional data and employ more dynamic means such as screen capture
technology. This does not only apply to investigations of self-repair, but to CMC learner interaction in
general as suggested by Smith & Gorsuch (2004).
The purpose of the current study is two-fold. First, I wish to establish empirically just how misguided it
would be for CALL researchers to continue to rely on printed chat logs alone when making SLA-related
claims about SCMC interaction. This inadequacy is revealed by comparing two types of data from the
same task-based interaction sessions: the printed chat log and the Camtasia screen capture video record of
the same session for the same participants (hereafter chatscript). The amount of SISR, then, is the
dependent variable in this study with the learners in this study serving as their own control group. Data
collection/evaluation methodology is the independent variable. Second, the nature of learner self-repair in
this SCMC context will be explored. Though there is considerable research on face-to-face self-repair,
there is very little CALL work on this topic.
Research Questions
This study explores what Buckwalter (2001) has described as self-initiated self-repair, albeit in an SCMC
environment. Following Lai & Zhao (2006), self-repairs were defined as episodes where the participants
immediately corrected their own production without prompts from their interlocutors. Some of these
episodes are visible on the final chat logs of each session and, as we will see, some are not. All SISR
episodes, however, are visible on the video file of the relevant chat session.
The research questions are as follows:
1. Does the hard copy transcript of chat interactions differ from that which is available using a screen
capture program in terms of the amount of SISR that is evident?
2. What is the nature of CMCovert self-repair?
In order to answer these questions, hard copies of all the chat logs for all participants across all tasks were
analyzed and coded for instances of SISR. Hypothesis 1 predicts that evidence of a significantly higher
amount of SISR will be found in the screen capture condition. This prediction is based largely on Smith &
Gorsuch (2004), who found that a similar method of data collection captured a much richer picture of the
CMC interaction of the participants in their study. Drawing from previous research on self-correction as
well as CALL, a substantial amount of CMCovert self-repair is expected (Hypothesis 2a). More
pronounced attention to form, manifested in a higher number of grammatical self-repairs than lexical self-repairs, is expected due to the increased planning, monitoring, and processing time afforded by the CMC
medium (compared to a face-to-face setting). Also, because of the relatively low proficiency level of the
participants (Novice-High), more self-correction of grammar-related problems is expected (Hypothesis
2b). It is also expected that few appropriateness (A) self-repairs relative to the number of error (E) self-repairs will be found, also due mostly to the relatively low proficiency level of the participants
(Hypothesis 2c). A summary of the four directional hypotheses follows below:
Hypothesis 1: A significantly higher amount of SISR will be found in the screen capture condition
than in the hard copy transcript condition.
Hypothesis 2a: A substantial amount of CMCovert self-repair is expected.
Hypothesis 2b: More SISR of grammar-related problems than lexical ones are expected.
Hypothesis 2c: Few appropriateness (A) SISRs relative to the number of error (E) SISRs will be found.
Forty-six beginner-high proficiency level students participated in this study as part of their regularly
scheduled German language course at a major southwestern university in the United States. Students were
required to meet once every other week in the foreign language micro-computing lab. Six CMC sessions
were scheduled over the course of the semester. All students were either sophomore or junior
undergraduates and all were native speakers of English. None were German majors. Their proficiency
level and placement in the German sequence was determined by an in-house online placement test. All
participants in the present study were characterized by the instructor of record as roughly at the ACTFL
Novice-High proficiency level and were familiar with the chat function in Blackboard. All participants
did complete one training session prior to data collection to ensure they were familiar with the general
task and procedures since they were not necessarily accustomed to performing similar task-based CMC
activities in their German class. Though there was some evidence of target cultural materials and short
samples of authentic literature, the core textbook and instruction were largely organized along a
grammatical syllabus.
Paired participants completed one jigsaw task per session over the course of the semester, which resulted
in a potential total of six tasks per student (assuming perfect attendance). This task type was chosen
because of its structural requirement of two-way information exchange by participants who are striving to
reach a convergent goal. Pairs were not necessarily matched from week to week. Though each task was
slightly different, they all follow Pica, Kanagy, and Falodun's (1993) task features for jigsaw tasks. Four
of the six tasks were video-based, whereby one learner (learner A) would view a two-minute dramatic
video clip that corresponded to the week's assigned course content. The other learner (learner B) would
not view this clip but would study a series of eight stills from the same video clip, randomly arranged.
The stills were such that a logical order was not discernible simply by examining the photos alone but
could be quickly and easily sequenced upon viewing the clip from which they were taken. The remaining two tasks
were standard sequential ordering tasks, where learners each held three out-of-sequence pictures which
when put together made up a logical story sequence. The video-based tasks were directly tied to the core
content and textbook of the course and came from the ancillary DVD and workbook accompanying the
main course textbook. Participants interacted with one another via the chat function in Blackboard and
were assigned to one of various paired "groups" under Blackboard's Communication tool, Virtual
Classroom. Following Smith & Gorsuch (2004) and Lai & Zhao (2006), the dynamic screen capture
software Camtasia was employed to record exactly what appeared on each participant's computer screen
in real time. Camtasia has the capability of recording and creating a movie file of each participant's
computer screen, allowing one to play back the chat session in its entirety.
Participants were required to attend and participate in each session since these were built into the syllabus
of the course. Participation in the study was voluntary and followed the university's prescribed IRB
protocol. The six CALL tasks described above were completed every other week during the middle part
of the semester (over 12 weeks). Each class lasted approximately one hour. The average amount of time it
took pairs to complete each task was just over 25.5 minutes as measured by the time stamp on the chat
logs of the interactions. No specific time limit was placed on students once they began the task; however,
given the length of the class, participants realistically had about 40 minutes to complete each task.
All students worked collaboratively online with a partner. Each participant was given task sheet A or B.
All of those holding task sheet A were grouped together and separate from those students in group B. This
was done in order to reduce the chance that any participant would gain visual access to their interlocutor’s
(partner’s) task sheet/video clip. For the video-based tasks, the clip for that session was made available to
group A in each dyad. These students viewed the clip with headphones on while their partner studied the
still images of various scenes from the same video clip (group B). Once the video clip had played through
one time, participants were directed to interact with their assigned chat partner and decide the proper
order of the pictures held by student B. In order to successfully complete this task, learner A had to
describe in detail the events in the short video clip (in the target language) while learner B attempted to
place the pictures in order based on these descriptions. Likewise, learner B was told to describe each
picture to learner A in order to facilitate this ordering. Learners were instructed to interact using the target
language with the goal of agreeing on a likely order to the still images held by learner B. Once this order
was agreed on, learners were to declare the task completed by typing the proposed order of the pictures
and writing "finished". Upon task completion, Camtasia was stopped and the video record of the
interaction saved to a removable storage drive. The chat logs of these interactions were saved
automatically in Blackboard.
Data analysis
Following Kormos (2000), truly covert repairs were not considered in the data since one can only infer
these occurrences. Rather, CMCovert self-repairs (those that are recorded, but which do not appear on the
chat logs), as well as those which are truly overt (appearing on the chat logs), were considered. Following
Lai & Zhao (2006) and Lee (2002), spelling mistakes/corrections were not counted in the data. Errors in
the obligatory capitalization of German nouns and subsequent self-corrections were counted and coded as
lexical errors (EL). Since the first research question sought to compare the amount of self-repair evident
when employing two data collection methodologies in a paired groups fashion, only those participants
whose chat records showed evidence of self-repair across both of these methodologies were candidates
for inclusion in these data. To this end, all printed chat logs (n=94) of pair interaction were evaluated and
coded for instances of self-correction (method A) using the coding scheme below (see Table 1), which is
adapted from van Hest (1996). Table 2 shows examples of each of these categories from the data.
Table 1. Description of Types of Self-Repair.
Type of self-repair | Description
Lexical (EL) | The learner has selected the wrong word and substitutes the correct one for it.
Morphological (EM) | The learner corrects a morphological error.
Syntactic (ES) | The learner produces a grammatical construction which cannot be finished without violating the grammar of the target language.
Lexical (AL) | The learner replaces one term with another, usually more precise, term which better fits the concept s/he wishes to …
Syntactic (AS) | The learner replaces the original syntactic construction with a construction which, in his/her opinion, is more appropriate.
Insertion (AI) | The speaker repeats part of the original utterance and inserts one or more words to specify his/her message.
Different repair (D) | The speaker interrupts his/her current message to introduce a new, totally different topic. Abandonments.
Rest repair (R) | All other self-repairs that do not fit cleanly into any of the other categories.
Note: Other categories discussed in van Hest (1996) were either not appropriate for the CMC context, for
example, phonological errors, or were not present in the data and, therefore, are not included in these
This resulted in a total of eight chat logs. That is, out of the transcripts of the twenty-eight students
completing the tasks described above, there was evidence of self-repair on only eight of the 94 chat log
transcripts, or 8.5%. This resulted in a total of 9 instances of self-repair (see Table 3). Next, the
corresponding Camtasia file was viewed in its entirety (method B) for each corresponding transcript
(n=8). Each video file was coded for instances of SISR according to the same coding scheme (Table 1).
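The two-pass coding procedure can be sketched as a simple tally of coded episodes per method. Only the two counts (8 of 94 chat logs) are taken from the text; the episode list, transcript identifiers, and function name are hypothetical and stand in for the actual coded data, which are not reproduced here.

```python
from collections import Counter

# Figures reported in the text above.
chat_logs_total = 94
chat_logs_with_repair = 8
share = chat_logs_with_repair / chat_logs_total * 100
print(f"{share:.1f}% of chat logs show self-repair")

# Hypothetical coded episodes: (transcript id, method, code).
# Method "A" = printed chat log, method "B" = Camtasia video record.
episodes = [
    ("t01", "A", "EM"),
    ("t01", "B", "EM"),
    ("t01", "B", "EL"),
    ("t01", "B", "AS"),
    ("t02", "A", "ES"),
    ("t02", "B", "ES"),
    ("t02", "B", "EM"),
]


def tally(coded):
    """Count coded episodes per method and per self-repair code."""
    counts = {}
    for _transcript, method, code in coded:
        counts.setdefault(method, Counter())[code] += 1
    return counts


counts = tally(episodes)
print(counts["A"])
print(counts["B"])
```

Because both methods are applied to the same sessions, every episode visible under method A should also appear under method B, which is why the invented list repeats the chat log codes in the video rows.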
Table 2. Examples of Self-repair Types
Type of self-repair | |
Lexical (EL) | …ist hat auch… | …hat auch…
Morphological (EM) | …welche Buchstaben ist… | …welche Buchstab ist …
Syntactic (ES) | …weil er dieas maädchen… | …weil er das mädchen…
Lexical (AL) | …bitte Wurst und S | …bitte Schnitzel.
Syntactic (AS) | …wann er mit sein | …wann er sein freundin…
Insertion (AI) | Drei Leute sind [im Morgen] im Restaurant. [+] | Drei Leaute sind im Morgen im Restaurant.
Different repair (D) | die männer sind look▌ wie sagt man looking to look at? | wie sagt man to look at?
Note: A coding scheme adapted from Smith & Gorsuch (2004) is shown in the Appendix.
The chat data for the target learners yielded a total of 1,464 words. The number of words produced by
these learners on the tasks considered ranged from 101 to 398 (M=183, SD=93). As can be seen from
Table 3 below, there were many more instances of self-repair evident when using method B. A Wilcoxon
Signed Ranks Test shows that significantly more instances of learner self-repair were captured using
method B than method A (z = 2.53, p = .01, r = .79; see Table 4). To allow for a clearer comparison to
previous studies, "standardized scores" for instances of self-correction relative to the quantity of discourse
produced by each participant were calculated. As can be seen from Table 5, the Wilcoxon Signed Ranks
Test yielded very similar results (z = 2.52, p = .01, r = .88). Viewed another way, learners who engaged in
task-based CMC interaction with a partner actually self-repaired about six times per one hundred words
(of printed transcript), whereas they appear to have self-repaired less than once per one hundred words
based on the hard copies of the chat logs alone.³
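For readers unfamiliar with the procedure, the sketch below shows how a per-100-words rate and a Wilcoxon signed-rank statistic can be computed for paired counts. It is an illustration under stated assumptions: the paired counts are invented and are not the study's data, the function names are hypothetical, and the z value uses the plain normal approximation without tie or continuity correction, so it need not match the output of a statistics package.

```python
import math


def per_100_words(repairs, words):
    """Self-repairs per one hundred words of transcript."""
    return repairs / words * 100


def wilcoxon_signed_rank(a, b):
    """Return (W_plus, W_minus, z) for paired samples a and b.

    Zero differences are dropped; tied absolute differences share the
    average of the ranks they occupy.
    """
    diffs = [y - x for x, y in zip(a, b) if y != x]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return w_plus, w_minus, (w_plus - mean) / sd


# Invented paired counts for eight learners (method A, then method B).
method_a = [1, 2, 1, 1, 3, 1, 2, 1]
method_b = [5, 10, 12, 7, 16, 10, 8, 16]
w_plus, w_minus, z = wilcoxon_signed_rank(method_a, method_b)
print(w_plus, w_minus, round(z, 2))
print(per_100_words(12, 200))
```

Standardizing by transcript length, as in `per_100_words`, is what allows learners who produced very different amounts of text to be compared on the same scale.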
Table 3. Self-repairs by Type, Across Method of Data Collection
[Table values not recoverable. Rows: Lexical (EL), Morphological (EM), Syntactic (ES), Total (E), Lexical (AL), Syntactic (AS), Insertion (AI), Total (A). Surviving column heading: Chat log.]
Table 4. Comparison of Self-repairs in chat logs vs. Camtasia
[Table values not recoverable. Rows: A (Chat log), B (Camtasia). Surviving column heading: effect-size r.]
Figure 1 below shows two transcript versions of the same chat interaction. The left column is the
"Camtasia-enhanced" chatscript, which was transcribed while viewing the video screen capture file. The
column on the right shows a traditional chat log of the same segment without the benefit of the Camtasia
file. In column B only one instance of SISR is evident (lines 6b-7b) whereas column A shows at least
three non-spelling self-repairs (lines 5a, 6a-7a, and 8a). The SISR in lines 5a and 8a go undetected when
relying on printed chat logs alone. The appendix shows the coding scheme employed in evaluating the
Camtasia data.
Table 6 shows the data realigned to collapse the categories of morphological and syntactic self-corrections together into the broader category of "grammatical" self-repair. This will allow for an easier
comparison with data reported in previous studies discussed earlier. Although there were almost twice as
many grammatical self-repairs as lexical self-repairs, this difference was not statistically significant.
Finally, when we compare the differences in self-corrections of errors with self-corrections of
appropriateness issues, we find that learners self-repaired significantly more errors (z = 2.53, p = .01, r
= .68; see Table 7). In order to see if there were any differences in which types of self-repairs learners
engaged in within the error (EL, EM, ES) and appropriateness (AL, AS, AI) categories (see Table 3) a
Friedman test was performed. Differences within each category were not statistically significant.
SCMC Chatscript
Column A
1a. Pierre: er hat seine geschaft
aufgeraumt 12:59:43
2a. Pierre: bildung C er geht nach hause
und seine garge ist nicht sauber 1:00:08
3a. Daniel: D-Der man ging ins [2a] garage
und es ist sehr schmustivg 1:00:29
4a. Daniel: E-er gi heht raus deines
zimmem er r 1:00:54
5a. Daniel: F-ErDas Garage ist sauber und
er hat eine hacke im hand 1:01:15
6a. Daniel: **in deiner hand 1:01:26
7a. Daniel: seiner* 1:01:31
8a. Daniel: meine ist def seine
geschaftrft? 1:02:19
9a. Daniel: ich habe kaein geschaft nur ein
garage und ein Bettzimmer. 1:02:36
Hard copy of transcript
Column B
1b. Pierre: er hat seine geschaft
aufgeraumt 12:59:43
2b. Pierre: bildung C er geht nach hause
und seine garge ist nicht sauber 1:00:08
3b. Daniel: D-Der man ging ins garage und
es ist sehr schmustig 1:00:29
4b. Daniel: E-er geht raus deines
zimmer 1:00:54
5b. Daniel: F-Das Garage ist sauber und er
hat eine hacke im hand 1:01:15
6b. Daniel: **in deiner hand 1:01:26
7b. Daniel: seiner* 1:01:31
8b. Daniel: seine geschaft? 1:02:19
9b. Daniel: ich habe kein geschaft nur ein
garage und ein Bettzimmer. 1:02:36
Figure 1. Camtasia-enabled chatscript and printed chat log comparison.
Table 5. Self-Repairs Per Word Across the Two Data Collection Methods
(Labels: A (Chat log), B (Camtasia), effect-size r. Cell values not preserved.)
Table 6. Percentage of the Total Number of Self-Repairs (Grammatical vs. Lexical)
                 Grammatical self-repairs   Lexical self-repairs   Other self-repairs (AI & D)
A (Chat log)     6 (67%)                    3 (33%)                0 (0%)
B (Camtasia)     54 (63%)                   28 (33%)               4 (5%)
Note: Morphological and syntactic categories were collapsed together into a larger grammatical category.
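The percentages in Table 6 can be checked with a short calculation. The sketch below uses the counts reported in the table. The function and variable names are illustrative only, and rounding to the nearest whole percent is an assumption about how the published figures were produced.

```python
def percentage_shares(counts: dict[str, int]) -> dict[str, int]:
    """Convert raw counts to whole-number percentages of their total."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


# Counts as reported in Table 6.
chat_log = {"grammatical": 6, "lexical": 3, "other": 0}
camtasia = {"grammatical": 54, "lexical": 28, "other": 4}

for name, counts in (("A (Chat log)", chat_log), ("B (Camtasia)", camtasia)):
    print(name, "total:", sum(counts.values()), percentage_shares(counts))
```

Because of rounding, the Camtasia percentages sum to 101 rather than 100.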
Table 7. Method B Error and Appropriateness Self-Repairs
(Labels: Self-repair type, effect-size r. Cell values not preserved.)
In sum, the data suggest that the Camtasia screen capture method (method B) of data collection and
analysis yields significantly more evidence of SISR than does the chat log method (method A). There are
also significantly more SISRs of errors than appropriateness points. In all cases the effects were of a
strong magnitude as indicated by the high effect-size r measure. No differences across sub-types of error
and appropriateness SISRs were found.
The degree of support for the specific hypotheses presented in the previous section is listed below:
Hypothesis 1: A significantly higher amount of SISR will be found in the screen capture condition
than in the hard copy transcript condition.
Result: Strongly supported
Hypothesis 2a: A substantial amount of CMCovert self-repair is expected.
Result: Strongly supported
Hypothesis 2b: More SISR of grammar-related problems than lexical ones are expected.
Result: Partially supported
Hypothesis 2c: Few appropriateness (A) SISRs relative to the number of error (E) SISRs will be found.
Result: Strongly supported
The data suggest that relying on printed transcripts alone may create the impression that learners do not
self-correct very often in an SCMC environment – clearly a faulty conclusion. Indeed, these numbers
show that evaluating instances of self-correction on the basis of printed chat logs alone leads to an
underestimation by over eight-fold of the amount of self-correction that actually occurred. In fact, the
present data may be a conservative estimate of this difference since Lai & Zhao's (2006) data show an
even higher rate of self-correction in an SCMC setting (29 self-corrections per one hundred words). Such
a pronounced and fundamental mis-characterization of the nature of SCMC interaction can help explain
conclusions such as those in Jepson (2005) who, finding little self-repair in his CMC data, stated that
though both voice and text chats contained various types of repair moves, self-correction was not among
them. Jepson also suggests that self-correction in an SCMC context may be rare because speakers do not
notice their errors, and thus would not need to correct them. Alternatively, he suggests that since self-correction is very dependent on the social context of the interaction (Kormos, 1999), it may be that the
non-native speaker – non-native speaker SCMC context is somehow not conducive to self-correction. For
example, learners may not see the need for accuracy or may perceive self-correction as face threatening.
This rationale, however, contradicts much of the recent CMC interactionist research which suggests that
the CMC environment most likely enhances noticing (Lai & Zhao, 2006; Smith, 2004).
The results from the current study are similar in some ways to previous findings and quite different in
other ways. Yuan (2003) found that learners corrected under 9% of their errors in a CMC environment.
His data, which includes self-repair of spelling errors, suggests that the CMC environment does not
always make errors more salient to learners, at least not verb tense and modal errors, which were never
self-repaired. In order to make Yuan's data more comparable to this study, the spelling self-corrections in
his data need to be removed. This results in a new number of 33 self-repaired errors (down from 44) of
which 20 were grammatical (61%) and 13 were lexical (39%). These numbers are quite similar to those
found in the present study (63% and 33% respectively). It is also interesting to note that the method of
data collection does not seem to influence the relative amount of each type of self-correction recorded
(Table 6). This point is reinforced when we consider that Yuan's data were based on printed transcripts of
the chat interaction.
In terms of which types of SISR learners engage in, it seems the results are largely in conflict with the
existing self-repair literature, which is largely limited to face-to-face studies. Van Hest (1996), for
example, found that almost 40% of SISR were appropriateness repairs compared with 22% of error SISRs.
Kormos (2000) reports that almost 39% of all SISRs were error repairs compared with almost 23%
appropriateness SISRs. This study found a relatively high percentage of "different" repairs (almost 22%)
as well. Kormos also found little difference in grammatical and lexical error self-repairs (16.9% and
14.2% respectively), which is similar in some ways to the findings in the present study.
The little data available that specifically focuses on learner self-correction in a CMC environment
suggests that learners do self-correct in an SCMC task-based setting, perhaps due to a heightened degree
of noticing, which is fostered by the SCMC environment itself. Second, learners tend to focus on errors
rather than appropriateness issues when engaged in SISR, producing virtually no "different" or "rest" repairs.
Third, learners seem to correct grammatical points more often than lexical points, though this difference
was not statistically significant.
Though it is normally of little value to compare CALL with face-to-face studies, it is interesting to note
the pronounced differences in what learners seem to self-correct across the two environments. Taken
together, the face-to-face self-repair literature seems to show a clear preference for lexical self-repairs
over grammatical self-repairs, precisely the opposite of that found in most of the SCMC studies. The
question, then, is why this may be so.
Written communication normally affords more opportunity for attention to form, whereas spoken
language often occurs under more time pressure to achieve fluency (Chapelle, 2003). The SCMC
interaction allows for more processing time, which is conducive to focusing on form. The visual saliency
of the text as well as the permanency of the written word, which enables one to review the previous
"utterances," allows learners to focus their attention on the formal aspects of their output without
disrupting the flow of communication.
Context Influencing the Nature of Self-Correction
Linguistic context
It is not surprising that self-repair of grammar-related problems is so common in the data given the
relatively high percentage of morphological errors made by the learners. Aside from the noted potential
for the CMC environment to enhance a focus on form, it may also be that errors that do not require a
major restructuring of the utterance, but rather merely need to apply simple rules of grammar (as do
morphological errors) are more likely to be self-corrected by L2 speakers (Kormos, 2000). In the SCMC
environment this condition may be enhanced as these relatively "minor" morphological errors are
rendered more salient due largely to their "permanence" on the screen. This may help explain why Table
3 shows nearly a 2:1 ratio of the number of morphological errors self-repaired to the number of syntactic
errors repaired in the Camtasia condition.
Classroom Context
The nature of the language instruction that the learners are used to may influence what they choose to
self-correct. For example, if learners receive instruction that stresses the importance of grammatical
accuracy in successful communication and students regard grammatical errors as serious flaws in their
performance, they may make an effort to correct their grammatical errors. This notion is supported by
Bardovi-Harlig and Dörnyei (1998), who found that grammatical errors were more salient for L2 speakers
in a foreign language setting than for L2 learners in a naturalistic environment. The students in the present
study fall clearly into the former category.
There may also be a complex combination of L1 and L2 linguistic, classroom, and cultural influences that
come together to influence the nature of SISR. Shonerd (1994) notes the seeming selectivity of learner
self-repair and suggests the nature of self-repair may be culturally bound. His Japanese L1 speakers made
more morphological and syntactic self-repairs of their English usage and fewer lexical and pronunciation
repairs than did other L1 groups. In terms of the present data, there were numerous SISRs that involved
the capitalization of German nouns. As mentioned, these SISRs were coded as lexical errors. The
interaction of German and English linguistic factors and the classroom cultural context may have come
together to influence the nature of SISRs in this study.
Limitations of the Current Study
Perhaps the clearest limitation of the current findings is the small sample size (n=8) from which the data
are drawn. Thus, it is hard to generalize from these data. However, the choice was made from the outset
to include only those chat records which contained hard copy evidence of SISR. This resulted in a small
pool of participants (much smaller than the number who actually took part in the study) from which the self-repair data were
drawn. Indeed, this is the only way to really make the comparisons needed to answer research question 1,
which asked whether the hard copy transcript of chat interactions would differ significantly from a
Camtasia-enabled transcript.
Perhaps another limitation was that the interaction was not anonymous. The role of anonymity in CMC
interaction is well established (Zhao, 1998). It could be that knowing who one's interlocutor is has some
effect on the amount and nature of one's SISR. For example, it could be that knowing the identity of one's
interlocutor may affect whether or not one self-corrects depending on the learner's relationship to that
interlocutor. This is an empirical question that could be taken up in future research.
Another limitation is that the tasks learners completed each week, while conforming broadly to Pica,
Kanagy, and Falodun's (1993) jigsaw task type, were slightly different, which may have influenced the
nature of the SISRs. In addition to the large body of literature on the effect of task type on learner
interaction, the self-repair literature also suggests that the frequency and type of self-repair is affected by
the task structure (see, for example, van Lier, 1988). It would be interesting to test van Hest's (1996)
assertion that those tasks requiring more precise expression will result in more A-repair. A first step in
this work would be to establish what exactly draws out "more precise" expression in an SCMC context
that goes beyond the "precision" elicited by the CMC medium itself.
The results of this study suggest that relying on printed chat logs alone when analyzing SCMC data is a
very tenuous undertaking. The recommendation here is to abandon this practice in CMC research in favor
of one similar to that presented in this study, at least when data salient to the specific area of inquiry may
be lost.
Future inquiry using this methodology may include exploring the influence of classroom context on the
type of self-correction. It would be interesting to see if, for example, learners accustomed to a
communicative classroom context engage in a different type of SISR than those learners who are used to
more structure-oriented contexts. Second, because there seems to be a trend toward learners self-repairing
grammatical points rather than lexical ones in a SCMC environment, it would be interesting to explore
this notion further with larger numbers of students perhaps across various target languages. As it stands,
there is a theoretical rationale to explain this occurrence as well as some empirical evidence to support it,
but the data are far from conclusive. Third, the SCMC medium may itself influence the nature of SISR.
One artifact of this medium is the ability learners have of scrolling back in the chat text to review
previous messages. Indeed, Lai & Zhao (2006) base some of their predictions on this assumption. Though
such occurrences in an SCMC setting are well established, the influence of this feature of SCMC
interaction has not specifically been shown to directly impact the nature or amount of SISR. Future
research may wish to explore this idea more explicitly. Finally, it seems that existing models of SISR are
insufficient to account for what occurs in an SCMC context. A model of SISR specific to the SCMC
context would be helpful for future inquiry into this important area of applied linguistics.
1. Self-repair in this paper is considered synonymous with self-initiated self-completed correction. Self
repair and self-correction are used interchangeably.
2. As one reviewer points out, in true covert repair learners may engage in hypothesis testing in their
minds as evidenced in some think aloud protocols. However, I argue that CMCovert self-repair is indeed
unique in that in an SCMC environment we often see many lengthy and embedded self-repairs, whereas
in a face-to-face environment these are often shorter, more direct, and immediate.
3. The number of words produced by each individual was used when calculating this figure and not the
total words produced by the dyad.
SCMC Coding Scheme

Coding symbol: Strikethrough
Indicates text that has been typed and subsequently deleted before message is
Strikethrough shows messages or parts of messages that a learner has typed but deleted before sending the final message. None of the strikethrough text appears on the screen of the interlocutor nor on the hard copy of the chat log.

Coding symbol: Black bar + underline ▌
Indicates text with embedded deletion has been deleted. Black vertical bar indicates where deletion begins. Deleted text is underlined to the left of the bar.
Black bar + underline is used when a message or part of a message which already has some deleted sections is subsequently deleted in its entirety. This coding allows the acknowledgment of deletions of text with embedded deletions.

Coding symbol: [post-hoc inserted text]
Indicates that the text within the brackets has been inserted at a later point in the
Note: A second, third, etc. occurrence of post-hoc inserted text is signified with double/triple brackets respectively [[text]], [[[text]]].

Coding symbol: [post-hoc deleted text]
Indicates that the deleted text within the brackets was deleted at a later point in the
Note: A second, third, etc. occurrence of post-hoc deleted text is signified with double/triple brackets respectively [[text]], [[[text]]].

Coding symbol: [+]
Indicates the point in the message at which the [post-hoc inserted text] was
Note: The point in the message at which a second, third, etc. post-hoc insertion is made is signified with double/triple brackets respectively [[+]],

Coding symbol: [-]
Indicates the point in the message at which the [post-hoc deleted text] was
Note: The point in the message at which a second, third, etc. post-hoc deletion is made is signified with double/triple brackets respectively [[-]], [[[-]]].

Coding symbol: (symbol not preserved)
Indicates the point in the message at which a correction was made.
This code is used for short one or two character corrections such as for capitalization, spelling, typos, etc. This code eliminates the need for using the more lengthy [-][+] in sequence.

Coding symbol: [line number] Example: [3]
Indicates the point in text currently being typed but not yet sent where a new line from the interlocutor appears on the screen of the target
Often a line from the interlocutor will appear on the screen mid-way through a message which is in the process of being typed. This code indicates both the point at which this new message appears and its line number on the chat log/chatscript. For example in the chat text below Jordan's line 3 appears while Katarina is typing line 6. Specifically, it occurs immediately after Katarina types the word "nimmt".
The author wishes to thank the following people (in alphabetical order): Shana Bell, Jamison Gray, Peter
Lafford, and, of course, the participants in the study. Also, many thanks go out to the reviewers and those
at LLT for such insightful suggestions and close reading of the manuscript.
Bryan Smith is Assistant Professor of Educational Linguistics at Arizona State University. His research
interests include language learner interaction and computer-mediated communication in the
second/foreign language classroom.
Bardovi-Harlig, K., & Dörnyei, Z. (1998). Do language learners recognize pragmatic violations?
Pragmatic versus grammatical awareness in instructed L2 learning. TESOL Quarterly, 32, 233-262.
Buckwalter, P. (2001). Repair sequences in Spanish L2 dyadic discourse: A descriptive study. The
Modern Language Journal, 85, 380-397.
Camps, J. (2003). The analysis of oral self-correction as a window into the development of past time
reference in Spanish. Foreign Language Annals, 36, 233-242.
Chapelle, C. (2003). English language learning and technology: Lectures on applied linguistics in the age
of information and communication technology. Philadelphia: John Benjamins.
Doughty, C., & Williams, J. (Eds.). (1998). Focus on form in classroom second language acquisition.
Cambridge: Cambridge University Press.
Fathman, A. (1980). Repetition and correction as an indication of speech planning and execution
processes among second language learners. In H. Dechert, & M. Raupach (Eds.), Towards a cross-linguistic assessment of speech production (pp. 77-85). Frankfurt: Lang.
Foster, P., & Ohta, A. S. (2005). Negotiation for meaning and peer assistance in second language
classrooms. Applied Linguistics, 26, 402-430.
Gass, S. M., & Mackey, A. (2000). Stimulated recall methodology in second language research. Mahwah,
NJ: Erlbaum Associates.
Izumi, S. (2003). Comprehension and production processes in second language learning: In search of the
psycholinguistic rationale of the output hypothesis. Applied Linguistics, 24, 168-196.
Jepson, K. (2005). Conversations and negotiated interaction in text and voice chatrooms. Language
Learning & Technology, 9(3), 79-98.
Kormos, J. (1999). Monitoring and self-repair in L2. Language Learning, 49, 303-342.
Kormos, J. (2000). The role of attention in monitoring second language speech production. Language
Learning, 50, 343-384.
Lai, C., & Zhao, Y. (2006). Noticing and text-based chat. Language Learning & Technology, 10(3), 102-120.
Lee, L. (2002). Synchronous online exchanges: A study of modification devices on non-native discourse.
System, 30, 275-288.
Lennon, P. (1990). Investigating fluency in EFL: A quantitative approach. Language Learning, 40, 387-417.
Levelt, W. (1983). Monitoring and self-repair in speech. Cognition, 14, 41-104.
Long, M.H., & Robinson, P. (1998). Focus on form: Theory, research, and practice. In C. Doughty & J.
Williams (Eds.), Focus on form in classroom SLA (pp. 15-41). New York: Cambridge University Press.
Ohta, A. S. (2001). Second language acquisition processes in the classroom setting. Learning Japanese.
Mahwah, NJ: Erlbaum Associates.
Pellettieri, J. (1999). Why-talk? Investigating the role of task-based interaction through synchronous
network-based communication among classroom learners of Spanish. Unpublished doctoral dissertation,
University of California, Davis.
Pica, T., Kanagy, R., & Falodun, J. (1993). Choosing and using communication tasks for second language
research and instruction. In G. Crookes & S. M. Gass (Eds.), Tasks and second language learning (pp. 9–
34). Clevedon, UK: Multilingual Matters.
Postma, A., & Kolk, H. (1992). The effects of noise masking and required accuracy on speech errors,
disfluencies, and self-repairs. Journal of Speech and Hearing Research, 35, 537-544.
Poulisse, N. (1997). Compensatory strategies and the principles of clarity and economy. In G. Kasper & E.
Kellerman (Eds.), Communication strategies: Psycholinguistic and sociolinguistic perspectives (pp. 49-64).
London: Longman.
Schmidt, R. (1993). Awareness and second language acquisition. Annual Review of Applied Linguistics,
13, 206–226.
Shehadeh, A. (2002). Comprehensible output, from occurrence to acquisition: An agenda for acquisitional
research. Language Learning, 52, 597-647.
Shehadeh, A. (2001). Self-and other-initiated modified output during task-based interaction. TESOL
Quarterly, 35, 433-457.
Shehadeh, A. (1999). Non-native speakers' production of modified comprehensible output and second
language learning. Language Learning, 49, 627-675.
Shonerd, H. (1994). Repair in spontaneous speech: A window on second language development. In V.
John-Steiner, C. Panofsky & L. W. Smith, (Eds.) Sociocultural approaches to language and literacy: An
interactionist perspective (pp. 82-108). Cambridge: Cambridge University Press.
Smith, B. (2004). Computer-mediated negotiated interaction and lexical acquisition. Studies in Second
Language Acquisition, 26, 365-398.
Smith, B. (2005). The relationship between negotiated interaction, learner uptake, and lexical acquisition
in task-based computer-mediated communication. TESOL Quarterly, 39, 33-58.
Smith, B., & Gorsuch, G. J. (2004). Synchronous computer mediated communication captured by
usability lab technologies: New interpretations. System, 32, 553-575.
Swain, M. K. (1985). Communicative competence: Some roles of comprehensible input and
comprehensible output in its development. In S. M. Gass & C. G. Madden (Eds.), Input in second
language acquisition (pp. 235–253). Rowley, MA: Newbury House.
Swain, M. (2001). Integrating language and content teaching through collaborative tasks. Canadian
Modern Language Review, 58, 44-63.
Swain, M. (2005). The output hypothesis: Theory and research. In E. Hinkel (Ed.), Handbook on research
in second language teaching and learning (pp. 471-484). Mahwah, NJ: Lawrence Erlbaum.
Swain, M., & Lapkin, S. (1995). Problems in output and the cognitive processes they generate: A step
towards second language learning. Applied Linguistics, 16, 371-391.
van Hest, E. (1996). Self-repair in L1 and L2 production. Tilburg: Tilburg University Press.
van Lier, L. (1988). The classroom and the language learner. New York: Longman.
Verhoeven, L. T. (1989). Monitoring in children's second language speech. Second Language Research, 5,
Wouk, F. (2005). The syntax of repair in Indonesian. Discourse Studies, 7, 237-258.
Zhao, Y. (1998). The effects of anonymity on peer review. International Journal of Educational
Telecommunication, 4, 311-346.
Language Learning & Technology
February 2008, Volume 12, Number 1
pp. 104-108
Jeff McQuillan
Center for Educational Development
Stephen D. Krashen
University of Southern California
Cobb (2007) argues that free reading cannot provide L2 readers with sufficient opportunities for acquiring
vocabulary in order to reach an adequate level of reading comprehension of English texts. In this paper,
we argue that (1) Cobb severely underestimates the amount of reading even a very modest reading habit
would afford L2 readers, and therefore underestimates the impact of free reading on L2 vocabulary
development; and (2) Cobb’s data show that free reading is in fact a very powerful tool in vocabulary
Krashen (1989, 2004) and others have argued that free reading is a major contributor to vocabulary
development among both first and second language readers (see also McQuillan, 1998). Free and
extensive reading advocates have claimed that such reading can and does provide acquirers with sufficient
resources to reach a high level of literacy development.
Cobb (2007) claims, however, that free reading cannot possibly provide sufficient opportunities for L2
readers to reach a high level of vocabulary acquisition, of going "all the way" to the state of a fluent adult
L2 reader. Cobb cites evidence showing that vocabulary acquisition requires a minimum of six to ten
exposures to a word family, and that the minimal number of word families required for comprehension of
non-specialist materials in English is 3000 to 5000, depending on which estimate is used (2007, p. 41).
For this study, Cobb used the low end of these estimates (six exposures to a word family, 3000 word
family level).
Cobb analyzed how frequently vocabulary occurred in three subsets of a corpus of academic, fiction, and
newspaper texts, each subset containing between 163,000 and 179,000 words, in order to determine if
words occur in sufficient frequency for acquisition (see Table 1). (Cobb explains that the newspaper
sample is about 100 pages of newspaper reading, the academic sample about 17 scientific papers, and the
fiction sample about six stories the size of Alice in Wonderland.) Cobb estimated that in a "year or two"
of language study, a student could read the equivalent of one of these three subsets, or roughly 175,000
words (p. 41). He considered this to be an "optimistic" estimate.
Table 1. Number of Words in Each Sample
Words in Sample
Cobb then randomly selected ten word families from each of the 1000, 2000, and 3000 most frequently
appearing word families in English and determined how many times those families appear at each level
for each of the three genres of reading material. Using corpus analysis, he found that while the frequency
of recurrence for the 10 word families would probably be sufficient at the 1000 word level for any of the
Copyright © 2008, ISSN 1094-3501
Jeff McQuillan and Stephen D. Krashen
A Response to Cobb (2007)
three genres, free reading would be insufficient to attain the 2000 and 3000 word level. As illustrated in
Table 2 (from Cobb, Table 1), one word out of the sample of ten does not appear often enough in
newspapers, two out of ten in academic writing, and three out of ten in fiction.
Table 2. Results for 2000 Word Frequency Word Families: Frequency of Occurrence (From Cobb, 2007,
Table 1)
The situation is even more serious at the 3000 word level, with six out of ten failing to make the
minimum threshold of six occurrences in the press corpus, eight out of ten in the academic corpus, and
five out of ten in the fiction corpus (Table 3).
Table 3. Results for 3000 Word Frequency Word Families: Frequency of Occurrence (From Cobb, 2007,
Table 1)
Cobb thus concluded that "even the largest plausible amounts of free reading will not take the learner very
far in the 3000-family zone" (p. 44).
Cobb’s analysis suffers from two major problems. First, the amount of reading that Cobb proposes as
"optimistic" is, in fact, pessimistic in the extreme. The number of words read is a product of time spent
reading and reading rate. Table 4 summarizes the results from 11 studies that have reported L2 reading
rates with readers from a variety of L1 backgrounds in both EFL and ESL settings. Fraser (2007)
summarizes the results of several studies included in Table 4 (Cushing-Weigle & Jensen, 1996; Haynes &
Carr, 1990; Nassaji & Geva, 1999; Oller & Tullius, 1973; Taguchi, 1997), and data reported from these
studies are taken directly from her Appendix A. For one study (National Institute for Literacy, 2003), oral
reading rates were used. Studies are ordered by average reading rate in words-per-minute. L2 reading
proficiency is based on the researcher’s own classification of the students’ levels.
Table 4. Average Reading Rates of L2 Readers
(Surviving column headings: L2 Reading; Average L2 Reading Rate. Row alignment not preserved; entries are listed in the order in which they appear.)
Studies: Taguchi, Takayasu-Maass, & Gorsuch (2004); Haynes & Carr (1990); Hirai (1999) (note 1); National Institute for Literacy; Taguchi & Gorsuch (2002); Taguchi (1997); Fraser (2007) (note 3) – China Group; Fraser (2007) (note 3) – Canada Group; Cushing-Weigle & Jensen (1996); Nassaji & Geva (1999); Oller & Tullius (1973)
Participants: 1st year college EFL; 1st – 3rd year college EFL; Adult ESL; 1st-4th year undergraduate; Undergraduate EFL; Graduate ESL; Undergraduate & graduate; Undergraduate EFL; 1st year college EFL; Undergraduate EFL; 3rd year undergraduate EFL
Average L2 reading rates: 83 wpm; 86 wpm; 115 wpm; 127 wpm; 135.5 wpm; 87.5 wpm; 102 wpm (oral); 140.4 wpm; 158 wpm; 179 wpm; 206 wpm
Table notes: Table 1, whole group score; Cohort 11 of all ESL readers; Table 1, Task 4 – Learning
It should be noted that these studies probably underestimate reading rates achieved during free reading.
The texts used to determine reading rate in all cases were selected by the researcher, and thus may have
been too difficult for the reader or on a topic about which the reader lacked sufficient background
knowledge. It seems likely that students engaged in free reading, where the text is self-selected and thus
probably a closer fit for the reader’s proficiency and background knowledge, would read at a faster rate.
It is clear from Table 4 that L2 reading rates vary widely, with more proficient readers reading faster than
less proficient ones. We conservatively choose 100 wpm as an average reading rate for our analysis,
which is slightly below the average rate for readers at a beginning level of L2 reading proficiency for the
studies included here (106.8 wpm). An adult L2 reader reading at a speed of 100 words-per-minute would
take 1,750 minutes to go through 175,000 words of text. That is the equivalent of 29.2 hours of reading
which, over the course of two years of language study, would amount to a mere 2.4 minutes of free
reading per day. What Cobb is actually demonstrating is that a very small amount of reading over a
period of 12 to 24 months would not be sufficient to make one a fluent L2 reader.
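The arithmetic behind this estimate can be laid out step by step. The sketch below uses only the figures given in the text (175,000 words, 100 words per minute, two years of study). The constant names are illustrative, and counting two years as 730 days is an assumption.

```python
WORDS_IN_SUBSET = 175_000   # Cobb's estimate of reading over "a year or two"
READING_RATE_WPM = 100      # conservative average L2 reading rate
STUDY_DAYS = 2 * 365        # two years of language study (assumed 730 days)

total_minutes = WORDS_IN_SUBSET / READING_RATE_WPM  # minutes needed to read the subset
total_hours = total_minutes / 60                    # the same reading time in hours
minutes_per_day = total_minutes / STUDY_DAYS        # daily reading time over two years

print(f"{total_minutes:,.0f} minutes = {total_hours:.1f} hours")
print(f"{minutes_per_day:.1f} minutes of reading per day")
```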
Second, an examination of the "total" columns in Tables 2 and 3 reveals that a reader who read
newspaper, academic, and fiction texts (all three subsets), for a total of about 517,000 words, would easily
pass even the more demanding criterion of ten encounters for all of the words at the 2000-family level,
and for eight of the ten at the 3000-family level.
An L2 acquirer reading 100 words-per-minute would be able to accomplish this in a little more than 86
hours, or at the rate of one hour per day, the equivalent of a single academic quarter (approximately 13
weeks). Free reading across a variety of genres can indeed give you the necessary vocabulary for
adult-level fluency. The contrast between this estimate and Cobb’s is presented in Table 5.
Table 5. Two Estimates of Amount of Reading (at 100 words per minute)
Cobb: 2.4 minutes per day over 2 years
McQuillan & Krashen: 60 minutes per day over 1 academic quarter
What is surprising about Cobb’s data is just how powerful free reading really is, even at the minimal
levels he used. Even if a reader stuck to one genre, and read as little as Cobb suggests, a lot would be
accomplished. With just 100 pages of newspaper text alone, for example, one can make significant
progress toward the 2000-family level. Cobb’s analysis shows that you would have sufficient encounters
for acquisition of nine of the ten sample word families. Similar progress could be made by reading the
equivalent of six books the length of Alice in Wonderland, which, while perhaps insufficient for academic
purposes, is very impressive.
A reader who dedicated a modest 20 minutes per day to free reading would, over Cobb’s hypothetical two
years of language study, encounter 1,460,000 words – a substantial number, more than eight times
the number of words in Cobb’s estimate. It seems likely, then, that this amount would allow one to reach
even the 5000-word level.
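The same calculation can be run in the other direction, from daily reading time to total words encountered. This is a sketch only: the function name and the 365-days-per-year figure are assumptions introduced for illustration.

```python
def words_encountered(minutes_per_day, words_per_minute, years, days_per_year=365):
    """Total words read, assuming reading happens every day of the period."""
    return minutes_per_day * words_per_minute * years * days_per_year


# 20 minutes per day at 100 wpm over two years
total = words_encountered(20, 100, years=2)

# Compare with the 175,000-word sample used in Cobb's estimate
ratio = total / 175_000
print(f"{total:,} words, {ratio:.2f} times the 175,000-word sample")
```

With these inputs the total is 1,460,000 words, a little over eight times the 175,000-word figure.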
Free reading may not be sufficient to meet the needs of all demanding academic or specialized texts,
although nothing in Cobb’s analysis would preclude that possibility. Further research should take into
account a more realistic estimate of the volume of reading by the typical L2 acquirer. Cobb has shown
us, however, that for a modest investment in time, free reading does appear to be more than adequate to
reach the vocabulary levels that he argues are necessary for a fluent L2 reader.
Jeff McQuillan is a Senior Research Associate at the Center for Educational Development in Los
Angeles, California.
Stephen D. Krashen is Emeritus Professor of Education at the University of Southern California in Los
Angeles, California.
References
Cobb, T. (2007). Computing the vocabulary demands of L2 reading. Language Learning & Technology,
11(3), 38-63. Retrieved October 7, 2007 from
Cushing-Weigle, S., & Jensen, L. (1996). Reading rate improvements in university ESL classes.
CATESOL Journal, 9, 55-71.
Fraser, C. (2007). Reading rate in L1 Mandarin Chinese and L2 English across five reading tasks. The
Modern Language Journal, 91(3), 372-394.
Haynes, M., & Carr, T.H. (1990). Writing system background and second language reading: A
component skills analysis of English reading by native speakers of Chinese. In T.H. Carr & B.A. Levy
(Eds.), Reading and its development: Component skills approaches (pp. 375-418). San Diego, CA:
Academic Press.
Hirai, A. (1999). The relationship between listening and reading rates of Japanese EFL learners. The
Modern Language Journal, 83(3), 367-384.
Krashen, S.D. (1989). We acquire vocabulary and spelling by reading: Additional evidence for the Input
Hypothesis. The Modern Language Journal, 73, 440-464.
Krashen, S.D. (2004). The power of reading, 2nd edition. Portsmouth, NH: Heinemann.
McQuillan, J. (1998). The literacy crisis: False claims, real solutions. Portsmouth, NH: Heinemann.
Nassaji, H., & Geva, E. (1999). The contribution of phonological and orthographic processing skills to
adult ESL reading: Evidence from native speakers of Farsi. Applied Psycholinguistics, 20, 241-267.
National Institute for Literacy (NIFL). (2004). Adult reading components study. Washington, DC.
Retrieved October 7, 2007, from
Oller, J. W., & Tullius, J. R. (1973). Reading skills of non-native speakers of English. International
Review of Applied Linguistics, 11, 69–79.
Taguchi, E. (1997). The effects of repeated readings on the development of lower identification skills of
FL readers. Reading in a Foreign Language, 11, 97-119.
Taguchi, E., & Gorsuch, G. (2002). Transfer effects of repeated EFL reading on reading new passages: A
preliminary investigation. Reading in a Foreign Language, 14(1), 43-65.
Taguchi, E., Takayasu-Maass, M., & Gorsuch, G. (2004). Developing reading fluency in EFL: How
assisted repeated reading and extensive reading affect fluency development. Reading in a Foreign
Language, 16(2), 70-96.
Language Learning & Technology
February 2008, Volume 12, Number 1
pp. 109-114
Tom Cobb
Université du Québec à Montreal
I was glad to receive a response from Jeff McQuillan and Stephen Krashen to my piece "Computing the
Demands of Vocabulary Acquisition from Reading" (Language Learning & Technology, October, 2007),
because drafting a reply forces me to be even more clear about what I am saying. I was initially surprised
to see the lead responder was an expert in first language (L1) reading rather than second (L2) but on
second thought McQuillan’s (The Literacy Crisis, 1998) participation makes sense.
Position Review
I argued that building an adequate functional L2 lexicon for reading from reading alone (Krashen’s
longstanding position) cannot be done by the majority of learners in the normal time frame of instructed
L2 learning. An example of such a time frame would be the year or two of ESL preparation granted to
foreign students on arrival in a North American university. A minimal functional lexicon is 3,000 word
families, which provides about 90% known-word coverage of average texts. But lexicon building from
reading alone will stall shortly after 2,000 families. This happens for the demonstrable reason that
3,000-level words (and other less frequent words) do not appear often enough in the amount of reading of
natural texts that such learners are likely to accomplish. Research has shown that words need to appear
minimally six times for learning to take place.
As proof I offered three samples of natural text at what I proposed was the outer limit of such an amount,
namely any of the journalism, academic, or literary sub-corpora of the Brown corpus. Each of these
amounts to about 175,000 words, of which 10% are words beyond the 2,000-most-frequent level (minus
proper nouns). Through elementary corpus analysis, I showed that a learner who managed to read any one
of these collections would meet no more than half of the third thousand word families six times apiece. A
similar analysis of the collected works of a major author (300,000+ words) and another of an entire set of
graded readers (375,000+ words) pointed to the same conclusion: reading these texts in their entirety
cannot provide enough repeated exposures to enough 3,000-level vocabulary to support the acquisition of
a minimal functional lexicon.
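The kind of count involved can be illustrated with a minimal sketch. It is not the procedure used in the original paper: the function name, the toy text, and the hand-made mapping from word forms to family headwords are all hypothetical, and a real analysis would use full word-family lists and a corpus such as the Brown sub-corpora.

```python
from collections import Counter
import re


def families_met(text, family_of, min_encounters=6):
    """Return the target families that occur at least min_encounters times.

    family_of maps a word form to its family headword (toy data below);
    word forms that are not in the mapping are ignored.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(family_of[t] for t in tokens if t in family_of)
    return {family for family, n in counts.items() if n >= min_encounters}


# Hypothetical two-family target list
toy_family_of = {
    "migrate": "migrate",
    "migrates": "migrate",
    "migration": "migrate",
    "obscure": "obscure",
}
toy_text = "Birds migrate. " * 4 + "Migration matters. " * 2 + "An obscure fact."

met = families_met(toy_text, toy_family_of)
share = len(met) / len(set(toy_family_of.values()))
print(met, share)
```

In the toy text the "migrate" family is met six times and "obscure" only once, so half of the target families reach the six-encounter criterion.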
Critics’ Response
I would have expected a critique of this analysis to focus on the assumption that most of the words in a
text need to be known for reading to proceed successfully, given the abiding belief that learners can easily
expand their vocabularies by guessing new word meanings from context, as was assumed but never
shown in many classic accounts of the reading process. So I was surprised that the critics’ actual problem
was with the claim that L2 learners would have trouble reading 175,000 words of fairly difficult natural
text in a year or two. Doing the math, McQuillan and Krashen propose that even "reading relatively
slowly at a speed of 100 words-per-minute," (p. 106) L2 learners should be able to read 175,000 words in
1,750 minutes or 29.2 hours, which, spread over two years, "amounts to only 2.4 minutes of free reading
per day" (p. 106). Such readers would make light work of any of the Brown sub-corpora, or indeed all
three of them.
As a teacher and coordinator of many L2 reading courses and programs, I wondered if we were talking
about the same world. In my experience, even strong ESL readers find small amounts of unsimplified text
fairly hard going, particularly if the text type is expository rather than narrative. I myself after decades of
working in and around French cannot get through Le Monde before the next edition is on the doorstep.
Indeed, expository academic or journalistic text has always been the stuff of the intensive reading course,
Copyright © 2008, ISSN 1094-3501
Tom Cobb
Response to McQuillan and Krashen (2008)
wherein a good deal of scaffolding is provided and no great volume of text actually gets read. Were my
critics and I talking about the same kind of learner and the same kind of reading?
The Reading Rate Research
To support their position, McQuillan and Krashen cite a half dozen reading rate studies, conveniently
gathered in the literature review of a recent study by Fraser (2007). In this research and Fraser’s own
study, L2 reading rates of 100 words-per-minute (wpm) and even somewhat higher appear to be the norm.
But a little digging below the numbers raises questions about their applicability to the matter at hand.
The first thing to note about the Fraser (2007) study is that while my critics use its results to establish how
much L2 readers can read, the researcher herself interprets the data to show how slow and arduous L2
reading is even for experienced readers, and how it remains so for long periods, even for those living and
studying in the L2 culture.
The second point is the nature of the participant groups in Fraser’s study, neither of which greatly
resembles a group of learners who are at the point of taking on the third thousand words as a prelude to
academic reading. One consists of students who have specialized in English language and literature
(English majors in a Chinese university), while the other consists of learners well into their studies in a
Canadian university; some of them had lived in Canada for as long as 12 years (2007, p. 378).
Third and most important is the nature of the material Fraser’s subjects read at the cited rates of 135 and
140 wpm. Fraser reports that in terms of grade equivalent level, the two experimental texts were found to
be suitable for use with Grade 9 or 10 high school students. Her analysis using Vocabprofile confirms
their non-university character; the frequency profiling revealed large proportions of first 2,000 level lexis
-- 83% in one text and 86.8% in the other (p. 394). These proportions of basic lexis are substantially
higher than those consistently found in more typical university-level texts, as represented by, say, the
academic section of the Brown corpus. Table 1 shows randomly chosen profiles from segments of this
corpus. The mean coverage of the first 2,000 words in the Brown texts is only 78.53%, with a very small
standard deviation (3.01%). The differences between the Brown mean and the means of Fraser’s two
experimental texts (5% and 8%) may seem minor, but from the L2 reader’s perspective, the added load of
5% more non-basic vocabulary means one more ‘hard’ word per two lines of text. An added 8% means
one more in almost every line.
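The per-line figures follow from simple arithmetic, sketched here under one labeled assumption: a line of text holds about 10 running words. The function name is illustrative; the coverage percentages are those reported above.

```python
def extra_hard_words_per_line(easy_coverage, baseline_coverage, words_per_line=10):
    """Additional non-basic words per line when first-2,000 coverage falls
    from easy_coverage to baseline_coverage (both given in percent).

    words_per_line=10 is an assumption about typical line length.
    """
    gap = (easy_coverage - baseline_coverage) / 100
    return gap * words_per_line


brown_mean = 78.53  # mean first-2,000 coverage in the Brown academic segments
for fraser_text in (83.0, 86.8):
    extra = extra_hard_words_per_line(fraser_text, brown_mean)
    print(f"{fraser_text}% vs {brown_mean}%: {extra:.2f} extra hard words per line")
```

On this assumption the two gaps come to roughly 0.45 and 0.83 extra non-basic words per line, that is, about one every two lines and nearly one in every line.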
Table 1. Lexical Frequency Profiles across Disciplines (coverage percentages).
[Table body not recoverable; surviving row and column labels: "No. of", "1000 +", "Social Psychology", "Medicine (anatomy)", "1K +", "2K +"]
Notes: (1) Table from Cobb & Horst, 2004. (2) Segments from the Brown corpus are described in the
Brown University website accessible from the Compleat Lexical Tutor at (3) AWL
stands for Academic Word List.
To summarize, the experimental materials were easy texts for these learners. As we will see, the other
reading rate studies from Fraser (2007) that McQuillan and Krashen cite are similarly inapplicable to the
question under discussion. Either the readers were too advanced to be considered typical classroom ESL
learners, or the reading materials were much simpler than those specified in my original paper, or both.
But before examining these studies more closely, it may be useful to remind ourselves of the kind of
learners and texts this discussion is about. At issue are the many ESL and EFL learners worldwide who
are trying to move beyond a basic 2,000-word lexicon, by reading texts that contain significant amounts
of post-2,000 lexis. Such learners and texts are not hypothetical but typical. In the case of learners,
Laufer’s (2000) review of seven vocabulary size studies showed that university entry level ESL/EFL
learners in three Asian, two Middle Eastern, and three European countries were working with an average
of 2,100 known word families (SD 977). In the case of texts, those bearing post-basic vocabulary are
common if not the norm in academic and professional contexts; as mentioned, the Brown academic
corpus bears an average 10% (SD 3%), that is, one post-2,000 word, on average, in every line.
As I have already shown, Fraser’s (2007) reading rates are simply not applicable to the learners and texts
in question. Nor are the four other main sources of rate evidence reviewed in her study and cited by
McQuillan and Krashen. First, Nassaji and Geva’s (1999) study of 60 Farsi-speakers hardly pertains to
ESL learners at the beginning of their academic studies. They were in fact "graduate students at a major
Canadian university… who had been living in Canada for 3 to 6 years" (p. 246). Their mean
comprehension scores on the Nelson-Denny Reading Test, a measure designed for L1 readers, were over
50% (p. 251). The several Taguchi studies cited (e.g., Taguchi, Takayasu-Maass, & Gorsuch, 2004) all
involve repeated readings of simplified texts from the Heinemann New Wave series of graded readers,
with an upper limit of 2200 word families (Hill, 1997). Hirai’s (1999) study involves an experimental
reading task that measures not reading rate but rather "rauding" rate, following Carver’s (1990) notion
that reading rate is best measured using texts that the readers find easy to read. Similarly, Haynes and
Carr’s (1990) study used a text chosen specifically because of its familiar topic and the fact that it
"contained numerous 'lexical familiarizations'… definitions, examples, stipulations, synonyms,
paraphrases, illustrations, etc., which the author had provided to clarify the meaning of new terms
introduced in the text" (p. 396). It goes without saying that not all authentic academic texts would be
so lexically familiarized.
Thus the L2 reading rate research cited is not applicable to academic reading. Even if it were, simple
multiplication of reading rate times hours and days would only tell us what learners might be able to read
in principle. This would still have to be confirmed in studies of what some particular learners had read in
fact. Therefore I returned to some of the research literature where I thought I remembered specific
amounts of L2 reading having been documented. Was any of it (a) of the type we are talking about and
(b) as much as McQuillan and Krashen claimed should be possible?
The Amount-of-Reading Research
Not all extensive reading studies produce such useful specifics as numbers of words or pages read, but
there are some. The largest amounts of L2 reading on record seem to hail from Japanese contexts. One is
Rosszell’s (2007) doctoral study, in which university learners read an average 40 pages per week, which
at 300 words per page, 12,000 words per week, 40 weeks per year, amounts to almost half a million
words in a year. This is on the order of the rate McQuillan and Krashen hypothesize. However, the texts
that these learners were reading were graded readers from Oxford’s Bookworm series, levels 4 to 6 (6 is
highest). As mentioned in my original paper, this type of text does not include adequate inputs from the
third thousand level and indeed makes no claim to.
An even larger amount of reading was recorded by Beniko Mason (2004). Her study involved 18-year-old
English majors at a junior college in Japan reading fully 1000 pages, or 250,000 words, per semester
(although not all participants were able to meet these targets). Extrapolating this reading to four terms or
two years, the amount of reading would indeed seem to be about 1 million words. This is the kind of
figure McQuillan and Krashen are talking about, and it is equivalent to the size of the entire Brown
corpus. But in fact, it was not the Brown corpus or anything resembling it in lexical composition that the
learners were reading. Again, all this reading is of graded or simplified texts – much of it at the very
elementary 600-word level.
Certainly, there is nothing wrong with reading simplified texts! But learners reading large amounts of
such text do not make the point that McQuillan and Krashen wish to make.
In fact, reading simplified texts is a very good thing for language learners to do, for many reasons
including but not limited to increasing vocabulary (Horst, 2005; Pigada & Schmitt, 2006). Indeed, the
second part of my LLT piece described ways of using technology to expand the library of graded
materials that are accessible to ESL teachers and learners. Text computing can help us expand the range
(to include more expository material) and vocabulary level (to provide a smooth rise up to a vocabulary
size of 3,000+ word families) of available graded materials. At present, there is no smooth rise. Rather,
"there exists a wide gap between the highest level of graded readers and the vocabulary demands of
academic text and unsimplified novels" (Nation, forthcoming, p. 1). Even the best graded reader series,
e.g., Oxford’s Bookworm series, make no claims beyond 2,500 words. The Longman Bridge Series
(1945) was a systematic grading of materials up to 8,000 words, but it is long out of print. The new
Penguin/Longman Active Reading series may claim successor status to Bridge with its 3,000 word-family
target, but none of the studies I located had used this series. In other words, the large amounts of reading
reported in some of the published research are of a type that by definition cannot be the route to an
adequate functional reading lexicon.
But are there no studies involving the reading of unsimplified texts? Given the number of learners
worldwide who are trying to improve their reading ability for advanced study through English, it is
surprising how little research addresses this common objective. An exception is work by Parry, who
looked at academic ESL learners reading large amounts of unsimplified texts in credit courses at U.S.
universities. A preliminary goal of these studies was to estimate how many words the learners had
managed to read. In a case study of two learners, one read as few as 7,500 words of an assigned
anthropology text over a complete term, while another read as many as 72,000 words from the same
textbook (Parry, 1997).
At first glance, the second reader’s rate lines up nicely with McQuillan and Krashen’s estimate. If 72,000
words of academic text can be read in a term, and two years is four terms, and the learner is taking four
such courses at a time, then he is reading over one million (72,000 x 4 x 4 = 1,152,000) words of natural
academic text in two years. This is the size of the whole Brown corpus and then some, and its lexical
composition is probably similar. So can we conclude that some ESL learners can read at the rate
McQuillan and Krashen propose, and presumably experience the vocabulary growth that goes with it?
Not exactly.
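The volume extrapolations in this section (Rosszell, Mason, and Parry's faster reader) are straightforward multiplications and can be reproduced with a short sketch. The helper name is illustrative; all inputs are the figures quoted in the text.

```python
def extrapolate(base_words, *multipliers):
    """Multiply a base word count by each factor in turn."""
    total = base_words
    for factor in multipliers:
        total *= factor
    return total


# Rosszell (2007): 40 pages/week x 300 words/page, over 40 weeks
rosszell = extrapolate(40 * 300, 40)

# Mason (2004): 250,000 words per semester, over four semesters
mason = extrapolate(250_000, 4)

# Parry (1997), faster reader: 72,000 words per course per term,
# four terms, four such courses at a time
parry = extrapolate(72_000, 4, 4)

print(f"{rosszell:,} {mason:,} {parry:,}")
```

The three totals are 480,000, 1,000,000, and 1,152,000 words. The multiplication says nothing about the lexical composition of what was read, which is the point at issue.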
At the start of the experiment, Parry asked her two readers to write down all the words they thought were
new or difficult while reading, along with the page number. Then at the end of the academic session the
readers were asked to provide the meanings of words they had noted, first out of context and then in the
page context where the word was first noticed. Out of context, the 72,000-word reader could remember
having seen only 5% of the words he had originally noted, and with the help of the context could provide
correct or partly correct meanings (in his L1) for only 28%. The 7,500-word reader, on the other hand,
could remember seeing 29% of her words and could give correct or partly correct meanings for 63%. In
other words, the fast reader was reading a rather large amount but with very little vocabulary growth,
while the slow reader was not getting her reading done but was learning some of the new words in the
small amounts she did manage to complete.
These and a number of similar Parry studies are small, but they nonetheless ring true for anyone who has
taught academic ESL reading in North America or elsewhere. Learners in such courses typically struggle
to get through a few pages, or else read quickly with low comprehension and almost no vocabulary growth.
Conclusion: Not Reading Alone
Parry’s case studies suggest that most academic ESL learners cannot read their way to an adequate
functional second lexicon. The scale of her research is not big enough to be conclusive, but there is no
evidence I know of to contradict it. Also, Parry’s findings are congruent with some other well established
evidence. Replicated research shows that reading becomes arduous and comprehension suffers when
unknown word densities exceed 5% (e.g. Laufer, 1989). Nation (2006) sets the ideal criterion as low as
2%. But academic readers with knowledge of about 2,000 words are reading texts bearing at least 10%
unknown items, and as their ESL teachers can attest, their pain is all too real.
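The density criteria can be made concrete with a minimal sketch. The constant names, the toy sentence, and the toy known-word set are hypothetical; the sketch also matches word forms rather than word families, which is a simplification.

```python
import re

LAUFER_1989_MAX = 0.05    # density above which reading becomes arduous
NATION_2006_IDEAL = 0.02  # stricter ideal criterion


def unknown_density(text, known_words):
    """Proportion of running words in text that are not in known_words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    unknown = sum(1 for t in tokens if t not in known_words)
    return unknown / len(tokens)


# Toy reader vocabulary and toy sentence (illustrative only)
toy_known = {"the", "cell", "is", "a", "small", "part", "of", "body",
             "and", "it", "has"}
toy_text = "The cell is a small part of the body and it has a nucleus"

density = unknown_density(toy_text, toy_known)
print(f"unknown density: {density:.1%}")
print("exceeds 5% criterion:", density > LAUFER_1989_MAX)
print("exceeds 2% criterion:", density > NATION_2006_IDEAL)
```

Here one unknown word in fourteen already puts the density above both criteria; a text with 10% unknown items is further still from either threshold.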
If McQuillan and Krashen have relevant counter-evidence to any of this, I welcome it. Then the
discussion can proceed on a different basis. Until then, the adequacy of free reading is an idea with high
credibility in the time frame of L1 acquisition, and some credibility in an extended time frame of L2
acquisition under conditions of exceptional motivation. But carried into the typical time frame of
instructed L2 acquisition, it is an idea that grossly misrepresents the problems faced by L2 readers who
need to read to learn in their second languages. For these learners, an adequate second lexicon will not
happen by itself; it will be provisioned through well-designed instruction including but not limited to
Tom Cobb began his career in English literature but soon moved toward language study and instruction.
He has taught and coordinated reading and writing courses at a wide range of levels in the United
Kingdom, Saudi Arabia, Oman, Hong Kong, Australia, New Zealand, Japan, Mexico – and, of course,
Canada. He currently trains TESL trainees in the uses of computing in language learning at a
French-language university in Montreal.
References
Carver, R. (1990). Reading rate: A review of research and theory. San Diego, CA: Academic Press.
Cobb, T. & Horst, M. (2004). Is there room for an AWL in French? In P. Bogaards & B. Laufer (Eds.),
Vocabulary in a second language: Selection, acquisition, and testing (pp. 15-38). Amsterdam: John
Fraser, C. (2007). Reading rate in L1 Mandarin Chinese and L2 English across five reading tasks. The
Modern Language Journal, 91(3), 372-394.
Haynes, M., & Carr, T.H. (1990). Writing system background and second language reading: A
component skills analysis of English reading by native speakers of Chinese. In T.H. Carr & B.A. Levy
(Eds.), Reading and its development: Component skills approaches (pp. 375-418). San Diego, CA:
Academic Press.
Hill, D. (1997). Graded (basal) readers -- Choosing the best. The Language Teacher Online, 21(5).
Retrieved January 12, 2008, from
Hirai, A. (1999). The relationship between listening and reading rates of Japanese EFL learners. The
Modern Language Journal, 83(3), 367-384.
Horst, M. (2005). Learning L2 vocabulary through extensive reading: A measurement study. Canadian
Modern Language Review, 61(3), 355-382.
Laufer, B. (1989). What percentage of text-lexis is essential for comprehension? In C. Lauren & M.
Nordman (Eds.), Special language: From humans thinking to thinking machines (pp. 316-323). Clevedon,
UK: Multilingual Matters.
Laufer, B. (2000). Task effect on instructed vocabulary learning: The hypothesis of 'involvement.'
Selected Papers from AILA ’99 Tokyo (pp. 47-62). Tokyo: Waseda University Press.
Mason, B. (2004). The effect of adding supplementary writing to an extensive reading program.
International Journal of Foreign Language Teaching, 1(1), 2-16.
McQuillan, J. (1998). Literacy crisis: False claims, real solutions. Portsmouth, NH: Heinemann.
Nagy, W.E., & Anderson, R.C. (1984). How many words are there in printed school English? Reading
Research Quarterly, 19(3), 304-330.
Nassaji, H., & Geva, E. (1999). The contribution of phonological and orthographic processing skills to
adult ESL reading: Evidence from native speakers of Farsi. Applied Psycholinguistics, 20(2), 241-267.
Nation, P. (forthcoming) New roles for L2 vocabulary? In L. Wei & V. Cook (Eds.) Language teaching
and learning. Contemporary Applied Linguistics Series. London: Continuum International.
Nation, P. (2006). How large a vocabulary is needed for reading and listening? In M. Horst and T. Cobb
(Eds.), [Special Issue on Second Language Vocabulary Acquisition]. Canadian Modern Language
Review, 63(1), 59-81.
Parry, K. (1997). Vocabulary and comprehension: Two portraits. In J. Coady & T. Huckin (Eds.) Second
language vocabulary acquisition (pp. 55-68). Cambridge, UK: Cambridge University Press.
Pigada, M., & Schmitt, N. (2006). Vocabulary acquisition from extensive reading: A case study. Reading
in a Foreign Language, 18(1), 1-28.
Rosszell, R. (2007). Extensive reading and intensive vocabulary study in a Japanese university.
Unpublished doctoral dissertation, Temple University, Japan.
Taguchi, E., Takayasu-Maass, M., & Gorsuch, G. (2004). Developing reading fluency in EFL: How
assisted repeated reading and extensive reading affect fluency development. Reading in a Foreign
Language, 16(2), 70-96.