FULL TEXT - Canadian Centre for Knowledge Mobilisation

Volume 29, No. 4
Obtaining and Interpreting Maximum Performance Tasks from Children: A Tutorial
Applications of 2D and 3D Ultrasound Imaging in Speech-Language Pathology
Exploring the Use of Electropalatography and Ultrasound in Speech Habilitation
Published by the Canadian Association of Speech-Language Pathologists and Audiologists
Publiée par l'Association canadienne des orthophonistes et audiologistes
Winter / Hiver 2005
JOURNAL OF SPEECH-LANGUAGE PATHOLOGY AND AUDIOLOGY
Purpose and Scope
The Canadian Association of Speech-Language
Pathologists and Audiologists (CASLPA) is the
recognized national professional association of
speech-language pathologists and audiologists in
Canada. The association was founded in 1964,
incorporated under federal charter in 1975 and is
committed to fostering the highest quality of
service to communicatively impaired individuals
and members of their families. It began its periodical
publications program in 1973.
Indexing
JSLPA is indexed by:
• CINAHL - Cumulative Index to Nursing and
Allied Health Literature
• CSA - Cambridge Scientific Abstracts Linguistics and Language Behavior Abstracts
• Elsevier Bibliographic Databases
• ERIC Clearinghouse on Disabilities and Gifted
Education
• PsycInfo
The purpose of the Journal of Speech-Language
Pathology and Audiology (JSLPA) is to disseminate
contemporary knowledge pertaining to normal
human communication and related disorders of
communication that influence speech, language,
and hearing processes. The scope of the Journal is
broadly defined so as to provide the most inclusive
venue for work in human communication and its
disorders. JSLPA publishes both applied and basic
research, reports of clinical and laboratory inquiry,
as well as educational articles related to normal and
disordered speech, language, and hearing in all age
groups. Classes of manuscripts suitable for
publication consideration in JSLPA include
tutorials, traditional research or review articles,
clinical, field, and brief reports, research notes,
and letters to the editor (see Information to
Contributors). JSLPA seeks to publish articles that
reflect the broad range of interests in speech-language pathology and audiology, speech sciences,
hearing science, and those of related professions.
The Journal also publishes book reviews, as well as
independent reviews of commercially available
clinical materials and resources.
Subscriptions/Advertising
Nonmember and institution subscriptions are
available. For a subscription order form, including
orders of individual issues, please contact: CASLPA,
200 Elgin Street, Suite 401, Ottawa, Ontario K2P
1L5. Tel.: (800) 259-8519, (613) 567-9968; Fax:
(613) 567-2859; E-mail: [email protected]
Internet: www.caslpa.ca/english/resources/jslpasubscriptions.asp.
All inquiries concerning the placement of
advertisements in JSLPA should be directed to
[email protected]. The contents of all material and
advertisements which appear in JSLPA are not
necessarily endorsed by the Canadian Association
of Speech-Language Pathologists and Audiologists.
JSLPA Reviewers
Scott Adams, Joy Armson, Lisa Avery, Shari
Baum, Paul Beaudin, Sandi Bojm, V.J.
Boucher, Janine Boutelier, Tim Bressmann,
David Brown, Melanie Campbell, Marshall
Chasin, Margaret Cheesman, Lynne Clarke,
Pat Cleave, Martha Crago, Claire Croteau,
Lynn Dempsey, Luc DeNil, Marla Jean
DeSousa, Christine Dollaghan, Philip Doyle,
Christopher Dromey, Wendy Duke, Ahn
Duong, Andrée Durieux-Smith, Tanya L.
Eadie, Jos Eggermont, Diane Frome Loeb,
Jean-Pierre Gagné, Robin Gaines, Linda
Garcia, Bryan Gick, Luigi Girolametto, Paul
Hagler, Joseph W. Hall, III, Elizabeth Haynes,
Steve Heath, Lynne Hewitt, Megan Hodge,
Bill Hodgetts, Tammy Hopper, Nancy
Hubbard, Marc Joanisse, Jack Jung, Benoît
Jutras, Aura Kagan, Joseph Kalinowski,
Michael Kiefte, Robert Kroll, Deborah Kully,
Guylaine Le Dorze, Jeff Lear, Christopher Lee,
Carol Leonard, Tony Leroux, Janice Light,
Rosemary Lubinski, Shelia MacDonald, Ian
MacKay, Heather MacLean, Ruth Martin,
Virginia Martin, Rosemary Martino, Rachel
Mayberry, David McFarland, Lu-Anne
McFarlane, Alison McVittie, Barbara
Meissner Fishbein, Kathy Meyer, Linda Miller,
Linda Milosky, Jerald Moon, Taslim Moosa,
Robert Mullen, Kathleen Mullin, Kevin
Munhall, Chris Murphy, Candace Myers, J.B.
Orange, Marc Pell, Carole Peterson, Kathy
Pichora-Fuller, Dennis Phillips, Michel Picard,
Karen Pollock, Moneca Price, Barbara Purves,
Elizabeth Kay-Raining Bird, Jana Rieger,
Danielle Ripich, Elizabeth Rochon, Nelson
Roy, Christine Santilli, Susan Scollie, Barb
Shadden, Rosalee Shenker, Bernadette Ska,
Elizabeth Skarakis-Doyle, Jeff Small, Ravi
Sockalingham, David Stapells, Catriona
Steele, Andrew Stuart, Anne Sutton, Stephen
Tasko, Nancy Thomas-Stonell, Sharon
Trehub, Natacha Trudeau, Anne van Kleeck,
Ted Venema, Joanne Volden, Susan Wagner,
Danya Walker, Linda Walsh, Jian Wang,
Genese Warr Leeper, Penny Webster, Richard
Welland, Lynne Williams, William Yovetich,
Connie Zalmanowitz, Kim Zimmerman
Vol. 29, No. 4
Winter 2005
Editor
Phyllis Schneider, PhD
University of Alberta
Managing Editor/Layout
Judith Gallant
Manager of Communications
Angie Friend
Associate Editors
Marilyn Kertoy
University of Western Ontario
(Language, English submissions)
Tim Bressmann
University of Toronto
(Speech, English submissions)
Rachel Caissie
Dalhousie University
(Audiology, English submissions)
Patricia Roberts, PhD
University of Ottawa
(Speech & Language, French
submissions)
Tony Leroux, PhD
Université de Montréal
(Audiology, French submissions)
Assistant Editor
Vacant
(Material & Resource Reviews)
Assistant Editor
Vacant
(Book Reviews)
Cover illustration
Andrew Young
Review of translation
Tony Leroux, PhD
Université de Montréal
Translation
Smartcom Inc.
ISSN 0848-1970
Canada Post
Publications Mail
# 40036109
JSLPA is published quarterly by the Canadian Association of Speech-Language Pathologists and Audiologists (CASLPA). Publications
Agreement Number: # 40036109. Return undeliverable Canadian addresses to: CASLPA, 200 Elgin Street, Suite 401, Ottawa, Ontario
K2P 1L5. Address changes should be sent to CASLPA by e-mail [email protected] or to the above-mentioned address.
Revue d’orthophonie et d’audiologie - Vol. 29, No 4, Hiver 2005
141
REVUE D’ORTHOPHONIE ET D’AUDIOLOGIE
Objet et Portée
L’Association canadienne des orthophonistes et
audiologistes (ACOA) est l’association professionnelle
nationale reconnue des orthophonistes et des
audiologistes du Canada. L’Association a été fondée
en 1964 et incorporée en vertu de la charte fédérale
en 1975. L’Association s’engage à favoriser la meilleure
qualité de services aux personnes atteintes de troubles
de la communication et à leurs familles. Dans ce but,
l’Association entend, entre autres, contribuer au
corpus de connaissances dans le domaine des
communications humaines et des troubles qui s’y
rapportent. L’Association a mis sur pied son
programme de publications en 1973.
L’objet de la Revue d’orthophonie et d’audiologie
(ROA) est de diffuser des connaissances relatives à la
communication humaine et aux troubles de la
communication qui influencent la parole, le langage
et l’audition. La portée de la Revue est plutôt générale
de manière à offrir un véhicule des plus compréhensifs
pour la recherche effectuée sur la communication
humaine et les troubles qui s’y rapportent. La ROA
publie à la fois les ouvrages de recherche appliquée et
fondamentale, les comptes rendus de recherche
clinique et en laboratoire, ainsi que des articles
éducatifs portant sur la parole, le langage et l’audition
normaux ou désordonnés pour tous les groupes
d’âge. Les catégories de manuscrits susceptibles d’être
publiés dans la ROA comprennent les tutoriels, les
articles de recherche conventionnelle ou de synthèse,
les comptes rendus cliniques, pratiques et sommaires,
les notes de recherche, et les courriers des lecteurs
(voir Renseignements à l’intention des
collaborateurs). La ROA cherche à publier des articles
qui reflètent une vaste gamme d’intérêts en
orthophonie et en audiologie, en sciences de la
parole, en science de l’audition et en diverses
professions connexes. La Revue publie également des
critiques de livres ainsi que des critiques indépendantes
de matériel et de ressources cliniques offerts
commercialement.
Abonnements/Publicité
Les membres de l’ACOA reçoivent la Revue à ce titre.
Les non-membres et institutions peuvent s’abonner.
Les demandes d’abonnement à la ROA ou de copies
individuelles doivent être envoyées à : ACOA, 200,
rue Elgin, bureau 401, Ottawa (Ontario) K2P 1L5.
Tél. : (800) 259-8519, (613) 567-9968; Téléc. : (613)
567-2859; Courriel : [email protected]; Internet :
www.caslpa.ca/francais/resources/jslpa-asp.
Toutes les demandes visant à faire paraître de la
publicité dans la ROA doivent être adressées au
Bureau national. Les articles, éditoriaux et publicités
qui paraissent dans la ROA ne sont pas nécessairement
avalisés par l’Association canadienne des
orthophonistes et audiologistes.
Inscription au Répertoire
ROA est répertoriée dans:
• CINAHL - Cumulative Index to Nursing and
Allied Health Literature
• CSA - Cambridge Scientific Abstracts Linguistics and Language Behavior Abstracts
• Elsevier Bibliographic Databases
• ERIC Clearinghouse on Disabilities and Gifted
Education
• PsycInfo
Réviseurs de la ROA
Scott Adams, Joy Armson, Lisa Avery, Shari
Baum, Paul Beaudin, Sandi Bojm, V.J.
Boucher, Janine Boutelier, Tim Bressmann,
David Brown, Melanie Campbell, Marshall
Chasin, Margaret Cheesman, Lynne Clarke,
Pat Cleave, Martha Crago, Claire Croteau,
Lynn Dempsey, Luc DeNil, Marla Jean
DeSousa, Christine Dollaghan, Philip Doyle,
Christopher Dromey, Wendy Duke, Ahn
Duong, Andrée Durieux-Smith, Tanya L.
Eadie, Jos Eggermont, Diane Frome Loeb,
Jean-Pierre Gagné, Robin Gaines, Linda
Garcia, Bryan Gick, Luigi Girolametto, Paul
Hagler, Joseph W. Hall, III, Elizabeth Haynes,
Steve Heath, Lynne Hewitt, Megan Hodge, Bill
Hodgetts, Tammy Hopper, Nancy Hubbard,
Marc Joanisse, Jack Jung, Benoît Jutras, Aura
Kagan, Joseph Kalinowski, Michael Kiefte,
Robert Kroll, Deborah Kully, Guylaine Le
Dorze, Jeff Lear, Christopher Lee, Carol
Leonard, Tony Leroux, Janice Light, Rosemary
Lubinski, Shelia MacDonald, Ian MacKay,
Heather MacLean, Ruth Martin, Virginia
Martin, Rosemary Martino, Rachel Mayberry,
David McFarland, Lu-Anne McFarlane, Alison
McVittie, Barbara Meissner Fishbein, Kathy
Meyer, Linda Miller, Linda Milosky, Jerald
Moon, Taslim Moosa, Robert Mullen,
Kathleen Mullin, Kevin Munhall, Chris
Murphy, Candace Myers, J.B. Orange, Marc
Pell, Carole Peterson, Kathy Pichora-Fuller,
Dennis Phillips, Michel Picard, Karen Pollock,
Moneca Price, Barbara Purves, Elizabeth Kay-Raining Bird, Jana Rieger, Danielle Ripich,
Elizabeth Rochon, Nelson Roy, Christine
Santilli, Susan Scollie, Barb Shadden, Rosalee
Shenker, Bernadette Ska, Elizabeth Skarakis-Doyle, Jeff Small, Ravi Sockalingham, David
Stapells, Catriona Steele, Andrew Stuart, Anne
Sutton, Stephen Tasko, Nancy Thomas-Stonell, Sharon Trehub, Natacha Trudeau,
Anne van Kleeck, Ted Venema, Joanne
Volden, Susan Wagner, Danya Walker, Linda
Walsh, Jian Wang, Genese Warr Leeper, Penny
Webster, Richard Welland, Lynne Williams,
William Yovetich, Connie Zalmanowitz, Kim
Zimmerman
Vol. 29, No 4
Hiver 2005
REVUE
D’ORTHOPHONIE ET
D’AUDIOLOGIE
Rédactrice en chef
Phyllis Schneider, Ph.D.
University of Alberta
Directrice de la rédaction /
mise en page
Judith Gallant
Directrice des communications
Angie Friend
Rédacteurs en chef adjoints
Marilyn Kertoy
University of Western Ontario
(Orthophonie, soumissions en
anglais)
Tim Bressmann
University of Toronto
(Orthophonie, soumissions en
anglais)
Rachel Caissie
Dalhousie University
(Audiologie, soumissions en
anglais)
Patricia Roberts, Ph.D.
Université d’Ottawa
(Orthophonie, soumissions en
français)
Tony Leroux, Ph.D.
Université de Montréal
(Audiologie, soumissions en
français)
Rédacteur adjoint
Libre
(Évaluation des ressources)
Rédacteur adjoint
Libre
(Évaluation des ouvrages écrits)
Révision de la traduction
Tony Leroux, Ph.D.
Université de Montréal
Illustration (couverture)
Andrew Young
Traduction
Smartcom Inc.
ISSN 0848-1970
Postes Canada
Envoi publications
# 40036109
La ROA est publiée quatre fois l’an par l’Association canadienne des orthophonistes et audiologistes (ACOA). Numéro de publication:
#40036109. Faire parvenir tous les envois avec adresses canadiennes non reçus au 200, rue Elgin, bureau 401, Ottawa (Ontario) K2P
1L5. Faire parvenir tout changement à l’ACOA au courriel [email protected] ou à l’adresse indiquée ci-dessus.
142
Journal of Speech-Language Pathology and Audiology - Vol. 29, No. 4, Winter 2005
Table of Contents / Table des matières

Introduction: Winter Issue ... 144
Introduction : Numéro de l'Hiver ... 145

Article
Obtaining and Interpreting Maximum Performance Tasks from Children: A Tutorial
Susan Rvachew, Megan Hodge and Alyssa Ohberg ... 146
Obtenir et interpréter des durées maximales d'exécution chez des enfants : un tutoriel
Susan Rvachew, Megan Hodge et Alyssa Ohberg ... 146

Article
Applications of 2D and 3D Ultrasound Imaging in Speech-Language Pathology
Tim Bressmann, Chiang-Le Heng and Jonathan C. Irish ... 158
Utilisation de l'échographie en 2D et 3D en orthophonie
Tim Bressmann, Chiang-Le Heng et Jonathan C. Irish ... 158

Article
Exploring the Use of Electropalatography and Ultrasound in Speech Habilitation
Barbara Bernhardt, Penelope Bacsfalvi, Bryan Gick, Bosko Radanov and Rhea Williams ... 169
Explorer l'électropalatographie et l'échographie pour l'éducation de la parole
Barbara Bernhardt, Penelope Bacsfalvi, Bryan Gick, Bosko Radanov et Rhea Williams ... 169

Resource Review ... 183
Évaluation des ressources ... 183
Information for Contributors ... 185
Renseignements à l'intention des collaborateurs ... 187
Introduction
Barbara Bernhardt, Ph.D.
School of Audiology and Speech Sciences
University of British Columbia, Vancouver, BC
Technological innovations are providing new opportunities for speech assessment and (re)habilitation. The
papers in this issue present up-to-date tutorial information on technological innovations in Canada, with
some preliminary findings on the use of ultrasound, electropalatography and the TOCS+™ MPT Recorder©
ver. 1 (Hodge & Daniels, 2004) in speech assessment and/or (re)habilitation.
Rvachew, Hodge and Ohberg provide information on the use and evaluation of ‘maximum performance tasks’ in
the assessment of motor speech impairments in children. Such tasks involve prolongation of speech sounds and
repetition of syllables, and have been used to assist in differential diagnosis of speech dyspraxia or dysarthria in children
(e.g., Thoonen et al., 1996; Williams & Stackhouse, 2000). The authors describe how TOCS+™ MPT Recorder© ver.
1 (Hodge & Daniels, 2004) is used along with a waveform editor to facilitate reliable administration and recording of
children’s responses and accurate measurement of maximum durations and repetition rates. The software eliminates
significant impediments to the use of the tasks with children. The authors suggest that the application of these procedures
can result in reliable and valid data from younger children and become a routine part of the speech-language assessment
protocol for children with suspected or confirmed speech impairments and delays.
The other two papers in the issue describe visual display technology for articulatory movements. Bressmann, Heng
and Irish provide a tutorial on speech imaging of the tongue with two-dimensional and three-dimensional ultrasound.
They describe a number of different applications that they envisage for ultrasound imaging in speech-language
pathology and demonstrate research findings from different projects in speech and swallowing that have been
undertaken in the Voice and Resonance Laboratory at the University of Toronto. For example, for patients with
glossectomy, three-dimensional tongue imaging has been used to quantify shapes of the tongue during the production
of speech sounds pre- and postoperatively, and to evaluate the effect of different reconstructive techniques on the
deformation and symmetry of the tongue tissue. The authors make a convincing argument for the usefulness, safety and
cost-effectiveness of ultrasound.
At the University of British Columbia’s Interdisciplinary Speech Research Laboratory, Bernhardt and colleagues
have explored both two-dimensional ultrasound and electropalatography (EPG) as articulatory feedback tools in
speech habilitation. EPG and ultrasound give different types of dynamic information about the tongue during speech
production. EPG shows tongue-palate contact patterns from the tongue tip to the back of the hard palate for mid and
high vowels and lingual consonants. Ultrasound images tongue shape, location and configuration for all vowels and
lingual consonants. In their article, Bernhardt, Bacsfalvi, Gick, Radanov & Williams discuss the relative merits of these
techniques and present data from two preliminary treatment studies using one or both of the techniques. Further
research will clarify the relative benefits of the two tools in speech habilitation and their merit in comparison with other
methods across a variety of speakers.
Currently, the technologies described are available primarily at or near universities. However, the procedures
discussed by Rvachew and colleagues can be adapted for use elsewhere. Portable ultrasound machines exist, and in
British Columbia, a rural consultancy project by Bernhardt and colleagues is underway to evaluate the consultative
use of ultrasound in speech habilitation. A clinical program called Cleftnet is being set up in Britain and could serve
as a model for regions of Canada. Cleftnet will provide electropalatographs to cleft palate clinics, and will link those
clinics with a university for data analysis and treatment recommendations. A charitable organization has provided the
funding for this innovative service delivery program. Such a plan is currently germinating in British Columbia. By
bundling these papers together in one volume, we hope that readers will see potential for future clinical applications
and will be encouraged to gain access to new technologies for speech assessment and intervention. Such technologies
enhance our understanding of speech (and swallowing) and can maximize the potential for good outcomes for our
clients.
References
Hodge, M. M. & Daniels, J. D. (2004). TOCS+™ MPT Recorder© ver.1. [Computer software]. University of Alberta, Edmonton, AB.
Thoonen, G., Maassen, B., Wit, J., Gabreels, F., & Schreuder, R. (1996). The integrated use of maximum performance tasks in differential diagnostic evaluations among
children with motor speech disorders. Clinical Linguistics & Phonetics, 10, 311-336.
Williams, P., & Stackhouse, J. (2000). Rate, accuracy and consistency: diadochokinetic performance of young, normally developing children. Clinical Linguistics & Phonetics,
14, 267-293.
Introduction
Barbara Bernhardt, Ph.D.
School of Audiology and Speech Sciences
University of British Columbia, Vancouver, BC
Les innovations technologiques offrent de nouvelles possibilités pour l’évaluation et la réadaptation de la
parole. Les articles du présent numéro donnent des renseignements à jour sur les progrès
technologiques au Canada et présentent des résultats préliminaires sur l’utilisation de l’échographie, de
l’électropalatographie et du logiciel TOCS+™ MPT Recorder© ver. 1 (Hodge et Daniels, 2004) pour l’évaluation et
la réadaptation de la parole.
Rvachew, Hodge et Ohberg abordent l’utilisation et l’évaluation de la mesure maximale d’exécution de tâches pour
les troubles moteurs de la parole chez les enfants. Ces tâches demandent aux enfants de prolonger des sons de la parole
et de répéter des syllabes. Elles ont été utilisées pour établir des diagnostics différentiels de la dyspraxie et de la dysarthrie
chez les enfants (p. ex. : Thoonen et coll., 1996; Williams et Stackhouse, 2000). Les auteurs décrivent comment le logiciel
TOCS+™ MPT Recorder© ver. 1 (Hodge et Daniels, 2004) est utilisé en concomitance avec un éditeur d’ondes sonores
pour faciliter l’administration d’un test, l’enregistrement des réponses de l’enfant et la mesure exacte de la durée
maximale et de la fréquence de répétition. Ce logiciel élimine les grands obstacles qui nuisent à l’utilisation de ces tâches
avec des enfants. Les auteurs avancent que le recours à ces procédures peut mener à des données fiables et valides pour
les jeunes enfants et pourrait même faire partie du protocole d’évaluation en orthophonie des enfants chez qui l’on
soupçonne ou l’on a diagnostiqué un trouble ou un retard de la parole.
Les deux autres articles de ce numéro décrivent une technologie de visualisation des mouvements articulatoires.
Bressmann, Heng et Irish présentent un tutoriel sur l’imagerie des mouvements de la langue grâce à l’échographie en
deux et trois dimensions. Ils décrivent un certain nombre d’utilisations différentes de cette technique en orthophonie
et présentent des résultats de recherche issus de divers projets sur la parole et la déglutition menés par le Voice and
Resonance Laboratory à l’University of Toronto. Par exemple, pour des patients ayant subi une glossectomie, l’imagerie
en trois dimensions de la langue a servi à quantifier les formes que prend l’organe durant la production de sons avant
et après l’ablation et à évaluer l’effet de différentes techniques de reconstruction sur la déformation et la symétrie du
tissu de la langue. Les auteurs présentent des arguments convaincants sur l’utilité de l’échographie, sa sécurité et son
caractère économique.
Au Interdisciplinary Speech Research Laboratory de l’University of British Columbia, Bernhardt et ses collègues
ont exploré l’échographie à deux dimensions et l’électropalatographie comme outils de rétroaction visuelle de
l’articulation. Ces deux techniques procurent des types différents d’information dynamique à propos de la langue
durant la parole. L’électropalatographie montre les modèles de contact entre la langue et le palais depuis le bout de la
langue jusqu’au palais dur pour les voyelles mi-fermées et fermées et les consonnes linguales. L’échographie permet de
prendre une image de la forme de la langue, de son emplacement et de sa configuration pour toutes les voyelles et les
consonnes linguales. Dans leur article, Bernhardt, Bacsfalvi, Gick, Radanov et Williams abordent les mérites relatifs
de ces techniques et présentent des données de deux études préliminaires sur le traitement à partir de l’une ou de ces deux
méthodes. D’autres recherches permettront de mieux définir les avantages relatifs de ces deux outils pour la rééducation
de la parole et de comparer leurs mérites par rapport à d’autres méthodes employées pour une variété de locuteurs.
Actuellement, les techniques décrites sont offertes principalement en milieu universitaire ou à proximité. Cependant,
il est possible d’adapter les démarches décrites par Rvachew et ses collègues pour qu’elles soient utilisées ailleurs. Il existe
des appareils à échographie portables, et Bernhardt et ses collègues mènent actuellement un projet d’étude en Colombie-Britannique pour évaluer les avantages de l’échographie lors d’une consultation sur la rééducation de la parole. La
Grande-Bretagne met actuellement en œuvre un programme clinique baptisé Cleftnet, qui pourrait servir de modèle
pour les régions du Canada. Cleftnet permettra de faire des électropalatographies pour des cliniques sur la fissure
palatine et de mettre ces cliniques en rapport avec une université pour l’analyse des données et l’établissement des
recommandations de traitement. Un organisme caritatif a fourni les fonds nécessaires au financement de ce programme
novateur de prestation de services. Un tel plan est en cours d’élaboration en Colombie-Britannique. En réunissant tous
ces articles dans un même numéro, nous espérons que les lecteurs y trouveront des applications cliniques et une
motivation pour obtenir l’accès à ces nouvelles technologies d’évaluation de la parole et d’intervention. Ces techniques
améliorent notre compréhension de la parole (et de la déglutition) et peuvent nous permettre d’aller chercher les
meilleurs résultats pour nos clients.
Références
Hodge, M.M. et J.D. Daniels (2004). TOCS+™ MPT Recorder© ver.1. [logiciel informatique]. University of Alberta, Edmonton (Alb.).
Thoonen, G., B. Maassen, J. Wit, F. Gabreels et R. Schreuder (1996). The integrated use of maximum performance tasks in differential diagnostic evaluations among children
with motor speech disorders. Clinical Linguistics & Phonetics, 10, p. 311-336.
Williams, P., et J. Stackhouse (2000). Rate, accuracy and consistency: diadochokinetic performance of young, normally developing children. Clinical Linguistics & Phonetics,
14, p. 267-293.
Maximum Performance Tasks
Obtaining and Interpreting Maximum Performance Tasks
from Children: A Tutorial
Obtenir et interpréter des durées maximales d’exécution
chez des enfants : un tutoriel
Susan Rvachew, Megan Hodge and Alyssa Ohberg
Abstract
The diagnosis of motor speech disorders in children can be aided by the use and interpretation of
measures of maximum performance tasks. These tasks include measuring how long a vowel can be
sustained or how fast syllables can be repeated. This tutorial provides a rationale for including these
measures in assessment protocols for children with speech sound disorders. Software developed to
motivate children to cooperate with these procedures and to expedite recording of sound prolongations
and syllable repetitions is described. Procedures for obtaining maximum performance measures
from digital sound file recordings are illustrated followed by a discussion of how these measures may
aid in clinical diagnosis.
Abrégé
Le diagnostic d’un trouble moteur de la parole chez un enfant peut être facilité par l’utilisation
et l’interprétation de tâches de durée maximale d’exécution. Ces tâches comprennent la mesure de
la durée vocalique et de la rapidité de répétition des syllabes. Le présent tutoriel explique les raisons
pour inclure ces tâches dans les protocoles d’évaluation pour les enfants atteints d’un trouble de
parole. Le logiciel élaboré pour motiver ces derniers à collaborer lors de ces procédures et pour
accélérer l’enregistrement du prolongement sonore et des répétitions de syllabes y est décrit. Les
démarches pour obtenir des durées maximales d’exécution à partir d’un fichier sonore numérique
y sont illustrées et sont suivies par une discussion sur la façon dont ces mesures peuvent aider à poser
un diagnostic.
Key Words: Speech sound disorders, motor speech disorders, assessment, maximum
performance tasks
Susan Rvachew
Ph.D., S-LP(C)
McGill University
Montreal, QC Canada

Megan Hodge
University of Alberta
Edmonton, AB Canada

Alyssa Ohberg
McGill University
Montreal, QC Canada

Children with speech sound disorders form a heterogeneous group from a
number of perspectives, including underlying etiological factors, the
developmental course of the disorder, and the nature of the overt speech
errors that are present at a given point in time (Shriberg, 1997). Most frequently the
speech sound disorder is of unknown origin and has no obvious motoric basis, a
subtype that will be referred to here as developmental phonological disorder. This
subtype has also been referred to as speech sound disorder of unknown origin, nonspecific speech delay, functional articulation disorder or functional phonological
disorder in the literature cited in the following sections.
Other children's speech sound errors can be linked to motoric factors, with or
without a known primary cause. Childhood apraxia of speech (also referred to as
speech dyspraxia) is identified by a number of inclusionary characteristics including
difficulties with sequencing articulatory movements, phonemes, and syllables; trial
and error groping behaviours; and unusual and inconsistent error patterns for both
consonants and vowels. Dysarthria may also be observed in children and manifests
itself as more consistent error patterns resulting from slow and imprecise movements
associated with an abnormal sensorimotor profile that typically includes weakness and
tone abnormalities of the affected speech muscle groups.
One purpose of a speech-language assessment is to determine the extent to which
motoric factors contribute to a child's difficulties with the acquisition of the sound
system of the native language. Knowledge about whether or not the child's speech
disorder has a motor component will help the clinician to choose the most appropriate
treatment approach. Accurate diagnosis may also have
ramifications for the child's access to treatment services
because both public and private funders often favour the
provision of services to children with an identifiable
medical impairment.
Measures of maximum performance tasks (MPTs)
such as how long a vowel can be sustained (maximum
phonation duration; MPD) or how fast syllables can be
repeated (maximum repetition rate; MRR) are well-established procedures used by speech-language
pathologists when assessing older children and adults
(Duffy, 1995; Kent, Kent, & Rosenbek, 1987). More
recently, Thoonen and colleagues (Thoonen, Maassen,
Gabreels, & Schreuder, 1999; Thoonen, Maassen, Wit,
Gabreels, & Schreuder, 1996) described the application
of MPTs to assist clinicians in diagnosing the presence and
nature of motor speech impairment in younger children
(age 6 to 10 years). Published protocols for identifying
and describing oral and speech praxis characteristics of
children also include maximum syllable repetition rate
measures as part of a battery of nonspeech and speech
performance measures (e.g., Hickman, 1997). The
classification system developed by Thoonen et al. is
particularly appealing because it offers clinicians a
systematic framework for integrating and interpreting
measures from MPTs to assist in differential diagnosis of
childhood speech disorders.
As with all assessment procedures, the ease and
reliability with which measures of MPTs can be obtained
and their validity and usefulness in differential diagnosis
are key determinants to being adopted in clinical practice.
This tutorial provides a rationale for including these
measures in assessment protocols for young children with
speech sound disorders. It summarizes the tasks and
classification procedure developed by Thoonen et al. and
how the measures obtained are interpreted to ascertain
the presence and nature of motor speech impairment.
Software that expedites recording of the MPTs
recommended by Thoonen et al. is described and
procedures for obtaining MPT measures from digital
sound file recordings are illustrated for readers who may
be unfamiliar with computer-assisted measurement.
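For readers who want a concrete picture of what computer-assisted measurement of these quantities involves, the sketch below estimates the two core measures from a digitized recording. It is an illustrative assumption only, not the algorithm used by the TOCS+™ MPT Recorder or the waveform-editor procedure the tutorial describes: it treats maximum phonation duration as the longest stretch of above-threshold signal energy, and maximum repetition rate as the number of energy bursts per second of phonation, with arbitrary window and threshold settings.

```python
# Illustrative sketch (not the TOCS+ algorithm): estimate maximum phonation
# duration (MPD) and maximum repetition rate (MRR) from a mono 16-bit WAV
# recording using simple short-time energy. Window size and thresholds are
# arbitrary assumptions chosen for clarity.
import wave
import struct

def read_wav(path):
    """Read a mono 16-bit PCM WAV file; return (sample_rate, samples)."""
    with wave.open(path, "rb") as w:
        rate = w.getframerate()
        raw = w.readframes(w.getnframes())
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return rate, samples

def frame_energies(samples, rate, win_ms=20):
    """Mean squared amplitude per non-overlapping window."""
    win = max(1, int(rate * win_ms / 1000))
    energies = [sum(s * s for s in samples[i:i + win]) / win
                for i in range(0, len(samples) - win + 1, win)]
    return energies, win

def phonation_duration(samples, rate, rel_threshold=0.01):
    """Longest contiguous above-threshold stretch, in seconds (MPD proxy)."""
    energies, win = frame_energies(samples, rate)
    peak = max(energies) or 1.0
    voiced = [e >= rel_threshold * peak for e in energies]
    best = run = 0
    for v in voiced:
        run = run + 1 if v else 0
        best = max(best, run)
    return best * win / rate

def repetition_rate(samples, rate, rel_threshold=0.05):
    """Energy bursts (syllables) divided by elapsed time (MRR proxy)."""
    energies, win = frame_energies(samples, rate)
    peak = max(energies) or 1.0
    voiced = [e >= rel_threshold * peak for e in energies]
    if not any(voiced):
        return 0.0
    # A burst starts wherever an above-threshold frame follows a silent one.
    bursts = sum(1 for i, v in enumerate(voiced)
                 if v and (i == 0 or not voiced[i - 1]))
    first = voiced.index(True)
    last = len(voiced) - 1 - voiced[::-1].index(True)
    elapsed = (last - first + 1) * win / rate
    return bursts / elapsed
```

In practice the tutorial's approach of inspecting the waveform in an editor remains the reference procedure; an automatic threshold like the one above can mistake breath noise for phonation, which is one reason visual verification of cursor placement matters.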
Rationale
Accurate identification of speech motor limitations
can be difficult, especially in the case of children who do
not present with an obvious primary impairment such as
cerebral palsy or traumatic brain injury. Campbell (2003)
reported that second opinion assessments conducted at
the Children's Hospital of Pittsburgh confirmed a prior
diagnosis of childhood apraxia of speech (CAS) in only 17
percent of cases, suggesting a significant over-diagnosis of
CAS among children with a severe and persistent speech
sound disorder. On the other hand, Gibbon (1999) has
suggested that a more subtle form of motoric involvement,
termed 'undifferentiated lingual gestures', is frequently
under-diagnosed among children who present with errors
that appear to be phonological on the basis of perceptual
analyses (in particular, velar fronting and/or backing
and fricative gliding and/or stopping). Certain phonetic
errors such as a lateral lisp may also reflect an inability to
independently control the lateral margins of the tongue.
Under-identification of motor speech limitations may
harm individual clients if it prevents them from accessing
services to which they are entitled or receiving the most
appropriate form of treatment. Over-identification also
has far-reaching implications, since threats to the
credibility of our profession will have a negative impact
on the funding of speech therapy services.
One reason for misdiagnosis may be an over-reliance
on diagnostic checklists as a means of identifying motor
speech disorders (Shriberg, Campbell, Karlsson, Brown,
McSweeny, & Nadler, 2003). These lists have a kind of face
validity because they describe the overt characteristics of
the child's speech. Unfortunately they lack specificity
because they fail to distinguish between fundamental
characteristics of a motor speech disorder and the
consequences of such a disorder. The linguistic
consequences of dysarthria or dyspraxia are not clearly
distinguishable from the linguistic consequences of a
developmental phonological disorder. Unintelligibility
and persistence of the speech problem are not specific to
motor speech disorders and systematic error patterns are
not specific to developmental phonological delay. Shriberg,
Aram, and Kwiatkowski (1997) demonstrated that CAS
could not be differentiated from a developmental
phonological disorder on the basis of structural or
phonological characteristics of the child's conversational
speech (i.e., phonetic repertoire, syllable structure
repertoire, percentage of consonants correct,
intelligibility index, or phonological processes).
Maximum Performance Tasks
A more promising approach is to administer
Maximum Performance Tasks (MPTs) to children.
Thoonen, Maassen, Wit, Gabreels, and Schreuder (1996)
explained that "although [MPTs] assess abilities that differ
from normal speech production…, they provide
information on motor speech abilities underlying
dysarthria and [CAS] (e.g., articulatory coordination,
breath control, speaking rate, speech fluency, articulatory
accuracy and temporal variability)" (p. 312). These
researchers demonstrated how Maximum Phonation
Duration (MPD) and Maximum Repetition Rate (MRR)
can be used to differentiate groups of children with spastic
dysarthria, CAS, developmental phonological disorder,
or normally developing speech. Their criteria for
classification were derived from the responses of children
aged 6 to 10 years, some with normally developing
speech and some with clinically diagnosed dyspraxia or
dysarthria. Briefly, children with dysarthria were found
to produce short phonation durations and slow
monosyllabic repetition rates; children with dyspraxia
produced slow trisyllabic repetition rates and short
fricative durations. Later, these criteria were cross-validated
with new samples of school-aged children, this
time including a sample of children with a developmental
phonological disorder with no motoric component. It
was shown that these tasks could be used to identify
dysarthria with 89% sensitivity and 100% specificity. In
other words, 89% of the children with clinically diagnosed
dysarthria were identified as dysarthric on the basis of
their responses on the MPTs (sensitivity). Furthermore,
none of the children who were not dysarthric by clinical
criteria were falsely identified as dysarthric on the basis of
their responses to the MPTs (specificity). Dyspraxia was
identified from MPT responses with 100% sensitivity and
91% specificity. Overall, diagnostic accuracy was excellent
with 95% correct classification of 41 children as presenting
with normally developing speech, developmental
phonological delay, childhood apraxia of speech, or
dysarthria. Of particular interest was the finding that
children with a developmental phonological disorder
performed these tasks in a qualitatively and quantitatively
different manner from children with dysarthria or
dyspraxia. Children with dyspraxia were often unable to
produce a correct trisyllabic sequence. Children with a
developmental phonological disorder were usually able
to produce the sequence accurately but only after an
unusual number of unsuccessful attempts. Overall their
performance on these tasks was intermediate between the
control group and the dysarthric and dyspraxic groups.
Kent, Kent, and Rosenbek (1987) described some of the
difficulties inherent to the clinical application and
interpretation of MPTs which may explain why these
techniques are not routinely applied, especially with young
children. A primary issue with interpretation of MPT
performance is the availability of good quality normative
data. Kent et al. reviewed a number of studies that provided
normative data for school-age children and young adults
but noted that there was a lack of normative data for
younger children and older adults. Subsequently,
however, Robbins and Klee (1987) described the MRR
and MPD performance of children aged 2;6 through 6;11
(with a sample of 10 children at each 6-month age interval).
Williams and Stackhouse (2000) reported additional data
regarding repetition performance for 3-, 4-, and 5-year-old children.
Reliability of the measures obtained from the child's
performance of each task presents another challenge.
Stability of the results across repeated trials can be poor.
Individual performance is affected by the task instructions
and the motivation of the child. Kent et al. (1987) suggested
that standardized instructions and procedures would
help reduce variability within and across children. In this
report we describe a software tool that presents a standard
protocol for clinicians to follow when administering MPTs
to young children and recording their productions.
Experience with the software indicates that it increases
children's motivation to comply with the protocol. To
date all of the preschool-aged children that we have tested
with this tool have provided a complete set of responses
for each of the maximum performance tasks.
Unstable performance levels across trials also lead to
questions about the validity of these measures as
implemented in a clinical setting. Kent et al. (1987)
reported that it can take as many as 15 trials before a stable
response is achieved, particularly when attempting to
obtain maximum phonation duration. However, Potter,
Kent, and Lazarus (2004) reported that in their investigation
of typical performance on repetition tasks, the first attempt
was most frequently the fastest and most accurate. Over
90% of the children who attempted and could perform the
task gave their best performance within the first three
trials. This is an encouraging finding because our
experience has been that it is impractical to attempt more
than three trials with a young child. Although instability
across repeated trials is a potential threat to the validity
of MPTs, Kent et al. concluded that "nonetheless, the test
may still have clinical utility as a screening procedure if it
is recognized that the object is to determine if the client can
reach some minimal standard" (p. 369). This is the
approach taken by Thoonen et al. (1999). They established
the threshold values for Maximum Phonation Duration
and Maximum Repetition Rates that can be used to
diagnose dyspraxia or dysarthria in children aged 6
through 10 years.
Finally, some of the variability in results that is
observed may result from the difficulty of obtaining an
accurate measurement of MPD and MRR when
administering the tasks 'live' with the use of a stop-watch.
Kent et al. (1987) and Thoonen et al. (1996, 1999)
recommended that responses be recorded and measures
of the acoustic waveform be used whenever possible to
obtain more precise measurements. The software described
in this report makes it easy for the clinician to record the
child's responses and retrieve them for measurement.
Durations and repetition rates can then be accurately
measured from these recordings using any available
waveform editor. Procedures for measuring MPD and
MRR from a waveform display are demonstrated in a later
section.
A Protocol for Obtaining MPTs from
Children
The protocol for obtaining MPTs described here was
developed by Thoonen et al. (1996, 1999). This procedure
involves the administration of nine tasks as follows:
prolongation of [a] and [mama] to yield a maximum
phonation duration (MPD), prolongation of [f], [s], and
[z] to yield a maximum fricative duration (MFD),
repetition of the single syllables [pa], [ta], and [ka] to
yield a maximum repetition rate-monosyllabic
(MRRmono), and repetition of the syllable sequence
[pataka] to yield a maximum repetition rate-trisyllabic
(MRRtri). Two additional outcome measures are derived
from the child's performance during the trisyllabic
repetitions task, specifically a score indicating whether
the child achieved a correct trisyllabic sequence (Seq) and
the number of attempts beyond the standard three trials
required for the child to achieve a correct sequencing of
[pataka] (Attempts). The instructions for administering
these items and then combining results across the nine
tasks to yield the six outcome measures are shown in Table 1.
Journal of Speech-Language Pathology and Audiology - Vol. 29, No. 4, Winter 2005
Table 1
Instructions for administration of the maximum performance tasks, including Maximum Phonation Duration (MPD),
Maximum Fricative Duration (MFD), Maximum Repetition Rate for Single Syllables (MRRmono), and Maximum
Repetition Rate for Trisyllabic Sequences (MRRtri), adapted from Thoonen et al. (1996)
Task
Instructions
Maximum Phonation Duration (MPD)
[a]
1. Produce a prolonged [a] for approximately 2 seconds on one breath in a monotonic manner with normal
pitch. Ask the child to imitate your model. Repeat if necessary until the child is successful in imitating your
model.
2. As above except model a prolongation of [a] for 4 to 5 seconds and then ask the child to imitate your model.
3. Ask the child to say [a] for as long as possible on one breath (with no model provided in this case). Repeat
the instruction two more times, providing the child with a total of three opportunities to prolong [a] for as long as
possible.
[mama]
Repeat steps 1, 2, and 3 above except that in this case, model a repetition of the syllables [mama...]. Again at
step 3, give the child three opportunities to produce [mama...] for as long as possible on a single breath.
MPD
MPD is the mean of the longest prolongation of [a] and the longest prolongation of [mama...].
Maximum Fricative Duration (MFD)
[f]
Repeat steps 1, 2, and 3 as described for MPD, in this case modelling a prolonged production of [f]. Again at
step 3, give the child three opportunities to prolong [f] for as long as possible on a single breath.
[s]
Repeat steps 1, 2, and 3 as described above, in this case modelling a prolonged production of [s]. Again at
step 3, give the child three opportunities to prolong [s] for as long as possible on a single breath.
[z]
Repeat steps 1, 2, and 3 as described above, in this case modelling a prolonged production of [z]. Again at
step 3, give the child three opportunities to prolong [z] for as long as possible on a single breath.
MFD
MFD is the mean of the longest prolongation of [f], the longest prolongation of [s] and the longest prolongation of
[z].
Maximum Repetition Rate - Monosyllabic (MRRmono)
[pa]
1. Ask the child to say [pa], and then [papapa], and then [papapapapa].
2. Model the repetition of approximately 12 [pa] syllables on a single breath at a rate of about four syllables per
second and ask the child to imitate your model.
3. Ask the child to repeat step 2 but this time as fast as possible. Stop recording when the child has produced
12 or more syllables. Provide the child with two additional opportunities to maximize the repetition rate.
[ta]
Repeat steps 1, 2, and 3 as described above, in this case modelling repetition of the syllable [ta]. Again at step
3, give the child three opportunities to produce [ta] as fast as possible on a single breath.
[ka]
Repeat steps 1, 2, and 3 as described above, in this case modelling repetition of the syllable [ka]. Again at step
3, give the child three opportunities to produce [ka] as fast as possible on a single breath.
MRRmono
For each trial the repetition rate is calculated as the number of syllables produced per second. MRRmono is the
mean repetition rate for the fastest repetition of [pa], the fastest repetition of [ta], and the fastest repetition of
[ka].
Maximum Repetition Rate - Trisyllabic (MRRtri)
[pataka]
1. Ask the child to say [pataka] at a slow rate. Practice this syllable sequence, breaking it down into its
component parts if necessary, until the child can produce a single correct sequence.
2. Produce the sequence twice [patakapataka] fluently and at a slow rate and ask the child to imitate.
3. Produce the sequence three times at a normal speaking rate and ask the child to imitate.
4. Produce the sequence four times at a rate of about four syllables per second and ask the child to imitate.
5. Model a repetition of the sequence, five times and as fast as possible. Ask the child to produce the
sequence as fast as possible for as long as possible on a single breath. Give the child two additional trials to
perform this task. If the child cannot produce the sequence accurately, repeat the steps and allow three
additional attempts to produce a correct sequence as fast as possible and as long as possible on a single
breath.
MRRtri
MRRtri is the number of syllables per second produced during the child's fastest attempt at repeating this
sequence. The sequence must be produced correctly over 5 repetitions on a trial for it to be used to calculate
the MRRtri.
Sequence
Score 1 if the child produces a correct repetition of the sequence. Score 0 if the child does not succeed in
producing a correct sequence.
Attempts
This score is the number of additional attempts (beyond the first three) that are required for the child to achieve
a correct repetition of the sequence.
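The summary measures defined in Table 1 reduce to simple means over the best trials. The short Python sketch below illustrates the arithmetic only; the function and parameter names are ours, not part of any published protocol or software:

```python
def mpd(best_a: float, best_mama: float) -> float:
    """Maximum Phonation Duration: mean of the longest [a] and the
    longest [mama...] prolongations, in seconds."""
    return (best_a + best_mama) / 2

def mfd(best_f: float, best_s: float, best_z: float) -> float:
    """Maximum Fricative Duration: mean of the longest [f], [s], and [z]
    prolongations, in seconds."""
    return (best_f + best_s + best_z) / 3

def mrr_mono(fastest_pa: float, fastest_ta: float, fastest_ka: float) -> float:
    """MRRmono: mean of the fastest [pa], [ta], and [ka] repetition rates,
    in syllables per second."""
    return (fastest_pa + fastest_ta + fastest_ka) / 3

# Illustrative values (hypothetical trial results)
print(round(mpd(12.96, 11.0), 2))       # mean of the two longest phonations
print(round(mfd(9.0, 12.0, 12.0), 2))   # mean of the three longest fricatives
print(round(mrr_mono(5.0, 4.5, 4.0), 2))
```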
Generic free or inexpensive software programs are
available to record sound and display the waveforms of
the recordings (e.g., GoldWave, GoldWave Inc., 2005;
PRAAT, Boersma & Weenink, 2005). Software packages
are also available that detect syllable peaks and perform
an automatic count (e.g., Motor Speech Profile,
KayPentax). The TOCS+™ MPT Recorder© ver. 1 (Hodge
& Daniels, 2004) is freeware that was developed specifically
to facilitate administration and measurement of MPTs
with children, following the protocol of Thoonen et al.
(1996). It turns any personal computer running Windows
98 or a later operating system into a digital audio
recorder with a sampling rate of 48 kHz and a
quantization size of 16 bits.
An inexpensive computer microphone is adequate
for the durational measures to be obtained from the
recordings of the child's responses to the MPTs. A
head-mounted microphone is preferable if the child will
tolerate this, but a table microphone is a second option. The
software sets a standard recording level at start-up that
can be checked and modified within the software before
administering the MPT protocol. It guides the user
through administration of the MPD, MFD, and MRR
tasks in succession. At the beginning of each task type
(MPD, MFD, MRRmono, and MRRtri) a screen with
instructions similar to those summarized in Table 1 is
displayed to cue the examiner so that the same instructions
are given each time. This is followed by successive screens
for a practice trial followed by the required number of test
trials for each MPT listed in Table 1. For each of these
trials, a short tone and a small icon appear on the screen
to signal to the child that it is time to start the task (see
Figure 1). This ensures onset synchronization of the child's
response and recording, reduces the likelihood of
overlapping examiner and child speech, and avoids false
starts and unnecessary repeat trials. Recordings of each
trial are saved as a .wav file that is named by task and trial
number and stored in the child's folder.
Measurement of MPD and MRR
Digital recordings of MPTs obtained using the
TOCS+ MPT Recorder (or other software with recording
capabilities) can be displayed as a waveform by a variety
of software packages, such as those cited previously. In the
examples that follow, Time-Frequency-Response version
2.1 (TFR; AVAAZ Innovations, Inc., 1999) was used to
demonstrate the measurement of durations and repetition
rates. The basic procedure is the same regardless of the
specific software used to display the waveforms and
measure the durations.
Measurement of MPD is the most straightforward.
After loading the sound file into a waveform display
window, visual inspection of the waveform and the partial
playback feature of the software helps to identify the
waveform that represents the production of the [a]. For
example, in the waveform shown in Panel A of Figure 2,
the prolonged [a] is preceded by some examiner speech
and the client's inhalation, and there is a second inhalation
that follows the [a] production. Waveform editors
provide a 'click and drag' function for marking off the
specific waveform of interest, in this case the waveform
that is marked with a bracket. In Panel B of Figure 2 the
duration of the [a] is shown as being 12,956.92 ms which,
when divided by 1000, yields approximately 12.96 seconds.
The procedure for measuring duration of [mama], [f],
[s], and [z] is the same as that shown here for [a].
Figure 1. Instruction screen with visual prompt to the child to begin the practice trial for the first maximum
performance task, from the TOCS+ MPT Recorder Version 1 (altered to appear in black and white).
Figure 2. Panel A shows the waveform of the recording of a prolonged ‘ah’ [a] marked by a bracket and surrounded
by extraneous information in the file (examiner speech and client inhalations). Panel B shows the prolonged ‘ah’ cut
from the first file as shown in Panel A so that the extraneous information is removed. The duration of the file
(approximately 12957 ms) is indicated with an arrow.
Measurement of MRRmono is accomplished by
loading the sound file into the waveform display window
and marking off 10 consecutive repetitions of the syllable,
as shown in Figure 3. As described in Table 1, all 10
syllables should be produced on a single breath. These 10
syllables should not include the first syllable after an
inspiration or the last syllable before an inspiration. In
Panel B of Figure 3 the selected 10 syllables are isolated
from the rest of the file. The total duration of the selected
portion is shown as approximately 1835 ms. When using
Thoonen et al.'s protocol for interpreting the results it is
necessary to calculate the number of syllables produced
per second. This value is obtained by converting the time
value to seconds and dividing the 10 repetitions by the
total time in seconds yielding 10/1.835 = 5.45 syllables per
second in this case.
The procedure for determining MRRtri is the same as
that for determining MRRmono except that 4 consecutive
repetitions of the sequence [pataka] (i.e., 12 syllables) are
marked off. The number of syllables per second is
calculated as described previously for MRRmono. For the
example shown in Figure 4, the total time taken to produce
4 repetitions of the sequence [pataka] was 1580 ms. This
results in a rate of 7.59 syllables per second (12 syllables/
1.58 seconds).
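For readers who prefer to script the conversions above, the rate calculation is a one-line division of syllable count by duration in seconds. The sketch below is a minimal illustration using the example values from Figures 3 and 4; the function name is ours:

```python
def repetition_rate(n_syllables: int, duration_ms: float) -> float:
    """Convert a measured duration in milliseconds to syllables per second."""
    return n_syllables / (duration_ms / 1000.0)

# MRRmono example from Figure 3: 10 repetitions of [pa] in approximately 1835 ms
print(round(repetition_rate(10, 1835), 2))  # ≈ 5.45 syllables per second

# MRRtri example from Figure 4: 4 repetitions of [pataka] (12 syllables) in 1580 ms
print(round(repetition_rate(12, 1580), 2))  # ≈ 7.59 syllables per second
```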
Alternative Calculation Procedures
The procedures described in the previous section for
measuring MRRmono and MRRtri are specific to the
Thoonen et al. (1999) protocol. The way in which
repetition rates are calculated and represented depends
upon the norms that will be used to interpret the child's
performance. Some norms for single syllable repetition
rates are presented as the time taken to produce a specified
number of repetitions (e.g., Fletcher, 1972). When using
Fletcher's time-by-count norms the examiner simply marks
the required number of repetitions and notes the time
taken to produce those repetitions. Some norms for the
interpretation of trisyllable repetition rates, such as those
published by Robbins and Klee (1987), are based on the
number of repetitions of the entire sequence (e.g., in the
example in Figure 4, four repetitions of the sequence in
1.58 seconds yields a rate of 2.53 repetitions of [pataka]
per second).
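When norms expressed as repetitions of the whole sequence per second are used, the same measurement converts directly, as in this brief sketch (the function name is ours):

```python
def sequence_rate(n_sequences: int, duration_s: float) -> float:
    """Repetitions of the whole sequence (e.g., [pataka]) per second."""
    return n_sequences / duration_s

# Figure 4 example: 4 repetitions of [pataka] in 1.58 s
print(round(sequence_rate(4, 1.58), 2))  # ≈ 2.53 repetitions per second
```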
An important point about the measurement of MRRtri
described by Thoonen et al. (1999) is that it requires a
repetition of correctly articulated sequences. Some
younger children may be unable to correctly articulate
the [ka] phoneme in which case they might repeat [patata],
a response that should not be scored using Thoonen et al.'s
procedure. Williams and Stackhouse (2000) reported
repetition performance for three-syllable words and
nonsense words in which accuracy, rate, and consistency
measures were derived independently. Therefore, their
paper provides a normative reference for the repetition
rate, regardless of accuracy, for 3- to 5-year-old children.
They found that even 3-year-olds produced repetition
rates no slower than three syllables per second. They
suggested that the ability to repeat a consistent [patata]
sequence at a rate of at least three syllables per second
would not be reason for concern with this age group.
However, inconsistent and inaccurate repetitions of the
sequence would be cause for concern.
Figure 3. Panel A shows the client’s repetitions of the syllable ‘pa’ from the first syllable until the time when recording
was stopped. The duration of the 10 repetitions that are marked by the bracket is measured by cutting these repetitions
from the file as shown in Panel B (file length approximately 1835 ms). The duration of the 10 repetitions shown in the
cut file is indicated with an arrow.
Differential Diagnosis
Thoonen et al. (1999) developed a flow chart for
differential diagnosis of dysarthria and dyspraxia, based
on MPT data that they obtained from children aged 6
through 10 years. The application of these criteria is
described here. Figure 5 illustrates the results of this
interpretative process for a hypothetical 7-year-old child.
The process begins with the assignment of a dysarthria
score of 0, 1, or 2, where 0 indicates that the child is not
dysarthric and a 2 indicates that the child is primarily
dysarthric. MRRmono is the primary diagnostic marker
for dysarthria. A score of 0 is assigned if MRRmono is
greater than 3.5 syllables per second. A score of 2 is
assigned if the MRRmono is less than 3 syllables per
second. If the child’s MRRmono is between 3 and 3.5, the
MPD is examined: if the MPD is less than 7.5 seconds, a
score of 2 is assigned; if the MPD is more than 7.5 seconds,
a score of 1 is assigned.
Figure 4. Panel A shows the client’s repetition of the sequence ‘pataka’ from the first syllable until the time when
recording was stopped. The brackets indicate the first sequence, which is excluded, and the next 4 sequences that are
cut to form the display shown in Panel B. The time taken to produce the 12 syllables comprising these 4 sequences
(1580 ms) is marked with an arrow.
Figure 5. Example of calculation of Maximum Phonation Duration (MPD), Maximum Fricative Duration (MFD),
Maximum Repetition Rate for monosyllables (MRRmono), Maximum Repetition Rate for trisyllabic sequences (MRRtri),
Attempts, and Sequence. Interpretation of these data to yield a diagnosis is shown at the bottom of the chart.
Next, a dyspraxia score of 0, 1, or 2 is assigned, where
0 indicates that the child is not dyspraxic and a score of 2
indicates that the child is dyspraxic. MRRtri and Attempts
are the primary diagnostic markers for CAS. A score of 0
is assigned if the child produces a correct trisyllabic
sequence at a rate of at least 4.4 syllables per second
without requiring more than 2 additional attempts. If the
child cannot produce a correct sequence or the MRRtri
for a correct sequence is less than 3.4 syllables per second,
a score of 2 is assigned. If the MRRtri is between 3.4 and
4.4 syllables per second, a score of 1 is assigned as long as
the MFD is appropriate at more than 11 seconds and the
child did not require more than 2 additional attempts to
achieve a correct sequence. If the MRRtri is between 3.4
and 4.4 syllables per second and more than 2 additional
attempts were needed to achieve a correct sequence, a score
of 2 is assigned. A score of 2 is also assigned if MRRtri is
between 3.4 and 4.4 syllables per second and MFD is 11
seconds or less.
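The scoring logic described above can be sketched as a pair of functions. This is our paraphrase of the Thoonen et al. (1999) flow chart, not the published chart itself; boundary cases that the prose leaves open (for example, a fast trisyllabic rate achieved only after many attempts) are resolved by assumptions flagged in the comments:

```python
def dysarthria_score(mrr_mono: float, mpd: float) -> int:
    """Dysarthria score: 0 = not dysarthric, 2 = primarily dysarthric.
    Thresholds as summarized in the text above."""
    if mrr_mono > 3.5:
        return 0
    if mrr_mono < 3.0:
        return 2
    # MRRmono between 3 and 3.5 syllables per second: examine MPD
    return 2 if mpd < 7.5 else 1

def dyspraxia_score(sequence_correct: bool, mrr_tri: float,
                    mfd: float, attempts: int) -> int:
    """Dyspraxia score: 0 = not dyspraxic, 2 = dyspraxic."""
    if not sequence_correct or mrr_tri < 3.4:
        return 2
    if mrr_tri >= 4.4:
        # Assumption: a fast but effortful sequence (many attempts) scores 2;
        # this case is not spelled out in the prose summary.
        return 0 if attempts <= 2 else 2
    # MRRtri between 3.4 and 4.4 syllables per second
    if attempts > 2 or mfd <= 11.0:
        return 2
    return 1

# Hypothetical child profiled in Figure 5: MRRmono not slow enough for
# dysarthria, MRRtri = 3.45, correct sequence only on the sixth trial (Attempts = 3)
print(dysarthria_score(mrr_mono=3.2, mpd=8.0))                      # 1
print(dyspraxia_score(True, 3.45, mfd=12.0, attempts=3))            # 2
```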
Note that a diagnosis of ‘primarily dysarthria’ would
be concluded if the child received both dysarthria and dyspraxia
scores of 2. Children with dysarthria are likely to produce
very slow repetition rates for both monosyllables and
trisyllabic sequences. Children with CAS are likely to
produce repetition rates that are slower for trisyllabic
sequences than for monosyllables (Thoonen et al., 1999).
The hypothetical child profiled in Figure 5 received a
dysarthria score of 1 and a dyspraxia score of 2, justifying
a clinical diagnosis of CAS. His MRRmono was not slow
enough to justify a diagnosis of dysarthria. His MRRtri
was somewhat slow at 3.45 syllables per second, and he did
not achieve a correct repetition of the sequence until the
sixth trial. Thus the combination of Attempts = 3 and
MRRtri between 3.4 and 4.4 led to a dyspraxia score of 2,
resulting in a diagnosis of childhood apraxia of speech.
Summary and Conclusions
A number of normative data sets are available to aid
in the interpretation of a child’s ability to prolong sounds
and repeat syllables (e.g., Kent et al., 1987; Robbins &
Klee, 1987; Thoonen et al., 1996, 1999; Williams &
Stackhouse, 2000). For children with specific phonological
errors such as velar fronting, the diagnostic accuracy of
the procedure can be improved by considering accuracy
and consistency of production of a trisyllabic sequence as
described by Williams and Stackhouse (2000). Thoonen
et al. have provided a framework for using MPTs to assist
in differential diagnosis of speech dyspraxia or dysarthria
in pediatric clients. Technological advances such as the
TOCS+™ MPT Recorder© ver. 1 (Hodge & Daniels,
2004) and readily available waveform editors facilitate
reliable administration and recording of children’s
responses and accurate measurement of maximum
durations and maximum repetition rates.
The publication of new normative data and the
availability of audio recording and editing software have
eliminated significant impediments to the use of maximum
performance tasks with children. It is our hope that the
application of these procedures will result in reliable and
valid normative data from younger children and become
a routine part of the speech-language assessment protocol
for all children with suspected or confirmed speech
disorders and delays.
Author Note
Address correspondence to Dr. Susan Rvachew,
School of Communication Sciences and Disorders, McGill
University, 1266 Pine Avenue West, Montréal, Québec
H3G 1A8. Development of the TOCS+™ MPT Recorder©
ver. 1 was supported by a grant from the Canadian
Language and Literacy Research Network
(www.cllrnet.ca) and uses the Universal Sound Server©
software developed for the TOCS+ Project
(www.Tocs.plus.ualberta.ca) at the University of Alberta
by Tim Young. Readers interested in using the TOCS+™
MPT Recorder© can contact Megan Hodge
([email protected]) to obtain a copy of the software.
References
Avaaz Innovations, Inc. (1999). Time-Frequency-Response Version 2.1 (TFR).
[Computer software]. London, Ont.: Avaaz Innovations, Inc. www.avaaz.com.
Boersma, P. & Weenink, D. (2005). Praat Version 4.3.12 [Computer software]
Institute of Phonetic Sciences, University of Amsterdam, www.fon.hum.uva.nl/praat/.
Campbell, T. F. (2003). Childhood apraxia of speech: Clinical symptoms and speech
characteristics. In L. D. Shriberg & T. F. Campbell (Eds.), 2002 Childhood Apraxia of
Speech Research Symposium. Carlsbad, CA: The Hendrix Foundation.
Duffy, J. (1995). Motor speech disorders: Substrates, differential diagnosis and
management. St. Louis, MO: Mosby-Year Book, Inc.
Fletcher, S. (1972). Time-by-count measurement of diadochokinetic syllable rate.
Journal of Speech and Hearing Research, 15, 763-770.
Goldwave Inc. (2005). Goldwave Version 5.10. [Computer software].
www.goldwave.com.
Gibbon, F. E. (1999). Undifferentiated lingual gestures in children with articulation/
phonological disorders. Journal of Speech, Language, and Hearing Research, 42, 382-397.
Hickman, L. (1997). The Apraxia Profile. Communication Skill Builders/Therapy
Skill Builders, a division of The Psychological Corporation.
Hodge, M. M. & Daniels, J. D. (2004). TOCS+™ MPT Recorder© ver.1. [Computer
software]. University of Alberta, Edmonton, AB.
KayPentax. Motor Speech Profile [Computer software]. www.kayelemetrics.com.
Kent, R. D., Kent, J. F., & Rosenbek, J. C. (1987). Maximum performance tests of
speech production. Journal of Speech and Hearing Disorders, 52, 367-387.
Potter, N., Kent, R., & Lazarus, J. (2004, March). Measures of speech and manual
motor performance in children. Presented at the Conference on Motor Speech,
Albuquerque, NM.
Robbins, J. & Klee, T. (1987). Clinical assessment of oropharyngeal motor
development in young children. Journal of Speech and Hearing Disorders, 52, 271-277.
Shriberg, L.D. (1997). The Speech Disorders Classification System (SDCS):
Extensions and lifespan reference data. Journal of Speech, Language, and Hearing
Research, 40, 723-740.
Shriberg, L. D., Aram, D. M., & Kwiatkowski, J. (1997). Developmental apraxia
of speech: II. Toward a diagnostic marker. Journal of Speech, Language, and Hearing
Research, 40, 286-312.
Shriberg, L. D., Campbell, T. F., Karlsson, H. B., Brown, R. L., McSweeny, J. L.,
& Nadler, C. J. (2003). A diagnostic marker for childhood apraxia of speech: the lexical
stress ratio. Clinical Linguistics & Phonetics, 17, 549–574.
Thoonen, G., Maassen, B., Gabreels, F., & Schreuder, R. (1999). Validity of
maximum performance tasks to diagnose motor speech disorders in children. Clinical
Linguistics & Phonetics, 13, 1-23.
Thoonen, G., Maassen, B., Wit, J., Gabreels, F. & Schreuder, R. (1996). The
integrated use of maximum performance tasks in differential diagnostic evaluations
among children with motor speech disorders. Clinical Linguistics & Phonetics, 10,
311-336.
Williams, P., & Stackhouse, J. (2000). Rate, accuracy and consistency:
Diadochokinetic performance of young, normally developing children. Clinical
Linguistics & Phonetics, 14, 267-293.
Received: November 29, 2004
Accepted: August 8, 2005
Revue d’orthophonie et d’audiologie - Vol. 29, No 4, Hiver 2005
Applications of 2D and 3D Ultrasound Imaging in
Speech-Language Pathology
Utilisation de l’échographie en 2D et 3D en orthophonie
Tim Bressmann, Chiang-Le Heng and Jonathan C. Irish
Abstract
Tongue motion in speech and swallowing is difficult to image because the tongue is concealed in the
oral cavity. It is even more difficult to assess the extent of lingual motion quantitatively. Ultrasound
imaging of the tongue in speech production and swallowing allows for a safe and non-invasive data
acquisition. The paper describes the potentials and methodological problems of conducting
ultrasound speech research using dynamic two-dimensional, static three-dimensional and dynamic
three-dimensional ultrasound imaging.
Abrégé
Le mouvement de la langue pour la parole et la déglutition est difficile à mettre en image parce que
cet organe est dissimulé dans la cavité buccale. Il est encore plus difficile d’évaluer l’ampleur du
mouvement de la langue de manière quantitative. L’échographie de la langue lors de la production
de la parole et de la déglutition constitue une méthode sécuritaire et non invasive d’obtenir des
données. Le présent article décrit les possibilités et les problèmes méthodologiques pour mener des
recherches sur l’utilisation de l’échographie dynamique en deux dimensions, statique en trois
dimensions et dynamique en trois dimensions dans le domaine de l’orthophonie.
Key Words
Glossectomy, cancer, tongue paralysis, speech, articulation, swallowing, tongue, 3D
ultrasound, 2D ultrasound
Tim Bressmann, Ph.D.
Graduate Department of
Speech-Language Pathology
University of Toronto
Toronto, ON Canada
Chiang-Le Heng, B.Sc.
Graduate Department of
Speech-Language Pathology
University of Toronto
Toronto, ON Canada
Jonathan C. Irish, M.D.,
M.Sc., F.R.C.S.C., F.A.C.S.
Departments of
Otolaryngology and Surgical
Oncology
Princess Margaret Hospital
University Health Network
and Toronto General
Hospital, University Health
Network
Toronto, ON Canada
Introduction
Ultrasound imaging of the tongue is currently gaining popularity as a
research tool in speech-language pathology and speech science. Until ten
years ago, ultrasound machines were out of reach for most speech researchers
because they were very expensive. In recent years, advances in computer technology
and increased competition between the different manufacturers of ultrasound machines
have helped to bring down the costs considerably. As a consequence, more speech
researchers are now able to purchase ultrasound machines for their laboratories. Also,
more physicians are buying machines for their hospital or private practice. This in turn
may potentially give speech-language pathologists who are affiliated with hospitals or
physicians in private practice access to ultrasound machines.
The main advantages of ultrasound imaging over other methods in phonetic
research lie in low cost, bio-safety and ease of image acquisition. After an ultrasound
machine has been purchased, associated costs to collect data are negligible. The
radiation levels that are generated by a medical ultrasound machine are extremely low
and do not accumulate, so it is biologically safe to conduct extended recording sessions or
examine patients repeatedly. This is an advantage over x-ray based imaging methods
such as videofluoroscopy. Ultrasound imaging is non-invasive for the patient because
it is not necessary to glue transducer coils to the tongue (like in electromagnetic
midsagittal articulography). The ultrasound data
acquisition is reasonably comfortable for the participant
so that clinical populations can be studied. It also becomes
easier to make recordings with notoriously ‘difficult’
research populations such as children.
In this paper, we will give an introduction to a
number of different applications that we see for
ultrasound imaging in speech-language pathology and
demonstrate research findings from different projects
that were undertaken in the Voice and Resonance
Laboratory at the University of Toronto.
Two-Dimensional Ultrasound
Diagnostic ultrasound makes use of the pulse-echo
principle: Any physical impulse into an environment
puts objects into oscillation and results in echoes. In the
ultrasound machine, a piezoelectric crystal in the
ultrasound transducer generates a sound-burst. After
sending the burst, the same piezoelectric crystal is then
put into receiving mode and listens for the echoes. By
repeating this process hundreds of times every second, a
two-dimensional image can be reconstructed by a
computer. Commercially available ultrasound machines
deliver a video-frame rate of 30 frames per second. This
corresponds to the standard NTSC format that is used in
television sets and video-recorders in North America.
The NTSC video-frame rate is sufficiently fast to capture
even the quicker aspects of tongue movement in speech
such as those involved in the production of plosives.
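The pulse-echo principle and the NTSC frame rate described above can be put into numbers. The following sketch is our own illustration, not a calculation from the paper; the average soft-tissue sound speed of roughly 1540 m/s is a standard textbook value:

```python
# Pulse-echo range equation: an echo returning after time t from
# tissue with sound speed c comes from depth c * t / 2, because the
# pulse travels to the reflector and back.

SPEED_OF_SOUND_TISSUE = 1540.0  # m/s, textbook average for soft tissue

def echo_depth_mm(round_trip_time_s: float) -> float:
    """Depth of the reflecting structure, in millimetres."""
    return SPEED_OF_SOUND_TISSUE * round_trip_time_s / 2 * 1000

# An echo returning after 100 microseconds originates about 77 mm
# deep, on the order of the scan depth needed to reach the tongue
# surface through the floor of the mouth.
depth = echo_depth_mm(100e-6)

# At the NTSC rate of 30 frames per second, successive images are
# 1/30 s (about 33 ms) apart.
frame_interval_ms = 1000 / 30
```

Because each scan line must wait for its echoes before the next burst is sent, imaging depth and achievable frame rate are linked in this way.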
Diagnostic ultrasound has long been used in phonetic
research to examine the tongue shape in different speech
sounds (Morrish, Stone, Sonies, Kurtz, & Shawker,1984;
Shawker & Sonies, 1984; Stone, Morrish, Sonies, &
Shawker, 1987, 1988; Wein, Bockler, Huber, Klajman, &
Wilmes, 1990) as well as to assess temporal aspects of
speech motor control (Munhall, 1985; Parush & Ostry,
1993). Much of the clinical application of ultrasound
imaging for the study of tongue function to date has
focused on the study of the oral phase of swallowing
(Casas, Kenny, & Macmillan, 2003; Chi-Fishman, Stone,
& McCall, 1998; Neuschäfer-Rube, Wein, Angerstein,
Klajman, & Fisher-Wein, 1997; Peng, Jost-Brinkmann,
Miethke, & Lin, 2000; Soder & Miller, 2002; Sonies,
Baum, & Shawker, 1984; Shawker, Sonies, Stone, &
Baum, 1983; Stone & Shawker, 1986). The method has
proven useful even for babies (Bosma, Hepburn, Josell,
& Baker, 1990). Swallowing and speech have been studied
in different pathological populations including patients
with cerebral palsies (Casas, Kenny, & McPherson, 1994;
Casas, McPherson, & Kenny, 1995; Kenny, Casas, &
McPherson, 1989; Sonies & Dalakas, 1991), strokes
(Wein, Alzen, Tolxdorff, Bockler, Klajman, & Huber,
1988a; Wein, Angerstein, & Klajman, 1993), geriatric
patients (Sonies et al., 1984), glossectomy (Schliephake,
Schmelzeisen, Schönweiler, Schneller, & Altenbernd,
1998), and malocclusions (Cheng, Peng, Chiou, & Tsai,
2002; Kikyo, Saito, & Ishikawa, 1999).
The fact that the tongue shape can be displayed in
real time on the screen makes ultrasound potentially
very attractive to speech-language pathologists as a tool
for biofeedback for oral deaf speakers and also for patients
with dysarthrias or compensatory articulation errors
associated with cleft palate. However, only a few studies
so far have used ultrasound as a tool for biofeedback in
speech therapy. Shawker and Sonies (1985) described
the use of ultrasound imaging for the speech therapy of
an individual with an articulation disorder. The authors
found that the subject was able to improve her
articulatory distortions of the /r/ sound over a course of
3 months of ultrasound biofeedback therapy. A recent
study by Bernhardt et al. (Bernhardt, Bacsfalvi, Gick,
Radanov, & Williams, this issue; Bernhardt, Gick,
Bacsfalvi, & Ashdown, 2003) compared
ultrasonographic and electropalatographic feedback in
four speakers with mild to severe hearing loss. The authors
found that all subjects were able to improve their
articulation and that both feedback methods were
equally effective.
For the research in the Voice and Resonance
Laboratory at the University of Toronto, we use a low-end General Electric Logiq Alpha 100 MP ultrasound
machine with a 6.5 MHz micro convex curved array
scanner with a 114° view (Model E72, General Electric
Medical Systems, P.O. Box 414, Milwaukee, Wisconsin
53201). The machine and the transducer are displayed in
Figure 1. During the ultrasound examination, the video output of the ultrasound machine is captured with a
generic digital video camera (Canon ZR 45 MC, Canon
Canada Inc., 6390 Dixie Road, Mississauga, Ontario L5T
1P7). Parallel sound recordings are made onto the same
digital videotape using an AKG C420 headset condenser
microphone (AKG Acoustics, 914 Airpark Center Drive,
Nashville, Tennessee 37217) with a Behringer Ultragain
Pro 2200 line-driver (Behringer Ltd., 18912 North Creek
Pkwy, Suite 200, Bothell, Washington 98011). After the
recording, the ultrasound films are downloaded from
the digital video camera onto a computer and saved as
a digital file. Figure 2 shows typical midsagittal tongue
contours during the sustained production of the cardinal
vowels /a/, /i/, and /u/.
Problems in 2D ultrasound imaging: Head
fixation
The ultrasound image is acquired by holding the
transducer against the neck of the research participant.
If the transducer is held manually against the neck of a
subject who sits in a standard office chair, there are a
number of moving elements that might lead to
measurement error:
· The examiner’s hand with the ultrasound transducer
may move or change the angle of the transducer;
· The subject’s mandible moves up and down which
may change the position of the transducer;
· The subject’s head and shoulders may move which
can affect the coupling or the angle of the transducer.
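To see why these moving elements matter quantitatively, consider a simple geometric sketch (ours, with hypothetical numbers): a small tilt of the transducer shifts the apparent position of a structure at depth d by roughly d · sin(Δθ).

```python
import math

# Back-of-envelope illustration (not from the paper) of the effect of
# transducer movement: if the transducer tilts by a small angle, a
# point imaged at depth d appears shifted laterally by about
# d * sin(angle).

def lateral_shift_mm(depth_mm: float, tilt_deg: float) -> float:
    """Apparent lateral displacement of an imaged point, in mm."""
    return depth_mm * math.sin(math.radians(tilt_deg))

# A 2 degree tilt, with the tongue surface a hypothetical 60 mm from
# the transducer face, already displaces the measured contour by
# about 2 mm.
shift = lateral_shift_mm(60.0, 2.0)
```

Seen against numbers like these, a stabilization scheme that keeps total drift under a couple of millimetres is doing real work.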
Figure 1. Ultrasound machine with 114° endocavity transducer and ultrasound gel. The image also shows the PC Bird
electromagnetic movement tracking system, which is used to reconstruct three-dimensional ultrasound volumes.
Figure 2. Midsagittal ultrasound contours of the tongue during the production of sustained vowels. The anterior tongue
is towards the right side of the images. To facilitate viewing, the tongue contours are marked with a grey line. (a)
sustained /a/; (b) sustained /i/; (c) sustained /u/.
If one wishes to generate quantitative tongue
movement data from an ultrasound film, it is important
to reduce movement of the transducer and the subject’s
head. On the other hand, the fixation mechanism should
ensure good coupling during regular mandibular
movement in speech. Different groups of researchers
have used head stabilization devices such as headrests for
the back of the head (Davidson, 2004; Stone et al., 1988),
headrests for the forehead (Peng et al., 2000), complete
head fixation (Stone & Davies, 1995), helmets with
transducer attachments (Hewlett, Vasquez, Zharkova,
& Zharkova, 2004) or position control with laser pointers
(Gick, 2002). In an alternative approach, Whalen et al.
(2004) suggested minimizing the head movement with a
headrest and tracking the residual movement with a
three-dimensional optical tracking system.
In our laboratory at the University of Toronto, we
developed the Comfortable Head Anchor for
Sonographic Examinations (CHASE; Carmichael, 2004)
that was roughly modelled on the device described by
Peng et al. (2000). The CHASE, which is depicted in
Figure 3, consists of a headrest for the subject’s forehead
and a transducer cradle with a suspension-spring
mechanism. Since a large number of our research
participants are head and neck cancer patients, it was
our goal for the development of the CHASE to make the
device as unintimidating as possible. For this reason, the
CHASE only anchors and stabilizes the participant’s
head while avoiding forced head fixation. Our experience
to date shows that this effectively reduces all head and
transducer movement to an acceptable minimum (less
than 1.5 mm lateral wandering after ten minutes of
speech recordings).
Figure 3. A research participant in the CHASE head anchor.
Problems in 2D ultrasound imaging: Image
analysis
The set-up for ultrasound imaging and the data
acquisition require minimal advance preparation.
However, the data analysis can be labour-intensive. Every
second of film generates 30 separate image frames, and we
are interested in the positions of different parts of the
tongue. It is extremely time consuming to do these analyses
by hand. It is therefore desirable to automate the data
analysis as much as possible.
At first, the automatic extraction of tongue contours
from an ultrasound film may seem a trivial task because
the tongue movement can be easily visualized and appears
clearly on the screen of the ultrasound machine. However,
ultrasound is an acoustic imaging method and, as a
consequence, the image is often noisy and may contain
artefacts. On close inspection, the tongue contours that
look so clear to the experimenter’s eye are in fact blurry
patches of diffuse shades of grey. The human observer has
the advantage of Gestalt perception, which means that a
pattern of elements is perceived as a unified whole. While
the moving tongue surface is easily discernible for the
human observer, the automatic extraction of tongue
contours from an ultrasound film is a challenging problem
for computer vision programming.
Over the years, the researchers at the Vocal Tract
Visualization Laboratory at the University of Maryland
have developed a number of different successful motion
trackers (Akgul, Kambhamettu, & Stone, 1999; Unser &
Stone, 1992). The current version of the EdgeTrak software
(Li, Kambhamettu, & Stone, 2003) can be downloaded
at http://speech.umaryland.edu. Other programs are
currently being developed at Queen Margaret University
College in Edinburgh (Wrench, 2004) and at the
University of British Columbia in Vancouver (Gick &
Rahemtulla, 2004). However, even the best automatic
motion trackers will often lose a tongue contour that they
are tracking. The deformation of the tongue shape can be
very rapid. A good example for a situation that often leads
to the failure of a motion tracker is when the speaker
changes from a low to a high vowel. The tracking of
ultrasonographic tongue contours is probably more an
artificial intelligence rather than an image-processing
problem, and the current generation of motion trackers
cannot be expected to perform flawlessly. It is therefore
important that the user reviews and corrects the automatic
tracking results.
We recently completed the development of our own
software, named the Ultrasonographic Contour Analyzer
for Tongue Surfaces (Ultra-CATS; Gu, Bressmann,
Cannons, & Wong, 2004). The Ultra-CATS software can
be downloaded from www.slp.utoronto.ca/People/Labs/
TimLab/ultracats.htm. The Ultra-CATS was designed
with a focus on the semi-automated analysis of the
ultrasound data. The program also incorporates an
automatic tracking option. The main goal of the software
was to facilitate the manual frame-by-frame analysis of an
image sequence. In the semi-automated analysis, the user
traces the tongue surface with a drawing tool on each
image frame. The software then measures points on the
tongue surface with a polar grid. In our experience, the
manual tracing of a single ultrasound frame will take a
trained experimenter about 7 seconds on average. At this
pace, every ten seconds of continuous ultrasound film will
take approximately 35 minutes to analyze. An additional
feature of the Ultra-CATS software is an automatic image-processing algorithm. The automatic tracker can be set,
run, stopped, corrected and set back on track at any point
during the analysis. The Ultra-CATS program saves all
measurements to a text file so that they can be edited and
analyzed using a spreadsheet editor or a statistical analysis
program. An image of the program interface of the Ultra-CATS can be found in Figure 4.
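The tracing-time estimate above follows directly from the NTSC frame rate. A short check, assuming the quoted 7 seconds of manual tracing per frame:

```python
# Verifying the analysis-time arithmetic in the text: every second of
# film contains 30 NTSC frames, and each frame takes about 7 s to
# trace by hand.

FRAMES_PER_SECOND = 30
TRACE_SECONDS_PER_FRAME = 7

def tracing_minutes(film_seconds: float) -> float:
    """Minutes of manual tracing needed for a stretch of film."""
    frames = film_seconds * FRAMES_PER_SECOND
    return frames * TRACE_SECONDS_PER_FRAME / 60

minutes = tracing_minutes(10)  # ten seconds of film -> 35.0 minutes
```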
Since we have completed the development of the
Ultra-CATS software, we have used it to analyze the
speech and swallowing of normal speakers as well as
patients with tongue cancer and lingual paralyses. Figure
5 shows a waterfall display for a water swallow of a normal
male participant. The waterfall display shows the elevation
of the back of the tongue that prevents predeglutitive
aspiration. As the oral transport phase of the swallow
begins, the back of the tongue lowers in order to allow
passage of the bolus. We can then appreciate the progressive
elevation of the tongue from the front to the back as the
bolus is cleared from the oral cavity.
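A waterfall display of this kind is straightforward to construct from frame-by-frame measurements: each frame's tongue heights are drawn with a constant vertical offset so that time runs up the page. A minimal sketch of the stacking step (ours, with hypothetical toy numbers; the authors' displays are produced from Ultra-CATS output):

```python
# Build waterfall data by shifting frame i upward by i * offset so
# that successive tongue contours do not overlap when plotted.

def waterfall(contours, offset=5.0):
    """Return a copy of the per-frame height lists, vertically offset."""
    return [
        [height + i * offset for height in frame]
        for i, frame in enumerate(contours)
    ]

# Three toy frames of five measurement points each:
frames = [[10, 12, 15, 12, 10],
          [10, 11, 13, 11, 10],
          [10, 14, 18, 14, 10]]
stacked = waterfall(frames)
```

Each list in `stacked` can then be drawn as one trace, oldest frame at the bottom.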
Figure 6 shows a waterfall display of the phrase ‘ninety-three years old’. This segment was taken from a reading of
the second sentence of the Grandfather passage (‘Well, he
is nearly ninety-three years old’; van Riper, 1963), spoken
by a normal female speaker. The displayed segment represents
1.3 seconds of speech. Note the large number of posture
adjustments that the tongue makes to produce all the
required phonemes. Also note the immediate anticipatory
elevation of the anterior dorsum of the tongue after the
final /d/ plosion, which leads into the first high front
vowel of the next sentence (‘He dresses himself in an
ancient black frock coat …’).
Figure 4. Screenshot of the Ultra-CATS software.

Figure 5. Waterfall display of a water swallow by a normal female participant. The numbers indicate different phases of the swallow. (1) The dorsum of the tongue is elevated to prevent predeglutitive aspiration. (2) The swallow is initiated. The tongue lowers so that the water bolus can pass into the oropharynx. (3) First the tip and then the dorsum of the tongue elevate to clear the bolus from the oral cavity. During this part of the swallow, the hyoid bone moves forward to open the upper oesophageal sphincter. The forward movement of the hyoid is indicated by the missing data/zeros which are visible in the posterior tongue during this phase of the swallow. (4) After the swallow, the tongue returns to a neutral rest position and then lowers in preparation for the next swallow.

Figure 6. Waterfall display of the midsagittal tongue contours during the phrase ‘ninety-three years old’ from the second sentence of the Grandfather passage (‘Well, he is nearly ninety-three years old.’), spoken by a normal adult female participant.

Figure 7 shows two waterfall displays of repeated syllables. In Figure 7a, we see five repeated utterances of the syllable /aka/ by a normal female speaker. The lowering of the tongue from the rest position towards the position for the /a/ can be appreciated. The back of the tongue then elevates to achieve velar closure for the /k/, and the cycle is repeated. Figure 7b demonstrates the same repeated syllables spoken by an older female patient with a flaccid paralysis of the tongue resulting from post-Polio syndrome. It can be observed that the lingual paralysis leads to an undifferentiated elevation of the tongue, rather than the lowering of the tongue for /a/, which perceptually resulted in a centralized vowel. The patient cannot elevate the posterior tongue for /k/. Perceptually, this articulatory undershoot results in significantly reduced speech intelligibility.

Figure 7. Waterfall display of the midsagittal tongue contour during five repetitions of /aka/. (a) Normal adult female participant. Note the elevated tongue rest position before the beginning of the utterance. (b) Female patient with lingual paralysis resulting from post-Polio syndrome.

Static Three-Dimensional Ultrasound

The 2D display of tongue contours is interesting and affords us fascinating insights into the movement of the tongue. However, the tongue is a complex, non-rigid three-dimensional structure (Hiiemae & Palmer, 2003; Stone, 1990). It is especially important to recover the three-dimensional information about the shape of the tongue in glossectomy patients because the lingual resection and reconstruction rarely ever lead to a symmetrical outcome. This means that a midsagittal image will often be an incomplete representation of the surgically altered tongue shape. So far, three-dimensional ultrasound has mostly been used in feasibility studies in normal speakers (Lundberg & Stone, 1999; Stone & Lundberg, 1996; Watkin & Rubin, 1989; Wein, Klajman, Huber, & Doring, 1988b). While it was demonstrated that three-dimensional ultrasound has the capability to deliver exact representations of the lingual surface in different speech sounds, the above studies remained descriptive and did not attempt to quantify lingual movement ranges in the reconstructed three-dimensional volume. However, a quantitative approach to the three-dimensional deformation of the tongue during the production of speech sounds would be particularly desirable for the analysis of patients with glossectomy in order to compare pre- and postoperative movement ranges and to evaluate the effect of different reconstructive techniques on the deformation and symmetry of the tongue tissue.
In a series of ongoing studies at the Voice and Resonance Laboratory at the University of Toronto, we use our ultrasound machine in combination with a three-dimensional motion sensor (PC Bird, Ascension Technology Corporation, P.O. Box 527, Burlington, Vermont 05402). The FreeScan V7.04 computer program (Echotech 3D Imaging Systems, 85399 Halbergmoos, Germany) is used for the data acquisition and the reconstruction of the three-dimensional volumes. This set-up allows us to make three-dimensional scans of static structures and, consequently, all volume scans have to be obtained from sustained speech sounds.

Figure 8. Orthogonal planes of a three-dimensional ultrasound volume of the sustained vowel /a/, spoken by a normal male adult participant.

During the 3D data acquisition procedure, the subject is seated upright on a chair and instructed to slightly overextend his or her head. The transducer is held in a coronal scanning position and swept from the chin to the upper border of the thyroid cartilage. In a typical examination, a research participant sustains the following English phonemes: /a/, /i/, /u/, /s/, //, //, /l/, /n/ and //. Each speech sound is repeated three times. The sound is sustained for approximately 5 seconds while the ultrasound scan is made. A 3D ultrasound scan of the tongue usually takes 2-3 seconds.
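The freehand reconstruction that such a set-up performs can be summarized as mapping every pixel of each tracked 2D slice into a common 3D coordinate system using the pose reported by the motion sensor. The following is a deliberately simplified sketch of that idea (ours, not Echotech's algorithm; a single yaw rotation stands in for the full six-degree-of-freedom pose that a sensor like the PC Bird reports):

```python
import math

# Map an in-plane slice point (u, v) to world coordinates, given the
# transducer position and a yaw rotation of the slice plane about the
# vertical axis at the moment the slice was acquired.

def slice_point_to_3d(u_mm, v_mm, sensor_pos, yaw_deg):
    """World coordinates (x, y, z) of an in-plane point (u, v)."""
    yaw = math.radians(yaw_deg)
    x = sensor_pos[0] + u_mm * math.cos(yaw)
    y = sensor_pos[1] + u_mm * math.sin(yaw)
    z = sensor_pos[2] + v_mm
    return (x, y, z)

# A point 10 mm along a slice swept 90 degrees from the first slice
# ends up displaced along y rather than x:
p = slice_point_to_3d(10.0, 0.0, (0.0, 0.0, 0.0), 90.0)
```

Repeating this mapping for every pixel of every tracked slice, and binning the results into a voxel grid, yields the reconstructed volume.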
Using the FreeScan software, we can then browse through the three-dimensional ultrasound volume in any direction. Figure 8 illustrates how multiple planes of a three-dimensional ultrasound volume can be visualized. The sound shown is a sustained /a/ spoken by a normal male speaker. Figure 9 shows the reconstructed three-dimensional tongue volume of the same sustained /a/.

Figure 9. Reconstructed three-dimensional volume of the sustained /a/ from Figure 8.

In order to make measurements of the 3D tongue surface, it is important to define an anchor point in the tongue upon which all measurements can be based. We first align all 3D
scans so that the lingual septum is as exactly vertical as
possible and that muscle fibres between the chin and the
hyoid bone that are formed by the geniohyoid and the
inferior genioglossus are as exactly horizontal as possible.
We then define the ‘midpoint’ of the tongue as the halfway
point between the mandible and the hyoid on the superior
border of the geniohyoid muscle in the sagittal view, and
as the point of intersection of the lingual septum and the
geniohyoid muscle in the coronal view. Using this
midsagittal midpoint, we then identify a left and a right
parasagittal plane that is exactly parallel to the midsagittal
slice. Based on the anchor point, we then superimpose a
concentric grid with measurement lines spaced out in
11.25° intervals on the slices and measure the sagittal tongue form in three parallel sagittal planes. Figure 10 demonstrates how we take measurements in the midsagittal plane of the same ultrasound volume of the sustained /a/ shown in Figures 8 and 9.

Figure 10. Midsagittal plane of a sustained /a/ with overlay of a measurement grid.
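The concentric measurement grid can be thought of as sampling the traced tongue contour along rays spaced 11.25° apart around the anchor point. A simplified sketch of this sampling (ours; the helper names and toy contour are illustrative, not the Ultra-CATS implementation):

```python
import math

# Express traced contour points as (angle, radius) around the anchor
# point, then keep, for each 11.25-degree grid line, the measured
# point that lies closest to that line.

def to_polar(points, anchor):
    """Convert (x, y) contour points to (degrees, radius) about anchor."""
    ax, ay = anchor
    return [(math.degrees(math.atan2(y - ay, x - ax)),
             math.hypot(x - ax, y - ay)) for x, y in points]

def sample_on_grid(polar_points, step_deg=11.25):
    """Radius of the contour at each grid angle that has data."""
    samples = {}
    for angle, radius in polar_points:
        grid_angle = round(angle / step_deg) * step_deg
        prev = samples.get(grid_angle)
        # keep the measured point closest to the grid line
        if prev is None or abs(prev[0] - grid_angle) > abs(angle - grid_angle):
            samples[grid_angle] = (angle, radius)
    return {a: r for a, (_, r) in samples.items()}

# Toy contour around an anchor at the origin:
contour = [(30.0, 0.0), (28.0, 5.5), (25.0, 25.0)]
grid = sample_on_grid(to_polar(contour, (0.0, 0.0)))
```

The resulting angle-to-radius table is exactly the kind of data matrix that the next paragraph describes.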
This procedure generates a data matrix for the tongue
surface for a speech sound. The data can be used to
reconstruct a rough visual representation of tongue surface
shapes as demonstrated in Figure 11. Figure 11a shows a
composite of tongue surface data during the production
of // for 12 normal speakers. Note the midsagittal
groove for this sound that is necessary to produce a clear
//. Figure 11b shows the tongue of a patient with a
lateral carcinoma of the tongue during the production of
the same sound before the ablative cancer surgery to the
right side of her tongue. The patient was able to produce
an adequate midsagittal groove and her // was
perceptually acceptable. Figure 11c shows the same patient
after a lateral resection of the right side of her tongue and
defect reconstruction with a radial forearm flap. The
patient was now unable to form a consistent midsagittal
groove, which led perceptually to a distortion and
lateralization of the //sound.
In two studies (Bressmann, Uy, & Irish, 2005;
Bressmann, Thind, Uy, Bollig, Gilbert, & Irish, 2005), we
used the quantitative tongue surface data that we
generated from three-dimensional tongue volumes to
extract underlying components of tongue movement by
using mathematical procedures such as principal
component analysis or multi-dimensional scaling. We
also developed a number of quantitative descriptors for
the degree of protrusion of the tongue in the oral cavity
(anteriority index), the degree of three-dimensional
midsagittal grooving along the length of the tongue
(concavity index) and the symmetry of the elevation of
left and right lateral tongue (asymmetry index). We
established orienting values for a group of normal speakers
and demonstrated the usefulness of these measures for the
analysis of a patient with glossectomy in whom we
compared pre- and postoperative movement ranges and
evaluated the effect of the defect reconstruction on the
deformation and symmetry of the tongue.
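The exact definitions of these indices are given in the two cited papers; purely to illustrate the general idea, an asymmetry-type measure might compare matched left and right parasagittal elevations. The following is a hypothetical sketch, not the published formula:

```python
# Illustrative asymmetry measure: mean normalized difference between
# matched left and right parasagittal tongue heights. A value of 0
# indicates perfect left-right symmetry; larger values indicate
# greater asymmetry.

def asymmetry_index(left_heights, right_heights):
    """Mean normalized left-right height difference; 0 = symmetrical."""
    diffs = [abs(l - r) / max(l, r)
             for l, r in zip(left_heights, right_heights)]
    return sum(diffs) / len(diffs)

# Toy numbers: identical profiles versus a lowered right side, as
# might follow a right-sided resection.
symmetric = asymmetry_index([10.0, 12.0, 14.0], [10.0, 12.0, 14.0])
resected = asymmetry_index([10.0, 12.0, 14.0], [8.0, 9.0, 10.0])
```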
Towards Dynamic Three-Dimensional
Ultrasound
With the current generation of ultrasound machines,
we are faced with the dilemma that we can either visualize
a two-dimensional dynamic motion or a three-dimensional static volume. Obviously, our ultimate goal
would be to visualize the motion of the tongue in 3D. This
would be especially important for our research on
glossectomy patients because the partial tongue resections
and reconstructions rarely ever lead to a symmetrical
outcome. In recent years, so-called 4D ultrasound
machines have become commercially available. These
machines are currently able to visualize up to 30 volumes
per second. However, while this technology is used with
great success in obstetrics and gynaecology, it is less suitable
for the imaging of the tongue in speech. The reason for this
is that the air in the oral cavity causes echo artefacts in the
ultrasound scan that obscure the true surface of the tongue
in the three-dimensional volume. Consequently, 4D scans
acquired this way would necessitate extensive post-processing.

Figure 11. Surface plots for the postalveolar fricative /ʃ/. (a) Composite surface plot for 12 normal speakers; (b) A patient with a carcinoma of the right lateral tongue preoperatively; (c) The same patient after the tumour resection and reconstruction with a radial forearm flap. Note the decreased lingual grooving and the postoperative asymmetry of the tongue.

Yang & Stone (2002) at the Vocal Tract Visualization Laboratory at the University of Maryland
recently suggested an interesting new approach to the
reconstruction of three-dimensional tongue motion from
multiple two-dimensional image sequences. The
researchers recorded repeated utterances of the same
sentence in multiple parallel sagittal and coronal planes
and then reassembled the data in a reconstructed three-dimensional surface using the Dynamic Programming
method. Using multiple two-dimensional scans to
reconstruct a three-dimensional moving tongue surface is
an elegant way to circumvent the current technological
limitations of the available ultrasound machines.
However, the high number of slices and repetitions
required for the method described by Yang & Stone
(2002) means that it relies on a highly compliant
research volunteer. It is probably less practical for clinical
patient examinations and research in pathological speaker
groups.
In a recent series of experiments at the Voice and
Resonance Laboratory, we have started to acquire parallel
sagittal scans to make sparse three-dimensional surface
plots of the moving tongue. For this research, we use the
CHASE device to stabilize the head of the research
participant for repeated scans in three parallel sagittal
planes. Instead of aligning the images retrospectively with
Dynamic Programming, we pace the speech of the subject
using a digital metronome. In order to facilitate the task
of keeping a steady rhythm at 60 beats per minute, the
metronome is set to 120 beats per minute and the subject
is instructed to speak at half-tempo. So far, we have used
only repeated VCV syllables for this examination
technique. The stress is on the CV segment and coincides
with the metronome beat (i.e., /a’ta/). The examination
is repeated in three sagittal planes.
We then use a video-editing program with a parallel
oscillogram display (ScreenBlast, Sony Corp., 550
Madison Avenue, New York, New York 10022) to identify
the bursts of the metronome in the acoustic signal. The
metronome bursts are used to identify key-frames that
help us synchronize the image sequences for the same
utterance recorded in three sagittal planes. The image
sequences are analyzed using the Ultra-CATS software
and the results are plotted as surfaces. Figure 12 shows a
number of frames of the pseudo-3D surface plots of tongue
movement during the production of the syllable /aka/. In
our laboratory, this technique is still largely experimental
at this time. While we have only used it on normal speakers
to date, we are hoping to incorporate a similar procedure
into future patient examinations.
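The key-frame logic reduces to finding burst onsets in the audio signal and converting their sample positions to 30 fps frame numbers. A minimal sketch (ours, not the ScreenBlast workflow; the 48 kHz audio rate is an assumption about the DV recording, and the toy envelope is hypothetical):

```python
# Find the video frames at which the metronome bursts occur: detect
# threshold crossings in the audio amplitude envelope and convert the
# sample indices to 30 fps frame numbers.

AUDIO_RATE = 48000  # samples per second (assumed DV audio rate)
VIDEO_FPS = 30

def burst_frames(envelope, threshold=0.5):
    """Video frames at which the envelope first crosses the threshold."""
    frames, above = [], False
    for i, value in enumerate(envelope):
        if value >= threshold and not above:
            frames.append(int(i / AUDIO_RATE * VIDEO_FPS))
        above = value >= threshold
    return frames

# Toy envelope with bursts at 0.0 s and 1.0 s (the 60 bpm beat):
env = [0.0] * 96000
env[0] = 1.0
env[48000] = 1.0
print(burst_frames(env))  # -> [0, 30]
```

Matching these frame numbers across the three sagittal recordings gives the synchronized key-frames for the surface reconstruction.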
Figure 12. Posterior view of the reconstructed three-dimensional surface plots of the tongue during the utterance /a’ka/. (a) First /a/: Note the prominent genioglossus furrow during the production of /a/; (b) Maximal dorsal elevation for the production of /k/; (c) Second /a/: Note the reduced genioglossus furrow following the velar closure.

Conclusion

Ultrasound offers exciting new possibilities for researchers and therapists in speech-language pathology. The main advantages of ultrasound imaging are its non-invasiveness, bio-safety, cost-effectiveness and, last but not least, the ease of the image acquisition. Ultrasound allows us to acquire extensive amounts of speech data, and the examination sessions can be as long as necessary. We also find that the direct visualization of the tongue shape on the ultrasound screen is a good motivator for research participants. Obviously, there are still a number of methodological issues associated with the use of ultrasound for research or clinical applications. These will have to be addressed in further research. Nevertheless, the benefits outweigh the disadvantages of this highly promising imaging method.

The costs for basic ultrasound machines have dropped significantly over the last few years, and it is likely that this trend will continue in the future. In addition, many manufacturers now make small portable ultrasound machines that are operated with batteries. In the future, many more speech-language pathologists may be able to use ultrasound for speech and swallowing assessments as well as a biofeedback device for therapy.

Acknowledgements

The authors gratefully acknowledge the invaluable contributions of the following people (in alphabetical order) to the research described here: Yarixa Barillas, Carmen Bollig, Arpita Bose, Amanda Braude, Kevin Cannons, Brent Carmichael, Michelle Ciccia, Heather Flowers, Jiayun (Jenny) Gu, Gajanan (Kiran) Kulkarni, Marilia Sampaio, Parveen Thind, Catherine Uy and Willy Wong. Funding for this research was provided by the Canadian Institutes for Health Research (grant fund # MOP 62960).

References
Akgul, Y.S., Kambhamettu, C., & Stone, M. (1999). Automatic extraction and
tracking of the tongue contours. IEEE Transactions on Medical Imaging, 18, 1035-1045.
Bernhardt, B., Gick, B., Bacsfalvi, P., & Ashdown, J. (2003). Speech habilitation
of hard of hearing adolescents using electropalatography and ultrasound as evaluated
by trained listeners. Clinical Linguistics and Phonetics, 17, 199-216.
Bosma, J.F., Hepburn, L.G., Josell, S.D., & Baker, K. (1990). Ultrasound
demonstration of tongue motions during suckle feeding. Developmental Medicine and
Child Neurology, 32, 223-229.
Bressmann, T., Uy, C., & Irish, J.C. (2005). Analyzing normal and partial
glossectomee tongues using ultrasound. Clinical Linguistics & Phonetics, 19, 35-52.
Bressmann, T., Thind, P., Uy, C. Bollig, C., Gilbert, R.W., & Irish, J.C. (2005).
Quantitative three-dimensional ultrasound analysis of tongue protrusion, grooving
and symmetry: Data from twelve normal speakers and a partial glossectomee. Clinical
Linguistics and Phonetics, 19, 573-588.
Bressmann, T., Gu J., Cannons, K., Wong, W., Heng, C. L., & Carmichael, B. (in
preparation). Quantitative analysis of tongue motion using semi-automatic edge
detection software in B-mode ultrasound films: Design, methods and procedure
validation.
Carmichael, B. (2004). The Comfortable Head Anchor for Sonographic Examinations
(CHASE). Toronto: University of Toronto.
Casas, M. J., Kenny, D. J., & Macmillan, R. E. (2003). Buccal and lingual activity
during mastication and swallowing in typical adults. Journal of Oral Rehabilitation,
30, 9-16.
Casas, M.J., Kenny, D.J., & Mc Pherson, K.A. (1994). Swallowing/ventilation
interactions during oral swallow in normal children and children with cerebral palsy.
Dysphagia, 9, 40-46.
Casas, M.J., McPherson, K.A., & Kenny, D.J. (1995). Durational aspects of oral
swallow in neurologically normal children and children with cerebral palsy: an
ultrasound investigation. Dysphagia, 10, 155-159.
Cheng, C. F., Peng, C. L., Chiou, H. Y., & Tsai, C. Y. (2002). Dentofacial morphology
and tongue function during swallowing. American Journal of Orthodontics and
Dentofacial Orthopedics, 122, 491-499.
Chi-Fishman, G., & Sonies, B. C. (2002). Kinematic strategies for hyoid movement
in rapid sequential swallowing. Journal of Speech, Language, and Hearing Research,
45, 457-468.
Chi-Fishman, G., Stone, M., & McCall, G. N. (1998). Lingual action in normal
sequential swallowing. Journal of Speech, Language, and Hearing Research, 41, 771-785.
Davidson, L. (2004 April). Assessing tongue shape similarity: comparing L2 norms
and area measures. Paper presented at the meeting of the Second Ultrasound
Roundtable, Vancouver, BC.
Gick, B. (2002). The use of ultrasound for linguistic phonetic fieldwork. Journal
of the International Phonetic Association, 32, 113-122.
Gick, B., & Rahemtulla, S. (2004 April). Recent developments in quantitative
analysis of ultrasound tongue data. Paper presented at the meeting of the Second
Ultrasound Roundtable, Vancouver, BC.
Gu, J., Bressmann, T., Cannons, K., & Wong, W. (2004). The Ultrasonographic
Contour Analyzer for Tongue Surfaces (Ultra-CATS). Toronto: University of Toronto.
Hewlett, N., Vazquez, Y., Zharkova, A., & Zharkova, N. (2004 April). Ultrasound
study of coarticulation and the “Trough Effect” in symmetrical VCV syllables: A report
of work in progress. Paper presented at the meeting of the Second Ultrasound
Roundtable, Vancouver, BC.
Hiiemae, K.M., & Palmer, J.B. (2003). Tongue movements in feeding and speech.
Critical Reviews in Oral Biology and Medicine: An Official Publication of the American
Association of Oral Biologists, 14, 413-429.
Kenny, D.J., Casas, M.J., & McPherson, K.A. (1989). Correlation of ultrasound
imaging of oral swallow with ventilatory alterations in cerebral palsied and normal
children: preliminary observations. Dysphagia, 4, 112-117.
Kikyo, T., Saito, M., & Ishikawa, M. (1999). A study comparing ultrasound images
of tongue movements between open bite children and normal children in the early mixed
dentition period. Journal of Medical and Dental Sciences, 46, 127-137.
Li, M., Kambhamettu, C., & Stone, M. (2003 August). EdgeTrak, a program for
band edge extraction and its applications. Paper presented at the Sixth IASTED
International Conference on Computers, Graphics and Imaging, Honolulu, HI.
Lundberg, A., & Stone, M. (1999). Three-dimensional tongue surface reconstruction:
Practical considerations for ultrasound data. Journal of the Acoustical Society of
America, 106, 2858-2867.
Morrish, K.A., Stone, M., Sonies, B.C., Kurtz, D., & Shawker, T. (1984).
Characterization of tongue shape. Ultrasonic Imaging, 6, 37-47.
Munhall, K.G. (1985). An examination of intra-articulator relative timing. Journal
of the Acoustical Society of America, 78, 1548-1553.
Neuschäfer-Rube, C., Wein, B.B., Angerstein, W., Klajman, S. Jr., & Fischer-Wein,
G. (1997). Sektorbezogene Grauwertanalyse videosonographisch aufgezeichneter
Zungenbewegungen beim Schlucken. HNO, 45, 556-562.
Parush, A., & Ostry, D.J. (1993). Lower pharyngeal wall coarticulation in VCV
syllables. Journal of the Acoustical Society of America, 94, 715-722.
Revue d’orthophonie et d’audiologie - Vol. 29, No 4, Hiver 2005
Peng, C.L., Jost-Brinkmann, P.G., Miethke, R.R., & Lin, C.T. (2000). Ultrasonographic
measurement of tongue movement during swallowing. Journal of Ultrasound in
Medicine, 19, 15-20.
Schliephake, H., Schmelzeisen, R., Schönweiler, R., Schneller, T., & Altenbernd,
C. (1998). Speech, deglutition and life quality after intraoral tumour resection. A
prospective study. International Journal of Oral and Maxillofacial Surgery, 27, 99-105.
Shawker, T.H., & Sonies, B.C. (1984). Tongue movement during speech: a real-time ultrasound evaluation. Journal of Clinical Ultrasound, 12, 125-133.
Shawker, T.H., & Sonies, B.C. (1985). Ultrasound biofeedback for speech training.
Instrumentation and preliminary results. Investigative Radiology, 20, 90-93.
Shawker, T.H., Sonies, B., Stone, M., & Baum, B.J. (1983). Real-time ultrasound
visualization of tongue movement during swallowing. Journal of Clinical Ultrasound,
11, 485-490.
Soder, N., & Miller, N. (2002). Using ultrasound to investigate intrapersonal
variability in durational aspects of tongue movement during swallowing. Dysphagia,
17, 288-297.
Sonies, B.C., Baum, B.J., & Shawker, T.H. (1984). Tongue motion in elderly adults:
initial in situ observations. Journal of Gerontology, 39, 279-283.
Sonies, B.C., & Dalakas, M.C. (1991). Dysphagia in patients with the post-polio
syndrome. New England Journal of Medicine, 324, 1162-1167.
Stone, M. (1990). A three-dimensional model of tongue movement based on
ultrasound and x-ray microbeam data. Journal of the Acoustical Society of America,
87, 2207-2217.
Stone, M., & Davis, E.P. (1995). A head and transducer support system for making
ultrasound images of tongue/jaw movement. Journal of the Acoustical Society of
America, 98, 3107-3112.
Stone, M., & Lundberg, A. (1996). Three-dimensional tongue surface shapes of
English consonants and vowels. Journal of the Acoustical Society of America, 99,
3728-3737.
Stone, M., Morrish, K., Sonies, B.C., & Shawker, T.H. (1987). Tongue curvature:
A model of shape during vowel production. Folia Phoniatrica, 39, 302-315.
Stone, M., & Shawker, T.H. (1986). An ultrasound examination of tongue
movement during swallowing. Dysphagia, 1, 78-83.
Stone, M., Shawker, T.H., Talbot, T.L., & Rich, A.H. (1988). Cross-sectional
tongue shape during the production of vowels. Journal of the Acoustical Society of
America, 83, 1586-1596.
Unser, M., & Stone, M. (1992). Automated detection of the tongue surface in
sequences of ultrasound images. Journal of the Acoustical Society of America, 91,
3001-3007.
Van Riper, C. (1963). Speech correction: principles and methods, 4th edition.
Englewood Cliffs, NJ: Prentice-Hall.
Watkin, K.L., & Rubin, J.M. (1989). Pseudo-three-dimensional reconstruction of
ultrasonic images of the tongue. Journal of the Acoustical Society of America, 85, 496-499.
Wein, B., Angerstein, W., & Klajman, S. (1993). Suchbewegungen der Zunge bei
einer Sprechapraxie: Darstellung mittels Ultraschall und Pseudo-3D-Abbildung.
Nervenarzt, 64, 143-145.
Wein, B., Alzen, G., Tolxdorff, T., Bockler, R., Klajman, S., & Huber, W. (1988).
Computersonographische Darstellung der Zungenmotilität mittels Pseudo-3D-Rekonstruktion. Ultraschall in der Medizin, 9, 95-97.
Wein, B., Bockler, R., Huber, W., Klajman, S., & Willmes, K. (1990).
Computersonographische Darstellung von Zungenformen bei der Bildung der langen
Vokale des Deutschen. Ultraschall in der Medizin, 11, 100-103.
Wein, B., Klajman, S., Huber, W., & Doring, W. H. (1988). Ultraschalluntersuchung
von Koordinationsstörungen der Zungenbewegung beim Schlucken. Nervenarzt, 59,
154-158.
Whalen, D.H., Iskarous, K., Tiede, M.K., Ostry, D.J., Lehnert-LeHoullier, H.,
Vatikiotis-Bateson, E., & Hailey, D.S. (2004 April). HOCUS: the Haskins Optically
Corrected Ultrasound System. Paper presented at the meeting of the Second Ultrasound
Roundtable, Vancouver, BC.
Wrench, A. (2004 April). QMUC Matching, merging and means: Spline productivity
tools for ultrasound analysis. Paper presented at the meeting of the Second Ultrasound
Roundtable, Vancouver, BC.
Yang, C.S., & Stone, M. (2002). Dynamic programming method for temporal
registration of three-dimensional tongue surface motion from multiple utterances.
Speech Communication, 38, 199-207.
Author Note
Please address correspondence to: Tim Bressmann,
Ph.D., Assistant Professor, Graduate Department of
Speech-Language Pathology, University of Toronto, 500
University Avenue, Toronto, ON M5G 1V7 Canada,
[email protected]
Received: November 15, 2004
Accepted: March 16, 2005
Electropalatography and Ultrasound
Exploring the Use of Electropalatography and Ultrasound
in Speech Habilitation
Explorer l’électropalatographie et l’échographie pour
l’éducation de la parole
Barbara Bernhardt, Penelope Bacsfalvi, Bryan Gick, Bosko Radanov
and Rhea Williams
Barbara Bernhardt, Ph.D.
School of Audiology and Speech Sciences
University of British Columbia
Vancouver, BC Canada

Penelope Bacsfalvi, M.Sc.
School of Audiology and Speech Sciences
University of British Columbia
Vancouver, BC Canada

Bryan Gick, Ph.D.
Department of Linguistics
University of British Columbia
Vancouver, BC Canada

Bosko Radanov, B.A.
School of Audiology and Speech Sciences
University of British Columbia
Vancouver, BC Canada

Abstract
Electropalatography (EPG) and ultrasound have been recently explored as articulatory visual
feedback tools in speech habilitation at the University of British Columbia’s Interdisciplinary Speech
Research Laboratory (UBC, ISRL). Although research studies imply that such tools are effective in
speech habilitation, most studies have utilized trained listeners. To determine the impact of speech
habilitation on everyday communication, it is also important to include everyday, untrained
listeners in the treatment evaluation process. Two everyday listener studies were conducted, using
data from two of the exploratory UBC treatment studies. The listeners observed improvement post-treatment for some but not all speakers or speech targets. More research is needed to determine the
relative effectiveness of EPG and ultrasound in speech habilitation in terms of speaker variables and
treatment targets, and in comparison with each other and different treatment methods. The current
paper has two purposes: (1) to provide an overview of EPG and ultrasound in speech habilitation,
and (2) to present the two preliminary listener studies, suggesting directions for future research and
clinical application.
Abrégé:
Le laboratoire de recherche interdisciplinaire sur la parole de l’University of British Columbia
(UBC) a examiné la possibilité d’utiliser l’électropalatographie et l’échographie comme outils de
rétroaction visuelle de l’articulation. Bien que des études laissent entendre l’utilité de tels outils pour
l’éducation de la parole, la plupart sont fondées sur des auditeurs formés. Pour évaluer l’efficacité
de la réadaptation de la parole dans la communication quotidienne, il est aussi important d’inclure
des auditeurs ordinaires n’ayant pas été formés au processus d’évaluation du traitement. Le
laboratoire a mené, à UBC, deux études avec des auditeurs ordinaires à partir des données de deux
études exploratoires portant sur le traitement. Les participants ont noté des améliorations à la suite
du traitement, mais pas chez tous les orateurs ni pour toutes les cibles. Il faut poursuivre la recherche
afin de déterminer l’efficacité relative de l’électropalatographie et de l’échographie pour la réadaptation
de la parole en fonction des variables des orateurs mêmes et entre les méthodes de traitement. Le
présent article vise deux objectifs: (1) effectuer un survol de l’utilité de l’électropalatographie et de
l’échographie dans la réadaptation de la parole, et (2) présenter les résultats de deux études
préliminaires sur des auditeurs afin de proposer des orientations pour la recherche et l’application
clinique.
Key Words: electropalatography, ultrasound, everyday listener, visual feedback
Rhea Williams, M.Sc.
Speech-Language Pathologist
Barrie, ON Canada
In the past several decades, a number of small-n
studies have demonstrated the utility of visual
feedback technology in speech habilitation.
Studies have included participants with a variety of
etiologies, for example, hearing impairment (e.g.
Bernhardt, Fuller, Loyst & Williams, 2000; Bridges &
Huckabee, 1970; Dagenais, 1992; Fletcher, Dagenais, &
Critz-Crosby, 1991; Volin, 1991), cleft palate (e.g.,
Gibbon, Crampin, Hardcastle, Nairn, Razzell, Harvey,
& Reynolds, 1998; Michi, Yamashita, Imai, & Yoshida,
1993), phonological impairment (e.g., Adler-Bock, 2004;
Gibbon, Hardcastle, Dent, & Nixon, 1996), or motor
speech impairment (e.g., Gibbon & Wood, 2003).
Technologies providing acoustic displays are perhaps
most common (e.g., Maki, 1983; Shuster, Ruscello, &
Toth, 1995; Volin, 1991). However, there is a growing
interest and body of research on visual articulatory
feedback in treatment, for example, electropalatography
(EPG, or palatometry), which shows dynamic tongue-palate contact patterns (e.g., Bernhardt et al., 2000;
Dagenais, 1992; Fletcher et al., 1991; Gibbon et al., 1998;
Hardcastle, Jones, Knight, Trudgeon, & Calder, 1989).
Two-dimensional ultrasound, which can display dynamic
images of tongue shapes and movements, has also become
more accessible in the past 5 years for speech habilitation
(Adler-Bock, 2004; Bernhardt, Gick, Bacsfalvi, &
Ashdown, 2003; Gick, 2002). At the University of British
Columbia, exploratory treatment studies have been
conducted using EPG and/or ultrasound. One purpose of
the current paper is to provide an overview of these
technologies in speech habilitation as a foundation for
future clinical and research applications. Although the
tools are currently university-based, there may be
potential for future clinical use. Queen Margaret
University College in Edinburgh has been situating EPG
in clinical sites throughout Britain, with links to the
university research team (Cleftnet UK: Gibbon et al.,
1998). One step in proceeding towards such a program in
Canada is dissemination of information about the
technologies and their clinical use to S-LPs and other
researchers. For clinical purposes, the question is whether
technologies are worth the investment. The research
literature suggests that visual feedback technologies aid
speech habilitation. However, most studies to date have
been conducted by trained observers. Because an ultimate
goal of speech therapy is to enhance communication in a
client’s everyday life, outcomes research also needs to
include the perspectives of everyday listeners (Frattali,
1998; Klasner & Yorkston, 2000; World Health
Organization, 2001). Two of the exploratory UBC
treatment studies had data that could be used to collect
everyday listener observations. The second purpose of the
current paper is to present those preliminary everyday
listener observations, not as measures of effectiveness of
the treatment, but as a foundation for future research and
clinical studies. By discussing the two small studies in one
paper, a broader perspective can be gained on everyday
listener research.
The paper begins with an introduction to EPG and
ultrasound and a brief discussion of everyday listener
research methods. The two listener studies are then
presented in turn. Background information is provided
on the treatment studies themselves within the context of
each listener study, although space precludes a detailed
discussion of them (see Bernhardt et al., 2000 and
Bernhardt et al., 2003 for more details). The treatment
studies were case studies conducted with the purpose of
learning about the use of EPG and ultrasound in speech
habilitation. Thus, they were conducted without strict
experimental single-subject or group designs (e.g., the use
of control groups). The projects were developmental in
nature, and thus, S-LPs and researchers shared
information about participants and procedures with each
other throughout. The final section of the
paper discusses future research and clinical implications.
Visual Feedback Technology
Dynamic Electropalatography
Different types of dynamic EPG systems have been
available over the past three decades. Older systems such
as the now unavailable Kay Palatometer ran on DOS (Kay
Elemetrics, New Jersey), with more recent ones running
on Windows, for example, the WIN-EPG
(www.articulateinstruments.co.uk) or the Logometrix
system (www.logometrix.org). The Kay palatometer and
the WIN-EPG (2000) have been used in the UBC research
program, and thus the discussion focuses on these
instruments.
The above-mentioned systems operate in similar ways.
Speakers wear a custom-fit artificial palate (figure 1).
Figure 1. The WIN-EPG and Kay artificial palates: The
top of the palates represents the front part of the mouth,
and the bottom, the velar area. The WIN-EPG has 62
electrodes bunched densely at the front of the palate, but
with contacts across the palate to the velar area. The Kay
pseudopalate has 92 electrodes, bunched densely around
the edge of the palate up to the teeth, but with contacts
back to the velar area.
Although highly anomalous oral structures may
preclude the wearing of artificial palates, all 22 speakers
in the UBC projects have been able to wear them. The
palate contains electrodes that are sensitive to tonguepalate contacts and send contact-induced electrical
impulses through fine bundled wires to a computer.
Tongue-palate contact patterns are displayed on-line.
These displays and the accompanying acoustic signals can
be stored for analysis, or as templates for practice. The
Kay and WIN-EPG differ in how the palates attach, the
number and type of electrode displays and the types of
analyses provided. The acrylic Kay pseudo-palate has 92
electrodes; it fits over and is held on by the upper teeth
(figure 1). The acrylic WIN-EPG has 62 electrodes; dental
wires hold the pseudo-palate onto the upper teeth. Both
types of palates allow distinction of non-low vowels and
alveolar to velar consonants from one another; the Kay
palate, in addition, provides displays of dental consonants.
The EPG systems typically provide acoustic displays (e.g.,
a waveform displaying intensity), with the WIN-EPG also
providing off-line spectrographic displays and detailed
quantitative analyses of contact patterns.
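As an illustration of the kind of quantitative contact analysis mentioned above, a single EPG frame can be modelled as a grid of on/off electrodes. The sketch below is illustrative only: it assumes a simplified 8 x 8 layout rather than the WIN-EPG's 62-electrode or the Kay palate's 92-electrode geometry, and the two indices (fraction of activated electrodes and a front-back centre of gravity) are generic examples of contact-pattern measures, not the WIN-EPG's actual analyses.

```python
def percent_contact(frame):
    """Fraction of activated electrodes in a frame of 0/1 rows."""
    cells = [c for row in frame for c in row]
    return sum(cells) / len(cells)

def centre_of_gravity(frame):
    """Weighted front-back position of contact: 1.0 = front (alveolar)
    row, 0.0 = back (velar) row. Returns None for an empty frame."""
    n_rows = len(frame)
    weights = [(n_rows - 1 - i) / (n_rows - 1) for i in range(n_rows)]
    total = sum(sum(row) for row in frame)
    if total == 0:
        return None
    return sum(w * sum(row) for w, row in zip(weights, frame)) / total

# A horseshoe-like alveolar stop pattern: full contact across the front
# row, lateral contact down the sides of the grid, open velar row.
t_frame = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 0, 0, 0, 0, 1, 1],
    [1, 0, 0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
```

A velar pattern would concentrate contact in the bottom rows and thus yield a centre of gravity near 0, while the /t/-like frame above scores well toward the front; tracking such indices frame by frame is one way contact patterns can be compared before and after treatment.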
The following section describes typical maximal target
contact patterns and some aberrant patterns for English
lingual consonants and vowels, as observed by the authors
during the treatment studies (see figures 2a-2h, 5a-d and
6a-d). EPG images for the current paper are taken from
the currently used machine, WIN-EPG, which has much
easier image exporting capabilities than the previous
machines. Although exact tongue-palate contact patterns
for speech targets vary within and between speakers, the
contact patterns are similar in configuration and region.
Alveolar targets /t/ (figure 2a), /d/ and /n/ show a horseshoe
contact pattern, with the tongue touching the alveolar
ridge and the sides of the upper dental arch to the back of
the molar region. The alveolar sibilants /s/ (figure 2b) and
/z/ show a similar contact pattern, but have more contact
on the sides of the palate, creating a groove primarily in
the alveolar area. This groove varies somewhat in size and
location across speakers (McLeod, Roberts & Sita, 2003).
The contact pattern for /l/ depends on context (figure 2d).
Prevocalic /l/ generally has alveolar contact similar to
/t/, /d/ and /n/, full or near-full lateral contact on one side
of the upper dental arch, and reduced posterior lateral
contact on the other side of the dental arch. Postvocalic
“dark” /l/ tends to show posterior velar contact at the
beginning of the articulation, followed rapidly by the
contact pattern for prevocalic /l/. The palatoalveolars
/ʃ/ (figure 2c) and /ʒ/ show broad, post-alveolar contact
along the sides of the upper dental arch, and a wider
groove (i.e., less contact) than the alveolar sibilants. The
affricates /tʃ/ and /dʒ/ vary across speakers. Some
speakers show a stop-sibilant contact pattern in the
palatoalveolar area (Hardcastle, Gibbon & Scobbie,
1995), while others show movement backwards from the
/tʃ/ or /dʒ/ contact area to a post-alveolar /ʃ/ or /ʒ/
contact area. English /r/ tends to have lateral contact
along the back molars and a wide channel with no contact
in the middle of the tongue (figure 2f).

Figure 2a-2h. Electropalatograms for North American
English (a) /t/, (b) /s/, (c) /ʃ/, (d) prevocalic /l/, (e) /k/,
(f) prevocalic /r/, (g) /i/, and (h) //. The top of the figure
represents the alveolar area, and the bottom, the velar
area. The black squares indicate tongue-palate contact.

The velar consonants
(figure 2e) tend to have continuous contact along the
back of the pseudo-palate region in the context of back
vowels, although for some speakers, this contact is not
visible if their pseudo-palate does not extend far enough
back. In the context of front vowels, the velars show
continuous contact from the velar through the palatal
regions. The front vowels and /j/ show bilateral contact
about halfway along the palate towards the front, and a
wide mid-channel with no contact; the back vowels and
/w/ show minimal contact in the velar region and a wide
mid-channel with no contact (figure 2g, 2h). Tense vowels
have the same type of contact pattern as their lax vowel
cognates, but are produced further forward along the
dental arch. Low vowels are generally not visible because
they have no tongue-palate contact in most speakers; for
some speakers, mid vowels have no visible contact.
During speech therapy, the target tongue-palate
contact pattern is demonstrated by a typical speaker with
an artificial palate, and the client is encouraged to emulate
that target. If the client produces an acceptable variant of
the target phone with a different tongue-palate contact
configuration from that of the model, the client’s
production becomes his or her own template.
For further information on EPG, consult http://sls.qmuc.ac.uk/epg/epg0_big.html.
Dynamic Two-Dimensional Ultrasound
The following description provides an overview of the
functioning of two-dimensional ultrasonography for
speech displays. To display speech or other lingual
movements with ultrasound, a transducer is held by the
speaker or attached to a fixed arm or stand so that it
contacts the undersurface of the speaker’s chin. The
transducer is coated with water-soluble ultrasound gel to
enhance the signal. Sound waves are transmitted by the
transducer up through the oral tissue. Echo patterns from
sound waves returning from the tongue surface are
converted to moving images that are visible on an
ultrasound screen (see figures 3, 4 and 7). (See also Stone,
2005.) There are a number of different dynamic two-dimensional
ultrasound machines, with large differences
in price, depending on the size and complexity of the
machine and transducer. Three different machines have
been used in the UBC treatment studies. For Study #2 in
the current paper, the Aloka SSD-900 portable ultrasound
was used, along with a 3.5 MHz convex intercostal
transducer probe. The range and gain were adjusted to
give the clearest image for the tongue surface across
assessments, for example, a range of 11 centimeters, gain
of 60. The simultaneous audio signal was recorded onto
VHS tape at 30 frames per second from the ultrasound
machine (JVC Super VHS ET Professional Series, SR-VS20),
and also onto a digital video source using a ProSound
YU34 unidirectional microphone amplified
through the built-in pre-amplifier in a Tascam cassette
deck. In subsequent studies, two other machines have
been utilized: (a) a portable Sonosite 180 Plus with a
Sonosite C15/4-2 MHz MCX transducer (for treatment
only), and (b) a stationary Aloka Pro-Sound SSD-5000
with a UST-9118 endo-vaginal 180-degree convex array
transducer (see Bernhardt, Gick, Adler-Bock & Bacsfalvi,
2005).

Figure 3a-3g. Ultrasound images of North-American
English: (a) /k/, (b) /t/, (c) /s/ (coronal view), (d) /l/, (e)
/r/, (f) /u/, (g) /ʊ/. The tongue tip is on the right in all
sagittal images. The straight line for /k/ approximates the
velar area. The straight line for /t/ approximates the
alveolar area. 3c shows a mid-line groove for /s/. Note the
complex shapes of /l/ and /r/ (3d, 3e). The /u/ has an
advanced tongue root and higher tongue body than /ʊ/
(3f, 3g).
Either mid-sagittal (figure 3a,3b,3d,3e,3f,3g) or
coronal-oblique (figure 3c) views of the tongue shape and
movement patterns can be displayed on a screen. The
sagittal view displays a side view of the tongue, showing
tongue height, backness, and slope; this view has been
used most in the UBC studies. The tip or root may be
obliterated in the display, due to the limited field of view
or by the jaw and hyoid shadows. However, the general
shape, position, height and slope provide useful feedback
to the speaker. Slope has been found to be especially
relevant for /l/, /r/ and the vowels in the ISRL studies. The
coronal view shows a cross-section of the tongue, and
thus, flatness, mid-line grooving or lateral elevation or
depression of the tongue. This view has been useful for
showing mid-line tongue grooving for sibilants and /r/,
plus relative vowel height. Reference points or lines or
palatal contour sketches derived from images of swallows
are sometimes added to the display. This is done either by
attaching overhead transparencies with drawings of the
palatal arch or target tongue positions or shapes to the
screen, or by using reference lines generated by the
ultrasound machine itself. For example, if a certain tongue
height is optimal for a particular target, a reference line
will be placed on the screen that the speaker is to ‘hit’ with
his or her tongue body or tip. (See the lines in figures 3a,
3b and 3e, for example.) As with EPG treatment, a typical
speaker provides examples for the client to emulate.
Images or movies can be stored and used for future
reference.
Typical and aberrant ultrasound images are shown in
figures 3, 4 and 6 (see also Bernhardt et al., 2005), and are
described below. The tongue body movement and height
for the velars contrast visibly with the tongue tip movement
and height for the alveolars, as shown in figures 3a and 3b
respectively. For sibilants, the varying width of the groove
for the alveolar and post-alveolar sibilants is visible in the
coronal view as in figure 3c. The sagittal view shows the
relative front-back position of the apical end of the tongue
and helps distinguish alveolar from post-alveolar
fricatives. For affricates, the display shows the relative
backness of the tongue and any change in movement from
a more anterior position for the stop portion of the
affricate to a more posterior position for the fricative
portion. The English /l/ and /r/ are complex articulations
with multiple components that differ across word position
and speakers (figure 3d and 3e; also see figure 4 which
shows a sample aberrant /r/ pre-treatment and accurate
/r/ post-treatment from Adler-Bock, 2004). For both
liquids, the sagittal view typically shows a two-point
displacement of the tongue in the tip/blade and root
regions (Stone & Lundberg, 1996). For /l/, the relative
timing of the anterior and posterior constrictions can
also be seen in the sagittal view (figure 3d; Sproat &
Fujimura, 1993; Gick, 2003). The coronal view shows the
lateral dip of one or both sides of the tongue for /l/, a dip
which is usually towards the posterior portion of the
tongue body. The English /r/ can be produced with a more
retroflexed or bunched position, and is generally
articulated with three separate constrictions along the
vocal tract (Delattre & Freeman, 1968): labial, central
and pharyngeal. The shape of the tongue in the region of
each of the lingual constrictions is visible on ultrasound
(figure 3e). A posterior and relatively wide mid-line
depression is another important component of /r/ and is
visible in the coronal view of /r/. With respect to vowels,
the sagittal view provides a view of the whole tongue as it
advances and retracts, and moves through various heights.
The sagittal view thus displays the tongue root as it
advances and retracts for tense and lax vowels respectively,
and in addition, shows the higher tongue body position
for tense vowels (compare the height and backness of /u/
and /ʊ/ in figures 3f and 3g, and the same for /i/ and /ɪ/
in figure 7). The coronal view also shows relative height
of the sides of the tongue; the sides are higher for the high
and tense vowels. Additional information on ultrasound
is available in the Volume 19, 2005 issue of Clinical
Linguistics and Phonetics and the following websites:
www.linguistics.ubc.ca/isrl/UltraSoundResearch/;
http://speech.umaryland.edu/research.html;
www.slp.utoronto.ca/People/Labs/TimLab;
www.sls.qmuc.ac.uk/RESEARCH/Ultrasound.
Figure 4a, 4b. Ultrasound images of a pre-treatment vocalic substitution for /r/ (4a) followed by a post-treatment on-target /r/ production (4b: Victor, Adler-Bock, 2004).
Everyday Listener Observations
An ultimate goal of speech therapy is to enhance
communication in a person’s everyday life. The
observations of everyday listeners are thus important in
the treatment evaluation process. A variety of methods
can be used to obtain such observations, from studies of
speech intelligibility or comprehensibility to more
qualitative approaches such as interviews, questionnaires
or focus groups. Speech intelligibility measures were used
in the two following studies; consequently, the following
discussion focuses on those methods.
Speech intelligibility is generally evaluated with some
kind of identification task or rating scale (Kent, Miolo &
Bloedel, 1994). For identification tasks, words may be
presented in isolation or in connected speech in a variety
of listening conditions. Listeners may be asked to
Revue d’orthophonie et d’audiologie - Vol. 29, No 4, Hiver 2005
Electropalatography and Ultrasound
transcribe orthographically what they hear, or to select
responses from closed sets in computerized or non-computerized protocols. Kent et al. (1994) suggest that
identification tasks can provide information about specific
words and phonemes, but not about the speaker’s general
conversational competence. General scalar ratings of
connected speech samples, in contrast, can provide holistic
appraisals of speech, because the listener can take non-segmental factors into account, such as intonation, rate,
and rhythm. The general rating scales, however, give
minimal information on specific words and segments.
Rating scales may also be more subject to listener and
context biases than identification tasks (Schiavetti, 1992;
Kent et al., 1994). Some rating scales may provide more
information on specific speech targets (Black, 1999; Ertmer
& Maki, 2000). Ertmer and Maki note that progress in
treatment often has an intermediate phase, in which
“closer—though still distorted approximations” may
precede fully acceptable variants of the target (2000, p.
1514). They constructed a 3-point rating scale for
evaluating treatment of specific targets: “0” (omission or
substitution of the target), “1” (“improved but not fully
acceptable”), or “2” (“a fully acceptable” variant of the
target) (p. 1514). This method also has inherent biases.
First, the speech target is known, potentially influencing
the listener responses. Furthermore, listeners may vary in
their definition of acceptability, depending on their
background and their understanding of the task.
Nevertheless, the task does allow specific speech targets to
be rated without phonetic transcription, making it usable
by everyday listeners.
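As a concrete illustration of how ratings on such a 3-point scale might be aggregated, the sketch below averages listener ratings per target. The listener data, target labels and function name are invented for this example and are not taken from the studies reported here.

```python
from statistics import mean

# Hypothetical ratings on an Ertmer and Maki (2000) style 3-point scale:
# 0 = omission/substitution, 1 = improved but not fully acceptable,
# 2 = a fully acceptable variant of the target.
ratings_by_target = {
    "/s/": [0, 1, 1, 2, 1],  # one token, five listeners
    "/r/": [2, 2, 1, 2, 2],
}

def average_rating(ratings):
    """Mean rating for one target token across listeners."""
    return mean(ratings)

for target, ratings in sorted(ratings_by_target.items()):
    print(f"{target}: {average_rating(ratings):.2f}")
```

Averaging across listeners in this way yields a per-target score between 0 and 2 that can be compared pre- and post-treatment.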
One of each type of task was selected for the current
listener studies in order to assess the utility of each for
evaluating data from treatment studies. The first study
used a single word identification task. The second study
adopted a 3-point rating scale similar to the Ertmer and
Maki (2000) scale. Based on formal (Bernhardt et al.,
2003) and informal trained listener observations, it was
predicted that everyday listeners would identify more
post-treatment words in study #1, and rate post-treatment
speech samples more highly in study #2. It was also
predicted that their responses would differ according to
the various speakers. However, it could not be predicted
to what degree pre- and post-treatment listener
observations might differ, or how listeners might react to
individual speakers or segment (phoneme) types in the
two studies. These preliminary listener responses would
serve as a foundation for future research questions and
methods. The two studies are discussed in turn below.
Study #1: Speech Habilitation Using EPG with
a Heterogeneous Group of Speakers
Treatment Study Background for Listener Study #1
The first treatment study included a
heterogeneous sample of 7 speakers in terms of etiology,
age and speech target types (see table 1). Inclusion of a
variety of speaker types in the first exploratory study
allowed the research team to gain preliminary insight
into the overall scope of the technology for future studies.
Three speakers had hearing impairments, 4 had cleft
palates, and 2 had motor impairments, thus reflecting the
variety of impairments reported in the EPG literature.
Four speakers were 18 or over, and three were 8-9 years of
age. Two adults had mild speech impairments
(pseudonyms Stan and Delia), one had a moderate speech
impairment (Devon), and one a severe speech impairment
(Samantha). Among the children, 2 had severe speech
impairments (Dora, Sandy), and one, a moderate speech
impairment (Dana). All speakers had received a minimum
of 3 years of prior speech therapy.
Phonological analysis (Bernhardt & Stemberger,
2000) and the palatograms served as a basis for setting
individualized targets for treatment. Speech targets
included: (1) alveolars /t/, /d/, /s/, /l/, with all speakers
having at least one alveolar target, (2) palatoalveolars
/ʃ/ and/or /tʃ/ and /dʒ/ for Stan, Delia, Dora, Devon,
(3) velars /k/ and/or /g/ for Delia, Samantha, Sandy,
Devon, (4) // for Sandy and Delia, and (5) /r/ for Dora
and Samantha. One of two S-LPs associated with the
project conducted a traditional articulation therapy
baseline (Bernthal & Bankson, 2004) over 6-8 sessions,
focusing on one or more of the speakers’ targets. The S-LPs
modeled targets for imitation, provided visual, auditory
and tactile cueing, and gave oral, written or sign language
feedback as indicated. For the children, games and books
were used to engage their interest. The children’s and
Devon’s family members attended the sessions. Narrow
phonetic transcription by the investigators at the end of
the baseline period showed no change for identified
targets.
A 20-session palatometry program was then
conducted over 14-16 weeks at the university by one of the
same two S-LPs in consultation with the first author. The
case study protocol for each speaker consisted of two 4-week treatment blocks with two treatment sessions per
week, a 1-to-3-week treatment break, and a 4-session
weekly maintenance phase. More than one target was
included for each client in each block in a semi-cyclic
approach to treatment. The training time for specific
targets was adjusted to speaker needs, with targets being
revisited in the second treatment block as required. The
palatometer was considered an adjunct to the articulation
therapy program, which was conducted with the same
general therapy techniques used in the baseline.
Participants practised the given targets as isolated
segments, and then in syllables, words, sentences and
conversation, both with and without the palatometer. As
the client progressed at each level of complexity (e.g.,
segment, syllable, etc.), targets were practiced more often
without the artificial palates within sessions. From the
beginning, home practice was encouraged without the
artificial palates for those targets that showed some success
in the sessions. For example, a speaker might be asked to
practice oral movements, segments, syllables, words or
phrases, with a family member providing feedback.
Journal of Speech-Language Pathology and Audiology - Vol. 29, No. 4, Winter 2005
Table 1
Study #1 speaker characteristics and EPG treatment targets

Speaker's pseudonym | Age | History | Major speech patterns | Severity and EPG targets
Stan | 19 | Cleft lip and palate | Alveolars and palatoalveolars: mid-palatal substitutions. Sibilants lateralized. Mild nasal emission | Mild; /t,s,st,,d/
Delia | 29 | Cleft palate | /t/,/d/,/k/,/t/: glottalized. Sibilants ungrooved, often pharyngealized. | Mild; /,t,d,s,z,k,/
Devon | 40 | Cerebral haemorrhage | Imprecise articulation, voicing mismatches, hypernasality, suprasegmental aberrations | Moderate; /t,d,s,,k,/
Samantha | 18 | Rubella: Profound hearing loss, mild oral-motor left-sided weakness, palatal lift. ESL: Oral (some sign). | Many deletions, substitutions. Nasal emission without palatal lift. Weak articulation | Severe; /s,,k,r/
Dana | 8 | Cleft lip and palate. Fistula. Malocclusion. Pharyngoplasty, speech bulb reduction. | Alveolars > mid-palatal or interdentals. //, /r/, prevocalic /l/: velarized | Mild; /t,d,s,l/
Sandy | 9 | Klippel-Feil Syndrome. Cleft lip and palate. Fistula. Malocclusion. Mild-moderate hearing loss (aided). | Velar stops, fricatives, affricates > glottal stops, pharyngeal fricatives, nasal fricatives. Alveolar stops dentalized. Labials accurate. Prevocalic /r/ > glide. | Severe; /,s,k,/
Dora | 8 | Cytomegalovirus: Cochlear implant age 3;11. Total communication since preschool. | Liquids, some final consonants, cluster elements: deleted. /s/, /z/: weak or deleted. Velars > alveolars. /i/ > [m]; /n/ > [f]. | Severe; /t,d,s,,t,,r/

Everyday Listener Study #1

Method for everyday listener study #1.

Williams (fifth author) conducted the everyday listener study described below as part of her master's thesis project (Williams, 1998). She organized the stimuli, ran the experiment and conducted initial analyses. The first author conducted further analyses for the present paper.

Eight male and eight female adult listeners participated. These listeners had a mean age of 25, Grade 12 education or higher, spoke English as their first language, and had had no prior experience with disordered speech. Their hearing was assessed as normal, with a pure tone screening level of 20 dB HL from 500-6000 Hz, speech reception thresholds at 20 dB HL or better, and speech discrimination scores of 88% or better.
The listener task involved open-set word
identification. Word stimuli for the task came from two
audio-recordings of each speaker pre- and post-treatment,
made with a Marantz audiotape recorder and PZM 331090B microphone in the therapy room. Because there
were no observable changes in the baseline period, the
listener data included the pre- and post-treatment
assessment samples only, in order to reduce listener time
requirements. The stimuli were taken from a list of 164
single words not used in therapy that included multiple
exemplars of English consonants and clusters across word
positions (Bernhardt, 1990). A unique list of stimuli words was selected semi-randomly for each speaker. Each speaker's list contained the same pre- and post-treatment words, and seven to ten consonant treatment targets. Different words were chosen for each speaker, because listeners could potentially have learned words from the more intelligible speakers during the task, enabling them to identify those words when spoken by less intelligible speakers. Ten was the maximum number of words per speaker that could be selected, in order to include all of a speaker's treatment targets, while avoiding repetition of words across speakers.

Because of the individual case study design of the treatment study, and the heterogeneity of the speaker group, listeners were asked to make within-speaker judgments. This may have resulted in a practice effect, that is, some within-speaker familiarization for the listeners. The practice effect was diminished by randomizing stimuli within the pre- and post-treatment sets, counterbalancing speaker order across listeners, and presenting all pre-treatment samples in the first of the two listening sessions and all post-treatment samples in the second. The latter procedure was considered potentially less biasing than having randomized pre- and post-treatment words from the same speaker in the same small word sets, where clearer, post-treatment pronunciations could potentially give the listener cues to pre-treatment words. A 2-3 day interval between testing sessions further reduced potential practice effects.

The audio-recorded stimuli were digitized using the Computerized Speech Research Environment 45 (CSRE45) software (1995) and the Tucker Davis Technologies (TDT) hardware (1994) at a sampling rate of 20 kHz. Sound files were edited and analyzed in the Ecoscon program in CSRE45. The sound files were attenuated or amplified during pre-processing so that pre- and post-treatment stimuli pairs could be presented at similar levels of intensity (Williams, 1998). Fourteen blocks were designed within Ecosgen, a stimuli presentation protocol that is part of CSRE45. Seven blocks contained the pre-treatment words (one block per speaker), and seven the corresponding post-treatment words. Stimuli were presented to the listeners using Ecosgen via the TDT. Participants listened through Madsen TDH 39P 10W headphones in a double-walled Industrial Acoustical Company (IAC) sound booth. During the task, listeners faced the computer screen, which showed one large square labeled 'NEXT'. Listeners were told that there were seven blocks in each session, that all stimuli in a given set came from one speaker, and that they could decide when to proceed to the next word. They were asked to write down the words that they heard. Each listening session lasted 60-80 minutes, with breaks on request.

Results and discussion of listener study #1.

Analyses were performed within-speaker only, because of the small size and heterogeneity of the sample (see Table 2).

Table 2
Average speaker words and treatment targets identified across the 16 listeners in pre- and post-treatment samples

Speaker | Avg. pre-tx words identified (/10) | Avg. post-tx words identified | Avg. treatment target segments identified pre-tx | Avg. treatment target segments identified post-tx
Stan | 6.63 (0.96) | 6.94 (1.61) | 9.23/10 (0.66)a | 9.54/10 (0.74)*
Delia | 7.44 (0.96) | 7.88 (0.50) | 6.25/8 (0.45) | 6.0/8 (0.85)
Devon | 6.63 (1.30) | 7.44 (1.15) | 6.38/8 (1.02) | 6.93/8 (1.16)
Samantha | 0.38 (0.50) | 0.75 (0.68) | 0.81/9 (0.66) | 2.07/9 (1.03)**
Dana | 6.50 (1.67) | 9.44 (0.51)** | 8.25/10 (1.81) | 9.87/10 (0.35)**
Sandy | 2.44 (1.26) | 3.63 (1.26)* | 1.34/8 (1.13) | 3.27/8 (0.96)
Dora | 1.06 (0.57) | 5.44 (1.15)** | 0.63/10 (1.09) | 6.07/10 (1.16)**

Note: tx = treatment. Parentheses = standard deviation. Each number represents the average number of words or segments identified across the 16 listeners out of the set of 8-10 potential words or segments.
a Treatment targets varied in number from 7 to 10 per speaker within word sets.
*Wilcoxon's Signed Ranks: p < .05
**Wilcoxon's Signed Ranks: p < .007 (from .000 to .006)

The prediction had been that listeners would identify significantly more words and target segments in the post-treatment word sets across speakers. Listener responses varied by speaker. For pre-treatment stimuli, average words correctly identified across listeners showed a bimodal split. For speakers with a mild-moderate impairment (Stan, Delia, Devon and Dana), listeners identified an average of 6 to 7 of 10 words per speaker. For speakers with a severe impairment (Samantha, Dora and Sandy), listeners identified 0 to 2 of 10 words per speaker. The average number of correctly identified words was higher for all speakers in the post-treatment samples, but to different degrees. The non-parametric Wilcoxon's Signed Ranks test was used to compare the pre-post listener observation sets, because of the small size and heterogeneity of the sample. The adolescents and adults
showed a small, non-significant increase, although for
Devon, the increase approached significance (p = .053).
The children showed a significant increase: to 9.44/10 for
Dana (p = .001), 5.44/10 for Dora (p < .001), and 3.63/10
for Sandy (p = .013). Standard deviation for words
identified ranged from 0.57 to 1.67 words across listeners,
suggesting that listeners were in close agreement. In terms
of consonant treatment target identification, there was a
similar split by speaker across listeners. For pre-treatment
stimuli, the average numbers of target consonants
correctly identified across listeners were as follows: (1) for
Stan, Delia, Devon and Dana: over 75% (from 6.25/8 to
9.23/10), and (2) for Samantha, Dora and Sandy: 6%-17% (from 0.63/10 to 1.34/8). Post-treatment, more
segments were correctly identified for all speakers but
Delia, for whom there was a slight non-significant decrease
in consonant identification. The increase was significant
according to a Wilcoxon’s Signed Ranks test (p < .05) for
all but Devon, who showed a near-significant increase (p
= .06).
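As a rough sketch of the form of this analysis — not the authors' actual code or data (the original analyses used CSRE45-era software, and the per-listener counts below are invented for illustration) — a paired Wilcoxon Signed Ranks test on word-identification counts might look like:

```python
from scipy.stats import wilcoxon

# Invented example: words identified (out of 10) by each of 16
# listeners for one speaker, pre- and post-treatment.
pre = [1, 0, 2, 1, 1, 0, 1, 2, 1, 1, 0, 1, 2, 1, 1, 2]
post = [5, 4, 6, 5, 5, 4, 6, 5, 7, 5, 4, 6, 5, 5, 6, 7]

# Paired, non-parametric comparison of the pre-post difference;
# appropriate for small samples with no normality assumption.
statistic, p_value = wilcoxon(pre, post)
print(f"W = {statistic}, p = {p_value:.4f}")
```

The test ranks the absolute pre-post differences per listener, which is why it suits small listener panels better than a paired t-test.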
The small sample, case study design and preliminary
nature of the listener study preclude in-depth statistical
analyses or interpretation of the data for the various
speakers. The slight and non-significant increases in word
identification for Stan, Devon, Samantha and Delia may
have reflected a listener practice effect for those speakers,
rather than an actual improvement. Listeners did identify
significantly more target consonants in the two
adolescents' words (Stan and Samantha), which may,
however, be a reflection of positive treatment effects. The
increase in number of words and segments identified for
the children suggests a change above and beyond a practice
effect, which may or may not be attributable to the
program. The differential response of the listeners to
various speakers suggests that future studies will need to
include larger numbers of speakers of different ages,
etiologies, severity and phonological profiles.
The listeners’ differential responses also show that
everyday listeners can contribute useful data about
individuals in an evaluation process. In terms of the
listening tasks, word and segment identification can give
specific information that may be useful in determining
treatment effects. The issue of practice effects is a challenge
in listening tasks. Randomization of stimuli may help
diminish such effects, but even randomized stimuli may
influence each other in small word sets, suggesting that
larger word sets are needed in future studies. The results
from this listener study suggested further exploration of
visual feedback technology was warranted.
Study 2: Speech Habilitation using EPG and
Ultrasound with Adolescents with Hearing
Impairment
An Interdisciplinary Speech Research Laboratory
(ISRL) was funded at UBC in 2001, making new equipment
available for treatment studies, including two-dimensional ultrasound and WIN-EPG (2000). A follow-up exploratory project was initiated incorporating both
EPG and ultrasound, not to compare the two technologies
experimentally, but to learn more about each new
technology as a basis for future studies.
Treatment Study Background for Study #2
The first treatment study had shown a difference
between child and adult participants, with the adolescents
(Stan and Samantha) showing slightly greater treatment
effects than the other adults. In order to gain more insight
into the relevance of age, four adolescents aged 16-18 were
recruited for the second treatment study.
The students for the second study were a homogeneous
group in terms of age and etiology (hearing impairment).
The three males had aided hearing levels in the moderate
range (Palmer) or moderate-to-severe range (Purdy,
Peran). The female participant, Pamela, had a fluctuating
and progressive sensorineural hearing loss due to Large
Vestibular Aqueduct Syndrome. Her aided thresholds up
to 2000 Hz sloped downwards from normal to the mild
loss range in the better ear. In the other ear, her aided
thresholds were in the moderate to severe loss range.
Across speakers, audio-recorded baseline speech samples
showed mild to moderate suprasegmental aberrations in
terms of voice quality, intonation, nasalization, loudness
and/or pitch control. Sibilants and liquids were the least
well-established consonant categories. Vowels tended to
be centralized and/or lowered, and the tense-lax
distinction for vowels was only weakly established. Pamela
and Purdy were intelligible most of the time in
conversation; Peran and Palmer were intelligible some of
the time in conversation. The adolescents attended an
oral high school program for deaf and hard of hearing
with partial mainstreaming and speech-language support.
The adolescents had received at least 12 to 15 years of prior
speech habilitation. The second author was their current
speech-language pathologist. Prior to the visual feedback
treatment study, this S-LP conducted a 5-week traditional
treatment baseline, targeting /l/ and sibilants. Speakers
showed slight gains on consonants that they could already
pronounce on occasion pre-treatment. (See Bernhardt et
al., 2003, for more details.)
The speech habilitation study used both WIN-EPG
and ultrasound. There were 14 weekly individual
treatment sessions at the ISRL, with follow-up sessions at
the school without the use of technological feedback. All
speakers had the same treatment targets. Consonant
treatment targets included the voiceless coronal fricatives
/s/ and // and the approximants /l/ and /r/. For vowels,
the tense-lax distinction was targeted in the pairs /i/-//
and /u/-//. Additional data were collected as controls:
/k/ and alveolar stops, /t /, and all other vowels.
Treatment targets were generally counterbalanced across
equipment and order across speakers, although
approximants were addressed second for all participants.
Pamela and Palmer spent six sessions solely with EPG
(sibilants, vowels), and three sessions solely with
ultrasound (liquids); Peran and Purdy spent six sessions
solely with ultrasound (vowels, sibilants) and three solely
with EPG (liquids). For the final five sessions, all
participants alternated within sessions between
ultrasound and EPG. The difference in time allotment by
equipment provided an opportunity to make a
qualitative, non-experimental comparison between the
two technologies. During treatment, the first and second
authors (S-LPs) modeled targets and provided feedback,
using speech, sign and written information. Targets
changed in complexity from silent movements to isolated
segments to syllables, words and finally phrases.
Prior to the everyday listener study reported here,
trained listeners evaluated the pre- and post-treatment
data using phonetic transcription (Bernhardt et al., 2003).
The transcriptions indicated a 50% gain in consonant
target accuracy post-treatment for Pamela, Purdy and
Palmer, and a 28% gain for Peran. For vowel target accuracy, Purdy and Palmer showed a 28% gain, compared with a 16.9% gain for Pamela and a -1% regression for Peran (Bernhardt et al., 2003). Across
speakers, the trained listeners suggested that the most improved consonant was /r/, and the most improved vowel was /ʊ/, followed by /i/ and /u/. The question for
the current study was whether everyday listeners would
notice these or similar improvements.
Everyday Listener Evaluation in Study #2
Method for listener study #2.
Research assistants who had not helped with the
treatment study organized the stimuli for the everyday
listener study. Ten native English-speaking everyday
listeners between 20 and 45 participated (6 men, 4 women).
All had post-secondary education, no familiarity with
disordered speech, and normal hearing as screened at 25
dB from 500 to 4000 Hz.
Nonsense word samples from the ultrasound
recordings were selected for the everyday listener study,
because they had also been used in the trained listener
study (Bernhardt et al., 2003). The stimuli were controlled
in terms of phonetic context (/h/, /ɑ/ and /b/), and thus it was assumed that the listener could focus on the test segment. The following targets were included: vowel treatment targets /hib, hɪb, hub, hʊb/, vowel observation targets /heb, hɛb, hɑb/, consonant treatment targets /sɑ, hɑs, ʃɑ, hɑʃ, lɑ, hɑl, rɑ, hɑr/, and consonant observation targets /tɑ, hɑt/.
The audio-recorded sound files were transferred to a
Macintosh computer using Adobe Premiere 6.0 and
Macromedia SoundEdit 16. The sound files differed within
and across speakers in terms of degree of background
noise. This discrepancy was the result of different recording
levels during the probes rather than different signal-to-noise ratios, because recordings were made under constant
ambient noise conditions. Overall signal amplitude was
reduced for the louder files using SoundEdit 16, yielding
equally loud tokens across the listener sessions, and
bringing relative background noise to within 3 dB for all
tokens.
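Level equalization of this kind can be sketched as simple peak normalization; the function below is a generic numpy illustration of the idea, not the actual SoundEdit 16 procedure used in the study.

```python
import numpy as np

def normalize_peak(samples, target_peak=0.5):
    """Scale a waveform so its largest absolute sample equals target_peak.

    samples: 1-D numpy array of audio samples in the range [-1.0, 1.0].
    """
    peak = np.max(np.abs(samples))
    if peak == 0:
        return samples  # silence: nothing to scale
    return samples * (target_peak / peak)

# Two tokens recorded at different levels end up equally loud.
loud = np.array([0.0, 0.8, -0.9, 0.4])
quiet = np.array([0.0, 0.2, -0.25, 0.1])
print(np.max(np.abs(normalize_peak(loud))))
print(np.max(np.abs(normalize_peak(quiet))))
```

Scaling by peak amplitude keeps each token's internal dynamics intact while bringing tokens to a common presentation level.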
As was the case with the first listener study, the small
n favored within- rather than between-speaker analyses.
The stimuli were entered into PsyScope 1.2.5 (1993).
However, because PsyScope 1.2.5 (1993) could not easily
identify the source of individual tokens after
randomization, listeners were asked to make only within-speaker judgments for the four individual speakers. Data
from a fifth speaker was used for a short training session.
The within-speaker rating procedure may have resulted
in a practice effect for the listeners. However, speaker
order was counterbalanced across listeners, and pre- and
post-treatment tokens were randomized within the same
sets. In order to ensure data recoverability from PsyScope,
word-initial consonants, word-final consonants and
vowels were also in separate sets. The five target consonants
(/s, ʃ, l, r, t/) yielded 10 consonant sets, that is, 5 sets
of word-final consonants and 5 sets of word-initial
consonants. Each consonant set contained 20 randomized
tokens, 10 pre- and 10 post-treatment. Equal numbers of
the five consonant syllables were presented in random
order. There were seven vowel targets, and thus seven
vowel sets. Each set contained 20 vowel syllable targets, 10
pre- and 10 post-treatment tokens, with all vowels
represented as indicated above. Because of the 20-token
limit per set, one vowel could only be tested twice in each
set. However, all listeners heard all seven sets from each
speaker, ensuring that listeners rated all available vowel
tokens over the seven sets.
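The set construction described above (20 tokens per set, 10 pre- and 10 post-treatment, presented in random order) can be sketched as follows; the token labels and function name are hypothetical, for illustration only.

```python
import random

def build_presentation_set(pre_tokens, post_tokens, seed=None):
    """Merge 10 pre- and 10 post-treatment tokens into one randomized
    20-token presentation set, keeping each token's phase for scoring."""
    assert len(pre_tokens) == 10 and len(post_tokens) == 10
    tokens = [(t, "pre") for t in pre_tokens] + [(t, "post") for t in post_tokens]
    random.Random(seed).shuffle(tokens)  # reproducible shuffle when seeded
    return tokens

pre = [f"hib_pre_{i}" for i in range(10)]
post = [f"hib_post_{i}" for i in range(10)]
presentation = build_presentation_set(pre, post, seed=1)
print(len(presentation))  # 20
```

Recording each token's phase alongside the shuffled order is what allows pre- and post-treatment responses to be recovered for analysis after randomization — the step that PsyScope 1.2.5 could not perform automatically.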
A 3-point judgment scale was adopted for the
listener evaluation, following Ertmer and Maki (2000).
The 3-point scale allowed listeners to provide an
intermediate rating for tokens that were somewhat like
the target. Listeners attended two testing sessions of about
90 minutes each. In the first session, they practised with
the training set, and rated all test sets from two speakers.
In the second session, they rated sets from the other two
speakers. Listeners wore Koss UR-20 headphones and sat
in an IAC sound booth. Instructions were given orally
and on the computer screen. Listeners were instructed to
focus on the target (e.g. “FINAL CONSONANT” of the
word), and to ignore the rest of the word. They were told
to press “1” if the stimulus sounded “EXACTLY” like the
target, “2” if the stimulus sounded “SOMEWHAT” like
the target, and “3” if the stimulus sounded “NOT AT ALL”
like the target. They could escape from the program at any
time. For consonants, the printed consonant syllable
appeared in English orthography on the computer screen
at the same time as the sound recording was played back
(e.g., saw, hoss, shaw, etc.). The listener then registered his
or her rating on the computer keyboard (1, 2 or 3), within
5 seconds. In pilot testing, trained listeners could not
respond sufficiently quickly to the nonsense word vowel
stimuli. Thus, for the vowels, a familiar word containing
the target vowel was presented on the screen. The listener
was asked whether the vowel heard in the auditory
stimulus was the same as the one in the word on the screen:
/i/ - need, /ɪ/ - jig, /u/ - rude, /ʊ/ - good, /e/ - raid, /ɛ/ - bed, and /ɑ/ - log. No other external reference was given
to the listeners, in order to ensure that they would use
their own internal reference without experimenter
biasing.
With PsyScope 1.2.5, responses occasionally did not
register because of the speed of a response. Unfortunately,
if a given stimulus set had a missing response, the program
could not identify which item was missing, and thus all
data from that set had to be eliminated. The number of
incomplete consonant sets was just slightly greater than
chance (5.8%), but 21-25% of vowel sets across speakers
had missing sets. There was no difference in terms of
missing response sets across speakers, and no individual
speaker’s vowel data had to be eliminated altogether.
There was a bimodal difference in listener responses; five
listeners had missing responses in less than three data sets,
and the other five had missing responses in five to seven
data sets. The listener groups did not differ significantly
in rating proportions (‘1,’ ‘2,’ and ‘3’ responses), however,
showing that speed of response was probably the operative
variable. If listeners had more than one missing response
set for a given speaker and consonant or vowel category,
all data from that speaker were eliminated for that listener
and category as a cautionary measure. Otherwise,
complete data sets were pooled across listeners within
speakers, giving a final number of listener tokens as follows:
Peran: Vowels (V) — 840; Consonants (C)— 1720; Palmer:
V — 920; C — 1920; Purdy: V — 1000; C — 1890; Pamela:
V — 1040; C — 1900.
improvement. The vowel /i/ showed the most improved
listener ratings.
Space precludes a detailed comparison with the trained
listener study (Bernhardt et al., 2003) in terms of speaker
ratings. However, as predicted, there was general
congruence. Palmer was rated most severely pretreatment. Palmer, Pamela and Purdy all showed greater
gains in consonant ratings than Peran, and Purdy and
Palmer showed greater gains in vowel ratings than the
other two. The everyday listeners also agreed with the
trained listeners in rating /r/ as the most improved
consonant across speakers. Rankings for less-improved
consonants differed; the everyday listeners rated the
sibilants overall more highly than did the trained listeners,
perhaps showing a greater tolerance for dentalization of
sibilants. Trained and everyday listener evaluations
disagreed on the most improved vowel. According to the
everyday listener ratings, the most improved vowel was
/i/, whereas in the trained listener study, // was evaluated
as most improved, followed by /i/. The // vowel may be
difficult for everyday listeners to evaluate, partly because
of English orthography, where the “oo” can be /u/ or //
or because it is a lax vowel and therefore less salient. The
rating task was generally problematic for the vowels, as
attested by the number of abandoned vowel data sets.
The listeners’ differential response to the various
speakers further confirms that future research will have to
consider speaker profiles. The everyday listener rating
Results and discussion for listener
study #2.
Results were evaluated within speaker,
with Table 3 showing overall consonant
and vowel ratings. Across listeners, average
pre-treatment consonant ratings showed
fairly similar ratings in the low on-target
range for Purdy, Peran and Pamela: 1.61 to
1.76. For Palmer, the average rating was
2.16, that is, in the intermediate accuracy
range. A Wilcoxon’s Signed Ranks nonparametric test was used to test pre-post
differences, because of the small sample.
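The signed-ranks computation behind this pre-post comparison can be sketched as follows. The ratings below are invented for illustration (they are not the study’s data), and only the statistic W is computed, to be compared against a critical value for the effective n:

```python
def signed_rank_statistic(pre, post):
    """Wilcoxon signed-ranks statistic W for paired ratings.

    Zero differences are dropped; tied absolute differences
    receive their average rank, as in the standard test.
    """
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    abs_sorted = sorted(abs(d) for d in diffs)

    def avg_rank(x):
        first = abs_sorted.index(x) + 1       # 1-based rank of first tie
        ties = abs_sorted.count(x)
        return first + (ties - 1) / 2         # average over the tie group

    w_plus = sum(avg_rank(abs(d)) for d in diffs if d > 0)
    w_minus = sum(avg_rank(abs(d)) for d in diffs if d < 0)
    return min(w_plus, w_minus)               # small W favours rejecting H0

# Invented pre/post listener ratings (1 = on target, 3 = off target):
pre  = [2, 2, 3, 2, 3, 3, 2, 3, 2, 2]
post = [1, 2, 2, 2, 2, 2, 1, 2, 2, 1]
w = signed_rank_statistic(pre, post)
```

The resulting W is then compared against the tabled critical value for the number of non-zero differences at the chosen alpha level.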
Post-treatment, by speaker, average
consonant listener ratings improved
significantly for Palmer and Pamela, to
1.52 and 1.62 respectively (p < .01). Purdy
showed a slight non-significant
improvement, and Peran a slight non-significant regression. The most improved
consonant ratings across speakers were for
/r/, with individual speaker variability for
the sibilants and /l/. Average ratings for
vowels across listeners pre-treatment were
in the mid on-target range for all speakers,
from 1.4 for Pamela to 1.67 for Palmer.
Post-treatment, Purdy and Palmer showed
a small but significant increase in ratings to
1.42 and 1.50 respectively (p < .01). Pamela
and Peran showed a small, non-significant
Table 3
Everyday listener ratings (1, 2, 3) of speakers' pre- and post-treatment consonants and vowels

C or V       Speaker    Average rating pre-Tx    Average rating post-Tx
Consonants   Purdy      1.61 (0.77)              1.55 (0.72)
             Peran      1.69 (0.73)              1.70 (0.75)
             Pamela     1.76 (0.81)              1.62* (0.72)
             Palmer     2.16 (0.81)              1.52* (0.69)
Vowels       Pamela     1.40 (0.63)              1.38 (0.58)
             Purdy      1.53 (0.69)              1.42* (0.62)
             Peran      1.57 (0.69)              1.51 (0.68)
             Palmer     1.67 (0.73)              1.50* (0.69)
C            All        1.81 (0.81)              1.59* (0.72)
V            All        1.53 (0.69)              1.45* (0.65)

Note: Standard deviation in parentheses. "1" rating: "exactly like the target;" "2" rating: "somewhat like the target;" "3" rating: "not at all like the target." Based on ratings of 10 everyday listeners of the consonants /r/, /l/, /s/, //, /t/, and vowels /i/, //, /u/, //, //, //, /e/.
*p < .01 on Wilcoxon's Signed Ranks tests.
Revue d’orthophonie et d’audiologie - Vol. 29, No 4, Hiver 2005
179
Electropalatography and Ultrasound
Figure 5a-5d. EPG pre- and post-treatment images for Pamela’s /s/ (5a, 5b pre-post) and /r/ (5c, 5d pre-post). Post-treatment, /s/ has a narrower groove, and /r/ more symmetrical contact.

Figure 6a-d. EPG images for Peran’s vowels // (6a, 6b pre-post) and /u/ (6c, 6d pre-post). The /u/ is advanced, and the // retracted after treatment.

Figure 7a-d. Ultrasound images for Pamela’s /i/ (7a, 7b pre-post treatment) and // (7c, 7d pre-post). The /i/ is advanced and heightened in comparison with // post-treatment.
scale did discriminate between speakers and speech targets.
Vowel ratings were problematic, however, in this study.
Furthermore, individual listener references for the “1,”
“2” and “3” ratings remain unknown. If a speaker has an
average 1.5 rating on some target, it is not clear what that
implies for the target or everyday communication. The
rating scale provided less specific observations than the
word identification task as used in the first study.
As a final note on the second study and as a further
tutorial on ultrasound and EPG, sample pre- and post-treatment images are given in figures 5-7. Figure 5 shows
a narrowing of the groove for /s/ post-treatment for
Pamela (figures 5a, 5b), and a more symmetrical contact
pattern for /r/ (figures 5c, 5d). The EPG images for Peran
show the // tongue-palate contact moving backward
(figures 6a, 6b) and the /u/ contact moving forward
(figure 6c, 6d). The ultrasound images for Pamela’s /i/
(figures 7a, 7b) and // (figures 7c, 7d) show a
differentiation post-treatment, with the /i/ showing a
higher tongue body and advanced tongue root. Overall,
Journal of Speech-Language Pathology and Audiology - Vol. 29, No. 4, Winter 2005
these images correspond generally to the listener
perspectives. For further discussion of perceptual-articulatory convergence in ultrasound studies, see Adler-Bock (2004).
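As a sketch of how a “narrower groove” can be quantified from EPG frames: WIN-EPG-style palates register tongue contact as a grid of electrodes (62 on the Reading palate, arranged in rows), and groove width within a row can be estimated as the longest run of uncontacted electrodes. The frames below are invented for illustration, not taken from the study:

```python
def groove_width(row):
    """Longest run of uncontacted electrodes (0s) in one EPG row.

    A narrower groove for /s/ shows up as a shorter run of 0s
    between the lateral contact (1s) at the edges of the row.
    """
    best = run = 0
    for contact in row:
        run = 0 if contact else run + 1
        best = max(best, run)
    return best

# Invented alveolar-row frames (1 = contact, 0 = no contact):
pre_row  = [1, 0, 0, 0, 0, 0, 0, 1]   # wide groove pre-treatment
post_row = [1, 1, 1, 0, 0, 1, 1, 1]   # narrower groove post-treatment
```

Comparing the index across pre- and post-treatment frames gives a simple numerical counterpart to the visual impression in Figure 5.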
Conclusion
The current paper had two purposes: (1) to provide
an introduction to EPG and ultrasound as tools in speech
habilitation, and (2) to present preliminary everyday
listener observations concerning pre- and post-treatment
samples collected for two exploratory studies at UBC, as
a foundation for future research and clinical initiatives.
Comparing EPG and ultrasound, they provide
different types of dynamic information about the tongue
during speech production. EPG shows tongue-palate
contact patterns from the tongue tip to the back of the
hard palate for mid and high vowels and lingual
consonants. Ultrasound images tongue shape, location
and configuration for all vowels and lingual consonants.
In practical terms, EPG is much less expensive than
ultrasound, and WIN-EPG has built-in analysis
capabilities. However, ultrasound does not require
individualized artificial palates for each client, and strides
are being made in data quantification (see the
aforementioned websites). In terms of speech habilitation,
both tools appear to hold promise. Research is needed to
compare the relative benefits of each in speech habilitation,
and their merit in comparison with other technologies
and methods across a variety of speakers.
In working towards clinical implementation of EPG
or ultrasound, it is important to know whether the benefits
outweigh the costs. The ultimate test of any treatment
methodology is through randomized clinical trials, with
large numbers of participants, comparison and control
groups, blinding of S-LPs to conditions and data, rigorous
baseline and treatment protocols, multiple types of
evaluations and control of external variables. As a prelude
to such trials, additional small n studies may provide
further information as to the relative merit of the
technologies. The everyday listener studies reported in
the current paper raise a number of questions for future
research in speech habilitation with visual feedback
technology, particularly in terms of speaker profiles and
evaluation methods. More research is needed to determine
the relationship of speaker variables to outcomes, for
example, age, etiology or severity. In terms of evaluation
methods, a number of issues were raised by the everyday
listener studies. In the second study, everyday listeners
agreed with a previous study by trained listeners on the
most improved speakers and consonant, with near
agreement on the most improved vowel. Although this
congruence is encouraging, one aspect of the everyday
listener data suggests that it may be important to consider
the relative impact of the degree of change in future
studies. Everyday listener observations across the two
studies showed minimal pre-post differences for 7 of 11
speakers. This may be because the everyday listeners rated
six of the pre-treatment samples relatively highly, leaving
minimal room for improvement. Trained listeners might
have noted minor improvements through narrow
phonetic transcription or instrumental analyses.
However, it is not known what the impact of small
differences might be in speakers’ everyday conversations.
More research is needed to compare everyday and trained
listener observations and to determine the impact of
different types and degrees of change on conversational
intelligibility. In the studies reported here, everyday
listener measures of intelligibility were used, specifically
word identification and accuracy judgments. Word
identification may be more ecologically valid than
accuracy judgments, because conversation involves word
identification. However, accuracy judgments can
differentiate between speakers and samples. Thus, both
can contribute information to the evaluation process.
Although everyday listener observations bring the ‘real
world’ into the evaluation process, future research also
needs to bring the interaction between speaker and listener
into focus through comprehensibility studies (Visser,
2004; Yorkston et al., 1996). Qualitative studies gathering
the viewpoints of the speakers and their conversation
partners may also be illuminating, and are currently
underway in the UBC research program.
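A word identification measure of the kind used in the first study can be sketched as the percentage of target words a listener identifies correctly. The exact-match scoring below is an assumption for illustration, not the studies’ protocol:

```python
def word_identification_score(targets, responses):
    """Percent of target words the listener identified correctly.

    Case-insensitive exact match; stricter or looser matching
    (e.g., phonetic similarity) would be a protocol decision.
    """
    if not targets:
        return 0.0
    hits = sum(t.lower() == r.lower() for t, r in zip(targets, responses))
    return 100.0 * hits / len(targets)

# Hypothetical target words and one listener's written responses:
targets   = ["rake", "lock", "sun", "ship", "cheap"]
responses = ["rake", "rock", "sun", "sip", "cheap"]
score = word_identification_score(targets, responses)
```

Averaging such scores across listeners and samples yields a pre-post intelligibility comparison complementary to the accuracy-judgment ratings.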
In terms of clinical application, S-LPs currently have
limited or no access to EPG or ultrasound for their clients
unless they can form partnerships with university centres
or hospitals engaged in research. Cleftnet UK has been
established and may also be a future possibility in Canada.
Meanwhile, the information about phonetics and
treatment evaluation presented in this paper may provide
S-LPs with some new ideas for daily clinical practice and
its evaluation.
Acknowledgements
Thank you to the following: the speakers and listeners;
S-LPs David Loyst and Shannon Muir, Study #1; research
assistants Dana Bryer, Dana Kanwischer, Jonathan Howell
and Marie Jette; Dr. Katherine Pichora-Fuller of the
University of Toronto and Dr. Linda Rammage of the
Vancouver Hospital Voice Clinic, committee members
for Rhea Williams’ Master’s thesis; and the reviewers of
this paper. For funding, we thank the Canadian
Foundation for Innovation, the BC Medical Services
Foundation, UBC’s Hampton and HSS funds and the
Variety Club.
References
Adler-Bock, M. (2004). Visual feedback from ultrasound in remediation of
persistent /r/ errors: Case studies of two adolescents. Unpublished Master’s thesis,
University of British Columbia.
Bernhardt, B., Fuller, K., Loyst, D., & Williams, R. (2000). Speech production
outcomes before and after palatometry for a child with a cochlear implant. Journal
of the Association of Rehabilitative Audiology, 23, 11-37.
Bernhardt, B., Gick, B., Adler-Bock, M., & Bacsfalvi, P. (2005). Ultrasound in
speech therapy with adolescents and adults. Clinical Linguistics and Phonetics, 19,
605-617.
Bernhardt, B., Gick, B., Bacsfalvi, P., & Ashdown, J. (2003). Speech habilitation
of hard of hearing adolescents using electropalatography and ultrasound as evaluated
by trained listeners. Clinical Linguistics and Phonetics, 17, 199-216.
Bernthal, J., & Bankson, N. (2004). Articulation and phonological disorders, Fifth
Edition. Boston, MA: Allyn and Bacon.
Black, H. (1999). An evaluation of palatometry in the treatment of /r/ for a hard
of hearing adolescent. Master of Science graduating essay, University of British
Columbia.
Bridges, C. C., & Huckabee, R. M. (1970). A new visual speech display: Its use in
speech therapy. Volta Review, 72, 112-115.
Dagenais, P. (1992). Speech training with glossometry and palatometry with
profoundly hearing impaired children. The Volta Review, 94, 261-282.
Delattre, P. C. & Freeman, D. C. (1968). A dialect study of American r’s by x-ray
motion picture. Linguistics, 44, 29-68.
Derry, K., & Bernhardt, B. (2000). Palatometry intervention in relation to body
(structure and function), activity, and participation. Poster presented at the VIIIth
meeting of the International Clinical Phonetics and Linguistics Association, Edinburgh,
Scotland.
Ertmer, D., & Maki, J. (2000). Comparison of speech training methods with deaf
adolescents: spectrographic versus noninstrumental instruction. Journal of Speech,
Language & Hearing Research, 43, 1509-1523.
Fletcher, S., Dagenais P., & Critz-Crosby, P. (1991). Teaching consonants to
profoundly hearing-impaired speakers using palatometry. Journal of Speech and
Hearing Research, 34, 929-942.
Frattali, C. (1998). Measuring outcomes in speech-language pathology. New York,
NY: Thieme.
Gibbon, F., Crampin, L., Hardcastle, W., Nairn, M., Razzell, R., Harvey, L., &
Reynolds, B. (1998). Cleftnet (Scotland): A network for the treatment of cleft palate
speech using EPG. International Journal of Language & Communication Disorders, 33,
supplement, 44-49.
Gibbon, F., Hardcastle, W., Dent, H., & Nixon, F. (1996). Types of deviant sibilant
production in a group of school-aged children, and their response to treatment using
electropalatography. In M.J. Ball & M. Duckworth (Eds.), Advances in clinical
phonetics (pp. 115-149). Amsterdam: John Benjamins.
Gibbon, F., & Wood, S. (2003). Using electropalatography (EPG) to diagnose and
treat articulation disorders associated with mild cerebral palsy: A case study. Clinical
Linguistics and Phonetics, 17, 365-374.
Gick, B. (2002). The use of ultrasound for linguistic phonetic fieldwork. Journal
of the International Phonetic Association, 32, 113-122.
Hardcastle, W., Gibbon, F., & Scobbie, J. (1995). Phonetic and phonological
aspects of English affricate production in children with speech disorders. Phonetica,
52, 242-250.
Hardcastle, W., Jones, W., Knight, C., Trudgeon, A., & Calder, G. (1989). New
developments in EPG. A state of the art report. Clinical Linguistics and Phonetics, 3,
1-38.
Kent, R., Miolo, G., & Bloedel, S. (1994). The intelligibility of children’s speech:
A review of evaluation procedures. American Journal of Speech-Language Pathology,
3, 81-95.
Klasner, E., & Yorkston, K. (1999). Dysarthria in ALS: A method for obtaining the
everyday listener’s perception. Journal of Medical Speech-Language Pathology, 8,
261-264.
McLeod, S., Roberts, A., & Sita, J. (2003). The difference between /s/ and /z/:
More than +/- voice? Poster presented at the American Speech and Hearing
Association Convention, Chicago, IL, November, 2003.
Maki, J. (1983). Application of the speech spectrographic display in developing
articulatory skills in hearing-impaired adults. In I. Hochberg, H. Levitt, & M. Osberger
(Eds.), Speech of the hearing impaired: Research, training and personnel preparation
(pp. 297-312). Baltimore: University Park.
Michi, K., Yamashita, Y., Imai, S., & Yoshida, H. (1993). Role of visual feedback
treatment for defective /s/ sounds in patients with cleft palate. Journal of Speech and
Hearing Research, 26, 277-285.
Schiavetti, N. (1992). Scaling procedures for the measurement of speech intelligibility.
In R. Kent (Ed.), Intelligibility in speech disorders: Theory, measurement and
management (pp. 11-34). Amsterdam: John Benjamins.
Schuster, L., Ruscello, D., & Toth, A. (1995). The use of visual feedback to elicit
correct /r/. American Journal of Speech-Language Pathology, 4, 37-44.
Sproat, R., & Fujimura, O. (1993). Allophonic variation in English /l/ and its
implications for phonetic implementation. Journal of Phonetics, 21, 291-311.
Stone, M. (2005). A guide to analyzing tongue motion from ultrasound images.
Clinical Linguistics and Phonetics, 19, 455-501.
Stone, M., & Lundberg, L. (1996). Three-dimensional tongue surface shapes of
English consonants and vowels. Journal of the Acoustical Society of America, 99(6),
1-10.
Volin, R. (1991). Micro-computer based systems providing biofeedback of voice
and speech production. Topics in Language Disorders, 11, 65-79.
Williams, R. (1998). Outcomes of palatometry therapy as perceived by untrained
listeners. Unpublished Master’s thesis, University of British Columbia.
World Health Organization. (2001). ICF: International classification of functioning,
disability and health. Geneva: World Health Organization.
Yorkston, K., Strand, E., & Kennedy, M. (1996). Comprehensibility of dysarthric
speech: Implications for assessment and treatment planning. American Journal of
Speech-Language Pathology, 5, 55-66.
Author Note
Please direct enquiries to Barbara Bernhardt, School
of Audiology and Speech Sciences, University of British
Columbia, 5804 Fairview Avenue, Vancouver, BC V6T
1Z3, [email protected]
Received: November 16, 2004
Accepted: October 4, 2005
Resource Review / Évaluation des ressources
Resource Review/
Évaluation des ressources
Clinical Evaluation of Language Fundamentals
Preschool–2nd Edition (2004)

Authors: Elisabeth H. Wiig, Wayne A. Secord and Eleanor Semel
Publisher: Harcourt Assessment, Inc., 19500 Bulverde Road, San Antonio, Texas 78259, USA; PsychCorp: www.PsychCorp.com
Reviewer: Sharon A. Bond, M.Sc., R.SLP, S-LP(C), CCC-SLP, Speech-Language Pathologist, Capital Health Authority, Edmonton, Alberta
The Clinical Evaluation of Language
Fundamentals Preschool-2nd Edition (CELF Preschool-2) is an individually administered clinical tool that can
be used to identify and diagnose language deficits in
children who are 3-6 years of age. According to the
authors, the test was redesigned to make it easier and
quicker to administer, to provide greater diagnostic
value for children ages 3-4 years and 5-6 years, to assist
in the assessment of early classroom and literacy
fundamentals and communication in context
(pragmatics), and to add composite scores to evaluate
content (semantics) and structure (morphosyntax).
The authors describe the CELF Preschool-2 as a
series of levels. The level chosen to begin a particular
child’s assessment is dependent upon the examiner’s
clinical judgment, the child’s functional language
performance, and the referral question to be answered.
Level 1 is used to determine whether or not a language
disorder exists. It consists of the Sentence Structure,
Word Structure and Expressive Vocabulary Subtests.
Scores on these subtests are used to develop a Core
Language Score. The authors state that this score best
discriminates the performance of those children with
typically developing language and those children with
language disorders. If a child is found to have a language
disorder, further language testing will be done at Level
2. This level of testing is to provide more information
about how language modalities, language content, and
language structures are affected. At Level 2, the examiner
is able to determine patterns of performance across
index scores and to compare the child’s score patterns
with the appropriate norm-reference group. Item
analysis can be conducted on all subtests at this level.
At Level 3, the examiner evaluates a child’s early
classroom and literacy fundamentals. Phonological
awareness and pre-literacy rating are included at this
level. To determine a child’s pragmatic skills in relation
to social communication in the home, community and/
or school (Level 4), a caregiver or other person familiar
with the child completes the Descriptive Pragmatics
Profile. This profile includes verbal and nonverbal
behaviors.
The CELF Preschool-2 is individually administered.
Administration time is dependent upon the level of
assessment done. Approximately 20 minutes is required
to administer the three subtests needed to develop the
Core Language score. More time is required for levels 2
and 3. The test results are interpreted in terms of subtest
scaled scores, composite standard scores, criterion scores,
percentile ranks, and age equivalents.
The test components include the Examiner’s
Manual, Stimulus Books 1 and 2, Record Form, and
Checklists for The Pre-Literacy Rating Scale and the
Descriptive Pragmatics Profile. The Examiner’s Manual
includes guidelines for scoring and interpreting test
performance, detailed description of the test
development, normative data, and suggestions for
intervention and/or follow-up based on the test results.
Stimulus Book 1 is colorful and includes all the directions
to administer the subtests. Stimulus Book 2 contains
illustrations for the story used for Recalling Sentences in
Context. The Record Form contains the test items, space
for recording responses, item analysis, and clear
information regarding whether or not repetitions are
allowed and when the discontinue rule is to be applied.
A Behavioral Observation Checklist is also included.
The checklists can be completed by someone familiar
with the child and his or her background.
Norm-referenced data for the CELF Preschool-2 were derived from a standardization sample of 800
children aged 3 years to 6 years in 2002. The sample
(based on the U.S. Bureau of the Census, 2000) was
stratified on the basis of age, sex, race/ethnicity,
geographic region, and primary caregiver’s education
level. Reported test-retest reliability for the subtests and
composite scores by age and across all ages ranged from
a low of .78 to a high of .94. Internal consistency reliability
coefficients across all ages ranged from .79 to .97. The
test-retest reliability and internal consistency would be
considered acceptable across subtests and excellent for
the composite scores.
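Test-retest coefficients of this kind are commonly computed as a correlation between scores from the two administrations. A minimal sketch with invented subtest scores (not the CELF Preschool-2 data):

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between paired score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Invented scaled scores from two administrations of one subtest:
first  = [8, 10, 12, 7, 11, 9]
second = [9, 10, 11, 7, 12, 9]
r = pearson_r(first, second)
```

Values approaching 1.0 indicate that children retain their relative standing across administrations, which is the sense in which .78 to .94 reads as acceptable to excellent.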
The authors presented evidence of validity of the
CELF Preschool-2 based on test content, the response
process, internal structure and intercorrelational studies.
The results of the intercorrelations between the subtests
and the composite scores were given. Overall, the
correlations were moderate. Further validation evidence
was reported through comparison with other tests of
language disorders in children, including the CELF
Preschool, the CELF-4, and the PLS-4. The correlations
were reported as moderate to high. The validity of the
CELF Preschool-2 was based on five different types of
research that yielded significant results. Therefore, the
validation of the test would be considered good.
In summary, the Clinical Evaluation of Language
Fundamentals Preschool-2 is a well-designed test for
identifying and assessing language disorders in children
ages 3 years to 6 years. I found that the structuring of the
test into levels allowed me to tailor the language
assessment based on the specific needs of the child I was
seeing. The directions of each subtest are not only on the
Response Form, but also in the Stimulus Book. Each
subtest is preceded by information regarding materials
needed, number of repetitions allowed, and the
discontinue rule. If the individual subtest score is to be
used in more than one index or composite score, the
indexes are listed. Acceptable alternate responses and 1-point responses are included on the Response Form.
I am pleased with the modifications to many of
the stimulus plates. The pictures are colorful and are
interesting to children. During language assessments, I
often have children who like to talk about the pictures.
At that point, I take the opportunity to record what they
say for a language sample. The various children depicted
represent Caucasian, Black, Hispanic, and Asian ethnic
backgrounds. Children taking the test are able to identify
with the presented manner of dress and activities. The
story used for the Recalling Sentences in Context is a
significant improvement over the one in the previous
edition. “The Big Move” has been replaced with “No
Juice!”, a story about three children’s trip with their
mother to the grocery store. This is an event that is very
familiar to children.
In summary, the CELF Preschool-2 is a well-designed clinical tool for identifying and diagnosing
language disorders in children ages 3 years to 6 years. The
authors have provided evidence of the instrument’s
acceptable reliability and validity. The CELF Preschool-2 would be a very good addition to the assessment
protocols of those speech-language pathologists who
provide services to the preschool and kindergarten
population.
Take a look at us!

THE ROA ONLINE SEARCH ENGINE

Search online for articles that have been published in the ROA. Searches can be run by article, issue, subject, author or keyword. The search results let you view the full abstract and give you the option of ordering a reprint of the article or back issues, or of subscribing. It is an excellent research tool. Take a look under ROA in the Resources section of www.caslpa.ca.

Use the ROA search engine:
www.caslpa.ca/english/resources/jslpa-index.asp

Another quality service offered by CASLPA
Information for Contributors
The Journal of Speech-Language Pathology and Audiology
(JSLPA) welcomes submissions of scholarly manuscripts related
to human communication and its disorders broadly defined.
This includes submissions relating to normal and disordered
processes of speech, language, and hearing. Manuscripts that
have not been published previously are invited in English and
French. Manuscripts may be tutorial, theoretical, integrative,
practical, pedagogic, or empirical. All manuscripts will be evaluated
on the basis of the timeliness, importance, and applicability of the
submission to the interests of speech–language pathology and
audiology as professions, and to communication sciences and
disorders as a discipline. Consequently, all manuscripts are
assessed in relation to the potential impact of the work on
improving our understanding of human communication and its
disorders. All categories of manuscripts submitted will undergo
peer-review to determine the suitability of the submission for
publication in JSLPA. The Journal has recently established multiple
categories of manuscript submission that will permit the broadest
opportunity for dissemination of information related to human
communication and its disorders. The new categories for manuscript
submission include:
Tutorials. Review articles, treatises, or position papers that
address a specific topic within either a theoretical or clinical
framework.
Articles. Traditional manuscripts addressing applied or
basic experimental research on issues related to speech, language,
and/or hearing with human participants or animals.
Clinical Reports. Reports of new clinical procedures,
protocols, or methods with specific focus on direct application
to identification, assessment and/or treatment concerns in speech,
language, and/or hearing.
Brief Reports. Similar to research notes, brief
communications concerning preliminary findings, either clinical
or experimental (applied or basic), that may lead to additional
and more comprehensive study in the future. These reports are
typically based on small “n” or pilot studies and must address
disordered participant populations.
Research Notes. Brief communications that focus on
experimental work conducted in laboratory settings. These reports
will typically address methodological concerns and/or
modifications of existing tools or instruments with either normal
or disordered populations.
Field Reports. Reports that outline the provision of services
that are conducted in unique, atypical, or nonstandard settings;
manuscripts in this category may include screening, assessment,
and/or treatment reports.
Letters to the Editor. A forum for presentation of scholarly/
clinical differences of opinion concerning work previously
published in the Journal. Letters to the Editor may influence our
thinking about design considerations, methodological confounds,
data analysis and/or data interpretation, etc. As with other
categories of submissions, this communication forum is
contingent upon peer-review. However, in contrast to other
categories of submission, rebuttal from the author(s) will be
solicited upon acceptance of a letter to the editor.
Submission of Manuscripts
Contributors should send five (5) copies of manuscripts
including all tables, figures or illustrations, and references to:
Phyllis Schneider, PhD
Editor, JSLPA
Dept. of Speech Pathology and Audiology
University of Alberta
2-70 Corbett Hall
Edmonton, AB T6G 2G4
Along with copies of the manuscript, a cover letter indicating
that the manuscript is being submitted for publication
consideration should be included. The cover letter must explicitly
state that the manuscript is original work that has not been
published previously and that it is not currently under review
elsewhere. Manuscripts are received and peer-reviewed contingent
upon this understanding. The author(s) must also provide
appropriate confirmation that work conducted with humans or
animals has received ethical review and approval. Failure to
provide information on ethical approval will delay the review
process. Finally, the cover letter should also indicate the category
of submission (i.e., tutorial, clinical report, etc.). If the editorial
staff determines that the manuscript should be considered
within another category, the contact author will be notified.
All submissions should conform to the publication
guidelines of the Publication Manual of the American
Psychological Association (APA), 5th Edition. Manuscripts
should be word processed, IBM format preferred. Should the
manuscript be accepted for publication, submission of a diskette
version of the submission will facilitate the publication process.
A confirmation of receipt for all manuscripts will be provided
to the contact author prior to distribution for peer-review.
JSLPA seeks to conduct the review process and respond to
authors regarding the outcome of the review within 90 days of
receipt. If a manuscript is judged as suitable for publication in
JSLPA, authors will have 30 days to make necessary revisions
prior to a secondary review.
The author is responsible for all statements made in his or
her manuscript, including changes made by the editorial and/
or production staff. Upon final acceptance of a manuscript and
immediately prior to publication, the contact author will be
permitted to review galley proofs and confirm their content to the
publication office within 72 hours of receipt.
Organization of the Manuscript
All copies should be typed, double-spaced, with a standard
typeface (12 point, noncompressed font) on high quality 8 ½ X
11 paper. All margins should be at least one (1) inch. An original
and four (4) copies of the manuscript should be submitted directly
to the Editor. Author identification for the review process is
optional; if blind-review is desired, three (3) of the copies should
be prepared accordingly (cover page and acknowledgments
blinded). Responsibility for removing all potential identifying
information rests solely with the author(s). All manuscripts
should be prepared according to APA guidelines. This manual is
available from most university bookstores or is accessible via
commercial bookstores. Generally, the following sections should
be submitted in the order specified.
Title Page: This page should include the full title of the
manuscript, the full names of the author(s) with academic
degrees, each author’s affiliation, and a complete mailing address
for the contact author. An electronic mail address also is
recommended.
Abstract: On a separate sheet of paper, a brief yet informative
abstract that does not exceed one page is required. The abstract
should include the purpose of the work along with pertinent
information relative to the specific manuscript category for
which it was submitted.
Key Words: Following the abstract and on the same page,
the author(s) should supply a list of key words for indexing
purposes.
Tables: Each table included in the manuscript must be
typewritten and double-spaced on a separate sheet of paper.
Tables should be numbered consecutively beginning with Table
1. Each table must have a descriptive caption. Tables should serve
to expand the information provided in the text of the manuscript,
not to duplicate information.
Potential Conflicts of Interest
and Dual Commitment
As part of the submission process, the author(s) must
explicitly identify if any potential conflict of interest, or dual
commitment, exists relative to the manuscript and its author(s).
Such disclosure is requested so as to inform JSLPA that the author
or authors have the potential to benefit from publication of the
manuscript. Such benefits may be either direct or indirect and
may involve financial and/or other nonfinancial benefit(s) to the
author(s). Disclosure of potential conflicts of interest or dual
commitment may be provided to editorial consultants if it is
believed that such a conflict of interest or dual commitment may
have had the potential to influence the information provided in
the submission or compromise the design, conduct, data collection
or analysis, and/or interpretation of the data obtained and
reported in the manuscript submitted for review. If the manuscript
is accepted for publication, editorial acknowledgement of such
potential conflict of interest or dual commitment may occur
when publication occurs.
Illustrations: All illustrations included as part of the
manuscript will need to be included with each copy of the
manuscript. While a single copy of original artwork (black and
white photographs, x-ray films, etc.) is required, all manuscripts
must have clear copies of all illustrations for the review process.
For photographs, 5 x 7 glossy prints are preferred. High quality
laser printed materials are also acceptable. For other types of
computerized illustrations, it is recommended that JSLPA
production staff be consulted prior to preparation and submission
of the manuscript and associated figures/illustrations.
Legends for Illustrations: Legends for all figures and
illustrations should be typewritten (double-spaced) on a separate
sheet of paper with numbers corresponding to the order in which
figures/illustrations appear in the manuscript.
Page Numbering and Running Head: The text of the
manuscript should be prepared with each page numbered,
including tables, figures/illustrations, references, and if
appropriate, appendices. A short (30 characters or less) descriptive
running title should appear at the top right hand margin of each
page of the manuscript.
Acknowledgments: Acknowledgments should be
typewritten (double-spaced) on a separate sheet of paper.
Appropriate acknowledgment for any type of sponsorship,
donations, grants, technical assistance, and to professional
colleagues who contributed to the work, but are not listed as
authors, should be noted.
References: References are to be listed consecutively in
alphabetical order, then chronologically for each author. Authors
should consult the APA publication manual (5th Edition) for
methods of citing varied sources of information. Journal names
and appropriate volume number should be spelled out and
underlined. All literature, tests and assessment tools, and standards
(ANSI and ISO) must be listed in the references. All references
should be double-spaced.
Participants in Research
Humans and Animals
Each manuscript submitted to JSLPA for peer-review that
is based on work conducted with humans or animals must
acknowledge appropriate ethical approval. In instances where
humans or animals have been used for research, a statement
indicating that the research was approved by an institutional
review board or other appropriate ethical evaluation body or
agency must clearly appear along with the name and affiliation
of the research ethics evaluation body or agency and the ethical
approval number. The
review process will not begin until this information is formally
provided to the Editor.
Similar to research involving human participants, JSLPA
requires that work conducted with animals state that such work
has met with ethical evaluation and approval. This includes
identification of the name and affiliation of the research ethics
evaluation body or agency and the ethical approval number. A
statement that all research animals were used and cared for in an
established and ethically approved manner is also required. The
review process will not begin until this information is formally
provided to the Editor.
Journal of Speech-Language Pathology and Audiology - Vol. 29, No. 4, Winter 2005
Information for Contributors
The Revue d'orthophonie et d'audiologie (ROA) welcomes the submission of research manuscripts pertaining to human communication and its disorders, broadly defined. This includes manuscripts addressing normal and disordered processes of speech, language, and hearing. We seek manuscripts that have never been published, in French or in English. Manuscripts may be tutorial, theoretical, integrative, practical, pedagogical, or empirical. All manuscripts will be evaluated on the basis of their significance, timeliness, and applicability to the interests of speech-language pathology and audiology as professions, and to communication sciences and disorders as disciplines. Accordingly, all manuscripts are assessed in relation to their potential impact on improving our understanding of human communication and its disorders. Regardless of category, all submitted manuscripts will undergo peer review to determine their suitability for publication in the ROA. The Journal has recently established several manuscript categories to permit the broadest possible dissemination of information pertaining to human communication and its disorders. The manuscript categories include:
Tutorials: Review papers, treatises, or position papers addressing a specific topic within either a theoretical or a clinical framework.
Articles: Conventional manuscripts addressing applied or basic experimental research on issues related to speech, language, or hearing, involving human or animal participants.
Clinical Reports: Reports of new clinical procedures, protocols, or methods, with specific focus on direct application to questions of identification, assessment, and treatment related to speech, language, and hearing.
Short Reports: Similar to research notes, brief communications concerning preliminary findings, either clinical or experimental (applied or basic), that may lead to more extensive study in the future. These reports are typically based on small-n or pilot studies and must address disordered populations.
Research Notes: Brief communications that focus specifically on experimental work conducted in laboratory settings. These reports typically address methodological issues or modifications of existing tools used with normal or disordered populations.
Field Reports: Reports that outline the provision of services offered in unique, atypical, or specific situations; manuscripts in this category may include screening, assessment, or treatment reports.
Letters to the Editor: A forum for presenting differences of scientific or clinical opinion concerning work previously published in the Journal. Letters to the Editor may influence our thinking about design factors, methodological confounds, data analysis or interpretation, and so on. As with other categories of submission, this forum of communication is subject to peer review. However, unlike other categories, responses from the authors will be solicited upon acceptance of a letter.
Manuscript Submission
Contributors are asked to send five (5) copies of their manuscripts, including all tables, figures or illustrations, and references, to:
Phyllis Schneider, Ph.D.
Editor, Revue d'orthophonie et d'audiologie
Dept. of Speech Pathology and Audiology
University of Alberta
2-70 Corbett Hall
Edmonton, Alberta T6G 2G4
A cover letter indicating that the manuscript is being submitted for publication must accompany the copies. The cover letter must specify that the manuscript is original work, that it has not been published previously, and that it is not currently under review for publication elsewhere. Manuscripts are received and reviewed upon acceptance of these conditions. The author(s) must also provide appropriate certification that any research involving human or animal participants has received the approval of an ethics review board. Absence of such approval will delay the review process. Finally, the cover letter must also specify the category of the submission (i.e., tutorial, clinical report, etc.). If the review team judges that the manuscript belongs in a different category, the contact author will be so advised.
All submissions must conform to the guidelines presented in the Publication Manual of the American Psychological Association (APA), 5th Edition. Manuscripts should be prepared using word-processing software, preferably in IBM format. Inclusion of a diskette, if the manuscript is accepted, facilitates publication. Receipt of each manuscript will be acknowledged to the contact author before copies are distributed for review. The ROA seeks to complete this review and to inform authors of its outcome within 90 days of receipt. When a manuscript is judged suitable for the ROA, authors will be given 30 days to make the necessary revisions before the secondary review.
The author is responsible for all statements made in the manuscript, including any modifications made by the editors and reviewers. Upon final acceptance of a manuscript and immediately prior to publication, the contact author will be given the opportunity to review the proofs and must attest to verification of their content within 72 hours of receiving them.
Organization of the Manuscript
All copy must be typed, double-spaced, in a standard typeface (12-point, non-compressed font) on high-quality 8 ½" x 11" paper. All margins must be at least one (1) inch. The original and four (4) copies of the manuscript must be submitted directly to the Editor. Author identification is optional for the review process: authors who do not wish to be identified at this stage should prepare three (3) copies of the manuscript with the cover page and acknowledgments masked. Authors are solely responsible for removing any potentially identifying information. All manuscripts must be prepared in accordance with APA guidelines. The manual is available in most university bookstores and can be ordered through commercial booksellers. In general, the sections that follow should be presented in the sequence specified.
Title Page: This page should include the full title of the manuscript, the full names of the author(s), including degrees and affiliations, and the complete address of the contact author. An electronic mail address is also recommended.
Abstract: On a separate page, provide a brief yet informative abstract that does not exceed one page. The abstract should state the purpose of the work along with any information pertinent to the manuscript category.
Key Words: Immediately following the abstract and on the same page, the author(s) should supply a list of key words for indexing purposes.
Tables: Each table included in the manuscript must be typed double-spaced on a separate page. Tables should be numbered consecutively, beginning with Table 1. Each table must have a descriptive caption and should serve to expand on the information provided in the text of the manuscript rather than to duplicate information contained in the text or in other tables.
Potential Conflicts of Interest and Dual Commitment
As part of the submission process, the author(s) must clearly declare the existence of any potential conflict of interest or dual commitment relative to the manuscript and its author(s). This declaration is required in order to inform the ROA that the author(s) may benefit from publication of the manuscript. Such benefits, whether direct or indirect, may be financial or nonfinancial in nature. Disclosure of potential conflicts of interest or dual commitment may be provided to editorial consultants when it is believed that such a conflict of interest or dual commitment could have influenced the information provided in the submission, or compromised the design, conduct, data collection or analysis, or the interpretation of the data obtained and reported in the manuscript submitted for review. If the manuscript is accepted for publication, the editors reserve the right to acknowledge the possible existence of such a conflict of interest or dual commitment.
Illustrations: All illustrations included as part of the manuscript must be included with each copy of the manuscript. While a single copy of the original artwork (photographs, x-ray films, etc.) is required, each manuscript must contain clear copies of all illustrations for the review process. For photographs, 5" x 7" glossy prints are preferred. High-quality laser prints are also acceptable. For other types of computerized illustrations, it is recommended that ROA production staff be consulted prior to preparation and submission of the manuscript and its associated figures and illustrations.
Legends for Illustrations: The legends accompanying each figure and illustration must be typed double-spaced on a separate sheet and identified with a number corresponding to the order in which the figures and illustrations appear in the manuscript.
Page Numbering and Running Head: Each page of the manuscript must be numbered, including tables, figures, illustrations, references, and, where applicable, appendices. A short (30 characters or less) descriptive running title should appear in the top right-hand margin of each page of the manuscript.
Acknowledgments: Acknowledgments must be typed double-spaced on a separate sheet. The author should acknowledge any form of sponsorship, donation, grant, or technical assistance, as well as any professional colleagues who contributed to the work but are not listed as authors.
References: References are listed consecutively in alphabetical order, then chronologically under each author's name. Authors should consult the APA manual (5th Edition) for the correct way to cite varied sources of information. Journal names should be spelled out in full and underlined. All literature, tests and assessment tools, and standards (ANSI and ISO) must appear in the reference list. References must be typed double-spaced.
Participants in Research: Humans and Animals
Each manuscript submitted to the ROA for peer review that is based on research conducted with human or animal participants must acknowledge appropriate ethical approval. Where humans or animals have been used for research purposes, a statement must be included indicating that the research was approved by a recognized review board or other appropriate ethical evaluation body or agency, along with the name and affiliation of the research ethics evaluation body and the approval number. The review process will not begin until this information is formally provided to the Editor.
As with research involving human participants, the ROA requires that any research conducted with animals be accompanied by a statement that the work has been evaluated and approved by the appropriate ethics authorities. This includes the name and affiliation of the research ethics evaluation body or agency and the corresponding approval number. A statement that all research animals were used and cared for in an established and ethically approved manner is also required. The review process will not begin until this information is formally provided to the Editor.