FULL TEXT - Canadian Centre for Knowledge Mobilisation
Volume 29, No. 4, Winter / Hiver 2005

Obtaining and Interpreting Maximum Performance Tasks from Children: A Tutorial
Applications of 2D and 3D Ultrasound Imaging in Speech-Language Pathology
Exploring the Use of Electropalatography and Ultrasound in Speech Habilitation

Published by the Canadian Association of Speech-Language Pathologists and Audiologists
Publiée par l'Association canadienne des orthophonistes et audiologistes

JOURNAL OF SPEECH-LANGUAGE PATHOLOGY AND AUDIOLOGY

Purpose and Scope

The Canadian Association of Speech-Language Pathologists and Audiologists (CASLPA) is the recognized national professional association of speech-language pathologists and audiologists in Canada. The association was founded in 1964, was incorporated under federal charter in 1975, and is committed to fostering the highest quality of service to communicatively impaired individuals and members of their families. It began its periodical publications program in 1973.

The purpose of the Journal of Speech-Language Pathology and Audiology (JSLPA) is to disseminate contemporary knowledge pertaining to normal human communication and related disorders of communication that influence speech, language, and hearing processes. The scope of the Journal is broadly defined so as to provide the most inclusive venue for work in human communication and its disorders. JSLPA publishes both applied and basic research, reports of clinical and laboratory inquiry, and educational articles related to normal and disordered speech, language, and hearing in all age groups.

Classes of manuscripts suitable for publication consideration in JSLPA include tutorials; traditional research or review articles; clinical, field, and brief reports; research notes; and letters to the editor (see Information for Contributors). JSLPA seeks to publish articles that reflect the broad range of interests in speech-language pathology and audiology, speech sciences, hearing science, and related professions. The Journal also publishes book reviews, as well as independent reviews of commercially available clinical materials and resources.

Indexing

JSLPA is indexed by:
• CINAHL - Cumulative Index to Nursing and Allied Health Literature
• CSA - Cambridge Scientific Abstracts Linguistics and Language Behavior Abstracts
• Elsevier Bibliographic Databases
• ERIC Clearinghouse on Disabilities and Gifted Education
• PsycInfo

Subscriptions/Advertising

Nonmember and institution subscriptions are available. For a subscription order form, including orders of individual issues, please contact: CASLPA, 200 Elgin Street, Suite 401, Ottawa, Ontario K2P 1L5. Tel.: (800) 259-8519, (613) 567-9968; Fax: (613) 567-2859; E-mail: [email protected]; Internet: www.caslpa.ca/english/resources/jslpasubscriptions.asp. All inquiries concerning the placement of advertisements in JSLPA should be directed to [email protected]. The contents of all material and advertisements that appear in JSLPA are not necessarily endorsed by the Canadian Association of Speech-Language Pathologists and Audiologists.

JSLPA Reviewers

Scott Adams, Joy Armson, Lisa Avery, Shari Baum, Paul Beaudin, Sandi Bojm, V.J. Boucher, Janine Boutelier, Tim Bressmann, David Brown, Melanie Campbell, Marshall Chasin, Margaret Cheesman, Lynne Clarke, Pat Cleave, Martha Crago, Claire Croteau, Lynn Dempsey, Luc DeNil, Marla Jean DeSousa, Christine Dollaghan, Philip Doyle, Christopher Dromey, Wendy Duke, Ahn Duong, Andrée Durieux-Smith, Tanya L. Eadie, Jos Eggermont, Diane Frome Loeb, Jean-Pierre Gagné, Robin Gaines, Linda Garcia, Bryan Gick, Luigi Girolametto, Paul Hagler, Joseph W. Hall, III, Elizabeth Haynes, Steve Heath, Lynne Hewitt, Megan Hodge, Bill Hodgetts, Tammy Hopper, Nancy Hubbard, Marc Joanisse, Jack Jung, Benoît Jutras, Aura Kagan, Joseph Kalinowski, Michael Kiefte, Robert Kroll, Deborah Kully, Guylaine Le Dorze, Jeff Lear, Christopher Lee, Carol Leonard, Tony Leroux, Janice Light, Rosemary Lubinski, Shelia MacDonald, Ian MacKay, Heather MacLean, Ruth Martin, Virginia Martin, Rosemary Martino, Rachel Mayberry, David McFarland, Lu-Anne McFarlane, Alison McVittie, Barbara Meissner Fishbein, Kathy Meyer, Linda Miller, Linda Milosky, Jerald Moon, Taslim Moosa, Robert Mullen, Kathleen Mullin, Kevin Munhall, Chris Murphy, Candace Myers, J.B. Orange, Marc Pell, Carole Peterson, Kathy Pichora-Fuller, Dennis Phillips, Michel Picard, Karen Pollock, Moneca Price, Barbara Purves, Elizabeth Kay-Raining Bird, Jana Rieger, Danielle Ripich, Elizabeth Rochon, Nelson Roy, Christine Santilli, Susan Scollie, Barb Shadden, Rosalee Shenker, Bernadette Ska, Elizabeth Skarakis-Doyle, Jeff Small, Ravi Sockalingham, David Stapells, Catriona Steele, Andrew Stuart, Anne Sutton, Stephen Tasko, Nancy Thomas-Stonell, Sharon Trehub, Natacha Trudeau, Anne van Kleeck, Ted Venema, Joanne Volden, Susan Wagner, Danya Walker, Linda Walsh, Jian Wang, Genese Warr Leeper, Penny Webster, Richard Welland, Lynne Williams, William Yovetich, Connie Zalmanowitz, Kim Zimmerman
Vol. 29, No. 4, Winter 2005

Editor: Phyllis Schneider, PhD, University of Alberta
Managing Editor/Layout: Judith Gallant
Manager of Communications: Angie Friend
Associate Editors:
Marilyn Kertoy, University of Western Ontario (Language, English submissions)
Tim Bressmann, University of Toronto (Speech, English submissions)
Rachel Caissie, Dalhousie University (Audiology, English submissions)
Patricia Roberts, PhD, University of Ottawa (Speech & Language, French submissions)
Tony Leroux, PhD, Université de Montréal (Audiology, French submissions)
Assistant Editor (Material & Resource Reviews): Vacant
Assistant Editor (Book Reviews): Vacant
Cover illustration: Andrew Young
Review of translation: Tony Leroux, PhD, Université de Montréal
Translation: Smartcom Inc.

ISSN 0848-1970. Canada Post Publications Mail # 40036109. JSLPA is published quarterly by the Canadian Association of Speech-Language Pathologists and Audiologists (CASLPA). Publications Agreement Number: # 40036109. Return undeliverable Canadian addresses to: CASLPA, 200 Elgin Street, Suite 401, Ottawa, Ontario K2P 1L5. Address changes should be sent to CASLPA by e-mail ([email protected]) or to the above-mentioned address.

REVUE D'ORTHOPHONIE ET D'AUDIOLOGIE

Objet et portée

L'Association canadienne des orthophonistes et audiologistes (ACOA) est l'association professionnelle nationale reconnue des orthophonistes et des audiologistes du Canada. L'Association a été fondée en 1964 et incorporée en vertu de la charte fédérale en 1975. L'Association s'engage à favoriser la meilleure qualité de services aux personnes atteintes de troubles de la communication et à leurs familles. Dans ce but, l'Association entend, entre autres, contribuer au corpus de connaissances dans le domaine des communications humaines et des troubles qui s'y rapportent. L'Association a mis sur pied son programme de publications en 1973.
L'objet de la Revue d'orthophonie et d'audiologie (ROA) est de diffuser des connaissances relatives à la communication humaine et aux troubles de la communication qui influencent la parole, le langage et l'audition. La portée de la Revue est plutôt générale de manière à offrir un véhicule des plus compréhensifs pour la recherche effectuée sur la communication humaine et les troubles qui s'y rapportent. La ROA publie à la fois les ouvrages de recherche appliquée et fondamentale, les comptes rendus de recherche clinique et en laboratoire, ainsi que des articles éducatifs portant sur la parole, le langage et l'audition normaux ou désordonnés pour tous les groupes d'âge.

Les catégories de manuscrits susceptibles d'être publiés dans la ROA comprennent les tutoriels, les articles de recherche conventionnelle ou de synthèse, les comptes rendus cliniques, pratiques et sommaires, les notes de recherche, et les courriers des lecteurs (voir Renseignements à l'intention des collaborateurs). La ROA cherche à publier des articles qui reflètent une vaste gamme d'intérêts en orthophonie et en audiologie, en sciences de la parole, en science de l'audition et en diverses professions connexes. La Revue publie également des critiques de livres ainsi que des critiques indépendantes de matériel et de ressources cliniques offerts commercialement.

Abonnements/Publicité

Les membres de l'ACOA reçoivent la Revue à ce titre. Les non-membres et les institutions peuvent s'abonner. Les demandes d'abonnement à la ROA ou de copies individuelles doivent être envoyées à : ACOA, 200, rue Elgin, bureau 401, Ottawa (Ontario) K2P 1L5. Tél. : (800) 259-8519, (613) 567-9968; Téléc. : (613) 567-2859; Courriel : [email protected]; Internet : www.caslpa.ca/francais/resources/jslpa-asp. Toutes les demandes visant à faire paraître de la publicité dans la ROA doivent être adressées au Bureau national.
Les articles, éditoriaux et publicités qui paraissent dans la ROA ne sont pas nécessairement avalisés par l'Association canadienne des orthophonistes et audiologistes.

Inscription au répertoire

La ROA est répertoriée dans :
• CINAHL - Cumulative Index to Nursing and Allied Health Literature
• CSA - Cambridge Scientific Abstracts Linguistics and Language Behavior Abstracts
• Elsevier Bibliographic Databases
• ERIC Clearinghouse on Disabilities and Gifted Education
• PsycInfo

Réviseurs de la ROA

Scott Adams, Joy Armson, Lisa Avery, Shari Baum, Paul Beaudin, Sandi Bojm, V.J. Boucher, Janine Boutelier, Tim Bressmann, David Brown, Melanie Campbell, Marshall Chasin, Margaret Cheesman, Lynne Clarke, Pat Cleave, Martha Crago, Claire Croteau, Lynn Dempsey, Luc DeNil, Marla Jean DeSousa, Christine Dollaghan, Philip Doyle, Christopher Dromey, Wendy Duke, Ahn Duong, Andrée Durieux-Smith, Tanya L. Eadie, Jos Eggermont, Diane Frome Loeb, Jean-Pierre Gagné, Robin Gaines, Linda Garcia, Bryan Gick, Luigi Girolametto, Paul Hagler, Joseph W. Hall, III, Elizabeth Haynes, Steve Heath, Lynne Hewitt, Megan Hodge, Bill Hodgetts, Tammy Hopper, Nancy Hubbard, Marc Joanisse, Jack Jung, Benoît Jutras, Aura Kagan, Joseph Kalinowski, Michael Kiefte, Robert Kroll, Deborah Kully, Guylaine Le Dorze, Jeff Lear, Christopher Lee, Carol Leonard, Tony Leroux, Janice Light, Rosemary Lubinski, Shelia MacDonald, Ian MacKay, Heather MacLean, Ruth Martin, Virginia Martin, Rosemary Martino, Rachel Mayberry, David McFarland, Lu-Anne McFarlane, Alison McVittie, Barbara Meissner Fishbein, Kathy Meyer, Linda Miller, Linda Milosky, Jerald Moon, Taslim Moosa, Robert Mullen, Kathleen Mullin, Kevin Munhall, Chris Murphy, Candace Myers, J.B. Orange, Marc Pell, Carole Peterson, Kathy Pichora-Fuller, Dennis Phillips, Michel Picard, Karen Pollock, Moneca Price, Barbara Purves, Elizabeth Kay-Raining Bird, Jana Rieger, Danielle Ripich, Elizabeth Rochon, Nelson Roy, Christine Santilli, Susan Scollie, Barb Shadden, Rosalee Shenker, Bernadette Ska, Elizabeth Skarakis-Doyle, Jeff Small, Ravi Sockalingham, David Stapells, Catriona Steele, Andrew Stuart, Anne Sutton, Stephen Tasko, Nancy Thomas-Stonell, Sharon Trehub, Natacha Trudeau, Anne van Kleeck, Ted Venema, Joanne Volden, Susan Wagner, Danya Walker, Linda Walsh, Jian Wang, Genese Warr Leeper, Penny Webster, Richard Welland, Lynne Williams, William Yovetich, Connie Zalmanowitz, Kim Zimmerman

Vol. 29, No 4, Hiver 2005

REVUE D'ORTHOPHONIE ET D'AUDIOLOGIE

Rédactrice en chef : Phyllis Schneider, Ph.D., University of Alberta
Directrice de la rédaction / mise en page : Judith Gallant
Directrice des communications : Angie Friend
Rédacteurs en chef adjoints :
Marilyn Kertoy, University of Western Ontario (Orthophonie, soumissions en anglais)
Tim Bressmann, University of Toronto (Orthophonie, soumissions en anglais)
Rachel Caissie, Dalhousie University (Audiologie, soumissions en anglais)
Patricia Roberts, Ph.D., Université d'Ottawa (Orthophonie, soumissions en français)
Tony Leroux, Ph.D., Université de Montréal (Audiologie, soumissions en français)
Rédacteur adjoint (Évaluation des ressources) : Libre
Rédacteur adjoint (Évaluation des ouvrages écrits) : Libre
Révision de la traduction : Tony Leroux, Ph.D., Université de Montréal
Illustration (couverture) : Andrew Young
Traduction : Smartcom Inc.

ISSN 0848-1970. Postes Canada, Envoi publications # 40036109. La ROA est publiée quatre fois l'an par l'Association canadienne des orthophonistes et audiologistes (ACOA). Numéro de publication : # 40036109. Faire parvenir tous les envois avec adresses canadiennes non reçus au 200, rue Elgin, bureau 401, Ottawa (Ontario) K2P 1L5.
Faire parvenir tout changement à l'ACOA au courriel [email protected] ou à l'adresse indiquée ci-dessus.

Table of Contents / Table des matières

Introduction: Winter Issue / Numéro de l'hiver (pp. 144-145)

Obtaining and Interpreting Maximum Performance Tasks from Children: A Tutorial / Obtenir et interpréter des durées maximales d'exécution chez des enfants : un tutoriel. Susan Rvachew, Megan Hodge and Alyssa Ohberg (p. 146)

Applications of 2D and 3D Ultrasound Imaging in Speech-Language Pathology / Utilisation de l'échographie en 2D et 3D en orthophonie. Tim Bressmann, Chiang-Le Heng and Jonathan C. Irish (p. 158)

Exploring the Use of Electropalatography and Ultrasound in Speech Habilitation / Explorer l'électropalatographie et l'échographie pour l'éducation de la parole. Barbara Bernhardt, Penelope Bacsfalvi, Bryan Gick, Bosko Radanov and Rhea Williams (p. 169)

Resource Review / Évaluation des ressources (p. 183)

Information for Contributors (p. 185) / Renseignements à l'intention des collaborateurs (p. 187)

Introduction

Barbara Bernhardt, Ph.D.
School of Audiology and Speech Sciences, University of British Columbia, Vancouver, BC

Technological innovations are providing new opportunities for speech assessment and (re)habilitation. The papers in this issue present up-to-date tutorial information on technological innovations in Canada, with some preliminary findings on the use of ultrasound, electropalatography and the TOCS+™ MPT Recorder© ver. 1 (Hodge & Daniels, 2004) in speech assessment and/or (re)habilitation.
Rvachew, Hodge and Ohberg provide information on the use and evaluation of ‘maximum performance tasks’ in the assessment of motor speech impairments in children. Such tasks involve prolongation of speech sounds and repetition of syllables, and have been used to assist in differential diagnosis of speech dyspraxia or dysarthria in children (e.g., Thoonen et al., 1996; Williams & Stackhouse, 2000). The authors describe how TOCS+™ MPT Recorder© ver. 1 (Hodge & Daniels, 2004) is used along with a waveform editor to facilitate reliable administration and recording of children’s responses and accurate measurement of maximum durations and repetition rates. The software eliminates significant impediments to the use of the tasks with children. The authors suggest that the application of these procedures can result in reliable and valid data from younger children and become a routine part of the speech-language assessment protocol for children with suspected or confirmed speech impairments and delays. The other two papers in the issue describe visual display technology for articulatory movements. Bressmann, Heng and Irish provide a tutorial on speech imaging of the tongue with two-dimensional and three-dimensional ultrasound. They describe a number of different applications that they envisage for ultrasound imaging in speech-language pathology and demonstrate research findings from different projects in speech and swallowing that have been undertaken in the Voice and Resonance Laboratory at the University of Toronto. For example, for patients with glossectomy, three-dimensional tongue imaging has been used to quantify shapes of the tongue during the production of speech sounds pre- and postoperatively, and to evaluate the effect of different reconstructive techniques on the deformation and symmetry of the tongue tissue. The authors make a convincing argument for the usefulness, safety and cost-effectiveness of ultrasound. 
At the University of British Columbia's Interdisciplinary Speech Research Laboratory, Bernhardt and colleagues have explored both two-dimensional ultrasound and electropalatography (EPG) as articulatory feedback tools in speech habilitation. EPG and ultrasound give different types of dynamic information about the tongue during speech production. EPG shows tongue-palate contact patterns from the tongue tip to the back of the hard palate for mid and high vowels and lingual consonants. Ultrasound images tongue shape, location and configuration for all vowels and lingual consonants. In their article, Bernhardt, Bacsfalvi, Gick, Radanov and Williams discuss the relative merits of these techniques and present data from two preliminary treatment studies using one or both of the techniques. Further research will clarify the relative benefits of the two tools in speech habilitation and their merit in comparison with other methods across a variety of speakers. Currently, the technologies described are available primarily at or near universities. However, the procedures discussed by Rvachew and colleagues can be adapted for use elsewhere. Portable ultrasound machines exist, and in British Columbia, Bernhardt and colleagues have a rural consultancy project underway to evaluate the consultative use of ultrasound in speech habilitation. A clinical program called Cleftnet, which could be a model for regions of Canada, is being set up in Britain. Cleftnet will provide electropalatographs to cleft palate clinics and link those clinics with a university for data analysis and treatment recommendations. A charitable organization has provided the funding for this innovative service delivery program. Such a plan is currently germinating in British Columbia. By bundling these papers together in one volume, we hope that readers will see potential for future clinical applications and will be encouraged to gain access to new technologies for speech assessment and intervention.
Such technologies enhance our understanding of speech (and swallowing) and can maximize the potential for good outcomes for our clients.

References

Hodge, M. M., & Daniels, J. D. (2004). TOCS+™ MPT Recorder© ver. 1 [Computer software]. University of Alberta, Edmonton, AB.

Thoonen, G., Maassen, B., Wit, J., Gabreels, F., & Schreuder, R. (1996). The integrated use of maximum performance tasks in differential diagnostic evaluations among children with motor speech disorders. Clinical Linguistics & Phonetics, 10, 311-336.

Williams, P., & Stackhouse, J. (2000). Rate, accuracy and consistency: Diadochokinetic performance of young, normally developing children. Clinical Linguistics & Phonetics, 14, 267-293.

Introduction

Barbara Bernhardt, Ph.D.
School of Audiology and Speech Sciences, University of British Columbia, Vancouver, BC

Les innovations technologiques offrent de nouvelles possibilités pour l'évaluation et la réadaptation de la parole. Les articles du présent numéro donnent des renseignements à jour sur les progrès technologiques au Canada et présentent des résultats préliminaires sur l'utilisation de l'échographie, de l'électropalatographie et du logiciel TOCS+™ MPT Recorder© ver. 1 (Hodge et Daniels, 2004) pour l'évaluation et la réadaptation de la parole.

Rvachew, Hodge et Ohberg abordent l'utilisation et l'évaluation de la mesure maximale d'exécution de tâches pour les troubles moteurs de la parole chez les enfants. Ces tâches demandent aux enfants de prolonger des sons de la parole et de répéter des syllabes. Elles ont été utilisées pour établir des diagnostics différentiels de la dyspraxie et de la dysarthrie chez les enfants (p. ex. : Thoonen et coll., 1996; Williams et Stackhouse, 2000). Les auteurs décrivent comment le logiciel TOCS+™ MPT Recorder© ver.
1 (Hodge et Daniels, 2004) est utilisé en concomitance avec un éditeur d'ondes sonores pour faciliter l'administration d'un test, l'enregistrement des réponses de l'enfant et la mesure exacte de la durée maximale et de la fréquence de répétition. Ce logiciel élimine les grands obstacles qui nuisent à l'utilisation de ces tâches avec des enfants. Les auteurs avancent que le recours à ces procédures peut mener à des données fiables et valides pour les jeunes enfants et pourrait même faire partie du protocole d'évaluation en orthophonie des enfants chez qui l'on soupçonne ou l'on a diagnostiqué un trouble ou un retard de la parole. Les deux autres articles de ce numéro décrivent une technologie de visualisation des mouvements articulatoires. Bressmann, Heng et Irish présentent un tutoriel sur l'imagerie des mouvements de la langue grâce à l'échographie en deux et trois dimensions. Ils décrivent un certain nombre d'utilisations différentes de cette technique en orthophonie et présentent des résultats de recherche issus de divers projets sur la parole et la déglutition menés par le Voice and Resonance Laboratory à l'University of Toronto. Par exemple, pour des patients ayant subi une glossectomie, l'imagerie en trois dimensions de la langue a servi à quantifier les formes que prend l'organe durant la production de sons avant et après l'ablation et à évaluer l'effet de différentes techniques de reconstruction sur la déformation et la symétrie du tissu de la langue. Les auteurs présentent des arguments convaincants sur l'utilité de l'échographie, sa sécurité et son caractère économique. À l'Interdisciplinary Speech Research Laboratory de l'University of British Columbia, Bernhardt et ses collègues ont exploré l'échographie à deux dimensions et l'électropalatographie comme outils de rétroaction visuelle de l'articulation. Ces deux techniques procurent des types différents d'information dynamique à propos de la langue durant la parole.
L'électropalatographie montre les modèles de contact entre la langue et le palais depuis le bout de la langue jusqu'au palais dur pour les voyelles mi-fermées et fermées et les consonnes linguales. L'échographie permet de prendre une image de la forme de la langue, de son emplacement et de sa configuration pour toutes les voyelles et les consonnes linguales. Dans leur article, Bernhardt, Bacsfalvi, Gick, Radanov et Williams abordent les mérites relatifs de ces techniques et présentent des données de deux études préliminaires sur le traitement à partir de l'une ou de ces deux méthodes. D'autres recherches permettront de mieux définir les avantages relatifs de ces deux outils pour la rééducation de la parole et de comparer leurs mérites par rapport à d'autres méthodes employées pour une variété de locuteurs. Actuellement, les techniques décrites sont offertes principalement en milieu universitaire ou à proximité. Cependant, il est possible d'adapter les démarches décrites par Rvachew et ses collègues pour qu'elles soient utilisées ailleurs. Il existe des appareils à échographie portables, et Bernhardt et ses collègues mènent actuellement un projet d'étude en Colombie-Britannique pour évaluer les avantages de l'échographie lors d'une consultation sur la rééducation de la parole. La Grande-Bretagne met actuellement en œuvre un programme clinique baptisé Cleftnet, qui pourrait servir de modèle pour les régions du Canada. Cleftnet permettra de faire des électropalatographies pour des cliniques sur la fissure palatine et de mettre ces cliniques en rapport avec une université pour l'analyse des données et l'établissement des recommandations de traitement. Un organisme caritatif a fourni les fonds nécessaires au financement de ce programme novateur de prestation de services. Un tel plan est en cours d'élaboration en Colombie-Britannique.
En réunissant tous ces articles dans un même numéro, nous espérons que les lecteurs y trouveront des applications cliniques et une motivation pour obtenir l'accès à ces nouvelles technologies d'évaluation de la parole et d'intervention. Ces techniques améliorent notre compréhension de la parole (et de la déglutition) et peuvent nous permettre d'aller chercher les meilleurs résultats pour nos clients.

Références

Hodge, M. M., et J. D. Daniels (2004). TOCS+™ MPT Recorder© ver. 1 [logiciel informatique]. University of Alberta, Edmonton (Alb.).

Thoonen, G., B. Maassen, J. Wit, F. Gabreels et R. Schreuder (1996). The integrated use of maximum performance tasks in differential diagnostic evaluations among children with motor speech disorders. Clinical Linguistics & Phonetics, 10, p. 311-336.

Williams, P., et J. Stackhouse (2000). Rate, accuracy and consistency: Diadochokinetic performance of young, normally developing children. Clinical Linguistics & Phonetics, 14, p. 267-293.

Obtaining and Interpreting Maximum Performance Tasks from Children: A Tutorial
Obtenir et interpréter des durées maximales d'exécution chez des enfants : un tutoriel

Susan Rvachew, Megan Hodge and Alyssa Ohberg

Abstract

The diagnosis of motor speech disorders in children can be aided by the use and interpretation of measures of maximum performance tasks. These tasks include measuring how long a vowel can be sustained or how fast syllables can be repeated. This tutorial provides a rationale for including these measures in assessment protocols for children with speech sound disorders. Software developed to motivate children to cooperate with these procedures and to expedite recording of sound prolongations and syllable repetitions is described.
Procedures for obtaining maximum performance measures from digital sound file recordings are illustrated, followed by a discussion of how these measures may aid in clinical diagnosis.

Abrégé

Le diagnostic d'un trouble moteur de la parole chez un enfant peut être facilité par l'utilisation et l'interprétation de tâches de durée maximale d'exécution. Ces tâches comprennent la mesure de la durée vocalique et de la rapidité de répétition des syllabes. Le présent tutoriel explique les raisons pour inclure ces tâches dans les protocoles d'évaluation pour les enfants atteints d'un trouble de parole. Le logiciel élaboré pour motiver ces derniers à collaborer lors de ces procédures et pour accélérer l'enregistrement du prolongement sonore et des répétitions de syllabes y est décrit. Les démarches pour obtenir des durées maximales d'exécution à partir d'un fichier sonore numérique y sont illustrées et sont suivies par une discussion sur la façon dont ces mesures peuvent aider à poser un diagnostic.

Key Words: speech sound disorders, motor speech disorders, assessment, maximum performance tasks

Susan Rvachew, Ph.D., S-LP(C), McGill University, Montreal, QC, Canada
Megan Hodge, University of Alberta, Edmonton, AB, Canada
Alyssa Ohberg, McGill University, Montreal, QC, Canada

Children with speech sound disorders form a heterogeneous group from a number of perspectives, including underlying etiological factors, the developmental course of the disorder, and the nature of the overt speech errors that are present at a given point in time (Shriberg, 1997). Most frequently the speech sound disorder is of unknown origin and has no obvious motoric basis, a subtype that will be referred to here as developmental phonological disorder. This subtype has also been referred to as speech sound disorder of unknown origin, nonspecific speech delay, functional articulation disorder, or functional phonological disorder in the literature cited in the following sections.
Other children's speech sound errors can be linked to motoric factors, with or without a known primary cause. Childhood apraxia of speech (also referred to as speech dyspraxia) is identified by a number of inclusionary characteristics, including difficulties with sequencing articulatory movements, phonemes, and syllables; trial-and-error groping behaviours; and unusual and inconsistent error patterns for both consonants and vowels. Dysarthria may also be observed in children and manifests itself as more consistent error patterns resulting from slow and imprecise movements associated with an abnormal sensorimotor profile that typically includes weakness and tone abnormalities of the affected speech muscle groups. One purpose of a speech-language assessment is to determine the extent to which motoric factors contribute to a child's difficulties with the acquisition of the sound system of the native language. Knowledge about whether or not the child's speech disorder has a motor component will help the clinician to choose the most appropriate treatment approach. Accurate diagnosis may also have ramifications for the child's access to treatment services, because both public and private funders often favour the provision of services to children with an identifiable medical impairment. Measures of maximum performance tasks (MPTs), such as how long a vowel can be sustained (maximum phonation duration; MPD) or how fast syllables can be repeated (maximum repetition rate; MRR), are well-established procedures used by speech-language pathologists when assessing older children and adults (Duffy, 1995; Kent, Kent, & Rosenbek, 1987).
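For readers unfamiliar with how such measures are extracted from a digital recording, the two quantities can be sketched in a few lines of code. This is purely an illustrative example, not the TOCS+™ software or any published algorithm: the function names, the fixed amplitude threshold, and the synthetic envelope are all hypothetical simplifications of what a clinician would do by inspecting the waveform in a sound editor.

```python
def phonation_duration(envelope, fs, threshold=0.1):
    # Longest continuous run of envelope samples above the threshold,
    # converted to seconds; a stand-in for measuring a sustained vowel (MPD).
    best = run = 0
    for x in envelope:
        run = run + 1 if x > threshold else 0
        best = max(best, run)
    return best / fs

def repetition_rate(envelope, fs, threshold=0.1):
    # Syllables per second, counting each rising threshold crossing
    # as one syllable onset; a stand-in for MRR measurement.
    onsets = 0
    above = False
    for x in envelope:
        if x > threshold and not above:
            onsets += 1
        above = x > threshold
    return onsets / (len(envelope) / fs)

# Synthetic amplitude envelope sampled at 100 Hz:
# four 0.6-second "syllable" bursts spread over 4 seconds.
fs = 100
env = [0.0] * 400
for start in (0, 100, 200, 300):
    for i in range(start, start + 60):
        env[i] = 1.0

print(phonation_duration(env, fs))  # 0.6 (longest sustained stretch, in s)
print(repetition_rate(env, fs))     # 1.0 (4 onsets over 4 s)
```

In practice the envelope would come from a recorded sound file and the threshold would be set relative to the noise floor; the point here is only that both measures reduce to counting samples above an amplitude criterion.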
More recently, Thoonen and colleagues (Thoonen, Maassen, Gabreels, & Schreuder, 1999; Thoonen, Maassen, Wit, Gabreels, & Schreuder, 1996) described the application of MPTs to assist clinicians in diagnosing the presence and nature of motor speech impairment in younger children (ages 6 to 10 years). Published protocols for identifying and describing oral and speech praxis characteristics of children also include maximum syllable repetition rate measures as part of a battery of nonspeech and speech performance measures (e.g., Hickman, 1997). The classification system developed by Thoonen et al. is particularly appealing because it offers clinicians a systematic framework for integrating and interpreting measures from MPTs to assist in differential diagnosis of childhood speech disorders. As with all assessment procedures, the ease and reliability with which measures of MPTs can be obtained, and their validity and usefulness in differential diagnosis, are key determinants of whether they will be adopted in clinical practice. This tutorial provides a rationale for including these measures in assessment protocols for young children with speech sound disorders. It summarizes the tasks and classification procedure developed by Thoonen et al. and how the measures obtained are interpreted to ascertain the presence and nature of motor speech impairment. Software that expedites recording of the MPTs recommended by Thoonen et al. is described, and procedures for obtaining MPT measures from digital sound file recordings are illustrated for readers who may be unfamiliar with computer-assisted measurement.

Rationale

Accurate identification of speech motor limitations can be difficult, especially in the case of children who do not present with an obvious primary impairment such as cerebral palsy or traumatic brain injury.
Campbell (2003) reported that second-opinion assessments conducted at the Children's Hospital of Pittsburgh confirmed a prior diagnosis of childhood apraxia of speech (CAS) in only 17 percent of cases, suggesting a significant over-diagnosis of CAS among children with a severe and persistent speech sound disorder. On the other hand, Gibbon (1999) has suggested that a more subtle form of motoric involvement, termed 'undifferentiated lingual gestures', is frequently under-diagnosed among children who present with errors that appear to be phonological on the basis of perceptual analyses (in particular, velar fronting and/or backing and fricative gliding and/or stopping). Certain phonetic errors, such as a lateral lisp, may also reflect an inability to independently control the lateral margins of the tongue. Under-identification of motor speech limitations may harm individual clients if it prevents them from accessing services to which they are entitled or receiving the most appropriate form of treatment. Over-identification also has far-reaching implications, since threats to the credibility of our profession will have a negative impact on the funding of speech therapy services. One reason for misdiagnosis may be an over-reliance on diagnostic checklists as a means of identifying motor speech disorders (Shriberg, Campbell, Karlsson, Brown, McSweeny, & Nadler, 2003). These lists have a kind of face validity because they describe the overt characteristics of the child's speech. Unfortunately, they lack specificity because they fail to distinguish between fundamental characteristics of a motor speech disorder and the consequences of such a disorder. The linguistic consequences of dysarthria or dyspraxia are not clearly distinguishable from the linguistic consequences of a developmental phonological disorder.
Unintelligibility and persistence of the speech problem are not specific to motor speech disorders, and systematic error patterns are not specific to developmental phonological delay. Shriberg, Aram, and Kwiatkowski (1997) demonstrated that CAS could not be differentiated from a developmental phonological disorder on the basis of structural or phonological characteristics of the child's conversational speech (i.e., phonetic repertoire, syllable structure repertoire, percentage of consonants correct, intelligibility index, or phonological processes).

Maximum Performance Tasks

A more promising approach is to administer Maximum Performance Tasks (MPTs) to children. Thoonen, Maassen, Wit, Gabreels, and Schreuder (1996) explained that "although [MPTs] assess abilities that differ from normal speech production…, they provide information on motor speech abilities underlying dysarthria and [CAS] (e.g., articulatory coordination, breath control, speaking rate, speech fluency, articulatory accuracy and temporal variability)" (p. 312). These researchers demonstrated how Maximum Phonation Duration (MPD) and Maximum Repetition Rate (MRR) can be used to differentiate groups of children with spastic dysarthria, CAS, developmental phonological disorder, or normally developing speech. Their criteria for classification were derived from the responses of children aged 6 to 10 years, some with normally developing speech and some with clinically diagnosed dyspraxia or dysarthria. Briefly, children with dysarthria were found to produce short phonation durations and slow monosyllabic repetition rates; children with dyspraxia produced slow trisyllabic repetition rates and short fricative durations. Later, these criteria were cross-validated with new samples of school-aged children, this time including a sample of children with a developmental phonological disorder with no motoric component.
It was shown that these tasks could be used to identify dysarthria with 89% sensitivity and 100% specificity. In other words, 89% of the children with clinically diagnosed dysarthria were identified as dysarthric on the basis of their responses on the MPTs (sensitivity). Furthermore, none of the children who were not dysarthric by clinical criteria were falsely identified as dysarthric on the basis of their responses to the MPTs (specificity). Dyspraxia was identified from MPT responses with 100% sensitivity and 91% specificity. Overall, diagnostic accuracy was excellent, with 95% correct classification of 41 children as presenting with normally developing speech, developmental phonological delay, childhood apraxia of speech, or dysarthria. Of particular interest was the finding that children with a developmental phonological disorder performed these tasks in a qualitatively and quantitatively different manner from children with dysarthria or dyspraxia. Children with dyspraxia were often unable to produce a correct trisyllabic sequence. Children with a developmental phonological disorder were usually able to produce the sequence accurately but only after an unusual number of unsuccessful attempts. Overall, their performance on these tasks was intermediate between the control group and the dysarthric and dyspraxic groups. Kent, Kent, and Rosenbek (1987) described some of the difficulties inherent in the clinical application and interpretation of MPTs, which may explain why these techniques are not routinely applied, especially with young children. A primary issue with interpretation of MPT performance is the availability of good quality normative data. Kent et al. reviewed a number of studies that provided normative data for school-age children and young adults but noted that there was a lack of normative data for younger children and older adults.
Subsequently, however, Robbins and Klee (1987) described the MRR and MPD performance of children aged 2;6 through 6;11 (with a sample of 10 children at each 6-month age interval). Williams and Stackhouse (2000) reported additional data regarding repetition performance for 3-, 4-, and 5-year-old children. Reliability of the measures obtained from the child's performance of each task presents another challenge. Stability of the results across repeated trials can be poor. Individual performance is affected by the task instructions and the motivation of the child. Kent et al. (1987) suggested that standardized instructions and procedures would help reduce variability within and across children. In this report we describe a software tool that presents a standard protocol for clinicians to follow when administering MPTs to young children and recording their productions. Experience with the software indicates that it increases children's motivation to comply with the protocol. To date, all of the preschool-aged children that we have tested with this tool have provided a complete set of responses for each of the maximum performance tasks. Unstable performance levels across trials also lead to questions about the validity of these measures as implemented in a clinical setting. Kent et al. (1987) reported that it can take as many as 15 trials before a stable response is achieved, particularly when attempting to obtain maximum phonation duration. However, Potter, Kent, and Lazarus (2004) reported that in their investigation of typical performance on repetition tasks, the first attempt was most frequently the fastest and most accurate. Over 90% of the children who attempted and could perform the task gave their best performance within the first three trials. This is an encouraging finding because our experience has been that it is impractical to attempt more than three trials with a young child.
Although instability across repeated trials is a potential threat to the validity of MPTs, Kent et al. concluded that "nonetheless, the test may still have clinical utility as a screening procedure if it is recognized that the object is to determine if the client can reach some minimal standard" (p. 369). This is the approach taken by Thoonen et al. (1999). They established threshold values for Maximum Phonation Duration and Maximum Repetition Rates that can be used to diagnose dyspraxia or dysarthria in children aged 6 through 10 years. Finally, some of the variability in results that is observed may result from the difficulty of obtaining an accurate measurement of MPD and MRR when administering the tasks 'live' with the use of a stop-watch. Kent et al. (1987) and Thoonen et al. (1996, 1999) recommended that responses be recorded and that measurements be made from the acoustic waveform whenever possible to obtain more precise results. The software described in this report makes it easy for the clinician to record the child's responses and retrieve them for measurement. Durations and repetition rates can then be accurately measured from these recordings using any available waveform editor. Procedures for measuring MPD and MRR from a waveform display are demonstrated in a later section.

A Protocol for Obtaining MPTs from Children

The protocol for obtaining MPTs described here was developed by Thoonen et al. (1996, 1999). This procedure involves the administration of nine tasks as follows: prolongation of [a] and [mama] to yield a maximum phonation duration (MPD), prolongation of [f], [s], and [z] to yield a maximum fricative duration (MFD), repetition of the single syllables [pa], [ta], and [ka] to yield a maximum repetition rate-monosyllabic (MRRmono), and repetition of the syllable sequence [pataka] to yield a maximum repetition rate-trisyllabic (MRRtri).
Two additional outcome measures are derived from the child's performance during the trisyllabic repetition task, specifically a score indicating whether the child achieved a correct trisyllabic sequence (Seq) and the number of attempts beyond the standard three trials required for the child to achieve a correct sequencing of [pataka] (Attempts). The instructions for administering these items and then combining results across the nine tasks to yield the six outcome measures are shown in Table 1.

Table 1. Instructions for administration of the maximum performance tasks, including Maximum Phonation Duration (MPD), Maximum Fricative Duration (MFD), Maximum Repetition Rate for Single Syllables (MRRmono), and Maximum Repetition Rate for Trisyllabic Sequences (MRRtri), adapted from Thoonen et al. (1996)

Maximum Phonation Duration (MPD)

[a]: 1. Produce a prolonged [a] for approximately 2 seconds on one breath in a monotone manner with normal pitch. Ask the child to imitate your model. Repeat if necessary until the child is successful in imitating your model. 2. As above, except model a prolongation of [a] for 4 to 5 seconds and then ask the child to imitate your model. 3. Ask the child to say [a] for as long as possible on one breath (with no model provided in this case). Repeat the instruction two more times, providing the child with a total of three opportunities to prolong [a] for as long as possible.

[mama]: Repeat steps 1, 2, and 3 above except that in this case, model a repetition of the syllables [mama...]. Again at step 3, give the child three opportunities to produce [mama...] for as long as possible on a single breath.

MPD: MPD is the mean of the longest prolongation of [a] and the longest prolongation of [mama...].
Maximum Fricative Duration (MFD)

[f]: Repeat steps 1, 2, and 3 as described for MPD, in this case modelling a prolonged production of [f]. Again at step 3, give the child three opportunities to prolong [f] for as long as possible on a single breath.

[s]: Repeat steps 1, 2, and 3 as described above, in this case modelling a prolonged production of [s]. Again at step 3, give the child three opportunities to prolong [s] for as long as possible on a single breath.

[z]: Repeat steps 1, 2, and 3 as described above, in this case modelling a prolonged production of [z]. Again at step 3, give the child three opportunities to prolong [z] for as long as possible on a single breath.

MFD: MFD is the mean of the longest prolongation of [f], the longest prolongation of [s], and the longest prolongation of [z].

Maximum Repetition Rate - Monosyllabic (MRRmono)

[pa]: 1. Ask the child to say [pa], and then [papapa], and then [papapapapa]. 2. Model the repetition of approximately 12 [pa] syllables on a single breath at a rate of about four syllables per second and ask the child to imitate your model. 3. Ask the child to repeat step 2 but this time as fast as possible. Stop recording when the child has produced 12 or more syllables. Provide the child with two additional opportunities to maximize the repetition rate.

[ta]: Repeat steps 1, 2, and 3 as described above, in this case modelling repetition of the syllable [ta]. Again at step 3, give the child three opportunities to produce [ta] as fast as possible on a single breath.

[ka]: Repeat steps 1, 2, and 3 as described above, in this case modelling repetition of the syllable [ka]. Again at step 3, give the child three opportunities to produce [ka] as fast as possible on a single breath.

MRRmono: For each trial the repetition rate is calculated as the number of syllables produced per second. MRRmono is the mean repetition rate for the fastest repetition of [pa], the fastest repetition of [ta], and the fastest repetition of [ka].
Table 1 (continued)

Maximum Repetition Rate - Trisyllabic (MRRtri)

[pataka]: 1. Ask the child to say [pataka] at a slow rate. Practice this syllable sequence, breaking it down into its component parts if necessary, until the child can produce a single correct sequence. 2. Produce the sequence twice [patakapataka] fluently and at a slow rate and ask the child to imitate. 3. Produce the sequence three times at a normal speaking rate and ask the child to imitate. 4. Produce the sequence four times at a rate of about four syllables per second and ask the child to imitate. 5. Model a repetition of the sequence, five times and as fast as possible. Ask the child to produce the sequence as fast as possible for as long as possible on a single breath. Give the child two additional trials to perform this task. If the child cannot produce the sequence accurately, repeat the steps and allow three additional attempts to produce a correct sequence as fast as possible and as long as possible on a single breath.

MRRtri: MRRtri is the number of syllables per second produced during the child's fastest attempt at repeating this sequence. The sequence must be produced correctly over 5 repetitions on a trial for it to be used to calculate the MRRtri.

Sequence: Score 1 if the child produces a correct repetition of the sequence. Score 0 if the child does not succeed in producing a correct sequence.
Attempts: This score is the number of additional attempts (beyond the first three) that are required for the child to achieve a correct repetition of the sequence.

Generic free or inexpensive software programs are available to record sound and display the waveforms of the recordings (e.g., GoldWave, Goldwave, Inc., 2005; PRAAT, Boersma & Weenink, 2005). Software packages are also available that count syllable peaks and perform an automatic count (e.g., Motor Speech Profile, KayPentax). The TOCS+™ MPT Recorder© ver. 1 (Hodge & Daniels, 2004) is freeware that was developed specifically to facilitate administration and measurement of MPTs with children, following the protocol of Thoonen et al. (1996). It turns any personal computer running Windows 98 or later into a digital audio recorder with a sampling rate of 48 kHz and a quantization size of 16 bits. An inexpensive computer microphone is adequate for the durational measures to be obtained from the recordings of the child's responses to the MPTs. A head-mounted microphone is preferable if the child will tolerate this, but a table microphone is a second option. The software sets a standard recording level at start-up that can be checked and modified within the software before administering the MPT protocol. It guides the user through administration of the MPD, MFD, and MRR tasks in succession. At the beginning of each task type (MPD, MFD, MRRmono, and MRRtri) a screen with instructions similar to those summarized in Table 1 is displayed to cue the examiner so that the same instructions are given each time. This is followed by successive screens for a practice trial followed by the required number of test trials for each MPT listed in Table 1. For each of these trials, a short tone and a small icon appear on the screen to signal to the child that it is time to start the task (see Figure 1).
This ensures onset synchronization of the child's response and recording, reduces the likelihood of overlapping examiner and child speech, and avoids false starts and unnecessary repeat trials. Recordings of each trial are saved as a .wav file that is named by task and trial number and stored in the child's folder.

Figure 1: Instruction screen with visual prompt to the child to begin the practice trial for the first maximum performance task, from the TOCS+ MPT Recorder Version 1 (altered to appear in black and white).

Measurement of MPD and MRR

Digital recordings of MPTs obtained using the TOCS+ MPT Recorder (or other software with recording capabilities) can be displayed as a waveform by a variety of software packages, such as those cited previously. In the examples that follow, Time-Frequency-Response version 2.1 (TFR; AVAAZ Innovations, Inc., 1999) was used to demonstrate the measurement of durations and repetition rates. The basic procedure is the same regardless of the specific software used to display the waveforms and measure the durations. Measurement of MPD is the most straightforward. After loading the sound file into a waveform display window, visual inspection of the waveform and the partial playback feature of the software help to identify the portion of the waveform that represents the production of the [a]. For example, in the waveform shown in Panel A of Figure 2, the prolonged [a] is preceded by some examiner speech and the client's inhalation, and there is a second inhalation that follows the [a] production.

Figure 2. Panel A shows the waveform of the recording of a prolonged 'ah' [a] marked by a bracket and surrounded by extraneous information in the file. Panel B shows the prolonged 'ah' cut from the file shown in Panel A so that the extraneous information is removed. The duration of the file in milliseconds is indicated with an arrow.

Waveform editors provide a 'click and drag' function for marking off the specific waveform of interest, in this case the waveform that is marked with a bracket. In Panel B of Figure 2 the duration of the [a] is shown as 12,956.92 ms which, when divided by 1000, yields approximately 12.96 seconds. The procedure for measuring the duration of [mama], [f], [s], and [z] is the same as that shown here for [a]. Measurement of MRRmono is accomplished by loading the sound file into the waveform display window and marking off 10 consecutive repetitions of the syllable, as shown in Figure 3. As described in Table 1, all 10 syllables should be produced on a single breath. These 10 syllables should not include the first syllable after an inspiration or the last syllable before an inspiration. In Panel B of Figure 3 the selected 10 syllables are isolated from the rest of the file. The total duration of the selected portion is shown as approximately 1835 ms. When using Thoonen et al.'s protocol for interpreting the results it is necessary to calculate the number of syllables produced per second. This value is obtained by converting the time value to seconds and dividing the 10 repetitions by the total time in seconds, yielding 10/1.835 = 5.45 syllables per second in this case. The procedure for determining MRRtri is the same as that for determining MRRmono except that 4 consecutive repetitions of the sequence [pataka] (i.e., 12 syllables) are marked off. The number of syllables per second is calculated as described previously for MRRmono. For the example shown in Figure 4, the total time taken to produce 4 repetitions of the sequence [pataka] was 1580 ms. This results in a rate of 7.59 syllables per second (12 syllables/1.58 seconds).
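The conversions just described are simple arithmetic. As a minimal sketch (in Python; the function names are ours and do not belong to any of the software packages cited), the duration and rate calculations can be expressed as:

```python
def duration_s(duration_ms):
    """Convert a duration measured from the waveform display (ms) to seconds."""
    return duration_ms / 1000.0

def repetition_rate(n_syllables, duration_ms):
    """Syllables per second for a marked-off stretch of n_syllables."""
    return n_syllables / duration_s(duration_ms)

# Worked examples from the text:
print(round(duration_s(12956.92), 2))       # prolonged [a]: 12.96 s
print(round(repetition_rate(10, 1835), 2))  # MRRmono: 10/1.835 = 5.45 syll/s
print(round(repetition_rate(12, 1580), 2))  # MRRtri: 12/1.58 = 7.59 syll/s
```

The same helper serves both protocols discussed below, since it leaves the choice of counting unit (syllables or whole sequences) to the examiner.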
Alternative Calculation Procedures

The procedures described in the previous section for measuring MRRmono and MRRtri are specific to the Thoonen et al. (1999) protocol. The way in which repetition rates are calculated and represented depends upon the norms that will be used to interpret the child's performance. Some norms for single-syllable repetition rates are presented as the time taken to produce a specified number of repetitions (e.g., Fletcher, 1972). When using Fletcher's time-by-count norms the examiner simply marks the required number of repetitions and notes the time taken to produce those repetitions. Some norms for the interpretation of trisyllable repetition rates, such as those published by Robbins and Klee (1987), are based on the number of repetitions of the entire sequence (e.g., in the example in Figure 4, four repetitions of the sequence in 1.58 seconds yields a rate of 2.53 repetitions of [pataka] per second). An important point about the measurement of MRRtri described by Thoonen et al. (1999) is that it requires repetition of correctly articulated sequences. Some younger children may be unable to correctly articulate the [ka] phoneme, in which case they might repeat [patata], a response that should not be scored using Thoonen et al.'s procedure. Williams and Stackhouse (2000) reported repetition performance for three-syllable words and nonsense words in which accuracy, rate, and consistency measures were derived independently. Therefore, their paper provides a normative reference for the repetition rate, regardless of accuracy, for 3- to 5-year-old children. They found that even 3-year-olds produced repetition rates no slower than three syllables per second. They suggested that the ability to repeat a consistent [patata] sequence at a rate of at least three syllables per second would not be reason for concern with this age group. However, inconsistent and inaccurate repetitions of the sequence would be cause for concern.

Figure 3. Panel A shows the client's repetitions of the syllable 'pa' from the first syllable until the time when recording was stopped. The duration of the 10 repetitions that are marked by the bracket is measured by cutting these repetitions from the file as shown in Panel B. The duration of the 10 repetitions shown in the cut file is indicated with an arrow.

Differential Diagnosis

Thoonen et al. (1999) developed a flow chart for differential diagnosis of dysarthria and dyspraxia, based on MPT data that they obtained from children aged 6 through 10 years. The application of these criteria is described here. Figure 5 illustrates the results of this interpretative process for a hypothetical 7-year-old child.

Figure 4. Panel A shows the client's repetition of the sequence 'pataka' from the first syllable until the time when recording was stopped. The brackets indicate the first sequence, which is excluded, and the next 4 sequences that are cut to form the display shown in Panel B. The time taken to produce 12 syllables comprising these 4 sequences is marked with an arrow.

Figure 5. Example of calculation of Maximum Phonation Duration (MPD), Maximum Fricative Duration (MFD), Maximum Repetition Rate for monosyllables (MRRmono), Maximum Repetition Rate for trisyllabic sequences (MRRtri), Attempts, and Sequence. Interpretation of these data to yield a diagnosis is shown at the bottom of the chart.

The process begins with the assignment of a dysarthria score of 0, 1, or 2, where 0 indicates that the child is not dysarthric and a 2 indicates that the child is primarily dysarthric. MRRmono is the primary diagnostic marker for dysarthria. A score of 0 is assigned if MRRmono is greater than 3.5 syllables per second. A score of 2 is assigned if the MRRmono is less than 3 syllables per second. If the child's MRRmono is between 3 and 3.5, the MPD is examined: if the MPD is less than 7.5 seconds, a score of 2 is assigned; if the MPD is more than 7.5, a score of 1 is assigned. Next, a dyspraxia score of 0, 1, or 2 is assigned, where 0 indicates that the child is not dyspraxic and a score of 2 indicates that the child is dyspraxic. MRRtri and Attempts are the primary diagnostic markers for CAS. A score of 0 is assigned if the child produces a correct trisyllabic sequence at a rate of at least 4.4 syllables per second without requiring more than 2 additional attempts. If the child cannot produce a correct sequence, or the MRRtri for a correct sequence is less than 3.4 syllables per second, a score of 2 is assigned. If the MRRtri is between 3.4 and 4.4 syllables per second, a score of 1 is assigned as long as the MFD is appropriate at more than 11 seconds and the child did not require more than 2 additional attempts to achieve a correct sequence. If the MRRtri is between 3.4 and 4.4 syllables per second and more than 2 additional attempts were needed to achieve a correct sequence, a score of 2 is assigned. A score of 2 is also assigned if MRRtri is between 3.4 and 4.4 syllables per second and MFD is 11 seconds or less. Note that a diagnosis of 'primarily dysarthria' would be concluded if the child received dysarthria and dyspraxia scores of 2. Children with dysarthria are likely to produce very slow repetition rates for both monosyllables and trisyllabic sequences.
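The decision rules just described can be summarized in code. The following is a sketch of our reading of Thoonen et al.'s (1999) flow chart, not the authors' own implementation; boundary cases that the prose leaves unspecified (e.g., an MPD of exactly 7.5 seconds, or a fast correct sequence that nevertheless required more than 2 additional attempts) are resolved conservatively here and flagged in comments:

```python
def dysarthria_score(mrr_mono, mpd):
    """0 = not dysarthric, 2 = primarily dysarthric (after Thoonen et al., 1999)."""
    if mrr_mono > 3.5:   # monosyllabic rate in the normal range
        return 0
    if mrr_mono < 3.0:   # clearly slow monosyllabic rate
        return 2
    # Borderline rate (3.0-3.5 syllables/s): phonation duration decides.
    # Assumption: an MPD of exactly 7.5 s is treated as adequate (score 1).
    return 2 if mpd < 7.5 else 1

def dyspraxia_score(mrr_tri, mfd, sequence_correct, extra_attempts):
    """0 = not dyspraxic, 2 = dyspraxic (after Thoonen et al., 1999)."""
    if not sequence_correct or mrr_tri < 3.4:
        return 2         # no correct [pataka] sequence, or very slow rate
    if mrr_tri >= 4.4 and extra_attempts <= 2:
        return 0         # fast, correct, and achieved promptly
    # Borderline rate (3.4-4.4 syllables/s); assumption: a fast sequence
    # that needed >2 extra attempts is also routed through these checks.
    if extra_attempts > 2 or mfd <= 11:
        return 2
    return 1

# Hypothetical child from Figure 5: MRRtri = 3.45 syllables/s, a correct
# sequence only on the sixth trial (3 extra attempts), adequate MFD.
print(dyspraxia_score(3.45, 14.0, True, 3))  # 2, consistent with CAS
```

The example reproduces the profile discussed in the text: a borderline MRRtri combined with more than 2 additional attempts yields a dyspraxia score of 2.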
Children with CAS are likely to produce repetition rates that are slower for trisyllabic sequences than for monosyllables (Thoonen et al., 1999). The hypothetical child profiled in Figure 5 received a dysarthria score of 1 and a dyspraxia score of 2, justifying a clinical diagnosis of CAS. His MRRmono was not slow enough to justify a diagnosis of dysarthria. His MRRtri was somewhat slow at 3.45 syllables per second, and he did not achieve a correct repetition of the sequence until the sixth trial. Thus the combination of Attempts = 3 and MRRtri between 3.4 and 4.4 led to a dyspraxia score of 2, resulting in a diagnosis of childhood apraxia of speech.

Summary and Conclusions

A number of normative data sets are available to aid in the interpretation of a child's ability to prolong sounds and repeat syllables (e.g., Kent et al., 1987; Robbins & Klee, 1987; Thoonen et al., 1996, 1999; Williams & Stackhouse, 2000). For children with specific phonological errors such as velar fronting, the diagnostic accuracy of the procedure can be improved by considering accuracy and consistency of production of a trisyllabic sequence as described by Williams and Stackhouse (2000). Thoonen et al. have provided a framework for using MPTs to assist in differential diagnosis of speech dyspraxia or dysarthria in pediatric clients. Technological advances such as the TOCS+™ MPT Recorder© ver. 1 (Hodge & Daniels, 2004) and readily available waveform editors facilitate reliable administration and recording of children's responses and accurate measurement of maximum durations and maximum repetition rates. The publication of new normative data and the availability of audio recording and editing software have eliminated significant impediments to the use of maximum performance tasks with children.
It is our hope that the application of these procedures will result in reliable and valid normative data from younger children and become a routine part of the speech-language assessment protocol for all children with suspected or confirmed speech disorders and delays.

Author Note

Address correspondence to Dr. Susan Rvachew, School of Communication Sciences and Disorders, McGill University, 1266 Pine Avenue West, Montréal, Québec H3G 1A8. Development of the TOCS+™ MPT Recorder© ver. 1 was supported by a grant from the Canadian Language and Literacy Research Network (www.cllrnet.ca) and uses the Universal Sound Server© software developed for the TOCS+ Project (www.Tocs.plus.ualberta.ca) at the University of Alberta by Tim Young. Readers interested in using the TOCS+™ MPT Recorder© can contact Megan Hodge ([email protected]) to obtain a copy of the software.

References

Avaaz Innovations, Inc. (1999). Time-Frequency-Response Version 2.1 (TFR) [Computer software]. London, ON: Avaaz Innovations, Inc. www.avaaz.com.

Boersma, P., & Weenink, D. (2005). Praat Version 4.3.12 [Computer software]. Institute of Phonetic Sciences, University of Amsterdam. www.fon.hum.uva.nl/praat/.

Campbell, T. F. (2003). Childhood apraxia of speech: Clinical symptoms and speech characteristics. In L. D. Shriberg & T. F. Campbell (Eds.), 2002 Childhood Apraxia of Speech Research Symposium. Carlsbad, CA: The Hendrix Foundation.

Duffy, J. (1995). Motor speech disorders: Substrates, differential diagnosis and management. St. Louis, MO: Mosby-Year Book, Inc.

Fletcher, S. (1972). Time-by-count measurement of diadochokinetic syllable rate. Journal of Speech and Hearing Research, 15, 763-770.

Goldwave Inc. (2005). Goldwave Version 5.10 [Computer software]. www.goldwave.com.

Gibbon, F. E. (1999). Undifferentiated lingual gestures in children with articulation/phonological disorders. Journal of Speech, Language, and Hearing Research, 42, 382-397.

Hickman, L. (1997). The Apraxia Profile.
Communication Skill Builders/Therapy Skill Builders, a division of The Psychological Corporation.

Hodge, M. M., & Daniels, J. D. (2004). TOCS+™ MPT Recorder© ver. 1 [Computer software]. University of Alberta, Edmonton, AB.

KayPentax. Motor Speech Profile [Computer software]. www.kayelemetrics.com.

Kent, R. D., Kent, J. F., & Rosenbek, J. C. (1987). Maximum performance tests of speech production. Journal of Speech and Hearing Disorders, 52, 367-387.

Potter, N., Kent, R., & Lazarus, J. (2004, March). Measures of speech and manual motor performance in children. Presented at the Conference on Motor Speech, Albuquerque, NM.

Robbins, J., & Klee, T. (1987). Clinical assessment of oropharyngeal motor development in young children. Journal of Speech and Hearing Disorders, 52, 271-277.

Shriberg, L. D. (1997). The Speech Disorders Classification System (SDCS): Extensions and lifespan reference data. Journal of Speech, Language, and Hearing Research, 40, 723-740.

Shriberg, L. D., Aram, D. M., & Kwiatkowski, J. (1997). Developmental apraxia of speech: II. Toward a diagnostic marker. Journal of Speech, Language, and Hearing Research, 40, 286-312.

Shriberg, L. D., Campbell, T. F., Karlsson, H. B., Brown, R. L., McSweeny, J. L., & Nadler, C. J. (2003). A diagnostic marker for childhood apraxia of speech: The lexical stress ratio. Clinical Linguistics & Phonetics, 17, 549-574.

Thoonen, G., Maassen, B., Gabreels, F., & Schreuder, R. (1999). Validity of maximum performance tasks to diagnose motor speech disorders in children. Clinical Linguistics & Phonetics, 13, 1-23.

Thoonen, G., Maassen, B., Wit, J., Gabreels, F., & Schreuder, R. (1996). The integrated use of maximum performance tasks in differential diagnostic evaluations among children with motor speech disorders. Clinical Linguistics & Phonetics, 10, 311-336.

Williams, P., & Stackhouse, J. (2000).
Rate, accuracy and consistency: Diadochokinetic performance of young, normally developing children. Clinical Linguistics & Phonetics, 14, 267-293.

Received: November 29, 2004
Accepted: August 8, 2005

Applications of 2D and 3D Ultrasound Imaging in Speech-Language Pathology
Utilisation de l'échographie en 2D et 3D en orthophonie

Tim Bressmann, Chiang-Le Heng and Jonathan C. Irish

Abstract

Tongue motion in speech and swallowing is difficult to image because the tongue is concealed in the oral cavity. It is even more difficult to assess the extent of lingual motion quantitatively. Ultrasound imaging of the tongue in speech production and swallowing allows for safe and non-invasive data acquisition. The paper describes the potentials and methodological problems of conducting ultrasound speech research using dynamic two-dimensional, static three-dimensional and dynamic three-dimensional ultrasound imaging.

Abrégé

Le mouvement de la langue pour la parole et la déglutition est difficile à mettre en image parce que cet organe est dissimulé dans la cavité buccale. Il est encore plus difficile d'évaluer l'ampleur du mouvement de la langue de manière quantitative. L'échographie de la langue lors de la production de la parole et de la déglutition constitue une méthode sécuritaire et non invasive d'obtenir des données. Le présent article décrit les possibilités et les problèmes méthodologiques pour mener des recherches sur l'utilisation de l'échographie dynamique en deux dimensions, statique en trois dimensions et dynamique en trois dimensions dans le domaine de l'orthophonie.

Key Words: Glossectomy, cancer, tongue paralysis, speech, articulation, swallowing, tongue, 3D ultrasound, 2D ultrasound

Tim Bressmann, Ph.D.
Graduate Department of Speech-Language Pathology
University of Toronto
Toronto, ON, Canada

Chiang-Le Heng, B.Sc.
Graduate Department of Speech-Language Pathology
University of Toronto
Toronto, ON Canada

Jonathan C. Irish, M.D., M.Sc., F.R.C.S.C., F.A.C.S.
Departments of Otolaryngology and Surgical Oncology
Princess Margaret Hospital, University Health Network, and Toronto General Hospital, University Health Network
Toronto, ON Canada

Introduction
Ultrasound imaging of the tongue is currently gaining popularity as a research tool in speech-language pathology and speech science. Until ten years ago, ultrasound machines were out of reach for most speech researchers because they were very expensive. In recent years, advances in computer technology and increased competition between the different manufacturers of ultrasound machines have helped to bring down the costs considerably. As a consequence, more speech researchers are now able to purchase ultrasound machines for their laboratories. Also, more physicians are buying machines for their hospital or private practice. This in turn may give speech-language pathologists who are affiliated with hospitals or physicians in private practice access to ultrasound machines. The main advantages of ultrasound imaging over other methods in phonetic research lie in its low cost, bio-safety and ease of image acquisition. After an ultrasound machine has been purchased, the associated costs of collecting data are negligible. The energy levels generated by a medical ultrasound machine are extremely low and, unlike x-ray dosage, do not accumulate, so it is biologically safe to conduct extended recording sessions or to examine patients repeatedly. This is an advantage over x-ray based imaging methods such as videofluoroscopy. Ultrasound imaging is non-invasive for the patient because it is not necessary to glue transducer coils to the tongue (as in electromagnetic midsagittal articulography).
The ultrasound data acquisition is reasonably comfortable for the participant, so clinical populations can be studied. It also becomes easier to make recordings with notoriously 'difficult' research populations such as children. In this paper, we will give an introduction to a number of different applications that we see for ultrasound imaging in speech-language pathology and demonstrate research findings from different projects that were undertaken in the Voice and Resonance Laboratory at the University of Toronto.

Two-Dimensional Ultrasound
Diagnostic ultrasound makes use of the pulse-echo principle: a physical impulse sent into a medium sets objects into oscillation and results in echoes. In the ultrasound machine, a piezoelectric crystal in the ultrasound transducer generates a sound burst. After sending the burst, the same piezoelectric crystal is put into receiving mode and listens for the echoes. By repeating this process hundreds of times every second, a two-dimensional image can be reconstructed by a computer. Commercially available ultrasound machines deliver a video frame rate of 30 frames per second. This corresponds to the standard NTSC format that is used in television sets and video recorders in North America. The NTSC frame rate is sufficiently fast to capture even the quicker aspects of tongue movement in speech, such as those involved in the production of plosives. Diagnostic ultrasound has long been used in phonetic research to examine tongue shape in different speech sounds (Morrish, Stone, Sonies, Kurtz, & Shawker, 1984; Shawker & Sonies, 1984; Stone, Morrish, Sonies, & Shawker, 1987, 1988; Wein, Bockler, Huber, Klajman, & Wilmes, 1990) as well as to assess temporal aspects of speech motor control (Munhall, 1985; Parush & Ostry, 1993).
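The pulse-echo principle described above can be made concrete with a small calculation: the scanner converts each echo's round-trip time into a reflector depth. The 1540 m/s soft-tissue speed of sound used below is a standard scanner calibration value assumed here for illustration; it is not stated in the article, and the function name is ours.

```python
# Pulse-echo depth estimation: depth = (speed_of_sound * round_trip_time) / 2.
# The division by two accounts for the pulse travelling to the reflector and back.
# 1540 m/s is a conventional soft-tissue calibration value (an assumption here).
SPEED_OF_SOUND_TISSUE = 1540.0  # m/s

def echo_depth_mm(round_trip_time_s: float) -> float:
    """Depth of a reflector, in millimetres, from the round-trip time of its echo."""
    return SPEED_OF_SOUND_TISSUE * round_trip_time_s / 2.0 * 1000.0

# An echo returning after 100 microseconds comes from about 77 mm deep,
# roughly the scale of a tongue scan.
print(echo_depth_mm(100e-6))
```

The hundreds of such pulse-echo cycles per second mentioned in the text are what allow one image line per pulse, and hence full frames at the 30 frames-per-second NTSC rate.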
Much of the clinical application of ultrasound imaging for the study of tongue function to date has focused on the oral phase of swallowing (Casas, Kenny, & Macmillan, 2003; Chi-Fishman, Stone, & McCall, 1998; Neuschäfer-Rube, Wein, Angerstein, Klajman, & Fisher-Wein, 1997; Peng, Jost-Brinkmann, Miethke, & Lin, 2000; Soder & Miller, 2002; Sonies, Baum, & Shawker, 1984; Shawker, Sonies, Stone, & Baum, 1983; Stone & Shawker, 1986). The method has proven useful even for babies (Bosma, Hepburn, Josell, & Baker, 1990). Swallowing and speech have been studied in different pathological populations, including patients with cerebral palsy (Casas, Kenny, & McPherson, 1994; Casas, McPherson, & Kenny, 1995; Kenny, Casas, & McPherson, 1989; Sonies & Dalakas, 1991), strokes (Wein, Alzen, Tolxdorff, Bockler, Klajman, & Huber, 1988a; Wein, Angerstein, & Klajman, 1993), geriatric patients (Sonies et al., 1984), glossectomy (Schliephake, Schmelzeisen, Schönweiler, Schneller, & Altenbernd, 1998), and malocclusions (Cheng, Peng, Chiou, & Tsai, 2002; Kikyo, Saito, & Ishikawa, 1999). The fact that the tongue shape can be displayed in real time on the screen makes ultrasound potentially very attractive to speech-language pathologists as a tool for biofeedback for oral deaf speakers and also for patients with dysarthrias or compensatory articulation errors associated with cleft palate. However, only a few studies so far have used ultrasound as a tool for biofeedback in speech therapy. Shawker and Sonies (1985) described the use of ultrasound imaging in the speech therapy of an individual with an articulation disorder. The authors found that the subject was able to improve her articulatory distortions of the /r/ sound over a course of 3 months of ultrasound biofeedback therapy. A recent study by Bernhardt et al.
(Bernhardt, Bacsfalvi, Gick, Radanov, & Williams, this issue; Bernhardt, Gick, Bacsfalvi, & Ashdown, 2003) compared ultrasonographic and electropalatographic feedback in four speakers with mild to severe hearing loss. The authors found that all subjects were able to improve their articulation and that both feedback methods were equally effective.

For the research in the Voice and Resonance Laboratory at the University of Toronto, we use a low-end General Electric Logiq Alpha 100 MP ultrasound machine with a 6.5 MHz microconvex curved-array scanner with a 114° view (Model E72, General Electric Medical Systems, P.O. Box 414, Milwaukee, Wisconsin 53201). The machine and the transducer are displayed in Figure 1. During the ultrasound examination, the video output of the ultrasound machine is captured with a generic digital video camera (Canon ZR 45 MC, Canon Canada Inc., 6390 Dixie Road, Mississauga, Ontario L5T 1P7). Parallel sound recordings are made onto the same digital videotape using an AKG C420 headset condenser microphone (AKG Acoustics, 914 Airpark Center Drive, Nashville, Tennessee 37217) with a Behringer Ultragain Pro 2200 line driver (Behringer Ltd., 18912 North Creek Pkwy, Suite 200, Bothell, Washington 98011). After the recording, the ultrasound films are downloaded from the digital video camera onto a computer and saved as a digital file. Figure 2 shows typical midsagittal tongue contours during the sustained production of the cardinal vowels /a/, /i/, and /u/.

Problems in 2D ultrasound imaging: Head fixation
The ultrasound image is acquired by holding the transducer against the neck of the research participant.
If the transducer is held manually against the neck of a subject who sits in a standard office chair, there are a number of moving elements that might lead to measurement error:
· The examiner's hand with the ultrasound transducer may move or change the angle of the transducer;
· The subject's mandible moves up and down, which may change the position of the transducer;
· The subject's head and shoulders may move, which can affect the coupling or the angle of the transducer.

Figure 1. Ultrasound machine with 114° endocavity transducer and ultrasound gel. The image also shows the PC Bird electromagnetic movement tracking system, which is used to reconstruct three-dimensional ultrasound volumes.

Figure 2. Midsagittal ultrasound contours of the tongue during the production of sustained vowels. The anterior tongue is towards the right side of the images. To facilitate viewing, the tongue contours are marked with a grey line. (a) sustained /a/; (b) sustained /i/; (c) sustained /u/.

If one wishes to generate quantitative tongue movement data from an ultrasound film, it is important to reduce movement of the transducer and of the subject's head. On the other hand, the fixation mechanism should ensure good coupling during regular mandibular movement in speech. Different groups of researchers have used head stabilization devices such as headrests for the back of the head (Davidson, 2004; Stone et al., 1988), headrests for the forehead (Peng et al., 2000), complete head fixation (Stone & Davies, 1995), helmets with transducer attachments (Hewlett, Vazquez, Zharkova, & Zharkova, 2004) or position control with laser pointers (Gick, 2002). In an alternative approach, Whalen et al. (2004) suggested minimizing head movement with a headrest and tracking the residual movement with a three-dimensional optical tracking system.
In our laboratory at the University of Toronto, we developed the Comfortable Head Anchor for Sonographic Examinations (CHASE; Carmichael, 2004), which was roughly modelled on the device described by Peng et al. (2000). The CHASE, depicted in Figure 3, consists of a headrest for the subject's forehead and a transducer cradle with a suspension-spring mechanism. Since a large number of our research participants are head and neck cancer patients, it was our goal in the development of the CHASE to make the device as unintimidating as possible. For this reason, the CHASE only anchors and stabilizes the participant's head while avoiding forced head fixation. Our experience to date shows that this effectively reduces all head and transducer movement to an acceptable minimum (less than 1.5 mm of lateral wandering after ten minutes of speech recordings).

Figure 3. A research participant in the CHASE head anchor.

Problems in 2D ultrasound imaging: Image analysis
The set-up for ultrasound imaging and the data acquisition require minimal advance preparation. However, the data analysis can be labour-intensive. Every second of film generates 30 separate image frames, and we are interested in the positions of different parts of the tongue. It is extremely time-consuming to do these analyses by hand. It is therefore desirable to automate the data analysis as much as possible. At first, the automatic extraction of tongue contours from an ultrasound film may seem a trivial task because the tongue movement can be easily visualized and appears clearly on the screen of the ultrasound machine. However, ultrasound is an acoustic imaging method and, as a consequence, the image is often noisy and may contain artefacts.
On close inspection, the tongue contours that look so clear to the experimenter's eye are in fact blurry patches of diffuse shades of grey. The human observer has the advantage of Gestalt perception, which means that a pattern of elements is perceived as a unified whole. While the moving tongue surface is easily discernible for the human observer, the automatic extraction of tongue contours from an ultrasound film is a challenging problem for computer vision programming. Over the years, researchers at the Vocal Tract Visualization Laboratory at the University of Maryland have developed a number of successful motion trackers (Akgul, Kambhamettu, & Stone, 1999; Unser & Stone, 1992). The current version of the EdgeTrak software (Li, Kambhamettu, & Stone, 2003) can be downloaded at http://speech.umaryland.edu. Other programs are currently being developed at Queen Margaret University College in Edinburgh (Wrench, 2004) and at the University of British Columbia in Vancouver (Gick & Rahemtulla, 2004). However, even the best automatic motion trackers will often lose the tongue contour that they are tracking. The deformation of the tongue shape can be very rapid. A good example of a situation that often leads to the failure of a motion tracker is when the speaker changes from a low to a high vowel. The tracking of ultrasonographic tongue contours is probably more an artificial-intelligence problem than an image-processing problem, and the current generation of motion trackers cannot be expected to perform flawlessly. It is therefore important that the user review and correct the automatic tracking results. We recently completed the development of our own software, named the Ultrasonographic Contour Analyzer for Tongue Surfaces (Ultra-CATS; Gu, Bressmann, Cannons, & Wong, 2004). The Ultra-CATS software can be downloaded from www.slp.utoronto.ca/People/Labs/TimLab/ultracats.htm.
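The difficulty of automatic contour extraction described above can be illustrated with a deliberately naive tracker. This sketch is not the EdgeTrak or Ultra-CATS algorithm; it simply picks, in each image column, the row with the steepest top-to-bottom brightening as a crude guess at the bright tongue-surface edge. On a clean synthetic frame this works; on a noisy real scan with speckle and dropouts, exactly this kind of per-column decision is what loses the contour.

```python
import numpy as np

# Naive per-column edge picker (illustrative only, not a published algorithm):
# for each column of a grayscale frame, return the row where the vertical
# intensity gradient is largest, i.e. where the image brightens most sharply.
def crude_edge_per_column(frame: np.ndarray) -> np.ndarray:
    """frame: 2D array (rows x cols). Returns one row index per column."""
    grad = np.diff(frame.astype(float), axis=0)  # brightness change row-to-row
    return np.argmax(grad, axis=0) + 1           # row of steepest brightening

# A synthetic 'frame' with a bright band starting at row 6 in every column:
frame = np.zeros((12, 8))
frame[6:9, :] = 200.0
print(crude_edge_per_column(frame))  # [6 6 6 6 6 6 6 6]
```

Because each column is decided independently, a single noisy column throws the estimate off by many rows, which is why practical trackers add smoothness constraints across columns and frames, and why manual review remains necessary.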
The Ultra-CATS was designed with a focus on the semi-automated analysis of ultrasound data. The program also incorporates an automatic tracking option. The main goal of the software was to facilitate the manual frame-by-frame analysis of an image sequence. In the semi-automated analysis, the user traces the tongue surface with a drawing tool on each image frame. The software then measures points on the tongue surface with a polar grid. In our experience, the manual tracing of a single ultrasound frame takes a trained experimenter about 7 seconds on average. At this pace, every ten seconds of continuous ultrasound film (300 frames) will take approximately 35 minutes to analyze. An additional feature of the Ultra-CATS software is an automatic image-processing algorithm. The automatic tracker can be set, run, stopped, corrected and set back on track at any point during the analysis. The Ultra-CATS program saves all measurements to a text file so that they can be edited and analyzed using a spreadsheet editor or a statistical analysis program. An image of the program interface of the Ultra-CATS can be found in Figure 4. Since we completed the development of the Ultra-CATS software, we have used it to analyze the speech and swallowing of normal speakers as well as patients with tongue cancer and lingual paralyses. Figure 5 shows a waterfall display for a water swallow of a normal male participant. The waterfall display shows the elevation of the back of the tongue that prevents predeglutitive aspiration. As the oral transport phase of the swallow begins, the back of the tongue lowers in order to allow passage of the bolus. We can then appreciate the progressive elevation of the tongue from the front to the back as the bolus is cleared from the oral cavity. Figure 6 shows a waterfall display of the phrase 'ninety-three years old'.
This segment was taken from a reading of the second sentence of the Grandfather passage ('Well, he is nearly ninety-three years old'; van Riper, 1963), spoken by a normal female speaker. The displayed data represent 1.3 seconds of speech. Note the large number of posture adjustments that the tongue makes to produce all the required phonemes. Also note the immediate anticipatory elevation of the anterior dorsum of the tongue after the final /d/ plosion, which leads into the first high front vowel of the next sentence ('He dresses himself in an ancient black frock coat …').

Figure 4. Screenshot of the Ultra-CATS software.

Figure 5. Waterfall display of a water swallow by a normal female participant. The numbers indicate different phases of the swallow. (1) The dorsum of the tongue is elevated to prevent predeglutitive aspiration. (2) The swallow is initiated. The tongue lowers so that the water bolus can pass into the oropharynx. (3) First the tip and then the dorsum of the tongue elevate to clear the bolus from the oral cavity. During this part of the swallow, the hyoid bone moves forward to open the upper oesophageal sphincter. The forward movement of the hyoid is indicated by the missing data/zeros which are visible in the posterior tongue during this phase of the swallow. (4) After the swallow, the tongue returns to a neutral rest position and then lowers in preparation for the next swallow.

Figure 6. Waterfall display of the midsagittal tongue contours during the phrase 'ninety-three years old' from the second sentence of the Grandfather passage ('Well, he is nearly ninety-three years old.'), spoken by a normal adult female participant.

Figure 7 shows two waterfall displays of repeated syllables. In Figure 7a, we see five repeated utterances of the syllable /aka/ by a normal female speaker. The lowering of the tongue from the rest position towards the position for the /a/ can be appreciated. The back of the tongue then elevates to achieve velar closure for the /k/, and the procedure is repeated over. Figure 7b demonstrates the same repeated syllables spoken by an older female patient with a flaccid paralysis of the tongue resulting from post-polio syndrome. It can be observed that the lingual paralysis leads to an undifferentiated elevation of the tongue, rather than the lowering of the tongue for /a/, which perceptually resulted in a centralized vowel. The patient cannot elevate the posterior tongue for /k/. Perceptually, this articulatory undershoot results in significantly reduced speech intelligibility.

Figure 7. Waterfall display of the midsagittal tongue contour during five repetitions of /aka/. (a) Normal adult female participant. Note the elevated tongue rest position before the beginning of the utterance. (b) Female patient with lingual paralysis resulting from post-polio syndrome.

Static Three-Dimensional Ultrasound
The 2D display of tongue contours is interesting and affords us fascinating insights into the movement of the tongue. However, the tongue is a complex, non-rigid three-dimensional structure (Hiiemae & Palmer, 2003; Stone, 1990). It is especially important to recover three-dimensional information about the shape of the tongue in glossectomy patients because the lingual resection and reconstruction rarely lead to a symmetrical outcome. This means that a midsagittal image will often be an incomplete representation of the surgically altered tongue shape. So far, three-dimensional ultrasound has mostly been used in feasibility studies with normal speakers (Lundberg & Stone, 1999; Stone & Lundberg, 1996; Watkin & Rubin, 1989; Wein, Klajman, Huber, & Doring, 1988b). While it was demonstrated that three-dimensional ultrasound has the capability to deliver exact representations of the lingual surface in different speech sounds, the above studies remained descriptive and did not attempt to quantify lingual movement ranges in the reconstructed three-dimensional volume. However, a quantitative approach to the three-dimensional deformation of the tongue during the production of speech sounds would be particularly desirable for the analysis of patients with glossectomy, in order to compare pre- and postoperative movement ranges and to evaluate the effect of different reconstructive techniques on the deformation and symmetry of the tongue tissue.

In a series of ongoing studies at the Voice and Resonance Laboratory at the University of Toronto, we use our ultrasound machine in combination with a three-dimensional motion sensor (PC Bird, Ascension Technology Corporation, P.O. Box 527, Burlington, Vermont 05402). The FreeScan V7.04 computer program (Echotech 3D Imaging Systems, 85399 Hallbergmoos, Germany) is used for the data acquisition and the reconstruction of the three-dimensional volumes. This setup allows us to make three-dimensional scans of static structures only and, consequently, all volume scans have to be obtained from sustained speech sounds. During the 3D data acquisition procedure, the subject is seated upright on a chair and instructed to slightly overextend his or her head.

Figure 8. Orthogonal planes of a three-dimensional ultrasound volume of the sustained vowel /a/, spoken by a normal male adult participant.
The transducer is held in a coronal scanning position and swept from the chin to the upper border of the thyroid cartilage. In a typical examination, a research participant sustains the following English phonemes: /a/, /i/, /u/, /s/, //, //, /l/, /n/ and //. Each speech sound is repeated three times. The sound is sustained for approximately 5 seconds while the ultrasound scan is made. A 3D ultrasound scan of the tongue usually takes 2-3 seconds. Using the FreeScan software, we can then browse through the three-dimensional ultrasound volume in any direction. Figure 8 illustrates how multiple planes of a three-dimensional ultrasound volume can be visualized. The sound shown is a sustained /a/ spoken by a normal male speaker. Figure 9 shows the reconstructed three-dimensional tongue volume of the same sustained /a/.

Figure 9. Reconstructed three-dimensional volume of the sustained /a/ from Figure 8.

In order to make measurements of the 3D tongue surface, it is important to define an anchor point in the tongue upon which all measurements can be based. We first align all 3D scans so that the lingual septum is as exactly vertical as possible and the muscle fibres between the chin and the hyoid bone that are formed by the geniohyoid and the inferior genioglossus are as exactly horizontal as possible. We then define the 'midpoint' of the tongue as the halfway point between the mandible and the hyoid on the superior border of the geniohyoid muscle in the sagittal view, and as the point of intersection of the lingual septum and the geniohyoid muscle in the coronal view. Using this midsagittal midpoint, we then identify a left and a right parasagittal plane that are exactly parallel to the midsagittal slice. Based on the anchor point, we then superimpose a concentric grid with measurement lines spaced at 11.25° intervals on the slices and measure the sagittal tongue form in the three parallel sagittal planes. Figure 10 demonstrates how we take measurements in the midsagittal plane of the same ultrasound volume of the sustained /a/ shown in Figures 8 and 9.

Figure 10. Midsagittal plane of a sustained /a/ with overlay of a measurement grid.

This procedure generates a data matrix for the tongue surface for a speech sound. The data can be used to reconstruct a rough visual representation of tongue surface shapes, as demonstrated in Figure 11. Figure 11a shows a composite of tongue surface data during the production of the postalveolar fricative for 12 normal speakers. Note the midsagittal groove for this sound that is necessary to produce a clear fricative. Figure 11b shows the tongue of a patient with a lateral carcinoma of the tongue during the production of the same sound before the ablative cancer surgery to the right side of her tongue. The patient was able to produce an adequate midsagittal groove and her fricative was perceptually acceptable. Figure 11c shows the same patient after a lateral resection of the right side of her tongue and defect reconstruction with a radial forearm flap. The patient was now unable to form a consistent midsagittal groove, which led perceptually to a distortion and lateralization of the sound. In two studies (Bressmann, Uy, & Irish, 2005; Bressmann, Thind, Uy, Bollig, Gilbert, & Irish, 2005), we used the quantitative tongue surface data that we generated from three-dimensional tongue volumes to extract underlying components of tongue movement using mathematical procedures such as principal component analysis and multi-dimensional scaling. We also developed a number of quantitative descriptors for the degree of protrusion of the tongue in the oral cavity (anteriority index), the degree of three-dimensional midsagittal grooving along the length of the tongue (concavity index) and the symmetry of the elevation of the left and right lateral tongue (asymmetry index).
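The polar measurement grid described above can be sketched in a few lines. The 11.25° ray spacing is given in the text; the 0-180° span, the grid origin argument and the function names are illustrative assumptions, not details of the Ultra-CATS or FreeScan software.

```python
import math

# Measurement rays of a concentric polar grid, spaced 11.25 degrees apart.
# A semicircular 0-180 degree span above the anchor point is assumed here
# for illustration; 180 / 11.25 = 16 intervals, i.e. 17 rays.
ANGLE_STEP_DEG = 11.25

def grid_angles_deg() -> list:
    """Ray angles from 0 to 180 degrees in 11.25-degree steps."""
    n = int(180 / ANGLE_STEP_DEG)
    return [i * ANGLE_STEP_DEG for i in range(n + 1)]

def ray_endpoint(origin, angle_deg, radius):
    """Point at `radius` from `origin` along a ray at `angle_deg`; the tongue
    surface sample on that ray is where the traced contour crosses it."""
    a = math.radians(angle_deg)
    return (origin[0] + radius * math.cos(a), origin[1] + radius * math.sin(a))

angles = grid_angles_deg()
print(len(angles))  # 17 measurement rays
```

Sampling the traced contour where it crosses each ray yields one surface distance per angle, which is how a per-sound data matrix of the kind described above can be assembled across the three sagittal planes.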
We established orienting values for a group of normal speakers and demonstrated the usefulness of these measures for the analysis of a patient with glossectomy, in whom we compared pre- and postoperative movement ranges and evaluated the effect of the defect reconstruction on the deformation and symmetry of the tongue.

Figure 11. Surface plots for the postalveolar fricative. (a) Composite surface plot for 12 normal speakers; (b) a patient with a carcinoma of the right lateral tongue preoperatively; (c) the same patient after the tumour resection and reconstruction with a radial forearm flap. Note the decreased lingual grooving and the postoperative asymmetry of the tongue.

Towards Dynamic Three-Dimensional Ultrasound
With the current generation of ultrasound machines, we are faced with the dilemma that we can visualize either a two-dimensional dynamic motion or a three-dimensional static volume. Obviously, our ultimate goal would be to visualize the motion of the tongue in 3D. This would be especially important for our research on glossectomy patients because the partial tongue resections and reconstructions rarely lead to a symmetrical outcome. In recent years, so-called 4D ultrasound machines have become commercially available. These machines are currently able to visualize up to 30 volumes per second. However, while this technology is used with great success in obstetrics and gynaecology, it is less suitable for imaging the tongue in speech. The reason is that the air in the oral cavity causes echo artefacts in the ultrasound scan that obscure the true surface of the tongue in the three-dimensional volume. Consequently, 4D scans acquired this way would necessitate extensive post-processing. Yang and Stone (2002) at the Vocal Tract Visualization Laboratory at the University of Maryland recently suggested an interesting new approach to the reconstruction of three-dimensional tongue motion from multiple two-dimensional image sequences. The researchers recorded repeated utterances of the same sentence in multiple parallel sagittal and coronal planes and then reassembled the data into a reconstructed three-dimensional surface using the Dynamic Programming method. Using multiple two-dimensional scans to reconstruct a three-dimensional moving tongue surface is an elegant way to circumvent the current technological limitations of the available ultrasound machines. However, the high number of slices and repetitions required for the method described by Yang and Stone (2002) relies on the participation of a highly compliant research volunteer. It is probably less practical for clinical patient examinations and research in pathological speaker groups.

In a recent series of experiments at the Voice and Resonance Laboratory, we have started to acquire parallel sagittal scans to make sparse three-dimensional surface plots of the moving tongue. For this research, we use the CHASE device to stabilize the head of the research participant for repeated scans in three parallel sagittal planes. Instead of aligning the images retrospectively with Dynamic Programming, we pace the speech of the subject using a digital metronome. In order to facilitate the task of keeping a steady rhythm at 60 beats per minute, the metronome is set to 120 beats per minute and the subject is instructed to speak at half-tempo. So far, we have used only repeated VCV syllables for this examination technique. The stress is on the CV segment and coincides with the metronome beat (i.e., /a'ta/). The examination is repeated in three sagittal planes.
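The pacing scheme above implies one spoken beat per second (60 per minute at half-tempo), so at the NTSC rate of 30 frames per second each beat falls a fixed number of frames apart. As a sketch of the arithmetic behind aligning repetitions across planes (not the actual video-editing workflow; the function name and the beat-onset argument are ours), the beat times can be converted into video frame indices:

```python
# Mapping metronome beats to video key frames. Beats at 60 per minute
# (the effective speaking tempo described in the text) land one second
# apart; at 30 frames/s each beat therefore marks every 30th frame.
FRAME_RATE = 30        # NTSC frames per second
BEATS_PER_MINUTE = 60  # effective half-tempo rate

def beat_keyframes(num_beats: int, first_beat_time_s: float = 0.0) -> list:
    """Video frame indices at which successive metronome beats occur."""
    interval_s = 60.0 / BEATS_PER_MINUTE
    return [round((first_beat_time_s + i * interval_s) * FRAME_RATE)
            for i in range(num_beats)]

# Five beats starting 0.5 s into a recording fall on these frames:
print(beat_keyframes(5, 0.5))  # [15, 45, 75, 105, 135]
```

Once the first beat has been located in each plane's audio track, the remaining key frames follow deterministically, which is what makes the three separately recorded sagittal sequences comparable frame by frame.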
We then use a video-editing program with a parallel oscillogram display (ScreenBlast, Sony Corp., 550 Madison Avenue, New York, New York 10022) to identify the bursts of the metronome in the acoustic signal. The metronome bursts are used to identify key frames that help us synchronize the image sequences for the same utterance recorded in the three sagittal planes. The image sequences are analyzed using the Ultra-CATS software and the results are plotted as surfaces. Figure 12 shows a number of frames of the pseudo-3D surface plots of tongue movement during the production of the syllable /aka/. In our laboratory, this technique is still largely experimental at this time. While we have only used it on normal speakers to date, we are hoping to incorporate a similar procedure into future patient examinations.

Figure 12. Posterior view of the reconstructed three-dimensional surface plots of the tongue during the utterance /a'ka/. (a) First /a/: note the prominent genioglossus furrow during the production of /a/; (b) maximal dorsal elevation for the production of /k/; (c) second /a/: note the reduced genioglossus furrow following the velar closure.

Conclusion
Ultrasound offers exciting new possibilities for researchers and therapists in speech-language pathology. The main advantages of ultrasound imaging are its non-invasiveness, bio-safety, cost-effectiveness and, last but not least, the ease of image acquisition. Ultrasound allows us to acquire extensive amounts of speech data, and the examination sessions can be as long as necessary. We also find that the direct visualization of the tongue shape on the ultrasound screen is a good motivator for research participants. Obviously, a number of methodological issues are still associated with the use of ultrasound for research or clinical applications. These will have to be addressed in further research. Nevertheless, the benefits outweigh the disadvantages of this highly promising imaging method. The costs for basic ultrasound machines have dropped significantly over the last few years, and it is likely that this trend will continue. In addition, many manufacturers now make small portable ultrasound machines that are operated with batteries. In the future, many more speech-language pathologists may be able to use ultrasound for speech and swallowing assessments as well as a biofeedback device for therapy.

Acknowledgements
The authors gratefully acknowledge the invaluable contributions of the following people (in alphabetical order) to the research described here: Yarixa Barillas, Carmen Bollig, Arpita Bose, Amanda Braude, Kevin Cannons, Brent Carmichael, Michelle Ciccia, Heather Flowers, Jiayun (Jenny) Gu, Gajanan (Kiran) Kulkarni, Marilia Sampaio, Parveen Thind, Catherine Uy and Willy Wong. Funding for this research was provided by the Canadian Institutes of Health Research (grant # MOP 62960).

References
Akgul, Y. S., Kambhamettu, C., & Stone, M. (1999). Automatic extraction and tracking of the tongue contours. IEEE Transactions on Medical Imaging, 18, 1035-1045.
Bernhardt, B., Gick, B., Bacsfalvi, P., & Ashdown, J. (2003). Speech habilitation of hard of hearing adolescents using electropalatography and ultrasound as evaluated by trained listeners. Clinical Linguistics & Phonetics, 17, 199-216.
Bosma, J. F., Hepburn, L. G., Josell, S. D., & Baker, K. (1990). Ultrasound demonstration of tongue motions during suckle feeding. Developmental Medicine and Child Neurology, 32, 223-229.
Bressmann, T., Uy, C., & Irish, J. C. (2005). Analyzing normal and partial glossectomee tongues using ultrasound. Clinical Linguistics & Phonetics, 19, 35-52.
Bressmann, T., Thind, P., Uy, C., Bollig, C., Gilbert, R. W., & Irish, J. C.
(2005). Quantitative three-dimensional ultrasound analysis of tongue protrusion, grooving and symmetry: Data from twelve normal speakers and a partial glossectomee. Clinical Linguistics and Phonetics, 19, 573-588.
Bressmann, T., Gu, J., Cannons, K., Wong, W., Heng, C.L., & Carmichael, B. (in preparation). Quantitative analysis of tongue motion using semi-automatic edge detection software in B-mode ultrasound films: Design, methods and procedure validation.
Carmichael, B. (2004). The Comfortable Head Anchor for Sonographic Examinations (CHASE). Toronto: University of Toronto.
Casas, M.J., Kenny, D.J., & Macmillan, R.E. (2003). Buccal and lingual activity during mastication and swallowing in typical adults. Journal of Oral Rehabilitation, 30, 9-16.
Casas, M.J., Kenny, D.J., & McPherson, K.A. (1994). Swallowing/ventilation interactions during oral swallow in normal children and children with cerebral palsy. Dysphagia, 9, 40-46.
Casas, M.J., McPherson, K.A., & Kenny, D.J. (1995). Durational aspects of oral swallow in neurologically normal children and children with cerebral palsy: an ultrasound investigation. Dysphagia, 10, 155-159.
Cheng, C.F., Peng, C.L., Chiou, H.Y., & Tsai, C.Y. (2002). Dentofacial morphology and tongue function during swallowing. American Journal of Orthodontics and Dentofacial Orthopedics, 122, 491-499.
Chi-Fishman, G., & Sonies, B.C. (2002). Kinematic strategies for hyoid movement in rapid sequential swallowing. Journal of Speech, Language, and Hearing Research, 45, 457-468.
Chi-Fishman, G., Stone, M., & McCall, G.N. (1998). Lingual action in normal sequential swallowing. Journal of Speech, Language, and Hearing Research, 41, 771-785.
Davidson, L. (2004 April). Assessing tongue shape similarity: comparing L2 norms and area measures. Paper presented at the meeting of the Second Ultrasound Roundtable, Vancouver, BC.
Gick, B. (2002). The use of ultrasound for linguistic phonetic fieldwork.
Journal of the International Phonetic Association, 32, 113-122.
Gick, B., & Rahemtulla, S. (2004 April). Recent developments in quantitative analysis of ultrasound tongue data. Paper presented at the meeting of the Second Ultrasound Roundtable, Vancouver, BC.
Gu, J., Bressmann, T., Cannons, K., & Wong, W. (2004). The Ultrasonographic Contour Analyzer for Tongue Surfaces (Ultra-CATS). Toronto: University of Toronto.
Hewlett, N., Vazquez, Y., Zharkova, A., & Zharkova, N. (2004 April). Ultrasound study of coarticulation and the “Trough Effect” in symmetrical VCV syllables: A report of work in progress. Paper presented at the meeting of the Second Ultrasound Roundtable, Vancouver, BC.
Hiiemae, K.M., & Palmer, J.B. (2003). Tongue movements in feeding and speech. Critical Reviews in Oral Biology and Medicine, 14, 413-429.
Kenny, D.J., Casas, M.J., & McPherson, K.A. (1989). Correlation of ultrasound imaging of oral swallow with ventilatory alterations in cerebral palsied and normal children: preliminary observations. Dysphagia, 4, 112-117.
Kikyo, T., Saito, M., & Ishikawa, M. (1999). A study comparing ultrasound images of tongue movements between open bite children and normal children in the early mixed dentition period. Journal of Medical and Dental Sciences, 46, 127-137.
Li, M., Kambhamettu, C., & Stone, M. (2003 August). EdgeTrak, a program for band edge extraction and its applications. Paper presented at the Sixth IASTED International Conference on Computers, Graphics and Imaging, Honolulu, HI.
Lundberg, A., & Stone, M. (1999). Three-dimensional tongue surface reconstruction: Practical considerations for ultrasound data. Journal of the Acoustical Society of America, 106, 2858-2867.
Morrish, K.A., Stone, M., Sonies, B.C., Kurtz, D., & Shawker, T. (1984). Characterization of tongue shape. Ultrasonic Imaging, 6, 37-47.
Munhall, K.G. (1985). An examination of intra-articulator relative timing.
Journal of the Acoustical Society of America, 78, 1548-1553.
Neuschäfer-Rube, C., Wein, B.B., Angerstein, W., Klajman, S., & Fischer-Wein, G. (1997). Sektorbezogene Grauwertanalyse videosonographisch aufgezeichneter Zungenbewegungen beim Schlucken. HNO, 45, 556-562.
Parush, A., & Ostry, D.J. (1993). Lower pharyngeal wall coarticulation in VCV syllables. Journal of the Acoustical Society of America, 94, 715-722.
Peng, C.L., Jost-Brinkmann, P.G., Miethke, R.R., & Lin, C.T. (2000). Ultrasonographic measurement of tongue movement during swallowing. Journal of Ultrasound in Medicine, 19, 15-20.
Schliephake, H., Schmelzeisen, R., Schönweiler, R., Schneller, T., & Altenbernd, C. (1998). Speech, deglutition and life quality after intraoral tumour resection: A prospective study. International Journal of Oral and Maxillofacial Surgery, 27, 99-105.
Shawker, T.H., & Sonies, B.C. (1984). Tongue movement during speech: a real-time ultrasound evaluation. Journal of Clinical Ultrasound, 12, 125-133.
Shawker, T.H., & Sonies, B.C. (1985). Ultrasound biofeedback for speech training: Instrumentation and preliminary results. Investigative Radiology, 20, 90-93.
Shawker, T.H., Sonies, B., Stone, M., & Baum, B.J. (1983). Real-time ultrasound visualization of tongue movement during swallowing. Journal of Clinical Ultrasound, 11, 485-490.
Soder, N., & Miller, N. (2002). Using ultrasound to investigate intrapersonal variability in durational aspects of tongue movement during swallowing. Dysphagia, 17, 288-297.
Sonies, B.C., Baum, B.J., & Shawker, T.H. (1984). Tongue motion in elderly adults: initial in situ observations. Journal of Gerontology, 39, 279-283.
Sonies, B.C., & Dalakas, M.C. (1991). Dysphagia in patients with the post-polio syndrome. New England Journal of Medicine, 324, 1162-1167.
Stone, M. (1990).
A three-dimensional model of tongue movement based on ultrasound and x-ray microbeam data. Journal of the Acoustical Society of America, 87, 2207-2217.
Stone, M., & Davis, E.P. (1995). A head and transducer support system for making ultrasound images of tongue/jaw movement. Journal of the Acoustical Society of America, 98, 3107-3112.
Stone, M., & Lundberg, A. (1996). Three-dimensional tongue surface shapes of English consonants and vowels. Journal of the Acoustical Society of America, 99, 3728-3737.
Stone, M., Morrish, K., Sonies, B.C., & Shawker, T.H. (1987). Tongue curvature: A model of shape during vowel production. Folia Phoniatrica, 39, 302-315.
Stone, M., & Shawker, T.H. (1986). An ultrasound examination of tongue movement during swallowing. Dysphagia, 1, 78-83.
Stone, M., Shawker, T.H., Talbot, T.L., & Rich, A.H. (1988). Cross-sectional tongue shape during the production of vowels. Journal of the Acoustical Society of America, 83, 1586-1596.
Unser, M., & Stone, M. (1992). Automated detection of the tongue surface in sequences of ultrasound images. Journal of the Acoustical Society of America, 91, 3001-3007.
Van Riper, C. (1963). Speech correction: principles and methods (4th ed.). Englewood Cliffs, NJ: Prentice-Hall.
Watkin, K.L., & Rubin, J.M. (1989). Pseudo-three-dimensional reconstruction of ultrasonic images of the tongue. Journal of the Acoustical Society of America, 85, 496-499.
Wein, B., Angerstein, W., & Klajman, S. (1993). Suchbewegungen der Zunge bei einer Sprechapraxie: Darstellung mittels Ultraschall und Pseudo-3D-Abbildung. Nervenarzt, 64, 143-145.
Wein, B., Alzen, G., Tolxdorff, T., Bockler, R., Klajman, S., & Huber, W. (1988). Computersonographische Darstellung der Zungenmotilität mittels Pseudo-3D-Rekonstruktion. Ultraschall in der Medizin, 9, 95-97.
Wein, B., Bockler, R., Huber, W., Klajman, S., & Willmes, K. (1990). Computersonographische Darstellung von Zungenformen bei der Bildung der langen Vokale des Deutschen.
Ultraschall in der Medizin, 11, 100-103.
Wein, B., Klajman, S., Huber, W., & Doring, W.H. (1988). Ultraschalluntersuchung von Koordinationsstörungen der Zungenbewegung beim Schlucken. Nervenarzt, 59, 154-158.
Whalen, D.H., Iskarous, K., Tiede, M.K., Ostry, D.J., Lehnert-LeHoullier, H., Vatikiotis-Bateson, E., & Hailey, D.S. (2004 April). HOCUS: the Haskins Optically Corrected Ultrasound System. Paper presented at the meeting of the Second Ultrasound Roundtable, Vancouver, BC.
Wrench, A. (2004 April). QMUC Matching, merging and means: Spline productivity tools for ultrasound analysis. Paper presented at the meeting of the Second Ultrasound Roundtable, Vancouver, BC.
Yang, C.S., & Stone, M. (2002). Dynamic programming method for temporal registration of three-dimensional tongue surface motion from multiple utterances. Speech Communication, 38, 199-207.

Author Note
Please address correspondence to: Tim Bressmann, Ph.D., Assistant Professor, Graduate Department of Speech-Language Pathology, University of Toronto, 500 University Avenue, Toronto, ON M5G 1V7 Canada, [email protected]
Received: November 15, 2004
Accepted: March 16, 2005

Exploring the Use of Electropalatography and Ultrasound in Speech Habilitation
Explorer l’électropalatographie et l’échographie pour l’éducation de la parole
Barbara Bernhardt, Penelope Bacsfalvi, Bryan Gick, Bosko Radanov and Rhea Williams

Barbara Bernhardt, Ph.D., School of Audiology and Speech Sciences, University of British Columbia, Vancouver, BC, Canada
Penelope Bacsfalvi, M.Sc., School of Audiology and Speech Sciences, University of British Columbia, Vancouver, BC, Canada
Bryan Gick, Ph.D., Department of Linguistics, University of British Columbia, Vancouver, BC, Canada
Bosko Radanov, B.A., School of Audiology and Speech Sciences, University of British Columbia, Vancouver, BC, Canada

Abstract
Electropalatography (EPG) and ultrasound have recently been explored as articulatory visual feedback tools in speech habilitation at the University of British Columbia’s Interdisciplinary Speech Research Laboratory (UBC, ISRL). Although research studies imply that such tools are effective in speech habilitation, most studies have utilized trained listeners. To determine the impact of speech habilitation on everyday communication, it is also important to include everyday, untrained listeners in the treatment evaluation process. Two everyday listener studies were conducted, using data from two of the exploratory UBC treatment studies. The listeners observed improvement post-treatment for some but not all speakers or speech targets. More research is needed to determine the relative effectiveness of EPG and ultrasound in speech habilitation in terms of speaker variables and treatment targets, and in comparison with each other and with different treatment methods. The current paper has two purposes: (1) to provide an overview of EPG and ultrasound in speech habilitation, and (2) to present the two preliminary listener studies, suggesting directions for future research and clinical application.
Abrégé: Le laboratoire de recherche interdisciplinaire sur la parole de l’University of British Columbia (UBC) a examiné la possibilité d’utiliser l’électropalatographie et l’échographie comme outils de rétroaction visuelle de l’articulation. Bien que des études laissent entendre l’utilité de tels outils pour l’éducation de la parole, la plupart sont fondées sur des auditeurs formés. Pour évaluer l’efficacité de la réadaptation de la parole dans la communication quotidienne, il est aussi important d’inclure des auditeurs ordinaires n’ayant pas été formés au processus d’évaluation du traitement. Le laboratoire a mené, à UBC, deux études avec des auditeurs ordinaires à partir des données de deux études exploratoires portant sur le traitement. Les participants ont noté des améliorations à la suite du traitement, mais pas chez tous les orateurs ni pour toutes les cibles. Il faut poursuivre la recherche afin de déterminer l’efficacité relative de l’électropalatographie et de l’échographie pour la réadaptation de la parole en fonction des variables des orateurs mêmes et entre les méthodes de traitement. Le présent article vise deux objectifs: (1) effectuer un survol de l’utilité de l’électropalatographie et de l’échographie dans la réadaptation de la parole, et (2) présenter les résultats de deux études préliminaires sur des auditeurs afin de proposer des orientations pour la recherche et l’application clinique.

Key Words: electropalatography, ultrasound, everyday listener, visual feedback

Rhea Williams, M.Sc., Speech-Language Pathologist, Barrie, ON, Canada

In the past several decades, a number of small studies have demonstrated the utility of visual feedback technology in speech habilitation. Studies have included participants with a variety of etiologies, for example, hearing impairment (e.g.
Bernhardt, Fuller, Loyst & Williams, 2000; Bridges & Huckabee, 1970; Dagenais, 1992; Fletcher, Dagenais, & Critz-Crosby, 1991; Volin, 1991), cleft palate (e.g., Gibbon, Crampin, Hardcastle, Nairn, Razzell, Harvey, & Reynolds, 1998; Michi, Yamashita, Imai, & Yoshida, 1993), phonological impairment (e.g., Adler-Bock, 2004; Gibbon, Hardcastle, Dent, & Nixon, 1996), or motor speech impairment (e.g., Gibbon & Wood, 2003). Technologies providing acoustic displays are perhaps most common (e.g., Maki, 1983; Shuster, Ruscello, & Toth, 1995; Volin, 1991). However, there is a growing interest in, and body of research on, visual articulatory feedback in treatment, for example, electropalatography (EPG, or palatometry), which shows dynamic tongue-palate contact patterns (e.g., Bernhardt et al., 2000; Dagenais, 1992; Fletcher et al., 1991; Gibbon et al., 1998; Hardcastle, Jones, Knight, Trudgeon, & Calder, 1989). Two-dimensional ultrasound, which can display dynamic images of tongue shapes and movements, has also become more accessible in the past 5 years for speech habilitation (Adler-Bock, 2004; Bernhardt, Gick, Bacsfalvi, & Ashdown, 2003; Gick, 2002). At the University of British Columbia, exploratory treatment studies have been conducted using EPG and/or ultrasound. One purpose of the current paper is to provide an overview of these technologies in speech habilitation as a foundation for future clinical and research applications. Although the tools are currently university-based, there may be potential for future clinical use. Queen Margaret University College in Edinburgh has been situating EPG in clinical sites throughout Britain, with links to the university research team (Cleftnet UK: Gibbon et al., 1998). One step in proceeding towards such a program in Canada is dissemination of information about the technologies and their clinical use to S-LPs and other researchers. For clinical purposes, the question is whether the technologies are worth the investment.
The research literature suggests that visual feedback technologies aid speech habilitation. However, most studies to date have been conducted with trained observers. Because an ultimate goal of speech therapy is to enhance communication in a client’s everyday life, outcomes research also needs to include the perspectives of everyday listeners (Frattali, 1998; Klasner & Yorkston, 2000; World Health Organization, 2001). Two of the exploratory UBC treatment studies had data that could be used to collect everyday listener observations. The second purpose of the current paper is to present those preliminary everyday listener observations, not as measures of effectiveness of the treatment, but as a foundation for future research and clinical studies. By discussing the two small studies in one paper, a broader perspective can be gained on everyday listener research. The paper begins with an introduction to EPG and ultrasound and a brief discussion of everyday listener research methods. The two listener studies are then presented in turn. Background information is provided on the treatment studies themselves within the context of each listener study, although space precludes a detailed discussion of them (see Bernhardt et al., 2000 and Bernhardt et al., 2003 for more details). The treatment studies were case studies conducted with the purpose of learning about the use of EPG and ultrasound in speech habilitation. Thus, they were conducted without strict experimental single-subject or group designs (e.g., the use of control groups). The projects were developmental in nature, and thus, S-LPs and researchers shared information about participants and procedures with each other throughout. The final section of the paper discusses future research and clinical implications.

Visual Feedback Technology

Dynamic Electropalatography

Different types of dynamic EPG systems have been available over the past three decades.
Older systems such as the now-unavailable Kay Palatometer ran on DOS (Kay Elemetrics, New Jersey), with more recent ones running on Windows, for example, the WIN-EPG (www.articulateinstruments.co.uk) or the Logometrix system (www.logometrix.org). The Kay Palatometer and the WIN-EPG (2000) have been used in the UBC research program, and thus the discussion focuses on these instruments. The above-mentioned systems operate in similar ways. Speakers wear a custom-fit artificial palate (figure 1).

Figure 1. The WIN-EPG and Kay artificial palates: The top of the palates represents the front part of the mouth, and the bottom, the velar area. The WIN-EPG has 62 electrodes bunched densely at the front of the palate, but with contacts across the palate to the velar area. The Kay pseudo-palate has 92 electrodes, bunched densely around the edge of the palate up to the teeth, but with contacts back to the velar area.

Although highly anomalous oral structures may preclude the wearing of artificial palates, all 22 speakers in the UBC projects have been able to wear them. The palate contains electrodes that are sensitive to tongue-palate contacts and send contact-induced electrical impulses through fine bundled wires to a computer. Tongue-palate contact patterns are displayed on-line. These displays and the accompanying acoustic signals can be stored for analysis, or as templates for practice. The Kay and WIN-EPG differ in how the palates attach, the number and type of electrode displays, and the types of analyses provided. The acrylic Kay pseudo-palate has 92 electrodes; it fits over and is held on by the upper teeth (figure 1). The acrylic WIN-EPG has 62 electrodes; dental wires hold the pseudo-palate onto the upper teeth.
Both types of palates allow distinction of non-low vowels and alveolar to velar consonants from one another; the Kay palate, in addition, provides displays of dental consonants. The EPG systems typically provide acoustic displays (e.g., a waveform displaying intensity), with the WIN-EPG also providing off-line spectrographic displays and detailed quantitative analyses of contact patterns. The following section describes typical maximal target contact patterns and some aberrant patterns for English lingual consonants and vowels, as observed by the authors during the treatment studies (see figures 2a-2h, 5a-d and 6a-d). EPG images for the current paper are taken from the currently used machine, the WIN-EPG, which has much easier image exporting capabilities than the previous machines. Although exact tongue-palate contact patterns for speech targets vary within and between speakers, the contact patterns are similar in configuration and region. Alveolar targets /t/ (figure 2a), /d/ and /n/ show a horseshoe contact pattern, with the tongue touching the alveolar ridge and the sides of the upper dental arch to the back of the molar region. The alveolar sibilants /s/ (figure 2b) and /z/ show a similar contact pattern, but have more contact on the sides of the palate, creating a groove primarily in the alveolar area. This groove varies somewhat in size and location across speakers (McLeod, Roberts & Sita, 2003). The contact pattern for /l/ depends on context (figure 2d). Prevocalic /l/ generally has alveolar contact similar to /t/, /d/ and /n/, full or near-full lateral contact on one side of the upper dental arch, and reduced posterior lateral contact on the other side of the dental arch. Postvocalic “dark” /l/ tends to show posterior velar contact at the beginning of the articulation, followed rapidly by the contact pattern for prevocalic /l/.
The palatoalveolars /ʃ/ (figure 2c) and /ʒ/ show broad, post-alveolar contact along the sides of the upper dental arch, and a wider groove (i.e., less contact) than the alveolar sibilants. The affricates /tʃ/ and /dʒ/ vary across speakers. Some speakers show a stop-sibilant contact pattern in the palatoalveolar area (Hardcastle, Gibbon & Scobbie, 1995), while others show movement backwards from the /t/ or /d/ contact area to a post-alveolar /ʃ/ or /ʒ/ contact area.

Figure 2a-2h. Electropalatograms for North American English (a) /t/, (b) /s/, (c) /ʃ/, (d) prevocalic /l/, (e) /k/, (f) prevocalic /r/, (g) /i/, and (h) //. The top of the figure represents the alveolar area, and the bottom, the velar area. The black squares indicate tongue-palate contact.

English /r/ tends to have lateral contact along the back molars and a wide channel with no contact in the middle of the tongue (figure 2f). The velar consonants (figure 2e) tend to have continuous contact along the back of the pseudo-palate region in the context of back vowels, although for some speakers, this contact is not visible if their pseudo-palate does not extend far enough back. In the context of front vowels, the velars show continuous contact from the velar through the palatal regions. The front vowels and /j/ show bilateral contact about halfway along the palate towards the front, and a wide mid-channel with no contact; the back vowels and /w/ show minimal contact in the velar region and a wide mid-channel with no contact (figures 2g, 2h). Tense vowels have the same type of contact pattern as their lax vowel cognates, but are produced further forward along the dental arch. Low vowels are generally not visible because they have no tongue-palate contact in most speakers; for some speakers, mid vowels have no visible contact.
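The zone-by-zone contact descriptions above lend themselves to simple quantitative summaries of a single palatogram frame. The sketch below is purely illustrative: it assumes a frame stored as an 8×8 binary grid, whereas the actual WIN-EPG and Kay palates have 62 and 92 electrodes in irregular layouts; the row and column boundaries and the function names here are hypothetical, not part of either system's software.

```python
# Illustrative summary of one EPG frame stored as an 8x8 binary grid
# (1 = tongue-palate contact). Row 0 is the alveolar end, row 7 the
# velar end; the zone boundaries below are hypothetical.

def contact_fraction(frame, rows):
    """Fraction of electrodes contacted within the given rows."""
    cells = [frame[r][c] for r in rows for c in range(8)]
    return sum(cells) / len(cells)

def has_midline_groove(frame, rows=range(0, 3)):
    """True if an anterior row shows lateral contact but a free
    mid-line channel, as described for the sibilant /s/."""
    for r in rows:
        lateral = frame[r][0] and frame[r][7]   # outermost electrodes
        channel_open = not any(frame[r][3:5])   # centre electrodes free
        if lateral and channel_open:
            return True
    return False

# A stylized /s/-like frame: horseshoe contact with an alveolar groove.
s_frame = [
    [1, 1, 1, 0, 0, 1, 1, 1],   # alveolar rows: groove in the middle
    [1, 1, 0, 0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0, 0, 1, 1],
    [1, 0, 0, 0, 0, 0, 0, 1],   # lateral contact only
    [1, 0, 0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],   # velar row: no contact
]

print(round(contact_fraction(s_frame, rows=range(0, 3)), 2))  # 0.58
print(has_midline_groove(s_frame))                            # True
```

Summaries of this kind are one way the quantitative contact analyses mentioned for the WIN-EPG could be imagined; the system's actual analysis routines are documented by its manufacturer.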
During speech therapy, the target tongue-palate contact pattern is demonstrated by a typical speaker with an artificial palate, and the client is encouraged to emulate that target. If the client produces an acceptable variant of the target phone with a different tongue-palate contact configuration from that of the model, the client’s production becomes his or her own template. For further information on EPG, consult http://sls.qmuc.ac.uk/epg/epg0_big.html.

Dynamic Two-Dimensional Ultrasound

The following description provides an overview of the functioning of two-dimensional ultrasonography for speech displays. To display speech or other lingual movements with ultrasound, a transducer is held by the speaker or attached to a fixed arm or stand so that it contacts the undersurface of the speaker’s chin. The transducer is coated with water-soluble ultrasound gel to enhance the signal. Sound waves are transmitted by the transducer up through the oral tissue. Echo patterns from sound waves returning from the tongue surface are converted to moving images that are visible on an ultrasound screen (see figures 3, 4 and 7). (See also Stone, 2005.)

There are a number of different dynamic two-dimensional ultrasound machines, with large differences in price, depending on the size and complexity of the machine and transducer. Three different machines have been used in the UBC treatment studies. For Study #2 in the current paper, the Aloka SSD-900 portable ultrasound was used, along with a 3.5 MHz convex intercostal transducer probe. The range and gain were adjusted to give the clearest image for the tongue surface across assessments, for example, a range of 11 centimeters and a gain of 60. The simultaneous audio signal was recorded onto VHS tape at 30 frames per second from the ultrasound machine (JVC Super VHS ET Professional Series, SRVS20), and also onto a digital video source using a ProSound YU34 unidirectional microphone amplified through the built-in pre-amplifier in a Tascam cassette deck. In subsequent studies, two other machines have been utilized: (a) a portable Sonosite 180 Plus with a Sonosite C15/4-2 MHz MCX transducer (for treatment only), and (b) a stationary Aloka Pro-Sound SSD-5000 with a UST-9118 endo-vaginal 180-degree convex array transducer (see Bernhardt, Gick, Adler-Bock & Bacsfalvi, 2005).

Figure 3a-3g. Ultrasound images of North American English: (a) /k/, (b) /t/, (c) /s/ (coronal view), (d) /l/, (e) /r/, (f) /u/, (g) /ʊ/. The tongue tip is on the right in all sagittal images. The straight line for /k/ approximates the velar area. The straight line for /t/ approximates the alveolar area. 3c shows a mid-line groove for /s/. Note the complex shapes of /l/ and /r/ (3d, 3e). The /u/ has an advanced tongue root and higher tongue body than /ʊ/ (3f, 3g).

Either mid-sagittal (figures 3a, 3b, 3d, 3e, 3f, 3g) or coronal-oblique (figure 3c) views of the tongue shape and movement patterns can be displayed on a screen. The sagittal view displays a side view of the tongue, showing tongue height, backness, and slope; this view has been used most in the UBC studies. The tip or root may be obliterated in the display, due to the limited field of view or by the jaw and hyoid shadows. However, the general shape, position, height and slope provide useful feedback to the speaker. Slope has been found to be especially relevant for /l/, /r/ and the vowels in the ISRL studies. The coronal view shows a cross-section of the tongue, and thus flatness, mid-line grooving, or lateral elevation or depression of the tongue. This view has been useful for showing mid-line tongue grooving for sibilants and /r/, plus relative vowel height.
Reference points or lines or palatal contour sketches derived from images of swallows are sometimes added to the display. This is done either by attaching overhead transparencies with drawings of the palatal arch or target tongue positions or shapes to the screen, or by using reference lines generated by the ultrasound machine itself. For example, if a certain tongue height is optimal for a particular target, a reference line will be placed on the screen that the speaker is to ‘hit’ with his or her tongue body or tip. (See the lines in figures 3a, 3b and 3e, for example.) As with EPG treatment, a typical speaker provides examples for the client to emulate. Images or movies can be stored and used for future reference. Typical and aberrant ultrasound images are shown in figures 3, 4 and 6 (see also Bernhardt et al., 2005), and are described below. The tongue body movement and height for the velars contrast visibly with the tongue tip movement and height for the alveolars, as shown in figures 3a and 3b respectively. For sibilants, the varying width of the groove for the alveolar and post-alveolar sibilants is visible in the coronal view, as in figure 3c. The sagittal view shows the relative front-back position of the apical end of the tongue and helps distinguish alveolar from post-alveolar fricatives. For affricates, the display shows the relative backness of the tongue and any change in movement from a more anterior position for the stop portion of the affricate to a more posterior position for the fricative portion. The English /l/ and /r/ are complex articulations with multiple components that differ across word position and speakers (figures 3d and 3e; also see figure 4, which shows a sample aberrant /r/ pre-treatment and accurate /r/ post-treatment from Adler-Bock, 2004). For both liquids, the sagittal view typically shows a two-point displacement of the tongue in the tip/blade and root regions (Stone & Lundberg, 1996).
For /l/, the relative timing of the anterior and posterior constrictions can also be seen in the sagittal view (figure 3d; Sproat & Fujimura, 1993; Gick, 2003). The coronal view shows the lateral dip of one or both sides of the tongue for /l/, a dip which is usually towards the posterior portion of the tongue body. The English /r/ can be produced with a more retroflexed or bunched position, and is generally articulated with three separate constrictions along the vocal tract (Delattre & Freeman, 1968): labial, central and pharyngeal. The shape of the tongue in the region of each of the lingual constrictions is visible on ultrasound (figure 3e). A posterior and relatively wide mid-line depression is another important component of /r/ and is visible in the coronal view. With respect to vowels, the sagittal view provides a view of the whole tongue as it advances and retracts, and moves through various heights. The sagittal view thus displays the tongue root as it advances and retracts for tense and lax vowels respectively, and in addition, shows the higher tongue body position for tense vowels (compare the height and backness of /u/ and /ʊ/ in figures 3f and 3g, and the same for /i/ and /ɪ/ in figure 7). The coronal view also shows the relative height of the sides of the tongue; the sides are higher for the high and tense vowels. Additional information on ultrasound is available in the Volume 19, 2005 issue of Clinical Linguistics and Phonetics and at the following websites: www.linguistics.ubc.ca/isrl/UltraSoundResearch/; http://speech.umaryland.edu/research.html; www.slp.utoronto.ca/People/Labs/TimLab; www.sls.qmuc.ac.uk/RESEARCH/Ultrasound.

Figure 4a, 4b. Ultrasound images of a pre-treatment vocalic substitution for /r/ (4a) followed by a post-treatment on-target /r/ production (4b: Victor, Adler-Bock, 2004).

Everyday Listener Observations

An ultimate goal of speech therapy is to enhance communication in a person’s everyday life.
The observations of everyday listeners are thus important in the treatment evaluation process. A variety of methods can be used to obtain such observations, from studies of speech intelligibility or comprehensibility to more qualitative approaches such as interviews, questionnaires or focus groups. Speech intelligibility measures were used in the two studies below; consequently, the discussion here focuses on those methods. Speech intelligibility is generally evaluated with some kind of identification task or rating scale (Kent, Miolo & Bloedel, 1994). For identification tasks, words may be presented in isolation or in connected speech in a variety of listening conditions. Listeners may be asked to transcribe orthographically what they hear, or to select responses from closed sets in computerized or non-computerized protocols. Kent et al. (1994) suggest that identification tasks can provide information about specific words and phonemes, but not about the speaker’s general conversational competence. General scalar ratings of connected speech samples, in contrast, can provide holistic appraisals of speech, because the listener can take nonsegmental factors into account, such as intonation, rate, and rhythm. The general rating scales, however, give minimal information on specific words and segments. Rating scales may also be more subject to listener and context biases than identification tasks (Schiavetti, 1992; Kent et al., 1994). Some rating scales may provide more information on specific speech targets (Black, 1999; Ertmer & Maki, 2000). Ertmer and Maki note that progress in treatment often has an intermediate phase, in which “closer—though still distorted approximations” may precede fully acceptable variants of the target (2000, p. 1514).
They constructed a 3-point rating scale for evaluating treatment of specific targets: “0” (omission or substitution of the target), “1” (“improved but not fully acceptable”), or “2” (“a fully acceptable” variant of the target) (p. 1514). This method also has inherent biases. First, the speech target is known, potentially influencing the listener responses. Furthermore, listeners may vary in their definition of acceptability, depending on their background and their understanding of the task. Nevertheless, the task does allow specific speech targets to be rated without phonetic transcription, making it usable by everyday listeners. One of each type of task was selected for the current listener studies in order to assess the utility of each for evaluating data from treatment studies. The first study used a single word identification task. The second study adopted a 3-point rating scale similar to the Ertmer and Maki (2000) scale. Based on formal (Bernhardt et al., 2003) and informal trained listener observations, it was predicted that everyday listeners would identify more post-treatment words in study #1, and rate post-treatment speech samples more highly in study #2. It was also predicted that their responses would differ according to the various speakers. However, it could not be predicted to what degree pre- and post-treatment listener observations might differ, or how listeners might react to individual speakers or segment (phoneme) types in the two studies. These preliminary listener responses would serve as a foundation for future research questions and methods. The two studies are discussed in turn below. Study #1: Speech Habilitation Using EPG with a Heterogeneous Group of Speakers Treatment Study Background for Listener Study #1 The first treatment study included a heterogeneous sample of 7 speakers in terms of etiology, age and speech target types (see table 1). 
Inclusion of a variety of speaker types in the first exploratory study allowed the research team to gain preliminary insight into the overall scope of the technology for future studies. Three speakers had hearing impairments, four had cleft palates, and two had motor impairments, thus reflecting the variety of impairments reported in the EPG literature. Four speakers were 18 or over, and three were 8-9 years of age. Two adults had mild speech impairments (pseudonyms Stan and Delia), one had a moderate speech impairment (Devon), and one a severe speech impairment (Samantha). Among the children, two had severe speech impairments (Dora, Sandy), and one a moderate speech impairment (Dana). All speakers had received a minimum of 3 years of prior speech therapy. Phonological analysis (Bernhardt & Stemberger, 2000) and the palatograms served as a basis for setting individualized targets for treatment. Speech targets included: (1) alveolars /t/, /d/, /s/, /l/, with all speakers having at least one alveolar target, (2) palatoalveolars /ʃ/ and/or /tʃ/ and /dʒ/ for Stan, Delia, Dora, Devon, (3) velars /k/ and/or /g/ for Delia, Samantha, Sandy, Devon, (4) // for Sandy and Delia, and (5) /r/ for Dora and Samantha. One of the two S-LPs associated with the project conducted a traditional articulation therapy baseline (Bernthal & Bankson, 2004) over 6-8 sessions, focusing on one or more of the speakers' targets. The S-LPs modeled targets for imitation, provided visual, auditory and tactile cueing, and gave oral, written or sign language feedback as indicated. For the children, games and books were used to engage their interest. The children's and Devon's family members attended the sessions. Narrow phonetic transcription by the investigators at the end of the baseline period showed no change for identified targets. A 20-session palatometry program was then conducted over 14-16 weeks at the university by one of the same two S-LPs in consultation with the first author.
The case study protocol for each speaker consisted of two 4-week treatment blocks with two treatment sessions per week, a 1-to-3-week treatment break, and a 4-session weekly maintenance phase. More than one target was included for each client in each block in a semi-cyclic approach to treatment. The training time for specific targets was adjusted to speaker needs, with targets being revisited in the second treatment block as required. The palatometer was considered an adjunct to the articulation therapy program, which was conducted with the same general therapy techniques used in the baseline. Participants practised the given targets as isolated segments, and then in syllables, words, sentences and conversation, both with and without the palatometer. As the client progressed at each level of complexity (e.g., segment, syllable, etc.), targets were practised more often without the artificial palates within sessions. From the beginning, home practice was encouraged without the artificial palates for those targets that showed some success in the sessions. For example, a speaker might be asked to practise oral movements, segments, syllables, words or phrases, with a family member providing feedback.

Table 1
Study #1 speaker characteristics and EPG treatment targets

Speaker's pseudonym | Age | History | Major speech patterns | Severity and EPG targets
Stan | 19 | Cleft lip and palate. Mild nasal emission. | Alveolars and palatoalveolars: mid-palatal substitutions. Sibilants lateralized. | Mild; /t,s,st,,d/
Delia | 29 | Cleft palate. | /t/, /d/, /k/, /t/: glottalized. Sibilants ungrooved, often pharyngealized. | Mild; /,t,d,s,z,k,/
Devon | 40 | Cerebral haemorrhage. | Imprecise articulation, voicing mismatches, hypernasality, suprasegmental aberrations. | Moderate; /t,d,s,,k,/
Samantha | 18 | Rubella: profound hearing loss, mild oral-motor left-sided weakness, palatal lift. ESL: Oral (some sign). | Many deletions, substitutions. Nasal emission without palatal lift. Weak articulation. | Severe; /s,,k,r/
Dana | 8 | Cleft lip and palate. Fistula. Malocclusion. Pharyngoplasty, speech bulb reduction. | Alveolars > mid-palatal or interdentals. //, /r/, prevocalic /l/: velarized. | Mild; /t,d,s,l/
Sandy | 9 | Klippel-Feil Syndrome. Cleft lip and palate. Fistula. Malocclusion. Mild-moderate hearing loss (aided). | Velar stops, fricatives, affricates > glottal stops, pharyngeal fricatives, nasal fricatives. Alveolar stops dentalized. Labials accurate. Prevocalic /r/ > glide. | Severe; /,s,k,/
Dora | 8 | Cytomegalovirus: cochlear implant age 3;11. Total communication since preschool. | Liquids, some final consonants, cluster elements: deleted. /s/, /z/: weak or deleted. Velars > alveolars. /i/ > [m]; /n/ > [f]. | Severe; /t,d,s,,t,,r/

Everyday Listener Study #1 Method for everyday listener study #1. Williams (fifth author) conducted the everyday listener study described below as part of her master's thesis project (Williams, 1998). She organized the stimuli, ran the experiment and conducted initial analyses. The first author conducted further analyses for the present paper. Eight male and eight female adult listeners participated. These listeners had a mean age of 25 and Grade 12 education or higher, spoke English as their first language, and had had no prior experience with disordered speech. Their hearing was assessed as normal, with a pure tone screening level of 20 dB HL from 500-6000 Hz, speech reception thresholds at 20 dB HL or better, and speech discrimination scores of 88% or better. The listener task involved open-set word identification. Word stimuli for the task came from two audio-recordings of each speaker pre- and post-treatment, made with a Marantz audiotape recorder and PZM 331090B microphone in the therapy room.
Because there were no observable changes in the baseline period, the listener data included the pre- and post-treatment assessment samples only, in order to reduce listener time requirements. The stimuli were taken from a list of 164 single words not used in therapy that included multiple exemplars of English consonants and clusters across word positions (Bernhardt, 1990). A unique list of stimulus words was selected semi-randomly for each speaker. Each speaker's list contained the same pre- and post-treatment words, and seven to ten consonant treatment targets. Different words were chosen for each speaker, because listeners could potentially have learned words from the more intelligible speakers during the task, enabling them to identify those words when spoken by less intelligible speakers. Ten was the maximum number of words per speaker that could be selected in order to include all of a speaker's treatment targets while avoiding repetition of words across speakers. Because of the individual case study design of the treatment study and the heterogeneity of the speaker group, listeners were asked to make within-speaker judgments. This may have resulted in a practice effect, that is, some within-speaker familiarization for the listeners. The practice effect was diminished by randomizing stimuli within the pre- and post-treatment word sets, counterbalancing speaker order across listeners, and presenting all pre-treatment samples in the first of two listening sessions. The latter procedure was considered potentially less biasing than having randomized pre- and post-treatment words from the same speaker in the same small word sets, where clearer post-treatment pronunciations could potentially give the listener cues to pre-treatment words. A 2-3 day interval between testing sessions further reduced potential practice effects. The audio-recorded stimuli were digitized using the Computerized Speech Research Environment 45 (CSRE45) software (1995) and Tucker Davis Technologies (TDT) hardware (1994) at a sampling rate of 20 kHz. Sound files were edited and analyzed in the Ecoscon program in CSRE45. The sound files were attenuated or amplified during pre-processing so that pre- and post-treatment stimulus pairs could be presented at similar levels of intensity (Williams, 1998). Fourteen blocks were designed within Ecosgen, a stimulus presentation protocol that is part of CSRE45. Seven blocks contained the pre-treatment words (one block per speaker), and seven the corresponding post-treatment words. Stimuli were presented to the listeners using Ecosgen via the TDT hardware. Participants listened through Madsen TDH 39P 10W headphones in a double-walled Industrial Acoustical Company (IAC) sound booth. During the task, listeners faced the computer screen, which showed one large square labeled 'NEXT'. Listeners were told that there were seven blocks in each session, that all stimuli in a given set came from one speaker, and that they could decide when to proceed to the next word. They were asked to write down the words that they heard. Each listening session lasted 60-80 minutes, with breaks on request. Results and discussion of listener study #1. Analyses were performed within-speaker only, because of the small size and heterogeneity of the sample (see table 2). The prediction had been that listeners would identify significantly more words and target segments in the post-treatment word sets across speakers. Listener responses varied by speaker. For pre-treatment stimuli, average words correctly identified across listeners showed a bimodal split. For speakers with a mild-moderate impairment (Stan, Delia, Devon and Dana), listeners identified an average of 6 to 7 of 10 words per speaker. For speakers with a severe impairment (Samantha, Dora and Sandy), listeners identified 0 to 2 of 10 words per speaker. The average number of correctly identified words was higher for all speakers in the post-treatment samples, but to different degrees. The non-parametric Wilcoxon's Signed Ranks test was used to compare the pre- and post-treatment listener observation sets, because of the small size and heterogeneity of the sample. The adolescents and adults showed a small, non-significant increase, although for Devon, the increase approached significance (p = .053).

Table 2
Average speaker words and treatment targets identified across the 16 listeners in pre- and post-treatment samples

Speaker | Avg. pre-tx words identified (/10) | Avg. post-tx words identified | Avg. treatment target segments identified pre-tx | Avg. treatment target segments identified post-tx
Stan | 6.63 (0.96) | 6.94 (1.61) | 9.23/10 (0.66)a | 9.54/10 (0.74)*
Delia | 7.44 (0.96) | 7.88 (0.50) | 6.25/8 (0.45) | 6.0/8 (0.85)
Devon | 6.63 (1.30) | 7.44 (1.15) | 6.38/8 (1.02) | 6.93/8 (1.16)
Samantha | 0.38 (0.50) | 0.75 (0.68) | 0.81/9 (0.66) | 2.07/9 (1.03)**
Dana | 6.50 (1.67) | 9.44 (0.51)** | 8.25/10 (1.81) | 9.87/10 (0.35)**
Sandy | 2.44 (1.26) | 3.63 (1.26)* | 1.34/8 (1.13) | 3.27/8 (0.96)
Dora | 1.06 (0.57) | 5.44 (1.15)** | 0.63/10 (1.09) | 6.07/10 (1.16)**

Note: tx = treatment. Parentheses = standard deviation. Each number represents the average number of words or segments identified across the 16 listeners out of the set of 8-10 potential words or segments.
a Treatment targets varied in number from 7 to 10 per speaker within word sets.
* Wilcoxon's Signed Ranks: p < .05
** Wilcoxon's Signed Ranks: p < .007 (from .000 to .006)
The children showed a significant increase: to 9.44/10 for Dana (p = .001), 5.44/10 for Dora (p < .001), and 3.63/10 for Sandy (p = .013). Standard deviations for words identified ranged from 0.57 to 1.67 words across listeners, suggesting that listeners were in close agreement. In terms of consonant treatment target identification, there was a similar split by speaker across listeners. For pre-treatment stimuli, the average numbers of target consonants correctly identified across listeners were as follows: (1) for Stan, Delia, Devon and Dana: over 75% (from 6.25/8 to 9.23/10), and (2) for Samantha, Dora and Sandy: under 17% (from 0.63/10 to 1.34/8). Post-treatment, more segments were correctly identified for all speakers but Delia, for whom there was a slight non-significant decrease in consonant identification. The increase was significant according to a Wilcoxon's Signed Ranks test (p < .05) for all but Devon, who showed a near-significant increase (p = .06). The small sample, case study design and preliminary nature of the listener study preclude in-depth statistical analyses or interpretation of the data for the various speakers. The slight and non-significant increases in word identification for Stan, Devon, Samantha and Delia may have reflected a listener practice effect for those speakers, rather than an actual improvement. Listeners did, however, identify significantly more target consonants in the two adolescents' words (Stan and Samantha), which may be a reflection of positive treatment effects. The increase in the number of words and segments identified for the children suggests a change above and beyond a practice effect, which may or may not be attributable to the program. The differential response of the listeners to various speakers suggests that future studies will need to include larger numbers of speakers of different ages, etiologies, severities and phonological profiles.
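The pre-post comparisons above rest on the Wilcoxon Signed Ranks test. Purely as an illustration (the studies used standard statistical software, not this code; the function name and data below are hypothetical), the core of the statistic can be sketched in a few lines of Python:

```python
def wilcoxon_signed_ranks(pre, post):
    """Wilcoxon Signed Ranks statistic for paired samples.

    Zero differences are dropped; tied absolute differences receive
    average ranks. Returns (W, n), where W is the smaller of the
    positive and negative signed-rank sums and n is the number of
    non-zero pairs. W is then compared against a critical value
    (or converted to a p-value) for the given n.
    """
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    ranked = sorted(abs(d) for d in diffs)

    def avg_rank(v):
        # average rank of all occurrences of |difference| == v
        idx = [i + 1 for i, r in enumerate(ranked) if r == v]
        return sum(idx) / len(idx)

    w_pos = sum(avg_rank(abs(d)) for d in diffs if d > 0)
    w_neg = sum(avg_rank(abs(d)) for d in diffs if d < 0)
    return min(w_pos, w_neg), len(diffs)
```

In practice a statistics package would be used for the exact significance levels; the sketch only shows why paired pre- and post-treatment scores per listener are required.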
The listeners' differential responses also show that everyday listeners can contribute useful data about individuals in an evaluation process. In terms of the listening tasks, word and segment identification can give specific information that may be useful in determining treatment effects. The issue of practice effects is a challenge in listening tasks. Randomization of stimuli may help diminish such effects, but even randomized stimuli may influence each other in small word sets, suggesting that larger word sets are needed in future studies. The results from this listener study suggested that further exploration of visual feedback technology was warranted. Study #2: Speech Habilitation Using EPG and Ultrasound with Adolescents with Hearing Impairment An Interdisciplinary Speech Research Laboratory (ISRL) was funded at UBC in 2001, making new equipment available for treatment studies, including two-dimensional ultrasound and WIN-EPG (2000). A follow-up exploratory project was initiated incorporating both EPG and ultrasound, not to compare the two technologies experimentally, but to learn more about each new technology as a basis for future studies. Treatment Study Background for Study #2 The first treatment study had shown a difference between child and adult participants, with the adolescents (Stan and Samantha) showing slightly greater treatment effects than the other adults. In order to gain more insight into the relevance of age, four adolescents aged 16-18 were recruited for the second treatment study. The students for the second study were a homogeneous group in terms of age and etiology (hearing impairment). The three males had aided hearing levels in the moderate range (Palmer) or moderate-to-severe range (Purdy, Peran). The female participant, Pamela, had a fluctuating and progressive sensorineural hearing loss due to Large Vestibular Aqueduct Syndrome. Her aided thresholds up to 2000 Hz sloped downwards from normal to the mild loss range in the better ear.
In the other ear, her aided thresholds were in the moderate to severe loss range. Across speakers, audio-recorded baseline speech samples showed mild to moderate suprasegmental aberrations in terms of voice quality, intonation, nasalization, loudness and/or pitch control. Sibilants and liquids were the least well-established consonant categories. Vowels tended to be centralized and/or lowered, and the tense-lax distinction for vowels was only weakly established. Pamela and Purdy were intelligible most of the time in conversation; Peran and Palmer were intelligible some of the time in conversation. The adolescents attended an oral high school program for deaf and hard of hearing students, with partial mainstreaming and speech-language support. They had received at least 12 to 15 years of prior speech habilitation. The second author was their current speech-language pathologist. Prior to the visual feedback treatment study, this S-LP conducted a 5-week traditional treatment baseline, targeting /l/ and sibilants. Speakers showed slight gains on consonants that they could already pronounce on occasion pre-treatment. (See Bernhardt et al., 2003, for more details.) The speech habilitation study used both WIN-EPG and ultrasound. There were 14 weekly individual treatment sessions at the ISRL, with follow-up sessions at the school without the use of technological feedback. All speakers had the same treatment targets. Consonant treatment targets included the voiceless coronal fricatives /s/ and /ʃ/ and the approximants /l/ and /r/. For vowels, the tense-lax distinction was targeted in the pairs /i/-/ɪ/ and /u/-/ʊ/. Additional data were collected as controls: /k/ and the alveolar stops, /t/, and all other vowels. Treatment targets were generally counterbalanced across equipment and order across speakers, although approximants were addressed second for all participants.
Pamela and Palmer spent six sessions solely with EPG (sibilants, vowels), and three sessions solely with ultrasound (liquids); Peran and Purdy spent six sessions solely with ultrasound (vowels, sibilants) and three solely with EPG (liquids). For the final five sessions, all participants alternated within sessions between ultrasound and EPG. The difference in time allotment by equipment provided an opportunity to make a qualitative, non-experimental comparison between the two technologies. During treatment, the first and second authors (S-LPs) modeled targets and provided feedback, using speech, sign and written information. Targets changed in complexity from silent movements to isolated segments to syllables, words and finally phrases. Prior to the everyday listener study reported here, trained listeners evaluated the pre- and post-treatment data using phonetic transcription (Bernhardt et al., 2003). The transcriptions indicated a 50% gain in consonant target accuracy post-treatment for Pamela, Purdy and Palmer, and a 28% gain for Peran. In vowel target accuracy, there was a 28% gain for Purdy and Palmer, a 16.9% gain for Pamela, and a -1% regression for Peran (Bernhardt et al., 2003). Across speakers, the trained listeners' transcriptions suggested that the most improved consonant was /r/, and the most improved vowel was /ʊ/, followed by /i/ and /u/. The question for the current study was whether everyday listeners would notice these or similar improvements. Everyday Listener Evaluation in Study #2 Method for listener study #2. Research assistants who had not helped with the treatment study organized the stimuli for the everyday listener study. Ten native English-speaking everyday listeners between 20 and 45 years of age participated (6 men, 4 women).
All had post-secondary education, no familiarity with disordered speech, and normal hearing as screened at 25 dB HL from 500 to 4000 Hz. Nonsense word samples from the ultrasound recordings were selected for the everyday listener study, because they had also been used in the trained listener study (Bernhardt et al., 2003). The stimuli were controlled in terms of phonetic context (/h/, /ɑ/ and /b/), and thus it was assumed that the listener could focus on the test segment. The following targets were included: vowel treatment targets /hib, hɪb, hub, hʊb/, vowel observation targets /heb, hɛb, hɑb/, consonant treatment targets /sɑ, hɑs, ʃɑ, hɑʃ, lɑ, hɑl, rɑ, hɑr/ and consonant observation targets /tɑ, hɑt/. The audio-recorded sound files were transferred to a Macintosh computer using Adobe Premiere 6.0 and Macromedia SoundEdit 16. The sound files differed within and across speakers in terms of degree of background noise. This discrepancy was the result of different recording levels during the probes rather than different signal-to-noise ratios, because recordings were made under constant ambient noise conditions. Overall signal amplitude was reduced for the louder files using SoundEdit 16, yielding equally loud tokens across the listener sessions and bringing relative background noise to within 3 dB for all tokens. As was the case with the first listener study, the small n favored within- rather than between-speaker analyses. The stimuli were entered into PsyScope 1.2.5 (1993). However, because PsyScope could not easily identify the source of individual tokens after randomization, listeners were asked to make only within-speaker judgments for the four individual speakers. Data from a fifth speaker were used for a short training session. The within-speaker rating procedure may have resulted in a practice effect for the listeners. However, speaker order was counterbalanced across listeners, and pre- and post-treatment tokens were randomized within the same sets.
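The amplitude equalization described above — attenuating the louder files so that tokens are presented at similar levels — amounts to matching RMS levels across files. As an illustrative sketch only (not the SoundEdit 16 procedure actually used; the function names are hypothetical and samples are assumed to be floating-point values):

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def scale_to_target_rms(samples, target_rms):
    """Attenuate or amplify a token so its RMS matches the target level."""
    factor = target_rms / rms(samples)
    return [s * factor for s in samples]

def db_difference(a, b):
    """Level difference in dB between the RMS amplitudes of two tokens."""
    return 20.0 * math.log10(rms(a) / rms(b))
```

Scaling every token to a common target RMS equalizes presentation level; `db_difference` is the kind of check used to confirm that levels (and relative background noise) fall within a tolerance such as the 3 dB reported above.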
In order to ensure data recoverability from PsyScope, word-initial consonants, word-final consonants and vowels were placed in separate sets. The five target consonants (/s, ʃ, l, r, t/) yielded 10 consonant sets, that is, 5 sets of word-final consonants and 5 sets of word-initial consonants. Each consonant set contained 20 randomized tokens, 10 pre- and 10 post-treatment. Equal numbers of the five consonant syllables were presented in random order. There were seven vowel targets, and thus seven vowel sets. Each set contained 20 vowel syllable targets, 10 pre- and 10 post-treatment tokens, with all vowels represented as indicated above. Because of the 20-token limit per set, one vowel could only be tested twice in each set. However, all listeners heard all seven sets from each speaker, ensuring that listeners rated all available vowel tokens over the seven sets. A 3-point judgment scale was adopted for the listener evaluation, following Ertmer and Maki (2000). The 3-point scale allowed listeners to provide an intermediate rating for tokens that were somewhat like the target. Listeners attended two testing sessions of about 90 minutes each. In the first session, they practised with the training set, and rated all test sets from two speakers. In the second session, they rated the sets from the other two speakers. Listeners wore Koss UR-20 headphones and sat in an IAC sound booth. Instructions were given orally and on the computer screen. Listeners were instructed to focus on the target (e.g., the "FINAL CONSONANT" of the word), and to ignore the rest of the word. They were told to press "1" if the stimulus sounded "EXACTLY" like the target, "2" if the stimulus sounded "SOMEWHAT" like the target, and "3" if the stimulus sounded "NOT AT ALL" like the target. They could escape from the program at any time.
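The construction of the 20-token presentation sets described above — 10 pre- and 10 post-treatment tokens shuffled together, with each token's treatment phase hidden from the listener but recoverable by the experimenter — can be sketched as follows. This is an illustrative sketch, not the PsyScope script used in the study, and the function name is hypothetical:

```python
import random

def build_set(pre_tokens, post_tokens, seed):
    """Mix pre- and post-treatment tokens into one randomized
    presentation set. Each token is tagged with its phase so the
    source of every response remains recoverable after shuffling."""
    tokens = [(t, "pre") for t in pre_tokens] + [(t, "post") for t in post_tokens]
    rng = random.Random(seed)  # seeded, so the order is reproducible
    rng.shuffle(tokens)
    return tokens
```

Seeding the random generator per set makes the presentation order reproducible, which is exactly the property needed to trace a lost or ambiguous response back to its token — the recoverability problem the study ran into with PsyScope.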
For consonants, the printed consonant syllable appeared in English orthography on the computer screen at the same time as the sound recording was played back (e.g., saw, hoss, shaw, etc.). The listener then registered his or her rating on the computer keyboard (1, 2 or 3) within 5 seconds. In pilot testing, trained listeners could not respond sufficiently quickly to the nonsense word vowel stimuli. Thus, for the vowels, a familiar word containing the target vowel was presented on the screen. The listener was asked whether the vowel heard in the auditory stimulus was the same as the one in the word on the screen: /i/ - need, /ɪ/ - jig, /u/ - rude, /ʊ/ - good, /e/ - raid, /ɛ/ - bed, and /ɑ/ - log. No other external reference was given to the listeners, in order to ensure that they would use their own internal reference without experimenter biasing. With PsyScope 1.2.5, responses occasionally did not register because of the speed of a response. Unfortunately, if a given stimulus set had a missing response, the program could not identify which item was missing, and thus all data from that set had to be eliminated. The number of incomplete consonant sets was low (5.8%), but 21-25% of vowel sets across speakers had missing responses. There was no difference in terms of missing response sets across speakers, and no individual speaker's vowel data had to be eliminated altogether. There was a bimodal difference in listener responses; five listeners had missing responses in fewer than three data sets, and the other five had missing responses in five to seven data sets. The two listener groups did not differ significantly in rating proportions ('1,' '2,' and '3' responses), however, showing that speed of response was probably the affecting variable.
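This handling of missing responses — discarding an incomplete set whole, and discarding a listener's data for a category when too many of that listener's sets are incomplete — can be sketched in code. This is an illustrative sketch only, with a hypothetical function name and a missing response represented as `None`:

```python
def pool_complete_sets(listeners, max_missing=1):
    """Pool complete response sets across listeners.

    `listeners` is a list of listeners; each listener is a list of
    response sets, and each set is a list of ratings (None = missing).
    A set containing a missing response is discarded whole. If a
    listener has more than `max_missing` incomplete sets, all of that
    listener's data for the category are discarded as a cautionary
    measure; otherwise the listener's complete sets are pooled.
    """
    pooled = []
    for sets in listeners:
        incomplete = [s for s in sets if None in s]
        if len(incomplete) > max_missing:
            continue  # too many incomplete sets: drop this listener's category data
        for s in sets:
            if None not in s:
                pooled.extend(s)
    return pooled
```

The pooled list is then summarized per speaker and category, which is how per-speaker token counts of the kind reported below can be arrived at.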
If listeners had more than one missing response set for a given speaker and consonant or vowel category, all data from that speaker were eliminated for that listener and category as a cautionary measure. Otherwise, complete data sets were pooled across listeners within speakers, giving a final number of listener tokens as follows: Peran: Vowels (V) — 840, Consonants (C) — 1720; Palmer: V — 920, C — 1920; Purdy: V — 1000, C — 1890; Pamela: V — 1040, C — 1900. Results and discussion for listener study #2. Results were evaluated within speaker, with Table 3 showing overall consonant and vowel ratings. Across listeners, average pre-treatment consonant ratings were fairly similar, in the low on-target range, for Purdy, Peran and Pamela: 1.61 to 1.76. For Palmer, the average rating was 2.16, that is, in the intermediate accuracy range. A Wilcoxon's Signed Ranks non-parametric test was used to test pre-post differences, because of the small sample. Post-treatment, by speaker, average consonant listener ratings improved significantly for Palmer and Pamela, to 1.52 and 1.62 respectively (p < .01). Purdy showed a slight non-significant improvement, and Peran a slight non-significant regression. The most improved consonant ratings across speakers were for /r/, with individual speaker variability for the sibilants and /l/. Average ratings for vowels across listeners pre-treatment were in the mid on-target range for all speakers, from 1.4 for Pamela to 1.67 for Palmer. Post-treatment, Purdy and Palmer showed a small but significant improvement in ratings, to 1.42 and 1.50 respectively (p < .01). Pamela and Peran showed a small, non-significant improvement. The vowel /i/ showed the most improved listener ratings. Space precludes a detailed comparison with the trained listener study (Bernhardt et al., 2003) in terms of speaker ratings. However, as predicted, there was general congruence. Palmer was rated most severely pre-treatment. Palmer, Pamela and Purdy all showed greater gains in consonant ratings than Peran, and Purdy and Palmer showed greater gains in vowel ratings than the other two. The everyday listeners also agreed with the trained listeners in rating /r/ as the most improved consonant across speakers. Rankings for less-improved consonants differed; the everyday listeners rated the sibilants overall more highly than did the trained listeners, perhaps showing a greater tolerance for dentalization of sibilants. Trained and everyday listener evaluations disagreed on the most improved vowel. According to the everyday listener ratings, the most improved vowel was /i/, whereas in the trained listener study, /ʊ/ was evaluated as most improved, followed by /i/. The /ʊ/ vowel may be difficult for everyday listeners to evaluate, partly because of English orthography, where "oo" can be /u/ or /ʊ/, or because it is a lax vowel and therefore less salient. The rating task was generally problematic for the vowels, as attested by the number of abandoned vowel data sets. The listeners' differential response to the various speakers further confirms that future research will have to consider speaker profiles.

Table 3
Everyday listener ratings (1, 2, 3) of speakers' pre- and post-treatment consonants and vowels

Consonants: Speaker | Average rating pre-tx | Average rating post-tx
Purdy | 1.61 (0.77) | 1.55 (0.72)
Peran | 1.69 (0.73) | 1.70 (0.75)
Pamela | 1.76 (0.81) | 1.62* (0.72)
Palmer | 2.16 (0.81) | 1.52* (0.69)
All | 1.81 (0.81) | 1.59* (0.72)

Vowels: Speaker | Average rating pre-tx | Average rating post-tx
Pamela | 1.40 (0.63) | 1.38 (0.58)
Purdy | 1.53 (0.69) | 1.42* (0.62)
Peran | 1.57 (0.69) | 1.51 (0.68)
Palmer | 1.67 (0.73) | 1.50* (0.69)
All | 1.53 (0.69) | 1.45* (0.65)

Note: Standard deviation in parentheses. "1" rating: "exactly like the target;" "2" rating: "somewhat like the target;" "3" rating: "not at all like the target." Based on ratings by 10 everyday listeners of the consonants /r/, /l/, /s/, /ʃ/, /t/, and the vowels /i/, /ɪ/, /u/, /ʊ/, /e/, /ɛ/, /ɑ/.
* p < .01 on Wilcoxon's Signed Ranks tests.

Figure 5a-5d. EPG pre- and post-treatment images for Pamela's /s/ (5a, 5b pre-post) and /r/ (5c, 5d pre-post). Post-treatment, /s/ has a narrower groove, and /r/ more symmetrical contact.
Figure 6a-6d. EPG images for Peran's vowels /ʊ/ (6a, 6b pre-post) and /u/ (6c, 6d pre-post). The /u/ is advanced, and the /ʊ/ retracted, after treatment.
Figure 7a-7d. Ultrasound images for Pamela's /i/ (7a, 7b pre-post treatment) and /ɪ/ (7c, 7d pre-post). The /i/ is advanced and heightened in comparison with /ɪ/ post-treatment.

The everyday listener rating scale did discriminate between speakers and speech targets. Vowel ratings were problematic, however, in this study. Furthermore, individual listener references for the "1," "2" and "3" ratings remain unknown. If a speaker has an average 1.5 rating on some target, it is not clear what that implies for the target or for everyday communication. The rating scale provided less specific observations than the word identification task used in the first study. As a final note on the second study and as a further tutorial on ultrasound and EPG, sample pre- and post-treatment images are given in figures 5-7. Figure 5 shows a narrowing of the groove for /s/ post-treatment for Pamela (figures 5a, 5b), and a more symmetrical contact pattern for /r/ (figures 5c, 5d). The EPG images for Peran show the /ʊ/ tongue-palate contact moving backward (figures 6a, 6b) and the /u/ contact moving forward (figures 6c, 6d). The ultrasound images for Pamela's /i/ (figures 7a, 7b) and /ɪ/ (figures 7c, 7d) show a differentiation post-treatment, with the /i/ showing a higher tongue body and advanced tongue root. Overall,
4, Winter 2005 Electropalatography and Ultrasound these images correspond generally to the listener perspectives. For further discussion of perceptualarticulatory convergence in ultrasound studies, see AdlerBock (2004). Conclusion The current paper had two purposes: (1) to provide an introduction to EPG and ultrasound as tools in speech habilitation, and (2) to present preliminary everyday listener observations concerning pre- and post-treatment samples collected for two exploratory studies at UBC, as a foundation for future research and clinical initiatives. Comparing EPG and ultrasound, they provide different types of dynamic information about the tongue during speech production. EPG shows tongue-palate contact patterns from the tongue tip to the back of the hard palate for mid and high vowels and lingual consonants. Ultrasound images tongue shape, location and configuration for all vowels and lingual consonants. In practical terms, EPG is much less expensive than ultrasound, and WIN-EPG has built-in analysis capabilities. However, ultrasound does not require individualized artificial palates for each client, and strides are being made in data quantification (see the aforementioned websites). In terms of speech habilitation, both tools appear to hold promise. Research is needed to compare the relative benefits of each in speech habilitation, and their merit in comparison with other technologies and methods across a variety of speakers. In working towards clinical implementation of EPG or ultrasound, it is important to know whether the benefits outweigh the costs. The ultimate test of any treatment methodology is through randomized clinical trials, with large numbers of participants, comparison and control groups, blinding of S-LPs to conditions and data, rigorous baseline and treatment protocols, multiple types of evaluations and control of external variables. 
As a prelude to such trials, additional small n studies may provide further information as to the relative merit of the technologies. The everyday listener studies reported in the current paper raise a number of questions for future research in speech habilitation with visual feedback technology, particularly in terms of speaker profiles and evaluation methods. More research is needed to determine the relationship of speaker variables (for example, age, etiology or severity) to outcomes. In terms of evaluation methods, a number of issues were raised by the everyday listener studies. In the second study, the everyday listeners agreed with the trained listeners of a previous study on the most improved speakers and consonant, with near agreement on the most improved vowel. Although this congruence is encouraging, one aspect of the everyday listener data suggests that it may be important to consider the relative impact of the degree of change in future studies. Everyday listener observations across the two studies showed minimal pre-post differences for 7 of 11 speakers. This may be because the everyday listeners rated six of the pre-treatment samples relatively highly, leaving minimal room for improvement. Trained listeners might have noted minor improvements through narrow phonetic transcription or instrumental analyses. However, it is not known what the impact of small differences might be in speakers' everyday conversations. More research is needed to compare everyday and trained listener observations and to determine the impact of different types and degrees of change on conversational intelligibility. In the studies reported here, everyday listener measures of intelligibility were used, specifically word identification and accuracy judgments. Word identification may be more ecologically valid than accuracy judgments, because conversation involves word identification. However, accuracy judgments can differentiate between speakers and samples.
Thus, both can contribute information to the evaluation process. Although everyday listener observations bring the 'real world' into the evaluation process, future research also needs to bring the interaction between speaker and listener into focus through comprehensibility studies (Visser, 2004; Yorkston et al., 1996). Qualitative studies gathering the viewpoints of the speakers and their conversation partners may also be illuminating, and are currently underway in the UBC research program. In terms of clinical application, S-LPs currently have limited or no access to EPG or ultrasound for their clients unless they can form partnerships with university centres or hospitals engaged in research. Networks such as Cleftnet UK have been established elsewhere and may also be a future possibility in Canada. Meanwhile, the information about phonetics and treatment evaluation presented in this paper may provide S-LPs with some new ideas for daily clinical practice and its evaluation.

Acknowledgements

Thank you to the following: the speakers and listeners; S-LPs David Loyst and Shannon Muir, Study #1; research assistants Dana Bryer, Dana Kanwischer, Jonathan Howell and Marie Jette; Dr. Katherine Pichora-Fuller of the University of Toronto and Dr. Linda Rammage of the Vancouver Hospital Voice Clinic, committee members for Rhea Williams' Master's thesis; and the reviewers of this paper. For funding, we thank the Canada Foundation for Innovation, the BC Medical Services Foundation, UBC's Hampton and HSS funds, and the Variety Club.

References

Adler-Bock, M. (2004). Visual feedback from ultrasound in remediation of persistent /r/ errors: Case studies of two adolescents. Unpublished Master's thesis, University of British Columbia.

Bernhardt, B., Fuller, K., Loyst, D., & Williams, R. (2000). Speech production outcomes before and after palatometry for a child with a cochlear implant.
Journal of the Academy of Rehabilitative Audiology, 23, 11-37.

Bernhardt, B., Gick, B., Adler-Bock, M., & Bacsfalvi, P. (2005). Ultrasound in speech therapy with adolescents and adults. Clinical Linguistics and Phonetics, 19, 605-617.

Bernhardt, B., Gick, B., Bacsfalvi, P., & Ashdown, J. (2003). Speech habilitation of hard of hearing adolescents using electropalatography and ultrasound as evaluated by trained listeners. Clinical Linguistics and Phonetics, 17, 199-216.

Bernthal, J., & Bankson, N. (2004). Articulation and phonological disorders (5th ed.). Boston, MA: Allyn and Bacon.

Black, H. (1999). An evaluation of palatometry in the treatment of /r/ for a hard of hearing adolescent. Master of Science graduating essay, University of British Columbia.

Bridges, C. C., & Huckabee, R. M. (1970). A new visual speech display: Its use in speech therapy. Volta Review, 72, 112-115.

Dagenais, P. (1992). Speech training with glossometry and palatometry with profoundly hearing impaired children. The Volta Review, 94, 261-282.

Delattre, P. C., & Freeman, D. C. (1968). A dialect study of American r's by x-ray motion picture. Linguistics, 44, 29-68.

Derry, K., & Bernhardt, B. (2000). Palatometry intervention in relation to body (structure and function), activity, and participation. Poster presented at the VIIIth meeting of the International Clinical Phonetics and Linguistics Association, Edinburgh, Scotland.

Ertmer, D., & Maki, J. (2000). Comparison of speech training methods with deaf adolescents: Spectrographic versus noninstrumental instruction. Journal of Speech, Language & Hearing Research, 43, 1509-1523.

Fletcher, S., Dagenais, P., & Critz-Crosby, P. (1991). Teaching consonants to profoundly hearing-impaired speakers using palatometry. Journal of Speech and Hearing Research, 34, 929-942.

Frattali, C. (1998). Measuring outcomes in speech-language pathology. New York, NY: Thieme.

Gibbon, F., Crampin, L., Hardcastle, W., Nairn, M., Razzell, R., Harvey, L., & Reynolds, B.
(1998). Cleftnet (Scotland): A network for the treatment of cleft palate speech using EPG. International Journal of Language & Communication Disorders, 33 (Suppl.), 44-49.

Gibbon, F., Hardcastle, W., Dent, H., & Nixon, F. (1996). Types of deviant sibilant production in a group of school-aged children, and their response to treatment using electropalatography. In M. J. Ball & M. Duckworth (Eds.), Advances in clinical phonetics (pp. 115-149). Amsterdam: John Benjamins.

Gibbon, F., & Wood, S. (2003). Using electropalatography (EPG) to diagnose and treat articulation disorders associated with mild cerebral palsy: A case study. Clinical Linguistics and Phonetics, 17, 365-374.

Gick, B. (2002). The use of ultrasound for linguistic phonetic fieldwork. Journal of the International Phonetic Association, 32, 113-122.

Hardcastle, W., Gibbon, F., & Scobbie, J. (1995). Phonetic and phonological aspects of English affricate production in children with speech disorders. Phonetica, 52, 242-250.

Hardcastle, W., Jones, W., Knight, C., Trudgeon, A., & Calder, G. (1989). New developments in EPG: A state of the art report. Clinical Linguistics and Phonetics, 3, 1-38.

Kent, R., Miolo, G., & Bloedel, S. (1994). The intelligibility of children's speech: A review of evaluation procedures. American Journal of Speech-Language Pathology, 3, 81-95.

Klasner, E., & Yorkston, K. (1999). Dysarthria in ALS: A method for obtaining the everyday listener's perception. Journal of Medical Speech-Language Pathology, 8, 261-264.

McLeod, S., Roberts, A., & Sita, J. (2003). The difference between /s/ and /z/: More than +/- voice? Poster presented at the American Speech-Language-Hearing Association Convention, Chicago, IL, November 2003.

Maki, J. (1983). Application of the speech spectrographic display in developing articulatory skills in hearing-impaired adults. In I. Hochberg, H. Levitt, & M. Osberger (Eds.), Speech of the hearing impaired: Research, training and personnel preparation (pp. 297-312).
Baltimore, MD: University Park Press.

Michi, K., Yamashita, Y., Imai, S., & Yoshida, H. (1993). Role of visual feedback treatment for defective /s/ sounds in patients with cleft palate. Journal of Speech and Hearing Research, 26, 277-285.

Schiavetti, N. (1992). Scaling procedures for the measurement of speech intelligibility. In R. Kent (Ed.), Intelligibility in speech disorders: Theory, measurement and management (pp. 11-34). Amsterdam: John Benjamins.

Schuster, L., Ruscello, D., & Toth, A. (1995). The use of visual feedback to elicit correct /r/. American Journal of Speech-Language Pathology, 4, 37-44.

Sproat, R., & Fujimura, O. (1993). Allophonic variation in English /l/ and its implications for phonetic implementation. Journal of Phonetics, 21, 291-311.

Stone, M. (2005). A guide to analyzing tongue motion from ultrasound images. Clinical Linguistics and Phonetics, 19, 455-501.

Stone, M., & Lundberg, L. (1996). Three-dimensional tongue surface shapes of English consonants and vowels. Journal of the Acoustical Society of America, 99(6), 1-10.

Volin, R. (1991). Micro-computer based systems providing biofeedback of voice and speech production. Topics in Language Disorders, 11, 65-79.

Williams, R. (1998). Outcomes of palatometry therapy as perceived by untrained listeners. Unpublished Master's thesis, University of British Columbia.

World Health Organization. (2001). ICF: International classification of functioning, disability and health. Geneva: World Health Organization.

Yorkston, K., Strand, E., & Kennedy, M. (1996). Comprehensibility of dysarthric speech: Implications for assessment and treatment planning. American Journal of Speech-Language Pathology, 5, 55-66.

Author Note

Please direct enquiries to Barbara Bernhardt, School of Audiology and Speech Sciences, University of British Columbia, 5804 Fairview Avenue, Vancouver, BC V6T 1Z3, [email protected]

Received: November 16, 2004
Accepted: October 4, 2005

Resource Review / Évaluation des ressources

Clinical Evaluation of Language Fundamentals Preschool - 2nd Edition (2004)

Authors: Elisabeth H. Wiig, Wayne A. Secord and Eleanor Semel
Publisher: Harcourt Assessment, Inc., 19500 Bulverde Road, San Antonio, Texas 78259, USA; PsychCorp: www.PsychCorp.com
Reviewer: Sharon A. Bond, M.Sc., R.SLP, S-LP(C), CCC-SLP, Speech-Language Pathologist, Capital Health Authority, Edmonton, Alberta

The Clinical Evaluation of Language Fundamentals Preschool-2nd Edition (CELF Preschool-2) is an individually administered clinical tool that can be used to identify and diagnose language deficits in children who are 3-6 years of age. According to the authors, the test was redesigned to make it easier and quicker to administer, to provide greater diagnostic value for children ages 3-4 years and 5-6 years, to assist in the assessment of early classroom and literacy fundamentals and communication in context (pragmatics), and to add composite scores to evaluate content (semantics) and structure (morphosyntax). The authors describe the CELF Preschool-2 as a series of levels. The level chosen to begin a particular child's assessment depends upon the examiner's clinical judgment, the child's functional language performance, and the referral question to be answered. Level 1 is used to determine whether or not a language disorder exists. It consists of the Sentence Structure, Word Structure and Expressive Vocabulary subtests. Scores on these subtests are used to develop a Core Language Score. The authors state that this score best discriminates between the performance of children with typically developing language and that of children with language disorders. If a child is found to have a language disorder, further language testing will be done at Level 2.
This level of testing provides more information about how language modalities, language content, and language structures are affected. At Level 2, the examiner is able to determine patterns of performance across index scores and to compare the child's score patterns with the appropriate norm-reference group. Item analysis can be conducted on all subtests at this level. At Level 3, the examiner evaluates a child's early classroom and literacy fundamentals; phonological awareness and pre-literacy rating are included at this level. To determine a child's pragmatic skills in relation to social communication in the home, community and/or school (Level 4), a caregiver or other person familiar with the child completes the Descriptive Pragmatics Profile. This profile includes verbal and nonverbal behaviors. The CELF Preschool-2 is individually administered. Administration time depends upon the level of assessment done: approximately 20 minutes is required to administer the three subtests needed to develop the Core Language Score, and more time is required for Levels 2 and 3. The test results are interpreted in terms of subtest scaled scores, composite standard scores, criterion scores, percentile ranks, and age equivalents. The test components include the Examiner's Manual, Stimulus Books 1 and 2, the Record Form, and checklists for the Pre-Literacy Rating Scale and the Descriptive Pragmatics Profile. The Examiner's Manual includes guidelines for scoring and interpreting test performance, a detailed description of the test development, normative data, and suggestions for intervention and/or follow-up based on the test results. Stimulus Book 1 is colorful and includes all the directions to administer the subtests. Stimulus Book 2 contains illustrations for the story used for Recalling Sentences in Context.
The Record Form contains the test items, space for recording responses, item analysis, and clear information regarding whether or not repetitions are allowed and when the discontinue rule is to be applied. A Behavioral Observation Checklist is also included. The checklists can be completed by someone familiar with the child and his or her background. Norm-referenced data for the CELF Preschool-2 were derived from a standardization sample of 800 children aged 3 years to 6 years in 2002. The sample (based on the U.S. Bureau of the Census, 2000) was stratified on the basis of age, sex, race/ethnicity, geographic region, and primary caregiver's education level. Reported test-retest reliability for the subtests and composite scores, by age and across all ages, ranged from a low of .78 to a high of .94. Internal consistency reliability coefficients across all ages ranged from .79 to .97. The test-retest reliability and internal consistency would be considered acceptable across subtests and excellent for the composite scores. The authors presented evidence of validity of the CELF Preschool-2 based on test content, the response process, internal structure and intercorrelational studies. The results of the intercorrelations between the subtests and the composite scores were given; overall, the correlations were moderate. Further validation evidence was reported through comparison with other tests of language disorders in children, including the CELF Preschool, the CELF-4, and the PLS-4. The correlations were reported as moderate to high. The validity of the CELF Preschool-2 was based on five different types of research that yielded significant results. Therefore, the validation of the test would be considered good.
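Internal consistency coefficients of the kind reported above are commonly computed as Cronbach's alpha. The following is a minimal sketch of the computation; the item scores are hypothetical, not CELF Preschool-2 data:

```python
def cronbach_alpha(item_scores):
    """Cronbach's alpha: k/(k-1) * (1 - sum(item variances) / variance(totals)).

    item_scores: one list per test item, aligned across the same
    respondents (same respondent order in every inner list)."""
    k = len(item_scores)
    n = len(item_scores[0])

    def variance(xs):
        # Sample variance (n - 1 denominator).
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    # Each respondent's total score across all items.
    totals = [sum(item[r] for item in item_scores) for r in range(n)]
    item_var_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

# Hypothetical scores for 3 items from 5 respondents.
items = [
    [2, 4, 3, 5, 1],
    [3, 4, 3, 5, 2],
    [2, 5, 4, 5, 1],
]
alpha = cronbach_alpha(items)
```

Alpha rises toward 1 as items covary strongly relative to their individual variances; values in the .79 to .97 range, as reported here, span the conventional "acceptable" to "excellent" band.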
In summary, the Clinical Evaluation of Language Fundamentals Preschool-2 is a well-designed test for identifying and assessing language disorders in children ages 3 years to 6 years. I found that the structuring of the test into levels allowed me to tailor the language assessment to the specific needs of the child I was seeing. The directions for each subtest are not only on the Response Form, but also in the Stimulus Book. Each subtest is preceded by information regarding materials needed, the number of repetitions allowed, and the discontinue rule. If an individual subtest score is to be used in more than one index or composite score, the indexes are listed. Acceptable alternate responses and 1-point responses are included on the Response Form. I am pleased with the modifications to many of the stimulus plates. The pictures are colorful and interesting to children. During language assessments, I often have children who like to talk about the pictures; at that point, I take the opportunity to record what they say for a language sample. The various children depicted represent Caucasian, Black, Hispanic, and Asian ethnic backgrounds. Children taking the test are able to identify with the presented manner of dress and activities. The story used for Recalling Sentences in Context is a significant improvement over the one in the previous edition. "The Big Move" has been replaced with "No Juice!", a story about three children's trip with their mother to the grocery store. This is an event that is very familiar to children. Overall, the CELF Preschool-2 is a well-designed clinical tool for identifying and diagnosing language disorders in children ages 3 years to 6 years. The authors have provided evidence of the instrument's acceptable reliability and validity. The CELF Preschool-2 would be a very good addition to the assessment protocols of those speech-language pathologists who provide services to the preschool and kindergarten population.

Information for Contributors

The Journal of Speech-Language Pathology and Audiology (JSLPA) welcomes submissions of scholarly manuscripts related to human communication and its disorders, broadly defined. This includes submissions relating to normal and disordered processes of speech, language, and hearing. Manuscripts that have not been published previously are invited in English and French. Manuscripts may be tutorial, theoretical, integrative, practical, pedagogic, or empirical. All manuscripts will be evaluated on the basis of the timeliness, importance, and applicability of the submission to the interests of speech-language pathology and audiology as professions, and to communication sciences and disorders as a discipline. Consequently, all manuscripts are assessed in relation to the potential impact of the work on improving our understanding of human communication and its disorders. All categories of manuscripts submitted will undergo peer review to determine the suitability of the submission for publication in JSLPA. The Journal recently has established multiple categories of manuscript submission that will permit the broadest opportunity for dissemination of information related to human communication and its disorders.
New categories for manuscript submission include:

Tutorials. Review articles, treatises, or position papers that address a specific topic within either a theoretical or clinical framework.

Articles. Traditional manuscripts addressing applied or basic experimental research on issues related to speech, language, and/or hearing with human participants or animals.

Clinical Reports. Reports of new clinical procedures, protocols, or methods with specific focus on direct application to identification, assessment and/or treatment concerns in speech, language, and/or hearing.

Brief Reports. Similar to research notes, brief communications concerning preliminary findings, either clinical or experimental (applied or basic), that may lead to additional and more comprehensive study in the future. These reports are typically based on small "n" or pilot studies and must address disordered participant populations.

Research Notes. Brief communications that focus on experimental work conducted in laboratory settings. These reports will typically address methodological concerns and/or modifications of existing tools or instruments with either normal or disordered populations.

Field Reports. Reports that outline the provision of services that are conducted in unique, atypical, or nonstandard settings; manuscripts in this category may include screening, assessment, and/or treatment reports.

Letters to the Editor. A forum for presentation of scholarly/clinical differences of opinion concerning work previously published in the Journal. Letters to the Editor may influence our thinking about design considerations, methodological confounds, data analysis and/or data interpretation, etc. As with other categories of submissions, this communication forum is contingent upon peer review. However, in contrast to other categories of submission, rebuttal from the author(s) will be solicited upon acceptance of a letter to the editor.
Submission of Manuscripts

Contributors should send five (5) copies of manuscripts, including all tables, figures or illustrations, and references, to:

Phyllis Schneider, PhD
Editor, JSLPA
Dept. of Speech Pathology and Audiology
University of Alberta
2-70 Corbett Hall
Edmonton, AB T6G 2G4

Along with copies of the manuscript, a cover letter indicating that the manuscript is being submitted for publication consideration should be included. The cover letter must explicitly state that the manuscript is original work that has not been published previously and that it is not currently under review elsewhere. Manuscripts are received and peer-reviewed contingent upon this understanding. The author(s) must also provide appropriate confirmation that work conducted with humans or animals has received ethical review and approval. Failure to provide information on ethical approval will delay the review process. Finally, the cover letter should also indicate the category of submission (i.e., tutorial, clinical report, etc.). If the editorial staff determines that the manuscript should be considered within another category, the contact author will be notified. All submissions should conform to the publication guidelines of the Publication Manual of the American Psychological Association (APA), 5th Edition. Manuscripts should be word processed, IBM format preferred. Should the manuscript be accepted for publication, submission of a diskette version will facilitate the publication process. A confirmation of receipt for all manuscripts will be provided to the contact author prior to distribution for peer review. JSLPA seeks to conduct the review process and respond to authors regarding the outcome of the review within 90 days of receipt. If a manuscript is judged as suitable for publication in JSLPA, authors will have 30 days to make necessary revisions prior to a secondary review.
The author is responsible for all statements made in his or her manuscript, including changes made by the editorial and/or production staff. Upon final acceptance of a manuscript and immediately prior to publication, the contact author will be permitted to review galley proofs and must verify their content to the publication office within 72 hours of receipt.

Organization of the Manuscript

All copies should be typed, double-spaced, with a standard typeface (12 point, noncompressed font) on high quality 8 ½ X 11 paper. All margins should be at least one (1) inch. An original and four (4) copies of the manuscript should be submitted directly to the Editor. Author identification for the review process is optional; if blind review is desired, three (3) of the copies should be prepared accordingly (cover page and acknowledgments blinded). Responsibility for removing all potential identifying information rests solely with the author(s). All manuscripts should be prepared according to APA guidelines. This manual is available from most university bookstores or is accessible via commercial bookstores. Generally, the following sections should be submitted in the order specified.

Title Page: This page should include the full title of the manuscript, the full names of the author(s) with academic degrees, each author's affiliation, and a complete mailing address for the contact author. An electronic mail address also is recommended.

Abstract: On a separate sheet of paper, a brief yet informative abstract that does not exceed one page is required. The abstract should include the purpose of the work along with pertinent information relative to the specific manuscript category for which it was submitted.

Key Words: Following the abstract and on the same page, the author(s) should supply a list of key words for indexing purposes.
Tables: Each table included in the manuscript must be typewritten and double-spaced on a separate sheet of paper. Tables should be numbered consecutively beginning with Table 1. Each table must have a descriptive caption. Tables should serve to expand the information provided in the text of the manuscript, not to duplicate information.

Potential Conflicts of Interest and Dual Commitment

As part of the submission process, the author(s) must explicitly identify if any potential conflict of interest, or dual commitment, exists relative to the manuscript and its author(s). Such disclosure is requested so as to inform JSLPA that the author or authors have the potential to benefit from publication of the manuscript. Such benefits may be either direct or indirect and may involve financial and/or other nonfinancial benefit(s) to the author(s). Disclosure of potential conflicts of interest or dual commitment may be provided to editorial consultants if it is believed that such a conflict of interest or dual commitment may have had the potential to influence the information provided in the submission or compromise the design, conduct, data collection or analysis, and/or interpretation of the data obtained and reported in the manuscript submitted for review. If the manuscript is accepted for publication, editorial acknowledgement of such potential conflict of interest or dual commitment may occur when publication occurs.

Illustrations: All illustrations included as part of the manuscript will need to be included with each copy of the manuscript. While a single copy of original artwork (black and white photographs, x-ray films, etc.) is required, all manuscripts must have clear copies of all illustrations for the review process. For photographs, 5 x 7 glossy prints are preferred. High quality laser printed materials are also acceptable.
For other types of computerized illustrations, it is recommended that JSLPA production staff be consulted prior to preparation and submission of the manuscript and associated figures/illustrations.

Legends for Illustrations: Legends for all figures and illustrations should be typewritten (double-spaced) on a separate sheet of paper, with numbers corresponding to the order in which figures/illustrations appear in the manuscript.

Page Numbering and Running Head: The text of the manuscript should be prepared with each page numbered, including tables, figures/illustrations, references, and, if appropriate, appendices. A short (30 characters or less) descriptive running title should appear at the top right hand margin of each page of the manuscript.

Acknowledgments: Acknowledgments should be typewritten (double-spaced) on a separate sheet of paper. Appropriate acknowledgment for any type of sponsorship, donations, grants, or technical assistance, and of professional colleagues who contributed to the work but are not listed as authors, should be noted.

References: References are to be listed consecutively in alphabetical order, then chronologically for each author. Authors should consult the APA publication manual (5th Edition) for methods of citing varied sources of information. Journal names and appropriate volume numbers should be spelled out and underlined. All literature, tests and assessment tools, and standards (ANSI and ISO) must be listed in the references. All references should be double-spaced.

Participants in Research: Humans and Animals

Each manuscript submitted to JSLPA for peer review that is based on work conducted with humans or animals must acknowledge appropriate ethical approval.
In instances where humans or animals have been used for research, a statement indicating that the research was approved by an institutional review board or other appropriate ethical evaluation body or agency must clearly appear, along with the name and affiliation of the research ethics evaluation body or agency and the ethical approval number. The review process will not begin until this information is formally provided to the Editor. Similar to research involving human participants, JSLPA requires that work conducted with animals state that such work has met with ethical evaluation and approval. This includes identification of the name and affiliation of the research ethics evaluation body or agency and the ethical approval number. A statement that all research animals were used and cared for in an established and ethically approved manner is also required. The review process will not begin until this information is formally provided to the Editor.

Renseignements à l'intention des collaborateurs

La Revue d'orthophonie et d'audiologie (ROA) est heureuse de se voir soumettre des manuscrits de recherche portant sur la communication humaine et sur les troubles qui s'y rapportent, dans leur sens large. Cela comprend les manuscrits portant sur les processus normaux et désordonnés de la parole, du langage et de l'audition. Nous recherchons des manuscrits qui n'ont jamais été publiés, en français ou en anglais. Les manuscrits peuvent être tutoriels, théoriques, synthétiques, pratiques, pédagogiques ou empiriques. Tous les manuscrits seront évalués en fonction de leur signification, de leur opportunité et de leur applicabilité aux intérêts de l'orthophonie et de l'audiologie comme professions, et aux sciences et aux troubles de la communication en tant que disciplines.
Accordingly, all manuscripts are evaluated in terms of their potential impact on improving our understanding of human communication and its disorders. Regardless of category, all submitted manuscripts undergo peer review to determine their suitability for publication in the ROA. The Journal has recently established several manuscript categories to permit the widest possible dissemination of information on human communication and its disorders. The manuscript categories include:

Tutorials: Review articles, treatises, or position papers addressing a specific topic within a theoretical or clinical framework.

Articles: Conventional manuscripts reporting applied or basic experimental research on questions related to speech, language, or hearing, involving human or animal participants.

Clinical Reports: Reports of new clinical procedures, methods, or protocols, with a specific focus on direct application to questions of identification, assessment, and treatment of speech, language, and hearing.

Brief Reports: Similar to research notes, brief communications reporting preliminary findings, either clinical or experimental (applied or basic), that may lead to further study. These reports are typically based on small-n or pilot studies and must address disordered populations.

Research Notes: Brief communications dealing specifically with experimental work conducted in the laboratory. These reports typically address methodological questions or modifications of existing tools used with normal or disordered populations.
Field Reports: Reports briefly describing the delivery of services in unique, atypical, or special situations; manuscripts in this category may include reports of screening, assessment, or treatment.

Letters to the Editor: A forum for the presentation of scientific or clinical differences of opinion concerning work previously published in the Journal. Letters to the Editor may influence our thinking about design factors, methodological confounds, data analysis or interpretation, and so on. As with other submission categories, this forum of communication is peer-reviewed. Unlike other categories, however, a response from the original authors will be solicited upon acceptance of a letter.

Submission of Manuscripts

Contributors are asked to send five (5) copies of their manuscript, including all tables, figures or illustrations, and references, to: Phyllis Schneider, Ph.D., Editor, Revue d'orthophonie et d'audiologie, Dept. of Speech Pathology and Audiology, University of Alberta, 2-70 Corbett Hall, Edmonton, Alberta T6G 2G4. Copies of the manuscript must be accompanied by a cover letter indicating that the manuscript is being submitted for publication. The cover letter must state that the manuscript is original work, that it has not been published previously, and that it is not currently under review for publication elsewhere. Manuscripts are received and reviewed upon acceptance of these conditions. The author(s) must also provide formal certification that any research involving human or animal participants has received the approval of an ethics review committee. The absence of such approval will delay the review process.
Finally, the cover letter must also specify the category of the submission (i.e., tutorial, clinical report, etc.). If the review team judges that the manuscript belongs in a different category, the contact author will be so advised. All submissions must conform to the guidelines of the Publication Manual of the American Psychological Association (APA), 5th Edition. Manuscripts should be prepared with a word processor, preferably in IBM-compatible format. Submission of a diskette, if the manuscript is accepted, facilitates publication. Receipt of each manuscript will be acknowledged to the contact author before copies are distributed for review. The ROA seeks to complete this review and inform authors of its outcome within 90 days of receipt. When a manuscript is judged suitable for the ROA, authors will be given 30 days to make the necessary changes before the secondary review. The author is responsible for all statements made in the manuscript, including any changes made by the editors and reviewers. Upon final acceptance of a manuscript, and immediately prior to publication, the contact author will be given the opportunity to review the proofs and must certify verification of their content within 72 hours of receiving them.

Organization of the Manuscript

All text must be typewritten, double-spaced, in a standard typeface (12-point, non-condensed font) on 8 ½" × 11" quality paper. All margins must be at least one (1) inch.
The original and four (4) copies of the manuscript must be submitted directly to the Editor. Author identification is optional for the review process: authors who do not wish to be identified at this stage should prepare three (3) copies of the manuscript with the title page and acknowledgments masked. Authors alone are responsible for removing any potentially identifying information. All manuscripts must be prepared in accordance with APA guidelines. The manual is available in most university bookstores and can be ordered through commercial booksellers. In general, the following sections should be presented in the order specified below.

Title Page: This page should contain the full title of the manuscript, the full names of the authors, including degrees and affiliations, and the complete address of the contact author. An e-mail address is also recommended.

Abstract: On a separate page, provide a brief but informative abstract not exceeding one page. The abstract should indicate the purpose of the work as well as any relevant information on the category of the manuscript.

Key Words: Immediately following the abstract, on the same page, authors should provide a list of key words for indexing purposes.

Tables: Each table included in the manuscript must be typewritten, double-spaced, on a separate page. Tables should be numbered consecutively, beginning with Table 1. Each table must be accompanied by a caption and must serve to supplement the information provided in the text of the manuscript rather than duplicate information contained in the text or in other tables.
Illustrations: All illustrations forming part of the manuscript must be included with each copy of the manuscript. Although only one copy of the original illustration material (photographs, radiographs, etc.) is required, each copy of the manuscript must contain clear reproductions of all illustrations for the review process. For photographs, 5" × 7" glossy prints are preferred. High-quality laser prints are acceptable. For other types of computer-generated illustrations, it is recommended that ROA production staff be consulted prior to preparation and submission of the manuscript and its associated figures and illustrations.

Potential Conflicts of Interest and Dual Commitment

As part of the submission process, authors must clearly declare the existence of any potential conflict of interest or dual commitment with respect to the manuscript and its authors. This declaration is required to inform the ROA that the author or authors may derive benefit from publication of the manuscript. Such benefits to authors, whether direct or indirect, may be financial or non-financial in nature. A declaration of potential conflict of interest or dual commitment may be forwarded to publication advisors when it is judged that such a conflict or commitment could have influenced the information provided in the submission, or compromised the design, conduct, data collection or analysis, or the interpretation of the data gathered and presented in the manuscript under review. If the manuscript is accepted for publication, the editors reserve the right to acknowledge the possible existence of such a conflict of interest or dual commitment.
Legends for Illustrations: Legends accompanying each figure and illustration must be typewritten, double-spaced, on a separate sheet and identified with a number corresponding to the order in which the figures and illustrations appear in the manuscript.

Page Numbering and Running Head: Every page of the manuscript must be numbered, including tables, figures, illustrations, references and, where applicable, appendices. A short (30 characters or less) descriptive running title should appear in the top right margin of each page of the manuscript.

Acknowledgments: Acknowledgments must be typewritten, double-spaced, on a separate sheet. Authors should acknowledge any form of sponsorship, donation, grant, or technical assistance, as well as any professional colleagues who contributed to the work but are not listed as authors.

References: References are listed consecutively in alphabetical order, then chronologically under each author's name. Authors should consult the APA manual (5th Edition) for the correct form of citations for varied sources of information. Names of journals should be spelled out in full and underlined. All works, test and assessment tools, and standards (ANSI and ISO) must appear in the reference list. References must be typewritten, double-spaced.

Participants in Research: Humans and Animals

Each manuscript submitted to the ROA for peer review that is based on research conducted with human or animal participants must attest to appropriate ethical approval.
In instances where humans or animals have been used for research, a statement must be included indicating that the research was approved by a recognized review committee or other ethical evaluation body or agency, along with the name and affiliation of the research ethics evaluation body and the ethical approval number. The review process will not begin until this information is formally provided to the Editor. As with research involving human participants, the ROA requires that any research conducted with animals be accompanied by a statement that the work has been evaluated and approved by the appropriate ethics authorities. This includes the name and affiliation of the research ethics evaluation body or agency and the corresponding approval number. A statement that all research animals were used and cared for in a recognized and ethical manner is also required. The review process will not begin until this information is formally provided to the Editor.