
The Acquisition of English /I/ by Spanish Speakers via Text-to-Speech Synthesizers: A Quasi-Experimental Study



Soler Urzúa, Fernanda (2011) The Acquisition of English /I/ by Spanish Speakers via Text-to-Speech Synthesizers: A Quasi-Experimental Study. Masters thesis, Concordia University.

Text (application/pdf)
Soler-Urzua_MA_F2011.pdf - Accepted Version


A plausible explanation for learners having difficulties with the acquisition of L2 phonology is the idea that L2 speech is processed through the L1 and perceived in relation to it. L2 learners sometimes fail to perceive the differences between L1 and L2 segments; consequently, they are unable to acquire new sounds. In this context, the concept of perceptual salience takes on added importance because learners might be able to establish differences between L1 and L2 sounds if they are perceptually prominent in the L2 input. Some researchers suggest that multimedia environments are beneficial because the language input can be highlighted in many ways and thus render opaque forms more salient to the learner. This study investigates the extent to which pedagogical instruction using text-to-speech (TTS) technology as a means to enhance the aural input assists learners in the acquisition of the English /I/. Three groups of learners of the same L1 (Spanish) and similar English proficiency were pre-tested on their ability to perceive and produce the target vowel by means of different tasks (two for each ability). Each group was subjected to a different instructional condition: TTS-based instruction, non-TTS-based instruction and regular classroom instruction. The TTS group performed tasks intended to develop their perception of the target forms via TTS; the non-TTS group performed the same tasks, but received input from the researcher; and the third group worked on listening comprehension tasks. It was hypothesized that the TTS group would outperform the other two groups in terms of perception and production. After completing the treatments, the three groups were tested on their productive and perceptual abilities in relation to the target sound. Two weeks later, the participants received the same tests. The results obtained showed that the TTS group significantly outperformed the non-TTS group in one of the pronunciation tasks. However, their performance in the other tasks in the post-tests was not significantly different from that of the other groups. These results are discussed with respect to the hypotheses proposed and in relation to the relevant theory and previous studies. The limitations of the study together with suggestions for future research and its implications for ESL teaching are also addressed.

Divisions: Concordia University > Faculty of Arts and Science > Education
Item Type: Thesis (Masters)
Authors: Soler Urzúa, Fernanda
Institution: Concordia University
Degree Name: M.A.
Program: Applied Linguistics
Date: 25 August 2011
Thesis Supervisor(s): Cardoso, Walcir
Keywords: TTS, text-to-speech, CALL, L2 phonology, L2 perception, L2 production, second language acquisition
ID Code: 15159
Deposited On: 17 Nov 2011 20:32
Last Modified: 18 Jan 2018 17:35


All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
