
Assessing L2 Pronunciation with Automatic Speech Recognition Dictation Tools

Johnson, Carol (2025) Assessing L2 Pronunciation with Automatic Speech Recognition Dictation Tools. PhD thesis, Concordia University.

Johnson_PhD_F2025.pdf - Accepted Version (PDF, 1MB). Available under License Spectrum Terms of Access.

Abstract

The use of automatic speech recognition (ASR) to score pronunciation placement tests offers language institutions an efficient alternative to human raters, addressing common challenges such as rater reliability and high labour costs. However, the customizable ASR systems used to create scoring models are costly, making them unaffordable for most institutions. As an alternative, this dissertation explored the feasibility of using transcripts from Google Voice Typing (GVT), a free and readily available dictation tool, to provide automated scores for pronunciation assessments. Via two empirical studies, it addressed the following overarching research question: “What are the affordances offered by dictation ASR in an L2 pronunciation assessment context?”
In the first study (Manuscript A), human-rated and GVT-rated scores of 56 pronunciation placement tests were compared, showing strong correlations overall. However, when the samples were divided by proficiency level, correlations were weak for high-proficiency test takers, raising concerns about the reliability of the test scores. To explain this finding, it was hypothesized that some high-proficiency test takers received low GVT scores because of problematic linguistic elements in some test items (e.g., highly infrequent words, unusual collocations). Overall, this study showed that scoring pronunciation tests with GVT is feasible, with the caveat that these reliability issues must be addressed to ensure that test scores remain valid and reliable.
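The human-versus-GVT score comparison described above can be sketched as a Pearson correlation between paired ratings. This is a minimal illustration only: the score values below are invented for the example and are not data from the study.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two paired lists of test scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented example scores: human-rated vs. GVT-rated placement tests
human_scores = [3.0, 4.5, 2.0, 5.0, 3.5]
gvt_scores = [2.8, 4.2, 2.1, 4.0, 3.6]
print(round(pearson(human_scores, gvt_scores), 3))
```

Splitting the paired scores by proficiency level and correlating each subgroup separately, as in Manuscript A, would reveal whether a strong overall correlation masks weak agreement within a band.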
As a follow-up, the second study (Manuscript B) explored the effect of word frequency, unusual collocations, and phonologically ambiguous items on GVT transcription accuracy, with the aim of supporting the design of valid and reliable pronunciation tests. Four highly intelligible English speakers recorded 60 sentences targeting these three features. The recordings were transcribed by GVT and scored for accuracy, while eight human raters carried out an intelligibility transcription task that was also scored for accuracy. For GVT, the results suggest that lower-frequency vocabulary and phonologically ambiguous phrases were particularly challenging, while sentences containing names, proper nouns, or unusual collocations were almost always transcribed accurately. In contrast, transcriptions produced by human raters showed considerable variability and tended to be less accurate than those generated by GVT. These results indicate that certain features are difficult for both human raters and GVT to transcribe, even when produced by highly intelligible speakers, which highlights the importance of careful task design to avoid features that may compromise transcription accuracy. To ensure valid, reliable scores and fair decision-making, transcription accuracy must be verified through systematic test piloting.
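One common way to operationalize the transcription accuracy scoring described above, offered here as an assumed sketch rather than the study's actual scoring procedure, is word-level accuracy derived from the Levenshtein edit distance between a reference sentence and a transcript:

```python
def word_accuracy(reference, hypothesis):
    """Word-level accuracy: 1 minus (word edit distance / reference length)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # Levenshtein distance over word tokens, row by row
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,        # deletion
                            curr[j - 1] + 1,    # insertion
                            prev[j - 1] + cost))  # substitution/match
        prev = curr
    return max(0.0, 1 - prev[-1] / len(ref))
```

For example, a transcript that substitutes one word in a four-word reference sentence would score 0.75; averaging such scores over a set of target sentences gives a per-feature accuracy estimate comparable across GVT and human transcribers.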
The findings of this dissertation emphasize that GVT can be used to develop cost-effective scoring models for pronunciation placement tests, while also highlighting the need for careful task design to avoid the use of language that may compromise transcription accuracy and unfairly penalize test takers. These results also offer practical implications for classroom-based pronunciation assessments.

Divisions: Concordia University > Faculty of Arts and Science > Education
Item Type: Thesis (PhD)
Authors: Johnson, Carol
Institution: Concordia University
Degree Name: Ph.D.
Program: Education
Date: 20 August 2025
Thesis Supervisor(s): Cardoso, Walcir
Keywords: pronunciation assessment; automatic speech recognition; placement tests; Google Voice Typing; infrequent vocabulary; ambiguous speech; unusual collocations
ID Code: 996022
Deposited By: Carol Johnson
Deposited On: 04 Nov 2025 15:52
Last Modified: 04 Nov 2025 15:52



