Publications
Working Papers
Anna Seo Gyeong Choi, Orestis Papakyriakopoulos, Allison Koenecke, & Alessandro Fabris. (in preparation). ClinSpeech: A Holistic Benchmark for evaluating ASR Fairness in Clinical Conversations.
Anna Seo Gyeong Choi, Sunghye Cho, & Iris Nowenstein. (under review). PEEC: The Protected Entities Ethics Checklist for Collecting Speech Data from Vulnerable Clinical Populations.
Katelyn X. Mei*, Anna Seo Gyeong Choi*, Hilke Schellmann, Mona Sloane, & Allison Koenecke. (under review). Pitfalls of Auditing Practices in Automatic Speech Recognition Technologies: A Case Study of People with Aphasia. (* equal contribution)
Anna Seo Gyeong Choi*, Maria Teleki*, Miguel del Rio Fernandez, James Caverlee, & Allison Koenecke. (in submission). SpeechSpectrum: A Framework for User-Controlled Speech-to-Text Representation Along the Linguistic Fidelity Spectrum. (* equal contribution)
Peer-reviewed Articles
Eileen Pan, Anna Seo Gyeong Choi, Maartje Ter Hoeve, Skyler Seto, & Allison Koenecke. (2025). Analyzing Dialectical Biases in LLMs for Knowledge and Reasoning Benchmarks. Proceedings of Empirical Methods in Natural Language Processing (EMNLP) Findings.
Anna Seo Gyeong Choi & Hoon Choi. (2025). Fairness of Automatic Speech Recognition: Looking Through a Philosophical Lens. Proceedings of AAAI/ACM Conference on AI, Ethics, and Society (AIES).
Resources: arxiv
Anna Seo Gyeong Choi, Alex Richardson, Ryan Partlan, Sunny X. Tang†, & Sunghye Cho†. (2025). Comparative Evaluation of Acoustic Feature Extraction Tools for Clinical Speech Analysis. Proceedings of Interspeech. († equal contribution)
Resources: arxiv | official link
Chanwoo Park, Anna Seo Gyeong Choi, Sunghye Cho, & Chanwoo Kim. (2025). Reasoning-Based Approach with Chain-of-Thought for Alzheimer’s Detection Using Speech and Large Language Models. Proceedings of Interspeech.
Resources: arxiv | official link
Anna Seo Gyeong Choi*, Jonghyeon Park*, & Myungwoo Oh. (2025). Data-Driven Mispronunciation Pattern Discovery for Robust Speech Recognition. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). (* equal contribution)
Resources: arxiv | official link
Robin Zhao*, Anna Seo Gyeong Choi*, Allison Koenecke†, & Anais Rameau†. (2024). Quantification of Automatic Speech Recognition System Performance on d/Deaf and Hard of Hearing Speech. The Laryngoscope. (*, † equal contribution)
Resources: official link
Allison Koenecke, Anna Seo Gyeong Choi*, Katelyn X. Mei*, Hilke Schellmann†, & Mona Sloane†. (2024). Careless Whisper: Speech-to-Text Hallucination Harms. Proceedings of ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT). (*, † equal contribution)
Resources: arxiv | official link | explainer video | github
Jinseo Kim, Anna Seo Gyeong Choi, & Sunghye Cho. (2024). KoFREN: Comprehensive Korean Word Frequency Norms Derived from Large Scale Free Speech Corpora. Proceedings of Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING).
Anna Seo Gyeong Choi, Jinseo Kim, Seo-hee Kim, Minseok Baek, & Sunghye Cho. (2024). Crosslinguistic Acoustic Feature-based Dementia Classification using Advanced Learning Architectures. Proceedings of Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING) Workshop on Resources and Processing of Linguistic, Para-linguistic and Extra-linguistic Data from People with Various Forms of Cognitive/Psychiatric/Developmental Impairments (RaPID-5).
Orestis Papakyriakopoulos*, Anna Seo Gyeong Choi*, William Thong, Dora Zhao, Jerome Andrews, Rebecca Bourke, Alice Xiang†, & Allison Koenecke†. (2023). Augmented Datasheets for Speech Datasets and Ethical Decision-Making. Proceedings of ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT). (*, † equal contribution)
Resources: arxiv | official link | recorded talk | github
Sunhee Kim, Jooyeong Lee, Seo Gyeong Choi, Seunghun Ji, Jeemin Kang, Jongin Kim, et al. (2020). Building Korean Conversational Speech Data in Emergency Medical Domain. Phonetics and Speech Sciences, 12(4).