Publications

(* and † denote equal contribution)

Papers focusing on Fairness in Speech Technologies

Papers focusing on Acoustic Biomarkers in Clinical Speech Technologies

Working Papers

Anna Seo Gyeong Choi, Maria Teleki, James Caverlee, Miguel del Rio Fernandez†, Corey Miller†, & Hoon Choi†. (in submission, FAccT). Beyond Single Ground Truth: Reference Monism as Epistemic Injustice in ASR Evaluation.
- preprint
Maria Teleki, Anna Seo Gyeong Choi, Anne Duray, Sai Tejas Janjur, Xiangjue Dong, James Caverlee, & Dilma da Silva. (in submission, FAccT). How Are U.S. Universities Responding to AI? An Audit of Governance Capacity.
- preprint
Anna Seo Gyeong Choi, Sunghye Cho, & Iris Nowenstein. (under second revision, JSLHR). PEEC: The Protected Entities Ethics Checklist for Collecting Speech Data from Vulnerable Clinical Populations.
Katelyn X. Mei*, Anna Seo Gyeong Choi*, Hilke Schellmann, Mona Sloane, & Allison Koenecke. (in submission, FAccT). Pitfalls of Auditing Practices in Automatic Speech Recognition Technologies: A Case Study of People with Aphasia.
- arxiv | github
Anna Seo Gyeong Choi*, Maria Teleki*, Miguel del Rio Fernandez, Corey Miller†, James Caverlee†, & Allison Koenecke†. (in submission, FAccT). SpeechSpectrum: A Framework for User-Controlled Speech-to-Text Representation Along the Linguistic Fidelity Spectrum.
- preprint
Anna Seo Gyeong Choi, Orestis Papakyriakopoulos, Allison Koenecke, & Alessandro Fabris. (in preparation). ClinSpeech: A Holistic Benchmark for Evaluating ASR Fairness in Clinical Conversations.

Peer-reviewed Articles

(All conference proceedings papers were peer-reviewed as full articles.)

Anna Seo Gyeong Choi, Ryan Partlan, Alex Richardson, Sunghye Cho†, & Sunny X. Tang†. (2026). Speech Prosody in Schizophrenia Spectrum Disorders: Perceptual Evaluation and Machine Classification. Proceedings of Speech Prosody.
- preprint
Eileen Pan, Anna Seo Gyeong Choi, Maartje Ter Hoeve, Skyler Seto, & Allison Koenecke. (2025). Analyzing Dialectical Biases in LLMs for Knowledge and Reasoning Benchmarks. Proceedings of Empirical Methods in Natural Language Processing (EMNLP) Findings.
- arxiv | slides | poster
Anna Seo Gyeong Choi & Hoon Choi. (2025). Fairness of Automatic Speech Recognition: Looking Through a Philosophical Lens. Proceedings of AAAI/ACM Conference on AI, Ethics, and Society (AIES).
- arxiv | official link | slides
Anna Seo Gyeong Choi, Alex Richardson, Ryan Partlan, Sunny X. Tang†, & Sunghye Cho†. (2025). Comparative Evaluation of Acoustic Feature Extraction Tools for Clinical Speech Analysis. Proceedings of Interspeech.
- arxiv | official link | slides
Chanwoo Park, Anna Seo Gyeong Choi, Sunghye Cho, & Chanwoo Kim. (2025). Reasoning-Based Approach with Chain-of-Thought for Alzheimer’s Detection Using Speech and Large Language Models. Proceedings of Interspeech.
- arxiv | official link
Anna Seo Gyeong Choi*, Jonghyeon Park*, & Myungwoo Oh. (2025). Data-Driven Mispronunciation Pattern Discovery for Robust Speech Recognition. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).
- arxiv | official link
Robin Zhao*, Anna Seo Gyeong Choi*, Allison Koenecke†, & Anais Rameau†. (2024). Quantification of Automatic Speech Recognition System Performance on d/Deaf and Hard of Hearing Speech. The Laryngoscope.
- official link
Allison Koenecke, Anna Seo Gyeong Choi*, Katelyn X. Mei*, Hilke Schellmann†, & Mona Sloane†. (2024). Careless Whisper: Speech-to-Text Hallucination Harms. Proceedings of ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT).
- arxiv | official link | explainer video | github
Jinseo Kim, Anna Seo Gyeong Choi, & Sunghye Cho. (2024). KoFREN: Comprehensive Korean Word Frequency Norms Derived from Large Scale Free Speech Corpora. Proceedings of Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING).
- official link
Anna Seo Gyeong Choi, Jinseo Kim, Seo-hee Kim, Minseok Baek, & Sunghye Cho. (2024). Crosslinguistic Acoustic Feature-based Dementia Classification using Advanced Learning Architectures. Proceedings of Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING) Workshop on Resources and Processing of Linguistic, Para-linguistic and Extra-linguistic Data from People with Various Forms of Cognitive/Psychiatric/Developmental Impairments (RaPID-5).
- official link
Orestis Papakyriakopoulos*, Anna Seo Gyeong Choi*, William Thong, Dora Zhao, Jerome Andrews, Rebecca Bourke, Alice Xiang†, & Allison Koenecke†. (2023). Augmented Datasheets for Speech Datasets and Ethical Decision-Making. Proceedings of ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT).
- arxiv | official link | recorded talk | github
Sunhee Kim, Jooyeong Lee, Seo Gyeong Choi, Seunghun Ji, Jeemin Kang, Jongin Kim, et al. (2020). Building Korean Conversational Speech Data in Emergency Medical Domain. Phonetics and Speech Sciences, 12(4).
- official link