Abstract: Acoustic analysis of the voice is an objective, quantitative, non-invasive, and reproducible method for evaluating voice quality, and can be used to detect and analyze the acoustic characteristics of normal, artistic, and pathological voices. With advances in modern medicine, physics, statistics, and artificial intelligence, research on voice acoustic analysis has made new progress in recent years, particularly in the development and applicability of acoustic parameters. In addition, computer-assisted methods such as artificial neural networks can perform complex multi-parameter analyses, greatly improving the efficiency of acoustic analysis. This paper reviews the methods of voice acoustic analysis and their latest developments.
Key words:
- voice quality
- objective evaluation
- acoustic analysis