Abstract
PURPOSE
The purpose of this study was to compare men with women in terms of speech intelligibility, to investigate the validity of objective acoustic parameters related to speech intelligibility, and to establish reference data for future studies in various fields of prosthodontics.
MATERIALS AND METHODS
Twenty subjects (10 men and 10 women) participated in the present study. After sample sounds were recorded, speech intelligibility tests by three speech pathologists and acoustic analyses were performed. Speech intelligibility test scores and acoustic parameters, namely fundamental frequency, fundamental frequency range, formant frequency, formant ranges, vowel working space area, and vowel dispersion, were compared between men and women. In addition, the correlations between the speech intelligibility values and the acoustic variables were analyzed.
RESULTS
Women showed significantly higher speech intelligibility scores than men, and there were significant differences between men and women in most of the acoustic parameters used in the present study. However, the correlations between the speech intelligibility scores and the acoustic parameters were low.
CONCLUSION
The speech intelligibility test and the acoustic parameters used in the present study were effective in differentiating male voices from female voices, and their values might be used in future studies of patients undergoing maxillofacial prosthodontic treatment. However, further studies are needed on the correlation between speech intelligibility tests and objective acoustic parameters.
When patients have cancer in the maxillofacial area, resective surgery is a usual treatment option. The acquired palatal defects that result from maxillectomy cause several functional difficulties. Among them, speech disturbance can be regarded as one of the most devastating consequences.1 With speech problems, it is not easy for maxillectomy patients to return to society.2 As speech is a learned function, it is more easily disturbed by ablative surgery or congenital malformations than respiration and deglutition, which are primary, life-supporting functions.3 In the maxillectomy patient, the speech problem originates from the anatomical defect. The loss of palatal tissue and incorrect tongue-palatal contacts after maxillectomy distort the oronasal resonance equilibrium and compromise articulation. It is well known that speech function can be enhanced and speech intelligibility increased by prosthodontic treatment.4,5 Therefore, the success of prosthodontic treatment can be evaluated in terms of speech function, and establishing methods to evaluate patients' speech intelligibility is useful in evaluating the prosthodontic rehabilitation of maxillectomy patients.
Speech intelligibility, which is the accuracy with which a spoken word or phrase is understood, is crucial in communication among individuals. Many researchers have used the speech intelligibility (SI) test to measure it.6-8 This perceptual test is easy to use and requires no special equipment; however, it is a subjective method that requires judges. In addition, it can be influenced by the judges' familiarity with a talker's voice.9 Therefore, a method that can objectively quantify speech intelligibility is expected to be helpful in evaluating the improvement of patients' speech after prosthodontic treatment. To provide an acceptable level of speech quality, it is essential to understand the acoustic characteristics of normal speakers and, in particular, the factors related to speech intelligibility.
Several factors have been suggested to be related to speech intelligibility: gender, fundamental frequency, formant frequency, vowel working space, and vowel dispersion. Gender has been reported to be a prominent talker characteristic that can affect intelligibility.10 Men and women have been reported to show different glottal characteristics,11,12 and speaker sex is one of the major determinants of the acoustic properties of speech within a given language.13 It is not difficult for listeners to differentiate male voices from female voices.9 In addition, regarding the relation of gender to speech intelligibility, female voices have been stated to be more intelligible.14
As for fundamental frequency, it is not clear whether it affects speech intelligibility. One study reported no reliable difference in mean fundamental frequency between higher and lower intelligibility talkers.14 In addition, it was stated that there were no strong predictions regarding the relationship between fundamental frequency characteristics and intelligibility.15 On the other hand, another study reported that a wider range in fundamental frequency was related to a higher overall intelligibility score and that the fundamental frequency range was significantly greater for the group of female talkers than for the group of male talkers.10
Formant frequencies for vowels are known to differ substantially between speakers of different sexes.16 In addition, vowel formant frequency values have been widely used in the study of speech to assess speech intelligibility.16 Although factors such as the first formant (F1), the second formant (F2), and the F2-F1 difference have been used to characterize variance in word intelligibility,17,18 some authors have focused on the vowel frequency range. One previous study reported a stronger positive correlation between range in F2 and intelligibility than between range in F1 and intelligibility.19 However, another author reported that the area covered in F1 was a better correlate of overall intelligibility than the area covered in F2.10
Vowel working space has been reported to have a positive relationship with speech intelligibility.10,20-22 Vowel working space means the space enclosed by the first two formants of the corner vowels.22 The Euclidean area covered by the triangle defined by the mean of each vowel category has been used to assess the relationship between vowel space and overall speech intelligibility. Although it was hypothesized that the greater the triangular area, the higher the overall intelligibility, not every study has found a positive correlation between triangular vowel space area and speech intelligibility scores. One study stated that the points used to calculate triangular vowel space area might not be representative of the individual vowel tokens and suggested the use of vowel space dispersion instead of vowel working space.10 Vowel space dispersion was reported to provide an indication of the overall expansion or compactness of the set of individual vowel tokens from each talker.10
Studies of the speech function of maxillectomy or soft palate resection patients have not differentiated by gender.23-25 Therefore, gender-based differences in the acoustic characteristics of subjects need to be established. Such results will be helpful in future studies of the speech function of maxillectomy patients. The purpose of this study was to compare male speech with female speech in terms of speech intelligibility, to investigate the validity of objective parameters related to speech intelligibility, and to establish reference data for future studies in various fields of prosthodontics.
Twenty subjects, 10 men and 10 women, participated in the present study. The subjects were workers at Samsung Medical Center and were limited to speakers from the Seoul and Kyounggi regions of Korea in order to avoid any influence of linguistic background on the results. The age of the men ranged from 26 to 32 years, with a mean age of 28.9 years. The age of the women ranged from 26 to 32 years, with a mean age of 28.6 years. The ages of the subjects were limited to the twenties and thirties to exclude subjects who could have voice changes from secondary sex characteristics or menopause. One prosthodontist judged whether each subject had adequate intelligibility to perform the speech recording, hearing ability within normal limits, and normal oral structure and function. None of the subjects had a history of craniofacial anomalies or velopharyngeal impairment. Subjects who had recent changes in the oral environment from disease or treatment, or who had an upper respiratory infection on the recording day, were excluded.
The voices of the 20 subjects were recorded in a quiet, separate room. Parts of the 'Sanchaek' passage were used as sample passages for the speech intelligibility test. They included 26 words and 81 syllables. The three Korean vowels /a/, /i/, and /u/ were used as the sample vowel sounds for the acoustic studies. Each subject was seated, and a microphone (PC150; Sennheiser electronic, Wedemark, Germany) was placed in front of the subject's mouth. The Korean passages and the three sample sounds were provided to the subjects in printed form. Before any recordings, the protocol was explained to the subjects and they were allowed to read the material in advance. A distance of 5 cm from the microphone to the mouth was maintained during recording. Subjects were asked to read the passages as fast as they could. The passages were read within 14 seconds. For the acoustic analysis, subjects were asked to pronounce the sustained sample vowel sounds as clearly as possible for 5 seconds. The sounds were recorded directly on a personal computer using a software program (Multispeech model 3700; Kaypentax, Lincoln Park, NJ, USA). The sounds were sampled at 11,025 Hz.
The perceptual evaluations of speech intelligibility were performed by three trained speech pathologists who were unfamiliar with the subjects. Each had more than 4 years of experience in the field of speech pathology. After listening to the speech recordings over headphones in a quiet room using the MDVP program, they scored the subjects' recordings. They were encouraged to listen to the recordings as many times as needed. These subjective judgments were made on a 10-point scale, where 1 represented the worst score, 10 represented the best score, and 6 represented a reference for acceptable speech intelligibility. The mean of the three judges' ratings was then used as each subject's speech intelligibility score. The average speech intelligibility values of the men were compared with those of the women using the independent t-test.
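The scoring and group comparison described above can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the study: the ratings are hypothetical, the function names `mean_rating` and `student_t` are introduced here for illustration, and the pooled-variance (equal variances) form of the independent t-test is an assumption, since the text does not state which form was used.

```python
from math import sqrt
from statistics import mean, variance


def mean_rating(ratings_by_judge):
    """Average the judges' 10-point ratings for each subject.

    ratings_by_judge holds one list per judge, aligned by subject.
    """
    return [mean(scores) for scores in zip(*ratings_by_judge)]


def student_t(group_a, group_b):
    """Independent two-sample t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Equal variances are assumed here.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical ratings: three judges, three subjects per group (not study data).
women_judges = [[8, 7, 9], [7, 8, 9], [9, 7, 8]]
men_judges = [[6, 7, 5], [7, 6, 6], [6, 8, 5]]

women = mean_rating(women_judges)
men = mean_rating(men_judges)
t, df = student_t(women, men)
print(f"t = {t:.2f}, df = {df}")
```

Converting the t statistic to a P-value requires the t distribution, which the Python standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind`) returns both values.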
The acoustic analysis was performed with the same Multispeech program that was used for the recordings. With the Multispeech program, the formant history of the recorded signal was observed and a 0.5-second section in which the formant values had stabilized was selected. The average values of the first formant (F1) and the second formant (F2) of the selected section were obtained by tracking the formant history, and the F2-F1 differences were calculated. The resulting F1, F2, and F2-F1 difference values of the male subjects were compared with those of the female subjects. The independent t-test with Bonferroni correction was used as the statistical method. Then the difference between the highest F1 value and the lowest F1 value across all sample sounds was calculated and taken as the F1 range. The F2 ranges were obtained by the same procedure in the men and women groups. The F1 and F2 ranges of the men were compared with those of the women using the independent t-test.
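The per-subject formant quantities described above can be sketched as follows. The formant values are hypothetical, and the names `formant_summary` and `bonferroni_alpha` are introduced here for illustration. The number of comparisons used for the Bonferroni correction is not stated in the text, so the value 3 below is only an assumption.

```python
def formant_summary(formants):
    """Summarize one subject's formants.

    formants maps each vowel to a (F1_hz, F2_hz) pair taken from the
    stabilized 0.5-second section.
    """
    f1_values = [f1 for f1, _ in formants.values()]
    f2_values = [f2 for _, f2 in formants.values()]
    return {
        # F2-F1 difference for each vowel
        "f2_minus_f1": {v: f2 - f1 for v, (f1, f2) in formants.items()},
        # Range = highest value minus lowest value across the sample sounds
        "f1_range": max(f1_values) - min(f1_values),
        "f2_range": max(f2_values) - min(f2_values),
    }


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance threshold under Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical formant values in Hz (not study data).
subject = {"a": (800.0, 1300.0), "i": (300.0, 2300.0), "u": (350.0, 800.0)}
summary = formant_summary(subject)
threshold = bonferroni_alpha(0.05, 3)  # assumed: 3 comparisons

print(summary["f1_range"], summary["f2_range"])
```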
Fundamental frequency was obtained from the same recorded sounds that were used in the formant analysis. The fundamental frequencies of the /a/, /i/, and /u/ sounds were obtained using the Multispeech program. The average fundamental frequency of the three sounds was used as the mean fundamental frequency of each subject. Then the minimum and maximum fundamental frequencies were extracted from the three sounds, and the difference between them was regarded as the fundamental frequency range. The fundamental frequencies of the men for the /a/, /i/, and /u/ sounds were compared separately with those of the women. The same comparisons were made for the mean fundamental frequency values and the fundamental frequency ranges. The independent t-test was used as the statistical method.
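For one subject, the mean fundamental frequency and the fundamental frequency range can be sketched as follows. The values are hypothetical, the function name `f0_summary` is introduced here for illustration, and the range is read as the maximum minus the minimum over the three vowel sounds, which is one interpretation of the procedure.

```python
from statistics import mean


def f0_summary(f0_by_vowel):
    """Return (mean F0, F0 range) in Hz for one subject.

    f0_by_vowel maps each vowel to the F0 measured for that sound.
    """
    values = list(f0_by_vowel.values())
    return mean(values), max(values) - min(values)


# Hypothetical F0 values in Hz for one subject (not study data).
f0 = {"a": 118.0, "i": 124.0, "u": 127.0}
mean_f0, f0_range = f0_summary(f0)
print(mean_f0, f0_range)
```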
For the vowel working space area, the F1 and F2 pair of each vowel was viewed as a coordinate in the x-y plane, and the Euclidean area of the triangle whose vertices were the three vowels was calculated. The areas of the triangles of the male subjects and those of the female subjects were compared. In addition, vowel space dispersion was obtained. The distances from a central point to the vertices of the vowel working triangle were obtained, and vowel space dispersion was calculated as the mean of these distances for each subject. The vowel working space area and the vowel space dispersion of the men were compared with those of the women. The independent t-test was used as the statistical method. In addition, the shapes of the vowel working space triangles of the two groups were compared.
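The two vowel space measures can be sketched as follows. The formant values are hypothetical and the function names are introduced here for illustration. The text does not define the "central point", so the sketch assumes it is the centroid of the three corner vowels.

```python
from math import hypot


def vowel_space_area(a, i, u):
    """Area of the triangle with vertices a, i, u, each an (F1, F2) pair.

    Uses the shoelace formula; the result is in Hz squared.
    """
    (x1, y1), (x2, y2), (x3, y3) = a, i, u
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2


def vowel_dispersion(a, i, u):
    """Mean distance from the central point to the three corner vowels.

    Assumption: the central point is the centroid of the triangle.
    """
    corners = [a, i, u]
    cx = sum(x for x, _ in corners) / 3
    cy = sum(y for _, y in corners) / 3
    return sum(hypot(x - cx, y - cy) for x, y in corners) / 3


# Hypothetical (F1, F2) values in Hz for one subject (not study data).
a, i, u = (800.0, 1300.0), (300.0, 2300.0), (350.0, 800.0)
area = vowel_space_area(a, i, u)
dispersion = vowel_dispersion(a, i, u)
print(area, dispersion)
```

Both measures grow when the corner vowels move apart, but the area scales with the square of the distances while the dispersion scales linearly, so the two are not interchangeable.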
Then the correlations between the speech intelligibility values of the men and women and the acoustic variables were analyzed. Spearman's rank correlation coefficient was used to detect correlations between speech intelligibility and fundamental frequency, fundamental frequency range, formant frequency, formant ranges, vowel working space area, and vowel dispersion. In all analyses, P-values less than 0.05 were considered statistically significant, and all data were analyzed using the statistical software SPSS version 12 (SPSS, Chicago, IL, USA).
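Spearman's rank correlation can be sketched as the Pearson correlation of the ranked values, with tied values sharing their average rank. This is an illustrative implementation rather than the SPSS routine used in the study; the data and the function names `average_ranks` and `spearman_rho` are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value is tied with this one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in order[pos:end + 1]:
            ranks[k] = shared
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((p - mx) * (q - my) for p, q in zip(rx, ry))
    sx = sqrt(sum((p - mx) ** 2 for p in rx))
    sy = sqrt(sum((q - my) ** 2 for q in ry))
    return cov / (sx * sy)


# Hypothetical intelligibility scores and vowel space areas (not study data).
scores = [6.3, 7.0, 7.3, 8.0, 8.7]
areas = [210000.0, 340000.0, 260000.0, 330000.0, 390000.0]
rho = spearman_rho(scores, areas)
print(round(rho, 3))
```

A significance test for rho requires an additional step that is not shown here; SciPy's `scipy.stats.spearmanr` returns both the coefficient and a P-value.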
The results of the comparison of speech intelligibility test scores, fundamental frequencies, and fundamental frequency ranges between men and women are shown in Table 1. The average speech intelligibility score of the women was significantly higher than that of the men (P = .046). The average fundamental frequencies of the men for the /a/, /i/, and /u/ sounds were significantly lower than those of the women (P < .001). In addition, the average fundamental frequency of the men across the three sample sounds was significantly lower than that of the women (P < .001). However, there was no significant difference between men and women in the fundamental frequency ranges.
In the formant analysis of the /a/ sound, women showed significantly higher frequency values in F1, F2, and F2-F1. For the /i/ sound, the F2 and F2-F1 frequencies of the women were significantly higher than those of the men, but there was no significant difference between the two groups in the mean F1 frequency. For the /u/ sound, there were no significant differences between men and women in any of the frequency values (Table 2). The frequency ranges of the women were larger than those of the men in both the F1 range (P = .004) and the F2 range (P < .001). The mean frequency ranges and standard deviations of the men and women in F1 and F2 and the comparisons between them are shown in Table 3.
For the vowel working space areas, women showed significantly larger areas than men (P < .001). In addition, women demonstrated significantly higher values than men in the comparison of vowel dispersions (P < .001). Table 4 presents the results of the comparisons of the vowel working space areas and vowel dispersions. The vowel working spaces of the men and women, plotted from mean values, showed the typical vowel triangle shape. However, the corner vowels of the men were located closer together than those of the women. This difference is depicted in Fig. 1.
Table 5 shows the correlation analysis between speech intelligibility scores and acoustic parameters, including fundamental frequency, fundamental frequency range, formant frequency, formant range, vowel working space area, and vowel dispersion. The correlations between the speech intelligibility scores and the acoustic parameters were low, and none of them was significant.
The results of this study showed that women differed from men in most of the parameters related to the subjects' voices. The speech intelligibility tests demonstrated that the speech intelligibility scores of women were higher than those of men. This is in accordance with the results of previous studies.10,14 One of the previous studies reported that this intelligibility difference might be due to an increased prevalence of specific reduction phenomena in male speech relative to female speech, rather than to voice quality differences between males and females.14 One of the results of the present study supports this suggestion. In the comparison of vowel working spaces, the triangle formed from the mean frequency values of the men was smaller than that of the women, and vowel centralization seemed to occur. Vowel centralization is known to be a typical feature of casual or reduced speech,14,26,27 and it is thought to be the cause of the difference in speech intelligibility between men and women in the present study. Even in the clear speech used in the present study, men seemed to centralize vowel sounds. As more peripheral vowel category locations in the F1 by F2 space have been reported for a higher-intelligibility talker relative to a lower-intelligibility talker,15 the results of the comparison of vowel working space between men and women might explain their difference in the speech intelligibility tests.
In the analysis of fundamental frequencies, there were distinct differences between men and women. These results are in accord with a previous study, which reported that fundamental frequency is a characteristic feature that typically differs between male and female talkers.10 The fundamental frequency values in the present study were also in accordance with a previous study.28 The results of this study might be used as reference data for future studies. However, there was no significant difference between men and women in terms of fundamental frequency range in the present study. In the present study, the fundamental frequency range was obtained from the 0.5-second section in which the formants had stabilized. As the fundamental frequency range of clear speech has been reported to differ from that of conversational speech,26 the limited sample sounds may be the cause of this result. Further studies of frequency range using other speech samples, such as conversational speech, are needed to clarify the role of fundamental frequency range in acoustic analysis.
In the present study, there were significant differences in the formant values between men and women for the /a/ and /i/ sounds. As the F1 and F2 frequencies are related principally to tongue height and advancement, and the F2-F1 difference can be interpreted as tongue advancement-retraction,13 the tongue positions of men seemed to differ from those of women when they produced various sounds. Vowel formant frequency is known to be affected by a number of factors, including the intrinsic size of the vocal tract, the size of the tongue, the size and configuration of the oral cavity, the size and configuration of the pharyngeal cavity, and the tongue configuration.29 These parameters are related to anatomy, and as the anatomical difference between men and women is evident, formant differences based on gender would be expected. In future prosthodontic studies of speech function, the separation of subjects by gender seems to be required.
All of the acoustic parameters used in the present study showed a low overall correlation with the results of the speech intelligibility tests. Although fundamental frequency and fundamental frequency range have been reported to be related to speech intelligibility,10,15 their correlations with it were low in the present study. In addition, although the formant frequencies, which are related to anatomical components, differed between men and women in the present study, no relation between them and speech intelligibility was found. Several reasons can be inferred. The ratings of the subjective speech intelligibility test could be inaccurate. Although experienced speech pathologists took part in this study, judging subjects' speech after listening to short passages could be subjective. In addition, the speech intelligibility test did not seem to be standardized. During the evaluation of speech intelligibility, the speech pathologists disagreed in their scoring. Therefore, standardized sample passages and evaluation methods need to be developed. Besides the subjective speech intelligibility tests, the small sample size could have contributed to the low correlation. Further studies with other sample sounds and a larger sample size are required. In addition, inaccuracies in the recording process arising from the equipment and the room could be a factor in the low correlation between the tests.
After ablative surgery in the maxillofacial area, such as maxillectomy, the patient's speech function has to be restored to a reasonable level. To achieve acceptable recovery of speech function, appropriate evaluation methods and reference data for the outputs of acoustic tests are needed. Because voice quality and the results of acoustic tests vary among individuals, the patient's original voice may be the best reference in restoring the patient's speech function. However, when it is not available, the results of the present study, which were in accordance with previous studies, can be used as reference data for the evaluation of various prosthodontic procedures, including maxillofacial prosthodontic treatment.
The speech intelligibility test and the acoustic parameters used in the present study, including fundamental frequencies, fundamental frequency ranges, formant frequencies, vowel working space area, and vowel dispersion, were effective in differentiating male voices from female voices. In addition, their values might be used in future studies of patients undergoing maxillofacial prosthodontic treatment. However, further studies are needed on the correlation between speech intelligibility tests and appropriate acoustic parameters.
References
1. Plank DM, Weinberg B, Chalian VA. Evaluation of speech following prosthetic obturation of surgically acquired maxillary defects. J Prosthet Dent. 1981. 45:626–638.
2. Yoshida H, Michi K, Ohsawa T. Prosthetic treatment of speech disorders due to surgically acquired maxillary defects. J Oral Rehabil. 1990. 17:565–571.
3. Beumer J, Curtis TA, Marunick MT. Maxillofacial rehabilitation: Prosthodontic and surgical considerations. 1996. St. Louis: Elsevier;225–284.
4. Kipfmueller LJ, Lang BR. Presurgical maxillary prosthesis: an analysis of speech intelligibility. J Prosthet Dent. 1972. 28:620–625.
5. Majid AA, Weinberg B, Chalian VA. Speech intelligibility following prosthetic obturation of surgically acquired maxillary defects. J Prosthet Dent. 1974. 32:87–96.
6. Rogers CL, DeMasi TM, Krause JC. Conversational and clear speech intelligibility of /bVd/ syllables produced by native and non-native English speakers. J Acoust Soc Am. 2010. 128:410–423.
7. Mahanna GK, Beukelman DR, Marshall JA, Gaebler CA, Sullivan M. Obturator prostheses after cancer surgery: an approach to speech outcome assessment. J Prosthet Dent. 1998. 79:310–316.
8. Umino S, Masuda G, Ono S, Fujita K. Speech intelligibility following maxillectomy with and without a prosthesis: an analysis of 54 cases. J Oral Rehabil. 1998. 25:153–158.
9. Nygaard LC, Sommers MS, Pisoni DB. Effects of stimulus variability on perception and representation of spoken words in memory. Percept Psychophys. 1995. 57:989–1001.
10. Bradlow AR, Torretta GM, Pisoni DB. Intelligibility of normal speech I: Global and fine-grained acoustic-phonetic talker characteristics. Speech Commun. 1997. 20:255–272.
11. Klatt D, Klatt L. Analysis, synthesis, and perception of voice quality variations among female and male talkers. J Acoust Soc Am. 1990. 87:820–857.
12. Hanson HM. Glottal characteristics of female speakers: Acoustic, physiological, and perceptual correlates. J Acoust Soc Am. 1997. 101:466–481.
13. Vorperian HK, Kent RD. Vowel acoustic space development in children: a synthesis of acoustic and anatomic data. J Speech Lang Hear Res. 2007. 50:1510–1545.
14. Byrd D. Relations of sex and dialect to reduction. Speech Commun. 1994. 15:39–54.
15. Bond ZS, Moore TJ. A note on the acoustic phonetic characteristics of inadvertently clear speech. Speech Commun. 1994. 14:325–337.
16. Peterson G, Barney H. Control methods used in a study of the vowels. J Acoust Soc Am. 1952. 24:175–184.
17. Bunton K, Weismer G. The relationship between perception and acoustics for a high-low vowel contrast produced by speakers with dysarthria. J Speech Lang Hear Res. 2001. 44:1215–1228.
18. Turner GS, Tjaden K, Weismer G. The influence of speaking rate on vowel space and speech intelligibility for individuals with amyotrophic lateral sclerosis. J Speech Hear Res. 1995. 38:1001–1013.
19. Monsen RB. Normal and reduced phonological space: the productions of English vowels by deaf adolescents. J Phon. 1976. 4:189–198.
20. de Bruijn MJ, ten Bosch L, Kuik DJ, Quené H, Langendijk JA, Leemans CR, Verdonck-de Leeuw IM. Objective acoustic-phonetic speech analysis in patients treated for oral or oropharyngeal cancer. Folia Phoniatr Logop. 2009. 61:180–187.
21. Liu HM, Tsao FM, Kuhl PK. The effect of reduced vowel working space on speech intelligibility in Mandarin-speaking young adults with cerebral palsy. J Acoust Soc Am. 2005. 117:3879–3889.
22. Weismer G, Laures JS, Jeng JY, Kent RD, Kent JF. Effect of speaking rate manipulations on acoustic and perceptual aspects of the dysarthria in amyotrophic lateral sclerosis. Folia Phoniatr Logop. 2000. 52:201–219.
23. Umino S, Masuda G, Ono S, Fujita K. Speech intelligibility following maxillectomy with and without a prosthesis: an analysis of 54 cases. J Oral Rehabil. 1998. 25:153–158.
24. Rieger JM, Wolfaardt JF, Jha N, Seikaly H. Maxillary obturators: the relationship between patient satisfaction and speech outcome. Head Neck. 2003. 25:895–903.
25. Bohle G 3rd, Rieger J, Huryn J, Verbel D, Hwang F, Zlotolow I. Efficacy of speech aid prostheses for acquired defects of the soft palate and velopharyngeal inadequacy-clinical assessments and cephalometric analysis: a Memorial Sloan-Kettering Study. Head Neck. 2005. 27:195–207.
26. Picheny MA, Durlach NI, Braida LD. Speaking clearly for the hard of hearing. II: Acoustic characteristics of clear and conversational speech. J Speech Hear Res. 1986. 29:434–446.
27. Moon SJ, Lindblom B. Interaction between duration, context, and speaking style in English stressed vowels. J Acoust Soc Am. 1994. 96:40–55.
28. Pyo HY, Sim HS, Song YK, Yoon YS, Lee EK, Lim SE, Hah HR, Choi HS. The acoustic study on the voices of Korean normal adults. Speech Sci. 2002. 9:179–192.
29. Liu H, Ng ML. Formant characteristics of vowels produced by Mandarin esophageal speakers. J Voice. 2009. 23:255–260.