Abstract
Purpose
Many studies have shown that vocal fundamental frequency (F0) changes when phonating subjects hear their vocal pitch feedback shifted upward or downward. This study was performed to determine whether vocal parameters [F0, intensity, jitter, shimmer, and noise-to-harmonic ratio (NHR)] in normal males respond to changes in the frequency of pure tone masking.
Materials and Methods
Twenty healthy male subjects participated in this study. Subjects vocalized /a/ vowel sounds while listening to a pitch-shifted pure tone through headphones (upward pitch-shift in succession: 1 kHz to 2 kHz and 1 kHz to 4 kHz at 50 dB or 80 dB, respectively; downward pitch-shift in succession: 1 kHz to 250 Hz and 1 kHz to 500 Hz at 50 dB or 80 dB, respectively).
Results
Vocal intensity and F0 increased, whereas jitter decreased, as the pitch of the pure tone was shifted upward. However, neither shimmer nor NHR was correlated with pitch-shift feedback of pure tones. Unlike vocal pitch-shift feedback in other studies, upward pitch-shift feedback of pure tones caused vocal F0 and intensity to change in the same direction as the pitch-shift.
Introduction
Auditory feedback through acoustic auto-monitoring of vocal output plays an important role in the control of phonation. Vocal intensity tends to increase in response to masking noise, and this is known as the Lombard effect. In addition to vocal intensity, F0 is closely linked to the auditory system.1 Several experimental studies have demonstrated that subjects change their F0 when their vocal feedback is distorted.2 Tanabe et al.3 showed that another control loop related to vocal output is feedback from the laryngeal sensory receptors. This is a complex neuromuscular reflex system referred to as kinesthetic feedback.4 Some reports showed that kinesthetic receptors are important for fine control of F0.5 Although the responses of vocal intensity and F0 to a change in the intensity of masking noise have been relatively well studied, there are no previous investigations of the relationship between vocal parameters and the frequency of pure tone masking.
Some studies have shown that vocal F0 is reduced to compensate when the perceived vocal pitch is higher than the intended pitch,6 suggesting that certain frequency ranges of the feedback signal can be involved in vocal control. In addition to vocal pitch-shift feedback, changes in vocal F0 have also been observed in response to non-vocal sounds such as clicks.7,8 However, vocal pitch-shift feedback differs from non-vocal feedback in that the latency of the response to non-vocal sounds has been shown to be shorter than that of the vocal pitch-shift response. The term "pitch-shift response" refers to the compensatory change in vocal F0 described above. The pitch-shift response therefore helps to stabilize vocal F0 around an actual or intended target F0. However, earlier studies are limited by the use of stimuli that were short and presented suddenly.
In our study, we investigated whether auditory kinesthetic feedback was sensitive to pitch-shifts of pure tones during vowel vocalization, as has been shown for vocal pitch-shift feedback. We also analyzed changes in the stability of phonation (jitter, shimmer, and NHR), as well as in vocal F0, elicited by perturbations in the pitch of pure tones.
Materials and Methods
Twenty healthy male subjects (28-33 years of age; mean age, 29.7 years) participated in this study. None of the subjects had a history of neurological deficits or of speech, language, auditory, or voice disorders, and none were trained singers or regular smokers. Each subject passed a hearing screening test at 15 dB sound pressure level (SPL) at 500 Hz and at 1, 2, 4, and 8 kHz bilaterally.
Subjects were seated comfortably in a sound-treated booth. They were instructed to vocalize /a/ vowel sounds at a comfortable and steady habitual pitch while listening to pitch-shifted pure tones through headphones. All subjects were tested in 8 binaural pitch-shift masking conditions (increasing pitch-shift in succession: 1 kHz to 2 kHz and 1 kHz to 4 kHz at 50 dB or 80 dB, respectively; decreasing pitch-shift in succession: 1 kHz to 250 Hz and 1 kHz to 500 Hz at 50 dB or 80 dB, respectively). Subjects sustained each vowel sound for 5 seconds, and we changed the masking frequency 3 seconds after onset. The first and last 500 ms of each vowel sound were discarded to minimize any potential initiation and termination effects. Half of the subjects received decreasing pitch-shift masking followed by increasing pitch-shift masking, with a break of 5 minutes between conditions. The other half received increasing pitch-shift masking followed by decreasing pitch-shift masking. Vocal responses were analyzed with a Kay Elemetrics CSL Model 4300B (Kay Elemetrics Corporation, Lincoln Park, NJ, USA). Vocalization was transduced with an AKG C420 microphone (AKG Acoustics Harman Pro GmbH, Munich, Germany). We used an AC 40 (Interacoustics, Denmark) to deliver the pitch-shifted pure tones.
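The timing of this protocol can be laid out as a short sketch. The durations and the eight conditions are taken from the description above; the assumption that the pre-shift window runs from 0.5 s to 3.0 s and the post-shift window from 3.0 s to 4.5 s is ours, as are all function names and the sample rate used in the usage note. None of this reflects the CSL software's internal processing.

```python
from itertools import product

# Timing values from the protocol described above (seconds).
VOWEL_DURATION = 5.0  # sustained /a/
SHIFT_TIME = 3.0      # masking frequency changes at this point
TRIM = 0.5            # discarded at the start and at the end

# Four frequency shifts (Hz) crossed with two presentation levels (dB).
SHIFTS_HZ = [(1000, 2000), (1000, 4000), (1000, 250), (1000, 500)]
LEVELS_DB = [50, 80]


def conditions():
    """Return all (start_hz, end_hz, level_db) masking conditions."""
    return [(f0, f1, level) for (f0, f1), level in product(SHIFTS_HZ, LEVELS_DB)]


def analysis_windows(duration=VOWEL_DURATION, shift=SHIFT_TIME, trim=TRIM):
    """Return (pre, post) windows as (start, end) tuples in seconds.

    Assumption: each window is bounded by the shift time on one side
    and by the trimmed edge of the vowel on the other.
    """
    pre = (trim, shift)
    post = (shift, duration - trim)
    return pre, post


def window_to_samples(window, sample_rate):
    """Convert a (start, end) window in seconds to sample indices."""
    start, end = window
    return int(round(start * sample_rate)), int(round(end * sample_rate))
```

With a hypothetical sample rate of 44,100 Hz, `window_to_samples(analysis_windows()[0], 44100)` gives the index range of the pre-shift segment. Under these assumptions the pre-shift window (2.5 s) is longer than the post-shift window (1.5 s).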
Results
The mean vocal intensity, fundamental frequency, jitter, shimmer, and NHR in each condition are shown in Fig. 1. The mean vocal parameters for the pre- and post-shift periods were measured separately. Each vocal parameter from the pre-shift period was subtracted from that of the post-shift period; positive numbers indicate increases and negative numbers indicate decreases.
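The subtraction described above, and the per-condition counts of subjects who increased a parameter, can be sketched as follows. The function names and the numbers in the usage note are hypothetical; they are not data from this study.

```python
def shift_change(pre_value, post_value):
    """Post-shift value minus pre-shift value for one vocal parameter.

    Positive results indicate an increase after the shift,
    negative results a decrease.
    """
    return post_value - pre_value


def summarize(changes):
    """Count how many subjects showed an increase (change > 0)."""
    n = len(changes)
    increased = sum(1 for c in changes if c > 0)
    return {
        "n": n,
        "increased": increased,
        "percent_increased": 100.0 * increased / n,
    }
```

For example, hypothetical pre/post F0 pairs of (120.0, 123.0), (118.0, 117.0), (125.0, 125.0), and (110.0, 112.5) Hz give changes of 3.0, -1.0, 0.0, and 2.5 Hz, so two of four subjects count as increasing. A subject with no change is not counted as an increase in this sketch.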
Mean vocal intensity increased significantly as the pitch of the pure tone shifted upward in each of the conditions (Fig. 1A). Of the 20 subjects, 16 (1 kHz to 2 kHz and 1 kHz to 4 kHz at 50 dB), 18 (1 kHz to 2 kHz at 80 dB), and 19 (1 kHz to 4 kHz at 80 dB) increased their vocal intensity as the pitch of the tone shifted upward. However, there was no correlation between vocal intensity and downward pitch-shift feedback. The magnitude of the vocal intensity response was greater with pitch feedback at 80 dB than at 50 dB, and it was greater for a pitch-shift of 1 kHz to 4 kHz than for 1 kHz to 2 kHz.
Mean vocal F0 increased as the pitch of the pure tone shifted upward in each of the conditions (Fig. 1B). The majority of subjects increased their F0 in response to upward pitch-shift feedback of a pure tone. However, there were no significant differences in the change of F0 when subjects received downward pitch-shift feedback of a pure tone at 50 dB and 80 dB. The percentage of subjects who increased their F0 in response to upward pitch-shift feedback of a pure tone was 60% (1 kHz to 2 kHz at 50 dB), 65% (1 kHz to 4 kHz at 50 dB), 70% (1 kHz to 2 kHz at 80 dB), and 80% (1 kHz to 4 kHz at 80 dB).
Jitter tended to decrease when the masking frequency shifted upward at 50 dB and 80 dB (Fig. 1C).
Neither shimmer nor NHR was correlated with pitch-shift feedback (Figs. 1D and E). The degree of change in shimmer and NHR was negligible.
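For readers less familiar with these stability measures, jitter and shimmer quantify cycle-to-cycle variability in glottal period and in peak amplitude, respectively. The sketch below uses one common "local percent" definition: the mean absolute difference between consecutive cycles, divided by the mean value. The source does not state which formula the CSL software applied, so this is an illustrative assumption rather than the authors' computation, and the function names and input values are hypothetical.

```python
def local_perturbation_percent(values):
    """Mean absolute difference between consecutive cycles,
    expressed as a percentage of the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


def jitter_percent(periods_s):
    """Local jitter from a sequence of glottal periods (seconds)."""
    return local_perturbation_percent(periods_s)


def shimmer_percent(peak_amplitudes):
    """Local shimmer from a sequence of per-cycle peak amplitudes."""
    return local_perturbation_percent(peak_amplitudes)
```

Under this definition, a perfectly regular sequence of periods gives a jitter of zero, and a voice that alternates between 10.0 ms and 10.2 ms periods gives a jitter of just under 2%. A decrease in jitter therefore corresponds to more regular vocal fold vibration.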
Discussion
Vocal reaction in response to noise is a physiological reflex in preparation for the possibility of verbal communication. The pitch-shift response, which stabilizes vocal F0 by correcting pitch perturbations, has been widely recognized.6 In this voice auditory feedback response, vocal F0 changes in the direction opposite to the vocal pitch-shift.9 When feedback pitch is perceived to be lower, vocal F0 is increased; conversely, when feedback pitch is perceived to be higher, vocal F0 is decreased. Speakers modulate their voices to compensate for changes in the pitch of voice auditory feedback.
Most earlier studies have analyzed vocal responses to vocal pitch-shift feedback within a narrow range of pitch change (from 50 to 200 cents).10 In our study, however, we tested the effects of pitch-shift feedback of pure tones on vocal response in normal-hearing subjects. The pure-tone shifts we tested lay in a range different from the speech frequencies, moving from middle to low tone frequencies and from middle to high tone frequencies.
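To compare the pure-tone shifts used here with the 50 to 200 cent range, a frequency ratio can be converted to cents with the standard definition of 1200 cents per octave, cents = 1200 × log2(f_end / f_start). The short sketch below applies it to the four shifts in this study; the function name is ours.

```python
import math


def cents(f_start_hz, f_end_hz):
    """Size of a frequency shift in cents (1200 cents = one octave)."""
    return 1200.0 * math.log2(f_end_hz / f_start_hz)


# The four pure-tone shifts used in this study (Hz).
STUDY_SHIFTS_HZ = [(1000, 2000), (1000, 4000), (1000, 500), (1000, 250)]
STUDY_SHIFTS_CENTS = [cents(a, b) for a, b in STUDY_SHIFTS_HZ]
```

The shifts correspond to one or two octaves, that is, 1200 or 2400 cents upward or downward, which is several times larger than the 50 to 200 cent perturbations typical of vocal pitch-shift experiments.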
As mentioned above, vocal F0 increases when vocal pitch feedback shifts downward. With pure tone masking, however, the shift that increased vocal F0 was not downward but upward, and vocal F0 showed no change when the pitch of the pure tone masking shifted downward. Vocal intensity likewise increased only when the pitch of the pure tone masking was shifted upward. Most subjects said that they felt as if their speech was distorted while hearing the upward pitch-shift of pure tone masking. According to them, the masking noise seemed to increase during an upward pitch-shift of pure tone masking, and this impression was much stronger at 80 dB. However, subjects did not perceive similar changes in pure tone masking when hearing a downward pitch-shift of pure tones.
The mechanism underlying our results is not clear. However, some possibilities include the following. First, pitch-shift feedback of pure tones resulted in an overall following response of vocal F0 and intensity. A following response means that the change in vocal F0 or intensity was in the same direction as the pitch-shift stimulus. In contrast, it is well established that vocal pitch-shift feedback results in an opposing response of vocal F0: that is, the change in vocal F0 is in the direction opposite to the pitch-shift stimulus.6 When vocalizing at a particular pitch, subjects compare pitch memory with auditory, proprioceptive, and kinesthetic feedback.11 For vocal pitch-shift feedback, subjects may rely primarily on pitch memory and adjust F0 output so that auditory feedback aligns with memory, which results in the opposing response. For pitch-shift feedback with pure tones, however, the frequency of the pure tone differs from vocal F0. When external pitch feedback is clearly different from the subject's vocal F0, subjects are likely not to compensate for the shift in pure tone frequency. Subjects who follow the direction of the pitch-shift stimulus may adopt an external reference to control vocal F0 and intensity.12 Because the external reference is different from the internal vocal F0, the external reference may dominate the vocal control system. This mechanism may produce the following response of vocal F0 and intensity to pitch-shift feedback of pure tone masking.
Second, vocal F0 was related only to upward pitch feedback, not to downward pitch feedback. An increase in the pitch of pure tone masking may have a more noticeable effect than a decrease. One report on the effects of frequency-shifted feedback on vocal F0 showed a similar pattern: the change in vocal F0 was larger for the upward shift than for the downward shift, although the direction of change was opposite to the stimulus.13
Third, the increase in vocal F0 in response to upward pitch-shift feedback was related to vocal intensity. Some investigators have reported that an increase in F0 occurs concurrently with an increase in vocal intensity under altered auditory feedback.14 In an aerodynamic study, vocal intensity usually increased with subglottic air pressure, which is associated with an increase in F0.15 Although it is not known whether the change in vocal F0 is secondary to the change in vocal intensity, the present study showed that pitch feedback of pure tones could affect both vocal F0 and vocal intensity.
As for the relationship between jitter and pitch-shift feedback, it is important to consider that the vocalis and cricothyroid muscles exert more balanced force as vocal intensity increases.15 This may lead to increased stability, as reflected by a decrease in jitter. However, shimmer and NHR showed no correlation with pitch-shift feedback. Perhaps the change in intensity was too small to change shimmer or NHR.
The change in vocal parameters in response to pitch changes of pure tones with unperturbed vocal feedback has not previously been studied. Unlike vocal pitch-shift feedback, the upward pitch-shift feedback used here differed from the subject's own voice. The present results showed that pure tone perception made subjects change their vocal F0 and intensity in the same direction as the pitch-shift. Therefore, a change in the frequency of a pure tone also affects auditory kinesthetic feedback.
References
1. Lane H, Tranel B, Sisson C. Regulation of voice communication by sensory dynamics. J Acoust Soc Am. 1970. 47:618–624.
2. Elliot L, Neimoeller A. The role of hearing in controlling voice fundamental frequency. Int Audiol. 1970. 9:47–52.
3. Tanabe M, Kitajima K, Gould WJ. Laryngeal phonatory reflex. The effect of anesthetization of the internal branch of the superior laryngeal nerve: Acoustic aspects. Ann Otol Rhinol Laryngol. 1975. 84:206–212.
4. Schultz-Coulon HJ. The neuromuscular phonatory control system and vocal function. Acta Otolaryngol. 1978. 86:142–153.
5. Sundberg J, Iwarsson J, Billström AH. Significance of mechanoreceptors in the subglottal mucosa for subglottal pressure control in singers. J Voice. 1995. 9:20–26.
6. Burnett TA, Freedland MB, Larson CR, Hain TC. Voice F0 responses to manipulations in pitch feedback. J Acoust Soc Am. 1998. 103:3153–3161.
7. Sapir S, McClean MD, Larson CR. Human laryngeal responses to auditory stimulation. J Acoust Soc Am. 1983. 73:315–321.
8. Baer T. Reflex activation of laryngeal muscles by sudden induced subglottal pressure changes. J Acoust Soc Am. 1979. 65:1271–1275.
9. Burnett TA, Senner JE, Larson CR. Voice F0 responses to pitch-shifted auditory feedback: a preliminary study. J Voice. 1997. 11:202–211.
10. Sivasankar M, Bauer JJ, Babu T, Larson CR. Voice responses to changes in pitch of voice or tone auditory feedback. J Acoust Soc Am. 2005. 117:850–857.
11. Fairbanks G. Systematic research in experimental phonetics. I. A theory of the speech mechanism as a servosystem. J Speech Hear Disord. 1954. 19:133–139.
12. Hain TC, Burnett TA, Kiran S, Larson CR, Singh S, Kenney MK. Instructing subjects to make a voluntary response reveals the presence of two components to the audio-vocal reflex. Exp Brain Res. 2000. 130:133–141.
13. Elman JL. Effects of frequency-shifted feedback on the pitch of vocal productions. J Acoust Soc Am. 1981. 70:45–50.