Silent, Non-Invasive Communications With The Profoundly Deaf


Ultrasonic hearing has been found capable of supporting frequency discrimination and speech detection in normal and older hearing-impaired and profoundly deaf human subjects. When speech signals are modulated into the ultrasonic range, listening to words results in the clear perception of the speech stimuli and not a sense of high-frequency vibration. These data indicate that ultrasonic hearing has potential as a communications channel in the rehabilitation of hearing disorders as well as in many other applications.



By M. L. Lenhardt, Department of Otolaryngology and Biomedical Engineering Program, Medical College of Virginia, Virginia Commonwealth University, Richmond, VA 23298, and Hearing Innovations Inc., 2451 E. Calle Los Altos, Tucson, AZ 85718.

R. Skellett, P. Wang, A. M. Clarke, Biomedical Engineering Program, Medical College of Virginia, Virginia Commonwealth University, Richmond, VA 23298.

The upper range of human air conduction hearing is believed to be no higher than about 24,000 Hz (1); nevertheless, there have been reports of humans hearing well into the ultrasonic range but only when the ultrasonic stimuli are delivered by bone conduction (2-4). Furthermore, ultrasonic bone conduction hearing in humans has been readily demonstrated in various conditions of auditory pathology, including sensorineural hearing loss and middle ear disorders (2). Ultrasonic stimulation of the skull with frequencies up to 108 kHz induces a perception of sound within the head without any sensation of cutaneous feeling (2-4), suggesting that these high frequencies are being processed in a modality other than the vibratory-somatosensory system.

In contrast to the excellent frequency discrimination found in the midsonic auditory range, ultrasonic perception has been described as having poor frequency-resolving ability (3). This interpretation originates from reports that the pitch elicited by ultrasonic stimulation is similar to the pitch of the highest air conduction frequency detectable, that is 8 to 16 kHz (4). Because the perceived pitch of ultrasonic tones was associated with the upper limit of conventional air conduction hearing, these investigators assumed that the perceived audible sensation produced by ultrasonic stimuli consisted only of fixed pitch with monotonal quality. In addition, it was assumed that ultrasonic detection was a by-product of some form of cochlear processing that activated sensorineural elements within the classical auditory system (4) and, as such, offered little in the form of an additional communication channel.

We have tried to determine whether these early investigators might have overlooked the possibility that ultrasonic frequency discrimination exists and that ultrasonic hearing may, as a consequence, be capable of serving as a viable alternative communication channel, particularly for individuals with varying degrees of hearing loss. To explore this possibility, we obtained audiometric and ultrasonic thresholds (4,000 to 90,000 Hz) from ten normal, young adults (N1)(19 to 26 years of age) (5). Standard audiometric testing revealed that these subjects had normal hearing (see Fig. 1A), and ultrasonic testing confirmed early reports that humans could perceive ultrasonic stimuli at least as high as 90,000 Hz (see Fig. 1B). Thresholds for tonal detection in the ultrasonic range ranged from 82 to 112 dB of acceleration depending on the stimulus frequency. When these same subjects were presented with a stimulus approximately 30 dB above threshold, they reported the tone as being loud and unpleasant.
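The decibel values above can be put in physical units with a short conversion. The following is only a sketch: it assumes the 10⁻³ m/s² reference given in note 5, the usual 20·log10 convention for amplitude-like quantities, and standard gravity of 9.80665 m/s² for 1 g. The function names are illustrative and not part of the original study.

```python
import math

REF_ACCEL = 1e-3   # m/s^2, the 0 dB reference stated in note 5
G = 9.80665        # m/s^2, standard gravity (assumed value for 1 g)

def db_to_accel(level_db):
    """Convert a level in dB re 10^-3 m/s^2 to acceleration in m/s^2."""
    return REF_ACCEL * 10 ** (level_db / 20)

def accel_to_db(accel):
    """Convert acceleration in m/s^2 to dB re 10^-3 m/s^2."""
    return 20 * math.log10(accel / REF_ACCEL)

# Lowest and highest ultrasonic thresholds quoted in the text
for level in (82, 112):
    a = db_to_accel(level)
    print(f"{level} dB -> {a:.1f} m/s^2 ({a / G:.1f} g)")

# Consistency check against note 5: 1 g should sit about 80 dB above the reference
print(f"1 g = {accel_to_db(G):.1f} dB re 1e-3 m/s^2")
```

Under these assumptions, 1 g falls just under 80 dB, which agrees with the statement in note 5 that the reference is 80 dB below one unit of gravity.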

We next sought to determine the ability of humans to resolve frequency in the ultrasonic range, using a paradigm in which ten additional normal, young adult subjects (N2) (23 to 25 years of age) were required to indicate just noticeable differences (JNDs) in pitch to tonal stimuli of 6, 11.2, 32, and 40 kHz (7). The subjects were able to discriminate changes in frequency, although there was a steady increase in the JNDs from 2 to 40 kHz (Fig. 1C). In the auditory range the JNDs ranged from 0.4 to 1.0% of the stimulus frequency, whereas in the ultrasonic range the JNDs were on the order of 10% of the stimulus frequency. These data reinforce the premise that pitch discrimination does occur in the ultrasonic range but that frequency discrimination in the ultrasonic range differs from that in the more conventional auditory range.
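To make the percentages concrete, the sketch below converts them into approximate frequency steps in hertz. It uses only the fractions quoted in the text (0.4 to 1.0% in the auditory range, roughly 10% in the ultrasonic range). The 20-kHz boundary between the two ranges is an assumption made for illustration, and the outputs are not measured per-frequency JNDs.

```python
AUDITORY_JND = (0.004, 0.010)   # 0.4% to 1.0% of stimulus frequency (from the text)
ULTRASONIC_JND = 0.10           # "on the order of 10%" (from the text)
ULTRASONIC_START_HZ = 20_000    # assumed boundary between the two ranges

def jnd_range_hz(freq_hz):
    """Return (low, high) estimates of the JND in Hz for a stimulus frequency."""
    if freq_hz < ULTRASONIC_START_HZ:
        low, high = AUDITORY_JND
        return (freq_hz * low, freq_hz * high)
    step = freq_hz * ULTRASONIC_JND
    return (step, step)

for f in (6_000, 11_200, 32_000, 40_000):
    low, high = jnd_range_hz(f)
    print(f"{f / 1000:g} kHz: JND roughly {low:.0f} to {high:.0f} Hz")
```

On these figures, a listener would need a change of several kilohertz to notice a pitch difference near 32 or 40 kHz, compared with tens of hertz at 6 kHz.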

The confirmed presence of frequency discrimination raises the possibility that ultrasonic hearing may support the more complex temporal and spectral discrimination necessary for rudimentary speech perception. In order to determine ultrasonic speech recognition, eight additional normal, young adults (N3) (20 to 29 years of age) served as listeners when words were completely modulated into the ultrasonic range. In translating the speech signals into the ultrasonic range, we used an amplitude modulation, suppressed carrier (double side band modulation) technique with carrier frequencies of 28 and 40 kHz. The carrier frequency was suppressed such that the audio speech signal was carried on the two side bands. With modulation just below multiplication distortion, real-time spectral analysis revealed no audio frequency contamination either when the vibrator was held near the ear and was not mass loaded or when it was mass loaded by being placed on the mastoid region of the skull (8) (see also Fig. 2B). When the vibrator was placed in contact with the skull, the signal was clearly perceived as speech. The temporal quality was well preserved such that syllables could be clearly and accurately counted. In order to quantify speech recognition, the same eight subjects were given an abbreviated version of the Word Intelligibility by Picture Identification (WIPI) Test, a closed-set, picture identification task (9). Each word was presented in the ultrasonic range, and the listeners were instructed to point to the correct one of six pictures. Speech stimuli were presented in blocks of ten words using each carrier for an equal number of trials. Mean recognition scores of 83% correct (with a range of 70 to 90%) were observed for both carrier frequencies. These results were well above the chance level of 16%. One subject returned 2 months after testing and repeated her performance of 88% on another word test.
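The suppressed-carrier modulation described above can be illustrated numerically. The sketch below is not the authors' apparatus: it replaces live speech with a single 1-kHz tone, assumes a 192-kHz sample rate, and uses a plain single-bin DFT written with the Python standard library. It multiplies the tone by a 28-kHz carrier and checks where the energy lands.

```python
import math

FS = 192_000   # sample rate in Hz (assumed)
FC = 28_000    # carrier frequency in Hz, one of the two used in the text
FA = 1_000     # stand-in audio tone in Hz (assumed; inside the 300-3000 Hz speech band)
N = 1_920      # 10 ms of samples, so each test frequency spans whole cycles

def dsb_sc(fa, fc, fs, n):
    """Double side band, suppressed carrier: the product of audio and carrier."""
    return [math.cos(2 * math.pi * fa * k / fs) * math.cos(2 * math.pi * fc * k / fs)
            for k in range(n)]

def amplitude_at(signal, freq, fs):
    """Amplitude of the sinusoidal component at freq, via a single-bin DFT."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq * k / fs) for k, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * k / fs) for k, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n

signal = dsb_sc(FA, FC, FS, N)
for label, f in (("audio band", FA), ("lower side band", FC - FA),
                 ("carrier", FC), ("upper side band", FC + FA)):
    print(f"{label:>16} ({f} Hz): amplitude {amplitude_at(signal, f, FS):.3f}")
```

In this idealized case the energy appears only at 27 and 29 kHz, with nothing at the carrier or at the original 1-kHz audio frequency, which is the property the spectral analysis in the text was checking for.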

We next sought to determine if individuals with impaired high frequency hearing could detect ultrasonic speech. Ten older subjects (50 to 82 years of age) (N4) were given complete audiograms [as in (5)], including ultrasonic testing, and these threshold data were compared to those previously generated in normal, young adults. At frequencies between 1,000 and 10,000 Hz, all older subjects exhibited substantial hearing loss typical of age-related deafness (Fig. 1A). However, both subject groups displayed similar ultrasonic thresholds regardless of age or hearing loss in the conventional auditory range (Fig. 1B). These data confirm the ability of human subjects to detect sounds in the ultrasonic range despite a significant hearing loss normally associated with aging (2). Furthermore, in a subsequent experiment with five additional subjects with age-related hearing loss (55 to 75 years of age) (N5), speech stimuli were amplitude-modulated onto a 28-kHz carrier frequency and presented in a single-word, oral-response discrimination task (10). Mean accuracy was found to be 58% with a range of 45 to 70%. All of the subjects discriminated words at a level considerably better than chance. The lower scores for the older group relative to the young group were due to incorrect identification of a single consonant sound within the target word.

Encouraged by the observation that individuals with moderate sensorineural hearing loss were able to accurately hear isolated words, we next sought to evaluate ultrasonic speech perception in nine subjects with acquired sensorineural deafness having left corner audiograms with pure tone average hearing of 90+ dB [experiment performed as in (5)]. All nine subjects detected ultrasonic tones with thresholds of approximately 100 to 130 dB, slightly higher than those observed in normal, young adults (N3) (Fig. 1B). When the WIPI testing paradigm (9) was administered to two of these deaf subjects who displayed oral communicative skills, accuracy levels of 20 to 30% were observed. These scores were lower than those observed for the normal, young adults and may be due to the relative differences in amplitude by which the WIPI was presented to the two groups. In the normal, young adults, the words were presented approximately 10 to 15 dB above the respective threshold levels for this group. However, because the thresholds for the deaf subjects were higher than for the normal subjects and approached the power output limits of our equipment, the words were presented only slightly above the threshold levels for these subjects. Confirmation of this hypothesis must await modification to our equipment, enabling us to increase the power output.

In an effort to make the speech material more intelligible, the lower side band was suppressed by 40 dB (8) and three deaf subjects were retested with the same word paradigm. These subjects reported that the upper side band modulation sounded "better," which was reflected in scores of 40% identification for each subject. An analysis of the errors in speech recognition for the deaf subjects yielded the same proportional consonant confusion profiles as those seen for the normal subjects. Seventy percent of the errors made by the normal subjects and 72% of the errors made by the deaf subjects were similar consonant substitutions, suggesting that both groups used the same speech recognition strategies while hearing words in the ultrasonic range.
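The phasing method for generating an upper side band signal (note 8) can be sketched for a single test tone. The example assumes an ideal 90° shift on both the audio and the carrier, a 1-kHz stand-in tone, and a 192-kHz sample rate. The optional `phase_error_deg` parameter is a hypothetical addition, not something reported by the authors. It shows how a small error in the audio phase shift limits lower side band suppression in such a scheme.

```python
import math

FS = 192_000   # sample rate in Hz (assumed)
FC = 28_000    # carrier frequency in Hz (from the text)
FA = 1_000     # stand-in audio tone in Hz (assumed)
N = 1_920      # 10 ms of samples, so each test frequency spans whole cycles

def usb_phasing(fa, fc, fs, n, phase_error_deg=0.0):
    """Upper side band by the phasing method: sum of two balanced modulators,
    one fed the in-phase audio and carrier, the other the 90-degree-shifted pair."""
    err = math.radians(phase_error_deg)
    out = []
    for k in range(n):
        t = k / fs
        in_phase = math.cos(2 * math.pi * fa * t) * math.cos(2 * math.pi * fc * t)
        quadrature = math.sin(2 * math.pi * fa * t + err) * math.sin(2 * math.pi * fc * t)
        out.append(in_phase - quadrature)
    return out

def amplitude_at(signal, freq, fs):
    """Amplitude of the sinusoidal component at freq, via a single-bin DFT."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq * k / fs) for k, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * k / fs) for k, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n

ideal = usb_phasing(FA, FC, FS, N)
print("upper side band:", round(amplitude_at(ideal, FC + FA, FS), 3))
print("lower side band:", round(amplitude_at(ideal, FC - FA, FS), 3))
```

With ideal phase shifts the lower side band cancels completely in this model. A real circuit achieves finite suppression, such as the 40 dB reported in the text, because the phase and gain of the two paths never match exactly.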

As previously observed (3,4), humans have ultrasonic hearing. The present results suggest that some neural substrate is capable of encoding speech signals when these speech signals are modulated into the ultrasonic frequencies. This is true in the normal hearing individual, the older listener with compromised auditory function, as well as in the profoundly deaf person with no substantial auditory function.

Although the specific neural substrates for this ultrasonic processing remain to be elucidated, several hypotheses can be considered. First, it is possible that ultrasonic frequencies are not transmitted by the middle ear (because of a poor impedance match) but are transmitted by the bone and then are coded like high-frequency, air-conducted sounds at the base of the cochlea where the signal is then transmitted to the brain via the classical auditory pathways (2, 4, 11). Although more data must be generated to fully evaluate this hypothesis, the ability of the older hearing-impaired and the clinically deaf to detect ultrasonic sound and to perceive speech would argue against such a hypothesis.

Alternatively, it is possible that an ultrasonic receptor resides within a known structure to which we presently ascribe a different function. One possible candidate for such ultrasonic reception is the saccule, an otolithic organ that responds to acceleration and gravity and may be responsible for transduction of sound after destruction of the cochlea (12). Anatomically, it has been reported that saccular afferents in mammals innervate the classical auditory system at the level of the cochlear nucleus, the first synapse in this pathway (13). In addition, a saccular nerve branch has been reported to innervate the base (high-frequency portion) of the cochlea, and a branch of the cochlear nerve has been reported to innervate the saccule, suggesting a reciprocal interdependence between these two organs (14). The present results in hearing-impaired and profoundly deaf subjects strongly suggest that bone-conducted, ultrasonic stimulation may provide an alternative therapeutic approach for the rehabilitation of severe hearing loss.

REFERENCES AND NOTES

1. E. G. Wever and M. Lawrence, Physiological Acoustics (Princeton Univ. Press, Princeton, N.J. 1954).

2. Patients with varying degrees of inner ear and nerve deafness (sensorineural) and disorders of the middle ear have been reported to hear ultrasound: R. J. Bellucci and D. E. Schneider, Ann. Otol. Rhinol. Laryngol. 71, 719 (1962); A. J. Abramovich, J. Laryngol. Otol. 92, 861 (1978).

3. J. F. Corso and M. Levine, J. Acoust. Soc. Am. 35, 804 (1963); Am. J. Psychol. 78, 557 (1965).

4. R. J. Pumphrey, Nature 166, 571 (1950); B. H. Deatherage et al., J. Acoust. Soc. Am. 26, 582 (1954); A. V. Haeff and C. Knox, Science 139, 590 (1963); J. F. Corso, J. Acoust. Soc. Am. 35, 1738 (1963); H. G. Dieroff and H. Ertel, Arch. Oto-Rhino-Laryngol. 209, 277 (1975).

5. We administered tonal audiograms to ten young adults (N1) aged 19 to 26 with normal hearing and no history of ear disease, using a vibratory system that results in bone conduction hearing. A Wilcoxon (Rockville, MD) F3/F9 electromagnetic piezoelectric vibrator was applied to the mastoid region of the skull. Acceleration was measured from built-in as well as attached sensors. We measured acceleration in decibels using a Quest (Oconomowoc, WI) vibration meter referenced to 10⁻³ m/s². An acceleration value of 10⁻³ m/s² is 80 dB below one unit of gravity (1 g). We assessed signal purity using a Hewlett-Packard real-time spectral analyzer. See Fig. 2A for an example of spectral analysis performed at 30 kHz.

6. Subjects were administered an auditory pattern task in which they were required to detect rising or falling audio tones by means of a conventional (clinical) vibrator affixed to the skull (Radio Ear model 32B). Tones were shifted from 2000 to 4000 Hz or from 4000 to 2000 Hz in 10-ms sweeps. Subjects were instructed to identify shift direction as either rising or falling. Ultrasonic auditory patterns were delivered to the F9 piezoelectric vibrator, and frequencies were shifted from either 25 to 32 kHz or from 32 to 25 kHz in 100-ms sweeps. Subjects were again instructed to identify sweep direction. Signal purity was monitored by fast Fourier transform analysis, which revealed no energy 60 dB down from the peak frequencies. There was no energy above or below the frequency range of 25 to 32 kHz.

7. Continuous pure tones of 6, 11.2, 32, and 40 kHz were presented to the skull by the F3/F9 vibrator. Tones were monitored in intensity by means of the Quest vibration meter with octave filtering, and tonal purity was monitored with real-time fast Fourier analysis. Tones were delivered at a comfortable listening level, which varied in acceleration from 70 to 100 dB depending on the frequency administered. Subjects were instructed to indicate a JND in pitch when the experimenter increased frequency rate over 5 s. A frequency counter was used to measure the frequency that elicited a JND in pitch. An average of three trials was obtained per subject.

8. Live voice speech (300 to 3000 Hz) was delivered through a microphone, amplified, and translated into the ultrasonic range. The carrier frequencies used were 28 and 40 kHz. The double side band (DSB) carrier-suppressed signal was filtered with a Krohn-Hite model 3750 filter set in high-pass mode with a cutoff frequency of 20 kHz and a slope of 24 dB per octave. This filtered signal was delivered through the Wilcoxon Research model PA7C power amplifier and the model N9 matching network to the model F9 piezoelectric vibration generator on a model Z9 transducer base. The accelerometer output of the transducer was used as input to the Quest Electronics measurement instrument, and signal levels were determined. The accelerometer output was also sent to the Hewlett-Packard model 3561A dynamic signal analyzer, allowing the spectral content of the ultrasonic, bone conduction speech signal to be analyzed and monitored. A DSB signal is generated by the multiplication of two input signals, one modulating (audio) signal and one carrier (high-frequency) signal, using a balanced modulator. An upper side band signal was generated by the phasing method, in which both the audio and the carrier were shifted by 90°, fed into each of two balanced modulators, and summed [F. G. Stremler, An Introduction to Communication Systems (Addison-Wesley, Reading, MA, 1982)].

9. M. Ross and J. Lerman, Word Intelligibility by Picture Identification (Stanwix House Inc., Pittsburgh, 1971). All of the test words are pictorially represented in a booklet, each page of which contains a six-picture matrix with one test word per page, repeated two or three times by the examiner. The subject must choose one of the pictures in the appropriate matrix and make a verbal response. Ten words were delivered per trial, and the number of correct responses was recorded so that the types of errors could be identified. Normal subjects were asked to perform one trial for each of the carrier frequencies.

10. Using an amplitude-modulated circuit with a 28-kHz carrier, we presented 20 open-set, phonetically balanced words to five older (55 to 75 years of age) subjects using the F9 piezoelectric vibrator. A forced choice paradigm was used such that only a totally correct word was counted and there was no partial credit for individual phoneme recognition.

11. K. Ohyama et al., Hear. Res. 17, 143 (1985); K. Ohyama et al., Acta Oto-Laryngol. Suppl. 435, 73 (1987).

12. Y. Cazals, J. M. Aran, J. P. Erre, A. Guilhaume, Science 210, 83 (1980); Y. Cazals et al., Acta Oto-Laryngol. 95, 211 (1983); Y. Cazals and C. Aurousseu, in The Vestibular System Clinical Research, M. D. Graham and J. I. Kemink, Eds. (Raven, New York, 1987), p. 601; I. Hunter-Duvar et al., J. Otolaryngol. 5, 497 (1976).

13. G. A. Kevetter and A. A. Perachio, Brain Behav. Evol. 34, 193 (1989).

14. M. Hardy, Anat. Rec. 59, 403 (1934).

15. Supported in part by a grant from Hearing Innovations Inc. with additional support from G. Bradley. The encouragement of and stimulating discussions with W. Regelson, A. Lippa, C. Berlin, and H. Hecker are gratefully acknowledged.