The present invention relates in general to audio adjustment and equalization, and more specifically to a method and apparatus for self-adjusting audio levels and providing equalization for communications-related devices and other audio devices.
With the widespread proliferation of portable communications devices such as cellular handsets, portable audio players such as portable Compact Disc (CD) players or MPEG-1 (Moving Picture Experts Group) Layer 3 (MP3) players, portable computers, Personal Digital Assistants (PDAs), and the like, the quality of audio has become an important factor in maximizing the usability, fidelity, and enjoyment of such devices. For devices focusing on speech, such as communication units, additional issues arise, although many fidelity-related issues are common to speech and music.
Individuals listen to speech in unique ways, paying attention to certain voicing or accent characteristics and interpreting speech information accordingly. Some listeners may pay more attention to certain vowel sounds or vocalization styles, and thus a particular speaking style or accent establishes an expectation of what will follow. By properly identifying attributes of particular interest to a listener, a better listening experience can be provided by emphasizing those attributes. Attaining a high level of audio quality for users of all kinds, including hearing impaired users, is of increasing importance.
Problems arise in connection with controlling audio through the use of, for example, conventional volume and/or tone control, particularly for devices with a narrow audio band such as communication units. A typical communication unit, such as a wireless handset, has a band-limited frequency response from around 150 Hz to around 3600 Hz. Since listeners have different hearing capabilities, preferences, or the like, and since hearing impaired listeners may have hearing capabilities with very specific impairments at certain frequencies, conventional systems rarely provide adequate audio fidelity for users having hearing capabilities falling outside normal levels, or for users having specific impairments. Still further, standards and regulations require levels to fall within certain boundaries, further increasing the challenge of providing adequate fidelity to those whose hearing capabilities are not within ranges typically considered “normal”.
Some hearing deficient or hearing impaired users are well aware of their deficiencies and either purchase expensive hearing aids specifically tailored to boost specific frequencies associated with their impairment or attempt to listen unassisted, often losing specific audio information such as high frequency components of the audio signal. Such frequency specific audio information loss can degrade the overall quality or intelligibility of the listening experience of, for example, a musical segment, or can result in the misinterpretation of certain sounds occurring regularly in speech such as consonants with high frequency components such as the sounds associated with the consonants “F”, “T”, “P”, “S”, and the like. Problems associated with, for example, a hearing impaired listener attempting to engage in unassisted listening are further exacerbated in devices with narrow audio bands as described above.
Therefore, to address the above described problems and other problems, what is needed is a method and apparatus for addressing issues associated with poor audio level and equalization control in devices such as handsets, including wireless handsets, wired handsets, and the like, and other audio devices such as MP3 players, portable computers, portable CD players, PDAs, and the like, particularly where control may concern hearing deficiencies, impairments, preferences and the like.
The accompanying figures, where like reference numerals refer to identical or functionally similar elements and which together with the detailed description below are incorporated in and form part of the specification, serve to further illustrate a preferred embodiment and to explain various principles and advantages in accordance with the present invention.
In overview, the present disclosure concerns communications devices or units, often referred to as communication units, such as telephone handsets, cellular telephone or two-way radio handsets, portable music players such as MP3 players, portable computers, PDAs and the like having audio capability. More particularly, various inventive concepts and principles are embodied in communication devices and other audio-capable devices, and methods therein for self adjusting audio levels and equalization. It should be noted that in addition to connoting a typical handset or audio device such as a player, the term communication device or communication unit may be used interchangeably with subscriber unit, wireless subscriber unit, wireless subscriber device or the like. Each of these terms denotes a device ordinarily associated with a user and typically a wireless mobile device that may be used with a public network, for example in accordance with a service agreement, or within a private network such as an enterprise network. Examples of such units include personal digital assistants, personal assignment pads, and personal computers equipped for wireless operation, a cellular handset or device, or equivalents thereof provided such units are arranged and constructed for operation using audio.
The instant disclosure is provided to further explain in an enabling fashion the best modes of performing one or more embodiments of the present invention. The disclosure is further offered to enhance an understanding and appreciation for the inventive principles and advantages thereof, rather than to limit in any manner the invention. The invention is defined solely by the appended claims including any amendments made during the pendency of this application and all equivalents of those claims as issued.
It is further understood that relational terms such as first and second, and the like, if any, are used solely to distinguish one entity, item, or action from another without necessarily requiring or implying any actual such relationship or order between such entities, items or actions.
Much of the inventive functionality and many of the inventive principles, when implemented, are best supported with or in software or integrated circuits (ICs), such as a digital signal processor and software therefor or application specific ICs. It is expected that one of ordinary skill, notwithstanding possibly significant effort and many design choices motivated by, for example, available time, current technology, and economic considerations, when guided by the concepts and principles disclosed herein will be readily capable of generating such software instructions or ICs with minimal experimentation. Therefore, in the interest of brevity and minimization of any risk of obscuring the principles and concepts according to the present invention, further discussion of such software and ICs, if any, will be limited to the essentials with respect to the principles and concepts used by the preferred embodiments.
In addition to devices of a general nature with audio capability, the communication devices of particular interest are those providing or facilitating voice/audio communications services over cellular wide area networks (WANs), such as conventional two way systems and devices, various cellular phone systems including analog and digital cellular, CDMA (code division multiple access) and variants thereof, GSM, GPRS (General Packet Radio Service), 2.5G and 3G systems such as UMTS (Universal Mobile Telecommunications System) systems, Internet Protocol (IP) Wireless Wide Area Networks like 802.16, 802.20 or Flarion, integrated digital enhanced networks and variants or evolutions thereof. Furthermore, the wireless communication units or devices of interest can have short range wireless communications capability normally referred to as WLAN capabilities, such as IEEE 802.11, Bluetooth, or HiperLAN and the like, preferably using CDMA, frequency hopping, OFDM or TDMA access technologies and one or more of various networking protocols, such as TCP/IP (Transmission Control Protocol/Internet Protocol), UDP/IP (User Datagram Protocol/Internet Protocol), IPX/SPX (Internetwork Packet Exchange/Sequenced Packet Exchange), NetBIOS (Network Basic Input Output System) or other protocol structures. Alternatively, the wireless communication units or devices of interest may be connected to a LAN using protocols such as TCP/IP, UDP/IP, IPX/SPX, or NetBIOS via a hardwired interface such as a cable and/or a connector.
As further discussed herein below, various inventive principles and combinations thereof are advantageously employed to provide self adjustment of audio levels and equalization.
In accordance with various exemplary embodiments, audiogram, loudness, or hearing profile tests, or the like, are conducted to determine a listener's hearing characteristics, such as a hearing loss profile. Since, as described above, a typical communication unit has a band-limited frequency response typically from around 150 Hz to around 3600 Hz, and may also have a standard audio equalization configuration which can alter the basic audio frequency response of the device, it is necessary to conduct tests on individual devices to allow response factors to be addressed and overcome on a per device basis. When a listener's hearing capabilities, impairments, preferences, or the like are determined, an audio profile is generated and will be used on the communication device for conditioning audio output thereafter. It will further be appreciated that depending on the output device which can be, for example, a speaker, an audio transducer such as a piezo-electric transducer, a headset speaker, earpiece, or the like, a user may wish to alter the default equalization for the particular audio output device being used.
In order to generate an audio profile, all or significant ones of the frequencies which the communication device is capable of generating must be tested during hearing profile tests. Thus exemplary hearing profile tests use a standard tone sweep to generate a listener's audiogram as will be appreciated by those of skill in the art. Referring now to
While left ear response 101 and right ear response 102 represent normal hearing profiles, various exemplary abnormal profiles are shown in
It should be noted that the objective of self adjusting in accordance with various exemplary embodiments is to bring the hearing levels to within normal zone 110 as shown in
Although the limits of human hearing are generally considered to be from around 20 Hz to around 20 kHz, testing in accordance with various exemplary embodiments need only support the available bandwidth of the communication unit or device, which as noted is around 150 Hz to around 3600 Hz for a typical wireless communication unit. Accordingly, a Graphical User Interface (GUI) is used to present the audiogram test to, for example, a user or listener. Since loudness is a function of level and frequency, the exemplary audiogram test will need to be conducted at each frequency for various volume levels associated with the communication unit or device. While the diagrams shown in the above figures have profiles for the left and right ear, an average unimpaired listener's profile is closely matched in both ears, and thus the audiogram will generate a profile based on the hearing of whichever ear is used during the listening test. An impaired user will likely conduct the test with the ear having the least degree of impairment. The listener's decisions to increase or lower volume will thus be conditioned by the audio profile generated from the audiogram level tests, and information associated therewith will be used to condition each frequency.
In an exemplary audiogram test procedure, for example, as illustrated in
It should be noted that test tones can be spaced on a critical band scale since loudness is based on the critical band concept of hearing. A table of critical band frequencies for the entire range of typical human hearing is given in table 1 below. One of ordinary skill in the art will appreciate that while table 1 shows all the critical band frequencies, not all the critical band frequencies will be relevant in all devices. For example, while a relatively high fidelity audio player may have a frequency response across the entire range of hearing, a typical communication unit will have a band limited frequency response.
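By way of illustration only, the selection of test tones on a critical band scale might be sketched as follows. The center frequencies listed are the commonly published Zwicker (Bark scale) values and are an assumption here, since they may differ from the values of table 1; the function name `tones_in_band` and the default passband limits are likewise hypothetical, the latter taken from the approximate 150 Hz to 3600 Hz range noted above.

```python
# Critical band center frequencies in Hz (commonly published Bark scale values).
# Assumption: these stand in for the document's table 1 and may differ from it.
BARK_CENTERS_HZ = [
    50, 150, 250, 350, 450, 570, 700, 840, 1000, 1170, 1370, 1600,
    1850, 2150, 2500, 2900, 3400, 4000, 4800, 5800, 7000, 8500,
    10500, 13500,
]


def tones_in_band(low_hz=150.0, high_hz=3600.0):
    """Return the critical band centers that fall inside the device passband."""
    return [f for f in BARK_CENTERS_HZ if low_hz <= f <= high_hz]


if __name__ == "__main__":
    # A band-limited handset tests only a subset of the critical bands.
    print(tones_in_band())
    # A wide-band audio player would test all of them.
    print(len(tones_in_band(20.0, 20000.0)))
```

Under these assumptions a band-limited communication unit would be tested at a subset of the bands only, whereas a relatively high fidelity audio player would be tested at every band.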
In accordance with various exemplary embodiments, the exemplary GUI can provide a representation of each tone tested for and display it to the screen so the listener can see the results. The communication unit further can keep, for example, a history of audiograms for display and comparison. In accordance with other exemplary embodiments, the GUI can display a chart of the frequency response profile showing the hearing loss attenuation over each critical band.
As noted, the audiogram test will produce a profile containing all values and perceived attenuation levels across the frequency band for the particular communication unit. Since each individual communication unit may have its own unique frequency response profile, based for example on differences between component tolerances and the like, it is necessary to do listening tests on the actual communication unit. The listener's tonal sensitivity curve, as determined by the listening test and the resulting audiogram and profile, determines the required level scaling for the critical band tones and can be used to determine an equalization profile which compensates or otherwise restores the individual's hearing to or near to a normal hearing profile. The equalization profile can be used in all subsequent audio processing and signal output functions. The equalization profile, as will be appreciated by one of ordinary skill in the art, specifies how much attenuation or amplification is necessary to balance the loudness of frequency components across the band for the communication unit. Since tones are used, the above described audiogram or listening test provides a relatively coarse adjustment of audio levels. A finer adjustment may further be desirable for making characteristics associated with speech more discernible.
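One possible way of deriving such an equalization profile from the tabulated test results is sketched below, purely as an example. The function name `equalization_profile`, the dictionary layout, and the 20 dB reference level are hypothetical choices made for illustration and are not taken from the disclosure; a positive value denotes amplification and a negative value denotes attenuation.

```python
def equalization_profile(selected_levels_db, reference_db=20.0):
    """Map the level a listener selected for each tone to a per-band gain.

    selected_levels_db: dict of tone frequency (Hz) -> level (dB) the listener
        judged sufficient during the tone test.
    reference_db: hypothetical level a listener with a normal profile would
        be expected to select (an assumption, not a value from the source).
    Returns a dict of frequency -> gain in dB (positive = amplify).
    """
    return {
        freq: level - reference_db
        for freq, level in sorted(selected_levels_db.items())
    }


if __name__ == "__main__":
    # Example listener needing more level at high frequencies.
    measured = {250: 20.0, 1000: 25.0, 3400: 45.0}
    print(equalization_profile(measured))
```

The resulting per-band gains could then be applied in subsequent audio processing, subject to whatever level limits the device imposes.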
In addition, the audiogram can be used to address compression by determining “headroom” which can be defined as the degree of amplification possible for a frequency before saturation occurs. As will be understood by one of ordinary skill in the art, saturation occurs when gain levels are sufficiently high to cause an amplifier to operate outside its linear region resulting in non-linearity and clipping in the audio output signal. For a typical device, a one-to-one correspondence between the input and output levels should be present. At input levels above some level, for example 10 dB, compression effects can occur.
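The headroom concept described above can be illustrated with the following minimal sketch, in which a requested per-band gain is limited so that the amplified signal does not exceed the level at which saturation begins. The names `headroom_db` and `limit_gains`, and the numeric levels in the example, are hypothetical and are given only for illustration.

```python
def headroom_db(signal_level_db, saturation_level_db):
    """Amplification (dB) available before the saturation level is reached."""
    return max(0.0, saturation_level_db - signal_level_db)


def limit_gains(gains_db, signal_levels_db, saturation_level_db):
    """Clamp each requested gain to the headroom available in its band.

    gains_db: dict of frequency (Hz) -> requested gain (dB).
    signal_levels_db: dict of frequency (Hz) -> current signal level (dB).
    saturation_level_db: level above which clipping is assumed to occur.
    """
    limited = {}
    for freq, gain in gains_db.items():
        room = headroom_db(signal_levels_db[freq], saturation_level_db)
        limited[freq] = min(gain, room)
    return limited


if __name__ == "__main__":
    requested = {1000: 5.0, 3400: 25.0}
    levels = {1000: -20.0, 3400: -20.0}
    # With saturation assumed at -6 dB, only 14 dB of headroom is available.
    print(limit_gains(requested, levels, -6.0))
```

A practical implementation might apply compression rather than a hard limit once the headroom is exhausted; the hard clamp is used here only to keep the example short.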
Voice conditions can be categorized according to parameters including formants, fricatives, and equalization. Formants are resonant sounds having distinguishing frequency components allowing, for example, vowel sounds to be distinguished from each other. Fricatives are sounds produced by air flowing through a narrow channel made by two articulating organs in close proximity such as the tip of the tongue and the upper teeth. The turbulent airflow resulting from the narrow passage produces a characteristic noise called “frication”. Equalization is the relative emphasis across the frequency spectrum. Ideally, equalization will emphasize weak frequencies and de-emphasize strong frequencies and should result in a “flat” response in the listener where all frequencies are heard with equal emphasis. Due to impairments, preferences or the like, certain frequencies will require greater emphasis to achieve the desired response in the listener.
In accordance with various exemplary embodiments, voice conditions noted above, while associated with frequencies generated in the testing described hereinabove, may produce anomalies which can affect intelligibility and which can be corrected for in an additional fine adjustment test involving, for example, speech oriented listening and intelligibility tests rather than tone-based audiogram or loudness tests. Thus, in an exemplary test for nasality and formant sharpening, nasal sounds can be presented and a formant postfilter adjusted until, for example, improved recognition is attained in intelligibility. In an exemplary consonant and midfrequency emphasis test, certain fricatives are presented and tested for intelligibility. The midfrequency amplitudes, for example, can be swept in level until recognition results improve. In an exemplary unvoiced speech and audio equalization test, the entire frequency band is tested for loudness or level imbalances. For example, when high frequencies are overemphasized, ‘s’ sounds can be harsh and piercing. Accordingly, the audiogram or loudness tests described above may be conducted to establish acceptable volume levels, and additional adjustments can be made to soften harsh consonants and fricatives, and re-shape formants.
In addition to hearing difficulties, speech disorders including nasality may affect the quality of the speech being generated and input to a communication unit, where this speech is ultimately destined for transmission to a terminating communication unit and thus listened to by others. Speech studies of people with nasal speech disorders reveal that nasality is primarily due to pronounced abnormal resonances in the nasal cavity which amplify formant energy. Other nasal related disorders include the loss of consonant articulation due to an inability to build air pressure because of air escaping through the nasal cavity. It is envisioned that principles discussed and described herein could be used to compensate for nasality during processing of the speech signal in, for example, an exemplary vocoder or the like.
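The level sweep used in the midfrequency emphasis test might, for example, be organized as in the sketch below. The function `sweep_for_best_gain` and the callback `recognition_score` are hypothetical; in practice the score would come from the listener's responses at each setting rather than from a formula, and the example scoring function is invented solely to make the sketch runnable.

```python
def sweep_for_best_gain(recognition_score, gains_db):
    """Try each candidate gain and keep the one with the best recognition score.

    recognition_score: callable taking a gain (dB) and returning a score,
        e.g. percent of presented words recognized at that setting.
    gains_db: iterable of candidate midfrequency gains to try, in order.
    Returns (best_gain, best_score); the earliest gain wins a tie.
    """
    best_gain, best_score = None, float("-inf")
    for gain in gains_db:
        score = recognition_score(gain)
        if score > best_score:
            best_gain, best_score = gain, score
    return best_gain, best_score


if __name__ == "__main__":
    # Invented stand-in for listener results: recognition peaks at 6 dB.
    def simulated_score(gain_db):
        return 100 - (gain_db - 6) ** 2

    print(sweep_for_best_gain(simulated_score, range(0, 13, 3)))
```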
As can be seen in exemplary block diagram 500 of
It should be noted that while some postfiltering may be present and associated with default profiles, postfilters are not typically accessible for reconfiguration or adjustment such as in accordance with the present invention. Thus, a more accessible environment is needed where a listener can perform specific listening tests during an exemplary test period to provide an even finer adjustment over the tone sweeping audiogram tests already described. An exemplary procedure 600 associated with performing speech quality assessment tests such as may be presented to a user by way of a GUI is shown and described in connection with
Since conventional phones or communication devices do not typically employ tests to automatically set a user profile to achieve individually tailored responses, performance, loudness, intelligibility, and acceptability tests can be conducted on an exemplary communication unit in accordance with various exemplary embodiments, using an interactive test procedure which can be presented to a user through a user interface such as a test suite presented using Graphical User Interface (GUI) 700 as shown in
In accordance with an alternative exemplary embodiment (not shown), tests may be performed, for example during manufacturing, using external equipment such as an external listening test suite to evaluate speech processed by the combined enhancement algorithms. Accordingly, a serial port interface can be used to stream speech data from and to the exemplary communication unit with an accessory cable in order to present an adequate opportunity to perform adjustments. Software to uncompress/compress and encode/decode streamed speech exists on the communication unit for performing processing. A recording test can be used to record speech utterances to be processed for listening tests. Vocoded speech data may be acquired from the communication unit and may be processed externally with, for example, loudness enhancement algorithms. The processed and vocoded speech can be uploaded back to the communication unit and can be used to conduct the listening tests as noted.
With continuing reference to GUI 700 shown in
As noted, the listening test block 720 enumerates the type of tests available. In accordance with various exemplary embodiments, three listening tests are shown in information area 721: loudness or audiogram test 730, intelligibility test 740 and acceptability test 750. The listener ideally will take all three tests although the user may return to a single test if desired. “Info” button 722 may be used to present the rules for each test, while “Arrow” button 723 can be used to select between tests, and once a test is selected, “Next” button 724 can be used to begin the selected test. It will be appreciated that the user can at any time return to listening test block 720 if the fidelity of the device or communication unit requires improvement.
In loudness or audiogram test 730, a tone at a particular level, both the tone frequency and level being displayed in information area 731, may be presented to a listener to judge for loudness. The listener can increase or decrease loudness for the tone using arrow key 733 in the “Int+” direction 734 or the “Int−” direction 735. When the loudness is determined to be sufficient for the listener “Next” button 736 can be pressed which tabulates the intensity level for the particular tone and the next tone can be tested. When the final tone is tested and logged, pressing “Next” button 736 will exit loudness or audiogram test 730.
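The button handling of loudness or audiogram test 730 can be pictured, under stated assumptions, as a small state machine. In the sketch below the class name `LoudnessTest`, the starting level, the step size, and the level limits are all hypothetical; only the behavior of the “Int+”, “Int−”, and “Next” controls follows the description above.

```python
class LoudnessTest:
    """Tracks the tone under test and tabulates the level chosen for each tone."""

    def __init__(self, tones_hz, start_db=40.0, step_db=5.0,
                 min_db=0.0, max_db=90.0):
        # All numeric defaults are illustrative assumptions.
        self.tones = list(tones_hz)
        self.start_db = start_db
        self.step_db = step_db
        self.min_db = min_db
        self.max_db = max_db
        self.index = 0
        self.level_db = start_db
        self.results = {}

    @property
    def finished(self):
        """True once every tone has been tested and logged."""
        return self.index >= len(self.tones)

    def int_plus(self):
        """'Int+' direction: raise the level of the current tone."""
        self.level_db = min(self.max_db, self.level_db + self.step_db)

    def int_minus(self):
        """'Int-' direction: lower the level of the current tone."""
        self.level_db = max(self.min_db, self.level_db - self.step_db)

    def next(self):
        """'Next' button: log the current level and move to the next tone."""
        if self.finished:
            return
        self.results[self.tones[self.index]] = self.level_db
        self.index += 1
        self.level_db = self.start_db


if __name__ == "__main__":
    test = LoudnessTest([250, 1000])
    test.int_plus()
    test.next()
    test.int_minus()
    test.next()
    print(test.results, test.finished)
```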
In intelligibility test 740, the listener can be presented with 25 words in information area 741 to judge for intelligibility. The user presses “Play” button 742 and the word is played. After playing, two words can be displayed in information area 741, at which point arrow button 743 can be used to select which word was heard. It will be appreciated that the word choices are not presented in text form until the word is played, since the listener must decide what word was said. The listener is further not given information as to which word was processed, and the presentation order of the words after playing is random. It should be noted that the intelligibility test 740 can add artificial noise to the word in an effort to generate possible ambiguities between, for example, consonants and fricatives or the like, as might exist between the words “fat” and “sat”. Results are tabulated and the user presses “Next” button 744 to proceed to the next word. When no more words are available, “Next” button 744 may be used to advance to the next test.
In acceptability test 750, the listener is asked to provide a quality rating of the speech they hear in exemplary sentences. In accordance with various exemplary embodiments, 20 sentences are provided to evaluate acceptability. A sentence is played by pressing “Play” button 752 and the user is asked to rate the quality of the speech they hear in accordance with ratings such as “excellent”, “good”, “fair”, or the like as may be presented in information area 751, by moving to the displayed rating using arrow button 753. The sentences are selected at random, and whether a given sentence is processed with the loudness enhancement algorithm is likewise determined at random. Results are tabulated and the user proceeds to the next sentence for evaluation by hitting “Next” button 754. When no more sentences are available, “Next” button 754 will advance to “Finished” block 760, at which point the test suite may be exited by pressing “End” button 761.
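The random presentation of the two word choices and the tabulation of results in intelligibility test 740 might be sketched as follows. The function names `present_choices` and `percent_correct` are hypothetical, as is the representation of the results as pairs of played and chosen words; the words shown are examples only.

```python
import random


def present_choices(played_word, confusable_word, rng):
    """Return the two candidate words in random order for display."""
    choices = [played_word, confusable_word]
    rng.shuffle(choices)
    return choices


def percent_correct(responses):
    """Tabulate results from a list of (played_word, chosen_word) pairs."""
    if not responses:
        return 0.0
    correct = sum(1 for played, chosen in responses if played == chosen)
    return 100.0 * correct / len(responses)


if __name__ == "__main__":
    rng = random.Random()
    # The order shown to the listener is random for each trial.
    print(present_choices("fat", "sat", rng))
    # Invented responses: one of four words was misheard.
    log = [("fat", "fat"), ("sat", "fat"), ("pat", "pat"), ("tap", "tap")]
    print(percent_correct(log))
```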
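The tabulation of acceptability ratings could, as one hypothetical arrangement, keep separate counts for sentences that were and were not processed with the loudness enhancement algorithm, so that the two conditions can be compared afterwards. The function name `tabulate_ratings` and the data layout are assumptions made for illustration.

```python
from collections import Counter


def tabulate_ratings(ratings):
    """Count ratings separately for processed and unprocessed sentences.

    ratings: list of (processed, rating) pairs, where processed is True if
        the sentence was passed through the enhancement algorithm and
        rating is a label such as "excellent", "good", or "fair".
    Returns a dict mapping True/False to a Counter of rating labels.
    """
    table = {True: Counter(), False: Counter()}
    for processed, rating in ratings:
        table[processed][rating] += 1
    return table


if __name__ == "__main__":
    # Invented example responses.
    log = [(True, "excellent"), (True, "good"),
           (False, "fair"), (True, "excellent")]
    result = tabulate_ratings(log)
    print(dict(result[True]), dict(result[False]))
```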
It will be appreciated that in accordance with various exemplary embodiments, the present invention may be implemented as an apparatus. Exemplary apparatus 800 shown in
This disclosure is intended to explain how to fashion and use various embodiments in accordance with the invention rather than to limit the true, intended, and fair scope and spirit thereof. The invention is defined solely by the appended claims, as they may be amended during the pendency of this application for patent. The foregoing description is not intended to be exhaustive or to limit the invention to the precise form disclosed. The embodiments were chosen and described to provide the best illustration of the principles of the invention and its practical application, and to enable one of ordinary skill in the art to utilize the invention in various embodiments and with various modifications as are suited to the particular use contemplated. All such modifications and variations are within the scope of the invention as determined by the appended claims, as may be amended during the pendency of this application for patent, and all equivalents thereof, when interpreted in accordance with the breadth to which they are fairly, legally, and equitably entitled.