An embodiment of the invention relates to improving a user's experience of downlink audio in a communications device. Other embodiments are also described.
Two-way conversations (which may be voice only, or voice combined with video) can be carried out between two users, using electronic communication devices such as telephones. These devices have evolved over the years from simple plain old telephone service (POTS) analog wire line stations to cellular network phones, smart mobile phones, voice over IP (VOIP) stations, and personal computer-based VOIP telephony applications. There is a desire to remain backwards compatible with the original, relatively small bandwidth allocated to a voice channel in a POTS network. This has, in part, prevented the emergence of a “high fidelity” telephone call, despite the availability of such technology.
Improving the sound quality of a telephone call is particularly desirable for mobile phones, as they may be more susceptible to electromagnetic interference due to their reliance on cellular wireless links. In addition, mobile phones are often used in noisy sound environments, such as outside in the wind, near a busy highway, or in a crowded venue. Accordingly, modern communications devices such as mobile phones have one or more stages of audio signal processing that are applied to the downlink voice signal received from the communications network (before the signal is audiblized to the near end user of the device through a speaker). Such processing or filtering may, for example, reduce the effect of echo and noise that might otherwise be heard by the near end user. Typically, while the near end user can adjust the volume of the speaker, there is no manual adjustment available for changing filtering in one audio frequency band relative to another, in the downlink voice path.
In accordance with the embodiments of the invention, a user of a communications device is given the ability to conveniently control the quality of the sound he is hearing during a call. In one embodiment, an acoustic transducer interface circuit (e.g., part of an audio codec integrated circuit device) of the communications device has volume settings that span a range between lowest and highest and are set by the user actuating a volume adjustment button. In addition, the device has one or more intelligibility boost settings. The intelligibility boost settings are also selected by actuating the volume adjust button of the device. In particular, once the device has been signaled into the highest volume setting in response to actuation of the button in a given direction, and the next actuation of the button during the call is also in the given direction, a downlink voice signal processor of the device responds to the next actuation by changing its audio frequency response to boost intelligibility of the far end user's speech being heard by the near end user.
The volume settings and the intelligibility boost settings may be signaled in the “host device” by the user's actuation of any one of a variety of different volume control buttons (and their associated switches or transducers). Examples include: a dedicated volume switch that is integrated in the housing of the host device; a switch that is integrated in a microphone housing of a wired headset and that is detected or read using a chipset that communicates with the host device through the microphone bias line; and a switch that is integrated in a wireless headset and that is detected or read using a short distance wireless interface chipset (e.g., a Bluetooth transceiver chipset) of the host device.
In another embodiment, the communications device has a touch sensitive screen in which a virtual volume button is displayed during the call. In particular, the virtual button may appear during speakerphone mode and not during handset mode. Once the device has been signaled into the highest volume setting, e.g. in response to actuation of the virtual button in a given direction, and the next actuation of the virtual button during the call is also in the same direction, the downlink processor responds by changing its frequency response in the audio range so as to boost intelligibility of speech that is heard from the loudspeaker of the device (in the speakerphone mode).
The above summary does not include an exhaustive list of all aspects of the present invention. It is contemplated that the invention includes all systems and methods that can be practiced from all suitable combinations of the various aspects summarized above, as well as those disclosed in the Detailed Description below and particularly pointed out in the claims filed with the application. Such combinations have particular advantages not specifically recited in the above summary.
The embodiments of the invention are illustrated by way of example and not by way of limitation in the figures of the accompanying drawings in which like references indicate similar elements. It should be noted that references to “an” or “one” embodiment of the invention in this disclosure are not necessarily to the same embodiment, and they mean at least one.
The acoustic transducer interface circuit 114 connects with a voice pickup device or microphone 113 and feeds the audio signal picked up by it. For this function, the interface circuit 114 may have an analog to digital converter that converts the analog audio signal from a connected microphone 113 into digital form. Alternatively, the interface circuit 114 may simply buffer a digital audio signal from a digital, wireless or wired headset interface (e.g., as part of a Bluetooth wireless headset chipset or a microphone bias line remote sensing chipset). An uplink voice signal processor 174 is coupled between the communications network 178 and the interface circuit 114.
The speaker 111 may be a loudspeaker 214 used in speakerphone mode (see
Returning to
The call 180 may be placed or initiated through a communication network 178 to which the network interface is connected. Depending upon the particular type of remote device 182 used by the far end user 183, the communications network 178 may actually be composed of several different types of networks that cooperate with each other (e.g., via gateways, not shown) to establish and conduct the call 180. For example, the communications network 178 may include a cellular network link at the near end, followed by a backhaul or PSTN segment and finally a wireless or wired local area network segment at the far end.
Once the call 180 has been established or a connection has been made with the remote device 182, processing of the users' conversation may proceed as follows. A downlink voice signal from the remote device 182 of the far end user 183 is received through network interface 176 and processed by downlink voice signal processor 172 prior to being delivered to the acoustic transducer interface circuitry 114. The downlink processor 172 may include digital audio signal processing capability in the form of hardware and/or software that applies a number of quality improvement operations to the input voice signal from the network interface 176, including, for example, echo cancellation and/or noise suppression. Similarly, and simultaneously, the uplink signal processor 174 may be applying echo cancellation and/or noise suppression to the microphone pickup signal, and then delivering the improved uplink signal to the network interface 176, which in turn transmits the signal to the communications network 178. The uplink signal eventually makes its way to the remote device 182 where it is audiblized for the far end user 183.
The interface circuit 114 may have a number of volume settings at which the speaker 111 is to be operated during the call. These settings span a range, between a lowest or minimum volume setting and a highest or maximum volume setting. A volume setting (or control signal) is provided to the interface circuit 114 by a decoder 186. The interface circuit 114 may include a local audio amplifier that responds to the volume setting by amplifying the audio signal received from the downlink processor 172 accordingly, before feeding the amplified audio signal to the speaker 111 over a wired connection. In another embodiment, the interface circuit 114 in effect forwards this volume setting to a remote audio amplifier, such as one that is located in a wireless headset. In that case, the audio signal is amplified by the remote audio amplifier, in accordance with the volume setting.
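By way of a hypothetical illustration (not part of the specification), the mapping from a discrete volume setting to the amplifier gain applied to the downlink signal might be sketched as follows; the step count and decibel range here are assumptions chosen for the example:

```python
# Illustrative sketch: map the decoder's discrete volume settings to a
# linear amplifier gain. The 16-step count and -40..0 dB range are
# assumptions, not values from the specification.

def volume_to_gain(setting: int, num_settings: int = 16,
                   min_db: float = -40.0, max_db: float = 0.0) -> float:
    """Map a volume setting (0 = lowest .. num_settings-1 = highest)
    to a linear gain factor for the downlink audio signal."""
    if not 0 <= setting < num_settings:
        raise ValueError("setting out of range")
    db = min_db + setting * (max_db - min_db) / (num_settings - 1)
    return 10.0 ** (db / 20.0)

def amplify(samples, setting):
    """Apply the gain for the given volume setting to a block of samples."""
    g = volume_to_gain(setting)
    return [g * s for s in samples]
```

Whether the gain is applied locally or forwarded to a remote amplifier in a headset, the setting-to-gain mapping itself could take this form.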
Still referring to
In one example, a dedicated volume switch 196 that is integrated in the housing of the host device may be detected or read by the decoder 186, through a housing-integrated switch interface circuit 195 (e.g., a simple switch biasing circuit). The switch 196 is depicted as a momentary rocker switch that is to be actuated during a call by the user's finger. In this case, the user can push or pull the switch 196 in one direction to signal the interface circuit 114 into a lower volume setting, and in an opposite direction to signal the interface circuit into a higher volume setting. Thus, each pushing or pulling of the switch 196 in a given direction will change to the next higher (or lower) volume setting. As an alternative to a rocker switch, a click wheel or rotary switch may be used for setting the volume.
There may also be a volume switch located in the microphone housing of a wired headset 194. The headset 194 may be connected to the (host) device 100 through a standard headset jack (not shown). In that case, a wired headset interface 193 of the device 100 contains part of a chipset that detects or reads the switch through the microphone bias line, and then provides this information to the decoder 186.
In yet another embodiment, a volume switch is integrated in a wireless headset 192. For that case, a wireless headset interface 191 of the (host) device 100 contains part of a short distance wireless interface chipset (e.g., a Bluetooth transceiver chipset) that detects or reads the switch through a wireless link with the host device 100. The decoder 186 could be alerted by the chipset, e.g. through an interrupt signal, in response to each switch actuation.
Once the downlink processor 172 has been signaled into the highest volume setting in response to actuation in a given direction, and the next actuation is also in the same direction, the downlink processor 172 will respond to this next actuation by changing its frequency response. The frequency response, which is in the audio frequency range, is that to which the downlink voice signal is subjected before being fed to the speaker interface circuit. In one embodiment, the change increases gain over a middle frequency band, M, relative to lower and upper frequency bands, L and H, as shown. Once the decoder 186 detects the maximum volume setting, the next actuation at that point is translated into an intelligibility boost (IB) setting, which is signaled to the downlink processor 172. In other words, as seen in
As depicted in
In addition to increasing the gain in the middle frequency band M, relative to the lower and upper frequency bands, the downlink signal processor 172 may be designed to further respond to an intelligibility boost setting 188, by increasing roll off in the lower frequency band, L, as depicted in the frequency response curves shown in
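The described frequency-response change, raising the middle band M relative to the lower and upper bands and increasing roll-off in the lower band L, can be sketched with a simple three-band split. The crossover frequencies, first-order filters, and gain values below are illustrative assumptions only, not taken from the specification:

```python
import math

def one_pole_lowpass(x, cutoff_hz, fs):
    """First-order IIR low-pass: y[n] = a*x[n] + (1-a)*y[n-1]."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y, state = [], 0.0
    for s in x:
        state += a * (s - state)
        y.append(state)
    return y

def three_band(x, fs=8000, lo_cut=300.0, hi_cut=3000.0,
               g_low=1.0, g_mid=1.0, g_high=1.0):
    """Split the signal into low (< lo_cut), middle, and high (> hi_cut)
    bands, apply per-band gains, and recombine."""
    low = one_pole_lowpass(x, lo_cut, fs)
    below_hi = one_pole_lowpass(x, hi_cut, fs)
    mid = [b - l for b, l in zip(below_hi, low)]
    high = [s - b for s, b in zip(x, below_hi)]
    return [g_low * l + g_mid * m + g_high * h
            for l, m, h in zip(low, mid, high)]

def intelligibility_boost(x, fs=8000):
    """Raise the mid (speech) band and roll off the lows; the 2x mid
    gain and 0.5x low gain are assumed example values."""
    return three_band(x, fs, g_low=0.5, g_mid=2.0, g_high=1.0)
```

With unity gains in all three bands the split recombines to the original signal (a flat response), which corresponds to the balanced shape used at normal volume settings.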
In many instances, the near end user 171 may decide during the call that the volume through the speaker 111 is too high and will, therefore, actuate the volume adjust button in the direction of decreasing volume. In that case, actuation of the transducer 185 in the direction of lower volume (starting from the maximum volume setting) will signal the downlink signal processor 172 to exit the intelligibility boost state and resume a “normal” volume (somewhere between minimum and maximum). The downlink signal processor 172 may, in that case, respond by changing its frequency response back to the balanced or flat shape 118.
When the call has ended, the downlink processor 172 may be automatically signaled to return to some normal volume setting, in preparation for the next call to be placed or received. For example, a telephony module (not shown) may be running in the device 100 that is responsible for managing calls, including signaling the network interface 176 to place a new call or disconnect an on-going call, and receiving a signal from the network interface 176 that a new call has been received or that an ongoing call has been disconnected. This information may be signaled to the downlink processor 172 to cause it to “reset”, i.e. deactivate the intelligibility boost and instead resume some normal volume setting, at the beginning of each new call.
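The decoder behavior described above, stepping through the volume range, entering the boost state on one further "up" actuation at maximum volume, exiting on a "down" actuation, and resetting at the start of each call, amounts to a small state machine. A minimal sketch, with assumed names and an assumed step count:

```python
# Hypothetical sketch of the volume / intelligibility-boost (IB) decoder
# logic: once at maximum volume, one more "up" actuation enables the
# boost; a "down" actuation exits it and resumes a normal volume; a new
# call resets the state. Setting values are assumptions for illustration.

class VolumeDecoder:
    def __init__(self, max_setting: int = 15, normal: int = 8):
        self.max_setting = max_setting
        self.normal = normal          # assumed "normal" mid-range volume
        self.setting = normal
        self.boost = False            # intelligibility boost state

    def actuate(self, direction: str) -> None:
        """Handle one actuation of the volume adjust button."""
        if direction == "up":
            if self.setting < self.max_setting:
                self.setting += 1
            else:
                self.boost = True     # already at max: enable IB
        elif direction == "down":
            if self.boost:
                self.boost = False    # exit IB, resume normal volume
                self.setting = self.normal
            elif self.setting > 0:
                self.setting -= 1

    def new_call(self) -> None:
        """Reset at the beginning of each call."""
        self.boost = False
        self.setting = self.normal
```

In a real device this logic could live in the decoder, in the telephony module, or be split between hardware and software; the sketch only captures the transitions themselves.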
Turning now to
The downlink signal processor 172 has an input that is coupled to the wireless communication network 278 through the RF circuitry 108 and the antenna, and an output that is alternatively coupled to an earpiece speaker or receiver 216 in handset mode, and a loudspeaker 214 in speakerphone mode. The downlink voice signal from the downlink processor 172 is fed to the acoustic transducer interface circuit 114. The interface circuit 114 may in part be integrated with a codec, so as to perform the function of converting the digital downlink voice signal into analog form, amplify the analog signal in accordance with the volume setting signaled by the decoder 186, and route the analog signal to either the loudspeaker 214 or the earpiece 216 depending upon whether the device 200 is operating in speakerphone mode or handset mode. The order of conversion, amplification and routing by this combination may differ. Note that as an alternative, when the wireless headset 192 has been activated, the interface circuit 114 can simply buffer the downlink voice signal and route it to the wireless headset interface circuit 191, which may be a wireless digital headset interface such as one that is compliant with Bluetooth technology (see
For the uplink side, the device 200 may have a similar arrangement as device 100, shown in
As in the embodiment of
The downlink signal processor 172 in this embodiment may also respond similarly as described above in connection with
Although not explicitly described here, the process of initiating an outgoing call 180, or answering an incoming call 180, may be in accordance with a number of different possible techniques. For example, in the embodiment of
The speakerphone mode aspects described above are also applicable to the instance where the call 180 is an outgoing call that has been placed by the near end user 171, for example, following a manual number dialing process using a virtual keypad (not shown) or by automatic dialing of a stored number associated with the name of the far end user 183 which has been selected from a contacts list stored in the device 200.
As suggested above, the embodiments of the invention may be particularly desirable in a mobile communications device, such as a mobile smart phone.
The device 200 includes input-output components such as handset microphone 213 and loudspeaker 214. When the speakerphone mode is not enabled, the sound during a telephone call is emitted from earpiece or receiver 216 that is placed adjacent to the user's ear during a call in the handset mode of operation. The device 200 may also include a headset jack (not shown) and a wireless headset interface, to connect with a headset device that has a built-in microphone, allowing the user to experience the call while wearing a headset that is connected to the device 200.
Referring to
In one example, there are one or more processors 120 that run or execute various software programs or sets of instructions (e.g., applications) that are stored in memory 102, to perform the various functions described below, with the assistance of or through the peripherals. These may be referred to as modules stored in the memory 102. The memory 102 also stores an operating system 126 of the device. The operating system may be an embedded operating system such as VxWorks, OS X, or others. It may include software components and/or drivers for controlling and managing the various hardware components of the device (e.g., memory management, power management, and sensor management), and it facilitates communication between the various software components or modules.
The device 200 may have wireless communications capability enabled by radio frequency (RF) circuitry 108 that receives and sends RF signals via an integrated or built-in antenna of the device 200. The RF circuitry may include RF transceivers, as well as digital signal processing circuitry that supports cellular network or wireless local area network protocol communications. The RF circuitry 108 may be used to communicate with networks such as the Internet with such protocols as the World Wide Web, for example. This may be achieved through either the cellular telephone communications network or a wireless local area network, for example. Different wireless communications standards may be implemented as part of the RF circuitry 108, including global system for mobile communications (GSM), enhanced data GSM environment (EDGE), high speed downlink packet access (HSDPA), code division multiple access (CDMA), Bluetooth, wireless fidelity (Wi-Fi), and Wi-Max.
The device 200 also includes audio circuitry 110 that provides an interface to acoustic transducers, such as the speaker 111 (e.g., a loudspeaker, an earpiece or receiver, a headset) and a microphone 113. These form the audio interface between a user of the device 200 and the various applications that may run in the device 200. The audio circuitry 110 serves to translate digital audio signals produced in the device (e.g., through operation of the processor 120 executing an audio-enabled application) into a format suitable for output to a speaker, and translates audio signals detected by the microphone 113 (e.g., when the user is speaking into the microphone) to digital signals suitable for use by the various applications running in the device.
The device 200 also has an I/O subsystem 106 that serves to communicatively couple various other peripherals in the device to the peripherals interface 118. The I/O subsystem 106 may have a display controller 156 that manages the low level processing of data that is displayed on the touch sensitive display screen 112. One or more input controllers 160 may be used to receive or send signals from and to other input control devices 116, such as physical switches or transducers (e.g., push button switches, rocker switches, etc.), dials, slider switches, joy sticks, click wheels, and so forth. In other embodiments, the input controller 160 may enable input and output to other types of devices, such as a keyboard, an infrared interface circuit, a universal serial bus (USB) port, or a pointer device such as a mouse. Physical buttons may include an up/down button for volume control of the speaker 111 and a separate sleep or power on/off button of the device 200. In contrast to these physical peripherals, the touch sensitive screen 112 is used to implement virtual or soft buttons and one or more soft keyboards.
The touch sensitive screen 112 is part of a larger input interface and output interface between the device 200 and its user. The display controller 156 receives and/or sends electrical signals from/to the touch screen 112. The latter displays visual output to the user, for example, in the form of graphics, text, icons, video, or any combination thereof (collectively termed “graphics” or image objects). The touch screen 112 also has a touch sensitive surface, sensor, or set of sensors that accept input from the user based on haptic and/or tactile contact. These are aligned directly with the visual display, typically directly above the latter. The touch screen 112 and the display controller 156, along with any associated program modules and/or instructions in memory 102, detect contact, movement, and breaking of the contact on the touch sensitive surface. In addition, they convert the detected contact into interaction with user-interface objects (e.g., soft keys, program launch icons, and web pages) whose associated or representative image objects are being simultaneously displayed on the touch screen 112.
The touch screen 112 may include liquid crystal display technology or light emitting polymer display technology, or other suitable display technology. The touch sensing technology may be capacitive, resistive, infrared, and/or surface acoustic wave. A proximity sensor array may also be used to determine one or more points of contact with the touch screen 112. The touch screen 112 may have a resolution in excess of 100 dpi. The user may make contact with the touch screen 112 using any suitable object or appendage, such as a stylus, a finger, and so forth. In some embodiments, the user interface is designed to work primarily with finger-based contacts and gestures, which are generally less precise than stylus-based input due to the larger area of contact of a finger. The device in that case translates the rough finger-based input into a precise pointer/cursor position or command for performing the action desired by the user.
The device 200 has a power system 162 for supplying electrical power to its various components. The power system 162 may include a power management system, one or more replenishable or rechargeable power sources such as a battery or fuel cell, a replenishing system, a power or failure detection circuit, as well as other types of circuitry including power conversion and other components associated with the generation, management and distribution of electrical power in a portable device.
The device 200 may also include one or more accelerometers 168. The accelerometer 168 is communicatively coupled to the peripherals interface 118 and can be accessed by a module being executed by the processor 120. The accelerometer 168 provides information or data about the physical orientation or position of the device, as well as rotation or movement of the device about an axis. This information may be used to detect that the device is, for example, in a vertical or portrait orientation (in the event the device is rectangular shaped) or in a horizontal or landscape orientation. On that basis, a graphics module 132 and/or a text input module 134 are able to display information “right side up” on the touch screen 112, regardless of whether the device is in a portrait or landscape orientation. The processing of the accelerometer data may be performed by the operating system 126 and in particular a driver program that translates raw data from the accelerometer 168 into physical orientation information that can be used by various other modules of the device as described below.
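A driver translating raw accelerometer readings into a portrait/landscape decision might, in its simplest form, compare which device axis gravity dominates. The function name, axis conventions, and sign conventions below are assumptions made for illustration:

```python
# Hypothetical sketch: classify device orientation from accelerometer
# gravity readings. Assumed conventions: ax is acceleration along the
# device's short axis and ay along its long axis, both in units of g,
# with negative ay meaning the long axis points "up" (upright portrait).

def classify_orientation(ax: float, ay: float) -> str:
    """Return a coarse orientation label based on which axis gravity
    dominates. A real driver would also debounce and hysterese."""
    if abs(ay) >= abs(ax):
        return "portrait" if ay < 0 else "portrait-upside-down"
    return "landscape-left" if ax < 0 else "landscape-right"
```

The graphics module can then use such a label to keep its output "right side up" regardless of how the device is held.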
The device 200 shown in
Turning now to the modules in more detail, a contact/motion module 130 may detect user initiated contact with the touch screen 112 (in conjunction with the display controller 156), and with other touch sensitive devices (e.g., a touchpad or physical click wheel). The contact/motion module 130 has various software components for performing operations such as determining if contact with the touch screen has occurred or has been broken, determining whether there is movement of the contact, and tracking the movement across the touch screen. Determining movement of the point of contact may include determining speed (magnitude), velocity (magnitude and direction), and/or acceleration of the point of contact. These operations may be applied to single contacts (e.g., one finger contacts) or to multiple simultaneous contacts (e.g., multi-touch or multiple finger contacts).
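Determining the speed and velocity of a tracked point of contact can be sketched from successive timestamped samples; the data layout and function name here are assumptions, not from the specification:

```python
import math

# Hypothetical sketch: derive the speed and velocity of one tracked
# contact from its last two (timestamp, x, y) samples. Units are
# whatever the touch controller reports (e.g., pixels and seconds).

def contact_velocity(samples):
    """samples: list of (t_seconds, x, y) for one tracked contact.
    Returns (speed, vx, vy) based on the two most recent samples."""
    if len(samples) < 2:
        return 0.0, 0.0, 0.0
    (t0, x0, y0), (t1, x1, y1) = samples[-2], samples[-1]
    dt = t1 - t0
    if dt <= 0:
        return 0.0, 0.0, 0.0        # guard against clock glitches
    vx, vy = (x1 - x0) / dt, (y1 - y0) / dt
    return math.hypot(vx, vy), vx, vy
```

Acceleration could be estimated the same way, by differencing successive velocity estimates; a production module would typically smooth the samples first.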
The graphics module 132 has various known software components for rendering and displaying graphics on the display of the touch screen 112 including, for example, icons of user interface objects such as soft keys and a soft keyboard. The text input module 134, which may be a component of graphics module 132, provides soft keyboards for entering text in different languages. Such soft keyboards are for use by various applications e.g., a telephone module 138, a contacts module 137 (address book updating), email client module 140 (composing an email message), browsing module 147 (typing in a web site uniform resource locator), and a translation module 141 (for entering words or phrases to be translated).
A GPS module 135 determines the geographic location of the device (using for example an RF-based triangulation technique), and provides this information for display or use by other applications, such as by the telephone module 138 for use in location-based dialing and applications that provide location-based services, such as a weather widget, local Yellow Pages widget, or map/navigation widgets (not shown). The widget modules 149 depicted here include a calculation widget 149_1 which displays a soft keypad of a calculator and enables calculator functions, an alarm clock widget 149_2, and a dictionary widget 149_3 that is associated or tied to the particular human language set in the device 200.
The telephone module 138 is responsible for managing the placement of outbound calls and the receiving of inbound calls made over a wireless telephone network, e.g. a cellular telecommunications network. Some or all of the functions of the decoder 186 described above in connection with
A calendar module 148 displays a calendar of events and lets the user define and manage events in her electronic calendar.
A music player module 146 may manage the downloading, over the Internet or from a local desktop personal computer, of digital media files, such as music and movie files, which are then played back to the user through the audio circuitry 110 and the touch sensitive display system 112.
It should be noted that each of the above-identified modules or applications correspond to a set of instructions to be executed by a machine such as the processor 120, for performing one or more of the functions described above. These modules or instructions need not be implemented as separate programs, but rather may be combined or otherwise rearranged in various combinations. For example, the text input module 134 may be integrated with the graphics module 132. In addition, the enablement of certain functions could be distributed amongst two or more modules, and perhaps in combination with certain hardware. For example, in one embodiment, the functions of the decoder 186 (see
To conclude, various aspects of a technique for giving a user of a communications device more convenient control of sound quality have been described. As explained above, an embodiment of the invention may be a machine-readable medium having stored thereon instructions which program a processor to perform some of the operations described above. In other embodiments, some of these operations might be performed by specific hardware components that contain hardwired logic. Those operations might alternatively be performed by any combination of programmed data processing components and fixed hardware circuit components.
A machine-readable medium may include any mechanism for storing or transferring information in a form readable by a machine (e.g., a computer), such as Compact Disc Read-Only Memory (CD-ROM), Read-Only Memory (ROM), Random Access Memory (RAM), and Erasable Programmable Read-Only Memory (EPROM). The invention is not limited to the specific embodiments described above. For example, the device 200 depicted in