1. Field of the Invention
The present invention generally relates to wireless multimedia headsets.
2. Description of the State-of-the-Art
Wireless headsets are common devices used for hands-free operation in conjunction with cell phones and VoIP phones, as well as with portable music players such as digital MP3 players. Such headsets typically include radio technology to access a given wireless system. For example, cell phone headsets use wireless technology to communicate with the cell phone handset such that the voice signals received by the handset over the cell phone system can be transferred to the headset. Similarly, wireless headsets for MP3 players use wireless technology to transfer music files from the player to the headset.
In one aspect the invention provides a High-Fidelity Multimedia Wireless Headset. In another aspect, the invention provides a wireless multimedia headset that can include multiple features such as multimedia storage with advanced search capability, a high fidelity sound system, peer-to-peer networking capability, and ultra low power such that the device is capable of operation without recharging.
In another aspect, the invention provides a multimedia headset and method for designing and operating a headset comprising: a plurality of multiple wireless interfaces; an advanced search engine with media search capability; a high fidelity sound processor; power management means for ultra low power operation; and network connectivity for peer-to-peer networking.
The present disclosure is generally directed to a wireless multimedia headset that can include multiple features and support multiple wireless systems. These features can include any combination of a multimedia storage with advanced search capability; a high fidelity sound system; peer-to-peer networking capability; and an ultra low power consumption, such that the device is capable of operation without recharging. The headset can also provide a platform for both existing and new headset applications (such as “push-to-talk” between headsets) to enable access to the device features.
The headset 100 is capable of several applications 125, in addition to power management 130 to enhance battery life. The headset 100 supports Voice over IP (VoIP) 135 directly through any of the interfaces that allow it to connect to the Internet, as well as an audio subsystem 140 that includes several functionalities such as, for example, noise cancellation 145 (and beamforming) through microphone array processing 150, in addition to voice recognition 155 and MP3 support 160. Multiple wireless systems may be integrated into the headset 100, including, but not limited to, GPS and different radio systems (AM/FM/XM) 165, various cellular phone standards (3G/2G/GSM/Edge and/or Wimax) 170, different Wifi standards (802.11a/b/g/n) 175, and 802.15 (Bluetooth, Zigbee, and/or UWB) 180. In most embodiments, an antenna, or array of antennas, having antenna algorithms 185 is used as part of the wireless system or subsystems disclosed herein.
The device also includes a SIM card 245 that, for example, identifies the user account with a network, handles authentication, provides data storage for basic user data and network information, and may also contain applications. The power subsystems 250 include advanced power management 255 functionality to control energy use through power supplies 260. Solar cells 265 are also available to assist in sustaining the supply of power. The solar cells 265 can charge the battery 270 from ambient light as well as solar light. A battery charger 275 is included and can charge the battery, for example, through the input of a DC current 280.
The speech recognizer 320 receives information from a digital signal processor (DSP) 322, which collects, processes, compresses, transmits and displays analog and digital data for feature extraction from an acoustic signal 324. The speech recognizer 320 is designed such that the wireless interfaces 325 and/or peer-to-peer network 330 can be used to provide an additional input to the algorithm. Specifically, the algorithm will have the ability to use any of the available wireless interfaces 325 and/or peer-to-peer network 330 to connect to another device 335 such as, for example, a laptop, computer, or handset to include other capabilities including, but not limited to, expanding the vocabulary base or providing translation assistance to the engine in the speech recognizer 320.
The algorithm will take the user's speech as a query 110 or command 115 input and initiate an indexing function 340. The sources of indexed data 342 include, but are not limited to, automated metadata extraction 343 and user entered metadata 344. Automated metadata 343 includes, for example, music, video, and contact information. User entered metadata 344 includes, for example, personal photographs. For commands, the indexing function 340 will take the appropriate action 345 to satisfy the command 115. For queries 110, the indexing function 340 will enable the search engine to locate the desired file and provide search results 350 to the user interface 355 and then take the appropriate action 345, such as dialing a number 357 or playing a desired song 360.
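The query and command handling described above can be sketched as follows. This is a minimal illustration only: the `MediaEntry` record, the `INDEX` contents, the `COMMAND_VERBS` vocabulary, and the `handle_utterance` function are hypothetical names introduced for this example and are not part of the disclosure.

```python
from dataclasses import dataclass


@dataclass
class MediaEntry:
    title: str
    media_type: str  # e.g. "music" or "contact"
    payload: str     # illustrative: a file path or a phone number


# Hypothetical index, standing in for automated and user-entered metadata.
INDEX = [
    MediaEntry("blue in green", "music", "/music/blue_in_green.mp3"),
    MediaEntry("alice", "contact", "555-0100"),
]

# Assumed command vocabulary: verb -> media type the verb acts on.
COMMAND_VERBS = {"dial": "contact", "play": "music"}


def handle_utterance(text):
    """Treat recognized speech as a command or a query and respond."""
    words = text.lower().split()
    if words and words[0] in COMMAND_VERBS:
        # Command: find the named target and report the action to take.
        wanted_type = COMMAND_VERBS[words[0]]
        target = " ".join(words[1:])
        for entry in INDEX:
            if entry.media_type == wanted_type and entry.title == target:
                return ("action", words[0], entry.payload)
        return ("action", words[0], None)
    # Query: return matching entries as search results for the user interface.
    results = [e for e in INDEX if any(w in e.title.split() for w in words)]
    return ("results", results)
```

Under these assumptions, `handle_utterance("dial alice")` yields an action carrying the stored number, while an utterance that does not start with a known verb is treated as a query and returns a result list.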
The algorithms for noise cancellation (and beamforming) 145 based on the microphone array input 228 speech can be designed relative to the speech recognition algorithm, such that the feature extraction of the input 228 speech is optimized. One of skill will appreciate that noise cancellation/beamforming algorithms designed independent of the speech recognition algorithms can degrade speech recognition performance by introducing undesired speech artifacts. The speech recognition will categorize recognized speech as either a query 110 (e.g. look for a particular song) or a command 115 (e.g. dial a specific number).
An additional antenna element may be placed inside the ear canal with signal processing through a distorted voice parameter extraction component 425 to invert the distortion of the ear canal transmission and enhance the voice parameters. The antenna elements 435,440,445,450 in the microphone array will have weights assigned to each antenna input. Different algorithms can be used to determine the weights, depending on the performance criteria, the number of antenna elements available and their nature, and the algorithm complexity. For example, the weights may be used to minimize ambient noise, to make the antenna array gain independent of frequency, to minimize the expected mean square distortion or error of the signal, or to steer the direction of the microphone array 227 towards the speaker as shown in
The beam forming block 444 may include analog circuits, digital circuits, a combination of analog and digital circuits, hardwired or programmable processing, or other means for processing the input signals and altering the individual microphones and the microphone array and/or the processing of the individual microphone 410,415,420 output signals to achieve the desired beam steering. The beam steering has the effect of focusing the sensitivity of the microphone array 515 as a whole toward a desired sound source, such as the human speaker 505. It may alternatively be used to steer the sensitivity away from an objectionable sound source.
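One simple way to realize beam steering with per-element weights is a delay-and-sum beamformer. The sketch below is an assumption chosen for illustration, not the specific algorithm of the beam forming block 444; the function name `delay_and_sum` and the use of whole-sample delays are hypothetical simplifications.

```python
def delay_and_sum(channels, delays, weights=None):
    """Steer an array by time-aligning each channel and forming a weighted sum.

    channels: list of equal-length lists of samples, one per microphone.
    delays:   number of samples by which each channel lags the reference.
    weights:  per-microphone weights; defaults to a uniform average.
    """
    n_mics = len(channels)
    if weights is None:
        weights = [1.0 / n_mics] * n_mics
    length = len(channels[0])
    out = [0.0] * length
    for samples, delay, weight in zip(channels, delays, weights):
        # Advance each channel by its lag so the desired source lines up.
        for n in range(length - delay):
            out[n] += weight * samples[n + delay]
    return out
```

Sound arriving from the steered direction adds coherently after alignment, while sound from other directions adds with mismatched delays and is attenuated. Non-uniform weights could instead be chosen to meet the other criteria mentioned above, such as minimizing mean square error.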
Advantageously, the beam steering will be used to increase the human speaker 505 (or other sound source) signal to background noise ratio or to otherwise achieve a desired result. The output 545 of the beam forming block 444 is combined with an output 560 from a background noise cancellation block 565. The background noise canceller 412 receives a background noise input signal 570 as the output electrical signal of an ambient noise microphone 410. This ambient noise microphone 410 is primarily responsible for sensing or detecting an acoustic ambient noise signal and transducing or otherwise converting it into an electrical ambient noise signal 570 which it communicates to a background noise canceller 412. Since the microphone array 515 may advantageously be steered toward the user 505 and may advantageously include a directional characteristic such that most of the sensitivity of the microphone array 515 is in the direction of the user 505, the amount or signal strength of the steerable microphone array 515 relative to the user will be higher for the user signal and lower for the ambient noise.
The amount or signal strength of the ambient noise microphone 410 relative to the user 505 will be lower for the user signal and higher for the ambient noise because of the non-steerable and typically non-directional character of the ambient noise microphone 410. In at least one non-limiting embodiment, the use of a plurality of microphones for sensing the sounds of the user 505 or speaker may provide added sensitivity over the sensitivity of a single ambient noise microphone. It should however be appreciated that multiple microphones may be used for the ambient noise sensing.
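The relationship between the steered array signal (mostly user speech) and the ambient noise microphone signal (mostly noise) can be illustrated with a single-tap least-mean-squares (LMS) adaptive canceller. LMS is an assumed technique used here for illustration; the disclosure does not specify the cancellation algorithm. The names `lms_cancel` and `demo`, the step size, and the synthetic signals are all hypothetical.

```python
import math
import random


def lms_cancel(primary, reference, mu=0.005):
    """Single-tap LMS: estimate the noise leakage gain and subtract it.

    primary:   samples from the steered array (speech plus leaked noise).
    reference: samples from the ambient noise microphone.
    Returns the cleaned samples and the final leakage estimate.
    """
    w = 0.0
    cleaned = []
    for p, r in zip(primary, reference):
        e = p - w * r    # residual after removing the estimated noise
        w += mu * e * r  # nudge the estimate along the error gradient
        cleaned.append(e)
    return cleaned, w


def demo(n=5000, leak=0.8, seed=0):
    """Synthetic check: a sine stands in for speech, uniform noise for ambience."""
    rng = random.Random(seed)
    speech = [0.5 * math.sin(2 * math.pi * 0.01 * k) for k in range(n)]
    noise = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    primary = [s + leak * v for s, v in zip(speech, noise)]
    cleaned, w = lms_cancel(primary, noise)
    return speech, primary, cleaned, w
```

In this synthetic setting the weight settles near the assumed leakage gain, so the residual approaches the speech component. A practical canceller would likely need multiple taps to model the acoustic path between the microphones.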
The output signal 545 from the beam forming block 444 is combined with the output signal 560 from the background noise canceller 412 to generate a signal 585 that is communicated to other processing circuitry, such as, for example, to the frequency domain noise enhancer in the embodiment of
The headset will have nonvolatile storage for multimedia data files, typically music files, for example through a Flash RAM. There are many methods by which the multimedia data files may be loaded into the headset memory, for example via a wireless connection to the Internet, via a cellular telephone connection, via a satellite (e.g. XM or Sirius) or AM/FM radio receiver, via a USB high-speed data port, or via a wired or wireless connection to another device (e.g. a wireless connection to a computer, music server, handset, PDA, or other wireless device). The library may be partitioned by media type, for example, there may be one partition of the memory for music, one for phone numbers, and the like.
File storage will include the capability to add “tags” to files. The tagging is done to facilitate searching based on tags that the user selects for each media type. For example, a music file might have a tag or tags such as file title, song title, artist, keywords, genre, album name, music sample or clip, and the like. The headset will contain intelligent software for searching multimedia files stored on the headset based on multiple search criteria and by the type of file of interest. Alternatively, a user can set up certain tags for all files downloaded under the given tagging criterion. The user need only enter this tag or set of tags once, and then change the tag or tags when a change is desired so that, for example, all music downloaded at a given time will have the same tag. This is particularly useful for a headset since it is very hard to do manual entry for each new file.
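The tagging scheme, including tags that are set once and applied to every subsequent download, might be sketched as follows. The `Library` class and its method names are hypothetical and serve only to illustrate the idea.

```python
class Library:
    """Toy tagged file store; all names here are illustrative assumptions."""

    def __init__(self):
        self.files = {}         # file name -> dict of tag name -> value
        self.default_tags = {}  # applied to each new file until changed

    def set_default_tags(self, **tags):
        """Enter a tag set once; it applies to all later downloads."""
        self.default_tags = dict(tags)

    def add_file(self, name, **tags):
        """Store a file with the current defaults plus any per-file tags."""
        merged = dict(self.default_tags)
        merged.update(tags)  # per-file tags override the defaults
        self.files[name] = merged

    def find(self, **criteria):
        """Return the names of files whose tags match every criterion."""
        return sorted(
            name
            for name, tags in self.files.items()
            if all(tags.get(key) == value for key, value in criteria.items())
        )
```

With this arrangement, a user could call `set_default_tags(genre="jazz")` once, download several files without further input, and later retrieve them with `find(genre="jazz")`.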
The search engine (SE) will implement a search algorithm consisting of a multistep process to locate a file or set of files of interest. This generalized search engine will re-use a number of similar functions for different kinds of searches such as speech recognition and name recognition. The search engine (SE) interacts with the user through the user interface, which for example can be control buttons or via speech. In the case of speech commands, the headset synthesizes a speech signal to query the user, and the user's speech commands are processed by a speech recognition engine and then sent to the SE. The noise cancellation (and beamforming) 145 capabilities of the microphone array, described above, can be combined with the speech recognition engine to improve its performance.
Returning to the step of determining (step 625) whether more than one file or content matches the search term(s) or other search criteria: if the determination is that only one file or content matches (no), that file or content is sent to the user (step 635). If either the step of determining if one or more files match type and search term(s) (step 620), or the step of determining if the user has requested more than one file (step 630), is negative, then a determination is made in which the search engine queries the user to determine if the user wishes to change the search term(s) or other search criteria (step 640). If the answer is yes, then the step of the search engine scanning the library or other database, storage, or other potential file or content source (step 615) is repeated. If the determination (step 640) is no, then the search terminates (step 645). The user may of course repeat the search at any time with different search terms. It may be appreciated that this search engine logic is exemplary and non-limiting and that other search engine logic or procedures may be implemented. Furthermore, although the search may be directed to files or content such as music, it may alternatively be directed to other types of content such as audio books, podcasts, or other content.
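The search logic of steps 615 through 645 can be sketched as a loop. The function `run_search` and its parameters are hypothetical. Delivering all matches when the user has requested more than one file is an assumption, since that branch is not spelled out in the passage above.

```python
def run_search(library, terms, wants_multiple, next_terms):
    """Illustrative search loop.

    library:        dict mapping file name -> set of keywords.
    terms:          set of search words that must all match.
    wants_multiple: True if the user requested more than one file.
    next_terms:     callable returning new terms, or None to stop.
    """
    while True:
        # Step 615: scan the library for files matching every term.
        matches = sorted(
            name for name, keywords in library.items() if terms <= keywords
        )
        # Steps 620 and 625: exactly one match is sent to the user (step 635).
        if len(matches) == 1:
            return matches
        # Step 630 (assumed branch): several matches, several files requested.
        if len(matches) > 1 and wants_multiple:
            return matches
        # Step 640: ask whether the user wants to change the search terms.
        terms = next_terms()
        if terms is None:
            return []  # Step 645: the search terminates.
```

Passing `next_terms` as a callable stands in for the user interface prompt, so the same loop could be driven by control buttons or by recognized speech.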
As shown in
The headset may be designed such that a certain application or set of applications that require relatively low power can be maintained for an indefinite time period under solar power alone. For example, solar cells embedded in the device, together with aggressive power management that shuts down all nonessential functions except those associated with the specific application or applications, will allow the device to support the given application(s) indefinitely without recharging. For example, the device may operate indefinitely without recharging in Bluetooth-only or Zigbee-only mode by shutting down all functions not associated with maintaining a low-rate wireless connection to the handset through Bluetooth or Zigbee; in voice-only mode the device may operate indefinitely without recharging by shutting down all functionality of the device not associated with making a voice call (e.g. certain memory access, audio processing, noise cancellation, and search algorithms) through one or more interfaces that support such calls (e.g., 2G, 3G, GSM, VoIP over Wifi), and the like. Exemplary strategies and processes are illustrated in the embodiment of
The headset may advantageously support simultaneous operation on the different wireless interfaces, such as for example simultaneous operation on at least two systems that may include Wifi (802.11a/b/g/n), Wimax, 3G cellular, 2G cellular, GSM-EDGE, radio (e.g. AM/FM/XM), 802.15 (Bluetooth, UWB, and Zigbee) and GPS. These systems often operate at different frequencies and may require different antenna characteristics. The simultaneous operation over different frequencies can be done, for example, by using some set of antennas for one system and using another set of antennas for another system.
There are two main components to the peer-to-peer networking protocol: neighbor discovery and routing. In neighbor discovery a headset determines which other devices it can establish a direct connection with. This may be done, for example, by setting aside a given control channel for neighbor discovery, where nodes that are already in the peer-to-peer network listen on the control channel for new nodes beginning the process of neighbor discovery. When a node first begins the process of neighbor discovery, it broadcasts a beacon identifying itself over a control channel set up for this purpose. Established nodes on the network periodically listen on the control channel for new nodes. If an established node on the network hears a broadcast beacon, it will establish a connection with the broadcasting node. The existing node will exchange information with the new node about the existing network to which it belongs, e.g. it may exchange the routing table it has for other nodes in the network with the new node. The neighboring node will also inform other nodes on the network about the existence of the new node, and that it can be reached via the neighboring node, e.g. by exchanging updated routing tables with the other nodes. At that point the new node becomes part of the network and activates the routing protocol to communicate with all nodes in the network.
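The neighbor discovery exchange can be illustrated with a small in-memory model. The `Node` class, its next-hop routing table, and the `hear_beacon` method are hypothetical; actual radio signaling on the control channel is not modeled.

```python
class Node:
    """Toy peer-to-peer node; routes maps destination name -> next-hop name."""

    def __init__(self, name):
        self.name = name
        self.routes = {}

    def hear_beacon(self, new_node, network):
        """Handle a beacon heard by this established node on the control channel.

        network is the list of nodes already in the peer-to-peer network.
        """
        # Share the existing routing table: the new node reaches every
        # known destination through this neighboring node.
        for destination in list(self.routes):
            new_node.routes[destination] = self.name
        new_node.routes[self.name] = self.name
        self.routes[new_node.name] = new_node.name
        # Inform the other nodes that the new node is reachable via this one.
        for other in network:
            if other is not self:
                next_hop = other.routes.get(self.name, self.name)
                other.routes[new_node.name] = next_hop
        network.append(new_node)
```

After the exchange, the new node holds a route to every existing node through its neighbor, and every existing node holds a route to the new node, which is the point at which the routing protocol can take over.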
The routing protocol will take advantage of link layer flexibility in establishing and utilizing single and multihop routes between nodes with the best possible end-to-end performance. The routing protocol will typically be based on least-cost end-to-end routing by assigning costs for each link used in an end-to-end route and computing the total cost based on these link costs. The cost function is designed to optimize end-to-end performance. For example, it may take into account the data rates, throughput, and/or delay associated with a given link in coming up with a cost of using that link. It may also adjust link layer parameters such as constellation size, code rate, transmit power, use of multiple antennas, etc., to reduce the cost of a link and thereby the cost of an end-to-end route.
In addition, for nodes with multiple antennas, multiple independent paths can be established between these nodes, and these independent paths can comprise separate links over which a link cost is computed. The routing protocol can also include multiple priorities associated with routing of each data packet depending on data priority, delay constraints, user priority, and the like.
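Least-cost end-to-end routing over per-link costs can be sketched with Dijkstra's algorithm. The cost formula in `link_cost`, including its `alpha` and `beta` parameters, is a hypothetical example of a cost function that accounts for data rate and delay; the disclosure does not prescribe a particular formula.

```python
import heapq


def link_cost(rate_mbps, delay_ms, alpha=1.0, beta=0.1):
    """Hypothetical link cost: lower for faster, lower-delay links."""
    return alpha / rate_mbps + beta * delay_ms


def least_cost_route(links, src, dst):
    """Find the route with the smallest total link cost.

    links: dict mapping node -> list of (neighbor, cost) pairs.
    Returns (total_cost, path); path is empty if dst is unreachable.
    """
    best = {src: 0.0}
    previous = {}
    heap = [(0.0, src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == dst:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, step in links.get(node, []):
            candidate = cost + step
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if dst not in best:
        return float("inf"), []
    path = [dst]
    while path[-1] != src:
        path.append(previous[path[-1]])
    return best[dst], path[::-1]
```

Independent paths between multi-antenna nodes could be represented as separate entries in `links`, each with its own cost. Adjusting link layer parameters such as transmit power or code rate would change the inputs to `link_cost` and thereby the chosen route.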
The headset will also be developed as an open architecture so that third party applications can utilize the headset capabilities of high-fidelity sound, large memory, advanced searching capabilities, peer-to-peer networking, and multiple wireless connections. The architecture of the headset will enable this by providing the appropriate subsystem and software interfaces.
As shown in
The headset will be designed such that a certain application or set of applications that require relatively low power can be maintained for an indefinite time period under solar power alone, i.e. solar cells embedded in the device and aggressive power management will allow the device to support the given application(s) indefinitely without recharging by shutting down all nonessential functions except those associated with the specific application or applications. For example, the device may operate indefinitely without recharging in Bluetooth-only or Zigbee-only mode by shutting down all functions not associated with maintaining a low-rate wireless connection to the handset through Bluetooth or Zigbee; in voice-only mode the device may operate indefinitely without recharging by shutting down all functionality of the device not associated with making a voice call (e.g. certain memory access, audio processing, noise cancellation, and search algorithms) through one or more interfaces that support such calls (e.g. 2G, 3G, GSM, VoIP over Wifi), etc.
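The mode-based shutdown strategy can be sketched as a lookup of which subsystems stay powered in each mode, followed by a comparison against the harvested solar power. Every subsystem name and milliwatt figure below is an illustrative placeholder, not a measured or disclosed value, and the function names are hypothetical.

```python
# Placeholder power draws in milliwatts; purely illustrative assumptions.
SUBSYSTEM_POWER_MW = {
    "bluetooth": 2.0,
    "wifi": 80.0,
    "cellular": 120.0,
    "audio_dsp": 15.0,
    "noise_cancel": 12.0,
    "search": 10.0,
}

# Subsystems kept alive in each low-power mode (assumed mapping).
MODE_KEEP = {
    "bluetooth_only": {"bluetooth"},
    "voice_only": {"cellular"},
}


def apply_mode(mode):
    """Return the subsystems to shut down and the remaining power draw."""
    keep = MODE_KEEP[mode]
    shut_down = sorted(set(SUBSYSTEM_POWER_MW) - keep)
    draw_mw = sum(SUBSYSTEM_POWER_MW[name] for name in keep)
    return shut_down, draw_mw


def sustainable(mode, harvested_mw):
    """True if harvested power covers the mode's draw without recharging."""
    return apply_mode(mode)[1] <= harvested_mw
```

A mode is sustainable indefinitely only if its remaining draw does not exceed the power the solar cells supply, which is why the nonessential subsystems are shut down first.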
The headset supports simultaneous operation on the different wireless interfaces, i.e. simultaneous operation on at least two systems that may include Wifi (802.11a/b/g/n), Wimax, 3G cellular, 2G cellular, GSM-EDGE, radio (e.g. AM/FM/XM), 802.15 (Bluetooth, UWB, and Zigbee) and GPS. These systems often operate at different frequencies. The simultaneous operation over different frequencies can be done, for example, by using some set of antennas for one system and using another set of antennas for another system.
Another mechanism to support this simultaneous multifrequency operation is time division. In addition to simultaneous operation, the headset can support seamless handoff between two systems. For example, the headset could switch a VoIP call from a wide-area wireless network such as Wimax or 3G to a local area network such as Wifi.
There are two main components to the peer-to-peer networking protocol: neighbor discovery and routing. In neighbor discovery a headset determines which other devices it can establish a direct connection with. This may be done, for example, by setting aside a given control channel for neighbor discovery, where nodes that are already in the peer-to-peer network listen on the control channel for new nodes beginning the process of neighbor discovery. When a node first begins the process of neighbor discovery, it broadcasts a beacon identifying itself over a control channel set up for this purpose. Established nodes on the network periodically listen on the control channel for new nodes. If an established node on the network hears a broadcast beacon, it will establish a connection with the broadcasting node. The existing node will exchange information with the new node about the existing network to which it belongs, e.g. it may exchange the routing table it has for other nodes in the network with the new node. The neighboring node will also inform other nodes on the network about the existence of the new node, and that it can be reached via the neighboring node, e.g. by exchanging updated routing tables with the other nodes. At that point the new node becomes part of the network and activates the routing protocol to communicate with all nodes in the network. A flow chart describing this process is shown in
Number | Date | Country
---|---|---
60741672 | Dec 2005 | US