The present disclosure relates to point-to-point wireless communication and more specifically to a digital wireless communication protocol providing low latency to facilitate binaural applications for hearing instruments.
Hearing instruments, such as hearing aids or ear-worn speakers (e.g., ear buds), can be worn in or on each ear of a user to provide sound to a user. Additionally, the hearing instruments may include one or more microphones to receive audio signals from an environment of the user. For example, audio from the environment may be received and converted (i.e., sampled) into a first digital signal (i.e., left channel) by a left-worn hearing instrument, and audio from the environment may be received and converted into a second digital signal (i.e., right channel) by a right-worn hearing instrument. Processing to improve a user's hearing experience is possible if the left channel and the right channel can be processed together (i.e., binaural processing).
Such binaural processing requires communication, preferably wireless digital communication, between the hearing instruments. Power for such communication is limited primarily by size and weight restrictions, but also by the desire to minimize electromagnetic interference. The short distance between a user's ears enables the effective use of near field magnetic induction (NFMI) for wireless digital communication, offering efficiency but limiting the available frequency range. Existing communications protocols can introduce a substantial communications latency in such systems. Such latencies impair the performance of binaural processing techniques such as beamforming.
Accordingly, there are disclosed herein an ultra-low latency communication protocol and associated methods, devices, and systems suitable for providing wireless digital communication of audio data. One illustrative communication method suitable for a central (aka primary, master) hearing instrument includes: transmitting a preamble packet to initiate a wireless connection; after receiving a preamble response packet, wirelessly sending a downlink stream of audio data frames; and wirelessly receiving an uplink stream of audio data frames. The audio data frames of the downlink stream and the uplink stream each consist of a message packet, a check packet, and multiple single-sample audio data packets, and these packets exclude any preambles or sync words. The audio data frame packets of the downlink stream and the uplink stream are interleaved with each other.
An illustrative communication method suitable for a peripheral (aka secondary, slave) hearing instrument includes: wirelessly receiving a preamble packet to initiate a wireless connection; responsively transmitting a preamble response packet; after transmitting the preamble response packet, receiving a downlink stream of audio data frames; and responsive to each audio data frame packet of the downlink stream, sending an audio data frame packet of an uplink stream.
An illustrative hearing instrument includes: an analog to digital converter to produce a series of local audio samples; a digital signal processor to obtain a series of output audio samples by combining the series of local audio samples with a series of received audio samples; a digital to analog converter to convert the series of output audio samples into an output audio signal; and a wireless signal transceiver to send a downlink stream of audio data frames representing the series of local audio samples and to receive an uplink stream of audio data frames representing the series of received audio samples. The audio data frames of the downlink stream and the uplink stream each consist of a message packet, a check packet, and multiple single-sample audio data packets. The audio data frame packets exclude any preambles or sync words, and the audio data frame packets of the downlink stream are interleaved with those of the uplink stream.
The foregoing methods and instruments may be implemented separately or conjointly, together with one or more of the following optional features in any suitable combination:
1. The preamble packet comprises a preamble and ends with a sync word.
2. The preamble response packet comprises a shortened preamble and the sync word.
3. The message packet and the check packet each include a single audio data sample.
4. Using a shared clock source for sampling audio data for the downlink stream and for said wirelessly sending the downlink stream.
5. The transceiver is configured to initiate a wireless connection with a second hearing instrument by sending a preamble packet that includes a preamble and ends with a sync word.
6. The transceiver is configured to send the downlink stream only after receiving a preamble response packet having a shortened preamble and the sync word.
7. The wireless connection is via near field magnetic induction.
8. Deriving a communication clock from the preamble packet and the downlink stream and using the communication clock for said transmitting and receiving.
The following description and accompanying drawings are provided for explanatory purposes, not to limit the disclosure. In other words, they provide the foundation for one of ordinary skill in the art to recognize and understand all modifications, equivalents, and alternatives falling within the scope of the claims.
Binaural processing can improve a user's hearing experience by processing audio received at different hearing instruments (e.g., worn at each ear of the user). For example, a user that is deaf in one ear may hear a combined left/right audio channel in the hearing ear so that sounds on the side of the user's deaf ear may be heard more easily. In another example, a user may receive audio with reduced noise, using a binaural processing technique known as beamforming. Conventional wireless digital communication protocols (e.g., Bluetooth, WiFi) introduce a communication delay (aka transport delay, latency) that can negatively affect the binaural processing. Disclosed herein are circuits and methods to reduce wireless digital communication latency to facilitate binaural processing, such as beamforming.
Beamforming may be performed on audio from two spatially separated microphones on a single hearing instrument (i.e., monaural beamforming); however, beamforming performed on audio from microphones on a left hearing instrument at a left ear of the user and a right hearing instrument at a right ear of the user (i.e., binaural beamforming) may offer some advantages. For example, the audio from the left and right hearing instruments may have an interaural delay and amplitude difference resulting from the separation between the left/right ears and a direction of the sound, which can improve the quality of the beamforming.
Conventional digital wireless communication often includes propagating radio frequency (RF) signals (e.g., 2.4 gigahertz (GHz)) between a transmitter antenna and a receiver antenna. For conventional digital wireless communication, such as Bluetooth and WiFi, the RF signals are intended to propagate according to far-field transmission principles. Communicating between hearing instruments worn in opposite ears using these forms of digital wireless communication could result in (at least) poor efficiency and interference with other devices. To avoid these problems, the disclosed circuits and methods can use a near field magnetic induction (NFMI) communication technology, which is better suited for wireless communication between body worn devices, such as hearing instruments.
NFMI facilitates digital communication over a short range (e.g., <1 meter). For NFMI communication, each hearing instrument 101, 103 can include a coil 240 that is coupled to that hearing instrument's transceiver. The transmitter of a first hearing instrument 101 can be configured to excite a current in a coil 240 of the transmitter to produce a magnetic field 250 that is inductively coupled to a coil 240 of a receiver of a second hearing instrument 103. The magnetic field 250 may be modulated (e.g., frequency shift key (FSK) modulation) for communicating digital information between the hearing instruments. The coils may be substantially similar (e.g., identical) and can be arranged to optimize magnetic coupling to maximize the efficiency of NFMI. Further, magnetic field 250 can be tightly coupled, having an amplitude that drops quickly with range (i.e., near field transmission). Accordingly, NFMI communication can minimize interference with other devices. For digital communication, the magnetic field 250 may be modulated in a high-frequency (HF) band carrier (e.g., 10-14 megahertz (MHz)). Signals in the HF band may experience less distortion/absorption from the human body than conventional RF signal frequencies (e.g., 2.4 GHz).
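The rapid amplitude falloff of the near field can be illustrated with a small calculation. A magnetic dipole's near-field amplitude falls off roughly as 1/r³ (versus 1/r for far-field radiation), which is why an NFMI link sized for the ear-to-ear distance causes little interference beyond roughly a meter. The reference range below is an illustrative assumption, not a disclosure value:

```python
# Near-field magnetic dipole amplitude falls off roughly as 1/r^3,
# versus 1/r for far-field radiation. The reference range r0 (~ear-to-ear
# scale) is an illustrative assumption.

def relative_amplitude(r, r0=0.2, exponent=3):
    """Field amplitude at range r, normalized to the amplitude at range r0."""
    return (r0 / r) ** exponent
```

At ten times the reference range, the near-field amplitude is down by a factor of 1000, compared with only a factor of 10 for a far-field signal.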
The hearing instrument 101 includes at least one microphone 320 configured to convert received sounds 220 into an analog audio signal 321. The analog audio signal 321 is coupled to an analog-to-digital converter (A/D) 330 that is configured to periodically sample (i.e., samples spaced by a sampling period) the analog audio signal 321 at a sample rate (e.g., 24 kilohertz (kHz)) and to convert the analog audio samples into digital samples having a binary representation of their amplitude (e.g., 16 bits), and to output the digital samples in sequence as a digital data series 331. The series of digital audio samples 331 may be transmitted to a processor (e.g., a digital signal processor (DSP)) 340, which can be configured to use the series 331 as a channel for binaural processing, such as beamforming. In a possible implementation, the hearing instrument can include a delay buffer 335 that is configured to generate a series 332 of delayed digital audio samples for the processor 340. The delay buffer 335 may provide a desired delay for the chosen binaural processing application and may further be configurable to compensate for wireless communications latency of the digital audio samples from the other hearing instrument. The series of digital audio samples may also be provided to an audio encoder 344 that is configured to encode (e.g., compress) the digital data stream to reduce a number of bits communicated over the wireless communication link 230. The audio encoder 344 may be a portion of an audio codec 350 that includes (at least) an encoder 344 and a decoder 345. The audio codec 350 may be one of a plurality of possible types, including (but not limited to) an adaptive differential pulse-code modulation (ADPCM) codec. The encoder 344 may output an encoded data stream 341 for transmission to a transmitter portion of the transceiver 310 for transmission to another hearing instrument.
The transceiver 310 may further include a receiver portion that is configured to receive an encoded data stream from another hearing instrument over the wireless communication link 230 via the coil 240. The receiver portion may couple the received encoded data stream 342 to the decoder 345. The decoder 345 can be configured to decode (e.g., decompress) the received encoded data stream and output a series 346 of received digital audio samples to the DSP 340.
The DSP 340 may be configured to process the local digital audio sample series 331 (aka first channel, local channel) and the received digital audio sample series 346 (aka second channel, remote channel, received channel) for a binaural application and to output a processed series of digital audio samples 351. The processed digital data series 351 can be a combination of the local channel 331 and the remote channel 346. In a possible implementation, the DSP 340 is configured (e.g., by software) to perform beamforming processing on the first channel and the second channel.
The hearing instrument 101 may further include a non-transitory computer readable information storage medium (e.g., nonvolatile memory) 360. The memory 360 may be configured to store a computer program product (i.e., software) including instructions that, when executed by a processor, configure the processor to perform operations to enable functionality of the hearing instrument. For example, the memory 360 may include stored instructions that can configure the DSP 340 to perform a beamforming process (i.e., method). Additionally, or alternatively, the memory may include stored instructions that can configure a processor (e.g., the DSP 340 or a separate microcontroller) to perform a method associated with an ultra-low latency NFMI communication protocol.
The hearing instrument 101 may further include a digital-to-analog converter (D/A) 355 that is configured to parse the digital samples of the processed digital data series 351 and to generate an analog speaker signal 324 based on the digital samples. The analog speaker signal 324 can be amplified and coupled to at least one speaker 325 of the hearing instrument 101 to produce transmitted sounds 210. The transmitted sounds 210 may provide an improved hearing experience for a user because of the binaural processing. For example, the transmitted sounds may include less noise than transmitted sounds without the processing provided by the binaural application.
As mentioned, the hearing instrument 101 shown in
To illustrate cooperation between multiple hearing instruments,
The remote channel used by the processor 340 to generate a series of output audio samples for D/A converter 355 begins as sound received by a microphone 420 in the remote hearing instrument. In accordance with a sample clock, an A/D converter 430 digitizes samples of the analog signal to produce a series of remote audio samples. An audio encoder 444 compresses the series of audio samples to reduce bandwidth requirements. An audio encoder such as an adaptive differential pulse-code modulator (ADPCM) enables a series of 24-bit audio signal samples to be well represented as a series of, e.g., 5-bit quantized errors measured relative to the output of a recursive prediction filter. Similarly, a 16-bit audio stream can be well represented as a series of 4-bit or 5-bit quantized errors, depending on the sophistication of the recursive prediction filter.
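The predictive-quantization idea can be sketched as follows. This is a minimal ADPCM-style codec with a previous-sample predictor and a 4-bit adaptive quantizer; the step-adaptation rule is an illustrative choice, not the codec specified by the disclosure:

```python
# Minimal ADPCM-style codec sketch: previous-sample predictor plus a
# 4-bit adaptive quantizer. The step-adaptation rule here is a simple
# illustrative choice, not the one mandated by the disclosure.

def adpcm_encode(samples, bits=4):
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    pred, step = 0, 16
    codes, recon = [], []
    for s in samples:
        q = max(lo, min(hi, round((s - pred) / step)))  # quantized prediction error
        pred += q * step                                # encoder tracks its own reconstruction
        codes.append(q)
        recon.append(pred)
        # widen the step when the quantizer saturates, narrow it when idle
        step = max(1, step * 2 if abs(q) >= hi else (step // 2 if abs(q) <= 1 else step))
    return codes, recon

def adpcm_decode(codes, bits=4):
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    pred, step = 0, 16
    out = []
    for q in codes:
        pred += q * step
        out.append(pred)
        step = max(1, step * 2 if abs(q) >= hi else (step // 2 if abs(q) <= 1 else step))
    return out
```

Because the decoder mirrors the encoder's state updates exactly, only the small quantized errors need to cross the wireless link; a production codec (e.g., IMA ADPCM) would use a tabulated step-size schedule and a higher-order predictor.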
Because the compression process removes most of the signal redundancy, a channel encoder 490 re-introduces a controlled amount of redundancy to enable error detection and correction. As one example, two Hamming parity polynomial bits can be added to each five bits to enable one bit error of the seven total bits to be corrected (single-error correction, or SEC) or detection of up to two bit errors in a seven-bit packet (double-error detection, or DED). Additional parity bits can be added to further increase the detectable and/or correctable number of errors in each packet, at the cost of requiring additional channel bandwidth. The channel encoder 490 may further apply a scrambling mask to the data before or after the parity bits are added. Such randomization of the data tends to improve system performance, particularly when the data might otherwise exhibit a predictable pattern, e.g., in a low-noise environment.
A modulator 485 maps the packet bits to channel symbols, e.g., representing each zero with a first frequency and each one with a second different frequency. Other illustrative forms of modulation include amplitude shift keying (ASK) and phase shift keying (PSK). As explained further below, the modulator 485 in the central hearing device employs a transmit clock that is based on the local A/D sample clock or derived from the same clock source as the sample clock. This shared clock source facilitates synchronization of at least the downlink stream to the audio samples. A mixer 475 multiplies the channel signal with a carrier signal from an oscillator 480 to provide a frequency upshift to the modulated signal. An amplifier 468 filters the upshifted signal and applies it as a drive signal via mode switch 466 to antenna 440. The mode switch 466 switches the antenna coupling between the transmit amplifier 468 and the receive amplifier 470.
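The bit-to-frequency mapping performed by the modulator can be sketched as phase-continuous binary FSK. The tone frequencies, sample rate, and samples-per-bit below are illustrative assumptions, not disclosure values:

```python
import math

# Phase-continuous binary FSK sketch: each 0 is sent at f0, each 1 at f1.
# The tone spacing, bit rate, and sample rate are illustrative assumptions.

def fsk_modulate(bits, f0=1000.0, f1=2000.0, fs=48000.0, samples_per_bit=48):
    out, phase = [], 0.0
    for b in bits:
        f = f1 if b else f0
        step = 2 * math.pi * f / fs           # phase increment per sample
        for _ in range(samples_per_bit):
            phase = (phase + step) % (2 * math.pi)
            out.append(math.sin(phase))
        # the accumulated phase carries across bit boundaries, avoiding
        # discontinuities that would widen the transmitted spectrum
    return out
```

Carrying the phase accumulator across bit edges keeps the waveform continuous, which matters for the narrowband HF carrier described earlier.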
Antenna 440 converts the drive signal to electromagnetic fields that produce a receive signal in another antenna 240. When mode switch 366 decouples the antenna 240 from transmit amplifier 368 and couples it to receive amplifier 370, the receive amplifier provides a buffered and filtered receive signal to mixer 375. Mixer 375 multiplies the receive signal with a carrier signal from oscillator 380 to frequency downshift the receive signal to baseband or near-baseband.
A demodulator 385 performs filtering, timing recovery, and data detection to convert the downshifted signal into bits. Channel decoder 390 reverses the operations of channel encoder 490 to obtain the series of compressed audio data samples, and audio decoder 345 reconstructs the received series of digital audio samples from the compressed audio data samples. Note that the channel decoder 390 operates on potentially corrupted data to detect bit errors in each packet, correcting them when possible. When an error is corrected, the decoder can optionally flag the relevant audio data sample as being corrected. When errors are detected but not correctable, the audio data sample from that packet may optionally be flagged as having an uncorrectable error. Such flags may be taken into account as part of the processing performed by the processor 340, e.g., with replacement or de-emphasis of the relevant audio data samples to prevent such errors from creating noticeable audio artifacts.
The processor 340 performs binaural processing, whether for beamforming or for providing monaural audio, by summing the left and right signals with optional scaling and/or signal delay. The processing may include directional detection of an audio source with corresponding adaptation of relative channel contributions and delays to increase or decrease sensitivity in that direction. The additional components traversed by the remote audio channel data create a communications latency that may be at least partly compensated by delay buffer 335. The software or firmware stored in memory 360 may cause the processor 340 or a separate microcontroller for transceiver 310 and codec 350 to implement a low-latency wireless streaming method having single-sample audio data packets to minimize communications latency. Alternatively, this method may be implemented using application-specific integrated circuitry.
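The summing-with-scaling-and-delay operation at the heart of such binaural combination can be sketched as a simple delay-and-sum beamformer. The integer-sample delay and the equal channel weights below are illustrative choices:

```python
# Delay-and-sum combination of a local and a remote audio channel.
# The remote-channel delay and the channel weights are illustrative;
# an adaptive beamformer would tune them based on source direction.

def delay_and_sum(local, remote, remote_delay=0, w_local=0.5, w_remote=0.5):
    """Combine two channels, delaying the remote channel by an integer
    number of samples before the weighted sum."""
    out = []
    for n in range(len(local)):
        k = n - remote_delay
        r = remote[k] if 0 <= k < len(remote) else 0  # zero-pad outside the buffer
        out.append(w_local * local[n] + w_remote * r)
    return out
```

When the delay matches the interaural arrival difference for a source, the two channels add coherently for that direction while uncorrelated noise adds incoherently, which is the basis of the beamforming gain.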
Also shown are a central receive enable (CNTRL_RX_EN) signal and a peripheral receive enable (PERIPH_RX_EN) signal which may be used to control the mode switches 366, 466. A central transmit signal (CENTRAL_TX here, CTX in
In
Typical examples of a preamble pattern include alternating bits or alternating bit pairs, e.g., 010101 . . . or 00110011. . . . The length of the long preamble may account for the training time typically required by peripheral hearing instrument to derive a communication clock and may span more than one sampling period. The long preamble may be, e.g., 48 bits long, 64 bits long, or more. The sync word may be chosen to be a pattern not found in the preamble or in any sequence of the channel encoder outputs. Though this selection depends on the choice of channel encoder, one example is the eight-bit sequence 11011011. Alternatively, the sync word may be an extension of the preamble pattern, but with inverted bits to signal the transition between the two.
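Sync-word detection after timing recovery amounts to a sliding search over the demodulated bit stream. The sketch below uses the eight-bit example sequence 11011011 from the text; a hardware implementation would use a shift register and comparator rather than list slicing:

```python
# Sliding search for the sync word in a demodulated bit stream, using
# the eight-bit example sequence from the text. Returns the index of
# the first bit after the sync word, or -1 if the sync word is absent.

SYNC = [1, 1, 0, 1, 1, 0, 1, 1]

def find_sync(bits, sync=SYNC):
    for i in range(len(bits) - len(sync) + 1):
        if bits[i:i + len(sync)] == sync:
            return i + len(sync)              # index of first payload bit
    return -1
```

Note that the alternating preamble pattern can never produce a false match, since the sync word contains consecutive identical bits that the preamble lacks.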
The central hearing instrument may send the preamble packet periodically until a response is detected, giving the peripheral hearing instrument multiple opportunities to detect and respond. Upon achieving accurate timing recovery and sensing the sync word, the peripheral hearing instrument sends a preamble response packet that includes a preamble (PPRE) and the sync word. This preamble may be the same as the CPRE preamble, but in practice a shortened preamble may be used. The duration of the preamble response packet may preferably be less than one sampling interval, limiting the length of the short preamble to perhaps 16 bits or so. Because the peripheral hearing instrument employs a communication clock derived from the preamble packet timing, the central hearing instrument's demodulator may require only minimal time for timing recovery and packet data detection. With the exchange of sync words and frequent packet exchanges, the hearing instruments can maintain tight coupling of timing information.
Upon detecting the preamble response packet, the central hearing instrument may send a downlink stream of audio data frames each representing multiple digital audio samples. For the purposes of the following explanation and with reference to
The message word is a fixed number of bits, e.g., 16 bits or 32 bits, representing a command with any associated parameter value(s). In some contemplated implementations, the command may be a read or write of a selected control register, enabling the central hearing instrument to sense or set the contents of peripheral hearing instrument registers that control its behavior and/or the operating parameters for the communication protocol, such as frame length, sample resolution, sample rate, channel encoder configuration, and audio codec configuration. For the uplink stream, the message word may convey acknowledgements or data in response to such commands, or, in the absence of such commands, status information. The checksum may be a cyclic redundancy check (CRC) for the preceding message word. In at least some contemplated embodiments, it has the same number of bits as the message word, but this is not a requirement.
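The message-word checksum can be illustrated with a standard 16-bit CRC. The CCITT polynomial used below is a common choice that, like a 16-bit message word, yields a 16-bit check value; the disclosure does not fix a particular polynomial:

```python
# CRC-16/CCITT-FALSE (polynomial 0x1021, initial value 0xFFFF) as an
# illustrative message-word checksum; the disclosure does not specify
# which CRC polynomial is used.

def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            # shift left; on carry-out, fold in the generator polynomial
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc
```

For a 16-bit message word, the two-byte encoding of the word would be fed to the function, producing a checksum of the same width as the message word, consistent with the equal-width option mentioned above.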
For unidirectional communication, the peripheral hearing instrument need not send single-sample packets, but rather may send only the preamble response packet as part of startup phase 610, and as part of each uplink frame 621, 622, may send the message packet and check packet at the positions where they would be expected in the bidirectional system.
Returning to
Block 708 is repeated until a quarter of the downlink audio data frame has been sent and a quarter of the uplink audio data frame has been received. Once this point is detected in block 710, the controller causes a local digital audio sample to be encoded and appended to a check word for the previous message word to form a check packet. The transceiver sends the check packet and responsively receives an SSA packet in block 712. In block 714, the operations of block 708 are repeated until the halfway point is detected in block 716.
In block 718, the controller causes a local digital audio sample to be encoded and sent as an SSA packet. The transceiver responsively receives an uplink message packet. In block 720, the operations of block 708 are repeated until the three-quarter point is detected in block 722. In block 724, the controller causes a local digital audio sample to be encoded and sent as an SSA packet. The transceiver responsively receives an uplink check packet.
In block 726, the controller evaluates the check packet and uplink message word, possibly in combination with evaluations from previous frames. In some implementations, a single checksum failure may be taken as an indication that the connection is lost and needs to be reset. In other implementations, two or three consecutive checksum failures may be required to determine that the connection is lost and needs to be reset. If such a determination is made, the controller returns to block 702 to restart the connection. Otherwise, in block 728, the operations of block 708 are repeated until the end of frame is detected (via a packet counter) in block 730. Thereafter the controller returns to block 706.
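The frame progression through blocks 706-730 can be summarized as a slot schedule: the downlink message packet occupies the first slot, the downlink check packet the quarter-point slot, and the uplink message and check packets the half and three-quarter points, with single-sample audio (SSA) packets everywhere else. The sketch below assumes a 16-slot frame for illustration; the actual frame length is a configurable protocol parameter:

```python
# Slot schedule for one interleaved audio data frame, following the
# block 706-730 progression. The 16-slot frame length is illustrative;
# the disclosure makes frame length a configurable parameter.

def frame_schedule(n=16):
    dl, ul = [], []                           # downlink and uplink packet types per slot
    for slot in range(n):
        if slot == 0:
            dl.append("MSG"); ul.append("SSA")    # downlink message opens the frame
        elif slot == n // 4:
            dl.append("CHK"); ul.append("SSA")    # downlink check at the quarter point
        elif slot == n // 2:
            dl.append("SSA"); ul.append("MSG")    # uplink message at the halfway point
        elif slot == 3 * n // 4:
            dl.append("SSA"); ul.append("CHK")    # uplink check at the three-quarter point
        else:
            dl.append("SSA"); ul.append("SSA")
    return dl, ul
```

Staggering the uplink overhead packets a quarter frame after the downlink ones keeps each direction's message/check pair paired with an SSA packet in the other direction, so every slot still carries one audio sample each way.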
The controller repeats listening block 802 until the preamble packet is received, thereafter sending a preamble response packet in block 804. In block 806, the controller listens until a message packet is received, and responds by causing a local digital audio sample to be encoded as an SSA packet and sent in block 808. The audio data from each of the downlink packets is decoded to obtain the downlink audio data stream, which is forwarded to the DSP for binaural processing.
In block 810, the controller listens for SSA packets, returning to block 808 each time one is received until in block 812 the controller determines that a quarter of the downlink frame has been received. In block 814, the controller listens for a check packet and uses it in block 816 to determine whether the connection has been lost and needs to be reset. The determination may be done in a similar fashion as that of block 726 (
To provide more even timing (and better latency minimization), the central hearing instrument's controller may seek to ensure that all the downlink stream packets end at the same point in the sampling clock period. To this end, the longer packets (i.e., message packet, check packet) may be started earlier in the sampling clock period than the SSA packets. Conversely, the peripheral hearing instrument's controller may operate to start each uplink frame packet at the same point in the sampling clock period, regardless of packet type.
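The start-earlier rule for the longer downlink packets reduces to a simple offset computation: each packet's start time within the sampling period is the common end point minus its length. The packet bit counts and the common end point below are illustrative values, not disclosure parameters:

```python
# Start-offset computation so every downlink packet *ends* at the same
# point in the sampling period: longer packets simply start earlier.
# Packet bit counts and the common end point are illustrative values.

def start_offsets(packet_bits, end_point_bits=40):
    """Map packet type -> start offset (in bit times) within the sampling period."""
    return {name: end_point_bits - bits for name, bits in packet_bits.items()}
```

With, say, 7-bit SSA packets and 23-bit message/check packets, the message and check packets start 16 bit times earlier than the SSA packets, and all three types finish at bit time 40.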
Though the operations of
While the foregoing discussion has focused on audio streaming in the context of hearing aids, the foregoing principles can be useful for many applications, particularly those involving audio streaming to or from smart phones or other devices benefitting from low latency wireless audio streaming. Any of the controllers described herein, or portions thereof, may be formed as a semiconductor device using one or more semiconductor dice. Though the operations shown and described in
It will be appreciated by those skilled in the art that the words during, while, and when as used herein relating to circuit operation are not exact terms that mean an action takes place instantly upon an initiating action, but that there may be some small but reasonable delay(s), such as various propagation delays, between the initiating action and the reaction that it initiates. Additionally, the term while means that a certain action occurs at least within some portion of a duration of the initiating action. The use of the word approximately or substantially means that a value of an element has a parameter that is expected to be close to a stated value or position. The terms first, second, third and the like in the claims and/or in the Detailed Description or the Drawings, as used in a portion of a name of an element, are used for distinguishing between similar elements and not for describing a sequence, whether temporally, spatially, in ranking, or in any other manner. It is to be understood that the terms so used are interchangeable under appropriate circumstances and that the embodiments described herein are capable of operation in other sequences than described or illustrated herein. Inventive aspects may lie in less than all features of any one given implementation example. Furthermore, while some implementations described herein include some, but not other, features included in other implementations, combinations of features of different implementations are meant to be within the scope of the invention and form different embodiments, as would be understood by those skilled in the art.
The present application claims priority to Provisional U.S. Application 63/469,295 filed 2023 Apr. 14 and titled “Ultra-low latency NFMI communication protocol” by inventors A. Heubi and I. Coenen, which is hereby incorporated herein by reference. The present application further relates to pending U.S. application Ser. No. 17/931,747 filed 2022 Sep. 13 and titled “Low-latency communication protocol for binaural applications” by inventors I. Coenen and D. Mitchler, which is hereby incorporated herein by reference. Application Ser. No. 17/931,747 is a divisional of issued U.S. Pat. No. 11,503,416, filed 2021 Jan. 7 with the same title and inventors, and also hereby incorporated herein by reference.
Number | Date | Country
--- | --- | ---
63496295 | Apr 2023 | US