The present disclosure also relates to an apparatus, such as an ear-worn device, that employs neural networks to process signals.
Ear-worn devices such as hearing aids may be used to help those who have trouble hearing to hear better. Typically, hearing aids amplify received sound. Some hearing aids attempt to remove environmental noise from incoming sound.
A recurrent neural network (RNN) is a type of neural network in which the result of processing at one time step may affect the processing at a subsequent time step. RNNs thus have “states” that can persist from one time step to another and represent context information derived from analysis of previous inputs. It should be appreciated that other types of neural networks may also be stateful (i.e., have and use states) and may also use the methods described herein.
An RNN may include an input layer, one or more RNN layers, and an output layer. Each RNN layer may include input nodes, output nodes, and states. In some embodiments, there may be one type of state and the states may be referred to as “hidden states.” In some embodiments, such as a long short-term memory (LSTM) type of RNN, there may be two types of states and the states may be referred to as “hidden states” and “cell states.” At each time step, the states from the previous time step may be concatenated with the inputs from the current time step.
Recurrent neural networks and other stateful neural networks may suffer from drawbacks. Over time, states in a neural network may drift and attain sets of values they never attained during training. As a result, the neural network may suffer degradation in performance over time. Such degradation in performance over the long term may be avoided by resetting certain states (i.e., one or more states) of the neural network. Doing so, however, may come with the drawback that, immediately after resetting the states, performance of the neural network may be degraded because the neural network is operating without the benefit of the context information derived from processing prior inputs. Therefore, although resetting the states may address the problem of states drifting to values they never attained during training, a different problem is created.
The inventors have discovered that to address both problems with stateful neural networks, multiple neural networks may operate in parallel on the same input, but their states may be reset at times offset from each other. In this manner, the problem of long-term degradation can be avoided for both neural networks, and at any given point in time at least one of the neural networks may have established state information aiding in calculation of its output. The output of the neural networks may be processed in combination such that the combined output from the parallel neural networks utilizes more heavily the prediction from the neural network whose states are at an optimal point of processing. The neural network system may thus reduce its reliance on a neural network whose states are at a non-optimal point. In other words, the output from one neural network may be weighted more than the output of another neural network. For example, the weight for a neural network that has been reset recently may be lower than the weight for another neural network. The parallel neural networks may have different architecture and different weights, or the same architecture and weights. Due to staggered reset times for states, the parallel neural networks may have different states even if their architectures and weights are the same.
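The staggered-reset scheme described above can be sketched as follows. This is a minimal illustration only: the toy `StatefulNet` (an exponential moving average standing in for a stateful network), the reset period, and the linear warmup-based weighting are all illustrative assumptions, not the disclosed architecture.

```python
# Sketch of two stateful networks running in parallel with staggered
# state resets. The EMA-based "network" and the timing constants are
# illustrative assumptions, not the disclosed implementation.

class StatefulNet:
    """Toy stateful model: an exponential moving average of its input."""
    def __init__(self, alpha=0.9):
        self.alpha = alpha
        self.state = 0.0  # the "state" that accumulates context

    def reset(self):
        self.state = 0.0  # reset the state to zero (one reset variant)

    def step(self, x):
        self.state = self.alpha * self.state + (1 - self.alpha) * x
        return self.state

def staggered_weights(t, period, warmup):
    """Weight for a network reset every `period` steps; ramps linearly
    from 0 to 1 over `warmup` steps after each reset."""
    since_reset = t % period
    return min(1.0, since_reset / warmup)

net_a, net_b = StatefulNet(), StatefulNet()
period, warmup = 100, 10
outputs = []
for t in range(300):
    x = 1.0  # constant input for illustration
    # Reset the two networks at times offset by half a period.
    if t % period == 0:
        net_a.reset()
    if (t + period // 2) % period == 0:
        net_b.reset()
    ya, yb = net_a.step(x), net_b.step(x)
    # Weight each output by how long its network has had to warm up,
    # then normalize so the weights sum to one. A recently reset
    # network therefore contributes less to the combined output.
    wa = staggered_weights(t, period, warmup)
    wb = staggered_weights(t + period // 2, period, warmup)
    total = wa + wb if (wa + wb) > 0 else 1.0
    outputs.append((wa * ya + wb * yb) / total)
```

Because the resets are offset by half a period, at every time step at least one network has established state, so the combined output never relies solely on a freshly reset network.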
In general, one or more states in a neural network may be reset. In some embodiments, all states in the neural network may be reset. In some embodiments, only certain types of states in a neural network may be reset. For example, in an LSTM neural network, in some embodiments only cell states but not hidden states may be reset. In some embodiments, only certain states of one or more layers of a neural network may be reset, and states from different layers or groups of layers may be reset at different times. As a specific example, all states of one layer (e.g., layer 1) may be reset at one time, then all states of a different layer (e.g., layer 2) may be reset at another time, etc. Thus, as referred to herein, resetting one or more states of a neural network may refer to resetting all the states of the neural network, resetting only certain states (e.g., states of a certain type or types) of the neural network, resetting all the states of one or more layers (e.g., one layer) of a neural network, and/or resetting certain states (e.g., states of a certain type or types) of one or more layers (e.g., one layer) of a neural network.
As referred to herein, resetting a state may refer to actively changing values in the state to 0, or actively changing values in the state to a different value other than zero. Additionally, as referred to herein, resetting a state may refer to actively changing values in the state immediately, or over a finite length of time. In the latter case, the reset may be smooth, such that the values in the state decay over time to zero or to a different value.
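A smooth reset as described above can be sketched as a decay of the state values over a finite number of steps. The linear decay schedule, step count, and zero target below are illustrative assumptions; any decay profile or target value could be substituted.

```python
# Sketch of a "smooth" reset: instead of zeroing a state vector
# immediately, its values decay toward a target over a fixed number of
# steps. The linear schedule and the target are illustrative.

def smooth_reset(state, target=0.0, steps=8):
    """Yield the state after each step of a linear decay to `target`."""
    for k in range(1, steps + 1):
        frac = k / steps
        yield [(1 - frac) * s + frac * target for s in state]

state = [4.0, -2.0, 1.0]
trajectory = list(smooth_reset(state))
# After the final step, every value has reached the reset target.
```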
Conventional ear-worn devices (such as hearing aids, cochlear implants, earphones, etc.) receive an input acoustic signal, amplify the signal, and output it to the wearer. Hearing aid performance can be improved by utilizing neural networks, for example to reduce noise in audio signals. In some embodiments, parallel RNNs such as those described herein may be implemented in an ear-worn device such as a hearing aid. The inputs to the neural networks may be based on audio received by one or more microphones on the hearing aid, and the outputs of the parallel neural networks may be combined in a weighted combination to develop the output that is played back to the wearer by a receiver of the hearing aid. However, while some embodiments of the technology described herein may relate to hearing aids or other ear-worn devices such as cochlear implants and earphones, this should be understood to be non-limiting, and it should be appreciated that any device implementing stateful neural networks may utilize this technology as well.
Various aspects and embodiments of the application will be described with reference to the following figures. It should be appreciated that the figures are not necessarily drawn to scale. Items appearing in multiple figures are indicated by the same reference number in all the figures in which they appear.
The aspects and embodiments described above, as well as additional aspects and embodiments, are described further below. These aspects and/or embodiments may be used individually, all together, or in any combination of two or more, as the disclosure is not limited in this respect.
As illustrated, the neural network 106 may be configured to receive an input 102. In the context of a hearing aid, for example, the input 102 may be an audio signal. In some embodiments, the input 102 may undergo processing prior to input to the neural network system 100. For example, in the context of a hearing aid, the audio signal may undergo analog processing (e.g., preamplification and/or filtering) and digital processing (e.g., wind reduction, beamforming, anti-feedback, Fourier transformation, and calibration). The audio signal may also be broken up into snippets (e.g., as described with reference to
Generally, one or more layers of the neural network 104 may be configured to operate in parallel with one or more layers of the neural network 106. A layer of one neural network may be considered configured to operate in parallel with a layer of another neural network when each layer is configured to receive and process the same input. For example, in
It should be appreciated from the above that one full neural network (e.g., the neural network 106) may be configured to operate (i.e., configured to process the input 102) while in parallel less than a full neural network (e.g., fewer than all layers of the neural network 104, such as one layer of the neural network 104) may be configured to operate. One or more layers may be considered configured to process the input 102 if they receive the input 102 itself or a result of processing the input by another layer. In some embodiments, when two neural networks operate in parallel, one may be configured to have more layers operating than the other. An example schedule for parallel operation of layers in different neural networks is illustrated in
In some embodiments, the output 116 of the neural network system 100 may be an output that the neural network system 100 uses to reduce a noise component of the signal to obtain an enhanced output. For example, the neural network system output 116 may be a mask that, when multiplied by the input 102, leaves just the speech portion of an audio signal remaining. In some embodiments, the neural network system output 116 may be the enhanced output. For example, in the context of a hearing aid, the neural network system output 116 may be the speech portion of the input 102.
The combiner 110 may combine the outputs 112 and 114 of the neural networks 104 and 106 in any suitable manner. In some embodiments, the outputs of neural networks 104 and 106 may be combined according to a weighting scheme. For example, the combiner 110 may multiply the output 112 by a first weight, multiply the output 114 by a second weight (which may be different than the first weight), and add the two products together. Example weighting schemes are described further herein.
Each of the neural networks 104 and 106 may be a separate neural network. As described above, the neural networks 104 and 106 may have different architectures with different weights. Alternatively, the neural networks 104 and 106 may have the same architecture with the same set of weights. The states of each of the neural networks 104 and 106 may change as the neural network runs, and may be different from one another, for example, due to staggered reset times. In some embodiments, using the same weights for each neural network may save memory (e.g., in a device such as a hearing aid implementing the neural networks).
As referred to herein, the output of a neural network should be understood to include the output of any layer or layers of the neural network. Thus, the output of a neural network, as referred to herein, may be the output of the first layer of the neural network, an intermediate layer of the neural network, the final layer of the neural network, or any combination of multiple layers. Similarly, processing an input by a neural network should be understood to include processing the input by any layer or layers of the neural network.
When implemented by neural network circuitry (e.g., the neural network circuitry 720), in some embodiments, the states of a neural network's layers that are configured to operate may be stored in memory of the neural network circuitry, while the states of layers not configured to operate may not be stored in memory. For example, while the neural network 104 is illustrated in
In the example of
The outputs of the neural networks may be weighted prior to being combined together. As described above, the output of a neural network should be understood to include the output of any layer or layers of the neural network. Thus, the output of a neural network described with reference to
In general, the weighting scheme may be a linear piecewise function, a smooth function such as a sinusoidal function, or another function to transition the weight between neural networks. Thus, the weighting scheme may be a dynamic weighting scheme such that a first weight for a first neural network output and a second weight for a second neural network output change over a period of time. Such a dynamic weighting scheme may include variable weights that correspond to different reset timings, such as first and second variable weights that each transition between values of 0 and 1.
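The two classes of transition function named above can be sketched as follows. The specific ramp endpoints and the raised-cosine form of the smooth function are illustrative assumptions; the disclosure does not fix a particular formula.

```python
# Sketch of two transition functions for the dynamic weighting scheme:
# a linear piecewise ramp and a smooth (raised-cosine) alternative.
# Both map elapsed time to a weight in [0, 1]; the second network's
# weight is the complement, so the pair always sums to one.
import math

def linear_ramp(t, t_start, duration):
    """Piecewise-linear weight: 0 before t_start, 1 after the ramp."""
    if t <= t_start:
        return 0.0
    if t >= t_start + duration:
        return 1.0
    return (t - t_start) / duration

def cosine_ramp(t, t_start, duration):
    """Smooth sinusoidal weight over the same transition window."""
    x = linear_ramp(t, t_start, duration)
    return 0.5 * (1 - math.cos(math.pi * x))

t_start, duration = 10.0, 4.0
w1 = linear_ramp(12.0, t_start, duration)   # halfway through the ramp
w2 = 1.0 - w1                               # complementary weight
```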
In some embodiments, a controller (e.g., implemented by control circuitry 726) may calculate the weights for each neural network on the fly. In such embodiments, it may be helpful to use linear functions for the weighting scheme to reduce the computational complexity for the controller. For example, the controller may only need to calculate the weight for one neural network, and then the weight for the other neural network may be obtained by subtracting the first weight from one. Nevertheless, in some embodiments, each neural network may have an independent schedule, and the controller may calculate the weights for each independently. In some embodiments, the weights may be stored in memory and the controller need not calculate the weights on the fly.
Certain weighting schedules described herein (e.g., in
In some embodiments, the weights may be determined based on a confidence metric associated with each neural network's output, such that when the two neural networks are combined, the neural network with the higher quality/confidence is weighted higher. For example, one confidence metric may be frame-to-frame consistency, as consistency in a neural network's output may generally correlate strongly with the neural network's confidence. A neural network that is not confident may have outputs that are very different across frames.
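One way to realize the confidence-based weighting above is sketched below. The cosine-similarity form of the frame-to-frame consistency metric and the normalization rule are illustrative assumptions; the disclosure names the metric but not a specific formula.

```python
# Sketch of a frame-to-frame consistency metric used as a confidence
# score, and of turning two confidences into combination weights.
# The cosine-similarity form is an illustrative assumption.
import math

def frame_consistency(prev_frame, cur_frame):
    """Cosine similarity between consecutive output frames, clamped to
    [0, 1]; similar frames score near 1, dissimilar frames near 0."""
    dot = sum(p * c for p, c in zip(prev_frame, cur_frame))
    norm = math.sqrt(sum(p * p for p in prev_frame)) * \
           math.sqrt(sum(c * c for c in cur_frame))
    if norm == 0.0:
        return 0.0
    return max(0.0, dot / norm)

def confidence_weights(consistency_a, consistency_b):
    """Normalize per-network confidences into combination weights, so
    the more consistent (more confident) network is weighted higher."""
    total = consistency_a + consistency_b
    if total == 0.0:
        return 0.5, 0.5
    return consistency_a / total, consistency_b / total
```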
In some embodiments, the reset might not occur at a fixed interval but rather during periods in which the neural network can reset with minimal impact. For example, in some embodiments, one of the neural networks may transition and reset during a period of silence. In some embodiments, periods of silence may be used for resetting and can allow for only one neural network to be used. In other words, even immediately after resetting, the output of the neural network may be used, as the period of silence is likely to continue after resetting, and the neural network is not likely to introduce artifacts during periods of silence. The periods of silence may be periods containing neither speech nor noise above a threshold (e.g., 20 dB, 40 dB, or 60 dB).
In some embodiments, the reset may occur only when a particular condition is met. By using metrics to monitor the quality of the neural network output, the controller may determine whether a reset should occur. Once a metric (e.g., frame-to-frame consistency) crosses a certain threshold, a reset may be initiated, beginning with a second neural network being initialized. Once the second neural network completes its warmup period, then the weights may begin shifting to incorporate the second neural network output more into the combined output.
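The condition-triggered reset described above can be sketched as a simple controller loop. The threshold value, warmup length, and the hard 0/1 weight shift are illustrative assumptions; in practice the weight could transition gradually as described elsewhere herein.

```python
# Sketch of a condition-triggered reset: a reset is initiated only when
# a monitored quality metric (e.g., frame-to-frame consistency) falls
# below a threshold, after which the second network warms up before the
# weights shift toward it. All constants are illustrative.

def reset_controller(metrics, threshold=0.5, warmup_frames=3):
    """Return, per frame, the weight to give the second network."""
    weights = []
    warmup_left = None            # None: no reset in progress
    for m in metrics:
        if warmup_left is None and m < threshold:
            warmup_left = warmup_frames   # metric crossed: start warmup
        if warmup_left is None:
            weights.append(0.0)   # second network not yet needed
        elif warmup_left > 0:
            weights.append(0.0)   # second network still warming up
            warmup_left -= 1
        else:
            weights.append(1.0)   # warmup complete: shift to second net
    return weights
```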
The input data may be sampled as snippets of data, and each snippet of data may be a certain amount of time long (“window”), and may begin a certain amount of time after the previous snippet began (“step”). In some embodiments, the window may be between 1 and 10 ms, such as 4 ms. In some embodiments, the step may be between 1 and 5 ms, such as 2 ms. In a non-limiting embodiment in which the window is 4 ms and the step is 2 ms, every 2 ms, the 4 most recent milliseconds of input data may be passed through each neural network. Depending on the window and step sizes, multiple snippets may sample the same input data. For example, in
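The snippet sampling described above, with a 4 ms window and a 2 ms step, can be sketched as follows. The 16 kHz sample rate is a hypothetical assumption used only to convert milliseconds to sample counts; the disclosure does not fix a sample rate.

```python
# Sketch of snippet sampling with a 4 ms window and a 2 ms step, as in
# the non-limiting example above. Sample counts assume a hypothetical
# 16 kHz input.

def snippets(samples, window_ms=4, step_ms=2, rate_hz=16_000):
    """Yield overlapping snippets of `samples` (a list of amplitudes)."""
    win = window_ms * rate_hz // 1000   # samples per window (64)
    step = step_ms * rate_hz // 1000    # samples per step (32)
    for start in range(0, len(samples) - win + 1, step):
        yield samples[start:start + win]

audio = list(range(256))               # stand-in for 16 ms of audio
chunks = list(snippets(audio))
# With a 4 ms window and a 2 ms step, adjacent snippets overlap by
# half, so each sample (away from the edges) appears in two snippets.
```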
As illustrated, after a time corresponding to computing latency, each neural network produces an output based on a snippet of data. The output from each neural network may be referred to as a vote. As described above, the output of a neural network should be understood to include the output of any layer or layers of the neural network. Thus, the output of a neural network as described with reference to
The outputs from each neural network may be combined, and the combined output may be used to produce the final output that is played back. For example, combining the outputs may include averaging the relevant votes or applying a weighting scheme. In the example of
As shown by the first weighting schedule 502, the first neural network may be weighted at 1 for a time, then transition from 1 to 0 during a transition time, and then reset at the reset time 506. After the reset time 506 of the first neural network, the first neural network may continue to be on but be in a warmup period 510. Following the warmup period for the first neural network, the weighting of the first neural network begins to transition from 0 to 1 as shown by the first weighting schedule 502. While the first neural network is in the warmup period, the second neural network is weighted at 1. The second neural network may follow an opposite weighting and reset schedule as shown by the second weight schedule 504 and the reset time 508. While
As shown by the first weighting schedule 602, the first neural network may be weighted at 1 for a time, then transition from 1 to 0 during a transition time. During an off period 612, the first neural network is off. At a reset time 606, the first neural network turns on and resets, and then continues to be on but in a warmup period 610. Following the warmup period 610 for the first neural network, the weighting of the first neural network begins to transition from 0 to 1 as shown by the first weighting schedule 602. While the first neural network is in the warmup period, the second neural network is weighted at 1. The second neural network may follow an opposite weighting and reset schedule as shown by the second weight schedule 604 and the reset time 608. While
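The first network's schedule described above (full weight, a ramp down, an off period ending in a reset, a warmup at zero weight, and a ramp back up) can be sketched as a piecewise function of time. All durations below are illustrative assumptions, not values from the disclosure.

```python
# Sketch of the first network's weighting schedule: hold at 1, ramp
# down to 0, an off period (reset at its end), a warmup at weight 0,
# then a ramp back up to 1. All durations are illustrative.

def first_net_weight(t, hold=10, ramp=4, off=6, warmup=5):
    """Weight of the first network at time t within one schedule cycle."""
    cycle = hold + ramp + off + warmup + ramp
    t = t % cycle
    if t < hold:                       # weighted at 1
        return 1.0
    t -= hold
    if t < ramp:                       # transition from 1 to 0
        return 1.0 - t / ramp
    t -= ramp
    if t < off:                        # off period (reset at its end)
        return 0.0
    t -= off
    if t < warmup:                     # warmup period, still weight 0
        return 0.0
    t -= warmup
    return t / ramp                    # transition from 0 back to 1

# The second network can follow the opposite schedule, e.g. weighted
# 1 - first_net_weight(t), so the two weights always sum to one.
```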
Thus, in some embodiments, one neural network may be running alone for a period of time. For example, in
As described above, in some embodiments two full neural networks may operate in parallel, whereas in other embodiments one full neural network may operate in parallel with one layer or a subset of layers of another neural network. Assuming that two full neural networks are operating in parallel, in the example of
Let P be the power consumed by a full neural network running. The inventors have appreciated that when two such neural networks (e.g., two identical neural networks) are running 100% of the time, such as in
Assuming that the one full neural network is operating in parallel with one layer or a subset of layers of a second neural network, the power consumption may be even less. For example, if the second neural network only has at most one of its n layers running at a time, then the power consumption may only rise as high as (1+1/n)×P at a time.
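The power figures above can be checked with a short calculation. The 25% duty cycle used for the intermediate case is an illustrative assumption; the disclosure does not fix a duty cycle.

```python
# Worked check of the power figures above. With per-network power P,
# two full networks running all the time draw 2*P, while one full
# network plus at most one of the second network's n layers caps the
# draw at (1 + 1/n) * P. The duty-cycle figure is illustrative.

P = 1.0          # power of one full network (arbitrary units)
n = 8            # layers per network (hypothetical)

both_full = 2 * P                 # two full networks, 100% duty cycle
one_plus_layer = (1 + 1 / n) * P  # full network + one layer of the other

# If the second full network instead runs only 25% of the time (e.g.,
# around its reset), the average draw falls between the two extremes:
avg = P + 0.25 * P
```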
Generally, it should be appreciated that a neural network such as an RNN may have multiple layers each including multiple states. For example, a network with n layers each including m states may have a total of n*m states. With two parallel neural networks, there may be a total of 2*n*m states. In some embodiments, resetting the states of the two neural networks at different times may mean that the n*m states of one neural network are all reset at the same time (“first time”), the n*m states of the other neural network are all reset at the same time (“second time”), but the first and second times are different. In some embodiments, even the states of one neural network may be reset at different times. For example, all 2*n*m states in the system of two neural networks may be reset at different times.
Further description of neural networks, training neural networks, and implementing neural networks in hearing aids may be found in U.S. Pat. No. 11,812,225, titled “Method, Apparatus and System for Neural Network Hearing Aid,” issued Nov. 7, 2023, which is incorporated by reference herein in its entirety.
The one or more microphones 714 may be configured to receive sound and convert the sound to analog electrical signals. In some embodiments, the processing circuitry 716 may include analog processing circuitry. The analog processing circuitry may be configured to perform analog processing on the audio signals received from the microphones 714. For example, the analog processing circuitry may be configured to perform one or more of analog preamplification, analog filtering, and analog-to-digital conversion. Thus, the analog processing circuitry may be configured to generate analog-processed audio signals from the audio signal received from the microphones 714. The analog-processed audio signals may include multiple individual signals, each an analog-processed version of one of the audio signals received from the microphones 714. As referred to herein, analog processing circuitry may include analog-to-digital conversion circuitry, and an analog-processed signal may be a digital signal that has been converted from analog to digital by analog-to-digital conversion circuitry.
In some embodiments, the processing circuitry 716 may include digital processing circuitry. The digital processing circuitry may be configured to perform digital processing on the analog-processed audio signals received from the analog processing circuitry. For example, the digital processing circuitry may be configured to perform one or more of wind reduction, input calibration, and anti-feedback processing. Thus, the digital processing circuitry may be configured to generate digital-processed audio signals from the analog-processed audio signals. The digital-processed audio signals may include multiple individual signals, each a digital-processed version of one of the analog-processed audio signals
In some embodiments, the processing circuitry 716 may include beamforming circuitry. The beamforming circuitry may be configured to generate one or more beamformed audio signals from two or more of the digital-processed audio signals. The beamformed audio signals may include one or more individual signals, each a beamformed version of two or more digital-processed audio signals.
The noise reduction circuitry 718 includes the neural network circuitry 720. The neural network circuitry 720 may be configured to implement one or more neural network layers trained to perform noise reduction. In some embodiments, the neural network circuitry 720 may be configured to implement multiple neural networks (e.g., recurrent neural networks or other stateful neural networks) operating in parallel (e.g., the neural network system 100). While the neural network circuitry 720 may receive audio signals that have been processed (e.g., by the processing circuitry 716) subsequent to their reception by the one or more microphones 714, this may still be referred to herein as the neural network circuitry 720 reducing noise in audio signals received by the one or more microphones 714.
The processing circuitry 730 may be configured to perform further processing on the output of the noise reduction circuitry 718. For example, the processing circuitry 730 may include digital processing circuitry configured to perform one or more of wide-dynamic range compression and output calibration.
The receiver 722 may be configured to play back the output of the processing circuitry 730 as sound into the ear of the user. The receiver 722 may also be configured to implement digital-to-analog conversion prior to the playing back.
The communication circuitry 724 may be configured to communicate with other devices over wireless connections, such as Bluetooth, WiFi, LTE, or NFMI connections. The control circuitry 726 may be configured to control operation of the processing circuitry 716, the noise reduction circuitry 718 (including the neural network circuitry 720), the processing circuitry 730, the communication circuitry 724, and the receiver 722.
The neural network circuitry 720 may include circuitry configured to perform operations necessary for computing the output of a neural network layer. One such operation may be a matrix-vector multiplication. In some embodiments, the neural network circuitry 720 may include multiple identical tiles each including multiple multiply-and-accumulate circuits configured to perform intermediate computations of a matrix-vector multiplication in parallel and then compute results of the intermediate computations into a final result. Each tile may additionally include memory configured to store neural network weights, registers configured to store input activation elements, and routing circuitry configured to facilitate communication of status and data between tiles. Other types of circuitry configured to perform processing, such as the processing circuitry 716, other portions of the noise reduction circuitry 718, and/or the processing circuitry 730, may be implemented as digital processing circuitry. In some embodiments, such digital processing circuitry may use a SIMD (single instruction multiple data) architecture. The ear-worn device 702 may include a chip implementing certain portions of circuitry. For example, the noise reduction circuitry 718 may be implemented (in whole or in part) on a chip. Thus, the chip may include the tiles and digital processing circuitry described above. In some embodiments, for a model having up to 10 M 8-bit weights, and when operating at 100 GOPs/sec on time series data, the chip may achieve power efficiency of 4 GOPs/milliwatt, measured at 40 degrees Celsius, when the chip uses supply voltages between 0.5-1.8V, and when the chip is performing operations without idling. Further description of chips incorporating (in some embodiments, among other elements) neural network circuitry for use in ear-worn devices may be found in U.S. Pat. No. 11,886,974, entitled “Neural Network Chip for Ear-Worn Device,” issued Jan. 
30, 2024, which is incorporated by reference herein in its entirety. In some embodiments, in addition to such a chip including some or all of the noise reduction circuitry, the ear-worn device 702 may include a digital signal processor configured to perform other operations, such as some or all of the processing performed by the processing circuitry 716 and/or the processing circuitry 730. Thus, in some embodiments, the processing circuitry 716 (or a portion thereof) may be implemented on a single chip (i.e., a single semiconductor die or substrate). In some embodiments, the noise reduction circuitry 718 (or a portion thereof) may be implemented on a single chip. In some embodiments, the neural network circuitry 720 (or a portion thereof) may be implemented on a single chip. In some embodiments, the processing circuitry 716 (or a portion thereof), the noise reduction circuitry 718 (or a portion thereof), and the processing circuitry 730 (or a portion thereof) may be implemented on a single chip.
The receiver wire 813 may be configured to transmit audio signals from the body 811 to the receiver 822. The receiver 822 may be configured to receive audio signals (i.e., those audio signals generated by the body 811 and transmitted by the receiver wire 813) and generate sound signals based on the audio signals. The dome 815 may be configured to fit tightly inside the wearer's ear and direct the sound signal produced by the receiver 822 into the ear canal of the wearer.
In some embodiments, the length of the body 811 may be equal to 2 cm, equal to 5 cm, or between 2 and 5 cm. In some embodiments, the weight of the hearing aid 802 may be less than 4.5 grams. In some embodiments, the spacing between the microphones may be equal to 5 mm, equal to 12 mm, or between 5 and 12 mm. In some embodiments, the body 811 may include a battery (not visible in
The description of
At step 902, the neural network system resets one or more states of the first neural network. Step 902 may occur at a first time. As described above, in some embodiments resetting a state may refer to actively changing values in the state to zero, or actively changing values in the state to a different value other than zero. Additionally, resetting one or more states at a first time may mean actively changing values in the state immediately, or over a finite length of time beginning at the first time. In the latter case, the reset may be smooth, such that the values in the state decay over time to zero or to a different value. In some embodiments in which the first neural network includes just hidden states, all the hidden states may be reset. In some embodiments in which the first neural network includes hidden states and cell states, all the cell states, but not the hidden states, may be reset. In some embodiments, one or more states of only certain (i.e., one or more) layers of a neural network may be reset. For example, all the states of one or more layers of a neural network may be reset, or a subset of the states (e.g., the cell states but not the hidden states in an LSTM) of a layer or a subset of layers of a neural network may be reset. As a particular example, one or more states of one layer may be reset at a time (e.g., as in
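Resetting only the cell states of an LSTM while preserving its hidden states, one of the reset variants described above, can be sketched as follows. The plain-dict state container is an illustrative assumption; any deep-learning framework's LSTM state object could be substituted.

```python
# Sketch of resetting only the cell states of an LSTM while keeping
# its hidden states, optionally for a subset of layers. The dict-based
# state container is illustrative, not a framework API.

def make_lstm_states(num_layers, size):
    """Per-layer hidden and cell state vectors, initialized to zero."""
    return [{"hidden": [0.0] * size, "cell": [0.0] * size}
            for _ in range(num_layers)]

def reset_cell_states(states, layers=None):
    """Zero the cell states of the given layers (all layers if None),
    leaving the hidden states untouched."""
    targets = range(len(states)) if layers is None else layers
    for i in targets:
        states[i]["cell"] = [0.0] * len(states[i]["cell"])

states = make_lstm_states(num_layers=3, size=4)
states[1]["hidden"] = [0.3] * 4   # pretend the network has been running
states[1]["cell"] = [0.7] * 4
reset_cell_states(states, layers=[1])
# Layer 1's cell state is back to zero; its hidden state is preserved.
```

Resetting the states of different layers at different times, as described above, amounts to calling `reset_cell_states` with different `layers` arguments at staggered times.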
At step 904, the neural network system receives a first input signal. Step 904 may occur subsequent to resetting the one or more states of the first neural network (i.e., at step 902). Step 904 may occur at a second time later than the first time. For example, the input signal may be the input 102. As described above, the input signal may be processed prior to being received by the neural network system. In
At step 906, the neural network system processes the first input signal using the first neural network to produce a first output and using the second neural network to produce a second output. For example, the first output may be the output 112 and the second output may be output 114. The outputs may be the output of any layer or layers (e.g., one layer) of the neural network. For example, the outputs may be from the first layer of the neural network, an intermediate layer of the neural network, the final layer of the neural network, or any combination of multiple layers. Processing the first input signal should be understood to include processing the input by any layer or layers (e.g., one layer) of a neural network.
As described above, in some embodiments the neural network system may run the first neural network for a warmup period (e.g., the warmup period 210 and/or 510) after the resetting, during which the first neural network is on but the weight used for its output is zero. Thus, if an input is received between step 902 and step 904 and during the warmup period of the first neural network, the neural network system may process the input with both neural networks, but only use the output from the second neural network. Equivalently, the output from the first neural network may be weighted at 0. During the warmup period, the weight for the output from the second neural network may be 1. It should be appreciated that step 906 may also apply during a warmup period for the first neural network. In such a scenario, the first weight may be 0 and/or the second weight may be 1. The warmup period may occur during the time period after t1 in which the schedule 1102 is at 0. Additionally, as described above, in some embodiments, prior to the reset time, the neural network system may be configured to turn off the first neural network (e.g., during the off period 612).
At step 908, the neural network system combines the first output and the second output using a first weight for the first output and a second weight for the second output, where the second weight is greater than the first weight. For example, the combiner 110 may combine the first and second outputs. In the example of
In embodiments in which the first output is the output from one layer of the first neural network, and the second output is the output from one layer of the second neural network, the neural network system may feed the combined first and second outputs to a subsequent layer of the first neural network.
Additionally, in some embodiments, the first output from the first neural network may include a combination of multiple outputs from the first neural network, and the second output from the second neural network may include a combination of multiple outputs from the second neural network, such as the votes described with reference to
In some embodiments, the process 900 continues to the process 1000. Thus, step 1010 of the process 1000 may occur after step 908 of the process 900. At step 1010, the neural network system receives a second input signal. Step 1010 may occur subsequent to receiving the first audio signal (i.e., at step 904), and may also occur subsequent to combining the first output and the second output (i.e., at step 908). Step 1010 may occur at a third time later than the second time. Further description of receiving input signals may be found with reference to step 904. In
At step 1012, the neural network system processes the second input signal using the first neural network to produce a third output and using the second neural network to produce a fourth output. Further description of processing input signals may be found with reference to step 906.
At step 1014, the neural network system combines the third output and the fourth output using a third weight for the third output and a fourth weight for the fourth output, where the third weight is greater than the first weight. Further description of combining outputs may be found with reference to step 908. In the example of
At step 1016, the neural network system resets one or more states of the second neural network. Step 1016 may occur subsequent to receiving the second audio signal (i.e., at step 1010). Step 1016 may occur at a fourth time later than the third time. Further description of resetting states may be found with reference to step 902. In the example of
In some embodiments, following step 1016, the processes 900 and 1000 may repeat from step 904, but with the roles of the first neural network and the second neural network reversed. Thus, the neural network system may receive a third input signal at a fifth time later than the fourth time; process the third input signal using the first neural network to produce a fifth output and using the second neural network to produce a sixth output; combine the fifth output and the sixth output using a fifth weight for the fifth output and a sixth weight for the sixth output, where the fifth weight is greater than the sixth weight; receive a fourth input signal at a sixth time later than the fifth time; process the fourth input signal using the first neural network to produce a seventh output and using the second neural network to produce an eighth output; and combine the seventh output and the eighth output using a seventh weight for the seventh output and an eighth weight for the eighth output, where the eighth weight is greater than the sixth weight.
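The alternating, staggered reset schedule described above can be sketched as follows. The period length and the hard weight switch are illustrative assumptions standing in for the gradual schedules described elsewhere; the sketch only shows that after each reset, the just-reset network's output is weighted low and the other network's output is weighted high, with the roles reversing each period.

```python
def staggered_weights(step, period=100):
    """Return (w_first, w_second) under a staggered reset schedule:
    the first network is reset at even multiples of `period`, the
    second at odd multiples. The freshly reset network's weight is
    low; a hard 0/1 switch stands in for a gradual ramp."""
    cycle = (step // period) % 2
    return (0.0, 1.0) if cycle == 0 else (1.0, 0.0)
```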
Additionally, in embodiments in which one or more states of one layer of the first neural network were reset at step 902, at a later time, one or more states of a different layer of the first neural network may be reset. That later time may be different from the fourth time to provide staggered resets of the first and second neural networks.
It should be appreciated from the above that, generally, control circuitry (e.g., the control circuitry 726) in an apparatus (e.g., an ear-worn device, such as the ear-worn device 702) may be configured to reset one or more states of a first neural network at one or more first reset times and reset one or more states of a second neural network at one or more second reset times, and the one or more second reset times may be different from the one or more first reset times. In other words, the one or more second reset times may be staggered with respect to the one or more first reset times.
It should be appreciated that while the above description may focus on ear-worn devices, such as hearing aids, the features described may be implemented in any type of apparatus using a neural network system. Thus, any apparatus may have neural network circuitry configured to implement a neural network system including at least a first neural network and a second neural network operating in parallel, and control circuitry configured to control the neural network system to receive a first input signal (which need not necessarily be an audio signal), process the first input signal using the first neural network to produce a first output and using the second neural network to produce a second output, combine the first output and the second output, reset one or more states of the first neural network, and reset one or more states of the second neural network at a different time than when the one or more states of the first neural network are reset.
Example 1 is directed to an apparatus, comprising neural network circuitry configured to implement a neural network system comprising at least a first neural network and a second neural network configured to operate in parallel, and control circuitry configured to control the neural network system to: reset one or more states of the first neural network; receive a first audio signal subsequent to resetting the one or more states of the first neural network; process the first audio signal using the first neural network to produce a first output and using the second neural network to produce a second output; and combine the first output and the second output using a first weight for the first output and a second weight for the second output, wherein the second weight is greater than the first weight.
Example 2 is directed to the apparatus of example 1, wherein the control circuitry is configured to: reset the one or more states of the first neural network at one or more first reset times; and reset one or more states of the second neural network at one or more second reset times; wherein the one or more second reset times are different from the one or more first reset times.
Example 3 is directed to the apparatus of example 1, wherein the neural network system is further configured to: receive a second audio signal subsequent to combining the first output and the second output; process the second audio signal using the first neural network to produce a third output and using the second neural network to produce a fourth output; combine the third output and the fourth output using a third weight for the third output and a fourth weight for the fourth output, wherein the third weight is greater than the first weight; and reset one or more states of the second neural network subsequent to receiving the second audio signal.
Example 4 is directed to the apparatus of any of examples 1-3, wherein all layers of the first neural network are configured to process the first audio signal and fewer than all layers of the second neural network are configured to process the first audio signal.
Example 5 is directed to the apparatus of example 4, wherein the neural network system is configured, when resetting the one or more states of the first neural network, to reset one or more states of a particular layer of the first neural network.
Example 6 is directed to the apparatus of example 5, wherein the neural network system is further configured to reset one or more states of a different layer of the first neural network subsequent to resetting the one or more states of the particular layer of the first neural network.
Example 7 is directed to the apparatus of example 6, wherein a time between resetting the one or more states of the particular layer of the first neural network and resetting the one or more states of the different layer of the first neural network is approximately equal to 1 second, approximately equal to 60 seconds, or between 1 second and 60 seconds.
Example 8 is directed to the apparatus of any of examples 5-7, wherein the neural network system is configured, when processing the first audio signal using the first neural network to produce the first output and using the second neural network to produce the second output, to process the first audio signal using one layer of the first neural network to produce the first output and to process the first audio signal using one layer of the second neural network to produce the second output.
Example 9 is directed to the apparatus of example 8, wherein the neural network system is further configured to feed the combined first and second outputs to a subsequent layer of the first neural network.
Example 10 is directed to the apparatus of any of examples 1-9, wherein the neural network system is configured to run the first neural network for a warmup period subsequent to resetting the one or more states of the first neural network and to weight an output of the first neural network at zero during the warmup period.
Example 11 is directed to the apparatus of any of examples 1-10, wherein the neural network system is configured to turn off the first neural network for an off period prior to resetting the one or more states of the first neural network.
Example 12 is directed to the apparatus of any of examples 1-11, wherein weights applied to outputs of the first neural network depend, at least in part, on how much time has elapsed since resetting the one or more states of the first neural network.
Example 13 is directed to the apparatus of example 12, wherein weights applied to outputs of the first neural network transition from low to high after resetting the one or more states of the first neural network, and then transition from high to low prior to a next resetting of one or more states of the first neural network.
Example 14 is directed to the apparatus of any of examples 1-13, wherein the neural network circuitry is implemented on a chip.
Example 15 is directed to the apparatus of any of examples 1-14, wherein the first output from the first neural network comprises a combination of multiple outputs from the first neural network, and the neural network system is configured to wait until the first neural network has produced the multiple outputs prior to determining the first output.
Example 16 is directed to the apparatus of any of examples 1-15, wherein the first and second weights are determined from a weighting scheme comprising a linear piecewise function or a smooth function.
Example 17 is directed to the apparatus of any of examples 1-16, wherein the first neural network and the second neural network are trained to reduce noise in audio signals.
Example 18 is directed to the apparatus of any of examples 1-17, wherein the first neural network and the second neural network comprise a same architecture and same weights.
Example 19 is directed to the apparatus of any of examples 1-18, wherein the first neural network and the second neural network comprise recurrent neural networks.
Example 20 is directed to the apparatus of any of examples 3-19, wherein the control circuitry is further configured to control the neural network system to: receive a third audio signal subsequent to resetting the one or more states of the second neural network; process the third audio signal using the first neural network to produce a fifth output and using the second neural network to produce a sixth output; combine the fifth output and the sixth output using a fifth weight for the fifth output and a sixth weight for the sixth output, wherein the fifth weight is greater than the sixth weight; receive a fourth audio signal subsequent to receiving the third audio signal; process the fourth audio signal using the first neural network to produce a seventh output and using the second neural network to produce an eighth output; and combine the seventh output and the eighth output using a seventh weight for the seventh output and an eighth weight for the eighth output, wherein the eighth weight is greater than the sixth weight.
Example 21 is directed to the apparatus of any of examples 1-20, wherein the apparatus comprises an ear-worn device.
Example 22 is directed to the apparatus of any of examples 1-21, wherein the apparatus comprises a hearing aid.
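The linear piecewise and smooth weighting schemes recited in example 16 can be sketched as follows. The segment lengths, the raised-cosine choice for the smooth function, and the function names are illustrative assumptions; the disclosure only requires that the weights follow a linear piecewise function or a smooth function.

```python
import math

def piecewise_weight(t, t_ramp_up, t_hold, t_ramp_down):
    """Linear piecewise schedule: weight ramps 0 -> 1 after a reset,
    holds at 1, then ramps 1 -> 0 before the next reset
    (segment lengths are illustrative)."""
    if t < t_ramp_up:
        return t / t_ramp_up
    if t < t_ramp_up + t_hold:
        return 1.0
    t_down = t - t_ramp_up - t_hold
    return max(0.0, 1.0 - t_down / t_ramp_down)

def smooth_weight(t, period):
    """Smooth alternative: a raised-cosine window over one
    reset-to-reset period (an illustrative choice of smooth function)."""
    x = min(max(t / period, 0.0), 1.0)
    return 0.5 * (1.0 - math.cos(2.0 * math.pi * x))
```

Both schedules transition from low to high after a reset and back from high to low before the next reset, consistent with example 13.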
Having described several embodiments of the techniques in detail, various modifications and improvements will readily occur to those skilled in the art. Such modifications and improvements are intended to be within the spirit and scope of the invention. Accordingly, the foregoing description is by way of example only, and is not intended as limiting. For example, any components described above may comprise hardware, software or a combination of hardware and software.
The indefinite articles “a” and “an,” as used herein in the specification and in the claims, unless clearly indicated to the contrary, should be understood to mean “at least one.”
The phrase “and/or,” as used herein in the specification and in the claims, should be understood to mean “either or both” of the elements so conjoined, i.e., elements that are conjunctively present in some cases and disjunctively present in other cases. Multiple elements listed with “and/or” should be construed in the same fashion, i.e., “one or more” of the elements so conjoined. Other elements may optionally be present other than the elements specifically identified by the “and/or” clause, whether related or unrelated to those elements specifically identified.
As used herein in the specification and in the claims, the phrase “at least one,” in reference to a list of one or more elements, should be understood to mean at least one element selected from any one or more of the elements in the list of elements, but not necessarily including at least one of each and every element specifically listed within the list of elements and not excluding any combinations of elements in the list of elements. This definition also allows that elements may optionally be present other than the elements specifically identified within the list of elements to which the phrase “at least one” refers, whether related or unrelated to those elements specifically identified.
The terms “approximately” and “about” may be used to mean within ±20% of a target value in some embodiments, within ±10% of a target value in some embodiments, within ±5% of a target value in some embodiments, and yet within ±2% of a target value in some embodiments. The terms “approximately” and “about” may include the target value.
Also, the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. The use of “including,” “comprising,” or “having,” “containing,” “involving,” and variations thereof herein, is meant to encompass the items listed thereafter and equivalents thereof as well as additional items.
Having described above several aspects of at least one embodiment, it is to be appreciated that various alterations, modifications, and improvements will readily occur to those skilled in the art. Such alterations, modifications, and improvements are intended to be part of this disclosure. Accordingly, the foregoing description and drawings are by way of example only.
This application is a Continuation-In-Part of U.S. application Ser. No. 18/511,314, filed Nov. 16, 2023; which is a Continuation of U.S. application Ser. No. 18/239,321, filed Aug. 29, 2023, now U.S. Pat. No. 11,838,727, issued Dec. 5, 2023; all of which are incorporated herein by reference.
| | Number | Date | Country |
|---|---|---|---|
| Parent | 18239321 | Aug 2023 | US |
| Child | 18511314 | | US |
| | Number | Date | Country |
|---|---|---|---|
| Parent | 18511314 | Nov 2023 | US |
| Child | 18817293 | | US |