The present disclosure is generally related to content-based switchable audio codecs.
Advances in technology have resulted in smaller and more powerful computing devices. For example, there currently exist a variety of portable personal computing devices, including wireless telephones such as mobile and smart phones, tablets and laptop computers that are small, lightweight, and easily carried by users. These devices can communicate voice and data packets over wireless networks. Further, many such devices incorporate additional functionality such as a digital still camera, a digital video camera, a digital recorder, and an audio file player. Also, such devices can process executable instructions, including software applications, such as a web browser application, that can be used to access the Internet. As such, these devices can include significant computing capabilities.
Many common uses of such devices revolve around media (e.g., audio, video, games, extended reality, etc.), such as uses that involve the capture, communication, and/or reproduction of media content. Taking audio data as an example, digitization, storage, and communication of audio data are challenging because high fidelity reproduction of sound is generally desirable (e.g., to improve the user experience), but increasing sound reproduction fidelity may entail the use of more bits to represent the audio content and/or increased computational complexity to process the audio data. Increased computational complexity requires more power, more processing resources, more memory, or all three. Increasing the number of bits used to represent the audio content increases bandwidth required to transmit the audio data and/or memory required to store the audio data.
Encoding schemes are often used to process audio data to reduce the number of bits needed to represent particular audio content. While many encoding techniques can retain sound reproduction fidelity while decreasing the number of bits needed to represent the audio content, such techniques introduce additional computational complexity. Thus, it is challenging to encode audio data in resource constrained use cases, such as on mobile computing devices that rely on battery power.
According to one implementation of the present disclosure, a device includes a machine-learning audio encoder and a waveform-matching audio encoder. The device includes a controller configured to cause a segment of audio data to be input to the machine-learning audio encoder, to the waveform-matching audio encoder, or to both, based on a classification associated with the segment.
According to another implementation of the present disclosure, a method includes obtaining, by one or more processors, an indication of a type of audio content associated with a segment of audio data. The method includes selectively, based on the indication, causing the segment to be sent as input to a machine-learning audio encoder, a waveform-matching audio encoder, or both.
According to another implementation of the present disclosure, a non-transitory computer-readable medium stores instructions that are executable by one or more processors to cause the one or more processors to obtain an indication of a type of audio content associated with a segment of audio data. The instructions further cause the one or more processors to selectively, based on the indication, cause the segment to be sent as input to a machine-learning audio encoder, a waveform-matching audio encoder, or both.
According to another implementation of the present disclosure, an apparatus includes means for obtaining an indication of a type of audio content associated with a segment of audio data. The apparatus includes means for selectively, based on the indication, causing the segment to be sent as input to a machine-learning audio encoder, a waveform-matching audio encoder, or both.
Other aspects, advantages, and features of the present disclosure will become apparent after review of the entire application, including the following sections: Brief Description of the Drawings, Detailed Description, and the Claims.
Machine-learning-based audio codecs can be trained to encode audio data representing particular types of audio content, such as speech, in a manner that is more efficient (in terms of compression ratio, e.g., number of bits used to represent a particular segment of audio data) than traditional codecs without loss of quality. One example of such a machine-learning-based audio codec is the Lyra codec. The Lyra codec uses a recurrent neural network to quantize audio data representing speech at a low bitrate and uses a generative neural network to decode the quantized audio data to generate output representing the speech. The Lyra codec is able to achieve a high compression ratio for audio representing speech and provides high-quality decoded speech output. However, the Lyra codec and other similar codecs trade off generality to achieve such bit efficiency and high-quality speech reproduction. For example, while the Lyra codec performs well when provided audio data representing speech, it is not able to achieve the same performance when provided audio data representing other types of audio content, such as music.
Other more generalized machine-learning-based audio codecs, such as the SoundStream codec, are able to provide high quality audio reproduction, but at the cost of decreased compression ratio as compared to the more specialized audio codecs targeting a particular audio type (e.g., Lyra, which targets speech data). Further, such generalized machine-learning-based audio codecs tend to be much larger (e.g., in terms of model parameters, and correspondingly, memory footprint) and more complex (e.g., more resource intensive to use), which makes them challenging to use in resource constrained use-cases, such as onboard mobile devices.
For audio streams that include various types of audio content (e.g., speech, noise, music, etc.) and audio streams where the audio content type is not known in advance, it is challenging to use machine-learning codecs since the quality of the generated audio output cannot be guaranteed unless one relies on less efficient, high memory footprint generalized machine-learning-based audio codecs. Thus, it is problematic to provide high-quality, low bitrate audio compression in resource constrained situations.
The above-described problems associated with providing high-quality, low-bitrate audio compression for resource constrained use cases and various types of audio data are solved using a content-based switchable coder system (also referred to herein as a “content-switchable coder system”) as described herein. The content-based switchable coder system includes a plurality of audio encoders and a controller that selectively provides segments of audio data to one or more of the audio encoders based on content (e.g., the type of audio data) represented in each segment. The plurality of audio encoders can include, for example, a machine-learning audio encoder that is particularly well-suited to encode a particular type of audio data, such as speech. In this example, the machine-learning audio encoder can provide a high compression ratio and high-quality audio reproduction for segments that include speech. The plurality of audio encoders can include at least one audio encoder that is more generalized to provide high-quality audio reproduction for many types of audio content, such as a waveform-matching audio encoder. In this context, a waveform-matching audio encoder refers to a coder that attempts to represent a segment of audio data in a manner that enables reproduction of the entire waveform of the segment (in contrast, for example, to coders that attempt to enable reproduction of only speech components of the segment).
The controller of the content-based switchable coder system can cause segments of audio data that include a target audio type (e.g., speech, wind noise, noise, music, silence, etc.) to be provided as input to a machine-learning audio encoder that is well suited to encode the target audio type and can cause other segments (e.g., segments that include non-target audio type(s)) to be provided as input to a waveform-matching audio encoder. The content-based switchable coder system can include an audio classifier that is configured to provide an indication to the controller of whether each segment of audio data includes target or non-target audio. For example, the classifier can be a machine-learning-based classifier configured to generate a classification output associated with an input segment of audio data. The classification output can be binary (e.g., a first value, such as a one, to indicate that the segment includes a target audio type and a second value, such as a zero, to indicate that the segment does not include the target audio type). Alternatively, the classification output can indicate one of a plurality of classifications associated with the segment (e.g., speech, wind noise, music, silence, etc.). The classifier can use machine-learning techniques, can use procedural techniques (such as voice activity detection), or a combination thereof. To illustrate, multiple classification techniques can be used, and a voting or other selection mechanism can be used to generate an indication for the controller based on the various classification results from the multiple classification techniques.
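To illustrate one way such a voting mechanism could be realized, the following Python sketch combines a hypothetical machine-learning classifier score with an energy-based voice activity check and a zero-crossing-rate heuristic; the `ml_classifier` callable, the thresholds, and the two-of-three vote are illustrative assumptions rather than features required by the described system.

```python
import numpy as np

def energy_vad(segment: np.ndarray, threshold: float = 1e-3) -> bool:
    """Procedural voice-activity check: mean frame energy above a threshold."""
    return float(np.mean(segment ** 2)) > threshold

def classify_segment(segment: np.ndarray, ml_classifier, zcr_limit: float = 0.25) -> int:
    """Combine several classification techniques by simple majority vote.

    Returns 1 if the segment is judged to contain the target audio type
    (e.g., speech), 0 otherwise.  `ml_classifier` is a hypothetical callable
    returning a target-class probability; the thresholds are placeholders.
    """
    votes = []

    # Vote 1: machine-learning classifier probability for the target class.
    votes.append(ml_classifier(segment) > 0.5)

    # Vote 2: procedural voice activity detection based on segment energy.
    votes.append(energy_vad(segment))

    # Vote 3: zero-crossing rate, which tends to be moderate for voiced speech.
    zcr = float(np.mean(np.abs(np.diff(np.sign(segment))))) / 2.0
    votes.append(zcr < zcr_limit)

    return 1 if sum(votes) >= 2 else 0
```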
The content-based switchable coder system thus enables high-quality, low-bitrate representation of target audio data (e.g., speech) without loss of quality for audio data segments that include non-target audio types. Further, the audio encoders used, e.g., a targeted machine-learning audio encoder and a general waveform-matching audio encoder, can have a smaller memory footprint and can be less resource intensive to use than general-purpose machine-learning-based audio encoders. Thus, the content-based switchable coder system is usable in resource-constrained use cases.
One problem that can arise due to switching between codecs is that such codec switching can introduce audio artifacts that reduce the overall quality of reproduced audio output. The content-based switchable coder system disclosed herein can use various switching techniques to mitigate introduction of such artifacts. For example, in some embodiments, when switching between coders, the content-based switchable coder system can provide one or more segments of audio data to both a machine-learning audio encoder and a waveform-matching audio encoder, and the output of each coder can be sent to a decoder system. In such embodiments, the decoder system can use a machine-learning audio decoder to decode the data from the machine-learning audio encoder to generate a first decoded representation of the segment and a waveform-matching decoder to decode the data from the waveform-matching audio encoder to generate a second decoded representation of the segment. The decoder system can combine portions of the first and second decoded representations to taper down from one coder while tapering up the other. To illustrate, when the switch is from the machine-learning audio encoder to the waveform-matching audio encoder, the decoder system can gradually de-emphasize (e.g., taper down) the first decoded representation and concurrently gradually emphasize (e.g., taper up) the second decoded representation to generate output audio. Combining decoded output data from two different coders in this manner blends the audio in a manner that reduces audio artifacts introduced by switching codecs.
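As a minimal sketch of this blending behavior, assuming both decoders produce time-aligned sample arrays for the same segment, a decoder could apply complementary linear tapers and sum the results; the linear ramp is an illustrative choice, not a requirement of the described system.

```python
import numpy as np

def blend_switch_segment(ml_decoded: np.ndarray, wm_decoded: np.ndarray) -> np.ndarray:
    """Crossfade two decoded versions of the same segment.

    `ml_decoded` is the machine-learning decoder output and `wm_decoded` is the
    waveform-matching decoder output.  When switching from the machine-learning
    coder to the waveform-matching coder, the first is tapered down while the
    second is tapered up; swap the arguments for the opposite transition.
    """
    n = min(len(ml_decoded), len(wm_decoded))
    taper_down = np.linspace(1.0, 0.0, n)  # de-emphasize the outgoing coder
    taper_up = 1.0 - taper_down            # emphasize the incoming coder
    return taper_down * ml_decoded[:n] + taper_up * wm_decoded[:n]
```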
Combining the decoded output data from two different coders as in the previous example increases the bit rate of data transmitted between the content-based switchable coder system and the decoder system since two representations of at least one audio data segment are sent to facilitate the blending. Additionally, providing a single segment to two different coders uses extra resources (e.g., processor time and power) at the content-based switchable coder system. In some embodiments, these problems are solved by providing each segment of audio data to only one of the audio encoders of the content-based switchable coder system. In such embodiments, the decoder system uses extrapolation techniques to blend adjacent segments from different coders. For example, blending techniques such as those used for frame error concealment can be used to ease the transition between codecs to reduce audio artifacts introduced by switching codecs. Such embodiments do not increase the bit rate of data transmitted between the content-based switchable coder system and the decoder system since only one representation of each audio data segment is used to perform the blending.
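A simplified sketch of such decoder-side blending with only one representation per segment follows; the repetition-based extrapolation is a crude stand-in for a real frame-error-concealment predictor, and the overlap length is an arbitrary placeholder.

```python
import numpy as np

def conceal_transition(prev_decoded: np.ndarray,
                       next_decoded: np.ndarray,
                       overlap: int = 160) -> np.ndarray:
    """Ease a coder transition using only one representation per segment.

    The tail of the previous coder's output is "extrapolated" (here by reusing
    its last `overlap` samples, a crude stand-in for a real concealment
    predictor) and overlap-added with the start of the next coder's output.
    """
    extrapolated = prev_decoded[-overlap:]
    fade_out = np.linspace(1.0, 0.0, overlap)  # taper down the extrapolation
    fade_in = 1.0 - fade_out                   # taper up the new coder's output
    blended_head = fade_out * extrapolated + fade_in * next_decoded[:overlap]
    return np.concatenate([blended_head, next_decoded[overlap:]])
```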
In some embodiments, the controller uses switching hysteresis to reduce audio artifacts introduced by switching codecs. For example, when switching from a machine-learning audio encoder to a waveform-matching audio encoder, the controller can switch without delay. In contrast, when switching from the waveform-matching audio encoder to the machine-learning audio encoder, the controller can introduce a switching delay that is based on content of the audio data segments. The waveform-matching audio encoder is generally able to encode various types of audio content without significant reduction in fidelity; however, it is often the case that a relatively short segment of non-target data can cause the machine-learning audio encoder to generate significant (e.g., audible to a user) artifacts. Additionally, switching to the machine-learning audio encoder between segments with certain sounds may cause more perceivable artifacts than switching between segments that represent other sounds or silence. To illustrate, artifacts can be introduced by switching to the machine-learning audio encoder in the middle of a vowel sound. Thus, the controller can delay switching from the waveform-matching audio encoder to the machine-learning audio encoder until the end of a vowel sound, until a period of silence, or until a low energy segment is received for coding.
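The asymmetric switching rule described above could be sketched as follows, where the string encoder labels and the energy threshold used as a proxy for silence (or the end of a speech sound) are illustrative assumptions.

```python
import numpy as np

SILENCE_ENERGY = 1e-4  # placeholder threshold for a "low energy" segment

def select_encoder(current: str, is_target: bool, segment: np.ndarray) -> str:
    """Asymmetric switching rule with content-based hysteresis.

    Switching away from the machine-learning encoder happens immediately when
    non-target audio arrives; switching back waits for a low-energy segment
    (used here as a simple proxy for silence or the end of a speech sound).
    """
    if current == "ml" and not is_target:
        return "waveform"                    # switch without delay
    if current == "waveform" and is_target:
        if float(np.mean(segment ** 2)) < SILENCE_ENERGY:
            return "ml"                      # safe point to switch back
        return "waveform"                    # keep waiting
    return current
```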
In some embodiments, when switching from a first coder to a second coder, the controller populates coder state data of the second coder based on data from the first coder. For example, when switching from the machine-learning audio encoder to the waveform-matching audio encoder, the controller can populate excitation signal memories of the waveform-matching audio encoder based on data from the machine-learning audio encoder, which provides a smoother pitch pulse sequence than initializing the excitation signal memories of the waveform-matching audio encoder using default data (e.g., zeros).
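As a purely hypothetical illustration of populating excitation signal memories, the sketch below approximates an excitation signal by inverse LPC filtering the machine-learning coder's most recent synthesized samples; the LPC coefficients, memory length, and filtering details are assumptions and are not taken from the disclosure.

```python
import numpy as np

def init_excitation_memory(ml_synthesis: np.ndarray,
                           lpc_coeffs: np.ndarray,
                           memory_len: int = 320) -> np.ndarray:
    """Approximate an excitation signal from the machine-learning coder's output.

    The last `memory_len` samples synthesized by the machine-learning coder are
    passed through an LPC analysis (inverse) filter,
    e[n] = s[n] - sum(a_k * s[n-k]), to approximate the excitation the
    waveform-matching coder would otherwise build up, instead of starting its
    memories from zeros.
    """
    tail = np.asarray(ml_synthesis, dtype=float)[-memory_len:]
    order = len(lpc_coeffs)
    excitation = np.zeros(len(tail))
    for n in range(len(tail)):
        past = tail[max(0, n - order):n][::-1]  # s[n-1], s[n-2], ...
        excitation[n] = tail[n] - np.dot(lpc_coeffs[:len(past)], past)
    return excitation
```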
Particular aspects of the present disclosure are described below with reference to the drawings. In the description, common features are designated by common reference numbers. The terminology used herein is for the purpose of describing particular implementations only and is not intended to be limiting of implementations. For example, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. Further, some features described herein are singular in some implementations and plural in other implementations. To illustrate,
In some drawings, multiple instances of a particular type of feature are used. Although these features are physically and/or logically distinct, the same reference number is used for each, and the different instances are distinguished by addition of a letter to the reference number. When the features as a group or a type are referred to herein (e.g., when no particular one of the features is being referenced), the reference number is used without a distinguishing letter. However, when one particular feature of multiple features of the same type is referred to herein, the reference number is used with the distinguishing letter. For example, referring to
As used herein, the terms “comprise,” “comprises,” and “comprising” may be used interchangeably with “include,” “includes,” or “including.” Additionally, the term “wherein” may be used interchangeably with “where.” As used herein, “exemplary” indicates an example, an implementation, and/or an aspect, and should not be construed as limiting or as indicating a preference or a preferred implementation. As used herein, an ordinal term (e.g., “first,” “second,” “third,” etc.) used to modify an element, such as a structure, a component, an operation, etc., does not by itself indicate any priority or order of the element with respect to another element, but rather merely distinguishes the element from another element having a same name (but for use of the ordinal term). As used herein, the term “set” refers to one or more of a particular element, and the term “plurality” refers to multiple (e.g., two or more) of a particular element.
As used herein, “coupled” may include “communicatively coupled,” “electrically coupled,” or “physically coupled,” and may also (or alternatively) include any combinations thereof. Two devices (or components) may be coupled (e.g., communicatively coupled, electrically coupled, or physically coupled) directly or indirectly via one or more other devices, components, wires, buses, networks (e.g., a wired network, a wireless network, or a combination thereof), etc. Two devices (or components) that are electrically coupled may be included in the same device or in different devices and may be connected via electronics, one or more connectors, or inductive coupling, as illustrative, non-limiting examples. In some implementations, two devices (or components) that are communicatively coupled, such as in electrical communication, may send and receive signals (e.g., digital signals or analog signals) directly or indirectly, via one or more wires, buses, networks, etc. As used herein, “directly coupled” may include two devices that are coupled (e.g., communicatively coupled, electrically coupled, or physically coupled) without intervening components.
In the present disclosure, terms such as “obtaining,” “determining,” “calculating,” “estimating,” “shifting,” “adjusting,” etc. may be used to describe how one or more operations are performed. It should be noted that such terms are not to be construed as limiting and other techniques may be utilized to perform similar operations. Additionally, as referred to herein, “obtaining,” “generating,” “calculating,” “estimating,” “using,” “selecting,” “accessing,” and “determining” may be used interchangeably. For example, “obtaining,” “generating,” “calculating,” “estimating,” or “determining” a parameter (or a signal) may refer to actively generating, estimating, calculating, or determining the parameter (or the signal) or may refer to using, selecting, or accessing the parameter (or signal) that is already generated, such as by another component or device.
As used herein, the term “machine learning” should be understood to have any of its usual and customary meanings within the fields of computer science and data science, such meanings including, for example, processes or techniques by which one or more computers can learn to perform some operation or function without being explicitly programmed to do so. As a typical example, machine learning can be used to enable one or more computers to analyze data to identify patterns in data and generate a result based on the analysis. For certain types of machine learning, the results that are generated include data that indicates an underlying structure or pattern of the data itself. Such techniques, for example, include so-called “clustering” techniques, which identify clusters (e.g., groupings of data elements of the data).
For certain types of machine learning, the results that are generated include a data model (also referred to as a “machine-learning model” or simply a “model”). Typically, a model is generated using a first data set to facilitate analysis of a second data set. For example, a first portion of a large body of data may be used to generate a model that can be used to analyze the remaining portion of the large body of data. As another example, a set of historical data can be used to generate a model that can be used to analyze future data.
Since a model can be used to evaluate a set of data that is distinct from the data used to generate the model, the model can be viewed as a type of software (e.g., instructions, parameters, or both) that is automatically generated by the computer(s) during the machine learning process. As such, the model can be portable (e.g., can be generated at a first computer, and subsequently moved to a second computer for further training, for use, or both). Additionally, a model can be used in combination with one or more other models to perform a desired analysis. To illustrate, first data can be provided as input to a first model to generate first model output data, which can be provided (alone, with the first data, or with other data) as input to a second model to generate second model output data indicating a result of a desired analysis. Depending on the analysis and data involved, different combinations of models may be used to generate such results. In some examples, multiple models may provide model output that is input to a single model. In some examples, a single model provides model output to multiple models as input.
Examples of machine-learning models include, without limitation, perceptrons, neural networks, support vector machines, regression models, decision trees, Bayesian models, Boltzmann machines, adaptive neuro-fuzzy inference systems, as well as combinations, ensembles and variants of these and other types of models. Variants of neural networks include, for example and without limitation, prototypical networks, autoencoders, transformers, self-attention networks, convolutional neural networks, deep neural networks, deep belief networks, etc. Variants of decision trees include, for example and without limitation, random forests, boosted decision trees, etc.
Since machine-learning models are generated by computer(s) based on input data, machine-learning models can be discussed in terms of at least two distinct time windows: a creation/training phase and a runtime phase. During the creation/training phase, a model is created, trained, adapted, validated, or otherwise configured by the computer based on the input data (which, in the creation/training phase, is generally referred to as “training data”). Note that the trained model corresponds to software that has been generated and/or refined during the creation/training phase to perform particular operations, such as classification, prediction, encoding, or other data analysis or data synthesis operations. During the runtime phase (or “inference” phase), the model is used to analyze input data to generate model output. The content of the model output depends on the type of model. For example, a model can be trained to perform classification tasks or regression tasks, as non-limiting examples. In some implementations, a model may be continuously, periodically, or occasionally updated, in which case training time and runtime may be interleaved or one version of the model can be used for inference while a copy is updated, after which the updated copy may be deployed for inference.
In some implementations, a previously generated model is trained (or re-trained) using a machine-learning technique. In this context, “training” refers to adapting the model or parameters of the model to a particular data set. Unless otherwise clear from the specific context, the term “training” as used herein includes “re-training” or refining a model for a specific data set. For example, training may include so-called “transfer learning.” In transfer learning, a base model may be trained using a generic or typical data set, and the base model may be subsequently refined (e.g., re-trained or further trained) using a more specific data set.
A data set used during training is referred to as a “training data set” or simply “training data”. The data set may be labeled or unlabeled. “Labeled data” refers to data that has been assigned a categorical label indicating a group or category with which the data is associated, and “unlabeled data” refers to data that is not labeled. Typically, “supervised machine-learning processes” use labeled data to train a machine-learning model, and “unsupervised machine-learning processes” use unlabeled data to train a machine-learning model; however, it should be understood that a label associated with data is itself merely another data element that can be used in any appropriate machine-learning process. To illustrate, many clustering operations can operate using unlabeled data; however, such a clustering operation can use labeled data by ignoring labels assigned to data or by treating the labels the same as other data elements.
Training a model based on a training data set generally involves changing parameters of the model with a goal of causing the output of the model to have particular characteristics based on data input to the model. To distinguish from model generation operations, model training may be referred to herein as optimization or optimization training. In this context, “optimization” refers to improving a metric, and does not mean finding an ideal (e.g., global maximum or global minimum) value of the metric. Examples of optimization trainers include, without limitation, backpropagation trainers, derivative free optimizers (DFOs), and extreme learning machines (ELMs). As one example of training a model, during supervised training of a neural network, an input data sample is associated with a label. When the input data sample is provided to the model, the model generates output data, which is compared to the label associated with the input data sample to generate an error value. Parameters of the model are modified in an attempt to reduce (e.g., optimize) the error value. As another example of training a model, during unsupervised training of an autoencoder, a data sample is provided as input to the autoencoder, and the autoencoder reduces the dimensionality of the data sample (which is a lossy operation) and attempts to reconstruct the data sample as output data. In this example, the output data is compared to the input data sample to generate a reconstruction loss, and parameters of the autoencoder are modified in an attempt to reduce (e.g., optimize) the reconstruction loss.
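For concreteness, the following PyTorch snippet performs one unsupervised training step of a toy autoencoder in the manner just described; the layer sizes, optimizer, and random batch are placeholders, and the snippet is not tied to any particular coder discussed herein.

```python
import torch
from torch import nn

# Toy autoencoder illustrating the unsupervised training described above: the
# model compresses each sample, reconstructs it, and the reconstruction loss
# drives the parameter updates.  All dimensions and data are illustrative only.
autoencoder = nn.Sequential(
    nn.Linear(160, 32), nn.ReLU(),  # encoder: reduce dimensionality (lossy)
    nn.Linear(32, 160),             # decoder: attempt to reconstruct the input
)
optimizer = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

batch = torch.randn(8, 160)             # stand-in for frames of training audio
reconstruction = autoencoder(batch)
loss = loss_fn(reconstruction, batch)   # compare output data to the input sample
optimizer.zero_grad()
loss.backward()                         # backpropagation trainer
optimizer.step()                        # modify parameters to reduce the loss
```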
In
The audio data 112 includes a plurality of segments 114. In
In
The waveform-matching audio encoder 146 is a procedural coder that attempts to represent a segment 114 of audio data 112 in a manner that enables reproduction of the entire waveform of the segment irrespective of the audio content represented by the waveform. For example, to limit computing resources used by the machine-learning audio encoder 144, the machine-learning audio encoder 144 is optimized (e.g., configured and trained) to encode audio data including a target type of audio content, such as speech, music, etc. at a low bit rate. In this example, the machine-learning audio encoder 144 may have difficulty encoding audio data that does not include the target type of audio content with the same degree of fidelity and at the same low bit rate. In contrast, the waveform-matching audio encoder 146 is a general-purpose coder that can encode any audio content with approximately the same degree of audio reproduction fidelity; however, to achieve this wide range of encoding, the waveform-matching audio encoder 146 has a higher bit rate than the machine-learning audio encoder 144 and may also use more computing resources to perform encoding. For example, the machine-learning audio encoder 144 is configured to encode an input segment 114 to generate an output 154 that includes a first number of bits to represent the segment 114, and the waveform-matching audio encoder 146 is configured to encode an input segment 114 to generate an output 156 that includes a second number of bits to represent the segment 114, where the first number is less than the second number.
The controller 142 is configured to cause a segment 114 of audio data 112 to be input to the machine-learning audio encoder 144, to the waveform-matching audio encoder 146, or to both, based on a classification associated with the segment 114. For example, the content-switchable coder system 140 includes an audio classifier 148 that is configured to generate an indicator 150 that indicates a classification associated with one of the segments 114, and the controller 142 selects one or more of the audio encoders 144, 146 to process the segment 114 based on the indicator 150. To illustrate, the audio classifier 148 may be configured to generate the indicator 150 to indicate whether the segment 114 represents audio content of a particular type (e.g., a target audio type of the machine-learning audio encoder 144). The target audio type can include, for example, speech, music, non-speech sounds, etc. The audio classifier 148 can include a machine-learning model (e.g., a classification model, such as a decision-tree, a neural network, a support vector machine, etc.). Alternatively, the audio classifier 148 can use a non-machine-learning technique, such as voice activity detection.
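A minimal sketch of the controller's routing decision, assuming a binary indicator and encoders exposing a hypothetical encode() method, is shown below; the disclosed controller may use richer classifications and may route a segment to both encoders, as described elsewhere herein.

```python
def route_segment(segment, indicator: int, ml_encoder, wm_encoder):
    """Route one segment based on the classifier's indicator.

    `indicator` follows the binary convention described earlier: 1 means the
    segment contains the target audio type and goes to the machine-learning
    encoder; 0 sends it to the waveform-matching encoder.  Both encoders are
    assumed to expose a simple encode() method.
    """
    if indicator == 1:
        return "ml", ml_encoder.encode(segment)
    return "waveform", wm_encoder.encode(segment)
```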
In some situations, switching between the machine-learning audio encoder 144 and the waveform-matching audio encoder 146 can introduce artifacts into the audio reproduced based on the output 154, 156 of the content-switchable coder system 140. For example, when the audio data 112 includes speech, some speech sounds may extend across more than one segment 114. In this example, switching encoders during such speech sounds can cause reproduced audio data 188 to include an artifact of the switching that reduces the intelligibility of the speech and/or leads to decreased user experience.
To limit the introduction of such artifacts, the controller 142 can optionally be configured to use one or more of various artifact mitigation techniques. One example of a technique to limit introduction of artifacts is to provide one or more segments 114 to both the machine-learning audio encoder 144 and the waveform-matching audio encoder 146. For example, in response to a determination to transition which audio encoder is provided segments 114 of the audio data 112, the controller 142 can provide at least one segment 114 of the audio data 112 to both the machine-learning audio encoder 144 and the waveform-matching audio encoder 146. In this example, the device 102, one or more remote devices 180, or both, can use blending techniques to combine a portion of the reproduced audio data 188 that is based on the output 154 of the machine-learning audio encoder 144 representing a segment 114 and a portion of the reproduced audio data 188 that is based on the output 156 of the waveform-matching audio encoder 146 representing the segment 114 to generate a blended version of the segment 114.
Another example of a technique to limit introduction of artifacts is to use encoder state data 160 of a first audio encoder (e.g., the machine-learning audio encoder 144 or the waveform-matching audio encoder 146) to initialize the other audio encoder when switching from the first audio encoder to the other audio encoder. To illustrate, when different audio encoders are selected to process two sequential segments (e.g., the segment 114A and the segment 114B) of the audio data 112, encoder state data 160 resulting from processing the segment 114A (e.g., a first of the two sequential segments) is used to process the segment 114B (e.g., a second segment of the two sequential segments). For example, in some embodiments, artifacts can be reduced by populating excitation signal memories of the waveform-matching audio encoder 146 based on information from the machine-learning audio encoder 144. Such implementations may enable the waveform-matching audio encoder 146 to start generating a smoother evolution of the pitch pulse sequence, instead of starting from zeros or some other default encoder state data 160.
Another example of a technique to limit introduction of artifacts is to delay switching between audio encoders based on audio content of the segments. To illustrate, in some embodiments, the controller 142 is configured to use a first delay when transitioning to causing segments to be input to the waveform-matching audio encoder 146 and configured to use a second delay when transitioning to causing segments to be input to the machine-learning audio encoder 144. In such embodiments, the first delay is different from the second delay. For example, the first delay may be fixed, and the second delay may be variable and selected based on content of the segments 114. To illustrate, when the segment 114A includes speech (or other target audio data) and the segment 114B (e.g., a segment immediately following the segment 114A) includes non-speech (or other non-target audio data), the controller 142 sends the segment 114A to the machine-learning audio encoder 144 and sends the segment 114B to the waveform-matching audio encoder 146 (e.g., with a delay of zero segments). In this example, a delay of zero segments is used because encoding even just a few segments of some types of non-speech signals using the machine-learning audio encoder 144 can cause significant audio artifacts. In contrast, when switching in the other direction (e.g., from the waveform-matching audio encoder 146 to the machine-learning audio encoder 144), a delay based on content of the audio data 112 can be used to avoid switching artifacts. To illustrate, the controller 142 can delay switching to the machine-learning audio encoder 144 until the end of a speech sound (e.g., a vowel sound) is detected or until a low energy segment 114 (e.g., a segment representing silence) is detected. The waveform-matching audio encoder 146 may be able to encode both target and non-target audio equally well, but at the cost of using more computing and communication resources. Accordingly, delaying transition from the waveform-matching audio encoder 146 to the machine-learning audio encoder 144 is less efficient than switching immediately, but can avoid introduction of audio artifacts.
Various combinations of the above-described techniques can be used together. For example, in some embodiments, the controller 142 is configured to select a single one of the audio encoders to process each respective segment 114 of the audio data. In such embodiments, switching delays, using encoder state data from one encoder to initialize another encoder, or both, can be used to limit artifacts. In other embodiments, the controller 142 is configured to, at least under some circumstances, select two or more of the audio encoders to encode a particular segment 114 of the audio data 112. In some such embodiments, the controller 142 can also use the encoder state data 160 from one encoder to initialize another encoder to further limit artifacts.
In the example illustrated in
Although the content-switchable coder system 140 of
In some implementations, the device 102 corresponds to or is included in one of various types of devices. In an illustrative example, the processor 190 is integrated in a headset device, as described further with reference to
In each of
In
As described above, in some embodiments, the machine-learning audio encoder 144 is configured to encode a segment of audio data using a first number of bits, and the waveform-matching audio encoder 146 is configured to encode a segment of audio data using a second number of bits that is greater than the first number of bits. Thus, if the encoded segment 210 and the encoded segment 212 each represent the same number of segments, the encoded segment 210 includes more bits than the encoded segment 212.
Audio output based on the decoded segments 230-236 can include artifacts due to switching codecs between some segments. For example, switching from a machine-learning audio codec to the waveform-matching audio codec between decoded segment(s) 232 and decoded segment(s) 234 can introduce artifacts in the audio output. The content-switchable coder system 140 of
As another example, when switching from the waveform-matching audio encoder 146 to the machine-learning audio encoder 144, coder state data of the machine-learning audio encoder 144 can be initialized based on information from the waveform-matching audio encoder 146, or vice versa. To illustrate, coder state data based on encoding of the encoded segment(s) 210 can be used to initialize the machine-learning audio encoder 144 when the machine-learning audio encoder 144 begins generation of the encoded segment(s) 212. Additionally, or alternatively, encoder state data based on encoding of the encoded segment(s) 212 can be used to initialize the waveform-matching audio encoder 146 when the waveform-matching audio encoder 146 begins generation of the encoded segment(s) 214.
The example illustrated in
In
In the example illustrated in
The specific values listed in the table 500 are merely illustrative. In other embodiments, different values are used to indicate which audio encoder was used to encode the segments. Further, in some embodiments, the encoder ID field can include a different number of bits.
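As a simplified, hypothetical illustration of signaling the selected encoder to a decoder, the sketch below prepends an encoder ID to each encoded segment's payload; byte-aligned framing and the two-bit ID width are assumptions made for readability, since, as noted above, the field width and values are implementation choices.

```python
ID_BITS = 2  # assumed width of the encoder ID field

def pack_frame(encoder_id: int, payload: bytes) -> bytes:
    """Prepend an encoder ID field to an encoded segment's payload.

    The decoder reads the ID to learn which decoder (machine-learning or
    waveform-matching) to apply to the rest of the frame.  For readability the
    ID occupies a whole leading byte here rather than being bit-packed.
    """
    if not 0 <= encoder_id < (1 << ID_BITS):
        raise ValueError("encoder_id does not fit in the ID field")
    return bytes([encoder_id]) + payload

def unpack_frame(frame: bytes) -> tuple[int, bytes]:
    """Recover the encoder ID and payload from a framed segment."""
    return frame[0], frame[1:]
```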
In
In
In
The first earbud 1702 includes a first microphone 1720, such as a high signal-to-noise microphone positioned to capture the voice of a wearer of the first earbud 1702, an array of one or more other microphones configured to detect ambient sounds and spatially distributed to support beamforming, illustrated as microphones 1722A, 1722B, and 1722C, an “inner” microphone 1724 proximate to the wearer's ear canal (e.g., to assist with active noise cancelling), and a self-speech microphone 1726, such as a bone conduction microphone configured to convert sound vibrations of the wearer's ear bone or skull into an audio signal.
The second earbud 1704 can be configured in a substantially similar manner as the first earbud 1702. In some implementations, the first earbud 1702 is also configured to receive one or more audio signals generated by one or more microphones of the second earbud 1704, such as via wireless transmission between the earbuds 1702, 1704, or via wired transmission in implementations in which the earbuds 1702, 1704 are coupled via a transmission line.
In some implementations, the earbuds 1702, 1704 are configured to automatically switch between various operating modes, such as a passthrough mode in which ambient sound is played via a speaker 1730, a playback mode in which non-ambient sound (e.g., streaming audio corresponding to a phone conversation, media playback, video game, etc.) is played back through the speaker 1730, and an audio zoom mode or beamforming mode in which one or more ambient sounds are emphasized and/or other ambient sounds are suppressed for playback at the speaker 1730. In other implementations, the earbuds 1702, 1704 may support fewer modes or may support one or more other modes in place of, or in addition to, the described modes.
In an illustrative example, the earbuds 1702, 1704 can automatically transition from the playback mode to the passthrough mode in response to detecting the wearer's voice and may automatically transition back to the playback mode after the wearer has ceased speaking. In some examples, the earbuds 1702, 1704 can operate in two or more of the modes concurrently, such as by performing audio zoom on a particular ambient sound (e.g., a dog barking) and playing out the audio zoomed sound superimposed on the sound being played out while the wearer is listening to music (which can be reduced in volume while the audio zoomed sound is being played). In this example, the wearer can be alerted to the ambient sound associated with the audio event without halting playback of the music.
In
In
Referring to
In some embodiments, the method 2000 includes, at block 2002, obtaining, by one or more processors, an indication of a type of audio content associated with a segment of audio data. For example, the controller 142 of
The method 2000 also includes, at block 2004, selectively, based on the indication, causing the segment to be sent as input to a machine-learning audio encoder, a waveform-matching audio encoder, or both. For example, the controller 142 of
In some embodiments, the method 2000 includes generating a bitstream representing an output of the machine-learning audio encoder, an output of the waveform-matching audio encoder, or both. For example, the modem 170 of
In some embodiments, the method 2000 includes, when a particular audio encoder is selected to process two sequential segments of the audio data, processing a second segment of the two sequential segments using encoder state data resulting from processing a first segment of the two sequential segments, where the second segment is subsequent to the first segment. For example, when the waveform-matching audio encoder 146 of
In some embodiments, the method 2000 includes, when different audio encoders are selected to process two sequential segments of the audio data, processing a second segment of the two sequential segments using encoder state data that is independent of processing a first segment of the two sequential segments, where the second segment is subsequent to the first segment. For example, when the waveform-matching audio encoder 146 of
In some embodiments, the method 2000 includes, when different audio encoders are selected to process two sequential segments of the audio data, processing a second segment of the two sequential segments using encoder state data resulting from processing a first segment of the two sequential segments, where the second segment is subsequent to the first segment. For example, when the waveform-matching audio encoder 146 of
In some embodiments, the method 2000 includes applying a first delay when transitioning to causing segments to be input to the waveform-matching audio encoder and applying a second delay when transitioning to causing segments to be input to the machine-learning audio encoder, wherein the first delay is different from the second delay. In some such embodiments, the first delay is fixed and the method 2000 further includes determining the second delay based on content of one or more of the segments.
The method 2000 of
Referring to
In a particular implementation, the device 2100 includes a processor 2106 (e.g., a central processing unit (CPU)). The device 2100 may include one or more additional processors 2110 (e.g., one or more DSPs). In a particular aspect, the processor 190 of
In this context, the term “processor” refers to an integrated circuit consisting of logic cells, interconnects, input/output blocks, clock management components, memory, and optionally other special purpose hardware components, designed to execute instructions and perform various computational tasks. Examples of processors include, without limitation, central processing units (CPUs), digital signal processors (DSPs), neural processing units (NPUs), graphics processing units (GPUs), field programmable gate arrays (FPGAs), microcontrollers, quantum processors, coprocessors, vector processors, other similar circuits, and variants and combinations thereof. In some cases, a processor can be integrated with other components, such as communication components, input/output components, etc., to form a system on a chip (SOC) device or a packaged electronic device.
Taking CPUs as a starting point, a CPU typically includes one or more processor cores, each of which includes a complex, interconnected network of transistors and other circuit components defining logic gates, memory elements, etc. A core is responsible for executing instructions to, for example, perform arithmetic and logical operations. Typically, a CPU includes an Arithmetic Logic Unit (ALU) that handles mathematical operations and a Control Unit that generates signals to coordinate the operation of other CPU components, such as to manage operations of a fetch-decode-execute cycle.
CPUs and/or individual processor cores generally include local memory circuits, such as registers and cache to temporarily store data during operations. Registers include high-speed, small-sized memory units intimately connected to the logic cells of a CPU. Often registers include transistors arranged as groups of flip-flops, which are configured to store binary data. Caches include fast, on-chip memory circuits used to store frequently accessed data. Caches can be implemented, for example, using Static Random-Access Memory (SRAM) circuits.
Operations of a CPU (e.g., arithmetic operations, logic operations, and flow control operations) are directed by software and firmware. At the lowest level, the CPU includes an instruction set architecture (ISA) that specifies how individual operations are performed using hardware resources (e.g., registers, arithmetic units, etc.). Higher level software and firmware is translated into various combinations of ISA operations to cause the CPU to perform specific higher-level operations. For example, an ISA typically specifies how the hardware components of the CPU move and modify data to perform operations such as addition, multiplication, and subtraction, and high-level software is translated into sets of such operations to accomplish larger tasks, such as adding two columns in a spreadsheet. Generally, a CPU operates on various levels of software, including a kernel, an operating system, applications, and so forth, with each higher level of software generally being more abstracted from the ISA and usually more readily understandable by human users.
GPUs, NPUs, DSPs, microcontrollers, coprocessors, FPGAs, ASICs, and vector processors include components similar to those described above for CPUs. The differences among these various types of processors are generally related to the use of specialized interconnection schemes and ISAs to improve a processor's ability to perform particular types of operations. For example, the logic gates, local memory circuits, and the interconnects therebetween of a GPU are specifically designed to improve parallel processing, sharing of data between processor cores, and vector operations, and the ISA of the GPU may define operations that take advantage of these structures. As another example, ASICs are highly specialized processors that include similar circuitry arranged and interconnected for a particular task, such as encryption or signal processing. As yet another example, FPGAs are programmable devices that include an array of configurable logic blocks (e.g., interconnected sets of transistors and memory elements) that can be configured (often on the fly) to perform customizable logic functions.
The device 2100 may include a memory 2186 and a CODEC 2134. The memory 2186 may include instructions 2156 that are executable by the one or more additional processors 2110, the processor 2106, or both, to implement the functionality described with reference to the content-switchable coder system 140. The device 2100 may include the modem 170 coupled, via a transceiver 2150, to an antenna 2152.
The device 2100 may include a display 2128 coupled to a display controller 2126. One or more speakers 2192 and the microphone(s) 2194 may be coupled to the CODEC 2134. The CODEC 2134 may include a digital-to-analog converter (DAC) 2102, an analog-to-digital converter (ADC) 2104, or both. In a particular implementation, the CODEC 2134 may receive analog signals from the microphone(s) 2194, convert the analog signals to digital signals using the analog-to-digital converter 2104, and provide the digital signals to the speech and music codec 2108. The speech and music codec 2108 may process the digital signals, and the digital signals may further be processed by the content-switchable coder system 140. In a particular implementation, the speech and music codec 2108 may provide digital signals to the CODEC 2134. The CODEC 2134 may convert the digital signals to analog signals using the digital-to-analog converter 2102 and may provide the analog signals to the speaker 2192.
In a particular implementation, the device 2100 may be included in a system-in-package or system-on-chip device 2122. In a particular implementation, the memory 2186, the processor 2106, the processors 2110, the display controller 2126, the CODEC 2134, and the modem 170 are included in the system-in-package or system-on-chip device 2122. In a particular implementation, an input device 2130 and a power supply 2144 are coupled to the system-in-package or the system-on-chip device 2122. Moreover, in a particular implementation, as illustrated in
The device 2100 may include a smart speaker, a speaker bar, a mobile communication device, a smart phone, a cellular phone, a laptop computer, a computer, a tablet, a personal digital assistant, a display device, a television, a gaming console, a music player, a radio, a digital video player, a digital video disc (DVD) player, a tuner, a camera, a navigation device, a vehicle, a headset, an augmented reality headset, a mixed reality headset, a virtual reality headset, an aerial vehicle, a home automation system, a voice-activated device, a wireless speaker and voice activated device, a portable electronic device, a car, a computing device, a communication device, an internet-of-things (IoT) device, a virtual reality (VR) device, a base station, a mobile device, or any combination thereof.
In conjunction with the described implementations, an apparatus includes means for obtaining an indication of a type of audio content associated with a segment of audio data. For example, the means for obtaining the indication of the type of audio content associated with the segment of audio data can include the system 100, the device 102, the processor(s) 190, the content-switchable coder system 140, the controller 142, the audio classifier 148, the integrated circuit 802, the processor 2106, the processor(s) 2110, the system-in-package or the system-on-chip device 2122, the device 2100, other circuitry configured to obtain an indication of a type of audio content associated with a segment of audio data, or a combination thereof.
The apparatus also includes means for selectively, based on the indication, causing the segment to be sent as input to a machine-learning audio encoder, a waveform-matching audio encoder, or both. For example, the means for selectively causing the segment to be sent as input to a machine-learning audio encoder, a waveform-matching audio encoder, or both, based on the indication can include the system 100, the device 102, the processor(s) 190, the content-switchable coder system 140, the controller 142, the integrated circuit 802, the processor 2106, the processor(s) 2110, the system-in-package or the system-on-chip device 2122, the device 2100, other circuitry configured to cause a segment to be selectively sent as input to a machine-learning audio encoder, a waveform-matching audio encoder, or both, based on an indication of a type of audio content associated with a segment of audio data, or a combination thereof.
In some implementations, a non-transitory computer-readable medium (e.g., a computer-readable storage device, such as the memory 2186) includes instructions (e.g., the instructions 2156) that, when executed by one or more processors (e.g., the one or more processors 2110 or the processor 2106), cause the one or more processors to obtain an indication of a type of audio content associated with a segment of audio data. The instructions also cause the one or more processors to selectively, based on the indication, cause the segment to be sent as input to a machine-learning audio encoder, a waveform-matching audio encoder, or both.
Particular aspects of the disclosure are described below in sets of interrelated Examples:
According to Example 1, a device includes a machine-learning audio encoder; a waveform-matching audio encoder; and a controller configured to cause a segment of audio data to be input to the machine-learning audio encoder, to the waveform-matching audio encoder, or to both, based on a classification associated with the segment.
Example 2 includes the device of Example 1, further comprising an audio classifier configured to generate an indicator of the classification based on whether the segment represents audio content of a particular type and configured to provide the indicator to the controller.
Example 3 includes the device of Example 1 or Example 2, further comprising a modem coupled to the machine-learning audio encoder and the waveform-matching audio encoder and configured to represent, in a bitstream, an output of the machine-learning audio encoder, an output of the waveform-matching audio encoder, or both.
Example 4 includes the device of any of Examples 1 to 3, wherein the machine-learning audio encoder is configured to encode an input segment using a first number of bits, wherein the waveform-matching audio encoder is configured to encode an input segment using a second number of bits, and wherein the first number is less than the second number.
Example 5 includes the device of any of Examples 1 to 4, wherein the controller is configured to select the machine-learning audio encoder to process a first set of segments that represent speech and to select the waveform-matching audio encoder to process a second set of segments that represent non-speech sounds.
Example 6 includes the device of any of Examples 1 to 5, wherein the controller is configured to select a single audio encoder to process each respective segment of the audio data.
Example 7 includes the device of any of Examples 1 to 5, wherein the controller is configured to, in response to a determination to transition which audio encoder is provided segments of the audio data, provide at least one segment of the audio data to both the machine-learning audio encoder and the waveform-matching audio encoder.
Example 8 includes the device of any of Examples 1 to 7 and further includes a modem coupled to the machine-learning audio encoder and the waveform-matching audio encoder and configured to represent, in a bitstream, an output of the machine-learning audio encoder, an output of the waveform-matching audio encoder, or both.
Example 9 includes the device of any of Examples 1 to 8, wherein, when a particular audio encoder is selected to process two sequential segments of the audio data, encoder state data resulting from processing a first segment of the two sequential segments is used to process a second segment of the two sequential segments, where the second segment is subsequent to the first segment.
Example 10 includes the device of any of Examples 1 to 9, wherein, when different audio encoders are selected to process two sequential segments of the audio data, default encoder state data is used to process a second segment of the two sequential segments, where the second segment is subsequent to a first segment of the two sequential segments.
Example 11 includes the device of any of Examples 1 to 9, wherein, when different audio encoders are selected to process two sequential segments of the audio data, encoder state data used by a second audio encoder to process a second segment of the two sequential segments is based on a prior state of the second audio encoder, where the second segment is subsequent to a first segment of the two sequential segments.
Example 12 includes the device of any of Examples 1 to 9, wherein, when different audio encoders are selected to process two sequential segments of the audio data, encoder state data used to process a second segment of the two sequential segments is based on processing of a first segment of the two sequential segments, where the second segment is subsequent to the first segment.
Example 13 includes the device of any of Examples 1 to 12, wherein the controller is configured to use a first delay when transitioning to causing segments to be input to the waveform-matching audio encoder and is configured to use a second delay when transitioning to causing segments to be input to the machine-learning audio encoder, wherein the first delay is different from the second delay.
Example 14 includes the device of Example 13, wherein the first delay is fixed and the second delay is variable and is selected based on content of the segments.
Example 15 includes the device of any of Examples 1 to 14, wherein the controller is integrated into one or more processors.
Example 16 includes the device of any of Examples 1 to 14, wherein the controller is integrated into processing circuitry.
Example 17 includes the device of any of Examples 1 to 16, wherein the machine-learning audio encoder, the waveform-matching audio encoder, or both, are integrated into a processor.
Example 18 includes the device of any of Examples 1 to 17, wherein the controller, the machine-learning audio encoder, and the waveform-matching audio encoder are integrated in at least one of a mobile phone, a tablet computer device, a wearable electronic device, a camera device, a virtual reality headset, a mixed reality headset, or an augmented reality headset.
Example 19 includes the device of any of Examples 1 to 17, wherein the controller, the machine-learning audio encoder, and the waveform-matching audio encoder are integrated in a vehicle.
According to Example 20, a method includes obtaining, by one or more processors, an indication of a type of audio content associated with a segment of audio data; and selectively, based on the indication, causing the segment to be sent as input to a machine-learning audio encoder, a waveform-matching audio encoder, or both.
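For illustration only, the per-segment loop below combines the operations of Example 20 with the hypothetical classify_segment, encoder, and pack_frame helpers sketched above; it shows one possible end-to-end flow, not a required implementation.

    def encode_stream(segments, classify_segment, ml_enc, wf_enc, pack_frame):
        """Classify each segment, route it to an encoder, and append the tagged
        result to a bitstream (hypothetical helpers; illustrative only)."""
        bitstream = bytearray()
        for seg in segments:
            if classify_segment(seg) == "speech":
                encoder_id, payload = 0, ml_enc.encode(seg)
            else:
                encoder_id, payload = 1, wf_enc.encode(seg)
            bitstream += pack_frame(encoder_id, payload)
        return bytes(bitstream)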
Example 21 includes the method of Example 20, further comprising using an audio classifier to generate the indication based on whether the segment represents audio content of a particular type.
Example 22 includes the method of Example 20 or Example 21, further comprising generating a bitstream representing an output of the machine-learning audio encoder, an output of the waveform-matching audio encoder, or both.
Example 23 includes the method of any of Examples 20 to 22, wherein the machine-learning audio encoder is configured to encode an input segment using a first number of bits, wherein the waveform-matching audio encoder is configured to encode an input segment using a second number of bits, and wherein the first number is less than the second number.
Example 24 includes the method of any of Examples 20 to 23, wherein each segment of the audio data is sent as input to a single audio encoder.
Example 25 includes the method of any of Examples 20 to 23 and further includes, based on a determination to transition which audio encoder is provided segments of the audio data, providing at least one segment of the audio data to both the machine-learning audio encoder and the waveform-matching audio encoder.
Example 26 includes the method of any of Examples 20 to 25 and further includes, when a particular audio encoder is selected to process two sequential segments of the audio data, processing a second segment of the two sequential segments using encoder state data resulting from processing a first segment of the two sequential segments, where the second segment is subsequent to the first segment.
Example 27 includes the method of any of Examples 20 to 26 and further includes, when different audio encoders are selected to process two sequential segments of the audio data, processing a second segment of the two sequential segments using default encoder state data, where the second segment is subsequent to a first segment of the two sequential segments.
Example 28 includes the method of any of Examples 20 to 26 and further includes, when different audio encoders are selected to process two sequential segments of the audio data, processing, by a second audio encoder, a second segment of the two sequential segments using encoder state data based on a prior state of the second audio encoder, where the second segment is subsequent to a first segment of the two sequential segments.
Example 29 includes the method of any of Examples 20 to 26 and further includes, when different audio encoders are selected to process two sequential segments of the audio data, processing a second segment of the two sequential segments using encoder state data resulting from processing a first segment of the two sequential segments, where the second segment is subsequent to the first segment.
Example 30 includes the method of any of Examples 20 to 29 and further includes applying a first delay when transitioning to causing segments to be input to the waveform-matching audio encoder and applying a second delay when transitioning to causing segments to be input to the machine-learning audio encoder, wherein the first delay is different from the second delay.
Example 31 includes the method of Example 30, wherein the first delay is fixed and further comprising determining the second delay based on content of one or more of the segments.
According to Example 32, a non-transitory computer-readable medium stores instructions that are executable by one or more processors to cause the one or more processors to obtain an indication of a type of audio content associated with a segment of audio data; and selectively, based on the indication, cause the segment to be sent as input to a machine-learning audio encoder, a waveform-matching audio encoder, or both.
Example 33 includes the non-transitory computer-readable medium of Example 32, wherein the instructions are executable to cause the one or more processors to use an audio classifier to generate the indication based on whether the segment represents audio content of a particular type.
Example 34 includes the non-transitory computer-readable medium of Example 32 or Example 33, wherein the instructions are executable to cause the one or more processors to generate a bitstream representing an output of the machine-learning audio encoder, an output of the waveform-matching audio encoder, or both.
Example 35 includes the non-transitory computer-readable medium of any of Examples 32 to 34, wherein the machine-learning audio encoder is configured to encode an input segment using a first number of bits, wherein the waveform-matching audio encoder is configured to encode an input segment using a second number of bits, and wherein the first number is less than the second number.
Example 36 includes the non-transitory computer-readable medium of any of Examples 32 to 35, wherein the instructions are executable to cause the one or more processors to send each segment of the audio data as input to a single audio encoder.
Example 37 includes the non-transitory computer-readable medium of any of Examples 32 to 35, wherein the instructions are executable to cause the one or more processors to, based on a determination to transition which audio encoder is provided segments of the audio data, provide at least one segment of the audio data to both the machine-learning audio encoder and the waveform-matching audio encoder.
Example 38 includes the non-transitory computer-readable medium of any of Examples 32 to 37, wherein the instructions are executable to cause the one or more processors to, when a particular audio encoder is selected to process two sequential segments of the audio data, process a second segment of the two sequential segments using encoder state data resulting from processing a first segment of the two sequential segments, where the second segment is subsequent to the first segment.
Example 39 includes the non-transitory computer-readable medium of any of Examples 32 to 38, wherein the instructions are executable to cause the one or more processors to, when different audio encoders are selected to process two sequential segments of the audio data, process a second segment of the two sequential segments using default encoder state data, where the second segment is subsequent to a first segment of the two sequential segments.
Example 40 includes the non-transitory computer-readable medium of any of Examples 32 to 38, wherein the instructions are executable to cause the one or more processors to, when different audio encoders are selected to process two sequential segments of the audio data, process, by a second audio encoder, a second segment of the two sequential segments using encoder state data based on a prior state of the second audio encoder, where the second segment is subsequent to a first segment of the two sequential segments.
Example 41 includes the non-transitory computer-readable medium of any of Examples 32 to 38, wherein the instructions are executable to cause the one or more processors to, when different audio encoders are selected to process two sequential segments of the audio data, process a second segment of the two sequential segments using encoder state data resulting from processing a first segment of the two sequential segments, where the second segment is subsequent to the first segment.
Example 42 includes the non-transitory computer-readable medium of any of Examples 32 to 41, wherein the instructions are executable to cause the one or more processors to apply a first delay when transitioning to causing segments to be input to the waveform-matching audio encoder and apply a second delay when transitioning to causing segments to be input to the machine-learning audio encoder, wherein the first delay is different from the second delay.
Example 43 includes the non-transitory computer-readable medium of Example 42, wherein the first delay is fixed and wherein the instructions are executable to cause the one or more processors to determine the second delay based on content of one or more of the segments.
According to Example 44, an apparatus includes means for obtaining an indication of a type of audio content associated with a segment of audio data; and means for selectively, based on the indication, causing the segment to be sent as input to a machine-learning audio encoder, a waveform-matching audio encoder, or both.
Example 45 includes the apparatus of Example 44, further comprising means for using an audio classifier to generate the indication based on whether the segment represents audio content of a particular type.
Example 46 includes the apparatus of Example 44 or Example 45, further comprising means for generating a bitstream representing an output of the machine-learning audio encoder, an output of the waveform-matching audio encoder, or both.
Example 47 includes the apparatus of any of Examples 44 to 46, wherein the machine-learning audio encoder is configured to encode an input segment using a first number of bits, wherein the waveform-matching audio encoder is configured to encode an input segment using a second number of bits, and wherein the first number is less than the second number.
Example 48 includes the apparatus of any of Examples 44 to 47, wherein each segment of the audio data is sent as input to a single audio encoder.
Example 49 includes the apparatus of any of Examples 44 to 47 and further includes means for providing at least one segment of the audio data to both the machine-learning audio encoder and the waveform-matching audio encoder based on a determination to transition which audio encoder is provided segments of the audio data.
Example 50 includes the apparatus of any of Examples 44 to 49 and further includes means for processing a second segment of two sequential segments using encoder state data resulting from processing a first segment of the two sequential segments when a particular audio encoder is selected to process the two sequential segments of the audio data, where the second segment is subsequent to the first segment.
Example 51 includes the apparatus of any of Examples 44 to 50 and further includes means for processing a second segment of two sequential segments using default encoder state data when different audio encoders are selected to process the two sequential segments of the audio data, where the second segment is subsequent to a first segment of the two sequential segments.
Example 52 includes the apparatus of any of Examples 44 to 50 and further includes means for processing, by a second audio encoder, a second segment of two sequential segments using encoder state data based on a prior state of the second audio encoder when different audio encoders are selected to process the two sequential segments of the audio data, where the second segment is subsequent to a first segment of the two sequential segments.
Example 53 includes the apparatus of any of Examples 44 to 50 and further includes means for processing a second segment of two sequential segments using encoder state data resulting from processing a first segment of the two sequential segments when different audio encoders are selected to process the two sequential segments of the audio data, where the second segment is subsequent to the first segment.
Example 54 includes the apparatus of any of Examples 44 to 53 and further includes means for applying a first delay when transitioning to causing segments to be input to the waveform-matching audio encoder and applying a second delay when transitioning to causing segments to be input to the machine-learning audio encoder, wherein the first delay is different from the second delay.
Example 55 includes the apparatus of Example 54, wherein the first delay is fixed and further comprising means for determining the second delay based on content of one or more of the segments.
Those of skill in the art would further appreciate that the various illustrative logical blocks, configurations, modules, circuits, and algorithm steps described in connection with the implementations disclosed herein may be implemented as electronic hardware, computer software executed by a processor, or combinations of both. Various illustrative components, blocks, configurations, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or processor-executable instructions depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
The steps of a method or algorithm described in connection with the implementations disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, hard disk, a removable disk, a compact disc read-only memory (CD-ROM), or any other form of non-transitory storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor may read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an application-specific integrated circuit (ASIC). The ASIC may reside in a computing device or a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a computing device or user terminal.
The previous description of the disclosed aspects is provided to enable a person skilled in the art to make or use the disclosed aspects. Various modifications to these aspects will be readily apparent to those skilled in the art, and the principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope possible consistent with the principles and novel features as defined by the following claims.