The present disclosure is generally related to sound event classification and more particularly to transfer learning techniques for updating sound event classification models.
Advances in technology have resulted in smaller and more powerful computing devices. For example, there currently exist a variety of portable personal computing devices, including wireless telephones such as mobile and smart phones, tablets, and laptop computers that are small, lightweight, and easily carried by users. These devices can communicate voice and data packets over wireless networks. Further, many such devices incorporate additional functionality such as a digital still camera, a digital video camera, a digital recorder, and an audio file player. Also, such devices can process executable instructions, including software applications, such as a web browser application, that can be used to access the Internet. As such, these devices can include significant computing capabilities, including, for example, a sound event classification (SEC) system that attempts to recognize sound events (e.g., slamming doors, car horns, etc.) in an audio signal.
An SEC system is generally trained using a supervised machine learning technique to recognize a specific set of sounds that are identified in labeled training data. As a result, each SEC system tends to be domain specific (e.g., capable of classifying a predetermined set of sounds). After an SEC system is trained, it is difficult to update the SEC system to recognize new sounds that were not identified in the labeled training data. For example, an SEC system can be trained using a set of labeled audio data samples that include a selection of city noises, such as car horns, sirens, slamming doors, and engine sounds. In this example, if a need arises to also recognize a sound that was not labeled in the set of labeled audio data samples, such as a doorbell, updating the SEC system to recognize the doorbell involves completely retraining the SEC system using both labeled audio data samples for the doorbell as well as the original set of labeled audio data samples. As a result, training an SEC system to recognize a new sound requires approximately the same computing resources (e.g., processor cycles, memory, etc.) as generating a brand-new SEC system. Further, over time, as even more sounds are added to be recognized, the number of audio data samples that must be maintained and used to train the SEC system can become unwieldy.
In a particular aspect, a device includes one or more processors configured to initialize a second neural network based on a first neural network that is trained to detect a first set of sound classes. The one or more processors are also configured to link an output of the first neural network and an output of the second neural network to one or more coupling networks. The one or more processors are also configured to, after the second neural network and the one or more coupling networks are trained, determine whether to discard the first neural network based on an accuracy of sound classes assigned by the second neural network and an accuracy of sound classes assigned by the first neural network.
In a particular aspect, a method includes initializing a second neural network based on a first neural network that is trained to detect a first set of sound classes and linking an output of the first neural network and an output of the second neural network to one or more coupling networks. The method further includes, after training the second neural network and the one or more coupling networks, determining whether to discard the first neural network based on an accuracy of sound classes assigned by the second neural network and an accuracy of sound classes assigned by the first neural network.
In a particular aspect, a device includes means for initializing a second neural network based on a first neural network that is trained to detect a first set of sound classes and means for linking an output of the first neural network and an output of the second neural network to one or more coupling networks. The device further includes means for determining, after the second neural network and the one or more coupling networks are trained, whether to discard the first neural network based on an accuracy of sound classes assigned by the second neural network and an accuracy of sound classes assigned by the first neural network.
In a particular aspect, a non-transitory computer-readable storage medium includes instructions that when executed by a processor, cause the processor to initialize a second neural network based on a first neural network that is trained to detect a first set of sound classes. The instructions further cause the processor to link an output of the first neural network and an output of the second neural network to one or more coupling networks. The instructions further cause the processor to, after training the second neural network and the one or more coupling networks, determine whether to discard the first neural network based on an accuracy of sound classes assigned by the second neural network and an accuracy of sound classes assigned by the first neural network.
Other aspects, advantages, and features of the present disclosure will become apparent after review of the entire application, including the following sections: Brief Description of the Drawings, Detailed Description, and the Claims.
Sound event classification models can be trained using machine-learning techniques. For example, a neural network can be trained as a sound event classifier using backpropagation or other machine-learning training techniques. A sound event classification model trained in this manner can be small enough (in terms of storage space occupied) and simple enough (in terms of computing resources used during operation) for a portable computing device to store and use. However, the training process uses significantly more processing resources than are used to perform sound event classification using the trained sound event classification model. Additionally, the training process uses a large set of labeled training data including many audio data samples for each sound class that the sound event classification model is being trained to detect. Thus, it may be prohibitive, in terms of memory utilization or other computing resources, to train a sound event classification model from scratch on a portable computing device or another resource-limited computing device. As a result, a user who desires to use a sound event classification model on a portable computing device may be limited to downloading pre-trained sound event classification models onto the portable computing device from a less resource-constrained computing device or a library of pre-trained sound event classification models. Thus, the user has limited customization options.
The disclosed systems and methods facilitate knowledge migration from a previously trained sound event classification model (also referred to as a “source model”) to a new sound event classification model (also referred to as a “target model”), which enables learning new sound event classes without forgetting previously learned sound event classes and without re-training from scratch. In a particular aspect, a neural adapter is employed to migrate the previously learned knowledge from the source model to the target model. The source model and the target model are merged via the neural adapter to form a combined model. The neural adapter enables the target model to learn new sound events from minimal training data while maintaining performance similar to that of the source model.
Thus, the disclosed systems and methods provide a scalable sound event detection framework. In other words, a user can add a customized sound event to an existing source model, whether the source model is part of an ensemble of binary classifiers or is a multi-class classifier. In some aspects, the disclosed systems and methods enable the target model to learn multiple new sound event classes at the same time (e.g., during a single training session).
The disclosed learning techniques may be used for continuous learning, especially in applications where there is a constraint on the memory footprint. For example, the source model may be discarded after the target model is trained, freeing up memory associated with the source model. To illustrate, when the target model is determined to be mature (e.g., in terms of classification accuracy or performance), the source model and the neural adapter can be discarded, and the target model can be used alone. In some aspects, the maturity of the target model is determined based on performance of the target model, such as performance in recognizing sound event classes that the source model was trained to recognize. For example, the target model may be considered mature when the target model is able to recognize sound event classes with at least the same accuracy as the source model. In some aspects, the target model can later be used as a source model for learning additional sound event classes.
In a particular aspect, no training of the sound event classification models is performed while the system is operating in an inference mode. Rather, during operation in the inference mode, existing knowledge, in the form of one or more previously trained sound event classification models (e.g., the source model), is used to analyze detected sounds. More than one sound event classification model can be used to analyze the sound. For example, an ensemble of sound event classification models can be used during operation in the inference mode. A particular sound event classification model can be selected from a set of available sound event classification models based on detection of a trigger condition. To illustrate, a particular sound event classification model is used, as the active sound event classification model, whenever a certain trigger (or triggers) is activated. The trigger(s) may be based on locations, sounds, camera information, other sensor data, user input, etc. For example, a particular sound event classification model may be trained to recognize sound events related to crowded areas, such as theme parks, outdoor shopping malls, public squares, etc. In this example, the particular sound event classification model may be used as the active sound event classification model when global positioning data indicates that a device capturing sound is at any of these locations. In this example, the trigger is based on the location of the device capturing sound, and the active sound event classification model is selected and loaded (e.g., in addition to or in place of a previous active sound event classification model) when the device is detected to be in the location. In a particular aspect, while operating in the inference mode, audio data samples representing sound events that are not recognized can be stored and can subsequently be used to update a sound event classification model using the disclosed learning techniques.
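As a non-limiting illustration of trigger-based selection of the active sound event classification model, the following Python sketch associates trigger conditions with models and swaps the active model when a trigger fires. The registry class, the trigger predicates, and the context dictionary are illustrative assumptions rather than elements of the disclosure:

```python
# Hypothetical sketch of trigger-based selection of the active SEC model.
# The registry, trigger predicates, and context keys are illustrative only.
from typing import Callable, Dict

class ModelRegistry:
    def __init__(self, default_model):
        self.active_model = default_model
        self.triggers: Dict[Callable[[dict], bool], object] = {}

    def register(self, trigger: Callable[[dict], bool], model) -> None:
        # Associate a trigger predicate (e.g., "device is in a crowded area")
        # with the SEC model to activate when that trigger fires.
        self.triggers[trigger] = model

    def update(self, context: dict) -> None:
        # Evaluate triggers against the current context (location, sensor data,
        # user input, ...) and load the matching model as the active classifier.
        for trigger, model in self.triggers.items():
            if trigger(context):
                self.active_model = model
                return

# Example: select a "crowded area" model when GPS places the device in a theme park.
registry = ModelRegistry(default_model="city_sounds_model")
registry.register(lambda ctx: ctx.get("location") == "theme_park", "crowded_area_model")
registry.update({"location": "theme_park"})
```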
The disclosed systems and methods use transfer learning techniques to generate updated sound event classification models in a manner that is significantly less resource intensive than training sound event classification models from scratch. According to a particular aspect, the transfer learning techniques can be used to generate an updated sound event classification model based on a previously trained sound event classification model (also referred to herein as a “base model”). The updated sound event classification model is configured to detect more types of sound events than the base model is. For example, the base model is trained to detect any of a first set of sound events, each of which corresponds to a sound class of a first set of sound classes, and the updated sound event classification model is trained to detect any of the first set of sound events as well as any of a second set of sound events, each of which corresponds to a sound class of a second set of sound classes. Accordingly, the disclosed systems and methods reduce the computing resources (e.g., memory, processor cycles, etc.) used to generate an updated sound event classification model. As one example of a use case for the disclosed system and methods, a portable computing device can be used to generate a custom sound event detector.
According to a particular aspect, an updated sound event classification model is generated based on a previously trained sound event classification model, a subset of the training data used to train the previously trained sound event classification model, and one or more sets of training data corresponding to one or more additional sound classes that the updated sound event classification model is to be able to detect. In this aspect, the previously trained sound event classification model (e.g., a first neural network) is retained and unchanged. Additionally, a copy of the previously trained sound event classification model is generated and modified to have a new output layer. The new output layer includes an output node for each sound class that the updated sound event classification model (e.g., a second neural network) is to be able to detect. For example, if the first model is configured to detect ten distinct sound classes, then an output layer of the first model may include ten output nodes. In this example, if the updated sound event classification model is to be trained to detect twelve distinct sound classes (e.g., the ten sound classes that the first model is configured to detect plus two additional sound classes), then the output layer of the second model includes twelve output nodes.
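One way to picture this copy-and-replace step is the following sketch (PyTorch is used purely for illustration; the disclosure does not name a framework, and the layer sizes are assumptions). A trained base model with a ten-node output layer is deep-copied, and the copy's output layer is replaced with a twelve-node layer:

```python
# Illustrative only: layer sizes and the use of nn.Sequential are assumptions.
import copy
import torch.nn as nn

base_model = nn.Sequential(
    nn.Linear(40, 128), nn.ReLU(),   # input layer over, e.g., 40 audio features
    nn.Linear(128, 128), nn.ReLU(),  # hidden layer(s)
    nn.Linear(128, 10),              # output layer: ten original sound classes
)
# ... assume base_model has already been trained on the first set of sound classes ...

incremental_model = copy.deepcopy(base_model)   # same topology and trained link weights
incremental_model[-1] = nn.Linear(128, 12)      # new output layer: ten classes plus two new classes
```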
One or more coupling networks are generated to link output of the first model and output of the second model. For example, the coupling network(s) convert an output of the first model to have a size corresponding to an output of the second model. To illustrate, in the example of the previous paragraph, the first model includes ten output nodes and generates an output having ten data elements, and the second model includes twelve output nodes and generates an output having twelve data elements. In this example, the coupling network(s) modify the output of the first model to have twelve data elements. The coupling network(s) also combine the output of the second model and the modified output of the first model to generate a sound classification output of the updated sound event classification model.
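Continuing the sketch above, the coupling networks might be realized as follows. An adapter maps the ten-element output of the first model to twelve elements, and a merger aggregates the two twelve-element outputs before a final output layer. The layer sizes and the use of element-wise addition as the aggregation operation are assumptions made only for illustration:

```python
# Illustrative coupling networks: a neural adapter (10 -> 12 elements) and a
# merger that aggregates the adapted output with the second model's output.
import torch
import torch.nn as nn

class CouplingNetworks(nn.Module):
    def __init__(self, base_classes: int = 10, total_classes: int = 12):
        super().__init__()
        # Adapter layer(s) that convert the first model's output to the size of
        # the second model's output.
        self.neural_adapter = nn.Sequential(
            nn.Linear(base_classes, total_classes),
            nn.ReLU(),
        )
        # Output layer applied after the two outputs are aggregated.
        self.merger_output = nn.Linear(total_classes, total_classes)

    def forward(self, base_out: torch.Tensor, incr_out: torch.Tensor) -> torch.Tensor:
        adapted = self.neural_adapter(base_out)   # modified output of the first model (12 elements)
        merged = adapted + incr_out               # aggregation (element-wise addition assumed)
        return self.merger_output(merged)         # combined sound classification output
```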
The updated sound event classification model is trained using labeled training data that includes audio data samples and labels for each sound class that the updated sound event classification model is being trained to detect or classify. However, since the first model is already trained to accurately detect the first set of sound classes, the labeled training data includes far fewer audio data samples for the first set of sound classes than were originally used to train the first model. To illustrate, the first model can be trained using hundreds or thousands of audio data samples for each sound class of the first set of sound classes. In contrast, the labeled training data used to train the updated sound event classification model can include tens of audio data samples, or fewer, for each sound class of the first set of sound classes. The labeled training data also includes audio data samples for each sound class of the second set of sound classes, and these too can number in the tens, or fewer, per sound class.
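For illustration, assembling such a training set might look like the following sketch, in which load_samples is a hypothetical helper and the per-class sample count is an assumed value:

```python
# Illustrative composition of the labeled training data for the update model:
# only tens of examples per original sound class (versus the hundreds or
# thousands used to train the base model) plus tens of examples per new class.
# load_samples is a hypothetical helper; per_class is an assumed count.
def build_update_training_data(load_samples, original_classes, new_classes, per_class=20):
    labeled_data = []
    for label in list(original_classes) + list(new_classes):
        for features in load_samples(label, limit=per_class):
            labeled_data.append((features, label))   # (audio data sample, label) pairs
    return labeled_data
```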
Backpropagation or another machine-learning technique is used to train the second model and the one or more coupling networks. During this process, the first model is unchanged, which limits or eliminates the risk that the first model will forget its prior training. For example, during its previous training, the first model was trained using a large labeled training data set to accurately detect the first set of sound classes. Retraining the first model on the relatively small labeled training data set used to train the updated model risks causing the accuracy of the first model to decline (sometimes referred to as “forgetting” some of its prior training). Retaining the first model unchanged while training the updated sound event classification model mitigates the risk of forgetting the first set of sound classes.
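Continuing the earlier sketches, one possible training step is shown below. The base model's parameters are frozen so that only the incremental model and the coupling networks are updated by backpropagation; the optimizer and the cross-entropy loss are assumptions, since the disclosure only requires that the first model remain unchanged during training:

```python
# Illustrative training step: the first model is held fixed while the second
# model and the coupling networks are trained by backpropagation.
import torch
import torch.nn.functional as F

for p in base_model.parameters():
    p.requires_grad = False                      # keep the first model unchanged

coupling_networks = CouplingNetworks()           # from the earlier sketch
optimizer = torch.optim.Adam(
    list(incremental_model.parameters()) + list(coupling_networks.parameters()),
    lr=1e-3,
)

def train_step(features, labels):
    # features: a batch of audio feature vectors; labels: integer sound-class indices
    with torch.no_grad():
        base_out = base_model(features)              # first output (10 elements per sample)
    incr_out = incremental_model(features)           # second output (12 elements per sample)
    logits = coupling_networks(base_out, incr_out)   # merged classification output
    loss = F.cross_entropy(logits, labels)           # assumed loss; not specified by the disclosure
    optimizer.zero_grad()
    loss.backward()                                  # updates incremental model + coupling networks only
    optimizer.step()
    return loss.item()
```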
Additionally, before training, the second model is identical to the first model except for the output layer of the second model and interconnections therewith. Thus, at the starting point of the training, the second model is expected to be closer to convergence (e.g., closer to a training termination condition) than a randomly seeded model. As a result, fewer iterations should be needed to train the second model than were used to train the first model.
After the updated sound event classification model is trained, either the second model or the updated sound event classification model (including the first model, the second model, the one or more coupling networks, and links therebetween) can be used to detect sound events. For example, a model checker can select an active sound event classification model by performing one or more model checks. The model checks may include determining whether the second model exhibits significant forgetting relative to the first model. To illustrate, classification results generated by the second model can be compared to classification results generated by the first model to determine whether the second model assigns sound classes as accurately as the first model does. The model checks may also include determining whether the second model by itself (e.g., without the first model and the one or more coupling networks) generates classification results with sufficient accuracy. If the second model satisfies the model checks, the model checker designates the second model as the active sound event classifier. In this circumstance, the first model is discarded or remains unused during sound event classification. If the second model does not satisfy the model checks, the model checker designates the updated sound event classification model (including the first model, the second model, the one or more coupling networks, and links therebetween) as the active sound event classifier. In this circumstance, the first model is retained as part of the updated sound event classification model.
Thus, the model checker enables designation of an active sound event classifier in a manner that conserves computing resources. For example, if the second model alone is sufficiently accurate, the first model and the one or more coupling networks are discarded, which reduces the in-memory footprint of the active sound event classifier. The resulting active sound event classifier (e.g., the second model) is similar in memory footprint to the first model but has improved functionality relative to the first model (e.g., the second model is able to recognize sound classes that the first model cannot, while retaining similar accuracy for the sound classes that the first model can recognize). Relative to using the first model, the second model, and the one or more coupling networks together as the active sound event classifier, using the second model alone as the active sound event classifier uses fewer computing resources, such as less processor time, less power, and less memory. Further, even using the first model, the second model, and the one or more coupling networks together as the active sound event classifier provides users with the ability to generate customized sound event classifiers without retraining from scratch, which saves considerable computing resources, including memory to store a large library of audio data samples for each sound class, power and processing time to train a neural network to perform adequately as a sound event classifier, etc.
Particular aspects of the present disclosure are described below with reference to the drawings. In the description, common features are designated by common reference numbers. As used herein, various terminology is used for the purpose of describing particular implementations only and is not intended to be limiting of implementations. For example, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. Further, some features described herein are singular in some implementations and plural in other implementations.
The terms “comprise,” “comprises,” and “comprising” are used herein interchangeably with “include,” “includes,” or “including.” Additionally, the term “wherein” is used interchangeably with “where.” As used herein, “exemplary” indicates an example, an implementation, and/or an aspect, and should not be construed as limiting or as indicating a preference or a preferred implementation. As used herein, an ordinal term (e.g., “first,” “second,” “third,” etc.) used to modify an element, such as a structure, a component, an operation, etc., does not by itself indicate any priority or order of the element with respect to another element, but rather merely distinguishes the element from another element having a same name (but for use of the ordinal term). As used herein, the term “set” refers to one or more of a particular element, and the term “plurality” refers to multiple (e.g., two or more) of a particular element.
As used herein, “coupled” may include “communicatively coupled,” “electrically coupled,” or “physically coupled,” and may also (or alternatively) include any combinations thereof. Two devices (or components) may be coupled (e.g., communicatively coupled, electrically coupled, or physically coupled) directly or indirectly via one or more other devices, components, wires, buses, networks (e.g., a wired network, a wireless network, or a combination thereof), etc. Two devices (or components) that are electrically coupled may be included in the same device or in different devices and may be connected via electronics, one or more connectors, or inductive coupling, as illustrative, non-limiting examples. In some implementations, two devices (or components) that are communicatively coupled, such as in electrical communication, may send and receive electrical signals (digital signals or analog signals) directly or indirectly, such as via one or more wires, buses, networks, etc. As used herein, “directly coupled” refers to two devices that are coupled (e.g., communicatively coupled, electrically coupled, or physically coupled) without intervening components.
In the present disclosure, terms such as “determining,” “calculating,” “estimating,” “shifting,” “adjusting,” etc. may be used to describe how one or more operations are performed. It should be noted that such terms are not to be construed as limiting and other techniques may be utilized to perform similar operations. Additionally, as referred to herein, “generating,” “calculating,” “estimating,” “using,” “selecting,” “accessing,” and “determining” may be used interchangeably. For example, “generating,” “calculating,” “estimating,” or “determining” a parameter (or a signal) may refer to actively generating, estimating, calculating, or determining the parameter (or the signal) or may refer to using, selecting, or accessing the parameter (or signal) that is already generated, such as by another component or device.
In a particular implementation, the device 100 includes a processor 120 (e.g., a central processing unit (CPU)). The device 100 may include one or more additional processor(s) 132 (e.g., one or more digital signal processors (DSPs)). The processor 120, the processor(s) 132, or both, may be configured to generate sound identification data, to update the active sound event classification model 162, or both.
The active SEC model 162 is a previously trained sound event classification model. For example, before the active SEC model 162 is updated, a base model 104 is designated as the active SEC model 162. In a particular aspect, updating the active SEC model 162 includes generating and training an update model 106, which includes the base model 104, a modified copy of the base model 104 (referred to herein as an incremental model), and one or more coupling networks, as described further below.
After the update model 106 is trained by the model updater 110, the model checker 160 determines whether to discard the base model 104. To illustrate, the model checker 160 determines whether to discard the base model 104 based on an accuracy of sound classes assigned by the incremental model and an accuracy of sound classes assigned by the base model 104. In a particular aspect, if the model checker 160 determines that the incremental model alone is sufficiently accurate (e.g., satisfies an accuracy threshold), the incremental model is designated as the active SEC model 162 and the base model 104 is discarded. If the model checker 160 determines that the incremental model is not sufficiently accurate (e.g., fails to satisfy the accuracy threshold), the update model 106 is designated as the active SEC model 162 and the base model 104 is retained as part of the update model 106. In this context, “discarding” the base model 104 refers to deleting the base model 104 from the memory 130, reallocating a portion of the memory 130 allocated to the base model 104, marking the base model 104 for deletion, archiving the base model 104, moving the base model 104 to another memory location for inactive or unused resources, retaining the base model 104 but not using the base model 104 for sound event classification, or other similar operations.
In some implementations, another computing device, such as the remote computing device 150, trains the base model 104, and the base model 104 is stored on the device 100 as a default model, or the device 100 downloads the base model 104 from the other computing device. In some implementations, the device 100 trains the base model 104. Training the base model 104 entails use of a relatively large set of labeled training data (e.g., the base training data 152).
In a particular implementation, the device 100 is included in a system-in-package or system-on-chip device 144. In a particular implementation, the memory 130, the processor 120, the processor(s) 132, the display controller 112, the CODEC 142, the modem 136, and the transceiver 134 are included in the system-in-package or system-on-chip device 144. In a particular implementation, the input device 122 and a power supply 116 are coupled to the system-on-chip device 144.
The device 100 may include, correspond to, or be included within a voice activated device, an audio device, a wireless speaker and voice activated device, a portable electronic device, a car, a vehicle, a computing device, a communication device, an internet-of-things (IoT) device, a virtual reality (VR) device, an augmented reality (AR) device, a mixed reality (MR) device, a smart speaker, a mobile computing device, a mobile communication device, a smart phone, a cellular phone, a laptop computer, a computer, a tablet, a personal digital assistant, a display device, a television, a gaming console, an appliance, a music player, a radio, a digital video player, a digital video disc (DVD) player, a tuner, a camera, a navigation device, or any combination thereof. In a particular aspect, the processor 120, the processor(s) 132, or a combination thereof, are included in an integrated circuit.
During training (e.g., backpropagation training), the base topology 202 is static and the base parameters 236 are changed.
The base topology 202 includes an input layer 204, one or more hidden layers (hidden layer(s) 206), and an output layer 234. The hidden layer(s) 206 can have various configurations and various numbers of layers depending on the specific implementation.
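In terms of the earlier illustrative sketches, the distinction between the static base topology and the trainable base parameters corresponds roughly to the distinction between a model's layer structure and its stored link weights; the following is illustrative only:

```python
# Continuing the earlier sketches (PyTorch is an illustrative assumption): the
# base topology is the fixed arrangement of layers, while the base parameters /
# link weights are the trainable values. Saving and restoring the link weights
# leaves the topology untouched.
import copy

base_link_weights = base_model.state_dict()       # trained link weight values
restored_model = copy.deepcopy(base_model)        # same base topology
restored_model.load_state_dict(base_link_weights)
```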
As explained above, the update model 106 includes the base model 104, a modified copy of the base model 104 (e.g., the incremental model 302), and one or more coupling networks (e.g., the coupling network(s) 314).
To generate the update model 106, the model updater 110 copies the base model 104 and replaces the output layer 234 of the copy of the base model 104 with a different output layer (e.g., an output layer 322) to form the incremental model 302. The output layer 322 includes an output node for each sound class of the second set of sound classes (e.g., the first set of sound classes plus one or more additional sound classes).
In addition to generating the incremental model 302, the model updater 110 generates one or more coupling network(s) 314. In a particular aspect, the coupling network(s) 314 include a neural adapter 310 and a merger adapter 308. The neural adapter 310 includes one or more adapter layers configured to generate, based on a first output 352 of the base model 104, a third output 356 having a count of data elements corresponding to a second output 354 of the incremental model 302.
The merger adapter 308 is configured to generate output data 318 by merging the third output 356 from the neural adapter 310 and the second output 354 from the incremental model 302. In a particular aspect, the merger adapter 308 includes one or more aggregation layers that merge the third output 356 and the second output 354 to form a merged output and an output layer that generates the output data 318 based on the merged output.
In a particular aspect, the first output 352 is generated by the output layer 234 of the base model 104 (as opposed to by a layer of the base model 104 prior to the output layer 234), and the second output 354 is generated by the output layer 322 of the incremental model 302 (as opposed to by a layer of the incremental model 302 prior to the output layer 322). Stated another way, the coupling network(s) 314 combine classification results generated by the base model 104 and the incremental model 302 rather than combining encodings generated by layers before the output layers 234, 322. Combining the classification results facilitates concurrent training of the incremental model 302 and the coupling network(s) 314 so that the incremental model 302 can be used as a stand-alone sound event classifier if it is sufficiently accurate.
During training, the model updater 110 provides labeled training data 304 as input 350 to the base model 104 and to the incremental model 302. The labeled training data 304 includes one or more of the audio data samples 126 (which correspond to sound classes that the base model 104 is trained to recognize) and one or more audio data samples 128 (which correspond to new sound classes that the base model 104 is not trained to recognize). In response to particular audio data samples of the labeled training data 304, the base model 104 generates the first output 352 that is provided as input to the neural adapter 310. Additionally, in response to the particular audio data samples, the incremental model 302 generates the second output 354 that is provided, along with the third output 356 of the neural adapter 310, to the merger adapter 308. The merger adapter 308 merges the second output 354 and third output 356 to generate a merged output and generates the output data 318 based on the merged output.
The output data 318, the sound event identifier 360, or both, are provided to the model updater 110, which compares the sound event identifier 360 to a label associated, in the labeled training data 304, with the particular audio data samples and calculates updated link weight values (e.g., updated link weights 362) for the incremental model 302 and the coupling network(s) 314 based on the comparison.
After the model updater 110 completes training of the update model 106, the model checker 160 determines whether to discard the base model 104 based on an accuracy of sound classes assigned by the incremental model 302 in the second output 354 and an accuracy of sound classes assigned by the base model 104 in the first output 352. For example, the model checker 160 may compare values of one or more metrics 374 (e.g., F1-scores) that are indicative of the accuracy of sound classes assigned by the incremental model 302 to audio data samples of a first set of sound classes (e.g., the audio data samples 126) as compared to the accuracy of sound classes assigned by the base model 104 to the audio data samples of the first set of sound classes. In this example, the model checker 160 determines whether to discard the base model 104 based on values of the metric(s) 374. For example, if the value of an F1-score determined for the second output 354 is greater than or equal to the value of an F1-score determined for the first output 352, the model checker 160 determines to discard the base model 104. In some implementations, the model checker 160 determines to discard the base model 104 if the value of the F1-score determined for the second output 354 is less than the value of the F1-score determined for the first output 352 by less than a threshold amount.
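A minimal sketch of such a check, continuing the earlier code, might look like the following. The macro-averaged F1-score, the tolerance value, the use of held-out samples of the first set of sound classes, and the scikit-learn metric are all assumptions for illustration:

```python
# Illustrative model check comparing the accuracy of the incremental model
# alone against the base model on samples of the original sound classes.
import torch
from sklearn.metrics import f1_score

def check_and_select(base_model, incremental_model, coupling_networks,
                     features, labels, tolerance=0.0):
    # features: held-out audio data samples of the first set of sound classes
    # labels: array of integer class indices for those samples
    with torch.no_grad():
        base_pred = base_model(features).argmax(dim=1).numpy()         # first output
        incr_pred = incremental_model(features).argmax(dim=1).numpy()  # second output
    base_f1 = f1_score(labels, base_pred, average="macro")
    incr_f1 = f1_score(labels, incr_pred, average="macro")
    if incr_f1 >= base_f1 - tolerance:
        return incremental_model                  # discard base model and coupling networks
    return (base_model, incremental_model, coupling_networks)  # keep the full update model
```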
In some aspects, the model checker 160 determines values of the metric(s) 374 during training of the update model 106. For example, the first output 352 and the second output 354 may be provided to the model checker 160 to determine values of the metric(s) 374 while the update model 106 is undergoing training or validation by the model updater 110. In this example, after training, the model checker 160 designates the active SEC model 162. In some implementations, a value of a metric 374 indicating the accuracy of sound classes assigned by the base model 104 to the audio data samples of the first set of sound classes may be stored in memory (e.g., the memory 130) for later comparison with the corresponding value determined for the incremental model 302.
If the model checker 160 determines to discard the base model 104, the incremental model 302 is designated the active SEC model 162. However, if the model checker 160 determines not to discard the base model 104, the update model 106 is designated the active SEC model 162.
During use (e.g., in an inference mode of operation following a training mode of operation), the SEC engine 108 provides input 450 to the active SEC model 162. The input 450 includes audio data samples 406 for which sound event identification data 460 is to be generated. In a particular example, the audio data samples 406 include, correspond to, or are based on audio captured by the microphone(s) 114 of the device 100.
Additionally, in response to the audio data samples 406, the incremental model 302 generates a second output that is provided to the coupling network(s) 314.
The coupling network(s) 314 generate the sound event identification data 460 that is based on the first output of the base model 104 and the second output of the incremental model 302. For example, the first output of the base model 104 is used to generate a third output that corresponds to the second count of classes of the second set of sound classes, and the third output is merged with the second output of the incremental model 302 to form a merged output. The merged output is processed to generate the sound event identification data 460, which indicates a sound class associated with the audio data samples 406.
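Continuing the earlier sketches, inference with the full update model might reduce to the following; the softmax/argmax post-processing and the class-name lookup are illustrative assumptions:

```python
# Illustrative inference step using the base model, incremental model, and
# coupling networks together as the active SEC model.
import torch

def classify(features, class_names):
    # features: a batch containing one audio data sample, shaped for the model input
    with torch.no_grad():
        base_out = base_model(features)                         # first output
        incr_out = incremental_model(features)                  # second output
        merged_logits = coupling_networks(base_out, incr_out)   # merged output
        probs = torch.softmax(merged_logits, dim=1)
    return class_names[int(probs.argmax(dim=1))]                # sound event identification
```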
Thus, the model checker 160 facilitates use of significantly fewer computing resources when the metric(s) 374 indicate that the base model 104 can be discarded and the incremental model 302 can be used as the active SEC model 162. For example, since the update model 106 includes both the base model 104 and the incremental model 302, more memory is used to store the update model 106 than is used to store only the incremental model 302. Similarly, determining a sound event class associated with particular audio data samples 406 using the update model 106 uses more processor time than determining a sound event class associated with the particular audio data samples 406 using only the incremental model 302.
In a particular aspect, the device 100 is coupled to the screen 502 and provides an output to the screen 502 responsive to the active SEC model 162 detecting or recognizing various events (e.g., sound events) described herein. For example, the device 100 provides the sound event identification data 460 to the screen 502 for display.
In a particular implementation, the sensor(s) 504 include one or more of the microphone(s) 114 of the device 100.
The sensor(s) 606 enable detection of audio data, which the device 100 uses to detect sound events or to update the active SEC model 162. For example, the device 100 uses the active SEC model 162 to generate the sound event identification data 460 which may be provided to the display 604 to indicate that a recognized sound event, such as a car horn, is detected in audio data samples received from the sensor(s) 606. In some implementations, the device 100 can perform an action responsive to recognizing a sound event, such as activating a camera or one of the sensor(s) 606 or providing haptic feedback to the user.
The sensor(s) 704 enable detection of audio data, which the device 100 uses to detect sound events or to update the active SEC model 162. For example, the device 100 provides the sound event identification data 460 as an output to indicate that a recognized sound event is detected in audio data samples received from the sensor(s) 704.
During operation, in response to receiving a verbal command, the voice-controlled speaker system 800 can execute assistant operations. The assistant operations can include adjusting a temperature, playing music, turning on lights, etc. The sensor(s) 804 enable detection of audio data samples, which the device 100 uses to detect sound events or to generate the active SEC model 162. Additionally, the voice-controlled speaker system 800 can execute some operations based on sound events recognized by the device 100. For example, if the device 100 recognizes the sound of a door closing, the voice-controlled speaker system 800 can turn on one or more lights.
During operation, the mobile device 1000 may perform particular actions in response to the device 100 detecting particular sound events. For example, the actions can include sending commands to other devices, such as a thermostat, a home automation system, another mobile device, etc. The sensor(s) 1004 enable detection of audio data, which the device 100 uses to detect sound events or to generate the update model 106.
The method 1400 includes, at block 1402, initializing a second neural network based on a first neural network that is trained to detect a first set of sound classes. For example, the model updater 110 can initialize the incremental model 302 by generating a copy of the input layer 204, hidden layers 206, and base link weights 238 of the base model 104 (e.g., the first neural network) and coupling the copies of the input layer 204 and hidden layers 206 to a new output layer 322 to form the incremental model 302 (e.g., the second neural network).
Thus, the method 1400 facilitates use of transfer learning techniques to generate an updated sound event classification model based on a previously trained sound event classification model. The use of such transfer learning techniques reduces the computing resources (e.g., memory, processor cycles, etc.) used to train a sound event classification model from scratch.
The method 1500 includes, at block 1502, generating a copy of a sound event classification model that is trained to recognize a first set of sound classes. For example, the model updater 110 can generate a copy of the input layer 204, hidden layers 206 and base link weights 238 of the base model 104 (e.g., the first neural network).
The method 1500 includes, at block 1504, modifying the copy to have a new output layer configured to generate output corresponding to a second set of sound classes, the second set of sound classes including the first set of sound classes and one or more additional sound classes. For example, the model updater 110 can couple the copies of the input layer 204 and hidden layers 206 to a new output layer 322 to form the incremental model 302 (e.g., the second neural network). In this example, the incremental model 302 is configured to generate output corresponding to a second set of sound classes (e.g., the first set of sound classes plus one or more additional sound classes).
Thus, the method 1500 facilitates use of transfer learning techniques to generate an updated sound event classification model based on a previously trained sound event classification model. The updated sound event classification model is configured to detect more types of sound events than the base model is. The use of such transfer learning techniques reduces the computing resources (e.g., memory, processor cycles, etc.) used to train a sound event classification model that detects more sound events than previously trained sound event classification models.
The method 1600 includes, at block 1602, generating a copy of a trained sound event classification model that includes an output layer including N output nodes corresponding to N sound classes that the trained sound event classification model is trained to recognize. For example, the model updater 110 can generate a copy of the input layer 204, hidden layers 206, and base link weights 238 of the base model 104 (e.g., the first neural network). In this example, the output layer 234 of the base model 104 includes N nodes, where N corresponds to the number of sound classes that the base model 104 is trained to recognize.
The method 1600 includes, at block 1604, connecting a new output layer to the copy, the new output layer including N+K output nodes corresponding to the N sound classes and K additional sound classes. For example, the model updater 110 can couple the copies of the input layer 204 and hidden layers 206 to a new output layer 322 to form the incremental model 302 (e.g., the second neural network). In this example, the new output layer 322 includes N+K output nodes corresponding to the N sound classes that the base model 104 is trained to recognize and K additional sound classes.
Thus, the method 1600 facilitates use of transfer learning techniques to learn to detect new sound events based on a previously trained sound event classification model. The new sound events include a prior set of sound event classes and one or more additional sound classes. The use of such transfer learning techniques reduces the computing resources (e.g., memory, processor cycles, etc.) used to train from scratch a sound event classification model that detects more sound events than previously trained sound event classification models.
The method 1700 includes, at block 1702, linking an output of the first neural network and an output of the second neural network to one or more coupling networks. For example, the model updater 110 of FIG. 1 may link the output layer 234 of the base model 104 and the output layer 322 of the incremental model 302 to the coupling network(s) 314.
Thus, the method 1700 facilitates use of coupling networks to facilitate transfer learning to learn to detect new sound events based on a previously trained sound event classification model. The use of the coupling networks and transfer learning reduces the computing resources (e.g., memory, processor cycles, etc.) used to train from scratch a sound event classification model that detects more sound events than previously trained sound event classification models.
The method 1800 includes, at block 1802, obtaining one or more coupling networks. For example, the model updater 110 of FIG. 1 may generate or obtain the coupling network(s) 314, which include the neural adapter 310 and the merger adapter 308.
The method 1800 includes, at block 1804, linking an output layer of a first neural network to the one or more coupling networks. For example, the model updater 110 of FIG. 1 may link the coupling network(s) 314 to the base model 104 and the incremental model 302.
The method 1800 includes, at block 1806, linking an output layer of the second neural network to the one or more coupling networks to generate an update model including the first neural network and the second neural network. For example, the model updater 110 of FIG. 1 may link the output layer 322 of the incremental model 302 to the coupling network(s) 314 to generate the update model 106.
Thus, the method 1800 facilitates use of coupling networks and transfer learning to generate a new sound event classification model based on a previously trained sound event classification model. The use of the coupling networks and transfer learning reduces the computing resources (e.g., memory, processor cycles, etc.) used to train the new sound event classification model from scratch.
The method 1900 includes, at block 1902, obtaining a neural adapter including a number of input nodes corresponding to a number of output nodes of a first neural network that is trained to recognize a first set of sound classes. For example, the model updater 110 of FIG. 1 may generate or obtain the neural adapter 310, which has a number of input nodes corresponding to the number of output nodes of the output layer 234 of the base model 104.
The method 1900 includes, at block 1904, obtaining a merger adapter including a number of input nodes corresponding to a number of output nodes of a second neural network. For example, the model updater 110 of FIG. 1 may generate or obtain the merger adapter 308, which has a number of input nodes corresponding to the number of output nodes of the output layer 322 of the incremental model 302.
The method 1900 includes, at block 1906, linking the output nodes of the first neural network to the input nodes of the neural adapter. For example, the model updater 110 of FIG. 1 may link the output nodes of the output layer 234 of the base model 104 to the input nodes of the neural adapter 310.
The method 1900 includes, at block 1908, linking the output nodes of the second neural network and output nodes of the neural adapter to the input nodes of the merger adapter to generate an update network including the first neural network, the second neural network, the neural adapter, and the merger adapter. For example, the model updater 110 of FIG. 1 may link the output nodes of the output layer 322 of the incremental model 302 and the output nodes of the neural adapter 310 to the input nodes of the merger adapter 308 to generate the update model 106.
Thus, the method 1900 facilitates use of a neural adapter and a merger adapter with transfer learning to generate a new sound event classification model based on a previously trained sound event classification model. The use of the neural adapter and the merger adapter with transfer learning reduces the computing resources (e.g., memory, processor cycles, etc.) used to train the new sound event classification model from scratch.
The method 2000 includes, at block 2002, after training of a second neural network and one or more coupling networks that are linked to a first neural network, determining whether to discard the first neural network based on an accuracy of sound classes assigned by the second neural network and an accuracy of sound classes assigned by the first neural network. For example, the model checker 160 determines whether to discard the base model 104 based on the accuracy of sound classes assigned by the incremental model 302 and the accuracy of sound classes assigned by the base model 104, as indicated by the metric(s) 374.
Thus, the method 2000 facilitates designation of an active sound event classifier in a manner that conserves computing resources. For example, if the second neural network alone is sufficiently accurate, the first neural network and the one or more coupling networks are discarded, which reduces an in-memory footprint of the active sound event classifier.
The method 2100 includes, at block 2102, after training of an update model that includes a first neural network and a second neural network, determining whether the second neural network exhibits significant forgetting relative to the first neural network. For example, the model checker 160 determines whether the one or more metrics 374 indicate that the incremental model 302 exhibits significant forgetting of the prior training of the base model 104.
The method 2100 includes, at block 2104, discarding the first neural network based on a determination that the second neural network does not exhibit significant forgetting relative to the first neural network. The model checker 160 discards the base model 104 and the coupling networks 314 in response to determining that the one or more metrics 374 indicate that the incremental model 302 does not exhibit significant forgetting of the prior training of the base model 104.
Thus, the method 2100 facilitates conservation of computing resources when training an updated sound event classifier (e.g., the second neural network). For example, if the second neural network alone is sufficiently accurate, the first neural network and the one or more coupling networks are discarded, which reduces an in-memory footprint of the active sound event classifier.
The method 2200 includes, at block 2202, determining an accuracy metric based on classification results generated by a first model and classification results generated by a second model. For example, the model checker 160 may determine a value of an F1-score or another accuracy metric based on the accuracy of sound classes assigned by the incremental model 302 to audio data samples of a first set of sound classes as compared to the accuracy of sound classes assigned by the base model 104 to the audio data samples of the first set of sound classes.
The method 2200 includes, at block 2204, designating an active sound event classifier, where an update model including the first model and the second model is designated as the active sound event classifier responsive to the accuracy metric failing to satisfy a threshold or the second model is designated the active sound event classifier responsive to the accuracy metric satisfying the threshold. For example, if the value of an F1-score determined for the second output 354 is greater than or equal to the value of an F1-score determined for the first output 352, the model checker 160 designates the incremental model 302 as the active SEC model 162; otherwise, the model checker 160 designates the update model 106 as the active SEC model 162.
Thus, the method 2200 facilitates designation of an active sound event classifier in a manner that conserves computing resources. For example, if the second neural network alone is sufficiently accurate, the first neural network and the one or more coupling networks are discarded, which reduces an in-memory footprint of the active sound event classifier.
In block 2302, the method 2300 includes initializing a second neural network based on a first neural network that is trained to detect a first set of sound classes. For example, the model updater 110 can generate a copy of the input layer 204, hidden layers 206, and base link weights 238 of the base model 104 (e.g., the first neural network) and couple the copies of the input layer 204 and hidden layers 206 to a new output layer 322 to form the incremental model 302 (e.g., the second neural network). In this example, the base model 104 includes the output layer 234 that generates output corresponding to a first count of classes of a first set of sound classes, and the incremental model 302 includes the output layer 322 that generates output corresponding to a second count of classes of a second set of sound classes.
In block 2304, the method 2300 includes linking an output of the first neural network and an output of the second neural network to one or more coupling networks. For example, the model updater 110 of FIG. 1 may link the output layer 234 of the base model 104 and the output layer 322 of the incremental model 302 to the coupling network(s) 314.
In block 2306, the method 2300 includes, after the second neural network and the one or more coupling networks are trained, determining whether to discard the first neural network based on an accuracy of sound classes assigned by the second neural network and an accuracy of sound classes assigned by the first neural network. For example, the model checker 160 determines whether to discard the base model 104 based on values of the metric(s) 374 that compare the accuracy of sound classes assigned by the incremental model 302 to the accuracy of sound classes assigned by the base model 104.
Thus, the method 2300 facilitates conservation of computing resources when training an updated sound event classifier (e.g., the second neural network). For example, if the second neural network alone is sufficiently accurate, the first neural network and the one or more coupling networks are discarded, which reduces an in-memory footprint of the active sound event classifier.
In conjunction with the described implementations, an apparatus includes means for initializing a second neural network based on a first neural network that is trained to detect a first set of sound classes. For example, the means for initializing the second neural network based on the first neural network includes the remote computing device 150, the device 100, the instructions 124, the processor 120, the processor(s) 132, the model updater 110, one or more other circuits or components configured to initialize a second neural network based on a first neural network, or any combination thereof. In some aspects, the means for initializing the second neural network based on the first neural network includes means for generating copies of the input layer and the hidden layers of the first neural network and means for connecting a second output layer to the copies of the input layer and the hidden layers. For example, the means for generating copies of the input layer and the hidden layers of the first neural network and the means for connecting the second output layer to the copies of the input layer and the hidden layers include the remote computing device 150, the device 100, the instructions 124, the processor 120, the processor(s) 132, the model updater 110, one or more other circuits or components configured to generate copies of the input layer and the hidden layers of the first neural network and connect a second output layer to the copies of the input layer and the hidden layers, or any combination thereof.
The apparatus also includes means for linking an output of the first neural network and an output of the second neural network to one or more coupling networks. For example, the means for linking the first neural network and the second neural network to one or more coupling networks includes the remote computing device 150, the device 100, the instructions 124, the processor 120, the processor(s) 132, the model updater 110, one or more other circuits or components configured to link the first neural network and the second neural network to one or more coupling networks, or any combination thereof.
The apparatus also includes means for determining, after the second neural network and the one or more coupling networks are trained, whether to discard the first neural network based on an accuracy of sound classes assigned by the second neural network and an accuracy of sound classes assigned by the first neural network. For example, the means for determining whether to discard the first neural network includes the remote computing device 150, the device 100, the instructions 124, the processor 120, the processor(s) 132, the model updater 110, the model checker 160, one or more other circuits or components configured to determine whether to discard a neural network or to designate an active SEC model, or any combination thereof.
Those of skill would further appreciate that the various illustrative logical blocks, configurations, modules, circuits, and algorithm steps described in connection with the implementations disclosed herein may be implemented as electronic hardware, computer software executed by a processor, or combinations of both. Various illustrative components, blocks, configurations, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or processor-executable instructions depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions are not to be interpreted as causing a departure from the scope of the present disclosure.
The steps of a method or algorithm described in connection with the implementations disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, hard disk, a removable disk, a compact disc read-only memory (CD-ROM), or any other form of non-transient storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor may read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an application-specific integrated circuit (ASIC). The ASIC may reside in a computing device or a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a computing device or user terminal.
Particular aspects of the disclosure are described below in a first set of interrelated clauses:
According to Clause 1, a device includes one or more processors. The one or more processors are configured to initialize a second neural network based on a first neural network that is trained to detect a first set of sound classes and to link an output of the first neural network and an output of the second neural network as input to one or more coupling networks. The one or more processors are configured to, after the second neural network and the one or more coupling networks are trained, determine whether to discard the first neural network based on an accuracy of sound classes assigned by the second neural network and an accuracy of sound classes assigned by the first neural network.
Clause 2 includes the device of Clause 1 wherein the one or more processors are further configured to determine a value of a metric indicative of the accuracy of sound classes assigned by the second neural network to audio data samples of the first set of sound classes as compared to the accuracy of sound classes assigned by the first neural network to the audio data samples of the first set of sound classes, and the one or more processors are configured to determine whether to discard the first neural network further based on the value of the metric.
Clause 3 includes the device of Clause 1 or Clause 2 wherein the output of the first neural network indicates a sound class assigned to particular audio data samples by the first neural network and the output of the second neural network indicates a sound class assigned to the particular audio data samples by the second neural network.
Clause 4 includes the device of any of Clauses 1 to 3 wherein the output of the first neural network includes a first count of data elements corresponding to a first count of sound classes of the first set of sound classes, the output of the second neural network includes a second count of data elements corresponding to a second count of sound classes of a second set of sound classes, and the one or more coupling networks include a neural adapter comprising one or more adapter layers configured to generate, based on the output of the first neural network, a third output having the second count of data elements.
Clause 5 includes the device of Clause 4 wherein the one or more coupling networks include a merger adapter including one or more aggregation layers configured to merge the third output from the neural adapter and the output of the second neural network and including an output layer to generate a merged output.
Clause 6 includes the device of any of Clauses 1 to 5 wherein an output layer of the first neural network includes N output nodes, and an output layer of the second neural network includes N+K output nodes, where N is an integer greater than or equal to one, and K is an integer greater than or equal to one.
Clause 7 includes the device of Clause 6 wherein the N output nodes correspond to N sound event classes that the first neural network is trained to recognize and the N+K output nodes include the N output nodes corresponding to the N sound event classes and K output nodes corresponding to K additional sound event classes.
Clause 8 includes the device of any of Clauses 1 to 7 wherein, prior to initializing the second neural network, the first neural network is designated as an active sound event classifier and the one or more processors are configured to designate the second neural network as the active sound event classifier based on a determination to discard the first neural network.
Clause 9 includes the device of any of Clauses 1 to 8 wherein, prior to initializing the second neural network, the first neural network is designated as an active sound event classifier and the one or more processors are configured to designate the first neural network, the second neural network, and the one or more coupling networks together as the active sound event classifier based on a determination not to discard the first neural network.
Clause 10 includes the device of any of Clauses 1 to 9 wherein the one or more processors are integrated within a mobile computing device.
Clause 11 includes the device of any of Clauses 1 to 9 wherein the one or more processors are integrated within a vehicle.
Clause 12 includes the device of any of Clauses 1 to 9 wherein the one or more processors are integrated within a wearable device.
Clause 13 includes the device of any of Clauses 1 to 9 wherein the one or more processors are integrated within an augmented reality headset, a mixed reality headset, or a virtual reality headset.
Clause 14 includes the device of any of Clauses 1 to 13 wherein the one or more processors are included in an integrated circuit.
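By way of illustration and not limitation, the following sketch shows one possible arrangement of the elements recited in Clauses 1 to 7: a first classifier with N output nodes, a second classifier with N+K output nodes, a neural adapter that expands the first classifier's N-element output to N+K elements, and a merger adapter that aggregates the two outputs into a merged output. PyTorch is assumed as the framework, and the class names, layer sizes, and example values of N, K, and FEATURE_DIM are hypothetical choices made for this sketch rather than features of the disclosure.

```python
# Minimal sketch of the two classifiers and the coupling networks (hypothetical sizes).
import torch
import torch.nn as nn

N, K = 8, 2          # hypothetical counts: N original sound classes, K additional sound classes
FEATURE_DIM = 128    # hypothetical size of an input audio feature vector

class SoundEventClassifier(nn.Module):
    """Feed-forward classifier used for both the first and the second neural network."""
    def __init__(self, num_classes: int):
        super().__init__()
        self.input_layer = nn.Linear(FEATURE_DIM, 256)
        self.hidden_layers = nn.Sequential(nn.ReLU(), nn.Linear(256, 256), nn.ReLU())
        self.output_layer = nn.Linear(256, num_classes)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.output_layer(self.hidden_layers(self.input_layer(features)))

class NeuralAdapter(nn.Module):
    """Adapter layers that expand the first network's N-element output to N + K elements."""
    def __init__(self):
        super().__init__()
        self.adapter_layers = nn.Sequential(
            nn.Linear(N, N + K), nn.ReLU(), nn.Linear(N + K, N + K))

    def forward(self, first_output: torch.Tensor) -> torch.Tensor:
        return self.adapter_layers(first_output)

class MergerAdapter(nn.Module):
    """Aggregation layers that merge the adapter output with the second network's output."""
    def __init__(self):
        super().__init__()
        self.aggregation_layers = nn.Sequential(nn.Linear(2 * (N + K), N + K), nn.ReLU())
        self.output_layer = nn.Linear(N + K, N + K)  # merged logits over the N + K classes

    def forward(self, adapted: torch.Tensor, second_output: torch.Tensor) -> torch.Tensor:
        merged = self.aggregation_layers(torch.cat([adapted, second_output], dim=-1))
        return self.output_layer(merged)

# Example wiring: link both classifier outputs to the coupling networks.
first_net = SoundEventClassifier(N)        # trained on the first set of sound classes
second_net = SoundEventClassifier(N + K)   # initialized from first_net (see the later sketch)
adapter, merger = NeuralAdapter(), MergerAdapter()
features = torch.randn(1, FEATURE_DIM)
merged_output = merger(adapter(first_net(features)), second_net(features))  # shape (1, N + K)
```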
Particular aspects of the disclosure are described below in a second set of interrelated clauses:
According to Clause 15, a method includes initializing a second neural network based on a first neural network that is trained to detect a first set of sound classes and linking an output of the first neural network and an output of the second neural network to one or more coupling networks. The method also includes, after the second neural network and the one or more coupling networks are trained, determining whether to discard the first neural network based on an accuracy of sound classes assigned by the second neural network and an accuracy of sound classes assigned by the first neural network.
Clause 16 includes the method of Clause 15 and further includes determining a value of a metric indicative of the accuracy of sound classes assigned by the second neural network to audio data samples of the first set of sound classes as compared to the accuracy of sound classes assigned by the first neural network to the audio data samples of the first set of sound classes, and wherein a determination of whether to discard the first neural network is further based on the value of the metric.
Clause 17 includes the method of Clause 15 or Clause 16 wherein the second neural network is initialized automatically based on detecting a trigger event.
Clause 18 includes the method of Clause 17 wherein the trigger event is based on encountering a threshold quantity of unrecognized sound classes.
Clause 19 includes the method of Clause 17 or Clause 18 wherein the trigger event is specified by a user setting.
Clause 20 includes the method of any of Clauses 15 to 19 wherein the first neural network includes an input layer, hidden layers, and a first output layer, and wherein initializing the second neural network based on the first neural network includes generating copies of the input layer and the hidden layers of the first neural network and connecting a second output layer to the copies of the input layer and the hidden layers, wherein the first output layer includes a first count of output nodes corresponding to a count of sound classes of the first set of sound classes and the second output layer includes a second count of output nodes corresponding to a count of sound classes of a second set of sound classes.
Clause 21 includes the method of any of Clauses 15 to 20 wherein the output of the first neural network indicates a sound class assigned to particular audio data samples by the first neural network and the output of the second neural network indicates a sound class assigned to the particular audio data samples by the second neural network.
Clause 22 includes the method of Clause 21 wherein the one or more coupling networks are configured to generate merged output that indicates a sound class assigned to the particular audio data samples by the one or more coupling networks based on the output of the first neural network and the output of the second neural network.
Clause 23 includes the method of any of Clauses 15 to 22 and further includes determining a first value indicating the accuracy of sound classes assigned by the first neural network to audio data samples of the first set of sound classes and determining a second value indicating the accuracy of the sound classes assigned by the second neural network to the audio data samples of the first set of sound classes, wherein the determining whether to discard the first neural network is based on a comparison of the first value and the second value.
Clause 24 includes the method of any of Clauses 15 to 23 wherein the output of the first neural network includes a first count of data elements corresponding to a first count of sound classes of the first set of sound classes, the output of the second neural network includes a second count of data elements corresponding to a second count of sound classes of a second set of sound classes, and the one or more coupling networks include a neural adapter including one or more adapter layers configured to generate, based on the output of the first neural network, a third output having the second count of data elements.
Clause 25 includes the method of Clause 24 wherein the one or more coupling networks include a merger adapter including one or more aggregation layers configured to merge the third output from the neural adapter and the output of the second neural network and including an output layer to generate a merged output.
Clause 26 includes the method of any of Clauses 15 to 25 wherein link weights of the first neural network are not updated during the training of the second neural network and the one or more coupling networks.
Clause 27 includes the method of any of Clauses 15 to 26 wherein, prior to initializing the second neural network, the first neural network is designated as an active sound event classifier, and further including designating the second neural network as the active sound event classifier based on a determination to discard the first neural network.
Clause 28 includes the method of any of Clauses 15 to 27 wherein, prior to initializing the second neural network, the first neural network is designated as an active sound event classifier, and further including designating the first neural network, the second neural network, and the one or more coupling networks together as the active sound event classifier based on a determination not to discard the first neural network.
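The following non-limiting sketch outlines one way the update workflow of Clauses 15 to 28 could be carried out: the first network's link weights are frozen while the second network and the coupling networks are trained (Clause 26), the accuracies of the two networks on audio data samples of the first set of sound classes are compared (Clauses 16 and 23), and the comparison determines whether the first network is discarded and which model is designated as the active sound event classifier (Clauses 27 and 28). The sketch reuses the hypothetical classes from the earlier sketch; the data loaders, optimizer settings, and forgetting tolerance are likewise assumptions rather than requirements.

```python
# Hypothetical update procedure: train the second network and coupling networks,
# then decide whether the first network can be discarded.
import torch

def train_and_maybe_discard(first_net, second_net, adapter, merger,
                            update_loader, first_class_loader,
                            epochs: int = 10, forgetting_tolerance: float = 0.02):
    # Clause 26: link weights of the first neural network are not updated.
    for param in first_net.parameters():
        param.requires_grad = False

    trainable = (list(second_net.parameters()) + list(adapter.parameters())
                 + list(merger.parameters()))
    optimizer = torch.optim.Adam(trainable, lr=1e-3)
    loss_fn = torch.nn.CrossEntropyLoss()

    # Train the second network and the coupling networks on labeled update samples.
    for _ in range(epochs):
        for features, labels in update_loader:
            optimizer.zero_grad()
            merged_logits = merger(adapter(first_net(features)), second_net(features))
            loss_fn(merged_logits, labels).backward()
            optimizer.step()

    # Clauses 16 and 23: compare accuracies on samples of the first set of sound classes.
    first_accuracy = _accuracy(first_net, first_class_loader)
    second_accuracy = _accuracy(second_net, first_class_loader)
    discard_first = second_accuracy >= first_accuracy - forgetting_tolerance

    # Clauses 27 and 28: designate the active sound event classifier accordingly.
    if discard_first:
        active_classifier = second_net
    else:
        active_classifier = (first_net, second_net, adapter, merger)
    return discard_first, active_classifier

def _accuracy(classifier, loader) -> float:
    """Fraction of samples for which the classifier's top-scoring class matches the label."""
    correct = total = 0
    with torch.no_grad():
        for features, labels in loader:
            predictions = classifier(features).argmax(dim=-1)
            correct += (predictions == labels).sum().item()
            total += labels.numel()
    return correct / max(total, 1)
```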
Particular aspects of the disclosure are described below in a third set of interrelated clauses:
According to Clause 29, a device includes means for initializing a second neural network based on a first neural network that is trained to detect a first set of sound classes and means for linking an output of the first neural network and an output of the second neural network to one or more coupling networks. The device also includes means for determining, after the second neural network and the one or more coupling networks are trained, whether to discard the first neural network based on an accuracy of sound classes assigned by the second neural network and an accuracy of sound classes assigned by the first neural network.
Clause 30 includes the device of Clause 29 and further includes means for determining a value of a metric indicative of the accuracy of sound classes assigned by the second neural network to audio data samples of the first set of sound classes as compared to the accuracy of sound classes assigned by the first neural network to the audio data samples of the first set of sound classes, and wherein the means for determining whether to discard the first neural network is configured to determine whether to discard the first neural network based on the value of the metric.
Clause 31 includes the device of Clause 29 or Clause 30 wherein the means for determining whether to discard the first neural network is configured to discard the first neural network based on determining that the second neural network does not exhibit significant forgetting relative to the first neural network.
Clause 32 includes the device of any of Clauses 29 to 31 wherein the first neural network includes an input layer, hidden layers, and a first output layer, and wherein the means for initializing the second neural network includes means for generating copies of the input layer and the hidden layers of the first neural network and means for connecting a second output layer to the copies of the input layer and the hidden layers, where the first output layer includes a first count of output nodes corresponding to a count of sound classes of the first set of sound classes and the second output layer includes a second count of output nodes corresponding to a count of sound classes of a second set of sound classes.
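As a further non-limiting illustration, Clauses 20, 32, and 35 describe initializing the second neural network by copying the input layer and hidden layers of the first neural network and connecting a new output layer with additional output nodes. A possible sketch, assuming the hypothetical SoundEventClassifier class from the first sketch, follows.

```python
# Hypothetical initialization of the second network from the first network.
import copy
import torch.nn as nn

def initialize_second_network(first_net, k_new_classes: int):
    n_original = first_net.output_layer.out_features
    # Copies of the input layer and the hidden layers of the first neural network.
    second_net = copy.deepcopy(first_net)
    # Connect a second output layer with N + K output nodes to the copied layers.
    second_net.output_layer = nn.Linear(first_net.output_layer.in_features,
                                        n_original + k_new_classes)
    return second_net
```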
Particular aspects of the disclosure are described below in a fourth set of interrelated clauses:
According to Clause 33, a non-transitory computer-readable storage medium includes instructions that, when executed by a processor, cause the processor to initialize a second neural network based on a first neural network that is trained to detect a first set of sound classes and link an output of the first neural network and an output of the second neural network to one or more coupling networks. The instructions, when executed by the processor, also cause the processor to, after the second neural network and the one or more coupling networks are trained, determine whether to discard the first neural network based on an accuracy of sound classes assigned by the second neural network and an accuracy of sound classes assigned by the first neural network.
Clause 34 includes the non-transitory computer-readable storage medium of Clause 33 and the instructions, when executed by the processor, further cause the processor to determine a value of a metric indicative of the accuracy of sound classes assigned by the second neural network to audio data samples of the first set of sound classes as compared to the accuracy of sound classes assigned by the first neural network to the audio data samples of the first set of sound classes, and wherein a determination of whether to discard the first neural network is further based on the value of the metric.
Clause 35 includes the non-transitory computer-readable storage medium of Clause 33 or Clause 34 wherein the first neural network includes an input layer, hidden layers, and a first output layer, and wherein initializing the second neural network based on the first neural network includes generating copies of the input layer and the hidden layers of the first neural network and connecting a second output layer to the copies of the input layer and the hidden layers, wherein the first output layer includes a first count of output nodes corresponding to a count of sound classes of the first set of sound classes and the second output layer includes a second count of output nodes corresponding to a count of sound classes of a second set of sound classes.
Clause 36 includes the non-transitory computer-readable storage medium of any of Clauses 33 to 35 wherein the output of the first neural network indicates a sound class assigned to particular audio data samples by the first neural network and the output of the second neural network indicates a sound class assigned to the particular audio data samples by the second neural network.
Clause 37 includes the non-transitory computer-readable storage medium of Clause 36 wherein the one or more coupling networks are configured to generate merged output that indicates a sound class assigned to the particular audio data samples by the one or more coupling networks based on the output of the first neural network and the output of the second neural network.
Clause 38 includes the non-transitory computer-readable storage medium of any of Clauses 33 to 37 and the instructions, when executed by the processor, further cause the processor to determine a first value indicating the accuracy of sound classes assigned by the first neural network to audio data samples of the first set of sound classes and determine a second value indicating the accuracy of the sound classes assigned by the second neural network to the audio data samples of the first set of sound classes, wherein the determination whether to discard the first neural network is based on a comparison of the first value and the second value.
Clause 39 includes the non-transitory computer-readable storage medium of any of Clauses 33 to 38 wherein the output of the first neural network includes a first count of data elements corresponding to a first count of sound classes of the first set of sound classes, the output of the second neural network includes a second count of data elements corresponding to a second count of sound classes of a second set of sound classes, and the one or more coupling networks include a neural adapter including one or more adapter layers configured to generate, based on the output of the first neural network, a third output having the second count of data elements.
Clause 40 includes the non-transitory computer-readable storage medium of Clause 39 wherein the one or more coupling networks include a merger adapter including one or more aggregation layers configured to merge the third output from the neural adapter and the output of the second neural network and including an output layer to generate a merged output.
Clause 41 includes the non-transitory computer-readable storage medium of any of Clauses 33 to 40 wherein link weights of the first neural network are not updated during the training of the second neural network and the one or more coupling networks.
Clause 42 includes the non-transitory computer-readable storage medium of any of Clauses 33 to 41 wherein, prior to initializing the second neural network, the first neural network is designated as an active sound event classifier, and wherein the instructions, when executed by the processor, further cause the processor to designate the second neural network as the active sound event classifier based on a determination to discard the first neural network.
Clause 43 includes the non-transitory computer-readable storage medium of any of Clauses 33 to 42 wherein, prior to initializing the second neural network, the first neural network is designated as an active sound event classifier, and wherein the instructions, when executed by the processor, further cause the processor to designate the first neural network, the second neural network, and the one or more coupling networks together as the active sound event classifier based on a determination not to discard the first neural network.
The previous description of the disclosed aspects is provided to enable a person skilled in the art to make or use the disclosed aspects. Various modifications to these aspects will be readily apparent to those skilled in the art, and the principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope possible consistent with the principles and novel features as defined by the following claims.