The subject technology relates to systems and methods for reducing background noise and in particular, for deploying machine learning models to detect and attenuate unwanted background noises (audio artifacts) in teleconference and videoconference settings.
Passive noise control techniques such as earplugs, thick walls, and sound-absorbing ceiling tiles are well known. However, such passive solutions are undesirable for many situations in which noise cancellation or suppression is desired as they can be uncomfortable, bulky, unsightly, or ineffective. More recently, active noise cancellation (ANC) techniques have been developed whereby a speaker emits a sound wave designed to cancel out offensive noise via destructive interference.
However, legacy ANC technologies are limited in applicability. They are suitable only for small enclosed spaces, such as headphones, or for continuous or highly periodic low-frequency sounds, such as machinery noise. Further, due in part to a dependency on complex signal processing algorithms, many ANC technologies are limited to cancelling noise that comprises a small range of predictable frequencies (e.g., relatively steady-state and low-frequency noise).
Certain features of the subject technology are set forth in the appended claims. However, the accompanying drawings, which are included to provide further understanding, illustrate disclosed aspects and together with the description serve to explain the principles of the subject technology. In the drawings:
The detailed description set forth below is intended as a description of various configurations of the subject technology and is not intended to represent the only configurations in which the technology can be practiced. The appended drawings are incorporated herein and constitute a part of the detailed description. The detailed description includes specific details for the purpose of providing a more thorough understanding of the technology; however, it will be clear and apparent that the subject technology is not limited to the specific details set forth herein and may be practiced without these details. In some instances, structures and components are shown in block diagram form in order to avoid obscuring certain concepts.
Existing active noise cancellation techniques are ineffective in many situations where it is desirable to reduce noise, such as in public spaces, outdoor areas, or near highways or airports, etc. Passive noise reduction (e.g., noise blocking/absorption) is typically used for these situations, but passive approaches have limited bandwidth and, when used incorrectly, can result in acoustically unpleasant conditions, such as an overly damped (“dead”) sounding room. Thus, passive and currently available active noise cancellation techniques are unsuitable for many situations where noise cancellation is desirable.
In some environments, it is possible to use machine learning (ML) models (classifiers) to identify and selectively eliminate undesired noises on an audio channel. However, high-accuracy ML models are computationally expensive to deploy and can therefore be difficult to implement in real-time and on light-weight computing devices used for transmitting audio communications, such as smartphones and Internet Protocol (IP) telephony devices.
The disclosed technology addresses the foregoing limitations of ML noise filtering techniques by providing a multi-layered, ML-based solution for detecting and attenuating unwanted audible features (background noises or audio artifacts). Aspects of the technology address the limitations of deploying high-accuracy ML models by utilizing computationally inexpensive (light-weight) preliminary classifiers to reduce the detection of false positives. By realizing significant reductions in false-positive noise detections, additional higher-accuracy ML models can be implemented to accurately classify remaining background (sound) features.
A process of the disclosed technology can include a computer-implemented method for receiving a first set of audio segments from an audio capture device, analyzing the first set of audio segments using a first machine learning model to identify a first probability that one or more background (noise) features exist in the first set of audio segments, and if the first probability exceeds a first predetermined threshold, analyzing the first set of audio segments using a second machine learning model to determine a second probability that the one or more background features exist in the first set of audio segments. In some aspects, the process can further include steps for attenuating at least one of the one or more background features if the second probability exceeds a second predetermined threshold.
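By way of illustration only, the following Python sketch arranges the steps above into a single screening routine. The model interface (a noise_probability method), the helper names, and the threshold values are assumptions introduced for illustration; the claims leave all of them implementation-defined.

```python
# Minimal sketch of the two-stage screening described above, assuming each
# model exposes noise_probability(segment) -> float and that attenuate() is
# a separately provided signal-processing routine (both are hypothetical).

FIRST_THRESHOLD = 0.30    # first predetermined threshold (example value)
SECOND_THRESHOLD = 0.80   # second predetermined threshold (example value)

def screen_segment(segment, first_model, second_model, attenuate):
    """Return the segment untouched, or attenuated if both models agree."""
    p1 = first_model.noise_probability(segment)    # cheap preliminary pass
    if p1 <= FIRST_THRESHOLD:
        return segment                             # unlikely to be noise: ignore
    p2 = second_model.noise_probability(segment)   # accurate, costlier pass
    if p2 <= SECOND_THRESHOLD:
        return segment                             # likely normal speech: ignore
    return attenuate(segment, p2)                  # confirmed background noise
```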
Using machine learning (ML) models, it is possible to accurately detect (classify) unwanted audio artifacts on an audio channel, and to perform the signal processing necessary to attenuate the noises in a manner that is undetectable by the human ear. As such, ML models can be used to identify unwanted background noises (e.g., sirens, typing sounds, crying babies, etc.), and to remove the noises from communication channels, such as in teleconference or videoconference settings. As discussed above, one limitation of conventional ML approaches to noise mitigation is that fast and accurate noise classification can be computationally expensive, making it difficult to deploy such technologies in-line with legacy telephony equipment.
The disclosed technology addresses the computational limitations of deploying high-accuracy ML models by using a multi-layered approach. As discussed in further detail below, sounds having a low-probability of being background noises can be quickly filtered using a light-weight preliminary (first) ML model. By reducing the set of total sound events to be processed/classified, higher-probability background events can be efficiently screened using a subsequent (second) ML model that is more accurate and robust than the first ML model. As discussed in further detail below, noise filtering using a multi-layered ML approach can be implemented based on assigned classification probabilities.
In some approaches, probabilities are assigned to audible events, e.g., designating their respective probability of constituting unwanted background noises. Audible events associated with noise-classification probabilities below a predetermined threshold can be ignored, whereas events with noise-classification probabilities above the threshold are provided to a second (more accurate) ML model for additional filtering. The second ML model, which performs a more accurate (and computationally expensive) classification, can be used to assign a second noise-classification probability to each background feature. Subsequently, those features associated with noise-classification probabilities exceeding a second threshold can be selected for removal/attenuation.
Attenuation for positively identified noise events (unwanted background artifacts) can be based on associated event probabilities. For example, sound events associated with higher noise-classification probabilities can be more greatly attenuated than events associated with lower noise-classification probabilities. Additionally, as further discussed below, attenuated noises can be buffered, such that at a time when desired sounds (e.g., user speech) are detected, the desired sound can be inserted into the audio channel, for example, using “time squeezing,” and played at a normal volume level. As such, sounds at the beginning of words or sentences are not inadvertently attenuated, thereby improving the overall intelligibility of human speech.
It is understood that the described techniques can be applied to a variety of machine learning and/or classification algorithms, and that the scope of the technology is not limited to a specific machine learning implementation. By way of example, implementations of the technology can include the deployment of multi-layered ML models based on one or more classification algorithms, including but not limited to: a Multinomial Naive Bayes classifier, a Bernoulli Naive Bayes classifier, a Perceptron classifier, a Stochastic Gradient Descent (SGD) Classifier, and/or a Passive Aggressive Classifier, or the like.
In some aspects, ML models can be configured to perform various types of regression, for example, using one or more regression algorithms, including but not limited to: a Stochastic Gradient Descent Regressor, and/or a Passive Aggressive Regressor, etc. ML models can also be based on clustering algorithms (e.g., a Mini-batch K-means clustering algorithm), a recommendation algorithm (e.g., a Min-wise Hashing algorithm, or Euclidean LSH algorithm), and/or an anomaly detection algorithm, such as a Local Outlier Factor algorithm. Additionally, ML models can employ a dimensionality reduction approach, such as one or more of: a Mini-batch Dictionary Learning algorithm, an Incremental Principal Component Analysis (PCA) algorithm, a Latent Dirichlet Allocation algorithm, and/or a Mini-batch K-means algorithm, etc.
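For concreteness, several of the estimators named above exist under the same names in scikit-learn; the sketch below shows one possible instantiation. The choice of library, and of a logistic-loss SGD classifier for the first stage, are assumptions for illustration, not requirements of the disclosure.

```python
# One possible scikit-learn realization of classifiers named above
# (an assumption; the disclosure is library-agnostic).
from sklearn.linear_model import (Perceptron, SGDClassifier,
                                  PassiveAggressiveClassifier)
from sklearn.naive_bayes import BernoulliNB, MultinomialNB

# A logistic-loss SGD classifier suits the light-weight first stage: it
# supports streaming updates via partial_fit and probability estimates
# via predict_proba.
first_model = SGDClassifier(loss="log_loss")

# Other named candidates, selectable per accuracy/cost requirements.
candidates = [MultinomialNB(), BernoulliNB(), Perceptron(),
              PassiveAggressiveClassifier()]
```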
As further illustrated in
Conference assistant device 132 is configured to coordinate with the other devices in conference room 130 to start and maintain a conferencing session. For example, conference assistant device 132 may interact with portable device 142 associated with one or more users to facilitate a conferencing session, either directly or via networks 110a/b.
Portable device 142 may be, for example, a smart phone, tablet, laptop, or other computing device. Portable device 142 may have an operating system and run one or more collaboration service applications that facilitate conferencing or collaboration, and interaction with conference assistant device 132. In practice, networks 110a/b can be configured to support communications between users of any of devices 122₁, 122₂, . . . 122ₙ, and 142. In some approaches, acoustic attenuation system 120 is configured to identify and mitigate unwanted background noises on an audio channel provided by networks 110a/b during such communications.
In particular, acoustic attenuation system 120 can include hardware and software modules necessary to implement a multi-layered machine-learning noise mitigation process of the disclosed technology. Attenuation system 120 can be configured to intercept real-time audio segments of audio information transmitted between two or more of devices 122₁, 122₂, . . . 122ₙ, and 142. The audio segments are analyzed using a preliminary (first) ML model, which assigns a first (noise-classification) probability to background features (sounds) in the audio segments. The noise-classification probability for each background feature corresponds with a probability that the associated feature (noise) is an undesired audio artifact.
In some aspects, the first probabilities calculated for each sound event can be used to filter low-probability background features, i.e., to remove sounds that have a low probability of being background noises. To perform filtering, each sound event associated with a probability less than a predetermined threshold amount can be ignored. By way of example, sound events that fall below a 30% chance of constituting unwanted background noises can be ignored. On the other hand, sound events associated with a probability that is greater than the predetermined threshold may be provided to a secondary ML model. Further to the above example, sound events having a greater than 30% chance of constituting unwanted background noises may be provided to a second ML model.
As discussed in further detail below, the second ML model can provide higher accuracy classification as compared to the first ML model. The second ML model, therefore, can process each sound event and assign a second probability to each event, i.e., corresponding with a probability that the event constitutes an unwanted background noise. As in the above example, sound events associated with probabilities exceeding a second (predetermined) threshold can be selected for attenuation, whereas sound events that do not exceed the second threshold can be identified as constituting normal speech and ignored.
The audio segments each represent an interval of sound data, such as 1-second audio clips. Audio segments can be sampled from audio information passing over a communications channel, for example, as between two or more of devices 122₁, 122₂, . . . 122ₙ, and 142, discussed above. In some aspects, the audio segments represent samples taken at sliding time intervals, such as 1-second segments sampled every 10 ms. Audio segment lengths and sampling rates (temporal segment spacing) can vary, depending on the desired implementation.
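A minimal sketch of that sliding-window sampling follows, assuming the channel audio is already decoded into a one-dimensional NumPy array; the 1-second window and 10 ms hop are the example values from the text.

```python
import numpy as np

def sliding_segments(audio: np.ndarray, sample_rate: int,
                     segment_s: float = 1.0, hop_s: float = 0.010):
    """Yield overlapping segments (e.g., 1-second windows every 10 ms)."""
    seg_len = int(segment_s * sample_rate)
    hop_len = int(hop_s * sample_rate)
    for start in range(0, len(audio) - seg_len + 1, hop_len):
        yield audio[start:start + seg_len]
```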
After the real-time audio segments are generated, the segments are provided to a first ML model (204). The first ML model can be implemented using software and/or hardware modules deployed on any device coupled to an audio channel, or otherwise configured to receive audio segments from the audio channel. In some approaches, the first ML model is a relatively light-weight (computationally inexpensive) classifier configured to quickly evaluate audio features contained in the received audio segments. As discussed above, the first ML model can be configured to associate background (sound) features detected in the audio segments with probabilistic indicators that those features represent unwanted background noises. As such, the first ML model can function as a classifier-based filter.
In some aspects, the first ML model can associate each background feature in the audio segments with a probability score, such as 0.05, to indicate a 5% chance that the background feature represents an undesired background noise, or 0.80 to indicate an 80% chance that the background feature represents an undesired noise. In some aspects, probability scores may be appended to audio segment data, for example, as metadata tags. In other aspects, probability scores may be stored to an associative memory structure, such as a database or table.
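Both bookkeeping options are straightforward; the sketch below pairs a metadata-tag representation with an associative table, using names invented for illustration.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class ScoredSegment:
    """Audio segment carrying its first-stage score as a metadata tag."""
    samples: np.ndarray
    noise_probability: float   # e.g., 0.05 (5% chance) or 0.80 (80% chance)

# Alternative: an associative structure keyed by segment index.
score_table: dict[int, float] = {}
```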
Subsequently, the audio segments are evaluated to determine if the detected background features can be ignored, e.g., if they have a low probability of constituting unwanted background noises (206). Filtering of irrelevant background features is performed using a probability threshold. For example, probability scores assigned by the first ML model can be compared against a (first) predetermined threshold. Background events associated with probability scores below the (first) threshold are deemed to have a low chance of being unwanted background noises, and can be ignored (return to 202). However, background events associated with probability scores above the (first) threshold can be selected for further analysis using a second ML model (208).
The second ML model can be a classifier having greater noise detection accuracy than the first ML model. As such, the second ML model can be computationally more expensive to operate. However, the second ML model receives a smaller total number of background features because a significant number are filtered by the first ML model. As a result, the second ML model can be deployed and implemented in real-time communications, without disrupting or distorting audio exchange. The second ML model can be configured to analyze received audio segments and to assign a second set of probability scores to each identified background (sound) event, e.g., to quantitatively indicate a probability that the corresponding event is an unwanted background noise (208).
Probabilities assigned by the second ML model for each of the background sound features can be compared to a second threshold to determine if noise attenuation should be performed for that background feature (210). If the probability associated with the background feature is less than the second threshold, then the sound may be ignored, i.e., no action is taken (202). Alternatively, if the probability associated with the background feature is greater than the second predetermined threshold, then the background event may be reduced in volume (dB) using an attenuation module (212). In some approaches, the attenuation module can perform on-the-fly signal processing for the corresponding audio segment such that there is no loss or distortion in audio quality. As such, background noises can be effectively filtered in real-time (or near real-time) such that user experience is improved by removal of extraneous background noises, but not negatively affected by audio delays.
It is understood that the thresholds (e.g., the first threshold and second threshold) may be automatically configured, or manually set, for example, by a system administrator or by default system settings. In some aspects, the first/second threshold may be tuned or adjusted based on considerations of accuracy and user experience. The amount of amplitude attenuation for a particular background sound feature can be based on the probability assigned by the second ML model. That is, for background features for which there is a high-confidence that the sound is an unwanted background noise (a high probability), attenuation can be greater than for background features for which there is a lower-confidence (lower associated probability).
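One plausible probability-to-attenuation mapping is sketched below. The linear form and the 20 dB ceiling are assumptions chosen to agree with the worked example that follows (a 0.95 probability yields 19 dB; 0.6 yields 12 dB); the disclosure itself requires only that higher probabilities produce greater attenuation.

```python
import numpy as np

def attenuate(segment: np.ndarray, noise_probability: float,
              max_db: float = 20.0) -> np.ndarray:
    """Attenuate more aggressively as classifier confidence increases."""
    attenuation_db = max_db * noise_probability   # assumed linear mapping
    gain = 10.0 ** (-attenuation_db / 20.0)       # dB cut -> amplitude factor
    return segment * gain
```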
As discussed in further detail with respect to
The example of
As discussed in further detail below, noise attenuation can be performed on the fly (in real-time) or performed after a pre-determined offset. Delaying attenuation of background noises can help preserve speech quality, for example, by avoiding attenuation of sounds that occur at the beginning of words or syllables. By way of example, in time frame 6 (column 303), the probability of unwanted background noise for sounds in the corresponding time frame is 0.95, or 95%. Due to an attenuation delay between time frames 6 and 7 (column 304), an attenuation of 19 dB is applied in time frame 7, based on the noise probability calculated in time frame 6 (e.g., 95%). As such, in time frame 7, the raw volume of 70 dB is reduced to an output volume of 51 dB.
In this approach, sounds existing in frame 6 (303) and frame 7 (304) can be buffered. Thus, if sounds in frame 7 (304) are attenuated, but include sounds that should not be attenuated, such as normal speech sounds, then the frame can be inserted into the audio channel with the delay (and attenuation) removed. That is, the original sounds contained in frame 7 can be preserved and provided at full volume in their proper chronological time, favoring the preservation of purposeful sound events over the attenuation of unwanted background noise.
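A sketch of that buffering scheme follows; the one-frame delay and the class interface are assumptions for illustration.

```python
from collections import deque

class DelayedAttenuator:
    """Hold frames briefly so an attenuated frame can be swapped back to
    its unattenuated original if it turns out to carry speech."""

    def __init__(self, delay_frames: int = 1):
        self._buffer = deque(maxlen=delay_frames)

    def push(self, raw_frame, attenuated_frame):
        self._buffer.append((raw_frame, attenuated_frame))

    def pop(self, contains_speech: bool):
        raw, attenuated = self._buffer.popleft()
        # Favor preserving purposeful sound over suppressing noise.
        return raw if contains_speech else attenuated
```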
In another example, time frame 10 (305) and time frame 11 (306) correspond with sounds that have low probabilities of constituting unwanted background noises, relative to time frames 6 and 7. For example, in time frame 10 (305), the probability of the corresponding sound constituting unwanted background noise is 0.6, or 60%. As a result, attenuation of the sound volume applied in time frame 11 (306) is only 12 dB, reducing the raw volume for that time frame from 50 dB to 38 dB at output. In some approaches, where the noise probability is exceedingly low, no attenuation is performed. For example, as illustrated in table 302, the noise probability in frame 11 (306) is very low, i.e., 0.01 or 1%, and the resulting noise attenuation in subsequent frame 12 is 0 dB.
The probability of noise for each time frame in table 302 can be a numeric value assigned to an audio segment (time frame) using a ML model, such as second ML model 208, discussed above. However, it is understood that implementations of the disclosed noise reduction technology are not limited to two-layer ML architectures. For example, three or more ML models or classifiers can be implemented, without departing from the scope of the technology.
In practice, audio signals, such as audio segments received from an audio capture device, are provided by audio input module 402 to noise detector 404 and signal delay module 408. Noise detector 404 can be configured to implement a multi-layered ML model, discussed above. For example, noise detector 404 may include software, firmware, and/or hardware used to implement a process similar to process 200, discussed with respect to
Detection of unwanted background noises (e.g., the identification of background features with high noise probabilities) can be indicated to attenuation and delay controller 406 by noise detector 404. Attenuation and delay controller 406 is configured to provide control signals to delay module 408 and amplifier 410 to selectively delay and attenuate background features that are identified as high-probability background noises. For example, for a given audio segment, attenuation of an identified background noise can be performed by delay module 408 and amplifier 410, such that volume attenuation is performed gradually. By increasing attenuation over time, device 400 can help to maintain the fidelity of natural human speech sounds, for example, by preserving the volume of sounds occurring at the beginning of a word or syllable.
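The gradual attenuation could be realized as a per-sample gain ramp, as sketched below; the linear-in-dB shape is an assumption, since the text states only that attenuation increases over time.

```python
import numpy as np

def ramp_gain(n_samples: int, target_db: float) -> np.ndarray:
    """Per-sample gain easing from 0 dB down to the target attenuation,
    so sounds at the start of a word or syllable keep their volume."""
    db_curve = np.linspace(0.0, target_db, n_samples)
    return 10.0 ** (-db_curve / 20.0)

# Usage sketch: attenuated = frame * ramp_gain(len(frame), 19.0)
```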
Device 500 can be a distributed system in which the functions described in this disclosure are distributed within a datacenter, across multiple datacenters, over a peer network, etc. Depending on the desired implementation, one or more of the described system components can represent one or more such components, each performing some or all of the functions for which the component is described. Additionally, the components can be physical or virtual devices, such as virtual machines (VMs) or networking containers.
Device 500 includes at least one processing unit (CPU or processor) 510 and connection 505 that couples various system components, including system memory 515, such as read only memory (ROM) and random access memory (RAM), to processor 510. Device 500 can include a cache of high-speed memory connected directly with, in close proximity to, or integrated as part of processor 510. Processor 510 can include any general purpose processor and a hardware service or software service, such as services 532, 534, and 536 stored in storage device 530, configured to control processor 510 as well as a special-purpose processor where software instructions are incorporated into the actual processor design. Processor 510 can be a self-contained computing system, containing multiple cores or processors, a bus, memory controller, cache, etc. A multi-core processor can be symmetric or asymmetric.
To enable user interaction, computing system 500 includes an input device 545, which can represent any number of input mechanisms, such as a microphone for speech, a touch-sensitive screen for gesture or graphical input, keyboard, mouse, motion input, speech, etc. Computing system 500 can also include output device 535, which can be one or more of a number of output mechanisms known to those of skill in the art. In some instances, multimodal systems can enable a user to provide multiple types of input/output to communicate with computing system 500.
Device 500 can include communications interface 540, which can generally govern and manage the user input and system output. Communications interface 540 can include one or more wired or wireless network interfaces, for example, that are configured to facilitate network communications between one or more computer networks and device 500. There is no restriction on operating on any particular hardware arrangement, and therefore the basic features here may easily be substituted for improved hardware or firmware arrangements as they are developed. Storage device 530 can be a non-volatile memory device and can be a hard disk or other types of computer readable media which can store data that are accessible by a computer, such as magnetic cassettes, flash memory cards, solid state memory devices, digital versatile disks, cartridges, random access memories (RAMs), read only memory (ROM), and/or some combination of these devices. Storage device 530 can include software services, servers, services, etc., such that, when the code that defines such software is executed by processor 510, the system performs a corresponding function. In some aspects, a hardware service that performs a particular function can include the software component stored in a computer-readable medium in connection with the necessary hardware components, such as processor 510, connection 505, output device 535, etc., to carry out the function.
For clarity of explanation, the illustrative system embodiment is presented as including individual functional blocks, including functional blocks labeled as a “processor” or processor 510. The functions these blocks represent may be provided through the use of either shared or dedicated hardware, including, but not limited to, hardware capable of executing software, and hardware, such as a processor 510, that is purpose-built to operate as an equivalent to software executing on a general purpose processor. For example, the functions of one or more processors may be provided by a single shared processor or multiple processors. (Use of the term “processor” should not be construed to refer exclusively to hardware capable of executing software.) Illustrative embodiments may include microprocessor and/or digital signal processor (DSP) hardware, read-only memory (ROM) for storing software performing the operations discussed below, and random access memory (RAM) for storing results. Very large scale integration (VLSI) hardware embodiments, as well as custom VLSI circuitry in combination with a general purpose DSP circuit, may also be provided.
The logical operations of the various embodiments are implemented as: (1) a sequence of computer-implemented steps, operations, or procedures running on a programmable circuit within a general-use computer; (2) a sequence of computer-implemented steps, operations, or procedures running on a specific-use programmable circuit; and/or (3) interconnected machine modules or program engines within the programmable circuits. The system 400 can practice all or part of the recited methods, can be a part of the recited systems, and/or can operate according to instructions in the recited non-transitory computer-readable storage media. Such logical operations can be implemented as modules configured to control the processor to perform particular functions according to the programming of the module.
It is understood that any specific order or hierarchy of steps in the processes disclosed is an illustration of exemplary approaches. Based upon design preferences, it is understood that the specific order or hierarchy of steps in the processes may be rearranged, or that only a portion of the illustrated steps be performed. Some of the steps may be performed simultaneously. For example, in certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of various system components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
The previous description is provided to enable any person skilled in the art to practice the various aspects described herein. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects. Thus, the claims are not intended to be limited to the aspects shown herein, but are to be accorded the full scope consistent with the language of the claims, wherein reference to an element in the singular is not intended to mean “one and only one” unless specifically so stated, but rather “one or more.”
A phrase such as an “aspect” does not imply that such aspect is essential to the subject technology or that such aspect applies to all configurations of the subject technology. A disclosure relating to an aspect may apply to all configurations, or one or more configurations. A phrase such as an aspect may refer to one or more aspects and vice versa. A phrase such as a “configuration” does not imply that such configuration is essential to the subject technology or that such configuration applies to all configurations of the subject technology. A disclosure relating to a configuration may apply to all configurations, or one or more configurations. A phrase such as a configuration may refer to one or more configurations and vice versa.
The word “exemplary” is used herein to mean “serving as an example or illustration.” Any aspect or design described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other aspects or designs.
This application is a continuation of U.S. patent application Ser. No. 16/012,565 filed on Jun. 19, 2018, the contents of which are incorporated by reference in their entirety.