The accuracy of signal analysis systems can directly depend on the amount of information contained in the signals being analyzed. Thus signals may be transmitted to analysis components in file formats having a large file size. These large file sizes may demand high quantities of bandwidth and subject the signals to increased risk of interruption when transmitted over networks.
In accordance with an implementation of this disclosure, at least a first neural network layer of a neural network of a first device may determine a first signal difference between a signal characteristic of a first audio signal and a signal characteristic of a second audio signal. At least a second neural network layer of the neural network may compress the first audio signal and the second audio signal into a third audio signal based on the first signal difference. The first device may provide, to a second device, the first audio signal and the third audio signal.
In accordance with an implementation of this disclosure, a non-transitory, computer-readable medium may store instructions that, when executed by a processor, cause the processor to perform operations. The operations may include determining, by at least a first neural network layer of a neural network of a first device, a first signal difference between a signal characteristic of a first audio signal and a signal characteristic of a second audio signal. The operations may include compressing, by at least a second neural network layer of the neural network and based on the first signal difference, the first audio signal and the second audio signal into a third audio signal. The operations may include providing, by the first device to a second device, the first audio signal and the third audio signal.
In accordance with an implementation of this disclosure, a first device may include a processor and a non-transitory, computer-readable medium in communication with the processor and storing instructions that, when executed by the processor, cause the processor to perform operations. The operations may include determining, by at least a first neural network layer of a neural network of a first device, a first signal difference between a signal characteristic of a first audio signal and a signal characteristic of a second audio signal. The operations may include compressing, by at least a second neural network layer of the neural network and based on the first signal difference, the first audio signal and the second audio signal into a third audio signal. The operations may include providing, to a second device, the first audio signal and the third audio signal.
In accordance with an implementation of this disclosure, a means may be provided for determining, by at least a first neural network layer of a neural network, a first signal difference between a signal characteristic of a first audio signal and a signal characteristic of a second audio signal. The means may provide for compressing, by at least a second neural network layer of the neural network and based on the first signal difference, the first audio signal and the second audio signal into a third audio signal. The means may provide for providing, to a second device, the first audio signal and the third audio signal.
In accordance with an implementation of this disclosure, a first device may generate, based on a first audio signal and a second audio signal, a third audio signal. At least a first neural network layer of a neural network of the first device may determine a first signal difference between a signal characteristic of the first audio signal and a signal characteristic of the third audio signal. At least the first neural network layer may determine a second signal difference between a signal characteristic of the second audio signal and a signal characteristic of the third audio signal. At least a second neural network layer of the neural network may compress the first audio signal and the second audio signal into a fourth audio signal based on the first signal difference and the second signal difference. The first device may provide the third audio signal and the fourth audio signal to a second device.
In accordance with an implementation of this disclosure, a non-transitory, computer-readable medium may store instructions that, when executed by a processor, cause the processor to perform operations. The operations may include generating, by a first device based on a first audio signal and a second audio signal, a third audio signal. The operations may include determining, by at least a first neural network layer of a neural network of the first device, a first signal difference between a signal characteristic of the first audio signal and a signal characteristic of the third audio signal. The operations may include determining, by at least the first neural network layer of the neural network, a second signal difference between a signal characteristic of the second audio signal and a signal characteristic of the third audio signal. The operations may include compressing, by at least a second neural network layer of the neural network based on the first signal difference and the second signal difference, the first audio signal and the second audio signal into a fourth audio signal. The operations may include providing, by the first device to a second device, the third audio signal and the fourth audio signal.
In accordance with an implementation of this disclosure, a means may be provided for generating, based on a first audio signal and a second audio signal, a third audio signal. The means may provide for determining, by at least a first neural network layer of a neural network, a first signal difference between a signal characteristic of the first audio signal and a signal characteristic of the third audio signal. The means may provide for determining, by at least the first neural network layer, a second signal difference between a signal characteristic of the second audio signal and a signal characteristic of the third audio signal. The means may provide for compressing, by at least a second neural network layer of the neural network based on the first signal difference and the second signal difference, the first audio signal and the second audio signal into a fourth audio signal. The means may provide for providing, to a second device, the third audio signal and the fourth audio signal.
In accordance with an implementation of this disclosure, a first neural network executing on one or more first computing devices may determine multiple signal differences between one or more signal characteristics of a first audio signal of a first set of audio signals and one or more signal characteristics of one or more other audio signals of the first set of audio signals. The first neural network may compress the first set of audio signals into a compressed audio signal based on the multiple signal differences. The one or more first computing devices may provide the first audio signal and the compressed audio signal to a second neural network executing on one or more second computing devices. The first neural network may receive a second set of audio signals from the second neural network. The second set of audio signals may have been decompressed by the second neural network from the first audio signal and the compressed audio signal. The one or more first computing devices may compare the first set of audio signals to the second set of audio signals and train the first neural network based on the comparison.
In accordance with an implementation of this disclosure, a means may be provided for determining, by a first neural network, a set of signal differences between one or more signal characteristics of a first audio signal of a first set of audio signals and one or more signal characteristics of one or more other audio signals of the first set of audio signals. The means may provide for compressing, by the first neural network and based on the set of signal differences, the first set of audio signals into a compressed audio signal. The means may provide for providing the first audio signal and the compressed audio signal to a second neural network executing on one or more second computing devices. The means may provide for receiving, by the first neural network from the second neural network, a second set of audio signals decompressed by the second neural network from the first audio signal and the compressed audio signal. The means may provide for comparing the first set of audio signals to the second set of audio signals and training the first neural network based on the comparison.
Features, advantages, implementations, and embodiments of the disclosure may be apparent from consideration of the following detailed description, drawings, and claims. Moreover, it is to be understood that both the foregoing summary and the following detailed description are illustrative and are intended to provide further explanation without limiting the scope of the claims.
The accompanying drawings, which are included to provide further understanding of this disclosure, are incorporated in and constitute a part of this specification. The drawings also illustrate implementations and/or embodiments of the disclosure, and together with the detailed description serve to explain the principles of implementations and/or embodiments of the disclosure. No attempt is made to show structural details in more detail than may be necessary for a fundamental understanding of the disclosure and various ways in which it may be practiced.
In an implementation of this disclosure, a sensor device may execute a neural network that compresses audio signals from multiple signal sources, such as microphones, into a lower bit rate signal for more efficient and robust network transmission. For example, a sensor device in a “smart home environment” as described below may have five different microphones and may be positioned in a room of a home. An event in the home may generate sound waves that interact with each of the microphones and cause each to generate a signal. Each of these signals may often be quite similar to the others because they each may be caused by the same event. As a result, in some instances, only relatively minor differences in signal characteristics amongst the signals may need to be encoded in order to effectively approximate the signals at a later time. One of the microphone signals may be designated as a reference signal, and the other microphone signals may be designated as secondary signals. Each secondary signal may have differences in signal characteristics as compared to the reference signal. The signal differences may be, for example, differences in phase, magnitude, gain, frequency response, or a transfer function representing the relationship between the input and output of a respective signal source. These signal differences may be caused, for example, by the different positions of the microphones on the housing of the device or the different geometry of the surfaces of the room with respect to each microphone.
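As a minimal illustration of the kind of signal difference discussed above, the following sketch computes the magnitude ratio and phase offset of a secondary signal relative to a reference signal at a single frequency bin. It is not the disclosed neural network: a direct discrete Fourier transform stands in for whatever the trained layers learn, and the function names `dft_bin` and `signal_difference`, the two synthetic microphone signals, and the choice of bin are all assumptions made for illustration.

```python
import cmath
import math


def dft_bin(samples, k):
    """Return the k-th DFT coefficient of a real-valued sample list."""
    n = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))


def signal_difference(reference, secondary, k):
    """Magnitude ratio and phase offset of `secondary` relative to `reference` at bin k.

    Assumes the reference has non-zero energy at bin k; otherwise the
    division below raises ZeroDivisionError.
    """
    ratio = dft_bin(secondary, k) / dft_bin(reference, k)
    return abs(ratio), cmath.phase(ratio)


# Hypothetical two-microphone capture of the same one-cycle tone: the second
# microphone hears it at half the amplitude and a quarter period later.
N = 64
reference = [math.sin(2 * math.pi * i / N) for i in range(N)]
secondary = [0.5 * math.sin(2 * math.pi * i / N - math.pi / 2) for i in range(N)]

gain, phase = signal_difference(reference, secondary, k=1)
```

Under these assumptions, `gain` and `phase` together describe the secondary signal at that bin using two numbers rather than a full set of samples, which is the intuition behind encoding only the differences.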
The sensor device may contain a computing device executing a first neural network. The first neural network may be trained to extract the significant differences between signal characteristics of the reference signal and signal characteristics of the secondary signals. The first neural network may generate a compressed signal by combining these extracted signal differences. The compressed signal may be a lossy signal, having a lower bit rate than any of the secondary signals or the sum of the bit rates of the secondary signals from which the signal differences were extracted. The sensor device may losslessly compress the reference signal and transmit the compressed lossless reference signal along with the compressed lossy signal to a network of distributed computing devices in a cloud computing environment.
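The split between a losslessly compressed reference signal and a lossy compressed signal built from differences may be sketched as follows. This sketch substitutes coarse uniform quantization of sample-wise differences for the trained first neural network, which this disclosure does not specify at this level of detail. The helper names `pack_int16` and `encode`, the quantization step, the use of zlib, and the synthetic signals are assumptions for illustration only.

```python
import math
import struct
import zlib


def pack_int16(samples):
    """Serialize integer samples as little-endian 16-bit PCM."""
    return struct.pack(f"<{len(samples)}h", *samples)


def encode(reference, secondaries, step=64):
    """Return (lossless reference blob, lossy difference blob)."""
    # The reference is compressed losslessly and can be recovered exactly.
    ref_blob = zlib.compress(pack_int16(reference))
    # Differences are quantized coarsely (lossy) and clamped to one signed byte.
    quantized = []
    for secondary in secondaries:
        for s, r in zip(secondary, reference):
            q = round((s - r) / step)
            quantized.append(max(-128, min(127, q)))
    diff_blob = zlib.compress(struct.pack(f"<{len(quantized)}b", *quantized))
    return ref_blob, diff_blob


# Hypothetical signals: two secondary channels that differ slightly from the reference.
reference = [int(10000 * math.sin(2 * math.pi * i / 50)) for i in range(400)]
secondaries = [[r + 100 for r in reference], [r - 200 for r in reference]]

ref_blob, diff_blob = encode(reference, secondaries)
raw_secondary_bytes = sum(len(pack_int16(s)) for s in secondaries)
```

In this toy case the difference blob is smaller than the raw secondary signals because the channels are nearly identical to the reference; real microphone signals would differ in more complex ways.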
A second neural network trained to decompress the compressed lossy signal may execute on the distributed computing devices in the cloud environment. One of the computing devices in the cloud environment may decompress the compressed lossless reference signal into the original reference signal. The second neural network may process the decompressed reference signal and the compressed lossy signal into representations of the secondary signals. The original reference signal and the representations of the secondary signals may be transmitted to a third neural network executing on computing devices in the cloud environment.
The third neural network may be trained to identify speech or sounds in received audio signals. The third neural network may receive the original reference signal and representations of the secondary signals and may perform sound recognition procedures, such as automated speech recognition, to identify words or sounds of interest. Indicators of these words or sounds may be transmitted back to the sensor device and serve as a basis for further functionality. For example, recognized speech may trigger the functioning of a system in the smart home environment such as air conditioning, lighting, or audio/video systems, or recognized sounds may trigger alerts on systems such as child monitoring systems or security systems. For example, the recognition of the sound of broken glass may serve as the basis for triggering a home alarm, or the recognition of the cry of a child may serve as the basis for notifying a parent that the child needs attending.
Generally, embodiments and implementations of this disclosure may be partially or completely incorporated within a smart home environment, such as is described in later portions of this disclosure. The smart home environment may include systems such as premises management systems that may include or communicate with various intelligent, multi-sensing, network-connected devices, such as the neural network executing sensor device described above. Devices included within the smart home environment, such as any of the sensor devices and related components described below with respect to
In an embodiment of this disclosure, multiple microphones of a sensor device may detect sound waves and generate audio signals from those sound waves. For example,
When microphones 110-150 receive sound waves, transducers housed within microphones 110-150 may generate signals, such as audio signals x1-xN. Device 100 may include signal channels composed of electronic circuitry suitable to communicate signals x1-xN to a computing device. The computing device may be any of those discussed below with respect to
The non-transitory, computer-readable storage medium may store instructions for executing neural network 160 that compresses signals x1-xN for further transmission. Neural network 160 may be any of various types of neural networks suitable for the purposes of this disclosure. For example, in some implementations, neural network 160 may be a deep neural network that includes multiple neural network layers. In some implementations, in addition or as alternatives to a deep neural network, the neural network may include one or more recurrent neural network layers such as long short-term memory layers, one or more convolutional neural network layers, or one or more local contrast normalization layers. Neural networks, as described herein, may also have the architecture of a convolutional, long short-term memory, fully connected deep neural network. In some instances, various types of filters such as infinite impulse response filters, linear predictive filters, Kalman filters, or the like may be implemented in addition to or as part of one or more of the neural network layers.
As shown in
In some implementations, digitized samples of audio signals received from the microphones may be convolved with finite-duration impulse response filters of prescribed lengths. Since the input features to a neural network may generally be frequency-domain based representations of the signals, modeling the finite-duration impulse response filter within the neural network may be relatively straightforward in the frequency domain. Modeling the finite-duration impulse response filter in the frequency domain may, however, require that the parameters corresponding to the filter be complex numbers. Thus, additional non-linear post-processing may occur, for example, by enhancing signals in one spectrum or suppressing signals in another spectrum. This post-processing may be applied to the signals in the frequency domain.
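The relationship between time-domain filtering and frequency-domain modeling noted above may be illustrated with the convolution theorem: convolving samples with finite-duration impulse response filter taps gives the same result as multiplying their transforms bin by bin, where the per-bin filter parameters are generally complex. The following is a minimal sketch using a direct discrete Fourier transform; the function names, the sample values, and the filter taps are hypothetical.

```python
import cmath
import math


def dft(x):
    """Direct discrete Fourier transform (O(n^2), for illustration only)."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
            for k in range(n)]


def idft(X):
    """Inverse of dft()."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * i / n) for k in range(n)) / n
            for i in range(n)]


def fir_time_domain(x, taps):
    """Direct linear convolution of samples x with FIR taps."""
    y = [0.0] * (len(x) + len(taps) - 1)
    for i, xi in enumerate(x):
        for j, t in enumerate(taps):
            y[i + j] += xi * t
    return y


def fir_frequency_domain(x, taps):
    """Same filter applied as a per-bin complex multiplication."""
    n = len(x) + len(taps) - 1  # zero-pad so circular convolution equals linear
    X = dft(list(x) + [0.0] * (n - len(x)))
    H = dft(list(taps) + [0.0] * (n - len(taps)))  # complex filter parameters
    return [v.real for v in idft([a * b for a, b in zip(X, H)])]


samples = [1.0, 2.0, 3.0, 4.0]
taps = [0.5, 0.25, -0.25]  # hypothetical filter taps

y_time = fir_time_domain(samples, taps)
y_freq = fir_frequency_domain(samples, taps)
```

Both paths produce the same output up to floating-point rounding, which is why a filter of this kind can be represented by complex-valued weights applied to frequency-domain input features.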
In an implementation of this disclosure, for frequency spectrum FSn, higher layers of neural network 160 may include nodes 250, 251, 252 for layer L2; nodes 253, 254, 255 for layer L3; nodes 256, 257, 258 for layer L4; and node 259 in highest layer L5. Similarly, for frequency spectrum FSn+1, higher layers of neural network 160 may include nodes 260, 261, 262 for layer L2; nodes 263, 264, 265 for layer L3; nodes 266, 267, 268 for layer L4; and node 269 in highest layer L5. Nodes may be computational elements of neural network 160. A node may be adaptively weighted in accordance with its relationship to other nodes, and may include threshold values or implement other suitable functions that affect output of the node. Nodes may perform real or complex computations, such as operations involving the phase and magnitude of an input signal.
In implementations of this disclosure, layers between and/or including the highest or lowest layer of neural network 160 may be trained to extract differences in signal characteristics received via signal channels 210, 230. For example, nodes 250-258 of layers L2, L3, and L4 of frequency spectrum FSn may compare signal characteristics of signals received from signal channels 210 to a reference signal. Similarly, nodes 260-268 of layers L2, L3, and L4 of frequency spectrum FSn+1 may compare signal characteristics of signals received from signal channels 210 to a reference signal. Through neural network processing, L2, L3, and L4 may extract significant differences in the signal characteristics, such as differences in frequency, phase, magnitude, frequency response, or transfer function. For example, one or more of nodes 250-258 and 260-268 may be weighted as a result of training neural network 160 to generate a beneficial compressed signal. The nodes of trained layers L2, L3, and L4 may then extract differences in signal characteristics that are determined to positively contribute to forming the beneficial compressed signal. These signal differences may be combined in higher layers of neural network 160 to generate a beneficial compressed lossy signal.
Neural network 160 may also capture temporal relationships according to implementations of this disclosure. For example, the outputs from the surrounding past and future samples of a given frequency spectrum may be combined from various signal channels to form a convolutional neural network. For example, the temporal relationships between the frequency spectra may be captured from layers L1 to L2, as illustrated by dashed lines 270 between different time instances of the frequency spectra FSn, FSn+1.
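Combining outputs from surrounding past and future samples may be pictured as a weighted sum over a sliding window of frames. The sketch below applies a fixed three-tap kernel across time to one feature per frame; the function name `temporal_context`, the frame values, and the kernel weights are hypothetical, and a trained network would learn such weights rather than fix them.

```python
def temporal_context(frames, kernel):
    """Weighted sum over past, current, and future frames.

    Assumes an odd-length kernel; only positions with a full window are returned.
    """
    half = len(kernel) // 2
    out = []
    for n in range(half, len(frames) - half):
        window = frames[n - half : n + half + 1]
        out.append(sum(w * f for w, f in zip(kernel, window)))
    return out


frames = [1.0, 2.0, 4.0, 8.0, 16.0]  # one feature value per time instance
kernel = [0.25, 0.5, 0.25]           # hypothetical weights for past, current, future

context = temporal_context(frames, kernel)  # [2.25, 4.5, 9.0]
```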
In implementations of this disclosure, neural network 160 may pass the extracted significant differences in signal characteristics to layer L5. Layer L5 may be the highest layer of neural network 160 and may have fewer nodes than lower layers L1-L4. For example, layer L5 of frequency spectrum FSn may have a single node 259, and layer L5 of frequency spectrum FSn+1 may have a single node 269. The highest layer of neural network 160 may function as a linear bottleneck layer, where signal characteristic data received from multiple lower level nodes may be compressed into a signal having a higher data compression ratio. The highest layer of a neural network may be the layer of the neural network where no other layer exists between the highest layer and the output of the neural network. The new compressed signal may be considered to be a lossy signal because it does not contain all of the data of signal channels 210. Thus, for example, data representing significant differences in signal characteristics extracted by layers L2, L3, and L4 of frequency spectrum FSn of neural network 160 may be passed via multiple nodes 256, 257, and 258 of layer L4 to the single node 259 of layer L5 and compressed into a signal having a higher data compression ratio. Similarly, data representing significant differences in signal characteristics extracted by layers L2, L3, and L4 of frequency spectrum FSn+1 may be passed via multiple nodes 266, 267, and 268 of layer L4 to the single node 269 of layer L5 and compressed into a signal having a higher data compression ratio.
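A linear bottleneck of the kind described above may be sketched as a single node that forms a weighted sum of several lower-layer outputs with no non-linear activation. The values standing in for the outputs of nodes 256, 257, and 258 and the weights of node 259 are hypothetical, as is the function name `linear_bottleneck`.

```python
def linear_bottleneck(inputs, weights, bias=0.0):
    """Single linear node: weighted sum of inputs, no non-linear activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(w * x for w, x in zip(weights, inputs)) + bias


l4_outputs = [0.5, -1.0, 2.0]   # hypothetical outputs of nodes 256, 257, 258
l5_weights = [0.5, 0.25, 0.25]  # hypothetical trained weights of node 259

node_259 = linear_bottleneck(l4_outputs, l5_weights)  # 0.5
```

Three values enter the node and one value leaves it, so the original three cannot in general be recovered exactly; this is the sense in which the output of the bottleneck is lossy.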
In some implementations, neural network 160 may have fewer layers or alternate structures. For example,
Output 303 may represent a cell state of the neural network. The cell state may connect each memory cell of the neural network. Interactions between the rest of the neural network and the cell state may be regulated by gates such that information flow may be selectively restricted from adding to or leaving the cell state. Gates may be composed of a neural network layer, such as L3, and a pointwise multiplication operation. By only selectively allowing the cell state to change, long short-term memory neural networks may maintain long-term dependencies on certain information learned by the neural network in the past. Output 302 may represent the loop generally found in recurrent neural networks that is not selectively gated in the same way as the cell state of output 303. Further discussion and examples of long short-term memory neural networks as well as the basis for
As shown in
In other implementations, the reference signal may be a composite of signals from signal sources 110-150. For example,
Various procedures may be executed to compress source signals prior to transmission. For example,
The first neural network layer may determine many differences between signal characteristics of the first audio signal and signal characteristics of the second audio signal. However, only some of the determined differences may be selected. For example, nodes of the first neural network layer as well as other layers may be weighted or otherwise trained such that only nodes that extract differences in signal characteristics that are above a threshold value are passed to higher layers of the neural network. As another example, only certain components of a signal difference may be valuable for signal compression. Thus, for example, nodes of the first neural network layer may be weighted such that certain valuable frequency differences are passed to higher layers or amplified, and other frequency differences are restricted or their contributing effects degraded.
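The selection of only some determined differences may be illustrated with an explicit threshold. In a trained network this selection would be an effect of learned weights rather than a hard-coded rule, so the following is only a sketch; the function name `select_differences`, the difference values, and the threshold are hypothetical.

```python
def select_differences(differences, threshold):
    """Keep only the differences whose absolute value exceeds the threshold."""
    return {name: value for name, value in differences.items()
            if abs(value) > threshold}


# Hypothetical normalized differences between a secondary and a reference signal.
differences = {"phase": 0.02, "magnitude": 0.40, "gain": -0.35}

selected = select_differences(differences, threshold=0.1)
# {'magnitude': 0.4, 'gain': -0.35}
```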
At 540, at least a second neural network layer of the neural network may compress the first audio signal and the second audio signal into a third compressed audio signal based on the first signal difference. The second neural network layer may be distinct from the first neural network layer. For example, the second neural network layer may be the highest layer in the neural network, and the first neural network layer may be one of the lower neural network layers. The at least second neural network layer may compress the first audio signal and the second audio signal into a lossy compressed third signal, and a bit rate of the first signal may be greater than a bit rate of the third signal. The first computing device may then provide the third audio signal and the first audio signal to a second computing device at 550 for decompression and further processing. The second computing device may be distinct and remote from the first computing device. For example, the first computing device may be within a sensor device, such as sensor device 100, and located in a home, and the second computing device may be one of multiple servers in a remote cloud computing environment.
In another example,
At 660, at least the first neural network layer may determine a second signal difference between a signal characteristic of the second audio signal and a signal characteristic of the third audio signal. At least a second neural network layer of the neural network may then compress the first audio signal and the second audio signal into a fourth audio signal based on the first signal difference and the second signal difference at 670. At 680, the first computing device may provide the third signal and the fourth signal to a second computing device for decompression and further processing.
A reference signal and compressed signal may be provided to one or more second computing devices for decompression. For example, as shown in
In an implementation of this disclosure, a computing device may execute a neural network for decompressing received compressed signals. For example,
Signals output from a decompression neural network, such as signals d1-dN, may be provided to one or more third computing devices executing a third neural network in a cloud computing environment, such as neural network 190 as shown in
In some embodiments the one or more third computing devices executing the third, sound or speech recognition neural network may be local to one or more first computing devices executing the first, compression neural network and/or one or more second computing devices executing the second, decompression neural network. In some implementations the first, second, and/or third neural networks may each be part of the same neural network. The third neural network may include any of the neural network architectures described above, including that of a convolutional, long short-term memory, fully connected deep neural network.
The efficacy of neural networks for compressing signals, decompressing signals, and recognizing sounds or speech may depend on the method and extent of prior training of the neural networks.
The second neural network may decompress the compressed audio signal into multiple audio signals at 825, and the one or more second computing devices may provide the decompressed audio signals to a third neural network for sound and speech recognition. The one or more second computing devices may also provide the decompressed audio signals back to the first neural network for training purposes at 830. The first neural network may receive the decompressed audio signals from the second neural network.
At 835 the one or more first computing devices may compare the decompressed audio signals to the multiple original audio signals, and at 840, train the first neural network based on the comparison. For example, if a signal characteristic of the decompressed audio signals provides a high quality approximation of a corresponding signal characteristic in the original audio signals, then the weight or other training feature of a node in the first neural network that contributed to inclusion of that signal characteristic in the compressed audio signal may be increased. Similarly, if a signal characteristic of the decompressed audio signals provides a poor quality approximation of a corresponding signal characteristic in the original audio signals, then the weight or other training feature of a node in the first neural network that contributed to inclusion of that signal characteristic in the compressed signal may be decreased. In a similar manner, the one or more second computing devices may train, at 845, the second neural network based on comparison of the decompressed audio signals to the multiple original audio signals. For example, the one or more second computing devices may receive the multiple original audio signals. The one or more second computing devices may compare signal characteristics of the decompressed audio signals to signal characteristics of the multiple original audio signals and adjust the weights or other training features of the nodes of the second neural network accordingly.
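The compare-and-adjust loop described above may be sketched with a single trainable weight. Here the "decompressed" signal is the reference scaled by that weight, the comparison is a mean squared error against the original signal, and the weight is increased or decreased by plain gradient descent. Gradient descent, the learning rate, the function names `mse` and `train_step`, and the synthetic signals are assumptions for illustration; this disclosure does not limit training to this technique.

```python
def mse(a, b):
    """Mean squared error between two equal-length signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def train_step(weight, reference, original, lr=0.1):
    """One gradient-descent update of a single gain weight."""
    decompressed = [weight * r for r in reference]
    # Gradient of the mean squared error with respect to the weight.
    grad = sum(2 * (d - o) * r
               for d, o, r in zip(decompressed, original, reference)) / len(reference)
    return weight - lr * grad


reference = [1.0, -1.0, 0.5, -0.5]
original = [0.8 * r for r in reference]  # hypothetical secondary signal

weight = 0.0
losses = []
for _ in range(50):
    losses.append(mse([weight * r for r in reference], original))
    weight = train_step(weight, reference, original)
```

With these values the weight moves toward 0.8 and the recorded error shrinks from step to step, mirroring how a poor approximation leads to an adjustment and a good approximation leaves the weight nearly unchanged.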
The third neural network executing on one or more third computing devices may receive the decompressed signals. At 850, the third neural network may determine a category associated with one or more components, such as a frame, of one or more signals of the decompressed signals. At 855 a computing device executing the first neural network may provide an indicator of a category known to be associated with the multiple original audio signals to the third neural network. At 860 the one or more third computing devices may compare an indicator of the determined category with an indicator of the known category associated with the multiple original audio signals. At 865, the one or more third computing devices may train the third neural network based on this comparison. For example, the weight or other training features of nodes in the third neural network that provided contributions to a successful determination of the known category may be strengthened, while those that provided negative contributions may be weakened.
In some implementations, the first neural network and the second neural network may be trained concurrently, while the third neural network is prevented from training, such as by not providing feedback on the success or failure of category determinations by the third neural network. In other implementations, the first, second, and third neural networks may be trained concurrently. Any of a variety of other techniques for training neural networks may be employed in procedure 800; for example, supervised, unsupervised, and reinforcement training techniques may be employed.
As discussed throughout this disclosure, operations performed by one or more computing devices executing a neural network may be performed by components of the one or more computing devices other than the neural network or by the neural network executing on the one or more computing devices.
The devices, systems, and procedures set forth in this disclosure may be in communication with other devices, systems, and procedures throughout a premises. Combined, these devices, systems, and procedures may make up the greater smart home environment for the premises. Further aspects of the smart home environment and related components are discussed in the following portions of this disclosure.
In general, a “sensor” or “sensor device” as disclosed herein may include multiple sensors or sub-sensors, such as a position sensor that includes both a GPS sensor as well as a wireless network sensor. This combination may provide data that can be correlated with known wireless networks to obtain location information. Multiple sensors may be arranged in a single physical housing, such as where a single device includes movement, temperature, magnetic, and/or other sensors, as well as the devices discussed in earlier portions of this disclosure. Such a housing also may be referred to as a sensor or a sensor device. For clarity, sensors are described with respect to the particular functions they perform and/or the particular physical hardware used, when such specification is necessary for understanding of the embodiments disclosed herein.
A sensor may include hardware in addition to the specific physical sensor that obtains information about the environment.
As an example of the implementation of sensors within a premises
In some configurations, two or more sensors may generate data that can be used by a processor of a system to generate a response and/or infer a state of the environment. For example, an ambient light sensor in a room may determine that the room is dark (e.g., less than 60 lux). A microphone in the room may detect a sound above a set threshold, such as 60 dB. The system processor may determine, based on the data generated by both sensors, that it should activate one or more lights in the room. In the event the processor only received data from the ambient light sensor, the system may not have any basis to alter the state of the lighting in the room. Similarly, if the processor only received data from the microphone, the system may lack sufficient data to determine whether activating the lights in the room is necessary. For example, during the day the room may already be bright, or during the night the lights may already be on. As another example, two or more sensors may communicate with one another. Thus, data generated by multiple sensors simultaneously or nearly simultaneously may be used to determine a state of an environment and, based on the determined state, generate a response.
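The two-sensor lighting decision described above may be sketched as a simple rule that requires both readings. The 60 lux and 60 dB values come from the example in the text; the function name `should_activate_lights` and the use of `None` to represent a missing reading are assumptions for illustration.

```python
DARK_LUX = 60.0   # room treated as dark below this illuminance
SOUND_DB = 60.0   # sound treated as significant above this level


def should_activate_lights(lux, sound_db):
    """Return True only when both readings are present and both conditions hold."""
    if lux is None or sound_db is None:
        # With only one sensor reporting, there is no basis to change the lighting.
        return False
    return lux < DARK_LUX and sound_db > SOUND_DB


dark_and_loud = should_activate_lights(20.0, 70.0)     # True
light_only = should_activate_lights(20.0, None)        # False
bright_and_loud = should_activate_lights(500.0, 70.0)  # False
```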
As another example, a system may employ a magnetometer affixed to a door jamb and a magnet affixed to the door. When the door is closed, the magnetometer may detect the magnetic field emanating from the magnet. If the door is opened, the increased distance may cause the magnetic field near the magnetometer to be too weak to be detected by the magnetometer. If the system is activated, it may interpret such non-detection as the door being ajar or open. In some configurations, a separate sensor or a sensor integrated into one or more of the magnetometer and/or magnet may be incorporated to provide data regarding the status of the door. For example, an accelerometer and/or a compass may be affixed to the door and indicate the status of the door and/or augment the data provided by the magnetometer.
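One possible way to express the magnetometer-based determination is sketched below. The names `DoorState` and `infer_door_state`, the microtesla unit, and the threshold value are hypothetical and are chosen only for illustration; the disclosure does not specify them.

```python
from enum import Enum


class DoorState(Enum):
    CLOSED = "closed"
    OPEN = "open"
    UNKNOWN = "unknown"


# Hypothetical detection threshold, in microtesla, for this sketch only.
DETECTION_THRESHOLD_UT = 150.0


def infer_door_state(
    field_strength_ut: float,
    system_active: bool,
    threshold_ut: float = DETECTION_THRESHOLD_UT,
) -> DoorState:
    """Interpret a magnetometer reading taken at the door jamb."""
    if not system_active:
        # An inactive system does not interpret the reading at all.
        return DoorState.UNKNOWN
    if field_strength_ut >= threshold_ut:
        # The magnet on the door is close enough to be detected.
        return DoorState.CLOSED
    # Non-detection is interpreted as the door being ajar or open.
    return DoorState.OPEN
```

Data from an accelerometer or compass affixed to the door could be combined with this result, although that combination is not shown here.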
In some configurations, an accelerometer may be employed to indicate how quickly the door is moving. For example, the door may be lightly moving due to a breeze. This may be contrasted with a rapid movement due to a person swinging the door open. The data generated by the compass, accelerometer, and/or magnetometer may be analyzed and/or provided to a central system such as a controller 1130 and/or remote system 1140 depicted in
The data collected from one or more sensors may be used to determine the physical status and/or occupancy status of a premises, for example whether one or more family members are home or away. For example, open/close sensors such as door sensors as described with respect to
Data generated by one or more sensors may indicate patterns in the behavior of one or more users and/or an environment state over time, and thus may be used to “learn” such characteristics. For example, sequences of patterns of radiation may be collected by a capture component of a device in a room of a premises and used as a basis to learn object characteristics of a user, pets, furniture, plants, and other objects in the room. These object characteristics may make up a room profile of the room and may be used to make determinations about objects detected in the room.
In another example, data generated by an ambient light sensor in a room of a house and the time of day may be stored in a local or remote storage medium with the permission of an end user. A processor in communication with the storage medium may compute a behavior based on the data generated by the light sensor. The light sensor data may indicate that the amount of light detected increases until an approximate time or time period, such as 3:30 pm, and then declines until another approximate time or time period, such as 5:30 pm, at which point there is an abrupt increase in the amount of light detected. In many cases, the amount of light detected after the second time period may be either dark (e.g., at or below 60 lux) or bright (e.g., at or above 400 lux). In this example, the data may indicate that after 5:30 pm, an occupant is turning on/off a light as the occupant of the room in which the sensor is located enters/leaves the room. At other times, the light sensor data may indicate that no lights are turned on/off in the room. The system, therefore, may learn occupants' patterns of turning on and off lights, and may generate a response to the learned behavior. For example, at 5:30 pm, a smart home environment or other sensor network may automatically activate the lights in the room if it detects an occupant in proximity to the home. In some embodiments, such behavior patterns may be verified using other sensors. Continuing the example, user behavior regarding specific lights may be verified and/or further refined based upon states of, or data gathered by, smart switches, outlets, lamps, and the like.
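As a simplified illustration of how a processor might locate the abrupt increase described in this example, the sketch below scans stored light readings for the first large jump between consecutive samples. The function name `find_abrupt_increase`, the sample format, and the 200 lux jump size are assumptions made for this sketch and are not taken from the disclosure.

```python
from typing import List, Optional, Tuple

# Each sample is (minutes since midnight, measured lux), sorted by time.
Sample = Tuple[int, float]


def find_abrupt_increase(
    samples: List[Sample], min_jump_lux: float = 200.0
) -> Optional[int]:
    """Return the time of the first jump of at least min_jump_lux, or None."""
    for (_, lux_prev), (t_cur, lux_cur) in zip(samples, samples[1:]):
        if lux_cur - lux_prev >= min_jump_lux:
            return t_cur
    return None


# Illustrative day: light rises until 3:30 pm, declines, then jumps at 5:30 pm.
day = [
    (15 * 60, 300.0),
    (15 * 60 + 30, 350.0),
    (16 * 60 + 30, 200.0),
    (17 * 60 + 15, 80.0),
    (17 * 60 + 30, 420.0),
]
switch_on_time = find_abrupt_increase(day)  # 1050, i.e. 5:30 pm
```

A system could aggregate such times over many days to estimate when an occupant typically turns on a light; that aggregation step is omitted from this sketch.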
Such learning behavior may be implemented in accordance with the techniques disclosed herein. For example, a smart home environment as disclosed herein may be configured to learn appropriate notices to generate or other actions to take in response to a determination that a notice should be generated, and/or appropriate recipients of a particular notice or type of notice. As a specific example, a smart home environment may determine that after a notice has been sent to a first occupant of the smart home premises indicating that a window in a room has been left open, a second occupant is always detected in the room within a threshold time period, and the window is closed shortly thereafter. After making such a determination, in future occurrences the notice may be sent to the second occupant or to both occupants for the purposes of improving the efficacy of the notice. In an embodiment, such “learned” behaviors may be reviewed, overridden, modified, or the like by a user of the system, such as via a computer-provided interface to a smart home environment as disclosed herein.
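The recipient-learning example above may be sketched, under stated assumptions, as a simple frequency rule. The function name `choose_recipients` and the parameters `min_events` and `min_share` are hypothetical; the disclosure does not prescribe a particular learning technique, and this sketch sends the notice to both occupants rather than to the second occupant alone.

```python
from collections import Counter
from typing import List


def choose_recipients(
    responders: List[str],
    default_recipient: str,
    min_events: int = 5,
    min_share: float = 0.8,
) -> List[str]:
    """Pick notice recipients from a history of who resolved past notices.

    responders holds, for each past notice, the occupant who was detected
    in the room and closed the window after the notice was sent.
    """
    if len(responders) < min_events:
        # Too little history to learn from; keep the original recipient.
        return [default_recipient]
    responder, count = Counter(responders).most_common(1)[0]
    if responder != default_recipient and count / len(responders) >= min_share:
        # Another occupant consistently resolves the condition.
        return [default_recipient, responder]
    return [default_recipient]
```

A user-facing interface could expose the learned result so that it may be reviewed, overridden, or modified, consistent with the embodiment described above.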
Sensors, premises management systems, mobile devices, and related components as disclosed herein may operate within a communication network, such as a conventional wireless network, and/or a sensor-specific network through which sensors may communicate with one another and/or with dedicated other devices. In some configurations one or more sensors may provide information to one or more other sensors, to a central controller, or to any other device capable of communicating on a network with the one or more sensors. A central controller may be general- or special-purpose. For example, one type of central controller is a home automation network that collects and analyzes data from one or more sensors within the home. Another example of a central controller is a special-purpose controller that is dedicated to a subset of functions, such as a security controller that collects and analyzes sensor data primarily or exclusively as it relates to various security considerations for a location. A central controller may be located locally with respect to the sensors with which it communicates and from which it obtains sensor data, such as in the case where it is positioned within a home that includes a home automation and/or sensor network. Alternatively or in addition, a central controller as disclosed herein may be remote from the sensors, such as where the central controller is implemented as a cloud-based system that communicates with multiple sensors, which may be located at multiple locations and may be local or remote with respect to one another.
The devices of the disclosed subject matter may be communicatively connected via the network 1100, which may be a mesh-type network such as Thread, which provides network architecture and/or protocols for devices to communicate with one another. Typical home networks may have a single device point of communications. Such networks may be prone to failure, such that devices of the network cannot communicate with one another when the single device point does not operate normally. The mesh-type network of Thread, which may be used in methods and systems of the disclosed subject matter, may avoid communication using a single device. That is, in the mesh-type network, such as network 1100, there is no single point of communication that may fail so as to prohibit devices coupled to the network from communicating with one another.
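The difference between a network with a single point of communication and a mesh-type network can be illustrated with a small reachability check. The sketch below is a generic breadth-first search over an undirected link list and does not model Thread routing itself; the function name `can_communicate` and the example node names are assumptions for illustration.

```python
from collections import deque
from typing import Dict, FrozenSet, Iterable, Set, Tuple


def can_communicate(
    links: Iterable[Tuple[str, str]],
    src: str,
    dst: str,
    failed: FrozenSet[str] = frozenset(),
) -> bool:
    """Return True if src can reach dst without passing through failed nodes."""
    if src in failed or dst in failed:
        return False
    adjacency: Dict[str, Set[str]] = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = {src}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen and neighbor not in failed:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


# A hub-and-spoke network and a mesh that adds device-to-device links.
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
mesh = star + [("a", "b"), ("b", "c")]
```

With the hub marked as failed, device `a` cannot reach device `c` in the hub-and-spoke example but can still reach it through device `b` in the mesh example.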
The communication and network protocols used by the devices communicatively coupled to the network 1100 may provide secure communications, minimize the amount of power used (i.e., be power efficient), and support a wide variety of devices and/or products in a home, such as appliances, access control, climate control, energy management, lighting, safety, and security. For example, the protocols supported by the network and the devices connected thereto may have an open protocol which may carry IPv6 natively.
The Thread network, such as network 1100, may be easy to set up and secure to use. The network 1100 may use an authentication scheme, such as AES (Advanced Encryption Standard) encryption or the like, to reduce and/or minimize security holes that exist in other wireless protocols. The Thread network may be scalable to connect devices (e.g., 2, 5, 10, 20, 50, 100, 150, 200, or more devices) into a single network supporting multiple hops (e.g., so as to provide communications between devices when one or more nodes of the network is not operating normally). The network 1100, which may be a Thread network, may provide security at the network and application layers. One or more devices communicatively coupled to the network 1100 (e.g., controller 1130, remote system 1140, and the like) may store product install codes to ensure only authorized devices can join the network 1100. One or more operations and communications of network 1100 may use cryptography, such as public-key cryptography.
The devices communicatively coupled to the network 1100 of the smart home environment disclosed herein may have low power consumption and/or reduced power consumption. That is, the devices may communicate efficiently with one another and operate to provide functionality to the user, where the devices may have reduced battery size and increased battery lifetimes over conventional devices. The devices may include sleep modes to increase battery life and reduce power requirements. For example, communications between devices coupled to the network 1100 may use the power-efficient IEEE 802.15.4 MAC/PHY protocol. In embodiments of the disclosed subject matter, short messaging between devices on the network 1100 may conserve bandwidth and power. The routing protocol of the network 1100 may reduce network overhead and latency. The communication interfaces of the devices coupled to the smart home environment may include wireless system-on-chips to support the low-power, secure, stable, and/or scalable communications network 1100.
The sensor network shown in
The smart home environment can control and/or be coupled to devices outside of the structure. For example, one or more of the sensors 1110 and 1120 may be located outside the structure at one or more distances from the structure (e.g., sensors 1110 and 1120 may be disposed outside the structure, at points along a land perimeter on which the structure is located, and the like). One or more of the devices in the smart home environment need not physically be within the structure. For example, the controller 1130, which may receive input from the sensors 1110 and 1120, may be located outside of the structure.
The structure of the smart home environment may include a plurality of rooms, separated at least partly from each other via walls. The walls can include interior walls or exterior walls. Each room can further include a floor and a ceiling. Devices of the smart home environment, such as the sensors 1110 and 1120, may be mounted on, integrated with and/or supported by a wall, floor, or ceiling of the structure.
The smart home environment including the sensor network shown in
For example, a smart thermostat may detect ambient climate characteristics (e.g., temperature and/or humidity) and may accordingly control an HVAC system of the structure. For example, the ambient climate characteristics may be detected by sensors 1110 and 1120 shown in
As another example, a smart hazard detector may detect the presence of a hazardous substance or a substance indicative of a hazardous substance (e.g., smoke, fire, or carbon monoxide). For example, smoke, fire, and/or carbon monoxide may be detected by sensors 1110 and 1120 shown in
As another example, a smart doorbell may control doorbell functionality, detect a person's approach to or departure from a location (e.g., an outer door to the structure), and announce a person's approach or departure from the structure via audible and/or visual message that is output by a speaker and/or a display coupled to, for example, the controller 1130.
In some embodiments, the smart home environment of the sensor network shown in
In embodiments of the disclosed subject matter, a smart home environment may include one or more intelligent, multi-sensing, network-connected entry detectors (e.g., “smart entry detectors”). Such detectors may be or include one or more of the sensors 1110 and 1120 shown in
The smart home environment of the sensor network shown in
The smart thermostats, the smart hazard detectors, the smart doorbells, the smart wall switches, the smart wall plugs, the smart entry detectors, the smart doorknobs, the keypads, and other devices of a smart home environment (e.g., as illustrated as sensors 1110 and 1120 of
A user can interact with one or more of the network-connected smart devices (e.g., via the network 1100). For example, a user can communicate with one or more of the network-connected smart devices using a computer or mobile device (e.g., a desktop computer, laptop computer, tablet, or the like) or other portable electronic device (e.g., a smartphone, a tablet, a key FOB, or the like). A webpage or application can be configured to receive communications from the user and control the one or more of the network-connected smart devices based on the communications and/or to present information about the device's operation to the user. For example, the user can view, arm, or disarm the security system of the home.
One or more users can control one or more of the network-connected smart devices in the smart home environment using a network-connected computer or portable electronic device. In some examples, some or all of the users (e.g., individuals who live in the home) can register their mobile device and/or key FOBs with the smart home environment (e.g., with the controller 1130). Such registration can be made at a central server (e.g., the controller 1130 and/or the remote system 1140) to authenticate the user and/or the electronic device as being associated with the smart home environment, and to provide permission to the user to use the electronic device to control the network-connected smart devices and systems of the smart home environment. A user can use their registered electronic device to remotely control the network-connected smart devices and systems of the smart home environment, such as when the occupant is at work or on vacation. The user may also use their registered electronic device to control the network-connected smart devices when the user is located inside the smart home environment.
Alternatively, or in addition to registering electronic devices, the smart home environment may make inferences about which individuals live in the home (occupants) and are therefore users and which electronic devices are associated with those individuals. As such, the smart home environment may “learn” who is a user (e.g., an authorized user) and permit the electronic devices associated with those individuals to control the network-connected smart devices of the smart home environment (e.g., devices communicatively coupled to the network 1100) in some embodiments, including sensors used by or within the smart home environment. Various types of notices and other information may be provided to users via messages sent to one or more user electronic devices. For example, the messages can be sent via email, short message service (SMS), multimedia messaging service (MMS), unstructured supplementary service data (USSD), as well as any other type of messaging services and/or communication protocols. As previously described, such notices may be generated in response to specific determinations of the occupancy and/or physical status of a premises, or they may be sent for other reasons as disclosed herein.
A smart home environment may include communication with devices outside of the smart home environment but within a proximate geographical range of the home. For example, the smart home environment may include an outdoor lighting system (not shown) that communicates information through the communication network 1100 or directly to a central server or cloud-computing system (e.g., controller 1130 and/or remote system 1140) regarding detected movement and/or presence of people, animals, and any other objects and receives back commands for controlling the lighting accordingly.
The controller 1130 and/or remote system 1140 can control the outdoor lighting system based on information received from the other network-connected smart devices in the smart home environment. For example, in the event that any of the network-connected smart devices, such as smart wall plugs located outdoors, detect movement at nighttime, the controller 1130 and/or remote system 1140 can activate the outdoor lighting system and/or other lights in the smart home environment.
In some configurations, a remote system 1140 may aggregate data from multiple locations, such as multiple buildings, multi-resident buildings, individual residences within a neighborhood, multiple neighborhoods, and the like. In general, multiple sensor/controller systems 1150 and 1160 as shown
In situations in which the systems discussed here collect personal information about users, or may make use of personal information, the users may be provided with an opportunity to control whether programs or features collect user information (e.g., information about a user's social network, social actions or activities, profession, a user's preferences, or a user's current location), or to control whether and/or how to receive content from the content server that may be more relevant to the user. In addition, certain data may be treated in one or more ways before it is stored or used, so that personally identifiable information is removed. For example, specific information about a user's residence may be treated so that no personally identifiable information can be determined for the user, or a user's geographic location may be generalized where location information is obtained (such as to a city, ZIP code, or state level), so that a particular location of a user cannot be determined. As another example, systems disclosed herein may allow a user to restrict the information collected by those systems to applications specific to the user, such as by disabling or limiting the extent to which such information is aggregated or used in analysis with other information from other users. Thus, the user may have control over how information is collected about the user and used by a system as disclosed herein.
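The generalization of location information described above may be sketched as a field filter applied before a record is stored or used. The record layout, the function name `generalize_location`, and the supported levels are assumptions for this sketch; an actual system may treat data differently.

```python
from typing import Dict

# Fields retained at each level of generalization (hypothetical layout).
_FIELDS_BY_LEVEL = {
    "zip": ("zip", "state"),
    "city": ("city", "state"),
    "state": ("state",),
}


def generalize_location(record: Dict[str, str], level: str = "city") -> Dict[str, str]:
    """Return a copy of record that keeps only fields allowed at the given level."""
    if level not in _FIELDS_BY_LEVEL:
        raise ValueError(f"unsupported level: {level!r}")
    return {key: record[key] for key in _FIELDS_BY_LEVEL[level] if key in record}


# Illustrative record; the values are made up for this example.
residence = {
    "street": "1 Main St",
    "city": "Springfield",
    "state": "IL",
    "zip": "62701",
    "latitude": "39.80",
}
coarse = generalize_location(residence, "city")
```

Because the function returns a new dictionary, the street address and coordinates are never copied into the generalized record, so a particular location of the user cannot be determined from it.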
Embodiments of the presently disclosed subject matter may be implemented in and used with a variety of computing devices.
The bus 1210 allows data communication between the central processor 1220 and one or more memory components 1230 and 1270, which may include RAM, ROM, and other memory, as previously noted. Applications resident with the computing device 1200 are generally stored on and accessed via a non-transitory, computer-readable storage medium, such as memory 1230 or fixed storage 1270.
The fixed storage 1270 may be integral with the computing device 1200 or may be separate and accessed through other interfaces. The network interface 1290 may provide a direct connection to a remote server via a wired or wireless connection. The network interface 1290 may provide such connection using any suitable technique and protocol as will be readily understood by one of skill in the art, including digital cellular telephone, Wi-Fi, Bluetooth®, near-field, and the like. For example, the network interface 1290 may allow the device to communicate with other computers via one or more local, wide-area, or other communication networks, as described in further detail herein.
Various embodiments of the presently disclosed subject matter may include or be embodied in the form of computer-implemented processes and apparatuses for practicing those processes. Embodiments also may be embodied in the form of a computer program product having computer program code containing instructions embodied in non-transitory and/or tangible media, such as hard drives, USB (universal serial bus) drives, or any other machine readable storage medium, such that when the computer program code is loaded into and executed by a computer, the computer becomes an apparatus for practicing embodiments of the disclosed subject matter. When implemented on a general-purpose microprocessor, the computer program code may configure the microprocessor to become a special-purpose device, such as by creation of specific logic circuits as specified by the instructions.
Embodiments may be implemented using hardware that may include a processor, such as a general purpose microprocessor and/or an Application Specific Integrated Circuit (ASIC) that embodies all or part of the techniques according to embodiments of the disclosed subject matter in hardware and/or firmware. The processor may be coupled to memory, such as RAM, ROM, flash memory, a hard disk or any other device capable of storing electronic information. The memory may store instructions adapted to be executed by the processor to perform the techniques according to embodiments of the disclosed subject matter.
The foregoing description, for purpose of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit embodiments of the disclosed subject matter to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to explain the principles of embodiments of the disclosed subject matter and their practical applications, to thereby enable others skilled in the art to utilize those embodiments as well as various embodiments with various modifications as may be suited to the particular use contemplated.
Number | Date | Country |
---|---|---|
20180108363 A1 | Apr 2018 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 15211417 | Jul 2016 | US |
Child | 15845087 | | US |