The invention relates to systems and methods for fusing sensor data obtained from different types of sensors to derive spatial analytics or other information about a room and its occupancy.
Commercial and residential spaces are equipped with lighting devices to provide illumination. For energy conservation and convenience, the lighting devices communicate with motion sensors to automatically turn on and off the light source of these lighting devices. For example, when an individual walks into a room, a motion sensor detects motion and causes a signal to turn on the light source automatically. If no motion is detected for a particular duration of time, it is assumed that there are no occupants in the area and the light source is turned off.
There are techniques to estimate occupancy of large areas using several motion sensors. Error in these estimations increases as the number of motion sensors is reduced. Occupancy estimation is generally more accurate for large spaces having multiple sensors over long periods of time. The prior art does not offer solutions for estimating occupancy with greater accuracy for smaller rooms or rooms with few motion sensors.
Independently, the occupancy of a room may be determined using audio-based sensors instead of motion. For example, U.S. Patent Publication No. 2019/0214019A1 describes an array of microphones that uses sound localization to count occupants by analyzing a direction of the sound and its intensity level. U.S. Patent Publication No. US20140379305A1 describes an ultrasonic sensor that generates a signal based on a sensed ultrasonic echo of the monitored area. Specifically, ultrasonic transducer sensors generate high frequency sound waves and evaluate the echoes, which are received back by the sensors.
The present disclosure overcomes the problems in the prior art by providing more accurate techniques of estimating room occupancy. In addition, the present disclosure improves the accuracy of spatial analysis in small rooms with fewer sensors.
One aspect of the present invention is related to a system for estimating a number of occupants in a room. The system includes a motion sensor configured to generate motion samples and a microphone configured to generate audio samples. The system has a communication interface configured to communicate with a computing system such as, for example, a server. The system has at least one processor configured to detect motion events from the motion samples to determine a first estimated number of occupants in the room; analyze the audio samples to derive a second estimated number of occupants in the room; compare the first estimated number to the second estimated number; and generate an output comprising a number of occupants in the room in response to the comparison.
The system may include a lighting device such as, for example, a luminaire. According to one embodiment, the lighting device includes the motion sensor and microphone. In addition, the lighting device includes the communication interface to communicate with the computing system. For example, the lighting device is a networked device that communicates with a server over the Internet or other network. The at least one processor may be installed in the lighting device and/or server to facilitate a distributed computing environment.
According to another embodiment, the motion sensor comprises a passive infrared sensor, and the motion events comprise at least a minor motion event and a major motion event.
According to another embodiment, the at least one processor is configured to analyze the audio samples by converting the audio samples into a series of feature vectors corresponding to a series of sample windows, and to cluster the series of feature vectors into a number of clusters, wherein the number of clusters is the second estimated number. For example, this may involve clustering the series of feature vectors into a number of clusters using the first estimated number as an initial cluster count. As another example, the feature vectors are clustered according to k-means clustering such that the first estimated number is provided as a seed value to the k-means clustering. In one aspect of the present disclosure, the series of feature vectors include Mel Frequency Cepstral Coefficients.
According to another embodiment, the system determines whether a cluster of feature vectors in the audio samples corresponds to audio originating from a speaker device.
Further details, aspects, and embodiments of the invention will be described, by way of example only, with reference to the drawings. Elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale. In the figures, elements which correspond to elements already described may have the same reference numerals. In the drawings,
The embodiments shown in the drawings and described in detail herein should be considered exemplary of the principles of the invention and not intended to limit the invention to the specific embodiments shown and described herein.
In the following, for the sake of understanding, elements of embodiments are described in operation. However, it will be apparent that the respective elements are arranged to perform the functions being described as performed by them.
Further, the invention is not limited to the embodiments, and the invention lies in each and every novel feature or combination of features described herein or recited in mutually different dependent claims.
Each lighting device 101 communicates with other system components over a network 105. The network 105 includes the Internet, intranets, extranets, wide area networks (WANs), local area networks (LANs), wired networks, wireless networks, or other suitable networks, etc., or any combination of two or more such networks.
The spatial analysis system 100 of
The computing system 106 includes a database 109. Various data is stored in the database 109 or other memory that is accessible to the computing system 106. The database 109 may represent one or more databases 109. The data stored in the database 109 includes, for example, lighting device identifiers (IDs) 112, occupancy statistics 115, motion sensor data 118, and microphone data 121. Lighting device IDs 112 include data identifying each lighting device 101 coupled to the network 105. In this respect, the computing system 106 tracks each lighting device 101 so as to facilitate communication with each lighting device 101 on an independent and individual basis. Occupancy statistics 115 include data identifying the various rooms in which the lighting devices 101 are installed and the occupancy status in terms of the number of occupants over time. Motion sensor data 118 and microphone data 121 may include raw or processed data originating from the motion sensors 102 and microphones 103 of the various installed lighting devices 101. Various applications and/or other functionality may be executed in the computing system 106. This includes, for example, a server application 127, which is configured to communicate with the lighting devices 101.
Next is a description of how the various components of
An estimated number of occupants is determined using the motion samples. An embodiment of this process is described in further detail with respect to
In the embodiment of
The spatial analysis system 100 includes a motion based occupancy estimator 322, which may be executed by a processor in the lighting device 101, within a server application 127, or a combination thereof. The motion based occupancy estimator 322 analyzes the processed motion data 318 to derive a motion-based estimated number of occupants. This result may be stored as occupancy statistics 115. When performing this analysis, the motion based occupancy estimator 322 associates each major motion event as an incremental change in occupancy count and confirms whether the occupancy has changed based on the number of minor or medium motion events. Thus,
The spatial analysis system 100 includes a feature generation module 415 that is executed by a processor in the lighting device 101, within a server application 127, or a combination thereof. The feature generation module 415 converts the audio samples 412 for a given timeslot into a feature vector 418 made up of coefficients. The feature vector 418 has a finite size that is substantially smaller than the size of an audio sample 412 for a particular timeslot. For example, if the timeslot is 20-40 milliseconds and the audio sample rate is 44,100 samples per second, there are 882 to 1,764 samples for the given timeslot, which is significantly more than the feature vector size.
The feature generation module 415 may convert the time-domain audio samples 412 into the frequency domain using a Fourier transformation. Then it may filter out all frequencies that do not relate to the human vocal range, or otherwise select a predetermined set of frequencies directed to the human voice. Then, the feature generation module 415 may perform a discrete log function followed by a discrete cosine transformation to derive the values of the coefficients that make up the feature vector 418. According to an embodiment, the feature generation module 415 generates the feature vectors 418 using Mel-frequency cepstral coefficients. In another embodiment, the feature generation module 415 generates the feature vectors 418 using octave band coefficients.
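The transform-filter-log-DCT pipeline above can be sketched as follows. This is a simplified, illustrative cepstral-style feature extractor, not the patented implementation: a full MFCC computation would also apply windowing and a mel filterbank, and the function name, voice band limits, and coefficient count here are assumptions.

```python
import numpy as np

def voice_feature_vector(samples, sample_rate=44100, n_coeffs=12):
    """Convert one timeslot of audio samples into a short feature vector.

    Steps mirror the description: Fourier transform, keep only an
    assumed human-voice band (~80-4000 Hz), take the log, then apply
    a discrete cosine transform and keep the first n_coeffs terms.
    """
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    band = spectrum[(freqs >= 80.0) & (freqs <= 4000.0)]
    log_energy = np.log(band + 1e-10)  # log step (offset avoids log(0))
    n = len(log_energy)
    # DCT-II basis, truncated to the first n_coeffs coefficients.
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n) + 1) / (2 * n))
    return basis @ log_energy

# A 20 ms frame at 44,100 samples/s is 882 samples.
frame = np.sin(2 * np.pi * 200 * np.arange(882) / 44100)
vec = voice_feature_vector(frame)
# vec.shape == (12,)
```

The point of the truncated DCT is dimensionality reduction: 882 raw samples collapse into 12 coefficients that summarize the spectral envelope of the voice band.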
A feature vector 418 is determined for each timeslot. According to an embodiment, the spatial analysis system 100 increases the size of the audio timeslot. This is done when the initial audio timeslots are significantly short. Thus, if an audio timeslot is 20 milliseconds and the desired timeslot size is 1 second, the feature vectors 418 for all 20 millisecond timeslots occurring within a 1 second timeslot are averaged to generate an average feature vector 421 for the 1 second timeslot.
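The averaging of short-timeslot vectors into a longer timeslot can be sketched as follows; the function name and group size are illustrative assumptions (fifty 20 ms vectors per 1 second timeslot):

```python
import numpy as np

def average_feature_vectors(vectors, group_size):
    """Average consecutive short-timeslot feature vectors into one
    average feature vector per longer timeslot (e.g. 50 x 20 ms -> 1 s)."""
    groups = [vectors[i:i + group_size]
              for i in range(0, len(vectors) - group_size + 1, group_size)]
    return [np.mean(g, axis=0) for g in groups]

# Two seconds of synthetic 20 ms feature vectors (two per-second values).
short = [np.array([1.0, 2.0])] * 50 + [np.array([3.0, 4.0])] * 50
long_vecs = average_feature_vectors(short, group_size=50)
# long_vecs == [array([1., 2.]), array([3., 4.])]
```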
The spatial analysis system 100 includes a clustering module 428 that is executed by a processor in the lighting device 101, within a server application 127, or a combination thereof. The clustering module 428 clusters together the different feature vectors 418 or average feature vectors 421 to identify one or more clusters. The number of clusters corresponds to an estimated number of occupants 437 based on the detected audio 408. Although
According to an embodiment, the clustering module 428 uses k-means clustering to cluster the feature vectors 421. Clustering is an iterative process that incrementally adjusts how the feature vectors 421 are clustered as new feature vectors 421 are provided to the clustering module 428.
According to an embodiment of the invention, the clustering module 428 is initialized using the estimated number of occupants derived from the motion sensor 102, an example of which is described in
As explained above, the initial number of clusters provides general guidance for initialization purposes but does not necessarily force the clustering module to arrive at any particular result. This way, the audio-based occupancy estimation may be used to verify the motion-based occupancy estimation. It may also be used to detect discrepancies between the two. The occurrence of a discrepancy may indicate that one or more remote attendees are virtually present in the room but not physically present. The identification of a discrepancy may also be used to tune the parameters of the motion sensor 102, motion analyzer 315, or motion based occupancy estimator 322.
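Seeding the audio clustering with the motion-based estimate can be sketched as follows. This is an illustrative sketch using scikit-learn's k-means (an assumption; the disclosure does not name a library), and the synthetic feature vectors stand in for vectors 421. Note that plain k-means fixes the cluster count at the seed value; a variant that searches around the seed (e.g. by a silhouette score) would be needed for the cluster count to diverge from it.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two synthetic "speakers": feature vectors drawn around two centers.
speaker_a = rng.normal(loc=0.0, scale=0.1, size=(30, 4))
speaker_b = rng.normal(loc=5.0, scale=0.1, size=(30, 4))
features = np.vstack([speaker_a, speaker_b])

motion_estimate = 2  # first estimated number of occupants (from motion)
kmeans = KMeans(n_clusters=motion_estimate, n_init=10, random_state=0)
labels = kmeans.fit_predict(features)
audio_estimate = len(set(labels))  # second estimated number of occupants
# audio_estimate == 2
```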
At 606, the spatial analysis system 100 receives motion samples 312. A motion sensor 102 detects motion and the motion is converted into motion samples 312. The motion sensor 102 may be installed in a lighting device 101. At 609, the spatial analysis system 100 detects one or more motion events. The spatial analysis system 100 includes a motion analyzer 315 to identify motion events expressed in the motion samples 312 for a given timeslot. The motion analyzer 315 may classify the motion events into one among a plurality of motion event classifications such as a minor motion event or a major motion event. At 612, the spatial analysis system 100 determines a first estimated number of occupants. This first estimation may be referred to as a motion-based estimation. The first estimated number of occupants may be determined by detecting changes in the number of motion events counted for consecutive timeslots. The process repeats so that the spatial analysis system 100 continuously receives motion samples in order to determine an estimated number of occupants as each timeslot advances.
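The motion branch above (606 through 612) can be sketched as follows. This is a minimal illustration, not the patented implementation; the function name, event labels, and the confirmation threshold are all assumptions.

```python
def estimate_occupancy(events, current_count=0, confirm_threshold=2):
    """Update an occupancy count from one timeslot's motion events.

    events: list of classified event labels, e.g. ["major", "minor"].
    A "major" event tentatively changes the count; the change is
    confirmed only if enough "minor"/"medium" events accompany it.
    """
    majors = sum(1 for e in events if e == "major")
    minors = sum(1 for e in events if e in ("minor", "medium"))
    if majors > 0 and minors >= confirm_threshold:
        current_count += majors  # confirmed change in occupancy
    return current_count

# A major motion event confirmed by two minor events: one new occupant.
count = estimate_occupancy(["major", "minor", "minor"], current_count=0)
# count == 1
```

In a running system this would be invoked once per timeslot, carrying the count forward as the first (motion-based) estimated number of occupants.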
The spatial analysis system 100 may also concurrently detect audio. At 615, the spatial analysis system 100 receives audio samples 412. The spatial analysis system 100 includes a microphone 103 or audio sensor to receive audio and digitize it into audio samples 412. The microphone 103 may be installed in a lighting device 101 such that it is collocated with the motion sensor 102. At 618, the spatial analysis system 100 converts the audio samples into feature vectors 421. In this respect, each timeslot or frame of the audio sample series is quantified as a relatively short vector of coefficients tailored to mathematically represent the vocal nature of the audio samples. A clustering module 428 is used to cluster each feature vector to determine a cluster count.
At 621, the spatial analysis system 100 initializes the clustering module. For example, the spatial analysis system 100 obtains the first estimated number of occupants determined at 612 and configures the clustering module 428 to input the first estimated number of occupants as an initial cluster count. At 623, the spatial analysis system 100 clusters the feature vectors 421 into clusters 503. At 625, the spatial analysis system 100 determines a second estimated number of occupants. The second estimated number of occupants is the cluster count. The second estimated number of occupants may also be referred to as the audio-based estimation. The process repeats so that the spatial analysis system 100 continuously receives audio samples in order to determine an estimated number of occupants as each timeslot advances.
At 628, the spatial analysis system 100 determines whether the audio-based estimation and the motion-based estimation agree. If the two estimations agree such that they are the same, the spatial analysis system 100 generates an output at 631. The output comprises a number of occupants in the room in response to the comparison. Thus, if the motion-based estimation branch yielded a first estimated number of occupants of five, and the audio-based estimation branch also yielded a second estimated number of occupants of five, then the output includes data indicating that there are five occupants for a given timeslot. This output is stored as an occupancy statistic 115. The computing system 106 may provide this data to client devices over the network 105, using, for example, a dashboard, mobile application, or client portal. In this respect, administrators can track room occupancy rates defined by the length of the timeslot.
If the two estimations disagree, then at 634, the spatial analysis system 100 determines whether the motion-based estimation is higher than the audio-based estimation. If the motion-based estimation is higher, the spatial analysis system 100 reconfigures the motion detection process at 637. For example, the spatial analysis system 100 may adjust the parameters that define how to classify a motion event. In other words, the spatial analysis system 100 may adjust the sensitivity of the classification process to improve the accuracy of detecting motion events.
If the motion-based estimation is not higher than the audio-based estimation, then, at 640, the spatial analysis system 100 analyzes remote attendance. In this case, more unique people were speaking than expected. In a remote attendance analysis, the audio samples 412 and/or feature vectors 418 are analyzed to detect the presence of audio signatures that correspond to the output of a speaker device 205. Speaker devices 205 have unique audio characteristics such as the presence or absence of particular frequency patterns. The remote attendance analysis identifies the presence or absence of these audio characteristics within the audio samples or within a particular cluster. For example, if the motion-based estimation yielded an estimated occupancy of four attendees and the audio-based estimation process was seeded with an initial estimation of four, but ultimately detected five clusters, then a remote attendance analysis is performed. Upon performing the remote attendance analysis, one of the five audio-based clusters may have the characteristics that correspond to audio projected through a speaker device 205. Thereafter, at 631, an output is generated indicating the number of occupants and the number of remote attendees.
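The comparison at 628 through 640 can be summarized in a short decision function. This is a sketch under stated assumptions: the function and key names are illustrative, and which estimate to report when motion detection is being reconfigured is not specified by the disclosure.

```python
def reconcile_estimates(motion_est, audio_est, speaker_cluster_detected=False):
    """Compare the motion-based and audio-based occupancy estimates.

    Returns a dict with the reported occupant count and the follow-up
    action corresponding to steps 631, 637, and 640.
    """
    if motion_est == audio_est:
        # 631: estimates agree; report the count as an occupancy statistic.
        return {"occupants": motion_est, "action": "report"}
    if motion_est > audio_est:
        # 637: motion may be over-counting; retune motion classification.
        # Reporting the audio estimate here is an assumption.
        return {"occupants": audio_est, "action": "reconfigure_motion"}
    # 640: more distinct voices than physical occupants; check whether a
    # cluster matches audio projected through a speaker device 205.
    remote = audio_est - motion_est if speaker_cluster_detected else 0
    return {"occupants": motion_est, "remote_attendees": remote,
            "action": "remote_attendance_analysis"}

result = reconcile_estimates(4, 5, speaker_cluster_detected=True)
# result["occupants"] == 4, result["remote_attendees"] == 1
```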
The flowchart described in
The lighting device 101 also includes a processor(s) 703, a memory 706, and a communication interface 708, each of which is coupled to a local interface 712 or bus. The lighting device 101 further includes a motion sensor 102 and a microphone 103. The motion sensor 102 and microphone 103 may each be a wired accessory, a wireless accessory, or an integrated component with respect to the lighting device 101.
The communication interface 708 may include hardware, such as, for example, a network interface card, a modem, a transceiver, or radio and/or may include software such as, for example, a software module that encodes/decodes communication packets for transmission and receipt. The local interface 712 may comprise, for example, a data bus with an accompanying address/control bus or other bus structure as can be appreciated.
Stored in the memory 706 are both data and several components that are executable by the processor 703. In particular, stored in the memory 706 and executable by the processor 703 is the motion analyzer 315, motion based occupancy estimator 322, feature generation module 415, and clustering module 428. In addition, an operating system or firmware may be stored in the memory 706 and executable by the processor 703. In another embodiment, the components such as, for example, the motion analyzer 315, feature generation module 415, and clustering module 428 may be executed remotely in whole or in part by a computing system 106. Thus, the functionality provided by the lighting device may be part of a distributed architecture.
The communication interface 708 is configured to communicate with the computing system 106. The processor 703 uses the communication interface 708 to establish communication with components external to the lighting device 101. For example, the processor 703 may send instructions to the communication interface 708 to cause the transmission of raw or processed data samples. Similarly, data received from the communication interface 708 is forwarded to the processor 703.
Stored in the memory 806 are both data and several components that are executable by the processor 803. In particular, stored in the memory 806 and executable by the processor 803 is the server application 127. The server application 127 may implement at least some of the functionality that includes the motion analyzer 315, motion based occupancy estimator 322, feature generation module 415, and clustering module 428.
Also stored in the memory 806 may be a database 109 and other data. In addition, an operating system may be stored in the memory 806 and executable by the processor 803.
The communication interface 809 is configured to communicate with a plurality of lighting devices 101 over the network 105. The processor 803 uses the communication interface 809 to establish communication with components external to the computing system 106. For example, the processor 803 may send instructions to the communication interface 809 to cause the transmission of data to and from the lighting devices 101. Data received from the communication interface 809 is forwarded to the processor 803.
With respect to both
Several software components are stored in the memory 706, 806 and are executable by the processor 703, 803. In this respect, the term “executable” means a program file that is in a form that can ultimately be run by the processor 703, 803. Examples of executable programs may be, for example, a compiled program that can be translated into machine code in a format that can be loaded into a random access portion of the memory 706, 806 and run by the processor 703, 803, source code that may be expressed in proper format such as object code that is capable of being loaded into a random access portion of the memory 706, 806 and executed by the processor 703, 803, or source code that may be interpreted by another executable program to generate instructions in a random access portion of the memory 706, 806 to be executed by the processor 703, 803, etc. An executable program may be stored in any portion or component of the memory 706, 806 including, for example, random access memory (RAM), read-only memory (ROM), hard drive, solid-state drive, USB flash drive, memory card, optical disc such as compact disc (CD) or digital versatile disc (DVD), floppy disk, magnetic tape, or other memory components.
The memory 706, 806 is defined herein as including both volatile and nonvolatile memory and data storage components. Volatile components are those that do not retain data values upon loss of power. Nonvolatile components are those that retain data upon a loss of power. Thus, the memory 706, 806 may comprise, for example, random access memory (RAM), read-only memory (ROM), hard disk drives, solid-state drives, USB flash drives, memory cards accessed via a memory card reader, floppy disks accessed via an associated floppy disk drive, optical discs accessed via an optical disc drive, magnetic tapes accessed via an appropriate tape drive, and/or other memory components, or a combination of any two or more of these memory components. In addition, the RAM may comprise, for example, static random access memory (SRAM), dynamic random access memory (DRAM), or magnetic random access memory (MRAM) and other such devices. The ROM may comprise, for example, a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or other like memory device.
Also, the processor 703, 803 may represent multiple processors 703, 803 and/or multiple processor cores and the memory 706, 806 may represent multiple memories 706, 806 that operate in parallel processing circuits, respectively. In such a case, the local interface 712, 812 may be an appropriate network that facilitates communication between any two of the multiple processors 703, 803, between any processor 703, 803 and any of the memories 706, 806, or between any two of the memories 706, 806, etc. The local interface 712, 812 may comprise additional systems designed to coordinate this communication, including, for example, performing load balancing. The processor 703, 803 may be of electrical or of some other available construction.
Although the software applications or programs as described herein may be embodied in software or code executed by general purpose hardware as discussed above, as an alternative the same may also be embodied in dedicated hardware or a combination of software/general purpose hardware and dedicated hardware. If embodied in dedicated hardware, each can be implemented as a circuit or state machine that employs any one of or a combination of a number of technologies. These technologies may include, but are not limited to, discrete logic circuits having logic gates for implementing various logic functions upon an application of one or more data signals, application specific integrated circuits (ASICs) having appropriate logic gates, field-programmable gate arrays (FPGAs), or other components, etc. Such technologies are generally well known by those skilled in the art and, consequently, are not described in detail herein.
The foregoing detailed description has set forth a few of the many forms that the invention can take. The above examples are merely illustrative of several possible embodiments of various aspects of the present invention, wherein equivalent alterations and/or modifications will occur to others skilled in the art upon reading and understanding the present invention and the annexed drawings. In particular, in regard to the various functions performed by the above described components (devices, systems, and the like), the terms (including a reference to a "means") used to describe such components are intended to correspond, unless otherwise indicated, to any component, such as hardware or combinations thereof, which performs the specified function of the described component (i.e., that is functionally equivalent), even though not structurally equivalent to the disclosed structure which performs the function in the illustrated implementations of the disclosure.
Although a particular feature of the present invention may have been illustrated and/or described with respect to only one of several implementations, any such feature may be combined with one or more other features of the other implementations as may be desired and advantageous for any given or particular application. Furthermore, references to singular components or items are intended, unless otherwise specified, to encompass two or more such components or items. Also, to the extent that the terms “including”, “includes”, “having”, “has”, “with”, or variants thereof are used in the detailed description and/or in the claims, such terms are intended to be inclusive in a manner similar to the term “comprising”.
The present invention has been described with reference to the preferred embodiments. However, modifications and alterations will occur to others upon reading and understanding the preceding detailed description. It is intended that the present invention be construed as including all such modifications and alterations. It is only the claims, including all equivalents that are intended to define the scope of the present invention.
In the claims references in parentheses refer to reference signs in drawings of exemplifying embodiments or to formulas of embodiments, thus increasing the intelligibility of the claim. These references shall not be construed as limiting the claim.
Foreign Application Priority Data

| Number | Date | Country | Kind |
|---|---|---|---|
| 19202501.3 | Oct 2019 | EP | regional |

PCT Filing Data

| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/EP2020/076667 | 9/24/2020 | WO | |

Related U.S. Application Data

| Number | Date | Country |
|---|---|---|
| 62905794 | Sep 2019 | US |