This disclosure relates generally to automatic determination and notification of room status.
In a large office building or complex, a facilities supervisor is responsible for ensuring that all of the facilities in the building or complex are properly operating and configured for use. One example of interest here is the state of common areas, particularly conference rooms, after the common area has been used. Users of a conference room often fail to clean the conference room before they leave. There is writing on a whiteboard. There are papers and food containers strewn over the conference table. The wastebasket is overflowing. The chairs are in disarray. This means that, at the least, the conference room is very messy or dirty for the next users. It may also mean that confidential materials remain available in the room, such as the writing on the whiteboard and the papers that are still present, which is a breach of most security protocols.
While facilities cleaning staff can be assigned to perform these duties, this becomes an appreciable expense as the building or complex likely has many conference rooms and other common areas to monitor, with frequent turnover of the rooms, effectively requiring dedicated personnel.
For illustration, there are shown in the drawings certain examples described in the present disclosure. In the drawings, like numerals indicate like elements throughout.
The full scope of the inventions disclosed herein is not limited to the precise arrangements, dimensions, and instruments shown. In the drawings:
FIG. 7B is a flowchart of clean room processing using neural network processing according to an example of the present disclosure.
Examples according to this description provide a facilities supervisor with a notice after a conference room or other common area has been used and is in a messy or dirty condition. An image of the conference room is obtained after use and is evaluated to determine if the conference room is in a clean or neat condition. Using one of several techniques, a cleanliness or neatness score is obtained for the conference room after use. If the score indicates a neatness value above a settable level, the conference room is considered clean and ready for use. If the neatness score indicates a neatness value less than the settable level, the conference room is not ready for use, and a notice is provided to the facilities supervisor to allow a cleaning person to be dispatched.
The need to perform the cleanliness or neatness review is triggered in several ways. A first way is by referencing scheduled meetings in a calendaring system and triggering after a scheduled meeting is completed. A second way is by monitoring the conference room for the presence of individuals having an unscheduled meeting. The individuals leaving the conference room triggers a cleanliness review. A third way is a periodic or random check, after confirming the conference room is not in use.
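The three trigger paths above can be sketched as a single decision function. This is a minimal, hypothetical sketch; the function name, parameters, and two-hour check interval are illustrative assumptions, not taken from the disclosure.

```python
from datetime import datetime, timedelta

def should_check_cleanliness(now, meeting_end=None, occupants_left=False,
                             last_check=None, check_interval=timedelta(hours=2),
                             room_in_use=False):
    """Return True when a cleanliness review should be triggered."""
    # First way: a scheduled meeting from the calendaring system has ended.
    if meeting_end is not None and now >= meeting_end:
        return True
    # Second way: occupancy monitoring saw the last individual leave.
    if occupants_left:
        return True
    # Third way: a periodic check, performed only if the room is not in use.
    if (last_check is None or now - last_check >= check_interval) and not room_in_use:
        return True
    return False
```

In practice, the first two triggers would arrive as events from the calendaring server or the occupancy-detection process, while the third would run on a timer.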
One technique for determining the conference room cleanliness level performs object or feature detection on both a reference image, made when the conference room is configured as desired, and the post-meeting image. The detected objects and features of the two images are compared and analyzed using a distance algorithm or a structural similarity index, with a resulting score to be used for comparison to the settable neatness score.
Another technique uses a neural network trained with both clean and dirty images. The post-meeting image is provided to the neural network and a confidence score is provided as an output; the confidence score is then compared to the settable neatness score.
By performing the post-meeting image analysis to determine room cleanliness or neatness, cleaning staff is dispatched only when necessary and is not required to continually monitor the many conference rooms and other common areas in the office building or complex. This saves expense by reducing the need for cleaning staff to continually check the state of the conference rooms.
In the drawings and the description of the drawings herein, certain terminology is used for convenience only and is not to be taken as limiting the examples of the present disclosure. In the drawings and the description below, like numerals indicate like elements throughout.
Computer vision is an interdisciplinary scientific field that deals with how computers can be made to gain high-level understanding from digital images or videos. Computer vision seeks to automate tasks imitative of the human visual system. Computer vision tasks include methods for acquiring, processing, analyzing, and understanding digital images, and extraction of high-dimensional data from the real world to produce numerical or symbolic information. Computer vision is concerned with artificial systems that extract information from images. Computer vision includes algorithms which receive a video frame as input and produce data detailing the visual characteristics that a system has been trained to detect.
Traditional computer vision techniques perform feature extraction and object detection in various ways. In one example, edge detection is used to identify relevant points in an image. These relevant points can be compared to models to identify the particular object and its orientation. If this object detection is performed on a reference image, particularly if the reference image has the relevant objects marked, object detection then can be performed on a sample image. The objects found in the sample image can then be compared to the objects in the reference image. Various techniques can be used to determine how close a match the sample image is to the reference image, many involving various distance algorithms or a structural similarity index (SSIM).
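A toy gradient-based edge detector and edge-map comparison illustrate the idea; production systems would use a library such as OpenCV. This sketch operates on small grayscale images represented as lists of lists, and all names are illustrative assumptions.

```python
def edge_map(img, threshold=1):
    """Mark pixels whose horizontal or vertical gradient exceeds threshold."""
    h, w = len(img), len(img[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h - 1):
        for x in range(w - 1):
            gx = img[y][x + 1] - img[y][x]  # horizontal gradient
            gy = img[y + 1][x] - img[y][x]  # vertical gradient
            if abs(gx) + abs(gy) > threshold:
                edges[y][x] = 1
    return edges

def edge_difference(ref_edges, sample_edges):
    """Count edge pixels that differ between reference and sample maps."""
    return sum(1 for r1, r2 in zip(ref_edges, sample_edges)
               for a, b in zip(r1, r2) if a != b)
```

Comparing the edge map of the post-meeting image against that of the clean reference gives a rough measure of how much the room's appearance has changed.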
A convolutional neural network is a class of deep neural network which can be applied to analyzing visual imagery. A deep neural network is an artificial neural network with multiple layers between the input and output layers.
Artificial neural networks are computing systems inspired by the biological neural networks that constitute animal brains. Artificial neural networks exist as code being executed on one or more processors. An artificial neural network is based on a collection of connected units or nodes called artificial neurons, which mimic the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit a ‘signal’ to other neurons. An artificial neuron that receives a signal then processes it and can signal neurons connected to it. The signal at a connection is a real number, and the output of each neuron is computed by some non-linear function of the sum of its inputs. The connections are called edges. Neurons and edges have weights, the value of which is adjusted as ‘learning’ proceeds or as new data is received by a state system. The weight increases or decreases the strength of the signal at a connection. Neurons may have a threshold such that a signal is sent only if the aggregate signal crosses that threshold.
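The behavior of a single artificial neuron described above can be sketched as follows. This is a minimal illustration of the weighted-sum, non-linearity, and optional threshold; the sigmoid choice and parameter names are assumptions for illustration only.

```python
import math

def neuron(inputs, weights, bias, threshold=None):
    """One artificial neuron: a non-linear function of the weighted input sum."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    activation = 1.0 / (1.0 + math.exp(-total))  # sigmoid non-linearity
    if threshold is not None and activation < threshold:
        return 0.0  # aggregate signal below threshold: no signal is sent
    return activation
```

During training, the weights and bias would be adjusted so that the network's outputs move toward the desired labels.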
Referring now to
In
In some examples, only one of the calendaring server 404 operations of
In step 712, the obtained image is processed to develop a neatness score to determine if the conference room is sufficiently clean for use. This processing is detailed below and in
The room clean processing of step 712 is performed in different manners in different examples. In one group of examples, traditional computer vision processing is performed. In another group of examples, deep learning and neural networks are used.
Referring to
When an image is obtained in step 710, that obtained image is received in step 750 for processing. The obtained image is processed in the same manner as the reference image in step 752, except that the objects are not marked. The computer vision processing develops relevant data similar to that stored for the reference image. The reference image data is retrieved in step 754. The features determined in the obtained image are compared to the features determined in the reference image in step 756, in some examples using one of various distance algorithms, such as Euclidean distance, Hamming distance, cosine distance and the like. In other examples, a structural similarity index (SSIM) is used to compare the features in the two images.
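The distance algorithms named above can be sketched for feature vectors as follows; these are standard definitions, applied here to feature vectors assumed to have been extracted from the reference and obtained images.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hamming(a, b):
    """Number of positions at which two feature vectors differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def cosine_distance(a, b):
    """One minus the cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)
```

For each metric, a smaller result indicates that the obtained image's features are closer to the clean reference.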
Referring to
The end result of the image comparison of step 756 is a score value; conceptually, the higher the score, the higher the image similarity. This may require inversion of the distance result, as a more similar image will generally have a lower distance result. Alternatively, if direct distance results are to be used, then a distance result less than the threshold would indicate a cleaner, neater conference room. This discussion will generally use a higher score indicating a cleaner room for ease of understanding. Inversion can be performed if desired, with corresponding changes to the described threshold comparison. The more general statement is that a distance score indicating a cleaner conference room than specified by the threshold value passes, while a distance score indicating a dirtier room than specified by the threshold value results in the notice being provided to the facilities supervisor.
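The inversion and threshold comparison just described can be sketched as a small function. The normalization by a maximum expected distance is an illustrative assumption; the disclosure only requires that a lower distance map to a higher neatness score.

```python
def is_room_clean(distance, max_distance, neatness_threshold):
    """Invert a distance result into a neatness score and compare to threshold.

    A lower distance means the obtained image is closer to the clean
    reference, so the score is inverted: a higher score means a cleaner room.
    """
    neatness_score = 1.0 - min(distance, max_distance) / max_distance
    return neatness_score >= neatness_threshold
```

A room with distance 0 from the reference scores 1.0 and passes any threshold, while a large distance drives the score toward 0 and triggers the messy notification.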
Referring to
Many factors determine the use of traditional computer vision methods or deep learning methods. The traditional computer vision method has the advantage that the collection of the images of the conference room in clean and messy conditions is not required, just the single clean reference image. Additionally, traditional computer vision methods generally require fewer computing resources, as neural networks are often very large and require intensive computations. However, in specific environments, the deep learning method may provide better results.
The processing unit 802 can include digital signal processors (DSPs), central processing units (CPUs), graphics processing units (GPUs), dedicated hardware elements, such as neural network accelerators and hardware videoconferencing endpoints, and the like in any desired combination.
The flash memory 804 stores modules of varying functionality in the form of software and firmware, generically programs or instructions, for controlling the videoconferencing endpoint 800. Illustrated modules include a video codec 850, camera control 852, face and body finding 853, neural network models 855, framing 854, room occupied 863, messaging 867, other video processing 856, camera location and selection 857, audio codec 858, audio processing 860, sound source localization 861, network operations 866, user interface 868 and operating system and various other modules 870. The RAM 805 is used for storing any of the modules in the flash memory 804 when the module is executing, storing video images of video streams and audio samples of audio streams and can be used for scratchpad operation of the processing unit 802. The room occupied module 863 uses the neural network models 855 and face and body finding 853 to determine if the conference room C is occupied or has been empty as discussed with
The network interface 808 enables communications between the videoconferencing endpoint 800 and other devices and can be wired, wireless or a combination. In one example, the network interface 808 is connected or coupled to the Internet 830 to communicate with remote endpoints 840 in a videoconference. In one or more examples, the I/O interface 810 provides data transmission with local devices such as a keyboard, mouse, printer, projector, display, external loudspeakers, additional cameras, and microphone pods, etc.
In one example, the imager 816 and external camera 819 and the microphone array 814 and microphones 815A and 815B capture video and audio, respectively, in the videoconference environment and produce video and audio streams or signals transmitted through the bus 817 to the processing unit 802. In at least one example of this disclosure, the processing unit 802 processes the video and audio using algorithms in the modules stored in the flash memory 804. Processed audio and video streams can be sent to and received from remote devices coupled to network interface 808 and devices coupled to general interface 810. This is just one example of the configuration of a videoconferencing endpoint 800.
A graphics acceleration module 924 is connected to the high-speed interconnect 908. A display subsystem 926 is connected to the high-speed interconnect 908 to allow operation with and connection to various video monitors. A system services block 932, which includes items such as DMA controllers, memory management units, general-purpose I/O's, mailboxes, and the like, is provided for normal SoC 900 operation. A serial connectivity module 934 is connected to the high-speed interconnect 908 and includes modules as normal in an SoC. A vehicle connectivity module 936 provides interconnects for external communication interfaces, such as PCIe block 938, USB block 940 and an Ethernet switch 942. A capture/MIPI module 944 includes a four-lane CSI-2 compliant transmit block 946 and a four-lane CSI-2 receive module and hub.
An MCU island 960 is provided as a secondary subsystem and handles operation of the integrated SoC 900 when the other components are powered down to save energy. An MCU ARM processor 962, such as one or more ARM R5F cores, operates as a master and is coupled to the high-speed interconnect 908 through an isolation interface 961. An MCU general purpose I/O (GPIO) block 964 operates as a slave. MCU RAM 966 is provided to act as local memory for the MCU ARM processor 962. A CAN bus block 968, an additional external communication interface, is connected to allow operation with a conventional CAN bus environment in a vehicle. An Ethernet MAC (media access control) block 970 is provided for further connectivity. External memory, generally non-volatile memory (NVM) such as flash memory 804, is connected to the MCU ARM processor 962 via an external memory interface 969 to store instructions loaded into the various other memories for execution by the various appropriate processors. The MCU ARM processor 962 operates as a safety processor, monitoring operations of the SoC 900 to ensure proper operation of the SoC 900.
It is understood that this is one example of an SoC provided for explanation and many other SoC examples are possible, with varying numbers of processors, DSPs, accelerators and the like.
While the above discussion has focused on conference rooms and offices, many other room types and building types can also be used. In an office environment, auditoriums, lounge spaces and other common spaces which have scheduled uses are suitable for such monitoring.
While providing a notice of a messy conference room to a facilities supervisor has been used as the example, other actions can also be automatically triggered, such as contacting additional parties involved with the conference room use, such as the meeting organizer or the organizer's assistant. The messy state can also be logged, including being logged into the records of all parties attending the meeting, so that repeat offenders can be detected and provided further instruction.
It is understood that devices other than videoconferencing endpoints can be used in a conference room. Simple IP-connected cameras can be used to provide the reference and sample or obtained images. The informal meeting processing could be performed on the facilities server, with the facilities server polling the conference room cameras for images to use to detect individuals in the conference room and then proceeding as described above for the operation of the videoconferencing endpoint, providing an internal notification from the image processing process to the clean or messy processes.
By detecting the neatness of a conference room or other common area after a scheduled meeting or an informal meeting, the conference room can be provided in a neat condition for the next user. The detection is performed by comparing a reference image of the properly arranged conference room with an image of the conference room obtained after the scheduled meeting or informal meeting using traditional vision processing. Alternatively, the detection is performed using a neural network trained to determine the neatness of the conference room. A neatness score is developed and compared to a selectable threshold. If the conference room is messier than the selectable threshold, a notice is provided to the facilities supervisor to arrange for cleaning and straightening up of the conference room. The use of the automated neatness determination allows many different conference rooms and other common areas to be maintained without requiring additional cleaning staff.
A system of one or more computers can be configured to perform particular operations or actions by virtue of having software, firmware, hardware, or a combination of them installed on the system that in operation causes the system to perform the actions. One or more computer programs can be configured to perform particular operations or actions by virtue of including instructions that, when executed by data processing apparatus, cause the apparatus to perform the actions. One general aspect includes receiving a completion notification of completion of use of a common area of a multiplicity of common areas. The method also includes obtaining an image of the arrangement of items in the common area based on the receipt of the completion notification. The method also includes developing a neatness score of the arrangement of items in the common area as shown in the obtained image. The method also includes determining if the neatness score indicates a neatness level messier than the neatness level of a selectable threshold. The method also includes providing a messy notification to a user when the neatness score indicates a neatness level messier than the neatness level of the selectable threshold. Other examples of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.
Implementations may include one or more of the following features. The completion notification is received from a calendaring server maintaining a schedule of the multiplicity of common areas. The completion notification is received from a videoconferencing endpoint located in the common area, the videoconferencing endpoint performing face or body detection to determine use of the common area. Obtaining an image of the arrangement of items in the common area is further based on periodically determining the need to evaluate the neatness of the common area. Developing a neatness score can include obtaining a reference image of the arrangement of items in the common area; and comparing the reference image and the obtained image and determining a distance metric between the reference image and the obtained image, the distance metric representing the neatness score. Developing a neatness score can include training a neural network to determine neatness level of the common area, the neural network having an output of a neatness score; and processing the obtained image by the neural network and obtaining the output neatness score. The messy notification includes the obtained image of the common area. Implementations of the described techniques may include hardware, a method or process, or computer software on a computer-accessible medium.
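The general method recited above can be sketched end to end. The function and parameter names are illustrative assumptions; the image acquisition, scoring, and notification steps are passed in as callables so the sketch stays independent of any particular technique.

```python
def handle_completion_notification(obtain_image, score_image, threshold, notify):
    """Sketch of the general aspect: image -> neatness score -> notify if messy."""
    image = obtain_image()          # obtain an image of the common area
    score = score_image(image)      # develop a neatness score for the image
    if score < threshold:           # messier than the selectable threshold
        notify(image)               # messy notification includes the obtained image
        return False                # room needs attention
    return True                     # room is ready for use
```

Here `score_image` could wrap either the traditional computer vision comparison or the neural network inference described earlier.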
The various examples described are provided by way of illustration and should not be construed to limit the scope of the disclosure. Various modifications and changes can be made to the principles and examples described herein without departing from the scope of the disclosure and without departing from the claims which follow.