The present disclosure relates to the field of determining intent of a person and in particular to determining when there is intent of a person to open a door.
Locks and keys are evolving from the traditional pure mechanical locks. These days, there are wireless interfaces for electronic locks, e.g. by interacting with a portable key device. For instance, Radio Frequency Identification (RFID) has been used as the wireless interface.
When RFID is used, the user needs to present the portable key device in close proximity to a reader connected to the lock. RFID requires a relatively large antenna in the reader by the lock and uses a large amount of energy. Moreover, RFID is not an interface which can be used for remote system management of the lock; only system management using an RFID device in close proximity of the lock can be used for such tasks. Hence, to allow remote system management, e.g. configuration and monitoring, a second radio interface needs to be added.
Another solution is to use Bluetooth Low Energy (BLE) or ultra-wideband (UWB), or Ultra High Frequency (UHF). However, for these communication protocols, the range is longer, and it is difficult to determine intent for a particular door. One problem if the lock of a door unlocks whenever a valid portable key device is within range is that when a person on the inside of an electronic lock walks past the electronic lock, the electronic lock would open, and an unauthorised person on the outside could gain access to the restricted physical space.
Additionally, for unlocked automatic doors, e.g. sliding doors or swing doors with door openers, it is desired to determine intent to open the door when there is intent, but to avoid opening the door when there is no intent, e.g. to save energy for climatising the indoor space.
One object is to improve determination of intent of a person to open a door.
According to a first aspect, it is provided a method for determining intent to open a door, the method being performed by an intent determiner. The method comprises: obtaining an image of a physical space near the door; determining a position and orientation of a person in the image by providing the image to an image machine learning model, wherein the image machine learning model is configured to determine position and orientation of a person in the image, wherein the image machine learning model is configured to determine a stick figure of the person based on the image; adding a data item, comprising an indicator of the position and an indicator of the orientation, to a data structure; determining based on the data structure, whether there is intent of the person to open the door; and repeating the method until an exit condition is true.
In the step of adding the data item, the data item may comprise coordinates of anatomical features represented by the stick figure.
In the step of adding the data item, the data item may comprise an orientation of at least one anatomical feature represented by the stick figure.
The adding data item may comprise adding the data item while preserving its position in a sequence in relation to any preceding data items in the data structure.
The exit condition may be true when the intent determiner determines that there is intent of the person to open the door, in which case the method further comprises: sending a signal to proceed with a door opening process.
The exit condition may be true when the person is no longer determined to be in the image.
The determining a position and orientation may comprise determining a centre point of the person in the image, in which case the centre point is the indicator of position.
The determining a position and orientation may comprise determining a direction of the person in the image, indicating a direction of a torso of the person, in which case the direction of the person in the image is the indicator of orientation.
The adding the data item may comprise adding an indicator of time to the data structure, in association with the data item.
The determining whether there is intent may comprise providing the data structure to an intent machine learning model, in which case the intent machine learning model is configured to infer intent or lack of intent based on the input data structure.
According to a second aspect, it is provided an intent determiner for determining intent to open a door. The intent determiner comprises: a processor; and a memory storing instructions that, when executed by the processor, cause the intent determiner to: obtain an image of a physical space near the door; determine a position and orientation of a person in the image by providing the image to an image machine learning model, wherein the image machine learning model is configured to determine position and orientation of a person in the image, wherein the image machine learning model is configured to determine a stick figure of the person based on the image; add a data item, comprising an indicator of the position and an indicator of the orientation, to a data structure; determine based on the data structure, whether there is intent of the person to open the door; and repeat the instructions until an exit condition is true.
The data item may comprise coordinates of anatomical features represented by the stick figure.
The data item may comprise an orientation of at least one anatomical feature represented by the stick figure.
The instructions to add the data item may comprise instructions that, when executed by the processor, cause the intent determiner to: add the data item while preserving its position in a sequence in relation to any preceding data items in the data structure.
The exit condition may be true when the intent determiner determines that there is intent of the person to open the door, in which case the intent determiner further comprises instructions that, when executed by the processor, cause the intent determiner to: send a signal to proceed with a door opening process.
The exit condition may be true when the person is no longer determined to be in the image.
The instructions to determine a position and orientation may comprise instructions that, when executed by the processor, cause the intent determiner to determine a centre point of the person in the image, in which case the centre point is the indicator of position.
The instructions to determine a position and orientation may comprise instructions that, when executed by the processor, cause the intent determiner to determine a direction of the person in the image, indicating a direction of a torso of the person, in which case the direction of the person in the image is the indicator of orientation.
The instructions to add the data item may comprise instructions that, when executed by the processor, cause the intent determiner to add an indicator of time to the data structure, in association with the data item.
The instructions to determine whether there is intent may comprise instructions that, when executed by the processor, cause the intent determiner to provide the data structure to an intent machine learning model, in which case the intent machine learning model is configured to infer intent or lack of intent based on the input data structure.
According to a third aspect, it is provided a computer program for determining intent to open a door. The computer program comprises computer program code which, when executed on the intent determiner, causes the intent determiner to: obtain an image of a physical space near the door; determine a position and orientation of a person in the image by providing the image to an image machine learning model, wherein the image machine learning model is configured to determine position and orientation of a person in the image, wherein the image machine learning model is configured to determine a stick figure of the person based on the image; add a data item, comprising an indicator of the position and an indicator of the orientation, to a data structure; determine based on the data structure, whether there is intent of the person to open the door; and repeat the computer program code until an exit condition is true.
According to a fourth aspect, it is provided a computer program product comprising a computer program according to the third aspect and a computer readable means comprising non-transitory memory in which the computer program is stored.
Generally, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein. All references to “a/an/the element, apparatus, component, means, step, etc.” are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.
Aspects and embodiments are now described, by way of example, with reference to the accompanying drawings, in which:
The aspects of the present disclosure will now be described more fully hereinafter with reference to the accompanying drawings, in which certain embodiments of the invention are shown. These aspects may, however, be embodied in many different forms and should not be construed as limiting; rather, these embodiments are provided by way of example so that this disclosure will be thorough and complete, and to fully convey the scope of all aspects of invention to those skilled in the art. Like numbers refer to like elements throughout the description.
Embodiments presented herein provide an intent determiner to determine when there is intent of a person to open a door. A camera (or other imaging device) repetitively obtains images of an area near the door. For each image, the intent determiner abstracts the anatomy of a person to a stick figure, to determine a position and orientation of the person and stores this in a data structure. Intent to pass through the door is then determined based on the data structure, e.g. using a machine learning model.
When the access control by the electronic lock 12 results in granted access, the electronic lock 12 is set in an unlocked state. When the electronic lock 12 is in the unlocked state, the door 15 can be opened and when the electronic lock 12 is in a locked state, the door 15 cannot be opened. In this way, access to a closed space 16 is controlled by the electronic lock 12. It is to be noted that the electronic lock 12 can be mounted in the fixed structure 11 by the door 15 (as shown) or in the door 15 itself (not shown).
A camera 4, or other image capturing device, is provided to capture images of the area near the door 15. This allows the camera 4 to capture when a person 5 moves from a first position 2a, to a second position 2b and on to a third position 2c. The camera can be provided mounted to the physical structure 11, as shown, or in any other suitable position, such as above the door 15 or mounted to a ceiling above the space just outside the door, to capture a person near the door. The camera can be configured to capture an image based on visual light/IR light and/or depth information e.g. based on lidar, radar, stereo vision, etc. Hence, images from the camera can be 2D (two-dimensional) images or 3D (three-dimensional) images.
As described in more detail below, images from the camera 4 are analysed sequentially to thereby follow the position and orientation of the person 5 over time, allowing the determination of whether the person 5 shows intent to enter through the door 15 or not.
The electronic lock 12 and/or the camera 4 optionally contains communication capabilities to connect to a server 6 for the electronic access control system 10 via a communication network 7. The communication network 7 can be a wide area network, such as the Internet, to which the portable key devices 2, 3 can connect e.g. via WiFi (e.g. any of the IEEE 802.11x standards) or a cellular network, e.g. LTE (Long Term Evolution), next generation mobile networks (fifth generation, 5G), UMTS (Universal Mobile Telecommunications System) utilising W-CDMA (Wideband Code Division Multiple Access), etc.
The anatomical features that are identified can be a predefined set of joints and other anatomical features. One example of such a set of anatomical features is illustrated in
Optionally, the image ML model not only identifies the anatomical features, but also determines an orientation of selected features. For instance, the orientation of the head 20q can be determined, e.g. by image processing of the image of the user, based on identification of face features (e.g. nose, eyes, mouth, etc.). Additionally, an orientation of the torso 21 can be determined based on image processing of the image of the user and/or based on the relative coordinates of the other anatomical features.
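By way of a non-limiting illustrative sketch, the orientation of the torso can, for example, be estimated from the relative coordinates of the two shoulder joints; the function name and the coordinate convention (image-plane (x, y) tuples, angle in radians of the shoulder line) are assumptions for illustration only, with the facing direction of the torso being perpendicular to the returned angle:

```python
import math

def torso_angle(left_shoulder, right_shoulder):
    """Angle (radians) of the shoulder line in the image plane,
    computed from the relative coordinates of the two shoulder
    joints of the stick figure. The torso facing direction is
    perpendicular to this line."""
    # Vector along the shoulder line, from left to right shoulder.
    dx = right_shoulder[0] - left_shoulder[0]
    dy = right_shoulder[1] - left_shoulder[1]
    return math.atan2(dy, dx)
```

The same principle extends to other feature pairs, e.g. the hip joints, and several such estimates can be combined for robustness.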
The stick figure of the same user, when analysed over time, can be used as a clear indication of movement. The abstraction of body posture and movement of body features (when analysed over several images) is not only efficient in data use, but accurately depicts movement of the person. Additionally, since no images in themselves need to be saved, privacy of the person is preserved.
In an obtain image step 40, the intent determiner 1 obtains an image of a physical space near the door. The physical space being near the door is here to be construed as the physical space being a space through which a person has to pass in order to pass through the door, and that is sufficiently near to be able to determine position and orientation in an image of the near space.
In a determine position and orientation step 42, the intent determiner 1 determines a position and orientation of a person in the image. Optionally, a bounding box (see
Optionally, a centre point of the person in the image is determined. The centre point is then the indicator of position.
Optionally, a direction of the person in the image is determined. The direction indicates a direction of a torso of the person. The shape of the head and/or head organs (e.g. eye(s), ear(s), mouth, nose, etc.) can also be used to find the direction of the person. The direction of the person in the image is then the indicator of orientation.
This step comprises providing the image to an image ML model, wherein the image ML model is configured to determine position and orientation of a person in the image. The image ML model is configured to determine a stick figure (see
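By way of a non-limiting illustrative sketch, the centre point serving as the indicator of position can, for example, be computed as the mean of the keypoint coordinates determined by the image ML model; the representation of the stick figure as a mapping from feature names to (x, y) tuples (with None for undetermined features) is an assumption for illustration only:

```python
def centre_point(keypoints):
    """Centre point of the person: mean of the available keypoint
    coordinates of the stick figure. `keypoints` maps anatomical
    feature names to (x, y) tuples; features the image ML model
    could not determine are None and are skipped."""
    pts = [p for p in keypoints.values() if p is not None]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (sum(xs) / len(pts), sum(ys) / len(pts))
```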
In an add to data structure step 44, the intent determiner 1 adds a data item, comprising an indicator of the position and an indicator of the orientation, to a data structure. Optionally, the data item comprises coordinates and/or an orientation of (at least some) anatomical features represented by the stick figure.
The adding the data item can comprise adding the data item while preserving its position in a sequence in relation to any preceding data item in the data structure. For instance, an indicator of time (e.g. timestamp) can be added to the data structure, in association with or as part of the data item. Alternatively or additionally, a sequence number can be added in association with the data item. Alternatively or additionally, the data structure in itself maintains a sequence, such that any newly added data item preserves the order in which the data items were added. This can e.g. be achieved by the data structure being based on a linked list structure or similar. Since the position of the data item is preserved in a sequence, the data structure can be used to evaluate how the position and orientation of the person evolves over time.
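By way of a non-limiting illustrative sketch, a sequence-preserving data structure with timestamps can, for example, be based on a double-ended queue, where appending inherently preserves the order in which data items were added; the class and field names are assumptions for illustration only:

```python
import time
from collections import deque

class IntentDataStructure:
    """Holds the sequence of data items for one person. Appending to
    a deque preserves each item's position in the sequence relative
    to any preceding data items, and a timestamp is stored in
    association with each data item."""

    def __init__(self):
        self.items = deque()

    def add(self, position, orientation, timestamp=None):
        """Add a data item with indicators of position and
        orientation; the indicator of time defaults to now."""
        ts = time.time() if timestamp is None else timestamp
        self.items.append({'time': ts,
                           'position': position,
                           'orientation': orientation})
```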
When a bounding box has been determined, details defining the bounding box (e.g. suitable coordinates and/or size data) are also stored in the data structure, in association with the position and orientation of the person in the image.
Optionally, items from the data structure are removed if they are determined to be old (determined by a time stamp or by a number of more recent data items for the person in question in the data structure).
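By way of a non-limiting illustrative sketch, the optional removal of old items can, for example, be based on either criterion mentioned above, a maximum age determined from the timestamp or a maximum number of more recent data items; the function signature is an assumption for illustration only:

```python
def prune_old_items(items, now, max_age=None, max_items=None):
    """Remove data items determined to be old, either because their
    timestamp is more than `max_age` seconds before `now`, or because
    more than `max_items` newer data items exist for the person."""
    kept = [it for it in items
            if max_age is None or now - it['time'] <= max_age]
    if max_items is not None:
        kept = kept[-max_items:]  # keep only the most recent items
    return kept
```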
In a determine intent step 46, the intent determiner 1 determines, based on the data structure, whether there is intent of the person to open the door. This determination can be based on the sequence of data items in the data structure, wherein the sequence is ordered according to when the data items were added to the data structure. In this way, the movement over time, in position and orientation, is represented by the data structure.
In one embodiment, the determination of whether there is intent can comprise providing the data structure to an intent ML model. The intent ML model is then configured to infer intent (or absence of intent) based on the input data structure (i.e. the data structure comprising the position and orientation of the person as described above). The intent ML model can be any suitable type of machine learning model, e.g. a convolutional neural network (CNN), a hidden Markov model (HMM), LSTM (Long Short-Term Memory), machine learning transformers, etc. The use of machine learning fits well for this determination, since there are only a very limited number of possible outcomes, such as ‘person having intent’ and ‘person not having intent’. It is to be noted that there may be occasions when the image ML model is unable to determine the coordinates for all anatomical features due to inaccuracies in the image and/or partially blocking objects. In this case, such coordinates can be left blank, and the intent ML model still has many other coordinates on which to perform its inference. Alternatively, prior to providing the data structure to the intent ML model, coordinates are estimated where they are not determined by the image ML model.
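By way of a non-limiting illustrative sketch, the estimation of coordinates not determined by the image ML model can, for example, carry forward the last known coordinate for each anatomical feature before the data structure is provided to the intent ML model; the function name and the None convention for blank coordinates are assumptions for illustration only:

```python
def fill_missing_coordinates(sequence):
    """Estimate coordinates the image ML model could not determine
    (stored as None) by carrying forward the last known coordinate
    for that anatomical feature earlier in the sequence."""
    last_seen = {}
    filled = []
    for item in sequence:
        fixed = {}
        for feature, coord in item.items():
            if coord is None:
                # May still be None if the feature was never seen.
                coord = last_seen.get(feature)
            else:
                last_seen[feature] = coord
            fixed[feature] = coord
        filled.append(fixed)
    return filled
```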
The intent ML model can be trained using manually annotated training data, e.g. video data (i.e. sequences of 2D or 3D images) where an operator has manually annotated when there is intent for each person in the video data, in which case stick figures are determined corresponding to the same video data and form part of input features to the intent ML model. Alternatively or additionally, the intent ML model can be trained on actual data of when a person enters through the door or not. There may then need to be a manual pushbutton (or gesture recognition, proximity sensor, photosensor, etc.) allowing the person to manually trigger opening of the door when intent is mistakenly not determined by this method from person kinetics. Such mistakes and correction with the pushbutton make valuable training data to improve the performance of the intent determination.
In one embodiment, the determination of whether there is intent can be based on traditional statistical models such as ARIMA (Autoregressive Integrated Moving Average).
In a done step 47, the intent determiner 1 determines whether an exit condition is true. In one embodiment, the exit condition is true when (and optionally only when) the person is no longer determined to be in the image. In one embodiment, the exit condition is true when intent is determined.
If the exit condition is true, the method proceeds to an optional send intent signal step 48 or ends. If the exit condition is not true, the method returns to the obtain image step 40 to analyse a new image and add data to the data structure for a new intent evaluation as described above.
In the optional send intent signal step 48, the intent determiner 1 sends an intent signal, to proceed with a door opening process. This can cause a door opener to open the door for the person. Optionally, the intent signal is used to trigger the electronic lock to perform access control of the person, e.g. using local wireless communication and/or biometrics.
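By way of a non-limiting illustrative sketch, the overall method of steps 40 to 48 can, for example, be structured as a loop that repeats until an exit condition is true; the callables for obtaining an image, running the image ML model, inferring intent and sending the intent signal are injected as assumptions for illustration, keeping the sketch independent of any particular camera or model:

```python
def run_intent_determiner(obtain_image, image_model, infer_intent,
                          send_intent_signal):
    """Repeat: obtain image (step 40), determine position and
    orientation (step 42), add to the data structure (step 44) and
    determine intent (step 46), until an exit condition is true.
    Returns True when intent is determined, False when the person
    is no longer in the image."""
    data_structure = []
    while True:
        image = obtain_image()            # step 40
        result = image_model(image)       # step 42: position/orientation
        if result is None:                # exit: person no longer in image
            return False
        data_structure.append(result)     # step 44
        if infer_intent(data_structure):  # step 46
            send_intent_signal()          # step 48
            return True                   # exit: intent determined
```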
Using embodiments presented herein, a reliable determination of intent of a person to open the door is provided. It is to be noted that the method can optionally be used to simultaneously track multiple people in an image, in which case it is sufficient for one person to show intent for the method to determine intent. By using the data structure containing several instances over time of position and orientation, rather than the analysis of a single image, the effects of different lighting conditions throughout the day, shadows, etc., are mitigated or even eliminated.
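By way of a non-limiting illustrative sketch, when multiple people are tracked simultaneously, each person can be given their own data structure, and it is sufficient that any one of them shows intent; the mapping from person identifiers to data structures is an assumption for illustration only:

```python
def any_person_has_intent(per_person_structures, infer_intent):
    """Determine intent over multiple simultaneously tracked people:
    it is sufficient for one person's data structure to show intent
    for the method to determine intent."""
    return any(infer_intent(ds) for ds in per_person_structures.values())
```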
The memory 64 can be any combination of random-access memory (RAM) and/or read-only memory (ROM). The memory 64 also comprises non-transitory persistent storage, which, for example, can be any single one or combination of magnetic memory, optical memory, solid-state memory or even remotely mounted memory.
A data memory 66 is also provided for reading and/or storing data during execution of software instructions in the processor 60. The data memory 66 can be any combination of RAM and/or ROM.
The intent determiner 1 further comprises an I/O interface 62 for communicating with external and/or internal entities. Optionally, the I/O interface 62 also includes a user interface.
Other components of the intent determiner 1 are omitted in order not to obscure the concepts presented herein.
Here now follows a list of embodiments from another perspective, enumerated with roman numerals.
The aspects of the present disclosure have mainly been described above with reference to a few embodiments. However, as is readily appreciated by a person skilled in the art, other embodiments than the ones disclosed above are equally possible within the scope of the invention, as defined by the appended patent claims. Thus, while various aspects and embodiments have been disclosed herein, other aspects and embodiments will be apparent to those skilled in the art. The various aspects and embodiments disclosed herein are for purposes of illustration and are not intended to be limiting, with the true scope and spirit being indicated by the following claims.
Number | Date | Country | Kind
---|---|---|---
2250364-3 | Mar 2022 | SE | national
Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/EP2023/057422 | 3/23/2023 | WO |