The disclosure relates to a monitor method and a related monitor device, and more particularly, to a monitor method and a monitor device in which a mask is used to cover portions of an image for detecting an object.
With the increasing demand for security surveillance, the demand for analyzing monitor images has also increased. For example, a surveillance camera operating 24 hours a day generates 8760 hours of video a year, and a single building may contain hundreds of surveillance cameras. The human effort needed to monitor, let alone analyze, the resulting videos and images can therefore be overwhelming. At present, considerable manpower is devoted to watching screens and images to assure security, which leads to high cost and to human error, because the attention of security practitioners is limited. The field thus lacks an automatic solution that analyzes the images captured by cameras with a reasonable amount of computing resources.
An embodiment provides a monitor method for detecting an object, including capturing an image; calculating an initial probability of the object existing in the image; applying a mask to cover a first portion of the image if the initial probability is higher than a threshold; calculating a first probability of the object existing in the image excluding the first portion; and using at least the first probability to detect the location of the object in the image.
Another embodiment provides a monitor system for detecting an object, including a surveillance camera configured to capture an image; and a processor configured to apply a mask to cover a first portion of the image, calculate a first probability of the object existing in the image excluding the first portion, and use at least the first probability to detect the object in the image.
These and other objectives of the present invention will no doubt become obvious to those of ordinary skill in the art after reading the following detailed description of the preferred embodiment that is illustrated in the various figures and drawings.
In order to deal with the problem mentioned above, a monitor method and a monitor device can be provided according to embodiments to analyze the images of a monitored space.
The object OBJ can be a person or a predetermined object. For example, the image 100 can be an image of an office, and the purpose of security surveillance can be monitoring whether any person exists at an abnormal time (e.g., weekend or midnight) or whether any person abnormally appears in a restricted area (e.g., an area close to a vault). In order to detect the existence and/or the location of the object OBJ, the monitor method 200 can be used. The monitor method 200 can include the following steps.
Step 202: capture the image 100;
Step 204: calculate an initial probability PO of the object OBJ existing in the image 100;
Step 206: determine whether the initial probability PO is higher than a threshold; if so, enter Step 210; else enter Step 208;
Step 208: determine the object OBJ does not exist in the image 100; enter Step 202;
Step 210: determine the object OBJ exists in the image 100;
Step 215: apply a mask M to cover an ith portion Ai of the image 100;
Step 220: calculate an ith probability Pi of the object OBJ existing in the image 100 excluding the ith portion Ai and remark the ith probability Pi at the portion Ai;
Step 225: determine whether the location of the mask M is at the predetermined location; if so, enter Step 235; else enter Step 230;
Step 230: add 1 to i; enter Step 215; and
Step 235: determine the location of the object OBJ.
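The loop of Steps 202 to 235 can be sketched in code (a minimal sketch: `detect_prob` is a toy stand-in for the classifier of Step 204, images are plain 2D lists, and the mask zeroes out a square portion; none of these names come from the embodiments):

```python
def detect_prob(image):
    # Toy stand-in for the CNN classifier of Step 204:
    # probability proportional to the fraction of bright pixels.
    flat = [p for row in image for p in row]
    return sum(flat) / len(flat)

def apply_mask(image, top, left, size):
    # Return a copy of `image` with a size x size portion zeroed (the mask M).
    masked = [row[:] for row in image]
    for r in range(top, top + size):
        for c in range(left, left + size):
            masked[r][c] = 0
    return masked

def monitor(image, mask_size, threshold):
    p0 = detect_prob(image)                # Step 204
    if p0 <= threshold:                    # Steps 206/208
        return None                        # object OBJ does not exist
    h, w = len(image), len(image[0])
    probs = {}
    for top in range(0, h - mask_size + 1, mask_size):      # Steps 215-230
        for left in range(0, w - mask_size + 1, mask_size):
            masked = apply_mask(image, top, left, mask_size)
            probs[(top, left)] = detect_prob(masked)        # Step 220
    # Step 235: the portion whose masking lowers the probability the most
    # is the most likely location of the object OBJ.
    return min(probs, key=probs.get)

# Toy 4x4 image with a bright "object" in the bottom-right 2x2 portion.
img = [[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 1, 1],
       [0, 0, 1, 1]]
```

With `mask_size=2` and a threshold of 0.2, masking the bottom-right portion drives the probability to zero, so that portion is reported as the location of the object.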
In Step 202 to Step 210, an initial classification can be performed to determine whether the object OBJ (e.g., a person or a specific object) exists. In Step 204 and Step 206, a machine learning model such as a convolutional neural network (CNN) model can be used, and the determination made by the model can become increasingly accurate as training adjusts the weights used for calculation.
The threshold mentioned in Step 206 may be, for example, 90%, or an appropriate threshold obtained by experiment and calculation, such as a result of machine learning. Step 215 to Step 235 are performed to determine the location of the object OBJ only if the object OBJ is determined to exist.
In Step 215 to Step 235, the mask M can be moved to cover different portions of the image 100 to obtain the probabilities corresponding to different portions, and the probabilities can be remarked at different portions of the image 110. The location of the object OBJ can be accordingly determined.
In Step 215 to Step 235, the mask M can cover an ith portion Ai of the image 100. Here, the variable i can be an integer larger than zero. The mask M can have a predetermined dimension such as m pixel(s)×n pixel(s), where m and n can be integers larger than zero. When the mask M is applied to cover the ith portion Ai, one of the three scenarios described below can occur.
(Scenario-i)
If the mask M fails to cover any part of the object OBJ in the image 100, the ith probability Pi of the object OBJ existing in the image 100 excluding the ith portion Ai in Step 220 can be as high as the initial probability PO obtained in Step 204.
(Scenario-ii)
If the mask M covers a part of the object OBJ in the image 100, the ith probability Pi of the object OBJ existing in the image 100 excluding the ith portion Ai in Step 220 can be a value lower than the initial probability PO obtained in Step 204.
(Scenario-iii)
If the mask M covers all of the object OBJ in the image 100, the ith probability Pi of the object OBJ existing in the image 100 excluding the ith portion Ai in Step 220 can be a lowest value (for example, zero).
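The three scenarios amount to comparing the ith probability Pi against the initial probability PO; a sketch (the tolerance `eps` is an assumed implementation detail, not part of the embodiments):

```python
def scenario(p0, pi, eps=1e-6):
    # Classify a mask position by how Pi relates to the initial P0.
    if pi <= eps:
        return "iii"  # mask covers all of the object: Pi near zero
    if pi < p0 - eps:
        return "ii"   # mask covers part of the object: Pi < P0
    return "i"        # mask covers none of the object: Pi ~ P0
```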
When a plurality of portions (e.g., A1, A2, A3 . . . ) of the image 100 are covered in sequence, the obtained probabilities (e.g., P1, P2, P3 . . . ) can be respectively remarked to the portions to generate the processed image 110, and the location of the object OBJ can be determined by means of the processed image 110.
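Remarking each probability Pi at its portion Ai yields the processed image; a minimal sketch under assumed non-overlapping square portions (the probability values below are illustrative):

```python
def remark(shape, probs, size):
    # Write each portion's probability Pi back onto that portion,
    # producing a per-pixel probability map (the processed image).
    h, w = shape
    out = [[0.0] * w for _ in range(h)]
    for (top, left), p in probs.items():
        for r in range(top, top + size):
            for c in range(left, left + size):
                out[r][c] = p
    return out

# Four 2x2 portions; masking the bottom-right one dropped the probability to 0.
heat = remark((4, 4), {(0, 0): 0.25, (0, 2): 0.25,
                       (2, 0): 0.25, (2, 2): 0.0}, 2)
```

The portion holding the lowest remarked value (here the bottom-right) indicates the location of the object.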
Regarding Step 225, if the location of the mask M is not at the predetermined location, the mask M can be moved to cover a subsequent portion of the image 100, and the location of the covered portion of the image 100 can be adjusted by a regular step each time. The predetermined location mentioned in Step 225 can be a predefined final location. If the location of the mask M is at the predetermined location, the mask M is not moved to cover any subsequent portion of the image 100.
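The mask movement of Steps 215 to 230 can be sketched as a traversal that ends at a predefined final location (row-major order and a non-overlapping stride are assumptions here, not requirements of the embodiments):

```python
def mask_locations(image_h, image_w, mask_h, mask_w):
    # Yield (top, left) corners of successive mask positions; the last
    # yielded location plays the role of the predetermined final location.
    for top in range(0, image_h - mask_h + 1, mask_h):
        for left in range(0, image_w - mask_w + 1, mask_w):
            yield (top, left)

locs = list(mask_locations(4, 4, 2, 2))
```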
For example, if the width of the image 100 captured in Step 202 is 192 pixels, the width of the mask M is 24 pixels, and a portion (e.g., A1) and a subsequent portion (e.g., A2) that are sequentially masked overlap by 23 pixels, the mask M can be horizontally moved 168 (i.e., 192-24) times from a leftmost portion (e.g., A1) to a rightmost portion (e.g., A169) of the image 100, covering 169 portions and yielding 169 probabilities of the object OBJ existing in the upper section of the image 100. In other words, the mask M is moved by 1 pixel each time. By overlapping each portion with the subsequent portion covered by the mask M, the resolution of detecting the object OBJ can be effectively increased; however, the amount of calculation also increases, and more computing resources are needed.
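The numbers in this example can be checked directly; a sketch of the position count for a horizontal sweep:

```python
def num_positions(image_w, mask_w, stride=1):
    # Number of horizontal mask positions when moving `stride` pixels
    # at a time; the mask is moved num_positions - 1 times.
    return (image_w - mask_w) // stride + 1

# 192-pixel image, 24-pixel mask, 1-pixel stride (23-pixel overlap):
# 169 positions, i.e. 168 moves.
assert num_positions(192, 24, 1) == 169
# Non-overlapping sweep for comparison: stride equal to the mask width.
assert num_positions(192, 24, 24) == 8
```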
According to embodiments, the dimension of the mask M can be adjusted, so that the dimension of a portion can be different from the dimension of the subsequent portion covered by the mask M.
According to embodiments, the mask M can cover a first portion of the image 100, and cover a second portion of the image 100 afterward when Step 215 is performed for another time, where the first portion can be inside the second portion. In this manner, the resolution of detecting the object OBJ can hence be further improved. An example is described below.
If it is sufficient to determine that the object OBJ is more related to the portion A4, the process can be stopped. However, if a higher resolution is pursued, the portion A4 can be partially covered to further analyze the location of the object OBJ.
By means of selecting at least one portion (e.g., A4) and further covering sub-portions of the selected portion to detect the location of the object OBJ, computing resources can be saved.
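The saving can be illustrated with assumed numbers (not taken from the embodiments): refining only one selected portion costs far fewer classifier evaluations than masking the whole image at the fine resolution.

```python
def coarse_to_fine_cost(n_coarse, n_sub):
    # One coarse pass over n_coarse portions, then n_sub sub-portions
    # of the single selected portion.
    return n_coarse + n_sub

def flat_fine_cost(n_coarse, n_sub):
    # Masking every sub-portion of every coarse portion directly.
    return n_coarse * n_sub

# e.g., 8 coarse portions, each splittable into 8 sub-portions:
# 8 + 8 = 16 evaluations instead of 8 * 8 = 64.
```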
The monitor method provided by embodiments can also reduce the number of weights used by a neural network model.
When the mask M is of a predetermined dimension corresponding to a predetermined resolution, a sub-portion may not need to be partially covered to analyze the location of the object OBJ. If a higher resolution is needed, the operation of further covering sub-portions of a selected portion, described above, can be performed.
In summary, by means of the monitor method and the monitor system provided by embodiments, an automatic solution can be provided to detect the existence and the location of an object. Correctness can be assured, the resolution of detection can be flexibly adjusted, and computing resources can be saved. Hence, the long-standing problem in the field can be effectively dealt with.
Those skilled in the art will readily observe that numerous modifications and alterations of the device and method may be made while retaining the teachings of the invention. Accordingly, the above disclosure should be construed as limited only by the metes and bounds of the appended claims.
Published as US 20220180112 A1, Jun. 2022, United States.