This application claims the priority benefit of Taiwan application serial no. 98139197, filed Nov. 18, 2009. The entirety of the above-mentioned patent application is hereby incorporated by reference herein and made a part of this specification.
1. Field
The disclosure relates to a multi-state target tracking method.
2. Description of Related Art
In recent years, as issues of environmental safety have become increasingly important, research on video surveillance techniques has also grown in importance. Besides conventional video recording surveillance, demands for smart event detection and behavior recognition have accordingly increased. Grasping the occurrence of an event at the first moment and immediately taking corresponding measures are functions that a smart video surveillance system must have. To achieve correct event detection and behavior recognition, besides an accurate target segmentation, a stable tracking is also required, so as to completely describe an event process, record target information and analyze the behavior of the target.
Actually, in a low crowd density environment, as long as the target segmentation is accurate, a general tracking technique has a certain degree of accuracy, for example, a general foreground detection using a background model in cooperation with a shift amount prediction and characteristic comparison. However, in a high crowd density environment, the effect of the foreground detection is unsatisfactory, so that the prediction and capture of characteristics are difficult, and the tracking accuracy is comparatively low. Therefore, another tracking technique that does not rely on a background model has to be used to solve such a problem. However, since the characteristic information (such as color, length and width, area, etc.) provided by the background model is lacking, a large amount of characteristics has to be obtained from the targets themselves to support the tracking. Comparatively, in a low crowd density environment, such tracking is not necessarily better than tracking based on an established background model. Therefore, a tracking mode switch mechanism adapted to the actual surveillance environment is required.
The disclosure is directed to a multi-state target tracking method, by which a most suitable tracking mode can be determined by analyzing a crowd density and used for tracking targets.
The disclosure is directed to a multi-state target tracking system, which continually detects a variation of a crowd density, so as to suitably switch a tracking mode for tracking targets.
The disclosure provides a multi-state target tracking method. In the method, when a video stream of a plurality of images is captured, a crowd density of the images is detected and is compared with a threshold, so as to determine a tracking mode used for tracking a plurality of targets in the images. When the detected crowd density is less than the threshold, a background model is used to track the targets in the images. When the detected crowd density is greater than or equal to the threshold, a non-background model is used to track the targets in the images.
The disclosure provides a multi-state target tracking system including an image capturing device, and a processing device. The image capturing device is used for capturing a video stream of a plurality of images. The processing device is coupled to the image capturing device, and is used for tracking a plurality of targets in the images, which includes a crowd density detecting module, a comparison module, a background tracking module and a non-background tracking module. The crowd density detecting module is used for detecting a crowd density of the images. The comparison module is used for comparing the crowd density detected by the crowd density detecting module with a threshold, so as to determine a tracking mode used for tracking the targets in the images. The background tracking module uses a background model to track the targets in the images when the comparison module determines that the crowd density is less than the threshold. The non-background tracking module uses a non-background model to track the targets in the images when the comparison module determines that the crowd density is greater than or equal to the threshold.
According to the above descriptions, in the multi-state target tracking method and system of the disclosure, by detecting the crowd density of the images in the video stream, the background model or the non-background model can be automatically selected to track the targets, and the tracking mode can be adjusted according to an actual environment variation, so as to achieve a purpose of effectively and correctly tracking the targets.
In order to make the aforementioned and other features and advantages of the disclosure comprehensible, several exemplary embodiments accompanied with figures are described in detail below.
The accompanying drawings are included to provide a further understanding of the invention, and are incorporated in and constitute a part of this specification. The drawings illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention.
The disclosure provides an integral and practical multi-state target tracking mechanism, which adapts to the actual crowd density of the surveillance environment. By correctly determining the crowd density, selecting a suitable tracking mode, switching the tracking mode, and transmitting data during the switching, the tracking can be effectively and correctly performed in any environment.
First, the image capturing device 110 captures a video stream of a plurality of images (step S210), wherein the image capturing device 110 is surveillance equipment such as a closed circuit television (CCTV) or an IP camera, which is used for capturing images of a specific region for surveillance. After the video stream is captured by the image capturing device 110, the video stream is transmitted to the processing device 120 through a wired or wireless connection for post-processing.
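As a non-limiting illustration, the capture of step S210 might be realized as in the following sketch, assuming OpenCV is used to read the stream; the RTSP URL is a hypothetical placeholder for an IP camera.

```python
import cv2

# Hypothetical sketch of step S210: open a video stream from an IP camera.
# The URL is a placeholder; an integer index such as 0 would instead address
# a local capture device (e.g., a CCTV frame grabber).
cap = cv2.VideoCapture("rtsp://camera.example/stream")
ok, frame = cap.read()  # one image of the stream, handed on for post-processing
cap.release()
```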
After the processing device 120 receives the video stream, the crowd density detecting module 130 detects a crowd density of the images (step S220). In detail, the crowd density detecting module 130 can use a foreground detecting unit 132 to perform a foreground detection on the images, so as to detect targets in the images. The foreground detecting unit 132, for example, uses an image processing method, such as a general background subtraction method, an edge detection method or a corner detection method, to detect variation amounts of the images at different time points, so as to recognize the targets in the images. Then, the crowd density detecting module 130 uses a crowd density calculating unit 134 to calculate a proportion of the targets in the images, so as to obtain the crowd density of the images.
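By way of a non-limiting sketch, the foreground detection of the foreground detecting unit 132 and the density calculation of the crowd density calculating unit 134 might be approximated as follows, assuming the general background subtraction method is realized with an OpenCV mixture-of-Gaussians subtractor; the parameter values are illustrative assumptions.

```python
import cv2
import numpy as np

# Assumed stand-ins: a MOG2 background subtractor for the foreground
# detecting unit (132), and the foreground pixel proportion for the crowd
# density computed by the crowd density calculating unit (134).
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)

def detect_crowd_density(frame):
    """Return the proportion of the image occupied by foreground targets."""
    mask = subtractor.apply(frame)
    # MOG2 marks shadows as 127; count only confident foreground pixels (255).
    return np.count_nonzero(mask == 255) / mask.size
```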
Next, the processing device 120 uses the comparison module 140 to compare the crowd density detected by the crowd density detecting module 130 with a threshold, so as to determine a tracking mode used for tracking the targets in the images (step S230). The tracking mode includes a background model suitable for a pure environment, and a non-background model suitable for a complex environment.
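The comparison of step S230 reduces to a single threshold test, as the following sketch shows; the threshold value is an assumption, since the disclosure does not fix a particular number.

```python
THRESHOLD = 0.3  # assumed value; tuned per surveillance site

def select_tracking_mode(density, threshold=THRESHOLD):
    """Step S230: background model for a pure (sparse) environment,
    non-background model for a complex (crowded) one."""
    return "background" if density < threshold else "non_background"
```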
When the comparison module 140 determines that the crowd density is less than the threshold, the background tracking module 150 uses the background model to track the targets in the images (step S240). Wherein, the background tracking module 150 calculates a shift amount of the target at consecutive time points, predicts a position at which the target appears at a next time point, and performs a regional characteristic comparison on a region around the predicted position, so as to obtain moving information of the target.
In detail, the flow of tracking the targets through the background model is described below.
First, the shift amount calculating unit 152 calculates a shift amount of each of the targets between a current image and a previous image (step S310). Next, the position predicting unit 154 predicts a position at which the target appears in a next image according to the shift amount calculated by the shift amount calculating unit 152 (step S320). After the predicted position of the target is obtained, the characteristic comparison unit 156 performs the regional characteristic comparison on an associated region around the positions of the target in the current image and the next image, so as to obtain a characteristic comparison result (step S330). Finally, the information update unit 158 chooses to add, inherit or delete the related information of the target according to the characteristic comparison result obtained by the characteristic comparison unit 156 (step S340).
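A minimal sketch of steps S310 to S340 follows, assuming targets are represented by centroid positions and color histograms; the function names, the linear extrapolation, the search radius and the histogram distance are illustrative assumptions rather than the disclosed implementation.

```python
import numpy as np

def predict_position(prev_pos, curr_pos):
    """Steps S310-S320: the shift amount between the previous and current
    images, extrapolated to predict the position in the next image."""
    shift = np.subtract(curr_pos, prev_pos)   # step S310: shift amount
    return np.add(curr_pos, shift)            # step S320: predicted position

def match_target(predicted_pos, candidates, target_hist, max_dist=30.0):
    """Step S330: regional characteristic comparison within an associated
    region (radius max_dist) around the predicted position."""
    best, best_score = None, float("inf")
    for pos, hist in candidates:              # candidates found in the next image
        if np.linalg.norm(np.subtract(pos, predicted_pos)) > max_dist:
            continue                          # outside the associated region
        score = np.linalg.norm(np.subtract(hist, target_hist))
        if score < best_score:
            best, best_score = (pos, hist), score
    # Step S340: the caller inherits the target information on a match,
    # adds a new target for an unmatched candidate, or deletes a lost target.
    return best
```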
Comparatively, in step S230, when the comparison module 140 determines that the crowd density is greater than or equal to the threshold, the non-background tracking module 160 uses the non-background model to track the targets in the images. Wherein, the non-background tracking module 160 detects the targets according to a plurality of human characteristics and analyzes motion vectors of the targets, so as to obtain the moving information of the targets.
In detail, the flow of tracking the targets through the non-background model is described below.
First, the target detecting unit 162 uses a plurality of human characteristics to detect the targets having one or a plurality of the human characteristics in the images (step S410). The human characteristics refer to facial characteristics, such as eyes, nose and mouth of a human face, or body characteristics of a human body, which can be used to recognize a person in the image. Next, the motion vector calculating unit 164 calculates a motion vector of each of the targets between a current image and a previous image (step S420). The comparison unit 166 compares the motion vector calculated by the motion vector calculating unit 164 with a threshold to obtain a comparison result (step S430). Finally, the information update unit 168 chooses to add, inherit or delete the related information of the target according to the comparison result obtained by the comparison unit 166 (step S440).
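As a non-limiting sketch of steps S410 to S440, the following assumes a Haar cascade face detector as the target detecting unit 162 and sparse optical flow as the motion vector calculating unit 164; the threshold value and all names are illustrative assumptions.

```python
import cv2
import numpy as np

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def track_non_background(prev_gray, curr_gray, motion_threshold=1.0):
    # Step S410: detect targets bearing human (here, facial) characteristics.
    faces = face_cascade.detectMultiScale(prev_gray)
    results = []
    for (x, y, w, h) in faces:
        centre = np.array([[[x + w / 2.0, y + h / 2.0]]], np.float32)
        # Step S420: motion vector of the target between the two images.
        new_pt, status, _err = cv2.calcOpticalFlowPyrLK(
            prev_gray, curr_gray, centre, None)
        if status[0][0] == 1:
            vector = new_pt[0][0] - centre[0][0]
            # Step S430: compare the motion vector with a threshold.
            moving = float(np.linalg.norm(vector)) >= motion_threshold
            # Step S440: the caller adds, inherits or deletes target
            # information according to this comparison result.
            results.append(((x, y, w, h), vector, moving))
    return results
```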
In summary, in the present embodiment, a most suitable tracking mode is selected according to a magnitude of the crowd density, so as to track the targets in the images. The method of the present embodiment is adapted to various environments and can provide a better tracking result. It should be noticed that in the present embodiment, the tracking of the targets through the background model or the non-background model is performed with respect to a whole image. However, in another embodiment, the image can be divided into a plurality of regions according to a distribution status of the targets, and a suitable tracking mode can be selected for each region to track the targets, so as to obtain a better tracking effect. An embodiment is provided below for detailed description.
First, the image capturing device 110 captures a video stream of a plurality of images (step S610), and the captured video stream is transmitted to the processing device 120 through a wired or wireless connection.
Next, the processing device 120 uses the crowd density detecting module 130 to detect a crowd density of the images in the video stream. Wherein, the crowd density detecting module 130 also uses the foreground detecting unit 132 to perform a foreground detection on the images, so as to detect the targets in the images (step S620). However, a difference between the present embodiment and the aforementioned embodiment is that when calculating the crowd density, the crowd density calculating unit 134 respectively calculates the crowd density of a plurality of regions corresponding to a target distribution in the images, and regards a proportion of the targets in each of the regions as a crowd density of such region (step S630).
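As an illustrative simplification of step S630, the sketch below divides the foreground mask into a fixed grid and reports the target proportion of each cell; the disclosure instead derives the regions from the target distribution, so the grid is an assumption.

```python
import numpy as np

def region_densities(foreground_mask, rows=2, cols=2):
    """Return the foreground proportion (crowd density) of each region."""
    h, w = foreground_mask.shape
    densities = {}
    for r in range(rows):
        for c in range(cols):
            cell = foreground_mask[r * h // rows:(r + 1) * h // rows,
                                   c * w // cols:(c + 1) * w // cols]
            densities[(r, c)] = np.count_nonzero(cell) / cell.size
    return densities
```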
Comparatively, when the processing device 120 selects the tracking mode, the processing device 120 uses the comparison module 140 to compare the crowd density of each region with the threshold, so as to determine the tracking mode used for tracking the targets in each region (step S640). The tracking modes include the background model suitable for a pure environment, and the non-background model suitable for a complex environment.
When the comparison module 140 determines that the crowd density of a region is less than the threshold, the background tracking module 150 uses the background model to track the targets in such region (step S650). Wherein, the background tracking module 150 calculates a shift amount of the target in the region at consecutive time points, predicts a position at which the target appears at a next time point, and performs a regional characteristic comparison to obtain the moving information of the target.
When the comparison module 140 determines that the crowd density of the region is greater than or equal to the threshold, the non-background tracking module 160 uses the non-background model to track the targets in such region (step S660). Wherein, the non-background tracking module 160 performs motion vector analysis on a plurality of characteristic points in the region, so as to compare the motion vectors to obtain the moving information of the targets in such region.
It should be noticed that after the target information of each region is obtained, a target information combination module (not shown) is further used to combine the moving information of the targets in the regions of the image that are obtained by the background tracking module 150 and the non-background tracking module 160, so as to obtain target information of the whole image (step S670).
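A minimal sketch of the combination of step S670 follows; the flat list-of-records representation of the target information is an assumption for illustration.

```python
def combine_target_info(per_region_results):
    """Step S670: merge the per-region moving information produced by the
    background and non-background tracking modules into target information
    of the whole image."""
    combined = []
    for region, targets in per_region_results.items():
        for target in targets:          # target: dict of moving information
            combined.append({**target, "region": region})
    return combined
```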
In summary, in the multi-state target tracking system 100 of the present embodiment, the image can be divided into a plurality of regions according to the distribution status of the detected targets for calculating the crowd densities and selecting the tracking modes, so as to provide an optimal tracking result.
It should be noticed that after the above multi-state target tracking method is used to obtain the target information, variation of the crowd density is continually detected, so as to suitably switch the tracking modes to achieve a better tracking effect. Another embodiment is provided below for further description.
First, the processing device 120 selects the background tracking module 150 or the non-background tracking module 160 to track the targets in the images according to a comparison result of the comparison module 140 (step S810).
While the targets are tracked, the processing device 120 continually uses the crowd density detecting module 130 to detect the crowd density of the images (step S820), and uses the comparison module 140 to compare the crowd density detected by the crowd density detecting module 130 with the threshold (step S830).
Wherein, when the comparison module 140 determines that the crowd density detected by the crowd density detecting module 130 is increased to exceed the threshold, the tracking mode of the targets is changed from the background model (used by the background tracking module 150 to perform the background tracking) to the non-background model (used by the non-background tracking module 160 to perform the non-background tracking). Similarly, when the comparison module 140 determines that the crowd density detected by the crowd density detecting module 130 is decreased to be less than the threshold, the tracking mode of the targets is changed from the non-background model (used by the non-background tracking module 160 to perform the non-background tracking) to the background model (used by the background tracking module 150 to perform the background tracking) (step S840).
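The switching rule of step S840 can be summarized as in the following sketch; the mode labels and the threshold value are illustrative assumptions, and in practice the tracking data of each target is inherited across the switch.

```python
def update_tracking_mode(current_mode, density, threshold=0.3):
    """Steps S820-S840: continually re-evaluate the crowd density and
    switch the tracking mode whenever it crosses the threshold."""
    if current_mode == "background" and density >= threshold:
        return "non_background"   # density increased past the threshold
    if current_mode == "non_background" and density < threshold:
        return "background"       # density decreased below the threshold
    return current_mode           # no crossing: keep the current mode
```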
It should be noticed that the approach of continually detecting the crowd density and updating the tracking mode of the present embodiment can also be applied to the second exemplary embodiment (in which the image is divided into a plurality of regions to respectively perform the crowd density calculation, the tracking mode determination and the target tracking). As long as the crowd density in a region increases or decreases across the threshold, the tracking mode of such region can be adaptively switched to achieve a better tracking effect.
In summary, in the multi-state target tracking method and system of the disclosure, based on a series of automatic detection and switching steps, such as the crowd density detection, the switching of the tracking modes, and the inheriting of the tracking data, the most suitable tracking mode can be selected, and the targets can be continually and stably tracked in different environments.
It will be apparent to those skilled in the art that various modifications and variations can be made to the structure of the disclosure without departing from the scope or spirit of the invention. In view of the foregoing, it is intended that the disclosure cover modifications and variations of this invention provided they fall within the scope of the following claims and their equivalents.
Number | Date | Country | Kind
---|---|---|---
98139197 | Nov 2009 | TW | national