This application claims priority of Taiwan Patent Application No. 103118537, filed on May 28, 2014, the entirety of which is incorporated by reference herein.
1. Field of the Invention
The disclosure relates generally to methods and devices for information capture, and more particularly it relates to methods and devices for transforming images into useful information.
2. Description of the Related Art
As safety consciousness grows throughout society, various imaging devices have become increasingly popular, and the quality of captured images continues to improve. However, this improvement in quality means that the computing resources and storage space required for handling and using these images have also increased rapidly. How to effectively handle and use these captured images is a problem that urgently needs to be solved.
Although current image-processing software is well developed and able to automatically identify people and objects in a picture, the computing resources required for processing a great quantity of image files are sometimes difficult to obtain. For example, when tracking a specific automobile by its license plate across a great number of monitoring cameras, the images must be checked one by one by human operators, which takes a lot of time. Therefore, a system that can effectively handle a great quantity of pictures is needed to accomplish such tracking jobs.
To solve the above problem, the invention provides an information-capture device and method for capturing meaningful text instead of a great number of images.
An embodiment of an information-capture device comprises a video-capture device, a pre-processing module, an image-processing module, and a text generation module. The video-capture device is configured to capture video data. The pre-processing module is configured to divide the video data into background data and foreground data. The image-processing module generates an object feature and object-motion information according to the foreground data, and generates captured-space information of the video data according to the background data. The text generation module generates event-description information according to the object feature, the object-motion information, and the captured-space information, wherein the event-description information is related to an event that occurred in the video data, comprises the information related to that event, and is in the form of a machine-readable text file.
In an embodiment, the information-capture device further comprises a foreground image-processing module and a background image-processing module. The foreground image-processing module generates the object feature and the object-motion information according to the foreground data. The background image-processing module generates the captured-space information of the video data according to the background data.
In an embodiment, the foreground image-processing module comprises a feature-capture module, and a motion-detection module. The feature-capture module extracts the object feature according to the foreground data, and compares the object feature to a feature database to generate object information and feature information. The motion-detection module obtains moving behavior of the object according to an object movement algorithm and compares the moving behavior with a behavior database to generate behavior information, wherein the text generation module generates the event-description information according to the object information and the behavior information.
In an embodiment, the feature-capture module captures at least one critical point of the foreground data, generates a plurality of eigenvectors surrounding the center of the critical point, and generates the object information according to an object in the feature database having the minimum difference from the eigenvectors.
In an embodiment, the motion-detection module further generates a motion track according to the behavior information and the captured-space information, and the text generation module further generates the event-description information according to the motion track.
In an embodiment, the information-capture device further comprises an image-encryption module, a storage module, and a microprocessor. The image-encryption module encrypts the video data to generate an encrypted image. The storage module stores the encrypted image. The microprocessor accesses the encrypted image according to the event-description information, and searches a corresponding section of the encrypted image according to the event-description information.
An embodiment of an information-capture method comprises capturing video data; dividing the video data into background data and foreground data; generating an object feature and object-motion information according to the foreground data; generating captured-space information related to the video data according to the background data; and generating event-description information according to the object feature, the object-motion information, and the captured-space information, wherein the event-description information is related to an event that occurred in the video data, comprises the information related to that event, and is a machine-readable text file.
An embodiment of the information-capture method further comprises extracting the object feature according to the foreground data and comparing the object feature with a feature database to generate object information and feature information; obtaining the moving behavior of the object according to an object movement algorithm and comparing the moving behavior with a behavior database to generate behavior information; and generating the event-description information according to the object information, the feature information, and the behavior information.
In an embodiment, the information-capture method further comprises capturing at least one critical point of the foreground data; generating a plurality of eigenvectors surrounding the center of the critical point; and generating the object information according to an object in the feature database having the minimum difference from the eigenvectors.
In an embodiment, the information-capture method further comprises generating a motion track according to the behavior information and the captured-space information; and generating the event-description information according to the motion track.
In an embodiment, the information-capture method further comprises encrypting the video data to generate an encrypted image; storing the encrypted image in a storage module; and accessing the encrypted image according to the event-description information and searching a corresponding section of the encrypted image according to the event-description information.
A detailed description is given in the following embodiments with reference to the accompanying drawings.
The invention can be more fully understood by reading the subsequent detailed description and examples with references made to the accompanying drawings, wherein:
The following description is of the best-contemplated mode of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.
The image-processing module 103 includes the background image-processing module 110 and the foreground image-processing module 120. The background image-processing module 110 generates the captured-space information SC of the video data SV according to the background data SS, and transmits the captured-space information SC to the text generation module 104. According to another embodiment of the invention, the captured-space information SC can be input by a user and stored in a storage device. The foreground image-processing module 120 generates the object feature SO and the object-motion information SM according to the foreground data SD, and transmits the object feature SO and the object-motion information SM to the text generation module 104. According to an embodiment of the invention, the text generation module 104 generates the event-description information ST related to the events that occurred in the video data SV according to the content of the captured-space information SC, the object feature SO, and the object-motion information SM.
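For illustration only, the data flow among these modules could be sketched as follows in Python. The signal names (SV, SS, SD, SC, SO, SM, ST) follow the description above, but every function body here is a placeholder assumption, not the claimed implementation.

```python
# Hypothetical sketch of the module pipeline; all stub bodies are placeholders.
import numpy as np

def pre_process(sv):
    """Pre-processing module 102: divide video data SV into SS and SD."""
    background_ss = np.median(sv, axis=0)      # rough static-scene estimate
    foreground_sd = sv - background_ss         # per-frame residual
    return background_ss, foreground_sd

def image_process(ss, sd):
    """Modules 110/120: derive SC from SS and SO/SM from SD (stubbed)."""
    sc = {"scene": "placeholder"}              # captured-space information SC
    so = {"object": "placeholder"}             # object feature SO
    sm = {"motion": "placeholder"}             # object-motion information SM
    return sc, so, sm

def generate_event_text(sc, so, sm):
    """Text generation module 104: combine SC, SO, SM into ST."""
    return {"where": sc, "who": so, "how": sm}

sv = np.random.rand(10, 48, 64)                # ten dummy grey-scale frames
ss, sd = pre_process(sv)
st = generate_event_text(*image_process(ss, sd))
```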
According to an embodiment of the invention, the pre-processing module 102 is responsible for capturing the foreground data SD of the video data SV and eliminating duplicated pictures to reduce the amount of picture data to be processed. Since captured video usually contains some duplicated information, this operation relieves the computing load on the subsequent modules.
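As one possible way to carry out this division, and purely as an illustrative assumption (the patent does not name a specific algorithm), a Gaussian-mixture background subtractor such as OpenCV's MOG2 separates each frame into a background image and a foreground mask, and near-duplicate frames can be skipped with a simple difference test:

```python
# Illustrative background/foreground division with OpenCV's MOG2 subtractor;
# the algorithm choice, thresholds, and the file name are assumptions.
import cv2

subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)
cap = cv2.VideoCapture("monitor.mp4")          # hypothetical camera recording

prev = None
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if prev is not None and cv2.norm(frame, prev) < 1e3:
        continue                               # skip near-duplicate pictures
    prev = frame
    fg_mask = subtractor.apply(frame)          # foreground data SD (as a mask)
    background = subtractor.getBackgroundImage()   # background data SS
cap.release()
```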
According to an embodiment of the invention, the event-description information ST is a machine-readable text file, and the event-description information ST includes the WHO, WHAT, WHEN, WHERE, and HOW information related to the events that occurred in the video data SV. According to another embodiment of the invention, the event-description information ST includes any combination of the WHO, WHAT, WHEN, WHERE, and HOW information related to the events that occurred in the video data SV. According to an embodiment of the invention, the event-description information ST is in JSON format; according to another embodiment of the invention, the event-description information ST is in XML format.
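Purely as an illustration of what such a record might look like, the following snippet emits a JSON event description; every field name and value here is a hypothetical example, not a format defined by the patent.

```python
# Hypothetical JSON event-description record ST; all fields are illustrative.
import json

event_st = {
    "who":   {"object": "automobile", "license_plate": "ABC-1234"},
    "what":  "vehicle passing",
    "when":  "2014-05-28T14:03:12",              # time marker into the video
    "where": {"camera_id": 17, "zone": "north gate"},
    "how":   {"direction": "eastbound", "speed_kmh": 42},
}
print(json.dumps(event_st, indent=2))
```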
First, in Step 301, the feature-capture module 121 transforms the foreground data SD into a scale-space expression. That is, the image is convolved at different scales by a Gaussian filter and then down-sampled according to the given scale. According to an embodiment of the invention, the power of the Gaussian filter and the down-sampling ratio are usually chosen to be powers of 2. That is, in each iteration, the image is transformed into images of different scales by a ratio of 0.5, and the images at the different scales are convolved with Gaussian filters whose powers grow by powers of 2, to generate the scale space of the foreground data.
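A minimal sketch of such a scale-space construction, assuming the conventions just stated (down-sampling by 0.5 per octave, Gaussian powers growing by powers of 2); the parameter values are assumptions:

```python
# Sketch of Step 301: Gaussian scale space with 0.5 down-sampling per octave.
import cv2
import numpy as np

def build_scale_space(gray, octaves=4, scales=3, sigma0=1.6):
    space = []
    img = gray
    for _ in range(octaves):
        octave = [cv2.GaussianBlur(img, (0, 0), sigma0 * 2 ** (s / scales))
                  for s in range(scales + 1)]     # sigmas double across an octave
        space.append(octave)
        img = cv2.resize(img, (img.shape[1] // 2, img.shape[0] // 2))  # ratio 0.5
    return space

gray = (np.random.rand(128, 128) * 255).astype("uint8")  # dummy foreground image
pyramid = build_scale_space(gray)
```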
In Step 302, in order to find the critical points of the scale space, the critical points are taken as the maxima/minima of the Difference of Gaussians (DoG) images that occur at multiple scales.
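Continuing the sketch above, the DoG images within one octave can be formed by subtracting adjacent blur levels, and a pixel is kept as a critical point when it is an extremum over its 3x3x3 scale-space neighbourhood; the threshold value is an assumption:

```python
# Sketch of Step 302: DoG extrema over a 3x3x3 scale-space neighbourhood.
import numpy as np

def dog_extrema(octave, threshold=3.0):
    levels = [img.astype(np.float32) for img in octave]
    dogs = np.stack([levels[i + 1] - levels[i] for i in range(len(levels) - 1)])
    points = []
    for s in range(1, dogs.shape[0] - 1):
        for y in range(1, dogs.shape[1] - 1):
            for x in range(1, dogs.shape[2] - 1):
                cube = dogs[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
                v = dogs[s, y, x]
                if abs(v) > threshold and (v == cube.max() or v == cube.min()):
                    points.append((s, y, x))   # scale and position of a critical point
    return points
```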
In Step 303, the main purpose is to unify the directions of the critical points. In order to unify these directions, the scale-invariant feature transform algorithm assigns each critical point a dominant gradient direction, so that each eigenvalue maintains its value even when the image is rotated. The equations are listed as follows:

m(x, y) = √((L(x+1, y) − L(x−1, y))² + (L(x, y+1) − L(x, y−1))²)   (Eq. 1)

θ(x, y) = tan⁻¹((L(x, y+1) − L(x, y−1)) / (L(x+1, y) − L(x−1, y)))   (Eq. 2)

Eq. 1 is used to calculate the gradient amplitude of the critical points, and Eq. 2 is used to calculate the gradient direction of the critical points, in which L(x, y) is the grey-scale value of the pixel at position (x, y).
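For reference, a direct numpy transcription of Eq. 1 and Eq. 2, using arctan2, which keeps the quadrant signs that a bare tan⁻¹ would lose:

```python
# Eq. 1 and Eq. 2 as central differences on a grey-scale image L.
import numpy as np

def gradient_amplitude_direction(L, y, x):
    dx = float(L[y, x + 1]) - float(L[y, x - 1])
    dy = float(L[y + 1, x]) - float(L[y - 1, x])
    m = np.hypot(dx, dy)          # Eq. 1: gradient amplitude
    theta = np.arctan2(dy, dx)    # Eq. 2: gradient direction, in radians
    return m, theta
```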
After unifying the directions of the critical points, Step 304 is executed to calculate the descriptors of the eigenvalues.
In other words, the Euclidean distance is used to find, in the feature database 130, the object whose vector difference from the 128-dimension descriptor is the minimum; that object is thus the most similar object, and the feature-capture module 121 records it as the found object feature SO.
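A sketch of this nearest-neighbour lookup, with a toy two-entry database standing in for the feature database 130; the entry names are hypothetical:

```python
# Minimum-Euclidean-distance match against 128-dimension descriptors.
import numpy as np

def match_descriptor(descriptor, database):
    names = list(database)
    vectors = np.stack([database[n] for n in names])
    distances = np.linalg.norm(vectors - descriptor, axis=1)  # Euclidean distance
    best = int(np.argmin(distances))
    return names[best], float(distances[best])

feature_db = {"car": np.random.rand(128), "person": np.random.rand(128)}
print(match_descriptor(np.random.rand(128), feature_db))
```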
Regarding the found object feature SO mentioned above, for a continuously changing object feature SO, we continuously record the time of variance of each display pixel within the display block displaying the object feature SO. Then we extract the gradient direction of these times of variance to obtain the movement direction of the foreground block in the picture.
Then, on the whole Motion History Image, the X-direction and the Y-direction of the gradient are calculated according to the recorded position and the moving time, respectively (Step 701), so that the X-axis and Y-axis components of the moving speed are obtained. Finally, the moving direction of the foreground data SD of the image is calculated by the trigonometric function (Step 703), and the motion track is obtained by collecting a series of the motion-direction information. After that, the motion-detection module 122 records the moving direction and the motion track in the object-motion information SM, and compares the motion track of the object-motion information SM with the moving behaviors in the motion database 140; the moving direction and the actual speed can be obtained with the aid of the captured-space information SC. The motion-detection module 122 records the related information, such as the moving direction and the speed, in the behavior information SIM.
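A compact sketch of this motion-history computation: each pixel stores the last time it changed, and the gradient of those timestamps yields the movement direction (Steps 701 and 703). The difference threshold is an assumption:

```python
# Motion History Image sketch: per-pixel change times and their gradient.
import numpy as np

def update_mhi(mhi, prev_frame, frame, t, diff_threshold=30):
    changed = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16)) > diff_threshold
    mhi[changed] = t                 # record the time of variance of each pixel
    return mhi

def movement_direction(mhi):
    gy, gx = np.gradient(mhi)        # Step 701: Y- and X-direction gradients
    return np.arctan2(gy.mean(), gx.mean())  # Step 703: trigonometric direction

mhi = np.zeros((48, 64), dtype=np.float32)
f0 = np.zeros((48, 64), dtype=np.uint8)
f1 = f0.copy(); f1[20:30, 10:20] = 255       # a block appears and moves
mhi = update_mhi(mhi, f0, f1, t=1.0)
print(movement_direction(mhi))
```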
The text generation module 104 generates the event-description information ST related to the events that occurred in the video data SV according to the content of the captured-space information SC, the object feature SO, and the object-motion information SM. According to an embodiment of the invention, the event-description information ST is in JSON format. According to another embodiment of the invention, the event-description information ST is in XML format. According to an embodiment of the invention, the motion-detection module 122 is also able to detect moving behaviors in the motion database 140 that are defined by another user; the behaviors described herein are only used to illustrate the detection method of the invention and are not intended to limit the moving behavior in any way.
Since the encrypted video data SV stored in the storage module 802 may be quite large, searching it for a specific section related to an event that occurred would otherwise have to be done by human operators. The searching time and cost can be greatly reduced if we retrieve the event from the event-description information ST generated by the information-capture device 100 and then access the corresponding section according to the time marker recorded in the event-description information ST.
Returning to Step S91, after capturing the video data, the method further includes encrypting the video data to generate the encrypted image (Step S96); storing the encrypted image in the storage module (Step S97); and accessing the encrypted image according to the event-description information generated in Step S95 and searching the corresponding section of the encrypted image according to the related information of the event-description information (Step S98).
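A sketch of Steps S96 through S98 under assumed details: the video is stored as encrypted, time-stamped segments (here using Fernet from the third-party `cryptography` package, which the patent does not specify), and a segment is fetched back using the time marker recorded in the event-description information ST:

```python
# Assumed segment-level encryption and time-marker lookup (Steps S96-S98).
from cryptography.fernet import Fernet

key = Fernet.generate_key()
cipher = Fernet(key)
storage = {}                                     # storage module: marker -> ciphertext

def store_segment(time_marker, raw_bytes):
    storage[time_marker] = cipher.encrypt(raw_bytes)   # Steps S96 and S97

def fetch_segment(event_st):
    marker = event_st["when"]                    # time marker from ST
    return cipher.decrypt(storage[marker])       # Step S98

store_segment("2014-05-28T14:03:12", b"...video segment bytes...")
print(fetch_segment({"when": "2014-05-28T14:03:12"}))
```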
According to an embodiment of the invention, the device and method for information capture disclosed in the invention can be adapted to a great quantity of monitoring cameras to search for a specific automobile by its license plate. Using the event-description information ST generated by the information-capture device 100, the computer can find, in a very short time, which camera a car with the specific plate appears on, or can easily trace a car with the specific plate from one camera to another. The handling time and cost can be greatly reduced compared to manually filtering the monitoring screens or tracking vehicles with human resources, as in the prior art.
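To illustrate why this search is cheap, the lookup operates on the machine-readable event records rather than on the video itself; the field names below reuse the hypothetical JSON example given earlier:

```python
# Scan event descriptions (not video) for a license plate; the fields are
# the hypothetical ones from the earlier JSON example.
def find_plate(events, plate):
    return [(e["where"]["camera_id"], e["when"])
            for e in events
            if e.get("who", {}).get("license_plate") == plate]

events = [
    {"who": {"license_plate": "ABC-1234"}, "when": "14:03:12",
     "where": {"camera_id": 17}},
    {"who": {"license_plate": "XYZ-5678"}, "when": "14:05:40",
     "where": {"camera_id": 3}},
]
print(find_plate(events, "ABC-1234"))   # -> [(17, '14:03:12')]
```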
According to another embodiment of the invention, the invention can be adapted to a great quantity of monitoring cameras, such as those used by the Taipei Metropolitan Rapid Transit System. As soon as the administrator becomes aware that the crowd is growing rapidly, the administrator can take proper measures in response. For example, the information-capture device 100 may generate event-description information ST containing the number of people in the video data SV according to the video data SV captured by the video-capture device 101. The administrator can thus be aware of changes in the crowd immediately, according to the number of people in the event-description information ST, and make the best decision in advance.
While the invention has been described by way of example and in terms of preferred embodiment, it is to be understood that the invention is not limited thereto. Those who are skilled in this technology can still make various alterations and modifications without departing from the scope and spirit of this invention. Therefore, the scope of the present invention shall be defined and protected by the following claims and their equivalents.
Number | Date | Country | Kind
--- | --- | --- | ---
103118537 | May 2014 | TW | national