In video analytics, it is important to distinguish video foreground imagery (e.g., moving objects of interest) from video background (e.g., static parts of the image). Various methods for distinguishing foreground from background currently exist, but they lack the ability to accurately and quickly compute the background in dynamic scenes that contain textured and non-textured objects and that are subject to change (e.g., illumination changes, camera movement, or changes to the scene). This is especially important in real-time online applications, where on-demand availability of an accurate background model is necessary but is not provided by current solutions.
It would therefore be desirable and advantageous to have a method for real-time online extraction of a background model from a video stream. This goal is attained by embodiments of the present invention.
Embodiments of the present invention provide methods for analyzing a stream of video frames to differentiate foreground imagery from background imagery in real-time for immediate online use. In certain embodiments, the method yields results which are incrementally improved with each new frame processed.
Therefore, according to an embodiment of the present invention there is provided an apparatus for extracting a background model from a video stream of frames, the apparatus including: (a) a non-transitory data storage device, for storing data and executable program code; (b) a processor implementing: (c) a background model; (d) a block descriptor computer, for dividing a frame into a rectangular array of blocks, and for computing a block descriptor of a block; (e) a block descriptor of a current frame of the video stream; (f) a block descriptor of a previous frame of the video stream; (g) a match detector, for determining if a block descriptor of the current frame is substantially the same as a block descriptor of the previous frame; (h) a staging cache for storing block descriptors; and (i) an updater, for updating the background model according to a block descriptor in the staging cache.
In addition, according to another embodiment of the present invention, there is provided a method for using a data processor to extract a background model from a video stream of frames in a non-transitory memory, the method including: (a) obtaining, by the data processor, a frame from the video stream; (b) dividing, by the data processor, the frame into a rectangular array of blocks; (c) for each block in the rectangular array: (d) computing, by the data processor, a current block descriptor of the block; (e) if a corresponding block retrieved from the background model has a block descriptor that does not substantially match the current block descriptor, then: (f) if the corresponding block of the previous frame has a block descriptor that does substantially match the current block descriptor, and has matched the current block descriptor for at least a predetermined number of frames, then putting the current block descriptor into a staging cache in the non-transitory memory; and (g) updating, by the data processor, the background model according to a block descriptor selected from the staging cache.
Moreover, according to yet another embodiment of the present invention, there is provided a computer product including executable code instructions in a non-transitory storage device, which instructions, when executed by a data processor, cause the data processor to perform a method for extracting a background model from a video stream of frames in a non-transitory memory, the method including: (a) obtaining, by the data processor, a frame from the video stream; (b) dividing, by the data processor, the frame into a rectangular array of blocks; (c) for each block in the rectangular array: (d) computing, by the data processor, a current block descriptor of the block; (e) if a corresponding block retrieved from the background model has a block descriptor that does not substantially match the current block descriptor, then: (f) if the corresponding block of the previous frame has a block descriptor that does substantially match the current block descriptor, and has matched the current block descriptor for at least a predetermined number of frames, then putting the current block descriptor into a staging cache in the non-transitory memory; and (g) updating, by the data processor, the background model according to a block descriptor selected from the staging cache.
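By way of a non-limiting illustration, steps (a) through (d) above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of any embodiment: the frame is assumed to be a grayscale image held as rows of pixel intensities, the index i is assumed to select the block row and k the block column, and the descriptor shown (the block's pixels flattened in row-major order) is a hypothetical stand-in, since the form of the block descriptor is not fixed here. The names `divide_into_blocks` and `block_descriptor` are illustrative only.

```python
from typing import List

# Assumed frame format: a grayscale image as a list of rows of pixel intensities.
Frame = List[List[int]]


def divide_into_blocks(frame: Frame, block_h: int, block_w: int) -> List[List[Frame]]:
    """Split a frame into a rectangular array of blocks, indexed blocks[i][k].

    Blocks on the right and bottom edges are smaller when the frame size is
    not an exact multiple of the block size.
    """
    rows, cols = len(frame), len(frame[0])
    return [
        [
            [row[left:left + block_w] for row in frame[top:top + block_h]]
            for left in range(0, cols, block_w)
        ]
        for top in range(0, rows, block_h)
    ]


def block_descriptor(block: Frame) -> List[int]:
    """Hypothetical descriptor: the block's pixels flattened in row-major order."""
    return [pixel for row in block for pixel in row]


# A 4x4 toy frame divided into a 2x2 array of 2x2 blocks.
frame = [[r * 4 + c for c in range(4)] for r in range(4)]
blocks = divide_into_blocks(frame, block_h=2, block_w=2)
descriptors = [[block_descriptor(b) for b in block_row] for block_row in blocks]
```

In this sketch each descriptor is computed independently of its neighbors, so the per-block loop of step (c) can visit the blocks in any order.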
The subject matter disclosed may best be understood by reference to the following detailed description when read with the accompanying drawings in which:
For simplicity and clarity of illustration, elements shown in the figures are not necessarily drawn to scale, and the dimensions of some elements may be exaggerated relative to other elements. In addition, reference numerals may be repeated among the figures to indicate corresponding or analogous elements.
An embodiment of the invention provides a data processor 121 containing a non-transitory data storage unit 123 for maintaining data and executable program code for the following elements: a block descriptor computer module 141, for dividing a frame into a rectangular array of blocks and for computing a block descriptor of a block; a block descriptor 105A of block i=9, k=8 of frame 101B; a block descriptor 105B of block i=9, k=8 of frame 101C; a staging cache area 107 in data storage unit 123; a background model 103 representing a background image 113; and a match detector module 109, for detecting the condition where a current block descriptor (such as block descriptor 105B) represents substantially the same image block as the immediately-preceding block descriptor (such as block descriptor 105A). Match detector module 109 also detects the condition where the current block descriptor is not substantially the same as the immediately-preceding block descriptor. In various embodiments, a match between two blocks is determined according to normalized cross-correlation and the mean of absolute differences. In a related embodiment, the mean of absolute differences is compensated for the mean difference.
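As a non-limiting illustration, a match test of the kind performed by match detector module 109 might be sketched as below. Several points are assumptions made only for this sketch: descriptors are taken to be flat sequences of pixel intensities, both criteria are required to pass (how the two measures are combined is not specified above), the thresholds `ncc_min` and `mad_max` are arbitrary illustrative values, and two perfectly flat blocks are treated as correlated so that the mean of absolute differences decides the outcome.

```python
import math
from typing import Sequence


def normalized_cross_correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Zero-mean normalized cross-correlation of two equal-length blocks."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if den == 0.0:
        # At least one block is flat (non-textured), so correlation is undefined.
        # Assumption: two flat blocks count as correlated; otherwise they do not.
        both_flat = all(x == 0 for x in da) and all(y == 0 for y in db)
        return 1.0 if both_flat else 0.0
    return sum(x * y for x, y in zip(da, db)) / den


def mean_compensated_mad(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean of absolute differences after removing the mean difference."""
    n = len(a)
    mean_diff = (sum(a) - sum(b)) / n
    return sum(abs((x - y) - mean_diff) for x, y in zip(a, b)) / n


def blocks_match(a: Sequence[float], b: Sequence[float],
                 ncc_min: float = 0.8, mad_max: float = 10.0) -> bool:
    """Assumed decision rule: both measures must pass their thresholds."""
    return (normalized_cross_correlation(a, b) >= ncc_min
            and mean_compensated_mad(a, b) <= mad_max)


textured = [10, 20, 30, 40]
brighter = [15, 25, 35, 45]    # same texture under a uniform brightness offset
different = [40, 10, 35, 5]    # different texture
same_under_offset = blocks_match(textured, brighter)
changed = not blocks_match(textured, different)
```

Because the mean difference is subtracted before the absolute differences are averaged, a uniform brightness offset between two blocks does not by itself prevent a match in this sketch.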
According to an embodiment of the present invention, the various components and modules described and illustrated herein are implemented by data processor 121 via executable code contained in non-transitory storage unit 123.
Frame 101C immediately follows frame 101B in video stream 101. According to this embodiment, in this particular non-limiting example, the blocks of frame 101C are compared with the corresponding blocks of frame 101B to detect changes. In this non-limiting example, match detector module 109 compares block descriptor i=9, k=8 of frame 101C with block descriptor i=9, k=8 of frame 101B to determine if there is substantially the same image fragment in the block at i=9, k=8. In this example, the block at i=9, k=8 will not be substantially the same, because the object of interest, vehicle 131, has moved in relation to the background scene, street and building 133.
If a block descriptor in the current frame matches the corresponding block descriptor in the background model, then the block descriptor of the background model is updated directly using the block descriptor of the current frame. If a block descriptor in the current frame does not match the corresponding block descriptor of the background model but is substantially the same as the corresponding block descriptor of the previous frame, then the current block descriptor is placed in staging cache 107, which contains candidate blocks for updating background model 103. If the block descriptor in staging cache 107 matches the corresponding block descriptor of the current frame for at least a predetermined number of frames, then an updater module 143 updates background model 103 according to the block descriptor, by putting the block descriptor into the corresponding block of background model 103. In various embodiments, updater module 143 contains a best-fit block selector module 145 to select a candidate block from staging cache 107 which best fits into the local environment (the immediate neighbors of the block). In a related embodiment, the quality of the local fit is computed by best-fit block selector module 145 based on spectral response intensities and the degree of discontinuity between neighboring blocks, where the solution is found according to the known Iterated Conditional Modes (ICM) algorithm.
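The update logic described above can be illustrated, in a non-limiting way, by the following sketch. It rests on several assumptions that are not taken from the embodiments: the match test `substantially_same` is a simple per-element tolerance used as a stand-in, only one candidate is staged per block (so the best-fit selection among several candidates is omitted), a candidate is counted from the frame in which it is staged, and `min_stable_frames=3` is an arbitrary value for the predetermined number of frames. On the first frame the sketch only initializes the model. All class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Descriptor = List[int]
BlockIndex = Tuple[int, int]  # (i, k)


def substantially_same(a: Descriptor, b: Descriptor, tol: int = 5) -> bool:
    """Stand-in match test (assumption): every element differs by at most tol."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))


@dataclass
class StagedBlock:
    descriptor: Descriptor
    stable_frames: int = 1  # frames for which the candidate has matched so far


@dataclass
class BackgroundExtractor:
    min_stable_frames: int = 3  # the "predetermined number of frames" (assumed value)
    background: Dict[BlockIndex, Descriptor] = field(default_factory=dict)
    previous: Dict[BlockIndex, Descriptor] = field(default_factory=dict)
    staging: Dict[BlockIndex, StagedBlock] = field(default_factory=dict)

    def process_block(self, index: BlockIndex, current: Descriptor) -> None:
        model = self.background.get(index)
        prev = self.previous.get(index)
        if model is None:
            # First frame: initialize the model from the block itself.
            self.background[index] = current
        elif substantially_same(model, current):
            # Current block agrees with the model: update the model directly.
            self.background[index] = current
            self.staging.pop(index, None)
        elif prev is not None and substantially_same(prev, current):
            # Differs from the model but is stable between frames: stage it.
            staged = self.staging.get(index)
            if staged is not None and substantially_same(staged.descriptor, current):
                staged.stable_frames += 1
            else:
                staged = StagedBlock(current)
                self.staging[index] = staged
            if staged.stable_frames >= self.min_stable_frames:
                # Stable for long enough: promote the candidate into the model.
                self.background[index] = staged.descriptor
                del self.staging[index]
        else:
            # Still changing (e.g., a moving object): discard any candidate.
            self.staging.pop(index, None)
        self.previous[index] = current


road = [100, 100, 100, 100]
parked_car = [30, 30, 30, 30]
extractor = BackgroundExtractor(min_stable_frames=3)
for descriptor in [road, parked_car, parked_car, parked_car, parked_car]:
    extractor.process_block((9, 8), descriptor)
```

In this usage example, a block that changes once and then remains unchanged is eventually absorbed into the background model, whereas a block that changes for a single frame and then reverts never leaves the staging path and the model is left as it was.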
By analyzing the changes of the image blocks as described in detail herein below, apparatus according to these embodiments of the present invention extract background image 113 from video stream 101.
The loop continues at an end of loop 321. If there are more blocks, the loop is repeated. Otherwise, the method terminates at an end-point 323.
According to a related embodiment of the present invention, in the case of the first frame of video stream 101, there is no previous frame, and therefore each block is put into staging cache 107 by step 315, and background model 103 is initialized with the blocks of the first frame.
In addition, according to a further embodiment of the present invention, there is provided a computer product including executable code instructions in a non-transitory storage device, which instructions, when executed by a processor, cause the processor to perform a method according to an embodiment of the present invention.
This application claims the benefit of U.S. Provisional Application No. 62/067,687, filed on Oct. 23, 2014, which is hereby incorporated by reference in its entirety.
Number | Date | Country
---|---|---
62067687 | Oct 2014 | US