The present disclosure relates generally to a device and method for computing image features over discrete image regions, and more particularly for computing image features over regions having an arbitrary non-simply connected rectangular shape.
Integral image and integral histogram computations (sometimes referred to herein as “integral computations”) can be used to compute image statistics, such as mean, co-variance and histogram for a set of pixels in a simple rectangular image region.
An exemplary integral image computation is an aggregate function where, starting from an origin point in a set of image data and traversing through the remaining points along a scan-line, the image values are summed so that each point has a cumulative value that represents the sum of the previously scanned adjacent points and the current point being scanned. An integral image representation can be created which is a representation of the cumulative image data for all the data points in the image.
The integral image representation T of an image I can be illustrated with reference to
The integral image representation can be used to efficiently calculate the sum of I, over rectangle D shown in
The use of image integral representations for face detection is described in P. Viola and M. J. Jones, “Robust Real-Time Face Detection,” IJCV, vol. 57, pages 137-154 (2004), the disclosure of which is incorporated herein by reference in its entirety.
An integral histogram computation can be calculated similarly, where the integral histogram is updated at the current data point using the histograms of the previously scanned adjacent data points. At each point, the count of the bin into which the value of the point falls is incremented. After the integral histogram representation of the image is computed, histograms of rectangular target regions can be computed by using the integral histogram values at the corner points of the rectangular target region. The integral histogram of the target regions is calculated similarly to the image integral representation discussed above.
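The integral histogram described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the cited reference: the function names `integral_histogram` and `rect_histogram`, the zero-padded first row and column, and the use of pre-quantized integer bin labels as input are all hypothetical choices made for the example.

```python
def integral_histogram(labels, num_bins):
    """Build H, where H[r][c][b] counts bin b over labels[0:r][0:c].

    Assumption: `labels` already holds one integer bin index per pixel,
    and H carries an extra zero row and column to simplify the borders.
    """
    rows, cols = len(labels), len(labels[0])
    H = [[[0] * num_bins for _ in range(cols + 1)] for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            cell = H[r + 1][c + 1]
            for b in range(num_bins):
                # Combine the histograms of previously scanned neighbours.
                cell[b] = H[r][c + 1][b] + H[r + 1][c][b] - H[r][c][b]
            # Increment the bin of the current point.
            cell[labels[r][c]] += 1
    return H


def rect_histogram(H, top, left, bottom, right):
    """Histogram over rows top..bottom-1 and columns left..right-1,
    obtained from the four corner values only."""
    num_bins = len(H[0][0])
    return [
        H[bottom][right][b] - H[top][right][b]
        - H[bottom][left][b] + H[top][left][b]
        for b in range(num_bins)
    ]


labels = [
    [0, 1, 1, 2],
    [0, 0, 1, 2],
    [2, 2, 1, 0],
]
H = integral_histogram(labels, 3)
hist = rect_histogram(H, 0, 1, 2, 3)  # rows 0..1, columns 1..2
```

In this sketch the histogram of any rectangular target region costs four lookups per bin, regardless of how many pixels the region contains.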
The use of integral histogram computations is described in F. Porikli, “Integral Histogram: a Fast Way to Extract Histograms in Cartesian Spaces,” CVPR, vol. 1, pp. 829-836 (Jun. 20-25, 2005), the disclosure of which is incorporated herein by reference in its entirety.
Using integral image computations to calculate region based features can be more efficient than using the original image itself because the overall computational cost is lowered.
The use of integral computations, including image integral computations and integral histogram computations, has been limited to computing the sum of the image data over simply connected rectangular regions having only four corners (referred to herein simply as “rectangular”). This limitation can prevent the use of integral computations for more complex, non-simply connected rectangular regions (referred to herein as “generalized rectangular”). This is a drawback because image data generally represents an array of pixels that includes generalized rectangular regions rather than simple rectangular regions.
It has been discovered that integral computations, such as integral image and integral histogram, can be used over generalized rectangular regions in addition to rectangular regions. The use of integral computations over generalized rectangular regions can enable the fast computation of statistics (e.g., mean and co-variance) of multidimensional vector-valued functions over (discrete) regions having arbitrary shape.
According to a first aspect of the present disclosure, there is provided an image statistic computation device for computing region-based image statistics from an inputted image, where the device includes a propagation device, a target region recognition device, a corner analyzing device and a computation device. The propagation device can be configured to produce a cumulative image representation by propagating an aggregate function through image data from the inputted image, where each data point in the cumulative image representation includes cumulative image information that is based on a value of previously propagated adjacent points as well as a value of the data point. The target region recognition device can be configured to identify a generalized rectangular region from the cumulative image representation. The corner analyzing device can be configured to identify and characterize each corner of the identified generalized rectangular region. The computation device can calculate statistical information over the at least one generalized rectangular region based on the type of each corner and based on the cumulative image information at each corner.
In another aspect, there is provided a method for computing statistical information over a region in image data from an inputted image, the method including producing a cumulative image representation by propagating an aggregate function through the image data, identifying a generalized rectangular region in the cumulative image representation, identifying corners of the generalized rectangular region, characterizing the type of each corner of the generalized rectangular region, assigning a value to each corner based on the type of each corner, and computing statistical information over the generalized rectangular region based on the value assigned to each corner of the generalized rectangular region and the cumulative image information at each corner of the generalized rectangular region.
The present disclosure also can provide a method for calculating an identifying descriptor for a person or object in an image, where the method includes calculating an appearance labeled image from image data taken from an image, creating a cumulative image representation by propagating an aggregate function through the appearance labeled image, identifying a first generalized rectangular region from the cumulative image representation, identifying and characterizing the corners of the first generalized rectangular region, and calculating a first image statistic over the first generalized rectangular region based on the cumulative image information of each corner and based on the type of each corner.
In another embodiment, an identifying descriptor can be calculated by calculating a shape labeled image from the image data, identifying a second generalized rectangular region from the shape labeled image, identifying and characterizing the corners of the second generalized rectangular region, identifying the portions of a cumulative image representation of an appearance labeled image that correspond to the corners of the second generalized rectangular region, and calculating a second image statistic over the second generalized rectangular region based on the type of each corner of the second generalized region and the first image statistic calculated at each data point of the cumulative image representation that corresponds to a corner of the second generalized rectangular region.
Exemplary embodiments are described in detail below with reference to the accompanying drawings in which:
Exemplary embodiments of the broad principles outlined herein are described with reference to the various drawings.
The pre-processing device 110 can receive an input image, such as video images or still photos. From the input image, the pre-processing device 110 can be used to process the image data by converting the data, if necessary, into a predetermined format for the propagation device 120. The pre-processing device transmits the converted image data to the propagation device 120.
The propagation device 120 receives the image data from the pre-processing device 110 and propagates an aggregate function through the image data to convert the image data processed in pre-processing device 110 into a cumulative image representation where each point in the cumulative image representation includes cumulative data that is based on a value of adjacent data points processed before it, as well as a value of the data point itself. The cumulative image representation created by the propagation device 120 is then transmitted to a target region recognition device 130.
The target region recognition device 130 receives the cumulative image representation from the propagation device 120, and identifies a generalized rectangular region within the cumulative image representation. The generalized rectangular region can be a region of interest that is selected so that image features over the region can be calculated. The generalized rectangular region identified by the target region recognition device 130 is transmitted to the corner analyzing device 140.
The corner analyzing device 140 receives the generalized rectangular region from the target region recognition device 130 and analyzes the generalized rectangular region to identify each corner point of the region and characterize it as one of a set of predefined corner types. The corner analyzing device 140 transmits the corner type information for each corner point to the computation device 150.
The computation device 150 receives the corner information from the corner analyzing device 140 and calculates image statistics over the generalized rectangular region of the cumulative image representation by considering the value of the cumulative image representation at the corners of the generalized rectangular region and the corner type information of each corner. The computation device 150 can output the calculated image statistic. The outputted statistic can be output to a display, a memory or can be used in further processing, for example. The statistics computed over the generalized rectangular regions can be used in any application where it is desired to calculate image data over certain target regions of an image.
In an example of operation of an image statistic computation, the pre-processing device 110 can receive image data from an input image and process the image data to convert the image data into any form that is necessary for subsequent processing. For example, the cumulative image representation that is subsequently calculated in the propagation device 120 can be based on color or brightness data from the pixels, or can be based on pixel data that is converted into more complex forms. For example, the pre-processing device can be used to convert pixel data from RGB data into Log-RGB color space data, Lab color space data, HSV data or YIQ data. The pre-processing device can process the image data to extract and normalize pixels of interest, for example, those capturing people or objects of interest. The pre-processing device can be used to calculate more complex image statistics such as the histogram of the oriented gradient (HOG) values. The pre-processing device can filter the pixel data, and quantize and label the pixel data. The pre-processing device can be used to convert the image data into even more complex formats, for example, to create shape labeled images and appearance labeled images, as described in greater detail in co-pending U.S. patent application entitled “Image Processing for Person and Object Re-Identification.” The pre-processing device 110 transmits the processed image data to the propagation device 120.
The propagation device 120 can receive the image data processed in the pre-processing device 110 and further process the data to create a cumulative image representation. The propagation device 120 can create the cumulative image representation by propagating an aggregate function through the image data, starting from an origin point in the data and traversing through remaining points of the image data along a scan-line, to propagate the aggregate function over the image data. The aggregate function at each data point (typically each pixel) uses the values of the aggregate function from adjacent data points that were previously processed. The propagation device 120 can propagate the aggregate function on a pixel-by-pixel basis.
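One of the pre-processing conversions mentioned above, RGB to HSV followed by quantizing and labeling, can be sketched as follows. This is only an illustrative example: the function name `hue_label_image`, the choice of hue as the quantized attribute, and the number of bins are assumptions made here and are not specified by the disclosure. The conversion itself uses `colorsys.rgb_to_hsv` from the Python standard library.

```python
import colorsys


def hue_label_image(rgb_image, num_bins):
    """Convert 8-bit RGB pixels to HSV and label each pixel by its hue bin.

    Assumption: hue alone is used as the labeling attribute, and it is
    quantized uniformly into `num_bins` bins.
    """
    labeled = []
    for row in rgb_image:
        out_row = []
        for (r, g, b) in row:
            h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            # h lies in [0, 1); clamp guards against the upper edge.
            out_row.append(min(int(h * num_bins), num_bins - 1))
        labeled.append(out_row)
    return labeled


# Hypothetical 2x2 input: red, green, blue and magenta pixels.
rgb_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 0, 255)],
]
labels = hue_label_image(rgb_image, 4)
```

The resulting integer label image is the kind of quantized input over which a cumulative histogram can later be propagated.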
The aggregate function can include integral computations, such as integral image and integral histogram. For example, as discussed above, the integral image function can be used to propagate the sum of the image intensities or other image features throughout the image data, and the integral histogram function can be used to propagate a cumulative histogram of image features throughout the image data.
The propagation device 120 can propagate the aggregate function throughout the image data along scan-lines to produce a cumulative image representation that includes cumulative image data at each point. The image data is typically scanned starting from the top-left of the image, propagating the aggregate function from left to right, and then returning to the left side of the next row of pixels until data from each pixel is converted into cumulative image data including the values of previously processed adjacent pixels in addition to the value of the pixel being scanned. Thus, for a left-to-right propagation, the cumulative image representation at a given data point will hold the sum of all values to the left of and above the point, including the value of the point itself. Cumulative image representations can be produced by propagating an aggregate function along a scan-line in any suitable sequence. For example, the principles outlined in this disclosure could readily be adapted to form a cumulative image representation that is produced by scanning top-to-bottom along the image data. The cumulative image representation created by the propagation device 120 can be transmitted to the target region recognition device 130.
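The left-to-right, top-to-bottom propagation described above can be sketched for the integral image case as follows. The function name `integral_image` and the use of nested Python lists are assumptions made for illustration; the sketch follows the inclusive convention stated above, in which each point holds the sum of all values to its left and above it, including its own value.

```python
def integral_image(image):
    """Single-pass scan-line propagation of a running sum.

    T[r][c] holds the sum of image[i][j] for all i <= r and j <= c.
    """
    rows, cols = len(image), len(image[0])
    T = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        row_sum = 0  # cumulative sum along the current scan-line
        for c in range(cols):
            row_sum += image[r][c]
            # Add the cumulative value of the point directly above, if any.
            T[r][c] = row_sum + (T[r - 1][c] if r > 0 else 0)
    return T


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
T = integral_image(image)
```

Each pixel is visited once, so the cost of producing the cumulative representation is proportional to the number of pixels.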
The target region recognition device 130 can receive the cumulative image representation from the propagation device 120. The target region recognition device 130 can be configured to identify a region of interest in the cumulative image representation received from the propagation device 120. The target region recognition device can identify a generalized rectangular region. The generalized rectangular region is a non-simply connected rectangle, such that the boundaries of the generalized rectangular region are made of a collection of portions of a finite number of hyperplanes that are perpendicular to one of the axes of a reference coordinate system. The generalized rectangular region identified by the target region recognition device 130 represents a region over which image features will be computed. Referring to
The target region recognition device 130 can be configured to identify existing regions in the cumulative image representation that are defined by a common feature, for example, or can be configured to select specific regions of a predetermined shape and/or location from the cumulative image representation. The target region recognition device 130 can also be configured to identify generalized rectangular regions of any shape. For example, the target region can be a generalized rectangular region that has no holes. Similarly, the target recognition device can be configured to identify multiple generalized rectangular regions, generalized rectangular regions with multiple holes, simply-connected rectangular regions, and any combination of the foregoing. Additionally, while the generalized rectangular region illustrated in
The corner analyzing device 140 can receive the generalized rectangular region from the target region recognition device 130 and can inspect the corners of the generalized rectangular region D to evaluate the corners according to a predetermined corner characterization function. The corner characterization function can be used to assign values to each corner point on the generalized rectangular region based on the type of each corner. The corner characterization function can depend on the dimension k and the scan-line used to create the cumulative image representation.
For a planar region D, a generalized rectangular region can have the 10 different types of corners illustrated in
As can be seen in
The function αD depends on the dimension k of the polygonal region D. A function αD for non-planar dimensions can be derived based on the above-principles.
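For the planar case, one possible corner characterization function can be sketched by looking at the four pixels that meet at a grid point. This is an assumption-laden illustration: the referenced figure is not reproduced here, so the sign convention below (chosen to match a top-left origin and left-to-right scan) and the name `corner_value` are hypothetical and may differ from the values shown in the drawings. The sketch enumerates all 2×2 membership patterns and keeps those with a nonzero value.

```python
from itertools import product


def corner_value(upper_left, upper_right, lower_left, lower_right):
    """Value assigned to a grid point from the region membership (0 or 1)
    of the four pixels that meet there.

    Assumed sign convention: mixed second difference of the region's
    indicator function, for a scan that starts at the top-left.
    """
    return upper_left - upper_right - lower_left + lower_right


# Enumerate every 2x2 membership pattern and keep the true corners.
corner_types = {}
for pattern in product((0, 1), repeat=4):
    value = corner_value(*pattern)
    if value != 0:
        corner_types[pattern] = value

num_corner_types = len(corner_types)
distinct_values = sorted(set(corner_types.values()))
```

Under this convention, patterns with exactly one or exactly three member pixels give ±1, the two diagonal patterns give ±2, and straight edges, interior points and exterior points give 0, which yields 10 nonzero corner patterns.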
The corner analyzing device 140 can recognize whether a pixel of an image representation is a corner, and if so, can determine what type of corner it is. For example, in the planar case, the corner analyzing device 140 can determine for each pixel whether it is a corner and, if so, which of the 10 types of corners it belongs to. One embodiment illustrating the implementation of the corner analyzing device is illustrated in
A special check can be performed to verify whether pixel (x, y) is an isolated point. This can be checked by determining whether the 3×3 template is like the one illustrated in
Referring back to
As background, the aggregate function propagated in propagation device 120, such as integral image or integral histogram computations, can be generalized to any function ƒ(x), such that ƒ(x): R^k → R^m with antiderivative F(x). For a simple 2D case (i.e., k=2) of a rectangular region D, the following equation can be written for the integral
Similar equations can be written for k>2. Additionally, if x is a uniformly distributed random variable and E[•|D] denotes the expectation where x is constrained to assume values in D, then one can write the expression of simple statistics, such as the mean of ƒ(x) over D
or the covariance of f(x) over D:
where g(x): R^k → R^(m×m) is such that x ↦ ƒ(x)ƒ(x)^T. Similarly, higher-order moments can be written in this manner.
Expressions (3) and (4) can assume very different meanings depending on the choice of ƒ(x). For instance, for the integral image they represent mean and covariance of the pixel intensities over the region D. On the other hand for the integral histogram, equation (3) is the histogram of the pixels of region D, according to quantization q. What those expressions share is the fact that the integral operation can be substituted with the result of Equation (2).
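For reference, the mean and covariance of ƒ(x) over D for a uniformly distributed x can be written in the standard form below. This is a sketch based on the definitions given above (g(x) = ƒ(x)ƒ(x)^T and the expectation E[•|D]); the symbol |D| for the area of D is an assumption introduced here, and the notation of the original numbered expressions may differ.

```latex
\[
\begin{aligned}
E[f(x) \mid D] &= \frac{1}{|D|} \int_D f(x)\, dx,
\qquad |D| = \int_D dx, \\
\operatorname{Cov}[f(x) \mid D] &= \frac{1}{|D|} \int_D g(x)\, dx
  \;-\; E[f(x) \mid D]\, E[f(x) \mid D]^{T},
\qquad g(x) = f(x) f(x)^{T}.
\end{aligned}
\]
```

Both quantities therefore reduce to integrals over D, which is why any method that evaluates such integrals from corner values also yields these statistics.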
An integral image statistic can be calculated in the computation device 150 over the generalized rectangular region by summing up the values of the product of F(x), determined in the propagation device 120, and the values based on αD, determined in the corner analyzing device 140.
An exemplary process for this calculation can be described as follows: D⊂Rk can be a generalized rectangular region where the boundary ∂D is made of a collection of portions of a finite number of hyperplanes perpendicular to one of the axes of Rk. If ∇·D indicates the set of corners of a generalized rectangular region D, then
where αD: R^k → Z is a map that depends on k. For k=2, αD(x) ∈ {0, ±1, ±2}, according to which of the types of corners x belongs to. Thus, if D is a generalized rectangular region, the integral of ƒ(x) over D can be computed in constant time. This can be done by summing up the values of F(x), computed at the corners x ∈ ∇·D, and multiplied by αD(x), which depends on the type of corner. Therefore, for any discrete region D, statistics over region D can be computed in constant time simply by inspecting the corners to evaluate αD.
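The corner-based summation described above can be sketched for a discrete planar region as follows. This is a minimal illustration under stated assumptions rather than the disclosed implementation: the region is given as a hypothetical binary mask, corners are located at grid points between pixels, the cumulative representation F is zero-padded by one row and column, and the corner values follow one possible sign convention for a top-left origin. The names `padded_integral`, `find_corners` and `region_sum` are illustrative only.

```python
def padded_integral(image):
    """F[r][c] = sum of image[0:r][0:c], with a zero first row and column."""
    rows, cols = len(image), len(image[0])
    F = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            F[r + 1][c + 1] = image[r][c] + F[r][c + 1] + F[r + 1][c] - F[r][c]
    return F


def find_corners(mask):
    """Return (row, col, alpha) for every grid point with nonzero alpha."""
    rows, cols = len(mask), len(mask[0])

    def member(r, c):
        # Pixels outside the image are treated as outside the region.
        return mask[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    corners = []
    for i in range(rows + 1):
        for j in range(cols + 1):
            alpha = (member(i - 1, j - 1) - member(i - 1, j)
                     - member(i, j - 1) + member(i, j))
            if alpha != 0:
                corners.append((i, j, alpha))
    return corners


def region_sum(F, corners):
    """Sum over the region using only the cumulative values at its corners."""
    return sum(alpha * F[i][j] for i, j, alpha in corners)


image = [[r * 5 + c for c in range(5)] for r in range(5)]
# Hypothetical generalized rectangular region with one hole.
mask = [
    [1, 1, 1, 1, 0],
    [1, 0, 0, 1, 0],
    [1, 0, 0, 1, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
corners = find_corners(mask)
fast = region_sum(F=padded_integral(image), corners=corners)
brute = sum(image[r][c] for r in range(5) for c in range(5) if mask[r][c])
```

Once the corners and their values are known, the sum touches only the corner points, so its cost depends on the number of corners rather than on the number of pixels in the region.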
According to the present disclosure, Equation 1 defined above in connection with the integral image computation can be extended from simple rectangular regions (
The computational complexity to evaluate K features over generalized rectangular regions with Q pixels and W corners, out of an image of N×N pixels, is O(KQ) if it is computed with the original image representation, and is O(N^2+KW) if it is computed with the integral image representation using Equation 5. Typically, W is much smaller than Q. When K is large, it is therefore more efficient to evaluate features using the integral image representation than simply by using the original image.
In one embodiment, the framework described above can be used to calculate region based image features to re-identify a person or object in a plurality of images, as described in greater detail in the co-pending application entitled “Image Processing for Person and Object Re-Identification.” As described therein, a shape labeled image and an appearance labeled image can be created from inputted image data. The shape labeled image can be created by assigning a shape label to each pixel based on an attribute of the pixel that is characteristic of the shape of the part of the person or object that is captured by the pixel. Similarly, the appearance labeled image can be created by assigning an appearance label to each pixel based on an attribute of the pixel that is characteristic of the appearance of the pixel data. The shape labeled image and appearance labeled image can be processed to compute an occurrence matrix that can be used as an identifying descriptor for a person or object in the image. Re-identification can be used in security applications, forensic applications, for identifying missing people, and for tracking people or objects in crowded environments such as mass transit and airports. The calculation of image statistics over generalized regions based on the appearance labeled image and shape labeled image can be greatly simplified by using the approaches described above. In one embodiment, calculating image statistics over generalized rectangular regions in a shape labeled image and an appearance labeled image is described in greater detail with reference to
In
If image I contains a person or object of a given class, A can be its appearance labeled image, and S (defined over Λ) can be its shape labeled image, where pixel labels are meant to identify regions of image I occupied by specific parts of the object. The descriptor Θ may be determined as follows. S: Λ → S and A: Λ → A are two functions defined on a discrete region Λ of dimensions M×N, and assuming values in the label sets S={s_1, . . . , s_n} and A={a_1, . . . , a_m} respectively. Also, P={p_1, . . . , p_l} is a partition such that ∪_i p_i represents the plane, and p_i ∩ p_j = ∅ if i≠j. If p∈P and x is a point on the plane, p(x) can be defined as p(x)={x+y | y∈p}, and h(a, p(x))=P[A(y)=a | y∈p(x)] can represent the probability distribution of the labels of A over the region p(x), where P is a probability measure.
In other words, for a given A, and a randomly selected point y∈p(x), the probability that the label at that point will be a is given by h(a, p(x)). For example, in
Also, if Ds={x | S(x)=s}, s∈S, the occurrence function can be defined as follows. The occurrence function Θ: A×S×P → R+ can be defined such that point (a, s, p) maps to
If x is a uniformly distributed random variable, E[•|D] denotes the statistical expectation where x is constrained to assume values in D. For example, the computation of the mean and covariance can describe the notation of expectation E[.|D].
Θ computed over S and A is an m×n×l matrix. Θ can be a collection of values corresponding to all the points of the region A×S×P which is referred to sometimes herein as the occurrence matrix. The occurrence matrix can be used as a unique identifying descriptor for each part or region Ds because, given S and A, for a randomly selected point x∈Ds, the probability distribution of the labels A over the region p(x) of A can be represented by Θ(•,s,p).
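A direct evaluation of the occurrence matrix from the definitions above can be sketched as follows, reading Θ(a, s, p) as the average of h(a, p(x)) over the points x of Ds. This is a hedged illustration only: the name `occurrence_matrix`, the representation of each partition as a finite list of (row, column) offsets, and the handling of points that fall outside Λ (they are ignored, and a partition with no points inside contributes nothing) are assumptions made for the example.

```python
def occurrence_matrix(A, S, partitions, num_a, num_s):
    """Brute-force theta[a][s][k]: mean over x in D_s of h(a, p_k(x)).

    Assumptions: A and S are label images of equal size holding integer
    labels; each partition is a finite list of (d_row, d_col) offsets;
    offsets that land outside the image are skipped.
    """
    rows, cols = len(A), len(A[0])
    num_p = len(partitions)
    theta = [[[0.0] * num_p for _ in range(num_s)] for _ in range(num_a)]
    region_size = [0] * num_s
    for r in range(rows):
        for c in range(cols):
            s = S[r][c]
            region_size[s] += 1
            for k, offsets in enumerate(partitions):
                inside = [
                    (r + dr, c + dc) for dr, dc in offsets
                    if 0 <= r + dr < rows and 0 <= c + dc < cols
                ]
                if not inside:
                    continue
                # Accumulate the label distribution h(., p_k(x)) at this x.
                for y_r, y_c in inside:
                    theta[A[y_r][y_c]][s][k] += 1.0 / len(inside)
    # Average over the points of each shape-labeled region D_s.
    for a in range(num_a):
        for s in range(num_s):
            if region_size[s]:
                for k in range(num_p):
                    theta[a][s][k] /= region_size[s]
    return theta


A = [[0, 0, 1], [0, 1, 1], [1, 1, 1]]  # appearance labels
S = [[0, 0, 0], [1, 1, 1], [1, 1, 1]]  # shape labels
partitions = [
    [(0, 0)],                           # hypothetical: the point itself
    [(0, 1), (1, 0), (0, -1), (-1, 0)]  # hypothetical: its 4-neighbours
]
theta = occurrence_matrix(A, S, partitions, num_a=2, num_s=2)
```

For each shape label s and partition p, the values Θ(•, s, p) in this sketch sum to one, consistent with their interpretation as a probability distribution over appearance labels. Because every point visits every element of every partition, this direct evaluation becomes costly as the image and the partitions grow.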
For an image having N×N pixels, the computational complexity of calculating Θ using conventional methods is O(N^4). Such a computation cost is considered impractical for real-time evaluation of image statistics. However, using the procedures outlined above in the pending application, the computation cost can be greatly reduced by identifying and characterizing the corner points for the generalized rectangular partitions p, and identifying and characterizing the corner points of the generalized rectangular shape labeled regions s. Specifically, calculation of the occurrence function Θ can be simplified according to the following equation:
and e: A → N^m is such that a label ai∈A is mapped to ei, where ei is the unit vector with only the i-th component different from 0, and therefore, the inner integral is the integral histogram of A. Note that a∈A is intended to index one of the elements of the m-dimensional vector G(•, x). ∇·DS indicates the set of corners of a generalized rectangular shape labeled region s, and ∇·p indicates the set of corners of a generalized rectangular partition from the mask.
Thus, based on Equations 7 and 8 the occurrence matrix can be calculated efficiently using Algorithm 1, below:
The computation cost of Equation 7 is O(N^2+CsCp), which is generally O(N^2) in practice because CsCp is of the same order as N^2.
The appearance labeled image is output in step S12, and during operation of a propagation device, a cumulative integral histogram representation of the appearance labeled image is calculated based on a single-pass inspection of the appearance labeled image in step S13. The propagation device outputs the cumulative integral histogram.
The shape labeled image is output from step S12, and during operation of a target region recognition device, the shape labeled image can be analyzed in step S14 to identify a generalized rectangular region DS that corresponds to a group of pixels having the same shape label. During operation of a corner analyzing device, the generalized rectangular region DS can be analyzed in step S15 to identify and characterize corner points of region DS. A value can be assigned to each of the corner points of DS based on the type of each corner and a predetermined corner characterization function.
In step S16, during operation of a second target region recognition device, the cumulative integral histogram is received and a mask having a plurality of partitions can be placed over a pixel in the cumulative integral histogram. The mask can be placed over a pixel 601 that corresponds to a corner point pixel of DS, as illustrated in
During operation of a computing device, the average integral histogram over each partition that is placed over the cumulative integral histogram can be calculated in step S20 based on the cumulative integral histogram at each corner point of the partition and the type of each corner. In step S22, a vector can be calculated for the pixel that the mask was centered over, where the vector represents the average integral histograms for all of the partitions in the mask.
In step S24, the process steps described in steps S16-S22 can be repeated for each pixel of the cumulative integral histogram representation that corresponds to a corner point of DS, such that the mask is superimposed over each pixel corresponding to a corner point of DS. In step S26, the average values of the vectors calculated in step S22 over region DS can be calculated based on the value of the vector and the type of corner of DS that was determined in step S15.
To calculate a shape and appearance context descriptor that can be used to identify a person or object in the image, the steps of S15-S26 can be repeated for each region in the shape labeled image to calculate the occurrence matrix Θ shown in
The process described above illustrates an exemplary embodiment of a process used to calculate region based image features for determining an identifying descriptor of a person or object in an image. The image statistic computation device and method can also be exploited in any application where it is desired to calculate region based image features. Some examples of useful applications include face detection engines for people detection and tracking, and medical imaging detection and recognition applications.
While the disclosed methods and systems have been described in conjunction with exemplary embodiments, these embodiments should be viewed as illustrative, not limiting. It should be understood that various modifications, substitutes, or the like are possible within the spirit and scope of the disclosed devices, methods and systems.
The present application claims priority to U.S. Provisional Patent Application No. 60/868,407, which was filed on Dec. 4, 2006, and U.S. Provisional Patent Application No. 60/960,545, which was filed on Oct. 3, 2007, the disclosures of which are incorporated herein by reference in their entireties. The co-pending application entitled “Image Processing for Person and Object Re-Identification” (application Ser. No. 11/987/777) that is being filed concurrently herewith, is additionally incorporated herein by reference in its entirety.
Number | Date | Country | |
---|---|---|---|
20080187220 A1 | Aug 2008 | US |
Number | Date | Country | |
---|---|---|---|
60868407 | Dec 2006 | US | |
60960545 | Oct 2007 | US |