The present disclosure relates to a system and method for determining a transformation matrix used to transform a first image into a second image, and for transforming the first image into the second image using the transformation matrix.
As the need for video surveillance systems grows, the need for more automated systems is becoming more apparent. These systems are configured to detect moving objects and to analyze the behavior thereof. In order to optimize these systems, it is important for the system to be able to geospatially locate objects in relation to one another and in relation to the space being monitored by the camera.
One proposed solution is to use a calibrated camera, which can provide for object detection and location. Manually calibrating such cameras, however, requires a large amount of time. Further, manual calibration of the camera is a complicated process and requires the use of a physical geometric pattern, such as a checkerboard, a lighting pattern, or a landmark reference. As video surveillance cameras are often placed in parking lots, large lobbies, or other wide spaces, the field of view (FOV) of the camera is often quite large, and the calibration objects, e.g. a checkerboard, are too small to calibrate the camera in such a large FOV. Thus, there is a need for video surveillance systems having cameras that are easier to calibrate and that improve object location.
This section provides background information related to the present disclosure which is not necessarily prior art.
A method for determining a transformation matrix used to transform data from a first image of a space to a second image of the space is disclosed. The method comprises receiving image data from a video camera monitoring the space, wherein the video camera generates image data of an object moving through the space and determining spatio-temporal locations of the object with respect to a field of view of the camera from the image data. The method further comprises determining observed attributes of motion of the object in relation to the field of view of the camera based on the spatio-temporal locations of the object, the observed attributes including at least one of a velocity of the object with respect to the field of view of the camera and an acceleration of the object with respect to the field of view of the camera. The method also includes determining the transformation matrix based on the observed attributes of the motion of the object.
This section provides a general summary of the disclosure, and is not a comprehensive disclosure of its full scope or all of its features.
Further areas of applicability will become apparent from the description provided herein. The description and specific examples in this summary are intended for purposes of illustration only and are not intended to limit the scope of the present disclosure.
The drawings described herein are for illustrative purposes only of selected embodiments and not all possible implementations, and are not intended to limit the scope of the present disclosure.
Corresponding reference numerals indicate corresponding parts throughout the several views of the drawings.
An automated video surveillance system is herein described. A video camera monitors a space, such as a lobby or a parking lot. The video camera produces image data corresponding to the space observed in the field of view (FOV) of the camera. The system is configured to detect an object observed moving through the FOV of the camera, hereinafter referred to as a "motion object." The image data is processed and the locations of the motion object with respect to the FOV are analyzed. Based on the locations of the motion object, observed motion data, such as the velocity and acceleration of the motion object with respect to the FOV, can be calculated and interpolated. It is envisioned that this is performed for a plurality of motion objects. Using the observed motion data, a transformation matrix can be determined so that an image of the space can be transformed into a second image. For example, the second image may be a birds-eye view of the space, i.e. from a perspective above and substantially parallel to the ground of the space. Once the image is transformed into the birds-eye view, the actual motion data of a motion object, e.g. the velocity and/or acceleration of the motion object with respect to the space, and the geospatial locations of objects in the space can be determined with greater precision.
The system can also be configured to be self-calibrating. For example, a computer-generated object, e.g. a 3D avatar, can be inserted into the first image and configured to "move" through the observed space. The image is then transformed. If the 3D avatar in the transformed image is approximately the same size as the 3D avatar in the first image, or the motion observed in the transformed image corresponds to the motion in the first image, then the elements of the transformation matrix are determined to be sufficient. If, however, the 3D avatar is much larger or much smaller, or the motion does not correspond to the motion observed in the first image, then the elements were incorrect and should be adjusted. The transformation matrix or other parameters are adjusted and the process is repeated.
Once the transformation matrix is properly adjusted, the camera is calibrated. This allows for more effective monitoring of a space. For example, once the space is transformed, the geospatial location of objects can be estimated more accurately. Further, the actual velocity and acceleration, that is with respect to the space, can be determined.
Referring now to
The generated trajectories ultimately may be used to determine the existence of abnormal behavior. In an aspect of this disclosure, however, the trajectories are communicated to a processing module 32. The processing module 32 receives the trajectories and can be configured to generate velocity maps, acceleration maps, and/or occurrence maps corresponding to the motion objects observed in the FOV of the camera. The processing module 32 can be further configured to interpolate additional motion data so that the generated maps are based on richer data sets. The processing module 32 is further configured to determine a transformation matrix to transform an image of the space observed in the FOV into a second image, such as a birds-eye view of the space. The processing module 32 uses the observed motion data with respect to the camera to generate the transformation matrix. The transformation matrix can be stored with the various metadata in the mining metadata data store 36. The mining metadata data store 36 stores various types of data including metadata, motion data, fused data, transformation matrices, 3D objects, and other types of data used by the recording module 20.
The calibration module 34 calibrates the transformation matrix, thereby optimizing the transformation from the first image to the second image. The calibration module 34 receives the transformation matrix from the processing module 32 or from storage, e.g. the mining metadata data store 36. The calibration module 34 receives the first image and embeds a computer-generated object into the image. Further, the calibration module 34 can be configured to track a trajectory of the computer-generated object. The calibration module 34 then transforms the image with the embedded computer-generated object. The calibration module 34 then evaluates the embedded computer-generated object in the transformed space, and the trajectory thereof if the computer-generated object was "moved" through the space. The calibration module 34 compares the transformed computer-generated object with the original computer-generated object and determines if the transformation matrix accurately transforms the first image into the second image. This is achieved by comparing the objects themselves and/or the motions of the objects. If the transformation matrix does not accurately transform the image, then the values of the transformation matrix are adjusted by the calibration module 34.
It is envisioned that the surveillance module 20 and its components can be embodied as computer readable instructions embedded in a computer readable medium, such as RAM, ROM, a CD-ROM, a hard disk drive or the like. Further, the instructions are executable by a processor associated with the video surveillance system. Further, some of the components or subcomponents of the surveillance module may be embodied as special purpose hardware.
Metadata generation module 28 receives image data and generates metadata corresponding to the image data. Examples of metadata can include but are not limited to: a motion object identifier, a bounding box around the motion object, the (x,y) coordinates of a particular point on the bounding box, e.g. the top left corner or center point, the height and width of the bounding box, and a frame number or time stamp.
As can be appreciated, each time a motion event has been detected, a time stamp or frame number can be used to temporally sequence the motion object. At each event, metadata may be generated for the particular frame or timestamp. Furthermore, the metadata for all of the frames or timestamps can be formatted into an ordered tuple. For example, the following may represent a series of motion events, where the tuple of metadata corresponding to a motion object is formatted according to: <t, x, y, h, w, obj_id>:
<t1, 5, 5, 4, 2, 1>, <t2, 4, 4, 4, 2, 1>, . . . <t5, 1, 1, 4, 2, 1>
As can be seen, the motion object having an id tag of 1, whose bounding box is four units tall and two units wide, moved from point (5,5) to point (1,1) in five samples. As can be seen, a motion object is defined by a set of spatio-temporal coordinates. It is also appreciated that any means of generating metadata from image data now known or later developed may be used by metadata generation module 28 to generate metadata.
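By way of illustration only, the metadata tuples described above can be represented and temporally sequenced as follows; the field names and the concrete time values are assumptions made for the example, not requirements of the disclosure.

```python
from typing import List, NamedTuple

class MotionMetadata(NamedTuple):
    """One metadata sample for a motion object: <t, x, y, h, w, obj_id>."""
    t: float      # frame number or time stamp
    x: float      # x coordinate of the reference point on the bounding box
    y: float      # y coordinate of the reference point on the bounding box
    h: float      # height of the bounding box
    w: float      # width of the bounding box
    obj_id: int   # motion object identifier

# The series of motion events from the example above, ordered by time stamp
# (numeric time stamps are assumed in place of t1, t2, ... t5).
samples: List[MotionMetadata] = [
    MotionMetadata(1, 5, 5, 4, 2, 1),
    MotionMetadata(2, 4, 4, 4, 2, 1),
    MotionMetadata(5, 1, 1, 4, 2, 1),
]
samples.sort(key=lambda m: m.t)  # temporal sequencing of the motion events
```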
Furthermore, the FOV can have a grid overlay divided into a plurality of cells.
Additionally, in an aspect of this disclosure, the metadata generation module 28 can be configured to record the spatio-temporal locations of the motion object with respect to a plurality of grids. As will be shown below, tracking the location of the motion object with respect to a plurality of grids allows the processing module 32 to perform more accurate interpolation of motion data.
The metadata generation module 28 can also be configured to remove outliers from the metadata. For example, if the metadata received for a particular time sample is inconsistent with the remaining metadata, then the metadata generation module 28 determines that the sample is an outlier and removes it from the metadata.
The metadata generation module 28 outputs the generated metadata to the metadata mining warehouse 36 and to a data mining module 30. The metadata generation module 28 also communicates the metadata to the transformation module 38, which transforms an image of the space and communicates the transformed image to a surveillance module 40.
The vector generation module 50 receives the metadata and determines the number of vectors to be generated. For example, if two objects are moving in a single scene, then two vectors may be generated. The vector generation module 50 can have a vector buffer that stores up to a predetermined number of trajectory vectors. Furthermore, the vector generation module 50 can allocate the appropriate amount of memory for each vector corresponding to a motion object, as the number of entries in the vector will equal the number of frames or time-stamped frames in which the motion object was detected. In the event vector generation is performed in real time, the vector generation module 50 can allocate additional memory for the new points in the trajectory as the new metadata is received. The vector generation module 50 also inserts the position data and time data into the trajectory vector. The position data is determined from the metadata. The position data can be listed in actual (x,y) coordinates or by identifying the cell in which the motion object was observed.
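A minimal sketch of this grouping, assuming metadata samples in the tuple form illustrated earlier, is provided below; the function name and the trajectory representation are illustrative only.

```python
from collections import defaultdict

def build_trajectory_vectors(samples):
    """Group metadata samples by motion object identifier and order each
    trajectory vector by time, one entry per frame in which the object
    was detected."""
    trajectories = defaultdict(list)
    for m in samples:
        # The position data may be the raw (x, y) coordinates or,
        # alternatively, a cell identifier.
        trajectories[m.obj_id].append((m.t, m.x, m.y))
    for obj_id in trajectories:
        trajectories[obj_id].sort(key=lambda entry: entry[0])
    return dict(trajectories)
```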
The outlier detection module 52 receives the trajectory vector and reads the values of the motion object at the various time samplings. An outlier is a data sample that is inconsistent with the remainder of the data set. For example, if a motion object is detected at the top left corner of the FOV in samples t1 and t3, but is located in the bottom right corner in sample t2, then the outlier detection module 52 can determine that the sample for time t2 is an outlier. Further, as will be discussed below, if an outlier is detected, the position of the motion object may be interpolated based on the other data samples. It is envisioned that any means of outlier detection, now known or later developed, can be implemented by the outlier detection module 52.
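For illustration, one simple means of outlier detection is sketched below; the distance threshold and the midpoint test are assumptions and not the required technique.

```python
import math

def remove_outliers(trajectory, max_jump=200.0):
    """Drop samples that are inconsistent with their temporal neighbors,
    e.g. a point that jumps across the FOV between samples t1 and t3."""
    cleaned = []
    for i, (t, x, y) in enumerate(trajectory):
        prev = trajectory[i - 1] if i > 0 else None
        nxt = trajectory[i + 1] if i < len(trajectory) - 1 else None
        if prev is not None and nxt is not None:
            # Distance from the midpoint of the two neighboring samples.
            mx, my = (prev[1] + nxt[1]) / 2.0, (prev[2] + nxt[2]) / 2.0
            if math.hypot(x - mx, y - my) > max_jump:
                continue  # treat as an outlier; it may later be interpolated
        cleaned.append((t, x, y))
    return cleaned
```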
The velocity calculation module 54 calculates the velocity of the motion object at the various time samples. It is appreciated that the velocity at each time sample will have two components, a direction and a magnitude. The magnitude relates to the speed of the motion object. The magnitude of the velocity vector, or speed of the motion object, can be calculated for the trajectory at tcurr by:
Alternatively, the magnitude of the velocity vector may be represented in its individual components, that is:
It is further appreciated that if data cell representation is used, i.e. the position of the motion object is defined by the data cell in which it is found, a predetermined (x,y) value that corresponds to the data cell or a cell identifier can be substituted for the actual location. Further, if multiple grids are implemented, then the positions and velocities of the motion object can be represented with respect to the multiple grids, i.e. separate representations for each grid. It is appreciated that the calculated velocity will be relative to the FOV of the camera, e.g. pixels per second. Thus, objects further away will appear slower than objects closer to the camera, despite the fact that the two objects may be traveling at the same or similar speeds. It is further envisioned that other means of calculating the relative velocity may be implemented.
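One such means is sketched below using standard finite differences over consecutive samples; the function is illustrative only, assumes strictly increasing time stamps, and does not necessarily reproduce the formulas referenced above.

```python
import math

def relative_velocity(trajectory):
    """Estimate the velocity at each sample relative to the FOV
    (e.g. pixels per second), returning both the per-axis components
    and the magnitude (speed) of the velocity vector."""
    velocities = []
    for (t0, x0, y0), (t1, x1, y1) in zip(trajectory, trajectory[1:]):
        dt = t1 - t0
        vx, vy = (x1 - x0) / dt, (y1 - y0) / dt  # individual components
        speed = math.hypot(vx, vy)               # magnitude of the velocity vector
        velocities.append((t1, vx, vy, speed))
    return velocities
```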
The direction of the velocity vector can be represented relative to its direction in a data cell by dividing each data cell into predetermined sub cells, e.g. 8 octants.
The acceleration calculation module 56 operates in substantially the same manner as the velocity calculation module 54. Instead of the position values, the magnitude of the velocity vectors at the various time samples may be used. Thus, the acceleration may be calculated by:
Alternatively, the magnitude of the acceleration vector may be represented in its individual components, that is:
With respect to the direction, the direction of the acceleration vector may be in the same direction as the velocity vector. It is understood, however, that if the motion object is decelerating or turning, then the direction of the acceleration vector will be different than that of the velocity vector.
The data mining module 30 can be further configured to generate data cubes for each cell. A data cube is a multidimensional array where each element in the array corresponds to a different time. An entry in the data cube may comprise motion data observed in the particular cell at a corresponding time. Thus, in the data cube of a cell, the velocities and accelerations of various motion objects observed over time may be recorded. Further, the data cube may contain expected attributes of motion objects, such as the size of the minimum bounding box.
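For purposes of illustration only, one possible in-memory representation of such a data cube is sketched below; the dictionary layout and field names are assumptions rather than features of the disclosure.

```python
from collections import defaultdict

# data_cube[(column, row)][time] -> list of motion data observed in that cell at that time
data_cube = defaultdict(dict)

def record_observation(cell, t, obj_id, velocity, acceleration, bbox_hw):
    """Store the motion data observed in a particular cell at a given time,
    along with expected attributes such as the minimum bounding box size."""
    data_cube[cell].setdefault(t, []).append({
        "obj_id": obj_id,
        "velocity": velocity,          # (vx, vy) relative to the FOV
        "acceleration": acceleration,  # (ax, ay) relative to the FOV
        "bounding_box": bbox_hw,       # (height, width)
    })
```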
Once the trajectory vector of a motion object is generated, the vector may be stored in the metadata mining warehouse 36.
The processing module 32 is configured to determine a transformation matrix to transform an image of the observed space into a second image.
A first data interpolation module 70 is configured to receive a trajectory vector from the data mining module 30 or from the mining metadata data store 36 and to interpolate data for cells having incomplete motion data associated therewith. The interpolated motion data, once determined, is included in the observed motion data for the trajectory.
A data fusion module 72 is configured to receive the observed motion data, including interpolated motion data, and to combine the motion data of a plurality of observed trajectories. The output of the data fusion module 72 may include, but is not limited to, at least one velocity map, at least one acceleration map, and at least one occurrence map, wherein the various maps are defined with respect to the grid by which the motion data is defined.
A transformation module 74 receives the fused data and determines a transformation matrix based thereon. In some embodiments, the transformation module 74 relies on certain assumptions, such as a constant velocity of a motion object with respect to the space, to determine the transformation matrix. The transformation matrix can be used by the surveillance system to "rotate" the view of the space to a second view, e.g. a birds-eye view. The transformation module 74 may be further configured to actually transform an image of the space into a second image. While the first image is referred to as being transformed or rotated, it is appreciated that the transformation can be performed to track motion objects in the transformed space. Thus, when the motion of an object is tracked, it may be tracked in the transformed space instead of the observed space.
The first data interpolation module 70 can be configured to interpolate data for cells having incomplete data.
As can be appreciated, each motion event can correspond to a change from one frame to a second frame. Thus, when motion data is sampled, the observed trajectory is likely composed of samples taken at various points in time. Accordingly, certain cells through which the motion object passed may not have data associated with them because no sample was taken at the time the motion object was passing through the particular cell. For example, the data in FOV 402 includes velocity vectors in boxes (0,0), (2,2), and (3,3). To get from box (0,0) to (2,2), however, the trajectory must have passed through column 1. The first data interpolation module 70 is configured to determine which cell to interpolate data for, as well as the magnitude of the vector. It is envisioned that the interpolation performed by the first data interpolation module 70 can be performed by averaging the data from the first preceding cell and the first following cell to determine the data for the cell having the incomplete data. In alternative embodiments, other statistical techniques, such as performing a linear regression on the motion data of the trajectory, can be used to determine the data of the cell having the incomplete data.
The first data interpolation module 70 can be configured to interpolate data using one grid or multiple grids. It is envisioned that other techniques for data interpolation may be used as well.
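For purposes of illustration, the averaging approach described above can be sketched as follows; representing the trajectory as an ordered list of (cell, speed) pairs, with None for cells lacking data, is an assumption made for the example.

```python
def interpolate_missing_cells(cell_speeds):
    """Fill cells with no observed data by averaging the first preceding
    and first following cells along the trajectory that do have data."""
    filled = list(cell_speeds)
    for i, (cell, speed) in enumerate(filled):
        if speed is not None:
            continue
        prev = next((s for _, s in reversed(filled[:i]) if s is not None), None)
        nxt = next((s for _, s in filled[i + 1:] if s is not None), None)
        if prev is not None and nxt is not None:
            filled[i] = (cell, (prev + nxt) / 2.0)  # average of neighbors
    return filled
```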
Once the first data interpolation module 70 has interpolated the data, the data fusion module 72 can fuse the data from multiple motion objects. The data fusion module 72 can retrieve the motion data from multiple trajectories from the metadata mining data store 36 or from another source, such as the first data interpolation module 70 or a memory buffer associated thereto. In some embodiments, the data fusion module 72 generates a velocity map indicating the velocities observed in each cell. Similarly, an acceleration map can be generated. Finally, an occurrence map indicating an amount of motion objects observed in a particular cell can be generated. Furthermore, the data fusion module 72 may generate velocity maps, acceleration maps, and/or occurrence maps for each grid. It is appreciated that each map can be configured as a data structure having an entry for each cell, and each entry has a list, array, or other means of indicating the motion data for each cell. For example, a velocity map for a 4×4 grid can consist of a data structure having 16 entries, each entry corresponding to a particular cell. Each entry may be comprised of a list of velocity vectors. Further, the velocity vectors may be broken down into the x and y components of the vector using simple trigonometric equations.
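By way of a non-limiting sketch, a velocity map and the corresponding occurrence map can be represented as follows, assuming each velocity sample has already been assigned to a (column, row) grid cell.

```python
from collections import defaultdict

def build_velocity_map(observations):
    """Build a velocity map: one entry per cell, each entry holding the list
    of (vx, vy) velocity vectors observed in that cell across all trajectories.
    `observations` is an iterable of (cell, vx, vy) tuples."""
    velocity_map = defaultdict(list)
    for cell, vx, vy in observations:
        velocity_map[cell].append((vx, vy))
    return velocity_map

def build_occurrence_map(velocity_map):
    """Count the number of velocity vectors observed in each cell."""
    return {cell: len(vectors) for cell, vectors in velocity_map.items()}
```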
Further, the data fusion module 72 can be configured to calculate a dominant flow direction for each cell. For each cell, the data fusion module can examine the velocity vectors associated therewith and determine a general flow associated with the cell. This can be achieved by counting the number of velocity vectors in each direction for a particular cell. As described earlier, the directions of vectors can be approximated by dividing a cell into a set of octants, as shown previously in
Once the dominant flow direction is determined, the data fusion module 72 removes all of the vectors not in the dominant flow direction of a cell from the velocity map.
vx=vm*sin(α); (5)
vy=vm*cos(α); (6)
where vm is the magnitude of the dominant flow direction velocity vector and α is the angle of the direction vector.
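The following sketch illustrates one way of selecting the dominant flow direction by octant counting and then decomposing the resulting vector per equations (5) and (6); it assumes, for the example only, that the angle α is measured from the vertical axis and that the average magnitude is used.

```python
import math
from collections import Counter

def dominant_flow_vector(vectors):
    """Select the dominant flow direction of a cell by counting the velocity
    vectors falling in each of 8 octants, then return the (vx, vy) components
    of the dominant flow direction velocity vector per equations (5) and (6)."""
    if not vectors:
        return 0.0, 0.0
    def octant(vx, vy):
        angle = math.atan2(vx, vy) % (2 * math.pi)  # angle measured from the y-axis
        return int(angle // (math.pi / 4))          # one of 8 octants
    counts = Counter(octant(vx, vy) for vx, vy in vectors)
    dominant, _ = counts.most_common(1)[0]
    kept = [(vx, vy) for vx, vy in vectors if octant(vx, vy) == dominant]
    vm = sum(math.hypot(vx, vy) for vx, vy in kept) / len(kept)  # average magnitude
    alpha = (dominant + 0.5) * (math.pi / 4)  # center angle of the dominant octant
    return vm * math.sin(alpha), vm * math.cos(alpha)  # equations (5) and (6)
```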
Further, it is appreciated that in some embodiments, if a large number of cells are used, e.g. a 16×16 grid having 256 cells, the data fusion module 72 may merge the cells into larger cells, e.g. a 4×4 grid having 16 cells. It is appreciated that the smaller cells can be simply inserted into the larger cells and treated as a single cell within the larger cell.
The data fusion module 72 then retrieves trajectory data for a particular time period from the mining metadata data store 36, as depicted at step 1204. It is appreciated that the system can be configured to analyze trajectories only occurring during a given period of time. Thus, the data fusion module 72 may generate a plurality of velocity maps, each map corresponding to a different time period, the different time periods hereinafter referred to as “slices.” Each map can be identified by its slice, i.e. the time period corresponding to the map.
Once the trajectory data is retrieved, the data fusion module 72 can insert the velocity vectors into the cells of the velocity map, which corresponds to step 1206. Further, if the data fusion module 72 is configured to merge data cells, this may be performed at step 1206 as well. This can be done by mapping the cells used to define the trajectory data to the larger cells of the map, as shown by the example of
After the data has been inserted into cells of the velocity map, the data fusion module 72 can determine the dominant flow direction of each cell, as shown at step 1208. The data fusion module 72 will analyze each velocity vector in a cell and keep a count for each direction in the cell. The direction having the most velocity vectors corresponding thereto is determined to be the dominant flow direction of the cell.
Once the dominant flow direction is determined, the dominant flow direction velocity vector can be calculated for each cell, as shown in step 1210. As mentioned, this step can be achieved in many ways. For example, an average magnitude of the velocity vectors that are directed in the dominant flow direction can be calculated. Alternatively, the median magnitude can be used, or the largest or smallest magnitude can be used as the magnitude of the dominant flow direction velocity vector. Furthermore, the dominant flow direction velocity vector may be broken down into its component vectors, such that it is represented by a vector in the x-direction and a vector in the y-direction, as depicted at step 1212. It is appreciated that the sum of the two vectors equals the dominant flow direction velocity vector, both in direction and magnitude.
The foregoing method is one example of data fusion. It is envisioned that the steps recited are not required to be performed in the given order and may be performed in other orders. Additionally, some of the steps may be performed concurrently. Furthermore, not all of the steps are required and additional steps may be performed. While the foregoing was described with respect to generating a velocity map, it is understood the method can be used to determine an acceleration map as well.
The data fusion module 72 can be further configured to generate an occurrence map. As shown at step 1208, when the directions are being counted, a separate count may be kept for the total number of vectors observed in each cell. Thus, each cell may have a total number of occurrences further associated therewith, which can be used as the occurrence map.
Once the data for a particular cell is merged, the data for the particular cell can be represented by the following <cn, rn, vxcn,rn, vycn,rn, sn>, where cn is the column number of the cell, rn is the row number of the cell, vxcn,rn is the x component of the dominant flow direction velocity vector of the cell, vycn,rn is the y component of the dominant flow direction velocity vector of the cell, and sn is the slice number. As discussed above, the slice number corresponds to the time period for which the trajectory vectors were retrieved. Furthermore, additional data that may be included is the x and y components of the dominant flow direction acceleration vector and the number of occurrences in the cell. For example, the fused data for a particular cell can be further represented by <cn, rn, vxcn,rn, vycn,rn, axcn,rn, aycn,rn, on, sn>.
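For illustration only, the fused record for a cell can be represented by a simple structure mirroring the tuple above; the field names are assumptions.

```python
from typing import NamedTuple

class FusedCell(NamedTuple):
    """Fused data for one cell: <cn, rn, vx, vy, ax, ay, on, sn>."""
    cn: int     # column number of the cell
    rn: int     # row number of the cell
    vx: float   # x component of the dominant flow direction velocity vector
    vy: float   # y component of the dominant flow direction velocity vector
    ax: float   # x component of the dominant flow direction acceleration vector
    ay: float   # y component of the dominant flow direction acceleration vector
    on: int     # number of occurrences observed in the cell
    sn: int     # slice number, i.e. the time period of the retrieved trajectories
```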
The data fusion module 72 can be further configured to determine four sets of coefficients for each cell, whereby each cell has four coefficients corresponding to the corners of the cell. The data fusion module 72 uses the dominant flow direction velocity vector for a cell to generate the coefficients for that particular cell.
X1=vx0,0
X2=vx1,0
X3=vx2,0
X4=vx3,0
Y1=vy0,0
Y2=vy1,0
Y3=vy2,0
Y4=vy3,0
X5=vx0,1
X6=vx1,1
X7=vx2,1
X8=vx3,1
Y5=vy0,1
Y6=vy1,1
Y7=vy2,1
Y8=vy3,1
X9=vx0,2
X10=vx1,2
X11=vx2,2
X12=vx3,2
Y9=vy0,2
Y10=vy1,2
Y11=vy2,2
Y12=vy3,2
X13=vx0,3
X14=vx1,3
X15=vx2,3
X16=vx3,3
Y13=vy0,3
Y14=vy1,3
Y15=vy2,3
Y16=vy3,3
It is appreciated that vxa,b is the absolute value of the x component of the dominant flow direction velocity vector in the ath column and the bth row and vya,b is the absolute value of the y component of the dominant flow direction velocity vector in the ath column and the bth row. It is understood that the first column is column 0 and the top row is row 0. Further, it is appreciated that the foregoing is an example and that the described framework can be used with grids of various dimensions.
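A minimal sketch of this enumeration for an arbitrary grid is shown below; the dictionary keyed by (column, row) is an assumed representation of the dominant flow direction velocity vectors.

```python
def corner_coordinates(dominant_vectors, columns=4, rows=4):
    """Enumerate the coordinates X1..Xn and Y1..Yn for a columns-by-rows grid,
    where dominant_vectors[(a, b)] holds the dominant flow direction velocity
    vector (vx, vy) of the cell in column a, row b (column 0 and row 0 at the
    top left of the grid)."""
    xs, ys = [], []
    for b in range(rows):          # rows, top to bottom
        for a in range(columns):   # columns, left to right
            vx, vy = dominant_vectors[(a, b)]
            xs.append(abs(vx))     # absolute value of the x component
            ys.append(abs(vy))     # absolute value of the y component
    return xs, ys
```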
Once the data fusion module 72 has generated the coordinates corresponding to the dominant flow direction of each cell, the transformation module 74 will determine a transformation matrix for each cell. The transformation matrix for a cell is used to transform an image of the observed space, i.e. the image corresponding to the space observed in the FOV of the camera, to a second image, corresponding to a different perspective of the space.
The data fusion module 72 may be further configured to determine actual motion data of the motion objects. That is, from the observed motion data the data fusion module 72 can determine the actual velocity or acceleration of the object with respect to the space. Further, the data fusion module 72 may be configured to determine an angle of the camera, e.g. pan and/or tilt of the camera, based on the observed motion data and/or the actual motion data.
In the present embodiment, the transformation module 74 receives the fused data, including the dominant flow direction of the cell, and the coordinates corresponding to the dominant flow direction velocities of the cell. The transformation module 74 then calculates theoretical coordinates for the cell. The theoretical coordinates for a cell are based on an assumed velocity of an average motion object and the dominant flow direction of the cell. For example, if the camera is monitoring a sidewalk, the assumed velocity will correspond to the velocity of the average walker, e.g. 1.8 m/s. If the camera is monitoring a parking lot, the assumed velocity can correspond to an average velocity of a vehicle in a parking lot situation, e.g. 15 mph (approximately 6.7 m/s).
It is appreciated that the average velocity can be hard coded or can be adaptively adjusted throughout the use of the surveillance system. Furthermore, object detection techniques can be implemented to ensure that the trajectories used for calibration all correspond to the same object type.
The transformation module 74 will use the dominant flow direction of the cell and the assumed velocity, va, to determine the absolute values of the x component and y component of the assumed velocity. Assuming the angle of the dominant flow direction, α, is taken with respect to the x axis, then the x and y components can be solved using the following:
vx′=va*sin(α); (7)
vy′=va*cos(α); (8)
where va is the assumed velocity and α is the angle of the dominant flow direction of the cell with respect to the x-axis (or any horizontal axis). Once the component vectors are calculated, the theoretical coordinates can be inserted into a matrix B′ such that:
Also, the calculated coordinates of the cell, i.e. the coordinates of the cell that were based upon the dominant flow direction velocity vector of the cell, may be inserted into a matrix B such that:
Using the two matrices, B and B′, the transformation matrix for the cell, A can be solved for. It is appreciated that the transformation matrix A can be defined as follows:
The values of A can be solved for using the following:
where xi′ and yi′ are the coordinate values at the ith element of B′ and xi and yi are the ith element of B, and where i=[1, 2, 3, 4] such that 1 is the top left element, 2 is the top right element, 3 is the bottom right element, and 4 is the bottom left element. A system of equations may be utilized to solve for the elements of the transformation matrix A.
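For illustration, and assuming for the purpose of the sketch that A is a 3×3 perspective transformation whose last element is normalized to 1 (the exact form of A being given elsewhere in the disclosure), the system of equations for the four correspondences can be solved as follows.

```python
import numpy as np

def solve_cell_transform(B, B_prime):
    """Solve for a 3x3 transformation matrix A mapping the four calculated
    coordinates in B to the four theoretical coordinates in B_prime. B and
    B_prime are lists of four (x, y) pairs ordered top left, top right,
    bottom right, bottom left."""
    M, rhs = [], []
    for (x, y), (xp, yp) in zip(B, B_prime):
        # Two linear equations per correspondence, derived from
        #   xp = (a00*x + a01*y + a02) / (a20*x + a21*y + 1)
        #   yp = (a10*x + a11*y + a12) / (a20*x + a21*y + 1)
        M.append([x, y, 1, 0, 0, 0, -xp * x, -xp * y])
        M.append([0, 0, 0, x, y, 1, -yp * x, -yp * y])
        rhs.extend([xp, yp])
    a = np.linalg.solve(np.asarray(M, dtype=float), np.asarray(rhs, dtype=float))
    return np.append(a, 1.0).reshape(3, 3)  # elements of the transformation matrix A
```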
The transformation module 74 performs the foregoing for each cell using the fused data for each particular cell to determine the transformation matrix of that cell. Thus, in the example where a 4×4 grid is used, then 16 individual transformation matrices will be generated. Further, the transformation module 74 can store the transformation matrices corresponding to each cell in the mining metadata data store 36.
In an alternative embodiment, the transformation module 74 determines a single transformation matrix to transform the entire image. In the alternative embodiment, the transformation module 74 receives the dominant flow direction velocity vectors and/or acceleration vectors, and the occurrence map.
The transformation module 74 will then determine the n cells having the greatest number of occurrences from the occurrence map, as shown at step 1404. The occurrence map is received with the fused motion data. As will become apparent, n should be greater than or equal to 6. As will also be appreciated, the larger n is, the more accurate the transformation matrix will be, but at the cost of additional computational resources. For each of the n cells, the transformation module 74 will retrieve the x and y component vectors of the dominant flow direction acceleration vector for the particular cell, as shown at step 1406.
Using the camera parameters and the component vectors for the n cells, the transformation module 74 will define the transformation equation as follows:
where λ is initially set to 1 and where X, Y, and Z are set to 0. X, Y, and Z correspond to the actual accelerations of the motion objects with respect to the space. It is assumed that motion objects having constant velocities are used to calibrate the camera; thus, the actual accelerations with respect to the space will be 0. As can be appreciated, the observed accelerations are with respect to the FOV of the camera and may have values other than 0. Further, where there are k samples of velocities, there will be k−1 samples of acceleration.
Using a statistical regression, the values of
can be estimated, using the acceleration component vectors of the dominant flow direction accelerations of the n cells as input. It is appreciated that a linear regression can be used, as well as other statistical regression and estimation techniques, such as a least squares regression. The result of the regression is the transformation matrix, which can be used to transform an image of the observed space into a second image, or can be used to transform an object observed in one space into the second space.
The transformation module 74 can be further configured to determine whether enough data was received for a particular region of the FOV. For example, if the regression performed on equation 14 does not produce converging results, the transformation module 74 determines that additional data is needed. Similarly, if the results from equation 14 for the different cells are inconsistent, then the transformation module 74 may determine that additional data is needed for the cells. In this instance, the transformation module 74 will initiate a second data interpolation module 74.
The second data interpolation module 74 receives the velocity map that produced the non-conforming transformation matrices and is configured to increase the amount of data for a cell. This is achieved by increasing the resolution of the grids and/or by adding data from other slices. For example, referring to
While the transformation matrices in either embodiment described above are generated by making assumptions about the motion attributes of the motion objects, actual velocities and/or accelerations of motion objects can also be used to determine the transformation matrices. This data can either be determined in a training phase or may be determined by the data fusion module 72.
Once the processing module 32 has determined a transformation matrix, the transformation matrix can be calibrated by the calibration module 34. The calibration module 34 as shown in
The emulation module 152 is configured to generate a 3D object referred to as an avatar 156. The avatar 156 can be generated in advance and retrieved from a computer readable medium or may be generated in real time. The avatar 156 can have a known size and bounding box size. The avatar 156 is inserted into the image of the space at a predetermined location in the image. The image, or merely the avatar 156, is converted using the transformation matrix determined by the processing module 32. According to one embodiment, the transformation matrix for a particular cell is:
In these embodiments, the avatar 156 should be placed in a single cell per calibration iteration. Each pixel in the cell in which the avatar 156 is located can be translated by calculating the following:
X=(x*C00+y*C01+C02)/(x*C20+y*C21+C22)
Y=(x*C10+y*C11+C12)/(x*C20+y*C21+C22)
where x and y are the coordinates on the first image to be translated and where X and Y are the coordinates of the translated pixels. It is appreciated that this may be performed for some or all of the pixels in the cell. Also for calibration purposes, each cell should be calibrated using its corresponding transformation matrix.
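A minimal sketch of this per-pixel translation, assuming C is a 3×3 matrix indexed by [row][column], is provided below.

```python
def transform_pixel(x, y, C):
    """Translate a pixel (x, y) of the first image into coordinates (X, Y)
    of the transformed image using the cell's transformation matrix C,
    per the equations above."""
    denom = x * C[2][0] + y * C[2][1] + C[2][2]
    X = (x * C[0][0] + y * C[0][1] + C[0][2]) / denom
    Y = (x * C[1][0] + y * C[1][1] + C[1][2]) / denom
    return X, Y
```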
In other embodiments, the transformation matrix is defined as:
In these embodiments, the transformation can be performed using the following:
where x and y are the coordinates of a pixel to be transformed and X and Y are the coordinates of the transformed pixel. It is appreciated that the pixels are transformed by solving for X and Y.
Once the avatar 156 is transformed, the location of the transformed avatar 156 is communicated to the evaluation and adaptation module 154. The evaluation and adaptation module 154 receives the location of the originally placed avatar 156 with respect to the original space and the location of the transformed avatar 156 with respect to the transformed space. After transformation, the bounding box of the avatar 156 should remain the same size. Thus, the evaluation and adaptation module 154 will compare the bounding boxes of the original avatar 156 and the transformed avatar 156. If the transformed avatar 156 is smaller than the original avatar 156, then the evaluation and adaptation module 154 multiplies the transformation matrix by a scalar greater than 1. If the transformed avatar 156 is larger than the original avatar 156, then the evaluation and adaptation module 154 multiplies the transformation matrix by a scalar less than 1. If the two avatars 156 are substantially the same size, e.g. within 5% of one another, then the transformation matrix is deemed calibrated. It is appreciated that the emulation module 152 will receive the scaled transformation matrix and perform the transformation again. The emulation module 152 and the evaluation and adaptation module 154 may iteratively calibrate the matrix or matrices according to the process described above. Once the transformation matrix is calibrated, it may be stored in the mined metadata data store 36 or may be communicated to the image and object transformation module 38.
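For purposes of illustration only, the iterative scaling performed by the emulation module 152 and the evaluation and adaptation module 154 can be sketched as follows; the 5% scaling step, the comparison of bounding box areas, and the callback names are assumptions rather than features of the disclosure.

```python
def calibrate(transform_matrix, original_area, transform_avatar,
              tolerance=0.05, max_iterations=100):
    """Iteratively scale the transformation matrix until the transformed
    avatar's bounding box area is within `tolerance` of the original
    avatar's bounding box area."""
    for _ in range(max_iterations):
        ratio = transform_avatar(transform_matrix) / original_area
        if abs(ratio - 1.0) <= tolerance:
            break  # the matrix is deemed calibrated
        # Transformed avatar smaller than the original -> scale up; larger -> scale down.
        scalar = 1.05 if ratio < 1.0 else 0.95
        transform_matrix = [[c * scalar for c in row] for row in transform_matrix]
    return transform_matrix
```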
Referring back to
When calibrating a camera, it may be useful to have one or more motion objects, e.g. a person or vehicle, move through the space observed in the FOV of the camera at a constant velocity. While it is not necessary, the motion data resulting from a constant velocity motion object may result in more accurate transformation matrices.
It is appreciated that by performing a transformation when monitoring a space, the observations by the surveillance module 40 will be greatly improved. For example, an actual velocity and acceleration of a motion object can be determined instead of a velocity or acceleration with respect to the field of view of the camera. Further, the geospatial locations of objects, stationary or moving, can be determined as well instead of the objects' locations with respect to the field of view of the camera.
The foregoing description of the embodiments has been provided for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention. Individual elements or features of a particular embodiment are generally not limited to that particular embodiment, but, where applicable, are interchangeable and can be used in a selected embodiment, even if not specifically shown or described. The same may also be varied in many ways. Such variations are not to be regarded as a departure from the invention, and all such modifications are intended to be included within the scope of the invention.