The present invention relates to methods and systems for automated detection and prediction of the progression of behavior and threat patterns in a real-time, multi-sensor environment.
The statements in this section merely provide background information related to the present disclosure and may not constitute prior art.
The recent trend in video surveillance systems is to provide video analysis components that can detect potential threats from live-streamed video surveillance data. The detection of potential threats assists a security operator, who monitors the live feeds from many cameras, in detecting actual threats.
Conventional surveillance systems detect potential threats based on predefined patterns. To operate, each camera requires an operator to manually configure abnormal behavior detection features. When a predetermined abnormal pattern is detected, the system generates an alarm. Such configuration often requires substantial effort in adjusting the sensitivity of the multiple detection rules defined to detect specific abnormal patterns, such as speeding, movement against the flow, or abnormal flow.
Such systems are inefficient in their operation. For example, properly configuring each camera is time consuming, requires professional help, and increases deployment costs. In addition, defining and configuring every possible abnormal behavior is not realistic, because there are simply too many abnormal patterns to enumerate, to study, and to develop satisfactory solutions for in all possible contexts.
Accordingly, a surveillance system is provided. The surveillance system generally includes a data capture module that collects sensor data. A scoring engine module receives the sensor data and computes at least one of an abnormality score and a normalcy score based on the sensor data, at least one dynamically loaded learned data model, and a learned scoring method. A decision making module receives the at least one of the abnormality score and the normalcy score and generates an alert message based on the at least one of the abnormality score and the normalcy score and a learned decision making method to produce progressive behavior and threat detection.
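By way of a non-limiting illustration only, the following sketch shows how such a pipeline of modules might be wired together in software. The class names, signatures, threshold value, and trivial scoring method below are hypothetical and do not represent the claimed implementation.

```python
# Illustrative sketch only; class names, signatures, and the trivial scoring
# method are hypothetical and do not represent the claimed implementation.
class ScoringEngine:
    def __init__(self, model, scoring_method):
        self.model = model                    # dynamically loaded learned data model
        self.scoring_method = scoring_method  # learned scoring method

    def score(self, sensor_data):
        # Produces an abnormality score in [0, 1] from the sensor data.
        return self.scoring_method(self.model, sensor_data)


class DecisionMaker:
    def __init__(self, decision_method, threshold=0.8):
        self.decision_method = decision_method  # learned decision making method
        self.threshold = threshold              # hypothetical alert threshold

    def decide(self, score):
        alert_level = self.decision_method(score)
        return {"alert": alert_level >= self.threshold, "level": alert_level}


# Example wiring: a placeholder model and a trivial speed-based scoring method.
engine = ScoringEngine(model=None, scoring_method=lambda m, d: d["speed"] / 100.0)
decider = DecisionMaker(decision_method=lambda s: s)
print(decider.decide(engine.score({"speed": 95})))  # {'alert': True, 'level': 0.95}
```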
Further areas of applicability will become apparent from the description provided herein. It should be understood that the description and specific examples are intended for purposes of illustration only and are not intended to limit the scope of the present disclosure.
The drawings described herein are for illustration purposes only and are not intended to limit the scope of the present teachings in any way.
The following description is merely exemplary in nature and is not intended to limit the present teachings, their application, or uses. It should be understood that throughout the drawings, corresponding reference numerals indicate like or corresponding parts and features. As used herein, the term module or sub-module can refer to a processor (shared, dedicated, or group) and memory that executes one or more software or firmware programs, and/or other suitable components that provide the described functionality, and/or combinations thereof.
Referring now to
In various aspects of the present teachings, a single surveillance module 16 can be implemented and located remotely from each sensory device 12a-12n as shown in
Referring now to
The image capture module 22 collects the sensor data 14a-14n as image data corresponding to a scene, and the video analysis module 80 processes the image data to extract object metadata 30 from the scene. The scoring engine module 24 receives the object metadata 30 and produces a measure of abnormality or normality, also referred to as a score 34, based on the learned models 32.
The decision making module 26 collects the scores 34 and determines an alert level for the object data 30. The decision making module 26 sends an alert message 36n that includes the alert level to external components for further processing. The decision making module 26 can exchange scores 34 and object data 30 with other decision making modules 26 of other cameras 20a, 20b to generate predictions about objects in motion. The device configuration module 28 loads and manages various models 32, scoring engine methods 52, decision making methods 50, and/or decision making parameters 51 that can be associated with the camera 20n.
The surveillance system 10 can also include an alarm handling module 38, a surveillance graphical user interface (GUI) 40, a system configuration module 42, a learning module 44, and a model builder module 46. As shown, such components can be located remotely from the cameras 20a-20n. The alarm handling module 38 re-evaluates the alert messages 36a-36n from the cameras 20a-20n and dispatches the alarm messages 18. The alarm handling module 38 interacts with the user via the surveillance GUI 40 to dispatch the alarm messages 18 and/or to collect misclassification data 48 during alarm acknowledgement operations.
The learning module 44 adapts the decision making methods 50 and parameters 51, and/or the scoring engine methods 52, for each camera 20a-20n by using the misclassification data 48 collected from the user. As will be discussed further, the decision making methods 50 are automatically learned and optimized for each scoring method 52 to support the prediction of potential incidents, increase detection accuracy, and reduce the number of false alarms. The decision making methods 50 fuse the scores 34, as well as previous scoring results, object history data, etc., to reach a final alert decision.
The model builder module 46 builds models 32 representing normal and/or abnormal conditions based on the collected object data 30. The system configuration module 42 manages the models 32, the decision making methods 50 and parameters 51, and the scoring engine methods 52 for the cameras 20a-20n and uploads the methods and data 32, 50, 51, 52 to the appropriate cameras 20a-20n.
Referring now to
The model initialization module 60 captures the domain knowledge from users and provides the initial configuration of system components (e.g., optimized models, optimized scoring functions, optimized decision making functions, etc.). In particular, the model initialization module 60 builds initial models 32 for each camera 20a-20n (
The model initialization module 60 builds the optimized models 32 from predefined model builder methods stored in the model methods datastore 68. In various aspects of the present teachings, the model initialization module 60 builds the optimal configuration according to a model builder method that selects particular decision making methods 50 (
In various aspects of the present teachings, the model initialization GUI 62 can provide an option to the user to insert a predefined object into the displayed scene. The model initialization module 60 then simulates the predefined object along the trajectory path for verification purposes. If the user is satisfied with the trajectory paths, the model 32 is stored in the model data datastore 70. Otherwise, the user can iteratively adjust the trajectory parameters and thus, the models 32 until the user is satisfied with the simulation.
Thereafter, the model learn module 64 can automatically adapt the models 32 for each camera 20a-20n (
As can be appreciated, various model building methods can be stored to the model methods datastore 68 to allow the model builder module 46 to build a number of models 32 for each object based on a model type. For example, the various models can include, but are not limited to, a velocity model, an acceleration model, an occurrence model, an entry/exit zones model, a directional speed profile model, and a trajectory model. These models can be built for all observed objects as well as different types of objects. As shown in
In various aspects of the present teachings, the occurrence model describes the object detection probabilities in space and time dimensions. Each element of the occurrence data cube represents the probability of detecting an object at a particular location in the scene during a particular time interval. As can be appreciated, an occurrence data cube having a time dimension plus three spatial dimensions can be obtained from multiple cameras 20a-20n (
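By way of illustration only, one plausible realization of such an occurrence data cube is sketched below, assuming a discretized scene grid and hourly time bins. The grid size, frame dimensions, and normalization scheme are hypothetical choices, not the claimed model.

```python
import numpy as np

# Hypothetical occurrence data cube: time-of-day bins x grid rows x grid cols.
# Each cell accumulates detection counts that are later normalized into the
# probability of detecting an object at that location and time interval.
T_BINS, ROWS, COLS = 24, 16, 16
counts = np.zeros((T_BINS, ROWS, COLS))

def record_detection(hour, x, y, frame_w=640, frame_h=480):
    """Map an image-plane detection (x, y) at a given hour into the cube."""
    r = min(int(y / frame_h * ROWS), ROWS - 1)
    c = min(int(x / frame_w * COLS), COLS - 1)
    counts[hour % T_BINS, r, c] += 1

def occurrence_model():
    """Normalize counts to per-time-slice detection probabilities."""
    totals = counts.sum(axis=(1, 2), keepdims=True)
    return np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)

record_detection(hour=14, x=320, y=240)
probs = occurrence_model()
print(probs[14].max())  # 1.0 -- all probability mass in the one observed cell
```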
The trajectory models can be built by using the entry and exit regions with the object metadata 30 obtained from the video analysis module 80 (
The directional models represent the motion of an object with respect to regions in a site. Specifically, each cell contains a probability of following a certain direction within the cell and a statistical representation of measurements in a spatio-temporal region (cell), such as speed and acceleration. A cell can contain links to entry regions, exit regions, trajectory models, and a global data cube model of the site under surveillance. A cell can also contain spatio-temporal region specific optimized scoring engine methods as well as user specified scoring engine methods. Although the dimensions of the data cube are depicted as a uniform grid structure, it is appreciated that non-uniform intervals can be important for optimal model representation. Variable length intervals, as well as clustered/segmented non-rigid spatio-temporal shape descriptors (i.e., 3D/4D shape descriptions), can be used for model reduction. Furthermore, the storage of the model 32 can utilize multi-dimensional indexing methods (such as R-tree, X-tree, SR-tree, etc.) for efficient access to cells.
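By way of illustration only, the following sketch shows one possible layout for the contents of a single such cell; the field names and example values are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class DirectionalCell:
    """Illustrative contents of one spatio-temporal cell of the data cube."""
    direction_probs: dict = field(default_factory=dict)  # e.g., {"N": 0.7, ...}
    speed_mean: float = 0.0          # statistical speed measurements in the cell
    speed_std: float = 0.0
    accel_mean: float = 0.0          # statistical acceleration measurements
    entry_regions: list = field(default_factory=list)    # links to entry regions
    exit_regions: list = field(default_factory=list)     # links to exit regions
    scoring_methods: list = field(default_factory=list)  # cell-specific SE methods

cell = DirectionalCell(direction_probs={"N": 0.7, "E": 0.2, "S": 0.05, "W": 0.05},
                       speed_mean=1.4, speed_std=0.3)
print(max(cell.direction_probs, key=cell.direction_probs.get))  # "N"
```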
As can be appreciated, the data cube structure supports predictive modeling of the statistical attributes in each cell so that the motion trajectory of an observed object can be predicted based on the velocity and acceleration attributes stored in the data cube. For example, a statistical analysis of the past history of moving objects may indicate that an object detected at location (X1, Y1) is highly likely to move to location (X2, Y2) after T seconds; accordingly, when a new object is observed at location (X1, Y1), it can be predicted to move to location (X2, Y2) after T seconds.
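A minimal sketch of this kind of cell-based motion prediction, assuming each cell stores a learned mean velocity, follows; the grid geometry, cell size, and velocity values are hypothetical.

```python
import numpy as np

# Hypothetical per-cell motion statistics: a mean velocity (vx, vy) per grid
# cell, learned from the historical trajectories accumulated in the data cube.
mean_velocity = np.zeros((16, 16, 2))
mean_velocity[8, 8] = (2.0, -1.0)  # objects in this cell tend to drift right/up

def predict_location(x1, y1, t_seconds, cell_size=40.0):
    """Predict (X2, Y2) after T seconds from the learned velocity of the cell
    containing (X1, Y1)."""
    r, c = int(y1 / cell_size), int(x1 / cell_size)
    vx, vy = mean_velocity[r, c]
    return x1 + vx * t_seconds, y1 + vy * t_seconds

print(predict_location(x1=340.0, y1=330.0, t_seconds=5))  # (350.0, 325.0)
```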
Referring now to
As discussed above, the image capture module 22 captures image data 93 from the sensor data 14. The image data 93 is passed to the video analyzer module 80 for the extraction of objects and properties of the objects. More particularly, the video analyzer module 80 can produce object data 30 in the form of an object detection vector ({right arrow over (o)}) that includes: an object identifier (a unique key value per object); a location of a center of an object in the image plane (x, y); a timestamp; a minimum bounding box (MBB) in the image plane (x_low, y_low, x_upper, y_upper); a binary mask matrix that specifies which pixels belong to a detected object; image data of the detected object; and/or other properties of detected objects, such as visual descriptors specified by a metadata format (e.g., the MPEG-7 standard and its extended form for surveillance). The object data 30 can be sent to the scoring engine (SE) modules 24 and saved into the object history datastore 82.
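By way of illustration only, the object detection vector described above might be laid out as follows; the field names and example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class ObjectDetection:
    """Illustrative layout of the object detection vector (o)."""
    object_id: int                          # unique key value per object
    center: Tuple[float, float]             # (x, y) center in the image plane
    timestamp: float
    mbb: Tuple[float, float, float, float]  # (x_low, y_low, x_upper, y_upper)
    mask: List[List[int]] = field(default_factory=list)  # binary pixel mask
    descriptors: dict = field(default_factory=dict)  # e.g., MPEG-7 descriptors

o = ObjectDetection(object_id=7, center=(120.5, 88.0), timestamp=1699999999.0,
                    mbb=(100.0, 70.0, 140.0, 110.0))
print(o.object_id, o.mbb)
```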
In various aspects of the present teachings, the video analyzer module 80 can access the models 32 of the camera models datastore 92, for example, to improve the accuracy of the object tracking methods. As discussed above, the models 32 are loaded to the camera models datastore 92 of the camera 20 via the device configuration module 28. The device configuration module 28 also instantiates the scoring engine module 24 and the decision making module 26, and prepares a communication channel between the modules involved in the processing of object data 30 for progressive behavior and threat detection.
The scoring engine module 24 produces one or more scores 34 for particular object traits, such as an occurrence of the object in the scene, a velocity of the object, and an acceleration of the object. In various aspects, the scoring engine module 24 includes a plurality of scoring engine sub-modules that perform the following functionality. The scoring engine module 24 selects a particular scoring engine method 52 from the scoring methods datastore 86 based on the model type and the object trait to be scored. Various exemplary scoring engine methods 52 can be found in the attached Appendix A. The scoring engine methods 52 are loaded to the scoring methods datastore 86 via the device configuration module 28.
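A sketch of one way the selection of a scoring engine method by model type and object trait could be organized, assuming a simple in-memory registry, follows; the method names, model representations, and formulas below are hypothetical stand-ins for the methods of Appendix A.

```python
# Hypothetical registry keyed by (model type, object trait); in the described
# system the scoring engine methods 52 are instead loaded dynamically via the
# device configuration module 28.
def occurrence_score(model, obj):
    # Low detection probability at this cell/hour implies high abnormality.
    return 1.0 - model.get((obj["cell"], obj["hour"]), 0.0)

def velocity_score(model, obj):
    mean, std = model  # learned speed statistics
    return min(abs(obj["speed"] - mean) / (3 * std), 1.0)

SCORING_METHODS = {
    ("occurrence", "occurrence"): occurrence_score,
    ("velocity", "velocity"): velocity_score,
}

def score(model_type, trait, model, obj):
    return SCORING_METHODS[(model_type, trait)](model, obj)

print(score("velocity", "velocity", (5.0, 2.0), {"speed": 14.0}))  # 1.0 (capped)
```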
The scores 34 of each detected object can be accumulated to obtain progressive threat or alert levels at a location (X0, Y0) in real time. Furthermore, using the predictive model stored in the data cube, the score 34 of an object can be calculated in advance by first predicting the motion trajectory of the object and then calculating the score of the object along that trajectory. As a result, the system can predict changes in threat levels before they occur, to support preemptive alert message generation. The forward prediction can include the predicted properties of an object in the near future (such as its location, speed, etc.) as well as a trend analysis of the scoring results.
The determination of the score 34 can be based on the models 32, the object data 30, the score history data 34, and in some cases object history data from the object history datastore 82, user-defined regions of interest, and various combinations thereof. As can be appreciated, the score 34 can be a scalar value representing the measure of abnormality. In various other aspects of the present teachings, the score 34 can include two or more scalar values. For example, the score 34 can include a measure of normalcy and/or a confidence level, and/or a measure of abnormality and/or a confidence level. The score data 34 is passed to the decision making module 26 and/or stored in the SE scores history datastore 84 with a timestamp.
The decision making module 26 then generates the alert message 36 based on a fusing of the scores 34 from the scoring engine modules 24 for given object detection event data ({right arrow over (o)}). The decision making module 26 can use the historical score data 34 and the object data 30 during fusion. The decision making module 26 can be implemented according to various decision making methods 50 stored to the decision methods datastore 88. Such decision making methods 50 can be loaded to the camera 20 via the device configuration module 28. In various aspects of the present teachings, as shown in
where w represents a weight for each score based on time (t) and spatial dimensions (X, Y). In various aspects of the present teachings, the dimensions of the data cube can vary in number, for example, X, Y, Z spatial dimensions. The weights (w) can be pre-configured or adaptively learned and loaded to the parameters datastore 90 via the device configuration module 28. In various other aspects of the present teachings, the alert message 36 is determined based on a decision tree based method as shown in
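Although the referenced fusion formula is not reproduced here, a weighted-sum fusion consistent with the description, in which each weight depends on the time bin and spatial cell of the detection, might be sketched as follows; the weight keys and default values are hypothetical.

```python
# Sketch of a weighted-sum fusion: alert = sum_i w_i(t, x, y) * s_i, averaged
# over the contributing scoring engines. Weight keys and defaults are made up.
def fuse_scores(scores, weights, t_bin, cell):
    """scores: {engine_name: score}; weights: {(engine_name, t_bin, cell): w}."""
    total = sum(weights.get((name, t_bin, cell), 1.0) * s
                for name, s in scores.items())
    return total / max(len(scores), 1)

fused = fuse_scores({"occurrence": 0.9, "velocity": 0.4},
                    weights={("occurrence", 14, (8, 8)): 2.0},
                    t_bin=14, cell=(8, 8))
print(fused)  # (2.0 * 0.9 + 1.0 * 0.4) / 2 = 1.1
```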
Since the decision making module 26 can be implemented according to various decision making methods 50, the decision making module 26 is preferably defined in a declarative form by using, for example, an XML based representation such as an extended form of the Predictive Model Markup Language. This enables the learning module 44 to improve the accuracy of the decision making module, since the learning module 44 can change not only various parameters (such as the weights and the decision tree explained above) but also the decision making method itself.
In various aspects of the present teachings, the decision making module 26 can generate predictions that support early-warning alert messages for progressive behavior and threat detection. For example, the decision making module 26 can generate predictions about objects in motion based on the trajectory models 32. A prediction of a future location of an object in motion enables the decision making module 26 to identify whether two objects in motion will collide. If a collision is probable, the decision making module 26 can predict where and when the objects will collide, as well as generate the alert message 36 to prevent a possible accident.
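By way of illustration only, one simple way to test two predicted trajectories for a probable collision is sketched below; the time horizon, step size, and collision radius are hypothetical parameters.

```python
def will_collide(p1, v1, p2, v2, horizon=10.0, radius=1.5, dt=0.5):
    """Step two predicted trajectories forward and flag the first near-approach.

    p = (x, y) predicted position, v = (vx, vy) predicted velocity; the time
    horizon, step size, and collision radius are hypothetical parameters.
    Returns (time, location) of the predicted collision, or None.
    """
    t = 0.0
    while t <= horizon:
        x1, y1 = p1[0] + v1[0] * t, p1[1] + v1[1] * t
        x2, y2 = p2[0] + v2[0] * t, p2[1] + v2[1] * t
        if ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5 <= radius:
            return t, ((x1 + x2) / 2, (y1 + y2) / 2)
        t += dt
    return None

# Two objects approaching head-on are predicted to collide near (5.0, 0.0).
print(will_collide(p1=(0, 0), v1=(1, 0), p2=(10, 0), v2=(-1, 0)))
```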
As discussed above, to allow for co-operative decision making between the cameras 20a-20n in the surveillance system 10, the decision making module 26 can exchange data with other decision making modules 26, such as decision making modules 26 running in other cameras 20a, 20b (
Referring now to
More particularly, the alarm handling module 38 can include a threats data datastore 98, a rule based abnormality evaluation module 94, a rules datastore 100, and a dynamic rule based alarm handling module 96. As can be appreciated, the rule based abnormality evaluation module 94 can be considered another form of a decision making module 26 (
The rules datastore 100 stores rules that are dynamically configurable and that can be used to further evaluate the detected object. Such evaluation rules, for example, can include, but are not limited to, rules identifying permissible objects even though they are identified as suspicious; rules associating higher alert levels with recognized objects; and rules recognizing an object as suspicious when the object is present in two different scenes at the same time.
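By way of illustration only, such dynamically configurable evaluation rules might be represented as condition/action pairs, as sketched below; the rule conditions, object fields, and alert levels are hypothetical.

```python
# Hypothetical condition/action rule pairs; in the described system the rules
# datastore 100 holds dynamically configurable rules (data, not code).
RULES = [
    # A permissible object (e.g., a badged guard) suppresses the alert level...
    (lambda obj: obj.get("badge") == "security",
     lambda obj: obj.update(alert_level=0)),
    # ...while presence in two different scenes at once raises it.
    (lambda obj: len(set(obj.get("scenes", []))) > 1,
     lambda obj: obj.update(alert_level=5)),
]

def evaluate(obj):
    for condition, action in RULES:
        if condition(obj):
            action(obj)
    return obj

print(evaluate({"badge": "security", "alert_level": 3}))  # alert_level -> 0
```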
The rule based abnormality evaluation module 94 associates the additional properties with the detected object based on the object data from the threats data datastore 98. The rule based abnormality evaluation module 94 then uses this additional information and the evaluation rules to re-evaluate the potential threat and the corresponding alert level. For example, the rule based abnormality evaluation module 94 can identify the object as a security guard traversing the scene during off-work hours. Based on the configurable rules and actions, the rule based abnormality evaluation module 94 can disregard the alert message 36 and prevent the alarm messages 18 from being dispatched even though a detection of a person at off-work hours is suspicious.
The dynamic rule based alarm handling module 96 dispatches an alert event 102, in the form of the alarm messages 18 and its additional data, to interested modules, such as the surveillance GUI 40 (
Referring now to
For example, the learning module 44 retrieves the decision making methods 50, the models 32, the scoring engine methods 52, and the parameters 51 from the system configuration module 42. The learning module 44 selects one or more appropriate learning methods from a learning method datastore 106. The learning methods can be associated with a particular decision making method 50. Based on the learning method, the learning module 44 re-examines the decision making method 50 and the object data 30 from a camera against the misclassification data 48. The learning module 44 can adjust the parameters 51 to minimize the error in the decision making operation. As can be appreciated, if more than one learning method is associated with the decision making method 50, the learning module 44 performs the above re-examination for each method 50 and uses the best result or some combination thereof to adjust the parameters 51.
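By way of illustration only, the following sketch shows one way a learning method could sweep a single decision threshold (standing in for one of the parameters 51) to minimize the error against collected misclassification data; the feedback format and threshold grid are hypothetical.

```python
# Sketch of the re-examination loop: sweep a decision threshold (standing in
# for one of the parameters 51) to minimize error against user feedback.
def tune_threshold(samples):
    """samples: list of (fused_score, was_true_threat) pairs from the user."""
    best_t, best_err = 0.5, float("inf")
    for t in (i / 100 for i in range(101)):
        err = sum((score >= t) != truth for score, truth in samples)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

feedback = [(0.9, True), (0.7, True), (0.6, False), (0.3, False)]
print(tune_threshold(feedback))  # 0.61 -- first threshold with zero error
```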
Referring now to
The camera configuration module 110 associates the models 32, the scoring engine methods 52, and the decision making methods 50 and parameters 51 with each of the cameras 20a-20n (
The information upload module 112 provides the models 32, the scoring engine methods 52, and the decision making methods 50 and parameters 51 to the device configuration module 28 (
Those skilled in the art can now appreciate from the foregoing description that the broad teachings of the present disclosure can be implemented in a variety of forms. Therefore, while this disclosure has been described in connection with particular examples thereof, the true scope of the disclosure should not be so limited since other modifications will become apparent to the skilled practitioner upon a study of the drawings, specification, and the following claims.