The present disclosure relates generally to video surveillance systems. More particularly, the present disclosure relates to testing video analytics of video surveillance systems.
A number of video surveillance systems employ video cameras that are installed or otherwise arranged around a surveillance area such as a city, a portion of a city, a facility or a building. Video surveillance systems may also include mobile video cameras, such as drones carrying video cameras. Video surveillance systems may employ a variety of different detection algorithms using video analytics. What would be desirable are improved methods for testing the video analytics of video surveillance systems.
The present disclosure relates generally to video surveillance systems and more particularly to testing video analytics of video surveillance systems. An example may be found in a computer implemented method for testing one or more video analytics algorithms of a video surveillance system. The video surveillance system includes a video camera having a field of view (FOV) that encompasses part of a facility (or other area). The illustrative method includes generating simulated objects that are to be superimposed on a video stream captured by the video camera, wherein generating the simulated objects includes displaying an image captured by the video camera on a display, receiving a first user input that identifies one or more user identified regions relative to the image, tagging each of the one or more user identified regions with a corresponding region tag, receiving a second user input that is used to identify one or more simulation parameters, and identifying one or more simulation parameters based at least in part on the second user input. Based at least in part on the one or more simulation parameters and one or more of the user identified regions and corresponding region tags, the method includes determining a plurality of characteristics of the simulated objects including one or more of a quantity of the simulated objects in the FOV, a distribution of the simulated objects in the FOV, a starting point for each of the simulated objects in the FOV, and a movement of each of the simulated objects in the FOV. The method includes superimposing the simulated objects on the FOV of the video stream captured by the video camera to create an augmented video stream and processing the augmented video stream using one or more video analytics algorithms to test the effectiveness of each of the one or more video analytics algorithms.
Another example may be found in a computer implemented method for testing one or more video analytics algorithms of a video surveillance system, where the video surveillance system includes a video camera having a field of view (FOV) that encompasses part of a facility (or other region). The illustrative method includes generating simulated objects that are to be superimposed on a video stream captured by the video camera, wherein generating the simulated objects includes receiving a user input that identifies a particular video analytics algorithm from a plurality of video analytics algorithms, and based at least in part on the particular video analytics algorithm identified by the user input, determining a plurality of characteristics of the simulated objects including one or more of a quantity of the simulated objects in the FOV, a distribution of the simulated objects in the FOV, a starting point for each of the simulated objects in the FOV, and a movement of each of the simulated objects in the FOV. The method includes superimposing the simulated objects on the FOV of the video stream captured by the video camera to create an augmented video stream and processing the augmented video stream using the particular video analytics algorithm identified by the user input to test the effectiveness of the particular video analytics algorithm.
Another example may be found in a computer implemented method for testing a video surveillance system including a video camera having a field of view (FOV) that encompasses a region of a facility (or other region). The illustrative method includes generating simulated objects that are to be superimposed on a video stream captured by the video camera, wherein generating the simulated objects includes receiving from a user a voice or text based description of one or more scenarios to be tested. Based at least in part on the voice or text based description of the one or more scenarios to be tested, the method includes determining a plurality of characteristics of the simulated objects including one or more of a quantity of the simulated objects in the FOV, a distribution of the simulated objects in the FOV, a starting point for each of the simulated objects in the FOV, and a movement of each of the simulated objects in the FOV. The method includes superimposing the simulated objects on the FOV of the video stream captured by the video camera to create an augmented video stream and processing the augmented video stream to test the one or more scenarios described in the voice or text based description.
The preceding summary is provided to facilitate an understanding of some of the innovative features unique to the present disclosure and is not intended to be a full description. A full appreciation of the disclosure can be gained by taking the entire specification, claims, figures, and abstract as a whole.
The disclosure may be more completely understood in consideration of the following description of various examples in connection with the accompanying drawings, in which:
While the disclosure is amenable to various modifications and alternative forms, specifics thereof have been shown by way of example in the drawings and will be described in detail. It should be understood, however, that the intention is not to limit the disclosure to the particular examples described. On the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the disclosure.
The following description should be read with reference to the drawings, in which like elements in different drawings are numbered in like fashion. The drawings, which are not necessarily to scale, depict examples that are not intended to limit the scope of the disclosure. Although examples are illustrated for the various elements, those skilled in the art will recognize that many of the examples provided have suitable alternatives that may be utilized.
All numbers are herein assumed to be modified by the term “about”, unless the content clearly dictates otherwise. The recitation of numerical ranges by endpoints includes all numbers subsumed within that range (e.g., 1 to 5 includes 1, 1.5, 2, 2.75, 3, 3.80, 4, and 5).
As used in this specification and the appended claims, the singular forms “a”, “an”, and “the” include the plural referents unless the content clearly dictates otherwise. As used in this specification and the appended claims, the term “or” is generally employed in its sense including “and/or” unless the content clearly dictates otherwise.
It is noted that references in the specification to “an embodiment”, “some embodiments”, “other embodiments”, etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is contemplated that the feature, structure, or characteristic may be applied to other embodiments whether or not explicitly described unless clearly stated to the contrary.
The controller 16 is operably coupled with a memory 18. The memory 18 may be used for storing recorded video streams and video clips, for example. The memory 18 may be used for storing video analytics algorithms 20. The video analytics algorithms 20 may be used by the controller 16, for example, when analyzing a video stream or a recorded video clip. The video analytics algorithms 20 may include any of a variety of different video analytics algorithms 20, such as but not limited to one or more of a crowd detection algorithm (e.g. crowd present or forming), a crowd analytics algorithm (e.g. size of crowd, rate of crowd formation or dispersion, density of crowd, movement within the crowd, movement of the crowd, etc.), a people count algorithm (e.g. number of people in a defined region of interest of the FOV), a behavior detection algorithm (e.g. running, walking, loitering, yelling, shop lifting, protesting, damaging property, tailgating through a secure entry point, wearing a mask, social distancing, etc.), an intrusion detection algorithm (e.g. entering a secure area), a perimeter protection algorithm (e.g. crossing a security perimeter) and a tailgating detection algorithm (e.g. tailgating through a secure entry point). These are just examples.
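By way of illustration only, the following Python sketch shows one way the video analytics algorithms 20 could be organized as a registry that the controller 16 consults when selecting which algorithm(s) to run; the enum members, the register decorator, and the placeholder implementation are assumptions made for the sketch and are not part of the present disclosure.

```python
from enum import Enum, auto
from typing import Callable, Dict

class AnalyticsType(Enum):
    # Illustrative algorithm categories mirroring those listed above.
    CROWD_DETECTION = auto()
    CROWD_ANALYTICS = auto()
    PEOPLE_COUNT = auto()
    BEHAVIOR_DETECTION = auto()
    INTRUSION_DETECTION = auto()
    PERIMETER_PROTECTION = auto()
    TAILGATING_DETECTION = auto()

# A registry the controller could consult when choosing which algorithm(s)
# to run on a video stream; the callables are placeholders.
ANALYTICS_REGISTRY: Dict[AnalyticsType, Callable] = {}

def register(kind: AnalyticsType):
    """Decorator that adds an algorithm implementation to the registry."""
    def wrap(fn: Callable) -> Callable:
        ANALYTICS_REGISTRY[kind] = fn
        return fn
    return wrap

@register(AnalyticsType.PEOPLE_COUNT)
def people_count(frame) -> int:
    # Placeholder: a real implementation would run a person detector on the frame.
    return 0
```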
The controller 16 is operably coupled with a user interface 22. The user interface 22 includes a display 24 that may be used for displaying a video stream, for example. The user interface 22 may also include data entry capability, such as a keyboard and mouse, that a user may use when entering information into the controller 16. In some cases, the user interface 22 may include a voice and/or text receiving device for receiving a natural language input, sometimes from a remote location. In some instances, at least some of the controller 16, the memory 18 and the user interface 22 may be manifested within a computer such as a PC or laptop. In some instances, the controller 16 and/or the memory 18 may be manifested in a computer server such as a cloud-based server. The controller 16 and/or the memory 18 may represent part of a distributed computer system, with part of the distributed computer disposed locally and part of the distributed computer disposed remotely.
The Object Simulation Module 54 generates simulated objects and movement of the simulated objects. The augmented video stream, including simulated objects moving in the field of view, is visualized, as indicated at block 58. One or more detection algorithms are run on the augmented video stream, as indicated at block 60. A visual check is run to see if the superimposed simulated objects are being detected as expected for the one or more video analytics algorithms being tested, as indicated at block 62. This may include a user watching the augmented video stream via the user interface 22.
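A minimal sketch of the generate, superimpose, detect, and check loop described above is shown below; the object_simulator, superimpose, and detectors arguments are hypothetical stand-ins for the Object Simulation Module 54, the compositing step, and the detection algorithms run at block 60.

```python
def run_simulation_check(frames, object_simulator, superimpose, detectors):
    """For each video frame: generate/advance the simulated objects,
    superimpose them to form the augmented frame, run each detection
    algorithm, and collect the results so a user (or an automated check)
    can verify that the simulated objects were detected as expected."""
    results = []
    for frame in frames:
        objects = object_simulator.step()            # simulated objects + their movement
        augmented = superimpose(frame, objects)      # augmented video frame
        detections = {name: det(augmented) for name, det in detectors.items()}
        results.append((augmented, objects, detections))
    return results
```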
In some cases, one or more of the simulation parameters may be scaled by a scale factor that is dependent on the location of the simulated object in the FOV of the video camera. For example, for simulated objects that are in the upper part of the FOV, indicating that these objects are further from the video camera, the simulated objects may be smaller in size (smaller number of pixels) relative to similar objects that are in the lower part of the FOV. In another example, the speed of the simulated objects (across pixels) may be smaller for objects in the upper part of the FOV relative to similar objects that are in the lower part of the FOV. Moreover, in some cases, the scale factor may be dependent upon the placement of the camera and the size of the area being monitored. For example, for a video camera placed in a large warehouse with a FOV that covers a long distance, the scale factor may be greater than for a video camera that is placed in a smaller confined room. The downward facing angle of the video camera may also be taken into account when determining the scale factor. In some cases, the scale factor is based on one or more inputs from a user.
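A minimal sketch of such a scale factor, assuming a simple linear ramp from the top of the frame (far from the camera) to the bottom (near the camera), is shown below; the bounds and the linear form are illustrative assumptions, and a deployment could further fold in camera placement, room size, and downward tilt angle.

```python
def scale_factor(y: float, frame_height: float,
                 min_scale: float = 0.3, max_scale: float = 1.0) -> float:
    """Return a scale factor for a simulated object based on its vertical
    position in the FOV: y = 0 is the top of the frame (far away, small
    scale), y = frame_height is the bottom of the frame (close, full scale).
    The linear ramp and the min/max bounds are assumptions for illustration."""
    t = max(0.0, min(1.0, y / frame_height))
    return min_scale + t * (max_scale - min_scale)

# Example: an object's nominal 80-pixel height and 4 px/frame speed are
# both scaled by its position in a 1080-pixel-tall frame.
height_px = 80 * scale_factor(y=120, frame_height=1080)   # near the top -> rendered small
speed_px = 4 * scale_factor(y=950, frame_height=1080)     # near the bottom -> moves faster
```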
The simulated objects, and their movement, are superimposed on the individual frames of the video stream, as indicated at block 72. A GIF layer may be added to the original RGB (Red Green Blue) images, and previewed, as seen at block 74. The RGB images with the superimposed objects are temporarily saved and reloaded into the preview, as indicated at block 76.
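A minimal compositing sketch is shown below, assuming each simulated object is available as an RGBA sprite (for example, one frame of the GIF layer) that is alpha-blended onto an RGB video frame with NumPy; this is only one possible way to superimpose the objects and is not the disclosure's actual implementation.

```python
import numpy as np

def superimpose(frame_rgb: np.ndarray, sprite_rgba: np.ndarray,
                x: int, y: int) -> np.ndarray:
    """Alpha-blend an RGBA sprite (e.g., one frame of a GIF layer) onto an
    RGB frame at top-left position (x, y), assumed non-negative.
    Returns a new augmented frame; the original frame is left unchanged."""
    out = frame_rgb.copy()
    h, w = sprite_rgba.shape[:2]
    # Clip the sprite to the frame boundaries.
    h = min(h, out.shape[0] - y)
    w = min(w, out.shape[1] - x)
    if h <= 0 or w <= 0:
        return out
    sprite = sprite_rgba[:h, :w]
    alpha = sprite[..., 3:4].astype(np.float32) / 255.0
    region = out[y:y + h, x:x + w].astype(np.float32)
    blended = alpha * sprite[..., :3].astype(np.float32) + (1.0 - alpha) * region
    out[y:y + h, x:x + w] = blended.astype(np.uint8)
    return out
```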
In some instances, receiving the first user input may include drawing annotations on the image while displayed on a display screen to identify the one or more user identified regions. The annotations may indicate user identified regions that include possible obstructions that may constrain where an object simulator module is allowed to place or move the simulated objects. In some instances, generating the simulated objects includes automatically tagging each of the one or more user identified regions with a corresponding region tag, as indicated at block 86. In some instances, the region tags may define a corresponding region type. Examples of regions and/or region types include but are not limited to one or more of a wall visible in the FOV, a fence visible in the FOV, an obstacle visible in the FOV, a secure area visible in the FOV, a window visible in the FOV, a door visible in the FOV, a pedestrian lane visible in the FOV, and a vehicle lane visible in the FOV.
In some instances, the user may draw annotations that indicate regions or areas of interest. As an example, a user may draw a box or other shape around a particular area of interest in the FOV, in which one or more video analytics algorithms are to be run. In some cases, each area of interest may be tagged to identify which of a plurality of video analytics algorithms are to be run in the associated area of interest. For example, a user may draw a shape that encompasses an area in front of a cash machine, and may tag the shape with a loitering video analytics algorithm that detects if/when people are loitering in the area of interest. In another example, a user may draw a shape that encompasses a region of a train station, and may tag the shape with a crowd detection video analytics algorithm that detects if/when a crowd of people are gathered in the area of interest. These are just examples. In another example, a user may draw a box or other shape around a particular area, noting that this area is not of interest, and there is no need to run any video analytics algorithms on this particular area. A bathroom or changing room may represent an area that is not of interest. This is just an example.
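One possible representation of such user-drawn annotations is sketched below; the RegionAnnotation fields, tag strings, and coordinates are illustrative assumptions only, not a format defined by the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class RegionAnnotation:
    """A user-drawn region in the camera image, with a region tag
    (e.g. 'wall', 'door', 'vehicle_lane') and, optionally, the analytics
    algorithms to run inside it. An empty analytics list with
    of_interest=False marks an area (e.g. a changing room) to be excluded."""
    polygon: List[Tuple[int, int]]                        # vertices in image coordinates
    region_tag: str                                       # e.g. 'secure_area', 'pedestrian_lane'
    analytics: List[str] = field(default_factory=list)    # e.g. ['loitering']
    of_interest: bool = True

regions = [
    RegionAnnotation([(100, 400), (300, 400), (300, 600), (100, 600)],
                     region_tag="secure_area", analytics=["loitering"]),
    RegionAnnotation([(700, 50), (900, 50), (900, 200), (700, 200)],
                     region_tag="changing_room", analytics=[], of_interest=False),
]
```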
In some instances, generating the simulated objects includes receiving a second user input that is used to identify one or more simulation parameters, as indicated at block 88. In some instances, one or more of the simulation parameters are based at least in part on the second user input, as indicated at block 90. In some instances, the method continues with determining a plurality of characteristics of the simulated objects based at least in part on the one or more simulation parameters.
In some instances, the characteristics of the simulated objects include a quantity of the simulated objects in the FOV, as indicated at block 94. In some instances, the characteristics of the simulated objects include a distribution and/or distribution type of the simulated objects in the FOV, as indicated at block 96. In some instances, the characteristics of the simulated objects include a starting point (e.g. time and/or location) for each of the simulated objects in the FOV, as indicated at block 98. In some instances, the characteristics of the simulated objects include a movement (e.g. speed, movement type) of each of the simulated objects in the FOV, as indicated at block 100. In some instances, the characteristics of the simulated objects may be influenced by the regions of interest indicated by the user. The simulated objects may be directed into or through a region of interest, for example when testing a particular video analytics algorithm. The simulated objects may be directed away from areas not of interest, as running video analytics algorithms on these areas may be viewed as a waste of computing resources.
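The following sketch illustrates how such characteristics might be collected for each simulated object and how starting points could be biased toward a user-identified region of interest; the data structure, default values, and helper are assumptions made for illustration, not the disclosure's implementation.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class SimulatedObject:
    object_type: str                 # e.g. 'person', 'vehicle'
    start_frame: int                 # when the object first appears (cf. block 98)
    start_xy: Tuple[float, float]    # where it first appears in the FOV (cf. block 98)
    speed_px: float                  # pixels per frame (cf. block 100)
    movement_type: str               # e.g. 'random', 'corridor', 'group' (cf. block 100)

def generate_objects(quantity: int, frame_w: int, frame_h: int,
                     target_region: Optional[List[Tuple[int, int]]] = None,
                     seed: int = 0) -> List[SimulatedObject]:
    """Create `quantity` simulated objects (cf. block 94). If a target region
    of interest is given as a polygon, starting points are drawn from its
    bounding box so the objects exercise that region; otherwise they are
    spread uniformly over the frame (cf. block 96)."""
    rng = random.Random(seed)
    if target_region:
        xs = [p[0] for p in target_region]
        ys = [p[1] for p in target_region]
        x_lo, x_hi, y_lo, y_hi = min(xs), max(xs), min(ys), max(ys)
    else:
        x_lo, x_hi, y_lo, y_hi = 0, frame_w, 0, frame_h
    return [
        SimulatedObject(
            object_type="person",
            start_frame=rng.randint(0, 50),
            start_xy=(rng.uniform(x_lo, x_hi), rng.uniform(y_lo, y_hi)),
            speed_px=rng.uniform(1.0, 4.0),
            movement_type="random",
        )
        for _ in range(quantity)
    ]
```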
In some instances, the method 78 includes superimposing the simulated objects on the FOV of the video stream captured by the video camera to create an augmented video stream, as indicated at block 102. The method 78 may include processing the augmented video stream using one or more video analytics algorithms to test the effectiveness of each of the one or more video analytics algorithms, as indicated at block 104. The one or more video analytics algorithms may include, but are not limited to, one or more of a crowd detection algorithm (e.g. crowd present or forming), a crowd analytics algorithm (e.g. size of crowd, rate of crowd formation or dispersion, density of crowd, movement within the crowd, movement of the crowd, etc.), a people count algorithm (e.g. number of people in a defined region of interest of the FOV), a behavior detection algorithm (e.g. running, walking, loitering, yelling, shop lifting, protesting, damaging property, tailgating through a secure entry point, wearing a mask, social distancing, etc.), an intrusion detection algorithm (e.g. entering a secure area), a perimeter protection algorithm (e.g. crossing a security perimeter) and a tailgating detection algorithm (e.g. tailgating through a secure entry point). These are just examples. In some instances, the method 78 may include receiving a third user input that tags at least one of the one or more user identified regions with a corresponding region tag including a region tag type, as indicated at block 106. The method 78 may include using video analytics to automatically identify and tag at least one of the one or more user identified regions with a corresponding region tag, as indicated at block 108. That is, the video analytics may be used to identify walls, doors, windows, fences, parking spaces and other regions in the FOV and automatically tag each of the identified regions with an appropriate region tag. The method 78 may include using video analytics to automatically identify and tag at least some areas, including one or more user identified regions, as not being of interest. A potential person of interest is unlikely to materialize through a solid wall, for example.
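One simple way to quantify the effectiveness of an algorithm under test is to compare the frames on which it fired against the frames in which the simulator actually injected the corresponding event; the ground truth is known exactly because the simulator placed the objects. The sketch below is illustrative only and is not the disclosure's evaluation procedure.

```python
def detection_rate(ground_truth_frames, detected_frames) -> float:
    """Fraction of ground-truth event frames on which the algorithm under
    test also reported the event. A fuller evaluation might also track
    false positives, detection latency, localization accuracy, etc."""
    truth = set(ground_truth_frames)
    detected = set(detected_frames)
    return len(truth & detected) / len(truth) if truth else 1.0

# Example: the simulator injected a crowd during frames 100-200 and the
# crowd detection algorithm fired on frames 110-195.
rate = detection_rate(range(100, 201), range(110, 196))
print(f"crowd detection effectiveness: {rate:.0%}")   # -> 85%
```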
In some instances, the one or more simulation parameters may include one or more of an object type of one or more of the simulated objects, a measure related to a number of simulated objects, a distribution of two or more of the simulated objects, a starting point for one or more of the simulated objects, a speed of movement of one or more of the simulated objects, a variation in speed of movement of one or more of the simulated objects over time, a variation in speed of movement between two or more of the simulated objects, and a type of movement of one or more of the simulated objects. These are just examples. In some instances, the distribution of two or more of the simulated objects may include one or more of a Weibull distribution, a Gaussian distribution, a Binomial distribution, a Poisson distribution, a Uniform distribution, a Random distribution, and a Geometric distribution. In some instances, the distribution of two or more of the simulated objects may include a mixed distribution of two or more different distributions. In some instances, the type of movement may include one or more of a random directional movement, a random directional movement constrained by movement of other simulated objects and/or constrained by user identified regions of the FOV, a directional movement along a predefined corridor, a varying speed movement, and a group movement associated with two or more of the simulated objects.
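A small sketch of drawing starting positions from several of the distributions named above, including a mixed distribution, is shown below using NumPy; the parameters (frame width, means, shape factors, mixing weight) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=42)
n = 50  # number of simulated objects

# Horizontal starting positions drawn from a few of the supported
# distributions (parameters are illustrative, scaled to a 1920-px-wide frame).
uniform_x  = rng.uniform(0, 1920, n)
gaussian_x = rng.normal(loc=960, scale=200, size=n)
weibull_x  = rng.weibull(a=1.5, size=n) * 400
poisson_x  = rng.poisson(lam=900, size=n).astype(float)

# A mixed distribution: 70% of objects clustered near a doorway (Gaussian)
# and 30% spread uniformly across the frame.
mask = rng.random(n) < 0.7
mixed_x = np.where(mask, rng.normal(600, 80, n), rng.uniform(0, 1920, n))
```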
In one example, a voice or text based description may be “Test for an unauthorized entry through door A of region 1”. When the description is voice based, the description may be converted to text through a voice to text translator. The system may parse the voice or text based description and identify the phrases “door A” and “region 1”. From these phrases, the system may identify which video camera(s) of the video surveillance system 10 has a FOV that includes door A of region 1. The system may also parse the voice or text based description and identify the phrase “unauthorized entry”. This phrase may be mapped to one of a plurality of video analytics algorithms, such as an intrusion detection video analytics algorithm (or an unauthorized entry detection video analytics algorithm). The system may then be configured to automatically identify characteristics of the simulated objects that are suitable for testing the identified video analytics algorithm, namely the intrusion detection video analytics algorithm, for an intrusion of door A of region 1. The system may then superimpose simulated objects with the identified characteristics on the FOV of the video stream captured by the identified video camera to create an augmented video stream. The system may then process the augmented video stream using the particular video analytics algorithm (e.g. the intrusion detection video analytics algorithm) to test the effectiveness of the particular video analytics algorithm (e.g. the intrusion detection video analytics algorithm) to identify an intrusion of door A of region 1.
In another example, a voice or text based description may be “Do crowd detection in region 4”. The system may parse the voice or text based description and identify the phrase “region 4”. From this phrase, the system may identify which video camera(s) of the video surveillance system 10 has a FOV that includes region 4. The system may also parse the voice or text based description and identify the phrase “crowd detection”. This phrase may be mapped to one of a plurality of video analytics algorithms, such as a crowd detection video analytics algorithm. The system may then be configured to automatically identify characteristics of the simulated objects that are suitable for testing the identified video analytics algorithm, namely the crowd detection video analytics algorithm in region 4. The system may then superimpose simulated objects with the identified characteristics on the FOV of the video stream captured by the identified video camera to create an augmented video stream. The system may then process the augmented video stream using the particular video analytics algorithm (e.g. the crowd detection video analytics algorithm) to test the effectiveness of the particular video analytics algorithm (e.g. the crowd detection video analytics algorithm) to identify a crowd in region 4. This approach may facilitate testing the video surveillance system 10 from a remote device and/or remote location.
The mappings between the various phrases of the voice or text based description and the locations and/or video analytics algorithms may be stored in a mapping table. In some cases, the mappings may be learned over time by an artificial intelligence model, and once trained, the artificial intelligence model may automatically produce an appropriate test of the video surveillance system 10 based on the voice or text based description.
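A minimal sketch of such a mapping table and of parsing a voice-to-text converted description is given below; the phrases, camera identifiers, and algorithm names are illustrative assumptions rather than a vocabulary defined by the disclosure.

```python
# Illustrative mapping tables; in practice these could be maintained per
# site or learned over time by a trained model.
LOCATION_MAP = {
    "door a of region 1": ["camera_03"],
    "region 4": ["camera_11", "camera_12"],
}
ALGORITHM_MAP = {
    "unauthorized entry": "intrusion_detection",
    "crowd detection": "crowd_detection",
    "loitering": "behavior_detection",
}

def parse_description(text: str):
    """Map a (voice-to-text converted) description such as
    'Do crowd detection in region 4' to candidate cameras and an algorithm."""
    text = text.lower()
    cameras = next((cams for phrase, cams in LOCATION_MAP.items()
                    if phrase in text), [])
    algorithm = next((alg for phrase, alg in ALGORITHM_MAP.items()
                      if phrase in text), None)
    return cameras, algorithm

print(parse_description("Do crowd detection in region 4"))
# (['camera_11', 'camera_12'], 'crowd_detection')
```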
The blocks 190, 192 and 194 each provide information to a decision block 196, where a determination is made as to whether the object simulator is attempting to move a polygon shape (an object) into one of the regions received from the block 190 or the block 192. If yes, control passes to block 198, and the simulator does not move the object to position X1, Y1. If not, control passes to block 200, and the simulator does move the object to position X1, Y1.
In this embodiment, both blocks 190 and 192 define regions where the simulator is not allowed to move objects. In other embodiments, one of the regions, such as the region defined by block 192, may represent a region into which the simulator is encouraged to move objects. For example, if attempting to test a crowd detection algorithm in region A of the FOV, the region defined by block 192 may correspond to region A and the simulator may be encouraged or even required to move one or more objects into the region defined by block 192. In this embodiment, the decision block 196 may determine whether the object simulator is attempting to move a polygon shape (an object) into the region defined by block 192. If yes, the simulator is encouraged to move the object to position X1, Y1. If not, the simulator is not encouraged but is still allowed to move the object to position X1, Y1. In this embodiment, control is not passed to blocks 198 or 200.
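The decision made at block 196 can be sketched as a simple point-in-polygon test against the regions received from blocks 190 and 192; the ray-casting helper below is an illustrative implementation, not the one used by the disclosed simulator.

```python
from typing import List, Tuple

def point_in_polygon(x: float, y: float,
                     poly: List[Tuple[float, float]]) -> bool:
    """Standard ray-casting test: does point (x, y) fall inside `poly`?"""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def allow_move(x1: float, y1: float,
               blocked_regions: List[List[Tuple[float, float]]]) -> bool:
    """Return True if the simulator may move an object to (X1, Y1), i.e. the
    target point is not inside any blocked (annotated or auto-tagged)
    region such as a wall or obstacle."""
    return not any(point_in_polygon(x1, y1, region) for region in blocked_regions)

wall = [(0, 0), (200, 0), (200, 50), (0, 50)]
print(allow_move(100, 25, [wall]))   # False: target position is inside the wall region
print(allow_move(100, 300, [wall]))  # True: target position is clear
```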
Having thus described several illustrative embodiments of the present disclosure, those of skill in the art will readily appreciate that yet other embodiments may be made and used within the scope of the claims hereto attached. It will be understood, however, that this disclosure is, in many respects, only illustrative. Changes may be made in details, particularly in matters of shape, size, arrangement of parts, and exclusion and order of steps, without exceeding the scope of the disclosure. The disclosure's scope is, of course, defined in the language in which the appended claims are expressed.