Hereinafter, the best mode for implementing an embodiment of the invention will be described with reference to the drawings. The embodiment described below is an example suited for a monitoring system in which an imaging device (a monitoring camera) takes video data of a shooting target and generates metadata, and the obtained metadata is analyzed to detect a moving object (an object) and output the detected result.
In addition, naturally, the numbers of monitoring cameras, client terminals and servers are not restricted to those in this embodiment.
Here, metadata generated in the monitoring camera will be described. Metadata is attribute information of the video data taken by the imaging part of the monitoring camera. For example, the following items may be named.
The term object information refers to information obtained by expanding information described as binary data in metadata into a meaningful data structure, such as a structure.
The term metadata filter refers to the decision conditions under which alarm information is generated from object information, and the term alarm information refers to information obtained by filtering based on the object information expanded from metadata. Alarm information is obtained by analyzing a plurality of frames of metadata to determine the velocity of a moving object from the changes in its position, by confirming whether a moving object crosses over a certain line, or by analyzing these in a composite manner.
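For illustration only, the velocity and line-crossing analyses described above might be sketched in Python as follows; ObjectSample and all other names are assumptions for this sketch, not structures defined by the embodiment:

```python
from dataclasses import dataclass

@dataclass
class ObjectSample:
    """One object observation taken from a single frame of metadata."""
    object_id: int
    frame_time: float  # seconds
    x: float
    y: float

def velocity(prev: ObjectSample, curr: ObjectSample) -> tuple[float, float]:
    """Velocity estimated from the change in position between two frames
    (assumes the two frame times are distinct)."""
    dt = curr.frame_time - prev.frame_time
    return (curr.x - prev.x) / dt, (curr.y - prev.y) / dt

def crosses_line(prev: ObjectSample, curr: ObjectSample,
                 a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True if the object's movement from prev to curr crosses the line segment a-b."""
    def side(p, q, r):
        # Sign of the cross product (q - p) x (r - p).
        return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    p1, p2 = (prev.x, prev.y), (curr.x, curr.y)
    return (side(p1, p2, a) * side(p1, p2, b) < 0
            and side(a, b, p1) * side(a, b, p2) < 0)
```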
For example, there are the seven types of filters given below, and any one type of filter among them may be used.
The data included in alarm information depends on the filter. For the filter “Capacity” among the filters described above, for example, it includes “the accumulated number of objects”, which is generated through the filter using the accumulated total of detected objects, “the number of objects”, which is the number of objects matched with the conditions of the filter, the number of objects matched with the conditions of the filter within a specific frame, and attribute information of an object matched with the conditions of the filter (the ID, X-coordinate, Y-coordinate and size of the object). As described above, alarm information includes the number of objects (the number of people) in video and statistics thereof, which may be used as a report function.
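As a minimal sketch (the embodiment does not specify a data layout), such alarm information might be represented as follows; all names are assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class ObjectAttributes:
    """Attribute information of an object matched with the filter conditions."""
    object_id: int
    x: float   # X-coordinate
    y: float   # Y-coordinate
    size: float

@dataclass
class AlarmInfo:
    accumulated_objects: int  # accumulated total of detected objects
    matched_objects: int      # objects matched with the filter conditions
    matched_in_frame: int     # matches within a specific frame
    attributes: list[ObjectAttributes] = field(default_factory=list)
```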
Next, the detailed configuration of the monitoring camera 1 shown in
The imaging part 212 has a preamplifier part and an A/D (Analog/Digital) converting part, not shown, for example. The preamplifier part amplifies the electric signal level of the imaging signal Sv and removes reset noise by correlated double sampling, and the A/D converting part converts the imaging signal Sv from an analog signal into a digital signal. Moreover, the imaging part 212 adjusts the gain of the supplied imaging signal Sv, stabilizes the black level, and adjusts the dynamic range. The imaging signal Sv subjected to these various processes is supplied to an imaging signal processing part 213.
The imaging signal processing part 213 performs various signal processes on the imaging signal Sv supplied from the imaging part 212, and generates video data Dv. For example, processes such as the following are performed: knee correction, which compresses the imaging signal Sv above a certain level; γ correction, which corrects the level of the imaging signal Sv in accordance with a set γ curve; and white clipping or black clipping, which limits the signal level of the imaging signal Sv to a predetermined range. Then, video data Dv is supplied to the data processing part 214.
In order to reduce the data volume in communications with the client terminal 3, for example, the data processing part 214 performs a coding process on video data Dv and generates video data Dt. Furthermore, the data processing part 214 forms the generated video data Dt into a predetermined data structure, and supplies it to the client terminal 3.
Based on a switching instruction signal CA inputted from the client terminal 3, the imaging operation switching part 22 switches the operation of the monitoring camera 1 so as to obtain the optimum imaged video. For example, the imaging operation switching part 22 switches the imaging direction of the imaging part, and in addition it allows the individual parts to perform such processes: a control signal CMa is supplied to the lens part 211 to switch the zoom ratio and the iris, a control signal CMb is supplied to the imaging part 212 and the imaging signal processing part 213 to switch the frame rate of imaged video, and a control signal CMc is supplied to the data processing part 214 to switch the compression rate of video data.
The metadata generating part 23 generates metadata Dm that shows information about a monitoring target. In the case in which a moving object is set as the monitoring target, the metadata generating part uses video data Dv generated in the video data generating part 21 to detect the moving object, generates moving object detection information indicating whether a moving object is detected and moving object position information indicating the position of the detected moving object, and includes them as object information in metadata. At this time, a unique ID is assigned to each detected object.
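A minimal sketch of this generation step, assuming a simple dictionary layout for metadata Dm (the actual structure is not specified here):

```python
import itertools
import time

_next_id = itertools.count(1)  # a unique ID is assigned to each detected object

def make_metadata(detections):
    """Build one frame of metadata from moving-object detections.

    `detections` is assumed to be a list of (x, y) positions found in the
    current frame of video data Dv; the dictionary layout is illustrative.
    """
    objects = [{"id": next(_next_id), "x": x, "y": y} for x, y in detections]
    return {
        "time": time.time(),
        "object_detected": bool(objects),  # moving object detection information
        "objects": objects,                # moving object position information
    }
```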
In addition, information about the monitoring target is not restricted to information about a moving object; it may be information indicating the state of the area to be monitored by the monitoring camera. For example, it may be information about the temperature or brightness of the area to be monitored. Alternatively, it may be information about operations performed in the area to be monitored. In the case in which the temperature is the monitoring target, the temperature measurement result may be included in metadata, whereas in the case in which the brightness is the monitoring target, the metadata generating part 23 may determine the average brightness of the monitor video based on video data Dv, for example, and include the determined result in metadata.
Furthermore, in the case in which operations performed by users on an ATM (Automated Teller Machine) or a POS (Point Of Sale) terminal are the monitoring targets, it is sufficient that user operations performed through an operation key or an operation panel are included in metadata.
Moreover, the metadata generating part 23 includes imaging operation information QF supplied from the imaging operation switching part 22 (for example, the imaging direction or the zoom state at the time when the monitoring target is imaged, and setting information of the video data generating part) and time information in metadata, whereby the time when metadata was generated and the situation at that time can be left as records.
Here, the configurations of video data and metadata will be described. Video data and metadata are each configured of a data main body and link information. In the case of video data, the data main body is the video data of monitor video taken by the monitoring cameras 1a and 1b. In the case of metadata, the data main body describes information such as information indicating the monitoring target, together with attribute information that defines the description mode of that information. On the other hand, link information is association information that indicates the association between video data and metadata, together with attribute information that defines the description mode of that association information.
For association information, for example, a time stamp and sequence numbers that identify video data are used. The term time stamp refers to information that gives the point in time at which video data was generated (time information), and the term sequence number refers to information that gives the order in which contents data was generated (order information). In the case in which there is a plurality of pieces of monitor video having the same time stamp, the sequence number identifies the order in which the pieces of video data having the same time stamp were generated. Moreover, for association information, information that identifies the device generating the video data may be used (for example, manufacturer names, product type names, production numbers and so on).
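For illustration, link information of this kind might be modeled as below; the field names are assumptions:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LinkInfo:
    """Association information linking metadata to video data."""
    time_stamp: float     # point in time of generating the video data (time information)
    sequence_number: int  # order of generation among data sharing a time stamp (order information)
    device_id: str = ""   # optional: manufacturer name, product type name, production number

def generation_order(link: LinkInfo):
    # Sorting by time stamp and then by sequence number identifies the order
    # of generation even among pieces of video sharing the same time stamp.
    return (link.time_stamp, link.sequence_number)
```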
In order to describe the metadata main body and link information, a markup language is used that is defined for describing information exchanged on the web (WWW: World Wide Web). With the use of a markup language, information can be easily exchanged via the network 2. Furthermore, with the use of, for example, XML (Extensible Markup Language), which is used to exchange documents and electronic data, video data and metadata can be easily exchanged. In the case of using XML, the XML schema is used for the attribute information that defines the description mode of information, for example.
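As a sketch of the XML exchange, the metadata dictionary from the earlier example might be serialized with Python's standard library; the element and attribute names are assumptions, not the schema actually used:

```python
import xml.etree.ElementTree as ET

def metadata_to_xml(meta: dict) -> bytes:
    """Serialize one frame of metadata to XML for exchange over the network 2."""
    root = ET.Element("metadata", {"time": str(meta["time"])})
    for obj in meta["objects"]:
        ET.SubElement(root, "object", {
            "id": str(obj["id"]),
            "x": str(obj["x"]),
            "y": str(obj["y"]),
        })
    return ET.tostring(root, encoding="utf-8")

# Produces XML of roughly this shape:
# <metadata time="1690000000.0"><object id="1" x="120" y="80" /></metadata>
```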
Video data and metadata generated by the monitoring cameras 1a and 1b may be supplied as a single stream to the client terminal 3, or video data and metadata may be supplied asynchronously to the client terminal 3 in separate streams.
In addition, as shown in
Next, the detailed configuration of the client terminal 3 shown in
The client terminal 3 has a network connecting part 101 which exchanges data with the monitoring cameras 1a and 1b, a video buffering part 102 which acquires video data from the monitoring cameras 1a and 1b, a metadata buffering part 103 which acquires metadata from the monitoring cameras 1a and 1b, a filter setting database 107 which stores filter settings in accordance with the filtering process, a metadata filtering part 106 as a filtering part which filters metadata, a vanishing point setting database 113 which stores vanishing point setting information when a location at which an object can disappear out of the monitoring target area of the monitoring camera is set as “an area having a vanishing point”, a rule switching part 108 which notifies a change of settings to the monitoring cameras 1a and 1b, a video data storage database 104 which stores video data, a metadata storage database 105 which stores metadata, a display part 111 which displays video data and metadata, a video data processing part 109 which performs processes to reproduce video data on the display part 111, a metadata processing part 110 which performs processes to reproduce metadata on the display part 111, and a reproduction synchronizing part 112 which synchronizes the reproduction of metadata with video data.
The video buffering part 102 acquires video data from the monitoring cameras 1a and 1b, and decodes coded video data. Then, the video buffering part 102 holds the obtained video data in a buffer, not shown, disposed in the video buffering part 102. Furthermore, the video buffering part 102 in turn supplies the video data held in the buffer, not shown, to the display part 111, which displays images thereon. Because video data is held in the buffer in this way, video data can be supplied in turn to the display part 111 without relying on the reception timing of video data from the monitoring cameras 1a and 1b. Moreover, the video buffering part 102 stores the held video data in the video data storage database 104 based on a recording request signal supplied from the rule switching part 108, described later. In addition, a scheme may also be used in which coded video data is stored in the video data storage database 104 and decoded in the video data processing part 109, described later.
The metadata buffering part 103 holds metadata acquired from the monitoring cameras 1a and 1b in a buffer, not shown, disposed in the metadata buffering part 103. Moreover, the metadata buffering part 103 in turn supplies the held metadata to the display part 111, and also supplies the metadata held in the buffer, not shown, to the metadata filtering part 106, described later. Because metadata is held in the buffer in this way, metadata can be supplied in turn to the display part 111 without relying on the reception timing of metadata from the monitoring cameras 1a and 1b, and can be supplied to the display part 111 in synchronization with video data. Furthermore, the metadata buffering part 103 stores metadata acquired from the monitoring cameras 1a and 1b in the metadata storage database 105. Here, in storing metadata in the metadata storage database 105, time information about the video data synchronized with the metadata is added. With this configuration, the added time information can be used to read metadata at a desired point in time out of the metadata storage database 105 without reading the descriptions of metadata to determine points in time.
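A sketch of such a time-indexed store, under the assumption that metadata records are keyed by the time information of the synchronized video data:

```python
import bisect

class MetadataStore:
    """Metadata records keyed by the time of the synchronized video data,
    so a record can be read out without parsing metadata descriptions."""

    def __init__(self):
        self._times = []    # sorted time information
        self._records = []  # metadata bodies, parallel to _times

    def store(self, video_time: float, metadata) -> None:
        i = bisect.bisect_right(self._times, video_time)
        self._times.insert(i, video_time)
        self._records.insert(i, metadata)

    def read_at(self, t: float):
        """Return the latest metadata at or before the desired point in time t."""
        i = bisect.bisect_right(self._times, t)
        if i == 0:
            raise KeyError(f"no metadata at or before {t!r}")
        return self._records[i - 1]
```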
The filter setting database 107 stores filter settings in accordance with the filtering process performed by the metadata filtering part 106, described later, and supplies the filter settings to the metadata filtering part 106. The term filter settings refers to settings that indicate determination criteria, such as whether it is necessary to output alarm information or to switch the imaging operations of the monitoring cameras 1a and 1b, for every piece of information about the monitoring target included in metadata. The filter settings are used to filter metadata and show the filtered result for every piece of information about the monitoring target. The filtered result shows the necessity to output alarm information, to switch the imaging operations of the monitoring cameras 1a and 1b, and so on.
The metadata filtering part 106 uses the filter settings stored in the filter setting database 107 to filter metadata and determine whether to generate alarms. It filters metadata acquired from the metadata buffering part 103 or metadata supplied from the metadata storage database 105, and notifies the filtered result to the rule switching part 108.
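One possible shape of this filtering step, treating each filter setting as a named predicate over object information (a sketch; the seven filter types themselves are not reproduced here):

```python
def apply_filters(metadata: dict, filter_settings) -> dict:
    """Run each filter setting against one frame of metadata.

    `filter_settings` is assumed to be a list of (name, predicate) pairs,
    where the predicate decides per object whether the filter conditions
    are met; the filtered result records whether an alarm is needed.
    """
    result = {}
    for name, predicate in filter_settings:
        matched = [obj for obj in metadata["objects"] if predicate(obj)]
        result[name] = {"matched": matched, "raise_alarm": bool(matched)}
    return result

# Usage with a hypothetical size-based filter setting:
settings = [("LargeObject", lambda obj: obj.get("size", 0) > 50)]
```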
The vanishing point setting database 113 stores vanishing point setting information in the case in which a location at which an object can disappear out of the monitoring target area of the monitoring camera, such as a door, is set as an area having a vanishing point. The area having a vanishing point is indicated, for example, by a polygon based on its coordinate information, and a flag indicating that it is an area having a vanishing point is added to the vanishing point setting information. The vanishing point setting information stored in the vanishing point setting database 113 is referenced in the filtering process performed by the metadata filtering part 106, and analysis is made in accordance with it. The details of this process will be described later.
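Since the area having a vanishing point is indicated by a polygon with a flag, membership can be tested with a standard ray-casting check; this sketch and the setting layout are assumptions:

```python
def point_in_polygon(x: float, y: float, polygon) -> bool:
    """Ray-casting test: is (x, y) inside the polygon given as [(x, y), ...]?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            if x < (x2 - x1) * (y - y1) / (y2 - y1) + x1:
                inside = not inside
    return inside

# A vanishing point setting: polygon coordinates plus the flag described above.
vanishing_point_setting = {
    "polygon": [(100, 0), (160, 0), (160, 40), (100, 40)],  # e.g. around a door
    "is_vanishing_point": True,
}
```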
Based on the filtered result notified from the metadata filtering part 106, the rule switching part 108 generates the switching instruction signal and notifies the monitoring cameras 1a and 1b of changes such as the switching of the imaging direction. For example, the rule switching part outputs an instruction to switch the operations of the monitoring cameras 1a and 1b based on the filtered result obtained from the metadata filtering part 106, so as to obtain monitor video suited for monitoring. Moreover, based on the filtered result, the rule switching part 108 supplies the recording request signal to the video data storage database 104 so that the video data acquired by the video buffering part 102 is stored in the video data storage database 104.
The video data storage database 104 stores video data acquired by the video buffering part 102. The metadata storage database 105 stores metadata acquired by the metadata buffering part 103.
The video data processing part 109 performs the process that allows the display part 111 to display video data stored in the video data storage database 104. In other words, the video data processing part 109 in turn reads video data from the reproduction position instructed by a user, and supplies the read video data to the display part 111. In addition, the video data processing part 109 supplies the reproduction position (a reproduction point in time) of the video data being reproduced to the reproduction synchronizing part 112.
The reproduction synchronizing part 112, which synchronizes the reproduction of metadata with video data, supplies a synchronization control signal to the metadata processing part 110, and controls the operation of the metadata processing part 110 so that the reproduction position of the metadata stored in the metadata storage database 105, reproduced by the metadata processing part 110, is synchronized with the reproduction position supplied from the video data processing part 109.
The metadata processing part 110 performs the process that allows the display part 111 to display metadata stored in the metadata storage database 105. In other words, the metadata processing part 110 in turn reads metadata from the reproduction position instructed by the user, and supplies the read metadata to the display part 111. In addition, as described above, in the case in which video data and metadata are both reproduced, the metadata processing part 110 controls the reproduction operation based on the synchronization control signal supplied from the reproduction synchronizing part 112, and outputs metadata synchronized with the video data to the display part 111.
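Building on the MetadataStore sketch above, the synchronization between the two processing parts might be expressed as follows; this is an assumption-laden sketch, not the embodiment's interfaces:

```python
class ReproductionSynchronizer:
    """Aligns metadata reproduction with the video reproduction position."""

    def __init__(self, metadata_store):
        self._store = metadata_store  # any store offering read_at(), e.g. MetadataStore

    def on_video_position(self, reproduction_time: float):
        # The video data processing part reports its reproduction point in
        # time; the metadata synchronized with that time is read out so the
        # display part can show both together.
        return self._store.read_at(reproduction_time)
```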
The display part 111 displays live (raw) video data supplied from the video buffering part 102, reproduced video data supplied from the video data processing part 109, live metadata supplied from the metadata buffering part 103, or reproduced metadata supplied from the metadata processing part 110. In addition, based on the filter settings from the metadata filtering part 106, the display part 111 uses any one of monitor video, metadata video, and filter setting video, or uses video combining them, and displays (outputs) video showing the monitoring result based on the filtered result.
Moreover, the display part 111 also functions as a graphical user interface (GUI). The user uses an operation key, a mouse, or a remote controller, not shown, to select a filter setting menu displayed on the display part 111 in order to define the filter, or to display information about the analyzed results of the individual processing parts and alarm information in the GUI.
As discussed above, the client terminal 3 acquires, analyzes and stores the video data 1001 and the metadata 1002 supplied from the monitoring cameras 1a and 1b. The video data 1001 and the metadata 1002 inputted to the client terminal 3 are stored in the video data storage database 104 and the metadata storage database 105, respectively. The client terminal 3 has a filter setting function in which various filter settings are made through a filter setting screen (filter setting menu) displayed on the display part 111 and the setting information is stored in the filter setting database 107.
On a filter setting display screen 1003 shown in
Monitor video 1004 shows the video data 1001 with the filter superimposed on it, displayed on the display part 111. The line LN is set as the passing filter. In the case in which objects passing through the filter are counted, the number of objects passing through the line LN is computed. On the screen, since objects MB1 and MB2 are detected as objects passing through the line LN, the number of objects is two.
However, because of disturbances such as system noise, abrupt changes in brightness, or quick motion of an object (a moving object), it sometimes happens that an object that has been recognized temporarily disappears and then appears again. In this case, if the object before disappearing and the object that appears again are recognized as different objects, the number of recognized objects is twice the actual number of objects. In order to prevent this, such a setting is sometimes made that an object that has once disappeared and then appears again at almost the same place is considered to be the same object.
However, for example, in the case in which that setting is applied to a place at which an object can actually disappear or appear, such as an entrance, the opposite error can occur: even though a different object appears, it is recognized as the same object.
In the embodiment, a place at which an object can actually disappear or appear, such as an entrance, is defined as “an area having a vanishing point”. At a place set as an area having a vanishing point, such settings are made that an object that has once visually disappeared from the monitoring screen and an object that has appeared at almost the same place are not considered to be the same object, whereby the number of objects obtained through the filter approximates the actual number of objects.
Next, the object recognition process according to the embodiment will be described with reference to a flow chart shown in
The client terminal 3 receives metadata including object information, and determines whether the area in which the object is detected is an area in which a vanishing point is set (Step S14). If the area in which the object is detected is set as an area having a vanishing point, the object is recognized as a new object (Step S16). If the area in which the object is detected is not set as an area having a vanishing point, and it is confirmed that an object has disappeared at almost the same position right before the time at which the object was detected, the disappeared object and the detected object are recognized as the same object (Step S15).
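Using the point_in_polygon helper sketched earlier, the decision of Steps S14 to S16 might look like this; the tolerance for “almost the same position” and the bookkeeping of disappeared objects are assumptions:

```python
def recognize(detected, vanishing_areas, recently_disappeared,
              same_place_tolerance: float = 10.0) -> str:
    """Sketch of Steps S14-S16.

    `detected` is an object record with "x" and "y"; `vanishing_areas` is a
    list of polygons set as areas having a vanishing point; and
    `recently_disappeared` maps object IDs to the last (x, y) at which an
    object disappeared right before this detection.
    """
    x, y = detected["x"], detected["y"]

    # Step S14: is the detection inside an area having a vanishing point?
    if any(point_in_polygon(x, y, poly) for poly in vanishing_areas):
        return "new object"  # Step S16

    # Step S15: elsewhere, an object that disappeared at almost the same
    # position right before is considered to be the same object.
    for old_id, (ox, oy) in recently_disappeared.items():
        if abs(x - ox) <= same_place_tolerance and abs(y - oy) <= same_place_tolerance:
            return f"same object as {old_id}"
    return "new object"
```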
In this case, if the setting that “an object that has once disappeared and then appeared at almost the same place is considered to be the same object” is applied, it is considered that the object MB1 in
However, if the objects MB1 and MB3 are actually different people, the number of objects obtained through the filter is smaller than the actual number of objects. On this account, in the embodiment, in the case in which an object has once disappeared and an object is again detected at almost the same place, and that place is in the area having a vanishing point VP, it is considered that the object before disappearing and the object after appearing again are different objects. If the same rule were applied to other places, the problem would arise that the same object is counted twice. Thus, it is defined that in areas other than the area having a vanishing point VP, an object that has once disappeared and again appeared at almost the same place is considered to be the same object.
Again returning to
However, in the embodiment, in the case in which an object has disappeared and again appeared in an area other than one set as an area having a vanishing point, it is considered that the object before disappearing and the object after appearing again are the same object. Therefore, it is considered that the object MB2 in
As described above, in the case in which an area in which an object can visually disappear and then appear, such as an entrance, is set as an area having a vanishing point, when an object once disappears and an object is again detected at almost the same place and that place is in the area having a vanishing point, it is recognized that the object before disappearing and the object after appearing again are different objects. Therefore, the error in which various moving objects going in and out of an entrance are recognized as the same object is eliminated, and the difference between the actual number of objects (moving objects) and the number of objects obtained through the filter is made small.
Moreover, in areas other than the area having a vanishing point, it is recognized that the object before disappearing and the object after appearing again are the same object. Therefore, even in the case in which an object does not actually disappear but is recognized as having disappeared because of some factor such as disturbance, the object before disappearing and the object after appearing again are recognized as the same object. On this account, the number of objects obtained through a filter is made closer to the actual number of objects.
In addition, in the embodiment described so far, the object ID is assigned on the monitoring camera side, but this task may be done on the client terminal side.
Moreover, a series of the process steps of the embodiment described above can be executed by hardware, or may be executed by software. In the case in which the series of process steps is executed by software, a program configuring the software is installed in a computer incorporated in dedicated hardware, or a program configuring desired software is installed in a multi-purpose personal computer that can execute various functions by installing various programs.
Furthermore, in the embodiment described above, the configuration is such that metadata outputted from the monitoring camera (the monitoring imaging device) is filtered. However, the target of the filtering process is not restricted to metadata, and the configuration can be adapted to various cases in which data in various forms is filtered. For example, the configuration may be such that the video (images) of the video data outputted from the monitoring camera is directly analyzed in the client terminal.
It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and alterations may occur depending on design requirements and other factors insofar as they are within the scope of the appended claims or the equivalents thereof.
Priority application: P2006-205067, filed July 2006, Japan (national).