The present invention relates to a person tracking device and a person tracking program for detecting each individual person existing in an area to be monitored and tracking each individual person.
When a large number of elevators are installed in a skyscraper, a group control operation of causing the many elevators to operate in conjunction with one another is required in order to convey passengers efficiently, for example at the time of the morning commuter rush hour and the lunch-break rush hour. In order to carry out such a group control operation efficiently, it is necessary to measure movement histories of passengers, i.e., “on which floor how many persons got on each elevator and on which floor how many persons got off each elevator”, and to provide the movement histories to a group management system.
Conventionally, various person tracking technologies have been proposed for counting the number of passengers and measuring each passenger's movements by using a camera.
As one of them, a person tracking device which detects passengers in an elevator and counts the number of passengers in the elevator by computing a difference image (a background difference image) between a background image stored therein in advance and an image of the inside of the elevator captured by a camera (refer to patent reference 1) has been proposed.
However, in a case in which the elevator is greatly crowded, each passenger occupies only an area of about 25 cm square and passengers in the image overlap one another. Therefore, the background difference image may become a single silhouette of a group of people. As a result, it is very difficult to separate an image of each individual person from the background difference image, and the above-mentioned person tracking device cannot count the number of passengers in the elevator correctly.
Furthermore, as another technology, a person tracking device provided with a camera installed in an upper portion of an elevator cage, for carrying out pattern matching between a reference pattern of a person's head image stored therein in advance and an image captured by the camera to detect the head of each passenger in the elevator and count the number of passengers in the elevator cage (refer to patent reference 2) has been proposed.
However, when passengers are detected by using such simple pattern matching, the number of passengers may be counted erroneously if, for example, a passenger is shaded by another passenger when viewed from the camera. Furthermore, in a case in which a mirror is installed in the elevator cage, a passenger reflected in the mirror may be detected erroneously.
In addition, as another technology, a person tracking device provided with a stereoscopic camera installed in an upper portion of an elevator cage, for carrying out stereo vision of each person who is detected from an image captured by the stereoscopic camera to determine the person's three-dimensional position (refer to patent reference 3) has been proposed.
However, this person tracking device may detect a larger number of persons than the actual number of persons.
More specifically, in the case of this person tracking device, as shown in
However, it may be estimated that the person exists also at a point that the vector VA1 and a vector VB2 intersect, and, even when only two persons exist actually, it may be therefore determined erroneously that three persons exist.
In addition, as methods of detecting two or more persons by using multiple cameras, a method of using dynamic programming to determine each person's moving track on the basis of a silhouette of the person which is acquired from a background difference (refer to nonpatent reference 1) and a method of determining each person's moving track by using “Particle Filter” (refer to nonpatent reference 2) have been proposed.
The use of each of these methods makes it possible to, even when a person is shaded by another person at one point of view, determine the number of persons and each person's moving track by using silhouette information and time series information at another point of view.
However, because the silhouettes of some persons always overlap one another in a crowded elevator cage or train no matter from which point of view they are shot, these methods cannot be applied to such a situation.
Because the conventional person tracking devices are constructed as mentioned above, a problem with these conventional person tracking devices is that, in a situation in which an elevator cage which is an area to be monitored is greatly crowded, passengers in the elevator cage cannot be detected correctly and each of the passengers cannot be tracked correctly.
The present invention is made in order to solve the above-mentioned problem, and it is therefore an object of the present invention to provide a person tracking device and a person tracking program which can correctly track each person existing in an area to be monitored even when the area to be monitored is greatly crowded.
A person tracking device in accordance with the present invention includes: a plurality of shooting units installed at different positions, each for shooting an identical area to be monitored; a person position calculating unit for analyzing a plurality of video images of the area to be monitored which are shot by the plurality of shooting units to determine a position, on each of the plurality of video images, of each individual person existing in the area to be monitored; a two-dimensional moving track calculating unit for calculating a two-dimensional moving track of each individual person in each of the plurality of video images by tracking the position on each of the plurality of video images which is calculated by the person position calculating unit; and a three-dimensional moving track calculating unit for carrying out stereo matching between two-dimensional moving tracks in the plurality of video images, which are calculated by the two-dimensional moving track calculating unit, to calculate a degree of match between the two-dimensional moving tracks, and for calculating a three-dimensional moving track of each individual person from two-dimensional moving tracks each having a degree of match equal to or larger than a specific value.
Because the person tracking device in accordance with the present invention is constructed in such a way that the person tracking device includes the person position calculating unit for analyzing the plurality of video images of the area to be monitored which are shot by the plurality of shooting units to determine the position, on each of the plurality of video images, of each individual person existing in the area to be monitored, and the two-dimensional moving track calculating unit for calculating a two-dimensional moving track of each individual person in each of the plurality of video images by tracking the position on each of the plurality of video images which is calculated by the person position calculating unit, and in such a way that the three-dimensional moving track calculating unit carries out stereo matching between two-dimensional moving tracks in the plurality of video images, which are calculated by the two-dimensional moving track calculating unit, to calculate the degree of match between the two-dimensional moving tracks, and calculates a three-dimensional moving track of each individual person from two-dimensional moving tracks each having a degree of match equal to or larger than the specific value, there is provided an advantage of being able to correctly track each person existing in the area to be monitored even when the area to be monitored is greatly crowded.
Hereafter, in order to explain this invention in greater detail, the preferred embodiments of the present invention will be described with reference to the accompanying drawings.
However, the type of each of the plurality of cameras 1 is not limited to a specific type. Each of the plurality of cameras 1 can be a general surveillance camera. As an alternative, each of the plurality of cameras 1 can be a visible camera, a high sensitivity camera capable of shooting up to a near infrared region, a far-infrared camera capable of shooting a heat source, or the like. As an alternative, infrared distance sensors, laser range finders or the like capable of measuring a distance can be substituted for such cameras.
A video image acquiring unit 2 is a video input interface for acquiring a video image of the inside of the elevator cage shot by each of the plurality of cameras 1, and carries out a process of outputting the video image of the inside of the elevator cage to a video analysis unit 3.
In this embodiment, it is assumed that the video image acquiring unit 2 outputs the video image of the inside of the elevator cage to the video analysis unit 3 in real time. The video image acquiring unit 2 can alternatively record the video image into a recorder, such as a hard disk prepared beforehand, and can output the video image to the video analysis unit 3 through an off-line process.
The video analysis unit 3 carries out a process of analyzing the video image of the inside of the elevator cage outputted from the video image acquiring unit 2 to calculate a three-dimensional moving track of each individual person existing in the cage, and then calculating a person movement history showing the floor where each individual person has got on the elevator cage and the floor where each individual person has got off the elevator cage, and so on according to the three-dimensional moving track.
An image analysis result display unit 4 carries out a process of displaying the person movement history and so on which are calculated by the video analysis unit 3 on a display (not shown). The image analysis result display unit 4 constructs an image analysis result display unit.
A door opening and closing recognition unit 11 carries out a process of analyzing the video image of the inside of the elevator cage outputted from the video image acquiring unit 2 to specify the opening and closing times of the door of the elevator. The door opening and closing recognition unit 11 constructs a door opening and closing time specifying unit.
A floor recognition unit 12 carries out a process of analyzing the video image of the inside of the elevator cage outputted from the video image acquiring unit 2 to specify the floor where the elevator is located at each time. The floor recognition unit 12 constructs a floor specifying unit.
A person tracking unit 13 carries out a process of analyzing the video image of the inside of the elevator cage outputted from the video image acquiring unit 2 and then tracking each individual person existing in the cage to calculate a three-dimensional moving track of each individual person, and calculate a person movement history showing the floor where each individual person has got on the elevator cage and the floor where each individual person has got off the elevator cage, and so on according to the three-dimensional moving track.
In
A background difference unit 22 carries out a process of calculating a difference between the background image registered by the background image registration unit 21 and a video image of the door region shot by a camera 1.
An optical flow calculating unit 23 carries out a process of calculating a motion vector showing the direction of the door's movement from a change of the video image of the door region shot by the camera 1.
A door opening and closing time specifying unit 24 carries out a process of determining an open or closed state of the door from the difference calculated by the background difference unit 22 and the motion vector calculated by the optical flow calculating unit 23 to specify an opening or closing time of the door.
A background image updating unit 25 carries out a process of updating the background image by using a video image of the door region shot by the camera 1.
In
A template matching unit 32 carries out a process of performing template matching between the template image registered by the template image registering unit 31 and a video image of an indicator region in the elevator shot by a camera 1 to specify the floor where the elevator is located at each time, or carries out a process of analyzing control base information about the elevator to specify the floor where the elevator is located at each time.
A template image updating unit 33 carries out a process of updating the template image by using a video image of the indicator region shot by the camera 1.
In
A camera calibration unit 42 of the person position determining unit 41 carries out a process of analyzing a degree of distortion of each of video images of a calibration pattern which are shot in advance by the plurality of cameras 1 before the person tracking process is started to calculate camera parameters of the plurality of cameras 1 (parameters regarding a distortion of the lens of each camera, the focal length, optical axis and principal point of each camera).
The camera calibration unit 42 also carries out a process of determining the installed positions and installation angles of the plurality of cameras 1 with respect to a reference point in the elevator cage by using both the video images of the calibration pattern shot by the plurality of cameras 1 and the camera parameters of the plurality of cameras 1.
A video image correcting unit 43 of the person position determining unit 41 carries out a process of correcting a distortion of the video image of the elevator cage shot by each of the plurality of cameras 1 by using the camera parameters calculated by the camera calibration unit 42.
A person detecting unit 44 of the person position determining unit 41 carries out a process of detecting each individual person in each video image in which the distortion has been corrected by the video image correcting unit 43 to calculate the position on each video image of each individual person.
A two-dimensional moving track calculating unit 45 carries out a process of calculating a two-dimensional moving track of each individual person in each video image by tracking the position of each individual person on each video image calculated by the person detecting unit 44. The two-dimensional moving track calculating unit 45 constructs a two-dimensional moving track calculating unit.
A three-dimensional moving track calculating unit 46 carries out a process of performing stereo matching between each two-dimensional moving track in each video image and a two-dimensional moving track in another video image, the two-dimensional moving tracks being calculated by the two-dimensional moving track calculating unit 45, to calculate the degree of match between them and then calculate a three-dimensional moving track of each individual person from the corresponding two-dimensional moving tracks each having a degree of match equal to or larger than a specified value, and also carries out a process of determining a person movement history showing the floor where each individual person has got on the elevator cage and the floor where each individual person has got off the elevator cage by bringing the three-dimensional moving track of each individual person into correspondence with the floors specified by the floor recognition unit 12. The three-dimensional moving track calculating unit 46 constructs a three-dimensional moving track calculating unit.
A two-dimensional moving track graph generating unit 47 of the three-dimensional moving track calculating unit 46 carries out a process of performing a dividing process and a connecting process on two-dimensional moving tracks calculated by the two-dimensional moving track calculating unit 45 to generate a two-dimensional moving track graph.
A track stereo unit 48 of the three-dimensional moving track calculating unit 46 carries out a process of searching through the two-dimensional moving track graph generated by the two-dimensional moving track graph generating unit 47 to determine a plurality of two-dimensional moving track candidates, carrying out stereo matching between each two-dimensional moving track candidate in each video image and a two-dimensional moving track candidate in another video image by taking into consideration the installed positions and installation angles of the plurality of cameras 1 with respect to the reference point in the cage which are calculated by the camera calibration unit 42 to calculate the degree of match between the candidates, and then calculating a three-dimensional moving track of each individual person from the corresponding two-dimensional moving track candidates each having a degree of match equal to or larger than a specified value.
A three-dimensional moving track graph generating unit 49 of the three-dimensional moving track calculating unit 46 carries out a process of performing a dividing process and a connecting process on three-dimensional moving tracks calculated by the track stereo unit 48 to generate a three-dimensional moving track graph.
A track combination estimating unit 50 of the three-dimensional moving track calculating unit 46 carries out a process of searching through the three-dimensional moving track graph generated by the three-dimensional moving track graph generating unit 49 to determine a plurality of three-dimensional moving track candidates, selecting optimal three-dimensional moving tracks from among the plurality of three-dimensional moving track candidates to estimate the number of persons existing in the cage, and also calculating a person movement history showing the floor where each individual person has got on the elevator cage and the floor where each individual person has got off the elevator cage by bringing the optimal three-dimensional moving track of each individual person into correspondence with the floors specified by the floor recognition unit 12.
In
A time series information display unit 52 carries out a process of performing graphical representation of person movement histories calculated by the three-dimensional moving track calculating unit 46 of the person tracking unit 13 in time series.
A summary display unit 53 carries out a process of calculating statistics on the person movement histories calculated by the three-dimensional moving track calculating unit 46 to display the statistical results of the person movement histories.
An operation related information display unit 54 carries out a process of displaying information about the operation of the elevator with reference to the person movement histories calculated by the three-dimensional moving track calculating unit 46.
A sorted data display unit 55 carries out a process of sorting and displaying the person movement histories calculated by the three-dimensional moving track calculating unit 46.
In
Next, the operation of the person tracking device will be explained.
First, an outline of the operation of the person tracking device of
When the plurality of cameras 1 start capturing video images of the inside of the elevator cage, the video image acquiring unit 2 acquires the video images of the inside of the elevator cage from the plurality of cameras 1 and outputs each of the video images to the video analysis unit 3 (step ST1).
When receiving each of the video images captured by the plurality of cameras 1 from the video image acquiring unit 2, the door opening and closing recognition unit 11 of the video analysis unit 3 analyzes each of the video images to specify the opening and closing times of the door of the elevator (step ST2).
More specifically, the door opening and closing recognition unit 11 analyzes each of the video images to specify the time when the door of the elevator is open and the time when the door is closed.
When receiving the video images captured by the plurality of cameras 1 from the video image acquiring unit 2, the floor recognition unit 12 of the video analysis unit 3 analyzes each of the video images to specify the floor where the elevator is located (i.e., the stopping floor of the elevator) at each time (step ST3).
When receiving the video images captured by the plurality of cameras 1 from the video image acquiring unit 2, the person tracking unit 13 of the video analysis unit 3 analyzes each of the video images to detect each individual person existing in the cage.
The person tracking unit 13 then refers to the result of the detection of each individual person and the opening and closing times of the door specified by the door opening and closing recognition unit 11 and tracks each individual person existing in the cage to calculate a three-dimensional moving track of each individual person.
The person tracking unit 13 also calculates a person movement history showing the floor where each individual person has got on the elevator and the floor where each individual person has got off the elevator by bringing the three-dimensional moving track of each individual person into correspondence with the floors specified by the floor recognition unit 12 (step ST4).
The image analysis result display unit 4 displays the person movement history on the display after the video analysis unit 3 calculates the person movement history and so on (step ST5).
Next, the process carried out by the video analysis unit 3 in the person tracking device of
First, the door opening and closing recognition unit 11 selects a door region in which the door is shot from one of the video images of the elevator cage shot by the plurality of cameras 1 (step ST11).
In the example of
The background image registration unit 21 of the door opening and closing recognition unit 11 acquires an image of the door region in the elevator in a state where the door is closed (e.g., a video image captured by one camera 1 when the door is closed: refer to FIG. 8(B)), and registers the image as a background image (step ST12).
After the background image registration unit 21 registers the background image, the background difference unit 22 of the door opening and closing recognition unit 11 receives the video image captured by the camera 1 which varies from moment to moment from the video image acquiring unit 2 and calculates the difference between the video image of the door region in the video image captured by the camera 1 and the above-mentioned background image in such a way as shown in
When calculating the difference between the video image of the door region and the background image, and determining that the difference is large (e.g., when the difference is larger than a predetermined threshold and the video image of the door region greatly differs from the background image), the background difference unit 22 sets a flag Fb for door opening and closing determination to “1” because there is a high possibility that the door is open.
In contrast, when determining that the difference is small (e.g., when the difference is smaller than the predetermined threshold and the video image of the door region hardly differs from the background image), the background difference unit 22 sets the flag Fb for door opening and closing determination to “0” because there is a high possibility that the door is closed.
The optical flow calculating unit 23 of the door opening and closing recognition unit 11 receives the video image captured by the camera 1 which varies from moment to moment from the video image acquiring unit 2, and calculates a motion vector showing the direction of movement of the door from a change of the video image (two continuous image frames) of the door region in the video image captured by the camera 1 (step ST14).
For example, in a case in which the door of the elevator is a central one, as shown in, when the direction of movement of the door shown by the motion vector is an outward one, the optical flow calculating unit 23 sets a flag Fo for door opening and closing determination to “1” because there is a high possibility that the door is opening.
In contrast, when the direction of movement of the door shown by the motion vector is an inward one, the optical flow calculating unit 23 sets the flag Fo for door opening and closing determination to “0” because there is a high possibility that the door is closing.
Because the motion vector does not show any direction of movement of the door when the door of the elevator is not moving (when a state in which the door is open or closed is maintained), the optical flow calculating unit sets the flag Fo for door opening and closing determination to “2”.
After the background difference unit 22 sets the flag Fb for door opening and closing determination and the optical flow calculating unit 23 sets the flag Fo for door opening and closing determination, the door opening and closing time specifying unit 24 of the door opening and closing recognition unit 11 determines the open or closed state of the door with reference to those flags Fb and Fo to specify the opening and closing times of the door (step ST15).
More specifically, the door opening and closing time specifying unit 24 determines that the door is closed during a time period during which both the flag Fb and the flag Fo are “0” and during a time period during which the flag Fb is “0” and the flag Fo is “2”, and also determines that the door is open during a time period during which at least one of the flag Fb and the flag Fo is “1”.
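As an illustration only, the following Python sketch combines the two flags into a door state per frame and reports the times at which the state changes; the helper names and the treatment of state transitions are assumptions of this sketch, not part of the embodiment.

```python
def door_state(fb, fo):
    # Fb: background-difference flag (1 = door region differs greatly from the
    #     closed-door background image, 0 = it hardly differs).
    # Fo: optical-flow flag (1 = outward motion / opening,
    #     0 = inward motion / closing, 2 = no motion detected).
    # Per the rule above: the door is open while at least one flag is "1",
    # and closed while Fb is "0" and Fo is "0" or "2".
    return "open" if fb == 1 or fo == 1 else "closed"


def door_opening_closing_times(fb_seq, fo_seq, times):
    """Scan the per-frame flags and report the times at which the door state
    changes, i.e. the opening and closing times."""
    events, prev = [], None
    for t, fb, fo in zip(times, fb_seq, fo_seq):
        state = door_state(fb, fo)
        if state != prev:
            events.append((t, state))
            prev = state
    return events
```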
In addition, the door opening and closing time specifying unit 24 sets the door index di of each time period during which the door is closed to “0”, as shown in
The background image updating unit 25 of the door opening and closing recognition unit 11 receives the video image of the camera 1 which varies from moment to moment from the video image acquiring unit 2, and updates the background image registered into the background image registration unit 21 (i.e., the background image which the background difference unit 22 uses at the next time) by using the video image of the door region in the video image captured by the camera 1 (step ST16).
As a result, even when a video image of a region in the vicinity of the door varies due to an illumination change, for example, the person tracking device can carry out the background difference process adaptively according to the change.
First, the floor recognition unit 12 selects an indicator region in which the indicator showing the floor where the elevator is located is shot from one of the video images of the inside of the elevator cage shot by the plurality of cameras 1 (step ST21).
In an example of
The template image registering unit 31 of the floor recognition unit 12 registers an image of each of the numbers showing the corresponding floor in the selected indicator region as a template image (step ST22).
For example, in a case in which the elevator moves from the first floor to the ninth floor, the template image registering unit successively registers number images (“1”, “2”, “3”, “4”, “5”, “6”, “7”, “8”, and “9”) of the numbers respectively showing the floors as template images, as shown in
After the template image registering unit 31 registers the template images, the template matching unit 32 of the floor recognition unit 12 receives the video image captured by the camera 1 which varies from moment to moment from the video image acquiring unit 2, and carries out template matching between the video image of the indicator region in the video image captured by the camera 1 and the above-mentioned template images to specify the floor where the elevator is located at each time (step ST23).
Because an existing normalized cross correlation method or the like can be used as a method of carrying out the template matching, a detailed explanation of this method will be omitted hereafter.
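As an illustration of this well-known step, a minimal sketch using OpenCV's normalized cross correlation matching is shown below; the function name, the choice of matching score, and the use of a dictionary of digit templates are assumptions of this sketch.

```python
import cv2


def recognize_floor(indicator_img, templates):
    """Match the indicator-region image against the registered number
    templates by normalized cross correlation; the best-scoring template
    gives the floor where the elevator is located.
    `templates` maps a floor label (e.g. "1" ... "9") to a grayscale
    template image of that number."""
    gray = cv2.cvtColor(indicator_img, cv2.COLOR_BGR2GRAY)
    best_floor, best_score = None, -1.0
    for floor, tmpl in templates.items():
        result = cv2.matchTemplate(gray, tmpl, cv2.TM_CCOEFF_NORMED)
        _, score, _, _ = cv2.minMaxLoc(result)
        if score > best_score:
            best_floor, best_score = floor, score
    return best_floor, best_score
```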
The template image updating unit 33 of the floor recognition unit 12 receives the video image captured by the camera 1 which varies from moment to moment from the video image acquiring unit 2, and uses a video image of the indicator region in the video image captured by the camera 1 to update the template images registered in the template image registering unit 31 (i.e., the template images which the template matching unit 32 uses at the next time) (step ST24).
As a result, even when a video image of a region in the vicinity of the indicator varies due to an illumination change, for example, the person tracking device can carry out the template matching process adaptively according to the change.
First, each of the cameras 1 shoots the calibration pattern before the camera calibration unit 42 of the person tracking unit 13 determines the camera parameters of each of the cameras 1 (step ST31).
The video image acquiring unit 2 acquires the video image of the calibration pattern captured by each of the cameras 1, and outputs the video image of the calibration pattern to the camera calibration unit 42.
As the calibration pattern used in this embodiment, a black and white checkered flag pattern having a known size (refer to
The calibration pattern is shot by the plurality of cameras 1 at about 1 to 20 different positions and at about 1 to 20 different angles.
When receiving the video image of the calibration pattern captured by each of the cameras 1 from the video image acquiring unit 2, the camera calibration unit 42 analyzes the degree of distortion of the video image of the calibration pattern to determine the camera parameters of each of the cameras 1 (e.g., the parameters regarding a distortion of the lens of each camera, the focal length, optical axis and principal point of each camera) (step ST32).
Because the method of determining the camera parameters is a well-known technology, a detailed explanation of the method will be omitted hereafter.
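For reference, a minimal sketch of this well-known calibration step, assuming OpenCV and a checkerboard-style calibration pattern, might look as follows; the board size and square size are placeholder values.

```python
import cv2
import numpy as np


def calibrate_intrinsics(images, board_size=(7, 5), square_size_m=0.1):
    """Estimate the intrinsic camera parameters (camera matrix with focal
    length, optical axis / principal point, and the lens distortion
    coefficients) from several views of a black-and-white checkered pattern.
    board_size is the number of inner corners per row and column."""
    # 3-D coordinates of the pattern corners in the pattern plane (Z = 0).
    objp = np.zeros((board_size[0] * board_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board_size[0], 0:board_size[1]].T.reshape(-1, 2)
    objp *= square_size_m

    h, w = images[0].shape[:2]
    obj_points, img_points = [], []
    for img in images:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        found, corners = cv2.findChessboardCorners(gray, board_size)
        if found:
            obj_points.append(objp)
            img_points.append(corners)

    # Returns the reprojection error, the camera matrix and the distortion
    # coefficients (plus per-view poses, unused here).
    err, camera_matrix, dist_coeffs, rvecs, tvecs = cv2.calibrateCamera(
        obj_points, img_points, (w, h), None, None)
    return camera_matrix, dist_coeffs
```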
Next, in order for the camera calibration unit 42 to determine the installed positions and installation angles of the plurality of cameras 1, the plurality of cameras 1 simultaneously shoot the identical calibration pattern having a known size after the plurality of cameras 1 are installed in an upper portion of the elevator cage (step ST33).
For example, as shown in
At that time, the position and angle of the calibration pattern laid out on the floor of the cage with respect to a reference point in the cage (e.g., the entrance of the cage) are measured as an offset, and the inside dimension of the cage is also measured.
In the example of
As an alternative, as shown in
When receiving the video images of the calibration pattern captured by the plurality of cameras 1 from the video image acquiring unit 2, the camera calibration unit 42 calculates the installed positions and installation angles of the plurality of cameras 1 with respect to the reference point in the elevator cage by using both the video images of the calibration pattern and the camera parameters of the plurality of cameras 1 (step ST34).
More specifically, when a black and white checkered flag pattern is used as the calibration pattern, for example, the camera calibration unit 42 calculates the relative positions and relative angles of the plurality of cameras 1 with respect to the checker pattern shot by the plurality of cameras 1.
By then adding the offset of the checkered pattern which is measured beforehand (the position and angle of the checkered pattern with respect to the entrance of the cage which is the reference point in the cage) to the relative position and relative angle of each of the plurality of cameras 1, the camera calibration unit calculates the installed positions and installation angles of the plurality of cameras 1 with respect to the reference point in the cage.
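A rough sketch of this extrinsic calculation is shown below. It assumes that OpenCV's solvePnP is used to obtain the camera's pose relative to the checkered pattern, and that the measured offset is supplied as a rotation R_offset and translation t_offset of the pattern in the cage frame; it is not the patented formulation itself.

```python
import cv2


def camera_pose_in_cage(pattern_corners_3d, pattern_corners_2d,
                        camera_matrix, dist_coeffs, R_offset, t_offset):
    """Estimate one camera's installed position and angle with respect to the
    reference point in the cage (e.g. the entrance), given the checkered
    pattern corners seen by that camera and the pattern's measured offset.
    R_offset (3x3) and t_offset (3x1) express the pattern pose in the cage
    frame; the corner arrays follow OpenCV's solvePnP conventions."""
    # Relative pose of the pattern as seen from the camera.
    ok, rvec, tvec = cv2.solvePnP(pattern_corners_3d, pattern_corners_2d,
                                  camera_matrix, dist_coeffs)
    R_cam_pattern, _ = cv2.Rodrigues(rvec)

    # Invert to get the camera pose in the pattern frame.
    R_pattern_cam = R_cam_pattern.T
    t_pattern_cam = -R_pattern_cam @ tvec

    # Add the measured offset of the pattern to express the camera's installed
    # position and angle with respect to the reference point in the cage.
    R_cage_cam = R_offset @ R_pattern_cam
    t_cage_cam = R_offset @ t_pattern_cam + t_offset
    return R_cage_cam, t_cage_cam
```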
In contrast, when the four corners of the floor of the cage and three corners of the ceiling are used as the calibration pattern, as shown in
In this case, it is possible to automatically determine the installed position and installation angle of each camera 1 by simply installing the camera 1 in the cage.
When the person tracking unit 13 carries out a detecting process of detecting a person, an analysis process of analyzing a moving track, or the like, the plurality of cameras 1 repeatedly shoot the inside of the elevator cage while the elevator is actually operating.
The video image acquiring unit 2 acquires the plurality of video images of the inside of the elevator cage shot by the plurality of cameras 1 from moment to moment (step ST41).
Every time it acquires the plurality of video images captured by the plurality of cameras 1 from the video image acquiring unit 2, the video image correcting unit 43 of the person tracking unit 13 corrects a distortion in each of the plurality of video images by using the camera parameters calculated by the camera calibration unit 42 to generate a normalized image which is a distortion-free video image (step ST42).
Because the method of correcting a distortion in each of the plurality of video images is a well-known technology, a detailed explanation of the method will be omitted hereafter.
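For completeness, the correction can be illustrated with a short OpenCV sketch, assuming the camera matrix and distortion coefficients obtained by the camera calibration unit 42:

```python
import cv2


def normalize_frame(frame, camera_matrix, dist_coeffs):
    """Remove the lens distortion from one video frame by using the camera
    parameters estimated during calibration, yielding a normalized image."""
    return cv2.undistort(frame, camera_matrix, dist_coeffs)
```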
After the video image correcting unit 43 generates the normalized images from the video images captured by the plurality of cameras 1, the person detecting unit 44 of the person tracking unit 13 detects, as a person, appearance features of each human body which exists in each normalized image to calculate the position (image coordinates) of the person on each normalized image and also calculate the person's degree of certainty (step ST43).
The person detecting unit 44 then applies a camera perspective filter to the person's image coordinates and deletes any person detection result that has an improper size.
For example, when the person detecting unit 44 detects the head (one appearance feature) of each human body, the image coordinates of the person show the coordinates of the center of a rectangle surrounding a region including the head.
Furthermore, the degree of certainty is an index showing how much similarity there is between the corresponding object detected by the person detecting unit 44 and a human being (a human head). The higher the degree of certainty of an object, the higher the probability that the object is a human being; the lower the degree of certainty, the lower the probability that the object is a human being.
Hereafter, the process of detecting a person which is carried out by the person detecting unit 44 will be explained concretely.
In the case of
In this case, as the detecting method of detecting a head, a face detection method disclosed by the following reference 1 can be used.
More specifically, Haar-basis-like patterns which are called “Rectangle Features” are selected by using Adaboost and many weak classifiers are acquired, so that the sum of the outputs of these weak classifiers, compared with a proper threshold, can be used as the degree of certainty.
Furthermore, a road sign detecting method disclosed by the following reference 2 can be applied as the detecting method of detecting a head, so that the image coordinates and the degree of certainty of each detected head can be calculated.
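Purely as an illustration of the boosted-classifier certainty described above (all helper names and the sliding-window scheme are hypothetical), the degree of certainty can be thought of as the margin of the summed weak-classifier outputs over the threshold:

```python
def head_certainty(weak_scores, threshold):
    """Illustrative degree of certainty: the sum of the outputs of the
    Adaboost-selected weak classifiers for one candidate window, compared
    against a proper threshold."""
    return sum(weak_scores) - threshold


def detect_heads(candidate_windows, evaluate_weak_classifiers, threshold):
    """Keep every candidate window whose certainty is positive and report the
    image coordinates of the center of the surrounding rectangle together
    with the degree of certainty.  `evaluate_weak_classifiers` is a
    hypothetical helper returning the weak-classifier outputs for a window."""
    detections = []
    for (x, y, w, h) in candidate_windows:
        certainty = head_certainty(
            evaluate_weak_classifiers(x, y, w, h), threshold)
        if certainty > 0:
            detections.append(((x + w / 2.0, y + h / 2.0), certainty))
    return detections
```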
In the case of
As shown in
First, the person detecting unit 44 determines a direction vector V passing through both the point A on a video image captured by a camera 1 and the center of the camera 1.
The person detecting unit 44 then sets up a maximum height (e.g., 200 cm), a minimum height (e.g., 100 cm), and a typical head size (e.g., 30 cm) of persons which can be assumed to get on the elevator.
Next, the person detecting unit 44 projects the head of a person having the maximum height onto the camera 1, and defines the size of a rectangle on the image surrounding the projected head as the maximum detection rectangular head size at the point A.
Similarly, the person detecting unit 44 projects the head of a person having the minimum height onto the camera 1, and defines the size of a rectangle on the image surrounding the projected head as the minimum detection rectangular head size at the point A.
After defining both the maximum detection rectangular head size at the point A and the minimum detection rectangular head size at the point A, the person detecting unit 44 compares each person's detection result at the point A with the maximum detection rectangular head size and the minimum detection rectangular head size. When each person's detection result at the point A is larger than the maximum rectangular head size or is smaller than the minimum rectangular head size, the person detecting unit 44 determines the detection result as an erroneous detection and deletes this detection result.
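A simplified sketch of such a perspective filter is shown below. It assumes a pinhole camera model, a cage coordinate frame whose Z axis points upward, and a camera pose (R_cage_cam, t_cage_cam) such as the one estimated by the camera calibration unit, and it approximates the projected head size as focal length times head size divided by distance; this is a simplification, not the patented computation.

```python
import numpy as np


def head_size_bounds_at_pixel(pixel, camera_matrix, R_cage_cam, t_cage_cam,
                              max_height=2.0, min_height=1.0, head_size=0.3):
    """For an image point A, cast the viewing ray into the cage frame, find
    where the ray reaches the assumed maximum and minimum person heights, and
    convert the head size at those distances into expected rectangle sizes
    (in pixels).  Returns (maximum, minimum) detection rectangle size."""
    fx = camera_matrix[0, 0]
    pixel_h = np.array([pixel[0], pixel[1], 1.0])
    ray_cam = np.linalg.inv(camera_matrix) @ pixel_h     # ray in camera frame
    ray_cage = R_cage_cam @ ray_cam                      # ray in cage frame
    cam_pos = t_cage_cam.reshape(3)

    sizes = []
    for height in (max_height, min_height):
        # Point on the viewing ray whose vertical coordinate equals the height.
        s = (height - cam_pos[2]) / ray_cage[2]
        point = cam_pos + s * ray_cage
        depth = np.linalg.norm(point - cam_pos)          # distance to the head
        sizes.append(fx * head_size / depth)             # pinhole projection
    return max(sizes), min(sizes)


def is_valid_head(detected_size, max_size, min_size):
    """Reject a detection larger than the maximum or smaller than the minimum
    expected rectangular head size at that image point."""
    return min_size <= detected_size <= max_size
```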
Every time the person detecting unit 44 calculates the image coordinates of each individual person by detecting each individual person from each normalized image (image frame) which is generated from moment to moment by the video image correcting unit 43, the two-dimensional moving track calculating unit 45 determines a sequence of points each shown by the image coordinates to calculate a two-dimensional moving track of each individual person who is moving along the sequence of points (step ST44).
Hereafter, the process of determining a two-dimensional moving track which is carried out by the two-dimensional moving track calculating unit 45 will be explained concretely.
First, the two-dimensional moving track calculating unit 45 acquires the person detection results (the image coordinates of persons) in the image frame at a time t which are determined by the person detecting unit 44, and assigns a counter to each of the person detection results (step ST51).
For example, as shown in
In this case, the two-dimensional moving track calculating unit assigns a counter to each of the person detection results, and initializes the value of the counter to “0” when starting tracking each person.
Next, the two-dimensional moving track calculating unit 45 uses each person detection result in the image frame at the time t as a template image to search for the image coordinates of the corresponding person in the image frame at the next time t+1 shown in
In this case, as a method of searching for the image coordinates of the person, a normalized cross correlation method which is a known technology, or the like can be used, for example.
In this case, the two-dimensional moving track calculating unit uses an image of a person region at the time t as a template image, determines, by using the normalized cross correlation method, the image coordinates of the rectangular region at the time (t+1) having the highest correlation value with the template, and outputs the image coordinates.
As another method of searching for the image coordinates of the person, a correlation coefficient of a feature described in above-mentioned reference 2 can be used, for example.
In this case, a correlation coefficient of a feature in each of a plurality of subregions included in each person region at the time t is calculated, and a vector having the correlation coefficients as its components is defined as a template vector of the corresponding person. Then, a region whose distance to the template vector is minimized at the next time (t+1) is searched for, and the image coordinates of the region are outputted as the search result about the person.
In addition, as another method of searching for the image coordinates of the person, a method using a distributed covariance matrix of a feature described in the following reference 3 can be used. By using this method, person tracking can be carried out to determine the person's image coordinates from moment to moment.
Next, the two-dimensional moving track calculating unit 45 acquires the person detection results (each person's image coordinates) in the image frame at the time t+1 which are calculated by the person detecting unit 44 (step ST53).
For example, the two-dimensional moving track calculating unit acquires the person detection results as shown in
Next, the two-dimensional moving track calculating unit 45 updates each person's information which the person tracking device is tracking by using both the person image coordinates calculated in step ST52 and the person image coordinates acquired in step ST53 (step ST54).
For example, as shown in
In contrast, when the person detecting unit has failed in person detection of the person B at the time (t+1), as shown in
Thus, when a detection result exists around the search result, the two-dimensional moving track calculating unit 45 increments the value of the counter by one, whereas when no detection result exists around the search result, the two-dimensional moving track calculating unit decrements the value of the counter by one.
As a result, the value of the counter becomes large as the number of times that the person is detected increases, while the value of the counter becomes small as the number of times that the person is detected decreases.
Furthermore, the two-dimensional moving track calculating unit 45 can accumulate the degree of certainty of each person detection in step ST54.
For example, when a detection result exists around the search result, the two-dimensional moving track calculating unit 45 accumulates the degree of certainty of the corresponding person detection result, whereas when no detection result exists around the search result, the two-dimensional moving track calculating unit 45 does not accumulate any degree of certainty. As a result, the larger the number of times the person is detected, the higher the accumulated degree of certainty of the corresponding two-dimensional moving track.
The two-dimensional moving track calculating unit 45 then determines whether or not to end the tracking process (step ST55).
As a criterion by which to determine whether or not to end the tracking process, the value of the counter described in step ST54 can be used.
For example, when the value of the counter determined in step ST54 is lower than a fixed threshold, the two-dimensional moving track calculating unit determines that the object is not a person and then ends the tracking.
As an alternative, the two-dimensional moving track calculating unit can determine whether or not to end the tracking process by comparing the accumulated degree of certainty described in step ST54 with a predetermined threshold.
For example, when the degree of accumulated certainty is lower than the predetermined threshold, the two-dimensional moving track calculating unit determines that the object is not a person and then ends the tracking.
By thus determining whether or not to end the tracking process, the person tracking device can prevent itself from erroneously tracking anything which is not a human being.
By repeatedly performing the image template matching process in steps ST52 to ST55 on the frame images, from which persons who have entered the elevator are detected from moment to moment, the two-dimensional moving track calculating unit 45 can express the movement of each person as a sequence of the image coordinates of the moving person, i.e., as a sequence of points. The two-dimensional moving track calculating unit calculates this sequence of points as a two-dimensional moving track of each moving person.
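The following Python sketch summarizes steps ST51 to ST55 under several assumptions: the helper functions search_by_template and near, the way new tracks are started, and the counter threshold are illustrative stand-ins rather than the embodiment's exact procedure.

```python
def track_2d(frames_detections, search_by_template, near, count_threshold=-3):
    """frames_detections: one list per time step of (image_coords, certainty)
    pairs produced by the person detecting unit.
    search_by_template(track, t): template-matched image coordinates of the
    tracked person at time t (e.g. by normalized cross correlation).
    near(p, q): True when a detection p lies around the search result q."""
    active = []   # each item: {"points": [...], "counter": int, "certainty": float}

    for t, detections in enumerate(frames_detections):
        if t == 0:
            # ST51: assign a counter to every person detection result.
            active = [{"points": [p], "counter": 0, "certainty": c}
                      for p, c in detections]
            continue
        for track in active:
            # ST52: search for the person's image coordinates at time t.
            searched = search_by_template(track, t)
            track["points"].append(searched)
            # ST53/ST54: if a detection result exists around the search result,
            # increment the counter and accumulate its degree of certainty;
            # otherwise decrement the counter.
            match = next(((p, c) for p, c in detections if near(p, searched)), None)
            if match is None:
                track["counter"] -= 1
            else:
                track["counter"] += 1
                track["certainty"] += match[1]
        # ST55: end the tracking of anything whose counter falls below the
        # threshold, judging that it is not a person.
        active = [tr for tr in active if tr["counter"] >= count_threshold]
        # Detections that no active track accounts for start new tracks.
        last_points = [tr["points"][-1] for tr in active]
        for p, c in detections:
            if not any(near(p, q) for q in last_points):
                active.append({"points": [p], "counter": 0, "certainty": c})
    # Each surviving sequence of points is a two-dimensional moving track.
    return [tr["points"] for tr in active]
```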
In this case, when the tracking of a person is ended on the way due to shading or the like, the person tracking device can simply restart tracking the person after the shading or the like is removed.
In this Embodiment 1, the two-dimensional moving track calculating unit 45 tracks each person's image coordinates calculated by the person detecting unit 44 in the forward direction of time (the direction from the present to the future), as mentioned above. The two-dimensional moving track calculating unit 45 can further track each person's image coordinates in the backward direction of time (the direction from the present to the past), and can calculate two-dimensional moving tracks of each person along the backward direction of time and along the forward direction of time.
By thus tracking each person's image coordinates in the backward direction of time and in the forward direction of time, the person tracking device can calculate each person's two-dimensional moving track while reducing the risk of missing each person's two-dimensional moving track as much as possible. For example, even when failing in the tracking of a person in the forward direction of time, the person tracking device can eliminate the risk of missing the person's two-dimensional moving track as long as it succeeds in tracking the person in the backward direction of time.
After the two-dimensional moving track calculating unit 45 calculates the two-dimensional moving tracks of each individual person, the two-dimensional moving track graph generating unit 47 performs a dividing process and a connecting process on the two-dimensional moving tracks of each individual person to generate a two-dimensional moving track graph (step ST45 of
More specifically, the two-dimensional moving track graph generating unit 47 searches through the set of two-dimensional moving tracks of each individual person calculated by the two-dimensional moving track calculating unit 45 for two-dimensional moving tracks close to one another with respect to space or time, and then performs processes, such as division and connection, on them to generate a two-dimensional moving track graph having the two-dimensional moving tracks as vertices of the graph, and having connected two-dimensional moving tracks as directed sides of the graph.
Hereafter, the process carried out by the two-dimensional moving track graph generating unit 47 will be explained concretely.
First, an example of two-dimensional moving tracks close to one another with respect to space, which are processed by the two-dimensional moving track graph generating unit 47, will be mentioned.
For example, as shown in
In the example of
Furthermore, because the shortest distance d between the end point T1E of the two-dimensional moving track T1 and the two-dimensional moving track T3 falls within the fixed distance, it can be said that the two-dimensional moving track T3 exists close to the end point T1E of the two-dimensional moving track T1 with respect to space.
In contrast, because a two-dimensional moving track T4 has a start point which is distant from the end point T1E of the two-dimensional moving track T1, it can be said that the two-dimensional moving track T4 does not exist close to the two-dimensional moving track T1 with respect to space.
Next, an example of two-dimensional moving tracks close to one another with respect to time, which are processed by the two-dimensional moving track graph generating unit 47, will be mentioned.
For example, assuming that a two-dimensional moving track T1 shown in
In contrast with this, when the length of the time interval |t3−t2| exceeds the constant value, it is defined that the two-dimensional moving track T2 does not exist close to the two-dimensional moving track T1 with respect to time.
Although the examples of the two-dimensional moving track close to the end point T1E of the two-dimensional moving track T1 with respect to space and with respect to time are described above, two-dimensional moving tracks close to the start point of a two-dimensional moving track with respect to space and with respect to time can be defined similarly.
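These closeness tests can be illustrated with two small helper functions; this is only a sketch, and the representation of a track as a list of (t, x, y) samples and the threshold parameters are assumptions.

```python
def close_in_space(point, track, max_dist):
    """A track is close to a given end point (or start point) with respect to
    space when the shortest distance from the point to the track falls within
    a fixed distance.  `point` is (x, y); `track` is a list of (t, x, y)."""
    return min(((x - point[0]) ** 2 + (y - point[1]) ** 2) ** 0.5
               for _, x, y in track) <= max_dist


def close_in_time(end_time, start_time, max_gap):
    """A track starting at `start_time` is close to a track ending at
    `end_time` with respect to time when the gap between them is within a
    constant value."""
    return abs(start_time - end_time) <= max_gap
```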
Next, the track dividing process and the track connecting process carried out by the two-dimensional moving track graph generating unit 47 will be explained.
When another two-dimensional moving track A exists close to the start point S of a two-dimensional moving track calculated by the two-dimensional moving track calculating unit 45 with respect to time and with respect to space, the two-dimensional moving track graph generating unit 47 divides the other two-dimensional moving track A into two portions at a point near the start point S.
For example, when two-dimensional moving tracks {T1, T2, T4, T6, T7} are calculated by the two-dimensional moving track calculating unit 45, as shown in
Therefore, the two-dimensional moving track graph generating unit 47 divides the two-dimensional moving track T2 into two portions at a point near the start point of the two-dimensional moving track T1 to newly generate a two-dimensional moving track T2 and a two-dimensional moving track T3, and acquires a set of two-dimensional moving tracks {T1, T2, T4, T6, T7, T3} as shown in
Furthermore, when another two-dimensional moving track A exists close to the end point S of a two-dimensional moving track calculated by the two-dimensional moving track calculating unit 45 with respect to time and space, the two-dimensional moving track graph generating unit 47 divides the other two-dimensional moving track A into two portions at a point near the end point S.
In the example of
Therefore, the two-dimensional moving track graph generating unit 47 divides the two-dimensional moving track T4 into two portions at a point near the end point of the two-dimensional moving track T1 to newly generate a two-dimensional moving track T4 and a two-dimensional moving track T5, and acquires a set of two-dimensional moving tracks {T1, T2, T4, T6, T7, T3, T5} as shown in
When the start point of another two-dimensional moving track B exists close to the end point of a two-dimensional moving track A with respect to space and with respect to time in the set of two-dimensional moving tracks acquired through the track dividing process, the two-dimensional moving track graph generating unit 47 connects the two two-dimensional moving tracks A and B to each other.
More specifically, the two-dimensional moving track graph generating unit 47 acquires a two-dimensional moving track graph by defining each two-dimensional moving track as a vertex of a graph, and also defining each pair of two-dimensional moving tracks connected to each other as a directed side of the graph.
In the example of
In this case, the two-dimensional moving track graph generating unit 47 generates a two-dimensional moving track graph having information about the two-dimensional moving tracks T1 to T7 as the vertices of the graph, and information about directed sides which are pairs of two-dimensional moving tracks: (T1, T5), (T2, T1), (T2, T3), (T3, T4), (T3, T6), (T4, T5), and (T6, T7).
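As an illustration of the connecting process (the dividing process is omitted here for brevity), the sketch below builds the directed edges of the two-dimensional moving track graph, reusing the closeness idea above; representing each track as a time-ordered list of (t, x, y) and using track indices as vertex identifiers are assumptions of this sketch.

```python
def connect_tracks(tracks, max_dist, max_gap):
    """Build the directed edges of the two-dimensional moving track graph:
    each track is a vertex, and a directed edge (A, B) is added when the
    start point of track B lies close to the end point of track A with
    respect to both space and time."""
    edges = []
    for a, track_a in enumerate(tracks):
        t_end, x_end, y_end = track_a[-1]
        for b, track_b in enumerate(tracks):
            if a == b:
                continue
            t_start, x_start, y_start = track_b[0]
            dist = ((x_start - x_end) ** 2 + (y_start - y_end) ** 2) ** 0.5
            if dist <= max_dist and abs(t_start - t_end) <= max_gap:
                edges.append((a, b))
    return edges
```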
Furthermore, the two-dimensional moving track graph generating unit 47 can not only connect two-dimensional moving tracks in the forward direction of time (in the direction toward the future), but also generate a graph in the backward direction of time (in the direction toward the past). In this case, the two-dimensional moving track graph generating unit can connect two-dimensional moving tracks to each other along a direction from the end point of each two-dimensional moving track toward the start point of another two-dimensional moving track.
In the example of
While a person is being tracked, when another person wearing clothes of the same color as the person's clothes exists in a video image, or when another person overlaps the person in a video image and therefore shades the person, the person's two-dimensional moving track may branch off into two parts or may be interrupted with respect to time. Therefore, as shown in
Therefore, the two-dimensional moving track graph generating unit 47 can hold information about a plurality of moving paths for such a person by generating a two-dimensional moving track graph.
After the two-dimensional moving track graph generating unit 47 generates the two-dimensional moving track graph, the track stereo unit 48 determines a plurality of two-dimensional moving track candidates by searching through the two-dimensional moving track graph, carries out stereo matching between each two-dimensional moving track candidate in each video image and a two-dimensional moving track candidate in any other video image by taking into consideration the installed positions and installation angles of the plurality of cameras 1 with respect to the reference point in the cage calculated by the camera calibration unit 42 to calculate the degree of match between the two-dimensional moving track candidates, and calculates three-dimensional moving tracks of each individual person from the two-dimensional moving track candidates each having a degree of match equal to or larger than a specified value (step ST46 of
Hereafter, the process carried out by the track stereo unit 48 will be explained concretely.
First, a method of searching through a two-dimensional moving track graph to list two-dimensional moving track candidates will be described.
Hereafter, it is assumed that, as shown in
At this time, the track stereo unit 48 searches through the two-dimensional moving track graph G to list all connected two-dimensional moving track candidates.
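One way to list all connected candidates is a depth-first enumeration of directed paths through the graph, as in the following sketch; whether single-vertex paths are also counted as candidates is an assumption made here.

```python
def list_track_candidates(vertices, edges):
    """Enumerate connected two-dimensional moving track candidates by a
    depth-first search over the directed track graph: every directed path
    (including a single vertex) is listed as one candidate."""
    successors = {v: [] for v in vertices}
    for a, b in edges:
        successors[a].append(b)

    candidates = []

    def extend(path):
        candidates.append(list(path))
        for nxt in successors[path[-1]]:
            if nxt not in path:          # guard against cycles
                path.append(nxt)
                extend(path)
                path.pop()

    for v in vertices:
        extend([v])
    return candidates


# Example: vertices ["T1", "T2"] with one edge ("T1", "T2") yield the
# candidates ["T1"], ["T1", "T2"] and ["T2"].
```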
In the example of
First, the track stereo unit 48 acquires one two-dimensional moving track corresponding to each of camera images captured by the plurality of cameras 1 (step ST61), and calculates a time interval during which each two-dimensional moving track overlaps another two-dimensional moving track (step ST62).
Hereafter, the process of calculating the time interval during which each two-dimensional moving track overlaps another two-dimensional moving track will be explained concretely.
Hereafter, it is assumed that, as shown in
Furthermore, β1 shows a two-dimensional moving track of the person A in the video image captured by the camera 1β, and β2 shows a two-dimensional moving track of the person B in the video image captured by the camera 1β.
For example, when, in step ST61, acquiring the two-dimensional moving track α1 and the two-dimensional moving track β1 which are shown in
where Xa1(t) and Xb1(t) are the person A's two-dimensional image coordinates at the time t. The two-dimensional moving track α1 shows that its image coordinates are recorded during the time period from the time T1 to the time T2, and the two-dimensional moving track β1 shows that its image coordinates are recorded during the time period from the time T3 to the time T4.
In this case, because a time interval during which the two-dimensional moving track α1 and the two-dimensional moving track β1 overlap each other extends from the time T3 to the time T2, the track stereo unit 48 calculates this time interval.
After calculating the time interval during which each two-dimensional moving track overlaps another two-dimensional moving track, the track stereo unit 48 carries out stereo matching between the corresponding sequences of points which form the two-dimensional moving tracks at each time within the overlapping time interval by using the installed position and installation angle of each of the cameras 1 which is calculated by the camera calibration unit 42 to calculate the distance between the sequences of points (step ST63).
Hereafter, the process of carrying out stereo matching between the sequences of points will be explained concretely.
As shown in, the track stereo unit 48 determines, at each time t within the overlapping time interval, a straight line Va1(t) passing through the image coordinates Xa1(t) on the video image captured by the camera 1α and the center of the camera 1α, and a straight line Vb1(t) passing through the image coordinates Xb1(t) and the center of the camera 1β.
Furthermore, the track stereo unit 48 calculates the distance d(t) between the straight line Va1(t) and the straight line Vb1(t), and at the same time calculates a point of intersection of the straight line Va1(t) and the straight line Vb1(t) as a three-dimensional position Z(t) of the person.
For example, from {Xa1(t)}t=T1, . . . , T2 and {Xb1(t)}t=T3, . . . , T4, the track stereo unit acquires a set {Z(t), d(t)}t=T3, . . . , T2 of pairs of the three-dimensional position vector Z(t) and the distances d(t) between the straight lines during the overlapping time interval t=T3, . . . , T2.
As an alternative, the distance d(t) between the two straight lines and the point of intersection Z(t) can be calculated by using an “optimum correction” method disclosed by the following reference 4.
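As a stand-in for the intersection and distance calculation (not necessarily the “optimum correction” of reference 4), the following sketch computes, for one time step, the midpoint of the common perpendicular of the two viewing rays as Z(t) and the distance between the two straight lines as d(t):

```python
import numpy as np


def ray_midpoint_and_distance(cam_center_a, dir_a, cam_center_b, dir_b):
    """Take the two viewing rays (camera center plus direction through the
    detected image point, expressed in the cage frame) and return the midpoint
    of their common perpendicular as the person's three-dimensional position
    Z(t), together with the distance d(t) between the two straight lines."""
    w0 = cam_center_a - cam_center_b
    a, b, c = dir_a @ dir_a, dir_a @ dir_b, dir_b @ dir_b
    d, e = dir_a @ w0, dir_b @ w0
    denom = a * c - b * b
    if abs(denom) < 1e-12:               # parallel rays: no unique closest point
        return None, np.inf
    s = (b * e - c * d) / denom
    u = (a * e - b * d) / denom
    q_a = cam_center_a + s * dir_a       # closest point on ray A
    q_b = cam_center_b + u * dir_b       # closest point on ray B
    return (q_a + q_b) / 2.0, float(np.linalg.norm(q_a - q_b))
```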
Next, the track stereo unit 48 calculates the degree of match between the two-dimensional moving tracks by using the distance between the sequences of points which the track stereo unit has acquired by carrying out stereo matching between the corresponding sequences of points (step ST64).
When the overlapping time interval has a length of “0”, the track stereo unit determines the degree of match as “0”. In this embodiment, for example, the track stereo unit calculates, as the degree of match, the number of times that the straight lines intersect during the overlapping time interval.
More specifically, in the example of
In this embodiment, the example in which the track stereo unit calculates, as the degree of match, the number of times that the straight lines intersect during the overlapping time interval is shown. However, this embodiment is not limited to this example. For example, the track stereo unit can calculate, as the degree of match, a proportion of the overlapping time interval during which the two straight lines intersect.
More specifically, in the example of
As an alternative, the track stereo unit can calculate, as the degree of match, the average of the reciprocal of the distance between the two straight lines during the overlapping time interval.
More specifically, in the example of
As an alternative, the track stereo unit can calculate, as the degree of match, the sum total of the reciprocals of the distance between the two straight lines during the overlapping time interval.
More specifically, in the example of
In addition, the track stereo unit can calculate the degree of match by combining some of the above-mentioned calculating methods.
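As one possible sketch of these alternatives (an assumption made for illustration: an exact intersection of two rays is approximated here by the distance d(t) falling below a small threshold), the degree-of-match variants can be computed from the sequence of distances d(t) as follows.

```python
def degree_of_match(distances, threshold=0.3):
    """distances: list of d(t) values over the overlapping time interval,
    in metres.  The 0.3 m 'intersection' threshold is an assumed value."""
    if not distances:
        return {"intersections": 0, "proportion": 0.0, "mean_reciprocal": 0.0}
    hits = [d < threshold for d in distances]   # "the straight lines intersect"
    return {
        "intersections": sum(hits),               # number of intersections
        "proportion": sum(hits) / len(distances), # proportion of the interval
        "mean_reciprocal": sum(1.0 / max(d, 1e-6) for d in distances)
                           / len(distances),      # average reciprocal distance
    }
```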
Hereafter, advantages provided by carrying out the stereo matching between two-dimensional moving tracks will be described.
For example, because the two-dimensional moving track α2 and the two-dimensional moving track β2 shown in the figure both belong to the same person B, the straight lines intersect at almost every time within the overlapping time interval, and the degree of match between the two-dimensional moving track α2 and the two-dimensional moving track β2 therefore has a high value.
In contrast, because the two-dimensional moving track α1 and the two-dimensional moving track β2 belong to the different persons A and B, respectively, the stereo matching between the two-dimensional moving track α1 and the two-dimensional moving track β2 which is carried out by the track stereo unit may show that the straight lines intersect at some time by accident. However, the straight lines do not intersect at most times, and the average of the reciprocal of the distance d(t) has a small value. Therefore, the degree of match between the two-dimensional moving track α1 and the two-dimensional moving track β2 has a low value.
Conventionally, because the stereo matching is performed on person detection results acquired at a single moment to estimate each person's three-dimensional position, ambiguity occurs in the stereo vision as shown in the figure, and a larger number of persons than the actual number of persons may be detected.
In contrast, the person tracking device in accordance with this Embodiment 1 can cancel the ambiguity of the stereo vision and can determine each person's three-dimensional moving track correctly by carrying out the stereo matching between the two-dimensional moving tracks of each person throughout a fixed time interval.
After calculating the degree of match between the two-dimensional moving track of each person in each video image and the two-dimensional moving track of a person in any other video image in the above-mentioned way, the track stereo unit 48 compares the degree of match with a predetermined threshold (step ST65).
When the degree of match between the two-dimensional moving track of a person in one video image and the two-dimensional moving track of a person in another video image exceeds the threshold, the track stereo unit 48 calculates a three-dimensional moving track from these two-dimensional moving tracks for the time interval during which they overlap each other, and then performs filtering on the three-dimensional moving track to remove any erroneously-estimated three-dimensional moving track (step ST66). The three-dimensional positions during the overlapping time interval can be estimated by carrying out normal stereo matching between the overlapping portions of the two two-dimensional moving tracks; a detailed explanation of this normal stereo matching is omitted because it is a known technique.
More specifically, when the person detecting unit 44 detects a person erroneously, the track stereo unit 48 may calculate an erroneous three-dimensional moving track from the erroneous detection. Therefore, when the person's three-dimensional position Z(t) fails to satisfy one or more of the criteria (a) to (c) shown below, the track stereo unit 48 determines that the three-dimensional moving track is not a real person's track and cancels this three-dimensional moving track.
Criterion (a): The person's height is higher than a fixed length (e.g., 50 cm).
Criterion (b): The person exists in a specific area (e.g., the inside of the elevator cage).
Criterion (c): The person's three-dimensional movement history is smooth.
According to the criterion (a), a three-dimensional moving track at an extremely low position is determined as one which is erroneously detected and is therefore canceled.
Furthermore, according to the criterion (b), for example, a three-dimensional moving track of a person image in a mirror installed in the cage is determined as one which is not a person's track and is therefore canceled.
Furthermore, according to the criterion (c), for example, an unnatural three-dimensional moving track which varies rapidly both vertically and horizontally is determined as one which is not a person's track and is therefore canceled.
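The criteria (a) to (c) can be sketched, purely for illustration, as a filtering predicate over a three-dimensional moving track. Representing the track as an (N, 3) numpy array with the z axis as the height, the axis-aligned bounding box used for the monitored area, the 0.5 m height limit and the per-frame speed limit used for the smoothness test are all assumptions made for this example.

```python
import numpy as np

def is_valid_track(track_3d, min_height=0.5, area_box=None, max_step=2.0):
    """Check criteria (a)-(c) for a three-dimensional moving track given as
    an (N, 3) array of positions [x, y, z] in metres."""
    heights = track_3d[:, 2]
    # (a) a track at an extremely low position is canceled
    if heights.mean() < min_height:
        return False
    # (b) the track must stay inside the specific area (box = (xmin, ymin, xmax, ymax))
    if area_box is not None:
        xmin, ymin, xmax, ymax = area_box
        inside = ((track_3d[:, 0] >= xmin) & (track_3d[:, 0] <= xmax) &
                  (track_3d[:, 1] >= ymin) & (track_3d[:, 1] <= ymax))
        if not inside.all():
            return False
    # (c) the movement history must be smooth: reject large frame-to-frame jumps
    steps = np.linalg.norm(np.diff(track_3d, axis=0), axis=1)
    if len(steps) and steps.max() > max_step:
        return False
    return True
```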
Next, the track stereo unit 48 estimates the three-dimensional moving track of each individual person by calculating the three-dimensional positions of the sequences of points which form the portions of the two-dimensional moving tracks that do not overlap each other with respect to time, by using the three-dimensional positions which were calculated for the time interval during which the two-dimensional moving tracks overlap each other (step ST67).
In the case of
The normal stereo matching method cannot calculate a person's three-dimensional moving track during a time interval in which no two two-dimensional moving tracks of the person overlap each other. In this case, in accordance with this embodiment, the average of the person's height during the time interval in which two two-dimensional moving tracks of the person overlap each other is calculated, and the person's three-dimensional moving track during the time interval in which no two two-dimensional moving tracks overlap each other is estimated by using this average height.
In the example described above, the track stereo unit calculates the average height aveH of the person from the floor by using the three-dimensional positions Z(t) estimated during the overlapping time interval from the time T3 to the time T2.
Next, the track stereo unit determines the point at each time t whose height from the floor is equal to aveH from among the points on the straight line Va1(t) passing through both the center of the camera 1α and the image coordinates Xa1(t), and then estimates this point as the three-dimensional position Z(t) of the person. Similarly, the track stereo unit estimates the person's three-dimensional position Z(t) from the image coordinates Xb1(t) at each time t.
As a result, the track stereo unit can acquire a three-dimensional moving track {Z(t)}t=T1, . . . , T4 throughout all the time period from the time T1 to the time T4 during which the two-dimensional moving track α1 and the two-dimensional moving track β1 are recorded.
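A minimal sketch of this monocular estimation, assuming the floor is the plane z = 0 and the z axis points upwards, is to intersect the back-projected straight line with the horizontal plane at the height aveH. The camera centre, ray direction and the example values are assumptions made for illustration.

```python
import numpy as np

def point_on_ray_at_height(cam_center, direction, height):
    """Return the point on the straight line through the camera centre and
    the back-projected image point whose height above the floor equals
    'height' (e.g. aveH), or None if the ray never reaches that height."""
    direction = direction / np.linalg.norm(direction)
    if abs(direction[2]) < 1e-9:
        return None
    s = (height - cam_center[2]) / direction[2]
    return cam_center + s * direction

# Hypothetical example: a camera 2.5 m above the floor looking down and forwards.
cam = np.array([0.0, 0.0, 2.5])
ray = np.array([0.3, 0.1, -1.0])
print(point_on_ray_at_height(cam, ray, 1.6))  # person's head at aveH = 1.6 m
```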
As a result, even when the person is not shot by one of the cameras during a certain time period because the person is shaded by someone else, or the like, the track stereo unit 48 can calculate the person's three-dimensional moving track, as long as the person's two-dimensional moving track is calculated by using a video image captured by another camera and that two-dimensional moving track overlaps another two-dimensional moving track before and after the person is shaded.
After the calculation of the degree of match between the two-dimensional moving tracks of all the pairs is completed, the person tracking device ends the process by the track stereo unit 48 and then makes a transition to the process by the three-dimensional moving track graph generating unit 49 (step ST68).
After the track stereo unit 48 calculates three-dimensional moving tracks of each individual person, the three-dimensional moving track graph generating unit 49 performs a dividing process and a connecting process on the three-dimensional moving tracks of each individual person to generate a three-dimensional moving track graph (step ST47).
More specifically, the three-dimensional moving track graph generating unit 49 searches through the set of three-dimensional moving tracks of each individual person calculated by the track stereo unit 48 for three-dimensional moving tracks close to one another with respect to space or time, and then performs processes such as division and connection, on them to generate a three-dimensional moving track graph having the three-dimensional moving tracks as vertices of the graph, and having connected three-dimensional moving tracks as directed sides of the graph.
Hereafter, the process carried out by the three-dimensional moving track graph generating unit 49 will be explained concretely.
First, an example of three-dimensional moving tracks close to one another with respect to space, which are processed by the three-dimensional moving track graph generating unit 49, will be mentioned.
For example, as shown in
In this example, because the start point of a three-dimensional moving track L2 exists within a fixed distance from the end point L1E of a three-dimensional moving track L1, it can be said that the three-dimensional moving track L2 exists close to the end point L1E of the three-dimensional moving track L1 with respect to space.
Furthermore, because the shortest distance d between the end point L1E of the three-dimensional moving track L1 and the three-dimensional moving track L3 falls within the fixed distance, it can be said that the three-dimensional moving track L3 exists close to the end point L1E of the three-dimensional moving track L1 with respect to space.
In contrast, because a three-dimensional moving track L4 has a start point which is distant from the end point L1E of the three-dimensional moving track L1, it can be said that the three-dimensional moving track L4 does not exist close to the three-dimensional moving track L1 with respect to space.
Next, an example of three-dimensional moving tracks close to one another with respect to time, which are processed by the three-dimensional moving track graph generating unit 49, will be mentioned.
For example, assuming that a three-dimensional moving track L1 shown in
In contrast with this, when the length of the time interval |t3−t2| exceeds the constant value, it is defined that the three-dimensional moving track L2 does not exist close to the three-dimensional moving track L1 with respect to time.
Although the examples of the three-dimensional moving track close to the end point L1E of the three-dimensional moving track L1 with respect to space and with respect to time are described above, three-dimensional moving tracks close to the start point of a three-dimensional moving track with respect to space and with respect to time can be defined similarly.
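For illustration, the two closeness tests can be sketched as follows; the 0.5 m distance and the 1.0 s gap are assumed example values, and the track is assumed to be an (N, 3) numpy array of positions.

```python
import numpy as np

def close_in_space(end_point, other_track, max_dist=0.5):
    """Spatial closeness: the shortest distance between the end point of one
    three-dimensional moving track and the points of another track falls
    within a fixed distance (0.5 m is an assumed value)."""
    d = np.linalg.norm(other_track - end_point, axis=1).min()
    return d <= max_dist

def close_in_time(end_time_t2, start_time_t3, max_gap=1.0):
    """Temporal closeness: the length of the time interval |t3 - t2| does not
    exceed a constant value (1.0 s is an assumed value)."""
    return abs(start_time_t3 - end_time_t2) <= max_gap
```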
Next, the track dividing process and the track connecting process carried out by the three-dimensional moving track graph generating unit 49 will be explained.
When another three-dimensional moving track A exists close to the start point S of a three-dimensional moving track calculated by the track stereo unit 48 with respect to time and with respect to space, the three-dimensional moving track graph generating unit 49 divides the three-dimensional moving track A into two portions at a point near the start point S.
In the case shown in the figure, the three-dimensional moving track L3 exists close to the start point of the three-dimensional moving track L2 with respect to time and with respect to space.
Therefore, the three-dimensional moving track graph generating unit 49 divides the three-dimensional moving track L3 into two portions at a point near the start point of the three-dimensional moving track L2 to generate a three-dimensional moving track L3 and a three-dimensional moving track L5 newly and acquire a set of three-dimensional moving tracks as shown in
Furthermore, when another three-dimensional moving track A exists close to the end point S of a three-dimensional moving track calculated by the track stereo unit 48 with respect to time and with respect to space, the three-dimensional moving track graph generating unit 49 divides the other three-dimensional moving track A into two portions at a point near the end point S.
In this example, the three-dimensional moving track L4 exists close to the end point of the three-dimensional moving track L5 with respect to time and with respect to space.
Therefore, the three-dimensional moving track graph generating unit 49 divides the three-dimensional moving track L4 into two portions at a point near the end point of the three-dimensional moving track L5 to generate a three-dimensional moving track L4 and a three-dimensional moving track L6 newly and acquire a set of three-dimensional moving tracks L1 to L6 as shown in
When the start point of another three-dimensional moving track B exists close to the end point of a three-dimensional moving track A with respect to space and with respect to time in the set of three-dimensional moving tracks acquired through the track dividing process, the three-dimensional moving track graph generating unit 49 connects the two three-dimensional moving tracks A and B to each other.
More specifically, the three-dimensional moving track graph generating unit 49 acquires a three-dimensional moving track graph by defining each three-dimensional moving track as a vertex of a graph, and also defining each pair of three-dimensional moving tracks connected to each other as a directed side of the graph.
In the example of
In many cases, the three-dimensional moving tracks of each individual person calculated by the track stereo unit 48 are comprised of a set of plural three-dimensional moving track fragments which are discrete with respect to space or time due to a failure to track each individual person's head in a two-dimensional image, or the like.
To solve this problem, the three-dimensional moving track graph generating unit 49 performs the dividing process and the connecting process on these three-dimensional moving tracks to determine a three-dimensional moving track graph, so that the person tracking device can hold information about a plurality of possible moving paths of each person.
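As a minimal sketch of the connecting step, assuming each track is stored as a numpy array of points together with its start and end times, the directed sides of the graph could be collected as follows, using the closeness predicates sketched earlier. This is an illustrative outline only, not the author's implementation of the dividing and connecting processes.

```python
def build_track_graph(tracks, close_in_space, close_in_time):
    """tracks: dict name -> (points, t_start, t_end) obtained after the
    dividing process, where 'points' is an (N, 3) numpy array.
    Returns the directed sides of the three-dimensional moving track graph
    as (from_track, to_track) pairs."""
    edges = []
    for a, (pts_a, sa, ea) in tracks.items():
        for b, (pts_b, sb, eb) in tracks.items():
            if a == b:
                continue
            # connect A -> B when the start point of B is close to the end
            # point of A with respect to both space and time
            if close_in_space(pts_a[-1], pts_b[:1]) and close_in_time(ea, sb):
                edges.append((a, b))
    return edges
```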
After the three-dimensional moving track graph generating unit 49 generates the three-dimensional moving track graph, the track combination estimating unit 50 searches through the three-dimensional moving track graph to calculate three-dimensional moving track candidates of each individual person from an entrance to the cage to an exit from the cage, and estimates a combination of optimal three-dimensional moving tracks from the three-dimensional moving track candidates to calculate an optimal three-dimensional moving track of each individual person, and the number of persons existing in the cage at each time (step ST48).
Hereafter, the process carried out by the track combination estimating unit 50 will be explained concretely.
First, the track combination estimating unit 50 sets up an entrance and exit area for persons at a location in the area to be monitored (step ST71).
The entrance and exit area is used as the object of a criterion by which to judge whether each person has entered or exited the elevator. In the example of
When the moving track of the head of a person has started from the entrance and exit area which is set up in the vicinity of the entrance of the elevator, for example, it can be determined that the person has got on the elevator on the corresponding floor. In contrast, when the moving track of a person has been ended in the entrance and exit area, it can be determined that the person has got off the elevator on the corresponding floor.
Next, the track combination estimating unit 50 searches through the three-dimensional moving track graph generated by the three-dimensional moving track graph generating unit 49, and calculates candidates for a three-dimensional moving track of each individual person (i.e., a three-dimensional moving track from an entrance to the area to be monitored to an exit from the area) which satisfy the following entrance criteria and exit criteria within a time period determined for the analytical object (step ST72).
(1) Entrance criterion:
The three-dimensional moving track extends from the door toward the inside of the elevator.
(2) Entrance criterion:
The position of the start point of the three-dimensional moving track is in the entrance and exit area.
(3) Entrance criterion:
The door index di at the start time of the three-dimensional moving track set up by the door opening and closing recognition unit 11 is not “0”.
(1) Exit criterion:
The three-dimensional moving track extends from the inside of the elevator toward the door.
(2) Exit criterion:
The position of the end point of the three-dimensional moving track is in the entrance and exit area.
(3) Exit criterion:
The door index di at the end time of the three-dimensional moving track set up by the door opening and closing recognition unit 11 is not “0”, and the door index di differs from that at the time of entrance.
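These entrance and exit criteria can be sketched, under assumptions, as a single predicate over one candidate track. The attribute names on the track object, the point-in-area test and the callable door_index_at(t) standing in for the door index di set up by the door opening and closing recognition unit 11 are all hypothetical names introduced for this illustration.

```python
def satisfies_entry_exit(track, door_index_at, entrance_area):
    """Illustrative check of the three entrance criteria and the three exit
    criteria for one candidate three-dimensional moving track."""
    in_area = entrance_area.contains  # hypothetical point-in-area predicate
    entered = (track.moves_inward                      # (1) from the door inwards
               and in_area(track.start_pos)            # (2) start point in the area
               and door_index_at(track.start_time) != 0)  # (3) di at entrance != 0
    exited = (track.moves_outward                      # (1) towards the door
              and in_area(track.end_pos)               # (2) end point in the area
              and door_index_at(track.end_time) != 0   # (3) di at exit != 0 ...
              and door_index_at(track.end_time) != door_index_at(track.start_time))
    return entered and exited
```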
In the example of
It is assumed that the three-dimensional moving track graph G is comprised of three-dimensional moving tracks L1 to L6, and the three-dimensional moving track graph G has the following information.
Furthermore, it is assumed that the door indexes di of the three-dimensional moving tracks L1, L2, L3, L4, L5, and L6 are 1, 2, 2, 4, 3, and 3, respectively. However, it is further assumed that the three-dimensional moving track L3 is determined erroneously due to a failure to track the individual person's head or shading by another person.
As a result, two three-dimensional moving tracks (the three-dimensional moving tracks L2 and L3) are connected to the three-dimensional moving track L1, and ambiguity therefore occurs in the tracking of the person's movement.
In the example of
In this case, the track combination estimating unit 50 searches through the three-dimensional moving track graph G by, for example, starting from the three-dimensional moving track L1, and then tracing the three-dimensional moving tracks in order of L1→L2→L6 to acquire a candidate {L1, L2, L6} for the three-dimensional moving track from an entrance to the area to be monitored to an exit from the area.
Similarly, the track combination estimating unit 50 searches through the three-dimensional moving track graph G to acquire candidates, as shown below, for the three-dimensional moving track from an entrance to the area to be monitored to an exit from the area.
Track candidate A={L1, L2, L6}
Track candidate B={L4, L5}
Track candidate C={L1, L3, L5}
Next, by defining a cost function which takes into consideration a positional relationship among persons, the number of persons, the accuracy of stereo vision, etc., and selectively determining a combination of three-dimensional moving tracks which maximizes the cost function from among the candidates for the three-dimensional moving track from an entrance to the area to be monitored to an exit from the area, the track combination estimating unit 50 determines a correct three-dimensional moving track of each person and the correct number of persons (step ST73).
For example, the cost function reflects requirements: “any two three-dimensional moving tracks do not overlap each other” and “as many three-dimensional moving tracks as possible are estimated”, and can be defined as follows.
Cost=“the number of three-dimensional moving tracks”−“the number of times that three-dimensional moving tracks overlap each other”
where the number of three-dimensional moving tracks means the number of persons in the area to be monitored.
When calculating the above-mentioned cost in this example, because the track candidate A={L1, L2, L6} and the track candidate C={L1, L3, L5} overlap each other in a portion of L1, "the number of times that three-dimensional moving tracks overlap each other" is calculated to be "1".
Similarly, because the track candidate B={L4, L5} and the track candidate C={L1, L3, L5} overlap each other in a portion of L5, “the number of times that three-dimensional moving tracks overlap each other” is calculated to be “1”.
As a result, the cost of each of combinations of one or more track candidates is calculated as follows.
Therefore, the combination of the track candidates A and B is the one which maximizes the cost function, and it is therefore determined that the combination of the track candidates A and B is an optimal combination of three-dimensional moving tracks.
Because the combination of the track candidates A and B is an optimal combination of three-dimensional moving tracks, it is also estimated simultaneously that the number of persons in the area to be monitored is two.
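The selection of the combination which maximizes this cost can be illustrated by a small brute-force sketch (an exhaustive search is only practical for a handful of candidates; the dictionary representation of the candidates is an assumption made here).

```python
from itertools import combinations

def overlaps(c1, c2):
    """Two track candidates overlap when they share a track fragment."""
    return len(set(c1) & set(c2)) > 0

def best_combination(candidates):
    """Cost = number of selected candidates - number of overlapping pairs."""
    best, best_cost = (), float("-inf")
    names = list(candidates)
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            n_overlap = sum(overlaps(candidates[a], candidates[b])
                            for a, b in combinations(combo, 2))
            cost = len(combo) - n_overlap
            if cost > best_cost:
                best, best_cost = combo, cost
    return best, best_cost

# The three candidates from the text.
cands = {"A": ["L1", "L2", "L6"], "B": ["L4", "L5"], "C": ["L1", "L3", "L5"]}
print(best_combination(cands))  # -> (('A', 'B'), 2)
```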
After determining the optimal combination of the three-dimensional moving tracks of persons, each of which starts from the entrance and exit area in the area to be monitored, and ends in the entrance and exit area, the track combination estimating unit 50 brings each of the three-dimensional moving tracks into correspondence with floors specified by the floor recognition unit 12 (stopping floor information showing stopping floors of the elevator), and calculates a person movement history showing the floor where each individual person has got on the elevator and the floor where each individual person has got off the elevator (a movement history of each individual person showing “how many persons have got on the elevator on which floor and how many persons have got off the elevator on which floor”) (step ST74).
In this embodiment, although the example in which the track combination estimating unit brings each of the three-dimensional moving tracks into correspondence with the floor information specified by the floor recognition unit 12 is shown, the track combination estimating unit can alternatively acquire stopping floor information from control equipment for controlling the elevator, and can bring each of the three-dimensional moving tracks into correspondence with the stopping floor information independently.
As mentioned above, by defining a cost function in consideration of a positional relationship among persons, the number of persons, the accuracy of stereo vision, etc., and then determining a combination of three-dimensional moving tracks which maximizes the cost function, the track combination estimating unit 50 can determine each person's three-dimensional moving track and the number of persons in the area to be monitored even when the result of tracking of a person head has an error due to shading by something else.
However, when a large number of persons have got on and got off the elevator and the three-dimensional moving track graph has a complicated structure, the number of candidates for the three-dimensional moving track of each person is very large and hence the number of combinations of candidates becomes very large, and the track combination estimating unit may be unable to carry out the process within a realistic time period.
In such a case, the track combination estimating unit 50 can define a likelihood function which takes into consideration a positional relationship among persons, the number of persons, and the accuracy of stereo vision, and use a probabilistic optimization technique, such as MCMC (Markov chain Monte Carlo) or GA (genetic algorithm), to determine an optimal combination of three-dimensional moving tracks.
Hereafter, a process of determining an optimal combination of three-dimensional moving tracks of persons which maximizes the cost function by using MCMC, which is carried out by the track combination estimating unit 50, will be explained concretely.
First, symbols are defined as follows.
yi(t): the three-dimensional position at a time t of a three-dimensional moving track yi, yi(t)∈R3
yi={yi(t)}: the three-dimensional moving track of the i-th person from an entrance to the area to be monitored to an exit from the area
|yi|: the record time of the three-dimensional moving track yi
N: the number of three-dimensional moving tracks each extending from an entrance to the area to be monitored to an exit from the area (the number of persons)
Y={yi}i=1, . . . ,N: a set of three-dimensional moving tracks
S(yi): the stereo cost of the three-dimensional moving track yi
O(yi, yj): the cost of an overlap between the three-dimensional moving track yi and the three-dimensional moving track yj
w+: a set of three-dimensional moving tracks yi which are selected as correct three-dimensional moving tracks
w−: the set of three-dimensional moving tracks yi which are not selected, w−=Y−w+
w: w={w+,w−}
wopt: w which maximizes the likelihood function
|w+|: the number of elements of w+ (the number of tracks which are selected as correct three-dimensional moving tracks)
Ω: the set of all w, w∈Ω (the set of divisions of the set Y of three-dimensional moving tracks)
L(w|Y): the likelihood function
Lnum(w|Y): the likelihood function of the number of selected tracks
Lstr(w|Y): the likelihood function regarding the stereo vision of the selected tracks
Lovr(w|Y): the likelihood function regarding an overlap between the selected tracks
q(w′|w): a proposed distribution
A(w′|w): an acceptance probability
After the three-dimensional moving track graph generating unit 49 generates the three-dimensional moving track graph, the track combination estimating unit 50 searches through the three-dimensional moving track graph to determine the set Y={yi}i=1, . . . ,N of candidates for the three-dimensional moving track of each individual person which meet the above-mentioned entrance criteria and exit criteria.
Furthermore, after defining w+ as the set of three-dimensional moving track candidates which are selected as correct three-dimensional moving tracks, the track combination estimating unit defines both w−=Y−w+ and w={w+,w−}. The track combination estimating unit 50 is aimed at selecting correct three-dimensional moving tracks from the set Y of three-dimensional moving track candidates, and this aim can be formulized into the problem of defining the likelihood function L(w|Y) as a cost function, and maximizing this cost function.
More specifically, when an optimal track selection is assumed to be wopt, wopt is given by the following equation.
wopt=argmax L(w|Y)
For example, the likelihood function L(w|Y) can be defined as follows.
L(w|Y)=Lovr(w|Y)Lnum(w|Y)Lstr(w|Y)
where Lovr is the likelihood function in which “any two three-dimensional moving tracks do not overlap each other in the three-dimensional space” is formulized, Lnum is the likelihood function in which “as many three-dimensional moving tracks as possible exist” is formulized, and Lstr is the likelihood function in which “the accuracy of stereo vision of a three-dimensional moving track is high” is formulized.
Hereafter, the details of each of the likelihood functions will be mentioned.
The criterion: “any two three-dimensional moving tracks do not overlap each other in the three-dimensional space” is formulized as follows.
Lovr(w|Y)∝exp(−c1Σi,j∈w+O(yi,yj))
where O(yi,yj) is the cost of an overlap between the three-dimensional moving track yi and the three-dimensional moving track yj.
When the three-dimensional moving track yi and the three-dimensional moving track yj perfectly overlap each other, O(yi,yj) has a value of “1”, whereas when the three-dimensional moving track yi and the three-dimensional moving track yj do not overlap each other at all, O(yi,yj) has a value of “0”. Furthermore, c1 is a positive constant.
O(yi,yj) is determined as follows.
yi and yj are expressed as yi={yi(t)}t=t1, . . . ,t2 and yj={yj(t)}t=t3, . . . , t4, respectively, and it is assumed that the three-dimensional moving track yi and the three-dimensional moving track yj exist simultaneously during a time period F=[t3 t2].
Furthermore, a function g is defined as follows.
g(yi(t),yj(t))=1 (if ∥yi(t)−yj(t)∥<Th1), =0 (otherwise)
where Th1 is a proper distance threshold, and is set to 25 cm, for example.
That is, the function g is a function for providing a penalty when the two three-dimensional moving tracks are close to each other within a distance less than the threshold Th1.
At this time, the overlap cost O(yi, yj) is calculated as follows.
O(yi,yj)=Σt∈F g(yi(t),yj(t))/|F|
O(yi,yj)=0 (when the three-dimensional moving track yi and the three-dimensional moving track yj do not exist simultaneously, i.e., when the time period F is empty)
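A minimal sketch of this overlap cost, assuming each track is stored as a dict mapping a frame time t to a numpy position yi(t), is shown below; the 25 cm value for Th1 follows the example given in the text.

```python
import numpy as np

def overlap_cost(track_i, track_j, th1=0.25):
    """O(yi, yj): the fraction of the common time period F during which the
    two three-dimensional moving tracks are closer than Th1 (25 cm)."""
    common = sorted(set(track_i) & set(track_j))  # the time period F
    if not common:
        return 0.0
    hits = sum(np.linalg.norm(track_i[t] - track_j[t]) < th1 for t in common)
    return hits / len(common)
```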
The criterion: “as many three-dimensional moving tracks as possible exist.” is formulized as follows.
Lnum(w|Y)∝exp(c2|w+|)
where |w+| is the number of elements of w+. Furthermore, c2 is a positive constant.
The criterion: “the accuracy of stereo vision of a three-dimensional moving track is high.” is formulized as follows.
Lstr(w|Y)∝exp(−c3Σi∈w+S(yi))
where S(yi) is a stereo cost, and when a three-dimensional moving track is estimated by using the stereo vision, S(yi) of the three-dimensional moving track has a small value, whereas when a three-dimensional moving track is estimated by using monocular vision or when a three-dimensional moving track has a time period during which it is not observed by any camera 1, S(yi) of the three-dimensional moving track has a large value. Furthermore, c3 is a positive constant.
Hereafter, a method of calculating the stereo cost S(yi) will be described.
In this case, when yi is expressed as yi={yi(t)}t=t1, . . . ,t2, the three following time periods F1i, F2i, and F3i exist mixedly within the time period Fi=[t1 t2] of the three-dimensional moving track yi.
F1i: the time period during which the three-dimensional moving track is estimated by using the stereo vision
F2i: the time period during which the three-dimensional moving track is estimated by using the monocular vision
F3i: the time period during which the three-dimensional moving track is not observed by any camera 1
In this case, the stereo cost S(yi) is provided as follows.
S(yi)=(c8×|F1i|+c9×|F2i|+c10×|F3i|)/|Fi|
where c8, c9 and c10 are positive constants.
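The stereo cost can be sketched directly from this formula; the particular constant values below are assumptions chosen only so that a track supported by stereo vision receives the smallest cost.

```python
def stereo_cost(n_stereo, n_mono, n_unseen, c8=0.1, c9=0.5, c10=1.0):
    """S(yi) = (c8*|F1i| + c9*|F2i| + c10*|F3i|) / |Fi|, where the arguments
    are the numbers of frames estimated by stereo vision, by monocular
    vision, and not observed by any camera, respectively."""
    total = n_stereo + n_mono + n_unseen
    if total == 0:
        return 0.0
    return (c8 * n_stereo + c9 * n_mono + c10 * n_unseen) / total
```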
Next, a method of maximizing the likelihood function L(w|Y) by using MCMC which the track combination estimating unit 50 uses will be described.
First, an outline of the algorithm is described as follows.
Input: Y, winit, Nmc
Output: wopt
(1) Initialization: w=winit, wopt=winit
(2) Main routine
for n=1 to Nmc
step1: sample the index m according to the probability distribution ζ(m)
step2: sample the candidate w′ according to the proposed distribution q(w′|w) corresponding to the index m
step3: sample u from the uniform distribution Unif[0 1]
step4: accept or reject the candidate w′ on the basis of u and the acceptance probability A(w,w′)
step5: store w as wopt when w gives the largest value of the likelihood function found so far
end
The input to the algorithm is the set Y of three-dimensional moving tracks, an initial division winit, and a sampling frequency Nmc, and the optimal division wopt is acquired as the output of the algorithm.
In the initialization, the initial division winit is given by winit={w+=∅, w−=Y}.
In the main routine, in step1, m is sampled according to a probability distribution ζ(m). For example, the probability distribution ζ(m) can be set to be a uniform distribution.
Next, in step2, the candidate w′ is sampled according to the proposed distribution q(w′|w) corresponding to the index m.
As the proposed distribution of a proposal algorithm, three types including “generation”, “disappearance” and “swap” are defined.
The index m=1 corresponds to “generation”, the index m=2 corresponds to “disappearance”, and the index m=3 corresponds to “swap”.
Next, in step3, u is sampled from the uniform distribution Unif[0 1].
In next step4, the candidate w′ is accepted or rejected on the basis of u and the acceptance probability A(w,w′).
The acceptance probability A(w,w′) is given by the following equation.
A(w,w′)=min(1, (q(w|w′)L(w′|Y))/(q(w′|w)L(w|Y)))
Finally, in step5, the optimal wopt that maximizes the likelihood function is stored.
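The structure of this sampling can be sketched as a Metropolis-Hastings style loop; the callables likelihood(w) and propose(w, m) below are assumptions about how the surrounding system would be coded, with propose returning the candidate w′ together with the forward and backward proposal densities q(w′|w) and q(w|w′) (the proposal distributions themselves are detailed in the following paragraphs).

```python
import random

def mcmc_track_selection(Y, likelihood, propose, n_mc=10000):
    """Illustrative sketch of the search for wopt by MCMC."""
    w = {"selected": set(), "rest": set(Y)}   # winit = {w+ = empty set, w- = Y}
    w_opt, best = w, likelihood(w)
    for _ in range(n_mc):
        m = random.choice((1, 2, 3))          # step1: uniform ζ(m) over the proposals
        w_new, q_fwd, q_bwd = propose(w, m)   # step2: sample w' ~ q(w'|w)
        u = random.random()                   # step3: u ~ Unif[0, 1]
        ratio = (q_bwd * likelihood(w_new)) / max(q_fwd * likelihood(w), 1e-300)
        if u < min(1.0, ratio):               # step4: accept with probability A(w, w')
            w = w_new
        if likelihood(w) > best:              # step5: keep the best division so far
            w_opt, best = w, likelihood(w)
    return w_opt
```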
Hereafter, the details of the proposed distribution q(w′|w) will be mentioned.
In the case of "generation", one three-dimensional moving track y is selected from the set w−, and is added to w+.
At this time, a three-dimensional moving track which does not overlap the tracks in w+ with respect to space is selected as y on a priority basis.
More specifically, when y∈w−, w={w+,w−}, and w′={{w++y}, {w−−y}}, the proposed distribution is given by the following equation.
q(w′|w)∝ζ(1)exp(−c4Σj∈w+O(y,yj))
where O(y,yj) is the above-mentioned overlap cost, and has a value of “1” when the tracks y and yj overlap each other perfectly, whereas O(y,yj) has a value of “0” when the tracks y and yj do not overlap each other at all, and c4 is a positive constant.
In the case of "disappearance", one three-dimensional moving track y is selected from the set w+, and is added to w−.
At this time, a three-dimensional moving track which overlaps another track in w+ with respect to space is selected as y on a priority basis.
More specifically, when y∈w+, w={w+,w−}, and w′={{w+−y}, {w−+y}}, the proposed distribution is given by the following equation.
q(w′|w)∝ζ(2)exp(c5Σj∈w+O(y,yj))
When w+ is an empty set, the proposed distribution is shown by the following equation.
q(w′|w)=1 (if w′=w),q(w′|w)=0 (otherwise)
where c5 is a positive constant.
In the case of "swap", a three-dimensional moving track having a high stereo cost is interchanged with a three-dimensional moving track having a low stereo cost.
More specifically, one three-dimensional moving track y is selected from the set w+ and one three-dimensional moving track z is selected from the set w−, and the three-dimensional moving track y is interchanged with the three-dimensional moving track z.
Concretely, one three-dimensional moving track having a high stereo cost is selected first as the three-dimensional moving track y on a priority basis.
Next, one three-dimensional moving track which overlaps the three-dimensional moving track y and which has a low stereo cost is selected as the three-dimensional moving track z on a priority basis.
More specifically, assuming y∈w+, z∈w−, w′={{w+−y+z}, {w−+y−z}}, p(y|w)∝exp(c6S(y)), and p(z|w,y)∝exp(−c6S(z))exp(c7O(z,y)), the proposed distribution is given by the following equation.
q(w′|w)∝ζ(3)×p(z|w,y)p(y|w)
where c6 and c7 are positive constants.
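As one possible sketch of how such a weighted proposal could be drawn, the "generation" move can select a track from w− with probability proportional to exp(−c4 Σj O(y, yj)). The set-based representation, the overlap_cost callable and the value of c4 are assumptions made for this illustration.

```python
import math
import random

def sample_generation(w_plus, w_minus, overlap_cost, c4=1.0):
    """'Generation' proposal: pick a track y from w- so that tracks which do
    not overlap the already selected tracks are chosen preferentially, and
    return the new division together with the forward proposal weight."""
    cands = list(w_minus)
    weights = [math.exp(-c4 * sum(overlap_cost(y, yj) for yj in w_plus))
               for y in cands]
    total = sum(weights)
    y = random.choices(cands, weights=weights, k=1)[0]
    q_forward = weights[cands.index(y)] / total   # contribution to q(w'|w)
    new_plus = set(w_plus) | {y}
    new_minus = set(w_minus) - {y}
    return new_plus, new_minus, q_forward
```

The "disappearance" and "swap" moves can be sampled in the same weighted manner, using exp(c5 Σj O(y, yj)) and the stereo-cost based weights, respectively.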
After determining the movement history of each individual person in the above-mentioned way, the video analysis unit 3 provides the movement history to a group management system (not shown) which manages the operations of two or more elevators.
As a result, the group management system can carry out optimal group control of the elevators at all times according to the movement history acquired from each elevator.
Furthermore, the video analysis unit 3 outputs the movement history of each individual person, etc. to the image analysis result display unit 4 as needed.
When receiving the movement history of each individual person, etc. from the video analysis unit 3, the image analysis result display unit 4 displays the movement history of each individual person, etc. on a display (not shown).
Hereafter, the process carried out by the image analysis result display unit 4 will be explained concretely.
As shown in the figure, the image analysis result display unit 4 is provided with a video display unit 51, a time series information display unit 52, a summary display unit 53, an operation related information display unit 54, and a sorted data display unit 55.
The video display unit 51 of the image analysis result display unit 4 synchronously displays the video images of the inside of the elevator cage captured by the plurality of cameras 1 (the video image captured by the camera (1), the video image captured by the camera (2), the video image of the indicator for floor recognition), and the analysis results acquired by the video analysis unit 3, and displays the head detection results, the two-dimensional moving tracks, etc. which are the analysis results acquired by the video analysis unit 3 while superimposing them onto each of the video images.
Because the video display unit 51 thus displays the plurality of video images synchronously, a user, such as a building maintenance worker, can know the states of the plurality of elevators simultaneously, and can also grasp the image analysis results including the head detection results and the two-dimensional moving tracks visually.
The time series information display unit 52 of the image analysis result display unit 4 forms the person movement history and cage movement histories which are determined by the three-dimensional moving track calculating unit 46 of the person tracking unit 13 into a time-series graph, and displays this time-series graph in synchronization with the video images.
In
In the screen example of
Furthermore, the time series information display unit displays a bar showing time synchronization with the video image being displayed on the graph, and expresses each time period during which an elevator's door is open with a thick line.
Furthermore, in the graph, a text “F15-D10-J0-K3” showing the floor on which the corresponding elevator is located, the door opening time of the elevator, the number of persons who have got on the elevator, and the number of persons who have got off the elevator is displayed in the vicinity of each thick line showing the corresponding door opening time.
This text “F15-D10-J0-K3” is a short summary showing that the floor where the elevator cage is located is the 15th floor, the door opening time is 10 seconds, the number of persons who have got on the elevator is zero, and the number of persons who have got off the elevator is three.
Because the time series information display unit 52 thus displays the image analysis results in time series, the user, such as a building maintenance worker, can know visually a temporal change of information including the number of persons who have got on each of a plurality of elevators, the number of persons who have got off each of the plurality of elevators, the door opening and closing times of each of the plurality of elevators, etc.
The summary display unit 53 of the image analysis result display unit 4 acquires statistics on the person movement histories calculated by the three-dimensional moving track calculating unit 46, and lists, as statistic results of the person movement histories, the number of persons who have got on each of the plurality of cages on each floor in a certain time zone and the number of persons who have got off each of the plurality of cages on each floor in the certain time zone.
Because the summary display unit 53 thus lists the number of persons who have got on each of the plurality of cages on each floor in a certain time zone and the number of persons who have got off each of the plurality of cages on each floor in the certain time zone, the user can grasp the operation states of all the elevators of a building at a glance.
In the screen example of
The operation related information display unit 54 of the image analysis result display unit 4 displays detailed information about the person movement histories with reference to the person movement histories calculated by the three-dimensional moving track calculating unit 46. More specifically, for a specified time zone, a specified floor, and a specified elevator cage number, the operation related information display unit displays detailed information about the elevator operation including the number of persons who have moved from the specified floor to other floors, the number of persons who have moved to the specified floor from the other floors, the passenger waiting time, etc.
In regions (A) to (F) of the screen of the figure, the following pieces of information are displayed.
(A): Display the specified time zone, the specified cage number, and the specified floor.
(B): Display the specified time zone, the specified cage number, and the specified floor.
(C): Display that the number of persons getting on cage #1 on 2F and moving upward during AM7:00 to AM10:00 is ten
(D): Display that number of persons getting on cage #1 on 3F and getting off cage #1 on 2F during AM7:00 to AM10:00 is one and average riding time is 30 seconds
(E): Display that number of persons getting on cage #1 from 3F and moving downward during AM7:00 to AM10:00 is three
(F): Display that number of persons getting on cage #1 on B1F and getting off cage #1 on 2F during AM7:00 to AM10:00 is two and average riding time is 10 seconds
By thus displaying the detailed information about the analyzed person movement histories, the operation related information display unit 54 enables the user to browse individual information about each floor and individual information about each cage, and analyze the details of a cause, such as a malfunction of the operation of an elevator.
The sorted data display unit 55 sorts and displays the person movement histories calculated by the three-dimensional moving track calculating unit 46. More specifically, the sorted data display unit sorts the data about the door opening times, the number of persons who have got on each elevator and the number of persons who have got off each elevator (the number of persons getting on or off), the waiting times, or the like by using the analysis results acquired by the video analysis unit 3, and displays the data in descending or ascending order of their ranks.
In the example of
Furthermore, in the example of
In the example of
Furthermore, in the example of
In the example of
Furthermore, in the example of
Because the sorted data display unit 55 thus displays the sorted data, the person tracking device enables the user to, for example, find out a time zone in which an elevator's door is open unusually and then refer to a video image and analysis results which were acquired in the same time zone to track the malfunction to its source.
As can be seen from the above description, the person tracking device in accordance with this Embodiment 1 is constructed in such a way that the person tracking device includes the person position calculating unit 44 for analyzing video images of an area to be monitored which are shot by the plurality of cameras 1 to determine a position on each of the video images of each individual person existing in the area to be monitored, and the two-dimensional moving track calculating unit 45 for calculating a two-dimensional moving track of each individual person in each of the video images by tracking the position on each of the video images calculated by the person position calculating unit 44. The three-dimensional moving track calculating unit 46 carries out stereo matching among the two-dimensional moving tracks in the video images calculated by the two-dimensional moving track calculating unit 45 to calculate the degree of match between a two-dimensional moving track in each of the video images and a two-dimensional moving track in another one of the video images, and then calculates a three-dimensional moving track of each individual person from two-dimensional moving tracks each having a degree of match equal to or larger than a specific value. Therefore, the present embodiment offers an advantage of being able to track each person existing in the area to be monitored correctly even in a situation in which the area to be monitored is greatly crowded.
More specifically, in a narrow crowded area, such as an elevator cage, it is difficult for a conventional person tracking device to carry out detection and tracking of each person because a person may be shaded by another person. In contrast, the person tracking device in accordance with this Embodiment 1 can determine a correct three-dimensional moving track of each individual person and can estimate the number of persons in the area to be monitored, even when there exists a three-dimensional moving track which is determined erroneously because of shading of a person by something else, by listing a plurality of three-dimensional moving track candidates and determining a combination of three-dimensional moving track candidates which maximizes the cost function which takes into consideration a positional relationship among persons, the number of persons, the accuracy of the stereoscopic vision, etc.
Furthermore, even when a three-dimensional moving track graph has a very complicated structure and there is a huge number of combinations of three-dimensional moving track candidates each extending from an entrance to the cage to an exit from the cage, the track combination estimating unit 50 determines an optimal combination of three-dimensional moving tracks by using a probabilistic optimization technique such as MCMC or GA. Therefore, the person tracking device in accordance with this embodiment can determine the combination of three-dimensional moving tracks within a realistic processing time period. As a result, even in a situation in which the area to be monitored is crowded greatly, the person tracking device can detect each individual person in the area to be monitored correctly and also can track each individual person correctly.
Furthermore, because the image analysis result display unit 4 shows the video images captured by the plurality of cameras 1 and the image analysis results acquired by the video analysis unit 3 in such a way that the video images and the image analysis results are visible to the user, the user, such as a building maintenance worker or a building owner, can grasp the operation state and malfunctioned parts of each elevator easily, and can bring efficiency to the operation of each elevator and perform maintenance work of each elevator smoothly.
In this Embodiment 1, the example in which the image analysis result display unit 4 displays the video images captured by the plurality of cameras 1 and the image analysis results acquired by the video analysis unit 3 on the display (not shown) is shown. As an alternative, the image analysis result display unit 4 can display the video images captured by the plurality of cameras 1 and the image analysis results acquired by the video analysis unit 3 on a display panel installed in each floor outside each elevator cage and a display panel disposed in each elevator cage to provide information about the degree of crowdedness of each elevator cage for passengers.
Accordingly, each passenger can grasp, from the degree of crowdedness of each elevator cage, when he or she should get on which elevator cage.
Furthermore, in this Embodiment 1, although the case in which the area to be monitored is the inside of each elevator cage is explained, this case is only an example. For example, this embodiment can be applied to a case in which the inside of a train is defined as the area to be monitored and the degree of crowdedness or the like of the train is measured.
This embodiment can also be applied to a case in which an area with a high need for security is defined as the area to be monitored and each person's movement history is determined to monitor a suspicious person's actions.
Furthermore, this embodiment can be applied to a case in which a station, a store, or the like is defined as the area to be monitored and each person's moving track is analyzed to be used for marketing or the like.
In addition, this embodiment can be applied to a case in which each landing of an escalator is defined as the area to be monitored and the number of persons existing in each landing is counted. When one landing of the escalator is crowded, the person tracking device can then carry out appropriate control, such as slowing down or stopping the escalator, to prevent an accident, such as an accident in which people fall over like dominoes on the escalator, from occurring.
The person tracking device in accordance with above-mentioned Embodiment 1 searches through a plurality of three-dimensional moving track graphs to calculate three-dimensional moving track candidates which satisfy the entrance and exit criteria, lists three-dimensional moving track candidates each extending from an entrance to the elevator cage to an exit from the cage, and determines an optimal combination of three-dimensional moving track candidates by maximizing the cost function in a probabilistic manner by using a probabilistic optimization technique such as MCMC. However, when each three-dimensional moving track graph has a complicated structure, the number of three-dimensional moving track candidates which satisfy the entrance and exit criteria becomes astronomically large, and the person tracking device in accordance with above-mentioned Embodiment 1 may be unable to carry out the processing within a realistic time period.
To solve this problem, a person tracking device in accordance with this Embodiment 2 labels the vertices of each three-dimensional moving track graph (i.e., the three-dimensional moving tracks which construct each graph) to estimate an optimal combination of three-dimensional moving tracks within a realistic time period by maximizing a cost function which takes entrance and exit criteria into consideration in a probabilistic manner.
A track combination estimating unit 61 carries out a process of determining a plurality of candidates for labeling by labeling the vertices of each three-dimensional moving track graph generated by a three-dimensional moving track graph generating unit 49, and selecting an optimal candidate for labeling from among the plurality of candidates for labeling to estimate the number of persons existing in the area to be monitored.
Next, the operation of the person tracking device will be explained.
Because the person tracking device in accordance with this embodiment has the same structure as that in accordance with above-mentioned Embodiment 1, with the exception that the track combination estimating unit 50 is replaced by the track combination estimating unit 61, only the operation of the track combination estimating unit 61 will be explained.
First, the track combination estimating unit 61 sets up an entrance and exit area for persons at a location in the area to be monitored (step ST81), like the track combination estimating unit 50 of
In the example of
Next, the track combination estimating unit 61 labels the vertices of each three-dimensional moving track graph generated by the three-dimensional moving track graph generating unit 49 (i.e., the three-dimensional moving tracks which construct each graph) to calculate a plurality of candidates for labeling (step ST82).
In this case, the track combination estimating unit 61 can search through the three-dimensional moving track graph thoroughly to list all possible candidates for labeling. The track combination estimating unit 61 can alternatively select only a predetermined number of candidates for labeling at random when there are many candidates for labeling.
Concretely, the track combination estimating unit determines a plurality of candidates for labeling as follows.
As shown in
In this case, the track combination estimating unit 61 calculates candidates A and B for labeling as shown in
For example, labels having label numbers from 0 to 2 are assigned to three-dimensional moving track fragments in the candidate A for labeling, respectively, as shown below.
In this case, it is defined that label 0 shows a set of three-dimensional moving tracks which does not belong to any person (erroneous three-dimensional moving tracks), and label 1 or greater shows a set of three-dimensional moving tracks which belongs to an individual person.
In this case, the candidate A for labeling shows that two persons (label 1 and label 2) exist in the area to be monitored, that a person (1)'s three-dimensional moving track is comprised of the three-dimensional moving tracks L4 and L5 to which label 1 is added, and that a person (2)'s three-dimensional moving track is comprised of the three-dimensional moving tracks L1, L2 and L6 to which label 2 is added.
Furthermore, labels having label numbers from 0 to 2 are added to three-dimensional moving track fragments in the candidate B for labeling, respectively, as shown below.
In this case, the candidate B for labeling shows that two persons (label 1 and label 2) exist in the area to be monitored, that the person (1)'s three-dimensional moving track is comprised of the three-dimensional moving tracks L1, L3 and L5 to which label 1 is added, and that the person (2)'s three-dimensional moving track is comprised of the three-dimensional moving track L4 to which label 2 is added.
Next, the track combination estimating unit 61 calculates a cost function which takes into consideration the number of persons, a positional relationship among the persons, the accuracy of stereoscopic vision, entrance and exit criteria for the area to be monitored, etc. for each of the plurality of candidates for labeling to determine a candidate for labeling which maximizes the cost function and calculate an optimal three-dimensional moving track of each individual person and the number of persons (step ST83).
As the cost function, such a cost as shown below is defined:
Cost=“the number of three-dimensional moving tracks which satisfy the entrance and exit criteria”
In this case, the entrance criteria and the exit criteria which are described in above-mentioned Embodiment 1 are used as the entrance and exit criteria, for example.
In the case of this example, in the candidate A for labeling, both the three-dimensional moving tracks with label 1 and the three-dimensional moving tracks with label 2 satisfy the entrance and exit criteria.
In the candidate B for labeling, only the three-dimensional moving tracks with label 1 satisfy the entrance and exit criteria.
Therefore, because the candidate A for labeling has a cost of "2" and the candidate B for labeling has a cost of "1", the candidate A for labeling is the one whose cost function is a maximum, and the candidate A for labeling is determined as the labeling of an optimal three-dimensional moving track graph.
Therefore, it is also estimated simultaneously that two persons have been moving in the elevator cage.
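This labeling cost can be sketched as follows; the dictionary representation of a candidate for labeling and the predicate satisfies_entry_exit, which is assumed here to test a whole concatenated person track, are hypothetical names introduced only for illustration.

```python
def labeling_cost(labeling, tracks, satisfies_entry_exit):
    """Cost of one candidate for labeling: the number of labelled persons
    (label >= 1) whose set of connected track fragments satisfies the
    entrance and exit criteria.  'labeling' maps a fragment name to a label
    number; label 0 collects erroneous fragments."""
    persons = {}
    for frag, label in labeling.items():
        if label >= 1:
            persons.setdefault(label, []).append(tracks[frag])
    return sum(1 for frags in persons.values() if satisfies_entry_exit(frags))

# Hypothetical labelings mirroring candidates A and B in the text:
# candidate_A = {"L3": 0, "L4": 1, "L5": 1, "L1": 2, "L2": 2, "L6": 2}
# candidate_B = {"L2": 0, "L6": 0, "L1": 1, "L3": 1, "L5": 1, "L4": 2}
```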
After selecting a candidate for labeling whose cost function is a maximum and then calculating an optimal three-dimensional moving track of each individual person, the track combination estimating unit 61 then brings the optimal three-dimensional moving track of each individual person into correspondence with floors specified by a floor recognition unit 12 (stopping floor information showing stopping floors of the elevator), and calculates a person movement history showing the floor where each individual person has got on the elevator and the floor where each individual person has got off the elevator (a movement history of each individual person showing “how many persons have got on the elevator on which floor and how many persons have got off the elevator on which floor”) (step ST84).
In this embodiment, although the example in which the track combination estimating unit brings each of the three-dimensional moving tracks into correspondence with the floor information specified by the floor recognition unit 12 is shown, the track combination estimating unit can alternatively acquire stopping floor information from control equipment for controlling the elevator, and can bring each of the three-dimensional moving tracks into correspondence with the stopping floor information independently.
However, when there are many persons getting on and off the elevator cage on each floor and each three-dimensional moving track graph has a complicated structure, the labeling of each three-dimensional moving track graph produces many possible sets of labels, and the track combination estimating unit may be unable to actually calculate the cost function for all the sets of labels.
In such a case, the track combination estimating unit 61 can carry out the labeling process of labeling each three-dimensional moving track graph by using a probabilistic optimization technique, such as MCMC or GA.
Hereafter, the labeling process of labeling each three-dimensional moving track graph will be explained concretely.
After the three-dimensional moving track graph generating unit 49 generates a three-dimensional moving track graph, the track combination estimating unit 61 defines the set of vertices of the three-dimensional moving track graph, i.e., a set of each person's three-dimensional moving tracks as
Y={yi}i=1, . . . ,N
where N is the number of three-dimensional moving tracks. The track combination estimating unit also defines a state space w as follows.
w={τ0,τ1,τ2, . . . ,τK}
where τ0 is a set of three-dimensional moving tracks yi not belonging to any person, τi is the set of three-dimensional moving tracks yi belonging to the i-th person's three-dimensional moving track, and K is the number of three-dimensional moving tracks (i.e., the number of persons).
τi is comprised of a plurality of connected three-dimensional moving tracks, and can be assumed to be one three-dimensional moving track.
Furthermore, the following equations are satisfied.
∪k=0, . . . ,K τk=Y
τi∩τj=∅ (for all i≠j)
|τk|≥1 (for all k)
At this time, the track combination estimating unit 61 is aimed at determining, for each three-dimensional moving track in the set Y, to which of the sets of three-dimensional moving tracks τ0 to τK the three-dimensional moving track belongs. More specifically, this aim is equivalent to the problem of assigning a label from 0 to K to each element of the set Y.
This aim can be formulized into the problem of defining a likelihood function L(w|Y) as a cost function, and maximizing this cost function.
More specifically, when an optimal track labeling is assumed to be wopt, wopt is given by the following equation.
wopt=argmax L(w|Y)
In this case, the likelihood function L(w|Y) is defined as follows.
L(w|Y)=Lovr(w|Y)Lnum(w|Y)Lstr(w|Y)
where Lovr is a likelihood function in which “any two three-dimensional moving tracks do not overlap each other in the three-dimensional space” is formulized, Lnum is a likelihood function in which “as many three-dimensional moving tracks satisfying the entrance and exit criteria as possible exist” is formulized, and Lstr is a likelihood function in which “the accuracy of stereo vision of a three-dimensional moving track is high” is formulized.
Hereafter, the details of each of the likelihood functions will be mentioned.
The criterion: “any two three-dimensional moving tracks do not overlap each other in the three-dimensional space” is formulized as follows.
Lovr(w|Y)∝exp(−c1Στi∈w−τ0Στj∈w−τ0O(τi,τj))
where O(τi,τj) is the cost of an overlap between the three-dimensional moving track τi and the three-dimensional moving track τj. When the three-dimensional moving track τi and the three-dimensional moving track τj perfectly overlap each other, O(τi,τj) has a value of "1", whereas when the three-dimensional moving track τi and the three-dimensional moving track τj do not overlap each other at all, O(τi,τj) has a value of "0".
As O(τi,τj), O(yi,yj) which is explained in above-mentioned Embodiment 1 is used, for example. c1 is a positive constant.
The criterion that “as many three-dimensional moving tracks satisfying the entrance and exit criteria as possible exist” is formulated as follows.
Lnum(w|Y) ∝ exp(c2 × K + c3 × J)
where K is the number of three-dimensional moving tracks, and is given by K=|w−τ0|.
Furthermore, J is the number of three-dimensional moving tracks, among the K three-dimensional moving tracks τ1 to τK, which satisfy the entrance and exit criteria.
As the entrance and exit criteria, the ones which are explained in above-mentioned Embodiment 1 are used, for example.
The likelihood function Lnum(w|Y) works in such a way that as many three-dimensional moving tracks as possible are selected from the set Y, and the selected three-dimensional moving tracks include as many three-dimensional moving tracks satisfying the entrance and exit criteria as possible. c2 and c3 are positive constants.
The criterion that “the accuracy of stereo vision of a three-dimensional moving track is high” is formulated as follows.
Lstr(w|Y) ∝ exp(−c4 × Στi∈w−τ0 S(τi))
where S(τi) is a stereo cost: when a three-dimensional moving track is estimated by using stereo vision, its S(τi) has a small value, whereas when a three-dimensional moving track is estimated by using monocular vision or has a time period during which it is not observed by any camera, its S(τi) has a large value.
For example, as a method of calculating the stereo cost S(τi), the one which is explained in above-mentioned Embodiment 1 is used. c4 is a positive constant.
Each of the likelihood functions which are defined as mentioned above can be optimized by using a probabilistic optimization technique, such as MCMC or GA.
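For illustration, the following is a minimal sketch of how such a probabilistic optimization of the labeling could look in Python. The helpers overlap_cost, stereo_cost, and satisfies_entrance_exit are hypothetical placeholders standing in for O(τi,τj), S(τi), and the entrance and exit criteria of Embodiment 1, and the proposal scheme is a simple single-label Metropolis move rather than any particular MCMC or GA variant used by the actual device.

```python
import math
import random

# Hypothetical placeholders for O(tau_i, tau_j), S(tau_i) and the
# entrance/exit criteria of Embodiment 1; their concrete forms are not given here.
def overlap_cost(person_a, person_b):
    return 0.0

def stereo_cost(person):
    return 0.0

def satisfies_entrance_exit(person):
    return True

C1, C2, C3, C4 = 1.0, 1.0, 1.0, 1.0  # positive constants c1, c2, c3, c4

def log_likelihood(labels, tracks, max_persons):
    """log L(w|Y) = log Lovr + log Lnum + log Lstr (up to an additive constant)."""
    # Group the elementary 3-D tracks y_i by label; label 0 plays the role of tau_0.
    groups = [[tracks[i] for i, lab in enumerate(labels) if lab == k]
              for k in range(1, max_persons + 1)]
    persons = [g for g in groups if g]            # tau_1 .. tau_K (non-empty only)
    # Lovr: penalize overlaps between different persons' combined tracks.
    l_ovr = -C1 * sum(overlap_cost(gi, gj)
                      for gi in persons for gj in persons if gi is not gj)
    # Lnum: reward the number of persons K and the number J of combined tracks
    # that satisfy the entrance and exit criteria.
    k_num = len(persons)                          # K = |w - tau_0|
    j_num = sum(1 for g in persons if satisfies_entrance_exit(g))
    l_num = C2 * k_num + C3 * j_num
    # Lstr: penalize combined tracks with poor stereo accuracy.
    l_str = -C4 * sum(stereo_cost(g) for g in persons)
    return l_ovr + l_num + l_str

def mcmc_labeling(tracks, max_persons, iterations=10000):
    """Metropolis-style search over labelings; returns the best labeling found."""
    labels = [random.randint(0, max_persons) for _ in tracks]
    cur_ll = log_likelihood(labels, tracks, max_persons)
    best, best_ll = list(labels), cur_ll
    for _ in range(iterations):
        proposal = list(labels)
        proposal[random.randrange(len(tracks))] = random.randint(0, max_persons)
        prop_ll = log_likelihood(proposal, tracks, max_persons)
        if prop_ll >= cur_ll or random.random() < math.exp(prop_ll - cur_ll):
            labels, cur_ll = proposal, prop_ll
            if cur_ll > best_ll:
                best, best_ll = list(labels), cur_ll
    return best, best_ll
```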
As can be seen from the above description, the person tracking device in accordance with this Embodiment 2 is constructed in such a way that the track combination estimating unit 61 calculates a plurality of candidates for labeling by labeling the vertices of each three-dimensional moving track graph generated by the three-dimensional moving track graph generating unit 49, selects an optimal candidate for labeling from among the plurality of candidates for labeling, and estimates the number of persons existing in the area to be monitored. Therefore, this Embodiment 2 provides an advantage of being able to estimate each person's optimal (or semi-optimal) three-dimensional moving track and the number of persons within a realistic time period even when there are an astronomical number of three-dimensional moving track candidates which satisfy the entrance and exit criteria.
The person tracking device in accordance with above-mentioned Embodiment 2 labels the vertices of each three-dimensional moving track graph (the three-dimensional moving tracks which construct each graph) and maximizes, in a probabilistic manner, a cost function which takes into consideration the entrance and exit criteria, to estimate an optimal combination of three-dimensional moving tracks within a realistic time period. However, when the number of persons in each video image increases and each two-dimensional moving track graph has a complicated structure, the number of candidates for three-dimensional moving track fragments acquired as results of the stereoscopic vision can increase astronomically, and the person tracking device cannot complete the processing within a realistic time period even when using the method in accordance with Embodiment 2.
To solve this problem, a person tracking device in accordance with this Embodiment 3 labels the vertices of each two-dimensional moving track graph (the two-dimensional moving tracks which construct each graph) in a probabilistic manner, performs stereoscopic vision according to the labels respectively assigned to the two-dimensional moving tracks to acquire three-dimensional moving tracks, and evaluates a cost function which takes into consideration the entrance and exit criteria for each set of three-dimensional moving tracks, to estimate an optimal three-dimensional moving track within a realistic time period.
The two-dimensional moving track labeling unit 71 carries out a process of determining a plurality of candidates for labeling by labeling the vertices of each two-dimensional moving track graph generated by a two-dimensional moving track graph generating unit 47. The three-dimensional moving track cost calculating unit 72 carries out a process of calculating a cost function regarding a combination of three-dimensional moving tracks, and selecting an optimal candidate for labeling from among the plurality of candidates for labeling to estimate the number of persons existing in an area to be monitored.
Next, the operation of the person tracking device will be explained.
The two-dimensional moving track labeling unit 71 and the three-dimensional moving track cost calculating unit 72, instead of the three-dimensional moving track graph generating unit 49 and the track combination estimating unit 50, are added to the components of the person tracking device in accordance with above-mentioned Embodiment 1. Because the other structural components of the person tracking device are the same as those of the person tracking device in accordance with above-mentioned Embodiment 1, the operation of the person tracking device will be explained hereafter, focusing on the operation of the two-dimensional moving track labeling unit 71 and that of the three-dimensional moving track cost calculating unit 72.
First, the two-dimensional moving track labeling unit 71 calculates a plurality of candidates for labeling for each two-dimensional moving track graph generated by the two-dimensional moving track graph generating unit 47 by labeling the vertices of each two-dimensional moving track graph (the two-dimensional moving tracks which construct each graph) (step ST91). In this case, the two-dimensional moving track labeling unit 71 can search through each two-dimensional moving track graph thoroughly to list all possible candidates for labeling. The two-dimensional moving track labeling unit 71 can alternatively select only a predetermined number of candidates for labeling at random when there are many candidates for labeling.
Concretely, the two-dimensional moving track labeling unit determines a plurality of candidates for labeling as follows.
As shown in
A video image captured by a camera 1
In this case, the two-dimensional moving track labeling unit 71 performs labeling on each two-dimensional moving track graph shown in
In this case, the candidate 1 for labeling is interpreted as follows. The candidate 1 for labeling shows that two persons (corresponding to the labels A and B) exist in the area to be monitored, and the person Y's two-dimensional moving track is comprised of the two-dimensional moving tracks T1, T3, P1, and P2 to which the label A is assigned. The candidate 1 for labeling also shows that the person X's two-dimensional moving track is comprised of the two-dimensional moving tracks T4, T6, P4, and P5 to which the label B is assigned. In this case, the label Z is defined as a special label, and shows that T2, T5, P3, and P6 to which the label Z is assigned are an erroneously-determined set of two-dimensional moving tracks which belong to something which is not a human being.
In this case, although only the three labels A, B, and Z are used, the number of labels used is not limited to three and can be increased arbitrarily as needed.
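As an illustration of how labeling candidates might be represented and generated, the following sketch uses the labels A, B, and Z and the track names T1 to T6 and P1 to P6 from the example above; the structure of the two-dimensional moving track graphs and any connectivity constraints along their directed sides are omitted, so this is only a hypothetical enumeration and sampling scheme, not unit 71's actual procedure.

```python
import itertools
import random

# Track names follow the example above: T1..T6 from one camera's graph and
# P1..P6 from the other camera's graph; the graph structure itself is omitted.
camera1_tracks = ["T1", "T2", "T3", "T4", "T5", "T6"]
camera2_tracks = ["P1", "P2", "P3", "P4", "P5", "P6"]
labels = ["A", "B", "Z"]   # Z marks tracks judged not to belong to any person

def all_labelings(tracks, labels):
    """Exhaustively list every assignment of one label to every track."""
    for combo in itertools.product(labels, repeat=len(tracks)):
        yield dict(zip(tracks, combo))

def sample_labelings(tracks, labels, n):
    """Draw n random labeling candidates when exhaustive listing is too large."""
    for _ in range(n):
        yield {t: random.choice(labels) for t in tracks}

# "Candidate 1" from the text, written as one joint labeling of both graphs.
candidate_1 = {"T1": "A", "T3": "A", "P1": "A", "P2": "A",
               "T4": "B", "T6": "B", "P4": "B", "P5": "B",
               "T2": "Z", "T5": "Z", "P3": "Z", "P6": "Z"}

candidates = list(sample_labelings(camera1_tracks + camera2_tracks, labels, 100))
```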
After the two-dimensional moving track labeling unit 71 generates a plurality of candidates for labeling for each two-dimensional moving track graph, the track stereo unit 48 carries out stereo matching between a two-dimensional moving track assigned a certain label in one video image and a two-dimensional moving track assigned the same label in any other video image, by taking into consideration the installed positions and installation angles of the plurality of cameras 1 with respect to a reference point in the cage calculated by a camera calibration unit 42, to calculate the degree of match between the two-dimensional moving tracks, and then calculates a three-dimensional moving track of each individual person (step ST92).
In the example of
Furthermore, because T2, T5, P3 and P6 to which the label Z is assigned are interpreted as tracks of something which is not a human being, the track stereo unit does not perform stereo matching on the tracks.
Because the other operation regarding the stereoscopic vision of two-dimensional moving tracks by the track stereo unit 48 is the same as that shown in Embodiment 1, the explanation of the other operation will be omitted hereafter.
Next, for the set of three-dimensional moving tracks which the above-mentioned track stereo unit 48 has calculated for each of the plurality of candidates for labeling, the three-dimensional moving track cost calculating unit 72 calculates a cost function which takes into consideration the number of persons, a positional relationship among the persons, the degree of stereo matching between the two-dimensional moving tracks, the accuracy of stereoscopic vision, the entrance and exit criteria for the area to be monitored, and so on, determines a candidate for labeling which maximizes the cost function, and calculates an optimal three-dimensional moving track of each individual person and the number of persons (step ST93).
For example, as the simplest cost function, such a cost as shown below is defined.
Cost=“the number of three-dimensional moving tracks which satisfy the entrance and exit criteria”
In this case, the entrance criteria and the exit criteria which are described in above-mentioned Embodiment 1 are used as the entrance and exit criteria, for example. For example, in the case of
As an alternative, as the cost function, such a cost defined as below can be used.
Cost=“the number of three-dimensional moving tracks which satisfy the entrance and exit criteria”−aדthe sum total of overlap costs each between three-dimensional moving tracks”+bדthe sum total of the degrees of match each between two-dimensional moving tracks”
where a and b are positive constants for establishing a balance among evaluated values. Furthermore, as the degree of match between two-dimensional moving tracks and the overlap cost between three-dimensional moving tracks, the ones which are explained in Embodiment 1 are used, for example.
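A minimal sketch of this alternative cost function might look as follows; the entrance and exit check, the overlap cost, and the degrees of match are hypothetical placeholders for the quantities explained in Embodiment 1, and the constants A_CONST and B_CONST correspond to a and b.

```python
# Hypothetical placeholders for the quantities of Embodiment 1.
def satisfies_entrance_exit(track3d):
    return True          # entrance and exit criteria

def overlap(track_i, track_j):
    return 0.0           # overlap cost between two 3-D moving tracks

A_CONST, B_CONST = 1.0, 1.0   # positive balancing constants a and b

def labeling_cost(tracks3d, match_degrees):
    """Cost of one labeling candidate after stereoscopic vision.

    tracks3d      : the candidate's three-dimensional moving tracks
    match_degrees : degrees of match between the paired two-dimensional tracks
    """
    n_valid = sum(1 for t in tracks3d if satisfies_entrance_exit(t))
    total_overlap = sum(overlap(ti, tj)
                        for i, ti in enumerate(tracks3d)
                        for tj in tracks3d[i + 1:])
    return n_valid - A_CONST * total_overlap + B_CONST * sum(match_degrees)

# The candidate whose cost is largest is selected, e.g.:
# best = max(candidates, key=lambda c: labeling_cost(c["tracks3d"], c["matches"]))
```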
Furthermore, when there are a large number of persons getting on and off and each two-dimensional moving track graph has a complicated structure, the two-dimensional moving track labeling unit 71 may determine a very large number of possible candidates for labeling for each two-dimensional moving track graph, and the three-dimensional moving track cost calculating unit may therefore become unable to actually calculate the cost function for all the labelings.
In such a case, the two-dimensional moving track labeling unit 71 generates candidates for labeling in a probabilistic manner by using a probabilistic optimization technique, such as MCMC or GA, and then determines an optimal or semi-optimal three-dimensional moving track so as to complete the processing within a realistic time period.
Finally, after selecting a candidate for labeling whose cost function is a maximum and then calculating an optimal three-dimensional moving track of each individual person, the three-dimensional moving track cost calculating unit 72 brings the optimal three-dimensional moving track of each individual person into correspondence with floors specified by a floor recognition unit 12 (stopping floor information showing stopping floors of the elevator), and calculates a person movement history showing the floor where each individual person has got on the elevator and the floor where each individual person has got off the elevator (a movement history of each individual person showing “how many persons have got on the elevator on which floor and how many persons have got off the elevator on which floor”) (step ST94).
In this embodiment, although the example in which the three-dimensional moving track cost calculating unit brings each of the three-dimensional moving tracks into correspondence with the floor information specified by the floor recognition unit 12 is shown, the three-dimensional moving track cost calculating unit can alternatively acquire stopping floor information from control equipment for controlling the elevator, and can bring each of the three-dimensional moving tracks into correspondence with the stopping floor information independently.
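As a rough illustration of step ST94, the following sketch counts boardings and alightings per floor by comparing the start and end times of each person's three-dimensional moving track with the stopping floor information; the simplified (t_start, t_end) track representation and the (t_arrive, t_depart, floor) stop events are hypothetical, not the device's actual data structures.

```python
from collections import Counter

def movement_histories(tracks3d, stop_events):
    """Count boardings and alightings per floor.

    tracks3d    : list of (t_start, t_end) times of each person's track in the cage
                  (a hypothetical simplified representation)
    stop_events : list of (t_arrive, t_depart, floor) stopping-floor information
    """
    def floor_at(t):
        for t_arr, t_dep, floor in stop_events:
            if t_arr <= t <= t_dep:
                return floor
        return None

    boarded, alighted = Counter(), Counter()
    for t_start, t_end in tracks3d:
        on_floor = floor_at(t_start)     # floor where the person's track begins
        off_floor = floor_at(t_end)      # floor where the person's track ends
        if on_floor is not None:
            boarded[on_floor] += 1
        if off_floor is not None:
            alighted[off_floor] += 1
    return boarded, alighted
```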
As can be seen from the above description, the person tracking device in accordance with this Embodiment 3 is constructed in such a way that the two-dimensional moving track labeling unit 71 determines a plurality of candidates for labeling by labeling each two-dimensional moving track graph generated by the two-dimensional moving track graph generating unit 47, selects an optimal candidate for labeling from among the plurality of candidates for labeling, and estimates the number of persons existing in the area to be monitored. Therefore, this Embodiment 3 provides an advantage of being able to estimate each person's optimal (or semi-optimal) three-dimensional moving track and the number of persons within a realistic time period even when each two-dimensional moving track graph has a complicated structure and there are an astronomical number of candidates for labeling.
In above-mentioned Embodiments 1 to 3, the method of measuring the person movement history of each person getting on and off an elevator is described. In contrast, in this Embodiment 4, a method of using the person movement history will be described.
A sensor 81 is installed outside an elevator which is an area to be monitored, and consists of a visible camera, an infrared camera, or a laser range finder, for example.
A floor person detecting unit 82 carries out a process of measuring a movement history of each person existing outside the elevator by using information acquired by the sensor 81. A cage call measuring unit 83 carries out a process of measuring an elevator call history.
A group control optimizing unit 84 carries out an optimization process for allocating a plurality of elevator groups efficiently in such a way that elevator waiting times are minimized, and further simulates a traffic flow at the time of carrying out optimal group elevator control.
A traffic flow visualization unit 85 carries out a process of comparing a traffic flow which the video analysis unit 3, the floor person detecting unit 82, and the cage call measuring unit 83 have measured actually with the simulated traffic flow which the group control optimizing unit 84 has generated, and displaying results of the comparison with animation or a graph.
First, the plurality of cameras 1, the video image acquiring unit 2, and the video analysis unit 3 calculate person movement histories of persons existing in the elevator (steps ST1 to ST4).
The floor person detecting unit 82 measures movement histories of persons existing outside the elevator by using the sensor 81 installed outside the elevator (step ST101).
For example, the person tracking device detects and tracks each person's head from a video image by using a visible camera as the sensor 81, like that in accordance with Embodiment 1, and the floor person detecting unit 82 carries out a process of measuring the three-dimensional moving tracks of persons who are waiting for arrival of the elevator and of persons who are about to get on the elevator, the number of persons waiting, and the number of persons getting on.
The sensor 81 is not limited to a visible camera, and can be an infrared camera for detecting heat, a laser range finder, or a pressure-sensitive sensor laid on the floor, as long as the sensor can measure each person's movement information.
The cage call measuring unit 83 measures elevator cage call histories (step ST102). For example, the cage call measuring unit 83 carries out a process of measuring a history of presses of the elevator call button arranged on each floor.
The group control optimizing unit 84 unifies the person movement histories of persons existing in the elevator which are determined by the video analysis unit 3, the person movement histories of persons existing outside the elevator which are measured by the floor person detecting unit 82, and the elevator call histories which are measured by the cage call measuring unit 83, and carries out an optimization process for allocating the plurality of elevator groups efficiently in such a way that average or maximum elevator waiting times are minimized. The group control optimizing unit further simulates, by using a computer, the person movement histories which would result from carrying out optimal group elevator control (step ST103).
In this embodiment, the elevator waiting time of a person is the time which elapses after the person reaches a floor until a desired elevator arrives at the floor.
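Under this definition, the average and maximum waiting times that the optimization targets could be computed as in the following sketch; the person-indexed arrival and boarding times are hypothetical simplified inputs derived from the measured histories.

```python
def waiting_time_statistics(arrivals, boardings):
    """Average and maximum elevator waiting times.

    arrivals  : {person_id: time the person reached the elevator hall}
    boardings : {person_id: time the allocated elevator arrived for that person}
    (hypothetical simplified inputs derived from the measured histories)
    """
    waits = [boardings[p] - arrivals[p] for p in arrivals if p in boardings]
    if not waits:
        return 0.0, 0.0
    return sum(waits) / len(waits), max(waits)   # average and maximum

# Group control is then optimized so that one of these values is minimized, e.g.:
# avg_wait, max_wait = waiting_time_statistics({"p1": 0.0}, {"p1": 25.0})
```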
As an algorithm for optimizing group control, an algorithm disclosed in reference 5 can be used, for example.
Because conventional person tracking devices do not have any means for correctly measuring person movement histories for elevators, a conventional algorithm for optimizing group control carries out the process of optimizing the group elevator control by assuming an appropriate probability distribution of person movement histories inside and outside each elevator. In contrast, the person tracking device in accordance with this Embodiment 4 can implement further optimized group control by inputting the measured person movement histories to the conventional algorithm.
The traffic flow visualization unit 85 finally carries out a process of comparing the person movement histories which the video analysis unit 3, the floor person detecting unit 82, and the cage call measuring unit 83 have measured actually with the simulated person movement histories which the group control optimizing unit 84 has generated, and displaying results of the comparison with animation or a graph (step ST104).
For example, on a two-dimensional cross-sectional view of the building showing the elevators and tenants, the traffic flow visualization unit 85 displays the elevator waiting times, the sum total of persons' amounts of travel, or the probability of each person's travel per unit time with animation, or a diagram of elevator cage travels with a graph. The traffic flow visualization unit 85 can perform a simulation using a computer to increase or decrease the number of elevators installed in the building, or virtually calculate the movement history of a person at the time of introducing a new elevator model into the building, and display simultaneously the results of this simulation and the person movement histories which the video analysis unit 3, the floor person detecting unit 82, and the cage call measuring unit 83 have measured actually. Therefore, the present embodiment offers an advantage of making it possible to compare the simulation results with the actually-measured person movement histories to verify a change from the current traffic flow in the building to the expected traffic flow resulting from the reconstruction.
As can be seen from the above description, because the person tracking device in accordance with this Embodiment 4 is constructed in such a way that the sensor 81 is installed in an area outside the elevators, such as an elevator hall, and measures person movement histories, the present embodiment offers an advantage of being able to determine person movements associated with the elevators completely. This embodiment offers another advantage of implementing optimal group elevator control on the basis of the measured person movement histories. Furthermore, the person tracking device in accordance with this embodiment can correctly verify a change of the traffic flow resulting from reconstruction of the building by comparing the actually-measured person movement histories with the results of a simulation of the reconstruction which are acquired by a computer.
Conventionally, when a wheelchair accessible button of an elevator is pushed down on a floor, the elevator is allocated to the floor on a priority basis. However, because the elevator is allocated to the floor on a priority basis even when a healthy person accidentally pushes down the wheelchair accessible button without intending to do so, such allocation lowers the operational efficiency of the elevator group.
To solve this problem, this Embodiment 5 shows a structure which operates the elevator group efficiently by operating a cage on a priority basis only when a wheelchair is recognized through image processing and it is further recognized that a person in the wheelchair exists on a floor and then in an elevator cage.
A wheelchair detecting unit 91 carries out a process of specifying a wheelchair and a person sitting on the wheelchair from among persons which are determined by the video analysis unit 3 and the floor person detecting unit 82.
First, the plurality of cameras 1, the video image acquiring unit 2, and the video analysis unit 3 calculate person movement histories of persons existing in the elevator (steps ST1 to ST4). The floor person detecting unit 82 measures movement histories of persons existing outside the elevator by using the sensor 81 installed outside the elevator (step ST101). The cage call measuring unit 83 measures elevator cage call histories (step ST102).
The wheelchair detecting unit 91 carries out the process of specifying a wheelchair and a person sitting on the wheelchair from among persons which are determined by the video analysis unit 3 and the floor person detecting unit 82 (step ST201). For example, by carrying out machine learning of patterns of wheelchair images through image processing by using an AdaBoost algorithm, a support vector machine, or the like, the wheelchair detecting unit specifies a wheelchair existing in the cage or on a floor from a camera image on the basis of the learned patterns. Furthermore, an electronic tag, such as an RFID (Radio Frequency IDentification) tag, can be attached to each wheelchair beforehand, and the person tracking device can then detect that a wheelchair to which an electronic tag is attached is approaching an elevator hall.
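As an illustration only, a wheelchair classifier of the kind mentioned above could be sketched with a support vector machine as follows; scikit-learn's SVC is used here as an illustrative stand-in for the learned patterns, and the feature vectors (for example, HOG descriptors of image regions around detected persons) are assumed to have been extracted elsewhere.

```python
import numpy as np
from sklearn.svm import SVC

def train_wheelchair_classifier(features, labels):
    """Fit an SVM on labeled feature vectors (1 = wheelchair, 0 = no wheelchair)."""
    clf = SVC(kernel="rbf", probability=True)
    clf.fit(np.asarray(features), np.asarray(labels))
    return clf

def contains_wheelchair(clf, feature, threshold=0.5):
    """Judge whether a single image region contains a wheelchair."""
    prob = clf.predict_proba(np.asarray(feature).reshape(1, -1))[0, 1]
    return prob >= threshold
```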
When a wheelchair is detected by the wheelchair detecting unit 91, a group control optimizing unit 84 allocates an elevator to the person in the wheelchair on a priority basis (step ST202). For example, when a person sitting on a wheelchair pushes an elevator call button, the group control optimizing unit 84 allocates an elevator to the floor on a priority basis, and carries out a preferential-treatment elevator operation of not stopping on any floor other than the destination floor. Furthermore, when a person in a wheelchair is going to enter an elevator cage, the group control optimizing unit can lengthen the time interval during which the door of the elevator is open, and the time required to close the door.
Conventionally, an elevator is allocated to the corresponding floor on a priority basis even when a healthy person accidentally pushes down a wheelchair accessible button without intending to do so, and such allocation lowers the operational efficiency of the elevator group. In contrast, the person tracking device in accordance with this Embodiment 5 is constructed in such a way that the wheelchair detecting unit 91 detects a wheelchair, and group elevator control, such as allocation of an elevator cage to the corresponding floor on a priority basis, is carried out dynamically according to the detection state of the wheelchair. Therefore, the person tracking device in accordance with this Embodiment 5 can carry out elevator operations more efficiently than conventional person tracking devices do. Furthermore, this embodiment offers an advantage of being able to eliminate wheelchair accessible buttons for elevators.
In addition, in this Embodiment 5, although only the detection of a wheelchair is explained, the person tracking device can be constructed in such a way as to detect not only wheelchairs but also important persons, old persons, children, etc. automatically, and adaptively control the allocation of elevator cages, the door opening and closing times, etc.
Because the person tracking device in accordance with the present invention can surely specify persons existing in an area to be monitored, the person tracking device in accordance with the present invention can be applied to the control of allocation of elevator cages of an elevator group, etc.
Priority claim: Application No. 2009-040742, filed Feb 2009, JP (national).
PCT filing: PCT/JP2010/000777, filed 2/9/2010, WO, Kind 00, 371(c) date 8/3/2011.