The present invention relates to a person detection apparatus and a person detection method that detect a person from an image.
In recent years, research on detecting a person using an image captured by a camera has been conducted. In particular, because the outline of the head and shoulders of a person represents a characteristic shape of a person, attention has been focused on detecting a person using an edge-based feature value of this outline. The outline of the head and shoulders of a person resembles the Greek letter omega (Ω), and this outline is therefore called an “omega figure” or an “omega shape,” for example.
A technology described in Non-Patent Literature (hereinafter, abbreviated as “NPL”) 1 (this technology is referred to as “related art,” hereinafter) focuses on the omega shape in detecting a person from an image. The related art uses a Boosting method such as Real AdaBoost to learn tendencies of feature values of images including the omega shape from several thousand to several tens of thousands of sample images. Based on this learning, the related art generates a classifier that classifies images into those including the omega shape and those not including the omega shape. Examples of feature values include HoG (histograms of oriented gradients) feature values, Sparse feature values, and Haar feature values. Examples of learning methods include a Boosting method, an SVM (support vector machine), and a neural network.
According to the related art as described above, it is possible to determine at which positions the head and shoulders of a person are located in an image based on low-dimensional information with a small processing load.
In an attempt to detect a suspicious person from an image captured by a surveillance camera or to analyze the actions of a worker from an image captured in a factory, it is desirable that the posture or actions or the like of the person can be detected automatically from the image. In order to perform automatic detection of this kind, it is necessary to estimate, from the image, the states of parts of the body of the person, such as the position and orientation of the shoulders or the inclination of the head with respect to the shoulders.
However, according to the related art, although it is possible to estimate the position of the head and shoulders of a person, the states of the aforementioned parts cannot be estimated.
An object of the present invention is to provide a person detection apparatus and a person detection method that can estimate a state of a part of a person from an image.
A person detection apparatus according to an aspect of the present invention includes: an evaluation section that acquires a predetermined outline of a person from an evaluation image; and an estimation section that estimates, on a basis of an estimation model that shows a relationship between the predetermined outline of a person and a state of a predetermined part of the person, a state of the predetermined part of a person included in the evaluation image based on the predetermined outline acquired from the evaluation image.
A person detection method according to an aspect of the present invention includes: acquiring a predetermined outline of a person from an evaluation image; and estimating, on a basis of an estimation model that shows a relationship between the predetermined outline of a person and a state of a predetermined part of the person, a state of the predetermined part of a person included in the evaluation image based on the predetermined outline acquired from the evaluation image.
According to the present invention, a state of a part of a person can be estimated from an image.
An embodiment of the present invention will be described in detail hereunder with reference to the drawings. The present embodiment is an example in which the present invention is applied to an apparatus that estimates the position and orientation of the shoulders of a person using an omega shape from an image.
In
Operation control section 200 provides an operation screen that allows a user to operate person detection apparatus 100. Operation control section 200 notifies each section of the contents of operations performed by the user, and also displays an evaluation result image, described hereinafter, received as input from orientation estimation section 500, thereby presenting the estimation result to the user.
Shoulder-position estimation model generation section 300 acquires an omega shape (predetermined outline of a person) and a position of shoulders (state of a predetermined part of a person) in a sample image, and generates a shoulder position estimation model that shows the relationship between the omega shape and the position of the shoulders. Shoulder-position estimation model generation section 300 includes sample image storage section 310, omega feature generation section 320, omega feature storage section 330 and shoulder-position estimation model storage section 340.
Sample image storage section 310 stores sample images (N images) including an omega shape for generating a shoulder position estimation model.
Omega feature generation section 320 extracts overall omega feature values from sample images stored in sample image storage section 310, and outputs the extracted overall omega feature values to shoulder-position estimation model storage section 340. The term “overall omega feature value” refers to information in which an omega feature value and a shoulder position feature value are associated with each other for each sample image. The term “omega feature value” refers to information that represents a feature of an omega shape. The term “shoulder position feature value” refers to information that represents a position of a shoulder relative to a position of an omega shape (hereunder, referred to as “shoulder position”).
Omega feature storage section 330 performs a principal component analysis with respect to the received overall omega feature values, and generates and stores a principal component list and a principal component score list. The term “principal component list” refers to information that describes principal components of the overall omega feature values. The term “principal component score list” refers to information in which a score of each principal component is described for each sample image.
Omega feature storage section 330 also generates and stores a principal component score designated range list. The term “principal component score designated range list” refers to information that describes a designated range of scores for each principal component.
Omega feature storage section 330 outputs the received overall omega feature values to shoulder-position estimation model storage section 340. The principal component list, the principal component score list, and the principal component score designated range list are collectively referred to hereunder as “omega generation information.”
Shoulder-position estimation model storage section 340 generates and stores a shoulder position estimation model that shows a relationship between an omega shape and positions of shoulders based on the received overall omega feature values. More specifically, shoulder-position estimation model storage section 340 stores, as a shoulder position estimation model, a regression coefficient matrix obtained by performing a multiple regression analysis using shoulder position feature values (N rows) as objective variables and omega feature values (N rows) as explanatory variables.
Shoulder-position estimation section 400 acquires an omega shape (outline of a person) from an evaluation image, and estimates the positions of the shoulders (state of a part of a person) of a person included in the evaluation image from the acquired omega shape on the basis of the shoulder position estimation model. Shoulder-position estimation section 400 includes evaluation image storage section 410, omega generation section 420, evaluation section 430 and shoulder-position calculating section 440.
Evaluation image storage section 410 stores an evaluation image that serves as a target for detection of a person. In the present embodiment, it is assumed that an evaluation image is an edge image obtained by extracting an edge portion of an image captured by a camera or the like.
Omega generation section 420 generates an omega image based on omega generation information stored by omega feature storage section 330, and outputs the generated omega image to evaluation section 430. The term “omega image” refers to an image in which an omega shape has been reconstructed as an image.
In the present embodiment, it is assumed that omega generation section 420 reconstructs omega shapes of different types by changing a combination of values that are adopted from among the omega generation information. Hereunder, an omega shape that has been reconstructed based on omega generation information is referred to as a “sample omega shape,” an image including a sample omega shape is referred to as a “sample omega image,” and an omega feature value that shows a feature of a sample omega shape is referred to as a “sample omega feature value.”
Evaluation section 430 performs edge matching between received sample omega images and an evaluation image to evaluate the similarity between the evaluation image and each of the sample omega images. Furthermore, evaluation section 430 acquires a sample omega feature value of a sample omega image similar to the evaluation image as an omega feature value of an omega shape included in the evaluation image, and outputs the acquired omega feature value to shoulder-position calculating section 440. Hereunder, an omega shape obtained from an evaluation image is referred to as an “evaluation omega shape,” and an omega feature value obtained from an evaluation image is referred to as an “evaluation omega feature value.”
Shoulder-position calculating section 440 estimates a shoulder position feature value based on a received evaluation omega feature value, using the shoulder position estimation model stored by shoulder-position estimation model storage section 340 and outputs the estimated shoulder position feature value to orientation estimation section 500. Hereunder, a shoulder position feature value estimated based on an evaluation omega feature value is referred to as an “evaluation shoulder position feature value.” The evaluation shoulder position feature value is information that shows a shoulder position of a person included in an evaluation image.
Orientation estimation section 500 estimates the orientation of a shoulder (hereunder, referred to as “shoulder orientation”) of a person included in an evaluation image based on a received evaluation shoulder position feature value. Furthermore, orientation estimation section 500 acquires an evaluation image from evaluation image storage section 410, generates an evaluation result image in which an image showing a shoulder position and shoulder orientation (a state of a part of a person) is superimposed on the evaluation image, and outputs the evaluation result image to operation control section 200.
Note that person detection apparatus 100 is configured by, for example, a computer system that has a communication function (personal computer or workstation or the like). Although not shown in the drawings, the computer system broadly includes an input apparatus, a computer unit, an output apparatus, and a communication apparatus. The input apparatus is a keyboard or a mouse, for example. The output apparatus is, for example, a display or a printer. The communication apparatus is, for example, a communication interface that can connect to an IP network. The computer unit, for example, mainly includes a CPU (central processing unit) and a storage apparatus. The CPU has a control function and a computing function. The storage apparatus includes, for example, a ROM (read only memory), which stores programs and data, and a RAM (random access memory), which temporarily stores data. The ROM can be a flash memory whose contents can be electrically rewritten.
Person detection apparatus 100 configured in the manner described above uses the shoulder position estimation model that shows the relationship between an omega shape and a shoulder position, so that person detection apparatus 100 can estimate a shoulder position and a shoulder orientation of a person included in an evaluation image based on the evaluation image.
Next, the operation of person detection apparatus 100 will be described.
In step S1000, operation control section 200 generates and displays an operation screen on an output apparatus such as a liquid crystal display.
As shown in
Model-generation-preparation-button arrangement region 615 includes “sample image selection” button 617, “omega shape generation” button 618, “reference point setting” button 619, “right shoulder position setting” button 620, “left shoulder position setting” button 621 and “omega shape generation complete” button 622.
It is possible to press “model generation preparation” button 611, “model generation” button 612, “evaluation” button 613 and “end” button 614 at any time. However, a configuration may also be adopted in which “model generation preparation” button 611 cannot be pressed when a sample image has not been prepared. Furthermore, a configuration may be adopted in which “model generation” button 612 cannot be pressed when omega generation information has not been prepared. Furthermore, a configuration may be adopted in which “evaluation” button 613 cannot be pressed when a shoulder position estimation model and an evaluation image have not been prepared.
Buttons 617 to 622, which are arranged in model-generation-preparation-button arrangement region 615, can be pressed only during a period from when “model generation preparation” button 611 is pressed until any of buttons 612 to 614 is pressed.
“Model generation preparation” button 611 is a button for allowing a user to input an instruction to generate omega generation information. Each time “model generation preparation” button 611 is pressed, operation control section 200 sends a “model generation preparation request” to shoulder-position estimation model generation section 300.
“Model generation” button 612 is a button for allowing a user to input an instruction to generate a shoulder position estimation model. Each time “model generation” button 612 is pressed, operation control section 200 sends a “model generation request” to shoulder-position estimation model generation section 300.
“Evaluation” button 613 is a button for allowing a user to input an instruction to evaluate an evaluation image and output the evaluation result (shoulder position and shoulder orientation). Each time “evaluation” button 613 is pressed, operation control section 200 sends an “evaluation request” to shoulder-position estimation section 400.
“End” button 614 is a button for allowing a user to input an instruction to end a series of processing. When “end” button 614 is pressed, operation control section 200 sends an “end” notification to shoulder-position estimation model generation section 300 and shoulder-position estimation section 400.
Processing-content display region 616 displays a character string or diagram that shows the progress of processing or a processing result, or an image received from shoulder-position estimation model generation section 300, orientation estimation section 500, or the like. Operation control section 200 receives a pointing operation, performed by means of a pointer or the like, with respect to each position of processing-content display region 616.
Each time any one of buttons 617 to 622 of model-generation-preparation-button arrangement region 615 is pressed, operation control section 200 sends information showing which button was pressed to shoulder-position estimation model generation section 300 and shoulder-position estimation section 400. In this case, as necessary, operation control section 200 also sends information that indicates which position is pointed at to shoulder-position estimation model generation section 300 and shoulder-position estimation section 400.
Subsequently, in step S2000 in
If “model generation preparation” button 611 has been pressed (S2000: YES), omega feature generation section 320 proceeds to step S3000. In contrast, if “model generation preparation” button 611 has not been pressed (S2000: No), omega feature generation section 320 proceeds to step S4000.
In step S3000, person detection apparatus 100 performs overall-omega-feature-value generation processing.
First, in step S3010, omega feature generation section 320 stands by for notification from operation control section 200 that “sample image selection” button 617 has been pressed. Upon receiving notification indicating that “sample image selection” button 617 has been pressed (S3010: YES), omega feature generation section 320 proceeds to step S3020.
In step S3020, omega feature generation section 320 outputs data of a sample image selection screen to operation control section 200, to cause the sample image selection screen to be displayed in processing-content display region 616. The sample image selection screen is a screen that receives, from a user, a selection of a sample image to be used to generate omega generation information from among sample images stored in sample image storage section 310. The sample image selection screen, for example, displays a thumbnail or an attribute such as a file name of each sample image as choices.
Next, in step S3030, omega feature generation section 320 stands by for notification from operation control section 200 of information indicating that “omega shape generation” button 618 has been pressed and also indicating the selected sample image. Upon being notified of information indicating that “omega shape generation” button 618 has been pressed and indicating the selected sample image (S3030: YES), omega feature generation section 320 proceeds to step S3040.
In step S3040, omega feature generation section 320 outputs data of an omega shape generation screen that includes the selected sample image to operation control section 200, to cause the omega shape generation screen to be displayed in processing-content display region 616. The omega shape generation screen is a screen for receiving setting of an omega shape with respect to a sample image from a user. In the present embodiment, it is assumed that setting of an omega shape is performed by setting positions of omega candidate points forming an omega shape.
As shown in
Omega shape 630 includes a portion denoted by reference numeral 633, which mainly represents the outline of the head (hereunder, referred to as “spherical part”), and a portion denoted by 634, which mainly represents the outline of the shoulders (hereunder, referred to as “wing part”). Neck part 635 often exists at a boundary between spherical part 633 and wing part 634. That is, omega shape 630 forms a characteristic shape.
Omega feature generation section 320 assigns identification numbers to points 631 that form omega shape 630 in correspondence with their adjacency order. Hereunder, a point included in omega shape 630 for which a setting with respect to a sample image is received from a user is referred to as an “omega candidate point.” That is, the term “omega candidate point” refers to a point on an edge line from the head to the shoulders of a person, which is set by a user.
As shown in
Note that omega shape 630 may be a part of the outline of the head and shoulders. For example, as shown in
The omega shape generation screen displays an omega figure having a standard shape that can be moved onto a selected sample image and changed in shape. The omega shape generation screen receives a user operation in which the user visually aligns a standard omega figure with an actual omega shape included in a sample image.
First, as shown in
According to the present embodiment, when the number of omega candidate points set on an edge line of the outline of a shoulder is denoted by “a,” and the number of omega candidate points set on an edge line of the outline of the head is denoted by “b,” the total number of omega candidate points, denoted by “m,” is represented by equation 1 below.
[1]
m=2a+b (Equation 1)
For example, a=5 and b=10 may be defined as a setting rule of omega candidate points, in which case m=20.
In step S3050 in
In the present embodiment, an assumption is made that it is previously determined as a reference point setting rule that the reference point is the center of gravity of the head. Therefore, the user estimates the center of gravity of the head of a person from a sample image, and sets reference point 643 at the estimated position as shown in
In step S3060, omega feature generation section 320 generates omega candidate point information C that shows the positions of the omega candidate points set by the user, based on the positions of the set omega candidate points and reference point. Subsequently, omega feature generation section 320 generates omega feature values K that represent features of an omega shape based on the position of each omega candidate point relative to the reference point.
As shown in
As shown in
As shown in
That is, omega feature value K is 2m dimensional data K=(k1, k2, . . . km) that is a set of omega candidate points represented by two-dimensional data k=(R, θ). Note that the contents of omega feature value K are not limited to this example, and may be a set of distances in the x-axis direction and distances in the y-axis direction from a reference point, for example.
Single omega feature value K for one sample image is obtained by the processing described so far.
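For illustration, the conversion from pixel coordinates to the polar representation described above can be sketched as follows in Python/NumPy. The function name, argument layout, and element ordering are illustrative assumptions for this sketch, not details taken from this specification.

```python
import numpy as np

def omega_feature(candidate_points, reference_point):
    """Convert m omega candidate points (pixel coordinates) into the
    2m-dimensional omega feature value K = (R1, theta1, ..., Rm, thetam),
    each point expressed by its distance R and angle theta relative to
    the reference point (assumed here to be the head's center of gravity)."""
    pts = np.asarray(candidate_points, dtype=float)   # shape (m, 2): (x, y) per point
    ref = np.asarray(reference_point, dtype=float)    # shape (2,)
    d = pts - ref
    radius = np.hypot(d[:, 0], d[:, 1])               # distance R from the reference point
    angle = np.arctan2(d[:, 1], d[:, 0])              # angle theta around the reference point
    return np.column_stack([radius, angle]).ravel()   # (R1, theta1, R2, theta2, ...)
```

With a=5 and b=10, the returned vector has 2m=40 elements, which matches the dimension of K′ ([1][40]) used with equation 9 later in the description.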
In step S3070 in
In step S3080, omega feature generation section 320 outputs data for a shoulder position designation screen to operation control section 200, to cause the shoulder position designation screen to be displayed on processing-content display region 616. The shoulder position designation screen is a screen that receives settings from a user for a shoulder position on a sample image.
According to the present embodiment, as a shoulder position setting rule, an assumption is made that it is previously determined that a center position of a shoulder joint is taken as a shoulder position. Accordingly, the user estimates the center position of the joint of the right shoulder from the sample image, and as shown in
Next, in step S3090 in
Next, in step S3100, omega feature generation section 320 generates shoulder candidate point information R, which shows the shoulder positions set by the user, based on the set right shoulder position and left shoulder position. Subsequently, omega feature generation section 320 generates shoulder position feature value Q that represents the feature of shoulder position based on each shoulder position relative to the reference point.
As shown in
As shown in
That is, shoulder position feature value Q is four-dimensional data Q=(q1, q2) that is a set of shoulder positions represented by two-dimensional data q=(R, θ). Note that the contents of shoulder position feature value Q are not limited to this example, and may be a set of distances in the x-axis direction and distances in the y-axis direction from a reference point, for example.
Subsequently, in step S3110 in
In step S3120, omega feature generation section 320 integrates omega feature value K and shoulder position feature value Q to generate an overall omega feature value, as sketched below. However, in a case where an overall omega feature value has already been generated, omega feature generation section 320 updates the overall omega feature value by adding the newly generated omega feature value K and shoulder position feature value Q.
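Assembling the overall omega feature values amounts to concatenating K (2m values) and Q (4 values) per sample image and stacking one row per image. A hedged sketch follows, reusing the hypothetical omega_feature() helper from the previous example; the dictionary keys are illustrative.

```python
import numpy as np

def shoulder_feature(right_shoulder, left_shoulder, reference_point):
    """Shoulder position feature value Q = (R_right, theta_right, R_left, theta_left)."""
    return omega_feature([right_shoulder, left_shoulder], reference_point)

def overall_omega_features(samples):
    """samples: list of N dicts with keys 'points', 'reference',
    'right_shoulder', 'left_shoulder'.  Returns the omega feature
    matrix K (N x 2m) and the shoulder position feature matrix Q (N x 4)."""
    K = np.vstack([omega_feature(s['points'], s['reference']) for s in samples])
    Q = np.vstack([shoulder_feature(s['right_shoulder'], s['left_shoulder'],
                                    s['reference']) for s in samples])
    return K, Q
```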
Next, in step S3130, omega feature generation section 320 determines whether or not generation of N omega feature values K and shoulder position feature values Q has been completed and these values have been described in the overall omega feature values.
If generation of N omega feature values K and shoulder position feature values Q has not been completed (S3130: No), the processing returns to step S3010 to perform processing for an unprocessed sample image. That is, the operation starting from the time when the user presses “sample image selection” button 617 for a sample image stored in sample image storage section 310 until the user presses “omega shape generation complete” button 622 is repeated N times.
When omega feature generation section 320 has completed generation of N omega feature values K and shoulder position feature values Q (S3130: YES), omega feature generation section 320 outputs the overall omega feature values to omega feature storage section 330, and thereafter, the processing returns to the processing in
As shown in
In step S4000 in
If “model generation” button 612 has been pressed and there are overall omega feature values (S4000: YES), omega feature storage section 330 proceeds to step S5000. In contrast, if “model generation” button 612 has not been pressed or if there are no overall omega feature values (S4000: No), omega feature storage section 330 proceeds to step S6000.
In step S5000, omega feature storage section 330 generates and stores omega generation information, and also outputs a “shoulder position estimation model generation request” and overall omega feature values to shoulder-position estimation model storage section 340. Upon receipt of the request and the values, shoulder-position estimation model storage section 340 generates a shoulder position estimation model based on the received overall omega feature values, and stores the shoulder position estimation model therein.
Specifically, omega feature storage section 330 applies a principal component analysis to the overall omega feature values to calculate principal component vectors P=(p1, p2, . . . pm′) up to an m′-th principal component. Furthermore, omega feature storage section 330 calculates principal component scores S=(s1, s2, . . . sm′) up to the m′-th principal component in correspondence with the principal component vectors P. The value of m′ is set to a number such that the cumulative contribution ratio is equal to or greater than a %. The value of “a” is determined by experiments or the like, taking into account the processing load and the estimation accuracy.
Omega feature storage section 330 generates a principal component list describing principal components P, and a principal component score list describing principal component scores S.
As shown in
As shown in
Omega feature storage section 330 also extracts a maximum value and a minimum value from among the principal component scores with respect to an i-th principal component (1≦i≦m′), which correspond to columns of principal component score list 720. Omega feature storage section 330 generates and stores a principal component score designated range list describing a maximum value and a minimum value of each principal component score, that is, a score range.
As shown in
Note that a configuration may also be adopted in which omega feature storage section 330 does not create, as principal component score designated range list 730, a list of the actual range from the minimum value to the maximum value, but instead creates a list of a score range narrower than the actual range.
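Producing the omega generation information with a standard principal component analysis might look roughly as follows. The use of an SVD, the retention of the mean feature value for later reconstruction, and the default threshold are assumptions of this sketch rather than details from the specification.

```python
import numpy as np

def omega_generation_info(K, contribution_ratio=0.95):
    """K: N x 2m matrix of omega feature values (one row per sample image).
    Returns the principal component list P (m' x 2m), the principal component
    score list S (N x m'), the principal component score designated range list
    (m' x 2, min and max per component), and the mean feature value."""
    mean = K.mean(axis=0)
    centered = K - mean
    # SVD of the centered data; rows of Vt are the principal component vectors.
    _, singular, Vt = np.linalg.svd(centered, full_matrices=False)
    variance = singular ** 2
    cumulative = np.cumsum(variance) / variance.sum()     # cumulative contribution ratio
    m_prime = int(np.searchsorted(cumulative, contribution_ratio)) + 1
    P = Vt[:m_prime]                                       # principal component list
    S = centered @ P.T                                     # principal component score list
    score_range = np.stack([S.min(axis=0), S.max(axis=0)], axis=1)
    return P, S, score_range, mean
```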
Shoulder-position estimation model storage section 340 performs a multiple regression analysis using shoulder position feature values Q (N rows) included in the overall omega feature values as objective variables, and omega feature values K (N rows) included in the overall omega feature values as explanatory variables. Shoulder-position estimation model storage section 340 calculates and stores regression coefficient matrix A that satisfies equation 2 below.
[2]
Q=KA (Equation 2)
Shoulder-position estimation model storage section 340 stores the calculated regression coefficient matrix A as a shoulder position estimation model. Note that a configuration may also be adopted in which, at this time, shoulder-position estimation model storage section 340 outputs a “model generation complete notification” to operation control section 200 to cause processing-content display region 616 to output a character string such as “shoulder position estimation model generation has been completed.”
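With the overall omega feature values in matrix form, regression coefficient matrix A of equation 2 can be obtained by ordinary least squares. The sketch below omits an intercept term, mirroring the form Q=KA; whether the actual implementation centers the data first is not stated in the specification.

```python
import numpy as np

def fit_shoulder_position_model(K, Q):
    """Multiple regression with omega feature values K (N x 2m) as explanatory
    variables and shoulder position feature values Q (N x 4) as objective
    variables.  Returns regression coefficient matrix A (2m x 4), i.e. Q ~ K @ A."""
    A, *_ = np.linalg.lstsq(K, Q, rcond=None)
    return A

def estimate_shoulder_feature(A, K_eval):
    """Equation 9: evaluation shoulder position feature value Q' = K'A."""
    return np.asarray(K_eval) @ A
```

With m=20 omega candidate points, K is N×40 and A is 40×4, matching the dimensions given for equation 9 later in the description.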
In step S6000 in
If “evaluation” button 613 has been pressed and there is omega generation information and a shoulder position estimation model (S6000: YES), evaluation section 430 proceeds to step S7000. In contrast, if “evaluation” button 613 has not been pressed, or if there is no omega generation information or no shoulder position estimation model (S6000: No), evaluation section 430 proceeds to step S9000.
In step S7000, person detection apparatus 100 performs image matching processing.
First, in step S7010, evaluation section 430 acquires an evaluation image from evaluation image storage section 410, and sends an “omega generation request” to omega generation section 420. Upon receiving the “omega generation request,” omega generation section 420 generates and stores a sample omega image based on omega generation information stored by omega feature storage section 330.
In this case, omega generation section 420 reconstructs H different types of omega shapes by changing the combination of values adopted as the principal component score of each principal component among the omega generation information. The value of “H” is determined by experiments or the like, taking into account the processing load and the estimation accuracy.
Specifically, omega generation section 420 generates principal component scores Sj (j=1 to H) that are formed by values within a range described in principal component score designated range list 730 (see
[3]
Vj=SjP (Equation 3)
Note that when the number of principal component scores Sf of an f-th principal component is taken as Uf, the total number of sample omega shapes that can be reconstructed from omega generation information is represented by equation 4 below.
[4]
U1×U2× . . . ×Um′ (Equation 4)
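One way to enumerate the score combinations and reconstruct the corresponding sample omega feature values Vj is sketched below. The number of grid steps per principal component and the re-addition of the mean feature value (needed when, as in the PCA sketch above, scores were computed on mean-centered data) are assumptions of this sketch.

```python
import itertools
import numpy as np

def sample_omega_features(P, score_range, mean, steps_per_component=3):
    """P: m' x 2m principal component list; score_range: m' x 2 array of
    (min, max) designated score ranges; mean: mean omega feature value.
    Enumerates U1 x U2 x ... x Um' score combinations (equation 4) and
    reconstructs one sample omega feature value Vj per combination (equation 3)."""
    grids = [np.linspace(lo, hi, steps_per_component) for lo, hi in score_range]
    features = []
    for combo in itertools.product(*grids):        # one principal component score Sj
        Sj = np.asarray(combo)
        Vj = mean + Sj @ P                         # reconstructed 2m-dimensional feature
        features.append(Vj)
    return np.vstack(features)                     # H x 2m matrix of sample omega features
```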
Subsequently, omega generation section 420 generates a sample omega shape indicated by sample omega feature value Vj, and generates a sample omega image that includes the sample omega shape. The size of the sample omega image is, for example, the same size as the sample image that is the base image. In this case, it is assumed that the size of the sample omega image is a width of w pixels×a height of h pixels. Furthermore, the sample omega image is an image that depicts the shape indicated by sample omega feature value Vj, in which the upper left corner of the image is taken as the origin and a reference point is placed at the center (w/2, h/2) of the image.
When a combination of the values of principal component score Sj changes, as indicated by arrow 741 in
Furthermore, omega generation section 420 generates and stores a sample omega image list in which the generated sample omega images and the sample omega feature values Vj on which the sample omega images are based are described in association with each other.
As shown in
Upon completing generation of sample omega image list 750, omega generation section 420 sends an “omega generation complete” notification to evaluation section 430.
Upon receiving the “omega generation complete” notification, in step S7020 in
Next, in step S7030, evaluation section 430 scans a detection window over the evaluation image and evaluates the edge matching between the sample omega image and a cut-out image that is cut out from the evaluation image by the detection window at a certain position.
As shown in
Next, evaluation section 430 determines whether or not, when the sample omega image that has been selected is superimposed on an evaluation omega shape included in cut-out image 762, a feature indicating an edge can be seen in pixels corresponding to the sample omega shape in cut-out image 762. That is, evaluation section 430 determines whether or not a shape that is approximate to the sample omega shape is included at the same position in cut-out image 762.
Specifically, for each pixel corresponding to the sample omega shape, for example, evaluation section 430 determines whether or not an edge exists at a corresponding portion of cut-out image 762. Evaluation section 430 calculates a value obtained by dividing the number of pixels at which it is determined that an edge exists by the total number of pixels of the sample omega shape as a matching score. Hereunder, processing that calculates this matching score is referred to as “edge matching processing.”
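The edge matching score, that is, the fraction of sample omega shape pixels for which the cut-out image contains an edge, can be computed directly on binary images; a minimal sketch, assuming both inputs are boolean arrays of the same w×h size:

```python
import numpy as np

def edge_matching_score(sample_omega_mask, cutout_edges):
    """sample_omega_mask: boolean array, True on pixels of the sample omega shape.
    cutout_edges: boolean array of the same shape, True where the cut-out image
    has an edge.  Returns the matching score in [0, 1]."""
    shape_pixels = np.count_nonzero(sample_omega_mask)
    if shape_pixels == 0:
        return 0.0
    hits = np.count_nonzero(sample_omega_mask & cutout_edges)
    return hits / shape_pixels
```

Scanning the detection window then amounts to evaluating this score at each of the L cutting-out positions and recording the maximum value MSj per sample omega image.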
Note that, for example, the correlation with the positions of the shoulder joints of a person is different between an end portion of a shoulder and a top part of the head. Therefore, shoulder-position calculating section 440 may be configured so that, for each position of an omega shape that is indicated by maximum likelihood omega feature value K′, the influence of the relevant position on estimation of a shoulder position differs. In this case, shoulder-position calculating section 440 previously creates a degree of influence list describing, for example, for each point (hereunder, referred to as “omega point”) that defines the maximum likelihood omega feature value K′, a degree of influence on shoulder position estimation of the relevant point as a weight γ.
A more specific description will be provided hereinafter. For the convenience of description, let us first suppose a case in which {α1, α2, α3, α4} is stored in the first row of regression coefficient matrix A and {β1, β2, β3, β4} is stored in the second row thereof.
At this time, α1 means the degree of influence of the radius of an omega point denoted by identification number 1 on estimation of the radius of the right shoulder. Furthermore, α2 means the degree of influence of the radius of the omega point denoted by identification number 1, on estimation of the angle of the right shoulder. In addition, α3 means the degree of influence of the radius of the omega point denoted by identification number 1, on estimation of the radius of the left shoulder. Furthermore, α4 means the degree of influence of the radius of the omega point denoted by identification number 1, on estimation of the angle of the left shoulder.
Similarly, β1 means the degree of influence of an angle of the omega point denoted by identification number 1, on estimation of the radius of the right shoulder. Furthermore, β2 means the degree of influence of the angle of the omega point denoted by identification number 1, on estimation of the angle of the right shoulder. In addition, β3 means the degree of influence of the angle of the omega point denoted by identification number 1, on estimation of the radius of the left shoulder. Furthermore, β4 means the degree of influence of the angle of the omega point denoted by identification number 1, on estimation of the angle of the left shoulder.
The position of the omega point denoted by identification number 1 is expressed by a set of a radius and an angle with respect to an evaluation reference point. Accordingly, shoulder-position estimation model storage section 340 converts into numerical form a degree of influence E1→RR of the omega point denoted by identification number 1, on estimation of the radius of the right shoulder and a degree of influence E1→Rθ of the omega point denoted by identification number 1, on estimation of the angle of the right shoulder using, for example, equations 5 and 6 below. Furthermore, shoulder-position estimation model storage section 340 converts into numerical form a degree of influence E1→LR of the omega point denoted by identification number 1, on estimation of the radius of the left shoulder and a degree of influence E1→Lθ of the omega point denoted by identification number 1, on estimation of the angle of the left shoulder using, for example, equations 7 and 8 below.
[5]
E1→RR=√(α1²+β1²) (Equation 5)
[6]
E1→Rθ=√(α2²+β2²) (Equation 6)
[7]
E1→LR=√(α3²+β3²) (Equation 7)
[8]
E1→Lθ=√(α4²+β4²) (Equation 8)
Shoulder-position estimation model storage section 340 performs conversion into numerical form of the degrees of influence as described above for all omega points to generate degree of influence list 780. It is thereby possible to select a point that has a large degree of influence.
Shoulder-position estimation model storage section 340 stores the degrees of influence of each omega point in the degree of influence list. Furthermore, shoulder-position estimation model storage section 340 selects the top F omega points that have the largest values from among degrees of influence E1→RR to E20→RR on estimation of the radius of the right shoulder, and increases a weight score of those omega points by 1 in the degree of influence list. Similarly, shoulder-position estimation model storage section 340 also increases the weight score of omega points that have a large degree of influence with respect to the angle of the right shoulder, the radius of the left shoulder, and the angle of the left shoulder.
Shoulder-position estimation model storage section 340 takes the final weight score of each omega point as its degree of influence on estimation of both shoulders (hereunder, referred to as “weight γ”). Note that the value of “F” is determined by experiments or the like, taking into account the processing load and the estimation accuracy.
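Generalizing equations 5 to 8 to all omega points, the radius and angle of the i-th point correspond to two adjacent rows of A, and their coefficients toward each of the four outputs are combined by a Euclidean norm. The sketch below also derives the weight scores; treating the weight score simply as the number of top-F lists in which a point appears is an interpretation of the description above, and F is a tunable parameter of the sketch.

```python
import numpy as np

def omega_point_weights(A, F=5):
    """A: 2m x 4 regression coefficient matrix (columns: right-shoulder radius,
    right-shoulder angle, left-shoulder radius, left-shoulder angle; rows
    alternate radius/angle per omega point).  Returns one weight score gamma
    per omega point (equations 5 to 8 generalized to all points)."""
    alpha = A[0::2]                          # coefficients of each point's radius, shape (m, 4)
    beta = A[1::2]                           # coefficients of each point's angle, shape (m, 4)
    E = np.sqrt(alpha ** 2 + beta ** 2)      # degree of influence per point and output
    weights = np.zeros(E.shape[0], dtype=int)
    for c in range(E.shape[1]):              # four estimation targets
        top = np.argsort(E[:, c])[-F:]       # top-F most influential omega points
        weights[top] += 1
    return weights
```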
As shown in
Subsequently, when evaluating edge matching, evaluation section 430 utilizes weights γ in degree of influence list 780 as a parameter for weighting calculation. More specifically, for each omega point, evaluation section 430 calculates a matching score by multiplying a value obtained by converting a result regarding the presence or absence of an edge in the vicinity of the relevant omega point into a numerical value, by weight γ of the relevant omega point.
Assignment of weights in this manner makes it possible to perform matching that particularly focuses on a portion that has a high degree of correlation to shoulder estimation, and thus to improve the estimation accuracy.
In step S7040 in
On the other hand, if edge matching processing for L cut-out images has been completed (S7040: YES), evaluation section 430 acquires maximum value MSj of the obtained L matching scores. Subsequently, evaluation section 430 generates a matching result list describing maximum value MSj of the matching scores and cutting-out position Cj of the cut-out image for which the relevant maximum value MSj is obtained, for each sample omega image. However, in this case, if a matching result list has already been generated, evaluation section 430 updates the matching result list by adding the maximum values MSj of newly obtained matching scores to the list.
As shown in
Subsequently, in step S7050 in
If processing is not yet completed for H sample omega images (S7050: No), evaluation section 430 returns to step S7020, then selects the next sample omega image and performs processing.
On the other hand, if processing has been completed for H sample omega images (S7050: YES), evaluation section 430 identifies a sample omega image for which the matching score becomes the maximum value from matching result list 770. Evaluation section 430 outputs a sample omega feature value of the sample omega image for which the matching score becomes the maximum value to shoulder-position calculating section 440, sends a “shoulder position calculating request” to shoulder-position calculating section 440, and returns to the processing in
Upon receiving the “shoulder position calculating request,” in step S8000 in
Specifically, shoulder-position calculating section 440 identifies a position of a reference point (hereunder, referred to as “evaluation reference point”) of the maximum likelihood omega feature value in the evaluation image. In this case, as described above, since the sample omega shape is depicted by taking the center of the cut-out region as a reference point, when cutting-out position C is (Xc, Yc), the position of the evaluation reference point is (Xc+w/2, Yc+h/2).
In addition, shoulder-position calculating section 440 acquires regression coefficient matrix A that is a shoulder position estimation model from shoulder-position estimation model storage section 340. Subsequently, shoulder-position calculating section 440 calculates evaluation shoulder position feature value Q′ of the evaluation image by applying maximum likelihood omega feature value K′ as an explanatory variable to regression coefficient matrix A as in equation 9 below, for example.
[9]
Q′=K′A (Equation 9)
However, when it is assumed that “[u][v]” means a two-dimensional array of u rows×v columns, and that an omega shape is defined by m=20 points as described above, the dimensions are Q′: [1][4], K′: [1][40], and A: [40][4]. Furthermore, a set of the calculated evaluation shoulder position feature value Q′ and the evaluation reference point is information that shows the right shoulder position and the left shoulder position of the person included in the evaluation image.
After calculating the right shoulder position and left shoulder position in this manner, shoulder-position calculating section 440 outputs the right shoulder position (Shx1, Shy1) and the left shoulder position (Shx2, Shy2) to orientation estimation section 500 and also sends an “orientation estimation request” to orientation estimation section 500.
Upon receiving the “orientation estimation request,” orientation estimation section 500 calculates, as shoulder orientation Vd of the person included in the evaluation image, the horizontal vector that is orthogonal to the vector from the right shoulder position to the left shoulder position and that points in the direction obtained by rotating that vector 90 degrees clockwise when viewed from above. In addition, orientation estimation section 500 calculates the center point ((Shx1+Shx2)/2, (Shy1+Shy2)/2) of the line segment connecting the right shoulder position and the left shoulder position as position D at the center of the shoulders.
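This final orientation step is plain vector arithmetic; a small sketch follows. The sign chosen for the 90-degree rotation assumes image coordinates with x to the right and y downward, so it may need to be flipped for a different coordinate convention.

```python
import numpy as np

def shoulder_orientation(right_shoulder, left_shoulder):
    """Returns the shoulder orientation vector Vd (unit length) and the
    position D at the center of the shoulders."""
    r = np.asarray(right_shoulder, dtype=float)    # (Shx1, Shy1)
    l = np.asarray(left_shoulder, dtype=float)     # (Shx2, Shy2)
    v = l - r                                      # right shoulder -> left shoulder
    vd = np.array([v[1], -v[0]])                   # 90-degree rotation (assumed sign)
    norm = np.linalg.norm(vd)
    if norm > 0.0:
        vd = vd / norm
    center = (r + l) / 2.0                         # ((Shx1+Shx2)/2, (Shy1+Shy2)/2)
    return vd, center
```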
Orientation estimation section 500 acquires an evaluation image from evaluation image storage section 410. Orientation estimation section 500 superimposes symbols and the like indicating the calculated right shoulder position, left shoulder position, position at the center of the shoulders, and orientation vector on the evaluation image, and generates image data of the depicted evaluation result image. Next, orientation estimation section 500 sends an “orientation estimation complete notification” to operation control section 200, and also outputs the generated image data of the evaluation result image to operation control section 200 to display the evaluation result image in processing-content display region 616.
Subsequently, in step S9000, operation control section 200 determines whether or not “end” button 614 has been pressed. If “end” button 614 has not been pressed, operation control section 200 returns to step S2000. In contrast, if “end” button 614 has been pressed, operation control section 200 clears operation screen 610, and sends an “end” notification to shoulder-position estimation model generation section 300 and shoulder-position estimation section 400. Upon receiving the “end” notification, shoulder-position estimation model generation section 300 and shoulder-position estimation section 400 stop the operation.
By performing the above-described operation, person detection apparatus 100 generates a shoulder position estimation model and omega generation information, and, based on the model and the information, can estimate the shoulder positions and shoulder orientation of a person included in an evaluation image from the evaluation image.
As described above, since person detection apparatus 100 according to the present embodiment uses a shoulder position estimation model that shows a relationship between an omega shape and shoulder positions, person detection apparatus 100 can estimate the shoulder positions and shoulder orientation of a person included in an evaluation image based on the evaluation image. In addition, since person detection apparatus 100 can estimate shoulder positions, it is also possible for person detection apparatus 100 to determine a posture, such as a downward sloping right shoulder or a downward sloping left shoulder, and thus person detection apparatus 100 can contribute to development of technology that estimates a postural state of a person.
That is, person detection apparatus 100 enables the extraction of an overall feature of an omega shape that cannot be extracted using only co-occurrence of local features as in the related art. Accordingly, person detection apparatus 100 makes practical use of the correlation between an omega shape and a shoulder position, which is difficult to exploit unless an overall feature is represented.
Person detection apparatus 100 configured in this manner can be applied, for example, to a system that records moving images and analyzes a direction in which the body of a person is facing based on the recorded moving images. Accordingly, person detection apparatus 100 can be applied to detection of abnormal behavior on a street, analysis of purchasing behavior in a store, or to detection of inattentive driving by a driver or the like.
Although in the above described embodiment, a shoulder position estimation model is generated at shoulder-position estimation model storage section 340 in person detection apparatus 100, a shoulder position estimation model may also be generated at omega feature storage section 330.
In addition, in person detection apparatus 100, a shoulder position estimation model may be generated by principal component analysis with respect to overall omega feature values, instead of multiple regression analysis.
Furthermore, omega feature storage section 330 need not necessarily apply a principal component analysis to omega feature values K (40 dimensions×N sample images) to generate omega generation information. That is, omega feature storage section 330 may generate omega generation information by applying a principal component analysis to data (44 dimensions×N sample images) in which omega feature values K and shoulder position feature values Q that are included in the overall omega feature values are aggregated. In this case, in addition to an omega shape, it is also possible to reconstruct both shoulder positions based on the omega generation information, and omega generation section 420 can generate a sample omega image including an omega shape and both shoulder positions.
Note that, evaluation section 430 may disregard both shoulder positions when evaluating a matching property. Furthermore, in this case, evaluation section 430 can include a function of shoulder-position calculating section 440. That is, evaluation section 430 can estimate both shoulder positions in a sample omega image for which a high level of matching has been determined by the matching evaluation, as the two shoulder positions that should be found.
In addition, person detection apparatus 100 may estimate, as the outline of a person, an outline of another body part, such as a hand or leg, instead of all or part of an omega shape. Moreover, person detection apparatus 100 may estimate, as a state of a part of a person, only one of shoulder positions and the shoulder orientation, or another state such as a position of a wrist or a direction in which a forearm is extending, instead of estimating both of the shoulder positions and the shoulder orientation. That is, the present invention can be applied to an apparatus that estimates a state of a variety of parts highly correlated with the outline of a person.
The disclosure of the specification, the drawings, and the abstract included in Japanese Patent Application No. 2010-274675, filed on Dec. 9, 2010, is incorporated herein by reference in its entirety.
A person detection apparatus and a person detection method according to the present invention are useful in that they can estimate a state of a part of a person from an image.
International Search Report for PCT/JP2011/005619, dated Nov. 15, 2011.
Tomokazu Mitsui, Yuji Yamauchi, and Hironobu Fujiyoshi, “Human Detection by Two Stages AdaBoost with Joint HOG,” 14th Image Sensing Symposium (SSII08), 2008, IN1-06.
Keisuke Oyama, “Motion Analysis of a Driver from Video Captured by Driving Recorder,” ITE Technical Report, vol. 34, no. 44, Oct. 25, 2010.