The present invention relates to a group specification apparatus and a group specification method that are for specifying a group of persons from shot images, and further relates to a computer-readable recording medium that includes a program recorded thereon for realizing the apparatus and method.
Specifying a group (a plurality of persons who interact) from images shot at a public facility or the like and recognizing attributes of the group is useful for improving services and for marketing. Apparatuses for specifying a group based on shot images have thus been proposed heretofore (e.g., see Patent Documents 1 and 2).
Specifically, an apparatus disclosed in Patent Document 1 specifies a group of persons, by extracting regions of persons from a shot image, and determining whether the persons whose regions were extracted belong to the same group, based on the distance between the extracted regions of the persons and the state of overlap between the regions.
Furthermore, the apparatus disclosed in Patent Document 1 is also able to track the respective regions of the persons extracted from the shot image on a frame-by-frame basis, and, if the distance between the regions of the persons continues to be close, specify the persons in those regions as the same group. In other words, the apparatus disclosed in Patent Document 1 is also able to specify a group, based on the temporal change in the distance between persons.
Also, an apparatus disclosed in Patent Document 2, first, detects persons from a shot image, tracks the detected persons on a frame-by-frame basis, and acquires position information of each person in chronological order. The apparatus disclosed in Patent Document 2 then calculates the relative distance and relative speed of the persons, based on the acquired chronological position information of the persons, and, if a state in which the calculated relative distance and relative speed are within a set range continues for greater than or equal to a given time period, determines that the persons belong to the same group.
Incidentally, there is a problem with the apparatus disclosed in Patent Document 1 described above in that it is difficult to specify a group in the case where shooting is performed in a crowded environment where persons are close together, or where shooting is performed in a state where the angle of depression of the camera is shallow (i.e., the shooting direction is close to horizontal). This is because when such shooting is performed, persons unrelated to the group are likely to appear together in front of or behind the group in the shot image, and, in determining the distance between the regions of the persons and the state of overlap between the regions, it is difficult to separate out persons who are unrelated to the group.
It is conceivable that the above problem with the apparatus disclosed in Patent Document 1 described above could be resolved in the case where tracking processing is performed. Similarly, it is considered that the above problem could also be resolved in the case of the apparatus disclosed in Patent Document 2 described above, because tracking processing is performed.
However, there are problems with the tracking processing in that it is difficult to maintain a high tracking accuracy, and persons who only appear briefly in shot images cannot be tracked. With the apparatus disclosed in Patent Document 1 and the apparatus disclosed in Patent Document 2 described above, groups are specified based on chronological information obtained by the tracking processing, and thus both apparatuses have difficulty specifying a group when such problems occur. It is thus sought to specify groups without relying on tracking persons.
An example object of the present invention is to provide a group specification apparatus, a group specification method, and a computer-readable recording medium that solve the aforementioned problem and specify a group without requiring person tracking processing.
In order to achieve the above-described object, a group specification apparatus for specifying a group from a shot image according to an example aspect of the invention, includes:
In addition, in order to achieve the above-described object, a group specification apparatus for specifying a group from a shot image according to an example aspect of the invention, includes:
Furthermore, in order to achieve the above-described object, a computer-readable recording medium according to an example aspect of the invention is a computer-readable recording medium that includes a program recorded thereon to cause a computer to specify a group from a shot image,
As described above, according to the invention, it is possible to specify a group without requiring person tracking processing.
Hereinafter, a group specification apparatus, a group specification method, and a program of the example embodiment will be described using
[Apparatus Configuration]
Initially, a schematic configuration of the group specification apparatus of the example embodiment will be described using
A group specification apparatus 10 of the example embodiment shown in
The first group candidate setting unit 11 selects a person from among a plurality of persons within a first shot image, and sets a first group candidate, based on a spatial condition stipulating the position of another person and a state condition stipulating a state of the other person, with reference to the selected person.
The second group candidate setting unit 12 selects a person from among a plurality of persons within a second shot image having a different shooting time from the first shot image, using attributes of the person selected by the first group candidate setting unit 11. Also, the second group candidate setting unit 12 sets a second group candidate, based on the spatial condition and the state condition, with reference to the selected person.
The similarity calculation unit 13 compares first attribute configuration information that includes the attributes of the persons constituting the first group candidate with second attribute configuration information that includes the attributes of the persons constituting the second group candidate, and calculates the similarity between the first group candidate and the second group candidate.
The group specification unit 14 specifies the persons constituting the first group candidate as one group, if the similarity calculated by the similarity calculation unit 13 satisfies a set condition.
In this way, in the example embodiment, a group candidate is set from each of the two shot images having different shooting date-times, and, furthermore, the similarity between the two set group candidates is calculated. Further, this similarity is used to determine whether persons constituting the group candidates are one group. In other words, according to the example embodiment, a group can be specified without requiring person tracking processing.
Next, the configuration and function of the group specification apparatus of the example embodiment will be specifically described using
As shown in
Also, as shown in
In the example embodiment, the first group candidate setting unit 11, first, acquires, from the image data storage unit 16, any one of the image data as image data of the first shot image (hereinafter, referred to as “first image data”), detects a plurality of persons from the acquired first image data, and, furthermore, estimates attributes of each of the detected plurality of persons. Note that, in the case where a plurality of persons cannot be detected from the acquired first image data, the first group candidate setting unit 11 acquires different image data as the first image data, and again performs person detection and attribute estimation.
Specifically, the first group candidate setting unit 11 specifies a region in which a person is present from the first image data, using a feature amount representing a person (person's face), and extracts the specified region as a person. The first group candidate setting unit 11 is also able to detect the orientation of a person's face or the orientation of a person (orientation of upper body or lower body) in the person extraction. Next, the first group candidate setting unit 11 obtains a feature amount in the specified region, inputs the obtained feature amount to a classifier, and estimates attributes (gender, age, clothing (color, pattern), height, volume, weight, etc.) of the person in the specified region. The classifier is created in advance, by machine learning the relationship between the attributes and the feature amount.
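The embodiment does not specify the classifier beyond its being created in advance by machine learning. Purely as an illustrative sketch (not part of the disclosure), the detection-then-estimation flow described above might look as follows; the detection record format, the score threshold, and the rule-based stand-in classifier are all assumed for illustration, and a trained model would replace the `predict` rule in practice:

```python
def extract_person_regions(detections, score_threshold=0.5):
    """Stand-in for person region extraction: keep detections whose
    confidence exceeds a threshold; each detection carries a bounding
    box and a feature amount obtained from the specified region."""
    return [d for d in detections if d["score"] >= score_threshold]

class AttributeClassifier:
    """Stand-in for the machine-learned classifier that maps a feature
    amount to attribute values (gender, age, clothing, etc.)."""
    def predict(self, feature):
        # Illustrative rule only; a model trained on (feature, attribute)
        # pairs replaces this in a real apparatus.
        return {"gender": "F" if feature[0] > 0.5 else "M",
                "age": "child" if feature[1] < 0.3 else "adult"}

def estimate_attributes(image_detections, classifier):
    """Detect persons, then estimate attributes for each detected region."""
    persons = extract_person_regions(image_detections)
    return [dict(box=p["box"], **classifier.predict(p["feature"]))
            for p in persons]
```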
Next, the first group candidate setting unit 11 selects, as a reference person, any one of the detected plurality of persons. The first group candidate setting unit 11 then sets a first group candidate constituted by the persons detected from the first image data, based on the spatial condition and state condition regarding another person apart from the reference person, with reference to the reference person.
Here, in the example embodiment, the spatial condition includes another person being present within a set range centered on the reference person. The state condition includes another person facing the reference person, another person facing the same direction as the reference person, the size of another person being within a set range referenced on the size of the reference person, and a combination thereof.
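The spatial and state conditions above can be sketched as simple predicates. The following is an illustrative sketch only (not part of the disclosed embodiment); the coordinate units, the distance threshold, and the size-ratio range are assumed values, and one of the listed state-condition combinations (same orientation plus comparable size) is used:

```python
import math

def satisfies_spatial_condition(reference, other, max_distance=2.0):
    """Spatial condition: the other person is present within a set range
    centered on the reference person (here, a radius in assumed units)."""
    dx = other["x"] - reference["x"]
    dy = other["y"] - reference["y"]
    return math.hypot(dx, dy) <= max_distance

def satisfies_state_condition(reference, other, size_ratio_range=(0.7, 1.3)):
    """One possible state-condition combination: the other person faces the
    same direction as the reference person, and the other person's size is
    within a set range referenced on the reference person's size."""
    same_orientation = other["orientation"] == reference["orientation"]
    ratio = other["size"] / reference["size"]
    return same_orientation and size_ratio_range[0] <= ratio <= size_ratio_range[1]

def set_group_candidate(reference, persons):
    """Group candidate: the reference person plus every other person
    satisfying both the spatial condition and the state condition."""
    candidate = [reference]
    for p in persons:
        if p is reference:
            continue
        if satisfies_spatial_condition(reference, p) and satisfies_state_condition(reference, p):
            candidate.append(p)
    return candidate
```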
In the example embodiment, the second group candidate setting unit 12 acquires, from the image data storage unit 16, image data (hereinafter, referred to as “second image data”) of a plurality of second shot images having different shooting times from the first shot image. Examples of the second shot images include shot images having an earlier shooting time than the shooting time of the first shot image.
The second group candidate setting unit 12 performs, for each of the acquired plurality of second image data, detection of a plurality of persons, and, furthermore, estimation of the attributes of each of the detected plurality of persons. Specifically, the second group candidate setting unit 12, similarly to the first group candidate setting unit 11, specifies a region in which a person is present from each second image data, using a feature amount representing a person (person's face), and extracts the specified region as a person. The second group candidate setting unit 12, similarly to the first group candidate setting unit 11, is also able to detect the orientation of a person's face or the orientation of a person (orientation of upper body or lower body) in the person extraction. Next, the second group candidate setting unit 12, similarly to the first group candidate setting unit 11, obtains a feature amount in the specified region, inputs the obtained feature amount to a classifier, and estimates attributes (gender, age, clothing (color, pattern), height, volume, weight, etc.) of the person in the specified region.
Next, the second group candidate setting unit 12 performs, for each of the plurality of second image data, selection of a person (hereinafter, “corresponding person”) corresponding to the reference person selected by the first group candidate setting unit 11, from among the extracted persons, based on the attributes of the reference person. Note that criteria for judging whether a person is the corresponding person include the number of matching attributes being greater than or equal to a predetermined number, and specific attributes matching.
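The attribute-matching criterion for selecting the corresponding person can be sketched as follows. This is an illustrative sketch only; representing attributes as a dictionary and requiring a minimum match count are assumptions, and a real apparatus could additionally require specific attributes (e.g., gender) to match:

```python
def count_matching_attributes(a, b):
    """Count attributes (e.g., gender, age band, clothing color) whose
    estimated values coincide between two detected persons."""
    return sum(1 for key in a if key in b and a[key] == b[key])

def select_corresponding_person(reference_attrs, candidates, min_matches=2):
    """Pick, from persons detected in a second shot image, the one whose
    attributes best match the reference person; return None if no candidate
    reaches the predetermined number of matching attributes."""
    best, best_score = None, 0
    for person in candidates:
        score = count_matching_attributes(reference_attrs, person)
        if score >= min_matches and score > best_score:
            best, best_score = person, score
    return best
```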
The second group candidate setting unit 12 then sets a second group candidate constituted by the persons detected from the second image data, based on the spatial condition and state condition regarding another person apart from the corresponding person, with reference to the corresponding person. Note that examples of spatial condition and state condition referred to herein include those illustrated in the description of the first group candidate setting unit 11.
Also, in the example embodiment, the second group candidate setting unit 12 is able to acquire a plurality of second shot images, that is, second image data. In this case, the second group candidate setting unit 12 performs, for each second image data, detection of persons, estimation of attributes, selection of a corresponding person, and setting of a second group candidate.
Furthermore, in the example embodiment, the second group candidate setting unit 12 is also able to set a partial region of the second shot images as a search range, based on the shooting time of the first shot image, the shooting times of the second shot images, and the position of the reference person selected by the first group candidate setting unit 11. In this case, the second group candidate setting unit 12 is able to select a corresponding person from the set search range.
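One plausible way to derive such a search range, sketched here purely for illustration (the embodiment does not give a formula), is to bound how far a person could have moved in the time between the two shots; the maximum-speed constant and the coordinate units are assumptions:

```python
def search_range(reference_pos, t_first, t_second, max_speed=1.5):
    """Return a circular search range (center, radius) for the second shot
    image: the radius grows with the time gap between the shots, assuming
    the corresponding person moves no faster than max_speed (an assumed
    upper bound, e.g. meters per second in ground-plane coordinates)."""
    radius = max_speed * abs(t_first - t_second)
    return reference_pos, radius

def in_search_range(pos, center, radius):
    """Whether a detected person's position falls inside the search range."""
    dx, dy = pos[0] - center[0], pos[1] - center[1]
    return (dx * dx + dy * dy) ** 0.5 <= radius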
Also, in the example of
In the example embodiment, the similarity calculation unit 13, first, creates first attribute configuration information that includes the attributes of the persons constituting the first group candidate, using the attributes of the persons that were estimated by the first group candidate setting unit 11. Also, the similarity calculation unit 13 creates second attribute configuration information that includes the attributes of the persons constituting the second group candidate, using the attributes of the persons that were estimated by the second group candidate setting unit 12.
Next, the similarity calculation unit 13 performs, for each of the plurality of second shot images (second image data), comparison of the attributes of the persons that are included in the first attribute configuration information with the attributes of the persons that are included in the second attribute configuration information, and calculation of a similarity, based on the comparison result. Similarity calculation processing will now be described in detail, using
In the example shown in
In the example shown in
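The comparison of first and second attribute configuration information can be sketched as follows. This is an illustrative sketch, not the disclosed computation: the per-person measure (fraction of coinciding attribute values), the greedy matching, and the size-mismatch penalty are all assumed design choices:

```python
def person_similarity(a, b):
    """Fraction of attributes whose values coincide between two persons."""
    keys = set(a) | set(b)
    if not keys:
        return 0.0
    return sum(1 for k in keys if a.get(k) == b.get(k)) / len(keys)

def group_similarity(first_config, second_config):
    """Greedily match each person in the first attribute configuration
    information to the most similar unmatched person in the second, and
    average the per-person similarities over the larger group, so that a
    size mismatch between the candidates lowers the similarity."""
    remaining = list(second_config)
    scores = []
    for person in first_config:
        if not remaining:
            scores.append(0.0)
            continue
        best = max(remaining, key=lambda q: person_similarity(person, q))
        scores.append(person_similarity(person, best))
        remaining.remove(best)
    n = max(len(first_config), len(second_config))
    return sum(scores) / n if n else 0.0
```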
In the example embodiment, the group specification unit 14 determines whether the set condition is satisfied, using the similarity calculated for each of the plurality of second shot images (second image data), and, if the set condition is satisfied, specifies the persons constituting the first group candidate as one group. An example of the set condition is the number of second image data having a similarity greater than or equal to a threshold value being greater than or equal to a set number.
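The example set condition given above can be written directly as a predicate; the threshold and required count below are assumed values:

```python
def group_is_specified(similarities, threshold=0.8, required_count=3):
    """Set condition from the text: the number of second image data having a
    similarity greater than or equal to a threshold value must itself be
    greater than or equal to a set number."""
    return sum(1 for s in similarities if s >= threshold) >= required_count
```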
Also, the group specification unit 14, in the case of specifying the persons constituting the first group candidate as one group, outputs information relating to the first group candidate specified as one group to the management apparatus 30. The group specification unit 14 is, for example, able to output, as information relating to the first group candidate, at least one of the position, size, orientation, attributes and first attribute configuration information of each person constituting the first group candidate.
Also, a database (hereinafter, referred to as “sample database”) in which a plurality of groups serving as samples (hereinafter, referred to as “sample groups”) and the attribute configuration information thereof are registered in advance is assumed to have been prepared. Examples of sample groups include a couple, a family, a travel group, company colleagues, and a student group.
In such a mode, the group specification unit 14, in the case of specifying the persons constituting the first group candidate as one group, also specifies a sample group that conforms with the first group candidate by checking the first attribute configuration information of the first group candidate against the database. The group specification unit 14 is able to also output information on the sample group to the management apparatus 30.
Furthermore, the group specification unit 14, in the case where there are a plurality of first group candidates respectively specified as one group, determines whether there are persons common between the plurality of first group candidates respectively specified as one group. The group specification unit 14 is then able to integrate the first group candidates determined to have common persons into one group. According to this mode, more accurate specification of groups is achieved.
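Integration of first group candidates that share persons amounts to merging overlapping sets until no two intersect. The following sketch is illustrative only; identifying persons by hashable IDs is an assumption:

```python
def integrate_groups(groups):
    """Merge group candidates that have at least one person in common,
    repeating until no two remaining groups intersect."""
    merged = [set(g) for g in groups]
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if merged[i] & merged[j]:  # common person found
                    merged[i] |= merged[j]
                    del merged[j]
                    changed = True
                    break
            if changed:
                break
    return merged
```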
[Apparatus Operations]
Next, operations of the group specification apparatus of the example embodiment will be described using
Also, first, it is assumed that the image data acquisition unit 15 has acquired image data for a certain period that has been output from the image capturing apparatus (camera) 20, and has stored the acquired image data in the image data storage unit 16 in chronological order.
As shown in
Next, the first group candidate setting unit 11 detects a plurality of persons from the first image data acquired in step A1, and, furthermore, estimates attributes of each of the detected plurality of persons (step A2). Note that, in step A2, in the case where a plurality of persons cannot be detected from the first image data, the first group candidate setting unit 11 acquires different image data as the first image data, and again performs person detection and attribute estimation.
Next, the first group candidate setting unit 11 selects any one of the plurality of persons detected in step A2 as a reference person (step A3). The criterion for selecting the reference person is not particularly limited, and a mode where a person is randomly selected or a mode where a person in a predetermined position within the image is selected may be adopted.
Next, the first group candidate setting unit 11 sets a first group candidate constituted by the persons detected from the first image data, based on the spatial condition and state condition regarding another person apart from the reference person, with reference to the reference person selected in step A3 (step A4).
Next, the second group candidate setting unit 12 acquires, from the image data storage unit 16, image data (second image data) of a shot image that was shot earlier than the shooting time of the shot image of the first image data (step A5).
Next, the second group candidate setting unit 12 detects a plurality of persons from the second image data acquired in step A5, and, furthermore, estimates attributes of each of the detected plurality of persons (step A6). Also, in step A6, similarly to step A2, if a plurality of persons cannot be detected from the second image data, the second group candidate setting unit 12 acquires different image data as the second image data and again performs person detection and attribute estimation.
Next, the second group candidate setting unit 12 selects a corresponding person who corresponds to the reference person, from among the persons extracted in step A6, based on the attributes of the reference person selected by the first group candidate setting unit 11 in step A3 (step A7).
Next, the second group candidate setting unit 12 sets a second group candidate constituted by the persons detected from the second image data, based on the spatial condition and state condition regarding another person apart from the corresponding person, with reference to the corresponding person (step A8). Note that the spatial condition and state condition referred to here are the same as the spatial condition and state condition that are used in step A4.
Next, the similarity calculation unit 13 creates first attribute configuration information that includes the attributes of the persons constituting the first group candidate, using the attributes of the persons that were estimated in step A2, and creates second attribute configuration information that includes the attributes of the persons constituting the second group candidate, using the attributes of the persons that were estimated in step A6 (step A9).
Next, the similarity calculation unit 13 compares the attributes of the persons that are included in the first attribute configuration information created in step A9 with the attributes of the persons that are included in the second attribute configuration information likewise created in step A9, and calculates the similarity, based on a result of the comparison (step A10). The similarity calculation is as shown in
Next, the similarity calculation unit 13 determines whether there is image data that has not yet been processed as second image data in the image data storage unit 16 (step A11). If the result of the determination in step A11 indicates that there is image data that has not yet been processed in the image data storage unit 16 (step A11: Yes), step A5 is executed again. On the other hand, if the result of the determination in step A11 indicates that there is no image data that has not yet been processed in the image data storage unit 16 (step A11: No), step A12 is executed.
In step A12, the group specification unit 14 determines whether the set condition is satisfied using the similarity calculated for each second image data, and, if the set condition is satisfied, specifies the persons constituting the first group candidate as one group (step A12).
In step A12, the group specification unit 14, furthermore, outputs information (position, size, orientation, attributes, and first attribute configuration information of each person) relating to the first group candidate specified as one group to the management apparatus 30.
Furthermore, in step A12, the group specification unit 14 is able to specify a sample group that conforms with the first group candidate, by checking the first attribute configuration information of the first group candidate against the database, and to also output information on the sample group to the management apparatus 30.
Also, the group specification unit 14, in the case where there are a plurality of first group candidates respectively specified as one group in step A12 executed previously, determines whether there are persons common between the plurality of first group candidates respectively specified as one group. The group specification unit 14 is then able to integrate the first group candidates determined to have common persons into one group.
Also, in step A12, if the group specification unit 14 does not specify the persons constituting the first group candidate as one group, the first group candidate setting unit 11 selects, as the reference person, a person who has not yet been selected, from among the plurality of persons within the first shot image. The first group candidate setting unit 11 then executes step A4 again and newly sets the first group candidate.
When the first group candidate is newly set, the second group candidate setting unit 12 executes steps A7 and A8 again to newly set the second group candidate. Also, when the first group candidate and the second group candidate are newly set, the similarity calculation unit 13 executes steps A9 and A10 again to newly calculate the similarity. Thereafter, the group specification unit 14 executes step A12 again and specifies a group, using the newly calculated similarity.
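The overall flow of steps A1 to A12, including the retry over reference persons described above, can be sketched as a skeleton in which the concrete operations are injected as callables. All names here are illustrative, not from the disclosure:

```python
def specify_group(first_image, second_images, set_condition, helpers):
    """Skeleton of steps A1-A12. `helpers` supplies detection, candidate
    setting, corresponding-person selection, and similarity calculation."""
    persons = helpers["detect"](first_image)                  # A1-A2
    for reference in persons:                                 # A3 (retry loop)
        first_candidate = helpers["candidate"](reference, persons)        # A4
        similarities = []
        for img in second_images:                             # A5-A11 loop
            others = helpers["detect"](img)                   # A6
            corresponding = helpers["select"](reference, others)          # A7
            if corresponding is None:
                continue
            second_candidate = helpers["candidate"](corresponding, others)  # A8
            similarities.append(
                helpers["similarity"](first_candidate, second_candidate))   # A9-A10
        if set_condition(similarities):                       # A12
            return first_candidate
    return None
```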
As described above, in the example embodiment, a group candidate is set for each image, using a first shot image and second shot image having different shooting times, and the similarity between the set group candidates is obtained. Also, the similarity is obtained between one first shot image and a plurality of second shot images, and a final group is specified from a number of the obtained similarities. Thus, in the example embodiment, even in a crowded environment where persons are close together, or when the angle of depression of the camera is shallow (i.e., the shooting direction is close to the horizontal), a group can be accurately specified, without requiring person tracking processing.
[Program]
It suffices for a program in the example embodiment to be a program that causes a computer to carry out steps A1 to A12 shown in
In the example embodiment, the image data storage unit 16 may be realized by storing the data files that constitute it in a storage device such as a hard disk provided in the computer, or may be realized by storing those data files in a storage device of another computer.
Examples of the computer include a general-purpose PC, a smartphone, and a tablet-type terminal device. Furthermore, the computer may be a computer that constitutes the management apparatus 30. In this case, the group specification apparatus 10 according to the example embodiment is constructed on the operating system of the management apparatus 30.
Furthermore, the program according to the example embodiment may be executed by a computer system constructed with a plurality of computers. In this case, for example, each computer may function as one of the first group candidate setting unit 11, the second group candidate setting unit 12, the similarity calculation unit 13, the group specification unit 14, and the image data acquisition unit 15.
[Physical Configuration]
Using
As shown in
The computer 110 may include a GPU (Graphics Processing Unit) or an FPGA (Field-Programmable Gate Array) in addition to the CPU 111, or in place of the CPU 111. In this case, the GPU or the FPGA can execute the programs according to the example embodiment.
The CPU 111 deploys the program according to the example embodiment, which is composed of a code group stored in the storage device 113, to the main memory 112, and carries out various types of calculation by executing the codes in a predetermined order. The main memory 112 is typically a volatile storage device, such as a DRAM (dynamic random-access memory).
Also, the program according to the example embodiment is provided in a state where it is stored in a computer-readable recording medium 120. Note that the program according to the present example embodiment may be distributed over the Internet connected via the communication interface 117.
Also, specific examples of the storage device 113 include a hard disk drive and a semiconductor storage device, such as a flash memory. The input interface 114 mediates data transmission between the CPU 111 and an input device 118, such as a keyboard and a mouse. The display controller 115 is connected to a display device 119, and controls display on the display device 119.
The data reader/writer 116 mediates data transmission between the CPU 111 and the recording medium 120, reads out the program from the recording medium 120, and writes the result of processing in the computer 110 to the recording medium 120. The communication interface 117 mediates data transmission between the CPU 111 and another computer.
Specific examples of the recording medium 120 include: a general-purpose semiconductor storage device, such as CF (CompactFlash®) and SD (Secure Digital); a magnetic recording medium, such as a flexible disk; and an optical recording medium, such as a CD-ROM (Compact Disk Read Only Memory).
Note that the group specification apparatus 10 according to the example embodiment can also be realized by using items of hardware that respectively correspond to the components, such as a circuit, rather than the computer in which the program is installed. Furthermore, a part of the group specification apparatus 10 according to the example embodiment may be realized by the program, and the remaining part of the group specification apparatus 10 may be realized by hardware.
A part or an entirety of the above-described example embodiment can be represented by (Supplementary Note 1) to (Supplementary Note 39) described below but is not limited to the description below.
(Supplementary Note 1)
A group specification apparatus for specifying a group from a shot image, comprising:
(Supplementary Note 2)
The group specification apparatus according to Supplementary Note 1,
(Supplementary Note 3)
The group specification apparatus according to Supplementary Note 1 or 2,
(Supplementary Note 4)
The group specification apparatus according to Supplementary Note 3,
(Supplementary Note 5)
The group specification apparatus according to any one of Supplementary Notes 1 to 4,
(Supplementary Note 6)
The group specification apparatus according to any one of Supplementary Notes 1 to 4,
(Supplementary Note 7)
The group specification apparatus according to any one of Supplementary Notes 1 to 6,
(Supplementary Note 8)
The group specification apparatus according to any one of Supplementary Notes 1 to 7,
(Supplementary Note 9)
The group specification apparatus according to any one of Supplementary Notes 1 to 7,
(Supplementary Note 10)
The group specification apparatus according to any one of Supplementary Notes 1 to 9,
(Supplementary Note 11)
The group specification apparatus according to any one of Supplementary Notes 1 to 10,
(Supplementary Note 12)
The group specification apparatus according to any one of Supplementary Notes 1 to 11,
(Supplementary Note 13)
The group specification apparatus according to any one of Supplementary Notes 1 to 12,
(Supplementary Note 14)
A group specification method for specifying a group from a shot image, comprising:
(Supplementary Note 15)
The group specification method according to Supplementary Note 14,
(Supplementary Note 16)
The group specification method according to Supplementary Note 14 or 15,
(Supplementary Note 17)
The group specification method according to Supplementary Note 16,
(Supplementary Note 18)
The group specification method according to any one of Supplementary Notes 14 to 17,
(Supplementary Note 19)
The group specification method according to any one of Supplementary Notes 14 to 17,
(Supplementary Note 20)
The group specification method according to any one of Supplementary Notes 14 to 19,
(Supplementary Note 21)
The group specification method according to any one of Supplementary Notes 14 to 20,
(Supplementary Note 22)
The group specification method according to any one of Supplementary Notes 14 to 20,
(Supplementary Note 23)
The group specification method according to any one of Supplementary Notes 14 to 22,
(Supplementary Note 24)
The group specification method according to any one of Supplementary Notes 14 to 23,
(Supplementary Note 25)
The group specification method according to any one of Supplementary Notes 14 to 24,
(Supplementary Note 26)
The group specification method according to any one of Supplementary Notes 14 to 25,
(Supplementary Note 27)
A computer-readable recording medium that includes a program recorded thereon for specifying a group from a shot image by a computer, the program including instructions that cause the computer to carry out:
(Supplementary Note 28)
The computer-readable recording medium according to Supplementary Note 27,
(Supplementary Note 29)
The computer-readable recording medium according to Supplementary Note 27 or 28,
(Supplementary Note 30)
The computer-readable recording medium according to Supplementary Note 29,
(Supplementary Note 31)
The computer-readable recording medium according to any one of Supplementary Notes 27 to 30,
(Supplementary Note 32)
The computer-readable recording medium according to any one of Supplementary Notes 27 to 30,
(Supplementary Note 33)
The computer-readable recording medium according to any one of Supplementary Notes 27 to 32,
(Supplementary Note 34)
The computer-readable recording medium according to any one of Supplementary Notes 27 to 33,
(Supplementary Note 35)
The computer-readable recording medium according to any one of Supplementary Notes 27 to 33,
(Supplementary Note 36)
The computer-readable recording medium according to any one of Supplementary Notes 27 to 35,
(Supplementary Note 37)
The computer-readable recording medium according to any one of Supplementary Notes 27 to 36,
(Supplementary Note 38)
The computer-readable recording medium according to any one of Supplementary Notes 27 to 37,
(Supplementary Note 39)
The computer-readable recording medium according to any one of Supplementary Notes 27 to 38,
Although the invention of the present application has been described above with reference to the example embodiment, the invention of the present application is not limited to the above-described example embodiment. Various changes that can be understood by a person skilled in the art within the scope of the invention of the present application can be made to the configuration and the details of the invention of the present application.
As described above, according to the invention, it is possible to specify a group without requiring person tracking processing. The invention is useful in various fields in which it is required to identify groups from images.
Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/JP2020/038817 | 10/14/2020 | WO |