The present invention relates to a technique of recognizing an object in a video and superimposing, over the video, information related to the recognized object.
Conventionally, there are techniques of recognizing an object in a video and superimposing information related to the recognized object over the video. By superimposing and displaying information related to a specific object shown in a video, the viewer can obtain that information without actively searching for it.
The process of recognizing a specific object in an input video and superimposing and displaying its related information in the video can be broadly divided into two processes: the process of recognizing the specific object (object recognition process); and the process of using the result of the recognition process as input and superimposing information (information superimposition process).
[Patent Literature 1] Unexamined Japanese Patent Application Publication No. 2009-251774
In relation to the information superimposition process described above, there is a conventional technique of displaying related information at a position in contact with the region of an object detected from a video. However, with this conventional technique, the related information often hides the object itself or nearby objects, which degrades the quality of the viewing experience. That is, the problem with this conventional information superimposition process is that related information cannot be displayed such that the viewer can easily understand the content of the related information.
The present invention has been made in view of the above, and aims to provide a technique, whereby related information that is associated with an object can be superimposed over a video such that the viewer can easily understand the content of the related information.
According to the technique of the present disclosure, an information superimposition device for superimposing, in a video, superimposition information that is associated with an object in the video, includes:
According to the disclosed technique, a technique whereby related information that is associated with an object in a video can be superimposed over the video such that the viewer can easily understand the content of the related information is provided.
Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings. The embodiments described below are simply examples, and embodiments to which the present invention is applicable are by no means limited to the following embodiments.
The embodiments described herein relate to a technique for recognizing a specific object shown in an input video, and superimposing and displaying its related information in the video.
To illustrate a specific example of this technique,
In this way, if it is possible to superimpose and display information related to a specific object (for example, a player) in a video, the viewer can obtain the information without having to actively search for it. In particular, if the viewer is not knowledgeable about the subject matter of the video, the viewer has few means to look up the details of an object of interest shown in the video; displaying information in a superimposed fashion is therefore expected to significantly improve the viewer's understanding of the video's content. That is, the technique according to the present embodiment leads to an improved viewing experience.
To recognize a specific object in an input video and superimpose its related information in the video, roughly two processes are needed: namely, the process of recognizing a specific object (object recognition process); and the process of superimposing information by using the result of recognition as input (information superimposition process).
In the embodiments described herein, an example related to the object recognition process will be described as an embodiment 1, and an example related to the information superimposition process will be described as an embodiment 2. Note that, although embodiments will be described below in which the object recognition process and the information superimposition process are combined, the object recognition process and the information superimposition process may be carried out independently.
Before describing the device structure and operation according to each embodiment, first, the details of the problem will be described. Note that the details of the cited references that will be touched upon in the following description are listed at the end of this specification.
One of the simplest ways to implement the object recognition process is to detect objects of interest from each image frame in a video by using the object detector disclosed in cited reference 1, for example. In this case, it is necessary to prepare training data for training the object detector, for each target object. Collecting training data like this generally entails a non-negligible cost. In particular, when different target objects look alike, such as, for example, when a number of players wear the same uniform as in the example shown in
In another method, it is possible to detect candidate objects, and then recognize a specific object by detecting a predetermined class or attribute from each candidate object. To be more specific, with the example of
However, this method has two major problems. The first problem is that, depending on the positional relationship between the object and the camera, the image frame may not contain enough visible information to recognize/determine its class or attribute, and the recognition often fails. Examples are shown in
Also, in the example of
The second problem is that recognizing and detecting a class and an attribute for all the detection results entails a high cost of calculation. This problem becomes more pronounced in cases in which a large number of target objects are captured, or in cases in which real-time processing is required.
As described above, when the technique of simply detecting the class and attribute of candidate objects to identify a specific object is employed, the accuracy of recognizing the class and attribute that serve as cues for identifying the specific object is low, and there is also a problem that the processing speed is slow.
Next, regarding the information superimposition process, cited reference 3 discloses a method of outputting and displaying a label at a position in contact with a detected object's region. When the method of cited reference 3 is used to display superimposition information of a size equal to or larger than a target object, like the panels shown in the example of
In order to solve the above problem, that is, in order not to hide the target object, a method of placing superimposition information at a position that is close to but does not overlap the target object, and that is determined in each image frame, may be employed. By means of this method, superimposed information can be displayed such that the viewer can easily understand the content of the superimposed information.
However, since this method does not take into account the consistency of the position of superimposed information over time, the position of superimposed information may vary significantly in each image frame, and the viewer may not be able to understand the content of the information that is displayed.
The embodiments described herein are therefore configured such that the following conditions are all satisfied at the same time: (i) superimposed information does not occlude the target object; (ii) proximity to the target object is maintained; and (iii) the consistency of the position of superimposed information is maintained over time. As a result of this, superimposed information can be displayed such that the position of the superimposed information does not vary significantly in each image frame, and the viewer can easily understand the content of the superimposed information.
In the following description, examples of recognizing a player in the rugby video shown in
The information indicating device 300 may be configured as a single computer, or by connecting a plurality of computers via a network. Also, the object recognition part 100 and the information superimposition part 200 may be referred to as an "object recognition device 100" and an "information superimposition device 200," respectively, and are referred to as such in embodiments 1 and 2 described below. Also, the information indicating device 300 itself may be referred to as an "object recognition device" or an "information superimposition device."
The video data storage part 110 stores chronological image frames, and the object recognition part 100 and the information superimposition part 200 process each image frame as read from the video data storage part 110.
The object recognition part 100 receives as input the image frame pertaining to each time constituting the video data and stored in the video data storage part 110 and the object recognition result at the immediately preceding time, and outputs the object recognition result at the present time. Note that the “present time” is the time of the latest image that is subject to the object recognition or information superimposition process.
The object superimposition information storage part 210 stores the information to be superimposed for each target specific object. Examples of information to be superimposed according to the embodiments are shown in
Note that, although the "class" and "attribute" are used with the present embodiment, both are examples of attributes. Also, the "label" is also an example of an attribute. For example, the team name may be referred to as an "attribute 1," and the uniform number may be referred to as an "attribute 2." Also, when the class is an example of an attribute, the number of attributes is not limited to two, and may be one or three or more.
Where object superimposition information is stored in the object superimposition information storage part 210, the information superimposition part 200 determines the superimposition position for superimposition information for an object captured in the image frame at the present time, based on the superimposition position in the image frame immediately before the present time, and superimposes the information over the image frame at the present time and outputs the result. The image frame of each time, over which the superimposition information is superimposed, is transmitted to, for example, a user terminal, and displayed as a video in which the superimposition information is superimposed, on the user terminal.
Hereinafter, a detailed example of the object recognition device 100 corresponding to the object recognition part 100 will be described as an embodiment 1, and a detailed example of the information superimposition device 200 corresponding to the information superimposition part 200 will be described as an embodiment 2.
The video data storage part 110 stores chronological image frames. The detection part 120 receives the image frame of each time constituting the video data stored in the video data storage part 110, and detects the objects captured in it.
The tracking part 130 receives as input the detection result output by the detection part 120 and a past tracking result, and outputs the tracking result at the present time. The label identifying part 140 receives as input the tracking result output from the tracking part 130 and the image frame at the present time, and determines a specific object label with respect to each tracking object.
Here, the tracking result output by the tracking part 130 consists of a set of positions of objects captured in the image frame at the present time, and a set of IDs (tracking ID set), each shared by the same individual throughout the video.
In the label identifying part 140, the label identification process is performed only for those tracking IDs that are included in the tracking result of the image frame at the present time and that have not been assigned specific object labels in the past. By this means, the number of times label identification is performed can be reduced compared to the case in which it is performed for all objects detected in every image frame, and, as a result, the overall throughput of the process can be improved.
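The gating described above can be sketched as follows. This is a minimal illustration, not the actual implementation; `infer_label` is a hypothetical placeholder for the class/attribute inference performed by the label identifying part 140.

```python
# Sketch: labels are inferred only for tracking IDs that do not yet
# have a specific object label, skipping already-labeled individuals.

def update_labels(tracking_result, assigned_labels, infer_label):
    """tracking_result: dict mapping tracking ID -> object position.
    assigned_labels: dict mapping tracking ID -> specific object label.
    infer_label: callable returning a label, or None if undecidable."""
    for track_id, position in tracking_result.items():
        if track_id in assigned_labels:
            continue  # label already known from a past frame; skip inference
        label = infer_label(track_id, position)
        if label is not None:
            assigned_labels[track_id] = label
    return assigned_labels
```

Because a tracking ID is shared by the same individual throughout the video, a label identified once carries over to all later frames without re-running inference.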
The class visibility determining part 141 receives the object position set and the tracking ID set as input, and determines, for each object with a tracking ID which is captured in the image frame at the present time, and to which no specific object label is assigned yet, whether or not visible information about the class is captured.
For each object with a tracking ID, with respect to which the class visibility determining part 141 determines that visible information about the class is captured, the class inferring part 142 infers the class based on the visible information.
For a given object, the class visibility determining part 141 determines whether or not visible information related to the class is captured by evaluating the object’s overlap with other objects in space in the same image frame. By inferring the class of an object for which it is determined that visible information about the class is captured in the image frame, it is possible to prevent inferring the wrong class.
The attribute visibility determining part 143 receives the object position set and the tracking ID set as input, and determines, for each object with a tracking ID which is captured in the image frame at the present time, and to which no specific object label is assigned yet, whether or not visible information about the attribute is captured.
For each object with a tracking ID, with respect to which the attribute visibility determining part 143 determines that visible information about the attribute is captured, the attribute inferring part 144 infers the attribute based on the visible information.
For a given object, the attribute visibility determining part 143 determines whether or not visible information related to the attribute is captured by evaluating the object’s overlap with other objects in space in the same image frame. By inferring the attribute of an object for which it is determined that visible information about the attribute is captured in the image frame, it is possible to prevent inferring the wrong attribute.
Note that the label identifying part 140, “the class visibility determining part 141 + the class inferring part 142,” and “the attribute visibility determining part 143 + the attribute inferring part 144” are all examples of the attribute determining part.
As described above, the video data storage part 110 of the object recognition device 100 stores chronological image frames, and the detection part 120 (and the tracking part 130 and the label identifying part 140) processes each image frame read from the video data storage part 110.
The detection part 120 receives an image frame corresponding to a given time in the video as input, detects the position of an object in it, and infers its posture. Any method can be used to determine the position of the object. For example, the position of the object can be determined by using a rectangle that encloses the object like the one defined by the black frame in
Also, the method of determining the posture of the object may be any suitable method as desired. For example, as shown in
As in this embodiment 1, when the object to be detected is a person, any method can be used to detect the person and infer his/her posture. For example, the technique disclosed in cited reference 1 can be used. At this time, it is also possible to prepare a mask that defines the target region in the image, determine whether or not each detected person is included in the target region, and output the filtered result.
With this embodiment 1, a mask that defines a region in the rugby court in the input image may be used, so that it is possible to exclude detection results pertaining to people such as the audience and support staff. Also, the object’s posture may be inferred after the image data is internally resized to a predetermined size.
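The mask-based filtering step can be illustrated with the following sketch. The binary grid mask and the use of the bottom-center of each box as the reference point are assumptions for illustration; a real system would derive the mask from the court region at the input image resolution.

```python
# Sketch: detections whose reference point falls outside the court mask
# (e.g. spectators, support staff) are discarded.

def filter_by_mask(detections, mask):
    """detections: list of (x, y, w, h) boxes; mask: 2D list of 0/1.
    A detection is kept when the bottom-center of its box (roughly the
    feet of a person) lies inside the masked target region."""
    kept = []
    height, width = len(mask), len(mask[0])
    for (x, y, w, h) in detections:
        cx, cy = int(x + w / 2), int(y + h)  # bottom-center point
        if 0 <= cx < width and 0 <= cy < height and mask[cy][cx] == 1:
            kept.append((x, y, w, h))
    return kept
```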
The tracking part 130 receives as input the object detection result at the present time output from the detection part 120 and the past tracking result, and outputs the tracking result at the present time. Here, the tracking result is composed of a set of tracking IDs, assigned to tracking-target individuals on a respective basis, and a set of the positions (including postures) of the individuals of respective tracking IDs at the present time. The tracking part 130 may perform the above-described tracking, for example, by using the techniques disclosed in cited reference 4.
In the tracking result of the present time output from the tracking part 130, the label identifying part 140 assigns a label to an individual with an ID that has not been assigned a label. As mentioned above, the label in this embodiment 1 is defined by a combination of a class and an attribute.
As shown in
The class visibility determining part 141 receives the set of object positions at the present time as input, determines, for each object, whether or not the object is visible enough to recognize its class, and outputs the result.
To determine whether an object is visible enough to recognize its class, the class visibility determining part 141 according to this embodiment 1 calculates to what extent the object is not hidden by objects located in front of the object, and compares this value with a predetermined threshold.
The method of extracting objects located in front of an object of interest is not limited to a specific method, and any method can be used. An example method of extracting objects located in front of an object of interest will be described with reference to
Also, the calculation of the extent to which an object of interest is not hidden is not limited to a specific method, and any method can be used. For example, the intersection-over-union (IoU) may be calculated between the object of interest and each object that is located in front of the object of interest, and the maximum value may be subtracted from 1 to determine the extent to which the object of interest is not hidden, that is, its visibility.
For example, in the example of
For example, if V2 is greater than the threshold with respect to the person in the back, the class visibility determining part 141 determines that the person in the back is visible enough to recognize the person’s class.
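The visibility computation described above can be sketched as follows. The box format and the threshold value are illustrative assumptions; how the set of objects in front is obtained is outside this sketch.

```python
# Sketch: visibility V of an object is 1 minus the maximum IoU with the
# boxes judged to be in front of it, and V is compared with a threshold.

def iou(a, b):
    """Intersection-over-union of boxes a, b given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def is_class_visible(target, boxes_in_front, threshold=0.5):
    """True when V = 1 - max IoU(target, front object) exceeds threshold."""
    if not boxes_in_front:
        return True  # nothing occludes the object
    v = 1.0 - max(iou(target, b) for b in boxes_in_front)
    return v > threshold
```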
In the tracking result at the present time, if there is an object that is not assigned a class and that is determined by the class visibility determining part 141 to be visible enough to recognize its class, the class inferring part 142 infers and outputs the class of the object. The method of inferring the class is not limited to a specific method, and any method can be used.
For example, the technique disclosed in cited reference 5 may be used to extract a feature from a partial region in an image frame corresponding to an object’s position, input the feature into an identifier such as a support vector machine (SVM), and classify the object in the partial region into a predetermined class. Alternatively, for example, typical features may be defined in advance for each class, and then a feature extracted from a partial region may be compared with these typical features, and the class corresponding to the most similar value may be assigned. Any method may be used to calculate the typical features, and, for example, features extracted from objects of each class may be averaged.
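The second option above, comparison with pre-computed typical features, can be sketched as follows. The use of cosine similarity is an assumption for illustration; the feature extractor itself is outside this sketch.

```python
# Sketch: assign the class whose typical (average) feature is most
# similar to the feature extracted from the object's partial region.

import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def infer_class(feature, typical_features):
    """typical_features: dict mapping class name -> average feature vector."""
    return max(typical_features,
               key=lambda c: cosine_similarity(feature, typical_features[c]))
```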
The attribute visibility determining part 143 receives the set of object positions at the present time as input, determines, for each object, whether or not the object is visible enough to recognize its attribute, and outputs the result. With this embodiment 1, posture information of each object is used to determine whether or not each object is visible enough to recognize its attribute.
In this embodiment 1, a uniform number is printed on the back of a player, that is, the target object. Given this condition, an example method of determining whether or not an object is visible enough to recognize its attribute will be described with reference to
In the example of
The attribute visibility determining part 143 determines whether the following equation is satisfied.
In the above formula, the bar over p_ls p_rs denotes the length of the line segment connecting p_ls and p_rs. Also, σ_aspect is a parameter, where 1 > σ_aspect > 0. When the attribute visibility determining part 143 determines that the above formula is satisfied, it outputs "True" for the person (that is, the region containing the attribute is visible). When the attribute visibility determining part 143 determines that the above formula is not satisfied, it outputs "False" (that is, the region containing the attribute is not visible).
In addition to the method of using the posture of the object or instead of the method of using the posture of the object, the attribute visibility determining part 143 may determine whether or not the target object is visible enough to recognize its attribute based on the overlap between objects, in the same way as the class visibility determining part 141 does.
Note that, in addition to the method of using the overlap between objects or instead of the method of using the overlap between objects, the class visibility determining part 141 may use a method of using the posture of the object to determine whether or not its class is recognizable, in the same way as the attribute visibility determining part 143 does.
In the tracking result at the present time, if there is an object that is not assigned an attribute and that is determined by the attribute visibility determining part 143 to be visible enough to recognize its attribute, the attribute inferring part 144 infers and outputs the attribute of the object. The method of inferring the attribute is not limited to a specific method, and any method can be used.
According to this embodiment 1, it is possible to recognize a specific object at high speed, with high accuracy.
Next, embodiment 2 will be explained. With embodiment 2, an information superimposition device 200, which corresponds to the information superimposition part 200 in the information indicating device 300 of
However, this is only an example, and the information superimposition device 200 may operate by using an object recognition result obtained by any method as input, without presuming the use of the object recognition device 100 of embodiment 1. The operation of each part of the information superimposition device 200 is described below.
The object superimposition information storage part 210 stores superimposition information as shown in
Using the object recognition result, the candidate superimposition positions, and the object/superimposition position association result in the previous image frame as input, the associating part 230 associates the objects in the image frame at the present time with superimposition positions. Based on the result of associating objects with superimposition positions by the associating part 230, the superimposition part 240 superimposes the object superimposition information over the image frame at the present time, and outputs the resulting image frame. Image frames in which object superimposition information is superimposed are thus output sequentially, so that, for example, a video in which information related to objects is superimposed is displayed on a user terminal.
Here, the candidate superimposition position selection part 220 outputs candidate superimposition positions that do not overlap the object positions recognized in the image frame of the present time. This makes it possible to satisfy the condition (i) “superimposed information does not occlude the target object.” Also, through optimization of an objective function that satisfies the condition that superimposed information is displayed near each object recognized in the image frame at the present time, and the condition that each superimposed information displayed in the previous image frame does not change its position significantly in the present frame, the associating part 230 determines the superimposition information display position for each object, from the candidate superimposition positions. As a result of this, the above-mentioned conditions (ii) “proximity to the target object is maintained” and (iii) “the consistency of the position of superimposed information is maintained over time” can be satisfied.
As described above, for each image frame processed by the object recognition device 100, the information superimposition device 200 receives the processing result, namely the object recognition result, as input.
Using the object recognition result at each time as input, the candidate superimposition position selection part 220 selects and outputs candidate superimposition positions, that is, positions that do not overlap the recognized objects and in which object superimposition information can be superimposed.
As for the method of outputting candidate object superimposition positions, for example, as shown in
Also, as for the method of calculating the overlaps in the above process, for example, intersection-over-union (IoU) may be used. That is, by using IoU, for example, regions corresponding to the superimposition positions where IoU=0 (the dotted-line frames in the right part of
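The candidate selection step can be sketched as follows. Dividing the frame into a regular grid of fixed-size cells is one way to generate the candidate regions; the grid layout and cell size here are illustrative assumptions.

```python
# Sketch: keep as candidates only the grid cells whose IoU with every
# recognized object box is 0 (i.e., no overlap at all).

def overlaps(a, b):
    """a, b: boxes as (x1, y1, x2, y2); True if they intersect (IoU > 0)."""
    return not (a[2] <= b[0] or b[2] <= a[0] or a[3] <= b[1] or b[3] <= a[1])

def candidate_positions(frame_w, frame_h, cell_w, cell_h, object_boxes):
    candidates = []
    for y in range(0, frame_h - cell_h + 1, cell_h):
        for x in range(0, frame_w - cell_w + 1, cell_w):
            cell = (x, y, x + cell_w, y + cell_h)
            if not any(overlaps(cell, box) for box in object_boxes):
                candidates.append(cell)  # IoU = 0 with all objects
    return candidates
```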
Note that, although the above example (the example shown in the right part of
The associating part 230 associates the candidate superimposition positions output by the candidate superimposition position selection part 220 with the objects recognized at the present time, and determines the information superimposition position for each object.
To be more specific, the associating part 230 determines these associations such that the condition that superimposition information is displayed near each object recognized in the image frame at the present time, and the condition that each superimposition information displayed in the previous image frame does not change its position significantly in the image frame at the present time, are both satisfied at the same time. An example of how to perform the above associations will be described below.
Let {(l_1, b_1), ..., (l_i, b_i), ..., (l_{N_t}, b_{N_t})} be the set of specific objects detected by the object recognition device 100 from frame I_t at time t, where l_i ∈ L_t is the label of a specific object and b_i is the detection result; b_i is a vector defined by, for example, the coordinates of the four corners of a rectangle. Also, let {c_1, ..., c_j, ..., c_M} be the set of candidate superimposition positions at present time t. For example, when the superimposition information is an image, c_j is a vector of the four corners of a rectangle. Furthermore, {p_1, ..., p_i, ...} denotes the positions at which the label information l_i ∈ L_{t-1} of each object was superimposed at time t-1, immediately before present time t.
Letting a_ij (where {a_ij} ∈ R^{N_t×M}) be a value indicating the appropriateness of associating object i with candidate superimposition position j, the value may be defined as in following equation 1, and the associating part 230 may calculate each a_ij.
dist(m, n) in above equation 1 is a function that outputs the distance between positions m and n; for example, it may be defined as the L2 norm between the center coordinates of m and n. Equation 1 means that, when the information of label l_i of a specific object is superimposed at time t-1, a_ij is the distance between that position p^{t-1}_i and candidate superimposition position c_j at time t, and that, when the information of label l_i is not superimposed at time t-1, a_ij is the distance between the specific object's position b_i and candidate superimposition position c_j.
When the label l_i of a specific object is superimposed at time t-1, making the distance a_ij between that position p^{t-1}_i and candidate superimposition position c_j small means that the superimposed information displayed in the previous image frame does not change its position significantly in the present frame. Also, making the distance a_ij between the specific object's position b_i and candidate superimposition position c_j small means displaying the superimposition information near each object recognized in the image frame at the present time.
Note that, with this embodiment, when the information of label l_i of a specific object is superimposed at time t-1, the distance a_ij between that position p^{t-1}_i and candidate superimposition position c_j is made small (referred to as "A"), and, when the information of label l_i is not superimposed at time t-1, the distance a_ij between the specific object's position b_i and candidate superimposition position c_j is made small (referred to as "B"). That is, although, with this embodiment, an objective function is defined and the optimization problem of following equation 2 is solved by using both of the above methods A and B, it is equally possible to solve the optimization problem of equation 2 by using only one of methods A and B.
Defining {x_ij} ∈ R^{N_t×M} as a binary matrix that takes the value of 1 when object i is associated with candidate superimposition position j and 0 otherwise, the associating part 230 may determine the {x_ij} that satisfies following equation 2. By this means, the associating part 230 can obtain an association {x_ij}* that satisfies, at the same time, both the condition that superimposed information is displayed near each object recognized in the image frame at the present time and the condition that each piece of superimposed information displayed in the previous image frame does not change its position significantly in the present frame.
Above equation 2 means finding the {x_ij} that minimizes the total sum of a_ij x_ij under the restrictions that each object be associated with one candidate superimposition position, and that each candidate superimposition position be associated with at most one object. Equation 2 can be solved by using any algorithm; for example, the Hungarian algorithm may be used.
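Equations 1 and 2 can be sketched as follows, assuming the L2-norm-between-centers choice of dist() mentioned earlier. The exhaustive search shown here is only feasible for small N and M; at scale, a Hungarian-algorithm solver (e.g., `scipy.optimize.linear_sum_assignment`) would be used instead.

```python
# Sketch: minimize the total cost a_ij over assignments in which each
# object gets exactly one candidate position and each candidate position
# is used by at most one object.

from itertools import permutations

def center_distance(box_a, box_b):
    """L2 norm between the centers of two boxes given as (x1, y1, x2, y2);
    one possible choice of dist() in equation 1."""
    ax, ay = (box_a[0] + box_a[2]) / 2, (box_a[1] + box_a[3]) / 2
    bx, by = (box_b[0] + box_b[2]) / 2, (box_b[1] + box_b[3]) / 2
    return ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5

def assign(cost):
    """cost: N x M list of a_ij values with M >= N.
    Returns a list whose entry i is the candidate index chosen for object i."""
    n, m = len(cost), len(cost[0])
    best, best_assign = float("inf"), None
    for perm in permutations(range(m), n):  # one distinct candidate per object
        total = sum(cost[i][j] for i, j in enumerate(perm))
        if total < best:
            best, best_assign = total, list(perm)
    return best_assign
```

The cost matrix would be built per equation 1: row i uses the distance from p^{t-1}_i when label l_i was superimposed in the previous frame, and from b_i otherwise.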
Note that, although the above example selects an association that simultaneously satisfies both the condition that superimposition information be displayed near each object recognized in the image frame at the present time and the condition that each piece of superimposed information displayed in the previous image frame not change its position significantly in the present frame, this is just an example. It is equally possible to select an association that satisfies only one of these two conditions.
The superimposition part 240 superimposes the object superimposition information over the image frame at the present time, based on the results of associating objects with superimposition positions, obtained by the associating part 230.
As explained above, according to this embodiment 2, superimposed information can be displayed such that the viewer can easily understand the content of the superimposed information. To be more specific, for example, it is possible to superimpose information over a video such that the conditions: (i) the superimposed information does not occlude the target object; (ii) proximity to the target object is maintained; and (iii) the consistency of the position of the superimposed information is maintained over time, are all satisfied at the same time. Note that it is not essential to satisfy all these three conditions at the same time. If at least one of these conditions is satisfied, superimposed information can be displayed such that the viewer can easily understand the content of the superimposed information. Nevertheless, by satisfying the above three conditions at the same time, the effect of displaying superimposed information in such a way that the content of the superimposed information can be easily understood may be maximized.
All of the object recognition device 100, the information superimposition device 200, and the information indicating device 300 can be implemented, for example, by causing a computer to execute programs. This computer may be a physical computer or a virtual machine on a cloud. Note that, hereinafter, the object recognition device 100, the information superimposition device 200, and the information indicating device 300 will be collectively referred to as “devices.”
That is, the devices can be realized by using hardware resources, such as a central processing unit (CPU) and memory built into a computer, to execute programs corresponding to the processes performed by the devices. The above programs can be recorded in a computer-readable recording medium (portable memory, etc.) and saved or distributed. The programs can also be provided through a network such as the Internet, or by e-mail.
The programs for realizing the processes in the computer are provided by means of a recording medium 1001 such as a compact disc read-only memory (CD-ROM) or a memory card, for example. When the recording medium 1001 storing the programs is placed in the drive device 1000, the programs are installed from the recording medium 1001 to the auxiliary memory device 1002 via the drive device 1000. However, the programs do not necessarily have to be installed from the recording medium 1001, and may be downloaded from another computer via a network. The auxiliary memory device 1002 stores the installed programs, and also stores the necessary files and data.
When there is an instruction to start a program, the memory device 1003 reads out the program from the auxiliary memory device 1002 and stores it. The CPU 1004 implements the functions related to the devices according to the program stored in the memory device 1003. The interface device 1005 is used as an interface for connecting with a network, and functions as a transmitting part and a receiving part. The display device 1006 displays a graphical user interface (GUI) or the like according to the program. The input device 1007 is composed of a keyboard, a mouse, buttons, a touch panel, or the like, and is used to input various operational instructions. The output device 1008 outputs calculation results.
This specification discloses at least the following object recognition device, object recognition method, and program.
An object recognition device having:
The object recognition device according to number 1, in which the processor is further configured to determine whether or not the attribute of the undetermined object can be determined by calculating an index value that indicates an extent to which the undetermined object is not hidden by other objects, and by comparing the index value with a threshold.
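One possible index value of the kind described above, shown here purely as an illustrative sketch with hypothetical bounding boxes and an arbitrary threshold, is the fraction of an object's bounding box that is not covered by the bounding boxes of other objects:

```python
def visible_ratio(box, others):
    """Return the fraction of `box` (x1, y1, x2, y2) that is not covered by
    any of the boxes in `others`.  Coverage is counted on a pixel grid, which
    keeps the sketch simple even when the other boxes overlap each other."""
    x1, y1, x2, y2 = box
    total = visible = 0
    for x in range(int(x1), int(x2)):
        for y in range(int(y1), int(y2)):
            total += 1
            if not any(ox1 <= x < ox2 and oy1 <= y < oy2
                       for (ox1, oy1, ox2, oy2) in others):
                visible += 1
    return visible / total if total else 0.0

box = (0, 0, 10, 10)        # undetermined object
others = [(5, 0, 15, 10)]   # another object hiding its right half
print(visible_ratio(box, others))  # → 0.5
```

The attribute of the undetermined object would then be determined only when this index value meets or exceeds a chosen threshold.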
The object recognition device according to number 1, in which the processor is further configured to determine whether or not the attribute of the undetermined object can be determined by determining whether or not a predetermined region of the undetermined object is visible, based on information about a posture of the undetermined object.
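As an illustrative sketch of the posture-based determination, a predetermined region (for example, the back of a person, where a jersey number would appear) can be regarded as visible when the body keypoints surrounding that region are detected with sufficient confidence. The keypoint names and confidence threshold below are hypothetical assumptions, not part of the disclosure.

```python
def region_visible(keypoints,
                   required=("left_shoulder", "right_shoulder",
                             "left_hip", "right_hip"),
                   min_confidence=0.5):
    """Judge whether a predetermined region of an object is visible, based on
    posture information: every keypoint surrounding the region must be
    detected with at least `min_confidence`."""
    return all(keypoints.get(name, 0.0) >= min_confidence for name in required)

# Keypoint name -> detection confidence, as a pose estimator might output.
pose = {"left_shoulder": 0.9, "right_shoulder": 0.8,
        "left_hip": 0.7, "right_hip": 0.2}   # right hip occluded
print(region_visible(pose))  # → False
```

When the region is judged not visible, the attribute determination for that object would be deferred to a later frame.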
An object recognition method, including:
A program for causing a computer to function as the object recognition device according to number 1.
A non-transitory recording medium storing a program for causing a computer to:
This specification discloses at least the following information superimposition device, information superimposition method, and program.
An information superimposition device for superimposing, in a video, superimposition information that is associated with an object in the video, the information superimposition device including:
An information superimposition device for superimposing, in a video, superimposition information that is associated with an object in the video, the information superimposition device including:
The information superimposition device according to number 1, in which the processor is further configured to determine the position of the superimposition information, based on the set of the candidate superimposition positions and the positions of the one or more objects recognized in the video, such that the distance between the object and the superimposition information that is associated with the object is made small, and such that the position of the superimposition information changes little between image frames.
The information superimposition device according to number 3, in which the processor is configured to determine the position of the superimposition information for each object by solving an optimization problem in which an objective function is designed such that:
An information superimposition method to be executed by an information superimposition device for superimposing, in a video, superimposition information that is associated with an object in the video, the information superimposition method including:
A program for causing a computer to function as the information superimposition device according to number 1.
A non-transitory recording medium storing a program for causing a computer to:
A non-transitory recording medium storing a program for causing a computer to:
A non-transitory recording medium storing a program for causing a computer to:
Although embodiments of the present invention have been described above, the present invention is by no means limited to these specific embodiments, and various modifications and changes may be made within the scope of the present invention recited in the accompanying claims.
[1] X. Zhou, D. Wang, and P. Krähenbühl. Objects as points. arXiv preprint arXiv:1904.07850, 2019.
[2] G. Li, S. Xu, X. Liu, L. Li, and C. Wang. Jersey number recognition with semi-supervised spatial transformer network. In CVPR Workshop, 2018.
[3] Y. Wu, A. Kirillov, F. Massa, W.-Y. Lo, and R. Girshick. Detectron2. https://github.com/facebookresearch/detectron2, 2019.
[4] A. Bewley, Z. Ge, L. Ott, F. Ramos, and B. Upcroft. Simple online and realtime tracking. In ICIP, 2016.
[5] K. Zhou, Y. Yang, A. Cavallaro, and T. Xiang. Omni-scale feature learning for person re-identification. In ICCV, 2019.
| Number | Date | Country | Kind |
| --- | --- | --- | --- |
| 2020-206298 | Dec 2020 | JP | national |
The present application is a continuation filed under 35 U.S.C. 111(a) claiming the benefit under 35 U.S.C. 120 and 365(c) of PCT International Application No. PCT/JP2021/045401, filed on Dec. 9, 2021, and designating the U.S., which is based on and claims priority to Japanese Patent Application No. 2020-206298, filed on Dec. 11, 2020. The entire contents of PCT International Application No. PCT/JP2021/045401 and Japanese Patent Application No. 2020-206298 are incorporated herein by reference.
| | Number | Date | Country |
| --- | --- | --- | --- |
| Parent | PCT/JP2021/045401 | Dec 2021 | WO |
| Child | 18325349 | | US |