The present disclosure relates to an information processing device, an information processing method, and an information processing program.
In recent years, technologies for estimating the three-dimensional position of a target have been developed. For example, NPL 1 discloses a technology of estimating the three-dimensional position of a target by using a DNN (Deep Neural Network) on the basis of a depth image.
[NPL 1]
Jonathan Tompson, et al., “Real-Time Continuous Pose Recovery of Human Hands Using Convolutional Networks,” ACM Transactions on Graphics, [Online], [Retrieved on Oct. 6, 2020], <http://yann.lecun.com/exdb/publis/pdf/tompson-siggraph-14.pdf>
In a case where a depth image is used for DNN learning, as in the technology disclosed in NPL 1, it is important to increase the quality of the depth image in order to increase the accuracy of estimating a three-dimensional position.
For example, RAW images taken at different time points are integrated to generate a depth image for use in estimating a three-dimensional position. However, due to displacement of the position of a target among the RAW images, it is difficult to calculate the distance between a photographing position and the target with high accuracy.
To this end, the present disclosure proposes a new and improved information processing device capable of calculating the distance between a photographing position and a target with higher accuracy.
The present disclosure provides an information processing device including an acquisition section that acquires a signal value of a corresponding pixel where the same target is located in each of multiple frames which are obtained when a subject is photographed over multiple time sections, and a distance calculation section that calculates a distance between a photographing position and the target on the basis of the signal values acquired by the acquisition section.
Further, the present disclosure provides an information processing method that is performed by a computer. The method includes acquiring a signal value of a corresponding pixel where the same target is located in each of multiple frames which are obtained when a subject is photographed over multiple time sections, and calculating a distance between a photographing position and the target on the basis of the acquired signal values.
Moreover, the present disclosure provides an information processing program for causing a computer to function as an acquisition section that acquires a signal value of a corresponding pixel where the same target is located in each of multiple frames which are obtained when a subject is photographed over multiple time sections, and a distance calculation section that calculates a distance between a photographing position and the target on the basis of the signal values acquired by the acquisition section.
Hereinafter, a preferred embodiment of the present disclosure will be explained in detail with reference to the drawings. It is to be noted that components having substantially the same functional structure are denoted by the same reference sign throughout the present description and the drawings, and a redundant explanation thereof will be omitted.
It is to be noted that the explanation will be given in accordance with the following order.
One embodiment of the present disclosure relates to an information processing system capable of calculating the distance between a photographing position and a target with higher accuracy. The general outline of the information processing system will be explained below with reference to
(ToF camera 10)
The ToF camera 10 emits an emitted wave w1 to a target o1, and receives a reflected wave w2 reflected from the target. Specifically, a functional configuration of the ToF camera 10 will be explained with reference to
The modulation signal generation section 101 generates a modulation signal having a sine wave shape, for example. The modulation signal generation section 101 outputs the generated modulation signal to the light emission section 105 and the light reception section 109.
The light emission section 105 emits, to the target o1, the emitted wave w1 generated on the basis of the modulation signal inputted from the modulation signal generation section 101, for example.
The light reception section 109 has a function of receiving the reflected wave w2 which results from the emitted wave w1 emitted from the light emission section 105 and reflected by the target o1, for example.
In addition, the light reception section 109 has a shutter for controlling exposure and multiple pixels arranged in a lattice shape. The light reception section 109 controls an open/close pattern of the shutter on the basis of the modulation signal inputted from the modulation signal generation section 101. Exposure is performed in accordance with the open/close pattern in each of multiple time sections so that each of the pixels in the light reception section 109 acquires a signal value from the reflected wave w2.
A set of the signal values that the pixels acquire from the reflected wave w2 received in one time section forms one microframe. The ToF camera 10 outputs the microframes to the information processing device 20. In the present description, a series of processes from emission of the emitted wave w1 to acquisition of the microframes is referred to as photographing, in some cases.
The functional configuration of the ToF camera 10 has been explained above. Next, the explanation of the information processing system is resumed with reference to
The information processing device 20 has a function of acquiring the signal value of a corresponding pixel where the same target o1 is located in each of multiple microframes obtained by photographing the target o1 with the ToF camera 10 over multiple time sections, and of calculating the distance between the photographing position and the target on the basis of the signal value of the corresponding pixel.
It is to be noted that the ToF camera 10 may be integrated with the information processing device 20, or may be formed separately from the information processing device 20.
The ToF camera 10 can be utilized in a variety of cases. Hereinafter, some examples of conceivable uses of the ToF camera 10 will be explained with reference to
Some examples of conceivable uses of the ToF camera 10 have been explained above. Next, a method of calculating the three-dimensional position of a target, on the basis of multiple signal values acquired by photographing a subject with the ToF camera 10, will be explained with reference to
A period of time from emission of the emitted wave w1 from the light emission section 105 to reception, at the light reception section 109, of the reflected wave w2 resulting from the emitted wave w1, that is, the light round-trip time, is calculated from the phase difference D between the emitted wave w1 and the reflected wave w2. On the basis of this light round-trip time, the distance between the ToF camera 10 and the target o1 can be calculated.
In other words, when the phase difference D between the emitted wave w1 and the reflected wave w2 is obtained, the distance between the ToF camera 10 and the target o1 can be calculated. Here, one example of a method of calculating the phase difference D between the emitted wave w1 and the reflected wave w2 will be explained.
First, the light reception section 109 acquires signal values containing different phase components from the respective reflected waves w2 having arrived in multiple time sections. For example, in accordance with the timing of starting to open/close the shutter, the light reception section 109 acquires a signal value containing, as one example of a first component, an I component (0° phase, 180° phase) which is in phase with the emitted wave w1, or a signal value containing, as one example of a second component, a Q component (90° phase, 270° phase) which is in quadrature with the emitted wave w1. Hereinafter, one example of a method of acquiring signal values containing different phase components will be explained with reference to
By opening/closing the shutter in accordance with the abovementioned opening/closing pattern P1 in a certain time section, the light reception section 109 can acquire, from the reflected wave w2, a signal value containing the I component with respect to the emitted wave w1. It is to be noted that, by opening/closing the shutter in accordance with an opening/closing pattern having a phase shifted by 180° from the phase of the abovementioned opening/closing pattern P1 (i.e., an opening/closing pattern of a phase shifted by 180° from the phase of the emitted wave w1), the light reception section 109 can also acquire, from the reflected wave w2, a signal value containing the I component with respect to the emitted wave w1.
Similarly, by opening/closing the shutter in accordance with the abovementioned opening/closing pattern P2 in another time section, the light reception section 109 can acquire, from the reflected wave w2, a signal value containing the Q component with respect to the emitted wave w1. It is to be noted that, by opening/closing the shutter in accordance with an opening/closing pattern of a phase shifted by 180° from the phase of the abovementioned opening/closing pattern P2 (i.e., an opening/closing pattern of a phase shifted by 270° from the phase of the emitted wave w1), the light reception section 109 can also acquire, from the reflected wave w2, a signal value containing the Q component with respect to the emitted wave w1.
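For reference, the relation between the shutter opening/closing patterns and the acquired phase components can be illustrated by the following minimal simulation in Python; the sinusoidal waveforms, the half-period shutter, and the assumed phase delay are illustrative assumptions of this sketch, not details taken from the present disclosure.

```python
import numpy as np

# Simulate one modulation period: a reflected wave delayed by an assumed
# phase, and a shutter opened during the half-period shifted by 0, 90,
# 180, or 270 degrees from the emitted wave.
t = np.linspace(0.0, 1.0, 10_000, endpoint=False)    # one modulation period
true_phase = 0.8                                      # assumed phase delay [rad]
reflected = 1.0 + np.cos(2 * np.pi * t - true_phase)  # reflected wave (>= 0)

def exposure(shift_deg):
    """Signal value acquired with the shutter opened during the
    half-period shifted by shift_deg from the emitted wave."""
    shutter = np.cos(2 * np.pi * t - np.deg2rad(shift_deg)) > 0
    return reflected[shutter].sum() / len(t)

i0, i180 = exposure(0), exposure(180)     # I-component signal values
q90, q270 = exposure(90), exposure(270)   # Q-component signal values
d = np.arctan2(q90 - q270, i0 - i180)
print(d, true_phase)  # d approximates the assumed phase delay of 0.8 rad
```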
It is to be noted that, in the following explanation, a signal value that contains the I component with respect to the emitted wave w1 and is acquired on the basis of the opening/closing pattern in-phase (0°) with the emitted wave w1 is denoted by I0, while a signal value that contains the I component with respect to the emitted wave w1 and is acquired on the basis of the opening/closing pattern of a phase shifted by 180° from the phase of the emitted wave w1 is denoted by I180.
Similarly, a signal value that contains the Q component with respect to the emitted wave w1 and is acquired on the basis of the opening/closing pattern of a phase shifted by 90° from the phase of the emitted wave w1 is denoted by Q90, while a signal value that contains the Q component with respect to the emitted wave w1 and is acquired on the basis of the opening/closing pattern of a phase shifted by 270° from the phase of the emitted wave w1 is denoted by Q270.
The phase difference D between the emitted wave w1 and the reflected wave w2 is calculated on the basis of I0, I180, Q90, and Q270 acquired from the reflected waves w2 having arrived in multiple time sections. First, a difference I between the signal values I0 and I180 each containing the I component and a difference Q between the signal values Q90 and Q270 each containing the Q component are calculated.
I = I0 − I180 (Expression 1)

Q = Q90 − Q270 (Expression 2)
Then, on the basis of I and Q calculated in accordance with Expression (1) and Expression (2), the phase difference D is calculated in accordance with Expression (3).
D = arctan(Q/I) (Expression 3)
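As a minimal sketch, Expressions (1) to (3) and the subsequent conversion of the phase difference D into a distance can be written as follows; the modulation frequency F_MOD and the example signal values are illustrative assumptions, and arctan2 is used in place of arctan to resolve the quadrant ambiguity.

```python
import numpy as np

C = 299_792_458.0   # speed of light [m/s]
F_MOD = 20e6        # assumed modulation frequency [Hz], an illustration only

def phase_difference(i0, i180, q90, q270):
    """Expressions (1)-(3): I = I0 - I180, Q = Q90 - Q270, D = arctan(Q/I)."""
    i = i0 - i180              # Expression (1)
    q = q90 - q270             # Expression (2)
    return np.arctan2(q, i)    # Expression (3), quadrant-safe form of arctan

def distance_from_phase(d_phase):
    # The phase difference D corresponds to a round-trip time of
    # D / (2 * pi * F_MOD); halving the round trip gives the distance.
    return C * d_phase / (4.0 * np.pi * F_MOD)

# Example with stand-in signal values: D ~ 0.64 rad, distance ~ 0.77 m.
print(distance_from_phase(phase_difference(1.0, 0.2, 0.9, 0.3)))
```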
It is to be noted that, although the signal value of any one of I0, I180, Q90, and Q270 can be acquired from the reflected wave w2 in one time section, two signal values containing the same phase component (I0 and I180, or Q90 and Q270) can also be acquired from the reflected wave w2 in one time section with use of a two-tap sensor type light reception section 109, for example.
Here, one example of a signal value that is acquired, from the reflected wave w2, by a light reception section R1 that is a two-tap sensor type, will be explained with reference to
Also, in a case where the ToF camera 10 photographs the subject in the time sections t=2 to 4, the two-tap sensor type light reception section R1 acquires, at the respective taps, signal values whose phases are shifted by 180° from each other. It is assumed that a set of signal values acquired by the A-tap pixel E1 or the B-tap pixel E2 in each time section is regarded as one microframe. For example, a frame indicating a depth image is calculated from a total of eight microframes. It is to be noted that, in each microframe, the density degree of the subject depends on its phase. In addition, in order to clarify the boundary between the background and the subject, the background is indicated in “white”; more accurately, however, the background is indicated in “black.” The following explanation is based on the assumption that the light reception section 109 in the present disclosure is a two-tap sensor type. However, the light reception section 109 does not need to be a two-tap sensor type.
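For reference, one conceivable way to represent the eight microframes constituting one frame of a two-tap sensor is sketched below; the field names, the image resolution, and the assignment of shutter phases to time sections are assumptions made for illustration only.

```python
from dataclasses import dataclass
import numpy as np

# Illustrative sketch: one microframe of a two-tap sensor, identified by
# its time section, tap (A or B), and shutter phase relative to the
# emitted wave.
@dataclass
class Microframe:
    time_section: int     # t = 1 .. 4
    tap: str              # "A" or "B"
    phase_deg: int        # shutter phase relative to the emitted wave
    signal: np.ndarray    # per-pixel signal values

# Eight microframes (4 time sections x 2 taps) form one frame from which
# a depth image is calculated; the two taps of each time section are
# shifted by 180 degrees from each other.
frame = [
    Microframe(t, tap, (base + shift) % 360, np.zeros((480, 640)))
    for t, base in zip(range(1, 5), (0, 90, 0, 90))
    for tap, shift in (("A", 0), ("B", 180))
]
```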
In a case where a depth image is calculated from microframes acquired by photographing a subject over multiple time sections, however, the positions of the target in the respective microframes may change. In such a case, due to the positional displacement of the target, it has been difficult to calculate a depth image with high accuracy.
To address this, the information processing device 20 according to one embodiment of the present disclosure has been devised to reduce the effect of positional displacement of a target. Hereinafter, the details of the configuration and operation of the information processing device 20 according to the present disclosure will be explained in order. It is to be noted that, in the following explanation, the emitted wave w1 and the reflected wave w2 are simply abbreviated as an emitted wave and a reflected wave, respectively.
The target detection section 201 is an example of the detection section, and has a function of detecting, as a corresponding pixel, a pixel where the same target is located in each of microframes acquired when the ToF camera 10 photographs a subject over multiple time sections.
For example, in a case where the tip of the thumb is determined as a target and the position of the tip of the thumb moves over the multiple time sections, the pixel where the target is located changes from a target position (x1, y1) at t=1 to a target position (x4, y4) at t=4. That is, the target position (x1, y1) indicates a pixel where the target is not located (e.g., a space where the subject is not located) at t=4. Accordingly, positional displacement of the target can occur among microframes acquired in multiple time sections.
In order to reduce the effect of such positional displacement of the target, the target detection section 201 previously detects, as a corresponding pixel, a pixel where the tip of the thumb is located in each of the microframes, as depicted in
In addition, the signal value of each of the pixels constituting each of the microframes, which is indicated by the density degree in the respective microframes in
For example, the ToF camera 10 photographs a subject in a time section t1 and opens/closes the shutter in accordance with an opening/closing pattern that is in-phase (0°) with an emitted wave so that a microframe It1A0 is acquired, as depicted in
Further, the ToF camera 10 photographs a subject in a time section t2 and opens/closes the shutter in accordance with an opening/closing pattern of a phase that is shifted by 270° from the phase of the emitted wave so that a microframe Qt2A270 is acquired. In the acquired microframe Qt2A270, the target detection section 201 detects the position (x, y) of the corresponding pixel by using a CNN.
In addition, by using a two-tap sensor type, the target detection section 201 may calculate an average microframe by averaging two microframes that are acquired in the same time section and that each contain an I component or a Q component. In the calculated average microframe, the target detection section 201 may detect the position of the corresponding pixel by using a CNN. With such an average microframe, effects that can vary according to the phase can be reduced.
In addition, by using a two-tap sensor type, the target detection section 201 may calculate a differential microframe indicating the difference between two microframes that are acquired in the same time section and that each contain an I component or a Q component. In the calculated differential microframe, the target detection section 201 may detect the position of the corresponding pixel by using a CNN.
For example, the ToF camera 10 photographs a subject in the time section t1 and opens/closes the shutter in accordance with the opening/closing pattern that is in-phase (0°) with an emitted wave so that the A-tap pixel acquires the microframe It1A0 while the B-tap pixel acquires the microframe It1B180. The target detection section 201 calculates an average microframe It1 of the acquired microframes It1A0 and It1B180, and detects the position (x, y) of the corresponding pixel in the average microframe It1 by using a CNN obtained by learning a feature amount in a target position in the average microframe.
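A minimal sketch of this averaging, assuming the two tap microframes are given as NumPy arrays, is shown below; the array shapes are illustrative, and the CNN detection itself is omitted.

```python
import numpy as np

# Minimal sketch: average the A-tap and B-tap microframes acquired in
# the same time section; the averaged image is then fed to the CNN
# detector (omitted here) to detect the corresponding pixel.
def average_microframe(tap_a: np.ndarray, tap_b: np.ndarray) -> np.ndarray:
    return (tap_a + tap_b) / 2.0

it1_a0 = np.random.rand(480, 640)    # stand-in for microframe It1A0
it1_b180 = np.random.rand(480, 640)  # stand-in for microframe It1B180
it1_avg = average_microframe(it1_a0, it1_b180)  # average microframe It1
```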
For example, the target detection section 201 determines, as a reference microframe, a microframe acquired when a subject is photographed in a certain time section, and calculates a feature amount in each of pixels constituting the reference microframe. Further, for each of the pixels constituting the reference microframe, the target detection section 201 may execute a process of detecting, in each of the microframes acquired when photographing is performed in any other time sections, a pixel having a feature amount equal to or close to the feature amount in the pixel in the reference microframe.
The ToF camera 10 photographs a subject over time sections t=1 to 4, for example, so that microframes of each of the time sections are acquired, as depicted in
Then, the target detection section 201 detects, as a corresponding pixel, each of the pixels detected to have the equal or close feature amount.
It is to be noted that the reference microframe and the other microframes may be included in the same frame, or may be included in different frames.
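A conceivable sketch of this matching process is given below; taking the small image patch around a pixel as its feature amount and limiting the search to a window are assumptions made for illustration only.

```python
import numpy as np

def patch_feature(img, x, y, r=2):
    """Feature amount of pixel (x, y): the surrounding (2r+1)x(2r+1) patch."""
    return img[y - r:y + r + 1, x - r:x + r + 1].astype(float).ravel()

def find_corresponding_pixel(ref, other, x, y, search=8, r=2):
    """Detect, in `other`, the pixel whose feature amount is closest to
    that of pixel (x, y) in the reference microframe `ref`."""
    f_ref = patch_feature(ref, x, y, r)
    h, w = other.shape
    best, best_dist = (x, y), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            xx, yy = x + dx, y + dy
            if not (r <= xx < w - r and r <= yy < h - r):
                continue  # keep the candidate patch inside the image
            dist = np.sum((patch_feature(other, xx, yy, r) - f_ref) ** 2)
            if dist < best_dist:
                best, best_dist = (xx, yy), dist
    return best  # position of the corresponding pixel in `other`
```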
The signal value acquisition section 205 is an example of the acquisition section and has a function of acquiring a signal value of a corresponding pixel where the same target detected by the target detection section 201 is located in each of multiple microframes acquired when the ToF camera 10 photographs a subject.
For example, the signal value acquisition section 205 acquires the signal value It1A0 (x1, y1) of the pixel (x1, y1) which is the corresponding pixel in the microframe It1A0 in
In addition, the signal value acquisition section 205 may be a sensor section that converts a reflected wave received by the light reception section 109 of the ToF camera 10, to an electric signal value. A photographing position in this case indicates the sensor section.
The differential signal value calculation section 209 is an example of the difference calculation section and has a function of calculating a differential signal value that indicates the difference between the signal values in a corresponding pixel in two microframes acquired when the ToF camera 10 photographs a subject in a certain time section.
For example, the differential signal value calculation section 209 calculates a differential signal value It1 (x1, y1) that indicates the difference between the signal value It1A0 (x1, y1) of the pixel (x1, y1) which is the corresponding pixel in the microframe It1A0 and the signal value It1B180 (x1, y1) of the pixel (x1, y1) which is the corresponding pixel in the microframe It1B180, both acquired in the time section t1.
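A minimal sketch of this calculation for one corresponding pixel is shown below; the numeric values are illustrative stand-ins.

```python
# Minimal sketch of the differential signal value of one corresponding
# pixel, i.e., the difference between the two tap readings of one time
# section. Values are illustrative stand-ins.
it1_a0_px = 0.82    # It1A0(x1, y1): A-tap signal value in time section t1
it1_b180_px = 0.31  # It1B180(x1, y1): B-tap signal value in time section t1
# Taking the tap difference also cancels fixed-pattern offsets common to
# both taps (see the effects discussion later in this description).
it1_px = it1_a0_px - it1_b180_px  # differential signal value It1(x1, y1)
```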
The signal value estimation section 213 is one example of the estimation section and has a function of, on the basis of I-component containing signal values acquired in respective two or more time sections, estimating a signal value containing the I component with respect to an emitted wave, which could be obtained from a reflected wave having arrived in another time section.
In addition, the signal value estimation section 213 is one example of the estimation section and has a function of, on the basis of Q-component containing signal values acquired in respective two or more time sections, estimating a signal value containing the Q component with respect to an emitted wave, which could be obtained from a reflected wave having arrived in another time section. Hereinafter, one example of a method of estimating a signal value will be explained with reference to
Here, the distance between the photographing position and the target in the time section t2 can be calculated, for example, on the basis of the I-component containing differential signal value It1 obtained from the reflected wave having arrived in the time section t1 and the Q-component containing differential signal value Qt2 obtained from the reflected wave having arrived in the time section t2.
Alternatively, the signal value estimation section 213 estimates an I-component containing differential signal value I′t2, which could be obtained from the reflected wave having arrived in the time section t2, on the basis of I-component containing differential signal values It1 and It3 obtained from the reflected waves having arrived in the time sections t1 and t3, respectively, for example. Accordingly, the position calculation section 217, which will be described later, can calculate the distance between the photographing position and the target with higher accuracy.
Further, the signal value estimation section 213 may estimate a Q-component containing differential signal value Q′t2, which could be obtained from the reflected wave having arrived in the time section t2, on the basis of a Q-component containing differential signal value Qt4 obtained from the reflected wave having arrived in the time section t4 and a Q-component containing differential signal value Qx obtained in another frame.
Here, one example of a method of estimating an I-component containing differential signal value or Q-component containing differential signal value which could be obtained from the reflected wave having arrived in the time section t2 will be explained with reference to
Further, the microframes acquired in the time sections t1.1 to t1.4 are combined to form a frame F1. The microframes acquired in the time sections t2.1 to t2.4 are combined to form a frame F2. Moreover, an I-component containing differential signal value in a microframe acquired in the time section t1.1 is referred to as a differential signal value It1.1. A Q-component containing differential signal value in a microframe acquired in the time section t1.2 is referred to as a differential signal value Qt1.2.
It is to be noted that the time section t2 in
In the estimation example E1, the signal value estimation section 213 estimates a differential signal value I′t2.2, which could be acquired from a reflected wave having arrived in the time section t2.2 and contains the I component with respect to the emitted wave, by, for example, interpolation, on the basis of an I-component containing differential signal value It2.1 acquired from the reflected wave having arrived in the time section t2.1 in the frame F2 and an I-component containing differential signal value It2.3 acquired from the reflected wave having arrived in the time section t2.3 in the frame F2.
In the estimation example E2, the signal value estimation section 213 estimates a differential signal value Q′t2.2 which could be acquired from the reflected wave having arrived in the time section t2.2 and contains a Q component with respect to the emitted wave by, for example, interpolation, on the basis of a Q-component containing differential signal value Qt1.4 acquired from the reflected wave having arrived in the time section t1.4 in the frame F1 and a Q-component containing differential signal value Qt2.4 acquired from the reflected wave having arrived in the time section t2.4 in the frame F2.
It is to be noted that, in the estimation example E2, two differential signal values containing the Q component are obtained: the differential signal value Qt2.2 calculated by the differential signal value calculation section 209 and the differential signal value Q′t2.2 estimated by the signal value estimation section 213. In such a way, multiple I-component containing or Q-component containing differential signal values obtained for a certain time section may be integrated by, for example, weighted averaging. As a result, the effect of noise generated in the differential signal value calculated by the differential signal value calculation section 209 can be reduced.
In each of the abovementioned estimation examples E1 and E2, a method of estimating a signal value by interpolation has been explained. However, extrapolation may be used to estimate a signal value, for example. The estimation example E3, which is one example of a method of estimating a differential signal value by extrapolation, will be explained below.
In the estimation example E3, the signal value estimation section 213 estimates an I-component containing differential signal value I′t2.2, which could be acquired from the reflected wave having arrived in the time section t2.2, by extrapolation, on the basis of an I-component containing differential signal value It1.3 acquired from the reflected wave having arrived in the time section t1.3 in the frame F1 and an I-component containing differential signal value It2.1 acquired from the reflected wave having arrived in the time section t2.1 in the frame F2.
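A conceivable sketch of the estimation examples E1 to E3 is given below; the linear model and the numeric values are assumptions of this sketch, since the present disclosure names interpolation and extrapolation only as examples.

```python
# Sketch of estimation examples E1-E3: the differential signal value of a
# missing time section is estimated from two known time sections. The
# linear model is an assumption for illustration.
def estimate_linear(t_a, v_a, t_b, v_b, t_target):
    """Line through (t_a, v_a) and (t_b, v_b) evaluated at t_target;
    interpolation if t_a < t_target < t_b, extrapolation otherwise."""
    return v_a + (v_b - v_a) * (t_target - t_a) / (t_b - t_a)

i_t22 = estimate_linear(2.1, 0.40, 2.3, 0.48, 2.2)    # E1: interpolate I't2.2
q_t22 = estimate_linear(1.4, 0.33, 2.4, 0.35, 2.2)    # E2: interpolate Q't2.2
i_t22_ex = estimate_linear(1.3, 0.28, 2.1, 0.40, 2.2) # E3: extrapolate I't2.2

# Per the weighted-averaging remark above, a calculated and an estimated
# value for the same time section may also be integrated (weights assumed):
q_t22_integrated = 0.7 * 0.34 + 0.3 * q_t22
```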
Alternatively, the signal value estimation section 213 may receive an input of an I-component containing differential signal value or a Q-component containing differential signal value of a corresponding pixel acquired in a given time section, and may estimate an I-component containing differential signal value or a Q-component containing differential signal value of the corresponding pixel in a certain time section by using a DNN (Deep Neural Network) or an RNN (Recurrent Neural Network), for example.
It is to be noted that the examples in which differential signal values are inputted and outputted have been explained above, but signal values may be inputted and outputted. Specifically, the signal value estimation section 213 may receive an input of an I-component containing signal value or a Q-component containing signal value of a corresponding pixel acquired in a given time section, and may estimate an I-component containing signal value or a Q-component containing signal value of the corresponding pixel in a certain time section by using a DNN or an RNN.
The position calculation section 217 is one example of the distance calculation section, and has a function of calculating the distance between a photographing position and a target on the basis of a signal value of a corresponding pixel containing an I component with respect to an emitted wave and a signal value of the corresponding pixel containing a Q component with respect to the emitted wave. For example, the position calculation section 217 calculates the distance between a photographing position and a target on the basis of an I-component containing differential signal value of a corresponding pixel, which could be acquired from a reflected wave having arrived in a certain time section and is estimated by the signal value estimation section 213, and a Q-component containing differential signal value of the corresponding pixel acquired from a reflected wave having arrived in the same time section.
For example, the position calculation section 217 calculates the distance between the photographing position and the target on the basis of the I-component containing differential signal value I′t2 of the corresponding pixel, which could be acquired from the reflected wave having arrived in the time section t2 and is estimated by the signal value estimation section 213, and the Q-component containing differential signal value Qt2 of the corresponding pixel acquired from the reflected wave having arrived in the time section t2, as depicted in
Further, the position calculation section 217 may calculate the three-dimensional position of the target on the basis of the calculated distance between the photographing position and the target and the positions of the corresponding pixel in the microframes.
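A conceivable sketch of this calculation is given below, assuming a pinhole camera model in which the pixel position and the distance are back-projected into a 3D point; the intrinsic parameters are illustrative assumptions, and the present disclosure does not specify a camera model.

```python
import numpy as np

# Sketch: compute the three-dimensional position of the target from the
# calculated distance and the corresponding-pixel position, assuming a
# pinhole camera. fx, fy, cx, cy are illustrative intrinsics.
def pixel_to_3d(x, y, distance, fx=500.0, fy=500.0, cx=320.0, cy=240.0):
    ray = np.array([(x - cx) / fx, (y - cy) / fy, 1.0])
    return distance * ray / np.linalg.norm(ray)  # point `distance` away along the ray

print(pixel_to_3d(400, 300, 0.768))  # e.g., the distance obtained earlier
```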
The functional configuration of the information processing device 20 according to the present disclosure has been explained so far. Next, operation of an information processing system according to the present disclosure will be explained with reference to
Then, the target detection section 201 detects, as a corresponding pixel, a pixel where a target is located in each of the acquired microframes (S105).
Then, the signal value acquisition section 205 acquires an I-component containing signal value or a Q-component containing signal value of each of the corresponding pixels detected in S105 (S109).
Next, the differential signal value calculation section 209 calculates, as a differential signal value, the difference between signal values of the corresponding pixels which contain the same phase component acquired by photographing in the same time section (S113).
Then, on the basis of the I-component containing differential signal values acquired in each of two or more time sections, the signal value estimation section 213 estimates a differential signal value containing an I component with respect to an emitted wave which could be acquired from a reflected wave having arrived in another time section (S117).
Next, on the basis of the I-component containing differential signal value of the other time section estimated in S117 and the Q-component containing differential signal value of the other time section, the position calculation section 217 calculates the distance between the photographing position and the target (S121).
On the basis of the distance between the photographing position and the target calculated in S121, the position calculation section 217 calculates the three-dimensional position of the target, and the information processing device 20 ends the three-dimensional position calculation process (S125).
The operation of the information processing system according to the present disclosure has been explained so far. Next, effects which are provided by the present disclosure will be explained.
According to the present disclosure explained so far, a variety of effects can be obtained. For example, the signal value acquisition section 205 acquires signal values of corresponding pixels where the same target is located, so that the effect of displacement of the two-dimensional position of the target, which is generated when a subject is photographed over multiple time sections, can be reduced. Accordingly, the position calculation section 217 can calculate the distance between the photographing position and the target with higher accuracy.
In addition, the signal value estimation section 213 estimates a signal value containing a component in phase with the phase component of an emitted wave, which could be acquired from a reflected wave having arrived in a certain time section, so that the effect of displacement of the two-dimensional position of the target, which is generated when a subject is photographed over multiple time sections, can be reduced. Accordingly, the position calculation section 217 can calculate the distance between the photographing position and the target with higher accuracy.
In addition, the differential signal value calculation section 209 calculates a differential signal value indicating the difference between the signal values of a corresponding pixel in two microframes acquired in the same time section when a subject is photographed, so that fixed pattern noise which is included in the signal values can be reduced.
The camera 251 is formed as one example of the ToF camera 10 according to the present disclosure. The camera 251 acquires a microframe by emitting a wave to a target and receiving a reflected wave resulting from reflection on the target.
The communication section 255 transmits data held in the ToF camera 10 or the information processing device 20, for example, to an external device.
The CPU 259 functions as a computation processor and a controller, and controls general operation in the information processing device 20 in accordance with various programs. Further, the CPU 259 collaborates with software, the main memory 271, and the flash memory 275, which will be explained later, to implement the functions of the target detection section 201, the signal value estimation section 213, the position calculation section 217, etc.
The display 263 is a display device such as a CRT (Cathode Ray Tube) display device, a liquid crystal display (LCD) device, or an OLED (Organic Light Emitting Diode) device. The display 263 converts video data to a video and outputs the video. The display 263 may display a subject video which indicates the three-dimensional position of a target calculated by the position calculation section 217, for example.
The GPS module 267 measures the latitude, longitude, or altitude of the information processing device 20 by using a GPS signal received from a GPS satellite. The position calculation section 217 can calculate the three-dimensional position of the target including information regarding the latitude, longitude, or altitude, by using information obtained by measurement using a GPS signal, for example.
The main memory 271 temporarily stores a program that is used in execution by the CPU 259 and parameters that vary as appropriate during the execution. The flash memory 275 stores programs, computation parameters, etc. that are used by the CPU 259.
The CPU 259, the main memory 271, and the flash memory 275 are mutually connected through an internal bus, and are connected to the communication section 255, the display 263, the GPS module 267, the audio interface 279, and the battery interface 283, via an input/output interface.
The audio interface 279 is for connection to another device such as a loudspeaker or an earphone, which generates sounds. The battery interface 283 is for connection to a battery or a battery-loaded device.
The preferred embodiment of the present disclosure has been explained in detail with reference to the drawings. However, the technical scope of the present disclosure is not limited to the embodiment. It is clear that a person having ordinary skill in the art can conceive of various modifications and revisions within the scope of the technical concept set forth in the claims. These modifications and revisions are also considered to fall within the technical scope of the present disclosure.
For example, the information processing device 20 does not need to include the target detection section 201. In this case, the position calculation section 217 may calculate the distance between a photographing position and a target acquired by a certain pixel, on the basis of an I-component containing differential signal value calculated for the pixel by the differential signal value calculation section 209 and a Q-component containing differential signal value estimated for the pixel by the signal value estimation section 213. Accordingly, in a situation where displacement of the position of a target can be generated only in the depth direction, the position calculation section 217 can simplify the calculation process while maintaining the accuracy of calculating the distance between the photographing position and the target.
In addition, to detect each of corresponding pixels where multiple targets are located, the target detection section 201 may estimate a signal value of each of the corresponding pixels by using a CNN. For example, when noise or occlusion is generated in a signal value, the signal value acquisition section 205 cannot accurately acquire the signal value of a corresponding pixel. Therefore, the target detection section 201 estimates a signal value of a corresponding pixel upon detection of the corresponding pixel so that a signal value in which the effect of occlusion etc. has been reduced can be acquired.
In addition, the information processing device 20 may further include a learning section that learns a CNN by using microframes and target positions in the microframes. In this case, the information processing device 20 may estimate the distance between a photographing position and a target by using the CNN learned by the learning section.
In addition, the abovementioned information processing method can be performed by cloud computing. Specifically, a server having the functions of the target detection section 201, the signal value acquisition section 205, the differential signal value calculation section 209, the signal value estimation section 213, and the position calculation section 217 may be provided on a network. In this case, the information processing device 20 transmits microframes to the server, and the server calculates the distance between a photographing position and a target by using the microframes received from the information processing device 20, and transmits a result of the calculation to the information processing device 20.
In addition, it is not necessary to perform the steps of the operation of the information processing system according to the present disclosure in accordance with the time-series order depicted in the drawing. For example, the steps of the operation of the information processing system may be performed in accordance with an order different from that depicted in the drawing.
In addition, a computer program for causing hardware such as the CPU 259, the main memory 271, or the flash memory 275 included in the information processing device 20 to exert a function equivalent to that of each of the abovementioned sections of the information processing device 20 can be created.
The effects described in the present description are merely illustrative or exemplary, and are not limitative. That is, the technology according to the present disclosure can provide any other effect that is obvious to a person skilled in the art from the present description, in addition to or in place of the abovementioned effects.
It is to be noted that the present disclosure includes the following configurations.
(1)
An information processing device including:
The information processing device according to (1), in which
The information processing device according to (2), in which,
The information processing device according to (3), in which,
The information processing device according to (4), further including:
The information processing device according to (4), further including:
a detection section that, for each of pixels constituting one frame, executes a process of calculating a feature amount of each of the pixels constituting the one frame and detecting, in another frame, a pixel having a feature amount equal to or close to the calculated feature amount of the pixel, and
The information processing device according to any one of (4) to (6), in which,
The information processing device according to (7), further including:
The information processing device according to (8), in which,
The information processing device according to any one of (3) to (9), in which
The information processing device according to any one of (1) to (10), in which
An information processing method that is performed by a computer, the method including:
An information processing program for causing a computer to function as:
Number | Date | Country | Kind
---|---|---|---
2020-185115 | Nov 2020 | JP | national

Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/JP2021/033842 | 9/15/2021 | WO |