Motion vector generation apparatus, projection image generation apparatus, motion vector generation method, and program

Information

  • Patent Grant
  • 11954867
  • Patent Number
    11,954,867
  • Date Filed
    Thursday, November 14, 2019
  • Date Issued
    Tuesday, April 9, 2024
Abstract
A technique that automatically adjusts a motion given to a projection target using a perceptual model is provided. A motion vector generation apparatus includes a first parameter generation unit that generates a first parameter that is a parameter for scaling a motion vector based on a perceptual difference between a projection result reproduction image which is an image that is obtained when a projection target onto which a projection image obtained based on the motion vector has been projected is photographed and a warped image which is an image generated by distorting an image obtained when the projection target is photographed by a perceptual amount of motion perceived when the projection result reproduction image is viewed, and a motion vector reduction unit that scales the motion vector using the first parameter.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a U.S. 371 Application of International Patent Application No. PCT/JP2019/044619, filed on 14 Nov. 2019, which application claims priority to and the benefit of JP Application No. 2018-221942, filed on 28 Nov. 2018, the disclosures of which are hereby incorporated herein by reference in their entireties.


TECHNICAL FIELD

The present invention relates to a technique for making a target that is not actually moving feel as if it is moving.


BACKGROUND ART

Projection mapping has begun to be widely used as a technique for changing the appearance of a target which is a real object. In projection mapping, the appearance of an object (a projection target) is manipulated by projecting an image (or picture) on the surface of the object using a projector. Patent Literature 1 proposes a method of giving an impression of motion to a stationary projection target by applying this technique. In Patent Literature 1, a picture is generated by adding a motion to a grayscale image of a projection target on a computer, and a picture corresponding to the difference between each frame of the generated picture and the original grayscale image is obtained as a projection image. By setting the projection image in grayscale, it is possible to selectively stimulate a motion information detection mechanism of the human visual system because the human visual system perceives motion information mainly based on luminance information. On the other hand, it is possible to give only an impression of motion to the projection target while maintaining the natural appearance of the projection target because the projection maintains the shape, texture, and color information of the original appearance. Thus, it is possible to make the viewer feel as if the projection target that is not actually moving is moving.


However, there is actually some discrepancy between the projection image containing motion information and the original shape, texture, and color information (the projection target that is not actually moving). If the discrepancy is not too large, it is acceptable to the human visual system and causes no problem in appearance. However, if the discrepancy is large, the projection image does not appear to fit the projection target that is not actually moving, giving an unnatural impression. In general, it is known that the degree of discrepancy between the projection image and the projection target tends to increase as the magnitude of the given motion increases. However, it is difficult to predict how large a motion will give an unnatural impression because this depends on conditions such as the pattern of the projection target, the dynamic range of the projector, the resolution of the projector, the intensity of ambient light, and the sensitivity of the human visual system.


CITATION LIST
Patent Literature



  • Patent Literature 1: WO 2015/163317



Non Patent Literature



  • Non Patent Literature 1: Taiki Fukiage, Takahiro Kawabe, Shin'ya Nishida, “A model of V1 metamer can explain perceived deformation of a static object induced by light projection”, Vision Sciences Society, Florida, U.S.A., May 2016



SUMMARY OF THE INVENTION
Technical Problem

In Patent Literature 1 regarding the projection mapping technique that gives an impression of motion to a real object, the magnitude of motion is manually adjusted to eliminate the sense of discrepancy (the unnaturalness of the projection result) between the projection image and the projection target. However, it takes time to manually adjust the magnitude of motion. Further, because the magnitudes of motion optimal for the regions and frames of given motion information are generally different, it is a very difficult task to manually optimize all of them.


On the other hand, Non Patent Literature 1 proposes a perceptual model that estimates the unnaturalness of a projection result of a projection target when three elements, motion information given to the projection target, an image of the projection target before projection, and an image obtained by photographing the projection result, are given. However, how to optimize the motion information based on such results has not been proposed so far.


It is an object of the present invention to provide a technique for automatically adjusting a motion given to a projection target using a perceptual model.


Means for Solving the Problem

To solve the above problems, a motion vector generation apparatus according to an aspect of the present invention includes a first parameter generation unit configured to generate a first parameter that is a parameter for scaling a motion vector based on a perceptual difference between a projection result reproduction image which is an image that is obtained when a projection target onto which a projection image obtained based on the motion vector has been projected is photographed and a warped image which is an image generated by distorting an image obtained when the projection target is photographed by a perceptual amount of motion perceived when the projection result reproduction image is viewed, and a motion vector reduction unit configured to scale the motion vector using the first parameter.


Effects of the Invention

The present invention has an advantage of being able to automatically adjust a motion given to a projection target using a perceptual model.





BRIEF DESCRIPTION OF DRAWINGS


FIG. 1 is a functional block diagram of a projection image generation apparatus according to a first embodiment.



FIG. 2 is a diagram illustrating an example of a processing flow of the projection image generation apparatus according to the first embodiment.



FIG. 3 is a functional block diagram of a first parameter generation unit according to the first embodiment.



FIG. 4 is a diagram showing an example of a processing flow of the first parameter generation unit according to the first embodiment.



FIG. 5 is a functional block diagram of an unnaturalness estimation unit according to the first embodiment.



FIG. 6 is a diagram illustrating an example of a processing flow of the unnaturalness estimation unit according to the first embodiment.



FIG. 7 is a diagram showing an example of an algorithm for three-dimensionally smoothing parameters.



FIG. 8 is a functional block diagram of a second parameter generation unit according to the first embodiment.



FIG. 9 is a diagram illustrating an example of a processing flow of the second parameter generation unit according to the first embodiment.



FIG. 10 is a diagram for explaining a projection method of a projector.



FIG. 11 is a diagram showing an example of an algorithm for two-dimensionally smoothing parameters.



FIG. 12 is a functional block diagram of a projection image generation apparatus according to a third embodiment.



FIG. 13 is a diagram illustrating an example of a processing flow of the projection image generation apparatus according to the third embodiment.



FIG. 14 is a functional block diagram of a projection image generation apparatus according to a fourth embodiment.



FIG. 15 is a diagram illustrating an example of a processing flow of the projection image generation apparatus according to the fourth embodiment.



FIG. 16 is a functional block diagram of an unnaturalness estimation unit according to a fifth embodiment.



FIG. 17 is a diagram illustrating an example of a processing flow of the unnaturalness estimation unit according to the fifth embodiment.





DESCRIPTION OF EMBODIMENTS

Hereinafter, embodiments of the present invention will be described. In the drawings used in the following description, the same reference signs are given to components having the same function or the steps of performing the same processing and duplicate description is omitted. In the following description, a symbol “{circumflex over ( )}” or the like used in the text should originally be written directly above the character immediately before it, but is written immediately after the character due to a limitation of text notation. In equations, such symbols are written in their original positions. It is assumed that processing performed for each element of a vector or a matrix is applied to all elements of the vector or the matrix unless otherwise specified.


First Embodiment


FIG. 1 is a functional block diagram of a projection image generation apparatus according to a first embodiment and FIG. 2 illustrates a processing flow thereof.


The projection image generation apparatus includes a projection target photographing unit 110, a camera-projector pixel correspondence acquisition unit 120, an addition unit 125, a first parameter generation unit 130, a motion vector reduction unit 140, a non-rigid vector extraction unit 150, a second parameter generation unit 160, a motion vector combining unit 170, a projection image generation unit 180, and a projection unit 190.


An overview of the projection image generation apparatus will be described below. The projection image generation apparatus acquires an input image via a camera included in the projection target photographing unit 110. Apart from this, the projection image generation apparatus takes a motion vector v(x, y, t) given to the projection target as an input. However, if a projection image is generated using the input motion vector as it is, the projection result may have an appearance aberration (unnaturalness) because the magnitude of the vector is too large. In order to prevent this, the first parameter generation unit 130 generates a parameter (hereinafter also referred to as a first parameter) λ(x, y, t) for reducing the motion vector v(x, y, t) such that unnaturalness does not occur. However, simply reducing the motion vector v(x, y, t) often makes an impression of motion given by the projection result very weak. Therefore, the non-rigid vector extraction unit 150 extracts a non-rigid motion vector component Δvh(x, y, t) included in the motion vector v(x, y, t) and adds the extracted component to the motion vector to increase the magnitude of the motion vector. Here, to prevent the projection result from becoming unnatural again due to addition of the non-rigid motion vector component Δvh(x, y, t) to the reduced motion vector, the second parameter generation unit 160 generates a coefficient (hereinafter also referred to as a second parameter) λ2(x, y, t) for scaling the non-rigid motion vector component Δvh. The motion vector combining unit 170 calculates λ(x, y, t)v(x, y, t)+λ2(x, y, t)Δvh(x, y, t) as an optimal motion vector (hereinafter also referred to as a combined vector). The projection image generation unit 180 generates a projection image (a projection pattern) using the optimal motion vector. The projection unit 190 projects the generated projection image onto the projection target.


In the present embodiment, the projection target photographing unit 110 of the projection image generation apparatus includes a photographing device such as a camera and is configured to acquire an input image captured by the photographing device. However, the projection target photographing unit 110 may not include a photographing device and may be configured to receive an image captured by a photographing device which is a separate device as an input. Further, the projection unit 190 of the projection image generation apparatus includes a projection device such as a projector and is configured to project a generated projection image onto the projection target. However, the projection unit 190 may be configured to output the projection image to a projection device which is a separate device and this projection device may be configured to project the projection image onto the projection target. The present embodiment will be described assuming that the photographing device is a camera and the projection device is a projector.


The projection image generation apparatus is, for example, a special apparatus formed by loading a special program into a known or dedicated computer having a central processing unit (CPU), a main storage device (a random access memory (RAM)), and the like. The projection image generation apparatus executes, for example, each process under the control of the CPU. Data input to the projection image generation apparatus and data obtained through each process are stored, for example, in the main storage device, and the data stored in the main storage device is read out to the central processing unit as needed and used for other processing. Each processing unit of the projection image generation apparatus may be at least partially configured by hardware such as an integrated circuit. Each storage unit included in the projection image generation apparatus can be configured, for example, by a main storage device such as a random access memory (RAM) or by middleware such as a relational database or a key-value store. However, each storage unit does not necessarily have to be provided inside the projection image generation apparatus and may be configured by a hard disk, an optical disc, or an auxiliary storage device formed of a semiconductor memory device such as a flash memory and may be provided outside the projection image generation apparatus.


Each unit will be described below.


Projection Target Photographing Unit 110


The projection target photographing unit 110 takes images captured by a camera included in the projection target photographing unit 110 as inputs and uses the input images to acquire and output a minimum luminance image IMin(x, y) and a maximum luminance image IMax(x, y) which are used as inputs to the first parameter generation unit 130 and the projection image generation unit 180. Here (x, y) represents the coordinates of each pixel.


The minimum luminance image IMin(x, y) can be acquired from an image that the camera has obtained by photographing the projection target when the projector projects minimum luminance toward the projection target.


The maximum luminance image IMax(x, y) can be acquired from an image that the camera has obtained by photographing the projection target when the projector projects maximum luminance toward the projection target.


The projection target photographing unit 110 stores the minimum and maximum luminance images IMin(x, y) and IMax(x, y) in a storage unit (not illustrated). The images are acquired in grayscale or are acquired in color and converted to grayscale and used in grayscale.


The luminance of a location in a region photographed by the camera is measured using a luminance meter or the like. A ratio ρ obtained by dividing a luminance value at this location by a corresponding pixel value of the camera is stored in the storage unit. Unnaturalness estimation units 134 and 165 in the first and second parameter generation units 130 and 160 use the ratio ρ when converting a pixel value of an image captured by the camera into a luminance value. Thus, it is desirable that the camera be corrected such that the physical brightness (luminance) of the photographing target and the pixel value of the captured image have a linear relationship.


Camera-Projector Pixel Correspondence Acquisition Unit 120


The camera-projector pixel correspondence acquisition unit 120 acquires and outputs the correspondence between a camera coordinate system and a projector coordinate system. For example, the camera-projector pixel correspondence acquisition unit 120 acquires and outputs mapping to the projector coordinates (px, py) when viewed from the camera coordinates (cx, cy) (a C2P map) and mapping to the camera coordinates (cx, cy) when viewed from the projector coordinates (px, py) (a P2C map). Map acquisition methods include, for example, a method according to Reference 1 in which, while a projector projects a sequence of Gray code patterns, images that a camera has obtained by photographing the projection results are taken as inputs to decode the Gray code, thereby obtaining a C2P map.

  • (Reference 1) S. Inokuchi, K. Sato, and F. Matsuda, “Range-imaging for 3-D object recognition”, in Proceedings of International Conference on Pattern Recognition, 1984, pp. 806-808.


The P2C map is obtained by referring back to coordinates (cx, cy) in the C2P map to which the coordinates (px, py) of the projector coordinate system are mapped. A defect in the P2C map that occurs when corresponding coordinates (px, py) do not exist in the C2P map can be interpolated using a median value of the values of a range of surrounding 5 pixels×5 pixels or the like. The range of pixels used for interpolation is not limited to this and it is desirable that the range be adjusted according to the size of the defect. The P2C map is used in the first parameter generation unit 130, the second parameter generation unit 160, and the projection image generation unit 180. The C2P map is used in the first and second parameter generation units 130 and 160.
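

As a rough illustration of the inversion and defect interpolation described above, the following is a minimal Python sketch. The array layout (per-pixel coordinate maps with −1 marking undecoded or unreferenced pixels) and the function name are assumptions made for illustration, not the implementation of the apparatus.

```python
import numpy as np

def build_p2c_from_c2p(c2p_x, c2p_y, proj_w, proj_h, win=5):
    """Invert a C2P map (camera -> projector) into a P2C map (projector -> camera).

    c2p_x, c2p_y : arrays of shape (cam_h, cam_w) giving, for each camera pixel,
    the decoded projector coordinates (px, py); -1 marks undecoded pixels.
    Defects in the result are filled with the median of valid values in a
    win x win neighborhood.
    """
    p2c_x = np.full((proj_h, proj_w), -1.0)
    p2c_y = np.full((proj_h, proj_w), -1.0)

    cam_h, cam_w = c2p_x.shape
    for cy in range(cam_h):
        for cx in range(cam_w):
            px, py = int(c2p_x[cy, cx]), int(c2p_y[cy, cx])
            if 0 <= px < proj_w and 0 <= py < proj_h:
                p2c_x[py, px] = cx
                p2c_y[py, px] = cy

    # Fill holes (projector pixels never referenced by the C2P map) with the
    # median of surrounding valid entries.
    r = win // 2
    for py in range(proj_h):
        for px in range(proj_w):
            if p2c_x[py, px] < 0:
                patch_x = p2c_x[max(0, py - r):py + r + 1, max(0, px - r):px + r + 1]
                patch_y = p2c_y[max(0, py - r):py + r + 1, max(0, px - r):px + r + 1]
                valid = patch_x >= 0
                if np.any(valid):
                    p2c_x[py, px] = np.median(patch_x[valid])
                    p2c_y[py, px] = np.median(patch_y[valid])
    return p2c_x, p2c_y
```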


Addition Unit 125


The addition unit 125 takes the minimum and maximum luminance images IMin(x, y) and IMax(x, y) as inputs and obtains and outputs an intermediate luminance image I0(x, y).


The addition unit 125 calculates a linear weighted-sum of the minimum and maximum luminance images IMin(x, y) and IMax(x, y) based on the following equation to obtain the intermediate luminance image I0(x, y).

[Math. 1]
I0(x,y)=gIMax(x,y)+(1−g)IMin(x,y)  (1)


Here, g has a value in a range of [0, 1]. A final projection image is generated to give an impression of motion while preserving the appearance in color and shape of this intermediate luminance image I0(x, y). When g is 0, the final projection image gives an impression of motion while maintaining the appearance under ambient light excluding light from the projector. However, in this case, the contrast polarity of the pattern of the projection target can only shift in the direction of bright→dark. Similarly, when g is 1, the contrast polarity of the pattern of the projection target can only shift in the direction of dark→bright. In order for the contrast polarity to shift in both directions of bright→dark and dark→bright, g needs to be greater than 0 and less than 1. If the projected light is too strong relative to ambient light, the natural appearance of the projection target may be impaired. Thus, in many cases, a value of g of about 0.1 to 0.3 can be said to be appropriate. However, it may be better to set g larger than this if the ambient light is very bright.
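

A direct transcription of Equation (1) is shown below as a hedged sketch; the default g=0.2 simply reflects the 0.1 to 0.3 range suggested above, and the function name is illustrative.

```python
import numpy as np

def intermediate_luminance(i_min, i_max, g=0.2):
    """Equation (1): linear blend of the minimum and maximum luminance images."""
    assert 0.0 <= g <= 1.0
    return g * i_max + (1.0 - g) * i_min
```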


The intermediate luminance image I0(x, y) is output to the first parameter generation unit 130 and the projection image generation unit 180.


The above processes of the projection target photographing unit 110, the camera-projector pixel correspondence acquisition unit 120, and the addition unit 125 are performed before the motion vector v(x, y, t) is input to obtain the minimum luminance image IMin(x, y), the maximum luminance image IMax(x, y), the intermediate luminance image I0(x, y), the P2C map, the C2P map, and the ratio ρ.


First Parameter Generation Unit 130


The first parameter generation unit 130 takes the minimum luminance image IMin(x, y), the maximum luminance image IMax(x, y), the intermediate luminance image I0(x, y), and the motion vector v(x, y, t) as inputs, obtains a first parameter λ(x, y, t) using these inputs (S130), and outputs the first parameter λ(x, y, t). The first parameter is a parameter for scaling the magnitude of the motion vector v(x, y, t). Here, t represents the frame number. The motion vector is also called a distortion map. Here, it is assumed that the ratio ρ, the P2C map, and the C2P map are input to and set in the first parameter generation unit 130 in advance before the motion vector v(x, y, t) is input.


For example, the first parameter generation unit 130 generates the first parameter λ(x, y, t) based on a perceptual difference di(t) between a projection result reproduction image IPi(x, y, t) which will be described later and an ideal distorted image without unnaturalness IW(α)i(x, y, t) which will be described later.



FIG. 3 is a functional block diagram of the first parameter generation unit 130 and FIG. 4 illustrates an example of a processing flow thereof. The first parameter generation unit 130 includes a region division unit 131, a projection result generation unit 132, a multiplication unit 133, an unnaturalness estimation unit 134, a first parameter update unit 135, and a first parameter smoothing unit 136.


Processing is performed in the following order. First, processing is executed by the region division unit 131. Then, processing of a loop starting from the first parameter update unit 135 is performed in the order of the first parameter update unit 135→the multiplication unit 133→the projection result generation unit 132→the unnaturalness estimation unit 134→the first parameter update unit 135. When a certain condition is satisfied, the loop ends and the process shifts from the first parameter update unit 135 to the first parameter smoothing unit 136. The control of the loop is included in the processing of the first parameter update unit 135. Details will be described later.


Region Division Unit 131


The region division unit 131 takes the minimum luminance image IMin(x, y), the maximum luminance image IMax(x, y), the intermediate luminance image I0(x, y), and the motion vector v(x, y, t) as inputs and divides each into a predetermined number of divisions or into small regions having a predetermined size (for example, 64 pixels×64 pixels) (S131). The size of each small region is not limited to this, but needs to be large enough that a Laplacian pyramid which will be described later is generated within one region.


A region-divided minimum luminance image IMini(x, y) and a region-divided maximum luminance image IMaxi(x, y) are output to the projection result generation unit 132, a region-divided intermediate luminance image I0i(x, y) is output to the projection result generation unit 132 and the unnaturalness estimation unit 134, and a region-divided motion vector vi(x, y, t) is output to the multiplication unit 133.


A set of the region-divided minimum luminance image IMini(x, y), the region-divided maximum luminance image IMaxi(x, y), and the region-divided intermediate luminance image I0i(x, y) is stored in a storage unit (not illustrated). The region-divided minimum luminance image IMini(x, y), the region-divided maximum luminance image IMaxi(x, y), and the region-divided intermediate luminance image I0i(x, y) stored in the storage unit are read and used by the projection result generation unit 162 and the unnaturalness estimation unit 165 of the second parameter generation unit 160.


The subsequent processing of the first parameter generation unit 130, except for that of the first parameter smoothing unit 136, is performed independently for each frame t of each region i. One first parameter λi(t) is output for each frame t of each region i, and when first parameters λi(t) are obtained for all regions/frames, they are collectively input to the first parameter smoothing unit 136.


Multiplication Unit 133


The multiplication unit 133 takes the region-divided motion vector vi(x, y, t) and a current first parameter λi(t) of the region i as inputs. A value output from the first parameter update unit 135 is used as the current first parameter λi(t).


The multiplication unit 133 multiplies the region-divided motion vector vi(x, y, t) by the current first parameter λi(t) of the region i (S133) and outputs the product (vector λi(t)vi(x, y, t)) to the projection result generation unit 132 and the unnaturalness estimation unit 134.


Projection Result Generation Unit 132


The projection result generation unit 132 takes the region-divided minimum luminance image IMini(x, y), the region-divided maximum luminance image IMaxi(x, y), the region-divided intermediate luminance image I0i(x, y), the motion vector λi(t)vi(x, y, t) scaled by the current first parameter, the P2C map, and the C2P map as inputs and outputs a projection result reproduction image IPi(x, y, t) of the region i to which the current first parameter has been applied.


The projection result generation unit 132 generates the projection result reproduction image IPi(x, y, t) to which the current first parameter λi(t) has been applied as follows (S132). The projection result reproduction image is an image that is assumed to be obtained when the camera photographs the projection target onto which a projection image obtained based on the motion vector λi(t)vi(x, y, t) has been projected. The projection result generation unit 132 obtains the projection result reproduction image through simulation on a computer.


The projection result generation unit 132 distorts the intermediate luminance image I0i(x, y) based on the motion vector λi(t)vi(x, y, t) scaled by the current first parameter λi(t) to obtain a distorted image IWi(x, y, t). Any distortion method is applied. For example, the image is divided into grid cells having a size of 4 pixels×4 pixels, vertices are moved by motion vectors λi(t)vi(x, y, t) corresponding to the coordinates of the vertices, and regions surrounded by the vertices are filled with the original images of squares while the original images of squares are stretched (or shrunk) using a bilinear interpolation method or the like. The cell size of the grid is not limited to 4 pixels×4 pixels and it is desirable that the image be divided at a resolution with a cell size which is smaller than the region size in image division of the region division unit 131 and is sufficient to express the characteristics of the motion vector vi(x, y, t).


Next, the projection result generation unit 132 obtains an ideal projection image IMi(x, y, t) (a projection image without consideration of the physical restrictions of the projector used) for reproducing the distorted image IWi(x, y, t) using the following equation.






[Math. 2]
IMi(x,y,t)=(IWi(x,y,t)−IMini(x,y))/(IMaxi(x,y)−IMini(x,y))  (2)







The value of IMi(x, y, t) obtained using Equation (2) is limited to a physically projectable range [0, 1] of the projector.


In order to reproduce the resolution of the projector, the projection result generation unit 132 maps the image obtained in the previous step to the projector coordinate system based on the P2C map and then maps it to the camera coordinate system again based on the C2P map. This makes the projection image coarse in the camera coordinate system according to the resolution of the projector. For accurate reproduction, the resolution of the camera needs to be sufficiently higher than the resolution of the projector. The image obtained here is I{circumflex over ( )}Mi(x, y, t).


Finally, the projection result generation unit 132 obtains the projection result reproduction image IiP(x, y, t) based on the following equation and outputs it to the unnaturalness estimation unit 134.

IPi(x,y,t)=ÎMi(x,y,t)IMaxi(x,y)+(1−ÎMi(x,y,t))IMini(x,y)  [Math. 3]


The projection result reproduction image IPi(x, y, t) represents the value of light emitted from the projector and can be obtained by linearly interpolating a pixel value of the region-divided minimum luminance image IMini(x, y) and a pixel value of the region-divided maximum luminance image IMaxi(x, y) using a pixel value of the image I{circumflex over ( )}Mi(x, y, t) as a weight.
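

The following sketch summarizes how the projection result reproduction image might be simulated from the quantities above: Equation (2) with clipping to [0, 1], the camera→projector→camera round trip through the P2C and C2P maps, and the linear interpolation of [Math. 3]. The warped image IW is assumed to be computed elsewhere, the maps are assumed to be full-image integer coordinate arrays, and the region-divided indexing is omitted; this is an illustrative sketch, not the exact implementation.

```python
import numpy as np

def simulate_projection(i_w, i_min, i_max, p2c_x, p2c_y, c2p_x, c2p_y):
    """Simulate the projection result reproduction image I_P.

    i_w          : warped intermediate luminance image I_W (camera coordinates)
    i_min, i_max : minimum / maximum luminance images (camera coordinates)
    p2c_*, c2p_* : coordinate maps; p2c_* give camera coordinates for each
                   projector pixel, c2p_* give projector coordinates for each
                   camera pixel (rounded to integer indices here).
    """
    eps = 1e-6
    # Ideal projection image, Equation (2), clipped to the projector range [0, 1].
    i_m = np.clip((i_w - i_min) / np.maximum(i_max - i_min, eps), 0.0, 1.0)

    # Reproduce the projector resolution: camera -> projector -> camera round trip.
    px, py = p2c_x.astype(int), p2c_y.astype(int)
    cx, cy = c2p_x.astype(int), c2p_y.astype(int)
    proj_img = i_m[py, px]        # I_M sampled at projector pixels
    i_m_hat = proj_img[cy, cx]    # mapped back onto camera pixels (I^_M)

    # [Math. 3]: interpolate between I_Min and I_Max with I^_M as the weight.
    return i_m_hat * i_max + (1.0 - i_m_hat) * i_min
```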


Unnaturalness Estimation Unit 134


The unnaturalness estimation unit 134 takes the ratio ρ, the intermediate luminance image I0i(x, y), the projection result reproduction image IPi(x, y, t), and the motion vector λi(t)vi(x, y, t) multiplied by the first parameter λi(t) as inputs, obtains an unnaturalness estimate diMin(t) of the projection result using these inputs (S134), and outputs the unnaturalness estimate diMin(t). The processing is performed independently for each region i and each frame t.


First Example of Unnaturalness Estimation


For example, the unnaturalness estimation unit 134 estimates the unnaturalness of the projection based on the method proposed in Non Patent Literature 1. An overview of the process will be briefly described below.


The unnaturalness estimation unit 134 outputs a minimum value diMin(t) of the perceptual difference di(t) between the projection result reproduction image IPi(x, y, t) and the ideal distorted image without unnaturalness (also referred to as a warped image) IW(α)i(x, y, t) as an “unnaturalness of the projection result”. Obtaining the minimum value of the perceptual difference di(t) corresponds to obtaining a smallest value of the distance (a smallest distance) between a feature vector representing the perceptual representation of the warped image IW(α)i(x, y, t) and a feature vector representing the perceptual representation of the projection result reproduction image IPi(x, y, t) which are obtained by applying a perceptual model that will be described later. This “ideal distorted image without unnaturalness IW(α)i(x, y, t)” is generated by distorting the original intermediate luminance image I0i(x, y) by the “perceptual amount of motion αiλi(t)vi(x, y, t) perceived when the projection result reproduction image IPi(x, y, t) is viewed” on the computer. Here, αi is a coefficient (hereinafter referred to as a third parameter) for scaling the input motion vector to make it correspond to the perceptual amount of motion. The third parameter αi is estimated as a value which minimizes the perceptual difference di(t) between the projection result reproduction image IPi(x, y, t) and the warped image IW(α)i(x, y, t). That is, the unnaturalness estimation unit 134 simultaneously estimates the third parameter αi that determines the “perceptual amount of motion perceived when the projection result reproduction image IPi(x, y, t) is viewed” and the unnaturalness estimate diMin(t).



FIG. 5 is a functional block diagram of the unnaturalness estimation unit 134 and FIG. 6 illustrates an example of a processing flow thereof. As illustrated in FIG. 5, the unnaturalness estimation unit 134 includes a third parameter multiplication unit 134A, a warped image generation unit 134B, a third parameter update unit 134C, a perceptual model application unit 134D, and a perceptual difference calculation unit 134E. Processing is performed in the following order. Processing of a loop starting from the third parameter update unit 134C is performed in the order of the third parameter update unit 134C→the third parameter multiplication unit 134A→the warped image generation unit 134B→the perceptual model application unit 134D→the perceptual difference calculation unit 134E→the third parameter update unit 134C. When a certain condition is satisfied, the loop ends and the third parameter update unit 134C outputs the unnaturalness estimate diMin(t) to the first parameter update unit 135. The control of the loop is included in the processing of the third parameter update unit 134C. Hereinafter, the process will be described in order.


Third Parameter Multiplication Unit 134A


The third parameter multiplication unit 134A takes the motion vector λi(t)vi(x, y, t) multiplied by the first parameter λi(t) and the current third parameter αi as inputs. A value output from the third parameter update unit 134C is used as the current third parameter αi.


The third parameter multiplication unit 134A multiplies the motion vector λi(t)vi(x, y, t) multiplied by the first parameter λi(t) by the current third parameter αi (S134A) and outputs the product (vector αiλi(t)vi(x, y, t)) to the warped image generation unit 134B.


Warped Image Generation Unit 134B


The warped image generation unit 134B takes the intermediate luminance image I0i(x, y) and the motion vector αiλi(t)vi(x, y, t) scaled by the first and third parameters as inputs, distorts the intermediate luminance image I0i(x, y) based on the motion vector αiλi(t)vi(x, y, t) to obtain a warped image IW(α)i(x, y, t), and outputs the warped image IW(α)i(x, y, t) (S134B). Any distortion method is applied. For example, the image is divided into grid cells having a size of 4 pixels×4 pixels, vertices are moved by vectors αiλi(t)vi(x, y, t) corresponding to the coordinates of the vertices, and regions surrounded by the vertices are filled with the original images of squares while the original images of squares are stretched (or shrunk) using a bilinear interpolation method or the like. The cell size of the grid is not limited to 4 pixels×4 pixels and it is desirable that the image be divided at a resolution with a cell size which is smaller than the region size in image division of the region division unit 131 and is sufficient to express the characteristics of the motion vector vi(x, y, t).


Perceptual Model Application Unit 134D


The perceptual model application unit 134D takes the warped image IW(α)i(x, y, t), the projection result reproduction image IPi(x, y, t), and the ratio ρ as inputs and obtains and outputs a perceptual response r′(x, y, t) to the warped image IW(α)i(x, y, t) and a perceptual response r(x, y, t) to the projection result reproduction image IPi(x, y, t).


Because the perceptual model application unit 134D independently performs the same processing on the warped image IW(α)i(x, y, t) and the projection result reproduction image IPi(x, y, t), each of the input images (the warped image IW(α)i(x, y, t) and the projection result reproduction image IPi(x, y, t)) will be hereinafter referred to as I(x, y) (where the indices i and t indicating the region and the frame are omitted for the sake of simplicity). The perceptual model application unit 134D applies the perceptual model to the input image to obtain the perceptual response (S134D). In the present embodiment, a model that models up to the primary visual cortex corresponding to an initial stage of the human visual system is adopted as a perceptual model. This model that models up to the primary visual cortex takes an image as an input and outputs a response to the input image at spatial frequency components and orientation components of each pixel (region) of the input image (a result of simulating the response of nerve cells). This model can also be said to be a model for obtaining a feature vector representing the perceptual representation of the warped image IW(α)i(x, y, t) and a feature vector representing the perceptual representation of the projection result reproduction image IPi(x, y, t). First, this model uses a linear filter to decompose the input image into a plurality of spatial frequency bands and orientations. Next, the model non-linearly corrects (controls the gains of) values, corresponding to each pixel, of the components obtained through decomposition and outputs the corrected values as the response described above. However, the present embodiment, for example, omits the process of analyzing the orientation components of the image in consideration of calculation speed. The model of the perceptual response is not limited to the implementation described here, and a model including the analysis of orientation components or a model that reproduces a response of the higher-order visual cortex may be used.


(Processing 1) First, the pixel value of the input image I(x, y) is multiplied by the ratio ρ acquired by the projection target photographing unit 110 to convert the pixel value into a luminance unit. Here, the input image converted into the luminance unit is converted into a just noticeable difference (JND) scale image L(x, y) using a method described in Reference 2.

  • (Reference 2) R. Mantiuk, S. J. Daly, K. Myszkowski, and H.-P. Seidel, “Predicting visible differences in high dynamic range images model and its calibration”. In Proceedings of SPIE, vol. 5666, pp. 204-214, 2005.


In the JND scale, the luminance is mapped such that a luminance change corresponding to a threshold above which changes are perceivable is defined as 1. That is, when Ψ(L) is defined as a function that converts the JND scale value L into luminance, the following equation is obtained.






[Math. 4]
dΨ(L)/dL=tvi(Ψ(L))  (3)







Here, tvi is a function that gives a threshold of the luminance change for adaptive luminance. The present embodiment uses the following equation for tvi, as learned from Reference 2.






[Math. 5]
tvi(Y)=Y/(π1((π2/Y)^π3+1)^(−π4))








Here, (π1, π2, π3, π4)=(30.162, 4.0627, 1.66596, 0.2712) and Y is the adaptive luminance. In practice, it is necessary to obtain and use the inverse function of Ψ, that is, the function Ψ−1 that converts luminance into a JND scale value. Because Ψ is a monotonically increasing function, Ψ−1 can be uniquely obtained. In the present embodiment, Ψ is obtained as a numerical solution of Equation (3) and stored in a lookup table, and a JND scale value is obtained from luminance by referring to the lookup table. The lookup table stores values that are discrete to some extent in order to save storage space, and when intermediate values between them are obtained, sufficient results can be obtained using linear interpolation.
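

As an illustration, Ψ can be tabulated by numerically integrating Equation (3) with a simple forward Euler step and then inverted by linear interpolation, as described above. The sketch below assumes the tvi form reconstructed in [Math. 5]; the luminance range, step size, and function names are illustrative choices. A camera pixel value would first be multiplied by the ratio ρ to convert it into luminance before being passed to luminance_to_jnd.

```python
import numpy as np

PI = (30.162, 4.0627, 1.66596, 0.2712)   # (pi1, pi2, pi3, pi4)

def tvi(y):
    """Threshold-versus-intensity function, as reconstructed in [Math. 5]."""
    p1, p2, p3, p4 = PI
    return y / (p1 * ((p2 / y) ** p3 + 1.0) ** (-p4))

def build_jnd_lut(l_min=1e-2, l_max=1e4, step=0.1):
    """Tabulate Psi by integrating dPsi/dL = tvi(Psi(L)) (Equation (3)) with
    forward Euler; returns luminance values and the corresponding JND values."""
    luminances = [l_min]
    while luminances[-1] < l_max:
        y = luminances[-1]
        luminances.append(y + step * tvi(y))   # Psi(L + step) ~ Psi(L) + step * tvi
    luminances = np.asarray(luminances)
    jnd_values = np.arange(len(luminances)) * step
    return luminances, jnd_values

LUM_TABLE, JND_TABLE = build_jnd_lut()

def luminance_to_jnd(luminance):
    """Psi^-1: look up the JND scale value for a luminance (linear interpolation)."""
    return np.interp(luminance, LUM_TABLE, JND_TABLE)
```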


(Processing 2) Next, a Laplacian pyramid is generated from the JND scale image L(x, y) and a plurality of bandpass images b0(x, y), b1(x, y), b2(x, y), . . . , and bN-1(x, y) are obtained. In the present embodiment, the number of bandpass images N=5. However, the value of N is not limited to this and it is considered better to increase N as the projection target is photographed at a higher resolution. Normally, when a Laplacian pyramid is generated, the resolution decreases toward a bandpass image in a lower spatial frequency band due to downsampling. However, in the present embodiment, downsampling is not performed in order to improve the accuracy.
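

One way to realize a full-resolution decomposition of this kind is to difference Gaussian blurs of increasing scale without downsampling, as in the hedged sketch below; the base sigma and the factor-of-two scale progression are assumptions, not values given in the text.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def bandpass_decomposition(jnd_image, n_levels=5, sigma0=1.0):
    """Full-resolution Laplacian-pyramid-like decomposition: difference Gaussian
    blurs of increasing scale, keeping every level at the original resolution."""
    image = jnd_image.astype(np.float64)
    levels, previous = [], image
    for j in range(n_levels - 1):
        blurred = gaussian_filter(image, sigma=sigma0 * (2 ** j))
        levels.append(previous - blurred)   # bandpass image b_j (finest first)
        previous = blurred
    levels.append(previous)                 # residual low-pass as the last band
    return levels
```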


(Processing 3) Next, in order to reproduce the sensitivity of the visual system to each spatial frequency band, the bandpass images bj(x, y) (j=0, 1, 2, . . . , N−1) of the Laplacian pyramid are weighted to obtain weighted bandpass images as follows.

[Math. 6]
cj(x,y)=wjbj(x,y), j=0, 1, 2, . . . , N−1  (4)


Here, the weight wj is represented by the following function.






[Math. 7]
wj=exp{−((N−1−j)/s)^θ}  (5)







Here, s and θ are constants that determine the shape of the weighting function. In the present embodiment, the constants were determined such that (s, θ)=(0.75, 3.0) through fitting to experimental data. However, the weight function is not limited to this and the parameters may be reset according to observation conditions or the like.


(Processing 4) Finally, in order to reproduce contrast gain adjustment of the visual system, the weighted bandpass image cj(x, y) is converted into a perceptual response rj(x, y) using the following equation.






[Math. 8]
rj(x,y)=sign(cj(x,y))·|cj(x,y)|^p/(cj(x,y)^2+σ^2)  (6)







Here, p and σ are constants that determine the shape of the contrast gain adjustment function. In the present embodiment, the constants were determined such that (p, σ)=(2.38, 0.156) through fitting to experimental data. sign(z) is a function representing the sign of z, which is −1 if z<0 and +1 if z>0. The contrast gain adjustment function is not limited to this, and any function may be used as long as it can approximate the response of the visual system.


The above processing is performed for each of the warped image IW(α)i(x, y, t) and the projection result reproduction image IPi(x, y, t) to obtain a perceptual response r′ji(x, y, t) to the warped image IW(α)i(x, y, t) and a perceptual response rji(x, y, t) to the projection result reproduction image IPi(x, y, t) and the obtained perceptual responses are output to the perceptual difference calculation unit 134E. A vector having the perceptual response r′ji(x, y, t) as elements is the feature vector representing the perceptual representation of the warped image IW(α)i(x, y, t) described above and a vector having the perceptual response rji(x, y, t) as elements is the feature vector representing the perceptual representation of the projection result reproduction image IPi(x, y, t) described above.
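

Putting Equations (4) to (6) together, the per-band perceptual response might be computed as in the following sketch. The denominator of the gain adjustment follows the reconstruction of Equation (6) above, so it should be treated as an assumption rather than the definitive transducer form.

```python
import numpy as np

def perceptual_responses(bandpass_images, s=0.75, theta=3.0, p=2.38, sigma=0.156):
    """Apply the band weighting of Equation (5) and the contrast gain
    adjustment of Equation (6) to each bandpass image b_j."""
    n = len(bandpass_images)
    responses = []
    for j, b in enumerate(bandpass_images):
        w = np.exp(-(((n - 1 - j) / s) ** theta))                 # Equation (5)
        c = w * b                                                 # Equation (4)
        r = np.sign(c) * np.abs(c) ** p / (c ** 2 + sigma ** 2)   # Equation (6)
        responses.append(r)
    return responses
```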


Perceptual Difference Calculation Unit 134E


The perceptual difference calculation unit 134E takes the perceptual response r′ji(x, y, t) to the warped image and the perceptual response rji(x, y, t) to the projection result reproduction image as inputs and obtains and outputs a distance di(t) between the input perceptual responses.


The perceptual difference calculation unit 134E calculates the distance di(t) between the perceptual responses using the following equation (S134E).






[Math. 9]
di(t)=ln[Σx,yΣj{r′ji(x,y,t)−rji(x,y,t)}^2/(NxNy)]  (7)







Here, Nx and Ny represent the horizontal and vertical sizes of the perceptual response rji(x, y, t) or r′ji(x, y, t), respectively. The perceptual responses rji(x, y, t) and r′ji(x, y, t) have the same size. In Equation (7), ln is a function that calculates the natural logarithm. The distance calculation method is not limited to this, and for example, a normal Euclidean distance or a Manhattan distance may be used. In order to tolerate some errors in the estimation of perceptual motion, the perceptual responses rji(x, y, t) and r′ji(x, y, t) may be spatially pooled into local regions of px pixels×py pixels such that their size is reduced to 1/px and 1/py in the horizontal and vertical directions and then may be substituted into Equation (7). In the present embodiment, px=py=2.
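

A sketch of the distance computation of Equation (7), including the optional px×py spatial pooling, is given below; average pooling and the cropping of remainder rows and columns are illustrative choices.

```python
import numpy as np

def pool(response, px=2, py=2):
    """Average-pool a response map into px x py blocks; remainder rows/columns
    are cropped."""
    h, w = response.shape
    h2, w2 = (h // py) * py, (w // px) * px
    return response[:h2, :w2].reshape(h2 // py, py, w2 // px, px).mean(axis=(1, 3))

def perceptual_difference(responses_warped, responses_projected, px=2, py=2):
    """Equation (7): log of the mean squared difference between the (pooled)
    perceptual responses of the warped image and the projection result."""
    total, n_pixels = 0.0, None
    for r_w, r_p in zip(responses_warped, responses_projected):
        pw, pp = pool(r_w, px, py), pool(r_p, px, py)
        if n_pixels is None:
            n_pixels = pw.size          # Nx * Ny after pooling
        total += np.sum((pw - pp) ** 2)
    return np.log(total / n_pixels)
```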


Third Parameter Update Unit 134C


The third parameter update unit 134C controls a process of searching for the third parameter. For example, the third parameter update unit 134C searches for a third parameter which minimizes the perceptual difference di(t) obtained by the perceptual difference calculation unit 134E. In other words, the third parameter update unit 134C estimates the third parameter as a value (a coefficient for scaling the motion vector) which minimizes the distance between a feature vector representing the perceptual representation of the warped image IW(α)i(x, y, t) (a vector having the perceptual response r′ji(x, y, t) as elements) and a feature vector representing the perceptual representation of the projection result reproduction image IPi(x, y, t) (a vector having the perceptual response rji(x, y, t) as elements). Here, an example in which a golden section search method is used to search for the third parameter will be described, although another search algorithm, for example, a ternary search method, may be used.


The third parameter update unit 134C takes a perceptual difference di(t) obtained with a third parameter of the previous cycle as an input and outputs the third parameter αi of the next cycle. However, in the first cycle the third parameter update unit 134C performs only the output because there is no input. In the final cycle, the third parameter update unit 134C outputs the minimum perceptual difference di(t) as an unnaturalness estimate diMin(t).


The third parameter update unit 134C updates the third parameter such that the perceptual difference di(t) becomes smaller (S134C).


The third parameter update unit 134C uses, for example, the golden section search method. First, the third parameter update unit 134C defines L(k) and H(k) as lower and upper limits of a search section in a kth cycle. In the golden section search method, the third parameter update unit 134C divides the search section at two points into three sections, compares the outputs (the perceptual differences di(t) in this example) of the function when the values of the division points (values of the third parameter in this example) are taken as inputs, and shortens the search section. Then, the third parameter update unit 134C defines the smaller of the two division points in the kth cycle as A(k), the larger as B(k), the perceptual difference of A(k) as dA(k), and the perceptual difference of B(k) as dB(k). Also, ϕ is defined such that ϕ=(1+√5)/2.


(When k=0)


The third parameter update unit 134C sets (L(0), H(0)) and (A(0), B(0)) such that (L(0), H(0))=(0, 1) and (A(0), B(0))=(1/(1+ϕ), ϕ/(1+ϕ)) and outputs the third parameter of the first cycle αi(0)=A(0) to the third parameter multiplication unit 134A. The values of L(0), H(0), A(0), and B(0) are stored in the storage unit.


(When k=1)


The third parameter update unit 134C sets (L(1), H(1), A(1), B(1)) such that (L(1), H(1), A(1), B(1))=(L(0), H(0), A(0), B(0)) and outputs the third parameter of the next cycle αi(1)=B(1) to the third parameter multiplication unit 134A. Also, dA(1)=di(t) is stored in the storage unit.


(When k=2)


The input perceptual difference is stored in the storage unit as dB(1)=di(t).


(2-i) When dA(1)<dB(1)


The third parameter update unit 134C sets a new search section as (L(2), H(2))=(L(1), B(1)) and sets new division points as (A(2), B(2))=((ϕL(2)+H(2))/(1+ϕ), A(1)). Also, dB(2)=dA(1) is stored in the storage unit. The third parameter of the next cycle is set as αi(2)=A(2), stored in the storage unit, and output to the third parameter multiplication unit 134A.


(2-ii) When dA(1)>dB(1)


The third parameter update unit 134C sets a new search section as (L(2), H(2))=(A(1), H(1)) and sets new division points as (A(2), B(2))=(B(1), (L(2)+ϕH(2))/(1+ϕ)). Also, dA(2)=dB(1) is stored in the storage unit. The third parameter of the next cycle is stored in the storage unit as αi(2)=B(2) and output to the third parameter multiplication unit 134A.


(When k≥3)


When αi(k−1)=A(k−1), the third parameter update unit 134C stores the input perceptual difference in the storage unit as dA(k−1)=di(t). When αi(k−1)=B(k−1), the third parameter update unit 134C stores the input perceptual difference in the storage unit as dB(k−1)=di(t). Similar to when k=2, the subsequent processing is as follows.


(3-i) When dA(k−1)<dB(k−1)


The third parameter update unit 134C sets a new search section as (L(k), H(k))=(L(k−1), B(k−1)) and sets new division points as (A(k), B(k))=((ϕL(k)+H(k))/(1+ϕ), A(k−1)). Also, dB(k)=dA(k−1) is stored in the storage unit. The third parameter of the next cycle is stored in the storage unit as αi(k)=A(k) and output to the third parameter multiplication unit 134A.


(3-ii) When dA(k−1)>dB(k−1)


The third parameter update unit 134C sets a new search section as (L(k), H(k))=(A(k−1), H(k−1)) and sets new division points as (A(k). B(k))=(B(k−1), (L(k)+ϕH(k))/(1+ϕ)). Also, dA(k)=dB(k−1) is stored in the storage unit. The third parameter of the next cycle is stored in the storage unit as αi(k)=B(k) and output to the third parameter multiplication unit 134A.


In any of the above (3-i) and (3-ii), the search ends when the width H(k)−L(k) of the search section becomes less than a constant value τα (S134C-2), and diMin(t) is set as diMin(t)=dA(k−1) if dA(k−1)<dB(k−1) and is set as diMin(t)=dB(k−1) if dA(k−1)>dB(k−1) and output from the unnaturalness estimation unit 134. This is output to the first parameter update unit 135 when the unnaturalness estimation unit is used in the first parameter generation unit 130 (as the unnaturalness estimation unit 134) and output to the second parameter update unit 166 when the unnaturalness estimation unit is used in the second parameter generation unit 160 (as the unnaturalness estimation unit 165). As τα decreases, the estimation accuracy increases, but the calculation cost also increases. In the present embodiment, τα=0.05.
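

The golden section search described above can be condensed into the following sketch, where the warp, perceptual model application, and distance computation for one region/frame are wrapped in a single callable; this is a simplified illustration rather than the exact cycle-by-cycle bookkeeping of the text.

```python
import numpy as np

PHI = (1.0 + np.sqrt(5.0)) / 2.0

def golden_section_search(perceptual_diff_of_alpha, tau_alpha=0.05):
    """Find alpha in [0, 1] minimizing the perceptual difference d_i(t).

    perceptual_diff_of_alpha : callable alpha -> d_i(t); it wraps the warp,
    perceptual model, and distance computation for one region/frame."""
    lo, hi = 0.0, 1.0
    a = (PHI * lo + hi) / (1.0 + PHI)     # smaller division point A
    b = (lo + PHI * hi) / (1.0 + PHI)     # larger division point B
    d_a, d_b = perceptual_diff_of_alpha(a), perceptual_diff_of_alpha(b)
    while hi - lo >= tau_alpha:
        if d_a < d_b:                     # keep the left part of the section
            hi, b, d_b = b, a, d_a
            a = (PHI * lo + hi) / (1.0 + PHI)
            d_a = perceptual_diff_of_alpha(a)
        else:                             # keep the right part of the section
            lo, a, d_a = a, b, d_b
            b = (lo + PHI * hi) / (1.0 + PHI)
            d_b = perceptual_diff_of_alpha(b)
    return min(d_a, d_b)                  # unnaturalness estimate d_i^Min(t)
```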


Second Example of Unnaturalness Estimation

A model (which is also called a perceptual model in the second example of unnaturalness estimation) that takes the warped image IW(α)i(x, y, t) and the projection result reproduction image IPi(x, y, t) as inputs and directly outputs the perceptual difference may also be used to obtain the perceptual difference di(t). That is, the perceptual difference di(t) is obtained directly from the warped image IW(α)i(x, y, t) and the projection result reproduction image IPi(x, y, t), rather than obtaining a perceptual response r′ji(x, y, t) to the warped image and a perceptual response rji(x, y, t) to the projection result reproduction image to obtain the distance di(t) between them as in the first example of unnaturalness estimation.


In this example, the unnaturalness estimation unit 134 does not include the perceptual difference calculation unit 134E, and the perceptual model application unit 134D takes the warped image IW(α)i(x, y, t) and the projection result reproduction image IPi(x, y, t) as inputs, applies values of these images to the perceptual model to obtain the perceptual difference di(t) (S134D, S134E), and outputs the obtained perceptual difference di(t). The processing of the other parts of the unnaturalness estimation unit 134 is similar to that of the first example of unnaturalness estimation.


In the processing of the third parameter update unit 134C in the estimation of this example, a feature vector representing the perceptual representation of the warped image IW(α)i(x, y, t) and a feature vector representing the perceptual representation of the projection result reproduction image IPi(x, y, t), or the distance between them, is not obtained. Instead, the third parameter update unit 134C estimates, as a result of its processing, the third parameter as a value (a coefficient for scaling the motion vector) which minimizes the distance between a feature vector representing the perceptual representation of the warped image IW(α)i(x, y, t) and a feature vector representing the perceptual representation of the projection result reproduction image IPi(x, y, t).


Similarly, in the estimation of this example, as a result of the processing of the unnaturalness estimation unit 134, a smallest value (a smallest distance) of the distance between a feature vector representing the perceptual representation of the warped image IW(α)i(x, y, t) and a feature vector representing the perceptual representation of the projection result reproduction image IPi(x, y, t) is obtained as a minimum value of the perceptual difference di(t).


First Parameter Update Unit 135


The first parameter update unit 135 controls a process of searching for the first parameter. For example, the first parameter update unit 135 searches for a first parameter at which the unnaturalness estimate diMin(t) obtained by the unnaturalness estimation unit 134 is closest to a predetermined threshold τ. The value of τ may be set to a fixed threshold in advance or a user-adjustable interface may be provided. In the present embodiment, it is empirically determined that τ=−2.28. For example, a binary search method is used to search for the first parameter.


The first parameter update unit 135 takes an unnaturalness estimate diMin(t) obtained with a first parameter of the previous cycle as an input and outputs the first parameter λi(t) of the next cycle. However, in the first cycle, the first parameter update unit 135 performs only the output because there is no input.


The first parameter update unit 135 updates the first parameter λi(t) such that the unnaturalness estimate diMin(t) is closest to the predetermined threshold τ (S135).


First, in the first cycle, the first parameter update unit 135 stores λi(t)=0.5 and a step size of stp=0.25 in the storage unit and outputs λi(t) to the multiplication unit 133.


In the subsequent cycles, the first parameter update unit 135 updates λi(t) as follows based on a result of comparison between the input unnaturalness estimate diMin(t) and the threshold τ.


When diMin(t)<τ, the first parameter update unit 135 updates the first parameter such that λi(t)=λi(t)+stp and stores the updated first parameter in the storage unit.


When diMin(t)>τ, the first parameter update unit 135 updates the first parameter such that λi(t)=λi(t)−stp and stores the updated first parameter in the storage unit.


When a predetermined condition is satisfied (yes in S135A), for example, when diMin(t)≤τ or the number of cycles reaches NS, the first parameter update unit 135 ends the search and outputs λi(t) to the first parameter smoothing unit 136. In other cases (no in S135A), the first parameter update unit 135 updates the step size such that stp=stp/2, stores it in the storage unit, and outputs λi(t) to the multiplication unit 133.
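

A simplified sketch of this search is shown below; it runs a fixed number of cycles NS and halves the step each time, whereas the text also allows an early stop when diMin(t)≤τ. The callable name and the default cycle count are illustrative assumptions.

```python
def search_first_parameter(unnaturalness_of_lambda, tau=-2.28, n_cycles=6):
    """Search the first parameter lambda_i(t) for one region/frame.

    unnaturalness_of_lambda : callable lam -> d_i^Min(t) (multiplication,
    projection result generation, and unnaturalness estimation combined)."""
    lam, stp = 0.5, 0.25
    for _ in range(n_cycles):
        d_min = unnaturalness_of_lambda(lam)
        lam = lam + stp if d_min < tau else lam - stp   # move toward the threshold
        stp /= 2.0
    return lam
```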


First Parameter Smoothing Unit 136


The first parameter smoothing unit 136 takes the first parameter λi(t) obtained from each region/frame as an input, smooths the input first parameter λi(t), and outputs the smoothed first parameter λ(x, y, t) of each pixel (S136). For example, the first parameter smoothing unit 136 spatially and temporally smooths the first parameter λi(t) obtained from each region/frame using the following:


(i) First parameters obtained from regions spatially adjacent to the region i and the frame t.


(ii) First parameters obtained from the region i and frames temporally adjacent to the frame t.


(iii) First parameters obtained from regions spatially adjacent to the region i and frames temporally adjacent to the frame t.


The first parameter of each region/frame will be referred to as λ(m, n, t) for the sake of explanation. Here, m represents the horizontal position of the region, n represents the vertical position of the region, and t represents the time frame to which the region belongs. First, smoothing is performed such that extreme changes in value do not occur between adjacent first parameters λ(m, n, t). At this time, smoothing is performed by replacing λ(m, n, t) with λ′(m, n, t) such that the following two constraints are satisfied.


Constraint 1: λ′(m, n, t)≤λ(m, n, t) must be satisfied for all m, n, and t. This prevents the unnaturalness from exceeding the unnaturalness threshold as a result of the smoothing process.


Constraint 2: The following must be satisfied for all m, n, and t.

|λ′(m,n,t)−λ′(m′,n′,t′)|≤√((|m−m′|^2+|n−n′|^2)ss^2+|t−t′|^2st^2)  [Math. 10]


Here, (m′, n′, t′) represents a set of regions around (m, n, t), where m′∈{m−1, m, m+1}, n′∈{n−1, n, n+1}, and t′∈{t−1, t, t+1}. In addition, ss and st are permissible values for the magnitude of the gradient between adjacent regions. These values need to be set sufficiently small because it is required that the first parameter not qualitatively change the input original motion vector (such that a rigid motion remains rigid). In the present embodiment, (ss, st)=(0.06, 0.03). It is desirable that these values be adjusted according to the region size and the frame rate for projection. In other words, ss may increase as the region size increases and st may increase as the frame rate decreases. In the present embodiment, it is assumed that the region size is 64 pixels×64 pixels and the frame rate is 60 FPS.


The present embodiment uses the method described in Reference 3 as an algorithm for updating λ(m, n, t) such that the above constraints are satisfied.

  • (Reference 3) A Majumder and R. Stevens, “Perceptual photometric seamlessness in projection-based tiled displays”, ACM Transactions on Graphics, 24(1): 118-139, 2005.


However, the present embodiment extends the algorithm to perform three-dimensional smoothing, while Reference 3 only performs two-dimensional smoothing of parameters. FIG. 7 shows an example of a specific processing algorithm. A basic processing flow involves scanning the values of λ(m, n, t) of regions in order and updating the values of λ such that the above constraints 1 and 2 are satisfied. The update method follows the following procedure.


1. Differences between the current region and the 11 regions in the scanning and opposite directions among the 26 regions spatiotemporally adjacent to the current region (value of current region−values of adjacent regions) are calculated.


2. If the difference calculated in the above step 1 is larger than the restricted value on the right side of the constraint 2, the value of the current region is reduced until the difference becomes equal to the value on the right side.


The procedure of steps 1 and 2 above is performed for a set of all possible scanning directions. Specifically, when a set of scanning directions on the horizontal axis, the vertical axis, and the time axis is expressed by (dm, dn, dt) and the two directions on each axis are expressed by {−1, 1}, a set of eight scanning directions (dm, dn, dt)=[(−1,−1,−1),(1,−1,−1),(−1,1,−1),(1,1,−1),(−1,−1,1),(1,−1,1),(−1,1,1),(1,1,1)] is scanned.
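

For reference, the scanning-based update can be sketched in Python as follows. This is a simplified illustration rather than the algorithm of FIG. 7 itself: it lowers each region value against all available spatiotemporal neighbors instead of only the neighbors on the scanning side, and the array layout, parameter defaults, and function name are assumptions.

import numpy as np
from itertools import product

def smooth_lambda_3d(lam, s_s=0.06, s_t=0.03):
    """Simplified sketch of the constraint-based smoothing over regions.

    lam: array of shape (M, N, T) holding lambda(m, n, t) per region/frame.
    Values are only ever lowered (constraint 1), and the gradient between
    spatiotemporally adjacent regions is clipped to the bound of constraint 2.
    """
    lam = lam.astype(float).copy()
    M, N, T = lam.shape
    # Scan in the 8 diagonal directions (dm, dn, dt) in {-1, 1}^3.
    for dm, dn, dt in product((-1, 1), repeat=3):
        ms = range(M) if dm == 1 else range(M - 1, -1, -1)
        ns = range(N) if dn == 1 else range(N - 1, -1, -1)
        ts = range(T) if dt == 1 else range(T - 1, -1, -1)
        for m in ms:
            for n in ns:
                for t in ts:
                    # Lower the current value wherever the gradient to an
                    # adjacent region exceeds the bound of constraint 2.
                    for om, on, ot in product((-1, 0, 1), repeat=3):
                        if (om, on, ot) == (0, 0, 0):
                            continue
                        mm, nn, tt = m + om, n + on, t + ot
                        if not (0 <= mm < M and 0 <= nn < N and 0 <= tt < T):
                            continue
                        bound = np.sqrt((om**2 + on**2) * s_s**2 + ot**2 * s_t**2)
                        lam[m, n, t] = min(lam[m, n, t], lam[mm, nn, tt] + bound)
    return lam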


After smoothing is completed for each region (m, n, t), a process of spreading the value over pixels (x, y, t) is performed. In the present embodiment, a process of expanding the first parameter λ′(m, n, t) of each region through bilinear interpolation is performed for each frame t to obtain the first parameters λ(x, y, t) of pixels. The interpolation method used for expansion is not limited to this, and for example, bicubic interpolation or the like may be used. The obtained λ(x, y, t) is output to the differential motion vector calculation unit, the second parameter generation unit 160, and the motion vector combining unit 170.
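

The expansion from region values to pixel values can be sketched as follows; the use of scipy.ndimage.zoom (order 1 corresponds to bilinear interpolation), the region size, and the function name are assumptions made only for illustration.

import numpy as np
from scipy.ndimage import zoom

def expand_to_pixels(lam_regions, region_size=64):
    """Expand per-region parameters lambda'(m, n) of one frame to per-pixel
    values via bilinear (order-1) interpolation."""
    return zoom(lam_regions.astype(float), zoom=region_size, order=1)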


Motion Vector Reduction Unit 140


The motion vector reduction unit 140 takes the first parameter λ(x, y, t) and the motion vector v(x, y, t) as inputs, multiplies the motion vector v(x, y, t) by the first parameter λ(x, y, t) to obtain a reduced motion vector vs(x, y, t)=λ(x, y, t)v(x, y, t) (S140), and outputs the reduced motion vector vs(x, y, t) to the non-rigid vector extraction unit 150, the second parameter generation unit 160, and the motion vector combining unit 170.


Non-Rigid Vector Extraction Unit 150


The non-rigid vector extraction unit 150 takes the motion vector v(x, y, t) and the reduced motion vector vs(x, y, t) as inputs, extracts a non-rigid motion vector component Δvh(x, y, t) included in the difference between the motion vector v(x, y, t) and the reduced motion vector vs(x, y, t) (S150), and outputs the extracted non-rigid motion vector component Δvh(x, y, t) to the second parameter generation unit 160 and the motion vector combining unit 170. For example, the non-rigid vector extraction unit 150 includes a differential motion vector calculation unit and a filtering unit (not illustrated). The non-rigid motion vector component Δvh(x, y, t) corresponds to a high-pass component (a high spatial frequency component) of the motion vector v(x, y, t) and the filtering unit functions as a high-pass filter.


Differential Motion Vector Calculation Unit


The differential motion vector calculation unit takes the motion vector v(x, y, t) and the reduced motion vector vs(x, y, t) as inputs, calculates a motion vector difference Δv(x, y, t)=v(x, y, t)−vs(x, y, t), and outputs it to the filtering unit.


Filtering Unit


The filtering unit takes the motion vector difference Δv(x, y, t) as an input and obtains and outputs a non-rigid motion vector component Δvh(x, y, t) of the motion vector difference.


The filtering unit convolves a Gaussian filter with the difference Δv(x, y, t) to obtain a low spatial frequency component Δvl(x, y, t) of the difference Δv(x, y, t). For example, the standard deviation of the Gaussian filter kernel is 8 pixels. The standard deviation is not limited to this and any value can be set. However, if the standard deviation is too small, almost no non-rigid components remain to be extracted in the next step, and if it is too large, non-rigid components are likely to include a large amount of rigid motion components.


The filtering unit subtracts the low spatial frequency component Δvl(x, y, t) from the original difference Δv(x, y, t) to obtain the non-rigid motion vector component Δvh(x, y, t) which is a high spatial frequency component. That is, Δvh(x, y, t)=Δv(x, y, t)−Δvl(x, y, t).
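

The two filtering steps can be sketched as follows for one component of the motion vector difference; the use of scipy.ndimage.gaussian_filter and the function name are illustrative assumptions.

import numpy as np
from scipy.ndimage import gaussian_filter

def extract_nonrigid(dv, sigma=8.0):
    """Sketch of the filtering unit: dv holds one component (x or y) of the
    motion vector difference for one frame. The Gaussian low-pass result
    (the low spatial frequency component) is subtracted from dv to leave
    the high-spatial-frequency, i.e. non-rigid, component."""
    dv_low = gaussian_filter(dv.astype(float), sigma=sigma)
    return dv - dv_low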


Second Parameter Generation Unit 160


The second parameter generation unit 160 takes the reduced motion vector vs(x, y, t), the non-rigid motion vector component Δvh(x, y, t), the region-divided minimum luminance image IMini(x, y), the region-divided maximum luminance image IMaxi(x, y), the region-divided intermediate luminance image I0i(x, y), the ratio ρ, the P2C map, and the C2P map as inputs. The second parameter generation unit 160 uses the reduced motion vector vs(x, y, t) scaled by the first parameter output from the motion vector reduction unit 140 and the non-rigid motion vector component Δvh(x, y, t) output from the non-rigid vector extraction unit 150 to generate a second parameter λ2 (S160) and outputs the generated second parameter λ2. The second parameter λ2(x, y, t) is a parameter for scaling the non-rigid motion vector component Δvh(x, y, t) as in “vs(x, y, t)+λ2(x, y, t)Δvh(x, y, t)” when a motion lost due to reduction with the first parameter is compensated for with the non-rigid motion vector component.



FIG. 8 is a functional block diagram of the second parameter generation unit 160 and FIG. 9 illustrates an example of a processing flow thereof.


As illustrated in FIG. 8, the second parameter generation unit 160 includes a second region division unit 161, a projection result generation unit 162, a second multiplication unit 163, a motion vector addition unit 164, an unnaturalness estimation unit 165, a second parameter update unit 166, and a second parameter smoothing unit 167. Details of the processing of each part will be described below.


Second Region Division Unit 161


The second region division unit 161 takes the reduced motion vector vs(x, y, t) scaled by the first parameter and the non-rigid motion vector component Δvh(x, y, t) output from the non-rigid vector extraction unit 150 as inputs and obtains and outputs a region-divided reduced motion vector vsi(x, y, t) and a region-divided non-rigid motion vector component Δvhi(x, y, t). Here, i represents the region number.


Similar to the region division unit 131 of the first parameter generation unit 130, the second region division unit 161 divides the input vectors (the reduced motion vector vs(x, y, t) and the non-rigid motion vector component Δvh(x, y, t)) into regions (S161). A region-divided reduced motion vector vsi(x, y, t) is output to the motion vector addition unit 164 and a region-divided non-rigid motion vector component Δvhi(x, y, t) is output to the second multiplication unit 163.


The subsequent processing of the second parameter generation unit 160, except for the second parameter smoothing unit 167, is performed independently for each frame t of each region i. One second parameter λ2i(t) is output for each frame t of each region i, and when second parameters λ2i(t) are obtained for all regions/frames, they are collectively input to the second parameter smoothing unit 167.


Second Multiplication Unit 163


The second multiplication unit 163 takes the region-divided non-rigid motion vector component Δvhi(x, y, t) and the current second parameter λ2i(t) of the region i as inputs, multiplies the region-divided non-rigid motion vector component Δvhi(x, y, t) by the current second parameter λ2i(t) of the region i (S163), and outputs the product (λ2i(t)Δvhi(x, y, t)) to the motion vector addition unit 164. A value output from the second parameter update unit 166 is used as the current second parameter λ2i(t).


Motion Vector Addition Unit 164


The motion vector addition unit 164 takes the region-divided reduced motion vector vsi(x, y, t) and the non-rigid motion vector component λ2i(t)Δvhi(x, y, t) multiplied by the current second parameter λ2i(t) as inputs and obtains and outputs a vector v{circumflex over ( )}i(x, y, t) that combines the reduced motion vector and the non-rigid motion vector component.


The motion vector addition unit 164 combines the reduced motion vector vsi(x, y, t) and the non-rigid motion vector component λ2i(t)Δvhi(x, y, t) such that v{circumflex over ( )}i(x, y, t)=vsi(x, y, t)+λ2i(t)Δvhi(x, y, t) (S164) and outputs the combined vector v{circumflex over ( )}i(x, y, t) to the projection result generation unit 162 and the unnaturalness estimation unit 165.


Projection Result Generation Unit 162 and Unnaturalness Estimation Unit 165


The projection result generation unit 162 and the unnaturalness estimation unit 165 of the second parameter generation unit 160 perform the same processing S162 and S165 as that of the projection result generation unit 132 and the unnaturalness estimation unit 134 of the first parameter generation unit 130, respectively, except that the “motion vector λi(t)vi(x, y, t) scaled by the current first parameter” taken as an input motion vector is replaced with the “vector v{circumflex over ( )}i(x, y, t) that combines the reduced motion vector and the non-rigid motion vector component”.


Second Parameter Update Unit 166


The second parameter update unit 166 takes an unnaturalness estimate diMin(t) obtained with a previous second parameter as an input and obtains and outputs a second parameter λ2i(t) of the next cycle. However, in the first cycle, the second parameter update unit 166 performs only the output because there is no input.


The second parameter update unit 166 controls a process of searching for the second parameter. For example, the second parameter update unit 166 searches for a second parameter at which the unnaturalness estimate diMin(t) obtained by the unnaturalness estimation unit 165 is closest to a threshold τ. The value of τ is the same as that used in the first parameter update unit 135. A binary search method is used for the search, similar to the first parameter update unit 135.


The second parameter update unit 166 performs the same processing S166 and S166A as the first parameter update unit 135, except that the first parameter is replaced with the second parameter.


Second Parameter Smoothing Unit 167


The second parameter smoothing unit 167 performs the same processing S167 as the first parameter smoothing unit 136. The second parameter smoothing unit 167 takes the second parameter λ2i(t) obtained from each region/frame as an input, smooths the input second parameter λ2i(t) (S167), and outputs the smoothed second parameter λ2(x, y, t) of each pixel. However, the parameters (ss, st) that determine permissible levels for the magnitude of the gradient between adjacent regions are set greater than those of the first parameter smoothing unit 136 because non-rigid motion vector components do not significantly change their qualitative impression of motion even if the magnitude of motion changes locally. In the present embodiment, (ss, st)=(0.3, 0.06). However, these parameters are not limited to the values defined here and any values may be set as long as the spatial and temporal discontinuities of the magnitude of motion are not a concern.


The generated second parameter λ2(x, y, t) is output to the motion vector combining unit 170.


Motion Vector Combining Unit 170


The motion vector combining unit 170 takes the second parameter λ2(x, y, t), the non-rigid motion vector component Δvh(x, y, t), and the reduced motion vector vs(x, y, t) as inputs and obtains and outputs a combined motion vector v{circumflex over ( )}(x, y, t).


The motion vector combining unit 170 scales the non-rigid motion vector component Δvh(x, y, t) with the second parameter λ2(x, y, t) generated by the second parameter generation unit 160 and adds the scaled non-rigid motion vector component and the reduced motion vector vs(x, y, t) scaled by the first parameter to finally obtain a motion vector (a combined motion vector v{circumflex over ( )}(x, y, t)) to be used for projection image generation (S170). That is, the motion vector combining unit 170 combines the motion vectors using the following equation.

[Math. 11]
$\hat{v}(x,y,t)=v_s(x,y,t)+\lambda_2(x,y,t)\,\Delta v_h(x,y,t)$  (8)


The motion vector combining unit 170 outputs the combined motion vector v{circumflex over ( )}(x, y, t) to the projection image generation unit 180.


Projection Image Generation Unit 180


The projection image generation unit 180 takes the minimum luminance image IMin(x, y), the maximum luminance image IMax(x, y), the intermediate luminance image I0(x, y), the combined motion vector v{circumflex over ( )}(x, y, t), and the P2C map as inputs and obtains and outputs a projection image IP(x, y, t).


The projection image generation unit 180 distorts the intermediate luminance image I0(x, y) based on the combined motion vector v{circumflex over ( )}(x, y, t) to obtain a distorted image IW(x, y, t) (S180). The distortion method is similar to that of the projection result generation unit 132 in the first parameter generation unit 130.


The projection image generation unit 180 obtains an ideal projection image IM(x, y, t) for reproducing a distorted image using Equation (2), similar to the projection result generation unit 132 in the first parameter generation unit 130.


Further, the projection image generation unit 180 limits the value of IM(x, y, t) to the physically projectable range [0, 1] of the projector.


The projection image generation unit 180 maps the image thus obtained to the projector coordinate system based on the P2C map, sets the resulting image as IP(x, y, t), and outputs it to the projection unit 190.


Projection Unit 190


The projection unit 190 takes the projection image IP(x, y, t) as an input and projects the input projection image from the projector toward the projection target (S190).


The projection image IP(x, y, t) is projected such that edges included in the projection image IP(x, y, t) overlap the contour of the projection target or edges included in the projection target. Here, alignment of the projection image IP(x, y, t) is unnecessary because the projection image IP(x, y, t) is generated based on the P2C map obtained through camera calibration. A commercially available projector may be used, but it is necessary to use a projector with high luminance when used in a bright room.


The projection unit 190 projects the projection image IP(x, y, t) onto the projection target Mstatic using a known optical projection technique (see, for example, Reference 4) to display a moving image M2.

$M_2=M_{\mathrm{static}} \circ I_P(x,y,t)$  [Math. 12]

  • (Reference 4) Takahiro Kawabe, Masataka Sawayama, Kazushi Maruya, and Shinya Nishida, (2014). “A light projection method to perceptually deform two-dimensional static objects by motion information”, Annual conference of the Institute of Image Information and Television Engineers 2014, 5-3.


Here, ◯ represents a state in which the projection image IP(x, y, t) is added to/multiplied by (applied to) the luminance component of the projection target Mstatic in a combined manner. In other words, ◯ represents a state in which an operation including at least one of addition and multiplication is performed on the luminance component of the projection target Mstatic and the projection image IP(x, y, t). That is, when light is projected onto a printed matter, it is assumed that the reflection pattern differs depending on the characteristics of paper or ink and the luminance changes multiplicatively in some parts while changing additively in other parts. Thus, ◯ indicates a calculation that makes the luminance change in those two ways.


Effects


With the above configuration, motion information to be projected can be automatically adjusted and optimized for each region and each frame according to the projection target and the projection environment. Further, fine adjustments that are difficult to perform manually can be performed in a short time.


Modifications


In the present embodiment, the projection target photographing unit 110, the camera-projector pixel correspondence acquisition unit 120, and the addition unit 125 may be provided as separate devices and a projection image generation apparatus including the remaining components may take their output values (IMax, IMin, I0, ρ, the P2C map, and the C2P map) as inputs. Further, the projection unit 190 may be provided as a separate device and the projection image generation apparatus may be configured to output the projection image IP(x, y, t) to the projection unit 190 which is a separate device.


Furthermore, the first parameter generation unit 130, the motion vector reduction unit 140, the non-rigid vector extraction unit 150, the second parameter generation unit 160, and the motion vector combining unit 170 may be extracted from the projection image generation apparatus of the present embodiment and implemented to function as a motion vector generation apparatus. In this case, the motion vector generation apparatus takes IMax, IMin, I0, ρ, the P2C map, the C2P map, and v(x, y, t) as inputs and outputs a combined motion vector v{circumflex over ( )}(x, y, t).


The same modifications can be made in the following embodiments.


Second Embodiment

When the magnitude of motion is manually adjusted as in Patent Literature 1, it is not possible to realize an application that interactively gives motions to a target (for example, an application that gives motions based on changes in the facial expression of a person to a photograph or painting through projection mapping while capturing the facial expression of the person in real time with a camera).


Processing of the first embodiment is performed such that the first parameter generation unit 130 and the second parameter generation unit 160 obtain first parameters λi(t) (or second parameters λ2i(t)) of the regions of each frame over all regions of all frames and then the first parameter smoothing unit 136 (or the second parameter smoothing unit 167) collectively smooths them at once to obtain first parameters λ(x, y, t) (or second parameters λ2(x, y, t)). Thus, similar to Patent Literature 1, the method of the first embodiment cannot be used in cases where it is required that input motion vectors v(x, y, t) be optimized sequentially (in real time) (for example, in applications that require interactivity).


A second embodiment will be described with regard to a method of performing processing for optimizing input motion vectors v(x, y, t) sequentially frame by frame. Hereinafter, changes from the first embodiment will be mainly described.


It is assumed that the input motion vector is a motion vector v(x, y, t0) at the current frame t=t0 rather than v(x, y, t) for every frame. According to this, it is also assumed that the motion vector reduction unit 140, the non-rigid vector extraction unit 150, the motion vector combining unit 170, and the projection image generation unit 180 perform only processing relating to the current frame.


First Parameter Generation Unit 130


In the first parameter generation unit 130, the region division unit 131 performs region division of the motion vector v(x, y, t0) of the current frame in the same manner as in the first embodiment. The processing performed for each region (the processing of the multiplication unit 133, the projection result generation unit 132, the unnaturalness estimation unit 134, and the first parameter update unit 135) is performed in the same manner as in the first embodiment.


The processing of the first parameter smoothing unit 136 is replaced with the following processing.


First Parameter Smoothing Unit 136


The first parameter smoothing unit 136 takes the first parameter λi(t) obtained from each region as an input and obtains and outputs a smoothed first parameter λ(x, y, t0) of each pixel.


The first parameter smoothing unit 136 in the second embodiment separately performs smoothing in the spatial direction and smoothing in the temporal direction. The smoothing in the spatial direction is performed through the same procedure as in the first embodiment as follows.


The first parameter of each region will be referred to as λ(m, n) for the sake of explanation. Here, m represents the horizontal position of the region and n represents the vertical position of the region. First, smoothing is performed such that extreme value changes do not occur between adjacent first parameters λ(m, n). At this time, smoothing is performed by replacing λ(m, n) with λ′(m, n) such that the following two constraints are satisfied.


Constraint 1: λ′(m, n)≤λ(m, n) must be satisfied for all m and n. This prevents the smoothing process from causing the unnaturalness to exceed the unnaturalness threshold.


Constraint 2: The following must be satisfied for all m and n.

$|\lambda'(m,n)-\lambda'(m',n')| \le \sqrt{(|m-m'|^2+|n-n'|^2)\,s_s^2}$  [Math. 13]


Here, (m′, n′) represents a set of regions around (m, n), where m′∈{m−1, m, m+1} and n′∈{n−1, n, n+1}. In addition, ss is a permissible value for the magnitude of the gradient between adjacent regions. As in the first embodiment, ss=0.06. The method described in Reference 3 can be used as an algorithm for updating λ(m, n), similar to the first embodiment. The specific processing is as illustrated in FIG. 11.


Smoothing is performed in the temporal direction after smoothing in the spatial direction. For this purpose, a first parameter λ″(m, n, t0−1) of the immediately previous frame that has been smoothed in the spatial and temporal directions (hereinafter referred to as λ″(t0−1) for the sake of simplicity) is read from the storage unit, and a first parameter λ′(m, n, t0) of the current frame that has been smoothed in the spatial direction (hereinafter referred to as λ′(t0) for the sake of simplicity) is smoothed in the following manner to obtain a first parameter λ″(m, n, t0) that has been smoothed in the temporal direction (hereinafter referred to as λ″(t0) for the sake of simplicity).






[Math. 14]
$\lambda''(t_0)=\begin{cases}\lambda''(t_0-1)+s'_t/F & \text{if } \lambda'(t_0)-\lambda''(t_0-1)>s'_t/F\\ \lambda''(t_0-1)-s'_t/F & \text{if } \lambda'(t_0)-\lambda''(t_0-1)<-s'_t/F\\ \lambda'(t_0) & \text{otherwise}\end{cases}$  (9)







Here, F represents the overall frame rate of the system and s′t is a parameter that determines the permissible value (maximum value) of the magnitude of the gradient from the previous frame. In the present embodiment, s′t=2, assuming a frame rate of F=60. In this case, the permissible magnitude of the gradient in the temporal direction of the first parameter is 0.033. The permissible magnitude of the gradient does not necessarily have to be this value, but the discontinuity of the magnitude of motion may be noticeable if it is too large, while the number of frames in which the unnaturalness of the projection result becomes greater than the threshold τ increases if it is too small. In consideration of these factors, the user may be allowed to select an optimum parameter. If there is no previous frame, the subsequent processing is performed with λ″(t0)=λ′(t0). The obtained first parameter λ″(t0) that has been smoothed is stored in the storage unit and used for the smoothing process of the next frame.


That is, based on the magnitude relationship between the predetermined value (s′t/F or −s′t/F) and the difference between the "first parameter λ″(t0−1) of the immediately previous frame that has been smoothed in the spatial and temporal directions" and the "first parameter λ′(t0) of the current frame that has been smoothed in the spatial direction", the first parameter smoothing unit 136 smooths the first parameter λ′(t0) in the temporal direction using the first parameter λ″(t0−1) and the predetermined value (s′t/F or −s′t/F).
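

A minimal sketch of this temporal smoothing (Equation (9)) is given below; the function name and argument names are assumptions made for illustration.

def smooth_temporal(lam_prev, lam_cur, s_t_prime=2.0, frame_rate=60.0):
    """Sketch of Equation (9): clamp the frame-to-frame change of the
    spatially smoothed parameter to +/- s'_t / F.
    lam_prev corresponds to lambda''(t0-1), lam_cur to lambda'(t0);
    the return value corresponds to lambda''(t0)."""
    step = s_t_prime / frame_rate
    if lam_cur - lam_prev > step:
        return lam_prev + step
    if lam_cur - lam_prev < -step:
        return lam_prev - step
    return lam_cur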


Finally, λ″(t0) is expanded through the bilinear interpolation method or the like as in the first embodiment to obtain the first parameter λ(x, y, t0) of each pixel.


Second Parameter Generation Unit 160


In the second parameter generation unit 160, the second region division unit 161 performs region division of the reduced motion vector vs(x, y, t0) of the current frame and the non-rigid motion vector component Δvh(x, y, t0) of the current frame in the same manner as in the first embodiment. The processing performed for each region (the processing of the second multiplication unit 163, the motion vector addition unit 164, the projection result generation unit 162, the unnaturalness estimation unit 165, and the second parameter update unit 166) is performed in the same manner as in the first embodiment.


Second Parameter Smoothing Unit 167


The processing of the second parameter smoothing unit 167 is replaced with the same processing as that of the first parameter smoothing unit 136 in the second embodiment.


That is, the second parameter smoothing unit 167 first performs smoothing in the spatial direction using the method described in Reference 3 and then performs smoothing in the temporal direction. The parameter that determines the permissible level of the magnitude of the gradient in the spatial direction when smoothing is performed in the spatial direction is set such that ss=0.3 as with the second parameter smoothing unit 167 in the first embodiment. Based on the magnitude relationship between the predetermined value (s′t/F or −s′t/F) and the difference between the "second parameter λ″2(t0−1) of the immediately previous frame that has been smoothed in the spatial and temporal directions" and the "second parameter λ′2(t0) of the current frame that has been smoothed in the spatial direction", the second parameter λ′2(t0) is smoothed in the temporal direction using the second parameter λ″2(t0−1) and the predetermined value (s′t/F or −s′t/F), similar to the first parameter smoothing unit 136 in the second embodiment. However, the parameter s′t that determines the permissible level of the magnitude of the gradient is set greater than that of the first parameter smoothing unit. In the present embodiment, s′t=4. However, the value of s′t is not limited to the value defined here and any value may be set as long as the temporal discontinuity of the magnitude of motion is not a concern.


Effects


With the above configuration, the same advantageous effects as those of the first embodiment can be achieved. Further, the motion vector v(x, y, t) can be optimized sequentially (in real time). The present invention can be applied to an application that interactively gives motions to a target.


Third Embodiment

Parts different from the first and second embodiments will be mainly described.


In the first and second embodiments, the filtering unit of the non-rigid vector extraction unit 150 extracts a high-frequency component of the motion vector as a non-rigid motion vector component Δvh(x, y, t). In the third embodiment, a plurality of bandpass components may instead be extracted using a plurality of bandpass filters.


For example, a non-rigid vector extraction unit 150 may be configured to decompose a motion vector into a plurality of (NP) bandpass components Δvb_1, Δvb_2, . . . , Δvb_N_P (where NP is an integer of 2 or more) using a Laplacian pyramid or the like and to obtain nth parameters of different spatial frequency components (n∈2, . . . , NP+1).
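

A possible sketch of such a decomposition is given below. It approximates a Laplacian pyramid with full-resolution differences of Gaussians so that every band keeps the original pixel grid; the choice of SciPy, the octave-spaced scales, and the function name are assumptions, not part of the embodiment.

import numpy as np
from scipy.ndimage import gaussian_filter

def bandpass_components(dv, n_bands=3, base_sigma=2.0):
    """Decompose one component of the motion vector difference into n_bands
    bandpass components, from the finest scale to the coarsest. The residual
    low-pass band is discarded here."""
    bands = []
    prev = dv.astype(float)
    for k in range(n_bands):
        sigma = base_sigma * (2 ** k)
        low = gaussian_filter(dv.astype(float), sigma=sigma)
        bands.append(prev - low)   # detail between adjacent scales
        prev = low
    return bands                   # corresponds to Δv_b_1, ..., Δv_b_N_P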



FIG. 12 is a functional block diagram of a projection image generation apparatus according to the third embodiment and FIG. 13 illustrates an example of a processing flow thereof. FIG. 12 omits illustration of a projection target photographing unit 110, an addition unit 125, a camera-projector pixel correspondence acquisition unit 120, and a projection unit 190.


The projection image generation apparatus according to the third embodiment includes NP pieces of nth parameter generation units 160-n and NP pieces of nth motion vector combining units 170-n (n∈2, . . . , NP+1) instead of the second parameter generation unit 160 and the motion vector combining unit 170 of the projection image generation apparatus of the first embodiment or the second embodiment.


nth Parameter Generation Unit 160-n


Each nth parameter generation unit 160-n performs the same processing as that of the second parameter generation unit 160 of the first embodiment (or the second embodiment) except for the points described below.


The nth parameter generation unit 160-n takes a combined motion vector vn-1(x, y, t) output from an (n−1)th motion vector combining unit 170-(n−1) (a reduced motion vector vs(x, y, t) if n=2), an (n−1)th bandpass component Δvb_n-1(x, y, t) of the motion vector, a region-divided minimum luminance image IMini(x, y), a region-divided maximum luminance image IMaxi(x, y), a region-divided intermediate luminance image I0i(x, y), a ratio ρ, a P2C map, and a C2P map as inputs, obtains an nth parameter λn using these inputs (S160-n), and outputs the obtained nth parameter λn. The nth parameter λn(x, y, t) is a parameter for scaling the (n−1)th bandpass component Δvb_n-1(x, y, t) as in "vs(x, y, t)+λ2(x, y, t)Δvb_1(x, y, t)+ . . . +λn(x, y, t)Δvb_n-1(x, y, t)+ . . . +λN_p+1(x, y, t)Δvb_N_p(x, y, t)" when a motion lost due to reduction with the first parameter is compensated for with the (n−1)th bandpass component Δvb_n-1(x, y, t).


That is, the nth parameter generation unit 160-n replaces the non-rigid motion vector component Δvh(x, y, t) with the (n−1)th bandpass component Δvb_n-1(x, y, t) of the motion vector.


Only when n>2, the reduced motion vector vs(x, y, t) is replaced with the combined motion vector vn-1(x, y, t) output from the (n−1)th motion vector combining unit 170-(n−1) and the second parameter λ2 is replaced with the nth parameter λn.


The constraints on the magnitude of the gradient, ss and st (s′t when real-time processing is performed as in the second embodiment), used in the second parameter smoothing unit 167 in the nth parameter generation unit 160-n gradually increase with n (for example, they double each time n increases by 1).


The obtained nth parameter λn(x, y, t) is output to the nth motion vector combining unit 170-n.


nth Motion Vector Combining Unit 170-n


The nth motion vector combining unit 170-n takes the nth parameter λn(x, y, t), the (n−1)th bandpass component Δvb_n-1(x, y, t) of the motion vector, and the combined motion vector vn-1(x, y, t) output from the (n−1)th motion vector combining unit 170-(n−1) as inputs and obtains and outputs a combined motion vector vn(x, y, t).


The nth motion vector combining unit 170-n adds the (n−1)th bandpass component λn(x, y, t)Δvb_n-1(x, y, t) scaled using the nth parameter and the (n−1)th combined motion vector vn-1(x, y, t) according to the following equation to calculate the combined motion vector vn(x, y, t) (S170-n).

$v_n(x,y,t)=v_{n-1}(x,y,t)+\lambda_n(x,y,t)\,\Delta v_{b_{n-1}}(x,y,t)$  [Math. 15]


When n<NP+1, the combined motion vector vn(x, y, t) is output to the (n+1)th parameter generation unit 160-(n+1) and the (n+1)th motion vector combining unit 170-(n+1).


When n=NP+1, the combined motion vector vN_P+1(x, y, t) is output to the projection image generation unit 180 as v{circumflex over ( )}(x, y, t).


The above processes S160-n and S170-n are repeated from n=2 to n=NP+1 (S1, S2, S3).
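

The overall loop over bands can be sketched as follows; estimate_lambda is a placeholder standing in for the nth parameter generation process (the region-wise search and smoothing), and the function names are assumptions for illustration.

def combine_bands(v_s, bands, estimate_lambda):
    """Sketch of repeating S160-n / S170-n: starting from the reduced
    vector v_s, add each bandpass component scaled by its own parameter.
    bands holds the components Δv_b_1, ..., Δv_b_N_P as arrays."""
    v_combined = v_s
    for band in bands:                       # n = 2, ..., N_P + 1
        lam_n = estimate_lambda(v_combined, band)
        v_combined = v_combined + lam_n * band
    return v_combined                        # final combined vector for projection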


Effects


With the above configuration, the same advantageous effects as those of the first or second embodiment can be achieved. Further, finer adjustment can be performed by compensating for a motion lost due to reduction with the first parameter for each bandpass component.


Fourth Embodiment

Parts different from the first and second embodiments will be mainly described.


If it is known in advance that a motion vector v(x, y, t) to be input does not include many rigid motions, the non-rigid vector extraction unit 150, the second parameter generation unit 160, and the motion vector combining unit 170 may be omitted and a motion vector obtained by the motion vector reduction unit 140 may be used as a final motion vector in the projection image generation unit 180. In this case, the parameters used in the first parameter smoothing unit (ss and st in the first embodiment and ss and s′t in the second embodiment) are replaced with those used in the second parameter smoothing unit 167.



FIG. 14 is a functional block diagram of the projection image generation apparatus according to the fourth embodiment and FIG. 15 illustrates a processing flow thereof.


Fifth Embodiment

Parts different from the first embodiment will be mainly described.


In the unnaturalness estimation unit 134 described in the first embodiment, it is necessary to run a loop to simultaneously obtain the third parameter αi that determines a perceptual magnitude of motion with respect to a projection result and the unnaturalness estimate diMin(t), and thus the processing takes time. The present embodiment will be described with regard to a method in which a third parameter αi is first analytically obtained and an unnaturalness estimate diMin(t) is calculated using the obtained third parameter αi, thereby allowing diMin(t) to be output without running the loop. In the present embodiment, only the unnaturalness estimation unit 134 is replaced with an unnaturalness estimation unit 534 of FIG. 16, while any types can be used for other processes and components.



FIG. 16 is a functional block diagram of the unnaturalness estimation unit 534 according to the fifth embodiment and FIG. 17 illustrates an example of a processing flow thereof.


Compared with the unnaturalness estimation unit 134 of the first embodiment (see FIGS. 5 and 6), the third parameter update unit 134C is removed, and instead, a third parameter estimation unit 534C is newly added. The other common processing units (a third parameter multiplication unit 134A, a warped image generation unit 134B, a perceptual model application unit 134D, and a perceptual difference calculation unit 134E) perform the same processing as those of the unnaturalness estimation unit 134 of the first embodiment, except for the following two points.


(1) A third parameter αi, which is input to the third parameter multiplication unit 134A, is provided by the third parameter estimation unit 534C.


(2) A perceptual difference di(t) obtained by the perceptual difference calculation unit 134E is directly output from the unnaturalness estimation unit 534 as an unnaturalness estimate diMin(t).


Hereinafter, the processing of the third parameter estimation unit 534C will be specifically described.


Third Parameter Estimation Unit 534C


The third parameter estimation unit 534C takes an intermediate luminance image I0i(x, y), a motion vector λi(t)vi(x, y, t) scaled by the first parameter, and a projection result reproduction image IPi(x, y, t) as inputs, obtains a third parameter αi (S534C), and outputs the third parameter αi.


By expressing the process of distorting the intermediate luminance image I0i(x, y) based on the motion vector λi(t)vi(x, y, t) scaled by the first parameter λi(t) as a linear equation by a first-order approximation of Taylor expansion, the third parameter estimation unit 534C uniquely obtains the third parameter αi without repeatedly obtaining the perceptual difference di(t).


The third parameter αi is a parameter that determines the "perceptual amount of motion αiλi(t)vi(x, y, t)" perceived when the projection result reproduction image IPi(x, y, t) is viewed. αi which minimizes the perceptual difference di(t) between the projection result reproduction image IPi(x, y, t) and the image IW(α)i(x, y, t) generated by distorting the original intermediate luminance image I0i(x, y) by αiλi(t)vi(x, y, t) on the computer is obtained as "αi that determines the perceptual amount of motion". At this time, the first embodiment converts the projection result reproduction image IPi(x, y, t) and the image IW(α)i(x, y, t) into perceptual responses r(x, y, t) and r′(x, y, t), respectively, and then explicitly calculates the distance di(t) between the perceptual responses r(x, y, t) and r′(x, y, t) as a perceptual difference and obtains αi that minimizes di(t) through a search including iterative processing. Here, a method of directly estimating αi without calculating di(t) will be described. Hereinafter, the superscript i (which indicates belonging to the region i) and the time frame t will be omitted to simplify the description. (Processing is performed independently for each region i and each frame t.)


First, the case of obtaining α that minimizes a physical difference (an average squared error between images) rather than the perceptual difference will be considered to simplify the problem. This can be described as a problem of calculating α that minimizes the following error function.

[Math. 16]
$e=\sum_{x,y}\left(I_P(x,y)-I_{W(\alpha)}(x,y)\right)^2$  (10)


Here IW(α)(x, y) is expressed as follows.

[Math. 17]
$I_{W(\alpha)}(x,y)=I_O(x-\alpha v_x(x,y),\, y-\alpha v_y(x,y))$  (11)


Here, vx(x, y) and vy(x, y) represent the x- and y-axis elements of the motion vector λv(x, y), respectively. To simplify the description, pixel movement will be described as inverse warping (a mode in which the original image is referred to by the image after movement). However, in the present embodiment, the approximation described below also holds for forward warping (a mode in which the image after movement is referred to by the original image) because it is assumed that α is spatially smooth.


Equation (11) can be expressed as follows by a first-order approximation of Taylor expansion.






[Math. 18]
$I_{W(\alpha)}(x,y)\approx I_O(x,y)-\alpha\left(\dfrac{\partial I_O}{\partial x}v_x(x,y)+\dfrac{\partial I_O}{\partial y}v_y(x,y)\right)$  (12)







α=1 is substituted into Equation (12) to obtain the following equation.






[Math. 19]
$I_{W(1)}(x,y)\approx I_O(x,y)-\left(\dfrac{\partial I_O}{\partial x}v_x(x,y)+\dfrac{\partial I_O}{\partial y}v_y(x,y)\right)$  (13)







The following equation is obtained from Equations (12) and (13).

[Math. 20]
$I_{W(\alpha)}(x,y)\approx I_O(x,y)+\alpha\left(I_{W(1)}(x,y)-I_O(x,y)\right)$  (14)


Here, by setting DP=IP−I0 and DW=IW(1)−I0 and substituting Equation (14) into Equation (10), the following equation is obtained.

[Math. 21]
$e=\sum_{x,y}\left(D_P(x,y)-\alpha D_W(x,y)\right)^2$  (15)


The solution of this minimization problem of e can be uniquely obtained using the following equation.






[Math. 22]
$\alpha=\dfrac{\sum_{x,y}D_P(x,y)\,D_W(x,y)}{\sum_{x,y}D_W(x,y)^2}$  (16)







In the present embodiment, the first-order approximation of Taylor expansion is performed. However, this is an example and another approximation may be performed as long as it is a linear approximation using gradient information of an image.


Next, let us return to the problem of obtaining α which minimizes the perceptual difference rather than the physical difference. At this time, a method of solving Equation (16) by replacing IP, IW(1), and I0 with responses of the perceptual model which are conversion results through the same processing as that of the perceptual model application unit 134D can be considered first. However, instead of applying all the processing of the perceptual model application unit 134D to convert the image, conversion may be made into up to weighted bandpass images represented by Equation (4) and these may be substituted into Equation (16) to obtain α. This may be adopted because it is possible to obtain sufficient accuracy to estimate the perceptual amount of motion without reproducing the contrast gain adjustment process represented by Equation (6). However, the conversion of Equation (6) is very important for the unnaturalness estimation. A specific procedure for obtaining the third parameter αi is as follows.


The third parameter estimation unit 534C distorts the intermediate luminance image I0i(x, y) based on the motion vector λi(t)vi(x, y, t) scaled by the first parameter λi(t) to obtain IiW(1)(x, y, t). The distortion method is similar to that of the projection result generation unit 132 in the first parameter generation unit 130.


Next, the third parameter estimation unit 534C converts each of IiW(1)(x, y, t), IiP(x, y, t), and I0(x, y) into weighted bandpass images cj(x, y) according to processing 1 to 3 of the perceptual model application unit 134D.


Further, the third parameter estimation unit 534C sums the weighted bandpass images obtained from each of IiW(1)(x, y, t), IiP(x, y, t), and I0(x, y) over j such that Ic(x, y)=Σjcj(x, y) and combines the sums into three respective images WC(x, y), PC(x, y), and OC(x, y).


The third parameter estimation unit 534C substitutes DP=PC(x, y)−OC(x, y) and DW=WC(x, y)−OC(x, y) into Equation (16) to obtain an estimate of the third parameter αi (S534C). The estimate αi of the third parameter is output to the third parameter multiplication unit 134A.
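

This final substitution into Equation (16) amounts to a closed-form least-squares fit, which can be sketched as follows; the function name and the zero-denominator fallback are illustrative assumptions.

import numpy as np

def estimate_alpha(P_c, W_c, O_c):
    """Sketch of Equation (16) applied to the combined weighted bandpass
    images: D_P = P_c - O_c, D_W = W_c - O_c, and alpha is the least-squares
    scale that best maps D_W onto D_P."""
    D_P = P_c - O_c
    D_W = W_c - O_c
    denom = np.sum(D_W ** 2)
    if denom == 0:
        return 0.0  # no predicted change; fall back to 0 (assumed convention)
    return float(np.sum(D_P * D_W) / denom)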


The other processes are the same as those of the unnaturalness estimation unit 134 of the first embodiment as described above.


Effects


With the above configuration, the same advantageous effects as those of the first embodiment can be achieved. Further, processing of the unnaturalness estimation unit can be speeded up. The present embodiment may be combined with the second to fourth embodiments.


Sixth Embodiment

In the first to fifth embodiments, the first parameter is kept lowered until the number of update cycles reaches NS, unless the unnaturalness estimate diMin(t) is equal to or less than the threshold τ. Thus, depending on the conditions, the first parameter may become very small and the magnitude of motion may be reduced more than expected. In order to eliminate such a possibility and secure a minimum necessary impression of motion in the projection result after optimization, the first parameter may be constrained such that it does not fall below a certain lower limit. In a sixth embodiment, which is an example of a method of constraining the first parameter, the unnaturalness estimation unit 134 also outputs a third parameter αi (representing how large the perceptual magnitude of motion is compared with the physical magnitude of the vector) and the first parameter is constrained such that the first parameter multiplied by the third parameter (=the "perceptual magnitude of motion obtained by the reduced motion vector" relative to the "magnitude of the original motion vector") does not fall below a predetermined threshold. This can be realized, for example, by replacing the processing of the first parameter update unit 135 with the following processing.


First Parameter Update Unit 135


The first parameter update unit 135 takes an unnaturalness estimate diMin(t) obtained with a first parameter of the previous cycle and a third parameter αi (which is indicated by (αi) in FIG. 3) as inputs, obtains a first parameter λi(t) of the next cycle (S135), and outputs the obtained first parameter λi(t). However, in the first cycle, the first parameter update unit 135 performs only the output because there is no input.


In the first cycle, the first parameter update unit 135 stores λi(t)=0.5 and a step size of stp=0.25 in the storage unit and outputs λi(t) to the multiplication unit 133.


In the subsequent cycles, the first parameter update unit 135 updates λi(t) as follows based on a result of comparison between the input unnaturalness estimate diMin(t) and the threshold τ.


When diMin(t)<τ, the first parameter update unit 135 updates the first parameter such that λi(t)=λi(t)+stp (overwrites the first parameter such that λi(t)=1 if λi(t) exceeds 1 in this process) and stores the updated first parameter in the storage unit.


When diMin(t)>τ, the first parameter update unit 135 updates the first parameter such that λi(t)=λi(t)−stp (overwrites the first parameter such that λi(t)=τ2/αi if αiλi(t)<τ2 in this process) and stores the updated first parameter in the storage unit.


When diMin(t)=τ or the number of cycles is NS, the first parameter update unit 135 ends the search and outputs λi(t) to the first parameter smoothing unit 136. In other cases, the first parameter update unit 135 updates the step size such that stp=stp/2 and stores it in the storage unit and outputs λi(t) to the multiplication unit 133.
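

One cycle of this constrained update can be sketched as follows; the function name, the argument names, and the convention that the caller checks the termination condition (diMin(t)=τ or the cycle count reaching NS) are assumptions for illustration.

def update_first_parameter(lam, stp, d_min, alpha, tau, tau2):
    """One cycle of the sixth-embodiment update (a sketch). lam and stp are
    the current first parameter and step size; d_min is the unnaturalness
    estimate, alpha the third parameter, tau the unnaturalness threshold, and
    tau2 the lower bound on the perceptual magnitude alpha * lam."""
    if d_min < tau:
        lam = min(lam + stp, 1.0)        # raise the parameter, capped at 1
    elif d_min > tau:
        lam = lam - stp                  # lower the parameter
        if alpha * lam < tau2:           # keep a minimum impression of motion
            lam = tau2 / alpha
    return lam, stp / 2.0                # halve the step for the next cycle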


Effects


With the above configuration, the same advantageous effects as those of the first embodiment can be achieved. Further, a minimum necessary impression of motion can be secured. The present embodiment may be combined with the second to fifth embodiments.


Seventh Embodiment

The projection image generation method may be performed based on another method. For example, the method of JP 2018-50216 A can be used.


In this case, the projection unit 190 projects uniform light of luminance B1 and B2 (B1<B2) onto the projection target and the projection target photographing unit 110 obtains images IB1 and IB2 by photographing the projection target under the respective conditions.


These images are treated as a minimum luminance image IMin=IB1 and a maximum luminance image IMax=IB2. An intermediate luminance image I0 is also treated such that I0=IB1 and the process of obtaining IO in the addition unit is omitted.


The projection result generation unit 132 and the projection image generation unit 180 generate IM using the following equation.






[Math. 23]
$I_M(x,y,t)=w\,\dfrac{I_W(x,y,t)-I_{B_1}(x,y)}{K(x,y)}+B_1$






Here, K is a value that reflects the albedo (reflectance) of each pixel of the projection target and is calculated as follows.

$K(x,y)=\dfrac{I_{B_2}(x,y)-I_{B_1}(x,y)}{B_2-B_1}$  [Math. 24]


Although it is basically optimal to set w to 1 (w=1), the user may be allowed to change it such that the contrast of the projection image can be manipulated. If the albedo estimation contains a large amount of error, K may be fixed to 1 (K=1) for all pixels.
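

As a sketch under the assumption that K(x, y) is nonzero everywhere (the function names are illustrative), Math 24 and Math 23 can be computed as follows.

import numpy as np

def albedo_map(I_B1, I_B2, B1, B2):
    """[Math. 24]: K(x, y) = (I_B2 - I_B1) / (B2 - B1), a per-pixel
    reflectance estimate of the projection target."""
    return (I_B2 - I_B1) / (B2 - B1)

def ideal_projection(I_W, I_B1, K, B1, w=1.0):
    """[Math. 23] sketch: I_M = w * (I_W - I_B1) / K + B1, assuming K > 0."""
    return w * (I_W - I_B1) / K + B1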


To obtain the projection result reproduction image IP, the projection result generation unit 132 obtains I{circumflex over ( )}M(x, y, t) through the same procedure as in the first embodiment and calculates IP using the following equation.






[Math. 25]
$I_P(x,y,t)=\left(I_{B_2}(x,y)-I_{B_1}(x,y)\right)\dfrac{\hat{I}_M(x,y,t)-B_1}{B_2-B_1}+I_{B_1}(x,y)$







Effects


With the above configuration, the same advantageous effects as those of the first embodiment can be achieved. The present embodiment may be combined with the second to sixth embodiments.


Other Modifications


The present invention is not limited to the above embodiments and modifications. For example, the various processes described above may be executed not only in chronological order as described but also in parallel or individually as necessary or depending on the processing capabilities of the apparatuses that execute the processing. In addition, appropriate changes can be made without departing from the spirit of the present invention.


Program and Recording Medium


The various processing functions of each device (or apparatus) described in the above embodiments and modifications may be realized by a computer. In this case, the processing details of the functions that each device may have are described in a program. When the program is executed by a computer, the various processing functions of the device are implemented on the computer.


The program in which the processing details are described can be recorded on a computer-readable recording medium. The computer-readable recording medium can be any type of medium such as a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory.


The program is distributed, for example, by selling, giving, or lending a portable recording medium such as a DVD or a CD-ROM with the program recorded on it. The program may also be distributed by storing the program in a storage device of a server computer and transmitting the program from the server computer to another computer through a network.


For example, a computer configured to execute such a program first stores, in its storage unit, the program recorded on the portable recording medium or the program transmitted from the server computer. Then, the computer reads the program stored in its storage unit and executes processing in accordance with the read program. In a different embodiment of the program, the computer may read the program directly from the portable recording medium and execute processing in accordance with the read program. The computer may also sequentially execute processing in accordance with the program transmitted from the server computer each time the program is received from the server computer. In another configuration, the processing may be executed through a so-called application service provider (ASP) service in which functions of the processing are implemented just by issuing an instruction to execute the program and obtaining results without transmission of the program from the server computer to the computer. The program includes information that is provided for use in processing by a computer and is equivalent to a program (such as data having properties defining the processing executed by the computer rather than direct commands to the computer).


In this mode, the device is described as being configured by executing the predetermined program on the computer, but at least a part of the processing may be realized by hardware.

Claims
  • 1. A motion vector generation apparatus for automatically adjusting a motion comprising: a camera; a projector; and a computer comprising processing circuitry, the processing circuitry configured to: execute a first parameter generation process in which the processing circuitry generates a first parameter that is a parameter for scaling a motion vector based on a perceptual difference between a projection result reproduction image which is an image that is obtained by photographing a projection target onto which a projection image obtained based on the motion vector has been projected and a warped image which is an image generated by distorting an image obtained by photographing the projection target by a perceptual amount of motion perceived by viewing the projection result reproduction image from the camera; and execute a motion vector reduction process in which the processing circuitry scales the motion vector using the first parameter; and displaying, based on the scaled motion vector, the projection result reproduction image using the projector for improved adjustment of motion on display.
  • 2. The motion vector generation apparatus according to claim 1, wherein the processing circuitry configured to: execute a non-rigid vector extraction process in which the processing circuitry extracts a non-rigid motion vector component included in a difference between the motion vector and a reduced motion vector that is a motion vector scaled using the first parameter; execute a second parameter generation process in which the processing circuitry generates a second parameter using the reduced motion vector and the non-rigid motion vector component, the second parameter being a parameter for scaling the non-rigid motion vector component for compensating for a motion lost due to reduction of the motion vector with the first parameter with the non-rigid motion vector component; and execute a motion vector combining process in which the processing circuitry obtains a combined vector by adding the non-rigid motion vector component scaled using the second parameter and the reduced motion vector.
  • 3. The motion vector generation apparatus according to claim 1, wherein the first parameter generation process includes: an unnaturalness estimation process in which the processing circuitry estimates a third parameter, which is a coefficient for scaling the motion vector scaled by the first parameter to be the perceptual amount of motion, using a smallest value of a distance between a feature vector representing a perceptual representation of the warped image and a feature vector representing a perceptual representation of the projection result reproduction image which are obtained by applying a perceptual model, and obtains the smallest value as an unnaturalness estimate; and a first parameter update process in which the processing circuitry updates the first parameter such that the perceptual difference is closest to a predetermined threshold.
  • 4. The motion vector generation apparatus according to claim 2, wherein the first parameter generation process includes a first parameter smoothing process in which the processing circuitry smooths a first parameter λ′(t0) of a current frame that has been smoothed in a spatial direction in a temporal direction using a first parameter λ″(t0−1) of an immediately previous frame that has been smoothed in spatial and temporal directions and a first value, based on a magnitude relationship between the first value and a difference between the first parameter λ″(t0−1) and the first parameter λ′(t0), and the second parameter generation process includes a second parameter smoothing process in which the processing circuitry smooths a second parameter λ′2(t0) of the current frame that has been smoothed in the spatial direction in the temporal direction using a second parameter λ″2(t0−1) of the immediately previous frame that has been smoothed in the spatial and temporal directions and a second value, based on a magnitude relationship between the second value and a difference between the second parameter λ″2(t0−1) and the second parameter λ′2(t0).
  • 5. The motion vector generation apparatus according to claim 1, wherein the processing circuitry configured to: execute NP pieces of nth parameter generation processes, where NP is an integer of 3 or more; execute NP pieces of nth motion vector combining processes; and execute a non-rigid vector extraction process in which the processing circuitry extracts NP bandpass components included in a difference between the motion vector and a reduced motion vector that is a motion vector scaled using the first parameter, wherein in the second parameter generation process the processing circuitry generates a second parameter using the reduced motion vector and a first bandpass component, the second parameter being a parameter for scaling the first bandpass component for compensating for a motion lost due to reduction of the motion vector with the first parameter with the first bandpass component, in the second motion vector combining process the processing circuitry obtains a second combined vector by adding the first bandpass component scaled using the second parameter and the reduced motion vector, in the nth parameter generation process the processing circuitry, using an (n−1)th combined vector and an (n−1)th bandpass component, generates an nth parameter that is a parameter for scaling the (n−1)th bandpass component when a motion lost due to reduction of the motion vector with the first parameter is compensated for with the (n−1)th bandpass component, where n=3, 4, . . . , NP+1, and in the nth motion vector combining process the processing circuitry obtains an nth combined vector by adding the (n−1)th bandpass component scaled using the nth parameter and the (n−1)th combined vector, where n=3, 4, . . . , NP+1.
  • 6. The motion vector generation apparatus according to claim 3, wherein the unnaturalness estimation process includes a third parameter estimation process in which the processing circuitry obtains the third parameter by representing a distortion of an image that is obtained by photographing the projection target by a linear approximation using gradient information of the image, the distortion being made based on the motion vector scaled by the first parameter.
  • 7. The motion vector generation apparatus according to claim 1, wherein in the first parameter generation process the processing circuitry generates the first parameter such that a perceptual magnitude of motion obtained by the scaled motion vector relative to a magnitude of the motion vector before being scaled does not fall below a predetermined threshold.
  • 8. A projection image generation apparatus that generates a projection image using the motion vector generated by the motion vector generation apparatus according to claim 1, the projection image generation apparatus comprising processing circuitry configured to: execute a projection image generation process in which the processing circuitry distorts an image that is obtained by photographing the projection target based on the scaled motion vector to obtain a distorted image, obtains an ideal projection image for reproducing the distorted image, limits the ideal projection image to a physically projectable range of a projection device, and maps the limited projection image to a coordinate system of the projection device based on mapping to coordinates of the projection device from coordinates of an imaging device.
  • 9. A motion vector generation method for automatically adjusting a motion, implemented by a motion vector generation apparatus that includes a camera, a projector, and processing circuitry, the method comprising: a first parameter generation step in which the processing circuitry generates a first parameter that is a parameter for scaling a motion vector based on a perceptual difference between a projection result reproduction image which is an image that is obtained by photographing a projection target onto which a projection image obtained based on the motion vector has been projected and a warped image which is an image generated by distorting an image obtained by photographing the projection target by a perceptual amount of motion perceived by viewing the projection result reproduction image from a camera; and a motion vector reduction step in which the processing circuitry scales the motion vector using the first parameter; and a displaying step in which the processing circuitry displays, based on the scaled motion vector, the projection result reproduction image using the projector.
  • 10. A non-transitory computer-readable storage medium that stores a computer-executable program for causing a computer to function as the motion vector generation apparatus according to claim 1 or the projection image generation apparatus according to claim 8.
Priority Claims (1)
Number Date Country Kind
2018-221942 Nov 2018 JP national
PCT Information
Filing Document Filing Date Country Kind
PCT/JP2019/044619 11/14/2019 WO
Publishing Document Publishing Date Country Kind
WO2020/110738 6/4/2020 WO A
US Referenced Citations (8)
Number Name Date Kind
10571794 Kawabe et al. Feb 2020 B2
20040252230 Winder Dec 2004 A1
20060257048 Lin Nov 2006 A1
20140218569 Tsubaki Aug 2014 A1
20140292817 Iversen Oct 2014 A1
20170006284 Gokhale Jan 2017 A1
20190124332 Lim Apr 2019 A1
20200150521 Kawabe et al. May 2020 A1
Foreign Referenced Citations (3)
Number Date Country
2557466 Nov 1996 JP
2015163317 Oct 2015 NO
WO-2020077198 Apr 2020 WO
Non-Patent Literature Citations (2)
Entry
Search machine translation: Low-frequency Replacement Circuit For MUSE Decoder of JP 2557466 B2 to Ryuichi, retrieved May 10, 2023, 7 pages. (Year: 2023).
Taiki Fukiage et al., “A model of V1 metamer can explain perceived deformation of a static object induced by light projection”, Vision Sciences Society, Florida, U. S. A., May 2016.
Related Publications (1)
Number Date Country
20210398293 A1 Dec 2021 US