This nonprovisional application claims priority under 35 U.S.C. §119(a) on Patent Application No. 2010-243196 filed in Japan on Oct. 29, 2010, the entire contents of which are hereby incorporated by reference.
1. Field of the Invention
The present invention relates to an image processing apparatus, an image processing method, and a program, which perform image processing. In addition, the present invention relates to an image pickup apparatus such as a digital camera.
2. Description of Related Art
There is proposed a method of generating an image as illustrated in
In addition, there is also proposed a method in which the display screen is divided into a plurality of display regions, and a plurality of frames constituting the moving image are displayed as multi-display using the plurality of divided display regions as illustrated in
If the position of the target object changes little in the moving image, as in a case where the target object is a person swinging a golf club, images of the target object at different time points overlap on the stroboscopic image as illustrated in
According to the multi-display method illustrated in
An image processing apparatus according to the present invention includes a region setting portion that sets a clipping region as an image region in each input image based on image data of an input image sequence consisting of a plurality of input images, a clipping process portion that extracts an image within the clipping region as a clipped image from each of a plurality of target input images included in the plurality of input images, and an image combining portion that arranges and combines a plurality of clipped images that are extracted.
An image pickup apparatus according to the present invention, which obtains an input image sequence consisting of a plurality of input images from a result of sequential photographing using an image sensor, includes a vibration correcting portion that reduces vibration of a subject among the input images due to movement of the image pickup apparatus based on a detection result of the movement, a region setting portion that sets a clipping region as an image region on each input image based on the detection result of movement, a clipping process portion that extracts an image within the clipping region as a clipped image from each of a plurality of target input images included in the plurality of input images, and an image combining portion that arranges and combines a plurality of clipped images that are extracted.
An image processing method according to the present invention includes a region setting step that sets a clipping region as an image region on each input image based on image data of an input image sequence consisting of a plurality of input images, a clipping process step that extracts an image within the clipping region as a clipped image from each of a plurality of target input images included in the plurality of input images, and an image combining step that arranges and combines a plurality of clipped images that are extracted.
Further, it is preferred to form a program for a computer to perform the above-mentioned region setting step, the clipping process step, and the image combining step.
Hereinafter, examples of embodiments of the present invention are described specifically with reference to the attached drawings. In the drawings to be referred to, the same part is denoted by the same numeral or symbol, and overlapping description of the same part is omitted as a rule.
A first embodiment of the present invention is described below.
An imaging portion 11 includes, in addition to an image sensor 33, an optical system, an aperture stop, and a driver (not shown). The image sensor 33 is constituted of a plurality of light receiving pixels arranged in horizontal and vertical directions. The image sensor 33 is a solid-state image sensor constituted of a charge coupled device (CCD), a complementary metal oxide semiconductor (CMOS) image sensor, or the like. Each light receiving pixel of the image sensor 33 performs photoelectric conversion of an optical image of a subject entering through the optical system and the aperture stop, and outputs an electric signal obtained by the photoelectric conversion to an analog front end (AFE) 12. Lenses of the optical system form the optical image of the subject on the image sensor 33.
The AFE 12 amplifies an analog signal output from the image sensor 33 (each light receiving pixel) and converts the amplified analog signal into a digital signal, which is output to an image signal processing portion 13. An amplification degree of the signal amplification in the AFE 12 is controlled by a central processing unit (CPU) 23. The image signal processing portion 13 performs necessary image processing on the image expressed by the output signal of the AFE 12 and generates an image signal of the image after the image processing. A microphone 14 converts ambient sounds of the image pickup apparatus 1 into an analog sound signal, and a sound signal processing portion 15 converts the analog sound signal into a digital sound signal.
A compression processing portion 16 compresses the image signal from the image signal processing portion 13 and the sound signal from the sound signal processing portion 15 using a predetermined compression method. An internal memory 17 is constituted of dynamic random access memory (DRAM) and temporarily stores various data. An external memory 18 as a recording medium is a nonvolatile memory such as a semiconductor memory or a magnetic disk, which records the image signal and the sound signal after compression by the compression processing portion 16 in association with each other.
An expansion processing portion 19 expands the compressed image signal and sound signal read out from the external memory 18. The image signal after expansion by the expansion processing portion 19 or the image signal from the image signal processing portion 13 is sent to the display portion 27 constituted of a liquid crystal display or the like via a display processing portion 20 and is displayed as an image. In addition, the sound signal after expansion by the expansion processing portion 19 is sent to the speaker 28 via a sound output circuit 21 and is output as sounds.
A timing generator (TG) 22 generates timing control signals for timing control of individual actions in the entire image pickup apparatus 1 and supplies the generated timing control signals to the individual portions of the image pickup apparatus 1. The timing control signals include a vertical synchronizing signal Vsync and a horizontal synchronizing signal Hsync. The CPU 23 integrally controls actions of the individual portions of the image pickup apparatus 1. An operating portion 26 includes a record button 26a for instructing start and finish of photographing and recording a moving image, a shutter button 26b for instructing to photograph and record a still image, an operation key 26c, and the like, so as to accept various operations by a user. The operation with the operating portion 26 is sent to the CPU 23.
Action modes of the image pickup apparatus 1 include a photography mode in which an image (still image or moving image) can be taken and recorded, and a reproduction mode in which the image recorded in the external memory 18 (still image or moving image) is reproduced and displayed on the display portion 27. Transition between the modes is performed in accordance with the operation with the operation key 26c.
In the photography mode, subjects are photographed sequentially, and photographed images of the subjects are obtained sequentially. A digital image signal expressing the image is referred to also as image data.
Note that because compression and expansion of the image data are not essentially related to the present invention, compression and expansion of the image data are ignored in the following description (in other words, for example, recording compressed image data is simply expressed as recording image data). In addition, in this specification, image data of an image may be simply referred to as an image. In addition, in this specification, when simply referred to as a display or a display screen, it means the display or the display screen of the display portion 27.
The image pickup apparatus 1 has an image combining function of combining a plurality of input images arranged in time sequence.
The image processing portion 50 is supplied with image data of an input image sequence. The image sequence such as the input image sequence means a series of a plurality of images arranged in time sequence. Therefore, the input image sequence is constituted of a plurality of input images arranged in time sequence. The image sequence can be read as a moving image. For instance, the input image sequence is a moving image including a plurality of input images arranged in time sequence as a plurality of frames. The input image is, for example, a photographed image expressed by the output signal itself of the AFE 12, or an image obtained by performing a predetermined image processing (such as a demosaicing process or a noise reduction process) on a photographed image expressed by the output signal itself of the AFE 12. An arbitrary image sequence recorded in the external memory 18 can be read out from the external memory 18 and supplied to the image processing portion 50 as the input image sequence. For instance, a manner in which a subject swings a golf club or a baseball bat is photographed as a moving image by the image pickup apparatus 1 and is recorded in the external memory 18. Then, the recorded moving image can be supplied as the input image sequence to the image processing portion 50. Note that the input image sequence can be supplied from an external arbitrary portion other than the external memory 18. For instance, the input image sequence may be supplied to the image processing portion 50 via communication from external equipment (not shown) of the image pickup apparatus 1.
A region setting portion 51 sets a clipping region as an image region in the input image based on image data of the input image sequence so as to generate and output clipping region information indicating a position and a size of the clipping region. A position of the clipping region indicated by the clipping region information is, for example, a center position or a barycenter position of the clipping region. A size of the clipping region indicated by the clipping region information is, for example, a size of the clipping region in the horizontal and the vertical directions. If the clipping region has a shape other than a rectangle, the clipping region information contains information for specifying the shape of the clipping region.
A clipping process portion 52 extracts an image within the clipping region as the clipped image from the input image based on the clipping region information (in other words, an image within the clipping region is clipped as the clipped image from the input image). The clipped image is a part of the input image. Hereinafter, a process of generating the clipped image from the input image based on the clipping region information is referred to as a clipping process. The clipping process is performed on a plurality of input images, and hence a plurality of clipped images are obtained. Similarly to the plurality of input images, the plurality of clipped images are also arranged in time sequence. Therefore, the plurality of clipped images can be called a clipped image sequence.
An image combining portion 53 combines a plurality of clipped images and outputs the image obtained by combining as an output combined image. The output combined image can be displayed on the display screen of the display portion 27, and image data of the output combined image can be recorded in the external memory 18.
The image combining function can be realized in the reproduction mode. The reproduction mode for realizing the image combining function is split into a plurality of combining modes. When the user issues an instruction to select one of the plurality of combining modes to the image pickup apparatus 1, an action in the selected combining mode is performed. The user can issue an arbitrary instruction with the operating portion 26 to the image pickup apparatus 1. A so-called touch panel may be included in the operating portion 26. The plurality of combining modes may include a first combining mode that can also be called a multi-window combining mode. Hereinafter, the first embodiment describes an action of the image pickup apparatus 1 in the first combining mode.
It is supposed that the input image sequence supplied to the image processing portion 50 is an input image sequence 320 illustrated in
In Step S12, the user selects a combination start frame using the operating portion 26. When the combination start frame is selected, as illustrated in
In the next Step S13, the user selects a combination end frame using the operating portion 26. When the combination end frame is selected, as illustrated in
The combination start frame and the combination end frame are input images of the input image sequence 320, and the input image as the combination end frame is an input image photographed after the combination start frame. Here, as illustrated in
After selecting the combination start frame and the combination end frame, the user can designate a combining condition using the operating portion 26 in Step S14. For instance, the number of images to be combined for obtaining the output combined image (hereinafter, referred to as combining number CNUM) or the like can be designated. The combining condition may be set in advance, and in this case, the designation in Step S14 may be omitted. Meaning of the combining condition will be apparent from later description. The process in Step S14 may be performed before Steps S12 and S13.
Not all of the input images F[n] to F[n+m] belonging to the combination target period necessarily contribute to formation of the output combined image. An input image that contributes to formation of the output combined image among the input images F[n] to F[n+m] is referred to particularly as a target input image. There are a plurality of target input images, and the first target input image is the input image F[n]. The user can designate a sampling interval, which is one type of the combining condition, in Step S14. However, the sampling interval may be set in advance. The sampling interval is a photographing time point interval between two target input images that are temporally neighboring to each other. For instance, if the sampling interval is Δt×i (see
After the value of m and the combining number CNUM are determined, the sampling interval and the target input images may be set based on the determined value of m and the combining number CNUM. For instance, if m=8 and CNUM=5 are determined, the sampling interval is set to Δt×(m/(CNUM−1)), namely (Δt×2). As a result, the input images F[n], F[n+2], F[n+4], F[n+6] and F[n+8] are extracted as the target input images.
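The following is a minimal sketch (in Python; the function name and the frame-index representation are assumptions for illustration) of selecting the target input images from the combination target period, given m and the combining number CNUM.

```python
def select_target_indices(n, m, c_num):
    """Return the frame indices of the target input images F[n] to F[n+m].

    The sampling interval, in frames, is m / (CNUM - 1), so that the first
    target input image is F[n] and the last is F[n+m].
    """
    step = m / (c_num - 1)  # frame interval between neighboring target images
    return [n + round(k * step) for k in range(c_num)]

# Example from the text: m = 8 and CNUM = 5 give a sampling interval of
# two frames, namely F[n], F[n+2], F[n+4], F[n+6] and F[n+8].
print(select_target_indices(n=0, m=8, c_num=5))  # [0, 2, 4, 6, 8]
```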
After the processes of Steps S12 to S14, the processes of Steps S15 to S17 are performed sequentially. Specifically, the region setting portion 51 performs the clipping region setting process in Step S15, the clipping process portion 52 performs the clipping process in Step S16, and the image combining portion 53 performs the combining process in Step S17, so as to generate the output combined image (see
[S15: Clipping Region Setting]
The clipping region setting process in Step S15 is described below.
First in Step S21, the region setting portion 51 extracts or generates a background image. Input images that do not belong to the combination target period among input images of the input image sequence 320 are regarded as candidate background images, and one of the plurality of candidate background images can be extracted as the background image. The plurality of candidate background images include input images F[1] to F[n−1], and may further include input image F[n+m+1], F[n+m+2], and so on. The region setting portion 51 can select the background image from the plurality of candidate background images based on image data of the input image sequence 320. It is possible to adopt a structure in which the user manually selects the background image from the plurality of candidate background images.
It is preferred to select an input image having no moving object region as the background image. On a moving image consisting of a plurality of input images, an object that is moving is referred to as a moving object, and an image region in which image data of the moving object exists is referred to as a moving object region.
For instance, the region setting portion 51 is formed so that the region setting portion 51 can perform a movement detection processing. In the movement detection processing, based on image data of two input images that are temporally neighboring to each other, an optical flow between the two input images is derived. As known well, the optical flow between the two input images is a bundle of motion vectors of objects between the two input images. The motion vector of an object between two input images indicates a direction and a size of a motion of the object between the two input images.
A size of the motion vector corresponding to the moving object region is larger than that of a region other than the moving object region. Therefore, it is possible to estimate whether or not a moving object exists in the plurality of input images from the optical flows of the plurality of input images. For example, the movement detection processing is performed on the input images F[1] to F[n−1] so as to derive the optical flow between the input images F[1] and F[2], the optical flow between the input images F[2] and F[3], . . . , and the optical flow between the input images F[n−2] and F[n−1]. Then, based on the derived optical flows, an input image that is estimated to have no moving object can be extracted from the input images F[1] to F[n−1]. The extracted input image (the input image that is estimated to have no moving object) can be selected as the background image.
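The following is a minimal sketch of such a movement detection processing, assuming grayscale conversion, OpenCV's Farneback dense optical flow, and a simple motion-magnitude threshold (the threshold value and the choice of the Farneback method are assumptions; the text allows any way of deriving the optical flow).

```python
import cv2
import numpy as np

def has_moving_object(img_a, img_b, motion_threshold=2.0):
    """Estimate whether a moving object exists between two neighboring input images."""
    gray_a = cv2.cvtColor(img_a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(img_b, cv2.COLOR_BGR2GRAY)
    # Dense optical flow: one motion vector per pixel position.
    flow = cv2.calcOpticalFlowFarneback(gray_a, gray_b, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    magnitude = np.linalg.norm(flow, axis=2)  # size of each motion vector
    return float(magnitude.max()) > motion_threshold
```

An input image for which has_moving_object returns False against its neighbors could then be selected as the background image.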
In addition, for example, it is possible to generate the background image by the background image generating process using the plurality of input images. A method of the background image generating process is described below with reference to
In the background image generating process, a background pixel extraction process is performed for each pixel position. The background pixel extraction process performed on a pixel position (x, y) is described below. In the background pixel extraction process, the region setting portion 51 first sets the input image G[1] to a reference image and sets each of the input images G[2] to G[5] to a non-reference image. Then, the region setting portion 51 performs a differential calculation for each non-reference image. Here, the differential calculation means a calculation of determining an absolute value of a difference between a pixel signal of the reference image at the pixel position (x, y) and a pixel signal of the non-reference image at the pixel position (x, y) as a difference factor value. The pixel signal means a signal of a pixel, and a value of the pixel signal is also referred to as a pixel value. As the pixel signal in the differential calculation, a luminance signal can be used, for example.
If the input image G[1] is the reference image, the differential calculation for each non-reference image derives,
a difference factor value VAL[1, 2] based on the pixel signal of the input image G[1] at the pixel position (x, y) and the pixel signal of the input image G[2] at the pixel position (x, y),
a difference factor value VAL[1, 3] based on the pixel signal of the input image G[1] at the pixel position (x, y) and the pixel signal of the input image G[3] at the pixel position (x, y),
a difference factor value VAL[1, 4] based on the pixel signal of the input image G[1] at the pixel position (x, y) and the pixel signal of the input image G[4] at the pixel position (x, y), and
a difference factor value VAL[1, 5] based on the pixel signal of the input image G[1] at the pixel position (x, y) and the pixel signal of the input image G[5] at the pixel position (x, y).
The region setting portion 51 performs the differential calculation for each non-reference image while switching the input image to be set to the reference image from the input image G[1] to the input images G[2], G[3], G[4], and G[5] sequentially (the input images other than the reference image are set to the non-reference images). Thus, a difference factor value VAL[i, j] based on the pixel signal of the input image G[i] at the pixel position (x, y) and the pixel signal of the input image G[j] at the pixel position (x, y) is determined for every combination of variables i and j satisfying 1≦i≦5 and 1≦j≦5 (here, i and j are different integers).
The region setting portion 51 determines a sum of four difference factor values VAL[i, j] determined in a state where the input image G[i] is set to the reference image, as a difference integrated value SUM[i]. The difference integrated value SUM[i] is derived for each of the input images G[1] to G[5]. Therefore, five difference integrated values SUM[1] to SUM[5] are determined for the pixel position (x, y). The region setting portion 51 specifies a minimum value within the difference integrated values SUM[1] to SUM[5], and sets a pixel and a pixel signal of the input image at the pixel position (x, y) corresponding to the minimum value as a pixel and a pixel signal of the background image 330 at the pixel position (x, y). In other words, for example, if the difference integrated value SUM[4] is minimum among the difference integrated values SUM[1] to SUM[5], a pixel and a pixel signal of the input image G[4] at the pixel position (x, y) corresponding to the difference integrated value SUM[4] are set to a pixel and a pixel signal of the background image 330 at the pixel position (x, y).
The moving object region of the example illustrated in
As described above, in the background image generating process, the background pixel extraction process is performed for each pixel position. Therefore, the same processes as described above are performed sequentially for pixel positions other than the pixel position (x, y), and finally the pixel signal is determined at every pixel position of the background image 330 (namely, generation of the background image 330 is completed). Note that according to the action described above, the difference factor value VAL[i, j] and the difference factor value VAL[j, i] are calculated individually, but the values thereof are the same. Therefore, it is actually sufficient if one of them is calculated. In addition, the background image is generated from five input images in the example illustrated in
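The background pixel extraction process described above can be summarized by the following sketch, assuming the input images G[1] to G[5] are given as equally sized grayscale (luminance) arrays; the vectorized formulation computes both VAL[i, j] and VAL[j, i], which is redundant but harmless, as noted above.

```python
import numpy as np

def generate_background(images):
    """For each pixel position, adopt the pixel of the input image whose
    difference integrated value SUM[i] (the sum of absolute differences
    against all other input images at that position) is minimal."""
    stack = np.stack([img.astype(np.int32) for img in images])  # (N, H, W)
    # SUM[i] at each pixel position: sum over j of |G[i] - G[j]|.
    diff_sums = np.abs(stack[:, None, :, :] - stack[None, :, :, :]).sum(axis=1)
    best = diff_sums.argmin(axis=0)  # index of the minimal SUM per pixel
    return np.take_along_axis(stack, best[None, :, :], axis=0)[0].astype(np.uint8)
```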
In Step S22 (see
The region setting portion 51 generates a difference image between the background image and the target input image for each target input image, and performs thresholding (i.e., binarization) on the generated difference image so as to generate a binary difference image. In
The moving object regions 361 to 363 are illustrated on the binary difference images 351 to 353 in
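A minimal sketch of this step is given below, assuming grayscale images and an illustrative threshold value (the actual threshold used in Step S22 is not specified in the text).

```python
import numpy as np

def binary_difference(background, target, threshold=30):
    """Threshold the difference image between the background image and a
    target input image; nonzero pixels form the moving object region."""
    diff = np.abs(target.astype(np.int16) - background.astype(np.int16))
    return (diff > threshold).astype(np.uint8)  # 1 = moving object region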
After that, in Step S23 (see
As illustrated in
As illustrated in
As illustrated in
As illustrated in
A region (white region) 441 illustrated in
In Step S23 (see
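One plausible reading of Step S23 is sketched below, under the assumption that the common clipping region is a rectangle covering the union of the moving object regions detected in all the target input images (the exact rule is described with reference to the figures, so this is an interpretation).

```python
import numpy as np

def clipping_region_from_masks(binary_masks):
    """binary_masks: list of H x W arrays (1 = moving object region).
    Returns (left, top, width, height) of a common clipping region that
    covers the union of all moving object regions."""
    union = np.zeros_like(binary_masks[0])
    for mask in binary_masks:
        union |= mask
    ys, xs = np.nonzero(union)
    left, top = int(xs.min()), int(ys.min())
    return left, top, int(xs.max()) - left + 1, int(ys.max()) - top + 1
```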
The background image is used for setting the clipping region in the method according to Steps S21 to S23 of
Alternatively, for example, it is possible to set the clipping region by performing the processes of Steps S31 and S32 of
In Step S31, the region setting portion 51 detects, as a specific object region (or a specific subject region), an image region where an object of a specific type exists in the target input image based on the image data of the target input image. The detection of the specific object region can be performed for each target input image. The object of the specific type means an object of a type that is registered in advance, which is, for example, an arbitrary person or a registered person. If the object of the specific type is a registered person, it is possible to detect the specific object region by a face recognition process based on the image data of the target input image. In the face recognition process, if there is a person's face in the target input image, it is possible to distinguish whether or not the face is a registered person's face. As the detection method of the specific object region, an arbitrary detection method including a known detection method can be used. For instance, the specific object region can be detected by using a face detection process for detecting a person's face from the target input image, and a region splitting process for distinguishing the image region where image data of the whole body of the person exists from other image regions while utilizing a result of the face detection process.
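As one concrete example of such a known detection method (the use of OpenCV's Haar cascade face detector is an assumption, not the method of the text), the face detection step could look like the following sketch; a region splitting step covering the whole body would follow in a fuller implementation.

```python
import cv2

def detect_face_regions(target_image):
    """Detect candidate person-face rectangles (x, y, w, h) in a target input image."""
    gray = cv2.cvtColor(target_image, cv2.COLOR_BGR2GRAY)
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
```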
In Step S32, the region setting portion 51 sets the clipping region based on the specific object region detected in Step S31. The setting method of the clipping region based on the specific object region is the same as the setting method of the clipping region based on the moving object region described above. In other words, for example, if the plurality of target input images extracted from the input images F[n] to F[n+m] are the images 341 to 343 illustrated in
Note that the object of the specific type that is noted when this combining mode is used is usually a moving object, and therefore the specific object region can be regarded as the moving object region. Hereinafter, for convenience of description, the specific object region is also regarded as one type of the moving object region, and it is supposed that the specific object regions detected from the target input images 341 to 343 coincide with the moving object regions 361 to 363, respectively. In addition, in the following description, it is supposed that the clipping region is a rectangular region unless otherwise noted.
[S16: Clipping Process]
The clipping process in Step S16 of
A position, a size, and a shape of the clipping region are basically the same among all the target input images. However, the position of the clipping region in the target input image may differ among different target input images. The position of the clipping region in the target input image means a center position or a barycenter position of the clipping region in the target input image. The size of the clipping region means a size of the clipping region in the horizontal and vertical directions.
It is supposed that the plurality of target input images extracted from the input images F[n] to F[n+m] includes the images 341 to 343 illustrated in
In
However, the positions 472C and 473C illustrated in
[S17: Combining Process]
The combining process in Step S17 of
An image 500 illustrated in
The arrangement of the clipped images on the output combined image as illustrated in
A method of determining the arrangement of the clipped images in accordance with an aspect ratio of the output combined image (namely a method of determining the arrangement of the clipped images in the state where the aspect ratio of the output combined image is fixed) is described below. The aspect ratio of the output combined image means a ratio between the number of pixels in the horizontal direction of the output combined image and the number of pixels in the vertical direction of the output combined image. Here, it is supposed that the aspect ratio of the output combined image is 4:3. In other words, it is supposed that the number of pixels in the horizontal direction of the output combined image is 4/3 times the number of pixels in the vertical direction of the output combined image. In addition, the numbers of pixels in the horizontal and the vertical directions of the clipping region set in Step S15 of
(HNUM×HCUTSIZE):(VNUM×VCUTSIZE)=4:3 (1)
For instance, if (HCUTSIZE:VCUTSIZE)=(128:240), HNUM:VNUM=5:2 holds in accordance with the expression (1). In this case, if CNUM=HNUM×VNUM=10 holds, HNUM=5 and VNUM=2 hold, and hence the output combined image 500 illustrated in
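The derivation of HNUM and VNUM from expression (1) can be sketched as follows (the function name and the scaling step for larger combining numbers are assumptions for illustration).

```python
from math import gcd

def arrangement_from_aspect(h_cut, v_cut, c_num, aspect=(4, 3)):
    """Return (HNUM, VNUM) satisfying (HNUM*h_cut):(VNUM*v_cut) = aspect and
    HNUM*VNUM = c_num, assuming such an integer arrangement exists."""
    # HNUM/VNUM = (aspect_w * v_cut) / (aspect_h * h_cut), reduced to lowest terms.
    num, den = aspect[0] * v_cut, aspect[1] * h_cut
    g = gcd(num, den)
    h_ratio, v_ratio = num // g, den // g  # e.g. 5:2 for a 128 x 240 clipping region
    scale = round((c_num / (h_ratio * v_ratio)) ** 0.5)
    return h_ratio * scale, v_ratio * scale

# Example from the text: HCUTSIZE=128, VCUTSIZE=240 and CNUM=10 give (5, 2).
print(arrangement_from_aspect(128, 240, 10))  # (5, 2)
```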
A method of determining the arrangement of the clipped images in accordance with an image size of the output combined image (namely, a method of determining the arrangement of the clipped images in a state where the image size of the output combined image is fixed) is described below. The image size of the output combined image is expressed by the number of pixels HOSIZE in the horizontal direction of the output combined image and the number of pixels VOSIZE in the vertical direction of the output combined image. The image combining portion 53 can determine the numbers HNUM and VNUM in accordance with the following expressions (2) and (3).
HNUM=HOSIZE/HCUTSIZE (2)
VNUM=VOSIZE/VCUTSIZE (3)
For instance, if CNUM=HNUM×VNUM=10, (HOSIZE, VOSIZE)=(640, 480), and (HCUTSIZE, VCUTSIZE)=(128, 240) hold, HNUM=5 and VNUM=2 are satisfied, because HOSIZE/HCUTSIZE=640/128=5, and VOSIZE/VCUTSIZE=480/240=2 hold. Therefore, the output combined image 500 illustrated in
If the right sides of the expressions (2) and (3) are non-integer real numbers, an integer value HINT obtained by rounding off the right side of the expression (2) and an integer value VINT obtained by rounding off the right side of the expression (3) are substituted into HNUM and VNUM, respectively. Then, the clipping region may be set again so that "HINT=HOSIZE/HCUTSIZE" and "VINT=VOSIZE/VCUTSIZE" are satisfied (namely, the clipping region that is once set is enlarged or reduced). For instance, if CNUM=HNUM×VNUM=10 and (HOSIZE, VOSIZE)=(640, 480) are satisfied, and if (HCUTSIZE, VCUTSIZE)=(130, 235) is satisfied for the clipping region that is once set, the right sides of the expressions (2) and (3) are approximately 4.92 and approximately 2.04, respectively. In this case, HINT=5 is substituted into HNUM, and VINT=2 is substituted into VNUM. Then, the clipping region is set again so that "HINT=HOSIZE/HCUTSIZE" and "VINT=VOSIZE/VCUTSIZE" are satisfied. As a result, the numbers of pixels of the clipping region set again in the horizontal and the vertical directions are 128 and 240, respectively. If the clipping region is set again, the clipped image is generated by using the clipping region set again so that the output combined image is generated.
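The image-size-based determination, including the rounding and the re-setting of the clipping region, can be sketched as follows.

```python
def arrangement_from_output_size(h_osize, v_osize, h_cut, v_cut):
    """Apply expressions (2) and (3); when the quotients are not integers,
    round them off and re-set the clipping region size so that the relations
    hold exactly."""
    h_num = round(h_osize / h_cut)  # HINT
    v_num = round(v_osize / v_cut)  # VINT
    new_h_cut = h_osize // h_num    # re-set clipping region width
    new_v_cut = v_osize // v_num    # re-set clipping region height
    return (h_num, v_num), (new_h_cut, new_v_cut)

# Example from the text: a 640 x 480 output image and a 130 x 235 clipping
# region give HNUM=5, VNUM=2 and a re-set clipping region of 128 x 240 pixels.
print(arrangement_from_output_size(640, 480, 130, 235))  # ((5, 2), (128, 240))
```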
Note that in the flowchart illustrated in
[Increasing and Decreasing of Combining Number]
A user can instruct to change the combining number CNUM set once by the user or the combining number CNUM set automatically by the image pickup apparatus 1. The user can issue an instruction to change the combining number CNUM at an arbitrary timing. For instance, after the output combined image in a state of CNUM=10 is generated and displayed, if the user wants to generate and display the output combined image in a state of CNUM=20, the user can instruct to increase the combining number CNUM from 10 to 20 by a predetermined operation with the operating portion 26. On the contrary, the user can also instruct to decrease the combining number CNUM.
A first increasing or decreasing method of the combining number CNUM is described below.
A second increasing or decreasing method of the combining number CNUM is described below.
The sampling interval is not changed in the second increasing or decreasing method. However, it is possible to combine the first and the second increasing or decreasing methods. In other words, for example, when the user instructs to increase the combining number CNUM, the decreasing of the sampling interval according to the first increasing or decreasing method and the increasing of the combination target period according to the second increasing or decreasing method may be performed simultaneously. Alternatively, when the user instructs to decrease the combining number CNUM, it is possible to perform simultaneously the increasing of the sampling interval according to the first increasing or decreasing method and the decreasing of the combination target period according to the second increasing or decreasing method.
As described above, in this embodiment, the clipped images of the moving object are arranged in the horizontal or the vertical direction and are combined so that the output combined image is generated. Therefore, even if the position of the moving object changes little in the moving image, as in a case where the moving object is a person who swings a golf club, the moving objects at different time points do not overlap with each other on the output combined image. As a result, a movement of the moving object can be checked more easily than on the stroboscopic image as illustrated in
A second embodiment of the present invention is described below. The second embodiment and a third embodiment described later are embodiments on the basis of the first embodiment. The description of the first embodiment is applied also to the second and the third embodiments unless otherwise noted in the second and the third embodiments, as long as no contradiction arises. The plurality of combining modes described above in the first embodiment can include a second combining mode that is also referred to as a synchronized combining mode. Hereinafter, an action of the image pickup apparatus 1 in the second combining mode is described in the second embodiment.
In the second combining mode, a plurality of input image sequences are used for generating the output combined image. Here, for specific description, a method of using two input image sequences is described below.
In the image processing portion 50, processes of Steps S12 to S17 illustrated in
In each of the intermediate combined images 561 and 562, HNUM (the number of clipped images arranged in the horizontal direction) is two or larger, and VNUM (the number of clipped images arranged in the vertical direction) is one. In other words, the intermediate combined image 561 is generated by arranging the plurality of clipped images based on the input image sequence 551 in the horizontal direction and combining them. The intermediate combined image 562 is generated by arranging the plurality of clipped images based on the input image sequence 552 in the horizontal direction and combining them. Basically, the sampling interval and the combining number CNUM are the same between the input image sequences 551 and 552, but they may be different between the input image sequences 551 and 552. In the example illustrated in
The image combining portion 53 illustrated in
The output combined image 570 can be displayed on the display screen of the display portion 27, and thus a viewer of the display screen can easily compare a movement of the moving object on the input image sequence 551 with a movement of the moving object on the input image sequence 552. For instance, it is possible to compare a golf swing form in detail between the moving objects on the former and the latter.
When the output combined image 570 is displayed, it is possible to use the resolution conversion as necessary so as to display the entire output combined image 570 at one time. It is also possible to perform the following scroll display. For instance, the display processing portion 20 illustrated in
A state where the left end of the extraction frame 580 coincides with the left end of the output combined image 570 is set as a start point. Then, a position of the extraction frame 580 is moved sequentially at a constant interval until the right end of the extraction frame 580 coincides with the right end of the output combined image 570, and the scroll image is extracted every time the frame is moved. In the scroll display, a plurality of scroll images obtained in this manner are arranged in time sequence order and displayed as a moving image 585 on the display portion 27 (see
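The scroll image extraction can be sketched as follows, assuming the output combined image is held as an array and that the movement interval of the extraction frame is an arbitrary constant (the value 16 below is an assumption).

```python
def make_scroll_frames(output_combined, frame_width, step=16):
    """Slide an extraction frame of the given width from the left end to the
    right end of the output combined image and collect one scroll image per
    frame position; displaying them in order yields the scroll moving image."""
    height, width = output_combined.shape[:2]
    frames = []
    for left in range(0, width - frame_width + 1, step):
        frames.append(output_combined[:, left:left + frame_width].copy())
    return frames
```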
Note that in the above-mentioned example, the intermediate combined image based on the input image sequence 551 and the intermediate combined image based on the input image sequence 552 are arranged in the vertical direction and are combined. However, these two intermediate combined images may instead be arranged in the horizontal direction and combined. In this case, it is preferred to generate each intermediate combined image by arranging and combining the plurality of clipped images of the corresponding input image sequence in the vertical direction, and then to arrange the two intermediate combined images in the horizontal direction and combine them so that a final output combined image is obtained.
In addition, it is possible to use three or more input image sequences to obtain the output combined image. In other words, the three or more input image sequences may be supplied to the image processing portion 50, and the intermediate combined images respectively obtained from the input image sequences may be arranged in the horizontal or the vertical direction and combined so that the final output combined image is obtained.
A third embodiment of the present invention is described below. When the input image in the above-mentioned first or second embodiment is obtained by photographing, so-called optical vibration correction or electronic vibration correction may be performed in the image pickup apparatus 1. In the third embodiment, when the input image is obtained by photographing, it is supposed that the electronic vibration correction is performed in the image pickup apparatus 1. Then, a method of setting the clipping region in conjunction with the electronic vibration correction is described below.
First, with reference to
A rectangular extraction frame 601 that is smaller than the effective pixel region 600 is set in the effective pixel region 600, and the pixel signals within the extraction frame 601 are read out so that the input image is generated. In other words, the image within the extraction frame 601 is the input image. In the following description, a position and movement of the extraction frame 601 mean a center position and movement of the extraction frame 601 in the effective pixel region 600.
If the image pickup apparatus 1 moves in the period between the time points tn and tn+1, the noted subject moves on the image sensor 33 and in the effective pixel region 600 even if the noted subject is still in real space. In other words, a position of the noted subject on the image sensor 33 and in the effective pixel region 600 moves in the period between the time points tn and tn+1. In this case, if a position of the extraction frame 601 is fixed, a position of the noted subject in the input image F[n+1] moves from the position of the noted subject in the input image F[n]. As a result, it looks as if the noted subject has moved in the input image sequence constituted of the input images F[n] and F[n+1]. A position change of the noted subject between the input images caused by such a movement, namely the movement of the image pickup apparatus 1 is referred to as an interframe vibration.
A detection result of movement of the image pickup apparatus 1 detected by the apparatus movement detecting portion 61 is referred to also as an apparatus movement detection result. The vibration correcting portion 62 illustrated in
The region within the extraction frame 601 can be referred to also as a vibration correction region. An image within the extraction frame 601 out of the entire image (the entire optical image) formed in the effective pixel region 600 of the image sensor 33 (namely, the image within the vibration correction region) corresponds to the input image. The vibration correcting portion 62 sets, based on the apparatus motion vector, a position of the extraction frame 601 when the input images F[n] and F[n+1] are obtained so as to reduce the interframe vibration of the input images F[n] and F[n+1]. The same is true for the interframe vibration between other input images.
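A hedged sketch of this correction is given below: the extraction frame is shifted within the effective pixel region so as to cancel the apparatus motion vector, and the input image is read out from inside the shifted frame (the sign convention of the motion vector and the clipping to the region boundary are assumptions).

```python
import numpy as np

def read_input_image(effective_pixels, frame_pos, frame_size, motion_vector):
    """effective_pixels: full image in the effective pixel region;
    frame_pos: (x, y) of the extraction frame's top-left corner;
    motion_vector: apparatus motion (dx, dy) between two frame periods."""
    frame_h, frame_w = frame_size
    max_x = effective_pixels.shape[1] - frame_w
    max_y = effective_pixels.shape[0] - frame_h
    # Move the frame opposite to the apparatus movement so that a subject
    # that is still in real space stays at the same position in the input image.
    new_x = int(np.clip(frame_pos[0] - motion_vector[0], 0, max_x))
    new_y = int(np.clip(frame_pos[1] - motion_vector[1], 0, max_y))
    input_image = effective_pixels[new_y:new_y + frame_h, new_x:new_x + frame_w]
    return input_image, (new_x, new_y)
```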
It is supposed that the above-mentioned reduction of the interframe vibration is performed and that the input image sequence 320 (see
The region setting portion 51 illustrated in
The region setting portion 51 sets the clipping regions within the input images 621, 622 and 623 at positions of the overlapping region 640 within the rectangular regions 631, 632 and 633, respectively. In other words, as illustrated in
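Expressed in a common coordinate system, the overlapping region 640 is simply the intersection of the rectangular regions of the target input images, as in the following sketch (the rectangle representation (left, top, right, bottom) is an assumption).

```python
def overlapping_region(rects):
    """Intersection of rectangles given as (left, top, right, bottom);
    returns None if the rectangular regions do not overlap."""
    left = max(r[0] for r in rects)
    top = max(r[1] for r in rects)
    right = min(r[2] for r in rects)
    bottom = min(r[3] for r in rects)
    if right <= left or bottom <= top:
        return None
    return left, top, right, bottom
```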
The photographer adjusts the photographing direction or the like while paying attention to the noted moving object that is to be included in the clipped image. Therefore, even if the photographing range is changed due to a shake of a hand, at least the noted moving object is usually included in the photographing range. As a result, there is a high probability that the overlapping region 640 of each target input image includes image data of the noted moving object. Therefore, in the third embodiment, the overlapping region 640 is set as the clipping region, and the clipped images obtained from the clipping regions of the target input images are arranged in the horizontal or the vertical direction and are combined so that the output combined image is generated. Therefore, the same effect as in the first embodiment can be obtained. In other words, because the moving objects at different time points do not overlap in the output combined image, a movement of the moving object can be checked more easily than on the stroboscopic image illustrated in
The embodiments of the present invention can be modified variously as necessary within the range of the technical concept described in the claims. The embodiments described above are merely examples of the embodiments of the present invention, and meanings of the present invention and terms of elements thereof are not limited to those described in the embodiments. The specific numeric values in the above description are merely examples, and they can be changed to various numeric values as a matter of course. As annotations that can be applied to the embodiments described above, Notes 1 to 3 are described below. Descriptions of the individual Notes can be combined arbitrarily as long as no contradiction arises.
[Note 1]
It is possible to form the image processing portion 50 so as to be able to realize a combining mode other than the first and second combining modes according to the first and second embodiments.
[Note 2]
The image processing portion 50 illustrated in
[Note 3]
The image pickup apparatus 1 illustrated in