The present disclosure relates generally to the field of image processing. More particularly, the present disclosure relates to methods and systems useful in the domain of panoramic image processing of images acquired from multiple viewpoints located along a linear path.
Panoramic photography may be defined generally as a photographic technique for capturing images with elongated fields of view. In recent years, static viewpoint panoramic photography, obtained by pivoting a camera around a single viewpoint, has become increasingly popular due to the development of accessible applications for handheld electronic devices. Unlike a local panorama at a static viewpoint, a multiple viewpoint panorama is constructed from partial views at consecutive viewpoints along a path. There are many challenges associated with taking high quality multiple viewpoint panoramic images. In particular, these challenges include parallax problems, i.e. problems caused by the apparent displacement of an object in the panoramic scene between consecutive captured images. These challenges also include post-processing problems, because assembling the images may be computationally intensive. Furthermore, these problems are heightened in a retail store environment, at least because the depth of field is short in the aisle of a store, and because of the high resolution required for further exploitation of the panoramic image through object recognition techniques.
In the present application, the following terms and their derivatives may be understood in light of the explanations below:
Imaging Unit
An imaging unit may be an apparatus capable of acquiring pictures of a scene. In the following it is also generally referred to as a camera and it should be understood that the term camera encompasses different types of imaging units such as standard digital cameras, electronic handheld devices including imaging sensors, etc. Advantageously, a camera may be provided with means configured to estimate a rotational change of the camera. Said means may include a gyroscope, an accelerometer and/or an image processing module capable of determining a rotational change (an orientation variation) from image to image and/or with respect to a reference orientation. In the description, the camera pinhole model may be used as a support for illustration. The intrinsic parameters of the camera may be predetermined and the camera may be calibrated.
Furthermore, in the following, it is understood that the images processed may preferably be overlapping images (at least a part of one of the images is found in the other image) and acquired from multiple viewpoints located along a linear path.
Orientation
The term orientation may herein refer to a positional attitude of a camera acquiring an image with respect to a referential frame. With reference to
Scanning
In some embodiments of the present disclosure, panoramic image processing may be used for building a multiple viewpoint panorama. For example, a set of images may be acquired by displacing the camera along an axis (scanning direction) in front of a scene. Further, the scene imaged may advantageously be such that the scene geometry lies along a dominant plane (for example an aisle of a grocery store). The terms “scanning” or “sweeping” may refer to translating an imaging unit along a scanning direction while acquiring images with the imaging unit. It is noted that advanced scanning may comprise several stages with different scanning directions. For example, a scanning may contain one or more horizontal and/or vertical stages so as to capture a whole shelving unit.
Fronto-Parallel Strip
As already mentioned in the present disclosure, a set (stream) of images processed may result from a scanning of the camera along an axis i.e. a translation of the camera while theoretically maintaining the orientation of the camera in a reference orientation. A first image of the stream of images may define the reference orientation of the camera i.e. a rotational change (Euler angle) of the following images of the stream may refer to orientation of the first image. However, practically, during scanning, orientation of the camera may be unwittingly modified by a user performing such scanning. The present disclosure proposes to recognize a fronto-parallel strip of a corrected image, based on the rotational change of said image with respect to the reference orientation, and to perform registration and/or stitching based on the recognized fronto-parallel strip. In the present disclosure, the term perpendicular strip (or band) may be understood as a slice of an image in a vertical direction (along the y axis) or in a horizontal direction (along the x axis).
The fronto-parallel strip selection may include the following steps: extracting the rotational change based on positional sensor measurements, calculating a fronto-parallel warped image by applying the correction transform on the input image, marking, in the warped image a region of the input image (marked with broken lines on
The fronto-parallel strip 13 may generally reflect the portion of an image which would have appeared in the central perpendicular strip of the image if the camera were held according to the reference orientation, i.e. with a rotational change equal to zero. More particularly, the perpendicular strip is a vertical strip when the image results from a horizontal scanning along the X axis or a horizontal strip when the image results from a vertical scanning along the Y axis. A width of the fronto-parallel strip may be defined by a width parameter which may be in the range of 1-5% or 5-10% of the field of view (FOV) along the scanning direction, preferably 3%, 5% or 7%. In other words, the fronto-parallel strip may be understood as a portion of an image, imaging objects which are positioned in a region of the scene which can be defined from the frame referential (X, Y, Z) centered at the position of the camera acquiring the image by:
ω=[−α*ωmax/2;α*ωmax/2], and
θ=[−θmax/2;θmax/2],
wherein α is the width parameter, ωmax is the width of the field of view and θmax is the height of the field of view.
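As a minimal sketch of how the angular bounds above might translate into pixel columns for a horizontal scan, the following assumes a pinhole camera whose principal point lies at the image center and whose focal length in pixels is implied by the field of view. The function name and its parameters are illustrative only and are not taken from the disclosure.

```python
import math

def fronto_parallel_strip_columns(image_width, fov_deg, alpha):
    """Return (first, last) column bounds of the central strip, in pixels.

    Assumption: pinhole model with the principal point at the image center.
    """
    half_fov = math.radians(fov_deg) / 2.0
    # Focal length in pixels implied by the horizontal field of view.
    focal_px = (image_width / 2.0) / math.tan(half_fov)
    # The strip spans the angles [-alpha * fov / 2, +alpha * fov / 2].
    half_width_px = focal_px * math.tan(alpha * half_fov)
    center = image_width / 2.0
    return center - half_width_px, center + half_width_px

lo, hi = fronto_parallel_strip_columns(image_width=1000, fov_deg=60.0, alpha=0.05)
```

With a width parameter of 5% and a 60 degree field of view, the strip covers roughly 45 of the 1000 columns, which illustrates how small a portion of each image takes part in registration.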
As explained, the fronto-parallel strip may be determined by correcting an acquired image based on the rotational change of said image with respect to the reference orientation and by selecting a central strip of the resulting corrected image.
As illustrated on
The Applicant has found that, particularly in configurations of short depth of field such as in panoramic imaging of an aisle of a grocery store, performing image registration (and particularly the calculation of the transformation or motion parameters for compensating translation and scale) between successive images based on fronto-parallel portions of the images improves the quality of the panorama and lowers the computational requirements. Further, the Applicant has found that performing the stitching by appending the fronto-parallel portions of successive corrected images one to another further improves the quality of the panorama. Thus, the Applicant proposes a method of image processing for registering images which implements these findings and notably includes, in a first step, the correction of a rotational change between two images, and thereafter the estimation of the translation and scale deformation based on keypoints found in the fronto-parallel strip.
Therefore, the present disclosure provides, in a first aspect, a computer implemented method of image processing comprising, upon receiving first and second images from an imaging unit, the first and second images being respectively associated with first and second rotational changes between a reference orientation and the orientations of the first and second images: processing (by the computer) data representative of the first image and of the second image to compensate the first and second rotational changes between the reference orientation and the respective orientations of the first and second images, thereby obtaining first and second corrected images; processing (by the computer) the first corrected image to detect distinctive keypoints within a fronto-parallel strip of the first corrected image; searching (by the computer) keypoints in the second corrected image corresponding to the detected keypoints; and estimating (by the computer) a geometric transformation between the first and second images based on matching the keypoints in the first and the second corrected images. For example, the imaging unit may be provided with a positional sensor which enables determining the first and second rotational changes.
In some embodiments, searching keypoints corresponding to the detected keypoints comprises, for each detected keypoint: defining a search area in the second corrected image based on a keypoint position in the first corrected image and on a rotational change between the first and second corrected images; and searching only in the defined search area.
In some embodiments, the rotational change between the first and second corrected images is derived from the rotational changes of the first and second images with respect to the reference orientation.
In some embodiments, defining the search area comprises estimating and correcting a translation of the imaging unit between a first acquisition position of the first image and a second acquisition position of the second image.
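The restriction of the search to a defined area can be illustrated with a short sketch. It assumes that a predicted displacement of the keypoint (in pixels) has already been obtained from the rotational change and the estimated translation; how that prediction is computed is not detailed here, so `predicted_shift` and `radius` are hypothetical inputs introduced for illustration.

```python
def search_window(keypoint, predicted_shift, radius, image_shape):
    """Return (col_min, col_max, row_min, row_max) of the clipped search area.

    keypoint: (column, row) position in the first corrected image.
    predicted_shift: assumed (d_column, d_row) displacement towards the
    second corrected image (hypothetical input).
    image_shape: (height, width) of the second corrected image.
    """
    col, row = keypoint
    d_col, d_row = predicted_shift
    height, width = image_shape
    center_col = col + d_col
    center_row = row + d_row
    # Clip the square window to the image bounds.
    col_min = max(0, int(round(center_col - radius)))
    col_max = min(width - 1, int(round(center_col + radius)))
    row_min = max(0, int(round(center_row - radius)))
    row_max = min(height - 1, int(round(center_row + radius)))
    return col_min, col_max, row_min, row_max

window = search_window((500, 240), predicted_shift=(-120, 0), radius=20,
                       image_shape=(480, 640))
```

Searching only inside such a window limits the number of candidate matches per keypoint, which is one plausible way the computational load could be kept low.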
In some embodiments, detecting distinctive keypoints is performed using the Shi-Tomasi technique.
In some embodiments, keypoints located out of the fronto-parallel strip are discarded from further processing.
In some embodiments, a width of the fronto-parallel strip is variable and is set so as to include a sufficient amount of keypoints for enabling estimating the geometric transformation.
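The three preceding embodiments (detection, discarding of keypoints outside the strip, and a variable strip width) might be combined as in the following sketch. It assumes that keypoints have already been detected (for example with a Shi-Tomasi detector) and are given as (column, row) positions, that the scan is horizontal, and that the strip width is expressed as a fraction of the image width rather than of the angular field of view. The thresholds `min_count`, `alpha_step` and `alpha_max` are illustrative values, not values from the disclosure.

```python
import numpy as np

def keypoints_in_strip(keypoints, image_width, alpha=0.03, min_count=20,
                       alpha_step=0.01, alpha_max=0.10):
    """Keep keypoints inside a central vertical strip, widening it if needed.

    keypoints: (N, 2) array of (column, row) positions in the corrected image.
    Returns the kept keypoints and the width fraction finally used.
    """
    keypoints = np.asarray(keypoints, dtype=float)
    center = image_width / 2.0
    while True:
        half_width = alpha * image_width / 2.0
        inside = np.abs(keypoints[:, 0] - center) <= half_width
        # Stop when enough keypoints are found or the strip cannot grow further.
        if inside.sum() >= min_count or alpha >= alpha_max:
            return keypoints[inside], alpha
        alpha = min(alpha + alpha_step, alpha_max)
```

Starting from a narrow strip and widening it only when too few keypoints are available keeps the registration focused on the most fronto-parallel content.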
In some embodiments, estimating the geometric transformation is performed using a transformation model involving, exclusively, translation and scale. In fact, according to the proposed method, a rotational change is preliminarily corrected by the correction step, therefore, such a simple transformation model including translation and scale only is efficient to complete the calculation of the registration parameters.
In some embodiments, estimating a geometric transformation is performed using a random sample consensus (RANSAC) algorithm.
In some embodiments, the data representative of the first image and of the second image are downsampled versions of the first and second images. This enables the above described processing to be performed on lighter images, for example grey scale and medium resolution versions of the first and second images.
In a further aspect, the present disclosure relates to a method of panoramic image (also referred to as stitched image) creation comprising, upon receiving a sequence of images from an imaging unit, wherein each image of the sequence of images is associated with a rotational change between said image and the reference orientation: estimating geometric transformations between a sequence of successive pairs of (received) images according to the method described above; computing a sequence of cumulative transformations, each cumulative transformation being associated with a (received) image of the sequence of successive pairs, by combining, for each (received) image of the sequence of successive pairs after the initial image, the geometric transformations estimated for the one or more (received) images preceding said (received) image; obtaining a sequence of corrected images corresponding to the (received) images of the successive pairs by processing data representative of at least part of said (received) images to compensate the rotational changes between the reference orientation and the respective orientations of said (received) images; obtaining a sequence of transformed images by applying each computed cumulative transformation to at least part of the corrected image corresponding to the (received) image associated with said cumulative transformation; and stitching the sequence of transformed images. The cumulative transformations may link a (received) image of the sequence of successive pairs to the initial image of the sequence of successive pairs.
In some embodiments, the data representative of at least part of said images comprise high resolution versions of at least a part of said images. This makes it possible to obtain a high resolution stitched image suitable for further image recognition techniques.
In some embodiments, the at least part of the corrected image is the fronto-parallel strip of said corrected image. This notably reduces computational requirements.
In some embodiments, the stitching includes using a seam algorithm.
In some embodiments, the (received) images result from scanning an aisle of a grocery store at multiple viewpoints located along a linear path.
In some embodiments, the reference orientation is an orientation of the initial image.
In some embodiments, the method further comprises monitoring an aperture level of a stitched image and modifying the reference orientation in order to maintain the aperture level in a predetermined range of apertures.
In some embodiments, stitching the sequence of transformed images is performed iteratively by computing, for each transformed image, an associated floating stitched image using said transformed image and a floating stitched image associated with a previous transformed image in the sequence of transformed images.
In some embodiments, the computing comprises appending an inner slice of the transformed image at an edge of a floating stitched image associated with the prior transformed image.
In some embodiments, the computing comprises superimposing an outer slice of the transformed image at an inner stitching portion of the floating stitched image associated with the prior transformed image.
In some embodiments, the data representative of at least part of said images comprise a low resolution version of at least a part of said images. This provides for a lower resolution stitched image which can further be displayed on a display window of a display screen of a system or handheld electronic device according to the present disclosure.
In a further aspect, the present disclosure provides a computer program product implemented on a non-transitory computer usable medium having computer readable program code embodied therein to cause the computer to perform the image processing method and/or a panoramic image creation method as previously described.
In a further aspect, the present disclosure provides for a system comprising: memory; an imaging unit; and a processing unit communicatively coupled to the memory and imaging unit, wherein the memory includes instructions for causing the processing unit to perform an image processing method and/or a panoramic image creation method as previously described.
In some embodiments, the memory, the imaging unit and the processing unit are part of a handheld electronic device.
In a further aspect, the present disclosure provides a method of panoramic imaging of a retail unit comprising: moving an imaging unit along a predetermined direction while acquiring a sequence of images of the retail unit; retrieving positional information of the imaging unit for each image and associating each image with a rotational change between said image and the first image of the sequence of images; creating a panoramic image according to the method previously described.
The Applicant has found that the above described technique of panoramic image creation, which notably separates the task of handling an orientation variation from the task of handling a translation and scale variation between successive images, makes it possible to significantly improve post-processing computation and enhances the quality of the resulting panoramic image.
In order to better understand the subject matter that is disclosed herein and to exemplify how it may be carried out in practice, embodiments will now be described, by way of non-limiting example only, with reference to the accompanying drawings, in which:
In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the subject matter. However, it will be understood by those skilled in the art that some examples of the subject matter may be practiced without these specific details. In other instances, well-known methods, procedures and components have not been described in detail so as not to obscure the description.
As used herein, the phrases “for example”, “such as”, “for instance” and variants thereof describe non-limiting examples of the subject matter.
Reference in the specification to “one example”, “some examples”, “another example”, “other examples”, “one instance”, “some instances”, “another instance”, “other instances”, “one case”, “some cases”, “another case”, “other cases” or variants thereof means that a particular described feature, structure or characteristic is included in at least one example of the subject matter, but the appearance of the same term does not necessarily refer to the same example.
It should be appreciated that certain features, structures and/or characteristics disclosed herein, which are, for clarity, described in the context of separate examples, may also be provided in combination in a single example. Conversely, various features, structures and/or characteristics disclosed herein, which are, for brevity, described in the context of a single example, may also be provided separately or in any suitable sub-combination.
Unless specifically stated otherwise, as apparent from the following discussions, it is appreciated that throughout the specification discussions utilizing terms such as “generating”, “determining”, “providing”, “receiving”, “using”, “computing”, “transmitting”, “performing”, or the like, may refer to the action(s) and/or process(es) of any combination of software, hardware and/or firmware. For example, these terms may refer in some cases to the action(s) and/or process(es) of a programmable machine, that manipulates and/or transforms data represented as physical, such as electronic quantities, within the programmable machine's registers and/or memories into other data similarly represented as physical quantities within the programmable machine's memories, registers and/or other such information storage, transmission and/or display element(s).
The term “inner slice” may be used herein to refer to a slice of an image taken within (inside) the image, i.e. an inner portion/cut of an image along a thickness of the image. The term “outer slice” (or “peripheral slice”) may be used, in contrast, to refer to a slice of an image along the thickness of the image which extends until an end of the image, i.e. the outer slice reaches three edges of the image.
In a step S110, the first and second images may be downsampled to ease further processing. The downsampled versions may be of medium resolution (for example with a downsampling factor of 0.5) and/or grayscale versions. As explained below, this step may also be performed after step S120.
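One possible way to obtain such a lighter version is sketched below. It assumes an RGB input stored as a NumPy array, uses the Rec. 601 luma weights for the grayscale conversion, and realizes the downsampling factor of 0.5 by averaging 2x2 pixel blocks. These choices are assumptions made for illustration, since the disclosure does not prescribe a particular downsampling method.

```python
import numpy as np

def to_gray_half_resolution(image):
    """Convert an (H, W, 3) RGB image to grayscale and halve its resolution."""
    image = np.asarray(image, dtype=float)
    # Rec. 601 luma weights for the RGB to grayscale conversion.
    gray = image @ np.array([0.299, 0.587, 0.114])
    # Crop to even dimensions, then average each 2x2 block (factor 0.5).
    h, w = gray.shape[0] // 2 * 2, gray.shape[1] // 2 * 2
    blocks = gray[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.mean(axis=(1, 3))
```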
In a step S120, data representative of the first image and data representative of the second image (for example the downsampled versions of the first and second images) may be processed to obtain a first corrected image and a second corrected image. It is noted that in some embodiments, the orientation correction may be performed on the received images (or on high resolution images derived from the received images) and the downsampling step S110 may be performed subsequently to the orientation correction, thereby also leading to downsampled images with corrected orientation with respect to the reference orientation.
It is noted that a general camera matrix can be represented by:
$$P = K\,[R \mid T]$$
wherein P is the camera matrix, K is an intrinsic camera calibration matrix, R is a camera rotation matrix with respect to a world reference frame, and T is a camera translation vector with respect to the world reference frame.
Using these notations, when correcting a pure rotation as assumed in step S120, there is a projective homography (also referred to as warping) between the image and the corrected image which can be represented by:
$$H = (K R_2)(R_1^{-1} K^{-1})$$
wherein:
$R_1$ is the rotation matrix of the (first or second) received image and $R_2$ is the rotation matrix of the (first or second) corrected image oriented according to the reference orientation; these matrices can be determined using the rotational changes provided by the positional attitude sensor of the system, and
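$K$ can be determined by calibration of the imaging unit. With the parameters defined below, the intrinsic calibration matrix takes the usual form:
$$K = \begin{pmatrix} f_c & s & c_0 \\ 0 & f_r & r_0 \\ 0 & 0 & 1 \end{pmatrix}$$
wherein:
$f_c$ is a focal length of the camera along the column axis;
$f_r$ is a focal length of the camera along the row axis;
$s$ is a skewness of the camera;
$c_0$ is a column coordinate of the focal center in the image reference frame;
$r_0$ is a row coordinate of the focal center in the image reference frame.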
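The orientation correction can be sketched numerically as follows. For a pure rotation, a scene direction X projects to K R1 X in the received image and to K R2 X in the corrected image, which leads to the homography K R2 R1^-1 K^-1 between the two. The sketch assumes pixel coordinates ordered as (column, row, 1), an intrinsic matrix in the common upper-triangular form, and a rotation about the vertical axis as the example rotational change. The helper names and the numeric values are illustrative only.

```python
import numpy as np

def intrinsic_matrix(fc, fr, c0, r0, s=0.0):
    """Intrinsic matrix in the common upper-triangular form (assumed layout)."""
    return np.array([[fc, s, c0], [0.0, fr, r0], [0.0, 0.0, 1.0]])

def rotation_about_y(angle_rad):
    """Rotation matrix for a rotation of angle_rad about the vertical axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def correction_homography(K, R1, R2):
    """H = K R2 R1^-1 K^-1, mapping received-image pixels to corrected pixels."""
    # The inverse of a rotation matrix is its transpose.
    return K @ R2 @ R1.T @ np.linalg.inv(K)

def apply_homography(H, point):
    """Map a (column, row) point through H and dehomogenize the result."""
    col, row = point
    x = H @ np.array([col, row, 1.0])
    return x[:2] / x[2]

K = intrinsic_matrix(fc=800.0, fr=800.0, c0=320.0, r0=240.0)
H = correction_homography(K, rotation_about_y(np.radians(5.0)), np.eye(3))
corrected = apply_homography(H, (320.0, 240.0))
```

In this example the focal center of the received image is shifted horizontally by the focal length times the tangent of the rotation angle, while its row coordinate is unchanged, as expected for a rotation about the vertical axis.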
In step S130, distinctive keypoints within a fronto-parallel strip of the first corrected image may be detected. Keypoint detection may be performed globally on the first corrected image, followed by selection of the keypoints located within the fronto-parallel strip; keypoints located outside the fronto-parallel strip may be discarded from further processing. Keypoint detection may be performed using the Shi-Tomasi technique or the like. As explained above, the fronto-parallel strip may be a centro-perpendicular band of the corrected image or a strip including information in closest proximity thereto. The fronto-parallel strip may reflect the portion of the first image which would have appeared in the central perpendicular strip of the first image if the camera were held according to the reference orientation. A direction of the fronto-parallel strip in the corrected image (horizontal or vertical) may depend on a scanning direction. It is noted that the scanning direction may be preliminarily provided to the system, for example by user input, or may alternatively be detected by image processing. Further, a width of the fronto-parallel strip may be variable and may be set so as to include a sufficient number of keypoints for estimating the geometric transformation.
In step S140, keypoints corresponding to the keypoints detected in step S130 may be searched in the second corrected image, i.e. the detected keypoints may be matched by determining which keypoints are derived from corresponding locations in the first and second images. In some embodiments, searching keypoints corresponding to the detected keypoints may comprise, for each detected keypoint, defining a search area in the second corrected image based on a keypoint position in the first corrected image and on a rotational change between the first and second corrected images, and searching only in the defined search area. The rotational change between the first and second corrected images may be derived from the rotational changes of the first and second images with respect to the reference orientation. In some embodiments, the search area may be searched with an incremental registration algorithm. In some embodiments, defining the search area may comprise estimating and correcting a translation of the imaging unit between a first acquisition position of the first image and a second acquisition position of the second image.
In a step S150, a geometric transformation may be estimated between the first and second images based on matching of the keypoints in the first and the second corrected images. Step S150 may be referred to as motion parameters estimation or image registration estimation. The estimation of the geometric transformation may be performed using a transformation model involving, exclusively, translation and scale. This model assumption may enable avoidance of a cumulative effect that would deform the further panoramic image. Further, the estimation of the geometric transformation may be performed using a random sample consensus (RANSAC) algorithm. This may enable reduction of parallax issues since RANSAC chooses the most populated point clusters and the most populated point clusters may be correlated to products in the foreground.
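The estimation of a translation and scale model with RANSAC can be sketched as follows. The sketch assumes an isotropic scale, matched keypoints given as two arrays of (column, row) positions, and a two-point minimal sample; the inlier threshold, the iteration count and the function names are illustrative assumptions rather than values from the disclosure.

```python
import numpy as np

def fit_scale_translation(p, q):
    """Least-squares fit of q ~ s * p + t with an isotropic scale s."""
    p_mean, q_mean = p.mean(axis=0), q.mean(axis=0)
    p_c, q_c = p - p_mean, q - q_mean
    s = np.sum(p_c * q_c) / np.sum(p_c * p_c)
    return s, q_mean - s * p_mean

def ransac_scale_translation(p, q, threshold=2.0, iterations=200, seed=0):
    """Estimate (s, t) from matched keypoints p -> q while rejecting outliers."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    rng = np.random.default_rng(seed)
    best_inliers = None
    for _ in range(iterations):
        i, j = rng.choice(len(p), size=2, replace=False)
        if np.allclose(p[i], p[j]):
            continue  # degenerate sample, the scale is undefined
        s, t = fit_scale_translation(p[[i, j]], q[[i, j]])
        residuals = np.linalg.norm(s * p + t - q, axis=1)
        inliers = residuals < threshold
        # Keep the model supported by the largest number of matches.
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers is None or best_inliers.sum() < 2:
        raise ValueError("no valid model found")
    # Refit on all inliers of the best model.
    s, t = fit_scale_translation(p[best_inliers], q[best_inliers])
    return s, t, best_inliers
```

Because the rotation has already been compensated, only three parameters remain to be estimated, so two matches suffice per random sample and the consensus set tends to be dominated by the largest coherent group of matches.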
In step S210, geometric transformations may be estimated between a sequence of successive pairs of received images according to the method previously described with reference to
In step S220, a sequence of cumulative transformations linking each image of the sequence of successive pairs to the initial image may be computed. As illustrated in
In a step S230, a sequence of (orientation) corrected images corresponding to the received images of the successive pairs may be obtained. The corrected images may be obtained by processing data representative of at least part of said received images. In some embodiments, the processing may be performed on high resolution and/or color versions of at least part of the received images. This may enable obtaining a stitched image of high quality for output to further image recognition processing. In some other embodiments, the processing may be performed on low resolution versions of at least part of the received images. A downsampling factor of such versions may be greater than 0.5. This may enable computing a real time preview of the stitched image.
In a further step S240, a sequence of transformed images may be obtained by applying each computed cumulative transformation to at least part of the corrected image corresponding to the received image associated with said cumulative transformation. In some embodiments, the cumulative transformations may be applied to the whole corrected images. In some embodiments, the cumulative transformations may be applied only to the fronto-parallel strips of the corrected images until the penultimate corrected image. The cumulative transformation associated with the ultimate image of the sequence may be applied to the fronto-parallel portion and to an additional portion of the ultimate image. The latter alternative reduces calculation time.
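The computation of the cumulative transformations of step S220 and their application in step S240 can be sketched with homogeneous 3x3 matrices. The sketch assumes that each pairwise transformation maps coordinates of an image to the coordinates of the preceding image, so that chaining them links every image to the initial image; this direction convention and the numeric values are assumptions made for illustration.

```python
import numpy as np

def as_matrix(scale, translation):
    """3x3 homogeneous matrix for x' = scale * x + translation."""
    tx, ty = translation
    return np.array([[scale, 0.0, tx], [0.0, scale, ty], [0.0, 0.0, 1.0]])

def cumulative_transforms(pairwise):
    """Chain pairwise transforms so each image maps to the initial image frame.

    pairwise[i] is assumed to map coordinates of image i + 1 to image i.
    """
    cumulative = [np.eye(3)]  # the initial image is its own reference
    for T in pairwise:
        cumulative.append(cumulative[-1] @ T)
    return cumulative

pairwise = [as_matrix(1.0, (120.0, 0.0)), as_matrix(1.02, (115.0, 1.5))]
C = cumulative_transforms(pairwise)
# Position, in the frame of the initial image, of a point of the third image.
point_in_panorama = C[2] @ np.array([310.0, 0.0, 1.0])
```

Applying the cumulative matrix only to the corners or pixels of the fronto-parallel strip, rather than to the whole corrected image, is what the strip-only alternative described above amounts to in this representation.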
In a further step S250, the sequence of transformed images may be stitched, thereby leading to a stitched image. The stitching may include using a seam algorithm, in particular when the stitched image is obtained from high resolution versions of the received images (for output purposes). The stitching may also include simple blending, in particular when the stitched image is obtained from low resolution versions of the received images (for preview purposes). The stitching of the sequence of transformed images may be performed iteratively by computing, for each transformed image, an associated floating stitched image using said transformed image and a floating stitched image associated with a previous transformed image in the sequence of transformed images. Further, the computing may comprise appending an inner slice of the transformed image at an edge of the floating stitched image associated with the (directly) prior transformed image in the sequence of transformed images. Alternatively, the computing may comprise superimposing an outer slice of the transformed image at an inner stitching portion of the floating stitched image associated with the prior transformed image in the sequence of transformed images.
Furthermore, in some embodiments, the method may also comprise a step of displaying in real time a panoramic image preview on the display unit of the system while scanning the scene. The panoramic image preview may be computed upon receiving the sequence of images. The sequence of cumulative transformations may be computed progressively and may be applied to downsampled versions of the corrected images to obtain the panoramic image preview.
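The iterative variant that appends an inner slice at the edge of the floating stitched image can be sketched as follows. The sketch assumes a horizontal scan from left to right, transformed images of equal height that are already aligned, and a slice width equal to the displacement between successive images, so that consecutive slices are adjacent in the scene. No seam search or blending is performed, and the function names are illustrative.

```python
import numpy as np

def append_inner_slice(floating, transformed, start_col, width):
    """Append columns [start_col, start_col + width) of `transformed` to the
    right edge of the floating stitched image."""
    inner = transformed[:, start_col:start_col + width]
    if floating is None:
        return inner.copy()  # the first slice starts the panorama
    return np.hstack([floating, inner])

def stitch(transformed_images, start_col, width):
    """Build the stitched image iteratively, one inner slice per image."""
    floating = None
    for image in transformed_images:
        floating = append_inner_slice(floating, image, start_col, width)
    return floating
```

Each iteration only touches one narrow slice and the edge of the floating image, which is consistent with computing a preview progressively while images are still being received.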
While certain features of the invention have been illustrated and described herein, many modifications, substitutions, changes, and equivalents will now occur to those of ordinary skill in the art. It is, therefore, to be understood that the appended claims are intended to cover all such modifications and changes as fall within the true spirit of the invention.
It will be appreciated that the embodiments described above are cited by way of example, and various features thereof and combinations of these features can be varied and modified.
While various embodiments have been shown and described, it will be understood that there is no intent to limit the invention by such disclosure, but rather, it is intended to cover all modifications and alternate constructions falling within the scope of the invention, as defined in the appended claims.
It will also be understood that the system according to the presently disclosed subject matter can be implemented, at least partly, as a suitably programmed computer. Likewise, the presently disclosed subject matter contemplates a computer program being readable by a computer for executing the disclosed method. The presently disclosed subject matter further contemplates a machine-readable memory tangibly embodying a program of instructions executable by the machine for executing the disclosed method.
Number | Date | Country | Kind |
---|---|---|---|
230773 | Feb 2014 | IL | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/IL2015/050070 | 1/21/2015 | WO | 00 |