The invention relates to the general field of image analysis. More particularly, the invention relates to a device for helping the capture of images and an image capture device comprising the help device.
Currently, when a cameraman films a scene, besides the direct observation of the scene via the viewfinder of the camera, the only means he has to ensure that the scene he is filming is correctly framed are either a return channel or oculometric tests.
The direct observation of the scene via a viewfinder does not always enable the cameraman to frame it correctly, particularly in the case of rapid movements (e.g. sports scenes). It can also be difficult for him to determine how to frame a scene when this scene comprises many regions of interest (e.g. in a panoramic view).
The use of a return channel enables for example the director to inform the cameraman that the image is poorly framed. Such a solution is however not satisfactory to the extent that it is not instantaneous.
As for oculometric tests, they are difficult and take a long time to set up. Indeed, they require a representative panel of observers to be assembled. Furthermore, the results of these tests are not immediate and require a long phase of analysis.
The purpose of the invention is to compensate for at least one disadvantage of the prior art.
The invention relates to a device for helping the capture of images comprising:
The device for helping the capture of images according to the invention makes shot composition easier by supplying the cameraman with more information on the scene that he is filming.
According to a particular characteristic of the invention, the analysis means are suitable to calculate an item of perceptual interest data for each pixel of the image.
According to a particular aspect of the invention, the graphic indicator is overlaid on the image in such a manner that it is centred on the pixel of the image for which the perceptual interest data is the highest.
According to a particular characteristic of the invention, the image being divided into pixel blocks, the analysis means are suitable to calculate an item of perceptual interest data for each block of the image.
According to another particular aspect of the invention, the graphic indicator is an arrow pointing to at least one block whose perceptual interest data is greater than a predefined threshold.
Advantageously, the display means are further suitable to modify at least one parameter of a graphic indicator according to a rate of perceptual interest associated with the region of the image covered by the graphic indicator.
According to an embodiment, the rate of perceptual interest equals the ratio between the sum of the perceptual interest data associated with the pixels of the image covered by the graphic indicator and the sum of the perceptual interest data associated with all the pixels of the image.
According to an embodiment, the graphic indicator is a circle whose thickness is proportional to the rate of perceptual interest.
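By way of illustration only, the following Python sketch shows one way such a rate could be computed, assuming the perceptual interest data is available as a 2D NumPy saliency map and the indicator's footprint as a boolean mask of the same size; the function names and the maximum thickness are illustrative assumptions, not part of the description above.

```python
import numpy as np

def perceptual_interest_rate(saliency_map, indicator_mask):
    """Ratio between the saliency summed over the pixels covered by the
    graphic indicator and the saliency summed over the whole image."""
    total = float(saliency_map.sum())
    if total == 0.0:
        return 0.0
    return float(saliency_map[indicator_mask].sum()) / total

def circle_thickness(rate, max_thickness=10):
    """Circle line thickness proportional to the rate of perceptual interest
    (max_thickness is an illustrative choice)."""
    return max(1, int(round(rate * max_thickness)))
```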
The graphic indicator belongs to the group comprising:
The invention also relates to an image capture device comprising:
The image capture device according to the invention helps the cameraman to correctly frame the scene that he is filming by informing him by means of the graphic indicators how to position the camera so that the image filmed is centred on one of the regions of interest of the scene.
According to a particular embodiment, the image capture device is suitable to capture the images of a first predefined format and the graphic indicator is a frame defining a second predefined format different from the first format.
According to an embodiment example, the first format and the second format belong to the group comprising:
The invention will be better understood and illustrated by means of embodiments and implementations, by no means limiting, with reference to the annexed figures, wherein:
The device for helping the capture of images comprises an analysis module 20 suitable to analyse an image to be captured. More precisely, the module 20 analyses the visual content of the image to calculate perceptual interest data. An item of perceptual interest data can be calculated for each pixel of the image or for each group of pixels of the image, for example a pixel block. The perceptual interest data is advantageously used to determine the regions of interest in the image, i.e. the zones attracting the attention of an observer.
For this purpose, the method described in the European patent application EP 04804828.4, published on 30 Jun. 2005 under the number 1695288, can be used to calculate for each pixel of the image an item of perceptual interest data, also known as a saliency value. This method illustrated by
The spatial modelling step is composed of 3 steps E201, E202 and E203. During the first step E201, the incident image data (e.g. RGB components) are filtered to make them coherent with what our visual system would perceive while looking at the image. Indeed, the step E201 implements tools that model the human visual system. These tools take into account the fact that the human visual system does not perceive the different visual components of our environment with the same sensitivity. This sensitivity is simulated by the use of Contrast Sensitivity Functions (CSF) and by the use of intra- and inter-component visual masking. More precisely, during the step E201, a hierarchic decomposition into perceptual channels, marked DCP in
During the second step E202, the subbands from the step E201 are convolved with an operator close to a difference of Gaussians (DoG). The purpose of the step E202 is to simulate the visual perception mechanism. This mechanism enables the visual characteristics carrying important information to be extracted (particularly local singularities that contrast with their environment), leading to the creation of an economical representation of our environment. The organisation of the receptive fields of the visual cells, whether retinal or cortical, fully meets this requirement. These cells are circular and are constituted by a centre and a surround having antagonistic responses. The cortical cells also have the particularity of having a preferred direction. This organisation endows them with the property of responding strongly to contrasts and of not responding to uniform zones. The modelling of this type of cell is carried out via differences of Gaussians (DoG), whether oriented or not. Perception also consists in emphasising certain characteristics essential to interpreting the information. According to the principles of the Gestaltist school, a butterfly filter is applied after the DoG to strengthen collinear, aligned and small-curvature contours. The third step E203 consists in constructing the spatial saliency map. For this purpose, a fusion of the different components is carried out by grouping or linking elements that are a priori independent, to form an image understandable by the brain. The fusion is based on an intra-component and inter-component competition enabling the complementarity and redundancy of the information carried by the different visual dimensions (achromatic or chromatic) to be exploited.
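Purely as an illustration of the centre/surround behaviour exploited in the step E202, the sketch below computes a difference-of-Gaussians response on a luminance image. It does not reproduce the perceptual channel decomposition, the CSF filtering or the butterfly filter of the described method; the sigma values are arbitrary assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_response(luminance, sigma_centre=1.0, sigma_surround=3.0):
    """Centre/surround response as a difference of two Gaussian blurs:
    strong on local contrasts, close to zero on uniform zones."""
    lum = luminance.astype(np.float64)
    centre = gaussian_filter(lum, sigma_centre)
    surround = gaussian_filter(lum, sigma_surround)
    return np.abs(centre - surround)
```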
The temporal modelling step, itself divided into 3 steps E204, E205 and E206, is based on the following observation: in an animated context, the contrasts of movement are the most significant visual attractors. Hence, an object moving on a fixed background, or vice versa a fixed object on a moving background, attracts one's visual attention. To determine these contrasts, accounting for tracking eye movements is vital. These eye movements enable the movement of an object to be compensated for naturally. The velocity of the tracked movement, expressed in the retinal frame, is therefore almost zero. To determine the most relevant movement contrasts, it is consequently necessary to compensate for the inherent motion of the camera, assumed to be dominant. For this purpose, a field of vectors is estimated at the step E204 by means of a motion estimator working on the hierarchic decomposition into perceptual channels. From this field of vectors, a complete refined parametric model that represents the dominant movement (for example a translational movement) is estimated at the step E205 by means of a robust estimation technique based on M-estimators. The retinal movement is then calculated in the step E206. It is equal to the difference between the local movement and the dominant movement. The stronger the retinal movement (while nevertheless accounting for the maximum theoretical velocity of the tracking eye movement), the more the zone in question attracts the eye. The temporal saliency, which is proportional to the retinal movement or to the contrast of movement, is then deduced from this retinal movement. Given that it is easier to detect a moving object among fixed disturbing elements (or distracters) than the contrary, the retinal movement is modulated by the overall quantity of movement of the scene.
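The following sketch illustrates, under simplifying assumptions, the idea of retinal movement as the difference between the local movement and the dominant movement. It uses OpenCV's Farneback dense optical flow as the local motion estimator and a simple median flow vector in place of the robust parametric model of the step E205, so it is only an approximation of the described mechanism.

```python
import cv2
import numpy as np

def retinal_motion(prev_gray, curr_gray):
    """Magnitude of the local motion once a crude estimate of the dominant
    (camera) motion has been subtracted."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    # Median flow vector as a rough stand-in for the robust parametric
    # model of the dominant movement (step E205 is not reproduced here).
    dominant = np.median(flow.reshape(-1, 2), axis=0)
    residual = flow - dominant
    return np.linalg.norm(residual, axis=2)
```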
The spatial and temporal saliency maps are merged in the step E207. The fusion step E207 implements an intra- and inter-map competition mechanism. The resulting map can be presented in the form of a heat map indicating the zones having a high perceptual interest.
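As a rough illustration of merging the two maps (without the intra- and inter-map competition mechanism of the step E207), one could imagine a simple normalise-and-blend fusion such as the following; the weighting parameter is an arbitrary assumption.

```python
import numpy as np

def fuse_maps(spatial, temporal, alpha=0.5):
    """Naive fusion: normalise each map to [0, 1] and blend them linearly."""
    def normalise(m):
        m = m.astype(np.float64)
        span = m.max() - m.min()
        return (m - m.min()) / span if span > 0 else np.zeros_like(m)
    return alpha * normalise(spatial) + (1.0 - alpha) * normalise(temporal)
```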
However, the invention is not limited to the method described in the European patent application EP 04804828.4, which is only an embodiment. Any method enabling perceptual interest data (e.g. saliency maps) to be calculated for an image is suitable. For example, the method described in the document by Itti et al. entitled “A model of saliency-based visual attention for rapid scene analysis”, published in 1998 in IEEE Trans. on PAMI, can be used by the analysis module 20 to analyse the image.
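For instance, assuming the opencv-contrib-python package is available, a saliency map can be obtained with an off-the-shelf spectral residual detector; this is not the method of the patent application cited above, merely one example of a detector producing per-pixel perceptual interest data.

```python
import cv2

# Requires the opencv-contrib-python package (cv2.saliency module).
detector = cv2.saliency.StaticSaliencySpectralResidual_create()

def saliency_map(bgr_image):
    """Per-pixel saliency map in [0, 1], same size as the input image."""
    ok, smap = detector.computeSaliency(bgr_image)
    return smap if ok else None
```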
The device for helping the capture of images 1 further comprises a display module 30 suitable to overlay on the image analysed by the analysis module 20 at least one graphic indicator of at least one region of interest in the image, i.e. a region for which the perceptual interest data is high. The position of this graphic indicator on the image, and possibly its geometric characteristics, depend on the perceptual interest data calculated by the analysis module 20. This graphic indicator is positioned in such a manner that it indicates the position of at least one region of the image for which the perceptual interest is high. According to a variant, a plurality of graphic indicators is overlaid on the image, each of them indicating the position of a region of the image for which the perceptual interest is high.
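By way of illustration, a graphic indicator can be blended transparently onto the analysed image as in the following sketch, which assumes OpenCV and a BGR image; the colour and blending factor are illustrative choices.

```python
import cv2

def overlay_circle(image_bgr, centre, radius, thickness, alpha=0.5):
    """Draw a circular indicator and blend it transparently over the image."""
    overlay = image_bgr.copy()
    cv2.circle(overlay, centre, radius, (0, 255, 0), thickness)
    return cv2.addWeighted(overlay, alpha, image_bgr, 1.0 - alpha, 0)
```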
According to a first embodiment, the graphic indicator is an arrow. To position the arrow in the image, said image is divided into N non-overlapping blocks of pixels. Assuming that N=16, as illustrated in
However, if almost the entire image has a high perceptual interest, it is advantageous to indicate to the cameraman that he must perform a zoom out operation to restore the region of high perceptual interest to its context. For this purpose, 4 arrows pointing away from the image are overlaid on the image.
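A possible block-based selection, given purely as an illustration, is sketched below: the saliency map is averaged over a grid of non-overlapping blocks (e.g. 4x4 for N=16), the blocks exceeding a threshold are candidates for an arrow, and a zoom out is suggested when almost all blocks are salient. The threshold and the "almost the entire image" criterion are arbitrary assumptions.

```python
import numpy as np

def salient_blocks(saliency_map, grid=(4, 4), threshold=0.5):
    """Average saliency per block on a grid (e.g. 4x4 for N=16) and return
    the indices of the blocks exceeding the threshold, plus a zoom-out hint."""
    rows, cols = grid
    h, w = saliency_map.shape
    bh, bw = h // rows, w // cols
    means = np.array([[saliency_map[r * bh:(r + 1) * bh,
                                    c * bw:(c + 1) * bw].mean()
                       for c in range(cols)] for r in range(rows)])
    hits = np.argwhere(means > threshold)
    # Suggest a zoom out when almost every block is salient.
    zoom_out = len(hits) >= 0.9 * rows * cols
    return hits, zoom_out
```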
According to another embodiment, the graphic indicator is a disk of variable size displayed transparently on the image, as shown on
According to another variant, the graphic indicator is a square of predefined size. For example, the n most salient pixels, i.e. those having an item of high perceptual interest data, are identified. The barycentre of these n pixels is calculated, each pixel being weighted by its respective perceptual interest data. A square is then positioned on the displayed image (light square positioned on the stomach of the golfer on
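The weighted barycentre described above could be computed, for example, as in the following sketch, which assumes a 2D NumPy saliency map; the value of n is an arbitrary assumption.

```python
import numpy as np

def saliency_barycentre(saliency_map, n=500):
    """Barycentre of the n most salient pixels, weighted by their saliency."""
    flat = saliency_map.ravel()
    top = np.argpartition(flat, -n)[-n:]
    rows, cols = np.unravel_index(top, saliency_map.shape)
    weights = flat[top]
    if weights.sum() == 0:
        weights = np.ones_like(weights)
    cy = np.average(rows, weights=weights)
    cx = np.average(cols, weights=weights)
    return int(round(cx)), int(round(cy))  # centre of the square indicator
```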
With reference to
The display of such graphic indicators on the viewfinder 2 enables the cameraman who films the scene to move his camera so as to centre the visually important regions of the filmed scene in the image displayed on the viewfinder 2. In
The graphic indicators advantageously enable the cameraman to ensure that the regions of high perceptual interest in a scene will be present in the captured images. They also enable the cameraman to ensure that these regions are centred in the captured images. Moreover, by modulating certain parameters of the graphic indicators, they enable the cameraman to rank the regions of high perceptual interest according to their respective rates of saliency.
According to a particular embodiment, the graphic indicator is a frame of predefined size. According to the invention, this frame is overlaid on the image displayed on the viewfinder 2 such that it is centred on a region of the image having a high perceptual interest. This graphic indicator is advantageously used to represent, on a captured image in the 16/9 format, a frame in the 4/3 format, as illustrated by
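As an illustration of this embodiment, the sketch below computes a 4/3 frame of full image height inside a 16/9 image, centred horizontally on the most salient pixel and clamped to remain inside the image; the centring rule (using the saliency peak) is a simplifying assumption.

```python
import numpy as np

def frame_4_3(saliency_map):
    """4/3 frame of full image height inside a 16/9 image, centred
    horizontally on the most salient pixel and clamped to the image."""
    h, w = saliency_map.shape
    frame_w = int(round(h * 4 / 3))
    _, peak_col = np.unravel_index(int(np.argmax(saliency_map)),
                                   saliency_map.shape)
    left = int(np.clip(peak_col - frame_w // 2, 0, w - frame_w))
    return left, 0, frame_w, h  # x, y, width, height of the 4/3 frame
```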
Of course, the invention is not limited to the embodiment examples mentioned above. In particular, the person skilled in the art may apply any variant to the stated embodiments and combine them to benefit from their various advantages. Notably, any other graphic indicator than the aforementioned indicators can be used, as for example an ellipse, a parallelogram, a cross, etc.
Furthermore, the graphic indicators can be displayed superimposed on a control screen external to the image capture device instead of being displayed on the viewfinder of an image capture device.
Number | Date | Country | Kind
---|---|---|---
0760170 | Dec. 2007 | FR | national
Filing Document | Filing Date | Country | Kind | 371(c) Date
---|---|---|---|---
PCT/EP2008/067685 | 12/17/2008 | WO | 00 | 6/14/2010