The invention relates to methods and devices configured to process a stream of image data.
Image signal processing is a widely used technique for enhancing and improving the quality of images captured by various devices such as cameras and scanners. Image signal processing can include techniques such as sharpening, noise reduction, colour correction, contrast enhancement, and colour space processing. The technique has many applications, from improving the aesthetic appeal of captured images to improving the definition of edges and contrast, which may be important if the image is to be subject to further automated processes (such as image recognition) or is to be inspected by a human (such as in the review of medical images).
Some image capture technologies involve selective processing of digital images in which particular regions of interest are identified in the images and the regions of interest are processed in a different manner from other portions of the image. An example of such a technology is foveated rendering, in which portions of an image that correspond to a user's attention are rendered at a higher resolution than other portions of the image, which the eye will resolve less clearly.
Digital image signal processing can be memory intensive and processor intensive. Increased power consumption associated with processing high-resolution images may be undesirable in, for example, power-limited devices such as battery-operated devices. Accordingly, there is a desire for efficient methods of performing image signal processing in various devices.
According to a first aspect of the present invention there is provided a method of processing streamed image data comprising: obtaining a stream of image data; obtaining information identifying a location of one or more regions of interest in the image data; performing a first spatial processing on the stream of image data so as to change a spatial resolution of at least a portion of the streamed image data in dependence upon the location of the one or more regions of interest to generate a first processed stream; performing image signal processing on the first processed stream to generate a stream of processed image data; and performing a second spatial processing on the stream of processed image data to generate a second processed stream of image data.
According to a second aspect of the invention there is provided a device configured to process streamed image data comprising: one or more hardware units configured to: obtain a stream of image data; obtain information identifying one or more regions of interest in the image data; perform a first spatial processing on the stream of image data so as to change a spatial resolution of at least a portion of the streamed image data in dependence upon the location of the one or more regions of interest to generate a first processed stream; perform image signal processing on the first processed stream to generate a stream of processed image data; and perform a second spatial processing on the stream of processed image data to generate a second processed stream of image data.
According to a third aspect there is provided a non-transitory computer-readable storage medium storing instructions that, when executed by a processor of a device, cause the device to: obtain a stream of image data; obtain information identifying one or more regions of interest in the image data; perform a first spatial processing on the stream of image data so as to change a spatial resolution of at least a portion of the streamed image data in dependence upon the location of the one or more regions of interest to generate a first processed stream; perform image signal processing on the first processed stream to generate a stream of processed image data; and perform a second spatial processing on the stream of processed image data to generate a second processed stream of image data.
Embodiments will now be described, by way of example only, with reference to the accompanying drawings in which:
Embodiments described herein include methods and devices for performing digital image signal processing on one or more images or video data. Embodiments described below will refer to digital image data. The techniques described are equally applicable to still image data and to video image data.
An embodiment may provide a method of processing streamed image data. The method may comprise obtaining a stream of image data and obtaining information identifying a location of one or more regions of interest in the image data. The method performs a first spatial processing on the stream of image data so as to change a spatial resolution of at least a portion of the streamed image data in dependence upon the location of the region of interest to generate a first processed stream. The method further performs image signal processing on the first processed stream to generate a stream of processed image data. The method then performs a second spatial processing on the stream of processed image data to generate a second processed stream of image data.
In some implementations, performing second spatial processing generates a stream of image data corresponding to the one or more region of interest and a stream of data corresponding to an overall image. The streams relating to the one or more region of interest and the overall image may have different spatial resolutions.
In other implementations, the second spatial image processing processes the region of interest and areas outside of the region of interest to generate a single stream of processed image data having a single resolution.
The method may obtain a plurality of regions of interest. In such cases, the method may perform the first spatial processing to change a spatial resolution of at least a portion of the streamed image data in dependence upon the positions of the regions of interest to generate a first processed stream. Further, the method may perform the second spatial processing on the stream of processed image data to generate at least one stream associated with the plurality of regions of interest.
The second spatial processing may generate a separate stream associated with each region of interest. In such cases at least two of the generated streams may have different spatial resolutions.
In some embodiments, the first spatial processing may comprise: determining for each pixel of the stream of image data whether the pixel is horizontally aligned with the one or more regions of interest and: in a case that the pixel is not horizontally aligned with the one or more regions of interest performing a vertical scaling in relation to the pixel; and in a case that the pixel is horizontally aligned with the one or more regions of interest, not performing a vertical scaling in relation to the pixel.
In such embodiments, the first spatial processing may comprise: determining for each pixel of the stream of image data whether the pixel is vertically aligned with the one or more regions of interest and: in a case that the pixel is not vertically aligned with the one or more regions of interest performing a horizontal scaling in relation to the pixel; and in a case that the pixel is vertically aligned with the one or more regions of interest not performing a horizontal scaling in relation to the pixel. In such embodiments, the first processed stream may represent image data that is formed of a rectangular grid of pixel values.
In other such embodiments, the first spatial processing may comprise: determining for each pixel of the stream of image data whether the pixel is located within the one or more regions of interest and: in a case that the pixel is not located within the one or more regions of interest performing a horizontal scaling in relation to the pixel; and in a case that the pixel is located within the one or more regions of interest not performing a horizontal scaling in relation to the pixel. Such embodiments may result in image data after the first spatial processing with lines of image data that have varying length.
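The per-pixel decisions of the rectangular-grid and varying-line-length variants described above can be sketched as follows. This is an illustrative sketch only, not the described hardware: the helper names and the representation of a region of interest as an (x0, y0, x1, y1) rectangle are assumptions.

```python
def horizontally_aligned(y, rois):
    """True if row y passes through the vertical extent of any region of
    interest (ROI), i.e. the pixel shares horizontal lines with an ROI."""
    return any(y0 <= y < y1 for _, y0, _, y1 in rois)

def vertically_aligned(x, rois):
    """True if column x passes through the horizontal extent of any ROI."""
    return any(x0 <= x < x1 for x0, _, x1, _ in rois)

def in_roi(x, y, rois):
    """True if the pixel lies inside any ROI rectangle."""
    return any(x0 <= x < x1 and y0 <= y < y1 for x0, y0, x1, y1 in rois)

def rectangular_grid_decision(x, y, rois):
    """Variant where scaling is decided per row and per column, so the
    first processed stream remains a rectangular grid of pixel values.
    Returns (apply_horizontal_scaling, apply_vertical_scaling)."""
    return (not vertically_aligned(x, rois), not horizontally_aligned(y, rois))

def varying_line_decision(x, y, rois):
    """Variant where horizontal scaling is decided per pixel, so lines
    intersecting an ROI come out longer than lines that do not."""
    return (not in_roi(x, y, rois), not horizontally_aligned(y, rois))
```

For a single ROI, a pixel in the same rows as the ROI but outside its columns escapes vertical scaling in both variants, while the two variants differ in whether that pixel is horizontally scaled.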
In implementations of the embodiment, the first processed stream may be stored in a set of delay lines prior to performing image signal processing.
In some embodiments, in a case that consecutive horizontal lines change length and/or image data alignment due to a change in horizontal scaling following a change in alignment with the one or more regions of interest, the first spatial processing may control the first processed stream to contain two different versions of the same line or to contain different versions of a portion of the same line in order to provide image data for the image signal processing. Two different versions of the same line or portion of the same line may be provided for lines immediately preceding and/or immediately succeeding the lines where the change in length and/or image data alignment occurs. In further embodiments, the image signal processor may generate the two different versions of the same line or portion of the same line.
The image signal processing may comprise one or more of: sharpening, noise reduction, contrast adjustment, colour correction, edge detection, focus detection, and colour space processing.
In some embodiments according to the first aspect, the first spatial processing may comprise pixel binning performed by an image sensor on a portion of the stream of image data not included in the one or more regions of interest. In such embodiments the second spatial processing may comprise upscaling the portion of the stream of processed image data that was subjected to pixel binning.
A further embodiment provides a device configured to process streamed image data. The device comprises one or more hardware units configured to: obtain a stream of image data and obtain information identifying one or more regions of interest in the image data. The device performs a first spatial processing on the stream of image data so as to change a spatial resolution of at least a portion of the streamed image data in dependence upon the location of the region of interest to generate a first processed stream. The device performs image signal processing on the first processed stream to generate a stream of processed image data and performs a second spatial processing on the stream of processed image data to generate a second processed stream of image data.
The one or more hardware units may include circuitry such as one or more application specific integrated circuits, one or more processor, one or more field programmable gate array or the like. The circuitry may include a storage unit and/or input/output interfaces for communicating with external devices.
A further embodiment may provide a non-transitory computer-readable storage medium storing instructions that, when executed by a processor of a device, cause the device to obtain a stream of image data and obtain information identifying one or more region of interest in the image data. The instructions cause the device to perform a first spatial processing on the stream of image data so as to change a spatial resolution of at least a portion of the streamed image data in dependence upon the location of the one or more region of interest to generate a first processed stream. The instructions further cause the device to perform image signal processing on the first processed stream to generate a stream of processed image data and perform a second spatial processing on the stream of processed image data to generate a second processed stream of image data.
The device for processing streamed digital image data may in some examples be a mobile device such as a mobile phone or PDA. In other examples the device may be a computer such as a laptop or desktop PC. In other examples the device may be a server or cloud service. In yet further examples the device may be a smart device such as a smart doorbell or any other IoT device. These examples are not exhaustive, and the device may take other forms not mentioned.
Any method described below may be implemented as a computer program. The computer program may be stored on a computer-readable storage medium and read by one or more information processing apparatus for the purposes of performing such a method.
In further embodiments the methods below may be implemented in hardware and performed using fixed function circuitry. Fixed function circuitry may comprise dedicated hardware circuitry that is configured specifically to perform a fixed function, and that is not reconfigurable to perform a different function. In this way, the fixed function circuitry can be considered distinct from a programmable circuit that is configured to receive and decode instructions defined, for example, in a software program.
Fixed-function circuitry may comprise at least one electronic circuit for performing an operation. Any fixed function circuitry described herein may comprise application-specific integrated circuitry. The application-specific integrated circuitry may comprise one or more integrated circuits and may be designed using a hardware description language such as Verilog and implemented as part of the fabrication of an integrated circuit. The application-specific integrated circuitry may comprise a gate array or a full custom design.
In the example below the spatial processing components 20 and 23 and image signal processing component 22 are implemented in software run on the hardware described in
The components shown in
Streamed image data, such as still or video data, is received at the first spatial processing component 20. The image data is selectively subjected to spatial processing such as downscaling to reduce the spatial resolution or upscaling to increase the spatial resolution depending on whether or not a pixel of the image data being processed is in a region of interest.
The selectively spatially processed image data is then passed to delay lines 21 for subsequent processing by an image signal processor 22. The image signal processor is configured to perform digital image signal processing on the received stream of image data. The type of digital image signal processing performed by the digital image signal processor 22 does not matter for the purposes of the described embodiments. For example, the image signal processor 22 may perform, without restriction, one or more of sharpening, noise reduction, colour correction, contrast enhancement, edge detection, focus detection and/or colour space processing. Other image signal processing operations could be performed.
Following processing by the image signal processor 22 the streamed data is passed to the second spatial processing component 23. The second spatial processing component 23 performs a further spatial processing to complete the selective processing of the digital image data. For example, if a reduced-resolution thumbnail and a high-resolution crop of a region of interest are required, the second spatial processing component 23 will perform processing to generate streams for each of the reduced-resolution thumbnail and the high-resolution crop.
The pipeline described above in connection with
To the left of the central region of interest, horizontal downscaling has been performed. Accordingly, in the top left and bottom left corners of the image shown in
There is also a small gap in the horizontal direction between the central region of interest and the right-side region of interest. In this gap, horizontal downscaling has also been performed.
Logic for performing the first spatial image processing is shown in
In a second line of logic, it is determined at step 61 whether or not the pixel is horizontally aligned with a region of interest. If the pixel is not horizontally aligned with a region of interest, then in step 61a the pixel is subject to vertical scaling. If the pixel is aligned with a region of interest, then in step 61b it is not subject to vertical scaling.
Following processing by the first spatial processing component 20, the stream of image data is passed to delay lines 21. The image data is subject to image signal processing by image signal processor 22 which reads the streamed image data from the delay lines 21. As noted above the image signal processor 22 may perform one or more different digital image processes on the stream of image data read from the delay lines 21. Some digital image processes use a kernel that processes using neighbouring horizontal and vertical pixels from the image data. The delay lines 21 store portions of the stream of image data to allow the processing by the image signal processor 22 to be performed.
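The role of the delay lines can be modelled as a short sliding window of rows, kept just deep enough for a kernel to read the vertical neighbours of the centre row while the image remains a stream. This is a sketch under assumptions: the function name and the default three-row kernel height are illustrative, not taken from the described hardware.

```python
from collections import deque

def stream_with_delay_lines(rows, kernel_rows=3):
    """Model of delay lines feeding a kernel-based image signal process.

    Keeps only the last `kernel_rows` rows of the stream. Once the window
    is full, yields (centre_row_index, window) so a vertical kernel can be
    applied to the centre row without buffering the whole image."""
    window = deque(maxlen=kernel_rows)
    for i, row in enumerate(rows):
        window.append(row)
        if len(window) == kernel_rows:
            yield i - kernel_rows // 2, list(window)
```

Only `kernel_rows` rows are ever resident, which reflects why delay lines are far cheaper than a full-frame buffer.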
After processing by the image signal processor 22, the stream of processed image data is passed to the second spatial processing component 23. The second spatial processing component 23 receives the stream of processed image data and performs second spatial processing to generate streams of image data corresponding to the desired low-resolution thumbnail and the two regions of interest at the original resolution.
For each pixel received by the second spatial processing unit 23, it may be determined whether the pixel has been subject to horizontal and/or vertical scaling by the first spatial processing component 20 based on its position within the image. If a pixel is in a region of interest, the pixel is added to a stream corresponding to the crop of the region of interest. The same pixel is further subject to horizontal and vertical scaling for addition to a stream corresponding to the low-resolution thumbnail. For pixels outside of the region of interest, if the pixel has already been subject to both horizontal and vertical scaling by the first spatial processing component 20, the pixel is already ready for inclusion in the low-resolution thumbnail image. The pixel is added to the stream corresponding to the low-resolution thumbnail image. If a pixel has been subject to only one of vertical and horizontal scaling, the pixel is subjected to scaling in the other direction and is added to the stream for the low-resolution thumbnail image.
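The routing of pixels into output streams can be sketched as below. This is a simplified model under stated assumptions: scaling is stood in for by keep-one-in-`factor` decimation, and the sketch ignores that in the real pipeline pixels outside the regions of interest arrive already partially scaled by the first spatial processing component.

```python
def route_streams(stream, rois, factor=2):
    """Route each (x, y, value) pixel of a raster-order stream into one
    full-resolution crop stream per region of interest, plus a decimated
    low-resolution thumbnail stream.

    `rois` is a list of (x0, y0, x1, y1) rectangles; scaling is modelled
    as keeping one pixel in every `factor` in each direction."""
    crops = [[] for _ in rois]
    thumbnail = []
    for x, y, value in stream:
        for i, (x0, y0, x1, y1) in enumerate(rois):
            if x0 <= x < x1 and y0 <= y < y1:
                crops[i].append(value)   # kept at original resolution
        if x % factor == 0 and y % factor == 0:
            thumbnail.append(value)      # contributes to the thumbnail
    return crops, thumbnail
```

Note that every pixel is examined exactly once as it streams past, so no frame buffer is required.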
The high-resolution regions of interest that are formed by pixels added to corresponding streams are shown in
The streams of image data are then sent for further processing. The further processing may include storing each stream of image data on a storage device or transmitting the streams of image data over a network.
It should be noted that the above method is performed on a stream of image data. In other words, the image data does not need to be buffered to store a large portion of the image or stored completely in order to complete the processing. In a naive method for generating a low-resolution thumbnail and high-resolution crops from image data, it may be necessary to process the high-resolution image through the image signal processor first, store the entire processed image, crop the relevant regions of interest from the image, and then to subject the entire image to processing to generate the low-resolution thumbnail. Alternatively, crops may be taken from an original image and the original image could be downscaled to a low-resolution image. The high-resolution crops and low-resolution image could then be processed through the image signal processor. The method described above in connection with
The particular example described above involves two regions of interest, which are kept at their original resolution, and the generation of a reduced-resolution thumbnail. The pipeline of components shown in
In further examples, the regions of interest may not be kept at the original resolution and may be subject to upscaling or downscaling depending upon the application. Each region of interest does not need to be subject to the same spatial processing (upscaling, no change, or downscaling). In implementations in which upscaling or downscaling is to be applied to one or more regions of interest, this may be performed by the second spatial processing component, which provides convenient logic for maintaining the rectangular format of the image input to the image signal processor 22. A second embodiment is described below in which the horizontal dimensions of the image input to the image signal processor 22 may be varied. An advantage of the first embodiment is that the image input to the image signal processor 22 is in a rectangular format and so a conventional image signal processor 22 may be used.
A second embodiment has the same device hardware as shown and described in connection with
For each pixel of the streamed image data received by the first spatial processing component 20 a determination is made in a step 80 as to whether the pixel is located within a region of interest. If the pixel is not located within a region of interest, the pixel is subject to horizontal scaling in step 80a to reduce image resolution. If the pixel is located within a region of interest, the pixel is not subject to horizontal scaling in step 80b.
In a second line of logic, which is the same as for the first embodiment, it is determined at step 81 whether or not the pixel is horizontally aligned with a region of interest. If the pixel is not horizontally aligned with a region of interest, then in step 81a the pixel is subject to vertical scaling. If the pixel is aligned with a region of interest, then in step 81b it is not subject to vertical scaling.
An effect of the logic shown in
As noted above, some digital image processing that may be performed by an image signal processor 22 may use a kernel that requires neighbouring horizontal and vertical pixels around a pixel being considered. Accordingly, in some embodiments additional rows of pixels could be sent to the delay lines 21 by the first spatial processing component 20 in order to provide additional image data for the image signal processor to complete its processing. In other words, when a row length changes due to the end or start of a region of interest, the next few rows may be sent to the image signal processor 22 at the previous row length (i.e., subject to scaling as previously required). Next, a few rows preceding the row at which the length changes may be sent having the new row length (i.e., subject to the scaling as required for the subsequent rows) followed by the rows at the new row length. In this way, the image signal processor has the data of neighbouring pixels required for the image signal processing. In practice, the change in row length may be due to change in the resolution of only part of the lines. In such examples, the entire line does not need to be sent to the delay lines. The portions of the line that have changed resolution may be duplicated at the different resolution and sent to the delay line and an index may be provided to the image signal processor that indicates the portion of the line being sent.
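One possible policy for feeding the delay lines around a row-length change can be sketched as follows. This is a deliberately simplified model of the duplication described above: the function name and halo size are assumptions, only rows preceding a change are duplicated, and truncation stands in for whatever rescaling the first spatial processing actually applies.

```python
def feed_delay_lines(rows, lengths, halo=1):
    """Emit (row_index, row_data) pairs for the delay lines, duplicating
    `halo` rows immediately before a row-length change at the upcoming
    length, so a vertical kernel always sees neighbours of a consistent
    length. `lengths[i]` is the post-scaling length of row i; rescaling
    is modelled as simple truncation for illustration."""
    out = []
    for i, row in enumerate(rows):
        out.append((i, row[:lengths[i]]))
        # At a boundary, re-emit the preceding halo rows at the new length.
        if i + 1 < len(rows) and lengths[i + 1] != lengths[i]:
            for j in range(max(0, i - halo + 1), i + 1):
                out.append((j, rows[j][:lengths[i + 1]]))
    return out
```

The duplicated rows cost a small amount of extra bandwidth at each boundary but avoid buffering whole frames to satisfy the kernel's neighbourhood requirements.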
The additional lines or portions of lines may be sent to the delay lines 21 by the first spatial processing component 20. In other implementations, the image signal processor 22 may write the additional lines or portion of lines of image data back to the delay lines with differing resolution before reading them back for subsequent processing.
In further examples, the image signal processor may implement an infinite impulse response (IIR) filter. Such filters feed back the filter output as an input to the IIR filter. IIR filters have applications in noise reduction, edge detection, and autofocus, amongst other applications. In a case that the spatial resolution is varied when using such a filter, the buffer used to feed back the output to the input of the IIR filter may be spatially rescaled in order to maintain consistent operation of the IIR filter.
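The feedback-buffer rescaling can be illustrated with a one-pole vertical IIR filter, y[r] = alpha*x[r] + (1-alpha)*y[r-1], applied down an image whose lines change length. This is a sketch under assumptions: the function names, the nearest-neighbour rescale, and the one-pole form are illustrative choices, not the document's filter.

```python
def rescale_buffer(buf, new_len):
    """Nearest-neighbour rescale of the IIR feedback line buffer so it
    matches the length of the incoming line (illustrative only)."""
    if len(buf) == new_len:
        return buf
    return [buf[int(i * len(buf) / new_len)] for i in range(new_len)]

def vertical_iir(lines, alpha=0.5):
    """Run y[r] = alpha*x[r] + (1-alpha)*y[r-1] down the image. Successive
    lines may have different lengths; the feedback buffer is rescaled to
    the incoming line length before each update so the filter state stays
    spatially consistent."""
    prev = None
    out = []
    for line in lines:
        if prev is None:
            prev = list(line)
        else:
            prev = rescale_buffer(prev, len(line))
            prev = [alpha * x + (1 - alpha) * p for x, p in zip(line, prev)]
        out.append(prev)
    return out
```

Without the rescale step, a length change between lines would misalign the feedback samples against the incoming pixels and corrupt the filter state.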
In the second embodiment, in examples where the regions of interest are controlled to be subject to upscaling or downscaling, pixels in the regions of interest may be subject to the required horizontal scaling by the first spatial processing component 20. The vertical scaling of pixels horizontally aligned with the regions of interest is unchanged by the first spatial processing component 20 and is performed by the second spatial processing component 23 after processing by the image signal processor 22. The vertical scaling remains unchanged because the image signal processor 22 is controlled to process the stream of image data from the first spatial processing line-by-line and so embodiments maintain line-by-line alignment. For pixels in lines in which there is no region of interest, the vertical scaling will be constant. Accordingly, for these regions, the vertical scaling may be performed by the first spatial processing before the processing by the image signal processing component 22.
The pipeline shown in
The logic for when to apply the different levels of scaling may be selected according to the application. For example, in a case of automated number plate recognition, object recognition may provide a location of the number plate region within an image as a region of interest to be kept at the original highest resolution. Other areas of the image may be subject to substantial downscaling in order to reduce processing time by the image signal processor and reduce the size of the image for further processing.
Further embodiments may be implemented on a camera sensor such as in a dedicated mirrorless camera or a smartphone camera. Such cameras may use pixel binning. Pixel binning is a technique that combines data from several adjacent pixels on a camera sensor to produce a single pixel with improved image quality. The use of pixel binning can reduce noise, increase brightness, and enhance details in low-light situations. However, pixel binning also reduces the resolution of the image. In some implementations of pixel binning, the number of pixels is reduced by a factor of four due to use of a quad Bayer array.
The techniques described above may be applied to such a sensor arrangement. For example, the camera may detect which parts of the image are in focus and which parts of the image are out of focus. The out-of-focus portions of the image may be processed at a lower resolution, i.e., after pixel binning, whereas all available pixel values may be kept for regions of the image that are in focus. Such a technique is similar to that described in connection with the second embodiment. In particular, after the image data has been selectively subjected to pixel binning depending upon a determination by the camera, such as for example whether portions of the image are in focus, the image may be subject to processing by an image signal processor. After processing by the image signal processor, the resulting image will have an uneven resolution due to the selective pixel binning. To generate a final image, areas that were subjected to pixel binning may be digitally upscaled back to the original resolution by the second spatial processing component 23.
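The bin-then-upscale round trip can be sketched on a single colour plane. This is an illustrative digital model only: a real quad Bayer sensor bins in the analogue or readout domain, and the nearest-neighbour upscale stands in for whatever interpolation the second spatial processing component uses.

```python
def bin2x2(plane):
    """2x2 binning of one colour plane: average each aligned 2x2 block,
    reducing the pixel count by a factor of four."""
    h, w = len(plane), len(plane[0])
    return [[(plane[y][x] + plane[y][x + 1]
              + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

def upscale2x(plane):
    """Nearest-neighbour upscale back to the original resolution, as the
    second spatial processing component might do for binned regions."""
    out = []
    for row in plane:
        wide = [v for v in row for _ in (0, 1)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                   # duplicate each row
    return out
```

Applying `upscale2x(bin2x2(plane))` restores the original dimensions while each 2x2 block now carries a single averaged value, which is why detail is lost in the binned regions.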
The pixel binning will alter the sampling points because the arrangement of colour filters in a quad Bayer array or the like varies depending upon whether pixel binning is performed. Accordingly, the upscaling performed by the second spatial processing component 23 may include additional processing in order to regularise the pixel data in the final image.
The above-described processing may be advantageous in that it saves power by reducing the amount of processing performed on less important areas of a captured image. In the case of the areas that are not in focus, the loss in quality due to the pixel binning and subsequent upscaling is likely not to be noticeable. It is noted that selective pixel binning for areas that are not in focus is only an example. In other implementations, a person or animal may be recognized in an image prior to capture and the pixels that are not part of the person or animal (i.e., not the main subject of interest) may be subject to pixel binning and then upscaling after processing by the image signal processor.
The techniques above may be applied to different types of image signal processing. Some types of image signal processing, such as radial shading, mesh-based shading, chromatic aberration correction, zone-based auto exposure, and zone-based auto white balance, require use of spatial position in order to apply the signal processing. The processing when applying these image signal processing techniques may be updated to correct pixel positions based on the selective compression being applied. Due to the selective compression, the mapping between the processing geometry and the original frame geometry is non-linear. The geometry is, however, piece-wise linear given the region-of-interest based spatial processing described above. Accordingly, updating the geometry is a straightforward scaling matter.
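The piece-wise linear mapping can be sketched for one image line. In this illustrative model (the function name and segment representation are assumptions), a line is described as ordered segments of (original_width, scale), where regions of interest have scale 1 and downscaled spans have scale below 1; mapping a compressed-frame coordinate back to the original frame is then a per-segment linear rescale.

```python
def processed_to_original_x(px, segments):
    """Map an x coordinate in the selectively compressed frame back to
    the original frame. `segments` is an ordered list of
    (orig_width, scale) pairs across the line; within each segment the
    mapping is linear, so the overall mapping is piece-wise linear."""
    orig_base = 0.0
    proc_base = 0.0
    for orig_width, scale in segments:
        proc_width = orig_width * scale
        if px < proc_base + proc_width:
            return orig_base + (px - proc_base) / scale
        orig_base += orig_width
        proc_base += proc_width
    return orig_base  # beyond the last segment: end of the original line
```

A position-dependent process such as radial shading would apply this mapping to each pixel position before evaluating its spatial correction term.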
The above-described techniques selectively process a stream of image data such that the resulting stream includes one or more regions of interest that are processed differently from areas outside of the one or more regions of interest. These techniques may be used in combination with other techniques such as masking and/or cropping of image data. Where a mask or crop is to be applied to the stream of image data, the mask and/or crop may be applied during the first spatial processing in order to reduce the amount of processing performed by the image signal processor.