This disclosure relates to a camera monitoring system (CMS) for a vehicle, and specifically to a method for processing digital images to reduce computational load in the CMS.
Camera monitoring systems (CMS), such as mirror replacement systems, and camera systems for supplementing mirror views, are utilized in vehicles to enhance the ability of a vehicle operator to see a surrounding environment. CMS utilize one or more cameras to provide an enhanced field of view to a vehicle operator. In some examples, the camera systems cover a larger field of view than a conventional mirror, or include views that are not fully obtainable via a conventional mirror. In other examples, the CMS provides image analysis that can be used for driver assistance systems and for automated or semi-automated vehicle operations.
The images provided via the cameras in the CMS can be utilized by the processors in the CMS to detect aspects of the environment and aspects of the vehicle in an image-processing-based detection process. The image-processing-based perception is computationally intensive, and time consuming.
An exemplary method for processing a digital image includes receiving an image from a camera, inverting color data of the image, bit shifting each pixel in the image such that a total bits per color (BPC) of each pixel is reduced while maintaining data elements of the image, and providing the shifted image to a computer vision system.
In another example of the above described method for processing a digital image bit shifting each pixel comprises reducing the total BPC of each pixel by a predefined number of bits.
In another example of any of the above described methods for processing a digital image bit shifting each pixel comprises reducing the total BPC of each pixel from 8 BPC to 4 BPC.
Another example of any of the above described methods for processing a digital image further includes iterating the method of claim 1 for a set of sequential images in a video feed.
In another example of any of the above described methods for processing a digital image bit shifting each pixel in the image such that the total BPC of each pixel is reduced comprises adjusting a number of BPC by which each pixel is reduced based on a feedback loop for each sequential image in the set of sequential images.
In another example of any of the above described methods for processing a digital image inverting the color data comprises performing a non-linear logarithmic transformation on pixel bit depth data using a log_N(x) formula, where N is a log base value in the range of 2-10.
In another example of any of the above described methods for processing a digital image bit shifting each pixel in the image comprises shifting image detail bits from a most significant image bit to a least significant image bit.
In another example of any of the above described methods for processing a digital image the shifted image includes sufficient data elements to perform at least one operation of the computer vision system.
In another example of any of the above described methods for processing a digital image the data elements include contrast and edge lines.
In another example of any of the above described methods for processing a digital image the shifted image includes reduced color and/or pattern data.
In one exemplary embodiment a camera monitoring system (CMS) for a vehicle includes at least one camera providing a video feed to a CMS controller, the CMS controller including a memory storing instructions for causing the CMS controller to perform a video feed pre-processing method configured to bit shift each pixel in the video feed such that a total bits per color (BPC) of each pixel is reduced while maintaining data elements of the image, and provide the reduced BPC video feed to at least one computer vision system.
In another example of the above described CMS for a vehicle the CMS controller is further configured to invert color data of the video feed prior to performing the bit shifting.
In another example of any of the above described CMSs for a vehicle inverting the color data comprises performing a non-linear logarithmic transformation on pixel bit depth data.
In another example of any of the above described CMSs for a vehicle the video feed is an RGB 888 video feed.
In another example of any of the above described CMSs for a vehicle bit shifting each pixel in the video feed comprises shifting the video feed from the RGB 888 video feed to an RGB 444 video feed.
In another example of any of the above described CMSs for a vehicle bit shifting each pixel in the video feed comprises reducing the video feed from the RGB 888 video feed to an RGB feed having a pixel size of less than 8 bits, and wherein the resultant pixel size is variable.
In another example of any of the above described CMSs for a vehicle the resultant pixel size is controlled via a feedback loop including the bit shifting operation and a feedback output from the computer vision system.
These and other features of the present invention can be best understood from the following specification and drawings, the following of which is a brief description.
The disclosure can be further understood by reference to the following detailed description when considered in connection with the accompanying drawings wherein:
The embodiments, examples and alternatives of the preceding paragraphs, the claims, or the following description and drawings, including any of their various aspects or respective individual features, may be taken independently or in any combination. Features described in connection with one embodiment are applicable to all embodiments, unless such features are incompatible.
A schematic view of a commercial vehicle 10 is illustrated in
Each of the camera arms 16a, 16b includes a base that is secured to, for example, the cab 12. A pivoting arm is supported by the base and may articulate relative thereto. At least one rearward facing camera 20a, 20b is arranged respectively within camera arms. The exterior cameras 20a, 20b respectively provide an exterior field of view FOVEX1, FOVEX2 that each include at least one of the Class II and Class IV views (
In one example, in addition to the cameras 20a, 20b in the camera arms 16a, 16b, the CMS includes at least a rear facing camera 60, and an internal trailer camera 62. The rear facing camera 60 captures a view of the exterior of the trailer 14, while the interior camera 62 captures a view of the interior of the trailer 14, including objects that are loaded into the trailer 14. In alternate examples, either camera 60, 62 can be a camera incorporated in a secondary system connected to the CMS 15, and configured to provide video feed to the controller 15, and the following description can function similarly.
First and second video displays 18a, 18b are arranged on each of the driver and passenger sides within the vehicle cab 12 on or near the A-pillars 19a, 19b to display Class II and Class IV views on its respective side of the vehicle 10, which provide rear facing side views along the vehicle 10 that are captured by the exterior cameras 20a, 20b.
If video of Class V and/or Class VI views are also desired, a camera housing 16c and camera 20c may be arranged at or near the front of the vehicle 10 to provide those views (
If video of Class VIII views is desired, camera housings can be disposed at the sides and rear of the vehicle 10 to provide fields of view including some or all of the class VIII zones of the vehicle 10. In such examples, the third display 18c can include one or more frames displaying the class VIII views. Alternatively, additional displays can be added near the first, second and third displays 18a, 18b, 18c and provide a display dedicated to providing a class VIII view.
The CMS 15 uses the images generated by the cameras 20a, 20b, 20c for both mirror replacement/supplement views and for object detection and driver assistance features. While the mirror replacement/supplement views require color and pattern data for the vehicle operator, it is appreciated that for edge detection, automated driver assistance systems, object detection, as well as other digital image analysis, certain image aspects and characteristics that humans rely on for differentiating objects (e.g., color variations and patterns) do not benefit the computer analysis. This is true because machines such as computer processors do not “see” images the way humans see the images. Rather, machines see contrast(s) between objects and features, and utilize those contrasts to analyze the image.
In existing systems, image enhancement and modification is typically configured with human vision in mind. As a result, the enhancement techniques are designed to maintain and enhance colors and patterns that are helpful in human analysis. These colors and patterns usually provide little, if any, benefit to computer analysis and techniques that maintain or enhance them are wasted. In another phrasing of this concept, the enhancement techniques used by existing systems are aesthetic and detail focused while computer analysis systems are data focused.
In contrast to the existing enhancement techniques, the CMS 15 described herein uses an image processing method that reduces a bit size of each pixel in the image by reducing pixel quantization levels. Reducing the bit size of each pixel enables quicker processing of video frames, at a lower computational load, using edge contrast transfer functions and similar computer analysis. The reduced pixel quantization is achieved by shifting detail bits from a higher end of an image spectrum (a most significant bit) to a lower end of the image spectrum (a least significant bit) by compressing color data. This shift reduces the amount of data contained within each pixel without removing or altering the image details, such as contrast, required for the CMS based digital analysis.
With continued reference to
Initially, the RGB 888 image 210 is processed using an image enhancement portion 220 that tone maps the image. Tone mapping is a technique used in image processing and computer graphics to map one color, or set of colors, to another. Typically, tone mapping is used in image processing for human vision to approximate the appearance of high-dynamic color range images within a medium that has a more limited dynamic range. The tone mapping algorithm compresses a top band of data and a bottom band of data for each pixel into the middle band, while still maintaining distinct separations between objects. This compression effectively condenses color within the image without substantially altering contrast data. The tone mapping results in less color variation and can result in loss of pattern detail in the image.
The tone mapping algorithm initially performs a non-linear logarithmic transformation on the pixel bit depth data. In the tone mapping algorithm, the image signals are compressed using a log_N(x) formula, where N is the log base value, usually in the range of 2-10, and x is the image signal value of each pixel. Each step in N increases the compression. The non-linear logarithmic transformation preserves the lower bit values and compresses the high bit depth values (e.g. the top and bottom bands) while keeping the middle spectrum of bits spread according to a linear model.
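As an illustration only, the logarithmic compression described above can be sketched in Python. The function name `log_compress`, the default base of 2, and the `+ 1` offset (used so that a signal value of 0 maps to 0 rather than to an undefined logarithm) are assumptions of this sketch and are not taken from the disclosure.

```python
import math

def log_compress(pixels, base=2):
    """Apply log_N(x + 1) to each 8-bit channel value.

    The +1 offset is an assumption of this sketch, chosen so that
    x = 0 maps to 0 instead of being undefined.
    """
    if base <= 1:
        raise ValueError("log base must be greater than 1")
    return [math.log(x + 1, base) for x in pixels]

# Hypothetical sample of 8-bit channel values.
row = [0, 1, 3, 15, 255]
compressed = log_compress(row, base=2)
```

In this sketch the step from 15 to 255 occupies about the same span of the log domain as the step from 0 to 15, which illustrates how the high values are compressed relative to the low values.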
After applying the tone mapping, the image is normalized using a normalization function. The normalization includes converting the log converted image values back to full bit image values.
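One possible form of such a normalization, offered as a sketch rather than as the normalization function of the disclosure, rescales the log-domain values linearly so that the largest possible log value maps back to 255. The function name `normalize_log_values`, the linear rescaling, and the `+ 1` offset in the sample log values are assumptions.

```python
import math

def normalize_log_values(log_values, base=2, max_value=255):
    """Rescale log-domain values to integers in the range 0..max_value.

    Assumes the log values were produced as log_base(x + 1) for 8-bit x,
    so the largest possible input is log_base(max_value + 1).
    """
    log_max = math.log(max_value + 1, base)
    return [min(max_value, round(v / log_max * max_value)) for v in log_values]

# Hypothetical log-converted values for the 8-bit inputs 0, 15 and 255.
log_row = [math.log(x + 1, 2) for x in (0, 15, 255)]
normalized = normalize_log_values(log_row)
```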
The output of the normalization function 230 is provided to an inverter 240. The inverter 240 inverts the color values of the image by inverting the data according to M=(N−255); N>=0 where M is the inverted output and N is the logarithmic input (i.e., the normalized value of the output of the enhancement process 220.) The inversion function highlights the edge information within the image and clips out non-essential color information from the image, as the non-essential information typically occupies top order bits in typical image processing formats.
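A minimal sketch of a per-channel inversion follows. It uses the conventional 8-bit complement, 255 − N, which equals the magnitude of the expression above for N between 0 and 255; treating the inverted value as non-negative is an assumption of this sketch, as are the function names `invert_channel` and `invert_pixel`.

```python
def invert_channel(value, max_value=255):
    """Return the 8-bit complement of a single channel value."""
    if not 0 <= value <= max_value:
        raise ValueError("channel value out of range")
    return max_value - value

def invert_pixel(rgb):
    """Invert each channel of an (R, G, B) tuple."""
    return tuple(invert_channel(c) for c in rgb)

# Hypothetical normalized pixel.
inverted = invert_pixel((0, 100, 255))
```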
After inverting the image data, high frequency data is removed from the image using a bit shift process 250. The bit shift process 250 reduces the number of quantization levels in each pixel for the RGB image from 8 bpc to as few bits as possible while still retaining the data elements (e.g., contrast and edge information) of the image.
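The reduction from 8 bpc to 4 bpc can be illustrated with a right shift of each channel. The sketch below is not the bit shift process 250 itself; the function names `shift_pixel` and `quantization_levels` and the sample pixel are hypothetical.

```python
def shift_pixel(rgb, shift=4):
    """Right-shift each 8-bit channel, discarding the `shift` low-order bits."""
    if not 0 <= shift < 8:
        raise ValueError("shift must be between 0 and 7 for 8-bit channels")
    return tuple(c >> shift for c in rgb)

def quantization_levels(shift, bits=8):
    """Number of distinct values per channel remaining after the shift."""
    return 2 ** (bits - shift)

# Hypothetical RGB 888 pixel reduced to RGB 444.
pixel_888 = (255, 155, 16)
pixel_444 = shift_pixel(pixel_888, shift=4)
levels = quantization_levels(4)
```

With a shift of four bits, each channel retains 16 of its original 256 quantization levels, so a pixel occupies 12 bits instead of 24.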
Each bit shift 352, 354, 356, 358 reduces the image luminosity level while preserving all of the required information from the image that can be used to detect objects and people in the environment. With three channel inputs (RGB), enough meaningful redundant data is retained to fully operate all computer based vision processing systems. Human centric information, such as color gradient and pattern, is reduced or eliminated; however, this reduction does not impact the ability of computer based vision systems to function.
In some examples, such as the bit shift operation 450 illustrated in
In addition to the conventional output 462, the machine vision system 460 provides a feedback output 464 to the variable bit shift operation 450. In one example, the feedback output 464 can be one of three states: high, low, or good. A high state indicates that the resolution provided to the machine vision operation 460 is higher than necessary and additional bit shifting can occur in each sequential operation of the bit shifting process 450. A low state indicates that the resolution provided is low enough that data elements have been lost due to too much bit shifting and the variable bit shift operation should perform less bit shifting. A good state indicates that the amount of bit shifting should not be increased or decreased. The amount of the shift performed by the bit shift operation 450 is then adjusted according to the state of the feedback signal 464.
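The three-state feedback described above can be sketched as a small update rule applied once per frame. The function name `adjust_shift`, the step size of one bit per frame, and the bounds of 0 and 7 bits are assumptions of this sketch and are not specified in the disclosure.

```python
def adjust_shift(current_shift, feedback, min_shift=0, max_shift=7):
    """Return the shift amount for the next frame given the feedback state."""
    if feedback == "high":
        # Resolution is higher than necessary: shift by one more bit.
        return min(current_shift + 1, max_shift)
    if feedback == "low":
        # Data elements were lost: shift by one fewer bit.
        return max(current_shift - 1, min_shift)
    if feedback == "good":
        # Leave the amount of bit shifting unchanged.
        return current_shift
    raise ValueError(f"unknown feedback state: {feedback!r}")

# Hypothetical sequence of feedback states over four sequential frames.
shift = 4
for state in ("high", "high", "low", "good"):
    shift = adjust_shift(shift, state)
```

Clamping the shift to a fixed range keeps the loop from removing every bit of a channel or from requesting a negative shift.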
In yet further variations on the feedback loop 464 illustrated in
Although an example embodiment has been disclosed, a worker of ordinary skill in this art would recognize that certain modifications would come within the scope of the claims. For that reason, the following claims should be studied to determine their true scope and content.