Images for perception modules of autonomous vehicles

Description

TECHNICAL FIELD

This document relates to reducing the data complexity of autonomous vehicle images for analysis and the bandwidth required to transfer them.

BACKGROUND

Autonomous vehicle navigation is a technology for sensing the position and movement of a vehicle, and based on the sensing, autonomously controlling the vehicle to navigate towards a destination. Autonomous vehicle navigation can have important applications in transportation of people, goods and services. One of the components of autonomous driving, which ensures the safety of the vehicle and its passengers, as well as people and property in the vicinity of the vehicle, is analysis of images taken from vehicle cameras. The images may be used to determine fixed or moving obstacles in the path of an autonomous vehicle.

SUMMARY

Disclosed are devices, systems and methods for processing images of an area surrounding a vehicle. In some embodiments, light detection and ranging (LiDAR) sensors may be used to acquire the images based on reflections captured from the surrounding area. In one aspect, a method for processing an image taken from an autonomous vehicle is disclosed. The method includes receiving a raw image from a camera, the raw image including three values for each pixel, one for each of three primary colors, and generating a reduced image by selecting one of the three values for each pixel in the raw image and discarding the other two values, wherein the selecting is performed in a pattern. The method may further include performing preprocessing on the reduced image, and/or performing perception on the preprocessed image to determine one or more outlines of physical objects in a vicinity of the autonomous vehicle.

The method may further include the following features in any combination. The selecting may reduce a data size of the raw image by ⅔. The pattern may be a Bayer pattern. The Bayer pattern may be a red-green-green-blue pattern assigned to a 2-pixel by 2-pixel array repeated across the raw image. The pattern may include a greater number of green pixel values than both red and blue. The pattern may be selected such that a value of every other pixel along a row in the reduced image corresponds to a green value of the raw image. The pattern may be selected such that a value of every other pixel along a column in the reduced image corresponds to a green value of the raw image. Generating the reduced image may be performed using one or more color-selective filters. Each x-y value in the pattern may be from one of three possible values. The preprocessing may be performed on the image from the sensor array without human perception image enhancement. The human perception image enhancement may include one or more of de-mosaicing, white balancing, and noise reduction. The preprocessing may not include scaling one or more pixels' R, G, or B value for white balancing. The preprocessing may not include reconstructing a full color image from incomplete color samples output from the sensor array overlaid with a color filter array for de-mosaicing. The preprocessing may not include noise reduction, wherein noise reduction includes reduction of salt and pepper noise, wherein a noisy pixel bears little relation to the color of surrounding pixels, or reduction of Gaussian noise. The preprocessing may include image cropping. The preprocessing may include image resizing. The preprocessing may include image compression. The sensor array may be a camera.

In another aspect, the above-described method is embodied in the form of executable code stored in a computer-readable program medium.

In yet another aspect, a device that is configured or operable to perform the above-described method is disclosed. The device may include a processor that is programmed to implement this method.

The above and other aspects and features of the disclosed technology are described in greater detail in the drawings, the description and the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts a flowchart for processing an image for use by an autonomous vehicle, in accordance with some example embodiments;

FIG. 2 depicts an example of reducing the data size of a color image for use by an autonomous vehicle, in accordance with some example embodiments;

FIG. 3A depicts examples of Bayer images at different zooms to show the patterns;

FIG. 3B depicts an example of a Bayer pattern image in greyscale;

FIG. 4 depicts an example of a contrast image and a perception result, in accordance with some example embodiments; and

FIG. 5 depicts an example of an apparatus, in accordance with some example embodiments.

DETAILED DESCRIPTION

Pictures taken by still or video cameras are typically intended for human viewing. The pictures are in full color with high resolution in three primary colors such as red (R), green (G) and blue (B). Image processing, including vision tasks such as object detection, semantic segmentation, and others, typically uses processed, well-rendered images that are intended for human eyes. However, images for use by machines do not need to have the same characteristics as images intended for human viewing. For example, de-mosaicing, white balancing, color reduction, and other image processing may not be necessary for machine vision used for autonomous vehicles. Since the foregoing image processing tasks do not add information, they may not be needed or useful for machine images. Omitting some processing tasks, such as white balancing, which can cause overexposure, can also improve the images. Moreover, the amount of data required to represent an image for machine use is less than the full color RGB representation of an image.

In some example embodiments, the RGB information for each pixel in an image may be reduced so that instead of each pixel having intensity values for each of R, G, and B, each pixel has only one intensity corresponding to R, G or B. A predetermined pattern of R, G, and B may be assigned to the array of pixels in an image. By reducing the number of intensity values per pixel from three to one, the data required to represent the image is reduced to ⅓ of the data needed for a full color RGB image, while maintaining the color sensitivity needed for colored objects in the image. Reducing the amount of data needed to represent the image reduces the bandwidth needed to transfer the image in a fixed amount of time or allows the image to be transferred or processed in less time. Both of these improve machine vision performance and responsiveness. As an illustrative example, a camera image with a size of 200 pixels by 300 pixels has 60,000 pixels. If each pixel is represented by an 8-bit red (R) value, an 8-bit green (G) value, and an 8-bit blue (B) value, the total number of bits required to represent the color image is 300 (pixels)×200 (pixels)×8 (bits)×3 (colors)=1,440,000 bits. By applying the pattern to select one of R, G, or B for each pixel, the number of bits needed to represent the image is reduced to 480,000 bits. The foregoing is an example for illustrative purposes. Other numbers of pixels per image or bits of resolution per color can also be used.
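The arithmetic in the illustrative example can be checked with a short Python sketch. The function name `image_bits` and its parameters are hypothetical and are introduced here only for illustration; the sketch assumes an uncompressed image with a fixed number of bits per value.

```python
def image_bits(width, height, bits_per_value, values_per_pixel):
    """Return the number of bits needed to store an uncompressed image."""
    return width * height * bits_per_value * values_per_pixel


# Full color image: three values (R, G, and B) per pixel.
full_rgb_bits = image_bits(300, 200, 8, 3)

# Reduced image: one value (R, G, or B) per pixel.
reduced_bits = image_bits(300, 200, 8, 1)

print(full_rgb_bits)  # 1440000
print(reduced_bits)   # 480000
```

Other image sizes or bit depths can be checked by changing the arguments; the ratio between the two results stays at three to one.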

FIG. 1 depicts a flowchart 100 for processing an image for use by an autonomous vehicle, in accordance with some example embodiments. At 110, an image is received. At 120, the data required to represent the raw image is reduced. At 130, the reduced image is preprocessed. At 140, perception is performed on the preprocessed image. At 150, the perception result is provided as an output.

At 110, an image is received from a camera, a LiDAR, or another image generating device. For example, an image from a solid-state camera such as a charge coupled device (CCD) camera is received. The camera may have separate outputs for R, G, and B or may have a composite output. As an example, R, G, and B may each be represented by 8-bit luminance values or may be represented by analog voltages. In another example, the image may be from a multi-spectral light detection and ranging (LiDAR) sensor. Each “color” may be represented by an 8-bit or another bit resolution value.

At 120, the data required to represent the raw image is reduced. For example, the RGB information for each pixel in an image may be reduced so that instead of each pixel having intensity values for each of R, G, and B, each pixel has only one intensity corresponding to R, G or B. By reducing the number of intensity values from three to one, the data required to represent the image is reduced to ⅓ of the data needed for a full color RGB image, while maintaining the color sensitivity needed for colored objects in the image. Pixels may be selected to be R, G, or B based on a pattern such as a Bayer pattern, which is further detailed with respect to FIGS. 2 and 3A.

At 130, the reduced image is preprocessed. The preprocessing is minimized to increase processing speed and reduce computational complexity. For example, demosaicing and white balancing may be eliminated and basic pre-processing such as image cropping and resizing may be maintained.
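The document does not say how cropping or resizing is carried out on the one-channel image. The following Python sketch shows one possible approach under a stated assumption: crop offsets and sizes are snapped to multiples of two, and resizing keeps whole 2×2 cells, so that the repeating 2×2 pattern stays aligned. The function names `crop_mosaic` and `downscale_mosaic` are hypothetical.

```python
def crop_mosaic(mosaic, top, left, height, width):
    """Crop a one-channel mosaic given as a list of rows.

    Assumption of this sketch: offsets and sizes are rounded down to
    multiples of 2 so the 2x2 pattern keeps the same phase after cropping.
    """
    top -= top % 2
    left -= left % 2
    height -= height % 2
    width -= width % 2
    return [row[left:left + width] for row in mosaic[top:top + height]]


def downscale_mosaic(mosaic):
    """Halve each dimension by keeping every other 2x2 cell intact."""
    kept_rows = [row for y, row in enumerate(mosaic) if (y // 2) % 2 == 0]
    return [[v for x, v in enumerate(row) if (x // 2) % 2 == 0]
            for row in kept_rows]


# An 8x8 test mosaic whose values encode their position as 10*y + x.
mosaic = [[10 * y + x for x in range(8)] for y in range(8)]

cropped = crop_mosaic(mosaic, top=1, left=3, height=5, width=5)
small = downscale_mosaic(mosaic)

print(cropped[0])  # [2, 3, 4, 5]
print(small[0])    # [0, 1, 4, 5]
```

Keeping whole cells is only one design choice; it avoids any interpolation between values that belong to different colors.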

At 140, perception is performed on the preprocessed image. Perception results include object bounding boxes. See, for example, FIG. 4 at 420. The performed perception is modified since the input images are not three channels for RGB, but are instead reduced to one channel of R, G, or B for each pixel. The perception is computerized and may occur at real time speeds in a moving vehicle without human assistance or feedback.

At 150, the perception result is provided as an output. The output may be used by further image processing tasks related to identifying objects and controlling the vehicle to avoid the objects.
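The flow of operations 110 through 150 can be sketched in Python as a chain of stages. This is a minimal illustration, not the disclosed implementation: `run_pipeline` and the three placeholder stages are hypothetical names, and the placeholder perception stage is a stand-in that reports one box covering the frame rather than a real detector.

```python
def run_pipeline(raw_image, reduce_fn, preprocess_fn, perceive_fn):
    """Chain the operations of flowchart 100; each stage is a callable."""
    reduced = reduce_fn(raw_image)           # 120: three values per pixel -> one
    preprocessed = preprocess_fn(reduced)    # 130: e.g., cropping or resizing
    return perceive_fn(preprocessed)         # 140 and 150: perception output


# Placeholder stages, for illustration only.
def keep_rggb(rgb_image):
    # Index 0 = R, 1 = G, 2 = B; (y % 2) + (x % 2) yields R, G / G, B.
    return [[pixel[(y % 2) + (x % 2)] for x, pixel in enumerate(row)]
            for y, row in enumerate(rgb_image)]


def crop_top_rows(mosaic, rows=2):
    return mosaic[:rows]


def whole_frame_box(mosaic):
    # Stand-in for a detector: one (x, y, width, height) box over the frame.
    return [(0, 0, len(mosaic[0]), len(mosaic))]


raw = [[(10, 20, 30)] * 4 for _ in range(4)]   # 110: received image
boxes = run_pipeline(raw, keep_rggb, crop_top_rows, whole_frame_box)
print(boxes)  # [(0, 0, 4, 2)]
```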

Advantages of the disclosed techniques include the generation of a one-channel image compared to the three channels (RGB) of usual images. This reduces the space required for storage by ⅔, and reduces the data rate or time for transmission, or a combination of data rate and time for transmission. A second advantage is reduced computational requirements because the image has less data to process and many preprocessing steps are eliminated. Another advantage is that the reduced image can improve perception performance because the raw data, even though reduced, retains more information due to the reduced preprocessing (e.g., omitting white balancing, which may cause overexposure).

FIG. 2 at 210 depicts an example of a pattern of colors assigned to pixels. Full color images contain three channels: red (R), green (G), and blue (B). FIG. 2 includes labels, “R” for red, “B” for blue, and “G” for green, to clearly indicate the colors of the pixels and the colors of light indicated at 220. For every pixel in a typical image, there are three corresponding values. Together, they mix into the ‘color point’ seen in the digital image. As described above, instead of three values per pixel, the number of values may be reduced to one by choosing R, G, or B for each pixel. The choice of which color is assigned to each pixel may follow a pattern. For example, a Bayer pattern image has only one channel but encodes a subset of the information of the three RGB channels in the one channel. For example, a repeating pattern of 2×2 pixel grids may be selected. In some example embodiments, an ‘RGGB’ Bayer pattern may be used. RGGB refers to red for pixel 211, green for pixel 212, green for pixel 213, and blue for pixel 214 in a repeating 4-pixel square pattern as shown at 210. Different patterns may be selected depending on the hardware. In this way, of the three values for pixel 211 corresponding to an R value, a G value, and a B value, the R value is selected and the G and B values are discarded. This same elimination process is applied to each pixel. The pattern may include a greater number of green pixel values than both red and blue. The pattern may be selected such that a value of every other pixel along a row in the reduced image corresponds to a green value of the raw image, or such that a value of every other pixel along a column in the reduced image corresponds to a green value of the raw image. In a similar way as described above with respect to more frequent green pixels, red or blue pixels may instead be more frequent than green.
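The selection of one value per pixel according to a repeating 2×2 pattern can be illustrated with a short Python sketch. The names `RGGB` and `reduce_to_mosaic` are hypothetical, and the placement of red at the top-left of the 2×2 cell is an assumption of this sketch about how pixels 211 through 214 are laid out.

```python
# Assumed layout of the repeating 2x2 cell: index 0 = R, 1 = G, 2 = B.
RGGB = ((0, 1),
        (1, 2))


def reduce_to_mosaic(rgb_image, pattern=RGGB):
    """Keep one of the three values per pixel according to a repeating pattern.

    rgb_image is a list of rows; each row is a list of (R, G, B) tuples.
    Returns a list of rows holding a single intensity per pixel.
    """
    rows = len(pattern)
    cols = len(pattern[0])
    return [
        [pixel[pattern[y % rows][x % cols]] for x, pixel in enumerate(row)]
        for y, row in enumerate(rgb_image)
    ]


# A 2x4 test image in which every pixel is (R=10, G=20, B=30).
image = [[(10, 20, 30)] * 4 for _ in range(2)]
mosaic = reduce_to_mosaic(image)
print(mosaic)  # [[10, 20, 10, 20], [20, 30, 20, 30]]
```

In the output, every other value along each row and each column is a green value, and green values are twice as frequent as red or blue values. A different pattern tuple can be passed in to model other arrangements.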

FIG. 2 at 220 shows the selection process from three values to one for an example pixel and the corresponding mosaic of each color selection on an image. Generating the reduced image may be performed using multiple color-selective filters. An arrangement of such filters may be referred to as a color filter array (CFA).

A Bayer pattern is an example of a color filter array (CFA). In some example embodiments, a color filter array different from a Bayer pattern can be used. Generally, the red, green, and blue colors used in a Bayer pattern can be transformed into another group of three colors where different combinations of the other group of three colors can be combined to cause the appearance of all other visible colors just as red, green, and blue can. Furthermore, a color filter array in some example embodiments may include four instead of three basic colors. For example, a patterned CYGM filter (cyan, yellow, green, magenta) can be used, or a patterned RGBE filter (red, green, blue, emerald) can be used as a CFA. Moreover, in some example embodiments, a CFA may add pixels that are not color filtered including a CMYW (cyan, magenta, yellow, and white) CFA.

FIG. 3A at 305 and 310 depicts example images with an RGGB Bayer pattern applied to an original full color image. The image at 310 is zoomed in to better show the pattern. The image at 310 is a different image from the images at 305 and 320 and from the image in FIG. 4. The image at 305 with the RGGB Bayer pattern is the same image as in FIG. 3B, with the R, G, or B value maintained as one color per pixel. FIG. 3B at 320 depicts a Bayer pattern image with the value for each R, G, and B in the pattern mapped to a gray scale.

FIG. 4 at 410 depicts a rendering result after image processing is performed on the image shown at 320. FIG. 4 at 420 depicts a perception result showing object detection on the Bayer pattern image 305.

Some example implementations may be described using the following examples.

1. A method for processing an image, comprising: receiving an image including an x-y array of pixels from a sensor array, each pixel in the x-y array of pixels having a value selected from one of three primary colors, based on a corresponding x-y value in a mask pattern; generating a preprocessed image by performing preprocessing on the image; and performing computerized perception on the preprocessed image to determine one or more outlines of physical objects.

2. The method of example 1, wherein the mask pattern is a Bayer pattern.

3. The method of example 2, wherein the Bayer pattern is a red-green-green-blue pattern assigned to a 2-pixel by 2-pixel array repeated across the raw image.

4. The method of example 1, wherein the pattern includes a greater number of green pixel values than both red and blue.

5. The method of example 1, wherein the pattern is selected such that a value of every other pixel along a row in the reduced image corresponds to a green value of the raw image.

6. The method of example 1, wherein the pattern is selected such that a value of every other pixel along a column in the reduced image corresponds to a green value of the raw image.

7. The method of example 1, wherein each x-y value in the pattern is from one of three possible values.

8. The method of example 1, wherein the image is generated using one or more color-selective filters.

9. The method of example 1, wherein the preprocessing is performed on the image from the sensor array without human perception image enhancement.

10. The method of example 9, wherein the human perception image enhancement includes one or more of de-mosaicing, white balancing, and noise reduction.

11. The method of example 1, wherein the preprocessing does not include scaling one or more pixels' R, G, or B value for white balancing.

12. The method of example 1, wherein the preprocessing does not include reconstructing a full color image from incomplete color samples output from the sensor array overlaid with a color filter array for de-mosaicing.

13. The method of example 1, wherein the preprocessing does not include noise reduction, wherein noise reduction includes reduction of salt and pepper noise, wherein a noisy pixel bears little relation to the color of surrounding pixels, or reduction of Gaussian noise.

14. The method of example 1, wherein the preprocessing includes image cropping.

15. The method of example 1, wherein the preprocessing includes image resizing.

16. The method of example 1, wherein the preprocessing includes image compression.

17. The method of example 1, wherein the sensor array is a camera.

18. A computer apparatus comprising a processor, a memory and a communication interface, wherein the processor is programmed to implement a method recited in one or more of examples 1 to 17.

19. A computer readable program medium having code stored thereon, the code, when executed by a processor, causing the processor to implement a method recited in one or more of examples 1 to 17.

FIG. 5 depicts an example of an apparatus 500 that can be used to implement some of the techniques described in the present document. For example, the hardware platform 500 may implement the process 100 or may implement the various modules described herein. The hardware platform 500 may include a processor 502 that can execute code to implement a method. The hardware platform 500 may include a memory 504 that may be used to store processor-executable code and/or store data. The hardware platform 500 may further include a communication interface 506. For example, the communication interface 506 may implement one or more of the communication protocols (LTE, Wi-Fi, and so on) described herein.

Implementations of the subject matter and the functional operations described in this patent document can be implemented in various systems, digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Implementations of the subject matter described in this specification can be implemented as one or more computer program products, e.g., one or more modules of computer program instructions encoded on a tangible and non-transitory computer readable medium for execution by, or to control the operation of, data processing apparatus. The computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term “data processing unit” or “data processing apparatus” encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.

A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.

The processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit).

Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read only memory or a random-access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks. However, a computer need not have such devices. Computer readable media suitable for storing computer program instructions and data include all forms of nonvolatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.

While this patent document contains many specifics, these should not be construed as limitations on the scope of any invention or of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular inventions. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.

Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.

Only a few implementations and examples are described and other implementations, enhancements and variations can be made based on what is described and illustrated in this patent document.

Claims

1. A method for processing an image, comprising: receiving an image including an array of pixels from a sensor, each pixel in the array of pixels having a value selected from one of three primary colors based on a corresponding position value in a mask pattern; generating a preprocessed image by performing preprocessing on the image, wherein the preprocessing excludes a reduction of salt and pepper noise, or another reduction of Gaussian noise; and performing perception on the preprocessed image.
2. The method of claim 1, wherein the mask pattern is a Bayer pattern.
3. The method of claim 2, wherein the Bayer pattern is a red-green-green-blue pattern assigned to a 2-pixel by 2-pixel array repeated across a raw image.
4. The method of claim 1, wherein the mask pattern includes a greater number of green pixel values than both red and blue.
5. The method of claim 1, wherein the mask pattern is selected such that a value of every other pixel along a row in a reduced image corresponds to green value of a raw image.
6. The method of claim 1, wherein the mask pattern is selected such that a value of every other pixel along a column in a reduced image corresponds to a green value of a raw image.
7. The method of claim 1, wherein, each position value in the mask pattern is from one of three possible values.
8. The method of claim 1, wherein the image is generated using one or more color-selective filters.
9. The method of claim 1, wherein the preprocessing is performed on the image from the sensor without human perception image enhancement.
10. The method of claim 9, wherein the human perception image enhancement includes one or more of de-mosaicing, white balancing, and noise reduction.
11. The method of claim 1, wherein the preprocessing excludes scaling for white balancing.
12. The method of claim 1, wherein the preprocessing excludes reconstruction of a full color image from incomplete color samples output from the sensor overlaid with a color filter for de-mosaicing.
13. An apparatus comprising: at least one processor and memory including executable instructions that when executed perform operations comprising: receiving an image including an array of pixels from a sensor, each pixel in the array of pixels having a value selected from one of three primary colors based on a corresponding position value in a mask pattern; generating a preprocessed image by performing preprocessing on the image, wherein the preprocessing excludes a reduction of salt and pepper noise, or another reduction of Gaussian noise; and performing perception of the preprocessed image.
14. The apparatus of claim 13, wherein the mask pattern is a Bayer pattern.
15. The apparatus of claim 14, wherein the Bayer pattern is a red-green-green-blue pattern assigned to a 2-pixel by 2-pixel array repeated across a raw image.
16. The apparatus of claim 13, wherein the mask pattern includes a greater number of green pixel values than both red and blue.
17. The apparatus of claim 13, wherein the at least one processor selects the mask pattern such that a value of every other pixel along a row in a reduced image corresponds to a green value of a raw image.
18. A non-transitory computer readable medium having code stored thereon, the code, when executed by a processor, causing the processor to implement a method comprising: receiving an image including an array of pixels from a sensor, each pixel in the array of pixels having a value selected from one of three primary colors based on a corresponding position value in a mask pattern; and performing perception on a preprocessed image that excludes a reduction of salt and pepper noise, or another reduction of Gaussian noise.
19. The non-transitory computer readable medium of claim 18, wherein preprocessing is performed on the image from the sensor without human perception image enhancement.
20. The method of claim 1, wherein the performing the perception on the preprocessed image determines one or more outlines of objects.

CROSS-REFERENCE TO RELATED APPLICATIONS

This patent document is a continuation of U.S. application Ser. No. 16/381,707, entitled, “IMAGES FOR PERCEPTION MODULES OF AUTONOMOUS VEHICLES,” filed on Apr. 11, 2019, which claims priority to, and the benefit of U.S. Provisional Patent Application No. 62/656,924, entitled “IMAGES FOR PERCEPTION MODULES OF AUTONOMOUS VEHICLES,” filed on Apr. 12, 2018. The entire contents of the above patent applications are incorporated by reference as part of the disclosure of this patent document.

20160008988	Kennedy	Jan 2016	A1
20160026787	Nairn et al.	Jan 2016	A1
20160037064	Stein	Feb 2016	A1
20160094774	Li	Mar 2016	A1
20160118080	Chen	Apr 2016	A1
20160129907	Kim	May 2016	A1
20160165157	Stein	Jun 2016	A1
20160210528	Duan	Jul 2016	A1
20160224850	Silver	Aug 2016	A1
20160275766	Venetianer et al.	Sep 2016	A1
20160321381	English	Nov 2016	A1
20160334230	Ross et al.	Nov 2016	A1
20160342837	Hong et al.	Nov 2016	A1
20160347322	Clarke et al.	Dec 2016	A1
20160375907	Erban	Dec 2016	A1
20170048500	Shi	Feb 2017	A1
20170053169	Cuban et al.	Feb 2017	A1
20170061632	Linder et al.	Mar 2017	A1
20170098132	Yokota	Apr 2017	A1
20170113613	Van Dan Elzen	Apr 2017	A1
20170124476	Levinson et al.	May 2017	A1
20170134631	Zhao et al.	May 2017	A1
20170160197	Ozcan	Jun 2017	A1
20170177951	Yang et al.	Jun 2017	A1
20170301104	Qian	Oct 2017	A1
20170305423	Green	Oct 2017	A1
20170318407	Meister	Nov 2017	A1
20180151063	Pun	May 2018	A1
20180158197	Dasgupta	Jun 2018	A1
20180210465	Qu	Jul 2018	A1
20180260956	Huang	Sep 2018	A1
20180283892	Behrendt	Oct 2018	A1
20180373980	Huval	Dec 2018	A1
20190025853	Julian	Jan 2019	A1
20190065863	Luo et al.	Feb 2019	A1
20190066329	Yi et al.	Feb 2019	A1
20190066330	Yi et al.	Feb 2019	A1
20190066344	Yi et al.	Feb 2019	A1
20190108384	Wang et al.	Apr 2019	A1
20190120955	Zhong	Apr 2019	A1
20190132391	Thomas	May 2019	A1
20190132392	Liu	May 2019	A1
20190132572	Shen	May 2019	A1
20190210564	Han	Jul 2019	A1
20190210613	Sun	Jul 2019	A1
20190236950	Li	Aug 2019	A1
20190257987	Saari	Aug 2019	A1
20190266420	Ge	Aug 2019	A1
20190318456	Chen	Oct 2019	A1

Foreign Referenced Citations (43)

Number	Date	Country
101068311	Nov 2007	CN
101288296	Oct 2008	CN
101610419	Dec 2009	CN
102835115	Dec 2012	CN
105698812	Jun 2016	CN
106340197	Jan 2017	CN
106488091	Mar 2017	CN
106781591	May 2017	CN
108010360	May 2018	CN
2608513	Sep 1977	DE
890470	Jan 1999	EP
1754179	Feb 2007	EP
2448251	May 2012	EP
2463843	Jun 2012	EP
2761249	Aug 2014	EP
2946336	Nov 2015	EP
2993654	Mar 2016	EP
3081419	Oct 2016	EP
2018129753	Aug 2018	JP
100802511	Feb 2008	KR
1991009375	Jun 1991	WO
2005098739	Oct 2005	WO
2005098751	Oct 2005	WO
2005098782	Oct 2005	WO
2010109419	Sep 2010	WO
2013045612	Apr 2013	WO
2014111814	Jul 2014	WO
2014166245	Oct 2014	WO
2014201324	Dec 2014	WO
2015083009	Jun 2015	WO
2015103159	Jul 2015	WO
2015125022	Aug 2015	WO
2015186002	Dec 2015	WO
2016090282	Jun 2016	WO
2016135736	Sep 2016	WO
2017079349	May 2017	WO
2017079460	May 2017	WO
2017013875	May 2018	WO
2019040800	Feb 2019	WO
2019084491	May 2019	WO
2019084494	May 2019	WO
2019140277	Jul 2019	WO
2019168986	Sep 2019	WO

Non-Patent Literature Citations (60)

Entry
Albert S. Huang, “Finding multiple lanes in urban road networks with vision and lidar,” Mar. 24, 2009, Autonomous Robots vol. 26, pp. 104-112.
Alberto Faro, “Evaluation of the Traffic Parameters in a Metropolitan Area by Fusing Visual Perceptions and CNN Processing of Webcam Images,” May 20, 2008, IEEE Transactions on Neural Networks, vol. 19, No. 6, Jun. 2008, pp. 1108-1122.
Narayan Bhosale, “Analysis of Effect of Gaussian, Salt and Pepper Noise Removal from Noisy Remote Sensing Images,” Proceedings of the Second International Conference on “Emerging Research in Computing, Information, Communication and Applications” ERCICA2014, pp. 384-389.
Carle, Patrick J.F. et al. “Global Rover Localization by Matching Lidar and Orbital 3D Maps.” IEEE, Anchorage Convention District, pp. 1-6, May 3-8, 2010. (Anchorage Alaska, US).
Caselitz, T. et al., “Monocular camera localization in 3D LiDAR maps,” European Conference on Computer Vision (2014) Computer Vision—ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol. 8690. Springer, Cham.
Mur-Artal, R. et al., “ORB-SLAM: A Versatile and Accurate Monocular SLAM System,” IEEE Transaction on Robotics, Oct. 2015, pp. 1147-1163, vol. 31, No. 5, Spain.
Sattler, T. et al., “Are Large-Scale 3D Models Really Necessary for Accurate Visual Localization?” CVPR, IEEE, 2017, pp. 1-10.
Engel, J. et al. “LSD-SLAM: Large-Scale Direct Monocular SLAM,” pp. 1-16, Munich.
Levinson, Jesse et al., Experimental Robotics, Unsupervised Calibration for Multi-Beam Lasers, pp. 179-194, 12th Ed., Oussama Khatib, Vijay Kumar, Gaurav Sukhatme (Eds.) Springer-Verlag Berlin Heidelberg 2014.
Geiger, Andreas et al., “Automatic Camera and Range Sensor Calibration using a single Shot”, Robotics and Automation (ICRA), pp. 1-8, 2012 IEEE International Conference.
Zhang, Z. et al. A Flexible new technique for camera calibration. IEEE Transactions on Pattern Analysis and Machine Intelligence (vol. 22, Issue: 11, Nov. 2000).
Bar-Hillel, Aharon et al. “Recent progress in road and lane detection: a survey.” Machine Vision and Applications 25 (2011): 727-745.
Schindler, Andreas et al. “Generation of high precision digital maps using circular arc splines,” 2012 IEEE Intelligent Vehicles Symposium, Alcala de Henares, 2012, pp. 246-251. doi: 10.1109/IVS.2012.6232124.
Hou, Xiaodi and Zhang, Liqing, “Saliency Detection: A Spectral Residual Approach”, Computer Vision and Pattern Recognition, CVPR'07—IEEE Conference, pp. 1-8, 2007.
Hou, Xiaodi and Harel, Jonathan and Koch, Christof, “Image Signature: Highlighting Sparse Salient Regions”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, No. 1, pp. 194-201, 2012.
Hou, Xiaodi and Zhang, Liqing, “Dynamic Visual Attention: Searching For Coding Length Increments”, Advances in Neural Information Processing Systems, vol. 21, pp. 681-688, 2008.
Li, Yin and Hou, Xiaodi and Koch, Christof and Rehg, James M. and Yuille, Alan L., “The Secrets of Salient Object Segmentation”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 280-287, 2014.
Zhou, Bolei and Hou, Xiaodi and Zhang, Liqing, “A Phase Discrepancy Analysis of Object Motion”, Asian Conference on Computer Vision, pp. 225-238, Springer Berlin Heidelberg, 2010.
Hou, Xiaodi and Yuille, Alan and Koch, Christof, “Boundary Detection Benchmarking Beyond F-Measures”, Computer Vision and Pattern Recognition, CVPR'13, vol. 2013, pp. 1-8, IEEE, 2013.
Hou, Xiaodi and Zhang, Liqing, “Color Conceptualization”, Proceedings of the 15th ACM International Conference on Multimedia, pp. 265-268, ACM, 2007.
Hou, Xiaodi and Zhang, Liqing, “Thumbnail Generation Based on Global Saliency”, Advances in Cognitive Neurodynamics, ICCN 2007, pp. 999-1003, Springer Netherlands, 2008.
Hou, Xiaodi and Yuille, Alan and Koch, Christof, “A Meta-Theory of Boundary Detection Benchmarks”, arXiv preprint arXiv:1302.5985, 2013.
Li, Yanghao and Wang, Naiyan and Shi, Jianping and Liu, Jiaying and Hou, Xiaodi, “Revisiting Batch Normalization for Practical Domain Adaptation”, arXiv preprint arXiv:1603.04779, 2016.
Li, Yanghao and Wang, Naiyan and Liu, Jiaying and Hou, Xiaodi, “Demystifying Neural Style Transfer”, arXiv preprint arXiv:1701.01036, 2017.
Hou, Xiaodi and Zhang, Liqing, “A Time-Dependent Model of Information Capacity of Visual Attention”, International Conference on Neural Information Processing, pp. 127-136, Springer Berlin Heidelberg, 2006.
Wang, Panqu and Chen, Pengfei and Yuan, Ye and Liu, Ding and Huang, Zehua and Hou, Xiaodi and Cottrell, Garrison, “Understanding Convolution for Semantic Segmentation”, arXiv preprint arXiv:1702.08502, 2017.
Li, Yanghao and Wang, Naiyan and Liu, Jiaying and Hou, Xiaodi, “Factorized Bilinear Models for Image Recognition”, arXiv preprint arXiv:1611.05709, 2016.
Hou, Xiaodi, “Computational Modeling and Psychophysics in Low and Mid-Level Vision”, California Institute of Technology, 2014.
Spinello, Luciano, Triebel, Rudolph, Siegwart, Roland, “Multiclass Multimodal Detection and Tracking in Urban Environments”, Sage Journals, vol. 29 Issue 12, pp. 1498-1515 Article first published online: Oct. 7, 2010; Issue published: Oct. 1, 2010.
Matthew Barth, Carrie Malcolm, Theodore Younglove, and Nicole Hill, “Recent Validation Efforts for a Comprehensive Modal Emissions Model”, Transportation Research Record 1750, Paper No. 01-0326, College of Engineering, Center for Environmental Research and Technology, University of California, Riverside, CA 92521, date unknown.
Kyoungho Ahn, Hesham Rakha, “The Effects of Route Choice Decisions on Vehicle Energy Consumption and Emissions”, Virginia Tech Transportation Institute, Blacksburg, VA 24061, date unknown.
Ramos, Sebastian, Gehrig, Stefan, Pinggera, Peter, Franke, Uwe, Rother, Carsten, “Detecting Unexpected Obstacles for Self-Driving Cars: Fusing Deep Learning and Geometric Modeling”, arXiv:1612.06573v1 [cs.CV] Dec. 20, 2016.
Schroff, Florian, Dmitry Kalenichenko, James Philbin, (Google), “FaceNet: A Unified Embedding for Face Recognition and Clustering”, CVPR 2015.
Dai, Jifeng, Kaiming He, Jian Sun, (Microsoft Research), “Instance-aware Semantic Segmentation via Multi-task Network Cascades”, CVPR 2016.
Huval, Brody, Tao Wang, Sameep Tandon, Jeff Kiske, Will Song, Joel Pazhayampallil, Mykhaylo Andriluka, Pranav Rajpurkar, Toki Migimatsu, Royce Cheng-Yue, Fernando Mujica, Adam Coates, Andrew Y. Ng, “An Empirical Evaluation of Deep Learning on Highway Driving”, arXiv:1504.01716v3 [cs.RO] Apr. 17, 2015.
Tian Li, “Proposal Free Instance Segmentation Based on Instance-aware Metric”, Department of Computer Science, Cranberry-Lemon University, Pittsburgh, PA., date unknown.
Mohammad Norouzi, David J. Fleet, Ruslan Salakhutdinov, “Hamming Distance Metric Learning”, Departments of Computer Science and Statistics, University of Toronto, date unknown.
Jain, Suyong Dutt, Grauman, Kristen, “Active Image Segmentation Propagation”, In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, Jun. 2016.
MacAodha, Oisin, Campbell, Neill D.F., Kautz, Jan, Brostow, Gabriel J., “Hierarchical Subquery Evaluation for Active Learning on a Graph”, In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.
Kendall, Alex, Gal, Yarin, “What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision”, arXiv:1703.04977v1 [cs.CV] Mar. 15, 2017.
Wei, Junqing, John M. Dolan, Bakhtiar Litkhouhi, “A Prediction- and Cost Function-Based Algorithm for Robust Autonomous Freeway Driving”, 2010 IEEE Intelligent Vehicles Symposium, University of California, San Diego, CA, USA, Jun. 21-24, 2010.
Peter Welinder, Steve Branson, Serge Belongie, Pietro Perona, “The Multidimensional Wisdom of Crowds”; http://www.vision.caltech.edu/visipedia/papers/WelinderEtaINIPS10.pdf, 2010.
Kai Yu, Yang Zhou, Da Li, Zhang Zhang, Kaiqi Huang, “Large-scale Distributed Video Parsing and Evaluation Platform”, Center for Research on Intelligent Perception and Computing, Institute of Automation, Chinese Academy of Sciences, China, arXiv:1611.09580v1 [cs.CV] Nov. 29, 2016.
P. Guarneri, G. Rocca and M. Gobbi, “A Neural-Network-Based Model for the Dynamic Simulation of the Tire/Suspension System While Traversing Road Irregularities,” in IEEE Transactions on Neural Networks, vol. 19, No. 9, pp. 1549-1563, Sep. 2008.
C. Yang, Z. Li, R. Cui and B. Xu, “Neural Network-Based Motion Control of an Underactuated Wheeled Inverted Pendulum Model,” in IEEE Transactions on Neural Networks and Learning Systems, vol. 25, No. 11, pp. 2004-2016, Nov. 2014.
Stephan R. Richter, Vibhav Vineet, Stefan Roth, Vladlen Koltun, “Playing for Data: Ground Truth from Computer Games”, Intel Labs, European Conference on Computer Vision (ECCV), Amsterdam, the Netherlands, 2016.
Thanos Athanasiadis, Phivos Mylonas, Yannis Avrithis, and Stefanos Kollias, “Semantic Image Segmentation and Object Labeling”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 17, No. 3, Mar. 2007.
Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson, Uwe Franke, Stefan Roth, and Bernt Schiele, “The Cityscapes Dataset for Semantic Urban Scene Understanding”, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, Nevada, 2016.
Adhiraj Somani, Nan Ye, David Hsu, and Wee Sun Lee, “DESPOT: Online POMDP Planning with Regularization”, Department of Computer Science, National University of Singapore, date unknown.
Adam Paszke, Abhishek Chaurasia, Sangpil Kim, and Eugenio Culurciello. Enet: A deep neural network architecture for real-time semantic segmentation. CoRR, abs/1606.02147, 2016.
Szeliski, Richard, “Computer Vision: Algorithms and Applications” http://szeliski.org/Book/, 2010.
Chinese Application No. 201910192585.4, First Office Action dated Feb. 3, 2021.
Queau, Yvain et al., Microgeometry capture and RGB albedo estimation by photometric stereo without demosaicing. Open Archive Toulouse Archive Ouverte, May 14, 2017, pp. 1-7.
Losson, Olivier & Porebski, A. & Vandenbroucke, Nicolas & Macaire, Ludovic. (2013). Color texture analysis using CFA chromatic co-occurrence matrices. Computer Vision and Image Understanding. 117. 747-763. 10.1016/j.cviu.2013.03.001.
Chinese Application No. 201910192584.X, First Office Action dated Feb. 3, 2021.
Diamond, Steven et al. Dirty pixels: Optimizing Image Classification architectures for raw sensor data. Arxiv.org/abs/1701.06487 (Jan. 23, 2017). pp. 1-11.
Li, Zhenghao. PIMR: Parallel and Integrated Matching for Raw Data. Sensors (Basel). Jan. 2, 2016;16(1):54.
Chinese Patent Office, Second Office Action for CN 201910192584.X, dated Jan. 24, 2022, 9 pages with English translation.
Chinese Patent Office, Decision of Rejection for CN 201910192585.4, dated Mar. 3, 2022, 17 pages with English translation.
Chinese Patent Office, Third Office Action for CN 201910192584.X, dated Dec. 8, 2022, 4 pages without English translation.

Related Publications (1)

	Number	Date	Country
	20210256664 A1	Aug 2021	US

Provisional Applications (1)

	Number	Date	Country
	62656924	Apr 2018	US

Continuations (1)

	Number	Date	Country
Parent	16381707	Apr 2019	US
Child	17308911		US

Abstract