The present disclosure relates generally to detecting objects of interest. More particularly, the present disclosure relates to detecting and classifying objects that are proximate to an autonomous vehicle in part by using overlapping camera fields of view.
An autonomous vehicle is a vehicle that is capable of sensing its environment and navigating with little to no human input. In particular, an autonomous vehicle can observe its surrounding environment using a variety of sensors and can attempt to comprehend the environment by performing various processing techniques on data collected by the sensors. Given knowledge of its surrounding environment, the autonomous vehicle can identify an appropriate motion path through such surrounding environment.
Thus, a key objective associated with an autonomous vehicle is the ability to perceive objects (e.g., vehicles, pedestrians, cyclists) that are proximate to the autonomous vehicle and, further, to determine classifications of such objects as well as their locations. The ability to accurately and precisely detect and characterize objects of interest is fundamental to enabling the autonomous vehicle to generate an appropriate motion plan through its surrounding environment.
Aspects and advantages of embodiments of the present disclosure will be set forth in part in the following description, or can be learned from the description, or can be learned through practice of the embodiments.
One example aspect of the present disclosure is directed to a sensor system. The sensor system includes one or more ranging systems and a plurality of cameras. The plurality of cameras are positioned such that a field of view for each camera of the plurality of cameras overlaps a field of view of at least one adjacent camera. The plurality of cameras are further positioned about at least one of the one or more ranging systems such that a combined field of view of the plurality of cameras comprises an approximately 360 degree field of view. The one or more ranging systems are configured to transmit ranging data to a perception system for detecting objects of interest and the plurality of cameras are configured to transmit image data to the perception system for classifying the objects of interest.
Another example aspect of the present disclosure is directed to an autonomous vehicle. The autonomous vehicle includes a vehicle computing system and a sensor system. The vehicle computing system includes one or more processors and one or more memories including instructions that, when executed by the one or more processors, cause the one or more processors to perform operations. The operations include detecting objects of interest and classifying the detected objects of interest. The sensor system includes a plurality of cameras positioned such that a field of view for each camera of the plurality of cameras overlaps a field of view of at least one adjacent camera, and the plurality of cameras are configured to transmit image data to the vehicle computing system for classifying objects of interest. In some embodiments, the autonomous vehicle may further include one or more ranging systems. The one or more ranging systems may be configured to transmit ranging data to the vehicle computing system for detecting where the objects of interest are located proximate to the autonomous vehicle.
Another example aspect of the present disclosure is directed to a computer-implemented method of detecting objects of interest. The method includes receiving, by one or more computing devices, ranging data from one or more ranging systems, the ranging systems being configured to transmit ranging signals relative to an autonomous vehicle. The method includes receiving, by the one or more computing devices, image data from a plurality of cameras configured to capture images relative to the autonomous vehicle, the plurality of cameras being positioned such that a field of view for each camera of the plurality of cameras overlaps a field of view of at least one adjacent camera. The method includes detecting, by the one or more computing devices, an object of interest proximate to the autonomous vehicle within the ranging data. The method includes determining, by the one or more computing devices, a first image area within image data captured by a first camera within the plurality of cameras containing the object of interest. The method includes determining, by the one or more computing devices, a second image area within image data captured by a second camera within the plurality of cameras containing the object of interest, the second image area overlapping the first image area and providing a greater view of the object than provided in the first image area. The method includes classifying, by the one or more computing devices, the object of interest based at least in part on the second image area.
Other aspects of the present disclosure are directed to various systems, apparatuses, non-transitory computer-readable media, user interfaces, and electronic devices.
These and other features, aspects, and advantages of various embodiments of the present disclosure will become better understood with reference to the following description and appended claims. The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate example embodiments of the present disclosure and, together with the description, serve to explain the related principles.
Detailed discussion of embodiments directed to one of ordinary skill in the art is set forth in the specification, which makes reference to the appended figures, in which:
Generally, the present disclosure is directed to systems and methods for detecting and classifying objects, such as pedestrians, cyclists, other vehicles (whether stationary or moving), and the like, during the operation of an autonomous vehicle. In particular, in some embodiments of the present disclosure, when deploying a plurality of cameras as part of a vehicle sensor system, the positions and orientations of the cameras can be configured such that the field of view of each camera is overlapped by the field of view of at least one adjacent camera by a determined amount. Such camera field of view overlaps allow for an object of interest that may be captured in image data on a boundary or edge of one camera's field of view to be more fully captured within the field of view of an adjacent camera and thereby provide for improved detection and classification of the object of interest. For example, without such camera field of view overlaps, an object, such as a pedestrian, that may be on a boundary of a first camera's field of view may be only partially captured (e.g., “split”) in that camera's image data, increasing the difficulty of detecting and classifying the object as a pedestrian. However, by configuring the cameras with field of view overlaps, such a “split” object can be more fully captured in the image data of an adjacent camera and thereby allow for properly identifying and classifying the object, for example, as a pedestrian (e.g., capturing at least a sufficient portion of the object of interest by the adjacent camera to allow for classification).
For example, in some embodiments, one or more camera field of view overlaps can be configured such that the field of view overlap is large enough in certain locations for a largest relevant classifiable object within a given category of objects to be fully captured by one camera. For example, given a particular category of classifiable objects (e.g., pedestrians), a field of view overlap can be configured to be large enough within a certain range of the autonomous vehicle such that a largest relevant pedestrian (e.g., an adult male as compared with other types of pedestrians such as adult females and children) near the autonomous vehicle may be fully viewed in at least one camera's field of view (e.g., a pedestrian on a boundary of one camera's field of view can be fully captured in an adjacent camera's field of view due to the field of view overlaps). It should be appreciated that this example configuration for a field of view overlap is also designed with specific consideration of the different categories of classifiable objects (e.g., pedestrians, bicycles, vehicles) such that the category of classifiable objects having the smallest typical dimension (e.g., pedestrians as opposed to bicycles or vehicles) can be fully viewed in at least one camera's field of view.
In some embodiments, a field of view overlap may be configured based on a minimum amount of view of an object that is needed to determine an object classification. For example, if, in a particular embodiment, a classification can generally be determined from a camera's image data that contains at least 20% of a view of an object, such as a bicycle for example, the field of view overlap may be configured such that an overlap at least as large as 20% of the size of such object is provided.
In some embodiments, a field of view overlap may be configured based on a minimum or average dimension of an object type that is generally difficult to classify when captured on a camera's field of view boundary. For example, considering a pedestrian category of detectable objects, such objects may have different average sizes depending on whether a pedestrian is a male, a female, a child, etc. Since male pedestrians can be considered the largest relevant classifiable object within the pedestrian category, the field of view overlap can be designed to be large enough that the typical adult male pedestrian would be fully captured by one camera. More particularly, since a larger pedestrian (e.g., a male pedestrian) generally has a width dimension of at least twenty inches, a field of view overlap may be configured so that the overlap is at least twenty inches wide at a small distance from the vehicle so that such an object (e.g., a male pedestrian) could be fully captured by at least one camera.
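As an illustrative, non-limiting sketch of such sizing (not taken from the disclosure itself), the angular field of view overlap required for a given object width and distance can be estimated by treating the overlap region as a simple wedge originating at approximately co-located adjacent cameras; the co-location assumption and the specific numbers below are hypothetical.

```python
import math

def required_overlap_angle_deg(object_width_m: float, distance_m: float) -> float:
    """Angular field-of-view overlap needed so that an object of the given width,
    lying on one camera's field-of-view boundary at the given distance, still fits
    entirely within the adjacent camera's field of view. Treats the overlap region
    as a simple wedge from approximately co-located cameras."""
    return math.degrees(2.0 * math.atan(object_width_m / (2.0 * distance_m)))

# Example: a ~20 inch (0.51 m) wide pedestrian at 4 m from the sensors
print(required_overlap_angle_deg(0.51, 4.0))  # ~7.3 degrees of angular overlap
```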
In some embodiments, the positions and orientations of one or more of the cameras can also be configured to provide a horizontal field of view with a full view along a line adjacent to an autonomous vehicle (e.g., a line tangent to a side of the vehicle) so that, for example, objects proximate to the vehicle, or farther back toward the rear of the vehicle in an adjacent lane, can be more easily detected. An autonomous vehicle sensor system including one or more ranging systems and a plurality of cameras configured to provide field of view overlaps among the cameras can provide ranging data and image data (or combined “sensor data”) that allow for improved detection of objects of interest around the periphery of the autonomous vehicle and improved localization and classification of the objects of interest. The data regarding the localization and classification of the objects of interest can be further analyzed in autonomous vehicle applications, such as those involving perception, prediction, motion planning, and vehicle control.
More particularly, an autonomous vehicle sensor system can be mounted on the roof of an autonomous vehicle and can include one or more ranging systems, for example a Light Detection and Ranging (LIDAR) system and/or a Radio Detection and Ranging (RADAR) system. The one or more ranging systems can capture a variety of ranging data and provide it to a vehicle computing system, for example, for the detection and localization of objects of interest during the operation of the autonomous vehicle. The one or more ranging systems may include a single centrally mounted LIDAR system in some examples. In some examples, the centrally mounted LIDAR system may be tilted forward to provide the desired coverage pattern.
As one example, for a LIDAR system, the ranging data from the one or more ranging systems can include the location (e.g., in three-dimensional space relative to the LIDAR system) of a number of points that correspond to objects that have reflected a ranging laser. For example, a LIDAR system can measure distances by measuring the Time of Flight (TOF) that it takes a short laser pulse to travel from the sensor to an object and back, calculating the distance from the known speed of light.
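As a brief illustration of the time-of-flight relationship described above (a generic calculation, not specific to any particular LIDAR unit):

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0

def lidar_range_m(round_trip_time_s: float) -> float:
    """Range to a reflecting object from the measured round-trip time of a laser pulse;
    the pulse travels out and back, so the one-way distance is half the total path."""
    return SPEED_OF_LIGHT_M_S * round_trip_time_s / 2.0

# Example: a return received 200 nanoseconds after the pulse was emitted
print(lidar_range_m(200e-9))  # ~30 m
```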
As another example, for a RADAR system, the ranging data from the one or more ranging systems can include the location (e.g., in three-dimensional space relative to the RADAR system) of a number of points that correspond to objects that have reflected a ranging radio wave. For example, radio waves (pulsed or continuous) transmitted by the RADAR system can reflect off an object and return to a receiver of the RADAR system, giving information about the object's location and speed.
The autonomous vehicle sensor system can also include a plurality of cameras oriented and positioned relative to the one or more ranging systems, such as a centrally mounted LIDAR system. The plurality of cameras can capture image data corresponding to objects detected by the one or more ranging systems and provide the image data to a vehicle computing system, for example, for identification and classification of objects of interest during the operation of the autonomous vehicle. The positions and orientations for the plurality of cameras can be determined and configured such that a field of view for each camera of the plurality of cameras overlaps a field of view of at least one adjacent camera, for example, by a determined amount. These camera field of view overlaps may provide improvements in the detection, localization, and classification of objects of interest. For example, the camera field of view overlaps may provide that at least one camera of the plurality of cameras will have a more complete view (e.g., not a split or left/right view) of an object, such as a pedestrian or cyclist, that is sensed with a LIDAR device. In some examples, the ranging system and the plurality of cameras with field of view overlaps can provide for improved detection of smaller and/or fast-moving objects, such as a pedestrian or cyclist, for example by providing a horizontal field of view adjacent to the autonomous vehicle with a full view along a line (e.g., a horizontal field of view covering the lane adjacent to and to the rear of the autonomous vehicle).
In some embodiments, the position and orientation of some of the cameras may be configured to provide a horizontal field of view tangent to a side of an autonomous vehicle so that objects farther back in an adjacent lane can be detected, such as for use in analyzing a lane change or merging operation of the vehicle, for example. Configuring the position and orientation of one or more cameras to provide such a horizontal field of view tangent to a side of an autonomous vehicle could provide a field of view similar to that of a side view mirror as used by a vehicle driver, and provide for viewing objects adjacent to the autonomous vehicle in a similar fashion. A field of view tangent to a side of an autonomous vehicle can be provided by positioning one or more rear side or rear facing cameras proximate to a roof edge of the autonomous vehicle. For example, in some embodiments, a left rear side camera can be mounted near the left edge of the roof of the autonomous vehicle and a right rear side camera can be mounted near the right edge of the roof of the autonomous vehicle.
In some embodiments, one or more rear facing cameras (e.g., left rear side camera, right rear side camera) positioned near the roof edge of the autonomous vehicle can provide an improved view as compared to a rear camera positioned on or near the centerline of an autonomous vehicle. For example, having a rear facing camera (e.g., left rear side camera, right rear side camera) positioned near a roof edge of the autonomous vehicle can provide an improved rear facing view of the adjacent lane when a large vehicle (e.g., bus, truck, etc.) is positioned immediately behind the autonomous vehicle, whereas the field of view for a centerline-placed rear camera could be greatly obscured by such a large vehicle. For example, one or more rear side cameras positioned nearer to a roof edge of the autonomous vehicle could provide a view similar to that of a side view mirror as used by a vehicle driver as opposed to the field of view of a centerline-positioned camera which could be comparable to a vehicle driver's view in a rear-view mirror where large following vehicles can obscure the view. In some examples, the plurality of cameras can also be positioned around and relative to the one or more ranging systems, such as a central LIDAR system, for example, such that the combined field of view of the plurality of cameras provides an approximately 360 degree horizontal field of view around the LIDAR system or the periphery of the autonomous vehicle.
In some embodiments, the plurality of cameras in the sensor system can include at least five cameras having a wide field of view to provide adequate fields of view surrounding an autonomous vehicle. For example, the plurality of cameras may include a forward-facing camera, two forward side cameras, and two rear side cameras. In some embodiments, the plurality of cameras in the sensor system may include six cameras having a wide field of view to provide an adequate field of view surrounding an autonomous vehicle. For example, the plurality of cameras may include a forward-facing camera, two forward side cameras, two rear side cameras, and a rear-facing camera. In some implementations, more or fewer cameras can be utilized.
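As a hypothetical sketch of how such a six-camera arrangement might be checked for adjacent field of view overlaps, the camera names, headings, and horizontal fields of view below are illustrative assumptions, not values taken from the disclosure:

```python
import math

# Hypothetical six-camera layout: headings in degrees around the vehicle, with 0 degrees
# pointing straight ahead. The specific angles and horizontal fields of view are assumptions.
cameras = {
    "front":       {"heading": 0.0,   "hfov": 80.0},
    "front_right": {"heading": 60.0,  "hfov": 80.0},
    "rear_right":  {"heading": 125.0, "hfov": 80.0},
    "rear":        {"heading": 180.0, "hfov": 80.0},
    "rear_left":   {"heading": 235.0, "hfov": 80.0},
    "front_left":  {"heading": 300.0, "hfov": 80.0},
}

def angular_gap_deg(a: float, b: float) -> float:
    """Smallest angular separation between two headings, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def adjacent_overlap_deg(cam_a: dict, cam_b: dict) -> float:
    """Horizontal field-of-view overlap between two adjacent cameras; a positive value
    means their fields of view overlap, a negative value means a blind wedge between them."""
    return cam_a["hfov"] / 2.0 + cam_b["hfov"] / 2.0 - angular_gap_deg(
        cam_a["heading"], cam_b["heading"]
    )

names = list(cameras)
for i, name in enumerate(names):
    nxt = names[(i + 1) % len(names)]
    print(f"{name} -> {nxt}: {adjacent_overlap_deg(cameras[name], cameras[nxt]):.1f} deg overlap")
```

With the assumed values, every adjacent pair reports a positive overlap, so the combined coverage is an approximately 360 degree horizontal field of view with no blind wedges between cameras.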
In some embodiments, the position and orientation of the plurality of cameras may be configured to provide some front bias in field of view overlap, for example, due to a higher likelihood of objects necessitating detection and classification approaching from the front and/or front sides of an autonomous vehicle while in operation. In such examples, the cameras may be configured to provide less overlap in a rear-facing direction as an autonomous vehicle is less likely to move in reverse at a high rate of speed. For example, front bias may be provided by configuring two or more forward facing cameras with larger forward field of view overlaps and two or more rearward facing cameras with smaller rear field of view overlaps.
In some embodiments, components of the sensor system, such as the ranging system and some of the plurality of cameras, may be configured in positions more forward on the roof of the autonomous vehicle, for example, to more closely align with a driver's head position and provide improved perception of oncoming terrain and objects. For example, forward facing and forward side cameras, and possibly the LIDAR system, may be mounted on the roof of the autonomous vehicle such that they are not positioned behind a driver seat position in the autonomous vehicle. In some embodiments, a forward-facing camera of the sensor system can also be positioned and oriented to be able to see a traffic control signal while the autonomous vehicle is stationary at an intersection.
In some embodiments, some or all of the cameras of the plurality of cameras may have a horizontal field of view of less than about 90 degrees, and in some examples, the cameras' horizontal fields of view may be tighter (e.g., less than 83 degrees). In some embodiments, the plurality of cameras may be configured such that the cameras do not pitch down more than a certain amount (e.g., approximately 10 degrees). In some embodiments, the ranging system and camera components of the sensor system can be configured such that they would not overhang a roof edge of the autonomous vehicle. For example, such placement can provide the advantage of reducing the possibility of a user contacting the sensor components, such as when entering or exiting the vehicle.
In some embodiments, a roof-mounted sensor system may provide a ground intercept within a defined range of the vehicle, for example, providing a ground intercept within a certain distance (e.g., four meters) of the vehicle relative to the front and sides of the vehicle. In some embodiments, the sensor system LIDAR may provide a ground intercept within a certain distance (e.g., five meters) of the vehicle and the sensor system cameras may provide a ground intercept within a certain distance (e.g., four meters) of the vehicle.
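The following sketch illustrates, under a flat-ground assumption and with purely illustrative numbers, how a sensor's mounting height and the downward tilt of its lower field-of-view boundary relate to the ground-intercept distance discussed above:

```python
import math

def ground_intercept_m(mount_height_m: float, depression_angle_deg: float) -> float:
    """Horizontal distance at which the lower edge of a sensor's vertical field of view
    meets flat ground, given the sensor's height above the ground and how far the lower
    field-of-view boundary is tilted below horizontal."""
    return mount_height_m / math.tan(math.radians(depression_angle_deg))

# Example: a roof-mounted camera ~1.9 m above the ground whose lower field-of-view boundary
# points ~25 degrees below horizontal (e.g., a modest downward pitch plus half of the
# vertical field of view; illustrative numbers only)
print(ground_intercept_m(1.9, 25.0))  # ~4.1 m
```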
In some embodiments, the placement and orientation of one or more of the cameras relative to the LIDAR system may be configured to reduce parallax effects between the objects detected within the ranging data from the LIDAR system and the same objects within the image data from the plurality of cameras. The placement and orientation of the LIDAR system and cameras may be configured, in particular, to minimize camera-LIDAR parallax effects, both horizontal and vertical, over a forward 180-degree view.
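As a rough, illustrative calculation (with hypothetical baseline and range values) of why camera-LIDAR parallax shrinks both with range and with a smaller camera-to-LIDAR offset:

```python
import math

def parallax_angle_deg(baseline_m: float, range_m: float) -> float:
    """Approximate angular difference between the direction to an object as seen from the
    LIDAR and from a camera offset by baseline_m, for an object at range_m located roughly
    broadside to the baseline (near the worst case)."""
    return math.degrees(math.atan2(baseline_m, range_m))

# Example: a camera mounted 0.3 m from the LIDAR; object at 10 m vs. 50 m
print(parallax_angle_deg(0.3, 10.0))  # ~1.7 degrees
print(parallax_angle_deg(0.3, 50.0))  # ~0.34 degrees
```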
In some embodiments, the sensor system may also include one or more near-range sensor systems, for example, RADAR, ultrasonic sensors, and the like. Such near-range sensor systems may provide additional sensor data in regard to objects located in one or more close-in blind spots in LIDAR and/or camera coverage around an autonomous vehicle while the vehicle is either stationary or moving.
An autonomous vehicle can include a sensor system as described above as well as a vehicle computing system. The vehicle computing system can include one or more computing devices and one or more vehicle controls. The one or more computing devices can include a perception system, a prediction system, and a motion planning system that cooperate to perceive the surrounding environment of the autonomous vehicle and determine a motion plan for controlling the motion of the autonomous vehicle accordingly. The vehicle computing system can receive sensor data from the sensor system as described above and utilize such sensor data in the ultimate motion planning of the autonomous vehicle.
In particular, in some implementations, the perception system can receive sensor data from one or more sensors (e.g., one or more ranging systems and/or the plurality of cameras) that are coupled to or otherwise included within the sensor system of the autonomous vehicle. The sensor data can include information that describes the location (e.g., in three-dimensional space relative to the autonomous vehicle) of points that correspond to objects within the surrounding environment of the autonomous vehicle (e.g., at one or more times).
As yet another example, for one or more cameras, various processing techniques (e.g., range imaging techniques such as, for example, structure from motion, structured light, stereo triangulation, and/or other techniques) can be performed to identify the location (e.g., in three-dimensional space relative to the one or more cameras) of a number of points that correspond to objects that are depicted in imagery captured by the one or more cameras. Other sensor systems can identify the location of points that correspond to objects as well.
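For instance, the standard stereo triangulation relationship recovers depth from the disparity between two overlapping camera views; the sketch below uses that textbook formula with illustrative numbers and is not a description of any specific implementation in the disclosure.

```python
def stereo_depth_m(focal_length_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth from standard stereo triangulation: Z = f * B / d, where f is the focal length
    in pixels, B is the baseline between the two cameras, and d is the disparity (the
    horizontal pixel shift of the same point between the two images)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px

# Example: 1400 px focal length, 0.5 m baseline, 35 px disparity -> 20 m depth
print(stereo_depth_m(1400.0, 0.5, 35.0))
```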
The perception system can identify one or more objects that are proximate to the autonomous vehicle based on sensor data received from the one or more sensors. In particular, in some implementations, the perception system can determine, for each object, state data that describes a current state of such object. As examples, the state data for each object can describe an estimate of the object's: current location (also referred to as position); current speed; current heading (current speed and heading also together referred to as velocity); current acceleration; current orientation; size/footprint (e.g., as represented by a bounding shape such as a bounding polygon or polyhedron); class of characterization (e.g., vehicle versus pedestrian versus bicycle versus other); yaw rate; and/or other state information. In some implementations, the perception system can determine state data for each object over a number of iterations. In particular, the perception system can update the state data for each object at each iteration. Thus, the perception system can detect and track objects (e.g., vehicles, bicycles, pedestrians, etc.) that are proximate to the autonomous vehicle over time.
The prediction system can receive the state data from the perception system and predict one or more future locations for each object based on such state data. For example, the prediction system can predict where each object will be located within the next 5 seconds, 10 seconds, 20 seconds, etc. As one example, an object can be predicted to adhere to its current trajectory according to its current speed. As another example, other, more sophisticated prediction techniques or modeling can be used.
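A minimal sketch of the simple constant-velocity prediction mentioned above (the coordinate convention and numeric values are illustrative assumptions):

```python
import math

def predict_location(x_m, y_m, speed_m_s, heading_rad, dt_s):
    """Constant-velocity prediction: assume the object keeps its current speed and heading
    for dt_s seconds and extrapolate its position accordingly."""
    return (x_m + speed_m_s * dt_s * math.cos(heading_rad),
            y_m + speed_m_s * dt_s * math.sin(heading_rad))

# Example: a cyclist at (10 m, 2 m) moving 6 m/s with heading pi/2, predicted 5 seconds ahead
print(predict_location(10.0, 2.0, 6.0, math.pi / 2, 5.0))  # approximately (10.0, 32.0)
```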
The motion planning system can determine a motion plan for the autonomous vehicle based at least in part on predicted one or more future locations for the object and/or the state data for the object provided by the perception system. Stated differently, given information about the current locations of proximate objects and/or predicted future locations of proximate objects, the motion planning system can determine a motion plan for the autonomous vehicle that best navigates the autonomous vehicle along the determined travel route relative to the objects at such locations.
As one example, in some implementations, the motion planning system can determine a cost function for each of one or more candidate motion plans for the autonomous vehicle based at least in part on the current locations and/or predicted future locations of the objects. For example, the cost function can describe a cost (e.g., over time) of adhering to a particular candidate motion plan. For example, the cost described by a cost function can increase when the autonomous vehicle approaches possible impact with another object and/or deviates from a preferred pathway (e.g., a predetermined travel route).
Thus, given information about the current locations and/or predicted future locations of objects, the motion planning system can determine a cost of adhering to a particular candidate pathway. The motion planning system can select or determine a motion plan for the autonomous vehicle based at least in part on the cost function(s). For example, the motion plan that minimizes the cost function can be selected or otherwise determined. The motion planning system then can provide the selected motion plan to a vehicle controller that controls one or more vehicle controls (e.g., actuators or other devices that control gas flow, steering, braking, etc.) to execute the selected motion plan.
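A toy sketch of such cost-based plan selection is shown below; the penalty weights, waypoints, and safety radius are illustrative assumptions, not parameters from the disclosure:

```python
import math

def plan_cost(plan_waypoints, predicted_obstacles, preferred_path, safety_radius_m=2.0):
    """Toy cost for one candidate motion plan: adds a penalty for deviating from the
    corresponding preferred-path waypoint and a sharply increasing penalty when a waypoint
    comes within safety_radius_m of a predicted obstacle location."""
    cost = 0.0
    for (px, py), (rx, ry) in zip(plan_waypoints, preferred_path):
        cost += math.hypot(px - rx, py - ry)              # deviation from preferred pathway
        for ox, oy in predicted_obstacles:
            gap = math.hypot(px - ox, py - oy)
            if gap < safety_radius_m:
                cost += 100.0 * (safety_radius_m - gap)   # approaching possible impact
    return cost

preferred = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
candidate_plans = {
    "stay_in_lane": [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)],
    "nudge_left":   [(0.0, 0.0), (5.0, 2.5), (10.0, 2.5)],
}
obstacles = [(10.0, 0.0)]  # predicted pedestrian location ahead in the lane

# Select the candidate motion plan that minimizes the cost function
best = min(candidate_plans, key=lambda name: plan_cost(candidate_plans[name], obstacles, preferred))
print(best)  # "nudge_left": avoiding the predicted obstacle outweighs the lane deviation
```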
The systems and methods described herein may provide a number of technical effects and benefits. For instance, sensor systems employing a plurality of cameras with strategic field of view overlaps as described herein provide for enhanced field of view for use in object detection and classification. Such enhanced field of view can be particularly advantageous for use in conjunction with vehicle computing systems for autonomous vehicles. Because vehicle computing systems for autonomous vehicles are tasked with repeatedly detecting and analyzing objects in sensor data for localization and classification of objects of interest including other vehicles, cyclists, pedestrians, traffic changes, traffic control signals, and the like, and then determining necessary responses to such objects of interest, enhanced field of view can lead to faster and more accurate object detection and classification. Improved object detection and classification can have a direct effect on the provision of safer and smoother automated control of vehicle systems and improved overall performance of autonomous vehicles.
The systems and methods described herein may also provide a technical effect and benefit of providing for improved placement of cameras as part of an autonomous vehicle sensor system. The analysis of appropriate fields of view and field of view overlaps for a sensor system may provide for improving the placement and orientation of cameras within the sensor system to provide more robust sensor data leading to improvements in object perception by vehicle computing systems. The improved placement of cameras in an autonomous vehicle sensor system may also provide a technical effect and benefit of reducing parallax effects relative to the ranging data provided by a LIDAR system and image data provided by the plurality of cameras, thereby improving the localization of detected objects of interest, as well as improving the prediction and motion planning relative to the objects of interest, by vehicle computing systems.
The systems and methods described herein may also provide a technical effect and benefit of providing improvements in object detection relative to alternative solutions for combining image data from multiple cameras. For example, stitching images from multiple cameras can introduce stitching artifacts that jeopardize the integrity of the image data along the stitched image boundaries. Also, cameras that are designed to obtain images that will be stitched together can be subject to design limitations on the size and placement of the cameras, for example, requiring the cameras to be arranged in a ring very close together.
The systems and methods described herein may also provide resulting improvements to computing technology tasked with object detection and classification. Providing improvements in fields of view and improvements in sensor data may provide improvements in the speed and accuracy of object detection and classification, resulting in improved operational speed and reduced processing requirements for vehicle computing systems, and ultimately more efficient vehicle control.
With reference to the figures, example embodiments of the present disclosure will be discussed in further detail.
The autonomous vehicle 102 can include one or more sensors 104, a vehicle computing system 106, and one or more vehicle controls 108. The vehicle computing system 106 can assist in controlling the autonomous vehicle 102. In particular, the vehicle computing system 106 can receive sensor data from the one or more sensors 104, attempt to comprehend the surrounding environment by performing various processing techniques on data collected by the sensors 104, and generate an appropriate motion path through such surrounding environment. The vehicle computing system 106 can control the one or more vehicle controls 108 to operate the autonomous vehicle 102 according to the motion path.
The vehicle computing system 106 can include one or more processors 130 and at least one memory 132. The one or more processors 130 can be any suitable processing device (e.g., a processor core, a microprocessor, an ASIC, a FPGA, a controller, a microcontroller, etc.) and can be one processor or a plurality of processors that are operatively connected. The memory 132 can include one or more non-transitory computer-readable storage mediums, such as RAM, ROM, EEPROM, EPROM, flash memory devices, magnetic disks, etc., and combinations thereof. The memory 132 can store data 134 and instructions 136 which are executed by the processor 130 to cause vehicle computing system 106 to perform operations.
In some implementations, vehicle computing system 106 can further be connected to, or include, a positioning system 120. Positioning system 120 can determine a current geographic location of the autonomous vehicle 102. The positioning system 120 can be any device or circuitry for analyzing the position of the autonomous vehicle 102. For example, the positioning system 120 can determine actual or relative position by using a satellite navigation positioning system (e.g., a GPS system, a Galileo positioning system, the GLObal NAvigation Satellite System (GLONASS), the BeiDou Satellite Navigation and Positioning system), an inertial navigation system, a dead reckoning system, an IP address, triangulation and/or proximity to cellular towers or WiFi hotspots, and/or other suitable techniques for determining position. The position of the autonomous vehicle 102 can be used by various systems of the vehicle computing system 106.
As illustrated in
In particular, in some implementations, the perception system 110 can receive sensor data from the one or more sensors 104 that are coupled to or otherwise included within the autonomous vehicle 102. As examples, the one or more sensors 104 can include a LIght Detection And Ranging (LIDAR) system 122, a RAdio Detection And Ranging (RADAR) system 124, one or more cameras 126 (e.g., visible spectrum cameras, infrared cameras, etc.), and/or other sensors 128. The sensor data can include information that describes the location of objects within the surrounding environment of the autonomous vehicle 102.
As one example, for LIDAR system 122, the sensor data can include the location (e.g., in three-dimensional space relative to the LIDAR system 122) of a number of points that correspond to objects that have reflected a ranging laser. For example, LIDAR system 122 can measure distances by measuring the Time of Flight (TOF) that it takes a short laser pulse to travel from the sensor to an object and back, calculating the distance from the known speed of light.
As another example, for RADAR system 124, the sensor data can include the location (e.g., in three-dimensional space relative to RADAR system 124) of a number of points that correspond to objects that have reflected a ranging radio wave. For example, radio waves (pulsed or continuous) transmitted by the RADAR system 124 can reflect off an object and return to a receiver of the RADAR system 124, giving information about the object's location and speed. Thus, RADAR system 124 can provide useful information about the current speed of an object.
As yet another example, for one or more cameras 126, various processing techniques (e.g., range imaging techniques such as, for example, structure from motion, structured light, stereo triangulation, and/or other techniques) can be performed to identify the location (e.g., in three-dimensional space relative to the one or more cameras 126) of a number of points that correspond to objects that are depicted in imagery captured by the one or more cameras 126. Other sensor systems 128 can identify the location of points that correspond to objects as well.
Thus, the one or more sensors 104 can be used to collect sensor data that includes information that describes the location (e.g., in three-dimensional space relative to the autonomous vehicle 102) of points that correspond to objects within the surrounding environment of the autonomous vehicle 102.
In addition to the sensor data, the perception system 110 can retrieve or otherwise obtain map data 118 that provides detailed information about the surrounding environment of the autonomous vehicle 102. The map data 118 can provide information regarding: the identity and location of different travelways (e.g., roadways), road segments, buildings, or other items or objects (e.g., lampposts, crosswalks, curbing, etc.); the location and directions of traffic lanes (e.g., the location and direction of a parking lane, a turning lane, a bicycle lane, or other lanes within a particular roadway or other travelway); traffic control data (e.g., the location and instructions of signage, traffic lights, or other traffic control devices); and/or any other map data that provides information that assists the vehicle computing system 106 in comprehending and perceiving its surrounding environment and its relationship thereto.
The perception system 110 can identify one or more objects that are proximate to the autonomous vehicle 102 based on sensor data received from the one or more sensors 104 and/or the map data 118. In particular, in some implementations, the perception system 110 can determine, for each object, state data that describes a current state of such object. As examples, the state data for each object can describe an estimate of the object's: current location (also referred to as position); current speed; current heading (current speed and heading also together referred to as velocity); current acceleration; current orientation; size/footprint (e.g., as represented by a bounding shape such as a bounding polygon or polyhedron); class (e.g., vehicle versus pedestrian versus bicycle versus other); yaw rate; and/or other state information.
In some implementations, the perception system 110 can determine state data for each object over a number of iterations. In particular, the perception system 110 can update the state data for each object at each iteration. Thus, the perception system 110 can detect and track objects (e.g., vehicles, pedestrians, bicycles, and the like) that are proximate to the autonomous vehicle 102 over time.
The prediction system 112 can receive the state data from the perception system 110 and predict one or more future locations for each object based on such state data. For example, the prediction system 112 can predict where each object will be located within the next 5 seconds, 10 seconds, 20 seconds, etc. As one example, an object can be predicted to adhere to its current trajectory according to its current speed. As another example, other, more sophisticated prediction techniques or modeling can be used.
The motion planning system 114 can determine a motion plan for the autonomous vehicle 102 based at least in part on the predicted one or more future locations for the object provided by the prediction system 112 and/or the state data for the object provided by the perception system 110. Stated differently, given information about the current locations of objects and/or predicted future locations of proximate objects, the motion planning system 114 can determine a motion plan for the autonomous vehicle 102 that best navigates the autonomous vehicle 102 relative to the objects at such locations.
As one example, in some implementations, the motion planning system 114 can determine a cost function for each of one or more candidate motion plans for the autonomous vehicle 102 based at least in part on the current locations and/or predicted future locations of the objects. For example, the cost function can describe a cost (e.g., over time) of adhering to a particular candidate motion plan. For example, the cost described by a cost function can increase when the autonomous vehicle 102 approaches a possible impact with another object and/or deviates from a preferred pathway (e.g., a preapproved pathway).
Thus, given information about the current locations and/or predicted future locations of objects, the motion planning system 114 can determine a cost of adhering to a particular candidate pathway. The motion planning system 114 can select or determine a motion plan for the autonomous vehicle 102 based at least in part on the cost function(s). For example, the candidate motion plan that minimizes the cost function can be selected or otherwise determined. The motion planning system 114 can provide the selected motion plan to a vehicle controller 116 that controls one or more vehicle controls 108 (e.g., actuators or other devices that control gas flow, acceleration, steering, braking, etc.) to execute the selected motion plan.
Each of the perception system 110, the prediction system 112, the motion planning system 114, and the vehicle controller 116 can include computer logic utilized to provide desired functionality. In some implementations, each of the perception system 110, the prediction system 112, the motion planning system 114, and the vehicle controller 116 can be implemented in hardware, firmware, and/or software controlling a general purpose processor. For example, in some implementations, each of the perception system 110, the prediction system 112, the motion planning system 114, and the vehicle controller 116 includes program files stored on a storage device, loaded into a memory, and executed by one or more processors. In other implementations, each of the perception system 110, the prediction system 112, the motion planning system 114, and the vehicle controller 116 includes one or more sets of computer-executable instructions that are stored in a tangible computer-readable storage medium such as RAM, a hard disk, or optical or magnetic media.
Further, as illustrated in
As illustrated in
One or more parameters may be used in determining how a plurality of cameras should be configured to provide camera field of view overlaps, such as field of view overlaps 222, 224, 226, 228, 230, and 232 illustrated in
As illustrated in
In some implementations, as illustrated in
Referring still to
The autonomous vehicle 302 may use a combination of LIDAR data generated based on the LIDAR sweeps 304 and image data generated by the cameras (within the fields of view 306-314) for the detection and classification of objects in the surrounding environment of autonomous vehicle 302, such as by a vehicle computing system 106 as discussed in regard to
As illustrated in
As further illustrated in
Specific parameters characterizing a field of view overlap, for example field of view overlap 428 between the camera fields of view 404 and 406 or any other field of view overlaps illustrated or described herein, can be defined in one or more manners. For example, field of view overlap 428 can be characterized by an angle 430 of the field of view overlap 428 formed between adjacent fields of view 404 and 406. In another example, a field of view overlap can be characterized by a width dimension 432 measured between adjacent field of view boundaries (e.g., boundary 434 of field of view 404 and boundary 436 of field of view 406) at a predetermined distance from autonomous vehicle 402 or from one or more components of the sensor system mounted on autonomous vehicle 402. Other parameters characterizing a field of view overlap between adjacent cameras can be based on a distance and/or angular orientation between adjacent cameras as they are mounted within a sensor system relative to autonomous vehicle 402.
In some embodiments, one or more camera field of view overlaps (e.g., field of view overlap 428) can be configured such that the field of view overlap 428 is large enough in certain locations for a largest relevant classifiable object to be fully captured by one camera (e.g., the camera providing field of view 404 or 406). For example, given a pedestrian category for object classification, field of view overlap 428 can be configured to be large enough within a certain range of the autonomous vehicle so that a larger pedestrian (e.g., male pedestrian 414, with an average male pedestrian generally being larger than average female or child pedestrians) near the autonomous vehicle may be fully viewed in at least one camera's field of view 404, 406, 408, 410, 412 when pedestrian 414 is proximate to autonomous vehicle 402. As such, when pedestrian 414 is located on a boundary 434 of camera field of view 406, pedestrian 414 can be fully captured in the adjacent camera field of view 404 due to field of view overlap 428. Accordingly, it may be desirable that field of view overlap 428 is characterized by a minimum or average dimension of an object class, such as pedestrian 414. For example, field of view overlap 428 may be characterized by a width dimension 432, measured relatively close to autonomous vehicle 402, of between about 20 and 24 inches (e.g., based on a reference dimension of 20 inches for the width of a male pedestrian). When width dimension 432 is measured farther from autonomous vehicle 402 between adjacent field of view boundaries (e.g., boundary 434 of field of view 404 and boundary 436 of field of view 406), the field of view overlap 428 is wider and more likely to fully encompass an object such as pedestrian 414.
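A geometric sketch of these parameters is given below: it computes the width between the inner field-of-view boundaries of two adjacent cameras (a dimension of the kind denoted 432) at several distances ahead of the sensors. The camera positions, headings, and horizontal fields of view are hypothetical values chosen only to illustrate that the overlap is narrow close to the vehicle and widens with distance.

```python
import math

def boundary_y_at_x(cam_x_m, cam_y_m, boundary_angle_deg, x_m):
    """y-coordinate where a field-of-view boundary ray starting at (cam_x_m, cam_y_m) and
    pointing at boundary_angle_deg (counterclockwise from the +x "forward" axis) crosses
    the vertical plane located x_m ahead of the sensors."""
    return cam_y_m + (x_m - cam_x_m) * math.tan(math.radians(boundary_angle_deg))

# Two hypothetical adjacent roof-mounted cameras (positions in meters, angles in degrees):
# camera A looks slightly left of straight ahead, camera B slightly right.
cam_a = {"x": 0.0, "y": 0.25,  "heading": 30.0,  "hfov": 80.0}
cam_b = {"x": 0.0, "y": -0.25, "heading": -30.0, "hfov": 80.0}

a_right_boundary = cam_a["heading"] - cam_a["hfov"] / 2.0   # -10 degrees
b_left_boundary = cam_b["heading"] + cam_b["hfov"] / 2.0    # +10 degrees

for distance_m in (2.0, 4.0, 10.0):
    y_a = boundary_y_at_x(cam_a["x"], cam_a["y"], a_right_boundary, distance_m)
    y_b = boundary_y_at_x(cam_b["x"], cam_b["y"], b_left_boundary, distance_m)
    overlap_width_m = y_b - y_a   # positive when the two fields of view overlap
    print(f"overlap width at {distance_m} m: {overlap_width_m:.2f} m")
```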
Inset window 510 illustrates an example horizontal adjacent view captured by rear facing camera 504, including objects such as bicycle 506 and motorcycle 508 which are positioned behind and to the right of autonomous vehicle 502. As illustrated in
Camera 602 can include one or more lenses 604, an image sensor 606, and one or more image processors 608. Camera 602 can also have additional conventional camera components not illustrated in
In some examples, the image sensor 606 can be a charge-coupled device (CCD) sensor or a complementary metal-oxide-semiconductor (CMOS) sensor, although other image sensors can also be employed. Image sensor 606 can include an array of image sensor elements corresponding to unique image pixels that are configured to detect incoming light provided incident to a surface of image sensor 606. Each image sensor element within image sensor 606 can detect incoming light by detecting the amount of light that falls thereon and converting the received amount of light into a corresponding electric signal. The more light detected at each pixel, the stronger the electric signal generated by the sensor element corresponding to that pixel. In some examples, each image sensor element within image sensor 606 can include a photodiode and an amplifier along with additional integrated circuit components configured to generate the electric signal representative of an amount of captured light at each image sensor element. The electric signals detected at image sensor 606 provide raw image capture data at a plurality of pixels, each pixel corresponding to a corresponding image sensor element within image sensor 606. Image sensor 606 can be configured to capture successive full image frames of raw image capture data in successive increments of time.
As illustrated in
The one or more image processors 608 can include one or more processor(s) 614 along with one or more memory device(s) 616 that can collectively function as respective computing devices. The one or more processor(s) 614 can be any suitable processing device such as a microprocessor, microcontroller, integrated circuit, an application specific integrated circuit (ASIC), a digital signal processor (DSP), a field-programmable gate array (FPGA), logic device, one or more central processing units (CPUs), processing units performing other specialized calculations, etc. The one or more processor(s) 614 can be a single processor or a plurality of processors that are operatively and/or selectively connected.
The one or more memory device(s) 616 can include one or more non-transitory computer-readable storage media, such as RAM, ROM, EEPROM, EPROM, flash memory devices, magnetic disks, etc., and/or combinations thereof. The one or more memory device(s) 616 can store information that can be accessed by the one or more processor(s) 614. For instance, the one or more memory device(s) 616 can include computer-readable instructions 618 that can be executed by the one or more processor(s) 614. The instructions 618 can be software written in any suitable programming language, firmware implemented with various controllable logic devices, and/or can be implemented in hardware. Additionally, and/or alternatively, the instructions 618 can be executed in logically and/or virtually separate threads on processor(s) 614. The instructions 618 can be any set of instructions that when executed by the one or more processor(s) 614 cause the one or more processor(s) 614 to perform operations.
The one or more memory device(s) 616 can store data 620 that can be retrieved, manipulated, created, and/or stored by the one or more processor(s) 614. The data 620 can include, for instance, raw image capture data, digital image outputs, or other image-related data or parameters. The data 620 can be stored in one or more database(s). The one or more database(s) can be split up so that they can be provided in multiple locations.
Camera 602 can include a communication interface 624 used to communicate with one or more other component(s) of a sensor system or other systems of an autonomous vehicle, for example, a vehicle computing system such as vehicle computing system 106 of
Camera 602 also can include one or more input devices 620 and/or one or more output devices 622. An input device 620 can include, for example, devices for receiving information from a user, such as a touch screen, touch pad, mouse, data entry keys, speakers, a microphone suitable for voice recognition, etc. An input device 620 can be used, for example, by a user to select controllable inputs for operation of the camera 602 (e.g., shutter, ISO, white balance, focus, exposure, etc.) and/or control of one or more parameters. An output device 622 can be used, for example, to provide digital image outputs to a vehicle operator. For example, an output device 622 can include a display device (e.g., display screen, CRT, LCD), which can include hardware for displaying an image or other communication to a user. Additionally, and/or alternatively, output device(s) can include an audio output device (e.g., speaker) and/or device for providing haptic feedback (e.g., vibration).
At 704, one or more computing devices in a computing system, such as vehicle computing system 106 of
At 706, the one or more computing devices within a computing system can detect a potential object of interest within the received image data from the plurality of cameras. At 708, the computing system can determine a first image area in a first camera's image data that contains, at least partially, the potential object of interest. At 710, the computing system can determine a second image area in a second camera's image data that contains, at least partially, the potential object of interest. In some examples, the second image area in the second camera image data may overlap the first image area in the first camera image data by a defined amount (e.g., based on the configuration of the plurality of cameras). The first image area in the first camera image data may contain only a partial view of the object of interest because, for example, the potential object of interest may fall at or near a boundary edge of the first camera's field of view. The second image area in the second camera image data may contain a more complete view of the object of interest due to the overlap with the first image area in the first camera image data.
At 712, the one or more computing devices in a computing system may classify the object of interest based in part on the second camera image data and provide the object classification for use in further operations, such as tracking and prediction. For example, the partial view of the object of interest contained in the first image area in the first camera image data determined at 708 may not provide enough data for accurate localization and classification of the object of interest. However, the more complete view of the object of interest in the second image area in the second camera image data determined at 710, due to the view overlap, may provide sufficient data for accurate localization and classification of the object of interest.
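The flow of steps 704-712 can be sketched as follows; the helper names, the visible-fraction heuristic, and the numeric bounding boxes are hypothetical, and an actual implementation would project LIDAR detections through calibrated camera models and apply a trained classifier to the selected image area:

```python
from dataclasses import dataclass

@dataclass
class ImageArea:
    camera_id: str
    bounding_box: tuple          # (x_min, y_min, x_max, y_max) in image pixels
    visible_fraction: float      # rough fraction of the projected object inside the frame

def choose_image_area_for_classification(detection_areas):
    """Given the image areas from every camera whose field of view contains (part of) a
    ranged object, prefer the area with the most complete view of the object -- typically
    the adjacent, overlapping camera when the object is split on a field-of-view boundary."""
    return max(detection_areas, key=lambda area: area.visible_fraction)

# Hypothetical outputs from projecting one LIDAR detection into two adjacent cameras:
# the object straddles the first camera's boundary but is fully inside the second.
areas = [
    ImageArea("front_right", (1850, 300, 1920, 760), visible_fraction=0.35),
    ImageArea("front",       (1500, 290, 1680, 770), visible_fraction=1.00),
]
best = choose_image_area_for_classification(areas)
print(best.camera_id)  # "front": classify the object from this camera's image crop
```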
Although
The computing device(s) 129 of the vehicle computing system 106 can include processor(s) 902 and a memory 904. The one or more processors 902 can be any suitable processing device (e.g., a processor core, a microprocessor, an ASIC, a FPGA, a controller, a microcontroller, etc.) and can be one processor or a plurality of processors that are operatively connected. The memory 904 can include one or more non-transitory computer-readable storage media, such as RAM, ROM, EEPROM, EPROM, one or more memory devices, flash memory devices, etc., and combinations thereof.
The memory 904 can store information that can be accessed by the one or more processors 902. For instance, the memory 904 (e.g., one or more non-transitory computer-readable storage mediums, memory devices) on-board the vehicle 102 can include computer-readable instructions 906 that can be executed by the one or more processors 902. The instructions 906 can be software written in any suitable programming language or can be implemented in hardware. Additionally, or alternatively, the instructions 906 can be executed in logically and/or virtually separate threads on processor(s) 902.
For example, the memory 904 on-board the vehicle 102 can store instructions 906 that when executed by the one or more processors 902 on-board the vehicle 102 cause the one or more processors 902 (the computing system 106) to perform operations such as any of the operations and functions of the computing device(s) 129 or for which the computing device(s) 129 are configured, as described herein and including, for example, steps 702-712 of method 700 in
The memory 904 can store data 908 that can be obtained, received, accessed, written, manipulated, created, and/or stored. The data 908 can include, for instance, ranging data obtained by LIDAR system 122 and/or RADAR system 124, image data obtained by camera(s) 126, data identifying detected and/or classified objects including current object states and predicted object locations and/or trajectories, motion plans, etc. as described herein. In some implementations, the computing device(s) 129 can obtain data from one or more memory device(s) that are remote from the vehicle 102.
The computing device(s) 129 can also include a communication interface 909 used to communicate with one or more other system(s) on-board the vehicle 102 and/or a remote computing device that is remote from the vehicle 102 (e.g., of remote computing system 910). The communication interface 909 can include any circuits, components, software, etc. for communicating with one or more networks (e.g., 920). In some implementations, the communication interface 909 can include for example, one or more of a communications controller, receiver, transceiver, transmitter, port, conductors, software and/or hardware for communicating data.
In some implementations, the vehicle computing system 106 can further include a positioning system 912. The positioning system 912 can determine a current position of the vehicle 102. The positioning system 912 can be any device or circuitry for analyzing the position of the vehicle 102. For example, the positioning system 912 can determine position by using one or more of inertial sensors, a satellite positioning system, an IP address, triangulation and/or proximity to network access points or other network components (e.g., cellular towers, WiFi access points, etc.), and/or other suitable techniques. The position of the vehicle 102 can be used by various systems of the vehicle computing system 106.
The network(s) 920 can be any type of network or combination of networks that allows for communication between devices. In some embodiments, the network(s) can include one or more of a local area network, wide area network, the Internet, secure network, cellular network, mesh network, peer-to-peer communication link and/or some combination thereof and can include any number of wired or wireless links. Communication over the network(s) 920 can be accomplished, for instance, via a communication interface using any type of protocol, protection scheme, encoding, format, packaging, etc.
The remote computing system 910 can include one or more remote computing devices that are remote from the vehicle computing system 106. The remote computing devices can include components (e.g., processor(s), memory, instructions, data) similar to that described herein for the computing device(s) 129.
Computing tasks discussed herein as being performed at computing device(s) remote from the vehicle can instead be performed at the vehicle (e.g., via the vehicle computing system), or vice versa. Such configurations can be implemented without deviating from the scope of the present disclosure. The use of computer-based systems allows for a great variety of possible configurations, combinations, and divisions of tasks and functionality between and among components. Computer-implemented operations can be performed on a single component or across multiple components. Computer-implemented tasks and/or operations can be performed sequentially or in parallel. Data and instructions can be stored in a single memory device or across multiple memory devices.
While the present subject matter has been described in detail with respect to various specific example embodiments thereof, each example is provided by way of explanation, not limitation of the disclosure. Those skilled in the art, upon attaining an understanding of the foregoing, can readily produce alterations to, variations of, and equivalents to such embodiments. Accordingly, the subject disclosure does not preclude inclusion of such modifications, variations and/or additions to the present subject matter as would be readily apparent to one of ordinary skill in the art. For instance, features illustrated or described as part of one embodiment can be used with another embodiment to yield a still further embodiment. Thus, it is intended that the present disclosure cover such alterations, variations, and equivalents.