The present invention relates to the technical field of computer vision apparatuses for detecting and tracking objects within a surrounding environment.
Many computer vision apparatuses use lidar and/or camera sensors together with computer vision algorithms to detect and track objects within a surrounding environment. One of the main advantages of a lidar is that the light source is integrated therein. The lidar uses an eye-safe laser to emit laser pulses which light up a desired area. Unlike cameras, the lidar functions independently of the ambient lighting because it illuminates the scene with laser rays emitted from the lidar itself. In general, however, the point cloud captured by a lidar has a much lower surface density than the images captured by a camera sensor, and a lidar cannot identify color differences. Camera sensors, on the other hand, can identify color differences and recognize characters in images, such as the numbers on a vehicle's license plate. To make the best use of these sensors, the advantages of lidar and camera sensors should be combined effectively.
In an embodiment of this invention, a computer vision apparatus for detecting and tracking an object within a surrounding environment comprises an optical camera sensor for obtaining image data of the object, a lidar sensor for obtaining point cloud data of the object, a first buffer memory for storing the image data from the optical camera sensor, a second buffer memory for storing the point cloud data from the lidar sensor, a CPU (Central Processing Unit) on which computer programs run, the computer programs being arranged to control the optical camera sensor and the lidar sensor, and a memory for storing the computer programs, wherein the computer programs comprise the steps of capturing image data of the object via the optical camera sensor, storing the captured image data into the first buffer memory, scanning the object using the lidar sensor, storing point cloud data scanned via the lidar sensor into the second buffer memory, identifying a moving object in the first buffer memory or a moving blob in the second buffer memory, assigning an identification code (ID) to the moving object in the first buffer memory, obtaining information including the ID, position, speed, and time stamp when the moving object passes a certain point in a camera view of the camera sensor, and matching an incoming moving blob corresponding to the moving object by using the information obtained via the camera sensor when the incoming moving blob passes a point in a lidar view corresponding to the certain point in the camera view.
According to the embodiment of this invention described above, it becomes possible to effectively combine the benefits of using both the lidar and the optical camera sensor as a shared input to the computer vision apparatus. For example, recognition results that are uncertain due to camera sensor performance limitations, such as vehicle classification at night, can be compensated for by the lidar's performance, and the content recognized by the camera sensor can be updated accordingly.
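For illustration only, the matching step recited above can be sketched in Python as follows. The class name, data fields, and tolerance values here are hypothetical and form no part of the embodiment; the sketch merely shows one way the ID, position, speed, and time stamp obtained at the camera-side crossing point could be matched against an incoming lidar blob at the corresponding point in the lidar view.

```python
from dataclasses import dataclass

@dataclass
class CameraTrack:
    """Information handed over when a tracked object crosses the virtual point."""
    obj_id: int        # ID assigned by the camera-side tracker
    position: tuple    # estimated (x, y) position in shared world coordinates
    speed: float       # estimated speed in m/s
    timestamp: float   # global (e.g., PTP-derived) time of the crossing

def match_blob_to_track(blob_position, blob_timestamp, handed_over_tracks,
                        max_dt=0.5, max_dist=2.0):
    """Match an incoming lidar blob at the crossing point to a camera track.

    The blob is matched to the camera track whose predicted position at the
    blob's timestamp is closest, within illustrative time/distance tolerances.
    """
    best, best_dist = None, float("inf")
    for track in handed_over_tracks:
        dt = blob_timestamp - track.timestamp
        if abs(dt) > max_dt:
            continue
        # Predict where the object should be, assuming near-constant speed
        # along x (a simplifying assumption made only for this sketch).
        predicted_x = track.position[0] + track.speed * dt
        dist = ((blob_position[0] - predicted_x) ** 2 +
                (blob_position[1] - track.position[1]) ** 2) ** 0.5
        if dist < max_dist and dist < best_dist:
            best, best_dist = track, dist
    return best  # None if no track matches; the blob would then get a new ID
```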
In another embodiment of this invention, a computer vision apparatus for detecting and tracking an object within a surrounding environment comprises an optical camera sensor for obtaining image data of the object, a plurality of lidar sensors for obtaining point cloud data, the plurality of lidar sensors including a first lidar sensor and a second lidar sensor, a first buffer memory for storing the image data from the optical camera sensor, a second buffer memory for storing the point cloud data from the plurality of lidar sensors, a CPU (Central Processing Unit) on which computer programs run, the computer programs being arranged to control the optical camera sensor and the plurality of lidar sensors, and a memory for storing data associated with the computer programs, wherein the computer programs comprise the steps of capturing the image data of the object via the optical camera sensor, storing the captured image data into the first buffer memory, scanning the object via the first lidar sensor, scanning the object via the second lidar sensor, wherein the scanning via the second lidar sensor is phase-shifted from the scanning via the first lidar sensor, merging point cloud data from the first lidar sensor and point cloud data from the second lidar sensor, storing the merged point cloud data into the second buffer memory, synchronizing positions of objects identified by the optical camera sensor and the plurality of lidar sensors, pre-processing the data in the first buffer memory and the second buffer memory in parallel and independently using independent blob detection algorithms, estimating sizes and shapes of the objects stored in the first buffer memory and the second buffer memory when the computer programs identify a moving blob or a moving object in the first buffer memory and the second buffer memory, and overlaying a position of the image data onto a position of the moving blob so that a remaining view area can be matched to the view areas of the other sensors.
According to the embodiment of this invention described above, it becomes possible to increase the effective lidar scanning rate by phase-shifting the scans of the lidar sensors, instead of raising the scanning frequency of an individual sensor, which is subject to a certain limitation.
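For illustration only, a minimal sketch of the merging step follows, under the assumption that each scan arrives as a (timestamp, point cloud) pair stamped against a shared global clock; because the two sensors are phase-shifted, the interleaved sequence has twice the temporal density of either sensor alone.

```python
def merge_phase_shifted_scans(scans_first, scans_second):
    """Interleave scans from two phase-shifted lidar sensors by timestamp.

    scans_first, scans_second: lists of (timestamp, point_cloud) pairs from
    the first and second lidar sensors. Because the second sensor's scans
    are phase-shifted from the first's, sorting the union by timestamp
    yields a merged stream with double the effective scan rate.
    """
    return sorted(scans_first + scans_second, key=lambda scan: scan[0])
```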
The camera sensors 110, 112 and 114 capture an image of an object 130. The lidar sensors 120, 122 and 124 are arranged to irradiate laser rays onto the object 130 and receive reflected laser rays coming back from the object 130 to the lidar sensors 120, 122 and 124. In this embodiment, the lidar sensors 120, 122 and 124 use an eye-safe laser to emit laser ray pulses which light up the desired area. A lidar sensor has the capability of calculating distances by measuring the time for a signal to return, using appropriate sensors and data acquisition electronics. On the other hand, the camera sensors 110, 112 and 114 can recognize, for example, the color of a vehicle and the characters of its license plate by applying computer programs running on the CPU 100. In this Embodiment 1, video cameras are used as the camera sensors 110, 112 and 114. However, the computer vision apparatus 10 can also be configured with a single camera sensor and/or a single lidar sensor.
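The distance calculation mentioned above is the standard time-of-flight relationship: the pulse travels to the object and back, so the one-way distance is half the round-trip path length. A minimal sketch:

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def time_of_flight_distance(round_trip_seconds):
    """Distance to a reflecting surface from a lidar pulse's round-trip time.

    The pulse travels to the object and back, so the one-way distance is
    half the round-trip path length.
    """
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0

# Example: a return received about 666.7 ns after emission corresponds
# to a surface roughly 100 m away.
print(time_of_flight_distance(666.7e-9))  # ~99.9 m
```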
The camera sensors 110, 112 and 114 capture image data of an object 130 and store the image data into the first buffer memory 116. The lidar sensors 120, 122 and 124 obtain point cloud data including shape and size information of the object 130 and store the point cloud data into the second buffer memory 126 under the computer programs running on the CPU 100. In this embodiment, a plurality of camera sensors and a plurality of lidar sensors are used. However, a single camera sensor and a single lidar sensor can also be used. Likewise, while a lidar sensor including a rotation mechanism is used in this embodiment, a non-rotational lidar sensor, usually referred to as a solid-state lidar sensor, can also be used.
All the functions described above, and the functions described below, for controlling the camera sensors 110, 112 and 114 and the lidar sensors 120, 122 and 124 are performed under the control of the computer programs stored in the memory 102 and running on the CPU 100. The computer programs can synchronize the capturing of frame data into the first buffer 116 and the data scans into the second buffer 126 by using a global time source such as PTP (Precision Time Protocol) or any global timestamp.
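For illustration only, one possible way of pairing camera frames with lidar scans by such global timestamps is sketched below; the pairing tolerance (`max_skew`) is a hypothetical parameter introduced for this example, not a value taken from the embodiment.

```python
def align_frames_to_scans(frames, scans, max_skew=0.010):
    """Pair each camera frame with the lidar scan closest in global time.

    frames, scans: non-empty lists of (ptp_timestamp, data) tuples, each
    timestamped against the same PTP grandmaster. Pairs whose timestamps
    differ by more than max_skew seconds are discarded.
    """
    pairs = []
    for f_ts, f_data in frames:
        s_ts, s_data = min(scans, key=lambda s: abs(s[0] - f_ts))
        if abs(s_ts - f_ts) <= max_skew:
            pairs.append(((f_ts, f_data), (s_ts, s_data)))
    return pairs
```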
The computer programs perform parallel pre-processing of the data from the camera sensors 110, 112 and 114 and the lidar sensors 120, 122 and 124 by utilizing independent blob detection algorithms. When the computer programs recognize an identified blob or a moving object in both buffer memories 116 and 126, the computer programs estimate the blob size and shape independently for each sensor's data.
Then, the computer programs compare the detected blob shape and size from each sensor buffer 116 and 126 to match the location of the camera pixels of an existing blob with the point cloud data of the existing blob. The computer programs are arranged to overlay the position of the blob detected by each sensor as illustrated in
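For illustration only, the comparison of blob shape and size between the two buffers might look like the following sketch; the blob representation and the thresholds are hypothetical assumptions made for this example.

```python
def blobs_match(camera_blob, lidar_blob, size_tol=0.3, pos_tol=1.0):
    """Decide whether a camera blob and a lidar blob are the same object.

    Each blob is assumed to be a dict with 'center' (x, y in shared world
    coordinates) and 'width'/'height' in metres; the thresholds are
    illustrative only.
    """
    dx = camera_blob["center"][0] - lidar_blob["center"][0]
    dy = camera_blob["center"][1] - lidar_blob["center"][1]
    if (dx * dx + dy * dy) ** 0.5 > pos_tol:
        return False
    # Require relative size agreement in both dimensions.
    for key in ("width", "height"):
        ratio = camera_blob[key] / lidar_blob[key]
        if not (1 - size_tol) <= ratio <= (1 + size_tol):
            return False
    return True
```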
Instead of a lidar utilizing laser rays, a lidar sensor using an ambient light (optical) sensor can be used in this embodiment. In this case, elements of a scene can be identified by using the lidar's optical sensor, and those elements can be matched to pixels on separate camera sensors. Three-dimensional point cloud data from the lidar sensor is matched to the two-dimensional pixel map from the camera sensor, and PTP (Precision Time Protocol) or any global timestamp can be used to align the timing of the pixel frame rate with the lidar scan rate.
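Matching three-dimensional lidar points to a two-dimensional pixel map is conventionally done with a pinhole-camera projection through calibrated parameters. A minimal sketch, assuming an intrinsic matrix K and lidar-to-camera extrinsics R, t are available from calibration (these inputs are assumptions, not values given in this embodiment):

```python
import numpy as np

def project_points_to_pixels(points_xyz, K, R, t):
    """Project 3-D lidar points into the 2-D pixel map of a camera.

    points_xyz: (N, 3) array of lidar points in the lidar frame.
    K: (3, 3) camera intrinsic matrix; R, t: rotation and translation
    from the lidar frame to the camera frame (extrinsic calibration).
    Returns an (N, 2) array of pixel coordinates; points behind the
    camera are returned as NaN.
    """
    cam = points_xyz @ R.T + t          # transform into the camera frame
    uvw = cam @ K.T                     # apply the pinhole projection
    z = uvw[:, 2:3]
    with np.errstate(divide="ignore", invalid="ignore"):
        pix = np.where(z > 0, uvw[:, :2] / z, np.nan)
    return pix
```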
The confidence of objects detected from lidar point cloud data can be increased by comparing the blob or object with the same blob or object as detected using ambient light sensor data from the camera or from the lidar's ambient light sensor. In the case where the reflectivity of an object is low, such as a black vehicle, it may still be possible to observe the ambient light from the same object, since ambient light may be reflected from a different angle, or with a different color or intensity, compared to the laser emitted from the lidar sensor's position.
Adjustable Start Scan Timing of Plural Lidar Sensors
Next, an embodiment including plural lidar sensors having the capability of adjustable start scan timing will be described. A plurality of lidar sensors can be phase-locked by an internal configuration of each lidar sensor that enables an adjustable start scan timing relative to another lidar sensor with respect to an external clock source such as a PTP grandmaster clock. By offsetting the start of each scan to different times of the grandmaster clock, the total scan rate can be increased by a factor of the number of sensors synchronized to the external clock. This enables faster-moving objects to be detected and tracked by a multiple-lidar scanning system with phase offsets. Typically, a scanning lidar has a limited scan rate of 20 Hz or less. By adding a second lidar whose 0-deg azimuth starting scans are phase-shifted by 180 deg from the first lidar's 0-deg azimuth starting scans, for the case where each lidar is scanning at 20 Hz, an overlapping scene or objects in a shared view can be detected and tracked with a 40 Hz effective scanning rate.
According to this example, even though each lidar sensor has a 20 Hz scan rate, the effective scan rate can be arranged to be 40 Hz by phase-shifting the lidar sensors as described above. In this example, a lidar including a rotation mechanism is used. However, the lidars can also have non-rotational configurations, usually referred to as "solid-state" lidars, as described above. In this example, two lidar sensors are used. However, the number of lidar sensors used in the computer vision apparatus can be more than two to increase the resolution necessary for the scanning area to be observed.
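For illustration, the start-scan offsets for N phase-locked sensors follow directly from the scan period: offsetting sensor i by i/(N·f) seconds relative to the grandmaster epoch spaces the 0-deg azimuth scan starts evenly, matching the 20 Hz, 180-deg example above.

```python
def start_scan_offsets(num_sensors, scan_rate_hz):
    """Start-time offsets (seconds, relative to a PTP grandmaster epoch)
    that evenly phase-shift the 0-deg azimuth scan starts of each sensor.

    With N sensors each scanning at f Hz, offsetting sensor i by
    i / (N * f) seconds yields an effective scan rate of N * f Hz
    over the shared view.
    """
    period = 1.0 / scan_rate_hz
    return [i * period / num_sensors for i in range(num_sensors)]

# Two 20 Hz lidars: offsets [0.0, 0.025] s, i.e., a 180-deg phase shift,
# giving the 40 Hz effective scan rate described above.
print(start_scan_offsets(2, 20.0))
```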
Sometimes a lidar sensor can have more resolution than a camera sensor, depending on the view distance and the camera pixel count relative to the lidar laser beam concentration and angles; the converse can equally hold for the camera. In these cases, the higher-resolution sensor, whether camera or lidar, can aid in pre-tracking an object farther away that is not yet detectable by the other, lower-resolution sensor.
In the case where a certain area requires high-accuracy tracking by a lidar that cannot capture the entire scene due to a view limitation, a camera sensor mounted at the same point but looking at a different view (possibly with no view overlap) can be used to organize and pre-track objects that are predicted to move into the view of the high-resolution lidar. This is a sort of short-distance re-identification (re-ID), in which data is passed from the camera sensor to the lidar so that the lidar can re-ID the object, continue tracking it, and gain more information, the lidar's resolution being higher than that of the camera sensor in this case.
As described above, the vehicle 704 is recognized by the camera sensor. Then, the computer vision apparatus including the camera sensor assigns an ID (identification code or number) to the vehicle 704. Further, the computer vision apparatus can calculate the speed and direction of, and estimate the distance to, the vehicle using the camera's setup calibration. As a result, the computer vision apparatus can estimate the position of the tracked object and share distance coordinate information to confirm the position of a detected blob when the vehicle 704 passes a specific location or point, such as a virtual line like the dotted line illustrated in
On the laser Rx optical receiver of the lidar sensor used in
In other words, assuming the initially tracked object detected by the camera sensor can remain in view of the camera sensor while the lidar sensor begins to track the same object, the lidar sensor uses the information about the object from the camera and confirms or corrects the distance values that were estimated by the camera sensor. The measured distance values can be tracked more accurately when the lidar sensor shares data about the object, including color information known from the first camera.
Embodiment 2 can also be performed using multiple camera sensors and multiple lidar sensors. It then becomes possible to define an ID data structure that allows the camera sensors and lidar sensors to share information about a tracked object, instead of a numerical ID only, as sketched below.
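For illustration only, such an ID data structure might resemble the following sketch; the field names are hypothetical and merely show how camera-derived attributes (color, license plate) and lidar-refined attributes (size) could travel together with the numeric ID.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class SharedTrackID:
    """A possible ID data structure shared between camera and lidar sensors,
    carrying richer information than a numerical ID alone."""
    numeric_id: int                      # the base identification number
    source_sensor: str                   # e.g., "camera-110" or "lidar-120"
    position: Tuple[float, float]        # last known position (world frame)
    speed: float                         # last estimated speed, m/s
    heading_deg: float                   # last estimated direction of travel
    timestamp: float                     # PTP global time of last update
    color: Optional[str] = None          # known only from the camera side
    license_plate: Optional[str] = None  # known only from the camera side
    size: Optional[Tuple[float, float, float]] = None  # refined by lidar
```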
In Embodiments 1 and 2, an electro-mechanical lidar sensor including a rotation mechanism, a solid-state lidar sensor, or a mixture thereof can be used.
This non-provisional application claims priority from U.S. Provisional Patent Application Ser. No. 63/330,609, filed Apr. 13, 2022, the contents of which are incorporated herein by reference in their entirety.