Ranging sensors such as light detection and ranging (LiDAR) sensors, laser scanners, ultrasonic sensors, and radars have been widely used in a variety of applications, including robot/unmanned aerial vehicle (UAV) navigation, autonomous driving, environment monitoring, traffic monitoring, surveillance, and three-dimensional (3D) reconstruction. For these applications, dynamic event detection, which refers to instantaneously distinguishing measured points of moving objects from measured points of static objects, is a fundamental requirement. It enables an agent such as a robot/UAV, a self-driving car, or an alarming system to detect the moving objects in a scene, predict future states of the moving objects, plan the trajectory of the agent to move accordingly or to avoid the moving objects, or build consistent 3D maps that exclude the moving objects.
US2020111358A1 discloses a system for detecting dynamic points. For sensors, this reference relies on the input of an extra stationary LiDAR, where both the vehicle-mounted LiDAR and the extra stationary LiDAR must scan the environment in columns. US2018004227A1 discloses the use of detected moving objects for further processing (e.g., behavior prediction or vehicle motion planning), but does not disclose how the dynamic objects are detected.
There continues to be a need in the art for improved designs and techniques for systems and methods for detecting moving objects and making timely decisions.
Embodiments of the subject invention pertain to a moving object detection system. The system comprises an input module capturing a point cloud comprising measurements of distances to points on one or more objects; and a detection module receiving the point cloud captured by the input module and configured to determine whether the objects are moving objects, by determining whether currently measured points occlude any previously measured points, and/or whether the currently measured points recursively occlude any previously measured points, and/or whether the currently measured points are recursively occluded by any previously measured points. Whether the objects are moving objects is determined either sequentially or simultaneously. The previously measured points of the moving objects are partially or completely excluded in the determination of occlusion for currently measured points.

The determination of occlusion is performed based on depth images by comparing the depth of the currently measured points with previously measured ones that are projected to the same or adjacent pixels of the depth image. Moreover, the points are projected to the depth image by a spherical projection, a perspective projection, or a projection that projects points lying on neighboring lines of sight to neighboring pixels. In a moving platform such as a vehicle, a UAV, or any other movable object that carries the sensor and moves in a space, a depth image is attached with a pose read from an external motion sensing module, indicating under which pose the depth image is constructed, and points are configured to be transformed to this pose before projection to the depth image.

Each pixel of a depth image is configured to save all or a selected number of points projected therein, and/or all or a selected number of the depths of points projected therein, and/or statistical information comprising a minimum value, a maximum value, or a variance of depths of all or a selected number of points projected therein, and/or other information of the occluded points attached to points projected therein. Furthermore, multiple depth images can be constructed at multiple prior poses, and each is constructed from points starting from the respective pose and accumulated for a certain period of time. Each point in a pixel is configured to save the points in previous depth images that occlude the point or are occluded by the point. The occlusion of current points is determined against all or a selected number of depth images previously constructed.

A current point is determined to occlude previous points if its depth is smaller than all or any points contained in adjacent pixels of any depth images it projects to. In addition, a current point can be determined to be occluded by previous points if its depth is larger than all or any points contained in adjacent pixels of any depth images it projects to. A current point can be determined to recursively occlude previous points if it occludes any point in any previous depth image and further occludes any point in any more previous depth image that is occluded by the previous one, for a certain number of times. A current point can be determined to be recursively occluded by previous points if it is occluded by any point in any previous depth image and is further occluded by any point in any more previous depth image that occludes the previous one, for a certain number of times.
According to an embodiment of the subject invention, a method for detecting one or more moving objects is provided. The method comprises capturing, by an input module, a point cloud comprising measurements of distances to points on one or more objects; providing the point cloud captured by the input module to a detection module; and configuring the detection module to determine whether the objects are moving objects, by determining whether currently measured points occlude any previously measured points, and/or whether the currently measured points recursively occlude any previously measured points, and/or whether the currently measured points are recursively occluded by any previously measured points. Moreover, whether the points of the current point cloud are of the one or more moving objects is determined either sequentially or simultaneously. The previously measured points of the moving objects are partially or completely excluded in the determination of occlusion for currently measured points. The determination of occlusion is performed based on depth images by comparing the depth of the currently measured points with previously measured ones that are projected to the same or adjacent pixels of the depth image. The points are projected to the depth image by a spherical projection, a perspective projection, or a projection that projects points lying on neighboring lines of sight to neighboring pixels.
In certain embodiments of the subject invention, a computer-readable storage medium is provided, having stored therein program instructions that, when executed by a processor of a computing system, cause the processor to execute a method for detecting one or more moving objects. The method comprises capturing, by an input module, a point cloud comprising measurements of distances to points on one or more objects; providing the point cloud captured by the input module to a detection module; and configuring the detection module to determine whether the objects are moving objects, by determining whether currently measured points occlude any previously measured points, and/or whether the currently measured points recursively occlude any previously measured points, and/or whether the currently measured points are recursively occluded by any previously measured points.
This patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
The foregoing and other objects and advantages of the present invention will become more apparent when considered in connection with the following detailed description and appended drawings in which like designations denote like elements in the various views, and wherein:
The embodiments of the subject invention provide methods and systems for detecting dynamic events from a sequence of point scans measured by ranging devices such as ranging sensors.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items. As used herein, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well as the singular forms, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification (except for the claims), specify the presence of stated features, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, and/or groups thereof.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one having ordinary skill in the art to which this invention pertains. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present disclosure and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
When the term “about” is used herein, in conjunction with a numerical value, it is understood that the value can be in a range of 90% of the value to 110% of the value, i.e. the value can be +/−10% of the stated value. For example, “about 1 kg” means from 0.90 kg to 1.1 kg.
In describing the invention, it will be understood that a number of techniques and steps are disclosed. Each of these has individual benefits and each can also be used in conjunction with one or more, or in some cases all, of the other disclosed techniques. Accordingly, for the sake of clarity, this description will refrain from repeating every possible combination of the individual steps in an unnecessary fashion. Nevertheless, the specification and claims should be read with the understanding that such combinations are entirely within the scope of the invention and the claims.
The term “ranging direction” used herein refers to a direction along which a ranging sensor measures a distance to a moving object or a stationary object.
Referring to
A detection module 220 in processor 218 reads the point cloud data from memory 218 or directly from range sensor 115 and determines whether the objects in the field of view are moving objects by checking whether the data points currently measured occlude any data points previously measured, and/or whether the data points currently measured recursively occlude any data points previously measured, and/or whether the data points currently measured are recursively occluded by any data points previously measured. The detection module 220 can be programmed to make the determinations based on the points currently measured or previously measured either sequentially or simultaneously. For each current point of the point cloud data being processed, the corresponding sensor pose at the measured time, including a translational component and a rotational component, is read from an odometry system or odometer 222 in
Referring to
The one or more objects may move perpendicularly to the ranging directions, move in parallel with the ranging directions, or move in a direction that can be decomposed into a first direction perpendicular to the ranging direction and a second direction parallel to the ranging direction.
In one embodiment, the ranging sensor measures the distances to an object in a field of view (FoV) in one ranging direction or multiple ranging directions.
In one embodiment, the ranging sensor can be one of a light detection and ranging (LiDAR) sensor, a laser scanner, an ultrasonic sensor, a radar, or any suitable sensor that captures the three-dimensional (3-D) structure of a moving object or a stationary object from the viewpoint of the sensor. The ranging sensor can be used in a variety of applications, such as robot/unmanned aerial vehicles (UAVs) navigation, autonomous driving, environment monitoring, traffic monitoring, surveillance, and 3D reconstruction.
The moving object detection system and method of the subject invention may instantaneously detect data points of the moving objects, referred to as dynamic event points, by determining the occlusion between currently measured points and all or a selected number of previously measured points, based on two fundamental principles of physics.
The first principle is that an object, when moving perpendicular to the ranging direction, partially or wholly occludes the background objects that have been previously detected by the moving object detection system and method.
The second principle is that an object, when moving parallel to the ranging direction, occludes or can be occluded by itself repeatedly. This second occlusion principle is necessary to form a complete moving object detection method, since the first occlusion principle can only detect objects that are moving across the ranging direction, not along it.
On the other hand, when the object is moving towards the sensor in parallel with the ranging direction, the phenomenon is identical except that the corresponding point occludes other point(s) instead of being occluded by other point(s). That is, a point on the moving object would occlude previous points that in turn occlude even earlier points, recursively. It should be noted that applying this phenomenon to moving object detection from LiDAR point clouds is not disclosed by any prior art.
It is also noted that for both the first and the second principles, whether a point recursively occludes previous points or is recursively occluded by previous points can be determined as soon as the point is generated, which enables instantaneous detection of the moving objects at the point measuring rate.
The main challenge addressed by the present invention resides in the large number of previous points, which are difficult to process in real time. To address this challenge, the invention has several novel features. First, depth images are used to store the previous points to ensure efficient occlusion checking (which is the foundation of the occlusion principles): currently measured points are projected into depth images to perform occlusion checks against points in their neighboring pixels, which makes the occlusion check more efficient. Second, only a certain number of depth images containing previous points are saved for occlusion checking, which limits the number of depth images involved and hence the processing time.
In one embodiment, determination of the occlusion between points at the current time and points at previous times can be implemented by depth images. In particular, the determination of the occlusions is performed based on depth images by comparing the depth of the current points with that of previous points that are projected to the same or adjacent pixels of the depth image.
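By way of illustration only, the following minimal sketch shows one way such a per-pixel occlusion check could be organized. The class and function names (DepthImage, check_occlusion), the neighborhood radius, and the margin threshold are assumptions made for the sketch and are not part of the disclosure.

```python
import numpy as np

# Minimal sketch of a depth image used for occlusion checks.
# Each pixel stores the depths of previous points projected into it;
# a current point is compared against depths stored in its own pixel
# and in the adjacent pixels.
class DepthImage:
    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        # pixel grid; each cell holds a list of stored depths
        self.pixels = [[[] for _ in range(cols)] for _ in range(rows)]

    def insert(self, row, col, depth):
        if 0 <= row < self.rows and 0 <= col < self.cols:
            self.pixels[row][col].append(depth)

    def neighbor_depths(self, row, col, radius=1):
        # gather depths stored in the pixel and its adjacent pixels
        depths = []
        for r in range(max(0, row - radius), min(self.rows, row + radius + 1)):
            for c in range(max(0, col - radius), min(self.cols, col + radius + 1)):
                depths.extend(self.pixels[r][c])
        return depths


def check_occlusion(depth_image, row, col, current_depth, margin=0.1):
    """Return 'occludes', 'occluded', or None for a current point against one depth image."""
    previous = depth_image.neighbor_depths(row, col)
    if not previous:
        return None
    if current_depth < min(previous) - margin:
        return "occludes"   # current point is closer than the stored previous points
    if current_depth > max(previous) + margin:
        return "occluded"   # current point is farther than the stored previous points
    return None
```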
As shown in
A depth image can be attached with a pose (referred to as the depth image pose) with respect to a reference frame (referred to as the reference frame x′-y′-z′), indicating where the depth image is constructed.
When (R, t) is defined as the depth image pose and p is defined as the coordinates of a point (either the current point or any previous point) in the same reference frame, a projection of the point to the depth image can be achieved by the following steps.
First, the point is transformed into the depth image pose by Equation (1)
Then, the transformed point is projected to the depth image as shown by Equations (2) and (3)
where a projection such as perspective projection can be used.
Finally, the pixel location of the projected point location is determined by Equations (4) and (5)
where d is the pixel size which is the resolution of the depth image.
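Since Equations (1)-(5) are not reproduced in this text, the following sketch illustrates one plausible reading of the three steps, assuming that (R, t) maps the depth-image frame to the reference frame and that a spherical projection with angular pixel size d is used; the function name, conventions, and the omission of any image-centering offsets are illustrative assumptions only.

```python
import numpy as np

def project_to_depth_image(p, R, t, d):
    """Sketch of the transform-project-discretize steps described above.

    A point p expressed in the reference frame is first moved into the
    depth-image frame (a stand-in for Equation (1)), then projected
    spherically (stand-ins for Equations (2) and (3)), and finally
    discretized by the pixel size d (stand-ins for Equations (4) and (5)).
    The exact equations of the disclosure are not reproduced here.
    """
    # Step 1: transform the point into the depth image pose
    p_local = R.T @ (np.asarray(p, dtype=float) - np.asarray(t, dtype=float))

    # Step 2: spherical projection to azimuth/elevation angles and depth
    depth = np.linalg.norm(p_local)
    azimuth = np.arctan2(p_local[1], p_local[0])
    elevation = np.arcsin(p_local[2] / depth)

    # Step 3: pixel location from the angular resolution d (pixel size);
    # offsets that center the image are omitted for brevity
    col = int(np.floor(azimuth / d))
    row = int(np.floor(elevation / d))
    return row, col, depth
```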
In one embodiment, the points are projected to the depth image by a spherical projection, a perspective projection, or any other suitable projection that projects points lying on neighboring lines of sight to neighboring pixels.
In one embodiment, in a moving platform, a depth image is attached with a pose read from an external motion sensing device such as an odometry module, indicating under which pose the depth image is constructed and points are configured to be transformed to this pose before the projection to the depth image.
In one embodiment, each pixel of a depth image saves all or a selected number of points projected therein, and/or all or a select number of the depths of points projected therein, and/or the statistical information, for example, the minimum value, the maximum value, or the variance, of depths of all or a selected number of points projected therein, and/or the occluded points' other information attached to points projected therein.
In one embodiment, depth images are constructed at multiple prior poses and each depth image is constructed from points starting from the respective pose and accumulating for a certain period. Moreover, each point in a pixel saves the points in previous depth images that occlude the point or are occluded by the point.
In one embodiment, the occlusion of current points is determined against all or a selected number of depth images previously constructed.
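A minimal sketch of how such a bounded set of depth images, each started at a prior pose and accumulating points for a fixed period, could be maintained is shown below. The class name, the window length, and the accumulation period are illustrative assumptions rather than features of the disclosure.

```python
from collections import deque

class DepthImageWindow:
    """Illustrative container for depth images built at multiple prior poses.

    Each depth image is started at the sensor pose current at that moment and
    accumulates points for a fixed period; only the most recent images are
    retained, so occlusion checks run against a bounded set.
    """

    def __init__(self, max_images=10, accumulate_time=0.1):
        self.max_images = max_images            # number of previous depth images kept
        self.accumulate_time = accumulate_time  # accumulation period per image (seconds)
        self.images = deque()                   # entries: (start_time, pose, depth_image)

    def image_for(self, time, pose, make_image):
        # start a new depth image at the current pose once the period has elapsed
        if not self.images or time - self.images[-1][0] >= self.accumulate_time:
            self.images.append((time, pose, make_image()))
            if len(self.images) > self.max_images:
                self.images.popleft()            # discard the oldest depth image
        return self.images[-1][2]

    def previous_images(self):
        # depth images available for occlusion checks, newest first
        return [img for _, _, img in reversed(self.images)]
```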
In one embodiment, a current point is considered as occluding previous points if its depth is smaller than all or any points contained in adjacent pixels of any depth images it projects to.
In one embodiment, a current point is considered to be occluded by the previous points if its depth is larger than all or any points contained in adjacent pixels of any depth images it projects to.
In one embodiment, the occlusion of the current point and points in a depth image could be rejected or corrected by additional tests, for example, depending on whether the current point is too close to points in the depth image.
In one embodiment, a current point is considered to recursively occlude previous points if it occludes a set of points in previous depth images, and in the set, points in later depth images occlude points in earlier depth images.
In one embodiment, a current point is considered to be recursively occluded by the previous points if it is occluded by a set of points in previous depth images, and in the set, points in later depth images are occluded by points in earlier depth images.
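The following sketch illustrates one possible form of the recursive-occlusion test, assuming the depth images are ordered from newest to oldest and that a per-image helper (find_point_occluded_by) returns a stored point occluded by a query point; the helper, the threshold, and all names are assumptions. The symmetric test for being recursively occluded simply reverses the direction of each occlusion comparison.

```python
MIN_RECURSION_DEPTH = 2  # illustrative stand-in for "a certain number of times"

def recursively_occludes(current_point, depth_images):
    """Sketch of the recursive-occlusion test.

    depth_images are assumed to be ordered from newest to oldest, and
    find_point_occluded_by is an assumed per-image lookup returning a stored
    point that the query point occludes (or None).  The current point must
    occlude a point in a newer depth image, that point must occlude a point
    in an older depth image, and so on along a chain of previous depth images.
    """
    chain_head = current_point
    chain_length = 0
    for image in depth_images:
        occluded = image.find_point_occluded_by(chain_head)
        if occluded is None:
            continue  # the chain may skip depth images with no occluded point
        chain_head = occluded
        chain_length += 1
        if chain_length >= MIN_RECURSION_DEPTH:
            return True
    return False
```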
The depth image can be implemented with a fixed resolution as shown above or with multiple resolutions. Moreover, the depth image can be implemented as a two-dimensional array or another type of data structure such that the pixel locations of previous points can be organized more efficiently.
Referring to
In the first step, a current point can be individually processed immediately after it is received, or a group of points can form a batch by being accumulated over a certain period of time, for example, at a frame rate of 50 Hz. For each current point being processed, the corresponding sensor pose at the measured time, including a translational component and a rotational component, is read from an external odometry system or odometer 222 in
Referring to Test 1 of
Referring to Test 2 of
In particular, if the current point is occluded by any points in a selected set of previous depth image (e.g., denoted by pI
Referring to Test 3 of
In particular, if the current point occludes any points in a selected set of previous depth image (e.g., denoted by pI
Referring back to
Referring to
The moving object detection method and system of the subject invention can instantaneously distinguish points of the moving objects from the points of stationary objects measured by the ranging devices. Based on this point-level detection, moving objects in a detected scene can be robustly and accurately recognized and tracked, which is essential for an agent such as a robot/UAV, a self-driving car, or an alarming system to react or respond to the moving objects.
In particular,
The computational efficiencies of the various experiments performed by the moving object detection system and method are shown in Table 1 below. A detection latency smaller than 0.5 μs can be achieved.
Due to the limited sensor resolution, the occlusion principles may lead to many false positive points (i.e., points not on moving objects being detected as on moving objects). To address this problem, a false positive rejection (FPR) step, following the occlusion check step, can be applied to each dynamic event point. This step aims to reject incorrect event points according to the principle that a point on a moving object should be distant from stationary map points. Since one depth image contains only a part of the map points, a map consistency check is performed on multiple recent depth images. Specifically, after the occlusion check, all non-event points are retrieved in the pixel where the current point is projected and in neighboring pixels. If any retrieved point has a depth similar to that of the current point, the current point is considered a stationary point and the dynamic event point is revised as a non-event one.
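A minimal sketch of this map consistency check is given below, assuming each depth image exposes an illustrative helper (neighbor_non_event_depths) returning the depths of stationary, non-event points stored in a pixel and its neighbors; the depth tolerance is likewise an assumed parameter.

```python
def is_false_positive(current_depth, row, col, recent_depth_images,
                      depth_tolerance=0.2):
    """Sketch of the false positive rejection (FPR) map consistency check.

    An event point whose depth matches a nearby stationary map point stored
    in any of the recent depth images is demoted to a non-event point.
    """
    for image in recent_depth_images:
        for stored_depth in image.neighbor_non_event_depths(row, col):
            if abs(stored_depth - current_depth) < depth_tolerance:
                return True  # consistent with the static map -> reject the event
    return False
```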
The embodiments of the moving object detection system and method of the subject invention provide many advantages.
First, the embodiments are robust for detecting dynamic events of moving objects of different types, shapes, sizes, and speeds, such as moving vehicles, pedestrians, and cyclists in autonomous driving and traffic monitoring applications, intruders in security surveillance applications, or general objects such as humans or animals on the ground, birds in the air, and other man-made or natural objects in UAV navigation applications.
Second, the embodiments are adaptable for working with different types of ranging sensors including, but not limited to, conventional multi-line spinning LiDARs, emerging solid-state or hybrid LiDARs, 3D laser scanners, radars, or other suitable ranging sensors, even when the ranging sensor itself is moving.
Third, the embodiments are highly efficient and can run at high point measuring rates, for example a few tens of thousands of Hertz when running on embedded low-power computers.
Fourth, the embodiments can achieve a low latency for determining whether a point is a dynamic event immediately after the measurement of the point is conducted. For example, the latency between the measurement of a point on any moving object and the determination can be less than one microsecond.
All patents, patent applications, provisional applications, and publications referred to or cited herein are incorporated by reference in their entirety, including all figures and tables, to the extent they are not inconsistent with the explicit teachings of this specification.
It should be understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and the scope of the appended claims. In addition, any elements or limitations of any invention or embodiment thereof disclosed herein can be combined with any and/or all other elements or limitations (individually or in any combination) or any other invention or embodiment thereof disclosed herein, and all such combinations are contemplated within the scope of the invention without limitation thereto.
Embodiment 1. A moving object detection system, comprising:
Embodiment 2. The moving object detection system of embodiment 1, wherein whether the objects are moving objects is determined either sequentially or simultaneously, with the system optionally configured with other processing steps and modules for performance enhancements.
Embodiment 3. The moving object detection system of embodiment 1, wherein the previously measured points of the moving objects are partially or completely excluded in the determination of occlusion for currently measured points.
Embodiment 4. The moving object detection system of embodiment 1, wherein the determination of occlusion is performed based on a depth image by comparing the depth of the currently measured points with previously measured ones that are projected to the same or adjacent pixels of the depth image to determine the occlusion, with the occlusion results being corrected by additional tests for performance enhancements.
Embodiment 5. The moving object detection system of embodiment 4, wherein the points are projected to the depth image by a spherical projection, a perspective projection, or a projection that projects points lying on neighboring lines of sight to neighboring pixels.
Embodiment 6. The moving object detection system of embodiment 5, wherein in a moving platform, the depth image is attached with a pose read from an external motion sensing module, indicating under which pose the depth image is constructed and points are configured to be transformed to this pose before projection to the depth image.
Embodiment 7. The moving object detection system of embodiment 5, wherein for each pixel of the depth image, the detection module is configured to save all or a selected number of points projected therein, and/or all or a select number of the depths of points projected therein, and/or the statistical information comprising a minimum value, a maximum value, or a variance of depths of all or a selected number of points projected therein, and/or other information of the occluded points attached to points projected therein.
Embodiment 8. The moving object detection system of embodiment 5, wherein multiple depth images are constructed at multiple prior poses, and each is constructed from points starting from the respective pose and accumulating for a certain period of time.
Embodiment 9. The moving object detection system of embodiment 8, wherein for each point of a pixel, the detection module is configured to save the points in a previous depth image that occlude the point or are occluded by the point.
Embodiment 10. The moving object detection system of embodiment 8, wherein the occlusion of current points is determined against all or a selected number of depth images previously constructed.
Embodiment 11. The moving object detection system of embodiment 10, wherein a current point is determined to occlude previous points if its depth is smaller than all or any points contained in adjacent pixels of any depth image to which it projects.
Embodiment 12. The moving object detection system of embodiment 10, wherein a current point is determined to be occluded by previous points if its depth is greater than all or any points contained in adjacent pixels of any depth image to which it projects.
Embodiment 13. The moving object detection system of embodiment 11, wherein a current point is determined to recursively occlude previous points if it occludes a set of points in previous depth images and in the set, points in later depth images occlude points in earlier depth images.
Embodiment 14. The moving object detection system of embodiment 12, wherein a current point is determined to be recursively occluded by previous points if it is occluded by a set of points in previous depth images and, in the set, points in later depth images are occluded by points in earlier depth images.
Embodiment 15. A method for detecting one or more moving objects, the method comprising:
Embodiment 16. The method of embodiment 15, wherein whether the objects are moving objects is determined either sequentially or simultaneously, with the method optionally configured with other processing steps for performance enhancements.
Embodiment 17. The method of embodiment 15, wherein the previously measured points of the moving objects are partially or completely excluded in the determination of occlusion for currently measured points.
Embodiment 18. The method of embodiment 15, wherein the determination of occlusion is performed based on a depth image by comparing the depth of the currently measured points with previously measured ones that project to the same or adjacent pixels of the depth image to determine the occlusion, with the occlusion results being corrected by additional tests for performance enhancements.
Embodiment 19. The method of embodiment 18, wherein the points are projected to the depth image by a spherical projection, a perspective projection, or a projection that projects points lying on neighboring lines of sight to neighboring pixels.
Embodiment 20. The method of embodiment 19, wherein in a moving platform, the depth image is attached with a pose read from an external motion sensing module, indicating under which pose the depth image is constructed and points are configured to be transformed to this pose before projection to the depth image.
Embodiment 21. The method of embodiment 19, wherein for each pixel of the depth image, the detection module is configured to save all or a selected number of points projected therein, and/or all or a select number of the depths of points projected therein, and/or the statistical information comprising a minimum value, a maximum value, or a variance of depths of all or a selected number of points projected therein, and/or other information of the occluded points attached to points projected therein.
Embodiment 22. The method of embodiment 19, wherein multiple depth images are constructed at multiple prior poses, and each is constructed from points starting from the respective pose and accumulating for a certain period of time.
Embodiment 23. The method of embodiment 22, wherein for each point of a pixel, the detection module is configured to save the points in a previous depth image that occlude the point or are occluded by the point.
Embodiment 24. The method of embodiment 22, wherein the occlusion of current points is determined against all or a selected number of depth images previously constructed.
Embodiment 25. The method of embodiment 24, wherein a current point is determined to occlude previous points if its depth is smaller than all or any points contained in adjacent pixels of any depth image to which it projects.
Embodiment 26. The method of embodiment 24, wherein a current point is determined to be occluded by previous points if its depth is greater than all or any points contained in adjacent pixels of any depth image to which it projects.
Embodiment 27. The method of embodiment 25, wherein a current point is determined to recursively occlude previous points if it occludes a set of points in previous depth images and in the set, points in later depth images occlude points in earlier depth images.
Embodiment 28. The method of embodiment 26, wherein a current point is determined to be recursively occluded by previous points if it is occluded by a set of points in previous depth images and, in the set, points in later depth images are occluded by points in earlier depth images.
Embodiment 29. A computer-readable storage medium having stored therein program instructions that, when executed by a processor of a computing system, cause the processor to execute a method for detecting one or more moving objects, the method comprising: capturing, by an input module, a point cloud comprising measurements of distances to points on one or more objects;
This application is a U.S. National Stage Application under 35 U.S.C. § 371 of International Patent Application No. PCT/CN2023/085922, filed Apr. 3, 2023, and claims the benefit of priority under 35 U.S.C. § 119(e) of U.S. Provisional Application No. 63/362,445, filed Apr. 4, 2022, which is hereby incorporated by reference in its entirety including any tables, figures, or drawings. The International Application was published on Oct. 12, 2023, as International Publication No. WO 2023/193681 A1.
Filing Document | Filing Date | Country | Kind
--- | --- | --- | ---
PCT/CN2023/085922 | 4/3/2023 | WO |
Number | Date | Country
--- | --- | ---
63362445 | Apr 2022 | US