The use of depth information and real-time image information is essential in robotic navigation. In legged robots, these sensors are required to create a representation of the terrain around the robot that is accurate and dense enough to search for footholds. Additionally, the representation needs to be updated without delay as the robot rapidly moves through this environment, since there is usually a short window of time in which a safe foothold is determined.
One method of creating this representation that has been commonly used in the past is accumulating measurements over time to create a “map” of the terrain around the robot. This method relies on state estimation or odometry to maintain an understanding of the robot's motion relative to the terrain, and uses that knowledge to create a combined terrain map. These odometry methods may use a combination of visual, inertial, and encoder measurements. The benefit of this common method is that only a few depth sensors may be required, since the data is accumulated over time. However, the frequent footfalls of legged robots appear as noise in inertial data, and toe slip can introduce large errors when incorporating encoder, inertial, and visual data. If the state estimation result is sufficiently inaccurate for these reasons, there is no possibility of recovery, since the depth sensors may not have immediate visibility of the terrain near the robot's feet. Further, the approach of combining multiple measurements over time often assumes the environment remains relatively static and unchanged, which does not always hold true.
To avoid this estimation error and ultimately achieve desirable and accurate results when operating vision-enabled legged locomotion on staircases, the composite field of view stretching from just in front of the robot to just behind it may be kept persistent and updated at a rate conducive to legged locomotion. Adding additional downward-facing depth and visual sensors creates a comprehensive, composite field of view that covers the region in which a legged robot may tread.
The present invention positions a plurality of depth cameras at various locations on a legged robot, in particular at the front, at the back, and beneath the center of the robot's chassis. By positioning depth cameras at specific angles, a composite field of view is provided that stretches across the front, center, and back of the legged robot, yielding more reliable results for vision-enabled legged locomotion. This approach enables more accurate and safer stepping for a legged robot on a staircase.
The present invention utilizes a plurality of depth cameras positioned in the base of the robot's chassis, as well as at the front and back of the legged robot. The cameras provide an all-encompassing view of the terrain surrounding and beneath the robot. Depth information is obtained from these cameras in the form of a pointcloud, and the pointcloud data is used to aid the robot's stair climbing. The pointcloud data is processed to eliminate occlusions caused by parts of the robot's body and is used to create a heightmap. Each element within the heightmap holds terrain height information, and a stair model fitting is performed to estimate a stair's height and run dimensions. This model fills the missing regions of the heightmap and allows the legged robot to move through a staircase or elevated terrain.
Next, a gradient map is calculated from the heightmap, which is essential to the foothold selection process. The combination of these techniques helps a legged robot climb stairs by utilizing the depth information from the plurality of cameras affixed to the robot's body, enhancing its perception and decision-making during navigation.
The present invention's depth camera positions provide a comprehensive field of view. Legged robots that have only front and back cameras cannot properly observe the terrain beneath them and must instead estimate its height. That estimation is difficult because accurate foot placement is required despite measurement noise, and it is impossible to re-initialize in the event of accumulated inaccuracy in the estimate. To mitigate this estimation problem, the present invention discloses a system design outfitted with a plurality of cameras that cover a full field of view of the front, the back, and the area beneath the robot's feet.
The present invention employs a plurality of depth cameras to capture visual data; however, any assortment of cameras that accurately captures depth images with a wide field of view and generates depth data at a sufficiently high rate may be used. The acquired images are, in turn, converted to pointcloud information regarding the height of the surrounding environment, including the region underneath the robot. One camera is strategically positioned on the front of the robot, tilted downward at an angle of, by way of example and not limitation, 25 degrees. Another camera is positioned on the back, also facing downward but at a slight angle of 15 degrees. There are also cameras located in the robot's belly, facing directly downward with an inclination of 10 degrees relative to the horizontal. When the robot's height exceeds 330 cm, these cameras effectively cover the entire field of view beneath the robot. This configuration ensures comprehensive visual coverage and facilitates robust data collection for the robot's navigation and perception tasks.
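By way of illustration and not limitation, the ground strip covered by a single downward-tilted camera can be approximated from its mounting height, tilt angle, and vertical field of view. The minimal sketch below assumes a hypothetical 0.5 m mounting height and a 58-degree vertical field of view, neither of which is specified in the disclosure.

```python
import math

def ground_coverage(mount_height_m, tilt_down_deg, vertical_fov_deg):
    """Approximate the strip of flat ground seen by a downward-tilted camera.

    Returns (near, far) horizontal distances, in meters, from the camera to
    the edges of the imaged ground strip. A far value of float('inf') means
    the shallowest ray never intersects the ground plane.
    """
    half_fov = vertical_fov_deg / 2.0
    lower_ray = math.radians(tilt_down_deg + half_fov)  # steepest ray
    upper_ray = math.radians(tilt_down_deg - half_fov)  # shallowest ray

    near = mount_height_m / math.tan(lower_ray)
    far = mount_height_m / math.tan(upper_ray) if upper_ray > 0 else float("inf")
    return near, far

# Front camera tilted 25 degrees down (per the example above), with assumed
# mounting height and field of view.
print(ground_coverage(0.5, 25.0, 58.0))
```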
Next, the present invention generates a heightmap from the pointcloud. The plurality of cameras offers a wide composite field of view, and thus provides depth information about the areas beneath, in front of, and behind the robot. However, during climbing maneuvers, the robot's legs may enter the field of view of the cameras, potentially corrupting the depth information of the environment. To address this issue, a slicing strategy is implemented to mitigate the impact of the legs on the depth pointcloud. This heightmap processing is typically performed by way of a computing box stationed inside the legged robot that features a microprocessor and memory unit.
The kinematics of the robot's legs are utilized to determine the width of the point cloud slice. By using the toe positions as determined by the kinematics, the range in the y direction of the point cloud slice is established to form the heightmap. Specifically, the minimum y position of the left toes and the maximum y position of the right toes are employed to define this range.
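A minimal sketch of this slicing step is shown below, assuming a body frame in which +y points to the robot's left; the coordinate convention and the example toe positions are illustrative and not taken from the disclosure.

```python
import numpy as np

def slice_pointcloud(points, left_toe_y, right_toe_y):
    """Keep only points in the corridor between the left and right toes.

    points      : (N, 3) array of [x, y, z] points in the robot body frame.
    left_toe_y  : y positions of the left toes, from leg kinematics.
    right_toe_y : y positions of the right toes, from leg kinematics.
    """
    y_upper = min(left_toe_y)   # inner edge of the left legs
    y_lower = max(right_toe_y)  # inner edge of the right legs
    mask = (points[:, 1] > y_lower) & (points[:, 1] < y_upper)
    return points[mask]

# Hypothetical cloud and toe positions.
cloud = np.random.uniform(-1.0, 1.0, size=(1000, 3))
sliced = slice_pointcloud(cloud, left_toe_y=[0.22, 0.20], right_toe_y=[-0.21, -0.19])
```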
Upon obtaining the pointcloud slice without toe occlusion, heightmap information is generated. The heightmap consists of several elements, each storing the height of the terrain. The heightmap accumulates spatially consistent point cloud data into a more concise and spatially ordered structure, facilitating operations such as gradient computation and reducing computation time for dependent algorithms.
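One plausible way to accumulate the sliced pointcloud into such a heightmap is sketched below; the cell size, grid extents, and the choice of keeping the maximum height per cell are assumptions made for illustration rather than details taken from the disclosure.

```python
import numpy as np

def build_heightmap(points, x_range, y_range, cell_size=0.02):
    """Accumulate an (N, 3) point cloud into a 2-D heightmap.

    Each cell stores the maximum z of the points falling inside it; cells
    that receive no points are left as NaN so a later step (e.g. stair model
    fitting) can fill them in.
    """
    nx = int((x_range[1] - x_range[0]) / cell_size)
    ny = int((y_range[1] - y_range[0]) / cell_size)
    heightmap = np.full((nx, ny), np.nan)

    ix = ((points[:, 0] - x_range[0]) / cell_size).astype(int)
    iy = ((points[:, 1] - y_range[0]) / cell_size).astype(int)
    keep = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)

    for i, j, z in zip(ix[keep], iy[keep], points[keep, 2]):
        if np.isnan(heightmap[i, j]) or z > heightmap[i, j]:
            heightmap[i, j] = z
    return heightmap
```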
The present invention also orchestrates stair model fitting. When a legged robot is traversing stairs, the distance between the robot itself and the stairs may be less than 330 cm, leading to incomplete views captured by the cameras. To address this issue, a stair fitting algorithm is employed. The stair fitting algorithm is executed by way of a processor that resides in the computing box affixed to the legged robot's chassis and operates over a wireless network.
The algorithm begins by assuming that the stair steps are uniform and models the staircase using two parameters, height and run. It proceeds by calculating the fitting error for each combination of height and run, incrementally changing these parameters in steps of 1 cm. This process continues until the fitting errors have been computed for all possible height and run combinations. Subsequently, the algorithm selects the result with the smallest fitting error.
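A minimal sketch of this brute-force fit is given below. The search ranges for height and run, the mean-squared-error metric, and the alignment of the staircase origin are assumptions made for illustration; the disclosure specifies only the uniform-step model and the 1 cm search increment.

```python
import numpy as np

def fit_stair_model(xs, zs, height_range=(0.05, 0.25), run_range=(0.20, 0.40), step=0.01):
    """Brute-force search over (height, run) in 1 cm increments.

    xs, zs : forward distance and measured terrain height taken from valid
             heightmap cells along the direction of travel.
    Returns the (height, run, error) triple with the smallest fitting error,
    where the error is the mean squared difference between measured heights
    and an ideal uniform staircase z(x) = floor(x / run) * height.
    """
    best = None
    for h in np.arange(height_range[0], height_range[1] + 1e-9, step):
        for r in np.arange(run_range[0], run_range[1] + 1e-9, step):
            predicted = np.floor(xs / r) * h
            error = float(np.mean((zs - predicted) ** 2))
            if best is None or error < best[2]:
                best = (h, r, error)
    return best
```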
The process of foothold selection utilizes a multi-objective optimization search (equation 1). The first two terms are the cost of deviating from the nominal foothold location (J_nom) and the gradient at the current location (J_grad). J_nom is proportional to the distance between the current location and the nominal foothold location, where the nominal foothold is the toe location based on the robot's dynamics. J_grad is calculated from the gradient map. This combination ensures consideration of both proximity to the desired foothold position and the terrain's slope.
To enhance stability and prevent excessive movement of the foothold location in the presence of a noisy heightmap, a damping term, J_damp, is introduced in the present invention. The damping term penalizes discrepancies between the current foothold location and its previous position. As a result, the foothold selection process is more robust, providing smoother and more controlled foothold placement even in challenging, uncertain, and/or unstructured terrain conditions. The objective function is given in equation (1).
J = w_nom · J_nom + w_grad · J_grad + w_damp · J_damp  (1)
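As an illustrative sketch of equation (1), the cost of one candidate foothold could be evaluated as below; the weight values, grid indexing, and use of Euclidean distances for J_nom and J_damp are assumptions, since the disclosure does not fix them.

```python
import numpy as np

def foothold_cost(candidate, nominal, previous, gradient_map, cell_size,
                  w_nom=1.0, w_grad=1.0, w_damp=0.5):
    """Evaluate J = w_nom*J_nom + w_grad*J_grad + w_damp*J_damp for one (x, y) candidate.

    J_nom  : distance from the nominal foothold given by the robot's dynamics.
    J_grad : gradient-map value at the candidate's cell (terrain slope penalty).
    J_damp : distance from the previously selected foothold (damping term).
    """
    j_nom = np.linalg.norm(np.asarray(candidate) - np.asarray(nominal))
    i, j = int(candidate[0] / cell_size), int(candidate[1] / cell_size)
    j_grad = gradient_map[i, j]
    j_damp = np.linalg.norm(np.asarray(candidate) - np.asarray(previous))
    return w_nom * j_nom + w_grad * j_grad + w_damp * j_damp
```

Under this sketch, the selected foothold would simply be the candidate in the search region with the smallest value of J.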
In the present invention's stair model fitting method, the robot, in theory, is allowed to step at any location. However, with the gradient map, the robot should prefer to step in flatter areas rather than uneven areas. The stair model yields the optimal height and run, and the missing regions in the heightmap are filled by way of the algorithm employed. This enhances the perception of terrain during stair traversal, thus enabling the robot to make more strategic and informed decisions when navigating stairs.
The gradient map calculation indicates how suitable each location in the map is for the legged robot to place its feet. This method does not employ a 3D signed distance field calculated from a terrain map, but rather a convolution operation. This is a feature designed to aid in sensory data processing and anomaly detection. The outcome, with all features considered, is an advanced method of using depth sensors positioned on a legged robot for efficient and accurate stair climbing operations.
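By way of example, such a convolution-based gradient map could be computed as sketched below; the Sobel-style kernels are one plausible choice and are not mandated by the disclosure, which states only that a convolution is used in place of a 3D signed distance field.

```python
import numpy as np
from scipy.ndimage import convolve

def gradient_map(heightmap):
    """Compute a terrain-slope map from the heightmap using convolution.

    Assumes unobserved (NaN) cells were already filled by the stair model.
    Returns the per-cell gradient magnitude; larger values indicate steeper,
    less suitable locations for foot placement.
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    gx = convolve(heightmap, kx, mode="nearest")
    gy = convolve(heightmap, ky, mode="nearest")
    return np.hypot(gx, gy)
```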
Other features and aspects of the invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, which illustrate, by way of example, the features in accordance with embodiments of the invention. The summary is not intended to limit the scope of the invention, which is defined solely by the claims attached hereto.
The various embodiments are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings. Having thus described the invention in general terms, reference will now be made to the accompanying drawings, which are not necessarily drawn to scale, and wherein:
While various embodiments of the disclosed technology have been described above, it should be understood that they have been presented by way of example only, and not of limitation. Likewise, the various diagrams may depict an example architectural or other configuration for the disclosed technology, which is done to aid in understanding the features and functionality that may be included in the disclosed technology. The disclosed technology is not restricted to the illustrated example architectures or configurations, but the desired features may be implemented using a variety of alternative architectures and configurations. Indeed, it will be apparent to one of skill in the art how alternative functional, logical or physical partitioning and configurations may be implemented to implement the desired features of the technology disclosed herein. Also, a multitude of different constituent module names other than those depicted herein may be applied to the various partitions. Additionally, with regard to flow diagrams, operational descriptions and method claims, the order in which the steps are presented herein shall not mandate that various embodiments be implemented to perform the recited functionality in the same order unless the context dictates otherwise.
Although the disclosed technology is described above in terms of various exemplary embodiments and implementations, it should be understood that the various features, aspects and functionality described in one or more of the individual embodiments are not limited in their applicability to the particular embodiment with which they are described, but instead may be applied, alone or in various combinations, to one or more of the other embodiments of the disclosed technology, whether or not such embodiments are described and whether or not such features are presented as being a part of a described embodiment. Thus, the breadth and scope of the technology disclosed herein should not be limited by any of the above-described exemplary embodiments.
Terms and phrases used in this document, and variations thereof, unless otherwise expressly stated, should be construed as open ended as opposed to limiting. As examples of the foregoing: the term “including” should be read as meaning “including, without limitation” or the like; the term “example” is used to provide exemplary instances of the item in discussion, not an exhaustive or limiting list thereof; the terms “a” or “an” should be read as meaning “at least one,” “one or more” or the like; and adjectives such as “conventional,” “traditional,” “normal,” “standard,” “known” and terms of similar meaning should not be construed as limiting the item described to a given time period or to an item available as of a given time, but instead should be read to encompass conventional, traditional, normal, or standard technologies that may be available or known now or at any time in the future. Likewise, where this document refers to technologies that would be apparent or known to one of ordinary skill in the art, such technologies encompass those apparent or known to the skilled artisan now or at any time in the future.
This application claims the benefit of U.S. Provisional Patent Application Ser. No. 63/396,319 filed on Aug. 9, 2022, the contents of which are incorporated herein by reference.