This disclosure is generally directed to a hierarchical fuzzy controller with a reduced rule-base, and is particularly directed to (1) an Adaptive Neuro-Fuzzy Inference System (ANFIS), and specifically a fuzzy logic controller with Hierarchical Rule-Base Reduction (HRBR) and symmetric trapezoidal membership functions, implemented as a neural network trained via reinforcement learning using an ANFIS actor, (2) a multi-output fuzzy control method and system with trapezoidal membership functions, a symmetric rule-base, and a hierarchically reduced rule-base that prioritizes minimizing a waypoint distance error, and (3) a combination thereof.
Fuzzy logic techniques may be employed in various control functions in on-the-road or off-the-road autonomous vehicles/machines. Fuzzy logic decisions in these controllers may depend on the types and functions of the vehicles and may vary from vehicle to vehicle. At the core of a fuzzy logic controller lies the design of the inputs and their linguistic variables, the membership functions thereof, the control outputs and their linguistic variables, the membership functions thereof, and the fuzzy rule-base between the input linguistic variables and the output linguistic variables. As the numbers of input and output variables of a fuzzy logic controller increase, the rule-base grows exponentially. It is thus helpful to reduce the rule-base to the most essential and effective components even in the presence of a large number of input and output variables. A fuzzy logic thus may include a rule-base and various control parameters. All possible rules within the rule-base may be prioritized to obtain a reduced ruleset for a more efficient fuzzy logic. A fuzzy logic may be embodied as a neural network, referred to as an Adaptive Neuro-Fuzzy Inference System (ANFIS), with the fuzzy logic parameters being determined via training of the ANFIS.
This disclosure is generally directed to a hierarchical fuzzy controller with a reduced rule-base, and is particularly directed to (1) an Adaptive Neuro-Fuzzy Inference System (ANFIS), and specifically a fuzzy logic controller with Hierarchical Rule-Base Reduction (HRBR) and symmetric trapezoidal membership functions, implemented as a neural network trained via reinforcement learning using an ANFIS actor, (2) a multi-output fuzzy control method and system with trapezoidal membership functions, a symmetric rule-base, and a hierarchically reduced rule-base that prioritizes minimizing a waypoint distance error, and (3) a combination thereof.
For example, in one aspect, this disclosure particularly describes a multi-output fuzzy control method and system with trapezoidal membership functions and a hierarchically reduced symmetric rule-base.
For example, a waypoint navigation controller and corresponding controlling methods are described, where the controller functions as a multiple input-multiple output, e.g., nonlinear angular velocity and linear speed controller for a land vehicle such as a skid-steer vehicle. The controller and the controlling methods may be based on a fuzzy logic controller (alternatively referred to as a “fuzzy controller”). The membership functions of the fuzzy controller may employ a trapezoidal structure with a symmetric rule-base. In addition, a Hierarchical Rule-Base Reduction (HRBR) is incorporated into the controller so as to select only the rules most influential on state errors by selecting inputs/outputs, determining the most globally influential inputs, and generating a hierarchy relating inputs via a Fuzzy Relations Control Strategy (FRCS). The resulting fuzzy controller covers an entire operating environment of the vehicle, but a rule for every possible combination of variables, states, and outputs is no longer necessary. As a result, the described fuzzy controller can increase both the number of inputs and their associated fidelity without its rule-base dramatically increasing.
In an example implementation, a fuzzy controlling method for automatically controlling a vehicle is disclosed. The method may include determining current states of a plurality of control metrics relative to a planned path based on measurements from sensors installed on the vehicle; generating at least two control commands using a fuzzy logic controller with a hierarchically reduced rule-base; and converting the at least two control commands into one or more control signals for actuating one or more path-control actuators of the vehicle.
In the example method above, each of the plurality of control metrics is associated with a plurality of input member linguistic variables relating to the corresponding control metric by input fuzzy membership functions.
In any one of the example methods above, generating the at least two control commands using the fuzzy logic controller may include automatically converting the current state of each of the plurality of control metrics into input linguistic values of the plurality of input member linguistic variables based on the input fuzzy membership functions; automatically mapping the input member linguistic variables to linguistic control variables associated with at least two path-control actions of the vehicle based on the hierarchically reduced rule-base in the fuzzy logic controller; generating output linguistic values of the linguistic control variables for each of the at least two path-control actions based on the mapping and output fuzzy membership functions associated with the linguistic control variables; and defuzzificating the output linguistic values corresponding to the at least two path-control actions to generate the at least two control commands.
In any one of the example methods above, each of the input fuzzy membership functions specifies a trapezoidal relationship between a corresponding input member linguistic variable and the corresponding control metric.
In any one of the example methods above, the hierarchically reduced rule-base of the fuzzy logic controller is left-right symmetric.
In any one of the example methods above, the hierarchically reduced rule-base comprises a set of if-then rules linking the plurality of input member linguistic variables to the linguistic control variables, covering fewer than all possible combinations of the input member linguistic variables and the linguistic control variables.
In any one of the example methods above, the planned path comprises at least a current path segment and a next path segment joined by a target point; and the plurality of control metrics comprise a waypoint line distance from the vehicle to a waypoint between the target point and a projection point of the vehicle on the current path segment.
In any one of the example methods above, the plurality of control metrics may further include a target distance from the vehicle to the target point; a waypoint heading angle between a current heading direction of the vehicle and a line from the vehicle to the waypoint; a current path-alignment angle between the current heading direction of the vehicle and the current path segment; and a lookahead path-alignment angle between the current heading direction of the vehicle and the next path segment.
In any one of the example methods above, the hierarchically reduced rule-base may include rule branches and sub-branches based on hierarchically prioritizing within the control metrics according to the input member linguistic variables.
In any one of the example methods above, the control metric of the target distance may include a first input linguistic variable representing whether the vehicle is near the target point and a second input linguistic variable representing whether the vehicle is far from the target point; and top branches of the hierarchically reduced rule-base may include a first sub-rule-set and a second sub-rule-set corresponding to the first and second input linguistic variables of the target distance, respectively.
In any one of the example methods above, the first sub-rule-set is reduced from addressing all possible combinations of the input member linguistic variables of the waypoint line distance, the waypoint heading angle, the current path-alignment angle, and the lookahead path-alignment angle by ignoring at least one of the waypoint line distance, the waypoint heading angle, and the current path-alignment angle.
In any one of the example methods above, the second sub-rule-set is configured to ignore at least the lookahead path-alignment angle.
In any one of the example methods above, at least one of the sub-branches of the second sub-rule-set further ignores the waypoint heading angle.
In any one of the example methods above, at least one other of the sub-branches of the second sub-rule-set further ignores the current path-alignment angle.
In any one of the example methods above, the waypoint is determined as a point achieving a quickest approach to the planned path assuming a constant speed.
In any one of the example methods above, the at least two path-control actions may include an angular steering control and a linear speed control, and wherein the linguistic control variables corresponding to the angular steering control represent a plurality of angular steering levels and the linguistic control variables corresponding to the linear speed control represent a plurality of linear speed levels.
In any one of the example methods above, defuzzificating the output linguistic values is based on a center-of-mass methodology.
In any one of the example methods above, each of the output fuzzy membership functions associated with the linguistic control variables is a triangular function.
In any one of the example methods above, the vehicle comprises a skid-steer vehicle.
In another aspect, this disclosure further describes an Adaptive Neuro-Fuzzy Inference System (ANFIS) and is particularly directed to a fuzzy logic controller with Hierarchical Rule-Base Reduction (HRBR) and symmetric trapezoidal membership functions, implemented as a neural network trained via reinforcement learning using an ANFIS actor.
In particular, example approaches for designing and optimizing an ANFIS for symmetric linguistic values are disclosed. The ANFIS may correspond to a fuzzy logic with HRBR. Linguistic joint membership functions that underlie the fuzzy logic of the ANFIS are defined. Symmetrical properties with respect to the inputs/outputs of the ANFIS are utilized in joint optimization of the membership functions to reduce a number of training parameters. Further optimizations for the ANFIS are derived based on other design considerations, including but not limited to training the membership functions on closed or single-sided domains. The optimal output membership weights based on mean square error optimization may also be symbolically obtained. An example online training of the input/output membership functions of the ANFIS is performed using reinforcement training algorithms. Such reinforcement training may utilize an ANFIS actor.
In one example implementation, a method for generating a neuro-fuzzy logic controller is disclosed. The neuro-fuzzy logic controller is configured to generate at least one control output signal from a set of input signals. The method may include determining one or more input linguistic values and one or more output linguistic values for a fuzzy logic underlying the neuro-fuzzy logic controller; determining a rule-base linking the one or more input linguistic values and the one or more output linguistic values; performing a hierarchical rule-base reduction (HRBR) procedure to generate a modified fuzzy logic with a reduced rule-base; initializing the neuro-fuzzy logic controller to embed the modified fuzzy logic, including initial input membership functions associated with the one or more input linguistic values and the set of input signals, and initial output membership functions associated with the one or more output linguistic values and the at least one control output signal; tuning the membership functions via reinforcement training of the neuro-fuzzy logic controller to generate a trained neuro-fuzzy logic controller; and controlling an actuator based on the at least one control output signal generated by the trained neuro-fuzzy logic controller from the set of input signals.
In the example implementation above, the input membership functions comprise trapezoid relations between numerical values of the one or more input linguistic values and the set of input signals.
In any one of the example implementations above, the output membership functions comprise triangular relations between numerical values of the one or more output linguistic values and the at least one control output signal.
In any one of the example implementations above, the input membership functions comprise a combination of double sided and single sided trapezoids.
In any one of the example implementations above, the input membership functions are symmetric with respect to an input domain associated with each of the set of input signals.
In any one of the example implementations above, a shape of each double sided trapezoid of the input membership functions and the output membership functions is represented by four parameters in a corresponding domain.
In any one of the example implementations above, a shape of each single sided trapezoid of the input membership functions and the output membership functions is represented by two parameters in the corresponding domain.
In any one of the example implementations above, neighboring trapezoids of the input membership functions of a domain are constrained to have two joint parameters representing a slope region of the domain for the neighboring trapezoids.
In any one of the example implementations above, the hierarchically reduced rule-base comprises a set of if-then rules linking the one or more input linguistic values to the one or more output linguistic values that cover fewer than all possible if-then linking combinations of the one or more input linguistic values and the one or more output linguistic values.
In any one of the example implementations above, the hierarchically reduced rule-base comprises rule branches and sub-branches based on hierarchically prioritizing within a set of control metrics according to the one or more input linguistic values.
In any one of the example implementations above, the neuro-fuzzy logic controller comprises five neural network layers.
In any one of the example implementations above, the five neural network layers comprise a premise layer, a weighting layer, a normalization layer, a consequence layer, and an output layer.
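By way of a non-limiting illustration, a minimal Python sketch of how such a five-layer network may compute a forward pass is given below. The trapezoidal premise parameterization, product weighting, and first-order linear consequences are assumptions chosen for illustration and do not reproduce the particular trained controller described in this disclosure.

```python
import numpy as np

def trapmf(x, a, b, c, d):
    # Premise membership: trapezoid with feet at a, d and flat top on [b, c].
    return float(np.clip(min((x - a) / max(b - a, 1e-9),
                             (d - x) / max(d - c, 1e-9)), 0.0, 1.0))

def anfis_forward(x, premise_params, rule_index, consequence_params):
    """One forward pass through an assumed five-layer ANFIS.

    x                  : (n_inputs,) crisp input vector
    premise_params     : per input, a list of trapezoid (a, b, c, d) tuples
    rule_index         : (n_rules, n_inputs) membership-function index per rule
    consequence_params : (n_rules, n_inputs + 1) linear consequence coefficients
    """
    x = np.asarray(x, dtype=float)
    rule_index = np.asarray(rule_index)
    # Layer 1 (premise): fuzzify each input against its membership functions.
    mu = [np.array([trapmf(x[i], *p) for p in premise_params[i]])
          for i in range(len(x))]
    # Layer 2 (weighting): product t-norm gives each rule's firing strength.
    w = np.array([np.prod([mu[i][rule_index[r, i]] for i in range(len(x))])
                  for r in range(rule_index.shape[0])])
    # Layer 3 (normalization): normalize the firing strengths.
    w_bar = w / (np.sum(w) + 1e-12)
    # Layer 4 (consequence): first-order linear consequence for each rule.
    f = np.asarray(consequence_params, dtype=float) @ np.append(x, 1.0)
    # Layer 5 (output): weighted sum of the rule consequences.
    return float(np.sum(w_bar * f))
```

In such a sketch, the premise-layer trapezoid parameters are the quantities that would be tuned during training, consistent with the premise-layer tuning described in the following example implementation.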
In any one of the example implementations above, parameters of the input membership functions are tuned in the premise layer.
In any one of the example implementations above, the reinforcement training of the neuro-fuzzy logic controller is based on using the neuro-fuzzy logic controller as an actor.
In any one of the example implementations above, the reinforcement training of the neuro-fuzzy logic controller is based on back-propagation of errors representing a difference between an expected sensor signal and an actual sensor signal as a result of the actuator being actuated by the neuro-fuzzy logic controller.
In any one of the example implementations above, the reinforcement training is based on a Deep Deterministic Policy Gradient (DDPG) model.
In any one of the example implementations above, the neuro-fuzzy logic controller is installed in a skid-steer vehicle for navigational control of the skid-steer vehicle.
In some other examples, a control circuitry is disclosed that comprises the neuro-fuzzy logic controller of any one of the example implementations above and is configured to perform a method of any one of the example implementations above.
This patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee. For a more complete understanding of the invention, reference is made to the following description and accompanying drawings, in which:
Path tracking is a critical component for autonomous vehicles and machines operating in off-road environments. When designing path tracking controllers for such autonomous vehicles, considerations for position and velocity are paramount. In particular, for vehicles used for farming, work sites, and the like, the various off-road environments pose great challenges in the design of path tracking controllers.
In addition, controller design may vary greatly in control logic for vehicles of different steering principles. For example, for skid-steer vehicles with wheels that are fixed in orientation relative to the body of the vehicle and that are steered by controlled skidding, some traditional approaches to path tracking controllers may be problematic.
For example, a skid-steer vehicle, when represented as a unicycle model, has no A matrix while the B matrix directly maps the vehicle dynamics to the control action. Such a structure may be directly at odds with model-based approaches, such as LQR (Linear Quadratic Regulator), MPC (Model Predictive Control), and H-infinity. These model-based controllers use the vehicle's A matrix and control error signal to develop an optimal and/or robust target trajectory and a corresponding control action, which are not effective for a skid-steer vehicle.
For another example, traditional frequency domain control may not be effective for skid-steer vehicles. For frequency domain control on an ideal flat road, the stability and performance criteria can be guaranteed even though better performance may require intense system identification and tuning. In contrast, for off-road settings, the dynamics and thus stability of these controllers vary greatly with velocity and terrain for a given set of gains.
For another example, sliding mode controllers operate in a binary fashion by prescribing either a maximal or minimal control effort to drive the desired error state to a sliding manifold with a zero-error state. Sliding mode controllers have been demonstrated to work to an extent on skid-steer autonomous land vehicles, but the steady-state oscillations have been problematic.
For yet another example, learning-based controllers that create a multivariable, nonlinear, sensor input-control output mapping have been considered for autonomous vehicle control applications. However, vehicles can become unstable when presented with disturbances outside of their training space, such as unexpected changes to the vehicle dynamics, ground contact physics, or unforeseen sensor measurements. As such, such controllers may be unstable for off-road environments.
In the various embodiments of the disclosure below for path tracking, a waypoint navigation controller and corresponding controlling methods are described, where the controller functions as a multiple input-multiple output, e.g., nonlinear angular velocity and linear speed controller for a land vehicle such as a skid-steer vehicle. The controller and the controlling methods may be based on a fuzzy logic controller (alternatively referred to as a “fuzzy controller”). The membership functions of the fuzzy controller may employ a trapezoidal structure with a symmetric rule-base. In addition, a Hierarchical Rule-Base Reduction (HRBR) may be incorporated into the controller so as to select only the rules most influential on state errors by selecting inputs/outputs, determining the most globally influential inputs, and generating a hierarchy relating inputs via a Fuzzy Relations Control Strategy (FRCS). The resulting fuzzy controller covers an entire operating environment of the vehicle, but a rule for every possible combination of variables, states, and outputs is no longer necessary. As a result, the described fuzzy controller can increase both the number of inputs and their associated fidelity without its rule-base being dramatically increased.
Comparison of performance is made between the disclosed fuzzy logic controller and geometric controllers that determine an optimal control action based on manifolds defined by geometry-based constraints on the vehicle and its error state. Such controllers are chosen as a baseline because, for differential and skid-steer vehicles, the most common form of geometric control is pure pursuit. As a baseline for comparison, these geometric controllers are robust, so long as the target point is far enough away from the vehicle to account for the maximum system time delays and any discrepancies between the real-world vehicle dynamics and model dynamics.
An example vehicle that is generally operated under control via fuzzy logic is shown in
The additional disclosure below describes example implementations of the fuzzy controller 110 given a plurality of inputs for generating a plurality of outputs with hierarchically reduced symmetric rule-base, with trapezoid membership functions for input linguistic variables, and with a set of input control metrics including a waypoint distance.
In some example implementations of fuzzy logic control, crisp inputs z ∈ ℝ^n may be fed into the input linguistic variables I_n, which are then categorized into m input linguistic values A_{n,m} in a process referred to as fuzzification. Linguistic values describe their associated variable's performance with linguistic (and thus human-understandable/explainable) descriptors like fast and slow, near and far, and the like. Membership functions, μ_Z, quantify the degree to which a crisp input belongs to each linguistic value.
After fuzzification, output value membership is determined using IF-THEN rules. An example rule structure is presented below in Equation (1), with O_k being the output linguistic variables and B_{n,m} being the output linguistic values. How the AND, OR, and IF-THEN operations interact with the membership functions for the values in the antecedents and consequents varies with implementation.
IF I_1 is A_{1,2} AND/OR I_2 is A_{2,5} THEN O_1 is B_{1,1}   (1)
In some example implementations, a Mamdani type implementation using a product AND (t-norm) may be used for a fuzzy controller. Such a controller may have a plurality of control outputs. For example, for controlling a vehicle, the control output of the fuzzy controller may include two variables for controlling the vehicle's linear speed and angular velocity. To calculate these crisp outputs, Center of Mass (CoM) defuzzification may be used to generate quantified control signals as shown below in Equation (2). In Equation (2), n represents the number of membership functions, x_i represents the amount of control output for membership function i, and μ_C(x_i) represents the degree of membership in membership function i.
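For reference, the discrete center-of-mass computation described above, which Equation (2) is understood to express in terms of n, x_i, and μ_C(x_i), commonly takes the following form:

$$ u \;=\; \frac{\sum_{i=1}^{n} x_i\,\mu_C(x_i)}{\sum_{i=1}^{n} \mu_C(x_i)} $$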
In some example implementations, as illustrated in further detail below, the controller's rule-base may be designed to be symmetric with respect to certain output variables. For example, the controller may be assumed to act with the same magnitude but opposite gains while making right or left turns.
In addition, in some example implementations, the fuzzy controller may use trapezoidal input membership functions, as opposed to the traditional triangular or Gaussian membership functions, as illustrated in further detail below. Such a trapezoidal membership function generates control signals that resemble those exerted by a human operator. Furthermore, the use of trapezoidal membership functions reduces bang-bang behavior and improves overall system stability, as the flat regions of the trapezoids provide a margin of acceptable error in the input, especially around the zero-error region, without producing unnecessary oscillations in the output activation signal. Moreover, using a trapezoid allows some of the more desirable traits of a Gaussian function to be captured, thereby providing a Gaussian approximation with trapezoidal parameterization that dramatically improves computational efficiency.
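As a minimal illustration of the trapezoidal form discussed above, the following Python sketch evaluates a double-sided trapezoid that degenerates to a single-sided shoulder when the corresponding breakpoints coincide; the breakpoints in the usage example are hypothetical and are not the controller's tuned values.

```python
import numpy as np

def trapezoid_membership(x, a, b, c, d):
    """Trapezoidal membership: 0 outside [a, d], 1 on the flat top [b, c].

    The flat top supplies the margin of acceptable error described above,
    e.g. a 'zero error' value can remain fully active over a small error band.
    """
    if b > a:
        rise = (x - a) / (b - a)
    else:
        rise = 1.0 if x >= a else 0.0   # single-sided (left shoulder) case
    if d > c:
        fall = (d - x) / (d - c)
    else:
        fall = 1.0 if x <= d else 0.0   # single-sided (right shoulder) case
    return float(np.clip(min(rise, fall), 0.0, 1.0))

# Example: a symmetric 'zero' heading-error value that is fully active
# for small errors (hypothetical breakpoints, in radians).
print(trapezoid_membership(0.05, -0.4, -0.1, 0.1, 0.4))  # -> 1.0
```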
The example fuzzy controller disclosed herein may incorporate a relatively large number of input error functions (5 inputs, for example) while still keeping the rule-base fairly small (40 rules, for example) as compared to the potential hundreds of rules that would result from a standard fuzzy controller with the same linguistic variables and values. As described in further detail below, this level of fidelity is achieved through use of a Fuzzy Relations Control Strategy (FRCS).
For example, relevant controller linguistic variables, Fuzzy Relations Control Variables (FRCVs), and outputs may first be established. Then, the FRCS determines the most globally influential FRCVs by placing the FRCVs in a hierarchy of influence. This hierarchy may be used to divide the operating environment of the vehicle into distinct regions or spaces (or branches) of operation. Following that, the relations in the hierarchy/regions may be used to inform a selection of the rules most influential on state errors. This entire top-down process results in an HRBR.
As such, the HRBR represents a generic strategy for reducing the size of a fuzzy logic rule-base. The reduction of the rule-base follows directly from model and FRCV generation, with example steps illustrated in the data and logic flow 200 of
In some example implementations for generating tiers of control objectives and error functions in Step 202 of
In some example implementations, the hierarchy for the rules may follow from the conditions in which each control objective, and thus the associated error, is important. For the entire duration of the control effort, completing each of the path segments may be considered unconditionally important, particularly when a current segment is still far from being completed. When the vehicle is close to completing the current path segment, then aligning the relative error to the next path segment may become increasingly more important. As such, fuzzy values for distances close to the next path segment and far from the next path segment may be created. These conditions and associated fuzzy values may be used to determine the branches of the hierarchy through all control actions.
Example segmentations for the remainder of the controller are shown in Table 1. Fuzzy values for each of the FRCVs may be chosen heuristically given the symmetry constraint (see above), the zero constraint (from the flat portion of the output trapezoid membership functions), and the number of segmentations necessary for higher priority FRCVs. In the example of Table 1, the FRCVs or control metrics are: the distance from the vehicle to the target point (distErrTarget), the minimum distance from the vehicle to the current path segment (distErrLine), the angle between the vehicle's heading and the current path segment (θNear), the angle between the vehicle's heading and the next path segment (θLookahead), and the angle between the vehicle's heading and a waypoint F on the current path segment located between the vehicle's projection onto the current segment and the next waypoint (θFar). Error signal functions, represented by the FRCVs, are minimized when the vehicle is in a specific state, with the span of all error states corresponding to all possible vehicle positions and orientations.
These FRCVs are illustrated or can be identified in
An example hierarchical FRCS developed over these FRCVs is illustrated generally in Table 1 and specifically in Table 2. Table 1 shows a general design where the FRCVs are hierarchically considered in building the rule set. Specifically,
the FRCV of distErrTarget (distance to the target point of the current segment) is evaluated at a first hierarchy level as being either near or far. If it is near, then the next-level decision for controlling the vehicle (e.g., the speed adjustment and turning) would depend on how the vehicle is aligned with the target point of the next segment (θLookahead), regardless of distErrLine, θNear, and θFar. If distErrTarget is far, then the rule set may further depend on how far the vehicle is off the path of the current segment in distance (distErrLine). If distErrLine is not very far off the current path (zero, close, or near) with the distance to the target point of the current segment being still far, then the rule set may mainly target bringing the vehicle towards the target point in the current segment (θNear). Otherwise, if distErrLine is very far off the current path (far) and still far from the current target point (distErrTarget being far), then the goal may be to first bring the vehicle more aggressively towards the waypoint in the current path before the current target point (θFar). As such, Table 1 is populated with the priority consideration level (1 being the highest priority) of the FRCVs under the hierarchical branches. The FRCV combinations shown as blank cells would not show up in the rule set, thereby achieving a reduction of the number of rules.
Once a hierarchy is designed and completed, as shown in Table 1, the rule reduction may be performed by only including rules relevant to each state of a branch. An example is illustrated in Table 2 with hierarchical branches for the various example linguistic variables above. As a result of left-right symmetry, Table 2 only shows half of the rules and their hierarchy.
As indicated in Table 1 and the example hierarchy of Table 2, the distErrTarget metric, for example, may be associated with two linguistic variables, far and near, which partition the space at a first level into being near the target point or far away from it. Furthermore, the error (or metric) distErrLine partitions the subspace representing far from the target point at a second level into several sub-subspaces according to the minimum distance from the vehicle to the current target path, using various example linguistic variables of distErrLine. For example, the distErrLine error may be associated with seven linguistic variables to categorize how far away the vehicle is from the current target trajectory: far left, near left, close left, zero, close right, near right, and far right (Table 2 only shows the left half and zero sub-subspaces). The remaining example FRCVs (θNear), (θFar), and (θLookahead) may, for example, all use similar five linguistic variables to qualify the orientation of the vehicle: far left, close/near left, zero, close/near right, and far right.
As such, for the example hierarchical rule reduction of Table 2, the distance to the target point (or next segment) may be atop the hierarchy with two branches, a “near” branch and a “far” branch. For the “near” branch, in which the vehicle is close to the target point or the next path segment, the primary goal may be to align the vehicle's heading with the next path segment. As such, the rules for this branch may be selected as depending only on θLookahead, with other variables being ignored. Because θLookahead may be associated with, for example, five fuzzy values, only five example rules linking these five θLookahead values with the two output variables (steering action {dot over (θ)}R and speed control) may be necessary (Table 2 shows three rules for θLookahead values being “far left”, “close left”, and “zero”, but the full rule set in this branch would be five once the symmetric half of “far right” and “close right” for θLookahead is included).
In the example of Table 2, and in the sub-subspace of the “far left” linguistic variable (and “far right” linguistic variable, not shown) for the distErrLine error, it may be determined that because the vehicle is very far away from the current target path segment, the primary goal may be to get back on track, and thus, the controlling of the steering and the speed of the vehicle only needs to depend on linguistic variables associated with θFar, and the heading angles of the vehicle relative to the current segment (θNear) and next segment line (θLookahead) are not important for consideration. The rule sets in this sub-subspace may be reduced to only linking the five example linguistic variables of the θFar metric to the steering and speed control outputs (Table 2 only shows the left half and the zero variables).
Thus, by using the hierarchical space division as illustrated in Table 2, while every fuzzy value is still used, only a heuristically selected subset of all possible combinations of all values in a branch is used in the rule set, thereby achieving a reduction of the rules for the fuzzy logic.
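A rough procedural sketch of the branch selection implied by Table 1 is given below in Python; the function name, thresholds, and crisp branch test are hypothetical simplifications, since in the actual fuzzy controller the branches blend through the overlapping “near”/“far” membership functions rather than switching at hard boundaries.

```python
def select_rule_branch(dist_err_target, dist_err_line,
                       near_target_threshold, far_line_threshold):
    """Pick which FRCVs drive the active rules, mirroring the Table 1 hierarchy.

    Returns the names of the FRCVs whose rules would be consulted in the
    corresponding branch; the thresholds are illustrative placeholders.
    """
    if abs(dist_err_target) <= near_target_threshold:
        # Near the target point: only theta_lookahead matters.
        return ("theta_lookahead",)
    if abs(dist_err_line) >= far_line_threshold:
        # Far from the target and far off the current segment:
        # aggressively steer toward the waypoint F via theta_far.
        return ("theta_far",)
    # Far from the target but close to the segment: stabilize about the
    # path using theta_near together with dist_err_line.
    return ("theta_near", "dist_err_line")
```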
In the example implementation above, the pursuit back to the target path is achieved by targeting a fixed distance in front of the vehicle's projection onto the trajectory. The primary downside of this approach occurs when the projection onto the path is very close to the target, but the vehicle has drifted from the path. This scenario leads to the vehicle meeting its completion criteria when it is far away from
the target. If this occurs, the vehicle state would then be projected onto the next segment. This could result in an extremely large jump in the projected distance and potentially the skipping of a future waypoint altogether. In obstacle-riddled environments with a small number of navigable paths, obstacle avoidance protocols can lead to such an error cascade.
The example fuzzy controller above uses (θFar) when the positional error state is far away from the target trajectory and target point (when distErr Target is large). As shown above in relation to Table 2, the controller may be designed to orient the vehicle so as to minimize (θFar) and head towards the far point from trajectory target point (F in
In some example implementations, in order to have the controller approach the trajectory quickly, the value of k may be chosen to be close to 1.
This method of using a constant-speed approach to select F provides a harmonious solution that addresses several issues. Using such a method, the vehicle may be controlled to aggressively approach the waypoint when segment completion is imminent, while approaching from a more casual angle when the end of the segment is further away. This casual angle decreases the overall completion time for the path (or speeds it up), and in cases where the waypoints are far enough apart for the approach angle to be substantially shallow, the approach angle would typically be much less important than the path being tightly followed.
As described above for the implementation of Table 2, when distErrTarget is small and in the range of its “near” linguistic value, minimizing (θLookahead) may be used as the sole control objective. Similarly, minimizing (θFar) may be the sole control objective when distErrTarget is in its “far” range and distErrLine is in its “far left” range or “far right” range. As a result, both of the angle metrics have the same linguistic variables. In the example implementation of Table 2, they are also both mapped to or linked to very similar steering control output variables such that: far left is mapped to right 4, near left is mapped to right 2 or 1 respectively, zero is mapped to zero, near right is mapped to left 2 or 1 respectively, and far right is mapped to left 4. The control variables for steering include a change of steering direction and a change amount. For example, right 4 indicates changing steering to the right with an aggressive amount of 4. “Zero” represents no change (keep the current steering).
As described above for the implementation of Table 2, when the vehicle is close to the path but away from the target point, the control objective may be multifaceted. It may prioritize minimizing (θNear) while also driving and then maintaining distErrLine to/at zero. This task incorporates five linguistic variables from the distErrLine membership functions (near left, close left, zero, close right, and near right), and all five linguistic variables from the (θNear) membership functions (far left, near left, zero, near right, and far right). These are combined to make 25 example rules that stabilize the vehicle about its equilibrium point. When distErrLine is zero, the steering control output may minimize (θNear) as follows: far left is mapped to right 3, near left is mapped to right 1, zero is mapped to zero, near right is mapped to left 1, and far right is mapped to left 3.
In some example fuzzy systems, as described above, the output angular velocity setpoint that ranges from −ωmax to ωmax may be designed to be associated with, for example, nine potential linguistic variables. Those linguistic variables are left 4, left 3, left 2, left 1, zero, right 1, right 2, right 3, and right 4. The output linear speed setpoint may be configured to range from 0 m/s to 2 m/s. Linear speed may be associated with the linguistic values slow, medium, and fast. In the rule-base, the slow output may be assigned to rules where the angular velocity is right/left 4, medium may be assigned to right/left 2 and 3, and fast may be assigned to right/left 1 and zero.
Example membership functions for the various input linguistic variables above and the output variables are described in further detail below. In some example
implementations, the membership functions of the various input linguistic variables for the fuzzy logic may take the form of a trapezoid, whereas the membership functions of the output variables may take the form of a triangle.
Example membership functions for the input linguistic variables (close and far) for distErr Target are illustrated in Table 3 and in
In some example implementations, unlike the membership functions associated with the input linguistic variables, the membership functions of the linguistic variables for the steering control output and speed control output may be constructed as triangular. In some example implementations, the triangles may have a same area among the linguistic variables within each of the outputs. Example membership functions
for linguistic variables (left 4, left 3, left 2, left 1, zero, right 1, right 2, right 3, and right 4) of the steering control output are illustrated in Table 8 and
The defuzzification process, which combines these control signals, may use
the CoM approach discussed previously. As each of the output membership functions is a triangle with the same area, defuzzification involves taking a weighted average of the peak values of the triangles, with the weights being the percentage that the associated rules are active. The normalized input membership functions are as shown above in the examples of
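A minimal sketch of this simplified CoM computation, assuming equal-area triangular output membership functions as described above, is shown below; the numerical peaks in the usage example are hypothetical.

```python
import numpy as np

def defuzzify_equal_area_triangles(rule_activations, triangle_peaks):
    """CoM defuzzification when all output membership functions are triangles
    of equal area: the crisp output reduces to a weighted average of the
    triangle peak locations, weighted by how strongly each rule fires.

    rule_activations : (n_rules,) firing strengths in [0, 1]
    triangle_peaks   : (n_rules,) peak location of the output value each
                       rule maps to (e.g. angular velocity setpoints)
    """
    w = np.asarray(rule_activations, dtype=float)
    peaks = np.asarray(triangle_peaks, dtype=float)
    return float(np.sum(w * peaks) / (np.sum(w) + 1e-12))

# Hypothetical example: three active rules pointing at 'left 1', 'zero',
# and 'right 1' angular-velocity peaks (rad/s).
print(defuzzify_equal_area_triangles([0.2, 0.7, 0.1], [-1.0, 0.0, 1.0]))
```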
The generation of actuation signals for achieving the various steering control and speed control outputs generated by the fuzzy logic controller may be based on a physical model of the vehicle and its interaction with the ground. The disclosure below provides an example physical model of a vehicle. While the model is provided in the context of a skid-steer type of vehicle, the underlying methodology and general principles apply to other types of vehicles with modification and/or adaptation.
In some example implementations, five interconnected bodies are used to model or represent a vehicle (a skid-steer vehicle, in particular). Each of these bodies may be associated with a reference frame. For example, the five bodies may include a main body and four wheels. The main body reference frame F1, and each of the wheel reference frames F2-F5, are illustrated in
The spatial velocity vector for the main body F1 may be given by Equation (7) below. This vector may include an angular velocity and a translational velocity represented by w and v respectively.
Equation (8) represents the velocity of each of the wheels. The motion transformation from F1 to Fi is given by iX1, the subspace matrix of each wheel is denoted by Si, and the angular velocity is denoted by {dot over (q)}i.
The inertia matrix of body i at the body's center of mass (CoM) is defined by Ii. The inertia matrix for the main body may be given by Equation (9) and for the wheels by Equation (10), where a, b, and c represent the dimensions of the main body in x, y, and z, respectively. Similarly, in Equation (10), 2r, w, and 2r represent the dimensions of the wheels in x, y, and z (r represents the radius of the wheels, and w represents the wheels' width).
For bodies n=1, 2, . . . , 5, mn represents the mass of the corresponding body. Likewise, the COM location for each body, expressed in body coordinates is given by cn. Combining the above, the generalized version of the parallel axis theorem for spatial inertia is shown in Equation (11) below.
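As one concrete reading of the spatial-algebra form of the parallel axis theorem referenced as Equation (11), the sketch below assembles a 6x6 spatial inertia from a body's mass, rotational inertia about its CoM, and CoM offset; this follows the commonly used spatial-algebra convention (as in spatial_v2) and is offered only as an illustrative assumption about Equation (11).

```python
import numpy as np

def skew(c):
    # 3x3 cross-product (skew-symmetric) matrix of vector c.
    return np.array([[0.0, -c[2], c[1]],
                     [c[2], 0.0, -c[0]],
                     [-c[1], c[0], 0.0]])

def spatial_inertia(mass, inertia_com, com_offset):
    """6x6 spatial inertia about the body frame from CoM quantities.

    inertia_com : 3x3 rotational inertia about the CoM (body coordinates)
    com_offset  : CoM location c expressed in body coordinates
    """
    C = skew(np.asarray(com_offset, dtype=float))
    I = np.zeros((6, 6))
    I[:3, :3] = np.asarray(inertia_com, dtype=float) + mass * C @ C.T
    I[:3, 3:] = mass * C
    I[3:, :3] = mass * C.T
    I[3:, 3:] = mass * np.eye(3)
    return I
```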
Relative to the main body, Equation (12) below represents an apparent inertia of any wheel to the main body:
The total spatial inertia of the main body may then be derived as in Equation (13) below.
In addition, Equation (14) below may be used to determine a force due to the velocity-product.
The external spatial force on the main body can be evaluated using Equation (15). The symbol {circumflex over (k)}1 represents the unit vector parallel to the z1-axis. The force due to gravity on the main body is represented by f1grav. The symbol fiw1 represents the reaction force on each wheel.
The inertial acceleration of the main body may be given by Equation (16), where 1Xi* represents the force transformation from Fi to F1.
The angular acceleration of each wheel, {umlaut over (q)}i, may be determined using Equation (17). Here, the applied torque is given by τi and di=iX1a1+vi×Si{dot over (q)}i.
In the example model described above, the vehicle is treated as a rigid body that interacts with a compliant ground. This compliant ground can be modeled as a uniform distribution of an infinite number of non-linear spring-damper pairs. Further, these rigid body-ground interactions may be represented as a set of discrete contact points, each of which causes the ground to deflect spherically.
The relative modulus of elasticity between the wheel(s) and the ground, denoted by E*, may be computed using Equation (18). In Equation (18), the wheel(s) and ground may be associated with moduli of elasticity of Ew and Eg, respectively, and with Poisson ratios given by vw and vg, respectively.
Stiffness and damping coefficients may be defined as shown in Equation (19) below, where r represents the radius of the sphere and α represents a constant.
The normal force from the ground Nk may be calculated using Equation (20) at some point k. In Equation (20), δk represents a penetration distance, {dot over (δ)}k represents a penetration velocity, K represents a surface stiffness coefficient, and D represents the surface damping coefficient. It is assumed that δk<0, i.e., penetration is considered as being into the ground.
Correspondingly, the slipping force may be given by (21), where μ represents the coefficient of friction.
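The sketch below illustrates, under stated assumptions, the relative modulus of Equation (18) combined with a simple spring-damper normal force and the Coulomb slipping limit of Equation (21); the exact functional form of Equation (20) (e.g., Hertzian exponents or velocity coupling) may differ, so the normal-force expression here is an assumption for illustration only.

```python
def relative_modulus(E_wheel, nu_wheel, E_ground, nu_ground):
    # Relative (effective) modulus of elasticity between wheel and ground,
    # the standard Hertzian combination assumed here for Equation (18).
    return 1.0 / ((1.0 - nu_wheel**2) / E_wheel +
                  (1.0 - nu_ground**2) / E_ground)

def contact_forces(delta, delta_dot, K, D, mu):
    """Normal force and limiting slip force at one contact point.

    A plain linear spring-damper is assumed for the normal force; penetration
    is taken as delta < 0 (into the ground), matching the convention above.
    """
    if delta >= 0.0:
        return 0.0, 0.0                      # no contact, no forces
    normal = max(0.0, -K * delta - D * delta_dot)
    slip_limit = mu * normal                 # Coulomb slipping force, Eq. (21)
    return normal, slip_limit
```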
The stick component of friction between the wheel(s) and ground may be further evaluated using Equation (22). The symbol u may be considered as representing the tangential deformation of the ground at the contact point, and Vsph may be considered as representing the tangential velocity of the bottom point of the sphere.
Given the example representations above, the friction force may be formulated using (23).
The example modeling above provides a connection between a desired translational speed/acceleration of the main body, the rotational speed/angular acceleration of the wheels in slipping or sticking frictional interaction with the ground, and the drive torques. As such, the desired control output linguistic variables from the fuzzy controller above may then be mapped to actuation signals that can generate torques and other controlling actuation to achieve desired steering angles and speeds. Such mapping may be implemented in the actuation signal generator 111 of
In order to show the stability of the fuzzy controller or controlling methods described above, an equivalent controller as shown in
The e(t) and ė(t) terms can be transformed into:
where e1 is the representative singular input error generated by the input error element composing e(t), with little concern as to the range of resulting values or how those values would be calculated. A similar simplification may be provided for e2 and u1, with u1 being the output trajectory of the vehicle. This output trajectory may be derived by combining the rule-base's two outputs of angular velocity and linear speed. Consequently, each antecedent and consequent combination of linguistic values for each of the rules described above becomes its own linguistic value. For example, if distErrTarget is far, distErrLine is far left, and θFar is close left, Angular Velocity would be set to right 1 and Linear Speed would be set to fast. Here, e1 would have the linguistic value TF, LFL, θfCL (Target Far, Line Far Left, θFar Close Left), whose level of membership would be determined by taking the t-norm of the level of membership of distErrTarget in Far, distErrLine in Far Left, and θFar in Close Left.
Theorem 1 and Definitions 1-3 below, used for proving the stability, are provided as follows:
A sufficient condition for asymptotic stability of the fuzzy control closed-loop is the input-output passivity of the system itself.
For any fuzzy controller with two inputs and one output, if the input-output non-linear mapping can be described by a continuous bounded Lipschitz function Φ(·,·) with the following properties, with the fuzzy controller referred to as an SFC:
Since every input that e2(k) could take has membership 1 in Dummy, Φ(e1,e2)=Φ(e1,0) and Φ(0,e2)=0. As a result, the property 5 above simplifies to:
Taking e2=0 in Equation (28) further results in
which is true for ∀λ′≥um/|e1|, ∀γ′>0, and for every (e1, e2).
For example, θNear has membership functions μn(xn), θFar has membership functions μf(xf), θLookahead has membership functions μl(xl), distErrLine has membership functions μL(xL), and distErrTarget has membership functions μT(xT), such that xn, xf, xl ∈ [−π, π] and xT, xL ∈ ℝ. Since the controller possesses the Mamdani structure, the Aggregation operation on the output membership functions for each of the rules is a maximum, and the t-norm between input membership functions for a given rule is a minimum.
For rules i=1-10, θFar, distErrLine, and distErrTarget determine the membership in the output as follows:
with the output membership for this set of rules determined by
For rules i=11-35, θNear, distErrLine, and distErrTarget determine the membership in the output as follows:
with the output membership for this set of rules determined by
For rules i=36-40, θLookahead and distErrTarget determine the membership in the output as follows:
with the output membership for this set of rules determined by
Consider a continuous system in the state-space form as
where f(·, ·) and h(·, ·) are smooth functions.
System Equation (32), with a properly chosen output Equation (33), is referred to as being passive with respect to the supply rate s(u,y)=uTy ∈ ℝ, if there exists a positive definite function V with V(0)=0, regarded as the storage function, such that the following inequality is satisfied for all x(t0):
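For completeness, the standard continuous-time storage-function inequality, which the passivity condition above is understood to take, may be written as:

$$ V\big(x(t_1)\big) - V\big(x(t_0)\big) \;\le\; \int_{t_0}^{t_1} s\big(u(\tau), y(\tau)\big)\, d\tau, \qquad \forall\, t_1 \ge t_0 . $$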
Then, the fuzzy controller can be considered as a single-input single-output (SISO) non-linear system with internal dynamics, where e2 is the input, u1 is the output, and e1 is the state variable.
Applying [the above] definition of passivity results in:
As such, the sufficient condition for asymptotic stability of the continuous fuzzy control closed-loop of Theorem 1 is met.
Including delays, the state-space form becomes:
where Tn is some delay s.t. 0≤Tn<c for c ∈ ℝ+. The supply rate from Definition 2 remains s(u,y)=uTy. However, u and y change in accordance with Equations (34) and (35), respectively. As a result,
For example, there may then be 4 possibilities at any time t. Suppose that min(A1, A2, A3, A4)=A1. Then,
Consider a discrete system in the state-space form as
where f(·, ·) and h(·, ·) are smooth functions.
System Equation (36), with a properly chosen output in Equation (37), is said to be passive with respect to the supply rate s(u(k), y(k))=u(k)Ty(k) ∈ ℝ (k≥0), if there exists a positive definite function V with V(0)=0, regarded as the storage function, such that the following inequality is satisfied for all x(0) and ∀k ∈ ℤ+={0, 1, 2, . . . }.
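In the discrete-time case, the corresponding standard storage-function inequality may be written as:

$$ V\big(x(k)\big) - V\big(x(0)\big) \;\le\; \sum_{j=0}^{k-1} s\big(u(j), y(j)\big), \qquad \forall\, k \in \mathbb{Z}^{+} . $$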
Applying the definition of passivity in the discrete case we get:
As such, the sufficient condition for asymptotic stability of the discrete fuzzy control closed-loop of Theorem 1 is met.
Including delays in the discrete system, the state space form becomes
where Tm represents some delay s.t. 0≤Tm<d for d ∈ ℤ+. The supply rate from Definition 3 remains s(u, y)=u(k)Ty(k). However, u and y change in accordance with Equations (38) and (39), respectively. As a result:
An example methodology for validating controller performance is used in the disclosure below for both the fuzzy and pure pursuit controllers. The approach validates controller performance by testing the controllers under a set of path conditions that emphasize the effects of disturbance rejection, phase lag, overshoot, and the like.
Example test courses are used. No figures in this section show blank versions of these test courses; however, each course is shown in later sections with plots of the vehicle performance overlaid.
Test Course 1 includes a single left turn with the waypoints laid out in an L shape. The purpose of having such a sharp turn is to examine a simple yet common layout-related steering disturbance a controller might experience when moving around a building or other human structures.
The next path, Test Course 2, incorporates above-minimum-radius turns that are still relatively sharp. These above-minimum-radius turns allow for a more accurate assessment of RMSE, as a vehicle that cannot turn in place can still have zero error. Accordingly, straightaways are paired with these above-minimum-radius turns to evaluate overshoot. This distinction is more important than it would initially seem, as squaring the error term amplifies the errors associated with overshooting. Test Course 2 also has both right-handed and left-handed turns, thus ensuring that the vehicle operates identically in both directions.
Test Course 3 is a figure-eight-like path. The at-or-below-minimum turn radii of the circles are used to evaluate the ability of a controller to accommodate the associated steering disturbances as reflected in the Maximum Error (ME). Meanwhile, the curvature of this design is useful in evaluating path phase lag about the curves, which could lead to distance error and thus higher Root Mean Squared Error (RMSE).
Test Course 4 incorporates both oscillatory turning that invokes phase lag similar to Test Course 3 and the above-minimum-radius turns paired with straightaways of Test Course 2. Thus, the course allows for a more holistic examination of the controllers given the factors associated with RMSE and ME discussed above.
On each course, the fuzzy controller's performance is compared to that of an example classical pure pursuit algorithm. This choice is made because pure pursuit is one of the most commonly used waypoint navigation control algorithms and thus can be used as an ideal baseline controller. For that reason, the controller used is the default MATLAB pure pursuit control block.
The geometric structure of the controller is presented in
The lookahead distance is chosen to be 0.5 meters by sweeping through potential lookahead values while converging to a straight line with an initial offset. The results of this experiment are run in simulation and can be seen below in
In test cases like the one presented in
Another reason for using slightly smaller lookahead values may be that they have slightly smaller √{square root over (mse)} values, with 0.40 m having had a value of 0.3126 m and 0.45 m having had a value of 0.3113 m. However, on the actual test courses, these controllers perform worse than the controller with a lookahead distance of 0.5 m. This may be because the settling distance and overshoot become much more important metrics than rise distance when the vehicle is initialized on the path. If the vehicle is expected to regularly converge to paths from far away, the benefits from these controllers may outweigh the costs, but this is an edge case in most day-to-day operations.
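For context, the baseline referenced above is the classical pure pursuit law; a minimal sketch of that law (not the internals of the MATLAB block) is given below, with the lookahead point assumed to be supplied by a separate path-lookup step.

```python
import numpy as np

def pure_pursuit_angular_velocity(pose_xy, heading, lookahead_point, linear_speed):
    """Classical pure pursuit for a unicycle/skid-steer model.

    Steers along the circular arc passing through the vehicle and a lookahead
    point on the path (e.g., 0.5 m ahead, per the tuning discussed above).
    """
    dx = lookahead_point[0] - pose_xy[0]
    dy = lookahead_point[1] - pose_xy[1]
    # Angle from the vehicle's heading to the lookahead point.
    alpha = np.arctan2(dy, dx) - heading
    lookahead_dist = np.hypot(dx, dy)
    curvature = 2.0 * np.sin(alpha) / max(lookahead_dist, 1e-6)
    return linear_speed * curvature          # commanded angular velocity (rad/s)
```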
The metrics used to compare the controllers are the square root of the mean squared distance error with respect to the target trajectory, the maximum distance error with respect to the target trajectory, and the time required to complete the courses. The controller tuning is done by hand and with the aid of automation scripts.
The experimental results are acquired by running a Clearpath Jackal on a lightly worn concrete parking lot. An instance of the Robot Operating System (ROS) runs on the Clearpath Jackal. Using ROS allows for both sensor data to be sent to and commands to be received from an external laptop. To enable such communication/control, the laptop runs MATLAB, the MATLAB ROS Toolbox, the MATLAB Fuzzy Logic Toolbox, and Simulink. In Simulink, a subscriber block subscribes to the position and orientation inputs from the ROS topic ‘/odometry/filtered.’ Next, these inputs are converted into vehicle states and a target trajectory. Those are then fed into the fuzzy controller. The controller proceeds to determine the angular velocity setpoint. Both the predefined linear and controlled angular velocity setpoints are then published to the ROS topic ‘/cmd_vel’ using a publish block. At the same time, the x position, y position, angular velocity, and distErrLine are saved to a
matrix in MATLAB.
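An analogous data path can be expressed outside of Simulink; the Python/rospy sketch below mirrors the subscribe-control-publish loop described above, with fuzzy_controller() standing in as a hypothetical placeholder for the actual fuzzy logic.

```python
#!/usr/bin/env python
# Minimal Python/ROS analogue of the Simulink pipeline described above:
# subscribe to filtered odometry, run a controller, publish velocity setpoints.
import math
import rospy
from nav_msgs.msg import Odometry
from geometry_msgs.msg import Twist

def fuzzy_controller(x, y, yaw):
    # Placeholder: return (linear_speed, angular_velocity) setpoints.
    return 2.0, 0.0

def odom_callback(msg, pub):
    p = msg.pose.pose.position
    q = msg.pose.pose.orientation
    # Yaw from the odometry quaternion.
    yaw = math.atan2(2.0 * (q.w * q.z + q.x * q.y),
                     1.0 - 2.0 * (q.y * q.y + q.z * q.z))
    v, w = fuzzy_controller(p.x, p.y, yaw)
    cmd = Twist()
    cmd.linear.x = v
    cmd.angular.z = w
    pub.publish(cmd)

if __name__ == "__main__":
    rospy.init_node("fuzzy_waypoint_controller")
    cmd_pub = rospy.Publisher("/cmd_vel", Twist, queue_size=1)
    rospy.Subscriber("/odometry/filtered", Odometry, odom_callback,
                     callback_args=cmd_pub)
    rospy.spin()
```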
The same data are saved in the simulation where results are acquired using the skid-steer vehicle dynamic model presented above. Accordingly, an example Spatial_v2 toolbox allows for implementation of the dynamic model. Table 10 below shows example vehicle model parameters for the Clearpath Jackal used in the experiments above.
Further, the Simulink portion of the model in the simulation is divided into four major components: the forward dynamics solver, the ground contact model, an external vehicle controller, and an internal vehicle controller.
The forward dynamics solver is used to apply forces/torques to update the vehicle's position and velocity. The ground contact model is used to determine how the ground applies forces and torques back to the vehicle.
The external vehicle controller functions much the same as its experimental equivalent. It receives the vehicle position, orientation, and target path and outputs the target linear and angular velocities in order to minimize error with respect to the target trajectory. The internal vehicle controller accepts these velocity targets as inputs and translates them into wheel torques.
It is thus assumed that the control processes consist of a high-level and a low-level controller. The high-level controller receives the linear speed and angular velocity control setpoints and transforms them into lower-level actuator setpoints. The low-level controller instructs the actuators to hit those setpoints. This low-level control is chosen to be a PID controller with an integral windup saturation limit. Related parameters are defined as: Kp=4, Ki=500, Kd=−0.002, Wheel Velocity Saturation=2.2 m/s, Torque Saturation=7 Nm, and Realistic Linear Velocity Factor=0.9225. Here, the “Realistic Linear Velocity Factor” is used to match the linear velocity the vehicle would actually achieve, given the speed it is commanded to achieve. Furthermore, for ease of implementation, each wheel is modeled as having an associated motor, despite the actual vehicle having only one motor for the left two wheels and one motor for the right two wheels.
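A minimal sketch of such a low-level loop is given below; the gains and torque saturation are the values reported above, while the anti-windup clamp strategy and class structure are hypothetical simplifications for illustration.

```python
class WheelVelocityPID:
    """Low-level wheel-velocity PID with integral anti-windup and torque limit.

    Gains and torque saturation follow the values reported in the text
    (Kp=4, Ki=500, Kd=-0.002, torque saturation 7 Nm); the loop structure
    itself is a generic sketch rather than the actual implementation.
    """
    def __init__(self, kp=4.0, ki=500.0, kd=-0.002, torque_limit=7.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.torque_limit = torque_limit
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measured, dt):
        error = setpoint - measured
        # Anti-windup: clamp the integral state so its contribution alone
        # cannot exceed the torque limit (assumed clamp strategy).
        i_max = self.torque_limit / abs(self.ki)
        self.integral = max(-i_max, min(i_max, self.integral + error * dt))
        derivative = (error - self.prev_error) / dt if dt > 0 else 0.0
        self.prev_error = error
        torque = (self.kp * error + self.ki * self.integral
                  + self.kd * derivative)
        # Saturate to the actuator's torque limit.
        return max(-self.torque_limit, min(self.torque_limit, torque))
```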
Additionally, an initialization file is used to create an environment for the vehicle to interact with, which consists of the ground geometry and the ground contact coefficients K=1000000, D=1000, and μ=0.85. Next, it creates a 6 DOF floating base parent link with the physical characteristics of the base of the vehicle. Then, it creates wheel models with the physical characteristics of the wheels and links them to the base with a 1 degree-of-freedom rotation link. After that, the file defines the contact point locations of the base, which are the corners, and of the wheels, which are 32 points evenly spaced about the circumference of the wheels. Lastly, the vehicle and wheel initial positions, orientations, and velocities are defined.
To verify the proposed dynamic model and tune any inaccurate parameters, a high fidelity motion capture system with 7 motion capture cameras located in the University of Illinois Intelligent Robotics Laboratory is used. These cameras are designed to track the wavelength of light reflected by the silver balls attached to the vehicle. This is done by performing a least-squares regression/triangulation of the position of each of the individual balls, which allows the system to calculate the position of the balls with 1 mm accuracy. The proposed ball configuration is thus deemed sufficient to accurately measure the position/orientation data through differentiation and to develop a polynomial fit to map the wheel velocity setpoint allocation.
The communication delay of the Jackal is also incorporated. To measure the communication time delay, a discontinuity is generated between the vehicle's zero angular velocity and a commanded nonzero angular velocity. The time between the Jackal's localization package recognizing the command and the angular velocity of the wheels changing is then measured. Across several trials, this value averages out to 0.068 seconds. However, since the simulation is not real-time, the delay is increased to 0.075 seconds to reduce time discrepancies between it and the experiment. Due to the uncertainty of the Course Completion Time (CCT) of the simulation, experimental and simulation CCTs were calculated by dividing the total distance traveled by the overall Cartesian velocity.
For both the simulation and the experiment, the Clearpath Jackal runs at an angular velocity set-point ranging between −4 rad/s and 4 rad/s across all three courses. However, the constant linear speed is 2 m/s for the first to fourth courses. This gives the vehicle a theoretical minimum turn radius of 0.5 m.
The control efforts, where differences are most visible, are similar for both the simulation and the experiment. As such, the experimental linear and angular control effort plots are presented for each of the test courses. For a similar reason, only the experimental path plots are presented below. Additionally, a tabulated set of results for all test courses can be found in Table 11 and Table 12. For brevity, exact RMSEs, MEs, and CCTs are not shown, as the Percent Change (PC) is most relevant when comparing controllers. The units for RMSE and Max Error are meters. The unit for Time is seconds.
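As a minimal sketch of how the comparison metrics above may be computed, the Percent Change between a candidate controller and a baseline, and the Course Completion Time obtained by dividing the total distance traveled by the overall Cartesian velocity, could be expressed as follows; the function names and example numbers are illustrative assumptions.

import numpy as np

def percent_change(candidate, baseline):
    # PC of a metric (e.g., RMSE, ME, or CCT) relative to a baseline controller;
    # negative values indicate an improvement over the baseline.
    return 100.0 * (candidate - baseline) / baseline

def course_completion_time(xy, v_overall):
    # CCT = total distance traveled divided by the overall Cartesian velocity.
    # xy: (N, 2) array of Cartesian positions along the run; v_overall in m/s.
    total_distance = np.sum(np.linalg.norm(np.diff(xy, axis=0), axis=1))
    return total_distance / v_overall

# Example (illustrative numbers): an RMSE of 0.032 m against a baseline RMSE of
# 0.1 m gives percent_change(0.032, 0.1) = -68.0, i.e., a 68% reduction.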
Test course 1, as shown in
The related control efforts can be seen in
In terms of metrics, the VLSF controller handily outperforms the pure pursuit. In simulation, percent changes are −67.9446% for RMSE, −59.8822% for ME, and 1.40064% for CCT. The experimental values are all larger in magnitude, with an RMSE PC of −77.2189%, an ME PC of −83.1898%, and a CCT PC of −2.21155%.
A similar increase from simulation to experiment is also observed against the constant linear speed fuzzy controller. In simulation, percent changes are −60.5682% for RMSE, −53.3494% for ME, and 3.87941% for CCT. The experimental values are an RMSE PC of −70.8885%, an ME PC of −79.8319%, and a CCT PC of −0.3791%.
Test Course 2, as shown in
The associated control efforts are illustrated in
Numerically, the controller yields simulated results of an RMSE PC of −20.107% and an ME PC of 0%, with a marginal increase in CCT at 2.1470%. Experimentally, the RMSE PC is larger at −77.6865%. Likewise, the ME PC is −74.2644%, and the CCT PC yields a larger increase than on other courses at 7.59544%.
Similar trends are seen against the CLSF. In simulation, percent changes are −10.3766% for RMSE, 0% for ME, and 3.4735% for CCT. Experimentally, the RMSE PC is −70.8799%, the ME PC is −68.6296%, and the CCT PC is 11.242%.
Test Course 3, as shown in
The associated control efforts can be seen in
In terms of metrics, the VLSF fuzzy controller again has a strong showing in simulation against the pure pursuit, with an RMSE PC of −67.5047% and an ME PC of −48.7310%, with a CCT PC of 4.02423%. The experimental results are quite a bit larger than the simulation results, with an RMSE PC of −91.6509% and an ME PC of −90.41%. The CCT PC is N/A, as the experimental pure pursuit is stopped because it appeared to be in a loop and thus would be unable to complete the course in a timely manner.
In simulation against the CLSF controller, percent changes are −57.8502% for RMSE, −44.9476% for ME, and 5.97066% for CCT. The experimental errors are smaller, with an RMSE PC of −40.2796%, an ME PC of −15.4985%, and a CCT PC of 9.30635%.
Test Course 4, as shown in
The associated control efforts can be seen in
In terms of metrics, the constant linear speed fuzzy controller again has a strong showing in simulation, with an RMSE PC of −77.6632% and an ME PC of −58.4265%, while the CCT PC is relatively small at 1.24983%. The experimental results yield a smaller RMSE PC at −18.313% than the simulation result. The ME PC is still favorable yet a bit smaller than the simulation at −48.5353%, and the CCT PC of 4.70973% remains at a similar small positive value.
The variable linear speed fuzzy controller similarly outperforms the constant linear speed fuzzy controller. In simulation, percent changes are −60.7646% for RMSE, −20.0344% for ME, and 3.41368% for CCT. The experimental values are still commendable, with an RMSE PC of −38.9212%, an ME PC of −36.3311%, and a CCT PC of 5.60678%.
For the pure pursuit controller, the phase lag appears to compound linearly on top of the overshoot as compared to the fuzzy controllers. There is a similar degree of overshoot observed between the pure pursuit and CLSF. Although, as seen in
A general navigational controller may be configured to generate control signals to a plurality of actuators based on environmental input from a plurality of sensors. Such navigational controllers may be used in applications including but not limited to indoor/outdoor robotics, on-road autonomous vehicles, as well as off-road or worksite autonomous vehicles/machines. For these applications, timely response to environmental variables such as positions, speed, road/site conditions and the like is critical.
Controller design may vary greatly in control logic for vehicles/machines operated/steered under different principles, with different response timescales, and in different environments. For example, controller design for skid-steer vehicles with wheels that are fixed in orientation relative to the body of the vehicle and that are steered by controlling skids, as described above, may need to be drastically different from controllers for traditional autonomous directional-steer vehicles.
A navigational controller may be generally viewed as including circuitry that contains hardware, software, firmware, and the like, and combinations thereof, for processing a set of dynamic input signals to generate a set of control signals that are used to drive the various actuators associated with, e.g., steering columns, accelerators, brakes, and the like. A navigational controller may be designed as a Machine Learning (ML) controller system embodied as, for example, one or more neural networks (NNs). Each of such NNs may include several layers of neurons having specific connectivity with specific weights, biases, and other parameters. Such NNs are trained to determine a set of NN parameters using iterative parameter adjustment via error calculations and back-propagation based on training datasets and deep learning techniques. However, these types of ML systems have a critical disadvantage: they are non-explainable. Specifically, the various neural network layers and parameters correspond to features that are extracted through deep learning and represent patterns that are hidden and are generally not human-interpretable. The decision of the neural network in response to a set of inputs is thus generally not human-explainable. These types of controllers thus represent black boxes that connect a set of inputs and a plurality of control outputs. Because the parameters in such systems are non-explainable, their adjustment and improvement must go through a cumbersome and time-consuming training/retraining process.
In some controller applications, it may be desirable to avoid the black box approach to controller design. In other words, it may be desirable that the controller include components/parameters that are explainable and human-interpretable such that they can be easily modified and improved upon. Fuzzy logic controllers (alternatively referred to as fuzzy controllers) are one example type of controller that is explainable. A fuzzy controller may be embedded with explainable and interpretable linguistic values, parameters, and rule sets. It can be explained why and how a fuzzy logic controller makes its decisions. Humans can easily understand the system and its decisions and thus can modify the various parameters to improve the controller.
In some example implementations, a fuzzy logic controller may be implemented as a neural network with well-defined layers such that some of the parameters (such as the membership functions of the fuzzy logic) can be optimized by training and yet remain interpretable and explainable by humans. Such a combination of the fuzzy logic approach and the neural network approach to controller design, referred to as an Adaptive Neuro-Fuzzy Inference System (ANFIS), is both trainable and explainable.
In the example ANFIS systems further disclosed below, a fuzzy control logic is first generated with a reduced rule-base by hierarchically retaining critical rules and removing unimportant rules. The fuzzy logic with reduced rules is then embodied in a neural network. The membership functions of the fuzzy logic in each input domain are modeled as parameterized trapezoidal functions. The membership functions in each input domain are modeled jointly, thereby further reducing the number of membership function parameters. The membership parameters, as part of the model parameters in the neural network, are further reduced by taking advantage of a symmetry property of the membership functions of the fuzzy logic. The resulting ANFIS is trained via reinforcement training using the ANFIS as an actor, thus providing a training process that is adapted to a particular type of vehicle/machine and to a particular operating environment. The training process is significantly simplified and streamlined as a result of the hierarchical rule-base reduction and the additional parameter reduction based on membership function symmetry.
Fuzzy Logic Controllers with Hierarchical Rule-Base Reduction
A fuzzy logic replicates the human decision-making methodology, and it deals with uncertainty and vagueness in the given information. Using fuzzy logic, a system is able to make a decision based on degrees of output. For example, a computer cannot express how delicious a food is when the information is given using a numerical value. If the information, however, is provided to the fuzzy system, it is able to determine the degree of taste of the food based on the information with a numerical value between 0 and 1. This fuzzy logic reduces uncertainty for a computer when making a decision so that it can make a choice for a given situation/fact just like a human operator. A fuzzy logic system may alternatively be referred to as a fuzzy inference system.
Such fuzzy logic may be applied in a navigational controller. Given the circumstances, for example, a fuzzy controller can determine how much it should turn to control a maneuver of the vehicle, so that the fuzzy controller decreases the uncertainty of the steering decision for the vehicle. This shows one of the positive characteristics of a fuzzy system: applicability to nonlinear systems with uncertain models. Another characteristic of the fuzzy system is that it uses a linguistic, common-sense rule base, which makes it human-interpretable and explainable.
An example fuzzy logic may include five functional parts: rule base, database, fuzzification, decision-making, and defuzzification, as shown in
As described above, the rule-set or rule-base of a fuzzy logic system or fuzzy inference system may be established involving linguistic variables and their linguistic values. The rule set or rule base may include a plurality of if-then rules. Fuzzy if-then rules may be expressed in the form of: IF x is A, THEN y is B.
The notations A and B are labels of fuzzy sets attributed to logic decisions of specific membership functions. For example: IF rain is heavy AND speedlimit is high, THEN velocity is low.
In this example, the entities referred to as rain, speedlimit, and velocity are linguistic variables or descriptors, whereas heavy, high, and low are linguistic values of the linguistic variables defined through corresponding membership functions. In other words, the numerical description of a linguistic value depends on the corresponding domain or variable, and such dependency defines the linguistic value's membership function. For example, the numerical description of the linguistic values "high" or "low" for the linguistic variable "speed limit" may be a function of a quantified speed limit in, for example, miles per hour, and may be normalized between 0 and 1.
An example fuzzy inference system often performs the following four steps.
In a fuzzy system, the inputs are fuzzified by the membership functions of the fuzzy system, which generates defuzzified output. One of the problems with a fuzzy system is that the number of rules increases exponentially depending on the number of inputs and membership functions. If there are x inputs and y membership functions for each input, a full combination of the rules will be x^y. For example, if there are 5 inputs and 5 membership functions, the number of the rules will be 5^5=3125. The more inputs and membership functions in the fuzzy system, the more rules the rule base would contain. This makes the fuzzy system computationally expensive and difficult to understand for a human operator. However, the fuzzy system for a navigational system such as an autonomous vehicle/machine does not need all of the rules to control the vehicle. A fuzzy relations control strategy (FRCS) may be used for a hierarchical rule-base reduction for the fuzzy vehicle controller to exponentially reduce the number of the rules. For example, once the hierarchical rule-base reduction is applied to the fuzzy system with 3 inputs and 5 membership functions, the number of the rules can be decreased, for example, from 3^5=243 to 25. This technique makes the behavior of the controller as simple as possible and reduces the computation time. This is important because it decreases many parameters in the fuzzy system, so the computational time during training will be dramatically decreased. The rules are common sense and essentially fixed, which reduces the size of the resulting ANFIS networks and reduces the computational effort and the number of training epochs.
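The growth and reduction of the rule-base described above can be illustrated with the short sketch below, which simply echoes the example counts quoted in this disclosure; the variable names are illustrative.

# Example counts quoted above: a 5-input, 5-membership-function system has a
# full rule-base of 3125 rules, and a 3-input, 5-membership-function system is
# reduced by hierarchical rule-base reduction from 243 rules to 25 rules.
full_rules_5_inputs = 5 ** 5      # 3125 candidate rules
full_rules_3_inputs = 3 ** 5      # 243 candidate rules, per the example above
reduced_rules = 25                # rules retained after the reduction
print(full_rules_5_inputs, full_rules_3_inputs, reduced_rules)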
For example, relevant controller linguistic values, Fuzzy Relations Control Variables (FRCVs), and outputs may first be established. Then, the FRCS determines the most globally influential FRCVs by placing the FRCVs in a hierarchy of influence. This hierarchy may be used to divide the operating environment of the vehicle into distinct regions or spaces (or branches) of operation. The relations in the hierarchy/regions may then be used to inform a selection of the rules most influential on state errors. This entire top-down process results in a Hierarchical Rule-Base Reduction (HRBR).
As such, the HRBR represents a generic strategy for reducing the size of a fuzzy logic rule-base. The reduction of the rule-base follows directly from model and FRCV generation, with example steps illustrated in the data and logic flow 3200 of
In some example implementations for generating tiers of control objectives and error functions in Step 3202 of
Further examples of HRBR can be found in U.S. Provisional Patent Application No. 63/529,967 entitled "HIERARCHICAL FUZZY CONTROLLER WITH MULTIPLE CONTROL OUTPUT," filed by the same Applicant on Jul. 31, 2023, which is herein incorporated by reference in its entirety.
Various parameters involved in a fuzzy logic controller with HRBR, including but not limited to the parameters for defining the various membership functions and the weights among the various rules in the rule-base may be determined in various manners. But once these parameters are determined for a particular application (e.g., a particular type of vehicle/machines operating in a particular environment), they are not easily transferable to other applications even if the same membership function scheme can be used. As such, a new set of parameters for the membership functions may need to be determined. In other words, a fuzzy logic controller, by itself, is not very adaptable from application to application.
In some example implementations, a fuzzy logic may be embodied as an adaptive neural network to provide the adaptability that is missing in a traditional fuzzy logic controller. Such a system may be referred to as an Adaptive Neuro-Fuzzy Inference System (ANFIS). An ANFIS embeds a fuzzy logic controller in a neural network by combining the power of trainability and adaptability of neural networks in model parameter optimization (e.g., parameters associated with membership functions) and the explainability of linguistics in fuzzy logic, thereby rendering a neural network system that is less of a black box and is yet adaptable between applications/environments.
In some example implementations of neural networks, adaptability may be provided based on new information by training the various parameters using the errors from supervised or unsupervised learning. For example, if someone wants to train the neural network to control the amount of water in a water tank and have it depend on various situations, a large number of input and output data covering the various situations are required to train the neural network under supervised learning. The training would involve the neural network taking the collected inputs and predicting the outputs, then calculating the errors between the predicted outputs and the desired outputs corresponding to the inputs. The errors are then back-propagated through the neural network to update the weights in the various neural network layers. As such, when a neural network starts learning, it takes an initial guess with random outputs; errors will be calculated between the random outputs calculated by the ANN and the desired outputs. The error will then be back-propagated through the neural network to adjust the parameters. This process will gradually converge to an adapted set of parameters that decrease the errors.
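For illustration only, the predict/compare/back-propagate loop described above may be sketched as follows, with a single linear layer standing in for the network; the data and learning rate are placeholder assumptions.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                # collected inputs
y = X @ np.array([0.5, -1.0, 2.0]) + 0.1     # desired outputs
w = rng.normal(size=3)                       # initial guess (random parameters)

lr = 0.05
for epoch in range(200):
    pred = X @ w                             # predicted outputs
    err = pred - y                           # errors vs. the desired outputs
    w -= lr * (X.T @ err) / len(y)           # back-propagated gradient update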
An ANFIS, for example, may include a model structured similarly to a fully connected neural network to emulate a fuzzy logic controller. While structured to perform the functions of the fuzzy logic controller, thus preserving the interpretability and explainability advantages, the combined model may be designed to enable a tuning of its parameters through dynamic back-propagation. Specifically, a fuzzy controller developed as a deliverable can be translated into an ANFIS system and converted back to a linguistically based fuzzy system.
In one example conversion of a fuzzy logic system into an ANFIS, the fuzzy logic may be represented in the neural network as a number of (e.g., five) sequential neurological layers, each representing a step in the fuzzy logic controller (or fuzzy logic inference system) described above, except that Layers 2 and 3 of the neural network jointly represent step 2 of the fuzzy logic controller above. Such an example ANFIS is shown in
Layer 1. In Layer 1, fuzzification of each crisp input (e.g., a velocity measurement or a temperature measurement) may be performed through the use of, for example, one or more generally trapezoidal membership functions, μk,p, with each membership function being 0 for all input values not included in its trapezoidal section. For the membership functions, k specifies the associated linguistic descriptor (e.g., "speed") and p the given linguistic value (e.g., a "fast" value, a "medium" value, a "slow" value, etc.). The value of the function represents the numerical description as a function of the domain, wherein the trapezoid function merely represents one example of the shape of this example membership function. Layer 1 may be alternatively referred to as a premise layer.
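A minimal sketch of one such generally trapezoidal membership function, parameterized by the points a ≤ b ≤ c ≤ d discussed in the classical trapezoid description below, is given here; the function name and example numbers are illustrative assumptions.

def trapezoid_mu(x, a, b, c, d):
    # Classical trapezoidal membership degree in [0, 1] for a crisp input x.
    # Assumes a < b <= c < d; the value is 0 outside [a, d], 1 on [b, c],
    # and linear on the sloped sections [a, b] and [c, d].
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)      # rising slope
    return (d - x) / (d - c)          # falling slope

# Example: membership of a 1.3 m/s speed in a hypothetical "medium" value.
mu_medium = trapezoid_mu(1.3, a=0.5, b=1.0, c=1.5, d=2.0)   # evaluates to 1.0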
Layer 2. Layer 2 represents the weights of the outputs of the fuzzy logic rules. For a given rule, i, a combination of the chosen input membership functions, μk,p, may be used (e.g., via a product T-norm) to determine the firing strength of the rule.
Layer 3. Layer 3 represents a normalization layer, where the output membership function firing rates are normalized (where r represents a number of membership functions).
Layer 4. Layer 4 represents the consequent portion of the network. Using example triangular output membership functions and the Center-of-Maximum defuzzification implementation, the output for each rule may be simply a scalar product of the peak of its associated output membership function and the normalized firing rate.
Layer 5. Layer 5 is where these weighted rule outputs may be combined into a crisp output.
Layer 5 may be alternately referred to as consequential layer.
The above layers of the ANFIS thus form a neural network embedding the fuzzy logic and may be trained (with respect to input and output membership functions, for example) and used to process one or more inputs to generate one or more control outputs.
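A compact, single-sample sketch of the five layers as a forward pass is shown below, reusing the trapezoid_mu helper sketched above; the rule table format, the product T-norm for Layer 2, and the scalar consequent peaks for Layer 4 are assumptions chosen for illustration rather than the only configuration contemplated above.

import numpy as np

def anfis_forward(x, mf_params, rules, output_peaks):
    # x:            crisp inputs, one value per linguistic variable k
    # mf_params:    mf_params[k] = list of (a, b, c, d) trapezoids for variable k
    # rules:        rules[i] = list of (k, p) membership indices ANDed in rule i
    # output_peaks: output_peaks[i] = peak of rule i's output membership function
    # Layer 1 (premise): fuzzify each input with its trapezoidal memberships.
    mu = [[trapezoid_mu(x[k], *params) for params in mf_params[k]]
          for k in range(len(x))]
    # Layer 2 (weighting): rule firing strengths via a product T-norm.
    w = np.array([np.prod([mu[k][p] for (k, p) in rule]) for rule in rules])
    # Layer 3 (normalization): normalize the firing strengths.
    w_norm = w / (np.sum(w) + 1e-12)
    # Layer 4 (consequent): scale each rule's output peak by its normalized firing.
    rule_outputs = w_norm * np.array(output_peaks)
    # Layer 5 (output): combine the weighted rule outputs into a crisp output.
    return float(np.sum(rule_outputs))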
For example, in the premise layer, each premise parameter update can be written as:
p_new = p_old − α(∂E/∂p),
where p=parameter to be tuned; α=learning rate; and E=error to be back-propagated.
Then, the partial derivatives may be expressed via the chain rule, ∂E/∂p = (∂E/∂u)(∂u/∂p), where u=numerical description of the linguistic value.
The disclosure below provides example implementations of membership functions with characteristics that allow for unique and efficient training of the ANFIS.
Classical Trapezoid.
The parameters a, b, c, d may be assumed to follow the constraint a ≤ b ≤ c ≤ d.
Following the function definition, the partial derivatives of the system with respect to a, b, c, d may be defined:
Trapezoid membership functions may be used to mitigate the "bang-bang" problem. Specifically, in control systems, "bang-bang" refers to rapidly switching between two extreme values in response to noisy or fluctuating input signals. This behavior can cause high-frequency outputs that can harm the system, leading to malfunctions due to oscillations and generating less smooth results. Trapezoidal membership functions are sometimes used instead of other membership functions in fuzzy logic systems to mitigate this issue. As shown in
Joint Membership Functions-Constraints. In some implementations, joint membership functions of various linguistic values over a domain may be used. An example constraint may be imposed on the system such that the sum of all the numerical descriptions of the membership functions is equal to one (or any other predefined normalized value) at every point in each linguistic domain, as shown by the three-linguistic-value joint trapezoid functions in
Through such a constraint, the membership function parameters a, b, c, d for each trapezoid depend on those of the other trapezoids. The boundary trapezoids are defined as single-sided functions as shown in
The trapezoid function parameters may thus be defined relative to each other, for i=1, 2, . . . , pk ∀k, in a joint manner:
The variable pk may be defined as the number of membership functions or linguistic values with respect to an input domain k. This constraint (53) helps to reduce the parameter set size from 4(pk−1) to 2(pk−1) for the input domain k. The parameter set size may be further reduced by assuming that the system is symmetric along an input domain axis c and then only defining the membership functions on one side of the axis, as shown in
Thus, the reduced membership function parameters an, bn, and c are defined as coordinates on the domain x-axis. However, to further introduce the constraint c≤a1≤b1≤ . . . ≤an≤bn into the system, a redefinition of the trapezoid function parameters may be implemented. For example, the parameters may instead be redefined as widths between key points, as shown
As such, through the redefinition of the parameter set, the c≤a1<b1< . . . <an≤bn constraint becomes:
The constraint thus follows a non-negative restriction and a cumulative sum relationship. As such, constraining the trapezoid shape may become computationally efficient.
Trapezoid Computation. In the trapezoid computation below, the parameter set may be defined as {w, c}:
with the size dim(w)=p−1. To enforce non-negativity, the system may be subjected to an element-wise squaring of w, as shown in Equation (60), by using a Hadamard power:
The cumulative sum of the elements in w may be used to compute the new set of non-negative key points representing the trapezoid functions.
The cumulative sum operation may be represented as a lower triangular matrix C such that Ci,j=1 if j≤i, with C ϵ ℝ^((p−1)×(p−1)).
To define the symmetrical output, the reflection operator matrix R may be calculated as R ϵ ℝ^(2(p−1)×(p−1)), for i, j ϵ {1, 2, . . . , p−1}:
The vector r may be reduced to r=RCws=Mws, where M ϵ ℝ^(2(p−1)×(p−1)). Once the reflected vector is calculated, the vector may be shifted to be symmetric around c.
Following the definition of the vector rc, the vector may be divided into two sub-vectors. These sub-vectors may represent coordinates associated with the sloped regions of the trapezoids, with x0 and x1 being associated with coordinates for the left and right sides of the sloped regions, respectively, for i ϵ {1, . . . , p−1}, where X0, X1 ϵ ℝ^((p−1)×2(p−1)).
where X0
The Hadamard division, ⊘ (element-wise vector division), may be annotated below as:
From the vector subsets x0 and x1, s^up may define the sloped regions, such that the scalar x is broadcasted to become a vector x ϵ ℝ^((p−1)×1).
If rc is zero, a division-by-zero error may arise, so an ε may be added to the rc component, rc→rc+ε. Following the creation of the linear regions, clamping may be used to define the regions of zero slope. As a result, sc^up(x) may represent the increasing sides of the trapezoids and sc^down(x) may represent the decreasing sides of the trapezoids. These sides may be bounded such that sc^up(x), sc^down(x) ϵ [0,1], with dimensions (p−1)×1, and:
Finally, the vectors scp^up(x), scp^down(x) ϵ ℝ^(p×1) may be padded and combined using the Hadamard product (C=A∘B→Ci,j=Ai,jBi,j) to produce the final trapezoid outputs, as illustrated in
Single Sided Constraint. The single sided membership function assumes that the domain of the linguistic value is restricted to a single direction, i.e., [c, ∞). To satisfy this constraint, the function may be restricted from using the reflection operator in Equation (64).
Closed Domain Constraint. Some previous implementations of the linguistic value's joint membership functions assumed that the domain for the variable is (−∞, ∞). These linguistic values may represent, for example, velocity or signed perpendicular distance to a line. However, some problems are bounded to [a, b]. Such linguistic values may represent headings of a robot, which are bounded to [−π, π]. In the example implementations below, the linguistic value's membership functions are constrained within the symmetrical domain of [a, b]=[c−L, c+L]. The primary constraint on the system may involve restricting the cumulative sum of the widths not to exceed the distance L. A normalization step may be performed on the squared width vector ws, i ϵ {1, . . . , pk}, from Equation (60).
This constraint introduces an additional width variable wpk. It may also be trained when the gradients are computed.
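A numerical sketch of the symmetric joint trapezoid computation described above (element-wise squaring of the widths, cumulative summation into key points, reflection about the axis c, and clamping of the sloped regions with an epsilon guard) is given below; the matrix operators C, R, M, X0, and X1 are paraphrased with array slicing, and the exact arrangement is an assumption consistent with, but not identical to, the formulation above.

import numpy as np

def symmetric_joint_trapezoids(x, w, c, eps=1e-9):
    # Membership degrees of p joint, symmetric trapezoids at a scalar input x.
    # w: trainable width parameters, len(w) = p - 1 (squared for non-negativity)
    # c: symmetry axis of the joint membership functions
    # Returns p membership values that sum to one at every x.
    ws = np.asarray(w, dtype=float) ** 2             # element-wise squaring
    key = np.cumsum(ws)                              # non-negative key points
    rc = np.concatenate([c - key[::-1], c + key])    # reflect about c and shift
    # Consecutive pairs of rc bound the p - 1 sloped transition regions.
    lo, hi = rc[0::2], rc[1::2]
    rise = np.clip((x - lo) / (hi - lo + eps), 0.0, 1.0)   # increasing sides
    fall = np.clip((hi - x) / (hi - lo + eps), 0.0, 1.0)   # decreasing sides
    # Pad with ones so the boundary trapezoids are single sided, then combine
    # the rising and falling sides of each trapezoid via a Hadamard product.
    return np.concatenate([[1.0], rise]) * np.concatenate([fall, [1.0]])

# Example: three joint memberships ("negative", "zero", "positive") about c = 0.
mu = symmetric_joint_trapezoids(0.3, w=[0.5, 0.7], c=0.0)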
As described above, an example ANFIS implementation may be defined as comprising five different layers defined in Equations (40), (41), (42), (43), and (44). The example implementations below reduce the ANFIS system of equations into a more compact model. The compact model may follow a structure similar to the Takagi-Sugeno fuzzy inference system architecture and the like. The ANFIS model implementations below exemplarily follow a Mamdani rule set approach to the rule-base formulation.
As described above, triangleO
Following those definitions, Equation (44) may be redefined as a linear operator to improve computational efficiency. The matrix P may be a set of trainable weights representing the triangular output membership functions through a Center-of-Mass representation, such that P ϵ ℝ^(1×n), Q ϵ ℝ^(n×r), N ϵ ℝ^(r×1).
where ei is a unit vector representing which output, On, each rule, r, may be associated with, such that ei ϵ {0,1}^n and ei^T ei = 1.
The vector W may be used to represent the output of each rule given as the output of the fuzzification operation, e.g., the product T-norm.
Given that wi ϵ [0,1], the expression may thus be rewritten as
The final step may involve the redefinition of the antecedent layer.
The product T-Norm (x*y) may be used to compute the antecedents. Vector T contains the firing rates for each of the input membership functions μk,p over all of the k input domains and their pk associated numerical descriptions. In total,
The function A(·): ℝ^(f×1) → ℝ^(r×1) may map the f input membership functions to the r rule outputs. As such, ai=Ta* . . . *Tq may represent the specific rule: IF Ta AND Tb AND . . . AND Tq. In turn, the ANFIS may be represented as:
The previous notation for the reduced ANFIS system represents single input, single output notation. However, the system may account for batched inputs, with b being the batch index, the batched input of dimension b×k, T(·) of dimension f×1 per sample, the rule activations of dimension r×b, and the output of dimension 1×b.
Hence, the single input ANFIS system in Equation (88) may be transformed for the batch inputs case:
The various implementations below derive the offline mean square error optimization of the ANFIS. The loss function may be defined as:
The normalization factor may be redefined as a broadcasted normalization vector to simplify differentiation of the function. The broadcast may be column-wise to ensure there are n outputs. This broadcast allows the function model to be rearranged as follows:
Taking the gradient with respect to the output gain P, while holding the other variables constant, may then resolve to:
Consequently, P may be found to be:
and may be represented in a more standard notation of y=Ax:
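As a small numerical sketch of this offline least-squares step in the y=Ax form referenced above, the output gains may be fitted to recorded data as follows, assuming the normalized rule activations have already been computed; the data here are synthetic placeholders.

import numpy as np

rng = np.random.default_rng(0)
# A: (num_samples, r) matrix of normalized rule activations, one row per sample.
A = np.abs(rng.normal(size=(200, 9)))
A = A / A.sum(axis=1, keepdims=True)          # rows sum to one (normalized firings)
true_P = np.array([-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0])
y = A @ true_P + 0.01 * rng.normal(size=200)  # recorded target outputs

# Least-squares solution for the output gain vector (the y = Ax form above).
P_hat, *_ = np.linalg.lstsq(A, y, rcond=None)

Under the symmetry constraint described below, the same fit may instead be performed against the reduced design matrix obtained by applying the symmetry operator M, recovering the reduced parameter set Pr.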
In an example robotic application, the ANFIS system may further employ a parameter reduction for the triangle output weights, namely a symmetric constraint. Most robotic systems assume that the system's dynamics are symmetric, so clockwise and counter-clockwise motion pose the same constraints. As a result, the output triangle membership function weights would be symmetric. In turn, P's gains may also be symmetrical.
The example below in
As such, the output gain matrix P may be reduced using symmetrical constraints to the parameter set Pr and the symmetry constraint operator M:
The full system, with the combination of the above, may be:
Following this interpretation, the optimal Pr using the Least Squares Regression solution may be defined as follows:
In some example implementations, reinforcement learning may be used to optimize the parameter set (the positions of the flat and slope regions of the membership functions, as well as the weights) of the ANFIS in a dynamic environment. The main reinforcement learning algorithm used to train the ANFIS actor-network may be based on a Deep Deterministic Policy Gradient (DDPG) model.
The example reinforcement learning setup for discrete action spaces involves an agent acting in discrete time. At each time-step t, the actor receives an observation xt of its current state in the environment; the actor then decides what action atϵN to perform in the environment. The consequences are then defined as a scalar reward rt.
The actor's behavior is defined by the policy π attributed to it. Given that most environments are stochastic, the policy, π: S → P(A), may be modeled as a Markov state model with the state space S and the action space A ⊆ ℝ^N.
Given the current state, st, and action, at, a reward is attributed to the transition, r(st,at). In addition, the process transition probabilities may be defined by the probability of reaching a specific state, p(st+1|st,at).
Example reinforcement learning algorithms may use a recursive approach to compute a Q-value, which approximates the expected value for an action in a specific state. This approach may be executed using the Bellman equation. For example,
The optimal Q-value may further be simplified using the optimal action a*(s), which can be found as:
When formulated in a greedy optimization manner, the Bellman equation becomes:
However, given that Q is often impossible to solve exactly, Q(s,a) may be estimated using function approximators, e.g., a neural network parameterized by θQ, which may be optimized using a mean-squared Bellman error (MSBE) function. The set of transitions, D, may be defined by (st,at,rt,st+1,d), where d indicates whether the state is an endpoint in the system, such that d ϵ {0,1}. The loss function may thus be defined as:
where
Q-learning may be described on discrete action spaces. This is possible because the a* optimization can be performed over a finite action space. However, the equivalent greedy arg max policy in a continuous action space a ϵ ℝ^n would require an optimization process at each step. Optimizing at each time step may become prohibitively slow for real-time systems. As such, DDPG may make use of an actor-critic structure. For example, DDPG may use a parameterized actor function μ(s|θμ), which represents the system's policy through the deterministic mapping of the states to their equivalent actions. As in Q-learning, the critic, which approximates Q(s, a), may be optimized through the Bellman equation. As such, the actor may be updated using the expected return of its parameters:
Since the timesteps in the environment are sequentially processed, the samples are not independently and identically distributed (IID); replay buffers may be used to address this IID problem. A replay buffer may include a finite cache R ⊂ D. While training the actor and critic, mini-batches may be sampled uniformly from the replay buffer, and when the replay buffer is full, the oldest values may be discarded.
In addition, DDPG may provide a solution against divergent Q(s,a|θQ) given that the updated network also calculates the target value yt, causing the Q update to be prone to divergence. As such, target networks may be introduced to mitigate this problem. The actor and critic target copy networks are annotated as μ′(s|θμ′) and Q′(s,a|θQ′), respectively. The target networks then get updated in a time-weighted average fashion to “soft” update the network parameter sets as θt+1′←τθt+(1−τ)θt′, where τϵ[0, 1].
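A condensed sketch of one DDPG update, combining the replay-buffer sampling, the Bellman target computed with the target networks, the critic's mean-squared Bellman error, the actor update, and the "soft" target update θ′ ← τθ + (1−τ)θ′ described above, is given below in a PyTorch-style formulation; the critic and actor modules, the buffer API, and the hyperparameter values are illustrative assumptions rather than the disclosed implementation.

import torch

def ddpg_update(actor, critic, actor_t, critic_t, buffer,
                actor_opt, critic_opt, gamma=0.99, tau=0.005, batch=64):
    s, a, r, s2, d = buffer.sample(batch)          # replay-buffer mini-batch

    # Bellman target computed with the *target* actor and critic copies.
    with torch.no_grad():
        y = r + gamma * (1.0 - d) * critic_t(s2, actor_t(s2))

    # Critic update: mean-squared Bellman error (MSBE).
    critic_loss = ((critic(s, a) - y) ** 2).mean()
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Actor update: ascend the expected return Q(s, mu(s)).
    actor_loss = -critic(s, actor(s)).mean()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()

    # "Soft" target updates: theta' <- tau * theta + (1 - tau) * theta'.
    with torch.no_grad():
        for net, net_t in ((actor, actor_t), (critic, critic_t)):
            for p, p_t in zip(net.parameters(), net_t.parameters()):
                p_t.mul_(1.0 - tau).add_(tau * p)

With the ANFIS as the actor, the parameters updated by the actor optimizer would be the membership and output-gain parameters {P, c, w} described below.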
DDPG Workflow with ANFIS
The DDPG may be used as the parameter optimizer, where the Q-value approximator may be defined as a deep neural network. However, the parameterized actor function μ(s|θμ) may be defined as the ANFIS system. As such, the parameter set may be defined as θμ={P,c,w}. The use of the ANFIS may be comparable to traditional neural network parameterized actors in its universal function approximation ability. However, through the ANFIS, a drastic parameter reduction and control over the system's characteristics may be achieved.
The control over the system's characteristics stems from the ability to modify the membership functions. As such, ANFIS is advantageous for understanding the input-output relationship compared to large neural networks whose weights can be difficult to effectually alter without prior knowledge of the consequences. In other words, traditional large neural networks appear as black boxes whereas an ANFIS is explainable and modifiable. In addition, the rule set of the fuzzy logic in the ANFIS is defined in an intuitive and constrained manner, which provides an easier understanding of each rule set's impact and its contributions to the outputs. This makes it easier to manipulate and troubleshoot the system by using, for example, a masking operation to understand the output of each rule set. The ANFIS using fuzzy logic, as a neural network, can be used as a universal approximator. Thus, a minimum ANFIS definition can be derived such that one can approximate the state-to-action policy space.
An example ANFIS is evaluated in simulation as a motion controller for an example skid-steer motion model Unmanned Ground Vehicle (UGV). Table 13 defines the system's inputs. The linear velocity is set to a constant value of
An example Clearpath Jackal represents a ground vehicle meant to follow a set of waypoints linearly. The example ANFIS is configured to act based on the current state of the robot along its path. The state of the robot may be defined through five error functions.
The waypoints are defined as specific world physical locations. The path p represents the sequential set of waypoints. To follow the robot's progress, particular waypoints may include: xp, representing the waypoint the robot most recently passed; xc, representing the waypoint the robot is heading towards; and xf, representing the next/future waypoint that the robot would reach. Finally, x describes the robot's world location.
The five example error states of the robot may be represented as:
The reward for the system may be defined as:
The rule-base of the underlying fuzzy logic may be defined in the mapping function A with the rule-to-output relationship defined in matrix Q. To save space, “left” and “right” are reduced to “l” and “r,” respectively. The ordered input membership values may be represented in vector T:
The output matrix weight P may be represented by:
with the additional output symmetry matrix M being:
The input rule base mapping function A(·) may be:
The output query matrix Q may be:
The ANFIS system may then be trained using DDPG with an Adam optimizer for the critic and actor parameters with learning rates of 0.001 and 0.0001, respectively. The critic is represented as a deep neural network and the actor is the ANFIS. The symmetry axis c may be uniformly set to 0 for all joint membership functions. Each linguistic value's respective parameters and membership type may be initially assigned to the values displayed in Table 14. Three membership types may be utilized: Unrestricted (U), Single-Sided (S), and Bounded (B), each representing membership functions with domains (−∞, ∞), [c, ∞), and [a, b], respectively, as described above.
The performance of the example system is evaluated by analyzing the training process results. The cumulative reward function of the system reaches an equilibrium point within a few episodes, typically between 6 to 10 episodes, as depicted in
The Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) are also tracked throughout the episodes. The MAE and the RMSE are calculated relative to the Distance Line dl input parameter, which indicates the robot's proximity to the trajectory. The results obtained from the experiments are presented in
The accuracy of the example ANFIS model after tuning may be assessed by examining how well it followed the trajectory. As shown in
The various example ANFIS implementations/models above may be used as a controller in applications with symmetrically constrained problems. These example ANFIS models require fewer parameters and less computation than typical neural network approaches via HRBR and symmetry considerations, leading to faster parameter convergence and more stable system characteristics. These ANFIS models include human-interpretable and malleable parameters that are trained and at the same time directly human-modifiable, a desirable feature for, for example, motion controllers. The example ANFIS's rule-set and non-linear membership functions correspond to various neural network layers and thus effectively mitigate the black box approach of traditional neural network-based controllers through the explainability of its membership functions and rule set embedded in the network layers.
Finally,
In the disclosure above, a method for generating a neuro-fuzzy logic controller is disclosed, the neuro-fuzzy logic controller being configured to generate at least one control output signal from a set of input signals, the method comprising: determining one or more input linguistic values and one or more output linguistic values for a fuzzy logic underlying the neuro-fuzzy logic controller; determining a rule-base linking the one or more input linguistic values and the one or more output linguistic values; performing a hierarchical rule-base reduction (HRBR) procedure to generate a modified fuzzy logic with a reduced rule-base; initializing the neuro-fuzzy logic controller to embed the modified fuzzy logic including initial input membership functions associated with the one or more input linguistic values and the set of input signals, and initial output membership functions associated with the one or more output linguistic values and the at least one control output signal; tuning the membership functions via reinforcement training of the neuro-fuzzy logic controller to generate a trained neuro-fuzzy logic controller; and controlling an actuator based on the at least one control output signal generated by the trained neuro-fuzzy logic controller from the set of input signals.
In any one of the method above, the input membership functions may comprise trapezoid relations between numerical values of the one or more input linguistic values and the set of input signals.
In any one of the method above, the output membership functions comprise triangular relations between numerical values of the one or more output linguistic values and the at least one control output signal.
In any one of the method above, the input membership functions comprise a combination of double sided and single sided trapezoids.
In any one of the method above, the input membership functions are symmetric with respect to an input domain associated with each of the set of input signals.
In any one of the method above, a shape of each double sided trapezoid of the input membership functions and the output membership functions is represented by four parameters in a corresponding domain.
In any one of the method above, a shape of each single sided trapezoid of the input membership functions and the output membership functions is represented by two parameters in the corresponding domain.
In any one of the method above, neighboring trapezoids of the input membership functions of a domain are constrained to have two joint parameters representing a slope region of the domain for the neighboring trapezoids.
In any one of the method above, the hierarchically reduced rule-base comprises a set of if-then rules linking the one or more input linguistic values to the one or more output linguistic values that cover fewer than all possible if-then linking combinations of the one or more input linguistic values and the one or more output linguistic values.
In any one of the method above, the hierarchically reduced rule-base comprises rule branches and sub-branches based on hierarchically prioritizing within a set of control metrics according to the one or more input linguistic values.
In any one of the method above, the neuro-fuzzy logic controller comprises five neurological layers.
In any one of the method above, the five neurological layers comprise a premise layer, a weighting layer, a normalization layer, a consequence layer, and an output layer.
In any one of the method above, parameters of the input membership functions are tuned in the premise layer.
In any one of the method above, the reinforcement training of the neuro-fuzzy logic controller is based on using the neural-fuzzy logic controller as an actor.
In any one of the method above, the reinforcement training of the neuro-fuzzy logic controller is based on back-propagation of errors between an expected sensor signal and an actual sensor signal as a result of the actuator being actuated by the neuro-fuzzy logic controller.
In any one of the method above, the reinforcement training is based on a Deep Deterministic Policy Gradient (DDPG) model.
In any one of the method above, the neuro-fuzzy logic controller is installed in a skid-steer vehicle for navigational control of the skid-steer vehicle.
In the disclosure above, a control circuitry is disclosed. The control circuitry comprises the neuro-fuzzy logic controller of any one of the methods above and configured to perform any one of the methods above.
It is to be understood that the various implementations above are not limited in their application to the details of construction and the arrangement of components set forth above and in the accompanying drawings. The disclosure is intended to cover other embodiments that may be practiced or carried out in various ways following the underlying principles disclosed herein.
It should also be noted that a plurality of hardware and software based devices, as well as a plurality of different structural components may be used to implement the various embodiments of the disclosure. In addition, it should be understood that embodiments of this disclosure may include hardware, software, and electronic components or modules that, for purposes of discussion, may be illustrated and described as if the majority of the components are implemented solely in hardware. However, one of ordinary skill in the art, and based on a reading of this disclosure, would recognize that, in at least one embodiment, the electronic based aspects of the invention may be implemented in software (e.g., stored on non-transitory computer-readable medium) executable by one or more processors. As such, it should be noted that a plurality of hardware and software based devices, as well as a plurality of different structural components may be utilized to implement the invention. Furthermore, and as described in subsequent paragraphs, the specific mechanical configurations illustrated in the drawings are intended to exemplify embodiments of the invention and that other alternative mechanical configurations are possible. For example, "controllers" described in the specification can include standard processing components, such as one or more processors, one or more computer-readable medium modules, one or more input/output interfaces, and various connections (e.g., a system bus) connecting the components. These controllers may be implemented as dedicated processing circuitry or in general-purpose processors, in combination of various software and/or firmware, and in combination of other wired or wireless communication interfaces.
In general, terminology may be understood at least in part from usage in its context. For example, terms, such as “and”, “or”, or “and/or,” as used herein may include a variety of meanings that may depend at least in part upon the context in which such terms are used. Typically, the term “or”, if used to associate a list, such as A, B or C, is intended to mean A, B, and C, here used in the inclusive sense, as well as A, B or C, here used in the exclusive sense. In addition, the term “one or more” or “at least one” as used herein, depending at least in part upon context, may be used to describe any feature, structure, or characteristic in a singular sense or may be used to describe combinations of features, structures or characteristics in a plural sense. Similarly, terms, such as “a”, “an”, or “the”, again, may be understood to convey a singular usage or to convey a plural usage, depending at least in part upon context. In addition, the term “based on” or “determined by” may be understood as not necessarily intended to convey an exclusive set of factors and may, instead allow for the existence of additional factors not necessarily expressly described, again, depending at least in part on context.
While this disclosure has described several exemplary embodiments, there are alterations, permutations, and various substitute equivalents, which fall within the scope of the disclosure. It will thus be appreciated that those skilled in the art will be able to devise numerous systems and methods which, although not explicitly shown or described herein, embody the principles of the disclosure and are thus within the spirit and scope thereof.
This application is based on and claims the benefit of priority to U.S. Provisional Patent Application Nos. 63/529,994, entitled “Training of Adaptive Neuro-Fuzzy Inference Systems with Hierarchical Rule-Base Reduction by Reinforcement Learning”, and 63/529,967, entitled “Hierarchical Fuzzy Controller with Multiple Control Output”, both filed on Jul. 31, 2023, which are incorporated herein by reference in their entireties.