This disclosure is related to communications in controller area networks.
The statements in this section merely provide background information related to the present disclosure. Accordingly, such statements are not intended to constitute an admission of prior art.
Vehicle systems include a plurality of subsystems, including by way of example, engine, transmission, ride/handling, braking, HVAC, and occupant protection. Multiple controllers may be employed to monitor and control operation of the subsystems. The controllers can be configured to communicate via a controller area network (CAN) to coordinate operation of the vehicle in response to operator commands, vehicle operating states, and external conditions. A fault can occur in one of the controllers that affects communications via a CAN bus.
Known CAN systems employ a bus topology for the communication connection among all the controllers that can include a linear topology, a star topology, or a combination of star and linear topologies. Known high-speed CAN systems employ linear topology, whereas known low-speed CAN systems employ a combination of the star and linear topologies. Known CAN systems employ separate power and ground topologies for the power and ground lines to all the controllers. Known controllers communicate with each other through messages that are sent at different periods on the CAN bus. Topology of a network such as a CAN network refers to an arrangement of elements. A physical topology describes arrangement or layout of physical elements including links and nodes. A logical topology describes flow of data messages or power within a network between nodes employing links.
Known systems detect faults at a message-receiving controller, with fault detection accomplished for the message using signal supervision and signal time-out monitoring at an interaction layer of the controller. Faults can be reported as a loss of communications. Such detection systems generally are unable to identify a root cause of a fault, and are unable to distinguish transient and intermittent faults. One known system requires separate monitoring hardware and dimensional details of physical topology of a network to effectively monitor and detect communications faults in the network.
A controller area network (CAN) has a plurality of CAN elements including a communication bus and controllers. A method for monitoring the CAN includes identifying active and inactive controllers based upon signal communications on the communication bus and identifying a candidate fault associated with one of the CAN elements based upon the identified inactive controllers.
One or more embodiments will now be described, by way of example, with reference to the accompanying drawings, in which:
Referring now to the drawings, wherein the showings are for the purpose of illustrating certain exemplary embodiments only and not for the purpose of limiting the same,
The CAN bus 15 includes a plurality of communications links, including a first communications link 51 between controllers 10 and 20, a second communications link 53 between controllers 20 and 30, and a third communications link 55 between controllers 30 and 40. The power grid 60 includes a power supply 62, e.g., a battery that electrically connects to a first power bus 64 and a second power bus 66 to provide electric power to the controllers 10, 20, 30 and 40 via power links. As shown, the power supply 62 connects to the first power bus 64 and the second power bus 66 via power links that are arranged in a series configuration, with power link 69 connecting the first and second power buses 64 and 66. The first power bus 64 connects to the controllers 10 and 20 via power links that are arranged in a star configuration, with power link 61 connecting the first power bus 64 and the controller 10 and power link 63 connecting the first power bus 64 to the controller 20. The second power bus 66 connects to the controllers 30 and 40 via power links that are arranged in a star configuration, with power link 65 connecting the second power bus 66 and the controller 30 and power link 67 connecting the second power bus 66 to the controller 40. The ground grid 70 includes a vehicle ground 72 that connects to a first ground bus 74 and a second ground bus 76 to provide electric ground to the controllers 10, 20, 30 and 40 via ground links. As shown, the vehicle ground 72 connects to the first ground bus 74 and the second ground bus 76 via ground links that are arranged in a series configuration, with ground link 79 connecting the first and second ground buses 74 and 76. The first ground bus 74 connects to the controllers 10 and 20 via ground links that are arranged in a star configuration, with ground link 71 connecting the first ground bus 74 and the controller 10 and ground link 73 connecting the first ground bus 74 to the controller 20.
The second ground bus 76 connects to the controllers 30 and 40 via ground links that are arranged in a star configuration, with ground link 75 connecting the second ground bus 76 and the controller 30 and ground link 77 connecting the second ground bus 76 to the controller 40. Other topologies for distribution of communications, power, and ground for the controllers 10, 20, 30 and 40 and the CAN bus 15 can be employed with similar effect.
Control module, module, control, controller, control unit, processor and similar terms mean any one or various combinations of one or more of Application Specific Integrated Circuit(s) (ASIC), electronic circuit(s), central processing unit(s) (preferably microprocessor(s)) and associated memory and storage (read only, programmable read only, random access, hard drive, etc.) executing one or more software or firmware programs or routines, combinational logic circuit(s), input/output circuit(s) and devices, appropriate signal conditioning and buffer circuitry, and other components to provide the described functionality. Software, firmware, programs, instructions, routines, code, algorithms and similar terms mean any controller executable instruction sets including calibrations and look-up tables. The control module has a set of control routines executed to provide the desired functions. Routines are executed, such as by a central processing unit, and are operable to monitor inputs from sensing devices and other networked control modules, and execute control and diagnostic routines to control operation of actuators. Routines may be executed at regular intervals, for example each 3.125, 6.25, 12.5, 25 and 100 milliseconds during ongoing engine and vehicle operation. Alternatively, routines may be executed in response to occurrence of an event.
Each of the controllers 10, 20, 30 and 40 transmits and receives messages across the CAN 50 via the CAN bus 15, with message transmission rates occurring at different periods for different ones of the controllers. A CAN message has a known, predetermined format that includes, in one embodiment, a start of frame (SOF), an identifier (11-bit identifier), a single-bit remote transmission request (RTR), a dominant single-bit identifier extension (IDE), a reserved bit (r0), a 4-bit data length code (DLC), up to 64 bits of data (DATA), a 16-bit cyclic redundancy check (CRC), a 2-bit acknowledgement (ACK), a 7-bit end-of-frame (EOF) and a 3-bit interframe space (IFS). A CAN message can be corrupted, with known errors including stuff errors, form errors, ACK errors, bit 1 errors, bit 0 errors, and CRC errors. The errors are used to generate an error warning status including one of an error-active status, an error-passive status, and a bus-off error status. The error-active status, error-passive status, and bus-off error status are assigned based upon increasing quantity of detected bus error frames, i.e., an increasing bus error count. Known CAN bus protocols include providing network-wide data consistency, which can lead to globalization of local errors. This permits a faulty, non-silent controller to corrupt a message on the CAN bus 15 that originated at another of the controllers. A faulty, non-silent controller is referred to herein as a fault-active controller.
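The mapping from bus error count to error status can be illustrated with a minimal sketch. The description above does not give thresholds; the sketch assumes the fault-confinement thresholds of the CAN standard, in which each node keeps a transmit error counter and a receive error counter. The function name and argument names are hypothetical.

```python
def can_error_state(tec: int, rec: int) -> str:
    """Classify a node from its transmit (tec) and receive (rec) error counters.

    Assumed thresholds (standard CAN fault confinement, not stated in the text):
    bus-off when tec exceeds 255, error-passive when either counter exceeds 127.
    """
    if tec > 255:
        return "bus-off"
    if tec > 127 or rec > 127:
        return "error-passive"
    return "error-active"
```

For instance, under these assumed thresholds a node with counters (128, 0) would be classified as error-passive.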
A communications fault leading to a corrupted message on the CAN bus 15 can be the result of a fault in one of the controllers 10, 20, 30 and 40, a fault in one of the communications links of the CAN bus 15 and/or a fault in one of the power links of the power grid 60 and/or a fault in one of the ground links of the ground grid 70.
The CAN system model is generated (402). The CAN system model includes the set of controllers associated with the CAN, a communication bus topology for communication connections among all the controllers, and power and ground topologies for the power and ground lines to all the controllers.
A fault set (F) is identified that includes a comprehensive listing of individual faults (f) of the CAN associated with node-silent faults for the set of controllers, communication link faults, power link open faults, ground link open faults, and other noted faults (404). Sets of inactive and active controllers for each of the individual faults (f) are identified (406). This includes, for each fault (f) in the fault set (F), identifying a fault-specific inactive vector Vfinactive that includes those controllers that are considered inactive, i.e., communications silent, when the fault (f) is present. A second, fault-specific active vector Vfactive is identified, and includes those controllers that are considered active, i.e., communications active, when the fault (f) is present. The combination of the fault-specific inactive vector Vfinactive and the fault-specific active vector Vfactive is equal to the set of controllers Vcontroller. A plurality of fault-specific inactive vectors Vfinactive containing inactive controller(s) associated with different link-open faults can be derived using a reachability analysis of the bus topology and the power and ground topologies for the specific CAN when specific link-open faults (f) are present.
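The reachability analysis can be sketched as follows for the four-controller topology described earlier (communications links 51, 53, 55; power links 61 through 69; ground links 71 through 79). This is an illustrative sketch under stated assumptions, not the disclosed implementation: controller 40 is assumed to be the observing node, the labels "62-64" and "72-74" are hypothetical names for the unnumbered links from the power supply and the vehicle ground to the first buses, and a controller is assumed to stay on the bus electrically even when it is unpowered.

```python
from collections import deque

def reachable(start, links, removed):
    """Return the nodes reachable from start over undirected links, skipping one link."""
    adj = {}
    for name, (a, b) in links.items():
        if name == removed:
            continue
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

CONTROLLERS = {"C10", "C20", "C30", "C40"}
MONITOR = "C40"  # assumption: the node that observes bus traffic
BUS = {"51": ("C10", "C20"), "53": ("C20", "C30"), "55": ("C30", "C40")}
POWER = {"62-64": ("supply62", "pbus64"), "69": ("pbus64", "pbus66"),
         "61": ("pbus64", "C10"), "63": ("pbus64", "C20"),
         "65": ("pbus66", "C30"), "67": ("pbus66", "C40")}
GROUND = {"72-74": ("gnd72", "gbus74"), "79": ("gbus74", "gbus76"),
          "71": ("gbus74", "C10"), "73": ("gbus74", "C20"),
          "75": ("gbus76", "C30"), "77": ("gbus76", "C40")}

def inactive_vector(fault):
    """Fault-specific inactive vector for a link-open fault or a node-silent fault."""
    if fault in CONTROLLERS:          # node-silent fault: only that controller is silent
        return {fault}
    on_bus = reachable(MONITOR, BUS, fault)
    powered = reachable("supply62", POWER, fault)
    grounded = reachable("gnd72", GROUND, fault)
    return CONTROLLERS - (on_bus & powered & grounded)

def active_vector(fault):
    """Complement of the inactive vector within the set of controllers."""
    return CONTROLLERS - inactive_vector(fault)
```

Under these assumptions, `inactive_vector("53")` yields controllers 10 and 20, since opening link 53 cuts both off from the observing node, while `inactive_vector("61")` yields controller 10 alone.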
By observing each message on the CAN bus and employing time-out values, an inactive controller can be detected. Based upon a set of inactive controllers, the communication fault can be isolated since different faults, e.g., bus wire faults at different locations, faults at different controller nodes, and power and ground line faults at different locations, will affect different sets of inactive controllers. Known faults associated with the CAN include faults associated with one of the controllers, including faults that corrupt transmitted messages and silent faults, and open faults in the communications links. Thus, the bus topology and the power and ground topologies can be used in combination with the detection of inactive controllers to isolate the different faults.
Each of the controllers is designated Ci, with i indicating a specific one of the controllers from 1 through j. Each controller Ci transmits a CAN message mi, and the period of the CAN message mi from controller Ci may differ from the CAN message period of other controllers. Each of the controllers Ci has an inactive flag (Inactivei) indicating the controller is inactive, and an active flag (Activei) indicating the controller is active. Initially, the inactive flag (Inactivei) is set to 0 and the active flag (Activei) is also set to 0. Thus, the active/inactive status of each of the controllers Ci is initially indeterminate. A timer Ti is employed for the active supervision of each of the controllers Ci. The time-out value for the supervision timer is Thi, which is calibratable. In one embodiment, the time-out value Thi for the supervision timer Ti is set to 2.5 times the message period (or repetition rate) of controller Ci.
The inactive controller detection process 200 monitors CAN messages on the CAN bus (202) to determine whether a CAN message has been received from any of the controllers Ci (204). When a CAN message has not been received from any of the controllers Ci (204)(0), the operation proceeds directly to block 208. When a CAN message has been received from any of the controllers Ci (204)(1), the inactive flag for the controller Ci is set to 0 (Inactivei=0), the active flag for the controller Ci is set to 1 (Activei=1), and the timer Ti is reset to the time-out value Thi for the supervision timer for the controller Ci that has sent CAN messages (206). The logic associated with this action is that only active controllers send CAN messages.
When no message has been received from one of the controllers Ci (204)(0), it is determined whether the timer Ti has reached zero for the respective controller Ci (208). If the timer Ti has reached zero for the respective controller Ci (208)(1), the inactive flag is set to 1 (Inactivei=1) and the active flag is set to 0 (Activei=0) for the respective controller Ci (210). If the timer Ti has not reached zero for the respective controller Ci (208)(0), this iteration of the inactive controller detection process 200 ends (216). When messages have been received from all the controllers Ci within the respective time-out values Thi for all the supervision timers, inactive controller detection process 200 indicates that all the controllers Ci are presently active. When the supervision timer expires, the inactive controller detection process 200 identifies as inactive those controllers Ci wherein the inactive flag is set to 1 (Inactivei=1) and the active flag is set to 0 (Activei=0). It is then determined whether the fault isolation routine has triggered (212). If the fault isolation routine has triggered (212)(1), this iteration of the inactive controller detection process 200 ends (216). If the fault isolation routine has not triggered (212)(0), the active flag is set to 0 (Activei=0) for all the controllers Ci, i=1, . . . n, the fault count is set (Fault_Num=1) and the fault isolation routine is triggered (214). This iteration of the inactive controller detection process 200 ends (216).
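The per-controller supervision in blocks 204 through 210 can be sketched as a countdown timer with two flags. This is a minimal sketch, assuming the timer Ti starts at the time-out value Thi and counts down toward zero; the class and function names are hypothetical, and the triggering of the fault isolation routine (blocks 212 and 214) is left out.

```python
class Supervision:
    """Flags and supervision timer for one controller Ci."""

    def __init__(self, period_ms, factor=2.5):
        self.timeout_ms = factor * period_ms   # Thi, 2.5 message periods in one embodiment
        self.timer_ms = self.timeout_ms        # Ti, assumed to start at Thi
        self.active = 0                        # Activei
        self.inactive = 0                      # Inactivei

def on_message(ctrl):
    """Block 206: a message from Ci marks it active and reloads its timer."""
    ctrl.inactive, ctrl.active = 0, 1
    ctrl.timer_ms = ctrl.timeout_ms

def on_tick(ctrl, elapsed_ms):
    """Blocks 208 and 210: count down; on reaching zero mark Ci inactive."""
    ctrl.timer_ms = max(0.0, ctrl.timer_ms - elapsed_ms)
    if ctrl.timer_ms == 0:
        ctrl.inactive, ctrl.active = 1, 0
```

With a 10 ms message period the time-out is 25 ms, so a controller that stays silent for 25 ms after its last message is flagged inactive, and a single later message flags it active again.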
The fault isolation process 300 includes an active vector Vactive and an inactive vector Vinactive for capturing and storing the identified active and inactive controllers, respectively. The vectors Vactive and Vinactive are initially empty. The Fault_Num term is a counter term that indicates the quantity of multiple faults; initially it is set to zero.
In the case of multiple faults, the candidate(s) of a previously identified candidate fault set are placed in the final candidate fault set. The vector Ft is used to store the previously identified candidate fault set and it is empty initially.
The fault isolation process 300 is triggered by occurrence and detection of a communications fault, i.e., one of the faults (f) of the fault set (F). A single fault is a candidate only if its set of inactive controllers includes all the nodes observed as inactive and does not include any controller observed as active. If no single fault candidate exists, it indicates that multiple faults may have occurred in one cycle. Multiple faults are indicated if one of the controllers is initially reported as active and subsequently reported as inactive.
In the case of multiple faults, a candidate fault set (Fc) contains multiple single-fault candidates. The condition for a multi-fault candidate fault set includes that its set of inactive nodes (union of the sets of inactive nodes of all the single-fault candidates in the multi-fault candidate fault set) includes all the nodes observed as inactive and does not include any node observed as active, and at least one candidate from the previous fault is still included in the multi-fault candidate fault set. Once the status of all nodes is certain (either active or inactive) or there is only one candidate, the candidate fault set (Fc) is reported out. The candidate fault set can be employed to identify and isolate a single fault and multiple faults, including intermittent faults.
Upon detecting a system or communications fault in the CAN system (302), the system queries whether an active flag has been set to 1 (Activei=1) for any of the controllers Ci, i=1, . . . n, indicating that the identified controllers are active and thus functioning (304). If the identified controllers are not active and functioning (304)(0), operation skips block 306 and proceeds directly to block 308. If the identified controllers are active and functioning (304)(1), any identified active controller(s) is added to the active vector Vactive and removed from the inactive vector Vinactive (306).
The system then queries whether an inactive flag has been set to 1 (Inactivei=1) for any of the controllers Ci, i=1, . . . n, indicating that the identified controllers are inactive (308). If the identified controllers are not inactive (308)(0), the operation skips block 310 and proceeds directly to block 312. If the identified controllers are inactive (308)(1), those controllers identified as inactive are added to the inactive vector Vinactive and removed from the active vector Vactive (310).
The system determines whether there have been multiple faults by querying whether any of the controllers have been removed from the active vector Vactive and moved to the inactive vector Vinactive (312). If there have not been multiple faults (312)(0), the operation skips block 314 and proceeds directly to block 316. If there have been multiple faults (312)(1), the fault counter is incremented (Fault_Num=Fault_Num+1), the current candidate fault set Fc is stored in the set Ft that holds the candidates of the previous fault (Ft=Fc), the active vector Vactive is emptied, and the active flags are reset for all the controllers (Activei=0) (314).
The system determines whether a recovery has occurred, thus indicating an intermittent fault, by querying whether any of the controllers have been removed from the inactive vector Vinactive and moved to the active vector Vactive (316). If an intermittent fault is indicated (316)(1), the operation proceeds directly to block 330 wherein the active vector Vactive is emptied, the inactive vector Vinactive is emptied, the fault counter Fault_Num is set to 0, and the controller is commanded to stop triggering execution of the fault isolation process 300 (330), and this iteration of the fault isolation process 300 ends (332). If an intermittent fault is not indicated (316)(0), the operation queries whether all the controllers are active (318). If all the controllers are active (318)(1), this iteration of the fault isolation process 300 ends (332). If all the controllers are not active (318)(0), then operation proceeds to block 320.
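The bookkeeping in blocks 304 through 316 amounts to moving controllers between two sets and noting in which direction they moved. The following is a minimal sketch of that bookkeeping only; the function name and the `flags` layout are hypothetical, and the follow-up actions of blocks 314 and 330 are left to the caller.

```python
def update_vectors(flags, v_active, v_inactive):
    """Apply one round of flags to the active and inactive vectors (sets, updated in place).

    flags maps a controller name to (active_flag, inactive_flag).
    Returns (multiple_faults, recovery):
      multiple_faults: a controller moved from active to inactive (block 312)
      recovery:        a controller moved from inactive to active (block 316)
    """
    now_active = {c for c, (a, _) in flags.items() if a == 1}
    now_inactive = {c for c, (_, i) in flags.items() if i == 1}

    multiple_faults = bool(now_inactive & v_active)
    recovery = bool(now_active & v_inactive)

    v_active |= now_active        # block 306
    v_inactive -= now_active
    v_inactive |= now_inactive    # block 310
    v_active -= now_inactive
    return multiple_faults, recovery
```

A controller that was recorded as active and is later flagged inactive signals an additional fault, whereas one that was recorded as inactive and is later flagged active signals a recovery, i.e., an intermittent fault.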
Block 320 operates to identify the candidate fault set Fc, by comparing the inactive vector Vinactive with the fault-specific inactive vector Vfinactive, and identifying the candidate faults based thereon.
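$$V_{\text{inactive}} \subseteq \bigcup_{f \in S} V_f^{\text{inactive}} \qquad [1]$$

and

$$V_{\text{active}} \cap \Bigl(\bigcup_{f \in S} V_f^{\text{inactive}}\Bigr) = \varnothing \qquad [2]$$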
Furthermore, if the previous candidate fault set Ft is not empty, then there exists a term R that is an element of the previous fault set Ft, such that R is a subset of set S (320).
The operation queries whether the candidate fault set Fc is empty, and whether the fault counter Fault_Num is less than the quantity of all possible faults |F| (322). If so (322)(1), the fault counter Fault_Num is incremented (324), and block 320 is re-executed. If not (322)(0), the operation queries whether the candidate fault set Fc includes only a single fault (|Fc|=1) or whether the combination of the active vector Vactive and the inactive vector Vinactive includes all the controllers (Vactive∪Vinactive=Vcontroller) (326). If not (326)(0), this iteration of the fault isolation process 300 ends (332). If so (326)(1), the candidate fault set Fc is output as the set of fault candidates (328), and this iteration of the fault isolation process 300 ends (332).
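The candidate search of blocks 320 through 324 can be sketched as an enumeration over fault subsets of growing size. This is an illustrative sketch, assuming each candidate is a subset of the fault set whose size equals the fault counter, which the text implies by re-running block 320 after each increment. The function names, the `signatures` mapping (fault name to fault-specific inactive vector), and the sample faults f1 through f4 are hypothetical.

```python
from itertools import combinations

def candidate_fault_sets(signatures, v_active, v_inactive, fault_num, previous=()):
    """Block 320: subsets of fault_num faults satisfying conditions [1] and [2]."""
    candidates = []
    for subset in combinations(sorted(signatures), fault_num):
        covered = set().union(*(signatures[f] for f in subset))
        if not v_inactive <= covered:        # condition [1]
            continue
        if v_active & covered:               # condition [2]
            continue
        # If a previous candidate set Ft exists, one of its members must be kept.
        if previous and not any(set(r) <= set(subset) for r in previous):
            continue
        candidates.append(subset)
    return candidates

def isolate(signatures, v_active, v_inactive, fault_num=1, previous=()):
    """Blocks 320 to 324: grow the fault count until a candidate appears or |F| is reached."""
    while True:
        found = candidate_fault_sets(signatures, v_active, v_inactive, fault_num, previous)
        if found or fault_num >= len(signatures):
            return fault_num, found
        fault_num += 1

# Hypothetical signatures for four faults over controllers A, B, C, D.
SIGNATURES = {"f1": {"A"}, "f2": {"B"}, "f3": {"A", "B"}, "f4": {"C"}}
```

With these sample signatures, observing A and B inactive while C and D are active leaves only f3 as a single-fault candidate. Observing A and C inactive while B is active has no single-fault explanation, so the search moves to pairs and returns f1 together with f4.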
When controller 610 is identified as inactive after a single execution of the fault isolation process 300, it indicates that link 651 is open between controllers 610 and 620, or that link 669 is open between controller 610 and power distribution network 666, or that link 677 is open between controller 610 and ground distribution network 674, or that the controller 610 has an internal silent fault.
When controller 620 is identified as inactive after a single execution of the fault isolation process 300, it indicates that link 665 is open between controller 620 and power distribution network 664, or that link 679 is open between controller 620 and ground distribution network 674, or that controller 620 has an internal silent fault.
When controller 630 is identified as inactive after a single execution of the fault isolation process 300, it indicates that link 668 is open between controller 630 and power distribution network 666, or that link 673 is open between controller 630 and ground distribution network 678, or that the controller 630 has an internal silent fault.
When the set of inactive controllers includes controllers 610 and 620, which are identified as inactive after multiple executions of the fault isolation process 300, it indicates that link 653 is open between controller 620 and controller 630, or that link 675 is open between ground distribution network 674 and ground distribution network 678.
When the set of inactive controllers includes controllers 610, 620, and 630, which are identified as inactive after multiple executions of the fault isolation process 300, it indicates that link 655 is open between controller 640 and controller 630, or that there is a wire short in the CAN bus 615.
When the set of inactive controllers includes controllers 610 and 630, which are identified as inactive after multiple executions of the fault isolation process 300, it indicates that link 663 is open between power distribution network 666 and power distribution network 664.
This isolation of faults in the CAN is illustrative. In this manner, the fault isolation process 300 can be employed to isolate a fault to a single location or a limited quantity of locations in the CAN 650.
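The illustrative results above can be collected into a table keyed by the set of inactive controllers, which makes explicit which faults remain indistinguishable from one another. The entries below are transcribed from the preceding paragraphs; the fault labels and the function name are hypothetical, and the sketch only regroups that information.

```python
from collections import defaultdict

FAULT_SIGNATURES = {
    "link 651 open": {610}, "link 669 open": {610},
    "link 677 open": {610}, "controller 610 silent": {610},
    "link 665 open": {620}, "link 679 open": {620}, "controller 620 silent": {620},
    "link 668 open": {630}, "link 673 open": {630}, "controller 630 silent": {630},
    "link 653 open": {610, 620}, "link 675 open": {610, 620},
    "link 655 open": {610, 620, 630}, "CAN bus 615 wire short": {610, 620, 630},
    "link 663 open": {610, 630},
}

def group_by_signature(signatures):
    """Invert fault -> inactive set into inactive set -> list of matching faults."""
    groups = defaultdict(list)
    for fault, inactive in signatures.items():
        groups[frozenset(inactive)].append(fault)
    return dict(groups)

GROUPS = group_by_signature(FAULT_SIGNATURES)
```

In this table an inactive set of controllers 610 and 630 maps to a single location (link 663), while an inactive set containing only controller 610 maps to four candidate locations.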
Identifying the candidate fault set Fc includes generating an off-line model of the CAN. The off-line model identifies all the functional nodes including software and hardware components that are involved in a travel path to transmit a message. Thus, message M1 originates from software 712 in controller 710 and includes controller 710, link 715, bus links 762, 763, 764, and 765, and link 755, and reaches controller 750. Message M2 originates from software 722 in controller 720 and includes controller 720, link 725, bus links 763, 764, and 765, and link 755, and reaches controller 750. Message M3 originates from software 732 in controller 730 and includes controller 730, link 735, bus links 764 and 765, and link 755, and reaches controller 750. Message M4 originates from software 742 in controller 740 and includes controller 740, link 745, bus link 765 and link 755, and reaches controller 750. The terms S1, S2, S3, and S4 can be employed to represent the sets of nodes including software components, controllers, and communication links involved in the travel paths of transmitting M1, M2, M3, and M4, respectively. That is, S1={712, 710, 715, 762, 763, 764, 765, 755, 750}; S2={722, 720, 725, 763, 764, 765, 755, 750}; S3={732, 730, 735, 764, 765, 755, 750}; S4={742, 740, 745, 765, 755, 750}. The on-line diagnostic monitors the occurrence of each of the messages Mj (j=1, . . . n) within a moving window of period PA, which is based upon a minimum transmission rate for the different controllers. Counting number Nj is associated with each of the messages Mj. When Nj is greater than 1, message Mj is identified as received, or otherwise identified as being lost, and identified as lost message Mk.
For each lost message Mk, the candidate fault set FNSk can be identified as those nodes associated with the lost message Mk, which is represented by Sk, less the nodes associated with all received message(s) Mi during the time period in question, which are represented by Si. This can be expressed as follows.
$$FNS_k = S_k - S_k \cap \Bigl(\bigcup_{i \in \text{Recd}} S_i\Bigr) \qquad [3]$$
Thus the candidate fault set FNS is the union of the candidate fault sets associated with each of the lost messages and this can be expressed as follows.
$$FNS = \bigcup_{k \in \text{Lost}} FNS_k \qquad [4]$$
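The two set expressions can be worked through with the travel paths of messages M1 through M4. The sketch below is illustrative: the path sets are transcribed from the message descriptions above (M4 traverses bus link 765 only), and the function name and the `PATHS` mapping are hypothetical.

```python
PATHS = {
    "M1": {712, 710, 715, 762, 763, 764, 765, 755, 750},
    "M2": {722, 720, 725, 763, 764, 765, 755, 750},
    "M3": {732, 730, 735, 764, 765, 755, 750},
    "M4": {742, 740, 745, 765, 755, 750},
}

def fault_candidates(paths, lost):
    """Return (per-message candidate sets FNS_k, overall candidate set FNS)."""
    # Nodes on the path of any received message are known to be working.
    received_nodes = set().union(*(paths[m] for m in paths if m not in lost))
    # S_k minus its overlap with the received paths, for each lost message.
    per_message = {k: paths[k] - received_nodes for k in lost}
    # FNS is the union over all lost messages.
    return per_message, set().union(*per_message.values())
```

When only M1 is lost, the candidates reduce to software 712, controller 710, link 715, and bus link 762, because every other node on the path of M1 also carried a message that arrived. When M1 and M2 are both lost, bus link 763 and the nodes specific to M2 join the candidate set.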
CAN systems are employed to effect signal communications between controllers in a system, e.g., a vehicle. The fault isolation process described herein permits location and isolation of a single fault, multiple faults, and intermittent faults in the CAN systems, including faults in a communications bus, a power supply and a ground network.
The disclosure has described certain preferred embodiments and modifications thereto. Further modifications and alterations may occur to others upon reading and understanding the specification. Therefore, it is intended that the disclosure not be limited to the particular embodiment(s) disclosed as the best mode contemplated for carrying out this disclosure, but that the disclosure will include all embodiments falling within the scope of the appended claims.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US12/53725 | 9/5/2012 | WO | 00 | 7/13/2015 |