The following description relates to communication systems in general and to distributed, fault-tolerant communication systems in particular.
Distributed, fault-tolerant communication systems are used, for example, in applications where a failure could possibly result in injury or death to one or more persons. Such applications are referred to here as “safety-critical applications.” One example of a safety-critical application is in a system that is used to monitor and manage sensors and actuators included in an airplane or other aerospace vehicle.
One architecture that is commonly considered for use in such safety-critical applications is the Time-Triggered Architecture (TTA). In a TTA system, multiple nodes communicate with one another over two replicated high-speed communication channels using, for example, the Time Triggered Protocol/C (TTP/C) or the FLEXRAY protocol. In some embodiments, at least one of the nodes in such a TTA system is coupled to one or more sensors and/or actuators over two replicated, low-speed serial communication channels using, for example, the Time Triggered Protocol/A (TTP/A).
In one configuration of such a TTA system, various nodes communicate with one another over two, replicated communication channels, each of which is implemented using a star topology. In such a configuration, each channel includes an independent, centralized bus guardian. Each such centralized bus guardian represents a single point of failure for the respective channel. Another configuration of a TTA system is implemented using a linear bus topology in which various nodes communicate with one another over two, replicated communication channels and where each node includes a separate, independent bus guardian for each communication channel to which that node is coupled. In other words, where two communication channels are used, each node includes two independent bus guardians. Providing multiple independent bus guardians within each node, however, may not be suitable for some applications (for example, due to the increased cost associated with providing multiple bus guardians within each node).
In one embodiment, a network comprises a plurality of nodes that are communicatively coupled to one another over first and second channels that form first and second rings, respectively. The network further comprises at least one self checking pair comprising at least two of the plurality of nodes. Each node is communicatively coupled via the first channel to a first neighbor node in a first direction and to a second neighbor node in a second direction. Each node is communicatively coupled via the second channel to the first neighbor node in the first direction and to the second neighbor node in the second direction. The two nodes of the self checking pair are neighbor nodes of one another. When each node relays a first relayed unit of data along the first channel in the first direction, that node relays information indicative of the integrity of the first relayed unit of data along with the first relayed unit of data. When each node relays a second relayed unit of data along the second channel in the second direction, that node relays information indicative of the integrity of the second relayed unit of data along with the second relayed unit of data. 
Each of the two nodes of the self checking pair, for a particular unit of data communicated on the first channel in the first direction and on the second channel in the second direction: sends, to the other of the two nodes included in the self checking pair, information about first and second instances of the particular unit of data received by that node from the first and second channels, respectively; receives, from the other of the two nodes included in the self checking pair, information about first and second instances of the particular unit of data received by that other node from the first and second channels, respectively; and selects, for use in processing performed by that node for the self checking pair, at least one of the first and second instances of the particular unit of data received by that node based on at least one of: information about the first and second instances received by that node from the first and second channels, respectively, and information about the first and second instances received by the other of the two nodes of the self checking pair from the first and second channels, respectively.
In another embodiment, a network comprises a plurality of nodes that are communicatively coupled to one another over first and second channels and at least one self checking pair comprising at least two of the plurality of nodes. Each node is communicatively coupled via the first channel to at least one first transmit-to node to which that node transmits data on the first channel and at least one first receive-from node from which that node receives data from the first channel. Each node is communicatively coupled via the second channel to at least one second transmit-to node to which that node transmits data on the second channel and at least one second receive-from node from which that node receives data from the second channel. A first of the two nodes of the self checking pair comprises the respective first receive-from node and the respective second transmit-to node for a second of the two nodes of the self checking pair. The second of the two nodes of the self checking pair comprises the respective second receive-from node and the respective first transmit-to node for the first of the two nodes of the self checking pair. When each node relays a first relayed unit of data along the first channel, that node relays information indicative of the integrity of the first relayed unit of data along with the first relayed unit of data. When each node relays a second relayed unit of data along the second channel, that node relays information indicative of the integrity of the second relayed unit of data along with the second relayed unit of data. 
Each of the two nodes of the self checking pair, for a particular unit of data communicated on the first channel and on the second channel: sends, to the other of the two nodes included in the self checking pair, information about first and second instances of the particular unit of data received by that node from the first and second channels, respectively; receives, from the other of the two nodes included in the self checking pair, information about first and second instances of the particular unit of data received by that other node from the first and second channels, respectively; and selects, for use in processing performed by that node for the self checking pair, at least one of the first and second instances of the particular unit of data received by that node based on at least one of: information about the first and second instances received by that node from the first and second channels, respectively; and information about the first and second instances received by the other of the two nodes of the self checking pair from the first and second channels, respectively.
Another embodiment comprises a method for use in a network comprising a plurality of nodes that are communicatively coupled to one another over first and second channels that form first and second rings, respectively. Each node is communicatively coupled via the first channel to a first neighbor node in a first direction and to a second neighbor node in a second direction. Each node is communicatively coupled via the second channel to the first neighbor node in the first direction and to the second neighbor node in the second direction. The network comprises at least one self-checking pair that includes two nodes that are neighbor nodes of one another. The method comprises relaying, by each of the plurality of nodes, along the first channel, a first unit of data received by the respective node on the first channel along with information indicative of the integrity of the first relayed unit of data. The method further comprises relaying, by each of the plurality of nodes, along the second channel, a second unit of data received by the respective node on the second channel along with information indicative of the integrity of the second relayed unit of data. 
The method further comprises, for a particular unit of data communicated on the first and second channels, at each of the two nodes of the self checking pair: sending, to the other of the two nodes included in the self checking pair, information about first and second instances of the particular unit of data received by that node from the first and second channels, respectively; receiving, from the other of the two nodes included in the self checking pair, information about first and second instances of the particular unit of data received by that other node from the first and second channels, respectively; and selecting, for use in processing performed by that node for the self checking pair, at least one of the first and second instances of the particular unit of data received by that node based on at least one of: information about the first and second instances received by that node from the first and second channels, respectively; and information about the first and second instances received by the other of the two nodes of the self checking pair from the first and second channels, respectively.
In another embodiment, a self checking pair comprises first and second nodes. Each of the first and second nodes comprises an interface to communicatively couple the respective node to at least first and second channels. The first and the second channels comprise first and second rings respectively. The first and second nodes are neighbor nodes of one another. For each unit of data relayed on the first and second channels, information indicative of the integrity of the relayed unit of data is relayed along with the relayed unit of data. For a particular unit of data communicated on the network: each of the first and second nodes exchange information about a first instance of the particular unit of data received from the first channel and about a second instance of the particular unit of data received from the second channel; and each of the first and second nodes of the self checking pair selects, for use in processing performed by the respective node, at least one of the first and second instances of the particular unit of data received by the respective node based on at least one of: information about the first and second instances received by that node from the first and second channels, respectively; and information about the first and second instances received by the other of the two nodes of the self checking pair from the first and second channels, respectively.
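The exchange-and-select step described above can be illustrated with a small sketch. The description does not fix a particular selection rule, so the rule used here (pick the first channel on which both members of the pair report a received instance with intact integrity information) is purely an assumption for illustration, as are the names `InstanceInfo` and `select_instance`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceInfo:
    """Hypothetical summary of one received instance of a unit of data."""
    received: bool      # an instance arrived on this channel
    integrity_ok: bool  # relayed integrity information reported no problem


def select_instance(mine: dict[int, InstanceInfo],
                    peers: dict[int, InstanceInfo]) -> Optional[int]:
    """Return the channel whose instance this node uses, or None.

    Assumed rule (not taken from the description): use the first channel
    on which BOTH pair members received an instance with good integrity.
    """
    for channel in (0, 1):
        a, b = mine[channel], peers[channel]
        if a.received and a.integrity_ok and b.received and b.integrity_ok:
            return channel
    return None
```

Because the rule depends only on the pair of reports and not on which node evaluates it, both members reach the same choice whenever the exchanged information arrives intact, which is the property a self checking pair would presumably want from any concrete rule.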
The details of one or more embodiments of the claimed invention are set forth in the accompanying drawings and the description below. Other features and advantages will become apparent from the description, the drawings, and the claims.
Like reference numbers and designations in the various drawings indicate like elements.
In the particular embodiment shown in
Embodiments of network 100 are implemented using various media access schemes. For example, the embodiment shown in
The eight nodes 102 shown in
In addition, as used herein, a “neighbor's neighbor node” (or just “neighbor's neighbor”) for a given node 102 is the neighbor node 102 of the neighbor node 102 of the given node 102. Each node 102 has two neighbor's neighbor nodes 102, one in the clockwise direction (also referred to here as the “clockwise neighbor's neighbor node” or “clockwise neighbor's neighbor”) and one in the counter-clockwise direction (also referred to here as the “counter-clockwise neighbor's neighbor node” or “counter-clockwise neighbor's neighbor”). For example, the two neighbor's neighbor nodes for node A are node G in the clockwise direction and node C in the counter-clockwise direction.
The two communication channels 106 are individually labeled in
As used here, when a link 108 is described as being connected “from” a first node 102 “to” a second node 102, the link 108 provides a communication path for the first node 102 to send data to the second node 102 over the link 108. That is, the direction of that unidirectional link 108 is from the first node 102 to the second node 102.
A link 108 is connected from each node 102 to that node's clockwise neighbor node 102. A link 108 is also connected from each node 102 to that node's clockwise neighbor's neighbor node 102. For example, a link 108 is connected from node A to node H and a link 108 is connected from node A to node G. These clockwise links 108 make up channel 0 and are shown in
A link 108 is connected from each node 102 to that node's counter-clockwise neighbor node 102. A link 108 is also connected from each node 102 to that node's counter-clockwise neighbor's neighbor node 102. For example, a link 108 is connected from node A to node B and a link 108 is connected from node A to node C. These counter-clockwise links 108 make up channel 1 and are shown in
The links 108 that connect a given node 102 to that node's respective clockwise and counter-clockwise neighbor nodes are also referred to here as “direct” links 108. The links 108 that connect a given node 102 to that node's respective clockwise and counter-clockwise neighbor's neighbors are referred to here as “skip” links 108.
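The neighbor, neighbor's neighbor, direct link, and skip link definitions above can be sketched in a few lines. The sketch assumes the eight nodes are labeled A through H in counter-clockwise order, which is consistent with the examples given (node A's clockwise neighbor is node H and its counter-clockwise neighbor is node B); the function names are hypothetical.

```python
# Node labels in counter-clockwise order (assumed labeling), so that
# A's counter-clockwise neighbor is B and A's clockwise neighbor is H.
RING = "ABCDEFGH"


def neighbor(node: str, hops: int, clockwise: bool) -> str:
    """Return the node `hops` positions away in the given direction."""
    step = -hops if clockwise else hops
    return RING[(RING.index(node) + step) % len(RING)]


def links(channel: int) -> list[tuple[str, str, str]]:
    """List (source, destination, kind) for each unidirectional link.

    Channel 0 is made up of the clockwise links and channel 1 of the
    counter-clockwise links, as stated in the description.
    """
    clockwise = (channel == 0)
    result = []
    for node in RING:
        result.append((node, neighbor(node, 1, clockwise), "direct"))
        result.append((node, neighbor(node, 2, clockwise), "skip"))
    return result
```

With eight nodes, each channel has sixteen unidirectional links: one direct link and one skip link leaving every node.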
In the particular embodiment shown in
In the particular embodiment shown in
In the embodiments described here, the transport-layer processing comprises at least two modes: an unsynchronized mode and a synchronized mode. When operating in a synchronized mode, the nodes 102 of network 100 are synchronized to a global time base and transmit in accordance with a TDMA media access scheme. With such a TDMA media access scheme, a schedule is used to determine when the nodes 102 in the network 100 transmit during a given schedule period or round. During a given schedule period, various nodes 102 in the network 100 are assigned a respective time slot in which to transmit. In other words, for any given time slot, the node 102 assigned to that time slot is allowed to transmit during that time slot (also referred to here as the “scheduled node” 102). In this embodiment, the scheduled node performs the processing described below in connection with
When the nodes 102 are operating in an unsynchronized mode, the nodes 102 have not yet synchronized to a global time base and are not yet transmitting in accordance with a TDMA schedule. At least a portion of the processing performed, in one embodiment, by the nodes 102 of the network 100 while operating in an unsynchronized mode is described, for example, in the 10/993,931 Application.
The current node 102 performs the processing of method 200 when the current node 102 determines that the current node 102, in accordance with the TDMA schedule, is allowed to transmit on the network 100 (block 202). Each node 102 in the network 100 maintains information necessary to make such a determination. In the embodiment of
When the current node 102 is allowed to transmit and the node 102 has data to transmit (checked in block 204), the current node 102 transmits a frame of data, along channel 0, to the current node's clockwise neighbor and clockwise neighbor's neighbor (block 206) and, along channel 1, to the current node's counter-clockwise neighbor and counter-clockwise neighbor's neighbor (block 208). The current node 102 transmits the frame to the current node's clockwise and counter-clockwise neighbors using the respective direct links 108. The current node 102 transmits the frame to the current node's clockwise and counter-clockwise neighbor's neighbors using the respective skip links 108. In one implementation of such an embodiment, the current node 102 includes a first transceiver that transmits the frame on channel 0 to the current node's clockwise neighbor and clockwise neighbor's neighbor and a second transceiver that transmits the frame on channel 1 to the current node's counter-clockwise neighbor and counter-clockwise neighbor's neighbor.
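The transmit processing of method 200 might be sketched as follows. The schedule representation (a list with one node label per time slot), the node labeling, and the function names are assumptions made for illustration only.

```python
RING = "ABCDEFGH"  # counter-clockwise order (assumed labeling)


def neighbor(node, hops, clockwise):
    """Return the node `hops` positions away in the given direction."""
    step = -hops if clockwise else hops
    return RING[(RING.index(node) + step) % len(RING)]


def method_200(node, slot, schedule, pending_frame):
    """Return the transmissions made in this slot as
    (channel, destination, link kind) tuples; empty if none are made."""
    # Block 202: is this node the scheduled node for the current slot?
    if schedule[slot % len(schedule)] != node:
        return []
    # Block 204: does the node have data to transmit?
    if pending_frame is None:
        return []
    return [
        (0, neighbor(node, 1, True), "direct"),   # block 206
        (0, neighbor(node, 2, True), "skip"),
        (1, neighbor(node, 1, False), "direct"),  # block 208
        (1, neighbor(node, 2, False), "skip"),
    ]
```

For a schedule that assigns slot 0 to node A, the sketch yields transmissions to nodes H and G on channel 0 and to nodes B and C on channel 1.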
Method 300 is performed by a node 102 that is operating in a synchronized mode in accordance with a TDMA schedule. Each node 102, in such an embodiment, performs the processing of method 300 when one of that node's neighbors is the scheduled node 102 for the current time slot. In the context of
The current node 102 performs the processing of method 300 when the current node 102 determines that one of the neighbors of the current node 102 is scheduled to transmit during the current time slot (checked in block 302). Such a neighbor is also referred to here in the context of
When the current node 102 determines that one of its neighbors is the scheduled node for the current time slot, the current node 102 only relays frames sourced from the scheduled neighbor that are received by the current node 102 from the scheduled neighbor via the direct link 108 that couples the scheduled neighbor to the current node 102. That is, if the current node 102 receives a frame that is sourced from a node 102 other than the scheduled neighbor, the current node 102 does not relay that frame.
When the current node 102 starts receiving a frame from the scheduled neighbor (checked in block 304), the current node 102 checks if the transmission complies with one or more policies that are implemented in the network 100. In the particular embodiment shown in
If the transmission complies with the temporal policy, the current node 102 checks if the transmission complies with one or more other policies (block 310). For example, in one embodiment, the current node 102 checks if the transmission complies with one or more semantic policies (for example, policies implementing semantic protocol state filtering). In another embodiment, where each frame includes a cyclic redundancy check (CRC) field that is calculated based on the contents of the frame, the current node 102 checks the CRC field to determine if any errors have been introduced into the frame in the course of transmitting the frame from the scheduled node to the current node 102. Another example of such a policy is an encoding layer enforcement policy. In another example, a frame-length policy is used and the current node 102 checks the length of the current frame (in such an example, failures to comply with the frame-length policy would, for example, be processed as described in connection with block 309 of
If the transmission fails to comply with one or more of the other policies, the current node 102 does not relay the transmission (block 308). In an alternative embodiment (shown in
Otherwise, if the transmission complies with all the policies, the current node 102 relays the current frame to the current node's next neighbor and next neighbor's neighbor along the channel from which the current frame is being received (block 312). For example, where the scheduled node is node A and the current node is node B, the current node relays the current frame to node C (node B's next neighbor along channel 1) and to node D (node B's next neighbor's neighbor along channel 1).
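The policy checks a neighbor of the scheduled node applies before relaying can be sketched as below. The concrete form of the temporal policy (a tolerance window around the expected slot start), the use of CRC-32 from `zlib`, and all names are assumptions chosen for illustration; the description does not specify them.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Frame:
    source: str        # node that sourced the frame
    start_time: float  # local time at which reception began
    payload: bytes
    crc: int           # CRC field carried in the frame


def complies_with_temporal_policy(frame, slot_start, tolerance):
    """Assumed temporal policy: reception must begin within a window
    of +/- `tolerance` around the expected start of the slot."""
    return abs(frame.start_time - slot_start) <= tolerance


def crc_ok(frame):
    """Assumed 'other policy': the CRC field matches the payload."""
    return zlib.crc32(frame.payload) == frame.crc


def should_relay(frame, scheduled_neighbor, slot_start, tolerance):
    """Decide whether a neighbor of the scheduled node relays a frame."""
    if frame.source != scheduled_neighbor:
        return False  # only frames sourced from the scheduled neighbor
    if not complies_with_temporal_policy(frame, slot_start, tolerance):
        return False  # temporal policy violated
    if not crc_ok(frame):
        return False  # another policy violated: do not relay
    return True       # all policies satisfied: relay along the channel
```

A frame that arrives on time with a valid CRC from the scheduled neighbor is relayed; a late frame, a corrupted frame, or a frame sourced from any other node is not.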
In other embodiments, the current node 102 checks if the transmission complies with other policies instead of or in addition to the ones described above. For example, in one such other embodiment, the current node 102 checks the directional integrity of the transmission by the scheduled node (for example, in the manner described below in connection with
Method 350 is performed by a node 102 that is operating in a synchronized mode in accordance with a TDMA schedule. Each node 102, in such an embodiment, performs the processing of method 350 when one of that node's neighbors is the scheduled node 102 for the current time slot. In the context of
The current node 102 performs the processing of method 350 when the current node 102 determines that one of the neighbors of the current node 102 is scheduled to transmit during the current time slot (checked in block 352). Such a neighbor is also referred to here in the context of
In method 350 (as in method 300 of
When the current node 102 starts receiving a frame from the scheduled neighbor (checked in block 354), the current node 102 relays the received frame to the current node's next neighbor and next neighbor's neighbor along the channel from which that frame is being received (block 356). For example, where the scheduled node is node A and the current node is node B, the current node relays the frame received from node A to node C (node B's next neighbor along channel 1) and to node D (node B's next neighbor's neighbor along channel 1).
The current node 102 performs the processing of method 400 when the current node 102 is not scheduled to transmit during the current time slot and neither of the current node's neighbors are scheduled to transmit during the current time slot (checked in block 402 of
When the current node 102 determines that the current node 102 is not scheduled to transmit during the current time slot and neither of the current node's neighbors are scheduled to transmit during the current time slot and the current node 102 starts to receive a frame from the current node's counter-clockwise neighbor on channel 0 (checked in block 404), the current node 102 compares the frame being received from the current node's counter-clockwise neighbor on channel 0 to any frame that is being received from the current node's counter-clockwise neighbor's neighbor on channel 0 (block 406). In the embodiment shown in
If the current node 102 does not receive a frame from the current node's counter-clockwise neighbor on channel 0 (for example, after a predetermined time-out period has elapsed) but starts to receive a frame from the current node's counter-clockwise neighbor's neighbor on channel 0 (checked in block 412), the current node 102 relays the frame that is being received from the current node's counter-clockwise neighbor's neighbor on to the current node's clockwise neighbor and clockwise neighbor's neighbor along the channel 0 (block 414). After that frame has been relayed, the current node 102 relays in or after that frame information indicating that there was a “mismatch” at the current node 102 for channel 0 (block 416). The current node 102 relays this information to the current node's clockwise neighbor and clockwise neighbor's neighbor along the channel 0. Because no frame was received from the counter-clockwise neighbor of the current node 102, it is not the case that a frame received from the counter-clockwise neighbor is identical to the frame received from the counter-clockwise neighbor's neighbor of the current node 102.
If the current node 102 does not receive a frame from the current node's counter-clockwise neighbor on channel 0 or from the current node's counter-clockwise neighbor's neighbor on channel 0, the current node 102 does not relay any data along channel 0 for the current time slot (block 418).
The current node 102 performs the same processing for frames received from channel 1. When the current node 102 determines that the current node 102 is not scheduled to transmit during the current time slot and neither of the current node's neighbors are scheduled to transmit during the current time slot and the current node 102 starts to receive a frame from the current node's clockwise neighbor on channel 1 (checked in block 420 of
If the current node 102 does not receive a frame from the current node's clockwise neighbor on channel 1 (for example, after a predetermined time-out period has elapsed) but starts to receive a frame from the current node's clockwise neighbor's neighbor on channel 1 (checked in block 428), the current node 102 relays the frame that is being received from the current node's clockwise neighbor's neighbor on to the current node's counter-clockwise neighbor and counter-clockwise neighbor's neighbor along the channel 1 (block 430). After that frame has been relayed, the current node 102 relays in or after that frame information indicating that there was a “mismatch” at the current node 102 for channel 1 (block 432). The current node 102 relays this information to the current node's counter-clockwise neighbor and counter-clockwise neighbor's neighbor along the channel 1. Because no frame was received from the clockwise neighbor of the current node 102, it is not the case that a frame received from the clockwise neighbor is identical to the frame received from the clockwise neighbor's neighbor of the current node 102.
If the current node 102 does not receive a frame from the current node's clockwise neighbor on channel 1 or from the current node's clockwise neighbor's neighbor on channel 1, the current node 102 does not relay any data along channel 1 for the current time slot (block 434).
In one example, the current node 102 is node A and node E is the node that is scheduled to transmit during the current time slot. In such an example, node A receives a frame from node B (node A's counter-clockwise neighbor) via the respective direct link 108 of channel 0 and compares this frame to any frame node A receives from node C (node A's counter-clockwise neighbor's neighbor) via the respective skip link 108 of channel 0. Node A relays the frame that is being received from node B and the information indicative of the results of the comparison to node H (node A's next neighbor along channel 0) and to node G (node A's next neighbor's neighbor along channel 0). In such an example, node A also receives a frame from node H (node A's clockwise neighbor) via the respective direct link 108 of channel 1 and compares this frame to any frame node A receives from node G (node A's clockwise neighbor's neighbor) via the respective skip link 108 of channel 1. Node A relays the frame received from node H and the information indicative of the results of the comparison to node B (node A's next neighbor along channel 1) and to node C (node A's next neighbor's neighbor along channel 1).
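The per-channel decision made in method 400 can be summarized in a short sketch. Frames are modeled as `bytes` objects and a missing frame as `None`; the function name and return shape are hypothetical.

```python
def relay_for_channel(direct_frame, skip_frame):
    """Return (frame_to_relay, mismatch) for one channel, or None when
    nothing is relayed in the current time slot.

    direct_frame: frame received from the neighbor on the direct link
    skip_frame:   frame received from the neighbor's neighbor on the
                  skip link
    """
    if direct_frame is not None:
        # Relay the direct-link frame; report a mismatch when the skip
        # link delivered a different frame or no frame at all.
        mismatch = skip_frame is None or direct_frame != skip_frame
        return direct_frame, mismatch
    if skip_frame is not None:
        # No direct-link frame: relay the skip-link frame and report a
        # mismatch, because there was nothing to compare it against.
        return skip_frame, True
    # Neither link delivered a frame: relay nothing on this channel.
    return None
```

The same function applies to both channels; only the roles of the clockwise and counter-clockwise links are swapped between channel 0 and channel 1.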
In the embodiments illustrated in
In the example shown in
The current node 102 includes a second direct link interface 510 that communicatively couples the current node 102 to the clockwise direct link 108 of channel 0, which is connected to the current node's clockwise neighbor. The current node 102 also includes a second skip link interface 512 that communicatively couples the current node 102 to the clockwise skip link 108 of channel 0, which is connected to the current node's clockwise neighbor's neighbor.
In the example shown in
For a given transmission during a given time slot, the current node 102 will typically start receiving respective frames on the first direct link interface 502 and the first skip link interface 504 at different times. For example, where the comparison and relaying processing is performed in connection with blocks 406-410 and 414-418 of
In the example shown in
In the particular example shown in
As data is received at the first direct link interface 502, the received data is written into the input end of the direct link FIFO buffer 506. Also, as data is received at the skip link interface 504, the received data is written into the input end of the skip link FIFO buffer 508. The determination as to whether a frame is being received on the first direct link interface 502 is made by detecting a start-of-frame delimiter in the data received from that interface 502. Likewise, the determination as to whether a frame is being received on the first skip link interface 504 is made by detecting a start-of-frame delimiter in the data received from that interface 504.
If a frame is being received on both the first direct link interface 502 and the first skip link interface 504, when both FIFO buffers 506 and 508 are half full, the de-skew and compare module 514 starts receiving bits from the respective output ends of the first and second FIFO buffers 506 and 508 and the transmitter 516 starts receiving bits from the output end of the first FIFO buffer 506. The de-skew and compare module 514, as it receives bits from the first and second FIFO buffers 506 and 508, performs the bit-by-bit comparison of the two received frames. The transmitter 516, as it receives bits from the first FIFO buffer 506, relays the received bits along channel 0 to the clockwise neighbor and clockwise neighbor's neighbor. When the de-skew and compare module 514 has compared the end of both frames, the de-skew and compare module 514 outputs, to the transmitter 516, a bit that indicates whether the two frames were or were not identical. The transmitter 516 receives the bit output by the de-skew and compare module 514 and “appends” the bit to the end of the relayed frame by transmitting the bit after the relayed frame.
If a frame is being received on the first direct link interface 502 but not on the first skip link interface 504, when the first FIFO buffer 506 is half full, the de-skew and compare module 514 and the transmitter 516 start receiving bits from the output end of the first FIFO buffer 506. The de-skew and compare module 514 outputs, to the transmitter 516, a bit that indicates that a mismatch has occurred for channel 0 at the current node 102. The transmitter 516, as it receives bits from the first FIFO buffer 506, relays the received bits along channel 0 to the clockwise neighbor and clockwise neighbor's neighbor. The transmitter 516 receives the bit output by the de-skew and compare module 514 and “appends” the bit to the end of the relayed frame by transmitting the bit after the relayed frame.
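The FIFO-based de-skew and compare behavior described above can be approximated with a tick-by-tick simulation. This is only a sketch under simplifying assumptions: one bit arrives per tick on each interface, the skew is modeled as a per-interface start delay, an absent skip-link frame is modeled as an empty bit list, and the result is returned as a flag rather than an appended bit (the description does not state the polarity of that bit). All names are hypothetical.

```python
from collections import deque


def deskew_and_relay(direct_bits, skip_bits,
                     direct_delay=0, skip_delay=0, depth=8):
    """Simulate the two FIFO buffers (depth >= 2 assumed).

    Returns (relayed_bits, identical): the direct-link bits in relay
    order and whether the two frames matched bit for bit.
    """
    direct_fifo, skip_fifo = deque(), deque()
    relayed = []
    identical = len(direct_bits) == len(skip_bits)
    started = False
    tick = 0
    while len(relayed) < len(direct_bits):
        i, j = tick - direct_delay, tick - skip_delay
        if 0 <= i < len(direct_bits):
            direct_fifo.append(direct_bits[i])  # write input end
        if 0 <= j < len(skip_bits):
            skip_fifo.append(skip_bits[j])
        if max(len(direct_fifo), len(skip_fifo)) > depth:
            raise OverflowError("skew exceeds FIFO capacity")
        # Reading starts once each FIFO is half full (or its frame has
        # been written completely, which covers short or absent frames).
        direct_ready = (len(direct_fifo) >= depth // 2
                        or i >= len(direct_bits) - 1)
        skip_ready = (len(skip_fifo) >= depth // 2
                      or j >= len(skip_bits) - 1)
        if not started and direct_ready and skip_ready:
            started = True
        if started:
            bit = direct_fifo.popleft()         # read output end
            if not skip_fifo or skip_fifo.popleft() != bit:
                identical = False               # bit-by-bit comparison
            relayed.append(bit)                 # transmitter relays bit
        tick += 1
    return relayed, identical
```

In this model the half-full threshold is what absorbs the arrival skew: whichever frame arrives first simply waits in its buffer until the other buffer has also filled to the halfway mark.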
In the case of processing performed for method 400 of
Embodiments of network 100 provide improved fault tolerance while the nodes 102 of the network 100 are operating in a synchronous mode. For example, embodiments of network 100 provide improved transport availability and improved transport integrity. Improved transport availability is provided by, for example, the use of the two, independent opposing communication channels 0 and 1. Data that is transmitted by a node 102 in the network 100 travels to each of the other nodes 102 in the network 100 via two independent communication paths. For example, data transmitted by node A of the network 100 travels to node E via a first path traveling counter-clockwise on channel 1 from node A to nodes B, C, D, and E and via a second path traveling clockwise on channel 0 from node A to nodes H, G, F, and E. As a result, despite any single point of failure on one of these paths, there will be another path by which data can successfully travel to node E.
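The independence of the two paths can be checked with a small sketch that walks the ring in each direction over direct links only. The node labeling (A through H in counter-clockwise order) and the function name are assumptions for illustration.

```python
RING = "ABCDEFGH"  # counter-clockwise order (assumed labeling)


def ring_path(src: str, dst: str, clockwise: bool) -> list[str]:
    """Return the nodes visited going from src to dst in one direction,
    following direct links only."""
    step = -1 if clockwise else 1
    path = [src]
    i = RING.index(src)
    while path[-1] != dst:
        i = (i + step) % len(RING)
        path.append(RING[i])
    return path


clockwise_path = ring_path("A", "E", clockwise=True)
counter_clockwise_path = ring_path("A", "E", clockwise=False)

# The two paths share only their endpoints, so a single failed relay
# node or link on one path leaves the other path intact.
shared = set(clockwise_path[1:-1]) & set(counter_clockwise_path[1:-1])
```

Here `clockwise_path` visits nodes H, G, and F on the way to node E, `counter_clockwise_path` visits nodes B, C, and D, and `shared` is empty.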
In the embodiment shown in
The links 108 of channel 0 and channel 1 that are affected by node A's transmission are shown in
Data transmitted by node E along channel 0 is received and relayed by nodes D, C, and B because the links 108 in this part of channel 0 are not affected by node A's transmissions. Likewise, data transmitted by node E along channel 1 is received and relayed by nodes F, G, and H because the links 108 in this part of channel 1 are not affected by node A's transmissions. The links 108 of channel 0 and channel 1 that are not affected by node A's transmissions and over which node E is able to transmit successfully are shown in
In another example, a slightly-off-specification (SOS) failure or fault occurs in the communication network 100 of
In this example, a SOS failure occurs in node A. In such a failure, during the time slot assigned to node A for node A to transmit, faulty node A transmits at a point in time that would (if node A's transmissions were relayed fully around the ring 104) result in nodes B, C, H and G receiving the transmission as correct and nodes D, E, and F receiving the transmission as incorrect.
Nodes B and H, as neighbors of node A, will check if the transmission by node A complies with the temporal policy implemented in the network 100. In such an example, node B will determine that the frame received from node A on channel 1 does not comply with the temporal policy and, therefore, will not relay the frame any further along channel 1. Likewise, node H will determine that the frame received from node A on channel 0 does not comply with the temporal policy and, therefore, will not relay the frame any further along channel 0. In this way, the impact of such SOS failures is reduced.
When a given node 102 (referred to here in the context of
When the transmitting node 102 transmits, both neighbors of the transmitting node exchange the respective frames they receive from the transmitting node over the skip links 108 that communicatively couple the neighbors to one another. As shown in
The other neighbor of the transmitting node forwards the frame it receives from the transmitting node to the current node 102 over the other skip link 108 that communicatively couples the other neighbor to the current node 102 in the current channel. In the context of
The current node 102 relays the frame it is receiving from the transmitting node 102 along the current channel (block 1210). For example, when the transmitting node 102 is the clockwise neighbor of the current node 102, the current node 102 receives the frame from the transmitting node 102 via channel 1 and relays the received frame along channel 1 to the counter-clockwise neighbor and neighbor's neighbor of the current node 102. When the transmitting node 102 is the counter-clockwise neighbor of the current node 102, the current node 102 receives the frame from the transmitting node 102 via channel 0 and relays the received frame along channel 0 to the clockwise neighbor and neighbor's neighbor of the current node 102.
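The relay rule just described can be sketched as a small helper. The sketch assumes that ring positions are listed in clockwise order and uses integer node identifiers; the function name `relay_targets` is hypothetical.

```python
def relay_targets(ring, current, channel):
    """Return (neighbor, neighbor's neighbor) to which `current` relays a
    frame received on `channel`.

    `ring` lists the nodes in clockwise order (an assumption of this
    sketch). Channel 0 frames are relayed to the clockwise side and
    channel 1 frames to the counter-clockwise side.
    """
    if channel not in (0, 1):
        raise ValueError("channel must be 0 or 1")
    n = len(ring)
    i = ring.index(current)
    step = 1 if channel == 0 else -1
    return ring[(i + step) % n], ring[(i + 2 * step) % n]


ring = list(range(8))  # hypothetical node identifiers in clockwise order
print(relay_targets(ring, 3, 0))
print(relay_targets(ring, 0, 1))
```

The second target in each pair corresponds to the skip link, which lets a frame bypass the immediate neighbor.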
After the entire frame transmitted by the transmitting node has been relayed by the current node 102 and the comparison between that frame and the other frame forwarded to the current node 102 by the other neighbor is complete, the current node 102 relays information indicative of the results of that comparison in or after the frame received from the transmitting neighbor along the current channel (block 1212). In one embodiment, the information indicative of the results of the comparison comprises a one-bit, appended integrity field that the current node 102 appends to the frame received from the transmitting node in the manner described above in connection with
For example, where the transmitting node is node A of
In one embodiment, higher-layer functionality implemented on top of the transport-layer functionality described above in connection with
In each self-checking pair 700, the two nodes 102 of each pair are required to act, at the application layer, in a replica-deterministic fashion such that the output of each node 102 is bit-for-bit identical. This enables straightforward bit-for-bit voting. In one embodiment, where pure computation-based replication is implemented, replica-determinism requires that the nodes 102 in the pair have both an identical internal state vector (that is, identical history state) and have agreed upon an input-data vector that is used for the next frame of computation. Typically, nodes of a self checking pair perform one or more comparison operations (in an operation commonly referred to as a “voting” or “selection” operation) in order to determine which of multiple instances of received data should be used in the processing performed by the self checking pair. In the embodiment shown in
Method 800 is performed by the current node 102 for each transmitted frame that is causal to a replica-determinate computation performed by the current pair 700. When a frame is transmitted in network 100, transport-layer functionality implemented at the current node 102 supplies to the application-layer functionality implemented on the current node 102 up to two instances of the transmitted frame—one instance “received” from channel 0 (block 802) and one instance “received” from channel 1 (block 804). In the particular embodiment shown in
For each instance of a transmitted frame received with integrity from a channel, the current node 102 assumes that all previous nodes 102 along that channel have received the same data for the transmitted frame from that channel. For example, when node A receives an instance of a transmitted frame with integrity from channel 0, node A assumes that node B (a previous node along channel 0) has received from channel 0 the same data for the transmitted frame. Likewise, when node B receives an instance of the transmitted frame with integrity from channel 1, node B assumes that node A (a previous node along channel 1) has received the same data for the transmitted frame from channel 1.
In the embodiment shown in
In the embodiment shown in
In one embodiment, the syndrome exchange occurs during a predetermined time slot. One example of a TDMA schedule 900 for the network 100 of
In one embodiment, for each frame that is transmitted on the network 100 that is causal to a replica-determinate computation performed by the current pair 700, the generation and exchange of syndromes by the members of the current pair 700 occurs during the particular time slot in which such frame is transmitted. In another embodiment, the members of a current pair 700 receive all frames for a given schedule round then generate and exchange a single composite syndrome for all the received frames that are causal to a replica-determinate computation performed by the current pair 700.
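The two exchange options can be contrasted with a short sketch. The encoding below is purely an assumption made for illustration: each per-frame syndrome is taken to be two bits recording whether the frame was received with integrity from channel 0 and from channel 1, and the composite syndrome simply packs the per-frame values for one schedule round. The function names are hypothetical.

```python
from typing import Sequence


def frame_syndrome(integrity_ch0: bool, integrity_ch1: bool) -> int:
    """Assumed two-bit syndrome: bit 1 is channel 0, bit 0 is channel 1."""
    return (int(integrity_ch0) << 1) | int(integrity_ch1)


def composite_syndrome(per_frame: Sequence[int]) -> int:
    """Pack the per-frame syndromes of one round into a single value,
    first frame in the most significant position."""
    value = 0
    for s in per_frame:
        value = (value << 2) | (s & 0b11)
    return value


# Per-slot exchange: one small syndrome for each causal frame.
round_syndromes = [
    frame_syndrome(True, False),
    frame_syndrome(True, True),
    frame_syndrome(False, False),
]
# Per-round exchange: a single composite value sent once.
composite = composite_syndrome(round_syndromes)
print(round_syndromes, bin(composite))
```

The per-slot form lets the pair agree on each frame as it arrives, while the composite form trades that immediacy for a single exchange per round.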
In the embodiment shown in
Method 1000 is performed by the current node 102 for each transmitted frame that is causal to a replica-determinate computation performed by the current pair 700 in order to determine which instance of such a transmitted frame received by each member node 102 of the pair 700 should be used in the computation. Method 1000, in the embodiment shown in
In the embodiment shown in
In one implementation, the nature of the source of a transmitted frame is identified by the current node 102 using the TDMA schedule. For example, as shown in
In the embodiment of method 1000 shown in
If neither instance of the transmitted frame was received with integrity by the current node 102 and the instance of the transmitted frame received from channel 0 is identical to the instance of the transmitted frame received from channel 1 (checked in block 1008), the current node 102 selects either instance for use in performing the replica-deterministic computation (block 1010). When the source of the transmitted frame is a self-checking pair 700 and the instance received by the current node 102 from channel 0 matches the instance received by the current node 102 from channel 1, the current node 102 assumes that both instances received by the other member node 102 of the current pair 700 are identical to each other and are identical to the instances received by the current node 102. Accordingly, in such a situation, each of the member nodes 102 of the current pair 700 can choose the instance received from either channel, and the member nodes 102 of the current pair 700 need not necessarily select the instance received from the same channel. In one implementation, there is a bias in favor of one channel (though such a bias is not required in such a case).
If both instances of the transmitted frame were received by the current node 102 without integrity and the two instances are not identical, the current node 102 takes some default action (block 1012). The particular default action taken by the current node 102 is typically application dependent. For example, in one implementation of such an embodiment, the current node 102 does not perform the replica-deterministic computation. In another implementation, the current node 102 selects some known-good data in place of the transmitted frame for use in performing the replica-deterministic computation (for example, the last known good frame from the same source). In other implementations, other default actions are taken. In such a situation, the current node 102 assumes that the other member node 102 of the current pair 700 has not received an instance of the transmitted frame with integrity from either of the channels and that the two instances received by the other member node 102 are not identical, and, therefore, the other member node 102 will take the same default action as the current node 102 for that transmitted frame.
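The two no-integrity cases described above (blocks 1008 through 1012) can be sketched as follows. The function name is hypothetical, and the default action shown (falling back to the last known good frame, or returning `None` so that the computation is skipped) is only one of the application-dependent choices mentioned in the text. Instances received with integrity are outside the scope of this sketch.

```python
from typing import Optional


def select_without_integrity(inst_ch0: bytes, inst_ch1: bytes,
                             last_known_good: Optional[bytes] = None
                             ) -> Optional[bytes]:
    """Selection when neither instance was received with integrity.

    Identical instances: either may be used (this sketch is biased
    toward channel 0). Differing instances: take the default action,
    here the last known good frame, or None to skip the computation.
    """
    if inst_ch0 == inst_ch1:
        return inst_ch0
    return last_known_good


same = select_without_integrity(b"\x01\x02", b"\x01\x02")
fallback = select_without_integrity(b"\x01\x02", b"\x01\x03", last_known_good=b"\x00\x00")
skipped = select_without_integrity(b"\x01\x02", b"\x01\x03")
print(same, fallback, skipped)
```

Because both members of a pair apply the same rule to what is assumed to be the same pair of instances, they arrive at the same input for the replica-deterministic computation.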
In the embodiment of method 1000 shown in
In an alternative embodiment, the directional integrity functionality described above in connection with
In the embodiment shown in
If the current node 102 and the other member node 102 did not both receive an instance of the transmitted frame from the same channel with integrity and the current node 102 and the other member node 102 did not both receive identical instances on channels 0 and 1, the current node 102 takes some application-dependent default action (block 1012). In the embodiment shown in
In the embodiment shown in
Such history-state CRC functionality, in one embodiment, is used to ensure that the two member nodes 102 of a pair 700 rendezvous successfully on power up. For example, in one implementation, each member node 102 waits until the history state is agreed upon (checked via exchanged CRC values) before commencing replica-deterministic computation (or other processing). In other embodiments, the syndrome includes other information. For example, in one such other embodiment, the syndromes include software and version identifiers. These identifiers are used, for example, in power on processing to ensure that the member nodes 102 of a pair 700 have the same software executing thereon. Once the member nodes 102 of a pair 700 verify that they have the same history state (and/or software), the functionality described above in connection with
When a self checking pair 700 of
Where the precision of a globally agreed fault-tolerant time base is a large number of bit cells (on the order of 1 microsecond to 5 microseconds), a local rendezvous between the two members of the pair is used to achieve the required level of synchronization. In one implementation, a “halt-release” protocol using the direct links 108 is used. In such an implementation, when a self checking pair 700 transmits, the “faster” member node 102 of the pair 700 initially transmits an IDLE preamble at the beginning of the assigned time slot. The faster node 102 continues to send such an IDLE preamble until the faster node 102 detects that the “slower” member node 102 has started its transmission. The slower node, detecting the presence of the faster node 102, sends a minimal preamble, which is only long enough for the faster node to detect and align the start of the faster node's real transmission. The precise time, in one implementation, is configured a priori using a suitable parameter in the global TDMA schedule table. Using this approach (or similar approaches), the time difference between the two member nodes 102 of a transmitting self checking pair 700 can be closely aligned to within the expected nominal propagation delay of the network 100 (for example, around one to three bit cells). By performing such a rendezvous function to closely align the transmissions of the member nodes 102 of a self-checking pair 700, the de-skew and comparison functionality implemented at the other nodes 102 of the network 100 can be utilized without requiring an increase in the FIFO buffer sizes.
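A highly simplified timing model of such a halt-release rendezvous is sketched below. All quantities are in bit cells, and the function name, the parameters, and the assumption that the residual skew equals the one-way delay on the direct link are introduced here for illustration only.

```python
def halt_release(fast_ready: int, slow_ready: int,
                 min_preamble: int, link_delay: int):
    """Return (idle_cells_sent_by_faster_node, residual_skew).

    Simplified model: the faster node sends IDLE from fast_ready until it
    has observed the slower node's minimal preamble, which reaches it
    link_delay bit cells after the slower node starts sending.
    """
    if slow_ready < fast_ready:
        raise ValueError("slow_ready must not precede fast_ready")
    # The slower node sends its minimal preamble, then its real data.
    slow_data_start = slow_ready + min_preamble
    # The faster node aligns its real data to the end of the observed preamble.
    fast_data_start = slow_ready + link_delay + min_preamble
    idle_cells = fast_data_start - fast_ready
    skew = fast_data_start - slow_data_start
    return idle_cells, skew


# Hypothetical numbers: the members are 40 bit cells apart before the rendezvous.
idle_cells, skew = halt_release(fast_ready=0, slow_ready=40,
                                min_preamble=8, link_delay=2)
print(idle_cells, skew)
```

In this model the initial 40 bit cell offset is absorbed by the IDLE preamble, and the remaining skew is bounded by the link delay rather than by the precision of the global time base.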
In other embodiments, in addition to or instead of a pure computation self checking pair configuration, the direct links 108 between two neighbor nodes 102 are used as “private” channels between those two neighbor nodes 102 in order to exchange and/or agree on other types of data such as local sensor data. In such an embodiment, the entire raw data is exchanged between the two neighbor nodes 102. As with the exchange of the syndromes in time slot 902 of
Also, in other embodiments, other hybrid self-checking pair schemes are implemented in which only a subset of the nodes 102, tasks and/or transmissions operate in a replica-determinate fashion. For example, as noted above in connection with
The systems, devices, methods, and techniques described here can be implemented in nodes that implement various types of protocols (for example, time-triggered protocols such as TTP/C or FLEXRAY).
The TTP/C controller 1112 includes a communication network interface (CNI) 1116 that serves as an interface between the host 1110 and the other components of the TTP/C controller 1112. In the embodiment shown in
The TTP/C controller unit 1120 provides functionality necessary to implement the TTP/C protocol. In one implementation of such an embodiment, the TTP/C controller unit 1120 is implemented using a programmable processor (for example, a microprocessor) that is programmed with instructions to carry out the functionality performed by the TTP/C controller unit 1120. In such an embodiment, program instruction memory 1126 (also referred to here as the program memory 1126) is coupled to the TTP/C controller unit 1120. Program instructions that are executed by the TTP/C controller unit 1120 are stored in the program memory 1126. In one implementation, the program memory 1126 is implemented using a read only memory device or a non-volatile memory device such as a flash memory device.
The TTP/C controller 1112 also includes message descriptor list (MEDL) memory 1128 in which configuration information for a time-division multiple access (TDMA) schedule, operating modes, and clock synchronization parameters are stored. The MEDL memory 1128 is typically implemented using, for example, a flash memory device and/or static random access memory (SRAM) device. The sizes of the CNI memory 1118, the program memory 1126, and the MEDL memory 1128 are selected based on the specific needs of the application software 1114 executing on the host 1110, the program instructions executing on the TTP/C controller unit 1120, and/or a bus guardian 1132 (described below). Moreover, although the CNI memory 1118, the program memory 1126, and the MEDL memory 1128 are shown in
A single bus guardian 1132 serves as an interface between the TTP/C controller 1112 and the links 108. In one implementation of the embodiment shown in
Data received by the bus guardian 1132 from the links 108 is passed to the TTP/C controller 1112 for processing thereby in accordance with the TTP/C protocol. Data that is to be transmitted by the TTP/C controller 1112 is passed by the TTP/C controller unit 1120 to the bus guardian 1132. The bus guardian 1132 determines when the TTP/C controller 1112 is allowed to transmit on the links 108 and when to relay data received from the links 108. In one implementation, the bus guardian 1132 implements at least a portion of the functionality described above in connection with
Although the TTP/C controller 1112 and the bus guardian 1132 are shown as separate components in
The systems, devices, methods, and techniques described here may be implemented in networks having network topologies other than the particular braided-ring topology illustrated in
Moreover, at least some of the systems, devices, methods, and techniques described here may be implemented in networks in which fewer inter-node connections are provided between the various nodes of the network. One example of such a network is a network that comprises two “simplex” ring channels. One such embodiment is implemented in a manner similar to that shown in
Furthermore, it is to be understood that the various systems, devices, methods, and techniques described here need not all be implemented together in a single network and that various combinations of such systems, devices, methods, and techniques can be implemented.
The methods and techniques described here may be implemented in digital electronic circuitry, or with a programmable processor (for example, a special-purpose processor or a general-purpose processor such as a computer), firmware, software, or in combinations of them. Apparatus embodying these techniques may include appropriate input and output devices, a programmable processor, and a storage medium tangibly embodying program instructions for execution by the programmable processor. A process embodying these techniques may be performed by a programmable processor executing a program of instructions to perform desired functions by operating on input data and generating appropriate output. The techniques may advantageously be implemented in one or more programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device. Generally, a processor will receive instructions and data from a read-only memory and/or a random access memory. Storage devices suitable for tangibly embodying computer program instructions and data include all forms of non-volatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and DVD disks. Any of the foregoing may be supplemented by, or incorporated in, specially-designed application-specific integrated circuits (ASICs).
A number of embodiments of the invention defined by the following claims have been described. Nevertheless, it will be understood that various modifications to the described embodiments may be made without departing from the spirit and scope of the claimed invention. Accordingly, other embodiments are within the scope of the following claims.
This application is a continuation-in-part of the following (all of which are hereby incorporated herein by reference): U.S. patent application Ser. No. 10/993,931, filed Nov. 19, 2004, entitled “UNSYNCHRONOUS MODE BROTHER'S KEEPER BUS GUARDIAN FOR A RING NETWORKS” (which is also referred to here as the “10/993,931 Application”), which claims the benefit of U.S. Provisional Application No. 60/523,892, filed on Nov. 19, 2003, and U.S. Provisional Application No. 60/523,865, filed on Nov. 19, 2003 (both of which are incorporated herein by reference); U.S. patent application Ser. No. 10/994,209, filed Nov. 19, 2004, entitled “CLIQUE AGGREGATION IN TDMA NETWORKS,” which claims the benefit of U.S. Provisional Application No. 60/523,892, filed on Nov. 19, 2003, and U.S. Provisional Application No. 60/523,865, filed on Nov. 19, 2003; U.S. patent application Ser. No. 10/993,936, filed Nov. 19, 2004, entitled “SYNCHRONOUS MODE BROTHER'S KEEPER BUS GUARDIAN FOR A TDMA BASED NETWORK,” which claims the benefit of U.S. Provisional Application No. 60/523,892, filed on Nov. 19, 2003, and U.S. Provisional Application No. 60/523,865, filed on Nov. 19, 2003; U.S. patent application Ser. No. 10/993,933, filed Nov. 19, 2004, entitled “HIGH INTEGRITY DATA PROPAGATION IN A BRAIDED RING,” which claims the benefit of U.S. Provisional Application No. 60/523,892, filed on Nov. 19, 2003, and U.S. Provisional Application No. 60/523,865, filed on Nov. 19, 2003; and U.S. patent application Ser. No. 10/993,932, filed Nov. 19, 2004, entitled “DIRECTIONAL INTEGRITY ENFORCEMENT IN A BIDIRECTIONAL BRAIDED RING NETWORK,” which claims the benefit of U.S. Provisional Application No. 60/523,892, filed on Nov. 19, 2003, and U.S. Provisional Application No. 60/523,865, filed on Nov. 19, 2003. This application is related to U.S. patent application Ser. No. 10/993,162, filed Nov. 19, 2004, entitled “MESSAGE ERROR VERIFICATION USING CHECKING WITH HIDDEN DATA,” which is also referred to here as the “10/993,162 Application” and is incorporated herein by reference.
Number | Date | Country |
---|---|---|
20050152379 A1 | Jul 2005 | US |

Number | Date | Country |
---|---|---|
60523892 | Nov 2003 | US |
60523865 | Nov 2003 | US |

Relation | Number | Date | Country |
---|---|---|---|
Parent | 10993931 | Nov 2004 | US |
Child | 11010249 | | US |
Parent | 10994209 | Nov 2004 | US |
Child | 10993931 | | US |
Parent | 10993936 | Nov 2004 | US |
Child | 10994209 | | US |
Parent | 10993933 | Nov 2004 | US |
Child | 10993936 | | US |
Parent | 10993932 | Nov 2004 | US |
Child | 10993933 | | US |
Parent | 10993162 | Nov 2004 | US |
Child | 10993932 | | US |