The presently disclosed subject matter relates to wireless mesh network technology, and in particular to such networks including firmware image update functionality for network nodes.
Wireless mesh networks have been employed to good advantage in various environments and often are used to collect and report usage (consumption) and/or generation information for various utilities including, for example, water, gas, and electricity usage. Such mesh networks may incorporate a number of devices (nodes) that operate based on firmware stored within the device (node). In some instances it is desirable from time to time to update such stored firmware for varying purposes, such as adding functionality to the node, correcting (patching) errors in existing firmware, or for other reasons.
Generally, firmware (software) updates are accomplished by broadcasting firmware packets to the nodes. Broadcast packets are not acknowledged, and because such broadcasts are often conducted in a noisy RF environment, not every packet will be received by every node. Furthermore, the broadcast efficiency depends on the environment and is not easy to predict. In addition, the ever-changing nature of interference levels and network traffic load will make the broadcast efficiency vary with time.
In previously known arrangements, following the broadcast of a number of packets corresponding to, for example, a new or replacement firmware image, the network head (management system) will interrogate the various network nodes to find out which packets are missing in each node. The missing packets are then retransmitted either through a broadcast mechanism or through individual point-to-point communications. As different nodes are usually missing different packets, this process may entail the retransmission of a very large number of packets, slowing down the whole process.
In some systems, redundant broadcasts including systematic repetition of each packet are employed in an attempt to mitigate problems associated with broadcast update techniques. Such redundancy increases the probability of reception of the packets but slows down the whole process.
In view of these identified issues, it would be advantageous to provide a firmware (software) updating (downloading) arrangement that provides improved packet reception reliability and recovery.
While various implementations of firmware updating systems have been developed, and while various combinations of error correcting techniques have been developed, no design has emerged that generally encompasses all of the desired characteristics as hereafter presented in accordance with the subject technology.
A full and enabling disclosure of the presently disclosed subject matter, including the best mode thereof, directed to one of ordinary skill in the art, is set forth in the specification, which makes reference to the appended figures, in which:
Repeat use of reference characters throughout the present specification and appended drawings is intended to represent the same or analogous features, elements, or steps.
Mesh networks and their associated protocol architecture may be described as based on a tree with four kinds of elements, spread on cells, generally represented by present
In such tree hierarchy, just below the Collection Engine stand routers of the cells, referred to as the Cell Relays. There is one Cell Relay for each cell and it is the gateway between individual meters in the cell and the Collection Engine. The Cell Relay contains a routing table of all the meters in its cell. It can also forward data in the two directions, that is, between the Collection Engine and the endpoints. It also assumes the role of synchronizing the cell.
At the bottom of such tree are located so-called Endpoints (EPs). They can transmit and receive metering information. In addition, each one of them can act as a relay for distant endpoints with no additional hardware.
The last indicated module of such four kinds of elements is the Walk-By unit, a Zigbee (or other communications technology) handheld that can communicate with orphan endpoints or configure protocol parameters for endpoints. Therefore, an exemplary network preferably uses three media, which are an RF link, a Zigbee RF Link, and a TCP/IP link, all as represented per present
Major components of AMS 100 include exemplary respective meters (endpoints) 142, 144, 146, 148, 152, 154, 156, and 158; one or more respective radio-based networks including RF neighborhood area network (RF NAN) 162 and its accompanying Radio Relay 172, and power line communications neighborhood area network (PLC NAN) 164 and its accompanying PLC Relay 174; an IP (internet protocol) based Public Backhaul 180; and a Collection Engine 190. Other components within exemplary AMS 100 may include a utility LAN (local area network) 192 and firewall 194 through which communications signals to and from Collection Engine 190 may be transported from and to respective exemplary meters 142, 144, 146, 148, 152, 154, 156, and 158 or other devices including, but not limited to, Radio Relay 172 and PLC Relay 174.
AMS 100 is configured to be transparent in a transportation context, such that exemplary respective meters 142, 144, 146, 148, 152, 154, 156, and 158 may be interrogated using Collection Engine 190 regardless of what network infrastructure exists in-between or among such components. Moreover, due to such transparency, the meters may also respond to Collection Engine 190 in the same manner.
As represented by the illustration in
Generally when new or upgraded firmware is to be installed within a system 400, an image 410 of the firmware to be downloaded will be provided to an Advanced Metering System (AMS) Collection Engine 412 as a binary image file. Further discussion of Collection Engine 412 is included herewith but for the present it is noted that Collection Engine 412 is responsible for breaking up the single binary image into a series 420 of discrete blocks 422 that can be distributed across a communications arrangement such as an RF LAN, or other media. In an exemplary embodiment, an ANSI C12.22 compliant media may be used. Such blocks 422 have previously contained information to verify the block's integrity, as well as a block identifier, which is represented in
In general, when transferring blocks, each broken down, discrete block 422 is in its entirety preferably written into a record in a manufacturer's table for firmware images. End devices 440 are configured to evaluate such blocks 422 to determine their discrete integrity by using their block level hashes. The end devices can also validate that such blocks 422 are assembled (that is, reassembled) into the correct order. Finally, each end device is able to evaluate the integrity of the overall image by evaluating the CRC (Cyclic Redundancy Check) or hash for the entire image.
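The block handling described above can be sketched as follows. This is only an illustrative sketch, not the disclosed implementation: the `Block` record, the function names, and the use of CRC-32 as both the block-level and the image-level check value are assumptions made for the example, since the text does not fix a particular hash or CRC.

```python
import zlib
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Block:
    block_id: int   # position of the block within the image
    payload: bytes  # firmware bytes carried by this block
    check: int      # block-level check value (CRC-32 here, an assumption)


def split_image(image: bytes, block_size: int) -> List[Block]:
    """Break one binary image into discrete, individually verifiable blocks."""
    blocks = []
    for block_id, offset in enumerate(range(0, len(image), block_size)):
        payload = image[offset:offset + block_size]
        blocks.append(Block(block_id, payload, zlib.crc32(payload)))
    return blocks


def block_is_intact(block: Block) -> bool:
    """End-device check of a single block against its block-level value."""
    return zlib.crc32(block.payload) == block.check


def reassemble(received: Iterable[Block], block_count: int,
               image_check: int) -> Optional[bytes]:
    """Rebuild the image from blocks received in any order.

    Returns None if a block is missing or corrupt, or if the
    whole-image check fails.
    """
    good = {b.block_id: b.payload for b in received if block_is_intact(b)}
    if set(good) != set(range(block_count)):
        return None
    image = b"".join(good[i] for i in range(block_count))
    return image if zlib.crc32(image) == image_check else None
```

Because each block carries its own identifier, `reassemble` does not depend on arrival order, which is the property relied upon when blocks are rebroadcast or transferred point-to-point.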
The prior basic process for transferring firmware image blocks 422 involved, in part, functionality similar to that used for reading data from meters. A broadcast containing the image blocks 422 was sent to meters 440. Meters 440 indicated whether they had successfully received the image blocks 422. Meters that did not respond were retried in a recovery process to make up for any failures. Because of the critical nature of firmware images, and the large number of blocks involved, some additional control and feedback mechanisms may be desired in some instances, to logistically handle the volume of traffic.
Managing the transport of firmware blocks 422 in an environment which encounters or involves unreliable media becomes critical when transferring firmware images. When transferring blocks across an RF LAN, for example, it is relatively likely that at least one node within a given cell will fail to successfully receive a block. Such circumstances have been addressed in two manners. First, it was important that blocks be able to be transmitted and received in any order. Second, depending on the practical reliability of the underlying network, it may have been practiced to broadcast a given block several times before resorting to point-to-point transfers of image blocks.
With further reference to
When in such firmware download mode, the device will report the number of blocks it has successfully received as part of any daily read requests. Additionally, being placed in firmware download mode resets to zero a block counter of such device. Moreover, the command includes instructions to the end devices indicating that no direct acknowledgements on the part of the meters should be made. Rather, devices acknowledge such command by reporting their success count as part of the next interrogation cycle.
Collection Engine 412 is responsible for evaluating, based on the presence of the firmware block success count, whether all of the targeted nodes have successfully entered firmware download mode. Nodes that have not switched to firmware download mode are then individually contacted by the Collection Engine 412.
Once the target nodes are in firmware download mode, Collection Engine 412 will begin broadcasting firmware blocks 422 to the target nodes 440. As an alternative to transmission of the firmware blocks 422 exclusively by Collection Engine 412, it may be desirable to transfer the firmware image 410 to the cell relays 430 and then send a command to instruct them to broadcast the firmware image 410 within their respective cell. Such alternative method would be one approach to reducing public carrier back-haul costs and to allowing cell relays to better manage bandwidth within their cells.
Completion of the broadcast transfers is a process that may take several days, or even weeks, depending on whether it is being done in conjunction with other operations. In any event, after such completion, Collection Engine 412 begins evaluating the block success count of each of the target nodes. When a node has a complete set of blocks, it will record a special event in the meter history log indicating such successful completion. Most nodes should have a complete set of blocks once the broadcast transfers are complete. Nodes that are still missing blocks will need to have them transferred point-to-point. Nodes that have excessive missing blocks after the broadcast process is complete may be flagged for possible maintenance or replacement as being potentially defective.
To facilitate point-to-point transfers, Collection Engine 412 will call a second stored procedure in the device. Such second procedure, a manufacturer's stored procedure, will provide a list of missing blocks, by block number. In an exemplary embodiment, the block list will include a predetermined maximum number of blocks, and a status byte indicating whether there is more than the predetermined number of blocks missing. For example, the predetermined maximum number of blocks may be set to twenty blocks. In using such exemplary method, most meters will receive all blocks and will not need to report on individual blocks; however, those meters that are missing blocks can be interrogated for a manifest of what they still require.
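A minimal sketch of such a missing-block report is given below. The function name and the return shape (a list plus a boolean standing in for the status byte) are hypothetical; only the limit of twenty blocks comes from the exemplary embodiment above.

```python
from typing import List, Set, Tuple

MAX_REPORTED = 20  # exemplary predetermined maximum quoted in the text


def missing_block_report(received_ids: Set[int], total_blocks: int,
                         max_reported: int = MAX_REPORTED) -> Tuple[List[int], bool]:
    """Return (block numbers still missing, capped at max_reported;
    flag telling the caller that more blocks are missing than were listed)."""
    missing = [b for b in range(total_blocks) if b not in received_ids]
    more_pending = len(missing) > max_reported
    return missing[:max_reported], more_pending
```

A collection engine following this sketch would repeat the interrogation after each round of point-to-point transfers for as long as the flag remains set.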
Collection Engine 412 will use such missing block data provided by the respective meter 440 to perform point-to-point block transfers. Meter nodes that cannot be contacted will be reported to the system operator. Once the point-to-point retries have been completed, the devices can be instructed to enable the new firmware. The command to activate the firmware may correspond to a C12.22 manufacturer's stored procedure. If a date and time is specified, the device will activate the firmware at the specified date and time. If no date and time is provided, the device normally will be set to activate the downloaded firmware immediately.
Successful firmware activation can involve two additional aspects. First, selected metrology devices, i.e., meters, may employ not just one, but a plurality of images related to different aspects of the device's operation. In an exemplary configuration, at least three separate firmware images may be employed: one for the meter register board, another for a neighborhood local area network (LAN) microprocessor, and a third for a home area network (HAN) processor. In a more specific exemplary configuration, the neighborhood local area network microprocessor may correspond to an RF LAN microprocessor while the home area network processor may correspond to a Zigbee processor. Each of such components will have its own firmware image that may need to be updated. Additionally, over the course of time, new metrology device versions which require different firmware may be incorporated into existing systems. In such case, a given cell may have a mixture of devices with different firmware needs. For example, the Zigbee protocol may be used for communicating with gas meters, in-home displays, load-control relays, and home thermostats.
With reference presently to
As illustrated in
In order to handle such exemplary circumstances as represented in
With reference again to both
In such foregoing viral peer-to-peer model, a firmware image may be delivered to exemplary cell relay 430. From there, Collection Engine 412 preferably may send a stored procedure command to cell relay 430, indicating that it should distribute such firmware image to the RF LAN. Collection Engine 412 also sends a command to the meter nodes within the cell using a broadcast or multicast message, instructing them that a new firmware image is available. Once such command is received, cell relay 430 makes the firmware available to its local RF LAN processor. Per the presently disclosed subject matter, meter nodes 440 within such cell instruct their RF LAN processors to begin looking for blocks. At such point, the RF LAN processors take over the block transfer process.
Such disclosed viral-type distribution mechanism may be very powerful and very efficient in that it may be able to make better use of the available physical bandwidth. Under such viral peer-to-peer arrangement, individual meter nodes 440 can grab firmware images, or portions of firmware images, from their immediate neighbors or parents, rather than needing to get the data directly from cell relay 430 or Collection Engine 412. As a result, one portion of the cell could be exchanging firmware blocks while another portion of the cell could be passing various messages between meter nodes 440 and cell relay 430, all without impacting each other.
As previously employed, the Forward Error Correction (FEC) used to recover missing symbols was a Reed-Solomon code RS(n,k) with symbols in Galois Field GF(256). This means that the symbols of the code are bytes and that for each block of k bytes, there are appended r=n−k additional redundancy bytes. Such a code has a Hamming distance of r+1 and allows the computation of r missing symbols per block, provided the positions of the missing symbols are known (this is called an erasure in coding theory). The code parameters are adapted to the network performance but as a first approximation, a Reed-Solomon RS(255,200) can be suggested.
The steps for encoding are: (1) If the message is shorter than k bytes, zero-pad it to a length of k bytes, placing the zeros on the most significant symbol side of the message. (2) Compute the r redundancy bytes with the RS(n,k) code.
The steps for decoding are: (1) If no packets are missing, no decoding is needed, because the sole purpose of this decoding is to recover missing packets; it plays no role in error detection. Otherwise, the decoding program needs to know how many packets are missing and where they are located in the block. (2) If the message is shorter than k bytes (as indicated by the length field of the header), zero-pad it to a length of k bytes, as in the encoding process. (3) If some packets are missing, replace the missing symbols with zeros. (4) Compute the missing packets with the RS(n,k) code.
Each byte of the message is considered as an element of Galois Field GF(256). These elements are called symbols. All the operations made with these symbols (addition, subtraction, multiplication and division) should be made according to the additive and multiplicative laws of the Galois field GF(256), constructed with the primitive polynomial $p(x)=x^{8}+x^{7}+x^{2}+x+1$. A symbol of GF(256) has several useful representations: a binary representation $\{b_7 b_6 b_5 b_4 b_3 b_2 b_1 b_0\}$, a polynomial representation $b_7\alpha^{7}+b_6\alpha^{6}+b_5\alpha^{5}+b_4\alpha^{4}+b_3\alpha^{3}+b_2\alpha^{2}+b_1\alpha+b_0$, and an exponential representation $\alpha^{m}$.
In the last two representations, $\alpha$ is a primitive element such that $p(\alpha)=0$. The binary or polynomial representation is useful for addition and the exponential representation is useful for multiplication. All GF(256) elements except zero have an exponential representation, so the complete field can be written as the set $GF(256)=\{0, 1, \alpha, \alpha^{2}, \alpha^{3}, \ldots, \alpha^{253}, \alpha^{254}\}$. The conversion between the two representations is done with a look-up table. For the implementation of the encoder/decoder, it is necessary to add and multiply the symbols in GF(256). Addition is easily done with the binary or polynomial representation of the symbols. The operation is equivalent to a modulo 2 addition of each bit, for instance: 0010 1100 + 1000 1111 = 1010 0011
Multiplication is more difficult because a conversion to the exponential form of the symbol is necessary. A look-up table is used for this purpose. As an example, one may multiply the two symbols of the previous example: 0010 1100×1000 1111
The first symbol (44) has the exponential representation $\alpha^{190}$, and the second symbol (143) has the exponential representation $\alpha^{90}$. The product is $\alpha^{190}\alpha^{90}=\alpha^{280}=\alpha^{255+25}=\alpha^{25}$ because $\alpha^{255}=1$. Using the table in the other direction converts the result back to binary form, giving 44 × 143 = 226.
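The look-up-table arithmetic just described can be sketched as below. The sketch assumes the field is generated by $x^{8}+x^{7}+x^{2}+x+1$ (0x187 in binary form) with $\alpha$ represented by the byte 2, a choice that reproduces the worked addition and multiplication examples; the table and function names are illustrative only.

```python
PRIM_POLY = 0x187  # assumed: x^8 + x^7 + x^2 + x + 1


def build_tables():
    """Build exponent -> symbol (EXP) and symbol -> exponent (LOG) tables."""
    exp = [0] * 255
    log = [0] * 256          # log[0] is unused: zero has no exponential form
    value = 1
    for power in range(255):
        exp[power] = value
        log[value] = power
        value <<= 1          # multiply by alpha
        if value & 0x100:    # reduce modulo the primitive polynomial
            value ^= PRIM_POLY
    return exp, log


EXP, LOG = build_tables()


def gf_add(a: int, b: int) -> int:
    """Addition (and subtraction) is a bitwise modulo-2 sum."""
    return a ^ b


def gf_mul(a: int, b: int) -> int:
    """Multiply by adding exponents modulo 255 (alpha^255 = 1)."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 255]


if __name__ == "__main__":
    print(format(gf_add(0b00101100, 0b10001111), "08b"))
    print(LOG[44], LOG[143], gf_mul(44, 143))
```

Running the script prints the sum, the two exponents, and the product, which can be compared directly with the values quoted in the text.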
Conforming to established convention in coding theory for a Reed-Solomon encoder, the polynomial representation for the messages is used. The k-symbol data block, $\{u_{k-1}, u_{k-2}, \ldots, u_1, u_0\}$, can be written $u(x)=u_{k-1}x^{k-1}+u_{k-2}x^{k-2}+\ldots+u_3x^{3}+u_2x^{2}+u_1x+u_0$. The symbol $u_{k-1}$ is the first symbol sent. The n-symbol code word (message+redundancy symbols) can be written $c(x)=c_{n-1}x^{n-1}+c_{n-2}x^{n-2}+\ldots+c_3x^{3}+c_2x^{2}+c_1x+c_0$. The Reed-Solomon encoding process is equivalent to a division of the message by a generator polynomial G(x). Such can be implemented with a linear feedback shift register as shown in
In the shift register implementation shown in
As with conventional CRC computing, the multiplicative factors in the shift register are given by the coefficients of a polynomial: $G(x)=g_0+g_1x+g_2x^{2}+g_3x^{3}+\ldots+x^{r}$
For a Reed-Solomon code this polynomial is defined by its roots in the following way: $G(x)=(x-\alpha)(x-\alpha^{2})(x-\alpha^{3})(x-\alpha^{4})\ldots(x-\alpha^{r})$
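This polynomial needs to be expanded to find the $g_i$ coefficients. As an example, for an RS(255,200) code the result is:

$$
\begin{aligned}
G(x)={}&\alpha^{10}+\alpha^{123}x+\alpha^{142}x^{2}+\alpha^{189}x^{3}+\alpha^{172}x^{4}+\alpha^{17}x^{5}+\alpha^{120}x^{6}+\alpha^{40}x^{7}\\
&+\alpha^{185}x^{8}+\alpha^{235}x^{9}+\alpha^{90}x^{10}+\alpha^{162}x^{11}+\alpha^{76}x^{12}+\alpha^{232}x^{13}+\alpha^{236}x^{14}+\alpha^{83}x^{15}\\
&+\alpha^{175}x^{16}+\alpha^{171}x^{17}+\alpha^{84}x^{18}+\alpha^{230}x^{19}+\alpha^{167}x^{20}+\alpha^{34}x^{21}+\alpha^{2}x^{22}+\alpha^{77}x^{23}\\
&+\alpha^{43}x^{24}+\alpha^{100}x^{25}+\alpha^{201}x^{26}+\alpha^{118}x^{27}+\alpha^{90}x^{28}+\alpha^{117}x^{29}+\alpha^{215}x^{30}+\alpha^{102}x^{31}\\
&+\alpha^{80}x^{32}+\alpha^{204}x^{33}+\alpha^{180}x^{34}+\alpha^{2}x^{35}+\alpha^{9}x^{36}+\alpha^{62}x^{37}+\alpha^{93}x^{38}+\alpha^{41}x^{39}\\
&+\alpha^{148}x^{40}+\alpha^{245}x^{41}+\alpha^{185}x^{42}+\alpha^{228}x^{43}+\alpha^{3}x^{44}+\alpha^{130}x^{45}+\alpha^{219}x^{46}+\alpha^{113}x^{47}\\
&+\alpha^{167}x^{48}+\alpha^{191}x^{49}+\alpha^{32}x^{50}+\alpha^{131}x^{51}+\alpha^{92}x^{52}+\alpha^{244}x^{53}+\alpha^{169}x^{54}+x^{55}
\end{aligned}
$$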
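As an illustration of how such coefficients may be obtained and used, the sketch below multiplies out the factors $(x-\alpha^{i})$ for $i=1,\ldots,r$ and then performs the shift-register division to produce the redundancy symbols. It assumes the field polynomial $x^{8}+x^{7}+x^{2}+x+1$ (0x187); the function names and the lowest-degree-first coefficient ordering are choices made for this example only.

```python
PRIM_POLY = 0x187  # assumed: x^8 + x^7 + x^2 + x + 1


def _build_tables():
    exp, log, value = [0] * 255, [0] * 256, 1
    for power in range(255):
        exp[power], log[value] = value, power
        value <<= 1
        if value & 0x100:
            value ^= PRIM_POLY
    return exp, log


EXP, LOG = _build_tables()


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 255]


def generator_poly(r):
    """Coefficients g[0..r] of G(x) = (x - alpha)...(x - alpha^r), g[0] first."""
    g = [1]
    for i in range(1, r + 1):
        root = EXP[i % 255]
        nxt = [0] * (len(g) + 1)
        for degree, coef in enumerate(g):
            nxt[degree + 1] ^= coef                # coef * x
            nxt[degree] ^= gf_mul(coef, root)      # coef * alpha^i (minus is plus here)
        g = nxt
    return g


def rs_parity(message, r):
    """Shift-register division: remainder of message(x) * x^r by G(x).

    message[0] is the first symbol sent (highest degree); the returned
    parity symbols are likewise ordered highest degree first.
    """
    g = generator_poly(r)
    reg = [0] * r                                  # reg[j] multiplies x^j
    for symbol in message:
        feedback = symbol ^ reg[r - 1]
        for j in range(r - 1, 0, -1):
            reg[j] = reg[j - 1] ^ gf_mul(feedback, g[j])
        reg[0] = gf_mul(feedback, g[0])
    return reg[::-1]


def poly_eval(coeffs_high_first, x):
    """Horner evaluation, used here only to check that a code word has the roots."""
    acc = 0
    for c in coeffs_high_first:
        acc = gf_mul(acc, x) ^ c
    return acc
```

A code word formed as `message + rs_parity(message, r)` evaluates to zero at each of $\alpha, \alpha^{2}, \ldots, \alpha^{r}$, which is a convenient self-check. Leading zero symbols leave the register at zero, so zero-padding on the most significant side does not change the redundancy bytes.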
In accordance with the presently disclosed subject matter, further improvements have been made to permit significant improvement in the lost packet recovery aspects of broadcast firmware updates. Thus, in accordance with the presently disclosed subject matter as generally illustrated in present
To make the broadcast very reliable, the packets are relayed through the network with a sufficient number of repetitions. Randomization windows for transmission are also chosen to be large enough to avoid the loss of too many packets through internal collisions. The pace of successive broadcasts is chosen to be slow enough to avoid interference between successive packets.
After the completion of the broadcast, the applicative layer checks the integrity of the download in each endpoint. Missing packets are then retransmitted, either individually to the endpoints which have reported the packet to be missing, or with another broadcast if the packet is missing in too many endpoints.
This kind of firmware download gives good results but is not the fastest approach, for two reasons. First, sending a very large number of packets to a very large number of endpoints with a high success rate requires a very reliable broadcast, and the broadcast must be slowed down to achieve this reliability. Second, retransmission of missing packets takes time because different packets are missing in different endpoints, so a large number of packets might need to be retransmitted before the firmware is complete in every endpoint.
In accordance with the presently disclosed subject matter, to speed up the firmware download the following mechanisms have been provided. The process to provide such improved download efficiency is elegantly simple in that the process provides for appending some redundancy packets to the firmware download. Such redundant packets will allow each endpoint to compute the missing packets if not too many packets are lost.
For this purpose, and with reference to present
With reference to present
Present
If one considers all the symbols in position l of the packets of block i, there is the following sequence:
$\{D(i,1,l), D(i,2,l), D(i,3,l), \ldots, D(i,n,l)\}$, for $l=1, 2, \ldots, L$ and $i=1, 2, \ldots, M-1$
The redundancy symbols, $D(i,j,l)$ with $k+1 \le j \le n$, are chosen in such a way that this sequence of n symbols is a word of the Reed-Solomon code C(n,k). It then becomes very easy to recompute a missing packet with the error correction capability of the Reed-Solomon code.
In this instance, the Reed-Solomon decoding involved here is an erasure procedure. Erasure decoding requires much less computation and allows the recovery of twice as many packets, compared to the standard Reed-Solomon decoding. This is made possible because in such case, it is known beforehand which packets need to be recovered. Such is a much simpler problem than finding the location of an unknown error and then correcting the error. For the last, shorter block, a similar procedure can be applied.
$\{D(i,1,l), D(i,2,l), D(i,3,l), \ldots, D(i,n',l)\}$, for $l=1, 2, \ldots, L$ and $i=M$
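The packet-level erasure recovery described above can be sketched as follows. For each symbol position l, the code word must vanish at $\alpha, \alpha^{2}, \ldots, \alpha^{r}$; with the missing positions known, those conditions form a small linear system that is solved once for all L symbol positions. This sketch solves the system by Gauss-Jordan elimination, which is one possible method and not necessarily the one used in practice. The field polynomial 0x187 and all function names are assumptions. Encoding is done with the same routine by treating the r redundancy packets as erased, which yields the same systematic code word as the shift-register division.

```python
from typing import List, Optional, Sequence

PRIM_POLY = 0x187  # assumed: x^8 + x^7 + x^2 + x + 1


def _build_tables():
    exp, log, value = [0] * 255, [0] * 256, 1
    for power in range(255):
        exp[power], log[value] = value, power
        value <<= 1
        if value & 0x100:
            value ^= PRIM_POLY
    return exp, log


EXP, LOG = _build_tables()


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 255]


def gf_pow(a, e):
    return 0 if a == 0 else EXP[(LOG[a] * e) % 255]


def gf_inv(a):
    return EXP[(255 - LOG[a]) % 255]  # a must be nonzero


def recover_block(packets: Sequence[Optional[bytes]], r: int) -> List[bytes]:
    """Fill in the None entries of a block of n <= 255 equal-length packets."""
    n = len(packets)
    missing = [p for p, pkt in enumerate(packets) if pkt is None]
    e = len(missing)
    if e == 0:
        return list(packets)
    if e > r:
        raise ValueError("more packets missing than redundancy can recover")
    length = len(next(pkt for pkt in packets if pkt is not None))
    locator = [EXP[(n - 1 - p) % 255] for p in range(n)]  # first packet = highest degree
    rows = []
    for i in range(1, e + 1):  # one equation per root alpha^i
        coeffs = [gf_pow(locator[m], i) for m in missing]
        rhs = [0] * length
        for p, pkt in enumerate(packets):
            if pkt is not None:
                weight = gf_pow(locator[p], i)
                for l in range(length):
                    rhs[l] ^= gf_mul(weight, pkt[l])
        rows.append(coeffs + rhs)
    for col in range(e):  # Gauss-Jordan elimination over GF(256)
        pivot = next(rw for rw in range(col, e) if rows[rw][col])
        rows[col], rows[pivot] = rows[pivot], rows[col]
        scale = gf_inv(rows[col][col])
        rows[col] = [gf_mul(scale, v) for v in rows[col]]
        for rw in range(e):
            if rw != col and rows[rw][col]:
                factor = rows[rw][col]
                rows[rw] = [a ^ gf_mul(factor, b) for a, b in zip(rows[rw], rows[col])]
    restored = list(packets)
    for idx, m in enumerate(missing):
        restored[m] = bytes(rows[idx][e:])
    return restored


def encode_block(data_packets: Sequence[bytes], r: int) -> List[bytes]:
    """Append r redundancy packets by 'recovering' them as erasures."""
    return recover_block(list(data_packets) + [None] * r, r)
```

For example, a block of eight data packets encoded with `encode_block(data, 4)` can lose any four of its twelve packets and still be restored by `recover_block(damaged, 4)`.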
The firmware download with lost packet recovery in accordance with the presently disclosed subject matter is much faster than previous techniques. Without lost packet recovery, the broadcast has to be slowed down to make sure that every packet is received by every endpoint. With it, a small redundancy overhead allows the endpoints to compute several lost packets themselves, which permits a much faster broadcast pace because the system is now fault tolerant.
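The effect of a small redundancy overhead can be illustrated with a simple calculation. The sketch below assumes, purely for illustration, that each packet is lost independently with the same probability; that loss model, the function names, and the example figures are not taken from the disclosure, and real RF losses are typically correlated.

```python
from math import comb


def p_all_received(k: int, loss: float) -> float:
    """Probability that an endpoint receives all k packets with no redundancy."""
    return (1.0 - loss) ** k


def p_recoverable(n: int, r: int, loss: float) -> float:
    """Probability that at most r of n packets are lost, so that erasure
    decoding can rebuild the block (independent-loss assumption)."""
    return sum(comb(n, j) * loss ** j * (1.0 - loss) ** (n - j)
               for j in range(r + 1))


if __name__ == "__main__":
    k, r, loss = 200, 20, 0.03  # hypothetical figures: 10% redundancy, 3% loss
    print("no redundancy :", p_all_received(k, loss))
    print("with recovery :", p_recoverable(k + r, r, loss))
```

Under this assumed model, a broadcast without redundancy must push the per-packet loss rate very low before a whole block arrives intact, whereas a block with redundancy tolerates a much higher loss rate, which is why the broadcast pace can be increased.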
In accordance with further aspects of the presently disclosed subject matter other improvements over the just described lost packet recovery technique have also been provided. Generally it is expected that the performance of a broadcast strongly depends on the propagation conditions, endpoint density and ongoing traffic in the network. With no prior knowledge of the network and no feedback mechanism, the performance of the broadcast is unknown. One solution is to always send the broadcasts with a maximum efficiency configuration in order to minimize retransmissions. In accordance with such aspect of the presently disclosed subject matter, a feedback mechanism is provided that adapts the broadcast to the actual network.
The first step in such presently disclosed adaptive process is identical to the previously described firmware download with lost packet recovery. After the completion of a block broadcast, the endpoints try to compute missing packets within the block and report the result of their computations to the cell relay. If the block has been successfully received, the broadcast of the next block can begin. If some packets are missing within the block, the cell relay will proceed with the broadcast of additional redundancy packets. The benefit of this is that there is no need to retransmit the missing packets, which will be different for each endpoint. The additional redundancy packets bring extra information that can be used by any endpoint, whatever the missing packets are.
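The feedback loop can be sketched from the cell relay's point of view as below. The sketch relies on the erasure-decoding property that an endpoint can rebuild a block once it holds at least k of the packets sent so far, whichever ones they are. The function names, the representation of losses as sets of packet indices, and the list of redundancy increments are hypothetical.

```python
from typing import Dict, List, Optional, Set


def steps_needed(k: int, redundancy_steps: List[int], lost: Set[int]) -> Optional[int]:
    """Number of broadcast steps after which one endpoint can rebuild the block.

    `lost` holds the indices (in overall sending order) of the packets this
    endpoint failed to receive. Returns None if the block is still not
    recoverable after the last step.
    """
    sent = k
    for step, extra in enumerate(redundancy_steps, start=1):
        sent += extra  # step 1 sends the k data packets plus the first increment
        received = sent - sum(1 for p in lost if p < sent)
        if received >= k:
            return step
    return None


def broadcast_block(k: int, redundancy_steps: List[int],
                    endpoint_losses: Dict[str, Set[int]]) -> Dict[str, Optional[int]]:
    """Report, per endpoint, the step at which it would signal success."""
    return {name: steps_needed(k, redundancy_steps, lost)
            for name, lost in endpoint_losses.items()}
```

The cell relay would stop sending redundancy for a block after the largest step number reported, and any endpoint reported as `None` would be left for a later recovery mechanism.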
Such presently disclosed process can be extended to have several steps as illustrated, for example, in the three-step adaptive packet recovery process shown in
For proper operation, such process uses the same Reed-Solomon code for all the steps. In such way, the redundancy packets bring incremental information to the endpoints. At each step, the endpoint can use the previously transmitted redundancy packets as well as the newly received redundancy packets. Therefore, per presently disclosed subject matter, one starts with the maximum number of redundancy packets and chooses the code to be C(n,k) with $n-k=r_1+r_2+r_3$ being the maximum FEC length as illustrated in
As illustrated in
Those of ordinary skill in the art should appreciate that multi-step process 1400 may be extended to include more than the three basic steps illustrated in
As an aid to implementing the presently disclosed subject matter,
Simulation results for the presently disclosed subject matter have shown that with a 10% redundancy, the total firmware download time can be expected to be reduced by a factor between two and four using a single-step process. By implementation of the presently disclosed subject matter, lost packet recovery makes the firmware download much faster as the small redundancy overhead allows a fast broadcast pace with the computation of lost packets by the endpoint itself.
Similarly, multi-step adaptive lost packet recovery provides further improvement, since additional redundancy is used only in difficult environments and the normal firmware download is not slowed down if the propagation conditions are normal. Further, retransmission of lost packets is replaced by transmission of additional redundancy packets. As those are the same for all endpoints, there is no need to address packets to individual endpoints, and the total time dedicated to retransmission is dramatically reduced. As a consequence, the multi-step procedure in accordance with the presently disclosed subject matter can cope with a very wide range of environments, up to the most difficult ones, thereby significantly increasing overall reliability.
This application is a continuation-in-part of pending U.S. patent application Ser. No. 13/780,001 filed Feb. 28, 2013, which is a continuation of prior pending U.S. patent application Ser. No. 12/902,853 filed Oct. 12, 2010, issued Mar. 5, 2013 as U.S. Pat. No. 8,391,177 entitled “USE OF MINIMAL PROPAGATION DELAY PATH TO OPTIMIZE A MESH NETWORK”, which is a continuation of U.S. patent application Ser. No. 11/900,202 filed Sep. 10, 2007, issued Nov. 30, 2010 as U.S. Pat. No. 7,843,834 and bearing the same title, which claims the benefit of two previously filed U.S. Provisional patent applications entitled “METERING RF LAN PROTOCOL AND CELL/NODE UTILIZATION AND MANAGEMENT,” respectively assigned U.S. Ser. No. 60/845,056, as filed Sep. 15, 2006, and assigned U.S. Ser. No. 60/845,994, as filed Sep. 20, 2006, all of which are hereby incorporated herein by reference in their entireties for all purposes. Any disclaimer that may have occurred during prosecution of the above-referenced applications is hereby expressly rescinded.
Number | Date | Country
---|---|---
60845056 | Sep 2006 | US
60845994 | Sep 2006 | US

Relation | Number | Date | Country
---|---|---|---
Parent | 12902853 | Oct 2010 | US
Child | 13780001 | | US
Parent | 11900202 | Sep 2007 | US
Child | 12902853 | | US

Relation | Number | Date | Country
---|---|---|---
Parent | 13780001 | Feb 2013 | US
Child | 13833252 | | US