The invention relates generally to circuits and methods of operating circuits, and more specifically to circuitry suitable for aligning non-aligned bits of a data packet for memory storage.
In general, networks and computers operate in different manners. Networks operate by transferring data in streams and/or packets. Streams may be bit-sized, byte-sized, or otherwise broken down. Packets may be of relatively large size, such as 64, 512, or more bytes each. Computers operate by processing data, typically in well-defined small sizes, such as bytes (8 bits), words (16 bits), double words (32 bits) and so on. At the interface between a computer and a network, a translation or reorganization of data may be necessary. This may include reorganizing data from a series of packets into a format useful to a processor. In particular, this may include taking data bits of a series of bytes and reorganizing them into a form including only data bits. A similar problem may occur at a byte-level, wherein some bytes of a group of bytes are data bytes, and other bytes are effectively control bytes which need to be parsed out of data.
The conventional barrel shifter approach suffers from a requirement of increasing logic for increasing bus widths. Whereas a 4 bit barrel shifter may require n gates, an 8 bit barrel shifter may require 4n gates, and a 16 bit barrel shifter may require 16n gates for implementation. Thus, as bus widths grow, the logic required by this approach grows rapidly, quadrupling with each doubling of the bus width in this example.
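The three gate counts quoted above are consistent with a model in which the gate count grows with the square of the bus width. The following Python sketch tabulates that model. The function name `barrel_shifter_gates` and the quadratic fit are assumptions drawn only from the three figures in the text; they are not a statement about any particular shifter implementation.

```python
def barrel_shifter_gates(width_bits: int, n: int = 1) -> int:
    """Gate estimate fitted to the text: 4 bits -> n, 8 -> 4n, 16 -> 16n.

    Assumption: the count scales as (width / 4) ** 2, which reproduces
    the three quoted points; any other width is an extrapolation.
    """
    if width_bits < 4 or width_bits % 4:
        raise ValueError("model is only defined for multiples of 4 bits")
    return n * (width_bits // 4) ** 2


if __name__ == "__main__":
    for width in (4, 8, 16, 32, 64):
        print(f"{width:2d} bit shifter: {barrel_shifter_gates(width)}n gates")
```

Under this assumed model, a shifter sized to a 64 bit bus would need 256 times the logic of the 4 bit case, which illustrates why a shifter that tracks the memory width scales poorly.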
A method and apparatus for assembling non-aligned packet fragments over multiple cycles is described. In one embodiment, the invention is a method. The method includes rotating a non-aligned data fragment within a rotate register based on a tail pointer of a prior data fragment to form a rotated data fragment. The method also includes outputting the rotated data fragment to a double width bus as a double width image of the rotated data fragment. The method further includes selectively copying the double width image of the rotated data fragment from the bus to a location logically following the prior data fragment in a destination register.
In an alternate embodiment, the invention is an apparatus. The apparatus includes a rotate register to receive a data fragment and logically rotate bits of the data fragment. The apparatus also includes a double width bus coupled to the rotate register to receive a double width image of the contents of the rotate register. The apparatus further includes a destination register coupled to the double width bus to receive data of the double width image from the double width bus.
In another alternate embodiment, the invention is also an apparatus. The apparatus includes a first means for storing a data fragment and rotating bits of the data fragment. The apparatus also includes a means for transferring a double width image of data stored in the first means for storing. The apparatus further includes a second means for storing data from the means for transferring. The second means is for receiving data selectively from the means for transferring.
The present invention is illustrated by way of example and not limitation in the accompanying figures.
A method and apparatus for assembling non-aligned packet fragments over multiple cycles is described. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the invention. It will be apparent, however, to one skilled in the art that the invention can be practiced without these specific details. In other instances, structures and devices are shown in block diagram form in order to avoid obscuring the invention.
Reference in the specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.
Various descriptions of this document relate to devices or components being coupled together. Coupling typically denotes a relationship allowing for communication or connection between a first and second object. The first object may be directly connected to the second object. Alternatively, the first object may be directly connected to a third object which may also be directly connected to the second object, thereby achieving a coupling between the first object and the second object. As will be appreciated, the number of intermediate objects between two objects which are coupled together does not determine whether the objects are coupled; rather, the presence of a link between the two objects indicates that the two objects are coupled together.
As bandwidth expands, the size of datapaths tends to expand, sometimes in quantum leaps. For example, datapaths expand typically from 8 to 16 bits, without stopping at intermediary points such as 10 or 12 bits. When logic surrounding datapaths expands, it is preferable that the logic expands slowly, such as linearly or logarithmically with the expansion of the datapaths, and not exponentially. Otherwise, the logic controlling the datapaths may effectively prohibit expansion of the datapaths.
As illustrated in the following figures, various embodiments of this invention provide for a scalable circuit useful for aligning data from a packet in memory, without requiring exponential scaling of the control circuitry. Generally, the data is received, barrel shifted within a register of the same size as the datapath of the supplying network, and then duplicated for purposes of supplying data to a wider bandwidth memory system. The duplicated data may be written with a mask into the desired area of memory. Thus, as the bandwidth of the memory system increases, additional duplication may be used to properly position incoming data, rather than requiring increasingly more complex barrel shifters.
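The scaling idea in the preceding paragraph can be sketched in Python as follows. The sketch keeps the rotation at p bits and only changes the number of side-by-side copies as the memory word widens. The names `rotate_left`, `replicate`, and `place`, the `copies` parameter, and the convention that bit 0 is the least significant bit are all assumptions made for illustration and do not appear in the description.

```python
def rotate_left(value: int, shift: int, p: int) -> int:
    """Rotate a p-bit value left by `shift` positions (bit 0 is the LSB)."""
    low = (1 << p) - 1
    value &= low
    shift %= p
    return ((value << shift) | (value >> (p - shift))) & low


def replicate(word: int, p: int, copies: int) -> int:
    """Lay `copies` identical images of a p-bit word side by side."""
    image = 0
    for i in range(copies):
        image |= word << (i * p)
    return image


def place(fragment: int, length: int, tail: int, p: int, copies: int) -> int:
    """Return the bits that a masked write would deposit at `tail`.

    Assumption: the fragment fits entirely within the copies * p bit word.
    """
    if length > p or tail + length > copies * p:
        raise ValueError("fragment does not fit in the destination word")
    rotated = rotate_left(fragment & ((1 << length) - 1), tail, p)
    image = replicate(rotated, p, copies)
    write_mask = ((1 << length) - 1) << tail
    return image & write_mask
```

In this sketch, moving from a double width word to a quadruple width word changes only the `copies` argument; the rotation logic is untouched, which is the property the paragraph above describes.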
Input register 210 receives data in unpacked form. In one embodiment, the data is a maximum of p bits, although in alternate embodiments, the data may be supplied in bytes rather than bits. The data is transferred to barrel shifter 220, which shifts the data to an appropriate position based on a tail pointer for memory 250 of memory block 240. In particular, tail pointer 270 points to the first bit in memory storage location 255 which follows the data previously stored in memory storage location 255. Barrel shifter 220 thus shifts the data to a point corresponding to the tail pointer location. If the tail pointer location is greater than the pth bit, a modulo p operation is performed to shift the data to a location corresponding to the tail pointer in the barrel shifter 220. The modulo p operation will be further described later in this document. Note that the barrel shifter 220 used in this context is restricted to p bits in width, regardless of the size of the downstream memory (memory block 240), thus standardizing and reducing the complexity of the barrel shifter 220 relative to the barrel shifter 130 of
Data from the barrel shifter 220 is then provided to byte select logic 230, with the data provided to the first p bits of byte select logic 230, and provided as an image to the next p bits of byte select logic 230, effectively creating a wrapped-around image of the data from barrel shifter 220. The data from byte select logic 230 is copied to memory storage location 255 using a write enable mask. This write enable mask allows for writing of the bits occupied by the data of byte select logic 230 without overwriting the old data already present in memory storage location 255. The extra illustration of memory storage location 255 at the bottom of
At block 275, the data in unpacked form is received, such as in an input register. At block 278, a calculation is performed to determine how much data has been received, where previous data has been stored in memory (finding the tail pointer), and accordingly where data should be shifted. At block 280, the data is aligned and rotated or shifted. At block 283, the shifted data is transferred to a double width bus, with images of the data provided side by side. At block 285, a write mask is calculated based on the size of the data and the tail pointer calculated at block 278. At block 290, the data of the double width bus is selectively transferred to a receiving memory storage location using the write enable mask of block 285 to avoid overwriting previously stored data. Alternatively, an image register may be used to temporarily store the double width image of the data.
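The sequence of blocks 275 through 290 can be illustrated with a minimal Python sketch. It models the memory storage location as a 2p-bit integer with bit 0 as the least significant bit. The function name `store_fragment`, its parameters, and the bit-ordering convention are hypothetical and chosen for illustration; the sketch also assumes the fragment fits in the space remaining in the location.

```python
def store_fragment(location: int, tail: int, fragment: int,
                   length: int, p: int) -> tuple[int, int]:
    """Append `length` bits of `fragment` at bit `tail` of a 2p-bit location.

    Returns (new_location, new_tail). Overflow into a following location
    is deliberately not handled in this sketch.
    """
    width = 2 * p
    if not 0 <= length <= p or not 0 <= tail <= width - length:
        raise ValueError("fragment must fit in the remaining space")

    low = (1 << p) - 1
    fragment &= (1 << length) - 1              # block 275: unpacked input
    shift = tail % p                           # block 278: shift amount, modulo p
    rotated = ((fragment << shift) |           # block 280: rotate within p bits
               (fragment >> (p - shift))) & low
    image = rotated | (rotated << p)           # block 283: two images side by side
    write_mask = ((1 << length) - 1) << tail   # block 285: mask from size and tail
    # block 290: masked copy leaves previously stored bits untouched
    new_location = (location & ~write_mask) | (image & write_mask)
    return new_location, tail + length
```

For example, with p = 8, a location already holding the three bits 0b101, and a five bit fragment 0b11011, the call `store_fragment(0b101, 3, 0b11011, 5, 8)` yields the location value 0xDD and a new tail pointer of 8. A tail pointer of 10 produces a rotation of 10 modulo 8 = 2, and the write mask selects the copy in the upper half of the image.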
Input register 210 receives data in unpacked form. The data is transferred to barrel shifter 220, which shifts the data to an appropriate position based on a tail pointer for memory storage location 255 of memory block 240. In particular, tail pointer 270 points to the first bit in memory storage location 255 which follows the data previously stored in memory storage location 255. Barrel shifter 220 thus shifts the data to a point corresponding to the tail pointer location. If the tail pointer location is greater than the pth bit, a modulo p operation is performed to shift the data to a location corresponding to the tail pointer in the barrel shifter 220.
Data from the barrel shifter is then provided to byte select logic 230 through a double width bus, with the data provided to the first p bits of byte select logic 230, and provided as an image to the next p bits of byte select logic 230, effectively creating a wrapped-around image of the data from barrel shifter 220. The data from byte select logic 230 is copied to concatenator 360, along with previously stored data from memory storage location 255 using a write enable mask. This write enable mask allows for writing of the bits occupied by the data of byte select logic 230 in combination with the old data already present in memory storage location 255 to form an image of what will ultimately be written to memory storage location 255.
At block 375, the data in unpacked form is received, such as in an input register. At block 378, a calculation is performed to determine how much data has been received, where previous data has been stored in memory (finding the tail pointer), and accordingly where data should be shifted. At block 380, the data is aligned and rotated or shifted. At block 383, the shifted data is transferred to a double width bus, with images of the data provided side by side. At block 385, a write mask is calculated based on the size of the data and the tail pointer calculated at block 378. At block 388, a determination is made as to whether data in consecutive cycles is directed to the same channel. If the data is directed to the same channel in both consecutive cycles, the data of the two consecutive cycles is concatenated at block 393. If the data is not directed to the same channel in two consecutive cycles, the data is concatenated at block 390 with read-back data from the memory storage location which is to receive the data. In either situation, the process proceeds to block 395 and the concatenated data is written to memory.
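The decision at block 388 and the two concatenation paths can be sketched in Python as follows. The class name `Concatenator`, the method `cycle`, the use of a dictionary holding one storage location per channel, and the choice to retain the most recently written image inside the concatenator are all assumptions made for illustration; the description does not specify these details.

```python
from typing import Dict, Optional, Tuple


class Concatenator:
    """Merges masked bus data with either held data or read-back data."""

    def __init__(self) -> None:
        # (channel, image written in the previous cycle), or None at start.
        self.last: Optional[Tuple[int, int]] = None

    def cycle(self, memory: Dict[int, int], channel: int,
              bus_image: int, write_mask: int) -> int:
        if self.last is not None and self.last[0] == channel:
            # Block 393: same channel as the previous cycle, so the base
            # is the data held from that cycle.
            base = self.last[1]
        else:
            # Block 390: different channel, so the base is read back from
            # the storage location that is to receive the data.
            base = memory.get(channel, 0)
        merged = (base & ~write_mask) | (bus_image & write_mask)
        memory[channel] = merged               # block 395: write to memory
        self.last = (channel, merged)
        return merged
```

In this sketch the two paths differ only in where the previously stored bits come from; the masked merge and the final write are common to both, which mirrors the way both branches rejoin at block 395.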
Data from the barrel shifter is then provided to byte select logic 230, with the data provided to the first p bits of byte select logic 230, and provided as an image to the next p bits of byte select logic 230, effectively creating a wrapped-around image of the data from barrel shifter 220. The data from byte select logic 230 is copied to concatenator 360, along with previously stored data from memory storage location 255 using a write enable mask. This write enable mask allows for writing of the bits occupied by the data of byte select logic 230 in combination with the old data already present in memory storage location 255 to form an image of what will ultimately be written to memory storage location 255.
As the data of byte select logic 230 overflows the memory storage location 255, only the bits between the tail pointer location and the end of byte select logic 230 are copied into the concatenator for copying into memory storage location 255. The remaining bits, occupying locations in byte select logic 230 starting with the 0th bit, are copied into the concatenator after storage of data to memory storage location 255. These remaining data bits may either be stored to memory storage location 556 (the memory storage location logically following memory storage location 255) or may be retained in the concatenator until the next piece of data for the channel in question is received.
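The overflow case can be illustrated with a short Python sketch that splits a double width image into the bits that complete the current location and the bits that wrap around to position 0. The function name `split_overflow`, its return convention, and the bit ordering (bit 0 as the least significant bit) are hypothetical and chosen for illustration.

```python
def split_overflow(image: int, tail: int, length: int,
                   p: int) -> tuple[int, int, int]:
    """Split a 2p-bit image for a fragment that may overflow the location.

    Returns (head, carry, carry_length):
      head  - bits from `tail` to the end of the image, still in place,
              destined for the current storage location;
      carry - the remaining bits, taken from the image starting at bit 0,
              destined for the following location or held for later.
    """
    width = 2 * p
    fits = min(length, width - tail)     # bits that land in this location
    spill = length - fits                # bits that wrap to position 0
    head = image & (((1 << fits) - 1) << tail)
    carry = image & ((1 << spill) - 1)
    return head, carry, spill
```

As a worked case under these assumptions, take p = 8, a tail pointer of 13, and the six bit fragment 0b110101. Rotating by 13 modulo 8 = 5 within eight bits gives 0xA6, so the double width image is 0xA6A6. The call `split_overflow(0xA6A6, 13, 6, 8)` then returns a head of 0xA000 (the low three fragment bits at positions 13 to 15) and a carry of 0b110 with a carry length of 3 (the high three fragment bits, found at positions 0 to 2 of the image).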
In the foregoing detailed description, the method and apparatus of the present invention has been described with reference to specific exemplary embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the present invention. In particular, the separate blocks of the various block diagrams represent functional blocks of methods or apparatuses and are not necessarily indicative of physical or logical separations or of an order of operation inherent in the spirit and scope of the present invention. For example, the various blocks of
Number | Date | Country |
---|---|---|
20040117584 A1 | Jun 2004 | US |