This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2013-070691, filed on Mar. 28, 2013, the entire contents of which are incorporated herein by reference.
The present invention relates to a memory control device and a memory control method.
In recent years, in the field of equipment for transmitting packets and the like by using the IP (Internet Protocol), along with the broadening of the bandwidth of the network, the transmission bandwidth which the FPGA (Field Programmable Gate Array) can handle has been broadened. For example, to allow transmission equipment to exhibit high transmission performance of about 80 Gbps on Ethernet™, the memory has to have a wider bus width in accordance with the increase in the transmission bandwidth of the FPGA. However, in the transmission equipment, since the received data are divided based on the bus width of the memory, large invalid regions are created in the memory along with the increase in bus width due to broadening of the bandwidth. Concretely speaking, when the transmission equipment transmits data of 65 bytes (520 bits), an invalid region of only 248 (=256−8) bits is created when the bus has a width of 256 bits, whereas an invalid region of as many as 504 (=512−8) bits is created when the bus has a width of 512 bits.
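The arithmetic above can be illustrated with a short sketch (Python is used here purely for illustration; the function is hypothetical and not part of any described equipment):

```python
def invalid_bits(data_bits: int, bus_bits: int) -> int:
    """Bits left unused in the last bus-width word after writing the data."""
    remainder = data_bits % bus_bits
    return 0 if remainder == 0 else bus_bits - remainder

# 65-byte (520-bit) data as in the example above
print(invalid_bits(520, 256))  # 248: invalid region on a 256-bit bus
print(invalid_bits(520, 512))  # 504: invalid region on a 512-bit bus
```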
The creation of such an invalid region causes a decrease in the effective transmission bandwidth of the memory and can eventually cause frame loss because of a shortage of memory capacity. Frame loss can be a serious issue for transmission equipment that needs to perform high-speed and high-quality transmission. In other words, when the bus width of the memory is increased in the transmission equipment, the maximum transmission performance is improved; however, since the invalid region is likely to increase, it is difficult for the transmission equipment to exhibit high transmission performance over all the data lengths (for example, 64 bytes to 16 Kbytes). In view of the above problems, there is a method in which, when the transmission equipment writes data in the memory, a plurality of data are written in the memory without a gap so as not to create the invalid region.
However, there can be a problem that processes become complex because the head position of the data changes bit by bit, or because a control for preventing other stored data from being erased has to be performed when writing data. In addition, the data are not always read in the order in which they were written; therefore, when the transmission equipment reads some data, it sometimes also reads other data stored before and after those data, and as a result the transmission bandwidth can decrease at the time of reading. As a measure to improve the effective transmission bandwidth of the memory, other than the above-described measure in which the bus width is broadened, there is a measure in which the operating frequency of the memory itself is increased; however, the FPGA has a limit on the frequency it can handle, and a large improvement in the transmission bandwidth cannot be expected from this measure.
According to an aspect of the embodiments, a memory control device includes a distribution unit and a plurality of controllers. The distribution unit divides data to be written in a memory and outputs a plurality of divided data blocks obtained by the division while distributing the divided data blocks to a plurality of buses. The plurality of controllers write the plurality of divided data blocks output by the distribution unit in the memory through the plurality of buses, the divided data blocks being in contact with each other in each of the buses.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention.
Preferred embodiments will be explained with reference to accompanying drawings. Note that the memory control device or the memory control method disclosed in the present application is not limited by the following embodiment.
First, with reference to
As described above, the memory control device divides each of the data D1 to D4 and then distributes the divided data blocks to the divided blocks B1 to B4 for writing; thus, by performing the reading process on the same blocks to which the data were distributed at the time of writing, the data can be read even when the data are packed without gaps. In addition, the memory control device performs read control individually on each of the divided blocks B1 to B4; thus, even when different data (for example, the data D1 and the data D2) are stored in the divided blocks B1 to B4, the reading processes can be performed substantially in parallel. As a result, the invalid region is decreased.
The details will be described later, but in the present embodiment, each of the written data D1 to D4 is managed with pieces of information: a data ID, a write pointer indicating the storage location of the data, a data length, the ID of the block holding the head of the data, and a type indicating which part of the data is covered by the write pointer. In the writing process, the memory control device stores the above pieces of information in a memory different from a memory 20, which is the object to be controlled; in the reading process, the memory control device obtains the pieces of information corresponding to the data to be read and thereby calculates the addresses to be used for the reading process.
The above pieces of information will be described in detail. The data ID is information (for example, a number) for identifying the data. The write pointer indicates the pointer to which the data were written. A pointer is a memory space having a predetermined capacity (for example, 512 bytes) different from the bus width (64 bytes). The memory control device manages the data in units of pointers, and one pointer corresponds to a plurality of memory addresses. The memory control device may assign additional pointers as needed when data exceeding the capacity (for example, 512 bytes) of one pointer are written in the memory.
The data length is the number of bits of the data (write data) to be written. The number of reads performed by the controllers is calculated by dividing the data length by 128 bits (16 bytes), which is the width of the divided bus. The ID of the block corresponding to the head of the data (hereinafter, written as the “head block ID”) indicates in which of the four divided blocks B1 to B4 the head part of the data is located. The memory control device identifies the locations of the read data, regarding the block indicated by the head block ID as the head (the starting position of the data). The type indicates whether the data part indicated by the corresponding write pointer is the head, the middle, or the tail of the whole data. The value of the type is used for internal processes including the calculation of the read addresses and the restoration of the data.
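As a non-authoritative illustration, the management information described above could be held in a record such as the following sketch; all class and field names are hypothetical, the type encoding is an assumption, and the example values follow the pointer #10/#11 example described later in this section:

```python
from dataclasses import dataclass
from enum import Flag, auto

class PartType(Flag):
    HEAD = auto()
    MIDDLE = auto()
    TAIL = auto()

@dataclass
class WriteRecord:
    data_id: int        # number identifying the data
    write_pointer: int  # pointer (512-byte region) this data part was written to
    data_length: int    # total length of the write data
    head_block_id: int  # index of the divided block holding the head (0 stands for B1 here)
    part_type: PartType # head / middle / tail part covered by this pointer

# data D2 (584 bytes) spanning the pointers #10 and #11
records = [
    WriteRecord(data_id=2, write_pointer=10, data_length=584,
                head_block_id=0, part_type=PartType.HEAD | PartType.MIDDLE),
    WriteRecord(data_id=2, write_pointer=11, data_length=584,
                head_block_id=0, part_type=PartType.TAIL),
]
```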
In the following description, in order to clearly distinguish the data to be divided from the data obtained by the division, the data to be divided are simply written as “the data”, and the data having been divided are written as “the divided data blocks.” For example, the data D1 to be divided are divided into four divided data blocks D11, D12, D13, and D14 prior to the writing process, and the divided data blocks are combined again in the reading process, whereby the data D1 are eventually restored.
Next, since the reading process is performed on the divided blocks B1 to B4 in the order in which they become readable, the data D1 to D4 are not necessarily read out in order from the head part. In other words, the order of read-out may be reversed between the divided blocks B1 to B4.
In the present embodiment, it is assumed that a memory control device 10 is applied to an IP transmission device which has QoS (Quality of Service) functions for 80 Gbps Ethernet, but the application is not limited thereto and can be other devices. The memory control device 10 performs, in data processing between a queue management unit and an output interface in the IP transmission device, read/write control individually on each of the memory buses divided into four parts.
A configuration of the memory control device 10 according to the embodiment disclosed in the present application will be described.
Upon the input of the write request, the write-request distribution unit 11 divides the write data and distributes the divided data blocks to the controllers 15a to 15d. The write-request distribution unit 11 has a write request buffer 111, and the write request buffer 111 stores therein the write data and the requested write pointers (each corresponding to a plurality of memory addresses). For example, the write pointer #10 corresponds to memory addresses #100, #101, #102, . . . .
Then, with reference to
In the example illustrated in
As described above, since the write-request distribution unit 11 writes the divided write data in the different regions (the divided blocks B1 to B4), write data which were conventionally processed at different timings can be processed in parallel at the same timing. In addition, in the memory control device 10, the maximum invalid region in each block is reduced by dividing the bus width B, whereby the parallel loss can be reduced. Further, by preventing the creation of a divided block in which nothing is written, the memory control device 10 can minimize the gap between writing processes when the writing processes occur successively. Thus, the percentage of invalid regions in the signal actually being read is reduced.
As a result, reduction in the transmission bandwidth is avoided.
Upon input of the read request, the read-request distribution unit 12 reads out the data D1 to D3 from the divided blocks B1 to B4. In particular, the read-request distribution unit 12 identifies the data to be read by referring to the above pieces of information (the data ID, the write pointer, the data length, the head block ID, and the type) based on the pointer requested to be read. For example, the read-request distribution unit 12 calculates the head address of the read data from the pointer requested to be read, and then calculates the addresses on which the reading process is to be performed, from the head block ID and the data length. These calculation processes are performed individually for each of the divided blocks B1 to B4. Thus, the head and the tail of the divided data blocks to be read, among the divided data blocks D11 to D34, are identified in units of the divided block.
Next, with reference to
When the above-mentioned read request is input into the read-request distribution unit 12 under the above conditions, the size of one pointer is 512 bytes and the bus width B of the memory is 64 bytes, so that the number of memory addresses for one pointer is 8 (=512÷64). This value is specific to the circuit of the memory 20.
First, as for the pointer #10, since the head address is calculated by the formula: pointer×(the number of memory addresses for one pointer), the head address is calculated to be 80 by the formula 10×8. Further, since the type of the pointer #10 of a read R1 does not contain “tail” (see
Similarly, as for the pointer #11, since the head address is calculated by the formula: pointer×(the number of memory addresses for one pointer), the head address is calculated to be 88 by the formula 11×8. Further, since the type of the pointer #11 of a read R2 includes “tail” (see
First, the length of the data (the length of the remaining data of the data D2) indicated by the pointer #11 is calculated by the formula: (the total data length) mod (the size of one pointer). In the present embodiment, since it is assumed that the data length is 584 bytes (see
Based on the number of reads, the read addresses for each of the divided blocks B1 to B4 are obtained.
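The address calculation described above can be sketched roughly as follows. This is a simplified model under the stated assumptions (512-byte pointers, a 64-byte bus, four 16-byte divided blocks) and, as in the example above, it assumes that the consecutive pointers #10 and #11 happen to map to consecutive addresses; the function and variable names are illustrative only.

```python
POINTER_BYTES = 512       # capacity of one pointer
BUS_BYTES = 64            # total bus width B
NUM_BLOCKS = 4            # divided blocks B1 to B4
BLOCK_BYTES = BUS_BYTES // NUM_BLOCKS            # 16 bytes per divided bus
ADDRS_PER_POINTER = POINTER_BYTES // BUS_BYTES   # 8 memory addresses per pointer

def read_plan(pointer: int, data_length: int, head_block: int):
    """Return the head address and the number of reads for each divided block."""
    head_addr = pointer * ADDRS_PER_POINTER
    chunks = -(-data_length // BLOCK_BYTES)       # ceil: 16-byte divided data blocks
    reads = [chunks // NUM_BLOCKS] * NUM_BLOCKS
    for i in range(chunks % NUM_BLOCKS):          # remaining reads start at the head block
        reads[(head_block + i) % NUM_BLOCKS] += 1
    return head_addr, reads

# data D2: 584 bytes written from pointer #10, head in divided block B1 (index 0)
print(read_plan(10, 584, 0))  # (80, [10, 9, 9, 9]) -> read addresses #80 to #89
```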
Based on the above calculation results, the read-request distribution unit 12 stores the calculated read addresses #80 to #89 in the regions, in a read request buffer 121, corresponding to the divided blocks B1 to B4.
The read-data distribution unit 13 rearranges the read data input from the read-request distribution unit 12. In particular, the writing process starts with the head divided data block of each of the data D1 to D3, but in the reading process, the divided data block which first becomes readable is not always the head divided data block of each of the data D1 to D3; it depends on the immediately preceding reading process. In the present embodiment, in order to reduce the waiting time, the read-request distribution unit 12 immediately starts the reading process on a divided block even if the divided block does not store the head divided data block. Thus, if needed, the read-data distribution unit 13 rearranges the divided data blocks having been read in order to correctly restore them to their original state prior to division.
With reference to
When distributing the read request, the read-data distribution unit 13 reads out the divided data blocks from the read request buffer 121 into a read data buffer 131 in order from the head divided data block, by using the above pieces of information (the data ID, the write pointer, the data length, the head block ID, and the type) stored in the data management unit 14. Thus, the read data (divided data blocks) are accumulated in the read data buffer 131 for each of the divided blocks B1 to B4. The read-data distribution unit 13 rearranges the accumulated read data in order from the head of a FIFO (First In First Out).
The rearrangement process of the read data (divided data blocks) will be described in more detail below.
Similar processes are performed on the data D2 and D3. In particular, when having finished reading from all the divided blocks B1 to B4 in which the data D1 are stored, the read-data distribution unit 13 starts to read out the next data D2, and at the same time switches the rearrangement buffer 132 from the rearrangement buffer 132a to the rearrangement buffer 132b. With this arrangement, a different buffer is used for each of the data D1, D2, and D3 in the rearrangement process. Thus, the read-data distribution unit 13 can easily take out the same data (for example, the data D2) from the data D1, D2, and D3 stored in the read data buffer 131 in an intricate way as illustrated in
As described above, the read-data distribution unit 13 transfers the data D1, D2, and D3 from the read data buffer 131 to the rearrangement buffer 132 in order of the ID. If the read data buffer 131 is empty when transferring, the read-data distribution unit 13 waits until new data are stored in the read data buffer 131; and when new data are stored, the read-data distribution unit 13 starts to transfer the data.
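A minimal sketch of this transfer follows, assuming per-block FIFOs for the read data buffer 131 and one rearrangement buffer per data ID; the structure names and the shape of the data are illustrative assumptions, not a description of the actual circuit.

```python
from collections import deque

NUM_BLOCKS = 4

# read data buffer 131: one FIFO per divided block, holding (data_id, chunk) pairs
read_data_buffer = [deque() for _ in range(NUM_BLOCKS)]
# rearrangement buffers 132a, 132b, ...: modelled here as one list per data ID
rearrangement_buffers = {}

def transfer(data_id, reads_per_block):
    """Move every divided data block of one data from the per-block FIFOs
    into the rearrangement buffer reserved for that data."""
    buf = rearrangement_buffers.setdefault(data_id, [[] for _ in range(NUM_BLOCKS)])
    for block, count in enumerate(reads_per_block):
        for _ in range(count):
            # the real device would wait here while the FIFO is still empty
            _, chunk = read_data_buffer[block].popleft()
            buf[block].append(chunk)

# example: a data occupying one chunk per divided block
for b in range(NUM_BLOCKS):
    read_data_buffer[b].append((1, f"D1-chunk{b}"))
transfer(1, [1, 1, 1, 1])
print(rearrangement_buffers[1])
```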
Next, taking the data D2 as an example, the rearrangement process will be described.
The data management unit 14 stores, when performing the writing process, the above pieces of information so that the divided data blocks D11 to D34 distributed by the write-request distribution unit 11 can be correctly read by the memory control device 10. The above pieces of information are, for example, the data IDs, the write pointers, the data lengths, the head block IDs, and the types of the data D1 to D3. These pieces of information are obtained from the data management unit 14 when performing the reading process, and used, for example, to identify memory addresses to be read or to identify the divided blocks to which the memory addresses are assigned.
The controllers 15a to 15d are memory interfaces for controlling the memory 20. The controllers 15a to 15d are connected to the four pairs of DDR (Double Data Rate) memories 20a to 20d constituting the memory 20, through the four buses B11 to B14, respectively. In the present embodiment, the bus width is assumed to be 512 bits in total; thus, each of the controllers 15a to 15d individually controls the corresponding one of the DDR memories 20a to 20d through the corresponding one of the four 128-bit buses B11 to B14.
Next, with respect to the operation of the memory control device 10, the writing process and the reading process will be separately described. In the description of the operation, similarly to the above description of the configuration, a case is exemplified in which the bus width B of 64 bytes is divided into quarters; however, the number of divisions is not limited to four.
First, the writing process will be described.
At T11 of
At T13, the write-request distribution unit 11 transfers the input write request to the controllers 15a to 15d, regarding as the head the divided block which, among the empty divided blocks, has the shortest waiting time in the controllers 15a to 15d. In this process, the write-request distribution unit 11 transfers the write requests of the divided data blocks to the controllers, starting with the controller whose divided block has the shortest waiting time. At T14, the controllers 15a to 15d write the divided data blocks in the corresponding DDR memories 20a to 20d in the memory 20, according to the write request from the write-request distribution unit 11.
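One possible reading of the distribution at T13 and T14 is sketched below; the selection of the head block by the shortest waiting time and the round-robin assignment of the remaining chunks are illustrative assumptions, not a definitive description of the circuit.

```python
BLOCK_BYTES = 16   # width of one divided bus
NUM_BLOCKS = 4     # divided blocks B1 to B4

def distribute_write(data: bytes, waiting_time):
    """Split the write data into 16-byte divided data blocks and assign them,
    round robin, starting from the divided block whose controller currently
    has the shortest waiting time."""
    chunks = [data[i:i + BLOCK_BYTES] for i in range(0, len(data), BLOCK_BYTES)]
    head_block = min(range(NUM_BLOCKS), key=lambda b: waiting_time[b])
    assignment = [(chunk, (head_block + i) % NUM_BLOCKS) for i, chunk in enumerate(chunks)]
    return head_block, assignment   # head_block is recorded as the head block ID

# an 88-byte data (like the data D3) with controller waiting times 3, 1, 2, 4
head, plan = distribute_write(bytes(88), waiting_time=[3, 1, 2, 4])
print(head, [block for _, block in plan])  # 1 [1, 2, 3, 0, 1, 2]
```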
In
408 bytes÷(16 bytes×27)=0.944 (1)
The above Equation (1) represents that the write bandwidth is improved; in the memory control device 10, by further increasing the number of divisions to reduce the width of the divided block, the write bandwidth can be made still wider. Therefore, the number of divisions is not limited to four, and it is desirable that the number of divisions be as large as possible unless the circuit scale becomes too large.
At T15, the write-request distribution unit 11 stores, in advance, the above pieces of information (for example, the write pointers, the data lengths, the head block IDs, and the types) in the data management unit 14 as the information for restoring the divided data blocks, so that the divided data blocks can be correctly reassembled at the time of reading.
Next, the reading process will be described.
Upon receiving the input of a read pointer as the read request from the queue management unit (T21), the read-request distribution unit 12 obtains the above pieces of information (for example, the head block IDs, the data lengths, and the types) previously stored at T15 from the data management unit 14, based on the read pointer (T22).
At T23, the read-request distribution unit 12 distributes the read requests input at T21 to the controllers 15a to 15d, based on the head block IDs and the data lengths obtained at T22. Next, the read-request distribution unit 12 calculates the read addresses for each of the controllers 15a to 15d, based on the head block IDs, the data lengths, and the types obtained at T22 (T24). At T25, the read-request distribution unit 12 refers to the read addresses calculated at T24 and reads the divided data blocks corresponding to the read addresses from the memory 20 through the controllers 15a to 15d.
Here, taking as an example the case that the read requests are input in the order of the data D3, D1, D4, and D2, the reading process will be described in more detail.
Thus, also in
408 bytes÷(16 bytes×27)=0.944 (2)
The above Equation (2) represents that the read bandwidth is improved; in the memory control device 10, by further increasing the number of divisions to reduce the width of the divided block, the read bandwidth can be made still wider. Therefore, the number of divisions is not limited to four, and it is desirable that the number of divisions be as large as possible unless the circuit scale becomes too large.
At T25, it is preferable that at the time of reading, the read-request distribution unit 12 assigns a frame number to each of the data D1 to D4 when obtaining the pieces of information corresponding to the data ID. The frame numbers are the data IDs which are reassigned in the order in which the data are actually read out. The read-request distribution unit 12 takes out the data D1 to D4 from the memory 20 according to the frame numbers in ascending order. With this arrangement, the memory control device 10 can output the data D1 to D4 to an external device in the correct order based on the input read request.
At T26, the read-data distribution unit 13 stores the read data read at T25, together with the head block IDs, in the rearrangement buffer 132, in units of the divided data block. When storing, the read-data distribution unit 13 performs control to switch the rearrangement buffer 132 to the rearrangement buffer 132 for the next data as illustrated in
At T28, the read-data distribution unit 13 serially reads out the read data D3, D1, D4, and D2 from the rearrangement buffers 132c, 132a, 132d, and 132b. The read-out is done independently in units of the divided data block. At T29, the read-data distribution unit 13 refers to the head block IDs stored in the data management unit 14 at T15, and then rearranges the divided data blocks such that the head divided data block of each of the read data D1 to D4 is located at the divided block B1 at the head of the bus, and the read-data distribution unit 13 then outputs the rearranged data. In this way, the divided data blocks are connected, and the original data D1 to D4 are restored.
Further, as illustrated in
Note that when transferring the data D1 to D4 between the buffers, the read-data distribution unit 13 arranges the data D1 to D4 in ascending order of the frame numbers assigned to the data D1 to D4, and this operation avoids the inconvenience that the order of outputting the data differs from the order of the read requests.
The read-data distribution unit 13 reads the divided data blocks from the rearrangement buffer 132 to output them, and the number of reads for one frame is calculated by rounding up the value obtained by dividing the data length by the bus width B. Thus, for example, in the case of the data D3, the number of reads is calculated to be two according to the equation (88 bytes)/(64 bytes)=1.375. Similarly, the number of reads is calculated for each of the read data; for example, in the case of the data D2, the number of reads is three according to the equation (168 bytes)/(64 bytes)=2.625.
The read-data distribution unit 13 outputs the data D3, D1, D4, and D2 in this order, incrementing the frame number; before outputting the data, the read-data distribution unit 13 rearranges the divided data blocks based on the head block IDs such that the head divided data blocks are located at the top. For example, in the case that the output data are the data D3, the read-data distribution unit 13 outputs the data D3 as they are without rearranging the divided data blocks (see
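The rearrangement at output can be pictured, in a highly simplified way that ignores wrapping across memory addresses, as a rotation of one bus word so that the head block comes first; the function below is a hypothetical sketch of that rule, not the actual circuit.

```python
def to_head_first(bus_word, head_block_id):
    """Rotate the divided data blocks of one bus-width word so that the block
    indicated by the head block ID is output at the position of block B1."""
    return bus_word[head_block_id:] + bus_word[:head_block_id]

# a word read back with its head divided data block sitting in block B3 (index 2)
print(to_head_first(["c2", "c3", "c0", "c1"], 2))  # ['c0', 'c1', 'c2', 'c3']
```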
Taking the data D4 as an example, a description will be made in more detail below.
As described above, the memory control device 10 has the write-request distribution unit 11 and the controllers 15a to 15d. The write-request distribution unit 11 divides the data D1 to be written in the memory 20 in the direction of the bus width B and then outputs the data, distributing the divided data blocks D11 to D16 obtained by the division to the buses B11 to B14. The controllers 15a to 15d write the divided data blocks D11 to D16 output by the write-request distribution unit 11 in the memory 20 through the buses B11 to B14, without leaving, in each of the buses, any gap of the bus size between the data (see
In the memory control device 10, it is preferable that the data management unit 14 obtains, when the divided data blocks D11 to D16 are output, the positions (for example, the head block IDs, the data lengths, and the write pointers) at which the divided data blocks D11 to D16 are written in the memory 20. In addition, the read-request distribution unit 12 serially reads out the divided data blocks D11 to D16 written in the memory 20 by the controllers 15a to 15d, referring to the above positions obtained by the data management unit 14.
In the memory control device 10, it is preferable that the read-data distribution unit 13 outputs the divided data blocks D11 to D16 after rearranging them such that the divided data blocks D11 to D16 read out by the read-request distribution unit 12 are in the order they had before being divided by the write-request distribution unit 11.
In other words, the memory control device 10 individually writes and reads the data obtained by the division, thereby making the invalid regions (parallel loss) as small as possible and improving the efficiency of the arrangement of the data on the bus. With this arrangement, the transmission bandwidth is prevented from being reduced without complicating the data management method in the memory control device 10. As a result, broadband transmission of data can be realized by the IP transmission device without speeding up the circuit by increasing the operating frequency of the memory itself. Consequently, efficient memory control is realized.
With reference to
First, since the memory control device of the related art processes one block of data in units of 64 bytes, the transfer efficiency is calculated to be (65 bytes)/(64 bytes×2)=0.51. The transmission bandwidth is then calculated to be 52.2 Gbps by Equation (3) below.
512 bits×200 MHz×0.51=52.2 Gbps (3)
In contrast, since the memory control device 10 according to the present embodiment divides one block of data into quarters and processes them in units of 16 bytes, the transfer efficiency of the data is calculated to be (65 bytes)/(16 bytes×5)=0.81. Consequently, the transmission bandwidth is calculated to be 82.9 Gbps by Equation (4) below.
512 bits×200 MHz×0.81=82.9 Gbps (4)
As illustrated above, in the present embodiment, the transmission bandwidth is improved by approximately 30 Gbps under the above assumptions, compared to the related art. A similar effect can be achieved with data lengths other than 65 bytes.
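The efficiency and bandwidth arithmetic above can be reproduced with a short sketch, under the stated assumptions of a 512-bit total bus and a 200 MHz operating frequency (the function names are illustrative only):

```python
BUS_BITS = 512   # total bus width
FREQ_MHZ = 200   # assumed operating frequency

def efficiency(data_bytes: int, unit_bytes: int) -> float:
    """Useful bytes over the bytes actually occupied when the data are
    handled in units of unit_bytes."""
    units = -(-data_bytes // unit_bytes)   # ceil
    return data_bytes / (unit_bytes * units)

def bandwidth_gbps(data_bytes: int, unit_bytes: int) -> float:
    return BUS_BITS * FREQ_MHZ * efficiency(data_bytes, unit_bytes) / 1000

print(bandwidth_gbps(65, 64))  # about 52 Gbps with an undivided 64-byte unit
print(bandwidth_gbps(65, 16))  # about 83 Gbps with the quartered 16-byte unit
```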
The memory control device 10 can be modified as described below.
In other words, the memory control device 10 has therein the memory groups 21 and 22. When a write request and a read request for the divided data blocks D11 to D16 are issued concurrently, the write-request distribution unit 11 of the memory control device 10 instructs the controllers 15a to 15d to perform the writing in response to the write request on a memory region different from the memory region from which the divided data blocks (for example, the divided data blocks D11 to D13) are read out in response to the read request.
More specifically, when the read request and the write request are not issued concurrently, the read-request distribution unit 12 of the memory control device 10 reads out the data requested to be read from the memory group in which those data are stored. In addition, the write-request distribution unit 11 of the memory control device 10 writes the data requested to be written in the memory group selected by a selector for writing 133a when the write request is input. In contrast, consider the case where the read request and the write request are issued concurrently. As for the read data, similarly to the case where the requests are not concurrent, the read-request distribution unit 12 reads out, through a selector for reading 133b, the read data from the memory group in which the data to be read are stored. On the other hand, as for the write data, the write-request distribution unit 11 selects, of the memory groups 21 and 22, the memory group other than the one selected for reading, and writes the data to be written in that memory group. By this operation, the writing process is performed on the memory group on which no reading process is being performed. As a result, concurrent occurrence (conflict) of the reading process and the writing process within one memory group is prevented before it happens.
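A minimal sketch of this group selection rule, with the memory groups 21 and 22 represented as indices 0 and 1 (the function is hypothetical and only models the selection, not the selectors 133a and 133b themselves):

```python
def select_groups(read_group, write_requested):
    """Return the memory groups used for the read and the write.
    When both requests arrive at once, the write is directed to the group
    that is not being read, so the two never conflict in one group."""
    if not write_requested:
        return read_group, None
    write_group = 1 - read_group if read_group is not None else 0
    return read_group, write_group

print(select_groups(read_group=0, write_requested=True))     # (0, 1)
print(select_groups(read_group=None, write_requested=True))  # (None, 0)
```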
As a cause of the reduction in the transmission bandwidth, the occurrence of the above-described waiting can be cited in addition to the above-described creation of the invalid region between the data; however, with the memory control device 10 according to Modified Example 1, both causes are resolved, whereby the transmission bandwidth can be further improved.
Note that although the bus width B is divided into quarters in the embodiment, the number of divisions is not limited thereto and can be any plural number. However, if the number of divisions is too large, problems arise in that the memory control device needs an increased number of controllers, the circuit scale becomes large, and more lines are required for address control. Therefore, the number of divisions is preferably in a range (for example, division into about four to eight parts) which does not cause these problems.
Further, in the embodiment, since the bus width of 512 bits in total is evenly divided into quarters in the memory control device 10, the lengths of the divided data blocks are made constant and identical to each divided bus width. However, in the case that the divided buses have different widths, the lengths of the divided data blocks may also differ in conformity with those bus widths (for example, 128 bits and 256 bits).
Further, in the embodiment, physically separate buses (for example, four buses) connect the memory control device 10 and the memory 20; however, in the memory control device 10, one physical bus may be divided to form a plurality of logical buses.
The elements of the memory control device 10 are not necessarily physically structured as illustrated in the drawings. That is to say, the specific manner of distribution and integration of the devices is not limited to the one illustrated in the drawings, and all or part of them may be functionally or physically distributed or integrated in arbitrary units depending on various loads or the status of use. For example, either the read-request distribution unit 12 and the read-data distribution unit 13, or the write-request distribution unit 11 and the read-request distribution unit 12, in the memory control device 10, may be integrated as one element. Further, the memory for data management and the memory for the request buffer may be shared. Alternatively, the data management unit 14 may be divided into two parts, one for managing the write data and the other for managing the read data. Further, the memory 20 may be connected to the memory control device 10 through a network or a cable. Similarly, the data management unit 14 may be provided as an external device of the memory control device 10.
According to an aspect of a memory control device, a transmission bandwidth can be improved.
All examples and conditional language recited herein are intended for pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventors to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although the embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---
2013-070691 | Mar 2013 | JP | national |