The disclosure relates to the field of lookup tables for data encoding and decoding.
Lookup tables are common in the hardware architecture of encoders and decoders, where they determine output data for downstream logic based on input addresses corresponding to values of a lookup table. The bit width of an address determines the number of elements in a lookup table. Typically, there are 2^(bit width of address) elements in a lookup table. With the increasing complexity of encoder/decoder design, larger lookup tables are required. Accordingly, hardware area and timing pressures are relevant in the current state of the art.
Various embodiments of the disclosure include a system for accessing one or more values of a lookup table. The system includes one or more read only memory devices storing a first plurality of values of the lookup table and one or more combinational logic circuits for accessing a second plurality of values of the lookup table.
In an embodiment, the lookup table includes a plurality of sub-lookup tables. A first plurality of sub-lookup tables are stored by the one or more read only memory devices and a second plurality of sub-lookup tables are stored by the one or more combinational logic circuits.
In another embodiment, a primary value of each row of a plurality of rows of the lookup table and a plurality of delta values for each row of the plurality of rows of the lookup table are stored by the one or more read only memory devices. A plurality of secondary values of each row are accessed via the one or more combinational logic circuits, wherein each one of the plurality of secondary values is determined utilizing the primary value of each row and at least one delta value of the plurality of delta values for each row.
In yet another embodiment, a primary row of the lookup table is stored by the one or more read only memory devices. A plurality of secondary rows of the lookup table are accessed via the one or more combinational logic circuits, wherein each value of each row of the plurality of secondary rows is determined utilizing a primary value of the primary row and an eigenvector value.
It is to be understood that both the foregoing general description and the following detailed description are not necessarily restrictive of the disclosure. The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the disclosure.
The embodiments of the disclosure may be better understood by those skilled in the art by reference to the accompanying figures in which:
Reference will now be made in detail to the embodiments disclosed, which are illustrated in the accompanying drawings.
In some embodiments, the maximum transition run (MTR) coding method is based on state splitting. In addition to the traditional hard MTR constraint, a soft MTR constraint is also applied to the encoded data. State-splitting MTR algorithms are known to the art. In some embodiments, the algorithm requires initial construction of a basic MTR matrix which satisfies hard MTR rules. The basic MTR matrix is then extended to a larger transition matrix, allowing the coding performance to approach the Shannon capacity. During encoding, the input data is mapped to a unique path on the transition matrix. Each such path is called an edge and carries a label, where the label is the encoded data. According to the algorithm, a relatively large number of lookup tables (LUTs) are necessary in the MTR encoder and decoder for determining the relationship between edge and input data.
In some embodiments, the LUTs occupy a large area of hardware and/or contribute to system timing pressures, often depending upon the hardware selection and architecture. Read only memory (ROM) devices typically occupy less area than combinational logic circuits (e.g. logic gates, multiplexer arrays). However, access to LUT values through ROM is also typically slower than through combinational logic. Several of the following embodiments are directed to LUT hardware architecture designed to reduce area pressure, timing pressure, or both, among other improvements.
According to the embodiments described herein, various steps or functions are executed by hardware, software, firmware, or any combination of the foregoing. In some embodiments, at least one processor is configured to execute one or more steps or functions according to program instructions stored on at least one carrier medium. In some embodiments, one or more electronic circuits are configured for performing selected steps or functions.
Table 204 illustrates the DNA2 LUT split into a selected number (e.g. 16) of sub-LUTs. In an embodiment, the system 200 includes one or more ROM devices 206 configured to store a first set of the sub-LUTs and one or more combinational logic circuits 208 configured to store a second set of the sub-LUTs. In some embodiments, the first set of sub-LUTs stored by the ROM 206 includes a greater number of non-zero elements than the second set of sub-LUTs stored by the combinational logic 208. Accordingly, the LUT is broken down into a plurality of sub-LUTs stored by a mixture of cascaded ROM devices 206 and combinational logic circuits 208 to decrease the hardware area of the LUT.
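The partitioning described above can be sketched in software. The following is a minimal illustration only, assuming a flat LUT split into equal contiguous sub-LUTs and a simple non-zero-count threshold for choosing between ROM and combinational logic. The function names, the threshold rule, and the sample values are hypothetical and do not come from the disclosure.

```python
from typing import List, Tuple


def split_into_sub_luts(lut: List[int], num_sub_luts: int) -> List[List[int]]:
    """Split a flat LUT into equally sized, contiguous sub-LUTs."""
    if len(lut) % num_sub_luts != 0:
        raise ValueError("LUT length must be divisible by the number of sub-LUTs")
    size = len(lut) // num_sub_luts
    return [lut[i * size:(i + 1) * size] for i in range(num_sub_luts)]


def assign_storage(sub_luts: List[List[int]],
                   nonzero_threshold: int) -> Tuple[List[int], List[int]]:
    """Assign each sub-LUT index to ROM or to combinational logic.

    Assumption for illustration: sub-LUTs with more non-zero elements than
    the threshold go to ROM, and the sparser ones go to combinational logic.
    """
    rom_indices: List[int] = []
    logic_indices: List[int] = []
    for index, sub_lut in enumerate(sub_luts):
        nonzero_count = sum(1 for value in sub_lut if value != 0)
        if nonzero_count > nonzero_threshold:
            rom_indices.append(index)
        else:
            logic_indices.append(index)
    return rom_indices, logic_indices


# Hypothetical 16-element LUT split into 4 sub-LUTs of 4 elements each.
example_lut = [1, 2, 3, 4, 0, 0, 0, 5, 0, 0, 0, 0, 7, 8, 0, 0]
example_sub_luts = split_into_sub_luts(example_lut, 4)
example_rom, example_logic = assign_storage(example_sub_luts, nonzero_threshold=1)
```

With this rule, every sub-LUT placed in ROM has more non-zero elements than any sub-LUT placed in combinational logic, which matches the relationship stated for the first and second sets.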
In some embodiments, illustrated by
In an embodiment, illustrated in
In another embodiment, illustrated in
As previously discussed, conversion from ROM to combinational logic advantageously balances area and timing for some LUTs, such as the LUT 500 illustrated in
In some embodiments, simple combinational logic is configured to store the LUT 500. In some embodiments, the LUT 500 includes a larger matrix (e.g. 300×300 transition matrix) extended from a smaller matrix (e.g. basic 5×5 matrix). A level and offset is stored by the combinational logic to merge zero elements, such that 5 states constitute a level and the offset defines the order of a state in a level. Accordingly, all non-zero elements (i.e. states) are indicated by two variables: level (e.g. range from 1-60) and offset (e.g. range from 1-5). For example, state 6 is level 2 and offset 1, state 7 is level 2 and offset 2, state 12 is level 3 and offset 2, and so on. In some embodiments, the following combinational logic is used:
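The level and offset mapping given in the examples above (state 6 is level 2 and offset 1, and so on) can be illustrated with a short software sketch. This is not the combinational logic referenced in the text; it is a minimal illustration assuming 1-based states, 5 states per level, and hypothetical function names.

```python
from typing import Tuple

STATES_PER_LEVEL = 5  # per the text: 5 states constitute a level


def state_to_level_offset(state: int) -> Tuple[int, int]:
    """Map a 1-based state number to its 1-based (level, offset) pair."""
    if state < 1:
        raise ValueError("state numbers start at 1")
    level = (state - 1) // STATES_PER_LEVEL + 1
    offset = (state - 1) % STATES_PER_LEVEL + 1
    return level, offset


def level_offset_to_state(level: int, offset: int) -> int:
    """Inverse mapping: recover the 1-based state from (level, offset)."""
    if level < 1 or not 1 <= offset <= STATES_PER_LEVEL:
        raise ValueError("level must be >= 1 and offset must be in 1..5")
    return (level - 1) * STATES_PER_LEVEL + offset
```

For a 300-state transition matrix, this mapping yields levels 1 through 60 and offsets 1 through 5, consistent with the ranges stated above.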
In some embodiments, the delta value between a predetermined portion and a target endpoint is required for subsequent operations. In some embodiments, the possible endpoint values are approximately in the range of 0 to 17179869183 (i.e. 2^34 − 1). In some embodiments, the endpoint values are distributed among 16 possible groups, each group including 8 possible sub-statuses, for a total of 128 possible endpoint values. The predetermined portion always falls into a zone between two monotone non-decreasing endpoint values. In some embodiments, during the MTR LUT generation algorithm, the target endpoint values in the 16 possible groups are calculated in advance and stored in the LUT for the real-time encoder/decoder. The target portion in the 8 possible sub-statuses of a group is determined utilizing approximate eigenvector values from an eigenvector matrix.
Table 604 illustrates 128 possible endpoint values for an individual address (one combined value of status and sub-status). In some embodiments, only a selected number of rows such as one primary row (e.g. last (8th) row) of the LUT element 604 is stored, and other (secondary) rows are calculated by combinational logic utilizing delta values (i.e. eigenvector values). For example, to determine value 37 of the LUT element 604, a stored value (e.g. 32) of the primary row is combined with delta values from the eigenvector matrix by combinational logic, as illustrated below:
delta1 = 33 − 32; delta2 = 34 − 33; delta3 = 35 − 34; delta4 = 36 − 35; delta5 = 37 − 36; value 37 = value 32 + delta1 + delta2 + delta3 + delta4 + delta5
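The reconstruction above is a running sum: each secondary value equals the stored primary value plus the deltas accumulated up to that position. The following minimal sketch illustrates the arithmetic in software. The function name and the numeric values are hypothetical and are not taken from table 604 or the eigenvector matrix.

```python
from itertools import accumulate
from typing import List


def reconstruct_values(primary_value: int, deltas: List[int]) -> List[int]:
    """Return the primary value followed by each successive secondary value.

    Element k of the result is primary_value plus the first k deltas.
    Requires Python 3.8+ for the `initial` argument of accumulate.
    """
    return list(accumulate(deltas, initial=primary_value))


# Hypothetical stored value at position 32 and deltas for positions 33..37.
stored_value_32 = 100
deltas_33_to_37 = [3, 1, 4, 1, 5]
values_32_to_37 = reconstruct_values(stored_value_32, deltas_33_to_37)
value_37 = values_32_to_37[-1]  # value 32 + delta1 + ... + delta5
```

Only the primary value and the deltas need to be held, so the secondary values never have to be stored in full width.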
In some embodiments, portion splitting logic is used to determine the delta values. In some embodiments, the 8 portion groups for sub-statuses of a determined end status are accessed through combinational logic utilizing delta values, while the status rows (i.e. primary rows) are stored. In some embodiments, the portion splitting logic allows for a significant reduction of LUT area (e.g. the area of the DNA5 LUT is reduced to as little as ⅛ of its original size). In some embodiments, the logic is defined by the following MATLAB LUT generation code:
As illustrated in
In some embodiments, the path of a current end state address is defined by a comparison between two LUTs (e.g. DNA4−DNA3). For timing considerations, the comparison values are stored directly as a new LUT 612 (e.g. DNA6=DNA4−DNA3). In some embodiments, the new LUT 612 replaces the combinational logic (DNA4−DNA3) to relax timing pressure. In some embodiments, however, the new LUT 612 adds area pressure because DNA3 and DNA4 are still required for other portions of the MTR coding algorithm.
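The trade-off above can be illustrated with a short sketch in which the difference is computed once, offline, and then read back with a single lookup. This is a minimal illustration assuming both tables share the same address space; the function name and the table contents are hypothetical.

```python
from typing import List


def precompute_difference_lut(lut_a: List[int], lut_b: List[int]) -> List[int]:
    """Build a new LUT holding lut_a[address] - lut_b[address] for every address."""
    if len(lut_a) != len(lut_b):
        raise ValueError("both LUTs must cover the same address range")
    return [a - b for a, b in zip(lut_a, lut_b)]


# Hypothetical contents; the real DNA3 and DNA4 tables are not reproduced here.
dna3 = [1, 4, 9]
dna4 = [5, 6, 20]
dna6 = precompute_difference_lut(dna4, dna3)  # DNA6 = DNA4 - DNA3

address = 2
looked_up = dna6[address]             # one lookup at run time
recomputed = dna4[address] - dna3[address]  # two lookups plus a subtraction
```

The single lookup removes the subtraction from the run-time path, at the cost of storing a third table alongside the two originals.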
Owing to this non-overlapping property, the same or a functionally similar LUT is enabled by system 702. The LUT is split into 12 sub-LUTs, each having a single output associated with one of the 12 elements. Some LUT outputs are invalid for a specified address, resulting in an address range from 1 to 155. In some embodiments, the system 702 accordingly has an area of 243×8 and a critical timing path of a 155×8-to-1 mux. The key aspect is that different addresses do not require the same element from the original LUT, so the multiple-output LUT is split into a plurality of single-output LUTs to reduce area.
HDDs with higher density are increasingly desired in the art. Accordingly, encoding methods, such as MTR coding, are sometimes used to achieve better SNR performance. Many LUTs are often required by modern encoding methods, and they are an obstacle for area and timing closure in hardware implementations. The foregoing systems and techniques are directed to reducing area and/or timing pressures for various encoding circuits and any other circuits relying on LUTs.
It should be recognized that in some embodiments the various functions or steps described throughout the present disclosure may be carried out by any combination of hardware, software, or firmware. In some embodiments, various steps or functions are carried out by one or more of the following: electronic circuits, logic gates, field programmable gate arrays, multiplexers, or computing systems. A computing system may include, but is not limited to, a personal computing system, mainframe computing system, workstation, image computer, parallel processor, or any other device known in the art. In general, the term “computing system” is broadly defined to encompass any device having one or more processors, which execute instructions from a memory medium.
Program instructions implementing methods, such as those manifested by embodiments described herein, may be transmitted over or stored on a carrier medium. The carrier medium may be a transmission medium, such as, but not limited to, a wire, cable, or wireless transmission link. The carrier medium may also include a storage medium such as, but not limited to, a read-only memory, a random access memory, a magnetic or optical disk, or a magnetic tape.
It is further contemplated that any embodiment of the disclosure manifested above as a system or method may include at least a portion of any other embodiment described herein. Those having skill in the art will appreciate that there are various embodiments by which systems and methods described herein can be effected, and that the implementation will vary with the context in which an embodiment of the disclosure is deployed.
Furthermore, it is to be understood that the invention is defined by the appended claims. Although embodiments of this invention have been illustrated, it is apparent that various modifications may be made by those skilled in the art without departing from the scope and spirit of the disclosure.