The present application is based on, and claims priority from, Korean Patent Application Number 10-2023-0067411, filed May 25, 2023, the disclosure of which is incorporated by reference herein in its entirety.
The present disclosure relates to a method and apparatus for constructing a multi-stage hash table. More specifically, the present disclosure relates to a method and apparatus for resolving hash collisions using a multi-stage hash table and a multi-stage management table without storing large key values in entries of the hash table.
The contents described below merely provide background information related to the present disclosure and do not constitute prior art.
Hash tables are a common technology used to store and manage a large number of (key, value) pairs in a memory-efficient manner. In hash tables, hash collisions can occur when the hash values of keys are the same or entry storage slots are the same. To resolve a hash collision, each key needs to be stored in the hash table entries. However, if the key size is very large or the key has a variable length, a large amount of memory is required to store the keys in entries. Accordingly, it is difficult to construct a hash table on hardware having limited memory. A representative example of such a limitation is a case in which a forwarding engine of a named data network (NDN) is implemented using a hash table.
NDN is a representative technology of an information centric network (ICN) for realizing future Internet architectures. NDN performs forwarding using a data name itself rather than the IP address of a host containing specific data. A data name has a variable length and has no length limit. A hash table can be used to speed up forwarding using such a variable-length name. A hash table creates a fixed-length hash value by hashing a variable-length key. An entry is stored in the k-th slot, where k is the remainder obtained by dividing the hash value by the number of slots in the hash table. In a hash table, the hash values of different entries may be the same, and multiple entries may be stored in the same slot even if the hash values are different. To resolve such a hash collision, a key may be included in entry information to check whether the entry has the same key.
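The slot computation described above can be sketched as follows. This is a minimal illustration only: the function names hash_value and slot_index, the choice of BLAKE2b, and the 8-byte digest width are assumptions made for the example and are not specified in the present disclosure.

```python
import hashlib


def hash_value(name: str, digest_bytes: int = 8) -> int:
    """Hash a variable-length name to a fixed-length integer.

    BLAKE2b and the 8-byte width are illustrative assumptions.
    """
    digest = hashlib.blake2b(name.encode("utf-8"), digest_size=digest_bytes).digest()
    return int.from_bytes(digest, "big")


def slot_index(name: str, num_slots: int) -> int:
    """Return the slot k: the remainder of the hash value divided by the slot count."""
    return hash_value(name) % num_slots


if __name__ == "__main__":
    # Names of very different lengths map to slots in the same fixed range.
    for name in ("/a", "/video/2023/05/25/segment/000017"):
        print(name, "->", slot_index(name, num_slots=1024))
```

Two different names may produce the same hash value, or different hash values that reduce to the same slot, which is the collision case the remainder of this description addresses.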
In a case in which a forwarding information base (FIB) is implemented with a hash table in NDN, a variable-length prefix serves as a key, and thus the prefix is stored in entries. When a forwarding engine is implemented in software, memories of different sizes may be allocated for respective entries. However, when a forwarding engine is implemented in hardware for high-speed forwarding, it may be difficult to implement entries while reducing memory waste. To solve this, the maximum prefix size may be limited and all entries may be set to the maximum entry size. Alternatively, memory efficiency can be improved by using one memory block for short prefixes and multiple memory blocks for long prefixes. However, in this case, when designing the pipeline in hardware, the lookup time allocated per entry must equal the lookup time of the maximum-size prefix. Accordingly, the lookup time increases and performance deteriorates.
The present disclosure can resolve hash collisions by storing only fixed-length hash values in entries of a hash table without storing large key values.
In addition, according to one embodiment, a multi-stage hash table with different hash functions can be configured to resolve hash collisions.
Further, according to one embodiment, a hash table manager can determine hash collisions by storing key values in a management table.
Furthermore, according to one embodiment, memory efficiency can be improved and forwarding performance can be enhanced by using a multi-stage hash table.
The objects to be achieved by the present disclosure are not limited to the objects mentioned above, and other objects not mentioned will be clearly understood by those skilled in the art from the description below.
According to the present disclosure, a method for configuring a multi-stage hash table includes attempting to add management table information to a first management table and attempting to add hash table information to a first hash table. The method also includes determining whether second management table information having the same hash value as a hash value included in first management table information is present in a k-th management table. The first management table information is information to be added to the k-th management table. The method also includes attempting to add table information to the k-th management table and a k-th hash table or to a (k+1)-th management table and a (k+1)-th hash table based on the determination result. Table information added to the k-th hash table and the (k+1)-th hash table does not include a key.
According to the present disclosure, an apparatus for configuring a multi-stage hash table includes a memory and a plurality of processors. At least one of the plurality of processors is configured to attempt to add management table information to a first management table and attempt to add hash table information to a first hash table. The at least one of the plurality of processors is also configured to determine whether second management table information having the same hash value as a hash value included in first management table information is present in a k-th management table. The first management table information is information to be added to the k-th management table. The at least one of the plurality of processors is also configured to attempt to add table information to the k-th management table and a k-th hash table or to a (k+1)-th management table and a (k+1)-th hash table based on the determination result. Table information added to the k-th hash table and the (k+1)-th hash table does not include a key.
According to the present disclosure, a computer-readable recording medium stores instructions that, when executed by a computer, may cause the computer to perform attempting to add management table information to a first management table and attempting to add hash table information to a first hash table. The instructions, when executed by the computer, may also cause the computer to perform determining whether second management table information having the same hash value as a hash value included in first management table information is present in a k-th management table. The first management table information is information to be added to the k-th management table. The instructions, when executed by the computer, may also cause the computer to perform attempting to add table information to the k-th management table and a k-th hash table or to a (k+1)-th management table and a (k+1)-th hash table based on the determination result. Table information added to the k-th hash table and the (k+1)-th hash table does not include a key.
According to the present disclosure, it is possible to resolve hash collisions by storing only fixed-length hash values in entries of a hash table without storing key values.
In addition, according to one embodiment, it is possible to construct a multi-stage hash table with different hash functions to resolve hash collisions.
Further, according to one embodiment, a hash table manager can determine hash collisions by storing key values in a management table.
Furthermore, according to one embodiment, it is possible to improve memory efficiency and forwarding performance by using a multi-stage hash table.
The effects of the present disclosure are not limited to the effects mentioned above, and other effects not mentioned will be clearly understood by one of ordinary skill in the art from the following description.
Hereinafter, some exemplary embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. In the following description, like reference numerals preferably designate like elements, although the elements are shown in different drawings. Further, in the following description of some embodiments, a detailed description of known functions and configurations incorporated therein will be omitted for the purpose of clarity and for brevity.
Additionally, various terms such as first, second, A, B, (a), (b), etc., are used solely to differentiate one component from another and do not imply or suggest the substance, order, or sequence of the components. Throughout this specification, when a part ‘includes’ or ‘comprises’ a component, the part may further include other components and does not exclude them unless specifically stated to the contrary. The terms such as ‘unit’, ‘module’, and the like refer to one or more units for processing at least one function or operation, which may be implemented by hardware, software, or a combination thereof.
The following detailed description, together with the accompanying drawings, is intended to describe exemplary embodiments of the present disclosure, and is not intended to represent the only embodiments in which the present disclosure may be practiced.
Referring to
In the case of looking up a hash table, it is necessary to compare keys in order to check whether a matching entry is the entry for a desired key. This is because two or more keys in a hash table can have the same hash value. Accordingly, it is necessary to store keys in a hash table. However, as the key size increases, the memory capacity for storing keys increases, and when a hash table is implemented using hardware such as an ASIC or an FPGA, it may be difficult to implement a hash table due to limitations on a memory size per entry and overall table memory. In particular, in the case of variable-length keys, a key memory of an entry is allocated to a maximum allowable key size, which also causes inefficient use of the memory.
Referring to
When a hash table is implemented using hardware, memory allocation may be designed for each entry. In this case, a memory having the maximum size is allocated to the prefix and a comparison logic is designed to suit the maximum size, which may result in memory waste. In the present disclosure, when the FIB is implemented as a hash table, a variable-length prefix, which is a key, may not be stored in the hash table, but a fixed-length hash value may be stored. A fixed-length hash value may be used as an entry delimiter. Accordingly, the entry size is reduced and the lookup time becomes constant, and thus the forwarding speed can increase. A multi-level hash table may be used to resolve hash collisions.
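The effect of storing a fixed-length hash value instead of a variable-length prefix can be illustrated with a short sketch. The field widths used here (a 16-byte hash value, a 4-byte value field, and a 1-byte collision flag), the name ENTRY_FORMAT, and the function pack_entry are hypothetical choices for the example and are not taken from the present disclosure.

```python
import hashlib
import struct

# Assumed layout: 16-byte hash value, 4-byte value, 1-byte collision flag.
# The ">" prefix selects big-endian packing without padding.
ENTRY_FORMAT = ">16sIB"


def pack_entry(prefix: str, value: int, collision: bool = False) -> bytes:
    """Build a hash table entry that holds the prefix's hash value but not the prefix."""
    hashed = hashlib.blake2b(prefix.encode("utf-8"), digest_size=16).digest()
    return struct.pack(ENTRY_FORMAT, hashed, value, int(collision))


if __name__ == "__main__":
    short = pack_entry("/a", 1)
    long_ = pack_entry("/org/example/archive/" + "segment/" * 40, 2)
    # Both entries have the same size regardless of prefix length.
    print(len(short), len(long_), struct.calcsize(ENTRY_FORMAT))
```

Because every entry has the same size under this layout, a hardware pipeline would not need to budget comparison time for a maximum-size prefix.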
A hash table manager may construct a hash table group suitable for the system structure. As an example, a hash table manager may correspond to a routing information base (RIB) manager that manages routing entries using a routing protocol. Here, the RIB manager may be implemented as software in association with the routing protocol. As an example, FIB may be implemented using a hash table group. The hash table manager may include management tables corresponding to hash tables in a hash table group. A hash value and a prefix may be stored together in the management table. The hash table manager may use such a management table to manage hash collisions at the time of adding or deleting entries. The hash table manager may access the management table only when changing entries. Storing a variable-length prefix that is a key value in the management table does not affect forwarding performance. This is because only hash tables are used in packet forwarding.
Referring to
Referring to
Referring to
Referring to
Since the number of entries in the management table H_2 is the number of entries with hash collisions in the management table H_1, the management table size rapidly decreases as the number of stages increases. For example, if the number of output bits of the hash function of the management table H_1 is 128 bits and one million entries are added to the management table H_1, the probability of a hash collision is 10^-27. Accordingly, hash collisions can be avoided by substantially using only one management table. A small management table H_2 may be added to avoid hash collisions that may occur.
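The order of magnitude quoted above can be checked with the standard birthday-bound approximation, in which the probability that any two of n uniformly distributed b-bit hash values coincide is at most n(n-1)/2 divided by 2^b. The sketch below assumes an ideal, uniformly distributed hash function; the function name collision_probability_bound is hypothetical.

```python
def collision_probability_bound(num_entries: int, hash_bits: int) -> float:
    """Upper bound on the probability that any two entries share a hash value.

    Assumes hash values are uniformly distributed over 2**hash_bits outputs.
    """
    pairs = num_entries * (num_entries - 1) // 2
    return pairs / 2 ** hash_bits


if __name__ == "__main__":
    # One million entries with a 128-bit hash: on the order of 10^-27.
    print(collision_probability_bound(1_000_000, 128))
```

Under these assumptions the bound for one million entries and a 128-bit hash value falls between 10^-27 and 2 x 10^-27, which is consistent with the figure given above.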
Since H_1(k2) and H_1(k3) are the same in the management table H_1, a hash collision between (k2,v2) and (k3,v3) may occur. Since H_1(k5) and H_1(k7) are the same in the management table H_1, a hash collision between (k5,v5) and (k7,v7) may occur. The number of a key indicates the order in which the key has been added to a management table. Since entries containing only hash values are stored in hash tables, in the event of a hash collision, only the collision bit of a previously stored entry can be set to 1. Here, a new entry may not be added to the management tables and the hash tables. Since an entry containing H_1(k2) and an entry containing H_1(k5) have been stored first without a hash collision, only the entry containing H_1(k2) and the entry containing H_1(k5) can be present in the management table H_1. In the management table H_1, only an entry containing k2 and H_1(k2) and an entry containing k5 and H_1(k5) can be present.
The hash table manager may ascertain that there is a hash collision between (k2,v2) and (k3,v3) in the management table H_1 and add an entry containing H_1(k2) and an entry containing H_1(k3) to the management table H_2. The hash table manager may ascertain that there are a hash collision between (k5,v5) and (k7,v7) in the management table H_1 and a hash collision between (k3,v3) and (k5,v5) in the management table H_2, and may not add an entry containing H_2(k5) to the management table H_2. The hash table manager may add an entry containing H_3(k3) and an entry containing H_3(k5) to the management table H_3.
Referring to
If a hash collision between an existing entry and a new entry occurs in the (k+1)-th hash table and the (k+1)-th management table, the existing entry and the new entry may be stored in the (k+2)-th management table and the (k+2)-th hash table in the same manner. This method may be repeated up to n times, which is the number of hash table stages. The number of hash table stages may be fixed at the time of system initialization. If dynamic addition of a hash table is possible, such as in a software implementation, the number of hash table stages may be increased until no hash collisions occur. If a dynamic hash table cannot be added, adding entries may fail.
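The staged addition described above can be sketched as follows. This is a simplified illustration under stated assumptions and is not the implementation of the present disclosure: the class and field names (MultiStageTable, MgmtEntry, HashEntry) are hypothetical, tables are modeled as dictionaries keyed by hash value rather than as slotted memory, the number of stages is fixed at construction, and each key is assumed to be added at most once.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class MgmtEntry:
    """Management table entry: keeps the key so collisions can be detected."""
    key: str
    value: int
    collision: bool = False


@dataclass
class HashEntry:
    """Hash table entry: holds a value and a collision bit, but no key."""
    value: int
    collision: bool = False


class MultiStageTable:
    def __init__(self, hash_fns: List[Callable[[str], int]]):
        # One hash function, management table, and hash table per stage.
        self.hash_fns = hash_fns
        self.mgmt: List[Dict[int, MgmtEntry]] = [{} for _ in hash_fns]
        self.tables: List[Dict[int, HashEntry]] = [{} for _ in hash_fns]

    def add(self, key: str, value: int, stage: int = 0) -> bool:
        """Try to add (key, value) at the given stage; return False if stages run out."""
        if stage >= len(self.hash_fns):
            return False  # fixed number of stages exhausted: the addition fails
        h = self.hash_fns[stage](key)
        existing = self.mgmt[stage].get(h)
        if existing is None:
            # No entry with this hash value: store in both tables, collision bit 0.
            self.mgmt[stage][h] = MgmtEntry(key, value)
            self.tables[stage][h] = HashEntry(value)
            return True
        moved = True
        if not existing.collision:
            # First collision on this hash value: set the collision bit of the
            # stored entry and re-add the existing entry to the next stage.
            existing.collision = True
            self.tables[stage][h].collision = True
            moved = self.add(existing.key, existing.value, stage + 1)
        # The new entry is not stored at this stage; it goes to the next stage.
        return self.add(key, value, stage + 1) and moved
```

As a usage example with deliberately weak, hypothetical hash functions, MultiStageTable([len, lambda k: ord(k[0]), lambda k: ord(k[-1])]) places "ab" in the first stage, and a subsequent add of "cd" (same length) sets the collision bit of the first-stage entry and stores both keys in the second stage.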
Referring to
Referring to
The step of adding table information may include a step of adding third management table information and fourth management table information to the (k+1)-th management table and adding first hash table information and second hash table information to the (k+1)-th hash table if second management table information having the same hash value as a hash value included in the first management table information is present in the k-th management table. A key included in the first management table information may be the same as a key included in the third management table information, a key included in the second management table information may be the same as a key included in the fourth management table information, and a hash value included in the third management table information and a hash value included in the fourth management table information may correspond to values calculated using the hash function of the (k+1)-th management table. The first hash table information and the second hash table information may include a hash value obtained by hashing a key with a hash function, a value, and a collision bit. The first hash table information and the second hash table information may not include a key.
k may correspond to an integer in a range equal to or greater than 2 and less than n.
The first management table information may include a key, a hash value obtained by hashing the key with a hash function, a value, and a collision bit. The hash function of the k-th management table may be the same as the hash function of the k-th hash table. The step of adding table information may include a step of adding the first management table information to the k-th management table and adding third hash table information to the k-th hash table if the second management table information having the same hash value as the hash value included in the first management table information is not present in the k-th management table. The collision bit included in the first management table information and the collision bit included in the third hash table information may be set to 0. The hash function of the k-th hash table and the hash function of the (k+1)-th hash table may be independent from each other.
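One possible way to obtain a distinct hash function for each stage is to derive every stage's function from a single primitive using a stage-specific salt. The sketch below is an assumption made for illustration: the present disclosure does not name a hash primitive, the function stage_hash is hypothetical, and salting BLAKE2b only approximates independence between stages in practice.

```python
import hashlib


def stage_hash(key: str, stage: int, digest_bytes: int = 16) -> int:
    """Hash a key with a stage-specific salt so each stage uses a different function.

    The stage number is encoded into the BLAKE2b salt (at most 16 bytes).
    """
    salt = stage.to_bytes(2, "big")
    digest = hashlib.blake2b(
        key.encode("utf-8"), digest_size=digest_bytes, salt=salt
    ).digest()
    return int.from_bytes(digest, "big")


if __name__ == "__main__":
    # The same key yields unrelated hash values at different stages.
    print(hex(stage_hash("/a/b", 1)))
    print(hex(stage_hash("/a/b", 2)))
```

With functions derived in this way, two keys that collide in the k-th table are unlikely to collide again in the (k+1)-th table, which is why the later stages can remain small.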
Each element of the apparatus or method in accordance with the present invention may be implemented in hardware or software, or a combination of hardware and software. The functions of the respective elements may be implemented in software, and a microprocessor may be implemented to execute the software functions corresponding to the respective elements.
Various embodiments of systems and techniques described herein can be realized with digital electronic circuits, integrated circuits, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. The various embodiments can include implementation with one or more computer programs that are executable on a programmable system. The programmable system includes at least one programmable processor, which may be a special purpose processor or a general purpose processor, coupled to receive and transmit data and instructions from and to a storage system, at least one input device, and at least one output device. Computer programs (also known as programs, software, software applications, or code) include instructions for a programmable processor and are stored in a “computer-readable recording medium.”
The computer-readable recording medium may include all types of storage devices on which computer-readable data can be stored. The computer-readable recording medium may be a non-volatile or non-transitory medium such as a read-only memory (ROM), a random access memory (RAM), a compact disc ROM (CD-ROM), magnetic tape, a floppy disk, or an optical data storage device. In addition, the computer-readable recording medium may further include a transitory medium such as a data transmission medium. Furthermore, the computer-readable recording medium may be distributed over computer systems connected through a network, and computer-readable program code can be stored and executed in a distributive manner.
Although operations are illustrated in the flowcharts/timing charts in this specification as being sequentially performed, this is merely an exemplary description of the technical idea of one embodiment of the present disclosure. In other words, those skilled in the art to which one embodiment of the present disclosure belongs may appreciate that various modifications and changes can be made without departing from essential features of an embodiment of the present disclosure, that is, the sequence illustrated in the flowcharts/timing charts can be changed and one or more operations of the operations can be performed in parallel. Thus, flowcharts/timing charts are not limited to the temporal order.
Although exemplary embodiments of the present disclosure have been described for illustrative purposes, those skilled in the art will appreciate that various modifications, additions, and substitutions are possible, without departing from the idea and scope of the claimed invention. Therefore, exemplary embodiments of the present disclosure have been described for the sake of brevity and clarity. The scope of the technical idea of the present embodiments is not limited by the illustrations. Accordingly, one of ordinary skill would understand that the scope of the claimed invention is not to be limited by the above explicitly described embodiments but by the claims and equivalents thereof.
Number | Date | Country | Kind |
---|---|---|---|
10-2023-0067411 | May 2023 | KR | national |