The present application claims priority under 35 U.S.C. § 119(a) to Korean Patent Application No. 10-2023-0139537, filed on Oct. 18, 2023, which is incorporated herein by reference in its entirety.
Various embodiments of the present disclosure relate to a computational storage device and a data processing system including the computational storage device.
A storage device may be configured to store data that are provided by an external host device in response to a write request from the host device. Furthermore, the storage device may be configured to provide the host device with the stored data in response to a read request from the host device. The host device is an electronic device capable of processing data, and may include a computer, a digital camera, or a mobile phone. The storage device may operate by being embedded in the host device or may be fabricated in a separable form and operate by being electrically coupled to the host device. The storage device may include a memory device for storing data.
A computational storage device may be a storage device capable of performing data processing and computation operations under the control of a host device. The computational storage device may reduce the data processing burden and the data processing delay time of the host device. Furthermore, when the computational storage device processes large data, the computational storage device can effectively reduce the bandwidth burden by minimizing data movement to the host device.
In an embodiment of the present disclosure, a computational storage device may include a memory device configured to store therein a data unit group including a plurality of data units; and a controller configured to: calculate, based on a computational operation request, one or more logical addresses, to which a target data unit that is included in the data unit group has been allocated and is a target of a computational operation to be performed in response to the computational operation request, calculate a start logical address and a start offset of the target data unit based on information, the start logical address being included in the one or more logical addresses, read, from the memory device, data corresponding to the one or more logical addresses, and identify the target data unit in the read data. The computational operation request may include the information of a start logical address of the data unit group and a size of each of the plurality of data units.
In an embodiment of the present disclosure, a computational storage device may include a memory device configured to store therein a data unit group including a plurality of data units; and a controller configured to: calculate a first logical address, to which a selected part of a target data unit included in the data unit group has been allocated, and an unaligned flag for the target data unit, calculate, when the target data unit is determined to be unaligned data according to the unaligned flag, a second logical address, to which a remaining part of the target data unit has been allocated, read, from the memory device, first read data corresponding to the first logical address and second read data corresponding to the second logical address, and obtain the target data unit from the first read data and the second read data.
In an embodiment of the present disclosure, a data processing system may include a computational storage device configured to allocate consecutive logical addresses to a data unit group when storing therein the data unit group; and a host device configured to transmit, to the computational storage device, a computational operation request for the data unit group. The computational storage device may perform a computational operation on the data unit group in response to the computational operation request.
Other features, aspects, and advantages of the present invention will become apparent from the following detailed description, the drawings, and the claims.
Hereinafter, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings.
The data processing system 10 may include a host device 100 and the computational storage device 200.
The host device 100 may control the computational storage device 200 by transmitting an operation request RQ to the computational storage device 200. The operation request RQ may include a write operation request, a read operation request, and a computational operation request. For example, the host device 100 may store data in the computational storage device 200 through the write operation request. Furthermore, the host device 100 may read data that have been stored in the computational storage device 200 through the read operation request.
Furthermore, the host device 100 may control, through the computational operation request, the computational storage device 200 to perform a computational operation for data that have been stored in the computational storage device 200. For example, the computational operation request may be a request for a computational operation for one or more data units of a data unit group 221 that has been stored in the computational storage device 200. The data unit may be a unit in which a computational operation is performed on the data unit group 221 depending on a computational operation type. As will be described later, the data unit may be a unit in which a computational function accelerator 212 generates an index. The data unit group 221 may consist of a plurality of data units.
For example, when the data unit group 221 is a database table, the data unit may be each row or each column that is included in the database table. When the data unit group 221 is an embedding table, the data unit may be each vector data unit that is included in the embedding table. Hereinafter, a data unit on which a computational operation is to be performed by a computational operation request, among the data units of the data unit group 221, may be referred to as a target data unit.
The operation request RQ may include operation information IF, that is, information that is necessary for the computational storage device 200 to perform an operation requested by the operation request RQ. The operation information IF of the computational operation request may include data unit group information and computational operation information.
The data unit group information may selectively include information with regard to the start logical address of the data unit group 221, a number of data units included in the data unit group 221, a number of elements that are included in each data unit, a size of the data unit, and a size of the element. The start logical address of the data unit group 221 may be the foremost logical address among logical addresses, to which the data unit group 221 has been allocated.
According to an embodiment, if the computational storage device 200 is previously aware of the start logical address of the data unit group 221, the start logical address of the data unit group 221 might not be transmitted as the operation information IF.
The computational operation information may selectively include information with regard to a computational operation type, respective sequence numbers of one or more target data units, a number of the target data units, one or more target offsets, a number of the target offsets, one or more condition operations, a number of the condition operations, one or more constants, and a number of the constants.
Specifically, the computational operation type may be the type of computational operation that has been requested by a computational operation request. The computational operation type of the database table may include a SELECT operation, a PROJECT operation, a REMOVE operation, an ADD operation, or a MODIFY operation. The computational operation type of the embedding table may include an average aggregation operation, a weighted average aggregation operation, a maximum value aggregation operation, or a sum aggregation operation.
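The aggregation operation types named above for an embedding table can be illustrated with a minimal Python sketch. The function name aggregate, the string labels for the operation types, and the normalization of the weighted average by the sum of the weights are assumptions made for illustration; the disclosure does not define them.

```python
def aggregate(vectors, kind, weights=None):
    """Aggregate equally sized vector data units element by element.

    `kind` labels are hypothetical names for the aggregation types.
    """
    columns = list(zip(*vectors))  # group the i-th elements of all vectors
    if kind == "sum":
        return [sum(col) for col in columns]
    if kind == "average":
        return [sum(col) / len(col) for col in columns]
    if kind == "max":
        return [max(col) for col in columns]
    if kind == "weighted_average":
        # Assumption: weights are normalized by their sum.
        total = sum(weights)
        return [sum(w * x for w, x in zip(weights, col)) / total for col in columns]
    raise ValueError(f"unknown aggregation type: {kind}")


vectors = [[1.0, 2.0], [3.0, 6.0]]
print(aggregate(vectors, "sum"))
print(aggregate(vectors, "weighted_average", weights=[1, 3]))
```

In this sketch, each inner list stands for one target data unit (one vector of the embedding table), so the result has as many elements as one vector.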
When each target data unit consists of a plurality of elements, a target offset may indicate an element on which a computational operation is to be performed in a corresponding target data unit. The condition operation may be an operation that is included in a condition (e.g., a WHERE clause in a computational operation request written in a structured query language (SQL) or an IF statement in a computational operation request written in another programming language) for a computational operation request. The constant may be a constant that is included in a condition for a computational operation request.
The computational storage device 200 may receive the operation request RQ from the host device 100 and perform an operation requested by the operation request RQ. For example, the computational storage device 200 may perform a write operation on the memory device 220 in response to a write operation request from the host device 100. Furthermore, the computational storage device 200 may perform a read operation on the memory device 220 in response to a read operation request from the host device 100.
Furthermore, the computational storage device 200 may perform a computational operation on one or more target data units that are included in the data unit group 221, in response to a computational operation request for the data unit group 221, and may transmit the operation result RS to the host device 100 as the results of the execution of the computational operation.
The computational storage device 200 may include a controller 210 and a memory device 220.
The controller 210 may control the memory device 220 in response to the operation request RQ. The controller 210 may calculate one or more logical addresses to which a target data unit included in the data unit group 221 has been allocated, in response to a computational operation request for the data unit group 221, may read, from the memory device 220, read data corresponding to the one or more logical addresses, may identify the target data unit in the read data, and may perform a computational operation on the target data unit.
The controller 210 can effectively calculate one or more logical addresses to which a target data unit has been allocated, regardless of whether the target data unit has a size greater than or less than a maximum size allocable to each logical address (i.e., a data size that the host device 100 can allocate for one logical address, for example, 512 B). Even when a target data unit is unaligned data that is allocated to two or more logical addresses, the controller 210 can effectively calculate the one or more logical addresses to which the target data unit has been allocated.
The controller 210 may include a main controller 211 and the computational function accelerator 212.
The main controller 211 may receive the operation request RQ from the host device 100 and may perform an operation requested by the operation request RQ. The main controller 211 may perform a write operation and a read operation on the memory device 220 in response to the operation request RQ.
When receiving a computational operation request from the host device 100, the main controller 211 may transmit the computational operation request or the operation information IF to the computational function accelerator 212. Furthermore, the main controller 211 may receive a logical address to which a target data unit has been allocated from the computational function accelerator 212 and may access the memory device 220 based on the logical address. Specifically, the main controller 211 may convert the logical address received from the computational function accelerator 212 into a physical address and may access a memory region corresponding to the physical address in the memory device 220. The main controller 211 may manage mapping information between the logical address and the physical address.
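The conversion of a logical address into a physical address through managed mapping information may be sketched in Python as follows. The class name MappingTable, its methods, and the append-only allocation of physical addresses are hypothetical choices for illustration and are not the mapping scheme of the main controller 211.

```python
class MappingTable:
    """Toy logical-to-physical map (allocation policy is an assumption)."""

    def __init__(self):
        self._l2p = {}       # logical address -> physical address
        self._next_free = 0  # next unused physical address

    def map_write(self, lba: int) -> int:
        """Place data for `lba` in a fresh memory region and record the mapping."""
        physical = self._next_free
        self._next_free += 1
        self._l2p[lba] = physical
        return physical

    def translate(self, lba: int) -> int:
        """Convert a logical address into its current physical address."""
        return self._l2p[lba]


table = MappingTable()
for lba in range(8):          # store data for LBA0 to LBA7
    table.map_write(lba)
old_physical = table.translate(1)
new_physical = table.map_write(1)  # rewriting LBA1 remaps it to another region
```

Rewriting a logical address here simply points it at a new physical address, which mirrors how modified data may be stored in another memory region and remapped.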
Although not illustrated, the main controller 211 may selectively include a host interface that is electrically coupled to the host device 100, and a memory controller, memory, and a processor that are electrically coupled to the memory device 220.
The computational function accelerator 212 may determine a target data unit that has been included in the data unit group 221 based on the operation information IF of the computational operation request and may determine one or more logical addresses to which the target data unit has been allocated.
Furthermore, the computational function accelerator 212 may determine a start offset and last offset of a target data unit based on the operation information IF of the computational operation request. The start offset of the target data unit may correspond to the number of bytes from the foremost location of read data corresponding to the start logical address of the target data unit to a location at which the target data unit is started. The last offset of the target data unit may correspond to the number of bytes from the foremost location of read data corresponding to the last logical address of the target data unit to a location at which the target data unit is ended. In the present disclosure, a foremost location of a data unit may refer to a location where the data unit starts within read data corresponding to a start logical address of the data unit.
Furthermore, when read data corresponding to one or more logical addresses to which a target data unit has been allocated are output by the memory device 220 under the control of the main controller 211, the computational function accelerator 212 may obtain the target data unit from the read data based on a start offset and last offset of the target data unit. The computational function accelerator 212 may perform a computational operation on the target data unit.
The computational function accelerator 212 may generate the index of a target data unit in response to the operation request RQ and may store the index of the target data unit in a separate location (e.g., an index cache 303 in
Furthermore, when the data unit group 221 is changed (e.g., when another data unit is added to the data unit group 221 or when a specific data unit is removed from the data unit group 221) in response to the operation request RQ of the host device 100, the computational function accelerator 212 can efficiently manage the data unit group 221 by modifying indices that have been stored in the index cache 303.
According to an embodiment, when storing the data unit group 221 in the memory device 220 under the control of the host device 100, the main controller 211 may allocate consecutive logical addresses to the data unit group 221. The main controller 211 may map the logical addresses to which the data unit group 221 has been allocated to physical addresses of memory regions in which the data unit group 221 has been stored, respectively, in the memory device 220. The main controller 211 may transmit the foremost logical address, among the logical addresses to which the data unit group 221 has been allocated, that is, the start logical address of the data unit group 221, to the host device 100. The host device 100 may previously designate the range of logical addresses to which the data unit group 221 may be allocated for the main controller 211. The main controller 211 may store the data unit group 221 in the memory device 220 and may then calculate one or more logical addresses that have been allocated to the target data unit in response to a computational operation request for the data unit group 221.
According to an embodiment, when storing the data unit group 221 in the computational storage device 200, the host device 100 may allocate the data unit group 221 to consecutive logical addresses. The host device 100 may transmit a write operation request for the data unit group 221 to the computational storage device 200 along with the logical addresses to which the data unit group 221 has been allocated.
The memory device 220 may operate under the control of the controller 210. The memory device 220 may perform a read operation, a write operation (namely, a program operation), and an erase operation under the control of the controller 210. The memory device 220 may receive a physical address from the controller 210 and may access a memory region corresponding to the physical address.
According to an embodiment, the host device 100 may be an electronic device capable of processing data, and may include a computer, a digital camera, a mobile phone, a drone, a server, and a transport system.
According to an embodiment, the computational storage device 200 may be a storage device including the computational function accelerator 212. The storage device may include a personal computer memory card international association (PCMCIA) card, a smart media card, a memory stick, various multimedia cards (e.g., an MMC, an eMMC, an RS-MMC, and MMC-micro), secure digital (SD) cards (e.g., SD, mini-SD, and micro-SD), universal flash storage (UFS), or a solid state drive (SSD).
According to an embodiment, the memory device 220 may include various types of memory, such as NAND flash memory, NOR flash memory, resistive random access memory (RRAM), phase change random access memory (PRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), or spin transfer torque random access memory (STTRAM).
Referring to
The table DT may be stored in the memory device 220 in row major ordering. When storing the table DT in the memory device 220 under the control of the host device 100, the main controller 211 may allocate the row data units R0 to R6 that have been sequentially listed to consecutive logical addresses LBA0 to LBA7 in a 512 B unit. In this case, each of the row data units R0 to R6 may be allocated to two logical addresses. The foremost logical address, among the logical addresses to which each row data unit has been allocated, may be referred to as the start logical address of the row data unit. A last logical address, among the logical addresses to which each row data unit has been allocated, may be referred to as the last logical address of the row data unit. When storing data allocated to each logical address in a memory region of the memory device 220, the main controller 211 may map each logical address to the physical address of the memory region.
For example, the row data unit No. 0 (R0) may be allocated to logical addresses LBA0 and LBA1. The logical address No. 0 (LBA0) may be the start logical address of the row data unit No. 0 (R0). The logical address No. 1 (LBA1) may be the last logical address of the row data unit No. 0 (R0). Anterior 512 B of the row data unit No. 0 (R0), that is, sixty-four elements from the column No. 0 to the column No. 63, may be allocated to the logical address No. 0 (LBA0). Furthermore, posterior 48 B of the row data unit No. 0 (R0), that is, six elements from the column No. 64 to the column No. 69, may be allocated to the logical address No. 1 (LBA1). In this case, after posterior 48 B of the row data unit No. 0 (R0), anterior 464 B of the row data unit No. 1 (R1), that is, fifty-eight elements from the column No. 0 to the column No. 57, may be allocated to the logical address No. 1 (LBA1).
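The split of each row data unit across logical addresses in this example can be reproduced with a short Python sketch. It assumes, consistently with the figures above, 70 columns of 8 B elements (560 B per row), a 512 B logical address unit, and a table that starts at the first byte of LBA0; the function name row_segments is hypothetical.

```python
LBA_SIZE = 512                   # bytes allocable to one logical address
VAL_SIZE = 8                     # bytes per element
COL_TOTAL = 70                   # columns No. 0 to No. 69
ROW_SIZE = COL_TOTAL * VAL_SIZE  # 560 B per row data unit


def row_segments(row_num):
    """Return (lba, first_col, last_col) for each logical address a row touches."""
    start = row_num * ROW_SIZE   # byte position of the row within the table
    end = start + ROW_SIZE       # exclusive end position
    segments = []
    pos = start
    while pos < end:
        lba = pos // LBA_SIZE
        chunk_end = min(end, (lba + 1) * LBA_SIZE)
        first_col = (pos - start) // VAL_SIZE
        last_col = (chunk_end - start) // VAL_SIZE - 1
        segments.append((lba, first_col, last_col))
        pos = chunk_end
    return segments


print(row_segments(0))  # row data unit No. 0 (R0)
print(row_segments(1))  # row data unit No. 1 (R1)
```

For R0 this yields columns No. 0 to No. 63 in LBA0 and columns No. 64 to No. 69 in LBA1, and for R1 it yields columns No. 0 to No. 57 in LBA1, matching the allocation described above.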
In this way, the row data units R0 to R6 may be sequentially allocated from the logical address No. 0 (LBA0) to the logical address No. 7 (LBA7). The start logical address of the table DT may be the foremost logical address No. 0 (LBA0), among the logical addresses LBA0 to LBA7 allocated to the table DT. The last logical address of the table DT may be the last logical address No. 7 (LBA7), among the logical addresses LBA0 to LBA7 allocated to the table DT. The logical address No. 0 (LBA0) is merely an example of the start logical address of the table DT. In the logical address space of the data processing system 10, each logical address may be represented by the number of bytes from the start location of the logical address No. 0 (LBA0) to the location at which that logical address starts. Accordingly, the logical addresses LBA0 to LBA7 may be represented as 0x0, 0x200, 0x400, 0x600, 0x800, 0xA00, 0xC00, and 0xE00, respectively.
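The byte representation of the logical addresses follows from multiplying each logical address number by the 512 B unit. A minimal Python sketch, in which the names LBA_SIZE and lba_to_byte_address are illustrative only:

```python
LBA_SIZE = 512  # bytes per logical address, as in the 512 B example


def lba_to_byte_address(lba_number: int, lba_size: int = LBA_SIZE) -> int:
    """Return the byte position at which the given logical address starts."""
    return lba_number * lba_size


# Byte representation of LBA0 to LBA7 (Python prints hex digits in lowercase).
byte_addresses = [hex(lba_to_byte_address(n)) for n in range(8)]
print(byte_addresses)
```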
According to an embodiment, when the size of each row data unit of a table is greater than 1024 B, each row data unit may be allocated to three or more logical addresses. Each row data unit that is allocated to a plurality of logical addresses may be unaligned data.
When the table DT is stored in a storage device not having a computational function, the host device 100 may directly allocate each row data unit of the table DT to the logical addresses LBA0 to LBA7 and may manage allocation information. Furthermore, if a computational operation for the row data unit No. 0 (R0) of the table DT has to be performed, the host device 100 may transmit a read operation request for the logical addresses LBA0 and LBA1 to the storage device. Furthermore, the host device 100 may obtain all read data corresponding to the logical address No. 0 (LBA0), among read data that are output by the storage device, and data of anterior 48 B, among read data corresponding to the logical address No. 1 (LBA1), as the row data unit No. 0 (R0). However, in accordance with an embodiment, the computational storage device 200 instead of the host device 100 can more efficiently perform a computational operation on the row data unit No. 0 (R0) by directly calculating the logical addresses LBA0 and LBA1 to which the row data unit No. 0 (R0) has been allocated.
Referring to
The operation information buffer 301 may store the operation information IF that is transmitted by the main controller 211. The operation information buffer 301 may transmit the operation information IF to a corresponding component under the control of another component (e.g., the index controller 302, the operator 306, or the merge controller 308).
The index controller 302 may calculate one or more logical addresses to which a target data unit of the data unit group 221 has been allocated based on the operation information IF and may output each of the one or more logical addresses to the main controller 211 as an output logical address OLBA. The one or more logical addresses of a target data unit may include start and last logical addresses of the target data unit.
Specifically, the index controller 302 may generate an index IDX of a target data unit based on the operation information IF and store the index IDX in the index cache 303. The index controller 302 may generate the index IDX in the unit of the target data unit. Accordingly, if the table DT has been stored in the memory device 220 in row major ordering, the index IDX may include information regarding a target row data unit. The index controller 302 may generate, based on the operation information IF, a plurality of indices respectively corresponding to a plurality of target row data units. The index cache 303 may store the indices of the plurality of row data units corresponding to the respective target row data units.
Specifically, the index IDX of each row data unit may include information with regard to a corresponding row data unit, for example, validity information, a sequence number of the row data unit, the start logical address of the row data unit, the start offset of the row data unit, the last logical address of the row data unit, and the last offset of the row data unit. The validity information may indicate whether a corresponding index IDX is valid.
According to an embodiment, the index controller 302 may calculate information to be included in the index IDX of each row data unit based on Equations F1 to F9. Equations F1 to F9 may be applicable when the data unit group or the table DT is allocated to the consecutive logical addresses and the size of each of the row data units is greater than a data size allocable to each logical address.
In Equation 1 (F1), the size (row_size) of the row data unit may be the size of one row data unit. The parameter “col_total” may be the number of columns that are included in the table DT, and the parameter “val_size” may be the size of one element.
In Equation 2 (F2), the start address (row_start_addr) of the row data unit may be the number of bytes from the foremost location of the table DT, that is, the first element to a location at which the row data unit is started. The start_addr may be the start logical address of the table DT. The parameter “row_num” may be the sequence number of the row data unit.
In Equation 3 (F3), the last address (row_last_addr) of the row data unit may be the number of bytes from the first element of the table DT to a location before the last 8 B (i.e., the last element) of the row data unit.
In Equation 4 (F4), the start logical address (start_lba) of the row data unit may be the foremost logical address among the logical addresses to which the row data unit is allocated.
In Equation 5 (F5), the last logical address (last_lba) of the row data unit may be the last logical address among the logical addresses to which the row data unit is allocated.
In Equation 6 (F6), the number (lba_total) of logical addresses may be the number of logical addresses to which the row data unit is allocated.
In Equation 7 (F7), the size difference (size_diff) may be a difference between the size (row_size) of the row data unit and a size (lba_size) corresponding to one logical address.
In Equation 8 (F8), the start offset (start_offset) of the row data unit may be the number of bytes from the foremost location of data corresponding to the start logical address (start_lba) of the row data unit to a location at which the row data unit is started.
In Equation 9 (F9), the last offset (last_offset) of the row data unit may be the number of bytes from the foremost location of data corresponding to the last logical address (last_lba) of the row data unit to a location at which the row data unit is ended.
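The quantities described above can be illustrated with a Python sketch. Equations F1 to F9 themselves are not reproduced in this text, so the expressions below are only one reading of the parameter descriptions and are not the disclosed equations. In particular, the sketch assumes that start_addr is given in bytes, that start_lba and last_lba are logical address numbers, and that last_offset is the exclusive end position of the row within the data of the last logical address. The names RowIndex and build_row_index are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RowIndex:
    valid: bool
    row_num: int
    start_lba: int
    start_offset: int
    last_lba: int
    last_offset: int


def build_row_index(row_num, start_addr=0, col_total=70, val_size=8, lba_size=512):
    """Derive index fields for one row data unit (an assumed reading of F1 to F9)."""
    row_size = col_total * val_size                       # size of one row
    row_start_addr = start_addr + row_num * row_size      # first byte of the row
    row_last_addr = row_start_addr + row_size - val_size  # first byte of last element
    start_lba = row_start_addr // lba_size
    last_lba = row_last_addr // lba_size
    start_offset = row_start_addr % lba_size
    # Assumption: exclusive end of the row inside the last logical address.
    last_offset = row_last_addr + val_size - last_lba * lba_size
    return RowIndex(True, row_num, start_lba, start_offset, last_lba, last_offset)


for row_num in range(7):
    print(build_row_index(row_num))
```

With the example table (70 columns of 8 B elements, 512 B per logical address), this reading places the row data unit No. 0 (R0) in LBA0 and LBA1 with a last offset of 48 B, which agrees with the anterior 48 B of LBA1 described for R0.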
If the index IDX of a target data unit has already been stored in the index cache 303, the index controller 302 may skip a process of generating the index IDX. Furthermore, the index controller 302 may refer to the index IDX that has already been stored in the index cache 303 in order to obtain a target data unit. According to an embodiment, indices that have been stored in the index cache 303 may be backed up in the memory device 220.
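The reuse of an already stored index may be sketched in Python as a lookup that generates an index only on a miss. The class IndexCache, the counter build_count, and the helper toy_build_index are hypothetical and stand in for the index cache 303 and the index generation of the index controller 302.

```python
class IndexCache:
    """Keeps one index per row sequence number and builds it only once."""

    def __init__(self, build_index):
        self._build_index = build_index  # callable that generates an index
        self._entries = {}
        self.build_count = 0             # how many indices were generated

    def lookup(self, row_num):
        if row_num not in self._entries:  # miss: generate and store the index
            self._entries[row_num] = self._build_index(row_num)
            self.build_count += 1
        return self._entries[row_num]     # hit: generation is skipped


def toy_build_index(row_num):
    """Placeholder index generator used only for this illustration."""
    return {"valid": True, "row_num": row_num}


cache = IndexCache(toy_build_index)
first = cache.lookup(0)
second = cache.lookup(0)  # served from the cache without regeneration
```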
The index controller 302 may sequentially output each of one or more logical addresses to which a target data unit has been allocated as the output logical address OLBA based on the index IDX. The main controller 211 may access the memory device 220 based on the output logical address OLBA that is output by the index controller 302. Specifically, the main controller 211 may convert the output logical address OLBA into a physical address of the memory device 220, and may control a read operation or write operation of the memory device 220 on a memory region corresponding to the physical address.
For example, for a computational operation that requires a read operation of the memory device 220, the index controller 302 may sequentially output each of one or more logical addresses to which a target row data unit has been allocated as the output logical address OLBA. The main controller 211 may control the memory device 220 to output read data RD including a target row data unit, based on the output logical address OLBA. The read data RD that are output by the memory device 220 may be stored in the read data buffer 305.
Furthermore, the index controller 302 may determine whether the read data RD including a target data unit has already been stored in the read data buffer 305. For example, the index controller 302 may determine whether the read data RD has already been stored in the read data buffer 305 by checking the read data buffer 305 through a buffer controller (not illustrated). If the read data RD has already been stored in the read data buffer 305, the index controller 302 might not output a logical address corresponding to the read data RD as the output logical address OLBA. Accordingly, the memory device 220 might not perform an unnecessary read operation. The index controller 302 may notify other components of the computational function accelerator 212 that a computational operation can be performed by using the read data RD that have already been stored in the read data buffer 305.
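Skipping logical addresses whose read data are already buffered may be sketched in Python as a simple filter. The function name lbas_to_output and the representation of the buffer contents as a set of logical address numbers are assumptions for illustration.

```python
def lbas_to_output(start_lba, last_lba, buffered_lbas):
    """Return only the logical addresses that still need a read operation."""
    return [
        lba
        for lba in range(start_lba, last_lba + 1)
        if lba not in buffered_lbas  # already in the read data buffer: skip
    ]


# A row allocated to LBA1 and LBA2 when the data of LBA1 are already buffered.
print(lbas_to_output(1, 2, {1}))
```

For example, after a row allocated to LBA0 and LBA1 has been read, a following row allocated to LBA1 and LBA2 would require only LBA2 to be read.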
Furthermore, the index controller 302 may generate metadata MT of a target data unit based on the index IDX of the target data unit and may store the metadata MT in the metadata buffer 304. For example, the metadata MT of a target row data unit may include information on which the target row data unit can be identified in the read data RD that are output by the memory device 220.
Specifically, the metadata MT of a target row data unit may include the sequence number of the target row data unit, the number of logical addresses to which the target row data unit has been allocated, the start offset of the target row data unit, and the last offset of the target row data unit. Accordingly, as will be described later, the operator 306 may identify, as a target row data unit, data from the start offset to the last offset in the read data RD corresponding to the number of logical addresses, based on the metadata MT of the target row data unit.
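Identifying a target row data unit in read data from its metadata may be sketched in Python as a slice. The sketch assumes that the read data of the logical addresses are concatenated in order and that the last offset is an exclusive end position within the data of the last logical address; the function name identify_target and the demo table image are hypothetical.

```python
LBA_SIZE = 512  # bytes per logical address
ROW_SIZE = 560  # 70 columns of 8 B elements


def identify_target(read_data, lba_total, start_offset, last_offset, lba_size=LBA_SIZE):
    """Cut the target row out of read data covering lba_total logical addresses."""
    # Assumption: last_offset counts bytes from the start of the last logical address.
    end = (lba_total - 1) * lba_size + last_offset
    return read_data[start_offset:end]


# Demo image: row r is filled with the byte value r, padded up to eight logical addresses.
image = b"".join(bytes([r]) * ROW_SIZE for r in range(7)).ljust(8 * LBA_SIZE, b"\xff")
read_data = image[1 * LBA_SIZE : 3 * LBA_SIZE]  # data of LBA1 and LBA2
row1 = identify_target(read_data, lba_total=2, start_offset=48, last_offset=96)
```

In the demo, the read data of LBA1 and LBA2 also contain the tail of R0 and the head of R2, and the slice keeps only the 560 B of R1.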
The index cache 303 may store and maintain the index IDX under the control of the index controller 302. The index IDX that has been stored in the index cache 303 may be reused in a computational operation according to another computational operation request.
The metadata buffer 304 may store the metadata MT under the control of the index controller 302. According to an embodiment, the metadata MT that have been stored in the metadata buffer 304 may be removed whenever a computational operation for a target data unit is completed.
The read data buffer 305 may store the read data RD that are output by the memory device 220.
The operator 306 may read selected data SD from the read data buffer 305, based on the operation information IF received from the operation information buffer 301 and the metadata MT received from the metadata buffer 304. The selected data SD may be a part of or a whole of the target data unit depending on a computational operation type. Furthermore, the operator 306 may perform a computational operation on the selected data SD and output an operation result RS.
For example, the operator 306 may identify a target row data unit in the read data RD that have been stored in the read data buffer 305, based on the metadata MT of the target row data unit. Furthermore, the operator 306 may perform a computational operation on the target row data unit based on the operation information IF.
According to an embodiment, the computational function accelerator 212 may further include an output data buffer (not illustrated) for storing the operation result RS. The operator 306 may perform a computational operation and then control data, which need to be output as the operation result RS, to be transmitted from the read data buffer 305 to the output data buffer. The computational function accelerator 212 may output the operation result RS from the output data buffer.
The write data buffer 307 may receive and store another data from the host device 100. The write data buffer 307 may output the another data as write data WT without any change or may output data that have been modified under the control of the merge controller 308 as the write data WD. The write data WT may be stored in the memory device 220 by the main controller 211.
The merge controller 308 may operate in a computational operation including an operation of modifying the read data RD and storing the modified data in the memory device 220 as the write data WD. Specifically, the merge controller 308 may read the read data RD that have been stored in the read data buffer 305, based on the operation information IF received from the operation information buffer 301 and the metadata MT received from the metadata buffer 304. Furthermore, the merge controller 308 may modify the read data RD based on another data ID that have been stored in the write data buffer 307, and may control the write data buffer 307 to output the modified data as the write data WD. In this case, the index controller 302 may output, as the output logical address OLBA, a logical address to which the write data has been allocated based on the operation information IF. The main controller 211 may control the memory device 220 to store the write data in another memory region, and may map the output logical address OLBA to a physical address of the another memory region.
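The modification of read data before they are written back may be sketched in Python as follows. The sketch assumes that a target offset is an element index within the row and that the new element has the same size as the element it replaces; the function name merge_element is hypothetical and is not the operation of the merge controller 308 itself.

```python
def merge_element(read_data: bytes, start_offset: int, tar_offset: int, new_element: bytes) -> bytes:
    """Overwrite one element of a row inside read data and return the merged data."""
    val_size = len(new_element)
    # Assumption: tar_offset is an element index counted from the row start.
    pos = start_offset + tar_offset * val_size
    merged = bytearray(read_data)
    merged[pos : pos + val_size] = new_element  # same length, so the size is unchanged
    return bytes(merged)


read_data = bytes(1024)                   # data of two logical addresses, all zero
new_element = (7).to_bytes(8, "little")   # replacement for one 8 B element
write_data = merge_element(read_data, start_offset=48, tar_offset=2, new_element=new_element)
```

Because only the element is replaced, the bytes of neighboring rows that share the same logical addresses are carried over unchanged into the write data.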
According to an embodiment, at least some of the various memory components that are included in the computational function accelerator 212 may be included in one memory device. According to an embodiment, at least some of the components other than the memory components that are included in the computational function accelerator 212 may be operated by one processor.
Referring to
The first computational operation request RQ1 may include the operation information IF_RQ1. The operation information IF_RQ1 may include the start logical address (start_addr) of the table DT, the number (row_total) of row data units R0 to R6 that are included in the table DT, the number (col_total) of columns that are included in the table DT, the size (val_size) of each element of the table DT, a computational operation type (type), one or more target offsets (tar_offset), the number (tar_offset_total) of target offsets (tar_offset), one or more condition operations (op), the number (op_total) of condition operations (op), one or more constants (const), and the number (const_total) of constants (const). The target offsets (tar_offset) may indicate elements on which a computational operation is to be performed in each target row data unit. The condition operations (op) may be operations that are included in the WHERE syntax of the first computational operation request RQ1. The constants may be constants that are included in the condition for the first computational operation request RQ1.
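As an illustration only, the fields of the operation information IF_RQ1 listed above can be grouped into a single record. The sketch below is not taken from the disclosure: the class name `OperationInfo`, the Python types, the string encoding of the operation type and condition operations, the derived `row_size` property, and all example values are assumptions chosen for readability. The field `op_type` stands for the parameter (type).

```python
from dataclasses import dataclass, field


@dataclass
class OperationInfo:
    """Hypothetical container for the operation information IF."""
    start_addr: int   # start logical address of the table
    row_total: int    # number of row data units
    col_total: int    # number of columns
    val_size: int     # size of each element in bytes
    op_type: str      # computational operation type, (type) in the text
    tar_offset: list[int] = field(default_factory=list)  # targeted elements
    op: list[str] = field(default_factory=list)          # condition operations
    const: list[int] = field(default_factory=list)       # constants

    @property
    def tar_offset_total(self) -> int:
        return len(self.tar_offset)

    @property
    def op_total(self) -> int:
        return len(self.op)

    @property
    def const_total(self) -> int:
        return len(self.const)

    @property
    def row_size(self) -> int:
        # Assumption: a row is col_total fixed-size elements stored back to back.
        return self.col_total * self.val_size


# Illustrative values only; none of them is stated by the request itself.
if_rq1 = OperationInfo(start_addr=0, row_total=7, col_total=70, val_size=8,
                       op_type="SELECT", tar_offset=[4, 6],
                       op=["gt", "lt"], const=[20, 100])
print(if_rq1.row_size, if_rq1.tar_offset_total)
```

Deriving the totals from the list lengths keeps the counts and the lists from disagreeing; a real request format may carry them as separate fields.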
First,
Referring to
Specifically, the index controller 302 may generate the index IDX_R0 of the row data unit No. 0 (R0) based on Equation 1 (F1) to Equation 9 (F9). The index IDX_R0 of the row data unit No. 0 (R0) may include validity information (valid), the sequence number (row_num) of the row data unit No. 0 (R0), the start logical address (start_Iba) of the row data unit No. 0 (R0), the start offset (start_offset) of the row data unit No. 0 (R0), the last logical address (last_Iba) of the row data unit No. 0 (R0), and the last offset (last_offset) of the row data unit No. 0 (R0).
As described above, if the index IDX_R0 of the row data unit No. 0 (R0) has already been stored in the index cache 303 before the first computational operation request RQ1 is received, the index controller 302 might not newly generate the index IDX_R0 of the row data unit No. 0 (R0).
Furthermore, the index controller 302 may generate the metadata MT_R0 of the row data unit No. 0 (R0) and store the metadata MT_R0 in the metadata buffer 304.
The index controller 302 may determine that the start logical address (start_Iba) of the row data unit No. 0 (R0) is the logical address No. 0 (LBA0) and the last logical address (last_Iba) of the row data unit No. 0 (R0) is the logical address No. 1 (LBA1). The index controller 302 may output the logical address No. 0 (LBA0) as the output logical address OLBA so that a read operation is performed on the logical address No. 0 (LBA0). Furthermore, the index controller 302 may output the logical address No. 1 (LBA1) as the output logical address OLBA so that a read operation is performed on the logical address No. 1 (LBA1). The main controller 211 may control the memory device 220 to output the read data RD0 corresponding to the logical address No. 0 (LBA0) and the read data RD1 corresponding to the logical address No. 1 (LBA1).
The read data buffer 305 may store the read data RD0 corresponding to the logical address No. 0 (LBA0) and the read data RD1 corresponding to the logical address No. 1 (LBA1).
The operator 306 may identify the row data unit No. 0 (R0), among the read data RD0 and RD1, based on the metadata MT_R0 of the row data unit No. 0 (R0). Specifically, the operator 306 may determine that the row data unit No. 0 (R0) has been allocated to two logical addresses because the number (Iba_total) of logical addresses is 2. Furthermore, the operator 306 may determine that the entire read data RD0 corresponding to the logical address No. 0 (LBA0) is a front part of the row data unit No. 0 (R0) based on the start offset (start_offset). Furthermore, the operator 306 may identify that the first 48 B of the read data RD1 corresponding to the logical address No. 1 (LBA1) is the remaining part of the row data unit No. 0 (R0) based on the last offset (last_offset).
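The identification step described above can be sketched as a slice over the concatenated read data. This is a minimal illustration under stated assumptions: each logical address is assumed to correspond to 512 B, the last offset is assumed to be end-exclusive (the number of bytes of the data unit that lie in the last block), and the names `Metadata` and `extract_unit` are hypothetical. The field `lba_total` stands for the number (Iba_total) of logical addresses.

```python
from dataclasses import dataclass

LBA_SIZE = 512  # assumed number of bytes per logical address


@dataclass
class Metadata:
    lba_total: int     # number of logical addresses the data unit spans
    start_offset: int  # offset of the unit's first byte in the first block
    last_offset: int   # bytes of the unit in the last block (assumed end-exclusive)


def extract_unit(blocks: list[bytes], mt: Metadata) -> bytes:
    """Cut one data unit out of the blocks read for its logical addresses."""
    if len(blocks) != mt.lba_total:
        raise ValueError("expected one block per logical address")
    joined = b"".join(blocks)
    end = (mt.lba_total - 1) * LBA_SIZE + mt.last_offset
    return joined[mt.start_offset:end]


# R0 in the example: all of RD0 plus the first 48 B of RD1.
rd0 = bytes([0xA0]) * LBA_SIZE
rd1 = bytes([0xA1]) * LBA_SIZE
r0 = extract_unit([rd0, rd1], Metadata(lba_total=2, start_offset=0, last_offset=48))
print(len(r0))  # 560
```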
Furthermore, the operator 306 may read the element of the column No. 4 (C4) and element of the column No. 6 (C6) of the row data unit No. 0 (R0) from the read data buffer 305 based on the target offsets (tar_offset) of the operation information IF_RQ1. Furthermore, the operator 306 may perform a condition operation for the first computational operation request RQ1 on the read elements based on the condition operations (op) and constants (const) of the operation information IF_RQ1. Specifically, the operator 306 may determine that the element of the column No. 4 (C4) of the row data unit No. 0 (R0) does not satisfy (i.e., a fail) the condition for the first computational operation request RQ1. Furthermore, the operator 306 may determine that the element of the column No. 6 (C6) of the row data unit No. 0 (R0) satisfies (i.e., a pass) the condition for the first computational operation request RQ1. As a result, the operator 306 may determine that the row data unit No. 0 (R0) does not satisfy (i.e., a fail) the condition for the first computational operation request RQ1. Accordingly, the row data unit No. 0 (R0) might not be output to the host device 100.
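The condition check above can be sketched as follows. Several details are assumptions made for the example rather than statements of the disclosure: the target offsets are treated as column indices, elements are treated as 8-byte little-endian signed integers, the condition operations are encoded as short strings, the individual conditions are combined with a logical AND (which matches the outcome for R0, where one failing element fails the row), and the comparison operators and constants are invented.

```python
import operator
import struct

VAL_SIZE = 8  # assumed element size in bytes

# Hypothetical encoding of the condition operations (op).
OPS = {"lt": operator.lt, "le": operator.le, "gt": operator.gt,
       "ge": operator.ge, "eq": operator.eq, "ne": operator.ne}


def row_passes(row: bytes, tar_offset: list[int], op: list[str],
               const: list[int]) -> bool:
    """Return True only if every targeted element satisfies its condition."""
    for col, name, c in zip(tar_offset, op, const):
        (value,) = struct.unpack_from("<q", row, col * VAL_SIZE)
        if not OPS[name](value, c):
            return False  # one failing element fails the whole row
    return True


def make_row(c4: int, c6: int, cols: int = 70) -> bytes:
    """Build an illustrative row with chosen values in columns 4 and 6."""
    values = [0] * cols
    values[4], values[6] = c4, c6
    return struct.pack(f"<{cols}q", *values)


# Illustrative condition: C4 > 20 AND C6 < 100.
print(row_passes(make_row(10, 50), [4, 6], ["gt", "lt"], [20, 100]))  # False
print(row_passes(make_row(30, 50), [4, 6], ["gt", "lt"], [20, 100]))  # True
```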
Next,
Referring to
The index controller 302 may determine that the start logical address (start_Iba) of the row data unit No. 1 (R1) is the logical address No. 1 (LBA1) and the last logical address (last_Iba) of the row data unit No. 1 (R1) is the logical address No. 2 (LBA2). In this case, for example, if the memory device 220 performs a read operation on the logical address No. 1 (LBA1) for a computational operation for the row data unit No. 0 (R0) or the read data RD1 corresponding to the logical address No. 1 (LBA1) is already present in the read data buffer 305, the index controller 302 may determine that another read operation for the logical address No. 1 (LBA1) is unnecessary. Accordingly, the index controller 302 may output only the logical address No. 2 (LBA2) as the output logical address OLBA so that the read operation is performed on the logical address No. 2 (LBA2). The main controller 211 may control the memory device 220 to output the read data RD2 corresponding to the logical address No. 2 (LBA2). The read data buffer 305 may store the read data RD1 corresponding to the logical address No. 1 (LBA1) and the read data RD2 corresponding to the logical address No. 2 (LBA2). The read data RD1 corresponding to the logical address No. 1 (LBA1) may be used in a computational operation for the row data unit No. 0 (R0), and may be reused in a computational operation for the row data unit No. 1 (R1).
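The read-skipping decision above amounts to filtering the logical addresses of a data unit against those already held in the read data buffer. The sketch below assumes the buffer can be modeled as a set of logical address numbers and does not model eviction or in-flight reads; the function name `lbas_to_read` is hypothetical.

```python
def lbas_to_read(start_lba: int, last_lba: int, buffered: set[int]) -> list[int]:
    """Return only the logical addresses that still need a read operation."""
    return [lba for lba in range(start_lba, last_lba + 1) if lba not in buffered]


buffered: set[int] = set()

first = lbas_to_read(0, 1, buffered)   # R0 spans LBA0 and LBA1
buffered.update(first)
second = lbas_to_read(1, 2, buffered)  # R1 spans LBA1 and LBA2
buffered.update(second)

print(first, second)  # [0, 1] [2]
```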
The operator 306 may identify the row data unit No. 1 (R1), among the read data RD1 and RD2, based on the metadata MT_R1 of the row data unit No. 1 (R1). Specifically, the operator 306 may determine that the row data unit No. 1 (R1) has been allocated to two logical addresses because the number (Iba_total) of logical addresses is 2. Furthermore, the operator 306 may determine that the remaining part of the read data RD1 corresponding to the logical address No. 1 (LBA1), except the first 48 B of the read data RD1, is the front part of the row data unit No. 1 (R1) based on the start offset (start_offset). Furthermore, the operator 306 may identify that the first 96 B of the read data RD2 corresponding to the logical address No. 2 (LBA2) is the remaining part of the row data unit No. 1 (R1) based on the last offset (last_offset).
Furthermore, the operator 306 may read the element of the column No. 4 (C4) and element of the column No. 6 (C6) of the row data unit No. 1 (R1) from the read data buffer 305 based on the target offsets (tar_offset) of the operation information IF_RQ1. Furthermore, the operator 306 may perform the condition operation for the first computational operation request RQ1 on the read elements based on the condition operations (op) and constants (const) of the operation information IF_RQ1. Specifically, the operator 306 may determine that the element of the column No. 4 (C4) of the row data unit No. 1 (R1) satisfies (i.e., a pass) the condition for the first computational operation request RQ1. Furthermore, the operator 306 may determine that the element of the column No. 6 (C6) of the row data unit No. 1 (R1) satisfies (i.e., a pass) the condition for the first computational operation request RQ1. As a result, the operator 306 may determine that the row data unit No. 1 (R1) satisfies (i.e., a pass) the condition for the first computational operation request RQ1. Accordingly, the operator 306 may control the row data unit No. 1 (R1) to be output from the read data buffer 305 to the host device 100.
Similar to the aforementioned method, the computational storage device 200 may perform the SELECT operation requested by the first computational operation request RQ1 on the row data unit No. 2 (R2) to row data unit No. 6 (R6) of the table DT.
According to the aforementioned method, the computational storage device 200 may also effectively perform a computational operation of a type other than a SELECT operation, as requested by a computational operation request, by identifying each row data unit and each column of the table DT.
The table DT has been stored in the memory device 220 in row major ordering. According to an embodiment, a table may be stored in the memory device 220 in column major ordering. The computational function accelerator 212 may operate similarly to the aforementioned method with respect to a table that has been stored in column major ordering. That is, the index controller 302 may calculate one or more logical addresses to which a target column data unit has been allocated and may generate the index IDX and the metadata MT in a column unit. Accordingly, the computational function accelerator 212 may effectively obtain a target column data unit from a table that has been stored in column major ordering in response to a computational operation request and may perform a computational operation on the target column data unit.
When receiving a computational operation request for a predetermined computational operation type (e.g., a MODIFY operation), the computational function accelerator 212 may output, as the output logical address OLBA, only a logical address selected among one or more logical addresses to which a target data unit has been allocated. Accordingly, the computational storage device 200 can more efficiently process a computational operation request because an unnecessary read operation is omitted.
Referring to
Furthermore, the index controller 302 may determine that the column No. 3 (C3) of the row data unit No. 2 (R2) is included in the read data RD2 corresponding to the logical address No. 2 (LBA2) based on the start offset (start_offset) of the row data unit No. 2 (R2). Specifically, the index controller 302 may determine that, in the read data RD2 corresponding to the logical address No. 2 (LBA2), the 8 B that follow the first 120 B are the column No. 3 (C3) of the row data unit No. 2 (R2). Accordingly, although the row data unit No. 2 (R2) has been allocated to the logical address No. 2 (LBA2) and the logical address No. 3 (LBA3), the index controller 302 may output only the logical address No. 2 (LBA2) as the output logical address OLBA.
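The selection of a single logical address for a MODIFY operation can be sketched as simple offset arithmetic. The example assumes row major ordering with fixed-size elements, so that column c begins c × val_size bytes after the start of the row, and assumes 512 B per logical address. The start offset of 96 B for R2 is inferred from the preceding example (R1 ends 96 B into the read data RD2), and the function name `locate_element` is hypothetical.

```python
def locate_element(start_lba: int, start_offset: int, col: int,
                   val_size: int, lba_size: int = 512) -> tuple[list[int], int]:
    """Return the logical addresses holding column `col` and its offset in the first one."""
    pos = start_offset + col * val_size  # bytes from the start of block start_lba
    first_lba = start_lba + pos // lba_size
    offset = pos % lba_size
    last_lba = start_lba + (pos + val_size - 1) // lba_size
    return list(range(first_lba, last_lba + 1)), offset


# R2 starts 96 B into LBA2; C3 is the fourth 8-byte element of the row.
lbas, offset = locate_element(start_lba=2, start_offset=96, col=3, val_size=8)
print(lbas, offset)  # [2] 120
```

Only the returned logical addresses need to be output as the output logical address OLBA, which is why the read of LBA3 can be skipped in this example.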
The merge controller 308 may modify the element of the column No. 3 (C3) of the row data unit No. 2 (R2) in the read data RD2 corresponding to the logical address No. 2 (LBA2) based on the operation information IF and the metadata MT_R2 of the row data unit No. 2 (R2). As in the operating method described with respect to the operator 306, the merge controller 308 may identify the element of the column No. 3 (C3) of the row data unit No. 2 (R2) in the read data RD2 based on the metadata MT_R2 of the row data unit No. 2 (R2). The modified data may be output as the write data WD. Thereafter, the main controller 211 may store the write data WT corresponding to the logical address No. 2 (LBA2) in another memory region of the memory device 220, and may map the logical address No. 2 (LBA2) to a physical address of the another memory region.
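The modify-and-remap sequence above can be illustrated with a toy model. Nothing here reflects an actual controller interface: `merge_block` and `TinyMap` are hypothetical names, the logical-to-physical map is reduced to a dictionary over an append-only list of regions, and the element is assumed to be an 8-byte little-endian integer located 120 B into the block, as in the example for C3 of R2.

```python
import struct


def merge_block(read_data: bytes, offset: int, new_bytes: bytes) -> bytes:
    """Return a copy of the block with new_bytes written at offset."""
    end = offset + len(new_bytes)
    if offset < 0 or end > len(read_data):
        raise ValueError("modified range must lie inside the block")
    return read_data[:offset] + new_bytes + read_data[end:]


class TinyMap:
    """Toy out-of-place store: every write lands in a new memory region."""

    def __init__(self) -> None:
        self.regions: list[bytes] = []
        self.l2p: dict[int, int] = {}  # logical address -> region number

    def write(self, lba: int, data: bytes) -> int:
        self.regions.append(data)
        self.l2p[lba] = len(self.regions) - 1
        return self.l2p[lba]

    def read(self, lba: int) -> bytes:
        return self.regions[self.l2p[lba]]


dev = TinyMap()
old_pa = dev.write(2, bytes(512))                          # original data of LBA2
wd = merge_block(dev.read(2), 120, struct.pack("<q", 77))  # modify C3 of R2
new_pa = dev.write(2, wd)                                  # store elsewhere, remap LBA2
print(old_pa, new_pa)  # 0 1
```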
If the read data RD2 corresponding to the logical address No. 2 (LBA2) have already been stored in the read data buffer 305, a read operation on the logical address No. 2 (LBA2) might not need to be performed either, and the MODIFY operation may be performed by using the read data RD2 that have already been stored in the read data buffer 305.
Referring to
Furthermore, the index controller 302 may allocate the another row data unit No. 1 to the logical addresses LBA7 and LBA8, and may generate the index IDX_NR1 of the another row data unit No. 1. The index IDX_NR1 of the another row data unit No. 1 may include 1 as the sequence number (row_num) of the row data unit. Furthermore, the index IDX_NR1 of the another row data unit No. 1 may include the logical address No. 7 (LBA7) as the start logical address (start_Iba) of the another row data unit No. 1 and include the logical address No. 8 (LBA8) as the last logical address (last_Iba) of the another row data unit No. 1. That is, a front part of the another row data unit No. 1 may be allocated to the logical address No. 7 (LBA7) after a rear part of the row data unit No. 7 (i.e., a previous row data unit No. 6). A rear part of the another row data unit No. 1 may be allocated to the logical address No. 8 (LBA8).
The front part of the another row data unit No. 1 may be merged with the rear part of the row data unit No. 7 (i.e., the previous row data unit No. 6) by the merge controller 308. Specifically, the index controller 302 may operate so that the read data RD corresponding to the logical address No. 7 (LBA7) are output by the memory device 220. The merge controller 308 may generate the write data WT by merging the rear part of the row data unit No. 7 (i.e., the previous row data unit No. 6) and the front part of the another row data unit No. 1 that are included in the read data RD corresponding to the logical address No. 7 (LBA7). The main controller 211 may store the write data WT corresponding to the logical address No. 7 (LBA7) in another memory region of the memory device 220, and may map the logical address No. 7 (LBA7) to a physical address of the another memory region.
According to an embodiment, the host device 100 can designate in advance, to the computational storage device 200, the range of logical addresses to which the table DT can be allocated. The computational function accelerator 212 may allocate the another data ID that are added to the table DT to logical addresses within the range designated by the host device 100. Accordingly, overhead of the host device 100 can be further reduced because the host device 100 does not need to manage the logical addresses to which the table DT is allocated.
Referring to
The method of processing the table DT, which has been described with reference to
Referring to
Referring to FIG. 9B, when storing the table ET in the memory device 220 under the control of the host device 100, the main controller 211 may allocate the sequentially listed vector data units V0 to Vm to consecutive logical addresses LBA0 to LBAi in units of 512 B. For example, the start logical address of the table ET may be the logical address No. 0 (LBA0), that is, 0x0. Each vector data unit may be allocated to one or two logical addresses. A vector data unit that is allocated to only one logical address might not be unaligned data, and a vector data unit that is allocated to two logical addresses may be unaligned data.
For example, all elements from the vector data unit No. 0 (V0) to the vector data unit No. 3 (V3) may be allocated to the logical address No. 0 (LBA0). Accordingly, the vector data unit No. 0 (V0) to the vector data unit No. 3 (V3) might not be unaligned data.
For example, the first 32 B of the vector data unit No. 4 (V4) may be allocated to the logical address No. 0 (LBA0), and the last 88 B of the vector data unit No. 4 (V4) may be allocated to the logical address No. 1 (LBA1). Accordingly, the vector data unit No. 4 (V4) may be unaligned data. The logical address No. 0 (LBA0) may be the start logical address (or a first logical address) of the vector data unit No. 4 (V4). The logical address No. 1 (LBA1) may be the last logical address (or a second logical address) of the vector data unit No. 4 (V4).
The computational storage device 200 may receive a computational operation request for the table ET from the host device 100. The operation information IF that is included in the computational operation request may selectively include information with regard to the start logical address of the table ET, a size of an element, a vector dimension, a computational operation type, and respective sequence numbers of one or more target data units (i.e., target vector data units).
The computational function accelerator 212 may process the table ET similar to the method of processing the table DT illustrated in
Specifically, the index controller 302 may generate the index IDX of a target vector data unit based on the operation information IF. The index controller 302 may generate a plurality of indices corresponding to a plurality of target vector data units, respectively, based on the operation information IF.
More specifically, the index IDX of each vector data unit may include information with regard to the vector data unit, for example, validity information, the sequence number of the vector data unit, the start logical address of the vector data unit, the start offset of the vector data unit, the last logical address of the vector data unit, and the last offset of the vector data unit.
According to an embodiment, the index controller 302 may calculate information that is to be included in the index IDX of each vector data unit based on Equations F10 to F18. Equations F10 to F18 may be applicable when the data unit group or the table ET is allocated to consecutive logical addresses and the size of each vector data unit is less than the data size allocable to each logical address.
In Equation 10 (F10), the size (vt_size) of the vector data unit may be the size of one vector data unit. The parameter “vt_dim” may be the vector dimension. The parameter “val_size” may be the size of one element.
In Equation 11 (F11), the start address (vt_start_addr) of the vector data unit may be the number of bytes from the first element of the table ET to a location at which the vector data unit is started. The parameter “start_addr” may be the start logical address of the table ET.
The parameter “vt_num” may be the sequence number of the vector data unit.
In Equation 12 (F12), the start logical address (start_Iba) of the vector data unit may be the foremost logical address, to which the vector data unit is allocated among one or more logical addresses.
In Equation 13 (F13), the start offset (start_offset) of the vector data unit may be the number of bytes from the foremost location of data corresponding to the start logical address (start_Iba) of the vector data unit to a location at which the vector data unit is started.
In Equation 14 (F14), the unaligned flag (unaligned_flag) may indicate whether the vector data unit is unaligned data or not. For example, the unaligned flag (unaligned_flag) of a vector data unit that is not unaligned data may be calculated as 0. The unaligned flag (unaligned_flag) of unaligned data may be calculated as 1.
In Equation 15 (F15) and Equation 17 (F17), the last logical address (last_Iba) of the vector data unit may be the last logical address, to which the vector data unit is allocated among one or more logical addresses. As illustrated in Equation 15 (F15), the last logical address (last_Iba) of a vector data unit that is not unaligned data may be the same as the start logical address (start_Iba). As illustrated in Equation 17 (F17), the last logical address (last_Iba) of unaligned data may be a logical address subsequent to the start logical address (start_Iba). The parameter “Iba_size” may be the size corresponding to one logical address.
In Equation 16 (F16) and Equation 18 (F18), the last offset (last_offset) of the vector data unit may be the number of bytes from the foremost location of data corresponding to the last logical address (last_Iba) of the vector data unit to a location at which the vector data unit is ended.
According to an embodiment, when a vector data unit is not unaligned data, the calculation of the last logical address (last_Iba) according to Equation 15 (F15) and the calculation of the last offset (last_offset) according to Equation 16 (F16) may be omitted. Furthermore, the operator 306 may simply obtain the vector data unit by taking, from the read data RD corresponding to the start logical address (start_Iba) of the vector data unit, data of the size (vt_size) of the vector data unit beginning at the start offset (start_offset) of the vector data unit.
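One way to realize the quantities described for Equations F10 to F18 is sketched below. The equations themselves are not reproduced in this text, so the arithmetic is a reconstruction from the parameter descriptions and should be read as an assumption, not as the equations of the disclosure. In particular, the sketch assumes that `start_addr` is a logical address number that is converted to bytes by multiplying by the logical address size, that the last offset is end-exclusive, and that an example vector has 30 elements of 4 B each (120 B, matching the 32 B + 88 B split of V4). The fields `start_lba`, `last_lba`, and `lba_size` stand for (start_Iba), (last_Iba), and Iba_size.

```python
from dataclasses import dataclass


@dataclass
class VectorIndex:
    valid: bool
    vt_num: int
    start_lba: int
    start_offset: int
    unaligned_flag: int
    last_lba: int
    last_offset: int


def make_index(start_addr: int, vt_num: int, vt_dim: int, val_size: int,
               lba_size: int = 512) -> VectorIndex:
    vt_size = vt_dim * val_size                               # cf. F10
    vt_start_addr = start_addr * lba_size + vt_num * vt_size  # cf. F11 (assumed form)
    start_lba = vt_start_addr // lba_size                     # cf. F12
    start_offset = vt_start_addr % lba_size                   # cf. F13
    unaligned_flag = int(start_offset + vt_size > lba_size)   # cf. F14
    if unaligned_flag == 0:
        last_lba = start_lba                                  # cf. F15
        last_offset = start_offset + vt_size                  # cf. F16
    else:
        last_lba = start_lba + 1                              # cf. F17
        last_offset = start_offset + vt_size - lba_size       # cf. F18
    return VectorIndex(True, vt_num, start_lba, start_offset,
                       unaligned_flag, last_lba, last_offset)


idx_v3 = make_index(start_addr=0, vt_num=3, vt_dim=30, val_size=4)
idx_v4 = make_index(start_addr=0, vt_num=4, vt_dim=30, val_size=4)
print(idx_v3.start_offset, idx_v3.unaligned_flag)                   # 360 0
print(idx_v4.start_offset, idx_v4.last_lba, idx_v4.last_offset)     # 480 1 88
```

Because each vector data unit is assumed to be smaller than one logical address, a unit can cross at most one boundary, which is why the unaligned case only ever adds 1 to the start logical address.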
The index controller 302 may output, as the output logical address OLBA, one or two logical addresses to which a target vector data unit has been allocated based on the index IDX of the target vector data unit. If required read data RD have already been stored in the read data buffer 305, the index controller 302 might not output a logical address corresponding to the required read data RD as the output logical address OLBA.
Furthermore, the index controller 302 may store the metadata MT of a target vector data unit in the metadata buffer 304 based on the index IDX of the target vector data unit. The metadata MT of the target vector data unit may include information based on which the target vector data unit can be identified in the read data RD that are output by the memory device 220. Specifically, the metadata MT of the target vector data unit may include the sequence number of the target vector data unit, the number of logical addresses to which the target vector data unit has been allocated, the start offset of the target vector data unit, and the last offset of the target vector data unit. Accordingly, the computational function accelerator 212 can effectively perform a computational operation requested by a computational operation request by identifying a target vector data unit of the table ET based on the metadata MT of the target vector data unit.
According to an embodiment, the metadata MT of the target vector data unit may include the unaligned flag instead of the number of logical addresses to which the target vector data unit has been allocated.
Referring to
The index controller 302 may determine that the start logical address (start_Iba) of the vector data unit No. 3 (V3) is the logical address No. 0 (LBA0). The index controller 302 may determine that the vector data unit No. 3 (V3) is not unaligned data because the unaligned flag (unaligned_flag) of the vector data unit No. 3 (V3) is 0. Accordingly, the index controller 302 may determine that the last logical address (last_Iba) of the vector data unit No. 3 (V3) is the logical address No. 0 (LBA0). The index controller 302 may output the logical address No. 0 (LBA0) as the output logical address OLBA so that a read operation is performed on the logical address No. 0 (LBA0). The main controller 211 may control the memory device 220 so that read data RD0 corresponding to the logical address No. 0 (LBA0) are output.
The operator 306 may identify the vector data unit No. 3 (V3), among the read data RD, based on the metadata (MT_V3) of the vector data unit No. 3 (V3). Specifically, the operator 306 may determine that the vector data unit No. 3 (V3) has been allocated to one logical address because the number (Iba_total) of logical addresses is 1. Furthermore, the operator 306 may identify the vector data unit No. 3 (V3), among the read data RD0 corresponding to the logical address No. 0 (LBA0), based on the start offset (start_offset) and last offset (last_offset) of the vector data unit No. 3 (V3). The operator 306 may perform a computational operation on the identified vector data unit No. 3 (V3).
Referring to FIG. 10B, the index controller 302 may generate the index (IDX_V4) of the vector data unit No. 4 (V4) based on Equation 10 (F10) to Equation 18 (F18). As described above, if the index (IDX_V4) of the vector data unit No. 4 (V4) has already been stored in the index cache 303, the index controller 302 might not newly generate the index (IDX_V4) of the vector data unit No. 4 (V4). Furthermore, the index controller 302 may generate the metadata (MT_V4) of the vector data unit No. 4 (V4) and store the metadata (MT_V4) in the metadata buffer 304.
The index controller 302 may determine that the start logical address (start_Iba) of the vector data unit No. 4 (V4) is the logical address No. 0 (LBA0). The index controller 302 may determine that the vector data unit No. 4 (V4) is unaligned data because the unaligned flag (unaligned_flag) of the vector data unit No. 4 (V4) is 1. Accordingly, the index controller 302 may determine that the last logical address (last_Iba) of the vector data unit No. 4 (V4) is the logical address No. 1 (LBA1). In this case, for example, if the memory device 220 performs a read operation on the logical address No. 0 (LBA0) for a computational operation for the vector data unit No. 3 (V3) or the read data RD0 corresponding to the logical address No. 0 (LBA0) are already present in the read data buffer 305, the index controller 302 may determine that another read operation for the logical address No. 0 (LBA0) is not necessary. Accordingly, the index controller 302 may output only the logical address No. 1 (LBA1) as the output logical address OLBA so that a read operation is performed on the logical address No. 1 (LBA1). The main controller 211 may control the memory device 220 to output read data RD1 corresponding to the logical address No. 1 (LBA1).
The operator 306 may identify the vector data unit No. 4 (V4), among the read data RD0 and RD1, based on the metadata (MT_V4) of the vector data unit No. 4 (V4). Specifically, the operator 306 may determine that the vector data unit No. 4 (V4) has been allocated to two logical addresses because the number (Iba_total) of logical addresses is 2. Furthermore, the operator 306 may determine that the remaining part of the read data RD0 corresponding to the logical address No. 0 (LBA0) except the first 480 B is a front part of the vector data unit No. 4 (V4) based on the start offset (start_offset). Furthermore, the operator 306 may identify that the first 88 B of the read data RD1 corresponding to the logical address No. 1 (LBA1) is the remaining part of the vector data unit No. 4 (V4) based on the last offset (last_offset). The operator 306 may perform a computational operation on the identified vector data unit No. 4 (V4).
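The two lookups walked through above (V3, then V4) can be simulated end to end to show that the second lookup triggers only one additional read. The sketch is a toy model under stated assumptions: the table is assumed to start at LBA0, each logical address is assumed to hold 512 B, each vector data unit is assumed to be 120 B, and `FakeDevice` and `fetch_vector` are hypothetical names that stand in for the memory device, the read data buffer, and the operator's identification step.

```python
LBA_SIZE = 512  # assumed bytes per logical address
VT_SIZE = 120   # vector size implied by the example (32 B + 88 B)


class FakeDevice:
    """Toy memory device that records which logical addresses were read."""

    def __init__(self, image: bytes) -> None:
        self.image = image
        self.reads: list[int] = []

    def read(self, lba: int) -> bytes:
        self.reads.append(lba)
        return self.image[lba * LBA_SIZE:(lba + 1) * LBA_SIZE]


def fetch_vector(dev: FakeDevice, buffer: dict[int, bytes], vt_num: int) -> bytes:
    """Read only the missing blocks, then slice the vector out of the buffer."""
    start = vt_num * VT_SIZE
    start_lba, start_offset = divmod(start, LBA_SIZE)
    last_lba = (start + VT_SIZE - 1) // LBA_SIZE
    for lba in range(start_lba, last_lba + 1):
        if lba not in buffer:  # skip blocks already held in the read data buffer
            buffer[lba] = dev.read(lba)
    joined = b"".join(buffer[lba] for lba in range(start_lba, last_lba + 1))
    return joined[start_offset:start_offset + VT_SIZE]


image = bytes(i % 251 for i in range(2 * LBA_SIZE))
dev = FakeDevice(image)
buf: dict[int, bytes] = {}

v3 = fetch_vector(dev, buf, 3)  # lies entirely in LBA0
v4 = fetch_vector(dev, buf, 4)  # 32 B in LBA0, 88 B in LBA1
print(dev.reads)  # [0, 1]
```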
The method of processing the embedding table ET, which has been described with reference to
According to an embodiment, the computational storage device 200 can autonomously determine one or more logical addresses to which a target data unit has been allocated although the one or more logical addresses to which the target data unit for a computational operation has been allocated are not received from the host device 100. Furthermore, the computational storage device 200 can identify a target data unit in read data corresponding to logical addresses although the size of the target data unit is not identical with a size corresponding to the logical addresses. As a result, the computational storage device 200 can effectively perform a computational operation requested by the host device 100, and performance of the data processing system 10 can be improved.
The above description is merely a description of the technical spirit of the present technology, and those skilled in the art may change and modify the present technology in various ways without departing from the essential characteristic of the present technology. Accordingly, the disclosed embodiments should not be construed as limiting the technical spirit of the present technology but should be construed as describing the technical spirit of the present technology. The technical spirit of the present technology is not restricted by the embodiments. The range of protection of the present technology should be construed based on the following claims, and all technical spirits within an equivalent range of the present technology should be construed as being included in the scope of rights of the present technology. Therefore, the scope of the present disclosure should not be limited to the above-described embodiments but should include the equivalents thereof.
In the above-described embodiments, all operations may be selectively performed, or part of the operations may be omitted. In each embodiment, the operations are not necessarily performed in accordance with the described order and may be rearranged. The embodiments disclosed in this specification and drawings are only examples to facilitate an understanding of the present disclosure, and the present disclosure is not limited thereto. That is, it should be apparent to those skilled in the art that various modifications can be made on the basis of the technological scope of the present disclosure.
The embodiments of the present disclosure have been described in the drawings and specification. Although specific terminologies are used here, those are only to describe the embodiments of the present disclosure. Therefore, the present disclosure is not restricted to the above-described embodiments and many variations are possible within the scope of the present disclosure. It should be apparent to those skilled in the art that various modifications can be made on the basis of the technological scope of the present disclosure in addition to the embodiments disclosed herein. Furthermore, the embodiments may be combined to form additional embodiments.
Number | Date | Country | Kind |
---|---|---|---|
10-2023-0139537 | Oct 2023 | KR | national |