This application claims priority from Korean Patent Application No. 10-2022-0006723 filed on Jan. 17, 2022 in the Korean Intellectual Property Office, and all the benefits accruing therefrom under 35 U.S.C. 119, the contents of which in their entirety are herein incorporated by reference.
The present disclosure relates to an electronic system of a real-time operating system, an operating method thereof, and an operating method for a memory device.
Solid state drives (SSDs) are gradually expanding from the personal user market, such as laptop computers and smartphones, into other markets, such as large-scale data centers and high-performance enterprise systems. At the same time, the functional requirements applied to individual products are becoming more diverse and complex. A real-time operating system (RTOS) may be applied to storage devices in an effort to perform product development and maintenance faster and more smoothly while satisfying these varied and complex requirements. The storage device may reduce the burden on a developer by separating complex requirements into individual threads and delegating scheduling overhead to an RTOS kernel.
However, in order to utilize the RTOS efficiently, a separate stack must be allocated to each thread. Each time a thread executes code, a stack frame is pushed for each called function onto the stack allocated to that thread, and thus the performance of the thread is directly affected by where its stack is located in the memory hierarchy of the system.
System developers may place a stack in a high-performance memory to process threads more efficiently; however, in embedded systems to which SSDs belong, high-performance memory is so limited that stack overflow is likely to occur.
Aspects of the present disclosure provide an electronic system for efficient thread management in a real-time operating system (RTOS) under resource constraints, and an operating method thereof.
Aspects of the present disclosure also provide an electronic system in which efficiency of resource utilization is improved, performance is improved, and stack overflow does not occur in an RTOS, and an operating method thereof.
However, aspects of the present disclosure are not restricted to those set forth herein. The above and other aspects of the present disclosure will become more apparent to one of ordinary skill in the art to which the present disclosure pertains by referencing the detailed description of the present disclosure as given below.
According to an aspect of the present disclosure, there is provided an operating method of an electronic system in a real-time operating system, the operating method comprising: getting a call graph by performing static code analysis on at least one thread that corresponds to a task; getting a stack usage of the thread and a call probability for each node by performing runtime profiling of the call graph; allocating a threshold value of a stack size for a first memory area by taking into account the call graph, the call probability for each node, and the stack usage; expanding and storing a stack from the first memory area to a second memory area according to a comparison result between the threshold value and a stack usage of the first memory area; and returning the stack to the first memory area when execution is completed in the second memory area, wherein the electronic system comprises a memory device configured to include the first memory area and the second memory area.
According to another aspect of the present disclosure, there is provided an electronic system comprising: a real-time operating system (RTOS) module configured to process a command from a host by dividing the command into at least one thread; a call graph module configured to get a call graph by performing static code analysis on the thread; a runtime profiler configured to get a stack usage of each node and a call probability for each node by performing runtime profiling of the call graph; and a memory device configured to include a first memory area and a second memory area, wherein the RTOS module is configured to allocate a stack space that corresponds to the thread to the first memory area on the basis of a threshold value based on the stack usage of each node and the call probability for each node, store a stack corresponding to the thread in the allocated stack space in the first memory area, and connect and store a subsequent stack to the second memory area in response to an overflow alarm based on the call probability for each node and the stack usage of each node being generated.
According to still another aspect of the present disclosure, there is provided an operating method of a memory device in a real-time operating system, the memory device comprising a first memory area and a second memory area, the operating method comprising: compiling a task into multiple threads and exploring a call graph by static code analysis on each thread; getting a stack usage of each node and a call probability for each node by performing dynamic code analysis on the explored call graph; setting a stack size of the first memory area based on the dynamic code analysis result; storing stack frames corresponding to the thread in sequence in the first memory area; and storing a subsequent stack frame in the second memory area in response to an overflow alarm for the first memory area being generated.
It should be noted that the effects of the present disclosure are not limited to those described above, and other effects of the present disclosure will be apparent from the following description.
The above and other aspects and features of the present disclosure will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings, in which:
Hereinafter, an electronic system according to some embodiments of the present disclosure will be described with reference to the accompanying drawings.
An electronic system 1 may include a host 10 and a storage system 2. The storage system 2 may include a storage controller 20-1 and at least one storage device 201.
In some embodiments, the host 10 may be a computing device such as a personal computer, a server, a notebook computer, a tablet computer, a smartphone, or a cellular phone, but the present disclosure is not limited thereto.
The host 10 may communicate with a memory system by using an interface protocol, such as peripheral component interconnect-express (PCI-E), advanced technology attachment (ATA), serial ATA (SATA), parallel ATA (PATA), serial attached SCSI (SAS), or compute eXpress link (CXL). The interface protocols between the host 10 and the storage system 2 are not limited to the examples listed above, and the interface protocol used may be one of other interface protocols, such as universal serial bus (USB), multi-media card (MMC), enhanced small disk interface (ESDI), or integrated drive electronics (IDE).
According to some embodiments, each of the at least one storage device 201 may be a non-volatile memory (NVM) device, such as a flash memory, a magnetoresistive random access memory (MRAM), a phase-change random access memory (PRAM), a ferroelectric random access memory (FeRAM), or the like.
The host 10 may include a processor 11 and a memory controller 20-2.
The processor 11 may include one or more central processing unit (CPU) cores, and the memory controller 20-2 may be connected to at least one external memory device 202 to control the external memory device 202. According to some embodiments, the processor 11 may include an accelerator block, i.e., a dedicated circuit configured to perform high-speed data computation such as artificial intelligence (AI) data computation. For example, the accelerator block may be implemented as a graphics processing unit (GPU), a neural processing unit (NPU), and/or a data processing unit (DPU), and may be implemented as a separate chip that is physically independent of the other components of the processor 11.
The external memory device 202 may be one or more dynamic random access memories (DRAMs), such as double data rate synchronous dynamic random access memory (DDR SDRAM), low power double data rate (LPDDR) SDRAM, graphics double data rate (GDDR) SDRAM, and Rambus dynamic random access memory (RDRAM).
The host 10 may communicate with the external memory device 202 based on one of interface standards, such as DDR, LPDDR, GDDR, wide I/O, high bandwidth memory (HBM), hybrid memory cube (HMC), or CXL.
According to some embodiments, the storage controller 20 may include a host interface 21, an RTOS module 22, a processor 23, a runtime profiler 24, a call graph module 25, a machine learning module 26, a memory interface 27, and a memory 100, and each of the components 21 to 27 and 100 may be connected to a system bus.
The storage controller 20 may communicate with the host 10 via the host interface 21. The host interface 21 may communicate based on the interface standards described above.
The processor 23 controls the overall operation of the storage controller 20. For example, upon receiving a command from the host 10 or the host processor 11, the processor 23 generates at least one thread for executing the command, for example, multiple threads.
The RTOS module 22 may be executed by the processor 23. The RTOS module 22 may perform scheduling in order to process various threads within a predetermined time. The RTOS module 22 schedules threads in states such as ready, running, and blocked, and pushes them to an internal memory 100. According to the number of threads defined by a designer of the electronic system, the RTOS module 22 allocates a stack used by each thread to the internal memory 100.
The internal memory 100 is divided into at least one memory layer according to some embodiments. For example, the internal memory 100 may include a first memory area M1 and a second memory area M2.
The internal memory 100 may be used as a buffer memory, i.e., an operating memory of the storage controller 20. According to some embodiments, the internal memory 100 may be a memory of one type, or memories of several types. For example, the internal memory 100 may be an SRAM, a DRAM, or a combination of an SRAM and a DRAM.
According to some embodiments, the RTOS module 22 allocates a stack size for each thread to the memory 100, and pushes a stack frame to the memory when processing a thread. For example, when the electronic system starts to operate, the RTOS module 22 first pushes/pops stack frames that correspond to a thread to the first memory area M1. However, since the size of the first memory area M1 is limited, the RTOS module 22 may set the stack size of the first memory area according to a threshold value so as to prevent stack overflow from occurring in the first memory area M1, and may generate an overflow alarm according to predetermined conditions. When an overflow alarm occurs while a thread is pushed to the first memory area M1, the thread is then connected to and pushed to the second memory area M2.
According to some embodiments, the current stack usage is compared with a threshold value, and when the current stack usage exceeds the threshold value, the overflow alarm may be generated. In one example, when a stack is used within a red zone in a range preset based on a threshold value, an overflow alarm may be generated. For example, when a stack signature is detected in the first memory area, an overflow alarm may be generated. For example, an overflow alarm may be generated by using call chain history information in which whether an overflow occurs is recorded by comparing a call graph, a call chain for each node, and a call probability for each node, which are obtained by the call graph module and the runtime profiler, with the threshold value. That is, the overflow alarm may be generated by comparing the current usage and the call chain history information.
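By way of illustration only, the following minimal C sketch shows how such an alarm check might be structured. The names (overflow_alarm, RED_ZONE_BYTES, STACK_SIGNATURE) and the red-zone width are assumptions rather than terms fixed by the disclosure, and the call-chain-history condition is omitted for brevity.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

#define STACK_SIGNATURE 0xdeadbeefu  /* sentinel word (see the example below) */
#define RED_ZONE_BYTES  64u          /* assumed red-zone width */

/* Returns true when subsequent pushes should be redirected to the
 * second memory area M2; 'threshold' is the stack size allocated to
 * the thread in the first memory area M1. */
static bool overflow_alarm(size_t current_usage, size_t threshold,
                           const uint32_t *next_slot)
{
    /* Current stack usage exceeds the threshold value. */
    if (current_usage > threshold)
        return true;

    /* Usage has entered the red zone preset just below the threshold. */
    if (threshold - current_usage <= RED_ZONE_BYTES)
        return true;

    /* The next slot to be written still holds the stack signature. */
    if (*next_slot == STACK_SIGNATURE)
        return true;

    return false;
}
```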
The call graph module 25 performs static code analysis on threads to be processed by the RTOS module 22. The static code analysis analyzes the code constituting each thread to identify the call chains from a root function of the thread. That is, the call graph module 25 may create a call graph including all call chains reachable from a root function of a thread, and extract the longest call chain, i.e., the maximum call chain depth.
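As an illustrative sketch only, the call graph and the maximum call chain depth could be represented and computed as follows in C; the structure layout and names are hypothetical, and an acyclic call graph is assumed.

```c
#include <stddef.h>

/* One node of the call graph built by static code analysis.
 * The layout and names are hypothetical. */
struct cg_node {
    const char       *name;        /* call function name, e.g. "Foo" */
    size_t            stack_bytes; /* filled in later by the runtime profiler */
    struct cg_node  **callees;     /* call functions reachable from this one */
    size_t            n_callees;
};

/* Depth-first search for the deepest call chain reachable from 'node',
 * counting the node itself; calling this on the root function yields
 * the maximum call chain depth of the thread. Assumes no recursion
 * (acyclic graph). */
static size_t max_call_chain_depth(const struct cg_node *node)
{
    size_t deepest = 0;
    for (size_t i = 0; i < node->n_callees; ++i) {
        size_t d = max_call_chain_depth(node->callees[i]);
        if (d > deepest)
            deepest = d;
    }
    return deepest + 1;
}
```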
The runtime profiler 24 performs dynamic code analysis on the call graph. For example, the runtime profiler 24 performs runtime profiling of the call graph to get a stack usage to be allocated to each thread in real time. Accordingly, the RTOS module 22 may get the stack usage based on the maximum call chain depth obtained by the call graph module, i.e., obtain the maximum stack usage.
Also, the runtime profiler 24 may obtain the call probabilities for each node of the call graph while getting the stack usage of the thread-specific call function. That is, the runtime profiler 24 gets the probability that the call function is used for each call chain that starts from the root function and is connected to several nodes, and generates an augmented call graph. The augmented call graph will be described below.
The machine learning module 26 may learn in advance a threshold value that prevents an overflow from occurring in a memory, by taking into account the call graph generated by the call graph module 25 as well as the stack usage for each node and the maximum stack usage generated by the runtime profiler 24. According to some embodiments, the machine learning module 26 may store, in a map table, a threshold value that is learned in advance by taking into account various thread-specific call functions, call graphs, stack usages, the stack size of the memory, and the like. Alternatively, according to some embodiments, the machine learning module 26 may be in the form of a computation module that can derive an appropriate threshold value from the correlations of various thread-specific call functions, call graphs, stack usages, and the stack size of the memory.
The machine learning module 26 may store information on a pre-learned overflow alarm, for example, overflow prediction information, according to some embodiments. For example, the machine learning module 26 may store the overflow occurrence probability with respect to the thread, the call graph, the stack usage of each node, and the call probabilities for each node in a map table.
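A minimal C sketch of such a map table follows; the entry layout, names, and values are illustrative assumptions, not contents prescribed by the disclosure.

```c
#include <stddef.h>

/* Hypothetical map-table entry pairing a profiled thread with a
 * pre-learned threshold and overflow probability; values are
 * illustrative only. */
struct threshold_entry {
    unsigned thread_id;
    size_t   max_stack_bytes;  /* maximum stack usage from profiling */
    size_t   threshold_bytes;  /* pre-learned stack size for M1 */
    double   overflow_prob;    /* learned probability of overflow */
};

static const struct threshold_entry map_table[] = {
    { 0u, 1200u * 1024u, 700u * 1024u, 0.01 },
    { 1u,  100u * 1024u, 100u * 1024u, 0.00 },
};

/* Look up the learned threshold; fall back to a conservative value
 * (e.g., the maximum stack usage) for unknown threads. */
static size_t lookup_threshold(unsigned thread_id, size_t fallback)
{
    for (size_t i = 0; i < sizeof map_table / sizeof map_table[0]; ++i)
        if (map_table[i].thread_id == thread_id)
            return map_table[i].threshold_bytes;
    return fallback;
}
```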
The memory interface 27 may be an interface that is connected to the storage device 201 described above.
According to some embodiments, the first memory area M1 and the second memory area M2 may be obtained by dividing one memory into two areas. According to some embodiments, a plurality of memories may be respectively classified into the first memory area M1 and the second memory area M2. For example, the first memory area M1 may be an SRAM, and the second memory area M2 may be a typical operating memory. In another example, the first memory area M1 may be a tightly coupled memory for the RTOS module 22, and the second memory area M2 may be a typical operating memory.
The first memory area M1 may access data at a faster speed than that of the second memory area M2. The RTOS module 22 may first push a thread to the first memory area M1, and then push a thread to the second memory area when a stack overflow alarm is generated in the first memory area M1.
According to some embodiments, in the first iteration of thread processing, the stack usage of the threads is within the allocated stack sizes (2 KB, 2 KB, 2 KB, 2 KB) in all allocated areas, and thus the iteration may be performed without any problem.
According to some embodiments, in the second iteration of thread processing, when the stack usage of a call function exceeds the allocated stack size of 1 KB, as shown in the second section A-②, stack overflow may occur.
According to some embodiments, in the third iteration, a stack is used within a larger allocated stack size (A-③), and thus stack overflow does not occur.
In order to cope with the various overflow situations described above, the stack size would need to be conservatively determined based on the maximum stack usage of the maximum call chain in the call graph of each thread. However, since the resource of the first memory area M1 is limited and a stack is allocated for each thread, the memory may be used inefficiently: the resource must be allocated even when the allocated area is rarely used in practice.
Thus, the RTOS module 22 should allocate a stack size for each thread to the first memory area by taking into account an appropriate stack usage rather than the maximum stack usage. Hereinafter, allocation of an appropriate stack usage according to some embodiments will be described.
For example, it is assumed that the thread includes a call chain divided from a root function Root() into three nodes. It is assumed that the thread is divided from a root function Root() into call function A(), call function B(), and call function C() and each call function constitutes a call chain including at least one independent call function. According to various embodiments, the call functions may intersect, link, or depend on one another, but for convenience of description, it is assumed that each call chain is independent of one another. In the illustrated example, call function A() has a call chain connected to dependent call function Foo() and dependent call function Goo(), call function B() does not have a separate dependent call function, and call function C() has a call chain connected to dependent call function Hoo().
The runtime profiler 24 gets the call probabilities for each node while getting the stack usage, i.e., memory usage, for each call function by performing runtime profiling of the call graph. In the illustrated example, the runtime profiler 24 may get the stack usage of a first node as [A, Foo, Goo]=[200 Kilobytes, 400 Kilobytes, 600 Kilobytes], the stack usage of a second node as [B]=[100 Kilobytes], and the stack usage of a third node as [C, Hoo]=[300 Kilobytes, 400 Kilobytes].
The runtime profiler 24 may extract a call chain with the maximum call chain depth on the basis of each call chain length of the call graph and the stack usage of each call function. In the illustrated example, the total stack usage when the first node based on call function A() is used is 200+400+600=1200 Kilobytes. The total stack usage of the second node based on call function B() is 100 Kilobytes, and the total stack usage of the third node based on call function C() is 300+400=700 Kilobytes. The runtime profiler 24 may set the first node, having the maximum value of 1200 Kilobytes, to be the maximum call chain based on the total stack usage, and determine that the maximum stack size is 1200 Kilobytes.
The runtime profiler 24 may get the call probabilities for each node by predicting the frequency at which each call function is called, while profiling the call graph. With each probability expressed as a value between 0 and 1, it is predicted that, in the illustrated example, the first node based on call function A() is called with a probability of 0.01, the second node based on call function B() is called with a probability of 0.9, and the third node based on call function C() is called with a probability of 0.7.
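The trade-off that follows from these numbers can be made concrete with a small sketch. The scoring rule below (expected kilobytes spilled to the second memory area per iteration) is an assumption for illustration; the disclosure does not prescribe a formula.

```c
#include <stdio.h>
#include <stddef.h>

struct chain { const char *name; size_t total_kb; double call_prob; };

int main(void)
{
    /* Augmented call graph from the example: total stack usage per
     * call chain and the profiled call probability of its head node. */
    const struct chain chains[] = {
        { "A-Foo-Goo", 1200, 0.01 },
        { "B",          100, 0.90 },
        { "C-Hoo",      700, 0.70 },
    };
    const size_t n = sizeof chains / sizeof chains[0];

    /* Try each chain total as a candidate threshold and estimate the
     * kilobytes per iteration expected to spill into M2. */
    for (size_t t = 0; t < n; ++t) {
        size_t th = chains[t].total_kb;
        double spill = 0.0;
        for (size_t c = 0; c < n; ++c)
            if (chains[c].total_kb > th)
                spill += chains[c].call_prob * (double)(chains[c].total_kb - th);
        printf("threshold %4zu KB: expected M2 spill %6.1f KB per iteration\n",
               th, spill);
    }
    return 0;
}
```

Run as-is, this sketch shows that the 700-Kilobyte chain of node C keeps the expected expansion to the second memory area small while leaving far less idle space than the 1200-Kilobyte maximum, which mirrors the reasoning in the following paragraphs.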
For example, it is assumed that the total stack usage of the first node A is set to be the threshold value. Based on the call probabilities for each node, the second node B or the third node C is called more frequently than the first node A. In this case, among the stack areas of the first memory area allocated to threads, a stack area corresponding to the stack usage difference between the first node A and the second node B (1200 − 100 = 1100 Kilobytes) is allocated as space that remains unused and empty until the first node is called. Likewise, a stack area corresponding to the stack usage difference between the first node A and the third node C (1200 − 700 = 500 Kilobytes) remains empty until the first node is called. The larger the empty space is, the more difficult it is to allocate stack sizes for other threads.
Meanwhile, in the case where the total stack usage of the second node is set to be the threshold value of the stack size by taking into account only the call probabilities for each node, executions of the call functions of the third node and the first node would frequently expand from the first memory area to the second memory area by the difference between the threshold value and the stack usage. Since the second memory area may have a larger memory capacity than the first memory area but lower processing performance, frequent expansion to the second memory area may lower the overall performance of the electronic system 1.
Therefore, the RTOS module 22 may set the threshold value by taking into account the call probabilities for each node and the stack usage of each node, according to some embodiments, in order to provide an appropriate empty space that is allocated as a space for each thread for processing and to prevent stack overflow. For example, the stack usage of a call chain having a high call probability for each node and a high stack usage of each node may be set as a threshold value.
In the case where the total stack usage of the third node C is set as a threshold value (Th=500 Kilobytes), when the second node or the third node is called, the thread may be processed without stack overflow. However, when the first node A is called, the stack size exceeding the threshold value may be expanded to the second memory area and processed. That is, when the first node A is called, the first memory area M1 is preferentially used, but the stack area exceeding the threshold value when call function Foo() of the first node is processed may be connected to the second memory area M2 to process and push the remaining stack area of call function Foo() and call function Goo().
It is assumed that data is stored in the first memory area M1 by executing call function A() of the first node, and that call function Foo() is then executed. In order to connect from the first memory area to the second memory area, a predetermined address of the second memory area is required, and the two separate areas may be connected by communicating the specified address.
The first memory area M1 in the illustrated example may store data from bottom to top, i.e., in a direction in which the address increases (low address → high address). However, according to various embodiments, the first memory area M1 may instead store data in a direction in which the address decreases (high address → low address). In the following description, a stack that grows in the direction of increasing addresses is assumed, but the present disclosure is not limited thereto.
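A minimal model of the upward-growing convention assumed in the following description (names illustrative only):

```c
#include <stdint.h>
#include <stddef.h>

#define M1_WORDS 1024u

static uint32_t m1[M1_WORDS];  /* first memory area M1 */
static size_t   sp = 0;        /* stack index; grows low address -> high */

/* Push one word in the direction of increasing addresses; a full M1
 * is where the expansion to M2 described below would take over. */
static int push_word(uint32_t value)
{
    if (sp >= M1_WORDS)
        return -1;
    m1[sp++] = value;
    return 0;
}
```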
In a context switching situation at a specific point in time, when call function Foo() is executed, a plurality of stack frames are accumulated in the first memory area M1. At this time, the threshold value, i.e., the stack size, is 500 Kilobytes, so the area remaining after the stack usage already consumed by call function A() (200 Kilobytes) is subtracted from the stack size is 300 Kilobytes. If only this remaining area were used, overflow would occur, since the stack usage of call function Foo() exceeds the stack size (i.e., the threshold value) allocated to the first memory area (300 Kilobytes of remaining area < 400 Kilobytes of stack usage of call function Foo()). In order to prevent overflow, the RTOS module 22 may expand from the first memory area M1 to the second memory area M2 when context switching takes place between threads. The RTOS module 22 may generate an overflow alarm when overflow is likely to occur.
According to some embodiments, when the overflow alarm is generated, the RTOS module 22 stores a callee's stack pointer (SP) Add_x1 used in the first memory area M1 and a link register value LR(original), and copies the stack frame X most recently stored in the first memory area M1 to the second memory area M2. In this case, it is assumed that the stack frame X most recently stored in the first memory area M1 starts from address Add_x1 and has addresses up to address Add_x2. The stack frame X may include, for example, local variables, saved registers R4, R5, and R6, and the link register value LR(original), and has a stack size smaller than 400 Kilobytes since the stack frame X is one of a plurality of stack frames of call function Foo().
That is, the RTOS module 22 copies the callee's SP of the first memory area M1 to address Add_y1 of the second memory area M2 (callee's SP(new)), and copies the stack frame X to a stack frame Y. For example, the callee's SP of the first memory area M1 is copied to address Add_y1 of the second memory area M2 using an expandTo function, and the local variables, saved registers R4, R5, and R6, and link register value LR(original) included in the stack frame X are stored in the second memory area M2.
Since the copied stack frame Y is stored in the second memory area M2, the stacks stored in the first memory area M1 and the stacks stored in the second memory area M2 are discontinuous. In the illustrated example, the stacks at the addresses Add_x1 to Add_x2 and the stacks after the address Add_y1 are not contiguous.
The second memory area M2 holds the copied stack frame, and the link register value LR(original) is changed to a return location Add_x2, which is a separate epilogue function address (LR(backTo)), in order to later return to the first memory area when thread execution is completed. Thereafter, the RTOS module 22 executes call function Foo() and call function Goo() up to address Add_y3, and connects the execution flow to the call point of the caller function by executing the epilogue function. That is, the original link register value LR(original) is restored by calling a backTo function while returning to the pre-stored stack pointer SP(caller) (i.e., Add_x2) of the first memory area M1. The RTOS module 22 returns the executed stack to the first memory area M1 on the basis of the restored stack pointer and link register value, so that the execution flow continues seamlessly.
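The actual return-address redirection is architecture-specific and would normally be done in assembly; the following portable C sketch only models the bookkeeping described above — saving the callee's SP and LR(original), copying frame X to frame Y, and swapping the link register for an epilogue hook. All names (expand_to, back_to, struct frame) are illustrative assumptions, not the disclosed implementation.

```c
#include <stdint.h>
#include <string.h>

/* Simplified model of the stack frame X at the moment of expansion. */
struct frame {
    uint32_t  locals[4];   /* local variables */
    uint32_t  r4, r5, r6;  /* saved registers R4, R5, R6 */
    uintptr_t lr;          /* link register value LR(original) */
};

static struct frame m2_area[64];   /* stand-in for the second memory area M2 */
static uintptr_t saved_sp;         /* callee's SP in M1 (Add_x1) */
static uintptr_t saved_lr;         /* LR(original), restored on return */

/* Epilogue hook: a real back_to would restore SP(caller) and LR(original)
 * so execution resumes in the first memory area M1. */
static void back_to(void) { }

/* Model of expandTo: copy frame X from M1 into M2 as frame Y, and
 * redirect its link register to the epilogue hook (LR(backTo)). */
static struct frame *expand_to(struct frame *x_in_m1, uintptr_t sp_in_m1)
{
    saved_sp = sp_in_m1;                 /* remember Add_x1 */
    saved_lr = x_in_m1->lr;              /* remember LR(original) */

    struct frame *y = &m2_area[0];       /* frame Y at Add_y1 */
    memcpy(y, x_in_m1, sizeof *y);       /* copy locals, R4-R6, LR */
    y->lr = (uintptr_t)&back_to;         /* LR(backTo): route return to M1 */
    return y;                            /* callee continues on M2 */
}
```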
For convenience of description, the epilogue function, the backTo function, and the like are described. However, these are merely examples, and it will be apparent that these functions may be referred to differently according to various embodiments.
According to some embodiments, the electronic system may use a frame pointer to indicate a stack usage of a call function included in a thread. In execution of a specific call function included in a thread, the RTOS module 22 may build a stack from stack pointer Add_x1, and, when the frame pointer indicating the stack usage of the specific call function enters the red zone, the RTOS module 22 may generate an overflow alarm and expand to the second memory area in the manner described above.
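A hedged sketch of such a red-zone test for an upward-growing stack, with hypothetical names, might look as follows.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* True when the frame pointer of the running call function has entered
 * the red zone just below the threshold of an upward-growing stack. */
static bool frame_in_red_zone(uintptr_t stack_base,     /* e.g. Add_x1 */
                              uintptr_t frame_pointer,
                              size_t    threshold_bytes,
                              size_t    red_zone_bytes)
{
    uintptr_t limit = stack_base + threshold_bytes;
    return frame_pointer >= limit - red_zone_bytes && frame_pointer < limit;
}
```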
According to some embodiments, the stack signature may be a preset code value 0xdeadbeef.
The RTOS module 22 may insert the stack signature at a preset position based on a set threshold value. In one example, the stack signature may be inserted into the red zone described above.
When the RTOS module 22 detects the stack signature in the first memory area M1 while building a stack, the RTOS module 22 connects and expands an address from the first memory area M1 to the second memory area M2. As for the connection and expansion of an address, the address may be connected to the second memory area as described above.
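For illustration, planting and detecting such a signature might be sketched in C as follows; only the 0xdeadbeef value comes from the example above, and the helper names are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

#define STACK_SIGNATURE 0xdeadbeefu   /* preset code value from the example */

/* Plant the signature at the word marking the threshold position
 * in the thread's stack area within the first memory area M1. */
static void plant_signature(uint32_t *stack_base, size_t threshold_words)
{
    stack_base[threshold_words - 1] = STACK_SIGNATURE;
}

/* While building a stack, test whether the next slot still holds the
 * signature; if so, the threshold has been reached and the stack
 * should be connected and expanded to the second memory area M2. */
static bool signature_hit(const uint32_t *next_slot)
{
    return *next_slot == STACK_SIGNATURE;
}
```

One caveat of this scheme: a legitimate stack word that happens to equal the signature could trigger a spurious expansion, which is one reason the disclosure also describes threshold- and red-zone-based alarm conditions.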
According to some embodiments, a designer or user of the electronic system may arbitrarily insert code that calls the expand function into a stack frame which corresponds to the thread. For example, the RTOS module 22 connects and expands an address from the first memory area M1 to the second memory area M2 by calling the preset expand function after a frame size based on a stack pointer indicating the start of the stack frame. As for the connection and expansion of an address, the address may be connected to the second memory area as described above.
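As a sketch of such developer-inserted code (all names hypothetical, and the expansion service stubbed so the example is self-contained):

```c
#define FRAME_SIZE_BYTES 512u   /* assumed worst-case frame of this function */

/* Hypothetical expansion service, stubbed here; a real implementation
 * would connect the stack from M1 to M2 as described above. */
static void rtos_expand(unsigned long frame_size)
{
    (void)frame_size;
}

/* Developer-inserted expansion point at the start of a call function
 * known to build a large stack frame. */
void heavy_call_function(void)
{
    rtos_expand(FRAME_SIZE_BYTES);
    /* ... body of the call function runs with its stack in M2 ... */
}
```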
In an operating method according to some embodiments, the electronic system first compiles a task into at least one thread, explores a call graph by performing static code analysis on each thread, and gets a stack usage of each node and a call probability for each node by performing runtime profiling of the call graph.
The electronic system sets a threshold value that determines a stack size of a thread on the basis of the call graph, the stack usage of each node, and the call probability for each node (S40). Call functions associated with the threads are preferentially executed in a first memory area M1 (S50) until an overflow alarm is generated (S60, N). When the overflow alarm is generated (S60), the stack of the thread being executed in the first memory area M1 is connected to a second memory area M2 (S70) and subsequent stacks of the call functions are executed in the second memory area (S80).
When execution of the call function of a corresponding thread is completed in the second memory area M2, the address connection for expansion is returned to the first memory area M1 (S90), and thread processing according to the call chain of the corresponding node is completed (S100).
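A compact, self-contained C walk-through of steps S40 through S100, using the profiled numbers from the example above and taking the chain total of node C as the threshold; all helper logic is simplified for illustration.

```c
#include <stdio.h>
#include <stddef.h>

int main(void)
{
    const size_t chain_kb[] = { 1200, 100, 700 };  /* per-node chain totals */
    const size_t threshold  = 700;                 /* S40: stack size for M1 */

    for (size_t node = 0; node < 3; ++node) {
        size_t used = chain_kb[node];              /* S50: execute in M1 */
        if (used > threshold) {                    /* S60: overflow alarm */
            /* S70-S80: connect to M2 and run the subsequent stacks there. */
            printf("node %zu: %zu KB in M1, %zu KB expanded to M2\n",
                   node, threshold, used - threshold);
            /* S90: the expansion address returns to M1 on completion. */
        } else {
            printf("node %zu: %zu KB fits in M1\n", node, used);
        }
    }
    return 0;                                      /* S100: processing done */
}
```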
A system 1000 may include a main processor 1100, memories 1200a and 1200b, and storage devices 1300a and 1300b, and may further include an image capturing device 1410, a user input device 1420, a sensor 1430, a communication device 1440, a display 1450, a speaker 1460, a power supplying device 1470, and a connecting interface 1480.
According to some embodiments, the main processor 1100 and the memories 1200a and 1200b may correspond to the host 10 and the memory device 202 described above.
The main processor 1100 may control an overall operation of the system 1000, e.g., control operations of the other components included in the system 1000. The main processor 1100 may be implemented as a general-purpose processor, a dedicated purpose processor, an AP, or the like.
The main processor 1100 may include one or more CPU cores 1110 and may include a controller 1120 configured to control the memories 1200a and 1200b and/or the storage devices 1300a and 1300b. According to some embodiments, the main processor 1100 may include an accelerator block 1130 that is a dedicated circuit configured to perform high speed data calculation, such as AI data calculation. The accelerator block 1130 may include a GPU, an NPU, a DPU, and/or the like, and may be implemented as a separate chip physically independent of the other components in the main processor 1100.
The memories 1200a and 1200b may be used as a main memory device of the system 1000 and may include volatile memories, such as SRAM and/or DRAM, or non-volatile memories, such as flash memory, PRAM, and/or RRAM. The memories 1200a and 1200b may be implemented in the same package as the main processor 1100.
The storage devices 1300a and 1300b may function as NVM storage devices configured to store data regardless of whether power is supplied thereto, and may have a larger storage capacity than the memories 1200a and 1200b. The storage devices 1300a and 1300b may include storage controllers 1310a and 1310b and NVM storages 1320a and 1320b configured to store data under the control of the storage controllers 1310a and 1310b, respectively. The NVM storages 1320a and 1320b may include V-NAND flash memory of a two-dimensional (2D) or three-dimensional (3D) structure, or another type of NVM, such as PRAM and/or RRAM.
The storage devices 1300a and 1300b may be included in the system 1000 physically separated from the main processor 1100, or may be implemented in the same package as the main processor 1100. Also, the storage devices 1300a and 1300b may have a form such as an SSD or a memory card form to be detachably coupled to the other components in the system 1000 through an interface such as the connecting interface 1480 described below. The storage devices 1300a and 1300b may be devices to which a standard protocol, such as a UFS protocol, is applied, but are not limited thereto.
The image capturing device 1410 may capture a still image or a moving picture, and may include a camera, a camcorder, a webcam, and/or the like.
The user input device 1420 may receive various types of data from a user of the system 1000, and may include a touch pad, a keypad, a keyboard, a mouse, a microphone, and/or the like.
The sensor 1430 may sense various types of physical quantities, which may be obtained from the environment, and convert the sensed physical quantity into an electrical signal. The sensor 1430 may include a temperature sensor, a pressure sensor, an illuminance sensor, a position sensor, an acceleration sensor, a biosensor, a gyroscope, and/or the like.
The communication device 1440 may perform signal transmission and reception between the system 1000 and other devices outside the system 1000 according to various communication protocols. The communication device 1440 may be implemented using an antenna, a transceiver, a modem, and/or the like.
The display 1450 and the speaker 1460 may function as output devices configured to output visual information and auditory information to the user of the system 1000, respectively.
The power supplying device 1470 may convert power supplied from a battery (not shown) in the system 1000 and/or an external power source, and supply the converted power to each component in the system 1000.
The connecting interface 1480 may provide a connection between the system 1000 and an external device connected to the system 1000 to transmit and receive data to and from the system 1000. The connecting interface 1480 may be implemented by various interface schemes, such as an ATA interface, a SATA interface, an e-SATA interface, a small computer system interface (SCSI), SAS, a PCI interface, a PCIe interface, an NVM express (NVMe) interface, an Institute of Electrical and Electronics Engineers (IEEE) 1394 interface, a USB interface, a secure digital (SD) card interface, a multi-media card (MMC) interface, an eMMC interface, a UFS interface, an embedded UFS (eUFS) interface, and a compact flash (CF) card interface.
A data center may include a plurality of application servers 2100 to 2100n and a plurality of storage servers 2200 to 2200m.
The application server 2100 or the storage server 2200 may include at least one of processors 2110 and 2210 and at least one of memories 2120 and 2220. According to some embodiments, the processors 2110 and 2210 and the memories 2120 and 2220 may correspond to the host 10 and the memory device 202 described above.
Taking the storage server 2200 as an example, the processor 2210 may control the overall operation of the storage server 2200, and access the memory 2220 to execute commands and/or data loaded into the memory 2220. The memory 2220 may be a double data rate synchronous DRAM (DDR SDRAM), an HBM, an HMC, a dual in-line memory module (DIMM), an Optane DIMM, or an NVMDIMM. According to an embodiment, the number of processors 2210 and the number of memories 2220 included in the storage server 2200 may be variously selected. In an embodiment, the processor 2210 and the memory 2220 may provide a processor-memory pair. In an embodiment, the number of processors 2210 and the number of memories 2220 may be different from each other. The processor 2210 may include a single-core processor or a multicore processor. The aforementioned description of the storage server 2200 may also be similarly applied to the application server 2100. According to an embodiment, the application server 2100 need not include the storage device 2150. The storage server 2200 may include at least one storage device 2250. The number of storage devices 2250 included in the storage server 2200 may be variously selected according to an embodiment. According to some embodiments, the storage device 2250 may be the storage system 2 described above.
The application servers 2100 to 2100n and the storage servers 2200 to 2200m may communicate with each other through a network 2300. The network 2300 may be implemented using Fibre Channel (FC), Ethernet, or the like. FC is a medium used for relatively high-speed data transmission and may use an optical switch that provides high performance and/or high availability. The storage servers 2200 to 2200m may be provided as file storage, block storage, or object storage, depending on an access type of the network 2300.
In an embodiment, the network 2300 may be a storage-dedicated network, such as a storage area network (SAN). For example, the SAN may be an FC-SAN that uses an FC network and is implemented according to an FC Protocol (FCP). In another example, the SAN may be an IP-SAN that uses a TCP/IP network and is implemented according to an iSCSI (SCSI over TCP/IP or Internet SCSI) protocol. In another embodiment, the network 2300 may be a general network, such as a TCP/IP network. For example, the network 2300 may be implemented according to protocols such as FC over Ethernet (FCoE), network-attached storage (NAS), and NVMe over Fabrics (NVMe-oF).
Hereinafter, the application server 2100 and the storage server 2200 will be mainly described. The description of the application server 2100 may be applied to another application server 2100n, and the description of the storage server 2200 may be applied to another storage server 2200m.
The application server 2100 may store data requested by the user or a client to be stored in one of the storage servers 2200 to 2200m through the network 2300. In addition, the application server 2100 may acquire data requested by the user or the client to be read from one of the storage servers 2200 to 2200m through the network 2300. For example, the application server 2100 may be implemented as a web server or a database management system (DBMS).
The application server 2100 may access a memory 2120n or a storage device 2150n included in another application server 2100n through the network 2300, or may access memories 2220 to 2220m or storage devices 2250 to 2250m included in the storage servers 2200 to 2200m through the network 2300. Accordingly, the application server 2100 may perform various operations on data stored in the application servers 2100 to 2100n and/or the storage servers 2200 to 2200m. For example, the application server 2100 may execute commands for moving or copying data between the application servers 2100 to 2100n and/or the storage servers 2200 to 2200m. At this time, the data may be moved to the memories 2120 to 2120n of the application servers 2100 to 2100n through the memories 2220 to 2220m of the storage servers 2200 to 2200m or directly from the storage devices 2250 to 2250m of the storage servers 2200 to 2200m. The data moving through the network 2300 may be encrypted data for security or privacy.
Taking the storage server 2200 as an example, an interface 2254 may provide a physical connection between the processor 2210 and a controller 2251 and a physical connection between an NIC 2240 and the controller 2251. For example, the interface 2254 may be implemented by a direct attached storage (DAS) method that directly connects the storage device 2250 with a dedicated cable. In addition, for example, the interface 2254 may be implemented by various interface schemes, such as an ATA interface, a SATA interface, an e-SATA interface, a SCSI, SAS, a PCI interface, a PCIe interface, an NVM express (NVMe) interface, an IEEE 1394 interface, a USB interface, an SD card interface, an MMC interface, an eMMC interface, a UFS interface, an eUFS interface, and a CF card interface.
The storage server 2200 may further include a switch 2230 and an NIC 2240. The switch 2230 may selectively connect the processor 2210 and the storage device 2250 or selectively connect the NIC 2240 and the storage device 2250 under the control of the processor 2210.
In an embodiment, the NIC 2240 may include a network interface card, a network adapter, and the like. The NIC 2240 may be connected to the network 2300 by a wired interface, a wireless interface, a Bluetooth interface, an optical interface, or the like. The NIC 2240 may include an internal memory, a DSP, a host bus interface, and the like, and may be connected to the processor 2210 and/or the switch 2230 through a host bus interface. The host bus interface may be implemented as one of the examples of the interface 2254 described above. In an embodiment, the NIC 2240 may be integrated with at least one of the processor 2210, the switch 2230, and the storage device 2250.
In the storage servers 2200 to 2200m or the application servers 2100 to 2100n, the processor may program or read data by transmitting a command to the storage devices 2130 to 2130n and 2250 to 2250m, or the memories 2120 to 2120n and 2220 to 2220m. In this case, the data may be error-corrected through an error correction code (ECC) engine. The data may be processed by data bus inversion (DBI) or data masking (DM), and may include cyclic redundancy code (CRC) information. The data may be encrypted for security or privacy.
The storage devices 2150 to 2150m and 2250 to 2250m may transmit a control signal and a command/address signal to the NAND flash memory devices 2252 to 2252m in response to a read command received from the processor. Accordingly, when data is read from the NAND flash memory devices 2252 to 2252m, a read enable (RE) signal is input as a data output control signal, and may serve to output the data to a DQ bus. A data strobe (DQS) may be generated using the RE signal. The command and the address signal may be latched in a page buffer according to a rising edge or a falling edge of a write enable (WE) signal.
The controller 2251 may control the overall operation of the storage device 2250. In an embodiment, the controller 2251 may include an SRAM. The controller 2251 may write the data to the NAND flash 2252 in response to a write command, or may read the data from the NAND flash 2252 in response to a read command. For example, the write command and/or the read command may be provided from the processor 2210 in the storage server 2200, the processor 2210m in another storage server 2200m, or the processors 2110 and 2110n in the application servers 2100 and 2100n. A DRAM 2253 may temporarily store (buffer) the data to be written to the NAND flash 2252 or the data read from the NAND flash 2252. In addition, the DRAM 2253 may store metadata. The metadata may be user data or data generated by the controller 2251 to manage the NAND flash 2252. The storage device 2250 may include a secure element (SE) for security or privacy.
While exemplary embodiments of the inventive concept have been shown and described above, it will be apparent to those skilled in the art that modifications and variations in these embodiments can be made without departing from the spirit and scope of the present inventive concept as defined in the appended claims.