The disclosure relates generally to data storage, and more particularly, to a dynamic memory expansion capable device for a memory pool as well as a memory plus storage pool.
A double data rate (DDR)-based memory expansion in a server depends on a system power cycle and does not enable dynamic memory expansion. In other words, the system power is turned off, memory is added, and then the system is turned back on in order for memory to be expanded.
In particular, high-performance computing (HPC) and futuristic data center (DC) environments are highly likely to obtain such capability to leverage big data workloads, such as artificial intelligence (AI) or database (DB) big data analytics.
A new technology known as compute express link (CXL) has been introduced, and a large-scale memory/storage pool may be created as a far memory/storage pool that is not DDR-based.
HPC and futuristic DC workloads for AI/big data analytics from fifth generation (5G)/Internet of things (IoT)/self-driving cars/edge computing require large memory resource pools that have the capability for dynamic memory expansion in order to accelerate processing large data sets (e.g., in-memory driven processing for AI/DB workloads). However, the current DDR interface-based memory expansion is deficient in dynamic memory expansion capability at least because it requires a power cycle.
In addition, the CXL interface has been introduced as a method of memory expansion, but an effective memory pooling solution has not been achieved to this point.
In particular, it may be difficult to provide dynamic capacity expansion for DDR technology. In addition, peripheral component interconnect express (PCIe) may be used to achieve a successful storage pooling solution.
Futuristic DC workloads may require a large memory/storage data cache for acceleration (e.g., AI/big data analytics) while traditional methods of memory expansion were based on a DDR interface. DDR interface-based memory expansion has faced a breaking point, and as an alternative solution, CXL has been introduced with serial interface-based memory expansion.
HPC and futuristic DC industries have been limited in terms of memory resource expansion in modular, flexible, and composable manners. The industry has been providing solutions for dynamic storage expansion in data storage areas, such as non-volatile memory express (NVMe) and solid state drive (SSD). Memory usage, unlike storage, faces a limitation in terms of dynamic expansion of memory capacity. The CXL serial interface technology provides a possible solution for overcoming the limitation but has been limited in doing so to this point.
Therefore, there is a need in the art for a CXL interface-based memory expansion solution which increases memory capacity in a device or server without the need for power cycling the device or server.
The present disclosure has been made to address at least the above-mentioned problems and/or disadvantages and to provide at least the advantages described below.
Accordingly, an aspect of the present disclosure is to provide dynamic capacity expansion capability for memory only and memory plus storage pooling.
Another aspect of the disclosure is to provide a new mechanism for dynamic memory expansion beyond a system boot up with arbitrary system memory size.
Another aspect of the disclosure is to provide a method and apparatus that address memory usage limitations and re-design memory usage models for HPC and futuristic DC business, such as core/edge DC based on large scale memory/memory plus storage resource pools.
Another aspect of the disclosure is to provide, through the utilization of recent technology such as CXL, both memory and storage with a profound dynamic capacity expansion capability without power cycling, i.e., while maintaining an active power state.
Another aspect of the disclosure is to provide effective manners to scale up/out memory/storage pools and offer new features and DC business opportunities for disaggregated memory/storage pooling based on modular, flexible, and composable methods.
In accordance with an aspect of the disclosure, a device includes a host, a processor, a memory pooling device electrically connected to the processor, and a compute express link (CXL) dynamic memory capacity expansion device (DMCED), wherein the CXL DMCED is directly electrically connected to the memory pooling device.
In accordance with an aspect of the disclosure, a communication method between a host agent and a target agent includes initiating, by the host agent, an add capacity function, updating, by the target agent, a configuration between the host agent and the target agent, sending, by the target agent, the updated configuration to the host agent, performing a dynamic memory capacity expansion process between the host agent and the target agent, and performing a confirmation process between the host agent and the target agent, the confirmation process including an add_complete and/or a release_complete communication.
In accordance with an aspect of the disclosure, a communication method between a host agent and a target agent connected by a switching fabric (SF) disposed between the host agent and the target agent, includes initiating, by the host agent, an add capacity function, transmitting, by a host-managed device memory (HDM) decoder disposed in the SF, an updated configuration to the target agent, sending, by the target agent, the updated configuration to the host agent, performing a dynamic memory capacity expansion process between the host agent and the target agent, and performing a confirmation process between the host agent and the target agent, the confirmation process including an add_complete and/or a release_complete communication.
The above and other aspects, features, and advantages of certain embodiments of the present disclosure will be more apparent from the following detailed description taken in conjunction with the accompanying drawings, in which:
Embodiments of the present disclosure will be described herein below with reference to the accompanying drawings. However, the embodiments of the disclosure are not limited to the specific embodiments and should be construed as including all modifications, changes, equivalent devices and methods, and/or alternative embodiments of the present disclosure. Descriptions of well-known functions and/or configurations will be omitted for the sake of clarity and conciseness.
The expressions “have,” “may have,” “include,” and “may include” as used herein indicate the presence of corresponding features, such as numerical values, functions, operations, or parts, and do not preclude the presence of additional features. The expressions “A or B,” “at least one of A or/and B,” or “one or more of A or/and B” as used herein include all possible combinations of items enumerated with them. For example, “A or B,” “at least one of A and B,” or “at least one of A or B” indicate (1) including at least one A, (2) including at least one B, or (3) including both at least one A and at least one B.
Terms such as “first” and “second” as used herein may modify various elements regardless of an order and/or importance of the corresponding elements, and do not limit the corresponding elements. These terms may be used for the purpose of distinguishing one element from another element. For example, a first user device and a second user device may indicate different user devices regardless of the order or importance. A first element may be referred to as a second element without departing from the scope of the disclosure, and similarly, a second element may be referred to as a first element.
When a first element is “operatively or communicatively coupled with/to” or “connected to” another element, such as a second element, the first element may be directly coupled with/to the second element, and there may be an intervening element, such as a third element, between the first and second elements. To the contrary, when the first element is “directly coupled with/to” or “directly connected to” the second element, there is no intervening third element between the first and second elements.
All of the terms used herein including technical or scientific terms have the same meanings as those generally understood by an ordinary skilled person in the related art unless they are defined otherwise. The terms defined in a generally used dictionary should be interpreted as having the same or similar meanings as the contextual meanings of the relevant technology and should not be interpreted as having ideal or exaggerated meanings unless they are clearly defined herein. According to circumstances, even the terms defined in this disclosure should not be interpreted as excluding the embodiments of the disclosure.
Memory expansion may be performed based on a DDR interface, which is a parallel interface, and may have dynamic capacity expansion limitations.
Some memory device solutions may be non-dynamic capacity expansion driven.
With CXL capability, dynamic memory expansion and cache coherency features could be natively embedded in memory/storage pools. Accordingly, utilizing CXL presents new business models for HPC and futuristic DC areas.
Some of the solutions disclosed herein could evolve with computational storage (CS) capability in the future for memory and memory plus storage pooling.
The disclosure provides a memory device which can dynamically expand capacity without a power cycle, and provides a memory plus storage (i.e., MEMOSTORAGE) device which has the same interface as the memory device and can dynamically expand capacity without a power cycle. Additionally, a new business model may be created, particularly for HPC and futuristic DC business where large scale virtual machines (VMs) handle AI/DB big data analytics based on in-memory (IM) centric processing and optimized new features and DC business opportunities for memory resource leveraging with a modular, flexible, and composable manner.
Thus, the disclosure advantageously reduces the physical size and energy cost of datacenters and/or memory, and better meets large memory or storage resource pooling requirements for HPC and futuristic DC business.
Memory/(memory plus storage) pooling solutions could be utilized in module, sled, pod, or rack forms. Based on at least some of the solutions provided herein, memory expansion may be executed in an on-the-fly manner (e.g., dynamically, while maintaining an active power state).
This disclosure provides dynamic capacity expansion capability through memory or memory plus storage (i.e., MEMOSTORAGE) devices. Memory/memory plus storage (i.e., MEMOSTORAGE) solutions could be a fundamental technology to address large memory and memory plus storage resource pooling requirements for HPC and futuristic DC business. Also, memory/memory plus storage (i.e., MEMOSTORAGE) resource pooling in HPC and futuristic DC environments may be modular, flexible, and composable for resource management for large workload acceleration.
To enable dynamic memory expansion, the new feature, “over-commitment physical memory space,” is created on the host (non-VM or non-Container level) when the system boots up. Eventually, this new feature, “over-commitment physical memory space” will impact VM or Container technologies and will likely be added in the basic input output system (BIOS) and/or operating system (OS) kernel mechanism on the host.
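One possible reading of the "over-commitment physical memory space" feature is that the host reserves, at boot, a physical address window larger than the memory actually installed, and later backs more of that window as capacity is added. The following is a minimal sketch under that assumption; the class `OverCommittedSpace` and all of its members are hypothetical names used only for illustration and are not part of the disclosure or of any BIOS or OS interface.

```python
GIB = 1 << 30  # bytes per gibibyte


class OverCommittedSpace:
    """Hypothetical model: address space reserved at boot, backed later."""

    def __init__(self, reserved_bytes, installed_bytes):
        if installed_bytes > reserved_bytes:
            raise ValueError("installed memory exceeds reserved space")
        self.reserved_bytes = reserved_bytes  # fixed when the system boots
        self.backed_bytes = installed_bytes   # grows or shrinks at run time

    @property
    def headroom(self):
        """Reserved address space that is not yet backed by memory."""
        return self.reserved_bytes - self.backed_bytes

    def add_capacity(self, nbytes):
        """Back more of the reserved window without a power cycle."""
        if nbytes > self.headroom:
            raise MemoryError("expansion exceeds the space reserved at boot")
        self.backed_bytes += nbytes
        return self.backed_bytes

    def release_capacity(self, nbytes):
        """Return backed memory to the unbacked part of the window."""
        if nbytes > self.backed_bytes:
            raise ValueError("cannot release more than is backed")
        self.backed_bytes -= nbytes
        return self.backed_bytes
```

In this sketch, the size of the reserved window bounds how far memory can grow after boot, which is why the reservation is made larger than the installed size.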
As shown in host 110, a processor/accelerator (hereinafter, processor) 101 is connected to the memory pooling device 103 and the CXL DMCED 105 by CXL 107. As shown in host 120, the processor 101 is included in the host 120 but the memory pooling device 103 and CXL DMCED 105 are external to the host 120 and may be connected to the host 120 by a CXL 107. For example, the CXL 107 may be a cable or a CXL extension. As shown in host 130, when there is a large-scale memory pool, a switching fabric (SF) 109 is used to connect the host 130 and the memory pooling device 103 and CXL DMCED 105. Specifically, the host 130 is connected to the SF 109 by a CXL 107 on a first end of the SF 109, and the memory pooling device 103 and CXL DMCED 105 are connected to the SF 109 by another CXL 107 on a second end of the SF 109, opposite to the first end. The SF 109 is decoded by a host-managed device memory (HDM) decoder 111.
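The three arrangements described for hosts 110, 120, and 130 can be summarized as link paths. The sketch below is illustrative only; the enum `Attachment` and the functions `link_path` and `uses_switching_fabric` are hypothetical names, and the node labels simply reuse the reference numerals from the description.

```python
from enum import Enum


class Attachment(Enum):
    """Hypothetical labels for the three arrangements described above."""
    INTERNAL = "host 110"   # pool and DMCED inside the host
    EXTERNAL = "host 120"   # pool and DMCED outside the host, e.g. by cable
    SWITCHED = "host 130"   # pool and DMCED reached through the SF 109


def link_path(attachment):
    """Return the hops between the host side and the pool/DMCED side."""
    if attachment is Attachment.SWITCHED:
        # One CXL link on each end of the switching fabric.
        return ["host", "CXL 107", "SF 109", "CXL 107", "pool 103 + DMCED 105"]
    # Internal and external attachment both use a single CXL link.
    return ["host", "CXL 107", "pool 103 + DMCED 105"]


def uses_switching_fabric(attachment):
    """True only when the path passes through the SF 109."""
    return "SF 109" in link_path(attachment)
```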
Each architecture in
As shown in
The memory manager 212 differs from a conventional memory controller in that the memory manager 212 supports dynamic capacity change through the host CXL agent 213a interaction. The host CXL agent 213a mailbox registers management component transport protocol (MCTP)-based component command interfaces (CCIs) and includes a set/get feature in which an alert/notification may be received whenever a target device capacity changes. Such a change may be when a hot swap is performed by a hot plug. The memory manager 212 functions as a fabric manager (FM) in the instance of the host 220 since there is no SF connection to the host 220.
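The set/get feature with an alert on capacity change follows a register-then-notify pattern, which may be sketched as below. This is not the CXL mailbox or MCTP CCI message format; the classes `TargetDevice` and `HostAgent` and their methods are hypothetical stand-ins that only illustrate how a host-side agent could be alerted when a hot plug changes the target capacity.

```python
class TargetDevice:
    """Hypothetical target whose capacity (in blocks) can change at run time."""

    def __init__(self, capacity_blocks):
        self._capacity = capacity_blocks
        self._listeners = []

    def set_feature(self, callback):
        """'Set': register a callback invoked whenever capacity changes."""
        self._listeners.append(callback)

    def get_feature(self):
        """'Get': report the current capacity in blocks."""
        return self._capacity

    def hot_plug(self, delta_blocks):
        """Simulate a hot swap that adds (positive) or removes (negative) blocks."""
        old = self._capacity
        new = old + delta_blocks
        if new < 0:
            raise ValueError("capacity cannot become negative")
        self._capacity = new
        if new != old:
            for callback in self._listeners:
                callback(old, new)


class HostAgent:
    """Hypothetical host-side agent that records capacity alerts."""

    def __init__(self):
        self.alerts = []

    def on_capacity_change(self, old_blocks, new_blocks):
        self.alerts.append((old_blocks, new_blocks))
```

For example, after `device.set_feature(host.on_capacity_change)`, a call to `device.hot_plug(2)` on a four-block device leaves the entry `(4, 6)` in `host.alerts`.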
The target CXL agent 213b may be software that detects memory pool 203 capacity changes or hardware that detects such memory pool 203 capacity changes based on a hardware signal.
The dynamic capacity control agent 215 can be a hardware or software device that detects dynamic changes of a memory pool 203 capacity. For example, the dynamic capacity control agent 215 operates as an intermediary between the target CXL agent 213b and the memory pool 203. Alternatively, the dynamic capacity control agent 215 may be omitted when the target CXL agent 213b can automatically detect memory pool 203 capacity changes, such as through a hardware signal, based on the construction of the target CXL agent 213b.
The HDM decoder 212 is used to determine device physical address (DPA) and host physical address (HPA) mappings, i.e., in order for the host 220 to access the memory area in the target 205. There may be n number of HDMs, such as HDM 01, HDM 02, . . . HDM n. The memory pool 203 includes a dynamic capacity having a memory region N 217 for storing memory blocks (e.g., 256 megabyte (MB)) and a region N+1 219 for storing memory blocks combined with storage.
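The HPA-to-DPA mapping role of an HDM decoder may be illustrated with a simple range table. The sketch below assumes contiguous, non-interleaved ranges, which is a simplification; the names `HdmRange`, `HdmDecoder`, `program`, and `hpa_to_dpa` are hypothetical and do not correspond to register names in the CXL specification.

```python
from dataclasses import dataclass

MB = 1 << 20  # bytes per megabyte


@dataclass(frozen=True)
class HdmRange:
    """One hypothetical decoder entry mapping an HPA window to a DPA window."""
    hpa_base: int
    size: int
    dpa_base: int


class HdmDecoder:
    """Hypothetical decoder holding entries such as HDM 01, HDM 02, ... HDM n."""

    def __init__(self):
        self._ranges = []

    def program(self, hpa_base, size, dpa_base):
        """Add a mapping, rejecting any overlap in host physical address space."""
        for r in self._ranges:
            if hpa_base < r.hpa_base + r.size and r.hpa_base < hpa_base + size:
                raise ValueError("HPA ranges overlap")
        self._ranges.append(HdmRange(hpa_base, size, dpa_base))

    def hpa_to_dpa(self, hpa):
        """Translate a host physical address to a device physical address."""
        for r in self._ranges:
            if r.hpa_base <= hpa < r.hpa_base + r.size:
                return r.dpa_base + (hpa - r.hpa_base)
        raise LookupError("HPA is not decoded by any HDM range")
```

Programming one entry per 256 MB block keeps the mapping granularity aligned with the example block size given for region N.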
The dynamic capacity control agent 215 includes a dynamic capacity expansion list having an HDM TAG with block information related to the memory blocks in regions N and N+1 (217, 219).
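A dynamic capacity expansion list of this kind could be represented as a list of tagged block records. The layout below is an assumption made for illustration: the field names, the region labels `"N"` and `"N+1"`, and the class `ExpansionList` are hypothetical, and the 256 MB block size is taken from the example in the description.

```python
from dataclasses import dataclass

BLOCK_SIZE = 256 * 1024 * 1024  # example block size from the description


@dataclass(frozen=True)
class BlockEntry:
    """Hypothetical record: which HDM tag owns which block in which region."""
    hdm_tag: str      # e.g. "HDM 01"
    region: str       # "N" (memory) or "N+1" (memory plus storage)
    block_index: int


class ExpansionList:
    """Hypothetical list kept by the dynamic capacity control agent."""

    def __init__(self):
        self.entries = []

    def add_block(self, hdm_tag, region, block_index):
        entry = BlockEntry(hdm_tag, region, block_index)
        if any(e.region == region and e.block_index == block_index
               for e in self.entries):
            raise ValueError("block is already listed")
        self.entries.append(entry)
        return entry

    def release_block(self, region, block_index):
        for e in self.entries:
            if e.region == region and e.block_index == block_index:
                self.entries.remove(e)
                return e
        raise LookupError("block is not listed")

    def capacity_bytes(self, region=None):
        """Total listed capacity, optionally restricted to one region."""
        return BLOCK_SIZE * sum(
            1 for e in self.entries if region is None or e.region == region
        )
```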
The communication between the host CXL agent 213a and target CXL agent 213b is as follows. In step 242, the host 220 initiates an “add capacity” function by an orchestrator/FM configuration. In step 244, the target CXL agent 213b updates the configuration and sends the updated configuration to the host CXL agent 213a. In step 246, a dynamic memory capacity expansion process, such as a memory add or release process, is performed between the host CXL agent 213a and the target CXL agent 213b. In step 248, a confirmation process is performed between the host CXL agent 213a and the target CXL agent 213b, including an add_complete and/or a release_complete communication.
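Steps 242 to 248 can be sketched as a four-message exchange. In the sketch below, the classes `HostAgent` and `TargetAgent`, the configuration dictionary, and the log strings are all hypothetical; a release is modeled as the mirror image of an add, which is an assumption based on the add_complete/release_complete pair named in step 248.

```python
class TargetAgent:
    """Hypothetical target-side agent holding the shared configuration."""

    def __init__(self, blocks):
        self.config = {"blocks": blocks}

    def update_config(self, delta_blocks):
        """Step 244: update the configuration and return a copy for the host."""
        new_blocks = self.config["blocks"] + delta_blocks
        if new_blocks < 0:
            raise ValueError("cannot release more blocks than are present")
        self.config = {"blocks": new_blocks}
        return dict(self.config)


class HostAgent:
    """Hypothetical host-side agent that drives the exchange."""

    def __init__(self, blocks):
        self.visible_blocks = blocks
        self.log = []

    def change_capacity(self, target, delta_blocks):
        op = "add" if delta_blocks >= 0 else "release"
        self.log.append(f"{op}_capacity")            # step 242: host initiates
        config = target.update_config(delta_blocks)  # step 244: target updates
        self.log.append("config_updated")
        self.visible_blocks = config["blocks"]       # step 246: memory add/release
        self.log.append(f"memory_{op}")
        self.log.append(f"{op}_complete")            # step 248: confirmation
        return self.visible_blocks
```

The host only changes its view of the capacity after it has received the updated configuration, so a failed update on the target side leaves the host view unchanged.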
The fabric manager 314 includes an SF 309 which is connected to the host CXL agent 313a by a CXL on a first end, and to the target CXL agent 313b by a CXL 307 on a second end opposite to the first end. The SF 309 is decoded by an HDM decoder 311, as explained in
In
In
In
Based on the communication over each of the FM 514 and the SF 509, a set/get feature 553 and an add_capacity/release_capacity feature 554 are performed between the host CXL agent 513a and target CXL agent 513b. In the dynamic memory expansion device including add/release capacity 555, a set/get feature 553 and an add_capacity/release_capacity feature 554 are performed over the FM 514 and SF 509, by which memory capacity may be selectively added or released. In the dynamic memory expansion device including device attach/detach 556, a device_attach and/or device_detach communication is performed between the host 520 and the target DMCED 505, which are connected over the FM 514 and SF 509.
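The difference between changing capacity on an attached device and attaching or detaching a whole device can be sketched with a small binding table. The class `Fabric` below is a hypothetical stand-in for the FM and SF acting together; its method names reuse the command names from the description, but the signatures and the block-count bookkeeping are assumptions.

```python
class Fabric:
    """Hypothetical FM + SF model: tracks devices bound to each host."""

    def __init__(self):
        self.bindings = {}  # host name -> {device name: capacity in blocks}

    def device_attach(self, host, device, blocks=0):
        """Bind a whole device to a host."""
        devices = self.bindings.setdefault(host, {})
        if device in devices:
            raise ValueError("device is already attached")
        devices[device] = blocks

    def device_detach(self, host, device):
        """Unbind a whole device and return the capacity it carried."""
        return self.bindings[host].pop(device)

    def add_capacity(self, host, device, blocks):
        """Grow an attached device without detaching it."""
        self.bindings[host][device] += blocks
        return self.bindings[host][device]

    def release_capacity(self, host, device, blocks):
        """Shrink an attached device without detaching it."""
        current = self.bindings[host][device]
        if blocks > current:
            raise ValueError("cannot release more than the device holds")
        self.bindings[host][device] = current - blocks
        return self.bindings[host][device]

    def host_capacity(self, host):
        """Total blocks visible to a host across all attached devices."""
        return sum(self.bindings.get(host, {}).values())
```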
In
Furthermore, the memory device is a CXL device in which a type of solution could be a dual in-line memory module (DIMM), shared memory, storage, power, or discrete network at the rack/sled (SLED) level or the point of delivery (PoD) level. It is noted that a PoD may be a module of network, computing, storage, and application components that work together to deliver networking services. The PoD is a repeatable design pattern, and its components maximize the modularity, scalability, and manageability of data centers.
In
In the dynamic memory plus storage expansion device 600 including device attach/detach 658, a device_attach and/or device_detach communication 652 is performed between the host 620 and the target DMCED 605. Based on these communications, a device may be attached to or detached from the device 658.
In
In the memory plus storage devices in
The disclosure provides the following memory or memory plus storage (i.e., MEMOSTORAGE) solutions.
Based on the devices described above, memory-oriented distributed computing may be based on HPC and may provide modular, flexible, and composable solutions so that a server farm in an HPC environment can create memory pooling in an effective manner.
In addition, large memory pooling in HPC environments may be achieved with dynamic expansion capability.
Moreover, for futuristic DCs seeking rack scale design (RSD)-oriented environments, DC environments may be created in which computing, storage, and network resources are disaggregated. A memory resource could be part of the disaggregated architecture for RSD-based DC environments.
As to memory plus storage, modular, flexible and composable solutions may be provided so that the server farm in the HPC environment can create MEMOSTORAGE pooling in an effective manner.
Large MEMOSTORAGE pooling in HPC environments may be provided with dynamic expansion capability. Moreover, for futuristic DCs seeking RSD-oriented environments, DC environments may be provided in which computing, storage, and network resources are disaggregated. A MEMOSTORAGE resource could be part of the disaggregated architecture for RSD-based DC environments.
The processor 820 may execute, for example, software (e.g., a program 840) to control at least one other component (e.g., a hardware or a software component) of the electronic device 801 coupled with the processor 820 and may perform various data processing or computations. As at least part of the data processing or computations, the processor 820 may load a command or data received from another component (e.g., the sensor module 876 or the communication module 890) in volatile memory 832, process the command or the data stored in the volatile memory 832, and store resulting data in non-volatile memory 834. The processor 820 may include a main processor 821 (e.g., a central processing unit (CPU) or an application processor (AP)), and an auxiliary processor 823 (e.g., a graphics processing unit (GPU), an image signal processor (ISP), a sensor hub processor, or a communication processor (CP)) that is operable independently from, or in conjunction with, the main processor 821. Additionally or alternatively, the auxiliary processor 823 may be adapted to consume less power than the main processor 821, or execute a particular function. The auxiliary processor 823 may be implemented as being separate from, or a part of, the main processor 821.
The auxiliary processor 823 may control at least some of the functions or states related to at least one component (e.g., the display device 860, the sensor module 876, or the communication module 890) among the components of the electronic device 801, instead of the main processor 821 while the main processor 821 is in an inactive (e.g., sleep) state, or together with the main processor 821 while the main processor 821 is in an active state (e.g., executing an application). The auxiliary processor 823 (e.g., an image signal processor or a communication processor) may be implemented as part of another component (e.g., the camera module 880 or the communication module 890) functionally related to the auxiliary processor 823.
The memory 830 may store various data used by at least one component (e.g., the processor 820 or the sensor module 876) of the electronic device 801. The various data may include, for example, software (e.g., the program 840) and input data or output data for a command related thereto. The memory 830 may include the volatile memory 832 or the non-volatile memory 834.
The program 840 may be stored in the memory 830 as software, and may include, for example, an operating system (OS) 842, middleware 844, or an application 846.
The input device 850 may receive a command or data to be used by another component (e.g., the processor 820) of the electronic device 801, from the outside (e.g., a user) of the electronic device 801. The input device 850 may include, for example, a microphone, a mouse, or a keyboard.
The sound output device 855 may output sound signals to the outside of the electronic device 801. The sound output device 855 may include, for example, a speaker or a receiver. The speaker may be used for general purposes, such as playing multimedia or recording, and the receiver may be used for receiving an incoming call. The receiver may be implemented as being separate from, or a part of, the speaker.
The display device 860 may visually provide information to the outside (e.g., a user) of the electronic device 801. The display device 860 may include, for example, a display, a hologram device, or a projector and control circuitry to control a corresponding one of the display, hologram device, and projector. The display device 860 may include touch circuitry adapted to detect a touch, or sensor circuitry (e.g., a pressure sensor) adapted to measure the intensity of force incurred by the touch.
The audio module 870 may convert a sound into an electrical signal and vice versa. The audio module 870 may obtain the sound via the input device 850 or output the sound via the sound output device 855 or a headphone of an external electronic device 802 directly (e.g., wired) or wirelessly coupled with the electronic device 801.
The sensor module 876 may detect an operational state (e.g., power or temperature) of the electronic device 801 or an environmental state (e.g., a state of a user) external to the electronic device 801, and then generate an electrical signal or data value corresponding to the detected state. The sensor module 876 may include, for example, a gesture sensor, a gyro sensor, an atmospheric pressure sensor, a magnetic sensor, an acceleration sensor, a grip sensor, a proximity sensor, a color sensor, an infrared (IR) sensor, a biometric sensor, a temperature sensor, a humidity sensor, or an illuminance sensor.
The interface 877 may support one or more specified protocols to be used for the electronic device 801 to be coupled with the external electronic device 802 directly (e.g., wired) or wirelessly. The interface 877 may include, for example, a high-definition multimedia interface (HDMI), a universal serial bus (USB) interface, a secure digital (SD) card interface, or an audio interface.
A connecting terminal 878 may include a connector via which the electronic device 801 may be physically connected with the external electronic device 802. The connecting terminal 878 may include, for example, an HDMI connector, a USB connector, an SD card connector, or an audio connector (e.g., a headphone connector).
The haptic module 879 may convert an electrical signal into a mechanical stimulus (e.g., a vibration or a movement) or an electrical stimulus which may be recognized by a user via tactile sensation or kinesthetic sensation. The haptic module 879 may include, for example, a motor, a piezoelectric element, or an electrical stimulator.
The camera module 880 may capture a still image or moving images. The camera module 880 may include one or more lenses, image sensors, image signal processors, or flashes.
The power management module 888 may manage power supplied to the electronic device 801. The power management module 888 may be implemented as at least part of, for example, a power management integrated circuit (PMIC).
The battery 889 may supply power to at least one component of the electronic device 801. The battery 889 may include, for example, a primary cell which is not rechargeable, a secondary cell which is rechargeable, or a fuel cell.
The communication module 890 may support establishing a direct (e.g., wired) communication channel or a wireless communication channel between the electronic device 801 and the external electronic device (e.g., the electronic device 802, the electronic device 804, or the server 808) and performing communication via the established communication channel. The communication module 890 may include one or more communication processors that are operable independently from the processor 820 (e.g., the AP) and support a direct (e.g., wired) communication or a wireless communication. The communication module 890 may include a wireless communication module 892 (e.g., a cellular communication module, a short-range wireless communication module, or a global navigation satellite system (GNSS) communication module) or a wired communication module 894 (e.g., a local area network (LAN) communication module or a power line communication (PLC) module). A corresponding one of these communication modules may communicate with the external electronic device via the first network 898 (e.g., a short-range communication network, such as Bluetooth™, wireless-fidelity (Wi-Fi) direct, or a standard of the Infrared Data Association (IrDA)) or the second network 899 (e.g., a long-range communication network, such as a cellular network, the Internet, or a computer network (e.g., LAN or wide area network (WAN))). These various types of communication modules may be implemented as a single component (e.g., a single IC), or may be implemented as multiple components (e.g., multiple ICs) that are separate from each other. The wireless communication module 892 may identify and authenticate the electronic device 801 in a communication network, such as the first network 898 or the second network 899, using subscriber information (e.g., international mobile subscriber identity (IMSI)) stored in the subscriber identification module 896.
The antenna module 897 may transmit or receive a signal or power to or from the outside (e.g., the external electronic device) of the electronic device 801. The antenna module 897 may include one or more antennas, and, therefrom, at least one antenna appropriate for a communication scheme used in the communication network, such as the first network 898 or the second network 899, may be selected, for example, by the communication module 890 (e.g., the wireless communication module 892). The signal or the power may then be transmitted or received between the communication module 890 and the external electronic device via the selected at least one antenna.
Commands or data may be transmitted or received between the electronic device 801 and the external electronic device 804 via the server 808 coupled with the second network 899. Each of the electronic devices 802 and 804 may be a device of a same type as, or a different type from, the electronic device 801. All or some of the operations to be executed at the electronic device 801 may be executed at one or more of the external electronic devices 802, 804, or 808. For example, if the electronic device 801 should perform a function or a service automatically, or in response to a request from a user or another device, the electronic device 801, instead of, or in addition to, executing the function or the service, may request the one or more external electronic devices to perform at least part of the function or the service. The one or more external electronic devices receiving the request may perform the at least part of the function or the service requested, or an additional function or an additional service related to the request and transfer an outcome of the performing to the electronic device 801. The electronic device 801 may provide the outcome, with or without further processing of the outcome, as at least part of a reply to the request. To that end, a cloud computing, distributed computing, or client-server computing technology may be used, for example.
While the present disclosure has been described with reference to certain embodiments, various changes may be made without departing from the spirit and the scope of the disclosure, which is defined, not by the detailed description and embodiments, but by the appended claims and their equivalents.
This application is based on and claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Application Ser. No. 63/318,532, which was filed in the U.S. Patent and Trademark Office on Mar. 10, 2022, the contents of which are incorporated herein by reference.
Number | Date | Country
---|---|---
63318532 | Mar 2022 | US