Applications generate and/or manipulate large amounts of data. Thus, the performance of these applications is typically impacted by the manner in which the applications read and/or write data.
Specific embodiments will now be described with reference to the accompanying figures. In the following description, numerous details are set forth as examples of the invention. One of ordinary skill in the art, having the benefit of this detailed description, would appreciate that one or more embodiments of the present invention may be practiced without these specific details and that numerous variations or modifications may be possible without departing from the scope of the invention. Certain details known to those of ordinary skill in the art may be omitted to avoid obscuring the description.
In the following description of the figures, any component described with regard to a figure, in various embodiments of the invention, may be equivalent to one or more like-named components shown and/or described with regard to any other figure. For brevity, descriptions of these components may not be repeated with regard to each figure. Thus, each and every embodiment of the components of each figure is incorporated by reference and assumed to be optionally present within every other figure having one or more like-named components. Additionally, in accordance with various embodiments of the invention, any description of any component of a figure is to be interpreted as an optional embodiment, which may be implemented in addition to, in conjunction with, or in place of the embodiments described with regard to a corresponding like-named component in any other figure.
Throughout the application, ordinal numbers (e.g., first, second, third, etc.) may be used as an adjective for an element (i.e., any noun in the application). The use of ordinal numbers is not to imply or create any particular ordering of the elements nor to limit any element to being only a single element unless expressly disclosed, such as by the use of the terms “before”, “after”, “single”, and other such terminology. Rather, the use of ordinal numbers is to distinguish between the elements. By way of an example, a first element is distinct from a second element, and the first element may encompass more than one element and succeed (or precede) the second element in an ordering of elements.
As used herein, the term ‘operatively connected’, or ‘operative connection’, means that there exists between elements/components/devices a direct or indirect connection that allows the elements to interact with one another in some way (e.g., via the exchange of information). For example, the phrase ‘operatively connected’ may refer to any direct (e.g., wired or wireless connection directly between two devices) or indirect (e.g., wired and/or wireless connections between any number of devices connecting the operatively connected devices) connection.
In general, embodiments of the invention relate to systems, devices, and methods for implementing and leveraging memory devices (e.g., persistent memory (defined below) and NVMe devices (defined below)) to improve performance of data requests (e.g., read and write requests). More specifically, various embodiments of the invention enable applications (e.g., applications in the application container in
Using the aforementioned architecture, embodiments of the invention enable applications to interact with the memory devices at scale in a manner that is transparent to the applications. Said another way, the OS may continue to interact with the client FS container using POSIX and the client FS container, in turn, will provide a transparent mechanism to translate the requests received via POSIX into I/O requests that may be directly serviced by the storage pool.
In one embodiment of the invention, the one or more clients (100) are configured to issue requests to the node(s) in the CSI (104) (or to a specific node of the node(s)), to receive responses, and to generally interact with the various components of the nodes (described below).
In one or more embodiments of the invention, one or more clients (100) are implemented as computing devices. Each computing device may include one or more processors, memory (e.g., random access memory), and persistent storage (e.g., disk drives, solid state drives, etc.). The persistent storage may store computer instructions, (e.g., computer code), that when executed by the processor(s) of the computing device cause the computing device to issue one or more requests and to receive one or more responses. Examples of a computing device include a mobile phone, tablet computer, laptop computer, desktop computer, server, distributed computing system, or cloud resource.
In one or more embodiments of the invention, the one or more clients (100) are implemented as a logical device. The logical device may utilize the computing resources of any number of computing devices and thereby provide the functionality of the one or more clients (100) described throughout this application.
In one or more embodiments of the invention, the one or more clients (100) may request data and/or send data to the node(s) in the CSI (104). Further, in one or more embodiments, the one or more clients (100) may initiate an application to execute on one or more client application nodes in the CSI (104) such that the application may, itself, gather, transmit, and/or otherwise manipulate data on the client application nodes, remote to the client(s). In one or more embodiments, one or more clients (100) may share access to the same one or more client application nodes in the CSI (104) and may similarly share any data located on those client application nodes in the CSI (104).
In one or more embodiments of the invention, network (102) of the system is a collection of connected network devices that allow for the communication of data from one network device to other network devices, or the sharing of resources among network devices. Examples of a network (e.g., network (102)) include, but are not limited to, a local area network (LAN), a wide area network (WAN) (e.g., the Internet), a mobile network, or any other type of network that allows for the communication of data and sharing of resources among network devices and/or devices (e.g., clients (100), node(s) in the CSI (104)) operatively connected to the network (102). In one embodiment of the invention, the one or more clients (100) are operatively connected to the node(s) (104) via a network (e.g., network (102)).
The CSI (104) includes one or more client application nodes, one or more metadata nodes, and zero, one or more storage nodes. Additional detail about the architecture of the CSI is provided below in
While
In one or more embodiments of the invention, an application container (202) is software executing on the client application node. The application container (202) may be an independent software instance that executes within a larger container management software instance (not shown) (e.g., Docker®, Kubernetes®). In embodiments in which the application container (202) is executing as an isolated software instance, the application container (202) may establish a semi-isolated virtual environment, inside the container, in which to execute one or more applications (e.g., application (212)).
In one embodiment of the invention, an application container (202) may be executing in “user space” (e.g., a layer of the software that utilizes low-level system components for the execution of applications) of the OS (208) of the client application node (200).
In one or more embodiments of the invention, an application container (202) includes one or more applications (e.g., application (212)). An application (212) is software executing within the application container (e.g., 202) that may include instructions which, when executed by a processor(s) (not shown) (in the hardware layer (210)), initiate the performance of one or more operations of components of the hardware layer (210). Although applications (212) are shown executing within application containers (202) of
In one or more embodiments of the invention, each application (212) includes a virtual address space (e.g., virtual address space (220)). In one embodiment of the invention, a virtual address space (220) is a simulated range of addresses (e.g., identifiable locations) that mimics the physical locations of one or more components of the hardware layer (210). In one embodiment, an application (212) is not configured to identify the physical addresses of one or more components of the hardware layer (210); rather, the application (212) relies on other components of the client application node (200) to translate one or more virtual addresses of the virtual address space (e.g., 220) to one or more physical addresses of one or more components of the hardware layer (210). Accordingly, in one or more embodiments of the invention, an application may utilize a virtual address space (220) to read, write, and/or otherwise manipulate data, without being configured to directly identify the physical address of that data within the components of the hardware layer (210).
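By way of a non-limiting illustration, the translation described above may be modeled as a page-granular lookup from virtual addresses to physical addresses. The following Go sketch is illustrative only; its identifiers (e.g., AddressSpace, Translate, pageSize) are hypothetical and are not drawn from the embodiments:

```go
package main

import (
	"errors"
	"fmt"
)

const pageSize = 4096

// AddressSpace models a per-application virtual address space as a
// page-granular mapping from virtual page numbers to physical frames.
type AddressSpace struct {
	pages map[uint64]uint64 // virtual page number -> physical frame base
}

// Translate resolves a virtual address to a physical address; a failed
// lookup corresponds to the page-fault path discussed below.
func (a *AddressSpace) Translate(vaddr uint64) (uint64, error) {
	frame, ok := a.pages[vaddr/pageSize]
	if !ok {
		return 0, errors.New("page fault: no mapping for virtual address")
	}
	return frame + vaddr%pageSize, nil
}

func main() {
	as := &AddressSpace{pages: map[uint64]uint64{0: 0x7f000000}}
	if paddr, err := as.Translate(0x10); err == nil {
		fmt.Printf("virtual 0x10 -> physical %#x\n", paddr)
	}
	if _, err := as.Translate(2 * pageSize); err != nil {
		fmt.Println(err) // an unmapped page triggers the fault path
	}
}
```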
Additionally, in one or more embodiments of the invention, an application may coordinate with other components of the client application node (200) to establish a mapping, see e.g.,
In one or more embodiments of the invention, a client FS container (206) is software executing on the client application node (200). A client FS container (206) may be an independent software instance that executes within a larger container management software instance (not shown) (e.g., Docker®, Kubernetes®, etc.). In embodiments in which the client FS container (206) is executing as an isolated software instance, the client FS container (206) may establish a semi-isolated virtual environment, inside the container, in which to execute an application (e.g., FS client (240) and memory hypervisor module (242), described below). In one embodiment of the invention, a client FS container (206) may be executing in “user space” (e.g., a layer of the software that utilizes low-level system components for the execution of applications) of the OS (208).
Referring to
In one or more embodiments of the invention, the FS client (240) may include functionality to generate one or more virtual-to-physical address mappings by translating a virtual address of a virtual address space (220) to a physical address of a component in the hardware layer (210). Further, in one embodiment of the invention, the FS client (240) may be configured to communicate one or more virtual-to-physical address mappings to one or more components of the hardware layer (210) (e.g., memory management unit (not shown)). In one embodiment of the invention, the FS client (240) tracks and maintains various mappings as described below in
In one embodiment of the invention, the memory hypervisor module (242) is software executing within the client FS container (206) that includes functionality to generate and issue I/O requests over fabric directly to storage media in the storage pool. Additional detail about the operation of the memory hypervisor module is described below in
Returning to
In one embodiment of the invention, the GPU module (246) is software executing in the OS (208) that manages the mappings between the virtual address space and physical addresses in the GPU memory (not shown) that the GPU(s) (244) is using. Said another way, the application executing in the application container may be GPU-aware and, as such, store data directly within the GPU memory. The application may interact with the GPU memory using virtual addresses. The GPU module (246) maintains a mapping between the virtual addresses used by the application and the corresponding physical address of the data located in the GPU memory. In one embodiment of the invention, prior to the application using the GPU memory, the GPU module may register all or a portion of the GPU memory with the RDMA engine (which implements RDMA) within the external communication interface(s) (232). This registration allows the data stored within the registered portion of the GPU memory to be directly accessed by the RDMA engine and transferred to the storage nodes (see e.g.,
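As a non-limiting illustration of the bookkeeping described above, the following Go sketch models a GPU module that maintains virtual-to-physical mappings for GPU memory and records which regions have been registered with an RDMA engine. The identifiers (gpuModule, registerWithRDMA, etc.) are hypothetical; a real implementation would use the memory-region registration facility of its RDMA stack (e.g., ibv_reg_mr in libibverbs) rather than the stub shown here:

```go
package main

import "fmt"

// gpuRegion describes a physically contiguous span of GPU memory.
type gpuRegion struct {
	physBase uint64
	length   uint64
}

// gpuModule tracks virtual-to-physical mappings for GPU memory and which
// regions have been registered with the RDMA engine for direct access.
type gpuModule struct {
	mappings   map[uint64]uint64 // virtual page -> physical GPU page base
	registered []gpuRegion
}

// registerWithRDMA models handing a region to the RDMA engine before the
// application uses it, so later transfers can bypass the CPU copy path.
func (g *gpuModule) registerWithRDMA(r gpuRegion) {
	g.registered = append(g.registered, r)
}

// translate resolves an application virtual address to the physical GPU
// address recorded in the module's mapping table.
func (g *gpuModule) translate(vaddr uint64) (uint64, bool) {
	const page = 4096
	base, ok := g.mappings[vaddr/page]
	return base + vaddr%page, ok
}

func main() {
	g := &gpuModule{mappings: map[uint64]uint64{0x10: 0xa0000}}
	g.registerWithRDMA(gpuRegion{physBase: 0xa0000, length: 1 << 20})
	paddr, ok := g.translate(0x10*4096 + 8)
	fmt.Println(paddr, ok, len(g.registered))
}
```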
In one or more embodiments of the invention, the hardware layer (210) is a collection of physical components configured to perform the operations of the client application node (200) and/or otherwise execute the software of the client application node (200) (e.g., those of the containers (202, 206), applications (e.g., 212)).
In one embodiment of the invention, the hardware layer (210) includes one or more communication interface(s) (232). In one embodiment of the invention, a communication interface (232) is a hardware component that provides capabilities to interface the client application node (200) with one or more devices (e.g., a client, another node in the CSI (104), etc.) and allow for the transmission and receipt of data (including metadata) with those device(s). A communication interface (232) may communicate via any suitable form of wired interface (e.g., Ethernet, fiber optic, serial communication, etc.) and/or wireless interface and utilize one or more protocols for the transmission and receipt of data (e.g., Transmission Control Protocol (TCP)/Internet Protocol (IP), Remote Direct Memory Access, IEEE 802.11, etc.).
In one embodiment of the invention, the communication interface (232) may implement and/or support one or more protocols to enable the communication between the client application nodes and external entities (e.g., other nodes in the CSI, one or more clients, etc.). For example, the communication interface (232) may enable the client application node to be operatively connected, via Ethernet, using a TCP/IP protocol to form a “network fabric” and enable the communication of data between the client application node and other external entities. In one or more embodiments of the invention, each node within the CSI may be given a unique identifier (e.g., an IP address) to be used when utilizing one or more protocols.
Further, in one embodiment of the invention, the communication interface (232), when using a certain protocol or variant thereof, supports streamlined access to storage media of other nodes in the CSI. For example, when utilizing remote direct memory access (RDMA) to access data on another node in the CSI, it may not be necessary to interact with the software (or storage stack) of that other node in the CSI. Rather, when using RDMA (via an RDMA engine (not shown) in the communication interface(s) (232)), it may be possible for the client application node to interact only with the hardware elements of the other node to retrieve and/or transmit data, thereby avoiding any higher-level processing by the software executing on that other node. In other embodiments of the invention, the communication interface enables direct communication with the storage media of other nodes using Non-Volatile Memory Express (NVMe) over Fabric (NVMe-oF) and/or persistent memory over Fabric (PMEMoF) (both of which may (or may not) utilize all or a portion of the functionality provided by RDMA).
In one embodiment of the invention, the hardware layer (210) includes one or more processor(s) (not shown). In one embodiment of the invention, a processor may be an integrated circuit(s) for processing instructions (e.g., those of the containers (202, 206), applications (e.g., 212) and/or those received via a communication interface (232)). In one embodiment of the invention, processor(s) may be one or more processor cores or processor micro-cores. Further, in one or more embodiments of the invention, one or more processor(s) may include a cache (not shown) (as described).
In one or more embodiments of the invention, the hardware layer (210) includes persistent storage (236). In one embodiment of the invention, persistent storage (236) may be one or more hardware devices capable of storing digital information (e.g., data) in a non-transitory medium. Further, in one embodiment of the invention, when accessing persistent storage (236), other components of client application node (200) are capable of only reading and writing data in fixed-length data segments (e.g., “blocks”) that are larger than the smallest units of data normally accessible (e.g., “bytes”).
Specifically, in one or more embodiments of the invention, when data is read from persistent storage (236), all blocks that include the requested bytes of data (some of which may include other, non-requested bytes of data) must be copied to other byte-accessible storage (e.g., memory). Then, only after the data is located in the other medium, may the requested data be manipulated at “byte-level” before being recompiled into blocks and copied back to the persistent storage (236).
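The read-modify-write cycle described above may be illustrated with the following Go sketch, in which a byte-level write to a block device copies the enclosing block into byte-accessible memory, edits it, and writes the whole block back. All names (blockDevice, writeBytes) and the 512-byte block size are illustrative assumptions:

```go
package main

import "fmt"

const blockSize = 512

// blockDevice models storage that is only addressable in fixed-length blocks.
type blockDevice struct{ blocks [][]byte }

func (d *blockDevice) readBlock(n int) []byte     { return append([]byte(nil), d.blocks[n]...) }
func (d *blockDevice) writeBlock(n int, b []byte) { copy(d.blocks[n], b) }

// writeBytes performs the cycle the text describes: the whole enclosing
// block is copied to byte-accessible memory, edited at byte granularity,
// and recompiled into a block that is copied back to the device.
func writeBytes(d *blockDevice, off int, data []byte) {
	n := off / blockSize
	buf := d.readBlock(n)           // copy the whole block into memory
	copy(buf[off%blockSize:], data) // byte-level edit in the copy
	d.writeBlock(n, buf)            // write the full block back
}

func main() {
	d := &blockDevice{blocks: [][]byte{make([]byte, blockSize)}}
	writeBytes(d, 10, []byte("hello"))
	fmt.Printf("%q\n", d.readBlock(0)[10:15])
}
```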
Accordingly, as used herein, “persistent storage”, “persistent storage device”, “block storage”, “block device”, and “block storage device” refer to hardware storage devices that are capable of being accessed only at a “block-level” regardless of whether that device is volatile, non-volatile, persistent, non-persistent, sequential access, random access, solid-state, or disk based. Further, as used herein, the term “block semantics” refers to the methods and commands software employs to access persistent storage (236).
Examples of “persistent storage” (236) include, but are not limited to, certain integrated circuit storage devices (e.g., solid-state drives (SSDs)), magnetic storage (e.g., hard disk drives (HDDs), floppy disks, tape, diskettes, etc.), optical media (e.g., compact discs (CDs), digital versatile discs (DVDs), etc.), NVMe devices, and computational storage. In one embodiment of the invention, an NVMe device is persistent storage that includes an SSD that is accessed using the NVMe® specification (which defines how applications communicate with an SSD via a peripheral component interconnect express (PCIe) bus). In one embodiment of the invention, computational storage is persistent storage that includes persistent storage media and microprocessors with domain-specific functionality to efficiently perform specific tasks on the data being stored in the storage device, such as encryption and compression.
In one or more embodiments of the invention, the hardware layer (210) includes memory (238). In one embodiment of the invention, memory (238), similar to persistent storage (236), may be one or more hardware devices capable of storing digital information (e.g., data) in a non-transitory medium. However, unlike persistent storage (236), in one or more embodiments of the invention, when accessing memory (238), other components of client application node (200) are capable of reading and writing data at the smallest units of data normally accessible (e.g., “bytes”).
Specifically, in one or more embodiments of the invention, memory (238) may include a unique physical address for each byte stored thereon, thereby enabling software (e.g., applications (212), containers (202, 206)) to access and manipulate data stored in memory (238) by directing commands to a physical address of memory (238) that is associated with a byte of data (e.g., via a virtual-to-physical address mapping). Accordingly, in one or more embodiments of the invention, software is able to perform direct, “byte-level” manipulation of data stored in memory (unlike persistent storage, for which “blocks” of data must first be copied to another, intermediary storage medium prior to reading and/or manipulating the data located thereon).
Accordingly, as used herein, “memory”, “memory device”, “memory storage”, “memory storage device”, and “byte storage device” refer to hardware storage devices that are capable of being accessed and/or manipulated at a “byte-level” regardless of whether that device is volatile, non-volatile, persistent, non-persistent, sequential access, random access, solid-state, or disk based. As used herein, the terms “byte semantics” and “memory semantics” refer to the methods and commands software employs to access memory (238).
Examples of memory (238) include, but are not limited to, certain integrated circuit storage (e.g., flash memory, random access memory (RAM), dynamic RAM (DRAM), resistive RAM (ReRAM), etc.) and Persistent Memory (PMEM). PMEM is a solid-state, high-performance, byte-addressable memory device that resides on the memory bus, where the location of the PMEM on the memory bus allows PMEM to have DRAM-like access to data, meaning that it has nearly the same speed and latency as DRAM while providing the non-volatility of NAND flash.
In one embodiment of the invention, the hardware layer (210) includes a memory management unit (MMU) (not shown). In one or more embodiments of the invention, an MMU is hardware configured to translate virtual addresses (e.g., those of a virtual address space (220)) to physical addresses (e.g., those of memory (238)). In one embodiment of the invention, an MMU is operatively connected to memory (238) and is the sole path to access any memory device (e.g., memory (238)) as all commands and data destined for memory (238) must first traverse the MMU prior to accessing memory (238). In one or more embodiments of the invention, an MMU may be configured to handle memory protection (allowing only certain applications to access memory) and provide cache control and bus arbitration. Further, in one or more embodiments of the invention, an MMU may include a translation lookaside buffer (TLB) (as described below).
In one embodiment of the invention, the hardware layer (210) includes one or more graphics processing units (GPUs) (244). In one embodiment of the invention, the GPUs (244) are a type of processor that includes a significantly larger number of cores than the processors discussed above. The GPUs (244) may utilize the cores to perform a large number of processes in parallel. The processes performed by the GPUs may include basic arithmetic operations. The GPUs may perform additional types of processes without departing from the invention.
In one or more embodiments of the invention, the GPUs include computing resources that allow the GPUs to perform the functions described throughout this application. The computing resources may include cache, GPU memory (e.g., dynamic random access memory (DRAM)), and the cores discussed above. The cores may be capable of processing one or more threads at a time and temporarily storing data in the cache and/or local memory during the processing. A thread is a process performed on data by a core of the GPUs.
While
In one embodiment of the invention, the metadata server (302) includes functionality to manage all or a portion of the metadata associated with the CSI. The metadata server (302) also includes functionality to service requests for data layouts that it receives from the various client application nodes. Said another way, each metadata node may support multiple client application nodes. As part of this support, the client application nodes may send data layout requests to the metadata node (300). The metadata node (300), in conjunction with the file system (304), generates and/or obtains the requested data layouts and provides the data layouts to the appropriate client application nodes. The data layouts provide a mapping between file offsets and [SOV, offset]s (see e.g.,
In one embodiment of the invention, the file system (304) includes functionality to manage a sparse virtual space (see e.g.,
In one embodiment of the invention, the memory hypervisor module (306) is substantially the same as the memory hypervisor module described in
In one embodiment of the invention, the metadata node (300) includes one or more communication interfaces (308). The communication interfaces are substantially the same as the communication interfaces described in
In one embodiment of the invention, metadata node (300) includes one or more processor(s) (not shown). In one embodiment of the invention, a processor may be an integrated circuit(s) for processing instructions (e.g., those of the metadata server (302), file system (304) and/or those received via a communication interface(s) (308)). In one embodiment of the invention, processor(s) may be one or more processor cores or processor micro-cores. Further, in one or more embodiments of the invention, one or more processor(s) may include a cache (not shown) (as described).
In one or more embodiments of the invention, the metadata node includes persistent storage (310), which is substantially the same as the persistent storage described in
In one or more embodiments of the invention, the metadata node includes memory (312), which is substantially similar to memory described in
In one embodiment of the invention, the storage server (402) includes functionality to manage the memory (408) and persistent storage (406) within the storage node.
In one embodiment of the invention, the server node includes communication interface(s) (404), which is substantially the same as the communication interface(s) described in
In one embodiment of the invention, server node (400) includes one or more processor(s) (not shown). In one embodiment of the invention, a processor may be an integrated circuit(s) for processing instructions (e.g., those of the storage server (402), and/or those received via a communication interface (404)). In one embodiment of the invention, processor(s) may be one or more processor cores or processor micro-cores. Further, in one or more embodiments of the invention, one or more processor(s) may include a cache (not shown) (as described).
In one or more embodiments of the invention, the server node includes persistent storage (406), which is substantially the same as the persistent storage described in
In one or more embodiments of the invention, the server node includes memory (408), which is substantially similar to memory described in
Referring to
When the OS (e.g., 208) interacts with the FS client (e.g., 240), it uses the file name (or file identifier) and offset to refer to a specific location from which the application (e.g., 212) is attempting to read or write. The FS client (e.g., 240) maps the logical blocks (e.g., logical block A, logical block B, logical block C) (which are specified using [file name, offset]) to corresponding file system blocks (FSBs) (e.g., FSB1, FSB2, FSB3). The FSBs that correspond to a given file layout (502) may be referred to as file system layout (504). In one embodiment of the invention, the file layout (502) typically includes a contiguous set of logical blocks, while the file system layout (504) typically includes a set of FSBs, which may or may not be contiguous FSBs. The mapping between the file layout (502) and the file system layout (504) is generated by the metadata server (see e.g.,
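A minimal sketch of this mapping, assuming 4K logical blocks and illustrative FSB numbers (the names fileSystemLayout and resolve are hypothetical, not drawn from the embodiments), is shown below in Go:

```go
package main

import "fmt"

const logicalBlockSize = 4096

// fileSystemLayout maps a file's logical block index to a file system
// block (FSB) number in the sparse virtual space; the FSBs need not be
// contiguous even when the logical blocks are.
type fileSystemLayout struct {
	fsbs map[int]uint64 // logical block index -> FSB number
}

// resolve maps a file offset -> logical block -> FSB, mirroring the
// FS client translation described above.
func (l *fileSystemLayout) resolve(offset int64) (uint64, bool) {
	fsb, ok := l.fsbs[int(offset/logicalBlockSize)]
	return fsb, ok
}

func main() {
	// Logical blocks A, B, C map to non-contiguous FSBs (values illustrative).
	layout := &fileSystemLayout{fsbs: map[int]uint64{0: 17, 1: 3, 2: 42}}
	fsb, _ := layout.resolve(2*logicalBlockSize + 100) // inside logical block C
	fmt.Println("FSB for logical block C:", fsb)
}
```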
Referring to
In one embodiment of the invention, the sparse virtual space (510) may be allocated with several petabytes of sparse space, with the intention being that the aggregate space of the storage media in the storage pool (532) will not exceed several petabytes of physical storage space. Said another way, the sparse virtual space (510) is sized to support an arbitrary number of virtual address spaces and an arbitrary amount of storage media such that the size of the sparse virtual space (510) remains constant after it has been initialized.
The sparse virtual space (510) may be logically divided into a metadata portion (512) and a data portion (514). The metadata portion (512) is allocated for the storage of file system metadata and FS client metadata. The file system metadata and the FS client metadata may correspond to any metadata (examples of which are provided below with respect to
In one or more embodiments of the invention, each FSB may be uniformly sized throughout the sparse virtual space (510). In one or more embodiments of the invention, each FSB may be equal to the largest unit of storage in storage media in the storage pool. Alternatively, in one or more embodiments of the invention, each FSB may be allocated to be sufficiently larger than any current and future unit of storage in storage media in the storage pool.
In one or more embodiments of the invention, one or more SOVs (e.g., 520) are mapped to FSBs in the sparse virtual space (510) to ultimately link the FSBs to storage media. More specifically, each SOV is a virtual data space that is mapped to corresponding physical regions of a portion of, one, or several storage devices, which may include one or more memory devices and one or more persistent storage devices. The SOV(s) (e.g., 520) may identify physical regions of the aforementioned devices by maintaining a virtual mapping to the physical addresses of data that comprise those memory devices (e.g., 238, 312, 408) or persistent storage devices (e.g., 236, 310, 406).
In one or more embodiments of the invention, several SOVs may concurrently exist (see e.g.,
In one embodiment of the invention, a SOV may be uniquely associated with a single storage device (e.g., a memory device or a persistent storage device). Accordingly, a single SOV may provide a one-to-one virtual emulation of a single storage device of the hardware layer. Alternatively, in one or more embodiments of the invention, a single SOV may be associated with multiple storage devices (e.g., memory devices or persistent storage devices), each sharing some characteristic. For example, there may be a single SOV for two or more DRAM devices and a second SOV for two or more PMEM devices. One of ordinary skill in the art, having the benefit of this detailed description, would appreciate that SOV(s) (e.g., 520) may be organized by any suitable characteristic of the underlying memory (e.g., based on individual size, collective size, type, speed, etc.).
In one embodiment of the invention, storage pool (532) includes one or more storage devices (e.g., memory devices and/or persistent storage devices). The storage devices (or portions thereof) may be mapped into the SOV in “slice” units (or “slices”). For example, each slice (e.g., 522, 524, 526, 528, 530) may have a size of 256 MB (the invention is not limited to this example). When mapped into the SOV, each slice may include a contiguous set of FSBs that have an aggregate size equal to the size of the slice. Accordingly, each of the aforementioned FSBs (e.g., 516, 518) is logically associated with a slice (e.g., 522, 524, 526, 528, 530) in the SOV. The portion of the slice that is mapped to a given FSB may be specified using an offset within a SOV (or an offset within a slice within the SOV). Each portion of the slice within a SOV is mapped to one or more physical locations in the storage pool. In one non-limiting example, the portion of client C (256) may be 4K in size and may be stored in the storage pool (532) as a 6K stripe with four 1K data chunks (e.g., chunk w (534), chunk x (536), chunk y (538), chunk z (540)) and two 1K parity chunks (e.g., chunk P (542), chunk Q (544)). In one embodiment of the invention, slices that only include FSBs from the metadata portion are referred to as metadata slices and slices that only include FSBs from the data portion are referred to as data slices.
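The slice and stripe arithmetic in the preceding example may be sketched as follows. The Go code below assumes the 256 MB slice size and 1K chunk size from the example; the XOR parity stands in for the first parity chunk (P), while the second parity chunk (Q) of a real erasure code is stubbed with zeros, so this is an illustration rather than a complete 4+2 scheme:

```go
package main

import "fmt"

const (
	sliceSize = 256 << 20 // 256 MB slices, per the example above
	chunkSize = 1024      // 1K chunks in the 4+2 stripe example
)

// sovAddress computes an illustrative [SOV, offset] location for a byte,
// given a slice number and the offset within that slice.
func sovAddress(slice int, offsetInSlice int64) int64 {
	return int64(slice)*sliceSize + offsetInSlice
}

// stripe splits a 4K payload into four 1K data chunks and two parity
// chunks, matching the 6K-stripe example above. P is a simple XOR parity;
// Q is a zero-filled placeholder for the second parity chunk.
func stripe(data []byte) (chunks [4][]byte, p, q []byte) {
	p = make([]byte, chunkSize)
	q = make([]byte, chunkSize) // placeholder for the second parity chunk
	for i := range chunks {
		chunks[i] = data[i*chunkSize : (i+1)*chunkSize]
		for j, b := range chunks[i] {
			p[j] ^= b
		}
	}
	return chunks, p, q
}

func main() {
	payload := make([]byte, 4*chunkSize)
	w, p, _ := stripe(payload)
	fmt.Println(len(w), "data chunks, parity bytes:", len(p))
	fmt.Println("SOV offset of slice 2 + 4K:", sovAddress(2, 4096))
}
```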
Using the relationships shown in
Using the aforementioned architecture, the available storage media in the storage pool may increase or decrease in size (as needed) without impacting how the application (e.g., 212) is interacting with the sparse virtual space (510). More specifically, by creating a layer of abstraction between the sparse virtual space (510) and the storage pool (532) using the SOV (520), the sparse virtual space (510) continues to provide FSBs to the applications (provided that these FSBs are mapped to a SOV) without having to manage the mappings to the underlying storage pool. Further, by utilizing the SOV (520), changes made to the storage pool, including how data is protected in the storage pool, are performed in a manner that is transparent to the sparse virtual space (510). This enables the size of the storage pool to scale to an arbitrary size (up to the size limit of the sparse virtual space) without modifying the operation of the sparse virtual space (510).
The method shown in
A page fault typically specifies the virtual address (i.e., an address in the virtual address space (e.g., 220)). The page fault may specify other information depending on whether the page fault was triggered by a read, write, or mapping request.
In one or more embodiments of the invention, as described in
In one or more embodiments of the invention, the OS will, initially, be configured to forward the page fault to the application from which the request originated. However, in one embodiment of the invention, the kernel module detects that the OS received a page fault and forwards the page fault to a different location (i.e., the client FS container) instead of the default recipient (i.e., the application container and/or application). In one embodiment of the invention, the kernel module specifically monitors for and detects exception handling processes that specify an application's inability to access the physical location of data.
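A minimal sketch of this redirection, with the kernel module forwarding faults to the client FS container over a channel instead of delivering them to the application, is shown below in Go. The types (pageFault, kernelModule) are hypothetical; on Linux, a comparable interception point for delegating fault handling to user space is the userfaultfd mechanism:

```go
package main

import "fmt"

// faultKind distinguishes what triggered the page fault.
type faultKind int

const (
	faultRead faultKind = iota
	faultWrite
	faultMapping
)

type pageFault struct {
	vaddr uint64
	kind  faultKind
}

// kernelModule models the redirection described above: faults that would
// normally be delivered to the faulting application are forwarded to the
// client FS container instead.
type kernelModule struct {
	toFSContainer chan pageFault
}

func (k *kernelModule) onFault(f pageFault) {
	// Instead of the default recipient (the application), forward the
	// fault so the client FS container can service it.
	k.toFSContainer <- f
}

func main() {
	km := &kernelModule{toFSContainer: make(chan pageFault, 1)}
	km.onFault(pageFault{vaddr: 0x7f0000001000, kind: faultRead})
	f := <-km.toFSContainer
	fmt.Printf("client FS container handling fault at %#x\n", f.vaddr)
}
```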
Turning to
In step 602, the client FS container sends a request to a metadata node to obtain a data layout associated with the [file, offset]. The request for the data layout may also specify that the request is for read only access or for read write access. In one embodiment of the invention, read only access indicates that the application only wants to read data from a physical location associated with the virtual address, while read write access indicates that the application wants to read data from and/or write data to a physical location associated with the virtual address. From the perspective of the application, the physical location is a local physical location (i.e., a physical location in the memory or the persistent storage) on the client application node; however, as shown in
In one embodiment of the invention, each FS client (e.g., 240) is associated with a single file system (e.g., 304) (however, each file system may be associated with multiple FS clients). The request in step 602 is sent to the metadata node that hosts the file system that is associated with the FS client on the client application node (i.e., the client application node on which the page fault was generated).
In step 604, the metadata node receives the request from the client FS container.
In step 606, in response to the request, the metadata server (on the metadata node) identifies one or more FSBs in the sparse virtual space. The identified FSBs correspond to FSBs that are allocatable. An FSB is deemed allocatable if: (i) the FSB is mapped to the SOV and (ii) the FSB has not already been allocated. Condition (i) is required because while the sparse virtual space includes a large collection of FSBs, by design, at any given time not all of these FSBs are necessarily associated with any SOV(s). Accordingly, only FSBs that are associated with a SOV at the time step 606 is performed may be allocated. Condition (ii) is required as the sparse virtual space is designed to support applications distributed across multiple clients and, as such, one or more FSBs that are available for allocation may have been previously allocated by another application. The FSBs identified in step 606 may be denoted pre-allocated FSBs in the event that no application has written any data to these FSBs.
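Conditions (i) and (ii) may be expressed directly in code. The following Go sketch (with hypothetical names fsbState and findAllocatable) scans the sparse virtual space for allocatable FSBs:

```go
package main

import "fmt"

// fsbState tracks the two conditions for allocatability described in
// step 606: the FSB must be mapped to a SOV and not already allocated.
type fsbState struct {
	mappedToSOV bool
	allocated   bool
}

type sparseVirtualSpace struct {
	fsbs map[uint64]*fsbState
}

// findAllocatable returns up to n FSB numbers satisfying both conditions.
func (s *sparseVirtualSpace) findAllocatable(n int) []uint64 {
	var out []uint64
	for num, st := range s.fsbs {
		if st.mappedToSOV && !st.allocated {
			out = append(out, num)
			if len(out) == n {
				break
			}
		}
	}
	return out
}

func main() {
	s := &sparseVirtualSpace{fsbs: map[uint64]*fsbState{
		1: {mappedToSOV: true, allocated: false},  // allocatable
		2: {mappedToSOV: false, allocated: false}, // fails condition (i)
		3: {mappedToSOV: true, allocated: true},   // fails condition (ii)
	}}
	fmt.Println("allocatable FSBs:", s.findAllocatable(2))
}
```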
In one embodiment of the invention, the FSBs identified in step 606 may not be sequential (or contiguous) FSBs in the sparse virtual space. In one or more embodiments of the invention, more than one FSB may be allocated (or pre-allocated) for each logical block. For example, consider a scenario in which each logical block is 8K and each FSB is 4K. In this scenario, two FSBs are allocated (or pre-allocated) for each logical block. The FSBs that are associated with the same logical block may be sequential (or contiguous) FSBs within the sparse virtual space.
In step 608, after the FSB(s) has been allocated (or pre-allocated as the case may be), the metadata server generates a data layout. The data layout provides a mapping between the [file, file offset] (which was included in the request received in step 600) and a [SOV, offset]. The data layout may include one or more of the aforementioned mappings between [file, file offset] and [SOV, offset]. Further, the data layout may also specify the one or more FSBs associated with the data layout.
In one embodiment of the invention, if the request in step 602 specifies read only access, then the data layout will include [file, file offset] to [SOV, offset] mappings for the FSBs that include the data that the application (in the client application node) is attempting to read. In one embodiment of the invention, if the request in step 602 specifies read write access, then the data layout may include one set of [file, file offset] to [SOV, offset] mappings for the FSBs that include the data that the application (in the client application node) is attempting to read and a second set of [file, file offset] to [SOV, offset] mappings for the FSBs to which the application may write data. The dual set of mappings provided in the aforementioned data layout may be used to support redirected writes, i.e., the application does not overwrite data; rather, all new writes are directed to new FSBs.
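A minimal sketch of such a data layout, assuming hypothetical field names and illustrative offsets, is shown below; for read write access the layout carries both a read set and a write set of mappings, reflecting the redirected-write behavior described above:

```go
package main

import "fmt"

// mapping ties a [file, file offset] to a [SOV, offset].
type mapping struct {
	fileOffset int64
	sovOffset  int64
}

// dataLayout carries the mappings returned in step 608. For read write
// access it holds two sets: where to read existing data, and fresh FSBs
// to which new writes are redirected (writes never overwrite in place).
type dataLayout struct {
	file     string
	readSet  []mapping
	writeSet []mapping // empty for read only layouts
}

func main() {
	dl := dataLayout{
		file:     "fileA",
		readSet:  []mapping{{fileOffset: 0, sovOffset: 18 << 10}},
		writeSet: []mapping{{fileOffset: 0, sovOffset: 96 << 10}}, // new FSB
	}
	fmt.Printf("redirect write of %s@%d to SOV offset %d\n",
		dl.file, dl.writeSet[0].fileOffset, dl.writeSet[0].sovOffset)
}
```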
Continuing with the discussion of
In step 612, the client application node receives and caches the data layout from the metadata node. The FS client may also create an association between the logical blocks in the file layout (e.g., 502) and the corresponding FSBs in the file system layout (e.g., 504) based on the data layout.
In one embodiment of the invention, the FS client allocates an appropriate amount of local memory (e.g., local DRAM, local PMEM), which is/will be used to temporarily store data prior to it being committed to (i.e., stored in) the storage pool using the received data layout. Further, if the request that triggered the page fault (see step 600) was a read request, then the FS client may further initiate the reading of the requested data from the appropriate location(s) in the storage pool (e.g., via the memory hypervisor module) and store the obtained data in the aforementioned local memory.
In step 614, the client FS container informs the OS (or kernel module in the OS) of the virtual-to-physical address mapping. The virtual-to-physical address mapping is a mapping of a location in the virtual address space and a physical address in the local memory (as allocated in step 612). Once the aforementioned mapping is provided, the application and/or OS may directly manipulate the local memory of the client application node (i.e., without processing from the client FS container).
While
The method shown in
If the application has initiated the storage of the data using a msync or fflush command, then steps 700-716 are performed, resulting in the data being persisted. In this scenario, the data is written to storage as a first part of processing the msync or fflush command, and then the metadata (including the data layout) is stored on the metadata server as the second part of processing the msync or fflush command.
However, if the OS or client FS container initiates the storage of the data, then the corresponding metadata may or may not be committed (i.e., steps 714 and 716 may not be performed). In certain scenarios, steps 714-716 may be initiated by the OS or the client FS container and performed by the client FS container as part of the OS or client FS container managing the local resources (e.g., portions of the cache used to store the data layouts need to be freed to store other data layouts).
In step 700, a request to write data (i.e., write data to the storage pool; however, the metadata may or may not be committed, see e.g., step 714) is received by the client FS container from the OS. The request may specify a virtual address corresponding to the location of the data in GPU memory and a [file, offset]. As discussed above, the writing of data may also be initiated by the OS and/or the client FS container without departing from the invention. In such embodiments, the request is initiated by the OS and/or another process in the client FS container and the process that initiated the request provides the [file, offset] to the FS client.
In step 702, the FS client obtains the data layout required to service the request. The data layout may be obtained using the [file, offset] in the request received from the OS. The data layout may be obtained from a cache on the client application node. However, if the data layout is not present on the client application node, e.g., because it was invalidated and, thus, removed from the client application node, then the data layout is obtained from the metadata node in accordance with
In step 704, the FS client, using the data layout, obtains the [SOV, offset]. As discussed above, the data layout provides a mapping between file offsets (e.g., offsets within a file layout (e.g., 502)) and the [SOV, offset]s in a SOV (e.g., 520). Accordingly, the FS client translates the [file, offset] into [SOV, offset].
In step 706, the memory hypervisor module issues a translation request to the GPU module, where the translation request specifies the virtual address (i.e., the virtual address specified in the write request in step 700).
In step 708, the memory hypervisor module receives a translation response from the GPU module that includes a physical address in the GPU memory, which corresponds to the virtual address. The GPU module is configured to receive the translation request, perform a look-up in the virtual-to-physical address mapping, and provide the resulting physical address in the translation response.
In step 710, the [SOV, offset] is then provided to the memory hypervisor module to process. More specifically, the memory hypervisor module includes the information necessary to generate and issue one or more I/O requests that result in the data being written directly from the GPU memory on the client application node (e.g., via a communication interface(s)) to an appropriate location in the storage pool. For example, if the application is attempting to write data associated with logical block A (e.g., [File A, offset 0]), then the memory hypervisor module is provided with [SOV, offset 18] (which is determined using the obtained data layout). The memory hypervisor module includes the necessary information to enable it to generate, in this example, one or more I/O requests to specific locations in the storage pool. Said another way, the memory hypervisor module includes functionality to: (i) determine how many I/O requests to generate to store the data associated with [SOV, offset 18]; (ii) divide the data into an appropriate number of chunks (i.e., one chunk per I/O request); (iii) determine the target of each I/O request (the physical location in the storage pool at which the chunk will be stored); and (iv) issue the I/O requests directly to the nodes on which the aforementioned physical locations exist. The issuance of the I/O requests includes initiating the transfer of data from the appropriate location in the GPU memory to the target location specified in the I/O request.
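The four-part behavior enumerated above may be sketched as follows in Go. The chunk size, round-robin placement, and names (ioRequest, buildIORequests) are illustrative assumptions; a real memory hypervisor module would derive targets from the SOV-to-storage-pool mappings rather than from a simple rotation:

```go
package main

import "fmt"

// ioRequest is one fabric I/O the memory hypervisor issues directly to
// the node holding the target physical location.
type ioRequest struct {
	targetNode string // node holding the physical location
	physOffset int64  // location on that node's storage media
	length     int
}

// buildIORequests follows the four steps in the text: decide how many
// requests are needed, divide the data into chunks, pick each chunk's
// target, and return the requests ready to issue.
func buildIORequests(sovOffset int64, data []byte, chunkSize int, nodes []string) []ioRequest {
	var reqs []ioRequest
	for off := 0; off < len(data); off += chunkSize {
		end := off + chunkSize
		if end > len(data) {
			end = len(data)
		}
		reqs = append(reqs, ioRequest{
			targetNode: nodes[(off/chunkSize)%len(nodes)], // illustrative placement
			physOffset: sovOffset + int64(off),
			length:     end - off,
		})
	}
	return reqs
}

func main() {
	reqs := buildIORequests(18<<10, make([]byte, 4096), 1024,
		[]string{"storage-node-1", "storage-node-2"})
	for _, r := range reqs {
		fmt.Printf("write %d bytes to %s at offset %d\n", r.length, r.targetNode, r.physOffset)
	}
}
```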
The communication interface(s) in the client application node facilitates the direct transfer of the data from the client application node to the appropriate location in the storage pool. As discussed above, the storage pool may include storage media located in storage devices (e.g., memory devices or persistent storage devices) that may be on client application nodes, metadata nodes, and/or storage nodes. Accordingly, for any given I/O request, the communication interface(s) on the client application node on which the data resides transmits the data directly to the communication interface(s) of the target node (i.e., the node that includes the storage media on which the data is to be written).
In step 712, the client application node waits for confirmation from the target node(s) that the data in the I/O request(s) generated and issued in step 710 has been successfully stored on the target node(s). At the end of step 712, the data has been written to the storage pool; however, the corresponding metadata is not persisted at this point; as such, the data is not deemed to be persisted. Specifically, if the application does not subsequently issue an msync command (e.g., when the application is using memory semantics) or an fflush command (e.g., when the application is using file semantics), the data will be stored in the storage pool but the metadata server will not be aware that such data has been stored. In order to persist the data, steps 714 and 716 are performed. If steps 700-710 were initiated by the OS or the client FS container, then the process may end at step 712 as the data was only written to the storage pool to free local resources (e.g., memory) on the client application node and there is no need at this time to persist the data (i.e., perform steps 714-716). Further, in scenarios in which the OS initiated the writing of the data, step 712 also includes the client FS container notifying the OS that the data has been written to the storage pool. However, as discussed below, there may be scenarios in which the data needs to be persisted at this time and, as such, steps 714-716 are performed.
Specifically, the data (and associated metadata) may be persisted as a result of: (i) the application issuing an msync command (e.g., when the application is using memory semantics) or an fflush command (e.g., when the application is using file semantics), (ii) the client FS container initiating (transparently to the application) steps 714 and 716, or (iii) the OS initiating (transparently to the application) steps 714 and 716.
If the application issues a request to commit data (e.g., issues an msync command or an fflush command), then in step 714, the client application node (in response to the confirmation in step 712) sends a request to commit the data layout to the metadata node. The commit request includes the mapping between the file layout and the file system layout (see e.g.,
In scenarios in which the OS or client FS container has previously committed the data layout to the metadata node, then when the client FS container receives a request to persist the data from the application, the client FS container confirms that it has previously committed the corresponding data layout (and other related metadata) (without issuing any request to the metadata nodes). After making this determination locally, the client FS container then proceeds to step 716.
Finally, in scenarios in which the OS or the client FS container needs to commit the corresponding metadata to the metadata server (e.g., portions of the cache used to store the data layouts need to be freed to store other data layouts), steps 714 and 716 may be initiated by the OS or the client FS container and performed by the client FS container.
In step 716, the client FS container then notifies the OS that the data has been persisted. The OS may then send the appropriate confirmation and/or notification to the application that initiated the request to persist the data. The OS does not notify the application when
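The two-phase behavior of the method (data written in steps 700-712, metadata committed in steps 714-716) may be summarized with the following Go sketch; the type persistState and its methods are hypothetical illustrations of the state transitions, not part of the embodiments:

```go
package main

import "fmt"

// persistState tracks the two-phase behavior described in steps 700-716:
// data can be written to the storage pool without being deemed persisted
// until the corresponding metadata is committed to the metadata node.
type persistState struct {
	dataWritten       bool
	metadataCommitted bool
}

// writeData corresponds to steps 700-712 (write data, await confirmation).
func (p *persistState) writeData() { p.dataWritten = true }

// msync corresponds to steps 714-716: commit the data layout to the
// metadata node, after which persistence can be reported to the OS.
func (p *persistState) msync() error {
	if !p.dataWritten {
		return fmt.Errorf("nothing to commit")
	}
	p.metadataCommitted = true
	return nil
}

func main() {
	var p persistState
	p.writeData()
	fmt.Println("persisted after write only?", p.dataWritten && p.metadataCommitted) // false
	_ = p.msync()
	fmt.Println("persisted after msync?     ", p.dataWritten && p.metadataCommitted) // true
}
```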
While one or more embodiments have been described herein with respect to a limited number of embodiments and examples, those skilled in the art, having benefit of this disclosure, would appreciate that other embodiments can be devised which do not depart from the scope of the embodiments disclosed herein. Accordingly, the scope should be limited only by the attached claims.