This application is based upon and claims the benefit of priority of the prior Japanese Patent application No. 2019-210125, filed on Nov. 21, 2019, the entire contents of which are incorporated herein by reference.
The embodiment discussed herein is related to an information processing apparatus and a non-transitory computer-readable recording medium having stored therein an information processing program.
In an information processing apparatus such as a server or a Personal Computer (PC), an access to a main storage device exemplified by a memory, such as a Dynamic Random Access Memory (DRAM), is made by a processor (processing unit) such as a Central Processing Unit (CPU).
A processor includes one or more CPU cores (sometimes simply referred to as “cores”) and a memory controller. The core accesses data stored in the memory through execution of a process (may also be referred to as “program”), and the memory controller controls an access to the memory serving as an access target by the core.
In recent years, memories adopting the next generation memory technique have appeared. As such a memory, a memory adopting, for example, Intel Optane DC Persistent Memory (hereinafter, sometimes referred to as “PM”) (registered trademark) employing 3D XPoint (registered trademark) technique is known.
Compared with the DRAM, the PM has lower processing performance (particularly, writing performance) (about one-tenth as an example), but is less expensive and larger in capacity (about ten-fold as an example).
Like the DRAM, the PM can be mounted on a memory slot, such as a Dual Inline Memory Module (DIMM) slot, and a memory controller controls accesses both to the DRAM and the PM. In other words, the DRAM, which is an example of a first memory, and the PM, which is an example of a second memory being different in process performance (process speed) from the DRAM coexist in the same storage (memory) layer.
In an environment where the DRAM and the PM coexist in the same storage layer, an operation mode is prepared in which a program such as an application is arranged at least in a storing region (program region) of the DRAM and data is arranged at least in a part (storage region) of a storing region of the PM. In this operation mode, at least a part of the storing region of the PM can be used as storage.
[Patent Document 1] International Publication Pamphlet No. WO 2017/098591
[Patent Document 2] Japanese Laid-Open Patent Publication No. 2012-243117
[Patent Document 3] Japanese Laid-Open Patent Publication No. 2011-071764
The development of the PM assumes a usage that uses the PM included in an information processing apparatus (first information processing apparatus) as a shared storage region and causes another information processing apparatus (second information processing apparatus) to make a remote access to the shared storage region.
However, remote access to the storage region does not presuppose a case where the DRAM and the PM coexist in the same storage layer and the memory controller controls both the DRAM and the PM.
For example, it is assumed that an access to a program region (memory region) or a shared storage region made by an application executed by a processor (core) of an information processing apparatus and a remote access to the shared storage region are executed in parallel. In this case, there is a possibility that the processing time (processing delay) in a memory controller increases due to congestion caused by the access process of the application and the remote access process, and that the memory accessing performance of the processor decreases.
According to an aspect of the embodiments, an information processing apparatus including: a memory region; a communication interface that is connected to an access apparatus different from the information processing apparatus; a storage region that the communication interface accesses in response to an access request from the access apparatus; and a processor coupled to the memory region and the storage region, and configured to access the memory region and the storage region, wherein the processor including a memory controller configured to control an access to the memory region and an access to the storage region, and the processor is configured to control, based on a state of one or more first accesses to the memory region and the storage region via the memory controller, a timing of executing a second access to the storage region that the communication interface makes via the memory controller.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention, as claimed.
Hereinafter, an embodiment of the present invention will now be described with reference to the accompanying drawings. However, one embodiment described below is merely illustrative and there is no intention to exclude the application of various modifications and techniques not explicitly described below. For example, the present embodiment can be variously modified and implemented without departing from the scope thereof. In the drawings to be used in the following description, the same reference numbers denote the same or similar parts, unless otherwise specified.
As exemplarily illustrated in
This means that, although lower in processing performance (particularly, writing performance) and lower in write endurance than the DRAM 120, the PM 130 is less expensive and larger in capacity than the DRAM 120. Similar to the DRAM 120, the PM 130 can be accessed in units of bytes and can be mounted on a memory slot such as a DIMM slot. Furthermore, since the PM 130 is non-volatile unlike the DRAM 120, the data in the PM 130 does not vanish when the power supply is cut off.
For these reasons, it is expected that an information processing apparatus mounting thereon both the DRAM 120 and the PM 130 as memory (main storage device) will become popular.
As illustrated in
As illustrated in
The server 100 may also include a memory extended region 121 formed by extending storing regions of the multiple DRAMs 120. The memory extended region 121 is a region formed by extending the storing regions of the multiple DRAMs 120 using at least some of storing regions 130a among the multiple PMs 130, and may be mainly used for storing a program such as an application.
The server 100 may further include a storage region 131. The storage region 131 is a region using at least some of storing regions 130b of the multiple PMs 130, and may be mainly used for storing data (e.g., user data). The storage region 131 is allowed to be remotely accessed by another information processing apparatus different from the server 100 and may be referred to as a “shared” storage region 131 shared with the other information processing apparatus.
Here, the storing regions 130a and 130b may have an exclusive relationship with each other, for example, the sum of the size of the storing region 130a and the size of the storing region 130b may be equal to the total sum of the sizes of the storing regions of the PMs 130. Also, the size of the storing region 130a may be zero. That is, the size of the memory extended region 121 may be equal to the total sum of the sizes of the multiple DRAMs 120, and the size of the storage region 131 may be equal to the sum of the sizes of the multiple PMs 130.
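By way of a non-limiting illustration, the size relationship described above can be sketched as follows. The function name `partition_sizes` and the numeric sizes are hypothetical and are introduced only for this example; the sketch assumes the exclusive relationship in which the storing regions 130a and 130b together cover the whole of the PMs 130.

```python
def partition_sizes(dram_sizes, pm_sizes, size_130a):
    """Return (memory_extended_region_size, storage_region_size).

    Assumes the exclusive relationship: size(130a) + size(130b) == total PM size.
    """
    pm_total = sum(pm_sizes)
    if not 0 <= size_130a <= pm_total:
        raise ValueError("size_130a must lie between 0 and the total PM size")
    size_130b = pm_total - size_130a          # remainder of the PMs forms region 130b
    extended = sum(dram_sizes) + size_130a    # DRAM extended by region 130a
    return extended, size_130b


# Hypothetical sizes in GiB: two DRAMs of 16 GiB and two PMs of 128 GiB.
print(partition_sizes([16, 16], [128, 128], 0))    # size of 130a is zero
print(partition_sizes([16, 16], [128, 128], 64))   # 64 GiB of PM extends the memory
```

When the size of the storing region 130a is zero, the first returned value equals the total DRAM size and the second equals the total PM size, which corresponds to the case mentioned above.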
The CPU 110 includes a core (not illustrated) and a memory controller 112, and an application 111 (denoted as “APP A” in
With this configuration, the APP A can access each of the DRAMs 120 and the PMs 130 without delay in the server 100.
Hereafter, an example of a remote access is assumed to be a Remote Direct Memory Access (RDMA). DMA is a method of transferring data directly between memories (or between a memory and an Input/Output (I/O) device). The RDMA is a scheme of DMA-transferring data over a network from a memory of a first computer to a memory of a second computer.
The PC 200 is an access PC that accesses the storage region 131 of the PM 130. The PC 200 may be, for example, a controller (e.g., a controller module (CM)) of a storage apparatus. As illustrated in
The NIC 230 sends, to the server 100, a request including a pointer specifying the starting position of a transfer target region 220a in the DRAM 220 and a transfer size of the transfer target region 220a (referred to as “ptr” and “size” in
The server 100 includes a chip set 170 and an NIC 180 in addition to the elements illustrated in
This cooperation of the NICs 230 and 180 achieves RDMA transfer of data from the DRAM 220 of the PC 200 to the shared storage region 131 of the server 100.
The example of
Furthermore, in the example of
As illustrated in
As a result, a delay on the PIO side increases as writing into the PM 130 by the RDMA increases. This is because the DMA controller 181 of the HCA 180 starts the RDMA regardless of a state of accesses in the DRAMs 120 and/or the PMs 130. Furthermore, as described above, since the PM 130 has lower processing performance, particularly lower writing performance (e.g., about one-tenth) than the DRAM 120, the delay on the PIO side remarkably increases.
Thus, in cases where an access to the memory extended region 121 or the storage region 131 by the APP A and an access to the storage region 131 by the RDMA are executed in parallel, congestion of the processes may occur in the MC 112. In this case, there is a possibility that the processing time (processing delay) in the MC 112 increases and the memory accessing performance of the CPU 110 decreases. This may accompany an increase of the response delay to the PC 200.
Accordingly, for the DRAM 120 and the PM 130 whose accesses are controlled by the same MC 112, a demand arises for a method of controlling an access from the application 111 and a remote access so as not to generate congestion.
Therefore, in one embodiment, description will now be made in relation to a method of accessing the storage region 131 of the PM 130 without impairing the memory accessing performance from the application.
The processor 1a is an example of an arithmetic processing apparatus that performs various controls and arithmetic operations. The processor 1a may be communicably connected to the other blocks in the server 1 via a bus 1i. In one embodiment, the processor 1a may be a multiprocessor including multiple processors (e.g., multiple CPUs). Also, each of the multiple processors may be a multi-core processor having multiple processor cores.
The MC 2b is connected to one or more (three in the example of
For example, the MC 2b may associate a different address range with each of the DRAM 3 and the PM 4 of each memory channel 5. The MC 2b may then selectively access one of the DRAM 3 and the PM 4 via the memory channel 5 shared by the DRAM 3 and the PM 4 with reference to a memory address specified by the core 2a. In other words, the MC 2b may control accesses to the DRAM 3 and the PM 4.
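As a minimal sketch of the address-range association described above, the following example selects a device from a memory address. The names `Range`, `decode`, and `ADDRESS_MAP`, as well as the address values, are hypothetical and chosen only for illustration; an actual memory controller performs this selection in hardware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    start: int    # first address of the range (inclusive)
    end: int      # end of the range (exclusive)
    channel: int  # memory channel shared by a DRAM and a PM
    device: str   # "DRAM" or "PM"


def decode(address_map, addr):
    """Return (channel, device) for the range that contains addr."""
    for r in address_map:
        if r.start <= addr < r.end:
            return r.channel, r.device
    raise ValueError(f"address {addr:#x} is not mapped")


# Hypothetical map: each channel carries one DRAM range and one PM range.
ADDRESS_MAP = [
    Range(0x0000, 0x1000, 0, "DRAM"),
    Range(0x1000, 0x5000, 0, "PM"),
    Range(0x5000, 0x6000, 1, "DRAM"),
    Range(0x6000, 0xA000, 1, "PM"),
]

print(decode(ADDRESS_MAP, 0x1234))
```

Because both devices on a channel are reached through the same controller, accesses to either address range contend for the same resource, which is the situation the embodiment addresses.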
As the processor 1a, the CPU may be replaced with an Integrated Circuit (IC) such as a Micro Processing Unit (MPU), a Graphics Processing Unit (GPU), an Accelerated Processing Unit (APU), a Digital Signal Processor (DSP), an Application Specific IC (ASIC), and a Field-Programmable Gate Array (FPGA).
Referring back to the description of
Note that the DRAM 3 is an example of the first memory, and the PM 4 is an example of the second memory that differs (e.g., is slower) in process speed from the first memory and that shares at least a part of a storing region with the first memory.
The storing device 1c is an example of a HW device that stores information such as various data and programs. Examples of the storing device 1c include various storage devices such as a semiconductor drive device such as a Solid State Drive (SSD), a magnetic disk device such as a Hard Disk Drive (HDD), and a non-volatile memory. Examples of the non-volatile memory include a flash memory, a Storage Class Memory (SCM), and a Read Only Memory (ROM).
The storing device 1c may also store a program 1g that implements all or some of the various functions of the server 1. For example, the processor 1a of the server 1 can achieve the function as a processing unit 10 to be described below and illustrated in
The IF device 1d is an example of a communication IF that controls the connection to and communication with a non-illustrated network. For example, the IF device 1d may include an adapter conforming to a Local Area Network (LAN) such as InfiniBand (registered trademark) and Ethernet (registered trademark), optical communication (e.g., Fibre Channel (FC)), or the like. For example, the program 1g may be downloaded from a network to the server 1 via the communication IF and stored into the storing device 1c.
The I/O device 1e may include one or both of an input device, such as a mouse, a keyboard, or an operating button, and an output device, such as a touch panel display, a monitor, such as a Liquid Crystal Display (LCD), a projector, or a printer.
The reader 1f is an example of a reader that reads information of data and programs recorded on a recording medium 1h. The reader 1f may include a connecting terminal or device to which the recording medium 1h can be connected or inserted. Examples of the reader 1f include an adapter conforming to, for example, a Universal Serial Bus (USB), a drive apparatus that accesses a recording disk, and a card reader that accesses a flash memory such as an SD card. The program 1g may be stored in the recording medium 1h. The reader 1f may read the program 1g from the recording medium 1h and store the read program 1g into the storing device 1c.
The recording medium 1h is an example of a non-transitory recording medium such as a magnetic/optical disk or a flash memory. Examples of the magnetic/optical disk include a flexible disk, a Compact Disc (CD), a Digital Versatile Disc (DVD), a Blu-ray disk, and a Holographic Versatile Disc (HVD). Examples of the flash memory include a semiconductor memory such as a USB memory and an SD card.
The HW configuration of the server 1 described above is merely illustrative. Accordingly, the server 1 may appropriately undergo increase or decrease of HW (e.g., addition or deletion of arbitrary blocks), division, integration in an arbitrary combination, and addition or deletion of the bus.
Further, a PC 6 serving as an exemplary information processing apparatus, which will be described below with reference to
The processing unit 10 executes various processes in the server 1, and is an example of the function achieved by the processor 2 illustrated in
The DRAM 3 and the PM 4 constitute a memory extended region (program region) 31, which is an example of a memory region accessed by the processing unit 10, by the storing region of the DRAM 3 and at least a part of the storing region of the PM 4.
The PM 4 also constitutes a storage region 41 that the communicator 15 accesses in response to RDMA transfer requests from the processing unit 10 and the PC 6, by at least a part of the storing region of the PM 4. The storage region 41 may be accessibly shared from information processing apparatuses, such as the PC 6, and may be referred to as a shared storage region 41.
The communication controller 14 controls communication between the components in the server 1 and is an example of the chip set 170 illustrated in
The communicator 15 communicates with the PC 6 and is an example of the communication interface connected to the PC 6. The communicator 15, in one embodiment, may include an IF conforming to the standard of Infiniband (registered trademark). The communicator 15 is an example of the function achieved by the IF device 1d being illustrated in
The communicator 15 may include an RDMA executor 15a that performs an RDMA, as illustrated in
The RDMA executor 15a executes an RDMA transfer to the storage region 41 specified in an instruction from an RDMA controller 13c of a manager 13 to be described below in response to the instruction.
The PC 6 is an access PC that accesses the storage region 41 in the server 1, and is an example of an access device being different from the server 1. As illustrated in
The processing unit 7 executes various processes in the PC 6 and is an example of the function achieved by the processor 2 illustrated in
The communicator 9 is an example of the function achieved by the IF device 1d, e.g., the HCA, illustrated in
Before sending an RDMA transfer request, the allocation requester 91 transmits an acquisition request of a region 41a in the storage region 41, serving as the target of the RDMA transfer, to the server 1. The acquisition request may include information about the size (denoted as “size”) of data to be transferred from the DRAM 8 to the storage region 41 through the RDMA transfer.
When receiving an acquisition response responsive to the acquisition request from the server 1, the allocation requester 91 outputs the information of the region 41a in the acquired storage region 41 included in the acquisition response to the transfer requester 92. The information of the region 41a may include a pointer and a size (referred to as “dptr” and “dsize”, respectively) of the acquired region 41a. The pointer (dptr) may be a leading memory pointer indicating a physical storage position of the region 41a.
The transfer requester 92 transmits an RDMA transfer request, including a pointer and a size (“ptr” and “size”) of a region 8a, and the pointer and the size (“dptr” and “dsize”) of the region 41a obtained by the allocation requester 91 to the server 1. The pointer (ptr) of the region 8a is an example of information indicating a physical address (or logical address) of the DRAM 8 of the transfer source (access source) of the RDMA transfer request. The pointer (dptr) of the region 41a is an example of information that specifies the position of the acquired region 41a in the storage region 41 of the transfer destination (access destination) of the RDMA transfer request.
When receiving a completion response (e.g., ACK) of the RDMA transfer from the server 1, the transfer requester 92 may transmit a completion response to the request source of the RDMA.
As illustrated in
The controller 11 controls access from the application 12 to the memory extended region 31 or to the storage region 41, and also controls access from the PC 6 to the storage region 41 via the manager 13.
Similar to the application 111 illustrated in
The manager 13 is a storage manager that manages the storage region 41 and controls a timing of executing an RDMA from the PC 6 to the server 1 so as to abate congestion of access processes by the controller 11.
As illustrated in
The empty region manager 13a manages the empty storing region in the storage region 41. For example, when receiving an acquisition request transmitted from the allocation requester 91 of the PC 6 before the RDMA transfer, the empty region manager 13a reserves the region 41a in the storage region 41 serving as the target of the RDMA transfer on the basis of the acquisition request. An example of the region 41a may be an empty region. The empty region may be, for example, an unused (unallocated) region that has not been allocated with a logical address.
By way of example, upon receiving an acquisition request from the allocation requester 91 via the communicator 15 and the communication controller 14 as indicated by the arrow A in
For example, the empty region manager 13a registers the size of the region 8a included in the acquisition request and the pointer and the size (denoted as “dptr” and “dsize”, respectively) of a physical address of the acquired region 41a into the management information 13b.
As illustrated in
In cases where information (for example, an ID (Identifier) included in a command) that can specify the command of the RDMA transfer and that is transmitted from the PC 6 after acquisition request or the like is present, the information may be registered in management information 13b in place of “size”. In addition, in cases of “size”=“dsize”, the item of “dsize” may be omitted.
Also, information of “dptr” and “dsize” is transmitted to the allocation requester 91 after acquisition by the empty region manager 13a, and then a transfer request including the information of “dptr” and “dsize” is transmitted from the transfer requester 92 to the manager 13. Therefore, the manager 13 does not have to retain the information of “dptr” and “dsize” after the acquisition. In other words, the configuration of the management information 13b may be omitted from the server 1.
The empty region manager 13a, as indicated by the arrow B in
As described above, the empty region manager 13a is an example of an acquisition unit that, upon receipt of the acquisition request for the storage region 41 from the PC 6, reserves a region 41a having a size assigned by the acquisition request in the storage region 41, and transmits the acquisition response including information that specifies a position of the reserved region 41a to the PC 6.
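The reservation performed by the empty region manager 13a can be sketched, under stated assumptions, as follows. The class `EmptyRegionManager` and its simple bump-pointer allocation are hypothetical and are used only for illustration; the description leaves the actual method of finding an empty region open (for example, a function of the OS may be used). The dictionary keys mirror the items “size”, “dptr”, and “dsize” of the management information 13b.

```python
from dataclasses import dataclass, field


@dataclass
class EmptyRegionManager:
    capacity: int        # size of the storage region in bytes
    next_free: int = 0   # hypothetical bump pointer to the first unallocated byte
    management_info: list = field(default_factory=list)

    def acquire(self, size):
        """Reserve `size` bytes and return an acquisition response, or None if no room is left."""
        if size <= 0 or self.next_free + size > self.capacity:
            return None
        dptr, dsize = self.next_free, size   # leading pointer and size of the reserved region
        self.next_free += size
        # Register the reservation, as in the management information.
        self.management_info.append({"size": size, "dptr": dptr, "dsize": dsize})
        # The acquisition response carries the position of the reserved region.
        return {"dptr": dptr, "dsize": dsize}


manager = EmptyRegionManager(capacity=1 << 20)
print(manager.acquire(4096))
```

In this sketch “dsize” always equals “size”, which corresponds to the case noted above in which the item “dsize” may be omitted.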
The RDMA controller 13c controls, based on the access state of a memory access by the controller 11, the timing of executing the RDMA transfer based on the RDMA transfer request (transfer request) received from the PC 6.
For example, when receiving the RDMA transfer request (transfer request) from the transfer requester 92 of the PC 6 via the communicator 15 and the communication controller 14, the RDMA controller 13c inserts (stores) the received transfer request into the queue 13d as indicated by the arrow C in
In addition, the RDMA controller 13c obtains an access count of memory accesses made by the controller 11 during each regular time interval T by referring to the statistical information monitor 13e, and waits for a timing when accesses to the DRAM 3 and the PM 4 are few on the basis of the access counts. The access count may be, for example, the number of access requests processed by the controller 11, and is an example of the statistical information to be used to analyze the access state of memory accesses.
The statistical information monitor 13e monitors the number (access count) of memory accesses (see the dashed arrow in
The access counts serving as a result of monitoring may be reset (initialized) at regular time intervals T. Incidentally, T may be a time interval of about 1 millisecond, for example.
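A minimal sketch of a counter that is read and reset at each interval T is given below. The class `AccessCounter` and its method names are hypothetical; in the embodiment the counting is performed by the statistical information monitor 13e, whose concrete interface is not specified here.

```python
class AccessCounter:
    """Counts memory accesses between two successive reads."""

    def __init__(self):
        self._count = 0

    def record_access(self, n=1):
        # Called for each access request processed by the controller.
        self._count += n

    def read_and_reset(self):
        # Called once per interval T: return the count and start a new interval.
        count, self._count = self._count, 0
        return count


counter = AccessCounter()
for _ in range(3):
    counter.record_access()
print(counter.read_and_reset())  # count of the interval that just ended
print(counter.read_and_reset())  # nothing recorded since the reset
```

Resetting on every read keeps each value an independent per-interval count, so that successive values can be compared against a threshold without taking differences.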
As illustrated by the arrow D in
For example, when the timing of executing comes, the RDMA controller 13c may read the transfer request stored at the leading position in the queue 13d and notify the RDMA executor 15a in the communicator 15 of the contents of the transfer request via the communication controller 14. For example, the RDMA controller 13c may start an RDMA transfer process that the RDMA executor 15a performs by writing the contents of the transfer request in a non-illustrated memory in the RDMA executor 15a.
The transfer request stored at the leading position in the queue 13d is the transfer request that was stored earliest into the queue 13d, and is the transfer request of “request_1” in
As indicated by the arrow E in
For example, in cases where the request is an RDMA write request, the RDMA executor 15a reads data of the “size” from the region 8a in the DRAM 8 specified by the “ptr” via the communicator 9 of the PC 6 and the MC 2b. Then, the RDMA executor 15a writes the read data of “dsize” into the region 41a in the storage region 41 specified by the “dptr” via the communication controller 14 and the controller 11 (e.g., the MC 2b). Incidentally, such an RDMA transfer process by the RDMA executor 15a may be performed in a scheme defined in a standard, such as InfiniBand, with which the RDMA is compliant.
Further, as indicated by the arrow F in
Further, the RDMA controller 13c may remove the transfer request having been the target of the executing from the queue 13d after notifying the transfer request to the RDMA executor 15a or transmitting the completion response to the PC 6.
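The first-in first-out handling of transfer requests described above can be sketched as follows. The class `TransferQueue` and the two callbacks are hypothetical stand-ins for the queue 13d, the notification to the RDMA executor 15a, and the completion response to the PC 6; the sketch assumes that the request is removed after the completion response is sent, which is one of the two orders mentioned above.

```python
from collections import deque


class TransferQueue:
    def __init__(self, execute, notify_completion):
        self._queue = deque()
        self._execute = execute            # stand-in for notifying the RDMA executor
        self._notify = notify_completion   # stand-in for the completion response

    def insert(self, request):
        self._queue.append(request)        # newest request goes to the tail

    def dispatch_one(self):
        """Execute the oldest queued request when the timing of executing comes."""
        if not self._queue:
            return None
        request = self._queue[0]           # leading position = stored earliest
        self._execute(request)
        self._notify(request)
        self._queue.popleft()              # remove the request after completion
        return request

    def __len__(self):
        return len(self._queue)


log = []
queue = TransferQueue(lambda r: log.append(("exec", r)),
                      lambda r: log.append(("done", r)))
queue.insert("request_1")
queue.insert("request_2")
queue.dispatch_one()
print(log, len(queue))
```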
As described above, the RDMA controller 13c is an example of an access controller that controls, based on a state of one or more first accesses to the memory extended region 31 and the storage region 41 through the controller 11 made by the processing unit 10, a timing of executing a second access. The second access is an access to the storage region 41 through the controller 11 made by the communicator 15.
As described above, the server 1 of one embodiment, based on the state of the first accesses, controls the execution of the RDMA transfer at a timing when, for example, the access count is detected to be small. In other words, the RDMA controller 13c suspends an RDMA transfer request from the access PC 6 while the access count to the DRAM 3 and the PM 4 is large.
Thereby, the server 1 can perform control such that accesses from the APP A to the DRAM 3 and the PM 4 belonging to the same storage layer and the RDMA transfer from the PC 6 do not cause congestion. Accordingly, the server 1 can execute the RDMA transfer to the storage region 41 without impairing the memory accessing performance from the APP A by controlling the timing of executing the RDMA transfer on the basis of the access state to the memory.
Further, the PC 6 can recognize (detect) the completion of the RDMA transfer by notification from the RDMA controller 13c. Accordingly, the PC 6 no longer needs to issue inquiries to the server 1 as to whether the RDMA transfer is completed, or to allow an ample standby time from the transfer request to the completion, so that consumption of communication and process resources of the PC 6 can be reduced.
Hereinafter, description will now be made in relation to an exemplary method of detecting that the access count is small through monitoring the statistical information monitor 13e by the RDMA controller 13c, in other words, determining the timing of executing the RDMA transfer.
For example, when the state of the one or more first accesses made by the processing unit 10 satisfies a condition for suppressing access congestion in the controller 11, the access congestion being caused by execution of the one or more first accesses and the second access, the RDMA controller 13c may cause the communicator 15 to execute the second access. The conditions for suppressing access congestion may be ones below.
When the transfer request is stored in the queue 13d, the RDMA controller 13c reads the access count from the statistical information monitor 13e at regular time intervals T and resets the access count of the statistical information monitor 13e after the reading.
At regular time intervals T, the RDMA controller 13c stores M access counts in a memory, e.g., in the memory extended region 31, numbering the most recently collected access count as CNT[0] and the access counts collected one interval earlier and before as CNT[−1], . . . , CNT[−(M−1)]. Incidentally, M is, for example, an integer of about 2 to several tens, and is 10 as an example.
For example, at regular time intervals T, the RDMA controller 13c may add the most recently acquired access count to the memory extended region 31 as the latest CNT[0] by updating the access counts stored in the memory extended region 31 in the following order (i) to (iii).
(i) delete the oldest CNT[−(M−1)].
(ii) shift CNT[0] to CNT[−(M−2)] into CNT[−1] to CNT[−(M−1)], respectively.
(iii) store the acquired access count as CNT[0].
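The update order (i) to (iii) amounts to a fixed-length sliding window. As a minimal sketch, a bounded double-ended queue performs the three steps in one operation; the helper name `push_count` is hypothetical, and the index k of the window plays the role of CNT[−k].

```python
from collections import deque

M = 10                     # number of retained access counts (10 as an example)
window = deque(maxlen=M)   # window[0] is CNT[0], window[k] is CNT[-k]


def push_count(window, count):
    # appendleft on a full bounded deque drops the rightmost (oldest) entry,
    # shifts the remaining entries by one position, and stores the new count at index 0.
    window.appendleft(count)


for count in [120, 95, 30, 12]:
    push_count(window, count)
print(list(window))        # most recent count first
```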
Then, when the access counts are below a predetermined threshold E, the RDMA controller 13c detects that the access count is small, in other words, determines that the timing of executing the RDMA transfer has come. The threshold E is a criterion access count with which an access count can be determined to be small, and may be set on the basis of conditions such as the type of application 12, the tendency of access, the processing performance of the MC 2b, and the like.
The determination of whether the access counts are below the threshold E may be made, for example, in accordance with one or both of the following logics (a) and (b).
For example, the RDMA controller 13c extracts the immediately preceding M access counts from the memory extended region 31 and calculates the number K of access counts below the threshold E (for convenience, denoted as “threshold E1”) among the M access counts.
The RDMA controller 13c may store the result of determining whether or not the access counts are below the threshold E1 in association with the CNT[0] to CNT[−(M−1)] stored in the memory extended region 31. In this case, the RDMA controller 13c may determine, in the determination for each regular time interval T, whether or not the latest CNT[0] is below the threshold E1, and may read the result of determining whether or not CNT[−1] to CNT[−(M−1)] are below the threshold E1.
When the calculated number K is equal to or greater than a count threshold L, the RDMA controller 13c determines that the access count is small. The count threshold L may be determined, for example, according to the value of M, and may be set to a value equal to or less than M, as an example.
For example, at the time tx, the RDMA controller 13c determines that the number K of access counts below the threshold E1 among the latest M access counts in the T×M period (ty to tx) is equal to or greater than the threshold L. On the basis of this determination, the RDMA controller 13c reads the transfer request from the queue 13d and notifies the RDMA executor 15a of the contents of the transfer request.
Thus, in cases where the number K of access counts below the threshold E1 is determined to be equal to or greater than the threshold L, the RDMA controller 13c may detect that the access count is small, in other words, may determine that the timing of executing the RDMA transfer has come.
That is, the condition for suppressing the access congestion is satisfied when, among M access counts obtained by the RDMA controller 13c at regular time intervals T, L or more access counts are below the threshold E1.
Incidentally, the threshold L of the number may be the same value as M. In this case, the condition for determining that the access count is small is that the access count is below the threshold E (threshold E1) M (=L) consecutive times, in other words, that the access counts of the latest M (=L) times are all below the threshold E (threshold E1).
In this case, the condition for suppressing the access congestion is satisfied when M access counts obtained by the RDMA controller 13c at regular time intervals T are all below the threshold.
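The first logic can be sketched as follows. The function name `first_logic` and the sample values are hypothetical; the parameter names follow the symbols E1, L, and K used above, and “below” is taken to mean strictly less than the threshold.

```python
def first_logic(counts, e1, l):
    """Return True when at least l of the given access counts are below e1.

    counts: the latest M access counts (CNT[0] to CNT[-(M-1)]).
    """
    k = sum(1 for c in counts if c < e1)   # number K of counts below the threshold E1
    return k >= l


counts = [3, 9, 2, 1]                      # hypothetical window with M = 4
print(first_logic(counts, e1=5, l=3))      # three of four counts are below 5
print(first_logic(counts, e1=5, l=4))      # L = M requires every count to be below 5
```

Setting L smaller than M tolerates occasional bursts within the window, whereas setting L equal to M requires M consecutive quiet intervals.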
For example, the RDMA controller 13c extracts the latest M access counts from the memory extended region 31 and calculates the average value A of the M access counts.
Then, the RDMA controller 13c determines that the access count is small in cases where the calculated average value A is below the threshold E (for convenience, referred to as the “threshold E2”). The threshold E (threshold E1) used in the first logic and the threshold E (threshold E2) used in the second logic may be the same value, or may be different values.
For example, at the time tx, the RDMA controller 13c determines that the average value A of the latest M access counts in the T×M period (ty to tx) is below the threshold E2. On the basis of this determination, the RDMA controller 13c reads the transfer request from the queue 13d and notifies the RDMA executor 15a of the content of the transfer request.
Thus, by detecting that the average value A is below the threshold E2, the RDMA controller 13c may determine that the access count is small, in other words, may determine that the timing of executing the RDMA transfer has come.
That is, the condition for suppressing the access congestion is satisfied when the average access count A of M access counts obtained by the RDMA controller 13c at regular time intervals T is below the threshold.
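The second logic can be sketched in the same manner. The function name `second_logic` and the sample values are hypothetical; treating an empty window as not satisfying the condition is an assumption of this sketch.

```python
def second_logic(counts, e2):
    """Return True when the average A of the given access counts is below e2."""
    if not counts:
        return False                   # assumption: no data means the condition is not met
    average = sum(counts) / len(counts)
    return average < e2


print(second_logic([2, 4, 6], e2=5))     # average is 4
print(second_logic([0, 0, 0, 40], e2=5)) # a single burst raises the average to 10
```

Unlike the count-based logic, the average reflects the magnitude of each count, so one large burst in the window can delay the RDMA transfer even when most intervals are quiet.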
As described above, the RDMA controller 13c may detect that the access count is small when either of the above conditions (a) and (b) is satisfied, or may detect that the access count is small when the above conditions (a) and (b) are both satisfied.
Next, with reference to
First, referring to
As illustrated in
In the server 1, the manager 13 receives the acquisition request through the communicator 15 and the communication controller 14. The empty region manager 13a of the manager 13 acquires the leading memory pointer (dptr) in the empty region 41a having a size (dsize) from the storage region 41 by using, for example, the function of the OS (Process P2).
The empty region manager 13a transmits an acquisition response including at least the acquired leading memory pointer to the communicator 9 of the PC 6 via the communication controller 14 and the communicator 15 (Process P3). The acquisition response may include the size (dsize) of the acquired region 41a.
The allocation requester 91 of the communicator 9 receives the acquisition response, acquires the leading memory pointer (dptr) and the size (dsize) (Process P4), and then the process ends. Incidentally, in cases where the size (dsize) is not included in the acquisition response, the allocation requester 91 may regard the size (size) included in the acquisition request as the dsize.
The allocation requester 91 may notify the transfer requester 92 of the acquired dptr and dsize.
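The handling of the acquisition response in Process P4 can be sketched as follows. This is a minimal illustration, assuming the response is represented as a simple record; the class `AcquisitionResponse` and the function `resolve_region` are hypothetical names, and the actual message format is not specified here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class AcquisitionResponse:
    dptr: int                    # leading memory pointer of the acquired region
    dsize: Optional[int] = None  # size of the acquired region, possibly omitted


def resolve_region(requested_size: int, response: AcquisitionResponse) -> Tuple[int, int]:
    """Return (dptr, dsize), falling back to the requested size if dsize is absent."""
    if response.dsize is not None:
        return response.dptr, response.dsize
    return response.dptr, requested_size


# The response omits dsize, so the size from the acquisition request is used.
print(resolve_region(4096, AcquisitionResponse(dptr=0x1000)))
```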
Next, description will now be made in relation to an example of an operation of the control related to the RDMA transfer by the server 1 and the PC 6 with reference to
As illustrated in
In the server 1, the manager 13 receives the RDMA transfer request through the communicator 15 and the communication controller 14. The RDMA controller 13c of the manager 13 inserts the received RDMA transfer request into the queue 13d (Process P12).
The RDMA controller 13c reads access counts of the statistical information monitor 13e at regular time intervals T, and stores M access counts into the memory extended region 31. Then, the RDMA controller 13c suspends the RDMA transfer request stored in the queue 13d until one or both of the first logic and the second logic described above are satisfied on the basis of the M access counts (Process P13).
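The monitoring in Process P13 can be sketched as a sliding window over the latest M access counts. This is an illustration only: the class `AccessWindow`, its parameters, and the choice to withhold a decision until M samples have been stored are assumptions of this sketch, not details of the embodiment.

```python
from collections import deque


class AccessWindow:
    """Keeps the latest M access counts, one per sampling interval T."""

    def __init__(self, m, e1, l, e2, require_both=False):
        self.counts = deque(maxlen=m)  # older samples are dropped automatically
        self.e1 = e1
        self.l = l
        self.e2 = e2
        self.require_both = require_both

    def record(self, count):
        """Store one access count read from the statistical information."""
        self.counts.append(count)

    def transfer_allowed(self):
        """Return True when the suspended transfer request may be released."""
        if len(self.counts) < self.counts.maxlen:
            return False  # assumption: wait until M samples are available
        first = sum(1 for c in self.counts if c < self.e1) >= self.l
        second = sum(self.counts) / len(self.counts) < self.e2
        if self.require_both:
            return first and second
        return first or second


window = AccessWindow(m=3, e1=5, l=3, e2=5.0)
for count in [40, 2, 3, 1]:  # hypothetical samples, one per interval T
    window.record(count)
    print(count, window.transfer_allowed())
```

In this run, the request stays suspended while the count of 40 remains within the window, and is released only once the latest three counts are all small.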
In the example of
When the access process in Processes P22 and P24 or the like is completed and one or both of the first logic and the second logic are satisfied, in other words, when it is detected that the count of memory accesses made by the controller 11 is small, the process proceeds to Process P14.
In Process P14, the RDMA controller 13c starts an RDMA transfer process by reading the leading RDMA transfer request from the queue 13d and notifying the communicator 15 of the contents of the read RDMA transfer request via the communication controller 14.
When being notified of the contents of the RDMA transfer request, the RDMA executor 15a of the communicator 15 executes the RDMA transfer process of data having a size (dsize) between the ptr of the PM 4 and the dptr of the server 1 on the basis of the contents of the request (Process P15).
In the RDMA transfer process, for example, data transfer (Process P16) from (or to) the region 8a in the DRAM 8 of the PC 6 may be performed. The operation subject of the RDMA transfer process may be the RDMA executor 15a of the communicator 15, or may alternatively be the communicator 9 of the PC 6 (e.g., an RDMA controller) or both of the communicators 15 and 9.
The controller 11 performs the access process to the region 41a in the storage region 41 on the basis of the memory access caused in the RDMA transfer process (Process P17).
Upon detecting the completion of the RDMA transfer process, the RDMA controller 13c transmits a completion response to the transfer requester 92 in response to the RDMA transfer request (Process P18).
The transfer requester 92 receives the completion response from the RDMA controller 13c (Process P19), and the process ends.
In cases where multiple RDMA transfer requests are stored in the queue 13d, the RDMA controller 13c may repeat Process P13 to Process P18 after Process P18 until no RDMA transfer request is left in the queue 13d.
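The repetition of Process P13 to Process P18 can be sketched as a loop that releases one queued request each time the access count is detected to be small. This is a simplified illustration: the function `drain_queue`, the callables `is_quiet` and `execute`, and the `max_polls` bound are hypothetical, and an actual controller would wait for the interval T between checks rather than polling immediately.

```python
from collections import deque


def drain_queue(queue, is_quiet, execute, max_polls=1000):
    """Execute queued transfer requests in order, one per quiet period."""
    completed = []
    polls = 0
    while queue and polls < max_polls:
        polls += 1
        if not is_quiet():
            continue  # request stays suspended; a real controller would wait T
        request = queue.popleft()   # read the leading request
        execute(request)            # perform the transfer
        completed.append(request)   # a completion response would be sent here
    return completed


# Hypothetical run: memory is busy on the 1st and 3rd checks.
quiet_checks = iter([False, True, False, True])
executed = []
pending = deque(["request-1", "request-2"])
done = drain_queue(pending, lambda: next(quiet_checks), executed.append)
print(done, len(pending))
```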
The technique according to one embodiment described above can be changed or modified as follows.
For example, in the server 1 illustrated in
In one aspect, it is possible to suppress lowering of the processing performance of an information processing apparatus including a processor having a memory region and a memory controller that controls an access to the storage region.
All examples and conditional language recited herein are intended for the pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
2019-210125 | Nov 2019 | JP | national |