A memory system is a system comprising a plurality of memory modules. Each memory module comprises a non-volatile memory unit and a controller connected to at least one port. The controller may access information stored in the non-volatile memory unit in response to a request from a processor, to deliver accessed information to the processor via the at least one port. Moreover, the controller communicates with controllers of other memory modules via the at least one port. The controllers may be connected so as to form a peer-to-peer network.
Some computing systems use memory systems referred to as “memory fabrics” that treat memory as if it were a routable resource (treating memory addresses somewhat in the way that IP networks handle IP addresses). Such memory fabrics may comprise many different memory elements that are logically interconnected and that may provide non-volatile storage of data. The interconnected memory elements may be logically “stacked” and exposed (e.g. to processors/computing modules) as if they were a single memory resource with a uniform memory space through an entry point acting similarly to a gateway in computer networks.
When a processor requests (reading or writing) access to a memory address in a memory fabric the memory addressing request may be directed via various modules of the memory fabric before arriving at the appropriate destination address.
Certain memory fabrics may implement protocols that are based on the assumption that there is only a single path to transmit a request from one particular point in the fabric to another.
A challenge arises in memory fabrics that use protocols which assume the existence of only a single path for transmission of a request from one point to another in a memory fabric. This may be a viable assumption in the case of a small, static and/or carefully designed fabric. However, there are circumstances in which there is no one exclusive path, for example when the size of the considered memory fabric scales up.
The situation may arise in a memory fabric, especially one involving large numbers of components, that there are plural paths for transmission of a request from one point to another in a memory fabric. This situation may lead to the occurrence of loops, where a memory addressing request is forwarded (directly or indirectly) from a first point to a second point in the memory fabric, based on a first one of multiple possible paths to a destination memory address, and then is returned from the second location to the first location based on a second one of the possible paths to the destination memory address.
A request caught in such a loop keeps on being redirected. Thus, if a request to access a memory address is caught in a loop, this address will not be reached. A loop causes unpredictable behavior of the memory system and delays in addressing and fetching memory data.
Moreover, another challenge arises in memory fabrics that use protocols which assume the existence of only a single path for transmission for a request from one point to another in a memory fabric because, in the event that a memory module or memory component is removed or there is a failure or performance problem within the fabric, it may be impossible (or undesirable) to direct requests according to the single designated path.
The memory system 100 according to the present example may be a memory fabric. The memory system 100 comprises a plurality of memory modules 110, each memory module 110 comprising a non-volatile memory unit and a controller. Each controller is to access information stored in the non-volatile memory unit in response to a request from a processor to deliver accessed information to the processor, and to communicate with controllers of other memory modules. The controllers are connected so as to form a peer-to-peer network.
In the memory system 100, memory modules 110 are distributed between memory components 120.
Modules 110 in a memory component 120 may have the role of “router”, “switch” or “plain memory module” and thus may comprise a router controller, a switch controller, or a plain memory controller, respectively.
More precisely, in this example, each memory component 120 of the memory system 100 comprises at least one memory module having a router controller (and which, thus, may be called a “router” 130 or a “router memory module” 130). A router controller is a controller which can route a message or a request to another router controller of the memory system, or to other(s) controller(s) in the same memory component as the router controller. A router controller is also responsible for all the routing-related tasks.
Each memory component 120 may also comprise a memory module having a switch controller (and which, thus, may be called a “switch” 140 or a “switch memory module” 140). A switch controller is a controller connected to the router controller and, if there is one, at least one plain memory controller in the same memory component. When a switch controller receives a message or a request from the router controller of the same memory component, it forwards the message or request to the plain memory controller to which the message or request is addressed.
Each memory component 120 may also comprise a memory module having a plain memory controller (and which, thus, may be called a “plain memory” 150 or a “plain memory module” 150). A plain memory controller is a controller which can forward a message or request received to another plain memory controller of the same memory component 120, which is its successor.
Management utilization and access characteristics of the memory system 100 may change dynamically, so that the memory system 100 may automatically react if a memory component 120 (or a router memory module 130) is added to or removed from the memory system 100.
As illustrated in
It is thus possible to prevent loops in the memory system 100, as this method enables finding out how the router memory modules 130 are organized in the memory system 100, and creating a logical tree, which is then used to route messages and requests.
The memory system 200 comprises four router memory modules 211, 212, 213, 214. In a variant, the memory system 200 comprises a different number of router memory modules. Each router memory module 211-214 comprises at least one port 220-227 to connect the router memory module 211-214 to at least another router memory module 211-214. Moreover, each router memory module 211-214 comprises a controller 231-234. Each router 211-214 further comprises a non-volatile memory unit 241-244.
In an example, the memory system 200 can comprise other memory modules, having controllers being switch controllers or plain memory controllers (not shown in the figures). Thus, each router memory module 211-214 can be located in a memory component, which can comprise other memory modules.
The connection 251-254 between two ports 220-227 of two routers 211-214 can be a wired connection (for example an optical connection), or a wireless connection.
Examples of loops 261, 262, 263 between ports 220-227 are shown in
The loop 262 is between the port 221 of the router 211, the posts 222 and 223 of the router 212, the ports 226 and 225 of the router 213 and the port 220 of the router 211. The loop 263 is between the port 227 of the router 214, the ports 224 and 223 of the router 212, and the port 226 of the router 213.
Memory systems according to the present disclosure perform creation of logical trees of the memory system. In the memory systems 100, 200 according to the example of
The above-mentioned hardware, firmware or software may include one or both of a processor and a machine-readable storage medium storing programming instructions or code executable by the processor to perform operations of the method to create a logical tree. The above-mentioned hardware, firmware or software may include an application specific integrated circuit (ASIC) constructed to perform operations of the method to create a logical tree.
A method to create a logical tree of a memory system according to an example of the present disclosure will now be described with reference to
The memory system corresponds for instance to the one represented on
A root router memory module of the memory system 100, 200 is determined (S300).
Then, for each router memory module 110, 211-214 apart from the root router memory module, a loop-free path from the router memory module 110, 211-214 to the root router memory module is identified (S310).
A logical tree using the loop-free paths determined is then created (S320).
It is thus possible to prevent loops in the memory system 100, 200, as this method enables finding out how the router memory modules 110, 211-214 are organized in the memory system 100, 200, and creating a logical tree, which is then used to route messages and requests.
The memory system corresponds for instance to the one represented on
The root router memory module RMM of the memory system is determined (S400). Different approaches may be used to decide which router memory module should be the RMM.
In an example, each router memory module sends (S500) a message to every other router memory module of the memory system, the message comprising an identification number and a priority value of the router memory module (see
The identification number uniquely identities this router memory module in the memory system. Furthermore, the priority value may be a configurable value.
If at least one router memory module has its priority value adjusted, the root router memory module RMM may be determined using the priority values of all the router memory modules (S510). For example, the router memory module having the lowest priority value may be determined, by all router memory modules, as the root router memory module. In a variant, the router memory module having the highest priority value may be determined, by all router memory modules, as the root router memory module.
If none of the router memory modules has its priority value adjusted, the root router memory module RMM may be determined using the identification numbers of all the router memory modules (S520). For example, the router memory module having the lowest identification number may be determined, by all router memory modules, as the root router memory module. In a variant, the router memory module having the highest identification number may be determined, by all router memory modules, as the root router memory module.
The root router memory module RMM is the router memory module which acts as the manager module for the whole logical tree of the memory system.
The messages sent may be Discovery Broadcast Messages (“DBM”), which are also sent out to determine if a connection between two memory components is still active or if a memory component is still in tie memory system. These Discovery Broadcast Messages may be sent periodically (for instance every two seconds). Each Discovery Broadcast Message may comprise information about the memory module that sent this Each Discovery Broadcast Message, and a Component Identifier (“CID”) that uniquely identities this memory module in the memory system. The Component identifier may comprise a configurable priority value and an identification number.
In a variant, each router memory module sends a message to an out band management subnet. The out band management subnet can be a designated memory module (e.g. with a default address or the first one that joins the memory system), or an external centralised server. Then, the out band management subnet can determine the root router memory module, from priority values or identification numbers, and inform all the router memory modules. The out band management subnet can also acts simply as a proxy to forward the information to the router memory modules.
In the example of
Then, each router memory module apart from the root router memory module identifies a loop-free path from the router memory module to the root router memory module (S410). Different approaches may be used for determining the loop-free paths.
In an example (see
Thus, if the speed of transmission is the same for each connection 251-254 between two ports, the chosen loop-free path will be the path with the fewest number of connections 251-254.
For example, in the memory system 200 of
The messages sent may be Unicast Memory Addressing Messages.
In an example (see
The rated speed of transmission depends of the technology of the connection (optical connection, wireless connection).
In an example, to identify a loop-free path, each router memory module calculates the shortest path.
A logical tree using the loop-free paths determined is created (S420). The logical tree is a map showing, for each router memory module, the chosen loop-free path from the router memory module to the root router memory module. The logical tree can be a branching structure being the sum of all the loop-free paths from each router to the root router. This map is used by the memory system to send messages or requests in normal case, namely when no router memory module is added or removed, and when no router memory module has a failure.
The logical tree is stored in at least one non-volatile location in the memory system. This non-volatile location may be considered to be a reference location and is readily accessible by all the router memory modules. The location of the reference location(s) within the memory fabric may be predetermined and, thus, the various modules in the memory fabric know where to find the stored loop-free configuration information For example, the non-volatile location is a pre-defined memory module address such as the first address within the range of addresses assigned to the memory system. In a variant, the non-volatile location is the first address of the memory module having the lowest CID in the initial memory system.
It is thus possible to prevent loops in the memory system, as this method enables finding out how the router memory modules are organized in the memory system, and creating a logical tree, which is then used to route messages and requests. A chosen loop-free path between a router memory module and the root router memory module will be used to send messages or requests between this router memory module and the root router memory module. This path being loop-free, the chance for a message or a request to be caught in a loop is thus reduced.
Then, each router memory module determines the port of the router memory module on which the loop-free path arrives as the root port RP of the router memory module (S430). In the example of
Furthermore, for each chosen loop-free path of a router memory module, if the path is going through at least one port of another router memory module, this port is determined as a designated port DP (S440). Thus, each port being on a loop-free path and being not a root port or a port of the root router memory module is determined as being a designated port DP.
In the example of
Then, if a port is not on at least one chosen loop-free path, this port BP is put into a blocking state (S450). A port that is not a port of the root router memory module, a root port or a designated port is thus put into the blocking state. Such a port BP is thus never used to transmit or forward a message or a request (although requests may be received via this port). The blocking of ports which are not on the chosen loop-free paths allows avoidance of a case where unintended loops are created by a bug in the addressing of a message or a request.
In the example of
The logical tree can be completed by adding the root ports RP, the designated ports DP and the blocked ports BP.
Steps S400 to S450 can then be re-run, based on various failure scenarios (e.g. considering that a port of a router memory module or a router memory module of the memory system is inactive or down, i.e. not properly operational (S460). In an example, these steps are re-run for all the possible failure scenarios (combinations of malfunctioning router memory module or malfunctioning port). When considering a failure scenario in which a router memory module different from the root router memory module is inactive or down, step S400 does not need to be re-run.
The number of times the process is re-run depends on the nature of the topology within the memory system and possible loop-free alternative paths. In an example, these steps are re-run for router memory modules which are more likely to fail, and combinations of these router memory modules.
A plurality of logical trees are thus stored, including a logical tree showing the chosen loop-free paths in normal cases, and other logical trees showing the chosen loop-free paths in cases wherein one or a plurality of router memory modules/ports is inactive or down.
Thus, if a router memory module/port of the memory system is inactive or down, messages and requests can be immediately redirected according to an alternative loop-free path, because a logical tree made considering that this router memory module/port is inactive or down is stored in the memory system.
Accordingly, the method can be performed in an initialisation phase of the memory system, and does not need to be performed again during the use of the memory system. Furthermore, as the logical trees are stored in a non-volatile location, these logical trees are not lost if the memory system is turned off or goes down. Moreover, due to the massive capacity of the non-volatile location, as many logical trees as possible can be stored.
In the example of
In an example, steps S400-S460 are performed only if there is no existing logical tree of the memory system, in the non-volatile location. If there is such a logical tree, the memory module use this logical tree and no other logical trees are created.
When a new router memory module is added to the memory system, the method to create logical tree is performed again. To prevent a complete recalculation, an incremental approach can be used, assuming fie root is still the same root. In many cases, the entire logical trees need not be recomputed because most of the trees remain unchanged. Only the affected part of the tree is thus recomputed. Recomputing only a portion of the trees rather than the entire trees results in faster convergence and saves resources. Steps S410, S430, S440, and S450 are thus performed again for the router memory modules involved by this adding, and the existing trees are modifying using the results of these steps.
More precisely, when a new router memory module is added (S900, see
These steps can then be re-run, based on various failure scenarios (e.g. considering that a port of a router memory module or a router memory module of the memory system is down, i.e. not property operational).
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2015/067683 | 7/31/2015 | WO | 00 |