This invention pertains to storage, and more particularly to storage coupled to processors.
For a long time, a major problem in computer architecture has been moving data to and from the central processing unit (CPU). Whether the data comes from a storage point (e.g., a hard drive or memory) or is input to the computer from some other source, getting the data to the CPU is slow: often, a CPU is idle because it is waiting for data. This problem is commonly known as the “Von Neumann bottleneck”.
Recognizing the problem of the Von Neumann bottleneck, computer manufacturers have attempted to design new architectures that avoid the problems of data flow. But the solutions that have been attempted are complex, expensive, and require special machine designs. A need remains for a way to address these and other problems associated with the prior art.
Embodiments of the present invention are illustrated by way of example, and not by way of limitation, in the drawings and in which like reference numerals refer to similar elements.
Computer 110 may include conventional internal components, such as central processing unit (CPU) 130, memory, etc. Computer 110 may include storage units 135 and 140, each of which is coupled to an associated processor 145 and 150 and storage controller 155 and 160. In this manner, each storage unit has its own storage controller 155 and 160 and processor 145 and 150: associated processors 145 and 150 may execute operations on the data stored in associated storage units 135 and 140. Storage units 135 and 140 may be units of memory, hard disk drives, or any other desired form of storage. Further, different storage units may be of different forms: there is no requirement that all storage units be of the same form.
While
As mentioned above, a “storage unit” is intended to refer to any desired portion of storage. For example, if a storage unit takes the form of a memory module, the memory module may be a Single In-Line Memory Module (SIMM), Dual In-Line Memory Module (DIMM), or any other desired form of memory module. In addition, a “storage unit” may be a portion of a storage module: that is, a single storage module may include multiple storage units. (As each storage unit may have its own associated storage controller and processor, this means that a single storage module may have multiple storage controllers and processors.) If a storage unit includes memory, a storage unit may be made from any desired type of memory, such as Phase Change Memory (PCM), flash memory, Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), or any other desired type of memory. A person of ordinary skill in the art will recognize that a “storage unit” may consist of volatile memory, non-volatile memory, or a combination thereof. A person of ordinary skill in the art will similarly recognize how other forms of storage (e.g., hard disk drives or solid state drives, among other possibilities) may consist of one or more storage units.
Although not shown in
With the addition of processors 145 and 150 coupled to storage units 135 and 140, respectively, the role of CPU 130 changes. Instead of being responsible for executing operations (e.g., for the operating system, file system, and/or user applications, among other possibilities), CPU 130 becomes more managerial. Specifically, CPU 130 becomes responsible for determining which of processors 145 and 150 is to execute specific operations (based, in turn, on what data is affected by those operations). Of course, CPU 130 may continue to execute operations itself as well, whether or not those operations are applied to particular data objects.
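The managerial role described above may be sketched as follows. This is a minimal illustration only, not the claimed implementation: the names `StorageUnit`, `Cpu`, and `dispatch`, the object names, and the in-memory dictionaries standing in for storage are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class StorageUnit:
    """Hypothetical stand-in for a storage unit with its own processor."""
    name: str
    objects: Dict[str, List[int]] = field(default_factory=dict)

    def execute(self, object_id: str,
                operation: Callable[[List[int]], int]) -> int:
        # The associated processor runs the operation next to the data.
        return operation(self.objects[object_id])


class Cpu:
    """Hypothetical managerial CPU: it routes work instead of doing it."""

    def __init__(self, units: List[StorageUnit]) -> None:
        self.units = units

    def dispatch(self, object_id: str,
                 operation: Callable[[List[int]], int]) -> int:
        # Choose the processor based on which storage unit holds the data.
        for unit in self.units:
            if object_id in unit.objects:
                return unit.execute(object_id, operation)
        raise KeyError(f"no storage unit holds {object_id!r}")


# Two storage units, each holding a different (made-up) object.
unit_135 = StorageUnit("unit-135", {"sales": [3, 5, 8]})
unit_140 = StorageUnit("unit-140", {"costs": [2, 2, 4]})
cpu = Cpu([unit_135, unit_140])

total_sales = cpu.dispatch("sales", sum)  # executed by unit-135's processor
max_cost = cpu.dispatch("costs", max)     # executed by unit-140's processor
```

In this sketch the CPU never touches the data itself; it only decides where an operation runs, which is the sense in which its role becomes managerial.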
In embodiments of the invention, storage controllers 155 and 160 are responsible for controlling the data in storage units 135 and 140. Storage controllers 155 and 160 may store information about objects stored in storage units 135 and 140. This information, called a mapping, identifies particular objects and the blocks that store those objects.
Processors 145 and 150 may be any variety of processor. Processors 145 and 150 may be single-core or multi-core processors. Processors 145 and 150 may be processors with complex operation sets, or they may be capable of little more than performing read, write, and basic arithmetic operations (for example, an arithmetic logic unit (ALU)). Processors 145 and 150 may also be capable of executing a Java® execution environment, among other possibilities, in which case processors 145 and 150 may be provided software implemented using the Java programming language. (Oracle and Java are registered trademarks of Oracle and/or its affiliates.) Finally, different processors coupled to different storage units may have different capabilities, as desired.
An ideal ratio of storage units 135 and 140 (and associated processors 145 and 150) to the overall amount of storage would have each of processors 145 and 150, along with CPU 130, operating without any individual processor or storage unit becoming a bottleneck. But since the burdens that may be placed on processors 145 and 150 and CPU 130 depend on the design and usage of the machine, there is no one “ideal” ratio: different designs and usage models will generally have different “optimal” solutions.
CPU 130 also communicates with DRAM 225 and PCM 230. As discussed above, DRAM 225 and PCM 230 may be memory with coupled processors.
While
CPU 130 may instruct the various logics 235-260 to write data received from CPU 130, read data into CPU 130, or execute operations. To that end, CPU 130 may provide program operations to logics 235-260 to be executed.
More generally, any storage (volatile or non-volatile) may include logic to support performing operations on objects in the storage. In
Note that until storage controller 155 receives mapping 405, storage controller 155 does not know what blocks make up specific objects in the storage, or how to interpret those blocks. Mapping 405 provides this information. Mapping 405 may include object identifier 425, which identifies the object in question. Object identifier 425 may also include additional metadata about the object, which may include, among other data, how the file is to be interpreted (e.g., that the file is a video clip, a text document, a static image, an audio clip, etc.). Mapping 405 may also specify that blocks 430, 435, and 440 make up the object. (The number of blocks that make up the object is not limited to three: any number of blocks may be used.) As it is possible for an object to be fragmented across storage, blocks 430, 435, and 440 might not form a contiguous portion of storage: that is, blocks 430, 435, and 440 might be scattered across the storage, and in no particular order.
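One possible representation of such a mapping is sketched below. The sketch is illustrative only and rests on assumptions not found in the description: the field names `object_id`, `content_type`, and `blocks`, the block numbers, and the dictionary used to model block storage are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Mapping:
    """Hypothetical form of a mapping: identifier, metadata, block list."""
    object_id: str      # identifies the object in question
    content_type: str   # metadata: how the object is to be interpreted
    blocks: List[int]   # block numbers, listed in object order


def read_object(storage: Dict[int, bytes], mapping: Mapping) -> bytes:
    """Reassemble an object by joining its blocks in mapping order."""
    return b"".join(storage[number] for number in mapping.blocks)


# The object's blocks are scattered and out of order in storage;
# block 4 belongs to some other object.
storage = {7: b"wor", 2: b"hello ", 19: b"ld", 4: b"unrelated"}
mapping = Mapping("greeting", "text/plain", [2, 7, 19])

data = read_object(storage, mapping)
```

Without the mapping, nothing in `storage` indicates that blocks 2, 7, and 19 belong together or in what order; with it, a fragmented object may be read as a single unit.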
A difference between embodiments of the invention and other solutions to the problem of the Von Neumann bottleneck is that storage controller 155 uses information about the data stored in the storage (i.e., mapping 405), but does not control that information. In the prior art, storage controller 155 either did not have access to mapping 405 (in which case operations were performed with less than complete information) or owned mapping 405. But having storage controller 155 own mapping 405 (that is, having storage controller 155 be responsible for allocating blocks of storage and knowing what data is stored in what blocks), a model known as the object-based storage model, has drawbacks. Higher level software may make use of mapping 405 as well, and often may map data in a manner that is more useful to the higher level software.
But by having storage controller 155 receive mapping 405 from other sources, embodiments of the invention get the benefit of both alternatives. The storage units continue to use a block-based storage model, so storage manufacturers may use whatever scheme they desire to manage storage, but storage controller 155 may use the information about how the data is organized in storage to best execute operations on that data.
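The division of responsibility described above may be sketched as follows: the storage remains block-based, the mapping is supplied from outside, and the controller merely uses it. The class `StorageController` and its methods `receive_mapping` and `execute` are hypothetical names chosen for the example, not an interface defined by the description.

```python
from typing import Callable, Dict, Iterable, List


class StorageController:
    """Hypothetical controller that uses, but does not own, mappings."""

    def __init__(self, blocks: Dict[int, bytes]) -> None:
        self._blocks = blocks                      # block-based storage
        self._mappings: Dict[str, List[int]] = {}  # supplied by other sources

    def receive_mapping(self, object_id: str,
                        block_numbers: Iterable[int]) -> None:
        # Store the mapping as given; block allocation happens elsewhere.
        self._mappings[object_id] = list(block_numbers)

    def execute(self, object_id: str,
                operation: Callable[[bytes], int]) -> int:
        # Without a mapping the controller cannot tell which blocks to use.
        if object_id not in self._mappings:
            raise LookupError(f"no mapping received for {object_id!r}")
        payload = b"".join(self._blocks[n] for n in self._mappings[object_id])
        return operation(payload)


controller = StorageController({0: b"to be ", 5: b"or not ", 3: b"to be"})

# Higher level software (e.g., a file system) provides the mapping.
controller.receive_mapping("quote", [0, 5, 3])


def count_words(payload: bytes) -> int:
    """Count whitespace-separated words in the payload."""
    return len(payload.split())


word_count = controller.execute("quote", count_words)
```

Because the controller only stores what it is given, the higher level software remains free to organize data however suits it, while the associated processor still sees whole objects rather than anonymous blocks.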
As discussed above, these additional operations may be any desired operations that may be applied to a data object. Processor 145 might only have basic arithmetic and logic operations in addition to read and write operations. Or processor 145 might be capable of executing software written using the Java programming language. In short, processor 145 may be any processor, whether manufactured now, in the past, or in the future. As such, the scope of the additional operations in processor 145 is essentially unlimited.
As discussed above with reference to
A MISD architecture may be achieved using embodiments of the invention, where the same data is copied into the memories associated with different associated processors. Each associated processor may then be given one of operations 610, 615, and 620 to execute on the copy of data stored in the associated storage, thereby producing a single result of the combination of that operation and that data.
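A MISD arrangement of this kind may be modeled, purely for illustration, as shown below. The three operations and the sample data are assumptions made for the example; ordinary Python callables stand in for operations 610, 615, and 620, and a plain list stands in for each associated storage.

```python
from typing import Callable, Dict, List

# The single data set to be copied to each associated storage.
data: List[int] = [4, 1, 3]

# Three different (hypothetical) operations, one per associated processor.
operations: Dict[str, Callable[[List[int]], object]] = {
    "total": sum,
    "largest": max,
    "ordered": sorted,
}

# Each processor receives its own copy of the same data.
copies: Dict[str, List[int]] = {name: list(data) for name in operations}

# Each processor applies its one operation to its own copy.
results = {name: op(copies[name]) for name, op in operations.items()}
```

Each entry of `results` is the single result of combining one operation with one copy of the data, and the original `data` is left untouched.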
In contrast, in
A SIMD architecture may be achieved using embodiments of the invention, where the same operation is sent to the processors associated with the different data. Each of the associated processors may then produce a single result of the combination of that data and that operation.
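A SIMD arrangement may be modeled in a similar, equally hypothetical way: one operation is sent to several processors, each holding different data. In the sketch below, a thread pool stands in for the independent associated processors; the function name `run_simd`, the unit names, and the data values are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Different data held in the storage of each (hypothetical) unit.
datasets: Dict[str, List[int]] = {
    "unit-a": [1, 2, 3],
    "unit-b": [10, 20],
    "unit-c": [5],
}


def run_simd(operation: Callable[[List[int]], int],
             data_by_unit: Dict[str, List[int]]) -> Dict[str, int]:
    """Send the same operation to every unit and collect one result each."""
    with ThreadPoolExecutor(max_workers=len(data_by_unit)) as pool:
        futures = {
            name: pool.submit(operation, data)
            for name, data in data_by_unit.items()
        }
        return {name: future.result() for name, future in futures.items()}


results = run_simd(sum, datasets)
```

Here `results` holds one value per unit, each produced by the same operation applied to that unit's own data.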
At block 710, this mapping is stored in the storage controller associated with the storage unit. At block 715, an operation to be executed on this object is received. At block 720, the processor associated with the storage unit executes the operation. As discussed above with reference to
The following discussion is intended to provide a brief, general description of a suitable machine in which certain aspects of the invention may be implemented. Typically, the machine includes a system bus to which are attached processors, memory (e.g., random access memory (RAM), read-only memory (ROM), or other state preserving medium), storage devices, a video interface, and input/output interface ports. The machine may be controlled, at least in part, by input from conventional input devices, such as keyboards, mice, etc., as well as by directives received from another machine, interaction with a virtual reality (VR) environment, biometric feedback, or other input signal. As used herein, the term “machine” is intended to broadly encompass a single machine, or a system of communicatively coupled machines or devices operating together. Exemplary machines include computing devices such as personal computers, workstations, servers, portable computers, handheld devices, telephones, tablets, etc., as well as transportation devices, such as private or public transportation, e.g., automobiles, trains, cabs, etc.
The machine may include embedded controllers, such as programmable or non-programmable logic devices or arrays, Application Specific Integrated Circuits, embedded computers, smart cards, and the like. The machine may utilize one or more connections to one or more remote machines, such as through a network interface, modem, or other communicative coupling. Machines may be interconnected by way of a physical and/or logical network, such as an intranet, the Internet, local area networks, wide area networks, etc. One skilled in the art will appreciate that network communication may utilize various wired and/or wireless short range or long range carriers and protocols, including radio frequency (RF), satellite, microwave, any of the Institute of Electrical and Electronics Engineers (IEEE) 802.11 standards, Bluetooth, optical, infrared, cable, laser, etc.
The invention may be described by reference to or in conjunction with associated data including functions, procedures, data structures, application programs, etc. which when accessed by a machine results in the machine performing tasks or defining abstract data types or low-level hardware contexts. Associated data may be stored in, for example, the volatile and/or non-volatile memory, e.g., RAM, ROM, etc., or in other storage devices and their associated storage media, including hard-drives, floppy-disks, optical storage, tapes, flash memory, memory sticks, digital video disks, biological storage, etc.: such associated data, by virtue of being stored on a storage medium, does not include propagated signals. Associated data may be delivered over transmission environments, including the physical and/or logical network, in the form of packets, serial data, parallel data, propagated signals, etc., and may be used in a compressed or encrypted format. Associated data may be used in a distributed environment, and stored locally and/or remotely for machine access.
Having described and illustrated the principles of the invention with reference to illustrated embodiments, it will be recognized that the illustrated embodiments may be modified in arrangement and detail without departing from such principles. And, though the foregoing discussion has focused on particular embodiments, other configurations are contemplated. In particular, even though expressions such as “in one embodiment” or the like are used herein, these phrases are meant to generally reference embodiment possibilities, and are not intended to limit the invention to particular embodiment configurations. As used herein, these terms may reference the same or different embodiments that are combinable into other embodiments.
Consequently, in view of the wide variety of permutations to the embodiments described herein, this detailed description and accompanying material is intended to be illustrative only, and should not be taken as limiting the scope of the invention. What is claimed as the invention, therefore, is all such modifications as may come within the scope and spirit of the following claims and equivalents thereto.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US2011/066893 | 12/22/2011 | WO | 00 | 6/27/2013 |