Embodiments of the invention relate generally to storage architectures in a computer system. More particularly, embodiments of the invention relate to a method and apparatus for improving computer system performance by isolating system and user data for storage in separate memories.
As computer processing increases, the performance of typical computer storage architectures is degraded. The term “performance” herein refers to the attributes of a computer system including power consumption of a computer storage architecture, boot-up and wake-up times of the computer storage architecture, user data bandwidth on the bus coupling the secondary memory to the host processor, cost of the computer storage architecture, etc. Degraded or lower performance of typical computer storage architecture is caused, among several reasons, by randomly bundled system and user data which are stored in large secondary memories (e.g., hard drives). Such bundled system and user data means that the microprocessor of the computer storage architecture spends significant processing time to locate and separate system data from the randomly bundled system and user data, further increasing power consumption of the typical computer storage architecture.
The typical computer system 100 also includes a chipset processor 108, coupled to the microprocessor 107, to communicate with an external memory 102 via a Serial Advanced Technology Attachment (SATA) input-output (I/O) interface or a Serial Attached SCSI (SAS) I/O bus interface 103. The SATA or SAS I/O bus interface 103 interfaces with the host processor 101 via a host bus adaptor 109. The typical computer system 100 further includes the external memory 102 (external to the host processor 101), which is also known as the secondary memory. The external memory 102 is generally an electro-mechanical hard drive. Memory regions that are randomly spread across the secondary memory 102 contain system data 104. The remaining space of the secondary memory 102 is the memory region containing user data 105.
In a typical computer system, most data traffic between the host processor 101 and the secondary memory 102 consists of read and write operations on file system content, which is part of the system data 104. These frequent accesses (read and write operations) of the file system content include page file swapping, which results in power consumption by the secondary memory 102. The frequent accesses of the file system content also result in excessive data traffic on the SATA or SAS I/O bus interface 103 and the host bus adaptor 109. The excessive data traffic arises because the chipset processor 108 must distinguish the randomly placed system data 104 in the secondary memory 102 from the user data 105 before the chipset processor 108 can process the system data 104 (e.g., page file data). Such frequent access of the secondary memory 102 lowers the performance of the computer system storage architecture. If the secondary memory 102 is a solid state drive (SSD), such frequent access of the secondary memory 102 reduces the performance of the computer system storage architecture, increases the power consumption of the computer system storage architecture, and reduces the life span of the SSD.
Embodiments of the invention will be understood more fully from the detailed description given below and from the accompanying drawings of various embodiments of the invention, which, however, should not be taken to limit the invention to the specific embodiments, but are for explanation and understanding only.
Embodiments of the invention relate to a method and apparatus for improving computer system performance by isolating system and user data from host data for storage in separate memories.
In one embodiment, the system data (e.g., page files, page tables, etc.) are stored in a first memory which is physically closer to a microprocessor and/or a chipset processor of a host processor. In one embodiment, the host processor comprises a first memory. In one embodiment, the system data, which is typically smaller than the user data, is separated from the host data and stored in the first memory, while the user data is stored in a second memory, e.g., a hard drive. By separating the system data and the user data from the host data and placing the system data closer to the processors that process the system data on a frequent basis, overall performance of the computer storage architecture is improved. In one embodiment, the performance of the computer storage system described herein improves by 16% compared to the typical computer storage system 100 of
In the following description, numerous details are discussed to provide a more thorough explanation of embodiments of the present invention. It will be apparent, however, to one skilled in the art, that embodiments of the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form, rather than in detail, in order to avoid obscuring embodiments of the present invention.
Note that in the corresponding drawings of the embodiments signals are represented with lines. Some lines may be thicker, to indicate more constituent signal paths, and/or have arrows at one or more ends, to indicate primary information flow direction. Such indications are not intended to be limiting. Rather, the lines are used in connection with one or more exemplary embodiments to facilitate easier understanding of a circuit or a logical unit. Any represented signal, as dictated by design needs or preferences, may actually comprise one or more signals that may travel in either direction and may be implemented with any suitable type of signal scheme, e.g., Peripheral Component Interconnect (PCI) express, Non-Volatile Memory Host Controller Interface (NVMHCI) differential ended scheme, single-ended scheme, etc.
In the embodiments discussed herein, the system data is stored in a first memory 213. In one embodiment, the first memory 213 is integrated on the host processor 201. In one embodiment, the first memory 213 is a non-volatile memory. In one embodiment, the first memory 213 comprises a NAND flash memory, a NOR flash memory, a Phase Change Memory (PCM), or any other non-volatile memory. In one embodiment, the physical size (form structure) and the memory size of the first memory 213 are smaller than the physical size (form structure) and the memory size of the second memory 202.
In one embodiment, the first memory 213 is coupled to a chipset processor 208 via a Non-Volatile Memory Host Controller Interface (NVMHCI). In other embodiments, other types of interfaces may be used to communicate with the first memory 213. While the embodiment of the host processor 201 illustrates a chipset processor 208 separate from the microprocessor 207, the components of the host processor 201 may be implemented on a single semiconductor die without changing the essence of the embodiments of the invention.
In one embodiment, the host processor 201 communicates with the second memory 202 via a host bus adaptor 209. In the embodiments discussed herein, the first memory 213 is physically closer to the processors 207 and 208 than the second memory 202 is. In one embodiment, the second memory 202 is an electro-mechanical hard drive. In other embodiments, the second memory 202 is a solid state drive (SSD). In one embodiment, the second memory 202 is operable to store user data 205 only while the first memory 213 is operable to store system data only.
By separating the user data and the system data from the host data for closer accessibility of the system data to the processors 207 and 208, overall performance of the computer system 200 is improved. One reason for such improvement in the performance of the computer system 200 is that the system data which is frequently accessed by the processors 207 and 208, compared to the frequency of access of the user data by the processors 207 and 208, is now physically closer to the processors 207 and 208—the first memory 213 for storing the system data is integrated within the host processor 201.
In the embodiment of
Furthermore, additional power savings are realized by the processors 207 and 208 of the host processor 201 because the processors 207 and 208 do not send and receive system data on the buses 203, 210, and the adaptor 209. By having no system data traffic on the buses 203, 210, and the adaptor 209, input-output (I/O) transceivers associated with these buses consume less power compared with a computer system having I/O transceivers operating to send and receive system and user data on the buses 210 and 203. The additional power savings contribute to the overall performance enhancement of the computer system 200.
In one embodiment, the host processor 201 improves the bandwidth of the bus 203 for the user data because the host processor 201 is operable to send and receive the user data 205 only on the bus 203 i.e., the bus 203 no longer has to send and receive the system data from the second memory 202. Such bandwidth improvement of user data on the buses improves access time of the user data 205 for the host processor 201 and thus contributes to the overall performance enhancement of the computer system 200. Users of the computer system 200 can access the user data 205 faster than accessing the user data 105 of
In one embodiment, the host processor 201 improves reliability of the computer system 200 because the probability of loss of system data caused by losses of data in the buses 203 and 210 is eliminated—system data is no longer transmitted/received on the buses 203 and 210. An improvement in reliability of the computer system 200 also contributes to the improvement in the performance of the computer system 200.
In one embodiment, the host processor 201 is operable to boot-up from reset or wake-up from a sleep state faster than the host processor 101 of
In one embodiment, the logic unit 211 of the microprocessor 207 is operable to monitor host data, which is processed by the microprocessor 207, to separate the system data and the user data from the monitored host data. In one embodiment, the logic unit 211 is operable to monitor the characteristics of the host data. Characteristics of the host data include the size of the host data and the frequency of request or use of the host data. The characteristics of the host data also include the logical block address (LBA) of the host data to be sent to the second memory 202, where the LBA represents a minimum granularity of data size that can be transferred between the host processor 201 and the second memory 202. The three characteristics listed above are not an exhaustive list. In other embodiments, the logic unit 211 is operable to monitor other (non-listed) characteristics of the host data to separate the user data and system data from the host data. The size (in bytes) of the system data is generally smaller than the size of the user data. In one embodiment, the logic unit 211 is operable to monitor one or more of the characteristics of the host data to determine whether the host data is user data or system data.
In one embodiment, the logic unit 211 is operable to compare the size of the host data with a predetermined threshold size and to generate a comparison result. In one embodiment, if the comparison result indicates that the monitored host data is smaller than the predetermined threshold size then the monitored data is system data. In such an embodiment, the logic unit 211 informs the microprocessor 207 and/or the operating system 214 to store the host data as system data in the first memory 213. In one embodiment, if the comparison result indicates that the monitored host data is larger than the predetermined threshold size then the logic unit 211 informs the microprocessor 207 and/or the operating system 214 to store the monitored host data in the second memory 202 because the monitored host data is determined to be user data.
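By way of illustration only, the size comparison described above may be sketched as follows. This is a minimal sketch and not an implementation of the logic unit 211: the names DataKind, THRESHOLD_BYTES, and classify_by_size, as well as the threshold value, are hypothetical, and treating host data exactly equal to the threshold as user data is an assumption, since the description only addresses the smaller and larger cases.

```python
from enum import Enum


class DataKind(Enum):
    SYSTEM = "system"  # to be stored in the first memory
    USER = "user"      # to be stored in the second memory


# Hypothetical value; the description does not specify a threshold size.
THRESHOLD_BYTES = 64 * 1024


def classify_by_size(size_bytes: int, threshold: int = THRESHOLD_BYTES) -> DataKind:
    """Compare the host data size with a predetermined threshold size.

    Smaller than the threshold -> system data.
    Otherwise -> user data (the equal case is an assumption).
    """
    if size_bytes < threshold:
        return DataKind.SYSTEM
    return DataKind.USER
```

For example, under these assumptions a 4 KiB request would be classified as system data and a 1 MiB request as user data.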
In one embodiment, the logic unit 211 is operable to instruct a software driver (not shown) to monitor the size of the host data and to compare it with a predetermined threshold size to determine whether the monitored host data is system data or user data. In one embodiment, the software driver is stored as machine-readable instructions 405 in a machine-readable storage medium 404 of
In one embodiment, the logic unit 211 is operable to monitor the LBA of the host data to be sent to the second memory 202. In such an embodiment, the logic unit 211 is operable to compare the LBA with a predetermined threshold that can be determined from a File System format information. In one embodiment, the File System is a File Allocation Table 32 (FAT32) file system. In such an embodiment, the predetermined threshold is determined by calculating the first LBA after a File System Table within the FAT32 file system. While the above embodiment is explained using FAT32 file system, other file systems can use similar approaches to determine user data locations for computing the predetermined threshold.
In one embodiment, if the comparison result indicates that the LBA is less than the predetermined threshold then the host data is system data. In such an embodiment, the logic unit 211 informs the microprocessor 207 and/or the operating system 214, based on the comparison result, to store the host data as system data in the first memory 213. In one embodiment, if the comparison result indicates that the LBA is greater than the predetermined threshold then the host data is user data. In such an embodiment, the logic unit 211 informs the microprocessor 207 and/or the operating system 214 to store the host data in the second memory 202.
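By way of illustration only, the LBA comparison may be sketched as follows for a FAT32 volume. The sketch assumes that "the first LBA after a File System Table" corresponds to the first sector of the FAT32 data region, which follows the reserved sectors and all copies of the file allocation table. The function names and parameters are hypothetical, and handling an LBA equal to the threshold as user data is an assumption.

```python
def fat32_first_data_lba(partition_start_lba: int,
                         reserved_sectors: int,
                         num_fats: int,
                         sectors_per_fat: int) -> int:
    """Return the first LBA after the file allocation tables.

    The FAT32 layout is: reserved sectors, then num_fats copies of the
    FAT (each sectors_per_fat long), then the data region.
    """
    return partition_start_lba + reserved_sectors + num_fats * sectors_per_fat


def is_system_lba(lba: int, threshold_lba: int) -> bool:
    """LBA below the predetermined threshold -> system data.

    An LBA at or above the threshold is treated as user data
    (the equal case is an assumption).
    """
    return lba < threshold_lba


# Hypothetical volume geometry, for illustration only.
threshold = fat32_first_data_lba(partition_start_lba=2048,
                                 reserved_sectors=32,
                                 num_fats=2,
                                 sectors_per_fat=1000)
```

With the hypothetical geometry above, the threshold evaluates to LBA 4080, so a write to LBA 2100 would be routed as system data and a write to LBA 5000 as user data.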
In one embodiment, the logic unit 211 is operable to instruct a software driver (not shown) to monitor the LBA of the host data to be sent to the second memory 202 and to compare the LBA with the predetermined threshold to determine whether the host data is system data or user data. In one embodiment, the software driver is stored as machine-readable instructions 405 in a machine-readable storage medium 404 of
Referring back to
At block 301, the logic unit 211 of the microprocessor 207 monitors the host data and identifies the characteristics of the host data. In this embodiment, the characteristic of the host data is the size of the host data. In other embodiments, other characteristics, as discussed above with reference to
At block 302, based on the identified characteristics of the host data, the logic unit 211 separates system data and user data from the host data. As discussed above, if the comparison result from the logic unit 211 indicates that the monitored host data is smaller than a predetermined threshold size then the monitored data is system data. Similarly, if the comparison result from the logic unit 211 indicates that the monitored host data is larger than the predetermined threshold size then the monitored data is user data. In one embodiment, the logic unit 211 is operable to inform the operating system 214 and/or the software driver (not shown) about the nature of the host data i.e., whether the host data is system data or user data.
At blocks 303 and 304, the operating system 214 and/or the software driver (not shown) monitors the comparison result generated by the logic unit 211 and stores the monitored data to either the first memory 213 or the second memory 202 based on the comparison result. At block 303, the operating system 214 instructs the microprocessor 207 and/or the chipset processor 208 to store the system data to the first memory 213. At block 304, the operating system 214 instructs the microprocessor 207 and/or the chipset processor 208 to store the user data to the second memory 202. In one embodiment, all memory accesses to the system data are channeled by the operating system 214 to the first memory 213 while all memory accesses to the user data are channeled by the operating system 214 to the second memory 202. In other embodiments, the software driver (not shown) or any other logic unit may channel memory accesses to the corresponding first and second memories based on the nature of the memory access, i.e., whether the memory access is for system data or user data.
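By way of illustration only, the flow of blocks 301 through 304 may be sketched in software as follows. The Memory class, the route_host_data function, and the dictionary-backed stores are hypothetical stand-ins for the first memory 213 and the second memory 202. The sketch uses size as the monitored characteristic and assumes that data equal to the threshold is user data.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    """Hypothetical stand-in for a memory device."""
    name: str
    blocks: dict = field(default_factory=dict)

    def store(self, key: str, payload: bytes) -> None:
        self.blocks[key] = payload


def route_host_data(key: str, payload: bytes,
                    first_memory: Memory, second_memory: Memory,
                    threshold_bytes: int) -> str:
    """Route one unit of host data and return the name of the target memory."""
    # Block 301: monitor the host data and identify its characteristic (size).
    size = len(payload)
    # Block 302: separate system data from user data using the comparison result.
    is_system = size < threshold_bytes
    # Block 303 (system data -> first memory) or block 304 (user data -> second memory).
    target = first_memory if is_system else second_memory
    target.store(key, payload)
    return target.name


first = Memory("first_memory_213")
second = Memory("second_memory_202")
small_target = route_host_data("page_table", b"\x00" * 512, first, second, 4096)
large_target = route_host_data("video", b"\x00" * 8192, first, second, 4096)
```

In this sketch the 512-byte item lands in the first memory and the 8192-byte item lands in the second memory, so that no system data is placed on the path to the second memory.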
At block 311, the operating system 214 identifies the presence of the first memory 213 in the host processor 201. The operating system 214 being aware of the first memory 213 monitors the nature/characteristics (one or a combination of characteristics) of the host data and determines whether the host data is system data or user data. As discussed above, the operating system 214 may perform the comparison function (discussed with reference to the logic unit 211 in
Based on the comparison result, the operating system 214 follows the logical paths 315 and/or 316 shown as dotted lines. At block 313, the operating system follows the logical path 315 (dotted line) and stores (or instructs the microprocessor 207 to store) the system data in the first memory 213 based on the comparison result. Similarly, at block 314, the operating system follows the logical path 316 (dotted line) and stores (or instructs the microprocessor 207 to store) the user data in the second memory 202 based on the comparison result.
In an alternative embodiment, the operating system 214 follows the logical path shown by solid lines. At block 312, the operating system 214 instructs a software driver to separate the system data and user data from the host data. In such an embodiment, the operating system 214 delegates its authority of separating the system and user data from host data to the software driver. In one embodiment, the software driver implements the comparison function (discussed with reference to the logic unit 211) and determines whether the host data is system data or user data. In another embodiment, the software driver monitors the comparison result generated by the logic unit 211 and channels all memory access to system data to the first memory 213 and channels all memory access to user data to the second memory 202. At blocks 313 and 314, the software driver stores system data to the first memory 213 and user data to the second memory 202, respectively.
In one embodiment, the computer system 400 comprises a host processor 201 communicatively coupled to the second memory 202 via a network bus 402. As discussed in
In one embodiment, the machine-readable instructions 405 are stored in a machine-readable storage medium 404 which is communicatively coupled to a host processor 201 via the network bus 402. The term “machine-readable instruction” is interchangeably referred to as “computer-executable instruction.” The computer-executable instructions include instructions to perform the methods described herein including methods of
In one embodiment, the computer executable instructions include the instructions of the software driver discussed with reference to
Reference in the specification to “an embodiment,” “one embodiment,” “some embodiments,” or “other embodiments” means that a particular feature, structure, or characteristic described in connection with the embodiments is included in at least some embodiments, but not necessarily in all embodiments. The various appearances of “an embodiment,” “one embodiment,” or “some embodiments” are not necessarily all referring to the same embodiments. If the specification states a component, feature, structure, or characteristic “may,” “might,” or “could” be included, that particular component, feature, structure, or characteristic is not required to be included. If the specification or claim refers to “a” or “an” element, that does not mean there is only one of the elements. If the specification or claims refer to “an additional” element, that does not preclude there being more than one of the additional element.
While the invention has been described in conjunction with specific embodiments thereof, many alternatives, modifications and variations of such embodiments will be apparent to those of ordinary skill in the art in light of the foregoing description.
For example, the logic unit 211 or its functionality may be implemented on a memory controller (not shown) that controls all access to the first and the second memories 213 and 202, respectively. In one embodiment, the operating system 214 includes logic to identify a characteristic of the host data and determines whether the host data is system data or user data based on the characteristics of the host data. The operating system 214 may then instruct the memory controller to channel all memory access of system data to the first memory 213 and to channel all memory access of user data to the second memory 202.
In one embodiment, the logic unit 211 is operable to monitor the frequency of request or use of the host data by the second memory 202. In such an embodiment, the logic unit 211 is operable to compare the frequency of request/use of the host data with a predetermined threshold frequency and to generate a comparison result. In one embodiment, if the comparison result indicates that the frequency of request/use of the host data is less than the predetermined threshold then the host data is system data. In such an embodiment, the logic unit 211 informs the microprocessor 207 and/or the operating system 214, based on the comparison result, to store the host data as system data in the first memory 213. In one embodiment, if the comparison result indicates that the frequency of request/use of the host data is greater than the predetermined threshold then the host data is user data. In such an embodiment, the logic unit 211 informs the microprocessor 207 and/or the operating system 214 to store the host data in the second memory 202. The implementation of the comparison function in the embodiments of the invention can be reversed without changing the essence of the embodiments of the invention e.g., if the frequency of request/use of the host data is less than the predetermined threshold then the host data is user data.
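By way of illustration only, the frequency comparison and its reversible sense may be sketched as follows. The FrequencyClassifier class, its access counter, and the reversed_sense flag are hypothetical. The default sense follows the comparison as stated above (a frequency less than the threshold indicates system data), and the flag selects the reversed comparison.

```python
from collections import Counter


class FrequencyClassifier:
    """Hypothetical sketch: classify host data by how often it is requested."""

    def __init__(self, threshold: int, reversed_sense: bool = False):
        self.threshold = threshold            # predetermined threshold frequency
        self.reversed_sense = reversed_sense  # True selects the reversed comparison
        self.counts = Counter()               # requests observed per data item

    def record_access(self, key: str) -> None:
        self.counts[key] += 1

    def is_system(self, key: str) -> bool:
        below = self.counts[key] < self.threshold
        # Default sense: below the threshold -> system data.
        # Reversed sense: below the threshold -> user data.
        return (not below) if self.reversed_sense else below


default_clf = FrequencyClassifier(threshold=3)
reversed_clf = FrequencyClassifier(threshold=3, reversed_sense=True)
for clf in (default_clf, reversed_clf):
    for _ in range(5):
        clf.record_access("item_a")  # requested 5 times
    clf.record_access("item_b")      # requested once
```

Keeping the sense as a single flag means the remainder of the routing logic is unchanged whichever comparison direction a given embodiment adopts.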
The embodiments of the invention are intended to embrace all such alternatives, modifications, and variations as to fall within the broad scope of the appended claims.