The accompanying drawings illustrate a number of exemplary embodiments and are a part of the specification. Together with the following description, these drawings demonstrate and explain various principles of the present disclosure.
Throughout the drawings, identical reference characters and descriptions indicate similar, but not necessarily identical, elements. While the exemplary embodiments described herein are susceptible to various modifications and alternative forms, specific embodiments have been shown by way of example in the drawings and will be described in detail herein. However, the exemplary embodiments described herein are not intended to be limited to the particular forms disclosed. Rather, the present disclosure covers all modifications, equivalents, and alternatives falling within the scope of the appended claims.
The demand for handling complex computational and memory intensive workloads (such as those involved in Artificial Intelligence (AI), Machine Learning (ML), analytics, and video transcoding) is expanding at an ever-increasing rate. Computational and memory intensive workloads are increasingly performed by heterogeneous processing and memory systems that include general-purpose host processors, task-specific accelerators, and memory expanders. For many computational and memory intensive workloads, it may be advantageous for these devices to coherently share and/or cache memory resources. Unfortunately, conventional systems with coherent memory spaces may place extra computational demands on the general-purpose host processors that manage the coherent memory spaces and/or may have larger attack surfaces as a result of many, possibly incongruous, devices sharing access to the same memory resources. Accordingly, the instant disclosure identifies and addresses a need for additional and improved systems and methods for efficiently and securely managing shared coherent memory spaces.
The present disclosure is generally directed to storage devices that transform data in-line with reads and writes to coherent host-managed device memory. As will be explained in greater detail below, embodiments of the present disclosure may perform various in-line encryption/decryption and/or compression/decompression operations when reading and/or writing data to shared device-attached memory resources. In some embodiments, the disclosed devices may perform these in-line transformations in a way that is transparent to external host processors and/or accelerators. In some embodiments, the disclosed devices may enable a coherent memory space to be partitioned into multiple regions, each region being associated with one or more in-line transformations, such that external host processors and/or accelerators are able to choose an appropriate in-line transformation by writing data to an associated region of memory. For example, a coherent memory space may include one or more encrypted sections, one or more unencrypted sections, one or more compressed sections, and/or one or more uncompressed sections.
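By way of a purely illustrative sketch (the address ranges, names, and transformation labels below are hypothetical, not drawn from the disclosure), such a region-partitioned coherent memory space might be modeled as follows:

```python
# Illustrative sketch: a coherent memory space partitioned into regions,
# each tagged with the in-line transformations the device applies to data
# written there. All ranges and labels are hypothetical.

REGION_TABLE = [
    # (start, end, transformations applied in-line by the device)
    (0x0000_0000, 0x0FFF_FFFF, ("encrypt",)),            # encrypted section
    (0x1000_0000, 0x1FFF_FFFF, ()),                      # plain section
    (0x2000_0000, 0x2FFF_FFFF, ("compress",)),           # compressed section
    (0x3000_0000, 0x3FFF_FFFF, ("encrypt", "compress")), # both
]

def transformations_for(host_address: int) -> tuple:
    """Return the in-line transformations designated for a host address."""
    for start, end, transforms in REGION_TABLE:
        if start <= host_address <= end:
            return transforms
    return ()  # address outside any designated range: store data as-is
```

A host or accelerator that wants its data encrypted simply writes to an address in the first range; no explicit request for encryption is needed.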
When performing encryption, the disclosed systems and methods may manage cryptographic keys at a processor, core, or thread level such that one processor, core, or thread cannot access the encrypted data of another processor, core, or thread. By performing encryption in this way, the disclosed systems may reduce the attack surface of shared system memory and/or prevent data stored to shared system memory from being accessed by unauthorized entities or malicious intruders. When performing compression, the disclosed systems may use multiple compression algorithms, each being associated with one or more memory regions and/or types of stored data.
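As a non-limiting illustration of this per-requester key isolation (the class and identifiers below are hypothetical; a real device would hold keys in hardware and use a standard cipher rather than raw key bytes):

```python
import os

class KeyStore:
    """Toy per-requester key store: each processor/core/thread identifier
    gets its own key, so no requester can obtain another requester's key.
    This sketch illustrates only the isolation, not the cryptography."""

    def __init__(self):
        self._keys = {}

    def key_for(self, requester_id: str) -> bytes:
        # Lazily provision a fresh random 128-bit key per requester.
        if requester_id not in self._keys:
            self._keys[requester_id] = os.urandom(16)
        return self._keys[requester_id]

store = KeyStore()
assert store.key_for("thread-0") == store.key_for("thread-0")  # stable per requester
assert store.key_for("thread-0") != store.key_for("thread-1")  # isolated between requesters
```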
Features from any of the embodiments described herein may be used in combination with one another in accordance with the general principles described herein. These and other embodiments, features, and advantages will be more fully understood upon reading the following detailed description in conjunction with the accompanying drawings and claims.
The following will provide, with reference to
As shown in
Host-connected memory 104 and/or device-connected memory 110 may represent any type or form of memory capable of storing cacheable data. Examples of host-connected memory 104 and/or device-connected memory 110 include, without limitation, dynamic random access memory (DRAM), static random access memory (SRAM), High Bandwidth Memory (HBM), cache memory, volatile memory, non-volatile memory (e.g., Flash memory), or any other suitable form of computer memory. Memory bus 106 and memory bus 112 may represent any internal memory bus suitable for interfacing with host-connected memory 104 and/or device-connected memory 110. Examples of memory bus 106 and memory bus 112 include, without limitation, Double Data Rate (DDR) buses, Serial ATA (SATA) buses, Serial Attached SCSI (SAS) buses, High Bandwidth Memory (HBM) buses, Peripheral Component Interconnect Express (PCIe) buses, and the like.
Cache-coherent bus 116 may represent any high-bandwidth and/or low-latency chip-to-chip interconnect, external bus, or expansion bus capable of providing connectivity (e.g., I/O, coherence, and/or memory semantics) between host processor(s) 102 and external devices or packages such as caching devices, workload accelerators (e.g., Graphics Processing Unit (GPU) devices, Field-Programmable Gate Array (FPGA) devices, Application-Specific Integrated Circuit (ASIC) devices, machine learning accelerators, tensor and vector processor units, etc.), memory expanders, and memory buffers. In some embodiments, cache-coherent bus 116 may include a standardized interconnect (e.g., a Peripheral Component Interconnect Express (PCIe) bus), a proprietary interconnect, or some combination thereof. In at least one embodiment, cache-coherent bus 116 may include a Compute Express Link (CXL) interconnect such as those illustrated in
Example system 100 in
As shown in
As illustrated in
As shown in
As shown in
As shown in
Returning to
When receiving a request to write data to a particular host address, the systems described herein may determine what, if any, in-line transformations should be performed on the received data by determining if the host address falls within a range of addresses designated for an in-line transformation. If the host address falls within a range of host addresses designated for one or more in-line transformations, the systems described herein may perform the one or more in-line transformations on the received data. Additionally or alternatively, if the host address falls within more than one range of host addresses, each being separately designated for an in-line transformation, the systems described herein may perform each in-line transformation on the received data. However, if the host address does not fall within a range of host addresses designated for an in-line transformation, the systems described herein may refrain from performing any in-line transformations on the received data.
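The write-path decision logic described above might be sketched as follows (the address ranges are hypothetical, and a simple XOR stands in for a real cipher purely to keep the sketch runnable):

```python
import zlib

# Hypothetical designated ranges; in the disclosure each range is
# associated with zero or more in-line transformations.
ENCRYPTED = range(0x0000, 0x4000)
COMPRESSED = range(0x4000, 0x8000)

def handle_write(host_address: int, data: bytes, key: bytes = b"\x5a") -> bytes:
    """Return the bytes actually stored for a write request (sketch).
    XOR is a stand-in for a real cipher such as AES."""
    if host_address in ENCRYPTED:
        data = bytes(b ^ key[0] for b in data)  # in-line "encryption"
    if host_address in COMPRESSED:
        data = zlib.compress(data)              # in-line compression
    return data  # address in no designated range: stored unmodified
```

If a host address fell within overlapping designated ranges, each associated transformation would be applied in turn, exactly as described above.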
At step 640, one or more of the systems described herein may write the transformed data to the physical address of the device-attached physical memory mapped to the host address received at step 610. For example, in-line transformation engine 114 may, as part of storage device 108, write data to memory location 722(1) in response to receiving a request to write the data to host address 712(M) of shared coherent memory space 710. Exemplary method 600 in
If the request received at step 610 was a request to read data, flow of method 600 may continue from step 620 to step 650. At step 650, one or more of the systems described herein may read previously transformed data from the physical address of the device-attached physical memory mapped to the host address received at step 610. For example, in-line transformation engine 114 may, as part of storage device 108, read data from memory location 722(1) in response to receiving a request to access host address 712(M) of shared coherent memory space 710.
At step 660, one or more of the systems described herein may perform a reversing in-line transformation on previously transformed data to reproduce original data. Before responding to a request to read data from a particular host address, the systems described herein may determine what, if any, reversing in-line transformations need to be performed on the data by determining if the host address falls within a range of addresses designated for an in-line transformation. If the host address falls within a range of host addresses designated for one or more in-line transformations, the systems described herein may perform one or more corresponding reversing in-line transformations on the data to restore the data to its original form. Additionally or alternatively, if the host address falls within more than one range of host addresses, each being separately designated for an in-line transformation, the systems described herein may perform the corresponding reversing in-line transformations on the data. However, if the host address does not fall within a range of host addresses designated for an in-line transformation, the systems described herein may refrain from performing any reversing in-line transformations on the data.
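The read path mirrors the write path, reversing any designated transformations in the opposite order of their application (again with hypothetical ranges and a stand-in XOR cipher):

```python
import zlib

ENCRYPTED = range(0x0000, 0x4000)   # hypothetical designated ranges
COMPRESSED = range(0x4000, 0x8000)

def handle_read(host_address: int, stored: bytes, key: bytes = b"\x5a") -> bytes:
    """Reverse any in-line transformations that were applied when the
    data at this host address was written (sketch; XOR stands in for a
    real cipher)."""
    if host_address in COMPRESSED:
        stored = zlib.decompress(stored)            # reverse compression
    if host_address in ENCRYPTED:
        stored = bytes(b ^ key[0] for b in stored)  # reverse "encryption"
    return stored  # undesignated address: returned as stored
```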
At step 670, one or more of the systems described herein may return the original data to the external host processor via the cache-coherent interconnect. For example, in-line transformation engine 114 may, as part of storage device 108, return original data to host processor 102 via cache-coherent interconnect 116. Exemplary method 600 in
As illustrated in
At step 1120, one or more of the systems described herein may determine whether a host address received in a request to write data falls within a range designated as encrypted memory. If a host address does fall within a range designated as encrypted memory, flow of method 1100 may continue to step 1130. For example, in-line encryption/decryption engine 200 may proceed to step 1130 after determining that host addresses 712(M) and 712(M+N) contained in requests 1312 and 1332 are mapped in coherent memory space 710 to encrypted memory range 715.
At step 1130, one or more of the systems described herein may encrypt the data received at step 1110. For example, as shown in
In some embodiments, the disclosed systems and methods may manage cryptographic keys at a processor, core, or thread level such that one processor, core, or thread cannot access the encrypted data of another processor, core, or thread. By performing encryption in this way, the disclosed systems may reduce the attack surface of shared system memory and/or prevent data stored to shared system memory from being accessed by unauthorized processors, cores, or threads or by malicious intruders that have gained access to a processor, core, or thread with the ability to access the shared system memory.
At step 1220, one or more of the systems described herein may identify a cryptographic key by querying a key store for a cryptographic key associated with an extracted requester identifier. For example, as shown in
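A minimal sketch of this key-store query (the request fields and key material below are hypothetical placeholders):

```python
# Sketch of the key lookup at step 1220: extract the requester
# identifier from the request and query the key store for the
# associated cryptographic key.

KEY_STORE = {
    "requester-7": b"\x00" * 16,  # placeholder key material
    "requester-9": b"\x11" * 16,
}

def lookup_key(request: dict) -> bytes:
    """Extract the requester identifier and query the key store."""
    requester_id = request["requester_id"]
    try:
        return KEY_STORE[requester_id]
    except KeyError:
        # No key provisioned for this requester: refuse the access
        # rather than fall back to another requester's key.
        raise PermissionError(f"no key for {requester_id}")
```

Refusing the access on a miss, rather than substituting a default key, preserves the per-requester isolation described above.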
Returning to
If the host address received at step 1110 did not fall within a range designated as encrypted memory, flow of method 1100 may continue from step 1120 to step 1150. For example, in-line encryption/decryption engine 200 may proceed to step 1150 after determining that host address 712(X) contained in request 1612 has been mapped in coherent memory space 710 to unencrypted memory range 717. At step 1150, one or more of the systems described herein may write unencrypted data to a physical address of the device-attached physical memory mapped to the host address referenced in the request received at step 1110. For example, as shown in
As illustrated in
At step 1420, one or more of the systems described herein may read previously stored data from the physical address of the device-attached physical memory that is mapped to the host address received at step 1410. For example, as shown in
At step 1430, one or more of the systems described herein may determine whether a host address received in a request to read data falls within a range designated as encrypted memory. If a host address does fall within a range designated as encrypted memory, flow of method 1400 may continue to step 1440. For example, in-line encryption/decryption engine 200 may proceed to step 1440 after determining that host addresses 712(M) and 712(M+N) contained in requests 1512 and 1532 are mapped in coherent memory space 710 to encrypted memory range 715.
At step 1440, one or more of the systems described herein may decrypt the encrypted data read from device memory at step 1420. For example, as shown in
If the host address received at step 1410 did not fall within a range designated as encrypted memory, flow of method 1400 may continue from step 1430 to step 1460. For example, in-line encryption/decryption engine 200 may proceed to step 1460 after determining that host address 712(X) contained in request 1612 has been mapped in coherent memory space 710 to unencrypted memory range 717. At step 1460, one or more of the systems described herein may return data read from device memory to the external host processor via the cache-coherent interconnect without decrypting the data. For example, as shown in
As illustrated in
At step 1720, one or more of the systems described herein may determine whether a host address received in a request to write data falls within a range designated as compressed memory. If a host address does fall within a range designated as compressed memory, flow of method 1700 may continue to step 1730. For example, in-line compression/decompression engine 300 may proceed to step 1730 after determining that host address 712(M) contained in request 1812 is mapped in coherent memory space 710 to compressed memory range 715.
At step 1730, one or more of the systems described herein may compress the data received at step 1710. For example, as shown in
At step 1740, one or more of the systems described herein may write the data compressed at step 1730 to the physical address of the device-attached physical memory mapped to the host address received at step 1710. For example, in-line compression/decompression engine 300 may write compressed data 1816 to memory location 722(1) as shown in
If the host address received at step 1710 did not fall within a range designated as compressed memory, flow of method 1700 may continue from step 1720 to step 1750. For example, in-line compression/decompression engine 300 may proceed to step 1750 after determining that host address 712(X) contained in request 2012 has been mapped in coherent memory space 710 to uncompressed memory range 717. At step 1750, one or more of the systems described herein may write uncompressed data to a physical address of the device-attached physical memory mapped to the host address referenced in the request received at step 1710. For example, as shown in
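The compression write path of steps 1710 through 1750 might be sketched as follows (the designated range and the identity host-to-device address mapping are placeholders):

```python
import zlib

COMPRESSED = range(0x0000, 0x4000)   # hypothetical designated range
device_memory = {}                   # physical address -> stored bytes

def write(host_address: int, data: bytes) -> None:
    """Steps 1710-1750 in miniature: compress only when the host address
    falls in the compressed range, then store at the mapped physical
    address (the identity mapping here is a placeholder)."""
    physical = host_address          # stand-in for host-to-device translation
    if host_address in COMPRESSED:
        data = zlib.compress(data)   # in-line compression
    device_memory[physical] = data   # uncompressed range: stored as-is
```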
As illustrated in
At step 1920, one or more of the systems described herein may read previously stored data from the physical address of the device-attached physical memory that is mapped to the host address received at step 1910. For example, as shown in
At step 1930, one or more of the systems described herein may determine whether a host address received in a request to read data falls within a range designated as compressed memory. If a host address does fall within a range designated as compressed memory, flow of method 1900 may continue to step 1940. For example, in-line compression/decompression engine 300 may proceed to step 1940 after determining that host address 712(M) contained in request 1822 is mapped in coherent memory space 710 to compressed memory range 715.
At step 1940, one or more of the systems described herein may decompress the compressed data read from device memory at step 1920. For example, as shown in
If the host address received at step 1910 did not fall within a range designated as compressed memory, flow of method 1900 may continue from step 1930 to step 1960. For example, in-line compression/decompression engine 300 may proceed to step 1960 after determining that host address 712(X) contained in request 2022 has been mapped in coherent memory space 710 to uncompressed memory range 717. At step 1960, one or more of the systems described herein may return data read from device memory to the external host processor via the cache-coherent interconnect without decompressing the data. For example, as shown in
As mentioned above, embodiments of the present disclosure may perform various in-line encryption/decryption and/or compression/decompression operations when reading and/or writing data to shared device-attached memory resources. In some embodiments, the disclosed devices may perform these in-line transformations in a way that is transparent to external host processors and/or accelerators. In some embodiments, the disclosed devices may enable a coherent memory space to be partitioned into multiple regions, each region being associated with one or more in-line transformations, such that external host processors and/or accelerators are able to choose an appropriate in-line transformation by writing data to an associated region of memory. For example, a coherent memory space may include one or more encrypted sections, one or more unencrypted sections, one or more compressed sections, and/or one or more uncompressed sections. When performing encryption, the disclosed systems and methods may manage cryptographic keys at a processor, core, or thread level such that one processor, core, or thread cannot access the encrypted data of another processor, core, or thread. By performing encryption in this way, the disclosed systems may reduce the attack surface of shared system memory and/or prevent data stored to shared system memory from being accessed by unauthorized entities or malicious intruders. When performing compression, the disclosed systems may use multiple compression algorithms, each being associated with one or more memory regions and/or types of stored data.
Example 1: A storage device may include (1) a device-attached physical memory accessible to an external host processor via a cache-coherent interconnect, addresses of the device-attached physical memory being mapped to a coherent memory space of the external host processor, and (2) one or more internal physical processors adapted to (a) receive, from the external host processor via the cache-coherent interconnect, a request to write first data to a host address of the coherent memory space of the external host processor, (b) perform an in-line transformation on the first data to generate second data, and (c) write the second data to a physical address of the device-attached physical memory corresponding to the host address.
Example 2: The storage device of Example 1, wherein the in-line transformation may include an encryption operation, and the one or more internal physical processors may be adapted to (1) perform the in-line transformation by performing the encryption operation on the first data and (2) write the second data by writing the encrypted first data to the physical address of the device-attached physical memory.
Example 3: The storage device of any of Examples 1-2, further including a cryptographic-key store containing multiple cryptographic keys, each of the cryptographic keys being mapped to one or more requester identifiers. In this example, the request may include a requester identifier previously mapped to a cryptographic key in the cryptographic-key store, and the one or more internal physical processors may be adapted to (1) use the requester identifier to locate the cryptographic key and (2) use the cryptographic key to perform the encryption operation on the first data.
Example 4: The storage device of any of Examples 1-3, wherein the requester identifier may include an identifier of a thread executing on the external host processor, the thread having generated the request.
Example 5: The storage device of any of Examples 1-4, wherein the one or more internal physical processors may be further adapted to receive, from the external host processor, a second request to write third data to a second host address of the coherent memory space of the external host processor. In this example, the second request may include a second requester identifier previously mapped to a second cryptographic key in the cryptographic-key store, the second requester identifier may include a second identifier of a second thread executing on the external host processor, and the second thread may have generated the second request. The one or more internal physical processors may also be further adapted to (1) translate the second host address into a second device address of the device-attached physical memory, (2) use the second requester identifier to locate the second cryptographic key, (3) use the second cryptographic key to perform the encryption operation on the third data, and (4) write the encrypted third data to the second device address of the device-attached physical memory.
Example 6: The storage device of any of Examples 1-5, wherein (1) a first range of addresses of the coherent memory space of the external host processor may be designated as encrypted memory, the host address falling within the first range of addresses, (2) a second range of addresses of the coherent memory space of the external host processor may be designated as unencrypted memory, and (3) the one or more internal physical processors may be adapted to perform the encryption operation on the first data in response to determining that the host address falls within the first range of addresses.
Example 7: The storage device of any of Examples 1-6, wherein the one or more internal physical processors may be further adapted to (1) receive, from the external host processor, a second request to write third data to a second host address of the coherent memory space of the external host processor, the second host address falling within the second range of addresses, (2) translate the second host address into a second device address of the device-attached physical memory, (3) refrain from encrypting the third data in response to determining that the second host address falls within the second range of addresses, and (4) write the unencrypted third data to the second device address of the device-attached physical memory.
Example 8: The storage device of any of Examples 1-7, wherein the in-line transformation may include a compression operation, and the one or more internal physical processors may be adapted to (1) perform the in-line transformation by performing the compression operation on the first data and (2) write the second data by writing the compressed first data to the physical address of the device-attached physical memory.
Example 9: The storage device of any of Examples 1-8, wherein (1) a first range of addresses of the coherent memory space of the external host processor may be designated for storing a first type of data associated with the compression operation, the host address falling within the first range of addresses, (2) a second range of addresses of the coherent memory space of the external host processor may be designated for storing a second type of data associated with a second compression operation, and (3) the one or more internal physical processors may be adapted to perform the compression operation on the first data in response to determining that the host address falls within the first range of addresses.
Example 10: The storage device of any of Examples 1-9, wherein the one or more internal physical processors may be further adapted to (1) receive, from the external host processor, a second request to write third data to a second host address of the coherent memory space of the external host processor, the second host address falling within the second range of addresses, (2) translate the second host address into a second device address of the device-attached physical memory, (3) perform the second compression operation, instead of the compression operation, on the third data in response to determining that the second host address falls within the second range of addresses, and (4) write the compressed third data to the second device address of the device-attached physical memory.
Example 11: A storage device including (1) a device-attached physical memory managed by and accessible to an external host processor via a cache-coherent interconnect, wherein addresses of the device-attached physical memory may be mapped to a coherent memory space of the external host processor, and (2) one or more internal physical processors adapted to (a) receive, from the external host processor via the cache-coherent interconnect, a request to read from a host address of the coherent memory space of the external host processor, (b) translate the host address into a device address of the device-attached physical memory, (c) read first data from the device address of the device-attached physical memory, (d) perform an in-line transformation on the first data to generate second data, and (e) return the second data to the external host processor via the cache-coherent interconnect.
Example 12: The storage device of Example 11, wherein the in-line transformation may include a decryption operation, and the one or more internal physical processors may be adapted to (1) perform the in-line transformation by performing the decryption operation on the first data and (2) return the second data by returning the decrypted first data to the external host processor via the cache-coherent interconnect.
Example 13: The storage device of any of Examples 11-12, further including a cryptographic-key store containing multiple cryptographic keys, each of the cryptographic keys being mapped to one or more requester identifiers. In this example, the request may include a requester identifier previously mapped to a cryptographic key in the cryptographic-key store, and the one or more internal physical processors may be adapted to use the requester identifier to locate the cryptographic key and use the cryptographic key to perform the decryption operation on the first data.
Example 14: The storage device of any of Examples 11-13, wherein the requester identifier may include an identifier of a thread executing on the external host processor, the thread having generated the request.
Example 15: The storage device of any of Examples 11-14, wherein the one or more internal physical processors may be further adapted to receive, from the external host processor, a second request to read from a second host address of the coherent memory space of the external host processor. In this example, the second request may include a second requester identifier previously mapped to a second cryptographic key in the cryptographic-key store, the second requester identifier may include a second identifier of a second thread executing on the external host processor, and the second thread may have generated the second request. The one or more internal physical processors may be further adapted to (1) translate the second host address into a second device address of the device-attached physical memory, (2) read third data from the second device address, (3) use the second requester identifier to locate the second cryptographic key, (4) use the second cryptographic key to perform the decryption operation on the third data, and (5) return the decrypted third data to the external host processor via the cache-coherent interconnect.
Example 16: The storage device of any of Examples 11-15, wherein a first range of addresses of the coherent memory space of the external host processor may be designated as encrypted memory, the host address falling within the first range of addresses, a second range of addresses of the coherent memory space of the external host processor may be designated as unencrypted memory, and the one or more internal physical processors may be adapted to perform the decryption operation on the first data in response to determining that the host address falls within the first range of addresses.
Example 17: The storage device of any of Examples 11-16, wherein the one or more internal physical processors may be further adapted to (1) receive, from the external host processor, a second request to read from a second host address of the coherent memory space of the external host processor, the second host address falling within the second range of addresses, (2) translate the second host address into a second device address of the device-attached physical memory, (3) read third data from the second device address, (4) refrain from decrypting the third data in response to determining that the second host address falls within the second range of addresses, and (5) return the third data to the external host processor via the cache-coherent interconnect.
Example 18: The storage device of any of Examples 11-17, wherein the in-line transformation may include a decompression operation, and the one or more internal physical processors may be adapted to (1) perform the in-line transformation by performing the decompression operation on the first data and (2) return the second data by returning the decompressed first data to the external host processor via the cache-coherent interconnect.
Example 19: The storage device of any of Examples 11-18, wherein a first range of addresses of the coherent memory space of the external host processor may be designated for storing a first type of data associated with the decompression operation, the host address falling within the first range of addresses, a second range of addresses of the coherent memory space of the external host processor may be designated for storing a second type of data associated with a second decompression operation, and the one or more internal physical processors may be adapted to perform the decompression operation on the first data in response to determining that the host address falls within the first range of addresses.
Example 20: A computer-implemented method may include (1) receiving, from an external host processor via a cache-coherent interconnect, a request to access a host address of a coherent memory space of the external host processor, wherein physical addresses of a device-attached physical memory may be mapped to the coherent memory space of the external host processor, (2) when the request is to write data to the host address, (a) performing an in-line transformation on the data to generate second data and (b) writing the second data to the physical address of the device-attached physical memory mapped to the host address, and (3) when the request is to read data from the host address, (a) reading the data from the physical address of the device-attached physical memory mapped to the host address, (b) performing a reversing in-line transformation on the data to generate second data, and (c) returning the second data to the external host processor via the cache-coherent interconnect.
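Examples 9, 10, and 19 above describe designating different address ranges for different compression operations, each associated with a type of stored data. A minimal sketch of such per-region algorithm selection (the ranges and the choice of algorithms are hypothetical):

```python
import lzma
import zlib

# Hypothetical mapping of address ranges to per-data-type compressors:
# each range is designated for a different compression operation.
COMPRESSORS = [
    (range(0x0000, 0x4000), zlib.compress),  # e.g., general-purpose data
    (range(0x4000, 0x8000), lzma.compress),  # e.g., highly redundant data
]

def compress_for(host_address: int, data: bytes) -> bytes:
    """Pick the compression operation designated for the range containing
    the host address; outside any range, store the data uncompressed."""
    for addresses, compressor in COMPRESSORS:
        if host_address in addresses:
            return compressor(data)
    return data
```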
As detailed above, the computing devices and systems described and/or illustrated herein broadly represent any type or form of computing device or system capable of executing computer-readable instructions, such as those contained within the modules described herein. In their most basic configuration, these computing device(s) may each include at least one memory device and at least one physical processor.
In some examples, the term “memory device” generally refers to any type or form of volatile or non-volatile storage device or medium capable of storing data and/or computer-readable instructions. In one example, a memory device may store, load, and/or maintain one or more of the modules described herein. Examples of memory devices include, without limitation, Random Access Memory (RAM), Read Only Memory (ROM), flash memory, Hard Disk Drives (HDDs), Solid-State Drives (SSDs), optical disk drives, caches, variations or combinations of one or more of the same, or any other suitable storage memory.
In some examples, the term “physical processor” generally refers to any type or form of hardware-implemented processing unit capable of interpreting and/or executing computer-readable instructions. In one example, a physical processor may access and/or modify one or more modules stored in the above-described memory device. Examples of physical processors include, without limitation, microprocessors, microcontrollers, Central Processing Units (CPUs), Field-Programmable Gate Arrays (FPGAs) that implement softcore processors, Application-Specific Integrated Circuits (ASICs), portions of one or more of the same, variations or combinations of one or more of the same, or any other suitable physical processor.
Although illustrated as separate elements, the modules described and/or illustrated herein may represent portions of a single module or application. In addition, in certain embodiments one or more of these modules may represent one or more software applications or programs that, when executed by a computing device, may cause the computing device to perform one or more tasks. For example, one or more of the modules described and/or illustrated herein may represent modules stored and configured to run on one or more of the computing devices or systems described and/or illustrated herein. One or more of these modules may also represent all or portions of one or more special-purpose computers configured to perform one or more tasks.
In addition, one or more of the modules described herein may transform data, physical devices, and/or representations of physical devices from one form to another. For example, one or more of the modules recited herein may receive data to be transformed over a cache-coherent interconnect, transform the data (e.g., by encryption or compression), output a result of the transformation to device-connected memory, and use the result of the transformation to respond to future read requests for the data after reversing any transformations previously made. Additionally or alternatively, one or more of the modules recited herein may transform a processor, volatile memory, non-volatile memory, and/or any other portion of a physical computing device from one form to another by executing on the computing device, storing data on the computing device, and/or otherwise interacting with the computing device.
In some embodiments, the term “computer-readable medium” generally refers to any form of a device, carrier, or medium capable of storing or carrying computer-readable instructions. Examples of computer-readable media include, without limitation, transmission-type media, such as carrier waves, and non-transitory-type media, such as magnetic-storage media (e.g., hard disk drives, tape drives, and floppy disks), optical-storage media (e.g., Compact Disks (CDs), Digital Video Disks (DVDs), and BLU-RAY disks), electronic-storage media (e.g., solid-state drives and flash media), and other distribution systems.
The process parameters and sequence of the steps described and/or illustrated herein are given by way of example only and can be varied as desired. For example, while the steps illustrated and/or described herein may be shown or discussed in a particular order, these steps do not necessarily need to be performed in the order illustrated or discussed. The various exemplary methods described and/or illustrated herein may also omit one or more of the steps described or illustrated herein or include additional steps beyond those disclosed.
The preceding description has been provided to enable others skilled in the art to best utilize various aspects of the exemplary embodiments disclosed herein. This exemplary description is not intended to be exhaustive or to be limited to any precise form disclosed. Many modifications and variations are possible without departing from the spirit and scope of the present disclosure. The embodiments disclosed herein should be considered in all respects illustrative and not restrictive. Reference should be made to the appended claims and their equivalents in determining the scope of the present disclosure.
Unless otherwise noted, the terms “connected to” and “coupled to” (and their derivatives), as used in the specification and claims, are to be construed as permitting both direct and indirect (i.e., via other elements or components) connection. In addition, the terms “a” or “an,” as used in the specification and claims, are to be construed as meaning “at least one of.” Finally, for ease of use, the terms “including” and “having” (and their derivatives), as used in the specification and claims, are interchangeable with and have the same meaning as the word “comprising.”