Mechanism to Boot Multiple Hosts from a Shared PCIe Device

Information

  • Patent Application
  • 20150169331
  • Publication Number
    20150169331
  • Date Filed
    December 16, 2013
  • Date Published
    June 18, 2015
Abstract
A PCIe switch including a memory and a processor. The processor is operable to receive a transaction from an information handling system to an endpoint device, determine that the transaction is a request to receive firmware code from the endpoint device, block the transaction from being issued to the endpoint device, and provide the firmware code to the information handling system from the memory.
Description
FIELD OF THE DISCLOSURE

This disclosure relates generally to information handling systems, and more particularly relates to a mechanism to boot multiple hosts from a shared PCIe device.


BACKGROUND

As the value and use of information continues to increase, individuals and businesses seek additional ways to process and store information. One option is an information handling system. An information handling system generally processes, compiles, stores, and/or communicates information or data for business, personal, or other purposes. Because technology and information handling needs and requirements may vary between different applications, information handling systems may also vary regarding what information is handled, how the information is handled, how much information is processed, stored, or communicated, and how quickly and efficiently the information may be processed, stored, or communicated. The variations in information handling systems allow for information handling systems to be general or configured for a specific user or specific use such as financial transaction processing, airline reservations, enterprise data storage, or global communications. In addition, information handling systems may include a variety of hardware and software resources that may be configured to process, store, and communicate information and may include one or more computer systems, data storage systems, and networking systems.





BRIEF DESCRIPTION OF THE DRAWINGS

It will be appreciated that for simplicity and clarity of illustration, elements illustrated in the Figures have not necessarily been drawn to scale. For example, the dimensions of some of the elements are exaggerated relative to other elements.


Embodiments incorporating teachings of the present disclosure are shown and described with respect to the drawings presented herein, in which:



FIG. 1 is a block diagram illustrating a virtualized environment according to an embodiment of the present disclosure;



FIGS. 2 and 3 illustrate methods for loading firmware code in the virtualized environment of FIG. 1;



FIG. 4 is a block diagram illustrating a virtualized environment according to another embodiment of the present disclosure; and



FIG. 5 is a block diagram illustrating a generalized information handling system according to an embodiment of the present disclosure.





The use of the same reference symbols in different drawings indicates similar or identical items.


DETAILED DESCRIPTION OF DRAWINGS

The following description in combination with the Figures is provided to assist in understanding the teachings disclosed herein. The following discussion will focus on specific implementations and embodiments of the teachings. This focus is provided to assist in describing the teachings, and should not be interpreted as a limitation on the scope or applicability of the teachings. However, other teachings can certainly be used in this application. The teachings can also be used in other applications, and with several different types of architectures, such as distributed computing architectures, client/server architectures, or middleware server architectures and associated resources.


For purposes of this disclosure, an information handling system can be implemented on one or more information handling systems. An information handling system can include any instrumentality or aggregate of instrumentalities operable to compute, classify, process, transmit, receive, retrieve, originate, switch, store, display, manifest, detect, record, reproduce, handle, or utilize any form of information, intelligence, or data for business, scientific, control, entertainment, or other purposes. For example, an information handling system can be a personal computer, a laptop computer, a smart phone, a tablet device or other consumer electronic device, a network server, a network storage device, a switch, a router, or another network communication device, or any other suitable device and may vary in size, shape, performance, functionality, and price. Further, an information handling system can include processing resources for executing machine-executable code, such as a central processing unit (CPU), a programmable logic array (PLA), an embedded device such as a System-on-a-Chip (SoC), or other control logic hardware. An information handling system can also include one or more computer-readable media for storing machine-executable code, such as software or data. Additional components of an information handling system can include one or more storage devices that can store machine-executable code, one or more communications ports for communicating with external devices, and various input and output (I/O) devices, such as a keyboard, a mouse, and a video display. An information handling system can also include one or more buses operable to transmit information between the various hardware components.



FIG. 1 illustrates an embodiment of a virtualized environment 100 including a multi-root (MR) Peripheral Component Interconnect manager (PCIM) 110, one or more host systems 120, an MR input/output virtualization (IOV) switch 130, and a single-root (SR) IOV endpoint 140. MR-PCIM 110 includes a PCI root complex 115 that is operable to connect a processor complex of MR-PCIM 110 to a PCI Express (PCIe) switch fabric such as MR-IOV switch 130, and thereby to gain access to the functionality of SR-IOV endpoint 140. Host system 120 includes a PCI root complex 125 similar to PCI root complex 115. MR-IOV switch 130 includes a non-volatile random access memory (NVRAM) 135, such as a flash device or other re-writeable memory device, for storing expansion ROM firmware from SR-IOV endpoint 140, as described further below. In a particular embodiment, MR-IOV switch 130 operates in compliance with the Multi-Root I/O Virtualization and Sharing 1.0 specification and SR-IOV endpoint 140 operates in compliance with the Single-Root I/O Virtualization and Sharing 1.1 specification.


SR-IOV endpoint 140 includes a physical function 150, one or more virtual functions 160, and an expansion ROM 170. Physical function 150 operates to provide an interface through which PCI root complex 115 accesses the functions of SR-IOV endpoint 140. In particular, physical function 150 provides PCI root complex 115 with access to PCI configuration space 152, to base address registers (BAR) 154, and to descriptor fields 156. PCI configuration space 152 provides memory-mapped locations through which MR-PCIM 110 can configure SR-IOV endpoint 140. BAR 154 and descriptors 156 operate to permit data to flow between PCI root complex 115 and SR-IOV endpoint 140. Physical function 150 is associated with Function 0 of SR-IOV endpoint 140. The skilled artisan will understand that expansion ROM 170 represents one or more of an option ROM, such as may be included on an add-in device, a Unified Extensible Firmware Interface (UEFI) driver, or another type of firmware function.


Virtual function 160 operates to provide an interface through which PCI root complex 125 accesses the functions of SR-IOV endpoint 140. In particular, virtual function 160 provides PCI root complex 125 with access to BAR 164 and to descriptor fields 166. BAR 164 and descriptors 166 operate to permit data to flow between PCI root complex 125 and SR-IOV endpoint 140. Thus host system 120 is distinguished from MR-PCIM 110 in that the host system does not have access to physical function 150, and thus cannot access expansion ROM 170. Virtual function 160 is associated with functions other than Function 0 of SR-IOV endpoint 140. The skilled artisan will understand that the above description is simplified, and that a virtual function may include some subset of the functions available in a physical function's configuration space, but that the subset of functions will not include an association with an expansion ROM.


Expansion ROM 170 is a non-volatile memory for storing firmware code for SR-IOV endpoint 140. The firmware code operates to receive low-level commands, such as BIOS calls, and to translate the commands into actions performed by SR-IOV endpoint 140. Thus MR-PCIM 110 or host system 120 can execute the firmware code from expansion ROM 170 to access the functions of SR-IOV endpoint 140. However, the base address for expansion ROM 170 is included in PCI configuration space 152. Therefore only MR-PCIM 110 can obtain the firmware code directly from expansion ROM 170. Note that the Single-Root I/O Virtualization and Sharing 1.1 specification defines that expansion ROM images are applicable to a particular PCIe peripheral device in its entirety and are thereby associated with the physical function. As such, the expansion ROM images are not typically applicable to or associated with virtual functions.
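
The access asymmetry between the physical function and a virtual function can be illustrated with a minimal sketch. The class name `Function`, the function labels, and the example address are hypothetical and appear nowhere in the disclosure. The only outside detail relied on is that the Expansion ROM Base Address register sits at offset 0x30 of a Type 0 PCI configuration header. Registers that a function does not implement are simply modeled as reading zero.

```python
# Offset of the Expansion ROM Base Address register in a Type 0 PCI header.
EXPANSION_ROM_BAR_OFFSET = 0x30


class Function:
    """Hypothetical model of one PCIe function's configuration registers."""

    def __init__(self, name, registers):
        self.name = name
        self.registers = dict(registers)

    def config_read(self, offset):
        # Registers the function does not implement are modeled as reading zero.
        return self.registers.get(offset, 0)


# Assumed example address; a real value would be assigned during enumeration.
physical_function = Function("PF0", {EXPANSION_ROM_BAR_OFFSET: 0xFEB00000})
# The virtual function carries no expansion ROM association at all.
virtual_function = Function("VF1", {})

for function in (physical_function, virtual_function):
    rom_base = function.config_read(EXPANSION_ROM_BAR_OFFSET)
    status = hex(rom_base) if rom_base else "no expansion ROM visible"
    print(f"{function.name}: {status}")
```

In this model a root complex that can reach only the virtual function never learns where the firmware code lives, which is the gap the switch-based mechanism is meant to close.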



FIG. 2 illustrates virtualized environment 100 and a method whereby MR-PCIM 110 reads the firmware code from expansion ROM 170. Here, PCI root complex 115 issues a transaction 202 to physical function 150 to determine the base address 204 for expansion ROM 170. When MR-IOV switch 130 detects transaction 202 to physical function 150, the MR-IOV switch stores 206 the base address for expansion ROM 170 to NVRAM 135. PCI root complex 115 then reads 208 the firmware code from expansion ROM 170, in order to use the firmware code in future transactions with SR-IOV endpoint 140. When MR-IOV switch 130 detects read 208 of the firmware code from expansion ROM 170, the MR-IOV switch stores 210 the firmware code to NVRAM 135.
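
The snooping behavior of FIG. 2 can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the classes `Endpoint` and `SnoopingSwitch`, their method names, the dictionary standing in for NVRAM 135, and the example address and firmware bytes are all hypothetical.

```python
class Endpoint:
    """Hypothetical stand-in for SR-IOV endpoint 140 and expansion ROM 170."""

    def __init__(self, rom_base, firmware):
        self.rom_base = rom_base
        self.firmware = firmware

    def read_rom_base(self):
        return self.rom_base

    def read_rom(self):
        return self.firmware


class SnoopingSwitch:
    """Hypothetical stand-in for MR-IOV switch 130; a dict models NVRAM 135."""

    def __init__(self, endpoint):
        self.endpoint = endpoint
        self.nvram = {}

    def pf_read_rom_base(self):
        # Forward the base address query to the physical function, and keep
        # a copy of the returned base address before passing it upstream.
        base = self.endpoint.read_rom_base()
        self.nvram["rom_base"] = base
        return base

    def pf_read_rom(self):
        # Forward the firmware read, and keep a copy of the firmware code.
        code = self.endpoint.read_rom()
        self.nvram["firmware"] = code
        return code


# Placeholder image: arbitrary bytes chosen only for the demonstration.
endpoint = Endpoint(rom_base=0xFEB00000, firmware=b"\x55\xaa" + b"\x00" * 14)
switch = SnoopingSwitch(endpoint)
base_seen_by_manager = switch.pf_read_rom_base()
code_seen_by_manager = switch.pf_read_rom()
print(sorted(switch.nvram))
```

The manager sees exactly what the endpoint returned, while the switch ends up holding both the base address and the firmware code for later use.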



FIG. 3 illustrates virtualized environment 100 and a method whereby host system 120 obtains the firmware code from MR-IOV switch 130. Here, PCI root complex 125 issues a transaction 302 to virtual function 160 to determine the base address for expansion ROM 170. However, since virtual function 160 lacks access to PCI configuration space 152, SR-IOV endpoint 140 is unable to respond to transaction 302. Thus, when MR-IOV switch 130 detects transaction 302 to virtual function 160, the MR-IOV switch blocks transaction 302 from being issued 304 to virtual function 160, traps transaction 302, and responds 306 to the trapped transaction by providing a base address of the firmware code stored in NVRAM 135 to PCI root complex 125. PCI root complex 125 then reads 308 the firmware code from NVRAM 135, in order to use the firmware code in future transactions with SR-IOV endpoint 140. The skilled artisan will recognize that there are different ways in which MR-IOV switch 130 can trap transactions and provide the base address for the firmware code to host system 120. For example, MR-IOV switch 130 can provide the base address of the firmware code in expansion ROM 170 to PCI root complex 125, and can then trap read transactions to that base address and instead point to the copy of the firmware code in NVRAM 135. In this way, all accesses to the firmware code after the initial read that stored the copy in NVRAM 135 are handled by MR-IOV switch 130, including reads from MR-PCIM 110. In another example, MR-IOV switch 130 can provide the base address of the firmware code in NVRAM 135 to PCI root complex 125, and can then trap read transactions from host system 120 to that base address, so that the read accesses from the host system are directed to the NVRAM without further transaction trapping. In this example, however, reads to expansion ROM 170 that are issued by MR-PCIM 110 would still be handled by the expansion ROM.
After the firmware code is provided to host system 120, the host system can execute the firmware code to boot the host system and enable use of SR-IOV endpoint 140. As such, host system 120, which otherwise would be unable to access SR-IOV endpoint 140, can utilize the SR-IOV endpoint by virtue of the copy of the firmware code provided to the host system. The skilled artisan will understand that, although virtual function 160 lacks access to expansion ROM 170, the virtual function can include a limited configuration space for controlling the virtual function.
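
The trap-and-serve behavior of FIG. 3 can be sketched as follows, loosely following the variant in which the switch hands the host a base address that points into its own NVRAM. The class `TrappingSwitch`, its method names, the `issued_to_endpoint` log, and the example address and bytes are hypothetical. The sketch also assumes that the firmware code was cached earlier.

```python
class TrappingSwitch:
    """Hypothetical stand-in for MR-IOV switch 130 serving a cached image."""

    def __init__(self, nvram_base, cached_firmware):
        self.nvram_base = nvram_base            # assumed address of the copy
        self.cached_firmware = cached_firmware  # assumed to be cached earlier
        self.issued_to_endpoint = []            # log of forwarded transactions

    def vf_read_rom_base(self):
        # The query is trapped: nothing is forwarded to the virtual function,
        # and the host is told where the cached copy lives instead.
        return self.nvram_base

    def host_read(self, address, length):
        # Reads aimed at the advertised base are served from the cached copy.
        offset = address - self.nvram_base
        if offset < 0 or offset + length > len(self.cached_firmware):
            raise ValueError("read falls outside the cached firmware image")
        return self.cached_firmware[offset:offset + length]


# Placeholder image and address chosen only for the demonstration.
switch = TrappingSwitch(nvram_base=0x1000, cached_firmware=b"\x55\xaa\x01\x02")
base = switch.vf_read_rom_base()
image = switch.host_read(base, 4)
print(hex(base), image.hex())
```

Because `issued_to_endpoint` stays empty, the sketch makes the blocking step visible: the host obtains the full image without any transaction reaching the endpoint.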



FIG. 4 illustrates an embodiment of a virtualized environment 400 including an MR-PCIM 410, one or more host systems 420, an MR-IOV switch 430, and an SR-IOV endpoint 440. MR-PCIM 410 includes a PCI root complex 415 that is operable to connect a processor complex of MR-PCIM 410 to a PCI Express (PCIe) switch fabric such as MR-IOV switch 430, and thereby to gain access to the functionality of SR-IOV endpoint 440. MR-PCIM 410 also includes a storage device 417 for storing expansion ROM firmware from SR-IOV endpoint 440, as described further below. SR-IOV endpoint 440 is similar to SR-IOV endpoint 140, and includes similar elements, including a physical function 450, a virtual function 460, and an expansion ROM 470.


Virtualized environment 400 operates similarly to virtualized environment 100. In particular, when PCI root complex 415 accesses the firmware code stored in expansion ROM 470, MR-PCIM 410 stores the firmware code in storage device 417, and MR-IOV switch 430 stores the base address for the firmware code. Subsequently, when host system 420 attempts to read the firmware code, MR-IOV switch 430 traps the transaction and issues the transaction to be serviced by MR-PCIM 410. In this way, host system 420 obtains access to the firmware code stored in expansion ROM 470, in spite of the fact that virtual function 460 does not provide a native ability to access the firmware code.
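
The division of labor in virtualized environment 400 can be sketched as follows. This is only one possible reading of the paragraph above: the classes `ManagerStorage` and `ForwardingSwitch`, their method names, and the example address and bytes are hypothetical. The assumption is that the switch holds nothing but the base address and hands trapped reads to the manager.

```python
class ManagerStorage:
    """Hypothetical stand-in for storage device 417 of MR-PCIM 410."""

    def __init__(self):
        self.firmware = b""

    def store(self, code):
        self.firmware = code

    def serve(self, offset, length):
        return self.firmware[offset:offset + length]


class ForwardingSwitch:
    """Hypothetical stand-in for MR-IOV switch 430: keeps only the base address."""

    def __init__(self, manager_storage):
        self.manager_storage = manager_storage
        self.rom_base = None

    def record_base(self, base):
        self.rom_base = base

    def host_read(self, address, length):
        # The trapped host read is redirected to the manager, not the endpoint.
        if self.rom_base is None:
            raise RuntimeError("no base address recorded yet")
        return self.manager_storage.serve(address - self.rom_base, length)


storage = ManagerStorage()
switch = ForwardingSwitch(storage)

# The manager's own access populates both pieces of state.
switch.record_base(0xFEB00000)
storage.store(b"\x55\xaa\x10\x20")

# A later host read is serviced from the manager's copy.
image = switch.host_read(0xFEB00000, 4)
print(image.hex())
```

Compared with keeping the image in the switch, this arrangement needs less memory in the switch, but every host read depends on the manager being available to service it.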



FIG. 5 illustrates a generalized embodiment of information handling system 500. For purposes of this disclosure, information handling system 500 can include any instrumentality or aggregate of instrumentalities operable to compute, classify, process, transmit, receive, retrieve, originate, switch, store, display, manifest, detect, record, reproduce, handle, or utilize any form of information, intelligence, or data for business, scientific, control, entertainment, or other purposes. For example, information handling system 500 can be a personal computer, a laptop computer, a smart phone, a tablet device or other consumer electronic device, a network server, a network storage device, a switch, a router, or another network communication device, or any other suitable device and may vary in size, shape, performance, functionality, and price. Further, information handling system 500 can include processing resources for executing machine-executable code, such as a central processing unit (CPU), a programmable logic array (PLA), an embedded device such as a System-on-a-Chip (SoC), or other control logic hardware. Information handling system 500 can also include one or more computer-readable media for storing machine-executable code, such as software or data. Additional components of information handling system 500 can include one or more storage devices that can store machine-executable code, one or more communications ports for communicating with external devices, and various input and output (I/O) devices, such as a keyboard, a mouse, and a video display. Information handling system 500 can also include one or more buses operable to transmit information between the various hardware components.


Information handling system 500 can include devices or modules that embody one or more of the devices or modules described above, and can operate to perform one or more of the methods described above. Information handling system 500 includes processors 502 and 504, a chipset 510, a memory 520, a graphics interface 530, a basic input and output system/extensible firmware interface (BIOS/EFI) module 540, a disk controller 550, a disk emulator 560, an input/output (I/O) interface 570, and a network interface 580. Processor 502 is connected to chipset 510 via processor interface 506, and processor 504 is connected to the chipset via processor interface 508. Memory 520 is connected to chipset 510 via a memory bus 522. Graphics interface 530 is connected to chipset 510 via a graphics interface 532, and provides a video display output 536 to a video display 534. In a particular embodiment, information handling system 500 includes separate memories that are dedicated to each of processors 502 and 504 via separate memory interfaces. An example of memory 520 includes random access memory (RAM) such as static RAM (SRAM), dynamic RAM (DRAM), non-volatile RAM (NV-RAM), or the like, read only memory (ROM), another type of memory, or a combination thereof.


BIOS/EFI module 540, disk controller 550, and I/O interface 570 are connected to chipset 510 via an I/O channel 512. An example of I/O channel 512 includes a Peripheral Component Interconnect (PCI) interface, a PCI-Extended (PCI-X) interface, a high-speed PCI-Express (PCIe) interface, another industry standard or proprietary communication interface, or a combination thereof. Chipset 510 can also include one or more other I/O interfaces, including an Industry Standard Architecture (ISA) interface, a Small Computer System Interface (SCSI) interface, an Inter-Integrated Circuit (I2C) interface, a System Packet Interface (SPI), a Universal Serial Bus (USB), another interface, or a combination thereof. BIOS/EFI module 540 includes BIOS/EFI code operable to detect resources within information handling system 500, to provide drivers for the resources, to initialize the resources, and to access the resources.


Disk controller 550 includes a disk interface 552 that connects the disk controller to a hard disk drive (HDD) 554, to an optical disk drive (ODD) 556, and to disk emulator 560. An example of disk interface 552 includes an Integrated Drive Electronics (IDE) interface, an Advanced Technology Attachment (ATA) interface such as a parallel ATA (PATA) interface or a serial ATA (SATA) interface, a SCSI interface, a USB interface, a proprietary interface, or a combination thereof. Disk emulator 560 permits a solid-state drive 564 to be connected to information handling system 500 via an external interface 562. An example of external interface 562 includes a USB interface, an IEEE 1394 (Firewire) interface, a proprietary interface, or a combination thereof. Alternatively, solid-state drive 564 can be disposed within information handling system 500.


I/O interface 570 includes a peripheral interface 572 that connects the I/O interface to an add-on resource 574 and to network interface 580. Peripheral interface 572 can be the same type of interface as I/O channel 512, or can be a different type of interface. As such, I/O interface 570 extends the capacity of I/O channel 512 when peripheral interface 572 and the I/O channel are of the same type, and the I/O interface translates information from a format suitable to the I/O channel to a format suitable to peripheral interface 572 when they are of a different type. Add-on resource 574 can include a data storage system, an additional graphics interface, a network interface card (NIC), a sound/video processing card, another add-on resource, or a combination thereof. Add-on resource 574 can be on a main circuit board, on a separate circuit board or add-in card disposed within information handling system 500, a device that is external to the information handling system, or a combination thereof.


Network interface 580 represents a NIC disposed within information handling system 500, on a main circuit board of the information handling system, integrated onto another component such as chipset 510, in another suitable location, or a combination thereof. Network interface 580 includes network channels 582 and 584 that provide interfaces to devices that are external to information handling system 500. In a particular embodiment, network channels 582 and 584 are of a different type than peripheral interface 572, and network interface 580 translates information from a format suitable to the peripheral interface to a format suitable to external devices. An example of network channels 582 and 584 includes InfiniBand channels, Fibre Channel channels, Gigabit Ethernet channels, proprietary channel architectures, or a combination thereof. Network channels 582 and 584 can be connected to external network resources (not illustrated). The network resources can include another information handling system, a data storage system, another network, a grid management system, another suitable resource, or a combination thereof.


Although only a few exemplary embodiments have been described in detail herein, those skilled in the art will readily appreciate that many modifications are possible in the exemplary embodiments without materially departing from the novel teachings and advantages of the embodiments of the present disclosure. Accordingly, all such modifications are intended to be included within the scope of the embodiments of the present disclosure as defined in the following claims. In the claims, means-plus-function clauses are intended to cover the structures described herein as performing the recited function and not only structural equivalents, but also equivalent structures.


The above-disclosed subject matter is to be considered illustrative, and not restrictive, and the appended claims are intended to cover any and all such modifications, enhancements, and other embodiments that fall within the scope of the present invention. Thus, to the maximum extent allowed by law, the scope of the present invention is to be determined by the broadest permissible interpretation of the following claims and their equivalents, and shall not be restricted or limited by the foregoing detailed description.

Claims
  • 1. A Peripheral Component Interconnect-Express (PCIe) switch comprising: a memory; and a processor operable to: receive a first transaction from a first information handling system to an endpoint device; determine that the first transaction is a first request to receive firmware code from the endpoint device; block the first transaction from being issued to the endpoint device; and provide the firmware code to the first information handling system from the memory.
  • 2. The PCIe switch of claim 1, wherein: the memory further stores a first base address in the endpoint device, the first base address being to a first location of the firmware code in the endpoint device; and in determining that the first transaction is a request to receive the firmware code from the endpoint device, the processor is further operable to determine that the first transaction is issued to the first base address.
  • 3. The PCIe switch of claim 2, wherein: in providing the firmware code to the first information handling system, the processor is further operable to: provide a second base address to the first information handling system, the second base address being to a second location of the firmware code in the memory; and receive a second transaction from the first information handling system to the second base address; and providing the firmware code to the first information handling system is in response to the second transaction.
  • 4. The PCIe switch of claim 1, wherein the first transaction is to a virtual function of the endpoint device.
  • 5. The PCIe switch of claim 1, wherein the processor is further operable to: prior to receiving the first transaction, receive a second transaction from a second information handling system to the endpoint device; determine that the second transaction is a second request to receive the firmware code from the endpoint device; receive a third transaction from the endpoint device, the third transaction including the firmware code; and store the firmware code in the memory.
  • 6. The PCIe switch of claim 5, wherein the processor is further operable to: prior to receiving the third transaction, receive a fourth transaction from the endpoint device, the fourth transaction including a base address in the endpoint device, the base address being to a location of the firmware code in the endpoint device; and store the base address in the memory.
  • 7. The PCIe switch of claim 5, wherein the second transaction is to a physical function of the endpoint device.
  • 8. The PCIe switch of claim 1, wherein: the PCIe switch is a multi-root I/O virtualization switch; and the endpoint device is a single-root I/O virtualization endpoint device.
  • 9. A Peripheral Component Interconnect-Express (PCIe) switch comprising: a memory; and a processor operable to: receive a first transaction from a first information handling system to an endpoint device; determine that the first transaction is a first request to receive firmware code from the endpoint device; receive a third transaction from the endpoint device, the third transaction including the firmware code; and store the firmware code in the memory.
  • 10. The PCIe switch of claim 9, wherein the processor is further operable to: receive a second transaction from the endpoint device, the second transaction including a base address in the endpoint device, the base address being to a location of the firmware code in the endpoint device; and store the base address in the memory.
  • 11. The PCIe switch of claim 10, wherein the processor is further operable to: receive a third transaction from a second information handling system to an endpoint device; determine that the third transaction is a second request to receive firmware code from the endpoint device; block the third transaction from being issued to the endpoint device; and provide the firmware code to the second information handling system from the memory.
  • 12. The PCIe switch of claim 11, wherein the third transaction is to a virtual function of the endpoint device.
  • 13. The PCIe switch of claim 11, wherein in determining that the third transaction is a request to receive the firmware code from the endpoint device, the processor is further operable to determine that the third transaction is issued to the first base address.
  • 14. The PCIe switch of claim 13, wherein: in providing the firmware code to the second information handling system, the processor is further operable to: provide a second base address to the first information handling system, the second base address being to a second location of the firmware code in the memory; and receive a fourth transaction from the first information handling system to the second base address; and providing the firmware code to the second information handling system is in response to the fourth transaction.
  • 15. The PCIe switch of claim 9, wherein the first transaction is to a physical function of the endpoint device.
  • 16. The PCIe switch of claim 9, wherein: the PCIe switch is a multi-root I/O virtualization switch; and the endpoint device is a single-root I/O virtualization endpoint device.
  • 17. A method comprising: receiving, at a Peripheral Component Interconnect-Express (PCIe) switch, a first transaction from a first information handling system to an endpoint device; determining that the first transaction is a first request to receive firmware code from the endpoint device; blocking the first transaction from being issued to the endpoint device; and providing the firmware code to the first information handling system from a memory of the PCIe switch.
  • 18. The method of claim 17, further comprising: in determining that the first transaction is a request to receive the firmware code from the endpoint device, determining that the first transaction is issued to a first base address in the endpoint device, the first base address being to a first location of the firmware code in the endpoint device.
  • 19. The method of claim 17, further comprising: in providing the firmware code to the first information handling system: providing a second base address to the first information handling system, the second base address being to a second location of the firmware code in the memory; and receiving a second transaction from the first information handling system to the second base address; wherein providing the firmware code to the first information handling system is in response to the second transaction.
  • 20. The method of claim 19, further comprising: prior to receiving the first transaction and the second transaction, receiving a third transaction from a second information handling system to the endpoint device; determining that the third transaction is a second request to receive the firmware code from the endpoint device; receiving a fourth transaction from the endpoint device, the fourth transaction including the firmware code; and storing the firmware code in the memory.