1. Field of the Invention
The present invention relates generally to switches and electronic communication. More specifically, the present invention relates to switching between multiple hosts and multiple synthetic or logical devices in an intelligent PCIe switch.
2. Description of the Related Art
Computer architectures have advanced greatly over the years. Lately, it is becoming more and more commonplace for chip designers to include external data interfaces, such as Universal Serial Bus (USB) interface controllers, in their motherboards. These interfaces are known as host controllers. The processor is then typically connected to the other components of the computer system via an input/output (I/O) interconnect system.
There are many different computer I/O interconnect standards available. One of the most popular over the years has been the peripheral component interconnect (PCI) standard. PCI allows the bus to act like a bridge, which isolates a local processor bus from the peripherals, allowing a Central Processing Unit (CPU) of the computer to connect to a host of IO devices through this interconnect.
Recently, a successor to PCI has been popularized, termed PCI Express (or, simply, PCIe). PCIe provides higher performance, increased flexibility and scalability for next-generation systems, while maintaining software compatibility with existing PCI applications. Compared to legacy PCI, the PCI Express protocol is considerably more complex, with three layers—the transaction, data link and physical layers.
In a PCI Express system, a root complex device connects the processor and memory subsystem to the PCI Express switch fabric comprised of one or more switch devices (embodiments are also possible without switches, however). In PCI Express, a point-to-point architecture is used. Similar to a host bridge in a PCI system, the root complex generates transaction requests on behalf of the processor, which is interconnected through a local I/O interconnect. Root complex functionality may be implemented as a discrete device, or may be integrated with the processor. A root complex may contain more than one PCI Express port and multiple switch devices can be connected to ports on the root complex or cascaded.
One problem in the prior art is that there are significant limitations on the ability to perform a remote booting operation. Conventionally, an Expansion ROM (also known as an Option ROM) is required to perform a boot operation. The Expansion ROM typically consists of firmware and may, for example, reside on a physical card. The Expansion ROM is loaded very early in a boot process. Typically a particular host architecture requires its own Expansion ROM. For example, a host system using an Intel-based chip architecture requires a different Expansion ROM than a host having a non-Intel-based chip architecture.
The conventional approach for remote booting generally involves having a device that is capable of remote booting (typically a network device) and having the expansion ROM boot proxy code physically present in a device that is physically and directly connected to that host. In particular, these devices are not shared among multiple host servers. In a normal network booting process, when the processor is powered on, the pre-OS environment, such as BIOS, UEFI or OpenBoot, starts executing. This environment enumerates all the devices on the system and chooses a device from which to boot the operating system. This choice can be made by the user, or the system can go through a list of devices in sequence to boot the system.
Consider the case when the booting occurs over a network device. When a network device is chosen, the expansion ROM on the device is located and executed. The expansion ROM uses the DHCP protocol to obtain, among other things, an IP address, the boot image, and the boot server. It then uses the TFTP protocol to download the boot image from the boot server and boot the operating system.
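By way of illustration only, the TFTP download step mentioned above begins with a read request. The following minimal Python sketch builds and parses such a request using the packet layout of RFC 1350 (a two-byte opcode, the file name, a NUL byte, the transfer mode, and a NUL byte). The function names and the file name "boot.img" are hypothetical and do not come from the present disclosure.

```python
import struct
from typing import Tuple

TFTP_OPCODE_RRQ = 1  # read request opcode defined by RFC 1350


def build_tftp_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a TFTP read request: opcode, file name, NUL, mode, NUL."""
    return (
        struct.pack("!H", TFTP_OPCODE_RRQ)
        + filename.encode("ascii") + b"\x00"
        + mode.encode("ascii") + b"\x00"
    )


def parse_tftp_rrq(packet: bytes) -> Tuple[str, str]:
    """Return (file name, mode) from a read request, or raise ValueError."""
    if len(packet) < 2:
        raise ValueError("packet too short")
    (opcode,) = struct.unpack("!H", packet[:2])
    if opcode != TFTP_OPCODE_RRQ:
        raise ValueError("not a TFTP read request")
    # A well-formed request yields: file name, mode, and a trailing empty field.
    parts = packet[2:].split(b"\x00")
    if len(parts) < 3:
        raise ValueError("malformed read request")
    return parts[0].decode("ascii"), parts[1].decode("ascii")
```

A relay that forwards such requests would need at least this much parsing in order to decide which boot image a host is asking for.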
Generally speaking, in the prior art, a system can boot using a remote boot only under the following two conditions:
With PCI Express-based sharing of IO devices among multiple connected hosts, the above conditions restrict the remote booting capability of a connected host.
The inventors of the present application have recognized that a PCIe switch may implement a logical device/virtual device functionality to present a synthetic device to a connected host. As an example,
In one aspect of the invention, a method for remote booting using a synthetic device capability is disclosed. In one embodiment a synthetic expansion ROM capability is provided for remote booting, thus eliminating the requirement of a physical expansion ROM. Additionally, booting of different host architectures may be supported. An exemplary system includes a management CPU and associated memory. The host management software presents a synthetic device to a remote host, where the synthetic device includes at least one extension to support remote booting.
In one embodiment a method of remote booting over PCI Express using a synthetic remote boot capability is disclosed. A management host software system intercepts probe requests from a host and provides information required for a remote boot. The management host software system may include expansion ROM information to support different host architectures. A synthetic device booting capability may be shown to a host, including the expansion ROM information. Additional support for DHCP and TFTP protocols may be provided.
Reference will now be made in detail to specific embodiments of the invention, including the best modes contemplated by the inventors for carrying out the invention. Examples of these specific embodiments are illustrated in the accompanying drawings. While the invention is described in conjunction with these specific embodiments, it will be understood that it is not intended to limit the invention to the described embodiments. On the contrary, it is intended to cover alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims. In the following description, specific details are set forth in order to provide a thorough understanding of the present invention. The present invention may be practiced without some or all of these specific details. In addition, well known features may not have been described in detail to avoid unnecessarily obscuring the invention.
In accordance with the present invention, the components, process steps, and/or data structures may be implemented using various types of operating systems, programming languages, computing platforms, computer programs, and/or general purpose machines. In addition, those of ordinary skill in the art will recognize that devices of a less general purpose nature, such as hardwired devices, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), or the like, may also be used without departing from the scope and spirit of the inventive concepts disclosed herein. The present invention may also be tangibly embodied as a set of computer instructions stored on a computer readable medium, such as a memory device.
The present application is generally related to using a synthetic device capability in a PCIe switch to support remote booting. General background on providing a synthetic device capability for other purposes is described in U.S. Pub. No. 2013/0024595, “PCI Express Switch with Logical Device Capability,” which is hereby incorporated by reference for all purposes. U.S. Pub. No. 2013/0024595 describes a capability for enabling operation of a synthetic device in a PCIe switch, including presenting the synthetic device to a local host component connected to the switch. Additionally, commonly owned U.S. patent application Ser. No. 14/106,579, “Switch with Synthetic Device Capability,” is also hereby incorporated by reference for background information on providing a synthetic device capability in a PCI Express switch fabric environment, as is U.S. patent application Ser. No. 13/624,871, “PCIe Express Switch With Logical Device Capability.” The present invention adds new features and capabilities to remote booting on a shared IO PCI Express fabric.
The switch 312 thus allows creating and presenting synthetic devices and device extensions that include expansion ROM capabilities. In an embodiment of the present invention, the switch supports a multi-architecture expansion ROM capability in which the expansion ROM information required to boot an arbitrary number of different host architectures (e.g., Host Architectures 1, 2 . . . N) is supported. In embodiments of the present invention the expansion ROM functions may be implemented by the management host 330 as DMA functions (in software), with a proxy PXE server included as part of the ExpressFabric management software. The proxy PXE server either connects to a real PXE server on a network that the ExpressFabric management software can reach, or uses local storage to provide boot images.
Additionally, in one embodiment an exemplary set of components further includes:
Note that it is also possible to store pre-made boot images for various hosts in the management software (as a database stored in a flash/storage medium connected to the management agent of the PCI Express fabric), and to serve such a boot image instead of going to an outside PXE server. Both of the above methods are example embodiments of the present invention.
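The two ways of obtaining a boot image described above (a locally stored pre-made image, or an outside PXE server) can be sketched as follows. This is a minimal, hypothetical Python model: the class name, the host identifiers, and the choice to consult local storage first are assumptions made for illustration and are not specified by the present disclosure. The outside server is represented by a caller-supplied function rather than by real network code.

```python
from typing import Callable, Dict, Optional

# Hypothetical callback that asks an outside server for a host's boot image.
FetchFn = Callable[[str], Optional[bytes]]


class BootImageSource:
    """Look up a boot image locally first, then fall back to an outside server."""

    def __init__(self, local_images: Dict[str, bytes], fetch_remote: FetchFn):
        self._local = dict(local_images)    # pre-made images keyed by host id
        self._fetch_remote = fetch_remote   # used only when no local image exists

    def get(self, host_id: str) -> Optional[bytes]:
        image = self._local.get(host_id)
        if image is not None:
            return image
        return self._fetch_remote(host_id)
```

Passing the remote lookup in as a function keeps the sketch independent of any particular PXE or TFTP client.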
In one embodiment, remote booting is made fully configurable and managed by the management host software (and not by firmware or hardware) from stored memory images. This permits a connected host, such as host 320, 324, or 328, to overcome the limitations of conventional PCI and boot over the network. Some aspects of embodiments of the present invention include:
A physical device need not exist. The ExpressFabric can present a synthetic device that is provided by the management software. Some aspects and benefits may include:
One aspect of the remote booting approach is that the synthetic device presentation capabilities of the PCI Express switch also allow a synthetic network booting capability to be shown to a connected host. This capability can be shown to exist on either a real or a synthetic device. The connected host (e.g., host 320, 324, or 328) discovers this capability and uses it to boot. In the prior art, remote booting is typically enabled through the expansion ROM capability of the PCI configuration space of devices that support remote booting. System boot code (BIOS), while scanning the devices present at the time of booting, looks at this expansion ROM capability and, if it is configured, executes the corresponding memory image. In contrast, in the present invention the expansion ROM capability is provided via a synthetic device capability. The synthetic device capability allows the switch to add the functionality of the expansion ROM to the device even if the real device does not have an expansion ROM. It is also possible to present a device with these capabilities even when the device does not exist.
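For background on what a host's boot code expects when it reads an expansion ROM, the following Python sketch walks a ROM made up of several images and returns the image whose code type matches the requesting environment. The layout used (a 0x55 0xAA signature, a pointer at offset 0x18 to a "PCIR" data structure, an image length in 512-byte units, a code type byte, and a last-image indicator bit) reflects the conventional PCI expansion ROM format as understood here and is not taken from the present disclosure. The helper names are hypothetical, and the images built by make_image contain headers only.

```python
import struct
from typing import Optional

ROM_SIGNATURE = b"\x55\xaa"      # first two bytes of each expansion ROM image
PCIR_SIGNATURE = b"PCIR"         # signature of the PCI data structure
CODE_TYPE_X86 = 0x00             # x86 PC-AT compatible code
CODE_TYPE_OPEN_FIRMWARE = 0x01   # Open Firmware code
CODE_TYPE_EFI = 0x03             # EFI image
PCIR_MIN_SIZE = 0x18             # bytes of the data structure read below


def make_image(code_type: int, last: bool, blocks: int = 1) -> bytes:
    """Build a header-only image of `blocks` * 512 bytes for illustration."""
    image = bytearray(blocks * 512)
    image[0:2] = ROM_SIGNATURE
    pcir = 0x1C
    struct.pack_into("<H", image, 0x18, pcir)            # pointer to PCIR
    image[pcir:pcir + 4] = PCIR_SIGNATURE
    struct.pack_into("<H", image, pcir + 0x10, blocks)   # length in 512-byte units
    image[pcir + 0x14] = code_type
    image[pcir + 0x15] = 0x80 if last else 0x00          # bit 7 marks the last image
    return bytes(image)


def find_image(rom: bytes, code_type: int) -> Optional[bytes]:
    """Return the first image with the given code type, or None if absent."""
    offset = 0
    while offset + 0x1A <= len(rom):
        if rom[offset:offset + 2] != ROM_SIGNATURE:
            return None
        (pcir_rel,) = struct.unpack_from("<H", rom, offset + 0x18)
        pcir = offset + pcir_rel
        if pcir + PCIR_MIN_SIZE > len(rom) or rom[pcir:pcir + 4] != PCIR_SIGNATURE:
            return None
        (blocks,) = struct.unpack_from("<H", rom, pcir + 0x10)
        length = blocks * 512
        if rom[pcir + 0x14] == code_type:
            return rom[offset:offset + length]
        if rom[pcir + 0x15] & 0x80 or length == 0:
            return None   # last image reached, or a length that cannot advance
        offset += length
    return None
```

A synthetic expansion ROM served by management software would have to present the same kind of header to a host, since the host's boot code performs an equivalent check before executing the image.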
An example of a method of implementing remote booting over a real device with a synthetic capability or a synthetic device over a network is now described. This method is architecture agnostic and can be used for booting processors of different kinds over the same device.
An example management software configuration will now be described. In this example, the management software acts as a DHCP and TFTP relay, i.e., the management software will forward packets, with appropriate transformations, to outside servers to handle the boot request. These packets will include DHCP requests and TFTP requests.
In one embodiment the management software is configured with three things for booting with synthetic network devices:
In one embodiment, when the host boots up, it starts at the PCIe root complex and probes for all devices. These probes are intercepted by the management software, which provides the host with a device hierarchy. The hierarchy may be a real hierarchy consisting of real devices, a synthetic hierarchy consisting of synthetic devices, or a combination of real and synthetic devices. Once the host finds a PCIe device, it probes the capabilities of the device. This probe is intercepted by the management software, which returns the capabilities. Once again the management software can return real capabilities (if the device is real), synthetic capabilities (for real or synthetic devices) or a combination of real and synthetic capabilities. If a host is configured with a synthetic device for remote booting, a synthetic device is presented in the device hierarchy of the host. When the host probes the capabilities of this synthetic device, the management software will return capabilities that indicate the presence of the expansion ROM. When the host tries to read the expansion ROM, the read is intercepted by the management software, which returns the correct bit stream depending on the architecture of the host.
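The interception sequence described above can be sketched, under stated assumptions, as two handlers: one answering a capability probe and one answering an expansion ROM read. The Python model below is hypothetical. The class, method, and architecture names are invented for illustration, and real management software would operate on PCIe configuration and memory transactions rather than on Python dictionaries.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SyntheticBootDevice:
    """Hypothetical view of one synthetic boot device held by management software."""

    rom_by_arch: Dict[str, bytes]   # architecture name -> expansion ROM bit stream
    arch_by_host: Dict[str, str]    # host id -> architecture (configured hosts only)

    def on_capability_probe(self, host_id: str) -> Dict[str, bool]:
        """Intercepted capability probe: advertise a ROM only to configured hosts."""
        return {"expansion_rom": host_id in self.arch_by_host}

    def on_rom_read(self, host_id: str, offset: int, length: int) -> bytes:
        """Intercepted ROM read: serve bytes from the ROM for the host's architecture."""
        rom = self.rom_by_arch[self.arch_by_host[host_id]]
        return rom[offset:offset + length]
```

Keying the ROM on the host's configured architecture is what lets a single synthetic device serve hosts of different kinds, which is the architecture-agnostic behavior the text describes.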
Once the host reads the expansion ROM for network boot, it will follow the standard protocols of DHCP and TFTP to start the PXE booting. The management software intercepts the writes to the synthetic device and forwards the protocol data over a connected Ethernet network.
Additional configuration and execution details are now described for an embodiment of the present invention. In one embodiment the management software is configured with the details of which hosts need a synthetic network booting device, the architecture of each such host, and the architecture-specific expansion ROM needed for that host. The management software is also configured to know how to handle the expansion ROM requests. For example, if the host sends a DHCP request, the management software can be configured to forward it or reply to it. It can also reply to or relay other requests, such as TFTP requests.
When the host boots, it will probe for PCIe devices. This probe is intercepted by the management software and, depending on the configuration, the host is presented with a synthetic device with expansion ROM capabilities. The host will then read the capabilities register to check for expansion ROM capability. This read is again intercepted by the management CPU and, depending on the configuration, the device can be shown to have expansion ROM capability. The host will then read the expansion ROM. This read is trapped once again by the management software, which returns the appropriate expansion ROM depending on the host architecture.
Network booting of a host involves a PXE boot that starts with a DHCP request. The management software can trap the DHCP request and either forward the packet to an existing DHCP server or return a canned DHCP reply to the host. Once the host receives the DHCP reply, which consists of the host's IP address, the TFTP server's name, and other system parameters, it will connect to the TFTP server to read the operating system bit stream. The management software can, once again, forward the connections to an existing TFTP server or return a canned reply from local resources.
Additional alternate embodiments are contemplated. As previously discussed, the present application is generally related to using a synthetic device capability in a PCIe switch to support remote booting. The potential for the PCIe switch to include shared IO devices and the use of DMA functions was previously discussed. Additional device capabilities were discussed in detail in the patents and patent applications of the assignee that are incorporated by reference. Consequently, it will be understood that the synthetic expansion ROM capability may be included in implementations that include a shared IO device or built-in DMA functions of ExpressFabric, instead of creating a whole new synthetic device for this purpose.
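The choice between forwarding a trapped DHCP request and returning a canned reply can be sketched as follows. This minimal Python model is hypothetical: the policy names, the BootReply fields, and the handler signature are assumptions made for illustration, and DHCP packets are abstracted to the three items the text mentions (the host's IP address, the TFTP server, and a boot file). The addresses used in the usage example are documentation-only addresses.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class DhcpPolicy(Enum):
    FORWARD = "forward"   # relay the request to an existing DHCP server
    CANNED = "canned"     # answer from locally configured data


@dataclass(frozen=True)
class BootReply:
    host_ip: str
    tftp_server: str
    boot_file: str


def handle_dhcp_request(
    host_id: str,
    policy: DhcpPolicy,
    canned_replies: Dict[str, BootReply],
    forward: Callable[[str], BootReply],
) -> BootReply:
    """Return the reply for a trapped DHCP request according to the policy."""
    if policy is DhcpPolicy.CANNED:
        return canned_replies[host_id]
    return forward(host_id)


# Usage: one host answered locally, without contacting any outside server.
canned = {"host320": BootReply("192.0.2.10", "192.0.2.1", "boot.img")}
reply = handle_dhcp_request(
    "host320", DhcpPolicy.CANNED, canned,
    forward=lambda host_id: BootReply("192.0.2.20", "192.0.2.2", "other.img"),
)
```

The same forward-or-answer-locally split could be applied to trapped TFTP requests, with the canned path serving a boot image from local resources.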
While this invention has been described in terms of several preferred embodiments, there are alterations, permutations, modifications, and various substitute equivalents, which fall within the scope of this invention. It should also be noted that there are many alternative ways of implementing the methods and apparatuses of the present invention. It is therefore intended that the following appended claims be interpreted as including all such alterations, permutations, and various substitute equivalents as fall within the true spirit and scope of the present invention.