[Not Applicable]
Certain embodiments of the invention relate to media access control. More specifically, certain embodiments of the invention relate to a method and system for fast Ethernet controller operation using a virtual CPU.
High-speed digital communication networks over copper and optical fiber are used in many network communication and digital storage applications. Ethernet and Fiber Channel are two widely used communication protocols, which continue to evolve in response to increasing demands for higher bandwidth in digital communication systems.
The Ethernet protocol may provide collision detection and carrier sensing in the physical layer. The physical layer, layer 1, is responsible for handling all electrical, optical, opto-electrical and mechanical requirements for interfacing to the communication media. Notably, the physical layer may facilitate the transfer of electrical signals representing an information bitstream. The physical layer may also provide services such as encoding, decoding, synchronization, clock data recovery, and transmission and reception of bit streams.
As the demand for higher data rates and bandwidth continues to increase, equipment vendors are continuously being forced to employ new design techniques for manufacturing network layer 1 equipment capable of handling these increased data rates. Chip real estate and printed circuit board (PCB) real estate are generally extremely expensive. Accordingly, the use of available chip and PCB real estate is a critical fabrication consideration when designing chips and/or circuit boards. Particularly in high speed applications operating at high frequencies, a high device count and pin count may result in designs that are susceptible to interference. Notably, high device and pin counts may significantly increase chip real estate and, accordingly, significantly increase implementation cost.
The embedded processors in network communication chips are a substantial part of the overall chip cost and may utilize large areas of the die. Embedded processors such as MIPS or ARM processors may add substantial cost to the chip-making process through licensing fees, and may have unnecessary capabilities when utilized in specific applications.
Further limitations and disadvantages of conventional and traditional approaches will become apparent to one of skill in the art, through comparison of such systems with the present invention as set forth in the remainder of the present application with reference to the drawings.
A system and/or method for fast Ethernet controller operation using a virtual CPU, substantially as shown in and/or described in connection with at least one of the figures, as set forth more completely in the claims.
Various advantages, aspects and novel features of the present invention, as well as details of an illustrated embodiment thereof, will be more fully understood from the following description and drawings.
Certain aspects of the invention may be found in a method and system for fast Ethernet controller operation using a virtual CPU. Exemplary aspects of the invention may include controlling an on-chip Ethernet controller utilizing a virtual CPU comprising a microcode engine that loads a single instruction and executes this single instruction prior to loading or executing a subsequent instruction. The instructions may be fetched by the virtual CPU from an external non-volatile memory or on-chip ROM. The virtual CPU may initialize the Ethernet controller and provide patches for supporting hardware workarounds, wake on LAN service, and vital product data such as serial number, product name, manufacturer and related manufacturing data. The virtual CPU may power down the Ethernet controller and may be halted via a particular command and procedure.
The host 101 may comprise suitable circuitry, logic, and/or code that may be enabled to communicate with a LAN utilizing the Ethernet controller 100. The host 101 may comprise a computer, for example, and may communicate with the Ethernet controller 100 via the PCI Express interface 103.
The PCI Express interface 103 may comprise suitable circuitry, logic and/or code that may be enabled to provide a serial interconnect between the host 101 and the memory controller 105. Communication via the PCI Express interface 103 may comply with the PCI-SIG standard.
The memory controller 105 may comprise suitable circuitry, logic and/or code that may be enabled to control access of the VCPU 111 and/or the host 101 to the EEPROM 115 and the MAC and PHY 109 and 113, respectively. The memory controller 105 may comprise a general register controller (GRC) and may allow transmission and reception of single characters, configuration and status readback. The memory controller 105 may respond to write and read requests received from the host 101, the VCPU 111 or other peripheral devices not shown.
The EEPROM interface 107 may comprise suitable circuitry, logic and/or code that may be enabled to send and receive signals from the EEPROM 115 and the memory controller 105. The EEPROM interface 107 may communicate read, write, or EEPROM reset data, for example, to the EEPROM 115, and may receive data to be written to the EEPROM 115 from the memory controller 105.
The EEPROM 115 may comprise suitable circuitry, logic and/or code that may be enabled to store data, such as device configuration information, that may require storage in a non-volatile memory. The device configuration information may include but is not limited to, MAC address, device ID, vendor ID, subsystem vendor ID, subsystem device ID, vital product data (VPD), boot code data and power up boot code. The storage of this data on an external EEPROM may significantly reduce memory requirements of the VCPU 111. In addition, patch code may be stored in the EEPROM 115 to provide a workaround for hardware errors.
The VCPU 111 may comprise suitable circuitry, logic, and/or code that may be adapted to perform the functions of a CPU, but without the dedicated hardware necessary for a CPU core. The VCPU 111 may be utilized to replace embedded CPUs such as MIPS or ARM. The VCPU 111 may comprise a microcode engine with a series of instruction executions for functions such as device initialization, VPD, wake on LAN (WOL) and alert standard format (ASF) services. By storing data on an external non-volatile memory, such as the EEPROM 115, the die size requirements for the VCPU 111 may be significantly smaller than those of an embedded processor, such as MIPS or ARM. The VCPU 111 may not require on-chip cache or program/data memory, and may fetch and execute instructions one-by-one, thus providing an application specific processor without extra functionality that is not needed for a particular application. Use of the VCPU 111 may not only result in reduced die size requirements, but may also result in significant cost reduction. In the present invention, the VCPU 111 may be used in an Ethernet controller 100, but the application of the VCPU 111 is not limited in this regard.
The MAC 109 may comprise suitable circuitry, logic, and/or code that may be enabled to provide addressing and channel access control mechanisms, and serve as an interface between the memory controller 105 and the PHY 113. The MAC 109 may generate and parse physical frames to be transmitted to a network by the PHY 113. The PHY 113 may comprise suitable circuitry, logic, and/or code that may be enabled to send and receive signals to/from a LAN or other network. The PHY 113 may control a manner in which data may be transmitted and/or received over the physical connection, such as a twisted pair cable. The MAC 109 and the PHY 113 may communicate using a media independent interface (MII) protocol. The PHY 113 may control transmission criteria such as line speed, duplex mode, modulation scheme, etc.
In operation, the Ethernet controller 100 may be utilized to communicate over a LAN or other network. The VCPU 111 may perform control functions within the Ethernet controller 100 that may include, but are not limited to, hardware initialization, runtime workaround support, WOL support, boot agent setup, etc. Runtime workaround support may be utilized in instances where there are hardware errors or requirement changes after the chip has been fabricated, thus eliminating multiple hardware changes. WOL support may be utilized to wake a sleeping system by sending a wake signal over the network. This may allow for remote maintenance or upgrading of multiple computers on a network, eliminating the need to be physically present at the location of every system. Boot agent support may allow a networked system to boot using program code supplied by a remote server. The EEPROM 115 may store non-changing data that may be utilized by the VCPU 111 to control the Ethernet controller 100. The host 101 may communicate with the Ethernet controller 100 via the PCI Express interface 103, and the Ethernet controller 100 may communicate with a LAN or other network via the PHY 113.
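By way of illustration only, the wake signal may take the form of a Magic Packet, which the WOL service description names as the detected event. The widely used Magic Packet layout is six 0xFF bytes followed by sixteen repetitions of the target MAC address. The following minimal sketch builds and recognizes such a payload; the function names are hypothetical and are not part of the described embodiment.

```python
def build_magic_packet(mac: str) -> bytes:
    """Return the Magic Packet payload for a MAC such as '00:11:22:33:44:55'."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("a MAC address must be exactly 6 bytes")
    # Synchronization stream (6 x 0xFF) followed by 16 copies of the MAC.
    return b"\xff" * 6 + mac_bytes * 16


def is_magic_packet(payload: bytes, mac: str) -> bool:
    """Return True if the pattern for `mac` appears anywhere in the payload."""
    return build_magic_packet(mac) in payload
```

The pattern may appear at any offset within a frame, which is why the check searches the whole payload rather than comparing from the first byte.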
The other functions block 205 may comprise interfaces to other functional blocks such as a host CPU, an Ethernet controller, or other circuitry requiring an embedded processor. The VCPU 203 may access the other functions block 205 and the external EEPROM/Flash 209 via the VCPU register interface 215.
The VCPU 203 may comprise a series of instruction executions. In an exemplary embodiment, each instruction may comprise parity bits, op-code, register address and write data/read data operand. Table 1 below illustrates the structure of the 64-bit instruction utilized by the VCPU 203.
In instances where the op-code is for the register operation, the 17-bit address field may be the register byte address. In instances where the op-code may be for the jump operation, the address field may be the ROM/EEPROM/Flash byte address. Bit 24 may be used to indicate whether to utilize the on-chip ROM 201 address or the external EEPROM/Flash 209 address. If bit 24 is set, it may indicate the on-chip ROM address. Otherwise, it may indicate the external EEPROM/Flash address. The total address mapping range may be from 0 to 128 KB. Exemplary definitions for the op-code are shown below in Table 2.
¹ The holding register may be a 32-bit internal register that temporarily stores the value read by the load instruction, which may later be modified or used as write data by the store instruction.
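The jump-target decoding described above may be sketched as follows. This is an illustration under a stated assumption: the 17-bit address field is taken to occupy bits [16:0] of the same word that carries bit 24. The actual field positions are those given in Table 1, and the names used here are hypothetical.

```python
from typing import Tuple

ADDR_BITS = 17                     # address field width stated in the text
ADDR_MASK = (1 << ADDR_BITS) - 1   # 0x1FFFF, i.e. byte addresses 0 .. 128 KB - 1
ROM_SELECT_BIT = 24                # set: on-chip ROM, clear: external EEPROM/Flash


def decode_jump_target(word: int) -> Tuple[str, int]:
    """Split a jump operand into (memory, byte_address).

    ASSUMPTION: the address field sits in bits [16:0] of `word`.
    """
    memory = "rom" if (word >> ROM_SELECT_BIT) & 1 else "external"
    return memory, word & ADDR_MASK
```

A 17-bit byte address spans exactly 2^17 = 131072 bytes, which matches the 128 KB mapping range stated above.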
An exemplary definition of r1 and the register address field is shown below in Table 3.
The VCPU 203 may access the other function blocks 205 and external EEPROM/Flash 209 via the VCPU register interface 215. Other CPUs may access the VCPU 203 via a CPU register interface 213. The dedicated on-chip ROM 201 may be utilized for the VCPU 203 microcode, and may be connected through a ROM interface 211. Therefore, this microcode structure may be exploited to achieve greater flexibility and reduced cost, since developers may customize the VCPU 203 microcode by partitioning the microcode in the on-chip ROM 201 and the external EEPROM/Flash 209 to reach cost and scalability goals.
The VCPU 203 may be different from common embedded processors/microcontrollers, which load the whole executable program code or microcode to the on-chip program memory (e.g., SRAM) and then execute it. Instead, the VCPU 203 may fetch one microcode instruction, execute it instantly, and then proceed to fetch and execute the next instruction. Hence a software image loader as well as program memory are not needed. Instead, a hardware finite-state-machine (FSM) may act as an image loader to fetch the microcode instructions from the external EEPROM/Flash 209 or on-chip ROM 201 in terms of the context instructions.
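The fetch-one, execute-one behavior may be modeled in software as a minimal sketch. The op names WRITE, JUMP and IDLE, the tuple shape of an instruction, and the use of an instruction index instead of a byte address for the program counter are all assumptions made for illustration; the actual op-codes are those defined in Table 2.

```python
from typing import Callable, Dict, Tuple

Instruction = Tuple[str, int, int]   # (op, address, data), hypothetical shape


def run(fetch: Callable[[int], Instruction],
        registers: Dict[int, int],
        max_steps: int = 1000) -> int:
    """Fetch and execute one instruction at a time; return the final PC."""
    pc = 0
    for _ in range(max_steps):
        op, addr, data = fetch(pc)   # only this one instruction is ever held
        if op == "IDLE":             # stop when an IDLE instruction is fetched
            break
        if op == "WRITE":            # register write
            registers[addr] = data
            pc += 1
        elif op == "JUMP":           # continue fetching from another location
            pc = addr
        else:
            raise ValueError(f"unknown op {op!r}")
    return pc
```

For example, with `image = [("WRITE", 0x400, 1), ("JUMP", 3, 0), ("WRITE", 0x404, 2), ("WRITE", 0x408, 3), ("IDLE", 0, 0)]`, calling `run(lambda pc: image[pc], regs)` never copies the image into a program memory; the `fetch` callable stands in for the hardware FSM that reads the ROM or EEPROM.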
The VCPU 203 may offer a level of debug capability. The program counter which traces the location of the executed instruction may be read. Other CPUs or a debug agent may halt the VCPU 203 at any point. When a halt control register is set, the VCPU 203 may complete the current instruction fetch and execution, and then stop in the location of the present instruction. If the halt control register is cleared, the VCPU 203 may resume from the next instruction. In addition, the microcode may be protected by the parity bits. Any code corruption may be detected by the parity bits in the instruction. Any abnormal behavior may be alarmed and sent to the status register for debugging purposes.
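The parity protection may be illustrated with the following minimal sketch. The layout is purely hypothetical: eight parity bits are assumed to occupy bits [63:56] of the 64-bit instruction, each covering one 7-bit group of the remaining 56 bits with even parity. The actual parity definitions are those given in the tables of the specification.

```python
PAYLOAD_BITS = 56
PAYLOAD_MASK = (1 << PAYLOAD_BITS) - 1


def parity_bit(value: int) -> int:
    """Return the bit that makes the total number of 1s even."""
    return bin(value).count("1") & 1


def attach_parity(payload: int) -> int:
    """Prepend 8 parity bits (hypothetical layout) to a 56-bit payload."""
    if not 0 <= payload <= PAYLOAD_MASK:
        raise ValueError("payload must fit in 56 bits")
    parity = 0
    for k in range(8):
        group = (payload >> (7 * k)) & 0x7F   # k-th 7-bit group
        parity |= parity_bit(group) << k
    return (parity << PAYLOAD_BITS) | payload


def check_parity(instruction: int) -> bool:
    """Return True if the stored parity bits match the payload."""
    return attach_parity(instruction & PAYLOAD_MASK) == instruction
```

Under this scheme any single-bit corruption, whether in the payload or in the parity bits themselves, causes `check_parity` to return False, which corresponds to the condition that may be reported to the status register.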
In operation, the VCPU Ethernet controller 300 operates in various manners distinctly different from a standard embedded CPU Ethernet controller. For example, in accordance with one embodiment of the invention, instead of copying the boot code from an external EEPROM to a CPU code memory (on-chip RAM) and executing after the download, the VCPU 309 may perform an instruction-by-instruction load and execution. Whenever a 64-bit instruction is loaded from the external EEPROM 313 or an on-chip VCPU ROM, it may be executed instantly. The next instruction may then be fetched and executed. This process may continue until the code ends or an IDLE instruction is fetched. Therefore, no data and/or code memory is needed for VCPU implementation. In addition, there is no need for a larger VCPU ROM for EEPROM image load function, since it is loaded and executed one line at a time. A hardware finite-state-machine (FSM) may act as an image loader.
In the VCPU Ethernet Controller 300, the VCPU 309 may access the other function blocks (e.g. 10/100 Mbps Ethernet PHY and MAC) and the external EEPROM 313 via the GRC 305 register master interface. The host CPU 301 and the serial interface 307 may access the VCPU 309 registers via the GRC register slave interface. There may be a dedicated on-chip ROM for the VCPU 309 microcode, which may be connected with the VCPU 309 through a ROM interface, which is included in the VCPU 309. The hardware self-boot code, which may comprise microcode instructions, as described with respect to
The EEPROM image may have two exemplary formats as shown in
The definitions for each of the sections of
Exemplary definitions for the configuration bits, row 0x04 [31:0] are illustrated below in Table 5.
Exemplary definitions for the parity bits, row 0x08 [31:24] of the exemplary format 325 are shown below in Table 6.
The Misc_configuration row of the exemplary format 325 may be reserved for future expansion. Misc_configuration may be copied to a VCPU configuration shadow2 register during the device boot. The default value may equal 0.
The MBA_configuration bits of row 0x0c [31:0] may be copied to a VCPU configuration register during the device boot. The default value may equal 0. Exemplary definitions for the MBA configuration bits are shown below in Table 7.
Exemplary definitions of the parity bits of row 0x10 [29:16] of the exemplary format 325 are illustrated in Table 8.
The instructions used to program the MAC address register (0x410-0x417), Vendor ID register (0x0), Device ID register (e.g. 0x2), Subsystem Vendor ID register (e.g. 0x2C), and Subsystem Device ID register (e.g. 0x2E) may reside in a fixed location on the external EEPROM 313. The VCPU 309 or the host CPU 301 may read/write the device parameters directly from/to the external EEPROM 313.
The microcode instructions for device initialization, service and patch, shown in rows 0x20-0xFF of the exemplary format 325, may comprise device initialization, service (e.g. WOL), and workaround code.
The VCPU 309 may be used to execute the instructions in the external EEPROM 313 or the internal VCPU ROM. Since the user data, such as the MAC address and MBA configuration data, may be simpler to program and update if stored in one location without instructions, the VCPU 309 may be customized to read the user data from the external EEPROM 313 and write the data to the associated registers. This user data may also be protected by the parity bits. In addition, by utilizing this EEPROM data structure, the code and the boot time may be shortened.
To leverage the flexible architecture of the VCPU 309, the fixed function microcode, such as device initialization code, VPD service code, WOL service code, and device power-down service code, may be placed in the internal VCPU ROM. The dynamic function microcode, such as user data, workarounds and new applications/services, may be placed in the external EEPROM 313. This may significantly reduce the external EEPROM 313 size. Moreover, the device boot time may be shortened because fewer instructions may be stored in the slower external EEPROM 313.
Following device initialization in step 407, the process may jump back to the EEPROM 313 in step 409 to determine whether there may be any workaround code for the device initialization and to begin execution. If not, the process may read the polling code to monitor service events such as a powerdown magic service event, a WOL service event and/or a VPD service event. In step 411, if a powerdown_magic code is present during the service polling in step 409, the process may proceed to step 421 where the powerdown_magic code may be cleared and an LED control register may be set to turn off the status LEDs. This process may be followed by a device power down in step 423, and a halt VCPU command in step 425, followed by stop step 427. If, however, there is no powerdown_magic code present during service polling, the process may proceed to step 413 where the absence of a Vmain signal may force the process to step 417 for a WOL service. The WOL service may be described further with respect to
If the VPD event signal is not present in step 415, the process may step back to step 411 to test for a powerdown_magic code again.
After the WOL service of step 417, if WOL is enabled, the process may proceed to step 425 where the VCPU 309 may be halted, followed by stop step 427. If, however, WOL is disabled, the process may proceed to step 423, or power down, before proceeding to halt the VCPU 309 process in step 425 followed by stop step 427.
Following each of the service steps, WOL service in step 417, Powerdown_Magic service in step 411, and/or VPD service in step 419, the VCPU 309 may jump to the EEPROM 313 to determine whether there is any workaround code for the specified service step, and if not, the process may proceed to the appropriate next step following the service step. This process may iterate until an event causes the VCPU power down or halt VCPU of steps 423 and 425.
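The service polling flow described above may be summarized as a small state walk that records the step numbers visited. This is a sketch only: the event dictionary keys are hypothetical, the per-service workaround checks in the EEPROM are omitted, and the return to step 411 after the VPD service of step 419 is an assumption based on the iteration described above.

```python
from typing import Dict, Iterable, List


def service_loop(events: Iterable[Dict[str, bool]]) -> List[int]:
    """Return the sequence of flowchart steps visited for the polled events."""
    trace: List[int] = []
    for ev in events:
        trace.append(411)                      # powerdown_magic code present?
        if ev.get("powerdown_magic", False):
            trace += [421, 423, 425, 427]      # clear code, power down, halt, stop
            return trace
        trace.append(413)                      # Vmain present?
        if not ev.get("vmain", True):
            trace.append(417)                  # WOL service
            if not ev.get("wol_enabled", False):
                trace.append(423)              # WOL disabled: power down first
            trace += [425, 427]                # halt VCPU, stop
            return trace
        trace.append(415)                      # VPD event present?
        if ev.get("vpd_event", False):
            trace.append(419)                  # VPD service, then poll again
    return trace
```

For instance, a VPD event followed by a powerdown_magic event visits steps 411, 413, 415 and 419, then 411, 421, 423, 425 and 427.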
In step 513, the device serial number override bit may be set to ‘1’, before proceeding to step 515 where the register, bits 11-12 in an exemplary embodiment, may be set according to the status LED configuration mode. If the LED configuration mode, register bits 11-12 for example, is equal to ‘11’, the PHY register 1B bit 2 may be set to ‘1’. The process may then proceed to step 517 where the config_retry register bit may be set to ‘0’, followed by step 519 where init_done may be set to ‘1’, prior to the return to EEPROM code, step 521.
In step 607, the VPD flag and address register, 0x52 for example, may be read before proceeding to step 609. If the VPD flag is set to write or the address is greater than ‘0x7f’, then the process proceeds to step 611 where ‘0xffffffff’ may be returned to the VPD read request and the flag bit may be toggled before proceeding to end step 625, Return to EEPROM code. If VPD write is not enabled and the address is not greater than 0x7f, the process may proceed to step 613, where the arbitration (ARB) register, 0x7020 for example, may be set to apply for EEPROM access. The ARB register may then be polled in step 615 until the VCPU 309 is allowed access, and a register, 0x7024 for example, may be written to set access_en, indicating that access is allowed.
The process may then proceed to step 617 where the VPD address may be loaded to the NVRAM address 0x700c, for example, and NVRAM command register 0x7000 may be written to issue an EEPROM read. The process may then proceed to step 619 where the NVRAM command register may be polled until the “done” bit is set, at which time the process may proceed to step 621. In step 621, the NVRAM data may be loaded to VPD data, and the VCPU request and access_en may be cleared, followed by step 623 where the VPD flag bit may be toggled in the VPD flag and address register. The process may then proceed to end step 625, Return to EEPROM code.
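The VPD read service of steps 607 through 625 may be sketched as follows. The `FakeNvram` class is a hypothetical stand-in for the ARB, access_en, NVRAM address and NVRAM command registers, since the bit-level register layouts are not reproduced here; toggling of the VPD flag bit (steps 611 and 623) is left to the caller.

```python
VPD_MAX_ADDRESS = 0x7F
INVALID_DATA = 0xFFFFFFFF


class FakeNvram:
    """Hypothetical model of EEPROM arbitration and read-completion polling."""

    def __init__(self, contents, grant_delay=2, read_delay=2):
        self.contents = contents
        self.grant_delay, self.read_delay = grant_delay, read_delay
        self.held = False
        self._wait = 0
        self._address = 0

    def request(self):                 # step 613: apply for EEPROM access
        self._wait = self.grant_delay

    def granted(self):                 # step 615: poll until access is allowed
        if self._wait > 0:
            self._wait -= 1
            return False
        self.held = True               # models setting access_en
        return True

    def start_read(self, address):     # step 617: load address, issue read
        self._address, self._wait = address, self.read_delay

    def done(self):                    # step 619: poll the "done" bit
        if self._wait > 0:
            self._wait -= 1
            return False
        return True

    def data(self):                    # step 621: NVRAM data to VPD data
        return self.contents.get(self._address, INVALID_DATA)

    def release(self):                 # step 621: clear request and access_en
        self.held = False


def vpd_read_service(is_write: bool, address: int, nvram: FakeNvram) -> int:
    if is_write or address > VPD_MAX_ADDRESS:   # step 609 leads to step 611
        return INVALID_DATA
    nvram.request()
    while not nvram.granted():
        pass
    try:
        nvram.start_read(address)
        while not nvram.done():
            pass
        return nvram.data()
    finally:
        nvram.release()                # always give the EEPROM back
```

Releasing the arbitration in a `finally` clause reflects the design intent that the VCPU request and access_en are cleared before the service returns, so that the host is not locked out of the EEPROM.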
In step 711, in instances where the WOL_en register, 0x5104 in an exemplary embodiment, is equal to ‘0’, the process proceeds to step 707 where the control register may be written to turn off status LEDs and then proceed to step 709, Return to EEPROM Code to power down device. In instances where the WOL_en register is not equal to ‘0’, the process may proceed to step 713. In step 713, the MAC mode register, for example 0x400, may be set to disable RDE, FHDE and TDE, and to enable Magic Packet detection, where RDE is the Receive Data Engine which may be used to process the received data. FHDE is the frame header descriptor engine which may be used to create a frame header for each frame (including the frame length, protocol, VLAN, CRC status etc.). TDE is the transmit data engine which may be used to process the transmitted data.
The process may then proceed to step 715 where the MDIO bitbang may be cleared, by writing to register 0x44c, for example, before proceeding to step 717. MDIO is the standard serial management interface which may be used to configure PHY registers. In one embodiment of the invention, a function may be provided to allow a host CPU to manipulate the MDIO interface through the register, which may be called MDIO bitbang. In step 717, the register 0x44c may be read until bit 29 is cleared, at which time 0x1200 may be written to PHY register 0 to restart auto-negotiation. The process may then proceed to step 721 where register 0x44c may be read until bit 29 is cleared, which may step the process to step 723. In step 723, the Receive MAC Mode register, for example 0x468, may be set to enable promiscuous mode and RX MAC mode. Promiscuous mode may be used by the Ethernet MAC 109 to pass all received good packets. In normal mode, only the received packets addressed to the device MAC address may be forwarded to the higher layer. Received packets with any other MAC address may be discarded. The process then steps to end step 725, Return to EEPROM Code to halt VCPU.
In an embodiment of the invention, a method, system and machine-readable code are disclosed for controlling an on-chip Ethernet controller 100 utilizing a virtual CPU 111 comprising a microcode engine that loads a single instruction and executes the instruction prior to loading or executing a subsequent instruction. The instructions may be fetched by the virtual CPU 111 from an external non-volatile memory, such as an EEPROM 115, or on-chip ROM. The virtual CPU 111 may initialize the Ethernet controller 100 and provide patches for supporting hardware workarounds, wake on LAN service, and vital product data such as serial number, product name, manufacturer and related manufacturing data. The virtual CPU 111 may power down the Ethernet controller and may be halted via a particular command and procedure.
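The WOL service sequence of steps 707 through 725 may be written out as an ordered plan, shown below as a minimal sketch. The value 0x1200 written to PHY register 0 corresponds, in the standard MII control register, to the auto-negotiation enable bit (bit 12) combined with the restart auto-negotiation bit (bit 9). The function name and the textual step descriptions are hypothetical; only the step numbers and register addresses come from the description above.

```python
from typing import List, Tuple

BMCR_AUTONEG_ENABLE = 1 << 12    # MII control register, bit 12
BMCR_RESTART_AUTONEG = 1 << 9    # MII control register, bit 9
PHY_RESTART_VALUE = BMCR_AUTONEG_ENABLE | BMCR_RESTART_AUTONEG   # 0x1200


def wol_service_plan(wol_en: int) -> List[Tuple[int, str]]:
    """Return (step, action) pairs for the given WOL_en register value."""
    if wol_en == 0:
        return [
            (707, "write control register: status LEDs off"),
            (709, "return to EEPROM code: power down device"),
        ]
    return [
        (713, "write 0x400: disable RDE/FHDE/TDE, enable Magic Packet detection"),
        (715, "write 0x44c: clear MDIO bitbang"),
        (717, "poll 0x44c until bit 29 clears, then write 0x%04x to PHY register 0"
              % PHY_RESTART_VALUE),
        (721, "poll 0x44c until bit 29 clears"),
        (723, "write 0x468: enable promiscuous mode and RX MAC"),
        (725, "return to EEPROM code: halt VCPU"),
    ]
```

Expressing the sequence as data rather than as direct register accesses keeps the sketch independent of any particular register access mechanism.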
Certain embodiments of the invention may comprise a machine-readable storage having stored thereon, a computer program having at least one code section for communicating information within a network, the at least one code section being executable by a machine for causing the machine to perform one or more of the steps described herein.
Accordingly, aspects of the invention may be realized in hardware, software, firmware or a combination thereof. The invention may be realized in a centralized fashion in at least one computer system or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system or other apparatus adapted for carrying out the methods described herein is suited. A typical combination of hardware, software and firmware may be a general-purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein.
One embodiment of the present invention may be implemented as a board level product, as a single chip, application specific integrated circuit (ASIC), or with varying levels integrated on a single chip with other portions of the system as separate components. The degree of integration of the system will primarily be determined by speed and cost considerations. Because of the sophisticated nature of modern processors, it is possible to utilize a commercially available processor, which may be implemented external to an ASIC implementation of the present system. Alternatively, if the processor is available as an ASIC core or logic block, then the commercially available processor may be implemented as part of an ASIC device with various functions implemented as firmware.
The present invention may also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which when loaded in a computer system is able to carry out these methods. Computer program in the present context may mean, for example, any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: a) conversion to another language, code or notation; b) reproduction in a different material form. However, other meanings of computer program within the understanding of those skilled in the art are also contemplated by the present invention.
While the invention has been described with reference to certain embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the scope of the present invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the present invention without departing from its scope. Therefore, it is intended that the present invention not be limited to the particular embodiments disclosed, but that the present invention will include all embodiments falling within the scope of the appended claims.
This application makes reference to: U.S. patent application Ser. No. 11/673,348 filed on Feb. 9, 2007. The above stated application is hereby incorporated herein by reference in its entirety.