Specific embodiments of the invention will now be described in detail with reference to the accompanying figures. Like elements in the various figures are denoted by like reference numerals for consistency.
In the following detailed description of embodiments of the invention, numerous specific details are set forth in order to provide a more thorough understanding of the invention. However, it will be apparent to one of ordinary skill in the art that the invention may be practiced without these specific details. In other instances, well-known features have not been described in detail to avoid unnecessarily complicating the description.
In general, embodiments of the invention provide a method and apparatus to manage hot-plug operations with a finite state machine. Specifically, embodiments of the invention provide a method and apparatus for managing sophisticated hot-plug thread operations using a state-sharing sequence control engine and testing compliance of hot-plug mechanisms using a fault injection mechanism. Unlike the traditional tree structure used in typical hot-plug controllers, embodiments of the invention regulate hot-plug threads by setting wake and sleep periods for each thread. A thread is allowed to perform operations if, upon waking, it has ownership of the current state. Once a thread has finished its operations, it updates the state and goes back to sleep. This prevents spaghetti-like code and desynchronization, allows process termination once a thread goes to sleep, and provides a more extensible and less error-prone environment for development and debugging.
A hot-plug device (102) may consist of any type of removable hardware that is not vital to the core functionality of the operating system (108). For example, a hot-plug device (102) may be a PCI card, a hard drive, a PCMCIA card, or other types of printed circuit boards. A hot-plug event (104) refers to insertion or removal of a hot-plug device (102) and may be initiated by various means. For example, a hot-plug event may be triggered by notifying the operating system (108) that a hot-plug device (102) is to be inserted or removed, or by manually pressing a button on the hot-plug device (102) or simply inserting or removing the hot-plug device (102) itself.
Once the hot-plug event (104) is triggered, it spawns one or more hot-plug threads (106) to handle the hot-plug event (104). In one or more embodiments of the invention, a hot-plug event (104) involves more than just removal or addition of power to the hot-plug device (102). For example, if the operating system (108) is using the hot-plug device (102) and a remove operation is triggered, clean-up operations must be performed to save relevant data, prevent system errors, and properly shut down the hot-plug device (102) before it can be physically removed. Furthermore, in one or more embodiments of the invention, a hot-plug event (104) may refer to a surprise insertion or removal of a hot-plug device (102) before the system is adequately prepared. In such cases, the system should be robust enough such that the hot-plug threads (106) and the finite state machine (110) would handle the surprise hot-plug event (104) without causing a fatal error and shutting down the operating system (108).
As shown in
The fault injection mechanism (116) is used to validate the control logic of the hot-plug control system. It is used to determine the system's general state as well as to insert faults into the hot-plug control system. For example, a forced eject fault may be placed into the hot-plug control system to test its response and ensure that crashes are avoided and the fault is handled in a robust manner. After the fault is inserted, control flows of all hot-plug threads (106) can be traced to determine if the hot-plug control system is handling the fault appropriately. Because the present invention significantly reduces the complexity of producer/consumer thread interaction and the associated state space, tracing the control flows of hot-plug threads (106) for validation of real-time control logic is simplified as well.
The fault injection mechanism (116) may be implemented using a set of registers connected to a microcontroller or processor, which allows it to assess the general state of the hot-plug control system. While some registers may be read-only, faults may be introduced into the hot-plug control system using two registers. The first register may be used to enable or disable fault insertion, which may help prevent inadvertent insertion of faults into the system. For example, if the enable/disable register were set to an “enable key” value, fault injection would be enabled, and if the register were set to any other value, fault injection would be disabled. The second register may contain addressing bits, which direct the fault to a specific location on the hot-plug device, and fault bits, which can be used to apply a set of defined fault types to the hot-plug device. Those skilled in the art will appreciate that the fault injection mechanism (116) may be implemented using other methods.
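By way of illustration only, the two-register arrangement described above might be modeled in software as follows. This is a minimal sketch under stated assumptions: the key value, the bit layout (upper byte for addressing bits, lower byte for fault bits), the fault type names, and the class FaultInjector are all hypothetical and are not taken from the embodiments.

```python
# Hypothetical values chosen for illustration; the description does not
# specify a key value, register widths, or fault types.
ENABLE_KEY = 0xA5A5        # assumed "enable key" value
ADDR_SHIFT = 8             # assumed layout: upper byte holds addressing bits
FAULT_MASK = 0xFF          # assumed layout: lower byte holds fault bits

FAULT_FORCED_EJECT = 0x01  # hypothetical fault type
FAULT_POWER_LOSS = 0x02    # hypothetical fault type


class FaultInjector:
    """Software model of an enable register and a fault register."""

    def __init__(self):
        self.enable_reg = 0   # any value other than ENABLE_KEY disables injection
        self.fault_reg = 0    # addressing bits and fault bits

    def enabled(self):
        return self.enable_reg == ENABLE_KEY

    def write_fault(self, address, fault_bits):
        """Latch a fault request; the write is ignored unless injection is enabled."""
        if not self.enabled():
            return False
        self.fault_reg = ((address & 0xFF) << ADDR_SHIFT) | (fault_bits & FAULT_MASK)
        return True

    def decode(self):
        """Split the fault register back into (address, fault_bits)."""
        return self.fault_reg >> ADDR_SHIFT, self.fault_reg & FAULT_MASK
```

In this sketch a write to the fault register has no effect until the enable register holds the key, which mirrors the guard against inadvertent fault insertion described above.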
As shown in
In one or more embodiments of the invention, a single state variable and next state logic are used for all participating threads in the hot-plug device. In order to allow each thread to execute, state sharing is employed. In state sharing, each thread corresponds to a set of states it has ownership over. A state that is completely owned by one thread is called a control state (208, 210, 212), and a state that is shared amongst threads is called an ownership transfer state (214). A thread is not allowed to execute if it does not have ownership of the current state, and an ownership transfer state allows one thread to relinquish control of the finite state machine and another to take over the finite state machine and execute a set of control instructions. In one or more embodiments of the invention, control states (208, 210, 212) correspond to periods where a thread is accessing a critical section, or shared resource. Because no thread other than the one to which a control state is assigned can claim ownership of that state, access to shared resources is restricted to the privileged thread and thus mutual exclusion of critical sections is implemented.
In one or more embodiments of the invention, hot-plug threads (privileged thread (202), thread 1 (204), thread n (206)) will sleep and wake at fixed periods. When a thread is awake and can claim ownership of the current state, it will execute a control operation related to the hot-plug event until its next scheduled sleep period. Once a control operation is completed by the thread, it will cycle the finite state machine to another state, until eventually an ownership transfer state (214) is reached. This allows another thread to take ownership of the finite state machine and cycle to new states that are owned by it.
For example, if the current state corresponded to a control state of the privileged thread (208), the privileged thread (202) could execute a set of instructions and transition to another state (220) before going to sleep. During this time, any other thread (thread 1 (204), thread n (206)) entering its wake period would check the current state and discover that it does not have ownership of that state. Those threads (thread 1 (204), thread n (206)) would then go back to sleep and wake up after the sleep period to check again. Once the privileged thread (202) wakes and completes operations, it may cycle to an ownership transfer state (218) and go to sleep. Other threads (thread 1 (204), thread n (206)) may then wake and assume control of the finite state machine. If thread 1 (204) were to wake first, it would cycle the finite state machine to a control state (210) it owned and execute its own set of instructions before transitioning to an ownership transfer state (214), alternating between sleep and wake periods as it does. If thread n (206) were to wake first, it would also begin its own sequence, cycling to a control state (212) it had ownership of, alternating periods of executing instructions and sleeping, until it transitioned to another ownership transfer state (214).
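The ownership rules in the preceding example may be sketched, purely for illustration, as a shared state variable with an ownership table and common next state logic. The state names, the thread names, the OWNERS and NEXT tables, and the class SharedFsm are hypothetical. As a simplification, the sketch ends in a terminal DONE state rather than cycling to a further ownership transfer state, and it uses direct calls in place of real sleep and wake periods.

```python
# Hypothetical ownership table: which threads may act in each state.
OWNERS = {
    "PRIV_CONTROL": {"privileged"},     # control state of the privileged thread
    "HANDOFF": {"thread1", "threadn"},  # ownership transfer state (shared)
    "T1_CONTROL": {"thread1"},          # control state of thread 1
    "TN_CONTROL": {"threadn"},          # control state of thread n
    "DONE": set(),                      # terminal state, owned by no thread
}

# Hypothetical next state logic, shared by all threads.
NEXT = {
    ("PRIV_CONTROL", "privileged"): "HANDOFF",
    ("HANDOFF", "thread1"): "T1_CONTROL",
    ("HANDOFF", "threadn"): "TN_CONTROL",
    ("T1_CONTROL", "thread1"): "DONE",
    ("TN_CONTROL", "threadn"): "DONE",
}


class SharedFsm:
    """Single state variable shared by all participating threads."""

    def __init__(self, state="PRIV_CONTROL"):
        self.state = state
        self.trace = []   # (thread, old_state, new_state) for later inspection

    def wake(self, thread):
        """Model one wake period: act only if the thread owns the current state."""
        if thread not in OWNERS[self.state]:
            return False                      # no ownership: go back to sleep
        new_state = NEXT[(self.state, thread)]
        self.trace.append((thread, self.state, new_state))
        self.state = new_state
        return True
```

A short usage example, in which thread 1 happens to wake first after the hand-off:

```python
fsm = SharedFsm()
fsm.wake("thread1")      # returns False: PRIV_CONTROL is not owned by thread1
fsm.wake("privileged")   # returns True: state becomes HANDOFF
fsm.wake("thread1")      # returns True: state becomes T1_CONTROL
fsm.wake("threadn")      # returns False: T1_CONTROL is owned only by thread1
```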
In one or more embodiments of the invention, a privileged thread (202) that cycles to an ownership transfer state (214) may employ a watchdog clock (not shown) that will allow it to regain control of the finite state machine if a new thread (thread 1 (204), thread n (206)) never responds. Furthermore, in one or more embodiments of the invention, sleep and wake periods of threads may be changed based on conditions. For example, if the privileged thread (202) is the only one left executing and all other threads (thread 1 (204), thread n (206)) have completed execution and exited the finite state machine, then the sleep period for the privileged thread (202) may be omitted so that it may complete operations more quickly.
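One possible way to picture the watchdog and the conditional sleep period is sketched below. The tick-based timing, the state names HANDOFF and PRIV_RECOVER, and the helpers Watchdog, privileged_wake, and sleep_period are assumptions made for illustration; the description does not specify how the watchdog clock is realized.

```python
class Watchdog:
    """Tick-based watchdog; the tick unit and timeout value are assumptions."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.armed_at = None

    def arm(self, now):
        """Start timing when the privileged thread enters a transfer state."""
        self.armed_at = now

    def expired(self, now):
        return self.armed_at is not None and now - self.armed_at >= self.timeout_ticks


def privileged_wake(state, watchdog, now):
    """Return the state after one wake period of the privileged thread.

    If the machine is still in the (hypothetical) HANDOFF transfer state when
    the watchdog expires, no other thread has responded, so the privileged
    thread regains control by moving to a hypothetical recovery control state.
    """
    if state == "HANDOFF" and watchdog.expired(now):
        watchdog.armed_at = None
        return "PRIV_RECOVER"
    return state


def sleep_period(active_threads, base_period):
    """Omit the sleep period when only the privileged thread remains."""
    return 0 if active_threads == {"privileged"} else base_period
```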
Initially, a hot-plug event is triggered when a PCI eject button is pressed (Step 301) on a PCI card by a user. The system then waits five seconds (Step 303) to allow the user to cancel the hot-plug event. During this period, the system checks to see if the user presses the button again (Step 305). If so, the user has cancelled the eject operation (Step 307) and the system correspondingly cancels any hot-plug induced changes. Those skilled in the art will appreciate that a hot-plug event may be triggered via other means, such as being invoked within the operating system. In one or more embodiments of the invention, a hot-plug event triggered through the operating system omits the five-second retraction window and the hot-plug operation is immediate.
If the user does not press the button again within the five-second window, the system acknowledges that a hot-plug event is to occur and begins a finite state machine state sequence (Step 309). At this time, hot-plug related threads are spawned and enter a finite state machine that regulates their activity. All steps taking place during the finite state machine state sequence are executed by threads that transfer control of the finite state machine between one another, as described above. Furthermore, in one or more embodiments of the invention, hot-plug threads subscribe to fixed sleep and wake periods and only execute instructions when they have ownership of the current state.
Once the finite state machine sequence is started (Step 309), the operating system is informed by one or more hot-plug threads of the hot-plug request (Step 311). In the case where the hot-plug event is invoked through the operating system, this step is omitted, since the operating system already knows of the hot-plug event. Once the operating system is informed of the hot-plug request, it will request the hot-plug device driver to complete operations and detach itself (Step 313). The operating system then checks to see if the detach has been successful (Step 315). If not, the PCIE module is then reset to its previous state (Step 317) and the user must restart the hot-plug process at a later time. For example, if the PCIE module is being used by another resource and cannot be taken offline within a certain period of time, the detach process (Step 313) may be denied and the user notified of the denial. The PCIE module is then allowed to continue and complete operations for the other resource so that at a later time, it can be taken offline.
If the driver detach is successful, the PCIE module is taken offline (Step 319) and brought to a power-down state. At this time, the finite state machine state sequence is completed (Step 321) and the user may unplug the PCI card (Step 323).
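The eject sequence of Steps 301 through 323 may be summarized, as a rough sketch only, by a function that returns the step numbers visited for a given set of outcomes. The function name eject_flow and its parameters are hypothetical; the step numbers are those used in the description above.

```python
def eject_flow(pressed_again, detach_ok, via_os=False):
    """Return the step numbers visited for one eject attempt.

    pressed_again: the user pressed the button again within the five-second window
    detach_ok:     the driver detach succeeded
    via_os:        the event was invoked through the operating system, which
                   omits the button press, the retraction window, and Step 311
    """
    steps = []
    if not via_os:
        steps += [301, 303, 305]      # button press, five-second wait, re-press check
        if pressed_again:
            steps.append(307)         # eject cancelled by the user
            return steps
    steps.append(309)                 # finite state machine sequence begins
    if not via_os:
        steps.append(311)             # operating system informed of the request
    steps += [313, 315]               # driver detach requested, result checked
    if not detach_ok:
        steps.append(317)             # module reset to its previous state
        return steps
    steps += [319, 321, 323]          # offline, sequence complete, card unplugged
    return steps
```

For instance, eject_flow(False, False) ends at Step 317, which corresponds to the case where the detach is denied and the user must restart the hot-plug process later.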
First, a thread must enter its wake mode (Step 401). Once it is awake, it checks the current state of the finite state machine (Step 403). As stated above, in one or more embodiments of the invention, a single state variable and next state logic are maintained for all participating threads for a given hot-plug device. Once the thread has identified the current state, it executes the next state code, which is the same for all threads. Different actions are taken based on whether the thread is the privileged thread or not (Step 405). If the thread is not the privileged thread, then it goes back to sleep (Step 411) and wakes up after its sleep cycle to check the state again. When the current state is a control state of a particular thread, only that thread is allowed to execute hot-plug related instructions. All other threads will wake up, check the current state, go back to sleep and repeat until the current state has been transitioned to an ownership transfer state, which will allow another thread to take control.
If the thread is the privileged thread, it may perform a control operation (Step 407) related to the hot-plug event. For example, the thread may perform clean-up operations between the hot-plug device and operating system, or it may be powering down or powering up the hot-plug device. Once the thread has performed its control operation (Step 407), it updates the state (Step 409) in the finite state machine. The state may be updated to another control state of the thread, or it may be updated to an ownership transfer state, which would allow other threads to take over the finite state machine. Once the state is updated, the thread enters a sleep cycle (Step 411) and relinquishes control of the finite state machine.
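The wake, check, execute, update, and sleep cycle of Steps 401 through 411 can be illustrated with ordinary operating system threads. The following is a minimal sketch and not the claimed implementation: the class HotPlugFsm, the functions worker and run, the linear plan of (owner, state) pairs, and the state names in the usage example are hypothetical. The lock is an assumption added so that the check and the update of the shared state variable are atomic in this model.

```python
import threading
import time


class HotPlugFsm:
    """Shared state variable, modeled as an index into a linear plan."""

    def __init__(self, plan):
        self.plan = plan              # ordered (owner, state) pairs, hypothetical
        self.index = 0                # the single shared state variable
        self.lock = threading.Lock()  # assumption: guards check-and-update
        self.log = []                 # control operations performed, in order

    def done(self):
        return self.index >= len(self.plan)


def worker(fsm, name, sleep_s):
    """Loop of one hot-plug thread following Steps 401 to 411."""
    while True:
        time.sleep(sleep_s)                     # Step 411, then Step 401: sleep, wake
        with fsm.lock:
            if fsm.done():
                return                          # sequence complete: thread exits
            owner, state = fsm.plan[fsm.index]  # Step 403: check current state
            if owner != name:                   # Step 405: thread lacks ownership
                continue                        # back to sleep
            fsm.log.append((name, state))       # Step 407: placeholder control operation
            fsm.index += 1                      # Step 409: update the state


def run(plan, names, sleep_s=0.001):
    """Spawn one thread per name and return the log once all have exited."""
    fsm = HotPlugFsm(plan)
    threads = [threading.Thread(target=worker, args=(fsm, n, sleep_s)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=5)
    return fsm.log
```

A usage example with hypothetical state names:

```python
plan = [
    ("privileged", "notify_os"),
    ("privileged", "detach_driver"),
    ("thread1", "power_down"),
    ("threadn", "release_slot"),
]
log = run(plan, ["privileged", "thread1", "threadn"])
```

Because only the owner of the current state may advance it, the order of entries in the log follows the plan regardless of which thread happens to wake first.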
The invention may be implemented on virtually any type of computer regardless of the platform being used. For example, as shown in
Further, those skilled in the art will appreciate that one or more elements of the aforementioned computer system (500) may be located at a remote location and connected to the other elements over a network. Further, the invention may be implemented on a distributed system having a plurality of nodes, where each portion of the invention (e.g. finite state machine, hot-plug device, operating system, etc.) may be located on a different node within the distributed system. In one embodiment of the invention, the node corresponds to a computer system. Alternatively, the node may correspond to a processor with associated physical memory. The node may alternatively correspond to a processor with shared memory and/or resources. Further, software instructions to perform embodiments of the invention may be stored on a computer readable medium such as a compact disc (CD), a diskette, a tape, a file, or any other computer readable storage device.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this disclosure, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as disclosed herein. Accordingly, the scope of the invention should be limited only by the attached claims.