Claims
- 1. A data processing system comprising:
a first set of operating components, including a first processor and a first memory;
a fabric that interconnects said first set of operating components and which supports non-disruptive hot-add and hot-removal of components via a hot-plug connector, said fabric including logic for re-configuring routing and operating protocols of said data processing system to accommodate dynamic changes to said data processing system caused by said hot-add and hot-removal of components;
a second set of operating components, physically coupled to said first set of operating components via said hot-plug connector;
means for automatically running a system check for both said first set of operating components and said second set of operating components, said system check identifying a problem component within either of said first set of operating components and said second set of operating components; and
means, when said problem component is detected within said second set of operating components, for dynamically initiating a hot-removal of at least said problem component among said second set of operating components.
- 2. The data processing system of claim 1, further comprising means for generating an output indicating the hot-removal of said second set of operating components.
- 3. The data processing system of claim 2, wherein said output includes specific indication of a type of problem and of an identification of said problem component.
- 4. The data processing system of claim 1, further comprising:
logic for enabling on-the-fly expansion and reduction of said data processing system to respectively add and remove additional processing units, wherein said additional processing units are connected via said hot-plug connectors and added and removed while said first set of operating components is operating, without disrupting the current performance of said first set of operating components.
- 5. The data processing system of claim 1, wherein said first set of operating components and said second set of operating components are first processing units and said fabric is an interconnect fabric of said first processing unit.
- 6. The data processing system of claim 1, further comprising:
means for enabling on-the-fly expansion of said data processing system to include the second set of operating components by completing an electrical and logical connection of said second set of components via said hot plug connector without disrupting current operations of said first set of operating components.
- 7. The data processing system of claim 1, wherein said means for automatically running a system check includes a service element that automatically initiates and completes a test of an operating readiness of said second set of components prior to enabling a re-configuration of routing and operating protocols of said interconnect fabric to accommodate said second set of components and, responsive to a discovery of the problem component, initiates the automatic removal.
- 8. The data processing system of claim 1, wherein:
said logic within said fabric includes configuration logic and detection logic, wherein said configuration logic includes a latch and multiple configuration registers selected by a value within said latch for implementing particular routing and operating protocols, wherein further a value within said latch is set by said detection logic whenever said problem component is detected being removed from said hot-plug connector.
- 9. The data processing system of claim 1, further comprising:
logic for dynamically selecting a configuration for controlling routing and communication operations of said interconnect fabric from among multiple configurations, wherein when said data processing system contains both said first set of components and an additional component connected via one of said hot plug connectors, said logic selects a second configuration, and when said additional component is identified as the problem component, said logic selects a first configuration supporting said first set of components.
- 10. The data processing system of claim 9, wherein said means for completing said removal comprises:
a service element, which triggers said logic to select said first configuration when said service element detects a pending removal of said additional component from said hot plug connector.
- 11. The data processing system of claim 1, further comprising:
an operating system (OS) that controls operations on the data processing system and allocates workload among said first set of operating components and said second set of operating components connected via a hot plug connector, based on a current configuration of said data processing system; and a service element, which, responsive to a detection of the problem component and pending removal of the second set of components, triggers the OS to re-allocate workload from said second set of components to said first set of components.
- 12. The data processing system of claim 1, further comprising:
a connection backplane that provides a series of hot-plug connection ports for coupling and removing the second set of components to and from said hot plug connectors, respectively.
- 13. The data processing system of claim 1, wherein said interconnect fabric further comprises:
means for dynamically re-configuring routing and operating protocols to accommodate a removal of said additional components without causing said first set of operating components to suspend operations.
- 14. The data processing system of claim 10, wherein said means for dynamically initiating said hot-removal comprises:
a service element operational within said first processing unit, and which automatically generates a logical separation between said second processing unit and said first processing unit.
- 15. In a data processing system comprising a first set of processing components that are interconnected with a dynamically configurable interconnect fabric with hot plug connectors and comprising a second set of processing components that is connected to said first processing unit via at least one of said hot plug connectors, a system for reducing processing with faulty components, said system comprising:
means for automatically detecting a problem component among said second set of processing components; and means, when a problem component is identified, for dynamically removing at least said problem component from said data processing system, without disrupting operations of said first set of components.
- 16. The system of claim 15, further comprising:
means, when a problem component has been removed, for switching a configuration of said interconnect fabric to a configuration having routing and operational protocols to support said first set of processing components and remaining ones of said second set of processing components.
- 17. The system of claim 15, further comprising:
an operating system (OS) that controls operations on the data processing system and allocates workload among processors and other components within said data processing system based on a current configuration of said data processing system.
- 18. The system of claim 17, further comprising:
means for re-allocating workload assigned to said problem component to said first set of processing components and the remaining ones of the second set of processing components.
- 19. The system of claim 15, wherein said first set of processing components is a processing unit having a first processor and a first memory and said second set of processing components includes individually connected components from among a second processor and a second memory that are connected via said hot plug connectors.
- 20. The system of claim 15, wherein both said first set of components and said second set of components are a first processing unit and a second processing unit, respectively, and said second processing unit is added and removed as a complete unit, wherein when said problem component is identified the entire second processing unit is removed.
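For illustration only, the behavior recited across claims 1, 8, 11 and 20 might be modeled in software roughly as follows. This is a minimal sketch under stated assumptions, not the claimed hardware: the names `Component`, `Fabric`, `ServiceElement`, `system_check` and `handle_problem`, the two-entry register table, and the round-robin re-allocation policy are all hypothetical and do not appear in the claims. The sketch assumes each set consists of whole processing units and that the first set is never empty.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """Hypothetical processing unit; `healthy` stands in for a test result."""
    name: str
    healthy: bool = True


@dataclass
class Fabric:
    """Assumed model of the configuration logic: a latch value selects one
    of several configuration registers (compare claim 8)."""
    config_registers: dict = field(default_factory=lambda: {
        0: "routing for first set only",
        1: "routing for first and second sets",
    })
    latch: int = 0

    def active_config(self) -> str:
        return self.config_registers[self.latch]


class ServiceElement:
    """Illustrative service element: runs the check and triggers removal."""

    def __init__(self, fabric, first_set, second_set, workload):
        self.fabric = fabric
        self.first_set = first_set      # assumed non-empty
        self.second_set = second_set    # units attached via hot-plug connector
        self.workload = workload        # task name -> owning unit name
        # Select the second configuration while the second set is attached.
        self.fabric.latch = 1 if second_set else 0

    def system_check(self):
        """Return the first problem component found in either set, or None."""
        for comp in self.first_set + self.second_set:
            if not comp.healthy:
                return comp
        return None

    def handle_problem(self):
        """Hot-remove the second set when the problem component is in it.

        Returns a message describing the removal, or None if nothing was
        removed (no problem found, or the problem is in the first set).
        """
        problem = self.system_check()
        if problem is None or problem not in self.second_set:
            return None
        # Re-allocate workload away from the units about to be removed
        # (round-robin over the first set is an assumption of this sketch).
        removed = {c.name for c in self.second_set}
        targets = [c.name for c in self.first_set]
        moved = 0
        for task, owner in list(self.workload.items()):
            if owner in removed:
                self.workload[task] = targets[moved % len(targets)]
                moved += 1
        # Logically separate the second set, then fall back to the first
        # configuration by resetting the latch.
        self.second_set.clear()
        self.fabric.latch = 0
        return f"hot-removed second set: {problem.name} failed system check"


# Example run with hypothetical unit and task names.
fabric = Fabric()
element = ServiceElement(
    fabric,
    [Component("unit0")],
    [Component("unit1", healthy=False)],
    {"job-a": "unit0", "job-b": "unit1"},
)
print(element.handle_problem())
print(fabric.active_config())
print(element.workload)
```

In this sketch the workload is moved before the latch is reset, so the first set keeps running throughout. The order is a design choice of the example rather than something the claims prescribe.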
RELATED APPLICATION(S)
[0001] The present invention is related to the subject matter of the following commonly assigned, copending United States patent applications: (1) Ser. No.: ______ (Docket No. AUS920030198US1) entitled “Non-disruptive, Dynamic Hot-Plug and Hot-Remove of Server Nodes in an SMP” filed on ______; and (2) Ser. No.: ______ (Docket No. AUS920020199US1) entitled “Non-disruptive, Dynamic Hot-Add and Hot-Remove of Non-Symmetric Data Processing System Resources” filed ______. The content of the above-referenced applications is incorporated herein by reference.