Claims
- 1. A method for reinitializing firmware in the event of a fault in a storage area network comprising at least one storage controller having programmable memory and random access memory (RAM), said at least one storage controller for controlling data access between at least one host server and a storage device, comprising:
detecting a fault; suspending data access commands from said at least one host server; reinstalling firmware stored in programmable memory; reinitializing the at least one storage controller; and provisioning data access commands from said at least one host server to said at least one storage device prior to receiving a data access error message.
- 2. The method according to claim 1, wherein said detecting step comprises detecting non-masked interrupt faults.
- 3. The method according to claim 2, wherein said detecting step further comprises:
identifying said fault; and storing fault information pertaining to said identified fault.
- 4. The method according to claim 3, wherein said detecting step further comprises:
delaying said reinitialization step for a predetermined period; and extracting and archiving said stored fault information prior to said reinitialization step.
- 5. The method according to claim 1, wherein the suspending step further comprises issuing a reset command to host bus adaptors communicating with the at least one host server and the storage device.
- 6. The method according to claim 1, wherein the suspending step is initiated from firmware currently installed in said processor.
- 7. The method according to claim 1, wherein the reinstalling firmware step further comprises selecting a version of firmware from the group consisting of a same version, a previous version, and a later version of firmware installed on said at least one controller.
- 8. The method according to claim 1, wherein said reinstalling step further comprises:
erasing a current version of firmware installed in programmable memory on the controller; copying a second copy of the current version of firmware stored in a first portion of RAM to a second portion of the RAM; and overwriting the current version of firmware in the first portion of the RAM with new firmware.
- 9. The method according to claim 8, wherein said reinitialization step is initiated from the reinstalled version of firmware in the first portion of RAM.
- 10. The method according to claim 8, wherein said reinitialization step is an abbreviated reboot comprising:
initializing host bus adaptors (HBAs) coupled to the at least one controller; scanning physical drive information for failures; clearing said second portion of the RAM; and notifying a configuration manager that the firmware reinstallation is complete.
- 11. Apparatus for reinitializing firmware in the event of a fault in a storage area network comprising at least one storage controller having programmable memory and random access memory (RAM), said at least one storage controller for controlling data access between at least one host server and a storage device, comprising:
means for detecting a fault; means for suspending data access commands from said at least one host server; means for reinstalling firmware stored in programmable memory; means for reinitializing the at least one storage controller; and means for provisioning data access commands from said at least one host server to said at least one storage device prior to receiving a data access error message.
- 12. The apparatus according to claim 11, wherein said means for detecting comprises detecting non-masked interrupt faults.
- 13. The apparatus according to claim 12, wherein said means for detecting further comprises:
identifying said fault; and storing fault information pertaining to said identified fault.
- 14. The apparatus according to claim 13, wherein said means for detecting further comprises:
delaying said reinitialization step for a predetermined period; and extracting and archiving said stored fault information prior to said reinitialization step.
- 15. The apparatus according to claim 11, wherein the means for suspending further comprises issuing a reset command to host bus adaptors communicating with the at least one host server and the storage device.
- 16. The apparatus according to claim 11, wherein the means for suspending is initiated from firmware currently installed in said processor.
- 17. The apparatus according to claim 11, wherein the means for reinstalling firmware further comprises selecting a version of firmware from the group consisting of a same version, a previous version, and a later version of firmware installed on said at least one controller.
- 18. The apparatus according to claim 11, wherein said means for reinstalling further comprises:
erasing a current version of firmware installed in programmable memory on the controller; copying a second copy of the current version of firmware stored in a first portion of RAM to a second portion of the RAM; and overwriting the current version of firmware in the first portion of the RAM with new firmware.
- 19. The apparatus according to claim 18, wherein said means for reinitialization is initiated from the reinstalled version of firmware in the first portion of RAM.
- 20. The apparatus according to claim 18, wherein said means for reinitialization is an abbreviated reboot comprising:
initializing host bus adaptors (HBAs) coupled to the at least one controller; scanning physical drive information for failures; clearing said second portion of the RAM; and notifying a configuration manager that the firmware reinstallation is complete.
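For illustration only, the sequence recited in claims 1, 8 and 10 can be sketched as a small simulation. This is a minimal sketch under stated assumptions, not the claimed implementation: the `Controller` class, its fields, the event names, and the image strings `fw-v1` and `fw-v2` are all hypothetical, and real hardware operations (flash erase, HBA reset, drive scan) are replaced by log entries.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Controller:
    """Toy stand-in for a storage controller; every field is hypothetical."""
    flash: Optional[str]               # firmware held in programmable memory
    ram_first: str                     # first portion of RAM (running firmware)
    ram_second: Optional[str] = None   # second portion of RAM (backup copy)
    accepting_io: bool = True          # whether host commands are serviced
    fault_log: List[str] = field(default_factory=list)
    events: List[str] = field(default_factory=list)


def reinstall_firmware(ctrl: Controller, new_image: str) -> None:
    """Steps of claim 8: erase, copy to second portion, overwrite first."""
    ctrl.flash = None                  # erase the version in programmable memory
    ctrl.ram_second = ctrl.ram_first   # keep a copy of the current version
    ctrl.ram_first = new_image         # overwrite first portion with new firmware
    # The claims do not say when programmable memory is rewritten,
    # so this sketch leaves it erased.
    ctrl.events.append("reinstall")


def abbreviated_reboot(ctrl: Controller) -> None:
    """Steps of claim 10, modelled as log entries."""
    ctrl.events.append("init_hbas")
    ctrl.events.append("scan_drives")
    ctrl.ram_second = None             # clear the second portion of RAM
    ctrl.events.append("clear_backup")
    ctrl.events.append("notify_config_manager")


def recover(ctrl: Controller, fault: str, new_image: str) -> None:
    """Overall order of claim 1: detect, suspend, reinstall, reinit, resume."""
    ctrl.fault_log.append(fault)       # store fault information (claim 3)
    ctrl.events.append("fault_detected")
    ctrl.accepting_io = False          # suspend host data access commands
    ctrl.events.append("suspend_io")
    reinstall_firmware(ctrl, new_image)
    abbreviated_reboot(ctrl)
    ctrl.accepting_io = True           # resume before the host reports an error
    ctrl.events.append("resume_io")


ctrl = Controller(flash="fw-v1", ram_first="fw-v1")
recover(ctrl, fault="NMI", new_image="fw-v2")
print(ctrl.events)
```

The sketch does not model the timing constraint of claim 1, namely that data access resumes before the host server receives a data access error message; in practice that would depend on host and HBA timeout settings, which the claims do not specify.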
RELATED APPLICATIONS
[0001] This application relates to and claims priority from U.S. application Ser. No. 60/381,426, filed May 17, 2002, and entitled “CRASH AND RECOVER ON THE FLY”, the disclosure of which is hereby incorporated by reference in its entirety.
Provisional Applications (1)
| Number | Date | Country |
| --- | --- | --- |
| 60381426 | May 2002 | US |