Claims
- 1. A method for processing Input/Output (I/O) requests to a storage network including at least one storage device and at least two adaptors, wherein each adaptor is capable of communicating I/O requests to at least one storage device, comprising:
detecting an error in a system including a first adaptor, wherein the first adaptor is capable of communicating on the storage network after the error is detected; determining whether the first adaptor is designated a master of the storage network after the error is detected; starting a master switch timer that is less than a system timeout period if the first adaptor is the master after detecting the error, wherein an error recovery procedure in the system including the first adaptor is initiated after the system timeout period has expired; and initiating an operation to designate another adaptor in the storage network as the master if the first adaptor is the master in response to detecting an expiration of the master switch timer.
- 2. The method of claim 1, further comprising:
sending a reset request to the first adaptor after the master switch timer expires.
- 3. The method of claim 2, wherein the reset causes a reset of the first adaptor and not other components within the system including the first adaptor.
- 4. The method of claim 2, wherein the reset causes a power cycle of the system including the first adaptor.
- 5. The method of claim 2, wherein sending the reset request further comprises:
issuing a get identifier request to obtain an identifier of the first adaptor, wherein the reset request is sent to the obtained identifier if the identifier is returned in response to the get identifier request.
- 6. The method of claim 5, further comprising:
issuing another get identifier request to the first adaptor if a previous get identifier request failed.
- 7. The method of claim 1, further comprising:
initiating a monitoring state to monitor I/O requests transmitted through a second adaptor in response to detecting the error; starting an I/O delay timer that is less than the system timeout period in response to receiving an I/O request; and sending a reset request to the first adaptor in response to detecting an expiration of one started I/O delay timer.
- 8. The method of claim 7, wherein the master switch timer is less than the I/O delay timer.
- 9. The method of claim 7, further comprising:
starting a monitoring timer equivalent to the system timeout period after detecting the error at the first adaptor; and terminating the monitoring state and any pending I/O delay timers after the monitoring timer expires.
- 10. The method of claim 7, further comprising:
starting a monitoring timer equivalent to the adaptor timeout period after detecting the error at the first adaptor; beginning a process to issue an additional get identifier request to the first adaptor if any previous get identifier request failed; and terminating the monitoring state, any pending I/O delay timers, and the process to issue additional get identifier requests after an expiration of the monitoring timer.
- 11. The method of claim 7, wherein the steps of initiating a monitoring state, starting the I/O delay timer and sending the reset request are performed by a device driver executing in an operating system.
- 12. The method of claim 1, wherein the detected error indicates that the first adaptor is unable to communicate to the system housing the first adaptor.
- 13. The method of claim 1, wherein I/O requests continue to be processed through the second adaptor until the reset request is sent.
- 14. The method of claim 1, wherein the system including the first adaptor is a first system, wherein the device driver and the operating system are in a second system.
- 15. The method of claim 1, wherein the second adaptor is within the system including the first adaptor, and wherein the reset causes a reset of the first adaptor.
- 16. The method of claim 1, wherein the storage network on which the adaptors and storage devices communicate comprises a loop topology.
- 17. The method of claim 16, wherein the adaptors and storage devices communicate using the Serial Storage Architecture (SSA) protocol.
- 18. The method of claim 1, wherein the detected error indicates an error within the first adaptor.
- 19. A system for processing Input/Output (I/O) requests to a storage network including at least one storage device and a system including a first adaptor capable of communicating I/O requests to at least one storage device, wherein the system including the first adaptor initiates an error recovery procedure after a system timeout period has expired, comprising:
a second adaptor capable of communicating on the storage network; means for detecting an error in the system including the first adaptor, wherein the first adaptor is capable of communicating on the storage network after the error is detected; means for determining whether the first adaptor is designated a master of the storage network after the error is detected; means for starting a master switch timer, after detecting the error, that is less than the system timeout period if the first adaptor is the master; and means for initiating an operation to designate another adaptor in the storage network as the master if the first adaptor is the master in response to detecting an expiration of the master switch timer.
- 20. The system of claim 19, further comprising:
means for sending a reset request to the first adaptor after the master switch timer expires.
- 21. The system of claim 20, wherein the reset causes a reset of the first adaptor and not other components within the system including the first adaptor.
- 22. The system of claim 20, wherein the reset causes a power cycle of the system including the first adaptor.
- 23. The system of claim 20, wherein the means for sending the reset request further performs:
issuing a get identifier request to obtain an identifier of the first adaptor, wherein the reset request is sent to the obtained identifier if the identifier is returned in response to the get identifier request.
- 24. The system of claim 23, further comprising:
means for issuing another get identifier request to the first adaptor if a previous get identifier request failed.
- 25. The system of claim 19, further comprising:
means for initiating a monitoring state to monitor I/O requests transmitted through a second adaptor in response to detecting the error; means for starting an I/O delay timer that is less than the system timeout period in response to receiving an I/O request; and means for sending a reset request to the first adaptor in response to detecting an expiration of one started I/O delay timer.
- 26. The system of claim 25, wherein the master switch timer is less than the I/O delay timer.
- 27. The system of claim 25, further comprising:
means for starting a monitoring timer equivalent to the system timeout period after detecting the error at the first adaptor; and means for terminating the monitoring state and any pending I/O delay timers after the monitoring timer expires.
- 28. The system of claim 25, further comprising:
means for starting a monitoring timer equivalent to the adaptor timeout period after detecting the error at the first adaptor; means for beginning a process to issue an additional get identifier request to the first adaptor if any previous get identifier request failed; and means for terminating the monitoring state, any pending I/O delay timers, and the process to issue additional get identifier requests after an expiration of the monitoring timer.
- 29. The system of claim 25, further including:
an operating system; and a device driver executing in the operating system, wherein the means for initiating a monitoring state, starting the I/O delay timer and sending the reset request are performed by the device driver.
- 30. The system of claim 19, wherein the detected error indicates that the first adaptor is unable to communicate to the system housing the first adaptor.
- 31. The system of claim 19, wherein I/O requests continue to be processed through the second adaptor until the reset request is sent.
- 32. The system of claim 19, wherein the system including the first adaptor is a separate system accessible over the storage network.
- 33. The system of claim 19, wherein the first adaptor is within the system including the second adaptor, and wherein the reset causes a reset of the first adaptor.
- 34. The system of claim 19, wherein the storage network on which the adaptors and storage devices communicate comprises a loop topology.
- 35. The system of claim 34, wherein the adaptors and storage devices communicate using the Serial Storage Architecture (SSA) protocol.
- 36. The system of claim 19, wherein the detected error indicates an error within the first adaptor.
- 37. An article of manufacture including code for processing Input/Output (I/O) requests to a storage network including at least one storage device and at least two adaptors, wherein each adaptor is capable of communicating I/O requests to at least one storage device, wherein the code is capable of causing operations comprising:
detecting an error in a system including a first adaptor, wherein the first adaptor is capable of communicating on the storage network after the error is detected; determining whether the first adaptor is designated a master of the storage network after the error is detected; starting a master switch timer that is less than a system timeout period if the first adaptor is the master after detecting the error, wherein an error recovery procedure in the system including the first adaptor is initiated after the system timeout period has expired; and initiating an operation to designate another adaptor in the storage network as the master if the first adaptor is the master in response to detecting an expiration of the master switch timer.
- 38. The article of manufacture of claim 37, further comprising:
sending a reset request to the first adaptor after the master switch timer expires.
- 39. The article of manufacture of claim 38, wherein the reset causes a reset of the first adaptor and not other components within the system including the first adaptor.
- 40. The article of manufacture of claim 38, wherein the reset causes a power cycle of the system including the first adaptor.
- 41. The article of manufacture of claim 38, wherein sending the reset request further comprises:
issuing a get identifier request to obtain an identifier of the first adaptor, wherein the reset request is sent to the obtained identifier if the identifier is returned in response to the get identifier request.
- 42. The article of manufacture of claim 41, further comprising:
issuing another get identifier request to the first adaptor if a previous get identifier request failed.
- 43. The article of manufacture of claim 37, further comprising:
initiating a monitoring state to monitor I/O requests transmitted through a second adaptor in response to detecting the error; starting an I/O delay timer that is less than the system timeout period in response to receiving an I/O request; and sending a reset request to the first adaptor in response to detecting an expiration of one started I/O delay timer.
- 44. The article of manufacture of claim 43, wherein the master switch timer is less than the I/O delay timer.
- 45. The article of manufacture of claim 43, further comprising:
starting a monitoring timer equivalent to the system timeout period after detecting the error at the first adaptor; and terminating the monitoring state and any pending I/O delay timers after the monitoring timer expires.
- 46. The article of manufacture of claim 43, further comprising:
starting a monitoring timer equivalent to the adaptor timeout period after detecting the error at the first adaptor; beginning a process to issue an additional get identifier request to the first adaptor if any previous get identifier request failed; and terminating the monitoring state, any pending I/O delay timers, and the process to issue additional get identifier requests after an expiration of the monitoring timer.
- 47. The article of manufacture of claim 43, wherein the steps of initiating a monitoring state, starting the I/O delay timer and sending the reset request are performed by a device driver executing in an operating system.
- 48. The article of manufacture of claim 37, wherein the detected error indicates that the first adaptor is unable to communicate to the system housing the first adaptor.
- 49. The article of manufacture of claim 37, wherein I/O requests continue to be processed through the second adaptor until the reset request is sent.
- 50. The article of manufacture of claim 37, wherein the system including the first adaptor is a first system, wherein the device driver and the operating system are in a second system.
- 51. The article of manufacture of claim 37, wherein the second adaptor is within the system including the first adaptor, and wherein the reset causes a reset of the first adaptor.
- 52. The article of manufacture of claim 37, wherein the storage network on which the adaptors and storage devices communicate comprises a loop topology.
- 53. The article of manufacture of claim 52, wherein the adaptors and storage devices communicate using the Serial Storage Architecture (SSA) protocol.
- 54. The article of manufacture of claim 37, wherein the detected error indicates an error within the first adaptor.
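For illustration only, the timer relationship recited in independent claims 1, 19, and 37 can be sketched in Python as follows. The names `Adaptor`, `Timeouts`, `on_error_detected`, and `on_master_switch_expired`, the use of seconds as the time unit, and the rule of picking the first available peer as the new master are all assumptions made for this sketch. The claims specify only that the master switch timer is shorter than the system timeout period and that another adaptor is designated master when that timer expires.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Adaptor:
    """Hypothetical stand-in for an adaptor on the storage network."""
    name: str
    is_master: bool = False


@dataclass(frozen=True)
class Timeouts:
    """Hypothetical timer values in seconds; only their ordering comes from the claims."""
    system_timeout: float
    master_switch: float

    def __post_init__(self) -> None:
        # The master switch timer must be less than the system timeout period.
        if not 0 < self.master_switch < self.system_timeout:
            raise ValueError("master switch timer must be shorter than the system timeout")


def on_error_detected(first: Adaptor, timeouts: Timeouts, now: float) -> Optional[float]:
    """Return the master switch timer deadline, or None if the first adaptor is not master."""
    if not first.is_master:
        return None
    return now + timeouts.master_switch


def on_master_switch_expired(first: Adaptor, peers: List[Adaptor]) -> Optional[Adaptor]:
    """Designate another adaptor as master; return it, or None if nothing changed."""
    if not first.is_master or not peers:
        return None
    # Assumption: the claims do not say how the successor is chosen, so take the first peer.
    new_master = peers[0]
    first.is_master = False
    new_master.is_master = True
    return new_master
```

In this sketch, a caller would invoke `on_error_detected` when the error is reported, wait until the returned deadline, and then call `on_master_switch_expired`. Because the deadline always falls before the system timeout period elapses, the master role moves before the error recovery procedure in the system including the first adaptor begins.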
RELATED APPLICATIONS
[0001] This application is related to the copending and commonly assigned U.S. patent application Ser. No. ______ entitled “Method, System, and Program for Error Handling in a Dual Adaptor System” having attorney docket no. TUC920010103US1, which patent application was filed on the same date herewith and is incorporated herein by reference in its entirety.