In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be obvious, however, to one skilled in the art, that the present invention may be practiced without some or all of these specific details. In other instances, well known process steps have not been described in detail in order not to unnecessarily obscure the present invention.
The disclosed embodiments support the management of locks that are requested and acquired in a system implementing virtualization of storage. More particularly, the embodiments described herein may be implemented in a system implementing network-based virtualization. In a system implementing network-based virtualization, virtualization may be implemented across multiple ports and/or network devices such as switches or routers. As a result, commands such as read or write commands addressed to a volume may be intercepted by different network devices (e.g., switches, routers, etc.) and/or ports. The disclosed embodiments alleviate the locking problem that arises in such a system.
A reserve request is typically sent by a host to reserve a volume or portion thereof in order to perform a read or write operation. Such a reserve request typically indicates the type of reservation being requested.
In accordance with one embodiment, there are four types of reservations that may be performed in order to reserve an entire volume or portion (e.g., extent) thereof. The four types of reservations include: read exclusive, write exclusive, exclusive access, and read shared. When a read exclusive reservation is obtained, no other initiator is permitted to perform read operations on the indicated extent(s) (or volume). However, a read exclusive reservation does not prevent write operations from being performed by another initiator. Similarly, when a write exclusive reservation is obtained, no other initiator is permitted to perform write operations on the indicated extent(s) (or volume). However, a write exclusive reservation does not prevent read operations from being performed by another initiator. An exclusive access reservation prevents all other initiators from accessing the indicated extent(s) (or volume). All reservation types that overlap these extent(s) conflict with this reservation. A read shared reservation prevents write operations from being performed by any initiator on the indicated extent(s) (or volume). This reservation does not prevent read operations from being performed by any other initiator. Although these types of reservations are supported in a system implementing the SCSI protocol, these examples are merely illustrative, and therefore the disclosed embodiments may be implemented in systems supporting different protocols, as well as different types of reservations.
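By way of illustration only, the following Python sketch shows one way the operations blocked by each of these four reservation types might be represented and checked; the type names, the BLOCKED_OPS table, and the operation_conflicts helper are assumptions introduced here for purposes of illustration rather than part of any particular SCSI implementation.

from enum import Enum, auto


class ReservationType(Enum):
    READ_EXCLUSIVE = auto()    # blocks reads by other initiators, but not writes
    WRITE_EXCLUSIVE = auto()   # blocks writes by other initiators, but not reads
    EXCLUSIVE_ACCESS = auto()  # blocks all access by other initiators
    READ_SHARED = auto()       # blocks writes by any other initiator; reads remain allowed


# Operations that another initiator is blocked from performing on the reserved extent(s).
BLOCKED_OPS = {
    ReservationType.READ_EXCLUSIVE: {"read"},
    ReservationType.WRITE_EXCLUSIVE: {"write"},
    ReservationType.EXCLUSIVE_ACCESS: {"read", "write"},
    ReservationType.READ_SHARED: {"write"},
}


def operation_conflicts(existing: ReservationType, operation: str) -> bool:
    """Return True if another initiator's read or write conflicts with the existing reservation."""
    return operation in BLOCKED_OPS[existing]


# Example: a write by another initiator conflicts with a read shared reservation; a read does not.
print(operation_conflicts(ReservationType.READ_SHARED, "write"))  # True
print(operation_conflicts(ReservationType.READ_SHARED, "read"))   # False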
When a host wishes to access a volume or portion thereof in order to read and/or write to the volume or portion thereof, the host will typically send a reserve request in order to “lock” the corresponding storage locations. In order to obtain a lock of the volume or portion thereof, one or more network devices and/or ports are notified of the lock. These notifications may serve a variety of purposes. For instance, such notifications enable the network devices and/or ports to update their information so as to prevent any subsequent reservation conflicts from occurring. As another example, in the event that the network devices and/or ports receiving a notification are aware of a conflict, they may respond to prevent a new lock from being obtained. Accordingly, through communication between or among the network devices and/or ports, the locking problem and data corruption that can occur in such a system may be eliminated.
In accordance with one embodiment, a volume is exported by one or more ports. The ports that export a particular volume may be implemented in one or more network devices within the network. In accordance with one embodiment, the ports may be intelligent ports (i.e., I-ports) implemented in a manner such as that disclosed in patent application Ser. No. 10/056,238, Attorney Docket No. ANDIP003, entitled “Methods and Apparatus for Implementing Virtualization of Storage in a Storage Area Network,” by Edsall et al., filed on Jan. 23, 2002. An I-port may be implemented as a master port, which may send commands or information to other I-ports. In contrast, an I-port that is not a master port may contact the master port for a variety of purposes, but cannot contact the other I-ports. In a Fibre Channel network, the master I-port for a particular volume may maintain the identity of the other I-ports that also export the volume in the form of a World Wide Name (WWN) and/or Fibre Channel Identifier (FCID). Similarly, the other I-ports that export the volume may maintain the identity of the master I-port in the form of a WWN and/or FCID. In other embodiments, it is contemplated that the system does not include a master I-port, and therefore the I-ports maintain the identity of the other I-ports that export the volume, to which they send notifications.
In accordance with some embodiments of the invention, a master port functions as an arbitrator for the purpose of sending notifications to other ports exporting the volume, determining whether a reservation conflict exists based upon local or global reservation information, and/or updating reservation information accordingly. In addition, the master port may also function as a master port for purposes of implementing virtualization functionality. More particularly, a master port may be implemented in a manner such as that disclosed in patent application Ser. No. 10/056,238, Attorney Docket No. ANDIP003, entitled “Methods and Apparatus for Implementing Virtualization of Storage in a Storage Area Network,” by Edsall et al., filed on Jan. 23, 2002.
In accordance with one embodiment, a storage area network may be implemented with virtualization switches adapted for implementing virtualization functionality, as well as with standard switches.
In order to support the virtual-physical mapping and accessibility of memory by multiple applications and/or hosts, it is desirable to coordinate memory accesses between the virtualization switches 102 and 104. Communication between the switches 102 and 104 may be accomplished by an inter-switch link 126 between two switches. As shown, the inter-switch link 126 may be between two standard ports. In other words, synchronization of memory accesses by two switches merely requires communication between the switches. This communication may be performed via intelligent virtualization ports, but need not be performed via a virtualization port or between two virtualization ports.
Virtualization of storage is performed for a variety of reasons, such as mirroring. For example, consider four physical Logical Units (LUNs), PLUN1 128, PLUN2 130, PLUN3 132, and PLUN4 134. It is often desirable to group two physical LUNs for the purpose of redundancy. Thus, as shown, two physical LUNs, PLUN1 128 and PLUN2 130, are represented by a single virtual LUN, VLUN1 136. When data is mirrored, the data is mirrored (e.g., stored) in multiple physical LUNs to enable the data to be retrieved upon failure of one of the physical LUNs.
Various problems may occur when data is written to or read from one of a set of “mirrors.” For instance, multiple applications running on the same or different hosts may simultaneously access the same data or memory location (e.g., disk location or disk block), shown as links 138, 140. Similarly, commands such as read or write commands sent from two different hosts, shown at 138, 140 and 142, 143, may be sent in the same time frame. Each host may have a corresponding Host Bus Adapter (HBA), as shown. Ideally, the data that is accessed or stored by the applications or hosts should leave the mirrors intact. More particularly, even after a write operation to one of the mirrors, the data stored in all of the mirrors should remain consistent. In other words, the mirrors should continue to serve as redundant physical LUNs for the other mirrors in the event that one of the mirrors should fail.
In conventional systems in which mirroring is enabled, a relatively simultaneous access by two different sources often results in an inherent race condition. For instance, consider the situation when two different clients send a write command to the same virtual LUN. As shown, application 1 144 running on Host 1 124 sends a write command with the data “A,” while application 2 146 running on Host 2 126 sends a write command with the data “B.” If the first application 144 sends data “A” to VLUN1 136 first, the data “A” may be written, for example, to PLUN1 128. However, before it can be mirrored to PLUN2 130, the second application 146 may send data “B.” Data “B” may be written to PLUN2 130 prior to being mirrored to PLUN1 128. Data “A” is then mirrored to PLUN2 130. Similarly, data “B” is mirrored to PLUN1 128. Thus, as shown, the last write operation controls the data to be stored in a particular physical LUN. In this example, upon completion of both mirror operations, PLUN1 128 stores data “B” while PLUN2 130 stores data “A.” Thus, the two physical LUNs no longer mirror one another, resulting in ambiguous data.
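The following minimal Python sketch is provided solely to illustrate the race described above; the data structures and the particular interleaving of write steps are hypothetical and are not intended to represent an actual host or switch implementation.

# The two physical LUNs and their current contents.
mirrors = {"PLUN1": None, "PLUN2": None}


def unsynchronized_write(first: str, second: str, data: str, steps: list) -> None:
    # Record the individual write steps so they can later be interleaved.
    steps.append((first, data))   # write to the first physical LUN
    steps.append((second, data))  # deferred mirror write to the second physical LUN


steps_a: list = []
steps_b: list = []
unsynchronized_write("PLUN1", "PLUN2", "A", steps_a)  # application 1 writes "A"
unsynchronized_write("PLUN2", "PLUN1", "B", steps_b)  # application 2 writes "B"

# The interleaving described above: "A" to PLUN1, "B" to PLUN2, then the two mirror writes.
for plun, data in (steps_a[0], steps_b[0], steps_a[1], steps_b[1]):
    mirrors[plun] = data

print(mirrors)  # {'PLUN1': 'B', 'PLUN2': 'A'} -- the mirrors no longer match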
In order to solve the inherent race condition present in conventional systems, the virtualization ports communicate with one another, as described above, via an inter-switch link such as 126. In other words, the ports synchronize their access of virtual LUNs with one another. This is accomplished, in one embodiment, through the establishment of a single master virtualization port that is known to the other virtualization ports as the master port. The identity of the master port may be established through a variety of mechanisms. As one example, the master port may send out a multicast message to the other virtualization ports indicating that it is the master virtualization port. As another example, the virtualization ports may be initialized with the identity of the master port. In addition, in the event of failure of the master virtualization port, it may be desirable to enable one of the slave virtualization ports to substitute as a master port.
The master virtualization port may solve the problem caused by the inherent race condition in a variety of ways. One solution is a lock mechanism. An alternative approach is to redirect the SCSI command to the master virtualization port, which will be in charge of performing the virtual-physical mapping as well as the appropriate interlocking. The slave port may then learn the mapping from the master port as well as handle the data.
Prior to accessing a virtual LUN, a slave virtualization port initiates a conversation with the master virtualization port to request permission to access the virtual LUN. This is accomplished through a locking mechanism that locks access to the virtual LUN until the lock is released. For instance, the slave virtualization port (e.g., port 106) may request the grant of a lock from the master virtualization port (e.g., port 108). The master virtualization port then informs the slave virtualization port when the lock is granted. When the lock is granted, access to the corresponding physical storage locations is “locked” until the lock is released. In other words, the holder of the lock has exclusive read and/or write access to the data stored in those physical locations. In this example, data “A” is then stored in both physical LUN1 128 and physical LUN2 130. When the slave virtualization port 106 receives a STATUS OK message indicating that the write operation to the virtual LUN was successful, the lock may be released. The master virtualization port 108 may then obtain a lock to access the virtual LUN until data “B” is stored in both mirrors of VLUN1 136. In this manner, virtualization ports synchronize access to virtual LUNs to ensure integrity of the data stored in the underlying physical storage mediums.
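A minimal sketch of this lock exchange is shown below, assuming a simple in-process lock per virtual LUN; the MasterVirtualizationPort and SlaveVirtualizationPort classes and their request_lock, release_lock, and mirrored_write methods are illustrative names only and do not correspond to any specific implementation of the master and slave ports.

import threading


class MasterVirtualizationPort:
    """Grants and releases locks on virtual LUNs on behalf of slave ports (illustrative)."""

    def __init__(self) -> None:
        self._locks: dict = {}
        self._guard = threading.Lock()

    def request_lock(self, vlun: str) -> None:
        # Block until exclusive access to the virtual LUN can be granted.
        with self._guard:
            lock = self._locks.setdefault(vlun, threading.Lock())
        lock.acquire()

    def release_lock(self, vlun: str) -> None:
        self._locks[vlun].release()


class SlaveVirtualizationPort:
    def __init__(self, master: MasterVirtualizationPort) -> None:
        self.master = master

    def mirrored_write(self, vlun: str, pluns: list, data: str, storage: dict) -> None:
        # Hold the lock across all mirror writes so that they appear atomic to other ports.
        self.master.request_lock(vlun)
        try:
            for plun in pluns:
                storage[plun] = data          # data written to both mirrors
        finally:
            self.master.release_lock(vlun)    # released after the STATUS OK is received


storage: dict = {}
master = MasterVirtualizationPort()
slave = SlaveVirtualizationPort(master)
slave.mirrored_write("VLUN1", ["PLUN1", "PLUN2"], "A", storage)
print(storage)  # {'PLUN1': 'A', 'PLUN2': 'A'} -- both mirrors remain consistent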
In accordance with one embodiment, slave and master virtualization ports may be configured or adapted for performing SCSI reserve operations such as those described herein. More particularly, select ports may access reserve information indicating the portion(s) of a volume being reserved (and possibly the port requesting the reservation) and/or communicate with one another regarding SCSI reserve processes, as will be described in further detail below.
In accordance with one embodiment, the disclosed methods may be implemented by one or more ports. For instance, each port implementing one or more of the disclosed methods may be an intelligent port such as that disclosed in patent application Ser. No. 10/056,238, Attorney Docket No. ANDIP003, entitled “Methods and Apparatus for Implementing Virtualization of Storage in a Storage Area Network,” by Edsall et al., filed on Jan. 23, 2002, which is incorporated herein by reference for all purposes. Alternatively, the disclosed methods may be implemented by one or more network devices.
In order to reserve a volume or portion thereof, a host may transmit a reserve request. Similarly, in order to release the reservation of the volume or portion thereof, the host may send a release request. Various methods of processing reserve requests and corresponding release requests will be described in further detail below.
Each port receiving a reserve intention notification may check whether a reservation conflict exists (e.g., by accessing local or global reserve information) and, upon determining that no reservation conflict exists, may store reserve information indicating that a lock of the portion(s) of the volume has been obtained as shown at 216 and 218, as appropriate. This information may also identify the port from which the notification has been sent (i.e., port that has received the reserve request). The ports receiving these notifications may also send an acknowledgement as shown at 220 and 222 to acknowledge the receipt of the notifications and/or indicate that no reservation conflict exists. The port that has received the reserve request may then obtain a lock of the portion of the volume being reserved at 224. More particularly, the port may wait until it receives an acknowledgement from each of the ports to which a reserve intention notification was sent prior to obtaining the lock. The port may then send a reserve response to the host at 226 indicating whether the portion of the volume has been reserved as requested.
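One possible way a receiving port might implement the conflict check and acknowledgement described above (cf. 216-222) is sketched below; the Reservation record, the regions_overlap test, and the ExportingPort class are hypothetical constructs used only for illustration.

from dataclasses import dataclass


@dataclass(frozen=True)
class Reservation:
    volume: str
    start: int       # first block of the reserved region
    length: int      # number of blocks reserved
    owner_port: str  # port that received the reserve request


def regions_overlap(a: Reservation, b: Reservation) -> bool:
    return (a.volume == b.volume
            and a.start < b.start + b.length
            and b.start < a.start + a.length)


class ExportingPort:
    """A port that exports the volume and receives reserve intention notifications (illustrative)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.local_reservations: list = []

    def on_reserve_intention(self, requested: Reservation) -> bool:
        # Check local reserve information for a conflict (cf. 216/218).
        if any(regions_overlap(requested, held) for held in self.local_reservations):
            return False  # conflict: no acknowledgement of the reservation
        # Record the locked region and the port that requested the reservation.
        self.local_reservations.append(requested)
        return True       # acknowledgement (cf. 220/222)


port = ExportingPort("iPort2")
print(port.on_reserve_intention(Reservation("Volume1", 0, 100, "iPort1")))  # True: recorded
print(port.on_reserve_intention(Reservation("Volume1", 50, 10, "iPort3")))  # False: overlapping region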
When a reserve request is received by one of the ports that exports the volume such as iPort2 at 228, it determines whether a conflict exists between the reserve request and other reserve requests that have previously been received at 230. In other words, the port checks the reserve information it has stored for other reservations performed by other ports. iPort2 may then send a reserve response to the host at 232. This reserve response may indicate whether a conflict exists. In this example, iPort2 determines that a conflict exists and notifies the host of the conflict. As a result, iPort2 does not send reserve intention notifications to the other ports exporting the volume.
In accordance with one embodiment, the reservation information for a set of ports that exports a volume is stored at a single location or network device that is external to each of the set of ports. This location or network device may be referred to as a “shared disk” on which each port (e.g., iPort) has a segment for each region of the volume. Each segment may be used to store reservation status information for the corresponding region of the volume. The reservation information may be organized according to volume region and/or port.
If the reserve information is stored at a central location such as a shared disk, the storing and accessing of the reserve information need not be performed locally.
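A simple sketch of such centrally stored reserve information is shown below, assuming one segment per (port, region) pair as described above; the SharedReserveStore class and its record and conflicting_port methods are illustrative names rather than an actual on-disk format.

from typing import Optional


class SharedReserveStore:
    """Centrally stored reserve information with one segment per (port, region) pair (illustrative)."""

    def __init__(self, ports: list, regions: list) -> None:
        # None means no reservation status has been recorded in that segment.
        self.segments: dict = {(port, region): None for port in ports for region in regions}

    def record(self, port: str, region: str, status: Optional[str]) -> None:
        self.segments[(port, region)] = status

    def conflicting_port(self, region: str, requesting_port: str) -> Optional[str]:
        # Any other port with a recorded reservation on this region represents a conflict.
        for (port, reg), status in self.segments.items():
            if reg == region and port != requesting_port and status is not None:
                return port
        return None


store = SharedReserveStore(ports=["iPort1", "iPort2"], regions=["region0", "region1"])
store.record("iPort1", "region0", "write exclusive")
print(store.conflicting_port("region0", "iPort2"))  # iPort1
print(store.conflicting_port("region1", "iPort2"))  # None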
When a reservation conflict does not exist, the port that has received the reserve request sends a reserve intention notification to each of the other ports that export the volume at 418 and 420. Each of the ports receiving the reserve intention notification may update its own local reserve information at 422 and 424, respectively, to indicate that iPort1 has reserved the requested region(s) of the volume as identified in the reserve notification. The ports may also acknowledge the receipt of the notification message by sending an acknowledgement at 426 and 428, respectively. The port, iPort1, may then obtain a lock of the reserved region(s) at 430. For instance, the port may obtain a lock after an acknowledgement is received from each of the set of ports to which a reserve notification was sent. The port, iPort1, may then update the centrally located reserve information at 432 to indicate that it has reserved the portion of the volume. The port may also update locally maintained reserve information. In addition, iPort1 may send a reserve response to the host at 434 indicating whether the reservation was successful. In this manner, iPort1 may update the reserve information, whether the reserve information is stored locally and/or at a shared disk.
When other ports that export the volume receive a reserve request, as shown at 436, they perform a similar process to check whether reservations by other iPorts present a reservation conflict at 438. More particularly, this reserve information may be obtained from the shared disk. In this example, upon determining whether a reservation conflict exists, iPort2 sends a reserve response to the host at 440.
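The sequence performed by the requesting port (cf. 418-434 and 436-440) might be sketched as follows; the PeerPort class, the reserve function, and the use of an in-memory set as the central reserve information are simplifying assumptions made for illustration only.

class PeerPort:
    """A port exporting the volume that receives reserve intention notifications (illustrative)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.local_info: set = set()

    def notify_reserve_intention(self, volume: str, region: str, owner: str) -> bool:
        self.local_info.add((volume, region, owner))  # update local reserve information (422/424)
        return True                                   # acknowledgement (426/428)


def reserve(volume: str, region: str, requester: str, peers: list, central_info: set) -> bool:
    # Check the centrally stored reserve information for a conflicting reservation.
    if any(v == volume and r == region for v, r, _ in central_info):
        return False                                  # reservation conflict (cf. 438/440)
    acks = [peer.notify_reserve_intention(volume, region, requester) for peer in peers]
    if not all(acks):
        return False                                  # a peer reported a conflict
    # The lock is obtained only after every acknowledgement has been received (430).
    central_info.add((volume, region, requester))     # update the central reserve information (432)
    return True                                       # reserve response to the host (434)


central_info: set = set()
peers = [PeerPort("iPort2"), PeerPort("iPort3")]
print(reserve("Volume1", "region0", "iPort1", peers, central_info))  # True: reservation succeeds
print(reserve("Volume1", "region0", "iPort2", [], central_info))     # False: conflict detected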
While it is possible to send messages such as notifications directly to other ports, it may be desirable to send messages via an arbitrator. The arbitrator may be implemented, for example, on a separate network device.
The arbitrator is adapted for “managing” reserve requests received by a plurality of ports. More particularly, upon receiving the reserve intention request from iPort1, the arbitrator may determine whether a reservation conflict exists at 516. In other words, the arbitrator determines whether another iPort has performed a conflicting reservation. The reserve information may be stored locally by the arbitrator and/or on a separate network device. Assuming that no conflict exists, the arbitrator transmits a reserve intention notification to a set of one or more ports exporting the volume as shown at 518 and 520. Upon receiving the reserve intention notification, each port may update local reserve information as shown at 522 and 524 to indicate that a particular region or regions of a volume have been reserved. This local information may also indicate the port that has reserved these region(s). Each of these ports may also send an acknowledgement, as shown at 526 and 528, respectively.
The arbitrator records the reservation of the port, iPort1, of the specified region(s) at 530. The reserve information may identify the port that has reserved the specified region(s). In this example, the arbitrator waits to update the reserve information until it receives the acknowledgements from each of the ports to which it has sent a reserve intention notification. In addition, after it has received the acknowledgements, it sends an acknowledgement to the port that has received the reserve request, iPort1, at 532.
Upon receipt of the acknowledgement from the arbitrator, the port, iPort1, obtains a lock of the requested portion(s) of the volume at 534 and may update its local reserve information. The port may then send a reserve response at 536 indicating whether the reservation was successful.
When another port that exports the volume, such as iPort2, receives a reserve request at 538, it checks the reserve information to determine whether a reservation conflict exists at 540. In this example, it ascertains whether the reservation by iPort1 conflicts with the currently requested reservation. The port, iPort2, may then send a reserve response at 542 indicating whether the reservation was successful. In this example, the port, iPort2, notifies the host that a conflict exists, and the port does not proceed to notify the arbitrator of the requested reservation.
When the host sends a release request to iPort1 at 544 indicating that it intends to release the lock it previously obtained of the portion(s) of the volume, iPort1 sends a release notification to the arbitrator at 546 indicating a release of the lock of the portion(s) of the volume. In addition, the port, iPort1, releases the lock at 547 and may update its local reserve information. The arbitrator also updates the reserve information to indicate that the lock has been released at 548. The port, iPort1, may also send an acknowledgement of the release of the lock to the host at 550.
Upon receiving the release notification, the arbitrator sends a release notification to a set of one or more ports exporting the volume as shown at 552 and 554. Each of these ports may update its locally maintained reserve information at 556 and 558, respectively. In this manner, the ports may have access to reserve information enabling them to handle subsequent reserve requests appropriately.
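The arbitrator's handling of reserve intention and release notifications (cf. 516-532 and 546-558) might be sketched as follows; the Arbitrator and NotifiedPort classes and their methods are hypothetical and merely illustrate the fan-out of notifications, the wait for acknowledgements, and the updating of reserve information.

class NotifiedPort:
    """A port exporting the volume that receives notifications from the arbitrator (illustrative)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.local_info: set = set()

    def on_reserve_notification(self, region: str, owner: str) -> bool:
        self.local_info.add((region, owner))      # update local reserve information (522/524)
        return True                               # acknowledgement (526/528)

    def on_release_notification(self, region: str, owner: str) -> None:
        self.local_info.discard((region, owner))  # update local reserve information (556/558)


class Arbitrator:
    def __init__(self, exporting_ports: list) -> None:
        self.ports = exporting_ports
        self.reserve_info: set = set()

    def reserve_intention(self, region: str, owner: str) -> bool:
        if any(r == region for r, _ in self.reserve_info):
            return False                          # conflicting reservation exists (516)
        acks = [p.on_reserve_notification(region, owner)
                for p in self.ports if p.name != owner]
        if not all(acks):
            return False
        self.reserve_info.add((region, owner))    # recorded only after all acknowledgements (530)
        return True                               # acknowledgement to the requesting port (532)

    def release(self, region: str, owner: str) -> None:
        self.reserve_info.discard((region, owner))          # update reserve information (548)
        for p in self.ports:
            if p.name != owner:
                p.on_release_notification(region, owner)    # release notifications (552/554)


ports = [NotifiedPort("iPort2"), NotifiedPort("iPort3")]
arbitrator = Arbitrator(ports)
print(arbitrator.reserve_intention("region0", "iPort1"))    # True: requester may obtain the lock (534)
print(arbitrator.reserve_intention("region0", "iPort2"))    # False: reservation conflict
arbitrator.release("region0", "iPort1")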
An arbitrator may be implemented in a variety of manners. For instance, an arbitrator may be associated with a volume or set of volumes, and therefore support those ports that export the corresponding volume(s). Moreover, an arbitrator may be implemented via a network device or port of a network device.
Assuming that the arbitrator has not identified a reservation conflict, the arbitrator sends a reserve intention notification to a set of ports exporting the volume at 618 and 620. In this example, the arbitrator sends a reserve intention notification identifying the reserved portion(s) of the volume to iPort3 and iPortn. The arbitrator need not notify the requesting port, iPort2. The ports receiving the reserve intention notifications may record the reservation of the portion(s) of the volume in their reserve information at 622 and 624, and may also send an acknowledgement at 626 and 628, respectively. Thereafter, the ports, iPort3 and iPortn, may process reserve requests appropriately by accessing the reserve information that has been recorded.
If acknowledgements are transmitted, the arbitrator waits for acknowledgements from the ports to which reserve intention notifications have been sent before recording the reservation of the portion(s) of the volume at 630 and sending an acknowledgement to the requesting port, iPort2, at 632. Of course, where acknowledgements are not supported, the arbitrator may assume that the ports have received and processed the notifications.
Upon receiving the acknowledgement from the arbitrator, iPort2 obtains a lock of the requested portion(s) of the volume at 634. The port, iPort2, may also send a reserve response at 636 to the host indicating whether the reservation was successful.
When another port that exports the volume, iPort3, receives a reserve request at 638, it may check its reserve information to determine whether a reservation conflict exists at 640. The port receiving the second reserve request, iPort3, may then send a reserve response indicating whether the reservation was successful to the host at 642. In this example, since a conflict exists, iPort3 does not continue to send a reserve intention notification to the arbitrator, iPort1.
When the host sends a release request to iPort2 at 644, iPort2 sends a release notification indicating that a lock of the previously requested portion(s) of the volume is being released to the arbitrator, iPort1, at 646. The port, iPort2, releases the lock at 648 and may also update its reserve information (e.g., local reserve information) accordingly. In addition, the arbitrator may also update its local reserve information at 650. Upon release of the lock, iPort2 may send a release acknowledgement to the host at 652 indicating whether the release of the lock has been performed.
Once the release notification has been received by the arbitrator, iPort1, the arbitrator sends a release notification to the ports to which it previously sent reserve intention notifications. More particularly, these ports may include the ports that export the volume, but need not include the port that initiated the reserve request, iPort2. Thus, the arbitrator sends a release notification to ports iPort3 and iPortn at 654 and 656, respectively. The ports, iPort3 and iPortn, may also update their local reserve information accordingly at 658 and 660, respectively.
In accordance with one embodiment, an arbitrator is associated with a volume or set of volumes. In this manner, a different arbitrator may be associated with a different volume or set of volumes. Moreover, each arbitrator may be implemented at a different port. For instance, each arbitrator may be implemented by a master port as set forth in patent application Ser. No. 10/056,238, Attorney Docket No. ANDIP003, entitled “Methods and Apparatus for Implementing Virtualization of Storage in a Storage Area Network,” by Edsall et al., filed on Jan. 23, 2002, which is incorporated herein by reference for all purposes. In other words, a master port may be associated with a volume or set of volumes.
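A trivial sketch of such a per-volume association is shown below; the ARBITRATOR_FOR_VOLUME table and the route_reserve_intention helper are assumptions used only to illustrate how a port receiving a reserve request might locate the arbitrator for the identified volume.

# Hypothetical association of each volume with its arbitrator port.
ARBITRATOR_FOR_VOLUME = {
    "Volume1": "iPort1",
    "Volume2": "iPort2",
}


def route_reserve_intention(receiving_port: str, volume: str) -> str:
    arbitrator = ARBITRATOR_FOR_VOLUME[volume]
    if arbitrator == receiving_port:
        return f"{receiving_port} arbitrates the reservation for {volume} itself"
    return f"{receiving_port} forwards the reserve intention for {volume} to {arbitrator}"


print(route_reserve_intention("iPort2", "Volume1"))  # forwarded to iPort1
print(route_reserve_intention("iPort2", "Volume2"))  # handled by iPort2 as arbitrator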
In this example, the host sends a reserve request identifying one or more portion(s) of Volume 1 to iPort2 at 712. Although iPort2 is an arbitrator for Volume 2, it is not an arbitrator for Volume 1. As a result, iPort2 sends a reserve intention notification identifying the portion(s) of Volume 1 to the arbitrator for Volume 1, iPort1, as shown at 714. The arbitrator for Volume 1, iPort1, determines whether a reservation conflict exists at 716 by determining whether another port has performed a conflicting reserve. Assuming that no conflict exists, iPort1 may record the reservation of the portion(s) of Volume 1 at 718. In addition, the arbitrator for Volume 1, iPort1, may send an acknowledgement to iPort2 indicating that no reservation conflict exists as shown at 720.
Upon receipt of the acknowledgement from the arbitrator, the port responsible for reserving the requested portion(s) of Volume 1, iPort2, sends a reserve intention notification to a set of ports that export the volume, Volume 1. A reserve intention notification is sent to the remaining ports, iPort3 and iPortn, that export Volume 1 and are not yet aware of the reservation, as shown at 722 and 724, respectively. The ports iPort3 and iPortn may then record the reservation of the portion(s) of Volume 1 as shown at 726 and 728, respectively. For instance, local reserve information may identify the portion(s) of Volume 1 and/or the port requesting the reservation, iPort2. Upon receiving the reserve intention notification and/or recording the reservation, the ports, iPort3 and iPortn, may send an acknowledgement to iPort2, as shown at 730 and 732, respectively. The requesting port, iPort2, obtains a lock of the requested portion(s) of Volume 1 at 734 and may also record the reservation in its local reserve information. In accordance with one embodiment, the requesting port waits to obtain the lock until it has received acknowledgements from each of the ports that have received reserve intention notifications. The requesting port, iPort2, may also send a reserve response to the host at 736 indicating whether the reservation was successful.
The host may also wish to reserve portion(s) of Volume 2. In this example, the host sends a reserve request at 738 identifying portion(s) of Volume 2 to iPort3. Although iPort3 exports Volume 2, it is not the arbitrator for Volume 2. As a result, iPort3 sends a reserve intention notification identifying the requested portion(s) of Volume 2 to the arbitrator for Volume 2, iPort2, as shown at 740. The arbitrator for Volume 2, iPort2, determines whether a reservation conflict exists at 742 by determining whether another port has performed a conflicting reserve. Assuming that no conflict exists, iPort2 may record the reservation of the portion(s) of Volume 2 at 744. In addition, the arbitrator for Volume 2, iPort2, may send an acknowledgement to iPort3 indicating that no reservation conflict exists as shown at 746.
Upon receipt of the acknowledgement from the arbitrator for Volume 2, the port responsible for reserving the requested portion(s) of Volume 2, iPort3, sends a reserve intention notification to a set of ports that export the volume, Volume 2. A reserve intention notification is sent to the remaining ports, iPort1 and iPortn, that export Volume 2 and are not yet aware of the reservation, as shown at 748 and 750, respectively. The ports iPort1 and iPortn may then record the reservation of the portion(s) of Volume 2 as shown at 752 and 754, respectively. For instance, local reserve information may identify the portion(s) of Volume 2, as well as the port requesting the reservation, iPort3. Upon receiving the reserve intention notification and/or recording the reservation, the ports, iPort1 and iPortn, may send an acknowledgement to iPort3, as shown at 756 and 758, respectively. The requesting port, iPort3, obtains a lock of the requested portion(s) of Volume 2 at 760 and may also record the reservation in its local reserve information. In accordance with one embodiment, the requesting port waits to obtain the lock until it has received acknowledgements from each of the ports that have received reserve intention notifications. The requesting port, iPort3, may also send a reserve response to the host at 762 indicating whether the reservation was successful.
When the host wishes to release the lock of the portion(s) of Volume 1, it sends a release request. In this example, the release request is sent to iPort2 at 764. Even though iPort2 exports Volume 1, it is not the arbitrator for Volume 1. As a result, iPort2 sends a release notification to the arbitrator for Volume 1, iPort1, as shown at 766. The arbitrator for Volume 1, iPort1, updates its reserve information (e.g., maintained locally and/or at a separate location) to indicate that the lock has been released at 768. In addition, the remaining ports, iPort3 and iPortn, are sent a release notification at 770 and 772, respectively. The release notifications may be sent by the arbitrator or by iPort2 (as shown in this example). The ports iPort3 and iPortn may update their reserve information accordingly at 774 and 776, respectively. The port that received the release request, iPort2, may send a release acknowledgement to the host at 778 confirming that the lock has been released.
In the above-described embodiments, various operations relating to acquiring and releasing locks are described. In addition, operations relating to accessing and modifying reserve information, as well as sending and receiving corresponding reserve and release notification messages are set forth. However, it is important to note that these examples are merely illustrative, and therefore other operations and corresponding notifications are contemplated. Moreover, the disclosed embodiments may be implemented using a variety of message types.
Various switches within a storage area network may be virtualization switches supporting virtualization functionality.
When the virtualization intercept switch 806 determines that the address specified in an incoming frame pertains to access of a virtual storage location rather than a physical storage location, the frame is processed by a virtualization processor 808 capable of performing a mapping function such as that described above. More particularly, the virtualization processor 808 obtains a virtual-physical mapping between the one or more physical storage locations and the virtual storage location. In this manner, the virtualization processor 808 may look up either a physical or virtual address, as appropriate. For instance, it may be necessary to perform a mapping from a physical address to a virtual address or, alternatively, from a virtual address to one or more physical addresses.
Once the virtual-physical mapping is obtained, the virtualization processor 808 may then employ the obtained mapping to either generate a new frame or modify the existing frame, thereby enabling the frame to be sent to an initiator or a target specified by the virtual-physical mapping. For instance, a frame may be replicated multiple times in the case of a mirrored write. This replication requirement may be specified by a virtual-physical mapping function. In addition, the source address and/or destination addresses are modified as appropriate. For instance, for data from the target, the virtualization processor replaces the source address, which was originally the physical LUN address, with the corresponding virtual LUN and virtual address.
In the destination address, the port replaces its own address with that of the initiator. For data from the initiator, the port changes the source address from the initiator's address to the port's own address. It also changes the destination address from the virtual LUN/address to the corresponding physical LUN/address. The new or modified frame may then be provided to the virtualization intercept switch 806 to enable the frame to be sent to its intended destination.
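The address rewriting described above might be sketched as follows; the Frame record and the rewrite_from_initiator and rewrite_from_target helpers are illustrative only and do not reflect actual Fibre Channel frame formats or the virtualization processor 808 itself.

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    src: str   # source address (e.g., initiator, port, or physical LUN address)
    dst: str   # destination address
    data: bytes = b""


def rewrite_from_initiator(frame: Frame, port_addr: str, plun_addr: str) -> Frame:
    # Initiator-to-target direction: substitute the port's own address for the initiator's
    # source address, and the physical LUN/address for the virtual LUN/address destination.
    return replace(frame, src=port_addr, dst=plun_addr)


def rewrite_from_target(frame: Frame, vlun_addr: str, initiator_addr: str) -> Frame:
    # Target-to-initiator direction: substitute the virtual LUN/address for the physical LUN
    # source address, and the initiator's address for the port's own destination address.
    return replace(frame, src=vlun_addr, dst=initiator_addr)


inbound = Frame(src="initiator", dst="VLUN1", data=b"A")
to_target = rewrite_from_initiator(inbound, port_addr="iPort1", plun_addr="PLUN1")
reply = Frame(src="PLUN1", dst="iPort1")
to_initiator = rewrite_from_target(reply, vlun_addr="VLUN1", initiator_addr="initiator")
print(to_target)
print(to_initiator)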
While the virtualization processor 808 obtains and applies the virtual-physical mapping, the frame or associated data may be stored in a temporary memory location (e.g., buffer) 810. In addition, it may be necessary or desirable to store data that is being transmitted or received until it has been confirmed that the desired read or write operation has been successfully completed. As one example, it may be desirable to write a large amount of data to a virtual LUN, which must be transmitted separately in multiple frames. It may therefore be desirable to temporarily buffer the data until confirmation of receipt of the data is received. As another example, it may be desirable to read a large amount of data from a virtual LUN, which may be received separately in multiple frames. Furthermore, this data may be received in an order that is inconsistent with the order in which the data should be transmitted to the initiator of the read command. In this instance, it may be beneficial to buffer the data prior to transmitting the data to the initiator to enable the data to be re-ordered prior to transmission. Similarly, it may be desirable to buffer the data in the event that it becomes necessary to verify the integrity of the data that has been sent to an initiator (or target).
The new or modified frame is then received by a forwarding engine 812, which obtains information from various fields of the frame, such as source address and destination address. The forwarding engine 812 then accesses a forwarding table 814 to determine whether the source address has access to the specified destination address. More specifically, the forwarding table 814 may include physical LUN addresses as well as virtual LUN addresses. The forwarding engine 812 also determines the appropriate port of the switch via which to send the frame, and generates an appropriate routing tag for the frame.
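A simplified sketch of the forwarding step is shown below; the FORWARDING_TABLE contents, the access check, and the routing tag format are assumptions made purely for illustration.

# destination address -> (egress port, set of source addresses allowed to reach it)
FORWARDING_TABLE = {
    "PLUN1": ("port3", {"iPort1", "iPort2"}),
    "VLUN1": ("port1", {"initiator"}),
}


def forward(src: str, dst: str):
    entry = FORWARDING_TABLE.get(dst)
    if entry is None or src not in entry[1]:
        return None                  # the source has no access to the specified destination
    egress_port, _ = entry
    return {"egress_port": egress_port, "routing_tag": f"{egress_port}:{dst}"}


print(forward("iPort1", "PLUN1"))  # forwarded with a routing tag
print(forward("host9", "PLUN1"))   # None: access denied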
Once the frame is appropriately formatted for transmission, the frame will be received by a buffer queuing block 816 prior to transmission. Rather than transmitting frames as they are received, it may be desirable to temporarily store a frame in a buffer or queue 818. For instance, it may be desirable to temporarily store a frame, based upon Quality of Service, in one of a set of queues that each correspond to a different priority level. The frame is then transmitted via switch fabric 820 to the appropriate port. As shown, the outgoing port has its own MAC block 822 and bi-directional connector 824 via which the frame may be transmitted.
One or more ports of the virtualization switch (e.g., those ports that are intelligent virtualization ports) may implement the disclosed SCSI reserve functionality. For instance, the virtualization processor 808 of a port that implements virtualization functionality may also perform SCSI reserve functionality such as that disclosed herein. Of course, this example is merely illustrative. Therefore, it is important to note that a port or network device that implements SCSI reserve functionality may be separate from a port or network device that implements virtualization functionality.
As described above, all switches in a storage area network need not be virtualization switches. In other words, a switch may be a standard switch in which none of the ports implement “intelligent” virtualization functionality.
As described above, the present invention may be implemented, at least in part, by a virtualization switch. Virtualization is preferably performed on a per-port basis rather than per switch. Thus, each virtualization switch may have one or more virtualization ports that are capable of performing virtualization functions, as well as ports that are not capable of such virtualization functions. In one embodiment, the switch is a hybrid, with a combination of line cards such as those described above.
Although particular network devices are described above, these examples are merely illustrative, and the disclosed embodiments may be implemented in other network devices and configurations.
Although illustrative embodiments and applications of this invention are shown and described herein, many variations and modifications are possible which remain within the concept, scope, and spirit of the invention, and these variations would become clear to those of ordinary skill in the art after perusal of this application. Moreover, the present invention would apply regardless of the context and system in which it is implemented. Thus, broadly speaking, the present invention need not be performed using the operations or data structures described above.
In addition, although an exemplary switch is described, the above-described embodiments may be implemented in a variety of network devices (e.g., servers) as well as in a variety of mediums. For instance, instructions and data for implementing the above-described invention may be stored on a disk drive, a hard drive, a floppy disk, a server computer, or a remotely networked computer. Accordingly, the present embodiments are to be considered as illustrative and not restrictive, and the invention is not to be limited to the details given herein, but may be modified within the scope and equivalents of the appended claims.