Many critical UNIX systems have very high availability requirements. These systems must also be kept patched to the current operating system level to minimize the risk of a service outage caused by a known issue, and patches are often released daily. Unfortunately, these two requirements conflict: patching introduces downtime, often excessive downtime. Current patching practice also has weaknesses in the patch removal process, which introduces still more downtime on systems that cannot tolerate it.
Eliminating the downtime required for applying patches would be ideal, but it is unattainable due to the nature of the underlying UNIX-based operating system and the fact that many patches require a reboot. There is therefore a need to reduce downtime to just the boot time required for this most intrusive of operations.
The current invention addresses the needs present in the prior art.
The present invention is directed to a method and system for reducing downtime of a computer system during system maintenance. An operating environment is run on a primary boot disk while system maintenance is performed on a secondary boot disk. This system maintenance includes identifying patches to be applied to the system, queuing the patches to be applied, and applying the patches. A reboot is performed to the secondary boot disk while the primary boot disk is maintained as a back-up boot environment. Optionally, the primary boot disk may initially be mirrored to the secondary boot disk, or, the operating environment may initially be copied from the secondary boot disk to the primary boot disk.
FIG. 1a illustrates a common initial boot disk configuration for an embodiment of the present invention.
FIG. 1b illustrates a boot disk configuration for an embodiment of the present invention after mirrors have been broken.
FIG. 1c illustrates a boot disk configuration for an embodiment of the present invention after a Boot Environment has been created on an inactive disk.
FIG. 1d illustrates a boot disk configuration for an embodiment of the present invention after the system has been rebooted to a new Boot Environment.
FIG. 1e illustrates a boot disk configuration for an embodiment of the present invention after references to the original boot disk have been deleted.
FIG. 1f illustrates the recycling of a boot disk in an embodiment of the present invention.
FIG. 1g illustrates a boot disk configuration for an embodiment of the present invention after a boot disk has been recycled.
FIG. 1h illustrates a boot disk configuration for an embodiment of the present invention after a boot disk has been recycled.
FIG. 1i illustrates a boot disk configuration for an embodiment of the present invention after a boot disk has been freed up for use as a new Boot Environment.
FIG. 1j illustrates a boot disk configuration for an embodiment of the present invention where a new Boot Environment is created on an unused disk.
FIG. 1k illustrates a boot disk configuration for an embodiment of the present invention after a system has been rebooted to a modified Boot Environment.
FIGS. 1l and 1m illustrate a boot disk configuration for an embodiment of the present invention after a system has been rebooted to a modified Boot Environment.
FIG. 1n illustrates a boot disk configuration for an embodiment of the present invention with fully mirrored, redundant boot disks.
FIG. 2a is a flow chart of the operations that are performed in an embodiment of the present invention.
FIG. 2b is a flow chart of the operations that are performed in an embodiment of the present invention.
FIG. 2c is a flow chart of the operations that are performed in an embodiment of the present invention.
FIG. 2d is a flow chart of the operations that are performed in an embodiment of the present invention.
FIG. 2e is a flow chart of the operations that are performed in an embodiment of the present invention.
FIG. 2f is a flow chart of the operations that are performed in connection with system maintenance in an embodiment of the present invention.
Reference will now be made in detail to the preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings. It is to be understood that the figures and descriptions of the present invention included herein illustrate and describe elements that are of particular relevance to the present invention, while eliminating, for purposes of clarity, other elements. Those of ordinary skill in the art will recognize that other elements may be desirable and/or required in order to implement the present invention. However, such elements are well known in the art, and because they do not facilitate a better understanding of the present invention, a discussion of such elements is not provided herein.
The invention described herein allows a user to reduce downtime due to maintenance activities. Because only one boot disk has the patch or patches initially installed, the changes can be rolled back to a secondary boot disk. The invention also allows a system to be examined, with only patches applicable to that system being queued for application. The patches are then applied to a Boot Environment (“BE”) rather than the live operating system boot disk. The invention further allows a user to create and retain as many BEs as a system has spare local disks.
The invention described herein relates to eliminating downtime required for applying patches to a UNIX system. While the embodiment described herein specifically applies to a Sun Microsystems Solaris/Veritas environment, the invention can be equally applied to all variants of UNIX systems using appropriate variations on the commands described herein. Such variations will be known to those skilled in the art.
The exemplary embodiment of the present invention described herein operates in connection with a system that satisfies the following prerequisites:
FIG. 1a shows a common initial boot disk configuration for an embodiment of the present invention. The primary boot disk 100 is known as rootdisk, whereas the secondary boot disk 101 is known as disk01. The OpenBoot PROM may be configured, depending on the system, with aliases to allow a user to boot to either disk 100 or 101, and to boot to the other disk in case booting to the first one fails. Many systems will have the following definitions:
For this example, it is assumed that boot disks 100 and 101 are the only two disks available that can be used with the utility. Thus, one of the disks must be freed up. In this case, disk01 will be freed up, which entails the following steps:
The procedure to accomplish these steps is as follows:
3. Each of the volumes in the rootdg disk group has a second plex, or mirror, whose subdisks are located on the disk01 disk. Each of these plexes must be removed recursively so that the disk on which their subdisks reside can be removed.
FIG. 1b shows the result of taking these actions. At this point, a BE can be created on the unused disk 101 (the device name for this disk is c0t1d0s2) using the be_create command. In this example, the BE will be given the name vmupgrade:
# be_create --BE=vmupgrade --device=c0t1d0
The system may be configured to detect that this is the first time the utility is being run on this host, and may create a special BE configuration record for the original boot disk 100, named “orig”. This is simply a placeholder, and no other changes are required to the original boot disk. The system may also make changes to the OBP settings to make them consistent with the original disk's new BE name. This step may also be accomplished manually.
The be_create program may produce output giving a user the status, and may log more detailed information in /var/log/BE.log. The resulting BE may also be bootable as part of the creation procedure.
The result of these actions is shown in
vx-Bename.
At this point, the new BE can be mounted (at /.lbbe.vmupgrade/) and changes can be made to it that will be seen when the system is booted to the new BE. The BE can be mounted with the following command:
# be_mount --BE=vmupgrade
At this point, maintenance can be performed on the new BE, which is safe because disk 101 is inactive, while boot disk 100 is currently active. Modifications may be handled by various utility scripts, and these scripts have their own documentation that varies from release to release. Once the desired modifications have been made, the BE must be unmounted before attempting to boot to it:
# be_umount --BE=vmupgrade
The system can now be booted to the new BE at the user's convenience:
# reboot -- vx-vmupgrade
Depending on the system, rebooting twice in quick succession may be required. This may also be documented in the modification procedure and may appear as part of the upgrade script output. This is because the installation of the new VxVM product requires the first reboot to occur with the Volume Manager disabled, because entirely new VxVM devices have to be created by the new loadable kernel modules that come with the new Volume Manager. Immediately after this boot, the Volume Manager can be re-enabled and rebooted again. When the system comes back up, the Volume Manager will be active again. This expands the “reboot” command above to be:
# reboot -- vx-vmupgrade
<Wait for System to Reboot and log in as Root>
# cd /etc
# cp vfstab.vm vfstab
# cp system.vm system
# reboot -- vx-vmupgrade
<System will Reboot with the Volume Manager Enabled>
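The manual copy step between the two reboots could be wrapped in a small guard so that neither file is overwritten unless both saved copies exist. The following is only a sketch: the function name restore_vm_config and its optional directory argument are hypothetical and are not part of the utility described herein.

```shell
#!/bin/sh
# restore_vm_config [ETC_DIR]
# Copies the saved Volume Manager versions of vfstab and system back into
# place, mirroring the manual steps performed between the two reboots.
# ETC_DIR defaults to /etc; the argument exists only so the function can be
# rehearsed on a scratch directory (an assumption of this sketch).
restore_vm_config() {
    etc_dir=${1:-/etc}
    # Refuse to copy anything unless both saved files are present.
    for name in vfstab system; do
        if [ ! -f "$etc_dir/$name.vm" ]; then
            echo "missing $etc_dir/$name.vm; nothing copied" >&2
            return 1
        fi
    done
    cp "$etc_dir/vfstab.vm" "$etc_dir/vfstab" &&
    cp "$etc_dir/system.vm" "$etc_dir/system"
}
```

After the first reboot, a user logged in as root might run restore_vm_config and issue the second reboot command only if it succeeds.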
The results of taking these actions are shown in
Now the system is running on the new BE, and the original BE is inactive. However, to maintain this, a manual change is needed to the OpenBoot PROM settings. At this point, an OBP alias exists for both disks, but only one of them, the original disk, will ever be booted from automatically unless a manual change is made to the OBP settings. In the preferred embodiment, this setup is intentional, as there is always a chance that the new BE may not work properly on any given system, and it is preferable to require explicitly booting to the new BE until it can be ensured that it boots properly. At this point, the OBP boot-device setting should look as follows:
boot-device=vx-rootdisk vx-rootmirror
Since the vx-rootmirror was destroyed earlier, it may be removed. Also, a user wanting to boot to the new BE by default will need to add its alias (vx-vmupgrade) to the front of the OBP boot-device setting with a command such as the following:
# eeprom boot-device="vx-vmupgrade vx-rootdisk"
At this point, when a standard reboot, init 6, or other system restart command is issued, the system will attempt to boot to the new BE disk; if this fails, the original boot disk will be used.
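Because the order of the aliases in boot-device determines which disk is tried first and which serves as the fallback, the new value can be assembled mechanically. The following sketch uses a hypothetical helper, new_boot_device, which is not part of the described utility. It places the new alias first, keeps the remaining aliases in order, and drops any alias named as stale. It only prints the resulting eeprom command and does not execute it.

```shell
#!/bin/sh
# new_boot_device NEW_ALIAS CURRENT_LIST [STALE_ALIAS...]
# Prints a boot-device value with NEW_ALIAS first, followed by the entries
# of CURRENT_LIST in their original order, minus NEW_ALIAS itself and any
# STALE_ALIAS arguments.
new_boot_device() {
    new=$1
    current=$2
    shift 2
    result=$new
    for entry in $current; do
        keep=yes
        if [ "$entry" = "$new" ]; then keep=no; fi
        for stale in "$@"; do
            if [ "$entry" = "$stale" ]; then keep=no; fi
        done
        if [ "$keep" = yes ]; then result="$result $entry"; fi
    done
    printf '%s\n' "$result"
}

# Dry run: show the eeprom command rather than executing it.
value=$(new_boot_device vx-vmupgrade "vx-rootdisk vx-rootmirror" vx-rootmirror)
echo "eeprom boot-device=\"$value\""
```

Keeping the previous alias in second position preserves the fallback behavior described above.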
In the preferred embodiment, it is typical to run the system this way for at least a few days before recycling the original boot disk to be used as a mirror of the new BE. The reasons for doing this are as follows:
Once the new BE's stability has been verified, the original boot disk 100 may be recycled so that it can be used as a mirror for the new BE. This requires deleting the “orig” BE that was specially created around the original boot disk 100. All references to the original boot disk must be deleted from the OBP. Although it is possible to accomplish this in one command, for this example, the following steps will be used:
# be_delete --BE=orig
<Save a Copy of the Device Pointed to by the OBP Alias vx-rootdisk in /tmp/recycled-device>
# /etc/vx/bin/vxeeprom devalias vx-rootdisk > /tmp/recycled-device
# /etc/vx/bin/vxeeprom devunalias vx-rootdisk
# eeprom boot-device="vx-vmupgrade"
The results of running these commands are shown in
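Before retiring a boot disk, it may be useful to review the exact command sequence. The following sketch prints that sequence without executing anything. The function name recycle_plan and its argument order are hypothetical. The printed commands themselves are the ones given in the steps above.

```shell
#!/bin/sh
# recycle_plan OLD_BE OLD_ALIAS NEW_ALIAS [SAVE_FILE]
# Prints, without executing, the steps that delete OLD_BE, save and remove
# its OBP alias, and leave NEW_ALIAS as the only default boot device.
# SAVE_FILE defaults to /tmp/recycled-device, as in the text.
recycle_plan() {
    old_be=$1
    old_alias=$2
    new_alias=$3
    save_file=${4:-/tmp/recycled-device}
    cat <<EOF
be_delete --BE=$old_be
/etc/vx/bin/vxeeprom devalias $old_alias > $save_file
/etc/vx/bin/vxeeprom devunalias $old_alias
eeprom boot-device="$new_alias"
EOF
}

# Print the plan for retiring the "orig" BE in favor of vx-vmupgrade.
recycle_plan orig vx-rootdisk vx-vmupgrade
```

The BE name and the alias are passed separately because the original boot disk is addressed here through the alias vx-rootdisk rather than an alias derived from the BE name orig.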
The original boot disk 100 is now ready for recycling. In this example, the standard Veritas Volume Manager techniques are used to carry out these actions, which are illustrated in
The final result of these commands is shown in
Subsequent BE creations on a system will now be considered. At the end of the last example, the state of the boot disk mirrors is displayed in
The procedure to accomplish these steps is as follows:
3. Each of the volumes in the rootdg disk group has a second plex, or mirror, whose subdisks are located on the disk01 disk. Each of these plexes must be removed recursively so that the disks on which the subdisks reside can be removed.
At this point, the new BE can be mounted (at /.lbbe.patch0903/) and changes can be made to it that will be seen when the system is booted to it. The BE can be mounted with the following command:
# be_mount --BE=patch0903
Modifications to this BE can be made, treating /.lbbe.patch0903/ as though it were the / directory. Once the desired modifications have been made, the BE must be unmounted before attempting to boot to it:
# be_umount --BE=patch0903
The new BE can be booted to at any time of the user's choosing:
# reboot -- vx-patch0903
The results of taking these actions are seen in
Now the system is running on the new BE, and the older BE is inactive. However, to maintain this, a manual change is needed to the OpenBoot PROM settings. At this point, an OBP alias exists for both disks, but only one of the disks will ever be automatically booted from—the original disk—unless a manual change is made to the OBP settings. This set up is intentional, as there is always a chance that the new BE may not work properly on any given system, and it is preferable to require explicitly booting to the new BE until the user is sure that it boots properly. At this point, the OBP boot-device setting should look like this: boot-device=vx-vmupgrade
If the new BE is to be booted to by default, its alias (vx-patch0903) must be added to the front of the OBP boot-device setting:
# eeprom boot-device="vx-patch0903 vx-vmupgrade"
Now, when a standard reboot, init 6, or other system restart command is issued, the system will attempt to boot to the vx-patch0903 disk; if this fails, the vx-vmupgrade boot disk will be used.
The system may be allowed to run this way for at least a few days before recycling the original boot disk to be used as a mirror of the new BE. The reasons for doing this are as follows:
Once the new BE has been determined to be stable over time, the user may want to recycle the vmupgrade boot disk, so that it can be used as a mirror for the new BE. This requires that the “vmupgrade” BE be deleted. All references to that boot disk must also be removed from the OBP, but the name of that device should be saved in a file (/tmp/recycled-device in this example), since a new alias for it will be created shortly thereafter. Although it is possible to accomplish the procedure of this paragraph with a single command, in this example the following steps are taken:
# be_delete --BE=vmupgrade
<Save a Copy of the Device Pointed to by the OBP Alias vx-vmupgrade in /tmp/recycled-device>
# /etc/vx/bin/vxeeprom devalias vx-vmupgrade > /tmp/recycled-device
# /etc/vx/bin/vxeeprom devunalias vx-vmupgrade
# eeprom boot-device="vx-patch0903"
This is illustrated in
The former vmupgrade boot disk is now ready for recycling. In this example, the standard Veritas Volume Manager techniques are used to carry out these actions:
It is advantageous that a system have a minimum of three boot disks available, to support the creation of a new BE, while maintaining a mirror of the original boot disk. The present invention will support the use of any number of BE disks, but requires that any boot disk mirror be broken before a new BE can be created. Once a BE is created, the mirror can be restored.
Creating a new BE requires:
The term locally attached disk means any type of disk that is not on the Storage Area Network. This means that any internal drive (including FC-AL drives in systems such as the Sun V880), drives in external drive bays (such as the Sun D130), or drives in external drive arrays that are directly connected with a SCSI cable can be used as BE disks. Note that many smaller systems, such as Sun Netras, have only two internal drives, and these drives are usually mirrors of one another. In these cases, the present invention can still be used, but the mirror will have to be destroyed and used as the new BE. Once the new BE is validated to work well, the original BE can be removed, and used as the new mirror. If a system has only one disk, the present invention cannot be used on it.
In another embodiment of the present invention, the disk device c0t1d0s2 (an internal 36 GB disk on this Sun E420) is to be used as a new BE. First, it is confirmed that this disk is not currently under Veritas Volume Manager control, by using the vxdisk list command, and noting that the “DISK” and “GROUP” columns have dashes (-) in them:
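A check of this kind could also be scripted. The following sketch reads vxdisk list output on standard input and succeeds only when the named device shows a dash in both the DISK and GROUP columns. The function name disk_is_free is hypothetical, and the sketch assumes the column order DEVICE, TYPE, DISK, GROUP, STATUS.

```shell
#!/bin/sh
# disk_is_free DEVICE
# Reads "vxdisk list"-style output on standard input and succeeds only if
# DEVICE is listed with "-" in both the DISK and GROUP columns.
# Assumes the column order DEVICE TYPE DISK GROUP STATUS.
# Fails if DEVICE does not appear in the input at all.
disk_is_free() {
    awk -v dev="$1" '
        $1 == dev { found = 1; free = ($3 == "-" && $4 == "-") }
        END { if (found && free) exit 0; exit 1 }
    '
}
```

For example, the output of vxdisk list could be piped into disk_is_free c0t1d0s2 before be_create is run against that device.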
Next, a name for the new BE has to be chosen, in this example, patch1202.
Finally, in this embodiment of the present invention, templates are used to automatically provide a guideline for sizing the / and /local/0 filesystem volumes, as well as the swap volume. The values in the templates are based upon the size of the drive on which the new BE is being created. At this time, the /usr and /var filesystems (and any other filesystems that are located on the current boot disk only) are collapsed into the / filesystem, but /local/0 is kept separate.
Under normal circumstances, collapsing the filesystems is perfectly acceptable, as this is the recommended configuration from Sun and Veritas since Solaris 2.5 was released. As an added protection, in a preferred embodiment of the present invention, UFS logging is automatically turned on for all UFS filesystems on newly created BEs, for Solaris 7 and later. UFS logging can actually improve filesystem performance, and will prevent the need for an fsck of these filesystems if the system should crash in the future.
With this information, the command to actually create the new BE can now be issued:
# be_create --BE=patch1202 --device=c0t1d0
The first time be_create is run on a system, the current boot disk is given a default BE name of orig. However, in the described embodiment, no OpenBoot PROM alias is created for this BE.
By default, a newly created BE is bootable. The creation process also creates an OpenBoot PROM alias to help boot to the new BE. The alias created will be of the form: vx-{BEname}. Thus, if the new BE's name is patch1202, the OpenBoot PROM alias for it would be: vx-patch1202.
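The naming conventions can be captured in two small helpers, sketched below. The function names are hypothetical and are not part of the described utility. The alias form vx-{BEname} is as stated above, and the mount point form follows the /.lbbe.vmupgrade/ and /.lbbe.patch0903/ paths used in the earlier examples.

```shell
#!/bin/sh
# Hypothetical helpers that derive the names associated with a Boot
# Environment (BE) from the BE name alone.

# OpenBoot PROM alias: vx-{BEname}
be_alias() {
    printf 'vx-%s\n' "$1"
}

# Directory under which the BE is mounted: /.lbbe.{BEname}
be_mount_point() {
    printf '/.lbbe.%s\n' "$1"
}

# Example: names for the BE called patch1202
be_alias patch1202
be_mount_point patch1202
```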
Creating a BE only makes it bootable and creates an alias—it does not change the default boot disk. This is done to prevent the loss of the original default boot device, and to make it easy to boot back to the original BE, if this is needed. To change the default boot device to the BE just activated, a command, such as the one following, can be run inside the Solaris OS:
# eeprom boot-device=vx-patch1202
Alternatively, the following command can be run at the OpenBoot PROM prompt:
ok setenv boot-device vx-patch1202
Once created, a BE can be mounted at a predetermined mount point, so that its contents can manually be altered. This mount point is of the form: /.lbbe.{BEname}. Thus, if the BE name is patch1202, it will be mounted under /.lbbe.patch1202. Note that all of the filesystems that are listed in /etc/vfstab on that BE disk and are physically located on the BE disk are mounted, not just the root (/) filesystem. Some reasons for mounting an inactive BE are:
A package that supports the use of the -R <alternate root> option can be installed on a mounted BE. Note that some packages do not support the use of the -R <alternate root> option. Example: Install the package LBabc on the BE named patch1202. Note that this BE has to be mounted before this action can be performed:
# pkgadd -R /.lbbe.patch1202 LBabc
A package that supports the use of the -R <alternate root> option can be removed from a mounted BE. Note that some packages do not support the use of the -R <alternate root> option. Example: Remove the package LBabc from the BE named patch1202. Note that this BE has to be mounted before this action can be performed:
# pkgrm -R /.lbbe.patch1202 LBabc
If a BE is mounted and administrative work is carried out on it, it must be subsequently unmounted to make it bootable again.
Example: Unmount the BE Named Patch1202
# be_umount --BE=patch1202
The BE is now unmounted and the /.lbbe.patch1202 directory is removed. Note that the BE mounting process makes the BE unbootable until the BE is unmounted again.
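Because a mounted BE is unbootable, it can be helpful to guarantee that the unmount always runs, even when the administrative command fails. The following sketch shows one way to do so. The wrapper name with_be_mounted and the BE_MOUNT_CMD and BE_UMOUNT_CMD override variables are hypothetical. By default the wrapper calls the be_mount and be_umount commands shown above.

```shell
#!/bin/sh
# with_be_mounted BE_NAME COMMAND [ARG...]
# Mounts the BE, runs COMMAND, and then unmounts the BE whether or not
# COMMAND succeeded. Returns COMMAND's exit status, or 1 if the mount or
# unmount step itself fails. BE_MOUNT_CMD and BE_UMOUNT_CMD default to
# be_mount and be_umount; they can be overridden (an assumption of this
# sketch) so the flow can be rehearsed with stub commands.
with_be_mounted() {
    be=$1
    shift
    ${BE_MOUNT_CMD:-be_mount} --BE="$be" || return 1
    status=0
    "$@" || status=$?
    ${BE_UMOUNT_CMD:-be_umount} --BE="$be" || return 1
    return "$status"
}
```

For example, with_be_mounted patch1202 pkgadd -R /.lbbe.patch1202 LBabc would install the package and leave the BE unmounted afterwards, regardless of whether pkgadd succeeded.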
It is sometimes desirable to obtain a list of the patches that could be applied to a BE before actually applying them.
Example: Get a Patch Report for the BE Named Patch1202.
# be_patch --BE=patch1202 --report
Note that the be_patch command automatically mounts the named BE before producing a patch report for a BE. It also automatically unmounts the BE before completing.
Certain patches are available for application to a BE via the PatchManager framework. The be_patch tool examines the BE and compares it against the latest approved list of patches for the OS version loaded on the BE and the characteristics of the BE's loaded packages. A patch list customized for this BE is constructed, and can be applied to the BE. It is also possible to obtain a patch report for the BE with the --report option.
Example: Apply the Latest Patches for the BE Named Patch1202, and Get a Patch Report as well.
# be_patch --BE=patch1202 --apply --report
Note that the be_patch command automatically mounts the named BE before applying patches to a BE. It also automatically unmounts the BE before completing.
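The comparison of installed patches against an approved list can be illustrated with a short sketch. This is not the be_patch implementation, whose internals are not given here, and the function name classify_patches is hypothetical. The sketch assumes that each list is a file containing one patch ID per line in the Solaris BASE-REV form, and it labels each approved patch as current, update, or new.

```shell
#!/bin/sh
# classify_patches INSTALLED_FILE APPROVED_FILE
# Each file holds one patch ID per line in the form BASE-REV.
# For every approved patch, prints the ID followed by one of:
#   current  the same or a later revision is already installed
#   update   an earlier revision of the same patch is installed
#   new      no revision of the patch is installed
classify_patches() {
    awk -F- '
        # First file: remember the highest installed revision per base ID.
        FILENAME == ARGV[1] {
            if (!($1 in rev) || $2 + 0 > rev[$1]) rev[$1] = $2 + 0
            next
        }
        # Second file: compare each approved patch with what is installed.
        !($1 in rev)     { print $0, "new"; next }
        $2 + 0 > rev[$1] { print $0, "update"; next }
                         { print $0, "current" }
    ' "$1" "$2"
}
```

Only the patches labeled update or new would need to be queued for application to the BE.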
If a BE no longer serves a purpose, or needs to be destroyed to make room for a new BE to be created, it can be destroyed.
Example: Delete the BE Named Patch1202.
# be_delete --BE=patch1202
The status of all BEs in a system can be determined by running the be_status command:
FIGS. 2a through 2f are flow charts illustrating preferred embodiments of methods of the present invention.
Referring to
Referring to
Referring to
Referring to
Referring to
Referring to
Following is a list of commands used in the present invention:
The system must already be under Veritas Volume Manager control.
The boot disk must already be encapsulated or initialized.
Any existing mirrors of the existing boot disk must be removed.
Options:
1. Current on this BE.
2. Are totally new to this host (no version of the patch has ever been applied).
3. Updates to patches that are currently applied.
Following is a list of utilities used with the present invention:
If given, the specified BE's driver will be checked to see if it needs an upgrade. If so, one will be done, and the configuration of the original driver (usually just the WWPN target numbers) will be copied into the new driver configuration file. The new driver configuration file will also be edited to conform to the EMC SAN fabric recommendations for that driver version (as they vary from version to version). If the driver is already at the proper version, its configuration will be compared against the EMC SAN fabric recommendations for that driver version, and any necessary corrections will be made.
If no options are given, the utility will assume that it is operating on the live boot disk, and will not attempt a driver upgrade. If the latest driver version is loaded though, it will validate and correct its configuration, if needed. The utility will then verify that the latest firmware is loaded on each card, and will upgrade it if needed. This has to be done in real-time, and causes a reset of each card. The utility verifies that all LUNs seen through that card are again visible to both the HBA driver and PowerPath (if in use) before proceeding to the next card's firmware upgrade.
vx35upgrade (/usr/LBBE/bootdiskmanager/bin/vx35upgrade)
This utility is a shell script in this release, and requires a BE name on which to operate. It will not operate on a live boot disk. If necessary, it will upgrade to the latest VxVM, VxFS, VEA, and PowerPath products. If these products are not already installed, it will not install them. They must already be installed when this utility is invoked.
This application claims the benefit of U.S. Provisional Patent Application No. 60/605,577, filed Aug. 30, 2004, the entirety of which is incorporated herein by reference.
Number | Date | Country
---|---|---
60605577 | Aug 2004 | US