1. Technical Field
The present invention generally relates to the fields of risk assessment and business continuity management for an organization. More particularly, the invention relates to systems and methods for analyzing risk associated with disruptions to business continuity and, moreover, to a framework for developing, maintaining, and using response plans to a business disruption.
2. Background Information
Maintaining normal business operations is critical for an organization. Disruptions that cause interruptions to normal business operations can cause severe financial losses. The length of a business disruption often correlates directly to the degree of loss resulting from the business disruption. Accordingly, business continuity (BC) is a high priority for many organizations. Business continuity management is, in part, based on a progression of measures aimed at recovering normal business operations of an organization after a business disruption occurs and minimizing the impact a business disruption can have on the operations of an organization.
To address these issues, organizations develop business continuity plans (BCP) to manage business continuity. These plans may include one or more processes for assessing risks, identifying critical resources, and/or monitoring the status of resources using readiness indicators. BCPs may also include processes for developing adequate recovery plans, producing reports, and/or conducting business continuity tests. Such processes may be implemented in whole, or in part, using computerized or software-based systems and components.
Despite these efforts, business continuity plans and tasks related to business continuity often lack consistency and specificity. For a large enterprise comprising multiple business units, including companies, groups, departments, branches, and/or offices, a business continuity plan of one business unit may be significantly different from that of another business unit. This disparity can cause confusion and may require unnecessary education of employees. Furthermore, with such disparity among various business units in a large organization, a manager of one business unit developing a response plan may not fully account for possible dependencies on other business units within the organization. This can result in inefficient, or even inadequate, response plans.
Furthermore, conventional business continuity plans do not provide a metric or other form of indicator for uniformly measuring the criticality of each business unit in an organization. Such indicators may be critical to determine, with specificity, which business units require response plans and/or to develop response plans that stress the recovery of more critical business units over non-critical business units in an organization.
In view of the foregoing, there is a need for improved systems and methods that minimize the impact resulting from a business disruption. There is also a need for systems and methods for assessing risk associated with business disruption and, more generally, there is a need for a framework for managing business continuity. For large organizations, there is also a need for a common, global framework encompassing all aspects of business continuity related to the organization as a whole.
Consistent with embodiments of the invention, systems and methods are provided for analyzing and/or managing risk associated with disruption(s) to business continuity. Embodiments of the invention also include systems and methods for providing a framework to develop, maintain, and use response plans to a business disruption. In certain embodiments, a common framework can be implemented that encompasses all aspects of business continuity related to a large organization as a whole.
In accordance with one embodiment, a computer-implemented system for responding to a business disruption using hierarchically ordered response plans is provided. The system comprises means for storing the response plans, which include at least two of a crisis management plan, a management summary plan, a department resilience plan, and a system resilience plan. The response plans also include escalation points. Further, the system includes an interface for allowing a user to access information on the response plans and the escalation points during normal business operation.
In certain embodiments, the computer-implemented system may also include means for maintaining the response plans using a business impact analysis. The system may further comprise means for estimating a set of time values for a business unit, the set of time values indicating points of time when a business impact of the business unit will increase; means for calculating a resilience impact rating of the business unit based on the estimated time values, wherein the resilience impact rating provides a metric for quantifying a time-criticality of the business unit; means for setting an impact threshold at a specific resilience impact rating value; and means for identifying the business unit as a time-critical business unit if the resilience impact rating is greater than or equal to the impact threshold. Additionally, the system may include means for producing paper printouts containing the response plans.
Consistent with another embodiment of the present invention, a computer-readable medium comprising instructions for causing a processor to execute a method for responding to a business disruption using hierarchically ordered response plans is provided. The method comprises storing the response plans, which include at least two of a crisis management plan, a management summary plan, a department resilience plan, and a system resilience plan. The response plans also include escalation points. Further, the method includes allowing a user to access, through an interface, information on the response plans and the escalation points during normal business operation.
In certain embodiments, the method may also include maintaining the response plans using a business impact analysis. The method may further comprise estimating a set of time values for a business unit, the set of time values indicating points of time when a business impact of the business unit will increase; calculating a resilience impact rating of the business unit based on the estimated time values, wherein the resilience impact rating provides a metric for quantifying a time-criticality of the business unit; setting an impact threshold at a specific resilience impact rating value; and identifying the business unit as a time-critical business unit if the resilience impact rating is greater than or equal to the impact threshold. Additionally, the method may include producing paper printouts containing the response plans.
Consistent with another embodiment of the present invention, a method for responding to a business disruption using hierarchically ordered response plans is provided. The method comprises storing the response plans which include at least two of a crisis management plan, a management summary plan, a department resilience plan, and a system resilience plan. The response plans also include escalation points. Further, the method comprises allowing a user to access, through an interface, information on the response plans and the escalation points during normal business operation.
In certain embodiments, the method may also comprise maintaining the response plans using a business impact analysis. The business impact analysis includes estimating a set of time values for a business unit, the set of time values indicating points of time when a business impact of the business unit will increase; calculating a resilience impact rating of the business unit based on the estimated time values, wherein the resilience impact rating provides a metric for quantifying a time-criticality of the business unit; setting an impact threshold at a specific resilience impact rating value; and identifying the business unit as a time-critical business unit if the resilience impact rating is greater than or equal to the impact threshold. Additionally, the method may include producing physical printouts containing the response plans for redundancy.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only, and should not be considered restrictive of the scope of the invention, as described and claimed. Further, features and/or variations may be provided in addition to those set forth herein. For example, embodiments of the invention may be directed to various combinations and sub-combinations of the features described in the detailed description.
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate various embodiments of the invention and together with the detailed description, serve to explain the principles of the invention. In the drawings:
Reference will now be made in detail to the present invention, examples of which are illustrated in the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts.
Systems and methods are disclosed herein for providing business continuity. Embodiments of the invention may be advantageously implemented by organizations of any size and structure for performing risk assessment and/or business continuity management. Examples of such organizations include, for instance, corporations, partnerships, government agencies, etc. An organization may consist of one or more business units. A business unit of an organization is a logical, discrete collection of personnel or staff which performs one or more functions. Examples of a business unit include, for instance, a department of an organization.
Consistent with an aspect of the present invention, a business continuity framework (BCF) may be implemented for an organization. A BCF is an enterprise-wide discipline that may be applied for the purpose of managing risks to an organization. An organization assesses risks and determines the possible impact those risks will have on the organization. A risk is anything that can cause or prolong a business disruption, or hinder an effective response to one. A business disruption, depending on its level of severity and/or duration, can force an organization out of normal business operations or activities. Therefore, it is a goal of most organizations to reduce the impact that may be caused by various risks.
In accordance with another aspect of the invention, a business impact analysis (BIA) may be performed to identify time-critical business units in an organization. In one embodiment, the BIA uses a resilience impact rating (RIR), which provides a globally-relative number or other metric indicating time-criticality of a business unit or a resource. A criticality of a business unit may indicate, for example, the level of impact a business disruption in a business unit will have on the organization. The level of impact is often dependent on the duration of the business disruption. Examples of levels of impact are: non-significant, minor, moderate, significant, and major. These levels can be standardized in the organization by setting in policy the definition of each level of impact.
Consistent with additional aspects of the present invention, response plans may be developed to recover an organization back to normal business operations in case of a business disruption. In one embodiment, response plans are organized into a hierarchical structure where the execution of one plan can cause the execution of another plan, usually higher in the hierarchy, through escalation. Details of escalation are defined by escalation points defined for each response plan. Further, each plan in the hierarchy may have a specific role in responding to a business disruption.
Factors that affect business continuity can frequently change. Moreover, the size and structure of an organization may change, as well as risks that can cause a business disruption. Accordingly, consistent with an aspect of the invention, the business continuity framework may be updated to account for such changes. Maintenance of the business continuity framework may comprise updating and providing new response plans to account for the latest state of an organization and factors affecting the business continuity of the organization.
Systems and methods consistent with the present invention may be implemented in whole, or in part, using computerized systems and/or software-based components. For instance, as further disclosed herein, an organization can utilize a business continuity application (BCA) to support maintenance efforts and/or other activities associated with a business continuity framework (BCF).
Consistent with an embodiment of the present invention, exemplary methods for providing business continuity will now be described with reference to
Operational risks include events or failures that render an organization unable to continue operating in certain aspects. For example, operational risks may be identified in relation to a server or other computer hardware failure, a bomb threat, or a hazardous chemical leak.
Environmental risks include, for example, a loss of power or water to a building. Environmental risk also includes concentration risk of two or more buildings sharing the same resources, such as a power or water supply, as well as accommodating high impact staff or systems such that a large scale event will have simultaneous impact on multiple buildings, i.e., affect multiple business units and/or multiple resources.
External risks include requirements from clients, regulators, third parties, board of directors, and/or audit requirements relating to business continuity. For example, external risks for an organization may be identified with respect to client expectations that may increase and result in a need to decrease the length of a business disruption. External risks may also concern a new regulatory guideline that may be released that changes the scope of certain response(s).
Response risks include risks that can prolong a business disruption by hindering an effective response to the business disruption. Examples of response risks include, for instance, communication breakdown; unavailability or inappropriate solutions; lack of plan ownership and awareness; and/or inaccurate, inconsistent, or inaccessible plans.
An organization can maintain an inventory or database of risks and identify those specific risks with business continuity implications. Such an inventory or database may be maintained in a computerized system (see, e.g.,
In one embodiment, risks, including those indicated above, may be identified based on historical records of risks and/or future assessed risks for an organization or relevant industry. Response risk may also be based on past testing, drills, or simulations run on the response plans of the organization or related organizations. As will be appreciated by those skilled in the art, the exact risks at issue will depend on a number of factors, including the nature of the organization and the business environment in which it operates.
Referring again to
Consistent with an aspect of the invention, a response plan may include an escalation point for executing another plan that is either at the same or a higher level in a response plan hierarchy (see, e.g.,
In accordance with certain embodiments of the invention, a response plan is assigned a plan owner who is a member of the organization. A plan owner is responsible for knowing how to react in case of a business disruption. The plan owner accepts the respective roles and related control objectives assigned to him. The plan owner may also be responsible for continued evaluation of the response plan for its effectiveness.
Consistent with additional embodiments of the invention, a response plan may include a recovery time objective (RTO). An RTO expresses the approximate time from the start of a business disruption to when the business impact should be contained. This target may be defined by a plan owner. The development of a response plan can be based on the target RTO. Also, a response plan may include an escalation point based on the RTO. For example, a response plan for a department may include an escalation point to execute another response plan if the department has not been restored to normal business operations within a specific number of hours set by the RTO.
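By way of illustration only, an RTO-based escalation point might be sketched as follows. The function name `should_escalate` and its parameters are hypothetical and are not part of the described system; the sketch assumes that both the elapsed time and the RTO are expressed in hours.

```python
def should_escalate(hours_since_disruption: float,
                    rto_hours: float,
                    restored: bool) -> bool:
    """Return True when an RTO-based escalation point is reached.

    Assumption: escalation is triggered once the elapsed time meets or
    exceeds the RTO while the business unit has still not returned to
    normal business operations.
    """
    return (not restored) and hours_since_disruption >= rto_hours


# Hypothetical department with a 4-hour RTO, still disrupted after 5 hours.
print(should_escalate(hours_since_disruption=5, rto_hours=4, restored=False))
```

In such a sketch, a True result would indicate that the response plan named at the escalation point should be executed.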
Consistent with an embodiment of the present invention, multiple response plans may be developed that are hierarchically structured. For example, in
The role of the crisis management plan (CMP) 201 is to provide sufficient information to enable an effective damage assessment to be conducted and suitable communication protocol(s) to facilitate efficient command and control. An organization may develop a plurality of CMPs 201. A CMP 201 can escalate to another CMP 201. A CMP 201 can also receive an escalation from one or more MSPs 202.
The role of the management summary plan (MSP) 202 is to provide sufficient information to enable an effective damage assessment to be conducted and suitable communication protocol(s) to facilitate efficient command and control. An MSP 202 can escalate to a CMP 201. An MSP 202 can also receive an escalation from one or more DRPs 203. The MSP owner is a member of the CMP 201 that the MSP 202 escalates to.
The role of the department resilience plan (DRP) 203 is to provide sufficient information to enable an effective damage assessment, appropriate communication requirements to be identified, solution options to be chosen, solution status to be reported, and/or recovery activities to be prioritized based on the time of day and day of year the business disruption occurs. A DRP 203 can escalate to an MSP 202. A DRP 203 also assumes ownership for one or more SRPs 204.
The role of the system resilience plan (SRP) 204 is to consolidate the relevant DRP 203 requirements and recover a particular system (application or infrastructure). An SRP 204 is owned by a DRP 203. Typically, for example, an IT department would own an SRP 204 related to IT systems.
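The hierarchy of response plans described above may be sketched, purely as an example, with a small data structure. The class `ResponsePlan`, the function `escalation_chain`, and the plan names are hypothetical. The sketch also assumes that an escalation originating from an SRP starts at its owning DRP, which the description does not state explicitly.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ResponsePlan:
    name: str
    kind: str  # "CMP", "MSP", "DRP", or "SRP"
    escalates_to: Optional["ResponsePlan"] = None  # plan named by the escalation point
    owned_by: Optional["ResponsePlan"] = None      # for an SRP: its owning DRP


def escalation_chain(plan: ResponsePlan) -> List[str]:
    """Follow escalation points upward, starting from the given plan.

    Assumption: an SRP has no escalation point of its own, so the chain
    starts at the DRP that owns it.
    """
    chain: List[str] = []
    current = plan.owned_by if plan.kind == "SRP" else plan
    seen = set()  # guards against loops, since a CMP can escalate to another CMP
    while current is not None and id(current) not in seen:
        seen.add(id(current))
        chain.append(current.name)
        current = current.escalates_to
    return chain


# Hypothetical plans: SRP owned by a DRP, which escalates to an MSP, then a CMP.
cmp_plan = ResponsePlan("Group CMP", "CMP")
msp_plan = ResponsePlan("Operations MSP", "MSP", escalates_to=cmp_plan)
drp_plan = ResponsePlan("IT DRP", "DRP", escalates_to=msp_plan)
srp_plan = ResponsePlan("Email SRP", "SRP", owned_by=drp_plan)

print(escalation_chain(srp_plan))
```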
After having identified the risks (step 101) and developed response plans (step 102), an organization can be prepared for an occurrence of a business disruption.
Referring to
Consistent with embodiments of the invention, response plan(s) may be maintained (step 106) to ensure the effectiveness of the response plans. Response plan maintenance is a continuous process of on-going efforts to ensure that the organization is ready to respond to business disruptions caused by the risks. Maintaining response plans may include further identifying new risks and developing new response plans or reassessing previously identified risk and updating existing response plans in order to ensure that the current BCF is consistent with the present state of the organization, as well as any factors affecting the BC of the organization. Accordingly, maintaining response plans may include steps similar to that shown in
Development and maintenance of a BCF can be costly. Therefore, in accordance with one embodiment, an organization can prioritize risks and business units according to their importance based on business impact and available resources for BC.
Consistent with an embodiment of the invention, a business impact analysis (BIA) may be performed to identify time-critical business units in an organization for purposes of developing a response plan (see step 102,
As shown in
Next, the levels of impact are defined (step 302). Each level may indicate the level of impact a business disruption of the business unit will have on the organization. An organization may choose any appropriate number and labels for the levels of impact. In an embodiment consistent with the present invention, as shown in the example of
Levels of impact may be defined based on the amount of financial loss the organization will suffer due to the business disruption. For example, a loss of $0-1 million may be considered Non-Significant; a loss of $1-10 million may be considered Minor; a loss of $10-50 million may be considered Moderate; a loss of $50-150 million may be considered Significant; and a loss of more than $150 million may be considered Major. The exact dollar amounts will depend on many variables associated with specific organizations and the amounts can be reevaluated and adjusted as those variables change over time. Alternatively, the levels of impact may be defined by specific events or descriptive damages. For example, unpleasant coverage about the organization in the press may be classified as Significant and anything causing a shutdown of an entire manufacturing plant may be classified as Major.
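For illustration, the exemplary dollar ranges above can be expressed as a simple lookup. The function name `impact_level` is hypothetical. Because the ranges share their border values, the sketch assumes that a loss exactly equal to a border value belongs to the lower level, which is consistent with Major being defined as more than $150 million.

```python
def impact_level(loss_millions: float) -> str:
    """Map an estimated financial loss, in millions of dollars, to a level of impact.

    Assumption: a loss exactly on a border value (1, 10, 50, or 150)
    is assigned to the lower of the two adjacent levels.
    """
    if loss_millions < 0:
        raise ValueError("loss must be non-negative")
    upper_borders = [
        (1, "Non-Significant"),
        (10, "Minor"),
        (50, "Moderate"),
        (150, "Significant"),
    ]
    for upper, label in upper_borders:
        if loss_millions <= upper:
            return label
    return "Major"


print(impact_level(25))   # a $25 million loss
print(impact_level(200))  # a $200 million loss
```

An organization that adjusts its dollar amounts over time would only need to change the border values in such a table.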
Often, a business disruption has a higher impact on the organization as the length of time of the business disruption increases. As such, the longer it takes to resolve a business disruption and return the business unit to normal business operations, the higher the level of impact the business unit will reach. Accordingly, in the next step (step 303), time values are estimated for each business unit, wherein the time values represent the length of time after the start of a business disruption at which point the business unit will increase in business criticality from one level to the next. This task may be performed by the department owner or any appropriate member of the organization. The department owner estimating these time values may also be the DRP plan owner.
In the example of
Then, as shown in
wherein 0.1, 1, 5, and 15 are weights. The weights are provided by the border values of the defined levels of impact.
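The formula itself is not reproduced in this text, so the following is only a sketch under a stated assumption and should not be read as the formula of the invention. It assumes a hypothetical form in which each weight is divided by the corresponding estimated time value and the results are summed, so that a business unit reaching higher levels of impact sooner receives a higher rating. The weights 0.1, 1, 5, and 15 are taken from the text; they appear to correspond to the exemplary border values of $1, $10, $50, and $150 million scaled by one tenth. The function name and the time values in hours are hypothetical.

```python
WEIGHTS = (0.1, 1, 5, 15)  # weights named in the text


def resilience_impact_rating(time_values_hours):
    """ASSUMED form only: sum of weight / time value.

    time_values_hours holds four estimated times, in hours after the start
    of a disruption, at which the business unit moves up to the next level
    of impact (Minor, Moderate, Significant, Major).
    """
    if len(time_values_hours) != len(WEIGHTS):
        raise ValueError("expected one time value per weight")
    if any(t <= 0 for t in time_values_hours):
        raise ValueError("time values must be positive")
    return sum(w / t for w, t in zip(WEIGHTS, time_values_hours))


# Hypothetical time values for two business units.
fast_unit = resilience_impact_rating((1, 2, 4, 8))
slow_unit = resilience_impact_rating((24, 48, 96, 192))
print(fast_unit > slow_unit)
```

Under this assumed form, the unit whose impact grows faster receives the larger rating, which matches the stated purpose of the RIR as a measure of time-criticality.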
Next, a threshold value is set (step 305) in order to distinguish the time-critical departments from non-time-critical departments. The threshold is used to determine a boundary within which to focus attention in developing BCPs. Setting a threshold provides a filter to narrow the scope of the BCF. The value of the threshold can depend on many factors, including the organization's resources for the BCF, the determined initial RIR values, and the RTOs set for the business units. The threshold may be reviewed and adjusted regularly.
After the threshold value has been set, the RIR values for the individual business units are compared against the threshold to determine which business units are time-critical business units. Business units whose RIR values are greater than or equal to the threshold are considered time-critical business units.
In the example shown in
For business units whose initial RIR values are greater than or equal to the threshold value, inter-departmental dependencies are determined (step 306). Inter-department dependencies are requirements of one business unit which must be satisfied or fulfilled in order for another business unit to execute or continue executing its response plan. For example, one department can have a response plan to relocate its employees to another department to resume operations during a business disruption. This department's response plan requires that the other department's location is available and can accommodate the relocation of employees. As another example, a response plan for one department may include using the organization's emailing system. Such a response plan could not be executed if the IT department is also disrupted by the same disruption. Furthermore, RTOs for business units may be changed in view of the identified inter-departmental dependencies.
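As a hedged illustration of why inter-departmental dependencies matter, the following sketch reports which response plans could not be executed because a business unit they rely on is itself disrupted. The function name `blocked_plans`, the dictionary layout, and the department names are hypothetical.

```python
def blocked_plans(dependencies, disrupted):
    """Return, per business unit, the disrupted units its response plan relies on.

    dependencies maps a business unit to the units its response plan requires.
    Units whose requirements are all available are omitted from the result.
    """
    disrupted = set(disrupted)
    result = {}
    for unit, needs in dependencies.items():
        unavailable = sorted(set(needs) & disrupted)
        if unavailable:
            result[unit] = unavailable
    return result


# Hypothetical example: the Sales plan relies on the email system run by IT.
dependencies = {"Sales": ["IT"], "Finance": ["Facilities"], "IT": []}
print(blocked_plans(dependencies, disrupted={"Sales", "IT"}))
```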
In view of the dependencies, final RIR values are calculated (step 307) based on the initial RIR values and the inter-department dependencies. In an embodiment consistent with the present invention, one way to calculate the final RIR values is to increase the initial RIR value of the dependent department by the initial RIR value of the department it is dependent upon. Using the example in
Thereafter, business units whose final RIR values are greater than or equal to the threshold value are identified as time-critical business units (step 308). By comparing the final RIR values, which may be different from the initial RIR values, with the threshold value, it is possible to have a different set of time-critical departments than previously determined using the initial RIR values.
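Steps 307 and 308 can be sketched as follows, using the one calculation described above in which the initial RIR value of a dependent department is increased by the initial RIR value of the department it depends upon. The function names, the RIR values, and the threshold are hypothetical, and summing over several dependencies is an assumption for the case where a department depends on more than one other department.

```python
def final_rir(initial_rir, depends_on):
    """Increase each unit's initial RIR by the initial RIR of every unit it depends on.

    Assumption: when a unit has several dependencies, each one is added.
    """
    return {
        unit: rir + sum(initial_rir[dep] for dep in depends_on.get(unit, ()))
        for unit, rir in initial_rir.items()
    }


def time_critical(rir_by_unit, threshold):
    """Units whose RIR is greater than or equal to the threshold."""
    return sorted(unit for unit, rir in rir_by_unit.items() if rir >= threshold)


# Hypothetical values: Trading depends on IT.
initial = {"Trading": 4.5, "IT": 2.5, "HR": 0.5}
depends_on = {"Trading": ["IT"]}

final = final_rir(initial, depends_on)
print(final["Trading"])
print(time_critical(final, threshold=4.0))
```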
In another embodiment consistent with the present invention, a second threshold value, possibly different from the first threshold value, may be set after calculating the final RIR values for determining the final set of time-critical business units. A different threshold value may be appropriate depending on the number of departments whose final RIR values are higher than the initial threshold value and the resources available for BC.
Now, following steps 301-308, response plan(s) can be developed for the determined time-critical business units (step 309). In developing a response plan, all relevant inter-department dependencies should be considered.
Consistent with embodiments of the invention, a plan owner may be the one responsible for conducting the business impact analysis (BIA) in order to assess whether a response plan needs to be established for a business unit. If the business unit is assessed as being non-time-critical, no further action is required from the plan owner. If, however, the business unit is considered time-critical, the plan owner needs to develop a response plan, which outlines the operational targets the plan owner and other members of the organization will achieve in case of a business disruption. The key tasks to be performed by the plan owner include, for example, identification of dependencies to other business units; identification of staff and alternative personnel to execute the response plan; and identification of escalation points.
Referring to
Furthermore, consistent with an aspect of the invention, an RTO can also represent the elapsed time between the start of a business disruption to when the business impact was actually contained from past business disruptions or past simulations. From such statistical data, an impact portfolio generated after a business disruption or simulation can depict a plot of business units against their RIR and the time it took to restore each business unit to their normal business operations. A plan owner can use such an impact portfolio to analyze or reassess risks, adjust threshold values, update response plans, or reassess inter-department dependencies as part of maintaining the response plans.
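One possible use of such an impact portfolio is sketched below: business units whose actual containment time exceeded their target RTO are listed with the highest RIR first, so that a plan owner can see where reassessment may be most urgent. The type `PortfolioEntry`, the function `missed_targets`, and all values are hypothetical.

```python
from typing import List, NamedTuple


class PortfolioEntry(NamedTuple):
    unit: str
    rir: float
    rto_hours: float     # target set in the response plan
    actual_hours: float  # observed in a past disruption or simulation


def missed_targets(entries: List[PortfolioEntry]) -> List[PortfolioEntry]:
    """Entries whose actual containment time exceeded the target RTO, highest RIR first."""
    late = [e for e in entries if e.actual_hours > e.rto_hours]
    return sorted(late, key=lambda e: e.rir, reverse=True)


# Hypothetical results from a simulation.
portfolio = [
    PortfolioEntry("Trading", rir=7.0, rto_hours=4, actual_hours=6),
    PortfolioEntry("IT", rir=2.5, rto_hours=8, actual_hours=7),
    PortfolioEntry("HR", rir=0.5, rto_hours=48, actual_hours=60),
]
print([e.unit for e in missed_targets(portfolio)])
```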
Consistent with an embodiment of the present invention, a business continuity application (BCA) may be provided to support the BCF, thereby enabling the development, implementation, maintenance, management, and testing of accurate and up-to-date response plans. By way of example, the BCA may be implemented as a client-server application operating in a networked environment, such as that depicted in
As shown in the exemplary embodiment of
Consistent with an embodiment of the present invention, BCA server 601 and PC1 604 may comprise apparatus such as a computer 700, shown in the example of
The present techniques and embodiments described herein, including the exemplary systems and methods presented above, can be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in any suitable combinations thereof. In addition, apparatus consistent with the present invention can be implemented in a computer program product tangibly embodied in a machine-readable storage device for execution by a programmable processor.
Method steps according to embodiments of the invention can be performed by a programmable processor executing a program of instructions to perform functions or steps of the methods by operating on the basis of input data, and by generating output data. Embodiments of the invention may also be implemented in one or several computer programs that are executable in a programmable system, which includes at least one programmable processor coupled to receive data from, and transmit data to, a storage system, at least one input device, and at least one output device, respectively. Computer programs may be implemented in a high-level or object-oriented programming language, and/or in assembly or machine code. The language or code can be a compiled or interpreted language or code. Processors may include general and special purpose microprocessors. A processor receives instructions and data from memories, in particular from read-only memories and/or random access memories. A computer may include one or more mass storage devices for storing data; such devices may include magnetic disks, such as internal hard disks and removable disks; magneto-optical disks; and optical disks. Storage devices suitable for tangibly embodying computer program instructions and data include all forms of non-volatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM disks. Any of the foregoing can be supplemented by or incorporated in ASICs (application-specific integrated circuits).
To provide for interaction with a user, aspects of the invention can be implemented on a computer system having a display device such as a monitor or LCD screen for displaying information to the user and a keyboard and a pointing device such as a mouse or a trackball by which the user can provide input to the computer system. The computer system can be programmed to provide a graphical or text user interface through which computer programs interact with users.
A computer may include a processor, memory coupled to the processor, a hard drive controller, a video controller, and an input/output (I/O) controller coupled to the processor by a processor bus. The hard drive controller is coupled to a hard disk drive suitable for storing executable computer programs, including programs embodying the present technique. The I/O controller is coupled by means of an I/O bus to an I/O interface. The I/O interface receives and transmits data in analogue or digital form over at least one communication link. Such a communication link may be a serial link, a parallel link, a local area network, or a wireless link (e.g., an RF communication link). A display is coupled to an interface, which is coupled to an I/O bus. A keyboard and pointing device are also coupled to the I/O bus. Alternatively, separate buses may be used for the keyboard, pointing device, and I/O interface.
The foregoing description has been presented for purposes of illustration. It is not exhaustive and does not limit the invention to the precise forms or embodiments disclosed. Modifications and adaptations of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the disclosed embodiments of the invention. For example, the described implementations include software, but systems and methods consistent with the present invention may be implemented as a combination of hardware and software or in hardware alone. Examples of hardware include computing or processing systems, including personal computers, servers, laptops, mainframes, micro-processors and the like. Additionally, although aspects of the invention are described for being stored in memory, one skilled in the art will appreciate that these aspects can also be stored on other types of computer-readable media, such as secondary storage devices, for example, hard disks, floppy disks, or CD-ROM, the Internet or other propagation medium, or other forms of RAM or ROM.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
This application claims the benefit of priority from U.S. Provisional Patent Application No. 60/898,991, filed Feb. 2, 2007, entitled “Business Continuity Framework,” the disclosure of which is expressly incorporated herein by reference in its entirety.
Number | Date | Country
---|---|---
60898991 | Feb 2007 | US