Business Continuity Management

Post 9/11

Disaster Recovery Methodology


Author Biography

Professor Edward Moskal is a Professor in the Computer and Information Sciences Department at Saint Peter’s College located in Jersey City, New Jersey. Prior to becoming a Professor in 2001, Ed worked for 24 years at Fortune 100 companies, developing and directing system implementation projects on mainframe, mid-range and client-server computing platforms. Systems included Manufacturing, Marketing, eCommerce, Customer Relationship Management, Supply Chain, Enterprise Resource Planning, Financial, Human Resource, Retail, and Shareholder. Professor Moskal earned a BS in Management and Information Systems from Saint Peter’s College, an MS in Administration from the University of Notre Dame, and an MS in Management Science from Stevens Institute of Technology.

Acknowledgements

This article was made possible through the 2005 New York University Summer Scholar-In-Residence program, in which Professor Moskal was an active participant. Utilizing New York University resources (databases, journals, periodicals, textbooks and the internet), Professor Moskal conducted research and studied the subject of disaster recovery.


Introduction

There are many risks that threaten organizations by disrupting computer systems and business processes. These risks include traditional emergencies like fires, floods, earthquakes and hurricanes, as well as risks from cyber-terrorism, cyber-crime, computer and telecommunications failures, theft, and employee sabotage. Any one of these risks can be very disruptive to a business. There are also new risks that have surfaced since 9/11. The deadly SARS outbreak showed that communicable diseases can threaten business too, and terrorism is a threat made worse by people's tendency to become complacent in the absence of immediate danger. In addition, post 9/11, there is a greater risk of a biological agent release (bioterrorism). The Gartner Group indicated in a study performed in July 2004 that many enterprises have inadequate business continuity and disaster recovery programs. A severe incident, such as the September 11 disaster, would cause crippling damage to businesses, and some might never recover.

Disaster Recovery Methodologies

From an information technology and business recovery standpoint, there are a variety of good disaster recovery methodologies and products in the marketplace. The challenge facing organizations today, in the post 9/11 era, is to select the optimum blend of disaster recovery products and technologies. A common problem in the past has been a tendency to view the disaster recovery solution as a set of individual product technologies and piece parts. Instead, disaster recovery needs to be viewed as a whole: an integrated, multi-product solution that deals with both the post 9/11 risks and the traditional business risks and emergencies.

Why Should We Care About Disaster Recovery?

Recent research shows that one major barrier to disaster preparedness is lack of senior management support and funding. Senior management has a tendency to think that disaster recovery planning is a complex, time- and resource-consuming process with little obvious benefit. However, insurance statistics indicate that billions of dollars are lost through catastrophes and computer outages. For the 1990-99 period, the Federal Emergency Management Agency (FEMA) spent more than $25.4 billion for declared disasters and emergencies. FEMA is now part of the Department of Homeland Security. FEMA's mission is to lead America to prepare for, prevent, respond to and recover from disasters with a vision of "A Nation Prepared." At no time in its history has this vision been more important to the country than in the aftermath of September 11th. Business managers must realize that when a disaster strikes it will have an adverse impact on their businesses and customers. Following are reasons why businesses must plan for disasters:

  • Survivability of your business might depend on it: Most companies could not stay in business very long without their mission critical applications following a computer system failure. Rapid recovery after a disaster or system failure is vital to the livelihood of your business.

  • Downtime is costly: A computer system failure, no matter what its gravity, results in increased expenses, lost revenue, and lost customers. A tested business recovery plan ensures faster recovery, and consequently, less downtime.

  • Business contracts stipulate delivery deadlines: If you are a supplier, you must deliver the services or products to the other party no matter what your circumstances, or pay the penalties.

  • Law may require it: Company officers are legally liable to protect company assets, including electronic data, particularly if you are a public company, a bank, a utility, or a government agency.

    In addition to planning for disasters, plans need to be in place to handle non-catastrophic events such as computer hardware and software malfunctions and inadequate security controls. Following is a list of real world events of this kind that put company customers and the public at risk:

  • Dispatch computer glitch grounds Delta flights for 2 hours in Atlanta.

  • Florida sues AT&T for billing 1 million non-customers.

  • Canada’s largest bank has processing disruption.

  • 800,000 cards overcharged at Wal-Mart stores from hardware problem.

  • Swedish insurance computers disabled by virus, affecting all 9 million Swedes.

  • AOL worker sold list of 92 million AOL customers to spammer.

  • 4.6 million DSL subscribers subjected to data leakage in Japan.

  • UCLA laptop stolen with 145,000 blood donors’ data.

  • Israeli police lose laptop with critical agent’s information.

  • Air Force Motorola radios jam garage-door openers in Florida.

  • Network vandal penetrates South Korean defense system.

  • LexisNexis Inc. disclosed that hackers commandeered a database and gained access to the personal files of 300,000 people.

  • Citigroup said UPS lost tapes with sensitive information from 3.9 million customers of CitiFinancial, which provides loans.

  • CardSystems Solutions Inc. security breach could expose 40 million people to fraud.

    Although these events may not be catastrophic, they are significant and result in considerable costs, resource demands and computer down time.

    Business Continuity Management Methodology

    I am proposing a Business Continuity Management methodology (a new post 9/11 disaster recovery methodology) that can be used to sort, summarize, and organize company business requirements in a methodical way. Business Continuity Management is regarded as the integration of social and technical systems that together enable effective organizational protection (Source: Journal of Strategic Information Systems, 1995).

    Before I define the elements of the proposed Business Continuity Management methodology, I want to introduce you first to (1) disaster recovery statistics and facts, (2) common threats, and (3) effects of a disaster.

    Disaster Recovery Statistics and Facts

    Research and statistics clearly highlight the importance of disaster recovery and business recovery planning. The following statistics and facts illustrate this point:

  • FEMA declared 68 major disasters in 2004 (Source: FEMA, 2005).

  • In the August 2003 Eastern North America blackout, 50 million people were left without power and communications. The economic cost was between $7 and $10 Billion (Source: ICF Consulting, 2004).

  • Within minutes of the first plane crashing into the World Trade Center in New York's financial district on September 11, 2001, more than 200 organizations started declaring disasters and invoking their business continuity and disaster recovery plans (Source: Gartner, 2003).

  • Two out of five companies that experience a disaster will go out of business in five years (Source: Gartner, 2004).

  • Almost half of the companies that lose their data through disaster never re-open, and 90 percent are out of business within two years (Source: University of Texas Center for Research on Information Systems, 2004).

  • 43 percent of companies that experience data disasters never reopen, and 29 percent close within two years (Source: McGladrey and Pullen, 2004).

  • 80 percent of businesses without a well structured recovery plan are forced to shut down within 12 months of a flood or fire (Source: London Chamber of Commerce and Industry, 2003).

  • Globally, 60 percent of 850 mid- to large-sized companies experienced from 1 hour to 24 hours of unplanned down time (Source: Veritas, 2003).

    Disaster recovery and business recovery planning should be addressed as a top priority in organizations. The consequences of not addressing disaster recovery and business recovery planning are significant to the livelihood of an organization.

    Common Threats

    Since 9/11, we have lived through a number of catastrophic events (the Tsunami, Hurricane Ivan and the Eastern North America blackout). Since 1953, there have been a total of 1,572 major disaster declarations made by FEMA. This averages out to roughly 31 major disasters per year since 1953. Following is a list of some of the most common disaster threats (Source: Layton & Associates LLC):

  • Natural disaster (fire, water, weather)

  • Computer component failure

  • Virus or security related attack

  • Human error (maintenance, operations)

  • Sabotage (employee or external)

  • Bomb threat

  • Denial of service attack on network or systems

  • Equipment malfunction

  • Telecommunications failure

  • Terrorist act

    Lack of proper disaster recovery planning can threaten company technology infrastructures and the survival of businesses that rely upon them.

    Effects of Disaster

    The effects of a disaster have a tremendous impact on business. Following is a list of the business effects associated with disasters:

  • Loss of business/customers

  • Loss of credibility/goodwill

  • Cash flow problems

  • Degradation of service to customers

  • Inability to pay staff

  • Loss of production

  • Loss of operational data

  • Financial loss and loss of financial control

  • Loss of customer account management

    Companies can lose their market share or their entire business if disaster recovery is not properly addressed.

    Business Continuity Management – Definitions and Model

    Business continuity management should look at all critical information processing areas of a company, including but not limited to the following:

  • Local and wide area networks, and servers

  • Telecommunications and data communication links

  • PBX, telephone and voice mail systems

  • Workstations and workspaces

  • Applications, software, and data

  • Operating systems

  • Computers and printers

  • Firewalls

  • System interfaces

  • Records storage

  • Business processes

  • Staff responsibilities

    A Business Continuity Management methodology should be followed that will provide an integrated framework for companies to implement a proper level of disaster recovery protection for their organization. The Business Continuity Management methodology proposed provides this integrated framework and consists of the following elements: risk assessment, business impact analysis, disaster recovery, business recovery, business resumption, contingency planning, and crisis management. Following are the definitions:

    Risk Assessment

    Process of identifying the risks to an organization, assessing the critical functions necessary for an organization to continue business operations, defining the controls in place to reduce organization exposure and evaluating the cost for such controls. Risk analysis often involves an evaluation of the probabilities of a particular event.
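
The probability and cost evaluation described in this definition can be illustrated with a minimal sketch. It assumes a simple expected-loss model (probability times impact), which is one common way to compare risks and is not necessarily the model any particular organization uses. The `Risk` class, the function names, and all figures are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    annual_probability: float     # assumed chance the event occurs in a year
    impact_cost: float            # assumed loss in dollars if the event occurs
    control_cost: float           # assumed yearly cost of the safeguard
    control_effectiveness: float  # assumed fraction of the loss the control avoids


def expected_annual_loss(risk: Risk) -> float:
    """Probability-weighted yearly loss with no control in place."""
    return risk.annual_probability * risk.impact_cost


def net_benefit_of_control(risk: Risk) -> float:
    """Expected loss avoided by the control, minus what the control costs."""
    avoided = expected_annual_loss(risk) * risk.control_effectiveness
    return avoided - risk.control_cost


def rank_risks(risks):
    """Order risks from highest to lowest expected annual loss."""
    return sorted(risks, key=expected_annual_loss, reverse=True)


# Illustrative figures only; none of these numbers come from the article.
risks = [
    Risk("Telecommunications failure", 0.25, 400_000, 40_000, 0.75),
    Risk("Data center fire", 0.125, 2_000_000, 150_000, 0.5),
    Risk("Virus or security related attack", 0.5, 300_000, 50_000, 0.5),
]

ranked = rank_risks(risks)
for risk in ranked:
    print(risk.name, expected_annual_loss(risk), net_benefit_of_control(risk))
```

A negative net benefit, as for the fire control in this made-up data, does not by itself mean the control should be dropped. It only signals that the decision rests on factors the simple model leaves out, such as legal obligations or survivability of the business.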

    Business Impact Analysis

    The Business Impact Analysis is a process designed to identify critical business functions and workflow, determine the qualitative and quantitative impacts of a disruption, and prioritize and establish recovery time objectives.
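
As a rough sketch of how the prioritization step might be recorded, the example below assigns each business function a recovery time objective (RTO) from an assumed quantitative impact (loss per hour) and an assumed qualitative rating. The thresholds, tier values, function names, and sample data are all hypothetical. A real analysis would set them through interviews with the business units.

```python
from dataclasses import dataclass


@dataclass
class BusinessFunction:
    name: str
    hourly_loss: float      # assumed quantitative impact per hour of outage
    qualitative_score: int  # assumed rating from 1 (low) to 5 (high)


def assign_rto_hours(fn: BusinessFunction) -> int:
    """Map impact to a recovery time objective using assumed thresholds."""
    if fn.hourly_loss >= 100_000 or fn.qualitative_score >= 5:
        return 4    # most critical tier: recover within 4 hours
    if fn.hourly_loss >= 10_000 or fn.qualitative_score >= 3:
        return 24   # important tier: recover within one day
    return 72       # deferrable tier: recover within three days


def prioritize(functions):
    """Shortest RTO first; within a tier, highest hourly loss first."""
    return sorted(functions, key=lambda f: (assign_rto_hours(f), -f.hourly_loss))


# Illustrative data only; not taken from the article.
functions = [
    BusinessFunction("Payroll", 5_000, 3),
    BusinessFunction("Order processing", 120_000, 4),
    BusinessFunction("Records storage", 1_000, 1),
    BusinessFunction("Customer account management", 20_000, 5),
]

recovery_order = prioritize(functions)
for fn in recovery_order:
    print(f"{fn.name}: RTO {assign_rto_hours(fn)} hours")
```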

    Disaster Recovery Plan

    The management approved document that defines the resources, actions, tasks and data required to manage the recovery effort. Usually refers to the technology recovery effort.

    Business Recovery Plan

    Process of developing advance arrangements and procedures that enable an organization to respond to an event in such a manner that critical business functions continue with planned levels of interruption or essential change.

    Business Resumption

    Planning to ensure the continued availability of essential business processes, programs and operations. Business resumption planning prepares organizations to recover from contingencies, defined as any event that may interrupt an operation or affect service or program delivery. Business resumption planning includes facility and operations management, as well as information technology systems. The resources that must be considered include information, assets, people and facilities.

    Contingency Planning

    A plan used by an organization or business unit to respond to a specific systems failure or disruption of operations. A contingency plan may use any number of resources including workaround procedures, an alternate work area, a reciprocal agreement, or replacement resources.

    Crisis Management

    The overall coordination of an organization's response to a crisis, in an effective, timely manner, with the goal of avoiding or minimizing damage to the organization's profitability, reputation, or ability to operate. Hopefully, organizations have time and resources to complete a crisis management plan before they experience a crisis. Crisis management in the face of a current, real crisis includes identifying the real nature of a current crisis, intervening to minimize damage and recovering from the crisis. Crisis management often includes strong focus on public relations to recover any damage to public image and assure stakeholders that recovery is underway. An uncontrolled disaster or a combination of mismanaged disasters could lead to a crisis. The magnitude of the crisis could be larger than a disaster in terms of loss expectancy. A crisis usually happens because of accumulated unattended/unresolved disasters/issue(s).

    Business Continuity Management implies ensuring the continuity or uninterrupted provision of computer operations, business processes and services. Business Continuity Management is an on-going process with several different but complementary components. Risk Assessment and Business Impact Analysis are the initial steps.

    Every organization, no matter how small, should have a Business Continuity Management program to help ensure that it is able to recover both information technology and business operations in a timely manner. The sponsors of the program should be company senior management and the head of information technology.

    Business Continuity Management Implementation

    Senior management awareness is the initial and a very important step in creating a successful Business Continuity Management program. To obtain necessary resources and time from required areas of the organization, senior management must understand the business impacts and risks and support the program.

    In today’s global business environment, having the correct computer systems, databases and information in a timely manner can be the difference between profits and losses, maintaining pace with competition, and ensuring the viability of a company. Companies that have 24 x 7 access to information are better able to achieve their objectives (Source: IDC June 2003). In addition, vendors, suppliers, customers and employees must have access to information when they need it. Providing the necessary level of information under normal conditions as well as during unpredictable disruptions or catastrophic disasters is necessary for a company to survive. It is during the unpredictable disruptions or catastrophic disasters that businesses risk losing competitive advantage by not having taken the appropriate measures to prevent loss of information availability.

    The costs associated with developing and implementing a Business Continuity Management program will be relative to the number of computer systems and business processes identified in the risk assessment and business impact analysis that require risk mitigating safeguards. In addition, costs associated with a vendor hot site, telecommunications, and resources need to be factored into the cost equation.

    According to an IDC study in June 2003, in which telephone interviews were conducted with 41 companies that integrate business continuity as part of their information technology strategy, costs varied dramatically by company. The companies studied were in the financial services, manufacturing and healthcare industries. The finance industry spent far more proportionally than the manufacturing and healthcare industries. These differences in business continuity budgets indicate, first, that each industry places a different level of importance on the role of business continuity, and second, that the financial industry places a higher level of significance on incorporating business continuity as part of its information technology strategy. Following is the breakdown of costs by industry:

  • Financial Services $500M
  • Manufacturing $50M
  • Healthcare $30M

    The 41 companies surveyed comprised 14 from financial services, 15 from manufacturing, and 12 from healthcare. The average amount spent per company on business continuity in 2003 was $4.5 million.

    As part of the IDC study, company information was also collected on revenue lost per incident by business function. Overall, the revenue lost averaged approximately $3 million per incident. The largest overall losses are incurred by back-office functions, led by finance, followed by manufacturing and human resources (Source: IDC, June 2003). Considering the study performed by Veritas in 2003, where 60 percent of 850 mid- to large-sized companies experienced from 1 hour to 24 hours of unplanned down time, the $3 million average cost per incident is significant.

    A company needs to balance the cost of unavailability with the cost of recovery. Companies need to make decisions on what systems and business processes should be part of their Business Continuity Management program and need to be recovered in the event of a system outage or disaster. Companies need to have healthy discussions on business continuity management as part of their strategic, tactical, and operational planning process and include appropriate funding in their information technology and business unit budgets.
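
The balance between the cost of unavailability and the cost of recovery can be made concrete with a small back-of-the-envelope sketch. It uses the approximately $3 million average loss per incident reported in the IDC study. The incident frequency, the fraction of loss a recovery program avoids, and the program cost are assumptions chosen only for illustration, and the function names are hypothetical.

```python
def expected_unavailability_cost(incidents_per_year: float,
                                 loss_per_incident: float) -> float:
    """Expected yearly loss from outages with no recovery capability."""
    return incidents_per_year * loss_per_incident


def net_value_of_recovery(incidents_per_year: float,
                          loss_per_incident: float,
                          loss_reduction: float,
                          annual_recovery_cost: float) -> float:
    """Expected yearly loss avoided by the program, minus its yearly cost."""
    avoided = expected_unavailability_cost(incidents_per_year,
                                           loss_per_incident) * loss_reduction
    return avoided - annual_recovery_cost


def break_even_incidents_per_year(loss_per_incident: float,
                                  loss_reduction: float,
                                  annual_recovery_cost: float) -> float:
    """Incident frequency at which the program exactly pays for itself."""
    return annual_recovery_cost / (loss_per_incident * loss_reduction)


LOSS_PER_INCIDENT = 3_000_000   # average per incident, from the IDC study
INCIDENTS_PER_YEAR = 0.5        # assumption: one incident every two years
LOSS_REDUCTION = 0.5            # assumption: the program avoids half the loss
ANNUAL_RECOVERY_COST = 500_000  # assumption: yearly cost of the program

net_value = net_value_of_recovery(INCIDENTS_PER_YEAR, LOSS_PER_INCIDENT,
                                  LOSS_REDUCTION, ANNUAL_RECOVERY_COST)
break_even = break_even_incidents_per_year(LOSS_PER_INCIDENT, LOSS_REDUCTION,
                                           ANNUAL_RECOVERY_COST)
print(f"Net yearly value of recovery program: ${net_value:,.0f}")
print(f"Break-even incident frequency: {break_even:.2f} per year")
```

Under these assumed inputs the program would be worth funding for any system expected to fail more often than about once every three years. Changing any one of the assumptions changes that conclusion, which is why the inputs should come from the company's own risk assessment and business impact analysis.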

    It is critical that companies follow a robust Business Continuity Management methodology when performing disaster recovery planning. For the plans to be effective, they must be achievable, testable and cost-effective.

    All seven elements of the methodology should be completed to help ensure adequate disaster recovery plans are in place. All seven also need to be updated on an on-going basis. If they are not sustained, improved and, when necessary, changed, the investment in the Business Continuity Management program will be wasted.

    September 11, 2001, changed the way many people view the world. It expanded the meaning of "disaster", causing organizations to rethink their business continuity plans. The Business Continuity Management methodology outlined here provides the road map that all companies (large, medium and small) can follow to help ensure continued computer system and business operations in the event of a disaster, computer emergency, or crisis in this post 9/11 era.


    Bibliography and References

    “Business Continuity and Disaster Recovery Plans – Things Overlooked”, Steven Lewis, EDPACS, Volume 33, June 2005.

    “Business Continuity Needs: Ensuring Information Availability While Ensuring ROI”, David Tapper, International Data Group (IDC), June 2003.

    “Business Continuity Planning”, Martin Nemzow, John Wiley & Sons, July 1997.

    “Disaster Recovery and Business Continuity, Step-by-Step”, Wall, Northcutt and Edmead, SANS Institute, 2004.

    “Disaster Prevention and Management”, David Alexander, Emerald Group Publishing, Volume 14, Number 2, 2005.

    “Disaster Recovery: Best Practices”, Cisco Systems, July 2003.

    “Eight Best Practices for Disaster Recovery”, Martha Heller, CIO Magazine, November 2004.

    “Incident Handling: An Orderly Response to Unexpected Events”, Richard L. Rollason-Reese, ACM’s Special Interest Group for University and College Computing Services, August 2003.

    “Identifying Enterprise Network Vulnerabilities”, Judith M. Myerson, International Journal of Network Management, August 2002.

    “Network went wobbly hours before outage”, Thomas Fogarty, USA Today, August 18, 2003.

    “Ongoing Crisis Communications”, W. Timothy Coombs, Sage Publications, 1999.

    “Plan for the Worst”, Kahan, Accounting Technology, Spring 2005.

    “Risks to the Public in Computer and Related Systems”, Peter G. Neumann, ACM Volume 29 Number 5, September 2004.

    “The Economic Cost of the August 2003 Blackout”, Bansari Saha, ICF Consulting, August 2003.

    “The Limitations of Traditional Information Systems Planning”, Swartz, Elliot, and Herbane, Journal of Strategic Information Systems, 1995.


    World-Wide-Web Links

    http://www.caci.com

    http://www.davislogic.com

    http://www.disasterrecoverysurvival.com

    http://www.fema.gov

    http://www.forrester.com

    http://www.gartner.com

    http://www.ibm.com

    http://www.insolutionsinc.com

    http://www.laytonllc.com

    http://www.londonchamber.co.uk

    http://www.mcgladrey.com

    http://www.scdservices.com/data_backup

    http://www.strohlsystems.com

    http://www.sungard.com

    http://www.veritas.com


    Further Readings

  • Computer Emergency Response Team, http://www.cert.org

  • Computer World, http://www.computerworld.com/securitytopics/security/recovery?from=yn

  • Department of Homeland Security, http://www.dhs.gov

  • Disaster Recovery Journal, http://www.drj.com

  • Information Week, http://www.informationweek.com

  • Information Security, http://www.infossec.com

  • National Institute of Standards and Technology, http://csrc.nist.gov

  • Network World, http://www.networkworld.com/research/disasterrecov.html