Incident Response

By Kenneth R. van Wyk, Richard Forno


Table of Contents

Chapter 1: What Is Incident Response?
All too often, when organizations develop information security programs, they treat security issues as a simple "check-box" on the list of required corporate functions. After giving security its due attention (often very little), senior executives happily and honestly check the box indicating they've somehow dealt with IT security and then move on to the next issue. Many of these organizations assume that once the security program is established (the box is checked, remember), they are assured of complete security and that -- like hanging a painting on the wall -- once in place it requires little further attention. Nothing could be further from the truth.
For one thing, there is no such thing as total security. Good security controls keep folks honest and make it so challenging for an adversary to get around such controls that they will give up and move on to an easier target. Security supports business operations and ensures uptime and efficiency of mission-critical systems needed by the business in its daily operations to generate revenue and profit. From that perspective, security is as critical to business operations as the reliability and stability of the company's networks, servers, and phone lines. But what happens when something unexpected happens, or someone manages to get around the established security controls in a manner that threatens business operations, and subsequently revenue and profitability?
For years, mature IT organizations have recognized the advantages of keeping data centers running effectively and efficiently. Yet, far too many otherwise mature IT organizations fail to adequately address incident response (or IT security in general). Some view incident planning as a self-fulfilling prophecy that ensures that something will go wrong. Those same organizations would not dare install a data center without adequate fire protection or burglar alarms, however; not because they expect a fire to occur, but because they want to be able to respond quickly to a fire to contain the damage and get back to doing business. While most companies today invest some level of effort and funding in employing security protection mechanisms in their IT infrastructures, very little attention gets paid to planning how to handle information security incidents.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Real-Life Incidents
Contrary to public perceptions, not all incidents have dramatic dollar losses or make the front page news in sensational stories of computer terrorists wreaking havoc around the cyber world. Rather, most incidents rarely get a passing glance from even the most investigative reporter, and are often rather mundane and uninteresting for people outside of the affected area or company. As this book is not sensational and does not make unrealistic claims of gloom and doom, let's look at some typical situations that incident responders deal with on an almost-daily basis.
One case occurred a few years ago at a major university's (University X) primary computer lab. Apparently out of the blue, the 25-plus Unix workstations in the lab started crashing one by one in rapid succession, until each of the monitors had a single message on its display: "Kernel panic, core dump." Fortunately, these high-end Unix systems recognized that there was a problem and began to reset and reboot themselves to correct the problem.
The issue was resolved and considered a computer "hiccup" until thirty minutes later when the exact same thing happened again. The "Kernel panic, core dump" message appeared, and the staff reset and rebooted the computers. Sensing wrongdoing, one of the more vigilant system administrators in the lab placed a network sniffer diagnostic device on the network and began to watch for suspicious activity.
Thirty minutes went by and the sequence occurred again. However, this time, the episode was captured on the sniffer in a manner similar to recording a television program on a VCR. A quick examination of the recorded events on the sniffer showed that each of the Unix computers had received an SMTP (Simple Mail Transfer Protocol) packet from another well-known university (University Y) immediately prior to each computer crashing. The system administrator then placed a phone call to the Carnegie Mellon
What Is an Incident?
Incident response is a vital part of any successful IT program and is frequently overlooked until a major security emergency has already occurred, resulting in untold amounts of unnecessary time and money spent, not to mention the stress associated with responding to a crisis. In the most basic terms, an incident is a situation in which an entity's information is at risk, whether the situation is real or simply perceived. Common examples of incidents include the following (the list is by no means complete):
  • A company's web site is defaced by an intruder. The company seeks to find the perpetrator and recoup financial damages for tarnishing the company's reputation.
  • An employee at a company is believed to be selling trade secrets to a competitor.
  • A rival corporation is believed to be dialing into a company's computing systems and downloading financial performance data.
  • A computer virus is spreading among employees by way of infected Microsoft Word document files shared over email.
These situations are serious incidents that could easily result in significant impact to a company if not handled appropriately. Clearly, it is that level of impact that is most important to a business. To a typical corporation, the most severe type of incident is one that adversely affects a business process. Any company that does not understand the potential impact of an information systems security incident need only ask its senior business managers what the impact would be if their business functions were delayed, halted, or otherwise diminished. To exacerbate the situation, business managers and many senior executives are generally not technologists or experts in the underlying IT infrastructure supporting their business process.
About the Bad Guys
Anyone can be the source of, or a participant in, an information security incident. Among many others, the list includes disgruntled insiders, business competitors, foreign nations, nongovernment groups with a common unifying cause, cybercriminal gangs, white-collar criminals, and even the unwitting end user who clicks Reply-To-All and propagates a virus-bearing email message to everyone inside a corporation. Many times people are involved in an incident and don't even realize it!
Although the media, government, and Hollywood tend to characterize every incident as being caused by a "hacker," this book takes exception to that sensational generalization. Instead, this book uses terms like "adversary," "bad guys," "attackers," or "crackers" to describe those malicious folks that cause incidents to happen. Where relevant, more specific terms (e.g., "script kiddie") are used to describe a particular type of adversary. For more information on the nomenclature associated with the computer security underground, consult the New Hackers' Dictionary located at http://www.jargon.8hz.com/jargon_toc.html.
What Is Incident Response?
Incident response is the discipline of handling situations in a manner that is:
Cost effective
Incident response is by nature a support function (except for companies and organizations that perform incident response as a core business service). Thus, keeping costs to a minimum is vital to success, since incident response is not revenue-generating but revenue-preserving and thus requires some expenditure.
Business-like
In order to be accepted in the business place, incident response must function just like any other business service.
Efficient
In every respect, efficiency is important. Without efficiency, you may encounter duplication of effort, lost time spent learning the basics needed for the given situation, excessive periods of downtime, or the wrong response tools brought to the situation.
Repeatable
Because incident response is a business function, two similar incidents should be handled similarly, if not identically, in every regard. For example, the process for handling a network penetration should be essentially the same at any organization you encounter. There may be some location- or company-specific flavors or differences, or you may need to adapt your process to fit a special situation, but the overall process should not change that much from one incident to another.
Predictable
Incident response functions must also be predictable. Surprises are bad. A business function owner/manager needs to be able to rely upon incident response and to know what services and other support functions are to be expected from the response team.
Risk Assessment and Incident Response
It is clear why a company should invest the resources to establish an incident response program: consider the results and impact on a corporation that suffers a disaster without having prepared for it! In other words, what level of risk is a company willing to accept on its information resources and businesses?
This question is addressed through risk management, in which senior management conducts a cost-benefit analysis to weigh the pros and cons of implementing various security countermeasures such as an incident response program. Risk management defines levels of risk by examining the types and probabilities of threats and vulnerabilities associated with a given organization and balances those findings against the costs associated with protecting against such potential problems. These assessments help senior management decide the level of risk they and the company are willing to accept as a result of implementing (or not implementing) specific countermeasures to potential security problems. For example, not having an incident response process may mean extended periods of downtime and confusion that could affect business operations or revenue, just as not having a properly configured firewall increases the probability of a network being compromised.
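The balancing act described above can be made concrete with a simple expected-loss comparison. The following Python sketch is illustrative only: the formula (expected annual loss = probability x impact), the class and function names, and every figure in the usage note are assumptions introduced here, not a method prescribed by this book.

```python
from dataclasses import dataclass


@dataclass
class Countermeasure:
    """A hypothetical countermeasure being weighed by management."""
    name: str
    annual_cost: float          # assumed yearly cost to run the countermeasure
    probability_without: float  # estimated yearly chance of the event without it
    impact_without: float       # estimated loss per event without it
    probability_with: float     # estimated yearly chance of the event with it
    impact_with: float          # estimated loss per event with it


def expected_annual_loss(probability: float, impact: float) -> float:
    """Expected yearly loss under the simple probability-times-impact model."""
    return probability * impact


def net_benefit(cm: Countermeasure) -> float:
    """Reduction in expected loss minus the cost of the countermeasure.

    A positive result suggests the countermeasure pays for itself
    under the (assumed) estimates; a negative one suggests it does not.
    """
    loss_without = expected_annual_loss(cm.probability_without, cm.impact_without)
    loss_with = expected_annual_loss(cm.probability_with, cm.impact_with)
    return (loss_without - loss_with) - cm.annual_cost
```

As a usage example with made-up numbers: an incident response program costing 100,000 per year that leaves the yearly probability of a serious incident at 0.5 but cuts the estimated impact from 500,000 to 200,000 yields `net_benefit(...)` of 50,000. An incident response capability typically reduces impact rather than probability, which is why the sketch tracks both separately.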
While many resources provide in-depth details of risk management, here are some points to ponder in assessing the levels of risk for enterprise information resources. Truthful answers to questions like these will help determine how robust an incident response capability may be required at a given company, and thus the level of resources needed to make it happen:
  • What business processes are dependent upon the proper functioning of IT systems?
  • To what level has the company entrusted its IT staff to access these critical IT systems?
Development of Incident Response Efforts
Unlike firefighting -- which has been in existence for centuries -- computer incident response is a comparatively new area that began in 1988 with the establishment of the Carnegie Mellon University Computer Emergency Response Team Coordination Center (CERT/CC) in Pittsburgh, PA. Incidents certainly occurred and were handled before this, but it is only since 1988 that incident response has taken shape as a distinct discipline within the information security profession. Previously, an incident in a typical organization was handled by the organization's IT staff and/or its security staff in a more or less ad hoc manner. The results, as might be expected, tended to be hit or miss, and were frequently:
  • Unpredictable
  • Unfocused, with no one knowing who was in charge of the situation
  • Not tightly synchronized with senior management's wishes and priorities
  • Costly, although such ad hoc incident response situations were not even sufficiently organized to provide an accurate accounting of actual costs
  • Time-consuming
The history of incident response as a discrete discipline goes back to November 1988, when a young Cornell University graduate student named Robert T. Morris wrote a program known as a worm, and subsequently unleashed it on the fledgling Internet. Due primarily to the unavailability of large portions of the Internet, the incident resulted in what seemed to be panic and pandemonium. During the incident, individual system administrators, researchers, and others involved in the incident acted independently to analyze and thwart the worm program as it traversed the Internet.
In the aftermath of the Morris Worm, a postmortem study and resulting list of recommendations prompted the Defense Advanced Research Projects Agency (DARPA) to establish and fund the first officially recognized incident response team, the Computer Emergency Response Team (CERT). The CERT, later referred to specifically as CERT Coordination Center (CERT/CC), was established at Carnegie Mellon University's Software Engineering Institute, itself a federally funded research and development center. By design, the establishment of the CERT Coordination Center spawned numerous other incident response teams (IRTs) throughout the world, each one serving its own community or constituency.
Are You Ready? Are You Willing?
One of the first unspoken lessons learned as a computer security expert and member of an incident response team is that no matter how much you prepare, and no matter how organized your schedule is, your duty pager will always go off at the most inconvenient time possible, whether that's 5:00 p.m. on Friday or 3:30 a.m. on Wednesday. Like your local firefighters, professional incident responders must deploy immediately to the scene of the problem, then respond to and resolve the incident, whether with a keyboard and sniffer or a truck and firehose.
The second lesson is that you never get used to the first lesson. That, and Murphy's Law is with you every time the pager goes off.
If this appeals to you, continue reading and learn how to become an effective incident responder. If this kind of life doesn't appeal to you, but you work in the IT profession, continue reading anyway -- those doing incident response will need your support at some point. So why not join them and see what all the fuss is about? After all, forewarned is forearmed.
Chapter 2: Incident Response Teams
Since the Carnegie Mellon CERT Coordination Center (CERT/CC) was established, incident response teams have sprouted in all sorts of places, ranging from government teams to commercial for-profit organizations set up similarly to the CERT/CC. In fact, there are almost as many types of teams as there are teams themselves. This is fortunate in today's digital world -- organizations that recognize the advantages of instituting a robust incident response program have a multitude of options on how it is best accomplished. From a management perspective, one of the primary considerations between the different incident response capabilities is funding: who pays for the incident response services? From an operational perspective, however, the primary considerations are responsibility and services: to whom or what does the incident response team answer, and what services does it offer?
The answers to these questions determine the team's priorities. For example, a team funded by a government agency or large community is responsible to that entire agency or community, not just one or two organizations. Thus, the services that it provides must be divided across the community it serves. Depending on the size of that community, the funding model of the team, and the core mission of the team itself, the team will be able to reasonably offer a particular set of services. The size of the community usually affects the set of services as a matter of sheer scale. That is, within a small constituent community, it is relatively easy to offer one-on-one, personalized services, whereas teams that serve massive communities typically have to offer services that can be feasibly supplied to the masses. We go into these in more detail shortly. The important thing is to know the needs and expectations of the community that the team will serve, and then to choose the options that best meet those needs.
In its early days, CERT/CC provided telephonic and email support to sites affected by incidents. When a site experienced an incident, a representative from the site, such as an affected systems administrator, would call the always-available
Who Should Do It?
This question is not as simple as it seems. Every organization needs to have an incident response program of some capacity in place. The question is not whether organizations should have an incident response program, but where the response team should reside, who should be in charge of it, who should pay for it, to whom it should answer, and so on. For example, a multibillion-dollar corporation can afford to staff an incident response team that serves only the needs of the corporation (i.e., its users and employees) and may not need to share valuable talent resources with another organization. However, in serving only one parent organization, a team cannot easily provide some of the most useful information that traditional teams such as the CERT/CC can offer, like attack trend analysis across a broad spectrum of client sites.
Furthermore, even in a huge corporation, the issues of funding and prioritization are important. Who funds the team, and what community does the team serve? If the team is funded by the IT department, does it provide expert on-site assistance to the individual business units? If so, does the IT department charge the business units for the service? More importantly, if two incidents occur simultaneously and the team is only capable of handling one at a time, who is authorized to decide which incident has priority? These questions may sound mundane or trivial, but they can have enormous impacts on organizations if they are not addressed adequately prior to an emergency.
It should be noted that when referring to an institution, "CERT" usually indicates the Carnegie Mellon University Computer Emergency Response Team (CERT/CC) in Pittsburgh, PA. However, with the rise of incident response as a required security function, the term "CERT" has become synonymous with any organization's incident response capability. Just to make it interesting, there are several terms used to describe such incident response capabilities, but they all generally mean the same thing. You may encounter Computer Security Incident Response Teams (CSIRT), Computer Incident Response Teams (CIRT), Computer Incident Advisory Capability (CIAC), or any combination of letters. They all serve the same general function of handling incidents for their respective constituencies or customers.
Public Resource Teams
The first widely recognized incident response team, the Carnegie Mellon CERT, is the most frequently cited example of a public resource team. It is funded primarily by a government organization, the Department of Defense's Advanced Research Projects Agency (DARPA), and it is a resource available to a general audience. The original 1988 charter of the CERT set up the team to be a "direct service to the Internet community." In that role, the CERT/CC provided the entire Internet with incident response assistance including vulnerability advisories and telephone- and email-based guidance to sites experiencing security crises. (It is only fair to note here that the Internet was a substantially smaller place in 1988 when the CERT was first chartered.)
In the years since the formation of the CERT/CC, dozens of other public resource incident response teams have been formed, each serving its own constituency. These IRTs provide localized assistance for their own constituencies and typically draw on funding provided by some form of fee levied upon the entire constituency. Each IRT's service offerings vary, of course, but the core services usually include the following:
Technical and procedural guidance to sites affected by an incident
The IRTs typically provide guidance via telephone and/or email.
Vulnerability reporting, analysis, tracking, and advisory distribution
Most IRTs function as a clearinghouse through which constituents can report system vulnerabilities. The IRT, in turn, reports the vulnerability to the affected vendor and tracks the vulnerability through its resolution, at which time the IRT issues a vulnerability advisory to its constituents. The advisories provide technical feedback to the constituents with a description of each vulnerability, impact (if exploited), and information on corrective actions.
Internal Teams
The next type of incident response team is an internal team. These teams are loosely modeled on the public CERT model, but are set up and funded to serve a much more limited community, such as a single corporation. Because they are smaller than public CERTs, internal teams can provide the sort of attention necessary for an organization to respond to incidents. Unlike a public team, the internal team is fully aware of all of the policies, procedures, and sensitivities of the organization. Also, by focusing inward, an internal team can usually call upon the resources of the entire organization when responding to an incident. Specialized technical talent, for example, can be matrixed into the team on an as-required basis.
Typically, internal teams are either funded through the corporate offices of the parent organization or are available on a charge-back basis, in which case the business unit pays for the services provided by the team's resources. Hybrid funding can also be effective. For example, the corporate offices pay the basic recurring expenses such as salaries and training, and business units only pay for the on-site services that are above and beyond the norm, such as travel and other expenses incurred by the team in responding to an incident.
The issue of where to place the incident response team internally is very important, and has led to an enormous amount of strife inside organizations attempting to create their own response teams. Some say that incident response should fall under IT as a support function, while others say that it should report to the CFO, the CIO, or directly to the CEO. Other possibilities, such as placing incident response and security functions under an internal audit group, have also been tried.
What works best? It depends on the company, but the important thing is that the incident response team should be placed where it can best support the business units without facing conflicts of interest or significant interference by the corporate environment. This is not an issue that should be taken lightly.
Commercial Teams
The most recent type of incident response team, one that is gaining popularity, is the commercial for-profit team. Commercial teams work on a contract basis and provide technical, procedural, investigative, and legal support to their clients, as required. In an age of outsourcing entire IT organizations, the model of hiring a commercial team can be particularly attractive to companies that cannot afford to staff their own response teams or that simply seek the availability of additional support in preparation for crisis events. This staffing on demand concept works particularly well when coupled with existing internal staffing.
The services that commercial teams offer vary greatly, but the core services usually include the following:
Around the clock, on-demand incident support
Typically, this consists of business-hour personnel availability with pager, cellular phone, or other off-hour connectivity to incident response staff.
On-site incident response personnel available at the client's request
Most professional incident response teams can have a team of their incident response experts on-site within a few hours, depending on travel time.
A full suite of incident support services
Services may include technical, procedural, forensics, litigation, expert testimony, and legal/policy expertise.
Expert and experienced technical personnel
Commercial teams offer personnel who are trained in incident response and constantly maintain their proficiency in the ever-changing technologies involved.
Fire drills
Vendor Teams
Many operating system vendors such as Sun Microsystems, Microsoft, and Hewlett Packard operate their own incident response teams. These vendor-based teams are special cases, insofar as they do not provide most of the services that other response teams do. Instead, they serve as the vendor's security analysts when new vulnerabilities are reported in their products. To be fair, some vendor teams also serve as internal teams for their company, but this section specifically refers to vendor vulnerability teams. When product vulnerabilities are discovered or reported to the vendor, the team typically does the following:
Documents the vulnerability
This should include recording all of the technical details of the vulnerability such as platforms affected, versions, patches/configuration issues, exploitation details, and symptoms.
Verifies the vulnerability
This usually involves replicating the environment in which the vulnerability was first reported, setting up instrumentation to closely observe the system's behavior, and attempting to exploit the vulnerability.
Determines the cause of the vulnerability
Once validated, the team has to determine the causes of the vulnerability in order to recommend an appropriate course of action. How did the problem occur? Where did it occur in the system? This is a forensic process in which a system is painstakingly analyzed, and usually requires a highly skilled and knowledgeable staff.
Recommends a course of action
The vendor management team needs to decide whether to fix the problem in the current release of the product or to implement the fix in the next release. The incident response team provides the necessary input to senior management so that the right choice can be made. Such recommendations also address backward compatibility issues with older versions of the product in question. Many horror stories have been told of organizations deploying just-released security fixes to their systems that ended up causing other problems, ranging from reduced system reliability and stability to the opening of other security holes. Vendors need to test their fix against as many configurations and versions as possible to give their customers the widest possible level of support and patch reliability.
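The four steps above can be sketched as a small record type that carries the documented details and moves through the stages in order. This is a minimal illustration, not any vendor's actual tracking system: the `VulnerabilityRecord` and `Stage` names, the field names, and the strict one-step-at-a-time progression are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List


class Stage(IntEnum):
    """Hypothetical stages mirroring the steps described in the text."""
    REPORTED = 0
    DOCUMENTED = 1
    VERIFIED = 2
    CAUSE_DETERMINED = 3
    ACTION_RECOMMENDED = 4


@dataclass
class VulnerabilityRecord:
    """Technical details a vendor team might record for one vulnerability."""
    title: str
    platforms_affected: List[str] = field(default_factory=list)
    versions: List[str] = field(default_factory=list)
    patch_or_config_issues: str = ""
    exploitation_details: str = ""
    symptoms: str = ""
    stage: Stage = Stage.REPORTED

    def advance(self) -> Stage:
        """Move the record to the next stage; stages cannot be skipped."""
        if self.stage is Stage.ACTION_RECOMMENDED:
            raise ValueError("record is already at the final stage")
        self.stage = Stage(self.stage + 1)
        return self.stage
```

A new record starts at `Stage.REPORTED`, and each call to `advance()` moves it one step closer to `Stage.ACTION_RECOMMENDED`. Forcing the stages to be visited in order reflects the point made in the text that a vulnerability should be verified before its cause is analyzed and a fix is recommended.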
Ad Hoc Teams
If you've gone through this list of team types and you still believe that none of them can possibly work in your organization -- perhaps because you can't get upper management buy-in -- there's still an alternative. You may want to consider setting up an ad hoc team of, for example, system administrators and related technical staff. Simply coordinating off-hour points of contact of key staff and discussing the possibility of what to do during an emergency can be better than no planning at all. That team should include the various IT support staff, but don't neglect to list some of the key senior management staff. Many times, when an incident occurs at organizations without formal security groups or contracts with commercial incident response teams, managers run into the IT department and pluck whoever is available at the time to deal with a reported incident. Such impulse actions usually hinder the effective resolution of an incident.
An ad hoc team should only be considered as a last resort. Any inkling of senior management buy-in will always be preferred, and proceeding without that buy-in could have career-limiting effects. However, in the event of a real security crisis -- where business operations may be impacted -- your contact list and planning, albeit limited, may prove invaluable in responding rapidly with some degree of success. Most managers will tell you that being effective means knowing the capabilities of your people. The same holds true for incident response planning.
Table 2-1 summarizes the various team types and their strengths and weaknesses.
Table 2-1: Types of incident response teams
Team type | Pro | Con
Public resource
Forum of Incident Response and Security Teams (FIRST)
The Forum of Incident Response and Security Teams (FIRST) is a nonprofit professional organization made up of member incident response teams. It brings together a large number of incident response teams that span a wide spectrum of public resource, internal, vendor, and commercial teams; it is probably the only single organization dedicated solely to the advancement of incident response teams. Apart from the usual networking benefits that such an organization facilitates, FIRST holds periodic technical colloquia that focus on exchanging technical information among the member teams in sessions that are closed to nonmembers of FIRST. Additionally, FIRST holds an annual conference -- the only one of its kind -- dedicated exclusively to incident response. The conference usually includes several one-day seminars on incident response topics for start-up as well as experienced teams and their staff. As FIRST is a nonprofit organization, its funding comes primarily from the proceeds of the annual conferences, and it relies heavily on the volunteer efforts of the member teams in order to survive. It should be stressed that FIRST is not, and was never intended to be, an operational incident response team like the CERT/CC. Its sole function is to support and bring together its member teams.
Any company or organization that is setting up its own incident response program should look into FIRST, if only by attending one or more of its annual incident handling workshops to see what it's all about.
Now Who Should Do It?
The issue of who should be charged with performing an organization's incident response activities is critical. The question, to a large degree, may be answered by the available funding. Clearly, not every organization can afford to staff its own incident response team or contract with a commercial team. In such a case, incident response may be viewed as another "duty as assigned" task for the IT staff. If so, these fortunate staffers would be well-advised to get to know what resources are available to them from public teams such as the Carnegie Mellon CERT/CC, and to make the effort to keep up with changes in information protection technologies.
For a company or organization that has adequate budgetary resources available for an incident response program, however, it has been our experience that the most effective answer to the "Who should do it?" question has been a hybrid of all four of the types of teams discussed in this chapter. That is, the company should have at least one full-time employee whose principal responsibility is incident response coordination. That person should have on-demand staffing available by way of internal staff augmentation, an external commercial team, or both. Finally, the internal team, even if it is only one person, should know how to contact the vendor teams for all of the critical IT equipment it owns.
A hybrid incident response team made up of internal staff for handling day-to-day issues and the management and coordination of others, along with external teams to provide additional expertise and resources, offers the best of all worlds.
The hybrid team ideally includes all the positive aspects of each of the team constructs discussed in this chapter. It should also thoroughly understand the pitfalls and disadvantages associated with these constructs and seek to mitigate them. When set up correctly, this sort of team can rise to almost any challenge that comes up, as it is prepared for the unexpected. Preparing for the unexpected is the cornerstone of any sort of crisis response, and incident response is no exception. The team members must also understand their roles, responsibilities, and level of authority during the incident, both to avoid confusion and to keep from igniting corporate power plays between different managers or departments.
Chapter 3: Planning the Incident Response Program
With so many choices, how does an organization begin to set up an incident response program? As with most journeys (or corporate actions), making the decision to support the idea is the crucial first step. It's absolutely vital to get senior executive-level support for developing your incident response program. One thing is for certain: anyone proposing this to senior management must be able to make a compelling business case. Although much of the information in this book is meant to make that process easier, we cannot possibly give readers all the material required to take their case to management. A lot of the business case comes down to a cost-benefit analysis presented so the executive can quickly see the benefit of having an incident response capability established. After all, capability (and management interest!) already exists in being able to respond to fire, theft, burglary, or medical emergencies.
This chapter covers the administrative, management, political, and operational issues of setting up an incident response program. Naturally, the specifics vary by organization, but there should also be a great deal of common ground.
Establishing the Incident Response Program
Once an organization has made the decision to proceed with an incident response program, the fun really begins for those charged with leading the undertaking. Be prepared to spend time documenting technical and managerial procedures, defining staff roles and responsibilities, staffing requirements, fighting for adequate funding, identifying and purchasing tools, training your staff, and taking care of a whole slew of other administrative details specific to your situation and organization.
On the bright side, the more attention you pay to detail now, the more likely you are to succeed in the future. The more you train, test, document, validate, optimize, and document again, the smoother things are likely to run during a real incident.
The more you sweat in training, the less you bleed in combat. -- CAPT. Richard Marcinko, U.S. Navy (Retired), founder of SEAL TEAM 6.
What student in America has gone through a school year without at least one fire drill? The rationale is that the students should be familiar with the fire exit procedures in a real emergency. Likewise, an incident response program is no different; you must be able to test and measure its effectiveness to help strengthen it in the long run. And, like most emergency programs and processes, failing to test it operationally is quite likely to result in pandemonium during a real incident, because the staff is far less likely to remember all of the emergency procedures that you've spent so much time developing and documenting.
A common administrative mistake is not distributing the "how-to" policies and procedures to the people who need them the most -- the people in the field who will be the actual responders to an incident. In the case of school fire drills, developing building evacuation procedures is a useless effort if no one knows what they are. It is no coincidence that easy-to-read fire exit locations and evacuation instructions are almost always found in classrooms. This is a critical lesson for the incident response coordinator.
Internal Versus External
Today, outsourcing some IT functions has become very popular. The same holds true for incident response, as there is a growing number of commercial incident response teams available to help an organization battle crises. Care should be taken, though, in deciding whether to operate the team as an internal function or an external one. Basically, an internal team is dedicated to the company, and an external team is only available on demand. It is all too easy to contract the function and assume that it is being handled when, in fact, the team isn't available quickly enough to fight day-to-day problems. The main reason in favor of contracting 100% of the incident response function is to save money; that is, only pay for the incident response resources when they are necessary. In our experience doing incident response in a commercial environment, we've never seen this approach work. In fact, we know of no reputable incident response service vendor that would sell its service to a client who wants to rely on it for 100% of incident response needs.
The most effective incident response programs we've seen are those in which the client has a robust internal incident response team, staffed with technically skilled and experienced personnel, and contracts with an external team (or teams) to provide on-demand staff or expertise augmentation. This can be highly effective at supplying the necessary skill and expertise when needed, while still keeping costs down to a reasonable level.
We should mention that this model does not replace the matrix model. Recall that the primary driving force behind a matrix-driven incident response organization is to augment the full-time staff with technical experts, or simply additional people to help fight a big fire. Augmenting a team with an externally contracted team is usually done to add incident response process or technology experts to the internal team. These two models solve two very different problems, although companies sometimes use their external response teams in much the same way that they would a matrix-augmented team when the matrix staffing is not available.
Types of Incidents
We've covered most of the basic administrative, management, and political issues that a team planner is likely to face in implementing an incident response team. Now, the issues are more operational in nature. That is to say, the planning should turn to address the issues that most directly impact what the team will be doing, by whom, and for whom. With that in mind, what sort of incidents should the team anticipate? Anticipating what type of incidents to expect is, in essence, answering the question of "What is the problem?"; once you understand the problem, you can develop a solution set. Let's preface an answer to that question by saying that, in the more than fifteen years that we've been performing incident response operations professionally, we've never hit a steady-state condition in which we have the luxury of simply coasting on past experiences without new situations cropping up. Nearly every incident we've worked on has presented a new challenge, whether technical, procedural, legal, or from a human interaction standpoint. As we've learned, Murphy loves to ride along and lend his unique brand of assistance during incidents. That having been said, what sort of incidents should you plan for?
Much of the answer to that question comes from within your own company. What sort of incidents have you seen to date, in the absence of a formal incident response program? Although it is possible the answer to that is unknown or undocumented, it is likely the question can be at least partially answered by talking with IT helpdesk staff or a similar organization. Next, take a look at what other similar companies experience. Dozens of surveys published in the security and IT trade press can give you that kind of information. Additionally, look at some of the anecdotal information that comes out of publicly funded activities such as the "CERT Summaries" freely posted periodically by the CERT/CC. These provide valuable insight into the types of actual incidents observed at other companies. These sources will supply you with a core set of incident types to plan for. It is highly likely that the list will span a broad range of incident types from small-scale PC virus incidents through massive industrial espionage incidents involving competitors robbing each other of millions of dollars in trade secrets.
Who Are the Clients?
Incidents involve both a source and a destination. Sometimes they are one and the same, and sometimes both are victims. Once you have thoroughly explored the question of "What is the problem?" and developed a realistic list of likely incidents, you're ready to move on to the next step in the planning process: defining who the clients are (clients are commonly referred to as a constituency in the incident response community). You should clearly articulate who the team is expected to be serving. Depending on the nature of your team, this can be a trivial exercise or a difficult one. For example, a corporate team should be able to clearly say that its client base is anyone in the company experiencing an information protection crisis. On the other hand, a public resource team might have a tougher time defining its clients. Even for a corporate team, however, the lines aren't always so clear. How about contractors' outsourced systems, application hosting providers, and networked partners? Are they covered by the team on behalf of the corporation? Or, if the corporation is a large one, it is likely that there are subsidiary companies, or that the team itself resides within a subsidiary company. In these cases, are the subsidiaries covered during an incident?
Summary
Think through the issues proposed in this chapter during the planning phase and be sure to clearly articulate them in the team's plans. Further, like so many other issues regarding incident response, it is important to be flexible. Saying, for example, that Subsidiary X is not covered by this service is all well and good until Subsidiary X experiences a crisis. It has been our experience that these administrative issues, while important to understand and plan correctly, should be flexible during times of crisis. The team has to have a customer service orientation to the entire constituency in every sense.
Chapter 4: Mission and Capabilities
When you have a clear understanding of your clients and what their problems are, it is time to start developing some solutions. At the top of the list should be planning the team's core services. As we've previously discussed, there is a wide range of possible services a team can provide to its customers. The actual list of services will obviously vary somewhat by organization, but the list should be carefully thought through and decided on, although it is bound to be revised over time. Also understand that the list will probably be driven not only by emergency services, but by services that can be performed during times of noncrisis (however rare). These nonemergency services can be used to justify a team's existence by doing things like awareness training to raise the corporation's overall level of security.
One effective way of coming up with a reasonable list of core services is to start by defining only the emergency-driven services. These are likely to include emergency hotline support, hands-on crisis response technical services, incident investigation support, advisory distribution, and technical support for installing vendor-supplied security patches. Once you have that list and have established a good understanding of what each service entails, make a reasonable estimate of the resources required to perform these services and the percentage of staff time they are likely to take. From that, fill in the gaps by planning nonemergency services like awareness training, technical security seminars, and any other services the corporation can benefit from.
Also, before jumping in and defining the services, consider visiting some of the key business unit managers and asking them what sort of security services they perceive a need for. It might be that they don't see a need for any, or few, based on a lack of understanding of what might be available, so this is also an opportunity to educate them somewhat. At least for most corporate incident response teams, the business managers are the end customer of the service; showing that you really will provide the services they need most is certainly time well spent. This is especially true if the business units are going to provide some of the direct or indirect funding for the team's services.
Roles and Responsibilities
Right now, you're probably tempted to put the book down and scramble to assemble your elite squad of cyberparamedics. But whoa, partners! There is more than simply figuring out what your core services are! Special times (such as incidents) require special people.
Staff selection should be driven by the list of core services your team will provide. Will you support all of the computing platforms, operating systems, applications, and services in use at your organization, or only a select few? Once you identify those core services, you can address the more engaging questions about how many people you will need on the team, their required technical skills, and what experience levels you will need. Typical job positions include a team manager, a senior technical leader, assorted technical and nontechnical engineers, analysts, and assistants.
Before even thinking about how your new team will operate, you need to know the whos, whats, and hows of the team. First you will start thinking about who you want to include on your team based on qualifications and position. But that's not enough. You then have to know what job position (role) each person will fill on the team, and also how well each can function under pressure as both an individual and a member of an elite technical team.
To many managers, two of the most dreaded phrases in Corporate America are "job description" and "performance evaluation." Still, documenting each of the team's staffing positions is a necessary process. Your job descriptions should include a list of what each position (team leader, senior technical lead, technical staff, and so forth) on your team is expected to do, along with the skills, experience levels, and educational backgrounds necessary for that position. The list of duties should be as detailed as possible, with the inevitable "other duties as assigned" included for good measure. The good news is that this information can be used for each employee's performance criteria, especially if the job description is sufficiently detailed.
Staffing and Training
Although the size of each team is going to vary depending on its services and its available funding, there are some staffing issues that all teams should understand. It is of paramount importance that every team member have a positive, customer service-oriented attitude. This may sound like a cliché, but it's important to the success of any service-oriented organization. When it comes to a supplier who provides emergency services, it is even more essential. Always give customers more than they expect. Great care should be taken in selecting the right staff members.
One of the reasons that the Carnegie Mellon CERT/CC organization was successful in pioneering the incident response profession is that it spent a great deal of time selecting its personnel. Here's how an actual interview was conducted in 1988 when the CERT/CC was recruiting its technical staff. The process consisted of nine back-to-back interviews with various managers and senior technical staff. Both hiring managers and interviewees found it to be a grueling day. However, we grew to appreciate the great effort placed in selecting staff members. Each of the key staff managing, supervising, or working with a prospective employee had the opportunity to interview the applicant. In some cases, multiple days of interviewing were called for. And, in each case, every interviewer's responses were collected and considered in the hiring decision; everyone had a say in the process. In the after-interview discussion by all of the people who interviewed the candidate, everyone was encouraged to speak openly and candidly about the candidate. The selection process is time-consuming, but essential. Take the time prior to hiring someone to make sure that the person is a good fit in every sense.
While staffing your team, it is important to remember that the collective members have to remain current in their skills and experiences. Time and resources must be devoted to training. This issue is covered in more detail in later chapters, but when staffing your team, remember to decide what percentage of each person's time is going to be spent in training, because that will drive the decision of how many people to put on a team. While each company's threshold is unique with regard to the percentage of time and money an employee can spend in training, the argument can be made that incident response is an emergency services profession that truly requires skilled and knowledgeable practitioners. Would you be comfortable with a paramedic who only knew 1980s emergency technologies and processes?
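The effect of training time on head count can be made concrete with a little arithmetic. The following is a minimal sketch, not a staffing formula from this book: the function name, the 40-hour week, and the sample workload of 120 service hours per week are all assumptions chosen for illustration.

```python
import math


def staff_needed(service_hours_per_week, hours_per_person=40.0, training_fraction=0.10):
    """Estimate how many people are needed to cover a weekly service workload.

    training_fraction is the share of each person's time reserved for
    training, so it is subtracted from the hours available for services.
    All default values are illustrative assumptions.
    """
    if not 0 <= training_fraction < 1:
        raise ValueError("training_fraction must be at least 0 and less than 1")
    available_hours = hours_per_person * (1 - training_fraction)
    # Round up: a fraction of a person still means another hire.
    return math.ceil(service_hours_per_week / available_hours)


# Hypothetical workload: 120 hours of service work per week.
without_training = staff_needed(120, training_fraction=0.0)
with_training = staff_needed(120, training_fraction=0.25)
```

In this made-up case, reserving a quarter of each person's time for training raises the estimate from three people to four, which is the kind of trade-off worth settling before hiring begins.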
Involving the Critical Players
When you have decided on your team's basic services, and recruited and assembled your team according to the guidelines mentioned earlier in this chapter, it is time to make sure that your concept of "team" extends beyond the individuals who are directly assigned, matrixed, or contracted to the core team. In fact, the actual incident response team should really just be the process and technical experts who lead the charge in response to a security problem. During a crisis event, the concept of a team needs to extend far beyond those boundaries. The overall team, in a macro sense, needs to involve a much larger group of the organization's employees beyond simply the technical incident response staff.
We were involved in a role-playing desktop exercise with a client in which we simulated an incident with a group of the client's senior management and executive personnel. During the exercise, we stepped through a realistic hypothetical incident involving the theft of large amounts of money from the client. Although the actual incident response team was represented in the room during the incident, it very quickly became obvious to everyone in the room that it would be impossible to have the team run the show during an actual crisis. Some of the most unlikely people in the room rapidly found themselves key figures in the actual handling of the incident, even though they were not members of the incident response team and were never even told what their incident response roles and responsibilities would be. At one point during the scenario, a camera team in the lobby of the building confronted the company's CEO with a surprise "interview." Fortunately, the CEO had been fully apprised of the situation and been given, by the office of public affairs, a recommended statement to make to the media. Thus, he was able to give the reporters a statement that helped them in putting together their story, while neither making the company look foolish in the eyes of the media nor giving away any critical information about an ongoing investigation. In this example, the public affairs office played an invaluable role in the overall handling of the incident by briefing the CEO and making sure that he would be fully prepared for a possible media assault. Even though this may not seem like a normal function for an incident response team, attention to detail can make the difference between a successfully handled incident and a public embarrassment. As a result of this exercise, the company now considers its incident response team to be a dynamic entity made up of a wide range of people from throughout the company.
List of Contacts
Creating and maintaining an up-to-date list of people, phone numbers, pager numbers, and email addresses is tedious, boring, and time-consuming. And yet not having one (or having an incomplete or incorrect one) is the kiss of death for a response team. As mentioned in the opening paragraphs of this book, one of the first lessons an incident response team learns is that all major incidents start on Friday afternoons. Also, major incidents almost always involve intense, stressful activities during the wee hours of the night. Not being able to get in touch with a vital contact such as a system administrator at 3:00 a.m. when the person is really needed to help out can be disastrous.
That being the case, a good contact list should be considered the team's most valuable asset. The list should be checked frequently, updated, and readily accessible to all the team members. That isn't to say that it needs to contain every person in the company or contracted by the company to assist in an incident. In some cases, calling the 24x7 Network Operations Center and requesting that So-and-So be paged is sufficient.
The contact list should contain information about the person's job function, authority, and perhaps skill sets, making it as easy as possible for the response team to find a person when needed. If the team doesn't know the person by name, then they should be able to look up the person by position, organization, job function, or skill set.
This all sounds wonderful and easy to do, but the truth is that it is anything but easy. Populating the list or database is difficult enough, but maintaining it when someone leaves the company or changes jobs can be next to impossible. So, another piece of information for each record should be the date that the information was last updated or verified. Tracking that can help flag entries that haven't been cared for in a long time -- depending on your company's definition of "long" -- and alert someone to verify it. Part of the team's "other duties as assigned" should be to keep the database up to date.
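The record layout described above can be sketched in a few lines. This is only an illustration of the idea, assuming Python and an in-memory list: the field names, the sample people, and the 90-day threshold for "long" are all hypothetical, and a real team would pick its own storage and its own definition of stale.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Contact:
    # Field names are illustrative, not taken from any particular product.
    name: str
    phone: str
    email: str
    position: str
    organization: str
    job_function: str
    last_verified: date  # when this record was last updated or checked
    skills: set = field(default_factory=set)


def find_by_skill(contacts, skill):
    """Return contacts whose skill set includes the given skill (case-insensitive)."""
    wanted = skill.lower()
    return [c for c in contacts if wanted in {s.lower() for s in c.skills}]


def find_by_position(contacts, position):
    """Return contacts holding the given position (case-insensitive)."""
    return [c for c in contacts if c.position.lower() == position.lower()]


def stale_entries(contacts, today, max_age_days=90):
    """Return contacts not verified within max_age_days (90 is an assumed default)."""
    cutoff = today - timedelta(days=max_age_days)
    return [c for c in contacts if c.last_verified < cutoff]


# Made-up sample records.
contacts = [
    Contact("Pat Example", "555-0100", "pat@example.com", "System Administrator",
            "IT Operations", "Unix server administration",
            date(2024, 1, 10), {"Unix", "DNS"}),
    Contact("Lee Sample", "555-0101", "lee@example.com", "Network Engineer",
            "Network Operations Center", "Firewall management",
            date(2024, 5, 1), {"Firewalls", "Routing"}),
]

today = date(2024, 6, 1)
dns_experts = find_by_skill(contacts, "dns")
needs_check = stale_entries(contacts, today)
```

Here the lookup by skill finds the administrator even when the team does not know the name, and the same record is flagged for verification because it was last checked more than 90 days before the sample date.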
Setting Up a Hotline
An incident response team's lifeblood is its communications; clients have to be able to alert the team to an emergency any time of the day or night, and from anywhere. Just like any customer service organization, the team must always be available. One of the most common methods of doing that is to set up a hotline. This is most often a physical phone line that is used for nothing other than calls to the team, but email, cellular, and/or pager systems almost always support it. Keep in mind that if the network or email servers are down or having problems, email notifications probably won't work or be sent in a timely fashion, but you may be able to use alternate email accounts or dialup accounts if email is absolutely necessary.
The three most important communications media for most organizations are telephone, facsimile, and email. All of them should be available around the clock to customers the team supports. The bottom line is that a customer must never be in a situation in which the team is unreachable, for any reason. Of course, being available around the clock can be difficult, especially for a one- or two-person team. It might well take some innovative applications of technology and resources to accomplish that level of availability.
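One way to think about "never unreachable" is as an ordered list of channels, each tried in turn until one gets through. The sketch below assumes Python and uses stub functions in place of real phone, pager, or email integrations; the channel names and the `alert_team` function are hypothetical and stand in for whatever systems an organization actually has.

```python
from typing import Callable, Iterable, Optional, Tuple

# A channel is a name plus a function that returns True if the alert was delivered.
Channel = Tuple[str, Callable[[str], bool]]


def alert_team(message: str, channels: Iterable[Channel]) -> Optional[str]:
    """Try each channel in order and return the name of the first that succeeds.

    Returns None if every channel fails, which is the situation the
    team's planning is meant to prevent.
    """
    for name, send in channels:
        try:
            if send(message):
                return name
        except Exception:
            # A broken channel (for example, a mail server that is down)
            # must not stop the remaining channels from being tried.
            continue
    return None


# Stub senders for illustration only.
def email_down(message: str) -> bool:
    raise ConnectionError("mail server unreachable")


def pager_ok(message: str) -> bool:
    return True


delivered_via = alert_team(
    "Possible intrusion on the web server",
    [("email", email_down), ("pager", pager_ok)],
)
```

With the stubbed email channel failing, the alert falls through to the pager, which mirrors the point that email alone cannot be relied on when the network itself is part of the problem.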
Depending on the size of your organization and team, the hotline you set up might be answered by dedicated hotline staff, another help desk facility, or even an automated voice mail menu system. In several cases, teams are dispatched via the company's Network Operations Center. Likewise, em