Incident Response
By Kenneth R. van Wyk & Richard Forno
0-596-00130-4, Order Number: 1304
234 pages, $34.95
Tools of the Trade
Anyone who reads an information security magazine or spends time on the Internet quickly realizes that there are a myriad of security tools available. These tools range from small, simple freeware security utilities to comprehensive professional suites of security tools designed to solve all your security problems. Because there are a vast number of tools, it can be difficult for the incident response professional to select the right tools for the job. But, contrary to vendor claims, no panacea or silver bullet exists that can solve all your problems. More importantly, very few, if any, security tools are designed from the ground up to be incident response tools. There are tools for encryption, authentication, intrusion detection, and so on, but there are (arguably) no tools designed specifically for incident response. Many of the available tools are useful for incident response activities, but were not designed for that purpose. Thus, it is up to the incident response practitioner to study the available tools and their applicability to incident response operations, and then to adapt the tools they deem appropriate for the task at hand.
This chapter describes a large percentage--though certainly not all--of the currently available tools and discusses each tool's strengths and weaknesses specifically with regard to its utility and applicability as an incident response tool. The discussions about the tools are based, in almost every case, on our experiences using them in actual incidents. The operational aspects of the tool, such as the vendor's ability to support it, are stressed in the discussions here. We do not seek to endorse or condemn any tools, only to show the novice incident responder the capabilities, pros, and cons of the various tools he or she will encounter. Where one tool comes up short, you can bet others will fill in the gaps nicely.
What's Out There?
As we've already indicated, the spectrum of available tools is broad. Even under ideal situations, not all of them are applicable to incident response operations. Although some can be considered system administration tools as well, we're going to focus on the tools that can be used as part of incident response operations. The list of tools we've assembled has been put together through actual incident response operations. As such, it has been growing, changing, shrinking, and updating for some time now.
One of the many encouraging and rewarding aspects of the incident response community (especially FIRST) has been its willingness and eagerness to share technical ideas among teams and individuals. FIRST holds periodic Technical Colloquia; the primary purpose of these TCs is to exchange technical data not related to actual sites that have been under attack. One of the more successful topics of discussion at numerous TCs has been technical presentations about practical experiences with different security tools. Most of the member teams are more than willing to candidly discuss their successes and failures in using security tools. This sort of open exchange of nonclient technical information is a tremendous benefit to FIRST and to all of the member teams that participate in the TCs.
In order for a tool to be truly useful in an incident response operation, it has to be able to support at least some of the basic operations that incident response practitioners encounter in their work. Some of the requirements that we've found over the years are as follows:
- Above all else, the tools used the most in incident response operations are the ones that give the operator the maximum amount of flexibility. Each incident has the potential to be different. You could quickly find yourself needing different configurations of the tool, saving information to different magnetic and optical media in a variety of data formats, and so forth. New, previously unforeseen situations constantly arise in incident response operations. As responders, our tools of the trade need to support them!
- In general, there is a fairly small number of actions that the tools need to support. These include data collection, 24/7/365 notification of critical events, data analysis and documentation, litigation support, and attack visualization. While these criteria will no doubt evolve over time, this is a fairly accurate list of the top-level functions and features needed as of this writing.
- Almost without exception, incident response operations need to be carried out discreetly so that the subject or target does not know she is being observed. The processes that we use need to be undetectable--within reason, of course--to the environment they are being deployed in. Sometimes this means connecting a computer to a network in such a way that the computer cannot be electronically observed; sometimes it means analyzing a computer system's contents in a way that doesn't disturb the contents at all; but it always means doing things quietly, efficiently, and promptly.
It is often the case that a suitable commercial or freeware tool cannot be found for a particular task. All incident response teams should be capable of putting together one-off tools for such occasions. These tools are often just used once and then stored in a repository for a future operation in which parts of the tool can be "plagiarized" for some other purpose. Examples of such one-off tools are almost always written as shell scripts, Perl scripts, batch files, or some other simple (but effective) programming language that can be very quickly tossed together into a simple archive. Indeed, some of the most useful tools we've encountered in operations were Perl scripts coded at 3:00 in the morning that served their one-time purpose and have quietly sat in a repository someplace to be resurrected someday when called upon.
One final side note before we get into the details about the specific tools regards the need to support law enforcement-grade evidentiary information. The data gathered by the tools that we're going to discuss may need to be presented in a courtroom situation as evidence. The collection and storage requirements for information that may be used in court are very different from normal network and IT diagnostic information. In a courtroom, the opposing counsel is likely to impugn your information by casting doubt on its integrity and how it was handled or prepared from the time of capture to the day of trial. For example, could the information have been fabricated or tampered with during the collection, analysis, or storage phase of the operation? If the answer to that question unveils any doubt, then it is likely that the information that you collected will at best be ineffective at convincing the judge or jury; at worst it will be dismissed as evidence and thrown out of the courtroom entirely. While electronic evidence-handling is a separate book unto itself, here are a few guiding points anytime you are handling computer evidence:
- Two-person rule
- Throughout the entire information gathering process, the data collected should be handled by two people, ensuring that no single person could have tampered with the information without the knowledge of another person.
- Process documentation
- The process, tools, etc., used in the collection of the information should be documented and signed by the people doing the collection. You can never document too much!
- Tools sealed and locked away
- Any tools used for the gathering, analysis, or documentation of information--in particular if they are of the one-off script variety such as we described previously--should be copied onto magnetic or optical media and locked away in a signed and sealed container. Ideally, the container should be stored in a trusted third-party's storage facility--we've found that storing sensitive information in our clients' General Counsels' offices is effective. Any access to that locked storage facility should be documented, dated, and signed.
- Information sealed and locked away
- Likewise, any actual data collected that might be needed as evidence should be stored on magnetic or optical media, sealed, and locked away--quite possibly for a very long time. Of course, the information that you've collected will need to be analyzed. So, prior to sealing it and storing it away--until needed in a courtroom, preferably--a duplicate copy should be made. That duplicate will be used for analysis and documentation. The duplicate itself should be backed up prior to doing the analysis. So, there should be three identical copies of the information: the original (locked away), the working duplicate (on which you do your forensics), and the backup of the duplicate (locked away in case you need to make another working duplicate).
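One way to confirm that the three copies really are identical is to compare cryptographic digests of each. The following is a minimal sketch in Python, not a forensic procedure: the function names and the choice of SHA-256 are our own assumptions for illustration, and the file paths would be whatever images you have actually made.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so that large disk images do not have to fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original, working, backup):
    """True only if all three copies hash to the same value.
    (Hypothetical helper; the names are illustrative.)"""
    digests = {sha256_of(p) for p in (original, working, backup)}
    return len(digests) == 1
```

Recording the digest of the original in the signed process documentation at collection time gives you a reference value to check the working duplicate against before and after analysis.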
At some point in nearly every incident response operation, the decision of whether to handle information as potential evidence should be made. As you can no doubt tell from the previous description of how to handle evidence, there's a rather large administrative burden when doing so. That burden can cause undue difficulties on the team when they are trying to get a system back in business as quickly as possible. If the information is not going to be used as evidence, then it is often best to make that determination early on and get on with the operation in the absence of the evidentiary burden. If the determination is made to treat the information as evidence, be sure to record who accesses, handles, processes, and prepares it, and which methods or tools are used. The chain of custody is a crucial part of proving in a courtroom that electronic evidence has not been tampered with or inadvertently modified.
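As an illustration of recording who handled the evidence and with which tool, here is a minimal Python sketch of an electronic custody log. It is an assumption on our part, not a legal standard: each record stores the hash of the previous record, so a later alteration of an earlier record becomes detectable. The field names and function names are hypothetical.

```python
import hashlib
import json


def _digest(body):
    """Hash a record body in a stable (sorted-key) JSON encoding."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def add_custody_entry(log, who, action, tool, when):
    """Append one custody record, chained to the previous record's hash."""
    body = {
        "who": who,
        "action": action,
        "tool": tool,
        "when": when,
        "previous_hash": log[-1]["entry_hash"] if log else "",
    }
    record = dict(body, entry_hash=_digest(body))
    log.append(record)
    return record


def custody_log_intact(log):
    """Recompute every hash and check that each record links to the one before."""
    previous = ""
    for record in log:
        body = {k: v for k, v in record.items() if k != "entry_hash"}
        if body["previous_hash"] != previous or _digest(body) != record["entry_hash"]:
            return False
        previous = record["entry_hash"]
    return True
```

A log like this supplements, rather than replaces, the signed and dated paper records described above; someone with full access to the file could still rebuild the entire chain.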
We've often found that when business operations are placed at risk from an incident, the best thing to do is make your backup images and then allow the system admins to restore their systems while you do forensic analysis and investigation someplace else. Remember, business always comes first!
With that being said, let's dive into our list of tools and their applicability to incident response operations. Many of these tools are ones that you might want to include in a fly-away kit, a set of tools and equipment that is always ready to use when responding to an incident. As you read about the following tools, you might consider which of these you would want to include in your own kit. Fly-away kits are described in more detail later in this chapter in "The Incident Kit" section.
Many general-purpose network diagnostic tools can be useful during incident response operations. These include network protocol analyzers, sniffers, and network-based Intrusion Detection Systems (IDSs). In fact, network-based tools are probably the most useful diagnosis and analysis tools available today to the incident response practitioner. Their applicability is enormous. The most common incident response uses of network-based tools include the following:
- Attack detection
- Although not part of incident response per se, a good Intrusion Detection System (IDS) architecture can be a superb means of electronically watching over an information infrastructure for indicators of attack. In fact, an IDS and an IRT function are really two complementary parts of a robust information security program. Further, many IDS tools have features that are highly applicable to incident response operations. Note that we are referring to Intrusion Detection here in a very broad sense, not just dedicated special-purpose IDS tools. Included in this category of Attack Detection is any network or system component that provides event logging data that detects whether an incident has taken place. This could include, for example, standard operating system event logs, router event logs, and so forth. When these logs are appropriately monitored for anomalies, they can be tremendously useful at alerting the operations staff to possible security incidents.
- Attack diagnosis
- Once an incident has been detected--correctly or incorrectly--it is usually necessary to diagnose the actual nature of the attack. For example, was an outside intruder successful at gaining access to an internal system? If so, what information did he get to? Was any of that information tampered with? What vulnerability did he exploit in order to gain the access? Network tools can be very helpful in going through the network-level information to analyze the attack and answer these types of questions.
- Attack analysis and visualization
- Once an attack has been observed, captured, and diagnosed on a network-based collection tool such as a protocol analyzer or IDS, it needs to be fully understood. Unlike the process of diagnosing the attack, this is an intense process of going through every aspect of the attack and trying to figure out what the attacker was doing. The most important aspect is trying to figure out what the attacker's intent was. One of the most useful ways of accomplishing this is by using a tool that supports attack visualization, or playback. This is a process in which the analyst can replay the attack and essentially watch what the attacker would have seen on his own screen during the attack, just like a VCR replaying whatever it has recorded on television. We should warn you that very few commercial tools do this well at all, and this particular feature is something of a holy grail.
- Event notification
- Another fairly common process requirement for the incident response team is to be notified of a particular network event, such as an intruder reentering a system or an employee suspected of wrongdoing accessing a file or area of a system that he is not authorized to. A full-featured network-based tool can support this kind of requirement by providing out-of-band notification (e.g., dial-out to a pager) so that the IRT can be alerted around the clock.
Out-of-band communications are simply alternative methods of communicating. For example, in a networked environment, out-of-band communication tools are telephone, cellphone, wireless networking (such as Ricochet), and pagers.
- Media adaptation
- Although a somewhat more mundane requirement, it is very often necessary to connect network-based tools to strange networks or devices not commonly used. These can include the kinds of very high speed media that are often found in data centers, such as FDDI or other fiber optic cabling, all the way through external Wide Area Network (WAN) connections such as T1, T3, OC3, OC12, and so on. Most out of the box network-based tools described here are going to be equipped with a standard selection of supported network media, and the IRT is likely to find the need to connect to other media from time to time. For these cases, having a diverse collection of network media adaptors, connectors, terminators, network interface cards, and related connection devices is invaluable. Nothing is worse than a fire truck showing up to a house fire and then realizing they've got the wrong type of hose to connect to the hydrant. Guess what happens to the house during the delay.
Network Monitors and Protocol Analyzers
Although primarily a mainstay as a diagnostic tool for network administrators, network monitors and protocol analyzers can be tremendously useful to the incident response team. Such tools should be standard issue for every incident response team. Their primary uses for incident response are digging deep into the network datagrams for low-level attack analysis and storing network data to disk for further analysis. In short, these tools are like microscopes that reveal the inner contents of the data you're dealing with on an incident.
The one common and vital requirement for the network monitors, protocol analyzers, and the network-based intrusion detection systems discussed later is that they must perform their tasks silently and invisibly. We frequently refer to this as blackening the network monitor. A properly-configured network monitor should be completely invisible on a target network. In other words, an intruder should never be able to discover that he is being monitored. There are numerous ways of accomplishing the blackening of the tools, some mechanical and some logical, but any network monitoring device used for incident response should almost certainly be blackened, and the blackening should be thoroughly tested prior to actually using the tool. While there may be exceptions to this rule, they are few and far between.
For years, the Sniffer product line from Network General (now Network Associates) has been one of the most trusted tools in the network administrator's toolbox. Originally designed and marketed as a pure network protocol analyzer, it has extensive utility to the incident response practitioner.
The Sniffer, which runs on PC architecture systems, has a no-nonsense user interface that's intended to help the network administrator quickly diagnose potential network problems. It's fair to think of this as a "hex-format disk editor" equivalent for network traffic. You get all of the raw data, with very little interpretation. Still, we've found it to be indispensable on numerous incidents. Some of the specific things that we've found to be useful include the following:
- Event triggers
- You can configure the Sniffer, for example, to only capture data to and from a particular pair of hosts on a network. This helps reduce the overall volume of data collected, so that you can zoom in on the specific data that you need to see.
- Datagram visualization
- In looking at the captured network data, the Sniffer clearly depicts each layer of the network protocol stack in the OSI network layering model. Thus, you can see the hardware level Ethernet headers, MAC addresses, the IP headers, the TCP headers, etc., in addition to the actual data sent through the network. Even if the intruder that you are investigating is using encryption, it's still often useful to see the network data directly.
- Close examination of network header data such as TCP flag registers
- You can quickly deduce if your attacker is using a tool that customizes a forged packet. For example, if you see that the TCP flags are not set in accordance with the standard TCP protocol, you've got an attacker with a pretty advanced knowledge of networking. This is a technique that's often used to try to fool network devices like routers and firewalls and also to cause denial of service situations. There's just no substitute for seeing the raw network data (complex as it may be) in its entirety in these cases.
- Save to disk
- The Sniffer can take all or some of the data in its buffer and store it to disk. The data format used is unique to the Sniffer, but there are now several tools (in addition to the Sniffer itself) that can read that data format for other purposes, such as analyzing the captured data.
- Reload from disk
- Saving the data to disk without the ability to retrieve it is no good. Naturally, the Sniffer does a fine job at reloading data from disk. This comes in handy during an incident happening far away from the incident response team itself. If the staff at the remote facility can collect pertinent data using a Sniffer and then securely send the Sniffer data back to the incident response team, they can then see the same data that the remote site is seeing and might be able to substantively participate in the analysis and diagnosis processes.
Although all of these things are fabulously useful and essential, there are some implied requirements. For starters, having the ability to see the network data at such a low level is only useful if the analyst is thoroughly trained on the network protocols captured. We've found that it's not just useful to be textbook-smart, but that having operational network administration experience is a major bonus. There is just no substitute for experience in analyzing this type of data.
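To make the earlier point about examining TCP flags concrete, the following Python sketch checks a packet's flag byte for a few combinations that a normal TCP implementation would not send. The bit values are the standard TCP header flag bits; the list of checks is a small illustrative sample of our own choosing, not an exhaustive catalog, and the function name is hypothetical.

```python
# Standard TCP header flag bits.
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20


def suspicious_flag_reasons(flags):
    """Return a list of reasons why this flag combination looks forged.
    An empty list means none of the illustrative checks fired."""
    flags &= 0x3F  # consider only the six classic flag bits
    reasons = []
    if flags == 0:
        reasons.append("no flags set (null packet)")
    if flags & SYN and flags & FIN:
        reasons.append("SYN and FIN set together")
    if flags & SYN and flags & RST:
        reasons.append("SYN and RST set together")
    if flags & FIN and flags & PSH and flags & URG and not flags & ACK:
        reasons.append("FIN, PSH, and URG set without ACK")
    return reasons
```

For example, `suspicious_flag_reasons(SYN | FIN)` reports the SYN/FIN combination, while an ordinary `SYN | ACK` handshake reply produces an empty list.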
Ethereal (pronounced ethee-real) is a feature-rich protocol analyzer that runs on a variety of Unix and Unix-like operating systems including Linux, BSD, Solaris, AIX, HPUX, and Irix. It is open source software released under the GNU General Public License. Further, as of this writing, it is still officially considered beta software. See Figure 7-1 for sample Ethereal output.
Figure 7-1. Example Ethereal output
That's good news as well as bad news. First, it is freely available, so the cost of acquiring it isn't a factor for those cash-strapped teams out there. It runs on Unix and Linux, meaning it is well suited for analyzing large amounts of incident data. Also, several of the Unix (and similar) platforms supported are freely available, which further reduces the cost of an analysis system. Due at least in part to being open source, Ethereal's list of features reads like a "Wouldn't it be nice to have this?" list, since it is constantly being updated and upgraded. In fact, among the project's original stated design goals is to "create a commercial-quality analyzer for Unix and to give Ethereal features that are missing from closed-source sniffers." The fact that it is still officially beta software, however, could diminish its usefulness in a courtroom situation.
Some of the more useful features from our standpoint include:
- Data support
- Ethereal can read a huge number of different datafile formats including Sniffer, TCPdump, Snoop, Cisco's IDS, and several others. The input data can be GZIP-compressed, and Ethereal transparently uncompresses it. Additionally, it can write its output in plain ASCII text as well as PostScript format suitable for printing.
- Interface support
- Similarly, Ethereal supports a large number of physical network media via the host operating system's interface drivers.
- GUI and text mode
- Ethereal can run either as a graphical user interface (GUI) or as a text mode interface. Having the text mode interface as a fallback can be tremendously useful, especially in running multiple Ethereal probes in different locations, as it can be very easy to connect to each probe remotely. Care should be taken not to transmit incident response data over a network that is under investigation. Remember, these tools need to be stealthy!
- TCP stream reconstruction
- This is one of the features that sets Ethereal apart from many protocol analyzers. Ethereal can reconstruct a TCP data stream such that all of the packets are in logical order, as opposed to the order that they were sent and received through the network itself. This feature can save the analyst a great deal of time, particularly since she is dealing with raw data that can become overwhelming if not easily ordered in sequence.
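The idea behind TCP stream reconstruction can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how Ethereal itself is implemented: segments are given as hypothetical `(sequence_number, payload)` pairs for one direction of one connection, the starting sequence number is known, and 32-bit sequence wraparound is ignored.

```python
def reassemble_stream(segments, initial_seq):
    """Return the contiguous payload bytes starting at initial_seq.

    Segments may arrive out of order, duplicated, or overlapping
    (retransmissions). Reassembly stops at the first gap, because
    bytes after a missing segment cannot be placed reliably.
    """
    stream = bytearray()
    next_seq = initial_seq
    for seq, payload in sorted(segments, key=lambda segment: segment[0]):
        if seq > next_seq:
            break  # a segment is missing; stop at the gap
        overlap = next_seq - seq  # bytes of this segment already seen
        if overlap < len(payload):
            stream += payload[overlap:]
            next_seq = seq + len(payload)
    return bytes(stream)
```

Given the segments `[(1005, b"world"), (1000, b"hello"), (1000, b"hello")]` and an initial sequence number of 1000, the function puts the data back in logical order and drops the retransmitted copy.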
TCPdump is to network traffic capturing as assembly language is to programming. Its biggest strengths are its simplicity and flexibility, but that also makes it difficult for many people to use. It is truly a low-level network tool. However, it comes with or is available for just about every Unix variant in existence, and its utility is tremendous. It can capture just about any network packet on almost any network medium. Figure 7-2 shows an example of TCPdump output.
Figure 7-2. Example TCPdump output
TCPdump was originally a network administration tool. When used in an incident response mode, the incident response practitioner will see that TCPdump completely lacks a postprocessing analysis support function, something not required in the original design for the tool's purpose in network operations. Thus, it is best suited as a raw data capture tool to be used in conjunction with something else for analyzing the data. In our experience, though, when working on an incident in the wee hours of the night, there's nothing as powerful as TCPdump and some cleverly written Perl scripts. Sometimes there's just no substitute for having that maximum level of flexibility. Plus, being such a simple tool, it is very easy to quickly set up a monitoring device with TCPdump and then come back and analyze the data later. Perhaps that's not as elegant a solution as many others, but it is quick and reliable.
The principal downside is that it requires a well-trained Unix and networking engineer to fully harness the capabilities.
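The kind of quick postprocessing script we have in mind can be sketched as follows. We use Python here rather than Perl purely for illustration. The sketch assumes TCPdump's text output with lines of the form `timestamp IP src.port > dst.port: ...`; the exact format varies between TCPdump versions and options, so treat the pattern as an assumption to adjust against your own captures. Lines without a port suffix (ICMP, for example) are not handled.

```python
import re
from collections import Counter

# Assumed line shape: "03:12:44.101532 IP 10.0.0.5.51234 > 10.0.0.9.23: Flags [S], ..."
LINE_PATTERN = re.compile(r"^\S+ IP (?P<src>\S+) > (?P<dst>\S+): ")


def split_endpoint(endpoint):
    """Split 'host.port' at the last dot into (host, port)."""
    host, _, port = endpoint.rpartition(".")
    return host, port


def count_conversations(lines):
    """Count packets per (source host, destination host, destination port)."""
    counts = Counter()
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match is None:
            continue  # skip anything that is not a recognizable packet line
        src_host, _ = split_endpoint(match.group("src"))
        dst_host, dst_port = split_endpoint(match.group("dst"))
        counts[(src_host, dst_host, dst_port)] += 1
    return counts
```

Calling `count_conversations(open("capture.txt"))` on a saved text dump (the filename is hypothetical) and then `most_common(10)` on the result gives a fast first look at which hosts were talking to which services.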
The difficulties and shortcomings of TCPdump were also noticed by Steve Romig of the Ohio State University. Steve wrote a program called Review that is specifically a postprocessing analysis tool for sifting through gigaheaps of TCPdump data (that's a lot of data!). As with TCPdump, Review is freely available. (The full sources and documentation for Review and its associated log gathering tools are available upon request. Requests should be made by mail to email@example.com. The current version is a collection of programs written in C and Perl. It requires TCPdump, Perl 5.003 or higher, and the Perl/Tk module. It is known to run under SunOS 4.1.4, BSDI 2.1, FreeBSD, and Linux using X11R6. It should be readily portable to other Unix platforms.) For further information, see Steve's white paper at http://www.net.ohio-state.edu/security/talks/1997-06_review_first/paper/paper.html.
The primary feature of Review, for the purpose of incident response, is its session visualization and replay. It can play back text sessions (e.g., telnet, rlogin) as well as X Windows graphic sessions. This, of course, can greatly ease the analysis process, especially when trying to work through huge amounts of captured network data.
Snort is neither just a protocol analyzer nor an intrusion detection system (IDS). It is a little of both, and can be very useful in incident response operations. Many of its features are similar to the TCPdump/Review combination mentioned above, but Snort has enough differences to discuss on its own. Like Ethereal, Snort is freely available in source code form under the GNU General Public License, for most Unix and Linux variants and distributions. However, unlike Ethereal, Snort is not a beta release. At the time of this writing, Snort is up to Version 1.7. What's more, Snort has an active community of users that freely exchange ideas and rulesets. For further information, see http://www.snort.org.
Where Snort's features really begin to come in handy (in addition to being able to do the basic network session capture and analysis functions) is in alerting the operator of certain events. For example, Snort can be configured to watch a network for a particular type of attack profile and then page the incident response team members when the attack takes place. Further, you can define, at least to a degree, what events to look for and to alert on. These features are what make Snort a decent lightweight network intrusion detection system, and useful to an incident response team. Figure 7-3 shows the end of a Snort network capturing session.
Figure 7-3. Example Snort output
From Sandstorm Enterprises, Inc., TCP.Demux is another useful postprocessing tool, but unlike the others that we've described, it only runs on Microsoft Windows platforms. It can read data inputs from Sniffer, LANwatch, Net X-Ray, and Sun's snoop programs. Its primary use to incident response is its ability to pull an entire TCP session together sequentially from a sea of data collected by one of its supported network protocol analyzers. That being said, its primary shortcoming for incident response work is that it does not have a session playback function. Still, if you are using a Windows system for analysis, it is worth looking into. For further information, see http://www.sandstorm.com.
One of the more recent, but powerful, entries into the network protocol analyzer market is NetDetector by Niksun. Like the Sniffer product line, it is a full-featured network protocol analyzer. Unlike the Sniffer, though, it specializes in WAN connections and has an extremely capable session visualization capability. In fact, in the area of session playback and visualization, NetDetector is pretty much without peer. It comes at a premium, though, as NetDetector is not cheap!
NetDetector's session visualization features are truly impressive. They include the ability to visually reconstruct web browser sessions and the ability to quickly extract email file attachments from network data streams. All of these things can be done using customized scripts and programs from lower-level network analyzers, but NetDetector packages them in a very easy to use system.
As with other network analysis products from Niksun, NetDetector supports a wide range of physical network media on both the LAN and WAN side. The list of supported network media includes 10/100/1000 BaseT Ethernet, FDDI, T1, T3, and OC3.
Finally, NetDetector provides the ability to alert you (via SNMP) of detected network activity that indicates likely intrusion activity. In doing that, it most certainly has some intrusion detection capabilities, but its intrusion detection features are not its strongest feature. Where NetDetector really excels is in its network data capture and analysis features. If you are looking for an uncompromising, cost-is-no-object monitoring and playback solution, NetDetector should be at the top of your list.
The final entry in the network protocol analyzer area that we're going to cover is Net4 from Network Forensics, Inc. Net4 shares a lot of features in common with several of the other commercial products we've already discussed; it is used to capture and analyze network traffic. It was designed from the ground up to be useful for network security forensics, so the security features are not just an add-on or afterthought. Net4 contains two functional modules: a collector and an analyzer. Like NetDetector, it supports a wide range of physical network media on both the LAN and WAN sides.
Network-Based Intrusion Detection Systems
The next category of network-based tools that we're going to look at is network-based intrusion detection systems (IDS). However, we are specifically going to discuss IDS as it relates to incident response. At the point the IDS enters the picture, the incident has already been detected; what can the IDS do to help us out? For the record, IDS are tools that serve as perimeter tripwires or sentries, alerting network staff of suspicious network activity that may indicate an attack is in progress.
When it comes to incident response per se, there's a pretty big overlap between what we look for in the sort of network tools that we've just described and what most IDS products deliver. Basically, we want to monitor, alert, analyze, diagnose, and collect information (whether or not it may be needed as evidence). Clearly, it's the monitoring and alerting where most of the current IDS products excel.
What we most often use IDS products for in incident response operations is monitoring and alerting for particular trigger events. Note that we said trigger events and not just known attacks. Those trigger events should include known attack profile signatures, but they often far extend beyond that. Watching for known attacks helps, among other things, alert you if the intruder that you're watching makes use of known and documented attack tools. Also, it helps detect other systems on the network that may also be involved in the incident.
The most useful function of an IDS is configuring it to look for one or more specific events. For example, you might want to be notified if the intruder is accessing a specific file, system, or other resource. You might want to be notified when a network session enters the network from a specific origin. There are all sorts of specific events that you could use an IDS to automatically watch and alert on.
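Conceptually, a custom trigger event is just a small set of conditions evaluated against each observed event. The Python sketch below illustrates that idea only; it does not reflect the rule language of any particular IDS product. The `Trigger` class, its fields, and the event dictionary keys are all hypothetical, and the addresses come from a range reserved for documentation.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Trigger:
    """One analyst-defined trigger: fire when an event comes from
    source_network and (optionally) touches a specific resource."""
    name: str
    source_network: str = "0.0.0.0/0"  # default: any IPv4 origin
    resource: str = ""                 # empty string: any resource

    def matches(self, event):
        network = ipaddress.ip_network(self.source_network)
        if ipaddress.ip_address(event["source"]) not in network:
            return False
        return not self.resource or event.get("resource") == self.resource


def fired_triggers(triggers, event):
    """Return the names of all triggers that match this event."""
    return [trigger.name for trigger in triggers if trigger.matches(event)]
```

A trigger such as `Trigger("intruder origin", source_network="203.0.113.0/24")` would fire on any session from that network, while `Trigger("payroll file", resource="/data/payroll.db")` would fire on any access to that (made-up) file regardless of origin.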
So, if you're going to use the IDS to support incident response operations, be sure to look for the ability and the ease of adding your own trigger events to the list of what the IDS is looking for. Also look for an IDS with robust and mature out-of-band alerting mechanisms. We discuss some of those in more detail later, but one of the things that we've found to be most useful in IDS is the ability to send diagnostic information to one or more alphanumeric pagers. Note that many IDS and network monitors provide pager support by sending email to the pagers' email addresses; make sure that the IDS does not use the network that is being monitored to send alerts to you. That tips off the attacker and is most certainly not stealthy!
Most IDS also provide some level of session playback visualization support. However, it has been our experience that none of the products currently on the market are as good at this as some of the network protocol analyzers that we described previously, such as Niksun's NetDetector. In their defense, the IDS vendors will tell you that session playback is not and never was a primary design criterion; their products were designed to detect and alert the operator to incidents.
Dragon, from Network Security Wizards (now part of Cabletron Systems), is an industrial strength distributed intrusion detection system. Apart from being an excellent IDS, one of Dragon's biggest strengths as an incident response tool is that it has a very easy to use language for adding customized attack signature definitions. Combine that with its ability to monitor multiple Dragon sensors across an entire business enterprise on one browser-based security console, and you have an extremely powerful and flexible tool for assisting in incident response operations. Figure 7-4 shows example Dragon console reports.
Figure 7-4. Example Dragon console reports
Dragon does support pager alerting as well as a relatively simple session playback mechanism. That's not to say that the playback mechanism is not useful, but it doesn't have the rich feature set of something like NetDetector. Figure 7-5 shows Dragon trigger output and Figure 7-6 is an example of its command-line interface. For further information on Dragon, see the vendor's web page at http://www.securitywizards.com.
Figure 7-5. Example Dragon trigger output
Figure 7-6. Example of command-line interface to Dragon
Network Flight Recorder
Network Flight Recorder (NFR) can be described as an IDS compiler because it includes a powerful network monitoring tool and a relatively high-level programming language called nCode to control it. nCode gives you all of the essential building blocks to build and customize pretty much anything that you would possibly want to do within an IDS engine. Numerous nCode examples are provided that illustrate how to do many basic (and not so basic) tasks. Figure 7-7 shows an example NFR alert screen and Figure 7-8 is an example queries screen.
Figure 7-7. Example NFR alert screen
Figure 7-8. Example of NFR queries screen
While nCode is an exceptionally powerful programming language, be aware of the learning curve involved in becoming expert at using NFR. It's not for the novice network administrator! For use in actual incident response operations, it's not sufficient to be merely conversant or even proficient in programming in nCode; you really should be expert in it and have experience at writing complex tasks.
One other note to remember, along the same lines, is that any and all nCode programs should be available in source code format in order for collected information to be usable in a court of law. All processes, procedures, and tools used in an investigative capacity should be stored and sealed away. Thus, when using nCode to collect prospective evidence, it is an exceptionally good idea to write the programs in very clear style and make heavy use of documentation--you might need to describe the code to a jury one day, and most juries are not incident response professionals with in-depth computer security vocabularies!
Despite these caveats, for the ultimate in flexibility and capability, it's tough to beat NFR. For more information, see the vendor's web page at http://www.nfr.com.
Without a doubt, the current market leader in the IDS field is RealSecure by Internet Security Systems (ISS). RealSecure is another network-level IDS that allows multiple sensors to be monitored and managed from a central security console. A major strength of RealSecure is that it is a mature and robust commercial product. Its ability to detect a large number of attack profiles has been thoroughly demonstrated; however, its session playback ability is quite limited. It's a great tool to have around during an incident, though. One of the things that we've used it for on a regular basis is to watch over our shoulders for other attacks during an incident. It's quite possible that you are looking at one part of a bigger incident. Running RealSecure helps detect if there are other attacks taking place that had not been noticed previously.
Additionally, RealSecure's ability to run in a distributed mode can help pinpoint the actual network source of an attack. This is particularly important if an attacker is forging their source address. At that point, the only realistic option is often to simultaneously monitor multiple network segments and observe the actual source packet at the time that the attack is launched. Naturally, this is only possible if the attack is started on a network where RealSecure is deployed. Figure 7-9 shows RealSecure in action.
Figure 7-9. Example ISS RealSecure console
Network Vulnerability Scanners
Network-based vulnerability scanners are very different from the network-level tools described in the previous section. The network-level tools are used primarily to passively observe and analyze network activity, good or bad, whereas network vulnerability scanners actively send packets out over a network in search of vulnerabilities, malicious code, etc., on other hosts on the network. It's the real-world equivalent of the weatherman sticking his head outside the window to see if it's raining compared to doing research to figure out why it is raining and how much rain should be expected. As such, we have placed them in an entirely different category. Their applicability to incident response operations includes these tasks:
- Detection of back doors and malicious code throughout a network
- Often while handling an incident, one major concern is that an intruder may have placed unauthorized software on one or more of the computers under investigation. Such so-called back doors on systems allow an attacker to access the compromised systems with ease (and low chance of detection) again later. One of the best and quickest ways of detecting whether this has been done is to perform a network-based sweep of the potentially affected systems, specifically looking for common back door software. Although by no means foolproof, it is a useful process while handling an incident and can often alert you to things that you might otherwise overlook.
- Detection of enabling vulnerabilities
- Systems often contain vulnerabilities that could allow an intruder to reenter, even though no malicious act put them there. Think of this as an inadvertent or self-inflicted back door. It's even quite possible that this entry point was the method that the intruder used to access the system in the first place. Since it may not involve modifying anything on a target system, it gives an intruder the highest likelihood of not being detected on a system. Be sure to include a network vulnerability scan in your examination of the potentially affected systems. The scan should include a thorough check of all known vulnerabilities associated with that network system, and that list of vulnerabilities should be as up to date as possible. More often than not, this means being at the complete mercy of the scanner vendor, so having multiple scanners and updating mechanisms can be essential in preventing oversights when one vendor's product misses something that another vendor detects. Select your scanners very carefully and be sure to evaluate your prospective vendors on availability of timely updates.
Although not the first network vulnerability scanner, SATAN (the Security Administrator Tool for Analyzing Networks) certainly made the biggest splash when it was released by Dan Farmer and Wietse Venema back in 1995. SATAN has a graphic interface via a web browser and is useful for detecting common network vulnerabilities. As a follow-on to SATAN, World Wide Digital Security wrote the SAINT (Security Administrator's Integrated Network Tool) program. SATAN is available from Dan Farmer's web site, http://www.fish.com, and SAINT is available from World Wide Digital Security, Inc., at http://www.wwdsi.com.
Although groundbreaking, SATAN has undergone few changes since its initial release. It is quite limited in the number of (now) common vulnerabilities that it detects. That's not to say that it's no longer useful, but there are other alternatives, such as SAINT. SAINT is an updated, maintained, and freely available version of SATAN. It should be noted that WWDSI does an aggressive job at keeping SAINT up to date with current network vulnerability data.
nmap, the Network Mapper, is both free software and a superb low-level network port scanner. Although nmap is not a vulnerability scanner per se, the value of its output cannot be overstated. This is a must-have in any incident response team's bag of tricks, and can usually spot things that commercial products might overlook. See www.insecure.org/nmap for more information from the author, Fyodor.
More on nmap
Fyodor describes nmap as follows:
"nmap is a utility for port scanning large networks, although it works fine for single hosts. The guiding philosophy for the creation of nmap was TMTOWTDI (There's More Than One Way To Do It). This is the Perl slogan, but it is equally applicable to scanners. Sometimes you need speed, other times you may need stealth. In some cases, bypassing firewalls may be required. Not to mention the fact that you may want to scan different protocols (UDP, TCP, ICMP, etc.). You just can't do all this with one scanning mode. And you don't want to have 10 different scanners around, all with different interfaces and capabilities. Thus I incorporated virtually every scanning technique I know into nmap. Specifically, nmap supports:
- Vanilla TCP connect() scanning
- TCP SYN (half open) scanning
- TCP FIN, Xmas, or NULL (stealth) scanning
- TCP FTP proxy (bounce attack) scanning
- SYN/FIN scanning using IP fragments (bypasses some packet filters)
- TCP ACK and Window scanning
- UDP raw ICMP port unreachable scanning
- ICMP scanning (ping-sweep)
- TCP ping scanning
- Direct (nonportmapper) RPC scanning
- Remote OS Identification by TCP/IP Fingerprinting
- Reverse-ident scanning
nmap also supports a number of performance and reliability features such as dynamic delay time calculations, packet timeout and retransmission, parallel port scanning, detection of down hosts via parallel pings. nmap also offers flexible target and port specification, decoy scanning, determination of TCP sequence predictability characteristics, and output to machine parseable or human readable log files."
His description is accurate. nmap's usefulness during incident response operations is enormous. It can be used to rapidly scan a series of hosts for possible back door or malicious network-level software, for example, and is probably the fastest network scanner that you'll find anywhere. What's more, the individual scans can be chosen and tailored to a great degree by means of its command-line parameters. Admittedly, it takes some time to learn all of the command-line options, but it is time well spent. Further, nmap comes with a rather complete set of documentation by way of a Unix standard manpage.
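The simplest technique in Fyodor's list, the vanilla TCP connect() scan, can be sketched in a few lines of Python. This is not how nmap is implemented and it has none of nmap's speed or stealth; the function name connect_scan and its timeout default are our own choices for illustration, and the sketch assumes an IPv4 target.

```python
import socket


def connect_scan(host, ports, timeout=0.5):
    """Return the ports on host that accept a full TCP connection.

    A completed three-way handshake means something is listening;
    a refusal or a timeout is treated as "not open".
    """
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an error.
            if s.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports
```

For example, connect_scan("127.0.0.1", range(1, 1025)) lists the listening TCP ports below 1025 on the local machine. Because every probe completes a full connection, this kind of scan is easy for the target to log, which is one reason the half-open and stealth modes exist.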
Graphic frontends to nmap do exist and are readily available. Several pointers to these frontends are available on the previously mentioned web page. Frankly, for the purpose of incident response support, we've found the command-line interface to be the most flexible and powerful, once learned. Following are three screenshot examples of nmap's output. Figure 7-10 is in an X command-line session on a KDE 2 Linux desktop, and Figure 7-11 and Figure 7-12 are examples of two of the different GUI frontends to nmap.
Figure 7-10. Example of textual output from nmap
Figure 7-11. Example nmapFE output
Figure 7-12. Example KMAP output
The next three tools, starting here with Network Associates' Cybercop Scanner, are commercial network vulnerability scanners. Although they are all conceptually similar tools, each one has its own strengths and limitations.
While helping a client prepare for the worst during some of the recent distributed denial of service attacks, we had cause to run the Cybercop Scanner. The reason that we ran it at that time was to detect any Distributed Denial of Service (DDOS) agents (commonly referred to as zombies) that might be running on the network. We found the tool to be quick to install and easy to configure for our purposes--and it had all of the rather recent tool signatures that we needed to detect a couple of very specific tools.
Axent's NetRecon tool is the second of the three commercial network vulnerability scanners covered here. Like the others, it is a mature, robust commercial tool that can detect a large number of network-based vulnerabilities. Vulnerability updates are available for download from the company's web site.
ISS Network Scanner
The market leader and one of the first commercial network scanners, ISS's Network Scanner is a powerful and highly capable tool. Like NetRecon, updates to its vulnerability database are available from the company's web site, as well as periodically distributed to customers in the form of ISS-issued updates. These updates are critical to keeping the vulnerability database current so you can ensure that you're scanning for the most recent collection of known vulnerabilities.
Although ISS has come a long way by including the update capability in their products, the one thing that many system administrators fault them on is the fact that the format for their updates is proprietary, making it next to impossible for a system administrator or incident response team to add customized scan instructions so that the scan can be tailored to a local environment's unique needs. This limitation restricts the usefulness of the ISS Network Scanner, but it is still a very useful tool to have in your team's bag of tricks. Figure 7-13 shows ISS's Network Scanner in action, displaying a summary of a scan.
Figure 7-13. Example ISS Network Scanner output
Other Essential Network-Based Tools
As we've said repeatedly, maximum flexibility is one of the keys to success for an IRT. Unless your team supports a relatively small or very well-defined environment in which you explicitly know every type of network medium that you're likely to encounter (and we've seen very few folks that do), chances are you're going to need a few extra tools for those situations when you're hit with the unexpected.
The MicroRACK from Blackbox Network Services is primarily used in incident response situations for network media conversion. However, don't mistake it for a simple in/out network media converter; those are readily available from any number of network supply vendors (and should also be considered for a comprehensive IRT tool kit). The MicroRACK is an industrial-strength media converter box, available in 2-, 4-, 8-, and 16-port backplane configurations. Each of the port slots in the chassis can accept one of the numerous interface modules available from Blackbox. We won't bother to list all of the available interfaces here, but the network medium that you're looking for is almost certainly available from them.
No comprehensive IRT tool kit is complete without a MicroRACK and an assortment of interface cards spanning a wide range of network media. For more information, see the vendor's web site at http://www.blackbox.com.
Century Network Tap
Shomiti makes the Century Network Tap family of tools. They are used, as the name implies, to tap into an existing data network so that a network monitor or IDS can get an accurate data feed from the live network. As these are tools that are designed, from the ground up, to be used just for network monitoring, they offer a hardware-level form of the blackening feature that we discussed. Thus, regardless of the capabilities or blackened state of your network monitor, you can quickly and easily plug into a network without any fear of being noticed by the person being monitored. Figure 7-14 shows an image from one of the Shomiti Century Tap products.
Figure 7-14. Image of one of the Shomiti Century Tap products
The Century Taps are particularly suited for very high speed, full-duplex networks, and can thus be more efficient than plugging a monitor into the uplink port on a data switch. A heavily used 100 or 1000 Mbps data switch can quickly overwhelm all but the most powerful of network monitors. Using a Tap allows you to monitor only the traffic to or from the device that you want to monitor, and it ensures that you will capture the full-duplex network traffic in both directions. Assuming your monitoring device can keep up with the data load, this is an ideal way of monitoring a network and should be strongly considered in connecting either a short-duration network monitor for an incident response operation or an IDS network in general. These are high quality, industrial strength tools. For more information, see the vendor's web site at http://www.shomiti.com.
These tools are the host-level counterparts of the network tools previously described. However, instead of collecting and analyzing network-level events, these tools are designed to work at a host level on an individual computer system. They are used to learn things about specific system information that is useful when handling an incident. The applicability of these tools to incident response includes the following tasks:
- Attack diagnosis
- A common requirement in incident response is to try to diagnose how something could have happened. Frequently, you won't have all of the event logging information that you need in order to definitively say how an intruder was able to compromise a system. At that point, you'll need to do some forensic and diagnostic work. We discuss disk-level forensics separately, but one form of forensics is host-level analysis and diagnosis. Having tools that search for host-level vulnerabilities, back doors, and inadequate patch installation is essential. They speed up the analysis process as well as make it more thorough.
- Detection of malicious code or back doors
- As we've already discussed, intruders often leave malicious software behind on a compromised system for a variety of reasons, not the least of which is to facilitate continued exploitation of a system or network. It's been our experience that one of the most common things for an intruder to do is to try to hide a setuid system shell in a place where it is unlikely to be found by a casual observer. Host-based analysis tools can help detect and analyze those tools--particularly if they are parts of known attack tool kits such as rootkit.
- Detection of unauthorized changes to system or applications
- Another common activity among intruders is to make unauthorized modifications to existing system executables or application programs to facilitate further (or future) compromises or unauthorized activities. This is particularly easy if the source code to a tool is readily available on the system or via the Internet. Adding a few lines of code to an installed program, recompiling it, and reinstalling it is a relatively simple process in many cases. It's even easy to modify the program's file time/date stamp in such a way that it doesn't appear (to the casual observer) to have been changed. Here too, host-based tools can be used to search for changes to software on a system--many systems use a checksum "snapshot" baseline to compare with the current system state, with any predefined variations causing alarms to go off. Admittedly, this is an easier process if the tools are properly installed prior to an incident, but never discount the usefulness of scanning an entire system for potential changes to the system files. We've seen intruders make changes to some pretty unexpected programs.
- Detection of enabling vulnerabilities
- Again, the network analogy holds true here--host-level enabling vulnerabilities could be present on a compromised system. It's quite possible that the intruder--perhaps an insider--used one or more of these vulnerabilities to escalate his level of privilege on the compromised system(s). As such, being forewarned and knowing where the vulnerabilities are is vital to knowing where to start concentrating efforts in fixing them!
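The hidden setuid shell mentioned above is one of the easier things to sweep for on a Unix host. The following Python sketch walks a directory tree and reports regular files with the setuid bit set. It is not taken from any of the tools described in this chapter, and the function name find_setuid is hypothetical.

```python
import os
import stat


def find_setuid(root):
    """Yield (path, owner_uid) for regular files under root with the setuid bit set."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so that symbolic links are not followed.
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_mode & stat.S_ISUID:
                yield path, st.st_uid
```

The output is only useful when compared against a list of the setuid programs that are supposed to be there. A setuid file owned by uid 0 in a temporary or user-writable directory deserves immediate attention. Keep in mind that on a system with a kernel-level rootkit installed, the results of any scan run on the live host may not be trustworthy.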
Like its network-based counterpart, SATAN, the Computer Oracle and Password System (COPS) was written by Dan Farmer and is somewhat dated. But also like SATAN, it is still a useful tool that should be included in your kit. COPS truly blazed the trail that others have followed. It examines a Unix host for a ton of vulnerabilities and then reports. It can quickly find any number of things that intruders frequently do on Unix systems to hide their presence and/or ensure the ability to log back into the compromised system whenever they want.
Partly because of its age, COPS can run on a wide range of Unix and Unix-like operating systems. That is probably one of the most important reasons for keeping a copy of it in your toolkit--for those times when you are forced to examine an uncommon version of Unix for which few, if any, commercial host scanners exist. Further, since it is free, it can quickly be installed, run, and removed from a system, and even modified by the user if necessary by fiddling with the source code. Also, unlike COPS, many of the commercial tools, while highly capable and powerful, are more invasive to install--some even require the system to be rebooted before running, thereby making them untenable in many production computing environments, and certainly less stealthy!
Similar to COPS, and more recently updated, the Tiger toolkit from Texas A&M University is a host-based vulnerability scanner. It also has a somewhat limited mechanism for detecting modifications to vendor-supplied system executable files: it compares their MD5 checksums against a library of checksums distributed with Tiger. Although not foolproof by any means, and not as up to date as most commercial tools, this capability is a useful one from time to time while handling an incident.
Its primary utility to the incident response practitioner, though, is its normal host vulnerability scanning. We've used Tiger in dozens of incident operations to look for indicators of an intruder's presence on a system. Although all of these things can be done manually, they are much quicker to run using an automated tool like Tiger. Also freely available, Tiger should be in every incident response team's toolkit.
ISS System Scanner
Another tool in ISS's formidable suite is their System Scanner. Similar in concept to COPS and others, it scans a single host for a known set of vulnerabilities. One of its advantages, however, is that it can be deployed across a network, with central reporting to one security console. That's a useful feature for the enterprise user, but not necessarily all that useful to the incident response team--especially if the tool is being installed and run after an incident has already occurred. Nonetheless, it is a very useful and capable tool.
Figure 7-15 shows the System Scanner displaying the results of a recent scan.
Figure 7-15. Example of ISS System Scanner output
If you need to run a host-based vulnerability scan within a Windows NT domain, then Bindview's BV-Control bears serious consideration. BV-Control provides in-depth security and administrative analysis ensuring the integrity, security, and performance of mission critical systems enterprise-wide. BV-Control's cross-platform security and configuration management solutions detect vulnerabilities within the network and alert security and system administrators to any critical issues so that action can be taken to correct them--before users experience system downtime or performance issues.
CMDS by ODS Networks is the first of two host-based IDS products that we're going to describe. It has been available for several years now, and can under the right circumstances be useful to incident response operations. CMDS relies heavily on operating system-supplied event logs, analyzing them and pointing out anomalies.
Emerald is a host-based IDS from SRI and a relative newcomer to the IDS market. According to SRI's web site, at http://www.sdl.sri.com/projects/emerald/emerald-niss97.html:
The EMERALD (Event Monitoring Enabling Responses to Anomalous Live Disturbances) environment is a distributed scalable tool suite for tracking malicious activity through and across large networks. EMERALD introduces a highly distributed, building-block approach to network surveillance, attack isolation, and automated response. It combines models from research in distributed high-volume event correlation methodologies with over a decade of intrusion detection research and engineering experience. The approach is novel in its use of highly distributed, independently tunable surveillance and response monitors that are deployable polymorphically at various abstract layers in a large network. These monitors contribute to a streamlined event-analysis system that combines signature analysis with probabilistic inference to provide localized realtime protection of the most widely used network services on the Internet. The EMERALD project represents a comprehensive attempt to develop an architecture that inherits well-developed analytical techniques for detecting intrusions, and casts them in a framework that is highly reusable, interoperable, and scalable in large network infrastructures.
The EMERALD architecture is composed of a collection of interoperable analysis and response units called monitors, which provide localized protection of key assets throughout an enterprise network. EMERALD monitors are computationally independent, providing a degree of parallelism in their analysis coverage, while also helping to distribute computational load and space utilization. By deploying monitors locally to the analysis targets, EMERALD helps to reduce possible analysis and response delays that may arise from the spatially distributed topology of the network.
In addition, EMERALD introduces a hierarchically composable analysis scheme, whereby local analyses are shared and correlated at higher layers of abstraction.
Initially developed as a research project at Purdue University, Tripwire is probably the most highly respected tool for detecting unauthorized alterations to operating system and critical application software. Tripwire is now available in the form of a freely available tool from Purdue as well as a commercially marketed and supported tool from Tripwire, Inc.
Tripwire creates a known-state database of cryptographic checksums of all of your operating system and application software, and then periodically compares that known-state against new tests. Two things are vital to note: the initial known-state must be a secure one, and the database and hash function must be unchanged every time the test is redone. If either of these is not the case, the results are likely to be tainted.
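The known-state comparison can be illustrated with a short Python sketch. This is only the general idea of a checksum baseline, not Tripwire's database format or policy language; the function names and the choice of SHA-256 are our own assumptions for the example.

```python
import hashlib
import os


def file_digest(path, chunk=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()


def snapshot(root):
    """Map each regular file under root (by relative path) to its digest."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                result[os.path.relpath(path, root)] = file_digest(path)
    return result


def compare(baseline, current):
    """Report files added, removed, or changed relative to the baseline."""
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "changed": sorted(
            p for p in baseline.keys() & current.keys()
            if baseline[p] != current[p]
        ),
    }
```

Take snapshot() once while the system is known to be clean, store the result somewhere the monitored system cannot write to, and later run compare(saved, snapshot(root)). The two caveats in the text apply directly: a baseline taken after the compromise, or one the intruder can modify, tells you nothing.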
With those things in mind, however, Tripwire is a tremendously useful tool for monitoring any change from the baseline configuration of a system. Unlike most of the IDS tools on the market, Tripwire can detect any change due to a known attack or an unknown one. That's one of the things that can make it particularly useful during an incident response operation.
When faced with an incident situation in which recurring changes continue to happen on a system, Tripwire can alert you to clues and help in figuring out what is taking place. Additionally, the freely available version of Tripwire has started to make its way into many of the Linux distributions, so it should be even easier to establish the initial known trusted state of each system.
While reviewing a network some time back, we found a Tripwire configuration that worked particularly well. In the configuration shown in Figure 7-16, none of the machines being monitored with Tripwire were in fact running it. Instead, they all had a backend private subnet on which one system used NFS to mount the filesystems on all of the machines being monitored, and it ran Tripwire on these network-mounted filesystems. Anomalies were then reported immediately to the incident response team. The beauty of this configuration is that an attacker could not see that Tripwire was being used. The actual machine running it only mounted the filesystems, but ran no other network services. Thus, the attacker would have a difficult time connecting to the system running Tripwire. This configuration can be equally effective before and during an incident response operation.
Figure 7-16. Tripwire configuration
Symantec's Norton Utilities is one of the old standby tools for Windows-based system administrators. It includes a wide range of administrative tools designed to make your job easier. Many of these are applicable to incident response. In particular, the disk editor is a highly valuable and useful tool for examining disk drives at a very low level. It is a good general-purpose disk forensics tool to keep around for PC-based forensics applications.
The Coroner's Toolkit
The most recent freely available tool from Dan Farmer and Wietse Venema is The Coroner's Toolkit (TCT). TCT is arguably one of the few serious forensics tools available for Unix and Linux platforms. Among other things, it includes a tool for doing the low-level forensics examination of Unix/Linux disk partitions, currently supporting both the Berkeley FFS and Linux ext2 filesystems. This capability in and of itself would establish TCT as without peer in the Unix forensics area. TCT has other worthwhile features, though.
One such feature is TCT's ability to examine a Unix system after an incident has occurred and help the analyst deduce what might have happened on the system. It does this by examining the timestamps on all of the files on the system. On a Unix system, file timestamps change when a file is created, modified, or accessed. By looking at all of the file access times as soon after the incident as possible, the analyst is sometimes able to deduce what took place on a system.
For example, if an intruder broke into the system, downloaded a toolkit, such as rootkit, compiled it, and installed it, and then deleted the rootkit distribution files, you would likely find residue of this activity by looking at the file access times. Particularly telling would be the access times on various C header files, as well as libraries, etc. An experienced Unix technologist could look at this information and know that something untoward took place on the system, even when the programs in rootkit itself are causing the system to report erroneous diagnostic information.
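The access-time analysis described above amounts to collecting a timestamp for every file and sorting. Here is a minimal Python sketch of that idea. It is not TCT's own code, and the function names access_timeline and format_timeline are hypothetical.

```python
import os
import time


def access_timeline(root):
    """Return (atime, path) pairs for files under root, oldest access first."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat reads the inode only; it does not update the file's atime.
                st = os.lstat(path)
            except OSError:
                continue
            entries.append((st.st_atime, path))
    entries.sort()
    return entries


def format_timeline(entries):
    """Render each entry as 'YYYY-MM-DD HH:MM:SS  path' in local time."""
    return [
        time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(t)) + "  " + p
        for t, p in entries
    ]
```

A burst of accesses to compiler binaries, C header files, and libraries within the same few minutes, on a machine where nobody should have been building software, is exactly the sort of residue the text describes. Run this kind of sweep against a read-only mount or a disk image where possible, since normal use of the system overwrites access times quickly.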
The downside is that any information deduced from this type of analysis is highly unlikely to be usable in a courtroom. However, the ability to examine a compromised Unix system after the fact can be worth the effort, especially if the system had no event logging turned on or the event logs themselves are considered suspect.
During an incident response operation, the need for various different types of communication is vital. This can range from interpersonal communication via cell phones all the way to communications among the different computer systems involved in the operation. We're going to focus on the latter for this discussion.
One of the most important requirements during an operation is robust, around-the-clock data and personal communications, including notification of critical events as they occur.
An easy trap to fall into is to use the network under investigation to send data, page alerts, etc. However, this can be a critical error, as it violates one of our primary rules of incident response: never put any data on a network that is under investigation--conduct all operations in stealth. A capable adversary will quickly see that you are using the network for data communications, and then the entire operation is compromised. Even if you encrypt the data, the intruder may be able to make useful deductions about whether and how he is being monitored if he starts seeing large amounts of traffic while on your network.
During incident response operations, we frequently need to leave a network monitor or IDS running without a human overseeing it, lying in wait for a specific network or host event; humans, after all, need to eat, sleep, recharge, and take breaks. However, when the particular event that you're looking for occurs and you're not present at the computer, you will want to be notified about it as quickly as possible. Probably the most efficient way of accomplishing this is to use a tool with a page-back capability. Getting this to work depends a lot on the network/host monitoring tools you are using and the operating system those tools are running on, because many of the tools we've discussed do not have a page-back capability built into them. Often, you have to create the page-back manually by combining a couple of different tools to get the desired result.
Most of the tools that we've covered have some means for supporting a page-back function, even if they weren't designed to. If you're running on a Unix system, for example, you can use the swatch (Simple Watcher) tool, originally developed at Stanford University, to watch syslog entries and page back when a pattern of interest appears. On a Windows-based system, many of the tools allow you to execute an arbitrary system command upon a trigger event. For those, you can use a readily available tool like cpager to send a page out via a modem.
Those are two of the easiest ways of accomplishing the page-back function, but there are others. Look for packages that allow functions like emailing a result to a system administrator or her pager. A local-area mechanism can be set up to send and receive the email and then run a paging program with the relevant data attached. You'll likely have to glue some things together using various scripts, but it is completely feasible. Remember our recommendation about Perl. It is readily available on most operating systems, and is extremely flexible for just this sort of custom task. One of the best things about pagers today is that they support alphanumeric messages. Being able to get a few lines of text that read "tftp login attempt (host.victim.com) from 10.55.11.234 at 14:40:53" is better than a simple and cryptic numeric "911" message on your display, when you don't know which system is paging you!
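The kind of glue script described above might look something like the following sketch. It is written in Python purely for illustration (Perl would do the job equally well), and everything specific in it is an assumption of ours: the log line format, the regular expression, the 80-character message limit, and the `send` callback that stands in for whatever paging program you actually use.

```python
import re

# Hypothetical pattern for a syslog-style line; adjust to your monitor's format.
PATTERN = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2}) (?P<host>\S+) .*?"
    r"(?P<event>tftp login attempt) from (?P<src>\d+\.\d+\.\d+\.\d+)"
)

MAX_PAGE_LEN = 80  # assumed pager message limit

def page_text(line):
    """Return a short pager message for a matching log line, else None."""
    m = PATTERN.search(line)
    if m is None:
        return None
    msg = "{event} ({host}) from {src} at {time}".format(**m.groupdict())
    return msg[:MAX_PAGE_LEN]

def scan(lines, send):
    """Call send(message) for every line that matches; return the count."""
    count = 0
    for line in lines:
        msg = page_text(line)
        if msg is not None:
            send(msg)
            count += 1
    return count
```

For testing, `scan(open(logfile), print)` simply prints each message. In the field, `send` would hand the text to your paging program over an out-of-band path, never over the network under investigation.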
Dial-In Data Retrieval
Once the pager has alerted you of a particular event, you'll want to retrieve some information from one or more of your network monitors, depending upon the specifics of the incident. Even if you only need to retrieve data from one network monitor, how can you do that if the monitor has been (appropriately) blackened? If you set the monitor up in stealth mode so that the attacker won't see it, how will you be able to reach it yourself?
The answer to that question depends on a lot of things. In an ideal situation, you'll have a private administrative network configured just for your monitors to pass information without violating the cardinal rule of not transmitting any data over the production network. Few situations are ideal, however.
For those occasions, one of the easiest ways of getting the data for analysis is to use a modem and either dial in to it or have it dial back to you and transfer some data.
Dial-in data retrieval will most likely require a modem. Most IT organizations won't have any problems with this; we raise it only because modems aren't trivial in the context of fly-away support for an incident response team. Unlike a stationary modem situation, you'll need to deal with a slew of local configuration issues, much like a laptop-toting road warrior does. Unlike a road warrior, though, you'll need to support dial-in and dial-out, as well as multiple applications accessing the same resource.
Here are just a few of the issues that you should be prepared for:
- Network type
- Most modems use analog phone lines; the only real exception is so-called "cable modems," which we're not going to concern ourselves with for the purposes of this discussion. Many, if not most, corporations nowadays use digital phone switches, and finding an analog connection should not be taken for granted. In most data centers, running an analog phone line just for a short term modem connection can require an insurmountable act of bureaucracy. Additionally, doing so might attract unwanted attention to the operation.
- Analog converters
- Many electronics stores sell devices that plug into a digital phone's handset cord and provide an analog connection for a modem to connect to. While these are usually appropriate for attended office connections, they do not work for unattended purposes like the dial-in data retrieval that we mentioned previously. Be careful to not fall into that trap.
- Be sure to turn off the speaker on the modem, whether internal or external, that you are using.
- Particularly for performing dial-in data retrieval, you want the modem connection that you're using to be completely secure. Many modems require that dial-in users enter a password before they are allowed to continue the connection. Whenever possible, make use of these capabilities. Even though they're an additional operational burden on the staff, it is time well spent.
- Unattended operation
- Especially when multiple programs access the same modem, modems can have a nasty habit of losing their minds and needing to be reset. If the reset must be done in person, that can be an unacceptable situation. Choose and test your modem, along with your software tools, carefully.
- If your organization has a policy prohibiting the use of dial-in (or even dial-out) modems, then it's probably not a good idea to violate that policy. If your IRT serves your own parent organization exclusively, then it'll be easy to verify that policy, but if not, be sure to check the policy each time you need to deploy a dial-in mechanism. Some organizations have stiff penalties for connecting dial-in devices.
If wireless modem infrastructure were globally ubiquitous, we would suggest using it instead of bothering with the hassles of a traditional wired modem. Unfortunately, that's not the case. Wireless networks don't always have coverage where you need them; even if they are available in the town where you want to use them, they might not be available in the data center due to physical or interference issues.
That being said, it's not a bad idea to have one or two wireless communication options available and to use them when you can. They can be an elegant and quick solution to the issue of trying to find analog phone lines.
Some wireless networks to look for include the following:
- PCS cellular
- Most of the digital cellular carriers offer data services. That includes the broad category of Personal Communications Services (PCS) in the United States such as CDMA, TDMA, GSM 1900, etc. Carriers to look for are AT&T, Sprint PCS, Verizon, Nextel, and VoiceStream. If you need to work internationally, look for carriers that have an international roaming capability such as VoiceStream, whose handsets offer data communications in most metropolitan areas around the world via GSM 900, GSM 1800, and the GSM 1900 (available in the United States). Note that current cellular carriers are usually at the high end on price and low end on data speed when compared to some of their competitors. However, they are frequently more readily available than other options.
- CDPD services
- More and more CDPD devices--particularly PC Card modems--are becoming available. These offer a reasonable price/speed performance ratio, and are rapidly gaining in coverage areas. They definitely bear consideration for dial-in capabilities.
- Metricom's Ricochet network has been available for several years in San Francisco, Seattle, and Washington DC, and Metricom has recently upgraded the network to 128 kbps and launched a major nationwide expansion. At the time of this writing, Ricochet is available in approximately 11 major metropolitan areas and should be available in 25 areas soon. (Check the vendor's site at http://www.ricochet.net to verify coverage in areas where you're likely to need it.) Although not the least expensive option, it has two major advantages: its speed and its flat-rate pricing. No other metro-area wireless network can currently touch Ricochet's speed. This could make it an attractive out-of-band option for dial-in data retrieval for at least some incident response teams, particularly those that cannot or must not use the office network.
- Wireless ethernet
- For very limited distance local coverage, an 802.11 wireless ethernet environment could be an acceptable option for transferring data among a series of network monitors and analysis stations. At 11 Mbps, it is much faster than even the next closest speed competitor above. The major restriction, obviously, is that most of the available wireless ethernets have a very limited coverage area.
Even in using out-of-band communications such as those described previously, it is prudent to encrypt sensitive data. Using available encryption tools will increase the level of confidence in the data that you've collected. Encryption tools can enhance data confidentiality, integrity, and authenticity.
With the basic concepts of encryption in mind, let's talk about how we could make use of them during an incident. Maintaining confidentiality of all information collected during an incident is vital. Quite often, we're going to transmit the data to an analyst via a network or modem, as we discussed previously. We need to make more than just a basic effort to protect the confidentiality of the data. Pretty much any encryption product can do that for you; choosing the right one really comes down to ease of use, compatibility with the platform that you want to run it on, and key management.
Ease of use is extremely subjective, so we're not going to spend much time talking about it, except with regard to the particular tools described in this section. Compatibility is a more objective issue, and one that should be carefully thought through. It's not enough to say that you use Windows NT, so you only need an NT-based encryption package; what about the systems that you're likely to run up against during an actual incident? If your team supports one parent organization, then it should be relatively easy to predict. If not, you should look at packages that run on just about everything. That narrows the field of possible choices down pretty quickly.
Finally, the issue of key management is important operationally because it affects how you will use a tool and who else can use it. A symmetric key tool means that you have to maintain a so-called shared secret among the people involved in handling the incident, while a public key tool means that you have to maintain a repository (or multiple repositories) of public keys, and that you have to adequately protect all of the private keys involved. Both of these scenarios imply a certain level of administrative overhead. Choosing between them is important. We've found that having both symmetric key and public key tools available is useful, since each category of tool has advantages and disadvantages for different situations.
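To make the shared-secret idea a little more concrete, here is a minimal Python sketch in which team members who hold the same pre-distributed secret can tag a piece of collected data and later verify that it has not been altered. It uses a keyed digest (HMAC-SHA256) from the Python standard library. This addresses integrity and authenticity only; it does not encrypt anything, and it is not how any of the products described in this section work internally. The function names are ours.

```python
import hashlib
import hmac
import secrets

def new_shared_secret(nbytes=32):
    """Generate a random secret to distribute before an incident."""
    return secrets.token_bytes(nbytes)

def tag(secret, data):
    """Return a hex HMAC-SHA256 tag for data under the shared secret."""
    return hmac.new(secret, data, hashlib.sha256).hexdigest()

def verify(secret, data, expected_tag):
    """Check a tag in constant time; True only if data is unmodified."""
    return hmac.compare_digest(tag(secret, data), expected_tag)
```

The administrative overhead mentioned above shows up immediately: everyone who needs to verify the data must already hold the secret, and anyone who holds it can also forge a tag. A public key signature avoids the second problem at the cost of managing key pairs.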
One more note before we discuss the individual tools: it is important to have a fundamental understanding of the encryption tools that you're considering, as each type of encryption does different things. This section is by no means intended to describe the different encryption algorithms or even to be a primer on encryption. Instead, we present a few of the available encryption tools, along with descriptions of what they can do for you during an incident.
RSA Secure PC
Although there are a great number of PC-based encryption tools available, we've found RSA's SecurePC to be particularly usable. It's a no-nonsense encryption product that uses a symmetric, or shared secret, algorithm to encrypt files. One feature that we've found to be very useful is its ability to generate a self-decrypting PC-executable file. That can be a very handy feature when trying to exchange encrypted data with someone who does not have any encryption software. Note that this feature may not work with some email environments that block users from sending or receiving executable files as attachments to their messages.
As with any shared secret encryption, it is advisable to generate one or more shared secrets prior to needing them during an incident to avoid disclosing the password over an untrusted medium such as a phone line or an untrusted data network.
Originally written by Phil Zimmermann as a freely available encryption package for the masses, PGP has become the de facto standard for person-to-person encryption of sensitive information. The commercial version of PGP is available from PGP, Inc., which is a subsidiary of Network Associates. See http://www.pgp.com for more information.
A few things make PGP highly versatile and functional for an incident response team. First, it is available on a large number of platforms, either in the freely available or the commercially available version. This is at least largely due to the fact that the source code for PGP has been available for several years. Also, PGP has the capability of using any of a number of symmetric as well as asymmetric algorithms. Thus, you can quickly and easily use it via a shared secret or as a simple, but highly capable, public key encryption tool for encrypting and/or signing sensitive incident-related data. Newer versions of PGP can also be used to encrypt network sockets, essentially functioning as Virtual Private Networks (VPNs). Another relatively new feature of PGP software is the ability to create a virtual and encrypted disk on an existing nonencrypted disk. All of these features can be useful at times for encrypting sensitive data.
A large repository of PGP public certificates is available from several sources, including MIT, as well as PGP, Inc. themselves. Unlike more traditional Public Key Infrastructure (PKI) products that use a formal process and hierarchy for identifying users and generating certificates, however, PGP uses a simpler mechanism that is known as a "web of trust." In this mechanism, individual PGP users vouch for one another and sign each other's keys. Thus, one PGP user can amass a large number of signatures on his PGP certificate; if the recipient of a digitally signed document or file knows that any of the signers is trustworthy, then he can infer a level of trust in the document and signature in front of him. There is an implicit assumption in this model: each person signing another person's key enforces a rigorous policy of sorts in identifying the owner of the key prior to signing the key.
Volumes have been written on this subject alone, but that is not the thrust of this book. However, for the purposes of incident response within one team or organization, this simplistic web of trust mechanism is ideal. As long as the population of a single web of trust remains relatively small, the model works very well. Thus, for an incident response team to generate a collection of keys for its members is a simple task and facilitates easy and highly secure encrypted communications among the team members.
Removable Storage Media
On modern networks, a network sniffer can collect a vast amount of data in very short order. However, hard drives aren't always as transportable as removable storage media. And, as with most things that we've discussed so far, there's a range of removable media types available; each type has its own strengths and weaknesses. One of the basic credos of incident response is that there is no such thing as a large enough hard drive.
We've found that maximum flexibility is the key to success, although there are always trade-offs. Some of the most important issues when selecting a type of removable storage to use include:
- Availability on target system
- The choice of removable storage is often made for you before you ever arrive on-site to work on an incident. This is true in situations when you need, for example, to do some form of forensic analysis on a specific system. It's also true when you need to pull information off a specific server or a desktop host in general.
- This relates closely to availability. You are often forced to copy data so that no signs are left behind on the system. That is, no device drivers are loaded or unloaded, and so on. In that case, you are pretty much stuck with whatever is in the system when you arrive.
- Naturally, whatever storage devices you choose need to be compatible with the system that you are working on as well as the system on which you're going to be analyzing the data. Care should be taken, for example, to ensure that the data format can be read cross-platform if you are collecting data on one type of system and analyzing it on another.
- Amount of data
- Probably the second most important factor in choosing what type of removable media to deploy is the size of the data set to be moved. You will want to minimize the number of disks, tapes, drives, and other storage media used for an incident.
- There's a great variation in copy speed among the different media that we discuss in the next sections. If you need to copy or transport a vast amount of data, then the speed of the data transfer should factor heavily in your decision. Also, take into account the operational overhead of the speed of the copy process. For example, if you need to load or swap numerous disks into a particular backup device because the data set is too large to fit on one disk, that will certainly factor into the process. Also, if you need to go to extreme measures (e.g., load a device driver and reboot the system, prepare the data into a specific format) to make the copy, that also affects the overall speed of the process.
- The cost of the device and the media could be a factor in selecting the storage device. As much as we'd like to think that cost should not be an object, it always is when you are dealing with a major incident. Keeping costs to the minimum required to perform a task is always sound advice.
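The amount-of-data and speed criteria above lend themselves to a quick back-of-the-envelope calculation before you pack. The Python sketch below is one way to do it; the function names are ours, and the transfer rate and per-swap delay are placeholders that you would measure for your own devices rather than figures we are vouching for.

```python
import math

def media_needed(data_mb, media_mb):
    """Number of disks or tapes required to hold data_mb megabytes."""
    return math.ceil(data_mb / media_mb)

def copy_hours(data_mb, mb_per_sec, swap_sec, media_mb):
    """Estimated hours to copy data_mb, including time spent swapping media.

    mb_per_sec and swap_sec are measured values for your own device
    and your own hands; they are inputs here, not known constants.
    """
    swaps = media_needed(data_mb, media_mb) - 1
    total_sec = data_mb / mb_per_sec + swaps * swap_sec
    return total_sec / 3600
```

For a 4,000 MB data set, `media_needed` works out to 2,778 floppies at 1.44 MB each, 40 disks at 100 MB each, or 2 disks at 2,000 MB each, which makes the operational overhead of swapping media easy to see.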
Although floppy disks are at the ridiculously low end of the storage capacity spectrum for many of our current needs, they are also at the high end of the availability spectrum. Just about every computer has a 1.44 MB floppy disk drive nowadays. Even if it means using a box or more of disks to store all of the information that you need, don't discount using floppies. Keep a bunch of diskettes in your fly-away kit.
If the individual files you need to copy exceed the storage capacity of the drive, there are a variety of utilities available that store the data so that it spans multiple disks and then reassembles the data at the receiving end. Keep copies of these tools in your fly-away kit as well.
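The span-and-reassemble idea is simple enough to sketch. The Python below is only an illustration of the technique, not a replacement for a tested utility: the function names and the numbered-suffix naming scheme are our own, and in practice you would pick a chunk size slightly below the usable capacity of the media and write each piece to a separate disk.

```python
import os

def split_file(path, chunk_size, out_dir):
    """Split path into numbered pieces of at most chunk_size bytes."""
    pieces = []
    base = os.path.basename(path)
    with open(path, "rb") as src:
        index = 0
        while True:
            data = src.read(chunk_size)
            if not data:
                break
            piece = os.path.join(out_dir, "%s.%03d" % (base, index))
            with open(piece, "wb") as dst:
                dst.write(data)
            pieces.append(piece)
            index += 1
    return pieces

def join_files(pieces, dest):
    """Reassemble pieces, in the order given, into dest."""
    with open(dest, "wb") as dst:
        for piece in pieces:
            with open(piece, "rb") as src:
                dst.write(src.read())
```

Because the pieces must be rejoined in order, the zero-padded numeric suffix matters: sorting the piece names alphabetically at the receiving end recovers the original sequence.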
Iomega has done a stupendous marketing job getting the ZIP-100 class drives and magnetic media into widespread distribution. Few organizations lack access to at least a couple of ZIP drives nowadays. Plus, at 100 megabytes per disk--and there is a 250 megabyte version available--they have a considerable advantage over floppy disks. External drives are available in parallel port, SCSI, and USB options. We've found the ZIP-Plus, with its SCSI and parallel port options, to be very flexible and usable almost anywhere.
One concern regarding ZIP disks is the necessity of adding software drivers to some systems, particularly if the parallel port version is used. Although this isn't a big issue if you are using your own team's equipment, it can be, if, for example, you need to retrieve data from a target's desktop system and you don't want to leave any trace of what you've done.
Even with that minor limitation in mind, a couple of ZIP drives and a handful of disks should be in every incident response team's fly-away kit. Many laptops have internal ZIP drive options as well. Although not a replacement for a floppy drive per se, these are a nice option to have in fly-away laptop systems.
The higher capacity cousin of Iomega's ZIP drive is their Jaz drive. The Jaz is available in 1 GB and 2 GB configurations, with both internal SCSI and external SCSI options. The external SCSI option can be coupled with a parallel port SCSI adapter, making it nearly as flexible as the ZIP drive. Unlike the ZIP, which is essentially a very high capacity floppy disk, the Jaz is a removable hard disk. As such, the media is more expensive and should be handled more delicately than the ZIP disks.
Still, with a top-end capacity of 2 GB, the Jaz drives are a very attractive option for doing large amounts of data collection. A combination of Jaz drives and Federal Express (or similar overnight service) could have a data throughput that is more than sufficient for gathering relatively huge amounts of data from remote network monitors and analyzing the data back at an analysis facility.
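The throughput claim for disks plus an overnight courier is easy to check with arithmetic. The short Python sketch below computes the effective data rate of a shipment; the function name is ours, and the ten-disk, 24-hour example is an assumption chosen only to illustrate the calculation.

```python
def courier_bps(disks, gb_per_disk, transit_hours):
    """Effective bits per second of shipping disks door to door."""
    bits = disks * gb_per_disk * 1e9 * 8  # using 1 GB = 10**9 bytes
    return bits / (transit_hours * 3600)

# Assumed example: ten 2 GB disks delivered in 24 hours.
rate = courier_bps(10, 2, 24)
print("%.2f Mbps" % (rate / 1e6))  # about 1.85 Mbps
```

That is more than ten times the 128 kbps of the fastest metro-area wireless option mentioned earlier in this chapter, although the latency, of course, is a full day.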
Conceptually similar to the Iomega Jaz drive, the Orb drive by Castlewood is a lower-priced option with nearly as much data capacity as a Jaz drive, at 2.2 GB. If you are looking at an Orb drive as an alternative to Jaz, then there are a couple of issues to be aware of. First, the drive must be compatible with the operating system that you need to use. Second, make sure that you can readily get additional Orb media. Although not obscure by any stretch, Orb media is not as ubiquitous as Jaz media.
If these issues are met, then an Orb drive is a good, inexpensive alternative to the Jaz drives.
Although lacking the mobility and flexibility of the previous removable media we've discussed, a CD-R recorder has at least one major advantage in that the media, once written, is read-only. Apart from that, it is also very inexpensive and widely available. Note that there is a read-write CD-RW option as well, and that could be an alternative to consider, although the CD-RW media may not be readable on all CD-ROM drives. The advantage of being read-only could be considerable if pursuing litigation. Naturally, the rest of the team's processes and procedures must also be litigation compliant, but having a store-once, inexpensive archive of the collected data is very convenient.
One of the downsides to using a CD-R to record your collected information is speed. Even the fastest CD-R burners are quite slow compared to the other removable media discussed here.
External Hard Drives
Not to be overlooked is the value of external hard drives that connect to systems through the serial, parallel, SCSI, Universal Serial Bus (USB), or IEEE 1394 "FireWire" ports. When you need to offload large quantities of incident data, such devices make it easy to bring a significant amount of storage (anywhere from 40 to 120 gigabytes) to use during an incident investigation or analysis. You can never have too much hard disk storage!
However, many external hard drives require special drivers and may not be supported on all systems; on older systems, your only choice may be a serial or parallel device that the system can read. Further, if you need to conduct forensic analysis on a computer, the last thing you want to do is corrupt your evidence by loading up a set of drivers on the system in question!
The last of the removable media options covered here is the tape drive. Tape drives are available in a huge number of shapes, sizes, capacities, speeds, and prices. About the only reason to use--and, from time to time, to require--a tape drive is quite simply its capacity. When you need to make an evidentiary backup of a multigigabyte server, even the best of the removable hard drives, like the 2 GB Jaz, will have you swapping out disks for many hours, when you could have made the backup onto a single tape had you chosen a high capacity tape drive.
The Incident Kit
Every incident response team should have with it a "fly-away kit" capability. It's been our experience that a fly-away kit should be tailored to the individual team's needs, of course, and that the configuration is going to change over time as attacks, networks, and tools change. The kit must be ready on a moment's notice and have the necessary equipment to support the team's needs when it is sent to handle an incident. Of course, the notion of "being sent" will vary depending on each team's responsibilities, but whether it is down the hall or to a far away continent, it is safe to say that the team will need tools to support its mission.
We've learned a lot of valuable lessons over the years about what should and shouldn't go into a fly-away kit. One frighteningly consistent theme is that things go wrong at the worst possible times--the infamous Murphy is credited with saying this first. It doesn't seem possible that this could be more true. In fact, we believe that Murphy is a permanent member of every incident response team. Tools fail when you need them most--disk drives crash before you save critical data; cables and connectors break as you're about to connect a critical piece of equipment. None of this happens during the day when stores are open or when customer support personnel are on duty at your tool vendors. Here are the most important criteria to follow when putting your kit together.
Maintain your library of tools, systems, gadgets, etc., in an always-ready configuration so that they can be available to be used on a moment's notice. This works best if you dedicate a number of computers and tools for the fly-away kit and never use them for other purposes except to update them with new tools and features. Clearly, this comes down to a fiscal business decision for most teams. A tool that gets used once a year and costs an enormous amount of money and could be useful for other purposes is hard for any incident response team to dedicate to a fly-away kit. If they can't be dedicated, they should at least be tracked and monitored, so that they can be pulled back quickly to support an incident. One option that we've found useful is to use removable hard drives in fly-away systems. It is often easier to dedicate a cheaper hard drive to a fly-away configuration than an entire, more costly system.
Tool inventory should be managed very carefully. You want to know the status, version, and whereabouts of every tool in the kit. When an incident occurs, the last thing that you want to waste any time on is scrambling to pull together the tools that you need for the job. They should be available at your fingertips.
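Even a very small data structure can capture the status, version, and whereabouts of each tool. The Python sketch below is one possible shape for such a record; the field names, the status values, and the convention that `"kit"` means "packed and in the fly-away kit" are all our own assumptions.

```python
from dataclasses import dataclass

@dataclass
class Tool:
    name: str
    version: str
    location: str  # whereabouts, e.g. "kit" or "on loan to the lab"
    status: str    # e.g. "ready", "in use", or "needs rebuild"

def not_ready(inventory):
    """Return the tools that could not ship with the kit right now."""
    return [t for t in inventory
            if t.status != "ready" or t.location != "kit"]
```

Running `not_ready` over the inventory on a regular schedule, rather than when the phone rings, is the point: the list it returns should normally be empty.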
Particularly if your team needs to get on an airplane to travel to the site of the incident, controlling weight is vital. While we're talking about getting onto an airplane, never check in anything that you can't afford to be without at your destination. While working at the Department of Defense, our team once spent several weeks trying to track down a Sun workstation that was being given a world tour (in other words, "lost") by Delta Airlines. The machine was returned, but it could not be used while the team was at the customer site working on the incident. We've started using Federal Express to ship large equipment to customer sites--while still not under our positive control, FedEx does provide accurate parcel tracking information that allows us to keep track of where things are and (if necessary) file a loss claim.
Any system that can do double- or triple-duty is probably worth the effort. Removable hard drives and multiboot configurations are good examples. Having devices that run multiple operating systems (e.g., Windows 95, 98, 2000, and Linux) can save you from carrying quite a load!
Don't assume that the client has it. Everything from floppy disks to power strips--don't assume that they'll be readily accessible where you're going. This might contradict the lightweight criterion, but always try to be as self-sufficient as possible. For some reason (fate?), many of the incidents that we've worked on have had superb local electronics stores nearby. A personal favorite for maximum availability of just about anything that you could need is Fry's Electronics. While they're great to have nearby, don't assume that they will be unless you really know the area that you're going to well enough.
Speaking of Fry's, be sure to establish and maintain good relationships with your tool vendors. For those times when you don't have the exact tool necessary, you need a vendor that will ship the tool to you overnight any night. Having a good rapport with the vendor can mean the difference between getting the tool quickly and having to explain to the customer why you can't accomplish what you need to do. Keep a purchasing charge card available along with the authority to use it whenever necessary, bypassing normal corporate purchasing procedures, within reason.
Try to burn a current and stable copy of your tool collection (and software license keys or tokens) onto a CD and keep it with you at all times. Note that the most current version of a tool is not necessarily the most stable one. So, if you have a trusty-but-rusty older version of a tool, bring that one. The important point, though, is to have a mobile repository of your tools with you on read-only media. You never know when you might have to reload a system due to a disk crash, software glitch, etc. For that matter, the repository should include all operating system distributions and patches that you need. It's also a great way to have your tools with you if not all your hardware makes it to your destination on time. You might be able to use hardware from your customer in the interim.
In addition, don't experiment with new tools on the road. While that beta update of a tool might have a tempting list of new features, avoid using it at all cost. This is especially true if the information that you're collecting might be needed in a courtroom, but it also applies if you just need good solid and reliable tools for the task at hand. New tools should be tested in a lab environment whenever possible.
Make sure you know the customer support numbers of all of your tool suppliers. If they have 24x7 support, it could be worth paying for. Even though you might never need to contact them, if you do need to it will probably be urgent. You'll find out that you need to talk to them moments after their normal helpdesk operations cease for the day (or more likely for the week). Even if it means paying a few extra dollars per year for 24x7 support, it could be the best money that you've ever spent if you can get the right technical support at 3:00 in the morning while working on a huge incident.
Don't assume that the client's network is as she describes it. Even if your client is another department of the same company that you work for, don't assume that the description that you have of her network and system environment is accurate. Be prepared for different hardware and software configurations.
Just in case something goes wrong with one or more of your tools, be sure that the team has a copy of distribution media to reinstall software as needed while on the road or out of the office. That should include the base operating systems as well as all of the necessary tools. Keeping the current tools on one or more CDs is an excellent way of accomplishing this without having to waste time having the office FedEx supplies to you somewhere.
When tools come back from an incident, you should have a foolproof process for getting them back into a known state. Who knows what may have been changed, reconfigured, or otherwise abused on the system while it was out in the field. The process might include complete reloads of the operating system, tools, etc., but the end result should be the same. Always make sure that the tool is ready to go. We keep image backups of fly-away systems--with all tools preinstalled, compiled, and registered--that enable us to rapidly rebuild and restore a system after (or during) a trip.
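One way to check whether a returning system is still in a known state, short of a full reload, is to compare its tool directory against a manifest of checksums recorded before the trip. The Python sketch below illustrates the idea; the function names are ours, SHA-256 is simply our choice of digest, and a mismatch here is a reason to restore from the image backup, not a substitute for doing so.

```python
import hashlib
import os

def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest

def changed_files(baseline, current):
    """Return sorted paths that were added, removed, or modified."""
    paths = set(baseline) | set(current)
    return sorted(p for p in paths if baseline.get(p) != current.get(p))
```

The baseline manifest should be stored somewhere other than the system it describes, ideally on the same read-only media as the tool collection itself.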
Always make sure that you have a workaround should a critical component fail. When designing your bag of tricks, ask yourself what you would do without each and every item in it. And be prepared to make exceptions to just about everything on this list of criteria. As much as we've admonished you to avoid some of these pitfalls at all cost, sometimes all cost isn't enough. Sometimes you have no choice but to use that beta version of a tool because it includes an update that you absolutely must have. Sometimes the tools that you have just won't do what you need them to do and you're forced to write a one-off script and use it. If you're not good at handling stressful situations, then consider another line of work.
As we indicated, this list captures a lot of our real world experiences in putting together tool kits for incident response. Our strongest recommendation is that you pay close attention to every one of these details. In addition to these lessons that we've learned, there are certainly some things that absolutely must be included in your fly-away kit. These include power strips and removable storage and media, as well as mobile network gear such as network hubs and extra lengths of network cables.
If We Ruled the World
We have described a large collection of different tools that are available to the incident response practitioner. In most cases, there were numerous tools per category. We did that not out of some sense of fairness to the tool vendors, but because no single tool has successfully solved all of our needs over time. In fact, we've found it highly advantageous to keep a large collection of tools available because each new incident situation requires a new solution set. Subtle differences among incidents can require completely different tools.
So, we got to thinking about what would make up the perfect tool collection. Perhaps that isn't a goal that can be accomplished, but the criteria that we'd look for in our perfect tools would include the following:
- Analysis support
- Robust support for the analysis and documentation process was not a specific design goal for many of the general-purpose tools that we've discussed, and it is very weak among the currently available tools. For the analysis capabilities to be truly useful in incident response, the tools would need to provide accurate visualization of the attacks, indexed notes that the analyst can attach to the data to quickly document certain aspects of the attack (almost like electronic Post-it® notes), and the ability to quickly view, fast forward, rewind, and search the collected data. Since the tools that we've described were not designed specifically for incident response, they fall short of being able to do these things adequately.
- Evidentiary collection support
- Collecting evidence for court use places some very stringent requirements on the tools and processes used. That's well understood and accepted. However, none of the tools that we've described do anything to actively support the evidence collection process, even though almost all of them are capable of collecting information that you would no doubt want to use as evidence under the appropriate circumstances. Some of the things that we'd like to see the tools do in order to support evidence collection are: digital signatures and tamper-resistant seals on collected data, a cryptographically secure repository of collected data, support of a two-person rule process for storing data with multiple digital signatures, and an audit trail of access to collected data. All of these processes are doable using freely available cryptography, yet none of the tools described support any of them.
- Timestamp confidence
- Whenever you are using more than one system to handle an incident, whether that means the system being attacked and the system analyzing the attack or any other combination of computer systems, having trustworthy timestamps for data and events is critical. While it would be useful under some circumstances to have highly precise timestamps, accuracy to within +/- 0.1 seconds is sufficient for most purposes. It's more important to have a common time reference across each network device and host involved in handling the incident. This is particularly true for external teams that travel to their clients' sites to assist in handling incidents, as their fly-away systems need to be synchronized with all of the local systems on the client network. Although this may sound trivial, it can be of vital importance in diagnosing a highly technical network issue, not to mention demonstrating an incident chronology in court. On more than one occasion, we have been burned (not in court!) by having inadequate timestamps on systems. None of the tools that we've discussed actively addresses this issue.
- Common data formats
- In collecting and analyzing data from different data sources, it would be highly beneficial to work in a lingua franca among the different tools. While not always feasible, some of the existing data incompatibilities only serve to frustrate the incident response practitioner. Common examples include different data formats among network sniffers, varying nomenclature for vulnerabilities among host/network scanners, and so on. While some possible solutions exist, such as the Common Vulnerabilities and Exposures (CVE) standard, they don't sufficiently solve this issue.
- Stealth mode
- As we've previously described, almost all incident response operations need to be performed in a stealth mode. You usually do not want the target of the investigation to be aware that he is being observed. Network sniffers, for example, need to be impossible to detect electronically on a network. Data retrieval and page-back mechanisms must never use the network under observation to report back to the incident response team in any way. Some but not all of the tools that we've described provide a level of support for this. Every one of the tool vendors to which this applies ought to instruct and assist their customers on how to make their monitoring systems stealthy on all supported network media.
- Out-of-band data retrieval and page-back
- To avoid being seen by the target of an investigation, you must not use a network that is being monitored to transmit collected data back to an IRT analysis center. The most common way to support this is by out-of-band communications. For example, when placing a network sniffer on a target network, the IRT should be able to connect to the sniffer, transfer all collected network data back to an analysis center (in a compressed data format if appropriate), and restart the sniffer without being detected by the suspect, who may be monitoring the network for possible pursuit. Here, too, is something that is readily accomplished by means of existing technology, yet few if any of the vendors support it adequately.
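To illustrate how little is needed for the evidentiary collection items above, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not a vetted evidence-handling procedure. It uses keyed HMAC-SHA256 seals as a stand-in for true public-key digital signatures, which would require a cryptography package, and the custodian names, keys, and function names are all hypothetical.

```python
import hashlib
import hmac
import json


def seal(data: bytes, key: bytes) -> str:
    """Return a keyed seal (HMAC-SHA256) over the SHA-256 digest of the data."""
    digest = hashlib.sha256(data).digest()
    return hmac.new(key, digest, hashlib.sha256).hexdigest()


def seal_evidence(data: bytes, keys: dict[str, bytes]) -> dict:
    """Create a record sealed by every custodian; at least two are required."""
    if len(keys) < 2:
        raise ValueError("two-person rule: at least two custodians must seal")
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "seals": {name: seal(data, key) for name, key in keys.items()},
    }


def verify_evidence(data: bytes, record: dict, keys: dict[str, bytes]) -> bool:
    """True only if the digest and every custodian's seal still match."""
    if hashlib.sha256(data).hexdigest() != record["sha256"]:
        return False
    if set(record["seals"]) != set(keys):
        return False
    return all(
        hmac.compare_digest(seal(data, keys[name]), value)
        for name, value in record["seals"].items()
    )


def _entry_hash(who: str, action: str, prev: str) -> str:
    """Hash one audit entry together with the hash of the entry before it."""
    body = {"who": who, "action": action, "prev": prev}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_audit(log: list[dict], who: str, action: str) -> None:
    """Append an access entry chained to the previous entry's hash."""
    prev = log[-1]["entry_hash"] if log else "0" * 64
    log.append({"who": who, "action": action, "prev": prev,
                "entry_hash": _entry_hash(who, action, prev)})


def audit_intact(log: list[dict]) -> bool:
    """Recompute the chain; any edited or reordered entry breaks it."""
    prev = "0" * 64
    for entry in log:
        expected = _entry_hash(entry["who"], entry["action"], entry["prev"])
        if entry["prev"] != prev or entry["entry_hash"] != expected:
            return False
        prev = entry["entry_hash"]
    return True
```

Because an HMAC seal can be forged by anyone holding the key, this sketch shows tampering after the fact but cannot prove which custodian sealed the data. Public-key signatures would be needed for that.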
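The timestamp confidence item above comes down to expressing every event on one common clock. As a minimal sketch in Python, assume that the team read each host's clock at a known moment on a reference clock and recorded the difference. The host names and function names here are hypothetical, and the 0.1-second tolerance is the figure from the list above. After-the-fact correction like this is a fallback. It is no substitute for synchronizing the systems beforehand.

```python
from datetime import datetime, timedelta, timezone

TOLERANCE = timedelta(seconds=0.1)  # accuracy the text calls sufficient


def clock_offset(host_time: datetime, reference_time: datetime) -> timedelta:
    """Amount to add to a host's timestamps to put them on the reference clock."""
    return reference_time - host_time


def needs_correction(offset: timedelta) -> bool:
    """True if the host's clock is off by more than the tolerance."""
    return abs(offset) > TOLERANCE


def normalize(events, offsets):
    """Shift (host, timestamp, message) events onto the reference clock and sort."""
    shifted = [(ts + offsets[host], host, msg) for host, ts, msg in events]
    return sorted(shifted, key=lambda event: event[0])


if __name__ == "__main__":
    ref = datetime(2001, 8, 1, 12, 0, 0, tzinfo=timezone.utc)
    offsets = {
        "victim": clock_offset(ref - timedelta(seconds=90), ref),   # 90 s slow
        "sniffer": clock_offset(ref + timedelta(seconds=5), ref),   # 5 s fast
    }
    events = [
        ("victim", datetime(2001, 8, 1, 12, 0, 0, tzinfo=timezone.utc), "login"),
        ("sniffer", datetime(2001, 8, 1, 12, 1, 0, tzinfo=timezone.utc), "SYN seen"),
    ]
    for ts, host, msg in normalize(events, offsets):
        print(ts.isoformat(), host, msg)
```

In this made-up example the raw timestamps suggest that the login preceded the observed packet by a minute. Once both are placed on the reference clock, the packet comes first. A wrong ordering like this can derail an incident chronology.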
© 2001, O'Reilly & Associates, Inc.