Preparation
Identification
Containment
Investigation
Eradication
Recovery
Phases of Incident Response
SKILL 29: Describe the phases of Incident Response
This step happens before an incident occurs. Ensure you have the appropriate response plans, policies, call trees and other documents in place and that you have identified and trained the members of your incident response team, including external entities. Ensure that tools and procedures for incident response have been selected and documented. Ideas:
Have a packing list for every team member so everyone brings everything they need. (You don’t want to have to call back or make impulse purchases while on an IR mission.)
Ensure all tools are updated (you don’t want to show up to a site and have to update your tools on a compromised network).
Training (team members know what tools they use and their job roles)
Documentation (Team knows the TTPs that they will be using during the IR)
Standard Operating Procedures (review SOPs that you will use while on an incident)
Network Diagrams (obtain them, if you can, from the sites you might go to)
Policies & Procedures (know these for the sites you may go to)
What are some external entities that might be part of an incident response team? (S-6, S-3, SJA, CI, etc.)
One important preparation item is to enable NTP or another method to keep host clocks synchronized, so that events can be correlated, and forensic analysis uses a consistent time.
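To illustrate why consistent time matters for correlation, here is a minimal sketch that converts timestamps from several hosts to UTC and merges them into one timeline. The host names, log messages, and timestamps are hypothetical, and the sketch assumes each log records an ISO 8601 timestamp with a UTC offset. It only helps if the host clocks were actually synchronized (for example, via NTP) when the events were logged.

```python
from datetime import datetime, timezone


def to_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp that carries a UTC offset and convert it to UTC."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # Without an offset we cannot know which zone the host was logging in.
        raise ValueError(f"timestamp has no UTC offset: {timestamp}")
    return parsed.astimezone(timezone.utc)


def build_timeline(events):
    """Sort (host, timestamp, message) tuples by their UTC time."""
    normalized = [(to_utc(ts), host, msg) for host, ts, msg in events]
    return sorted(normalized)


# Hypothetical entries from hosts that log local time with different offsets.
events = [
    ("web01", "2024-03-01T14:05:10+00:00", "outbound connection opened"),
    ("ws17", "2024-03-01T09:04:55-05:00", "user logon"),
    ("fw01", "2024-03-01T15:05:02+01:00", "allowed inbound session"),
]

for when, host, msg in build_timeline(events):
    print(when.isoformat(), host, msg)
```

Once every entry is in UTC, the user logon on ws17 sorts ahead of the firewall and web server entries, even though its local clock reading looks hours earlier.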
Preparation
29.1 Identify what occurs in the Preparation phase of Incident Response
Work out whether you are dealing with an event or an incident. This is where understanding your environment is critical, as it means looking for significant deviations from "normal" traffic baselines or other methods.
The NIST Computer Security Incident Handling Guide [2] defines events and incidents:
Event - any observable occurrence in a system or network.
Adverse event - event with a negative consequence, such as “…unauthorized use of system privileges, unauthorized access to sensitive data, and execution of malware that destroys data.”
Incident - event that violates an organization’s security or privacy policies.
Ideas:
Unusual Activity Outside Baseline
Unknown Connections
Unknown User Accounts
Unusual User Privileges
External devices
High Traffic Volumes
Unusual Logons
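Several of the indicators above (unknown user accounts, unknown connections) come down to comparing what is observed now against what is known to be normal. A minimal sketch of that comparison follows; the account names and listening ports are hypothetical, and in practice both sets would come from your own baseline and collection tooling.

```python
def find_unknown(baseline, observed):
    """Return items seen on the system that are absent from the baseline."""
    return sorted(set(observed) - set(baseline))


# Hypothetical baseline and current observations for one host.
baseline_accounts = {"Administrator", "svc_backup", "jsmith"}
observed_accounts = {"Administrator", "svc_backup", "jsmith", "support_tmp"}

baseline_listeners = {("0.0.0.0", 443), ("0.0.0.0", 3389)}
observed_listeners = {("0.0.0.0", 443), ("0.0.0.0", 3389), ("0.0.0.0", 4444)}

print("Unknown accounts:", find_unknown(baseline_accounts, observed_accounts))
print("Unknown listeners:", find_unknown(baseline_listeners, observed_listeners))
```

An unknown item is a lead to investigate, not proof of an incident; it may simply be a change that was never recorded in the baseline.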
Identification
29.2 Identify what occurs in the Identification phase of Incident Response
Limit the damage caused to systems and prevent any further damage from occurring.
This includes short and long term containment activities.
Cordon and Clear (VLANs)
Remove from Network, when Feasible
Quarantine
Sandbox
Patch / Hotfix
Add Firewalls
Containment
29.3 Identify what occurs in the Containment phase of Incident Response
Security personnel determine the priority, scope, and root cause of an incident.
Attribution
Avenue of Approach
Indicators of Compromise
Vulnerability Assessment
Forensic Analysis (Static & Dynamic)[4]
Static Analysis - Static analysis examines malware without actually running it.
Strings - Seeing what DLLs, functions, headers might be revealed in the strings output.
OSINT (Open Source Research), Hash the file and check the hash to see if there is anything online about it.
Running the binary through antivirus tools to verify maliciousness
Disassemble the binary using a tool such as IDA (Pro)
Dynamic Analysis - Watching the malware while it is running
Typically takes place after static analysis techniques have been exhausted
Binary is run using a sandbox that records all activity and changes caused by the malware
Monitor all activity using ProcMon, Task Manager, and Procexp (Process Explorer)
Look for network traffic using TCPView and Wireshark; fake responses using a tool like FakeNet
Check for registry and file changes with tools like RegShot and sigcheck
View the program’s execution in a debugger such as OllyDbg or WinDbg
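Two of the static analysis steps above, hashing the file for OSINT lookups and pulling out strings, can be sketched in a few lines. This is an illustration only, not a replacement for the tools named above: the sample bytes are made up, and the string extraction only finds runs of printable ASCII (not UTF-16 strings, which are common in Windows binaries).

```python
import hashlib
import re


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest used to look a sample up in online sources."""
    return hashlib.sha256(data).hexdigest()


def extract_strings(data: bytes, min_length: int = 4):
    """Return runs of printable ASCII characters at least min_length long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]


# Hypothetical bytes standing in for a suspicious binary; nothing is executed.
sample = (
    b"\x00\x01MZ\x90\x00kernel32.dll\x00\x02CreateFileA\x00ab\x00"
    b"http://example.invalid/c2\x00"
)

print("SHA-256:", sha256_of(sample))
for text in extract_strings(sample):
    print("string:", text)
```

Here the extracted strings hint at an imported DLL, an API function, and a URL, which is the kind of lead that guides the later dynamic analysis.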
Investigation
29.4 Identify what occurs in the Investigation phase of Incident Response
Get rid of the bad stuff.
From the investigation, know what you have to remove (malware analysis, playbook)
Reimage
Key Rotation
Clean (Monitor)
Eradication
29.5 Identify what occurs in the Eradication phase of Incident Response
Determine when to bring the system back into production and how long we monitor the system for any signs of abnormal activity.
Remove VLANs
Return network to normal
Lessons Learned
Update SOP, AAR
Continually Monitor (leaving sensors behind to be accessed remotely)
Recovery
29.6 Identify what occurs in the Recovery phase of Incident Response
Volatility is a measure of how perishable electronically stored data is (when electrical power is turned off or fails). [3]
Order of volatility is important when making decisions about how to respond to a potentially compromised system. A system shutdown is sometimes the worst option forensically, since it may mean the loss of transient (volatile) data held in RAM. Understanding what data is lost at a system shutdown, and what data is lost when drives are replaced or reformatted, will help guide the steps taken to forensically investigate a potentially compromised device.
Beginning with the most volatile[5]:
registers, cache
routing table, arp cache, process table, kernel statistics, memory
temporary file systems
disk and other storage media
remote logging and monitoring data that is relevant to the system in question
physical configuration, network topology
archival media
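One way to apply the order above is to sort a collection plan so that the most perishable data is gathered first. The sketch below assumes short category labels (hypothetical names standing in for the seven levels listed above) and a made-up task list.

```python
# Short, hypothetical labels for the levels listed above, most volatile first.
VOLATILITY_ORDER = [
    "registers_cache",
    "memory_and_kernel_tables",  # routing table, arp cache, process table, memory
    "temp_filesystems",
    "disk",
    "remote_logs",
    "physical_config",
    "archival_media",
]


def plan_collection(tasks):
    """Order (description, category) tasks from most to least volatile."""
    rank = {category: index for index, category in enumerate(VOLATILITY_ORDER)}
    return sorted(tasks, key=lambda task: rank[task[1]])


# Hypothetical collection tasks, deliberately listed out of order.
tasks = [
    ("image the system disk", "disk"),
    ("copy archived backups", "archival_media"),
    ("capture memory image", "memory_and_kernel_tables"),
    ("pull logs from the central log server", "remote_logs"),
    ("copy temporary file system contents", "temp_filesystems"),
]

for description, category in plan_collection(tasks):
    print(f"{category}: {description}")
```

The memory capture sorts ahead of the disk image, which reflects the point above that shutting a system down before collecting RAM can destroy evidence.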
During an Incident
Gather baseline information
Cursory review of baseline information
Preliminary dig through system for indicators of compromise and symptoms
Trace indicators to source
Targeted analysis of suspicious information in baseline information
Crawl system for malicious items
Consolidate information
Volatility
30.1 Discuss the factors involved when considering order of volatility
30.2 Assess the order of volatility during an incident
Research and develop a short brief for the class, explaining the difference between a baseline and an enumeration.
Items for discussion:
Is there any difference between baselining a server or a user workstation? (Yes, often servers are more locked down than workstations, and user definable options, processes, and software will change less frequently.)
How often should you baseline? (Whenever configuration changes that affect the baseline are made to the system. In a mature IT organization, this will be coordinated through a configuration management process.)
Does it make sense to baseline the entire registry? (No, parts of the registry are dynamic.)
Are there other things you could baseline?
In this class, we’re talking about baselining in the sense of taking a static baseline - a snapshot of a system at a point in time. It can also make sense to do dynamic baselining - that is, gathering more dynamic, less predictable, performance data (number of network connections, amount of memory or network IO by process, etc.) at regular, frequent points in time. This allows system administrators and information security professionals to look at trend data both to help predict future system requirements (in terms of memory, CPU, or network bandwidth, for example) and to detect anomalies (for example, a system that suddenly has a discontinuous jump in network IO) that may indicate the presence of malware.
Static baselines are deterministic - if, for example, in a mature IT organization, a new local user account has been added without going through the configuration management process (which should, in turn, trigger an update to the baseline), either a security policy has been violated or a system has been compromised.
Dynamic baselines are based on trend analysis, and are probabilistic. A sudden uptick in network traffic may indicate a compromise, or it may simply indicate a new or increased mission requirement for that system.
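As a rough illustration of the probabilistic nature of a dynamic baseline, the sketch below flags a new measurement when it falls more than a chosen number of standard deviations from the historical mean. The traffic figures and the threshold of 3.0 are assumptions picked for the example; real trend analysis would also account for time of day, seasonality, and mission changes.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, threshold=3.0):
    """Flag latest if it lies more than threshold standard deviations from the mean."""
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is a deviation.
        return latest != mu
    return abs(latest - mu) / sigma > threshold


# Hypothetical hourly outbound traffic (MB) for one server.
hourly_mb_out = [120, 131, 118, 125, 122, 129, 117, 124]

print(is_anomalous(hourly_mb_out, 126))  # within the usual range
print(is_anomalous(hourly_mb_out, 900))  # discontinuous jump
```

A flagged value is only a prompt to investigate. As noted above, the cause may be a compromise or simply a new mission requirement.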
Enumeration vs Baseline
Research Activity: Enumeration vs Baseline
How and when to baseline a system should be part of an incident response SOP. A baseline should be taken from a known clean state, usually before the computer is connected to the internet (unless intranet patching servers, e.g. WSUS, are not available). Some items of baseline knowledge to consider capturing include:
Local User Accounts
Running Processes
Services (installed and autostart)
Autorun locations (startup folder, registry locations such as Run, RunOnce, and Explorer shell extensions)
Scheduled tasks
Drivers and system files (file hash)
Network communications (established and listening; also, is there a configured IPv6 connection on an IPv4 network, or vice versa?)
Loaded modules (DLLs)
Installed applications and user context (who’s running with elevated privileges?)
Group policy objects
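For the file hash item in the list above, a static baseline can be as simple as a manifest mapping each file to its hash, taken from a known clean state and compared later. The sketch below is one possible approach, not a prescribed tool: the file names are hypothetical, and the demo runs against a temporary directory rather than real system files.

```python
import hashlib
import tempfile
from pathlib import Path


def hash_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # read_bytes loads the whole file; fine for a sketch, not for huge files.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def compare_manifests(baseline: dict, current: dict) -> dict:
    """Report files added, removed, or changed since the baseline was taken."""
    shared = baseline.keys() & current.keys()
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "changed": sorted(p for p in shared if baseline[p] != current[p]),
    }


def demo() -> dict:
    """Baseline a scratch directory, tamper with it, and compare."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "driver.sys").write_bytes(b"original driver")
        (root / "service.cfg").write_bytes(b"mode=normal")
        baseline = hash_manifest(root)
        # Simulate a modified configuration file and an unexpected new file.
        (root / "service.cfg").write_bytes(b"mode=normal\nrun=evil.exe")
        (root / "dropper.dll").write_bytes(b"unexpected file")
        return compare_manifests(baseline, hash_manifest(root))


print(demo())
```

The manifest itself should be stored off the baselined system, since an attacker with access to the host could otherwise alter it.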
Based on the lessons covering skills 1-4, discuss some ways to use native Windows tools and Sysinternals to gain awareness. These might include command line tools, PowerShell cmdlets, CIM/WMI and Sysinternals. Have students identify which of the tools they’ve learned earlier could be included in a baseline.
Baseline Knowledge
31.1 Identify baseline knowledge on a machine
31.2 Gather baseline knowledge on a machine
Normal activity is a system that operates within security policies and in the way that it was intended. For example, a workstation connection to services for valid operational purposes. To understand normal behavior requires observing the network and the systems on it over time, so that deviations from routine use can be detected and investigated.
Remind students that while we’re focusing on the operating system here, most compromises happen at the application level, especially initially. Understanding of normal behavior must include understanding how applications, especially public facing applications (web apps and services) behave.
Some malicious activity is fairly obvious through examining system activity: scanning other systems, DoSing other systems, hosting phishing sites, etc.
Other malicious activity is harder to detect, especially if an adversary has simply gained persistent access and is waiting for an operational trigger to do anything else. Sometimes months go by between a compromise and malicious activities. There are some atypical behaviors which tend to indicate malicious activity (for example, a workstation does not normally connect to other workstations, and a web server does not typically initiate connections), but often a compromise can be as subtle as a single change to a configuration file.
Differences between malicious and normal activity
31.3 Discuss the differences between malicious and normal activity.
Review work done in earlier modules to enumerate scheduled tasks. A scheduled task may be used to launch malicious activity, or to maintain persistence on a compromised system. This can have advantages over a resident exploit, since a scheduled task can carry out a malicious function (for example, exfiltrating data or listening for further commands from a command and control server) then remove itself from memory until the next scheduled iteration.
Note that an attacker may exploit an existing scheduled task instead of adding one to the scheduled task lists, for example, by changing a configuration file or script to launch an additional process. How could this be mitigated? (Configuration files for processes that run with elevated privileges, or scripts that run with elevated privileges, should be part of the system baseline, and access to them should be controlled through ACLs.)
Identify scheduled tasks that may affect the purpose or activity on a machine
31.5 Identify scheduled tasks that may affect the purpose or activity on a machine
This depends on the purpose of the enumeration. Reference the discussion of a baseline above, and have the students discuss what questions they would want to answer from their forensic examination of a possibly compromised system (who, what, where, when, why, how):
Who is behind the attack / where did it originate? (IP addresses for scans, malicious access - can we isolate the attack to an organization, a region, or even a pattern of jump boxes? Is the malware used or the style of attack associated with a particular threat actor?)
What did the attacker do on the compromised system? Is there evidence that other systems are also compromised (active tunnels, logged connections, listeners)?
When did the attacker gain access? (This may require examining log files that were previously consolidated off of the compromised system, backups of the compromised system, file timestamps, etc. Pro tip: full system backups can be used to develop file hashes for comparison even if you failed to baseline them before. Remember that you cannot necessarily believe the timestamp of any artifact on a compromised system.)
How did the attacker gain access? What vulnerability(ies) did they exploit? What exploit(s) did they use? Are there other systems on the network vulnerable to the same attack? What remediation steps need to be taken to protect other systems?
Why did the attacker choose this system? What are they trying to accomplish? Are they using this box for access (pivoting) or discovery (scanning)? Does it contain sensitive data, or does it have an account with access to sensitive data on a share or database? Does it have access to the public internet for phishing, spamming, or scanning / DoS?
Based on those questions, students should discuss using a systematic, iterative process to approach enumerating a suspected system.
Design initial forensic hypothesis based on SOPs and suspect system behavior.
Design system enumeration to confirm or deny the hypothesis and provide supporting information
Analyze the results of the enumeration
Based on those results, refine or reformulate the hypothesis, or expand the search and start over.
Explain what should be assessed during enumeration of the environment
31.6 Explain what should be assessed during enumeration of the environment.
Some ways to detect and enumerate malware include:
Comparison with a known good baseline.
Look for anomalous behavior on a system, or anomalous interactions with other systems on the network.
Scanning for vulnerabilities to identify potential malware
Anti-malware scanners (for example, anti-virus software) - these are either signature based or heuristic. Signature based scanners use signature files to compare software on the system with a list of known malicious code. Heuristic scanners look for behaviors associated with malicious activity. Most current scanners use a combination of signature based and heuristic detection. Malware developers often obfuscate their code, and add extraneous behaviors, to confuse both signature based and heuristic scanners.
Look for unexpected activity in log files. Since malware and malicious actors often attempt to groom log files, look for inconsistencies between log files (obviously missing entries, timestamp inconsistencies, differences between events logged at application and OS level, etc.)
Sandboxing - observing the behavior of the compromised system in a controlled environment, without connection to the rest of the network. Note that sophisticated malware may attempt to connect to a specific IP address or use a specific service, such as DNS, to check whether it is being sandboxed. A good sandbox environment needs to provide duplicate services to fool malware into thinking that it’s running on an open network. Virtualization is your friend here.
Packet sniffing to collect additional data on malware behavior. This can be used in conjunction with sandboxing.
Event correlation - try to correlate events on the compromised system with other events on the network. For example, a failed login attempt captured in an application log may be correlated with a specific IP captured in a firewall log.
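The event correlation example above (a failed login in an application log matched to an IP in a firewall log) can be sketched as a time-window join. The log entries, user names, and the five-second window are assumptions for illustration, and the approach depends on the two systems having synchronized clocks.

```python
from datetime import datetime, timedelta


def correlate(app_events, fw_events, window_seconds=5):
    """Pair each failed login with firewall entries that are close to it in time."""
    window = timedelta(seconds=window_seconds)
    matches = []
    for login_time, user in app_events:
        for fw_time, src_ip in fw_events:
            if abs(fw_time - login_time) <= window:
                matches.append((user, src_ip, login_time))
    return matches


# Hypothetical failed logins from an application log: (time, user).
failed_logins = [
    (datetime(2024, 3, 1, 14, 5, 10), "admin"),
    (datetime(2024, 3, 1, 14, 20, 0), "jsmith"),
]

# Hypothetical inbound connections from a firewall log: (time, source IP).
firewall_hits = [
    (datetime(2024, 3, 1, 14, 5, 8), "203.0.113.25"),
    (datetime(2024, 3, 1, 15, 0, 0), "198.51.100.7"),
]

for user, src_ip, when in correlate(failed_logins, firewall_hits):
    print(f"{when.isoformat()} failed login for {user} near connection from {src_ip}")
```

Proximity in time only suggests a relationship. On a busy network several sources may fall inside the same window, so a match should be confirmed with other data such as ports or session identifiers.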
Describe how to detect and enumerate malware
31.7 Describe how to detect and enumerate malware
Your Op Notes will feed into your report, depending on whether the report is an executive or technical summary.
Offensive (Why is this important?):
Offensive Op Notes are as detailed as possible.
These Op Notes include timestamps, the programs/tools that are executed, and their outputs.
Defensive
Why is this important? (Classroom Discussion)
Identify the importance of operations notes (Op Notes)
32.1 Identify the importance of operations notes (Op Notes)
What is reporting?
Reporting is giving the appropriate and necessary information to enable the leaders of the owning organization to make decisions.
Reporting can take different forms depending upon your mission set and your team.
In a report, there are usually multiple sections, each designed for a specific audience. What are the sections and who is the audience for each?
The report will usually begin with an executive summary written for higher level, less technical leaders. It is the 10,000 foot overview of the five W’s – who, what, where, when and why. The executive summary will usually outline the courses of action (COA) but it is not the location for the detail of each COA.
The body of the report is designed and written for its primary audience: the individuals who have to make solid decisions about how to operate in cyberspace.
What are some of the components of a report?
Every report needs to provide the five W’s (who, what, where, when and why), which are written in the executive summary. The executive summary will outline COAs and provide a brief description of why each COA is recommended (or not recommended).
The body of the report is designed and written primarily for individuals who have to make solid decisions about how to operate in cyberspace. This section should have a good amount of detail but not dive into minute aspects of scripts, hashes, etc. The body of the report provides enough information for the reader to understand the process and the importance of each piece of information contained within. For example, if it is a report about malware that was found on the host, this section would contain what the type of malware is, what it does, what type of systems it affects, and recommended COAs to deal with the malware.
The end of the report is a technical summary (not the same as the executive summary at the beginning).
Discuss the components of a report
32.2 Discuss the components of a report
Factors that will influence courses of action on an offensive mission are:
Commander’s intent
Antivirus or security products on target
Risk analysis (Do the ends justify the means?) Example: Is it worth risking a million dollar exploit to acquire a Word document from the computer of a low-level terrorist?
Duration of effects and intent (deny, destroy, disrupt)
Second and third order effects
As always, the tools at your disposal will guide your COA
Discuss the primary factors for recommending a course of action based on enumeration
33.1 Discuss the primary factors for recommending a course of action based on enumeration
Offense: This portion should be covered within a classroom discussion regarding an Offensive Mission (CMT).
If vulnerabilities are discovered on a target machine while on an offensive operation, it will provide you with more options for further exploitation and/or implantation with offensive cyber tools. It will also provide insight into the general competency of the administrators of those systems.
Defense: On a defensive mission, if we are conducting a survey or incident response mission and notice a threat actor presence, the mission will adjust and a risk assessment will be conducted. After the risk assessment is completed, the mission will change to mitigating the vulnerabilities.
Identify the common vulnerabilities that could change the course of a mission
33.2 Identify the common vulnerabilities that could change the course of a mission
Receipt of mission
Analysis of mission
COA Development
COA comparison
COA approval
Conduct mission
AAR / Lessons learned
Discuss the development of COAs
33.3 Discuss the development of courses of action
Discuss the purpose of covering your tracks
Discussion: Covering Tracks