Risk Management Handbook Chapter 8: Incident Response (IR)

Introduction

RMH Chapter 8 Incident Response documents the controls that focus on how the organization must: establish an operational incident handling capability for organizational information systems that includes adequate preparation, detection, analysis, containment, recovery, and user response activities; and track, document, and report incidents to appropriate organizational officials and/or authorities. Procedures addressed include incident response training, incident response testing, incident handling, monitoring and reporting, and information spillage response. Within this chapter, readers will find the CMS Cybersecurity Integration Center (CCIC) Functional Area Overview figure and how the Incident Management Team (IMT) within the CCIC works with systems to mitigate information security and privacy incidents.

Common Control Inheritance

The inherited controls list can be used to identify common controls offered by system alternatives. The use of inherited controls is optional. The objective of this process is to identify opportunities to extract benefits (and reduce costs) by maximizing the use of existing solutions and minimizing duplication of effort across the enterprise.

Below is a listing of the controls in this control family that can be inherited, where they can be inherited from, and whether each is a hybrid control.

Incident Response Control | Inheritable From | Hybrid Control
IR-01 | OCISO Inheritable Controls | Yes
IR-02 | CMS Baltimore Data Center - EDC4 | No
IR-02(01) | CMS Baltimore Data Center - EDC4 | No
IR-02(02) | CMS Baltimore Data Center - EDC4 | No
IR-03 | CMS Baltimore Data Center - EDC4 | No
IR-03(01) | CMS Baltimore Data Center - EDC4 | No
IR-03(02) | CMS Baltimore Data Center - EDC4 | No
IR-04 | CMS Baltimore Data Center - EDC4 | No
IR-04(01) | CMS Baltimore Data Center - EDC4 | No
IR-04(04) | CMS Baltimore Data Center - EDC4 | No
IR-05 | CMS Baltimore Data Center - EDC4 | No
IR-05(01) | CMS Baltimore Data Center - EDC4 | No
IR-06 | CMS Baltimore Data Center - EDC4 | No
IR-06(01) | CMS Baltimore Data Center - EDC4 | No
IR-07 | CMS Baltimore Data Center - EDC4 | No
IR-07(01) | CMS Baltimore Data Center - EDC4 | No
IR-08 | CMS Baltimore Data Center - EDC4 | No
IR-09 | CMS Baltimore Data Center - EDC4 | No
IR-09(01) | CMS Baltimore Data Center - EDC4 | No
IR-09(02) | CMS Baltimore Data Center - EDC4 | No
IR-09(03) | CMS Baltimore Data Center - EDC4 | No
IR-09(04) | CMS Baltimore Data Center - EDC4 | No

Procedures

Procedures assist in the implementation of the required security and privacy controls.

In this section, the IR family procedures are outlined. To increase traceability, each procedure maps to the associated National Institute of Standards and Technology (NIST) controls using the control number from the CMS Acceptable Risk Safeguards (ARS).

Incident Response Training (IR-02)

The purpose of Incident Response Training is to prepare individuals to prevent, detect, and respond to security and privacy incidents, and to ensure that CMS fulfills Federal Information Security Modernization Act (FISMA) requirements. Incident response training should be consistent with the roles and responsibilities assigned in the incident response plan. For example, incident response training is applicable to Information System Owners (SO), Business Owners (BO), and Information System Security Officers (ISSO). CMS personnel (i.e., employees and contractors) who routinely access sensitive data, such as names, Social Security numbers, and health records, to carry out the CMS mission receive incident response training annually as part of the general information security awareness training.

The CMS Chief Information Officer (CIO), CMS Chief Information Security Officer (CISO), and the CMS Senior Official for Privacy (SOP) shall endorse and promote organization-wide information systems security and privacy awareness training. According to the CMS Information Systems Security and Privacy Policy (IS2P2), the CIO shall establish, implement, and enforce a CMS-wide framework to facilitate an incident response program, including Personally Identifiable Information (PII), Protected Health Information (PHI), and Federal Tax Information (FTI) breaches, that ensures proper and timely reporting to HHS. Under the CMS IS2P2, the CISO and the SOP shall ensure the CMS-wide implementation of Department and CMS policies and procedures that relate to information security and privacy incident response.

Users must be aware that the Internal Revenue Code (IRC), Section 6103(p)(4)(D), requires that agencies receiving FTI provide appropriate safeguard measures to ensure the confidentiality of the FTI. Incident response training is one of the safeguards for implementing this requirement.

The CMS Information Security and Privacy Group (ISPG) will provide incident response training to information system users that is consistent with assigned roles and responsibilities when assuming an incident response role or responsibility and annually thereafter. For example, general users may only need to know who to call or how to recognize an incident on the information system; system administrators may require additional training on how to handle/remediate incidents; and incident responders may receive more specific training on forensics, reporting, system recovery, and restoration. In addition, those responsible for identifying and responding to a security incident must understand how to recognize when PII or PHI are involved so that they can coordinate with the SOP.

The table below outlines the CMS organizationally-defined parameters (ODPs) for IR-2.

Table 1: CMS Defined Parameters – Control IR-2

Control: IR-2

Control Requirement:

The organization provides incident response training to information system users consistent with assigned roles and responsibilities:

a. Within [Assignment: organization-defined time period] of assuming an incident response role or responsibility;

b. When required by information system changes; and

c. [Assignment: organization-defined frequency] thereafter

CMS Parameter:

The organization provides incident response training to information system users consistent with assigned roles and responsibilities:

a. Within one (1) month of assuming an incident response role or responsibility;

b. When required by information system changes; and

c. [Assignment: organization-defined frequency] thereafter

Training for General Users

For all Enterprise User Administration (EUA) users the following steps outline the process for completing the CMS Computer-based Training (CBT), which includes IR training.

  • Step 1: The incident response training is incorporated into the annual Information Systems Security and Privacy Awareness Training. All EUA users must take the CBT Training located at the CMS Information Technology Security and Privacy web page. The training will be delivered to all EUA users initially prior to account issuance and annually thereafter. It is the responsibility of users to take this training within three (3) days.
  • Step 2: Each year, based on the date of account issuance, each user receives an email that requires a review and completion of the annual CBT.
  • Step 3: Training records are maintained using the CBT database and include the User ID (UID) and the date the individual last completed the training.
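
The record described in Step 3 (a UID plus the date of last completion) is enough to work out who is overdue for the annual CBT. The following is a minimal sketch, not the CBT database's actual logic. It assumes "annually" means 365 days from the last completion date, and the names `TrainingRecord`, `next_due`, and `overdue_uids` are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, List

# Assumption: "annually" is modeled as 365 days after the last completion.
ANNUAL_INTERVAL = timedelta(days=365)


@dataclass(frozen=True)
class TrainingRecord:
    """Hypothetical shape of one CBT training record."""
    uid: str               # User ID (UID)
    last_completed: date   # date the individual last completed the training


def next_due(record: TrainingRecord) -> date:
    """Return the date by which the next annual training is due."""
    return record.last_completed + ANNUAL_INTERVAL


def overdue_uids(records: Iterable[TrainingRecord], today: date) -> List[str]:
    """Return the sorted UIDs whose annual training due date has passed."""
    return sorted(r.uid for r in records if next_due(r) < today)
```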

Role-Based Training

For individuals with incident response roles and responsibilities, role-based training is satisfied through the execution of a tabletop exercise as long as all personnel with incident response roles and responsibilities participate in the exercise. Review Section 3.2 Incident Response Testing for procedures to conduct a tabletop exercise.

Simulated Events (IR-02(01))

The purpose of this control is to facilitate the effective response by personnel who handle crisis situations by incorporating simulated events into incident response training. Exercises involving simulated incidents can also be very useful for preparing staff for incident handling.1

The selection of the scenarios should occur as a part of the test plan development; see Section 3.2 Incident Response Testing for developing the test plan. The following details the CMS specific process for incorporating simulated events/scenarios into incident response training, through the execution of a tabletop exercise.

  • Step 1: Select two scenarios from the list below that will form the foundation of the tabletop exercise. Document the scenarios and a description of each in the Tabletop Exercise Test Plan. It is important to select your scenarios based upon an assessment of risk (i.e., the greatest current threats). Weaknesses identified during prior incidents might identify good candidate scenarios for future incident response tests. In addition, results from prior security control assessments (SCAs), Cybersecurity and Risk Assessment Program (CSRAP) or existing Plan of Action and Milestones (POA&Ms) might assist in selecting scenarios for incident response testing. For example, if access control was identified as a weakness during a prior SCA, a good scenario to select for incident response testing would be scenario 6 (Unauthorized Access to Payroll Records). Detailed descriptions of each of these scenarios can be found in the ISPL (Information Security and Privacy Library) and the scenarios are listed below:
    • Scenario 1: Domain Name System (DNS) Server Denial of Service (DoS)
    • Scenario 2: Worm and Distributed Denial of Service (DDoS) Agent Infestation
    • Scenario 3: Stolen Documents
    • Scenario 4: Compromised Database Server
    • Scenario 5: Unknown Exfiltration
    • Scenario 6: Unauthorized Access to Payroll Records
    • Scenario 7: Disappearing Host
    • Scenario 8: Telecommuting Compromise
    • Scenario 9: Anonymous Threat
    • Scenario 10: Peer-to-Peer File Sharing
    • Scenario 11: Unknown Wireless Access Point
  • Step 2: Ensure that the material developed for the tabletop exercise supports the scenarios selected. Review Section 3.2 Incident Response Testing for more information for developing the exercise material.
  • Step 3: Execute the tabletop test using the procedures outlined below in Section 3.2 Incident Response Testing.

Automated Training Environments (IR-02(02))

The purpose of Incident Response Training/Automated Training Environments is to ensure that CMS employs automated mechanisms to provide a more thorough and realistic incident training environment. At CMS, incident training and incident response testing are both satisfied through the execution of a tabletop exercise. These tabletop exercises are designed to incorporate automated mechanisms for incident response. Review Section 3.2.1 Automated Testing for the detailed procedure, which ensures that automated mechanisms are incorporated into incident response training.

Incident Response Testing (IR-03)

The purpose of the Incident Response Testing is to ensure that CMS tests the incident response capability for the information system using testing principles to determine the incident response effectiveness and document the results.

The table below outlines the CMS organizationally defined parameters (ODPs) for IR testing.

Table 2: CMS Defined Parameters – Control IR-03

Control: IR-03

Control Requirement:

The organization tests the incident response capability for the information system [Assignment: organization-defined frequency] using [Assignment: organization-defined tests] to determine the incident response effectiveness and documents the results.

CMS Parameter:

The organization tests the incident response capability for the information system within every three hundred sixty-five (365) days using NIST SP 800-61, reviews, analyses, and simulations to determine the organization’s incident response effectiveness, and documents its findings.

CMS incident response testing is accomplished through the execution of tabletop exercises. Tabletop exercises are discussion-based exercises where personnel meet in a classroom setting or in breakout groups to discuss roles during an emergency and the responses to a particular emergency situation.  A facilitator presents a scenario and asks the exercise participants questions related to the scenario, which initiates a discussion among the participants of roles, responsibilities, coordination, and decision-making. A tabletop exercise is discussion-based only and does not involve deploying equipment or other resources.

The following steps detail the CMS specific process for conducting a tabletop exercise:

  • Step 1: Complete the Test Plan utilizing the Tabletop Exercise Test Plan Template located in the ISPL. Testing must include two scenario-based exercises to determine the ability of the CMS to respond to information security and privacy incidents. Scenarios should be selected which integrate the use of automated mechanisms for incident response. 
  • Step 2: Acquire approval of the Test Plan from the Business Owner and/or ISSO. The approval is granted by signing the final row of the Test Plan.
  • Step 3: Develop the exercise materials (e.g., briefings, Participant Guide). A sample Tabletop Exercise Participant Guide Template is located in the ISPL. For more information on functional exercise material, please refer to Section 5.3 of NIST SP 800-84, Guide to Test, Training, and Exercise Programs for IT Plans and Capabilities.
  • Step 4: Conduct the tabletop exercise according to the approved Test Plan. The agenda contained within the Test Plan serves as a guide for executing the exercise. Prior to releasing the exercise participants, the Exercise Facilitator and Data Collector conduct a debrief/hotwash.
  • Step 5: Evaluate the tabletop exercise by completing the After-Action Report located in the ISPL. This step is completed by the Exercise Facilitator and Data Collector.
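
The requirements stated above (two scenario-based exercises, approval by the Business Owner and/or ISSO, and a test within every 365 days) can be expressed as a simple pre-exercise check. This is an illustrative sketch only. The `TabletopTestPlan` fields and the `plan_issues` helper are hypothetical and do not reflect the layout of the Test Plan template in the ISPL, and "two scenarios" is interpreted here as "at least two".

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Optional

REQUIRED_SCENARIOS = 2                    # two scenario-based exercises
MAX_TEST_INTERVAL = timedelta(days=365)   # CMS parameter for IR-03


@dataclass
class TabletopTestPlan:
    """Hypothetical summary of a Tabletop Exercise Test Plan."""
    scenarios: List[str] = field(default_factory=list)
    approved_by: Optional[str] = None       # Business Owner and/or ISSO
    last_test_date: Optional[date] = None   # date of the previous IR test


def plan_issues(plan: TabletopTestPlan, today: date) -> List[str]:
    """Return a list of problems that should be resolved before the exercise."""
    issues = []
    if len(plan.scenarios) < REQUIRED_SCENARIOS:
        issues.append("fewer than two scenarios selected")
    if not plan.approved_by:
        issues.append("test plan not yet approved")
    if plan.last_test_date is None or today - plan.last_test_date > MAX_TEST_INTERVAL:
        issues.append("no incident response test within the last 365 days")
    return issues
```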

Coordination with Related Plans (IR-03(02))

The purpose of the Incident Response Testing/Coordination with Related Plans is to ensure that CMS coordinates incident response testing with organizational elements responsible for related plans. Related plans can include but are not limited to the following:

  • Configuration Management Plan
  • Information System Contingency Plan
  • Patch and Vulnerability Management Plan
  • Information System Continuous Monitoring Strategy/Plan

The following steps detail the CMS specific process to ensure Coordination with Related Plans:

  • Step 1:  Identify the related plans and the stakeholders associated with each.
  • Step 2: Establish a primary method of communication. Possible methods of communication include emails, face-to-face meetings, and teleconferences.
  • Step 3: Using the primary method of communication identified above, request copies of related plans. Review the related plans identifying dependencies for the IR test.
  • Step 4: Identify stakeholders from related plans that will be required to participate in the incident response exercise. Coordinate with the stakeholders through the establishment, review, and execution of a test plan.
  • Step 5: Conduct follow up communications as necessary. Specifically, a copy of the After-Action Report should be provided to stakeholders associated with related plans so that those plans may be updated as needed.

Incident Handling (IR-04)

The purpose of this control is to ensure that CMS implements an incident handling capability for security and privacy incidents that includes 1) preparation, 2) detection and analysis, 3) containment, eradication, and recovery, and 4) post incident activity. 

All distributed Incident Response Teams (IRT) fall under the authority of the CCIC IMT, the single information security and privacy incident coordination entity. Each individual system is responsible for identifying incident responders as part of the system’s Incident Response Plan (IRP). The incident responders serve as the frontline of the incident handling capability with oversight and incident response assistance provided by the IMT. This section of the document establishes the specific requirements and processes for maintaining a unified, cohesive incident handling capability across the CMS enterprise and describes the relationship between the IMT and the frontline incident responders.

In the event of a suspected or confirmed privacy (PII) data breach, CCIC IMT will notify ISPG that a Breach Analysis Team (BAT) should be convened, including representatives from ISPG, IMT, and system stakeholders such as the system Business Owner. The BAT will conduct and document a formal Risk Assessment to assess the risk of harm to individuals potentially affected by the breach. The following factors are used:

  • Nature and sensitivity of PII
  • Likelihood of access and use of PII
  • Type of breach

If the Risk Assessment concludes that there is a moderate or high risk that PII has been compromised, the CMS Senior Official for Privacy will work with IMT and system stakeholders to develop a notification plan to notify affected individuals and mitigate their risk.

Affected individuals should be notified of a breach via first-class mail where possible, though depending on the nature and scale of the breach, additional methods such as email, telephone, and local media outreach may be used. The breach notification should include the following information:

  • Source of the breach
  • Brief description
  • Date of discovery and breach occurrence
  • Type of PII involved
  • A statement whether or not the information was encrypted
  • What steps individuals should take to protect themselves from potential harm and services being provided to potentially affected individuals
  • What the agency is doing to investigate and resolve the breach
  • Who affected individuals should contact for information
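
The notification trigger and the required contents listed above can be captured as a small completeness check. This is a sketch under stated assumptions, not a CMS tool. The field names of `BreachNotification` are hypothetical labels for the eight listed items, and `notification_plan_required` assumes the Risk Assessment outcome is recorded as the word "low", "moderate", or "high".

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class BreachNotification:
    """Hypothetical draft notification; one field per required element."""
    source_of_breach: Optional[str] = None
    brief_description: Optional[str] = None
    discovery_and_occurrence_dates: Optional[str] = None
    pii_types_involved: Optional[str] = None
    encryption_statement: Optional[str] = None
    protective_steps_and_services: Optional[str] = None
    agency_actions: Optional[str] = None
    contact_information: Optional[str] = None


def missing_elements(draft: BreachNotification) -> List[str]:
    """Return the names of required elements that are empty or blank."""
    return [
        f.name
        for f in fields(draft)
        if not (getattr(draft, f.name) or "").strip()
    ]


def notification_plan_required(risk_level: str) -> bool:
    """A notification plan is developed for moderate or high risk."""
    return risk_level.strip().lower() in {"moderate", "high"}
```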

In addition to breach notification, CMS must also consider how best to mitigate the risk of harm to affected individuals. CMS may need to provide:

  • Countermeasures against misuse of lost PII/PHI, such as notifying a bank if credit card numbers are lost
  • Guidance on how affected individuals can protect themselves against identity theft, such as education on credit freezes and other defensive measures
  • Services, such as credit monitoring

The Breach Analysis Team may determine that some, all, or none of these mitigation techniques are appropriate for a given breach. Some breaches may require notification, but not mitigation.

The SOP coordinates with the HHS Privacy Incident Response Team (PIRT) for review and approval of the CMS response plan, breach notification, and breach mitigation. Incident handling activities should be coordinated with contingency planning activities, and the lessons learned from ongoing incident handling activities should be incorporated into incident response procedures, training, and testing. The procedure below provides an inclusive set of specific steps and requirements for handling information security and privacy incidents using the four-phase lifecycle. This lifecycle must be used by the IMT and the frontline incident responders to properly handle information security and privacy incidents.

Preparation

Incident response methodologies typically emphasize preparation: not only establishing an incident response capability so that the organization is ready to respond to incidents, but also preventing incidents by ensuring that systems, networks, and applications are sufficiently secure. Although the incident response team is not typically responsible for incident prevention, prevention is fundamental to the success of incident response programs.

The following steps detail the CMS specific process for phase one (preparation) of the incident handling lifecycle:

Step 1: Ensure the proper preparations have been made to respond to information security and privacy incidents by completing the Incident Preparation Checklist located in the ISPL. This checklist should be reviewed annually in coordination with the update to the incident response plan.
Step 2: Ensure regular practices have been implemented to prevent information security and privacy incidents. The list below, taken from NIST SP 800-61 Rev. 2, provides a brief overview of some of the main recommended practices for securing networks, systems, and applications.

  • Risk Assessments: Periodic risk assessments of systems and applications should determine what risks are posed by combinations of threats and vulnerabilities. This should include understanding the applicable threats, including organization-specific threats. Each risk should be prioritized, and the risks can be mitigated, transferred, or accepted until a reasonable overall level of risk is reached. Another benefit of conducting risk assessments regularly is that critical resources are identified, allowing staff to emphasize monitoring and response activities for those resources

The CMS standard for risk assessment requires that the results of the risk assessment are reviewed at least annually and that the risk assessment is updated at least every three years or when a significant change occurs.

  • Host Security: All hosts should be hardened appropriately using standard configurations. In addition to keeping each host properly patched, hosts should be configured to follow the principle of least privilege, granting users only the privileges necessary for performing authorized tasks. Hosts should have auditing enabled and should log significant security-related events. The security of hosts and configurations should be continuously monitored. Many organizations use Security Content Automation Protocol (SCAP) configuration checklists to assist in securing hosts consistently and effectively.

The CMS standard requires the implementation of the latest security configuration baselines established by the HHS, U.S. Government Configuration Baselines (USGCB), and the National Checklist Program (NCP).

  • Network Security: The network perimeter should be configured to deny all activity that is not expressly permitted. This includes securing all connection points, such as virtual private networks (VPNs) and dedicated connections to other organizations.

The CMS standard requires that the information system at managed interfaces denies network communications traffic by default and allows network communications traffic by exception (i.e., deny all, permit by exception).

  • Malware Prevention: Software to detect and stop malware should be deployed throughout the organization. Malware protection should be deployed at the host level (e.g., server and workstation operating systems), the application server level (e.g., email server, web proxies), and the application client level (e.g., email clients, instant messaging clients). The CMS standard requires that malicious code protection mechanisms are implemented as follows:
    • Desktops: Malicious code scanning software is configured to perform critical system file scans no less often than once every twelve (12) hours and full system scans no less often than once every seventy-two (72) hours.
    • Servers (to include databases and applications): Malicious code scanning software is configured to perform critical system file scans no less often than once every twelve (12) hours and full system scans no less often than once every seventy-two (72) hours.

In addition, malicious code protection mechanisms should be updated whenever new releases are available in accordance with CMS configuration management policy and procedures. Antivirus definitions should be updated in near-real-time. Malicious code protection mechanisms should be configured to lock and quarantine malicious code and send alerts to administrators in response to malicious code detection.

  • User Awareness and Training: Users should be made aware of policies and procedures regarding appropriate use of networks, systems, and applications as well as the policy and procedures for safeguarding data that is not in digital form (e.g., PII in paper form). Applicable lessons learned from previous incidents should also be shared with users to evaluate how actions taken by the user could affect the organization. Improving user awareness regarding incidents should reduce the frequency of incidents. IT staff should be trained to maintain networks, systems, and applications in accordance with the organization’s security standards. All users should be trained to protect printed hard/paper copies of data, including PII.

The CMS standard requires all general users receive security and privacy awareness training annually. The incident response training is incorporated into the annual Information Systems Security and Privacy Awareness Training. All EUA users must take the CBT Training located at CMS Information Technology Security and Privacy web page. The training must be delivered to all EUA users initially prior to account issuance and annually thereafter.

  • Maintain Inventory: Maintain an accurate inventory of information system components identifying those components that store, transmit, and/or process PII. An accurate inventory facilitates the implementation of the appropriate information security and privacy controls and is critical to preventing, detecting and responding to information security incidents.
Step 3: Ensure that the preparation and prevention techniques listed in Steps 1 and 2 above have been incorporated into the incident response plan for the information system and exercised at least annually. Review Incident Response Plan for details on developing the incident response plan and Incident Response Testing for details on incident response testing.
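
Two of the CMS standards quoted in Step 2 are concrete enough to sketch as checks: the "deny all, permit by exception" rule for managed interfaces, and the scan frequencies for malicious code protection (critical system file scans at least every 12 hours, full system scans at least every 72 hours). The code below is a minimal illustration, assuming a simplified rule model. `AllowRule`, `is_permitted`, and `scan_findings` are hypothetical names, not part of any CMS tooling or firewall product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from ipaddress import ip_address, ip_network
from typing import Iterable, List

CRITICAL_SCAN_MAX_AGE = timedelta(hours=12)  # critical system file scans
FULL_SCAN_MAX_AGE = timedelta(hours=72)      # full system scans


@dataclass(frozen=True)
class AllowRule:
    """Hypothetical exception that permits one kind of traffic."""
    source: str      # source network in CIDR notation
    dest_port: int   # destination port
    protocol: str    # e.g. "tcp" or "udp"


def is_permitted(rules: Iterable[AllowRule], src_ip: str,
                 dest_port: int, protocol: str) -> bool:
    """Permit traffic only if an explicit rule matches; deny otherwise."""
    addr = ip_address(src_ip)
    for rule in rules:
        if (addr in ip_network(rule.source)
                and dest_port == rule.dest_port
                and protocol.lower() == rule.protocol.lower()):
            return True
    return False  # default deny: nothing matched


def scan_findings(last_critical_scan: datetime, last_full_scan: datetime,
                  now: datetime) -> List[str]:
    """Return findings for scans older than the quoted CMS frequencies."""
    findings = []
    if now - last_critical_scan > CRITICAL_SCAN_MAX_AGE:
        findings.append("critical system file scan older than 12 hours")
    if now - last_full_scan > FULL_SCAN_MAX_AGE:
        findings.append("full system scan older than 72 hours")
    return findings
```

The key design point in `is_permitted` is that the fall-through result is a denial, so an empty or incomplete rule set blocks traffic rather than allowing it.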

Detection and Analysis

Step 1: Prepare for Common Attack Vectors. The attack vectors listed below are not intended to provide definitive classification for incidents, but rather to list common methods of attack, which can be used as a basis for detection:

  • External/Removable Media: An attack executed from removable media or a peripheral device, for example, malicious code spreading onto a system from an infected universal serial bus (USB) flash drive.
  • Attrition: An attack that employs brute force methods to compromise, degrade, or destroy systems, networks, or services (e.g., a Distributed Denial of Service (DDoS) intended to impair or deny access to a service or application; or a brute force attack against an authentication mechanism, such as passwords, CAPTCHAS, or digital signatures).
  • Web: An attack executed from a website or web-based application; for example, a cross-site scripting attack used to steal credentials or a redirect to a site that exploits a browser vulnerability and installs malware.
  • Email: An attack executed via an email message or attachment; for example, exploit code disguised as an attached document or a link to a malicious website in the body of an email message.
  • Impersonation: An attack involving replacement of something benign with something malicious; for example, spoofing, man-in-the-middle attacks, rogue wireless access points, and structured query language (SQL) injection attacks all involve impersonation.
  • Improper Usage: Any incident resulting from violation of an organization’s acceptable usage policies by an authorized user, excluding the above categories; for example, a user installs file sharing software, leading to the loss of sensitive data; or a user performs illegal activities on a system.
Step 2: Recognize the Signs of an Incident. Signs of an incident fall into one of two categories: precursors and indicators. A precursor is a sign that an incident may occur in the future. An indicator is a sign that an incident may have occurred or may be occurring now. Precursors and indicators are identified using many different sources, with the most common being computer security software alerts, logs, publicly available information, and people. The table below, taken from NIST SP 800-61 Rev. 2, lists common sources of precursors and indicators for each category.

Table 3: Common Sources of Precursors and Indicators

Alerts

Source | Description
IDPSs | Intrusion Detection and Prevention Systems (IDPS) products identify suspicious events and record pertinent data regarding them, including the date and time the attack was detected, the type of attack, the source and destination IP addresses, and the username (if applicable and known). Most IDPS products use attack signatures to identify malicious activity; the signatures must be kept up to date so that the newest attacks can be detected. IDPS software often produces false positives: alerts that indicate malicious activity is occurring when in fact there has been none. Analysts should manually validate IDPS alerts either by closely reviewing the recorded supporting data or by getting related data from other sources.
SIEMs | Security Information and Event Management (SIEM) products are similar to IDPS products, and can generate alerts based on analysis of log data.
Antivirus and anti-spam software | Antivirus software detects various forms of malware, generates alerts, and prevents the malware from infecting hosts. Current antivirus products are effective at stopping many instances of malware if signatures are kept up to date. Anti-spam software is used to detect spam and prevent it from reaching users’ mailboxes. Spam may contain malware, phishing attacks, and other malicious content, so alerts from anti-spam software may indicate attack attempts.
File integrity checking software | File integrity checking software can detect changes made to important files during incidents. It uses a hashing algorithm to obtain a cryptographic checksum for each designated file. If the file is altered and the checksum is recalculated, an extremely high probability exists that the new checksum will not match the old checksum. By regularly recalculating checksums and comparing them with previous values, changes to files can be detected.
Third-party monitoring services | Third parties offer a variety of subscription-based and free monitoring services. An example is fraud detection services that will notify an organization if its IP addresses, domain names, etc. are associated with current incident activity involving other organizations. There are also free real-time deny lists with similar information. Another example of a third-party monitoring service is a CSIRC notification list; these lists are often available only to other incident response teams.

Logs

Source | Description
Operating system, service and application logs | Logs from operating systems, services, and applications (particularly audit-related data) are frequently of great value when an incident occurs, such as recording which accounts were accessed and what actions were performed. Organizations should require a baseline level of logging on all systems and a higher baseline level on critical systems. Logs can be used for analysis by correlating event information. Depending on the event information, an alert can be generated to indicate an incident.
Network device logs | Logs from network devices such as firewalls and routers are not typically a primary source of precursors or indicators. Although these devices are usually configured to log blocked connection attempts, little information is provided about the nature of the activity. Still, the devices can be valuable in identifying network trends and in correlating events detected by other devices.
Network flows | A network flow is a particular communication session occurring between hosts. Routers and other networking devices can provide network flow information, which can be used to find anomalous network activity caused by malware, data exfiltration, and other malicious acts. There are many standards for flow data formats, including NetFlow, sFlow, and IPFIX.

Publicly Available Information

Source | Description
Information on new vulnerabilities and exploits | Keeping up with new vulnerabilities and exploits can prevent some incidents from occurring and assist in detecting and analyzing new attacks. The National Vulnerability Database (NVD) contains information on vulnerabilities. Organizations such as US-CERT and CERT®/CC periodically provide threat update information through briefings, web postings, and mailing lists.

People

Source | Description
People from within the organization | Users, system administrators, network administrators, security staff, and others within the organization may report signs of incidents. It is important to validate all such reports. One approach is to ask people who provide such information how confident they are in its accuracy. Recording this estimate along with the information provided can help considerably during incident analysis, particularly when conflicting data is discovered.
People from other organizations | Reports of incidents that originate externally should be taken seriously. For example, the organization might be contacted by a party claiming a system at the organization is attacking the other party’s systems. External users may also report other indicators, such as a defaced web page or an unavailable service. Other incident response teams also may report incidents. It is important to have mechanisms in place for external parties to report indicators and for trained staff to monitor those mechanisms carefully; this may be as simple as setting up a phone number and email address, configured to forward messages to the help desk.
Step 3: Report and analyze the incident. Report the incident using the procedures outlined in Section 3.5 Incident Reporting. Once reported, the IMT and frontline IR responders analyze the incident. The following are recommendations taken from NIST-SP 800-61 Rev. 2 Computer Security Incident Handling Guide for making incident analysis easier and more effective:

  • Profile Networks and Systems: Profiling is measuring the characteristics of expected activity so that changes to it can be more easily identified. Examples of profiling are running file integrity checking software on hosts to derive checksums for critical files and monitoring network bandwidth usage to determine what the average and peak usage levels are on various days and times. In practice, it is difficult to detect incidents accurately using most profiling techniques; organizations should use profiling as one of several detection and analysis techniques.
  • Understand Normal Behaviors: Incident response team members should study networks, systems, and applications to understand what the normal behavior is so that abnormal behavior can be recognized more easily. No incident handler will have a comprehensive knowledge of all behavior throughout the environment, but handlers should know which experts could fill in the gaps. One way to gain this knowledge is through reviewing log entries and security alerts. This may be tedious if filtering is not used to condense the logs to a reasonable size.  As handlers become more familiar with the logs and alerts, handlers should be able to focus on unexplained entries, which are usually more important to investigate. Conducting frequent log reviews should keep the knowledge fresh, and the analyst should be able to notice trends and changes over time. The reviews also give the analyst an indication of the reliability of each source.
  • Create a Log Retention Policy: Information regarding an incident may be recorded in several places, such as firewall, IDPS, and application logs. Creating and implementing a log retention policy that specifies how long log data should be maintained may be extremely helpful in analysis because older log entries may show reconnaissance activity or previous instances of similar attacks. Another reason for retaining logs is that incidents may not be discovered until days, weeks, or even months later. The length of time to maintain log data is dependent on several factors, including the organization’s data retention policies and the volume of data. See NIST SP 800-92, Guide to Computer Security Log Management for additional recommendations related to logging.
  • Perform Event Correlation: Evidence of an incident may be captured in several logs that each contain different types of data. A firewall log may have the source IP address that was used, whereas an application log may contain a username. A network IDPS may detect that an attack was launched against a particular host, but it may not know if the attack was successful. The analyst may need to examine the host’s logs to determine that information.

Correlating events among multiple indicator sources can be invaluable in validating whether a particular incident occurred.

  • Keep All Host Clocks Synchronized: Protocols such as the Network Time Protocol (NTP) synchronize clocks among hosts. Event correlation will be more complicated if the devices reporting events have inconsistent clock settings. From an evidentiary standpoint, it is preferable to have consistent timestamps in logs, for example, to have three logs that show an attack occurred at 12:07:01 a.m., rather than logs that list the attack as occurring at 12:07:01, 12:10:35, and 11:07:06.
  • Maintain and Use a Knowledge Base of Information: The knowledge base should include information that handlers need for referencing quickly during incident analysis. Although it is possible to build a knowledge base with a complex structure, a simple approach can be effective. Text documents, spreadsheets, and relatively simple databases provide effective, flexible, and searchable mechanisms for sharing data among team members. The knowledge base should also contain a variety of information, including explanations of the significance and validity of precursors and indicators, such as IDPS alerts, operating system log entries, and application error codes.
  • Use Internet Search Engines for Research: Internet search engines can help analysts find information on unusual activity. For example, an analyst may see some unusual connection attempts targeting TCP port 22912. Performing a search on the terms “TCP,” “port,” and “22912” may return some hits that contain logs of similar activity or even an explanation of the significance of the port number. Note that separate workstations should be used for research to minimize the risk to the organization from conducting these searches.
  • Run Packet Sniffers to Collect Additional Data: Sometimes the indicators do not record enough detail to permit the handler to understand what is occurring. If an incident is occurring over a network, the fastest way to collect the necessary data may be to have a packet sniffer capture the network traffic. Configuring the sniffer to record traffic that matches specified criteria should keep the volume of data manageable and minimize the inadvertent capture of other information. Because of privacy concerns, some organizations may require incident handlers to request and receive permission before using packet sniffers.
  • Filter the Data: There is simply not enough time to review and analyze all the indicators; at minimum, the most suspicious activity should be investigated. One effective strategy is to filter out categories of indicators that tend to be insignificant. Another filtering strategy is to show only the categories of indicators that are of the highest significance; however, this approach carries substantial risk because new malicious activity may not fall into one of the chosen indicator categories.
  • Seek Assistance from Others: Occasionally, the team will be unable to determine the full cause and nature of an incident. If the team lacks sufficient information to contain and eradicate the incident, then it should consult with internal resources (e.g., information security staff) and external resources (e.g., US-CERT, other CSIRTs (Computer Security Incident Response Teams), contractors with incident response expertise). It is important to accurately determine the cause of each incident so that it can be fully contained.
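
Two of the recommendations above, event correlation and clock synchronization, can be illustrated together in a short Python sketch. It assumes each event is a dictionary with hypothetical `source`, `host`, and `time` fields, and that any clock offset per source is already known; none of these names come from NIST SP 800-61 or from CMS tooling.

```python
from datetime import datetime, timedelta


def normalize(events, clock_offsets):
    """Subtract each source's known clock offset so timestamps are comparable."""
    adjusted = []
    for event in events:
        offset = clock_offsets.get(event["source"], timedelta(0))
        adjusted.append({**event, "time": event["time"] - offset})
    return adjusted


def correlate(events, window=timedelta(seconds=60)):
    """Group events for the same host that fall within `window` of the first
    event in the group, keeping only groups seen by more than one source."""
    groups = []
    current = []
    for event in sorted(events, key=lambda e: (e["host"], e["time"])):
        same_group = (
            current
            and event["host"] == current[0]["host"]
            and event["time"] - current[0]["time"] <= window
        )
        if not same_group:
            if current:
                groups.append(current)
            current = []
        current.append(event)
    if current:
        groups.append(current)
    return [g for g in groups if len({e["source"] for e in g}) > 1]
```

With unsynchronized clocks, related entries can fall outside the correlation window and be missed, which is why consistent timestamps make correlation simpler.
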
Step 4: Continue to document updates to the incident in the Incident Response Reporting Template form.
Step 5: Prioritize the incident using the criteria found in the “Impact Category, Attack Vector Descriptions, & Attribute Category” document of the Incident Response Reporting document, which is located in the ISPL.

Establish communication method and notify the appropriate CMS personnel. The Incident Notification Table located in the Incident Response Steps for CISO (Appendix A) is a guide on notification steps per incident type. The list below provides examples of individuals that may require notification in the event of an incident:

  • CIO
  • CISO
  • Deputy CISO
  • SOP
  • HHS Office of the Inspector General (OIG)
  • Local information response team within the organization
  • External incident response team (if appropriate)
  • System Owner
  • Information System Security Officer
  • System Business Owner
  • System Cyber Risk Advisor
  • CMS Office of Human Capital (for cases involving employees, such as harassment through email)
  • CMS Office of Financial Management (in the case where extra funding is needed for investigation activities)
  • CMS Office of Communications (for incidents that may generate publicity)
  • CMS Office of Legislation (for incidents with potential legal ramifications)
  • US-CERT (required for Federal agencies and systems operated on behalf of the Federal government).
  • Individual (whose PII has been compromised)

The below table documents the responsibilities that should be fulfilled by employees in certain roles during an incident response event:

Role | Responsibility
CISO
  • Lead the investigation and resolution of information security and privacy incidents and breaches across CMS.
  • Once an incident has been validated, the incumbent CISO will follow the steps in the CISO Playbook which is attached as Appendix A. This playbook details the CISO’s responsibilities, the scenarios to be considered and the relevant incident response contacts during an event.
IMT Lead
  • Notify and deliver incident situation reports to CMS CISO.
  • Coordinate Incident Response activities
Senior Official for Privacy (SOP)
  • Coordinate/Support incident response activities with CISO.
  • In the event of a PII/PHI breach, coordinate with the system Business Owner and HHS PIRT to handle notifying affected individuals
  • Provide overall direction for incident handling which includes all incidents involving PII/PHI.
Business Owner
  • Works with IMT Lead to coordinate incident response activities related to their assigned CMS information systems.
  • In the event of a PII/PHI breach, coordinate with the Senior Official for Privacy and HHS PIRT to handle notifying affected individuals
CMS IT Service Desk
  • Notify IMT of incident situation
  • Ensure Incident Response form has been completed as accurately as possible at the time of the initial report.
Designated Appointee
  • Update the ServiceNow ticket as the situation evolves and follow up with the CMS IT Helpdesk until the incident has been resolved.

Containment, Eradication and Recovery

Step 1: Choose a containment strategy. The containment strategy is determined based on the type of the incident (e.g., disconnect system from the network, or disable certain functions). Frontline incident responders should work with the IMT to select an appropriate containment strategy.

Step 2: Gather and handle evidence. The CCIC Forensic, Malware and Analysis Team (FMAT) maintains the criteria for evidence collection and a procedure to ensure a chain of custody. The IMT will coordinate with the FMAT to provide incident responders with assistance to collect and handle evidence.

Step 3: Identify the attacking host. The following items taken from NIST-SP 800-61 Rev. 2 Computer Security Incident Handling Guide describe the most commonly performed activities for attacking host identification:

  • Validating the Attacking Host’s IP Address: New incident handlers often focus on the attacking host’s IP address. The handler may attempt to validate that the address was not spoofed by verifying connectivity to it; however, this simply indicates that a host at that address does or does not respond to the requests. A failure to respond does not mean the address is not real; for example, a host may be configured to ignore pings and traceroutes. Also, the attacker may have received a dynamic address that has already been reassigned to someone else.
  • Researching the Attacking Host through Search Engines: Performing an Internet search using the apparent source IP address of an attack may lead to more information on the attack, such as a mailing list message regarding a similar attack.
  • Using Incident Databases: Several groups collect and consolidate incident data from various organizations into incident databases. This information sharing may take place in many forms, such as trackers and real-time deny lists. The organization can also check its own knowledge base or issue tracking system for related activity.
  • Monitoring Possible Attacker Communication Channels: Incident handlers can monitor communication channels that may be used by an attacking host. For example, many bots use IRC as the primary means of communication. Also, attackers may congregate on certain IRC channels to brag about compromises and share information. However, incident handlers should treat any such information acquired only as a potential lead, not as fact.

Step 4: Eradicate the incident and recover. Eliminate components of the incident (e.g., delete malware, disable breached accounts, identify and mitigate vulnerabilities that were exploited). Incident responders should coordinate with the IMT to identify and execute a strategy for eradication of the incident. Once eradication has been completed, restore systems to normal operation, confirm that systems are functioning normally, and remediate vulnerabilities to prevent similar incidents.

Post-Incident Activity

Step 1: Conduct a lessons learned meeting. Learning and improving is one of the most important parts of incident response, and it is also the part most often omitted. Each incident response team should evolve to reflect new threats, improved technology, and lessons learned. Holding a “lessons learned” meeting with all involved parties after a major incident, and optionally periodically after lesser incidents as resources permit, can be extremely helpful in improving security measures and the incident handling process itself. Multiple incidents can be covered in a single lessons learned meeting. This meeting provides a chance to achieve closure with respect to an incident by reviewing what occurred, what was done to intervene, and how well intervention worked. The meeting should be held within several days of the end of the incident. Questions to be answered in the meeting include:

  • Exactly what happened, and at what times?
  • How well did staff and management perform in dealing with the incident? Were the documented procedures followed and adequate?
  • What information was needed sooner?
  • Were any steps or actions taken that might have inhibited the recovery?
  • What would the staff and management do differently the next time a similar incident occurs?
  • How could information sharing with other organizations have been improved?
  • What corrective actions can prevent similar incidents in the future?
  • What precursors or indicators should be watched for in the future to detect similar incidents?
  • What additional tools or resources are needed to detect, analyze, and mitigate future incidents?

Step 2: Document the lessons learned and update the IRP and associated procedures as necessary.

Step 3: Ensure evidence is retained and archived. The criteria for evidence collection, a procedure to ensure a chain of custody, and archival instructions are maintained by the CCIC Forensic, Malware and Analysis Team (FMAT). The IMT will coordinate with the FMAT to provide incident responders with assistance to collect and handle evidence.

Automated Incident Handling Processes (IR-04(01))

The purpose of this control is to ensure that CMS employs automated mechanisms (e.g., online incident management systems) to support the organization’s incident handling process. The following table provides examples of tools used for automated incident handling processes at CMS.

Table 4: Automated Tools

Tools | Description | Users
HHS RSA Archer | The HHS tool used for all incident tracking and reporting. Users do not access HHS Archer directly. | CCIC IMT and CCIC SOC Analysts
ServiceNow | The CMS ServiceNow ticket is used by the CMS IT Service Desk to track changes and problems within the CMS environment. | CMS IT Service Desk; CCIC IMT and CCIC SOC Analysts; CMS Users
Splunk | A logging solution for security (CMS Enterprise Security) and Operations and Maintenance (O&M) log management (OCISO Systems Security Management (OSSM)). It is used as an audit reduction tool by the agency to review audit logs. | CCIC

Information Correlation (IR-04(04))

The purpose of Information Correlation is to ensure that CMS correlates incident information and individual incident responses to achieve an organization-wide perspective on incident awareness and response. To achieve this,

  1. All tickets submitted in ServiceNow are thoroughly worked through to determine whether they should be classified as incidents. The submitted tickets are correlated and analyzed for trends.
  2. CCIC uses the SIEM tool, Splunk, to correlate data from various sources and to receive alerts associated with incidents and breaches.
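
To illustrate what correlating tickets for trends could look like, the sketch below counts tickets per ISO week and category and reports categories that rose week over week. The ticket fields (`opened`, `category`) and category names are hypothetical and do not reflect the actual ServiceNow or Splunk data model.

```python
from collections import Counter
from datetime import date


def weekly_counts(tickets):
    """Count tickets per (ISO year, ISO week, category)."""
    counts = Counter()
    for ticket in tickets:
        year, week, _ = ticket["opened"].isocalendar()
        counts[(year, week, ticket["category"])] += 1
    return counts


def rising_categories(tickets, this_week, last_week):
    """Return categories with more tickets in `this_week` than in `last_week`.

    Both weeks are (ISO year, ISO week) tuples.
    """
    counts = weekly_counts(tickets)
    categories = {category for (_, _, category) in counts}
    return sorted(
        category
        for category in categories
        if counts[(*this_week, category)] > counts[(*last_week, category)]
    )
```

For example, `rising_categories(tickets, (2024, 2), (2024, 1))` compares the second ISO week of 2024 with the first.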

Incident Monitoring (IR-05)

The purpose of Incident Monitoring is to ensure that CMS documents information system security incidents and maintains records about each incident, such as the status of the incident and pertinent information necessary for forensics (evaluating incident details, trends, and handling). At CMS, the CCIC delivers a number of important, agency-wide security services. One such service is Continuous Diagnostics and Mitigation (CDM), which is still in development; not all data centers have been transitioned. Other services include vulnerability management, security engineering, incident management, forensics and malware analysis, information sharing, cyber-threat intelligence, penetration testing, and software assurance.

The IMT is the group responsible for tracking and documenting security and privacy incidents. Stakeholders outside of the IMT (e.g., incident responders, ISSO, system owners, etc.) are responsible for providing the information necessary to track and monitor information security and privacy incidents.

Automated Tracking/Data Collection/Analysis (IR-05(01))

The purpose of Automated Tracking/Data Collection/Analysis is to ensure that CMS employs automated mechanisms to assist in the tracking of security incidents and in the collection and analysis of incident information. At CMS, the RSA Archer/CFACTS SecOps Module is utilized for tracking potential incidents under investigation by the CCIC SOC. The IMT is responsible for maintaining the data in RSA Archer/CFACTS along with reviewing, updating, and analyzing the data and producing the trend analysis.

The following list details automated tools utilized at CMS to assist in the tracking of security incidents and in the collection and analysis of incident information. Once an incident has been reported, the external stakeholders will be able to leverage the benefits of these tools via the support provided by the IMT.

  • CMS uses a ServiceNow ticketing system to track and report all privacy and security incidents.
  • The CMS ServiceNow ticket is used by the CMS IT Service Desk to track changes and problems within the CMS environment.
  • HHS Archer is the incident response tool used to notify HHS of an incident. A shell ticket is automatically created in HHS Archer when CMS IMT is assigned a ticket in ServiceNow.
  • The CCIC IMT updates the incident information in ServiceNow, which posts automatically to HHS Archer. This continues until the incident has been resolved.
  • CMS RSA Archer/CFACTS SecOps Module is used for investigating potential incidents discovered by the CCIC SOC.

Incident Reporting (IR-06)

The intent of this control is to ensure that CMS requires employees and contractors to report suspected or confirmed information security and privacy incidents to appropriate authorities and to ensure that a formal incident reporting process exists.

As part of a robust, enterprise security operations program designed to reduce the risks of malicious activity, CMS established the CCIC to provide enterprise-wide situational awareness and near real-time risk management. The CCIC also provides information security and aggregated monitoring of security events across all CMS information systems. Finally, the CCIC notifies appropriate security operations staff of detected configuration weaknesses, vulnerabilities open to exploitation, relevant threat intelligence, including indicators of compromise (IOCs) and security patches. For purposes of incident response, the IMT, as a subcomponent of the CCIC, provides incident response assistance and support. All information security and privacy incidents are to be reported to the CMS IT Service Helpdesk. The CMS IT Service Helpdesk will notify the IMT as appropriate.

The table below outlines the CMS organizationally defined parameters for IR reporting.

Table 5: CMS Defined Parameters – Control IR-6

Control: IR-6

Control Requirement:
  1. Requires personnel to report suspected security incidents to the organizational incident response capability within [Assignment: organization-defined time period]
  2. Reports security, privacy and supply chain incident information to [Assignment: organization-defined authorities]

CMS Parameter:
The organization:
  1. Requires personnel to report actual or suspected security and privacy incidents to the organizational incident response capability within 1 hour of discovery/notification; and
  2. Reports security, privacy and supply chain incident information to the CMS IT Service Help Desk.

The following process details the CMS procedure for reporting suspected security and privacy incidents:

Step 1: Report the suspected information security and privacy incident to the CMS IT Service Desk at (410) 786-2580 (internal only) or (800) 562-1963 (internal and external) and/or email CMS_IT_Service@cms.hhs.gov. Additionally, contact your ISSO as soon as possible and apprise them of the situation. All suspected information security and privacy incidents must be reported to the CMS IT Service Desk within one hour of discovery.

Step 2: After notifying the CMS IT Service Desk, collect as much supporting information as possible on the suspected security and privacy incident using the Incident Response Reporting Template located in the ISPL. Provide the information contained on the completed incident reporting form to the CMS IT Service Desk.

Note: This template replaces the previous HHS CMS Computer Security Incident Report form that was published separately to the information security library.

Step 3: The CMS IT Service Desk creates a ServiceNow ticket and enters the details on the suspected security and privacy incident. This ServiceNow ticket creates a shell ticket in HHS Archer, which is the HHS incident response tool.

Step 4: The IMT will update the ServiceNow ticket, as necessary, which will automatically populate in HHS Archer until the incident has been resolved.

Step 5: The IMT analyzes the suspected incident, working with the SOC analyst as necessary, and if confirmed as an actual incident executes the incident handling procedures located in Section 3.5 Incident Handling.
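
The one-hour reporting requirement stated in Step 1 and in Table 5 amounts to a simple timestamp comparison. The sketch below is illustrative only; the function names and the `REPORTING_WINDOW` constant are hypothetical, and only the one-hour value comes from the CMS parameter above.

```python
from datetime import datetime, timedelta

# One hour from discovery, per the CMS parameter for IR-6.
REPORTING_WINDOW = timedelta(hours=1)


def reporting_deadline(discovered_at: datetime) -> datetime:
    """Return the latest time at which the report is still on time."""
    return discovered_at + REPORTING_WINDOW


def reported_on_time(discovered_at: datetime, reported_at: datetime) -> bool:
    """True if the report reached the CMS IT Service Desk within the window."""
    return reported_at <= reporting_deadline(discovered_at)
```

For an incident discovered at 9:15 a.m., the deadline computed this way is 10:15 a.m. the same day.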

Automated Reporting (IR-06(01))

The purpose of Automated Reporting is to ensure that CMS employs automated mechanisms to assist in the reporting of security and privacy incidents. The following steps detail the CMS specific process for Automated Reporting:

  • Step 1: User will contact the CMS IT Service Helpdesk and report the information security incident.
  • Step 2: The CMS IT Service Helpdesk will open a ServiceNow ticket and record the incident. This ServiceNow ticket automatically generates an Archer ticket notifying HHS CSIRC.
  • Step 3: The CMS IT Service Helpdesk will then assign the ticket to the IMT and they will evaluate the incident report while providing updates to CMS CISO and HHS CSIRC.
  • Step 4: The user (reporter) will continue to update the incident report in ServiceNow or contact the CMS IT Service Helpdesk.
  • Step 5: If the IMT finds that the event is valid, the user will be contacted and the mitigation process will start.
  • Step 6: If the IMT finds that the event is not valid, the IMT will close out the ticket and contact the user.
  • Step 7: The user (reporter) will work with the IMT until remediation of the security incident.
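
The reporting steps above can be read as a small ticket workflow in which every update is mirrored to a second system. The sketch below models that flow entirely in memory. It does not call ServiceNow or HHS Archer; the `Ticket` class, its `mirror` list (a stand-in for the Archer shell ticket), and the `Status` values are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    OPEN = "open"
    ASSIGNED_TO_IMT = "assigned to IMT"
    MITIGATION = "mitigation in progress"
    CLOSED = "closed"


@dataclass
class Ticket:
    summary: str
    status: Status = Status.OPEN
    updates: list = field(default_factory=list)
    mirror: list = field(default_factory=list)  # stand-in for the mirrored shell ticket

    def add_update(self, note: str) -> None:
        """Record an update and post the same note to the mirror."""
        self.updates.append(note)
        self.mirror.append(note)

    def assign_to_imt(self) -> None:
        """Step 3: the Service Desk assigns the ticket to the IMT."""
        self.status = Status.ASSIGNED_TO_IMT
        self.add_update("Assigned to IMT")

    def evaluate(self, is_valid: bool) -> None:
        """Steps 5 and 6: the IMT validates the event or closes the ticket."""
        if self.status is not Status.ASSIGNED_TO_IMT:
            raise ValueError("ticket must be assigned to the IMT before evaluation")
        if is_valid:
            self.status = Status.MITIGATION
            self.add_update("Validated; mitigation started")
        else:
            self.status = Status.CLOSED
            self.add_update("Not a valid incident; ticket closed")
```

Routing every update through `add_update` keeps the two records identical, which is the property the automated reporting process relies on.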

Incident Response Assistance (IR-07)

The purpose of Incident Response Assistance is to ensure that CMS provides an incident response support resource, integral to the CMS incident response capability, that offers advice and assistance to users of the information system for handling and reporting of security and privacy incidents. The following steps detail the CMS specific process for Incident Response assistance:

  • Step 1: User will contact the CMS IT Service Helpdesk for incident response assistance. The CMS IT Service Desk notifies the IMT as appropriate.
  • Step 2: The IMT will evaluate and validate the incident and assist with the mitigation.

Automation Support for Availability of Information/Support (IR-07(01))

The purpose of Automation Support for Availability of Information/Support is to ensure that CMS employs automated mechanisms to increase the availability of incident response-related information and support.

CMS uses multiple resources to provide the user community information/support. These include but are not limited to intranets, mailboxes, and online libraries.

Users may use the following resources for Automation Support for Availability of Information/Support:

Incident Response Plan (IR-08)

The purpose of the Incident Response Plan (IRP) is to provide a roadmap for implementing the incident response capability. Each organization needs a plan that meets its unique requirements, which relates to the organization’s mission, size, structure, and functions. The plan should lay out the necessary resources and management support. The incident response plan should include the following elements:

  • Purpose
  • Scope
  • Definitions
  • Roles and Responsibilities
  • Understanding an Incident
  • Incident Life Cycle
    • Preparation
    • Detection and Analysis
    • Containment, Eradication and Recovery
    • Post-Incident Activity
  • Reporting Requirements
  • Points of Contact

The incident response policy is established in the CMS IS2P2 and has been included in this handbook. The Incident Response Plan template is attached to this document as Appendix B. This document provides incident response procedures to facilitate the implementation of incident response controls. Creating the incident response plan, policy, and procedures is an important part of establishing a team: it permits incident response to be performed effectively, efficiently, and consistently, and it empowers the team to do what needs to be done.

The table below outlines the CMS organizationally defined parameters for IR planning.

Table 6: CMS Defined Parameters - Control IR-8

Control: IR-8

Control Requirement:

a. Incident Response Plan is reviewed and approved by [Assignment: organization-defined personnel or role];

b. Distributes copies of the incident response plan to [Assignment: organization-defined incident response personnel (identified by name and/or role) and organizational elements]

c. Updates the incident response plan to address system/organizational changes or problems encountered during plan implementation, execution, or testing;

d. Communicates incident response plan changes to [Assignment: organization-defined incident response personnel (identified by name and/or by role) and organizational elements]; and protects the incident response plan from unauthorized disclosure and modification

CMS Parameter:

a. Incident Response Plan is reviewed and approved by the applicable Business Owner at least annually.

b. Distributes copies of the incident response plan to CMS CIO, CMS CISO, ISSO, CMS OIG Computer Crime Unit (CCU), all personnel within the CMS Incident Response Team, PII Breach Response Team and Operations Centers.

c. Reviewed annually and updated as required.

d. Communicates incident response plan changes to all stakeholders.

The CCIC IMT created an IRP that provides CMS with a roadmap for implementing its incident response capability and outlines the incident response process for the IMT. In addition, each information system is responsible for maintaining a separate IRP that describes the system’s internal processes for incident response and leverages the capability of the IMT. The following steps detail the process for creating an IRP using the template located in the ISPL:

  • Step 1: Complete a draft IRP by leveraging the template and instructions located in Appendix B.
  • Step 2: Submit the draft IRP to the information system’s assigned CRA for ISPG approval. Update that plan as necessary based on the feedback received from ISPG.
  • Step 3: Document the plan approval by having the Business Owner and ISSO sign the plan.
  • Step 4: Disseminate the plan to all appropriate stakeholders to include: the CRA, ISSO, BO, Incident Responders, System Developers, and System Administrators.

CMS Security & Privacy Incident Report Form

The CMS Security and Privacy Incident Report is a form to be filled out when someone has an incident to report. You can access the form and instructions here.

Incident Response Steps for CISO

1. Significant Event/Potential Incident Reported

Responsibilities
  • Receive notification from DCTSO Director or IR Fed Lead

Contacts: IMT, SOP, ISPG Directors

Considerations
  • Does this incident potentially include a criminal element and, therefore, require notification of law enforcement? If so, engage HHS Office of the Inspector General.
  • Was this incident reported to HHS Office of Civil Rights (OCR) in accordance with HIPAA and for Protected Health Information (PHI)? Refer to the OCR website for any details about the event / incident.

2. Obtain situational awareness of the potential incident and the likely impact(s) on CMS data and/or CMS FISMA systems.

Responsibilities
  • Receive incident situation reports from IMT

Contacts: IMT, SOP, ISPG Directors, System Owner, Data Guardian, ISSO, CRA

Considerations
  • When engaging an external partner, consider including or informing HHS Office of the Secretary (OS), Office of the Assistant Secretary for Preparedness and Response (ASPR), which executes the Federal coordination responsibilities on behalf of HHS regarding the critical infrastructure public-private partnership for the Healthcare and Public Health Sector (identified in PPD-21 and the National Infrastructure Protection Plan (NIPP)).

3. Conduct security bridge with stakeholders to review incident to obtain a greater understanding of the incident’s impacts and implications. Also, discuss potential response needs, such as deployment of response capabilities.

Responsibilities
  • Receive incident status reports from IMT
  • CISO/Deputy CISO will coordinate with IMT to ensure all stakeholders are on security bridge (e.g., SOP, OL, OA, HHS)

Contacts: IMT, SOP, OC, OL, OA, ISPG Directors, System Owner, Data Guardian, ISSO, CRA

Considerations
  • Does this incident potentially include a criminal element and, therefore, require notification of law enforcement? If so, engage HHS Office of the Inspector General.
  • Does CMS have relevant experience or capabilities that it could deploy?
Step 4: Triage and determine if risk analysis should be performed

Responsibilities
  • OC/OL will keep the response teams apprised of public or legislative affairs matters related to the event/incident (e.g., Congressional inquiries and media monitoring)
  • If communication of CMS risks or potential impacts is necessary, coordinate development of messaging and identify communication channels
  • Receive impact analysis and make a decision regarding additional analysis of impacts to CMS

Contacts: IMT, SOP, OC, OL, OA, ISPG Directors, System Owner, Data Guardian, ISSO, CRA

Considerations
Are there any event/incident facts or findings discovered to date that can or should be shared with ISAOs or interagency partners?
Step 5: Determine specific CMS impacts (e.g., PII, PHI, FTI, contracts, & other business partners) and determine specific impacts to CMS data (e.g., PII, PHI, FTI)

Responsibilities
  • Receive incident status reports from IMT
  • Provide guidance to IR staff about cadence of status reporting
  • Escalate incident to HHS leadership
  • When findings are presented, consider if public and/or external communication may be appropriate (even if it is not legally necessary or required)

Contacts: IMT, SOP, OC, OL, OA, HHS, ISPG Directors, System Owner, Data Guardian, ISSO, CRA

Considerations
  • In accordance with OMB M-20-04, report “major incidents” to Congress within seven days.
  • When evaluating impacts to CMS systems, engage business owners and system owners (including ISSOs) and include the impacts to their environments in status reports.
  • If sensitive information other than PII, PHI, or FTI (e.g., proprietary information) is at risk, consider the risk to the agency and determine appropriate next steps.
Step 6: Conduct a security bridge with stakeholders to review the incident and obtain a greater understanding of its impacts and implications. Also, discuss potential response needs, such as deployment of response capabilities.

Responsibilities
  • Receive incident status reports from IMT
  • CISO/Deputy CISO will likely lead the meeting(s)/call(s), with

Contacts: IMT, SOP, OC, OL, OA, HHS, ISPG Directors, System Owner, Data Guardian, ISSO, CRA

Considerations
Are there any event/incident facts or findings discovered to date that can or should be shared with ISAOs or interagency partners?
Step 7: Execute SOPs to contain and eradicate cause of the event/incident

Responsibilities
  • Receive incident status reports from IMT and provide additional guidance/direction as necessary

Contacts: IMT, SOP, OC, OL, OA, HHS, ISPG Directors, System Owner, Data Guardian, ISSO, CRA

Considerations
Does CMS have relevant experience or capabilities that it could deploy or offer to assist the external partner(s)?
Step 8: Monitor event/incident to assess changes in risk to CMS systems and/or data
  • If changes in risk to CMS systems and/or data are evident, go to Step 2A

Responsibilities
  • Receive incident status reports from IMT and provide counsel to leadership and response teams as appropriate
  • OC/OL: Determine if monitoring of media and Congressional sources is necessary, and communicate requests or news to leadership and response teams. Coordinate requests for information or messages that may need to be communicated externally

Contacts: IMT, SOP, OC, OL, OA, HHS, ISPG Directors, System Owner, Data Guardian, ISSO, CRA

Step 9: Develop lessons learned and recommend program enhancements

Responsibilities
  • Participate in IMT-led lessons learned development process and inform recommendations
  • Review lessons learned and submit to business & system owners
  • Review and support POA&Ms as required

Contacts: IMT, SOP, ISPG Directors, System Owner, Data Guardian, ISSO, CRA

Considerations
Determine if policy changes need to occur in order to further safeguard CMS data.
Step 10: Conclude incident and complete external communications activities

Responsibilities
  • Review final Security Incident Report (SIR)
  • Report closure of incident as appropriate/necessary

Contacts: IMT, SOP, ISPG Directors, System Owner, Data Guardian, ISSO, CRA

Considerations
Are there any event/incident facts or findings discovered to date that can or should be shared with ISAOs or interagency partners?

Contacts

Contact | Number
Incident Management Team (IMT) | 443-316-5005
Senior Official for Privacy (SOP) | 410-786-5759
DCTSO Director | 410-786-5956
DSPC Director | 410-786-6918
DSPPG Director | 410-786-5759
Office of Communications (OC) | 410-786-8126
Office of Legislation (OL) | 202-619-0630
Office of the Administrator (OA) | 410-786-3000
HHS Office of the Secretary (OS), Office of the Assistant Secretary for Preparedness and Response (ASPR) | 202-205-8114
HHS Office of Inspector General (OIG) | 800-447-8477
Bridge | 877-267-1577 (meeting ID will be shared by IMT upon notification)

Incident Notification Template

Incident | Notification | Who Notifies?
All incidents | IMT; HHS CSIRC; CIO; CISO; SOP; Deputy CISO | CMS IT Service Desk notifies IMT of an incident. CMS incident tickets are mirrored in the HHS Archer, which notifies HHS CSIRC.
Incidents involving a CMS System | SO; BO; ISSO; DG; CRA; US-CERT | IMT alerts CMS Personnel. HHS CSIRC handles US-CERT reporting.
Incidents involving suspected criminal activity | HHS OIG | IMT
Incidents involving employees | CMS Office of Human Capital | IMT
Incidents involving legal ramifications | CMS Office of Legislation | IMT
Breaches | ISPG (to convene Breach Analysis Team); Individuals affected by PII/PHI compromise; HHS PIRT | IMT alerts ISPG of suspected breach. CMS SOP and BO create a notification plan for affected individuals, subject to review by HHS PIRT.
Breaches affecting 500 or more people | HHS OCR; Media outlets, as appropriate | CMS SOP
Breaches requiring Media Outreach | CMS Office of Communications | CMS SOP
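
The notification table above can be read as a set of additive rules: every incident notifies a base group, and each incident attribute adds more parties. The following is a minimal Python sketch of that reading, not an official CMS tool. The `Incident` class, its field names, and the `parties_to_notify` function are hypothetical names introduced here for illustration; the party names and the 500-person threshold are taken from the table. Whether an incident is a breach is an input, not something the code decides.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Incident:
    """Hypothetical description of an incident; field names are illustrative."""
    involves_cms_system: bool = False
    suspected_criminal_activity: bool = False
    involves_employees: bool = False
    legal_ramifications: bool = False
    is_breach: bool = False  # supplied by the caller, never inferred here
    individuals_affected: int = 0
    requires_media_outreach: bool = False


def parties_to_notify(incident: Incident) -> List[str]:
    """Return the parties listed in the Notification column for this incident."""
    # Row "All incidents" applies unconditionally.
    parties = ["IMT", "HHS CSIRC", "CIO", "CISO", "SOP", "Deputy CISO"]
    if incident.involves_cms_system:
        parties += ["SO", "BO", "ISSO", "DG", "CRA", "US-CERT"]
    if incident.suspected_criminal_activity:
        parties.append("HHS OIG")
    if incident.involves_employees:
        parties.append("CMS Office of Human Capital")
    if incident.legal_ramifications:
        parties.append("CMS Office of Legislation")
    if incident.is_breach:
        parties += ["ISPG", "Individuals affected by PII/PHI compromise", "HHS PIRT"]
        if incident.individuals_affected >= 500:
            parties += ["HHS OCR", "Media outlets, as appropriate"]
        if incident.requires_media_outreach:
            parties.append("CMS Office of Communications")
    return parties
```

For example, `parties_to_notify(Incident(is_breach=True, individuals_affected=750))` would include HHS OCR in addition to the base group, while the same breach with 100 affected individuals would not.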

Incident Response Plan Template

Purpose
The objective of this Incident Response Plan (IRP) is to outline the incident handling and response process for the <system name> in accordance with the requirements outlined in the CMS Acceptable Risk Safeguards (ARS) and CMS Risk Management Handbook (RMH) Chapter 8, Incident Response. This plan covers all assets within the information system boundary, transmitting, storing, or processing CMS information. Furthermore, this plan describes how to manage incident response according to all Federal, Departmental and Agency requirements, policies, directives, and guidelines.
Scope
This IRP is written for the <system name> stakeholders with incident response roles and responsibilities and describes those responsibilities for each phase of the incident life cycle. This plan establishes a quick reference for security and privacy incident handling and response.
Definitions

The following key terms and definitions relate to incident response:

Administrative Vulnerability: An administrative vulnerability is a security weakness caused by incorrect or inadequate implementation of a system’s existing security features by the system administrator, security officer, or users. An administrative vulnerability is not the result of a design deficiency. It is characterized by the fact that the full correction of the vulnerability is possible through a change in the implementation of the system or the establishment of a special administrative or security procedure for the system administrators and users. Poor passwords and inadequately maintained systems are the leading causes of this type of vulnerability.

Breach: A breach is an incident that poses a reasonable risk of harm to the applicable individuals. For the purposes of Office of Management and Budget (OMB) M-17-12 (for PII incidents) and Health Information Technology for Economic and Clinical Health (HITECH) Act (for PHI incidents) reporting requirements, a privacy incident does not rise to the level of a breach until it has been determined that the use or disclosure of the protected information compromises the security or privacy of the protected individual(s) and poses a reasonable risk of harm to the applicable individuals. For any CMS privacy incident, the determination of whether it may rise to the level of a breach is made (exclusively) by the CMS Breach Analysis Team (BAT), which determines whether the privacy incident poses a significant risk of financial, reputational, or other harm to the individual(s).

Event: An event is any observable occurrence in a system or network. Events include a user connecting to a file share, a server receiving a request for a web page, a user sending email, and a firewall blocking a connection attempt. Adverse events are events with a negative consequence, such as system crashes, packet floods, unauthorized use of system privileges, unauthorized access to sensitive data, and execution of malware that destroys data.

Federal Tax Information (FTI): Generally, Federal Tax Returns and return information are confidential, as required by Internal Revenue Code (IRC) Section 6103. Information used by the Internal Revenue Service (IRS) is considered FTI, and the IRS ensures that agencies, bodies, and commissions maintain appropriate safeguards to protect the confidentiality of the information [IRS 1075]. Tax return information that is not provided by the IRS falls under PII.

Incident Response: Incident response outlines steps for reporting incidents and lists actions to be taken to resolve information systems security and privacy related incidents. Handling an incident entails forming a team with the necessary technical capabilities to resolve an incident, engaging the appropriate personnel to aid in the resolution and reporting of such incidents to the proper authorities as required, and closing out the report after an incident has been resolved.

Privacy Incident: A Privacy Incident is a Security Incident that involves Personally Identifiable Information (PII), Protected Health Information (PHI), or Federal Tax Information (FTI) where there is a loss of control, compromise, unauthorized disclosure, unauthorized acquisition, unauthorized access, or any similar term referring to situations where persons other than authorized users, or for other than authorized purposes, have access or potential access to PII, PHI and/or FTI in usable form, whether physical or electronic.

Privacy incident scenarios include, but are not limited to:

  • Loss of federal, contractor, or personal electronic devices that store PII, PHI and/or FTI affiliated with CMS activities (e.g., laptops, cell phones that can store data, disks, thumb drives, flash drives, compact disks)
  • Loss of hard copy documents containing PII, PHI and/or FTI
  • Sharing paper or electronic documents containing PII, PHI and/or FTI with individuals who are not authorized to access it
  • Accessing paper or electronic documents containing PII, PHI and/or FTI without authorization or for reasons not related to job performance
  • Emailing or faxing documents containing PII, PHI and/or FTI to inappropriate recipients, whether intentionally or unintentionally
  • Posting PII, PHI and/or FTI, whether intentionally or unintentionally, to a public website
  • Mailing hard copy documents containing PII, PHI and/or FTI to the incorrect address
  • Leaving documents containing PII, PHI and/or FTI exposed in an area where individuals without approved access could read, copy, or move for future use

Security Incident: In accordance with NIST SP 800-61 Revision 2, Computer Security Incident Handling Guide, a Security Incident is defined as an event that meets one or more of the following criteria:

  • The successful unauthorized access, use, disclosure, modification, or destruction of information or interference with system operations in any information system processing information on behalf of CMS. It also means the loss of data through theft or device misplacement, loss or misplacement of hardcopy documents and misrouting of mail, all of which may have the potential to put CMS data at risk of unauthorized access, use, disclosure, modification, or destruction
  • An occurrence that jeopardizes the confidentiality, integrity, or availability of an information system or the information the system processes, stores, or transmits
  • A violation or imminent threat of violation of computer security policies, acceptable use policies, or standard security practices
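
The definitions above nest: an event becomes a Security Incident when it meets at least one of the three criteria, and a Security Incident becomes a Privacy Incident when PII, PHI, or FTI is involved. The sketch below illustrates that nesting in Python under stated assumptions; `EventAssessment`, its field names, and `classify` are hypothetical and not part of any CMS system. It deliberately never returns "breach", because that determination is made exclusively by the CMS Breach Analysis Team.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventAssessment:
    """Hypothetical analyst answers for one event; field names are illustrative."""
    unauthorized_access_or_loss: bool = False  # first criterion
    jeopardizes_cia: bool = False              # second criterion
    policy_violation_or_threat: bool = False   # third criterion
    involves_pii_phi_fti: bool = False


def classify(event: EventAssessment) -> str:
    """Map an assessed event onto the defined terms."""
    # A Security Incident meets one or more of the three criteria.
    is_security_incident = (
        event.unauthorized_access_or_loss
        or event.jeopardizes_cia
        or event.policy_violation_or_threat
    )
    if not is_security_incident:
        return "event"
    # A Privacy Incident is a Security Incident involving PII, PHI, or FTI.
    if event.involves_pii_phi_fti:
        return "privacy incident"
    return "security incident"
```

Note that PII involvement alone does not change the classification in this sketch: an event that meets none of the three criteria stays an "event".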

Technical Vulnerability: A technical vulnerability is a hardware, firmware, or software weakness or design deficiency that leaves a system open to potential exploitation, either externally or internally, thus increasing the risk of compromise, alteration of information, or denial of service.

Roles and Responsibilities

<Insert the roles and responsibilities associated with this plan. Possible roles include:

  • Business Owners
  • Information System Owner(s)
  • Cyber Risk Advisors (CRA)
  • Information System Security Officer (ISSO)
  • CCIC Incident Management Team (CCIC IMT)

For a detailed description of the responsibilities associated with these roles, please refer to the CMS IS2P2 located at: https://security.cms.gov/policy-guidance/cms-information-systems-security-privacy-policy-is2p2>

Understanding an Incident
The following lists a small subset of common, well-known incidents:
Types of Incidents
  • Data Destruction or Corruption: The loss of data integrity can take many forms, including changing permissions on files so that non-privileged users can write to them, deleting data files and/or programs, changing audit files to cover up an intrusion, changing configuration files that determine how and what data is stored, and ingesting information from other sources that may be corrupt
  • Data Compromise and Data Spills: Data compromise is the exposure of information to a person not authorized to access that information either through clearance level or formal authorization. This could happen when a person accesses a system they are not authorized to access or through a data spill. A data spill is the release of information to another system or person not authorized to access that information, even though the person is authorized to access the system on which the data was released. This can occur through the loss of control, improper storage, improper classification, or improper escorting of media, computer equipment (with memory), and computer-generated output
  • Malicious Software (Malware): Malicious code consists of software-based attacks used by crackers/hackers to gain privileges, capture passwords, and/or modify audit logs to exclude unauthorized activity. Malicious code is particularly troublesome in that it is typically written to mask its presence and, thus, is often difficult to detect. Self-replicating malicious code such as viruses and worms can replicate rapidly, thereby making containment an especially difficult problem. The following is a brief listing of various software attacks:
    1. Virus: It is propagated via a triggering mechanism (e.g., event time) with a mission (e.g., delete files, corrupt data, send data).
    2. Worm: An unwanted, self-replicating autonomous process (or set of processes) that penetrates computers using automated hacking techniques.
    3. Trojan Horse: A useful and innocent program containing additional hidden code that allows unauthorized computer network exploitation (CNE), falsification, or destruction of data.
    4. Spyware: Surreptitiously installed malicious software that is intended to track and report the usage of a target system or collect other data the author wishes to obtain.
    5. Rootkit Software: Software that is intended to take full or partial control of a system at the lowest levels. Contamination is defined as inappropriate introduction of data into a system.
  • Privileged User Misuse: Privileged user misuse occurs when a trusted user or operator attempts to damage the system or compromise the information it contains.
  • Security Support Structure Configuration Modification: Software, hardware and system configurations contributing to the Security Support Structure (SSS) are controlled. SSSs are essential to maintaining the security policies of the system. Unauthorized modifications to these configurations can increase the risk to the system.

Note: These categories of incidents are not necessarily mutually exclusive.

Causes of Incidents
  • Malicious Code: Malicious code is software or firmware intentionally inserted into an information system for an unauthorized purpose
  • System Failures: Procedural Failures or Improper Acts. A secure operating environment depends upon proper operation and use of systems. Failure to comply with established procedures, or errors/limitations in the procedures for a CMS system, can damage CMS reputation and increase vulnerability/risk to the system or application. While advances in computer technology enable the building of increased security into the CMS architecture, much still depends upon the people operating and using the system(s). Improper acts may be differentiated from insider attacks according to intent. With improper acts, someone may knowingly violate policy and procedures but does not intend to damage the system or compromise the information it contains
  • Intrusions or Break-Ins: An intrusion or break-in is entry into and use of a system by an unauthorized individual
  • Insider Attack: Insider attacks can provide the greatest risk. In an insider attack, a trusted user or operator attempts to damage the system or compromise the information it contains
Avenues of Attack

As with any information system, attacks can originate through certain avenues or routes. An attack avenue is a path or means by which an attacker can gain access to a computer or network server in order to deliver a payload or malicious outcome. Attack avenues enable attackers to exploit system vulnerabilities, including the human element. If a system were locked in a vault with security personnel surrounding it, and if the system were not connected to any other system or network, there would be virtually no avenue of attack. However, there are numerous avenues of attack.

  • Local and/or partner networks
  • Unauthorized devices (including non-approved connections to a local network)
  • Gateways to outside networks
  • Communications devices
  • Shared disks
  • Removable media
  • Downloaded software
  • Direct physical access
Possible Impacts of an Attack

One of the major concerns of a verifiable computer security attack is that sensitive PII is compromised. The release of sensitive information to people without the proper need-to-know or formal authorization jeopardizes the tenets of Confidentiality, Integrity and Availability (CIA). In addition, users may lose trust in computing systems and become hesitant to use one that has a high frequency of incidents or even a high frequency of events that cause the user to distrust the integrity of the federal system. Moreover, users become disenchanted with any action that causes all or part of the network’s service to be stopped entirely, interrupted, or degraded sufficiently to impact operations, as with a denial-of-service (DoS) attack. The impacts of attacks that compromise computer security include:

  • Denial of Service
  • Loss or Alteration of Data or Programs
  • Privacy Incident, including those resulting in identity theft or data breach
  • Loss of Trust in Computing Systems
  • The loss of intellectual property and CMS confidential information
  • Reputational damage to the organization
  • The additional cost of securing networks, insurance, and recovery from attacks
Incident Life Cycles
The incident response process has four phases. Review the NIST SP 800-61 Incident Lifecycle.
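
As a quick reference, the four phases can be written down as an ordered enumeration. This is a minimal Python sketch; the `Phase` enum and the `next_phase` helper are hypothetical names, and the phase labels follow the subsection headings of this template. The wrap-around from Post-Incident Activity back to Preparation is an illustrative assumption reflecting that lessons learned feed future preparation.

```python
from enum import Enum


class Phase(Enum):
    """The four incident response phases, in order."""
    PREPARATION = "Preparation"
    DETECTION_AND_ANALYSIS = "Detection and Analysis"
    CONTAINMENT_ERADICATION_RECOVERY = "Containment, Eradication & Recovery"
    POST_INCIDENT_ACTIVITY = "Post-Incident Activity"


def next_phase(current: Phase) -> Phase:
    """Return the phase that follows `current`.

    Assumption: after Post-Incident Activity the cycle returns to
    Preparation, because lessons learned inform future preparation.
    """
    phases = list(Phase)  # Enum iteration preserves definition order
    return phases[(phases.index(current) + 1) % len(phases)]
```

In practice responders may move back and forth between phases during a single incident, so this strict ordering is a simplification.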
Preparation

Preparation ensures that the organization is ready to respond to incidents, but can also prevent incidents by ensuring that systems, networks, and applications are sufficiently secure. The following describes the techniques utilized by the <system name> to prepare for security and privacy incidents.

<Describe the activities and methods in place for the information system to prepare for information security incidents. Examples of preparation methods are implementing incident response tools, establishing security baselines, and running periodic announced training and/or unannounced drills. For additional information on preparation activities, please review Section 3.3.1 Preparation of the CMS RMH Chapter 8 Incident Response.>

<Describe how incidents involving PII are to be handled, including the policies and procedures that have been developed and how those policies and procedures are communicated to the staff. Staff should be informed of the consequences of their actions for inappropriate use and handling of PII. Describe how it is determined that the existing processes are adequate and that staff understand their responsibilities. Describe how suspected or known incidents involving PII are reported to the business owner, information system owner, CRA, ISSO, and CCIC IMT. Describe what information needs to be reported, and to whom.>

Detection and Analysis

Incidents can occur in countless ways, so it is infeasible to develop step-by-step instructions for handling every incident. Organizations should be generally prepared to handle any incident but should focus on being prepared to handle incidents that use common attack vectors. Different types of incidents merit different response strategies. The following section describes the techniques utilized by the <system name> to detect and analyze security incidents:

<Describe the activities and methods in place for the information system to detect and analyze information security incidents. Examples of detection and analysis methods are preparing for common attack vectors, recognizing the signs of an incident, and documenting and prioritizing the incident. For additional information on detection and analysis activities, please review Section 3.3.2 Detection and Analysis of the CMS RMH Chapter 8 Incident Response.>

<Describe the activities and methods in place to detect and analyze incidents involving PII that are the responsibility of the information staff. Describe how it is ensured that the analysis process includes an evaluation of whether an incident involved PII, focusing on both known and suspected breaches of PII. Detection of an incident involving PII also requires reporting internally, to US-CERT, and externally, as appropriate; this is a CCIC IMT responsibility.>

Containment, Eradication & Recovery

Containment

Containment is important before an incident overwhelms resources or increases damage. Most incidents require containment, so that is an important consideration early in the course of handling each incident. Containment provides time for developing a tailored remediation strategy. An essential part of containment is decision-making. Such decisions are much easier to make if there are predetermined strategies and procedures for containing the incident. The following section describes the containment strategies and procedures for the <system name>:

<Describe the strategies and procedures in place for the information system to contain information security incidents. Examples of containment strategies are shutting down a system, disconnecting it from a network, and/or disabling certain functions. For additional information on containment activities, review Section 3.3.3 Containment, Eradication and Recovery of the CMS RMH Chapter 8 Incident Response.>

<Describe the strategies and procedures in place for containing incidents involving PII.>

Eradication & Recovery

After an incident has been contained, eradication may be necessary to eliminate components of the incident, such as deleting malware and disabling breached user accounts, as well as identifying and mitigating all vulnerabilities that were exploited. During eradication, it is important to identify all affected hosts within the organization so that the hosts can be remediated. For some incidents, eradication is either not necessary or is performed during recovery.

<Describe the activities and methods in place for the information system to eradicate and recover from information security incidents. Example methods for eradication are deleting malware, disabling breached accounts, and identifying and mitigating vulnerabilities that were exploited. Example activities associated with recovering from information security incidents are restoring systems to normal operation, confirming that systems are functioning normally, and remediating vulnerabilities to prevent similar incidents. For additional information on eradication and recovery activities, review Section 3.3.3 Containment, Eradication and Recovery of the CMS RMH Chapter 8 Incident Response.>

<Describe whether media sanitization steps are performed when PII needs to be deleted from media during recovery. PII should not be sanitized until a determination has been made about whether the PII must be preserved as evidence. Describe whether forensics techniques are needed to ensure preservation of evidence. If PII was accessed, describe how the number of records or individuals affected is determined. These activities should be coordinated with the CCIC IMT.>

Post-Incident Activity

After an incident has been eradicated and recovery completed, each incident response team should evolve to reflect new threats, improved technology, and lessons learned. Holding a lessons learned meeting with all involved parties after a major incident, and optionally after lesser incidents, can be extremely helpful in improving information security measures and the incident handling process.

<Describe the activities and methods in place for the information system to conduct post-incident activity after information security incidents. Example methods for post-incident activity are conducting a lessons learned meeting, documenting the lessons learned, updating the IRP and associated procedures as necessary, and ensuring evidence is retained and archived. For additional information on post-incident activity, review Post-Incident Activity of the CMS RMH Chapter 8 Incident Response.>

<Describe the activities and methods in place to conduct post-incident activity after incidents involving PII. This should include how the IRP is continually updated and improved based on the lessons learned during each incident. Sharing information within CMS and US-CERT to help protect against future incidents is a CCIC responsibility.>

Reporting Requirements
<Describe the information system process for reporting information security incidents. Incidents should be reported to the CMS IT Service Desk within one hour by calling (410) 786-2580 (internal) or 1-800-562-1963 (internal and external), or by emailing CMS_IT_Service@cms.hhs.gov. For information on reporting requirements for information security and privacy incidents, review Section 3.5 Incident Reporting and the Incident Response Reporting Template in the CMS RMH Chapter 8 Incident Response.>
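
A system team that tracks incidents in a ticketing tool may want to check the one-hour reporting window automatically. The following is a minimal Python sketch, assuming the window starts when the incident is discovered; the text above does not state the starting point, so that is an assumption. The names `REPORTING_WINDOW`, `reporting_deadline`, and `is_report_on_time` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# One-hour window for reporting to the CMS IT Service Desk.
REPORTING_WINDOW = timedelta(hours=1)


def reporting_deadline(discovered_at: datetime) -> datetime:
    """Latest time a report can be made, assuming the clock starts at discovery."""
    return discovered_at + REPORTING_WINDOW


def is_report_on_time(discovered_at: datetime, reported_at: datetime) -> bool:
    """True if the report was made no later than the deadline."""
    return reported_at <= reporting_deadline(discovered_at)


# Example usage with timezone-aware timestamps.
discovered = datetime(2024, 3, 5, 14, 0, tzinfo=timezone.utc)
reported = datetime(2024, 3, 5, 14, 45, tzinfo=timezone.utc)
on_time = is_report_on_time(discovered, reported)
```

Using timezone-aware timestamps throughout avoids errors when the discovery and the report are logged in different time zones.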
Points of Contact

Business Owner

<insert name>

<insert email>

<insert phone>

 

CMS IT Service Desk

<insert name>

<insert email>

<insert phone>

 

Cyber Risk Advisor (CRA)

<insert name>

<insert email>

<insert phone>

 

Data Guardian

<insert name>

<insert email>

<insert phone>
 

Incident Management Team

<insert name>

<insert email>

<insert phone>

 

Incident Responders

<insert name>

<insert email>

<insert phone>

 

Information System Security Officer (ISSO)

<insert name>

<insert email>

<insert phone>

 

System Administrators

<insert name>

<insert email>

<insert phone>

 

System Developers

<insert name>

<insert email>

<insert phone>

Plan Approval

Business Owner (BO)
<insert signature>

<insert name>

<insert title>

<insert email>

<insert phone>

 

Information System Security Officer (ISSO)

<insert signature>

<insert name>

<insert title>

<insert email>

<insert phone>
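
Before a completed plan is routed for signature, it can help to confirm that no template placeholders remain. The following is a minimal Python sketch of such a check, not a CMS-provided tool. It assumes the plan is available as plain text and that placeholders begin with `<insert`, `<describe`, or `<system name`, as they do in this template; the function name `find_unfilled_placeholders` is hypothetical.

```python
import re
from typing import List, Tuple

# Matches single-line placeholders such as <insert name>, <Describe ...>, <system name>.
PLACEHOLDER = re.compile(r"<(?:insert|describe|system name)[^<>]*>", re.IGNORECASE)


def find_unfilled_placeholders(text: str) -> List[Tuple[int, str]]:
    """Return (line number, placeholder) pairs for every placeholder left in the text."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits


# Example usage on a small, made-up excerpt of a draft plan.
draft = "Business Owner\n<insert name>\njane.doe@example.gov\nPlan for <system name>"
remaining = find_unfilled_placeholders(draft)
```

Placeholders that span several lines would not be caught by this per-line pattern, so a manual read-through is still needed.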

 

Tabletop Exercise Test Plan Template

Test Topic | <Insert Topic>
Test Scope | <Describe the scope of the incident response test to include who will participate in the exercise, the purpose of the test, and the expected outcome. All personnel with responsibilities under the incident response plan should participate in the exercise. The exercise should address the roles and responsibilities of personnel identified in the incident response plan being exercised and focus on validating that the documented roles, responsibilities, and interdependencies are accurate and current. To ensure that the knowledge of the roles and responsibilities identified in the plan being exercised is current, it is often effective to conduct a training session in conjunction with any tabletop exercise.>
Test Objectives | The objectives of this test are as follows:
  1. To validate the content of the incident response plan and the related policies and procedures.
  2. To validate participants’ roles and responsibilities as documented in the incident response plan and validate the interdependencies documented in the incident response plan.
  3. To meet regulatory requirements, specifically the NIST SP 800-53 Rev. 4 requirements for incident response testing and incident response training.
  4. To document lessons learned that may be utilized to update the incident response plan and related policies and procedures.
Participants | <Insert participants. The participants should comprise personnel with roles and responsibilities identified in the incident response plan, for example, training staff, validation staff, and evaluation staff.>
Exercise Facilitator | <Insert the name of the individual who will lead the discussion among the exercise participants.>
Data Collector | <Insert the name of the individual who records information about the actions that occur during the exercise.>
Date of Testing | <Insert date and time of testing>
Location | <Insert Location>
Equipment Required | <Insert required equipment, for example, audio visual equipment, whiteboard, flipchart>
Material Required | <Insert required material, for example, participant guides, PowerPoint presentations, handouts>
Test Scenarios | <Insert a sequential, narrative account of a hypothetical incident that provides the catalyst for the exercise and is intended to introduce situations that will inspire responses and thus allow demonstration of the exercise objectives.>
Test Questions

<Insert a list of questions regarding the scenario that address the exercise objective.  Below are sample questions taken from NIST Special Publication 800-61 Computer Security Incident Handling Guide>

Preparation:

  1. Would the organization consider this activity to be an incident?  If so, which of the organization’s policies does this activity violate?
  2. What measures are in place to attempt to prevent this type of incident from occurring or to limit its impact?

Detection and Analysis:

  1. What precursors of the incident, if any, might the organization detect?  Would any precursors cause the organization to take action before the incident occurred?
  2. What indicators of the incident might the organization detect?  Which indicators would cause someone to think that an incident might have occurred?
  3. What additional tools might be needed to detect this particular incident?
  4. How would the incident response team analyze and validate this incident?  What personnel would be involved in the analysis and validation process?
  5. To which people and groups within the organization would the team report the incident?
  6. How would the team prioritize the handling of this incident?

Containment, Eradication, and Recovery:

  1. What strategy should the organization take to contain the incident?  Why is this strategy preferable to others?
  2. What could happen if the incident were not contained?
  3. What additional tools might be needed to respond to this particular incident?
  4. Which personnel would be involved in the containment, eradication, and/or recovery processes?
  5. What sources of evidence, if any, should the organization acquire?  How would the evidence be acquired?  Where would it be stored?  How long should it be retained?

Post-Incident Activity:

  1. Who would attend the lessons learned meeting regarding this incident?
  2. What could be done to prevent similar incidents from occurring in the future?
  3. What could be done to improve detection of similar incidents?

General Questions:

  1. How many incident response team members would participate in handling this incident?
  2. Besides the incident response team, what groups within the organization would be involved in handling this incident?
  3. To which external parties would the team report the incident?  When would each report occur?
  4. How would each report be made?  What information would you report or not report, and why?
  5. What other communications with external parties may occur?
  6. What tools and resources would the team use in handling this incident?
  7. What aspects of the handling would have been different if the incident had occurred at a different day and time (on-hours versus off-hours)?
  8. What aspects of the handling would have been different if the incident had occurred at a different physical location (onsite versus offsite)?
Plan Being Exercised: <Insert the name and location of the incident response plan being exercised>
Exercise Agenda
  • Introductions
  • Review Exercise Scope and Logistics
  • Scenario Walk-Through & review of test questions (Exercise Facilitator)
  • Data Collector records observations (on-going)
  • Conduct exercise debrief/hotwash
  • Exercise Participants released
  • Complete After-Action Report (Exercise Facilitator & Data Collector only)
Test Plan Approval: <Insert signature by approval authority (e.g., Business Owner or ISSO)>

Tabletop Exercise Participant Guide Template

<INSERT ORGANIZATION NAME>

<INSERT TABLETOP EXERCISE TITLE>

Participant Guide

<Insert Tabletop Location>

<Insert Tabletop Date>

Introduction

To validate the <insert organization name> <insert name of plan being exercised>, <insert organization name> will conduct a tabletop exercise to examine the processes and procedures associated with implementing the <insert plan name>.  This discussion-based exercise will be a <insert number of hours>-hour event that will begin at <insert start time> and last until <insert end time>.

The exercise is designed to facilitate communication among personnel with incident response roles and responsibilities.  The following scenarios have been chosen for this exercise:

  • <Insert scenarios from approved test plan>

This exercise is designed to improve the readiness of <insert organization name> and help validate existing <insert plan name> procedures.

Participants should come to the exercise prepared to discuss high-level issues related to incident handling based on the scenarios above.  To achieve the exercise’s stated objectives, discussion will focus on the following questions related to the scenarios and the incident response plan:

  • <Insert questions from approved test plan>

Participants may choose to bring incident response narrative or reference material that will aid in answering the above questions.

Concept of Operations

A tabletop exercise is a discussion-based event in which participants meet in a “classroom” setting to address the actions participants would take in response to an emergency.  Tabletops are an effective initial step for personnel to discuss the full range of issues related to a crisis scenario.  These exercises provide an excellent forum to examine roles and responsibilities, unearth interdependencies, and evaluate plans.  A tabletop exercise also satisfies the training requirement for personnel with incident response roles and responsibilities.

Participants will be presented with an incident response scenario.  A facilitator will help guide discussion by asking questions designed to address the exercise’s objectives.

Objectives

The exercise objectives are as follows:

  • <Insert objectives from approved test plan>

Agenda

Date: <Insert date>
9:00 a.m. – 9:15 a.m.      Introductions
9:15 a.m. – 9:30 a.m.      Review Exercise Scope and Logistics
9:30 a.m. – 11:30 a.m.     Scenario Walk-Through & review of test questions (Exercise Facilitator)
9:30 a.m. – 11:30 a.m.     Data Collector records observations (on-going)
11:30 a.m. – 12:00 p.m.    Conduct exercise debrief/hotwash
Milestone                  Exercise Participants released
1:00 p.m. – completion     Complete After-Action Report (Exercise Facilitator & Data Collector only)

Debriefing/Hotwash Questions

An after-action report identifying strengths and areas for improvement will be provided after the exercise.  The following questions are designed to gather participant input for the after-action report:

  • Are there any other issues you would like to discuss that were not raised?
  • What are the strengths of the incident response plan?  What areas require closer examination?
  • Was the exercise beneficial?  Did it help prepare you to carry out your incident response roles and responsibilities?
  • What did you gain from the exercise?
  • How can we improve future exercises and tests?

After Action Report Template

<INSERT ORGANIZATION NAME>

<INSERT TABLETOP EXERCISE TITLE>

After Action Report 

<Insert Tabletop Location>

<Insert Tabletop Date>

Introduction

On <insert date>, <insert organization name> participated in a <insert duration of exercise>-hour tabletop exercise designed to validate the organization’s understanding of the <insert plan name>.

Objectives

The exercise objectives are as follows:

  • <Copy objectives from approved Test Plan>

Agenda

Date: <Insert date>
9:00 a.m. – 9:15 a.m.      Introductions
9:15 a.m. – 9:30 a.m.      Review Exercise Scope and Logistics
9:30 a.m. – 11:30 a.m.     Scenario Walk-Through & review of test questions (Exercise Facilitator)
9:30 a.m. – 11:30 a.m.     Data Collector records observations (on-going)
11:30 a.m. – 12:00 p.m.    Conduct exercise debrief/hotwash
Milestone                  Exercise Participants released
1:00 p.m. – completion     Complete After-Action Report (Exercise Facilitator & Data Collector only)

Discussion Findings

The <insert exercise name> provided information on <insert relevant information>.  An important benefit of the exercise was the opportunity for participants to raise important questions, concerns, and issues.

The discussion findings from the exercise along with any necessary recommended actions are as follows:

General Findings

The exercise provided an excellent opportunity for participants to <insert relevant information>.  As a result of the exercise, participants left with a heightened awareness of <insert relevant information>.

Specific Findings

Specific observations made during the exercise, and recommendations for enhancement of the plan, are as follows:

Observation 1. <Insert general topic area>

<Insert observation>

Recommendation

<Insert recommendations>

Observation 2. <Insert general topic area>

<Insert observation>

Recommendation

<Insert recommendations>

Below is an example of completed observations and recommendations.  All text in blue should be deleted upon completion of the After-Action Report.

Example Observations and Recommendations:
Observation 1. Communication
A plan identifying the process for communicating with incident response team members does not exist.
Recommendations:
  • The organization should consider developing a communications plan that establishes standardized communications requirements, addresses how stolen documents will be investigated, and describes procedures for incident response team personnel working with organizations to investigate breaches.
  • The organization should identify weaknesses in the incident handling plan and procedures to ensure that all essential personnel can be contacted in the event of a sensitive document breach.

Observation 2. Incident Breach Handling Protocol
Essential personnel were not aware of the organizational impact of stolen documents or of the incident breach handling protocol for investigation and recovery.
Recommendations:
  • The organization should examine the criteria for all personnel having access to sensitive organization documents.  In addition, all personnel might need to attend a security training and awareness course on how to report incidents or suspicious activities.

Sample Incident Scenarios

Scenario 1: Domain Name System (DNS) Server Denial of Service (DoS)

On a Saturday afternoon, external users start having problems accessing the organization’s public websites. Over the next hour, the problem worsens to the point where nearly every access attempt fails. Meanwhile, a member of the organization’s networking staff responds to alerts from an Internet border router and determines that the organization’s Internet bandwidth is being consumed by an unusually large volume of User Datagram Protocol (UDP) packets to and from both the organization’s public DNS servers. Analysis of the traffic shows that the DNS servers are receiving high volumes of requests from a single external IP address. Also, all the DNS requests from that address come from the same source port.
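
For exercise participants who want to picture the traffic analysis described in this scenario, the following is a minimal sketch, not a prescribed CMS procedure. It assumes the DNS queries have already been parsed into (source IP, source port) pairs; the sample records, the addresses (taken from reserved documentation ranges), and the helper name summarize_sources are all hypothetical.

```python
from collections import Counter

# Hypothetical sample of parsed DNS query records: (source_ip, source_port).
# The addresses are from ranges reserved for documentation, not real hosts.
queries = [
    ("203.0.113.7", 53000),
    ("203.0.113.7", 53000),
    ("203.0.113.7", 53000),
    ("198.51.100.20", 41022),
    ("203.0.113.7", 53000),
    ("192.0.2.15", 50311),
]


def summarize_sources(records):
    """Count queries per source IP and collect the source ports each IP used."""
    counts = Counter(ip for ip, _ in records)
    ports = {}
    for ip, port in records:
        ports.setdefault(ip, set()).add(port)
    return counts, ports


counts, ports = summarize_sources(queries)
top_ip, top_count = counts.most_common(1)[0]
share = top_count / len(queries)

# A single address sending most of the queries from one fixed source port
# matches the pattern described in the scenario.
print(f"{top_ip}: {top_count} of {len(queries)} queries ({share:.0%}), "
      f"source ports {sorted(ports[top_ip])}")
```

In a real investigation the records would come from packet captures or DNS server logs, and the threshold for "unusually large volume" would depend on the organization's normal traffic baseline.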

The following are additional questions for this scenario:

  1. Whom should the organization contact regarding the external IP address in question?
  2. Suppose that after the initial containment measures were put in place, the network administrators detected that nine internal hosts were also attempting the same unusual requests to the DNS server. How would that affect the handling of this incident?
  3. Suppose that two of the nine internal hosts disconnected from the network before their system owners were identified. How would the system owners be identified?

Scenario 2: Worm and Distributed Denial of Service (DDoS) Agent Infestation

On a Tuesday morning, a new worm is released; it spreads itself through removable media, and it can copy itself to open Windows shares. When the worm infects a host, it installs a DDoS agent. The organization has already incurred widespread infections before antivirus signatures become available several hours after the worm started to spread.

The following are additional questions for this scenario:

  1. How would the incident response team identify all infected hosts?
  2. How would the organization attempt to prevent the worm from entering the organization before antivirus signatures were released?
  3. How would the organization attempt to prevent the worm from being spread by infected hosts before antivirus signatures were released?
  4. Would the organization attempt to patch all vulnerable machines? If so, how would this be done?
  5. How would the handling of this incident change if infected hosts that had received the DDoS agent had been configured to attack another organization’s website the next morning?
  6. How would the handling of this incident change if one or more of the infected hosts contained sensitive personally identifiable information regarding the organization’s employees?
  7. How would the incident response team keep the organization’s users informed about the status of the incident?
  8. What additional measures would the team perform for hosts that are not currently connected to the network (e.g., staff members on vacation, offsite employees who connect occasionally)?

Scenario 3: Stolen Documents

On a Monday morning, the organization’s legal department receives a call from the Federal Bureau of Investigation (FBI) regarding some suspicious activity involving the organization’s systems. Later that day, an FBI agent meets with members of management and the legal department to discuss the activity. The FBI has been investigating activity involving public posting of sensitive government documents, and some of the documents reportedly belong to the organization. The agent asks for the organization’s assistance, and management asks for the incident response team’s assistance in acquiring the necessary evidence to determine if these documents are legitimate or not and how they might have been leaked.

The following are additional questions for this scenario:

  1. From what sources might the incident response team gather evidence?
  2. What would the team do to keep the investigation confidential?
  3. How would the handling of this incident change if the team identified an internal host responsible for the leaks?
  4. How would the handling of this incident change if the team found a rootkit installed on the internal host responsible for the leaks?

Scenario 4: Compromised Database Server

On a Tuesday night, a database administrator performs some off-hours maintenance on several production database servers. The administrator notices some unfamiliar and unusual directory names on one of the servers. After reviewing the directory listings and viewing some of the files, the administrator concludes that the server has been attacked and calls the incident response team for assistance. The team’s investigation determines that the attacker successfully gained root access to the server six weeks ago.

The following are additional questions for this scenario:

  1. What sources might the team use to determine when the compromise had occurred?
  2. How would the handling of this incident change if the team found that the database server had been running a packet sniffer and capturing passwords from the network?
  3. How would the handling of this incident change if the team found that the server was running a process that would copy a database containing sensitive customer information (including personally identifiable information) each night and transfer it to an external address?
  4. How would the handling of this incident change if the team discovered a rootkit on the server?

Scenario 5: Unknown Exfiltration

On a Sunday night, one of the organization’s network intrusion detection sensors alerts on anomalous outbound network activity involving large file transfers. The intrusion analyst reviews the alerts; it appears that thousands of .RAR files are being copied from an internal host to an external host, and the external host is located in another country. The analyst contacts the incident response team so that it can investigate the activity further. The team is unable to see what the .RAR files hold because their contents are encrypted. Analysis of the internal host containing the .RAR files shows signs of a bot installation.

The following are additional questions for this scenario:

  1. How would the team determine what was most likely inside the .RAR files? Which other teams might assist the incident response team?
  2. If the incident response team determined that the initial compromise had been performed through a wireless network card in the internal host, how would the team further investigate this activity?
  3. If the incident response team determined that the internal host was being used to stage sensitive files from other hosts within the enterprise, how would the team further investigate this activity?

Scenario 6: Unauthorized Access to Payroll Records

On a Wednesday evening, the organization’s physical security team receives a call from a payroll administrator who saw an unknown person leave her office, run down the hallway, and exit the building. The administrator had left her workstation unlocked and unattended for only a few minutes. The payroll program is still logged in and on the main menu, as it was when she left it, but the administrator notices that the mouse appears to have been moved. The incident response team has been asked to acquire evidence related to the incident and to determine what actions were performed.

The following are additional questions for this scenario:

  1. How would the team determine what actions had been performed?
  2. How would the handling of this incident differ if the payroll administrator had recognized the person leaving her office as a former payroll department employee?
  3. How would the handling of this incident differ if the team had reason to believe that the person was a current employee?
  4. How would the handling of this incident differ if the physical security team determined that the person had used social engineering techniques to gain physical access to the building?
  5. How would the handling of this incident differ if logs from the previous week showed an unusually large number of failed remote login attempts using the payroll administrator’s user ID?
  6. How would the handling of this incident differ if the incident response team discovered that a keystroke logger was installed on the computer two weeks earlier?

Scenario 7: Disappearing Host

On a Thursday afternoon, a network intrusion detection sensor records vulnerability scanning activity directed at internal hosts that is being generated by an internal IP address. Because the intrusion detection analyst is unaware of any authorized, scheduled vulnerability scanning activity, she reports the activity to the incident response team. When the team begins the analysis, it discovers that the activity has stopped and that there is no longer a host using the IP address.

The following are additional questions for this scenario:

  1. What data sources might contain information regarding the identity of the vulnerability scanning host?
  2. How would the team identify who had been performing the vulnerability scans?
  3. How would the handling of this incident differ if the vulnerability scanning were directed at the organization’s most critical hosts?
  4. How would the handling of this incident differ if the vulnerability scanning were directed at external hosts?
  5. How would the handling of this incident differ if the internal IP address was associated with the organization’s wireless guest network?
  6. How would the handling of this incident differ if the physical security staff discovered that someone had broken into the facility half an hour before the vulnerability scanning occurred?

Scenario 8: Telecommuting Compromise

On a Saturday night, network intrusion detection software records an inbound connection originating from a watchlist IP address. The intrusion detection analyst determines that the connection is being made to the organization’s VPN server and contacts the incident response team. The team reviews the intrusion detection, firewall, and VPN server logs and identifies the user ID that was authenticated for the session and the name of the user associated with the user ID.

The following are additional questions for this scenario:

  1. What should the team’s next step be (e.g., calling the user at home, disabling the user ID, disconnecting the VPN session)? Why should this step be performed first? What step should be performed second?
  2. How would the handling of this incident differ if the external IP address belonged to an open proxy?
  3. How would the handling of this incident differ if the ID had been used to initiate VPN connections from several external IP addresses without the knowledge of the user?
  4. Suppose that the identified user’s computer had become compromised by a game containing a Trojan horse that was downloaded by a family member. How would this affect the team’s analysis of the incident? How would this affect evidence gathering and handling? What should the team do in terms of eradicating the incident from the user’s computer?
  5. Suppose that the user installed antivirus software and determined that the Trojan horse had included a keystroke logger. How would this affect the handling of the incident? How would this affect the handling of the incident if the user were a system administrator? How would this affect the handling of the incident if the user were a high-ranking executive in the organization?

Scenario 9: Anonymous Threat

On a Thursday afternoon, the organization’s physical security team receives a call from an IT manager, reporting that two of her employees just received anonymous threats against the organization’s systems. Based on an investigation, the physical security team believes that the threats should be taken seriously and notifies the appropriate internal teams, including the incident response team, of the threats.

The following are additional questions for this scenario:

  1. What should the incident response team do differently, if anything, in response to the notification of the threats?
  2. What impact could heightened physical security controls have on the team’s responses to incidents?

Scenario 10: Peer-to-Peer File Sharing

The organization prohibits the use of peer-to-peer file sharing services. The organization’s network intrusion detection sensors have signatures enabled that can detect the usage of several popular peer-to-peer file sharing services. On a Monday evening, an intrusion detection analyst notices that several file sharing alerts have occurred during the past three hours, all involving the same internal IP address.
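
As an illustration of how an analyst might confirm the pattern in this scenario, the sketch below counts file sharing alerts per internal host within a three-hour window. It is only a sketch under stated assumptions: the alert tuples, the signature name "p2p-filesharing", the internal addresses, and the function recent_alerts_by_host are hypothetical and do not reflect any particular intrusion detection product.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical alert records: (timestamp, internal_ip, signature_name).
alerts = [
    (datetime(2024, 1, 8, 17, 5), "10.0.4.23", "p2p-filesharing"),
    (datetime(2024, 1, 8, 18, 40), "10.0.4.23", "p2p-filesharing"),
    (datetime(2024, 1, 8, 19, 55), "10.0.4.23", "p2p-filesharing"),
    (datetime(2024, 1, 8, 12, 10), "10.0.7.80", "p2p-filesharing"),
    (datetime(2024, 1, 8, 19, 30), "10.0.9.14", "port-scan"),
]


def recent_alerts_by_host(records, now, window, signature):
    """Count alerts with the given signature per internal IP inside the window."""
    cutoff = now - window
    return Counter(
        ip
        for timestamp, ip, sig in records
        if sig == signature and cutoff <= timestamp <= now
    )


# Assumed review time: a Monday evening, looking back three hours.
now = datetime(2024, 1, 8, 20, 0)
by_host = recent_alerts_by_host(alerts, now, timedelta(hours=3), "p2p-filesharing")

for ip, count in by_host.most_common():
    print(f"{ip}: {count} file sharing alerts in the past three hours")
```

Grouping by host and time window helps separate a sustained pattern from a single stray alert, which feeds directly into the prioritization question for this scenario.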

The following are additional questions for this scenario:

  1. What factors should be used to prioritize the handling of this incident (e.g., the apparent content of the files that are being shared)?
  2. What privacy considerations may impact the handling of this incident?
  3. How would the handling of this incident differ if the computer performing peer-to-peer file sharing also contains sensitive personally identifiable information?

Scenario 11: Unknown Wireless Access Point

On a Monday morning, the organization’s help desk receives calls from three users on the same floor of a building who state that they are having problems with their wireless access. A network administrator who is asked to assist in resolving the problem brings a laptop with wireless access to the users’ floor. As he views his wireless networking configuration, he notices that there is a new access point listed as being available. He checks with his teammates and determines that this access point was not deployed by his team, so that it is most likely a rogue access point that was established without permission.

The following are additional questions for this scenario:

  1. What should be the first major step in handling this incident (e.g., physically finding the rogue access point, logically attaching to the access point)?
  2. What is the fastest way to locate the access point? What is the most covert way to locate the access point?
  3. How would the handling of this incident differ if the access point had been deployed by an external party (e.g., contractor) temporarily working at the organization’s office?
  4. How would the handling of this incident differ if an intrusion detection analyst reported signs of suspicious activity involving some of the workstations on the same floor of the building?
  5. How would the handling of this incident differ if the access point had been removed while the team was still attempting to physically locate it?

Short description

This chapter (RMH Chapter 8) identifies the policies and standards for the Incident Response family of controls.

Contact name: ISPG Policy Team
Contact email: CISO@cms.hhs.gov