This article is for informational purposes only. GRCadia is not a law firm and does not provide legal or certification advice. Content is based on publicly available regulatory and standards guidance. Organisations should consult a qualified legal professional or accredited certification body for advice specific to their situation.
Most SOC Teams Are Improvising During Incidents
An alert fires at 2 a.m. The on-call analyst opens the SIEM, sees something unfamiliar, and starts making decisions on the fly. They escalate to the wrong person. They forget to isolate the endpoint before pulling forensic images. They skip the containment step because nobody wrote it down. By the time the senior engineer joins the call, the attacker has moved laterally and the evidence trail is incomplete.
This is not a training problem. It is a documentation problem. When your security operations centre relies on tribal knowledge instead of written procedures, every incident becomes an improvisation exercise. Response times stretch. Steps get missed. Post-incident reviews reveal the same gaps over and over. The fix is straightforward: give your team a SOC runbook, a step-by-step reference document that tells analysts exactly what to do when a specific type of incident occurs.
What Is a SOC Runbook?
A SOC runbook is a structured, step-by-step document that guides security analysts through the detection, triage, containment, eradication, and recovery phases of a specific incident type. Unlike high-level policies that describe what your organisation does, a runbook describes how — the exact sequence of actions, tools, commands, escalation paths, and decision points for a given scenario.
Runbooks are tactical. They sit between your overarching incident response plan (which covers governance, roles, and communication strategy) and the real-time decisions analysts make during an active incident. A well-written runbook reduces mean time to respond (MTTR), ensures consistency across shifts, and gives junior analysts the confidence to handle incidents they have not seen before.
Most mature SOCs maintain a library of runbooks — one for each incident type they commonly encounter. Phishing, ransomware, DDoS, insider threat, compromised credentials, malware on endpoint, cloud misconfiguration — each gets its own runbook because each requires different tools, different containment steps, and different escalation criteria.
SOC Runbook vs Playbook: What's the Difference?
The terms "runbook" and "playbook" are often used interchangeably, but in mature security operations they serve different purposes.
| Aspect | SOC Runbook | SOC Playbook |
|---|---|---|
| Scope | Single incident type | Broader incident response programme |
| Detail level | Step-by-step procedures, specific commands, tool instructions | Strategic guidance, decision frameworks, role definitions |
| Audience | L1/L2 SOC analysts executing the response | SOC managers, IR leads, CISOs setting strategy |
| Example | "Phishing email response runbook" | "Incident response playbook covering all threat categories" |
| Analogy | A recipe for one dish | The full cookbook |
In practice, a playbook is the parent document. It defines your incident response framework, team roles, communication protocols, and escalation matrix. Runbooks are the children — each one details the tactical execution for a specific scenario within that framework. You need both. The playbook tells your team how the programme works. Runbooks tell them what to do right now.
GRCadia's IR Playbook Bundle includes both: the strategic playbook framework and a library of scenario-specific runbooks ready to customise for your environment.
What Should a SOC Runbook Include?
Every SOC runbook should follow a consistent structure so analysts can find information quickly under pressure. Here are the essential components:
- Incident type and description — a clear definition of the scenario the runbook covers, including common indicators of compromise (IOCs) and alert triggers that activate it.
- Severity classification criteria — how to determine whether this incident is a P1, P2, P3, or P4 based on scope, data sensitivity, and business impact. Include specific thresholds, not vague guidance.
- Detection and triage steps — the initial investigation checklist: what to check first, which logs to pull, what queries to run in the SIEM, and how to confirm whether the alert is a true positive or false positive.
- Containment procedures — exact steps to stop the incident from spreading. This includes network isolation commands, account disablement procedures, firewall rule changes, and endpoint quarantine instructions. Be specific — include the actual commands and tool paths.
- Eradication and recovery steps — how to remove the threat (malware removal, credential resets, patch application) and restore affected systems to normal operation. Include verification steps to confirm the threat is fully removed.
- Escalation matrix — who to notify at each severity level, with names, roles, contact methods, and response time expectations. Include after-hours procedures.
- Evidence collection and preservation — what forensic artifacts to collect, how to preserve chain of custody, where to store evidence, and what tools to use. This is critical for post-incident analysis and any legal or regulatory proceedings.
- Communication templates — pre-drafted internal notifications, management updates, and (where applicable) external communications. Analysts should not be writing emails from scratch during an active incident.
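Severity criteria are easiest to apply under pressure when the thresholds are written down as explicit rules. The following is a minimal sketch of that idea. The `TriageFacts` fields and every threshold in `classify_severity` are hypothetical values chosen for illustration, not taken from any standard, and should be replaced with your own criteria.

```python
from dataclasses import dataclass


@dataclass
class TriageFacts:
    """Hypothetical triage inputs; field names are illustrative only."""
    affected_hosts: int
    sensitive_data_involved: bool
    business_critical_system: bool


def classify_severity(facts: TriageFacts) -> str:
    """Return P1-P4 using example thresholds (assumptions, not a standard)."""
    if facts.business_critical_system and facts.sensitive_data_involved:
        return "P1"
    if facts.business_critical_system or facts.affected_hosts >= 10:
        return "P2"
    if facts.sensitive_data_involved or facts.affected_hosts >= 2:
        return "P3"
    return "P4"


# One workstation, no sensitive data, not business critical.
print(classify_severity(TriageFacts(1, False, False)))
```

Writing the rules in order from most to least severe means the first matching rule wins, which mirrors how an analyst would read a severity matrix from the top down.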
SOC Runbook Examples
Here are three common SOC runbook scenarios to illustrate what good runbooks look like in practice.
1. Phishing Email Response Runbook
Trigger: User reports a suspicious email, or email security gateway flags a message as malicious after delivery.
- Triage: Retrieve the original email (headers and body) from the email security platform. Check sender reputation, domain age, and SPF/DKIM/DMARC results. Extract and analyse URLs using a sandboxed URL analyser. Extract and detonate attachments in a sandbox. Search the SIEM for other recipients of the same message.
- Containment: If confirmed malicious — purge the email from all recipient mailboxes using the email admin console. Block the sender domain and any extracted IOCs (URLs, IPs, file hashes) at the email gateway, web proxy, and EDR. If a user clicked a link or opened an attachment, isolate their endpoint immediately.
- Eradication: Reset credentials for any user who entered them on a phishing page. Revoke active sessions and OAuth tokens. Scan affected endpoints for dropped payloads. Review email forwarding rules on compromised accounts for persistence mechanisms.
- Recovery: Restore endpoint from clean image if malware was executed. Re-enable accounts after credential reset and MFA verification. Monitor affected accounts for 72 hours for signs of further compromise.
- Post-incident: Update phishing indicators in the threat intelligence platform. Send a phishing awareness reminder to affected users. Document lessons learned.
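The SPF/DKIM/DMARC check in the triage step can often be read from the `Authentication-Results` header that the receiving mail server adds. Below is a minimal sketch using only the Python standard library. The sample message, the domain names, and the helper names `auth_results` and `failed_checks` are made up for illustration, and real headers vary by mail provider, so treat the regular expression as a starting point rather than a complete parser.

```python
import re
from email import message_from_string

# Fabricated sample message for illustration only.
SAMPLE = """\
Authentication-Results: mx.example.com; spf=fail smtp.mailfrom=bad.example; dkim=none; dmarc=fail header.from=bad.example
From: "IT Support" <helpdesk@bad.example>
Subject: Password expiry

Click here.
"""


def auth_results(raw_email: str) -> dict:
    """Collect spf/dkim/dmarc verdicts from Authentication-Results headers."""
    msg = message_from_string(raw_email)
    results = {}
    for header in msg.get_all("Authentication-Results", []):
        for method, verdict in re.findall(r"\b(spf|dkim|dmarc)=(\w+)", str(header), re.I):
            results[method.lower()] = verdict.lower()
    return results


def failed_checks(results: dict) -> list:
    """List the checks that are missing or did not return 'pass'."""
    return sorted(m for m in ("spf", "dkim", "dmarc") if results.get(m) != "pass")


print(failed_checks(auth_results(SAMPLE)))
```

A failed or missing check is a signal for the analyst, not a verdict on its own. Legitimate mail sometimes fails these checks, and phishing sent from a compromised account can pass all three.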
2. Ransomware Incident Runbook
Trigger: EDR detects file encryption behaviour, ransom note discovered on endpoint, or multiple file extension changes detected across network shares.
- Triage: Identify the affected endpoints and user accounts. Determine the ransomware variant if possible (check ransom note, file extensions, and threat intelligence feeds). Assess scope — is it one machine or multiple? Are network shares affected?
- Containment (immediate — minutes matter): Isolate affected endpoints from the network via EDR or network switch port shutdown. Disable the compromised user account(s). Block lateral movement by segmenting affected network zones. Do NOT power off affected machines — volatile memory contains forensic evidence. Disable any active Remote Desktop Protocol (RDP) sessions across the environment.
- Eradication: Identify and block the initial infection vector (phishing email, exploited vulnerability, compromised RDP). Remove ransomware binaries and persistence mechanisms. Patch the exploited vulnerability. Reset all credentials that may have been exposed, starting with privileged accounts.
- Recovery: Restore affected systems from verified clean backups. Verify backup integrity before restoration — confirm backups are not encrypted or tampered with. Rebuild systems that cannot be cleanly restored. Re-enable network connectivity in a staged manner with enhanced monitoring.
- Post-incident: Conduct a full root cause analysis. Report to relevant regulatory bodies if personal data was affected. Review and update backup procedures and network segmentation controls.
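The "multiple file extension changes" trigger for this runbook can be approximated with a sliding-window count of rename events per host. This is a rough sketch only. The event tuple format, the function name `hosts_with_rename_bursts`, and the default window and threshold are assumptions for illustration. Commercial EDR products use far richer signals than this.

```python
from collections import defaultdict, deque
from pathlib import PurePosixPath


def hosts_with_rename_bursts(events, window_seconds=60, threshold=50):
    """Flag hosts with many extension-changing renames inside a time window.

    events: iterable of (timestamp_seconds, host, old_path, new_path),
    ordered by timestamp. The format is hypothetical.
    """
    recent = defaultdict(deque)  # host -> timestamps of recent suspicious renames
    flagged = set()
    for ts, host, old_path, new_path in events:
        # Ignore renames that keep the same extension.
        if PurePosixPath(old_path).suffix == PurePosixPath(new_path).suffix:
            continue
        window = recent[host]
        window.append(ts)
        # Drop events that have fallen out of the time window.
        while window and ts - window[0] > window_seconds:
            window.popleft()
        if len(window) >= threshold:
            flagged.add(host)
    return flagged


events = [(i, "ws-01", f"/share/f{i}.docx", f"/share/f{i}.docx.locked") for i in range(5)]
events.append((5, "ws-02", "/share/a.txt", "/share/b.txt"))
print(hosts_with_rename_bursts(events, window_seconds=10, threshold=5))
```

The threshold needs tuning against normal activity in your environment, since bulk file conversions and backup tools can also rename many files in a short period.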
3. DDoS Attack Runbook
Trigger: Network monitoring detects abnormal traffic volume, application performance degrades, or DDoS protection service triggers an alert.
- Triage: Confirm the traffic is an attack and not a legitimate spike (product launch, marketing campaign). Identify the attack type — volumetric (bandwidth flooding), protocol (SYN flood, UDP reflection), or application-layer (HTTP flood, slowloris). Determine the target (specific IP, service, or application endpoint).
- Containment: Engage the DDoS mitigation service or activate scrubbing centre routing. Apply rate limiting on the targeted endpoint. If the attack targets a specific application path, deploy a WAF rule to block the pattern. Coordinate with the ISP or hosting provider if upstream filtering is needed. Enable geo-blocking if attack traffic originates from regions with no legitimate users.
- Eradication: DDoS attacks are mitigated, not eradicated in the traditional sense. Maintain mitigation controls for 24–48 hours after traffic normalises — attackers frequently retry. Review access logs for any application-layer exploitation that may have been masked by the volumetric attack.
- Recovery: Gradually remove emergency rate limits and WAF rules once traffic is stable. Verify all services are functioning normally. Review CDN and caching configurations to improve resilience for future events.
- Post-incident: Document attack characteristics (peak bandwidth, duration, source distribution) for future reference. Update DDoS response thresholds based on lessons learned. Review architecture for single points of failure exposed during the attack.
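The rate limiting mentioned in the containment step is commonly implemented as a token bucket. The sketch below shows one generic form of that technique so analysts know what the control is doing. It is not the configuration of any particular WAF or load balancer, and the class name, parameters, and explicit `now` argument (used here to keep the example deterministic) are illustrative choices.

```python
class TokenBucket:
    """Allow up to `burst` requests at once, refilled at `rate_per_sec`."""

    def __init__(self, rate_per_sec: float, burst: int, now: float = 0.0):
        self.rate = rate_per_sec
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.last = now

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` may proceed."""
        if now > self.last:
            # Refill tokens for the elapsed time, capped at the bucket size.
            self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
            self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate_per_sec=1, burst=2)
# Three requests at the same instant, then one a second later.
print([bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)])
```

In production you would normally keep one bucket per client key, such as a source IP or API token, and pass a monotonic clock reading as `now`.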
How to Build a SOC Runbook From Scratch
If you are building your first set of runbooks or rebuilding an outdated library, follow these five steps.
Step 1: Identify Your Top Incident Types
Pull the last 12 months of incident tickets from your SIEM or ticketing system. Categorise them by type and rank by frequency and business impact. Most SOCs will find that 5–8 incident types account for 80–90% of their caseload. Start with those. Common categories include phishing, malware on endpoint, compromised credentials, ransomware, DDoS, data exfiltration, insider threat, and cloud misconfiguration.
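The ranking in this step is a simple Pareto analysis. A minimal sketch follows, assuming you have already exported one incident-type label per ticket. The function name `top_incident_types` and the sample counts are invented for illustration, and the sketch ranks by frequency only, so business impact still needs to be weighed separately.

```python
from collections import Counter


def top_incident_types(ticket_types, coverage=0.8):
    """Return the most frequent types that together reach `coverage` of tickets."""
    counts = Counter(ticket_types)
    total = sum(counts.values())
    selected, covered = [], 0
    for incident_type, n in counts.most_common():
        if covered / total >= coverage:
            break
        selected.append(incident_type)
        covered += n
    return selected


# Fabricated sample data: one label per ticket.
tickets = ["phishing"] * 50 + ["malware"] * 35 + ["ddos"] * 10 + ["insider"] * 5
print(top_incident_types(tickets))
```

The types returned are the ones to write runbooks for first. Anything left over can wait until the core library is in place.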
Step 2: Document the Current Process
Interview your senior analysts and IR leads. Ask them to walk you through how they actually handle each incident type today — not how they think it should be handled, but what they actually do. Record the tools they use, the queries they run, the people they call, and the decisions they make. This captures the tribal knowledge that currently lives only in their heads.
Step 3: Standardise the Structure
Define a consistent template that every runbook follows. Use the eight components listed above (incident type, severity criteria, triage, containment, eradication and recovery, escalation, evidence collection, communications). Consistency matters: when an analyst grabs an unfamiliar runbook at 3 a.m., they should know exactly where to find the containment steps because every runbook puts them in the same place.
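If runbooks are stored as structured data, the template can be enforced with a small check. The sketch below assumes each runbook is loaded as a dictionary. The section keys in `REQUIRED_SECTIONS` are hypothetical names for the eight components, not a published schema.

```python
# Hypothetical keys for the eight template components.
REQUIRED_SECTIONS = [
    "incident_type",
    "severity_criteria",
    "triage",
    "containment",
    "eradication_recovery",
    "escalation",
    "evidence_collection",
    "communications",
]


def missing_sections(runbook: dict) -> list:
    """Return required sections that are absent or empty, in template order."""
    return [name for name in REQUIRED_SECTIONS if not runbook.get(name)]


draft = {
    "incident_type": "Phishing email",
    "triage": ["Retrieve original email", "Check SPF/DKIM/DMARC"],
    "containment": [],  # present but empty, so still reported
}
print(missing_sections(draft))
```

Running a check like this before a runbook is published catches incomplete drafts early, and reporting gaps in template order keeps the output easy to compare against the document.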
Step 4: Validate With Tabletop Exercises
Before publishing a runbook, test it. Run a tabletop exercise where analysts walk through the runbook against a simulated scenario. Time the exercise. Note where analysts get confused, where steps are missing, and where the runbook assumes knowledge that junior analysts don't have. Revise accordingly. An untested runbook is likely to fail when it matters.
Step 5: Establish a Review Cadence
Runbooks go stale. Tools change. Team members rotate. New threats emerge. Set a review schedule — quarterly for high-frequency runbooks, semi-annually for the rest. After every significant incident, review the relevant runbook as part of the post-incident process. Assign a runbook owner for each document who is responsible for keeping it current.
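A review schedule is easier to keep when overdue runbooks are flagged automatically. The following is a minimal sketch of that check. The `"high"` and `"standard"` labels and the 91-day and 182-day intervals are approximations of "quarterly" and "semi-annually" chosen for illustration.

```python
from datetime import date, timedelta

# Approximate intervals: about one quarter and about half a year.
REVIEW_INTERVAL_DAYS = {"high": 91, "standard": 182}


def is_review_due(last_reviewed: date, frequency: str, today: date) -> bool:
    """Return True when the runbook has gone longer than its review interval."""
    interval = timedelta(days=REVIEW_INTERVAL_DAYS[frequency])
    return today - last_reviewed >= interval


today = date(2024, 4, 15)
print(is_review_due(date(2024, 1, 1), "high", today))
print(is_review_due(date(2024, 1, 1), "standard", today))
```

A check like this could run on a schedule and notify each runbook owner, though reviews triggered by significant incidents still have to be raised through the post-incident process.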
How GRCadia Helps
Building a runbook library from scratch takes weeks of analyst time — time most SOC teams don't have. The interviews, the template design, the drafting, the review cycles — it adds up fast, and it competes with the day-to-day work of actually responding to incidents.
The IR Playbook Bundle ($149, one-time purchase) gives you a complete, practitioner-built incident response documentation set. It includes:
- A strategic incident response playbook framework covering governance, roles, escalation, and communication
- Ready-to-customise runbook templates for the most common SOC incident types
- Severity classification matrix with clear thresholds
- Escalation matrix template with after-hours procedures
- Evidence collection and chain-of-custody checklists
- Pre-drafted communication templates for internal and external notifications
- Post-incident review template
Every template is fully editable, structured for real-world use, and built by practitioners who have run SOCs — not by consultants who have only audited them. One-time purchase. No subscriptions. Download, customise for your environment, and deploy.
Download a Free IR Playbook Sample
See exactly what you get before you buy. No email required.
Frequently Asked Questions
What is a SOC runbook?
A SOC runbook is a step-by-step document that guides security operations centre analysts through the detection, triage, containment, eradication, and recovery phases of a specific type of security incident. Unlike high-level policies, runbooks are tactical: they include the exact commands, tools, decision points, and escalation paths analysts need during an active incident.
What is the difference between a SOC runbook and a playbook?
A playbook is a strategic document that defines your overall incident response programme — roles, governance, communication strategy, and escalation framework. A runbook is a tactical document within that programme that covers the step-by-step procedures for a single incident type. A playbook is the cookbook; a runbook is a single recipe. You need both for an effective incident response programme.
How many runbooks does a SOC need?
As a rough guide, most SOCs need between 5 and 15 runbooks to cover their core incident types. Start by analysing your incident history: the 5–8 incident types that account for 80–90% of your caseload should each have a dedicated runbook. Common starting points include phishing, ransomware, DDoS, compromised credentials, malware on endpoint, and insider threat. Expand your library as new threats emerge or your operations mature.
How often should SOC runbooks be updated?
Review high-frequency runbooks quarterly and all others semi-annually. Additionally, update a runbook immediately after any significant incident that reveals gaps, when tools or team members change, or when new threat intelligence requires a change in procedure. Assign an owner to each runbook who is accountable for keeping it current.
Can SOC runbooks be automated?
Yes — partially. SOAR (Security Orchestration, Automation, and Response) platforms can automate repeatable steps in a runbook, such as enriching IOCs, quarantining endpoints, blocking IPs, and sending notifications. However, decisions that require human judgement — like determining severity, deciding whether to shut down a production system, or communicating with executives — should remain manual. Start with a well-documented manual runbook, then automate individual steps as you gain confidence in the process.
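The split between automated steps and human decisions can be modelled as a list of steps where some are gated on analyst approval. The sketch below illustrates that pattern only. `Step`, `run_steps`, and the `approve` callback are hypothetical names and do not correspond to the API of any SOAR product.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    action: Callable[[], str]  # performs the step and returns a short result
    needs_human: bool = False  # True for decisions that require judgement


def run_steps(steps: List[Step], approve: Callable[[str], bool]) -> List[str]:
    """Run steps in order, stopping at the first unapproved manual step."""
    log = []
    for step in steps:
        if step.needs_human and not approve(step.name):
            log.append(f"{step.name}: waiting for analyst decision")
            break
        log.append(f"{step.name}: {step.action()}")
    return log


steps = [
    Step("enrich_iocs", lambda: "done"),
    Step("isolate_production_host", lambda: "done", needs_human=True),
    Step("notify_stakeholders", lambda: "done"),
]
# No approval given, so the run pauses before the production host is touched.
print(run_steps(steps, approve=lambda name: False))
```

Marking a step as `needs_human` keeps the manual runbook as the source of truth, and individual steps can be switched to automatic later as confidence grows.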