Ask most security managers to show you their vulnerability SLA policy, and they will produce a document that says something like: Critical, 15 days; High, 30 days; Medium, 90 days; Low, best effort. Ask them how many CVEs were remediated within SLA last quarter, and the number will be somewhere between embarrassing and fictional.
SLA frameworks fail for predictable reasons. They are set by security teams and handed to IT teams with no buy-in and no teeth. The classification criteria (CVSS severity buckets) do not correspond to actual urgency, so every Critical gets treated with the same urgency regardless of whether it has a working exploit in active ransomware campaigns or has zero public PoC and a theoretical attack vector. Escalation paths exist in the policy document but not in practice. And exception processes are either nonexistent or completely manual, creating a perverse incentive to just let SLAs slip silently rather than go through the paperwork.
A vulnerability SLA that actually drives patch velocity requires three things: classification criteria grounded in exploitation reality, a formal escalation workflow with teeth, and a reporting mechanism that gives leadership visibility into the gap between the policy and the practice.
The Classification Problem: CVSS Is the Wrong Anchor
Using CVSS severity tiers to define SLA buckets creates a fundamental mismatch between the urgency you are communicating and the urgency that is actually warranted. CVSS measures theoretical severity: the maximum potential impact of a fully realized exploit. It says nothing about whether anyone is actually exploiting the vulnerability today, or whether the preconditions for exploitation exist in your environment.
Two CVEs can both score CVSS 9.8 Critical with wildly different real-world urgency. A remote code execution vulnerability in a niche network appliance that has zero public PoC, EPSS 0.002 (0.2%), and no KEV listing is not remotely comparable to a CVE with a Metasploit module, EPSS 0.87, and a KEV listing added three days ago because it was used in a ransomware attack on a hospital chain. Treating them both as "15-day Critical SLA" is not a policy; it is an average that is wrong in both directions simultaneously.
A Tiered SLA Framework Built on Exploitation Signals
The following framework replaces CVSS-severity tiers with exploitation-signal tiers. The result is a smaller set of genuinely urgent items and a longer tail of lower-priority work that IT can address in normal patch cycles, which actually increases compliance rates on the items that matter.
| Tier | Criteria | Triage SLA | Remediation SLA | Escalation if Missed |
|---|---|---|---|---|
| Tier 0, KEV | CISA KEV listed | 4 hours | 24 hours | CISO + CTO direct notification |
| Tier 1, Critical Exploitable | CVSS ≥ 9.0 AND EPSS > 0.4 | 24 hours | 72 hours | VP Engineering + CISO |
| Tier 2, Critical | CVSS ≥ 9.0 AND EPSS ≤ 0.4 | 48 hours | 7 days | IT Director + Security Manager |
| Tier 3, High Exploitable | CVSS 7.0–8.9 AND EPSS > 0.4 | 48 hours | 14 days | IT Director + Security Manager |
| Tier 4, High | CVSS 7.0–8.9 AND EPSS ≤ 0.4 | 5 days | 30 days | Team Lead notification |
| Tier 5, Medium | CVSS 4.0–6.9 | 5 days | 90 days | Monthly backlog review |
| Tier 6, Low | CVSS < 4.0 | Best effort | Best effort | Quarterly review |
The key change is the separation of Tier 0 (KEV) from everything else. KEV entries are not a prioritization exercise; they are an incident response trigger. A 24-hour remediation SLA on a KEV entry is aggressive but achievable for critical assets, and it focuses the program's fastest-response capability on the vulnerabilities that threat actors are demonstrably using today.
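As a minimal sketch, the tier table above can be expressed as a small classification function. The names `Tier`, `TIERS`, `EPSS_THRESHOLD`, and `classify` are illustrative, not part of any tool, and the sketch assumes EPSS is supplied as a probability between 0 and 1 and that KEV membership has already been looked up elsewhere.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    triage: Optional[timedelta]       # None means "best effort"
    remediation: Optional[timedelta]  # None means "best effort"


# SLA values copied from the tier table; the structure itself is hypothetical.
TIERS = {
    0: Tier("Tier 0, KEV", timedelta(hours=4), timedelta(hours=24)),
    1: Tier("Tier 1, Critical Exploitable", timedelta(hours=24), timedelta(hours=72)),
    2: Tier("Tier 2, Critical", timedelta(hours=48), timedelta(days=7)),
    3: Tier("Tier 3, High Exploitable", timedelta(hours=48), timedelta(days=14)),
    4: Tier("Tier 4, High", timedelta(days=5), timedelta(days=30)),
    5: Tier("Tier 5, Medium", timedelta(days=5), timedelta(days=90)),
    6: Tier("Tier 6, Low", None, None),
}

EPSS_THRESHOLD = 0.4  # assumed to be a probability in [0, 1]


def classify(cvss: float, epss: float, in_kev: bool) -> int:
    """Return the tier number for one vulnerability."""
    # KEV listing overrides every score: it is an incident response trigger.
    if in_kev:
        return 0
    exploitable = epss > EPSS_THRESHOLD
    if cvss >= 9.0:
        return 1 if exploitable else 2
    if cvss >= 7.0:
        return 3 if exploitable else 4
    if cvss >= 4.0:
        return 5
    return 6
```

Under these assumptions, the two CVSS 9.8 examples from earlier land in different tiers: `classify(9.8, 0.002, False)` returns 2 (7-day remediation), while `classify(9.8, 0.87, True)` returns 0 (24-hour remediation).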
Building the Escalation Workflow
An SLA without an escalation path is a suggestion. The escalation workflow is what converts a policy document into an operational reality. Four elements are required:
- Automated SLA breach detection. When a vulnerability crosses its remediation SLA deadline without being marked resolved or excepted, the system fires an alert automatically. This is not a weekly report reviewed in a meeting; it is an immediate notification to the assigned owner and their manager.
- Named escalation contacts. Every SLA tier needs a named escalation contact documented in the policy and actually agreed to by that person. A CISO who does not know they are in the Tier 0 escalation path will not respond appropriately when they receive an after-hours page about a KEV entry on a production database.
- Mandatory exception process. When remediation cannot happen within SLA, because of testing windows, business constraints, or a compensating control, the owner must file an exception with a documented compensating control and a revised target date. The act of filing forces accountability and creates an audit trail. Make the exception form simple enough to complete in five minutes, or people will just let the SLA slip without filing.
- Leadership dashboard. The CISO and CTO need weekly visibility into SLA compliance by tier, with trending. This should be a set of metrics rather than a table of CVEs: Tier 0 compliance rate, average days to remediation by tier, number of open exceptions, and the aggregate TRIS™ score of the open backlog. This translates vulnerability data into business risk language that drives budget conversations.
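The breach detection, exception, and compliance-rate elements above could be sketched roughly as follows. Everything here is hypothetical: the `Finding` record, the choice to start the SLA clock at detection time, and the rule that an approved exception moves the deadline to its revised target date are assumptions for illustration, not a description of any particular product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Remediation SLAs per tier, from the tier table. Tier 6 is best effort,
# so it has no entry and can never breach.
REMEDIATION_SLA = {
    0: timedelta(hours=24),
    1: timedelta(hours=72),
    2: timedelta(days=7),
    3: timedelta(days=14),
    4: timedelta(days=30),
    5: timedelta(days=90),
}


@dataclass
class Finding:
    finding_id: str
    tier: int
    detected_at: datetime                        # assumed start of the SLA clock
    resolved_at: Optional[datetime] = None
    exception_until: Optional[datetime] = None   # revised target date, if excepted


def deadline(f: Finding) -> Optional[datetime]:
    """Effective due date, or None for best-effort tiers."""
    sla = REMEDIATION_SLA.get(f.tier)
    if sla is None:
        return None
    due = f.detected_at + sla
    if f.exception_until is not None:
        due = max(due, f.exception_until)
    return due


def is_breached(f: Finding, now: datetime) -> bool:
    due = deadline(f)
    if due is None:
        return False
    finished = f.resolved_at if f.resolved_at is not None else now
    return finished > due


def open_breaches(findings: List[Finding], now: datetime) -> List[Finding]:
    """Unresolved findings past their deadline: the ones that should alert."""
    return [f for f in findings if f.resolved_at is None and is_breached(f, now)]


def compliance_rate(findings: List[Finding], tier: int, now: datetime) -> Optional[float]:
    """Share of decided findings in a tier that met their deadline.

    Open findings still inside their deadline are excluded, since their
    outcome is not yet known. Returns None when nothing is decided.
    """
    decided = [
        f for f in findings
        if f.tier == tier and (f.resolved_at is not None or is_breached(f, now))
    ]
    if not decided:
        return None
    met = sum(1 for f in decided if not is_breached(f, now))
    return met / len(decided)
```

In practice `open_breaches` would run on a schedule and feed the notification to the owner and their manager, while `compliance_rate` per tier is one of the numbers a weekly dashboard could show.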
Getting IT Buy-In
Security teams write SLA policies in isolation and then wonder why IT does not comply. The fundamental problem is that SLAs are written for security's audit requirements, not for IT's operational reality. Patch windows exist. Change advisory boards require lead time. Production systems cannot be rebooted at 2 AM to apply a kernel patch without a change request.
Three practices that dramatically improve IT buy-in:
- Include IT in the SLA design process. The remediation timeframes should be agreed on jointly, not handed down from security. IT knows what is actually achievable; security knows what is actually urgent. The resulting tiers will be more conservative than what security would write alone and more aggressive than what IT would propose alone, which is exactly right.
- Provide specific remediation guidance with every ticket. "Apply patch CVE-2024-21762" is not specific enough. A ticket that includes the affected component, the exact package version to install, the test command to verify the patch, the rollback procedure, and the estimated downtime is a ticket that gets worked. The friction of understanding what to do is a significant portion of the delay between "ticket opened" and "ticket resolved."
- Distinguish between compliance tracking and blame assignment. Reporting that Team X missed 40% of their Tier 4 SLAs should trigger a conversation about whether they have enough capacity, whether the SLAs are realistic, or whether exceptions need to be filed, not a performance review. SLA metrics are a program health signal, not a disciplinary tool.
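The remediation-guidance practice above lends itself to a simple completeness check before a ticket is sent to IT. This is only a sketch: the `RemediationTicket` fields mirror the items listed in that bullet, and both the class and the `missing_guidance` helper are hypothetical names rather than part of any ticketing system.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class RemediationTicket:
    cve_id: str
    affected_component: str = ""
    target_package_version: str = ""
    verify_command: str = ""
    rollback_procedure: str = ""
    estimated_downtime_minutes: Optional[int] = None


def missing_guidance(ticket: RemediationTicket) -> List[str]:
    """Names of guidance fields that are still empty.

    A downtime of 0 counts as filled in; only "" and None count as missing.
    """
    return [
        f.name for f in fields(ticket)
        if getattr(ticket, f.name) in ("", None)
    ]
```

A ticket created with only `cve_id="CVE-2024-21762"` would report all five guidance fields as missing, which is a reasonable signal to hold it back until security has filled them in.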
A vulnerability SLA framework that does not account for exploitation probability is measuring the wrong thing. A KEV-listed vulnerability with EPSS 0.9 that misses a 15-day Critical SLA is categorically more dangerous than a theoretical Critical with EPSS 0.001 that misses the same SLA, but a CVSS-only policy cannot make that distinction. Exploitability-informed tiers, automated escalation, and leadership-visible metrics are what convert a policy document into a program that actually reduces your organization's risk of breach.