The typical enterprise patch process: scanner finds a vulnerability, analyst creates a ticket, IT ops receives the ticket two weeks later during their sprint planning, schedules patching for the next maintenance window, patches during the window, marks the ticket done. Total elapsed time: 6–12 weeks for a routine patch. For a critical vulnerability with a 30-day SLA, that process is already on borrowed time before anyone has typed a command.
Patch automation doesn't eliminate human judgment; it eliminates human waiting. The decisions that matter (which CVEs to patch, which systems, what change controls apply) remain human decisions. The execution, verification, and reporting become automated. The result: mean time to remediate (MTTR) drops from weeks to days or hours, without increasing operational risk.
The goal of patch automation is not to remove human judgment. It's to ensure that human judgment is the only bottleneck, not human scheduling, human handoffs, or human error in execution.
The 5-Stage Patch Pipeline
Stage 1: Detect
What happens: A scan identifies a vulnerability, and the finding is enriched with its EPSS (Exploit Prediction Scoring System) score, CISA KEV (Known Exploited Vulnerabilities) status, and asset criticality. If it crosses a priority threshold (KEV-listed, EPSS > 0.4, or CVSS > 9.0 on a critical asset), a patch ticket is automatically created in your ITSM tool (ServiceNow, Jira). The ticket is pre-populated with the CVE ID, CVSS score, EPSS score, affected asset list, and estimated patch complexity. Ticket creation itself requires no analyst intervention.
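The threshold check in this stage can be sketched in a few lines of Python. This is a minimal illustration, not a scanner integration: the `Finding` fields and the `needs_patch_ticket` name are assumptions, and the thresholds are the ones quoted above. It reads the CVSS condition as applying only to critical assets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One enriched scan finding (field names are hypothetical)."""
    cve_id: str
    cvss: float           # CVSS base score, 0.0-10.0
    epss: float           # EPSS probability, 0.0-1.0
    kev: bool             # listed in the CISA KEV catalog
    critical_asset: bool  # asset sits in the top criticality tier


def needs_patch_ticket(f: Finding) -> bool:
    """Return True if the finding crosses any Stage 1 priority threshold."""
    if f.kev:
        return True
    if f.epss > 0.4:
        return True
    # Assumption: the CVSS rule only applies to critical assets.
    return f.cvss > 9.0 and f.critical_asset
```

Findings that return `True` would be handed to the ticket-creation step; everything else stays in the normal backlog.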
Stage 2: Triage
What happens: A human reviews the auto-created ticket, confirms the fix (patch, configuration change, or workaround), and approves it for automated execution. For routine OS package updates on non-critical systems, triage may be fully automated using policy rules ("if EPSS < 0.2 and CVSS < 7.0 and asset tier is dev/test, auto-approve"). For critical systems or high-risk patches, a human engineer approves before execution proceeds. The triage stage is where patch automation policy is applied.
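The example policy rule quoted above could be expressed as a small function. This is a sketch under stated assumptions: the `Decision` enum, the `triage` name, and the tier labels `dev` and `test` are illustrative, and a real policy engine would likely load its rules from configuration.

```python
from enum import Enum


class Decision(Enum):
    AUTO_APPROVE = "auto-approve"
    HUMAN_REVIEW = "human-review"


# Assumption: these are the tier labels used in the asset inventory.
AUTO_APPROVE_TIERS = {"dev", "test"}


def triage(epss: float, cvss: float, asset_tier: str) -> Decision:
    """Apply the example rule: low EPSS, low CVSS, and a dev/test asset."""
    if epss < 0.2 and cvss < 7.0 and asset_tier in AUTO_APPROVE_TIERS:
        return Decision.AUTO_APPROVE
    # Everything else waits for an engineer's approval.
    return Decision.HUMAN_REVIEW
```

Defaulting to human review keeps the failure mode safe: an unknown tier or a missing rule delays a patch but never approves one silently.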
Stage 3: Test
What happens: The patch is applied to a staging environment first, and automated smoke tests run. For web applications, a subset of integration tests verifies that the patched version doesn't break critical functionality. For OS patches, a health check script confirms the system comes back up and key services are running. For kernel patches, the reboot cycle is tested as well. Test stage failures block production deployment and alert the owning engineer.
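The gate between staging and production can be sketched as follows. The function names and the idea of passing checks in as callables are assumptions made for illustration; the real checks would probe services, run integration tests, or confirm a reboot completed.

```python
from typing import Callable, Dict, List

Check = Callable[[], bool]


def run_smoke_checks(checks: Dict[str, Check]) -> List[str]:
    """Run every check and return the names of those that failed.

    A check that raises is treated as a failure, so one broken probe
    cannot let a bad patch through.
    """
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False
        if not ok:
            failed.append(name)
    return failed


def may_promote(checks: Dict[str, Check]) -> bool:
    """Staging gate: production deployment proceeds only if nothing failed."""
    return not run_smoke_checks(checks)
```

The list of failed check names is what would be sent to the owning engineer when the gate blocks a deployment.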
Stage 4: Deploy
What happens: Automated deployment to production using the appropriate mechanism (Ansible, Intune, WSUS, Helm chart update, etc.). High-risk patches deploy as canaries, to a small percentage of production instances first, with a hold period and automated health monitoring before full rollout. Critical emergency patches bypass the canary phase after explicit human approval. Change management tickets are auto-updated throughout.
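One way to plan the canary wave is to split the host list before handing it to the deployment tool. The sketch below assumes a hypothetical `plan_waves` helper and a default of 10% for the canary; neither comes from a specific tool.

```python
from typing import List, Sequence


def plan_waves(hosts: Sequence[str], canary_percent: int = 10) -> List[List[str]]:
    """Split hosts into [canary wave, remaining hosts].

    The canary wave always has at least one host. If every host lands
    in the canary wave, no second wave is returned.
    """
    if not hosts:
        return []
    if not 0 < canary_percent <= 100:
        raise ValueError("canary_percent must be between 1 and 100")
    # Integer ceiling division avoids floating-point rounding surprises.
    canary_size = max(1, (len(hosts) * canary_percent + 99) // 100)
    waves = [list(hosts[:canary_size])]
    if canary_size < len(hosts):
        waves.append(list(hosts[canary_size:]))
    return waves
```

The second wave would only be released after the hold period passes and health monitoring on the first wave stays green.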
Stage 5: Verify
What happens: A re-scan (or targeted vulnerability check) confirms the CVE is no longer present on the patched system. The ticket is auto-closed with the re-scan evidence attached. SLA tracking records the closure time. For FedRAMP or SOC 2 compliance, the evidence artifact is automatically stored in your compliance evidence repository. The patch pipeline reports back to your vulnerability management (VM) system to update the CVE status from "open" to "remediated."
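The closing decision can be sketched as a pure function over the re-scan results. The function name, the result keys, and the shape of the re-scan input (a collection of CVE IDs still reported on the host) are assumptions for illustration, not the output format of any particular scanner.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


def verify_remediation(cve_id: str, rescan_cves: Iterable[str],
                       opened_at: datetime, sla: timedelta,
                       now: Optional[datetime] = None) -> dict:
    """Decide the ticket outcome from re-scan results.

    rescan_cves holds the CVE IDs still reported on the patched system.
    """
    now = now or datetime.now(timezone.utc)
    remediated = cve_id not in set(rescan_cves)
    elapsed = now - opened_at
    return {
        "cve_id": cve_id,
        "status": "remediated" if remediated else "open",
        "close_ticket": remediated,
        "elapsed_hours": round(elapsed.total_seconds() / 3600, 1),
        "sla_met": remediated and elapsed <= sla,
    }
```

The returned dictionary doubles as the evidence record: it can be attached to the ticket and stored alongside the raw re-scan output.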
Ansible Playbooks for Linux Patching
Ansible is one of the most widely used automation frameworks for Linux patch management. Here's a playbook structure you can adapt for production use:
---
# patch-linux-packages.yml
# Usage: ansible-playbook patch-linux-packages.yml \
#          -i inventory/production.ini \
#          --extra-vars "cve_id=CVE-2024-XXXXX target_package=openssl"
- name: Patch CVE on Linux fleet
  hosts: "{{ target_hosts | default('all') }}"
  serial: "25%"             # Roll out to 25% of hosts at a time
  max_fail_percentage: 10   # Halt if >10% of hosts fail
  become: yes

  pre_tasks:
    - name: Record patch start time
      set_fact:
        patch_start_time: "{{ ansible_date_time.iso8601 }}"

    - name: Take snapshot (AWS EC2)
      amazon.aws.ec2_snapshot:   # ec2_snapshot ships in the amazon.aws collection
        instance_id: "{{ ec2_instance_id }}"
        # device_name is required alongside instance_id; ec2_device_name is an
        # assumed host variable (for example /dev/sda1)
        device_name: "{{ ec2_device_name }}"
        description: "Pre-patch snapshot for {{ cve_id }}"
        wait: false
      when: cloud_provider == "aws"

    - name: Verify target package is present
      package_facts:
        manager: auto

  tasks:
    # Only touch hosts where the package is already installed, so that
    # "state: latest" never installs it on a host that lacked it.
    - name: Update specific package (Debian/Ubuntu)
      apt:
        name: "{{ target_package }}"
        state: latest
        update_cache: yes
      when:
        - ansible_os_family == "Debian"
        - target_package in ansible_facts.packages
      register: apt_result

    - name: Update specific package (RHEL/CentOS/Amazon Linux)
      yum:
        name: "{{ target_package }}"
        state: latest
      when:
        - ansible_os_family == "RedHat"
        - target_package in ansible_facts.packages
      register: yum_result

    - name: Run post-patch health check
      script: scripts/health-check.sh
      register: health_check
      failed_when: health_check.rc != 0

  post_tasks:
    # Refresh package facts so they reflect the post-patch state
    - name: Verify vulnerable version no longer present
      package_facts:
        manager: auto

    - name: Update JIRA ticket with remediation evidence
      uri:
        url: "https://yourorg.atlassian.net/rest/api/3/issue/{{ jira_ticket }}/comment"
        method: POST
        headers:
          Authorization: "Bearer {{ jira_token }}"
          Content-Type: "application/json"
        body_format: json
        status_code: 201   # Jira returns 201 Created for a new comment
        body:
          body:
            type: doc
            version: 1
            content:
              - type: paragraph
                content:
                  - type: text
                    # now() gives the actual completion time; ansible_date_time
                    # is fixed at fact-gathering time
                    text: >
                      Patch applied on {{ inventory_hostname }}.
                      Package: {{ target_package }}.
                      Completed: {{ now(utc=true, fmt='%Y-%m-%dT%H:%M:%SZ') }}
Windows Patching: WSUS vs Intune vs Chocolatey
Windows patch automation has three dominant mechanisms, each appropriate for different scenarios:
WSUS (Windows Server Update Services)
WSUS is the on-premises Microsoft update distribution server. It's free, integrates with Group Policy, and remains a common enterprise Windows patch distribution mechanism. Automation on WSUS-managed clients typically uses PowerShell against the Windows Update Agent API:
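# Trigger Windows Update via PowerShell for a specific KB
$kb = '5040442'
$updateSession = New-Object -ComObject Microsoft.Update.Session
$updateSearcher = $updateSession.CreateUpdateSearcher()
# The search criteria string has no KB filter, so search for pending
# updates and filter on each update's KBArticleIDs afterwards
$searchResult = $updateSearcher.Search("IsInstalled=0 and Type='Software'")
$updatesToInstall = New-Object -ComObject Microsoft.Update.UpdateColl
foreach ($update in $searchResult.Updates) {
    if (@($update.KBArticleIDs) -contains $kb) {
        [void]$updatesToInstall.Add($update)
    }
}
if ($updatesToInstall.Count -gt 0) {
    $updatesDownloader = $updateSession.CreateUpdateDownloader()
    $updatesDownloader.Updates = $updatesToInstall
    [void]$updatesDownloader.Download()
    $updatesInstaller = $updateSession.CreateUpdateInstaller()
    $updatesInstaller.Updates = $updatesToInstall
    $installResult = $updatesInstaller.Install()
    # ResultCode 2 means the installation succeeded
    Write-Output "Install result: $($installResult.ResultCode)"
}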
Microsoft Intune (Cloud-Managed Endpoints)
For cloud-managed Windows endpoints, Intune enables policy-based automatic patching. The key automation levers: Windows Update for Business rings (Dev ring → Test ring → Broad ring), quality update deferral periods (7 days for critical, 14 days for standard), and feature update scheduling. Intune remediation scripts allow custom PowerShell execution on managed devices for targeted remediation beyond Windows Update.
Chocolatey (Third-Party Applications)
WSUS and Intune cover Microsoft software. Third-party applications (browsers, 7-Zip, VLC, Java, Adobe Reader) need a separate mechanism. Chocolatey Community Repository or Chocolatey Central Management handles this:
# Upgrade all installed packages to latest (run via Intune remediation script or Ansible)
choco upgrade all --yes --source="https://your-choco-server" `
    --except="vcredist*" `
    --log-file="C:\ProgramData\chocolatey\logs\upgrade.log"
# Upgrade specific package for CVE remediation
choco upgrade 7zip --version=24.9.0 --yes --force
Kubernetes Rolling Updates for Container Patches
# Update a deployment to use a patched container image
# (triggered by Renovate/Dependabot PR merge or manual promotion)
kubectl set image deployment/api-server \
  api=myregistry.io/api-server:v1.2.4
# Record why the image changed (the --record flag is deprecated)
kubectl annotate deployment/api-server --overwrite \
  kubernetes.io/change-cause="Patch image to v1.2.4"
# Monitor rollout progress; roll back only if the rollout fails or times out
kubectl rollout status deployment/api-server --timeout=5m \
  || kubectl rollout undo deployment/api-server
# Helm chart update (preferred for production)
helm upgrade api-server ./charts/api-server \
  --set image.tag=v1.2.4 \
  --wait --timeout 5m \
  --atomic # Automatically rolls back on failure
The --atomic flag in Helm is the key to safe automated Kubernetes patching. It applies the update, waits for rollout success (using readiness probes), and automatically rolls back to the previous version if the rollout fails or times out. Rollback needs no manual intervention; the system handles it.
Canary Deployments for High-Risk Patches
For patches to high-criticality systems where a failure would be catastrophic, canary deployment adds a safety layer: deploy to a small percentage of production traffic first, validate with automated health metrics, then proceed to full rollout only after the canary window passes.
# Kubernetes canary using weighted traffic split (Argo Rollouts)
# (selector and pod template omitted for brevity)
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: api-server
spec:
  replicas: 10
  strategy:
    canary:
      steps:
        - setWeight: 10            # 10% of traffic → canary
        - pause: {duration: 15m}   # Wait 15 minutes
        - analysis:                # Run automated analysis
            templates:
              - templateName: success-rate
            args:
              - name: service-name
                value: api-server-canary
        - setWeight: 50            # If analysis passes: 50% of traffic
        - pause: {duration: 10m}
        - setWeight: 100           # Full rollout
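The decision an analysis step like success-rate makes can be illustrated in plain Python. This is not Argo Rollouts code: the `canary_passes` name and the 95% default threshold are assumptions, and a real analysis template would pull these counts from a metrics provider.

```python
def canary_passes(total_requests: int, failed_requests: int,
                  min_success_rate: float = 0.95) -> bool:
    """Return True if the canary's success rate meets the threshold.

    A canary that received no traffic gives no evidence either way,
    so it is treated as a failure rather than promoted blindly.
    """
    if total_requests <= 0:
        return False
    success_rate = (total_requests - failed_requests) / total_requests
    return success_rate >= min_success_rate
```

A failed analysis is what stops the rollout before the weight moves from 10% to 50%.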
ServiceNow and Jira Integration for SLA Tracking
Patch automation is only complete when the ticketing and SLA tracking loop is closed automatically. Here's the integration pattern:
# Python snippet: Auto-create Jira ticket from CVE alert
import os
from datetime import date, timedelta

import httpx

JIRA_TOKEN = os.environ["JIRA_TOKEN"]  # API token supplied via the environment

# Remediation window in days per priority; P0 (2-hour SLA) is due the same day
SLA_DAYS = {"P0": 0, "P1": 30, "P2": 90}


def sla_due_date(priority: str) -> str:
    """Return the due date as YYYY-MM-DD, the format Jira's duedate field expects."""
    return (date.today() + timedelta(days=SLA_DAYS.get(priority, 90))).isoformat()


def create_patch_ticket(cve_id: str, affected_systems: list,
                        priority: str, epss: float, kev: bool) -> str:
    """Create Jira patch ticket with all enrichment data pre-populated."""
    jira_priority = {
        "P0": {"name": "Critical"},
        "P1": {"name": "High"},
        "P2": {"name": "Medium"},
    }.get(priority, {"name": "Low"})
    description = f"""
*CVE ID:* {cve_id}
*EPSS Score:* {epss:.3f} ({epss*100:.1f}% exploitation probability)
*CISA KEV:* {'YES. Emergency patch required' if kev else 'No'}
*Affected Systems:* {', '.join(affected_systems)}
*SLA:* {'2 hours' if priority == 'P0' else '24 hours triage, 30 days remediation' if priority == 'P1' else '90 days'}
*Remediation Steps:* [Auto-generated by CVEasy AI, see attachment]
*Verification:* Automated re-scan will confirm remediation within 24 hours of closure.
"""
    r = httpx.post(
        "https://yourorg.atlassian.net/rest/api/3/issue",
        auth=("api@yourorg.com", JIRA_TOKEN),
        json={
            "fields": {
                "project": {"key": "SEC"},
                "summary": f"[{priority}] Patch {cve_id} on {len(affected_systems)} systems",
                "issuetype": {"name": "Security Task"},
                "priority": jira_priority,
                "description": {"type": "doc", "version": 1,
                                "content": [{"type": "paragraph",
                                             "content": [{"type": "text",
                                                          "text": description}]}]},
                "labels": ["vulnerability-management", cve_id,
                           "kev" if kev else "standard"],
                "duedate": sla_due_date(priority),
            }
        },
    )
    r.raise_for_status()  # Fail loudly on auth or validation errors
    return r.json()["key"]  # Returns ticket ID, e.g., SEC-4821