
Building a Threat Intelligence Pipeline: How to Feed OSINT Into Your Vulnerability Program

Threat intelligence transforms vulnerability management from a scan-and-patch cycle into a threat-informed operation. This is the practitioner's guide to building a pipeline that enriches your CVE inventory with real attacker context.

CVEasy AI Research Team · February 28, 2026 · 11 min read

A threat intelligence pipeline correlates OSINT feeds, IOC databases, and exploitation telemetry against your CVE inventory, surfacing which vulnerabilities are being targeted right now.

Vulnerability management without threat intelligence is operating blind. You know what vulnerabilities exist. You don't know which ones adversaries are actively leveraging, which are in exploit frameworks right now, and which threat actor groups are targeting your sector this quarter. That gap (between scanner output and attacker behavior) is where most security teams lose.

Threat intelligence fills that gap. Not raw IOC dumps (IP reputation lists and hash blocklists belong to security operations, not vulnerability management), but strategic intelligence: exploitation telemetry, CVE-to-campaign mappings, and real-time exploit availability signals that tell you which of your scanner findings require emergency attention and which can wait for routine patching.

Feed quality varies enormously: The threat intel market is full of noise. Many commercial and free feeds have high false-positive rates, stale data, or irrelevant context for your environment. A poorly implemented threat intel pipeline adds noise without adding signal. This guide focuses on feeds with demonstrated utility for vulnerability prioritization, not general threat intelligence.

The Intelligence Feeds That Actually Matter for VM

CISA Known Exploited Vulnerabilities (KEV): Free, Essential

The CISA KEV catalog is the highest-signal, lowest-noise threat intel source available for vulnerability management. Every entry represents a CVE confirmed as actively exploited in real environments. Unlike predictive scores (EPSS) or threat actor reports, KEV represents confirmed exploitation, past tense, verified by CISA.

API: https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json
Update frequency: Multiple times per week
Recommended polling: Daily, with an alerting hook for new entries affecting your inventory
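
A daily poll with an alerting hook could be sketched roughly as follows. The function names, the `inventory` set of CVE IDs, and the `seen` set that remembers earlier alerts are illustrative assumptions; only the feed URL and the `vulnerabilities`, `cveID`, and `dueDate` fields come from the KEV feed itself.

```python
import json
import urllib.request

KEV_URL = "https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json"

def fetch_kev(url: str = KEV_URL) -> dict:
    """Download and parse the KEV catalog (network call, run once per day)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)

def new_kev_hits(catalog: dict, inventory: set[str], seen: set[str]) -> list[dict]:
    """Return KEV entries that affect the inventory and were not alerted on before.

    `inventory` and `seen` are hypothetical inputs: the CVE IDs your scanner
    reports, and the CVE IDs already alerted on in earlier polls.
    """
    hits = []
    for entry in catalog.get("vulnerabilities", []):
        cve = entry["cveID"]
        if cve in inventory and cve not in seen:
            hits.append(entry)
            seen.add(cve)  # remember it so the next poll does not re-alert
    return hits
```

In a scheduled job, `new_kev_hits(fetch_kev(), inventory, seen)` would feed whatever alerting hook you already use, with `seen` persisted between runs.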

EPSS API (FIRST.org): Free, High Precision

EPSS provides daily-updated exploitation probability scores (0–1) for every CVE in the NVD. It's trained on exploitation telemetry from honeypots, malware samples, and dark web monitoring. Unlike KEV (confirmed exploitation), EPSS is forward-looking: it predicts the probability of exploitation in the next 30 days.

API: https://api.first.org/data/1.0/epss?cve=CVE-2024-XXXXX
Bulk download: https://epss.cyentia.com/epss_scores-YYYY-MM-DD.csv.gz
Update frequency: Daily
Practical threshold: CVEs with EPSS > 0.4 should escalate to P1 queue
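
For large inventories, the bulk file is usually more practical than one API call per CVE. The sketch below assumes the bulk CSV starts with a `#` metadata line followed by a `cve,epss,percentile` header; the function names and the `inventory` set are hypothetical, and the 0.4 cutoff is the threshold suggested above.

```python
import csv
import gzip

def parse_epss_csv(raw: bytes) -> dict[str, float]:
    """Parse a gzipped EPSS bulk file into {cve_id: score}."""
    text = gzip.decompress(raw).decode("utf-8")
    # Drop leading '#' metadata lines so DictReader sees the column header first.
    lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    return {row["cve"]: float(row["epss"]) for row in csv.DictReader(lines)}

def p1_candidates(scores: dict[str, float], inventory: set[str],
                  threshold: float = 0.4) -> list[str]:
    """Inventory CVEs whose EPSS score exceeds the escalation threshold."""
    return sorted(cve for cve in inventory if scores.get(cve, 0.0) > threshold)
```

CVEs missing from the file are treated as score 0.0 here, which is a design choice for the sketch; you may prefer to flag them for manual review instead.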

AlienVault OTX: Free, Broad Coverage

OTX (Open Threat Exchange) aggregates threat intelligence from thousands of contributors globally. For VM purposes, the most useful OTX data is the CVE-linked pulses: community-submitted threat reports that tag specific CVEs as actively exploited, associated with specific campaigns, or targeted by specific actor groups.

API: https://otx.alienvault.com/api/v1/indicators/CVE/{cve_id}/general
Authentication: Free API key from OTX account
Signal quality: Medium, community-sourced, variable quality. Use as corroborating signal, not primary.
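
One way to use OTX strictly as a corroborating signal is to reduce each response to a pulse count and a yes/no flag. This is a sketch under assumptions: it assumes the response carries a `pulse_info` object with `count` and `pulses` fields and that the key is sent in an `X-OTX-API-KEY` header; the function names and the `min_pulses` cutoff are hypothetical and should be tuned to your tolerance for noise.

```python
import json
import urllib.request

def fetch_otx_general(cve_id: str, api_key: str) -> dict:
    """Fetch the OTX 'general' section for a CVE (network call)."""
    req = urllib.request.Request(
        f"https://otx.alienvault.com/api/v1/indicators/CVE/{cve_id}/general",
        headers={"X-OTX-API-KEY": api_key},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

def summarize_otx(general: dict, min_pulses: int = 2) -> dict:
    """Reduce an OTX response to a corroboration signal.

    Assumes a `pulse_info` object; missing fields are treated as "no pulses".
    """
    info = general.get("pulse_info", {})
    pulses = info.get("pulses", [])
    count = info.get("count", len(pulses))
    return {
        "pulse_count": count,
        "pulse_names": [p.get("name", "") for p in pulses[:5]],
        # A single community pulse is weak evidence, so require several.
        "corroborates": count >= min_pulses,
    }
```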

GreyNoise: Freemium, High Signal for Internet-Facing CVEs

GreyNoise tags internet-wide scanning and exploitation attempts observed across its sensor network. For CVEs affecting internet-facing services, GreyNoise can tell you: "this CVE is being mass-scanned right now" or "exploitation attempts observed from X IPs in the last 24 hours." This is a high-quality signal for T1190-class vulnerabilities (MITRE ATT&CK T1190, Exploit Public-Facing Application).

API: https://api.greynoise.io/v3/community/{ip} (community, free)
CVE-specific data: https://api.greynoise.io/v2/experimental/gnql/stats?query=cve:{CVE_ID}
High-value use case: Any internet-facing asset CVE with GreyNoise activity should be treated as P0

Shodan: Freemium, Exposure Intelligence

Shodan doesn't directly provide CVE intelligence, but it answers a critical question: "how many internet-exposed instances of this vulnerable software exist?" For CVEs affecting widely-deployed infrastructure (VPNs, firewalls, Exchange servers), Shodan queries tell you the scale of your exposure relative to the broader attack surface.

API example: https://api.shodan.io/shodan/host/search?key={API_KEY}&query=product:Confluence+port:443
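
Shodan filters all belong inside the single `query` parameter, separated by spaces, so the URL is easiest to build with proper encoding. A minimal sketch, where `shodan_search_url` is a hypothetical helper and the filter names (`product`, `port`) are the ones from the example above:

```python
from urllib.parse import urlencode

def shodan_search_url(api_key: str, **filters: object) -> str:
    """Build a Shodan host-search URL from filter=value pairs."""
    # Filters are joined with spaces and sent as ONE query parameter.
    query = " ".join(f"{name}:{value}" for name, value in filters.items())
    return "https://api.shodan.io/shodan/host/search?" + urlencode(
        {"key": api_key, "query": query}
    )

url = shodan_search_url("YOUR_API_KEY", product="Confluence", port=443)
```

The result's total count gives the exposure figure to compare against your own footprint.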

CIRCL MISP Threat Sharing: Free, Community Intel

MISP (Malware Information Sharing Platform) instances operated by CIRCL and sector-specific ISACs provide structured threat intelligence including CVE-to-campaign mappings. For government, healthcare, and financial sector orgs, MISP access through your ISAC provides sector-relevant CVE targeting data.

Building the Correlation Engine

Raw feed data doesn't help until it's correlated against your CVE inventory. The correlation engine answers: "of the CVEs in my environment, which ones appear in threat intelligence feeds?"

import httpx
import asyncio
from datetime import datetime, timezone

class ThreatIntelPipeline:
  def __init__(self, otx_key: str, greynoise_key: str):
    self.otx_key = otx_key  # kept for OTX lookups; not used by the methods below
    self.greynoise_key = greynoise_key
    self.kev_cache = {}   # cveID -> full KEV entry
    self.epss_cache = {}  # cveID -> EPSS score

  async def load_kev(self):
    """Load full CISA KEV catalog into memory."""
    url = "https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json"
    async with httpx.AsyncClient(timeout=30) as client:
      r = await client.get(url)
      r.raise_for_status()  # fail loudly rather than cache an error page
      data = r.json()
      self.kev_cache = {v["cveID"]: v for v in data["vulnerabilities"]}
    print(f"Loaded {len(self.kev_cache)} KEV entries")

  async def get_epss(self, cve_id: str) -> float:
    """Fetch EPSS score for a CVE (cached after the first lookup)."""
    if cve_id in self.epss_cache:
      return self.epss_cache[cve_id]
    async with httpx.AsyncClient(timeout=30) as client:
      r = await client.get(f"https://api.first.org/data/1.0/epss?cve={cve_id}")
      data = r.json()
      if data.get("data"):
        score = float(data["data"][0]["epss"])
        self.epss_cache[cve_id] = score
        return score
    # CVEs without a published score fall back to 0.0
    return 0.0

  async def get_greynoise(self, cve_id: str) -> dict:
    """Check if CVE is being actively mass-exploited via GreyNoise."""
    # Query GNQL's cve field directly; GreyNoise tag names are human-readable
    # and cannot be derived reliably from a CVE ID.
    async with httpx.AsyncClient(timeout=30) as client:
      r = await client.get(
        "https://api.greynoise.io/v2/experimental/gnql/stats",
        params={"query": f"cve:{cve_id}"},
        headers={"key": self.greynoise_key}
      )
      # Any non-200 response (bad key, quota, no access) is treated as "no data"
      return r.json() if r.status_code == 200 else {}

  def is_kev(self, cve_id: str) -> bool:
    return cve_id in self.kev_cache

  async def enrich_cve_with_intel(self, cve_id: str) -> dict:
    """Full threat intel enrichment for a single CVE."""
    # EPSS and GreyNoise lookups are independent, so run them concurrently
    epss, greynoise = await asyncio.gather(
      self.get_epss(cve_id),
      self.get_greynoise(cve_id)
    )
    kev_entry = self.kev_cache.get(cve_id, {})
    gn_count = greynoise.get("count") or 0

    # Threat intel priority score; the three terms sum to at most 90
    ti_score = (
      (epss * 40) +
      (30 if self.is_kev(cve_id) else 0) +
      (min(gn_count / 100, 1) * 20) # normalize GreyNoise count, capped at 100
    )

    return {
      "cve_id": cve_id,
      "epss": epss,
      "kev": bool(kev_entry),
      "kev_due_date": kev_entry.get("dueDate"),
      "greynoise_count": gn_count,
      "ti_score": round(ti_score, 2),
      "enriched_at": datetime.now(timezone.utc).isoformat()
    }
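
The scoring arithmetic inside `enrich_cve_with_intel` is easier to sanity-check when pulled out as a pure function. The `ti_score` helper below is a hypothetical extraction of the same formula, not part of the class above; the input values are made up for illustration.

```python
def ti_score(epss: float, in_kev: bool, gn_count: int) -> float:
    """Same weighting as the pipeline: EPSS up to 40, KEV 30, GreyNoise up to 20."""
    return round(
        epss * 40
        + (30 if in_kev else 0)
        + min(gn_count / 100, 1) * 20,  # GreyNoise count saturates at 100
        2,
    )

# Worked examples (illustrative inputs):
# EPSS 0.5, in KEV, 50 GreyNoise hits: 20 + 30 + 10 = 60
mid_case = ti_score(0.5, True, 50)
# EPSS 1.0, in KEV, 500 GreyNoise hits: 40 + 30 + 20 = 90 (the ceiling)
max_case = ti_score(1.0, True, 500)
```

Because the weights sum to 90 rather than 100, any thresholds built on `ti_score` should be read against a 0 to 90 range.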

IOC to CVE Mapping

Indicators of Compromise (IOCs) such as IP addresses, domains, and file hashes don't map directly to CVEs. But threat intelligence reports that publish IOCs typically also reference the CVEs used to gain initial access. The mapping pipeline:

  1. Monitor threat reports for CVE references: Mandiant, CrowdStrike, Recorded Future, and government advisories (CISA, FBI, NSA joint advisories) routinely name specific CVEs used in campaigns. Parse these for CVE IDs.
  2. Track campaigns associated with CVEs: When a campaign is attributed to specific CVEs (e.g., "this threat group used CVE-2024-XXXXX for initial access"), add that campaign association to your CVE record.
  3. Cross-reference with your scanner inventory: If your environment has the CVE, and it's associated with an active campaign targeting your sector, that CVE is an emergency regardless of EPSS score.
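
The three steps above can be sketched as a small text-mining pass. This is a minimal illustration, assuming report text has already been fetched; the `reports` mapping of campaign name to report text and the `inventory` set are hypothetical inputs.

```python
import re
from collections import defaultdict

# CVE IDs are "CVE-", a 4-digit year, then a sequence number of 4 or more digits.
CVE_RE = re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE)

def extract_cves(text: str) -> set[str]:
    """Step 1: pull every CVE ID out of a report, normalized to upper case."""
    return {match.upper() for match in CVE_RE.findall(text)}

def map_campaigns(reports: dict[str, str], inventory: set[str]) -> dict[str, list[str]]:
    """Steps 2 and 3: associate campaigns with CVEs, keeping only CVEs you actually have."""
    hits = defaultdict(list)
    for campaign, text in reports.items():
        for cve in extract_cves(text) & inventory:
            hits[cve].append(campaign)
    return dict(hits)
```

Any CVE returned by `map_campaigns` has both a presence in your environment and a named campaign behind it, which is the emergency condition described in step 3 once sector targeting is confirmed.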

Feed Weighting and Trust Levels

Not all intel sources are equal. Assign trust weights based on source reliability, timeliness, and false-positive rate:

Threat Intel Source Trust Weighting for VM Prioritization

| Source | Type | Trust Weight | Rationale |
| --- | --- | --- | --- |
| CISA KEV | Confirmed exploitation | 1.0 (max) | Authoritative, verified, near-zero false positive |
| EPSS (>0.7) | Predictive probability | 0.85 | ML-based, trained on telemetry, well-calibrated |
| GreyNoise (active) | Observed mass exploitation | 0.80 | Real sensor data, confirms active scanning/exploit |
| Vendor Advisory | Exploitation confirmation | 0.75 | Vendor-confirmed but may lag; usually high quality |
| AlienVault OTX | Community reporting | 0.45 | Variable quality; use as corroborating signal only |
| Social media/Twitter | Informal reporting | 0.20 | Useful for early warning only; extremely high FP rate |
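
The article does not prescribe how to combine weights when several sources fire for the same CVE. One possible approach, shown here purely as an assumption, is a noisy-OR: treat each weight as an independent probability that the signal is real and compute the chance that at least one is. The `TRUST` keys and `combined_confidence` are hypothetical names; the weights are the ones listed above.

```python
from math import prod

# Trust weights from the table above, keyed by short hypothetical source names.
TRUST = {
    "kev": 1.0,
    "epss_high": 0.85,
    "greynoise": 0.80,
    "vendor": 0.75,
    "otx": 0.45,
    "social": 0.20,
}

def combined_confidence(fired: list[str]) -> float:
    """Noisy-OR over the sources that reported the CVE (an assumed combination rule)."""
    return round(1 - prod(1 - TRUST[source] for source in fired), 3)

# Two weak sources corroborate each other: 1 - (0.55 * 0.80) = 0.56
weak_pair = combined_confidence(["otx", "social"])
```

Under this rule, low-trust sources can raise confidence together but never outrank a single KEV hit, which matches the "corroborating signal only" guidance for OTX and social media.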

Alert Thresholds: When to Escalate

The correlation engine should generate alerts at three levels: P0, P1, and P2.

The signal-to-action pipeline: The pipeline only delivers value if it produces alerts that trigger action, not reports. Each P0 trigger should automatically create an incident ticket, notify the on-call engineer, and initiate a war-room Slack/Teams channel. P1 triggers should create a triage ticket with all enrichment data pre-populated. P2 should add to the standard remediation queue. If your threat intel pipeline generates alerts that humans then manually decide to act on, you've added a threat intel report, not a threat intel pipeline.
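
A routing function makes the "alerts trigger action" requirement concrete. The sketch below reuses thresholds stated earlier in this guide (GreyNoise activity on an internet-facing asset is P0, an active campaign targeting your sector is an emergency, EPSS above 0.4 is P1). Treating any KEV entry as P0 is an assumption of this sketch, and the `Enriched` record and action names are hypothetical placeholders for your ticketing and paging integrations.

```python
from dataclasses import dataclass

@dataclass
class Enriched:
    """Hypothetical enrichment record for one CVE on one asset."""
    cve_id: str
    epss: float
    kev: bool
    greynoise_count: int
    internet_facing: bool = False
    active_campaign: bool = False  # campaign currently targeting your sector

# Action names are placeholders for real integrations (ticketing, paging, chat).
ACTIONS = {
    "P0": ["create_incident_ticket", "notify_on_call", "open_war_room_channel"],
    "P1": ["create_triage_ticket_with_enrichment"],
    "P2": ["add_to_remediation_queue"],
}

def alert_level(e: Enriched) -> str:
    """Map enrichment data to an alert level."""
    if (e.internet_facing and e.greynoise_count > 0) or e.active_campaign or e.kev:
        return "P0"  # KEV-as-P0 is an assumption of this sketch
    if e.epss > 0.4:
        return "P1"
    return "P2"

def route(e: Enriched) -> tuple[str, list[str]]:
    """Return the level and the automated actions it should trigger."""
    level = alert_level(e)
    return level, ACTIONS[level]
```

Keeping the level-to-action mapping in data rather than in branches makes it straightforward to audit that every level fires something automatically.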
