
Incident Deduplication

Overwatch uses AI-powered semantic analysis to automatically detect when multiple alerts describe the same underlying issue, grouping them into a single incident instead of creating duplicates.

When an alert arrives via webhook, Overwatch runs a multi-stage deduplication pipeline:

  1. Field extraction: The alert parser normalizes the incoming payload into a standard format (source, severity, title, description, affected service)
  2. Fingerprint generation: A deterministic fingerprint is computed from the configured fingerprint fields (by default source, rule ID, and host) for fast exact-match lookups
  3. Semantic comparison: If no fingerprint match is found, the alert description is compared against recent open incidents using vector similarity via Weaviate
  4. Threshold evaluation: Alerts scoring at or above the similarity threshold (default 0.85) are merged into the existing incident; below the threshold, a new incident is created
  5. Merge enrichment: When an alert is merged, its unique details are appended to the incident timeline, and the incident severity is escalated if the new alert is more severe
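
The stages above can be sketched roughly as follows. This is an illustrative outline, not Overwatch's actual implementation: `Incident`, `fingerprint_of`, `process_alert`, and the toy `token_overlap` scorer are hypothetical names, the alert is assumed to be already normalized (stage 1), and severity is assumed to be an integer where a higher value means more severe.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Incident:
    fingerprint: str
    description: str
    severity: int  # assumption: higher number = more severe
    timeline: List[Dict] = field(default_factory=list)


def fingerprint_of(alert: Dict) -> str:
    # Stage 2 stand-in: join the default fingerprint fields.
    return "|".join(str(alert[k]) for k in ("source", "rule_id", "host"))


def token_overlap(a: str, b: str) -> float:
    # Toy similarity (Jaccard on words); a real system would compare embeddings.
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def process_alert(
    alert: Dict,
    open_incidents: List[Incident],
    similarity: Callable[[str, str], float],
    threshold: float = 0.85,
) -> Tuple[Incident, bool]:
    """Return (incident, merged) for an already-normalized alert."""
    fp = fingerprint_of(alert)
    # Stage 2: fast exact-match lookup.
    match = next((inc for inc in open_incidents if inc.fingerprint == fp), None)
    # Stage 3: semantic comparison runs only when no fingerprint matched.
    if match is None:
        scored = [(similarity(alert["description"], inc.description), inc)
                  for inc in open_incidents]
        if scored:
            score, candidate = max(scored, key=lambda pair: pair[0])
            # Stage 4: merge only at or above the threshold.
            if score >= threshold:
                match = candidate
    if match is None:
        incident = Incident(fp, alert["description"], alert["severity"], [alert])
        open_incidents.append(incident)
        return incident, False
    # Stage 5: append to the timeline and escalate severity if warranted.
    match.timeline.append(alert)
    match.severity = max(match.severity, alert["severity"])
    return match, True
```

Passing the similarity function in as a parameter keeps the sketch independent of any particular vector store.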

Exact field matching for obvious duplicates:

  • Same source platform + same alert rule ID + same affected host
  • Matches in under 10ms
  • Zero false positives
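
One way such a deterministic fingerprint could be computed is by hashing the normalized field values. The sketch below is an assumption for illustration: the function name `compute_fingerprint`, the use of SHA-256, and the lowercase/trim normalization are not taken from Overwatch.

```python
import hashlib

# Default fields from the settings table; the hashing scheme is an assumption.
DEFAULT_FINGERPRINT_FIELDS = ("source", "rule_id", "host")


def compute_fingerprint(alert: dict, fields=DEFAULT_FINGERPRINT_FIELDS) -> str:
    """Hash the selected fields so identical alerts map to the same key."""
    # Normalize case and whitespace so cosmetic differences do not matter.
    parts = [f"{name}={str(alert.get(name, '')).strip().lower()}" for name in fields]
    # Join with a separator unlikely to appear in field values.
    joined = "\x1f".join(parts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()
```

Because only the listed fields feed the hash, two alerts with different titles or descriptions still share a fingerprint when their source, rule ID, and host agree.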

Vector similarity for alerts that describe the same problem differently:

  • “High CPU on web-server-01” matches “web-server-01 CPU utilization critical”
  • “Database connection pool exhausted” matches “PostgreSQL max_connections reached”
  • Catches cross-platform duplicates (e.g., Datadog CPU alert + Prometheus node_cpu alert for the same host)
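
To illustrate the idea of vector similarity, the sketch below scores an alert embedding against open-incident embeddings with cosine similarity. It is a simplified stand-in: the source does not state which distance metric Overwatch configures in Weaviate, the vectors are toy values rather than real embeddings, and `cosine_similarity` and `best_match` are hypothetical helpers.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(
    alert_vec: Sequence[float],
    incident_vecs: Dict[str, Sequence[float]],
    threshold: float = 0.85,
) -> Optional[Tuple[str, float]]:
    """Return (incident_id, score) for the closest incident at or above threshold."""
    best: Optional[Tuple[str, float]] = None
    for incident_id, vec in incident_vecs.items():
        score = cosine_similarity(alert_vec, vec)
        if score >= threshold and (best is None or score > best[1]):
            best = (incident_id, score)
    return best
```

In practice the vector store performs this nearest-neighbor search itself; the loop here only shows what the threshold comparison means.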

Alerts arriving within a configurable time window (default: 30 minutes) for the same service are candidates for grouping even at lower similarity scores.
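
The time-window rule might be expressed as below. The source does not say how much lower the score may be, so `relaxed_threshold` (0.70 here) is a purely hypothetical parameter, as is the function name `should_group`.

```python
from datetime import datetime, timedelta


def should_group(
    similarity: float,
    same_service: bool,
    alert_time: datetime,
    incident_last_seen: datetime,
    threshold: float = 0.85,
    time_window_minutes: int = 30,
    relaxed_threshold: float = 0.70,  # hypothetical value, not from Overwatch
) -> bool:
    """Decide whether an alert is grouped into an existing incident."""
    # The normal rule: at or above the similarity threshold always merges.
    if similarity >= threshold:
        return True
    # The relaxed rule: same service and close in time allows a lower score.
    within_window = abs(alert_time - incident_last_seen) <= timedelta(
        minutes=time_window_minutes
    )
    return same_service and within_window and similarity >= relaxed_threshold
```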

Deduplication is enabled by default for all organizations. You can adjust behavior per integration:

| Setting | Default | Description |
| --- | --- | --- |
| dedup_enabled | true | Enable/disable deduplication for this integration |
| similarity_threshold | 0.85 | Minimum vector similarity score to merge (0.0–1.0) |
| time_window_minutes | 30 | Time window for grouping related alerts |
| fingerprint_fields | source, rule_id, host | Fields used for exact-match fingerprinting |
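
As a sketch of how these per-integration settings could be modeled and validated, the class below mirrors the setting names and defaults from the table. The `DedupSettings` class itself is hypothetical, and the rules that the time window must be positive and that at least one fingerprint field is required are assumptions; only the 0.0–1.0 range for the threshold comes from the table.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DedupSettings:
    dedup_enabled: bool = True
    similarity_threshold: float = 0.85
    time_window_minutes: int = 30
    fingerprint_fields: Tuple[str, ...] = ("source", "rule_id", "host")

    def __post_init__(self) -> None:
        # Documented range for the similarity threshold.
        if not 0.0 <= self.similarity_threshold <= 1.0:
            raise ValueError("similarity_threshold must be between 0.0 and 1.0")
        # Assumed constraints; the source does not state them.
        if self.time_window_minutes <= 0:
            raise ValueError("time_window_minutes must be positive")
        if not self.fingerprint_fields:
            raise ValueError("fingerprint_fields must not be empty")
```

For example, `DedupSettings(similarity_threshold=0.75)` keeps every other default and only loosens the merge threshold.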

In the incident detail view, merged alerts appear in the Alert Timeline:

  • Each contributing alert shows its source, arrival time, and original severity
  • The incident severity reflects the highest severity among all merged alerts
  • The alert count badge on the incident card shows how many raw alerts were grouped
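
The relationship between merged alerts and what the incident view displays can be sketched like this. The severity names, their ordering in `SEVERITY_RANK`, the alert field names, and the `summarize_incident` helper are all assumptions made for illustration.

```python
from typing import Dict, List

# Assumed severity scale; Overwatch's actual levels may differ.
SEVERITY_RANK = {"info": 0, "warning": 1, "critical": 2}


def summarize_incident(alerts: List[Dict]) -> Dict:
    """Derive incident-level display values from its merged alerts."""
    # Incident severity is the highest severity among all merged alerts.
    highest = max(alerts, key=lambda a: SEVERITY_RANK[a["severity"]])
    # Timeline entries keep each alert's source, arrival time, and original severity.
    timeline = sorted(
        (
            {"source": a["source"], "arrived_at": a["arrived_at"], "severity": a["severity"]}
            for a in alerts
        ),
        key=lambda entry: entry["arrived_at"],  # ISO 8601 strings sort chronologically
    )
    return {
        "severity": highest["severity"],
        "alert_count": len(alerts),  # value shown on the alert count badge
        "timeline": timeline,
    }
```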

Deduplication works across all integrated platforms:

  • Datadog, New Relic, PagerDuty, Grafana, Prometheus, Elasticsearch, SigNoz
  • Cross-platform deduplication catches the same issue reported by different tools

To tune deduplication behavior:

  • Lower the threshold (e.g., 0.75) if you see duplicate incidents that should have been merged
  • Raise the threshold (e.g., 0.92) if unrelated alerts are being incorrectly grouped
  • Expand the time window for environments with bursty alerting patterns
  • Narrow fingerprint fields if you need more granular incident separation