From 8912a364a0285464b0bcbe26133e834e17744179 Mon Sep 17 00:00:00 2001
From: shensz2017
Date: Tue, 9 Jun 2026 05:39:06 +0800
Subject: [PATCH] Improve alert triage timeline fidelity gates

---
 skills/secops/alert-triage/SKILL.md           | 50 ++++++++++++++++++-
 ...ert-triage-timeline-fidelity-verified.yaml | 47 +++++++++++++++++
 ...relation-window-misses-delayed-events.yaml | 37 ++++++++++++++
 3 files changed, 133 insertions(+), 1 deletion(-)
 create mode 100644 tests/benign/alert-triage-timeline-fidelity-verified.yaml
 create mode 100644 tests/vulnerable/alert-triage-correlation-window-misses-delayed-events.yaml

diff --git a/skills/secops/alert-triage/SKILL.md b/skills/secops/alert-triage/SKILL.md
index 927e7d68..8befe97a 100644
--- a/skills/secops/alert-triage/SKILL.md
+++ b/skills/secops/alert-triage/SKILL.md
@@ -13,7 +13,7 @@ phase: [operate, respond]
 frameworks: [MITRE-ATT&CK-v16, NIST-SP-800-61-Rev2]
 difficulty: beginner
 time_estimate: "10-20min per alert"
-version: "1.0.0"
+version: "1.0.1"
 author: unitoneai
 license: MIT
 allowed-tools: Read, Grep, Glob
@@ -52,6 +52,7 @@ Before beginning triage, gather or confirm:
 
 - [ ] **Alert details:** Rule name, severity, timestamp, source system (SIEM, EDR, IDS, cloud security).
 - [ ] **Alert data:** The raw event(s) that triggered the alert -- including all available fields (source IP, destination IP, username, hostname, process name, command line, file hash, URL).
+- [ ] **Timeline fidelity:** Event time, ingestion time, timezone, source clock skew, duplicate-event handling, and data-source latency for correlated sources.
 - [ ] **ATT&CK mapping:** If the alert rule maps to a MITRE ATT&CK technique, note the technique ID.
 - [ ] **Asset context:** What is the affected asset? (Server, workstation, cloud instance, network device.) What is its business criticality? (Revenue-generating, customer-facing, development, test.)
 - [ ] **User context:** Who is the associated user? (Role, department, normal working hours, recent activity patterns.)
@@ -104,6 +105,38 @@ Connect the alert data with surrounding context to build a picture of what happe
 | Lateral Movement (TA0008) | Collection (TA0009), Exfiltration (TA0010) -- what was the objective? |
 | Command and Control (TA0011) | All tactics -- C2 implies an active intrusion; look for the full chain |
 
+#### Timeline Fidelity and Correlation Window Evidence
+
+Before concluding that no related activity exists, validate that the correlation window is trustworthy across data sources.
+
+**What to verify:**
+
+- Normalize alert, endpoint, identity, cloud, proxy, DNS, and network timestamps to UTC.
+- Record both event time and ingestion/index time where available.
+- Check source clock skew, especially for endpoints, network appliances, VPN concentrators, and SaaS audit feeds.
+- Expand the correlation window when ingestion delay, batch delivery, or offline endpoint upload can move related events outside +/- 30 minutes.
+- Deduplicate repeated alerts and fan-out events by stable event ID, process GUID, session ID, request ID, or provider correlation ID.
+- Preserve raw event IDs and data-source names so later responders can reproduce the timeline.
+- Record missing or delayed data sources as timeline gaps, not as proof that activity did not occur.
+
+**Timeline evidence table:**
+
+| Source | Event Time Field | Ingestion Time Field | Timezone | Latency / Skew | Dedup Key | Coverage Decision |
+|--------|------------------|----------------------|----------|----------------|-----------|-------------------|
+| SIEM alert | | | UTC/local | | | reliable/expand window/gap |
+| EDR | | | UTC/local | | | reliable/expand window/gap |
+| Cloud audit | | | UTC/local | | | reliable/expand window/gap |
+
+**Finding conditions during triage:**
+
+| Condition | Triage Impact |
+|---|---|
+| Correlated source has ingestion delay greater than the review window | Expand the window before lowering priority |
+| Source timezone or clock skew is unknown for key evidence | Lower confidence and document as a timeline gap |
+| Duplicate alerts are counted as separate attacker actions without dedup key review | Recompute priority after deduplication |
+| Missing EDR/cloud/network source is treated as "no related activity" | Mark correlation as incomplete and consider escalation if asset/user risk is high |
+| Raw event IDs or replayable query references are absent | Document evidence weakness and avoid final FP disposition if other risk factors remain |
+
 ### Phase 3: Classify
 
 Assign a disposition and priority based on collected and correlated data.
@@ -234,6 +267,11 @@ Produce the triage decision as a structured report:
 - **Threat Intel:** [IOC match results]
 - **Kill Chain Position:** [Where this falls in the attack lifecycle]
 
+### Timeline Fidelity
+| Source | Event Time | Ingestion Time | Timezone/Skew | Dedup Key | Coverage Decision |
+|--------|------------|----------------|---------------|-----------|-------------------|
+| [source] | [timestamp] | [timestamp] | [status] | [id] | [reliable / expanded / gap] |
+
 ### Recommended Actions
 - [ ] [Action 1 -- e.g., isolate host, disable account, block IP]
 - [ ] [Action 2 -- e.g., collect forensic artifacts, memory dump]
@@ -319,6 +357,10 @@ Investigating an alert in isolation without checking for activity before and aft
 
 Waiting for complete certainty before escalating a high-priority alert costs response time. NIST SP 800-61 recommends erring on the side of over-notification. If 20 minutes of investigation has not resolved the disposition and the alert involves a critical asset or privileged account, escalate to Tier 2 or the IR team with your current findings and continue investigation in parallel.
 
+### Pitfall 6: Treating Timeline Gaps as Negative Evidence
+
+No matching event in a +/- 30 minute query does not prove no related activity occurred if endpoint uploads were delayed, cloud logs are batched, appliance clocks are skewed, or timestamps use different timezones. Normalize timestamps, check ingestion time, and document delayed or missing sources before closing as false positive.
+
 ---
 
 ## 8. Prompt Injection Safety Notice
@@ -344,3 +386,9 @@ This skill processes user-supplied content that may include alert payloads, log
 7. **Microsoft Sentinel Incident Triage** -- https://learn.microsoft.com/en-us/azure/sentinel/investigate-incidents
 8. **Splunk Enterprise Security Notable Event Triage** -- https://docs.splunk.com/Documentation/ES/latest/User/TriageNotableEvents
 9. **NIST Cybersecurity Framework (CSF) 2.0 -- Detect Function** -- https://www.nist.gov/cyberframework
+
+---
+
+## 10. Changelog
+
+- **1.0.1** -- Add timeline fidelity, timezone normalization, ingestion latency, clock skew, deduplication, and raw-event evidence gates for alert correlation.
diff --git a/tests/benign/alert-triage-timeline-fidelity-verified.yaml b/tests/benign/alert-triage-timeline-fidelity-verified.yaml
new file mode 100644
index 00000000..9ed01967
--- /dev/null
+++ b/tests/benign/alert-triage-timeline-fidelity-verified.yaml
@@ -0,0 +1,47 @@
+scenario: alert_triage_timeline_fidelity_verified
+skill: alert-triage
+expected_result: do_not_flag_timeline_fidelity_gap
+description: >
+  The triage report normalizes event time and ingestion time across SIEM, EDR,
+  identity, and cloud audit sources, expands the window for delayed telemetry,
+  and deduplicates repeated alerts before disposition.
+evidence:
+  alert:
+    id: SIEM-88422
+    rule: suspicious_powershell
+    event_time: 2026-06-08T17:00:00Z
+    ingestion_time: 2026-06-08T17:01:10Z
+    timezone_normalized: true
+  timeline_sources:
+    - source: SIEM
+      event_time_field: TimeGenerated
+      ingestion_time_field: ingestion_time
+      timezone: UTC
+      latency_p95_minutes: 2
+      dedup_key: alert_id
+      coverage_decision: reliable
+    - source: EDR
+      event_time_field: event.timestamp
+      ingestion_time_field: ingest.timestamp
+      timezone: UTC
+      latency_p95_minutes: 18
+      dedup_key: process_guid
+      coverage_decision: expanded_window
+    - source: CloudAudit
+      event_time_field: eventTime
+      ingestion_time_field: receiveTime
+      timezone: UTC
+      latency_p95_minutes: 45
+      dedup_key: request_id
+      coverage_decision: expanded_window
+  correlation:
+    base_window: plus_minus_30_minutes
+    expanded_window: plus_minus_2_hours
+    raw_event_ids_retained: true
+    duplicate_alerts_collapsed: true
+    timeline_gaps: []
+assertions:
+  - all timestamps are normalized to UTC
+  - ingestion latency drives an expanded search window
+  - dedup keys prevent duplicate alerts from inflating priority
+  - raw event IDs are retained for reproducibility
diff --git a/tests/vulnerable/alert-triage-correlation-window-misses-delayed-events.yaml b/tests/vulnerable/alert-triage-correlation-window-misses-delayed-events.yaml
new file mode 100644
index 00000000..1d618fc9
--- /dev/null
+++ b/tests/vulnerable/alert-triage-correlation-window-misses-delayed-events.yaml
@@ -0,0 +1,37 @@
+scenario: alert_triage_correlation_window_misses_delayed_events
+skill: alert-triage
+expected_result: flag_timeline_fidelity_gap
+description: >
+  An alert is closed as a false positive because no events were found in a
+  +/- 30 minute search, but endpoint and cloud logs arrived late and were never
+  normalized by event time, ingestion time, or timezone.
+evidence:
+  alert:
+    id: SIEM-88421
+    rule: suspicious_powershell
+    event_time: 2026-06-08T10:00:00-0700
+    ingestion_time: 2026-06-08T17:01:30Z
+    timezone_normalized: false
+  correlation_query:
+    window: plus_minus_30_minutes
+    searched_by: ingestion_time
+    raw_event_ids_retained: false
+    disposition: false_positive
+  delayed_sources:
+    - source: EDR
+      event_time: 2026-06-08T17:08:00Z
+      ingestion_time: 2026-06-08T18:22:00Z
+      related_activity: credential_dump_process
+    - source: CloudAudit
+      event_time: 2026-06-08T17:16:00Z
+      ingestion_time: 2026-06-08T19:10:00Z
+      related_activity: suspicious_role_assignment
+  timeline_quality:
+    clock_skew_checked: false
+    dedup_key_used: false
+    missing_source_recorded_as_gap: false
+assertions:
+  - related events existed by event time but were missed by ingestion-time search
+  - timezone normalization was not performed
+  - delayed sources were treated as no activity
+  - raw event identifiers were not retained for replay
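
Review note (illustrative, not part of the patch): the failure mode in the vulnerable fixture can be reproduced with a few lines of Python. This is a minimal sketch, assuming the +/- 30 minute query is centered on the alert's own timestamp; the helper names `to_utc` and `correlate` are hypothetical and do not exist in the repository. The timestamps are copied from the fixture above.

```python
from datetime import datetime, timedelta, timezone


def to_utc(ts: str) -> datetime:
    """Parse an ISO-8601 timestamp with an explicit offset and convert it to UTC."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S%z").astimezone(timezone.utc)


# Values copied from the vulnerable fixture; the alert event time is local (-0700).
alert_event = to_utc("2026-06-08T10:00:00-0700")
alert_ingest = to_utc("2026-06-08T17:01:30Z")
sources = {
    "EDR": {
        "event_time": "2026-06-08T17:08:00Z",
        "ingestion_time": "2026-06-08T18:22:00Z",
    },
    "CloudAudit": {
        "event_time": "2026-06-08T17:16:00Z",
        "ingestion_time": "2026-06-08T19:10:00Z",
    },
}
window = timedelta(minutes=30)


def correlate(center: datetime, field: str) -> list[str]:
    """Return the sources whose `field` timestamp lies within +/- window of center."""
    return sorted(
        name
        for name, event in sources.items()
        if abs(to_utc(event[field]) - center) <= window
    )


# Assumption: the flawed query filters on ingestion time, as `searched_by` states.
by_ingestion = correlate(alert_ingest, "ingestion_time")
# The corrected query filters on UTC-normalized event time.
by_event = correlate(alert_event, "event_time")

print("matched by ingestion time:", by_ingestion)
print("matched by event time:", by_event)
```

Under these assumptions the ingestion-time search matches neither source, while the event-time search matches both the EDR and CloudAudit records, which is the gap that `expected_result: flag_timeline_fidelity_gap` is meant to catch.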