Developer Implementation Guide

Table of Contents

Monitor systems and automate breach alerts

Detect threats fast, escalate with context, and auto-contain where safe. Instrument the stack end to end, correlate signals, and route high-fidelity alerts to on-call with runbooks and evidence.

Strategy and coverage

  • Define incident severities and owners. Track MTTD, MTTR, and detection coverage by kill chain stage.

  • Centralize telemetry in a SIEM. Use structured logs, metrics, traces, and cloud audit events.

  • Build detections for authentication, authorization, data access, key use, egress, and admin changes.

  • Minimize PII in alerts. Use DSIDs and links to evidence, not raw personal data.
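
The tracking metrics above can be computed from incident records. The following is a minimal sketch, assuming a hypothetical `Incident` record with three timestamps and reading MTTR as the time from detection to resolution; the field names, the stage names, and the rule-to-stage mapping are illustrative and not taken from this guide.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    # Hypothetical record shape; adapt to your incident tracker.
    started_at: datetime   # earliest known malicious activity
    detected_at: datetime  # first alert fired
    resolved_at: datetime  # containment and recovery complete


def mean_time_to_detect(incidents: list[Incident]) -> timedelta:
    """Average gap between start of activity and detection."""
    return timedelta(seconds=mean(
        (i.detected_at - i.started_at).total_seconds() for i in incidents))


def mean_time_to_respond(incidents: list[Incident]) -> timedelta:
    """Average gap between detection and resolution (assumed MTTR definition)."""
    return timedelta(seconds=mean(
        (i.resolved_at - i.detected_at).total_seconds() for i in incidents))


def coverage_by_stage(rule_stages: dict[str, str], stages: list[str]) -> dict[str, int]:
    """Count detection rules per kill chain stage; zero marks a coverage gap."""
    counts = Counter(rule_stages.values())
    return {stage: counts.get(stage, 0) for stage in stages}
```

Stages that report zero rules are the first candidates for new detections.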

Key signals to monitor

  • Auth: brute force, password spraying, token replay, disabled MFA, sign-ins from new devices or anomalous locations.

  • Authz: spikes in 403 denials, privilege escalations, unexpected role grants.

  • Data access: high-volume reads, unusual exports, first-time access to sensitive tables.

  • Keys and secrets: KMS decrypt spikes, new JWKS issuers, vault access from new hosts.

  • Egress: sudden outbound bytes from app or DB subnets, object storage bulk downloads.

  • Infra: container escapes, exec shells in pods, changes to security groups, new public buckets.

  • Vendors: webhook retries, bulk API pulls, new IPs, failed DPA checks in vendor syncs.

  • Deception: honeytoken account sign-ins, access to canary records or files.
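
As one illustration of turning a signal from this list into a detection, the sketch below flags password spraying: a single source failing logins against many distinct usernames in a short window. The event tuple shape, the 10-minute window, and the 20-user threshold are assumptions for illustration; tune them against your own baseline.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def spraying_sources(events, window=timedelta(minutes=10), min_users=20):
    """Return source IPs that failed logins for >= min_users distinct
    usernames within any sliding window of the given length.

    events: iterable of (timestamp, source_ip, username, success) tuples.
    """
    failures_by_ip = defaultdict(list)
    for ts, ip, user, success in events:
        if not success:
            failures_by_ip[ip].append((ts, user))

    flagged = set()
    for ip, fails in failures_by_ip.items():
        fails.sort()  # order by timestamp
        start = 0
        for end in range(len(fails)):
            # Shrink the window from the left until it spans <= `window`.
            while fails[end][0] - fails[start][0] > window:
                start += 1
            distinct_users = {u for _, u in fails[start:end + 1]}
            if len(distinct_users) >= min_users:
                flagged.add(ip)
                break
    return flagged
```

Counting distinct usernames rather than raw failures separates spraying from brute force against a single account, which needs its own per-account threshold.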

Example detections and rules

-- Mass export from reports (access_audit schema from earlier sections)
WITH per_minute AS (
  SELECT date_trunc('minute', at) AS m, count(*) AS c
  FROM access_audit
  WHERE action = 'export' AND resource LIKE 'report:%'
  GROUP BY 1
)
SELECT m, c
FROM (SELECT m, c, avg(c) OVER () AS avg_c FROM per_minute) s
WHERE c > 5 * avg_c;  -- simple spike vs avg; window functions cannot appear in HAVING

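-- KMS decrypt spike per key (key_audit from earlier)
SELECT k.key_id, count(*) AS decrypts_last_5m
FROM key_audit k
WHERE k.operation = 'decrypt' AND k.at > now() - interval '5 minutes'
GROUP BY k.key_id
HAVING count(*) > 3 * (
  -- Baseline: average hourly decrypts for the same key over 24 hours,
  -- divided by 12 to match the 5-minute window. Aliases keep the
  -- correlation on key_id from comparing the column with itself.
  SELECT coalesce(avg(c), 0) / 12.0 FROM (
    SELECT date_trunc('hour', h.at) AS hr, count(*) AS c
    FROM key_audit h
    WHERE h.key_id = k.key_id AND h.operation = 'decrypt'
      AND h.at > now() - interval '24 hours' GROUP BY 1
  ) t
);
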
# WAF-ish gate: alert on login error bursts
map $status $login_fail { default 0; 401 1; 403 1; }
log_format json escape=json '{ "ts":"$time_iso8601","path":"$request_uri","fail":$login_fail }';
access_log /var/log/nginx/access.json json;
# SIEM query: count where path ~ "/login" and fail=1 > threshold per minute
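
The SIEM query described in the last comment can be approximated offline. The sketch below reads lines in the JSON log format defined above and returns the minutes in which failed /login requests exceed a threshold. The function name and the default threshold of 20 are assumptions; a real SIEM would run the equivalent as a scheduled query.

```python
import json
from collections import Counter
from datetime import datetime


def login_failure_bursts(lines, threshold=20):
    """Return {minute: failures} for minutes with more than `threshold`
    failed requests whose path contains "/login"."""
    per_minute = Counter()
    for line in lines:
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than abort the scan
        if not isinstance(rec, dict) or "ts" not in rec:
            continue
        if rec.get("fail") == 1 and "/login" in str(rec.get("path", "")):
            # $time_iso8601 looks like 2024-01-01T12:00:05+00:00
            ts = datetime.fromisoformat(rec["ts"])
            per_minute[ts.replace(second=0, microsecond=0)] += 1
    return {m: c for m, c in per_minute.items() if c > threshold}
```

For example, `login_failure_bursts(open("/var/log/nginx/access.json"))` scans the file written by the `access_log` directive above.
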
# Prometheus alert: unusual egress from app namespace
groups:
- name: breach-egress
  rules:
  - alert: HighEgressBytes
    # Sum across pods so the threshold applies to the namespace, not to each series
    expr: sum(rate(container_network_transmit_bytes_total{namespace="app"}[5m])) > 5e7
    for: 10m
    labels: { severity: critical }
    annotations:
      summary: "High egress from app namespace"
      runbook: "https://runbooks/egress-contain"

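# Falco: interactive shell spawned in container
# Assumes the spawned_process and container macros from Falco's default ruleset
- rule: Container Shell Spawned
  desc: Detect an interactive shell started inside a container
  condition: spawned_process and container and proc.name in (bash, sh, zsh) and proc.tty != 0
  output: "Shell in container (user=%user.name container=%container.id proc=%proc.name)"
  priority: CRITICAL
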
// Cloud object storage: alert on many GETs by new principal (pseudo event pattern)
{ "source": ["s3"], "detail-type": ["Object Access"],
  "detail": { "eventName": ["GetObject"], "countWindow": "5m", "threshold": 100, "principal": "new" } }

Alert routing and auto-response

  • Route by severity to PagerDuty or equivalent. Include request_id, subject DSID, actor, resource, decision, and evidence links.

  • Auto-contain where safe: revoke tokens, disable suspicious accounts, rotate API keys, block offending IPs, lock buckets from public access.

  • Require human confirmation for destructive steps. Log all automation actions to an append-only audit log.

{
  "alert_id": "alrt_9f2c",
  "severity": "critical",
  "signal": "kms.decrypt_spike",
  "request_id": "2b5f3c0f...",
  "key_id": "alias/pii",
  "links": { "kibana": "...", "runbook": "https://runbooks/kms-decrypt" },
  "proposed_actions": ["rotate_dek","disable_service_principal","block_asn"]
}
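
A payload like the one above can drive both routing and gated containment. The following is a minimal sketch, not a reference implementation: `AUTO_SAFE`, `plan_response`, and `audit_line` are hypothetical names, and which actions count as safe to run unattended is an assumption that each team has to decide for itself.

```python
import json
import time

# Assumption: actions considered reversible enough to run without approval.
AUTO_SAFE = {"revoke_tokens", "block_ip", "block_asn", "lock_bucket"}


def plan_response(alert: dict) -> dict:
    """Split proposed actions into auto-run and human-approval queues."""
    proposed = alert.get("proposed_actions", [])
    return {
        "alert_id": alert.get("alert_id"),
        # Assumption: only high and critical alerts page on-call.
        "page_on_call": alert.get("severity") in {"critical", "high"},
        "auto": [a for a in proposed if a in AUTO_SAFE],
        "needs_approval": [a for a in proposed if a not in AUTO_SAFE],
    }


def audit_line(alert_id: str, action: str, outcome: str, now=None) -> str:
    """Serialize one automation step as a JSON line for an append-only log."""
    return json.dumps(
        {
            "ts": time.time() if now is None else now,
            "alert_id": alert_id,
            "action": action,
            "outcome": outcome,
        },
        sort_keys=True,
    )
```

With the example payload, `block_asn` would run automatically while `rotate_dek` and `disable_service_principal` wait for a human, and every executed step would emit one `audit_line`.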

Honeytokens and canaries

  • Create a fake admin user and fake S3 object with unique markers. Any access is critical.

  • Plant a canary API key in build artifacts that calls back to a controlled endpoint if used.
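
Because any touch of a honeytoken is suspicious, the detection needs no baseline or threshold. The sketch below checks an access event against a set of canary identifiers; the principal name, the object path, and the event field names are made up for illustration.

```python
import secrets

# Hypothetical canary identifiers; real ones should look like ordinary assets.
HONEY_PRINCIPALS = frozenset({"svc-admin-backup"})
HONEY_RESOURCES = frozenset({"s3://corp-archive/finance/canary-7f3a.csv"})


def new_canary_marker() -> str:
    """Generate a unique, unguessable marker to embed in a canary asset."""
    return "canary-" + secrets.token_hex(8)


def honeytoken_alert(event: dict):
    """Return a critical alert if the event touches a canary, else None."""
    matched = []
    if event.get("principal") in HONEY_PRINCIPALS:
        matched.append(("principal", event["principal"]))
    if event.get("resource") in HONEY_RESOURCES:
        matched.append(("resource", event["resource"]))
    if not matched:
        return None
    return {
        "severity": "critical",
        "signal": "honeytoken.touched",
        "matched": matched,
        "request_id": event.get("request_id"),
    }
```

Keeping the canary list out of general-purpose configuration reduces the chance that an attacker learns which assets to avoid.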

Testing and drills

  • Run monthly breach game days: simulate key theft, token replay, bulk export, and vendor exfiltration.

  • Validate that alerts fire, on-call is paged, and automations execute safely.

  • Keep post-incident reviews and detection improvements in a backlog.

Privacy and regulatory triggers

  • Suppress PII in alerts. Provide links to evidence gated by least privilege.

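  • Track incidents against regulatory thresholds and notify within statutory timelines where required, for example GDPR's 72-hour window for notifying the supervisory authority of a personal data breach.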
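
One way to suppress PII before an alert leaves the pipeline is a field allowlist combined with pattern scrubbing. This is a sketch under assumptions: the allowlist mirrors the alert fields used in this guide plus a hypothetical `subject_dsid` field, and the email pattern is deliberately simple and will not catch every form of personal data.

```python
import re

# Assumption: only these fields may appear in an outbound alert.
ALLOWED_FIELDS = {
    "alert_id", "severity", "signal", "request_id",
    "key_id", "links", "proposed_actions", "subject_dsid",
}
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scrub(value):
    """Recursively replace email addresses inside strings, dicts, and lists."""
    if isinstance(value, str):
        return EMAIL_RE.sub("[redacted-email]", value)
    if isinstance(value, dict):
        return {k: scrub(v) for k, v in value.items()}
    if isinstance(value, list):
        return [scrub(v) for v in value]
    return value


def minimize_alert(alert: dict) -> dict:
    """Drop fields outside the allowlist, then scrub what remains."""
    return {k: scrub(v) for k, v in alert.items() if k in ALLOWED_FIELDS}
```

An allowlist fails closed: a new field added upstream stays out of alerts until someone reviews it and adds it deliberately.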

Quick monitoring and alerting checklist

  • Central SIEM with structured logs, metrics, traces, and cloud audit events

  • Detections for auth, data access, key use, egress, and admin changes

  • Prometheus or provider alerts with sensible baselines and spike rules

  • Container and host runtime sensors with Falco or EDR

  • Honeytokens for early compromise detection

  • Alert routing with runbooks and safe auto-containment

  • Monthly drills and post-incident improvements

Conclusion

Proactive monitoring and automated, reversible response shrink breach impact. With layered telemetry, well-tuned detections, honeytokens, and scripted containment, you cut time to detect, speed remediation, and meet GDPR and CPRA expectations for security and accountability.