A SOC at home (homelab) is not an alert theater. It must give you signal when it matters, without noise.
This post documents how I run a proactive SOC agent in my homelab, based on four principles:
- Real detection: problematic pods in Kubernetes.
- SIEM signal: alerts with level ≥ 7 from Wazuh (via a proxy inside the cluster).
- Attack surface: CRITICAL and HIGH CVEs only on exposed images.
- Controlled output: Telegram only when there are findings or signal degradation, plus deduplication to avoid repeating the same issue in a loop.
The whole design lives under soc/ and acts as the repository’s source of truth.
This is not another dashboard. It is an operational loop:
cluster → SIEM → runtime → channel → action
Each iteration produces a clear state: OK, degraded, or requires action.
If you already use Wazuh in your lab, this connects with what’s already on the blog:
- Auditing Kubernetes with Wazuh
- Wazuh without Elasticsearch: lightweight dashboard in Grafana
- Parental control with Wazuh, AdGuard and Telegram
Here the change is focus: fewer alerts, more actionable context.
What this adds vs the usual approach
- Scoped scanning → only what is exposed.
- Actionable signal → explicit severity and policy, no “deferred noise”.
- Deduplication → the channel does not become a loop of the same problem.
- Visible degradation → if `kubectl`, Trivy, or connectivity is missing, it is reported, so “silence” doesn’t look like “OK”.
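The deduplication idea can be sketched with a fingerprint file: hash each finding and only notify the first time that hash appears. A minimal sketch, with a state file path and function name of my own choosing (not the repo's actual implementation):

```shell
#!/usr/bin/env bash
# Illustrative dedup: one SHA-256 fingerprint per finding, kept in a state file.
STATE_FILE="${SOC_STATE_FILE:-/tmp/soc_sent_fingerprints}"  # hypothetical path
touch "$STATE_FILE"

# Returns 0 (and records the finding) only the first time a finding is seen.
should_send() {
  local fp
  fp="$(printf '%s' "$1" | sha256sum | cut -d' ' -f1)"
  grep -qF "$fp" "$STATE_FILE" && return 1  # already sent: suppress
  echo "$fp" >>"$STATE_FILE"
}
```

A finding that repeats across runs (say, the same CrashLoopBackOff pod) then produces exactly one message until the state file is rotated.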
Mental model
- Time-bounded: each run has limits (number of images, timeout per scan) so the host doesn’t get stuck.
- Priority (rough): availability > SIEM signal > CVEs.
- Single channel: Telegram as an inbox, not a continuous stream: if something arrives, it deserves triage or closure with action.
What you will build (quick view)
- Scripts under `soc/scripts/`:
  - `soc_proactive.sh`: generates the report, writes it to `soc/reports/YYYY-MM-DD/…md`, and sends it to Telegram.
  - `send_telegram.sh`: sends messages using the bot token (in my case `raam_bot`) read from a Kubernetes Secret.
  - `trivy_scan_image.sh`: utility to scan a specific image on demand.
- Cron in `soc/cron/soc-proactive.cron`: runs the check every 6 hours.
- Optional event-driven integration with Alertmanager: `soc/k8s/alertmanagerconfig-soc-inbox.yaml` routes `warning` or `critical` alerts to an n8n webhook.
Important: this is not a commercial product; it’s an honest assembly over what you operate. Self-hosted and Telegram imply trust in your own configuration and in the channel, not “more secure” by magic. The goal is actionable visibility with time/noise limits.
Prerequisites (it won’t work without these)
Host running the SOC checks
The design assumes a central host (in my setup, raam) with:
- Kubernetes or k3s accessible via `kubectl`.
- Docker accessible (`docker ps`).
- Trivy installed for CVEs.
- `curl` for Telegram and HTTP queries.
The script validates dependencies and, if something is missing, records it as signal degradation (you won’t believe “everything is green” if the scanner isn’t running).
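That validation can be sketched as a loop over the required binaries; the function name and message wording here are illustrative, not the script's actual output:

```shell
# Illustrative degradation check: missing tools become an explicit finding,
# so a run without kubectl or trivy cannot masquerade as "all green".
check_deps() {
  local missing=()
  for dep in "$@"; do
    command -v "$dep" >/dev/null 2>&1 || missing+=("$dep")
  done
  if [ "${#missing[@]}" -gt 0 ]; then
    echo "DEGRADED: missing dependencies: ${missing[*]}"
    return 1
  fi
}
# Typical call: check_deps kubectl docker trivy curl
```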
Telegram
- A channel to act as the SOC inbox.
- The bot is an admin in that channel.
- The numeric `chat_id` for the channel (typical format `-100…`).
The posture guide, including how to obtain the `chat_id` without messing up the webhook, is in `soc/posture/telegram.md` inside the repo.
Repository structure (minimum)
- `soc/README.md`: goal and folder map.
- `soc/runbooks/`: symptom → verification → cause → action.
- `soc/posture/`: exposure checklist (“don’t open things by accident”).
- `soc/vuln/`: CVE policy (thresholds, cadence, what should not hit the channel).
- `soc/reports/`: Markdown output per run with findings.
Step 1 — Configure Telegram destination (chat_id)
`soc_proactive.sh` uses a placeholder by default (`SOC_CHAT_ID="-YOURCHANNEL"` or similar). To change it without editing the script, export `SOC_CHAT_ID` in the cron environment (or adjust the default value if you prefer to centralize it in the repo).
To obtain the `chat_id` without breaking the bot webhook, the runbook in `soc/posture/telegram.md` suggests forwarding a message from the channel to the bot in a private chat and reading the `chat_id` from the bot logs.
Step 2 — Ensure the bot token exists in Kubernetes
Sending uses send_telegram.sh, which reads the token from:
- a Secret in the right namespace (e.g. `raam/raam-bot-secret`)
- key: `BOT_TOKEN`
If the Secret does not exist or is not readable, sending fails and the script exits with an error: better to fail loudly than to “publish silently” to nowhere.
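A sketch of that read, using the Secret coordinates mentioned above (`raam/raam-bot-secret`, key `BOT_TOKEN`); the function name is mine, and the real `send_telegram.sh` may do this differently:

```shell
# Illustrative token lookup: fail loudly if the Secret is missing or empty.
read_bot_token() {
  local ns="${1:-raam}" secret="${2:-raam-bot-secret}" b64
  b64="$(kubectl -n "$ns" get secret "$secret" -o jsonpath='{.data.BOT_TOKEN}')" || return 1
  [ -n "$b64" ] || { echo "ERROR: empty BOT_TOKEN in $ns/$secret" >&2; return 1; }
  printf '%s' "$b64" | base64 -d
}
```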
Step 3 — First manual run
From the repository root (where soc/ exists):
```shell
bash soc/scripts/soc_proactive.sh
```
Expected behavior:
- If there are no findings, it prints nothing and exits with code 0.
- If there are findings, it generates a file under `soc/reports/YYYY-MM-DD/soc-proactive-<timestamp>.md` and sends a Telegram message.
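The naming convention can be reproduced with `date`; a small sketch assuming UTC timestamps (which the `Z` suffix in the examples suggests):

```shell
# Illustrative builder for soc/reports/YYYY-MM-DD/soc-proactive-<timestamp>.md
report_path() {
  printf 'soc/reports/%s/soc-proactive-%s.md' \
    "$(date -u +%Y-%m-%d)" "$(date -u +%Y%m%dT%H%M%SZ)"
}
```

This yields paths like `soc/reports/2026-04-26/soc-proactive-20260426T100210Z.md`.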
What the script checks
- Kubernetes: lists pods in undesired states: Pending, CrashLoopBackOff, ImagePullBackOff, ErrImagePull, Error, Failed.
- Wazuh: queries the `wazuh-alerts-proxy` service in the `monitoring` namespace and collects up to 5 recent alerts with level 7 or higher.
- CVEs (Trivy):
  - Takes images from `docker ps` that have ports published to `0.0.0.0` or `::` (real exposure to the outside).
  - Prioritizes “high value” images (n8n, Wazuh, AdGuard, web services, etc., according to your logic in the script).
  - Scans only `CRITICAL` and `HIGH`, with `--ignore-unfixed` by default.
  - Limits the number of images and time per image so the host doesn’t get stuck in a scanning loop.
This aligns with the idea of prioritizing patches when Wazuh flags CVEs: the Telegram channel is not the place to review every MEDIUM from an internal database, but what justifies action or awareness on your exposed perimeter.
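The exposure filter can be sketched as a pipe over `docker ps` output; the function name is illustrative, while the Trivy flags in the comment (`--severity`, `--ignore-unfixed`, `--timeout`) are real CLI options:

```shell
# Illustrative filter: keep images whose ports are published on 0.0.0.0 or [::],
# i.e. reachable from outside the host, not just bound to loopback.
exposed_images() {
  # stdin: "IMAGE<TAB>PORTS" lines; stdout: exposed image names
  awk -F'\t' '$2 ~ /0\.0\.0\.0:|\[::\]:/ { print $1 }'
}
# Typical wiring:
#   docker ps --format '{{.Image}}\t{{.Ports}}' | exposed_images | while read -r img; do
#     trivy image --severity CRITICAL,HIGH --ignore-unfixed --timeout 2m "$img"
#   done
```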
Step 4 — Cron on the host (every 6 hours)
On the host that runs the scripts, as root, install the repo’s cron file (adjust the path to your clone):
```shell
cp /path/to/your/repo/soc/cron/soc-proactive.cron /etc/cron.d/soc-proactive
chmod 644 /etc/cron.d/soc-proactive
```
The cron runs `bash soc/scripts/soc_proactive.sh` and may write traces to `soc/.soc_proactive.log` (depending on the content of `soc/cron/soc-proactive.cron`).
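For reference, such a cron file might look roughly like this; the schedule, repo path, and log redirection are assumptions — the repo's `soc/cron/soc-proactive.cron` is authoritative:

```cron
# Hypothetical /etc/cron.d/soc-proactive: run every 6 hours as root
SOC_CHAT_ID=-100xxxxxxxxxx
0 */6 * * * root cd /path/to/your/repo && bash soc/scripts/soc_proactive.sh >> soc/.soc_proactive.log 2>&1
```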
Step 5 — Tuning noise (CVE policy)
The detailed policy is in `soc/vuln/POLITICA-CVE.md`. Intent summary:
- Treat CRITICAL seriously when the service is exposed or high value.
- Treat HIGH when context (exposure or value) makes it plausible that it should hit the channel.
- Keep MED/LOW for periodic review, not for daily Telegram spam.
Useful variables without touching code (example values):
| Variable | Typical role |
|---|---|
| `SOC_TRIVY_SEVERITY` | Default `CRITICAL,HIGH` |
| `SOC_TRIVY_IGNORE_UNFIXED` | Default `true` (less noise when there is no fix) |
| `SOC_TRIVY_MAX_IMAGES` | Limits how many images are scanned |
| `SOC_TRIVY_IMAGE_TIMEOUT_SECONDS` | Prevents one image from blocking the whole batch |
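As an example, a stricter batch could be pinned via environment before the script runs; the values here are illustrative, while the variable names are the ones from the table:

```shell
# Illustrative overrides, e.g. in the cron environment or a wrapper script
export SOC_TRIVY_SEVERITY="CRITICAL"         # only the worst findings
export SOC_TRIVY_IGNORE_UNFIXED="true"       # skip CVEs without a released fix
export SOC_TRIVY_MAX_IMAGES="5"              # cap the batch size
export SOC_TRIVY_IMAGE_TIMEOUT_SECONDS="120" # per-image time budget
# then: bash soc/scripts/soc_proactive.sh
```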
Step 6 (optional) — Alertmanager → n8n
Besides the cron check, you can route cluster alerts into the SOC inbox. The repo includes an AlertmanagerConfig:
- File: `soc/k8s/alertmanagerconfig-soc-inbox.yaml`
- Routes `severity=warning` or `critical` to a webhook receiver.
- Usually ignores `Watchdog` (an “I’m alive” alert that does not add value to a human SOC inbox).
- In my homelab it points to n8n on the host (not in Kubernetes), e.g. `http://10.5.5.3:5678/webhook/soc-alertmanager`, reachable from the `monitoring` namespace.
Networking caveat: if n8n runs in Docker on the host, IP and firewall must allow traffic from the Alertmanager pod to that URL. Details in soc/k8s/README.md in the repo.
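For orientation, an AlertmanagerConfig along these lines might look like the sketch below; this is not the repo's actual file, and the webhook URL is the example one from above:

```yaml
# Hypothetical sketch of an AlertmanagerConfig for the SOC inbox
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: soc-inbox
  namespace: monitoring
spec:
  route:
    receiver: soc-webhook
    matchers:
      - name: severity
        matchType: "=~"
        value: "warning|critical"
    routes:
      - receiver: "null"          # drop the Watchdog "I'm alive" alert
        matchers:
          - name: alertname
            matchType: "="
            value: Watchdog
  receivers:
    - name: soc-webhook
      webhookConfigs:
        - url: "http://10.5.5.3:5678/webhook/soc-alertmanager"
    - name: "null"
```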
How to read a report
Example filename: `soc/reports/2026-04-26/soc-proactive-20260426T100210Z.md`.
- Kubernetes block: if it’s not “OK”, prioritize availability/stability before abstract CVEs.
- Wazuh block: if you see lines like `[L7]` (or whatever level you use) with host and rule, it’s SIEM signal that deserves triage, not automatic panic.
- CVEs (exposed Docker): if a Trivy table exists, prioritize by exposed service, severity (CRITICAL first), and whether a fix exists (update image/tag and redeploy).
Runbooks: don’t improvise when something triggers
- `soc/runbooks/kubejobfailed.md`: failed jobs.
- `soc/runbooks/docker-hub-mirror.md`: `ImagePullBackOff` due to limits or blocks to Docker Hub.
- `soc/runbooks/wazuh-jwt-renewer.md`: expired JWT or dashboards without data.
- `soc/runbooks/php-apache-ataque-web.md`: typical web abuse signals.
- `soc/runbooks/n8n-exposicion-cve.md`: focus on n8n (high value in many homelabs).
Extensions once it’s stable
- Keep `soc/inventory/` and `soc/inventory/exposure_map.md` up to date.
- Use `soc/posture/exposure-checklist.md` recurrently.
- Tune local rules in Wazuh with intent (see `soc/wazuh/README.md` in the repo).
- Adjust `SOC_TRIVY_IGNORE_UNFIXED` or the image limit to reduce false positives without turning off the radar.
Quick commands (copy & paste)
Scan a specific image:
```shell
bash soc/scripts/trivy_scan_image.sh docker.n8n.io/n8nio/n8n:2.16.0
```
Test the channel manually (replace with the real `chat_id`):
```shell
bash soc/scripts/send_telegram.sh "-100xxxxxxxxxx" "SOC test: verification message"
```
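What that script does can be sketched with the Bot API's `sendMessage` endpoint; the function name is mine, and I assume the token is already in `BOT_TOKEN` (the real script reads it from the Kubernetes Secret):

```shell
# Illustrative sender against the Telegram Bot API
send_telegram() {
  local chat_id="$1" text="$2"
  curl -fsS "https://api.telegram.org/bot${BOT_TOKEN}/sendMessage" \
    --data-urlencode "chat_id=${chat_id}" \
    --data-urlencode "text=${text}"
}
# e.g. send_telegram "-100xxxxxxxxxx" "SOC test: verification message"
```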
Run the proactive check:
```shell
bash soc/scripts/soc_proactive.sh
```
Wrap-up: a proactive SOC in a homelab does not replace a 24/7 team, but it gives you rhythm: cron, SIEM signal, scoped scanning, and an inbox with clear rules. With runbooks and controlled noise, the next incident starts with facts in Markdown, not intuition at 3 AM.
