Zerberos Labs

Agentjacking: the attack where nothing is unauthorized

Sun, 14 Jun 2026 00:00:00 GMT

This week, researchers at Tenet Security disclosed a new attack class they’re calling agentjacking. Since seeing Apple’s new iOS 27 agentic password features, this is the first AI agent attack that made me stop and actually think about the IR side of it, and I wanted to jot some of that down.

The mechanics are wicked simple. Tenet’s proof-of-concept involves the Sentry Data Source Name (DSN), a project-specific address your app uses to send errors and performance events to the service. The DSN is public and write-only by design. An attacker who finds one can write their own “error events” into your Sentry project and stuff those events full of instructions. Later, a developer asks their AI coding agent to look into an error. The agent pulls the event in via Model Context Protocol (MCP), reads the attacker’s text as if it were context, and runs the embedded commands - with the developer’s privileges, on the developer’s machine. No phishing, no server compromise, no user interaction beyond the workflow a developer does probably 50 times a day.
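To see why a DSN behaves like a public, write-only address, here is a minimal Python sketch that splits one into its parts. The DSN value is a made-up placeholder and `parse_dsn` is a hypothetical helper, not part of Sentry's SDK; the layout (public key in the userinfo, project ID as the last path segment) follows Sentry's documented DSN format.

```python
from urllib.parse import urlsplit


def parse_dsn(dsn: str) -> dict:
    """Split a Sentry-style DSN into its components (hypothetical helper)."""
    parts = urlsplit(dsn)
    return {
        "public_key": parts.username,   # identifies the project, not a secret
        "host": parts.hostname,         # ingestion host that accepts events
        "project_id": parts.path.rsplit("/", 1)[-1],
    }


# Placeholder DSN for illustration only.
example = parse_dsn("https://examplepublickey@o0.ingest.example.com/1234567")
print(example)
```

Nothing in those three fields grants read access, which is why shipping a DSN in client-side code is considered normal, and also why anyone who finds one can submit events.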

Tenet Security says they pulled this off against Claude Code, Cursor, and Codex with a ~85% success rate, and found over 2,000 exposed organizations. But the detail that should bother people most is that the agents ran the payloads even when the system prompt explicitly told them to ignore untrusted data.

What stuck with me is the terminology Tenet Security used - the “Authorized Intent Chain.” Every step in this attack is a thing the system is supposed to allow:

Reading a Sentry event is authorized
Running a shell command the developer asked for is authorized
Hitting an internal API with the developer’s own token is authorized

Nothing trips a control because there’s no unauthorized action to trip on. As Tenet states, the attack bypasses EDR, WAF, IAM, VPN, Cloudflare, and firewalls because, as far as any of them can tell, nothing wrong happened.

That’s the part that should worry any IR analyst - the agent authenticated as the developer, from the developer’s machine, during the developer’s working hours. There’s no malware on disk, no anomalous login, and the C2 beacon was a bug report. The timeline reads as a developer doing ordinary developer things, because mechanically that’s exactly what happened. The malicious actor in this story is a trusted tool faithfully executing poisoned input, and faithful execution doesn’t leave the artifacts we’re trained to go looking for. Scoping an incident like this means treating your agent’s entire context window as attacker-reachable input, and I’d bet most orgs have zero logging of what their coding agent actually read before it acted. Honestly, the whole thing is bringing back memories of the initial introduction of fileless malware.
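One way to start closing that logging gap is to record every piece of tool output before it enters the agent's context. A minimal sketch, assuming you control a wrapper around the agent's tool calls; `record_context_input`, the source label, and the record fields are all hypothetical, not an API of any agent vendor.

```python
import hashlib
import json
import time


def record_context_input(log: list, source: str, text: str) -> dict:
    """Append a record of what the agent was about to read (hypothetical helper)."""
    entry = {
        "ts": time.time(),        # when the content entered the context
        "source": source,         # e.g. "mcp:sentry/get_event"
        "sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
        "length": len(text),
        "preview": text[:200],    # enough to triage without storing everything
    }
    log.append(entry)
    return entry


audit_log = []
record_context_input(audit_log, "mcp:sentry/get_event", "TypeError: x is undefined")
print(json.dumps(audit_log[-1], indent=2))
```

Even a log this thin gives an analyst something to scope against: which external content the agent read, when, and a hash to match it against the poisoned event.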

I’ll be watching how agent vendors respond here, because the whole value of these agents is that they act on what they read. One thing I’m for sure noticing - the growing theme seems to be bolting new capabilities on faster than we’re bolting on the controls to contain them.

iOS 27 gives Siri write access to your passwords - should it?

Wed, 10 Jun 2026 00:00:00 GMT

A few days ago at WWDC, Apple announced iOS 27 - with the biggest feature being a revamped Siri, called Siri AI. After looking at some of Siri AI’s new features, I noticed it can now make changes inside the Passwords app. Specifically, it can now act on a weak or compromised credential by walking through a password rotation on the site for you, end to end. Agentic magic.

At face value, the convenience is obvious: most people never rotate a leaked password because the flow is tedious and people can be naturally lazy. As incriminating as it is to admit, I am also one of those people (for my least important credentials). Apple likely believes that adding the ability to pass this off to an agent could measurably shrink the window between “compromise detected” and “credential rotated,” and they’re probably right.

But it also moves the trust boundary. Until now, the only thing (hopefully) that could read and write every credential you own was you, gated behind some form of biometric unlock (i.e. Face ID / Touch ID). An agent that can navigate to a site, authenticate as you, and submit a new password is a new high-value capability, but the interesting question isn’t “is the model good,” it’s “what exactly can trigger it” and “what can / can’t it be talked into doing.” Prompt injection is a thing - a malicious login page or a spoofed “your password was compromised” prompt are the first things that come to mind. We already see the effects of prompt injection on LinkedIn with recruiter bots, which is the same exposure a password agent can have the second it reads a malicious login page. Apple appears to show this capability as navigating to a website you have previously attributed to the credential, but what if that page ends up compromised?

And I’m not alone in this thinking. At first, I actually was fooled by the “magic” of it all. But after speaking to a few college friends who also work in the industry, their immediate reaction was a mix of “I don’t like that” and “sus.” However, their initial thoughts were more based on whether this functionality was running locally on-device or passed off via Apple’s Private Cloud Compute. Another valid concern.

I’ll be keeping an eye on this as the iOS 27 beta cycle unfolds to see if Apple posts anything more about this feature, specifically from a security standpoint. Who knows, maybe Apple has this all figured out already with their army of engineers and everything will work perfectly with privacy in mind. Regardless, the world is fast implementing AI - and the risks associated with that are only just beginning.

Home Lab Snapshot: May 2026

Mon, 25 May 2026 00:00:00 GMT

I’m planning this post as the beginning of a series - whenever I make consequential changes to my home lab, I’ll make a corresponding post like this. It acts as half notes-to-self, half baseline reference for future posts that need context around the specs I’m currently working within. The shape of the lab will drift over time (hence the blog post name) so consider this a snapshot, not a permanent answer.

Why have a home lab

Two reasons, both opportunistic:

Apple Silicon is genuinely good for local LLMs now. Not to be an Apple fanboy, but M5 Pro’s unified memory allows a 35B-parameter abliterated model to load on the same laptop I use day-to-day without a massive rig. For malware-adjacent work where I don’t want samples or decoded artifacts being sent to a hosted API, this is perfect.
VMware Fusion is free (for personal use). Broadcom made Fusion free for personal use in 2024, which removes the last meaningful cost barrier to running a couple of Windows/Linux VMs on a Mac.

The stack

Three layers, each doing a specific job. The diagram below is interactive - click any layer to see what runs there and current specs I am using.

M5 Pro MacBook Pro

The whole lab runs on my MacBook. Apple Silicon's unified memory is what makes the rest of the stack viable on one machine - the GPU and Neural Engine share the same 64GB pool the OS and VMs draw from, allowing the usage of sizeable local models + VMs without a massive rig.

Chip	M5 Pro
CPU	18-core
GPU	20-core
Neural Engine	16-core
Memory	64 GB unified
Storage	2 TB SSD

Local LLMs: LM Studio + Ollama

Two LLM apps, two different roles. Each has its own advantages, so I decided to just leverage both. I’m calling this out specifically, including the specific model each is running, so future posts can link back here for the model context without re-explaining each time.

LM Studio (abliterated models)

LM Studio gets the abliterated models, which are variants where the refusal vector has been ablated out of the weights. This allows the model to answer questions about obfuscated payloads, reverse-engineering, and exploit code without dragging in the safety jargon that (although needed for day-to-day models) derails a deobfuscation session. I’m currently running huihui-qwen3.6-35b-a3b-abliterated, which is the same base Qwen 3.6 35B with the refusal behavior removed.

The use case is narrow: code deobfuscation, walking through what a malware sample is doing, and asking questions such as “what does this PowerShell loader look like decoded.” A safety-aligned model buries the answer under disclaimers, which isn’t helpful.

Ollama (regular models)

Ollama runs the regular, safety-aligned models, and I’m currently using qwen3-coder:30b. This is the model that powers SousChef and anything else where I’m not asking for content the alignment guardrails would block anyway. This includes code generation, CyberChef recipe creation, and structured-output tasks.

Keeping these two apps separate (rather than running everything through one) is partly historical, partly practical: LM Studio’s UI is built around model browsing and chat, which is what I want for the analysis side. Ollama’s HTTP API and CLI are what I want for the tooling side. They share the host’s RAM pool, but I rarely have both pulling on a model at the same time.

Windows 11 detonation VM

My VM is named “Windows 11 Detonation.” It’s the host for any “let’s see what this PowerShell loader actually does” session, used for static analysis only - no execution of live samples (yet).

Setting	Value
Guest OS	Windows 11 (ARM64 ISO from Microsoft)
vCPUs	2
RAM	4 GB
Disk	60 GB thin-provisioned, single file
Virtual TPM	Enabled (required by the Win11 installer)
Network	Host-Only (“Private to my Mac”)
VMware Tools	Installed (copy/paste + drag/drop)

The vTPM toggle is in Fusion’s VM settings and is non-negotiable for Win11 here. Without it, the installer refuses to proceed past the system-requirements check. Thin-provisioned single-file disk keeps the VM portable for backup; I’d rather one large file than 60 GB worth of sparse extents.

OOBE bypass for a no-account install

Out-of-Box Experience is the first-boot setup wizard on a fresh Windows 11 install (and for DFIR labs, very annoying). By default, OOBE forces a Microsoft account sign-in and an active internet connection before letting you reach the desktop, both of which are undesirable for an analysis VM.

Luckily, there is a way around it. At the “Let’s connect you to a network” screen, hit Shift + F10 to open a command prompt and run:

oobe\bypassnro

The VM reboots OOBE in a mode where an “I don’t have internet” option becomes available, and you can finish setup with a local account.

Networking (Host-Only)

Fusion’s network adapter is set to “Private to my Mac” — “Host-Only” in standard hypervisor terminology. The VM can reach the macOS host (and the host can reach the VM), but the VM has no path to the LAN and no path to the internet.

For static analysis and decoding, this is exactly what you want. If I’m pasting in an obfuscated PowerShell blob and asking the VM to walk it through IO.Compression decompression, there is no scenario where it benefits from talking to a C2. NAT and Bridged both leave a path open and Host-Only closes it.

Snapshots

Fusion’s snapshot model is generous enough that I can keep two named baselines and just roll back between sessions instead of rebuilding the VM each time. I chose to keep two for this detonation VM:

Snapshot 1: Clean post-OOBE. Nothing installed. This is the “I need to test something against an out-of-box Windows” snapshot — useful for verifying that a behavior isn’t an artifact of the modifications below.
Snapshot 2: Tools + Defender off. VMware Tools installed, Defender fully disabled (see the next section) and no other software. This is the working baseline I usually roll back to between sessions.

Disabling Defender

For pure decoding work, such as feeding the VM an obfuscated PowerShell command and asking it to walk through the deobfuscation, Defender’s real-time scanning and behavior monitoring will block or quarantine the artifact mid-session (booooo). You need to disable Windows Defender to get to the goodies.

All three steps are required. Doing one or two leaves Defender in a state where some engines re-enable themselves on reboot or policy refresh. Run them in order, then reboot.

Tamper Protection is a guardrail that blocks every other Defender setting from being changed via PowerShell, registry, or Group Policy. It can only be disabled through the Settings UI, so this step has to happen before Steps 2 and 3 will actually stick.

Open Settings → Privacy & security → Windows Security.
Open Virus & threat protection, then Manage settings.
Toggle Tamper Protection to Off. Accept the UAC prompt.

Without this, Set-MpPreference calls in the next step will silently fail or revert on reboot.

Once all three steps land and the VM has rebooted clean, this is the moment to take Snapshot 2 (Tools + Defender off).

So - how am I using this Detonation VM?

PoC #1: Simple obfuscated PowerShell (IO.Compression)

First real test of the VM was a small obfuscated PowerShell command that used System.IO.Compression to inflate a base64 payload at runtime. The interesting bits that came out of the session:

Pasting and inspecting. First instinct was to wrap the decompressed payload in Write-Host so the CLI would print the inflated script content. It works, but I quickly learned reading bytes through Write-Host is fragile for anything with embedded quoting.
Invoke-Expression capture trap. Leaving the original IEX (...) wrapper in place meant $result was capturing the return value of whatever the payload executed, not the payload itself. The right move was to strip the IEX and read the inflated stream directly via StreamReader over a DeflateStream.
Leftover ); parse error. After stripping the IEX wrapper, the dangling ); from the original tail caused PowerShell to bail with an unexpected-token parse error. Had to remember to clear that, then the command printed the goodies.
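The same static decode can be reproduced outside PowerShell. A minimal Python sketch, assuming the payload is base64 over a raw DEFLATE stream (which is what .NET's DeflateStream produces) and that the inflated script is UTF-8 or ASCII text. The sample blob is generated inline and is benign, and `inflate_b64` is a name made up for this example.

```python
import base64
import zlib


def inflate_b64(blob: str) -> str:
    """Base64-decode, then inflate a raw DEFLATE stream. Nothing is executed."""
    raw = base64.b64decode(blob)
    # wbits=-15 means "raw deflate, no zlib or gzip header".
    return zlib.decompress(raw, -15).decode("utf-8", errors="replace")


# Build a benign sample the same way a loader would pack its payload.
compressor = zlib.compressobj(wbits=-15)
packed = compressor.compress(b"Write-Output 'hello from the inflated layer'") + compressor.flush()
sample = base64.b64encode(packed).decode("ascii")

print(inflate_b64(sample))
```

Because the result is only ever a string, there is no equivalent of the IEX capture trap: nothing in this path can run the payload.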

PoC #2: Complex insert/remove/replace obfuscation

Second test was a different animal: a PowerShell loader reconstructed from 17 separate scriptblock-logging entries in a PowerShell EVTX. Insert/remove/replace obfuscation across the chain made manual reassembly painful.

Reassembly ordering. Scriptblock logs need to be sorted by MessageNumber within a matching ScriptBlockId GUID, not by timestamp. When pulled from EVTX logs, this often requires manual sorting by the analyst.
Null expression error. First try, the reassembled chain threw a null-expression error on execution, which I traced to a likely missing or out-of-order scriptblock somewhere in the middle of the chain (pain). Essentially I had to iterate from here to figure out where I was dumb.
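The ordering rule above is easy to script once the scriptblock events (Event ID 4104) are exported. A minimal sketch, assuming each event has already been parsed into a dict keyed by the event's `ScriptBlockId`, `MessageNumber`, `MessageTotal`, and `ScriptBlockText` fields; the EVTX parsing step and the sample records are hypothetical.

```python
from collections import defaultdict


def reassemble(events: list) -> dict:
    """Group fragments by ScriptBlockId and join them in MessageNumber order."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event["ScriptBlockId"]].append(event)

    scripts = {}
    for block_id, fragments in grouped.items():
        fragments.sort(key=lambda e: int(e["MessageNumber"]))
        seen = [int(e["MessageNumber"]) for e in fragments]
        expected = list(range(1, int(fragments[0]["MessageTotal"]) + 1))
        if seen != expected:
            # A gap here is a likely cause of errors in the rebuilt script.
            missing = sorted(set(expected) - set(seen))
            print(f"{block_id}: missing fragments {missing}")
        scripts[block_id] = "".join(e["ScriptBlockText"] for e in fragments)
    return scripts


# Hypothetical records, deliberately out of order.
events = [
    {"ScriptBlockId": "aaaa", "MessageNumber": "2", "MessageTotal": "2", "ScriptBlockText": "'world'"},
    {"ScriptBlockId": "aaaa", "MessageNumber": "1", "MessageTotal": "2", "ScriptBlockText": "Write-Output "},
]
print(reassemble(events)["aaaa"])
```

The missing-fragment check is the useful part for a case like this one, since it points at a gap in the chain before you try to run or read the reassembled script.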

More to come

This is just the baseline of my lab setup as I get more into it - posts that lean on the detonation VM or the snapshot/rollback workflow may link back here instead of re-explaining the setup each time. When the shape of my home lab changes meaningfully, such as whenever I get off my couch and seriously play around with a Kali Linux VM, I’ll add a new dated snapshot post to this series.

Meet SousChef, an Experiment in CyberChef Recipes from a Local LLM

Mon, 11 May 2026 00:00:00 GMT

Anyone in the DFIR world can relate to this - you come across a command line that has a powershell -enc blob with seemingly a bagillion characters of Base64, and you know from experience there’s probably another layer or two underneath. This could involve compression via gzip, maybe a single-byte XOR using a key the script kindly left lying around. You then walk it through CyberChef by hand, something you’ve done a thousand times (and likely seen it throw invalid blah blah back at your face a similar amount). But it’s tedious…and exactly the kind of pattern-matching a language model is good at.

SousChef is a Python-based CLI tool I’ve been building to do that first pass for you. You hand it an obfuscated payload, it asks a local Ollama model what the recipe should look like, sanitizes and validates the model’s output against a known operation catalog, and hands you back a CyberChef URL with the recipe already loaded. The payload itself stays on-device.

It lives on GitHub at github.com/zerber0s/souschef. Fair warning up front though that it’s experimental - the prompt is still being tuned / battle-tested and there are some quirks (more on those below).

Why I built it

DFIR triage on encoded samples is a lot of mechanical work. Most “interesting” payloads I see in the wild aren’t doing anything novel cryptographically, they’re usually just stacking 3-4 well-known wrappers (base64 → UTF-16LE → gzip → XOR, etc.) and hoping the layering buys time. The slow part isn’t decoding any single layer, but identifying which layers are present and in what order. Then you leverage a tool like CyberChef to make it human-readable.

The other piece is sample sensitivity. Half of the obfuscated content I’d actually want a model’s opinion on (even malware) is stuff I can’t paste into a hosted API. It could be Client data or PII-adjacent and, since it’s usually part of an active engagement, the unknowns have to limit how you handle the data. Knowing others face this same scenario, the design constraint was always “this has to run on the analyst’s machine, on a model the analyst controls.” The tool can even run against a local instance of CyberChef for the most sensitive of situations. Ollama running qwen3-coder:30b locally turned out to be a reasonable sweet spot on Apple Silicon: code-tuned and disciplined enough about structured output to produce parseable recipe JSON most of the time.

How it works

End-to-end, one run looks like this:

Input → Local model → Parse & repair → Sanitize → Normalize → Heuristics → Confidence → Output

Each step expanded:

Input - a file, a stdin pipe, or a --input string. The same blob you’d paste into CyberChef.
Local model - SousChef sends the payload plus a fairly large system prompt to Ollama. The system prompt encodes the CyberChef operation catalog (~122 ops) it can use, a set of few-shots (odd LLM lingo for examples) covering common DFIR patterns, and rules about argument formatting / shape.
Recipe parsing & repair - the model returns JSON. SousChef automatically handles fence markers, dangling brackets, and the usual LLM output noise, then parses out the recipe.
Sanitization - anything that looks like a PowerShell execution sink (IEX, Invoke-Expression, trailing & calls) is stripped. These aren’t CyberChef ops, so if the model emits them, it’s confused about the boundary between “decode this” and “run this.”
Argument normalization - coerces each op’s arguments into CyberChef’s exact positional format. This is the part that bit me hardest in early testing (see the “Where it is today” section below).
Heuristic detectors - a panel of currently ~11 small checks runs over the recipe and a Python-side simulation of its output. They flag things like “the output is still mostly non-printable, you probably need another XOR layer” or “these two ops cancel each other out.”
Confidence scoring - rolls everything up into HIGH / MEDIUM / LOW with a list of actionable signals.
Output - assembles a CyberChef URL fragment, prints it, optionally copies it to the clipboard, optionally opens it in a browser.
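To make the parse, sanitize, and catalog steps concrete, here is a minimal sketch of that part of the pipeline. It is not SousChef's actual code: the function names, the tiny `KNOWN_OPS` set, and the recipe shape (a list of `{"op": ..., "args": ...}` objects, mirroring CyberChef's saved-recipe JSON) are assumptions for illustration.

```python
import json

FENCE = "`" * 3  # markdown code fence that models like to wrap JSON in

# Tiny stand-in for the real operation catalog (assumption).
KNOWN_OPS = {"From Base64", "Decode text", "Gunzip", "XOR", "Raw Inflate"}
# Execution sinks that are not CyberChef ops and should never survive.
EXECUTION_SINKS = {"iex", "invoke-expression"}


def parse_recipe(model_output: str) -> list:
    """Strip a surrounding code fence from the model's reply, then parse the JSON."""
    cleaned = model_output.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line (with its optional language tag).
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        # Drop the closing fence, if the model emitted one.
        cleaned = cleaned.rsplit(FENCE, 1)[0]
    return json.loads(cleaned)


def sanitize(recipe: list) -> tuple:
    """Drop execution sinks, and report ops missing from the catalog."""
    kept, unknown = [], []
    for step in recipe:
        name = step.get("op", "")
        if name.lower() in EXECUTION_SINKS:
            continue  # the model confused "decode this" with "run this"
        if name not in KNOWN_OPS:
            unknown.append(name)
            continue
        kept.append(step)
    return kept, unknown


reply = (FENCE + 'json\n[{"op": "From Base64", "args": []}, '
         '{"op": "IEX", "args": []}, {"op": "Unzip Magic", "args": []}]\n' + FENCE)
recipe, rejected = sanitize(parse_recipe(reply))
print([step["op"] for step in recipe], rejected)
```

Keeping the rejected names around, rather than silently dropping them, is what lets a later confidence step say why a recipe was downgraded.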

The value this tool brings at a high-level:

🔒 Runs entirely offline - Samples are sent to a local Ollama model on your machine. No cloud APIs, no third-party telemetry.
🧪 Heuristic validation - A panel of small Python checks flags missing layers, redundant op pairs, and garbage output before you click the URL.
📚 Operation-catalog enforcement - Recipes are constrained to the known CyberChef op set. Hallucinated ops get caught at parse time, not in your browser.
📊 Confidence scoring - Every run produces a HIGH / MEDIUM / LOW signal with a short list of "why" and "what to check next."
🔗 Browser-ready URLs - Terminal output contains a CyberChef URL fragment with the recipe pre-loaded. Can be configured to auto-open in browser as well.
🛰️ Air-gap friendly - A --cyberchef flag points the URL at a self-hosted CyberChef instance for sensitive engagements.

What I tested it against

All testing was performed against a mix of benign sample data, generated by AI from known techniques / things I have seen in the field, and malicious samples pulled from public repositories such as VirusTotal.

A representative slice of what works end-to-end today:

Family	Shape
PowerShell `-EncodedCommand` / `-enc`	UTF-16LE base64 wrappers, with and without inner layers
Empire-style multi-layer	`$s1 + $s2` substitution + base64 + UTF-16LE + gzip + single-byte XOR
Invoke-Obfuscation `COMPRESS`	Reversed base64 + DeflateStream
AES-CBC	`AesCryptoServiceProvider` with key/IV extraction
RC4	Passphrase-keyed, base64-wrapped payloads
ChaCha20	Stream-cipher payloads
Charcode + XOR	`@(N,N,N) | %{ [char]($_ -bxor $k) }` patterns
Custom-alphabet base64	Paired `$std` / `$norm` translation tables
Meterpreter format-string stagers	`-f` operator with concatenation chains
Bare ROT13’d-base64 blobs	Inner base64 alphabet ROT13’d before encoding, SousChef auto-detects and prepends `ROT13` to the recipe

Full coverage list, including patterns explicitly out of scope (cmd.exe DOSfuscation, raw shellcode disassembly, identifier-renaming-only obfuscation), lives in the SousChef README.

Most of my recent debugging time has gone into samples where the obfuscation pattern looked extremely similar to a known one but had a small twist (e.g. a custom base64 alphabet whose decode was silently falling back to the standard alphabet, or an RC4 sample where the key was hex-encoded one way and the model assumed another). Those cases actually produced perfect recipes that just…gave you garbage. They’re the reason that the heuristic detector layer exists at all (in addition to some iterative assistance from Claude Code).
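A check like "the output is still mostly non-printable" can be very small. A minimal sketch of that kind of detector, assuming a simple printable-byte ratio and an arbitrary 0.85 threshold; the function names and the threshold are illustrative choices, not SousChef's actual values.

```python
import string

PRINTABLE = set(string.printable.encode("ascii"))


def printable_ratio(data: bytes) -> float:
    """Fraction of bytes that are printable ASCII (0.0 for empty input)."""
    if not data:
        return 0.0
    return sum(b in PRINTABLE for b in data) / len(data)


def looks_decoded(data: bytes, threshold: float = 0.85) -> bool:
    """True when the output is mostly readable text, so no layer seems missing."""
    return printable_ratio(data) >= threshold


plain = b"Write-Output 'done'"
still_xored = bytes(b ^ 0x9A for b in plain)  # same bytes with an XOR layer left on

print(looks_decoded(plain), looks_decoded(still_xored))
```

A ratio like this would not catch every bad recipe (a wrong base64 alphabet can still yield printable junk), which is presumably why it pays to run several small detectors rather than one.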

Where it is today

TL;DR - Still in testing. The system prompt is pretty solid for the patterns listed above on qwen3-coder:30b, but it is still going through a lot of changes fast and there are rough edges. Treat the URL as a starting point, not a 100% finished decode. SousChef’s terminal output has some baked-in feedback that gives a confidence level via scoring.

Honest status, as of the time of this post:

Verified end-to-end on qwen3-coder:30b against the tested patterns. Smaller models (7B, 13B) do tend to degrade, but gracefully (they generate plausible recipes but miss the trickier multi-layer cases). Larger models work fine if you have the RAM.
Argument normalizer is critical, not cosmetic. CyberChef’s URL fragment parser expects positional arguments in an exact order, otherwise named-object arguments silently fall back to defaults. I was working alongside an unknown bug for a while where decoding custom-alphabet base64 looked successful but actually used the standard alphabet, only fixed by a stricter shape enforcement via the normalizer.
A few ops have non-obvious quirks. From Hex gets forced to the Auto delimiter (handles dashes, spaces, colons, line breaks) and I have no clue why. ROT13 and ROT47 are purposely not treated as terminal ops, since they’re legitimate middle steps in real chains. Find / Replace is forced to global matching to work around UI-vs-URL inconsistencies in CyberChef itself. All of these were determined through testing (and pain).
Out of scope situations end with a graceful fallback. Bohannon-style cmd.exe DOSfuscation, raw shellcode disassembly, and identifier-renaming-only obfuscation don’t produce CyberChef recipes (even though I tried). In these and similar cases, the model is instructed to produce a Comment op explaining why instead of guessing.
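To illustrate what the normalizer does, here is a minimal sketch that coerces named-object arguments into a fixed positional list. The `ARG_ORDER` table, its argument names, and the defaults are hypothetical stand-ins for this example, not CyberChef's or SousChef's real schema.

```python
# Hypothetical per-op schema: argument names in positional order, with defaults.
ARG_ORDER = {
    "From Base64": [("alphabet", "A-Za-z0-9+/="), ("remove_non_alphabet", True)],
}


def normalize_args(op: str, args) -> list:
    """Return args as a positional list in the order the op's schema expects."""
    schema = ARG_ORDER[op]
    if isinstance(args, dict):
        unknown = set(args) - {name for name, _ in schema}
        if unknown:
            # Fail loudly instead of letting a typo fall back to a default.
            raise ValueError(f"{op}: unexpected argument(s) {sorted(unknown)}")
        return [args.get(name, default) for name, default in schema]
    if len(args) > len(schema):
        raise ValueError(f"{op}: too many arguments")
    # Pad a short positional list with the remaining defaults.
    return list(args) + [default for _, default in schema[len(args):]]


custom = "N-ZA-Mn-za-m0-9+/="
print(normalize_args("From Base64", {"alphabet": custom}))
```

The point of raising on unknown names is exactly the custom-alphabet bug described above: a silent fallback to the default alphabet looks like success until you read the output.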

All of the above is from me being only a few commits in. The system prompt is still the part most likely to change between sessions, which is also why I keep a more accessible static copy in the repo here. If you use this tool and something that worked yesterday doesn’t work today, the few-shot examples are the first place to look.

Try it

SousChef on GitHub - Source, example payloads, README, and the system prompt mirror.
CyberChef - GCHQ's swiss-army knife for decode/encode/compile pipelines.
Ollama - Local LLM runtime. One-time install, then `ollama serve`.
qwen3-coder model - The default model I tested. 30B variant runs comfortably on Apple Silicon with >32GB RAM.

If you try it on a sample and the recipe is wrong, whether consistently or only sometimes, file an issue with the input (sanitized as needed), the model you used, and what you expected the recipe to be. That’s how this thing will continue to improve - every weird sample is a regression test waiting to be added that’ll only enhance the accuracy of future submissions.

I may update this post in the future, or write a follow-up, if this tool advances past the experimental phase.

Running Claude Code Locally with LM Studio on Apple Silicon

Fri, 01 May 2026 00:00:00 GMT

Most guides for running Anthropic’s Claude Code against a local model point you at Ollama, tell you to set a couple of env vars, and consider it done. Granted, it looks like it works - you can chat with it, but the agentic loop is silently broken. The model can talk to you, but it can’t actually do anything. No file reads, no real tool calls, no multi-step task execution. Just a polite chatbot wearing Claude Code’s UI as a costume (and making my MacBook a mini space heater).

This post walks through a setup I eventually came to that actually works on Apple Silicon: LM Studio + the Unsloth GGUF version of the Qwen3-Coder-30B-A3B model, running entirely local on a 14” M5 Pro MacBook Pro with 64GB of unified memory. Full agentic loop, no API costs, no rate limits, no data leaving the machine.

Why bother running it locally

One of the key advantages of running local large language models (LLMs) is privacy, which can also be a key component in DFIR work. When dealing with sensitive Client info, or even malware, cloud models introduce risk and restrictions. I wanted to test for myself what these local models could do, which initially led me to beta testing Claude Code locally on my MacBook Pro in the first place. If you also have capable hardware (and hate the direction of Anthropic’s pricing and plans) it can be worthwhile to explore local models for lighter agentic work.

While this post doesn’t cover it, one of the other benefits to local models can be the ability to run “abliterated” versions, which are models that have had their refusal & safety behavior weakened or removed after training. These can be very useful for malware decoding and analysis where normal cloud-based models, like OpenAI’s ChatGPT and Google’s Gemini, will refuse. These would be run independently, not via the Claude Code process outlined below.

Why LM Studio (and not Ollama)

This is the part I learned after about an hour of wondering why Ollama was spitting back gibberish to me after thinking on the question “what files are in this directory?” for 5-10 minutes.

Claude Code is built around Anthropic’s Messages API, which uses structured tool_use and tool_result blocks for every agentic action: essentially every Bash command, file read, and edit. The model’s response isn’t just text, it’s a sequence of typed content blocks that the CLI parses and dispatches.

Ollama serves an OpenAI-compatible endpoint and translates Anthropic-shaped requests on the fly. That translation layer doesn’t preserve the tool-call blocks cleanly. The model emits something that looks like a tool call, the adapter mangles it, Claude Code can’t parse it, and the agentic loop breaks. You get a model that says “I’ll check that file for you” and then nothing happens. Super fun.

LM Studio 0.4.1 added a native Anthropic Messages API at /v1/messages. Claude Code talks to it the same way it talks to Anthropic’s hosted API, and tool calls round-trip correctly. No adapter or translation needed.
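For reference, this is roughly what that round trip looks like on the wire. A minimal sketch using only the standard library, assuming LM Studio is listening on localhost:1234; the `list_files` tool, the helper names, the dummy token, and the canned response are made up for illustration, and `post_message` is defined but never called here so the example runs without a server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234"  # assumption: LM Studio's default port


def build_request(model: str, prompt: str) -> dict:
    """Build an Anthropic Messages API body that offers one tool."""
    return {
        "model": model,
        "max_tokens": 512,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [{
            "name": "list_files",  # hypothetical tool
            "description": "List files in a directory.",
            "input_schema": {
                "type": "object",
                "properties": {"path": {"type": "string"}},
                "required": ["path"],
            },
        }],
    }


def post_message(body: dict) -> dict:
    """Send the body to /v1/messages; call this only with the server running."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={"content-type": "application/json",
                 "x-api-key": "dummy",  # placeholder token
                 "anthropic-version": "2023-06-01"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def tool_calls(response: dict) -> list:
    """Pull the typed tool_use blocks out of a Messages API response."""
    return [b for b in response.get("content", []) if b.get("type") == "tool_use"]


# Canned response in the Messages API shape, so this runs offline.
sample_response = {
    "role": "assistant",
    "stop_reason": "tool_use",
    "content": [
        {"type": "text", "text": "I'll check that directory."},
        {"type": "tool_use", "id": "toolu_example", "name": "list_files", "input": {"path": "."}},
    ],
}
print(tool_calls(sample_response))
```

When the translation layer mangles things, the symptom is a response where only the text block survives: the model says it will check the directory, and `tool_calls` comes back empty.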

Prerequisites

macOS (this walkthrough was done on macOS 26.4)
LM Studio 0.4.1 or later — earlier versions don’t expose the native Anthropic endpoint
An Anthropic account — Pro, Max, Team, Enterprise, or Console. Free tier doesn’t include Claude Code access. You only need to authenticate once, then redirect everything local via env vars.
Terminal access (Terminal, iTerm2, whatever you use)
Apple Silicon with enough unified memory for the model you want. I’m on a 14” M5 Pro with 64GB. To maximize the value of this post, I have added in model recommendations below that scale by RAM tier.

Step-by-step setup

Step 1: Install Claude Code

curl -fsSL https://claude.ai/install.sh | bash

Verify the install:

claude doctor

This checks installation health and surfaces config issues. On the first run it’ll authenticate against Anthropic and that’s expected. The redirect to local happens via env vars in Step 5.

Step 2: Pick and download a model

The sad reality is what you can run depends on how much unified memory you have. Rough guide:

RAM	Recommended model	Quant	Size
24GB	Qwen3.5-35B-A3B	Q4_K_M	~22GB
64GB (my setup)	Qwen3-Coder-30B-A3B	UD Q4_K_XL	~17.67GB
64GB (more intensive)	Qwen3.5-27B dense	Q8_0	~30GB
128GB+	Qwen3-Coder-Next 80B	Q4_K_M	~48GB

I went with Qwen3-Coder-30B-A3B for the agentic Claude Code use case. A few reasons for this:

Purpose-built for agentic coding, tool calling, and multi-file reasoning
Mixture of Experts (MoE) architecture - 30B total params but only 3B active per token, so prefill is fast
Native 256K context support
No “thinking” mode - less overhead per turn, which matters when you’re firing off tool calls in a loop

In LM Studio’s model search, look for Qwen3-Coder-30B-A3B and pick:

Author: unsloth
Repo: Qwen3-Coder-30B-A3B-Instruct-GGUF
Quant: UD Q4_K_XL (~17.67GB)

UD refers to Unsloth’s “dynamic” quantization, which uses layer-aware compression to retain more model quality than standard Q4 while staying around the same file size. On a 64GB MacBook such as mine, that should leave roughly 46GB available for macOS, KV cache, and other apps running alongside the model.

Don't grab the wrong upload. The lmstudio-community upload of this model is months older and predates the tool-calling fixes, so if you use it your tool calls will silently fail. Stick with the unsloth author. Also avoid fine-tuned variants (Huihui abliterated, etc.) for this use case, as Claude Code expects the base instruct format.

Step 3: Configure the model in LM Studio

Once downloaded, open the model’s settings panel. There are two tabs that matter, Load and Inference (plus the prompt template). A lot of these settings were discovered by me through a combo of trial/error and research, validated by some Claude questions.

Load tab:

Setting	Value	Notes
Context Length	`32768`	32K is the sweet spot. Push to `65536` if you keep hitting limits.
GPU Offload	`Max / -1`	Full Metal offload so model fits in unified memory.
Evaluation Batch Size	`1024`	Default is `512`. Doubling this noticeably speeds up prefill, which is relevant for Claude Code’s large system prompt.
Unified KV Cache	`On`	Default, leave it.
Offload KV Cache to GPU	`On`	Default, leave it.
Keep Model in Memory	`On`	Avoids cold-load delays between sessions.
Flash Attention	`On`	Reduces memory pressure at long contexts.
K/V Cache Quantization	`Off`	Experimental, leave off for stability.
Try mmap()	`On`	Default
Number of Experts	`8`	Correct for this model, don’t change.

Inference tab:

Setting	Value	Notes
Temperature	`0.7`	Qwen’s official recommendation for the Coder series.
Top K	`20`	Down from default `40` - keeps tool calls tight.
Top P	`0.80`	Down from default `0.95` - also Qwen’s recommendation.
Repeat Penalty	`1.05`	Down from default `1.1` - discourages repetition in long sessions.
Min P	`Off / 0`	Disable. Can interfere with tool-call format.
Reasoning Section Parsing	`Off`	This model has no `<think>` blocks.
Structured Output	`Off`	Claude Code handles its own structure, enabling this breaks tool calls.

Prompt Template tab:

The default Jinja template included with the GGUF version of this model uses an unsupported safe filter, which causes an error on the first prompt. This error specifically was a major headache for me to identify, but luckily was an easy fix.

[ERROR] Error rendering prompt with jinja template:
"Unknown StringValue filter: safe"

Fix it manually:

Switch from Template (Jinja) to Manual
Pick ChatML from the dropdown
Confirm the start/end tags populate as <|im_start|> / <|im_end|> for system, user, and assistant
Confirm stop strings include <|im_start|> and <|im_end|>
Eject and reload the model
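For reference, this is roughly what the ChatML layout looks like once those tags are applied. It is a minimal sketch of the general ChatML convention, not LM Studio's internal implementation, and `render_chatml` is a name I made up for illustration.

```python
def render_chatml(messages: list[dict[str, str]]) -> str:
    """Wrap each message in ChatML tags and open a final assistant turn."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    # Leave the assistant turn open so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = render_chatml([
    {"role": "system", "content": "You are a coding assistant."},
    {"role": "user", "content": "List the files in this directory."},
])
print(prompt)
```

Seeing the raw layout also makes it clearer why `<|im_start|>` and `<|im_end|>` need to be stop strings: without them the model can run on into a new turn of its own.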

Step 4: Start the LM Studio server

Either flip the server toggle on in LM Studio’s Developer tab, or start it from the terminal:

lms server start

Default port is 1234. Verify the model is loaded and reachable:

curl http://localhost:1234/v1/models

Expected output:

{
  "data": [
    {
      "id": "qwen3-coder-30b-a3b-instruct",
      "object": "model",
      "owned_by": "organization_owner"
    }
  ]
}

Write down the exact id value - it should match the name of the model you chose and you’ll need it character-for-character for the env var in the next step.
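If you would rather pull the id programmatically than copy it by eye, something like the sketch below should work. It assumes the response shape shown above and the default port; `model_ids` and `fetch_model_ids` are hypothetical helper names, and only the offline demo at the bottom runs without the server.

```python
import json
import urllib.request


def model_ids(payload: dict) -> list[str]:
    """Return every model id in a /v1/models-style response."""
    return [entry["id"] for entry in payload.get("data", [])]


def fetch_model_ids(base_url: str = "http://localhost:1234") -> list[str]:
    """Query the running LM Studio server (requires the server to be up)."""
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=5) as resp:
        return model_ids(json.load(resp))


# Offline demo using the sample response shown above.
sample = {"data": [{"id": "qwen3-coder-30b-a3b-instruct", "object": "model",
                    "owned_by": "organization_owner"}]}
print(model_ids(sample))  # ['qwen3-coder-30b-a3b-instruct']
```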

While you’re in the Developer tab, two server settings worth tweaking:

Just-in-Time Model Loading: Off — keeps the model resident in memory between Claude Code prompts instead of re-loading on each request.
Require Authentication: Off — a dummy token works fine locally, so no need for the overhead.

Step 5: Set environment variables

There are two ways to do this: permanent global changes via .zshrc, or a shell script you source per terminal session. Hardcoding the variables globally forces Claude Code to always run locally, while the script is something you run only before an intended local session, so Claude Code defaults back to cloud models in every other session (Credit to KW 🐍).

The tradeoff is convenience vs control. The .zshrc route is simpler to set up once and forget, but the script approach lets you switch between local and hosted cleanly. Both methods are outlined below; pick whichever fits your situation.

Option 1: Dedicated shell script

Create a file like ~/scripts/local-claude.sh:

mkdir -p ~/scripts && nano ~/scripts/local-claude.sh

The script should contain something like this (specific to the model you chose):

#!/bin/zsh
export ANTHROPIC_BASE_URL=http://localhost:1234
export ANTHROPIC_AUTH_TOKEN=lmstudio
export ANTHROPIC_MODEL=qwen3-coder-30b-a3b-instruct
echo "Claude Code → local LM Studio"

Ensure the script is executable after saving:

chmod +x ~/scripts/local-claude.sh

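Then source it per-session when needed (executing it directly would only set the variables in a child shell, so they would never reach claude):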

source ~/scripts/local-claude.sh
claude

Opening a fresh terminal without sourcing the script gives you the default hosted API back immediately. No editing files, no unsetting variables manually.

Option 2: Persistent changes via `.zshrc`

Open your shell rc file:

nano ~/.zshrc

Add these three lines at the bottom (using the model you chose):

export ANTHROPIC_BASE_URL=http://localhost:1234
export ANTHROPIC_AUTH_TOKEN=lmstudio
export ANTHROPIC_MODEL=qwen3-coder-30b-a3b-instruct

The AUTH_TOKEN value doesn’t matter (LM Studio doesn’t validate it locally) but Claude Code refuses to start without one set. The MODEL value must match the id from Step 4 exactly.

Save (Control+O, Enter, Control+X), then apply to the current shell:

source ~/.zshrc

Verify:

echo $ANTHROPIC_BASE_URL
# http://localhost:1234

For this .zshrc method, if you forget to source, the new env vars only take effect in new terminal windows. Existing windows still point at Anthropic's hosted API, and Claude Code will quietly bill you instead of routing local. Always re-verify with echo before launching claude in a session you care about. You can also see what model is being used when Claude Code loads.
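Whichever option you pick, a small pre-flight check can catch a missing or mismatched variable before Claude Code quietly falls back to the hosted API. This is only a sketch: `routing_problems` is a hypothetical helper, and the list of served ids is hardcoded here where it would normally come from the /v1/models response in Step 4.

```python
import os

REQUIRED = ("ANTHROPIC_BASE_URL", "ANTHROPIC_AUTH_TOKEN", "ANTHROPIC_MODEL")


def routing_problems(env: dict[str, str], served_ids: list[str]) -> list[str]:
    """List reasons Claude Code might not route to the local server."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]
    model = env.get("ANTHROPIC_MODEL")
    # The model name has to match a served id character-for-character.
    if model and model not in served_ids:
        problems.append(f"ANTHROPIC_MODEL={model!r} does not match any served id")
    return problems


# served_ids is hardcoded for the demo; in practice, take it from /v1/models.
issues = routing_problems(dict(os.environ), ["qwen3-coder-30b-a3b-instruct"])
print(issues or "env looks good for local routing")
```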

Step 6: Launch Claude Code

cd /your/project
claude

The very first thing to check: the bottom-left of the Claude Code UI. If it shows the LM Studio model id (e.g. qwen3-coder-30b-a3b-instruct), you’re routed local. If it still shows something like Sonnet 4.6 · API Usage Billing, the env vars didn’t take in this terminal session - back to Step 5 you go.

I have seen that it’s advisable to set effort to low for routine tasks - local models can’t match hosted Sonnet at high effort, and low is the sweet spot for prefill speed:

/effort low

Smoke-test the agentic loop with something concrete:

what files are in this directory?

Watch LM Studio’s developer logs, which are visible under the Developer tab. You should see prefill, generation, and a tool call go out. Claude Code should also come back with actual filenames, not a description of what it would do if it could read files. If the model just narrates what it’s about to do without anything happening, the tool-call format is broken and you’ll need to re-check the GGUF (Step 2) and the prompt template (Step 3). The speed at which Claude Code responds will also be heavily dependent on your hardware and the model you chose.

Finally, you can generate a CLAUDE.md for your project if you’re already in the folder you wish to code within:

/init

Claude Code reads this file on every session start, which lets the model skip a chunk of the cold-start exploration it did initially.

Performance expectations

For my setup (M5 Pro, 64GB, Qwen3-Coder-30B-A3B UD Q4_K_XL), these are the figures I found as the rough consensus online. I use them as a baseline to confirm everything is working as it should:

Metric	Value
Model size in memory	~17.67GB
macOS overhead	~8–10GB
Total memory pressure	~26–28GB (comfortable on 64GB)
GPU utilization during inference	~100%
GPU power draw	~33W
GPU temperature under load	~91°C (safe — M5 Pro throttles around 105°C)
Prefill speed	~100 tok/s
First response time (cold)	20–30 seconds (after `/effort low`)
Subsequent responses	Faster - KV cache holds the session context

Why first responses feel slow: Claude Code sends a 10–40K token system prompt at the start of every session. All of that has to be prefilled before your first answer comes back. Subsequent prompts in the same session reuse the KV cache and respond noticeably faster, which is why /init and re-using the same session both pay off.

The unified memory architecture is doing a lot of heavy lifting here, which is also why Apple’s Mac Mini and Mac Studio products have been flying off shelves lately. GPU and CPU share the same pool, so there’s no transfer bottleneck between discrete VRAM and system RAM the way there would be on a desktop with a dedicated card, such as a gaming PC.

Troubleshooting the pain

Since it might help to see my failures, below are the specific problems I hit during the initial setup and what fixed them.

Issue 1: Claude Code still shows Sonnet 4.6 after setting env vars

Symptom: Bottom-left of the UI still says Sonnet 4.6 · API Usage Billing.

Cause: Env vars not live in the current terminal, or ANTHROPIC_MODEL wasn’t set.

Fix: echo $ANTHROPIC_BASE_URL to confirm it’s set, run source ~/.zshrc or re-execute your shell script (whatever you chose in Step 5) if not. Confirm LM Studio is up with curl http://localhost:1234/v1/models. Make sure ANTHROPIC_MODEL matches the id returned by that curl, character-for-character. Relaunch claude from the same terminal.

Issue 2: First response takes 5+ minutes

Symptom: Claude Code hangs for several minutes on the first prompt of a session.

Causes & Fixes:

Multiple models loaded in LM Studio - combined weight pushed past available RAM into swap. Eject everything except the model you’re using.
High effort mode — run /effort low.
Cold prefill of the 10–40K-token system prompt — this is normal, especially on the first prompt. Subsequent prompts are faster and /init can help reduce it further.
Default batch size of 512 — bump to 1024 in LM Studio Load settings.

Issue 3: Jinja template error on first prompt

Symptom: LM Studio dev logs show Unknown StringValue filter: safe.

Cause: The Qwen3-Coder GGUF ships a Jinja template that uses a filter LM Studio’s template engine doesn’t support.

Fix: Switch the prompt template from Jinja to Manual → ChatML, confirm the im_start/im_end tags and stop strings, eject and reload the model. Full steps in Step 3.

Issue 4: Model describes actions but doesn’t actually execute them

Symptom: The model says “I’ll read that file for you” and then…literally nothing. No file actually opened, no tool call in the LM Studio logs. Unhappy Zach.

Cause: Either you’re somehow behind Ollama’s translation layer, or the GGUF you’re using predates the Unsloth tool-calling fixes.

Fix: Use LM Studio (not Ollama) for the native Anthropic endpoint, and use the unsloth GGUF specifically (not lmstudio-community or mradermacher). This is the whole reason the post exists.

Issue 5: `ANTHROPIC_MODEL` value doesn’t take effect

Symptom: Claude Code routes to the wrong model or errors out at startup.

Fix: Copy the id straight from the curl /v1/models response. The display name in LM Studio’s UI is sometimes formatted differently (version suffixes, capitalization) and the env var has to match the API id exactly.

TL;DR

From what I’ve seen, it appears most people just point Claude Code at Ollama, set a couple of env vars, and call it done. It looks like it works, but the agentic loop can be annoyingly broken. LM Studio’s native Anthropic API + the Unsloth Qwen3-Coder GGUF are what separated my ultimate working setup from a PoC.

I have been playing around with this since getting it set up, and it has been incredibly useful for local coding tasks with an agentic boost. While it will never be as powerful as a full cloud model, not every task needs it to be, which also saves me usage and a few $$ in API credits.

My next step, independent of Claude Code, is going to be exploring static malware analysis with “abliterated” models, such as deobfuscating complicated Base64 commands to determine their functionality. Additionally, I am hoping these models will allow me to dive deeper into ethical research related to different attack methodologies via malware generation.

Enjoy those tokens.

Introducing EIDVault: An EID Reference App Built by an Analyst, for Analysts

Mon, 20 Apr 2026 00:00:00 GMT

If you’ve ever found yourself three hours into an investigation, staring at Event ID (EID) 4624 Logon Type 10, trying to remember whether that’s the interactive one, the remote one, or the one you always have to Google (“GooGoo” as a colleague calls it) - this app is for you. It’s also, admittedly, for me.

EIDVault is an iOS app for digital forensic analysts and incident responders. It’s a quick-reference for Windows Event IDs, enriched with MITRE ATT&CK mappings, detection rules, relevant XML fields, and investigation pivots.

It’s live on the App Store now - download EIDVault for iPhone & iPad.

Why I built it

There’s no shortage of excellent Windows event reference material on the internet - Microsoft Learn, Ultimate Windows Security, a stack of bookmarked SANS whitepapers, etc. What I kept wanting was something a little faster, something that could possibly live on my phone so I could look up an EID while on a call, skim/correlate related events, or even export out specific relevant information for use later.

So I started drafting up ideas…before quickly realizing how much of a lift learning Swift and the intricacies of iOS app development would be from scratch. Then Apple decided they would add agentic coding to Xcode and, with that, eliminate all my excuses. So I built it (and I would encourage everyone interested to try the same.)

The goals to start were pretty simple:

Fast lookup — type an EID, get an answer.
Real context — not just “Generated when a logon session is created”, but what to correlate with, what’s noisy, what the key XML fields are, and how adversaries could abuse it.
Offline-first — the dataset ships inside the app. No login, no hoops.
Analyst-shaped — built around how I actually use EIDs during an investigation.

The dataset

Everything the app displays is backed by a structured JSON dataset that lives in a public GitHub repo:

🧾 github.com/zerber0s/windows-eid-data

I split the data from the app intentionally. The app is essentially a lens while the dataset is the source of truth. An added benefit of this structure is that EIDs can be tweaked, or even added to pre-existing log channels, without needing a corresponding iOS app update. And as every analyst knows, the field of cybersecurity is always changing. So if you spot an error, want to suggest a new event, see something out of date, or just think my investigation pivots for 4688 are missing something obvious (they probably are), that’s the place to raise it. Issues and PRs are open.

The data is organized by log channel, one JSON file per channel - security.json, powershell.json, sysmon.json, kerberos.json, and so on. Every entry conforms to a published JSON schema, which keeps things predictable as the dataset grows.

Below you can see what a single entry looks like - click through the tabs to see how the same JSON feeds different views inside the app:

Security · EID 4624

An account was successfully logged on

Tags: logon, authentication | ATT&CK: T1078 · Valid Accounts

Generated when a logon session is created on a system. The event is recorded on the machine being accessed and includes the account name, logon type, source network address, and authentication package used.

Each field has a purpose. details is the factual “what/when” - no directives, no “look for suspicious values.” That stuff lives in notesGuidance.investigationPivots, so the app can render the two cleanly and separately: here’s what the event is, and here’s what to do with it during an investigation.
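To illustrate that split, here is a minimal sketch of looking up an entry and rendering the two parts separately. Only `details` and `notesGuidance.investigationPivots` are field names from the dataset as described here; the `eid` and `title` keys, the pivot text, and the `find_entry` helper are placeholders I made up, so check the published schema for the real names.

```python
import json
from typing import Optional

# Hypothetical excerpt of security.json. Only "details" and
# "notesGuidance.investigationPivots" are named in this post; the other
# keys and the pivot text are placeholders for illustration.
SAMPLE = json.loads("""
[
  {
    "eid": 4624,
    "title": "An account was successfully logged on",
    "details": "Generated when a logon session is created on a system.",
    "notesGuidance": {
      "investigationPivots": ["Placeholder pivot text"]
    }
  }
]
""")


def find_entry(entries: list[dict], eid: int) -> Optional[dict]:
    """Return the first entry whose (assumed) "eid" key matches."""
    return next((e for e in entries if e.get("eid") == eid), None)


entry = find_entry(SAMPLE, 4624)
if entry:
    print("What it is:", entry["details"])
    for pivot in entry["notesGuidance"]["investigationPivots"]:
        print("What to do:", pivot)
```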

What the app actually does

Inside EIDVault you’ll find:

🔎 Search & Browse: Browse by log channel or search across every EID, tag, and ATT&CK tactic.

🧠 Scenarios: An on-device AI tab powered by Apple Foundation Models. Describe what you're seeing and on-device intelligence surfaces relevant EIDs. No network calls, no prompts leaving the device.

🗺️ MITRE Mapping: Every applicable EID is tagged with ATT&CK techniques and tactics, including direct links to MITRE's knowledge base.

🛡️ Detection Rules: Inline Sigma, KQL, and Splunk rules where they exist - copy & paste as a starting point, then tune.

📎 Key Fields: The XML fields that matter for each event, with their xpaths, so you know what to grep for in raw EVTX.

🔗 Related Events: Every entry cross-references the other EIDs you'd want to pull into a timeline.

📤 Markdown Exports: Built-in functionality to export all EID data, or even just specific fields, to Markdown-formatted output. Useful for sharing or later use.

📴 Fully Offline: The dataset is bundled. Works on a plane, in a SCIF-adjacent coffee shop (if that somehow applies to you), or wherever you answer pages from.
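To make the Key Fields idea concrete, here is a small sketch of pulling a field out of an event's XML by path. The record below is a trimmed, made-up 4624 event for illustration (real events carry many more fields), and `event_field` is a hypothetical helper, not part of the app.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Namespace used by Windows event XML.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Trimmed, made-up record for illustration only.
EVENT_XML = """
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System><EventID>4624</EventID></System>
  <EventData>
    <Data Name="TargetUserName">analyst</Data>
    <Data Name="LogonType">10</Data>
  </EventData>
</Event>
""".strip()


def event_field(root: ET.Element, name: str) -> Optional[str]:
    """Read one EventData value by its Name attribute."""
    node = root.find(f"./e:EventData/e:Data[@Name='{name}']", NS)
    return node.text if node is not None else None


root = ET.fromstring(EVENT_XML)
print(root.findtext("./e:System/e:EventID", namespaces=NS))  # 4624
print(event_field(root, "LogonType"))                        # 10
```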

The Scenarios tab is probably the piece I’m most excited about. Running Apple’s on-device intelligence models means I get a meaningful “suggest EIDs for this situation” experience without sending a single byte to a third party. It is still experimental and limited by the available on-device model context, but any DFIR tool that can run 100% local is a huge win. Obviously, those results will always need to be validated, but it can be a great starting point or even just a useful discovery tool if you’re bored.

Why the data repo is public (and the app isn’t)

The app source lives in a private repo - it’s my first shipped iOS app and I’d like room to iterate without anyone watching me rename various “View” Swift files or reassign a log channel a different SF symbol six+ times. But the dataset is the part that benefits from more eyes, and the part that will keep improving long after the UI settles down. Making that public felt obvious.

If you:

find an event described incorrectly
think an investigation pivot is wrong or missing
want to propose a new channel (looking at you, AD FS nerds)

…open an issue. I’ll eventually read all of them.

Relevant links

App Store: EIDVault for iPhone & iPad
Data Repo: zerber0s/windows-eid-data
Launch Post: LinkedIn announcement

If you do give it a try, I’d love to hear what’s working, what’s missing, and what my overcaffeinated brain got wrong. This is v1.0 and there’s a lot of room to grow. Also, the best direction usually comes from the people not coding all of this until 2am.

Happy hunting.

Building Zerberos Labs: Astro on Cloudflare Pages

Sun, 19 Apr 2026 00:00:00 GMT

Having come from WordPress, I figured I would write up what it took to stand this Astro-based blog up, and why. If you’re thinking about doing the same, hopefully a few of the gotchas below save you the hour I lost to them (thank you Claude).

Why Astro over Hugo

I started by doing some research to compare the two top web framework options, Astro and Hugo, against one another. Hugo is hard to beat for pure-blog use cases: single binary, no Node, fast builds. There were some tradeoffs however:

Hugo’s templating is limited to Go’s html/template. No components, no JSX, no React.
There’s no clean path to embedding interactive UI (filterable tables, an EID lookup widget, etc.) without bolting on raw JS by hand (which I barely understand as it is).
Astro uses an islands architecture, which means it ships zero JS by default and only hydrates the components that need to be interactive.
Astro supports React natively, so I can drop components anywhere, including inside a blog post.

For a site I want to grow beyond pure blogging (DFIR tooling, embedded reference widgets, possibly a standalone EID lookup page, etc.), Astro is the better foundation. Hugo would likely have shipped the blog faster, but I’d have hit limitations by post #2. There was also something satisfying about building something that could evolve with me over time and match my identity.

Prerequisites

Node.js (brew install node, verify with node -v) - write down your version, you’ll need it for Cloudflare
Visual Studio Code for editing
A GitHub account
A Cloudflare account with your domain already managed there (I leveraged Cloudflare Pages to host)

Creating the project

npm create astro@latest [blog-or-repo-name]

Pick the blog template when prompted, say yes to TypeScript, and let it install dependencies.

Then add React for interactive island components:

cd [blog-or-repo-name]
npx astro add react

Run locally:

npm run dev

Astro previews at http://localhost:4321 and hot-reloads on save.

The dev server can die when your host machine goes to sleep. If this happens, re-run npm run dev when you wake your machine.

Where things live

Below is a short list of the files and folders I found mattered most as I was getting started:

File	Purpose
`src/consts.ts`	Site name and description, referenced site-wide
`astro.config.mjs`	Set `site: 'https://[DOMAIN]'` here
`src/content.config.ts`	Zod schemas for blog post frontmatter validation
`src/content/blog/`	Markdown and MDX post files
`src/pages/index.astro`	Homepage
`src/components/Header.astro`	Site header
`src/styles/global.css`	Global styles

A couple of quick notes:

The site field in astro.config.mjs is only used at build time for RSS, sitemap, and canonical URLs. Doesn’t affect local dev. I ended up setting this early, even though my blog wasn’t “live” yet.
Newer Astro versions moved the content-config file out of the content/ folder up to src/content.config.ts. Same file, slightly different path than older docs describe. I spent a stupid amount of time stuck on this.

GitHub + Cloudflare Pages

Push the repo to GitHub but keep it private - Cloudflare Pages works fine with private repos via OAuth, and the OAuth connection itself doesn’t expire or take the site down if anything wobbles (the CDN keeps serving the last successful deployment regardless). When authorizing Cloudflare on GitHub, choose Only select repositories and pick just this one. Security first, obviously.

In Cloudflare:

Workers vs Pages "Gotcha" - the Cloudflare dashboard will by default (annoyingly so) route you into the Workers setup flow, which shows npx wrangler deploy - that's not where we need to be. Navigate explicitly to Workers & Pages → Create → Pages tab → Connect to Git.

Build settings I used:

Setting	Value
Framework preset	Astro (auto-detected)
Build command	`npm run build`
Build output directory	`dist`
Environment variable	`NODE_VERSION` = output of `node -v` on your machine

Node version "Gotcha" - very new Node releases (e.g. v25.x at the time I set this up) may not be supported by Cloudflare Pages yet. If the first deploy fails on a Node-version error, drop NODE_VERSION to 22 (an LTS release) in the Pages env settings and trigger a redeploy.

Configuring a custom domain

After the first successful deploy: Pages project → Custom Domains → Add Domain → enter your subdomain (labs.zerberos.io for me). Because my domain’s DNS is already on Cloudflare, the CNAME is auto-created and SSL is provisioned automatically. Usually live within a few minutes. Magic.

The deploy loop

As I grow this blog by adding posts or making tweaks to the underlying UI (like adding in a dark mode toggle), I have fallen into the following deployment loop:

Edit files locally.
Preview changes at localhost:4321.
Push to GitHub.
Cloudflare auto-rebuilds and deploys. No manual steps.

Based on the above setup, all I have to do to create a new blog post is whip up a Markdown file (.md) or an enhanced file that can run React components (.mdx), follow standard Markdown formatting, add in any React components as needed, and push to GitHub. Then it’s live.
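If you want to script the "whip up a Markdown file" part, a scaffold along these lines is one way to do it. It is a sketch under assumptions: the frontmatter keys (`title`, `description`, `pubDate`) are what I understand the blog template's schema to expect, so match them to your own src/content.config.ts, and `new_post` is a hypothetical helper. The demo writes to a temporary folder, not src/content/blog/.

```python
import tempfile
from datetime import date
from pathlib import Path


def new_post(blog_dir: Path, slug: str, title: str, description: str) -> Path:
    """Write a Markdown stub with frontmatter and return its path."""
    # Frontmatter keys are assumptions; align them with your content schema.
    frontmatter = (
        "---\n"
        f'title: "{title}"\n'
        f'description: "{description}"\n'
        f"pubDate: {date.today().isoformat()}\n"
        "---\n\n"
    )
    path = blog_dir / f"{slug}.md"
    path.write_text(frontmatter + "Write the post here.\n", encoding="utf-8")
    return path


# Demo in a throwaway directory so nothing lands in a real project.
with tempfile.TemporaryDirectory() as tmp:
    post = new_post(Path(tmp), "hello-world", "Hello World", "First post")
    print(post.read_text(encoding="utf-8"))
```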

Build status lives under the Deployments tab in the Pages project within Cloudflare’s portal. Old deployments stay in history, which can be useful for one-click rollback if something implodes in production.

Later additions & tweaks

A few things weren’t done on day one but have been added since:

Support for dynamic mobile layouts

This was actually an extremely frustrating late realization when I first deployed the blog, opened it on my phone, and saw nothing but a garbled mess. I went back and used Chrome’s developer tools (FN+F12 on my MacBook) to preview the site in different mobile aspect ratios, fixing the header and blog post sizings. Small changes like this can actually involve multiple Astro files, so utilizing tools like Claude Code or OpenAI’s Codex can help simplify the lift.

Dark mode toggle

Dark mode doesn’t really need an explanation (it’s just better) so I wanted the blog to have that option. Astro did not have it baked in from the start, so I added in the button you see in the site header as a manual toggle, as well as system preference detection that persists via localStorage and a data-theme attribute on <html>. I also leveraged Claude Code to help me implement this.

Embedded React islands in posts

React allows you to add interactive elements to Astro, so not everything is just plain Markdown. My first real use of this was the EID preview widget in the EIDVault launch post, which also validated my decision to use Astro and MDX files.

Hopefully all of the above helps anyone looking to do something similar. The functionality of this platform allows me more flexibility compared to a blog via WordPress, which also opens the door to some additional ideas I could explore down the road. For example, I could create a standalone EID lookup page here driven by the windows-eid-data JSON dataset, the same source of truth EIDVault uses, served as a free web tool.

More to come if I ever take that on.
