Skip to content

Build Journal

This series documents what it actually looks like to build AI security tools — the design decisions, the hard lessons, and the methodology that makes the difference between a demo and a deployment.

Subscribe via RSS →


ARCHER

A local-first AI penetration testing agent built on a laptop GPU, fine-tuned on its own operational data, and designed from the first commit to meet the evidentiary standards that regulated environments actually require.

Why AI Security Tools Fail

The demo works. The model is capable. So why does the tool fall apart before it reaches production? What I found when I started pulling on that thread.

Architecting for Constraints

8GB of VRAM sounds like a limitation. It turned out to be one of the best design decisions in the project — here's why constraints clarify thinking.

The Day I Found 17 Bugs in My Training Data

In one session, seventeen bugs surfaced in the training pipeline — each one a lesson in how silently bad data compounds before you ever see it in model behavior.

The Path to Production

Getting it to work is one challenge. Getting it to work in a regulated environment where the output carries legal weight is a different one entirely — and a more interesting one.

Not Vibe Coding

3,500 lines of AI-assisted code in two weeks, with full architectural control intact. Here's exactly how that process worked and what I learned about the right way to use AI as a coding partner.

Teaching an AI to Think Like a Pentester

What does it actually take to go from a generalist model to a specialist trained on its own operational data? The journey from "impressive" to "deployable."

Firm Principles

Seven things I only learned by building something real with AI — the lessons that don't show up until the system is running against actual targets.

The Practitioner's Experiment

A senior security practitioner with no engineering background built a production AI agent in two weeks. What that process revealed about AI-assisted development done deliberately.

When the Agent Knows but Won't Act

For three sessions, ARCHER knew the exact fix for a failing objective and never mentioned it. What that pattern looks like, why it happens, and how to design around it.

Committed but Not Verified

Three objectives were broken for weeks while their fixes sat in closed commits. Here's the failure chain, and the four process rules that came out of it.

0.7 Was Wrong

The classifier shipped with a 0.7 confidence threshold. The routing log said it was filtering out correct answers. Here's what the data showed — and what it taught me about treating ML parameters as hypotheses.

Building the Range

A headless security range on a laptop: Metasploitable2, DVWA, bWAPP, a Windows 11 Defender evasion target, all containerized and networked across isolated segments. What it took to build it and why the architecture decisions matter more than the hardware.

The Morning I Thought My Eval Broke

T2 pass rate: 0%. Four sessions ran, all failed. Twenty minutes of debugging later, the system was fine — the sample size was four. What daily metrics hide in low-volume AI evaluation environments, and the three objectives that were actually failing underneath the noise.

The Duplicate That Drifted

A list that should exist once existed twice. The copy knew it was a copy — there was a comment in the code. By the time I found it, the two had already diverged. Why solo projects accumulate this class of structural debt silently, and why it's more dangerous in eval pipelines than in production code.

Why I Added a Speed Limit to My Own Codebase

Six unverified hint fixes, two unverified shared-utility fixes, eight total — and work stops until the Auditor runs evals. That sounds like bureaucracy. It's actually about preserving the ability to attribute regressions when something breaks.

I Was Training My Agent to Solve DVWA

ARCHER's SQL injection pass rate was solid. The hints were correct. The training data looked clean. Every single session was teaching the model to log in at /dvwa/login.php and probe the id parameter. On a different application, every one of those lessons was wrong.

The Agent That Knew and Didn't Act

The session log showed correct enumeration, correct identification of the privilege escalation path, and the exact command that would have worked. Then the session ended. The objective failed. The agent knew. It just didn't act.

Pass Rate Was Lying to Me

For months, aggregate pass rate was the number. It went up; things were better. Then I found a session where the agent completed the objective and kept running past success until it hit the budget limit — recorded as a fail. Three failure modes were hiding in the same aggregate.


Sagittarius

A distributed threat hunting platform: sensing at the edge, computing at the core, no cloud dependency. Build journal in progress.

Before the First Line of Code

The session that produced no application code at all — and was the most productive session in the project.


Articles are published as they complete review. Subscribe via RSS to be notified when new articles go live.