Writeups of work I've done.
Short, practical narratives of selected engineering work - performance tuning, platform migrations, schema work, AI-agent infrastructure, and security - expanded from my working notes.
These are drafted with AI assistance from my own profiling reports and commit notes. The decisions and the work are mine; the prose is collaborative. I keep jacobstephens.net/blog for hand-written essays, fiction, and longer-form personal writing.
-
One Prompt, Three Engines: the Same Android Game in Bevy, Unity, and Unreal
The same brief - an Android game, no art or audio assets, everything in code - run twice through Bevy (Rust), Unity (C#), and Unreal Engine 5 (C++) with Claude Code on macOS. Six repositories, and the finding that the visual editor, not the language, decides how well an AI agent can build in each: Bevy is all code and the cleanest fit; Unity routes around its editor headlessly and ships the most complete game; Unreal works in pure C++/Slate but pays in toolchain weight and a 130 MB APK.
-
What It Costs to Watch Your AI Coding Agents
Instrumenting a fleet of Claude Code agents with OpenTelemetry - one collector feeding the existing Prometheus and Grafana, plus a Tempo trace store - and the four places where the obvious setup is quietly wrong: a cost metric that is notional under subscription billing, a dashboard that looks empty while working, a session count that is structurally uncountable, and a legacy PHP app whose mysqli queries hide from the traces.
-
Byte Counts Lie: Measuring What I Actually Wrote
A portfolio stats page that simply sums GitHub's byte counts gives you credit for code you inherited. Measuring real authorship in a platform that has been built up since the 1990s meant running git blame over its full history - including the pre-2023 Bitbucket history that predates the migration - and the surprise was which language the byte count had really been overstating.
-
HMAC-Signed Webhooks: One Primitive, Three Dialects
Stripe's, Twilio's, Mandrill's, and GitHub's webhook signature schemes turn out to be one primitive in three dialects. The signing bases, the five gotchas that actually bite (raw bytes, timing-safe compares, the proxy URL trap, tolerance windows, failing closed), the verifiers extracted into a zero-dependency TypeScript library - and what fact-checking my own opening line taught me.
-
Human-in-the-Loop AI Agents for Legacy Business Systems
The full pattern behind the manager sandboxes: six concentric boundaries - isolation, data, command surface, code promotion, audit, rollback - plus the list of things no agent is allowed to do. With the hard tradeoff named (the human gate is a bottleneck, on purpose) and a copyable checklist.
-
Importing Live DNS into Terraform Without Downtime
Why I imported ~220 live DNS records across 9 zones into Terraform instead of recreating them (the recreate path silently deletes anything your desired-state list forgot), the zero-change plan that proves it worked, and the SRV gotcha along the way.
-
One Engineer, a Platform of Production Systems
Four-plus years modernizing an inherited PHP 5 / CentOS 7 / MySQL 5 monolith into a PHP 8 / Rocky 9 / MySQL 8 platform - with Docker, a Python agent-orchestration layer, and multi-tenant AI assistants. Architecture, the legacy-to-modern arc, and engineering highlights.
-
A Privacy-First Fertility Chart, Encrypted on Your Device
A local-first CrMS charting app with end-to-end encrypted sync where the server only ever stores ciphertext - auto-computed stamps, an offline PWA, provider sharing, and the trust boundary that keeps health data on the device.
-
Tracking Listening Time Across Six Platforms - Without Building a Surveillance Tool
One pure Rust core driving six native shells, a G-Counter CRDT that makes concurrent devices add instead of overwrite, and a schema that structurally can't store when you listened - only how much.
-
Letting an AI Agent Modernize a Frozen Legacy Site, Safely
Six boundaries that make it safe to let an AI coding agent refactor a live, client-owned legacy site - and the one that does the real work: pixel-diff every page to AE 0 before a human is ever asked to look.
-
Sandboxing AI Agents per Business Role
Each business manager gets a sandboxed instance of our internal app with its own AI assistant. The safety comes from the shape of the system - one container per manager, a private database rebuilt every four hours, default-deny command execution, and an honest human merge gate - not from asking the agent nicely.
-
The Boring Deploy Script
Why I deploy a single-server PHP app with about 200 lines of guardrail-heavy bash instead of a hosted CI/CD pipeline - dirty-worktree rejection, flock serialization, ancestor-only refs, a healthcheck, one-command rollback - and the exact conditions under which I'd switch.
-
Six Rounds of Profiling: Cutting a 7-Second Page to 1 Second
A profile-driven performance pass on Tourbot's group manifest page - six rounds, byte-identical HTML at the end of each, taking page loads from 5-7 seconds to about 1 second and SQL statements per request from ~2,650 to ~183.