Jacob Stephens
jacob@stephens.page | Philadelphia, PA | Portfolio | GitHub | LinkedIn
Platform and infrastructure engineer specializing in safe AI automation for revenue-critical legacy systems. For 4+ years the lead engineering owner of a multi-million-dollar specialty-travel operator's production stack - reservations platform, payment systems, Linux server fleet, and AI-agent infrastructure. Recent work: a multi-tenant AI-assistant platform where every agent runs in a per-role sandbox behind a human merge gate (Docker/Traefik), fleet-wide Prometheus/Grafana observability, and a measurement-driven production-database pass that eliminated ~80% of measured query time.
Core Technologies
Languages & Databases
Frameworks & Runtimes
Infrastructure & Operations
APIs & Application Engineering
Engineering Practices
Security
Work
Senior Platform & Infrastructure Engineer
Educational Travel Adventures | Feb 2022 - Present | United States, Remote
- Serve as lead engineer and continuity owner for the company's production stack - reservations platform, payment processing, Linux server fleet, and deployment pipeline - with end-to-end responsibility for architecture, operations, and on-call response across a multi-million-dollar travel business
- Developed guides.etadventures.com - a mobile-first PWA for field contractors with offline service-worker support, receipt-attached expense workflows, and manager admin; adopted by 77 guides over the past 12 months to file 410 tour reports, send 3,300+ traveler SMS, and process 1,800+ expense reports (~$154K), extending the JavaScript stack across a previously PHP-only platform
- Architected business-rule-driven reservation-cancellation workflows in PHP and MySQL that cancel delinquent reservations automatically, preventing bad-debt accumulation
- Turned hard-coded email and voice payment reminders into a manager-editable system - self-service rules in the Tourbot UI, added Twilio SMS, wired to automated cancellation - cutting manual work by 80%+
- Drove production-database performance from measurement: caching and query work took alert pages from 5-10 seconds to under 1 second, and a performance_schema-led index pass (online DDL, zero downtime) eliminated ~80% of measured query time on the live primary; a profile-driven N+1 elimination pass on the group-manifest page cut SQL statements per request from ~2,650 to ~183 with byte-identical HTML output
- Led PCI compliance initiative: architected secure REST integrations with merchant payment processors, and inherited, improved, and modernized the ACH/NACHA bank-file workflows (SFTP) across PHP, MySQL, and OS major-version migrations, implementing security best practices including bank-account encryption
- Led three enterprise-wide platform migrations (PHP 5 -> 8, MySQL 5 -> 8, and CentOS 7 -> Rocky Linux 9) across the codebase and server fleet, modernizing the stack and improving application performance
- Established engineering practices across internal and customer-facing platforms - architecture decisions, code review, and mentoring two developers
- Built an internal orchestration and status platform (Python, Flask, SQLite) - a dashboard that probes the MySQL, web, and cron-server fleet every 60 seconds, with operator deploy and preview-environment controls - which runs an autonomous coding agent as a scoped, least-privilege system user with auditable, vault-injected credentials
- Stood up fleet-wide observability on a Terraform-provisioned droplet - Prometheus, Grafana, and Alertmanager instrumenting 14 hosts (node/mysqld/blackbox exporters: uptime, replication health, TLS expiry) with email and SMS paging consolidated into a single alert pipeline
- Designed and built a multi-tenant AI-assistant platform (Docker, Traefik) giving each business manager a sandboxed Tourbot instance with its own isolated database, few-hour production refresh, per-container resource limits, and default-deny agent command execution; AI work isolated on per-role branches behind a human merge gate through which 14 manager-prototyped features shipped to production across all four roles (sales, marketing, GM, IT). No agent approves, merges, deploys, or moves money without a named human gate - the seven-boundary safety checklist governing this is published as sanitized ADRs and a pattern write-up
- Replaced database triggers with binlog-tailing daemons for derived reporting data - reconciliation pipelines that keep denormalized tables consistent with the transactional source, in versioned code off the hot write path (ADR 0006)
- Built ProspectForge - an internal contact-sourcing platform (FastAPI, SQLAlchemy 2.0, Alembic, PostgreSQL, Next.js, self-hosted Firecrawl scraping, Docker Compose) to replace CivicIQ, the team's third-party prospecting SaaS, pending management approval
- Engineered a server-side deploy pipeline with dirty-worktree rejection, flock-based concurrency serialization, ancestor-only ref enforcement, post-deploy healthcheck, and one-command rollback - backed by a PHPUnit suite (unit / integration / functional) and GitHub Actions CI across PHP 8.1-8.4
Web Developer and Client Support Specialist
Sharp Innovations | Nov 2021 - Feb 2022 | Conestoga, PA
- Developed automated database migration scripts in SQL and PHP, reducing migration time from hours to minutes and eliminating manual errors
- Delivered technical support for web clients - UI/UX updates, security-vulnerability remediation, and Linux server configuration and optimization
Full-Stack Developer (Independent)
Steward Goods | Mar 2020 - Nov 2021 | United States, Remote
- Designed and shipped artifact.stewardgoods.com, a multi-tenant possessions tracker built on full-stack PHP and MySQL with frontend, backend, and deployment ownership
- Optimized website performance through code and infrastructure improvements, achieving perfect scores (100/100) on Google PageSpeed Insights
- Engineered PCI-compliant subscription payments with Stripe in PHP and MySQL
Selected Projects
- infrastructure-patterns: Sanitized architecture decision records and the seven-boundary agent-safety checklist drawn from production systems I operate - per-role isolation, scoped least-privilege credentials, default-deny command surfaces, human merge gates, audit trails, and rollback - published so other small-business engineers can safeguard their own systems
- terraform-cloudflare-dns: Infrastructure-as-code for a personal web fleet of ~70 hostnames across 10 domains. Consolidated DNS from four registrars onto Cloudflare and brought ~220 records across 9 zones under Terraform by importing the live records (not recreating them) for a zero-downtime, no-op baseline plan. Remote state on AWS S3 kept off the compute provider for disaster-recovery independence, Ansible roles for one-command subdomain provisioning, a plan-only DigitalOcean rebuild blueprint, and GitHub Actions plan-on-PR. Sanitized public mirror of the production repo
- k3s-demo: A live, HTTPS two-tier Kubernetes app - a stateless Deployment in front of a Redis StatefulSet - on a single k3s node, provisioned end to end by Terraform and cloud-init. Production-grade manifests (rolling updates, liveness/readiness probes, resource limits, a hardened securityContext, a HorizontalPodAutoscaler, and RBAC), with an interactive page that triggers a load test and charts CPU crossing the 70% target as the pods scale 2 to 6 (k3s-demo.stephens.page). Guarded by a five-rule OPA Gatekeeper admission layer, one rule re-expressed as a ValidatingAdmissionPolicy in CEL. A deliberate Kubernetes exercise, kept separate from my systemd-based production fleet
- Cascade: Focus/sleep sound player built on one headless Rust core that drives native shells over a single JSON boundary. Shipped the web (PWA) and Android shells through agent-assisted development - the Rust core bound via wasm-bindgen and UniFFI - with the same core architected to extend to macOS, Windows, iOS, and watchOS (cascade.stephens.page)
- Chart35: Privacy-first fertility tracker with client-side AES-256-GCM encryption, offline-first IndexedDB storage, and expiring provider-share links - shipped as a PWA and adopted by real users via organic search, with an Android build (Capacitor) in Play Store review and a native SwiftUI app for iPhone and Apple Watch in TestFlight beta (case study, source)
Education & Certifications
Bachelor of Arts in Sociology
Gordon College | Aug 2012 - Aug 2017 | Wenham, MA
Certifications
CompTIA A+ | Nov 2017