GG — Applied AI Engineer

Selected Work

AI · LIVE

Shipped a Groq-tutored exam-prep SaaS with Stripe billing, SM-2 spaced-repetition scheduling, bilingual (EN/ES) content, and a deterministic localization pipeline (~3,400 items, 5-stage QA gate, resumable, cost-capped).

Groq · Next.js · Supabase · TypeScript · Stripe · SM-2 algorithm
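
A minimal sketch of the SM-2 scheduling step, assuming the textbook form of the algorithm (quality grades 0 to 5, ease factor floored at 1.3). The `CardState` shape and the `sm2` function name are hypothetical, and the shipped scheduler may differ, for example in when it updates the ease factor.

```typescript
// Hypothetical card state; field names are illustrative, not the product's schema.
interface CardState {
  repetitions: number;  // consecutive successful reviews
  intervalDays: number; // days until the next review
  easeFactor: number;   // SM-2 "EF", never allowed below 1.3
}

// One textbook SM-2 update. `quality` is the 0..5 self-assessed recall grade.
function sm2(state: CardState, quality: number): CardState {
  if (!Number.isInteger(quality) || quality < 0 || quality > 5) {
    throw new RangeError("quality must be an integer from 0 to 5");
  }
  let { repetitions, intervalDays, easeFactor } = state;

  if (quality >= 3) {
    // Successful recall: 1 day, then 6 days, then previous interval * EF.
    if (repetitions === 0) intervalDays = 1;
    else if (repetitions === 1) intervalDays = 6;
    else intervalDays = Math.round(intervalDays * easeFactor);
    repetitions += 1;
  } else {
    // Lapse: restart the repetition sequence.
    repetitions = 0;
    intervalDays = 1;
  }

  // Assumption: the ease factor is adjusted on every review, as many implementations do.
  const d = 5 - quality;
  easeFactor = Math.max(1.3, easeFactor + (0.1 - d * (0.08 + d * 0.02)));

  return { repetitions, intervalDays, easeFactor };
}
```

Because the update is a pure function of the previous state and the grade, replaying a review log always rebuilds the same schedule.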

Sea Star — Groq Publish Automation

AI · LIVE

One blog topic fans out to a Facebook caption, an Instagram caption + hashtags, and an email subject + body via a single Groq call. Runs on a nightly cron. Shipped as an admin tool in production.

Groq · Next.js · social publish API · nightly cron
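
One way to keep a single-call fan-out safe to publish is to validate the model's JSON before anything is posted. The sketch below assumes the model is asked to return one JSON object. The field names and the `parseFanOut` helper are hypothetical, and the Groq call itself is left out.

```typescript
// Hypothetical response shape for the single fan-out call.
interface FanOut {
  facebookCaption: string;
  instagramCaption: string;
  instagramHashtags: string[];
  emailSubject: string;
  emailBody: string;
}

// Parse and validate the raw model output; throw rather than publish partial content.
function parseFanOut(raw: string): FanOut {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("expected a JSON object");
  }
  const rec = data as Record<string, unknown>;

  // Require a non-empty string for each text field.
  const str = (key: string): string => {
    const value = rec[key];
    if (typeof value !== "string" || value.trim() === "") {
      throw new Error(`missing or empty field: ${key}`);
    }
    return value.trim();
  };

  const tags = rec["instagramHashtags"];
  if (!Array.isArray(tags) || !tags.every((t) => typeof t === "string")) {
    throw new Error("instagramHashtags must be an array of strings");
  }

  return {
    facebookCaption: str("facebookCaption"),
    instagramCaption: str("instagramCaption"),
    // Normalize so every tag carries a leading '#'.
    instagramHashtags: (tags as string[]).map((t) => (t.startsWith("#") ? t : `#${t}`)),
    emailSubject: str("emailSubject"),
    emailBody: str("emailBody"),
  };
}
```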

Multi-Model QA Cascade

One item routed to Ollama + Groq + Codex in parallel; non-LLM scoring layer picks winner; human-in-the-loop review gate before SQL commit. $0 to run locally.

Ollama · Groq · OpenAI/Codex · deterministic scoring · human-in-loop · SQL
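
The cascade can be sketched as a parallel call followed by a rule-based pick. Everything named here is an assumption for illustration: the `Provider` function type stands in for the Ollama, Groq, and Codex clients, the scorer is whatever non-LLM rule set applies, and `pending_review` is a hypothetical status marking the human gate before any SQL commit.

```typescript
interface Candidate { provider: string; text: string; }
interface Scored extends Candidate { score: number; }

// Hypothetical provider signature; real clients would be wrapped to fit it.
type Provider = (item: string) => Promise<string>;

// Deterministic pick: highest score wins, ties broken by provider name.
function pickWinner(candidates: Candidate[], score: (text: string) => number): Scored | null {
  const scored = candidates.map((c) => ({ ...c, score: score(c.text) }));
  scored.sort(
    (a, b) =>
      b.score - a.score ||
      (a.provider < b.provider ? -1 : a.provider > b.provider ? 1 : 0)
  );
  return scored[0] ?? null;
}

// Call every provider in parallel; a failed provider just drops out of the race.
async function runCascade(
  item: string,
  providers: Record<string, Provider>,
  score: (text: string) => number
): Promise<{ winner: Scored | null; status: "pending_review" }> {
  const settled = await Promise.allSettled(
    Object.entries(providers).map(async ([provider, call]) => ({
      provider,
      text: await call(item),
    }))
  );
  const candidates = settled.flatMap((r) => (r.status === "fulfilled" ? [r.value] : []));
  // Nothing is committed here: the winner waits for operator review.
  return { winner: pickWinner(candidates, score), status: "pending_review" };
}
```

Keeping the scorer free of model calls means the same set of candidates always yields the same winner, which makes the review step auditable.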

agent-gate — Safety Fence

Deterministic guardrails for AI agents. 53 tests, fail-closed design: any uncaught edge halts the agent rather than letting it proceed. Built as a standalone reusable safety layer.

deterministic rules · fail-closed · 53-test suite
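
A minimal sketch of what fail-closed means in code, not agent-gate's actual API: the `Action`, `Rule`, and `gate` names are hypothetical. The gate allows an action only when every rule explicitly returns "allow". A thrown error, an unexpected return value, or an empty rule list all resolve to "deny".

```typescript
type Verdict = "allow" | "deny";

// Hypothetical shape of an action the agent wants to take.
interface Action { tool: string; args: Record<string, unknown>; }
type Rule = (action: Action) => Verdict;

function gate(action: Action, rules: Rule[]): { verdict: Verdict; reason: string } {
  // No rules configured is treated as a misconfiguration, not as permission.
  if (rules.length === 0) return { verdict: "deny", reason: "no rules configured" };

  let index = 0;
  for (const rule of rules) {
    try {
      // Anything other than an explicit "allow" stops the action.
      if (rule(action) !== "allow") {
        return { verdict: "deny", reason: `rule ${index} denied` };
      }
    } catch (err) {
      // An uncaught edge inside a rule halts the agent instead of letting it proceed.
      const message = err instanceof Error ? err.message : String(err);
      return { verdict: "deny", reason: `rule ${index} threw: ${message}` };
    }
    index += 1;
  }
  return { verdict: "allow", reason: "all rules passed" };
}
```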

Vision Factory — Dual-Model Consensus

GPT-4o generates assets, Gemini verifies. Fails closed if centroid drift exceeds 85px. Coordinates large consistent asset-generation runs via a spec-driven pipeline with a status board (todo → generating → verify → placed).

OpenAI GPT-4o · Gemini vision · consensus gate · asset pipeline
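
The 85px gate can be sketched as a distance check that rejects on any doubt. This assumes drift is the Euclidean distance between the centroid the spec expects and the centroid the verifier reports. The source does not say how drift is measured, and the `Point` type and function names are hypothetical.

```typescript
interface Point { x: number; y: number; }

const MAX_DRIFT_PX = 85; // threshold stated in the project description

// Assumption: drift is the straight-line distance between the two centroids.
function centroidDrift(expected: Point, verified: Point): number {
  return Math.hypot(expected.x - verified.x, expected.y - verified.y);
}

// Pass only when the verifier returned a usable centroid within the threshold.
function consensusGate(
  expected: Point,
  verified: Point | null
): { pass: boolean; driftPx: number | null } {
  if (verified === null) return { pass: false, driftPx: null };
  const driftPx = centroidDrift(expected, verified);
  // NaN fails every comparison, so non-finite drift is rejected explicitly.
  if (!Number.isFinite(driftPx) || driftPx > MAX_DRIFT_PX) {
    return { pass: false, driftPx };
  }
  return { pass: true, driftPx };
}
```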

Find Your Vote

LIVE

Address → ranked candidate matches from 25+ sources. Weighted issue-scoring engine with auditable per-issue breakdown. Determinism-audited: same inputs always produce the same ranked output.

Next.js · TypeScript · determinism audit · 25+ public record sources
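
A weighted issue-scoring pass with an auditable breakdown could look like the sketch below. The stance scale (-1 to 1), the agreement formula, and all names are assumptions for illustration, not the production engine. Determinism comes from iterating issues and candidates in sorted order and breaking score ties by name.

```typescript
interface Breakdown { issue: string; weight: number; agreement: number; contribution: number; }
interface Ranked { candidate: string; score: number; breakdown: Breakdown[]; }

// Stances are assumed to lie in [-1, 1]; agreement maps the gap onto [0, 1].
function rankCandidates(
  userStance: Record<string, number>,
  weights: Record<string, number>,
  candidates: Record<string, Record<string, number>>
): Ranked[] {
  const issues = Object.keys(weights).sort(); // fixed order keeps float sums stable
  const totalWeight = issues.reduce((sum, issue) => sum + (weights[issue] ?? 0), 0);

  const ranked = Object.keys(candidates).sort().map((candidate) => {
    const positions = candidates[candidate] ?? {};
    const breakdown = issues.map((issue) => {
      const weight = weights[issue] ?? 0;
      const user = userStance[issue];
      const theirs = positions[issue];
      // A missing stance on either side counts as no agreement.
      const agreement =
        user === undefined || theirs === undefined ? 0 : 1 - Math.abs(user - theirs) / 2;
      return { issue, weight, agreement, contribution: weight * agreement };
    });
    const score =
      totalWeight === 0 ? 0 : breakdown.reduce((sum, b) => sum + b.contribution, 0) / totalWeight;
    return { candidate, score, breakdown };
  });

  // Highest score first; equal scores fall back to candidate name.
  return ranked.sort(
    (a, b) =>
      b.score - a.score ||
      (a.candidate < b.candidate ? -1 : a.candidate > b.candidate ? 1 : 0)
  );
}
```

Each `breakdown` row shows how much a single issue contributed to the final score, which is what makes a ranking explainable after the fact.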

blood-suga — Vision Meal Analysis

AI · DEMO

Live Groq text call per dish returns carb/calorie range; eval loop computes MAPE vs USDA-derived ground truth in real time. Production path uses llama-4-scout vision. Offline fallback labeled honestly.

Groq · llama-3.3-70b · dataset-backed eval · MAPE · offline fallback
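
For reference, MAPE over n labeled dishes is (100 / n) · Σ |predicted − actual| / |actual|. The sketch below assumes the midpoint of the returned range is used as the point prediction and that rows with a zero ground-truth value are skipped, since the ratio is undefined there. The `EvalRow` shape is hypothetical.

```typescript
// Hypothetical eval row: the model's range for one dish and the labeled value.
interface EvalRow {
  dish: string;
  predictedLow: number;
  predictedHigh: number;
  actual: number; // ground-truth value, e.g. grams of carbohydrate
}

// Mean absolute percentage error, in percent.
function mape(rows: EvalRow[]): number {
  // Percentage error is undefined when the true value is zero, so skip those rows.
  const usable = rows.filter((r) => r.actual !== 0);
  if (usable.length === 0) throw new Error("no rows with a non-zero ground truth");

  const total = usable.reduce((sum, r) => {
    // Assumption: score the midpoint of the predicted range.
    const predicted = (r.predictedLow + r.predictedHigh) / 2;
    return sum + Math.abs(predicted - r.actual) / Math.abs(r.actual);
  }, 0);

  return (100 * total) / usable.length;
}
```

Recomputing this after every live call is cheap, which is what lets the number update in the demo as dishes are scored.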

Capabilities & Stack

Models orchestrated

Claude · OpenAI / Codex / GPT-4o · Groq · Gemini vision · Local Ollama · Stable Diffusion

Infrastructure

Next.js · TypeScript · Supabase Realtime · Playwright · Tailwind · Vercel

AI / Pipeline

Deterministic pipelines · Dataset-backed evals · Non-LLM scoring · Multi-model routing · Fail-closed safety · MAPE / eval loops

Systems shipped

SaaS (Stripe + SM-2) · Publish automation · Human-in-loop workbench · Agentic safety fence · Schema-driven intake · Unit-economics engine

How I Work

Deterministic first. LLM output is wrapped in rules, scoring, and gates so the system behaves predictably even when the model does not.

Verifiable by design. Evals run against real labeled data; MAPE and pass/fail rates are visible in the demo, not buried in a notebook.

Fail-closed safety. Any uncaught edge halts the pipeline. No silent fallback to "probably fine." agent-gate is a dedicated reusable layer for this.

Human-in-the-loop. Operator review gates are a first-class component, not an afterthought. audit-kit ships a typed JSON workbench the next stage reads back.

No client data in demos. Every demo tile on the portfolio runs on sample data only. Real products operate under separate, isolated data paths.