Day 2: Burned $55 in tokens to learn E2E tests should be deterministic
Grinding day. Trying to get VibeProof's E2E test runner working while hardening security across the app.
The Find
VibeProof gave me a brutal reality check on first scan:
- Health score: 0/100
- Security score: 0/100
A lot of that was noise. The scanner was flagging every console.error in catch blocks as problematic, marking Next.js route exports (GET/POST) as "duplicate functions", and screaming about SQL injection on parameterized queries.
But it caught real issues too: 3 API routes with zero authentication middleware, and broken plan gating where scansPerMonth: -1 (unlimited) failed the > 3 check because -1 > 3 is false in JavaScript, so unlimited plans were not recognized as paid.
The Fix
Tuned the scanners: only flag console.log/debug (not error/warn in catch blocks), exclude Next.js route handlers from duplicate checks, downgrade parameterized SQL from "critical" to "warning".
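As a rough illustration of the console rule above, here is a minimal line-based sketch. It is not VibeProof's actual scanner: the function name findConsoleNoise and the regex approach are assumptions, and a real scanner would more likely walk an AST. It flags console.log and console.debug and leaves console.error and console.warn alone.

```typescript
// Hypothetical rule: flag only console.log / console.debug calls.
// console.error and console.warn are left alone on purpose.
const NOISY_CONSOLE_CALL = /\bconsole\.(log|debug)\s*\(/;

// Returns the 1-based line numbers that contain a flagged call.
function findConsoleNoise(source: string): number[] {
  const flagged: number[] = [];
  const lines = source.split("\n");
  for (let i = 0; i < lines.length; i++) {
    if (NOISY_CONSOLE_CALL.test(lines[i])) {
      flagged.push(i + 1);
    }
  }
  return flagged;
}
```

A regex like this will also match calls inside comments or strings, which is one reason an AST-based check would be the sturdier choice.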
Fixed the plan logic: isPaidPlan = planLimits.scansPerMonth === -1 || planLimits.scansPerMonth > 3
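The same check, written out as a small function so the sentinel is explicit. The PlanLimits shape and the constant names are assumptions for illustration; only scansPerMonth, the -1 sentinel, and the > 3 threshold come from the fix above.

```typescript
// Assumed shape; only scansPerMonth appears in the original logic.
interface PlanLimits {
  scansPerMonth: number; // -1 means unlimited
}

const UNLIMITED = -1;
const PAID_THRESHOLD = 3; // plans above this many scans count as paid

function isPaidPlan(planLimits: PlanLimits): boolean {
  // Check the sentinel first: -1 > 3 is false, so a bare comparison
  // would misclassify unlimited plans.
  return (
    planLimits.scansPerMonth === UNLIMITED ||
    planLimits.scansPerMonth > PAID_THRESHOLD
  );
}
```

Naming the sentinel makes it harder to forget the special case the next time a numeric comparison touches scansPerMonth.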
Added dual auth (API key OR session) to all 45 API routes.
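A minimal sketch of what "API key OR session" can look like, kept framework-free so it runs on its own. Everything here is hypothetical: the header names x-api-key and x-session-token, the AuthDeps lookups, and the RequestLike type are stand-ins, not the app's real middleware.

```typescript
// Hypothetical request shape; a real Next.js handler would read from Request.headers.
interface RequestLike {
  headers: Record<string, string | undefined>;
}

// Hypothetical lookups; each returns a user id, or null if the credential is invalid.
interface AuthDeps {
  lookupApiKey(key: string): string | null;
  lookupSession(token: string): string | null;
}

type AuthResult =
  | { ok: true; via: "api-key" | "session"; userId: string }
  | { ok: false };

function authenticate(req: RequestLike, deps: AuthDeps): AuthResult {
  // Try the API key first.
  const apiKey = req.headers["x-api-key"];
  if (apiKey) {
    const userId = deps.lookupApiKey(apiKey);
    if (userId !== null) return { ok: true, via: "api-key", userId };
  }
  // Fall back to the session token.
  const token = req.headers["x-session-token"];
  if (token) {
    const userId = deps.lookupSession(token);
    if (userId !== null) return { ok: true, via: "session", userId };
  }
  // Neither credential was valid: the route should respond 401.
  return { ok: false };
}
```

Putting the check in one function means every route calls the same code path, so a new route cannot quietly ship without auth as long as it goes through the wrapper.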
The Score
Health: 0 → 78.5 | Security: 0 → 44
The Takeaway
The expensive lesson: Claude was generating new test scripts on EVERY E2E run — non-deterministic, $2+ per run, slow. Pivoted to generate-once-cache-in-DB. First run costs ~$2, every subsequent run is $0. When you're burning through 1000+ GitHub Actions minutes in a day, determinism isn't just good engineering — it's financial survival.
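The generate-once pattern can be sketched like this. It is an illustration under assumptions, not the actual runner: an in-memory Map stands in for the DB, the generate callback stands in for the Claude call, and both are synchronous here for brevity where the real ones would be async.

```typescript
// Hypothetical cache interface; in the real setup this would be a DB table.
interface ScriptCache {
  get(testId: string): string | undefined;
  set(testId: string, script: string): void;
}

// In-memory stand-in so the sketch runs without a database.
class InMemoryScriptCache implements ScriptCache {
  private store = new Map<string, string>();
  get(testId: string): string | undefined {
    return this.store.get(testId);
  }
  set(testId: string, script: string): void {
    this.store.set(testId, script);
  }
}

// Returns the cached script if present; otherwise generates it once and stores it.
function getOrGenerateScript(
  cache: ScriptCache,
  testId: string,
  generate: (testId: string) => string // stand-in for the paid model call
): string {
  const cached = cache.get(testId);
  if (cached !== undefined) return cached; // cache hit: no model call, no cost
  const script = generate(testId); // cache miss: pay for generation once
  cache.set(testId, script);
  return script;
}
```

Because later runs replay the stored script, the same test executes the same steps every time, which is what makes the run deterministic as well as cheap. The cache would need invalidating whenever the app under test changes enough that the stored script no longer applies.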