Day 2: Burned $55 in tokens to learn E2E tests should be deterministic

Grinding day. Trying to get VibeProof's E2E test runner working while hardening security across the app.

The Find

VibeProof gave me a brutal reality check on first scan:

Health score: 0/100
Security score: 0/100

Most of that was noise. The scanner was flagging every console.error in catch blocks as problematic, marking Next.js route exports (GET/POST) as "duplicate functions", and screaming about SQL injection on parameterized queries.

But it caught real issues too: 3 API routes with zero authentication middleware, and broken plan gating. Unlimited plans store scansPerMonth: -1, and -1 > 3 is false, so the > 3 check treated unlimited plans as unpaid.

The Fix

Tuned the scanners: only flag console.log/debug (not error/warn in catch blocks), exclude Next.js route handlers from duplicate checks, downgrade parameterized SQL from "critical" to "warning".
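
A minimal sketch of what those three tuning rules could look like. Every name here (shouldFlagConsoleCall, isRouteHandlerExport, sqlFindingSeverity) is hypothetical and not VibeProof's actual code. One simplification: the sketch never flags console.error/warn anywhere, rather than checking whether the call sits inside a catch block.

```typescript
// Hypothetical scanner helpers: names and shapes are assumptions for illustration.
type Severity = "critical" | "warning";

// Only log/debug are treated as leftover debug output.
// Simplification: error/warn are never flagged, regardless of catch blocks.
const FLAGGED_CONSOLE_METHODS = new Set<string>(["log", "debug"]);

function shouldFlagConsoleCall(method: string): boolean {
  return FLAGGED_CONSOLE_METHODS.has(method);
}

// HTTP method names that Next.js route files export as handlers.
const ROUTE_HANDLER_NAMES = new Set<string>([
  "GET", "POST", "PUT", "PATCH", "DELETE", "HEAD", "OPTIONS",
]);

// True when an export is a route handler in a route.ts/route.js file,
// so the duplicate-function check can skip it.
function isRouteHandlerExport(filePath: string, exportName: string): boolean {
  const isRouteFile = /(^|\/)route\.(ts|js)$/.test(filePath);
  return isRouteFile && ROUTE_HANDLER_NAMES.has(exportName);
}

// Parameterized queries are downgraded to a warning, not silenced.
function sqlFindingSeverity(isParameterized: boolean): Severity {
  return isParameterized ? "warning" : "critical";
}
```

Scoping the route-handler exemption to route files keeps the duplicate check active for a helper that happens to be named GET elsewhere in the codebase.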

Fixed the plan logic: isPaidPlan = planLimits.scansPerMonth === -1 || planLimits.scansPerMonth > 3
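
Spelled out as a function, the fix might look like the sketch below. The PlanLimits shape and the constant names are assumptions; only the scansPerMonth field, the -1 sentinel, and the > 3 threshold come from the check above.

```typescript
// Assumed shape: only scansPerMonth appears in the original check.
interface PlanLimits {
  scansPerMonth: number; // -1 means unlimited
}

const UNLIMITED = -1;
const PAID_THRESHOLD = 3; // plans above this many scans count as paid

function isPaidPlan(planLimits: PlanLimits): boolean {
  // The sentinel must be tested first: -1 > 3 is false, so a bare
  // numeric comparison misclassifies unlimited plans.
  return (
    planLimits.scansPerMonth === UNLIMITED ||
    planLimits.scansPerMonth > PAID_THRESHOLD
  );
}
```

Naming the sentinel makes it harder for the next numeric comparison on scansPerMonth to forget the unlimited case.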

Added dual auth (API key OR session) to all 45 API routes.
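
A rough sketch of the "API key OR session" idea, under stated assumptions: the Credentials and AuthLookups types and the authenticate function are hypothetical, and the lookups are kept synchronous for brevity where real ones would likely hit a database and be async.

```typescript
// Hypothetical types: none of these names come from the actual codebase.
interface Credentials {
  apiKey?: string;
  sessionToken?: string;
}

interface AuthLookups {
  // Each lookup returns a user id, or null when the credential is invalid.
  userIdForApiKey(key: string): string | null;
  userIdForSession(token: string): string | null;
}

type AuthResult =
  | { ok: true; via: "api-key" | "session"; userId: string }
  | { ok: false };

function authenticate(creds: Credentials, lookups: AuthLookups): AuthResult {
  // Try the API key first; fall through to the session on a miss.
  if (creds.apiKey) {
    const userId = lookups.userIdForApiKey(creds.apiKey);
    if (userId !== null) return { ok: true, via: "api-key", userId };
  }
  if (creds.sessionToken) {
    const userId = lookups.userIdForSession(creds.sessionToken);
    if (userId !== null) return { ok: true, via: "session", userId };
  }
  // Neither credential was valid: the route should reject the request.
  return { ok: false };
}
```

Routing every handler through one function like this means a new route cannot ship with zero authentication simply because someone forgot the middleware.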

The Score

Health: 0 → 78.5 | Security: 0 → 44

The Takeaway

The expensive lesson: Claude was generating new test scripts on EVERY E2E run, which made runs non-deterministic, slow, and $2+ each. Pivoted to generate-once, cache-in-DB: the first run costs ~$2 and every subsequent run costs $0. When you're burning through 1000+ GitHub Actions minutes in a day, determinism isn't just good engineering, it's financial survival.
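
The generate-once pattern can be sketched as a get-or-create lookup. This is an illustration under assumptions, not the actual implementation: ScriptStore stands in for the database table, getOrGenerateScript is a made-up name, and the generate callback stands in for the paid model call.

```typescript
// Hypothetical stand-in for the DB table that holds generated scripts.
interface ScriptStore {
  get(testCaseId: string): string | undefined;
  set(testCaseId: string, script: string): void;
}

// Returns the cached script when one exists; otherwise generates it once,
// stores it, and returns it. `generate` is the expensive, non-deterministic step.
function getOrGenerateScript(
  testCaseId: string,
  store: ScriptStore,
  generate: (testCaseId: string) => string,
): string {
  const cached = store.get(testCaseId);
  if (cached !== undefined) {
    return cached; // cache hit: no model call, same script as last run
  }
  const script = generate(testCaseId);
  store.set(testCaseId, script);
  return script;
}

// Usage with an in-memory Map in place of a real database.
const scriptStore = new Map<string, string>();
let generatorCalls = 0;
const fakeGenerate = (id: string): string => {
  generatorCalls += 1;
  return `// script for ${id}`;
};

const firstRun = getOrGenerateScript("login-flow", scriptStore, fakeGenerate);
const secondRun = getOrGenerateScript("login-flow", scriptStore, fakeGenerate);
```

After both calls the generator has run once and the two runs execute the identical script, which is where both the cost savings and the determinism come from. A real version would also need a way to invalidate the cached script when the test case or the app under test changes.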

Vibeproofing VibeProof — Day 2

Day 2 | Health Score: 78.5 | Issues Found: 113 | Issues Fixed: 72 | Total Test Cases: 113
