7 things your indie-hacker AI agent product needs before you open the waitlist.
If you spent 90 days building an AI agent product as a solo founder, you have a working demo, a Stripe test mode, and a Twitter thread. You don't have a production-readiness checklist written for you. Every other checklist on the internet assumes a platform team, a Datadog budget, and an SRE on call. This one assumes a MacBook, a credit card, and 18 hours a week. The 7 checks run in 90 minutes. The week-1 check-in is 5 questions. The 3 misconfig patterns are the ones that look fine in dev and burn you in production. The $149 read applies the same checklist to your live indie agent product and gives you a stop/go list before paid traffic.
The 7 pre-launch checks (90 minutes total)
1. Idempotency on every side-effecting tool
15 min. Search for `send_email`, `charge`, `create_*`, `update_*`. If calling twice with the same args does the side effect twice, you have a 2 AM incident. Add an idempotency key (hash of user_id + intent + day_bucket) checked against a small Redis/SQLite table before executing.
2. Per-session token cap
10 min. Hard ceiling on tokens per session and steps per agent loop. A solo builder running GPT-4-class on a $29 plan can be ruined by one user triggering a 50-step loop. MAX_TOKENS_PER_SESSION + MAX_AGENT_STEPS constants near the agent entry point. 88% of agent failure shapes are cost-driven, not model-driven.
3. Three log lines per side effect
15 min. Every email/card/file side effect logs: [intent] what the user asked, [post-verify] what the world looks like AFTER, [outcome-assert] what you'd check later to know it worked. Grep-able, not Grafana-shaped. You will read these at 2 AM from `tail -f`.
4. Manual kill switch
10 min. Way to turn the agent off in under 60 seconds without a redeploy. Feature flag in JSON on S3, Redis key, or Stripe webhook. Customer DMs at 6 PM, you have 60 seconds to stop the bleeding.
5. The 3 test inputs that always run before ship
15 min. Every indie agent product has 3 inputs that, if they break, break the whole product. For a support agent: refund request, escalation, out-of-scope. For a research agent: single-source, multi-source, no-answer. For a coding agent: one-line, multi-file, judgment. PRE_PROD_SMOKE.md. Run before every deploy.
6. Rate limit per user, not per IP
15 min. Single power user will burn your API budget. Per-IP is useless (VPN). Per-user-id, per-cost (tokens spent), not per-requests. One long agent loop = one "request" but $4 of API cost. You need a budget-shaped limit or you are two weeks from a $4,000 OpenAI bill you cannot pay.
7. The "rate limited" page
10 min. Static HTML at /rate-limited. Says "you are doing this too fast, here's a 60-second countdown, here's what you can do in the meantime." Five minutes to write. Saves you from "the app just stopped working for me" tweets.
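Checks 2 and 6 combined into one guard, as an added example that is not code from this page or service: session caps plus a per-user, per-day budget measured in spend rather than requests. The limit values, the token price, `UserBudget`, `run_agent` and `limit_hit` are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed limits and price; set them from your own plan and provider invoice.
MAX_TOKENS_PER_SESSION = 20_000
MAX_AGENT_STEPS = 12
DAILY_USER_BUDGET_MICRO_USD = 1_500_000   # $1.50 per user per day
MICRO_USD_PER_TOKEN = 30                  # $0.03 per 1K tokens (assumed)

class LimitHit(Exception):
    pass

class UserBudget:
    """Spend is keyed by (user_id, day): never by IP, never by request count."""
    def __init__(self):
        self.spent = defaultdict(int)     # integer micro-USD, no float drift

    def remaining(self, user_id, day):
        return DAILY_USER_BUDGET_MICRO_USD - self.spent[(user_id, day)]

    def record(self, user_id, day, tokens):
        self.spent[(user_id, day)] += tokens * MICRO_USD_PER_TOKEN

def run_agent(budget, user_id, day, step_tokens):
    """step_tokens: tokens each model call reports after it runs (constructed).
    Limits are checked before every step, so overshoot is at most one step."""
    session_tokens = 0
    for step, used in enumerate(step_tokens, start=1):
        if step > MAX_AGENT_STEPS:
            raise LimitHit("max-agent-steps")
        if session_tokens >= MAX_TOKENS_PER_SESSION:
            raise LimitHit("max-tokens-per-session")
        if budget.remaining(user_id, day) <= 0:
            raise LimitHit("daily-user-budget")
        session_tokens += used
        budget.record(user_id, day, used)
    return session_tokens

def limit_hit(budget, user_id, day, step_tokens):
    """Return the name of the limit that stopped the session, or None."""
    try:
        run_agent(budget, user_id, day, step_tokens)
        return None
    except LimitHit as exc:
        return str(exc)
```

The name returned by `limit_hit` is what the /rate-limited page and the log line should carry, so a blocked user and a runaway loop are distinguishable afterwards.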
The week-1 check-in (5 questions to answer on day 6)
You opened the waitlist. 40 signups. 12 power users. 2 refund requests. Answer these 5 questions:
- Cost-per-user distribution. Is the median user costing you $0.05 and the 90th percentile $4? Fat tail = power-user problem + pricing problem.
- Completed-but-wrong rate. For 10 random completed sessions, read the [outcome-assert] log line. If 3/10 are wrong, you have a silent-success drift problem.
- Tool call failure rate. For 10 random sessions, count tool calls that returned an error. Agent papering over tool errors with hallucinated results = state-graph invention problem.
- "I don't know" rate. Below 2% = hallucinating. Above 30% = useless. 5-15% is the productive band.
- First-session success rate. Below 60% = broken onboarding. Above 90% = too conservative.
The 3 misconfig patterns that look fine and burn you
A. Retry-on-timeout without idempotency key. You added a retry decorator. The LLM timed out, the retry succeeded — but the *tool call inside* was the part that timed out. Customer charged twice. The most common week-1 incident.
B. Streaming response with side effects before stream completes. "Sure, I'll send that email — sending now — done." User closes at "sending now." Email sent, confirmation never seen. Chargeback waiting to happen.
C. Test mode is not actually test mode. Stripe is in test mode but SendGrid is in production mode. The 500 error you see in logs is the Stripe test call. The actual production failure is the SendGrid call. You debug the wrong system for 6 hours.
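Pattern A next to the fix from check 1, as an added example that is not code from this page or service: the same retry loop run once without a key and once behind a key claimed in SQLite before the tool executes. `flaky_charge`, `run_once`, the table layout and the charge id are constructed for illustration.

```python
import hashlib
import sqlite3
from datetime import date

db = sqlite3.connect(":memory:")  # the "small SQLite table" from check 1
db.execute("CREATE TABLE idem (key TEXT PRIMARY KEY, status TEXT, result TEXT)")

def idempotency_key(user_id, intent, day):
    """Hash of user_id + intent + day_bucket."""
    return hashlib.sha256(f"{user_id}|{intent}|{day.isoformat()}".encode()).hexdigest()

def run_once(key, side_effect):
    try:
        with db:  # claim the key BEFORE executing; PRIMARY KEY rejects a repeat
            db.execute("INSERT INTO idem VALUES (?, 'pending', NULL)", (key,))
    except sqlite3.IntegrityError:
        status, result = db.execute(
            "SELECT status, result FROM idem WHERE key = ?", (key,)).fetchone()
        # 'pending': an earlier attempt may have landed. Reconcile, do not retry.
        return ("replayed", result) if status == "done" else ("needs-reconcile", None)
    result = side_effect()  # may raise AFTER the provider already acted
    with db:
        db.execute("UPDATE idem SET status='done', result=? WHERE key=?", (result, key))
    return ("executed", result)

charges = []  # constructed stand-in for the payment provider's ledger

def flaky_charge():  # hypothetical tool: the charge lands, then the call times out
    charges.append("charge $29")
    if len(charges) == 1:
        raise TimeoutError("provider did not answer in time")
    return "ch_constructed_001"

def with_retry(fn, attempts=3):  # the retry decorator, reduced to a loop
    for _ in range(attempts):
        try:
            return fn()
        except TimeoutError:
            continue
    raise TimeoutError("all attempts timed out")

def demo():
    charges.clear()
    with_retry(flaky_charge)  # pattern A: no key, customer charged twice
    naive_charges = len(charges)
    charges.clear()
    key = idempotency_key("user-42", "charge:pro-plan", date(2025, 1, 6))
    outcome = with_retry(lambda: run_once(key, flaky_charge))
    return naive_charges, len(charges), outcome
```

A `needs-reconcile` outcome means the first attempt's result is unknown, so the next step is to ask the provider what happened, not to call the tool again.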
The 90-minute self-audit score bands
Walk the 7 checks, score yes/no per check. Then look at your band.
7 / 7 · Ready
Ready for the waitlist. Keep the checklist. Rerun before every major change.
5-6 / 7 · Probably
The 1-2 gaps are fixable in a half day. Fix them, then open the waitlist.
3-4 / 7 · Not ready
The gaps compound. Pick the 3-4 highest-blast-radius fixes before taking any traffic that could be regulated or could trigger refunds.
0-2 / 7 · Not ready
Not safe to take real users, real money, or real API budget. Either do the work, or pause the launch.
What you get for $149
Within one business day of receiving your public URL or sanitized export, you receive:
- The same 7 pre-launch checks applied to your live indie agent product, with yes/no per check and the specific file / endpoint / config that drove the decision.
- The week-1 check-in plan, tailored to your agent's 3 most common user flows.
- A stop/go recommendation for the waitlist / launch date you gave me, and the conditions under which the answer flips.
- One async follow-up if the first 3 fixes raise new questions.
What it is not
This is not a penetration test, not a security certification, not a legal opinion, not a compliance attestation, and not exploit testing. It is a practical pre-launch + week-1 production-readiness triage with a human review gate. The first public customer report is reviewed before delivery. If your intake is rejected as unsafe, unauthorized, out of scope, or impossible to fulfill without secrets / protected access, the order is refund-gated rather than delivered dishonestly.
What it costs
Buy the $149 indie-hacker production-readiness read: one-time · USD · 1 business day delivery · invoice via email
Read the full service description: scope, intake rules, redacted sample report
When a 90-minute self-audit is not enough
Three situations where the $149 read is the right floor, not the ceiling:
- You are about to take EU traffic. EU AI Act Article 17, GDPR enforcement 2025-2026, Colorado AI Act (June 2026) all add a layer the 90-min audit does not cover. See the EU AI Act bridge.
- You are about to take regulated data. Health, financial, education, employment, government, biometric, children's data.
- You have already had an incident. Complaint, chargeback, refund spike, or "delete my data" request you could not fulfill.
For those cases, the $299 LLM Bill Triage is the cost-explosion read; the $99 Vibe-Coded Launch Safety Audit is the pre-launch safety read for non-technical founders.
If you would rather do the 90-minute self-audit first, the dev.to article walks every check in order: The 7 things your indie-hacker AI agent product needs before you open the waitlist.