BitterQA

00 / Lived-behavior verification

Behavior receipts for agent-built software.

BitterQA turns a repo-owned behavior contract into a run with artifacts, a verdict, a failure class, a repair lane, and the rerun that settles whether trust was restored.

Built for products that change faster than humans can manually verify them.

All evidence preserved: Receipts · Artifacts · Repair lanes

01 / Thesis

When creation gets cheap, verification becomes the constraint.

Agent work raises the rate of change. That is the point. Faster change also means faster decay unless the product can answer back. QA has to become part of the environment, not a slower ceremony at the end.

Live behavior

The product is the judge.

Check the paths customers actually touch — landing, signup, login, checkout, onboarding, account, and public APIs — against the real access path, with declared principals and policy.

External evidence

A passing run is a receipt.

Screenshots, response shapes, status codes, traces, timings, and final verdicts ride together as one durable artifact bundle with a stable id.

Agent fuel

A failing run starts the next fix.

Failures land with a class, such as runtime, contract drift, payment, or browser custody, so the next agent or human starts from a narrow piece of evidence, not a blank page.

02 / Contracts

A repo-owned behavior contract, not another test file.

A QA contract names what live behavior must hold for a real user, against which site, on which schedule, with which principal posture, and what evidence to keep. It belongs in the repo that owns the product.

BitterQA owns hosted execution, run history, verdicts, and repair state. The product repo owns what "works" actually means.

site: Bound hostname or app, the live surface under verification.
flows: Browser walks, API contracts, and webhook round-trips for each critical user path.
principal: A QA-only account, reset hooks, and scoped credentials. Never your live customer data.
schedule: Daily, hourly, on every release, or wired into a Grid deploy gate.
artifacts: Screenshots, network traces, console output, response bodies, DOM snapshots, timings.
repair: Whether agents may propose selector fixes, contract updates, or coverage expansions.
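
To make the six fields concrete, here is a minimal sketch of a contract as a Python data structure. The class names, field types, allowed values, hostname, and principal name are assumptions made for illustration; this is not BitterQA's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    name: str  # e.g. "checkout.entitlement_landed"
    kind: str  # assumed kinds: "browser", "api", or "webhook"


@dataclass(frozen=True)
class Contract:
    site: str                     # bound hostname or app under verification
    flows: tuple[Flow, ...]       # critical user paths to walk
    principal: str                # QA-only account, never a live customer
    schedule: str                 # assumed values: "daily", "hourly", "release"
    artifacts: tuple[str, ...]    # evidence to keep for every run
    repair: tuple[str, ...] = ()  # fixes agents may propose, if any

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the contract is usable."""
        problems = []
        if not self.flows:
            problems.append("contract declares no flows")
        if self.schedule not in {"daily", "hourly", "release"}:
            problems.append(f"unknown schedule: {self.schedule}")
        for flow in self.flows:
            if flow.kind not in {"browser", "api", "webhook"}:
                problems.append(f"{flow.name}: unknown flow kind {flow.kind}")
        return problems


contract = Contract(
    site="shop.example.com",  # hypothetical hostname
    flows=(
        Flow("marketing.public_smoke", "browser"),
        Flow("checkout.entitlement_landed", "browser"),
    ),
    principal="qa-principal",  # hypothetical QA-only account
    schedule="daily",
    artifacts=("screenshots", "network_traces", "timings"),
    repair=("selector_fixes",),
)
print(contract.validate())
```

Keeping the structure this small is deliberate: the contract states what must hold and what evidence to keep, while execution and run history stay on the hosted side.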

03 / Loop

Contract → run → artifacts → verdict → repair → rerun.

A small loop that keeps fast software honest. Each step leaves a durable handle the next step can cite.

01 Declare

Pin the flows that carry the product.

Start with the paths whose silent breakage would cost trust. A live homepage smoke check is not enough: you want signup, paid feature use, support widget presence, and the API contracts customers depend on.

contract.flows[]

02 Run

Hit the live surface at the pace of change.

Browser flows for human paths, API checks for machine paths, webhook round-trips for the seams between. Daily, hourly, or wired into a release gate.

qa.run --gate=release

03 Capture

Keep the artifacts that prove it.

Screenshots, traces, response bodies, console and network evidence, DOM snapshots, and timings ride along with the run under one stable id. Bitter Browser supplies the perception layer when human paths need real-browser proof.

artifacts.bundle

04 Settle

Issue a verdict, route the failure.

A passing run is a receipt. A failing run is a class — runtime, contract drift, principal, payment, browser custody — paired with the evidence that names the next fix and the rerun that proves it landed.

verdict · failure_class · rerun
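
The four steps above can be sketched as a short loop: run the declared steps, stop at the first failure, carry its class into the verdict, then rerun after a repair. Every name here (StepResult, Verdict, run_contract, settle) and the stubbed steps are hypothetical; a real run would drive a browser or an API client instead of returning canned results.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StepResult:
    step: str
    ok: bool
    evidence: str                        # handle to the captured artifact
    failure_class: Optional[str] = None  # set only when ok is False


@dataclass
class Verdict:
    contract: str
    passed: bool
    failure_class: Optional[str]
    results: list


def run_contract(name, steps):
    """Run steps in order; stop at the first failure and carry its class."""
    results = []
    for step in steps:
        result = step()
        results.append(result)
        if not result.ok:
            return Verdict(name, False, result.failure_class, results)
    return Verdict(name, True, None, results)


def settle(name, steps, repair):
    """Run once; on failure, call repair and rerun so the rerun settles it."""
    first = run_contract(name, steps)
    if first.passed:
        return [first]
    repair(first)
    return [first, run_contract(name, steps)]


# Stubbed steps: the capture fails until the (simulated) repair lands.
state = {"capture_fixed": False}


def signup():
    return StepResult("signup with QA principal", True, "screenshot-102")


def entitlement():
    if state["capture_fixed"]:
        return StepResult("entitlement appears in account", True, "screenshot-103")
    return StepResult("entitlement appears in account", False, "none",
                      "browser_capture_missing")


def repair(verdict):
    state["capture_fixed"] = True  # stands in for a real fix


history = settle("checkout.entitlement_landed", [signup, entitlement], repair)
print([(v.passed, v.failure_class) for v in history])
# prints [(False, 'browser_capture_missing'), (True, None)]
```

Returning both verdicts rather than only the last one keeps the failing run and the rerun side by side, which is what lets the rerun serve as proof that the fix landed.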

04 / Receipts

Every run leaves a stable handle.

Whether it passed or failed, the next person — or the next agent — starts from a bounded piece of evidence. Pass means trust is preserved. Fail means a narrow class of failure with a repair lane attached.
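
One way to get a stable handle is to derive it from the run's own content, so the same evidence always maps to the same id. The sketch below assumes that scheme; the function name, the input fields, the timestamp, and the id length are illustrative and say nothing about how BitterQA actually mints receipt ids.

```python
import hashlib
import json


def receipt_id(contract: str, started_at: str, steps: list) -> str:
    """Derive a short handle from the run's content (an assumed scheme)."""
    canonical = json.dumps(
        {"contract": contract, "started_at": started_at, "steps": steps},
        sort_keys=True,         # key order must not change the id
        separators=(",", ":"),  # no whitespace differences either
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"bq-receipt-{digest[:12]}"  # 12 hex chars is an arbitrary length


steps = [
    {"step": "homepage 200 + canonical headers", "evidence": "200 · 188ms"},
    {"step": "support widget mounted", "evidence": "dom · ok"},
]
first = receipt_id("marketing.public_smoke", "2024-01-01T00:00:00Z", steps)
again = receipt_id("marketing.public_smoke", "2024-01-01T00:00:00Z", steps)
print(first == again)
```

Serializing with sorted keys before hashing is the important part: two runs with identical evidence should not get different handles because of dictionary ordering.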

bq-receipt · 7c2a

Contract

marketing.public_smoke

  1. homepage 200 + canonical headers: 200 · 188ms
  2. primary CTA visible above the fold: screenshot · 011
  3. support widget mounted: dom · ok

Verdict

Marketing surface still keeps its promise.

Pass

bq-receipt · 4e08

Contract

checkout.entitlement_landed

  1. signup with QA principal: screenshot · 102
  2. checkout session created: trace · 184
  3. entitlement appears in account: browser_capture_missing

Verdict · failure_class: browser_capture_missing

Bitter Browser session lost custody before the entitlement frame could be captured. Repair lane open.

Fail

Failure classes routed: runtime · contract_drift · principal · payment · browser_capture_missing · hub_provisioning
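
Treating the failure class as a closed set is what keeps routing narrow. The sketch below models the six classes listed above as an enum and groups failing receipts by class. The route function and every receipt id except 4e08 are hypothetical.

```python
from collections import defaultdict
from enum import Enum


class FailureClass(Enum):
    RUNTIME = "runtime"
    CONTRACT_DRIFT = "contract_drift"
    PRINCIPAL = "principal"
    PAYMENT = "payment"
    BROWSER_CAPTURE_MISSING = "browser_capture_missing"
    HUB_PROVISIONING = "hub_provisioning"


def route(failures):
    """Group (receipt_id, raw_class) pairs by class; unknown classes raise."""
    lanes = defaultdict(list)
    for receipt, raw in failures:
        # FailureClass(raw) raises ValueError for a class outside the set,
        # so a typo cannot silently open a new lane.
        lanes[FailureClass(raw)].append(receipt)
    return dict(lanes)


lanes = route([
    ("4e08", "browser_capture_missing"),
    ("91af", "contract_drift"),           # hypothetical receipt id
    ("c3d0", "browser_capture_missing"),  # hypothetical receipt id
])
print(lanes[FailureClass.BROWSER_CAPTURE_MISSING])
# prints ['4e08', 'c3d0']
```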

— / Boundary

Not a dashboard. A contract layer.

BitterQA is the lived-behavior verification layer for agents and the teams running them. Grid proves a service was deployed and reachable. Ping proves external liveness. BitterQA proves whether a real product path still behaves as promised.

Daily and release-gate verification stays cheap and deterministic. Agentic repair wakes only for setup, drift, failure, or coverage expansion. Bitter Browser supplies bounded perception and capture under explicit policy, and by default never exposes private session state.

05 / Pricing

Start narrow. Verify the paths that carry the product.

One small contract that catches a real failure once is worth more than a hundred green smoke checks that never light up.

Starter

$29/mo

For one product that needs a basic live verification loop.

  • 1 property
  • 3 browser or API checks
  • Daily scheduled runs
  • 14-day artifact history

Loop

$299/mo

For teams using agents to ship and verify product changes continuously.

  • 10 properties
  • 100 browser or API checks
  • 90-day artifact history
  • Failure classes + agent-readable repair notes

06 / Access

Build the first behavior receipt.

Tell us what your agents or team are changing, which live flow has to stay true, and what evidence would make the next fix obvious.

Approvals are manual. Replies come from a human, not a queue.
