AI bug triage to ship overnight

Why four roles, not one chat?

One model answering “fix this bug” can skip triage discipline, rubber-stamp its own diff, and burn context on huge file dumps. Splitting the workflow into Triage → Fix → Review → Ship gives you separation of concerns: each duty has its own prompt, tools, and attitude. The Reviewer does not “know” how the Fixer convinced itself the patch was fine—it reads the artifacts cold. That is closer to how a human team behaves than a single autocomplete stream.

This post describes a pattern you can approximate in CloudyBot today: Specialists, ~/files/ as shared state, trigger_duty (or next_job_id chains) between duties, GitHub OAuth + git_operation, optional code execution in the sandbox, and delivery via webhook, WhatsApp on paid plans, or push. It is not a one-click productized “bug bot”—it is an architecture story grounded in shipped primitives. See How it works for Specialists, scheduling, and GitHub.

The overnight pipeline (overview)

All four Specialists share a small ledger under ~/files/bugs/. A nightly cron starts the chain; each stage writes files the next stage reads. When the Reviewer rejects a fix, the chain stops and a human picks up from state.json in the morning.

Every night 23:00 UTC
        |
  [Triage Lead]          trigger_duty          trigger_duty          trigger_duty
   files, web_search  ──────────────>  [Fixer]  ──────────>  [Reviewer]  ─────────>  [Shipper]
   reads backlog.json                  code_exec, github      files                   github, files
   picks top bug                       writes fix + tests     approves or stops       branch, commit, push
   writes current-ticket.json          writes fix-report.md   writes review.md        writes test-brief.md
   updates state.json                  updates state.json     updates state.json      notifies QA tester
                                                                                      updates state.json

Handoffs can be implemented with the trigger_duty tool (dynamic: the model or duty prompt saves its artifacts, then starts the next Specialist) or with static next_job_id chains after each cron run; see your internal pipeline docs. The important part is one writer and one reader per boundary, with filenames you control.
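To make the shared ledger concrete, here is a minimal sketch of how a stage could record its status in state.json. The layout (a status field plus a history list) and the helper name record_stage are assumptions for illustration, not a CloudyBot format. The write goes through a temporary file and os.replace so the next reader never sees a half-written ledger.

```python
from __future__ import annotations

import json
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def record_stage(state_path: Path, status: str, ticket_id: str) -> dict:
    """Append a stage entry to the ledger and update the current status.

    Hypothetical layout: {"status": "...", "history": [{...}, ...]}.
    """
    state = {"status": None, "history": []}
    if state_path.exists():
        state = json.loads(state_path.read_text(encoding="utf-8"))

    state["status"] = status
    state["history"].append(
        {
            "status": status,
            "ticket_id": ticket_id,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        }
    )

    # Write to a temp file in the same directory, then swap it into place.
    state_path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=state_path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle, indent=2)
    os.replace(tmp_name, state_path)
    return state
```

Each Specialist would call something like this once, at the end of its duty, so the morning reader sees both the latest status and the path that led to it.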

Specialist A — Triage Lead

Template: custom employee (or analyst-style) with files + web_search if you need live issue context.

Duty (illustrative): At 0 23 * * * (11 PM UTC), read ~/files/bugs/backlog.json. Classify open items by severity (critical / high / medium / low) and urgency (blocking release, customer-reported, regression, internal). Pick exactly one ticket: the highest-priority item that is still “actionable” (it has repro steps and expected vs actual behavior). Write ~/files/bugs/current-ticket.json with id, title, description, acceptance criteria, suspected paths, and links. Mark that item in backlog.json as in-progress. Append an entry to ~/files/bugs/state.json: triage_complete, timestamp, chosen id.

Handoff: When current-ticket.json exists, trigger the Fixer duty (same run or chained job).
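The selection rule in the triage duty can be sketched in a few lines. This is an illustration of the logic you would describe in the duty prompt, not code CloudyBot runs for you. The field names (status, severity, urgency, repro_steps, expected, actual) and the urgency ordering are assumptions; adapt them to whatever your backlog.json actually contains.

```python
from __future__ import annotations

# Lower rank means higher priority. The orderings are assumptions.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}
URGENCY_RANK = {
    "blocking release": 0,
    "customer-reported": 1,
    "regression": 2,
    "internal": 3,
}


def is_actionable(ticket: dict) -> bool:
    """A ticket is actionable if it has repro steps and expected vs actual."""
    return all(ticket.get(key) for key in ("repro_steps", "expected", "actual"))


def pick_ticket(backlog: list[dict]) -> dict | None:
    """Return the highest-priority open, actionable ticket, or None."""
    candidates = [
        t for t in backlog if t.get("status") == "open" and is_actionable(t)
    ]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda t: (
            SEVERITY_RANK.get(t.get("severity"), len(SEVERITY_RANK)),
            URGENCY_RANK.get(t.get("urgency"), len(URGENCY_RANK)),
        ),
    )


def claim_top_ticket(backlog: list[dict]) -> dict | None:
    """Pick one ticket and mark it in-progress in the backlog."""
    chosen = pick_ticket(backlog)
    if chosen is not None:
        chosen["status"] = "in-progress"
    return chosen
```

Returning None when nothing is actionable gives the chain a clean way to stop for the night instead of handing the Fixer a vague ticket.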

Specialist B — Fixer

Template: custom employee with code_exec, files, and github (after OAuth and clone—see pricing for plan limits on code execution and employees).

Duty: Read current-ticket.json. Use git_operation to work in your cloned repo under ~/repos/ as documented for agents. Prefer surgical line-level edits over pasting whole files; on large trees this costs fewer AI credits. Run a focused check in the Daytona-backed sandbox via run_code (e.g. a unit snippet or lint script you keep in-repo). Write ~/files/bugs/fix-report.md: files touched, rationale, test output, confidence. Set state.json to fix-ready.

Handoff: Trigger the Reviewer.
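As a rough sketch of what the Fixer's check-and-report step might look like as a script, the example below runs one command, captures its output, and renders the four report sections named above. The helper names and the report layout are hypothetical, and how the script reaches the sandbox (the run_code call itself) is deliberately not shown.

```python
from __future__ import annotations

import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CheckResult:
    command: list[str]
    returncode: int
    output: str


def run_check(command: list[str], timeout_s: int = 120) -> CheckResult:
    """Run one focused check and capture stdout and stderr together."""
    proc = subprocess.run(
        command, capture_output=True, text=True, timeout=timeout_s
    )
    return CheckResult(command, proc.returncode, (proc.stdout + proc.stderr).strip())


def render_fix_report(
    ticket_id: str,
    files_touched: list[str],
    rationale: str,
    check: CheckResult,
    confidence: str,
) -> str:
    """Render fix-report.md: files touched, rationale, test output, confidence."""
    verdict = "passed" if check.returncode == 0 else "FAILED"
    lines = [
        f"# Fix report for ticket {ticket_id}",
        "",
        "## Files touched",
        *[f"- {path}" for path in files_touched],
        "",
        "## Rationale",
        rationale,
        "",
        f"## Test output ({verdict}, exit code {check.returncode})",
        check.output or "(no output)",
        "",
        f"## Confidence: {confidence}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Stand-in command; a real duty would run a test or lint script from the repo.
    result = run_check([sys.executable, "-c", "print('2 passed')"])
    print(render_fix_report("247", ["src/example.py"], "Example rationale.", result, "medium"))
```

Recording the exit code alongside the raw output gives the Reviewer something harder to argue with than the Fixer's own summary.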

Specialist C — Reviewer

Template: custom employee or analyst with files only—no code_exec. System prompt: skeptical staff engineer; look for regressions, missing edge cases, style and security smells.

Duty: Read fix-report.md and the actual diff (via file tools / repo read as configured). Write ~/files/bugs/review.md with a clear first line: APPROVED or CHANGES_REQUESTED, then bullets. Update state.json accordingly.

Handoff: If and only if APPROVED, trigger the Shipper. If rejected, stop—your team sees needs-rework in the morning.
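The "if and only if APPROVED" gate is simple enough to show directly. This sketch assumes the verdict is the exact first line of review.md, as described above; the function names are made up for illustration. Anything other than an exact APPROVED fails closed, so a chatty or malformed review never reaches the Shipper.

```python
VERDICTS = ("APPROVED", "CHANGES_REQUESTED")


def read_verdict(review_text: str) -> str:
    """Return the verdict from the first line of review.md, or UNKNOWN."""
    lines = review_text.splitlines()
    first_line = lines[0].strip() if lines else ""
    return first_line if first_line in VERDICTS else "UNKNOWN"


def should_trigger_shipper(review_text: str) -> bool:
    """Ship only on an exact APPROVED first line; everything else stops."""
    return read_verdict(review_text) == "APPROVED"
```

For example, a review that opens with "Looks APPROVED to me" is treated as UNKNOWN and the chain stops, which is the safer default for unattended runs.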

Specialist D — Shipper

Template: custom employee with github + files.

Duty: Create a branch named after the ticket and date, e.g. fix/bug-247-YYYY-MM-DD, commit with a message that references the ticket, and push to origin. Write ~/files/bugs/test-brief.md: what changed, how to reproduce the original bug, expected behavior now, suggested manual steps, and a CI link if you use CI. Then notify QA: a webhook into your Slack/email automation, WhatsApp where your plan allows, or a prominent summary left in the workspace plus a push notification when the job completes. Set state.json to shipped-to-qa.
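The naming convention and the test brief are easy to pin down in code, which helps keep the Shipper's output uniform from night to night. The sketch below is one possible shape: the ticket fields (id, title, repro_steps, expected) and the brief headings are assumptions, and the actual branch, commit, and push still go through git_operation.

```python
from __future__ import annotations

from datetime import date


def branch_name(ticket_id: str | int, on: date) -> str:
    """Build a branch name in the fix/bug-<id>-YYYY-MM-DD pattern."""
    return f"fix/bug-{ticket_id}-{on.isoformat()}"


def render_test_brief(
    ticket: dict,
    what_changed: str,
    manual_steps: list[str],
    ci_url: str | None = None,
) -> str:
    """Render test-brief.md for the QA tester."""
    lines = [
        f"# Test brief: bug {ticket['id']} ({ticket['title']})",
        "",
        "## What changed",
        what_changed,
        "",
        "## How to reproduce the original bug",
        ticket["repro_steps"],
        "",
        "## Expected behavior now",
        ticket["expected"],
        "",
        "## Suggested manual steps",
        *[f"{n}. {step}" for n, step in enumerate(manual_steps, start=1)],
    ]
    if ci_url:
        lines += ["", f"CI: {ci_url}"]
    return "\n".join(lines) + "\n"
```

Calling branch_name(247, date(2026, 1, 15)) gives fix/bug-247-2026-01-15, so QA can tell from the branch alone which ticket and which night a change came from.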

Where does backlog.json come from?

The triage duty is useless without fresh input. Three common patterns:

Scout / Watchdog — another Specialist on a schedule scrapes GitHub Issues, Linear, or Jira in the cloud browser and rewrites backlog.json.
Webhook — your tracker fires Zapier/Make → CloudyBot webhook; append or merge into the file.
Manual — export CSV/JSON from your tool, upload to the workspace when convenient.
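Whichever source you choose, the merge into backlog.json deserves a little care so that a fresh tracker export does not reset a ticket the pipeline has already claimed. A minimal sketch, assuming each item carries a unique id and an optional status (both assumptions about your file, not a CloudyBot schema):

```python
from __future__ import annotations


def merge_into_backlog(backlog: list[dict], incoming: list[dict]) -> list[dict]:
    """Merge tracker items into the backlog, keyed by id.

    New items default to status "open". Existing items take the incoming
    fields, except that an in-progress claim is never overwritten.
    """
    by_id = {item["id"]: dict(item) for item in backlog}
    for item in incoming:
        existing = by_id.get(item["id"])
        if existing is None:
            by_id[item["id"]] = {"status": "open", **item}
            continue
        claimed = existing.get("status") == "in-progress"
        existing.update(item)
        if claimed:
            existing["status"] = "in-progress"
    return list(by_id.values())
```

The function returns a new list and leaves its inputs untouched, so the duty can write the result to backlog.json in one step.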

How to build it without writing JSON by hand

Describe the whole cycle in plain English to the Workflow Architect (Growth+): sources for bugs, repo, branch naming, who gets notified, and that you want four hires with explicit handoffs. The Architect proposes templates, duty prompts, crons, and skillsOverride; you confirm deploy. Alternatively, use a custom recipe deploy with four steps and chained next_job_id where you want deterministic sequencing after each scheduled run.

For positioning vs other agents, see AI agent comparison (2026)—the differentiator is scheduled multi-role work, not a smarter single chat.

What changes for your team

Morning diff, not morning triage theater — the highest-priority scoped bug already moved through fix and review gates.
Humans at the right layer — QA executes the brief; a human merges after CI; AI did the grind.
Predictable cost — monthly AI credit and code-exec limits pause work instead of surprising you—aligned with plan caps.

Honest limits (read this before production)

The code sandbox has no outbound internet—design tests accordingly or rely on CI after push.
Repo files are not magically inside the sandbox; follow CloudyBot’s file/repo guidance for run_code vs git_operation.
The Reviewer is not running your full test matrix—it is reasoning over diffs and reports. Pair with CI on every branch the Shipper creates.
Best for well-scoped bugs with clear repro steps, not “redesign auth.”
Email/Slack delivery depends on integrations you wire; product capabilities vary by plan.

Closing

Copilot writes code when you ask. Devin-style agents run when you watch. CloudyBot Specialists can triage, fix, review, and ship on a calendar—with artifacts your QA can trust—whether you are online or not. Start with one narrow bug class, measure AI credits per cycle, then widen the backlog source.

Further reading: Workflow Architect · Why agents should work while you sleep · Surgical file editing · How it works · Pricing
