Batch-Scoped Likeness Outcome Emails with Rejection Reasons

Context

The likeness-review rejection emails (RI and custom-VI) were fully generic: “We weren’t able to approve your character”, no reason, no indication of which photo failed. Users retried the same images blind, and CS hand-wrote a reason email after every rejection. PR #3064 made reviewers pick a structured rejection reason (+ optional note) per image and stored both on characters.reference_dataset_images; this work carries those reasons into the outcome email.

Two facts shaped the design:

reference_dataset_images holds two distinct row populations: review-requested rows (created in REVIEW status, resolved by a human) and auto-created rows (inserted directly as SELECTED by character creation, auto verification, and dataset sync — the path nippleless derivatives ride). Email content must never include auto-created rows.
The email templates are static HTML exported from shank (React Email) and filled by strings.ReplaceAll on {{TOKEN}} placeholders — no loops or conditionals.

Decision

One email per completed review batch, strictly scoped to that batch. Two additive nullable columns (migration 123): reference_dataset_images.reviewed_at, stamped inside the resolve transaction on both approve and reject, and characters.manual_review_outcome_email_sent_at, a per-character watermark advanced only after a successful send. The email queries rows where reviewed_at IS NOT NULL AND reviewed_at > watermark, ordered reviewed_at DESC so the newest verdict per slot wins (re-uploads can be reviewed out of upload order).

Per-slot results as the single source of truth. The strip service reduces the windowed rows to one ReviewSlotResult per recognizable slot (face-first interleaved order, matching the Strip history tab and the FE upload grid labels). Both the rejected/approved decision and the email body derive from this one list, so they can never disagree.

Presentation lives in the emails package. A single {{REVIEW_RESULTS}} placeholder is injected into the existing templates via ReplaceAll; it carries the optional outcome lead-in, the character-level blocks, and the per-slot rows, so any of them can be omitted without leaving empty template chrome. Mixed outcomes get “Almost there — replace N photo(s)” subjects/lead-ins instead of “couldn’t approve”. The custom-VI template’s static guideline checklist is replaced by the actual reasons.

Reasons render at two levels. Needs proof of creation and real images of someone else describe the character’s identity and origin, not a particular photo — they render once as a character-level block above the slot list, with CS-authored copy shared by both flows. Unusable for generation and OTHER notes are genuinely per-photo and stay on the slot row. This followed CS review of the original uniform per-slot draft: repeating an identity reason next to every slot reads as noise. Slots rejected for a character-level reason render no row at all, and when every rejection is character-level the generic lead-in is dropped so the block opens the email — QA read bare “Rejected” rows under a block as missing reasons.

Reviewer notes are user-facing only for the OTHER reason. Enforced twice: the mapping boundary zeroes the note for standard reasons, and the renderer only reads it for OTHER (trimmed, HTML-escaped, newlines to <br/>). This matches the in-app surfacing rule, which enforces the same policy at the API boundary.

Consequences

Auto-created rows (including nippleless SFW derivatives) structurally cannot appear in outcome emails — they never carry reviewed_at. This also fixed a latent bug where a newer auto-SELECTED row could shadow a reviewed rejection and flip the outcome email to “approved”.
Failed sends self-heal: the watermark does not advance, so the missed verdicts roll into the next batch’s email. The pre-existing duplicate-send race (two reviewers resolving the last two pending slots concurrently) keeps parity with the prior behavior — both goroutines would compute the same window, and the watermark update is idempotent.
Re-submitting a single rejected slot later produces an email covering only that slot.
Transitional edge: pre-deploy-resolved rows have reviewed_at = NULL, so a character mid-review at deploy time gets its first new-style email covering only post-deploy-resolved slots. No backfill needed or wanted — NULL is what keeps history out of the window.
A placeholder-contract test guards the template↔ReplaceAll coupling: a future shank re-export that mangles a {{TOKEN}} fails the build’s tests instead of mailing users literal placeholders. If a placeholder ever did go missing, ReplaceAll no-ops and the email degrades to the generic chrome.
User-facing copy lives in apps/sirloin/internal/pkg/emails/reviewresults.go. The two character-level texts are CS-authored verbatim (FOXY-210 review comments); the remaining strings (headlines, subjects, per-photo sentences) still need final CS sign-off before ship.
Rollback: revert the PR. Migration 123 is additive (two nullable columns) and can safely stay.

Alternatives Considered

Full-picture emails (all reviewed slots ever, latest per slot): rejected — earlier-batch verdicts would be re-reported on every email, and the simpler “all SELECTED/REJECTED rows” variant mixes in auto-created rows the user never submitted for verification.
Row-marker extraction (markup stays in shank): an example row between comment markers, repeated by Go. Nicer shank previews, but depends on comments surviving react-email export and adds a two-level placeholder scheme — more to get wrong for the same email.
Go html/template for the rejection emails: real loops/conditionals, but forks these two emails out of the shank pipeline and leaves two template systems.
Including the reviewer note for all reasons: rejected to match the in-app exposure rule — notes on standard reasons are internal; standard reasons carry canonical copy.
Uniform per-slot reason repetition: the original draft; dropped after CS review — identity reasons repeated next to every slot read as noise, and the CS replacement copy (multi-paragraph with bullets) does not fit a slot row. A “merge only when all slots share one reason” rule was considered too, but its mixed-reason fallback still needs an inline-sized second variant of every character-level text.
Relabeling the Strip reject-dialog note field in this branch: dropped; the in-app FOXY-210 branch owns the conditional note-visibility hint, avoiding a guaranteed merge conflict.