Deliverability & metrics
This article covers what happens to a newsletter issue after you hit send: how Nukipa talks to Resend, how opens/clicks/bounces/complaints flow back in, how hard signals (bounces and complaints) suppress an address, how sender-domain verification and reputation work, where live metrics come from, and why an issue can sit in scheduled longer than you'd expect.
Email sending is handled by the newsletters service. It never calls Resend directly — every send and every webhook routes through the platform gateway. Resend is the email provider underneath.
The lifecycle of one issue
An issue moves through these statuses:
| Status | What it means | Deletable? |
|---|---|---|
| draft | Editable. Carries body_markdown (your editor source) and body_html (the rendered cache the worker emails). A brand-new draft can have neither populated until first save. body_html is required to send or test-send, and the routes return 400 if it's empty. | Yes |
| scheduled | A send-at time is set (scheduled_for). Not yet enqueued; the in-process scheduler picks it up. | Yes |
| sending | A send job is running (or about to). | No (409) |
| sent | The send finished. Kept permanently for the audit trail (deliveries reference it). | No (409) |
| failed | The send was blocked or errored (e.g. unverified sender domain). | Yes |
The delete guard only blocks sent and sending. Drafts, scheduled, and failed issues can all be deleted.
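The delete guard is small enough to express in a few lines. The following is a minimal TypeScript sketch, not the service's code: the helper name canDeleteIssue is hypothetical, and only the status names and the 409 rule come from the table above.

```typescript
// Hypothetical helper; only the status names and the 409 rule come from the table.
type IssueStatus = "draft" | "scheduled" | "sending" | "sent" | "failed";

// The two statuses the delete guard blocks (the route answers 409 for these).
const UNDELETABLE: ReadonlySet<IssueStatus> = new Set<IssueStatus>(["sending", "sent"]);

function canDeleteIssue(issueStatus: IssueStatus): boolean {
  return !UNDELETABLE.has(issueStatus);
}

// Usage: list which statuses pass the guard.
const allStatuses: IssueStatus[] = ["draft", "scheduled", "sending", "sent", "failed"];
const deletableStatuses = allStatuses.filter(canDeleteIssue);
console.log(deletableStatuses);
```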
There are two ways to initiate a send (a third write to sending comes from the scheduler itself — see below):
- Send now: POST /issues/:id/send flips the issue to sending and enqueues the newsletters.send-issue job immediately. Requires both subject and body_html.
- Schedule: POST /issues/:id/schedule with {scheduled_for} (ISO) just sets status='scheduled'. It does not enqueue anything. An in-process scheduler does that later (see below).
Why a send can sit in scheduled
Scheduling does not put a job on the queue. A timer inside the worker process scans for due issues and dispatches them. So there's an inherent lag, and a few ways a send can stall:
- Scheduler tick interval. The scheduler runs setInterval every 30 seconds, scanning for issues where status='scheduled' AND scheduled_for <= now(). A schedule set for "right now" waits up to 30s (the worker also runs one scan ~2s after boot, so a fresh restart picks things up quickly). The scheduler's own UPDATE is what flips a due scheduled row to sending; that's the third path into sending, distinct from the send-now route.
- Per-tick cap. Each tick claims at most 25 issues (MAX_PER_TICK). If you scheduled 100 issues for the same minute, they go out over several ticks, not all at once. This is deliberate: it bounds the blast radius. Issues over the cap aren't dropped; they're deferred to the next tick.
- Jobs service not configured. The scheduler enqueues through the jobs service. If SERVICE_JOBS_URL is unset (common in local dev), getJobsClient() returns null and the scheduler simply no-ops, so your scheduled issue sits in scheduled until jobs is configured. No harm done; the row is untouched.
- Enqueue failure rollback. If the scheduler claims an issue (flips it to sending) but the enqueue call throws, it rolls the issue back to scheduled so the next tick retries. During that window you may briefly see sending, then scheduled again.
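The behaviours in this list (no-op without a jobs client, the per-tick cap, and rollback on enqueue failure) can be modelled in memory. This is an illustrative TypeScript sketch under stated assumptions: the Issue and JobsClient shapes and the function name runSchedulerTick are hypothetical, enqueue is synchronous to keep the example short, and a plain array stands in for the database.

```typescript
// Illustrative in-memory model of one scheduler tick; not the service's actual code.
type Issue = { id: string; status: "scheduled" | "sending"; scheduledFor: Date };
// Hypothetical jobs client; enqueue is synchronous here to keep the sketch short.
type JobsClient = { enqueue(issueId: string): void };

const MAX_PER_TICK = 25; // per-tick cap named in the text

function runSchedulerTick(issues: Issue[], jobs: JobsClient | null, now: Date): string[] {
  // Jobs service not configured: the tick no-ops and rows stay untouched.
  if (jobs === null) return [];

  // Due issues, capped; anything over the cap waits for a later tick.
  const due = issues
    .filter((i) => i.status === "scheduled" && i.scheduledFor.getTime() <= now.getTime())
    .slice(0, MAX_PER_TICK);

  const dispatched: string[] = [];
  for (const issue of due) {
    issue.status = "sending"; // claim the row
    try {
      jobs.enqueue(issue.id);
      dispatched.push(issue.id);
    } catch {
      issue.status = "scheduled"; // roll back so the next tick retries
    }
  }
  return dispatched;
}
```

With 30 due issues, the first call dispatches 25 and the second call picks up the remaining 5, which matches the "deferred, not dropped" behaviour described above.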
[!WARNING] Send-now does NOT roll back when jobs is unconfigured. Unlike the scheduler, POST /issues/:id/send flips the issue to sending before it checks for the jobs client. If SERVICE_JOBS_URL is unset, the route still returns 202 with job_id: null, but the issue is now stranded in sending permanently. There's no rollback on this path, no job was enqueued, and a sending issue can't be deleted (409). Configure SERVICE_JOBS_URL before using send-now, or you'll have to fix the stuck row out-of-band.
[!NOTE] The scheduler is a poll, not a cron. The race-safe claim is UPDATE … WHERE status='scheduled': if two worker processes pick the same row, only one flips it and dispatches; the other updates zero rows and skips. Safe under blue/green deploys, but the trade-off is the up-to-30s latency.
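The conditional claim behaves like a compare-and-set: the write only happens if the row is still in the expected state. A small TypeScript sketch of that idea follows. It uses a Map as a stand-in for the table, and the names IssueRow and claimIfScheduled are hypothetical. In the real system the database provides the atomicity; this single-threaded model only shows the "rows changed" contract.

```typescript
// In-memory stand-in for the conditional UPDATE: returns the number of rows changed.
type IssueRow = { id: string; status: string };

function claimIfScheduled(rows: Map<string, IssueRow>, id: string): number {
  const row = rows.get(id);
  // Zero rows changed: the row is missing or another worker already claimed it.
  if (row === undefined || row.status !== "scheduled") return 0;
  row.status = "sending";
  return 1;
}

// Two workers race for the same row; only the first claim changes anything.
const issueTable = new Map<string, IssueRow>([["issue-1", { id: "issue-1", status: "scheduled" }]]);
const claimedByWorkerA = claimIfScheduled(issueTable, "issue-1");
const claimedByWorkerB = claimIfScheduled(issueTable, "issue-1");
console.log(claimedByWorkerA, claimedByWorkerB);
```

A worker dispatches only when its claim reports one changed row, which is why a duplicate pick by a second process is harmless.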
If you need to cancel a scheduled send, POST /issues/:id/unschedule reverts it to draft and clears scheduled_for. It returns 409 if the issue isn't currently scheduled.
The Resend webhook: where opens/clicks/bounces/complaints come from
The send worker dispatches the batch and does not wait for delivery outcomes. Everything after "Resend accepted the message" arrives asynchronously through one endpoint:
POST /public/webhooks/resend
The body shape mirrors Resend: { type, data: { email_id | id, ... } }.
[!IMPORTANT] Signature verification happens at the gateway, not in this service. The gateway terminates Resend's svix-signed webhooks, verifies them, and forwards pre-verified events. The newsletters service trusts the gateway and does not re-verify. This is why the route lives under /public/* with no internal-token auth.
On each event the handler:
- Looks up the deliveries row by resend_message_id (the per-recipient id the send worker stored). The tenant is read from that row.
- Archives the raw event into newsletters.events, which is append-only and written when a delivery row matches. So if you ever need to debug, the raw payload is on record.
- Patches the delivery's state and the matching *_at timestamp.
Event-to-state mapping:
| Resend event | deliveries.state | Timestamp set | Error written |
|---|---|---|---|
| email.sent | sent | sent_at | none |
| email.delivered | delivered | delivered_at | none |
| email.opened | opened | opened_at | none |
| email.clicked | clicked | clicked_at | none |
| email.bounced | bounced | bounced_at | error ← data.error (or the whole data) |
| email.complained | complained | none | error ← the whole data |
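The mapping table translates naturally into a pure function. The sketch below is illustrative TypeScript, not the handler itself: the type names and toDeliveryPatch are hypothetical, the use of the nullish fallback for a missing data.error is an assumption about how "absent" is detected, and returning null for unmapped event types is also an assumption.

```typescript
// Sketch of the event-to-state table; names other than the event types,
// states, and *_at columns are hypothetical.
type ResendEvent = { type: string; data: Record<string, unknown> };
type DeliveryPatch = { state: string; timestampColumn: string | null; error?: unknown };

function toDeliveryPatch(event: ResendEvent): DeliveryPatch | null {
  switch (event.type) {
    case "email.sent":
      return { state: "sent", timestampColumn: "sent_at" };
    case "email.delivered":
      return { state: "delivered", timestampColumn: "delivered_at" };
    case "email.opened":
      return { state: "opened", timestampColumn: "opened_at" };
    case "email.clicked":
      return { state: "clicked", timestampColumn: "clicked_at" };
    case "email.bounced":
      // Bounce: data.error when present, otherwise the whole data payload.
      return { state: "bounced", timestampColumn: "bounced_at", error: event.data["error"] ?? event.data };
    case "email.complained":
      // Complaint: no dedicated timestamp column; the whole data payload is stored.
      return { state: "complained", timestampColumn: null, error: event.data };
    default:
      return null; // assumption: unmapped event types leave the delivery unchanged
  }
}
```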
A few honest limitations:
- State is last-write-wins, not a count. deliveries.state is a single column. If a recipient opens then clicks, the row ends on clicked. There's no per-recipient open/click counter, just the latest state and the individual *_at timestamps.
- No complained_at column. A complaint sets state='complained' and stamps the error blob, but there's no dedicated complaint timestamp on the delivery row.
- Bounce and complaint store the error slightly differently. A bounce writes error = data.error (falling back to the whole data payload if data.error is absent); a complaint always writes the whole data blob. Minor, but worth knowing if you parse deliveries.error.
- The same webhook also feeds nurture-sequence sends (nurturing.sends) when the message id matches there instead. That's covered in the nurturing docs; the vocabulary is the same.
Hard signals flip subscription status and suppress the address
A bounce or a complaint is a hard signal — these are the events that damage sender reputation with the receiving provider (Gmail, Yahoo, etc.). When the webhook sees email.bounced or email.complained and the delivery has a subscription_id, it does two things beyond patching the delivery:
- Flips the subscription status to bounced or complained, but only on the one subscription this delivery belongs to. Since the normal audience query only includes status='active' subscribers, that address is excluded from the next send to this publication automatically. A sibling publication where the same email is still active is not touched by this status flip; the suppression list (below) is what protects siblings.
- Upserts the address into newsletters.suppression_list, a per-tenant "do not send" list keyed (tenant_id, email), with reason set to bounced or complained.
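The two effects of a hard signal can be modelled as follows. This is a hedged TypeScript sketch with in-memory stand-ins: the Subscription and SuppressionEntry shapes, suppressionKey, and applyHardSignal are all hypothetical names, and overwriting an existing entry on conflict is an assumption about the upsert.

```typescript
type HardSignal = "bounced" | "complained";
type Subscription = { id: string; tenantId: string; email: string; status: string };
type SuppressionEntry = { tenantId: string; email: string; reason: HardSignal };

// In-memory stand-in for newsletters.suppression_list, keyed by (tenant_id, email).
function suppressionKey(tenantId: string, email: string): string {
  return JSON.stringify([tenantId, email]);
}

function applyHardSignal(
  sub: Subscription,
  signal: HardSignal,
  suppression: Map<string, SuppressionEntry>,
): void {
  // 1. Flip only the subscription this delivery belongs to.
  sub.status = signal;
  // 2. Upsert the address into the tenant's suppression list
  //    (assumption: an existing entry is overwritten).
  suppression.set(suppressionKey(sub.tenantId, sub.email), {
    tenantId: sub.tenantId,
    email: sub.email,
    reason: signal,
  });
}
```

A sibling subscription for the same address keeps its own status after this call; it is the suppression entry, looked up by tenant and email, that covers it.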
[!NOTE] The suppression-list schema allows a third reason: manual. The webhook only ever writes bounced/complained, but the table is built to also hold an operator-added entry (reason='manual'), and the send-worker audience filter explicitly accounts for that case. There's no route that writes manual today; it's a schema affordance, not a shipped feature.
The suppression list is what stops the address coming back through a side door:
- On subscribe: the public subscribe route checks the suppression list first. A suppressed address gets 410 Gone with a clear "this address is on our suppression list and can't be re-subscribed" message, so a bounced address can't re-subscribe to a different newsletter in the same tenant and get re-bounced.
- On send: the send worker filters the resolved audience against the suppression list as a belt-and-suspenders check, even though bounced/complained subs are already excluded by the status='active' filter. This covers the edge case where an address is still active on one publication but suppressed via a sibling.
[!WARNING] The suppression filter is best-effort, not a hard guarantee. If the suppression-list query errors during a send, the worker fails open: it returns the unfiltered audience and sends anyway, rather than blocking the whole send on a transient DB error. The subscribe-side check is similarly soft: a failed lookup there is swallowed and subscription proceeds. The status='active' exclusion is the primary guard; suppression is the secondary net.
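Fail-open filtering looks roughly like this. The TypeScript sketch below is illustrative only: filterSuppressed is a hypothetical name, and the loadSuppressed callback stands in for the suppression-list query (made synchronous here for brevity).

```typescript
// Sketch of the fail-open behaviour; loadSuppressed stands in for the suppression-list query.
function filterSuppressed(audience: string[], loadSuppressed: () => Set<string>): string[] {
  try {
    const suppressed = loadSuppressed();
    return audience.filter((email) => !suppressed.has(email));
  } catch {
    // Fail open: a lookup error returns the unfiltered audience.
    return audience;
  }
}
```

The design trade-off is visible in the catch branch: a transient lookup failure never blocks a send, at the cost of occasionally mailing a suppressed address that slipped past the status filter.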
[!NOTE] Suppression is per-tenant, not global across tenants. Someone who blocked Tenant A's mail may legitimately want Tenant B's. The reputation problem is per-sender-domain, so the scope follows the tenant.
When the suppression table was first added, the migration backfilled every existing bounced / complained subscription into the list (on conflict do nothing, tagged notes='backfilled…'), so addresses that went bad before the table existed are covered too.
You can also create these signals manually from the admin side:
- POST /newsletters/:id/subscribers/:subId/unsubscribe: admin manual unsubscribe. Flips the subscription to unsubscribed, stamps unsubscribed_at, stores the reason under custom_fields.admin_unsubscribe_reason, and does not send the user-facing unsubscribe confirmation email (this is operator-initiated).
- POST /newsletters/:id/subscribers/:subId/mark-complained: admin manual complaint flag (flips to complained).
Note these admin actions flip the subscription status; the suppression-list upsert path shown above is driven by the webhook.
Verified sender domains
If a newsletter sets a custom from_email, the domain in that address has to be verified before Nukipa will send from it. This is enforced at send time.
The pre-send gate
Before the worker creates a send row or calls Resend, checkSenderDomain() runs. Acceptance order, first match wins:
- No from_email → allowed. The gateway falls back to a platform-owned sender, which is always verified on the platform's Resend account.
- Host is on PLATFORM_OWNED_SENDER_DOMAINS (an env list, e.g. nukipa.com,kibert.de) → allowed. These are verified once at the platform level; no per-tenant row needed.
- A row exists in newsletters.verified_domains with status='verified' for this tenant and domain → allowed.
- Otherwise → the send is blocked.
On a block, the worker flips the issue to failed, writes a structured reason into metrics, and no send row is created (nothing was attempted at Resend). The reason is human-readable so the dashboard can show you exactly what to do:
| reason | Meaning |
|---|---|
| invalid_from_email | The from_email isn't a parseable address. |
| domain_not_added | The domain has no row. Add it under Newsletters → Settings → Sender domains. |
| domain_status_<status> | The domain row exists but isn't verified yet (e.g. domain_status_verifying). Publish the DNS records and click "Check now". |
| gate_schema_missing | In production, the verified_domains table or a column is missing (Postgres 42P01 / 42703). The gate fails closed. Apply the newsletters migration. |
| gate_query_failed | Any other DB error while verifying the domain. The send is blocked. |
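The acceptance order and the first three block reasons can be sketched as a pure function. This is a simplified TypeScript illustration, not checkSenderDomain() itself: evaluateSenderGate, hostOf, and the DomainRow shape are hypothetical, the address parsing is deliberately crude, exact host matching is an assumption, and the database error paths (gate_schema_missing, gate_query_failed) are left out.

```typescript
type GateResult = { allowed: true } | { allowed: false; reason: string };
type DomainRow = { domain: string; status: string };

// Crude host extraction (assumption: the real parser is stricter).
function hostOf(fromEmail: string): string | null {
  const at = fromEmail.lastIndexOf("@");
  if (at <= 0 || at === fromEmail.length - 1) return null;
  return fromEmail.slice(at + 1).toLowerCase();
}

function evaluateSenderGate(
  fromEmail: string | null,
  platformDomains: string[],
  tenantDomains: DomainRow[],
): GateResult {
  // 1. No custom from_email: the platform sender is used.
  if (!fromEmail) return { allowed: true };
  const host = hostOf(fromEmail);
  if (host === null) return { allowed: false, reason: "invalid_from_email" };
  // 2. Platform-owned domain: verified once at the platform level.
  if (platformDomains.includes(host)) return { allowed: true };
  // 3. Tenant row with status 'verified'.
  const row = tenantDomains.find((d) => d.domain === host);
  if (row === undefined) return { allowed: false, reason: "domain_not_added" };
  if (row.status === "verified") return { allowed: true };
  // 4. Otherwise blocked, with the row's status folded into the reason.
  return { allowed: false, reason: `domain_status_${row.status}` };
}
```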
[!WARNING] In production the gate fails closed if the verified_domains table or a column is missing (schema error codes 42P01/42703), so a dropped table or a permission glitch can't silently let unverified domains ship. You'll see reason: gate_schema_missing in that case. In dev/staging it fails open by default so a lagging migration doesn't block testing. Override with NEWSLETTERS_GATE_FAIL_OPEN_ON_SCHEMA (true/false).
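One plausible way to resolve that default-plus-override is shown below. This TypeScript sketch is an assumption-heavy illustration: the function name is hypothetical, and using NODE_ENV to detect production is a guess, since the text does not say how the service identifies its environment. Only the NEWSLETTERS_GATE_FAIL_OPEN_ON_SCHEMA variable and its true/false values come from the text.

```typescript
// Assumption: NODE_ENV === "production" identifies production; the service may detect it differently.
function gateFailsOpenOnSchemaError(env: Record<string, string | undefined>): boolean {
  const override = env["NEWSLETTERS_GATE_FAIL_OPEN_ON_SCHEMA"];
  if (override === "true") return true;   // explicit override: fail open
  if (override === "false") return false; // explicit override: fail closed
  return env["NODE_ENV"] !== "production"; // default: closed in production, open elsewhere
}
```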
[!NOTE] A gate-failed issue's metrics is not {sent, failed}. On a block, issues.metrics is overwritten with { error: 'sender_domain_not_verified', reason, from_email }. So GET /issues/:id/metrics on a failed issue returns those keys spread in instead of the dispatch counters; see the metrics section below.
Adding and verifying a domain
The flow mirrors Resend's own:
- POST /domains with { domain, region? }: inserts a row (status='pending'), then calls Resend's POST /domains through the gateway and mirrors the returned DKIM/SPF/DMARC records into dns_records. On success you get 201 with the verifying row. Publish those records at your DNS provider.
- If Resend provisioning fails after the row insert, the route does not error out. It returns 202 (not 201), leaves the row at status='pending' with last_error populated and warning: 'resend_provision_failed_retry_available'. The poller retries automatically, or "Check now" doubles as a retry-provision path (it re-runs POST /domains when the row never got a Resend id).
- A background poller checks pending/verifying/failed rows against Resend (TICK_MS = 2 minutes, with a first run ~5s after boot). It nudges Resend to re-run DNS verification and pulls the latest status in. DNS propagation usually takes 5–60 minutes, so the green "verified" pill appears on its own.
- POST /domains/:id/verify: the "Check now" button. Forces an immediate re-check instead of waiting for the next poll tick.
[!NOTE] The poller isn't a guarantee of "every row every 2 minutes." Each tick processes at most 30 rows (MAX_PER_TICK), and it skips any row checked within the last 60 seconds (MIN_CHECK_INTERVAL_MS). With a large backlog of pending domains, individual rows can wait longer than one tick. For a single domain you just added, "Check now" is the fast path.
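The selection rule for one poller tick can be sketched like this. It is an illustrative TypeScript model: the row shape and selectDomainsForTick are hypothetical, rows are taken in input order (the real ordering is not stated), and only the two constants and the eligible statuses come from the text.

```typescript
const MAX_PER_TICK = 30;                  // rows processed per tick
const MIN_CHECK_INTERVAL_MS = 60 * 1000;  // skip rows checked within the last 60 seconds

type PolledDomain = { id: string; status: string; lastCheckedAtMs: number | null };

const POLLED_STATUSES = ["pending", "verifying", "failed"];

function selectDomainsForTick(rows: PolledDomain[], nowMs: number): PolledDomain[] {
  return rows
    .filter((r) => POLLED_STATUSES.includes(r.status))
    // Never-checked rows are eligible; recently checked rows are skipped.
    .filter((r) => r.lastCheckedAtMs === null || nowMs - r.lastCheckedAtMs >= MIN_CHECK_INTERVAL_MS)
    // Assumption: input order decides who makes the cut.
    .slice(0, MAX_PER_TICK);
}
```

With 40 eligible rows, one tick returns 30 and the other 10 wait for a later tick, which is the backlog effect the note describes.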
Resend's status vocabulary is collapsed into a narrower lifecycle: verified → verified, failed → failed, anything else (including pending) → verifying, so an unknown status never silently parks a row.
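That collapse is a three-way mapping with a catch-all. A minimal TypeScript sketch, with the hypothetical names CollapsedStatus and collapseResendStatus:

```typescript
type CollapsedStatus = "verified" | "failed" | "verifying";

function collapseResendStatus(resendStatus: string): CollapsedStatus {
  if (resendStatus === "verified") return "verified";
  if (resendStatus === "failed") return "failed";
  // Everything else, including "pending" and any unknown value, keeps the row moving.
  return "verifying";
}
```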
Deleting a domain (DELETE /domains/:id) is guarded:
- 409 domain_in_use if any of the tenant's newsletters has a from_email that contains @…<domain>. This is a substring (ilike '%@<domain>%') match, not an exact host; deleting maltego.com will also count a newsletter sending from x@mail.maltego.com. Change the from-address first.
- 409 domain_in_flight if a queued/running send is attributed to that domain right now. Wait for it to finish (or cancel the issue); pulling the domain mid-batch would orphan the rest.
Reputation: tracked, lightly enforced
Beyond the binary verified/not-verified gate, each verified domain carries a reputation snapshot. When you list domains (GET /domains), each row is joined to the newsletters.v_domain_reputation view, which aggregates deliveries (through sends) by the sender domain the send used.
[!NOTE] The join is best-effort. If the view is missing (pre-migration), GET /domains still returns the domain rows with reputation: null and only logs a warning for non-42P01 errors. Don't assume reputation is always populated.
The view reports two windows — 7 days and 30 days — but note the window is anchored on send start time (sends.started_at), not event time. A bounce that lands today on a send that started 8 days ago counts in the 30d window but not the 7d window, even though the bounce itself is fresh. Read "7d" as "deliveries from sends started in the last 7 days," not "events in the last 7 days."
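The anchoring rule is easier to see in code, because the event time never appears as an input. The TypeScript sketch below is illustrative: reputationWindows is a hypothetical name and the inclusive boundary is an assumption.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// There is no event-time parameter: only the send's start time decides the window.
function reputationWindows(sendStartedAt: Date, now: Date): { in7d: boolean; in30d: boolean } {
  const ageMs = now.getTime() - sendStartedAt.getTime();
  // Assumption: boundaries are inclusive.
  return { in7d: ageMs <= 7 * DAY_MS, in30d: ageMs <= 30 * DAY_MS };
}

// A send that started 8 days ago: a bounce arriving today still only counts toward 30d.
const referenceNow = new Date("2024-06-30T12:00:00Z");
const eightDaysAgo = new Date(referenceNow.getTime() - 8 * DAY_MS);
const windowFlags = reputationWindows(eightDaysAgo, referenceNow);
console.log(windowFlags);
```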
| Field | Window | Notes |
|---|---|---|
| sent_30d, delivered_30d, bounced_30d, complained_30d, opened_30d, clicked_30d | 30d | Raw counts. sent_30d is the count of all delivery rows for sends in the window (the rate denominator), not the count of state='sent' rows. |
| sent_7d, delivered_7d, bounced_7d, complained_7d | 7d | Raw counts. No opened/clicked for 7d; the view only computes those four. |
| delivery_rate_30d, bounce_rate_30d, complaint_rate_30d | 30d | Percentages, 2 decimals. |
Rates are null (not 0) when there are no sends in the window, so a fresh domain shows — rather than a misleading "0% bounce". The view is security_invoker, so RLS scopes it to your tenant.
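A rate with those two properties (null on an empty window, two decimals otherwise) can be computed as below. This TypeScript sketch is an approximation of what the view reports, not its SQL: ratePercent is a hypothetical name and the rounding mode is an assumption.

```typescript
// Percentage with 2 decimals, or null when there is nothing to divide by.
function ratePercent(count: number, totalDeliveries: number): number | null {
  if (totalDeliveries === 0) return null; // fresh domain: no misleading "0%"
  // Assumption: round-half-up to 2 decimals; the view's exact rounding may differ.
  return Math.round((count / totalDeliveries) * 10000) / 100;
}
```

For example, bounce_rate_30d would correspond to ratePercent(bounced_30d, sent_30d), since sent_30d is the denominator described in the table.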
[!IMPORTANT] Enforcement nuance: reputation is tracked, not auto-enforced. Nukipa does not currently pause sends or block a domain because its bounce or complaint rate crossed a threshold — there is no such threshold in the code. The enforcement that does exist is binary and per-address: hard-signal events suppress the individual address (see above), and the send gate is verified/not-verified. The aggregate rates are there for you to watch. If a domain's bounce/complaint rate climbs, that's a manual signal to investigate your list, not something the platform throttles for you.
A send is attributed to a domain via sends.email_domain_id, set from the gate's verified-domain match. Sends from platform-owned domains (nukipa.com/kibert.de) and the system default sender leave it null — they fall into the "platform sender" bucket and don't roll up under a per-tenant verified domain.
Live metrics
There are two metric surfaces, and they don't drift — but they use the word "sent" to mean two different things, so read carefully.
- issues.metrics is a small snapshot written once when the send completes: { sent, failed }. Here sent is the dispatch total: the count of messages Resend handed back a message id for (per-message id present → delivered; absent → failed; and if a whole batch returns null/shape-mismatch, the entire batch is marked failed). This is the dispatch outcome, not engagement. (For a gate-blocked issue this object is instead { error, reason, from_email }; see the sender-domain section.)
- GET /issues/:id/metrics returns metrics merged with a live block that is recomputed from deliveries on every call:
{
  "sent": 1200,
  "failed": 3,
  "live": {
    "delivered": 543, "opened": 540, "clicked": 96,
    "bounced": 12, "complained": 1, "unsubscribed": 0,
    "sent": 8, "failed": 3
  },
  "recipient_count": 1203
}
Because live is derived from the delivery rows at read time, it tracks webhook updates without a separate rollup job — call it again and it reflects the latest opens/clicks/bounces.
[!WARNING] Top-level sent and live.sent are NOT the same number and are not comparable. Top-level metrics.sent is the dispatch total: every message successfully handed off to Resend at send time (1200 in the example). live.sent is the count of deliveries still sitting in state='sent', meaning accepted by Resend but with no email.delivered / email.opened / etc. webhook having advanced them yet (8 in the example). As delivery/open/click webhooks arrive, rows leave sent and the live.sent count shrinks toward zero. The two numbers diverging is expected, not an error.
[!NOTE] The live counts are mutually exclusive by current state, because each delivery has exactly one state. A recipient who was delivered then opened then clicked counts once, under clicked, not in delivered and opened too. So live.delivered is "delivered and nothing further has happened", not "total delivered". recipient_count is the total number of delivery rows for the issue (the true reach denominator); treat opens/clicks as floors, since engagement that advanced further isn't separately counted under the earlier state.
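Because every delivery row holds one state, the live block amounts to a group-by on that column. The TypeScript sketch below illustrates the idea with an in-memory array: summarizeLive is a hypothetical name, and unlike the response above it only emits keys for states that actually occur.

```typescript
type DeliveryState =
  | "queued" | "sent" | "delivered" | "opened" | "clicked"
  | "bounced" | "complained" | "unsubscribed" | "failed";

function summarizeLive(states: DeliveryState[]): { live: Record<string, number>; recipient_count: number } {
  const live: Record<string, number> = {};
  for (const state of states) {
    // Each row contributes to exactly one bucket: its current state.
    live[state] = (live[state] ?? 0) + 1;
  }
  return { live, recipient_count: states.length };
}

// Three recipients: one only delivered, one who went on to click, one bounced.
const liveSummary = summarizeLive(["delivered", "clicked", "bounced"]);
console.log(liveSummary);
```

The recipient who clicked appears only under clicked, so the bucket totals always add up to recipient_count.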
For the per-recipient breakdown behind these numbers:
- GET /issues/:id/deliveries?state=&q=&limit=&offset= returns paginated delivery rows. Filter by state (delivered/opened/clicked/bounced/complained/unsubscribed/sent/failed/queued), search by email substring with q. limit ≤ 500.
- GET /issues/:id/sends returns the send-execution row(s), one per send-now, with rolled-up recipient_count/delivered/failed and timing.
Every issue is unsubscribe-compliant by construction
This isn't optional and you can't accidentally skip it. The send worker:
- Injects an unsubscribe footer via ensureUnsubscribeFooter(): a no-op only if your body already contains {{unsubscribe_url}}, otherwise it appends a CASL/CAN-SPAM footer.
- Mints a per-recipient unsubscribe JWT (it never expires, so old emails stay one-click-compliant) and sets List-Unsubscribe + List-Unsubscribe-Post: List-Unsubscribe=One-Click headers (RFC 8058).
These headers are part of what Gmail/Yahoo's bulk-sender rules (Feb 2024+) expect from large senders. What the code establishes is that the headers are set on every message; the consequence of omitting them (throttling, rejection) is general deliverability behaviour on the receiving side, not something this service measures.
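The two mechanisms can be illustrated as follows. This TypeScript sketch is not the worker's implementation: appendFooterIfMissing and oneClickHeaders are hypothetical names, the footer markup is invented for illustration, and minting the JWT that goes into the URL is out of scope here. The header names and the One-Click value follow RFC 8058.

```typescript
const UNSUBSCRIBE_TOKEN = "{{unsubscribe_url}}";

// No-op when the body already carries the placeholder; otherwise append a footer.
// The footer markup here is a placeholder, not the service's actual footer.
function appendFooterIfMissing(bodyHtml: string): string {
  if (bodyHtml.includes(UNSUBSCRIBE_TOKEN)) return bodyHtml;
  return `${bodyHtml}\n<p><a href="${UNSUBSCRIBE_TOKEN}">Unsubscribe</a></p>`;
}

// RFC 8058 one-click headers for a single recipient's unsubscribe URL.
function oneClickHeaders(unsubscribeUrl: string): Record<string, string> {
  return {
    "List-Unsubscribe": `<${unsubscribeUrl}>`,
    "List-Unsubscribe-Post": "List-Unsubscribe=One-Click",
  };
}
```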
FAQ
My scheduled issue never sent. Why?
Most likely the jobs service isn't configured (SERVICE_JOBS_URL unset) — the scheduler no-ops without it and the row stays in scheduled untouched. Otherwise check: is scheduled_for actually in the past, is the issue still status='scheduled', and did you schedule more than 25 issues for the same window (the per-tick cap defers the overflow to later 30s ticks — it doesn't drop them)? The scheduler also retries on the next tick if an enqueue failed.
My send-now issue is stuck in sending and won't delete. Why?
You almost certainly ran send-now with SERVICE_JOBS_URL unset. That route flips the issue to sending before checking for the jobs client and doesn't roll back, so the issue is stranded with no job (and sending issues can't be deleted). Configure jobs and re-send through a fresh issue, or fix the stuck row out-of-band.
Why did my issue go to failed immediately with no recipients?
The sender-domain gate blocked it. Check issues.metrics.reason — most commonly domain_not_added, domain_status_<status>, or invalid_from_email, but it can also be gate_schema_missing (production, missing table/column) or gate_query_failed (other DB error). No send row is created on a gate block because nothing was sent. Add and verify the domain, or clear the custom from_email to use the platform sender.
A subscriber bounced — will they get the next issue?
No. The webhook flipped that subscription to bounced and added the address to the tenant suppression list. They're excluded from the active audience and can't re-subscribe (the subscribe route returns 410). Note the status flip only touches the one subscription the bounce came from; the suppression list is what protects sibling publications.
Do opens and clicks update without a page reload / a rollup job?
The numbers themselves are recomputed from deliveries every time you call GET /issues/:id/metrics, so they're always current at read time. There's no scheduled rollup. (Whether the dashboard pushes that update to an open browser tab is a UI concern outside this service.)
Why do sent and live.sent disagree?
They mean different things. Top-level sent is the dispatch total (handed to Resend). live.sent is the subset of deliveries still in state='sent' — accepted but not yet advanced by a delivered/opened/clicked webhook. live.sent falls toward zero as those webhooks arrive.
My bounce rate looks high — will Nukipa stop sending from that domain?
No. Reputation rates are tracked and shown, but there's no automatic threshold that pauses sends. Per-address suppression is automatic; aggregate enforcement is on you. Watch bounce_rate_30d / complaint_rate_30d (windowed by send start time) and clean your list manually.
Why is the webhook unauthenticated?
It isn't, really — Resend's svix signature is verified at the gateway, which forwards only verified events. The service-level route trusts the gateway. It still archives every matched event into newsletters.events, so nothing is lost for debugging.